Manipulating sounds C++

For Beginners

Started by Concentrate May 14, 2011 05:17 PM

13 comments, last by NicolasJay 12 years, 10 months ago

Concentrate

181

Author

May 14, 2011 05:17 PM

I've decided to make some music visualization software. Right now I'm looking at FMOD, which seems relatively simple. I want to extract the raw data from the compressed music file and run an FFT on it for use in the program. Is it possible to do this with FMOD, or do you have any other suggestions? Thanks.

regards, D.Chhetri

EDIT: I'm using SDL, so I'm looking at its music classes/functions. Still, any advice would be great.

Edge cases will show your design flaws in your code!
Visit my site
Visit my FaceBook
Visit my github

NicolasJay

May 15, 2011 03:58 PM

Firstly, if you are going to go it alone, I would recommend the SFML library instead of SDL. SDL_mixer is very old, and the C-style interface makes me cry.

I would go with SFML 1.6, which has a C++-style interface. You will have to build the DLLs yourself, but it's not that hard to get up and running, and you can get help in their forum if you need it. Search first, though, because the forum already has a ton of answered questions about how to build the SFML DLLs.

http://www.sfml-dev.org/download.php

BUT, unless you are already good at low level programming, this may not be the right place to start. I would recommend starting out with something like a Winamp Plugin. Starting an MP3 player from scratch is a ton of work, whereas Winamp already has a development kit that is specifically designed for exactly what you wanna do.

http://wiki.winamp.com/wiki/Plug-in_Developer

Then, once you get the theory down, you can look at breaking away from Winamp and making something lower level if you want.

Concentrate

181

Author

May 15, 2011 07:11 PM

I think I can handle it without doing it as a plugin. The whole point of this project is to exercise my programming ability, not necessarily to finish the actual product. I'm not a big fan of C-style programming either, but I'm already using OpenGL and SDL, so I figured I would use what's already provided. Do you think I would be able to retrieve the raw data using SDL functions?

Wooh

1,088

May 15, 2011 07:27 PM

If you use SDL_mixer, you can access the raw data through the Mix_Chunk data structure, which contains a pointer to the data. If you use SDL_LoadWAV, the raw data is loaded into a buffer. Either way, accessing the raw data with SDL is easy.
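As a rough illustration of what that pointer gives you: the buffer is just bytes, so it has to be reinterpreted according to the audio format before any analysis. The sketch below assumes 16-bit signed little-endian interleaved samples. `decodeS16LE` and `downmixToMono` are made-up helper names, not SDL functions, and the byte pointer stands in for something like `Mix_Chunk::abuf` with its length `alen`.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: reinterpret raw bytes as 16-bit signed little-endian
// samples. Assumes the data really is S16 little-endian; a trailing odd
// byte is ignored.
std::vector<std::int16_t> decodeS16LE(const std::uint8_t* bytes, std::size_t length)
{
    std::vector<std::int16_t> samples;
    samples.reserve(length / 2);
    for (std::size_t i = 0; i + 1 < length; i += 2)
    {
        // low byte first, then high byte
        const std::uint16_t raw =
            static_cast<std::uint16_t>(bytes[i] | (bytes[i + 1] << 8));
        samples.push_back(static_cast<std::int16_t>(raw));
    }
    return samples;
}

// Hypothetical helper: average the interleaved channels into one mono
// signal, which is usually enough for a spectrum display.
std::vector<std::int16_t> downmixToMono(const std::vector<std::int16_t>& interleaved,
                                        int channels)
{
    std::vector<std::int16_t> mono;
    if (channels <= 0) return mono;
    const std::size_t ch = static_cast<std::size_t>(channels);
    const std::size_t frames = interleaved.size() / ch;
    mono.reserve(frames);
    for (std::size_t f = 0; f < frames; ++f)
    {
        int sum = 0;
        for (std::size_t c = 0; c < ch; ++c)
            sum += interleaved[f * ch + c];
        mono.push_back(static_cast<std::int16_t>(sum / channels));
    }
    return mono;
}
```

With a real chunk, the call would look something like `decodeS16LE(chunk->abuf, chunk->alen)`, provided the audio was actually opened in that format.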

NicolasJay

May 16, 2011 10:13 AM

Well then, what you will have to do is somewhere along these lines:

Instead of using Mix_OpenAudio(), which is a wrapper function, you will have to use the lower-level SDL_OpenAudio(). It lets you set your own audio callback and feed the audio into the stream directly via the *stream pointer in the callback. This of course means that you will have to decode the audio yourself.

I wrote a class for playing movies this way not too long ago.
Here is the code for reference:

sdlffmovie.h
[source lang="cpp"]
#ifndef INC_SDLFFMOVIE_H
#define INC_SDLFFMOVIE_H
#include "Definitions.h"
#include <SDL.h>
#include <SDL_ffmpeg.h>

#define BUFFER_SIZE 20 // in audio frames

class SDL_FFMovie
{
public:
    SDL_FFMovie();
    ~SDL_FFMovie();

    // run a movie
    int Run(SDL_Surface *screen, int x, int y, const char *filename);
    bool movieDone;

private:
    // get the current time of the playing movie
    int64_t GetTimestamp();

    // create a buffer for audio frames
    SDL_ffmpegAudioFrame **audioFrame;

    // the frame used for drawing the current movie
    SDL_ffmpegVideoFrame *videoFrame;

    // the movie file
    SDL_ffmpegFile *movie;

    /* use a mutex to prevent errors due to multithreading */
    SDL_mutex *mutex;

    // the width and height of the current movie
    int movie_w;
    int movie_h;

    // start time and position of the current movie
    int64_t timestamp;
    int64_t movStart;

    // callback for SDL audio playback
    static void AudioCallback(void *data, Uint8 *stream, int length);
    void ThisAudioCallback(void *data, Uint8 *stream, int length);

    void Cleanup();
};

#endif /* INC_SDLFFMOVIE_H */
[/source]
sdlffmovie.cpp
[source lang="cpp"]
#include "sdlffmovie.h"
#include <cstdio>  // fprintf
#include <cstring> // memcpy, memset

int SDL_FFMovie::Run(SDL_Surface *screen, int x, int y, const char *filename)
{
    // open the mpeg file
    movie = SDL_ffmpegOpen(filename);
    if(movie == NULL)
    {
        fprintf(stderr, "could not open %s: %s\n", filename, SDL_ffmpegGetError());
        return -1;
    }

    // create a mutex for the movie data
    mutex = SDL_CreateMutex();

    // select the first audio and video streams from the mpeg
    SDL_ffmpegSelectAudioStream(movie, 0);
    SDL_ffmpegSelectVideoStream(movie, 0);

    // retrieve the audio format of the movie
    SDL_AudioSpec specs = SDL_ffmpegGetAudioSpec(movie, 512, AudioCallback);
    specs.userdata = this; // pass "this" to the audio callback

    // retrieve the video size of the movie
    SDL_ffmpegGetVideoSize(movie, &movie_w, &movie_h);

    // create a video frame to blit to the screen
    videoFrame = SDL_ffmpegCreateVideoFrame();
    videoFrame->surface = SDL_CreateRGBSurface(0, movie_w, movie_h, 24, 0x0000FF, 0x00FF00, 0xFF0000, 0);

    // create the source and destination rectangles
    // (explicit casts avoid narrowing errors on newer compilers)
    SDL_Rect src_rect = {0, 0, static_cast<Uint16>(movie_w), static_cast<Uint16>(movie_h)};
    SDL_Rect dst_rect = {static_cast<Sint16>(x), static_cast<Sint16>(y),
                         static_cast<Uint16>(movie_w), static_cast<Uint16>(movie_h)};

    if(SDL_ffmpegValidAudio(movie))
    {
        // open the audio device
        if(SDL_OpenAudio(&specs, 0) != 0)
        {
            fprintf(stderr, "Couldn't open audio: %s\n", SDL_GetError());
            Cleanup();
            return -1;
        }

        // calculate the audio frame size (2 bytes per sample)
        int frameSize = specs.channels * specs.samples * 2;

        // allocate the audio buffer; the trailing () zero-initializes the
        // pointers so Cleanup() is safe if a later allocation fails
        audioFrame = new SDL_ffmpegAudioFrame*[BUFFER_SIZE]();

        // create and fill the audio buffer
        for(int i = 0; i < BUFFER_SIZE; i++)
        {
            audioFrame[i] = SDL_ffmpegCreateAudioFrame(movie, frameSize);

            if(audioFrame[i] == NULL)
            {
                Cleanup();
                return -1;
            }

            SDL_ffmpegGetAudioFrame(movie, audioFrame[i]);
        }

        // unpause audio so the buffer starts being read.
        SDL_PauseAudio(0);

        // store the time at which the movie started playing
        movStart = SDL_GetTicks();
    }

    SDL_Event Event;
    movieDone = false;
    while(movieDone == false)
    {
        /*Uint8 *keys = SDL_GetKeyState(NULL);
        if(keys[SDLK_ESCAPE])
            movieDone = true;*/
        // handle keyboard and mouse input

        while (SDL_PollEvent(&Event))
        {
            switch(Event.type)
            {
            case SDL_MOUSEMOTION:
                movieDone = true;
                break;
            case SDL_KEYDOWN:
                movieDone = true;
                switch(Event.key.keysym.sym)
                {
                case SDLK_ESCAPE:
                    movieDone = true;
                }
                break;
            case SDL_QUIT:
                movieDone = true;
                break;
            }
        }

        // fill up the audio buffer if necessary
        if(SDL_ffmpegValidAudio(movie))
        {
            SDL_LockMutex(mutex);

            for(int i = 0; i < BUFFER_SIZE; i++)
            {
                // check if frame is empty
                if(audioFrame[i]->size == 0)
                {
                    // fill frame with new data
                    SDL_ffmpegGetAudioFrame(movie, audioFrame[i]);
                }
            }

            SDL_UnlockMutex(mutex);
        }

        // draw the video frame
        if(videoFrame)
        {
            // if the current frame has expired, get a new one
            if(videoFrame->pts < GetTimestamp())
            {
                SDL_ffmpegGetVideoFrame(movie, videoFrame);
            }

            // draw the current frame to the screen
            if(videoFrame->surface != NULL)
            {
                SDL_FillRect(screen, 0, 0);
                SDL_BlitSurface(videoFrame->surface, &src_rect, screen, &dst_rect);
                SDL_Flip(screen);
            }

            // exit if this is the last frame
            if(videoFrame->last)
            {
                movieDone = true;
            }
        }
    }

    Cleanup();

    return 0;
}

void SDL_FFMovie::AudioCallback(void *data, Uint8 *stream, int length)
{
    // static trampoline: "data" is the SDL_FFMovie instance set in specs.userdata
    ((SDL_FFMovie*)data)->ThisAudioCallback(NULL, stream, length);
}

void SDL_FFMovie::ThisAudioCallback(void *data, Uint8 *stream, int length)
{
    // lock mutex, so audioFrame[] will not be changed from another thread
    SDL_LockMutex( mutex );

    if(audioFrame[0]->size == length)
    {
        // update timestamp
        timestamp = audioFrame[0]->pts;

        // copy one frame from the buffer to the stream
        memcpy(stream, audioFrame[0]->buffer, audioFrame[0]->size);

        // mark the frame as used
        audioFrame[0]->size = 0;

        // move the empty frame to the end of the buffer
        SDL_ffmpegAudioFrame *f = audioFrame[0];
        for(int i = 1; i < BUFFER_SIZE; i++ )
        {
            audioFrame[i - 1] = audioFrame[i];
        }

        audioFrame[BUFFER_SIZE - 1] = f;
    }
    else
    {
        // no frames available
        memset(stream, 0, length);
    }

    SDL_UnlockMutex( mutex );
}

int64_t SDL_FFMovie::GetTimestamp()
{
    // return the position that the current movie should be at
    if(SDL_ffmpegValidAudio(movie))
    {
        return timestamp;
    }
    else if(SDL_ffmpegValidVideo(movie))
    {
        return SDL_GetTicks() - movStart;
    }

    return 0;
}

void SDL_FFMovie::Cleanup()
{
    // stop any audio playback first, while "movie" is still valid, so the
    // callback can no longer run while its data is being freed
    if(movie != NULL && SDL_ffmpegValidAudio(movie))
    {
        SDL_PauseAudio(1);
    }

    SDL_CloseAudio();

    // free the movie file
    if(movie != NULL)
    {
        SDL_ffmpegFree(movie);
        movie = NULL;
    }

    // free all audio frames and delete the buffer
    if(audioFrame != NULL)
    {
        for(int i = 0; i < BUFFER_SIZE; i++)
        {
            if(audioFrame[i] != NULL)
            {
                SDL_ffmpegFreeAudioFrame(audioFrame[i]);
            }
        }

        delete [] audioFrame;
        audioFrame = NULL;
    }

    // free video frame
    if(videoFrame != NULL)
    {
        SDL_ffmpegFreeVideoFrame(videoFrame);
        videoFrame = NULL;
    }

    // destroy the mutex
    if(mutex != NULL)
    {
        SDL_DestroyMutex(mutex);
        mutex = NULL;
    }

    movie_w = 0;
    movie_h = 0;

    timestamp = 0;
    movStart = 0;
}

SDL_FFMovie::SDL_FFMovie()
{
    audioFrame = NULL;
    videoFrame = NULL;
    movie = NULL;
    mutex = NULL;
    movie_w = 0;
    movie_h = 0;
    timestamp = 0;
    movStart = 0;
}

SDL_FFMovie::~SDL_FFMovie()
{

}
[/source]

The two important parts are this one, from the main loop where I use the ffmpeg decoder to decompress audio frames and fill up the buffer:
[source lang="cpp"]
// fill up the audio buffer if necessary
if(SDL_ffmpegValidAudio(movie))
{
    SDL_LockMutex(mutex);

    for(int i = 0; i < BUFFER_SIZE; i++)
    {
        // check if frame is empty
        if(audioFrame[i]->size == 0)
        {
            // fill frame with new data
            SDL_ffmpegGetAudioFrame(movie, audioFrame[i]);
        }
    }

    SDL_UnlockMutex(mutex);
}
[/source]

and this one, inside the audio callback where I feed them into the audio stream whenever SDL calls the callback because it needs more audio data to continue playing.
[source lang="cpp"]
if(audioFrame[0]->size == length)
{
    // update timestamp
    timestamp = audioFrame[0]->pts;

    // copy one frame from the buffer to the stream
    memcpy(stream, audioFrame[0]->buffer, audioFrame[0]->size);

    // mark the frame as used
    audioFrame[0]->size = 0;

    // move the empty frame to the end of the buffer
    SDL_ffmpegAudioFrame *f = audioFrame[0];
    for(int i = 1; i < BUFFER_SIZE; i++ )
    {
        audioFrame[i - 1] = audioFrame[i];
    }

    audioFrame[BUFFER_SIZE - 1] = f;
}
[/source]

Then of course, SDL_PauseAudio(0) causes SDL to start playing the audio, and hence to start calling the callback to ask for audio data.
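As a side note, the "move the empty frame to the end of the buffer" step in the callback is a one-position left rotation of the pointer array, so the standard library can do it. The sketch below is only an illustration: `Frame` is a stand-in for `SDL_ffmpegAudioFrame` with just the one field needed here, and `consumeFrontFrame` is a hypothetical name.

```cpp
#include <algorithm>
#include <cstddef>

// Stand-in for SDL_ffmpegAudioFrame; only the field this sketch needs.
struct Frame
{
    int size; // 0 means "consumed, needs refilling"
};

// Mark the front frame as used and rotate it to the back of the queue,
// so the next frame to play is always at index 0.
void consumeFrontFrame(Frame** frames, std::size_t count)
{
    if (count == 0) return;
    frames[0]->size = 0;
    // after the rotation, frames[1] is first and the old front is last
    std::rotate(frames, frames + 1, frames + count);
}
```

The hand-written loop and `std::rotate` do the same amount of work for a 20-entry buffer; the benefit is mostly that the intent is harder to get wrong.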

In your case, it may be more appropriate to find another audio decoder just for wave files instead of ffmpeg.

Concentrate

181

Author

May 16, 2011 07:24 PM

Hmmmm, I was thinking more like the following :

1) Get raw data
2) Do Transform
3) Calculate proper timing
4) Play Sound, and display spectrum making sure they are in sync.

That way I thought I wouldn't have to deal with the video stuff. Also, I was thinking that when I do Mix_Chunk *data = Mix_LoadWAV("a.wav"), data->abuf contains the full music data from a.wav? So I figured I would have to run a DFT on data->abuf and go on from there. What do you think?

NicolasJay

May 16, 2011 11:40 PM

1) Get raw data:
SDL is not needed to do this. What you need is a decoder. It sounds like you are trying to make a DJ program, in which case you would probably be reading MP3s primarily. Mix_LoadMUS() in SDL_mixer uses the SMPEG library internally to load MP3 files. I am sure you can dig up the documentation for SMPEG, since you can easily get the library from the SDL website, but I would strongly recommend finding a different decoder, as SMPEG is extremely old. There are quite a few decoders to choose from; a page on CodeProject shows one example using the "libmad" MP3 decoder. It's up to you, though, as long as you have some decoder that will read in the compressed music file of your choice and give you the raw wav data.

2) Do transform:
Once you have the raw wav data, you can display it fairly easily. I am not sure of all the specifics, but when they say "a 16-bit signed wave file", they literally mean that one sample can be represented with the 16-bit signed C++ data type "short int", or just "short" (a frame is one such sample per channel). That value represents the amplitude of the wave at that point in time, which is not hard to represent graphically. Transforming it, on the other hand, is much more difficult. I couldn't comment on it.
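For what it's worth, the transform step can be sketched in a few lines if speed is ignored. The function below is a naive O(N^2) discrete Fourier transform over one window of mono 16-bit samples; `dftMagnitudes` is a hypothetical name, and the scaling (dividing by 32768 and by the window length) is just one reasonable choice. A real visualizer would most likely use an FFT library, which computes the same values much faster.

```cpp
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <vector>

// Naive O(N^2) discrete Fourier transform of one window of samples.
// Returns the magnitudes of bins 0..N/2 (for real input the remaining
// bins mirror these). Bin k corresponds to frequency k * sampleRate / N.
std::vector<double> dftMagnitudes(const std::vector<std::int16_t>& window)
{
    const std::size_t n = window.size();
    std::vector<double> mags;
    if (n == 0) return mags;

    const double twoPi = 6.283185307179586;
    mags.reserve(n / 2 + 1);
    for (std::size_t k = 0; k <= n / 2; ++k)
    {
        double re = 0.0;
        double im = 0.0;
        for (std::size_t t = 0; t < n; ++t)
        {
            const double angle =
                twoPi * static_cast<double>(k * t) / static_cast<double>(n);
            const double x = window[t] / 32768.0; // scale to [-1, 1)
            re += x * std::cos(angle);
            im -= x * std::sin(angle);
        }
        mags.push_back(std::sqrt(re * re + im * im) / static_cast<double>(n));
    }
    return mags;
}
```

Each returned value can be drawn as the height of one bar. With a 1024-sample window at 44100 Hz, neighbouring bars are about 43 Hz apart.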

3) Calculate proper timing:
This is where the audio callback from my last post comes in. You don't have to calculate the timing yourself; SDL will do it for you. You just have to fill the audio buffer with the output from the decoder whenever SDL calls the callback function. I am fairly certain that the audio callback is the only way SDL will let you feed sound data to the sound card to be played.

If you want to change the playback speed, I don't think it's as simple as waiting longer before sending the next chunk of audio; you would get all kinds of crackling and popping. My guess is that it would be similar to how images are resampled when you stretch them in Paint (the pixels get doubled over, and in better apps like Photoshop a type of blur or "filter" is applied to smooth things out so they don't look pixelated). Basically, you would have to "transform" your data and lengthen/shorten it before putting it into your audio buffer.
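If the only timing needed is "which part of the data is audible right now", the arithmetic is small: position in bytes and position in time are related through the sample rate, channel count, and sample size. The sketch below assumes uncompressed PCM; `PcmFormat` and both function names are illustrative and not part of SDL.

```cpp
#include <cstdint>

// Describes the PCM layout. Illustrative only; the fields loosely
// correspond to what an audio spec reports after the device is opened.
struct PcmFormat
{
    int sampleRate;     // frames per second, e.g. 44100
    int channels;       // 1 = mono, 2 = stereo
    int bytesPerSample; // 2 for 16-bit audio
};

// Convert a byte offset in the PCM stream into playback time in ms.
std::int64_t bytesToMilliseconds(std::int64_t byteOffset, const PcmFormat& fmt)
{
    const std::int64_t bytesPerSecond =
        static_cast<std::int64_t>(fmt.sampleRate) * fmt.channels * fmt.bytesPerSample;
    if (bytesPerSecond <= 0) return 0;
    return byteOffset * 1000 / bytesPerSecond;
}

// Inverse: the byte offset (aligned to a whole frame) that plays at timeMs.
std::int64_t millisecondsToBytes(std::int64_t timeMs, const PcmFormat& fmt)
{
    const std::int64_t frameBytes =
        static_cast<std::int64_t>(fmt.channels) * fmt.bytesPerSample;
    const std::int64_t frames = timeMs * fmt.sampleRate / 1000;
    return frames * frameBytes;
}
```

For 44100 Hz stereo 16-bit audio, one second is 176400 bytes, so a 512-frame callback buffer covers roughly 11.6 ms of sound.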

4) Play sound and display spectrum, making sure they are in sync:
Like I said above, feeding the data into the stream from the callback is how you make it play. SDL decides when to call the audio callback, i.e., when enough time has passed that the next chunk of sound is needed.
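One way to keep the display in step with the sound, sketched under the assumption that the whole track has already been decoded into memory: let the callback advance a cursor, and have the drawing code read that cursor to decide which window of samples to transform. `Playback` and `fillAudio` are hypothetical names. The function has the same (userdata, stream, len) shape as an SDL audio callback but uses only standard types, so it can be tried without SDL.

```cpp
#include <algorithm>
#include <atomic>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical player state shared between the audio callback and the
// drawing code. `pcm` holds the whole decoded track.
struct Playback
{
    std::vector<std::uint8_t> pcm;
    std::atomic<std::size_t> cursor{0}; // bytes handed to the device so far
};

// Copies the next `len` bytes to the device buffer and pads with zeros
// once the track runs out (zero is silence for signed sample formats).
void fillAudio(void* userdata, std::uint8_t* stream, int len)
{
    Playback* p = static_cast<Playback*>(userdata);
    const std::size_t want = len > 0 ? static_cast<std::size_t>(len) : 0;
    const std::size_t pos = p->cursor.load();
    const std::size_t left = pos < p->pcm.size() ? p->pcm.size() - pos : 0;
    const std::size_t n = std::min(want, left);

    if (n > 0) std::memcpy(stream, p->pcm.data() + pos, n);
    if (want > n) std::memset(stream + n, 0, want - n);
    p->cursor.store(pos + n);
}
```

The render loop would then read `cursor` each frame and analyse the samples just before it. The device buffers a little audio, so the cursor runs slightly ahead of what is actually heard; for a visualizer that offset is often small enough to ignore.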

And I think by DFT you mean DSP, right? (digital signal processing)
Again, that's a whole other science, and I couldn't comment on that =/

NicolasJay

May 17, 2011 12:00 AM

Edit: About Mix_Chunk::abuf, I looked at the documentation, and apparently Mix_LoadWAV does support MP3, so I suppose it may be possible to parse whatever that buffer points at, but I wouldn't really bother with it. I think you will have a very difficult time finding enough documentation to get this working properly. Again, I would recommend that you look for a more current library with better documentation.

If you choose to use SDL, you should call Mix_QuerySpec() after Mix_OpenAudio() in case SDL is unable to open the audio in the format you asked for. I say this because Mix_OpenAudio is a wrapper for SDL_OpenAudio, which takes the arguments SDL_AudioSpec *desired and SDL_AudioSpec *obtained. This means you may not end up with the format you wanted once the device has been opened, and you need to know the audio format before you can parse the data in the buffer.

I think you will have a very hard time finding proper documentation on all this for SDL. SDL 1.2 is very old, and SDL's Sam Lantinga also recently threw in the towel on the upcoming SDL 1.3. SDL 1.3 will continue as an open source project, but its future is nowhere near certain, and of course the incomplete documentation will undoubtedly be a major pain to deal with. SDL is a great starter library, but you will need something better for more advanced projects.
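To make the point about needing the format before parsing the buffer concrete, here is a small sketch that converts raw bytes to floats differently depending on the reported format. The `SampleFormat` enum is illustrative and covers only three cases; SDL has its own format constants, and the value obtained from the query would have to be mapped onto it by hand.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Illustrative subset of sample formats (not SDL's own constants).
enum class SampleFormat { U8, S16LE, S16BE };

// Convert raw bytes to floats in [-1, 1) according to the given format.
std::vector<float> toFloatSamples(const std::uint8_t* bytes, std::size_t length,
                                  SampleFormat fmt)
{
    std::vector<float> out;
    if (fmt == SampleFormat::U8)
    {
        out.reserve(length);
        for (std::size_t i = 0; i < length; ++i)
            out.push_back((static_cast<int>(bytes[i]) - 128) / 128.0f); // 128 is silence
        return out;
    }

    out.reserve(length / 2);
    for (std::size_t i = 0; i + 1 < length; i += 2)
    {
        // pick the byte order, then assemble a signed 16-bit value
        const unsigned lo = (fmt == SampleFormat::S16LE) ? bytes[i] : bytes[i + 1];
        const unsigned hi = (fmt == SampleFormat::S16LE) ? bytes[i + 1] : bytes[i];
        const std::int16_t s =
            static_cast<std::int16_t>(static_cast<std::uint16_t>(lo | (hi << 8)));
        out.push_back(s / 32768.0f);
    }
    return out;
}
```

The same two bytes give three different answers depending on the format, which is why guessing the format tends to produce noise rather than a usable spectrum.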

Concentrate

181

Author

May 17, 2011 01:14 AM

NicolasJay, I just want to say thank you for the time you have taken to help and guide me. If only more people were as kind and willing as you. As of right now, I'm not quite sure whether I want to use an external decoder or not. Right now I'm just reading some articles on DSP, but I will re-read your previous posts and think about exactly what I want to do. The end goal for me is to create cross-platform software that can play basic music formats and display some sort of visualization for the music currently playing.

P.S. If I have any more questions about this, would you mind if I PM you from time to time?

regards, D.Chhetri

NicolasJay

May 18, 2011 02:06 AM

No trouble at all, but you may have to be patient with me because there are times when I go quite a while without checking my PMs. If not me though, I'm sure someone will be around to help you out.

Nick

This topic is closed to new replies.
