
Manipulating sounds C++


I've decided to write a music visualization program. Right now I'm looking at FMOD; it seems relatively simple. But I want to extract the raw audio data from the music file and run an FFT on it to drive the visualization. Is it possible to do this with FMOD, or do you have any other suggestions? Thanks



regards, D.Chhetri

EDIT: I'm using SDL, so I'm looking at its music classes/functions; still, any advice would be great.

[quote name='D.Chhetri' timestamp='1305393437' post='4810755']
EDIT: I'm using SDL, so I'm looking at its music classes/functions; still, any advice would be great.
[/quote]

Firstly, if you were going to choose to go it alone, I would recommend the SFML library instead of SDL. SDL_mixer is very old, and the 'C' style interface makes me cry..

I would go with SFML 1.6, which has a C++ style interface. You will have to build the DLLs yourself, but it's not that hard to get up and running, and you can get help in their forum if you need it. You should search first though, because there are already a ton of answered questions in their forum about how to build the SFML DLLs.

[url="http://www.sfml-dev.org/download.php"]http://www.sfml-dev.org/download.php[/url]

BUT, unless you are already good at low level programming, this may not be the right place to start. I would recommend starting out with something like a Winamp Plugin. Starting an MP3 player from scratch is a ton of work, whereas Winamp already has a development kit that is specifically designed for exactly what you wanna do.

[url="http://wiki.winamp.com/wiki/Plug-in_Developer"]http://wiki.winamp.com/wiki/Plug-in_Developer[/url]

Then, once you get the theory down, you can look at breaking away from Winamp and making something lower level if you want.

I think I can handle it without doing it as a plugin; the whole point of this project is to exercise my programming ability, not necessarily to make the actual product. I'm not a big fan of C-style programming either. But I'm already using OpenGL and SDL, so I figured I would use what's already provided. Do you think I would be able to retrieve the raw data using SDL functions?

If you use SDL_mixer, you can access the raw data through the Mix_Chunk data structure, which contains a pointer to the data. If you use SDL_LoadWAV, the raw data is loaded into a buffer, so either way it is easy to access the raw data with SDL.
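
For illustration, here is a minimal sketch of turning such a raw byte buffer into samples. It assumes the bytes are 16-bit signed little-endian PCM (the AUDIO_S16LSB format that comes up later in the thread); the function name bytesToSamplesLE is made up for this example and is not part of SDL. With SDL_mixer, the pointer and byte count would presumably come from the chunk's abuf and alen members.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper (not an SDL function): interprets a raw byte buffer
// as 16-bit signed little-endian samples. A trailing odd byte is ignored.
std::vector<std::int16_t> bytesToSamplesLE(const std::uint8_t* data,
                                           std::size_t byteCount)
{
    std::vector<std::int16_t> samples;
    samples.reserve(byteCount / 2);
    for (std::size_t i = 0; i + 1 < byteCount; i += 2)
    {
        // low byte first, then high byte
        const std::uint16_t raw = static_cast<std::uint16_t>(
            data[i] | (static_cast<unsigned>(data[i + 1]) << 8));
        // reinterpret the 16 bits as a signed value (two's complement)
        samples.push_back(static_cast<std::int16_t>(raw));
    }
    return samples;
}
```

Assembling the bytes by hand like this avoids alignment and host byte-order surprises, at the cost of one copy of the data.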

Well then, what you will have to do is something along these lines:

Instead of using Mix_OpenAudio(), which is a wrapper function, you will have to use the lower-level [url="http://wiki.libsdl.org/moin.cgi/SDL_OpenAudio"]SDL_OpenAudio()[/url]. It lets you set your own audio callback and feed audio into the stream directly through the *stream pointer in the callback. This of course means that you will have to decode the audio yourself.

I wrote a class for playing movies this way not too long ago.
Here is the code for reference:

[b]sdlffmovie.h[/b]
[source lang="cpp"]
#ifndef INC_SDLFFMOVIE_H
#define INC_SDLFFMOVIE_H
#include "Definitions.h"
#include <SDL.h>
#include <SDL_ffmpeg.h>

#define BUFFER_SIZE 20 // in audio frames

class SDL_FFMovie
{
public:
    SDL_FFMovie();
    ~SDL_FFMovie();

    // run a movie
    int Run(SDL_Surface *screen, int x, int y, const char *filename);
    bool movieDone;

private:
    // get the current time of the playing movie
    int64_t GetTimestamp();

    // create a buffer for audio frames
    SDL_ffmpegAudioFrame **audioFrame;

    // the frame used for drawing the current movie
    SDL_ffmpegVideoFrame *videoFrame;

    // the movie file
    SDL_ffmpegFile *movie;

    /* use a mutex to prevent errors due to multithreading */
    SDL_mutex *mutex;

    // the width and height of the current movie
    int movie_w;
    int movie_h;

    // start time and position of the current movie
    int64_t timestamp;
    int64_t movStart;

    // callback for SDL audio playback
    static void AudioCallback(void *data, Uint8 *stream, int length);
    void ThisAudioCallback(void *data, Uint8 *stream, int length);

    void Cleanup();
};

#endif /* INC_SDLFFMOVIE_H */
[/source]
[b]sdlffmovie.cpp[/b]
[source lang="cpp"]
#include "sdlffmovie.h"
#include <cstdio>  // fprintf
#include <cstring> // memcpy, memset

int SDL_FFMovie::Run(SDL_Surface *screen, int x, int y, const char *filename)
{
    // open the mpeg file
    movie = SDL_ffmpegOpen(filename);
    if(movie == NULL)
    {
        fprintf(stderr, "could not open %s: %s\n", filename, SDL_ffmpegGetError());
        return -1;
    }

    // create a mutex for the movie data
    mutex = SDL_CreateMutex();

    // select the first audio and video streams from the mpeg
    SDL_ffmpegSelectAudioStream(movie, 0);
    SDL_ffmpegSelectVideoStream(movie, 0);

    // retrieve the audio format of the movie
    SDL_AudioSpec specs = SDL_ffmpegGetAudioSpec(movie, 512, AudioCallback);
    specs.userdata = this; // pass "this" to the audio callback

    // retrieve the video size of the movie
    SDL_ffmpegGetVideoSize(movie, &movie_w, &movie_h);

    // create a video frame to blit to the screen
    videoFrame = SDL_ffmpegCreateVideoFrame();
    videoFrame->surface = SDL_CreateRGBSurface(0, movie_w, movie_h, 24, 0x0000FF, 0x00FF00, 0xFF0000, 0);

    // create the source and destination rectangles
    SDL_Rect src_rect = {0, 0, movie_w, movie_h};
    SDL_Rect dst_rect = {x, y, movie_w, movie_h};

    if(SDL_ffmpegValidAudio(movie))
    {
        // open the audio device
        if(SDL_OpenAudio(&specs, 0) != 0)
        {
            fprintf(stderr, "Couldn't open audio: %s\n", SDL_GetError());
            Cleanup();
            return -1;
        }

        // calculate the audio frame size (2 bytes per sample)
        int frameSize = specs.channels * specs.samples * 2;

        // allocate the audio buffer; the trailing () zero-initializes the
        // pointers so Cleanup() can tell which frames were actually created
        audioFrame = new SDL_ffmpegAudioFrame*[BUFFER_SIZE]();

        // create and fill the audio buffer
        for(int i = 0; i < BUFFER_SIZE; i++)
        {
            audioFrame[i] = SDL_ffmpegCreateAudioFrame(movie, frameSize);

            if(audioFrame[i] == NULL)
            {
                Cleanup();
                return -1;
            }

            SDL_ffmpegGetAudioFrame(movie, audioFrame[i]);
        }

        // unpause audio so the buffer starts being read.
        SDL_PauseAudio(0);

        // store the time at which the movie started playing
        movStart = SDL_GetTicks();
    }

    SDL_Event Event;
    movieDone = false;
    while(movieDone == false)
    {
        /*Uint8 *keys = SDL_GetKeyState(NULL);
        if(keys[SDLK_ESCAPE])
            movieDone = true;*/

        // handle keyboard and mouse input
        while (SDL_PollEvent(&Event))
        {
            switch(Event.type)
            {
            case SDL_MOUSEMOTION:
                movieDone = true;
                break;
            case SDL_KEYDOWN:
                movieDone = true;
                switch(Event.key.keysym.sym)
                {
                case SDLK_ESCAPE:
                    movieDone = true;
                }
                break;
            case SDL_QUIT:
                movieDone = true;
                break;
            }
        }

        // fill up the audio buffer if necessary
        if(SDL_ffmpegValidAudio(movie))
        {
            SDL_LockMutex(mutex);

            for(int i = 0; i < BUFFER_SIZE; i++)
            {
                // check if frame is empty
                if(audioFrame[i]->size == 0)
                {
                    // fill frame with new data
                    SDL_ffmpegGetAudioFrame(movie, audioFrame[i]);
                }
            }

            SDL_UnlockMutex(mutex);
        }

        // draw the video frame
        if(videoFrame)
        {
            // if the current frame has expired, get a new one
            if(videoFrame->pts < GetTimestamp())
            {
                SDL_ffmpegGetVideoFrame(movie, videoFrame);
            }

            // draw the current frame to the screen
            if(videoFrame->surface != NULL)
            {
                SDL_FillRect(screen, 0, 0);
                SDL_BlitSurface(videoFrame->surface, &src_rect, screen, &dst_rect);
                SDL_Flip(screen);
            }

            // exit if this is the last frame
            if(videoFrame->last)
            {
                movieDone = true;
            }
        }
    }

    Cleanup();

    return 0;
}

void SDL_FFMovie::AudioCallback(void *data, Uint8 *stream, int length)
{
    ((SDL_FFMovie*)data)->ThisAudioCallback(NULL, stream, length);
}

void SDL_FFMovie::ThisAudioCallback(void *data, Uint8 *stream, int length)
{
    // lock mutex, so audioFrame[] will not be changed from another thread
    SDL_LockMutex( mutex );

    if(audioFrame[0]->size == length)
    {
        // update timestamp
        timestamp = audioFrame[0]->pts;

        // copy one frame from the buffer to the stream
        memcpy(stream, audioFrame[0]->buffer, audioFrame[0]->size);

        // mark the frame as used
        audioFrame[0]->size = 0;

        // move the empty frame to the end of the buffer
        SDL_ffmpegAudioFrame *f = audioFrame[0];
        for(int i = 1; i < BUFFER_SIZE; i++ )
        {
            audioFrame[i - 1] = audioFrame[i];
        }

        audioFrame[BUFFER_SIZE - 1] = f;
    }
    else
    {
        // no frames available
        memset(stream, 0, length);
    }

    SDL_UnlockMutex( mutex );
}

int64_t SDL_FFMovie::GetTimestamp()
{
    // return the position that the current movie should be at
    if(SDL_ffmpegValidAudio(movie))
    {
        return timestamp;
    }
    else if(SDL_ffmpegValidVideo(movie))
    {
        return SDL_GetTicks() - movStart;
    }

    return 0;
}

void SDL_FFMovie::Cleanup()
{
    // stop any audio playback; this has to happen before the movie is
    // freed, because the check needs a valid movie handle
    if(movie != NULL && SDL_ffmpegValidAudio(movie))
    {
        SDL_PauseAudio(1);
    }

    SDL_CloseAudio();

    // free the movie file
    if(movie != NULL)
    {
        SDL_ffmpegFree(movie);
        movie = NULL;
    }

    // free all audio frames and delete the buffer
    if(audioFrame != NULL)
    {
        for(int i = 0; i < BUFFER_SIZE; i++)
        {
            if(audioFrame[i] != NULL)
            {
                SDL_ffmpegFreeAudioFrame(audioFrame[i]);
            }
        }

        delete [] audioFrame;
        audioFrame = NULL;
    }

    // free video frame
    if(videoFrame != NULL)
    {
        SDL_ffmpegFreeVideoFrame(videoFrame);
        videoFrame = NULL;
    }

    // destroy the mutex
    if(mutex != NULL)
    {
        SDL_DestroyMutex(mutex);
        mutex = NULL;
    }

    movie_w = 0;
    movie_h = 0;

    timestamp = 0;
    movStart = 0;
}

SDL_FFMovie::SDL_FFMovie()
{
    audioFrame = NULL;
    videoFrame = NULL;
    movie = NULL;
    mutex = NULL;
    movie_w = 0;
    movie_h = 0;
    timestamp = 0;
    movStart = 0;
}

SDL_FFMovie::~SDL_FFMovie()
{

}
[/source]

The two important parts are this one, from the main loop where I use the ffmpeg decoder to decompress audio frames and fill up the buffer:
[source lang="cpp"]
// fill up the audio buffer if necessary
if(SDL_ffmpegValidAudio(movie))
{
    SDL_LockMutex(mutex);

    for(int i = 0; i < BUFFER_SIZE; i++)
    {
        // check if frame is empty
        if(audioFrame[i]->size == 0)
        {
            // fill frame with new data
            SDL_ffmpegGetAudioFrame(movie, audioFrame[i]);
        }
    }

    SDL_UnlockMutex(mutex);
}
[/source]

and this one, inside the audio callback where I feed them into the audio stream whenever SDL calls the callback because it needs more audio data to continue playing.
[source lang="cpp"]
if(audioFrame[0]->size == length)
{
    // update timestamp
    timestamp = audioFrame[0]->pts;

    // copy one frame from the buffer to the stream
    memcpy(stream, audioFrame[0]->buffer, audioFrame[0]->size);

    // mark the frame as used
    audioFrame[0]->size = 0;

    // move the empty frame to the end of the buffer
    SDL_ffmpegAudioFrame *f = audioFrame[0];
    for(int i = 1; i < BUFFER_SIZE; i++ )
    {
        audioFrame[i - 1] = audioFrame[i];
    }

    audioFrame[BUFFER_SIZE - 1] = f;
}
[/source]

then of course, SDL_PauseAudio(0) causes SDL to start playing the audio, and hence calling the callback asking for audio data.

In your case, it may be more appropriate to find another audio decoder just for wave files instead of ffmpeg.

Hmmmm, I was thinking more like the following :

1) Get raw data
2) Do Transform
3) Calculate proper timing
4) Play Sound, and display spectrum making sure they are in sync.

That way I thought I wouldn't have to deal with the video stuff. Also, I was thinking that when I do [i]Mix_Chunk *data = Mix_LoadWAV("a.wav")[/i], data->abuf contains the full music data in a.wav? So I figured I would have to run the DFT on data->abuf and go on from there. What do you think?

[quote name='D.Chhetri' timestamp='1305573890' post='4811586']
Hmmmm, I was thinking more like the following :

1) Get raw data
2) Do Transform
3) Calculate proper timing
4) Play Sound, and display spectrum making sure they are in sync.

That way I thought I wouldn't have to deal with the video stuff. Also, I was thinking that when I do [i]Mix_Chunk *data = Mix_LoadWAV("a.wav")[/i], data->abuf contains the full music data in a.wav? So I figured I would have to run the DFT on data->abuf and go on from there. What do you think?
[/quote]

1) Get raw data:
SDL is not needed to do this. What you need is a decoder. It sounds like you are trying to make a DJ program, in which case you would probably be reading MP3s primarily. Mix_LoadMUS() in SDL_mixer uses the SMPEG library internally to load MP3 files. I am sure you can dig up the documentation on how to use SMPEG, since you can easily get the library from the SDL website, but I would strongly recommend trying to find a different decoder, as SMPEG is extremely old. I think there are quite a few different decoders that you can choose from. [url="http://www.codeproject.com/KB/audio-video/madlldlib.aspx"]This page on Codeproject[/url] shows one example using the "libmad" MP3 decoder. It's up to you though, as long as you have some decoder that will read in the compressed music file of your choice and give you the raw wav data.

2) Do transform:
Once you have the raw wav data, you can display it fairly easily. I am not sure of all the specifics, but when they say "a 16-bit signed wave file" they literally mean that one sample (one value for one channel) can be represented with the 16-bit signed C++ data type "[url="http://www.cplusplus.com/doc/tutorial/variables/"]short int[/url]", or just "short". That 16-bit signed value represents the amplitude of the wave at that point in time, which is not hard to represent graphically. Transforming it, on the other hand, is much more difficult; I couldn't comment on it.
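
As a rough sketch of the "represent it graphically" part: in interleaved stereo PCM, one frame holds one sample per channel (left, then right). The code below assumes exactly that layout with 16-bit signed samples; StereoBlock, deinterleave and peakAmplitude are names invented for this example. It splits the channels and scales each sample to the range -1..1, which is a convenient form for drawing.

```cpp
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical container for one block of stereo audio, scaled to [-1, 1).
struct StereoBlock
{
    std::vector<float> left;
    std::vector<float> right;
};

// Splits interleaved 16-bit stereo (L R L R ...) into two float channels.
// frameCount is the number of L/R pairs, so `samples` holds 2 * frameCount values.
StereoBlock deinterleave(const std::int16_t* samples, std::size_t frameCount)
{
    StereoBlock block;
    block.left.reserve(frameCount);
    block.right.reserve(frameCount);
    for (std::size_t i = 0; i < frameCount; ++i)
    {
        block.left.push_back(samples[2 * i] / 32768.0f);
        block.right.push_back(samples[2 * i + 1] / 32768.0f);
    }
    return block;
}

// Largest absolute amplitude in a channel, e.g. for a simple level meter.
float peakAmplitude(const std::vector<float>& channel)
{
    float peak = 0.0f;
    for (std::size_t i = 0; i < channel.size(); ++i)
    {
        const float a = std::fabs(channel[i]);
        if (a > peak)
        {
            peak = a;
        }
    }
    return peak;
}
```

Drawing the waveform is then just a matter of mapping each value in -1..1 to a vertical screen position.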

3) Calculate proper timing:
This is where the audio callback from my last post comes in. You don't have to calculate the timing yourself; SDL will do it for you. You just have to fill the audio buffer with the output from the decoder whenever SDL calls the callback function. I am fairly certain that the audio callback is the only way SDL will let you feed sound data to the sound card to be played. If you did want to control the timing yourself, I don't think it's as simple as waiting longer before sending the next chunk of audio; you would get all kinds of crackling and popping. My guess is that it would be similar to how images are resampled when you stretch them in Paint (the pixels get doubled up, and in better apps like Photoshop a type of blur or "filter" is applied to smooth things out so they don't look pixelated). Basically, you would have to "transform" your data and lengthen or shorten it before putting it into your audio buffer.

4) Play sound, and display spectrum making sure they are in sync:
Like I said above, feeding the data into the stream from the callback is how you make it play, and SDL decides when it is appropriate to call the audio callback, i.e., when enough time has passed that the next chunk of sound should be playing.
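
To make the callback idea concrete without the ffmpeg parts, here is a stripped-down sketch. It uses the same parameter list as an SDL audio callback (userdata, stream, length) but only standard C++ types so it can be read on its own; PlaybackState and fillAudio are names I made up. It assumes the whole song has already been decoded into one buffer in the device's format, and that silence is all zero bytes (true for signed formats).

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical playback state, handed to the callback through userdata.
struct PlaybackState
{
    std::vector<std::uint8_t> pcm; // decoded audio, already in device format
    std::size_t position = 0;      // index of the next byte to play
};

// Copies the next `len` bytes of audio into `stream`; once the data runs
// out, the rest of the stream is filled with silence (zero bytes).
void fillAudio(void* userdata, std::uint8_t* stream, int len)
{
    PlaybackState* state = static_cast<PlaybackState*>(userdata);
    if (len <= 0)
    {
        return;
    }

    const std::size_t wanted = static_cast<std::size_t>(len);
    const std::size_t remaining = state->pcm.size() - state->position;
    const std::size_t count = std::min(wanted, remaining);

    if (count > 0)
    {
        std::memcpy(stream, state->pcm.data() + state->position, count);
    }
    std::memset(stream + count, 0, wanted - count);

    // The visualizer can read `position` to know which samples are being
    // played right now. In a real program this runs on the audio thread,
    // so reads from the main thread would need a lock around them.
    state->position += count;
}
```

Because the callback itself advances position, the display code never has to compute timing; it just looks at the samples around the current position.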

And I think by DFT you mean DSP, right? (digital signal processing)
Again, that's a whole other science... and I couldn't comment on that =/

Edit: About Mix_Chunk::abuf, I looked at the [url="http://sdl.beuc.net/sdl.wiki/Mix_LoadWAV"]documentation[/url], and apparently Mix_LoadWAV does support MP3, so I suppose it may be possible to parse whatever that buffer points at, but I wouldn't really bother with it. I think you will have a very difficult time finding enough documentation to get this working properly. Again, I would recommend that you look for a more current library with better documentation.

If you choose to use SDL, you should call Mix_QuerySpec() after Mix_OpenAudio() in case SDL is unable to open the audio device in the format you asked for. I am saying this because Mix_OpenAudio is a wrapper for [url="http://www.libsdl.org/docs/html/sdlopenaudio.html"]SDL_OpenAudio[/url], which takes the arguments SDL_AudioSpec *desired and SDL_AudioSpec *obtained. This means that you may not end up with the format you wanted after the device has been opened, and you need to know the audio format before you can parse the data in the buffer.

I think you will have a very hard time finding proper documentation on all this for SDL. SDL 1.2 is very old, and SDL's Sam Lantinga also recently threw in the towel on the upcoming SDL 1.3. SDL 1.3 will continue as an open source project, but its future is nowhere near certain, and of course the incomplete documentation will undoubtedly be a major pain to deal with. SDL is a great starter library, but you will need something better for more advanced projects.
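
On knowing the format before parsing the buffer: as far as I remember from SDL_audio.h, the 16-bit format code packs the sample size into the low byte, with flag bits for big-endian (0x1000) and signed (0x8000), so AUDIO_S16LSB is 0x8010 and AUDIO_U8 is 0x0008. Treat those bit positions as an assumption to check against your SDL headers. The sketch below decodes such a code; FormatInfo, describeFormat and bytesPerFrame are hypothetical helpers, not SDL functions.

```cpp
#include <cstdint>

// Hypothetical description of an SDL-style 16-bit audio format code.
struct FormatInfo
{
    int  bitsPerSample;
    bool isSigned;
    bool isBigEndian;
};

// Decodes the format code, assuming this bit layout:
// low byte = bits per sample, 0x1000 = big-endian, 0x8000 = signed.
FormatInfo describeFormat(std::uint16_t format)
{
    FormatInfo info;
    info.bitsPerSample = format & 0xFF;
    info.isSigned = (format & 0x8000) != 0;
    info.isBigEndian = (format & 0x1000) != 0;
    return info;
}

// Number of bytes taken by one frame (one sample for every channel).
int bytesPerFrame(const FormatInfo& info, int channels)
{
    return (info.bitsPerSample / 8) * channels;
}
```

The format and channel count passed in would be the values that Mix_QuerySpec() reports, not the ones you asked for.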

NicolasJay, I just want to say thank you for the time you have taken to help and guide me. If only more people were as kind and willing as you. As of right now, I'm not quite sure whether I want to use an external decoder or not. Right now I'm just reading some articles on DSP, but I will re-read your previous posts and think about exactly what I want to do. The end goal for me is to create cross-platform software that can play basic music formats and display some sort of visualization for the music currently playing.

P.S. If I have any more questions about this, would you mind if I PM you from time to time?


regards, D.Chhetri

[quote name='D.Chhetri' timestamp='1305594881' post='4811719']
If I have any more questions about this, would you mind if I PM you from time to time?
[/quote]

No trouble at all, but you may have to be patient with me because there are times when I go quite a while without checking my PMs. If not me though, I'm sure someone will be around to help you out.

Nick

So here is the deal: after reading and experimenting, I decided to just use SDL_mixer. It has a couple of functions that I can use.

First, [i]Mix_OpenAudio(MIX_DEFAULT_FREQUENCY, MIX_DEFAULT_FORMAT, STEREO, SAMPLE_CHUNK)[/i]

As you said, I realize that sometimes there might be trouble and the device might not open with the default values I asked for, but for now this is the easiest way to approach my project.

Now, [i]Mix_RegisterEffect(CHANNEL, SampleChunkProcessorCallBack, endProcessCallBack, ARG_DATA);[/i]

This registers a callback, like you said, on channel CHANNEL.

And this function: [i]Mix_PlayChannel(CHANNEL, music, NUMBER_OF_TIMES_TO_PLAY_AGAIN);[/i]

So when SDL starts playing the music, SampleChunkProcessorCallBack will be called with the sample chunk data and its length. The chunk data is passed as a void*, but it actually holds 16-bit samples (MIX_DEFAULT_FORMAT => AUDIO_S16LSB => 0x8010 => "16 bit sample").

So now, when the callback function is called, I do the following:

1) Convert the void *chunkData into an array of 16-bit values (short int).
2) Run a DFT (FFT) on the time-domain data to convert it into the frequency domain.
3) Use the frequency-domain information somehow (it contains a vector of real parts and a vector of imaginary parts) to get some numerical data.
4) Use that numerical data to display the result on a graph using OpenGL. Presumably, just display the wave spectrum (if that's what it's called).
5) Repeat from 1.
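
Steps 1 to 3 could be sketched roughly as follows. This is a deliberately naive O(N^2) DFT written out for clarity, not an FFT; a real visualizer would swap in an FFT routine for speed. It assumes mono 16-bit signed samples, and spectrumMagnitudes is a name made up for this example.

```cpp
#include <cmath>
#include <complex>
#include <cstddef>
#include <cstdint>
#include <vector>

// Naive DFT of n 16-bit samples. Returns one magnitude per frequency bin
// from 0 up to n/2 (the bins above n/2 mirror these for real input).
std::vector<double> spectrumMagnitudes(const std::int16_t* samples, std::size_t n)
{
    if (n == 0)
    {
        return std::vector<double>();
    }

    const double pi = std::acos(-1.0);
    std::vector<double> mags(n / 2 + 1, 0.0);

    for (std::size_t k = 0; k < mags.size(); ++k)
    {
        std::complex<double> sum(0.0, 0.0);
        for (std::size_t t = 0; t < n; ++t)
        {
            // scale the sample to [-1, 1) and correlate it with bin k
            const double x = samples[t] / 32768.0;
            const double angle = -2.0 * pi * double(k) * double(t) / double(n);
            sum += x * std::complex<double>(std::cos(angle), std::sin(angle));
        }
        // magnitude of the complex result, normalized by the chunk length
        mags[k] = std::abs(sum) / double(n);
    }
    return mags;
}
```

With this normalization, a pure sine of amplitude A that fits the chunk exactly shows up as a single bin with magnitude A/2. Each entry of the result can be drawn directly as the height of one bar.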

I might have to do some filtering and apply some noise reduction technique before displaying it.

What do you think? Or anyone else who can help, for that matter?


regards, D.Chhetri

May I suggest [url="http://fftw.org/"]FFTW[/url] for doing the DFT? We used it in uni for a project and it worked out pretty well.

To calculate the amplitude at a certain frequency, you have to take the magnitude of the complex number, i.e. sqrt(real^2 + imaginary^2).
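
In code, that magnitude step might look like the sketch below. It assumes the transform has left you with two equally long arrays, one of real parts and one of imaginary parts; binMagnitudes and toDecibels are names invented for this example. The decibel conversion is an optional extra that often makes a spectrum display easier to read.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Magnitude of every frequency bin: sqrt(re^2 + im^2).
std::vector<double> binMagnitudes(const std::vector<double>& re,
                                  const std::vector<double>& im)
{
    const std::size_t n = re.size() < im.size() ? re.size() : im.size();
    std::vector<double> mags(n);
    for (std::size_t i = 0; i < n; ++i)
    {
        // std::hypot computes sqrt(x*x + y*y) without intermediate overflow
        mags[i] = std::hypot(re[i], im[i]);
    }
    return mags;
}

// Converts a magnitude to decibels, clamped at floorDb so that silence
// (magnitude 0) does not turn into minus infinity.
double toDecibels(double magnitude, double floorDb = -120.0)
{
    if (magnitude <= 0.0)
    {
        return floorDb;
    }
    const double db = 20.0 * std::log10(magnitude);
    return db < floorDb ? floorDb : db;
}
```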

Depending on the chunk size that SDL throws at you, you may see some artifacts in the frequency domain, since cutting a chunk out of the waveform effectively applies a [url="http://en.wikipedia.org/wiki/Window_function"]rectangular window function[/url] to it.

cheers

With artifacts I mean the following:

The DFT gives you a discrete frequency spectrum. When you sample a chunk of a longer waveform, there will be frequency components present which don't quite fit into the chunk (i.e. the chunk length is not an integer multiple of their wavelength).
This will 'smear' the frequency spectrum.
However, I guess for just visualizing music it should be good enough.
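
If the smearing does turn out to be visible, a common mitigation is to multiply each chunk by a smoother window before the transform. Here is a minimal sketch using a Hann window; hannWindow and applyWindow are names made up for this example, and the samples are assumed to be already converted to double.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <vector>

// Hann window of length n: w[i] = 0.5 * (1 - cos(2*pi*i / (n - 1))).
// It tapers to zero at both ends, which softens the chunk edges.
std::vector<double> hannWindow(std::size_t n)
{
    std::vector<double> w(n, 1.0);
    if (n < 2)
    {
        return w; // nothing to taper
    }
    const double pi = std::acos(-1.0);
    for (std::size_t i = 0; i < n; ++i)
    {
        w[i] = 0.5 * (1.0 - std::cos(2.0 * pi * double(i) / double(n - 1)));
    }
    return w;
}

// Multiplies the samples by the window in place, element by element.
void applyWindow(std::vector<double>& samples, const std::vector<double>& window)
{
    const std::size_t n = std::min(samples.size(), window.size());
    for (std::size_t i = 0; i < n; ++i)
    {
        samples[i] *= window[i];
    }
}
```

The window only depends on the chunk length, so it can be computed once and reused for every chunk.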

Also keep in mind that the lowest frequency the DFT can resolve is limited by the length of the chunk.
Shorter chunk = better time resolution, worse frequency resolution.
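
To put numbers on that trade-off: the spacing between DFT bins is the sample rate divided by the chunk length, and the chunk duration is the chunk length divided by the sample rate. The helper names below are invented for this sketch, and the 44100 Hz rate in the note after it is just an example value.

```cpp
#include <cstddef>

// Spacing between neighbouring DFT bins, in Hz. This is also the lowest
// non-zero frequency the transform can distinguish.
double binWidthHz(double sampleRate, std::size_t chunkLength)
{
    return sampleRate / static_cast<double>(chunkLength);
}

// Centre frequency of bin k, in Hz.
double binFrequencyHz(std::size_t k, double sampleRate, std::size_t chunkLength)
{
    return static_cast<double>(k) * binWidthHz(sampleRate, chunkLength);
}

// How much time one chunk covers, in seconds.
double chunkDurationSeconds(double sampleRate, std::size_t chunkLength)
{
    return static_cast<double>(chunkLength) / sampleRate;
}
```

For example, at 44100 Hz a 1024-sample chunk gives bins about 43 Hz apart and covers about 23 ms, while a 4096-sample chunk gives bins about 11 Hz apart but covers about 93 ms.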

I am actually working on something now where I have to do pretty much all the low level sound stuff by myself =/

anyways, I came across this while researching:
http://www.mixxx.org/download.php

open source dj mixing software. It's quite a bit of code, but still probably a good resource.

anyways, off to continue digging..ugh
