
How are audio channels arranged in a .wav file?

Do you really think I'm crazy enough to tell the world my real name? · 2015-11-20T00:39:21

Okay, I've been given the task of writing a tool that gets the volume level of each audio channel within a .wav file with 4 channels. I don't know how the data is arranged (i.e. how the channels are arranged byte-wise), but I do know how to read a .wav file from scratch. Let's say samples are 16-bit: do I read every fourth word to get the data per channel? And if you're thinking of saying "use .ogg instead", just know that I can't, because the tool we are using here for my automation testing generates .wav files, and I have to build my automation around this and more, so I won't bother with that. Thanks, Shogun.
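
For reference, uncompressed sample data in a .wav file is interleaved frame by frame: one sample for channel 0, then channel 1, and so on, before the next frame starts. So with 4 channels and 16-bit samples, reading every fourth word does give you one channel. A minimal sketch of that idea, assuming the data chunk has already been loaded into memory as 16-bit values (the function name and parameters are made up for illustration):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Peak absolute sample value per channel for interleaved 16-bit PCM.
   `samples` holds frame_count * channels values laid out as
   ch0, ch1, ..., chN-1, ch0, ch1, ... (one frame after another).
   `peaks_out` must have room for `channels` entries. */
void peak_per_channel(const int16_t *samples, size_t frame_count,
                      unsigned channels, int *peaks_out)
{
    assert(channels > 0);

    for (unsigned c = 0; c < channels; ++c)
        peaks_out[c] = 0;

    for (size_t frame = 0; frame < frame_count; ++frame) {
        for (unsigned c = 0; c < channels; ++c) {
            /* Widen to int first so that -32768 does not overflow in abs(). */
            int v = abs((int)samples[frame * channels + c]);
            if (v > peaks_out[c])
                peaks_out[c] = v;
        }
    }
}
```

The frame count is the size of the data chunk in bytes divided by the block alignment (channels times bytes per sample).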

Started by blueshogun96 November 17, 2015 10:17 PM

13 comments, last by blueshogun96 8 years, 5 months ago

blueshogun96

2,267

Author

November 19, 2015 02:05 AM

wav isn't always raw PCM though. Make sure to check the RIFF to see if you need to decode it first.

Yeah, I always check the WAVEFORMATEX[TENSIBLE] structure rather than blindly assuming it's PCM. Made that mistake before, will not do it again.

Anyway, turns out the format is IEEE float. Damn! Looks like I've got some more work to do.

Shogun.
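
A rough, portable sketch of the format check discussed above, without windows.h. It assumes you already have the raw bytes of the "fmt " chunk body; the function and enum names are hypothetical. The format codes are the standard ones (1 for integer PCM, 3 for IEEE float, 0xFFFE for WAVE_FORMAT_EXTENSIBLE). For the extensible case it only looks at the first two bytes of the SubFormat GUID, which carry the format code for the standard KSDATAFORMAT subtypes; strict code would compare the whole GUID.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum {
    TAG_PCM        = 0x0001,
    TAG_IEEE_FLOAT = 0x0003,
    TAG_EXTENSIBLE = 0xFFFE
};

/* Read a little-endian 16-bit value regardless of host byte order. */
static uint16_t read_le16(const uint8_t *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

/* Returns the effective format code for a "fmt " chunk body,
   or 0 if the chunk is too short to contain it. */
uint16_t effective_format_tag(const uint8_t *fmt, size_t fmt_size)
{
    assert(fmt != NULL);

    if (fmt_size < 16)          /* smallest valid fmt chunk body */
        return 0;

    uint16_t tag = read_le16(fmt);
    if (tag == TAG_EXTENSIBLE) {
        if (fmt_size < 40)      /* extensible layout is 40 bytes */
            return 0;
        /* The SubFormat GUID starts at byte 24 of the chunk body. */
        tag = read_le16(fmt + 24);
    }
    return tag;
}
```

Anything other than TAG_PCM or TAG_IEEE_FLOAT would need a decoder before the samples can be read directly.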

Khatharr

8,814

November 19, 2015 02:15 AM

Is it a problem to work with float? Bah, I'll ask in the chat.

void hurrrrrrrr() {__asm sub [ebp+4],5;}

There are ten kinds of people in this world: those who understand binary and those who don't.

blueshogun96

2,267

Author

November 19, 2015 08:06 PM

Is it a problem to work with float? Bah, I'll ask in the chat.

I don't know whether it works roughly the same as integer PCM, only with 32-bit floating-point samples. And what would the maximum value of a sound sample be in this format?

Shogun.

EDIT: I came across this excellent article that explains audio conversion(s), mixing and PCM/IEEE float formatting fairly extensively. http://www.codeproject.com/Articles/501521/How-to-convert-between-most-audio-formats-in-NET

Thanks for the responses.

Khatharr

8,814

November 19, 2015 08:53 PM

It's still PCM. The floats are normalized -1 to 1.

Ah, yeah, that article talks about it. I went into the chat yesterday and you had left like 10 seconds prior.
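
To make the "normalized -1 to 1" point concrete, here is a small sketch of converting between float samples and 16-bit integer samples. The helper names are made up, and the scale factors (32767 one way, 32768 the other) are one common convention among several, so treat this as an illustration and not the only correct mapping:

```c
#include <assert.h>
#include <stdint.h>

/* Convert a normalized float sample to 16-bit PCM.
   Values outside [-1, 1] are clamped so they cannot wrap around. */
int16_t float_to_pcm16(float s)
{
    assert(s == s);             /* NaN is the only value not equal to itself */

    if (s > 1.0f)  s = 1.0f;
    if (s < -1.0f) s = -1.0f;

    /* Scale to the 16-bit range and round to the nearest integer. */
    float scaled = s * 32767.0f;
    return (int16_t)(scaled + (scaled >= 0.0f ? 0.5f : -0.5f));
}

/* Convert a 16-bit PCM sample to a normalized float in [-1, 1). */
float pcm16_to_float(int16_t s)
{
    return (float)s / 32768.0f;
}
```

Float files can legitimately contain samples slightly outside the -1 to 1 range, which is why the clamp is there.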


blueshogun96

2,267

Author

November 20, 2015 12:39 AM

Yeah, also came to that conclusion.

Now I have one problem: whenever I try to read a float value, I always get the value "-nan(ind)". I tried googling this, but nothing useful came up (mostly garbage). This is my code:


	void* riff = malloc( size );
	WAVEFORMATEXTENSIBLE* wfx = NULL;
	BYTE* pcm = NULL;
	DWORD dwPcmSize;

	fread( riff, size, 1, fp );

	int result = UnpackWAVDataEx( riff, &wfx, &pcm, &dwPcmSize );

	if( !result )
	{
		printf( "Could not read .wav file!\n" );
	}
	else if( wfx->Format.nChannels != 4 )
	{
		printf( "All wav files are expected to have 4 channels...\n" );
	}
	else if( wfx->SubFormat == KSDATAFORMAT_SUBTYPE_PCM )
	{
		printf( "PCM formatted .wav file detected...\n" );

		/* TODO: If necessary */
	}
	else if( wfx->SubFormat == KSDATAFORMAT_SUBTYPE_IEEE_FLOAT )
	{
		printf( "IEEE float formatted .wav file detected...\n" );

		float* soundbytes = (float*) pcm;
		
		float ch1 = 0.0f;
		float ch2 = 0.0f;
		float ch3 = 0.0f;
		float ch4 = 0.0f;

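		/* Samples are interleaved: each frame holds one float per channel. */
		DWORD frames = dwPcmSize / 16; /* 4 channels * 4 bytes per float */

		for( DWORD i = 0; i < frames * 4; i += 4 )
		{
			ch1 += GetDB(soundbytes[i+0]);
			ch2 += GetDB(soundbytes[i+1]);
			ch3 += GetDB(soundbytes[i+2]);
			ch4 += GetDB(soundbytes[i+3]);
		}

		/* Average over the number of frames, not the number of channels. */
		float avg[4] = { ch1/frames, ch2/frames, ch3/frames, ch4/frames };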

		FILE* fout = fopen( "out.txt", "w" );
		fprintf( fout, "Channel 1: %f\nChannel 2: %f\nChannel 3: %f\nChannel 4: %f\n", avg[0], avg[1], avg[2], avg[3] );
		fclose(fout);
	}
	else
	{
		printf( "This .wav format is not supported...\n" );
	}

Am I doing this right? Or do I have to do a conversion of DWORD to float every 4 bytes?

Shogun.

EDIT: Found the problem, it has to do with the way I calculate sound levels based on this algorithm: