How are audio channels arranged in a .wav file?

Started by
13 comments, last by blueshogun96 8 years, 4 months ago

Okay, I've been given the task of writing a tool that gets the volume level of each audio channel within a .wav file with 4 channels. I don't know how the data is arranged (i.e. how the channels are laid out byte-wise), but I do know how to read a .wav file from scratch. Let's say the samples are 16-bit: do I read every fourth word to get the data for one channel?

And if you're thinking of saying "use .ogg instead", just know that I can't: the tool we're using here for my automation testing generates .wav files, and I have to build my automation around this and more, so I won't bother with that.

Thanks,

Shogun.

If you can use an audio library, use that to do the heavy lifting for you. Otherwise, take a look at the WAV format specification. It won't be as simple as reading every other byte, because not only are WAV files chunked, they could also be compressed.

What library would you recommend, fastcall? I understand how RIFF works; it's just the audio channel layout I don't understand. When I reach the data chunk, how do I separate the audio channels from the data?

Shogun.

According to the DirectSound Programming Guide, you can use the mmio* functions (mmioOpen, mmioRead; winmm.lib, desktop only) provided by Win32. And I suppose OpenAL would be a suitable cross-platform alternative.

Thanks fastcall, but not quite. The mmio API doesn't appear to let me select one channel to read from. I'm not going to be playing back any of these audio samples either. I'm just going to be reading the sound samples to average out the sound levels from each.

Shogun.

Inside the RIFF container there is a WAVE form (though in fact a RIFF file can also carry non-WAVE payloads, such as MP3 audio).

Assuming you have a WAVE, the sample data layout depends on the compression, bitrate, number of channels, etc.

http://soundfile.sapp.org/doc/WaveFormat/
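Since the chunks can appear in any order with other chunks (LIST, fact, cue) in between, you have to walk them rather than assume fixed offsets. A minimal sketch of that walk, assuming a little-endian host (the function name and layout here are my own, not from a library):

```c
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Each RIFF chunk starts with a 4-byte ID and a 32-bit little-endian size. */
typedef struct { char id[4]; uint32_t size; } ChunkHeader;

/* Scan the file for a chunk with the given FourCC (e.g. "fmt " or "data").
   Returns the file offset of the chunk's payload, or -1 if not found.
   Assumes the first 12 bytes are the "RIFF" <size> "WAVE" header. */
long find_chunk(FILE *f, const char *fourcc)
{
    ChunkHeader h;
    fseek(f, 12, SEEK_SET);                      /* skip RIFF/WAVE header */
    while (fread(&h, sizeof h, 1, f) == 1) {
        if (memcmp(h.id, fourcc, 4) == 0)
            return ftell(f);                     /* payload starts here   */
        fseek(f, (h.size + 1) & ~1u, SEEK_CUR);  /* chunks are word-aligned */
    }
    return -1;
}
```

Once `find_chunk(f, "fmt ")` succeeds you can read the format fields, then seek to `find_chunk(f, "data")` for the samples.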

The data is WAVE_FORMAT_EXTENSIBLE. I guess I have to understand this format first before moving forward.

Shogun.
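For reference, WAVE_FORMAT_EXTENSIBLE (tag 0xFFFE) just appends a few fields after the base fmt fields; the "real" format tag lives in the first two bytes of the SubFormat GUID. A sketch of those extra fields, with field names loosely following the Microsoft headers (note the on-disk extension is packed into 22 bytes, so read it field by field rather than fread-ing a struct):

```c
#include <stdint.h>

/* Extension fields that WAVE_FORMAT_EXTENSIBLE appends to the fmt chunk.
   For a 4-channel file, channel_mask tells you which speakers the four
   interleaved channels map to (quad is typically 0x33: front L/R + back L/R). */
typedef struct {
    uint16_t valid_bits_per_sample; /* often equal to bits_per_sample */
    uint32_t channel_mask;          /* speaker-position bitmask       */
    uint8_t  sub_format[16];        /* GUID; bytes 0-1 = real format tag */
} FmtExtension;

/* Pull the underlying format tag out of the SubFormat GUID.
   1 means the data chunk is plain PCM after all. */
uint16_t extensible_tag(const uint8_t sub_format[16])
{
    return (uint16_t)(sub_format[0] | (sub_format[1] << 8));
}
```

So an "extensible" file with a PCM SubFormat is read exactly like an ordinary PCM file; only the header grows.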

For PCM (almost all wav files), audio data is stored as an array of interleaved channels of whatever the sample size is (8, 16, 24 common, 32, 64, 32fp, 64fp possible). Integer values are signed little-endian.

e.g.

sample1L sample1R sample2L sample2R sample3L sample3R...


Okay, I had a feeling that was it. If that's the case, then I can write this tool easily and quickly. Thanks.

Shogun.

wav isn't always raw PCM, though. Make sure to check the format tag in the fmt chunk to see if you need to decode it first.
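That check can be a simple gate on the fmt chunk's format tag before treating the data chunk as raw samples. A sketch (the function name is mine; the tag values are the standard WAVE format tags):

```c
#include <stdint.h>

/* Return nonzero if the data chunk needs a decoder before the
   interleaved-sample reading described above will work.
   1 = PCM and 3 = IEEE float are directly readable; 0xFFFE
   (WAVE_FORMAT_EXTENSIBLE) defers to its SubFormat GUID, which
   should be inspected separately. */
int needs_decoder(uint16_t format_tag)
{
    switch (format_tag) {
    case 0x0001: /* WAVE_FORMAT_PCM */
    case 0x0003: /* WAVE_FORMAT_IEEE_FLOAT */
    case 0xFFFE: /* WAVE_FORMAT_EXTENSIBLE: check SubFormat too */
        return 0;
    default:     /* e.g. 0x0011 IMA ADPCM, 0x0055 MP3 */
        return 1;
    }
}
```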

void hurrrrrrrr() {__asm sub [ebp+4],5;}

There are ten kinds of people in this world: those who understand binary and those who don't.

This topic is closed to new replies.
