Compressed sound, what format to use?
I suspect you're not resetting the parameters at the start of each block. See http://wiki.multimedia.cx/index.php?title=Microsoft_IMA_ADPCM for how it does that for MS ADPCM.
That sounds about right! Thanks for your reply! Although .. I'm having a bit of trouble getting my head around this.. Does this mean I should do the decoding in chunks? Right now I'm applying the decoding algorithm to the entire data:
[source]int decode(int16* dst, const uint8* src, uint srcSize)[/source]
where srcSize is the size of the encoded data in bits and my predictedValue and stepIndex continuously follow through to the next nibble-iteration. Do I need to instead do this per block (block align in fmt-chunk?)?
perhaps something like this:
[source]int decode(int16* dst, const uint8* src, int srcOffset, uint srcSize)[/source]
where srcOffset is the current offset into the source data?
Thanks in advance!
That sounds about right to me. Note that the MS-ADPCM format stores the predictedValue and stepIndex at the start of each block. I suspect your data will be similar.
You should be able to work out how much extra header data there is - if there was no block header the compression ratio would be exactly 4:1.
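As a rough sketch of that arithmetic in C: the helper names below are made up for illustration, and it assumes the layout from the wiki page linked above (a 4-byte preamble per channel in every block, two 4-bit samples per remaining byte, and the preamble's predictor counting as the block's first sample).

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: every block starts with a 4-byte preamble per channel. */
enum { IMA_PREAMBLE_BYTES = 4 };

/* Samples per channel produced by one block of block_align bytes.
   The predictor stored in the preamble is counted as the first sample. */
static uint32_t ima_samples_per_block(uint32_t block_align, uint32_t channels)
{
    assert(channels > 0 && block_align >= IMA_PREAMBLE_BYTES * channels);
    uint32_t payload_per_channel = block_align / channels - IMA_PREAMBLE_BYTES;
    return payload_per_channel * 2 + 1; /* two nibbles per payload byte */
}

/* Total preamble bytes in a data chunk of data_size bytes.
   A trailing partial block is not counted. */
static uint32_t ima_header_overhead(uint32_t data_size, uint32_t block_align,
                                    uint32_t channels)
{
    assert(block_align > 0);
    uint32_t blocks = data_size / block_align;
    return blocks * IMA_PREAMBLE_BYTES * channels;
}
```

For example, with a block align of 256 bytes in mono this gives 505 samples per block, so the file comes out slightly worse than 4:1.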
The audio I'm working with comes from .wav-files which I presume means I should follow this: http://wiki.multimed...osoft_IMA_ADPCM
I'm a bit confused on the "This field reveals the size of a block of IMA-encoded data." and that it then says "an individual chunk of data begins with the following preamble:". I'm guessing they both refer to the same thing?
So, from what I've interpreted I should do something like this:
foreach(adpcm_block in raw_audio_data)
{
    var predictedValue = bytes 0 - 1 of adpcm_block
    var stepIndex = byte 2 of adpcm_block
    // ignore byte 3 of adpcm_block
    foreach(4bit nibble in adpcm_block) // this would start on byte 4 (the fifth byte) in the block?
    {
        decompress(nibble)
    }
}
Does that look correct? If it does I'll have to post some code because I can't get it bloody right!
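For reference, here is a minimal C sketch of the per-block loop in the pseudocode above, mono only. It is not anyone's actual code from this thread: the function names are invented, the step and index tables are the standard IMA ADPCM ones, and it assumes the low nibble of each byte is decoded first and that the preamble's predictor is emitted as the block's first sample.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

static const int8_t kIndexTable[16] = {
    -1, -1, -1, -1, 2, 4, 6, 8,
    -1, -1, -1, -1, 2, 4, 6, 8
};

static const int16_t kStepTable[89] = {
    7, 8, 9, 10, 11, 12, 13, 14, 16, 17,
    19, 21, 23, 25, 28, 31, 34, 37, 41, 45,
    50, 55, 60, 66, 73, 80, 88, 97, 107, 118,
    130, 143, 157, 173, 190, 209, 230, 253, 279, 307,
    337, 371, 408, 449, 494, 544, 598, 658, 724, 796,
    876, 963, 1060, 1166, 1282, 1411, 1552, 1707, 1878, 2066,
    2272, 2499, 2749, 3024, 3327, 3660, 4026, 4428, 4871, 5358,
    5894, 6484, 7132, 7845, 8630, 9493, 10442, 11487, 12635, 13899,
    15289, 16818, 18500, 20350, 22385, 24623, 27086, 29794, 32767
};

/* Decode one mono block; returns the number of samples written to dst.
   dst must have room for (block_size - 4) * 2 + 1 samples. */
static size_t decode_block_mono(int16_t *dst, const uint8_t *block,
                                size_t block_size)
{
    assert(block_size >= 4);
    /* Preamble: bytes 0-1 predictor (little-endian), byte 2 step index,
       byte 3 ignored. The decoder state is reloaded here for every block. */
    int32_t predictor = block[0] | (block[1] << 8);
    if (predictor & 0x8000) predictor -= 0x10000; /* sign-extend 16 bits */
    int32_t index = block[2];
    if (index > 88) index = 88;

    size_t n = 0;
    dst[n++] = (int16_t)predictor; /* assumption: predictor is sample 0 */

    for (size_t i = 4; i < block_size; ++i) {
        for (int half = 0; half < 2; ++half) {
            /* Assumption: low nibble first, then high nibble. */
            uint8_t nibble = half == 0 ? (uint8_t)(block[i] & 0x0F)
                                       : (uint8_t)(block[i] >> 4);
            int32_t step = kStepTable[index];
            int32_t diff = step >> 3;
            if (nibble & 4) diff += step;
            if (nibble & 2) diff += step >> 1;
            if (nibble & 1) diff += step >> 2;
            predictor += (nibble & 8) ? -diff : diff;
            if (predictor > 32767) predictor = 32767;
            if (predictor < -32768) predictor = -32768;
            index += kIndexTable[nibble];
            if (index < 0) index = 0;
            if (index > 88) index = 88;
            dst[n++] = (int16_t)predictor;
        }
    }
    return n;
}

/* Walk the data chunk one block at a time (block_align from the fmt chunk).
   A trailing partial block is skipped in this sketch. */
static size_t decode_mono(int16_t *dst, const uint8_t *src, size_t src_size,
                          size_t block_align)
{
    assert(block_align >= 4);
    size_t written = 0;
    for (size_t off = 0; off + block_align <= src_size; off += block_align)
        written += decode_block_mono(dst + written, src + off, block_align);
    return written;
}
```

The point of the outer loop is that predictedValue and stepIndex never carry over from one block to the next; each block starts from its own preamble.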
I don't think block and chunk are the same thing. If you follow the link to the WAVEFORMATEX structure details and then on to the MSDN page, you'll find the following:
Block alignment, in bytes. The block alignment is the minimum atomic unit of data for the wFormatTag format type. If wFormatTag is WAVE_FORMAT_PCM, nBlockAlign must equal (nChannels × wBitsPerSample) / 8. For non-PCM formats, this member must be computed according to the manufacturer's specification of the format tag.
[/quote]
So for a 16-bit stereo PCM sample you get (2 x 16) / 8, or 4 bytes per sample.
A 'chunk' on the other hand looks to be header + audio data, which will of course be larger than 4 bytes ;)
Hey phantom and thanks for your reply!
That's exactly what caused my confusion in the first place! Pulling apart wav-files I've been extracting various chunks (look here), which makes the following
If the IMA data is monaural, an individual chunk of data begins with the following preamble:
bytes 0-1: initial predictor (in little-endian format)
byte 2: initial index
byte 3: unknown, usually 0 and is probably reserved
[/quote]
make no sense to me, at least if they are referring to the header+data-chunks. But I would think they're actually referring to the blocks - on the other hand I heard/read/can't remember somewhere that IMA ADPCM cannot be decompressed per-block which would contradict each block having a header, that could've been in a dream though.. CONFUSED !
Edit: Uh, I'm running so many parallel possible solutions to this little problem - I'll go back to the one I was working on when posting the images above and apply my new knowledge. This thread can be put on ice for now.