Compressed sound, what format to use?

Started by
14 comments, last by Fredrik Elmegren 12 years, 2 months ago

Hi!

I'm about to start implementing support for playing compressed sound for our game engine. Unsurprisingly we're on a tight budget both performance-wise and memory-wise. I've been looking at the ADPCM-wav-compression and it could be a possible solution, although it would require a lot of extra hacking. We use OpenAL, and unfortunately it doesn't natively support ADPCM which would mean we'd have to 'manually' decompress the sound before sending it to OpenAL.

Right, so my question to you is: what sound compression would you recommend we use? As I said, performance is a big thing for us, so it can't have a large impact on that.

Cheers!
/ Freddy

P.S. An audio section here in the forums would be nice! :)

Have you looked at Ogg Vorbis from xiph.org?
Patrick
I would also vote for the Ogg Vorbis format. It is a good format that is easy to use in combination with OpenAL. The encoder and decoder are also not encumbered with weird licenses, so you can use them freely.

Crafter 2D: the open source 2D game framework

GitHub: https://github.com/crafter2d/crafter2d
Twitter: crafter_2d

While I would also recommend Vorbis (what's usually miscalled OGG, we're talking about the encoding format here, guys), take into account he is worried about performance. I guess he's expecting a large amount of sound sources all having to get decompressed at the same time.

I have used Vorbis in the past in some games to store both the background music and sound effects. By this I mean the sound data was loaded into memory as-is (still Vorbis) and then decompressed during playback. This was done to save memory (even though it may have been overkill for sound effects). It didn't seem to have any impact on performance at all (and it was quite an old computer by today's standards: the CPU was a 2.4 GHz Pentium IV, and the game was software rendered), but then again at most you had the background music and a couple of sound effects going on.

I have absolutely no idea of a good compression format that has good decompression performance. ADPCM is very fast to decompress, but the compression ratio isn't all that good in comparison to newer encoding methods.

EDIT: also it may be worth a shot to use Vorbis for background music and something more lightweight for sound effects. There would be only one music track playing at a time, so performance isn't much of an issue there, while sound effects are short, so space usage isn't much of an issue there. Sounds like it could be a good trade-off (even better would be to let users specify what format to use for each sound source if possible, ideally explaining the best way to use each format).

EDIT 2: also in case you wonder, ADPCM is fast. I know of a homebrew sound engine that can do ADPCM playback at 22 kHz on a 3.58 MHz Z80 (and that's an 8-bit processor!), so in the case of ADPCM performance would be the least of your worries. Just don't expect all that much compression from it (although it's still significant compared to uncompressed).
Don't pay much attention to "the hedgehog" in my nick, it's just because "Sik" was already taken =/ By the way, Sik is pronounced like seek, not like sick.
Thank you all for your replies!

Sik, you're spot on. We have a lot of sound loaded into memory, and all the larger sound files are already Vorbis-compressed. It's all the smaller (400 KB and less) uncompressed sounds we're trying to reduce the size of. I did some testing and came to the conclusion that Ogg Vorbis is not a viable solution for us, as it would be too expensive performance-wise to decode.

ADPCM seems to be the way to go here. We'd be able to compress all of our uncompressed WAV files down to about 1/4 of their size, which is quite nice. But, as I mentioned, we're using OpenAL, which doesn't support ADPCM on Windows. This means I'll have to make a hack that decodes/decompresses the ADPCM sound before sending it to OpenAL.
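For illustration, here is a minimal sketch of what such a decode step might look like. It assumes the Microsoft WAV flavour of IMA ADPCM with mono data, where every block of `block_align` bytes starts with its own 4-byte header (initial sample, step index, one reserved byte) and the low nibble of each data byte is decoded first. The function names are made up for this example, and this is not necessarily how the engine in question does it.

```python
import struct

# Standard IMA ADPCM tables: 89 step sizes and 8 step-index adjustments.
STEP_TABLE = [
    7, 8, 9, 10, 11, 12, 13, 14, 16, 17,
    19, 21, 23, 25, 28, 31, 34, 37, 41, 45,
    50, 55, 60, 66, 73, 80, 88, 97, 107, 118,
    130, 143, 157, 173, 190, 209, 230, 253, 279, 307,
    337, 371, 408, 449, 494, 544, 598, 658, 724, 796,
    876, 963, 1060, 1166, 1282, 1411, 1552, 1707, 1878, 2066,
    2272, 2499, 2749, 3024, 3327, 3660, 4026, 4428, 4871, 5358,
    5894, 6484, 7132, 7845, 8630, 9493, 10442, 11487, 12635, 13899,
    15289, 16818, 18500, 20350, 22385, 24623, 27086, 29794, 32767,
]
INDEX_TABLE = [-1, -1, -1, -1, 2, 4, 6, 8]


def decode_nibble(nibble, predictor, index):
    """Decode one 4-bit code; return (new_sample, new_step_index)."""
    step = STEP_TABLE[index]
    diff = step >> 3
    if nibble & 4:
        diff += step
    if nibble & 2:
        diff += step >> 1
    if nibble & 1:
        diff += step >> 2
    predictor += -diff if nibble & 8 else diff
    # Clamp to the signed 16-bit range and to the table bounds.
    predictor = max(-32768, min(32767, predictor))
    index = max(0, min(88, index + INDEX_TABLE[nibble & 7]))
    return predictor, index


def decode_block_mono(block):
    """Decode one mono block: 4-byte header, then two samples per byte."""
    predictor, index, _reserved = struct.unpack_from("<hBB", block, 0)
    index = min(index, 88)
    samples = [predictor]  # the header sample is itself the first output
    for byte in block[4:]:
        for nibble in (byte & 0x0F, byte >> 4):  # low nibble first
            predictor, index = decode_nibble(nibble, predictor, index)
            samples.append(predictor)
    return samples


def decode_mono(data, block_align):
    """Decode a mono 'data' chunk, restarting state at every block."""
    samples = []
    for start in range(0, len(data), block_align):
        block = data[start:start + block_align]
        if len(block) < 4:  # assumption: ignore a trailing partial header
            break
        samples.extend(decode_block_mono(block))
    return samples
```

Under these assumptions, a 512-byte block holds (512 - 4) * 2 + 1 = 1017 samples, i.e. 2034 bytes of 16-bit PCM, which matches the "about 1/4" figure. The one thing that is easy to miss is that the predictor and step index are reloaded from the header at the start of every block rather than carried over from the previous one.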

Please reply if you think there's anything I should keep in mind when doing this.

Again, thank you all :)
Hi, it would help if you tell us what kind of CPU budget you have (a Pentium II processor? an Intel Core 2 Duo? Single or Dual core? An ARM for cellphones? iPhone? Android?) and what is your memory budget (16MB? 128MB? 256? 512? 4GB?) and how much memory you're using already without sounds loaded, and how much time in seconds you have in sounds.

I personally use raw PCM for sounds (most of them are

Edit: Somehow half of my post was cut. A GD.Net bug? I have to go, I'll repost later. Still, answer the question above.

Hi, it would help if you tell us what kind of CPU budget you have (a Pentium II processor? an Intel Core 2 Duo? Single or Dual core? An ARM for cellphones? iPhone? Android?)


If it's a Windows app, I would look into XACT. It doesn't matter what the source file format is, because it's compressed down into an XACT-specific file (*.xwb). The amount of compression can be set.
I wrote some code to decode ADPCM, but I found a bad flaw in the specification. It introduces unnecessary padding at the end of the file.
Please reply if you think there's anything I should keep in mind when doing this.

Well, here's another suggestion but I guess you'll kill me for this =P

Basically you could try reducing the quality of sound effects when size becomes problematic. People can't distinguish between 8-bit and 16-bit samples unless they're audiophiles, and many sound effects can be downsampled without much distortion (how much you can downsample depends on the sound effect - low pitched ones aren't affected much, high pitched ones are less tolerant). I have tried this before and it worked pretty well.
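As a rough sketch of the bit-depth part of this idea, assuming signed 16-bit input and the unsigned 8-bit layout used by WAV files and by OpenAL's 8-bit formats (128 is silence), the conversion could look like the following. The function name is hypothetical, and this simply truncates without any dithering.

```python
def to_unsigned_8bit(samples_16):
    """Map signed 16-bit samples to unsigned 8-bit bytes (128 = silence)."""
    # Keep the top 8 bits of each sample, then shift into the 0..255 range.
    return bytes((s >> 8) + 128 for s in samples_16)
```

This halves the size of the sample data on its own, independently of any reduction in sample rate.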

One thing to take into account though is that if you downsample you should avoid interpolation at all costs, because that's what makes the sounds worse. It doesn't help that it usually happens at the hardware level, so it's hard to avoid... Generally you do this by setting the audio output to a higher sample rate and then repeating samples when playing back (e.g. if the sound is 11025 Hz and the output is 44100 Hz, you'd repeat each sample four times). This ensures the audio output sounds clear and not muffled.
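The sample-repeating approach described above can be sketched like this, assuming the output rate is an exact integer multiple of the sound's rate (the function name is made up for the example):

```python
def repeat_upsample(samples, source_rate, output_rate):
    """Upsample by repeating each sample, with no interpolation."""
    if output_rate % source_rate != 0:
        raise ValueError("output rate must be a multiple of the source rate")
    factor = output_rate // source_rate
    return [s for s in samples for _ in range(factor)]
```

For an 11025 Hz sound played at 44100 Hz, `repeat_upsample([1, -2], 11025, 44100)` gives `[1, 1, 1, 1, -2, -2, -2, -2]`.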

Besides that, yeah, not much to say. ADPCM is extremely fast to process so you probably shouldn't worry all that much about it, and most likely Vorbis is being decompressed in software on most computers anyways.
Right!

So I've implemented a decoder (http://wiki.multimedia.cx/index.php?title=IMA_ADPCM) to decode my IMA ADPCM sound.. buuuut the sound gets errrmm.. I'll let this picture talk for me:

[Image: 353bb0x.png]

See the 'chunks' in the wave? (Oh, and it sounds like it looks btw) The first 'chunk' is perfect but then it goes dooownhill.. Any clue from just looking at this what I might be doing wrong? If not I'll post some code.

Cheers!

This topic is closed to new replies.
