SDL_mixer and sound sample format

3 comments, last by Kercyn 7 years, 1 month ago

Hello everybody.

I'm currently developing the sound engine for my game engine using SDL_mixer 2, and there is something I don't understand. There is a function called Mix_OpenAudio which initializes the mixer and accepts a few arguments. One of those arguments is called format, and it can take these values. I know almost nothing about audio, so bear with me.

As far as I understand it, format tells SDL how to parse the sounds it's given to play. I tried playing an SFX that was in 32-bit float format while passing different format values to Mix_OpenAudio, and the sound played the same every time. I may only have thought it played the same, though, as it was a quick *whoosh* SFX. If I have an RGB image and I parse it as BGR, it won't look the same. Shouldn't this happen with sound as well?

My question is, do these values matter? Should I give the user of my engine the ability to use all the formats listed or can I safely omit some?

Thank you.


The format you pass to Mix_OpenAudio is the output format, but sounds have their own input format. To use a graphics analogy, it's perfectly okay to have a 24-bit RGB frame buffer even when you load and render 8-bit palettised GIFs, as long as the system knows how to convert between them. This is what SDL_mixer does for you.
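To make the conversion idea concrete, here is a minimal sketch of turning 32-bit float samples into signed 16-bit ones. It is not SDL_mixer's actual code; the function name `float_to_s16` is made up for illustration, and it assumes float samples in the nominal range [-1.0, 1.0].

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not part of SDL or SDL_mixer.
 * Converts 32-bit float samples (nominal range [-1.0, 1.0]) to signed
 * 16-bit samples, clamping anything outside that range. */
void float_to_s16(const float *in, int16_t *out, size_t count)
{
    for (size_t i = 0; i < count; ++i) {
        float s = in[i];
        if (s > 1.0f)  s = 1.0f;    /* clamp overshoot */
        if (s < -1.0f) s = -1.0f;   /* clamp undershoot */
        /* Scale to the 16-bit range; the cast truncates toward zero. */
        out[i] = (int16_t)(s * 32767.0f);
    }
}
```

A step along these lines on the way to the output buffer would explain why the same sound file plays back the same whichever output format is chosen.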

In practice it's rare to come across any file formats in use other than WAV, MP3, and OGG (although FLAC is arguably a good one to have), and your output format should almost always be 44.1 kHz, 16-bit (signed). (The SDL docs say 22 kHz but I wouldn't agree.)

Hmm, I see. So the output format only concerns the OS and whatever audio middleware it uses, not the actual sound. I'm currently loading support for WAV, MP3, FLAC and OGG in the mixer, and the frequency, chunk size and channel count are all modifiable by the user; it was only the output format I was unsure about. So I guess I just fix the output format to 16-bit signed?

Also, why are you saying "almost always"? Would I be "safe" (or at least "relatively safe") by fixing it to 16-bit signed? Sorry for being pedantic, but this game engine is my thesis and I'd like to justify the various design and implementation decisions I make, plus it never hurts to learn more about something. Do ALSA and whatever middleware Windows uses accept 16-bit signed samples as their default?

Thanks again for your answer.

The 'middleware' here is SDL_mixer. :) That outputs to the OS via one of the various drivers - on Windows that's probably WASAPI, or maybe DirectSound, or ASIO. I think there's a way to force SDL_mixer to choose one, but I forget how. All of them support standard formats like 44/16 anyway, so you'll have no problem with that output setting. The differences between the driver models are in how they support multiple programs (e.g. do they implement their own mixer?) and the way in which you use them (some might require data to be pushed in periodically, others will issue a callback requesting data), but this is all one level of abstraction away from you thanks to SDL.

Some people might argue you can get away with 22 kHz output, but I'd say you lose a bit of high-end frequencies that way, so don't do that. And some people have their audio interfaces set to use 24-bit output, which is nice to have - but your 16-bit output will work fine with that anyway (the extra 8 least significant bits just go unused). I don't know if SDL_mixer supports 24-bit output but it's no big deal if not.
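Both points can be put into numbers with a rough sketch. The helper names below are made up for illustration and are not SDL functions; the first gives the highest frequency a sample rate can represent (half the rate), the second shows one way a 16-bit sample can sit inside a 24-bit one.

```c
#include <stdint.h>

/* Hypothetical helper: highest representable frequency (the Nyquist
 * limit) for a given sample rate, in Hz. */
int nyquist_hz(int sample_rate_hz)
{
    return sample_rate_hz / 2;
}

/* Hypothetical helper: place a signed 16-bit sample in the top 16 bits
 * of a 24-bit sample (held in an int32_t). Multiplying by 256 instead of
 * shifting avoids left-shifting a negative value. The low 8 bits of the
 * result are always zero. */
int32_t s16_to_s24(int16_t sample)
{
    return (int32_t)sample * 256;
}
```

So 22050 Hz output tops out at about 11 kHz, while 44100 Hz reaches about 22 kHz, which covers the usual range of human hearing.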

Stereo 44.1 kHz 16-bit signed output is a standard because that is the format of CD audio; as such, almost every file format supports it natively, and any consumer-level computer audio device (e.g. a sound card) will be configured to accept it, along with 48 kHz for DVD playback.
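For the engine side, one possible shape is a small settings struct that defaults to the CD-style values and leaves only the rate and channel count open to the user. This is a sketch only: `AudioSettings` and both functions are hypothetical names, and the assumption is that these values would eventually be handed to Mix_OpenAudio as its frequency and channels arguments, with the format argument fixed to a 16-bit signed constant such as AUDIO_S16SYS.

```c
/* Hypothetical engine-side settings; not an SDL or SDL_mixer type. */
typedef struct {
    int frequency;  /* output sample rate in Hz */
    int channels;   /* 1 = mono, 2 = stereo */
    int bits;       /* bits per sample; kept at 16 (signed) in this sketch */
} AudioSettings;

/* CD-style defaults: 44.1 kHz, stereo, 16-bit. */
AudioSettings audio_settings_default(void)
{
    AudioSettings s = { 44100, 2, 16 };
    return s;
}

/* Raw output data rate implied by the settings, in bytes per second. */
long audio_bytes_per_second(const AudioSettings *s)
{
    return (long)s->frequency * s->channels * (s->bits / 8);
}
```

With the defaults this works out to 176400 bytes per second, the CD audio data rate.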

ALSA is an audio framework for Linux, which is sort-of-like WASAPI or ASIO or whatever, but arguably worse than all of them. (Linux audio is a massive mess and has always been that way.)

Thanks a lot!
