Today I'm stepping out of my comfort zone and into the wondrous and vastly less documented world of audio programming for games.
For me, getting into an audio mindset was quite hard coming from mostly graphics development: you try to translate the concepts of rendering pipelines, shaders, materials, etc. to their possible audio counterparts. Of course this isn't a realistic approach, as many concepts found in graphics development simply don't translate to audio programming, and vice versa.
Throughout the years I've developed and maintained a fairly simple cross-platform 3D audio 'engine' for use in games. This engine has all the fundamental features you'd expect in an audio engine: 2D and 3D audio sources with different falloff models, data streaming, support for multiple input formats, multi-channel/surround output, support for multiple back-ends (e.g. OpenAL), etc. It has no concept of anything more advanced than that, so there are no filters or effects, no occlusion models, no fancy volume regulation systems, and so on.
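To make the falloff-model part concrete, here's a minimal sketch of the kind of distance attenuation such a source might use. This is my illustration rather than the engine's actual code; the function name and parameters are made up, but the formula is the familiar inverse-distance-clamped model (the same family OpenAL exposes).

```cpp
#include <algorithm>

// Inverse-distance falloff with clamping, similar in spirit to OpenAL's
// AL_INVERSE_DISTANCE_CLAMPED model. At ref_distance the gain is 1.0;
// beyond max_distance the gain stops decreasing. 'rolloff' controls how
// quickly the source fades with distance.
float falloff_gain(float distance, float ref_distance,
                   float max_distance, float rolloff)
{
    const float d = std::clamp(distance, ref_distance, max_distance);
    return ref_distance / (ref_distance + rolloff * (d - ref_distance));
}
```

With ref_distance = 1 and rolloff = 1, a source at distance 2 plays at half gain, and anything closer than the reference distance plays at full gain.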
This system has suited my needs throughout the years because I honestly considered audio as a secondary feature, and because I was quite happy with just some basic audio sources here and there for environmental audio and some interactive audio playback.
These days I'm working on a collaborative project which is of a much larger scope than pretty much any game project I've done before, so the requirements for both my graphics and audio systems have gone up quite a bit. I'm very happy to say that I completed my graphics system overhaul last week and that it is faster, more flexible and generally better than ever before.
For my audio system I considered a couple of options:
- Completely ditch my current system and start from scratch with 'advanced features' in mind from the very beginning, or
- Work with the current system and extend it, or
- Forget the idea of implementing an audio system from scratch and go with a solution like FMOD.
I ruled out the third option because I don't like having libraries with more restrictive licenses in my projects, and because I suffer from a bad case of enjoying wheel-reinventing. Since our project is more of a proof-of-concept type of deal and we have an "it's done when it's done" mentality without any fixed deadlines, I can take all the time I need to go with one of the first two options.
After some code review I decided to extend the current system, after first making some alterations that allow for easier integration of new features, along with some general code cleanup.
So now I'll have to take a look at our project requirements so I can sketch out a feature list and a very rough roadmap. Some features I'd definitely like to see are a data-driven audio pipeline setup, proper audio occlusion, and filters and effects.
There's another concept I've been toying with for the last couple of days: programmable filters (in graphics terms: shaders for audio data), where you write audio filters or effects as small programs which get applied to a chunk of audio data before it gets buffered. I have some very basic knowledge of discrete-time systems and signal processing from some engineering classes I took a couple of years ago, which could provide a starting point, but I have no idea how feasible this idea is for real-time systems or whether it has been done before. I guess I'll have to do some prototyping to find out whether it would even remotely work.
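As a first feel for what one of these per-chunk "audio shaders" might look like, here's a sketch of the simplest useful filter I know from those DSP classes: a one-pole low-pass applied in place to a block of samples before it's handed to the output buffer. The struct and its names are purely illustrative, not part of my engine.

```cpp
#include <vector>

// A tiny 'filter program' operating on one chunk of mono samples.
// Implements the one-pole low-pass recurrence:
//     y[n] = y[n-1] + a * (x[n] - y[n-1])
// a = 1 passes the input through unchanged; small a smooths heavily.
// The state 'z' is carried across chunks so block boundaries don't click.
struct OnePoleLowPass {
    float a;        // smoothing coefficient in (0, 1]
    float z = 0.0f; // previous output sample (filter state)

    void process(std::vector<float>& chunk) {
        for (float& s : chunk) {
            z = z + a * (s - z);
            s = z;
        }
    }
};
```

The appeal of the shader analogy is that the mixer only needs to know "here is a chunk, run the attached programs over it" — the filter itself is just a self-contained kernel with a bit of state, which is also roughly what a muffling effect for occlusion would look like.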
I suppose that's about all I needed to ramble about today. If anyone has relevant audio-related papers, case studies, post-mortems, etc., feel free to share.