Sound Programming from Scratch


I previously posted this question here, but got the musician response: http://www.gamedev.net/topic/637873-sound-programming-from-scratch/

I'd also like the programmer response.

Not sure if this question is better asked here or in the programming forum. I've been playing with sound programming, but I find the APIs hide too much stuff. When I learned 3D graphics, I started by learning from scratch. I would like to know if anyone has any resources (books or online) that teach audio programming from scratch. Much like writing a software renderer from scratch to learn about the algorithms.

I've googled and amazoned, but I don't really know enough about the subject to tell whether the books I found are any good. And man, I thought software books were expensive. Audio books are not cheap. :)

Thanks,

I think, therefore I am. I think? - "George Carlin"
My Website: Indie Game Programming

My Twitter: https://twitter.com/indieprogram

My Book: http://amzn.com/1305076532


Here's the definitive book that will help you get started with the basics: The Scientist and Engineer's Guide to Digital Signal Processing. It's hefty (600+ pages), covers everything from audio to image processing to compression, and it's free. It's written for absolute beginners but builds to an intermediate level as you read on, with sparse code listings and very few of the dense equations DSP is infamous for. If you're starting out, then IMO this is the place to get the basic knowledge.

"Audio programming from scratch" is a very broad term, and if you want more specific advice, I'm afraid you're going to have to be more specific with your question!

I guess what I would like to understand is how to write DirectSound or OpenAL from scratch.


DirectSound is essentially a driver that bridges the gap between third-party libraries (such as OpenAL, FMOD, etc.) and hardware. DSound does emulate some effects and provides access to hardware acceleration where possible, but at ground level it's precisely that and nothing more: a driver.

Now, I'm not too familiar with OpenAL overall, but it's likely just a library like FMOD, which builds on top of native drivers depending on what operating system you're compiling on and what is available. OpenAL and FMOD (and other libraries) also provide additional functionality, like time-to-frequency domain conversion (essentially raw FFT and IFFT calls), effects (reverb, delay, etc) and format support (easy loading of audio file formats).

In short, you're probably not thinking of writing a driver, in which case "writing DirectSound from scratch" doesn't really make much sense. You are probably thinking of implementing various library functionalities, such as effects and the like (just to be clear: if you do, for whatever reason, want to write a driver, then I can't help you). If it's the latter, I would suggest two things:

1) Start by reading the book I linked to. I'm sorry to say it, but it's kind of apparent that you're not really sure yet what you want to do. Building a knowledge base to work from is the place to start. DSP is one of the most comprehensive and demanding fields out there, touching everything from circuit design to programming synthesizers to implementing an incredible slew of effects in code.

2) If you don't feel comfortable simply reading up on things and really want to do some coding, try an icebreaker assignment: keep reading and start writing something like a really simple additive synthesizer (say, 2 oscillators using a few wavetables and a couple of filters). You will never figure out how this stuff works from code alone (which is why reading is so important), but conversely, implementing things like filters from theory alone is highly technical. My approach, which I deem pretty healthy, is that it's essential to understand what each knob (on a synthesizer or audio control panel) does and how it affects the signal, but it isn't imperative to understand the underlying mathematics. The same applies to code, unless you really want to overcompensate.
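To give a flavor of what that icebreaker might look like, here's a minimal hypothetical sketch of an oscillator core in C++ (my own illustration, not taken from any library; a real wavetable synth would read precomputed tables instead of calling sin()):

```cpp
// Sketch: two sine oscillators summed into one mono buffer.
#include <cmath>
#include <vector>

struct Oscillator
{
    double phase     = 0.0;  // current phase in radians
    double frequency = 0.0;  // Hz
    double amplitude = 0.0;  // 0..1
};

void render(std::vector<float>& out, Oscillator* oscs, int numOscs, double sampleRate)
{
    const double twoPi = 6.283185307179586;
    for (float& sample : out)
    {
        double mix = 0.0;
        for (int i = 0; i < numOscs; ++i)
        {
            mix += oscs[i].amplitude * std::sin(oscs[i].phase);
            oscs[i].phase += twoPi * oscs[i].frequency / sampleRate;
            if (oscs[i].phase > twoPi) oscs[i].phase -= twoPi; // wrap to keep precision
        }
        sample = (float)mix;
    }
}

int main()
{
    Oscillator oscs[2] = { { 0.0, 440.0, 0.3 }, { 0.0, 660.0, 0.2 } };
    std::vector<float> buffer(44100); // one second at 44.1 kHz
    render(buffer, oscs, 2, 44100.0);
}
```

From there you can swap std::sin() for wavetable lookups and run the buffer through a filter, which maps directly onto the "knobs" mentioned above.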

As another thing, you might want to start by examining how software synthesizers work and what all the different knobs do. Let me know if you would like some suggestions.

As for code, here are two invaluable resources to get you started: KVR Audio (check out the forums for active DSP-related discussions) and musicdsp.org (check out the wide variety of user-submitted source code listings).

If you're wondering what a software synth has in common with an audio library, the answer is that a synth generally boils down to being a DSP library in and of itself, with the distinction that its modules are specialized and structured to manipulate sound in a specific sequence rather than being standalone functions.

Hopefully I understood you correctly and what I wrote helps!


Yes, I don't really know what I want out of this. :) I've done lots of programming, and I've written a software renderer from scratch to learn about graphics. The last two books I purchased about 3D engine programming didn't cover sound, which seemed strange, because I figured the sound stuff would be important. The more I learn about this, however, the more it seems like sound and graphics are very different areas.

So yes, I shouldn't say I want to write DirectSound. I think I mean I would like to be able to do things in software like mixing, reverb, panning, high- and low-pass filters, and that kind of thing. I don't really know what I need to learn, because if I already knew that, I wouldn't need to ask. :D

I will check out the book. It looks like a good place to start.


I wrote a software 3D sound engine during college.

As I was targeting Windows only, I used waveOutOpen (http://msdn.microsoft.com/en-us/library/windows/desktop/dd743866(v=vs.85).aspx) to send the raw PCM data to the speakers.
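To sketch that approach (a hypothetical illustration, error handling omitted): generate a second of 440 Hz sine, describe the PCM format, and hand the buffer to waveOut.

```cpp
// Minimal sketch: play one second of a 440 Hz sine through waveOut.
// Assumes 16-bit mono PCM at 44.1 kHz; link against winmm.lib.
#include <windows.h>
#include <mmsystem.h>
#include <cmath>
#include <vector>
#pragma comment(lib, "winmm.lib")

int main()
{
    const int sampleRate = 44100;
    std::vector<short> samples(sampleRate); // one second of audio
    for (int i = 0; i < sampleRate; ++i)
        samples[i] = (short)(32767.0 * 0.25 * std::sin(2.0 * 3.14159265 * 440.0 * i / sampleRate));

    WAVEFORMATEX fmt = {};
    fmt.wFormatTag      = WAVE_FORMAT_PCM;
    fmt.nChannels       = 1;
    fmt.nSamplesPerSec  = sampleRate;
    fmt.wBitsPerSample  = 16;
    fmt.nBlockAlign     = fmt.nChannels * fmt.wBitsPerSample / 8;
    fmt.nAvgBytesPerSec = fmt.nSamplesPerSec * fmt.nBlockAlign;

    HWAVEOUT device;
    waveOutOpen(&device, WAVE_MAPPER, &fmt, 0, 0, CALLBACK_NULL);

    WAVEHDR header = {};
    header.lpData         = (LPSTR)samples.data();
    header.dwBufferLength = (DWORD)(samples.size() * sizeof(short));
    waveOutPrepareHeader(device, &header, sizeof(header));
    waveOutWrite(device, &header, sizeof(header));

    Sleep(1100); // crude: wait for playback to finish
    waveOutUnprepareHeader(device, &header, sizeof(header));
    waveOutClose(device);
}
```

This really is the sound equivalent of locking a framebuffer and setting pixels: you fill an array of samples and hand it to the device.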

This is a good starting point: http://www.planet-source-code.com/vb/scripts/ShowCode.asp?txtCodeId=4422&lngWId=3.

Once you've got a basic sound coming from the speakers, you can simply keep adding features and build a nice architecture around it.

Mixing sounds is as simple as adding them together, resampling is simply interpolating, and effects like echo or low/high-pass filters aren't that hard and are fairly well documented.
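For illustration, here's a hedged sketch of both ideas (my own example, not from the linked code): mixing by addition with a clamp to avoid wraparound, and a one-pole low-pass filter, which is just a running weighted average.

```cpp
// Sketch: mixing is addition; a one-pole low-pass is a running weighted average.
// Assumes all buffers are the same length, samples normalized to [-1, 1].
#include <vector>
#include <algorithm>

// Mix two buffers by adding samples, clamping to avoid wraparound distortion.
void mix(const std::vector<float>& a, const std::vector<float>& b, std::vector<float>& out)
{
    for (size_t i = 0; i < out.size(); ++i)
        out[i] = std::clamp(a[i] + b[i], -1.0f, 1.0f);
}

// One-pole low-pass: alpha near 0 filters heavily, alpha near 1 passes everything.
void lowPass(std::vector<float>& buffer, float alpha)
{
    float previous = 0.0f;
    for (float& sample : buffer)
    {
        previous = previous + alpha * (sample - previous);
        sample = previous;
    }
}
```

A high-pass is then just the input minus the low-passed signal, and an echo is the input plus a delayed, attenuated copy of itself.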


YES!!! This is what I was looking for. The sound equivalent of getting a buffer and setting each pixel value. This, along with the DSP book, is a great starting point.

Thanks!



You should also consider the XAudio2 Windows API.

If you are interested in doing raw device I/O, check out WASAPI. It is intended for use by modern professional audio applications, has low latency, and gives you access to all of the device's channels, sample rates, and capabilities. It is the successor to waveOutOpen() and related functions on Vista and later.
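As a rough sketch of what that setup looks like (assumptions: shared mode, default render device; error checks and COM cleanup omitted):

```cpp
// Sketch: WASAPI shared-mode render setup. Link against ole32.lib.
#include <windows.h>
#include <mmdeviceapi.h>
#include <audioclient.h>
#pragma comment(lib, "ole32.lib")

int main()
{
    CoInitializeEx(nullptr, COINIT_MULTITHREADED);

    IMMDeviceEnumerator* enumerator = nullptr;
    CoCreateInstance(__uuidof(MMDeviceEnumerator), nullptr, CLSCTX_ALL,
                     __uuidof(IMMDeviceEnumerator), (void**)&enumerator);

    IMMDevice* device = nullptr;
    enumerator->GetDefaultAudioEndpoint(eRender, eConsole, &device);

    IAudioClient* client = nullptr;
    device->Activate(__uuidof(IAudioClient), CLSCTX_ALL, nullptr, (void**)&client);

    WAVEFORMATEX* mixFormat = nullptr;
    client->GetMixFormat(&mixFormat); // shared mode uses the engine's mix format

    const REFERENCE_TIME bufferDuration = 10 * 1000 * 1000; // 1 s in 100 ns units
    client->Initialize(AUDCLNT_SHAREMODE_SHARED, 0, bufferDuration, 0, mixFormat, nullptr);

    IAudioRenderClient* render = nullptr;
    client->GetService(__uuidof(IAudioRenderClient), (void**)&render);

    UINT32 bufferFrames = 0;
    client->GetBufferSize(&bufferFrames);

    BYTE* data = nullptr;
    render->GetBuffer(bufferFrames, &data);
    // ... write bufferFrames frames of audio in mixFormat into data ...
    render->ReleaseBuffer(bufferFrames, 0);

    client->Start();
    Sleep(1000);
    client->Stop();
    // A real loop would poll GetCurrentPadding() and keep refilling the free space.
    // Release the COM objects, CoTaskMemFree(mixFormat), and CoUninitialize().
}
```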

ASIO is another pro-level option offered by many hardware drivers, but it isn't as widely available as the above.


Great info on the waveOut approach; I've been meaning to look into this for a while myself.

A question if you come back around: did you find it difficult to keep the buffer full for gapless playback? Even though audio processing isn't terribly intensive, I've always been concerned about Windows' ability to keep up with the real-time constraints while also responding quickly to sound events. Audio is far more susceptible to even tiny gaps in playback: the ear notices a gap of just a few milliseconds as an audible click, while entire dropped video frames can slip by unnoticed. What were your experiences?

throw table_exception("(╯°□°)╯︵ ┻━┻");
