I wrote a software 3D sound engine in college.
Since I was targeting Windows only, I used waveOutOpen (http://msdn.microsoft.com/en-us/library/windows/desktop/dd743866(v=vs.85).aspx) to send the raw PCM data to the speakers.
This is a good starting point: http://www.planet-source-code.com/vb/scripts/ShowCode.asp?txtCodeId=4422&lngWId=3.
Once you have a basic sound coming from the speakers, you can simply keep adding features and build a nice architecture around it.
Mixing sounds is as simple as adding their samples together, resampling is just interpolation, and effects like echo or low/high-pass filters aren't that hard and are fairly well documented.
Great info, I've been meaning to look into this for a while myself.
A question if you come back around -- did you find it difficult to keep the buffer full for gapless playback? Even though audio processing isn't terribly intensive, I've always been concerned about Windows' ability to keep up with the real-time constraints while also responding quickly to sound events. Audio is far more susceptible to tiny gaps in playback -- the ear notices dropouts of just a few milliseconds, while entire video frames can slip by unnoticed. What were your experiences?
In my experience with the waveOutOpen() family of functions, you need a really big buffer to avoid gaps in playback, which makes low-latency audio impossible. The reason is that this API is not callback-based, whereas more advanced APIs like WASAPI on Windows and CoreAudio on OS X let you register a callback that is invoked from the system audio thread whenever output audio is needed. The OS/driver maintains the buffer for you and synchronizes the callback so that there are only a few milliseconds of latency between your code and the hardware.