okonomiyaki

Loading sounds into sound buffers..

I'm just curious. I want a basic understanding of exactly what I'm loading into my sound buffer for it to play (using DirectSound). When I open a wav file, I simply copy everything into the buffer. I've also tried playing around with sine waves and such, and created some interesting (yet annoying) sound effects. However, when I put a simple number such as 1000 in there, it doesn't have a constant pitch. In fact, nothing plays. It's as if a wave is required for noise to occur. But then, how do you achieve a perfect pitch? Calculate the wave so that it oscillates perfectly around it? I can't get much but annoying jerks of music. I don't know much about sound waves and how they work, so don't scream at me! How does all that magic work?

okonomiyaki,

The problem that you're encountering is simple to solve if you think about it.

In the physical world, we hear a constant sound because our eardrums are moving back and forth at a specific speed. Think of a guitar: when you pluck a string, the string vibrates up and down at a specific speed, producing sound waves that reach our eardrums and move them at that same speed, letting us interpret the sound.

If you look at a speaker you can see the same thing: the speaker cone moves in and out based on the signal in the wires, pushing and pulling the air, causing sound waves to emanate.

What does this mean in the context of your problem? Well, if you have a DirectSound buffer in which every sample is the same number (say 1000), the signal that arrives at your speaker is always the same; the speaker never moves, and so no sound waves are created.

In contrast, if you have a sine wave in there, you will have values that go from -1 to +1 in a sinusoidal pattern. Have a look in a program like Cool Edit or Sound Forge: you'll see that the data forms waves that swing from the positive to the negative and back. When data is moved from a floating point representation, wherein a wave goes from -1 to +1, to an integer format (like 8-bit or 16-bit audio), that range becomes -127 to 127 and -32767 to 32767 respectively. (In a wave file, 8-bit samples are actually stored unsigned, from 0 to 255 with silence at 128.)
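To make the float-to-integer step concrete, here is a minimal sketch of the conversion. The function names are made up for illustration, and clamping out-of-range input is my own choice rather than anything DirectSound requires.

```cpp
#include <algorithm>
#include <cmath>
#include <cstdint>

// Convert a normalized sample in [-1.0, +1.0] to signed 16-bit PCM.
// Out-of-range input is clamped rather than allowed to wrap around.
std::int16_t FloatToPcm16(float sample)
{
    const float clamped = std::min(1.0f, std::max(-1.0f, sample));
    return static_cast<std::int16_t>(std::lround(clamped * 32767.0f));
}

// Convert the same normalized sample to unsigned 8-bit PCM as stored in
// wave files: 128 is silence, 1 corresponds to -1.0 and 255 to +1.0.
std::uint8_t FloatToPcm8(float sample)
{
    const float clamped = std::min(1.0f, std::max(-1.0f, sample));
    return static_cast<std::uint8_t>(std::lround(clamped * 127.0f) + 128);
}
```

Scaling by 32767 rather than 32768 keeps the positive and negative peaks symmetric, which matches the -32767 to 32767 range mentioned above.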

The range that is used when swinging from the positive to the negative and back again only affects the amplitude (loudness) of a sound, not the pitch. The pitch is defined by how quickly the wave moves from -1 to +1 and back again, that is, by how many of those cycles fit into one second. If you think about it a while it makes sense: the faster you cycle between -1 and +1, the faster your speaker moves back and forth, the more compressed the waves it emits are, the faster our eardrums vibrate, and the higher the pitch we hear.
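As a rough sketch of how pitch and loudness end up as two separate knobs, the following fills a buffer with a 16-bit sine tone. MakeSineTone and its parameters are names I made up for this example; frequencyHz controls the pitch and amplitude controls the volume.

```cpp
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <vector>

// Fill a buffer with a sine tone. frequencyHz sets the pitch (cycles per
// second); amplitude (0.0 to 1.0) sets how far the wave swings, i.e. loudness.
std::vector<std::int16_t> MakeSineTone(double frequencyHz, double amplitude,
                                       unsigned sampleRate, std::size_t numSamples)
{
    const double kTwoPi = 6.283185307179586;
    std::vector<std::int16_t> samples(numSamples);
    for (std::size_t i = 0; i < numSamples; ++i)
    {
        const double t = static_cast<double>(i) / sampleRate;  // time in seconds
        const double value = amplitude * std::sin(kTwoPi * frequencyHz * t);
        samples[i] = static_cast<std::int16_t>(std::lround(value * 32767.0));
    }
    return samples;
}
```

For example, MakeSineTone(440.0, 0.5, 44100, 44100) would give one second of the A above middle C at half volume. Doubling the first argument raises the pitch by an octave without changing the loudness.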

When you fill a DirectSound buffer with the output of sin() you get a sinusoidal waveform that has only one harmonic; that is, it is a pure tone at that given pitch. Using other kinds of formulas for generating the data in your buffer will yield different timbres, or qualities of sound. On a synthesizer you often see multiple waveforms: sine, triangle, square. Each of these has a different shape, which determines which harmonics it contains and how strong they are. (A pulse wave also has a 'duty cycle', which defines how long it stays on the positive side of the axis versus the negative; a square wave is a pulse wave with a 50% duty cycle.) While the sine wave is easy to generate, it isn't very interesting to listen to. You might want to experiment with the square wave or the triangle wave, which add harmonics above the fundamental, giving them a more interesting timbre.
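If you want to try the other waveforms, here is a minimal sketch of naive square and triangle generators. The Waveform enum and MakeWave function are names invented for this example, and the approach is the simplest one I know of, not what a real synthesizer does internally.

```cpp
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <vector>

enum class Waveform { Square, Triangle };

// Generate a full-scale 16-bit square or triangle wave at the given pitch.
std::vector<std::int16_t> MakeWave(Waveform shape, double frequencyHz,
                                   unsigned sampleRate, std::size_t numSamples)
{
    std::vector<std::int16_t> samples(numSamples);
    for (std::size_t i = 0; i < numSamples; ++i)
    {
        // phase runs from 0.0 up to (but not including) 1.0 once per cycle
        const double cycles = frequencyHz * static_cast<double>(i) / sampleRate;
        const double phase = cycles - std::floor(cycles);

        double value = 0.0;
        if (shape == Waveform::Square)
            value = (phase < 0.5) ? 1.0 : -1.0;           // high for the first half cycle
        else
            value = 1.0 - 4.0 * std::fabs(phase - 0.5);   // -1 at phase 0, +1 at phase 0.5

        samples[i] = static_cast<std::int16_t>(std::lround(value * 32767.0));
    }
    return samples;
}
```

Changing the 0.5 threshold in the square branch would give a pulse wave with a different duty cycle. Be aware that hard-edged waveforms generated this way alias at high pitches, so they can sound harsher than the same waveform on a hardware synth.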

If you are loading a wave file into a buffer, all you need to load is the data portion of the wave file. If you copy the whole file, the header bytes get played as if they were samples. Look into the format of a wave file: it is a RIFF file of type "WAVE" containing a "fmt " chunk defining the format of the data, a "data" chunk containing the sample data, and possibly some other chunks.

You'll need to write a wave file parser that can read the file and extract the format of the data, such as how many channels it has, what its sample rate is, and its bits per sample. Then you need to allocate a DirectSound buffer for the data and copy that data into the buffer. There's a lot of information on the format of a wave file on the net.

http://www.borg.com/~jglatt/tech/wave.htm

is an example, but there are better ones out there.
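For what it's worth, here is a minimal sketch of such a parser. It assumes the whole file has already been read into memory, and the WavInfo struct and ParseWav function are names made up for this example. It only walks the chunk list and picks out the "fmt " and "data" chunks; it does not handle compressed formats or validate everything a robust loader should.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

struct WavInfo
{
    std::uint16_t formatTag = 0;      // 1 means uncompressed PCM
    std::uint16_t channels = 0;
    std::uint32_t sampleRate = 0;
    std::uint16_t bitsPerSample = 0;
    std::size_t dataOffset = 0;       // byte offset of the sample data in the file
    std::uint32_t dataSize = 0;       // size of the sample data in bytes
};

// Wave file fields are little-endian, so assemble them byte by byte.
static std::uint16_t ReadU16(const std::uint8_t* p)
{
    return static_cast<std::uint16_t>(p[0] | (p[1] << 8));
}

static std::uint32_t ReadU32(const std::uint8_t* p)
{
    return static_cast<std::uint32_t>(p[0]) |
           (static_cast<std::uint32_t>(p[1]) << 8) |
           (static_cast<std::uint32_t>(p[2]) << 16) |
           (static_cast<std::uint32_t>(p[3]) << 24);
}

// Returns true if both the "fmt " and "data" chunks were found.
bool ParseWav(const std::vector<std::uint8_t>& file, WavInfo& info)
{
    // 12-byte header: "RIFF", total size, "WAVE"
    if (file.size() < 12 ||
        std::memcmp(file.data(), "RIFF", 4) != 0 ||
        std::memcmp(file.data() + 8, "WAVE", 4) != 0)
        return false;

    bool haveFmt = false;
    bool haveData = false;
    std::size_t pos = 12;
    while (pos + 8 <= file.size())
    {
        // Each chunk is a 4-byte ID, a 4-byte size, then that many bytes.
        const std::uint8_t* chunk = file.data() + pos;
        const std::uint32_t size = ReadU32(chunk + 4);
        const std::size_t body = pos + 8;
        if (size > file.size() - body)
            return false;             // chunk claims more bytes than the file has

        if (std::memcmp(chunk, "fmt ", 4) == 0 && size >= 16)
        {
            info.formatTag = ReadU16(file.data() + body);
            info.channels = ReadU16(file.data() + body + 2);
            info.sampleRate = ReadU32(file.data() + body + 4);
            info.bitsPerSample = ReadU16(file.data() + body + 14);
            haveFmt = true;
        }
        else if (std::memcmp(chunk, "data", 4) == 0)
        {
            info.dataOffset = body;
            info.dataSize = size;
            haveData = true;
        }
        pos = body + size + (size & 1);   // chunks are padded to an even size
    }
    return haveFmt && haveData;
}
```

After a successful call, the bytes from dataOffset to dataOffset + dataSize are what you would copy into the sound buffer, and the format fields are what you would use to describe the buffer when creating it.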

Best of luck.

- S

Hope that helps!

Thank you very much for the compliment.

I have in the past thought of writing an article for gamedev because I feel that sound is often considered very confusing. Maybe in the future.

Is the sound in those buffers really stored as a signal output? I never knew that. I assumed it stored things like the pitch of the sound for every ms (or whatever time value is used).
This is interesting indeed.

To say that the sound stored is 'signal output' is nebulous. It does indeed contain a digital 'signal output', but what is stored within a buffer is the sample data that, when passed through the DAC (digital to analog converter), creates a 'voltage signal' that is passed to your analog amplifier and on to the speakers.

When you think about it some more, it makes sense.

Sample data is stored at a given sample rate, e.g. 44100 Hz, or 44100 samples per second, for CD audio. That means that when the audio was converted from analog to digital, it was done using an ADC (analog to digital converter) that took one 16-bit sample every 1/44100 of a second. (Actually it takes two, one for the left channel and one for the right, but you get the idea.) Since the sampling is done at a constant rate, what you end up with is a digital representation of that analog input signal. When you go to play the data back, you are pushing a 16-bit sample into the DAC once every 1/44100 of a second, rebuilding the input signal.
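The arithmetic behind those numbers is worth seeing once. This is a small sketch, with function names made up for the example, of how sample rate, channel count and bit depth translate into bytes and playing time.

```cpp
#include <cstdint>

// Bytes needed for one second of uncompressed PCM audio.
std::uint32_t BytesPerSecond(std::uint32_t sampleRate, std::uint16_t channels,
                             std::uint16_t bitsPerSample)
{
    return sampleRate * channels * (bitsPerSample / 8u);
}

// Playing time in seconds of a block of PCM data of the given size.
double DurationSeconds(std::uint32_t dataBytes, std::uint32_t sampleRate,
                       std::uint16_t channels, std::uint16_t bitsPerSample)
{
    const std::uint32_t rate = BytesPerSecond(sampleRate, channels, bitsPerSample);
    if (rate == 0)
        return 0.0;   // avoid dividing by zero on a nonsensical format
    return static_cast<double>(dataBytes) / rate;
}
```

For CD audio (44100 Hz, 2 channels, 16 bits) this works out to 176400 bytes for every second of sound, which is also how you would size a buffer for a given length of audio.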

Does that make sense?

The neat thing about this is that you can do some interesting things with the sample data:

Assume you have your 16 bit monophonic (single channel) audio in a buffer:



short* pAudioData = /* some data loaded from disk */;
unsigned long uNumberOfSamples = /* the length of the data in samples */;

// Halve every sample: the waveform keeps its shape but swings half as far.
for (unsigned long i = 0; i < uNumberOfSamples; i++)
{
    pAudioData[i] = pAudioData[i] / 2;
}



Then play that: it will have half the amplitude of the original data! That's because, although the waveform is still the same shape, it doesn't travel as far toward -1 and +1; it stays closer to the 0 axis. That's how you can manipulate the volume of digital audio.
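The same idea generalizes to any volume factor. Here is a minimal sketch, with ApplyGain being a name I made up; the clamping is my own addition, because a factor above 1.0 can push samples past what 16 bits can hold, and letting them wrap around sounds far worse than clipping.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <cstdint>

// Scale every sample in place by gain (0.5 halves the volume, 2.0 doubles it).
// Results are clamped to the 16-bit range so loud samples clip instead of wrapping.
void ApplyGain(std::int16_t* samples, std::size_t count, float gain)
{
    for (std::size_t i = 0; i < count; ++i)
    {
        const float scaled = samples[i] * gain;
        const float clamped = std::min(32767.0f, std::max(-32768.0f, scaled));
        samples[i] = static_cast<std::int16_t>(std::lround(clamped));
    }
}
```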

If you were to take the same audio data and copy it into a new buffer, skipping every second sample from the source data, your new data would play back twice as fast as the original, in half the time and at twice the pitch (one octave higher). This is because you have compressed the waveform, which we know creates higher pitches.
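A minimal sketch of that experiment, assuming 16-bit mono data already sitting in a std::vector (the function name is made up for this example):

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Copy every second sample of the source into a new buffer. Played at the
// original sample rate, the result is half as long and one octave higher.
std::vector<std::int16_t> DropEverySecondSample(const std::vector<std::int16_t>& source)
{
    std::vector<std::int16_t> result;
    result.reserve((source.size() + 1) / 2);
    for (std::size_t i = 0; i < source.size(); i += 2)
        result.push_back(source[i]);
    return result;
}
```

Simply throwing samples away can introduce aliasing if the source contains very high frequencies; proper resamplers low-pass filter the data first. For hearing the effect it is fine.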

If you want to play back a DirectSound buffer at a different pitch, you normally set the playback frequency on the DirectSound buffer object (IDirectSoundBuffer::SetFrequency, on a buffer created with the DSBCAPS_CTRLFREQUENCY flag). What this tells DirectSound is to resample the data as it plays: to skip part of the source data, or to play part of the source data for more than one sample.
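I don't know which algorithm DirectSound actually uses, so take the following only as a sketch of the general idea. It resamples by an arbitrary rate using linear interpolation, which is one common and simple technique; ResampleLinear is a name made up for this example.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <vector>

// Step through the source at 'rate' source samples per output sample.
// rate 2.0 skips samples (higher pitch, shorter); rate 0.5 stretches the
// data (lower pitch, longer). Positions that fall between two source samples
// are blended from their neighbours.
std::vector<std::int16_t> ResampleLinear(const std::vector<std::int16_t>& source, double rate)
{
    std::vector<std::int16_t> result;
    if (source.empty() || rate <= 0.0)
        return result;

    const double lastIndex = static_cast<double>(source.size() - 1);
    for (std::size_t n = 0; ; ++n)
    {
        const double pos = n * rate;              // fractional read position
        if (pos > lastIndex)
            break;
        const std::size_t i0 = static_cast<std::size_t>(pos);
        const std::size_t i1 = std::min(i0 + 1, source.size() - 1);
        const double frac = pos - static_cast<double>(i0);
        const double value = source[i0] * (1.0 - frac) + source[i1] * frac;
        result.push_back(static_cast<std::int16_t>(std::lround(value)));
    }
    return result;
}
```

With a rate of 2.0 this reduces to skipping every second sample, and with 0.5 it inserts an in-between value after each source sample, so the same data plays an octave lower.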




[edited by - Sphet on October 24, 2003 4:56:07 PM]

[edited by - Sphet on October 24, 2003 5:02:52 PM]

Yes, it sure is. DSP, or Digital Signal Processing, is a blanket term for anything that involves working with signals in the digital domain. When speaking about digital audio and DSP it is usually about things like mixing, editing, effects, noise reduction, signal analysis and such. It's sure neat stuff to mess around with!
