Pulse Code Modulation questions

2 comments, last by CodeDemon 13 years, 3 months ago
I've been working on some audio stuff: loading WAVs from disk, recording and playing, etc. I understand PCM and how it's sampled, but what I don't understand is what the samples actually mean. How is the sound represented by the values? Higher values, higher volume? How do timbre, pitch, etc. affect the values (and more importantly, vice versa)?

Also any suggestions on synchronizing multiple audio sources would be greatly appreciated, but I won't ask for too much :).

Thanks.
--------------------------------------Not All Martyrs See Divinity, But At Least You Tried
Each sample stores the amplitude ("volume") of the sound signal at a particular point in time, nothing more. Pitch is governed by frequency and the timbre is affected by the "shape" of the wave; these are down to the overall interpretation of the samples (in the same way that individual pixels in an image file won't tell you how blurry the overall image is, for example).
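To make that concrete, here's a minimal sketch (all names and parameters are illustrative, not from any particular API) that generates PCM samples of a pure tone. Notice that the only thing each sample stores is an amplitude; pitch lives in how fast the values oscillate, and "volume" in how far they swing from zero.

```python
import math

SAMPLE_RATE = 8000          # samples per second (assumed for illustration)
FREQ = 440.0                # pitch of the tone, in Hz
AMPLITUDE = 0.5             # fraction of full scale, 0.0..1.0 ("volume")

def sine_pcm(num_samples):
    """Return signed 16-bit PCM samples of a sine tone."""
    samples = []
    for n in range(num_samples):
        t = n / SAMPLE_RATE
        value = AMPLITUDE * math.sin(2 * math.pi * FREQ * t)
        samples.append(int(value * 32767))  # scale to the 16-bit range
    return samples

pcm = sine_pcm(8000)  # one second of audio
```

Doubling AMPLITUDE makes the tone louder; doubling FREQ raises the pitch an octave; the sample format itself never changes.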

For an example of one way you can change the timbre of a sound wave simply and effectively, see frequency modulation synthesis.
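As a rough sketch of the idea (constants here are made-up, not from any reference implementation): in FM synthesis you wobble the phase of a carrier sine wave with a second oscillator, and the modulation index controls how "rich" the resulting timbre is.

```python
import math

SAMPLE_RATE = 8000
CARRIER = 220.0    # base pitch, in Hz
MODULATOR = 110.0  # modulating frequency, in Hz
INDEX = 2.0        # modulation index: higher -> brighter, richer timbre

def fm_tone(num_samples):
    """Classic two-operator FM: the modulator perturbs the carrier's phase."""
    out = []
    for n in range(num_samples):
        t = n / SAMPLE_RATE
        phase = 2 * math.pi * CARRIER * t \
                + INDEX * math.sin(2 * math.pi * MODULATOR * t)
        out.append(math.sin(phase))
    return out
```

With INDEX set to 0 you get a plain sine; increasing it adds sidebands around the carrier, which is what changes the timbre without changing the fundamental pitch.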

[Website] [+++ Divide By Cucumber Error. Please Reinstall Universe And Reboot +++]



You can think of PCM samples as the position of the speaker/mic diaphragm over time. Values further from the midpoint represent louder sounds.
You can translate between this time-domain signal and a frequency-domain signal using a Fourier transform. A 15 kHz buzz would show up as a single spike in the frequency domain, while in the time domain (your PCM) you'd see the values moving up and down as a 15 kHz sine wave.
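You can see that single spike with a naive discrete Fourier transform. This is a toy sketch (a tiny buffer and an O(N²) DFT, not something you'd use for real audio, where you'd reach for an FFT):

```python
import cmath
import math

N = 64        # buffer length, kept small for illustration
TONE_BIN = 8  # tone frequency = TONE_BIN * sample_rate / N

# Time domain: a sine wave that oscillates TONE_BIN times over the buffer.
signal = [math.sin(2 * math.pi * TONE_BIN * n / N) for n in range(N)]

def dft(x):
    """Naive discrete Fourier transform, O(N^2)."""
    length = len(x)
    return [sum(x[n] * cmath.exp(-2j * math.pi * k * n / length)
                for n in range(length))
            for k in range(length)]

spectrum = [abs(c) for c in dft(signal)]
peak = max(range(N // 2), key=lambda k: spectrum[k])
# The spectrum is near zero everywhere except a single spike at TONE_BIN.
```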

Each sample represents an amplitude on some scale over time. With PCM the scale is generally linear, from completely quiet to full scale; our perception of loudness, on the other hand, is roughly logarithmic, which is why audio levels are usually discussed in decibels. That's it. You're probably wondering where the magic is: how can extremely rich sounds and audio be reconstructed from this? The magic is in our ears and brains. The inner ear and brain are very good at picking up differences in the amplitude of mechanical vibrations (sound) over time and demodulating them into different frequency components, mapping them to the unique sensations of the different types of sound we experience. It's really all in our heads.
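To illustrate the linear-versus-logarithmic point, here's a small helper (the name and full-scale value are just assumptions for a 16-bit signal) that converts a linear sample magnitude into decibels relative to full scale:

```python
import math

def dbfs(sample, full_scale=32767):
    """Linear 16-bit sample magnitude -> decibels relative to full scale."""
    if sample == 0:
        return float('-inf')  # silence has no finite level
    return 20 * math.log10(abs(sample) / full_scale)

half = dbfs(16384)  # half of full scale is about -6 dBFS
```

Halving the linear amplitude always costs about 6 dB, no matter where you start, which is exactly what a logarithmic scale means.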

Timbre and pitch are mathematical properties that can be isolated and examined from a series of amplitude samples. If you plot a sound, say as a function f on a Cartesian graph with amplitude on the Y axis and time on the X axis, it's possible to represent f as an infinite sum of sinusoidal waves. In the bounded, discrete case, if you have a sound with n samples, you can represent it exactly with n/2 sinusoids, where each wave has its own frequency, amplitude, and phase. Mapping a function onto its set of sinusoidal components is known as transforming it into the frequency domain, usually via the Discrete Fourier Transform or the Fast Fourier Transform. You can change the pitch of a sound by shifting the frequency of each sinusoid before reconstructing the samples with the inverse Fourier transform (which, conveniently, is nearly the same computation as the forward transform).
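Here's a toy sketch of that last idea, under deliberately simplified assumptions (a tiny buffer, a naive O(N²) transform, and a crude bin-doubling shift that a real pitch shifter would replace with a phase vocoder or similar):

```python
import cmath
import math

N = 64  # buffer length, kept small for illustration

def dft(x):
    """Naive forward discrete Fourier transform, O(N^2)."""
    length = len(x)
    return [sum(x[n] * cmath.exp(-2j * math.pi * k * n / length)
                for n in range(length))
            for k in range(length)]

def idft(X):
    """Naive inverse transform: same sum with the sign of the exponent flipped."""
    length = len(X)
    return [sum(X[k] * cmath.exp(2j * math.pi * k * n / length)
                for k in range(length)).real / length
            for n in range(length)]

# A sine wave that oscillates 4 times over the buffer.
signal = [math.sin(2 * math.pi * 4 * n / N) for n in range(N)]
X = dft(signal)

# Crude pitch shift: move each bin k (and its mirror) to 2k, doubling frequency.
Y = [0j] * N
for k in range(1, N // 2):
    if 2 * k < N // 2:
        Y[2 * k] = X[k]
        Y[N - 2 * k] = X[N - k]

shifted = idft(Y)  # now oscillates 8 times over the buffer: one octave up
```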

Also, what do you mean by synchronizing multiple audio sources?
