How can sample rate be 44000Hz?

13 comments, last by Brother Bob 9 years, 3 months ago

I've read that the human ear can hear frequencies up to 22,000 Hz; I suppose that when they write "frequency" they mean cycles per second. A standard sample rate for audio files is 44 kHz because that is double 22 kHz, which is needed because of the Nyquist theorem.

However, how would this translate into samples per second? Imagining a sine wave (or a sawtooth wave), a minimum of four samples would be needed to complete a cycle (a sample when the sine's argument is 0, the next at π/2, the next at π, and the last one of the cycle at 3π/2).
That results in 22,000 Hz × 4 samples = 88,000 samples per second.
Am I correct?

Intel Core 2 Quad CPU Q6600, 2.4 GHz. 3GB RAM. ATI Radeon HD 3400.

You only need 2 samples per cycle to produce a sine-like oscillation, hence the Nyquist theorem. Imagine alternating -1 and 1 samples. The sample rate is the sample rate.
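A quick Python sketch of that alternating-samples picture (the rates and phases are made-up illustrative values): sampling a sine at exactly twice its frequency gives the ±1 pattern when you hit the peaks, but all zeros when you hit the zero crossings, which hints at why exactly 2x is the breaking point.

```python
import math

fs = 4  # sample rate, exactly twice the signal frequency (illustrative)
f = 2   # sine frequency

# Sampling sin(2*pi*f*t + pi/2) at fs = 2*f hits the wave at its peaks,
# producing the alternating 1/-1 pattern described above.
peaks = [math.sin(2 * math.pi * f * n / fs + math.pi / 2) for n in range(8)]
print([round(s, 6) for s in peaks])  # alternates between 1.0 and -1.0

# With phase 0, the same sine is sampled at its zero crossings instead,
# so every sample is (numerically) zero.
zeros = [math.sin(2 * math.pi * f * n / fs) for n in range(8)]
print([round(z, 6) for z in zeros])
```

Same signal, same sample rate; only the phase differs, yet one sampling sees full amplitude and the other sees nothing.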

Most adult humans can't hear above around 15-16 kHz; only the very young can hear up to 20 kHz. The extra frequency range is there to give some headroom above what is audible. This headroom is needed for low-pass filtering when converting from higher sample rates, or for the low-pass filtering in the digital-to-analogue converters.

Sample rate and the data being sampled are also disjoint. The Nyquist theorem, in reference to sampling, also relates to reproduction: in order to faithfully reproduce a sampled signal, it must be sampled at twice the highest frequency (reproducible, or valid, or whatnot). Implicit in this is that sampling at a higher rate will allow you to capture higher frequencies (if they exist in the signal).

The sample rate given in Hz is just how many times the data is sampled per second, so if the sample rate is 4,400 Hz -> 4.4 kHz, then the data is being sampled 4,400 times every second.

The sample rate is given in Hz because this would be the clock frequency used to drive the A/D converter, if it does the conversion in a single step.

As cgrant said, the sampling rate is samples per second, not cycles per second. Hz is "times per second", as in "x times per second, you take a sample from the audio stream".

Also, usually it's 44100Hz, not 44000Hz (though you probably know that and were simplifying).
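In code, that relationship is just multiplication. A tiny Python sketch (the duration is an arbitrary example value):

```python
sample_rate = 44100   # samples per second (Hz)
duration = 2.5        # seconds of audio, arbitrary example value
num_samples = int(sample_rate * duration)
print(num_samples)    # 110250 samples for 2.5 seconds of mono audio
```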

 

For the specific number, that's largely from history, coming from radio and television standards.

In radio, at the time signal-processing electronics were being invented, FM and FM stereo were the big new things. A bit of Google shows the actual rules involved were 47 CFR Section 73.310. Quoting from their description: "the multiplex subcarriers (two or more) must be located between 20 kHz and 99 kHz. The arithmetic sum of all the multiplex subcarriers may not exceed 30% modulation (22.5 kHz)." So the biggest a single audio channel needed to be was 22.5 kHz.

Television used a different range but still followed FM for audio, where part of the signal was needed for the audio subcarrier, so it also fell under the 22.5 kHz umbrella of the FM regulations. Later, the NTSC standard and frequency allocations allowed double the bandwidth by assigning a secondary audio channel, bringing it to 44.1 kHz.

There were other frequencies and ranges used outside the United States, so in the 1970s, when Sony and others were working on digital equipment, they picked 44.1 kHz because it worked with existing equipment for FM radio (22.5 kHz), for NTSC television at both the 30 and 60 rates (22.5 kHz and 44.1 kHz), for PAL 50 television (44.1 kHz audio rate), and a few others. There were other competing rates used by various radio and television standards; 48 kHz is still a common value and was revived in several digital formats.

Early sound cards (from a time when both space and processing were expensive) wanted to reduce storage and processing needs. Since they are computers, and halving and doubling are cheap and easy, the 22.05 kHz and 11.025 kHz rates were common in early sound cards, being half and a quarter of the 44.1 kHz value. Going the opposite way, DVD-Audio has 88.2 kHz and 176.4 kHz audio, double and quadruple the number.

Also, the sampling theorem is only about frequencies.

It's pretty easy to picture intuitively that sampling at 2x will conserve the frequency of a sine wave, but not its amplitude.

Edit: Disregard the above, it's inaccurate and misleading.

44 kHz is just "good enough"; you need a much higher rate for high fidelity...

Also, the sampling theorem is only about frequencies.

It's pretty easy to picture intuitively that sampling at 2x will conserve the frequency of a sine wave, but not its amplitude.


Do you have a reference, or an example? My understanding of the sampling theorem is very different from yours.
There is more to time-discretization than just the Nyquist theorem.
When you time-discretize a continuous signal, you essentially turn it into a stream of impulses, one impulse for each sample. To turn this stream of impulses back into a time-continuous signal, you need to low-pass filter it (at least mathematically speaking). Imagine the low-pass filter blurring out all the spikes of the impulses, but keeping the general form of the signal intact.
You can show that if the highest frequencies in the original signal were below half the sampling rate, then all the additional frequencies due to the spiky impulses are above half the sampling rate. So (again, mathematically speaking) the low-pass filter used for perfect reconstruction must let everything below half the sampling frequency pass undisturbed, but completely filter out everything above it. If you had such a filter (you can't build it), and if you had an infinitely long sample stream (the filter is non-causal and has an infinite response, so you need an infinitely long sample stream), then you can perfectly reconstruct everything, provided the original signal truly never exceeded half the sampling frequency. As Olof Hedman already pointed out, exactly half the sampling frequency is the point where it breaks apart: at that point, you can no longer differentiate between phase and amplitude. But if the frequency is a smidge lower, then thanks to the infinite number of samples you can perfectly reconstruct it.
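That reconstruction can be sketched numerically: the ideal low-pass filter amounts to summing a sinc function centred on every sample (Whittaker-Shannon interpolation). A rough Python illustration, with made-up rates and a long-but-finite sample stream standing in for the infinite one:

```python
import math

fs = 8.0         # sample rate (Hz), illustrative
f = 1.0          # sine frequency, safely below fs / 2
n_samples = 400  # long stream to approximate the infinite one

# Time-discretize: one sample (impulse) per sampling instant.
samples = [math.sin(2 * math.pi * f * n / fs) for n in range(n_samples)]

def sinc(x):
    return 1.0 if x == 0 else math.sin(math.pi * x) / (math.pi * x)

def reconstruct(t):
    # The ideal low-pass filter's impulse response is a sinc,
    # so reconstruction sums a sinc centred on every sample.
    return sum(s * sinc(t * fs - n) for n, s in enumerate(samples))

# Evaluate between two sampling instants, well away from the stream's
# edges (where truncating the sinc tails hurts the approximation).
t = 25.3 / fs
print(reconstruct(t), math.sin(2 * math.pi * f * t))  # nearly identical
```

With the finite stream the match is only approximate, which is exactly the point about needing an infinitely long sample stream for *perfect* reconstruction.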

In practice, you can't build a perfect low-pass filter (except, maybe, if the signal is periodic?). This means the filter actually being used will have, roughly speaking, three frequency regions: a low-frequency region that gets through undisturbed, a middle region where the amplitudes get damped, and a high-frequency region where the filter blocks. And depending on the "width" of the middle region, you must keep a margin between the highest frequencies in your original signal and half the sampling rate (essentially what Aressera already said).

Also note that sampling of a continuous signal has nothing to do with the cycles in a synchronous circuit.
Ohforf's description seems correct to me, except I am not sure it is all that relevant for sound at 44,100 samples per second. What if you use a crappy filter, or perhaps even no filter at all? The resulting signal is not a correct reconstruction of the original, but the difference between the two signals is guaranteed not to have any frequency components under 22.05 kHz (a simple corollary of the sampling theorem), which means that an ear cannot distinguish them.

