Archived

This topic is now archived and is closed to further replies.

Zipster

Speeding up a sound...


I'm trying to think of a way to speed up the rate at which a sound plays, but NOT have it affect the pitch of the sound. Right now, I'm thinking I could set up a system where I let the sound play at normal speed. Then, at set intervals, I calculate what position in the sound the player is supposed to be at, using the desired play rate for the calculations, and jump the player to that location. The end result is skipping, but the more times per second I jump the player, the smoother it would get, unless the sound is played so fast that it's grossly noticeable. Are there any other ways to do this, preferably ones that play the entire sound?
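For illustration, the jump-ahead idea described above could be simulated offline roughly like this. It is only a sketch: the function name `skip_playback` and its `chunk` parameter (the number of samples played between jumps) are made up for the example and do not correspond to any DirectSound call.

```python
def skip_playback(samples, rate, chunk):
    """Simulate playing `samples` at normal speed while jumping ahead.

    Every `chunk` output samples, the read position is moved to where a
    player running `rate` times faster would be. Each chunk is copied
    unmodified, so the pitch is unchanged; the material between chunks
    is simply skipped.
    """
    if rate <= 0 or chunk <= 0:
        raise ValueError("rate and chunk must be positive")
    out = []
    out_pos = 0  # samples emitted so far
    while True:
        src = int(out_pos * rate)  # where the faster player should be
        if src >= len(samples):
            break
        out.extend(samples[src:src + chunk])
        out_pos += chunk
    return out


# 100 samples at double speed, jumping every 10 samples.
fast = skip_playback(list(range(100)), 2.0, 10)
print(len(fast), fast[:12])
```

A smaller `chunk` means more jumps per second, which is the "smoother" case from the post, at the cost of more audible discontinuities at each jump.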

Physics.

Sound is a waveform; the "speed" of the sound is analogous to (actually the inverse of) its wavelength and its pitch is related to its amplitude (height). To modify the sound, modify the waveform. To obtain lower pitch, scale the waveform down (multiply y by a number < 1); to obtain higher speed, scale time down as well (multiply x by a number < 1).



I wanna work for Microsoft!

quote:

Sound is a waveform; the "speed" of the sound is analogous to (actually the inverse of) its wavelength and its pitch is related to its amplitude (height). To modify the sound, modify the waveform. To obtain lower pitch, scale the waveform down (multiply y by a number < 1); to obtain higher speed, scale time down as well (multiply x by a number < 1).


I thought volume is amplitude and pitch is frequency, in which case you need to resample the sound but maintain the frequency. Can't DirectSound alter the pitch, so you can compensate for any increase/decrease in playback speed?

quote:
Original post by invective
I thought volume is amplitude and pitch is frequency, in which case you need to resample the sound but maintain the frequency.

Ack! I can't believe I wrote that! You're absolutely right.

/me scuttles off in utter shame...



I wanna work for Microsoft!

quote:
Can't DirectSound alter the pitch, so you can compensate for any increase/decrease in playback speed?


That's the catch. There aren't any functions to change the playback speed, only the pitch! There are several effects available, but these are your standard distortion, echo, flanger, gargle, etc., with no playback rate changing. There is a function to change the play offset, so my original plan would still work. Other than what I see in MSDN, I have no idea how to change the playback rate, if any function of the sort exists.

You mentioned resampling the sound but maintaining frequency... any way of doing it in practice? Don't worry about the math, I can take anything.

Can be done, although you are going to struggle for real-time. Zipster, you were nearly there as well.

Essentially, imagine a waveform a sampled at 100Hz: we have 100 discrete sample points per second, and playing this audio back will take exactly one second. Now, we change the playback rate to 200Hz. The sample now plays back at twice the pitch, and only takes half a second. This is obviously not what we want. Now, we will produce a resampled waveform b, by interpolating the amplitude at each sample point. Every OTHER sample in b will be the same as every sample in a, whilst the remaining 'gaps' between these will be filled by using an interpolative scheme.

Imagine a simple linear interpolation scheme ( with interpolated values denoted by a * )

a[0] = 0
a[1] = 10
a[2] = 50
a[3] = 20

Resampled, we have :
b[0] = 0
b[1] = 5 *
b[2] = 10
b[3] = 30 *
b[4] = 50
b[5] = 35 *
b[6] = 20
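
The resampling step above can be sketched in a few lines. This is only an illustration of the linear scheme; the function name `upsample_linear` and its `factor` parameter are invented for the example.

```python
def upsample_linear(a, factor=2):
    """Insert factor-1 linearly interpolated values between each pair
    of neighbouring samples in `a`."""
    if factor < 1:
        raise ValueError("factor must be at least 1")
    if not a:
        return []
    b = []
    for i in range(len(a) - 1):
        for k in range(factor):
            t = k / factor  # 0 at the original sample, then fractions
            b.append(a[i] + (a[i + 1] - a[i]) * t)
    b.append(float(a[-1]))  # the last original sample has no right neighbour
    return b


a = [0, 10, 50, 20]
print(upsample_linear(a))  # same seven values as the b[] listing above
```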


Linear interpolation is fast and easy, but generally produces poor results (poor in the eyes, sorry, ears, of audio people - I doubt most casual users would notice). More exotic interpolation schemes can be used if you are willing to trade simplicity and speed for clarity: Lagrange interpolation, cubic interpolation, Bézier interpolation or cosine interpolation all produce superior results, whilst the best signal can be produced using bandlimited interpolation.
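
As one example of the alternatives listed, cosine interpolation might look something like the sketch below. The helper names `cosine_interpolate` and `resample_cosine` are made up here, and resampling to an arbitrary output length is an assumption added for the example rather than something from the post.

```python
import math


def cosine_interpolate(y0, y1, mu):
    """Blend y0 and y1 with a cosine-shaped weight; mu runs from 0 to 1."""
    mu2 = (1 - math.cos(mu * math.pi)) / 2
    return y0 * (1 - mu2) + y1 * mu2


def resample_cosine(a, n_out):
    """Resample `a` to `n_out` points spanning the same start and end."""
    if len(a) < 2 or n_out < 2:
        raise ValueError("need at least two input and two output samples")
    step = (len(a) - 1) / (n_out - 1)
    out = []
    for j in range(n_out):
        pos = j * step
        i = min(int(pos), len(a) - 2)  # left neighbour, clamped at the end
        out.append(cosine_interpolate(a[i], a[i + 1], pos - i))
    return out


print(resample_cosine([0, 10, 50, 20], 10))
```

Compared with the linear version, the only change is the weighting curve, which flattens out near each original sample instead of producing sharp corners.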

And I apologise to any DSPers for this grossly simplified and lacklustre resampling 101.

Edited by - Colin Barry on November 5, 2001 4:23:30 AM

Yeah, real-time isn't my friend in that situation.

Now I was glancing at GoldWave to see how it did its time warp, and it appears that there were two options: "Similarity" and "FFT". Now I don't know if Similarity is a common term, but I've sure heard of FFT before. Any insight on these? What Colin told me is what I had in mind, only I wasn't planning on interpolating the values in between, just jumping.

Never heard of Similarity before; I guess it is some technique developed by the GoldWave author.

FFT, on the other hand, stands for Fast Fourier Transform. The FFT is a way of decomposing a signal into an array of sinusoids, each with its own discrete amplitude, frequency and phase. From here, you can do all kinds of clever analysis stuff, or pitch-shift by transposing each individual sine component. Very nice.
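
To make the decomposition concrete, here is a minimal textbook radix-2 FFT in plain Python. It is a sketch for illustration only (the function name `fft` and the test signal are made up for the example), not the method GoldWave uses and far slower than a tuned library.

```python
import cmath
import math


def fft(x):
    """Recursive radix-2 Cooley-Tukey FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a non-zero power of two")
    if n == 1:
        return [complex(x[0])]
    even = fft(x[0::2])
    odd = fft(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out


# A sine that completes exactly 3 cycles over 64 samples.
n = 64
signal = [math.sin(2 * math.pi * 3 * t / n) for t in range(n)]
magnitudes = [abs(c) for c in fft(signal)[: n // 2]]
peak_bin = max(range(n // 2), key=lambda k: magnitudes[k])
print(peak_bin)  # the bin with the largest magnitude
```

Each output bin holds a complex number whose magnitude and angle give the amplitude and phase of one sinusoid; the bin index gives its frequency in cycles per analysis window.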

A good reference for FFT is Stephen Sprenger's DSP Dimension, whilst if it is an FFT implementation you want, you should check out the Fastest Fourier Transform in the West (FFTW). Yee Haw!

Edited by - Colin Barry on November 6, 2001 4:07:55 AM

Different ideas:

Another way of doing it is splitting the sample up into separate pieces, if it is drums, vocals, etc.

Also, you can loop the middle section of a note (like soundfonts, mods, etc.), if this is appropriate in your situation.


ANDREW RUSSELL STUDIOS
Looking for my webpage? Funny that... Me too!
Resist nes8bit :: Bow Down to Linux Communisum

Good way of doing it. This is what Steinberg's .rex file format does: it reduces (for example) a drum loop into composite samples and rebuilds this at a different tempo by increasing or decreasing the time between successive hits. Because all you are doing is changing the gaps between samples, there is absolutely no loss of quality.
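
The slice-and-rebuild idea can be sketched as follows. This is not the .rex format itself, only an illustration of moving unmodified slices closer together or further apart; the function `retime_slices` and its parameters are hypothetical, and finding the slice boundaries in the first place is assumed to have been done already.

```python
def retime_slices(slices, onsets, tempo_ratio):
    """Rebuild a loop from pre-cut slices at a new tempo.

    slices      -- list of sample lists, one per hit
    onsets      -- original start index of each slice
    tempo_ratio -- > 1 plays the loop faster, < 1 slower
    """
    if tempo_ratio <= 0:
        raise ValueError("tempo_ratio must be positive")
    if len(slices) != len(onsets):
        raise ValueError("need one onset per slice")
    if not slices:
        return []
    new_onsets = [int(round(o / tempo_ratio)) for o in onsets]
    length = max(start + len(s) for start, s in zip(new_onsets, slices))
    out = [0] * length
    for start, s in zip(new_onsets, slices):
        for i, value in enumerate(s):
            out[start + i] += value  # tails that now overlap are summed
    return out


# Two short hits, originally 8 samples apart, rebuilt at double tempo.
print(retime_slices([[1, 2], [3, 4]], [0, 8], 2.0))
```

The samples inside each slice are never resampled, which is why the pitch of each hit is untouched; only the silence (or overlap) between hits changes.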

Ahh, I recall the part about splitting the waveform into sinusoids from Fourier's Theorem. I guess that's where the second 'F' in FFT comes from. The quantum theory book I have goes into much more depth than I need for practical use though!

Hmmm... FFT sounds good! Just some background on what I have here... I'm working with 8-bit mono sounds that were originally sampled at 11025Hz. So we don't have ultra high quality sounds here. From your abundant knowledge of waveforms and transformations, roughly how long would the calculations take? I'm making a rough estimate that my program will have to be able to play 3 discrete sounds every 2 seconds, and I don't want anything to noticeably lag behind.

Thanks again.
