16 bit datatype for audio?

28 comments, last by Aressera 11 years, 8 months ago

you should know you can't "take two bytes and bitshift one of them to make an unsigned char" (because that makes no sense).


an unsigned short is two bytes, pretty obvious in context. if you take an unsigned short and >> 8, what do you get?

probably the same thing as me mate.

friendly forum.
neither a follower nor a leader be http://www.xoxos.net

to resample audio, you must have audio.

i don't have audio, i am a synthesist. no wavs.

if you want to call the direct application of form to a variable resampling, you go ahead now.

Then I have to agree with you that there's likely been a massive breakdown in communication. Can you, as clear as you can, explain your situation and question, and if possible show the relevant code?
[ I was ninja'd 71 times before I stopped counting a long time ago ] [ f.k.a. MikeTacular ] [ My Blog ] [ SWFer: Gaplessly looped MP3s in your Flash games ]

Then I have to agree with you that there's likely been a massive breakdown in communication. Can you, as clear as you can, explain your situation and question, and if possible show the relevant code?


i'll be back later and let you know how it works. it appears that all i needed to know was that, as hazarded in my opening, the buffer is indeed to be composed of concatenated bytes. petzold makes no mention of bit depth variation; i had also incorrectly guessed that the buffer would be composed of other variable types, and my exploration of this attempted implementation had not resulted in the facile success one anticipates when performing simple tasks. :)

i hope i can make this statement without being razzed about the fact that *all* datastreams are composed of bytes. when one is intending to discern an undocumented convention, it is data and not reason that gets you to the goal.
neither a follower nor a leader be http://www.xoxos.net

[quote name='Cornstalks' timestamp='1344988340' post='4969643']
you should know you can't "take two bytes and bitshift one of them to make an unsigned char" (because that makes no sense).


an unsigned short is two bytes, pretty obvious in context. if you take an unsigned short and >> 8, what do you get?
[/quote]
You said "take two bytes" (not "take an unsigned short"). There's a distinct difference here between the two (yeah, they're both made up of 16 bits, but how you interpret those 16 bits is quite different, generally). Also, if you took two bytes, you wouldn't bitshift either of them. You'd just take the high byte and forget about the low byte.
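To make the distinction concrete, here is a minimal C++ sketch of both views of the same 16 bits. The helper names (`high_byte`, `low_byte`, `from_bytes`) are made up for illustration and are not from anyone's code in this thread.

```cpp
#include <cstdint>

// View 1: you hold one 16-bit value. Shifting right by 8 moves the upper
// 8 bits down, and the narrowing cast keeps only those bits.
inline std::uint8_t high_byte(std::uint16_t v) {
    return static_cast<std::uint8_t>(v >> 8);
}

// The low byte of the same value: mask off everything above bit 7.
inline std::uint8_t low_byte(std::uint16_t v) {
    return static_cast<std::uint8_t>(v & 0xFF);
}

// View 2: you hold two separate bytes. No shift is needed to pick one of
// them; a shift only appears when you combine them back into 16 bits.
inline std::uint16_t from_bytes(std::uint8_t lo, std::uint8_t hi) {
    return static_cast<std::uint16_t>((hi << 8) | lo);
}
```

With these, `high_byte(0xABCD)` yields `0xAB`, which is the same byte you would get by simply picking the high one out of a pair.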



i hope i can make this statement without being razzed about the fact that *all* datastreams are composed of bytes.

Yes, but it's how you interpret those bytes that is important.

Good luck.
[ I was ninja'd 71 times before I stopped counting a long time ago ] [ f.k.a. MikeTacular ] [ My Blog ] [ SWFer: Gaplessly looped MP3s in your Flash games ]
yap works fine. turn short into two chars, send to buffer in correct order as per
http://msdn.microsoft.com/en-us/library/windows/desktop/dd797880(v=vs.85).aspx

lovely 16 bit streaming audio :)
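
For anyone landing here later, a rough C++ sketch of "turn short into two chars, send to buffer in correct order". It assumes 16 bit signed PCM stored low byte first (little-endian), which is the usual layout on Windows; `pack_le16` is a hypothetical helper name, not part of any audio API.

```cpp
#include <cstdint>
#include <vector>

// Append each 16-bit signed sample to a byte buffer, low byte first.
std::vector<std::uint8_t> pack_le16(const std::vector<std::int16_t>& samples) {
    std::vector<std::uint8_t> bytes;
    bytes.reserve(samples.size() * 2);
    for (std::int16_t s : samples) {
        // Work on the unsigned bit pattern so the shift and mask are well defined.
        const std::uint16_t u = static_cast<std::uint16_t>(s);
        bytes.push_back(static_cast<std::uint8_t>(u & 0xFF));  // low byte
        bytes.push_back(static_cast<std::uint8_t>(u >> 8));    // high byte
    }
    return bytes;
}
```

On a little-endian machine you could also hand the API a pointer to the `int16_t` array directly; the explicit packing just makes the byte order visible.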

and to keep the fine comedy we've been experiencing going, word is that BYTES have pulled a TAFKAP and are now known as TVFKAUC. terrific!

have a good and fine existence.
neither a follower nor a leader be http://www.xoxos.net
nah bro if you took two bytes, you put the low one in first, then the high one.

if you continually focus on the way in which my statements can be misunderstood instead of how they apply to context, you know what, dude..

you know what.. cos i'm gonna tell you dude..

dude, you're gonna spend a lot of time on this forum.
neither a follower nor a leader be http://www.xoxos.net
Watch your attitude, xoxos.

People are trying to help you here, and being dismissive, snarky, and rude to them because they cannot read your mind is unfair and unwanted in this community.

Please consider that if there is a breakdown in communication in one of your threads, the onus is on you to communicate clearly, not on everyone else to be psychic.

Wielder of the Sacred Wands
[Work - ArenaNet] [Epoch Language] [Scribblings]

Guys, you're being rather patronising to him. Changing audio from 8 bits to 16 bits is not a resampling operation, and can indeed be done with a bitshift. So all of you arguing about how difficult an operation resampling is and how he's asking the wrong question are completely off the mark here.

Xoxos, I'm not sure if you solved your problem or not, but I'll go over some of the issues. Usually the audio APIs deal with 16 bit signed data, -32768 to +32767, which if you divide by 32768 will give you the float values that you'd normally have in a DAW. The important things to consider are that instead of stepping 1 byte at a time through the data, you must step 16 bits at a time (i.e. 2 bytes) - but if you have a pointer to the relevant data type (e.g. short* for signed 16 bit samples) then incrementing that by 1 will move the correct distance through the data. (Bear in mind that if it's stereo data, it'll have the left and right channels interleaved, so you'll get a sample for the left, then one for the right.)
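
As a sketch of that stepping, assuming signed 16 bit stereo data interleaved as L, R, L, R: the `StereoFloat` struct and `deinterleave` function below are illustrative names only.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

struct StereoFloat {
    std::vector<float> left;
    std::vector<float> right;
};

// Walk an interleaved 16-bit buffer one frame (left + right sample) at a
// time and convert each sample to a float in [-1, 1) by dividing by 32768.
StereoFloat deinterleave(const std::int16_t* data, std::size_t frames) {
    StereoFloat out;
    out.left.reserve(frames);
    out.right.reserve(frames);
    for (std::size_t i = 0; i < frames; ++i) {
        // Indexing an int16_t pointer already advances 2 bytes per element.
        out.left.push_back(data[2 * i] / 32768.0f);
        out.right.push_back(data[2 * i + 1] / 32768.0f);
    }
    return out;
}
```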

If you have 8 bit unsigned data, and need to convert it to 16 bit unsigned data, you can copy the 8 bit value into the 16 bit data type and shift it left by 8. Obviously the quality will not improve, because you don't have the missing 8 bits of information to reconstruct it with, but the data format will be correct. But if you're generating the data (e.g. from a sine wave) then just output the value as 16 bits from the start. Construct the float representation you're used to, ensure it's within the -1 to +1 range, then multiply that by 32767 (multiplying by 32768 would overflow the signed 16 bit range at exactly +1) to get the 16 bit value you need.
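
A minimal sketch of generating 16 bit samples directly from a float sine, under a couple of assumptions of mine: the scale factor is 32767 so that exactly +1.0 still fits in a signed 16 bit value, and `float_to_s16` and `make_sine` are hypothetical helper names.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <vector>

// Clamp a float sample to [-1, 1] and scale it to the signed 16-bit range.
std::int16_t float_to_s16(float x) {
    x = std::max(-1.0f, std::min(1.0f, x));
    return static_cast<std::int16_t>(std::lround(x * 32767.0f));
}

// Fill a buffer with `count` samples of a full-scale sine at `freq_hz`,
// sampled at `rate_hz`.
std::vector<std::int16_t> make_sine(double freq_hz, double rate_hz, std::size_t count) {
    const double kTwoPi = 6.283185307179586476925;
    std::vector<std::int16_t> out(count);
    for (std::size_t i = 0; i < count; ++i) {
        const double phase = kTwoPi * freq_hz * static_cast<double>(i) / rate_hz;
        out[i] = float_to_s16(static_cast<float>(std::sin(phase)));
    }
    return out;
}
```

For example, `make_sine(440.0, 44100.0, 44100)` would give one second of mono A440 ready to be written to a 16 bit buffer.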
Chill. We're not trying to make him feel like an idiot. There was obviously a breakdown in communication (I won't do any blaming on this one; I don't care who's to blame), we've tried to clarify things, I corrected a technical flaw in something that was said, xoxos had attitude about the correction, Hodgman warned him about his attitude, you posted, and now I'm here making another post (but I had fun doing the below experiment).


Changing audio from 8 bits to 16 bits is not a resampling operation

Yes it is... If you have sampled at 8 bits, and need new samples at 16 bits, you have to resample (regardless of how you do it, even if it's just shifting left 8 bits). That's exactly what resampling is.


and can indeed be done with a bitshift.

Not if you really care about your waveform's quality. There are different ways to do it too, like linearly scaling your samples to 16 bits. Which is best depends on various factors, like what kind of noise reduction algorithm you use after resampling (but I'd say neither is best without the noise reduction step).
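
To show the difference between the two approaches, here is a small sketch for unsigned 8 bit to unsigned 16 bit. The function names are invented, and the factor 257 is one common way to do the linear scaling (65535 / 255 is exactly 257); it is not necessarily what my test program uses.

```cpp
#include <cstdint>

// Shift: 0..255 maps to 0..65280. The low 8 bits of the result are always
// zero, so full scale (65535) is never reached.
inline std::uint16_t widen_shift(std::uint8_t v) {
    return static_cast<std::uint16_t>(v << 8);
}

// Linear scale: 0..255 maps to 0..65535, so both ends of the 16-bit range
// are reached. Equivalent to copying the byte into both halves.
inline std::uint16_t widen_scale(std::uint8_t v) {
    return static_cast<std::uint16_t>(v * 257);
}
```

Both are still step functions of the original 256 levels; neither one adds information that the 8 bit data never had.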


So all of you arguing about how difficult an operation resampling is and how he's asking the wrong question are completely off the mark here.

We were off the mark, yes, but because we misunderstood what he wanted (I believe... I'm still not 100% sure). It sounds to me (in retrospect) that he has 16 bit audio data and he just wanted to know how to properly pass a byte pointer to this data, which yes, is not a resampling question (it's just a casting question). BUT, honestly, properly resampling is not a trivial task (assuming you want good audio, which I've always stated when I say this; if you don't care so much, yes, you can do it trivially). You have to do things like dither, and your dithering algorithm can make a big difference. Just rescaling your audio samples will kind of work, but it's not the "proper" way to do it.

When you quantize your waveform to 8 bits, and then resample it to 16 bits, you've got 8 extra bits to play with. Good resampling libraries will try to use those bits to minimize the noise you introduced when you first quantized to 8 bits.
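
As a rough illustration of what dither means at the quantization step, here is a sketch that quantizes a float sample to unsigned 8 bit with optional TPDF dither (the sum of two uniform random values, spanning about one step either way). TPDF is an assumption chosen for the example, and `quantize_u8` is a hypothetical name.

```cpp
#include <algorithm>
#include <cmath>
#include <cstdint>
#include <random>

// Quantize x in [-1, 1] to unsigned 8-bit, where 128 is silence.
// With dither enabled, a small random offset is added before rounding so
// the rounding error is decorrelated from the signal.
std::uint8_t quantize_u8(float x, std::mt19937& rng, bool dither) {
    float v = x * 127.0f + 128.0f;  // maps [-1, 1] to [1, 255]
    if (dither) {
        std::uniform_real_distribution<float> u(-0.5f, 0.5f);
        v += u(rng) + u(rng);  // triangular distribution, roughly +/- 1 step
    }
    const long q = std::lround(v);
    return static_cast<std::uint8_t>(std::max(0L, std::min(255L, q)));
}
```

Without dither the error repeats in step with the waveform, which is heard as distortion; with dither it turns into a steadier noise floor.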

I got bored and made a simple sample program that makes a sine wave, samples it to 8 bits, and then resamples it to 16 bits (either by shifting or linearly scaling) and optionally adding a trivial dither (note I haven't fine tuned it--at all). Listen to the 4 samples attached and decide what you think are best, and see how each one was made in the spoiler tag. Note that I didn't use any fancy noise reduction algorithm, and I have no doubt that if these were properly processed the quality would be much greater (but would only further prove that shifting/scaling your samples isn't enough). All samples are mono, 44100Hz.
[Attachments: A.wav, B.wav, C.wav, D.wav]
[spoiler]
Seriously, don't just look at this info without giving the test an honest effort!
[spoiler]
You sure you're ready?
[spoiler]
A = Resampled with shift left 8 bits, dither added
B = Resampled with multiply (linear scaling), no dither
C = Resampled with multiply (linear scaling), dither added
D = Resampled with shift left 8 bits, no dither

The one I thought was best was (don't peek unless you've made your decision!):
[spoiler]
A, though C was close (I think A's lack of using the low bits helped make the dither more effective)

B and D have a higher frequency noise that I think sounds worse.
[/spoiler]
[/spoiler]
[/spoiler]
[/spoiler]

Anyway, now I'm getting off track.
[ I was ninja'd 71 times before I stopped counting a long time ago ] [ f.k.a. MikeTacular ] [ My Blog ] [ SWFer: Gaplessly looped MP3s in your Flash games ]

I got bored and made a simple sample program that makes a sine wave, samples it to 8 bits, and then resamples it to 16 bits (either by shifting or linearly scaling) and optionally adding a trivial dither (note I haven't fine tuned it--at all). Listen to the 4 samples attached and decide what you think are best, and see how each one was made in the spoiler tag. Note that I didn't use any fancy noise reduction algorithm, and I have no doubt that if these were properly processed the quality would be much greater (but would only further prove that shifting/scaling your samples isn't enough). All samples are mono, 44100Hz.


Here are my thoughts:

[spoiler]

I initially like B the most because it seems to have the least amount of noise of the 4. However, I will agree that the noise that is there is the 2nd worst, D sounding the worst.

I can definitely hear the dither working in A and C, but it does seem to increase the total amount of noise by at least 3 to 6dB, masking the harsh noise in B + D. Personally I'd rather have a slightly less aggressive dither or no dither at all.

Listening on JBL LSR4328P studio monitors + ATH-M50 headphones through a MOTU 896mk3 on OS X.
[/spoiler]

It would have been nice to have a comparison where the test tones weren't generated at 100% full-scale; it's possible that the results might be different (or more audible), especially since most people are listening through Windows Mixer (which does some compression/limiting near 0dBFS I believe). That itself might be adding some additional distortion to the pure tone.
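
For generating tones below full scale, the usual conversion from a level in dBFS to a linear amplitude is gain = 10^(dB / 20). A tiny sketch, with `dbfs_to_gain` as a made-up helper name:

```cpp
#include <cmath>

// Convert a level in dBFS (0 = full scale, negative = quieter) to the
// linear factor a [-1, 1] float signal should be multiplied by.
inline double dbfs_to_gain(double dbfs) {
    return std::pow(10.0, dbfs / 20.0);
}
```

A tone at -6 dBFS would be scaled by roughly 0.5 before conversion to 16 bits, which leaves some headroom before any limiter near 0 dBFS could react.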

This topic is closed to new replies.
