16 bit datatype for audio?


Old topic!
Guest, the last post of this topic is over 60 days old and at this point you may not reply in this topic. If you wish to continue this conversation start a new topic.

29 replies to this topic

#1 xoxos   Members   -  Reputation: 90

Posted 14 August 2012 - 01:02 AM

been a long time since i've been here, forums have changed, hope this is the right one.


i have added sound to a win32 app as per petzold's "programming windows 5th ed." 'sinewave' example. the issue i am having is that all of these examples use 8 bit sound with PBYTE.

i would quite prefer to use 16 bit audio - i know my way very well around audio dsp. but i don't know what the 16 bit equivalent to a PBYTE should be, especially as i can't find any documentation for PBYTE.

can anyone tell me how i should find out what PBYTE is? it's not online, it's not in my schildt. i can only assume it's a pointer to an array of BYTE (a datatype i've never dealt with before today).


i'm guessing that BYTE typecasts can be swapped for short typecasts, and that i should create a buffer length array of shorts and use a pointer to that as a replacement to the PBYTE...


...but since PBYTE is sort of a piece of total fiction to me that may as well have five heads and have stepped out of a flying saucer, perhaps someone can tell me if there is some equally mysterious equivalent appropriate for 16 bit data.

tia.
neither a follower nor a leader be
http://www.xoxos.net

#2 krippy2k8   Members   -  Reputation: 642

Posted 14 August 2012 - 01:29 AM

PBYTE is just a typedef for BYTE*, with BYTE being a typedef for unsigned char, so in the end it types out as unsigned char*

The 16-bit equivalent would be PWORD, or PUINT16 to make it more clear. Both of these type out to unsigned short*
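
As a rough illustration of that typedef chain, here is a portable sketch. The typedefs below are written out by hand as stand-ins so the snippet compiles without `<windows.h>`; on Windows the SDK headers already provide them. `fill_words` is a hypothetical helper, not part of any API.

```c
#include <stddef.h>

/* Stand-ins for the Windows typedefs, written out by hand so this
   compiles anywhere. On Windows, <windows.h> already provides them. */
typedef unsigned char  BYTE;   /* 8-bit unsigned value  */
typedef BYTE          *PBYTE;  /* pointer to BYTE       */
typedef unsigned short WORD;   /* 16-bit unsigned value */
typedef WORD          *PWORD;  /* pointer to WORD       */

/* Hypothetical helper: fills a WORD buffer through a PWORD, the same
   way a BYTE buffer would be filled through a PBYTE. */
static void fill_words(PWORD buffer, size_t count, WORD value)
{
    for (size_t i = 0; i < count; ++i)
        buffer[i] = value;
}
```

Note that PWORD points at unsigned values. Later replies in this thread point out that 16-bit PCM samples are signed, so a `short *` buffer may be the more natural choice for audio.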

For future reference:
http://msdn.microsof...1(v=vs.85).aspx

Edited by krippy2k8, 14 August 2012 - 01:30 AM.


#3 ApochPiQ   Moderators   -  Reputation: 14247

Posted 14 August 2012 - 01:31 AM

Note that casting a pointer to a buffer of 16-bit data into a pointer to an 8-bit buffer will probably not do what you expect.
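
A small sketch of what that warning means in practice, assuming nothing beyond standard C. `view_as_bytes` is a hypothetical helper that copies out exactly what a byte pointer aimed at a 16-bit buffer would see: twice as many elements, each one only half of a sample.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: copies the raw bytes behind a 16-bit buffer,
   which is what a byte pointer aimed at that buffer would read.
   Returns the number of bytes written to out. */
static size_t view_as_bytes(const int16_t *samples, size_t count,
                            unsigned char *out)
{
    const unsigned char *raw = (const unsigned char *)samples;
    size_t nbytes = count * sizeof(int16_t);

    for (size_t i = 0; i < nbytes; ++i)
        out[i] = raw[i];   /* each element is half of one sample */
    return nbytes;
}
```

If code that expects 8-bit samples walks such a buffer, it treats every half-sample as a full sample, which is why the result is usually noise rather than the intended signal.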

#4 xoxos   Members   -  Reputation: 90

Posted 14 August 2012 - 11:28 AM

thank you both -

i have extensive experience with audio dsp, but in very few venues because i have a difficult time with sdks (eg. though i was easily able to intuit the typedef PBYTE, i was unable to find a reference for it). as a result, there are some elementary regions of programming i have no experience with.. eg. i never have to use structures for the work i do, so i am still wading whenever i have to deal with them, which hopefully is some other time.

so.. i can switch the PBYTE out for PWORD when defining pBuffer1 but this gives me the anticipated "cannot convert 'unsigned short *' to 'char *'" error with this statement

pWaveHdr1->lpData = pBuffer1 ;


perhaps understandably, i am challenged by the concept that a preexistent structure exists that handles 8 bit wavs but not 16 bit.

am i supposed to define my own struct for 16 bit wavs, or is there something i can use, or some way to modify what exists?


is there some kind of information somewhere, perhaps a deity, who can present me with the things i need in order to send 16 bit stereo sound from my windows application?

#5 xoxos   Members   -  Reputation: 90

Posted 14 August 2012 - 01:59 PM

i could take two bytes and bitshift one of them to make an unsigned char...


but dudes.... how is this normally done, so when i go to integrate my code in the future i'm on the same page... surely there must be some convention by now... ? because i've been searching for this for hours now..

Edited by xoxos, 14 August 2012 - 02:00 PM.


#6 Cornstalks   Crossbones+   -  Reputation: 6966

Posted 14 August 2012 - 04:35 PM

I think you're in over your head at this point and that you should slow down and step back. Work up to this problem you have. You can't just magically convert 8 bit audio to 16 bit audio (given your "extensive experience with audio dsp" I thought you would've known this). You have to resample it, though a naive bit depth conversion will probably not sound as good as you'd want it.

Casting an int8_t pointer to an int16_t pointer just changes the pointer. It doesn't change the actual data, so that does no good (and even if it did, you'd be better off properly resampling the audio to a higher bit depth).
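
To make the distinction concrete: changing the data means rewriting every sample, not recasting the pointer. The following is only a naive widening sketch, assuming unsigned 8-bit input and signed 16-bit output; it is not the proper requantization recommended here, and both function names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Naive widening of one unsigned 8-bit sample (0..255, midpoint 128)
   to a signed 16-bit sample: remove the offset, then scale by 256. */
static int16_t widen_u8_to_s16(uint8_t s)
{
    return (int16_t)(((int)s - 128) * 256);
}

/* Applies the conversion to a whole buffer. The output buffer is a
   separate array with one int16_t per input sample. */
static void widen_buffer(const uint8_t *in, int16_t *out, size_t count)
{
    for (size_t i = 0; i < count; ++i)
        out[i] = widen_u8_to_s16(in[i]);
}
```

The widened signal carries no more information than the 8-bit original; the low byte of every output sample is simply zero.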

For myself, I use swresample (from the FFmpeg project) to resample audio.
[ I was ninja'd 71 times before I stopped counting a long time ago ] [ f.k.a. MikeTacular ] [ My Blog ] [ SWFer: Gaplessly looped MP3s in your Flash games ]

#7 xoxos   Members   -  Reputation: 90

Posted 14 August 2012 - 04:55 PM

i use an sdk and normally handle audio as a float. i'm renowned for my work in physical modeling and generative audio.. magazine articles, career and stuff.. i can map musgrave's multifractal to a sphere..

but that doesn't give me a vernacular. eg. having i/o handled for me i wouldn't have known that BYTEs or WORDs were used, to me this seems a needless complication but whatever, hopefully my ego isn't on trial here.


i don't need to know the idea of resampling, i would like to become familiar with the convention. of course i'm in over my head, i wouldn't be posting if i could discern how to accomplish this profoundly elementary function as i am.

#8 xoxos   Members   -  Reputation: 90

Posted 14 August 2012 - 05:15 PM

..but more importantly, i know what resampling is, i do not know if it's what's normally done here... are you telling me that that's how this is normally handled? each short is split into two unsigned chars that are added sequentially? please, just say so!

or, alternatively, is the waveformatex routine modified to handle the shorts directly, or does wavehdr et c. always deal with chars?


at the present time i am not dealing with wavs, i am intending to implement a stream in which i can use the synthesis algorithms i am familiar with. i can synthesize using ints, whatever.. i need to know how to implement whatever variable format it expects.. you know, the stuff that's hard to reference. petzold doesn't cover his use of BYTES in the text.

Edited by xoxos, 14 August 2012 - 05:31 PM.


#9 ApochPiQ   Moderators   -  Reputation: 14247

Posted 14 August 2012 - 05:21 PM

I don't think people typically use the Win32 API to do nontrivial audio work. You generally use a library that wraps all the details for you and lets you provide the waveform data in whatever format you have available. I'd recommend looking into audio SDKs instead of trying to bang your 16-bit square peg into the Windows 8-bit round hole.

#10 Bacterius   Crossbones+   -  Reputation: 8134

Posted 14 August 2012 - 05:23 PM

..but more importantly, i know what resampling is, i do not know if it's what's normally done here... are you telling me that that's how this is normally handled? each short is split into two unsigned chars that are added sequentially? please, just say so!

or, alternatively, is the waveformatex routine modified to handle the shorts directly, or does wavehdr et c. always deal with chars?

Resampling an image by expanding each pixel to a 2x2 square - and vice versa - looks like crap. Audio is no different. Simply rescaling each 16-bit sample to 8-bit (or the opposite) will sound chunky. And no, splitting a short into two chars will not do what you want, do you understand what each short represents in a 16-bit mono audio waveform?

Edited by Bacterius, 14 August 2012 - 05:25 PM.

The slowsort algorithm is a perfect illustration of the multiply and surrender paradigm, which is perhaps the single most important paradigm in the development of reluctant algorithms. The basic multiply and surrender strategy consists in replacing the problem at hand by two or more subproblems, each slightly simpler than the original, and continue multiplying subproblems and subsubproblems recursively in this fashion as long as possible. At some point the subproblems will all become so simple that their solution can no longer be postponed, and we will have to surrender. Experience shows that, in most cases, by the time this point is reached the total work will be substantially higher than what could have been wasted by a more direct approach.

 

- Pessimal Algorithms and Simplexity Analysis


#11 ReaperSMS   Members   -  Reputation: 827

Posted 14 August 2012 - 05:28 PM

I suggest reading up on waveout at http://msdn.microsoft.com/en-us/library/windows/desktop/dd757715(v=vs.85).aspx

The lpData member of WAVEHDR is just a pointer to the buffer. The WAVEHDR itself is a description of one particular buffer you're planning on handing to waveout at some point. Assuming things are set up properly as mentioned below, you'd just cast your pointer to a PBYTE.

The format of the data you should point that at is determined by the WAVEFORMATEX you hand to waveOutOpen, for 16 bit audio, you'd fill in the appropriate fields to indicate 16 bit per sample PCM, either mono or stereo.
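
A hedged sketch of what "fill in the appropriate fields" could look like. To stay compilable off Windows, this uses a hand-written mirror of the WAVEFORMATEX fields (`WaveFormatSketch`) and a local constant with the same value as WAVE_FORMAT_PCM; in a real Win32 program you would fill the actual WAVEFORMATEX from the SDK headers instead. `make_pcm16_format` is a hypothetical helper.

```c
#include <stdint.h>

/* Hand-written mirror of the WAVEFORMATEX fields, for illustration
   only. On Windows, use the real struct from the SDK headers. */
typedef struct {
    uint16_t wFormatTag;       /* format type                       */
    uint16_t nChannels;        /* 1 = mono, 2 = stereo              */
    uint32_t nSamplesPerSec;   /* sample rate in Hz                 */
    uint32_t nAvgBytesPerSec;  /* data rate in bytes per second     */
    uint16_t nBlockAlign;      /* bytes per frame (all channels)    */
    uint16_t wBitsPerSample;   /* bits per sample, per channel      */
    uint16_t cbSize;           /* extra format bytes, 0 for PCM     */
} WaveFormatSketch;

#define SKETCH_WAVE_FORMAT_PCM 1  /* same value as WAVE_FORMAT_PCM */

/* Hypothetical helper: describes 16-bit PCM at the given channel
   count and sample rate, deriving the two dependent fields. */
static WaveFormatSketch make_pcm16_format(uint16_t channels,
                                          uint32_t sample_rate)
{
    WaveFormatSketch f;
    f.wFormatTag      = SKETCH_WAVE_FORMAT_PCM;
    f.nChannels       = channels;
    f.nSamplesPerSec  = sample_rate;
    f.wBitsPerSample  = 16;
    f.nBlockAlign     = (uint16_t)(channels * f.wBitsPerSample / 8);
    f.nAvgBytesPerSec = sample_rate * f.nBlockAlign;
    f.cbSize          = 0;
    return f;
}
```

For stereo at 44100 Hz this gives a block alignment of 4 bytes per frame and 176400 bytes per second.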

#12 xoxos   Members   -  Reputation: 90

Posted 14 August 2012 - 05:39 PM

bacterius - forgive me, because it seems we have a fundamental breakdown in communication.


i've been coding antialiasing and bandlimiting in c++ for a decade. you can find me in sound on sound and other magazines... i know all my ints and doubles and yummy bits.. please, give me some credit. i know sod all about sdks... i have trauma with sdks... but i have extensive experience actually flipping bits and doing things with them. i eat that way. please understand that i am 100% aware that there's something here i don't know, but it doesn't mean i don't know anything.

*if*
it is not accomplished in the manner i dreamed up,

*how*
is it normally accomplished??

ApochPiQ - thanks for the word. for trivial apps like simple games the 8 bit hole works. i am coding a base script for multimedia.

i'll have to look at other options if this doesn't work, but at present i'm not entirely sold on the idea that it's not done. the format wouldn't accept a 16 bit declaration if that were the case, would it? ;)

#13 xoxos   Members   -  Reputation: 90

Posted 14 August 2012 - 05:44 PM

The lpData member of WAVEHDR is just a pointer to the buffer. The WAVEHDR itself is a description of one particular buffer you're planning on handing to waveout at some point. Assuming things are set up properly as mentioned below, you'd just cast your pointer to a PBYTE.

The format of the data you should point that at is determined by the WAVEFORMATEX you hand to waveOutOpen, for 16 bit audio, you'd fill in the appropriate fields to indicate 16 bit per sample PCM, either mono or stereo.


not understanding your methodology... if i cast a pointer as a PBYTE and direct it to an array of shorts, that doesn't seem viable.

what i have got atm is replaced all the BYTE casts with WORD casts. it works (signal has correct unsigned short amplitude response), but it works bad, and i think it's because of the casting. still picking through.. so hard to believe no one's got an elementary implementation documented for this anywhere...

#14 Cornstalks   Crossbones+   -  Reputation: 6966

Posted 14 August 2012 - 05:52 PM

For the love, there's no need to reply twice within 5 minutes of each other. Just write one big reply. There's also an edit button if you need to make edits or addendums.

Also, get over your ego. This might sound harsh, but I don't care what experience you have (or don't have), and I don't think anyone else does either. It's only hurting you here, because, IMO, if you know your ints and doubles and eat bits, you should know you can't "take two bytes and bitshift one of them to make an unsigned char" (because that makes no sense).

i don't need to know the idea of resampling, i would like to become familiar with the convention. of course i'm in over my head, i wouldn't be posting if i could discern how to accomplish this profoundly elementary function as i am.

What I'm suggesting is that this problem isn't profoundly elementary (assuming you want good sounding audio).


..but more importantly, i know what resampling is, i do not know if it's what's normally done here... are you telling me that that's how this is normally handled? each short is split into two unsigned chars that are added sequentially? please, just say so!

I'm not telling you how this is properly handled, because I'm not even sure how to properly re-quantize audio. It's not a trivial problem. Get familiar with PCM audio signals, though. 8-bit PCM signals use 8-bits per sample, with each sample being in the unsigned range of 0 to 255 (a signed sample in the range -128 to 127 is possible but not as common as unsigned for 8-bit PCM). 16-bit PCM signals use 16-bits per sample, with each sample being in the signed range of -32768 to 32767.
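
For someone synthesizing in floating point, those two ranges translate into two different output mappings. A minimal sketch, assuming the synth produces floats nominally in [-1.0, 1.0]; the scale factors (32767 and 127) are one common convention among several, and both function names are hypothetical.

```c
#include <stdint.h>

/* Clamps a float sample to the nominal [-1.0, 1.0] range. */
static float clamp_unit(float x)
{
    if (x > 1.0f)  return 1.0f;
    if (x < -1.0f) return -1.0f;
    return x;
}

/* Float to signed 16-bit PCM: silence is 0. Truncates toward zero. */
static int16_t float_to_s16(float x)
{
    return (int16_t)(clamp_unit(x) * 32767.0f);
}

/* Float to unsigned 8-bit PCM: silence sits at the midpoint, 128. */
static uint8_t float_to_u8(float x)
{
    return (uint8_t)(clamp_unit(x) * 127.0f + 128.0f);
}
```

The key difference is the offset: 16-bit silence is 0, while 8-bit silence is 128, so reusing 8-bit code unchanged for 16-bit output would put a large DC offset on the signal.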

at the present time i am not dealing with wavs, i am intending to implement a stream in which i can use the synthesis algorithms i am familiar with. i can synthesize using ints, whatever.. i need to know how to implement whatever variable format it expects.. you know, the stuff that's hard to reference. petzold doesn't cover his use of BYTES in the text.

Like ApochPiQ suggested, find a proper library for this. I use swresample, others use other things.

You're on a game development website in a "Game Programming" forum. This is not a digital signal processing forum. Don't expect everyone here to be able to dish out the exact answer you're looking for. There are other websites that are more suited to your question. We're happy to help, but we're probably not the best crowd to ask this kind of (non-trivial) question.

#15 ReaperSMS   Members   -  Reputation: 827

Posted 14 August 2012 - 06:07 PM

not understanding your methodology... if i cast a pointer as a PBYTE and direct it to an array of shorts, that doesn't seem viable.

what i have got atm is replaced all the BYTE casts with WORD casts. it works (signal has correct unsigned short amplitude response), but it works bad, and i think it's because of the casting. still picking through.. so hard to believe no one's got an elementary implementation documented for this anywhere...


It is viable, because that PBYTE pointer is actually a LPSTR, which is windows speak for "points at an array of bytes to be interpreted as the API sees fit". The WAVEFORMATEX structure you fill out and hand to waveOutOpen informs the API what data format your provided buffer actually is. I don't know what you're doing for casts where, but if you set it for stereo 16-bit, it is going to want 16-bit signed audio, stored little endian, and IIRC, left channel then right channel for stereo.
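
A portable sketch of the buffer side of this, assuming signed 16-bit samples and left-then-right interleaving as described. All three helpers are hypothetical names; `as_lpdata` only stands in for the cast to LPSTR (a `char *`) that the lpData assignment needs.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: interleaves two mono channels into one
   stereo buffer laid out as L, R, L, R, ... */
static void interleave_stereo(const int16_t *left, const int16_t *right,
                              size_t frames, int16_t *out)
{
    for (size_t i = 0; i < frames; ++i) {
        out[2 * i]     = left[i];
        out[2 * i + 1] = right[i];
    }
}

/* Length in bytes of a 16-bit stereo buffer holding the given number
   of frames. WAVEHDR's dwBufferLength is counted in bytes, not samples. */
static size_t stereo16_byte_length(size_t frames)
{
    return frames * 2 * sizeof(int16_t);
}

/* The cast itself: the samples are untouched, only the pointer type
   changes so it can be stored in a char-pointer field. */
static char *as_lpdata(int16_t *samples)
{
    return (char *)samples;
}
```

With the real headers, the failing statement from earlier in the thread would then become something like `pWaveHdr1->lpData = (LPSTR) pBuffer1 ;` with the buffer length given in bytes.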

The example code they have there is terrible, but so is the waveOut interface in general. That would be why most people use some other API that provides a more humane interface.

#16 xoxos   Members   -  Reputation: 90

Posted 14 August 2012 - 06:10 PM

And no, splitting a short into two chars will not do what you want, do you understand what each short represents in a 16-bit mono audio waveform?


apparently that's exactly how it is performed,
http://msdn.microsoft.com/en-us/library/windows/desktop/dd797880(v=vs.85).aspx
"16-bit mono.....
Each sample is 2 bytes. Sample 1 is followed by samples 2, 3, 4, and so on. For each sample, the first byte is the low-order byte of channel 0 and the second byte is the high-order byte of channel 0."


will have a bit of a rest now and then try it as a 2x length buffer...
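
For reference, the byte order quoted above can be written out explicitly. This is only a sketch with a hypothetical helper name; it packs each signed 16-bit sample into a byte buffer of twice the length, low-order byte first.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: writes each 16-bit sample as two bytes,
   low-order byte first, then high-order byte. The bytes array must
   hold 2 * count elements. */
static void pack_s16_le(const int16_t *samples, size_t count,
                        unsigned char *bytes)
{
    for (size_t i = 0; i < count; ++i) {
        uint16_t u = (uint16_t)samples[i];  /* two's-complement bit pattern */
        bytes[2 * i]     = (unsigned char)(u & 0xFF);  /* low-order byte  */
        bytes[2 * i + 1] = (unsigned char)(u >> 8);    /* high-order byte */
    }
}
```

On a little-endian machine, which covers ordinary Windows PCs, an array of 16-bit integers already has this layout in memory, so the manual split should not be needed there: the array can be handed over as-is through a cast pointer.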

#17 Bacterius   Crossbones+   -  Reputation: 8134

Posted 14 August 2012 - 06:15 PM

bacterius - forgive me, because it seems we have a fundamental breakdown in communication.


i've been coding antialiasing and bandlimiting in c++ for a decade. you can find me in sound on sound and other magazines... i know all my ints and doubles and yummy bits.. please, give me some credit. i know sod all about sdks... i have trauma with sdks... but i have extensive experience actually flipping bits and doing things with them. i eat that way. please understand that i am 100% aware that there's something here i don't know, but it doesn't mean i don't know anything.

Look that's great and I'm happy you're so knowledgeable in DSP and audio processing, but as far as I am concerned, I don't really care about whether you are an audio guru or not. All that matters is that we can communicate. Right now you're not helping.

not understanding your methodology... if i cast a pointer as a PBYTE and direct it to an array of shorts, that doesn't seem viable.

The waveOut interface only asks for a pointer to some buffer in memory. It doesn't care whether you give it a PBYTE, a long*, or a void*. It will interpret the buffer depending on the header you specified (where you set the sampling rate, the mono/stereo flag, etc...). What you just need is to send it a properly formed buffer (following the PCM specification).

apparently that's exactly how it is performed,
http://msdn.microsof...0(v=vs.85).aspx
"16-bit mono.....
Each sample is 2 bytes. Sample 1 is followed by samples 2, 3, 4, and so on. For each sample, the first byte is the low-order byte of channel 0 and the second byte is the high-order byte of channel 0."

Well done, you discovered how 16-bit mono is represented in memory, where each sample is a 2-byte quantity (i.e. a word). But if you try to use a 16-bit sample and read it as an 8-bit sample, it will not work, because the API will expect each sample to be on 1 byte, whereas your buffer will have 2-byte samples. So you'll be reading "half a sample" each time and the sound will be weird. The simplest form of resampling you could do is take each 2-byte sample, divide it by 256 to obtain a 1-byte sample, and write that back to a new 8-bit buffer. But this doesn't work too well in practice.
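
A sketch of that divide-by-256 reduction, assuming signed 16-bit input and unsigned 8-bit output as described elsewhere in this thread. Because the 8-bit format is unsigned, the sample is shifted into the 0..65535 range before dividing. The function names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Naive reduction of one signed 16-bit sample to unsigned 8-bit:
   shift the range to 0..65535, then keep only the top 8 bits. */
static uint8_t narrow_s16_to_u8(int16_t s)
{
    return (uint8_t)(((int)s + 32768) / 256);
}

/* Applies the reduction to a whole buffer, writing into a new
   8-bit buffer with one byte per input sample. */
static void narrow_buffer(const int16_t *in, uint8_t *out, size_t count)
{
    for (size_t i = 0; i < count; ++i)
        out[i] = narrow_s16_to_u8(in[i]);
}
```

This is plain truncation: the low 8 bits of every sample are discarded, which is where the audible quantization noise comes from.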



#18 xoxos   Members   -  Reputation: 90

Posted 14 August 2012 - 06:22 PM

reaperSMS - cheers. without dealing with formatting in the sdks i'm used to, the consideration that a pointer doesn't have to be sized never occurred to me, i can see that now. when i'm finished typing all the replies i can try it lol.


You're on a game development website in a "Game Programming" forum. This is not a digital signal processing forum. Don't expect everyone here to be able to dish out the exact answer you're looking for. There are other websites that are more suited to your question. We're happy to help, but we're probably not the best crowd to ask this kind of (non-trivial) question.


adding 16 bit audio to a windows application really doesn't seem so exotic to computer gaming, does it?

#19 Cornstalks   Crossbones+   -  Reputation: 6966

Posted 14 August 2012 - 06:24 PM


You're on a game development website in a "Game Programming" forum. This is not a digital signal processing forum. Don't expect everyone here to be able to dish out the exact answer you're looking for. There are other websites that are more suited to your question. We're happy to help, but we're probably not the best crowd to ask this kind of (non-trivial) question.


adding 16 bit audio to a windows application really doesn't seem so exotic to computer gaming, does it?

That's not your real question. Your real question is about resampling audio.

#20 xoxos   Members   -  Reputation: 90

Posted 14 August 2012 - 06:31 PM

That's not your real question. Your real question is about resampling audio.


to resample audio, you must have audio.

i don't have audio, i am a synthesist. no wavs.

if you want to call the direct application of form to a variable resampling, you go ahead now.



