Sound in 3D Space

Started by
3 comments, last by Dmytry 19 years, 6 months ago
This is the first time I've tried to do this. I've been moving panning and volume around in 2D engines for a long time, but I think things are a bit different now. All I have to work with is panning (0 - 255, 128 as center), volume (0 - 255, as in 0 = silent), the camera's position coordinates, and the position coordinates of where the sound is being played.

I don't want to do anything fancy; just shift sound back and forth between the left and right speaker, and have the volume decrease based on distance. The distance part is pie. I could just decide on a multiplier and remove distance * mul from volume.

But panning has me wondering about a few things. If an object is left of the camera, say about 50 feet to the exact left of it, the panning is most likely going to be full-left. But what happens as that object moves forward (forward relative to the camera)? Doesn't the sound become more centered? In other words, I don't think I can rely on the distance left or right from the camera. I think I'll most likely need to take forward/backward distance into consideration. Or convert 3D space into 2D screen coordinates.

Has anyone come up with a decent algorithm they wouldn't mind sharing? Or any advice at all is just as good. Lack of math notation is also extremely appreciated [lol] On the other hand, code-style math goes right to my brain, so any number calculation explanations in this form means happiness++ Thanks for any direction [smile]
First of all, let me say I'm not an expert, or even remotely experienced, in this field.
However, the question (and a possible answer) interested me.

If I draw the situation on paper (in 2D, but easily convertible to 3D), I think it's safe to say the following:


  • Obviously, the sound is louder when closer to the camera

  • There's some sort of relationship between the left/right position (x direction), and back/front position (z direction)



This last point shows why, when moving the object toward the camera along the z-axis, the sound gets less centered (and vice versa).
You could also say that the relationship is the angle between the x and z components of the direction to the object.



I'm not sure this is correct, but to me it seems about right.
Setting up a mathematical solution for this would require some more thinking, but maybe this gives you some new ideas.
Say the head is placed at (0,0,0) and is looking in the -z direction, with y up and the ears pointing toward +x and -x. The sound source is at (x,y,z).
If the head is at the camera, you can find the position of the object by multiplying it by the world-to-camera matrix.
Panning should simply depend on the angle between the x=0 plane and the direction to the object :)

For 3D, I think something like
panning=x/sqrt(x*x+y*y+z*z);// sine of that angle

or maybe better

panning=(2.0/pi)*atan2(x,sqrt(y*y*a+z*z*b));// somewhat transformed angle between x=0 plane and sound source

where you can adjust a and b (start with a=1, b=1); a and b depend on the shape of the head :-)

For your 0..255 range, use
Your_0_255_Panning=127.5+127*panning;// you may need to replace "+" with "-" if the sound is reversed.
if(Your_0_255_Panning<0){Your_0_255_Panning=0;}
else if(Your_0_255_Panning>255){Your_0_255_Panning=255;}

But things also depend a lot on what panning really means in your API.

What happens if you output sound with panning=255?
If in that case one speaker is loud and the other is _totally_ mute, you probably need to do something like

panning=panning*(k+(1.0-k)/(1.0+distance_squared));// if "head size" is about 1 unit (not 100 and not 0.01)

You'll have to tune k for better sound.

What this formula does: when the sound is closer, the volume difference must be bigger. I just made up this formula; it doesn't have much physical meaning.
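As a function, that scaling might look like this (the name is my guess; k and the roughly-1-unit head size are the assumptions from the formula above):

```c
/* Sketch of the distance-dependent panning softening: k (0..1) is how
   much of the panning survives at large distances; distance_squared is
   the squared distance from the head, assuming a head size of ~1 unit. */
double soften_panning(double panning, double distance_squared, double k)
{
    return panning * (k + (1.0 - k) / (1.0 + distance_squared));
}
```

At zero distance the panning passes through unchanged; far away it is scaled down to roughly k of its value.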

In reality, humans recognize direction by comparing the inputs from both ears, taking into account the time delay between the signals (arguably) and the unequal distortions created by the ears and head. That's not so simple to simulate, and it can't be simulated with simple panning and volume alone.
Sorry it took me so long to respond. I switched to working on another area and didn't want to reply until I tried to understand this.

I'm using FMod for sound. Its panning works like this:
Full left = 100% left, 0% right (panning = 0)
Centered = 50% left, 50% right (panning = 128)
Full right = 0% left, 100% right (panning = 255)

So I guess you could say it's normalized.

Transforming the sound position with the view matrix (the inverted camera matrix) and dividing its x by its length works perfectly for panning [grin]. But I can't seem to get the panning=panning*(k+(1.0-k)/(1.0+distance_squared)); part to work. When I add this after the panning calculation, it makes the panning difference very small; it mostly stays at 127. My camera is about a 25.0 unit radius. Does this mean I should swap 1.0 with 25.0? What range should k be in? My unit scale is 1 inch. Sorry, I just don't totally understand this part. It sounds perfect without this. Does this make it even better? [smile]

Thanks again for your help, Dmytry. You are a genius.
And thanks much for the diagram of the sound layout, WanMaster. That was extremely helpful.

edit: this algorithm sounds better than the 3D sound implemented in DirectSound and FMod [grin]

edit2: Now that the panning sounds so good, it makes my volume changes sound pathetic. I'm thinking about making a less linear volume drop-off. Would that sound better? As in, the volume drops off faster at close range, then much slower once the sound gets farther away. I'm taking a guess here, but that's something like how the 3D sound implementations I've heard seem to work.

[Edited by - Jiia on October 26, 2004 3:15:33 AM]
I think k should be 0.75 or so... with k=1, this formula doesn't change anything.

And in
panning=panning*(k+(1.0-k)/(1.0-distance_squared));
there's a typo; it should be
panning=panning*(k+(1.0-k)/(1.0+distance_squared));

It would be better to use
panning=panning*(k+(l*l)*(1.0-k)/((l*l)+distance_squared));
where l may be about 2x the head size.
It's not the panning itself but the change of panning with distance: when the thing is close to the head, the volume difference between the ears is bigger than when it's far, but not by much. That's why there's k.
Also, it might be even better if you subtract the head size from the distance before doing this.
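The revised formula could be sketched like this (function and parameter names are mine; l and k are as described above):

```c
/* Sketch of the revised panning softening with an explicit scale l
   (roughly 2x head size). When distance_squared == l*l, the
   distance-dependent correction is at half strength. */
double soften_panning_scaled(double panning, double distance_squared,
                             double k, double l)
{
    return panning * (k + (l * l) * (1.0 - k) / ((l * l) + distance_squared));
}
```

With k=0.75 and l=2, panning is unchanged at zero distance and fades toward 0.75 of its value as the source moves away.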

And as for the volume falloff... I think something like
volume=volume*fpow((o*o)/(o*o+distance_squared),p);
should work, where fpow returns the first argument raised to the power of the second.
Here o is some constant, maybe head size * 5 or so, and p is between 0 and 1.5 or so (you probably need it in the range 0.3..1).

