Sound triangulation using Apollonius?

I'm not a mathematician (nor very good at math), but I recently had a look at how one can triangulate a sound source based on its arrival times at different microphones.
Isn't it possible to use a special case of Apollonius' problem to triangulate a sound source much more easily? Or will inaccuracies in the measurements make this impossible in practice?

In this graphic I have simplified the problem very much (2D only) and made the speed of sound = 1 m/s


I'm guessing this could be extended to 3D *relatively* easily using spheres.
I'm interested to hear if this could work in real life applications or if the current methods used are much better/reliable.

Note: The graphic is drawn to scale in pixels, so if you want to verify the lengths and distances you can measure them yourself in a graphics program :)

I haven't reviewed the literature myself, but I know there's been a fair amount of work in this area. Among other reasons, the military was looking to pinpoint snipers using a network of microphones (and I think something similar was deployed on the south side of Chicago), so there was grant money flowing. As far as I know -- and I don't know a ton -- the single biggest issue is multipath interference; the sound bounces around and you hear lots of echoes. Clock synchronization of the various microphones might be another problem, but I'm guessing GPS helps with that (and there are all sorts of clock synchronization protocols out there).

Your reduction to Apollonius' problem looks right to me. Ultimately, there's some system of equations you're trying to solve that relate the various delays. To deal with noise, you add more equations (corresponding to more microphones) to get an overdetermined system, and then, rather than insisting that they all hold exactly, you define some measure of error and try to minimize that -- e.g., least squares.
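
As a rough illustration of the least-squares idea, here is a minimal sketch in plain Python. It assumes a 2D setup, known microphone positions, and delays measured relative to the first microphone. The function names, the centroid starting guess, and the use of Gauss-Newton iteration are choices made for this example, not something taken from the literature mentioned above.

```python
import math

def range_differences(p, mics):
    """Distance from p to each mic minus the distance from p to mics[0]."""
    d0 = math.dist(p, mics[0])
    return [math.dist(p, m) - d0 for m in mics[1:]]

def locate_least_squares(mics, delays, speed=343.0, iterations=20):
    """Gauss-Newton fit of a 2D source position to arrival-time differences.

    delays[i] is the arrival time at mics[i + 1] minus the arrival time
    at mics[0]; speed is the assumed speed of sound.
    """
    targets = [speed * t for t in delays]
    # Start from the centroid of the microphones (an arbitrary choice).
    x = sum(m[0] for m in mics) / len(mics)
    y = sum(m[1] for m in mics) / len(mics)
    for _ in range(iterations):
        d0 = math.dist((x, y), mics[0]) or 1e-12
        u0x, u0y = (x - mics[0][0]) / d0, (y - mics[0][1]) / d0
        # Accumulate the 2x2 normal equations (J^T J) step = -(J^T r).
        a11 = a12 = a22 = g1 = g2 = 0.0
        for m, target in zip(mics[1:], targets):
            d = math.dist((x, y), m) or 1e-12
            jx = (x - m[0]) / d - u0x
            jy = (y - m[1]) / d - u0y
            r = (d - d0) - target  # residual of this equation
            a11 += jx * jx
            a12 += jx * jy
            a22 += jy * jy
            g1 += jx * r
            g2 += jy * r
        det = a11 * a22 - a12 * a12
        if abs(det) < 1e-15:
            break  # degenerate geometry, give up refining
        x -= (a22 * g1 - a12 * g2) / det
        y -= (a11 * g2 - a12 * g1) / det
    return x, y

# Four microphones give three equations for two unknowns (overdetermined).
mics = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0), (10.0, 10.0)]
delays = [d / 343.0 for d in range_differences((3.0, 4.0), mics)]
estimate = locate_least_squares(mics, delays)
```

With noise-free delays the estimate should land on the true source; with noisy delays it returns the point that minimizes the summed squared range-difference error, at least locally.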

When you can, it's nice to give this kind of thing a probabilistic interpretation. One way to think about it is that each equation gives you a set where the source must be, and solving the system exactly is intersecting the sets. The way one would typically deal with noise is to "soften" these sets somehow -- usually by replacing them with probability distributions, in which case set intersection gets replaced by multiplication of probabilities (that's Bayes' Rule). Ultimately this gives you a probability distribution over where the source might be, and, if pressed to give a single answer, you typically return either the expected value of that distribution, or, more often, the point that maximizes it (called the "maximum a posteriori" or "MAP" estimate).
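
To make the connection with least squares concrete, here is one way this could be written down. The symbols are assumptions for the sketch: $M_0, \dots, M_n$ are the microphone positions, $\tau_i$ is the measured delay between $M_i$ and $M_0$, $c$ is the speed of sound, and the measurement errors are taken to be independent and Gaussian with standard deviation $\sigma$.

```latex
\begin{align*}
h_i(P) &= \frac{|P - M_i| - |P - M_0|}{c}
  && \text{(delay predicted for a source at } P\text{)} \\
p(P \mid \tau_1, \dots, \tau_n) &\propto p(P) \prod_{i=1}^{n} p(\tau_i \mid P)
  && \text{(Bayes' Rule, independent measurements)} \\
-\log p(P \mid \tau_1, \dots, \tau_n) &= -\log p(P)
  + \sum_{i=1}^{n} \frac{\bigl(\tau_i - h_i(P)\bigr)^2}{2\sigma^2} + \text{const}
  && \text{(Gaussian noise)}
\end{align*}
```

Under these assumptions, and with a flat prior $p(P)$, the MAP estimate is exactly the least-squares solution.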

Some googling turns up, e.g., this student presentation

which seems quite useful, as well as some papers (which I haven't read), like

http://www.icsi.berk...pdf/2520_40.PDF .

At the very least, maybe they'll be helpful in locating other references.


The starting point, t+0, t+80 and t+150, is very shaky: you don't know t to begin with because you don't measure distances but rather the difference between distances (i.e. the source is 80 seconds closer to A than to B).
With two microphones at A and B, a source at P, and a measured delay $t_{AB}$, the equation is
$|P - A| - |P - B| = c\,t_{AB}$ (with $c$ the speed of sound),
which describes a hyperbola with foci A and B. The sign of the delay, i.e. whether |P-A| > |P-B| or vice versa, restricts P to a half-plane and so keeps only one of the hyperbola's two branches.

With three microphones, you get three such equations and the target is at the intersection of three hyperbolas; measurement errors demand some kind of approximation.
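
A short derivation may help show why only differences of distances are observable. The notation is assumed for this sketch: $t_0$ is the unknown emission time, $t_A, t_B, t_C$ are the arrival times at the microphones, $c$ is the speed of sound, and $t_{AB} = t_A - t_B$ is a sign convention picked here.

```latex
\begin{align*}
t_A &= t_0 + \frac{|P - A|}{c}, \qquad t_B = t_0 + \frac{|P - B|}{c} \\
|P - A| - |P - B| &= c\,(t_A - t_B) = c\,t_{AB}
  && \text{(the unknown } t_0 \text{ cancels)} \\
t_{AB} + t_{BC} &= (t_A - t_B) + (t_B - t_C) = t_{AC}
\end{align*}
```

The last line means that, for exact arrival times, the third hyperbola passes through the intersection of the other two; it only carries extra information when the delays are noisy.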

Yes, the starting time t is unknown (that's the entire idea), which is why I used time differences relative to the moment the first sound pulse is detected.
The first mic to get the sound becomes mic A, the second mic B and so on.

The idea is that the circle that is tangent to the two circles and passes through the point (a circle of radius 0) is the solution circle. The distance from the center of the solution circle to mic A is the unknown time t (when the speed of sound is 1 m/s).
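
For what it's worth, this special case can be solved in closed form. The sketch below is one possible way to do it, not a reference implementation: it assumes 2D coordinates, exact delays, and that the delays have already been multiplied by the speed of sound to give extra path lengths `db` and `dc`. The function name `locate_three_mics` and its parameters are made up for this example.

```python
import math

def locate_three_mics(a, b, c, db, dc):
    """Candidate source positions from three microphones in 2D.

    db and dc are the extra path lengths (speed of sound times delay) at
    mics b and c relative to mic a, which hears the sound first.
    Returns a list of (x, y, r), where r is the distance from the source
    to mic a, i.e. the radius of the solution circle.
    """
    # Work in coordinates with mic a at the origin.
    bx, by = b[0] - a[0], b[1] - a[1]
    cx, cy = c[0] - a[0], c[1] - a[1]
    det = bx * cy - by * cx
    if abs(det) < 1e-12:
        raise ValueError("microphones are collinear")
    # Subtracting |P|^2 = r^2 from |P - B|^2 = (r + db)^2 leaves the linear
    # equation B.P = kb - db * r, and likewise for C.
    kb = (bx * bx + by * by - db * db) / 2.0
    kc = (cx * cx + cy * cy - dc * dc) / 2.0
    # Cramer's rule gives P as a linear function of r: P = P0 + r * V.
    p0x = (kb * cy - by * kc) / det
    p0y = (bx * kc - kb * cx) / det
    vx = -(db * cy - by * dc) / det
    vy = -(bx * dc - db * cx) / det
    # Substituting into |P|^2 = r^2 gives a quadratic in r.
    qa = vx * vx + vy * vy - 1.0
    qb = 2.0 * (p0x * vx + p0y * vy)
    qc = p0x * p0x + p0y * p0y
    if abs(qa) < 1e-12:
        roots = [-qc / qb] if qb != 0.0 else []
    else:
        disc = qb * qb - 4.0 * qa * qc
        if disc < 0.0:
            roots = []
        else:
            s = math.sqrt(disc)
            roots = [(-qb + s) / (2.0 * qa), (-qb - s) / (2.0 * qa)]
    # A negative radius is not a physical solution.
    return [(a[0] + p0x + r * vx, a[1] + p0y + r * vy, r)
            for r in roots if r >= 0.0]

# Example with the speed of sound set to 1, as in the graphic.
A, B, C = (0.0, 0.0), (100.0, 0.0), (0.0, 100.0)
source = (30.0, 40.0)
r_a = math.dist(source, A)
solutions = locate_three_mics(A, B, C,
                              math.dist(source, B) - r_a,
                              math.dist(source, C) - r_a)
```

The quadratic can have two non-negative roots, in which case three microphones alone cannot tell the two candidate positions apart.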

So each microphone after the first gives you a hyperbola? Thanks Lorenzo for bothering to figure that out. :-)

Heck, you can just brute-force discretize this problem on a grid. Just multiplicatively blend a bunch of blurry hyperbolas, renormalizing occasionally (so all pixels add to 1), and pick the brightest pixel. Or use negative-log-probabilities with additive blending, and then pick the darkest pixel. Even a naive GPU implementation can probably find global optima to hundred-microphone problems at 60 Hz.
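
A CPU-only sketch of the negative-log-probability variant might look like the following. It is far slower than a GPU version and makes several assumptions not stated above: a square 2D search area, Gaussian "blur" with standard deviation `sigma` on each range difference, and delays measured relative to the first microphone. The function name and parameters are invented for this example.

```python
import math

def grid_locate(mics, delays, speed=343.0, size=10.0, step=0.1, sigma=0.05):
    """Return the grid point with the lowest summed negative log-likelihood.

    delays[i] is the arrival time at mics[i + 1] minus the arrival time at
    mics[0]; sigma is an assumed standard deviation of the range-difference
    error in metres. The grid covers [0, size] x [0, size].
    """
    best, best_cost = None, math.inf
    n = int(round(size / step)) + 1
    for iy in range(n):
        for ix in range(n):
            p = (ix * step, iy * step)
            d0 = math.dist(p, mics[0])
            cost = 0.0
            # Additive blending: one blurry hyperbola per microphone pair.
            for m, t in zip(mics[1:], delays):
                err = (math.dist(p, m) - d0) - speed * t
                cost += err * err / (2.0 * sigma * sigma)
            # The "darkest pixel" is the cell with the smallest cost.
            if cost < best_cost:
                best, best_cost = p, cost
    return best

mics = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0), (10.0, 10.0)]
true_source = (3.0, 4.0)
d_first = math.dist(true_source, mics[0])
delays = [(math.dist(true_source, m) - d_first) / 343.0 for m in mics[1:]]
best = grid_locate(mics, delays)
```

The accuracy is limited by the grid step, so in practice one might refine the winning cell with a local optimizer afterwards.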

Now, is your target moving, and do you have a model for how it moves? Then you just have to advect your probability distribution each frame. If the vector field is linear you can just do that with standard texture mapping by moving some vertices of a quad. Is there process noise? That's just a per-frame blur -- the more noise, the more blur. Easy peasy. I love when problems are low-dimensional.

This kind of thing would be called a "Bayes Filter," by the way.
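
Here is a toy version of that predict/update loop on a small grid, in plain Python rather than on a GPU. It assumes the motion model is a whole-cell shift per frame and uses a 3x3 box blur as a stand-in for process noise; all function names are made up for this sketch.

```python
def normalize(grid):
    """Rescale so that all cells add up to 1."""
    total = sum(sum(row) for row in grid)
    return [[v / total for v in row] for row in grid]

def advect(grid, dx, dy):
    """Shift the belief by whole cells; mass leaving the grid is dropped."""
    h, w = len(grid), len(grid[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            nx, ny = x + dx, y + dy
            if 0 <= nx < w and 0 <= ny < h:
                out[ny][nx] = grid[y][x]
    return out

def blur(grid):
    """3x3 box blur, a crude stand-in for process noise."""
    h, w = len(grid), len(grid[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            acc = 0.0
            for j in (-1, 0, 1):
                for i in (-1, 0, 1):
                    yy, xx = y + j, x + i
                    if 0 <= yy < h and 0 <= xx < w:
                        acc += grid[yy][xx]
            out[y][x] = acc / 9.0
    return out

def predict(belief, dx, dy):
    """Motion step: move the distribution, then blur it."""
    return normalize(blur(advect(belief, dx, dy)))

def update(belief, likelihood):
    """Measurement step: multiply by the per-cell likelihood."""
    return normalize([[b * l for b, l in zip(brow, lrow)]
                      for brow, lrow in zip(belief, likelihood)])

# One frame: the target was known to be at column 1, row 1.
belief = [[0.0] * 5 for _ in range(5)]
belief[1][1] = 1.0
predicted = predict(belief, 1, 0)           # it moves one cell to the right
likelihood = [[0.1] * 5 for _ in range(5)]  # hypothetical measurement
likelihood[1][3] = 1.0
posterior = update(predicted, likelihood)
```

In the microphone setting, the `likelihood` grid would be the product of the blurry hyperbolas for the current frame.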
