Jump to content
  • Advertisement
Sign in to follow this  
technobot2

Fastest way to solve a quadratic equation?

This topic is 4026 days old which is more than the 365 day threshold we allow for new replies. Please post a new topic.

If you intended to correct an error in the post then please contact us.

Recommended Posts

Hi, I was just wondering - what is the fastest algorithm to solve a quadratic equation in c++ or a similar language (assuming plain IEEE floating point)? For that matter, is there a generally fastest algorithm at all, or is it just a matter of case-specific optimizations? For example, I found that at least in some cases, by using b' = b/2a, and c' = c/a I can save a few floating point operations. Then the equation reduces to: x1,2 = -b' +/- sqrt(b'2 - c') and if one can calculate b' and c' directly, this saves some computation. (And of course, checking if b'2 - c <= 0 helps avoid unnecessary sqrt)

Share this post


Link to post
Share on other sites
Advertisement
I would imagine the compiler does a pretty good job of optimizing stuff like this. I can't imagine too many cases where the quadratic equation would be a bottleneck in a program - and if it were improving performance would likely have to be on a case by case basis.

Share this post


Link to post
Share on other sites
The algorithm itself is trivial. There's not much you can optimize about it.

But, if you need to perform this on more data, then you could obviously vectorize everything, and use the SIMD or GPU or something similar.

But pray tell, why is solving 10 million quadratic equations per second too slow?

Share this post


Link to post
Share on other sites
Quote:
Original post by cshowe
I can't imagine too many cases where the quadratic equation would be a bottleneck in a program - and if it were improving performance would likely have to be on a case by case basis.
Quote:
Original post by Antheus
...

I'm currently working on a ray tracer. There many of the ray-primitive intersection tests require to solve a quadratic equation, and since those tests are the heart of the ray tracer, I would imagine any improvement in speed here would be noticeable.

Granted though, I know that this may not be the biggest bottleneck, and I will probably get bugger gains by reducing the number of tests using bounding volume hierarchies and the like. At the moment it is more of an intellectual pursuit than a real optimization effort. :-)


Antheus, using SIMD here does sound like a good idea, though maybe tricky.. I'll have to remember that for when I get to the optimization stage.

At the moment I have a 5 object scene rendering to a 800x600 window, with just primary and shadow rays, completely unoptimized (too early yet). That's about 5 million intersection tests, and it takes about 2-3 seconds to render a frame on my 1.7Ghz pc. Though not all of these use a quadratic equation, and I'm sure there are bottlenecks in other places, on a wild guess I'd say the quadratic equations do contribute..

Quote:
Original post by joanusdmentia
If you really need it you could use a short taylor series expansion of the sqrt to speed things up (at the cost of precision, though).

I suspect that on modern processors an sqrt would be faster than a taylor series, even a short one. But I admit this is an interesting idea that I haven't thought of. :-)

Share this post


Link to post
Share on other sites
Quote:
I would imagine any improvement in speed here would be noticeable.


Not really. As you increase number of objects, the number of ray bounces increases as well. Without some basic spatial portioning for collision tests this will kill you algorithmically.

You'd also be better off asking this in the graphics forum, mentioning it's for ray-tracing from the start.

Share this post


Link to post
Share on other sites
Quote:
Original post by Antheus
Quote:
I would imagine any improvement in speed here would be noticeable.

Not really. As you increase number of objects, the number of ray bounces increases as well. Without some basic spatial portioning for collision tests this will kill you algorithmically.

True, but once the number of tests is reduced to a minimum, the next major impact should be the speed of each such test, since in an optimized ray tracer a large portion of the time will be spent doing those tests. And this is where the quadratic equation comes in.
Quote:
You'd also be better off asking this in the graphics forum, mentioning it's for ray-tracing from the start.

For ray-tracing specific solutions, yes. But since I'm still far from a true optimization phase, I was more interested in the generic question. Hence I posted here without mentioning the actual use in my case. I only mentioned ray tracing as an example of when the quadratic equation may play a large part in a program.

Share this post


Link to post
Share on other sites
Quote:
Original post by technobot2
Hi, I was just wondering - what is the fastest algorithm to solve a quadratic equation in c++ or a similar language (assuming plain IEEE floating point)? For that matter, is there a generally fastest algorithm at all, or is it just a matter of case-specific optimizations?

For example, I found that at least in some cases, by using b' = b/2a, and c' = c/a I can save a few floating point operations. Then the equation reduces to:
x1,2 = -b' +/- sqrt(b'2 - c')
and if one can calculate b' and c' directly, this saves some computation. (And of course, checking if b'2 - c <= 0 helps avoid unnecessary sqrt)
I'm not sure if the rearranged equation is equivalent to
x = (-b +/- sqrt(b2 - 4ac)) / 2a
but I could be wrong.
Where did you learn that rearrangment from - Links please?
Have you considered the geometric method of ray/sphere intersection testing?

Sorry it has been a number of years since I wrote my RTRT.

Share this post


Link to post
Share on other sites
Quote:
Original post by technobot2
True, but once the number of tests is reduced to a minimum, the next major impact should be the speed of each such test, since in an optimized ray tracer a large portion of the time will be spent doing those tests. And this is where the quadratic equation comes in.

Then do the algorithmic optimization first. If you end up doing 1000 solutions per frame you shouldn't waste time "optimizing" what the compiler already does for you. Premature optimization is the root of all evil, only when you have done algorithmic optimizations and confirmed that the solution of quadratic equations is a bottleneck should you try to optimize it. And after doing any kind of what you believe to be optimizations make sure to test whether it speeds up your code, doesn't affect it or actually makes it slower.

You talk about saving instructions, but according to some quick counting you do the same number of square root operations, the same number of subtractions, one more division and two less multiplications. According to my copy Intel's optimization manual a single precision division (float) is three times as slow as a multiplication when it comes to latency and 11.5-15 times as slow for throughput on P4s. For better precision it just becomes worse and the throughput of a division can be up to 22 times as slow as a multiplication (using extended precision). So saving two multiplications at the expense of one division seems to be a bad tradeoff, but with modern CPUs this kind of cycle counting can't be used as a performance measure.

If you really need to optimize a piece of code like this you need to take pipelining, prefetching and caching into account, but a compiler should be able to do this much better than you.

Also are you sure you even need the general form? Often problems can be reduced to a simpler form. For instance you might know that your discriminant is always nonnegative and can therefore omit the nasty conditional.

Quote:
I'm not sure if the rearranged equation is equivalent to
x = (-b +/- sqrt(b2 - 4ac)) / (2a)
but I could be wrong.

It follows from elementary algebra.
-b' +/- sqrt(b'2 - c') =
-b/2a +/- sqrt((b/2a)2 - c/a) =
-b/2a +/- sqrt(b2/4a2 - 4ac/4a2) =
-b/2a +/- sqrt( (b2 - 4ac)/4a2) =
-b/2a +/- sqrt(b2 - 4ac)/sqrt(4a2) =
-b/2a +/- sqrt(b2 - 4ac)/2a =
(-b +/- sqrt(b2 - 4ac))/2a

How close an approximation it's in actual floating point arithmetic I don't know, but if floating point numbers perfectly represented the real numbers then it would be valid.

Share this post


Link to post
Share on other sites
Quote:
Hi, I was just wondering - what is the fastest algorithm to solve a quadratic equation in c++ or a similar language (assuming plain IEEE floating point)? For that matter, is there a generally fastest algorithm at all, or is it just a matter of case-specific optimizations?


I'm not sure how much of an issue it is but the traditional form can have a number of issues with floating point precision, in particular if b^2 is much larger then 4ac or if b^2 - 4ac is very close to 0, if it is a problem then the following form may be of use (or you may be able to manipulate this form to reduce the number of operations). Just a note that these issues are unlikely to appear on x86 architectures if the compiler performs the whole operation on the FP stack due to the extended 80 bit precision that the processor uses for intermediate results.
    -b +/- sqrt(b^2 - 4ac)   -b -/+ sqrt(b^2 - 4ac)
x = ---------------------- * ----------------------
2a -b -/+ sqrt(b^2 - 4ac)

b^2 - (b^2 - 4ac)
= -----------------------------
2a * (-b -/+ sqrt(b^2 - 4ac))

2c
= ----------------------
-b -/+ sqrt(b^2 - 4ac)
Quote:

I'm not sure if the rearranged equation is equivalent to
x = (-b +/- sqrt(b^2 - 4ac)) / 2a
but I could be wrong.
Where did you learn that rearrangment from - Links please?
I'ts easy to verify,
x = -b' +/- sqrt(b'^2 - c') where b' = b / 2a and c' = c / a
= -(b / 2a) +/- sqrt((b/2a)^2 - (c / a))
= -(b / 2a) +/- sqrt(b^2/4a^2 - c/a)
= -(b / 2a) +/- sqrt(1 / (4 * a^2) * (4 * a^2) * (b^2/4a^2 - c/a))
= -(b / 2a) +/- sqrt(1 / (4 * a^2) * (b^2 - 4ac))
= -(b / 2a) +/- sqrt(1/(4 * a^2)) * sqrt(b^2 - 4ac)
= -(b / 2a) +/- 1/2a * sqrt(b^2 - 4ac)
= (-b +/- sqrt(b^2 - 4ac)) / 2a


Edit: To slow :)
Edit: If you do decide to go with SSE I'd apply partial fractions to the above result and get it into the form a' + b'/sqrt(c') so that you can take advantage of the limited precision reciprocal square root instruction and compare to see if that gives improved performance.

[Edited by - Julian90 on July 9, 2007 6:49:28 AM]

Share this post


Link to post
Share on other sites
Sign in to follow this  

  • Advertisement
×

Important Information

By using GameDev.net, you agree to our community Guidelines, Terms of Use, and Privacy Policy.

We are the game development community.

Whether you are an indie, hobbyist, AAA developer, or just trying to learn, GameDev.net is the place for you to learn, share, and connect with the games industry. Learn more About Us or sign up!

Sign me up!