Archived

This topic is now archived and is closed to further replies.

AndreTheGiant

arc cosine slow in c++ ?

I'm reading LaMothe's Tricks of the 3D Game Programming Gurus, and on p. 1383 he makes a huge deal about not calling an arccosine function in 'real-time' code because it's too damn slow. He then spends about 4 pages showing you how to make it faster. The project I'm working on right now uses C++'s acos() function in a couple of places, so I decided to see if it really was too slow to use. Since I can't figure out how to use the profiler in VS.NET (heck, I don't even know if there IS one), I decided to do it the old-fashioned way and just call the code a bunch of times:
for (int i = 0; i < 1000000; i++)
   acos(.570);
Here, I've called acos() 1 million times, and on my computer the program executes instantly. So is acos() not that slow after all, or am I missing something?

Your compiler is probably optimizing the whole loop away, since the result of acos() is never used, so that code is not a good test of the function's performance.



DISCLAIMER: If any of the above statements are incorrect, feel free to deliver me a good hard slap!

My games: DracMan | Swift blocks

[edited by - JohanOfverstedt on March 20, 2004 8:25:55 PM]

It might be a good idea to also have something to compare it to. Perhaps implementing what Lamothe suggests, or maybe just timing a similarly structured sqrt call in a loop.

Doc, I compared it to the sqrt function as you said, and it runs at about the same speed. (I increased the iterations to 10 million so I can more easily see a difference: now acos() takes about 1.5 seconds, while sqrt takes about 1 second.)


JohanOfverstedt, I considered that, but if that were the case, then in my test above I think both sqrt and acos would take the same time, wouldn't they? In any case, can you recommend a better test?


I guess what I want to know is: is that considered slow? If it takes 1.5 seconds to execute 10 million calls to arccos, is that really something you need to speed up?

[edited by - AndreTheGiant on March 20, 2004 8:35:42 PM]

quote:
Original post by AndreTheGiant
I guess what I want to know is: is that considered slow? If it takes 1.5 seconds to execute 10 million calls to arccos, is that really something you need to speed up?


Depends on how many times you need to call it. I wouldn't worry about it unless you really do need to call it a million times per frame. And chances are you don't.

Guest Anonymous Poster
Apply the normal performance rule.

Write for maintainability, then profile for performance tuning (and optimize the hot spots). The other, slightly less important rule is to look for algorithmic improvements rather than micro-optimizations.

Chances are arccos is nowhere near the top when it comes to CPU consumption in an entire game-like application. Only after you've determined that arccos is called so often that it is a bottleneck, and that there are no other algorithms without arccos to perform your task, should you start looking at making arccos faster.

So.... LaMothe is a bonehead?

Also, as I said, Visual Studio .NET surprisingly has no profiler. What are my alternatives? Are there freely available profilers?

quote:
Original post by AndreTheGiant
In any case, can you recommend a better test?
Use cin >> x then calculate acos(x) instead.

Hmm, well then how will I know that it's not the I/O causing any delays? I'm guessing it WOULD be the I/O.

Anyway, I don't think I care anymore. Maybe LaMothe was using an older compiler than I am, and his didn't optimize. In fact, his 4 pages of code to make acos() faster implemented some sort of lookup table. Maybe that's what my compiler does.

Either way, it's plenty fast enough for me.
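
For readers curious what "some sort of lookup table" might look like, here is a minimal sketch. It is not LaMothe's code: the class name AcosTable, the table size of 1025 entries and the use of linear interpolation are all assumptions made for illustration. The table is filled once with std::acos, and later lookups interpolate between neighbouring entries.

```cpp
#include <array>
#include <cmath>
#include <cstddef>

// Hypothetical helper (not from the book): arccosine via a precomputed
// table with linear interpolation. The table size is an arbitrary choice.
class AcosTable {
public:
    AcosTable() {
        for (std::size_t i = 0; i < kTableSize; ++i) {
            // Entry i holds acos at x = -1 + i * kStep, covering [-1, 1].
            table_[i] = std::acos(-1.0 + static_cast<double>(i) * kStep);
        }
    }

    // Inputs outside [-1, 1] are clamped. Assumes x is not NaN.
    double lookup(double x) const {
        if (x <= -1.0) return table_.front();
        if (x >= 1.0) return table_.back();
        const double pos = (x + 1.0) / kStep;  // fractional table index
        const std::size_t i = static_cast<std::size_t>(pos);
        const double frac = pos - static_cast<double>(i);
        // Blend the two neighbouring entries.
        return table_[i] + frac * (table_[i + 1] - table_[i]);
    }

private:
    static constexpr std::size_t kTableSize = 1025;
    static constexpr double kStep = 2.0 / (kTableSize - 1);
    std::array<double, kTableSize> table_{};
};
```

Usage would be something like AcosTable table; double angle = table.lookup(0.57);. Accuracy is good in the middle of the range but degrades close to -1 and 1, where the slope of arccosine becomes very steep. Whether this beats the library acos() on a given machine is something only a measurement can tell.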

I was using a Maclaurin expansion for cosine (not arc cosine) for a while until I realized that there wasn't much point, not to mention I wasn't sure my method was faster.

There must be a polynomial expansion you can use for arccos, but it's probably a non-issue. Remember, unless you're writing for a calculator or something, you have better things to spend your time on. LaMothe wrote that stuff a long time ago.
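
One such polynomial approximation, as a rough sketch: the form sqrt(1 - x) times a cubic in x, with coefficients recalled from Abramowitz and Stegun's Handbook of Mathematical Functions (formula 4.4.45). The function name fast_acos is made up for this example, and the coefficients should be checked against the handbook before anyone relies on them. The absolute error should stay below roughly 1e-4, so this only suits code that can live with that.

```cpp
#include <cmath>

// Hypothetical helper. Coefficients as recalled from Abramowitz and Stegun,
// formula 4.4.45; verify them against the handbook before relying on this.
inline double fast_acos(double x) {
    const double kPi = 3.14159265358979323846;
    const bool negative = x < 0.0;
    // Work on |x|, clamped to 1 so the square root below stays real.
    const double a = std::fmin(std::fabs(x), 1.0);
    // Horner evaluation of the cubic in a.
    double p = -0.0187293;
    p = p * a + 0.0742610;
    p = p * a - 0.2121144;
    p = p * a + 1.5707288;
    const double r = p * std::sqrt(1.0 - a);  // approximates acos(|x|)
    return negative ? kPi - r : r;            // acos(-x) = pi - acos(x)
}
```

Note that it still calls sqrt, so it is not automatically faster than the library acos(); that would have to be measured.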

quote:
Original post by AndreTheGiant
Hmm, well then how will I know that it's not the I/O causing any delays? I'm guessing it WOULD be the I/O.
You ask for x once, not at every iteration. If you are really worried, you can put a cout << "starting" before the loop.

The volatile should keep the loop from vanishing. I'm not sure exactly what other optimizations it may affect, however, so the test may not be a good one.


volatile double x = .570;  // double, not int: an int would truncate .570 to 0
for( int i = 0; i < 1000000; ++i )
    acos( x );

quote:
Original post by Cedric
You ask for x once, not at every iteration


Oh. Yea. Right. Duh!

I've tried the volatile suggestion, as well as yours, Cedric. The loop still takes about 1.5 seconds.

My conclusion:
Fuck it.

[edited by - AndreTheGiant on March 20, 2004 10:49:11 PM]

That is definitely being optimized away because you are not using the result at all. Just put x += acos(.570); and you will get a more accurate result.
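
A sketch of that idea, with two additions that are assumptions of this example rather than part of the suggestion: the argument changes every iteration (a fixed constant like .570 could in principle be evaluated once by the compiler), and the timing uses std::chrono. The helper names sum_acos and time_sum_acos are made up.

```cpp
#include <chrono>
#include <cmath>

// Hypothetical helper: calls acos() `iterations` times and returns the sum
// of the results, so the calls feed a value the caller can observe.
double sum_acos(int iterations) {
    double sum = 0.0;
    for (int i = 0; i < iterations; ++i) {
        // Vary the argument inside [0, 1) so it is not one fixed constant.
        const double x = static_cast<double>(i % 1000) / 1000.0;
        sum += std::acos(x);
    }
    return sum;
}

// Hypothetical helper: wall-clock seconds taken by sum_acos(). The sum is
// written through `out` so that the caller can use it.
double time_sum_acos(int iterations, double* out) {
    const auto start = std::chrono::steady_clock::now();
    *out = sum_acos(iterations);
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double>(stop - start).count();
}
```

To use it, call time_sum_acos(10000000, &sum) and print both the returned time and sum. Printing the sum matters: it is what makes the result observable. The loop overhead and the division are included in the measured time, so timing the same loop with sqrt in place of acos gives a fairer comparison than the raw number.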

Unfortunately, arccos is extremely slow. As for VS.NET, AFAIK there is no profiler. VS6 has one, but not .NET. You have to purchase it separately.

Unfortunately, I don't know of any way to make acos faster. When I was in college, I remember hearing something, but I forgot. Such a shame, because one of the things I was doing for work required that I call acos 9 billion times. Since it required extreme precision, I was not about to settle for something that is merely good enough.

Well, if you know the math behind generating arc cosine, you could make a template-metaprogrammed version of the code that the compiler will reduce to a constant at compile time, and you get instant arc cosines at runtime. I read about doing that in Game Programming Gems.

The MS optimizer is pretty smart: if it can determine that a bit of code has no effect on the outside world, it completely removes it.

You have to call acos on the result from the last one (or it will remove the loop) and return the result from that /and/ use it, or it will remove the function call. Volatile is not good because it places additional restrictions that are not accurate for the production case.

LaMothe's not a bonehead; you should not invoke transcendental functions in time-critical code. Odds are good your code is not time-critical, though. The other tricky thing is that hardware will often implement the optimizations for you (Pentiums use LUTs, and you can bet 3D cards do too). Game Boys (for certain) and PlayStations (probably) don't have the monster floating-point processors PCs do.

You can turn on the assembly output (with source) and see exactly what it does with your code.

Be sure to enable 'intrinsic' functions: this can turn functions such as sine or cosine into a single opcode.

You need the Professional version of VS to get the optimizer, and I think the Enterprise and Architect versions to get the profiler. (IIRC VS.NET 7.0 didn't have one, but 7.1 does.)
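
The advice above about calling acos on the result of the previous call can be sketched as follows. The helper name chained_acos and the division by pi are assumptions of this example: acos returns values in [0, pi], so the result has to be mapped back into [-1, 1] before it is a valid input again.

```cpp
#include <cmath>

// Hypothetical helper: each call's input depends on the previous call's
// output, so no call can be skipped or reordered away.
double chained_acos(double x, int iterations) {
    const double kPi = 3.14159265358979323846;
    for (int i = 0; i < iterations; ++i) {
        // acos yields a value in [0, pi]; dividing by pi maps it back into
        // [0, 1], which is a valid input for the next call.
        x = std::acos(x) / kPi;
    }
    return x;
}
```

The caller still has to use the returned value, for example by printing it or returning it from main. Because every call waits for the previous one, this measures latency rather than throughput, and the division is part of the measured time.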
