using lookup tables?

Hello, I just wanted to ask if it is (much) more efficient to use lookup tables for the trigonometric functions instead of the functions from math.h (yes, I'm a C++ programmer *g*). Thanx, Konstantin Schauwecker
You won't notice too big a difference on modern PCs.
People can give you thousands of responses favouring one alternative or the other. I feel that the hardware implementations are fast enough (I think Intel hard-codes CORDIC algorithms for its trig functions), so lookup tables aren't necessarily faster.

The only correct answer is TRY IT.

Write a simple program that computes the same value several thousand times with one method and then with the other. Then compare the times.

Also, when you test it, don't forget to try every trig function. Just because the standard cos() is faster than lookupCos() doesn't mean that the standard arccos() will also be faster than lookupArccos().
Like Pepe said, try it out. But I would add that you should try it in your actual app, not in a test program. Just doing thousands of tests and timing them could unrealistically tilt the result, because your whole lookup table would likely stay in the cache. So just use #define to switch back and forth in your app.

Jack
You also have to weigh how accuracy factors into it. Depending on the size of your lookup table (you'll have to walk that fun line between memory usage and speed gain), you might have to sacrifice some accuracy.

For example, I used to use a table of 360 doubles to store sines and cosines (one table for each), which was great back when I was writing stuff for mode 13h in DOS, but if you're making something like an FPS, characters who can only rotate to the nearest degree will look less than smooth up close.
If you use sin/cos lookups, there is no need to have two separate tables. Just make a sin lookup table and use lookupSin(x + 90) for cos, as in the sketch below.
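For illustration, a minimal sketch of that shared-table idea (the table size, names, and whole-degree resolution are just assumptions for the example):

#include <cmath>

// One sine table in whole degrees; cosine reads the same table at a
// +90 degree offset, since cos(x) == sin(x + 90) in degrees.
const int DEGREES = 360;
static double sinTable[DEGREES];

void initSinTable()
{
    for (int i = 0; i < DEGREES; ++i)
        sinTable[i] = std::sin(i * 3.14159265358979323846 / 180.0);
}

// deg is assumed non-negative; % wraps it into the table range.
double lookupSin(int deg) { return sinTable[deg % DEGREES]; }
double lookupCos(int deg) { return sinTable[(deg + 90) % DEGREES]; }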
Be careful when doing benchmarks of this kind…

Often, when you don't do anything with the results of a function, the compiler will completely remove your code, as it (correctly) determines that the code isn't needed at all. Also, when you use constants as inputs to your functions, optimizations can happen that wouldn't happen otherwise, and in either case you get a false sense of performance.

I did some interesting tests regarding a sin function implementation, and posted the results in the thread called "Fast trig function" about one page back. There are also some CPU cycle timing results for various tests there.

Basically, the best way I found to prevent the compiler from removing any code was to have a loop with a radians value that gets incremented by a tiny amount on each iteration (ranging from -pi to pi in total). Then keep a running total of the results of the function calls, and use this total (cast to int) as the return value of main(). That way the compiler cannot ignore the results of your code.
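A minimal sketch of that setup (the step size and the use of std::sin are just placeholders; swap in whichever function you're testing):

#include <cmath>

int main()
{
    const double PI = 3.14159265358979323846;
    const double step = 0.0000001;      // tiny increment per iteration
    double total = 0.0;

    // Sweep the input across the whole -pi..pi range so constant
    // folding can't kick in, and accumulate every result.
    for (double r = -PI; r < PI; r += step)
        total += std::sin(r);           // time this loop with your profiler

    // Returning the (cast) total forces the compiler to keep the work.
    return (int)total;
}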

[EDIT] As was mentioned previously, using lookup tables might have other drawbacks, like clogging up the cache with a lot of data. Your benchmarks might seem to run fast, but when you use the table lookup function in your actual code, it might hurt performance more than help it, because now there's other data contending for the cache that wasn't in your benchmark. Just something else to keep in mind…


SS


Edited by - Axter on February 22, 2002 8:00:52 PM
One thought: if you use the sin and cos functions, you work entirely in floating point, but if you use lookup tables, at some point you'll have to use some variation of fixed point.
table[1.2342] is a bit strange, so you could just do lookup[(int)(angle*1000)], but now you've added a multiplication and a conversion to your lookup.
You could use fixed point throughout, e.g. 2*pi equates to 1024. But angles over 1023 then need some kind of range test, which means either a loop or a mod (%), neither of which is quick. If you had only 256 values you could put the angle in a char; then, when it went past 255, it would automatically wrap back to 0 and you could always safely do lookup[angle]. However, 256 'degrees' is a bit limiting. The next size up is a short, which would give you 65536 'degrees', which is really accurate. While that table still only takes up 0.25 MB (not much relative to 128 MB), it's obviously going to play badly with the cache.
My thought: use a short, but instead of lookup[angle] use lookup[angle >> (16 - n)], and set n to however many bits of 'degree' resolution you want: n == 8 would give you 256 'degrees', n == 10 would give you 1024. I figure with the cache in mind you'd probably want n to be either 9 or 10. Before anyone starts about fixed point: my tests show MUL, SUB, ADD, FADD, FSUB and FMUL all take one cycle, but while DIV is 1 cycle, FDIV seems to be around 10 (all on a Duron 750). So if you have to do any divisions, fixed point would give an extra advantage to the lookup table. Plus, the integer and float ops can be executed in parallel, so having int and float ops interspersed should give up to a 2x performance boost over purely float or even purely int code. A sketch of the scheme follows.
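A sketch of that scheme, under the assumption that n counts the bits of resolution kept (so the table has 2^n entries); N and the names are illustrative:

#include <cmath>

const int N = 10;                       // 2^10 = 1024 table entries
const int TABLE_SIZE = 1 << N;
static float sinTable[TABLE_SIZE];

void initSinTable()
{
    for (int i = 0; i < TABLE_SIZE; ++i)
        sinTable[i] = (float)std::sin(i * 2.0 * 3.14159265358979323846 / TABLE_SIZE);
}

// A full circle is 65536 units; an unsigned short wraps automatically
// on overflow, so no range test or % is ever needed. The shift drops
// the low bits to index the smaller table.
float lookupSin(unsigned short angle)
{
    return sinTable[angle >> (16 - N)];
}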
You can use an arbitrary angular unit. Think about it: whoever decided there should be 360 degrees in a circle? The Babylonians, and that's just because they liked the number 60. Well, computers like the number 2, so put 1024 units in a circle. Then, to wrap it, just use sintable[n & 0x3ff]. That's nice and speedy! For cos, use sintable[(n + 256) & 0x3ff]. That's also very fast.
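In code, assuming a 1024-entry table (names are illustrative):

#include <cmath>

static double sintable[1024];

void initTable()
{
    for (int i = 0; i < 1024; ++i)
        sintable[i] = std::sin(i * 2.0 * 3.14159265358979323846 / 1024.0);
}

// The mask wraps any int angle into 0..1023, even negative ones
// (two's complement makes -1 & 0x3ff come out as 1023).
double fastSin(int n) { return sintable[n & 0x3ff]; }

// Cosine is a quarter circle ahead: 1024 / 4 = 256 units.
double fastCos(int n) { return sintable[(n + 256) & 0x3ff]; }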

