Trigonometric Lookup Table: Float vs. Double

Started by
13 comments, last by cr88192 11 years, 1 month ago

For a game that I am designing, I have been designing a barebones 2D game framework that includes physics, boundng boxes, cameras, game states, etc. Naturally, because my game is largely physics based, I felt the necessity to design a lookup table for my trig functions.

Currently, this Trig class uses only integer values for angles, and three arrays for sine, cosine, and tangent (Each as a Double wrapper object so that values could pass by reference).

In an attempt to try and make my framework less memory intensive, I have been going through my code an using appropriately sized variables over simply ints and doubles. It occurred to me that instead of saving hundreds of doubles (at 64 bytes each), I could cut that in half by using floats (at 4 bytes each).

My question comes in multiple parts: In your experience, would this help a good amount in terms of memory management - especially for lower end computers/netbooks? Would the loss of precision from a double to a float be significant to the player? Could this potentially increase the number of trigonometry-based object at any given time?

Thank you for your time :)

Advertisement

Naturally, because my game is largely physics based, I felt the necessity to design a lookup table for my trig functions.

Why did you feel "the necessity" ?

Twenty or so years ago, when floating point was slow and memory access speeds were not a bottleneck, back then it made sense. On hardware that has software floating point, such as handhelds and certain older consoles, it can also make sense. But on most modern development they are the exception, not the rule.

On a modern desktop CPU --- any x86 processor since about 1997 or so --- the CPU operations are much faster and more accurate than a memory lookup.

Also, your trigonometric lookup tables are currently taking up 360*3*8 bytes, that is 8640 bytes. Cutting this in half would free up a little more than 4 kilobytes.


Naturally, because my game is largely physics based, I felt the necessity to design a lookup table for my trig functions.

Why did you feel "the necessity" ?

Twenty or so years ago, when floating point was slow and memory access speeds were not a bottleneck, back then it made sense. On hardware that has software floating point, such as handhelds and certain older consoles, it can also make sense. But on most modern development they are the exception, not the rule.

On a modern desktop CPU --- any x86 processor since about 1997 or so --- the CPU operations are much faster and more accurate than a memory lookup.


On any x86 processor the CPU fails to meet Javas accuracy requirements for trig functions (Allthough the new intel CPUs(not sure if its just IA64 though) does have more accurate trig functions available, not sure if the JVM can use those yet though) so Java will perform a software argument reduction(possibly using fixed point arithmetics) for you before calling the native trig functions, this can get significantly more expensive(for large arguments) than a table lookup.

Allthough with a lookuptable you have to perform argument reduction in software anyway so unless your lookuptable fits in cache it should pretty much always be faster to do a quick and dirty argument reduction yourself and the cost to precision will be smaller.

if you don't need high accuracy the fastest method in java(on x86, normal trig is really fast on ARM, Sparc, etc) is most likely to use an aproximation function.
[size="1"]I don't suffer from insanity, I'm enjoying every minute of it.
The voices in my head may not be real, but they have some good ideas!

Naturally, because my game is largely physics based, I felt the necessity to design a lookup table for my trig functions.

Why did you feel "the necessity" ?

Twenty or so years ago, when floating point was slow and memory access speeds were not a bottleneck, back then it made sense. On hardware that has software floating point, such as handhelds and certain older consoles, it can also make sense. But on most modern development they are the exception, not the rule.

On a modern desktop CPU --- any x86 processor since about 1997 or so --- the CPU operations are much faster and more accurate than a memory lookup.

On any x86 processor the CPU fails to meet Javas accuracy requirements for trig functions (Allthough the new intel CPUs does have more accurate trig functions available, not sure if the JVM can use those yet though) so Java will perform a software argument reduction(possibly using fixed point arithmetics) for you before calling the native trig functions, this can get significantly more expensive(for large arguments) than a table lookup.

Allthough with a lookuptable you have to perform argument reduction in software anyway so unless your lookuptable fits in cache it should pretty much always be faster to do a quick and dirty argument reduction yourself and the cost to precision will be smaller.

if you don't need high accuracy the fastest method in java is most likely to use an aproximation function.

pretty much.

but, if the number of table entries is reasonable (for example, 256 or 1024, or similar), fitting in cache shouldn't be too much of an issue.

for example (for a 4096-entry sin table):

f=sinTable[((int)(angle*651.8986469+0.5))&4095];

where the magic constant there is: 4096/(2*PI), though slightly rounded.

sometimes, similar also makes sense when working with fixed-point, generally as conversions between floating-point and integer/fixed-point values are expensive. generally, fixed-point mostly makes sense for signal processing tasks (audio/video codecs), generally as both the input and output are usually small integer values, and doing lots of these sorts of type conversions can eat a lot of clock-cycles.

Regarding precision, you will probably find that you can use float without a problem. I can switch my entire physics code from double to float and I see no significant difference. Certainly for trig functions, which don't have that much range, you will get away with it.

It doesn't make any difference to calculation speed, because the CPU will be using double precision registers, and surely the memory cost of double over float is not significant, but it can mean less data to read and therefore better cache performance.

Whether you should be using lookup tables at all is another matter.

To clarify, I felt that a lookup table would be a necessity because:

  • More than half of my ingame entities would be using some form of rotation
  • Rotation is a key aspect of the gameplay
  • Elastic and inelastic collisions are frequent

If I understand correctly, the Math variations on trig functions run an algorithm to accurately determine it each call. For lookup tables, it only has to run those calculations once for each angle and function (less if you understand the symmetry of the unit circle). So let's assume that you only really need to run 270 (aka 90*3) calls to a trig function and its really accurate algorithm. Because of the high frequency of trig functions in my game, it wouldn't suprise me that at any given time I could be running AT LEAST 50 of these functions on any given frame.

For the record, I honestly don't mind using only integer angles or sacrificing some accuracy. I do most of my away-from-home programming on an outdated netbook (32-bit processor, 2 GB RAM, 1.6 Ghz CPU, >4 years old), where Flash can lag easily. Anything that can optimize calculations is useful for me.

EDIT: Thank you EWClay, that at least addresses the issue of precision. As long as it comes close and there are no discernible differences visually, its fine for me.

it wouldn't suprise me that at any given time I could be running AT LEAST 50 of these functions on any given frame.

That is a trivial workload for a modern processor.

Let's assume you are running on a single 2GHz processor.
Let's also assume 60FPS. That's over 33.3 MILLION cycles per frame.
Let's assume a bad-case scenario of 50 cycles per operation. (In practice they are around 10-20 cycles).

So 50 cycles per * "at least 50" of these functions = 2500 cycles.
2500 cycles = 1.25 microseconds.

That is less than 0.001% of your processor time per frame.

I'm curious what you are doing with the remaining 33,330,833 cycles per frame.

Twenty or so years ago, when floating point was slow and memory access speeds were not a bottleneck, back then it made sense. On hardware that has software floating point, such as handhelds and certain older consoles, it can also make sense. But on most modern development they are the exception, not the rule.

On a modern desktop CPU --- any x86 processor since about 1997 or so --- the CPU operations are much faster and more accurate than a memory lookup.

Just to clarify, while it is true that some floating point operations now can often be done in comparable time to some integer operations, do not mistake trigonometric functions for a next-to-free operation. Even on systems where trig operations are only one machine instruction, it won't be one clock cycle. I'd guess at least ten. These problems are exacerbated in a VM (although Java, in general, may be able to make it almost as fast as native C by allocating code in an executable segment).

That said, as you mentioned, CPU operations are usually better than incurring a memory lookup. Hence:

if the number of table entries is reasonable (for example, 256 or 1024, or similar), fitting in cache shouldn't be too much of an issue.

This. Accessing in L1 cache is generally the same cost (or one, three clock cycles more) as register access. Consequently, you will gain a lot by using a table that fits into it.

The smaller your trig table is, the more of it is likely more of it is to be in cache. For most applications, if you're careful, you need shockingly little precision, too. I would probably store values of sine and cosine in a single signed byte, and then index into it with another signed byte. This table is tiny, and should fit entirely into L2 cache. The greatest cost would be casting the result to a float--which, shouldn't hurt much at all--especially if all you wanted to do was just check the sign instead. Tangent can be calculated as sin/cos which still isn't too bad (but how often do you need to calculate it anyway?).

You mentioned rotations--an application that requires higher precision. The best choice is to apply these as offsets to a base matrix each frame, so you don't accumulate error. If constantly updating matrices is a problem, you'd best use quaternions so that your rotation matrix doesn't accumulate skew (you should do that anyway, since all floating point operations accrue error in general, just more slowly).

[size="1"]And a Unix user said rm -rf *.* and all was null and void...|There's no place like 127.0.0.1|The Application "Programmer" has unexpectedly quit. An error of type A.M. has occurred.
[size="2"]

To clarify, I felt that a lookup table would be a necessity because:

  • More than half of my ingame entities would be using some form of rotation
  • Rotation is a key aspect of the gameplay
  • Elastic and inelastic collisions are frequent
If I understand correctly, the Math variations on trig functions run an algorithm to accurately determine it each call. For lookup tables, it only has to run those calculations once for each angle and function (less if you understand the symmetry of the unit circle). So let's assume that you only really need to run 270 (aka 90*3) calls to a trig function and its really accurate algorithm. Because of the high frequency of trig functions in my game, it wouldn't suprise me that at any given time I could be running AT LEAST 50 of these functions on any given frame.

For the record, I honestly don't mind using only integer angles or sacrificing some accuracy. I do most of my away-from-home programming on an outdated netbook (32-bit processor, 2 GB RAM, 1.6 Ghz CPU, >4 years old), where Flash can lag easily. Anything that can optimize calculations is useful for me.

EDIT: Thank you EWClay, that at least addresses the issue of precision. As long as it comes close and there are no discernible differences visually, its fine for me.


50 per frame is not a problem, while frob understates the cost you shouldn't go far above 200 cycles per call unless your arguments go really far out of range so a few thousand trig calls per frame is still perfectly fine.
[size="1"]I don't suffer from insanity, I'm enjoying every minute of it.
The voices in my head may not be real, but they have some good ideas!

This topic is closed to new replies.

Advertisement