Started by Aug 20 2001 07:51 AM

11 replies to this topic

Posted 20 August 2001 - 07:51 AM

yo dudes,
At the moment I have a function called Advance3d which looks something like this.
Advance3d(&x,&y,&z,angle_up,angle_y,float length);
It uses asin and acos to calculate the coordinates of x, y and z after they've advanced length along the up and y angles. It works all right, but I think the asin and acos calls are slowing down my program, and I was wondering if there is any other way to write an advance function...
frank

Posted 20 August 2001 - 09:10 AM

There are some ways to improve the performance of sin and cos, although I personally have not yet bothered with them. For example, table lookups may be faster if you can deal with somewhat reduced precision. I think coding a few terms of a Taylor series might be faster on some CPUs, again if you can accept a larger truncation error than the native acos and asin functions provide. You may even find that you can get a speedup by exploiting the various pipelines in your target CPU, though this could require hand-optimizing the assembly code generated by your compiler.
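A minimal sketch of the Taylor-series idea for sin, assuming the argument has already been reduced to [-pi/2, pi/2]. The name `taylor_sin` and the choice of four terms are illustrative assumptions, not taken from any library, and whether this beats the native call on a given CPU has to be measured.

```c
#include <assert.h>
#include <math.h>

/* Hypothetical helper: 4-term Taylor polynomial for sin(x),
   x - x^3/3! + x^5/5! - x^7/7!, evaluated in nested (Horner) form.
   Assumes |x| <= pi/2; the truncation error grows quickly outside it. */
float taylor_sin(float x)
{
    assert(fabsf(x) <= 1.5708f);  /* caller must reduce the range first */
    float x2 = x * x;
    return x * (1.0f - x2 * (1.0f / 6.0f
              - x2 * (1.0f / 120.0f - x2 * (1.0f / 5040.0f))));
}
```

The first dropped term is x^9/9!, so the error is largest near the ends of the interval and very small near zero.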

I found this link to something called bMath. I have *NO* idea whether it is good, but here you go:

http://www.bmath.net/bmath/

There is also a book that focuses specifically on fast math. I have a copy and it's decent, but I don't recall the title right now. I'll have to try to remember to look it up tonight.

Graham Rhodes

Senior Scientist

Applied Research Associates, Inc.

Posted 20 August 2001 - 09:32 AM

I'm not sure exactly what you're up to, but the method you are using is indeed slow because of the acos and asin calls. You can remove the need for any cos and sin functions by using the rotation matrix (if you have it): the third row of the rotation matrix is a vector pointing in the direction your model is facing. You can use that vector to advance in the direction you are facing, with only multiplication and addition required.
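A rough sketch of that idea, assuming a 3x3 rotation matrix stored row-major in nine floats whose third row is the unit-length facing direction, as described above. `Vec3` and `advance_along_forward` are made-up names for illustration, and engines differ on whether the forward vector is a row or a column, so check your own convention.

```c
#include <assert.h>

typedef struct { float x, y, z; } Vec3;

/* Hypothetical sketch: move `pos` by `length` along the facing direction.
   Assumes rot[6], rot[7], rot[8] (the third row of a row-major 3x3
   rotation matrix) hold the unit-length forward vector. */
void advance_along_forward(Vec3 *pos, const float *rot, float length)
{
    assert(pos != 0 && rot != 0);
    pos->x += rot[6] * length;   /* three multiplies and three adds, */
    pos->y += rot[7] * length;   /* no trig calls at all */
    pos->z += rot[8] * length;
}
```

With an identity matrix, advancing by 2 from (1, 2, 3) ends at (1, 2, 5).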

Don't know if that helps, but all the best in any case.

henry

Posted 20 August 2001 - 11:12 AM

You can also approximate sin using a so-called Padé approximant:

P(x) = 20*x*(294 - 31*x^{2}) / (11*x^{4} + 360*x^{2} + 5880)

It approximates sin(x) for x in (-pi/2, pi/2).

Also, cos(x) = sin(pi/2 - x), so you have the cos function too. To move x into the proper range, use the identities sin(pi - a) = sin(a) and sin(pi + a) = -sin(a).

But I think the best solution is to make a lookup table.

K.

Posted 20 August 2001 - 12:00 PM

Could use lookup tables, look at the math library in the Quake3 source.

Posted 21 August 2001 - 03:35 AM

Here is the book I mentioned:

"MATH Toolkit for REAL-TIME Programming" by Jack W. Crenshaw, published by CMP Books in 2000, ISBN 1-929629-09-5.

It's a good book. There is an entire chapter on computing sine and cosine, and Jack does a fairly in-depth study of why some implementations are slow and how to speed them up. He talks about table lookups, the pros (speed) and cons (accuracy). Best of all, he provides source code that you can use (not table lookup code)! I don't know exactly what the Intel or AMD math processors are doing internally, so I don't know how the code in this book compares. The book is designed so that you can program embedded systems and even systems that do not support floating point math at all (e.g., Windows CE handhelds).

I was at SIGGRAPH last week, and I bumped into some folks I know at a 3D game engine company. They demonstrated to me a new engine they're working on that runs on handhelds. The demonstration was on a Compaq iPAQ. What they showed was a 2000-triangle skinned character, dancing around in a navigable 3D view, with texture maps on the character and a floor plane. The texture coordinates on the floor plane were animated. The demo ran at, I would estimate, around 15-20 frames per second. With the floor plane and other things, they're getting around 60,000 triangles/second and they believe they can push 100,000 tris/second with optimization. I remember that 5 years ago it was difficult to get that performance on a desktop machine with very early consumer 3D graphics boards (e.g., the original Matrox Millennium). And back around 1990, that is the performance we were getting on SGI Personal Iris workstations over at North Carolina State University where I was a grad student. Now on a handheld. Without floating point math. Just amazing.

Graham Rhodes

Senior Scientist

Applied Research Associates, Inc.

Posted 21 August 2001 - 06:41 AM

sounds good,

I read an article about Unreal 2 which said there is an average of 100,000 polygons per screen, and it's supposed to run properly on a GeForce 2 or 3. Imagine that: 100,000 polygons. In my game I'm having trouble with about 5000 polygons per screen...

Back to the sin and cos thing: it's not a problem to create an array of floats for sin, cos and tan, but how do you do asin, acos and atan? The argument for the atan function ranges from 0 to infinity (or whatever you call it in English). How big would you make the array? I've tried an array of 1000 floats, but then it won't work properly. I'll try this Pade function grudzio was talking about...

icecode

Posted 21 August 2001 - 07:29 AM

I've used precalculated sin/cos tables with 1024 elements myself, with the % operator keeping everything inside the table, like a = sintable[x%1024]. A faster way would be an array with 256 elements indexed by a byte, like I used to do back in the Amiga days, but it's not very precise anymore: a = sintable[(byte)x].

Posted 21 August 2001 - 08:18 AM

asin and acos are periodic functions, which repeat every 2pi radians. Even though they take parameters from 0 to infinity you can always find a parameter within the range 0 to 2pi that has exactly the same sine and cosine values as your value, no matter how large. For example, 9812222.33681007 is equal to 2pi * 1561660 + 23.17, and so you can just find the sine and cosine of 23.17. You can do something like a mod() to find the remainder of angle/2pi, and that remainder is the number you would send to asin() and acos(). This makes it possible to build much smaller tables. If you exploit various symmetries between asin and acos you can further reduce the domain to allow a smaller table or greater precision within a large table.

303 has done just this, although the x he's using must not be in degrees or radians (because there really is no good reason to store values beyond 360 degrees or 2pi radians in the table).

Have you actually profiled your code to be sure it is the asin and acos calls that are slowing you down?

Graham Rhodes

Senior Scientist

Applied Research Associates, Inc.

Posted 22 August 2001 - 04:20 AM

arcsin can be approximated by the following

$$\arcsin x = x + \sum_{v=1}^{\infty} \frac{1 \cdot 3 \cdots (2v-1)}{2 \cdot 4 \cdots (2v)} \cdot \frac{x^{2v+1}}{2v+1}, \qquad |x| \le 1$$
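A small sketch of summing the standard arcsin power series, x + sum over v >= 1 of [1*3*...*(2v-1)] / [2*4*...*(2v)] * x^(2v+1) / (2v+1). The function name `series_asin` and the `terms` parameter are assumptions for the example. The series converges fast for small |x| but very slowly as |x| gets close to 1, so this is not necessarily quicker than the library asin.

```c
#include <assert.h>
#include <math.h>

/* Hypothetical helper: partial sum of the arcsin power series using
   `terms` terms after the leading x. The coefficient ratio and the
   power of x are updated incrementally instead of being recomputed. */
double series_asin(double x, int terms)
{
    assert(x >= -1.0 && x <= 1.0);
    double sum = x;
    double coeff = 1.0;   /* 1*3*...*(2v-1) / (2*4*...*(2v)) */
    double power = x;     /* x^(2v+1) */
    for (int v = 1; v <= terms; ++v) {
        coeff *= (2.0 * v - 1.0) / (2.0 * v);
        power *= x * x;
        sum += coeff * power / (2.0 * v + 1.0);
    }
    return sum;
}
```

For example, `series_asin(0.5, 20)` should agree closely with `asin(0.5)`, while an argument near 1 needs far more terms for the same accuracy.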

Posted 22 August 2001 - 05:08 AM

quote:Original post by grhodes_at_work

asin and acos are periodic functions, which repeat every 2pi radians. Even though they take parameters from 0 to infinity you can always find a parameter within the range 0 to 2pi that has exactly the same sine and cosine values as your value, no matter how large.

Very, very not true. You're thinking of sin() and cos().

asin() and acos(), being inverses of the sin() and cos() funcs, take in values between -1 and 1. asin() outputs a value between -pi/2 and pi/2, while acos() outputs a value between 0 and pi (even though there are many possible returns of the function)*.

atan() does take values from -infinity to infinity, however. It will output a value between -pi/2 and pi/2.

~ Dragonus

* There are an infinite number of solutions y to the equations x = sin(y) and x = cos(y) for every x between -1 and 1. Just as sin(0) = sin(pi) = sin(2pi) = 0, since asin() is the inverse of sin(), asin(0) = {..., 0, pi, 2pi, ...}. However, since we can only have one return value from the function rather than an infinite number, the function returns the principal value, which here is 0.