Is there any high performance maths library

This topic is 3580 days old which is more than the 365 day threshold we allow for new replies. Please post a new topic.

If you intended to correct an error in the post then please contact us.

Hi. Is there an optimized maths library you can recommend? I've been looking around, but I can't find one that is both full-featured and optimized with SSE2 etc. The Intel maths library doesn't support quaternions, and AMD's doesn't either.

No, no D3DX. The project will be console-based in the future, so I'd prefer not to have a DirectX dependency. Also, the D3DX API is still rather C-like.

I have sworn to support CML, but it is not optimized quite like you say.

I think it is full-featured though (quaternions)

http://cmldev.net/

Quote:
Original post by Void
No, no d3dx. Would be consoles based in future, so prefer not to have directx dependency.


You might like to check out the open source Sony vectorized math library, which is available to download as part of the Bullet dynamics engine.

You can download Bullet from here and then look in the Extras/vectormathlibrary directory for the math library source.

It implements classes like Matrix3, Matrix4, Point3, Quat, Transform3, Vector3 and Vector4, and has optimised versions for SSE, (PS3) PPU and SPU, as well as a standard (cross-platform) scalar version. It also has both C and C++ interfaces.


I checked out the Sony library. It's pretty, and twice as fast as a simple implementation. Strangely, though, the debug build is somehow twice as slow.

Unfortunately, it is still missing a few bits, like a Plane class and projection matrices, but otherwise it looks good.

My approach is a little different. I have a matrix and vector maths system which is a wrapper around the one in the underlying physics engine, plus a bunch of functions which use <cmath>.

The problem was that I kept changing physics engines (first Bullet, then ODE, now Newton) and I wanted to use the same matrix and vector maths system as the physics system, because I was also using matrices in my camera system and for the AI. The upside of the wrapper is that I don't have to go through my code and change the maths calls when I change physics engines. When I build an application with no physics engine, I use my own system, but that hasn't been the case for over a year.

Why not try that? Pick one, and write a wrapper to it.
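
To make the idea concrete, here is a minimal sketch of such a wrapper. Everything in it is made up for illustration: the `backend` namespace stands in for whatever maths types the current physics engine provides, and `gamemath::Vector3` is a hypothetical name for the one type the rest of the game code would see.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical stand-in for the maths types of whichever engine is in use.
namespace backend {
struct Vec3 { float x, y, z; };

inline Vec3 cross(const Vec3& a, const Vec3& b) {
    return Vec3{a.y * b.z - a.z * b.y,
                a.z * b.x - a.x * b.z,
                a.x * b.y - a.y * b.x};
}

inline float dot(const Vec3& a, const Vec3& b) {
    return a.x * b.x + a.y * b.y + a.z * b.z;
}
}  // namespace backend

// The only maths interface the game code uses (hypothetical name).
namespace gamemath {
class Vector3 {
public:
    Vector3(float ax, float ay, float az) : v_{ax, ay, az} {}

    float x() const { return v_.x; }
    float y() const { return v_.y; }
    float z() const { return v_.z; }

    // Indexed access; the index must be 0, 1 or 2.
    float operator[](int i) const {
        assert(i >= 0 && i < 3);
        return i == 0 ? v_.x : (i == 1 ? v_.y : v_.z);
    }

    // Each operation simply forwards to the backend.
    friend Vector3 cross(const Vector3& a, const Vector3& b) {
        return Vector3(backend::cross(a.v_, b.v_));
    }
    friend float length(const Vector3& a) {
        return std::sqrt(backend::dot(a.v_, a.v_));
    }

private:
    explicit Vector3(const backend::Vec3& v) : v_(v) {}
    backend::Vec3 v_;  // the backend type never appears in the public interface
};
}  // namespace gamemath
```

Game code would then call `cross(a, b)` and `length(a)` on `gamemath::Vector3` only, so swapping the physics engine means rewriting the forwarding bodies rather than every call site.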

Quote:
Original post by WillC

You might like to check out the open source Sony vectorized math library, which is available to download as part of the Bullet dynamics engine.


I second the Sony vector math libs.

Indy

Hmm, I did a few tests. While the Sony lib was faster, its accuracy looks questionable. I also compared it with Ogre's maths, which has SIMD and hacked custom maths code as well.

Here's my test. Note, I just did some data IO to ensure it doesn't get compiled away in Release mode.


#include <iostream>
#include <fstream>
#include <cmath>
#include <cstring>   // memcpy, used by MyVec::operator=
#include <float.h>

#include "vectormath/cpp/vectormath_aos.h"
#include <Ogre.h>

#ifdef _DEBUG
#pragma comment (lib, "OgreMain_d.lib")
#else
#pragma comment (lib, "OgreMain.lib")
#endif

using namespace std;
using namespace Vectormath::Aos;

// Print a Sony Vectormath vector as "x,y,z".
ostream & operator << (ostream & os, const Vector3 & v)
{
    return os << v.getX () << "," << v.getY () << "," << v.getZ ();
}


#include <windows.h>
/// Create a Timer, which will immediately begin counting
/// up from 0.0 seconds.
/// You can call reset() to make it start over.
class Timer
{
public:
    Timer()
    {
        reset();
    }
    /// reset() makes the timer start over counting from 0.0 seconds.
    void reset()
    {
        unsigned __int64 pf;
        QueryPerformanceFrequency( (LARGE_INTEGER *)&pf );
        freq_ = 1.0 / (double)pf;
        QueryPerformanceCounter( (LARGE_INTEGER *)&baseTime_ );
    }
    /// seconds() returns the number of seconds (to very high resolution)
    /// elapsed since the timer was last created or reset().
    double seconds()
    {
        unsigned __int64 val;
        QueryPerformanceCounter( (LARGE_INTEGER *)&val );
        return (val - baseTime_) * freq_;
    }
    /// milliseconds() returns the number of milliseconds (to very high resolution)
    /// elapsed since the timer was last created or reset().
    double milliseconds()
    {
        return seconds() * 1000.0;
    }
private:
    double freq_;
    unsigned __int64 baseTime_;
};

// Plain scalar vector used as the baseline.
// Note: the anonymous struct inside the union is a compiler extension.
struct MyVec
{
    union
    {
        struct
        {
            float x,y,z;
        };

        float data [3];
    };

    MyVec () : x(0), y(0), z(0)
    {}

    MyVec (float x, float y, float z)
        : x (x), y (y), z (z)
    {}

    MyVec & operator = (const MyVec & v)
    {
        memcpy (data, v.data, sizeof (float) * 3);
        return *this;
    }
};

ostream & operator << (ostream & os, const MyVec & v)
{
    return os << v.x << "," << v.y << "," << v.z;
}


MyVec MyCross (const MyVec & v1, const MyVec & v2)
{
    return (MyVec (v1.y * v2.z - v1.z * v2.y
                  ,v1.z * v2.x - v1.x * v2.z
                  ,v1.x * v2.y - v1.y * v2.x));
}

// Returns v scaled to unit length; near-zero vectors are returned unchanged.
MyVec MyNormalize (const MyVec & v)
{
    MyVec result = v;

    const float distance = sqrt (v.x*v.x + v.y*v.y + v.z*v.z);

    if (FLT_EPSILON < distance)
    {
        result.x /= distance;
        result.y /= distance;
        result.z /= distance;
    }
    return result;
}

int main ()
{
    Timer timer;

    const int max_iteration = 10000000;
    // this causes integer overflow in new in VS2005?
    // const int max_iteration = 100000000;

    // Note: each timed region below also includes the array allocation.

    // Test 1: Sony Vectormath (SIMD)
    {
        double startTime = timer.seconds ();

        Vectormath::Aos::Vector3 *array = new Vectormath::Aos::Vector3 [max_iteration];

        Vectormath::Aos::Vector3 result;
        for (int i = 0; i < max_iteration ;++i)
        {
            Vectormath::Aos::Vector3 v1 (i,2,3);
            Vectormath::Aos::Vector3 v2 (i,3,4);
            Vectormath::Aos::Vector3 v3 = cross (v1, v2);
            Vectormath::Aos::Vector3 v4 = normalize (v3);

            result = v4;

            array [i] = result;
        }

        double endTime = timer.seconds ();

        // Write out some results so the loop isn't compiled away in Release mode.
        ofstream ofs ("1.txt");
        for (int i = 0; i < 1000; ++i)
        {
            ofs << array [i] << endl;
        }

        delete [] array;

        cout << result << endl;
        cout << "Time taken for SIMD is " << endTime - startTime << endl;
    }

    // Test 2: plain scalar MyVec
    {
        double startTime = timer.seconds ();

        MyVec *array = new MyVec [max_iteration];

        MyVec result;
        for (int i = 0; i < max_iteration ;++i)
        {
            MyVec v1 (i,2,3);
            MyVec v2 (i,3,4);
            MyVec v3 = MyCross (v1, v2);
            MyVec v4 = MyNormalize (v3);

            result = v4;

            array [i] = result;
        }

        double endTime = timer.seconds ();

        ofstream ofs ("2.txt");
        for (int i = 0; i < 1000; ++i)
        {
            ofs << array [i] << endl;
        }

        delete [] array;

        cout << result << endl;
        cout << "Time taken for MyVec is " << endTime - startTime << endl;
    }

    // Test 3: Ogre::Vector3
    {
        double startTime = timer.seconds ();

        Ogre::Vector3 *array = new Ogre::Vector3 [max_iteration];

        Ogre::Vector3 result;
        for (int i = 0; i < max_iteration ;++i)
        {
            Ogre::Vector3 v1 (i,2,3);
            Ogre::Vector3 v2 (i,3,4);
            Ogre::Vector3 v3 = v1.crossProduct (v2);
            Ogre::Vector3 v4 = v3.normalisedCopy ();

            result = v4;

            array [i] = result;
        }

        double endTime = timer.seconds ();

        ofstream ofs ("3.txt");
        for (int i = 0; i < 1000; ++i)
        {
            ofs << array [i] << endl;
        }

        delete [] array;

        cout << result << endl;
        cout << "Time taken for Ogre is " << endTime - startTime << endl;
    }

    return 0;
}
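
One way to put a number on the accuracy difference, rather than eyeballing the three text files, is to measure how far each normalized vector's length is from 1. The following is only a rough sketch: `Vec3f`, `unit_length_error` and `max_unit_length_error` are hypothetical names that belong to none of the libraries above, and the results of each library would first have to be copied into `Vec3f` values.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <vector>

// Hypothetical plain vector used only for the accuracy check.
struct Vec3f { float x, y, z; };

// Absolute deviation of |v| from 1, computed in double precision so the
// check itself does not add noticeable rounding error.
inline double unit_length_error(const Vec3f& v) {
    const double len = std::sqrt(double(v.x) * v.x +
                                 double(v.y) * v.y +
                                 double(v.z) * v.z);
    return std::fabs(len - 1.0);
}

// Worst deviation over a set of supposedly normalized vectors.
inline double max_unit_length_error(const std::vector<Vec3f>& vs) {
    assert(!vs.empty());
    double worst = 0.0;
    for (const Vec3f& v : vs) {
        worst = std::max(worst, unit_length_error(v));
    }
    return worst;
}
```

Running this over the output of each of the three loops would give one worst-case figure per library to compare.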

You can add a Newton-Raphson iteration to get more precision when using _mm_rsqrt_ps. Note that this only applies to the SSE version of the Sony Vectormath library. The PlayStation 3 Cell SPU and PPU versions should be fine.
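
For reference, a minimal sketch of what such a refinement step can look like. This is not the actual Bullet patch: the names `rsqrt_refined`, `normalize_refined` and `Vec3f` are made up for illustration, and the code assumes an x86 target with SSE. The step is the standard Newton-Raphson update for 1/sqrt(x): y1 = y0 * (1.5 - 0.5 * x * y0 * y0).

```cpp
#include <cassert>
#include <cmath>
#include <xmmintrin.h>  // SSE intrinsics: _mm_rsqrt_ps, _mm_mul_ps, _mm_sub_ps, ...

// Per-lane 1/sqrt(x): hardware estimate followed by one Newton-Raphson step.
// _mm_rsqrt_ps alone is only accurate to roughly 12 bits.
inline __m128 rsqrt_refined(__m128 x) {
    const __m128 half         = _mm_set1_ps(0.5f);
    const __m128 three_halves = _mm_set1_ps(1.5f);
    const __m128 y0    = _mm_rsqrt_ps(x);                    // rough estimate
    const __m128 xy0y0 = _mm_mul_ps(x, _mm_mul_ps(y0, y0));  // x * y0^2
    // y1 = y0 * (1.5 - 0.5 * x * y0^2)
    return _mm_mul_ps(y0, _mm_sub_ps(three_halves, _mm_mul_ps(half, xy0y0)));
}

// Hypothetical plain vector, just to show the refinement used for normalizing.
struct Vec3f { float x, y, z; };

inline Vec3f normalize_refined(Vec3f v) {
    const float len2 = v.x * v.x + v.y * v.y + v.z * v.z;
    assert(len2 > 0.0f);  // a zero vector cannot be normalized
    const float inv = _mm_cvtss_f32(rsqrt_refined(_mm_set1_ps(len2)));
    return Vec3f{v.x * inv, v.y * inv, v.z * inv};
}
```

The extra multiplies and the subtract are where the performance cost comes from; code that can live with the raw estimate can skip the step.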

It has been improved, so the SSE version of Sony Vectormath in Bullet 2.67 will have higher precision, at the cost of some performance. There is still the normalizeApprox for the faster version. You can download the SSE fix here:

http://www.bulletphysics.com/Bullet/phpBB3/viewtopic.php?f=9&t=1551

Thanks for the feedback, hope this helps,
Erwin
http://bulletphysics.com

[Edited by - erwincoumans on February 28, 2008 1:12:02 PM]
