
maths Library

ArnoAtWork · 2004-01-27T17:31:15

I am trying to write a maths library for vectors, matrices and so on, and I would like to include SSE and SSE2 support. The problem is designing the library correctly. First I want to test whether the CPU supports SSE instructions. After that, I want to call the best implementation for each mathematical operation. But how should I implement that? Should I test in every function, like this:

    Vector3 operator+ (const Vector3& v1, const Vector3& v2)
    {
        if(SSEsupport) return addVector3SSE(v1,v2);
        else return addVector3C(v1,v2);
    }

Or should I use a virtual interface?

    MathsInterface* ptrOfMathsInterface;
    ...
    if(SupportSSE()) ptrOfMathsInterface = new MathsInterfaceSSE();
    else ptrOfMathsInterface = new MathsInterfaceC();
    ...
    Vector3 operator+ (const Vector3& v1, const Vector3& v2)
    {
        return ptrOfMathsInterface->addVector3(v1,v2);
    }

Thanks a lot.

[edited by - arnoatwork on January 22, 2004 5:26:23 PM]
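For comparison, a third option sits between the two in the question: detect SSE once at startup and store the result in a plain function pointer, so there is neither a per-call branch nor a virtual interface. This is only a sketch. `detectSSE`, `initMaths` and `g_addVector3` are hypothetical names, `__builtin_cpu_supports` is a GCC/Clang builtin (other compilers fall back to the C path here), and a real project would compile the SSE functions in a separate translation unit.

```cpp
// Sketch: one-time runtime dispatch through a function pointer.
// All names except the SSE intrinsics are illustrative assumptions.
#if defined(__SSE__)
#include <xmmintrin.h>
#define MATHS_HAVE_SSE_BUILD 1
#else
#define MATHS_HAVE_SSE_BUILD 0
#endif

struct Vector3 { float x, y, z; };

// Portable fallback, always available.
static Vector3 addVector3C(const Vector3& a, const Vector3& b) {
    return Vector3{a.x + b.x, a.y + b.y, a.z + b.z};
}

#if MATHS_HAVE_SSE_BUILD
static Vector3 addVector3SSE(const Vector3& a, const Vector3& b) {
    // _mm_set_ps lists lanes from highest to lowest; the top lane is padding.
    __m128 va = _mm_set_ps(0.0f, a.z, a.y, a.x);
    __m128 vb = _mm_set_ps(0.0f, b.z, b.y, b.x);
    alignas(16) float out[4];
    _mm_store_ps(out, _mm_add_ps(va, vb));
    return Vector3{out[0], out[1], out[2]};
}
#endif

// Returns true only when the running CPU is known to support SSE.
static bool detectSSE() {
#if MATHS_HAVE_SSE_BUILD && (defined(__GNUC__) || defined(__clang__))
    __builtin_cpu_init();
    return __builtin_cpu_supports("sse") != 0;
#else
    return false;  // unknown compiler or target: stay on the C path
#endif
}

using AddVector3Fn = Vector3 (*)(const Vector3&, const Vector3&);
static AddVector3Fn g_addVector3 = addVector3C;  // safe default

// Call once at startup, before any vector maths.
static void initMaths() {
#if MATHS_HAVE_SSE_BUILD
    if (detectSSE()) g_addVector3 = addVector3SSE;
#endif
}

inline Vector3 operator+(const Vector3& v1, const Vector3& v2) {
    return g_addVector3(v1, v2);
}
```

After `initMaths()` has run, `v1 + v2` goes through whichever implementation was selected. The indirect call still prevents the compiler from inlining the operator, which matters for functions this small.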


Started by ArnoAtWork January 22, 2004 04:23 PM

14 comments, last by ArnoAtWork 20 years, 2 months ago

Sander

1,332

January 23, 2004 09:06 AM

What I do is provide multiple binaries for different platforms. Just wrap the relevant parts of the code in #ifdef/#endif. Do make sure, however, that you still check for SSE/SSE2 support before you run it. People could very easily download the wrong version, of course. If they did download the wrong version, quit gracefully with an error message explaining where to get the right one.

It's a bit more work and you will have to deal with multiple binary versions, but you'll get the fastest code path.
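A minimal sketch of that approach, assuming a hypothetical build flag `MATHS_USE_SSE` that is defined only when building the SSE binary. The function names and the error text are made up for illustration, and the CPU check uses the GCC/Clang builtin `__builtin_cpu_supports`.

```cpp
// Sketch: one code path per binary, chosen at compile time,
// plus a startup check that the CPU matches the binary.
#include <cstdio>
#ifdef MATHS_USE_SSE
#include <xmmintrin.h>
#endif

struct Vector4 { float x, y, z, w; };

#ifdef MATHS_USE_SSE
// SSE binary: the operation inlines straight to SSE instructions.
inline Vector4 add(const Vector4& a, const Vector4& b) {
    alignas(16) float out[4];
    _mm_store_ps(out, _mm_add_ps(_mm_set_ps(a.w, a.z, a.y, a.x),
                                 _mm_set_ps(b.w, b.z, b.y, b.x)));
    return Vector4{out[0], out[1], out[2], out[3]};
}
#else
// Plain binary: portable C++ only.
inline Vector4 add(const Vector4& a, const Vector4& b) {
    return Vector4{a.x + b.x, a.y + b.y, a.z + b.z, a.w + b.w};
}
#endif

// True if the running CPU can execute this particular binary.
inline bool cpuSupportsThisBuild() {
#if !defined(MATHS_USE_SSE)
    return true;   // the plain build runs anywhere
#elif defined(__GNUC__) || defined(__clang__)
    __builtin_cpu_init();
    return __builtin_cpu_supports("sse") != 0;
#else
    return false;  // cannot verify on this compiler: refuse to run
#endif
}

// Returns 0 if it is safe to continue, 1 after printing an explanation.
inline int checkCpuOrExplain() {
    if (cpuSupportsThisBuild()) return 0;
    std::fprintf(stderr,
                 "This build requires SSE, which this CPU does not support.\n"
                 "Please install the non-SSE version instead.\n");
    return 1;
}
```

A program would call `checkCpuOrExplain()` first thing in `main` and exit if it returns non-zero, before any SSE instruction has a chance to execute.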

Sander Maréchal

Qw3r7yU10p!

819

January 23, 2004 09:32 AM

Put the platform-specific code in its own functions.
Write a generic version of the functions too.
Build your class using templates.
Choose between the platform functions through the template specialisation.
typedef the specialisations to vector3D in separate platform-specific headers.
Change the include paths for different builds for different platforms.
No need for #ifdefs.
Nice and neat.
Sorry, not my usual detailed reply. Stuff to do.

Pete
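The steps above could look roughly like the following sketch. The tag types, `VectorOps` and `Vector3T` are hypothetical names. Everything is shown in one file so it compiles on its own, which is why the SSE parts are guarded by the compiler's `__SSE__` macro; in the layout described above, the specialisation and the typedef would each live in a platform-specific header picked by the include path, and the guards would disappear.

```cpp
// Sketch: platform selection via template specialisation.
#if defined(__SSE__)
#include <xmmintrin.h>
#endif

struct GenericTag {};  // portable code path
struct SseTag {};      // SSE code path

// Primary template: the generic version of the functions.
template <typename Platform>
struct VectorOps {
    static void add(const float* a, const float* b, float* out) {
        for (int i = 0; i < 4; ++i) out[i] = a[i] + b[i];
    }
};

#if defined(__SSE__)
// Specialisation: platform-specific functions (own header in a real layout).
template <>
struct VectorOps<SseTag> {
    static void add(const float* a, const float* b, float* out) {
        _mm_storeu_ps(out, _mm_add_ps(_mm_loadu_ps(a), _mm_loadu_ps(b)));
    }
};
#endif

// The class itself is written once against VectorOps<Platform>.
template <typename Platform>
class Vector3T {
public:
    Vector3T() : v{0.0f, 0.0f, 0.0f, 0.0f} {}
    Vector3T(float x, float y, float z) : v{x, y, z, 0.0f} {}

    float x() const { return v[0]; }
    float y() const { return v[1]; }
    float z() const { return v[2]; }

    friend Vector3T operator+(const Vector3T& a, const Vector3T& b) {
        Vector3T r;
        VectorOps<Platform>::add(a.v, b.v, r.v);
        return r;
    }

private:
    float v[4];  // padded to four floats so one SSE register holds it
};

// Stand-in for the per-platform header that provides the typedef.
#if defined(__SSE__)
typedef Vector3T<SseTag> Vector3D;
#else
typedef Vector3T<GenericTag> Vector3D;
#endif
```

Client code only ever names `Vector3D`, so switching platforms means switching which header supplies the typedef. Because the choice is made at compile time, the operator can still be inlined.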

mishikel

148

January 23, 2004 09:39 AM

I've written a math library and I'm pretty pleased that everything works correctly. However, I have some similar optimization questions:

SSE
Jan and Sander, it sounds like you guys have working SSE code in your math libs. How much speed improvement have you noticed? In which areas are SSE optimizations most crucial?

Inlining
Which functions did you choose to inline (everything, nothing, something in between)? How much of a speed improvement did you notice?

Thanks,
Matt

Jan Wassenberg

1,000

January 23, 2004 02:41 PM

Sander: hehe, I didn't consider that, because of the trouble for the user. Many people don't know what SSE is, or at least that you need a PIII or Athlon XP ("what's that?") to run it. I guess it's workable with a 'you installed the wrong version' check, but that's still a hassle.

I actually don't think any of these suggestions are worth the trouble, unless you find that your math code is demonstrably too slow, and further, that it would be improved by SSE. mishikel, I don't SSE-optimize stuff unless it really, really matters (see the CLOD terrain engine on my page for one example), and a math lib isn't one of those cases, IMO. For a few odd matrix ops, SSE doesn't make a difference at all. If you do enough of them that it would, I'd write the whole thing in asm, doing register allocation myself. It's kind of silly to load stuff from memory, do a few SSE ops on it, and write it back out to memory.
That said, if you have lots of fsqrt(), you still win by replacing fsqrt with rsqrtss & mulss, even with parameter passing overhead.
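The rsqrtss & mulss replacement relies on the identity sqrt(x) = x * (1/sqrt(x)). Below is a sketch using compiler intrinsics rather than hand-written asm; `fastSqrt` is a hypothetical name. `_mm_rsqrt_ss` only gives an approximation (roughly 12 bits of precision), so the result is less accurate than a real square root, and x = 0 is not handled because 0 * infinity produces NaN.

```cpp
// Sketch: approximate square root via rsqrtss + mulss.
#include <cmath>
#if defined(__SSE__)
#include <xmmintrin.h>
#endif

// Approximate sqrt(x) for x > 0. Not valid for x == 0 (yields NaN).
inline float fastSqrt(float x) {
#if defined(__SSE__)
    __m128 v = _mm_set_ss(x);                // x in the low lane
    __m128 r = _mm_rsqrt_ss(v);              // rsqrtss: approximately 1/sqrt(x)
    return _mm_cvtss_f32(_mm_mul_ss(v, r));  // mulss: x * (1/sqrt(x))
#else
    return std::sqrt(x);  // no SSE in this build: exact fallback
#endif
}
```

For example, `fastSqrt(4.0f)` returns a value very close to, but not necessarily exactly, 2. If more precision is needed, one Newton-Raphson refinement step on the reciprocal square root is the usual addition.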

E8 17 00 42 CE DC D2 DC E4 EA C4 40 CA DA C2 D8 CC 40 CA D0 E8 40E0 CA CA 96 5B B0 16 50 D7 D4 02 B2 02 86 E2 CD 21 58 48 79 F2 C3

Sander

1,332

January 26, 2004 10:37 AM

mishikel:
I can't tell you the speed improvement from inlining, since we have never used these functions without inlining them. That is the entire reason we went for a multi-binary approach. Inlining + SSE(2) = max speed.

The SSE functions take approximately 20%-25% of the time of the normal C functions (when operating on arrays of 4D vectors). SSE2 still has to be profiled properly. If you use other vectors (like 3D ones), the SSE speed improvement is smaller than that.
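The case of arrays of 4D vectors is where SSE fits most naturally, because each vector fills one 128-bit register exactly. A sketch of such a batch operation follows; `Vector4` and `addArrays` are hypothetical names, not code from the library discussed here. The struct is declared 16-byte aligned so the aligned load and store intrinsics can be used.

```cpp
// Sketch: adding two arrays of 4D vectors, one SSE register per vector.
#include <cstddef>
#if defined(__SSE__)
#include <xmmintrin.h>
#endif

struct alignas(16) Vector4 { float x, y, z, w; };

// out[i] = a[i] + b[i] for i in [0, n). The arrays must not partially overlap.
inline void addArrays(const Vector4* a, const Vector4* b, Vector4* out,
                      std::size_t n) {
    for (std::size_t i = 0; i < n; ++i) {
#if defined(__SSE__)
        // Each Vector4 is four contiguous, 16-byte-aligned floats.
        __m128 va = _mm_load_ps(&a[i].x);
        __m128 vb = _mm_load_ps(&b[i].x);
        _mm_store_ps(&out[i].x, _mm_add_ps(va, vb));
#else
        // Portable fallback with the same result.
        out[i].x = a[i].x + b[i].x;
        out[i].y = a[i].y + b[i].y;
        out[i].z = a[i].z + b[i].z;
        out[i].w = a[i].w + b[i].w;
#endif
    }
}
```

A 3D vector either wastes the fourth lane or needs extra shuffling to pack and unpack, which is consistent with the smaller gain reported for 3D vectors.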

Jan:
We are eliminating the binary hassle via an installer/launcher. Our game will be an online, multiplayer-only game, so the latest binaries are always available via the internet. At startup, the launcher checks for SSE(2) support (or AltiVec on Macintosh), and if the installed version is not the optimal one, the user is prompted to download the optimal version. Zero hassle for the user.

Sander Maréchal

Jan Wassenberg

1,000

January 27, 2004 05:31 PM

Ah, OK, cool.
How do the SSE and FPU versions compare with 'regular' math lib usage (a few matrices here and there), as opposed to large batches, where SSE is obviously faster?

