
Math API performance: saving CPU cycles?

Aaron Carter · 2012-09-27T17:53:03

I'm trying not to go overboard with micro-optimizations for my engine's math API, but I am trying to put some consideration into performance. For example, it's my understanding that multiplication is slightly faster than division and saves a few CPU cycles here and there, and this can add up when high-frequency code is executing over and over in a game loop. So I have done things like this example from my Matrix structure:

[source lang="csharp"]
public static Matrix operator /(Matrix mat, float div)
{
#if PERFORM_CHECKS
    if (div == 0)
        throw new MathematicalException(
            "Divisor is zero.", new DivideByZeroException());
#endif
    float num = 1f / div;
    var result = Matrix.Identity;
    result.M11 = mat.M11 * num;
    result.M12 = mat.M12 * num;
    result.M13 = mat.M13 * num;
    result.M14 = mat.M14 * num;
    result.M21 = mat.M21 * num;
    result.M22 = mat.M22 * num;
    result.M23 = mat.M23 * num;
    result.M24 = mat.M24 * num;
    result.M31 = mat.M31 * num;
    result.M32 = mat.M32 * num;
    result.M33 = mat.M33 * num;
    result.M34 = mat.M34 * num;
    result.M41 = mat.M41 * num;
    result.M42 = mat.M42 * num;
    result.M43 = mat.M43 * num;
    result.M44 = mat.M44 * num;
    return result;
}
[/source]

Is this correct/true, and should I be doing it this way? And what other optimizations might I use in general to make my math code blazing fast and efficient? Might I even consider doing something like this:

[source lang="csharp"]
#if !PERFORM_CHECKS
unchecked
{
#endif
    // math code here...
#if !PERFORM_CHECKS
}
#endif
[/source]
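As a side note on the two ideas in this post, here is a minimal sketch that can be used to check them. The class and method names (`FloatMathNotes`, `CountMismatches`, and so on) are hypothetical and are not part of the poster's engine. It rests on two facts about C#: multiplying by a precomputed reciprocal is not guaranteed to give the same rounded result as a direct division, and `checked`/`unchecked` only affect integer arithmetic and conversions, so wrapping float math in `unchecked` should not change anything. Float division by zero does not throw either; it yields infinity or NaN, which is why the explicit `div == 0` test is the only thing that raises an error in the operator above.

```csharp
using System;

// Hypothetical helper names for illustration; nothing here comes from the poster's engine.
public static class FloatMathNotes
{
    // Direct division.
    public static float Divide(float value, float div)
    {
        return value / div;
    }

    // Multiplication by a precomputed reciprocal, as in the Matrix operator above.
    public static float MultiplyByReciprocal(float value, float div)
    {
        float inv = 1f / div;
        return value * inv;
    }

    // Counts the inputs for which the two approaches return values that compare unequal.
    // (NaN results also count, because NaN != NaN.)
    public static int CountMismatches(float[] values, float div)
    {
        int mismatches = 0;
        foreach (float v in values)
        {
            if (Divide(v, div) != MultiplyByReciprocal(v, div))
                mismatches++;
        }
        return mismatches;
    }

    // Integer addition in an unchecked context: overflow wraps around silently.
    public static int AddUnchecked(int a, int b)
    {
        unchecked { return a + b; }
    }

    // Integer addition in a checked context: overflow throws OverflowException.
    public static int AddChecked(int a, int b)
    {
        checked { return a + b; }
    }

    // Float multiplication in a checked context: overflow still produces infinity,
    // because checked/unchecked do not apply to floating-point arithmetic.
    public static float MultiplyChecked(float a, float b)
    {
        checked { return a * b; }
    }
}
```

For example, `FloatMathNotes.AddUnchecked(int.MaxValue, 1)` returns `int.MinValue`, while `FloatMathNotes.Divide(1f, 0f)` returns positive infinity instead of throwing. Running `CountMismatches` over a range of representative values and divisors is one way to see whether the reciprocal form changes results enough to matter for a given use.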


Started by ATC September 25, 2012 05:14 PM

21 comments, last by ATC 11 years, 7 months ago

ATC

551

Author

September 27, 2012 04:38 PM

I didn't mean that a programmer could achieve better results through hand-coded assembly. I think the point that I and those commenters were making is that unless you post the output assembly for the benchmarks, there's no way of knowing whether the computer is really doing the same thing in both cases. If I understood correctly, people were saying that because the benchmark test was just buffering a string and never actually doing anything with it, the JIT compiler was likely optimizing out the entire benchmark test. However, without the raw assembly output, we'll never know. That's all I was saying.

That still proves the point even if that's the case... the fact that a JIT compiler could make such a marked optimization that a C compiler cannot. Imagine the compounded result of tons of major optimizations, just from the compiler, in the context of a huge program.


metsfan

679

September 27, 2012 05:39 PM

That still proves the point even if that's the case... the fact that a JIT compiler could make such a marked optimization that a C compiler cannot. Imagine the compounded result of tons of major optimizations, just from the compiler, in the context of a huge program.

No disagreement there, but it still makes for a poor benchmark.

ATC

551

Author

September 27, 2012 05:53 PM

No disagreement there, but it still makes for a poor benchmark.

You can argue that, sure. Perhaps someday we as a community should get together and do some rigorous benchmarking of C, C++, C# (in a scientific, empirical fashion) and various other languages and make a page dedicated to the results, to help game programmers make optimization decisions. But until that day, I'm just going to worry about squeezing as much performance as possible out of my C# math API. Whether C or C# is faster is irrelevant to what I'm doing; in any case C# is plenty fast enough because I've seen it with my own eyes -- running 7500fps on extremely data-heavy, complex scenes without culling. ;-)

So back on topic... what else could I do to optimize math routines? What pitfalls should I be looking out for? What tips and tricks should be kept in mind? And again, has anyone used, or does anyone know anything about, Intel's Math Kernel Library?
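One pitfall already raised in this thread is a benchmark whose result is never used, which a JIT compiler may be free to optimize away. A minimal sketch of a timing helper that keeps the result live could look like the following. The names `MicroBenchmark` and `Measure` are hypothetical, and the warm-up count of 1000 is an arbitrary assumption. The delegate call adds overhead of its own, so this is only suited to comparing two variants under the same harness, not to measuring absolute cost.

```csharp
using System;
using System.Diagnostics;

// Hypothetical timing helper for illustration; not taken from any library.
public static class MicroBenchmark
{
    // Calls body(i) for i in [0, iterations), sums the results into checksum,
    // and returns the elapsed time in milliseconds for the timed loop only.
    public static double Measure(Func<int, float> body, int iterations, out float checksum)
    {
        // Warm-up pass so that JIT compilation of body is not included in the timing.
        // The warm-up count (1000) is an arbitrary assumption.
        float warmup = 0f;
        for (int i = 0; i < 1000; i++)
            warmup += body(i);
        GC.KeepAlive(warmup); // mark the warm-up result as used

        float sum = 0f;
        Stopwatch sw = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++)
            sum += body(i); // every result feeds the checksum, so the work stays observable
        sw.Stop();

        checksum = sum; // handed back to the caller, who should print or compare it
        return sw.Elapsed.TotalMilliseconds;
    }
}
```

A caller would pass the routine under test as the delegate, for example `MicroBenchmark.Measure(i => i / 3f, 1000000, out sum)` against `MicroBenchmark.Measure(i => i * (1f / 3f), 1000000, out sum)`, and write out both the time and the checksum.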

This topic is closed to new replies.
