XNAMath and triangle hit test

2 comments, last by CastorX 13 years, 9 months ago
Hi!
I am trying to rewrite my ray caster's rendering code to use SSE, but I have a performance problem with this ray-triangle intersection test.
(It is basically the method introduced by Tomas Möller in his paper "Fast, Minimum Storage Ray-Triangle Intersection".)
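
For reference, a minimal scalar sketch of that test in plain C++. This is not the code from the linked test program: the `Vec3` helpers and the name `rayTriangle` are made up here for illustration, and the final `t >= 0` check (reject hits behind the ray origin) is a choice of this sketch.

```cpp
#include <cmath>

struct Vec3 { float x, y, z; };

static Vec3 sub(Vec3 a, Vec3 b) { return {a.x - b.x, a.y - b.y, a.z - b.z}; }
static Vec3 cross(Vec3 a, Vec3 b) {
    return {a.y * b.z - a.z * b.y, a.z * b.x - a.x * b.z, a.x * b.y - a.y * b.x};
}
static float dot(Vec3 a, Vec3 b) { return a.x * b.x + a.y * b.y + a.z * b.z; }

// Returns true and fills t (distance along dir) and the barycentric
// coordinates u, v when the ray orig + t*dir hits triangle (v0, v1, v2).
bool rayTriangle(Vec3 orig, Vec3 dir, Vec3 v0, Vec3 v1, Vec3 v2,
                 float& t, float& u, float& v) {
    const float kEpsilon = 1e-6f;  // illustrative tolerance, tune for your scene
    Vec3 e1 = sub(v1, v0);
    Vec3 e2 = sub(v2, v0);
    Vec3 p = cross(dir, e2);
    float det = dot(e1, p);
    if (std::fabs(det) < kEpsilon) return false;  // ray parallel to the triangle plane
    float invDet = 1.0f / det;
    Vec3 s = sub(orig, v0);
    u = dot(s, p) * invDet;
    if (u < 0.0f || u > 1.0f) return false;
    Vec3 q = cross(s, e1);
    v = dot(dir, q) * invDet;
    if (v < 0.0f || u + v > 1.0f) return false;
    t = dot(e2, q) * invDet;
    return t >= 0.0f;  // hit must lie in front of the origin
}
```

The early returns are what make this fast in scalar code, and they are also what gets in the way when the same logic is moved to SIMD lanes.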

Performance test program: http://bitbasement.uw.hu/Fragments/xnavector.html
I did not post it here because it is long.
It tests a triangle against many rays placed on a grid over the triangle.
The XNA version is slower and I don't know why, so I have a few questions.

Q#1: Why is the XNA version 25-30% slower?
Q#2: Is it possible to make it faster?
Q#3: Could someone recommend an SSE optimized version of this Ray-Triangle intersection tester function?

Thanks!
Without seeing the XNA version it would be hard to give you exact pointers on how to improve performance.
www.dadoogames.com
I haven't tried to compile or test that code, but here are some educated guesses:

- Passing the parameters by value instead of by const reference is probably not a good idea. Putting the loop inside the function would be ideal, as it would save on float -> FXMVECTOR conversions.

- Converting back to individual floats for testing isn't ideal. If you can rework it to test four at a time with vector code, it'll probably be quicker.
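
A rough sketch of the four-at-a-time idea, assuming a structure-of-arrays ray packet. It is written as plain C++ lane loops rather than with XNAMath so that it stays self-contained; `RayPacket4` and `intersect4` are hypothetical names, and the triangle is passed as `v0` plus two edges. There are no early outs: every lane runs the same arithmetic and the result comes back as a bit mask, which is the shape a vector version would likely take.

```cpp
#include <cmath>

// Four rays in structure-of-arrays form: lane i of each array is ray i.
struct RayPacket4 {
    float ox[4], oy[4], oz[4];  // origins
    float dx[4], dy[4], dz[4];  // directions
};

// Tests four rays against one triangle given as v0, e1 = v1 - v0, e2 = v2 - v0.
// Returns a 4-bit mask (bit i set means ray i hits); tOut[i] is only
// meaningful for lanes whose bit is set.
int intersect4(const RayPacket4& r, const float v0[3], const float e1[3],
               const float e2[3], float tOut[4]) {
    const float kEpsilon = 1e-6f;  // illustrative tolerance
    int mask = 0;
    for (int i = 0; i < 4; ++i) {
        // p = dir x e2
        float px = r.dy[i] * e2[2] - r.dz[i] * e2[1];
        float py = r.dz[i] * e2[0] - r.dx[i] * e2[2];
        float pz = r.dx[i] * e2[1] - r.dy[i] * e2[0];
        float det = e1[0] * px + e1[1] * py + e1[2] * pz;
        bool valid = std::fabs(det) >= kEpsilon;
        // Select a harmless divisor for degenerate lanes instead of branching.
        float inv = 1.0f / (valid ? det : 1.0f);
        float sx = r.ox[i] - v0[0], sy = r.oy[i] - v0[1], sz = r.oz[i] - v0[2];
        float u = (sx * px + sy * py + sz * pz) * inv;
        // q = s x e1
        float qx = sy * e1[2] - sz * e1[1];
        float qy = sz * e1[0] - sx * e1[2];
        float qz = sx * e1[1] - sy * e1[0];
        float v = (r.dx[i] * qx + r.dy[i] * qy + r.dz[i] * qz) * inv;
        float t = (e2[0] * qx + e2[1] * qy + e2[2] * qz) * inv;
        // All comparisons are combined without short-circuiting, like ANDed lane masks.
        int hit = valid & (u >= 0.0f) & (v >= 0.0f) & (u + v <= 1.0f) & (t >= 0.0f);
        tOut[i] = t;
        mask |= hit << i;
    }
    return mask;
}
```

Whether a compiler vectorizes this loop on its own depends on the compiler and flags; the point is only the data layout and the mask-style result.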

EDIT: You may find this article interesting / useful.

[Edited by - Adam_42 on July 5, 2010 3:40:53 PM]
Thank you! The GoParallel article looks good.
I've tried both the const-reference and by-value versions of the XNA-free test function. Passing the parameters by value makes them a little faster, probably because these are inlined functions. I've also made an improvement: precalculating the edge vectors and using floats instead of doubles in the non-XNA version makes it twice as fast as my badly designed XNA version. However, I checked the disassembly and the compiler used the scalar single-precision SSE instructions almost everywhere. I can see now that writing programs for SSE requires a different point of view. Microsoft recommends using XNAMath wherever possible, but does it really make things faster? I'm going to rewrite the XNA function to process 4 triangles in one function call.
Thanks for the replies.

