d3dx library sucks! please read this!

Started by
22 comments, last by DrunkenHyena 19 years, 1 month ago
Quote:Original post by EvilDecl81
Quote:
myVec3 v1(1.5f, 2.0f, 1.0f);
myVec3 v2(5.0f, 4.0f, 1.0f), fd;

D3DXVECTOR3 v3(1.5f, 2.0f, 1.0f), v4(5.0f, 4.0f, 1.0f), v5;

unsigned long i1 = timeGetTime();

for(unsigned int i = 0; i<100000000; i++) {

MATHMyVec3Cross(&v1, &v2, &fd);
//MATHMyVec3Cross2(&v1, &v2, &fd);
//D3DXVec3Cross(&v5, &v4, &v3);

}

unsigned long i2 = timeGetTime();
i2-=i1;


HEY, COMPILE ON VC++ 2003!!


Quite frankly, I'm surprised you got any real meaningful data. VC should have removed the loop since it doesn't do anything, unless you are running a debug build. You can't just put a function in a loop and call it a million times - you have to do a non-trivial operation the compiler can't just optimize away.

Cross product is also a very uninteresting test. Since the call goes through a dispatch table, you always have a small amount of overhead for any math routine. This shows up in very simple math operations. My suggestion: Try beating the Matrix Multiply.


Does it really optimise it away? I actually think it will not, as it is working on variables outside of the loop body. If it optimised this loop away, the overall behaviour of the application, which expects the variable to be changed, would change. Can anyone say something about this?

Crafter 2D: the open source 2D game framework

Github: https://github.com/crafter2d/crafter2d
Twitter: [twitter]crafter_2d[/twitter]

Quote:Original post by ashade
I suspect the d3dx library has two versions: one lies inside d3dx9math.inl, and the other is already compiled into d3dx9.lib. This could explain why I got the vector cross too slow (it was using d3dx9math.inl, which is not optimized). Now I'm getting really fast results from the d3dx code. Moreover, it's much faster than any processor-specific optimization, so I think the d3dx9.lib functions are using the graphics hardware to do the calculations instead of 3DNow!, SSE or the standard FPU... do you think I'm right?


I really can't see how you managed to do that... the only reference for (e.g.) D3DXVec3Dot is in the inline file; if I don't include that, I get an unresolved reference, meaning it likely doesn't exist in the library. Please fill me in on what secret references you have found.

And I really can't see why it would use the GPU for the calculations; to me that doesn't seem like a way to speed up such simple calculations.


Quote:Original post by jeroenb
Quote:Original post by EvilDecl81
Quite frankly, I'm surprised you got any real meaningful data. VC should have removed the loop since it doesn't do anything, unless you are running a debug build. You can't just put a function in a loop and call it a million times - you have to do a non-trivial operation the compiler can't just optimize away.


Does it really optimise it away? I actually think it will not, as it is working on variables outside of the loop body. If it optimised this loop away, the overall behaviour of the application, which expects the variable to be changed, would change. Can anyone say something about this?

Yes, when I initially tried that code it was optimized away. That doesn't change the behavior of the application at all, because v5 (in the case of the D3DX test) isn't used outside the loop at all, and inside the loop it isn't used for any calculations - it's only a result that gets written every iteration. The compiler detected that and removed the loop.

Quote:I have a hard time believing that as this is in the D3DX-include file ;)

(Regarding detecting the CPU and selecting the suitable functions)
That doesn't happen with the straight inline stuff of course, it happens with more complex, non-inline stuff (those in the lib, like matrix inversion).

Quote:Original post by ashade
Moreover, it's much faster than any processor-specific optimization, so I think the d3dx9.lib functions are using the graphics hardware to do the calculations instead of 3DNow!, SSE or the standard FPU... do you think I'm right?

No, reading back the result from the gpu would be far slower than even the most naive implementation of the code.

The code in D3DX just happens to be pretty darn good. Not surprising when you consider that engineers from AMD and Intel worked on it.
Stay Casual, Ken
Drunken Hyena

