[.net] Very sad IL optimization. Am I missing something?

Started by
11 comments, last by Arild Fines 17 years, 10 months ago
I've been contemplating porting my game engine from C++ over to C# and the .NET platform, and one of the things that concerns me is the gratuitous use of pass-by-value. For large value types, like a 4x4 matrix, a pass-by-value pattern can really slow things down if you do it a lot.

One of the structs I looked at in particular was the Microsoft.DirectX.Matrix struct. Apparently it has no methods that support pass-by-reference, and as far as I can tell the language simply has no support for pass-by-reference with operators.

The thing that disturbed me the most is how the JIT compiles a simple Matrix.Multiply into machine code. As you can see from the example below, which I took from the DBGCLR tool on a Release-compiled source file, the JIT spams a bunch of movq instructions to copy the value of the Matrix value type. In a game engine you might not do so many matrix multiplies that it makes a huge difference, but it seems to me that if we're going to move to .NET we still need to squeeze as much as we can out of the runtime to hit our framerate targets under load.

My question is: am I missing something? Is there a better way? Should I write my own math library with methods that take their arguments by reference? After all the time I've spent tweaking to get the most out of my code, sometimes even hand-writing SIMD code in assembler, all those movq instructions are bugging the heck out of me.

            Matrix m = Matrix.Identity;
00000035  lea         ecx,[ebp+FFFFFF34h] 
0000003b  call        dword ptr ds:[00ED5914h] 
00000041  lea         edi,[ebp-4Ch] 
00000044  lea         esi,[ebp+FFFFFF34h] 
0000004a  mov         ecx,10h 
0000004f  rep movs    dword ptr es:[edi],dword ptr [esi] 
            Matrix n = Matrix.Identity;
00000051  lea         ecx,[ebp+FFFFFEF4h] 
00000057  call        dword ptr ds:[00ED5914h] 
0000005d  lea         edi,[ebp+FFFFFF74h] 
00000063  lea         esi,[ebp+FFFFFEF4h] 
00000069  mov         ecx,10h 
0000006e  rep movs    dword ptr es:[edi],dword ptr [esi] 
            n = Matrix.Multiply( n,m);
00000070  lea         eax,[ebp+FFFFFF74h] 
00000076  sub         esp,40h 
00000079  movq        xmm0,mmword ptr [eax] 
0000007d  movq        mmword ptr [esp],xmm0 
00000082  movq        xmm0,mmword ptr [eax+8] 
00000087  movq        mmword ptr [esp+8],xmm0 
0000008d  movq        xmm0,mmword ptr [eax+10h] 
00000092  movq        mmword ptr [esp+10h],xmm0 
00000098  movq        xmm0,mmword ptr [eax+18h] 
0000009d  movq        mmword ptr [esp+18h],xmm0 
000000a3  movq        xmm0,mmword ptr [eax+20h] 
000000a8  movq        mmword ptr [esp+20h],xmm0 
000000ae  movq        xmm0,mmword ptr [eax+28h] 
000000b3  movq        mmword ptr [esp+28h],xmm0 
000000b9  movq        xmm0,mmword ptr [eax+30h] 
000000be  movq        mmword ptr [esp+30h],xmm0 
000000c4  movq        xmm0,mmword ptr [eax+38h] 
000000c9  movq        mmword ptr [esp+38h],xmm0 
000000cf  lea         eax,[ebp-4Ch] 
000000d2  sub         esp,40h 
000000d5  movq        xmm0,mmword ptr [eax] 
000000d9  movq        mmword ptr [esp],xmm0 
000000de  movq        xmm0,mmword ptr [eax+8] 
000000e3  movq        mmword ptr [esp+8],xmm0 
000000e9  movq        xmm0,mmword ptr [eax+10h] 
000000ee  movq        mmword ptr [esp+10h],xmm0 
000000f4  movq        xmm0,mmword ptr [eax+18h] 
000000f9  movq        mmword ptr [esp+18h],xmm0 
000000ff  movq        xmm0,mmword ptr [eax+20h] 
00000104  movq        mmword ptr [esp+20h],xmm0 
0000010a  movq        xmm0,mmword ptr [eax+28h] 
0000010f  movq        mmword ptr [esp+28h],xmm0 
00000115  movq        xmm0,mmword ptr [eax+30h] 
0000011a  movq        mmword ptr [esp+30h],xmm0 
00000120  movq        xmm0,mmword ptr [eax+38h] 
00000125  movq        mmword ptr [esp+38h],xmm0 
0000012b  lea         ecx,[ebp+FFFFFEB4h] 
00000131  call        dword ptr ds:[00ED590Ch] 
00000137  lea         edi,[ebp+FFFFFF74h] 
0000013d  lea         esi,[ebp+FFFFFEB4h] 
00000143  mov         ecx,10h 
00000148  rep movs    dword ptr es:[edi],dword ptr [esi] 
You are passing by reference. Or rather, you're passing the reference by value. Classes in C# are reference types.
Quote:Original post by nsto119
You are passing by reference. Or rather, you're passing the reference by value. Classes in C# are reference types.


But Microsoft.DirectX.Matrix isn't a class, it's a struct.
You can force it to pass by reference using the "ref" keyword, I believe.
I know you can force it. I don't remember whether it was "ref" or "out" though.
Quote:Original post by Guru2012
You can force it to pass by reference using the "ref" keyword, I believe.
I know you can force it. I don't remember whether it was "ref" or "out" though.


It seems that this only works if the method definition uses this keyword.
Then put it in the method definition...
Quote:Original post by Guru2012
Then put it in the method definition...


But I can't change the method definition for types in the Microsoft.DirectX namespace.
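For reference, here is a minimal sketch of what a hand-rolled method with "ref" and "out" in its definition could look like. Mat2 and MatMath are hypothetical names made up for illustration, not part of Microsoft.DirectX, and the struct is 2x2 only to keep the listing short; the same signature pattern would apply to a 16-float 4x4 matrix.

```csharp
using System;

// Hypothetical 2x2 matrix value type, kept small for brevity.
// This is an illustration, not the Microsoft.DirectX.Matrix API.
public struct Mat2
{
    public float M11, M12, M21, M22;

    public Mat2(float m11, float m12, float m21, float m22)
    {
        M11 = m11; M12 = m12; M21 = m21; M22 = m22;
    }
}

public static class MatMath
{
    // 'ref' passes the address of each operand instead of copying it.
    // 'out' does the same for the result and requires the method to assign it.
    public static void Multiply(ref Mat2 a, ref Mat2 b, out Mat2 result)
    {
        // Compute into a local first so the call stays correct even if
        // the caller passes the same variable for 'result' and an operand.
        Mat2 r;
        r.M11 = a.M11 * b.M11 + a.M12 * b.M21;
        r.M12 = a.M11 * b.M12 + a.M12 * b.M22;
        r.M21 = a.M21 * b.M11 + a.M22 * b.M21;
        r.M22 = a.M21 * b.M12 + a.M22 * b.M22;
        result = r;
    }
}
```

The call site has to repeat the keywords, for example MatMath.Multiply(ref n, ref m, out n). Whether this is actually faster than the by-value version would need to be measured on the target runtime.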
Clearly. The method needs to know what calling convention to use.

You can also define the method to take an object, which will box the value type (Matrix being the value).
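One caveat on the boxing route: boxing copies the struct into a newly allocated heap object, so it would not remove the copy by itself. A small sketch of that behavior, using a hypothetical Counter struct in place of Matrix:

```csharp
using System;

// Hypothetical value type used only to illustrate boxing semantics.
public struct Counter
{
    public int Value;
}

public static class BoxingDemo
{
    // Returns true when the boxed copy is unaffected by a later
    // change to the original variable.
    public static bool BoxIsACopy()
    {
        Counter c;
        c.Value = 1;
        object boxed = c;   // boxing: allocates on the heap and copies c into it
        c.Value = 2;        // modifies only the original, not the boxed copy
        return ((Counter)boxed).Value == 1;
    }
}
```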

However -- are you sure that this copying is actually a performance problem? I mean, how much of your program's time is spent multiplying matrices, anyway? Should be less than a percent, unless you do a hundred instances of a hundred-bone animation...
enum Bool { True, False, FileNotFound };
That code isn't remotely optimized. Look at the sub esp, 0x40 instructions. An optimized version would use a single sub esp, 0x80 instead of one before each matrix.

I really doubt that Microsoft would ship .Net without knowing how to optimize something as simple as that.

It's possible that it's not showing you the final optimized native assembly instructions...

I'm especially suspicious because it's telling you that the function address starts down at 00000000. Nothing usually sits down that far.

[Edited by - Nypyren on June 6, 2006 11:37:05 PM]
Quote:Original post by hplus0603
are you sure that this copying is actually a performance problem? I mean, how much of your program's time is spent multiplying matrices, anyway? Should be less than a percent, unless you do a hundred instances of a hundred-bone animation...


This may be true... can't really tell until I've got some rudimentary code ported over and have started running a scene. Really, I find it interesting... and frankly as a C++ programmer it's disturbing to me... that the API passes them by value. :D



