#Actual AllEightUp

Posted 20 September 2013 - 03:24 PM

Eef, there is bad and then there is *that* holy disaster. VC is completely destroying the function, moving and flushing the data multiple times for no apparent reason. Are you sure you are patched all the way up and have the processor pack and all that? It was bad, sure, but that is absolutely pitiful. I'll post my assembly output for the same thing here in just a minute, as soon as I get a release build with symbols.

Here's the VC2012 codegen:

00B81A06 movaps xmm1,xmmword ptr [__xmm@0000000040400000400000003f800000 (0BDF4C0h)]
00B81A0D movaps xmm0,xmm1
00B81A10 dpps xmm0,xmm1,77h
00B81A16 sqrtps xmm0,xmm0
00B81A19 movss xmm2,dword ptr [__real@3c23d70a (0BDF384h)]
00B81A21 divps xmm1,xmm0

Just a bit more reasonable. :)

That misses the store back to the wrapper because my code passes by register, and this is in a unit test, so I had to extract the code from the surrounding setup/teardown of the test. Oh, and it uses the full sqrtps and divps because the calling code needs a bit more accuracy.
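For anyone reading along who doesn't have the dpps immediate memorized, here is a rough scalar model of what that dpps/sqrtps/divps sequence computes. This is only a sketch: `Vec4`, `dpps_model` and `normalize3_model` are names made up for illustration and are not from my math library, and the real instruction may add the products in a different order than this loop does.

```cpp
#include <array>
#include <cassert>
#include <cmath>

using Vec4 = std::array<float, 4>;  // hypothetical stand-in for one xmm register

// Scalar model of DPPS: bits 4..7 of the immediate pick which lanes are
// multiplied and summed, bits 0..3 pick which result lanes receive the sum.
// Unselected result lanes are zero.
Vec4 dpps_model(const Vec4& a, const Vec4& b, unsigned imm) {
    assert(imm <= 0xFFu);  // the immediate is a single byte
    float sum = 0.0f;
    for (unsigned i = 0; i < 4; ++i)
        if (imm & (0x10u << i)) sum += a[i] * b[i];
    Vec4 r{};
    for (unsigned i = 0; i < 4; ++i)
        if (imm & (1u << i)) r[i] = sum;
    return r;
}

// Model of the listing: dpps xmm0,xmm1,77h ; sqrtps xmm0,xmm0 ; divps xmm1,xmm0
Vec4 normalize3_model(const Vec4& v) {
    Vec4 len = dpps_model(v, v, 0x77);      // x*x + y*y + z*z in lanes 0..2
    for (float& x : len) x = std::sqrt(x);  // full-precision square root per lane
    Vec4 out{};
    // Lane 3 divides by the zero that dpps left there, so with a zero w it is
    // 0/0 (NaN under IEEE rules). Only lanes 0..2 are meaningful.
    for (unsigned i = 0; i < 4; ++i) out[i] = v[i] / len[i];
    return out;
}
```

With 77h the high nibble multiplies only x, y and z, and the low nibble broadcasts the sum to those same three lanes, which is why a single sqrtps and divps finish the job with no shuffles.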

#3 AllEightUp

Posted 20 September 2013 - 03:16 PM

Eef, there is bad and then there is *that* holy disaster. VC is completely destroying the function, moving and flushing the data multiple times for no apparent reason. Are you sure you are patched all the way up and have the processor pack and all that? It was bad, sure, but that is absolutely pitiful. I'll post my assembly output for the same thing here in just a minute, as soon as I get a release build with symbols.

Here's the VC2012 codegen:

011AD510 mov dword ptr fs:[00000000h],eax
011AD516 movaps xmm0,xmmword ptr [__xmm@0000000040400000400000003f800000 (120D380h)]
011AD51D dpps xmm0,xmmword ptr [__xmm@0000000040c0000040a0000040800000 (120D3C0h)],77h
011AD527 sub esp,8
011AD52A lea eax,[ebp-1Ch]
011AD52D movss dword ptr [esp+4],xmm0

Just a bit more reasonable. :)

Oh, one sec, that was just a dot product.
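In case the constant names in the listing look like noise: they are just the raw float bit patterns, most significant lane first. A small sketch for decoding them follows; `bits_to_float` and `dot3_from_bits` are hypothetical helpers written for this post, and the lane order is my reading of how the compiler names these constants.

```cpp
#include <cstdint>
#include <cstring>

// Reinterpret a 32-bit pattern as an IEEE-754 single-precision float.
float bits_to_float(std::uint32_t bits) {
    static_assert(sizeof(float) == sizeof(std::uint32_t), "expects 32-bit float");
    float f;
    std::memcpy(&f, &bits, sizeof f);  // well-defined way to type-pun in C++
    return f;
}

// Three-component dot product of two vectors given as bit patterns,
// lane 0 first (that is, the LAST eight hex digits of the symbol name first).
float dot3_from_bits(const std::uint32_t (&a)[3], const std::uint32_t (&b)[3]) {
    float sum = 0.0f;
    for (int i = 0; i < 3; ++i)
        sum += bits_to_float(a[i]) * bits_to_float(b[i]);
    return sum;
}
```

Reading the two constants this way gives (1, 2, 3, 0) and (4, 5, 6, 0), so the dpps in this listing should leave 32 in the low lanes.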
