Quote:Original post by krum
With all due respect, Washu, this has nothing to do with unsafe code - rather this is a good demonstration of how poor the JIT's optimizer is. Either way, I think we'll find that the safe code may take at least one additional clock cycle per loop due to the additional add
The JIT optimizer has an extremely short window in which to run. Unlike C++ compilation, which can run for hours without anyone caring, .NET jitting has to happen within microseconds. Frankly, there just aren't a hell of a lot of optimizations you can do in that short a span of time. I should note that the x64 code produced is much better than the x86 code, though (simply because of certain instruction set guarantees).
Quote:lea eax,[ecx+edx*4+8]
vs
lea edx,[edi+ecx*4]
I believe the fact that the unsafe code has a few more instructions to set the function up is largely insignificant. Even if the CPU can do both LEA instructions in the same number of clock cycles, I think to suggest that in this case unsafe code produces a substantial performance penalty is going overboard.
The setup code is expensive, but a loop like that does not introduce a significant hit for either one. It did, however, kill your claim that unsafe code in a loop would be significantly faster than safe code. The reality is that an LEA of [r + r * c + c] is about 1 to 1.5 cycles slower than [r + r * c]. Pipelining and cache misses will more than make up for that difference.
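For reference, the two addressing forms above fall out of loops like the following. This is a minimal sketch, not code from the thread: the class, the method names, and the simple sum loop are my own. Safe indexing carries an extra constant offset (the array's object header and length field), which is where the [r + r * c + c] form comes from; the pinned pointer form is just [r + r * c].

```csharp
// Compile with /unsafe. A hypothetical reconstruction of the loops
// whose generated LEA instructions are quoted above.
static class LoopDemo
{
    public static int SumSafe(int[] data)
    {
        int sum = 0;
        // Safe indexing: the JIT emits an address of the form
        // [base + index*4 + 8], the +8 skipping the header/length fields.
        for (int i = 0; i < data.Length; i++)
            sum += data[i];
        return sum;
    }

    public static unsafe int SumUnsafe(int[] data)
    {
        int sum = 0;
        fixed (int* p = data) // pins the array -- the setup cost discussed above
        {
            // Raw pointer: the address is just [pointer + index*4].
            for (int i = 0; i < data.Length; i++)
                sum += p[i];
        }
        return sum;
    }
}
```

Note that the safe loop also lets the JIT hoist the bounds check (the `i < data.Length` pattern is one it recognizes), so the unsafe version isn't even buying you that.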
Quote:The more extreme case would be if you had an array of 4x4 transformation matrices, say from an animation, that you needed to multiply with a parent transform. Using safe code could be tremendously slower because it would need to copy each matrix at least twice for each operation, whereas using pointers could reduce the memory bandwidth requirements.
My point is that using unsafe code per se is not slower. What you do with it is - like anything - a whole different story.
Eh, careful there. There are plenty of managed operations you can do that will eliminate the copies, ref parameters being one of the big ones. Dropping straight down to unsafe code just because it "appears" faster is a mistake. Furthermore, the setup code you so diligently tossed away adds an invisible overhead that will hit you when you least expect it.

First and foremost, any time a collection runs with pinned objects, those objects don't get moved, and this fragments the heap. That fragmentation slows down allocations, and also slows down further collections when they happen (the collector ends up having to do more work). Now, the chance of a collection happening during a short pinned duration is fairly small. The cost of locking and unlocking the critical section in order to pin the object is not so small.

Furthermore, matrix multiplication typically requires per-element access, and as I've shown in my journal with the simplest of operations (calculating the magnitude of a vector), the code generated for pinned pointers is NOT the best in the world. The fact is, the JIT spends very little time optimizing unsafe code; it spends more time optimizing managed code. There are areas where "unsafe" code might be faster, for instance ones that deal with memory copies, although you would be hard pressed to beat the built-in memory copy (which uses what was, in 2005, considered the most optimal form of memory copy for the P4/AMD chipsets). Even MDX doesn't do that, instead delegating its copy operations to a P/Invoke of memcpy.
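To illustrate the ref-parameter point: the whole-struct copies in the matrix scenario above can be eliminated with no pinning and no unsafe code at all. This is only a sketch; the `Matrix` type here is trimmed to 2x2 to keep it short, and all the names are mine, not from any real math library.

```csharp
// Illustrative value type; a real transform matrix would have 16 fields.
struct Matrix
{
    public float M11, M12, M21, M22;

    // Passing by value would copy each input struct in full on every call.
    // 'ref' and 'out' pass only an address instead -- the copies disappear,
    // and the GC never has to treat anything as pinned.
    public static void Multiply(ref Matrix a, ref Matrix b, out Matrix result)
    {
        result.M11 = a.M11 * b.M11 + a.M12 * b.M21;
        result.M12 = a.M11 * b.M12 + a.M12 * b.M22;
        result.M21 = a.M21 * b.M11 + a.M22 * b.M21;
        result.M22 = a.M21 * b.M12 + a.M22 * b.M22;
    }
}
```

One caveat with this pattern: the caller must not pass the same variable as both an input and the result, since the result's fields are written while the inputs are still being read.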
Frankly, unsafe code is unsafe for many reasons: not just because it can't be verified, but because you are taking great risks in attempting to outguess the JIT and "optimize" your code. If you really want to optimize your code, then pre-compiled machine code, called from your assembly, is the recommended route. That way you can hand-optimize it using the latest instruction sets, vectorization, and other techniques.
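If you do go that route, the usual mechanism is P/Invoke, the same thing MDX does for its memcpy delegation. A minimal sketch follows; note that `fastmath.dll` and `TransformVectors` are entirely made-up names for a hypothetical hand-optimized native library, not a real API.

```csharp
using System;
using System.Runtime.InteropServices;

static class NativeMath
{
    // Hypothetical native DLL containing hand-optimized (e.g. SSE-vectorized)
    // routines. Both the library name and the entry point are illustrative.
    [DllImport("fastmath.dll", CallingConvention = CallingConvention.Cdecl)]
    public static extern void TransformVectors(IntPtr dst, IntPtr src, int count);
}
```

The P/Invoke transition has a fixed per-call cost, so this only pays off when the native routine does enough work per call to amortize it.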
Now, there are certainly cases where unsafe code can beat managed code. More often than not, though, it falls under the 80/20 rule, and such optimizations aren't optimizations, just wastes of time. You're better off investing in better algorithms first.