2. Simple loop tests aren't at all accurate for benchmarking (I-cache warm-up, process/thread context switches in the middle of your loop, D-cache warm-up for the first access to the timer variables, etc.)...
3. ...Nor are they representative of what the compiler would generate for a real app. In a real app, if the compiler runs out of integer registers before it runs out of FP stack space, the results would skew in favour of the FP code.
4. The MSVC optimiser collapses those loops anyway, so you only get valid results by disabling optimisations of the loop (say by making the iterator volatile) - which results in some pretty sub-optimal pooey code generation.
5. The float constant you were adding is treated as a double (a literal without the f suffix is a double).
6. Try the following slightly modified (and fairer) version of your code in both release and debug:
#include <stdlib.h>
#include <iostream.h>
#include <windows.h>

int main()
{
    unsigned int dt;

    unsigned int t1 = GetTickCount();
    float temp1 = 0.0f;
    for (volatile unsigned int i = 0; i < 4200000000U; ++i)
    {
        temp1 += .00013f;
    }
    unsigned int t2 = GetTickCount();
    dt = t2 - t1;
    cout << "Floating Point-\n Total Time: " << dt << "ms\n";
    cout << " Average Time: " << dt / 4200000000.0f << "ms\n\n";

    unsigned int t3 = GetTickCount();
    unsigned int temp2 = 0U;
    for (volatile unsigned int j = 0; j < 4200000000U; ++j)
    {
        temp2 += 13;
    }
    unsigned int t4 = GetTickCount();
    dt = t4 - t3;
    cout << "Integer-\n Total Time: " << dt << "ms\n";
    cout << " Average Time: " << dt / 4200000000.0f << "ms\n\n";

    return 0;
}
6a. On the (Intel CPU) machine I'm browsing on, with a DEBUG build I got the following results for two runs:
Floating Point-
 Total Time: 60046ms
 Average Time: 1.42967e-005ms

Integer-
 Total Time: 12789ms
 Average Time: 3.045e-006ms

-------------------------------------

Floating Point-
 Total Time: 59726ms
 Average Time: 1.42205e-005ms

Integer-
 Total Time: 12788ms
 Average Time: 3.04476e-006ms
6b. Now let's see the (MSVC6 compiler optimised) RELEASE version on the same machine:
Floating Point-
 Total Time: 19287ms
 Average Time: 4.59214e-006ms

Integer-
 Total Time: 21601ms
 Average Time: 5.1431e-006ms

----------------------------------------------

Floating Point-
 Total Time: 19568ms
 Average Time: 4.65905e-006ms

Integer-
 Total Time: 21592ms
 Average Time: 5.14095e-006ms
6c. See how much the surrounding code can skew your "benchmarking" results...
7. That's not to say MSVC6's optimisation of FP code is ideal. It's not: it misses many good opportunities to use FP latency hiding and flushes temporaries to memory too often (though that could be for numerical consistency). The integer optimisation code in MSVC6 is better.
8. And no, I've not got any suggestions as to why the release integer code ends up slower than the debug integer code.
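For what it's worth, here's a rough sketch of how the noise from points 2 and 4 could be reduced a little. It isn't the code I timed above and it assumes a C++11 compiler (so not MSVC6): it uses std::chrono::steady_clock instead of GetTickCount, does one untimed warm-up pass, and keeps the fastest of several runs so a context switch in one run doesn't dominate. The names best_time_ms, float_loop, int_loop and the two sink variables are made up for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <chrono>
#include <limits>

// Hypothetical helper (not from the code above): runs `body` once untimed to
// warm the caches, then times it `runs` times and returns the fastest run in ms.
template <typename Body>
double best_time_ms(Body body, int runs)
{
    assert(runs > 0);
    using clock = std::chrono::steady_clock;

    body();  // warm-up pass, result discarded

    double best = std::numeric_limits<double>::max();
    for (int r = 0; r < runs; ++r)
    {
        const auto start = clock::now();
        body();
        const auto stop = clock::now();
        const double ms =
            std::chrono::duration<double, std::milli>(stop - start).count();
        best = std::min(best, ms);
    }
    return best;
}

// Volatile sinks: storing the result here means the work has an observable effect.
volatile float g_float_sink = 0.0f;
volatile unsigned int g_int_sink = 0U;

// Same idea as the loops above: a volatile iterator stops the optimiser
// collapsing the loop. Written as i = i + 1 because ++ on a volatile is
// deprecated in C++20.
void float_loop(unsigned int iterations)
{
    float temp = 0.0f;
    for (volatile unsigned int i = 0; i < iterations; i = i + 1)
    {
        temp += .00013f;  // f suffix keeps the constant a float
    }
    g_float_sink = temp;
}

void int_loop(unsigned int iterations)
{
    unsigned int temp = 0U;
    for (volatile unsigned int j = 0; j < iterations; j = j + 1)
    {
        temp += 13;
    }
    g_int_sink = temp;
}
```

You'd call it with something like best_time_ms([] { float_loop(100000000U); }, 5) and the same for int_loop. Taking the minimum is just one choice (a median would also do), and it still doesn't fix point 3: the loop is no more representative of a real app than before.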
[edited by - S1CA on October 12, 2003 7:05:52 PM]