I don't quite understand how you're comparing the code - they do completely different things!
The C code takes the square root of a number that is incremented on each iteration. (Incidentally, a good optimiser could probably optimise most of this away if it knows that sqrt is a pure, deterministic operation [the same input always maps to the same output], which probably means it has to treat sqrt as an intrinsic. If it did that, it could probably calculate the result of the whole loop in one go.) I've had MSVS.NET 2003 do this kind of thing for me.
BTW: someone remind me, is ++ a valid operation on a double? Logically I only really like it on discrete things, e.g. integers or iterators. For floating-point stuff I tend to prefer x += 1.0; and I'm sure the optimiser can turn that into the best code either way.
The assembly code multiplies together two FPU registers that do not appear to be loaded with any content! IIRC you should get an FPU stack underflow exception, as nothing has been pushed onto the FPU stack!!
The x86 asm code that is equivalent to what you wrote in the C code's loop is:
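__asm {
    fld   QWORD PTR [f]        ; ST(0) = f
    fld1                       ; ST(0) = 1.0, ST(1) = f
    fld   ST(1)                ; ST(0) = f, ST(1) = 1.0, ST(2) = f
    fsqrt                      ; ST(0) = sqrt(f)
    fstp  DWORD PTR [result]   ; result = (float) sqrt(f), pop
    fadd                       ; add and pop: ST(0) = f + 1.0
    fstp  QWORD PTR [f]        ; f = f + 1.0, pop (stack is empty again)
}
// is equivalent to... (given float result & double f)
result = (float) sqrt( f );
f += 1.0;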
BTW: you don't need to put a semicolon after each statement in asm. I believe the instruction delimiter is a new line; a semicolon indicates the start of a comment in asm.