(float)int VS (float)DWORD Optimization

Started by whitde February 01, 2007 11:49 PM

6 comments, last by TheUnbeliever 17 years, 2 months ago

whitde

134

Author

February 01, 2007 11:49 PM

I'm sure we all have a lot of casts in our code. For example, the width of a texture is a DWORD and you need it as a float to add to some other floating-point variable. One thing I noticed when looking at the disassembly was that a DWORD->float cast generates a LOT more code than an int->float cast. The code below has 2 almost identical statements, except that the first casts to int before casting to float and results in 2 lines of assembler. The second, where a DWORD is converted straight to a float, is 6 lines of code, has a conditional branch and creates a variable. Just casting the DWORD to an int first results in smaller, faster code.


; 213  : 		tex_width = (float)((int)iTexture::GetWidth(texture));

  0004a	57		 push	 edi
  0004b	ff 15 00 00 00
	00		 call	 DWORD PTR __imp_?GetWidth@iTexture@@SAKK@Z
  00051	89 44 24 20	 mov	 DWORD PTR tv338[esp+24], eax
  00055	db 44 24 20	 fild	 DWORD PTR tv338[esp+24]

; 214  : 		tex_height = (float)iTexture::GetHeight(texture);

  00059	57		 push	 edi
  0005a	d9 5e 08	 fstp	 DWORD PTR [esi+8]
  0005d	ff 15 00 00 00
	00		 call	 DWORD PTR __imp_?GetHeight@iTexture@@SAKK@Z
  00063	83 c4 08	 add	 esp, 8
  00066	85 c0		 test	 eax, eax
  00068	89 44 24 1c	 mov	 DWORD PTR tv336[esp+20], eax
  0006c	db 44 24 1c	 fild	 DWORD PTR tv336[esp+20]
  00070	7d 06		 jge	 SHORT $L64718
  00072	d8 05 00 00 00
	00		 fadd	 DWORD PTR __real@4f800000
$L64718:
  00078	d9 56 0c	 fst	 DWORD PTR [esi+12]

RidiculousX

140

February 02, 2007 12:18 AM

Wow, that's good to know.

But isn't a DWORD declared as an unsigned long (although you might not treat it as unsigned)? Then casting that to a signed int could involve some loss of data (if your numbers are big enough).

whitde

134

Author

February 02, 2007 12:35 AM

DWORD is basically unsigned... so the overhead is testing to see if the most significant bit is set. If it is... the number is adjusted.

Seems the FPU always treats the numbers as signed and the compiler compensates if the variable is unsigned.

If you know your numbers are small (0 to roughly 2 billion, i.e. no larger than 0x7FFFFFFF) then you will have no problems.

mattd

1,078

February 02, 2007 12:40 AM

Here's my interpretation.

    float f = static_cast<float>(static_cast<int>(dw));
    00401036 89 04 24         mov         dword ptr [esp],eax
    00401039 DB 04 24         fild        dword ptr [esp]

When casting from int to float, the compiler can make use of the fild instruction, which converts and loads an int into the floating-point registers.

    float f = static_cast<float>(dw);
    00401036 85 C0            test        eax,eax
    00401038 89 04 24         mov         dword ptr [esp],eax
    0040103B DB 04 24         fild        dword ptr [esp]
    0040103E 7D 06            jge         main+16h (401046h)
    00401040 D8 05 04 21 40 00 fadd        dword ptr [__real@4f800000 (402104h)]

When you cast from a DWORD (unsigned long) to float, the compiler treats the source DWORD as an int, so it can use the fild instruction once again. However, large DWORDs which are unrepresentable as ints (those greater than 0x7FFFFFFF) will be seen by the fild instruction as negative ints. The compiler makes a check to see whether the DWORD, when interpreted as an int, is negative or not (line 00401036 performs the test, and 0040103E performs the actual branch if the interpreted number is not negative). If it is found to be interpreted as negative, the float 0x4f800000 (0x1.0p32, i.e. 2^32 = 4294967296) is added to the result of the conversion in order to correct it (line 00401040).
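To make that fix-up concrete, here is a minimal C++ sketch of what the generated code appears to be doing. The names `Dword` (a stand-in for the Windows `DWORD`, assumed here to be a 32-bit unsigned integer), `dword_to_float_emulated` and `check_emulation` are made up for illustration; this mirrors the logic of the listing, not the exact instructions the compiler emits.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Stand-in for the Windows DWORD type (assumption: 32-bit unsigned).
using Dword = std::uint32_t;

// Hypothetical helper mirroring the listing: load the bits as if they were
// a signed int (what fild does), then add 2^32 if that value looked negative.
float dword_to_float_emulated(Dword dw)
{
    std::int32_t as_signed;
    std::memcpy(&as_signed, &dw, sizeof as_signed);  // reinterpret the bits

    // The x87 register is wider than a float, so model it with a double.
    double wide = static_cast<double>(as_signed);
    if (as_signed < 0) {
        wide += 4294967296.0;  // the constant __real@4f800000 is 2^32
    }
    return static_cast<float>(wide);  // rounding to float happens on the store
}

// Small self-check: the emulation should agree with the plain cast.
void check_emulation()
{
    assert(dword_to_float_emulated(42u) == static_cast<float>(42u));
    assert(dword_to_float_emulated(0xDEADBEEFu) == static_cast<float>(0xDEADBEEFu));
}
```

Calling `check_emulation()` from a test or from `main` should pass silently if the emulation matches the compiler's own DWORD to float conversion.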

So, while the int to float conversion takes fewer instructions (notice I explicitly did not say less time), the forced cast from DWORD to int in the first place means the result will be incorrect if the DWORD is too large to be representable as an int. For example:

    #include <iostream>
    #include <windows.h>

    int main ()
    {
        DWORD dw = 0xDEADBEEF;
        float f = static_cast<float>(static_cast<int>(dw));
        std::cout << "f = " << f << std::endl;
        return 0;
    }

produces the incorrect output f = -5.59039e+008, whereas..

    #include <iostream>
    #include <windows.h>

    int main ()
    {
        DWORD dw = 0xDEADBEEF;
        float f = static_cast<float>(dw);
        std::cout << "f = " << f << std::endl;
        return 0;
    }

correctly produces f = 3.73593e+009 (0xDEADBEEF = 3735928559).

Refer here: http://msdn2.microsoft.com/en-us/library/ms861534.aspx

Quote:
The static_cast operator converts expression to the type of type-id based solely on the types present in the expression. No run-time type check is made to ensure the safety of the conversion.

[...]

For instance, static_cast can be used to convert from an int to a char. However, the resulting char may not have enough bits to hold the entire int value. Again, it is left to the programmer to ensure that the results of a static_cast conversion are safe.

With all this said, this is probably a prime example of a premature optimization. Are you sure that this results in a quantitatively faster program? Have you profiled it? Is this conversion the source of a bottleneck in execution, and worthy of special treatment (it shouldn't be)? It even comes with the cost of incorrect outcomes in some cases (maybe the entire DWORD -> int cast when the DWORD is unrepresentable as an int is undefined behaviour - I'm sure one of those people who love to shout "Undefined behaviour!" whenever they can will inform me if it is :P).
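If you do want to measure it, something along these lines is a rough starting point. It is only a sketch: `time_conversion_ms` is a hypothetical helper, `std::uint32_t` stands in for DWORD, and a crude `std::chrono` loop is no substitute for a real profiler (the optimizer, timer resolution and cache effects can all skew the numbers).

```cpp
#include <cassert>
#include <chrono>
#include <cstdint>
#include <vector>

// Hypothetical timing helper: applies `convert` to every value, accumulates
// the results into `sink` so the loop has an observable effect, and returns
// the elapsed wall-clock time in milliseconds.
template <typename Convert>
double time_conversion_ms(const std::vector<std::uint32_t>& values,
                          Convert convert,
                          float& sink)
{
    assert(!values.empty() && "nothing to measure");

    const auto start = std::chrono::steady_clock::now();
    float sum = 0.0f;
    for (std::uint32_t v : values) {
        sum += convert(v);
    }
    const auto stop = std::chrono::steady_clock::now();

    sink = sum;  // hand the result back to the caller
    return std::chrono::duration<double, std::milli>(stop - start).count();
}
```

You would call it once with a lambda doing `static_cast<float>(v)` and once with one doing `static_cast<float>(static_cast<int>(v))`, over a large vector of in-range values, and compare the two timings across several runs.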

[Edited by - mattd on February 2, 2007 1:40:35 AM]

RidiculousX

140

February 02, 2007 12:43 AM

Also good to know.

SunTzu

286

February 02, 2007 06:19 AM

Quote:It even comes with the cost of incorrect outcomes in some cases (maybe the entire DWORD -> int cast when the DWORD is unrepresentable as an int is undefined behaviour - I'm sure one of those people who love to shout "Undefined behaviour!" whenever they can will inform me if it is :P).

Not undefined - perfectly well defined - but potentially incorrect, yes.

If you know the DWORD is in the range [0, INT_MAX] there may be some benefit in this, otherwise, not such a good idea. And even then, you'd have to have a comment in your code explaining why you've done it or people would look at it and think "WTF?", and even then... if your program's performance is really constrained by things like this, I'd either be appalled or impressed, I'm not quite sure which... but that's not to say it's impossible I suppose.
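For what it's worth, one way to carry that comment and the range assumption along with the cast is to wrap it in a small helper. This is a sketch only: `small_dword_to_float` is a made-up name, and `Dword` is a stand-in for the Windows DWORD, assumed to be a 32-bit unsigned integer.

```cpp
#include <cassert>
#include <climits>
#include <cstdint>

// Stand-in for the Windows DWORD type (assumption: 32-bit unsigned).
using Dword = std::uint32_t;

// Hypothetical helper: goes through int so the compiler can use the short
// int -> float sequence. Only valid for values in [0, INT_MAX]; the assert
// documents that precondition and checks it in debug builds.
inline float small_dword_to_float(Dword dw)
{
    assert(dw <= static_cast<Dword>(INT_MAX) && "value does not fit in int");
    return static_cast<float>(static_cast<int>(dw));
}
```

A call such as `small_dword_to_float(1024u)` then reads as a deliberate choice, and an out-of-range value trips the assert in a debug build instead of silently producing a negative float.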

Julian90

736

February 02, 2007 06:57 AM

Quote:Not undefined - perfectly well defined - but potentially incorrect, yes.

Not necessarily

Quote:4.7 Integral conversions
3. If the destination type is signed, the value is unchanged if it can be represented in the destination type (and bit-field width); otherwise, the value is implementation-defined.

Random Link

TheUnbeliever

963

February 02, 2007 10:15 AM

Quote:Original post by Julian90
Quote:Not undefined - perfectly well defined - but potentially incorrect, yes.
Not necessarily
Quote:4.7 Integral conversions
3. If the destination type is signed, the value is unchanged if it can be represented in the destination type (and bit-field width); otherwise, the value is implementation-defined.

There is a very distinct difference between undefined behaviour and implementation defined behaviour. The latter is valid, if dubious - I don't know the C++ definition but the C definition is that it is conforming but neither strictly conforming nor maximally portable (as the latter of these two relies on the former). Relying on a specific behaviour which is undefined is a Bad Thing and should be considered very poor practice indeed.

A compiler has to handle implementation defined behaviour. For undefined behaviour it can do whatever it likes - throw an error, compile it and let the user burn themselves - etc.
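If you would rather not depend on the implementation-defined result at all, the wrap-around can be spelled out by hand. A minimal sketch, assuming 32-bit fixed-width types; `fits_in_int32`, `to_int32_wrapping` and `check_wrapping` are illustrative names, not anything from the standard library.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// True when the value is representable as a 32-bit signed int, in which
// case the standard guarantees the conversion leaves it unchanged.
inline bool fits_in_int32(std::uint32_t v)
{
    return v <= static_cast<std::uint32_t>(std::numeric_limits<std::int32_t>::max());
}

// Hypothetical helper: yields the two's complement interpretation of v
// using only conversions of in-range values.
inline std::int32_t to_int32_wrapping(std::uint32_t v)
{
    if (fits_in_int32(v)) {
        return static_cast<std::int32_t>(v);
    }
    // v >= 2^31 here, so (v - 2^31) fits in int32; adding INT32_MIN then
    // gives v - 2^32 without any signed overflow.
    return static_cast<std::int32_t>(v - 0x80000000u)
           + std::numeric_limits<std::int32_t>::min();
}

// Small self-check using the value from earlier in the thread.
void check_wrapping()
{
    assert(to_int32_wrapping(0xDEADBEEFu) == -559038737);
    assert(to_int32_wrapping(123u) == 123);
}
```

Compilers targeting two's complement machines typically give the same answer for a plain `static_cast<int>`, and from C++20 onward the standard itself defines the conversion as modulo 2^N, so this mostly matters for older standards and unusual platforms.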

[TheUnbeliever]

This topic is closed to new replies.