Ned_K

What is going on with: int i = *(int*)&x where x is a float?


Ok, I'm sure many have seen this snippet for computing inverse square roots:

```cpp
float InvSqrt(float x)
{
    float xhalf = 0.5f*x;
    int i = *(int*)&x;          // get bits for floating value
    printf("%x\n", i);
    i = 0x5f3759df - (i>>1);    // gives initial guess y0
    x = *(float*)&i;            // convert bits back to float
    x = x*(1.5f - xhalf*x*x);   // Newton step, repeating increases accuracy
    return x;
}
```

What's happening with this line?

int i = *(int*)&x;

Is it taking the address of the float x, casting that pointer to an int pointer, then dereferencing it and storing the result in an int? Is that even defined? Now that I'm looking for it, I also see similar code using the *(int*)&var pattern elsewhere.

I suppose you could do this if you wanted to be able to access the individual bits of a floating-point number, since that's what you would have in the integer. I have no idea how valid this sort of thing is, but it certainly looks terribly ugly and unsafe.

Quote:
Original post by MJP
I suppose you could do this if you wanted to be able to access the individual bits of a floating-point number, since that's what you would have in the integer. I have no idea how valid this sort of thing is, but it certainly looks terribly ugly and unsafe.


Write-up on it:

http://www.lomont.org/Math/Papers/2003/InvSqrt.pdf

But he focuses on the constant a few lines down and the overall efficiency gains.

I'm just hung up on the syntax of that one line.

It is of course designed for 4-byte integers and single-precision floating-point values, so its behaviour isn't something the C++ standard guarantees.

It looks ugly, but all it does is place the exact same 32 bits that came from a float into what is now interpreted as an integer.

This is so the bitshift and bit manipulation can be performed.

Of course, the constant 0x5f3759df is also dependent on 32 bit integers and IEEE 754.

Two little notes:

1) That code can break on a compiler that supports strict-aliasing optimizations, if they are enabled. Casting through a union is the most common workaround. See this page for more details: http://www.cellperformance.com/mike_acton/2006/06/understanding_strict_aliasing.html

2) The compiler/CPU can do screwy stuff when converting between an integer and float if you set the wrong bits. We had a bug in our endian swapping code a while back where converting an int to a float via address casting caused NaNs in very specific cases. I don't recall the specifics, but it went all the way down to getting the value into a register. I think it was doing something like:

```cpp
float fData;
Read(&fData, sizeof(float)); // Reads in unswapped data, which involves interpreting the data as an int.
Swap(&fData, sizeof(float)); // Swaps the data to the correct endian
```

If the unswapped data bits were arranged just right, the value stored in fData was converted to a NaN, basically corrupting the data before the Swap occurred.

Basically, we needed to do this instead:

```cpp
uint32 nData;
Read(&nData, sizeof(uint32)); // Reads in unswapped data
Swap(&nData, sizeof(uint32)); // Swaps the data to the correct endian
float fData = UnionCast<float>(nData); // Convert the representation to a float
```

Moral: Be really careful when converting between types. Even when you know what you are doing, it's very easy to introduce either a compiler-specific bug or a value-specific bug.
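For reference, here's a sketch of what a UnionCast-style helper might look like. This is my own guess at the shape of such a template, not the poster's actual code, and it relies on the same read-the-other-union-member trick discussed elsewhere in this thread:

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical sketch of a UnionCast helper: reinterpret the bits of one
// type as another type of the same size by writing one union member and
// reading the other (widely supported, but not blessed by the C++ standard).
template <typename To, typename From>
To UnionCast(From from)
{
    static_assert(sizeof(To) == sizeof(From), "types must be the same size");
    union { From f; To t; } u;
    u.f = from;
    return u.t;
}
```

With IEEE 754 floats, UnionCast<uint32_t>(1.0f) yields 0x3F800000, and casting those bits back recovers 1.0f.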

Quote:
Is that converting float x to a pointer, casting it to an int pointer and then dereferencing it and putting the result in an int?


```cpp
float pi = 3.14159f;

int a = (int)pi;     // value conversion: a == 3 (00000003h)
int b = *(int*)&pi;  // reinterpretation of pi's bits
```

The second form gives you a bitwise copy of pi, so you can perform bitwise operations on it. It most certainly is not 00000003h.

Long story short, accessing a memory address through a type-punned pointer (other than char*) gives undefined behaviour according to the standard. In practice this works as expected on most common platforms, though it will break strict aliasing as pointed out above. "As expected" means that it gives you the bit pattern of the floating-point number, which you can't get through an ordinary value conversion.

Most compilers support an extension that allows you to do the same thing via unions:
```cpp
union { float f; unsigned int i; } converter;
converter.f = someFloat;
someInt = converter.i;
```
I actually prefer this method quite a bit, but it too isn't fully supported by the standard. It's a very common compiler extension though, so in practice it's safe to use.

The only standards-compliant way to do this conversion is by casting the float to a char* and reconstructing it into an int one byte at a time:
```cpp
compile_time_assert(sizeof(unsigned int) == sizeof(float)); // not guaranteed

unsigned char* src  = (unsigned char*)&someFloat;
unsigned char* dest = (unsigned char*)&someInt;
for (size_t i = 0; i < sizeof(float); i++)
{
    dest[i] = src[i];
}
```
But this code is overbearing for the sake of ultra-correctness when the other two methods work with most known combinations of platform and compiler manufacturer.

(Type-punning == referencing a memory address by something other than its actual type.)

As stated, I'd prefer the union method to the in-place pointer casting, but if you really need to do this I suppose it's up to you which you choose.
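For what it's worth, the byte-by-byte loop above is exactly what memcpy does, and compilers typically optimize a fixed-size memcpy down to a plain register move, so the "ultra-correct" version doesn't have to cost anything. A minimal sketch (FloatBits is my own name for the helper, and it assumes float and unsigned int are the same size):

```cpp
#include <cassert>
#include <cstring>

// Standards-friendly bit extraction: copy the float's object representation
// into an unsigned int of the same size via memcpy.
unsigned int FloatBits(float f)
{
    unsigned int u = 0;
    std::memcpy(&u, &f, sizeof u); // assumes sizeof(float) == sizeof(unsigned int)
    return u;
}
```

On an IEEE 754 platform, FloatBits(1.0f) gives 0x3F800000.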

Quote:
Original post by exwonder
Most compilers support an extension that allows you to do the same thing via unions:

```cpp
union { float f; unsigned int i; } converter;
converter.f = someFloat;
someInt = converter.i;
```

I actually prefer this method quite a bit, but it too isn't fully supported by the standard. It's a very common compiler extension though, so in practice it's safe to use.

I would recommend casting over converting via a union since the former is permitted by the Standard (albeit with implementation-defined behaviour) and the latter is prohibited, even if it does work in practice. A cast is also clearer, especially if you use the form reinterpret_cast< int & >(x).
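To make that form concrete, here's a minimal sketch of the reference-style cast (my own wrapper name; it carries the same strict-aliasing caveats discussed above and assumes a 32-bit IEEE 754 float):

```cpp
#include <cassert>
#include <cstdint>

// Reference-style reinterpret_cast: read the float's storage as a uint32_t.
// Implementation-defined and aliasing-sensitive, but clearer than *(int*)&x.
uint32_t BitsViaCast(float& x)
{
    return reinterpret_cast<uint32_t&>(x);
}
```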

Σnigma

Quote:
Original post by Enigma
I would recommend casting over converting via a union since the former is permitted by the Standard (albeit with implementation-defined behaviour) and the latter is prohibited, even if it does work in practice. A cast is also clearer, especially if you use the form reinterpret_cast< int & >(x).

I'll have to re-look up the union thing at some point.

reinterpret_cast is definitely a good solution, though if I recall correctly, a value reinterpret_cast from int to float (as opposed to casting pointers or references) is not allowed. So it won't be helpful in every situation where you'd want to manipulate the bits of a float directly.

As messy as it is, the C-style cast does capture the fact that the conversion will most likely have to travel between register sets via memory, which is a nice side-effect, though one that most people won't be thinking about.

In Java, you can do this:


```java
float f = 3.141592654f;
int i = Float.floatToIntBits(f);
```


Ah, one of those rare cases where a high-level language does a low-level thing better than C/C++ :)

Code like this scares the devil out of me. If you want to do yourself a favour, never use such constructs. They may work now, or they may appear to work, but eventually at some point in the future, it will fail (and probably you will not even know what hit you).

Hacks are ok if they are needed. They still aren't pretty, but if they get you what you can't otherwise achieve, that's ok (for me at least, opinions differ).
However, in this case, it isn't needed at all. Using SSE math (or similar) is a much better alternative. Two steps of Newton-Raphson will take around a dozen cycles. I doubt that the above code runs significantly faster than that.

Quote:
Original post by samoth
Code like this scares the devil out of me. If you want to do yourself a favour, never use such constructs. They may work now, or they may appear to work, but eventually at some point in the future, it will fail (and probably you will not even know what hit you).

Hacks are ok if they are needed. They still aren't pretty, but if they get you what you can't otherwise achieve, that's ok (for me at least, opinions differ).
However, in this case, it isn't needed at all. Using SSE math (or similar) is a much better alternative. Two steps of Newton-Raphson will take around a dozen cycles. I doubt that the above code runs significantly faster than that.


I second the above post.

That said, good advice is not always heeded, for the simple fact that people will always want to know if something can be done... so expect dodgy hacks to hang around forever, because someone will always want to know if they can pull something off faster or more efficiently than the 'safe' version. Of course, if we all end up progressing to managed code, perhaps crap like this won't come up anymore (though I doubt it) :P

~Shiny

Quote:
Original post by MJP
I suppose you could do this if you wanted to be able to access the individual bits of a floating-point number, since that's what you would have in the integer. I have no idea how valid this sort of thing is, but it certainly looks terribly ugly and unsafe.


Huhu, that's funny, because in the DirectX documentation, they recommend the use of :

*(DWORD *)&floatValue

to pass floating values as DWORD render states values :)

(anyway I agree it's ugly, the first time I saw this in the documentation, I believed it was a sort of joke :p)

Quote:
Original post by paic
Quote:
Original post by MJP
I suppose you could do this if you wanted to be able to access the individual bits of a floating-point number, since that's what you would have in the integer. I have no idea how valid this sort of thing is, but it certainly looks terribly ugly and unsafe.

Huhu, that's funny, because in the DirectX documentation, they recommend the use of :
*(DWORD *)&floatValue
to pass floating values as DWORD render states values :)
(anyway I agree it's ugly, the first time I saw this in the documentation, I believed it was a sort of joke :p)


The point here is that it does work dependably on almost all platforms, and is used by most software. Strict-aliasing optimizations never really seemed to catch on, so in practice there aren't many problems with it.

Quote:
Original post by paic
Quote:
Original post by MJP
I suppose you could do this if you wanted to be able to access the individual bits of a floating-point number, since that's what you would have in the integer. I have no idea how valid this sort of thing is, but it certainly looks terribly ugly and unsafe.


Huhu, that's funny, because in the DirectX documentation, they recommend the use of :

*(DWORD *)&floatValue

to pass floating values as DWORD render states values :)

(anyway I agree it's ugly, the first time I saw this in the documentation, I believed it was a sort of joke :p)


That is kind of funny and kind of tragic. Microsoft has encouraged a lot of bad practices over the years, and as we all know, we shouldn't take that as a sign of generally acceptable style, though I do see people justify things in this manner at times. I had to untrain some of my coworkers from disrespecting the type system (still a work in progress, at times). My general advice: if you have to do something ugly and possibly non-portable, use asserts to gain at least some confidence that it did what you think it did, and hide it off in a prettier, facade-like piece of code so clients aren't affected by the ugliness.

Quote:
Original post by Boder
It is of course designed for 4-byte integers and single-precision floating-point values, so its behaviour isn't something the C++ standard guarantees.

It looks ugly, but all it does is place the exact same 32 bits that came from a float into what is now interpreted as an integer.

This is so the bitshift and bit manipulation can be performed.

Of course, the constant 0x5f3759df is also dependent on 32 bit integers and IEEE 754.


It's also processor-dependent due to byte order. IEEE 754 specifies bit order, but it does not specify how the bytes that make up those bits get laid out in memory. That layout may or may not match the endianness of the processor's integer representation; if it doesn't, the bit shift isn't going to behave as expected.
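One way to check this assumption on a given platform (my own sketch, not code from the thread): compare the byte layout of a float against the byte layout of an integer holding the same bit pattern.

```cpp
#include <cassert>
#include <cstring>

// Returns true if float bytes and unsigned int bytes are laid out in the
// same memory order, i.e. the pun-then-shift trick behaves as expected.
// 1.0f is 0x3F800000 in IEEE 754 single precision.
bool FloatIntSameByteOrder()
{
    float f = 1.0f;
    unsigned int u = 0x3F800000u;
    unsigned char fb[sizeof(float)];
    unsigned char ub[sizeof(unsigned int)];
    std::memcpy(fb, &f, sizeof f);
    std::memcpy(ub, &u, sizeof u);
    return sizeof f == sizeof u && std::memcmp(fb, ub, sizeof f) == 0;
}
```

On the common desktop platforms (x86, and ARM in its usual configurations) this returns true.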

A few points:

1. Both unions and casting have their issues. You only use code like this when you have a test beforehand to catch whether it fails. This test need only be run once when the app starts, and it warns you. This way you catch compiler differences, architecture differences, etc.

2. You should not use a signed int, since right-shifting a signed int is implementation-defined.

3. Also, signed ints are not always two's complement. There are (if I recall) at least three possible ways a C compiler can store signed ints.

4. Obviously some architectures don't have 32-bit ints, and the code will also break there.

Hence, use an unsigned int, and if you need to put stuff like this in your code, include a test function that gets called to verify all your assumptions. Then your code does not introduce hard-to-track-down bugs, because every time you switch compilers/platforms your tests quickly catch the assumption changes.
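A sketch of what such a startup self-test might look like (the name and the exact set of checks here are my own; extend it to whatever your code assumes):

```cpp
#include <cassert>
#include <cstring>

// Startup self-test for the bit-trick's assumptions: 32-bit unsigned int,
// 32-bit float, and the IEEE 754 bit pattern we expect for 1.0f.
bool PunningAssumptionsHold()
{
    if (sizeof(unsigned int) != 4 || sizeof(float) != 4)
        return false;
    float one = 1.0f;
    unsigned int bits = 0;
    std::memcpy(&bits, &one, sizeof bits);
    return bits == 0x3F800000u; // IEEE 754 single-precision 1.0f
}
```

Call it once at startup and warn (or refuse to run) if it returns false, so a compiler or platform switch fails loudly instead of corrupting results silently.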

To those of you saying never to use such constructs, you are missing the point. Sometimes you *NEED* a little more performance, in which case you do stuff like this. Saying never to use it is like saying never to use inline assembly, never to unroll loops, never to reorganize clean code into messy code to work around cache issues, etc. The whole point of code like this is that a simple 1.0f/sqrt(val) is simply way too inefficient for some uses.

For those unfamiliar with the issues, google the constant in the code and read up.

Does an in-depth discussion of type-punning (especially in C) and this *(int*)&x sort of construct exist somewhere? I can't seem to find one. Just bits and pieces. Wikipedia seems to have the largest chunk of info in one place but it's pretty weak.

Quote:
To those of you saying never to use such constructs, you are missing the point. Sometimes you *NEED* a little more performance, in which case you do stuff like this.
Sorry, but I have to object. I was saying that hacks are ok, if they are needed. Here they aren't needed.

While it is true that your bit trick runs about 6.8 times faster than the standard function when fully optimised (counting clock ticks on an AMD64 single core system using gcc 4.2), providing -msse -mfpmath=sse on the commandline causes the compiler to generate code that runs almost twice as fast, with full precision, without punning pointers, and without possible gotchas.
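For the curious, the SSE route looks roughly like this. This is my own sketch using the _mm_rsqrt_ss intrinsic plus one Newton-Raphson step (it's x86-specific, whereas in the test above the compiler generated comparable code automatically from plain 1.0f/sqrt(x)):

```cpp
#include <cassert>
#include <xmmintrin.h> // SSE intrinsics (x86/x64 only)

// Approximate 1/sqrt(x): the hardware rsqrtss estimate (~12 bits of
// precision) refined by one Newton-Raphson step.
float InvSqrtSSE(float x)
{
    float y = _mm_cvtss_f32(_mm_rsqrt_ss(_mm_set_ss(x)));
    return y * (1.5f - 0.5f * x * y * y);
}
```

No pointer punning, no magic constant, and the precision after the refinement step is close to full single precision.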

Quote:
Original post by Ned_K
Does an in-depth discussion of type-punning (especially in C) and this *(int*)&x sort of construct exist somewhere? I can't seem to find one. Just bits and pieces. Wikipedia seems to have the largest chunk of info in one place but it's pretty weak.

Brian Hook's book Write Portable Code has a discussion on this issue. It basically says what I did in the post above, but is a little more in-depth.

Here's a tinyurl link to the excerpt on google reader... I hope it works:

http://tinyurl.com/yukdnx

Quote:
Original post by samoth
Quote:
To those of you saying never to use such constructs, you are missing the point. Sometimes you *NEED* a little more performance, in which case you do stuff like this.
Sorry, but I have to object. I was saying that hacks are ok, if they are needed. Here they aren't needed.

While it is true that your bit trick runs about 6.8 times faster than the standard function when fully optimised (counting clock ticks on an AMD64 single core system using gcc 4.2), providing -msse -mfpmath=sse on the commandline causes the compiler to generate code that runs almost twice as fast, with full precision, without punning pointers, and without possible gotchas.


You assume SSE is available and your compiler knows how to use it, which is an even bigger assumption across platforms. The above code is quite portable, and works on almost any platform with IEEE 754 floats, and even a few that don't (but have nearly conformant floats). So ask yourself - between SSE and the above code, which is really more portable when you do *need* the speed?

The trick posted was made popular in the Quake source code, and there was no SSE available at that time. Often in a game you don't need "full precision" since it could be used for computing normals for lighting, and who cares if they are off by a few %.

If you're writing a per pixel rasterizer, for example, for a handheld on say an ARM, where you don't have SSE, and the screen is pretty small, you need speed, and the above code is *exactly* what you want. You don't have the luxury of SSE, pixel shaders, GPUs, and other numerical tricks.

In short, any good programmer should understand *each* of the approaches and choose the one that does what you need. Which gets back to the fact that the original poster's code may be needed.
