casting double* to float*

Started by
22 comments, last by frob 8 years, 1 month ago

Hi,

So I have this function that looks like this:


glfwGetCursorPos(_IN_ GLFWwindow *window, _OUT_ double *x, _OUT_ double *y);

What it basically does is write the mouse position out through the x and y arguments.

So if you do something like this, it works fine.


double x = 0.0, y = 0.0;
glfwGetCursorPos(window, &x, &y);

std::cout << x << " " << y << std::endl;

However, if I try this, I get a value of zero. I don't know why.


float x = 0.0f, y = 0.0f;
glfwGetCursorPos(window, (double*)&x, (double*)&y);

std::cout << x << " " << y << std::endl;
You're telling the function you're calling that your pointer points to memory that has enough room to store a double, but the actual memory is only big enough for a float.

Getting zero back is the least of what could go wrong: you are corrupting the stack. The additional 32 bits of the double are being written to whatever comes after the space reserved for your float. The most likely problem is that you'll overwrite a different local variable in your function, but you could also overwrite the saved frame pointer or return address.


Stack view for this scenario on x86/x64:
                                          +-----------------+
                                          |                 |
                                          |        +------------------+
                                          |        |        |         |
<- glfwGetCursorPos locals | ptr window | ptr &x | ptr &y | float x | float y | ptr {rBP} | ptr retAddr |
                           (      pushed call arguments   )
                                                          ( your locals                   )
                                                                                          ( return to whoever called you )
Write to *x overwrites both x and y:                      | *x as double      |
Write to *y overwrites both y and rBP:                              | *y as double  ------|
rBP = EBP on x86, RBP on x64. The saved frame pointer is only there if "omit frame pointer" is turned off; if it is omitted, you'll stomp retAddr instead.

On x64, RBP and the return address are 64-bit, so you'll overwrite half of one of them with part of your double.
On x86, EBP and the return address are 32-bit, so the second half of the double written through y completely overwrites one of them.

In your calling function, your local variables that you take the address of must be doubles. You can cast them to float *after* you get them back.

If your function has other local variables besides just x and y, they will be somewhere on the stack as well, and could be getting overwritten instead of rBP/retAddr.

(edited like 50 times: ASCII art is a pain)
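
The overlap described above can be reproduced safely in a small sketch. It assumes a 4-byte float, an 8-byte double, and two floats laid out back to back; `TwoFloats` and `overlayDouble` are made-up names for illustration. It copies the bytes with `std::memcpy` into an 8-byte struct, so the demonstration itself stays well-defined, which the cast in the question is not.

```cpp
#include <cstring>

// Two adjacent floats, standing in for the locals x and y on the stack.
// (Hypothetical type, only for this illustration.)
struct TwoFloats {
    float x;
    float y;
};

static_assert(sizeof(double) == 2 * sizeof(float),
              "sketch assumes an 8-byte double and a 4-byte float");
static_assert(sizeof(TwoFloats) == sizeof(double),
              "sketch assumes no padding between x and y");

// Copy all 8 bytes of a double over both floats, which is what a write
// through a float* cast to double* would do to x and its neighbour.
TwoFloats overlayDouble(double value) {
    TwoFloats out{};
    std::memcpy(&out, &value, sizeof(value));
    return out;
}
```

On a little-endian machine such as x86, `overlayDouble(123.0)` leaves `x` equal to 0, because the low 32 bits of a whole-number double of that size are all zero. If the cursor coordinates happen to be whole numbers, that may be why the original code printed zero, though with undefined behavior nothing is guaranteed.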

hahaha :D Nice ASCII art skills

Awesome explanation. Thank you!

That sucks. I was hoping there would be a way to cast them directly to float without having to create a new double variable and then cast it to a float. Oh well.

Speaking of double and float: float is still faster than double on the CPU, correct? Having my Vector2 class use double rather than float would slow performance down. Is that still the case in 2016?


I'm not sure if this is the case, but is there a reason you need them to be doubles? That's fairly overkill. You could also consider making your Vector class a template, so you can use either doubles or floats as needed.
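
A minimal sketch of what such a template might look like. The member names, the two operators, and the `Vector2f`/`Vector2d` aliases are assumptions for illustration, not the original poster's class.

```cpp
// Hypothetical Vector2 sketch: the component type is a template parameter,
// so the same code serves both float and double.
template <typename T>
struct Vector2 {
    T x{};
    T y{};

    constexpr Vector2() = default;
    constexpr Vector2(T x_, T y_) : x(x_), y(y_) {}

    // Explicit conversion from a Vector2 with another component type,
    // so narrowing from double to float is always visible at the call site.
    template <typename U>
    constexpr explicit Vector2(const Vector2<U>& other)
        : x(static_cast<T>(other.x)), y(static_cast<T>(other.y)) {}

    constexpr Vector2 operator+(const Vector2& rhs) const {
        return {x + rhs.x, y + rhs.y};
    }

    constexpr Vector2 operator*(T scale) const {
        return {x * scale, y * scale};
    }
};

using Vector2f = Vector2<float>;
using Vector2d = Vector2<double>;
```

With this, a position read as doubles could be held in a `Vector2d` and narrowed once with `Vector2f f(d);`, while the rest of the code keeps using `Vector2f`.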

I agree, I feel double is overkill. That's why I'm trying to cast it to float. The GLFW library only gives doubles. I don't have a choice in the matter.

Just do this and hide it somewhere so you never have to look at it again:


double dx, dy;
glfwGetCursorPos(window, &dx, &dy);
float myX, myY;
myX = (float)dx;
myY = (float)dy;

Yeah, that is exactly what I did.

The most significant performance hit from using doubles will probably come from CPU cache misses and memory bandwidth, since they take up twice as much memory.

In addition, SSE instructions can handle four floats at a time, but only two doubles at a time. So for code which uses them, the performance hit can be significant.

For basic operations on values in registers, floats aren't significantly faster than doubles. For some more complex operations (like division) doubles will be slower than floats, as they have more precision, but overall you probably won't notice much difference.

For details on specific instructions look at Intel's optimization manual - http://www.intel.com/content/www/us/en/architecture-and-technology/64-ia-32-architectures-optimization-manual.html (for example you can compare the divsd and divss instructions there, to see the timings for float vs double division).
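
The memory and SSE points above come down to simple arithmetic, sketched here. The helper names are made up, the 64-byte cache line is an assumed typical size, and the figures assume a 4-byte float and an 8-byte double.

```cpp
#include <cstddef>

// Bytes needed to store `count` two-component vectors with component type T.
template <typename T>
constexpr std::size_t vectorArrayBytes(std::size_t count) {
    return count * 2 * sizeof(T);
}

// Number of T values that fit in one 128-bit (16-byte) SSE register.
template <typename T>
constexpr std::size_t lanesPer128BitRegister() {
    return 16 / sizeof(T);
}

// Number of two-component vectors that fit in one cache line.
// 64 bytes is an assumption; the real size depends on the CPU.
template <typename T>
constexpr std::size_t vectorsPerCacheLine(std::size_t lineBytes = 64) {
    return lineBytes / (2 * sizeof(T));
}
```

Under those assumptions, an array of double vectors takes twice the bytes of the same array of float vectors, and half as many vectors fit in each cache line or SSE register.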

Yep, that's what I thought. I have been reading a lot about cache optimization and fitting everything into a cache line. It's very interesting.

Thanks

Never use C casts. Use static_cast.
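
One way this advice might look when applied to the snippet earlier in the thread. `getPositionAsDoubles` is a hypothetical stand-in for an API that reports a position through double out-parameters, as glfwGetCursorPos does, so the sketch compiles without GLFW. `getPositionAsFloats` is likewise a made-up helper name.

```cpp
#include <utility>

// Hypothetical stand-in for a C-style API with double out-parameters.
// The values written here are arbitrary sample data.
void getPositionAsDoubles(double* x, double* y) {
    *x = 640.5;
    *y = 360.25;
}

// Read into real doubles first, then narrow each value explicitly.
std::pair<float, float> getPositionAsFloats() {
    double dx = 0.0;
    double dy = 0.0;
    getPositionAsDoubles(&dx, &dy);
    return {static_cast<float>(dx), static_cast<float>(dy)};
}
```

With GLFW, the call inside the helper would be `glfwGetCursorPos(window, &dx, &dy);`. A side benefit of `static_cast` is that `static_cast<double*>(&x)` on a `float x` does not compile, so the pointer cast from the original question would have been rejected instead of silently corrupting memory.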


L. Spiro
