# Floor function definition

#1JohnnyCode

Posted 13 August 2013 - 06:21 PM

Hi.

Is it possible to return the closest integer to a floating-point number by defining a function that uses only the +, -, *, / operations?

#2Álvaro

Posted 13 August 2013 - 06:30 PM

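float round(float x) {
    x += 12582912.0f;  // 3 * 2^22: the sum lands where consecutive floats are 1 apart
    x -= 12582912.0f;  // back to the original range, with the fraction rounded away
    return x;
}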

I am not sure of the exact range of inputs for which that works, but I think it's the best you can do.

EDIT: Removed meaningless ".5f" at the end of the first constant: That bit is out of the precision of the float, so the code will compile to the same thing whether you put a ".0f" or a ".5f" there.

#3JohnnyCode

Posted 13 August 2013 - 06:53 PM

I must admit I do not see how this could work, though. Could you elaborate on the definition a bit, about the two constants and such? Thanks a lot.

#4Álvaro

Posted 13 August 2013 - 07:25 PM

Let's start with a number that is smallish (say, less than 4 million). The constant is designed so that the sum of x plus the constant has an exponent large enough that the 24 bits of precision in a float represent exactly the integers (meaning, the distance between consecutive floats in this range is 1). That sum is where the rounding happens, because the bits that don't fit are discarded, hopefully with some reasonable rounding rules. Subtracting the magic constant brings the number back to its original range, but we have lost the bits beyond the integer part.
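To make the explanation above concrete, here is a minimal C sketch of the same trick. The name `round_magic` is hypothetical (the thread's `round` would clash with `<math.h>`), and the `volatile` temporary is an assumption added here so that a compiler cannot keep extra precision or fold the two steps away. It assumes IEEE 754 single precision and the default round-to-nearest-even mode.

```c
#include <stdio.h>

/* Hypothetical helper name; same arithmetic as the snippet posted in the thread.
 * Assumes IEEE 754 binary32 floats and the default round-to-nearest-even mode. */
static float round_magic(float x) {
    /* 12582912 = 3 * 2^22 = 1.5 * 2^23. For |x| below 2^22 the sum falls in
     * [2^23, 2^24), where neighbouring floats are exactly 1 apart, so the
     * addition itself rounds x to an integer. */
    volatile float t = x + 12582912.0f;
    /* Subtracting the constant is exact and brings the value back down. */
    t = t - 12582912.0f;
    return t;
}

/* Small demo routine (also hypothetical): prints a few rounded samples. */
static void print_round_magic_samples(void) {
    const float samples[] = { 2.3f, 2.7f, -2.3f, -2.7f, 0.5f, 1.5f, 2.5f };
    for (size_t i = 0; i < sizeof samples / sizeof samples[0]; ++i) {
        printf("%g -> %g\n", samples[i], round_magic(samples[i]));
    }
}
```

Under those assumptions, halfway cases go to the even neighbour, so 1.5f and 2.5f both come out as 2.0f. That is rounding to nearest, not a floor.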

#5JohnnyCode

Posted 14 August 2013 - 01:37 AM

What would the constant be for 64-bit floats? I guessed the same one, assuming the exponent range carries over to 64 bit, but it does not seem to work.

I could truncate the float by shifting the significand bits to the right by (bias) - (unbiased exponent) places and then setting the exponent to the bias. But I cannot afford bitwise operations, only algebraic ones. How could I do this with algebraic operations? Thanks a lot.

#6Hodgman

Posted 14 August 2013 - 01:46 AM

Is this some kind of arbitrary challenge? What's the actual problem, and why can you only use +,-,*,/?

#7JohnnyCode

Posted 14 August 2013 - 02:00 AM

Is this some kind of arbitrary challenge? What's the actual problem, and why can you only use +,-,*,/?

I need to do the operation on the GPU, taking a float and producing a float. So I cannot use things like modulo or bitwise ops. Using only standard operations would also make this definable as a proper math function, but that is not my concern.

#8Hodgman

Posted 14 August 2013 - 02:37 AM

GPUs can do floor natively, so using the built-in floor function will most likely be much faster than emulating it yourself using a bunch of arithmetic.

Is there a reason you're avoiding the built-in implementation?

#9Álvaro

Posted 14 August 2013 - 02:49 AM

The constant for a double would be 3*2^51 = 6755399441055744.0
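A double has 53 bits of precision, so the float constant leaves the fractional bits intact and the sum has to be pushed up to [2^52, 2^53) instead. A minimal sketch of the double version follows; the names `round_magic_double` and `float_constant_on_double` are hypothetical, the `volatile` temporaries are an assumption to keep the compiler from simplifying the arithmetic, and IEEE 754 double precision with round-to-nearest-even is assumed.

```c
/* Hypothetical helper name. Assumes IEEE 754 binary64 doubles and the
 * default round-to-nearest-even mode; intended for |x| below 2^51. */
static double round_magic_double(double x) {
    /* 6755399441055744 = 3 * 2^51 = 1.5 * 2^52, so the sum falls in
     * [2^52, 2^53), where neighbouring doubles are exactly 1 apart. */
    volatile double t = x + 6755399441055744.0;
    t = t - 6755399441055744.0;
    return t;
}

/* For comparison: the float constant applied to a double does no useful
 * rounding, because doubles near 12582912 still carry many fraction bits. */
static double float_constant_on_double(double x) {
    volatile double t = x + 12582912.0;
    t = t - 12582912.0;
    return t;
}
```

With these assumptions `round_magic_double(2.7)` gives 3.0, while `float_constant_on_double(2.75)` returns 2.75 unchanged, which matches the report that reusing the 32-bit constant did not work.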

#10JohnnyCode

Posted 15 August 2013 - 05:41 PM

Álvaro, you rule the world. The posted hacks work like a charm for 32- and 64-bit IEEE floats.

