C++ can't find a match for a 16-bit float, and how to convert a 32-bit float to a 16-bit one

11 comments, last by alh420 9 years ago

I need to make a floating-point texture; however, the device I'm programming for supports only 16-bit floating-point textures, so I need to get the data into that format somehow.

The problem is that I can't find any specification that defines a 16-bit float.

One guy told me it's called half, but the compiler says there's no such thing ;]

So basically I want to convert float * arr32bit; to something * arr16bit;

I think simple casting should do the work, but who knows...

"device i program to"

How about saying which device that is? Presumably there is a library you use that defines types and functions which operate on those types. Since C++ itself doesn't know what a half-float is, it must be implemented at the library level. See, for example, the NVIDIA half-float extension for OpenGL.


See e.g. this wikipedia article.

Well, I thought C++ had it ;]

I'm trying to do this on a Sony Xperia J with Android 4.1.2; it uses OpenGL ES 2.

I'm digging through

https://www.khronos.org/registry/gles/extensions/OES/OES_texture_float.txt

to find a corresponding definition of a 16-bit float, but I can't find anything usable.

Now I am wondering if I could just create an empty 16-bit float texture and write 32-bit floats to it from a fragment shader, so that it somehow puts a 16-bit float pixel there.

thanks


"Now I am wondering if I could just create an empty 16-bit float texture and write 32-bit floats to it from a fragment shader, so that it somehow puts a 16-bit float pixel there."

OpenGL converts texture data as you upload it (via glTexImage).

If you specify an internal format that is 16-bit float, and a format and type matching your 32-bit float image data, the driver will perform the necessary conversion.

Tristam MacDonald. Ex-BigTech Software Engineer. Future farmer. [https://trist.am]

C++ itself doesn't define a 16-bit float type as far as I know, though I can imagine there are platform-specific tool-chains that might support them (Lots of DSPs use 24bit floats and their compilers support them, presumably).

You see 16bit floats mostly in things like GLSL/HLSL shaders, or an API like OpenGL ES might provide a library type and conversions.

You can also do the conversion yourself by banging bits if your intent is simply to upload the values to a device that handles them natively. I don't think you can just truncate the mantissa and exponent, but the math is straightforward, if a bit exacting.
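
To illustrate the bit banging described above, here is a minimal sketch of a float-to-half conversion, assuming the platform's float is IEEE 754 binary32 and the target expects IEEE 754 binary16. The names float_to_half and floats_to_halves are made up for this example and do not come from any library mentioned in this thread. It rebiases the exponent, rounds the mantissa to nearest (ties to even), and handles overflow, subnormals, infinity and NaN.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical helper: convert one binary32 float to binary16 bits.
inline std::uint16_t float_to_half(float f)
{
    std::uint32_t bits;
    std::memcpy(&bits, &f, sizeof bits);            // read the bit pattern without aliasing issues

    const std::uint32_t sign  = (bits >> 16) & 0x8000u;
    const std::uint32_t exp32 = (bits >> 23) & 0xFFu;
    std::uint32_t mant        = bits & 0x007FFFFFu;

    if (exp32 == 0xFFu)                             // infinity or NaN (keep NaN as a quiet NaN)
        return static_cast<std::uint16_t>(sign | 0x7C00u | (mant ? 0x0200u : 0u));

    const int exp16 = static_cast<int>(exp32) - 127 + 15;   // rebias 8-bit exponent to 5 bits

    if (exp16 >= 31)                                // too large for half: overflow to infinity
        return static_cast<std::uint16_t>(sign | 0x7C00u);

    if (exp16 <= 0) {                               // result is a half subnormal or zero
        if (exp16 < -10)                            // too small even for the smallest subnormal
            return static_cast<std::uint16_t>(sign);
        mant |= 0x00800000u;                        // restore the implicit leading 1
        const int shift = 14 - exp16;               // between 14 and 24
        std::uint32_t half_mant     = mant >> shift;
        const std::uint32_t rem     = mant & ((1u << shift) - 1u);
        const std::uint32_t halfway = 1u << (shift - 1);
        if (rem > halfway || (rem == halfway && (half_mant & 1u)))
            ++half_mant;                            // a carry here yields the smallest normal half
        return static_cast<std::uint16_t>(sign | half_mant);
    }

    std::uint32_t half = (static_cast<std::uint32_t>(exp16) << 10) | (mant >> 13);
    const std::uint32_t rem = mant & 0x1FFFu;       // the 13 bits that do not fit
    if (rem > 0x1000u || (rem == 0x1000u && (half & 1u)))
        ++half;                                     // a carry may bump the exponent, up to infinity
    return static_cast<std::uint16_t>(sign | half);
}

// Hypothetical helper: convert a whole array, e.g. texture data, before upload.
inline std::vector<std::uint16_t> floats_to_halves(const float* src, std::size_t count)
{
    assert(src != nullptr || count == 0);
    std::vector<std::uint16_t> dst(count);
    for (std::size_t i = 0; i < count; ++i)
        dst[i] = float_to_half(src[i]);
    return dst;
}
```

This also shows why a plain cast is not enough: casting a float to a 16-bit integer type converts the numeric value to an integer, it does not produce the half-float bit pattern.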


You may find the following well-commented code useful for converting between half-float and float types while handling special values like infinity and NaN. In my code, these functions are wrapped in a HalfFloat class that has all of the standard arithmetic and conversion operators.



/// A static constant for a half float with a value of zero.
static const UInt16 ZERO = 0x0000;


/// A static constant for a half float with a value of not-a-number.
static const UInt16 NOT_A_NUMBER = 0xFFFF;


/// A static constant for a half float with a value of positive infinity.
static const UInt16 POSITIVE_INFINITY = 0x7C00;


/// A static constant for a half float with a value of negative infinity.
static const UInt16 NEGATIVE_INFINITY = 0xFC00;


/// A mask which isolates the sign of a half float number.
static const UInt16 HALF_FLOAT_SIGN_MASK = 0x8000;


/// A mask which isolates the exponent of a half float number.
static const UInt16 HALF_FLOAT_EXPONENT_MASK = 0x7C00;


/// A mask which isolates the significand of a half float number.
static const UInt16 HALF_FLOAT_SIGNIFICAND_MASK = 0x03FF;


/// A mask which isolates the sign of a single precision float number.
static const UInt32 FLOAT_SIGN_MASK = 0x80000000;


/// A mask which isolates the exponent of a single precision float number.
static const UInt32 FLOAT_EXPONENT_MASK = 0x7F800000;


/// A mask which isolates the significand of a single precision float number.
static const UInt32 FLOAT_SIGNIFICAND_MASK = 0x007FFFFF;


/// Convert the specified single precision float number to a half precision float number.
static UInt16 floatToHalfFloat( Float floatValue )
{
	// Catch special case floating point values.
	if ( math::isNAN( floatValue ) )
		return NOT_A_NUMBER;
	else if ( math::isInfinity( floatValue ) )
		return POSITIVE_INFINITY;
	else if ( math::isNegativeInfinity( floatValue ) )
		return NEGATIVE_INFINITY;
	
	UInt32 value = *((UInt32*)&floatValue);
	
	if ( floatValue == Float(0) )
		return UInt16( value >> 16 );
	else
	{
		// Start by computing the significand in half precision format (the low 13 bits are truncated).
		UInt16 output = UInt16((value & FLOAT_SIGNIFICAND_MASK) >> 13);
		
		UInt32 exponent = ((value & FLOAT_EXPONENT_MASK) >> 23);
		
		// Values too small for a normal half float (including subnormal floats) are flushed to signed zero.
		// Without this check, (exponent - 112) would wrap around for exponents below 113.
		if ( exponent < 113 )
			return UInt16((value & FLOAT_SIGN_MASK) >> 16);
		
		// Check for overflow when converting large numbers, returning positive or negative infinity.
		if ( exponent > 142 )
			return UInt16((value & FLOAT_SIGN_MASK) >> 16) | UInt16(0x7C00);
		
		// Add the exponent of the half float, converting the offset binary formats of the representations.
		output |= (((exponent - 112) << 10) & HALF_FLOAT_EXPONENT_MASK);
		
		// Add the sign bit.
		output |= UInt16((value & FLOAT_SIGN_MASK) >> 16);
		
		return output;
	}
}




/// Convert the specified half float number to a single precision float number.
static Float halfFloatToFloat( UInt16 halfFloat )
{
	// Catch special case half floating point values.
	switch ( halfFloat )
	{
		case NOT_A_NUMBER:
			return math::nan<Float>();
		case POSITIVE_INFINITY:
			return math::infinity<Float>();
		case NEGATIVE_INFINITY:
			return math::negativeInfinity<Float>();
	}
	
	// Start by computing the significand in single precision format.
	UInt32 value = UInt32(halfFloat & HALF_FLOAT_SIGNIFICAND_MASK) << 13;
	
	UInt32 exponent = UInt32(halfFloat & HALF_FLOAT_EXPONENT_MASK) >> 10;
	
	if ( exponent != 0 )
	{
		// Add the exponent of the float, converting the offset binary formats of the representations.
		value |= (((exponent - 15 + 127) << 23) & FLOAT_EXPONENT_MASK);
	}
	
	// Add the sign bit.
	value |= UInt32(halfFloat & HALF_FLOAT_SIGN_MASK) << 16;
	
	return *((Float*)&value);
}
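
The function above special-cases only three exact bit patterns and treats a zero exponent as a plain zero exponent in the output. As a hedged alternative, here is a self-contained sketch of the half-to-float direction that also covers half subnormals and every NaN payload. It assumes IEEE 754 binary16 input and a binary32 float; the name half_to_float is chosen for this example and is not the function from the code mentioned elsewhere in this thread.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical helper: expand binary16 bits into a binary32 float.
inline float half_to_float(std::uint16_t h)
{
    const std::uint32_t sign = static_cast<std::uint32_t>(h & 0x8000u) << 16;
    std::uint32_t exp  = (h >> 10) & 0x1Fu;
    std::uint32_t mant = h & 0x03FFu;
    std::uint32_t bits;

    if (exp == 0x1Fu) {
        // Infinity (mant == 0) or NaN (mant != 0): all exponent bits set in the float too.
        bits = sign | 0x7F800000u | (mant << 13);
    } else if (exp != 0) {
        // Normal number: rebias the exponent (127 - 15 = 112) and widen the mantissa.
        bits = sign | ((exp + 112u) << 23) | (mant << 13);
    } else if (mant == 0) {
        // Signed zero.
        bits = sign;
    } else {
        // Half subnormal: every such value is a normal float, so normalize it.
        exp = 113;                                  // float exponent field for 2^-14
        while ((mant & 0x0400u) == 0) {             // shift until the implicit 1 appears
            mant <<= 1;
            --exp;
        }
        assert(exp >= 103u && exp <= 112u);         // at most 10 shifts are possible
        bits = sign | (exp << 23) | ((mant & 0x03FFu) << 13);
    }

    float f;
    std::memcpy(&f, &bits, sizeof f);               // write the bit pattern without aliasing issues
    return f;
}
```

Every half value is exactly representable as a float, so this direction needs no rounding.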

I've been using this code written by Mike Acton from Insomniac to convert between float and *half*.

You might need to change inline to __inline, or the file extension from .c to .cpp, when working with Visual Studio.

You can use a union to convert float to/from uint32_t:

 
union Helper
{
    float f;
    uint32_t u;
};
 
Helper helper;
helper.f = 5.0f;

uint16_t h = half_from_float(helper.u);
 
//and back
 
helper.u = half_to_float(h);
 
float y = helper.f;
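
The union trick is well defined in C, but in C++ reading a union member other than the one last written is formally undefined behavior, even though common compilers accept it. A minimal sketch of an alternative using std::memcpy follows; the helper names float_bits and bits_to_float are made up for this example.

```cpp
#include <cstdint>
#include <cstring>

static_assert(sizeof(float) == sizeof(std::uint32_t), "this sketch assumes a 32-bit float");

// Hypothetical helper: return the raw bit pattern of a float.
inline std::uint32_t float_bits(float f)
{
    std::uint32_t u;
    std::memcpy(&u, &f, sizeof u);   // compilers typically turn this into a single move
    return u;
}

// Hypothetical helper: build a float from a raw bit pattern.
inline float bits_to_float(std::uint32_t u)
{
    float f;
    std::memcpy(&f, &u, sizeof f);
    return f;
}
```

With these, the calls above would read half_from_float(float_bits(5.0f)) and bits_to_float(half_to_float(h)).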
 
 
For large amounts of data, there are also SIMD intrinsics (part of the F16C extension on x86) that can do this:

half -> float: _mm_cvtph_ps and _mm256_cvtph_ps
float -> half: _mm_cvtps_ph and _mm256_cvtps_ph
see https://software.intel.com/sites/landingpage/IntrinsicsGuide/

Oh, I just noticed you aren't doing this on a PC. But some ARM processors support similar conversion functions. See for example: https://gcc.gnu.org/onlinedocs/gcc/Half-Precision.html

Just letting glTexImage do the conversion should be the easiest and most compatible way to do it, which could be important if you target Android.

You don't know which combination of CPU and GPU the customer will have in their phone, or what float format it expects, so instead of handling all cases yourself, just let the driver do it.
