dbircsak

C union float trick?

I'm looking through the Quake3 networking code and I come across this:
void MSG_WriteFloat( msg_t *sb, float f ) {
	union {
		float	f;
		int	l;
	} dat;
	
	dat.f = f;
	MSG_WriteBits( sb, dat.l, 32 );
}
void MSG_WriteChar( msg_t *sb, int c ) {
#ifdef PARANOID
	if (c < -128 || c > 127)
		Com_Error (ERR_FATAL, "MSG_WriteChar: range error");
#endif

	MSG_WriteBits( sb, c, 8 );
}
void MSG_WriteLong( msg_t *sb, int c ) {
	MSG_WriteBits( sb, c, 32 );
}

The function I'm wondering about is MSG_WriteFloat... they declare a union, assign something to the float, and then send the int?! Why do they do this? Wait, I just realized I should look at the read function...
int MSG_ReadLong( msg_t *msg ) {
	int	c;
	
	c = MSG_ReadBits( msg, 32 );
	if ( msg->readcount > msg->cursize ) {
		c = -1;
	}	
	
	return c;
}

float MSG_ReadFloat( msg_t *msg ) {
	union {
		byte	b[4];
		float	f;
		int	l;
	} dat;
	
	dat.l = MSG_ReadBits( msg, 32 );
	if ( msg->readcount > msg->cursize ) {
		dat.f = -1;
	}	
	
	return dat.f;	
}
Again...they assign something to the int, but return the float??? I don't get it. How can they work with floats by using ints in a union variable? Thanks for the help, Darrell

The read bits and write bits functions take ints (not floats). And with a union all the different possible objects share the same piece of memory (so changing one will change the others). And on the system they're using (x86) sizeof(float)=sizeof(int).

Edit: ask if you need clarification on the above; my description is terse.
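
To make the shared-memory point concrete, here is a minimal sketch of the same round trip outside of Quake3. The helper names float_to_bits and bits_to_float are made up for illustration, and uint32_t is used instead of int so the width is explicit; the compile-time check mirrors the assumption that sizeof(float) equals the size of the integer member.

```c
#include <assert.h>   /* static_assert (C11) */
#include <stdint.h>   /* uint32_t */

/* The trick only makes sense when both members have the same size. */
static_assert(sizeof(float) == sizeof(uint32_t), "float must be 32 bits wide");

/* Hypothetical helper: return the raw bit pattern of a float. */
uint32_t float_to_bits(float f)
{
    union {
        float    f;
        uint32_t u;
    } dat;

    dat.f = f;      /* store through the float member ... */
    return dat.u;   /* ... and read the same bytes back as an integer */
}

/* Hypothetical helper: the reverse direction, as in MSG_ReadFloat. */
float bits_to_float(uint32_t bits)
{
    union {
        float    f;
        uint32_t u;
    } dat;

    dat.u = bits;   /* store the integer bit pattern ... */
    return dat.f;   /* ... and reinterpret it as a float */
}
```

No numeric conversion happens here: the integer carries the float's bit pattern, not its rounded value. Assuming an IEEE 754 single-precision float, float_to_bits(1.0f) gives 0x3F800000, and passing that back through bits_to_float returns 1.0f.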

Just a warning...according to the C++ Standard assigning one type of a union and then reading another is undefined. Most compilers do what you would probably want/expect, but it would still be perfectly valid for an app to crash then.

This is in fact illegal C++ (not sure about C), however most compilers support it because it's a common trick. The reason is that compilers can optimize by assuming variables of different types cannot alias to the same memory location, but in this case it's fairly easy for the compiler to handle.

Edit: beaten.

Guest Anonymous Poster
Quote:
Original post by Promit
The correct C++ way to do this would be to reinterpret_cast it to a float. In C, usually you would use *(int*)&f.
Are you sure? I thought such reinterpret_casts were implementation-defined? Wouldn't the proper method be to use a binary stream's operator <

Guest Anonymous Poster
(continued due to crappy handling of angled brackets in AP posts)
a binary stream's insertion operator? Is there a binary (in the file as text vs binary sense) equivalent of stringstream that could be used to convert POD types to some type of byte buffer?
-Extrarius

Quote:
Original post by Promit
The correct C++ way to do this would be to reinterpret_cast it to a float.


Really? In MSVC++7,


int i = 10;
float j = reinterpret_cast<float>(i);


gives me:

error C2440: 'reinterpret_cast' : cannot convert from 'int' to 'float'

Quote:
Original post by bakery2k1
Quote:
Original post by Promit
The correct C++ way to do this would be to reinterpret_cast it to a float.


Really? In MSVC++7,


int i = 10;
float j = reinterpret_cast<float>(i);


gives me:

error C2440: 'reinterpret_cast' : cannot convert from 'int' to 'float'


You would use pointers.

Thanks Cocalus & folks. I tried it as a C++ project and it returns an undefined value. So looks like it's a C only trick. I suppose they did it for speed.

cplusplus.com sayz, "All the elements of the union declaration occupy the same physical space in memory. Its size is the one of the greatest element of the declaration."

Darrell

Quote:
Original post by ZQJ
This is in fact illegal C++ (not sure about C), however most compilers support it because it's a common trick. The reason is that compilers can optimize by assuming variables of different types cannot alias to the same memory location, but in this case it's fairly easy for the compiler to handle.

Edit: beaten.


Skimming over the C99 spec, so far it appears legal. The rules for structures apply to unions, so if a struct can have a float and an int, so can a union.

Quote:

6.7.2.1 Structure and union specifiers

7. A member of a structure or union may have any object type other than a variably modified type.93) In addition, a member may be declared to consist of a specified number of bits (including a sign bit, if any). Such a member is called a bit-field;94) its width is preceded by a colon.

93) A structure or union can not contain a member with a variably modified type because member names are not ordinary identifiers as defined in 6.2.3.

13. The size of a union is sufficient to contain the largest of its members. The value of at most one of the members can be stored in a union object at any time. A pointer to a union object, suitably converted, points to each of its members (or if a member is a bit-field, then to the unit in which it resides), and vice versa.


Guest Anonymous Poster
LessBread - could u please provide a link to the C99 Standard? I am not sure I am holding the correct thing, even after googling for it. Thanks.

Quote:
Also from n869.pdf "6.5.2.3 Structure and union members"
5 With one exception, if the value of a member of a union object is used when the most recent store to the object was to a different member, the behavior is implementation-defined 70). One special guarantee is made in order to simplify the use of unions: If a union contains several structures that share a common initial sequence (see below), and if the union object currently contains one of these structures, it is permitted to inspect the common initial part of any of them anywhere that a declaration of the completed type of the union is visible. Two structures share a common initial sequence if corresponding members have compatible types (and, for bit-fields, the same widths) for a sequence of one or more initial members.

70) The "byte orders" for scalar types are invisible to isolated programs that do not indulge in type punning (for example, by assigning to one member of a union and inspecting the storage by accessing another member that is an appropriately sized array of character type), but have to be accounted for when conforming to externally imposed storage layouts.

Implementation-defined means different compilers can do different things with this code. You have to check the documentation of your compiler to find out what it will do.
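
Footnote 70 is about byte order, which is the part that actually differs between machines. As a rough sketch (the names host_is_little_endian and put_u32_le are invented for this example, and 8-bit bytes are assumed), the first function uses the union-with-char-array punning the footnote describes to look at the host's layout, while the second writes a fixed little-endian layout with shifts so the result does not depend on the host:

```c
#include <assert.h>   /* static_assert (C11) */
#include <limits.h>   /* CHAR_BIT */
#include <stdint.h>   /* uint32_t */

static_assert(CHAR_BIT == 8, "example assumes 8-bit bytes");

/* Hypothetical helper: returns 1 on a little-endian host, 0 otherwise. */
int host_is_little_endian(void)
{
    union {
        uint32_t      u;
        unsigned char b[sizeof(uint32_t)];
    } probe;

    probe.u = 0x01020304u;
    /* A little-endian host stores the least significant byte (0x04) first. */
    return probe.b[0] == 0x04;
}

/* Hypothetical helper: write v into out[] in little-endian order,
 * regardless of the host's own byte order. */
void put_u32_le(uint32_t v, unsigned char out[4])
{
    out[0] = (unsigned char)(v & 0xFFu);
    out[1] = (unsigned char)((v >> 8) & 0xFFu);
    out[2] = (unsigned char)((v >> 16) & 0xFFu);
    out[3] = (unsigned char)((v >> 24) & 0xFFu);
}
```

If both ends of a connection run on the same architecture, the punned bytes can be sent as-is; the shift-based version is the kind of thing you would reach for when conforming to an externally imposed layout.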

Seeing this, it's no wonder that Q3 uses such a lot of bandwidth


void MSG_WriteFloat( msg_t *sb, float f ) {
	union {
		float	f;
		int	l;
	} dat;

	dat.f = f;
	MSG_WriteBits( sb, dat.l, 32 );
}



This makes the point of delta compression, which they are said to use, completely useless

From the gcc manual:

Quote:

-fstrict-aliasing
Allows the compiler to assume the strictest aliasing rules applicable to the language being compiled. For C (and C++), this activates optimizations based on the type of expressions. In particular, an object of one type is assumed never to reside at the same address as an object of a different type, unless the types are almost the same. For example, an unsigned int can alias an int, but not a void* or a double. A character type may alias any other type.

Pay special attention to code like this:

union a_union {
  int i;
  double d;
};

int f() {
  a_union t;
  t.d = 3.0;
  return t.i;
}


The practice of reading from a different union member than the one most recently written to (called “type-punning”) is common. Even with -fstrict-aliasing, type-punning is allowed, provided the memory is accessed through the union type. So, the code above will work as expected. However, this code might not:

int f() {
  a_union t;
  int* ip;
  t.d = 3.0;
  ip = &t.i;
  return *ip;
}


Every language that wishes to perform language-specific alias analysis should define a function that computes, given an tree node, an alias set for the node. Nodes in different alias sets are not allowed to alias. For an example, see the C front-end function c_get_alias_set.

Enabled at levels -O2, -O3, -Os.


So DON'T use the pointer cast trick, stick with the union trick.
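
For completeness, there is a third option that nobody in the thread has brought up: copying the bytes with memcpy. This is only a sketch, with made-up function names, but it avoids both the pointer cast and the union, and memcpy is allowed to copy the object representation of any type, so the strict aliasing rule does not come into play in either C or C++.

```c
#include <assert.h>   /* static_assert (C11) */
#include <stdint.h>   /* uint32_t */
#include <string.h>   /* memcpy */

static_assert(sizeof(float) == sizeof(uint32_t), "float must be 32 bits wide");

/* Hypothetical helper: copy the float's bytes into an integer. */
uint32_t float_bits_memcpy(float f)
{
    uint32_t bits;
    memcpy(&bits, &f, sizeof bits);   /* byte-wise copy, no type-punned access */
    return bits;
}

/* Hypothetical helper: copy an integer's bytes into a float. */
float float_from_bits_memcpy(uint32_t bits)
{
    float f;
    memcpy(&f, &bits, sizeof f);
    return f;
}
```

Optimizing compilers commonly turn a fixed-size memcpy like this into a plain register move, so it is usually not slower than the union version, though that is worth checking on your own compiler.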

Quote:
Original post by ZQJ
So DON'T use the pointer cast trick, stick with the union trick.

In addition, unions are defined to be the size of the largest type they contain.

This will not help in this situation [trying to save a double in a union and pull out an int32], but it will keep you from stepping on other things nearby, as you might with the evil pointer technique.
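
A small sketch of the size point, assuming a double that is wider than int32_t (true on common platforms, not guaranteed by the standard). The union name num and the helper uncovered_bytes are invented for the example.

```c
#include <assert.h>   /* static_assert (C11) */
#include <stdint.h>   /* int32_t */

/* A union with members of different sizes, like the one in the gcc example. */
union num {
    double  d;
    int32_t i;
};

/* The union is always big enough for its largest member. */
static_assert(sizeof(union num) >= sizeof(double), "union must hold its largest member");

/* Hypothetical helper: number of bytes of the double that the int32_t
 * member does not overlap. Reading .i after writing .d sees only part
 * of the double's bit pattern. */
unsigned uncovered_bytes(void)
{
    return (unsigned)(sizeof(double) - sizeof(int32_t));
}
```

So writing the double member never spills outside the union object, but reading the int32 member afterwards only gives you some of the double's bytes, and which ones depends on the byte order.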

