C++: Cross-platform binary file writing

This topic is 3590 days old which is more than the 365 day threshold we allow for new replies. Please post a new topic.

If you intended to correct an error in the post then please contact us.

Recommended Posts

To store various data, my program reads and writes a binary file using the standard fstream library. It works great, but I like to keep my code cross-platform, and I'm wondering how to make sure that it is. Currently I cast every piece of data to chars and write the chars. I use sizeof() to write the appropriate number of bytes/chars for each item, and similarly use sizeof() to read them back. My concerns:

1) If I create my data files on a platform where sizeof() returns a different value for certain types (such as int) than the platform my program is later compiled and run on, it will read the incorrect number of bytes for those items. Possible solution: remake the data files on each platform I compile on?

2) Endianness: obviously this is an issue. If I cast an item with (char*) &myInt and write its sizeof(myInt) bytes to the file, then when they are later read and cast back to an (int*), the bytes will be backwards on a machine with different endianness.

3) Are there others I'm missing?

I've searched the net, but most places simply say: don't do this. But is there a way? I know these casts to bytes seem messy, but the casting is limited to one place in my code that handles it for everything else, and it works just great for me. (I just haven't tried it on different platforms yet. [wink]) Any thoughts?

This is a perfectly reasonable thing to do. I'd recommend you define some types that you guarantee will be a certain size regardless of platform, and put them in some Platform.h-style header: int32, uint16, etc. When you store binary files, use these typedefs, and then you'll be guaranteed that sizeof( int32 ) is always 4 regardless of platform.
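Such a header might look like the following (a minimal sketch; the typedef names are illustrative, the choices assume a typical desktop target, and the compile-time checks fail the build on any platform where the assumptions break — on C++11 and later you could simply include <cstdint> instead):

```cpp
#include <climits>

// Illustrative Platform.h-style typedefs (names are hypothetical).
typedef signed char    int8;
typedef unsigned char  uint8;
typedef short          int16;
typedef unsigned short uint16;
typedef int            int32;
typedef unsigned int   uint32;

// Fail the build on any platform where these assumptions don't hold.
static_assert(CHAR_BIT == 8,      "8-bit bytes assumed");
static_assert(sizeof(int16) == 2, "int16 must be 2 bytes");
static_assert(sizeof(int32) == 4, "int32 must be 4 bytes");
```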

As far as the endian issues, if you want to re-use the exact same file between platforms, you'll need to store in the file which endian type the file was written in, and then endian-swap when loading if necessary. Alternatively, you could write different binary files out for all your target platforms, and do the endian-swapping before you write out the file; this latter option gives faster run-time loading of course.
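A byte-swap helper for the load path might look like this (a sketch; swap32 and hostIsLittleEndian are made-up names, and the runtime probe is one common way to detect host endianness):

```cpp
#include <cstdint>

// Reverse the byte order of a 32-bit value.
uint32_t swap32(uint32_t v) {
    return ((v & 0x000000FFu) << 24) |
           ((v & 0x0000FF00u) <<  8) |
           ((v & 0x00FF0000u) >>  8) |
           ((v & 0xFF000000u) >> 24);
}

// Detect host endianness at runtime by inspecting the first byte of a
// known 32-bit value.
bool hostIsLittleEndian() {
    uint32_t probe = 1;
    return *reinterpret_cast<unsigned char*>(&probe) == 1;
}
```

On load you'd compare the endianness flag stored in the file header against hostIsLittleEndian() and call swap32 on each multi-byte field only when they differ.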

The only other issues you might run into are if you're reading/writing entire structures whose alignment/padding might be defined differently on different compilers. Normally this isn't too much of an issue, but you can use things like #pragma pack, or just be explicit about adding your own padding member variables. Other issues that could conceivably come up are when you actually want to write different structures for different platforms, but it doesn't seem like that's within the scope of what you're dealing with.
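As a small illustration of the padding issue (a sketch; the struct names are made up, and exact padding is compiler-dependent, though mainstream compilers behave as described in the comments):

```cpp
#include <cstdint>

// Most compilers insert two padding bytes after 'tag' so that 'value'
// lands on a 4-byte boundary, making sizeof(Unpacked) 8 rather than 6.
struct Unpacked {
    uint16_t tag;
    uint32_t value;
};

// With packing forced to 1, the padding disappears and the struct is
// exactly the 6 bytes you'd write to disk.
#pragma pack(push, 1)
struct Packed {
    uint16_t tag;
    uint32_t value;
};
#pragma pack(pop)
```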

Keep in mind there are a lot of libraries that already provide platform headers for things like integer widths. For example, boost::cstdint and SDL both contain integer-width typedefs, and SDL also has functions for dealing with endianness. However, one thing you'll have to deal with is that there is no reliable way to serialize floating-point values as binary between platforms. While IEEE 754 specifies things like bit order, it doesn't specify byte order, so two platforms with the same endianness, both using IEEE 754 floating-point types, may still have incompatible binary representations. Unfortunately, the only portable way to serialize floats is to convert them into text.
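Text round-tripping of a float can be sketched like this (the helper names are made up; "%.9g" prints enough significant digits that any IEEE 754 single-precision value survives the round trip, and "%.17g" does the same for doubles):

```cpp
#include <cstdio>
#include <cstdlib>

// Format a float as text with enough digits to reconstruct it exactly.
void floatToText(float f, char *buf, int bufSize) {
    std::snprintf(buf, bufSize, "%.9g", f);
}

// Parse it back.
float textToFloat(const char *buf) {
    return static_cast<float>(std::strtod(buf, 0));
}
```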

Also rather than rely on packing pragmas to serialize structures, you can just serialize and deserialize the individual members one at a time. You'd probably need to do this anyways for endian issues.
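A member-at-a-time version might look like this (a sketch with a made-up Record type; each field is written separately, so the file layout is exactly 6 bytes per record regardless of how the compiler pads the struct in memory, and endian swapping would slot in around each field):

```cpp
#include <cstdint>
#include <fstream>

struct Record {
    int32_t  id;
    uint16_t flags;
};

// Write each member individually: struct padding never reaches the file.
void writeRecord(std::ostream &out, const Record &r) {
    out.write(reinterpret_cast<const char*>(&r.id),    sizeof(r.id));
    out.write(reinterpret_cast<const char*>(&r.flags), sizeof(r.flags));
}

// Read the members back in the same order.
void readRecord(std::istream &in, Record &r) {
    in.read(reinterpret_cast<char*>(&r.id),    sizeof(r.id));
    in.read(reinterpret_cast<char*>(&r.flags), sizeof(r.flags));
}
```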

Thanks for all the help!

I'm already serializing the individual members rather than the structures as a whole, so padding won't be an issue.

I've just looked into the boost cstdint.hpp header and it looks like it will work great, so I'm going to go ahead and use that.

Thanks for the heads up on floating point values. I'll try to work around needing to save any.

One more question... So can I assume regular old "unsigned char" types are safe across platforms?

Quote:
3.) I know these casts to bytes seem messy, but it's limited to one place in my code that handles it for everything else, and it works just great for me. (I just haven't tried it on different platforms yet.)


Memory alignment. Individual values in a binary stream may not be properly aligned for the processor.

Unless you assume Intel architecture, you should rely on memcpy to copy data to/from the buffer. On other architectures, simply type-casting memory offsets into the types you want to read can fail. Note that this is a processor-level exception.

Example:

// Buffer layout: a 2-byte short followed by a 4-byte int
// short | int
// 0A 00 | 02 00 00 00

char *int_ptr = buf + 2;
int value = *((int *)int_ptr); // boom, attempting to read an improperly aligned int
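The portable fix is to memcpy from the unaligned offset into a properly aligned local, instead of casting the pointer and dereferencing (a sketch; readIntAt is a made-up helper):

```cpp
#include <cstddef>
#include <cstring>

// Safely read an int from an arbitrary (possibly unaligned) offset in
// a byte buffer by copying into an aligned local variable.
int readIntAt(const char *buf, std::size_t offset) {
    int value;
    std::memcpy(&value, buf + offset, sizeof(value));
    return value;
}
```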

Quote:
Original post by BeauMN
So can I assume regular old "unsigned char" types are safe across platforms?


No. char may not be the same size on all platforms; some platforms use a 32-bit char. However, in game programming you can get away with a BOOST_STATIC_ASSERT(sizeof(int8_t) == 1), which will fail to compile on those platforms, and otherwise ignoring non-8-bit chars. This is a different story in non-game programming disciplines.

Quote:
Original post by Antheus
Unless you assume Intel architecture, you should rely on memcpy to copy data to/from buffer.


That's not an issue as long as you use fstream's read()/write() directly on your data members.

Hmm, so in the case of a system with 32-bit chars, does it even work to use read() and write() with uint8_t? Such as:

myStream.write( myBuffer, sizeof( uint8_t ) );

I'm guessing no, since the definition of write is:

ostream& write( const char* buffer, streamsize num );

where num is the number of bytes in the buffer. And in this case, a byte would be at least 32 bits. So num would end up being 0.25... can streamsize be a floating point value?


This discussion has raised a new question for me, though perhaps it belongs on the OpenGL forum, but it concerns this topic so I'll ask it here for now:

One of the things I want to store in my binary files are OpenGL color values. These eventually get mapped between 0.0 (zero intensity) and 1.0 (full intensity) for each of the color's components. Now I don't want to use floats directly, due to there not being a reliable way to serialize floats across platforms as discussed earlier. But this isn't a problem as the glColor function can alternatively take bytes, shorts, and ints in addition to floats.

So if I pass unsigned ints into glColor, the documentation states that the largest-representable value gets mapped to 1.0 and the smallest to 0.0. But what does OpenGL consider the "largest-representable value"?

For example, if I call:

boost::uint8_t red = 255;
boost::uint8_t green = 255;
boost::uint8_t blue = 255;
boost::uint8_t alpha = 255;
glColor4ui( red, green, blue, alpha );

Will OpenGL know that the "largest-representable value" of these uint8_t is 255 and map the components to 1.0 accordingly? Or will it assume the "largest-representable value" is that of an unsigned 32-bit int on a platform where that is the default? The function prototype looks like:

void glColor4ui( GLuint red, GLuint green, GLuint blue, GLuint alpha )

simply taking in GLuints... but what are these GLuints? Are they cross-platform?

I could just use bytes with glColor4ub() if I want a max of 255, but would that still work on the 32-bit byte platform mentioned earlier?

They are just typedefs. As far as I know, you must handle the portability and endianness of the types you use yourself.

I really can't believe there isn't a standard library for portable serialization... there's gotta be one in Boost.

Quote:
Original post by BeauMN
Hmm, so in the case of a system with 32-bit chars, does it even work to use read() and write() with uint8_t? Such as:

myStream.write( myBuffer, sizeof( uint8_t ) );


That will fail to compile, because there will be no definition for uint8_t on such a platform.

Quote:
So num would end up being 0.25... can streamsize be a floating point value?

No.

Quote:

simply taking in GLuints... but what are these GLuints? Are they cross-platform?

GLuint is a GL typedef for a four-byte unsigned integer; the underlying type it maps to can vary by platform. If you want the maximum value, you can use std::numeric_limits<GLuint>::max().
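So an 8-bit channel value has to be scaled up to the full GLuint range before being passed to glColor4ui. A sketch of that scaling (scaleChannel is a made-up helper, and the GLuint typedef is inlined here only to keep the snippet self-contained; it normally comes from the GL headers):

```cpp
#include <limits>

typedef unsigned int GLuint; // normally provided by <GL/gl.h>

// Expand an 8-bit channel (0..255) to the full GLuint range expected
// by glColor4ui, so 255 maps to the "largest-representable value".
GLuint scaleChannel(unsigned char c) {
    return static_cast<GLuint>(c) * (std::numeric_limits<GLuint>::max() / 255u);
}
```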

Quote:
I could just use bytes with glColor4ub() if I want a max of 255, but would that still work on the 32-bit byte platform mentioned earlier?

I wouldn't worry about 32-bit byte platforms if you're doing game programming.

Okay, I won't worry about 32-bit byte platforms. Incidentally, can I just not worry about any non-8-bit-byte platforms? For game development, is it generally safe to assume an 8-bit byte? And is it also safe to assume an "unsigned int" is 32-bit? (i.e., can I forgo using the boost/cstdint.hpp types altogether?)

Basically here's what I'm hoping to support:
Microsoft Windows Platforms
Mac OSX Platforms
Linux (running on Intel architectures)

Is that pretty good coverage for a game? Are there any other common platforms that people would hope to play a game on?

I'm tempted to just use the C++ "unsigned char" and "unsigned int" for my game engine, since OpenGL will take them in as GLubytes and GLuints anyway, which are going to be platform-specific. Aside from the endian issues for versions of OSX still running on PowerPCs, should that be pretty safe for the platforms above? (Saving the chars and ints to binaries, reading them back, and passing them to OpenGL.) Or would it still be better for me to use boost/cstdint.hpp?

Basically I don't want to dig myself into any holes that I'll regret later, but also don't want to obsess over excessive portability that doesn't really matter.

You can use the GLuint and GLubyte typedefs pretty safely; while they will be typedef'd to a platform-dependent type, they exist for platform independence, much in the same way that uint32_t will be typedef'd to a platform-dependent type, but using uint32_t itself is platform-independent.

