[java] Aarghh.. Reading files written with C output

Started by
4 comments, last by ibawt 18 years, 8 months ago
I have a C executable which writes a bunch of unsigned integers, unsigned shorts and floats to a data file. The C program is compiled using MinGW, and the data file is written to using calls to fwrite. This data file is later opened and the values read by a Java program. The problem is, when reading the file in using the Java program, the byte order for the primitives (int, short and float) seems to be reversed. No problem, I thought; I'll just read the primitives in byte-by-byte and swap the order. This worked- I now have a function like this:

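    // Reads a 32-bit little-endian integer: the first byte in the file
    // is the least significant one. I/O errors are passed to the caller
    // instead of being swallowed and silently returning 0.
    private int readInt(DataInputStream in) throws java.io.IOException {
        return  in.readUnsignedByte()
             | (in.readUnsignedByte() << 8)
             | (in.readUnsignedByte() << 16)
             | (in.readUnsignedByte() << 24);
    }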
And a similar readShort function. The problem occurs when it comes to floats: read in using the standard DataInputStream function readFloat(), they are all messed up, just like the ints and shorts were. From this, I guess that they are mangled in the same way, with the byte order reversed. Unfortunately, as far as I'm aware, bitwise operations can't be performed on floats, so I can't write a function like the above to rearrange their component bytes.

So:
- Can anyone recommend a way to read these floats correctly?
- Can anyone tell me why this order reversal happens?

For the first problem, I quickly came up with the solution of reading 4 bytes off the input stream, writing them to an output stream in the reverse order and then reading them back in as a float. This seems so far-fetched a solution, though, that I'm worried I've overlooked something simple.

Rates to anyone who can help!
the rug - funpowered.com
First of all, nice job on the detective work =)

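You've stumbled on a weird quirk of processors: endianness, which is the order in which the bytes of a multi-byte value are stored. Some processors put the most significant byte first (big-endian), and others put the least significant byte first (little-endian). Java's DataInputStream always reads multi-byte values as big-endian, while x86 processors (what your C app runs on) are little-endian, and fwrite just dumps the bytes in the order they sit in memory.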

Usually you can remain blissfully unaware of endianness, because as long as you stick with one endianness or the other, you won't ever notice the difference (unless you start poking around by casting an int* to a char*). But as soon as you start transferring data from little-endian to big-endian, things go wrong. This most often happens when sending data over a network, but it can also happen with files.
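To make the mismatch concrete, here is a small sketch of what happens to the value 1 when it is written little-endian and then read with DataInputStream.readInt(). The class and method names are made up for illustration, and the byte array stands in for the data file.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;

class EndianMismatch {
    // Interprets four bytes the way DataInputStream does:
    // the first byte is treated as the most significant one.
    static int readAsBigEndian(byte[] fourBytes) throws IOException {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(fourBytes))) {
            return in.readInt();
        }
    }

    public static void main(String[] args) throws IOException {
        // The value 1 as a little-endian x86 program would write it with fwrite.
        byte[] written = {0x01, 0x00, 0x00, 0x00};

        int misread = readAsBigEndian(written);
        System.out.println(misread);                       // 16777216 (0x01000000)
        System.out.println(Integer.reverseBytes(misread)); // 1
    }
}
```

Integer.reverseBytes undoes the swap in one call, which is equivalent to assembling the value byte by byte with shifts.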

It looks like the best Java way to deal with it is to use a ByteBuffer and call the order(ByteOrder bo) function. Or you could just have the C code save everything as big-endian, since C is better at messing around with data like that :)
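A minimal sketch of the ByteBuffer approach, assuming a made-up record layout of one unsigned int, one unsigned short and one float (the real file layout isn't given in this thread). The Record class, its field names and the sample bytes are all hypothetical.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

class LittleEndianBufferDemo {
    // Hypothetical record: one unsigned int, one unsigned short, one float.
    static final class Record {
        final long id;     // unsigned 32-bit value, widened to long
        final int flags;   // unsigned 16-bit value, widened to int
        final float scale;

        Record(long id, int flags, float scale) {
            this.id = id;
            this.flags = flags;
            this.scale = scale;
        }
    }

    static Record parse(byte[] raw) {
        // Tell the buffer that multi-byte values are stored least significant byte first.
        ByteBuffer buf = ByteBuffer.wrap(raw).order(ByteOrder.LITTLE_ENDIAN);
        long id = buf.getInt() & 0xFFFFFFFFL; // mask so large unsigned values stay positive
        int flags = buf.getShort() & 0xFFFF;  // same idea for the unsigned short
        float scale = buf.getFloat();
        return new Record(id, flags, scale);
    }

    public static void main(String[] args) {
        byte[] raw = {
            (byte) 0xFF, (byte) 0xFF, (byte) 0xFF, (byte) 0xFF, // 4294967295 as unsigned int
            0x34, 0x12,                                         // 0x1234 as unsigned short
            0x00, 0x00, (byte) 0x80, 0x3F                       // 1.0f (0x3F800000)
        };
        Record r = parse(raw);
        System.out.println(r.id + " " + r.flags + " " + r.scale);
    }
}
```

For a real file the byte array could come from java.nio.file.Files.readAllBytes. The masking matters because Java has no unsigned int or short types, so the C program's unsigned values need a wider type on the Java side.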
Thanks, rate++ :o)

One more thing. I'm doing this in order to read in a 3d model from a data file, and currently using one created by using the C program to convert a 3ds file to a more readable format. I hope to implement real 3ds support at some point- should I expect 3ds files to be written in the same way (using little endian)? And would it be the same for other formats? My guess is yes- but I'd rather just be clear on that.

Edit: Just checked the 3ds specification (googled for "3ds file endian"- should have done that before posting, really) and it seems it is indeed little-endian. So it seems this is the way it's usually done (can someone confirm/deny that?)

Thanks again.
the rug - funpowered.com
Quote:Original post by The Rug
Edit: Just checked the 3ds specification (googled for "3ds file endian"- should have done that before posting, really) and it seems it is indeed little-endian. So it seems this is the way it's usually done (can someone confirm/deny that?)


I don't know, I think little-endian might be more popular, but I did a search and it varies: JPEG, for example, stores its multi-byte values big-endian, while WAV (RIFF) is little-endian. But a given file format will always be clearly defined as one or the other.
I'm pretty sure the Float class has static methods for converting to and from the bitwise representation as an int: Float.intBitsToFloat and Float.floatToIntBits.
Remember that endianness has nothing to do with how the data is interpreted. A float is 32 bits, so in terms of byte swapping it behaves the same way as an int. So you can do something like this:

    public float readFloat() throws java.io.IOException {
        return Float.intBitsToFloat(readInt());
    }
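Putting the pieces from this thread together, a self-contained sketch might look like the following. The class and method names are invented for the example, and it uses Integer.reverseBytes and Short.reverseBytes as a shortcut instead of shifting individual bytes; the in-memory byte array stands in for the data file.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;

class LittleEndianReader {
    // DataInputStream reads big-endian, so swap the bytes afterwards.
    static int readIntLE(DataInputStream in) throws IOException {
        return Integer.reverseBytes(in.readInt());
    }

    // Swap the two bytes, then mask to get the unsigned value as an int.
    static int readUnsignedShortLE(DataInputStream in) throws IOException {
        return Short.reverseBytes(in.readShort()) & 0xFFFF;
    }

    // A float is just 32 bits: fix the byte order as an int, then reinterpret.
    static float readFloatLE(DataInputStream in) throws IOException {
        return Float.intBitsToFloat(readIntLE(in));
    }

    public static void main(String[] args) throws IOException {
        byte[] data = {
            0x78, 0x56, 0x34, 0x12,        // int 0x12345678
            0x34, 0x12,                    // unsigned short 0x1234
            0x00, 0x00, 0x20, (byte) 0xC0  // float -2.5f (0xC0200000)
        };
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(data))) {
            System.out.println(Integer.toHexString(readIntLE(in)));
            System.out.println(Integer.toHexString(readUnsignedShortLE(in)));
            System.out.println(readFloatLE(in));
        }
    }
}
```

No round trip through an output stream is needed: Float.intBitsToFloat does the reinterpretation directly.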

