cozzie

C++ Writing char arrays (binary)

Recommended Posts

Hi all,

For practice I'm trying to write 3D mesh data to a binary format, so that eventually I can also read meshes back from binary files (provided by the asset pipeline).

I understand how to write floats, vectors of floats, ints, etc. by using a write((char*)&myVar, sizeof(type)) call on the file, for which I'm using std::ofstream.

My goal is to write a 3D vector or other type (struct) to the file and be able to read it back the same way. From what I understand, I need to create a function that returns a char array combining all members of the struct.

So in short: how can I create this char array in a "safe" way?

So far I've come up with this approach:

- calculate the number of needed bytes/chars, e.g. 12 bytes for a 3D vector with 3 floats

- char result[12]

- cast the 3 individual floats to char[4]'s and use strcpy to copy the 1st into result and strcat for the others

Is this the way to go or would you suggest other approaches?

Question 2: how would I read/convert the char array back to the 3 floats when reading it from the binary file?
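Roughly what I have in mind, written out (untested sketch; I used memcpy here instead of strcpy/strcat, since the float bytes can contain zeros which would cut a strcpy short, and I'm assuming 4-byte floats):

#include <cstring>

struct Vec3 { float x, y, z; };  // stand-in for my actual 3D vector type

// pack the 3 floats into a 12-byte char array
void PackVec3(const Vec3& v, char out[12])
{
	std::memcpy(out + 0, &v.x, sizeof(float));
	std::memcpy(out + 4, &v.y, sizeof(float));
	std::memcpy(out + 8, &v.z, sizeof(float));
}

// question 2: turn the 12 bytes read back from the file into the 3 floats again
void UnpackVec3(const char in[12], Vec3& v)
{
	std::memcpy(&v.x, in + 0, sizeof(float));
	std::memcpy(&v.y, in + 4, sizeof(float));
	std::memcpy(&v.z, in + 8, sizeof(float));
}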

Any input is appreciated.


I'm not even sure what your problem is there. Let's start from the basics: the only things that truly exist are memory and bits. As long as you're going with 'basic' types such as floats, the following will do: 

unsigned char blobby[12];             // "ubyte", i.e. any byte type, works here
memcpy(blobby, src, sizeof(blobby));  // src points at your three floats
float *bruh = (float*)blobby;

This is hoping `blobby` is properly aligned.

First you have to ensure your `int` is the same as the serialized `int`, and that's why you have `__int32` or `std::int64_t`. And of course you have endianness problems, but let's assume you'll be loading on 'coherent' machines.

When `struct` enters the picture, `memcpy` goes out due to the possibility of different compiler packing.
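For example (exact numbers depend on the compiler and its packing settings, this is just the typical case):

struct Mixed {
    char tag;     // 1 byte
    float value;  // 4 bytes
};
// sizeof(Mixed) is typically 8, not 5: the compiler inserts 3 padding bytes
// after 'tag' so that 'value' is aligned. Another compiler (or a different
// #pragma pack) can lay this out differently, so memcpy'ing whole structs
// bakes one particular layout into your file format.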

There are various options for (de)serialization; I'd suggest pupping all the things.

That, or google protocol buffers, but in my experience they're not quite convenient for game purposes due to their "everything can be optional" nonsense.

 

 


Thanks guys.

I've been playing around using this input. Unfortunately, writing the struct or array (std::vector) in one go doesn't work when casting to (myType*), but it does work when I cast to (char*). See the code samples below, tested all three successfully.

Would I have any risks with this approach, depending on platform (size of float, etc.)?
I'll make sure the struct I use is a POD struct, which is the case in the example below.

void ReadWriteStruct()
{
	std::ofstream writeFile;
	writeFile.open("data.bin", std::ios::out | std::ios::binary);
	writeFile.write((char*)&srcVector, sizeof(VECTOR3)); // srcVector: a VECTOR3 filled in elsewhere
	writeFile.close();

	MessageBox(NULL, L"Vector written to binary file - full struct!", L"Done", MB_OK);

	std::ifstream readFile;
	readFile.open("data.bin", std::ios::in | std::ios::binary);

	VECTOR3 myReadVec3;
	readFile.read((char*)&myReadVec3, sizeof(VECTOR3));
	readFile.close();

	char tempText[100];
	sprintf_s(tempText, "Read: %f, %f, %f\n", myReadVec3.x, myReadVec3.y, myReadVec3.z);
	OutputDebugStringA(tempText);
}

void ReadWriteArray()
{
	std::vector<VECTOR3> myVectors(2);
	myVectors[0].x = 0.2f;
	myVectors[0].y = 0.7f;
	myVectors[0].z = 0.95f;
	myVectors[1].x = 5.2f;
	myVectors[1].y = 4.7f;
	myVectors[1].z = 7.75f;

	std::ofstream writeFile;
	writeFile.open("data.bin", std::ios::out | std::ios::binary);
	writeFile.write((char*)&myVectors[0], myVectors.size() * sizeof(VECTOR3));
	writeFile.close();

	MessageBox(NULL, L"Vector written to binary file - full struct!", L"Done", MB_OK);

	std::ifstream readFile;
	readFile.open("data.bin", std::ios::in | std::ios::binary);

	std::vector<VECTOR3> readVectors(2);
	readFile.read((char*)&readVectors[0], sizeof(VECTOR3));
	readFile.read((char*)&readVectors[1], sizeof(VECTOR3));
	readFile.close();

	char tempText[100];
	sprintf_s(tempText, "Read 1: %f, %f, %f\n", readVectors[0].x, readVectors[0].y, readVectors[0].z);
	OutputDebugStringA(tempText);
	sprintf_s(tempText, "Read 2: %f, %f, %f\n", readVectors[1].x, readVectors[1].y, readVectors[1].z);
	OutputDebugStringA(tempText);
}

 

17 hours ago, cozzie said:

Thanks guys.

I've been playing around using this input. Unfortunately, writing the struct or array (std::vector) in one go doesn't work when casting to (myType*), but it does work when I cast to (char*). See the code samples below, tested all three successfully.

Would I have any risks with this approach, depending on platform (size of float, etc.)?
I'll make sure the struct I use is a POD struct, which is the case in my examples above.



I made a mistake; it should have been char*, not myType*. I don't see the structure definition in your post, but usually a vector struct is a POD.

You can read the vector array in a single call as well.
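For example, something like this (assuming the element count of 2 is known up front, as in your snippet):

std::vector<VECTOR3> readVectors(2);
readFile.read((char*)&readVectors[0], readVectors.size() * sizeof(VECTOR3));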

Krohm already posted the problems you may face.


Thanks. The VECTOR3 is my struct, so it's working both for an individual struct and for a std::vector of them.


On 10/11/2017 at 12:57 AM, Krohm said:

There are various options for (de)serialization; I'd suggest pupping all the things.

I second this approach.

Break it down so that you're only serializing fundamental types. This allows you to avoid all issues with alignment and padding, properly handle endianness, and trivially support composites of serializable types.

There's nothing you gain from trying to read/write entire buffers of typed data at once, other than the potential for hard-to-find bugs.
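For the endianness part, a minimal sketch of what that means for a single fundamental type, assuming you define the file format to be little-endian:

#include <cstdint>
#include <ostream>

// Write a 32-bit unsigned integer as 4 bytes, least significant byte first,
// no matter what the endianness of the machine doing the writing is.
void WriteU32LE(std::ostream& out, std::uint32_t v)
{
    unsigned char bytes[4] = {
        static_cast<unsigned char>( v        & 0xFF),
        static_cast<unsigned char>((v >>  8) & 0xFF),
        static_cast<unsigned char>((v >> 16) & 0xFF),
        static_cast<unsigned char>((v >> 24) & 0xFF),
    };
    out.write(reinterpret_cast<const char*>(bytes), 4);
}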


A little bit off topic, but I always write the length of the char array first and then write the char array anyway.

memcpy is your pal, you just memcpy(&floatvar, &chartable[index], 3);

 

Something like this: you copy some amount of memory to a float pointer.


Do you mean first writing everything to a char array (a.k.a. a buffer) and then writing that buffer to the file?

That could work, but then I'd have to calculate the index manually, which could be a risk if some type has a different byte size on another platform (writing per variable and using sizeof prevents this, I think). In your example, shouldn't the 3 be 4 (bytes for a float)?

But perhaps there's a way to use a buffer while keeping this in mind (or without manually calculating the index of each variable).
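Something like this is what I'm thinking of (untested sketch; ByteBuffer and Append are just names I made up):

#include <vector>

struct ByteBuffer
{
	std::vector<char> bytes;

	// append any trivially copyable value; the vector grows by sizeof(T),
	// so I never have to calculate the byte offset/index by hand
	template <typename T>
	void Append(const T& value)
	{
		const char* p = reinterpret_cast<const char*>(&value);
		bytes.insert(bytes.end(), p, p + sizeof(T));
	}
};

// usage:
//   ByteBuffer buf;
//   buf.Append(myVec3.x); buf.Append(myVec3.y); buf.Append(myVec3.z);
//   writeFile.write(buf.bytes.data(), buf.bytes.size());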


I don't understand the difficulty you're having; aren't you just over-thinking stuff? I mean:

class Writer {
  ...
  void WriteInt(int n) { Store((const char *)&n, sizeof(n)); }
  void WriteFloat(float n) { Store((const char *)&n, sizeof(n)); }
  // etc

  void Store(const char *base, size_t sz) {
    for (size_t i = 0; i < sz; i++) { /* store or write base[i] */ }
  }
};


struct SomeStuff {
  int a;
  float b;

  void Write(Writer &w) {
    w.WriteInt(a);
    w.WriteFloat(b);
  }
};

Make a Writer class that has a function for each type of data; for each structure you have, add a "Write" method that writes itself to the Writer. Reading is just the other way around: you have a Reader class with functions that produce the various elementary values, and each struct has a Read method that assigns its fields in the same order as the Write method.
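The reading side is the mirror image; roughly (Fetch being the hypothetical counterpart of Store, pulling bytes back from wherever you stored them):

class Reader {
  ...
  int ReadInt() { int n; Fetch((char *)&n, sizeof(n)); return n; }
  float ReadFloat() { float n; Fetch((char *)&n, sizeof(n)); return n; }
  // etc

  void Fetch(char *base, size_t sz) {
    for (size_t i = 0; i < sz; i++) { /* read base[i] from the file/buffer */ }
  }
};

struct SomeStuff {
  int a;
  float b;

  void Write(Writer &w) { ... }   // as above

  void Read(Reader &r) {
    a = r.ReadInt();
    b = r.ReadFloat();
  }
};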

 


Be aware though that the above is not portable. (For example, there are at least 4 different byte patterns that WriteInt could generate on relatively modern systems, although you're unlikely to see 2 of them.)

I have no idea what the previous 2 posts were about, however. Perhaps the typos caused confusion. Writing variable-length arrays or lists of anything is usually best achieved by prepending the length, obviously.
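E.g., roughly (WriteUInt32, WriteVector3 and their Read counterparts standing in for whatever per-type functions you use):

// writing: element count first, then the elements
WriteUInt32(out, (std::uint32_t)myVectors.size());
for (const VECTOR3& v : myVectors)
    WriteVector3(out, v);

// reading: fetch the count, size the container, then read that many elements
std::vector<VECTOR3> readBack(ReadUInt32(in));
for (VECTOR3& v : readBack)
    v = ReadVector3(in);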

50 minutes ago, Alberth said:

Make a Writer class that has a function for each type of data, for each structure you have add a "Write" method that writes self to the Writer. Reading is just the other way around, you have a Reader class with functions that produce various elementary values, and each struct has a Read method that assigns fields in the same order as the Write method.

 

Or go with pupping and have a single function to read, write and compute size. No repeating yourself, but I feel like I'm beating a dead horse at this point.
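To make the idea concrete, a rough sketch (the names here are made up, not taken from the article, and VECTOR3 is the struct from earlier in the thread): you describe the layout of a type once, and the 'pupper' object you pass in decides whether that description means writing, reading or measuring.

#include <cstddef>
#include <fstream>

struct Pupper { virtual void bytes(void* p, std::size_t n) = 0; };

struct WritePupper : Pupper {
    std::ofstream& out;
    explicit WritePupper(std::ofstream& o) : out(o) {}
    void bytes(void* p, std::size_t n) override { out.write((char*)p, n); }
};

struct ReadPupper : Pupper {
    std::ifstream& in;
    explicit ReadPupper(std::ifstream& i) : in(i) {}
    void bytes(void* p, std::size_t n) override { in.read((char*)p, n); }
};

struct SizePupper : Pupper {
    std::size_t total = 0;
    void bytes(void*, std::size_t n) override { total += n; }
};

// one function per type describes the layout and works for all three puppers
inline void pup(Pupper& p, float& f)   { p.bytes(&f, sizeof(f)); }
inline void pup(Pupper& p, VECTOR3& v) { pup(p, v.x); pup(p, v.y); pup(p, v.z); }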


(Just an aside: I don't like the use of the term 'pupping' because that was just invented for that article. The idea of writing out the fields individually is not new, nor is the idea of using a single function for both reads and writes, e.g. https://gafferongames.com/post/serialization_strategies/ . Apart from that however, the linked article is good.)


Ohh, I did this a while ago; I ended up posting a Stack Overflow question because I was still learning many of the fundamentals of C++ at the time. https://stackoverflow.com/questions/20198002/how-is-a-struct-stored-in-memory

I'll synthesize the important stuff for you though: all you really need to know is that an object of any struct is stored in memory as an array of bytes (and let's be honest, even primitive types are too, they are just much shorter). Binary files are essentially really long arrays of bytes.

So one of my favorite reasons to use a union is to have char arrays lying around which represent some data that is quite likely going to be pushed into a file. You need to know how many bytes your data is, but that is quite easy:

struct POD
{
	float X,Y,Z;
};

union quickie_data
{
	POD data_obj;                 // view the data as the struct...
	char data_raw[sizeof(POD)];   // ...or as raw bytes ready to write to a file
};

Totally unnecessary though. Somebody has probably already said this, but you can just say `write( (char*)my_struct, sizeof(MyStruct));`

16 hours ago, coope said:

Totally unnecessary though. Somebody has probably already said this, but you can just say `write( (char*)my_struct, sizeof(MyStruct));`

Yes, it was literally in the 3rd paragraph of the 1st post. Much of the rest of the thread is pointing out why this is a bad thing to do.


A bit late to the party, but have a look at this pattern; I think it's a really elegant way to serialize binary data. Also have a read of section 7.4 of Beej's Guide to Network Programming about data serialization and how to make binary data portable: http://beej.us/guide/bgnet/output/html/singlepage/bgnet.html#serialization

struct Serializer
{
    // To be implemented by files, buffers, network connections, or whatever stream you want to write to
    virtual size_t write(void* data, size_t size) = 0;

    bool writeUByte(unsigned char byte) { return write(&byte, sizeof(byte)) == sizeof(byte); }
    bool writeUInt(uint32_t i) { return write(&i, sizeof(i)) == sizeof(i); }
    bool writeFloat(float f) {
        uint32_t buf = htonf(f);  // Convert to portable representation
        return writeUInt(buf);
    }
    bool writeVector3(const Vector3& v) {
        bool success = true;
        success &= writeFloat(v.x);
        success &= writeFloat(v.y);
        success &= writeFloat(v.z);
        return success;
    }
    // Etc.
};

Basically you write methods for all of the different types of data you want to have serialized in your program (maybe you have buffers, strings, quaternions...). Then you can implement the serializer for a file like so:

struct File : public Serializer{
    size_t write(void* data, size_t size) override {
        return fwrite(data, 1, size, fp_);  // fwrite takes (ptr, elemSize, count, stream)
    }
private:
    FILE* fp_;
};

Or you could implement the serializer for a buffer:

struct Buffer : public Serializer{
    size_t write(void* data, size_t size) override {
        for (size_t i = 0; i != size; ++i)
            buffer_.push_back(((char*)data)[i]);  // horribly slow way
        return size;
    }
private:
    std::vector<char> buffer_;
};

You get the idea. The point is, the code writing the data doesn't have to care about what it's writing to, and it also doesn't have to care about endianness or type sizes, because that's handled by the Serializer.

Then, similarly, you write a Deserializer to handle reading the data back:

struct Deserializer
{
    virtual size_t read(void* dest, size_t size) = 0;

    unsigned char readUByte() {
        unsigned char buf;
        read(&buf, sizeof(buf));
        return buf;
    }

    uint32_t readUInt() {
        uint32_t buf;
        read(&buf, sizeof(buf));
        return buf;
    }

    float readFloat() {
        uint32_t buf = readUInt();
        return ntohf(buf);
    }

    Vector3 readVector3() {
        Vector3 v;
        v.x = readFloat();
        v.y = readFloat();
        v.z = readFloat();
        return v;
    }
};

Example usage would be:

struct Vector3
{
    float x, y, z;
};

int main()
{
    Vector3 v = {1.0f, 4.0f, 7.0f};

    File ser;  // assumes File also wraps fopen/fclose in open()/close()
    ser.open("whatever.dat");
    ser.writeVector3(v);
    ser.close();

    File des;
    des.open("whatever.dat");
    v = des.readVector3();
    des.close();
}

 


