C++ Writing char arrays (binary)


Hi all,

For practice I'm trying to write 3D mesh data to a binary format, so that eventually I can also read meshes back from binary files (provided by the asset pipeline).

I understand how to write floats, vectors of floats, ints etc. by calling write((char*)&myVar, sizeof(type)) on the file, for which I'm using std::ofstream.

My goal is to write a 3D vector or another (struct) type to the file and be able to read it back the same way. From what I understand, I need to create a function that returns a char array combining all members of the struct.

So in short: how can I create this char array in a "safe" way?

So far I've come up with this approach:

- calculate the number of bytes/chars needed, e.g. 12 bytes for a 3D vector with 3 floats

- char result[12]

- cast the 3 individual floats to char[4]'s and use strcpy to result for the 1st and strcat for the others

Is this the way to go or would you suggest other approaches?

Question 2: how would I read/convert the char array back to the 3 floats when reading it from the binary file?

Any input is appreciated.


I'm not even sure what your problem is there. Let's start from the basics: the only thing that truly exists is memory and bits. As long as you're dealing with 'basic' types such as floats, the following will do:

typedef unsigned char ubyte;
ubyte blobby[12];
memcpy(blobby, src, sizeof(blobby)); /* src points at the source floats */
float *bruh = (float*)blobby;

Hoping `blobby` is properly aligned.

First you have to ensure your `int` is the same size as the serialized `int`, and that's why you have fixed-width types such as `__int32` or `std::int64_t`. And of course you have endianness problems, but let's assume you'll be loading on 'coherent' machines.

When a `struct` enters the picture, `memcpy` goes out the window due to the possibility of different compiler packing.

There are various options for (de)serialization, I'd suggest pupping all the things.

That, or google Protocol Buffers, but in my experience they're not quite convenient for game purposes due to their "everything can be optional" nonsense.

 

 


Thanks guys.

I've been playing around with this input. Unfortunately, writing the struct or array (std::vector) in one go doesn't work when casting to (myType*); it does work when I cast to (char*). See the code samples below, all tested positive.

Would this approach carry any risks, depending on the platform (size of float etc.)?
I'll make sure the struct I use is a POD struct, which is the case in the example below.

void ReadWriteStruct()
{
	std::ofstream writeFile;
	writeFile.open("data.bin", std::ios::out | std::ios::binary);
	writeFile.write((char*)&srcVector, sizeof(VECTOR3));
	writeFile.close();

	MessageBox(NULL, L"Vector written to binary file - full struct!", L"Done", MB_OK);

	std::ifstream readFile;
	readFile.open("data.bin", std::ios::in | std::ios::binary);

	VECTOR3 myReadVec3;
	readFile.read((char*)&myReadVec3, sizeof(VECTOR3));
	readFile.close();

	char tempText[100];
	sprintf_s(tempText, "Read: %f, %f, %f\n", myReadVec3.x, myReadVec3.y, myReadVec3.z);
	OutputDebugStringA(tempText);
}

void ReadWriteArray()
{
	std::vector<VECTOR3> myVectors(2);
	myVectors[0].x = 0.2f;
	myVectors[0].y = 0.7f;
	myVectors[0].z = 0.95f;
	myVectors[1].x = 5.2f;
	myVectors[1].y = 4.7f;
	myVectors[1].z = 7.75f;

	std::ofstream writeFile;
	writeFile.open("data.bin", std::ios::out | std::ios::binary);
	writeFile.write((char*)&myVectors[0], myVectors.size() * sizeof(VECTOR3));
	writeFile.close();

	MessageBox(NULL, L"Vector written to binary file - full struct!", L"Done", MB_OK);

	std::ifstream readFile;
	readFile.open("data.bin", std::ios::in | std::ios::binary);

	std::vector<VECTOR3> readVectors(2);
	readFile.read((char*)&readVectors[0], sizeof(VECTOR3));
	readFile.read((char*)&readVectors[1], sizeof(VECTOR3));
	readFile.close();

	char tempText[100];
	sprintf_s(tempText, "Read 1: %f, %f, %f\n", readVectors[0].x, readVectors[0].y, readVectors[0].z);
	OutputDebugStringA(tempText);
	sprintf_s(tempText, "Read 2: %f, %f, %f\n", readVectors[1].x, readVectors[1].y, readVectors[1].z);
	OutputDebugStringA(tempText);
}

 

17 hours ago, cozzie said:

Unfortunately, writing the struct or array (std::vector) in one go doesn't work when casting to (myType*); it does work when I cast to (char*). [code snipped]

I made a mistake; it should have been char*, not myType*. I don't see the structure in your post, but usually a vector struct is a POD.

You can read the vector array in a single call as well.

Krohm already posted the problems you may face.

On 10/11/2017 at 12:57 AM, Krohm said:

There are various options for (de)serialization, I'd suggest pupping all the things.

I second this approach.

Break it down so that you're only serializing fundamental types. This lets you avoid all issues with alignment and padding, properly handle endianness, and trivially support composites of serializable types.

There's nothing you gain from trying to read/write entire buffers of typed data at once, other than the potential for hard-to-find bugs.


A little bit off topic, but I always write the length of the char array first and then write the char array itself anyway.

Memcpy is your pal, you just memcpy(&floatvar, &chartable[index],3);

 

Something like this: you copy some amount of memory to a float pointer.


Do you mean first writing everything to a char array (aka buffer) and then write that buffer to file?

That could work, but then I have to calculate the index manually, which could be a risk if some type has a different byte size on another platform (writing per variable and using sizeof prevents this, I think). In your example, shouldn't 3 be 4 (bytes for a float)?

But perhaps there's a way to use a buffer while keeping this in mind (or without manually calculating the index of each variable).


I don't understand the difficulty you're having; aren't you just over-thinking this? I mean:

class Writer {
  ...
  void WriteInt(int n) { Store((const char *)&n, sizeof(n)); }
  void WriteFloat(float n) { Store((const char *)&n, sizeof(n)); }
  // etc

  void Store(const char *base, size_t sz) {
    for (size_t i = 0; i < sz; i++) // store or write base[i]
  }
};


struct SomeStuff {
  int a;
  float b;

  void Write(Writer &w) {
    w.WriteInt(a);
    w.WriteFloat(b);
  }
};

Make a Writer class that has a function for each type of data; for each structure you have, add a "Write" method that writes itself to the Writer. Reading is just the other way around: you have a Reader class with functions that produce various elementary values, and each struct has a Read method that assigns fields in the same order as the Write method.

 


Be aware though that the above is not portable. (For example, there are at least 4 different byte patterns that WriteInt could generate on relatively modern systems, although you're unlikely to see 2 of them.)

I have no idea what the previous 2 posts were about, however. Perhaps the typos caused confusion. Writing variable-length arrays or lists of anything is usually best achieved by prepending the length, obviously.

50 minutes ago, Alberth said:

Make a Writer class that has a function for each type of data; for each structure you have, add a "Write" method that writes itself to the Writer. Reading is just the other way around: you have a Reader class with functions that produce various elementary values, and each struct has a Read method that assigns fields in the same order as the Write method.

 

Or go with pupping and have a single function to read, write, and compute size. No repeating yourself, but I feel like I'm beating a dead horse at this point.


(Just an aside: I don't like the use of the term 'pupping' because that was just invented for that article. The idea of writing out the fields individually is not new, nor is the idea of using a single function for both reads and writes, e.g. https://gafferongames.com/post/serialization_strategies/ . Apart from that however, the linked article is good.)


Ohh I did this a while ago, I ended up posting a stackoverflow question because I was still learning many of the fundamentals to C++ at the time. https://stackoverflow.com/questions/20198002/how-is-a-struct-stored-in-memory

I'll synthesize the important stuff for you though, which is that all you really need to know is that an object of any struct is stored in memory as an array of bytes (and let's be honest, even primitive types are, they're just much shorter). Binary files are essentially really long arrays of bytes.

So one of my favorite reasons to use a union is to have char arrays lying around which represent some data that is quite likely going to be pushed into a file. You need to know how many bytes your data is, but that is quite easy.

struct POD
{
	float X,Y,Z;
};

union quickie_data
{
	POD data_obj;
	char data_raw[sizeof(POD)];
};

Totally unnecessary though. Somebody has probably already said this, but you can just say `write( (char*)&my_struct, sizeof(MyStruct));`

16 hours ago, coope said:

Totally unnecessary though. Somebody has probably already said this, but you can just say `write( (char*)&my_struct, sizeof(MyStruct));`

Yes, it was literally in the 3rd paragraph of the 1st post. Much of the rest of the thread is pointing out why this is a bad thing to do.


A bit late to the party, but have a look at this pattern, I think it's a really elegant way to serialize binary data. Also have a read of section 7.4 in Beej's guide to network programming about data serialization and how to make binary data portable here: http://beej.us/guide/bgnet/output/html/singlepage/bgnet.html#serialization

struct Serializer
{
    // To be implemented by files, buffers, network connections, or whatever stream you want to write to
    virtual size_t write(void* data, size_t size) = 0;

    bool writeUByte(unsigned char byte) { return write(&byte, sizeof(byte)) == sizeof(byte); }
    bool writeUInt(uint32_t i) { return write(&i, sizeof(i)) == sizeof(i); }
    bool writeFloat(float f) {
        uint32_t buf = htonf(f);  // Convert to portable representation (htonf is not standard C++; see section 7.4 of Beej's guide for a portable pack754-style version)
        return writeUInt(buf);
    }
    bool writeVector3(const Vector3& v) {
        bool success = true;
        success &= writeFloat(v.x);
        success &= writeFloat(v.y);
        success &= writeFloat(v.z);
        return success;
    }
    // Etc.
};

Basically you write methods for all of the different types of data you want to have serialized in your program (maybe you have buffers, strings, quaternions...) Then you can implement the serializer for a file like so:

struct File : public Serializer{
    size_t write(void* data, size_t size) override {
        return fwrite(data, 1, size, fp_);
    }
private:
    FILE* fp_;
};

Or you could implement the serializer for a buffer:

struct Buffer : public Serializer{
    size_t write(void* data, size_t size) override {
        for (size_t i = 0; i != size; ++i)
            buffer_.push_back(((char*)data)[i]);  // horribly slow way
        return size;  // report how much was written
    }
private:
    std::vector<char> buffer_;
};

You get the idea. The point is, the code writing the data doesn't have to care about what it's writing to, and it also doesn't have to care about endianness or type sizes, because that's handled by Serializer.

Then, similarly, you write a Deserializer to handle reading the data back:

struct Deserializer
{
    virtual size_t read(void* dest, size_t size) = 0;

    unsigned char readUByte() {
        unsigned char buf;
        read(&buf, sizeof(buf));
        return buf;
    }

    uint32_t readUInt() {
        uint32_t buf;
        read(&buf, sizeof(buf));
        return buf;
    }

    float readFloat() {
        uint32_t buf = readUInt();
        return ntohf(buf);
    }

    Vector3 readVector3() {
        Vector3 v;
        v.x = readFloat();
        v.y = readFloat();
        v.z = readFloat();
        return v;
    }
};

Example usage would be:

struct Vector3
{
    float x, y, z;
};

int main()
{
    Vector3 v = {1.0f, 4.0f, 7.0f};

    File ser;
    ser.open("whatever.dat");
    ser.writeVector3(v);
    ser.close();

    File des;
    des.open("whatever.dat");
    v = des.readVector3();
    des.close();
}

 


Thanks guys, and sorry for the late response. I've added the ReadXxx and WriteXxx helpers; the code's much cleaner now. I decided to stick with writing/casting only the individual standard types (float, uint etc.).

What I haven't figured out yet is how I can read a full chunk of data and "place" it in a std::vector. For example, say I know there are 100 floats in the file: would it be possible to read them all at once and place them in a std::vector<float> (with a size of 100)?

On ‎30‎/‎10‎/‎2017 at 8:04 AM, cozzie said:

What I didn't figure out yet is how I can read a full chunk of data and "place" that in a std::vector. For example, say that I know that there are 100 floats in the file, would it be possible to read them all at once and place them in a std::vector<float> (with a size of 100)?

I'd add methods for doing that to your serializer/deserializer classes:

class Deserializer {
    /* ... */
  
    void readFloatArray(std::vector<float>* v) {
        size_t arraySize = readUInt();  // we saved the size so we know how much to read
        v->clear();
        v->reserve(arraySize);          // note: resize() + push_back would double the size
        for (size_t i = 0; i < arraySize; ++i)
            v->push_back(readFloat());
    }
  
    /* ... */
};

class Serializer {
    /* ... */
    
    void writeFloatArray(const std::vector<float>& v) {
        writeUInt((uint32_t)v.size());  // save size, for reading back
        for (size_t i = 0; i < v.size(); ++i)
            writeFloat(v[i]);
    }
  
    /* ... */
};

Then, to read the 100 floats:

File file;
file.open("whatever.dat");

std::vector<float> myFloats;
file.readFloatArray(&myFloats);

 

