Writing 'floats' to a file

Hi guys,

I am in the process of writing a model converter for my game. But, I occasionally don't get the desired output (i.e. the models are sometimes garbled).

The models files are purely geometry (x, y, & z positions).

ofstream myfile;
outputFilename = filename + "dat";
myfile.open(outputFilename, std::fstream::out, std::fstream::binary);
 
int faces = GetFaceCount();
 
char *test = new char[faces * 36];      // Is this a problem as I can only use char and not 'unsigned char'?
 
for (int i = 0; i != faces; i++)
{
   memcpy(test + i * 36, &face[i].vertex0.x, sizeof(float));
   memcpy(test + 4 + i * 36, &face[i].vertex0.y, sizeof(float));
   memcpy(test + 8 + i * 36, &face[i].vertex0.z, sizeof(float));
   memcpy(test + 12 + i * 36, &face[i].vertex2.x, sizeof(float));
   memcpy(test + 16 + i * 36, &face[i].vertex2.y, sizeof(float));
   memcpy(test + 20 + i * 36, &face[i].vertex2.z, sizeof(float));
   memcpy(test + 24 + i * 36, &face[i].vertex1.x, sizeof(float));
   memcpy(test + 28 + i * 36, &face[i].vertex1.y, sizeof(float));
   memcpy(test + 32 + i * 36, &face[i].vertex1.z, sizeof(float));
}
 
myfile.write(test, faces * 36);
myfile.close();

I am concerned that it may be the use of 'char' causing some values to be corrupted, but fstream expects the first parameter of write() to be a char*, so 'unsigned char' can't be used.

Would this be the cause of some values occasionally being wrong?

If so, is there another way that I can write floats to a file in quad byte form and not as text?

Thanks in advance :)

It might not fix the issue, but you are heavily overcomplicating the process. While std::ostream::write expects a char*, it does not need to be actual char data. You can just cast whatever you want to (char*), pass in the correct size, and it will write it out. So you can do:


float *test = new float[faces * 9];      // 9 floats per face (three vertices, xyz each)
 
for (int i = 0; i != faces; i++)
{
   memcpy(test + i * 9, &face[i].vertex0, 3 * sizeof(float)); // or: sizeof(your vertex/vector structure)
   memcpy(test + 3 + i * 9, &face[i].vertex2, 3 * sizeof(float)); // or: sizeof(your vertex/vector structure)
   memcpy(test + 6 + i * 9, &face[i].vertex1, 3 * sizeof(float)); // or: sizeof(your vertex/vector structure)
}
 
myfile.write((char*)test, faces * 9 * sizeof(float));
myfile.close();

See, this simplifies the offset calculation by quite a bit. Notice also that you do not have to memcpy every single vertex member; you can just copy the whole thing (in case your vector class's elements are laid out one after another in the declaration).
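Taking that one step further: if your Face structure really is nothing but nine tightly packed floats (no padding, no extra members), you could skip the intermediate buffer entirely and hand the whole array to write() in one call. A minimal sketch with a hypothetical Vec3/Face layout (note it keeps the natural vertex0/vertex1/vertex2 order rather than swapping vertex1 and vertex2 like your loop does):

#include <fstream>
#include <string>

struct Vec3 { float x, y, z; };                    // assumed layout
struct Face { Vec3 vertex0, vertex1, vertex2; };   // 36 bytes if unpadded

void WriteFaces(const std::string& path, const Face* faces, int faceCount)
{
    std::ofstream out(path, std::ios::out | std::ios::binary);
    // One call writes the whole array; only valid while Face stays tightly packed.
    out.write(reinterpret_cast<const char*>(faces),
              static_cast<std::streamsize>(faceCount) * sizeof(Face));
}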

I hope this is of some help.

Hi Juliean,

I did actually try to write 'vertex' earlier on, but the data was all over the place. I suspected that x, y, & z might have been stored in memory in reverse order, so I went back to the longer method.
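Rather than guessing at the member order, you can make the compiler verify it; a small sketch (assuming C++11 and a hypothetical Vec3 standing in for your vector class):

#include <cstddef>

struct Vec3 { float x, y, z; };   // stand-in for your vector class

// Members of a standard-layout struct sit in declaration order; these checks
// catch any padding or reordering surprise at compile time.
static_assert(offsetof(Vec3, x) == 0,                 "unexpected layout");
static_assert(offsetof(Vec3, y) == sizeof(float),     "unexpected layout");
static_assert(offsetof(Vec3, z) == 2 * sizeof(float), "unexpected layout");
static_assert(sizeof(Vec3) == 3 * sizeof(float),      "unexpected padding");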

I actually got this up and running a couple of minutes ago by ditching fstream and using fwrite(). So, it looks like it may have been a char/unsigned char issue as suspected.

fwrite(test, sizeof(float)*faces*9, 1, file);
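For completeness, the surrounding calls look something like this (a sketch; outputFilename and test are the variables from the first post, and the "wb" mode is what keeps the runtime from translating any bytes on the way out):

#include <cstdio>

FILE* file = std::fopen(outputFilename.c_str(), "wb");   // "wb" = write, binary
if (file)
{
    // One block: faces * 9 floats, exactly as they sit in memory.
    std::fwrite(test, sizeof(float) * faces * 9, 1, file);
    std::fclose(file);
}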
You can use a byte stream (std::stringstream, since you are using std) to load the float array into a string, open a file after you have the bytes, and write them all at once (maybe using stringstream.str().c_str()). That is probably faster and easier to debug.
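A minimal sketch of that approach, reusing the face/outputFilename names from the first post:

#include <fstream>
#include <sstream>
#include <string>

std::ostringstream buffer;   // grows as bytes are appended

for (int i = 0; i != faces; ++i)
{
    // Append each component as 4 raw bytes.
    buffer.write(reinterpret_cast<const char*>(&face[i].vertex0.x), sizeof(float));
    buffer.write(reinterpret_cast<const char*>(&face[i].vertex0.y), sizeof(float));
    buffer.write(reinterpret_cast<const char*>(&face[i].vertex0.z), sizeof(float));
    // ... same for vertex2 and vertex1 ...
}

// Open the file only once all bytes are ready, then write them in one go.
std::ofstream out(outputFilename, std::ios::out | std::ios::binary);
const std::string bytes = buffer.str();
out.write(bytes.data(), static_cast<std::streamsize>(bytes.size()));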

If you plan to incorporate an array into your binary model file, I suggest you write the size of the array first so you can preallocate the data and load it in a linear fashion when it is time to load.
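For example, the load side can then look like this sketch (the names are placeholders):

#include <cstdint>
#include <fstream>
#include <vector>

std::ifstream in("model.dat", std::ios::in | std::ios::binary);

// Read the element count that was written first...
std::uint32_t faceCount = 0;
in.read(reinterpret_cast<char*>(&faceCount), sizeof(faceCount));

// ...then preallocate once and read the whole array linearly.
std::vector<float> positions(faceCount * 9u);
in.read(reinterpret_cast<char*>(positions.data()),
        static_cast<std::streamsize>(positions.size() * sizeof(float)));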

Also, don't forget that the endianness will need to be known when reading; you can write the endian type at the beginning of the file during the save process.
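One simple way to do that (a sketch, not code from this thread) is to write a fixed 32-bit marker at the start of the file and compare it on load:

#include <cstdint>
#include <fstream>

const std::uint32_t kByteOrderMark = 0x01020304u;   // arbitrary agreed-upon value

// On save: the writer's native byte order is baked into these four bytes.
void WriteEndianTag(std::ofstream& out)
{
    out.write(reinterpret_cast<const char*>(&kByteOrderMark), sizeof(kByteOrderMark));
}

// On load: reading back 0x04030201 means the file came from a machine with
// the opposite endianness and every value needs byte-swapping.
bool SameEndianness(std::ifstream& in)
{
    std::uint32_t tag = 0;
    in.read(reinterpret_cast<char*>(&tag), sizeof(tag));
    return tag == kByteOrderMark;
}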

I don't think the use of char is the problem. Endianness shouldn't make a difference if you are writing and reading on the same platform, but it is something to be aware of. For simplicity I would skip the 'new' buffer you have there and just use the streams. Why allocate twice if you don't have to?

As you are using streams I would recommend overloading the << streams operator:


std::ostream& operator <<(std::ostream& stream, const <yourtype>& value){
    // your implementation here
    return stream;
}

The code will simply become something like


for (const <yourtype>& item : face)
{
    myfile << item;
}

I don't think using streams will produce the fastest code, but I like the way the code fits together. The technique can be used for all types and makes everything simpler if a lot of the code looks the same.

More details here:
http://www.cplusplus.com/reference/ostream/ostream/
http://stackoverflow.com/questions/476272/how-to-properly-overload-the-operator-for-an-ostream

The general expectation of operator << and >> with the std::stream classes is text input/output. I would strongly recommend against using them for binary input/output of some types. It's counterintuitive when reading the code, and it's far too easy to accidentally serialize types textually inside your normally-binary representation.
If you want to keep using << and >> I would strongly recommend defining your own binary_stream type which does not share operators with the normal std::stream classes.
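A minimal sketch of that suggestion (the binary_ostream name and its interface are assumptions, not an existing std type; assumes C++11):

#include <fstream>
#include <string>
#include <type_traits>

// << on this type always means "raw bytes", never formatted text.
class binary_ostream {
public:
    explicit binary_ostream(const std::string& path)
        : m_file(path, std::ios::out | std::ios::binary) {}

    template <typename T>
    binary_ostream& operator<<(const T& value) {
        static_assert(std::is_trivially_copyable<T>::value,
                      "only plain data can be written byte-for-byte");
        m_file.write(reinterpret_cast<const char*>(&value), sizeof(T));
        return *this;
    }

private:
    std::ofstream m_file;
};

Used as binary_ostream out("model.dat"); out << faceCount << someVertex; there is no textual std::ostream overload to fall back to by accident.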

I'll post my code here in case the last answer is not clear.

I'm not using std. I really think std overloaded operators make things very complicated to read and write.

The class below handles both reading and writing of binary files. The class manages its own buffer only when it is used as a read-only stream. (You should probably break this into smaller classes later.)

The header:


#ifndef __BYTE_STREAM__
#define __BYTE_STREAM__

#include "..\IRStdLib\CString.h"
#include "CIStream.h"

class CIByteStream : public CIStream {
public :
	CIByteStream(unsigned char* _pbBytes, unsigned int _ui32BytesCount);
	
	~CIByteStream();

	virtual I_STREAM_ERRORS LoadBytesFromFile(const CString& _sFilePath);
	virtual I_STREAM_ERRORS WriteBytesInFile(const CString& _sFilePath) const;

	const unsigned char* GetBuffer() const;
	unsigned int GetBufferSize() const;

	virtual void ReadBytes(unsigned char** _pbOut, unsigned int _ui32BytesCount) const;
	virtual unsigned int ReadInt32() const;
	virtual float ReadFloat() const;

	virtual void WriteBytes(const unsigned char* _pbBytes, unsigned int _ui32BytesCount);
	virtual void WriteInt32(unsigned int _ui32Value);
	virtual void WriteFloat(float _fValue);
protected :
	unsigned int m_ui32BufferSize;
	unsigned char* m_pbBuffer;
	mutable unsigned char* m_pbCurBufferPtr;
};

inline const unsigned char* CIByteStream::GetBuffer() const {
	return m_pbBuffer;
}

inline unsigned int CIByteStream::GetBufferSize() const {
	return m_ui32BufferSize;
}

#endif // #ifndef __BYTE_STREAM__

The .cpp:


#include "CIByteStream.h"
#include <cstring>
#include <cstdio>

CIByteStream::CIByteStream(unsigned char* _pbBytes, unsigned int _ui32BytesCount) :
	m_pbBuffer(NULL),
	m_pbCurBufferPtr(_pbBytes),
	m_ui32BufferSize(_ui32BytesCount) {
}

CIByteStream::~CIByteStream() {
	::free(m_pbBuffer);
}

CIStream::I_STREAM_ERRORS CIByteStream::LoadBytesFromFile(const CString& _sFilePath) {
	FILE* pfStream = NULL;
	::fopen_s(&pfStream, (char*) _sFilePath.Chars(), "rb");
	if (!pfStream) { 
		return CIStream::IE_FAILED;
	}
	
	::fseek(pfStream, 0L, SEEK_END);	
	unsigned int ui32Size = ::ftell(pfStream);
	
	if (!ui32Size) {
		::fclose(pfStream);
		return CIStream::IE_EMPTY;
	}

	m_pbBuffer = reinterpret_cast<unsigned char*>(::malloc(ui32Size));
	
	::fseek(pfStream, 0L, SEEK_SET);
	::fread(m_pbBuffer, 1, ui32Size, pfStream);
	::fclose(pfStream);
	
	m_pbCurBufferPtr = m_pbBuffer;
	m_ui32BufferSize = ui32Size;

	return CIStream::IE_NONE;
}

CIByteStream::I_STREAM_ERRORS CIByteStream::WriteBytesInFile(const CString& _sFilePath) const {
	FILE* pfStream = NULL;
	::fopen_s(&pfStream, (char*)_sFilePath.Chars(), "wb");
	if (!pfStream) {
		return CIStream::IE_FAILED;
	}
	::fwrite(m_pbBuffer, sizeof(unsigned char), m_ui32BufferSize, pfStream);
	::fclose(pfStream);
	return CIStream::IE_NONE;
}

void CIByteStream::ReadBytes(unsigned char** _pbOut, unsigned int _ui32Size) const {
	unsigned char* pbOutBytes = *_pbOut;
	for (unsigned int I = 0; I < _ui32Size; ++I) {
		pbOutBytes[I] = m_pbCurBufferPtr[I];
	}
	m_pbCurBufferPtr += _ui32Size;
}

unsigned int CIByteStream::ReadInt32() const {
	unsigned int ui32Out = *reinterpret_cast<unsigned int*>(m_pbCurBufferPtr);
	m_pbCurBufferPtr += sizeof(unsigned int);
	return ui32Out;
}

float CIByteStream::ReadFloat() const {
	float fOut = *reinterpret_cast<float*>(m_pbCurBufferPtr);
	m_pbCurBufferPtr += sizeof(float);
	return fOut;
}

void CIByteStream::WriteBytes(const unsigned char* _pbBytes, unsigned int _ui32BytesCount) {
	::memcpy(m_pbCurBufferPtr, _pbBytes, _ui32BytesCount);
	m_pbCurBufferPtr += _ui32BytesCount;
}

void CIByteStream::WriteInt32(unsigned int _ui32Value) {
	::memcpy(m_pbCurBufferPtr, reinterpret_cast<unsigned char*>(&_ui32Value), sizeof(unsigned int));
	m_pbCurBufferPtr += sizeof(unsigned int);
}

void CIByteStream::WriteFloat(float _fValue) {
	::memcpy(m_pbCurBufferPtr, reinterpret_cast<unsigned char*>(&_fValue), sizeof(float));
	m_pbCurBufferPtr += sizeof(float);
}

So, let's say you want to write your own binary model file to disk. Then you have something along these lines:


u32 ui32ByteCount = pmModel->GetByteCount();
unsigned char* pbBytes = (unsigned char*) ::malloc(ui32ByteCount);

CIByteStream ibsByteStream( pbBytes, ui32ByteCount );
pmModel->Write( ibsByteStream );
ibsByteStream.WriteBytesInFile( "Filename.bin" );

::free(pbBytes);

The ui32ByteCount variable is the size of the model returned by CModel::GetByteCount(), which loops through every element that will be written (in the same order it will later be read) before ibsByteStream.WriteBytesInFile() is called.

For your case, then you have something like:


u32 CModel::GetByteCount() {
    u32 ui32Total = 0;
    // We will write the array size first.
    ui32Total += sizeof(u32);
    // Now we write the elements.
    for( u32 I = 0; I < ui32FaceCount; ++I ) {
        // Assuming that there are three vertices per face.
        ui32Total += 3 * sizeof(Vertex3);
    }
    return ui32Total;
}

void CModel::WriteToStream(CIByteStream& _ibsStream) const {
    // We will write the array size first.
    _ibsStream.WriteInt32( ui32FaceCount );
    // Now we write the elements.
    for( u32 I = 0; I < ui32FaceCount; ++I ) {
        const Face& fFace = faces[I];
        // Assuming that there are three vertices per face.
        for (u32 J = 0; J < 3; ++J) {
            const Vertex3& vertex = fFace.vertex[J];
            for (u32 K = 0; K < 3; ++K) {
                _ibsStream.WriteFloat( vertex[K] );
            }
        }
    }
}

// When its time to convert a model:
int main(...) {
    CModel mModel;
    mModel.ConvertFrom("Filename.thirdpartyformat");
    u32 ui32ByteCount = mModel.GetByteCount();
    unsigned char* pbBytes = (unsigned char*) ::malloc(ui32ByteCount);

    // Note that the file is only opened once we already have all the data.
    CIByteStream ibsStream( pbBytes, ui32ByteCount );
    mModel.WriteToStream(ibsStream);
    ibsStream.WriteBytesInFile("Filename.bin");

    ::free(pbBytes);
}

I don't know the internals of ofstream or ifstream, so I can't argue much about performance. But the above way is much clearer, I think, since there are no confusing overloaded operators.

Hope that helps.

The general expectation of operator << and >> with the std::stream classes is text input/output. I would strongly recommend against using them for binary input/output of some types. It's counterintuitive when reading the code, and it's far too easy to accidentally serialize types textually inside your normally-binary representation.
If you want to keep using << and >> I would strongly recommend defining your own binary_stream type which does not share operators with the normal std::stream classes.

I didn't realise there was this distinction in std::streams. A very short look through some online material verifies this response. I now think what I said before is not a great idea for working with binary data. What I was trying to get at in my post was the use of the raw char[] and the overcomplicated maths for indexing the array. You are probably doing it this way for speed. Using the original method I can see two allocations, and the maths looks like an attempt to improve cache coherency.

For this particular case then, given what I have learnt about the << operator, using std::streams and aiming for simplicity, I would forget buffering up the data in raw char[]s and just use ostream::write for each element (see the sketch below). If you want to get the speed back from having a buffer, then a custom buffer class like Irlan's would be the way I would go. It's the extra allocations that are going to be slow, but the code is so much simpler.
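Something along these lines, as a sketch (the variable names mirror the code in the first post, and the vertex0/vertex2/vertex1 order is kept):

#include <fstream>

std::ofstream out(outputFilename, std::ios::out | std::ios::binary);

for (int i = 0; i != faces; ++i)
{
    // No hand-rolled buffer: each vertex goes straight to the stream,
    // which does its own internal buffering anyway.
    out.write(reinterpret_cast<const char*>(&face[i].vertex0), 3 * sizeof(float));
    out.write(reinterpret_cast<const char*>(&face[i].vertex2), 3 * sizeof(float));
    out.write(reinterpret_cast<const char*>(&face[i].vertex1), 3 * sizeof(float));
}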

I find this a very interesting topic.

I see there are a lot of different opinions on this topic with regard to std::streams, operator misuse, code comprehension practice, etc. Personally I see the << and >> operators as general stream operators, and thinking of them as such is a very clean way of conveying the transformation. The important factor is the target. Given what you have said about std::streams, I agree a binary_stream class would be one way to solve it without ambiguity. I like this because the target describes the formatting method applied to the source data and ensures that all data pushed to the same target is formatted based on the correct scheme. It would be very strange to have some types formatted one way and other types formatted another way.
