Reading from file to structs

Started by
21 comments, last by Hodgman 12 years, 6 months ago
Hello.

I have tried several ways to load file contents into structs but keep running into trouble, because the structs contain strings, which makes them variable in size. The file is supposed to hold several instances, each of which needs to be loaded into a struct.

I am thinking that the file header contains the number of structs in the file. The struct header can contain the length of the string.


Would you be so kind as to show me how I can read a chunk of data from a file and cast it into a struct that has a string in it? (By the way, are std::strings bad for this?)


LOADSPRITEOBJECT LoadSpriteStruct;
std::ifstream inbal("spritedata.txt", std::ios::in | std::ios::binary);
if(!inbal) {
MessageBox(NULL, L"Unable to open", L"Error", MB_OK);
PostQuitMessage(0);
return;
}

inbal.read((char *) &LoadSpriteStruct, sizeof(LoadSpriteStruct));
//inbal.close();

MakeSprite(LoadSpriteStruct); //init sprite and push in to a vector



void USERINTERFACEBOX::MakeSprite(LOADSPRITEOBJECT &instance)
{
    SPRITEOBJECT Sprite;

    //Set read data (taken from the parameter, not the caller's local variable)
    Sprite.Color = instance.Color;
    Sprite.name = instance.name;
    Sprite.OffsetPosition = instance.OffsetPosition;
    Sprite.visible = instance.visible;

    //Calculate position relative parent box
    Sprite.position = BoxPosition + Sprite.OffsetPosition;

    //Init Sprite and put in array
    SpriteHandler.LoadSprite(Sprite);
    Sprites.push_back(Sprite);
}



struct LOADSPRITEOBJECT
{
int NameLength;
std::string name;
D3DXVECTOR3 OffsetPosition;
D3DCOLOR Color;
bool visible;
};
If you want to read and write the whole struct in one line you can use a char array instead of std::string. Another way is to handle each member in the struct separately. You can write a std::string to file by first writing the length of the string and then the string data.
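As a rough sketch of the first option (a char array, so the whole struct goes to disk in one call): the record type, its field types and the 32-character capacity below are made-up stand-ins, not the poster's real LOADSPRITEOBJECT, and the helper names are hypothetical.

```cpp
#include <cassert>
#include <cstring>
#include <istream>
#include <ostream>
#include <sstream>
#include <type_traits>

// Hypothetical fixed-size record: no pointers and no std::string,
// so its bytes can be written and read back in a single call.
struct SpriteRecord {
    char          name[32];   // fixed capacity, kept NUL-terminated by setName
    float         offset[3];  // stand-in for D3DXVECTOR3
    unsigned int  color;      // stand-in for D3DCOLOR
    unsigned char visible;    // 0 or 1
};
static_assert(std::is_trivially_copyable<SpriteRecord>::value,
              "raw read/write only makes sense for trivially copyable types");

// Copies at most 31 characters and always terminates the array.
inline void setName(SpriteRecord& r, const char* s) {
    std::strncpy(r.name, s, sizeof(r.name) - 1);
    r.name[sizeof(r.name) - 1] = '\0';
}

// Writes the raw bytes of the record (native byte order and padding).
inline bool writeRecord(std::ostream& out, const SpriteRecord& r) {
    out.write(reinterpret_cast<const char*>(&r), sizeof(r));
    return static_cast<bool>(out);
}

// Reads the raw bytes back; returns false on a short read.
inline bool readRecord(std::istream& in, SpriteRecord& r) {
    in.read(reinterpret_cast<char*>(&r), sizeof(r));
    return static_cast<bool>(in);
}
```

Value-initialising the record first (SpriteRecord r = {};) keeps unused name bytes zeroed. A file written this way is only reliably readable by a build with the same struct layout and byte order.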
Char arrays are fixed size if I read them in one line, right? I think it's too limiting and may waste space.

Got any code on how I can LOAD a std::string from file by first reading the length of the string and then the string data?


struct LOADSPRITEOBJECT
{
int NameLength;
std::string name;
D3DXVECTOR3 OffsetPosition;
D3DCOLOR Color;
bool visible;
};

The usual way of using variable-length strings:
struct Foo {
int bar;
std::string name;
int baz;
};

-- serialization:
Foo object;
write( &object.bar, sizeof(int) );
int length = object.name.size();
write( &length, sizeof(int) );
write( object.name.c_str(), length+1 );
write( &object.baz, sizeof(int) );

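-- deserialization:
Foo object;
read(&object.bar, sizeof(int));
int length;
read(&length, sizeof(int));
char* buffer = new char[length+1]; // room for the NUL that was written
read( buffer, length+1 ); // consume the terminator too, so baz lines up
buffer[length] = '\0'; // don't rely on the file having terminated it
object.name.assign( buffer, length );
delete [] buffer;
read(&object.baz, sizeof(int));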

The usual way of using in-place memory offsets:
struct Foo {
int bar;
char* name;
int baz;
};

-- serialization:
Foo object;
write( &object.bar, sizeof(int) );
int offset = int((char*)(&object+1) - (char*)(&object.name));
write( &offset, sizeof(int) );
write( &object.baz, sizeof(int) );
write( object.name, strlen(object.name)+1 );

-- deserialization:
void* buffer = readWholeFile();
Foo* object = (Foo*)buffer;
// assumes sizeof(char*) == sizeof(int), i.e. a 32-bit build
object->name = ((char*)&object->name) + int(object->name);

-- serialization:
Foo object;
write( &object.bar, sizeof(int) );
int length = object.name.size();
write( &length, sizeof(int) );
write( object.name.c_str(), length+1 );
write( &object.baz, sizeof(int) );



In this case the layout in the file would be:

int bar
int length
char str[length+1]
int baz


right? "bar" could be used to identify what type of struct we are reading?

[quote]
In this case the layout in the file would be:
[/quote]
That is the current layout of the data in the file. str[length] is a NUL character.


[quote]
"bar" could be used to identify what type of struct we are reading?
[/quote]
Yes. Sometimes you can omit such identifiers, as the format of the file only allows certain structures in certain positions.

Note that if you are explicitly writing the length, the NUL terminator can be omitted. Care must be taken in this case to correctly convert the not-NUL-terminated character array into a std::string. There are constructor overloads that take a character pointer and a length, or the assign() member function could be used.
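A minimal sketch of that variant, assuming a 32-bit length prefix in native byte order and standard iostreams; writeString and readString are hypothetical helper names. The length is read first, then exactly that many bytes, and assign() with an explicit length builds the std::string, so no terminator is needed in the file.

```cpp
#include <cassert>
#include <cstdint>
#include <istream>
#include <ostream>
#include <sstream>
#include <string>
#include <vector>

// Writes a 32-bit length followed by exactly that many bytes (no NUL).
inline bool writeString(std::ostream& out, const std::string& s) {
    const std::uint32_t length = static_cast<std::uint32_t>(s.size());
    out.write(reinterpret_cast<const char*>(&length), sizeof(length));
    out.write(s.data(), static_cast<std::streamsize>(length));
    return static_cast<bool>(out);
}

// Reads the length, then the bytes, and assigns them with an explicit
// length so the buffer does not have to be NUL-terminated.
inline bool readString(std::istream& in, std::string& s) {
    std::uint32_t length = 0;
    if (!in.read(reinterpret_cast<char*>(&length), sizeof(length))) return false;
    if (length == 0) { s.clear(); return true; }
    std::vector<char> buffer(length);
    if (!in.read(buffer.data(), static_cast<std::streamsize>(length))) return false;
    s.assign(buffer.data(), buffer.size());
    return true;
}
```

For files from an untrusted source it would be sensible to reject implausibly large lengths before allocating the buffer.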
The usual way of using in-place memory offsets:
struct Foo {
int bar;
char* name;
int baz;
};

-- serialization:
Foo object;
write( &object.bar, sizeof(int) );
int offset = int((char*)(&object+1) - (char*)(&object.name)); //offset address is now between bar and baz?
write( &offset, sizeof(int) );
write( &object.baz, sizeof(int) );
write( object.name, strlen(object.name)+1 );

-- deserialization:
void* buffer = readWholeFile();
Foo* object = (Foo*)buffer;
object->name = ((char*)&object->name) + int(object->name);//did you mean offset?
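On the two questions in the comments: as far as I can tell, the offset is written into the slot where the pointer would sit (between bar and baz), and on load that same slot is read as the offset before being overwritten with the real address. Here is a sketch of the same arithmetic that does not depend on pointer size. FooOnDisk, appendFoo and nameOf are hypothetical names, and a 32-bit offset field replaces the char*.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <string>
#include <vector>

// Hypothetical on-disk form of Foo: the char* is replaced by a 32-bit
// offset, so the layout is the same on 32-bit and 64-bit builds.
struct FooOnDisk {
    std::int32_t bar;
    std::int32_t nameOffset;  // bytes from this field to the first character
    std::int32_t baz;
};

// Appends one record followed immediately by its NUL-terminated name.
inline void appendFoo(std::vector<char>& file, std::int32_t bar,
                      std::int32_t baz, const std::string& name) {
    FooOnDisk rec;
    rec.bar = bar;
    rec.baz = baz;
    // The string starts right after the struct, so the distance from the
    // offset field to the string is "end of struct minus field position".
    rec.nameOffset = static_cast<std::int32_t>(
        sizeof(FooOnDisk) - offsetof(FooOnDisk, nameOffset));
    const char* bytes = reinterpret_cast<const char*>(&rec);
    file.insert(file.end(), bytes, bytes + sizeof(rec));
    file.insert(file.end(), name.c_str(), name.c_str() + name.size() + 1);
}

// Resolves the name of the record that starts at file[recordPos].
// No bounds checking: the buffer is assumed to be well formed.
inline const char* nameOf(const std::vector<char>& file, std::size_t recordPos) {
    FooOnDisk rec;
    std::memcpy(&rec, &file[recordPos], sizeof(rec));
    return &file[recordPos] + offsetof(FooOnDisk, nameOffset) + rec.nameOffset;
}
```

With three 4-byte fields the stored offset is 8: the field sits at byte 4 of the record and the string starts at byte 12.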

[quote]
Char arrays are fixed size if I read them in one line, right? I think it's too limiting and may waste space.
[/quote]


Char arrays are fixed size. That doesn't mean they have to waste space though....


struct LOADSPRITEOBJECT
{
    int NameLength;
    D3DXVECTOR3 OffsetPosition;
    D3DCOLOR Color;
    bool visible;
    char name[1]; // first character of a NUL-terminated name that runs past the struct

    // The next record starts right after this one's NUL terminator.
    LOADSPRITEOBJECT* next() const
    {
        const char* ptr = name + strlen(name) + 1;
        return (LOADSPRITEOBJECT*)ptr;
    }

private:
    LOADSPRITEOBJECT(); // never constructed, only viewed inside a loaded buffer
};


struct LOADSPRITEOBJECT_FILEHEADER
{
    int num; // number of records that follow
    LOADSPRITEOBJECT items[1];

private:
    LOADSPRITEOBJECT_FILEHEADER();
};

class LoadOfSpriteObjectsFromFile
{
public:

    void load(const char* filename)
    {
        FILE* fp = fopen(filename,"rb");
        if(fp)
        {
            fseek(fp, 0, SEEK_END);
            size_t sz = ftell(fp);
            rewind(fp);
            data = new unsigned char[ sz ];
            fread(data, 1, sz, fp);
            fclose(fp);

            objects.resize( header->num );

            LOADSPRITEOBJECT* obj = header->items;
            for(int i=0; i<header->num; ++i)
            {
                objects[i] = obj;
                obj = obj->next();
            }
        }
    }

private:

    union
    {
        unsigned char* data;
        LOADSPRITEOBJECT_FILEHEADER* header;
    };
    std::vector<LOADSPRITEOBJECT*> objects;
};


Are you seriously suggesting a solution by abusing memory like that where you tell the compiler and user you have a one-character array for the string, and then storing the actual string content way outside the array and the object? That in itself is undefined, and then your code doesn't even consider the fact that you're not aligning consecutive structures properly to ensure that their members are aligned.

You cannot use your objects by themselves; they are only good for storing as pointers in an array, given your code to load them. Your code will blow up as soon as you try to treat an object as a value. What you propose is nothing more than a pointer and a dynamically sized string, but instead of having a safe implementation of the pointer, you're way into the realm of undefined behavior.
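For comparison, here is a sketch of a loader that avoids both objections by copying each field out of the byte buffer with memcpy into ordinary value objects. The record layout (int32 count, then per record a uint32 color, one visible byte and a NUL-terminated name) and the names Sprite and parseSprites are assumptions made for this example, not the format from the posts above.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <string>
#include <vector>

// Hypothetical value type: safe to copy, store and pass around.
struct Sprite {
    std::uint32_t color;
    bool          visible;
    std::string   name;
};

// Parses: int32 count, then per record: uint32 color, one visible byte,
// NUL-terminated name. Returns false on truncated or malformed input.
inline bool parseSprites(const std::vector<unsigned char>& data,
                         std::vector<Sprite>& out) {
    std::size_t pos = 0;
    std::int32_t count = 0;
    if (data.size() < sizeof(count)) return false;
    std::memcpy(&count, data.data(), sizeof(count));  // no alignment requirement
    pos += sizeof(count);
    if (count < 0) return false;

    std::vector<Sprite> result;
    for (std::int32_t i = 0; i < count; ++i) {
        Sprite s;
        if (data.size() - pos < sizeof(s.color) + 1) return false;
        std::memcpy(&s.color, &data[pos], sizeof(s.color));
        pos += sizeof(s.color);
        s.visible = data[pos++] != 0;
        // Look for the terminator inside the buffer, never past its end.
        const unsigned char* begin = data.data() + pos;
        const void* nul = std::memchr(begin, 0, data.size() - pos);
        if (!nul) return false;
        const unsigned char* end = static_cast<const unsigned char*>(nul);
        s.name.assign(reinterpret_cast<const char*>(begin),
                      static_cast<std::size_t>(end - begin));
        pos = static_cast<std::size_t>(end - data.data()) + 1;
        result.push_back(s);
    }
    out.swap(result);
    return true;
}
```

The cost is one copy per field and one allocation per name, in exchange for objects that can be used as plain values.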

[quote]
Are you seriously suggesting a solution by abusing memory like that where you tell the compiler and user you have a one-character array for the string, and then storing the actual string content way outside the array and the object?
[/quote]

Actually, it's a perfectly normal C idiom that was formalized in C99 with flexible array members. Even the Windows headers use it. Ex: the SYMBOL_INFO structure in dbghelp.h. I'm not personally a big fan of this technique, but it's not uncommon.

This topic is closed to new replies.
