Pilpel

Creating my own file structure


I found very little info about this on the web. Basically I want to create my own file format that stores mesh data (vertices, skinning info, etc).

I don't know where to start, and I know very little about file structures in general. What's the difference between opening files the normal way and opening them in binary mode? I always thought that binary mode is supposed to prevent people from being able to read the file in Notepad, but I'm pretty sure I'm wrong. :lol:

Any tips and sources will be appreciated. Thanks!

Edit: I found this https://www.gamedev.net/resources/_/technical/game-programming/resource-files-explained-r902, but it's 17 years old. Is it still good?


A good search term is "serialization": the act of translating runtime objects into a format that can be stored on disk (deserialization is the opposite process).
You can do this ad hoc with APIs like fopen/fread/fwrite/fclose, or there are plenty of higher-level serialization libraries that make it easier and more robust.

What's the difference between opening files the normal way and opening them in binary mode?

Almost nothing. Text mode may look for specific bytes and change them, for example converting between Windows (\r\n) and Unix (\n) newline sequences. Binary mode just reads and writes the exact bytes without any interference. You almost always want to use binary mode :P
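To make the binary-mode point concrete, here is a minimal sketch using the fopen/fread/fwrite/fclose calls mentioned above. The helper names writeBytes and readBytes are made up for illustration. It writes a buffer with "wb" and reads it back with "rb", so bytes that text mode might touch on some platforms (such as 0x0A or 0x0D) come back unchanged.

```cpp
#include <cstdio>
#include <vector>

// Hypothetical helper: write raw bytes in binary mode ("wb"),
// so the C runtime performs no newline translation.
bool writeBytes(const char* path, const std::vector<unsigned char>& bytes) {
    std::FILE* f = std::fopen(path, "wb");
    if (!f) return false;
    const std::size_t written = std::fwrite(bytes.data(), 1, bytes.size(), f);
    const bool closed = std::fclose(f) == 0;
    return closed && written == bytes.size();
}

// Hypothetical helper: read a whole file back in binary mode ("rb").
// Returns an empty vector if the file cannot be opened.
std::vector<unsigned char> readBytes(const char* path) {
    std::vector<unsigned char> bytes;
    std::FILE* f = std::fopen(path, "rb");
    if (!f) return bytes;
    unsigned char buffer[256];
    std::size_t n = 0;
    while ((n = std::fread(buffer, 1, sizeof buffer, f)) > 0) {
        bytes.insert(bytes.end(), buffer, buffer + n);
    }
    std::fclose(f);
    return bytes;
}
```

Calling writeBytes and then readBytes on the same path should give back exactly the bytes that went in, whatever their values.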


So about the serialization process. You mean I can just have a class that contains a huge amount of data, reinterpret_cast it to void*, and call fwrite()?


You mean I can just have a class that contains a huge amount of data, reinterpret_cast it to void*, and call fwrite()?
Yes and no. You can, but if it contains any pointers, they will no longer be valid after you quit your app, start it again and reload the file. If it contains any C++ objects with constructors, those constructors won't be re-run during deserialization. And if you recompile your code on a different platform or compiler, the memory layout of the structure may be different, so files written by your old version won't load correctly. And so on.

It's much better to save/load one field at a time when using this method.

More advanced methods of serialization will tolerate all of these things, and even let you load data from different versions of the file, tolerate errors, etc... 
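A rough sketch of the "one field at a time" approach for mesh data could look like this. The Vertex and Mesh types and the saveMesh/loadMesh names are invented for the example, and the on-disk layout (a count followed by the elements) is just one possible choice. It still writes values in the host's byte order, so it does not solve the cross-platform issue by itself.

```cpp
#include <cstdint>
#include <cstdio>
#include <vector>

// Hypothetical mesh types, for illustration only.
struct Vertex { float x, y, z; };
struct Mesh {
    std::vector<Vertex> vertices;
    std::vector<std::uint32_t> indices;
};

// Layout: u32 vertexCount, vertices (x, y, z each), u32 indexCount, indices.
bool saveMesh(const Mesh& mesh, const char* path) {
    std::FILE* f = std::fopen(path, "wb");
    if (!f) return false;
    const std::uint32_t vertexCount = static_cast<std::uint32_t>(mesh.vertices.size());
    bool ok = std::fwrite(&vertexCount, sizeof vertexCount, 1, f) == 1;
    for (const Vertex& v : mesh.vertices) {
        // Each field is written on its own, so struct padding never hits the disk.
        ok = ok && std::fwrite(&v.x, sizeof v.x, 1, f) == 1
                && std::fwrite(&v.y, sizeof v.y, 1, f) == 1
                && std::fwrite(&v.z, sizeof v.z, 1, f) == 1;
    }
    const std::uint32_t indexCount = static_cast<std::uint32_t>(mesh.indices.size());
    ok = ok && std::fwrite(&indexCount, sizeof indexCount, 1, f) == 1;
    for (std::uint32_t index : mesh.indices) {
        ok = ok && std::fwrite(&index, sizeof index, 1, f) == 1;
    }
    return (std::fclose(f) == 0) && ok;
}

bool loadMesh(Mesh& mesh, const char* path) {
    std::FILE* f = std::fopen(path, "rb");
    if (!f) return false;
    mesh = Mesh{};
    std::uint32_t vertexCount = 0;
    bool ok = std::fread(&vertexCount, sizeof vertexCount, 1, f) == 1;
    // Elements are appended one by one, so a corrupt count cannot
    // trigger a huge up-front allocation.
    for (std::uint32_t i = 0; ok && i < vertexCount; ++i) {
        Vertex v{};
        ok = std::fread(&v.x, sizeof v.x, 1, f) == 1
          && std::fread(&v.y, sizeof v.y, 1, f) == 1
          && std::fread(&v.z, sizeof v.z, 1, f) == 1;
        if (ok) mesh.vertices.push_back(v);
    }
    std::uint32_t indexCount = 0;
    ok = ok && std::fread(&indexCount, sizeof indexCount, 1, f) == 1;
    for (std::uint32_t i = 0; ok && i < indexCount; ++i) {
        std::uint32_t index = 0;
        ok = std::fread(&index, sizeof index, 1, f) == 1;
        if (ok) mesh.indices.push_back(index);
    }
    std::fclose(f);
    return ok;
}
```

Because the vectors are rebuilt through push_back on load, no stale pointers from the previous run ever end up in the file.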


So about the serialization process. You mean I can just have a class that contains a huge amount of data, reinterpret_cast it to void*, and call fwrite()?

Yes, if...

... you don't have pointers or virtual functions, networking and different architectures are of no concern, and you keep a couple of other things in mind (such as different program versions).

For virtual functions, you will have to call placement new on each object's address upon loading, to set up the vtable. For pointers in general, you will have to replace them with something that can be moved around, most likely indices or "handles". Then you can, in principle, just dump the thing to disk and load it from disk again (plus, some fixup).

Serialization is much more heavyweight, but it also works in a much more general (and robust) way. Add a field two years from now? It will still work. Make something optional so you don't waste bandwidth when it's not used? Serialization will do that. Etc, etc.
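As a sketch of the "replace pointers with indices, then fix up after loading" idea, consider a skeleton where each bone points at its parent. The Bone type and both function names are hypothetical, and the code assumes every parent pointer refers to an element of the same vector.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical runtime type: the parent pointer is only valid in memory.
struct Bone {
    float length = 0.0f;
    Bone* parent = nullptr;
};

// Before saving: turn each parent pointer into an index into `bones`
// (-1 means "no parent"). Assumes parents live in the same vector.
std::vector<std::int32_t> parentIndices(const std::vector<Bone>& bones) {
    std::vector<std::int32_t> result;
    result.reserve(bones.size());
    for (const Bone& bone : bones) {
        result.push_back(bone.parent
            ? static_cast<std::int32_t>(bone.parent - bones.data())
            : -1);
    }
    return result;
}

// After loading: rebuild the pointers from the stored indices.
// Returns false if the data is inconsistent or an index is out of range.
bool fixupParents(std::vector<Bone>& bones, const std::vector<std::int32_t>& indices) {
    if (indices.size() != bones.size()) return false;
    for (std::size_t i = 0; i < bones.size(); ++i) {
        const std::int32_t index = indices[i];
        if (index == -1) {
            bones[i].parent = nullptr;
        } else if (index < 0 || static_cast<std::size_t>(index) >= bones.size()) {
            return false;
        } else {
            bones[i].parent = &bones[static_cast<std::size_t>(index)];
        }
    }
    return true;
}
```

The indices are plain integers, so they can be written to disk like any other field; only the fixup step knows about pointers.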


Then using some kind of a fileSave(Object *obj) function (and Object* fileLoad() to load) seems a lot more intuitive than serialization, no?

By the way, what do you mean by different program versions?


Different systems use different architectures. A Windows XP system differs a little from a Windows 10 one, which differs from a Linux one, which differs from a Mac OS one. Some systems use a different endianness (the order in which bytes are arranged to form a multi-byte value), and some use type sizes other than 1 byte for char, 2 bytes for short, 4 bytes for int and so on. This means your serializer needs to handle all of this, to ensure that data written as 0x205f is not read back on another system as 0x5f20, and vice versa.

Some file formats, like .bsa, use little endian as the standard byte order for data, while others, like ASCII FBX, store information as plain text. A small file format might be fine as a JSON-encoded file, whereas formats that are encrypted or compressed need to be stored as bytes.
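One common way to deal with the byte-order problem is to pick an order for the file (little endian here) and build each value out of individual bytes with shifts, rather than writing the in-memory representation directly. The function names below are invented for this sketch.

```cpp
#include <cstdint>
#include <vector>

// Append a 16-bit value as two bytes, least significant byte first.
void putU16LE(std::vector<std::uint8_t>& out, std::uint16_t value) {
    out.push_back(static_cast<std::uint8_t>(value & 0xFF));
    out.push_back(static_cast<std::uint8_t>((value >> 8) & 0xFF));
}

// Append a 32-bit value as four bytes, least significant byte first.
void putU32LE(std::vector<std::uint8_t>& out, std::uint32_t value) {
    for (int shift = 0; shift < 32; shift += 8) {
        out.push_back(static_cast<std::uint8_t>((value >> shift) & 0xFF));
    }
}

// Rebuild a 16-bit value from two little-endian bytes.
std::uint16_t getU16LE(const std::uint8_t* p) {
    return static_cast<std::uint16_t>(p[0] | (p[1] << 8));
}

// Rebuild a 32-bit value from four little-endian bytes.
std::uint32_t getU32LE(const std::uint8_t* p) {
    return static_cast<std::uint32_t>(p[0])
         | (static_cast<std::uint32_t>(p[1]) << 8)
         | (static_cast<std::uint32_t>(p[2]) << 16)
         | (static_cast<std::uint32_t>(p[3]) << 24);
}
```

Because the shifts operate on values rather than on memory, the same code produces the same file bytes on little-endian and big-endian machines: 0x205f always ends up on disk as 0x5f followed by 0x20.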

Creating a file format is quite simple: all you need to do is define the file layout and how data is stored in it. For example:

//File format header

3 bytes [PAK] //Identifies .pak files
2 bytes Version [1 byte Major, 1 byte Minor] //Version of the .pak file
1 byte Flags //Describes the file mode

//File Header

1 byte Entry Type //May be file or directory
1 byte Name Length //Length of the name string
N bytes Name String

   if file

      1 byte Flags //Again, flags that describe the file
      2 bytes Offset //The offset from the beginning of a chunk to read from
      4 bytes Chunk ID //The packed chunk ID
      4 bytes Length //Size of the file in bytes

   if directory

      4 bytes Children //Number of children included in this directory

...

This is a short sketch of the package format I use in my game engine to bundle a game's content. As you can see, it describes the layout of the file so that anyone knows how to write it and how to read / interpret it.
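To show how a layout description like the one above maps to code, here is a hedged sketch that parses only the 6-byte file format header (the 3-byte "PAK" tag, the major and minor version bytes, and the flags byte). The PakHeader struct and parsePakHeader function are made up for this example and are not the engine's actual code; the meaning of the flag bits is not given in the sketch, so the byte is stored as-is.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical in-memory form of the file format header.
struct PakHeader {
    std::uint8_t versionMajor;
    std::uint8_t versionMinor;
    std::uint8_t flags; // meaning of the bits is not specified in the sketch
};

// Parse the header from a byte buffer. Returns false if the buffer is
// too short or does not start with the "PAK" tag.
bool parsePakHeader(const std::uint8_t* data, std::size_t size, PakHeader& out) {
    const std::size_t headerSize = 6; // 3 (tag) + 2 (version) + 1 (flags)
    if (size < headerSize) return false;
    if (std::memcmp(data, "PAK", 3) != 0) return false;
    out.versionMajor = data[3];
    out.versionMinor = data[4];
    out.flags = data[5];
    return true;
}
```

Checking the tag first is what lets a loader reject a file of the wrong type before it tries to interpret anything else.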


Then using some kind of a fileSave(Object *obj) function (and Object* fileLoad() to load) seems a lot more intuitive than serialization, no?

That is serialisation. Or at least it will be, inside those functions, which could well just be using fwrite or whatever. Serialisation is the act of writing something out, byte by byte, hence the name (each byte is written in serial).


by the way, what do you mean by different program versions?
That is where the fun starts. Suppose I write a program that uses your data format for reading and writing files.

All well and good, hooray!

Now I change the program, and the data from your file needs to be stored in a different way in my program.

Or I change the data that is being saved and loaded. How does one load an old data file then (assuming I can compute values for the missing data)?

Of course, both of these things happen in real life as your game evolves.
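One simple way to cope with this, sketched below under made-up assumptions, is to store a version number at the start of the data and let the loader fill in defaults for fields that older files don't have. The Entity type, its scale field (pretend it was added in version 2) and the function names are all hypothetical. Floats are copied in host byte order to keep the example short.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical record: `scale` did not exist in version 1 files.
struct Entity {
    float x = 0.0f;
    float y = 0.0f;
    float scale = 1.0f;
};

// Append the raw bytes of a float (host byte order, for brevity).
void appendFloat(std::vector<std::uint8_t>& out, float value) {
    std::uint8_t raw[sizeof(float)];
    std::memcpy(raw, &value, sizeof raw);
    out.insert(out.end(), raw, raw + sizeof raw);
}

// Layout: 1 byte version, x, y, and (version 2 only) scale.
std::vector<std::uint8_t> saveEntity(const Entity& e, std::uint8_t version) {
    std::vector<std::uint8_t> out;
    out.push_back(version);
    appendFloat(out, e.x);
    appendFloat(out, e.y);
    if (version >= 2) appendFloat(out, e.scale);
    return out;
}

// Loads version 1 or 2 data; rejects unknown versions and short buffers.
bool loadEntity(const std::vector<std::uint8_t>& in, Entity& e) {
    if (in.empty()) return false;
    const std::uint8_t version = in[0];
    if (version < 1 || version > 2) return false;
    const std::size_t fieldCount = (version >= 2) ? 3 : 2;
    if (in.size() < 1 + fieldCount * sizeof(float)) return false;
    std::memcpy(&e.x, &in[1], sizeof(float));
    std::memcpy(&e.y, &in[1 + sizeof(float)], sizeof(float));
    if (version >= 2) {
        std::memcpy(&e.scale, &in[1 + 2 * sizeof(float)], sizeof(float));
    } else {
        e.scale = 1.0f; // old file: compute a value for the missing field
    }
    return true;
}
```

With this arrangement an old version 1 file still loads after the scale field is introduced, and a file with an unknown version number is rejected instead of being misread.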
