What exactly is a file header?

Started by
11 comments, last by Komal Shashank 9 years ago

Hi... I'm new to GameDev.net. This is my first post. Can someone please tell me: What is a file header? Why is it used? And what would I need to do to write a file header to a file or read a file header from a file in binary using fstream?

Searching for this on Google turns up results for header files (.h). Nothing for file header. Please help. Thank you.

The thing with binary files is that they do not in themselves have any structure (unlike, say, an XML file).

It is just one big blob of data.

To help parse the data, the designer of the file format has defined a "file header" which contains information on how the data blob should be interpreted.

For an image, it could for example be the pixel format, width, height, and what compression is used.

Technically, it is just the first part of the file contents, anywhere from a few bytes to a few kB (how much depends on the format).

So to read it, you just read the file as normal.

The first reads from the file will be from the "header".

More complex file formats typically have more than one header. Each header sits in front of its own blob/block of data and defines how that block should be read back.

This way, some blocks can be optional, and older versions of the software might be able to read files from newer versions, just ignoring the blocks they do not understand.

For example, the PCX header is described here: http://www.fastgraph.com/help/pcx_header_format.html

The first 128 bytes of the file are the header. You can use the header to get information about what's in the rest of the file.

It's like a map.

<edit>

Here are the docs for the Quake BSP. As you can see, it also carries a header: http://www.gamers.org/dEngine/quake/spec/quake-spec34/qkspec_4.htm

Too many projects; too much time

OK... So if I wanted to write in C++, say a vector of integers (std::vector<int>), in binary format, I just ofstream their binary representation as bytes into the file? Also, how would I separate the file's data from its header while writing, so that when reading each part is recognized and treated as such (header as header and data as data)?

A file header is part of the file, it's not really separate from it. It's just a number of bytes that's defined (in the file's format description) as having some specific meaning.

So you don't do anything special to read or write it, it's just some more data.

For your own file format, you could define a struct that holds a value to identify the file as something that you know (some pre-defined 32-bit integer value, for example) and then some useful information (count of the items you've written, for example).

 

This looks like a very practical guide to reading and writing binary files in C++, including making your own file header: http://www.cplusplus.com/articles/DzywvCM9/


Nice! That's a very detailed article. Thanks for the link. And thank you guys for helping out. It really answered a lot of questions I had. I really appreciate it. I upvoted all your posts. Cheers!

Back when the Internet was much younger, wotsit.org was a good resource on a bunch of file formats, which included details about file headers and inner details.

A header was a common prefix in your file that you could use to identify your file format. As a simple example, the .gif format starts with a header like "GIF87a" or "GIF89a", the drawing canvas size, and more information about the image.

An example from an audio format file:

OFFSET              Count TYPE   Description
0000h                  20 char   ASCIIZ song name
0014h                   8 char   Tracker name
001Ch                   1 byte   ID=1Ah
001Dh                   1 byte   File type
                                 1 - song (contains no samples)
                                 2 - module (contains samples)
001Eh                   1 byte   Major version number
001Fh                   1 byte   Minor version number
0020h                   1 byte   Playback tempo
0021h                   1 byte   Number of patterns
...
If you are going to store data in a file, it is important for you to understand what data you are storing in your file so you can read it back out again.

Although it is possible to just write whatever happens to come to your mind at the time, usually a bit of planning will pay off. A short 'magic number' at the beginning to quickly identify your file (and reject unidentified files). A text blurb so someone peeking in the file with a regular file editor can see some descriptive information. An EOF marker so those dumping to a terminal don't get too confused.

Then a version number in case you need to change what you contain. Then some information about what is in your file and how big it is. Then the individual payloads, again with version numbers so you can handle changes over time. Etc.

Thank you, frob! That is a very detailed and informative explanation. I do have one question though... All the different types of data that you mention that can be stored in the header, can all this be declared inside a struct and written normally at the start of the file just like the payload data? Please clarify. Thanks.


That's pretty much how that would work, yeah. Fill out your header struct in your code and just write it to the beginning of your file. At read time you just read in this struct again from the beginning of the file and you're good to go.

I gets all your texture budgets!

