binary save, adding vars, and changing savegame file formats

28 comments, last by Norman Barrows 11 years ago

There are three problems with that code:

1) multiple calls to fwrite instead of a single call as frob mentions.

2) fopen/fclose are horribly slow on most platforms due to permission checks and all that, you really want all the data (and your files) in a single big archive file which you just keep open.

3) just "omg" I want to run and hide after seeing that. :)

Ignoring 1 and 2 for the time being, even with C you can simplify all of that by writing, at worst, a macro. Macros may be evil, but they will save you huge amounts of typing and reduce the possibility of error:

#define WRITE_ITEM( type, item ) _fwrite_nolock((unsigned char*)item, 1, sizeof(type....... blah blah.......

That's just scary code to see all those casts and sizeofs written out over and over.
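To make the macro idea concrete, here is a minimal sketch. It is not the macro from the post (which is elided above); it assumes plain `fwrite`/`fread` instead of `_fwrite_nolock`, and the `struct player` and its fields are hypothetical. Taking `sizeof` from the variable itself means no type or cast has to be repeated at each call site.

```c
#include <stdio.h>

/* Hypothetical helper macros: write/read one variable as raw bytes.
   Each evaluates to 1 on success and 0 on failure. */
#define WRITE_ITEM(fp, item) (fwrite(&(item), sizeof(item), 1, (fp)) == 1)
#define READ_ITEM(fp, item)  (fread(&(item), sizeof(item), 1, (fp)) == 1)

/* Hypothetical save data, only for illustration. */
struct player {
    int   hp;
    float x, y;
    char  name[100];
};

/* Saves the fields one by one, in declaration order. */
static int save_player(FILE *fp, const struct player *p)
{
    return WRITE_ITEM(fp, p->hp) && WRITE_ITEM(fp, p->x) &&
           WRITE_ITEM(fp, p->y)  && WRITE_ITEM(fp, p->name);
}

/* Loads the fields in exactly the same order they were saved. */
static int load_player(FILE *fp, struct player *p)
{
    return READ_ITEM(fp, p->hp) && READ_ITEM(fp, p->x) &&
           READ_ITEM(fp, p->y)  && READ_ITEM(fp, p->name);
}
```

Each field becomes one short line in save and one in load, which is where most of the typing (and most of the copy-paste errors) would otherwise go.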

Worrying about #1 and #2 is not worth discussing at this time.

Maybe I'm biased from my current job, but I totally second the initial comment from Nypyren. The Google Protobuf approach is amazingly powerful, as you can continue to tack on as many additional "optional" fields as you want after the fact. Parsers built both before and after a change will ignore unknown field ids (added or removed), so as long as you
1) never change a given field (for field id 1, the type stays the same),
2) never re-use a given field number (really the same as 1 restated), and
3) use optional as often as possible (because you can't make a field non-required later if you have data stored with that field),
then you avoid nearly all the versioning problems.

There is a _slight_ overhead in that you'll start to have to use "if (foo.hasField())" all over the place when reading in the data, but really, if you think about it, 9 times out of 10 you're adding a field to cover some subset of the data anyway. For example, shader "shiny" requires some new vertex information that only needs to exist for objects using "shiny". Do you really want to add a field to a fixed structure and require that ALL the data in your game be re-built just to accommodate a single new object? Saving iteration time is huge. Don't ignore the fact that you'll save a massive amount of time by being able to read new data (and not choke to death) with an old binary, so this isn't just about "upgrading an old save" as you put it. The new saves should still load in old binaries.
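The "ignore unknown field ids" behaviour can be illustrated without Protobuf itself. The sketch below is a hand-rolled id,size,value reader in C. It is NOT Protobuf's wire format or API, and the field ids, `struct entity`, and function names are all hypothetical. It only shows why an older reader keeps working when a newer writer adds fields.

```c
#include <stdio.h>
#include <stdint.h>

/* Hypothetical field ids. Once assigned, a number is never reused. */
enum { FIELD_HP = 1, FIELD_X = 2 };

/* Hypothetical record, only for illustration. */
struct entity {
    int32_t hp;
    float   x;
};

/* Writes one field as: 16-bit id, 32-bit size, then the raw value bytes. */
static int write_field(FILE *fp, uint16_t id, uint32_t size, const void *value)
{
    return fwrite(&id, sizeof id, 1, fp) == 1 &&
           fwrite(&size, sizeof size, 1, fp) == 1 &&
           fwrite(value, 1, size, fp) == size;
}

/* Reads fields until end of file. Known ids are stored, unknown ids
   (or known ids with an unexpected size) are skipped using their size. */
static void read_entity(FILE *fp, struct entity *e)
{
    uint16_t id;
    uint32_t size;

    e->hp = 100;   /* defaults, used when an optional field is absent */
    e->x  = 0.0f;

    while (fread(&id, sizeof id, 1, fp) == 1 &&
           fread(&size, sizeof size, 1, fp) == 1) {
        if (id == FIELD_HP && size == sizeof e->hp) {
            if (fread(&e->hp, size, 1, fp) != 1) break;
        } else if (id == FIELD_X && size == sizeof e->x) {
            if (fread(&e->x, size, 1, fp) != 1) break;
        } else if (fseek(fp, (long)size, SEEK_CUR) != 0) {
            break;
        }
    }
}
```

The cost is the extra id and size written per field, which is the overhead weighed later in the thread against a fixed field order.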

If you're really afraid of the data overhead this adds, just shove your data stream through zlib before writing it out to disk.
Generally what I do for buffering is use something like C#'s BinaryWriter and a MemoryStream.

MemoryStream is a wrapper around a byte array which allows the array to dynamically grow, and also keeps a Position pointer, which tracks where to read/write next.

BinaryWriter is a fairly simple class that just converts values to bytes, copies them to the underlying stream, and increments the stream's position pointer as it goes.

After you're done writing to the MemoryStream, you just grab its internal byte array and dump the entire thing into your file in a single call.


It's trivial to do this in C++: you can either use existing stream classes, which support most of this, or, if you want a C interface, write it from scratch with minimal effort (it's a hundred lines of code or less, basically a few dozen functions with 1-3 lines apiece).
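As a rough illustration of the C version, the sketch below implements a tiny growable byte buffer with an append call and a single-write flush. The `membuf` names are hypothetical and this is not the C# `MemoryStream` API, just the same idea under the assumption that plain `realloc` and `fwrite` are acceptable.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical growable byte buffer; initialise with {0}. */
struct membuf {
    unsigned char *data;
    size_t size;       /* bytes written so far (the "position") */
    size_t capacity;   /* bytes allocated */
};

/* Appends n raw bytes, doubling the allocation when it runs out.
   Returns 1 on success, 0 if memory could not be allocated. */
static int membuf_append(struct membuf *b, const void *src, size_t n)
{
    if (n == 0)
        return 1;
    if (b->size + n > b->capacity) {
        size_t cap = b->capacity ? b->capacity : 4096;
        unsigned char *p;
        while (cap < b->size + n)
            cap *= 2;
        p = realloc(b->data, cap);
        if (!p)
            return 0;
        b->data = p;
        b->capacity = cap;
    }
    memcpy(b->data + b->size, src, n);
    b->size += n;
    return 1;
}

/* Writes the whole buffer with one fwrite call, then releases it. */
static int membuf_flush(struct membuf *b, FILE *fp)
{
    int ok = fwrite(b->data, 1, b->size, fp) == b->size;
    free(b->data);
    b->data = NULL;
    b->size = b->capacity = 0;
    return ok;
}
```

Usage would be a series of `membuf_append(&buf, &field, sizeof field)` calls followed by one `membuf_flush(&buf, fp)`.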

You've got thirty-something calls to fwrite. That is overhead that can be trivially avoided.

You know how big things are going to be. Create a really big buffer of that known size, dump everything into that big buffer, and make one call.

good point.

Norm Barrows

Rockland Software Productions

"Building PC games since 1989"

rocklandsoftware.net

PLAY CAVEMAN NOW!

http://rocklandsoftware.net/beta.php

2) fopen/fclose are horribly slow on most platforms due to permission checks and all that, you really want all the data (and your files) in a single big archive file which you just keep open.

won't that be less bullet-proof in case of machine lockup or power outage? I'm trying to design a bullet proof save so even power outage during an overwrite won't wipe the user's game. I would think that any open file when power goes out you could kiss goodbye. that was the nice thing about the copy followed by overwrite. if you lost power during the copy, the original was still there. if you lost power during overwrite, you had a backup copy. all you lost was the progress the player was trying to save when the power died.

I've now gone to a round robin naming scheme (save_a and save_b), overwriting the older on save, and loading the newer on load. save is now down to two seconds. probably acceptable. but i may try for 1 second.
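A round-robin scheme like this might look roughly as follows. The post does not say how "older" and "newer" are decided, so this sketch assumes a sequence number stored at the start of each slot plus an end marker written last, so that a slot cut off mid-write is treated as missing. All names and the marker value are hypothetical.

```c
#include <stdio.h>
#include <stdint.h>

#define SAVE_FOOTER 0x454E4421u   /* hypothetical end-of-file marker */

/* Returns the sequence number of a complete slot,
   or 0 if the slot is missing or truncated. */
static uint32_t slot_sequence(const char *path)
{
    FILE *fp = fopen(path, "rb");
    uint32_t seq = 0, footer = 0;
    if (!fp)
        return 0;
    if (fread(&seq, sizeof seq, 1, fp) != 1 ||
        fseek(fp, -(long)sizeof footer, SEEK_END) != 0 ||
        fread(&footer, sizeof footer, 1, fp) != 1 ||
        footer != SAVE_FOOTER)
        seq = 0;
    fclose(fp);
    return seq;
}

/* Overwrites the older (or invalid) slot; the newer slot is left untouched. */
static int save_round_robin(const char *slot_a, const char *slot_b,
                            const void *data, size_t size)
{
    uint32_t a = slot_sequence(slot_a), b = slot_sequence(slot_b);
    const char *target = (a <= b) ? slot_a : slot_b;
    uint32_t seq = (a > b ? a : b) + 1, footer = SAVE_FOOTER;
    FILE *fp = fopen(target, "wb");
    int ok;
    if (!fp)
        return 0;
    ok = fwrite(&seq, sizeof seq, 1, fp) == 1 &&
         fwrite(data, 1, size, fp) == size &&
         fwrite(&footer, sizeof footer, 1, fp) == 1;
    if (fclose(fp) != 0)
        ok = 0;
    return ok;
}

/* Returns the slot to load from, or NULL if neither slot is usable. */
static const char *newest_slot(const char *slot_a, const char *slot_b)
{
    uint32_t a = slot_sequence(slot_a), b = slot_sequence(slot_b);
    if (a == 0 && b == 0)
        return NULL;
    return (a >= b) ? slot_a : slot_b;
}
```

If power is lost while one slot is being overwritten, the other slot still holds the previous complete save. Note that `fclose` alone does not guarantee the data has reached the disk; forcing that is platform-specific and left out here.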

Norm Barrows


That's just scary code to see all those casts and sizeofs written out over and over.

tell me about it!

i usually use inline wrapper functions like writefilebin()

when switching from text to binary, i was using the newer fwrite() with the extra checks and such. ran almost twice as slow. so then i dug into the docs and found that good old slam bytes from mem to disk was now called _fwrite_nolock().

when coding things like that, i'll make a "template" fwrite....(,,,,); with blanks to be filled in.

cut and paste, then just fill in the blanks. glorified word processing, that's all most coding is.

but i don't usually write code like that. usually i'd have a nice inline wrapper function that is designed to minimize typing.

this was the first test code. and it worked so well (compared to text and locking version) there was no need to touch it.

but i'll definitely be writing a wrapper for it, as it looks like that will be required for save games in all my titles. so it gets added to Rockland's in-house game library.
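A wrapper along those lines could be as small as the sketch below. The name `write_bin` is hypothetical (the post's own `writefilebin()` is not shown), and the fallback to standard `fwrite` on non-Microsoft compilers is an assumption added here so the same code builds elsewhere.

```c
#include <stdio.h>

/* Hypothetical wrapper: writes `size` raw bytes and reports success as 1/0.
   Uses the non-locking CRT call where it exists, plain fwrite otherwise. */
static inline int write_bin(FILE *fp, const void *data, size_t size)
{
#ifdef _MSC_VER
    return _fwrite_nolock(data, 1, size, fp) == size;
#else
    return fwrite(data, 1, size, fp) == size;
#endif
}
```

A call then reads `write_bin(fp, &player.hp, sizeof player.hp)`, with the casts and the element-count argument hidden in one place. The non-locking variant is only safe when no other thread touches the same `FILE` at the same time.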

Norm Barrows


Shader "shiny" requires some new vertex information that only needs to exist for objects using "shiny" do you really want to add a field to a fixed structure and require that ALL the data in your game be re-built just to accommodate a single new object?

I would think that in OO one would handle that with inheritance. Dealing with the save file format is another issue though.

I would run into the problem you described when first working on models and animations. I would add a new field to the declaration of a model, then have to add that to all models already made. Fortunately, I was re-creating a modeling and animation system i'd built in the past so it only took a few models before i got all the parts in place - i think i forgot scaling the first time. so i had a nice 20 limb hierarchical animation system, but couldn't scale individual limbs.

Norm Barrows


well, i decided to try a version that saves individual fields of each struct, written as binary. this way, when i add a new variable to a struct, i add it to save, then load and save the playtest games to convert them, then add it to load, and i'm done. i used inline wrappers for _fwrite_nolock. save times increased from 2 seconds for writing an array of structs to 3 seconds for writing individual fields of structs. no keys, IDs, sizes, or version numbers, just the struct fields as they appear in the declarations.

then i decided to try buffering it. i wrote a version that malloc'd a 100 meg buffer, memcpy'd the struct fields to the buffer, then wrote the entire thing with a single _fwrite_nolock, and then freed the buffer of course. buffering got me exactly squat. still 3 seconds to save.

lessons learned:

1. keeping the file open means you lose it if the power goes out. not a robust design.

2. buffering 60 meg of data vs writing it out one int and float and char[100] at a time did not provide any significant speedup. lots of calls to _fwrite_nolock is not a bottleneck.

3. writing more data slows things down (obviously). so the flexibility of a size,id,value format must be weighed against the overhead of the additional data written. a size,id,value format will always be slower than a format that simply saves fields in a pre-defined order.

4. it's more work to save individual fields than it is to save an entire struct, so it's more work at first to create the code to save a struct one field at a time. however, it's much less work to add a new variable to a struct and convert existing save files when saving things one field at a time.

In my case, i only have to deal with in-house old file formats. for dealing with old file formats on the user's pc, you'd init all your data structures to default values, then load. if you get EOF, stop loading. new variables always appear at the end of the file, so when you run out of file, that's all there is to load from this older format, and the newer vars use the default values. when it saves, it uses the new format, saving the new vars along with the loaded old data. this lets you import older file formats by simply adding EOF checks to reads for new variables added to the end of the format.
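The "init, load until EOF, save all" idea can be sketched as below. The `struct game_state`, its fields, and the default values are hypothetical; `gold` stands in for a variable added in a newer version. The sketch assumes, as the post does, that new variables are only ever appended to the end of the file.

```c
#include <stdio.h>

/* Hypothetical game state; `gold` was added after the first release. */
struct game_state {
    int   hp;
    float x;
    int   gold;
};

static void init_defaults(struct game_state *s)
{
    s->hp = 100;
    s->x = 0.0f;
    s->gold = 50;
}

/* Starts from defaults, then reads fields in file order and stops at the
   first read that fails (end of an older, shorter file).
   Returns the number of fields actually loaded. */
static int load_state(FILE *fp, struct game_state *s)
{
    int loaded = 0;
    init_defaults(s);
    if (fread(&s->hp, sizeof s->hp, 1, fp) != 1) return loaded;
    loaded++;
    if (fread(&s->x, sizeof s->x, 1, fp) != 1) return loaded;
    loaded++;
    if (fread(&s->gold, sizeof s->gold, 1, fp) != 1) return loaded;
    loaded++;
    return loaded;
}

/* Always writes every field, so the file is upgraded to the newest layout. */
static int save_state(FILE *fp, const struct game_state *s)
{
    return fwrite(&s->hp, sizeof s->hp, 1, fp) == 1 &&
           fwrite(&s->x, sizeof s->x, 1, fp) == 1 &&
           fwrite(&s->gold, sizeof s->gold, 1, fp) == 1;
}
```

Loading a file that holds only `hp` and `x` leaves `gold` at its default, and the next save writes all three. This only works when fields are appended at the very end of the file: a field added to a struct that is saved as one element of an array lands in the middle of the file, and a short read can no longer tell the versions apart.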

has anyone heard of a "init, load vars til eof, save all" algo for automatically importing old save games? I don't recall that one in school (software engineering OSU). Algos from the regular world of computing (size,ID,value and "keep the file open") were mentioned. but as usual, what they teach in the regular computing world has issues (extra overhead or data loss on power outage) when it comes to using it for games.

Norm Barrows


3. writing more data slows things down (obviously). so the flexibility of a size,id,value format must be weighed against the overhead of the additional data written. a size,id,value format will always be slower than a format that simply saves fields in a pre-defined order.

it occurred to me that saving fields in a predefined order is essentially the same thing as a size,id,value format, where the size, id, and order of appearance are implicitly defined by the code that loads and saves the file.

Norm Barrows


The other day i tried adding a new variable to the file format for the first time and it works great. just add the var to the struct, then add a line to load and save. now i have text file flexibility and (probably) the fastest possible binary speed.

Norm Barrows

