Save File Compression

9 comments, last by Sc4Freak 14 years, 10 months ago
I have a question regarding save files. I am making a 2D tile-based game. Currently the game world is 1500x1200 tiles, and I need to save the characteristics of these tiles. Here are my specifics:

Language: C++
Platforms: Windows XP/Vista, Linux
API/Libraries: SDL, OpenGL

I am just playing around and experimenting at this point; however, if I save a char at each (x, y) location of the map, the save file is ~1.8 MB (1500 x 1200 tiles at one byte each). Obviously if I save more per tile, the save files are correspondingly larger. What size of save file is considered offensive to an end user? I have not even started to work on saving game entities, player states, game states, etc. However, that data will probably occupy less space (just a guess, so far).

I am now looking into compression. Is zlib the go-to choice? I do not know if it will work easily cross-platform or if I will have to start dividing my code into Windows/Linux versions. I will experiment.

So basically I am looking for advice on how large a save file can get before it becomes offensive, and on whether I should implement file compression. Thanks in advance for any advice and helpful replies.
I wouldn't worry about it. AAA games have been using ungodly amounts of disk space for years. Yeah, zlib is a pretty quick and easy solution. In my experience it is cross-platform (Win/Mac/Linux).

Quote:Original post by signal_
I am just playing around and experimenting at this point; however, if I save a char at each (x, y) location of the map, the save file is ~1.8 MB. Obviously if I save more, the save file(s) are correspondingly larger.
10-20 MB is not unreasonable for the save files of a desktop game.
Quote:I am now looking into compression. Is Zlib the go to choice? I do not know if it will work easily cross platform or if I will have to start to divide my code into Windows/Linux versions. I will experiment.
zlib is a reasonable choice - it is fully cross-platform, and offers decent compression. Note however that there may be more efficient ways to compress your data, specific to the type of data you are storing.

Tristam MacDonald. Ex-BigTech Software Engineer. Future farmer. [https://trist.am]

I've seen some really offensive save sizes from some games. Sometimes as much as 20+ MB per save.

1) Do you really need to save the whole map? Does the user create the map from scratch? Does every or almost every tile contain information that changes frequently? The less space-intensive way is to only save the things that changed. Then you can load the normal level map, load the save, and apply the changes. This can potentially have the benefit that re-load times are much faster (you only need to load and apply the deltas again, not the whole map).

2) I think full-out compression is a bad idea. To the user, fast loading times (which are very noticeable) will ultimately be more important than file size (which only power users will ever look at), and I suspect that compression will not make your load times faster (although IO is generally slow, so maybe you should run some tests to compare). Run-length encoding, on the other hand, is a very simple technique that probably won't change save/load times by much (unless you have a lot of custom data per tile, in which case it won't really buy you anything).
Thanks for the replies, everyone. I guess I will heed sybixsus' advice and not worry about it too much. I do not anticipate going over 10 MB, let alone 20 MB. Good points about the loading times too.
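For reference, the run-length encoding mentioned in point 2 could look roughly like the sketch below. It assumes one byte per tile stored in row-major order; the names `Run`, `rleEncode` and `rleDecode` are made up for illustration and do not come from any library.

```cpp
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// One run: (repeat count in 1..255, tile value). Hypothetical layout for this sketch.
using Run = std::pair<std::uint8_t, std::uint8_t>;

// Collapse consecutive equal tiles into (count, value) runs.
std::vector<Run> rleEncode(const std::vector<std::uint8_t>& tiles) {
    std::vector<Run> runs;
    for (std::uint8_t t : tiles) {
        if (!runs.empty() && runs.back().second == t && runs.back().first < 255) {
            ++runs.back().first;      // extend the current run
        } else {
            runs.push_back({1, t});   // start a new run (new value, or count is full)
        }
    }
    return runs;
}

// Expand the runs back into the original tile array.
std::vector<std::uint8_t> rleDecode(const std::vector<Run>& runs) {
    std::vector<std::uint8_t> tiles;
    for (const Run& r : runs) {
        tiles.insert(tiles.end(), static_cast<std::size_t>(r.first), r.second);
    }
    return tiles;
}
```

Each run costs two bytes, so this only pays off when the map has long stretches of identical tiles; a map where neighbouring tiles rarely match would come out larger than the raw data.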

Quote:Original post by lightbringer
I've seen some really offensive save sizes from some games. Sometimes as much as 20+ MB per save.

1) Do you really need to save the whole map? Does the user create the map from scratch? Does every or almost every tile contain information that changes frequently? The less space-intensive way is to only save the things that changed. Then you can load the normal level map, load the save, and apply the changes. This can potentially have the benefit that re-load times are much faster (you only need to load and apply the deltas again, not the whole map).


Lightbringer, you reminded me of something that I forgot to post in the OP. I was going to say that I could start from an assumed basic tile. That is to say, what are the most common tile characteristics? Assume each tile is of this basic type and only record the tiles that deviate from it. This addresses your question regarding the necessity of saving the whole map. I anticipate saving some space doing this.
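A minimal sketch of that "record only the deviations" idea, assuming one byte per tile in row-major order. `TileChange`, `collectDeviations` and `rebuildMap` are hypothetical names chosen for this example.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// A single tile that differs from the assumed default (hypothetical record layout).
struct TileChange {
    std::uint32_t index;  // row-major position: y * width + x
    std::uint8_t value;   // the tile value actually stored there
};

// Scan the map and keep only the tiles that are not the default tile.
std::vector<TileChange> collectDeviations(const std::vector<std::uint8_t>& tiles,
                                          std::uint8_t defaultTile) {
    std::vector<TileChange> changes;
    for (std::size_t i = 0; i < tiles.size(); ++i) {
        if (tiles[i] != defaultTile) {
            changes.push_back({static_cast<std::uint32_t>(i), tiles[i]});
        }
    }
    return changes;
}

// Rebuild the full map: fill with the default tile, then apply the saved deviations.
std::vector<std::uint8_t> rebuildMap(std::size_t tileCount, std::uint8_t defaultTile,
                                     const std::vector<TileChange>& changes) {
    std::vector<std::uint8_t> tiles(tileCount, defaultTile);
    for (const TileChange& c : changes) {
        if (c.index < tiles.size()) {
            tiles[c.index] = c.value;  // ignore out-of-range entries from a bad save
        }
    }
    return tiles;
}
```

With a 4-byte index plus a 1-byte value per record (before any struct padding), this only beats the flat one-byte-per-tile file while fewer than roughly a fifth of the tiles deviate from the default.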

I do procedurally generate the map and other things from scratch at the request of the user so there is no 'normal' map. Edit: Regarding RLE, I will look into this as well. I read the gd.net article on it too and it seems like I could implement something effective.

Thanks.
How long does it take to generate the map? Theoretically, given a fast and predictable PRNG, you could just store the seed numbers. But that's just theory, at the moment it doesn't sound like you really need to put too much thought into this :D
Quote:Original post by swiftcoder
10-20 MB is not unreasonable for the save files of a desktop game.


It is not reasonable.

This implies that game state actually contains 160 megabits of information. That is huge.

While such sizes may be possible, I don't consider them reasonable, in the same way that I don't consider 370 MB of HP printer drivers to be reasonable. Or an SUV's fuel efficiency.

Quote:I do procedurally generate the map and other things from scratch at the request of the user so there is no 'normal' map.


Does the map change? If not, just save the random seed you use to generate the map (the value passed to srand). This is what Dune 2 did. It took 2 or 4 bytes for a 64x64 map.

Then it merely wrote the changes on top of that map. Fog of war and spice were stored as bitmaps and units were stored individually, all of which would make it very easy for zlib to achieve excellent compression (Dune 2 didn't use compression IIRC; it merely minimized the amount of data).

One way to improve the compression characteristics is to reorganize the data. If map entries contain 8 bits (1 bit collision, 4 bits tile type, 1 bit something, 2 bits another), instead of writing them out as n*8 bits, split them by field. So make one layer that contains only the collision bits, one that contains the 4 type bits, and so on.

While the data size and values remain exactly the same, the new layout emphasizes the relations between like values. For example, the collision layer will consist of long runs of 1s and 0s. Perhaps the whole layer is just zeros.
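Here is a rough sketch of that split. The bit layout is an assumption made for the example (bit 0 = collision, bits 1-4 = tile type, bits 5-7 = the remaining fields lumped together), and `TileLayers`, `splitLayers` and `mergeLayers` are made-up names.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Assumed per-tile byte layout (hypothetical): bit 0 collision, bits 1-4 type, bits 5-7 other.
struct TileLayers {
    std::vector<std::uint8_t> collision;  // 1 bit per tile, packed 8 tiles per byte
    std::vector<std::uint8_t> type;       // 4-bit tile type, one per byte (left unpacked for clarity)
    std::vector<std::uint8_t> extra;      // remaining 3 bits, one per byte
};

// Split interleaved tile bytes into one layer per field.
TileLayers splitLayers(const std::vector<std::uint8_t>& tiles) {
    TileLayers out;
    out.collision.assign((tiles.size() + 7) / 8, 0);
    out.type.reserve(tiles.size());
    out.extra.reserve(tiles.size());
    for (std::size_t i = 0; i < tiles.size(); ++i) {
        const std::uint8_t t = tiles[i];
        if (t & 0x01) {
            out.collision[i / 8] |= static_cast<std::uint8_t>(1u << (i % 8));
        }
        out.type.push_back(static_cast<std::uint8_t>((t >> 1) & 0x0F));
        out.extra.push_back(static_cast<std::uint8_t>((t >> 5) & 0x07));
    }
    return out;
}

// Recombine the layers into the original interleaved tile bytes.
std::vector<std::uint8_t> mergeLayers(const TileLayers& in) {
    std::vector<std::uint8_t> tiles(in.type.size());
    for (std::size_t i = 0; i < tiles.size(); ++i) {
        const unsigned c = (in.collision[i / 8] >> (i % 8)) & 1u;
        tiles[i] = static_cast<std::uint8_t>(c | (in.type[i] << 1) | (in.extra[i] << 5));
    }
    return tiles;
}
```

Each layer would then be written out (and optionally compressed) on its own. The type and extra layers are kept at one value per byte here to keep the sketch short; packing them tighter is a separate step.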

Compressing this type of data with dictionary or entropy coding (zlib's DEFLATE combines both: LZ77 matching followed by Huffman coding) may result in superior compression. This needs to be tested, since it depends on the relation between individual fields. If tile type 12 always has the collision bit set to 1, there will likely be no advantage.

Quote:I read the gd.net article on it too and it seems like I could implement something effective.
zlib will offer best bang-for-buck. It is not worth reinventing anything.


There are actually two facets to this. On one hand, PC hardware is abundant enough. But at the same time, the PC is a fixed market, and everything else is exploding. If the advice is to bear any resemblance to the name of this site, then the answer remains no: such sizes are not reasonable. It is worth exploring the *many* simple and valid techniques that can reduce data sizes by several orders of magnitude.

There is no longer any need to count bits in order to squeeze into 640 KB of memory. There is also no need to spend much time on this. But noticing the problem is valid, and it is likely fairly easy to solve.
Quote:Original post by signal_
I do procedurally generate the map and other things from scratch at the request of the user so there is no 'normal' map.
If the map is procedurally generated, then your original map can be regenerated from the seed value.

Then you only need save the modifications the player has made, and you load by regenerating and then applying the modifications.
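A sketch of that regenerate-then-patch approach. Everything here is hypothetical: `SaveGame`, `generateMap` and `loadMap` are illustrative names, and the generator is a stand-in for the game's real procedural code. It uses `std::mt19937` and takes its raw output directly, because that engine's output sequence for a given seed is fixed by the C++ standard, whereas `rand()` and the standard distribution classes may give different results on different platforms.

```cpp
#include <cstddef>
#include <cstdint>
#include <map>
#include <random>
#include <vector>

// What actually gets written to disk (hypothetical layout).
struct SaveGame {
    std::uint32_t seed;                             // seed the map was generated from
    std::map<std::uint32_t, std::uint8_t> changes;  // tile index -> value set by the player
};

// Stand-in for the real generator: any function that is deterministic for a given seed.
std::vector<std::uint8_t> generateMap(std::uint32_t seed, std::size_t tileCount) {
    std::mt19937 rng(seed);
    std::vector<std::uint8_t> tiles(tileCount);
    for (std::uint8_t& t : tiles) {
        t = static_cast<std::uint8_t>(rng() % 16);  // 16 tile types, assumed for the sketch
    }
    return tiles;
}

// Loading: regenerate the original map from the seed, then apply the player's modifications.
std::vector<std::uint8_t> loadMap(const SaveGame& save, std::size_t tileCount) {
    std::vector<std::uint8_t> tiles = generateMap(save.seed, tileCount);
    for (const auto& change : save.changes) {
        if (change.first < tiles.size()) {
            tiles[change.first] = change.second;
        }
    }
    return tiles;
}
```

The catch is that the generator has to stay bit-for-bit identical between game versions, otherwise old saves regenerate a different map underneath the stored modifications.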


Thanks for the advice guys.

I was able to come up with some simple solutions by implementing the suggestions in your responses: saving just the seeds, the modifications, etc. Works like a charm.

Hopefully now my save files will not be offensive.
Quote:Original post by lightbringer
2) I think full-out compression is a bad idea. To the user, fast loading times (which are very noticeable) will ultimately be more important than file size (which only power users will ever look at), and I suspect that compression will not make your load times faster (although IO is generally slow, so maybe you should run some tests to compare). Run-length encoding, on the other hand, is a very simple technique that probably won't change save/load times by much (unless you have a lot of custom data per tile, in which case it won't really buy you anything).
Depending on the disk and CPU, compression/decompression might actually be faster than reading and writing the entire thing. IO is, as you pointed out, very slow. Under the right circumstances, the benefits of not having to write as much data to disk will outweigh the extra processing done by compressing it. There is, for example, an experimental patchset for the Linux kernel that augments memory swapping to disk by simply compressing memory when it's not needed. The performance benefits are huge.
