Compressing Files

6 comments, last by m_e_my_self 20 years, 5 months ago
I've been thinking this one over, and I haven't been able to decide. Modern games take up a lot of hard drive space. Some of them use compression on their files, decompressing them on the fly. This makes them take up less hard drive space, but it also adds to the processing power needed during the game. If you don't use compression, the game takes up more hard drive space, but may possibly take longer to read off the CD or hard drive. So my question is: is it worth it nowadays, for a 3D shooter, to pack most of the files into a big archive and decompress files out of it on the fly? Or is it wasted effort?
Just extract the needed files for the level you are loading. This shouldn't be more than ~5 MB, so with a fast algorithm you could extract it in under a minute.
Think again. Why do we have "Loading..." in most modern games?
Shouldn't take more than 5 MB for what? The loading-screen bitmap?
1) Big archives don't need to be compressed. The motives for using a single monolithic data file are slightly different:


a. The act of opening files, particularly on a "secure" OS (such as any of the NT-based versions of Windows), can be very expensive. Having a single monolithic file means you can open it (along with all the costly security/permissions checking etc.) when the game starts and never have to open another OS-level file again.


b. Seeking on physical storage media (CD/DVD, the platter of a hard disk etc) is *very* expensive and should be avoided at all costs, particularly backwards and random seeks.

With lots of individual files you have no guarantee that they'll be written to the media in any particular order, or as a single contiguous block, so you have no reliable way of removing redundant seeks.

With a single monolithic file/archive you can sort the location of the files in the archive into the exact order they will be loaded in game, thus removing all unnecessary seeks.

Having the files within the archive in load order also takes full advantage of hardware and OS level data caching.


c. Operating systems often have a minimum file size; under Windows with modern filesystems it's 1 KB - any file which is smaller than that is padded out to that size on disk and so will use that much room. Depending on how your game works, you might have lots of files that are much smaller than that (for example .INI files) - that means wasted space on disk.
Having your own monolithic archive means you can avoid padding of files and so have a smaller install size.
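To make the archive idea concrete, here's a minimal Python sketch of a monolithic pack file with a table of contents up front and entries written in load order, so reads proceed strictly forward. The `PAK0` magic and the exact layout are made up for illustration, not any real engine's format.

```python
import struct

def write_pack(path, files_in_load_order):
    """files_in_load_order: list of (name, bytes), pre-sorted by load order."""
    header = struct.pack("<4sI", b"PAK0", len(files_in_load_order))
    # TOC entry: u16 name length + name + u32 offset + u32 size.
    toc_size = sum(2 + len(n.encode()) + 8 for n, _ in files_in_load_order)
    offset = len(header) + toc_size  # payloads start right after the TOC
    toc, blobs = [], []
    for name, data in files_in_load_order:
        enc = name.encode()
        toc.append(struct.pack("<H", len(enc)) + enc
                   + struct.pack("<II", offset, len(data)))
        blobs.append(data)
        offset += len(data)
    with open(path, "wb") as f:
        f.write(header)
        for entry in toc:
            f.write(entry)
        for blob in blobs:
            f.write(blob)

def read_pack(path):
    """Read the TOC once, then read every payload in forward order."""
    entries = {}
    with open(path, "rb") as f:
        magic, count = struct.unpack("<4sI", f.read(8))
        assert magic == b"PAK0"
        toc = []
        for _ in range(count):
            (nlen,) = struct.unpack("<H", f.read(2))
            name = f.read(nlen).decode()
            off, size = struct.unpack("<II", f.read(8))
            toc.append((name, off, size))
        for name, off, size in toc:  # TOC is in load order: no backward seeks
            f.seek(off)
            entries[name] = f.read(size)
    return entries
```

Because the payloads are laid out in the same order the reader walks the TOC, every `seek` lands where the file pointer already is, which is exactly the "remove all unnecessary seeks" property described above.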



2) Byte for byte, access to a physical disk is always going to take longer than access to [physical] memory plus the CPU cycles for most (average) decompression schemes. However, things aren't quite so simple:

a. If you're using the CPU for anything else while you're loading (as you would with asynchronous/background loading/streaming), then the effective CPU resources you can dedicate to decompression are drastically reduced - potentially to less (byte for byte) than reading the uncompressed data straight from disk.

b. Modern hard disks (and even CD/DVD drives) are capable of bus-mastered DMA which takes practically no resources from the CPU to transfer data into [physical] memory.

c. There are large read buffers in use on both the hard disk controller board and at the OS end which combined with DMA and deliberate/predictive read-ahead means the extra data for an uncompressed file might not actually take any longer to load (depending on the exact situation).

d. On most modern operating systems such as Windows, memory isn't what you think it is. Sometimes it's the physical RAM chips; sometimes (often, on some systems) it's virtual - i.e. it's in the page file on your hard disk!
If an average level in your game commits more memory than is physically available, then somewhere in your game you'll have extra hard disk accesses to page that memory in and out.
If your loading over-commits, you can end up with three times as much physical disk access as you're accounting for.


3) Something you haven't mentioned is "in-memory" file formats. A common technique is to store a whole file, or even a whole level, on disk in exactly the same format it will have in memory while playing the game (i.e. it's just like a dump of memory - the only extra work is fixing up pointers).

a. In-memory level formats avoid the conversion time. For example, loading a texture stored as a 32-bit compressed TGA into a raw 16-bit RGB 565 texture requires CPU work, memory allocation, etc.

b. An in-memory file lets you allocate the memory for, and load, the whole level in one go - this can be very good for memory performance and for loading multiple files at once.

c. They also make asynchronous loading really easy since you don't need to perform any processing per file until the end of loading.

d. A great number of console games use in-memory loading.
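In a language without raw pointers the "memory dump plus pointer fixup" idea can still be sketched: below, a hypothetical level blob stores offsets instead of pointers, is read in one allocation, and "fixup" is just turning each offset into a zero-copy view. The layout (`count`, then `offset/size` pairs, then payloads) is invented for illustration.

```python
import struct

def build_level_blob(assets):
    """assets: list of bytes payloads. Returns one contiguous blob."""
    header = struct.pack("<I", len(assets))
    table_size = 8 * len(assets)          # u32 offset + u32 size per asset
    offset = len(header) + table_size
    table, payload = b"", b""
    for data in assets:
        table += struct.pack("<II", offset, len(data))
        payload += data
        offset += len(data)
    return header + table + payload

def view_assets(blob):
    """One read, no per-file parsing: each asset is a zero-copy view."""
    mv = memoryview(blob)
    (count,) = struct.unpack_from("<I", mv, 0)
    views = []
    for i in range(count):
        off, size = struct.unpack_from("<II", mv, 4 + 8 * i)
        views.append(mv[off:off + size])  # the "pointer fixup" step
    return views
```

The whole blob would be loaded with a single read (and could come straight off a DMA transfer); the per-asset work at the end of loading is a handful of integer reads, which is what makes the asynchronous-loading case in (c) so cheap.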


4) If you compress on a per-file basis, include a flag to exclude some files. Formats such as PNG and JPEG are already compressed, and re-compressing them often means wasted CPU time to decompress.
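A minimal sketch of that per-entry flag, using Python's zlib: compress only when the format isn't already compressed and only when it actually shrinks the data. The extension list and the flag values are illustrative, not a standard.

```python
import zlib

# Assumed set of already-compressed formats; extend as needed.
ALREADY_COMPRESSED = {".png", ".jpg", ".jpeg", ".ogg"}

def pack_entry(name, data):
    """Returns (flag, payload): flag 1 = deflated, flag 0 = stored as-is."""
    ext = name[name.rfind("."):].lower() if "." in name else ""
    if ext not in ALREADY_COMPRESSED:
        packed = zlib.compress(data, 6)
        if len(packed) < len(data):   # only keep compression if it helped
            return (1, packed)
    return (0, data)

def unpack_entry(flag, payload):
    return zlib.decompress(payload) if flag else payload
```

The "did it shrink?" check also catches small or high-entropy files where deflate would otherwise add overhead.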


5) Having "Loading..." displayed on screen, in any PC game installed to hard disk, for long enough that it becomes noticeable/annoying is very wrong and hints at sloppiness somewhere.
For a console game, loading from CD or DVD, there are minimum REQUIRED (TRC/TCR/LotCheck) specified loading times so console game developers tend to pay a lot of attention to this area.

--
Simon O'Connor
3D Game Programmer &
Microsoft DirectX MVP


Try this:
in the game control panel, have a slider like this:

(DISK SPACE)-----\/------(SPEED)
When the user clicks OK, recompress the files to either:
a. use less disk space, with higher loading times, or
b. use more disk space, with shorter loading times.

l8rz
IMO compressing data is almost always a good thing.

Chances are there will be plenty of free CPU cycles, since the HDD or CD-ROM read speed will be the bottleneck.

Even if CPU usage is a factor, something such as LZO could be used. The compression ratio wouldn't be as good as zlib's, but it's very fast to decompress.
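LZO isn't in the Python standard library, so as a stand-in the sketch below compares zlib at its fastest level against its strongest: the same ratio-versus-speed tradeoff the post describes (a fast codec compresses less but costs fewer cycles).

```python
import zlib
import time

def measure(data, level):
    """Compress at the given level; time only the decompression side."""
    packed = zlib.compress(data, level)
    t0 = time.perf_counter()
    restored = zlib.decompress(packed)
    elapsed = time.perf_counter() - t0
    assert restored == data           # round-trip sanity check
    return len(packed), elapsed

data = b"the quick brown fox jumps over the lazy dog " * 20000
fast_size, fast_time = measure(data, 1)   # fastest: worse ratio
best_size, best_time = measure(data, 9)   # strongest: better ratio
```

On most inputs level 9 yields a smaller payload than level 1 at a higher CPU cost; whether that trade wins during loading depends on how starved the CPU is, per point 2a above.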


Drakonite

[Insert Witty Signature Here]
Shoot Pixels Not People
So, basically it's debatable whether one should use compression or not, but it's basically mandatory to create "packages"?
Thanks for the input guys

