How to avoid slow loading problems in games

34 comments, last by cr88192 11 years, 2 months ago

As you know, one of the problems that plagues a game is slow loading time. It breaks immersion. IMHO, this game does it the most noticeably: Spectral Souls on the PSP. I made a 50-second video recording my character exiting a store and entering it again.

[embedded video]

Exiting starts at 0:02 and ends at 0:20 (18 seconds total). A game that shows "disc access" over a black screen can't be good either.

Entering the store runs from 0:23 to 0:37 (14 seconds total). So I guess the game might have the data of the previously entered area stored temporarily somewhere, and that saves you 4 seconds?

I ran the same test again.

Exiting the store runs from 0:38 to 0:47 (8 seconds total). Wow, now it takes much less time to load!

Did the programmers not test the algorithms for execution time?

How does a programmer prevent this from happening?

I'm currently writing my own game in Java, and I've certainly learned a lot from this game's loading times. Although I only recorded entering and exiting the store, the same thing happens when the game loads the animation files for a spell or runs the algorithm controlling a monster's behavior. Each dialogue screen also needs 5-8 seconds before the dialogue appears.

I would not want this type of loading in my game. This is the only game I have come across that handles it this badly. I'm pretty sure every gamer prefers smooth loading transitions.


First off, games often do more than just load data when loading a level for the first time. They need to prepare the level, unpack data, etc. Then they cache it, and the next time you load it, loading is much faster. This often happens on systems where memory and the data carrier (DVD/cartridge) are limited.

The best solution seems to be streaming: the level is not completely loaded at the beginning, and memory-expensive data like textures are often left out. After the game starts, the rest is loaded ("streamed") piece by piece while the game is running.

The PSP has very little RAM, and the seek time for UMDs is absolutely horrid. The best thing to do is to use a fast compression/decompression library (because decompressing is faster than reading the full data from the disc would be) and arrange the data so it can be decompressed directly into the memory it belongs in, with no post-processing. It's a step beyond serialization, but it's not too hard to do with the flyweight pattern and some creative disc arrangement. 18 seconds of load time is ridiculous even for the PSP. Although the RAM is limited, it sounds like either your initial 18 seconds was a bad disc seek, or the game is using a resource cache and ends up not needing to load as much when you immediately re-enter an area.
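As a rough illustration of "decompress directly into the memory it belongs in", here is a toy run-length decoder standing in for a real codec (on a real project you'd use something like zlib or LZ4); the format and names are made up for the example. The point is that the output lands straight in the caller's destination buffer, with no temporary allocation or post-processing pass:

```cpp
#include <cstddef>
#include <cstdint>

// Toy run-length decoder: the source is pairs of (count, byte), and each
// pair expands directly into the caller-provided destination buffer.
// Returns the number of bytes written (capped at dstCap).
size_t rleDecode(const uint8_t* src, size_t srcLen, uint8_t* dst, size_t dstCap) {
    size_t out = 0;
    for (size_t i = 0; i + 1 < srcLen; i += 2) {
        uint8_t count = src[i];
        uint8_t value = src[i + 1];
        for (uint8_t n = 0; n < count && out < dstCap; ++n)
            dst[out++] = value;
    }
    return out;
}
```

A real codec would be far more involved, but the shape is the same: the decompressor writes into the final resting place of the data.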

Another way to reduce load times is to use formats that take up less space. On the PSP, the CLUT-based 16bpp color mode can be pretty compact even for raw fullscreen bitmaps and can still look super-sexy. The GPU can render directly from this format, so...

Basically reducing load times is just like any other optimization: profile and correct. Find the bottleneck and work on it until it's not a bottleneck anymore, then move on to the next bottleneck.

void hurrrrrrrr() {__asm sub [ebp+4],5;}

There are ten kinds of people in this world: those who understand binary and those who don't.

+1 to Khatharr's assessment. Fast heavy compression cuts down the time you waste waiting on the disk. And in-memory formats save you from doing too much processing at load time.

imho, the big pitfalls programmers run into when loading data are:

1) Parsing text files. It's simple and convenient, and it offers benefits like being able to read your data in any text editor and often seeing your edits on-the-fly in game. Pre-baked binary formats take a bit more effort at edit time, because you need a tool to pack text into binary before the game can reload it, and changing the binary structure often means rebuilding all files of that type. But moving to a binary format saves space and the compute time wasted parsing text.

2) Sparse data trees laid out as individual files. It's really nice to have your data broken up as an "index" and "items", but then you end up reading dozens of files to pull together all the information you need. Consider a model: you have the model verts, the material definition, some textures, and a shader. That's maybe 7 files for one model, and you're jumping all over the disk to gather them. No matter how you order the loading, seeking between the files wastes time. It's preferable to pack everything for the level ahead of time into a more optimal layout in a single file. Reading the single file is fast and simple. You can then load all the data first, preferably packed by usage pattern, and link it up in memory after the monolithic pack files are loaded. Now you load just a few files (all model/level mesh data, all textures, player mesh data, player textures, player sounds, level sounds) and spend your time jumping around memory to link things up. Much faster.

3) Allowing allocations during the load process. You should be able to pre-process your data so you know exactly how many of each thing you have and how large each one is, and reserve all the memory up front. std::vector::push_back() has amortized O(1) insertion, but it will waste a tonne of time resizing and copying if you don't call std::vector::reserve(totalObjectCount) first. The same idea applies to all the data the game is loading.
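A minimal sketch of point 3, with a hypothetical GameObject type: reserving up front means the vector allocates once instead of repeatedly growing and copying as objects stream in:

```cpp
#include <cstddef>
#include <vector>

// Hypothetical loaded object; a real one would hold mesh/texture handles etc.
struct GameObject { int id; float x, y; };

std::vector<GameObject> loadObjects(size_t totalObjectCount) {
    std::vector<GameObject> objects;
    objects.reserve(totalObjectCount);            // one allocation up front
    for (size_t i = 0; i < totalObjectCount; ++i)
        objects.push_back({static_cast<int>(i), 0.0f, 0.0f}); // never reallocates
    return objects;
}
```

The count would come from a pre-baked header in the pack file, so it is known before any objects are read.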

Thanks for the knowledge. This is very useful information! My other question would be: I'm sure the programmers knew about the slow loading times, so why not fix it?

warnexus, on 01 Feb 2013 - 16:08, said:
...I'm sure the programmers know about the slow loading times but why not fix it?

Maybe ran out of time.
Maybe ran out of money.
Maybe ran out of ****s to give. (Sometimes this happens.)
Maybe just got bushwhacked by the company bureaucracy.

good loading times are hard.

for example, in my case (on a PC), loading textures is a good part of the startup time, and a lot of this is due to resampling the (generally small) number of non-power-of-2 textures to power-of-2 sizes. this is then followed by the inner loops doing the inverse-filtering for PNG files (more so, the cycle-eating evilness of the Paeth filter, which wouldn't be so bad, except that it tends to be one of the most generally effective, and thus most-used, filters in PNG).

I sometimes wish that PNG had a simpler "A+B-C" linear filter, which can often compete well with Paeth but, more notably, is much cheaper (it could improve performance by diluting the number of times the encoder picks the Paeth filter). granted, along similar lines, one can also wish that PNG filtered per-block rather than per-scanline, ... but alas. (nevermind all the pros and cons of using custom image formats for textures...).
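for reference, the two predictors look something like this (a = left neighbor, b = above, c = upper-left, as in the PNG spec; the "linear" variant is the hypothetical cheaper filter wished for above, not part of PNG):

```cpp
#include <cstdlib>  // abs

// Paeth predictor from the PNG spec: compute p = a + b - c, then return
// whichever of a, b, c is closest to p (ties broken in a/b/c order).
int paeth(int a, int b, int c) {
    int p  = a + b - c;
    int pa = abs(p - a), pb = abs(p - b), pc = abs(p - c);
    if (pa <= pb && pa <= pc) return a;
    if (pb <= pc) return b;
    return c;
}

// Hypothetical linear filter: just a + b - c, clamped to the byte range.
// No branches on three absolute differences, hence much cheaper per pixel.
int linearPredict(int a, int b, int c) {
    int p = a + b - c;
    return p < 0 ? 0 : (p > 255 ? 255 : p);
}
```

the inner decode loop runs one of these per byte of image data, which is why the predictor's cost matters so much.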

note that even so, decoding such a texture may still be faster than loading a simpler uncompressed texture (such as BMP or TGA), due mostly to the time it takes to read files from the HDD.

luckily, one can usually cache textures.

unluckily, you still have to load them the first time they are seen.

there is a trick though, namely to have alternate lower-resolution and high-resolution versions of textures.

initially, only the low-resolution versions are loaded, and then any high-resolution and extended-component textures (normal-maps, ...) are streamed in during play.

clever streaming = using a thread. in my case (poor-man's streaming), it often means using a timer, meaning that a certain number of milliseconds can be used for loading each frame (if too much time has gone by, we abort and wait until later). granted, for longer operations, this lazy option can still result in choppiness (mostly because loading a single texture can take a fairly high number of milliseconds, potentially going somewhat over the budget).
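a minimal sketch of the timer-based "poor-man's streaming" idea; the queue of pending texture IDs and the trivial loadOne() are made-up stand-ins for a real loader:

```cpp
#include <chrono>
#include <queue>

static int g_loaded = 0;              // counts completed loads (for the demo)
void loadOne(int /*textureId*/) { ++g_loaded; }  // stand-in for a real decode

// Drain the pending queue until the per-frame millisecond budget runs out,
// then return; the remaining items are picked up on a later frame.
void streamStep(std::queue<int>& pending, double budgetMs) {
    auto start = std::chrono::steady_clock::now();
    while (!pending.empty()) {
        loadOne(pending.front());
        pending.pop();
        double elapsedMs = std::chrono::duration<double, std::milli>(
            std::chrono::steady_clock::now() - start).count();
        if (elapsedMs >= budgetMs)
            return;                   // over budget: abort and wait until later
    }
}
```

the choppiness mentioned above comes from the fact that the budget check only happens between items, so one slow item can overshoot it.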

if graphical settings are set low, the high-resolution versions may be skipped entirely.

sometimes, loading may be hindered by other things, such as a few common offenders:

parsing text files;

linear lookups.

linear lookups are extra bad, as they can turn loading from an O(n) operation into an O(n^2) operation.

IME, linear lookups have more often been a bigger problem than traditional parsing tasks, like reading off tokens or decoding numbers.

matching strings (such as for command-names) has sometimes been an issue, but shares a common solution with that of the linear lookup problem: hashing.

say, for example:

read in token (or read in line and split into tokens);

use a hash-based lookup, mapping the token to a command-ID number or similar;

"switch()".
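a minimal sketch of the hash-then-switch idea; the command names and IDs here are made up for the example:

```cpp
#include <string>
#include <unordered_map>

enum Cmd { CMD_UNKNOWN = 0, CMD_SPAWN, CMD_TEXTURE, CMD_SOUND };

// Built once; each lookup is then O(1) on average, instead of a linear
// scan through every command name for every token in the file.
static const std::unordered_map<std::string, Cmd> kCommands = {
    {"spawn", CMD_SPAWN}, {"texture", CMD_TEXTURE}, {"sound", CMD_SOUND},
};

Cmd lookupCommand(const std::string& token) {
    auto it = kCommands.find(token);
    return it == kCommands.end() ? CMD_UNKNOWN : it->second;
}

void handleToken(const std::string& token) {
    switch (lookupCommand(token)) {
        case CMD_SPAWN:   /* spawn an entity */   break;
        case CMD_TEXTURE: /* queue a texture */   break;
        case CMD_SOUND:   /* queue a sound */     break;
        default:          /* unknown token */     break;
    }
}
```

the same trick (a hash keyed by name, built once) also fixes the linear-lookup problem when resolving references between loaded objects.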

a bigger problem with text formats is more often not actually the parsing, but rather reading them in from disk.

basically, for reading lots of small files, the OS's filesystem is often your enemy.

potentially better is, when possible, to bundle them into an archive, such as a ZIP, then fetch the files from this.

if implemented well, reading the contents from a ZIP archive can actually outperform reading them via normal file-IO (both due to bundling, and also reducing total disk IO via storing the data in a deflated form).

a downside, though, is that there are plenty of braindamaged and stupid ways to handle this as well.

(if not implemented stupidly, it is possible to get good random access speeds to a ZIP archive with 100k+ files, but if implemented stupidly, so help you...).

granted, due to the way ZIP is designed, the above may still require the initial up-front cost of reading in the central directory and transforming it into a more efficient directory-tree structure (for example, large flat lists of directory-entry nodes, internally linked into a hierarchical tree structure, so that the directory tree can be more efficiently "descended into", like in the OS filesystem).
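a simplified sketch of that up-front transformation; here a flat hash map keyed by full path stands in for a real hierarchical tree, and the Entry layout is made up (a real ZIP central-directory record holds more fields):

```cpp
#include <cstdint>
#include <string>
#include <unordered_map>
#include <vector>

// Simplified stand-in for a ZIP central-directory record.
struct Entry {
    std::string path;    // full path within the archive
    uint32_t    offset;  // where the deflated data starts
    uint32_t    size;    // deflated size
};

// Read the central directory once up front into a hash index, so that
// every subsequent file lookup is O(1) on average rather than a scan
// over the whole directory (the "implemented stupidly" version).
std::unordered_map<std::string, Entry>
buildIndex(const std::vector<Entry>& centralDir) {
    std::unordered_map<std::string, Entry> index;
    index.reserve(centralDir.size());
    for (const auto& e : centralDir)
        index.emplace(e.path, e);
    return index;
}
```

with 100k+ files this one-time pass is what makes later random access cheap.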

during development, a disadvantage of ZIP though is that it can't be readily accessed by the OS or by "normal" apps (such as graphics editors, ...), so is more something for "final deployment". often these will be given a special extension (such as "pk" or "pk3" or similar) to reduce the likelihood of a naive user extracting them.

for some other specialized use cases, I am using a loosely WAD-based format. it is also linked into a hierarchy, albeit less efficiently (via simply linking to the parent directory entry), mostly to save space (ideally, we would also want a 'next' link, but for this use case the parent-link won out over the next-link). like ZIP, contents are usually deflated. this can avoid some of the up-front cost (but with its own costs).

but, even with everything, getting everything loaded up in a few seconds or less isn't really an easy task (with modern expectations for content).

loading time has to come somewhere, either at engine startup, at world loading, or during gameplay.

typically, compromises are made.

not to say though that some games don't just have bad loading code...

> void hurrrrrrrr() {__asm sub [ebp+4],5;}

LOLz

(quickly going nowhere fast...).


<3


> good loading times are hard.
>
> for example, in my case (on a PC), loading textures is a good part of the startup time, and a lot of this is due to resampling the (generally small) number of non-power-of-2 textures to power-of-2 sizes. this is then followed by the inner loops doing the inverse-filtering for PNG files [...]
>
> luckily, one can usually cache textures.
> unluckily, you still have to load them the first time they are seen.

Why would you have any data in non-optimal dimensions, and why would you even bother loading any of those formats? This stuff should be done once, and only once. Write a tool that goes over your assets folder and performs any needed conversions.

Take all your files and convert them to the proper resolution (better yet, beat your artist with a blunt object until he makes proper power-of-2 images; images look bad when you change their aspect ratio), then turn them into compressed DDS files. You're writing a game, not an image editor; it doesn't need to know about PNG or TGA or anything else.

Then, when your game runs, dump your compressed DDS files directly into VRAM. You don't need to check their dimensions. You don't need to decompress or convert anything. Just dump them directly. DDS also handles mip levels, which covers your next paragraph.

When the game is running, everything should already be ready to go. Nothing should ever be converted or processed on the fly. Keep a good copy of everything for editing purposes, but make sure it's all converted when you build your game. Do a simple date comparison: if any asset is newer than its last converted output, convert it then.
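A minimal sketch of that date comparison, assuming C++17's std::filesystem; needsConversion is a made-up helper name for the example:

```cpp
#include <filesystem>
namespace fs = std::filesystem;

// Rebuild if the converted output is missing, or if the source asset
// has been modified more recently than the converted output.
bool needsConversion(const fs::path& source, const fs::path& converted) {
    if (!fs::exists(converted))
        return true;
    return fs::last_write_time(source) > fs::last_write_time(converted);
}
```

The asset-build tool would walk the assets folder, call this per file, and only re-run the PNG-to-DDS conversion when it returns true.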

