C++ - do you use STL in your game?

Started by
32 comments, last by jbadams 11 years, 10 months ago
I think a lot of flak that the standard library gets comes from misuse, because most classes offer functionality they are completely unsuited for (such as insert in vector). Programmers throw them around willy-nilly without thinking about which structure is best for the intended usage. This leads to lists being used for memory that needs to be contiguous, which leads to programmers claiming that the stantard library is slow.
Advertisement

[background=rgb(250, 251, 252)]If you need to insert or remove items somewhere in the middle instead of just at the end (vector) or at the beginning or the end (deque) a linked list it practically guaranteed to be faster.[/background]



[/quote]
It depends. If the order is irrelevant, "swap and pop" is very fast.


[color=#282828][font=helvetica, arial, verdana, tahoma, sans-serif]

[background=rgb(250, 251, 252)]Use the proper container for a given algorithm, taking into account the algorithmic performance (in this case O(1) vs O(n) insertion).[/background][/font]



[/quote]
The cache performance, not the algorithmic performance, dominates modern hardware. Jumping around memory using a linked list is worst case behaviour from the cache's perspective. Algorithmic analysis assumes that accessing memory costs the same regardless of how it is done.

In most real programs, insertions and deletions are "rare" and iteration is common. Speeding up insertion and deletion, while making iteration slower, will often have a much larger cost in the long run. When the data set is small, the performance implications of copying it all around is negligable. When the data set gets large, cache tends to dominate.

This blog post quotes a slide from Bjarne Stroustroup illustrating some of these ideas. Unfortunately the code isn't included and I have no time to search for it now.

Again, this is a general rule but not absolute. I used the qualifying terms "real program" and "almost always" for a reason. There are sweet spots where linked lists make more sense. In particular, linked lists have a (more or less) stable running speeds, while vector only has amortised running speed, with potentially large spikes around reallocations and deletions. Though linked lists aren't immune to spikes. Carefully loaded linked lists might behave nicely with the cache, but the "wrong" set of insertions and deletions might radically change their performance profile.

It depends. If the order is irrelevant, "swap and pop" is very fast.


If it is irrelevant, yes.


The cache performance, not the algorithmic performance, dominates modern hardware. Jumping around memory using a linked list is worst case behaviour from the cache's perspective. Algorithmic analysis assumes that accessing memory costs the same regardless of how it is done.


True up to some constant factor. There is always a break even point at which the algorithmically better container will be faster. Of course that point might be way beyond what you'll ever need, heck, even bubble sorts can be useful for small, almost-sorted lists/vectors. In the end, profiling is the only way to find out what works best, and in that case the STL is great because due to the consistent interfaces it is easy to swap container types.
The cache performance, not the algorithmic performance, dominates modern hardware. Jumping around memory using a linked list is worst case behaviour from the cache's perspective. Algorithmic analysis assumes that accessing memory costs the same regardless of how it is done.
QFE. This is why designing your data-structures to fit into compact blobs of bytes is so important.
If you come up with compact "file-formats for memory", which can use linked-lists or offsets or arrays internally, like below, then you'll actually be making proper use of your CPU. The best algorithmic structure is secondary and must be designed to fit the primary constraint of the working-set size.
char* data = (char*)malloc(640*1024);//640KiB should be enough for anybody
vector<Foo,42>& fooVec = *new(data) vector<Foo,42>();
data += sizeof(fooVec);


The above is the reason game engine dev's might avoid STL containers - because the STL enforces the interface of random new/delete calls on your memory allocator. You can supply custom allocators to the STL, but they have to follow it's interface.
In contrast, many game-systems use allocation techniques such as stacks and scopes, which have a completely different interface and logical rules for allocating/deallocating. It's only in this situation, where you can't make [font=courier new,courier,monospace]std::vector[/font] use your game's allocator that you should write your own (which is probably much simpler and less featureful).

N.B. if I was writing a game where systems were already using [font=courier new,courier,monospace]new[/font]/[font=courier new,courier,monospace]delete[/font], then using [font=courier new,courier,monospace]std::vector[/font]/etc would be the obvious (standard) choice. It's only when working in an environment with a completely different memory allocation strategy that you need to abandon the standard containers.

The cache performance, not the algorithmic performance, dominates modern hardware. Jumping around memory using a linked list is worst case behaviour from the cache's perspective. Algorithmic analysis assumes that accessing memory costs the same regardless of how it is done.

In most real programs, insertions and deletions are "rare" and iteration is common. Speeding up insertion and deletion, while making iteration slower, will often have a much larger cost in the long run. When the data set is small, the performance implications of copying it all around is negligable. When the data set gets large, cache tends to dominate.
I don't think anyone would disagree that vectors would be used more often, but the argument is that there are still real world cases where linked lists are better for performance. In some cases, order is important, for example.

This blog post quotes a slide from Bjarne Stroustroup illustrating some of these ideas. Unfortunately the code isn't included and I have no time to search for it now.[/quote]But this is a case that also includes the time to search linearly for the insertion point. It also only shows a single insertion/deletion AFAICT - what about an algorithm with a large number of insertions?

Again, this is a general rule but not absolute. I used the qualifying terms "real program" and "almost always" for a reason. There are sweet spots where linked lists make more sense.[/quote]Yes, I think this is the point smile.png Your original posts said one should not use lists at all, not that vectors are usually better.

(Also there are cases where neither make a difference for performance, but using a list fits better with writing readable code due to the nature of the performance, so using a vector in some belief that it'll be faster is just premature optimisation.)

http://erebusrpg.sourceforge.net/ - Erebus, Open Source RPG for Windows/Linux/Android
http://conquests.sourceforge.net/ - Conquests, Open Source Civ-like Game for Windows/Linux


The above is the reason game engine dev's might avoid STL containers - because the STL enforces the interface of random new/delete calls on your memory allocator. You can supply custom allocators to the STL, but they have to follow it's interface.


The interface allows for enough flexibility in most cases. E.g. pooled allocators etc. should all be possible.

Since we are already on the topic of cache coherence: Instead of arrays of compact structs, structs of arrays can be even better. And for optimal performance you may need arrays of structures of arrays... caches are a bitch ;)
[quote name='Hodgman' timestamp='1337003402' post='4940069']The above is the reason game engine dev's might avoid STL containers - because the STL enforces the interface of random new/delete calls on your memory allocator. You can supply custom allocators to the STL, but they have to follow it's interface.
The interface allows for enough flexibility in most cases. E.g. pooled allocators etc. should all be possible.[/quote]In most cases for whom? I do actually agree with that sentence, for most people, but when talking about high-performance game engines, "general purpose" allocation (i.e. new/delete) is entirely inadequate for a lot of purposes. For starters it's a singleton, which is asking for trouble...
If it were adequate, you wouldn't have game dev's avoiding the STL because it doesn't work with their stack/scope/ring allocators linked to above.
Since we are already on the topic of cache coherence: Instead of arrays of compact structs, structs of arrays can be even better. And for optimal performance you may need arrays of structures of arrays... caches are a bitch ;)
And lists and maps and trees too... as long as they're designed to fit in the constraints.
The interface allows for enough flexibility in most cases. E.g. pooled allocators etc. should all be possible.[/quote]

STL doesn't allow for pooled allocators due to underspecified allocator semantics. It works for specific implementations and code will be fully standard-compliant, but will mysteriously break on some compilers.

STL basically says: valid implementation may be stateful or stateless.


Linked lists in practice also suffer from another problem - search. There are two cases:
- iteration/additions/removal dominate - GC-collected vector wins
- search dominates - map/hashmap wins

Even for things like browser DOM or UI in general, one will frequently search for a group of objects fitting a criteria, requiring full sweep over entire structure with multiple results or operations.

Linked lists make sense only when the cost of copying is disproportionate to other operations or where reallocation cannot be performed at all, such as disk-based data or things like URL. In all other cases they lose even at algorithmic level. And even in these cases, bulk operations are often used (b-tree or other form of pre-computed index, sitemaps for web sites).

[color=#282828][font=helvetica, arial, verdana, tahoma, sans-serif]

[background=rgb(250, 251, 252)]Your original posts said one should not use lists at all, not that vectors are usually better.[/background]

[/font]
[/quote]
Please quote where I said that.


[quote name='rip-off' timestamp='1337002061' post='4940057']The cache performance, not the algorithmic performance, dominates modern hardware. Jumping around memory using a linked list is worst case behaviour from the cache's perspective. Algorithmic analysis assumes that accessing memory costs the same regardless of how it is done.
QFE. This is why designing your data-structures to fit into compact blobs of bytes is so important.
If you come up with compact "file-formats for memory", which can use linked-lists or offsets or arrays internally, like below, then you'll actually be making proper use of your CPU. The best algorithmic structure is secondary and must be designed to fit the primary constraint of the working-set size.
char* data = (char*)malloc(640*1024);//640KiB should be enough for anybody
vector<Foo,42>& fooVec = *new(data) vector<Foo,42>();
data += sizeof(fooVec);


The above is the reason game engine dev's might avoid STL containers - because the STL enforces the interface of random new/delete calls on your memory allocator. You can supply custom allocators to the STL, but they have to follow it's interface.
In contrast, many game-systems use allocation techniques such as stacks and scopes, which have a completely different interface and logical rules for allocating/deallocating. It's only in this situation, where you can't make [font=courier new,courier,monospace]std::vector[/font] use your game's allocator that you should write your own (which is probably much simpler and less featureful).

N.B. if I was writing a game where systems were already using [font=courier new,courier,monospace]new[/font]/[font=courier new,courier,monospace]delete[/font], then using [font=courier new,courier,monospace]std::vector[/font]/etc would be the obvious (standard) choice. It's only when working in an environment with a completely different memory allocation strategy that you need to abandon the standard containers.
[/quote]
Glad you mentioned it before I had to be the party pooper.

In contrast with many who replied above me, I wouldn’t touch STL with a 3-inch pole.
EASTL: Because Electronic Arts may make crap games, but they still know what they are doing.
Real games crave certain allocation disciplines. I even outlined stack allocators in my most recent tutorial. These things improve my performance by factors you can’t even believe, and this is typical of game development. Games need this kind of performance.

But it is not just that. While some claim that iterators are not slower than the alternatives, the reality is that this is largely compiler-dependent. Some (most) compilers always perform range checks on array access unless a specific macro is defined. While that macro (NDEBUG) is easy to define, it still doesn’t actually mean that all of your STL components will stop using that array-access check. Some might still be checking array access even with that defined.

The reason so many deviate from STL is:
#1: Everything mentioned by EASTL.
#2: Even if you think you are in control, you are not. You have no idea what is happening under the wheels. Iterations could be as fast as array access on one machine, but 3 times as long on another machine.

Making STL classes that are as fast as those provided by the most optimal STL vendors? I honestly have to wonder if that is so hard. I had no problems making faster implementations (one for POD, one for objects), and there does not seem to be any factor that outweighs overall usage efficiency. That is, by knowing how I intend to use a certain object, one efficiency major can be used over another to guarantee optimal use of custom vectors (etc.)

Simply put, I would not touch STL with a 3-inch pole.


L. Spiro

I restore Nintendo 64 video-game OST’s into HD! https://www.youtube.com/channel/UCCtX_wedtZ5BoyQBXEhnVZw/playlists?view=1&sort=lad&flow=grid

This topic is closed to new replies.

Advertisement