Sign in to follow this  
codeToad

C++ - do you use STL in your game?

Recommended Posts

codeToad    142
I heard that the STL is too slow for use in a high-performance game. I've been told that if you want to use the same coding techniques as a AAA game, you have to make custom classes for all your linked lists, trees, etc., rather than just using LinkedList<MyClass> (or whatever the STL syntax is).

Is this true?

My game will be so simple that the performance will not really matter. I'm doing this for the programming experience.

Share this post


Link to post
Share on other sites
japro    887
Claims that STL is "too slow" are utter nonsense. Especially std::vector is identical in performance to a "raw dynamic array". You will have a REALLY hard time writing faster containers than the ones in good STL implementations. Now there might be the case that your data has a special structure that you can abuse, in that case you can think about rolling your own.

Share this post


Link to post
Share on other sites
codeToad    142
Thanks, guys. I got some experience writing my own linked lists and trees in the past, so I think I'm going to go with the collections provided by the C++ standard library.

Share this post


Link to post
Share on other sites
Cornstalks    7030
One reason AAA games may not use the C++ Standard Library is because of poor compiler support for certain standard language features, such as exceptions or templates. Some compilers for some consoles don't support these things too well, and so an alternate will be developed that doesn't use (or avoids using too much) exceptions or templates to work around the compiler's lacking implementation.

Another reason might be because the developers are targeting a specific operating system, and that operating system has some custom functions that allow the developers do something very specific (like memory-mapping files, for example). The developers may rewrite parts of the C++ Standard Library to take advantage of such specific functionality that's not exposed in the C++ Standard Library.

I'm just giving a couple of examples for when one might be justified in rewriting parts of the C++ Standard Library. If you're rewriting it (or parts of it), you need to have a very specific goal in mind and a very specific problem you're solving/working around. A naive rewrite is worse than stupid (unless doing it solely for educational purposes).

I haven't worked on a AAA game, but I've worked on some industrial-level software that sells for several thousands of dollars, and we use the C++ Standard Library (I haven't dug into the speed critical parts, but I think those are mostly done in assembly or SSE instructions where data structures aren't the bottleneck and the C++ Standard Library doesn't have algorithms for what we're doing). Use it.

I think jbadams summed up my opinions and views very well though.

Share this post


Link to post
Share on other sites
themean    1339
The big problem of stl is such that his implementation on different compilers are not the same, but this don't must stop you to using him
Boost containers are alternative of stl
You aways can typedef the container
[CODE]
typedef std::vector<Unit> UnitArray;
[/CODE]
And if you have performance problem you aways can change typedef with your own container
[CODE]
typedef MyVector<Unit> UnitArray;
[/CODE]
The only requirement is such that your container must have all methods of stl container equivalent Edited by themean

Share this post


Link to post
Share on other sites
Zoomulator    273
I agree with jbadams post with a few additions.

Two things to keep in mind.

First, there's a few people out there who aren't aware of the iterator debugging in visual studio's implementation of the STL (standard template library) containers. This causes a major slowdown in debug builds, and release code are often tens of times faster without it! Declare NDEBUG, or some _NO_ITERATOR_DEBUG (I think it was) to get rid of it. You can find the details on msdn if need be. It's a good safety net though that can really catch nasty bugs.

Second of all: all STL and standard library features are generalized to allow for a one-fits-all solution. This is generally a good thing, and they've implemented it in such a great way so that you can even assign your own memory allocators to avoid that potential slowdown too, which was mentioned before. Having them take memory from a pre-allocated pool greatly improves performance for the linked containers.

..But, generalized is always generalized, which means that you can't optimize it for your particular case and may run a bit slower because of it. You'd still have to be a damn nifty coder to make it better and as stable as the STD library.

Saying that, STL containers is my first choice, always. Most reasons stated by jbadams already.

Share this post


Link to post
Share on other sites
rip-off    10976
[quote]
I heard that the STL is too slow for use in a high-performance game.
[/quote]
Any performance issues that AAA companies find with the Standard C++ Library is that it's general nature doesn't allow them to do certain things. For instance, there isn't a way to tell std::vector<> to use a given piece of memory easily. The "allocator" parameter for standard library templates are a little underbaked and often don't give enough control to the client programmer.

Thus, the problems are of a design nature, not the actual implementation. The implementation of the standard containers tends to be very high quality. Most programmers are not going to beat the performance of classes like std::vector if they stick to the interface it outlines. In fact, many underexperienced programmers are likely to produce code that is algorithmically poorer, and thus will suffer heavily in real use.

[quote]
I've been told that if you want to use the same coding techniques as a AAA game, you have to make custom classes...
[/quote]
It makes no sense to apply the "coding techniques" of a massive, professional team to your small personal projects.

[quote]
... your linked lists, trees, etc., rather than just using LinkedList<MyClass> (or whatever the STL syntax is).
[/quote]
If you're worried about performance, you should strongly consider not using linked lists at all. They are almost always going to be slower than dynamic arrays on modern machines with real data.

[quote]
[color=#282828][font=helvetica, arial, verdana, tahoma, sans-serif][size=3][left][background=rgb(250, 251, 252)]I'm doing this for the programming experience.[/background][/left][/size][/font][/color]
[/quote]
I believe there is great value in implementing the standard containers yourself, as a learning exercise, in particular if you have a reliable source to get some feedback from. However, I would not recommend using such an implementation for a real project.

Share this post


Link to post
Share on other sites
l0calh05t    1796
[quote name='rip-off' timestamp='1336996225' post='4940024']
If you're worried about performance, you should strongly consider not using linked lists at all. They are almost always going to be slower than dynamic arrays on modern machines with real data.
[/quote]

Well, that would depend on how it is used. If you need to insert or remove items somewhere in the middle instead of just at the end (vector) or at the beginning or the end (deque) a linked list it practically guaranteed to be faster. Use the proper container for a given algorithm, taking into account the algorithmic performance (in this case O(1) vs O(n) insertion).

Share this post


Link to post
Share on other sites
turch    590
I think a lot of flak that the standard library gets comes from misuse, because most classes offer functionality they are completely unsuited for (such as insert in vector). Programmers throw them around willy-nilly without thinking about which structure is best for the intended usage. This leads to lists being used for memory that needs to be contiguous, which leads to programmers claiming that the stantard library is slow. Edited by turch

Share this post


Link to post
Share on other sites
l0calh05t    1796
[quote name='rip-off' timestamp='1337002061' post='4940057']
It depends. If the order is irrelevant, "swap and pop" is very fast.
[/quote]

If it is irrelevant, yes.

[quote name='rip-off' timestamp='1337002061' post='4940057']
The cache performance, not the algorithmic performance, dominates modern hardware. Jumping around memory using a linked list is worst case behaviour from the cache's perspective. Algorithmic analysis assumes that accessing memory costs the same regardless of how it is done.
[/quote]

True up to some constant factor. There is always a break even point at which the algorithmically better container will be faster. Of course that point might be way beyond what you'll ever need, heck, even bubble sorts can be useful for small, almost-sorted lists/vectors. In the end, profiling is the only way to find out what works best, and in that case the STL is great because due to the consistent interfaces it is easy to swap container types.

Share this post


Link to post
Share on other sites
Hodgman    51223
[quote name='rip-off' timestamp='1337002061' post='4940057']The cache performance, not the algorithmic performance, dominates modern hardware. Jumping around memory using a linked list is worst case behaviour from the cache's perspective. Algorithmic analysis assumes that accessing memory costs the same regardless of how it is done.[/quote]QFE. This is why designing your data-structures to fit into compact blobs of bytes is so important.
If you come up with compact "[url="http://bitsquid.blogspot.com.au/2010/02/blob-and-i.html"]file-formats for memory[/url]", which can use linked-lists or offsets or arrays internally, like below, then you'll actually be making proper use of your CPU. The best algorithmic structure is secondary and must be designed to fit the primary constraint of the working-set size.
[code]char* data = (char*)malloc(640*1024);//640KiB should be enough for anybody
vector<Foo,42>& fooVec = *new(data) vector<Foo,42>();
data += sizeof(fooVec);[/code]

The above is the reason game engine dev's might avoid STL containers - because the STL enforces the interface of random new/delete calls on your memory allocator. You can supply custom allocators to the STL, but they have to follow it's interface.
In contrast, many game-systems use allocation techniques such as [url="http://futurattack.googlecode.com/files/Double_Ended_Stack_allocator%20(1).pdf"]stacks[/url] and [url="http://publications.dice.se/attachments/scopestacks_public.pdf"]scopes[/url], which have a completely different interface and logical rules for allocating/deallocating. It's only in this situation, where you can't make [font=courier new,courier,monospace]std::vector[/font] use your game's allocator that you should write your own ([i]which is probably much simpler and less featureful[/i]).

N.B. if I was writing a game where systems were already using [font=courier new,courier,monospace]new[/font]/[font=courier new,courier,monospace]delete[/font], then using [font=courier new,courier,monospace]std::vector[/font]/etc would be the obvious (standard) choice. It's only when working in an environment with a completely different memory allocation strategy that you need to abandon the standard containers. Edited by Hodgman

Share this post


Link to post
Share on other sites
mdwh    1108
[quote name='rip-off' timestamp='1337002061' post='4940057']
The cache performance, not the algorithmic performance, dominates modern hardware. Jumping around memory using a linked list is worst case behaviour from the cache's perspective. Algorithmic analysis assumes that accessing memory costs the same regardless of how it is done.

In most real programs, insertions and deletions are "rare" and iteration is common. Speeding up insertion and deletion, while making iteration slower, will often have a much larger cost in the long run. When the data set is small, the performance implications of copying it all around is negligable. When the data set gets large, cache tends to dominate.[/quote]I don't think anyone would disagree that vectors would be used more often, but the argument is that there are still real world cases where linked lists are better for performance. In some cases, order is important, for example.

[quote][url="http://bulldozer00.com/2012/02/09/vectors-and-lists/"]This blog post[/url] quotes a slide from Bjarne Stroustroup illustrating some of these ideas. Unfortunately the code isn't included and I have no time to search for it now.[/quote]But this is a case that also includes the time to search linearly for the insertion point. It also only shows a single insertion/deletion AFAICT - what about an algorithm with a large number of insertions?

[quote]Again, this is a general rule but not absolute. I used the qualifying terms "real program" and "almost always" for a reason. There are sweet spots where linked lists make more sense.[/quote]Yes, I think this is the point [img]http://public.gamedev.net//public/style_emoticons/default/smile.png[/img] Your original posts said one should not use lists at all, not that vectors are usually better.

(Also there are cases where neither make a difference for performance, but using a list fits better with writing readable code due to the nature of the performance, so using a vector in some belief that it'll be faster is just premature optimisation.) Edited by mdwh

Share this post


Link to post
Share on other sites
l0calh05t    1796
[quote name='Hodgman' timestamp='1337003402' post='4940069']
The above is the reason game engine dev's might avoid STL containers - because the STL enforces the interface of random new/delete calls on your memory allocator. You can supply custom allocators to the STL, but they have to follow it's interface.
[/quote]

The interface allows for enough flexibility in most cases. E.g. pooled allocators etc. should all be possible.

Since we are already on the topic of cache coherence: Instead of arrays of compact structs, structs of arrays can be even better. And for optimal performance you may need arrays of structures of arrays... caches are a bitch ;)

Share this post


Link to post
Share on other sites
Hodgman    51223
[quote name='l0calh05t' timestamp='1337004526' post='4940078'][quote name='Hodgman' timestamp='1337003402' post='4940069']The above is the reason game engine dev's might avoid STL containers - because the STL enforces the interface of random new/delete calls on your memory allocator. You can supply custom allocators to the STL, but they have to follow it's interface.
[/quote]The interface allows for enough flexibility in most cases. E.g. pooled allocators etc. should all be possible.[/quote]In most cases for whom? I do actually agree with that sentence, for most people, but when talking about high-performance game engines, "general purpose" allocation (i.e. new/delete) is entirely inadequate for a lot of purposes. For starters it's a singleton, which is asking for trouble...
If it were adequate, you wouldn't have game dev's avoiding the STL because it doesn't work with their stack/scope/ring allocators linked to above.[quote name='l0calh05t' timestamp='1337004526' post='4940078']Since we are already on the topic of cache coherence: Instead of arrays of compact structs, structs of arrays can be even better. And for optimal performance you may need arrays of structures of arrays... caches are a bitch ;)[/quote]And lists and maps and trees too... as long as they're designed to fit in the constraints. Edited by Hodgman

Share this post


Link to post
Share on other sites
Antheus    2409
[quote]The interface allows for enough flexibility in most cases. E.g. pooled allocators etc. should all be possible.[/quote]

STL doesn't allow for pooled allocators due to underspecified allocator semantics. It works for specific implementations and code will be fully standard-compliant, but will mysteriously break on some compilers.

STL basically says: valid implementation may be stateful or stateless.


Linked lists in practice also suffer from another problem - search. There are two cases:
- iteration/additions/removal dominate - GC-collected vector wins
- search dominates - map/hashmap wins

Even for things like browser DOM or UI in general, one will frequently search for a group of objects fitting a criteria, requiring full sweep over entire structure with multiple results or operations.

Linked lists make sense only when the cost of copying is disproportionate to other operations or where reallocation cannot be performed at all, such as disk-based data or things like URL. In all other cases they lose even at algorithmic level. And even in these cases, bulk operations are often used (b-tree or other form of pre-computed index, sitemaps for web sites).

Share this post


Link to post
Share on other sites
rip-off    10976
[quote]
[color=#282828][font=helvetica, arial, verdana, tahoma, sans-serif][size=3][left][background=rgb(250, 251, 252)]Your original posts said one should not use lists at all, not that vectors are usually better.[/background][/left][/size][/font][/color]
[/quote]
Please quote where I said that.

Share this post


Link to post
Share on other sites
L. Spiro    25620
[quote name='Hodgman' timestamp='1337003402' post='4940069']
[quote name='rip-off' timestamp='1337002061' post='4940057']The cache performance, not the algorithmic performance, dominates modern hardware. Jumping around memory using a linked list is worst case behaviour from the cache's perspective. Algorithmic analysis assumes that accessing memory costs the same regardless of how it is done.[/quote]QFE. This is why designing your data-structures to fit into compact blobs of bytes is so important.
If you come up with compact "[url="http://bitsquid.blogspot.com.au/2010/02/blob-and-i.html"]file-formats for memory[/url]", which can use linked-lists or offsets or arrays internally, like below, then you'll actually be making proper use of your CPU. The best algorithmic structure is secondary and must be designed to fit the primary constraint of the working-set size.
[code]char* data = (char*)malloc(640*1024);//640KiB should be enough for anybody
vector<Foo,42>& fooVec = *new(data) vector<Foo,42>();
data += sizeof(fooVec);[/code]

The above is the reason game engine dev's might avoid STL containers - because the STL enforces the interface of random new/delete calls on your memory allocator. You can supply custom allocators to the STL, but they have to follow it's interface.
In contrast, many game-systems use allocation techniques such as [url="http://futurattack.googlecode.com/files/Double_Ended_Stack_allocator%20(1).pdf"]stacks[/url] and [url="http://publications.dice.se/attachments/scopestacks_public.pdf"]scopes[/url], which have a completely different interface and logical rules for allocating/deallocating. It's only in this situation, where you can't make [font=courier new,courier,monospace]std::vector[/font] use your game's allocator that you should write your own ([i]which is probably much simpler and less featureful[/i]).

N.B. if I was writing a game where systems were already using [font=courier new,courier,monospace]new[/font]/[font=courier new,courier,monospace]delete[/font], then using [font=courier new,courier,monospace]std::vector[/font]/etc would be the obvious (standard) choice. It's only when working in an environment with a completely different memory allocation strategy that you need to abandon the standard containers.
[/quote]
Glad you mentioned it before I had to be the party pooper.

In contrast with many who replied above me, I wouldn’t touch STL with a 3-inch pole.
[url="http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2007/n2271.html"]EASTL[/url]: Because Electronic Arts may make crap games, but they still know what they are doing.
Real games crave certain allocation disciplines. I even outlined stack allocators in my [url="http://lspiroengine.com/?p=400"]most recent tutorial[/url]. These things improve my performance by factors you can’t even believe, and this is typical of game development. Games need this kind of performance.

But it is not just that. While some claim that iterators are not slower than the alternatives, the reality is that this is largely compiler-dependent. Some (most) compilers always perform range checks on array access unless a specific macro is defined. While that macro (NDEBUG) is easy to define, it still doesn’t actually mean that all of your STL components will stop using that array-access check. Some might still be checking array access even with that defined.

The reason so many deviate from STL is:
#1: Everything mentioned by EASTL.
#2: Even if you think you are in control, you are not. You have no idea what is happening under the wheels. Iterations could be as fast as array access on one machine, but 3 times as long on another machine.

Making STL classes that are as fast as those provided by the most optimal STL vendors? I honestly have to wonder if that is so hard. I had no problems making faster implementations (one for POD, one for objects), and there does not seem to be any factor that outweighs overall usage efficiency. That is, by knowing how I intend to use a certain object, one efficiency major can be used over another to guarantee optimal use of custom vectors (etc.)

Simply put, I would not touch STL with a 3-inch pole.


L. Spiro Edited by L. Spiro

Share this post


Link to post
Share on other sites
l0calh05t    1796
[quote name='Antheus' timestamp='1337005930' post='4940087']
STL basically says: valid implementation may be stateful or stateless.
[/quote]

C++11 added stateful allocators as a requirement. Question is... how many years until it becomes widely available?

Share this post


Link to post
Share on other sites
hupsilardee    491
I don't use STL because I never figured out how to delete from a list while iterating through, so I made my own classes for List<>, Array<>, and Map<>

My containers allow me to do this
[code]
List<Zombie> zombies;

uint zcount = zombies.Count();
for (uint i = 0; i < zcount; i++)
{
uint bcount = bullets.Count();
for (uint j = 0; j < bcount; j++)
{
if (collide(bullets[j], zombies[i]))
{
bullets.Remove(j);
zombies.Remove(i);
}
}
}
[/code]

Share this post


Link to post
Share on other sites
alh420    5995
[quote name='hupsilardee' timestamp='1337074866' post='4940335']
I don't use STL because I never figured out how to delete from a list while iterating through, so I made my own classes for List<>, Array<>, and Map<>
[/quote]

Update your iterator with the one that is returned by erase, and it should work fine.
[code]
it = collection.erase(it);
[/code]

Just out of curiousity, how does your container handle it?
For example the fact that "bcount" and "zcount" no longer is valid after you "Remove".
Also, how does it handle that the index will "skip" one element when you erase? Edited by Olof Hedman

Share this post


Link to post
Share on other sites
rip-off    10976
[quote]
[color=#282828][font=helvetica, arial, verdana, tahoma, sans-serif][size=3][left][background=rgb(250, 251, 252)]I don't use STL because I never figured out how to delete from a list while iterating through[/background][/left][/size][/font][/color]
[/quote]
Looking at your code, you still haven't. That loop is fundamentally wrong in a number of ways. In addition to the cases Olaf highlighted, what if multiple bullets collide with a given zombie in a single frame? You'll end up removing zombies who might be nowhere near the bullet.

Standard C++ containers approach this problem by design - if using iterations, there are established idioms for erasing while iterating:
[code]


bool handleBulletCollision(vector<Bullet> &bullets, const Zombie &zombie)
{
for (auto i = bullets.begin() ; i != bullets.end() ; ++i)
{
if(collides(*i, zombie))
{
bullets.erase(i);
return true;
}
}
return false;
}

void handleZombieBulletCollisions(vector<Zombie> &zombies, vector<Bullet> &bullets)
{
auto i = zombies.begin();
while(i != zombies.end())
{
if(handleBulletCollision(bullets, *i))
{
i = zombies.erase(i);
}
else
{
++i;
}
}
}

[/code]

We could re-write the latter function something like:
[code]

void handleZombieBulletCollisions(vector<Zombie> &zombies, vector<Bullet> &bullets)
{
auto end = zombies.end();
auto i = remove_if(zombies.begin(), end, [&bullets](const Zombie &zombie) {
return handleBulletCollision(bullets, zombie);
});
zombies.erase(i, end);
}
[/code]
That implementation involves less copies being made if multiple zombies are removed in a single pass.

Share this post


Link to post
Share on other sites
hupsilardee    491
Sorry. That code was not strictly accurate, it's just that I got in the habit of storing the count like that before a loop, when using STL. Here is an actual example, from the code I used to actually test the Array<> class

[code]
for (uint i = 0; i < list.GetCount(); i++)
{
printf("List[%i] = %i\n", i, list.Get(i));
}
for (uint i = 0; i < list.GetCount(); i++)
{
if (list.Get(i) % 2 == 0)
{
list.Remove(i);
i--;
}
}
for (uint i = 0; i < list.GetCount(); i++)
{
printf("List[%i] = %i\n", i, list.Get(i));
}
[/code]

Share this post


Link to post
Share on other sites

Create an account or sign in to comment

You need to be a member in order to leave a comment

Create an account

Sign up for a new account in our community. It's easy!

Register a new account

Sign in

Already have an account? Sign in here.

Sign In Now

Sign in to follow this