The STL for game engine architecture

TimChan
What are the pros and cons of using the STL for game engine architecture? And how about writing our own "STL"? Recently I've been reading David H. Eberly's latest book on game engine architecture. The author says the STL is so slow that it is not suitable for a game engine, and that writing a small "STL" of your own is more desirable. I would rather use the STL, because I think writing and maintaining another "STL" is really time-consuming, even if it is small.

---
Always use the STL unless you can PROVE, with numbers, that it is too slow. It is faster than what you would write. Authors who claim that it is too slow are not smart people, and you should consider throwing their books away.

---
Unless you are targeting your game at a platform where the STL is unavailable, no.

I don't think there are any current consoles where the STL is not available.

Mark

---
Guest Anonymous Poster
The STL is made for general use, so in specific cases you can speed your code up by using specialized containers.
But for general use you can't beat the STL, because the STL is written under strict requirements to be very fast (there is no range checking or anything else that would slow your code down).

Until you can prove that the STL, and not your code that uses it, is the factor making your program slow, go with the STL; only then write custom containers that do better.

---
Quote:
Original post by Glak
Always use the STL unless you can PROVE, with numbers, that it is too slow. It is faster than what you would write. Authors who claim that it is too slow are not smart people, and you should consider throwing their books away.


People who make blanket statements like this without checking the primary source should be thrown away.

Anytime you use STL components (or Java's container classes) for a high-performance application, you need to understand what the components are designed to do. Much of the STL is designed to provide algorithms that meet certain constraints on asymptotic order. For example, "set" maintains a sorted collection to support O(log n) insertion, deletion, and lookup for n elements. As with any asymptotic analysis, the "order" refers to a limit as n approaches infinity, and the order has a hidden constant. When you say the positive function f(n) is O(log n), that means f(n) <= C*log(n) for some positive constant C and for all n >= n0, where n0 is "sufficiently large". The values C and n0 are typically determined from empirical evidence: you run your program for increasingly larger values of n and analyze the results. In many applications, n is *not* sufficiently large to obtain the benefits of the STL, and a hand-rolled substitute *specially designed for your application* will perform better. Computer graphics and computational geometry have always been about micromanaging your resources to boost performance; sometimes you just have to roll your own.

The STL (and Java containers) also have potential issues with memory usage and memory management. My favorite example, where both fail miserably, is "set". In my medical image applications, I have triangle meshes with millions of vertices. I need adjacency information for things like continuous level of detail, surface evolution, and intersection testing. My vertex class stored a "set" of pointers to the triangles sharing the vertex. Both STL and Java required an inordinate amount of memory to store the adjacency information. It turns out the vertex "set" members contain about 6 elements on average (this is to be expected statistically for triangle meshes extracted from 3D images). The memory overhead for "set" was massive, but designed to support asymptotic speed constraints. Switching to a simple dynamically resizable array with a default of 8 elements reduced the memory usage by half (from 1GB to 0.5GB). That said, other applications might not care about the memory overhead, in which case the STL is acceptable.
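The idea in a minimal sketch (names are illustrative; the real class has more to it): a grow-only array with a small default capacity, traded against "set"'s per-node overhead of several pointers plus bookkeeping per element.

#include <cstdlib>

class Triangle;  // adjacency targets

class TriangleArray {
public:
    TriangleArray() : mSize(0), mCapacity(8),
        mItems((Triangle**)std::malloc(8 * sizeof(Triangle*))) {}
    ~TriangleArray() { std::free(mItems); }

    void insert(Triangle* t) {
        if (mSize == mCapacity) {   // grow only when full
            mCapacity *= 2;
            mItems = (Triangle**)std::realloc(mItems, mCapacity * sizeof(Triangle*));
        }
        mItems[mSize++] = t;        // no sorting, no per-node tree overhead
    }

    int size() const { return mSize; }
    Triangle* operator[](int i) const { return mItems[i]; }

private:
    int mSize, mCapacity;
    Triangle** mItems;
};

The elements are raw pointers (PODs), so realloc is safe here; error handling is omitted for brevity. The per-container cost is one pointer and two integers, versus a tree node per *element* for a typical "set".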

A big issue with memory management for games on a console is that the designers and programmers need to specify a "memory budget" for each component. The graphics system is handed a chunk of memory, and that is all it gets. Similarly, the sound system, the AI system, and any other system of interest is given its budget. It is absolutely *essential* that each system not exceed its budget. If you use the STL, or any other set of tools that calls 'new' and 'delete' in ways you cannot predict or control, you risk exceeding your budget at an unexpected time. Chris Butcher, from Bungie, talked about these issues in detail at Game Tech in December 2004. A major problem was that their physics component (Havok) was not designed to be given a memory budget, and that caused havoc in their development of Halo 2. When I make such claims in my books, it is because I attend these conferences and pay attention to what developers are saying, and because I also get to experience some of these problems in my own company's contracts.

All this said, I do use the STL sometimes. But, like any practicing engineer, I profile my applications and monitor memory usage. If STL code is the bottleneck for speed or memory, I try to understand what my code is doing to cause that, and I roll my own substitute, testing it to make certain it does better than the STL.

---
Try a custom allocator before writing your own container if you need extra speed. Chances are that allocation is the aspect a custom container could optimize most anyway.

As a real world example, my bot (www.omni-bot.com) uses A* for pathfinding, and the STL for the internal data structures. My open list is a std::vector; my closed list is a stdext::hash_map.

Performance was pretty good; the STL structures haven't been hot spots in my profiles for a while. Still, I wanted to experiment with custom allocators to see how much extra performance I could get from two extra lines of code.

For my benchmark I wrote a double for loop that ran A* from every waypoint to every other waypoint. This gives a real-world performance measure, not some meaningless benchmark of inserting into a container a million times in a loop.

Anyways, I ran this benchmark on the largest map I had waypointed at the time.

Here are the results using the raw data structures with the default allocators:

with:
typedef stdext::hash_map<int, Waypoint*> WaypointHashMap;

generated 333506 paths in 59.764768 seconds: 5580.311141 paths/second

And the results after simply switching to boost::fast_pool_allocator:
typedef stdext::hash_compare<int> HashMapCompare;
typedef boost::fast_pool_allocator< std::pair< const int, Waypoint* >, boost::default_user_allocator_new_delete, boost::details::pool::default_mutex, 512 > HashMapAllocator;
typedef stdext::hash_map<int, Waypoint*, HashMapCompare, HashMapAllocator > WaypointHashMap;

generated 333506 paths in 46.905756 seconds: 7110.129485 paths/second

Personally, I'd much rather use one or two lines of code and get a ~20% performance improvement (or more; this is the only allocator I've tried so far) than attempt to write a custom container that will end up less flexible than the STL.

Understanding the STL, using the right containers for the job, and trying custom allocators when you need more control over speed and memory allocation are, IMO, the bases you should cover before reinventing any of it.

My 2c

[Edited by - DrEvil on October 23, 2005 11:19:25 AM]

---
Quote:
Original post by DrEvil
As a real world example, my bot (www.omni-bot.com) uses A* for pathfinding, and the STL for the internal data structures. My open list is a std::vector; my closed list is a stdext::hash_map.

You are using a std::vector as an O(n) priority queue and tell others about performance? ;)
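For what it's worth, the standard library can already turn that std::vector into an O(log n) priority queue via the heap algorithms; a minimal sketch (the node type is illustrative):

#include <algorithm>
#include <vector>

struct Node {
    int   waypointId;
    float fCost;  // estimated total path cost
};

// push_heap/pop_heap maintain a max-heap, so invert the comparison
// to pop the *lowest* f-cost node first.
struct HigherCost {
    bool operator()(const Node& a, const Node& b) const {
        return a.fCost > b.fCost;
    }
};

void pushOpen(std::vector<Node>& open, const Node& n) {
    open.push_back(n);
    std::push_heap(open.begin(), open.end(), HigherCost());  // O(log n)
}

Node popBest(std::vector<Node>& open) {
    std::pop_heap(open.begin(), open.end(), HigherCost());   // O(log n)
    Node best = open.back();
    open.pop_back();
    return best;
}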

---
Quote:
Original post by Dave Eberly
My vertex class stored a "set" of pointers to the triangles sharing the vertex. [...] The memory overhead for "set" was massive, but designed to support asymptotic speed constraints. Switching to a simple dynamically resizable array with a default of 8 elements reduced the memory usage by half (from 1GB to 0.5GB).

A dynamically resizable array, like std::vector? Just because you didn't pick the right container in the first place doesn't mean you need to move outside the standard library.

Quote:

A big issue with memory management for games on a console is that the designers and programmers need to specify a "memory budget" for each component. [...] If you use the STL, or any other set of tools that calls 'new' and 'delete' in ways you cannot predict or control, you risk exceeding your budget at an unexpected time.

Which can be addressed by writing a budgeted allocator, which is a lot easier than writing an entire set of your own container classes.
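A minimal sketch of the idea (the budget bookkeeping is illustrative, the remaining C++98 allocator boilerplate such as rebind, construct, and destroy is omitted, and note that C++98/03 containers are not strictly required to support stateful allocators):

#include <cstddef>
#include <new>

// Shared budget for one subsystem (graphics, sound, AI, ...).
struct Budget {
    std::size_t used;
    std::size_t limit;
};

template <typename T>
class BudgetAllocator {
public:
    typedef T value_type;
    // ...pointer, reference, size_type, rebind, etc. omitted...

    explicit BudgetAllocator(Budget* b) : budget(b) {}

    T* allocate(std::size_t n) {
        std::size_t bytes = n * sizeof(T);
        if (budget->used + bytes > budget->limit)
            throw std::bad_alloc();  // refuse to exceed the subsystem's budget
        budget->used += bytes;
        return static_cast<T*>(::operator new(bytes));
    }

    void deallocate(T* p, std::size_t n) {
        budget->used -= n * sizeof(T);
        ::operator delete(p);
    }

private:
    Budget* budget;
};

The point is that the budget check lives in one place, and every container handed this allocator is automatically confined to its subsystem's chunk.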

Quote:

All this said, I do use the STL sometimes. But, like any practicing engineer, I profile my applications and monitor memory usage. If STL code is the bottleneck for speed or memory, I try to understand what my code is doing to cause that, and I roll my own substitute, testing it to make certain it does better than the STL.

I notice you leave out the step where you try a different STL type, allocator or algorithm. Does that mean you don't try it?

Now I haven't read your book, but I hope you present better arguments there than you did in your post here, because all I see here are reasons to learn to use the standard library better, not good reasons for writing your own containers.

---
I've talked with a programmer from Retro Studios who worked on Metroid Prime, and apparently they did not use the STL because they found it too slow. If I remember correctly, though, they were compiling with Visual C++ 6.0, so I'd imagine that could definitely have been one of the sources of their problem. They also decided not to use any exceptions at all. Keep in mind that how good your STL implementation is depends on your compiler and the library implementation it provides.

Really though, I'd still recommend using the STL. Unless you can write a container with different complexity guarantees or added functionality that fits your needs better than the STL containers do, use it. If you need a dynamically sized array, use a vector (unless it's a vector of bool); if you need a doubly-linked list, use a list; and so on. The STL does not provide solutions to all problems, but the solutions it does provide are [generally] designed very well, and their implementors are probably more experienced and have spent more time on them than you. Not to mention that they can use non-portable optimizations that apply only to the specific compilers they ship with, and their code is heavily bug-checked.

---
Guest Anonymous Poster
You already said it: they used VC6. VC6 is now almost 8 years old; it was released the same year the C++ standard was finalized (and even back in 1998 it was a very bad compiler). Any conclusion drawn from the output of a compiler as bad as VC6 is garbage.

---
In the past there were valid reasons to write your own STL equivalent, generally related to how compilers handled templates. Code bloat used to be a real problem, and if you were developing for a console this was of particular concern. New compilers are very good at reusing code, so good in fact that using templates can result in smaller code than hand-written specialized code.

Most of the other issues with STL performance really have nothing to do with the STL itself, but with how people use it. Just read SiCrane's post for an example. You need to know what to use and when to use it. Incorrect usage of any library, be it yours, the STL, or some other library, will result in inefficient code.

I only wish I had Dave Eberly's math skills, but I completely disagree with his opinion of the STL. Correctly used, with a good compiler, the STL is an exceptionally fast, optimized, generic library. If you can use it, then do so, and don't waste time trying to write your own; use that time to learn correct usage of the STL.

Edit: Fixed link to SiCrane's post
Edit: Why can't I fix this link!

---
Glak, David Eberly is not "stupid", and his books bridge the very deep rift between cursory "Game Programming in 21 Days!" titles and what it takes to create a commercial-quality engine.

That said, attacking the STL around here is like throwing stones at a hornets' nest!

Quote:
Original post by Dave Eberly
Anytime you use STL components (or Java's container classes) for a high-performance application, you need to understand what the components are designed to do.


Please don't make it sound like the C++ STL is comparable in performance to the Java container classes. The templated nature of the STL allows for much more optimization, and the containers have significantly less overhead. It also allows for customization of the STL containers, which is very important for the memory concern under discussion.


Quote:
Original post by Dave Eberly
A big issue with memory management for games on a console is that the designers and programmers need to specify a "memory budget" for each component. The graphics system is handed a chunk of memory, and that is all it gets. Similarly, the sound system, the AI system, and any other system of interest is given its budget. It is absolutely *essential* that each system not exceed its budget. If you use the STL, or any other set of tools that calls 'new' and 'delete' in ways you cannot predict or control, you risk exceeding your budget at an unexpected time.


This is what I do not understand. You can control how all STL containers allocate and deallocate memory. Every one takes an allocator as a template parameter, and optimized, pre-built, STL-compatible allocators are available in Boost, so you don't have to write your own. Only the default allocator uses new and delete.
You could also use boost::array, which gives you a std::vector-compatible interface to a fixed-size array, so you can still use all the STL algorithms on it.
You can also .reserve() space in the non-associative containers (e.g. vector) ahead of time to prevent multiple allocations.
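For instance (boost::pool_allocator and boost::array are real; the element type here is illustrative):

#include <vector>
#include <boost/array.hpp>
#include <boost/pool/pool_alloc.hpp>

struct Particle { float x, y, z; };

int main() {
    // The allocator is a template parameter: this vector never touches
    // the default new/delete, it draws from Boost's pool instead.
    std::vector<Particle, boost::pool_allocator<Particle> > particles;

    // One up-front allocation instead of repeated growth.
    particles.reserve(1024);

    // Fixed-size array with an STL-compatible interface (begin(),
    // end(), size()), so the standard algorithms work on it too.
    boost::array<Particle, 64> scratch;
    (void)scratch;
    return 0;
}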


Quote:

Chris Butcher, from Bungie, talked about these issues in detail at Game Tech in December 2004. A major problem was that their physics component (Havok) was not designed to be given a memory budget, and that caused havoc in their development of Halo 2. When I make such claims in my books, it is because I attend these conferences and pay attention to what developers are saying, and because I also get to experience some of these problems in my own company's contracts.


That's a more troublesome issue. It is a fundamentally different problem when you have constrained memory. General solutions assume you have sufficient memory to solve the problem at hand; with an insufficient amount of memory you can no longer find the optimal solution, and you have to start using heuristics to select the best solution from what you can calculate. Tic-Tac-Toe AI vs. Chess AI.

To me, that says the Havok engine does not scale, or the Halo 2 team was asking for too much precision given their constrained environment. Did they say how they solved the problem?

---
Quote:
The memory overhead for "set" was massive, but designed to support asymptotic speed constraints. Switching to a simple dynamically resizable array with a default of 8 elements reduced the memory usage by half (from 1GB to 0.5GB).


Otherwise known as a "vector"?

Seems to me the choice of STL container was the problem in this case, not the STL itself. If the type of object you are storing is, say, 12 bytes, a vector (dynamic array) of those things would be a little over numElements * 12 bytes in size (allowing for capacity being greater than size, and perhaps a few bytes of vector object overhead too). A set would have the overhead of two pointers and a colour enumeration per element, probably 12 bytes in total, making the size of each element 24. Yes, that doubles the size of your container. The moral of the story: prefer vector to set when memory use is more important than lookup speed. Fairly basic stuff.

If you put lawnmower oil in a Formula One race-car, you're not going to get much performance out of it.

[Edited by - SunTzu on October 24, 2005 4:15:47 AM]

---
Quote:
Original post by Anonymous Poster
The STL is made for general use, so in specific cases you can speed your code up by using specialized containers.


I know you made a fair comment about the STL, but I would just like to point out a recurring misconception, a "common myth": just because something is generalized does not mean it cannot take advantage of optimizations for *special cases*.

This is exactly the case for the STL. By its very nature, with templates virtually all information about a type is available at compile time, and because the library is standardized, implementors can take advantage of their own compilers while still providing an invariant interface to the client.

I've been through this so many times before that it gets boring. There are many advanced techniques for exploiting *special cases* that have been around for a long while now, and your average Joe Bloggs C++ programmer just has no clue about them. Hence your average Joe Bloggs C++ programmer has virtually no chance of writing a customized replica that does any better (and by the way, the STL is already customizable).

Just some of the things a typical implementation does:


  • Containers are parameterized by allocator type, allowing custom memory models/schemes, and the allocator concept separates allocation/deallocation from construction/destruction for efficiency reasons (as in the case of std::vector). I bet half of the self-proclaimed C++ programmers out there don't know how to do this, or even that you could.


  • An allocator type is typically an empty type, since allocators are most likely stateless, yet stored as a data member it may still take up space. Most modern C++ compilers support an optimization called empty member/base optimization (EMO/EBO), and most library implementors take advantage of it. Did you even know this problem and its solution existed? I bet you didn't.

  • Each container employs a certain level of exception-safe code, typically applying the resource acquisition is initialization (RAII) idiom in a slightly different way from the norm, without sacrificing performance. When you wrote your custom solution, did you even think about exceptions being thrown, such as on allocation failure, among other things? I bet you didn't.


  • Wherever possible, optimizations for POD types (POD = Plain Old Data, a technical term used in the standard) and/or contiguous memory are used, and this all happens at compile time.

    The canonical example is std::copy: wherever possible it dispatches (at compile time) to memmove (not memcpy, because the input and output ranges are permitted to overlap); failing that, a different method of iteration can be used for random-access iterators, one that is a candidate for compiler optimizations such as loop unrolling; failing that, the most general method of copying is used. std::vector typically does something similar internally.

    This is done with a simple template metaprogramming technique called type/tag dispatching, using compile-time traits, in particular iterator traits and type traits (see the sketch after this list). Now, most people don't know what a POD type is, let alone anything about template metaprogramming.


  • Most generic algorithms take advantage of the particular iterator category and dispatch (at compile time) to a better implementation for those *special cases*, again using compile-time tag/type dispatching with iterator and/or type traits.


  • Even the containers use some form of template metaprogramming internally, such as tag/type dispatching.
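Here is the promised sketch of tag dispatching for a simplified copy (real implementations are more elaborate; the fast path here is shown only for pointer ranges, where a real library would additionally use type traits to prove the element type is safe to move as raw bytes):

#include <cstring>
#include <iterator>

// General path: works for any input iterator.
template <typename InIt, typename OutIt>
OutIt copy_impl(InIt first, InIt last, OutIt out, std::input_iterator_tag) {
    for (; first != last; ++first, ++out) *out = *first;
    return out;
}

// Fast path: contiguous elements moved as raw bytes.
template <typename T>
T* copy_impl(T* first, T* last, T* out, std::random_access_iterator_tag) {
    std::memmove(out, first, (last - first) * sizeof(T));
    return out + (last - first);
}

template <typename InIt, typename OutIt>
OutIt my_copy(InIt first, InIt last, OutIt out) {
    // The iterator category selects the overload at compile time.
    typedef typename std::iterator_traits<InIt>::iterator_category Cat;
    return copy_impl(first, last, out, Cat());
}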



I could go on, but I think I've covered the major ones, so you get the point. For the average Joe Bloggs C++ programmer who does not own a copy of the standard and has no clue about these advanced techniques and idioms in C++ (which take about ~3 years to learn properly): don't waste your time. If it's there to use, use it.

By the way: yes, std::vector on modern C++ compilers pwns Doom 3's equivalent.

[Edited by - snk_kid on October 24, 2005 6:38:17 AM]

---
Hey,

Yeah, there have been some great discussions about STL use in games here in the past. A Google search [I dunno, even 'site:gamedev.net STL'] should turn up a few of them; some contain points that may not have been raised here yet.

That said, I'd advise using the STL _until_ there's a compelling reason not to. I have rolled my own list classes in my current project, but only because they provide functionality that IIRC doesn't exist in the STL. That, and some STL containers are a bitch to get working with the memory management scheme I was using at one point. Mine are nowhere near as fast as the STL ones [for obvious reasons] at STL-style operations, but they let the functionality I actually use a lot be used quickly and cleanly.

But yeah, even a basic set of your own containers is a bitch to maintain. And it isn't necessarily faster [then again, I'm not exactly writing my stuff for speed yet, and I am in no way an optimisation whiz].

--CJM

---
My opinion is to use the STL whenever possible.
As far as performance goes, the numbers (assuming you write your own core libraries, and that they are fast) differ very little between the STL and your own library, but you will have wasted a lot of time coding it.
The STL offers good, usable types with little or no hassle.
I'm currently coding an engine in C, where you must write your own type libraries (hashtables and such), and it's generally a pain.
If you still want to create your own STL, do it for study purposes, or if you have a lot of time on your hands ;)

---
Quote:
My opinion is to use the STL whenever possible.

Agreed. Spend the time learning how/when/why to use the STL instead of coding a replacement.
Quote:
New compilers are very good at reusing code, so good in fact that using templates can result in smaller code than hand-written specialized code

I've heard this to be true (i.e. all templates instantiated with a pointer type can share the same code), but I haven't actually seen it working.

In one (PS2) game (Battlefield - Modern Combat) I managed to get the .elf size down from 6.5 MB to just over 3 MB by modifying just two class templates.
One was my own (~1 MB); the other was std::basic_string (~2 MB).

A common misconception is that templates = inlined code, and for the STL implementation we were using (STLport) this was indeed true.

We had only two instantiations of std::basic_string: char and unsigned short (for Unicode strings).
So I split the longer std::basic_string methods (operator+, substr, find, etc.) into two header files (think .h and .cpp), then used explicit instantiation of the two types we used; all other source code is unaffected by this.

Now, whenever I create a new templated class, I tend to split it into two files and use explicit instantiation.
In fact, the new .elf was faster than the old one (it fitted more code in the cache, I guess).
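A minimal sketch of that split (file names and the class are illustrative):

// String.h - declarations only.
#include <cstddef>

template <typename CharT>
class MyString {
public:
    MyString<CharT> substr(std::size_t pos, std::size_t n) const;
    // ...
};

// String_impl.h - method bodies, included ONLY by String.cpp.
template <typename CharT>
MyString<CharT> MyString<CharT>::substr(std::size_t pos, std::size_t n) const {
    /* ... */
}

// String.cpp - the two instantiations the game actually uses are
// compiled exactly once, instead of being re-instantiated (and
// potentially inlined) in every translation unit that uses them.
#include "String.h"
#include "String_impl.h"
template class MyString<char>;
template class MyString<unsigned short>;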

There was also a source file with about 50 std::map<>::find calls (in a setup function), each and every one of which was inlined by the compiler (I think the compiled code for that one function was around 128 KB!).
So I simply wrapped the map.find() call in another method and instead got 50 function calls, reducing the code size to nearly nothing.
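A sketch of that kind of wrapper (the map type and the default value are illustrative):

#include <map>
#include <string>

typedef std::map<std::string, int> SettingMap;

// Defined in a .cpp file: each of the ~50 call sites becomes a plain
// function call instead of an inlined copy of the tree search.
int findSetting(const SettingMap& settings, const std::string& key) {
    SettingMap::const_iterator it = settings.find(key);
    return it != settings.end() ? it->second : -1;
}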

If you're aware of these small things, the STL and templated classes can be very effective.

Whenever I use a method like find, erase or insert, I tend to think: "Do I want 1 KB of inlined code right here?" If the answer is no (I'm using the same method again later, etc.), I create a wrapper function (which is more or less the same as letting the compiler decide when to inline).

Just my 2c

---
Quote:

I've heard this to be true (i.e. all templates instantiated with a pointer type can share the same code), but I haven't actually seen it working.


I've seen the VC7.1 toolchain employ something (apparently called "identical COMDAT folding", the linker's /OPT:ICF optimization) where it maps the entry points of functions that compile to identical machine code to the same address, saving space by not duplicating the code. This was happening largely with templated classes, but also with free functions. I imagine there are other methods compilers could use to reduce code bloat as well.

---
Quote:
Original post by SiCrane
A dynamically resizable array, like std::vector? Just because you didn't pick the right container in the first place doesn't mean you need to move outside the standard library.


The choice was a standard array. My application only inserts, never removes, so the array need only grow when necessary. The initial number of elements was chosen based on knowledge of the application's memory patterns, and in nearly all cases there is never a need to reallocate (to increase the number of elements). My array has only the storage overhead of the array pointer itself. The std::vector that ships with MSVC++ 7.1 stores three pointers, which leads to much more overhead when you have a very large number of containers (as my application did). The amount of time to implement mine? About 15 minutes. I am quite certain I chose the right container (mine), and I am quite certain that nothing in the STL will do better for this specific task.

Quote:

Which can be addressed by writing a budgeted allocator, which is a lot easier than writing an entire set of your own container classes.


Actually, no. Writing a budgeted allocator that also takes advantage of specific knowledge of the engine (graphics, physics, etc.) takes a lot of time, much more than writing the logic for some basic container classes.

Quote:

I notice you leave out the step where you try a different STL type, allocator or algorithm. Does that mean you don't try it?


I did try it (see the comments about the standard array), but did not see any reason to mention it in my post.

Also, one item I mentioned, and which I have not seen arguments for or against, is that many STL algorithms are designed to achieve a specified asymptotic order. If I recall correctly, some of the orders are *requirements* for anyone implementing an STL. Applications with "large N" should benefit quite well from some STL algorithms, but for applications with "medium-size N" you really do have to experiment and profile. I cannot imagine that you would take the stance "Always use quicksort instead of bubble sort because quicksort is O(N log N) but bubble sort is O(N^2)". For small N, bubble sort will outperform quicksort (the break-even N will, of course, depend on the implementation). Why would I go with quicksort for small N when I know I can do better?

Yes, you can use specialized allocators if memory management is an issue, but you cannot change the algorithms used in the STL. If I can develop a better algorithm, I certainly will use it.

Quote:

Now I haven't read your book, but I hope you present better arguments there than you did in your post here, because all I see here are reasons to learn to use the standard library better, not good reasons for writing your own containers.


I am sorry that you believe my arguments stem from a lack of knowledge of the STL. But until you have walked in my shoes and worked on the applications I have worked on, you do not have enough information to make that judgment. I will not make the same judgment of you: if the STL always suits your needs, great, go for it. It does not always suit mine, and that is not the result of a lack of knowledge or experience.

---
Dave Eberly, I have the utmost respect for you, but I do have some arguments against some of the comments in your last post, so don't take this the wrong way; it's just constructive criticism.

Quote:
Original post by Dave Eberly
The choice was a standard array. My application only inserts, never removes, so the array need only grow when necessary. The initial number of elements was chosen based on knowledge of the application's memory patterns, and in nearly all cases there is never a need to reallocate (to increase the number of elements).


std::vector doesn't typically reallocate until its size reaches its capacity (it separates allocation/deallocation from construction/destruction for efficiency). When std::vector does grow, it typically grows to roughly twice its previous capacity; std::vector is typically implemented with an exponential growth strategy.

If you want to subvert the automated growth, you can use std::vector::reserve, which will allocate a chunk of uninitialized memory in advance, ready and waiting.
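A small demonstration of the difference (the growth factor is implementation-dependent):

#include <cstddef>
#include <cstdio>
#include <vector>

int main() {
    std::vector<int> a;
    for (int i = 0; i < 100; ++i) {
        std::size_t before = a.capacity();
        a.push_back(i);
        if (a.capacity() != before)  // a reallocation happened
            std::printf("grew to capacity %u at size %u\n",
                        unsigned(a.capacity()), unsigned(a.size()));
    }

    std::vector<int> b;
    b.reserve(100);      // one allocation up front...
    for (int i = 0; i < 100; ++i)
        b.push_back(i);  // ...so no reallocation happens here
    return 0;
}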

Quote:
Original post by Dave Eberly
My array has only the storage overhead of the array pointer itself. The std::vector that ships with MSVC++ 7.1 stores three pointers, which leads to much more overhead when you have a very large number of containers (as my application did).


Are you talking about Wm3::TArray? From what I can see, it stores 3 integers and 1 pointer. Or did you mean Wm3::TSharedArray? There I can see 1 integer, 1 pointer, and a virtual destructor, so most likely a hidden vptr member as well.

There is a valid reason why most implementations of std::vector store 3 pointers: std::vector separates allocation/deallocation from construction/destruction (for efficiency).

Typically the first pointer refers to the start of the array, the second to one past the last in-place constructed element, and the third to the end of the storage (the region between the finish pointer and the end-of-storage pointer is uninitialized), so:

finish - start = vector's size
end_of_storage - start = vector's capacity

It doesn't need to store the size explicitly, since the size can be (and is) calculated in constant time from the pointers it stores.
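In code, that layout looks roughly like this (member names vary between implementations):

#include <cstddef>

template <typename T>
class vector_sketch {
    T* start;           // first element
    T* finish;          // one past the last constructed element
    T* end_of_storage;  // one past the end of the allocation
public:
    std::size_t size() const     { return finish - start; }
    std::size_t capacity() const { return end_of_storage - start; }
    // push_back constructs in place at 'finish' while
    // finish != end_of_storage; otherwise it reallocates.
};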

Quote:
Original post by Dave Eberly
The amount of time to implement mine? About 15 minutes.


Don't take this the wrong way, but from what I saw of the implementation, it does have some issues as a class template. There are optimizations available that it simply does not take advantage of, unlike std::vector on modern C++ compilers.

Quote:
Original post by Dave Eberly
Actually, no. Writing a budgeted allocator that also takes advantage of specific knowledge of the engine (graphics, physics, etc.) takes a lot of time, much more than writing the logic for some basic container classes.


This is a good point, but it is not necessary to write your own allocator types if you don't mind using a library or compiler extensions; GCC, for example, ships something like 6-8 different STL-compliant allocator types for anyone to use.

Quote:
Original post by Dave Eberly
Also, one item I mentioned, and which I have not seen arguments for or against, is that many STL algorithms are designed to achieve a specified asymptotic order. If I recall correctly, some of the orders are *requirements* for anyone implementing an STL.


The C++ standard does state time/space complexities for generic algorithms, but I think that is a good thing: it helps in writing portable code across different C++ compilers, and it helps a client choose the right algorithm for the job, if one exists.

Quote:
Original post by Dave Eberly
Applications with "large N" should benefit quite well from some STL algorithms, but for applications with "medium-size N" you really do have to experiment and profile. I cannot imagine that you would take the stance "Always use quicksort instead of bubble sort because quicksort is O(N log N) but bubble sort is O(N^2)". For small N, bubble sort will outperform quicksort (the break-even N will, of course, depend on the implementation). Why would I go with quicksort for small N when I know I can do better?


The C++ standard library provides a whole family of sort routines to choose from. Which algorithm is employed for each is implementation-defined; however, on most typical implementations none of them uses the original quicksort algorithm (excluding C's library here).

std::sort is typically implemented with introsort. Introsort is essentially a median-of-three quicksort that switches to heapsort when the recursion gets too deep, which gives it a worst-case complexity of O(N log N).

[Edited by - snk_kid on October 25, 2005 7:33:21 AM]

---
Am I dumb? It seems to me that everybody is basically saying the same thing: use the STL until your own problems show you that you should write your own containers.

Dave Eberly's particular use of the STL (a lot of very small containers) showed him that his problem could not be solved with the STL, so he decided to create his own set of containers. Changing the allocator will not change the number of pointers in the Dinkumware (VC++) implementation of std::vector<>.

There is one particular thing about the STL that hasn't been said here: the STL is not only a collection of containers, it is also a collection of algorithms that run on iterators (not on the containers themselves). As a consequence, the algorithms are not plagued by these memory problems. Now, if you create your own small containers and still want to use the STL algorithms, you'll have to provide your own iterator implementation (this is where things begin to get weird). You can also decide to avoid the algorithms, but in most cases you'll have to rewrite at least some of them, and you won't easily match their speed.
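Worth noting, though: for a contiguous container the iterator can simply be a raw pointer, which every STL algorithm accepts. A minimal sketch (the container is illustrative):

#include <algorithm>

template <typename T>
class SmallArray {
public:
    typedef T* iterator;  // a raw pointer is a valid random-access iterator
    iterator begin() { return items; }
    iterator end()   { return items + count; }
    // ...insertion, growth, etc. omitted...
private:
    T* items;
    int count;
};

void demo(SmallArray<int>& a) {
    std::sort(a.begin(), a.end());  // STL algorithms work unchanged
}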

What is the correct solution, then?

Regards,

---
Guest Anonymous Poster
Quote:
Original post by Dave Eberly
I am sorry that you believe my arguments stem from a lack of knowledge of the STL. But until you have walked in my shoes and worked on the applications I have worked on, you do not have enough information to make that judgment. I will not make the same judgment of you: if the STL always suits your needs, great, go for it. It does not always suit mine, and that is not the result of a lack of knowledge or experience.


But how much experience of actually making games? Medical imaging applications aren't games, the last time I looked, although admittedly I haven't been in a hospital for a while.

Of particular note, regarding your post about the overhead of std::vector: on the consoles (e.g. PS2, Xbox) the memory allocator returns blocks with a minimum size of 16 bytes (for alignment), so your 'optimised' vector with only one pointer will take up exactly the same amount of memory (16 bytes) as the standard STL version.

Before I go further I should point out that I have all of your books, and have learnt a great deal of maths techniques and theory from them, for which I'm greatly indebted. But I have always been gravely concerned about their practical use in commercial games. To take an example: no one in their right mind would write a commercial 3D game on a console (or PC, for that matter) using Matrix and Vector classes that are just arrays of floats. Doing this on a PS2 is just suicide. The total lack of any kind of SIMD maths in your libraries or any of your books is a good indication to me of how out of touch you are with real (commercial console) game development.

On the subject of your quote from Bungie about their problems using Havok due to its use of the STL: actually, having worked on integrating Havok into a PS2/Xbox game, I can say that these problems can be overcome with a custom memory allocation system for its container classes, specifically small memory pools for allocations of less than 256 bytes, and other similar techniques to eliminate fragmentation. Since Havok uses its own custom container templates (derived from the STL), this is a relatively straightforward task. That Bungie didn't realise this straight away is a little worrying, as Havok were quite clear with me from the outset that this would be necessary for good performance.

I've been writing console games since the PS1, and have used the STL on the PC, Xbox, PS2, and now the PS3. I have occasionally needed to implement my own containers for specific tasks, and when I did, they were better than the STL versions for those tasks; but for the vast majority of things I use std::vector, list, map, etc., as they are fast and dependable. I always use custom allocators with fixed-size memory pools for the different components of the game, but that goes without saying.

In summary: claiming that the STL is not suitable for console game development (as you do in your book) is wrong, and in my opinion grossly misleads the young and impressionable upcoming game developers who read your books. Not only that, but using the limited resources and the (invalid) claim of limited STL support on consoles as reasons not to use it is extremely ironic to me, since the entirety of your code library and game engine architecture is completely unusable for a commercial console game anyway.
