STL Performance Thread

79 comments, last by ApochPiQ 14 years, 1 month ago
A discussion regarding STL performance started here: http://www.gamedev.net/community/forums/topic.asp?topic_id=563033t After discussing with Sneftel, we agreed it would be more appropriate to create a new thread.
Quote:Original post by thatguyfromthething Sure, it will be faster than what someone who is a novice programmer writes, but how many times have you seen people using stl drop down to a straight c array? If you need to do that more than one in a thousand times you suffer all the penalty of if you didn't use it at all, plus the penalty of huge learning curve and million problems and quirks.
If you're using an array that will always be the same size, you're correct that a vector is not the proper choice. Something like boost::array<> is a better choice to avoid the extra allocations. Is this a flaw in the STL or in the developer who simply used the incorrect data structure? I vote the latter. More to the point, though, this smells of premature optimization. The point of code is not to be as fast as possible, simply fast enough. Therefore, if it's not a bottleneck in your code, why bother? "A huge learning curve and a million problems and quirks" is a blanket statement that I have not seen borne out in practice. Can you elaborate on this a bit more?
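To make the distinction concrete, here's a minimal sketch (the sizes and names are made up for illustration):

#include <boost/array.hpp>
#include <vector>

int main()
{
    // Size known at compile time: boost::array is a thin wrapper around
    // a C array -- no heap allocation, but still a standard-style container.
    boost::array<float, 16> matrix = {{ 0 }};   // e.g. a 4x4 transform

    // Size varies at runtime: vector is the right tool, and reserve()
    // avoids repeated reallocation when an upper bound is known.
    std::vector<float> samples;
    samples.reserve(1024);
    for (int i = 0; i < 1024; ++i)
        samples.push_back(static_cast<float>(i));

    return (matrix.size() + samples.size() > 0) ? 0 : 1;
}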
Quote:Original post by thatguyfromthething To write a good vector class as an example, it's really an exercise in memory management. That's too much for the average guy to tackle in a realistic timeframe, but the gains you can make are enormous even for vector class if you know what you are doing. Like if you might want your memory aligned properly by default or to have some SIMD get generated out (and no, this will never EVER happen automatically in your code).
I think you're speaking of specific cases. STL is not a golden hammer that solves every problem, but it's fair to say it solves most. Only after profiling revealed the vector to be a bottleneck would I consider writing my own.
Quote:Original post by thatguyfromthething And that's just easily measurable speed you can improve, not the insidious problems caused by code size bloat when you have everything defined as multiple templates defined in terms of each other, which template on multiple items.
Why is this an insidious problem? A larger exe is not really a big deal in most cases.
Quote:Original post by thatguyfromthething Not to mention where is the stl kdtree, hashtable, octree, bsp, cover tree, ball tree, rtree, btree, rtree, spatial index etc. etc. etc.? I mean, not even a hash table? Seriously? So how far can you go with game programming with just a vector and a map?
A hash table is in the new standard (TR1) and available under most compilers at this point, or via STLport: unordered_map and unordered_set. There are tree and graph containers available elsewhere (TCL, Boost, etc...). STL doesn't cover every possible collection, sure, but I don't see how that means that STL should be avoided.
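Usage looks just like map. A minimal sketch (the header location varies by compiler; GCC puts the TR1 headers under tr1/, others expose <unordered_map> directly):

#include <string>
#include <tr1/unordered_map>   // plain <unordered_map> on some compilers

int main()
{
    // TR1's hash table: average O(1) lookup, map-like interface.
    std::tr1::unordered_map<std::string, int> hitpoints;
    hitpoints["orc"] = 25;
    hitpoints["dragon"] = 400;
    return hitpoints.count("dragon") == 1 ? 0 : 1;
}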
Quote:Original post by thatguyfromthething Also not mentioning crossplatform compatibility. Linux? Who cares? Has anyone ever made a cent supporting linux?
Yes.
Quote:Original post by thatguyfromthething Mac support is nice I guess but then there's what? 5000 other platforms? Wouldn't you like to have code work on more than 1% of those platforms without additional tweaks? I have sat next to guys on multiple occasions who were trying to port stl over to other platforms, and for the time spent you may as well make your own - especially since there's not a hell of a lot in there anyway. Even if an implementation is lame you will be able to change it any time you want easily. I've gone through millions of lines of other people's code and hacked away but there's no way I could go into stlport source and figure out thing one of what's going on without searching around for hours.
Is there a specific platform you've had trouble with? I've used STL on Win32, Mac, various flavors of Linux, HP-UX, VxWorks and Nucleus OS. Of course, there are some platforms where STL may not be ideal because of a weaker compiler, so you make adjustments for those cases. Those tend to be the minority of cases, not the majority.
Quote:Original post by thatguyfromthething A template library written almost entirely in macros! The irony of that is astounding. That's exactly what templates are there to replace, and if you need or want to resort to macros chances are you are seriously abusing templates and have a poor design, which I strongly believe is the case.
If you're claiming that some implementations make use of macros, ok... true. I don't see your point. Templates aren't a drop-in replacement for macros. They're used to cover cases where macros were being abused and provide benefits that macros simply don't support. They don't cover all functionality covered in macros. The C++ standard does not require the use of macros in implementing the library either, so your statement simply doesn't hold water.
Quote:Original post by thatguyfromthething So we get this kind of lame vector class, some math utilities that are far from optimal, a sort that is slow and has a ridiculously cumbersome interface using function points. Iostreams? Probably worst implementation idea in history. Except iterators, of course, which is the other big 'selling point' that is supposedly 'object oriented'.
The vector<> class is not lame by any means. That's just silly. If you're comparing ease of use against Java or C#, note that they had time to observe what was done and improve upon it.
Quote:Original post by thatguyfromthething ... rant removed ... But, templates are the opposite of OOP. Instead of unifying your interface you are shattering it completely and making it impossible to tell what code is actually generated. WHAT GETS GENERATED? This is very hard to track down and is mostly left up to the people making the compiler, and for anything where you care about performance this is an issue. And if you care about more than one compiler, you get to track it down more than once, and you get things working great on one but not the other.
I disagree. I think the correct approach is to first identify that you have a performance problem using a profiler, then make adjustments to fix it. Why fix non-problems? If there's no performance problem, why does the generated code matter? That's the whole principle of abstraction.
Quote:Original post by thatguyfromthething I have gotten surprised countless times at the stupid things compilers will do, especially with templates. Which is not to say not to use them at all because with C++ you are somewhat forced into it, but when you template on serveral items and do all the cute tricks you see in stl and boost, it is almost totally hopeless to ever track down what's really going on at a low level and you will see sudden performance drops for seemingly no reason. From the stupid comments I see all the time it's clear most people using stl and boost don't even know how templates really work or how the compiler works at all.
You can trace into STL and Boost. I agree it's more difficult than having the code entirely exposed, but it's certainly doable. The extra difficulty is offset by the fact that this is an infrequent task. If performance is dropping for seemingly no reason, maybe you need to question what you're actually doing and whether STL or Boost is really the culprit.
Quote:Original post by thatguyfromthething So obviously it's terrible if you really care about performance
Not obvious, no.
Quote:Original post by thatguyfromthething and it doesn't have the data structures you would care about for making anything but some kind of business app without resorting to another API.
STL doesn't have a lot of the functionality in other libraries, I agree. Boost and other libraries fill in these gaps. While not ideal, it's hardly a lost situation and certainly not a reason to avoid STL altogether.
Quote:Original post by thatguyfromthething And it's it's the antithesis of OOP because instead of simplifying interfaces and and types it exponentially grows them, and instead of decoupling interface and implementation it again exponentially increases the coupling effect between files.
I don't agree. Template meta-programming and OOP are two different paradigms. If you're thinking of templates as an OO construct, maybe that's why you're having so much difficulty. You use them together, but not interchangeably.
Quote:Original post by thatguyfromthething So it's a terrible idea from either of the two important perspectives. If you are doing like a large business or database or general GUI app, you should care about the OOP aspect. If you are doing like a game or graphics tool then the performance aspect for most tasks and the OOP for things like events and GUI. If you don't care about either of these things, why are you using C++ and not perl or something?
The STL doesn't prohibit you from accomplishing either of these things. That's ridiculous. That's just blind hatred for the technology, no offense.
Quote:Original post by thatguyfromthething So no, I am not spreading lies or misinformation. I used to think more like you, but years and years of struggling with C++ to make it behave properly have finally made me realize why this is such a poor approach.
It's good to be enlightened. I just question the basis behind it. When used properly, STL makes your life easier, not more difficult. I have worked on many applications, including high-performance servers and embedded systems. I haven't experienced the overwhelming breadth of problems you describe. Isolated incidents, yes; those happen with every library. The real issue is that there was a time when what you're saying (mostly) held water, but that was well over 10 years ago. Nowadays, and in most cases, the advice you're giving is simply bad advice, no offense. I don't agree with the mentality of trying to micro-optimize every single section of code when 95% of that code isn't on the critical path.
I've been in and out of the games industry for long enough that I've heard all the arguments that are being used about STL before.

I've heard them about using "proper" C++ -- by which people mean using virtual dispatching and inheritance.

Before that, they were talking about C++ vs C.

Before that it was C vs assembly language...

It's actually a pretty rare programmer whose hand-rolled last-year's technology can beat this year's compiler optimisations. I'm not saying it's not possible. It's certainly not possible without having a profiler and it's amazing how many people who haven't written line 1 of their software are spending their time thinking about how to optimise things. People who haven't ever written a game, haven't ever finished a bit of software are fretting that their game is SO special that it will put SUCH stress on machines that they need to START writing in assembly or without vdispatch or by building their own datastructs or deciding to code in low-level shader languages or whatever the hell this week's stupid argument is.

This is why the earth's hard disks are full of enormously optimised games that are only 1% complete and will never make it any further.

The original authors of UNIX were actually astoundingly lazy in that good way that really good devs are. That's why almost all the data structures are singly linked lists. They wrote them like that to start with and intended to optimise later when they needed to. And they never found a need to. On the other hand, they wrote an operating system whose basic design is still in use over 30 years later between three of them in a few months. There's a lesson there.

Of course, now that you can write a 'Breakout' or a 'Pacman' without absolutely *needing* to do it in cutting-edge bleeding eyes optimised assembly language because BBC Basic interpreted on a 1MHz microprocessor really won't cut it; what's the very, very first game everyone tries to write as their very first learning level project? World of flipping WarCraft, of course...

It doesn't matter. To STL or not to STL in this context is a useless religious argument. I won't fight with you about which is better, needle-nosed pliers or monkey wrenches.

The algorithms in the STL are authored in a specific way that is often extensible: custom memory allocators, etc. It's important that you understand exactly how they are implemented. Once you know that, you can decide if they are good for your task or not.
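As a sketch of that extensibility, here's a toy C++03 allocator plugged into vector (the CountingAllocator name and behaviour are made up purely for illustration). The point is the hook exists: anything that meets the allocator requirements works with every container.

#include <cstddef>
#include <cstdlib>
#include <new>
#include <vector>

// Toy allocator that counts heap allocations. A real one might draw from
// a pool, a per-frame arena, or an aligned heap.
template <typename T>
struct CountingAllocator
{
    typedef T               value_type;
    typedef T*              pointer;
    typedef const T*        const_pointer;
    typedef T&              reference;
    typedef const T&        const_reference;
    typedef std::size_t     size_type;
    typedef std::ptrdiff_t  difference_type;

    template <typename U> struct rebind { typedef CountingAllocator<U> other; };

    static std::size_t allocations;

    CountingAllocator() {}
    template <typename U> CountingAllocator(const CountingAllocator<U>&) {}

    pointer address(reference x) const { return &x; }
    const_pointer address(const_reference x) const { return &x; }

    pointer allocate(size_type n, const void* = 0)
    {
        ++allocations;
        return static_cast<pointer>(std::malloc(n * sizeof(T)));
    }
    void deallocate(pointer p, size_type) { std::free(p); }

    void construct(pointer p, const T& v) { new (static_cast<void*>(p)) T(v); }
    void destroy(pointer p) { p->~T(); }
    size_type max_size() const { return size_type(-1) / sizeof(T); }
};

template <typename T> std::size_t CountingAllocator<T>::allocations = 0;

template <typename T, typename U>
bool operator==(const CountingAllocator<T>&, const CountingAllocator<U>&) { return true; }
template <typename T, typename U>
bool operator!=(const CountingAllocator<T>&, const CountingAllocator<U>&) { return false; }

int main()
{
    std::vector<int, CountingAllocator<int> > v;
    for (int i = 0; i < 1000; ++i)
        v.push_back(i);   // growth triggers a handful of reallocations
    return CountingAllocator<int>::allocations > 0 ? 0 : 1;
}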

Sometimes you need to write clones with limited or specific functionality, sometimes they are perfect.

There is no correct side to the argument as stated: "is the STL good or bad". The only correct answer is "sometimes"

-me
Quote:Original post by Katie
The original authors of UNIX were actually astoundingly lazy in that good way that really good devs are. That's why almost all the data structures are singly linked lists.


I had nothing to do with UNIX, but I did do a fair amount of all kinds of coding in the 90s. I used linked lists for everything as well, in a variety of languages (which means 'both', C and Pascal).

Reason: There was no standard library, there was no code reuse, there was no versioning system, there was no storage (30MB disk with 2MB free), there was no internet, the BBS crawled along at 14.4kbps, there were no wikis, no references, no codez sites; there was 'a book' on algorithms, and that was about it.

So when I needed a dynamic structure, I burst-typed the linked-list stanza (there was no copy-paste either; there was no MDI).


And no, they were not dark ages, and there did indeed exist many alternatives, and people actually were doing real software - but point remains, in those days, I could write linked list blindly, not even thinking about it.

Compare this to RB trees, vectors, hash tables, etc, all of which depend on various internal details to be implemented robustly (resize factors, hash functions).

Also, in DOS times, vectors/arrays were not all that cool due to the 64kb segment limit. And linked lists allocated memory exactly. A vector with 50% wasted space? Are you insane? 50 bytes of wasted space could ruin one's day.



The reason I'm putting that out is because I didn't use linked lists because they were good - but because they were a 1lb. hammer. And while I had a full toolbox of power tools, I had to run everything on a single 1.5V battery, so I used the thing which didn't use power at all.

Today, someone might assume that linked lists must be magic for all of the above reasons. They are not. They are dead simple, memory efficient, and have consistent performance characteristics. They are also likely the slowest of the dynamic containers, but at least they are consistently slow.


Times have changed. It really was only 15 years ago when third-party libraries and compilers were a problem for many reasons. None of those reasons exist anymore.

Regarding STL: the reason it "has performance problems" is the abstractions it defines. Those abstractions introduce specific performance characteristics. STL still compiles down to ordinary assembly, not some weird, 20% slower code.

Meanwhile, job boards are flooded with ads looking for Ruby programmers, no previous experience needed, to develop cloud social applications on a two-week turnaround.

The times have changed.
Quote:Original post by Palidine
It doesn't matter. To STL or not to STL in this context is a useless religious argument. I won't fight with you about which is better, needle-nosed pliers or monkey wrenches.

The algorithms in the STL are authored in a specific way that is often extensible: custom memory allocators, etc. It's important that you understand exactly how they are implemented. Once you know that, you can decide if they are good for your task or not.

Sometimes you need to write clones with limited or specific functionality, sometimes they are perfect.

There is no correct side to the argument as stated: "is the STL good or bad". The only correct answer is "sometimes"

-me


Fair enough. The original author seemed to imply that the STL had no place in gaming or anything performance-intensive, which was what I was replying to. My original point was that for the majority of cases it's the right choice, but certainly not all. There is no tool that is the right choice 100% of the time.
I think a dev should show a little humbleness in regards to new technology and paradigms. I've been there when asm was king, then inline asm in C++, then rolling your own data structures because STL was awkward, etc. You just gotta keep unlearning bad practices (which were once considered good) while learning the new. Unlearning might be the bigger challenge.

Quote:Original post by thatguyfromthething
... rant removed ...

But, templates are the opposite of OOP. Instead of unifying your interface you are shattering it completely and making it impossible to tell what code is actually generated. WHAT GETS GENERATED? This is very hard to track down and is mostly left up to the people making the compiler, and for anything where you care about performance this is an issue. And if you care about more than one compiler, you get to track it down more than once, and you get things working great on one but not the other.

It's not the opposite of OOP. It's generic programming in an OO language. It's more like duck typing with the added benefit of static type checking.

Why do you want to know what it generates? If the profiler shows a big perf drop there, just check the disassembly, if you're into low-level stuff. Usually the unspecialized version does the job just fine, but you can always specialize it to generate whatever you wish. And on a larger scale, the compiler usually does a better job at optimization than any human.
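For instance, a generic template can be specialized for a hot case. A minimal sketch with a made-up fill_zero helper:

#include <cstddef>
#include <cstring>

// Generic version: works for any default-constructible T.
template <typename T>
void fill_zero(T* data, std::size_t n)
{
    for (std::size_t i = 0; i < n; ++i)
        data[i] = T();
}

// Full specialization for a case where a raw memset is known to be safe.
template <>
void fill_zero<unsigned char>(unsigned char* data, std::size_t n)
{
    std::memset(data, 0, n);
}

int main()
{
    unsigned char buf[64];
    fill_zero(buf, 64);   // the compiler picks the memset specialization
    return buf[0];
}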

Quote:Original post by thatguyfromthething
I have gotten surprised countless times at the stupid things compilers will do, especially with templates. Which is not to say not to use them at all because with C++ you are somewhat forced into it, but when you template on serveral items and do all the cute tricks you see in stl and boost, it is almost totally hopeless to ever track down what's really going on at a low level and you will see sudden performance drops for seemingly no reason. From the stupid comments I see all the time it's clear most people using stl and boost don't even know how templates really work or how the compiler works at all.

Heh, that's a bit ironic. Yeah, compilers create suboptimal code from time to time, as the profiler shows. Like not inlining an obvious function. But it's trivial to optimize that later, before the product ships. I haven't faced any difficulty seeing the low-level code or guiding the compiler to better output.
Quote:Original post by Ftn
I think a dev should show a little humbleness in regards to new technology and paradigms. I've been there when asm was king, then inline asm in C++, then rolling your own data structures because STL was awkward, etc. You just gotta keep unlearning bad practices (which were once considered good) while learning the new. Unlearning might be the bigger challenge.


This. A thousand times this.

My biggest annoyance when it comes to programming is the 40 or 50 year olds who have 'been there, done that' and know the 'right' way to do something because 'it was true 20 years ago'. Never mind that times have changed, compilers have improved, CPU speed outpaces memory speed, and a hundred other things are different.

I've gone on a rant about this before; adapt or gtfo basically. Stop trying to hold back the state of the art.

(I should note that not all 40 and 50 year old coders are like this; the best ones do adapt, and I'm glad about that. I also had a horrible shock when I went to write '30 year old' and realised 'crud, that's me in a few months.. errm' *chuckles* I pray I never get like that...)

Quote:
It's actually a pretty rare programmer whose hand-rolled last-year's technology can beat this year's compiler optimisations.

I'd say this is a big misconception, too. People think the compiler has some magic ability to go in and do wonders, that somehow with the STL it's even better, and that these super advanced programmers (who mainly wrote this code 15+ years ago) have some secret mojo no one else does.

But the reality is the compiler is generally good at inlining, good at return value optimization, and good at rearranging order for pipelined execution. You might do a little better if you mess with compiler-specific directives, but to get SIMD generated, for example, you need to do it yourself.
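By "do it yourself" I mean something like this -- a bare-bones SSE sketch that assumes 16-byte-aligned pointers and a count that's a multiple of 4:

#include <xmmintrin.h>   // SSE intrinsics

// Add two float arrays four lanes at a time. You won't reliably get this
// out of a plain scalar loop.
void add4(const float* a, const float* b, float* out, int n)
{
    for (int i = 0; i < n; i += 4)
    {
        __m128 va = _mm_load_ps(a + i);   // aligned 4-float load
        __m128 vb = _mm_load_ps(b + i);
        _mm_store_ps(out + i, _mm_add_ps(va, vb));
    }
}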

Quote:
It's certainly not possible without having a profiler

The profiler is another misconception; it's like saying you need a debugger to figure out bugs. It's just raw data. It catches things that amount to mistakes, but it really tells you very little. Like the debugger, it is easy to misuse as a crutch that negates actual thought. It's not there to debug logic, but to catch simple mistakes, and that's it.

Quote:
and it's amazing how many people who haven't written line 1 of their software are spending their time thinking about how to optimise things. People who haven't ever written a game, haven't ever finished a bit of software are fretting that their game is SO special that it will put SUCH stress on machines that they need to START writing in assembly or without

Well sure, it is always the case you can preoptimize things.

It's also easy to assume everyone on a site like this is a 15 year old game enthusiast or a college student, but that's not necessarily the case, and I'd bet there are also some of the best programmers in the world hanging around.

@Rydinare
This is sort of disjoint since I won't go through and edit everything for nine hours just to make a message board post.

But, just because you don't see something doesn't make it not so.

Most programs don't need to worry a lot about code size or performance, but most programs don't need to be written in C++. It's easy to get gigabytes of object files in C++, though.

The point of macros is metaprogramming, too. Templates were introduced as a replacement, because macros work OK for little stuff but are hard to deal with when making something large. I don't say anyone who uses macros is incompetent, but a template library programmed almost entirely in macros is pretty ironic and shows that it's probably not someone who should be listened to much. But, free has a big appeal.

Yes, other languages did learn from C++ mistakes. That's why they don't use C++ templates at all, but what does that have to do with anything? My point is not to bash on anybody, just to point out reality. The fact C++ has been left in the dust should say a lot about many of C++'s features.

I know templates are not OO, I'm the one who said it. Most people who talk about stl, design patterns, and OOP in C++ are pretty vague on what OOP is and how templates work. Or more likely, simply dead wrong. But, it's unsurprising because the way templates and C++ in general work is beyond mindboggling if you really look at it in detail.

People drop to a C array because of performance, not size. I said nothing about resizing. You can also preallocate a vector, but that's not the real issue.

Sure, you can trace through anything, but good luck with that using STLport or Dinkumware, or probably any other big template library. More important: what assembly gets generated? The point is not to look for mistakes but to find out why things slow down. This is possible but much harder with templates, and almost impossible when you have complicated things going on (which almost anything in the STL qualifies as).

Who said I microoptimized anything? How would ten years passing have changed how stl works? Most of stlport was written 15+ years ago and has not changed a whit as far as I can tell in that time, whereas what qualifies as optimal code today has changed dramatically because hardware has changed dramatically. Sure computers can perform most tasks faster than ever, but again it comes to why are you using C++ if you don't care about performance? For some things you basically get paid in direct proportion to how much of x you can do, or even in exponential proportion. If Modo could handle 20 billion poly models would they be languishing in obscurity while zbrush has a dozen times the customers? Modo is certainly easier to use and it has all the same basic features, but bottom line is it's useless for most of what zbrush does so well simply due to not having the raw firepower.

How does stl solve almost anything? This came up as a discussion about data structures, if you remember. You (and I) are only talking about vectors because that's about all there is worth noting in stl. If that's all you need for data structures, you probably are mistaken to use C++ at all. I'd almost say the entire reason to use C++ at all compared to say Java is to be able to make specialized APIs. Of course if there is some API you need that's only in C++ then you are stuck with C++, but the real point is so that the people writing the API can have performance and flexibility and in turn use other low level APIs. Say you want to make something akin to Oracle or a game engine. Do you really think STL will cover your data structure needs? Well, it won't even scratch the surface. However if you are just some guy who uses Oracle or a game engine's built in stuff and make GUIs on top of it, by all means don't worry about data structures they use at all, because it's out of your hands anyway. If that's the case, drop C++ completely if you can and use C# or mono or python or Java to call on their API to do the actual work. But you can be pretty sure that somewhere in those programs there's a lot of work that went into various data structures.

If you want to convince yourself stl is a joke, just make a class with a vector of ints and a couple other primitive members, basically a typical object of some kind. Then make a vector of these classes, each with a dozen ints in the child vector. Add about a million to the vector, then sort it and see how long that takes to complete (for comparison, I can run the same code with my own implementation in about 150ms). Also note how awkward this is, syntactically speaking. Sure, you can blame the implementation, but the implementation is going to be different on each platform and yet tends to universally suck. Also check the difference between what you get with a static array and a vector; I know that put a sad frown on my face.

[Edited by - phantom on February 25, 2010 3:42:16 AM]

This is my thread. There are many threads like it, but this one is mine.

Quote:If you want to convince yourself stl is a joke, just make a class with a vector of ints and a couple other primitive members, basically a typical object of some kind. Then make a vector of these classes, each with a dozen ints in the child vector. Add about a million to the vector, then sort it and see how long that takes to complete (for comparison, I can run the same code with my own implementation in about 150ms).

Funny thing is, that takes a few ms using the STL on my machine. So, what now?
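For anyone who wants to run the test themselves, here's a rough reconstruction as described (the struct layout and sort key are guesses; timings will vary with machine and build settings):

#include <algorithm>
#include <cstddef>
#include <cstdio>
#include <cstdlib>
#include <ctime>
#include <vector>

// A "typical object": a sort key, another primitive member, and a child
// vector holding a dozen ints.
struct Thing
{
    int key;
    float weight;
    std::vector<int> data;
    Thing() : key(0), weight(1.0f), data(12, 0) {}
};

bool byKey(const Thing& a, const Thing& b) { return a.key < b.key; }

int main()
{
    std::vector<Thing> things(1000000);   // a million objects
    for (std::size_t i = 0; i < things.size(); ++i)
        things[i].key = std::rand();      // randomize the sort keys

    std::clock_t start = std::clock();
    std::sort(things.begin(), things.end(), byKey);
    double ms = 1000.0 * (std::clock() - start) / CLOCKS_PER_SEC;

    // Pre-C++0x, sorting copies whole Things -- child vectors included --
    // so this measures copy cost as much as comparison cost.
    std::printf("sorted %u things in %.0f ms\n",
                static_cast<unsigned>(things.size()), ms);
    return 0;
}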
Quote:Original post by thatguyfromthething
Quote:It's certainly not possible without having a profiler
The profiler is another misconception; it's like saying you need a debugger to figure out bugs. It's just raw data. It catches things that amount to mistakes, but it really tells you very little. Like the debugger, it is easy to misuse as a crutch that negates actual thought. It's not there to debug logic, but to catch simple mistakes, and that's it.
Are you telling me you don't use a profiler when doing performance analysis? If not a profiler then what? How do you know "Implementation A" is faster or slower than "Implementation B" if you don't profile?

This topic is closed to new replies.
