
phillip_fletcher

STL speed question


Given:
  
class CGraphicalObject
{
public:
     virtual bool Draw(void) { /* draw stuff */ return true; }
};

  
Which is faster?
  
typedef std::list<CGraphicalObject*> ObjectPtrList;

ObjectPtrList ListOfObjects;

// list is filled elsewhere....


ObjectPtrList::iterator ObjectIter;
for(ObjectIter = ListOfObjects.begin(); ObjectIter != ListOfObjects.end(); ObjectIter++)
{
     (*ObjectIter)->Draw();
}
  
Or
  
typedef std::list<CGraphicalObject*> ObjectPtrList;

ObjectPtrList ListOfObjects;

// list is filled elsewhere....

std::for_each(ListOfObjects.begin(), ListOfObjects.end(),
              std::mem_fun(&CGraphicalObject::Draw));
  

I would venture to say that in this instance, the second would be slightly faster as it caches the call to end().

But not by much...
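For reference, a typical for_each boils down to roughly the sketch below (the standard only specifies the behaviour, not the code, so the details vary by vendor):

template <class InputIterator, class Function>
Function for_each(InputIterator first, InputIterator last, Function f)
{
    // 'last' is a by-value copy of end(), so end() is evaluated exactly once
    for (; first != last; ++first)
        f(*first);
    return f;
}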

Don't worry about it. Period. Do whatever you think is the more concise and readable form.

A lot of beginning and even intermediate programmers make the mistake of premature optimization, trying to wring every last cycle out of their CPU before they even have any idea where the bottleneck is.

Three rules of optimization, stolen from someone or other here (feel free to speak up and take credit):
1. Don't optimize.
2. (for advanced programmers) Don't optimize yet.
3. Use a profiler.


How appropriate. You fight like a cow.

Premature Optimization is the #1 cause of non-written code. It is also the source of all evil. Satan was spawned because God tried to prematurely optimize his code.

Stay away from it. The bottom line is, ugly code that works is ALWAYS better than elegant / optimized code that DOESN'T work.

So just make it work and then worry about it after it works.

Hmm, I have to partially take exception with what daerid said.

Ugly code that works is bad. Elegant, unoptimized code that also works is good. Ugly, optimized code that really, really works is probably bad.


How appropriate. You fight like a cow.

Well, this is actually from code that I wrote a few years ago. I'm going over a few of the drawing routines, which are the bottleneck; the former was the original form of the code. After looking at it recently, optimization is certainly not premature, it's unavoidable. (And as for the "don't worry about it": it's my job to worry about it. I have the task of making the application run fast, lucky me, huh?) I like the way "for_each" works, and I personally think it is cleaner, but either version is readable to my co-workers and me. I wasn't sure whether the STL implementations were inlined or not; if not, I would imagine that all the function calls would add a lot of overhead. I do know that the STL implementation (with VC) of "for_each" contains a lot of function calls, probably more than the for loop.

Thank you all for your words though, I'll just assume that the two are fairly equal in speed.


[edited by - phillip_fletcher on May 5, 2003 3:17:37 PM]

[RANT]

quote:

[...] Premature Optimization [...]


I regularly get neurological seizures when I see/read/hear this term. This whole concept of "premature optimization", which seems to be so loved by the academic crowd, is one of the most idiotic concepts I know of. Just after Hungarian notation. Welcome to the world of "modern" software development: extremely slow and bloated code that takes ten times more space than it should, and executes ten times slower than it could. Great. And often this is not the result of inherently bad design, but the result of simple little errors from the "premature optimization" brainwashing.

quote:

Three rules of optimization, stolen from someone or other here (feel free to speak up and take credit):
1. Don't optimize.
2. (for advanced programmers) Don't optimize yet.
3. Use a profiler.


I fully agree with 3, no question about that. I do not entirely agree with 2. There comes a point in your programming experience where you instinctively know that a certain piece of code is going to be a bottleneck. For example, let's take a string or a vector class. You know that it is going to be used throughout your code. Emphasis is on the amount of potential use: if some piece of (library) code is going to be extensively used, then chances are that it will sooner or later end up in time-critical parts. It cannot be wrong to optimize such code beforehand. In fact, it will even increase your productivity. And I certainly do not agree with point 1. Actually, I think that point 1 is one of the main reasons for low-quality, slow and bloated code.

And just to clear up another common mistake: optimized code does not need to be ugly or unmaintainable. Optimized code doesn't necessarily mean ASM either. Optimization is a way of thinking and organizing your algorithmic structures; it is programming with speed and efficiency in mind. Optimized programming is not using a std::map if a std::vector would do, or not using operator new() in a string class for small elements, but a pool allocator. It's choosing the right tools (i.e. structures and algorithms) to do the job. Unfortunately, lots of people lose sight of that: they write bloated and slow code with the idea of subsequent optimization in their heads. "We'll optimize it later, first it needs to run." Bullshit. We all know that it is never going to be optimized later: if the product works, it is considered ready for shipping. Try to convince the marketing department that the apparently fully operational (but slow) software needs another two-month optimization pass. Good luck.
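To illustrate the pool allocator point, a minimal fixed-size pool could look like the sketch below (hypothetical code, not from any real string class; alignment and growth are glossed over):

class SmallBlockPool
{
public:
    SmallBlockPool() : free_(0)
    {
        // thread every block of the slab onto the free list
        for (unsigned i = 0; i < NUM_BLOCKS; ++i)
        {
            Node* n = reinterpret_cast<Node*>(slab_ + i * BLOCK_SIZE);
            n->next = free_;
            free_ = n;
        }
    }

    void* Allocate()              // O(1): pop the free list, no heap call
    {
        if (!free_) return 0;     // exhausted; a real pool would grow or fall back to new
        Node* n = free_;
        free_ = n->next;
        return n;
    }

    void Free(void* p)            // O(1): push the block back onto the free list
    {
        Node* n = static_cast<Node*>(p);
        n->next = free_;
        free_ = n;
    }

private:
    struct Node { Node* next; };
    enum { BLOCK_SIZE = 32, NUM_BLOCKS = 1024 };  // 32 bytes covers a short string buffer
    char slab_[BLOCK_SIZE * NUM_BLOCKS];
    Node* free_;
};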

[/RANT]

quote:
Original post by Yann L
I regularly get neurological seizures when I see/read/hear this term. This whole concept of "premature optimization", which seems to be so loved by the academic crowd, is one of the most idiotic concepts I know of. Just after Hungarian notation. Welcome to the world of "modern" software development: extremely slow and bloated code that takes ten times more space than it should, and executes ten times slower than it could. Great. And often this is not the result of inherently bad design, but the result of simple little errors from the "premature optimization" brainwashing.

Well, of course, it's not an either-or situation. You make a good point about optimized design in terms of program flow, data structures, etc.: what I'd call "macro-optimization". My comments, and stating of those rules, were geared more towards micro-optimization, which this thread seems to be about. In this case, since he's using the member function pointer functor in for_each, there'll be an extra function dereference and no chance to inline. Since it's a virtual function, though, you wouldn't be inlining much anyway if you wrote a custom functor: just the vtable dereference. In the for loop case, he isn't caching end(), but that isn't as weighty as a branch (on x86), so not much to worry about there.
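For illustration, a hand-rolled functor along those lines might look like this (a sketch with a made-up name, reusing the list from the original post; needs <algorithm>):

struct DrawObject
{
    // for_each can inline this call operator; only the virtual
    // dispatch of Draw() itself remains
    void operator()(CGraphicalObject* obj) const { obj->Draw(); }
};

std::for_each(ListOfObjects.begin(), ListOfObjects.end(), DrawObject());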

But all of this is meaningless... it's a few clock cycles per object. Meanwhile, each object is undoubtedly taking at least thousands of clock cycles in the Draw() call. So I look at this, and my brain yells at me, "it's an outer loop--the bottleneck isn't here." I definitely agree that macro-optimizations should be implemented from day 1, starting with design.

You make a good point about optimizing commonly used helper functions. But I believe that this is a fairly rare case to be encountered on these boards--I've noticed that the majority of people asking about optimizations are not hacking a matrix transform or an FFT, but rather asking about a code structure like this. I think that this is partially because most commonly used functions have already been written for people, either in D3DX (though don't get me started on the optimization of a few of those functions) or the CRT.


How appropriate. You fight like a cow.

quote:

In this case, since he's using the member function pointer functor in for_each, there'll be an extra function dereference and no chance to inline. Since it's a virtual function, though, you wouldn't be inlining much anyway if you wrote a custom functor: just the vtable dereference. In the for loop case, he isn't caching end(), but that isn't as weighty as a branch (on x86), so not much to worry about there.


True, but I wasn't specifically referring to his case. It was more of a general-purpose rant.

I fully agree with you about the macro- and micro-optimization issue. But unfortunately, this premature optimization paradigm seems to be spreading recently, widely out of control. It goes so far that efficient code is losing all importance, i.e. the "the compiler is psychic and will do everything for me" behaviour. OK, modern compilers can do much more today, but the overall structure still comes from a human mind.

And what is even more alarming is that this stance seems to be more and more common among programming students in the CS domain; even graduated ones still have this idea.

I think a major reason for the indoctrination of CS students is the greater emphasis on high-level design. For the first time, professors are teaching UML at the same time as C++... there's a push to have an elegant, extensible design that accomplishes everything in a fully explicated way. (I heard of one professor at this school who took points off when students used casts in their project... *shudder*) I think it's a noble goal, since it eventually teaches students to be adept at macro-optimization, but craftsmanship of the individual line of code certainly decreases.

I think that the more you know about a language, the easier it is to find a balance between Booch-pleasing design and tight code. Functors are a good example of this--a surprisingly elegant solution to the problem of function-call overhead in inner loops, without decreasing readability in the slightest.

Give those non-optimizing CS students some time, and a project or two on a GBA or something... they'll come around.

Heh. Isn't it relieving to know that at least some of them will end up developing on embedded systems (which is a very interesting activity, BTW)...

On the other hand, it's considerably less funny if your company decides to hire two such "programmers". And after you've assigned them to their first small evaluation project, you browse over their code, horrified, and you just want to call human resources, screaming: fire them immediately!! ...

quote:
Original post by Sneftel
Three rules of optimization, stolen from someone or other here (feel free to speak up and take credit):
1. Don't optimize.
2. (for advanced programmers) Don't optimize yet.
3. Use a profiler.


I think that is my exact quote but I stole it from someone. Probably Herb Sutter, who was quoting someone else.

I think the reason I agree with step 2 is that many people over-estimate their level of ability as programmers (especially beginners) and worry about optimising things when structure, adaptability and correctness are more important. If you really are an advanced programmer you'll be reaching for the profiler anyway.

I enjoy refactoring (changing the structure of code without changing its external behaviour) and in many ways it disregards optimisation. However it can help organise code to make optimisation easier. For a very simple example, it encourages methods to be short. This means you end up with small sections of code made up of many function calls. It is easier to profile short functions. I often find myself profiling then refactoring to be able to get a better grip on exactly which part of the function is taking the time.

I agree that our code often never gets optimised. I've worked on engines where, after leaving the company and going away travelling for a year, I came back, thought "this is a bit slow", did some profiling and sped it up by 10 times (really!). I was shocked, but things just keep rolling on. I'm working on something right now that hasn't been thoroughly tested, and someone complained that it took 30 minutes to come back with some complicated results. I profiled, made a few small changes and it's almost instantaneous now. It's to do with money, but it's bad business as far as I'm concerned. There is no way our customers can maintain their confidence in our products.

Yann, what you describe is possibly modular code that will be commonly used in a project, and could possibly be written up front and optimised with little need for changes later on. I don't mean for the rules to be applied only at the end of a project, but at the end of development of certain modules. It may be that you work your way through the rules on a daily basis. That is probably the bad thing about them: it's easy to think they mean "never".

PS: preincrement rather than postincrement.

The points in the quotes on "premature optimization" are mainly these:
- Too often people will optimize at the EXPENSE of readability. Get the code working FIRST, then go and find what needs to be optimized. In this case, that seems to be what has happened.
- Too often you will spend 95% of your time squeezing those few clock cycles out of a for loop that is executed once per frame. The point is that there is absolutely no need, OR USE, in optimizing things that are relatively fast. E.g., why optimize how a function deals with the passed RECT structure when the blit itself takes 1000 times as long?

ALGORITHMIC choices are not usually referred to as "optimization", but merely design. Of course you consider what order (complexity) your algorithms and data structures are before using them, and which would be best. BUT, the point is that the process should go more like:
1) Design the program, data structures and algorithms.
2) Programming
3) Testing
4) Something is too slow? Use a profiler to find the few critical sections. There are generally only 1 or 2 short sections of code (a dozen lines at most) that are causing most of the slowdown. OPTIMIZE THOSE, and don't worry about the rest.

[Edit] Petewood's point about modular code is exactly right. You can still write code, test, optimize, and finalize it in segments (libraries are the obvious example)... as long as you still maintain the correct ORDERING of development.

[edited by - AndyTX on May 5, 2003 5:43:14 PM]

quote:
Original post by Yann L
you browse over their code, horrified, and you just want to call human resources, screaming: fire them immediately!! ...

I can sympathize. I've seen O(2n) implementations of O(n) algorithms.


How appropriate. You fight like a cow.

quote:

I agree that our code often never gets optimised. I've worked on engines where, after leaving the company and going away travelling for a year, I came back, thought "this is a bit slow", did some profiling and sped it up by 10 times (really!). I was shocked, but things just keep rolling on. I'm working on something right now that hasn't been thoroughly tested, and someone complained that it took 30 minutes to come back with some complicated results. I profiled, made a few small changes and it's almost instantaneous now. It's to do with money, but it's bad business as far as I'm concerned. There is no way our customers can maintain their confidence in our products.


That's exactly my point - in the end, it's going to backfire.

quote:

ALGORITHMIC choices are not usually referred to as "optimization", but merely design.


Design = macro-optimization. The basic structural layout of a program will implicitly have an impact on efficiency and performance. So the simple choice of a std::map vs. a std::vector vs. a std::list, or the simple pre- vs. postincrement issue petewood brought up, will probably modify the structure of your code, but it will have an even greater impact on your performance. Writing efficient and performance-aware code is optimization; hand-optimized ASM is a whole different issue. It's more about questions like: how can I avoid needless calls to this expensive copy constructor, or the creation of temporaries?
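A concrete example of the kind of needless copy I mean (hypothetical names, just a sketch):

#include <string>

void SetCaptionByValue(std::string text) { /* ... */ }        // copies the whole string on every call
void SetCaptionByRef(const std::string& text) { /* ... */ }   // no copy, no temporary

void Example()
{
    std::string title = "options screen";
    SetCaptionByValue(title);   // invokes the expensive copy constructor
    SetCaptionByRef(title);     // passes a reference only
}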

quote:

Of course you consider what order (complexity) your algorithms and data structures are before using them, and which would be best. BUT, the point is that the process should go more like:
1) Design the program, data structures and algorithms.
2) Programming
3) Testing
4) Something is too slow? Use a profiler to find the few critical sections.


That's the theory. In practice, however, you forgot point:

3.5) if it seems operational, ship it, and forget about point 4.

quote:

There are generally only 1 or 2 short sections of code (a dozen lines at most) that are causing most of the slowdown. OPTIMIZE THOSE, and don't worry about the rest.


Optimizing poorly written time-critical sections can potentially disrupt your entire code base, especially if design errors were made. Often you'll have the problem of the "sliding bottleneck", where the problematic area is distributed among dozens of different classes and modules. If you optimize one, the bottleneck will shift to another: a module perhaps written by an entirely different part of your dev team (speaking of large-scale projects). When keeping consistent and efficient code design in mind, however, those scenarios can be almost totally avoided. And we can finally focus on the last micro-optimisations, those being pretty rare, as you said.


[edited by - Yann L on May 5, 2003 6:05:31 PM]

Which is faster? I'd guess this is (but I could be wrong):


typedef std::list<CGraphicalObject*> ObjectPtrList;
ObjectPtrList LOB;
// list is filled elsewhere....
ObjectPtrList::iterator it = LOB.begin();
ObjectPtrList::const_iterator end = LOB.end();

for( ; it != end; ++it ) {
    (*it)->Draw();
}


[edited by - rypyr on May 5, 2003 6:07:32 PM]

quote:
Original post by Yann L
Design = macro-optimization. The basic structural layout of a program will implicitly have an impact on efficiency and performance. So the simple choice of a std::map vs. a std::vector vs. a std::list, or the simple pre- vs. postincrement issue petewood brought up, will probably modify the structure of your code, but it will have an even greater impact on your performance.



No, I know what you're saying and I agree. The main point is to get the design correct right off the bat, otherwise you are asking for trouble. Regardless of efficiency, you don't want to be recoding a fundamental data structure in your program at the last instant. Data factories and abstraction aside, in the real world this just isn't reasonable most of the time.

My only comment is that the points made about optimization earlier are referring to code-level performance optimization, not algorithmic design, even if you want to dub this "optimization" as well. To be completely weird, the word optimization can even refer to making your code smaller, at the EXPENSE of speed (for example).

quote:

That's the theory. In practice, however, you forgot point:
3.5) if it seems operational, ship it, and forget about point 4.



That's where object orientation comes in. The process is applied to a very small segment. For example, say you need to write code to alpha blend an image to the screen (in DDraw, let's say, to make it harder).

- To start with, you work out how it is going to be done: going through all the pixels, extracting RGB information, taking a weighted average, re-encoding and moving on to the next pixel (see the sketch below).
- Also at this stage you might note that there are potentially some MMX optimizations to be had, but none of these will fundamentally impact the structure of the algorithm... we're still in the design phase here.
- You write some test code... since it's a very small portion of the project, as with OO you should always be working with (usually <100 lines of code), it takes you very little time to draft up a simple for loop that does the required blend.
- You spend some time debugging and testing until you are sure the algorithm is sound.
- THEN you say, "OK, now after some preliminary benchmarks, this might need to be faster." HOWEVER, note that it could be the case that you only need to alpha blend a single small image one time in your entire project... then you don't even need to bother cracking out your x86 ASM and Intel Architecture Manuals in the next step.
- However, if you do need some more speed, and since you already considered algorithmic optimizations in the design phase, you decide to convert some of it to MMX assembler, and thus you process 4 pixels at a time, for example. After coding all of this through and testing it all thoroughly, you discover that it meets the original design document's performance targets and thus you are done.
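For reference, the scalar version of such a blend might look roughly like this (a sketch assuming 32-bit XRGB pixels and an alpha in 0..255; the function name is made up):

void AlphaBlend(unsigned int* dst, const unsigned int* src,
                unsigned int count, unsigned int alpha)
{
    // per channel: out = (src * alpha + dst * (255 - alpha)) / 255
    for (unsigned int i = 0; i < count; ++i)
    {
        unsigned int s = src[i], d = dst[i];
        unsigned int r = (((s >> 16) & 0xFF) * alpha + ((d >> 16) & 0xFF) * (255 - alpha)) / 255;
        unsigned int g = (((s >> 8) & 0xFF) * alpha + ((d >> 8) & 0xFF) * (255 - alpha)) / 255;
        unsigned int b = ((s & 0xFF) * alpha + (d & 0xFF) * (255 - alpha)) / 255;
        dst[i] = (r << 16) | (g << 8) | b;
    }
}

This is the loop an MMX version would then rewrite to process several pixels per iteration.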

The point about "optimization" is that if you had tried to code this routine from start to finish in straight MMX, it would have taken you LONGER overall, and it would have been much more painful to work out the logic and debug. Thus, the "premature code optimization" would have hurt the final product in EVERY way.

That's all those quotes are trying to say, and I agree that they have been overgeneralized. Don't dismiss them because of that though, as the "efficiency freak" syndrome is still just as common today as ever...

Some good points in this thread. I think there are reasons both for thinking about performance right from the beginning as well as not being obsessed about optimization too early, depending on the original goals.

If a major requirement of a product is that it performs within a given time, then obviously you need to think about this up front. I worked at one company where the focus of the product was processing a huge number of records as fast as possible. Therefore, much of the design was concerned with this. Templates were used almost exclusively in place of inheritance because the overhead of virtual function calls was too great; with templates, many functions end up inlined. This was something you could not do later. In general, you don't want to get to the end of development, profile, and then discover that the only way you're going to make the program faster is to redesign in a major way.
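A sketch of that technique (hypothetical names): the template version lets the compiler see the concrete handler and inline it, which the virtual version cannot.

#include <cstddef>

struct VirtualHandler
{
    virtual ~VirtualHandler() {}
    virtual void Handle(int record) = 0;   // dispatched through the vtable
};

void ProcessAll(VirtualHandler& h, const int* records, std::size_t n)
{
    for (std::size_t i = 0; i < n; ++i)
        h.Handle(records[i]);              // virtual call per record, no inlining
}

template <class Handler>
void ProcessAllT(Handler& h, const int* records, std::size_t n)
{
    for (std::size_t i = 0; i < n; ++i)
        h.Handle(records[i]);              // resolved at compile time, can inline
}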

But I suppose the "optimize later" philosophy pertains to the practice of spending too much time agonizing over which form of loop is faster when it probably won't matter. But with a little experience, I think you can "optimize" as you go along, in a sense, without wasting any time at all. For example, you know not to pass a string by value if you can pass it by reference.

I think that in some cases, code that is slightly slower but also clearer can be the better choice. Maintainability is a value built into code that is hard to measure, and so it usually gets ignored. If you are the only one working on the code, then perhaps it doesn't matter. But have you ever gone into someone else's code to fix or modify something, and it took you a lot longer just because they had found some clever way to make the code run faster, and it was so hard to figure out what they were doing? If speed is not critical, do it the clear way and you will save the company money in development time by making your code more maintainable for the next poor slob that comes along.

I fully agree that it happens that your code will ship before you get a chance to optimize it. But consider that that was a decision made by marketing people. If you had secretly spent extra time up front to optimize, then would the whole development cycle have been longer? Would the delay cause you to miss a market window? As developers, we all want to produce the best quality. Unfortunately, we have to deal with competing companies that may release lower quality products sooner and capture the market.

I'm not sure I'm convinced by the ideas of XP (which depends on refactoring), although I confess I haven't read much about it. But it sounds to me like an excuse to jump right into coding before you think about the design carefully. And refactoring, although they say it's not rewriting, does sound like rewriting. I think there are some types of systems that lend themselves more to the XP/refactoring approach than others. In some large systems, I believe you just need to think a lot up front about how you are going to build it. Otherwise you'll find yourself redesigning and rewriting over and over again.

I guess that was just a rambling rant of various ideas. But the subject is thought-provoking.

How about timing the two methods? Did you try that, Phillip?

Also, add a third method:


typedef std::list<CGraphicalObject*> ObjectPtrList;
ObjectPtrList ListOfObjects;
// list is filled elsewhere....

ObjectPtrList::iterator ObjectIter, end;

// so you won't have to call "ListOfObjects.end()" every time
end = ListOfObjects.end();

for( ObjectIter = ListOfObjects.begin(); ObjectIter != end; ObjectIter++ )
{
    (*ObjectIter)->Draw();
}
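A rough way to time it (a sketch using <ctime>; coarse, but enough to compare, and it assumes the ObjectPtrList typedef and a filled list as above):

#include <algorithm>
#include <cstdio>
#include <ctime>
#include <functional>

void TimeForEach(ObjectPtrList& objects)
{
    const std::clock_t start = std::clock();
    for (int pass = 0; pass < 1000; ++pass)   // repeat so the total is measurable
        std::for_each(objects.begin(), objects.end(),
                      std::mem_fun(&CGraphicalObject::Draw));
    const std::clock_t stop = std::clock();
    std::printf("for_each: %f seconds\n", double(stop - start) / CLOCKS_PER_SEC);
}

Time the plain for loop and the cached-end() loop the same way and compare.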

quote:
Original post by Sneftel
Hmm, I have to partially take exception with what daerid said.

Ugly code that works is bad. Elegant, unoptimized code that also works is good. Ugly, optimized code that really, really works is probably bad.


How appropriate. You fight like a cow.


That doesn't really refute what I said. Remember, my point is simply this: works > doesn't work.

Always. And you can consider unmaintainable code non-working.

Please allow me to say this again (I'm sure it's been said already): use ++Iterator rather than Iterator++. Why? Because the (...)++ operator involves a copy constructor call, whereas the ++(...) operator does not.
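The reason is visible in how the two operators are conventionally written for a hand-rolled iterator (a sketch; real STL iterators look similar):

class IntIterator
{
public:
    explicit IntIterator(int* p) : p_(p) {}
    int& operator*() const { return *p_; }

    IntIterator& operator++()      // preincrement: advance in place
    {
        ++p_;
        return *this;
    }

    IntIterator operator++(int)    // postincrement: must preserve the old value
    {
        IntIterator old(*this);    // the extra copy constructor call
        ++p_;
        return old;
    }

private:
    int* p_;
};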

Second tip: From reading STL sources, there is a general consensus that using an algorithm offered with an STL implementation is PROBABLY faster than emulating the algorithm oneself (aka for_each vs. writing the loop oneself). This said, the STL does not guarantee it; it's simply common sense that those who wrote the algorithms know the STL they wrote and can use a few quick tricks if they feel like it.

One rule is: a library algorithm should NOT be slower than implementing it yourself. (I'm sure there are exceptions, like implementing a kick-ass radix sort in place of a sorting algorithm from a crummy STL implementation, but I'm talking tendencies here.)

Okay, phillip_fletcher, all arguments about optimizing or not aside, you're asking us how to make this faster. My vote is not only for for_each; it is also for you to explore the possibility of changing your STL implementation if you use VC6.0's. Why?

I've learned that the STL implementation offered in VC6.0 is rather crummy. (It works, but it's slow and can't handle some things in the standard. VC7's implementation is fine and fast, according to what I've heard, however.) I've done some profiling of Dinkumware's implementation versus STLport's and saw some tenfold improvements in hashed lists (something to do with a smart allocator). Thus, I'm quite sure you'd benefit from using STLport over the Dinkumware STL.


=^.^= Leaders and teachers should remember: It is best to offer others what they Need, not what they Want.

Spacecat,

Yes, I use MSVC 6.0 at work; they bought VC7 but will not upgrade this project to it, not sure why. I use VC7 at home, and I am very happy with it. This is not a question of making it faster, but of which one is faster. I personally think that there are a lot of better list algorithms, but it's not my choice. Thanks everyone for the suggestions and comments.

quote:
Original post by Spacecat
Please allow me to say this again (I'm sure it's been said already): use ++Iterator rather than Iterator++. Why? Because the (...)++ operator involves a copy constructor call, whereas the ++(...) operator does not.

Actually, if you look at the ASM for this, they end up doing the same thing. I tested this with a vector and a set iterator, to primitives. Perhaps iterators to non-primitives will do different things.

quote:
Original post by Sneftel
I can sympathize. I've seen O(2n) implementations of O(n) algorithms.
That's radical! Maybe the most common error I see is the "N-multiplier" caused by walking the whole container on every iteration even though the attribute being computed never changes.

E.g.:

for (int i = 0; i < strlen(x); i++) { /* ... */ }
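The usual fix is to hoist the invariant call out of the condition, something like this sketch:

#include <cstring>

void Process(const char* x)
{
    const std::size_t len = std::strlen(x);   // O(n), but computed once
    for (std::size_t i = 0; i < len; ++i) { /* ... */ }
}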
