
Worst profiler results



#1 Oolala   Members   -  Reputation: 707

Posted 21 January 2014 - 07:39 PM

Question is pretty straight-forward:

 

You just finished writing a bit of code, and it doesn't quite meet your performance demands, so you run it through a profiler to find out where you need to trim fat.  What are you really hoping doesn't show up in your profiler results?

 

For me, I just got the result that memory allocation is accounting for roughly 75% of my runtime.  This is really one of the things I hope not to see, mostly because it's the kind of problem that can't be addressed in one place, and is a sign that everything everywhere needs to change.




#2 fastcall22   Crossbones+   -  Reputation: 3970

Posted 21 January 2014 - 08:44 PM

I just got the result that memory allocation is accounting for roughly 75% of my runtime.

Dynamic memory allocations are expensive. Heap fragmentation is devastating. Analyze your objects' lifetimes carefully. Use the stack where it makes sense (whenever an object's lifetime is FILO ordered), object pooling where it makes sense, and avoid frequent heap allocations by reserving the size of containers before using them.
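As a rough sketch of the reserve-before-use part of that advice (the Particle type and count here are invented for illustration):

```cpp
#include <cstddef>
#include <vector>

struct Particle { float x, y, z; };

std::vector<Particle> makeParticles(std::size_t count) {
    std::vector<Particle> particles;
    particles.reserve(count);  // one allocation up front instead of repeated regrowth while pushing
    for (std::size_t i = 0; i < count; ++i) {
        particles.push_back(Particle{0.0f, 0.0f, 0.0f});
    }
    return particles;
}
```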



#3 Promit   Moderators   -  Reputation: 6109

Posted 21 January 2014 - 09:02 PM

I have one fear when profiling. I fear that when I run it, nothing interesting shows up. No hot spots, no obvious mistakes, no clear points to tackle. Just a nice even trace with lots of things taking small amounts of time to do small things. Because optimizing that is the stuff of nightmares.

 

Also, any time I see large amounts of time spent in middleware, or especially in the graphics driver, it's likely to be hell.


Edited by Promit, 21 January 2014 - 09:08 PM.


#4 Hodgman   Moderators   -  Reputation: 27686

Posted 21 January 2014 - 09:04 PM


You just finished writing a bit of code, and it doesn't quite meet your performance demands, so you run it through a profiler to find out where you need to trim fat.  What are you really hoping doesn't show up in your profiler results?
FifoFullCallback :(

#5 frob   Moderators   -  Reputation: 18883

Posted 22 January 2014 - 01:36 AM

You just finished writing a bit of code, and it doesn't quite meet your performance demands, so you run it through a profiler to find out where you need to trim fat.  What are you really hoping doesn't show up in your profiler results?

Seeing nothing big is a serious one for me. I've done a lot of profiling and instrumenting of code, especially on some code bases destined for 20MHz and 66MHz processor systems. You can pick out the low hanging fruit pretty quickly.

What is devastating to me is finding hundreds or even thousands of tiny things, each with a cost that is only slightly too high, and each requiring a manual effort to fix.

 
For me, I just got the result that memory allocation is accounting for roughly 75% of my runtime.  This is really one of the things I hope not to see, mostly because it's the kind of problem that can't be addressed in one place, and is a sign that everything everywhere needs to change.

That one is usually pretty simple. Count yourself lucky.

Assuming your profiler lets you sort by caller, you can usually navigate quickly up the tree to find a small number of serious offenders. If the profiler itself doesn't do that (most do), export your data, with calling information, into a big spreadsheet, build an Excel pivot table, and play with it until you discover the offenders.

If this is the first pass through the code base, there are a few common patterns. One is not reserving space in dynamic arrays (such as std::vector) and instead simply adding items one at a time, causing a large number of resizes. Usually these are evidenced by a brief stutter as the system drops a frame or two doing twenty thousand allocations. Another is the frequent creation of temporary objects, and since it is a performance concern it is likely happening in a relatively tight loop, so it is probably just a single location where you need to adjust object lifetimes. It may be a resource that is frequently being created and destroyed where a cache or persistent buffer would help. Or it could be a memory leak, premature object release and reallocation, or a similar problem.

All of those problems can be quickly identified with a good profiler. With 75% of your time spent in the allocator it should be glaringly obvious from the profile which functions need examination.
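As an illustration of the "cache or persistent buffer" pattern mentioned above (the class and member names here are hypothetical), one common fix is to hoist a scratch container out of a per-frame function so its capacity survives between calls:

```cpp
#include <cstdint>
#include <vector>

class MeshUpdater {
public:
    void update(const std::vector<std::uint32_t>& indices) {
        scratch_.clear();  // keeps the existing capacity, so no reallocation after warm-up
        scratch_.insert(scratch_.end(), indices.begin(), indices.end());
        // ... process scratch_ without touching the heap ...
    }

private:
    std::vector<std::uint32_t> scratch_;  // persistent buffer reused on every call
};
```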

#6 ApochPiQ   Moderators   -  Reputation: 14292

Posted 22 January 2014 - 12:39 PM

Man, allocation is one of my favorite profiler results, because it's basically trivial to change to better allocation strategies without rewriting a ton of code. You know exactly what you need to hit and you should ideally also know its usage patterns well enough to know immediately how to pick a better allocation scheme. It's like free performance.
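For example, a better allocation strategy here often just means dropping in a pool. A toy fixed-capacity pool sketch (not anyone's actual implementation, just the shape of the idea):

```cpp
#include <cstddef>
#include <vector>

template <typename T>
class Pool {
public:
    explicit Pool(std::size_t capacity) : storage_(capacity) {
        free_.reserve(capacity);
        for (T& slot : storage_) free_.push_back(&slot);  // all objects allocated once, up front
    }

    T* acquire() {  // returns nullptr when the pool is exhausted
        if (free_.empty()) return nullptr;
        T* obj = free_.back();
        free_.pop_back();
        return obj;
    }

    void release(T* obj) { free_.push_back(obj); }

private:
    std::vector<T>  storage_;  // backing storage
    std::vector<T*> free_;     // stack of available slots
};
```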

 

Algorithmic improvements are a little worse, because they require hitting more code to improve; but they usually are only painful to me in the sense that they mean I made a dumb implementation choice up-front.

 

Past that is micro-optimization, where I have to do fiddly stupid things to try and squeeze out a few thousand cycles here and there, deep in some hot inner loop or something.

 

But I'm in agreement with earlier posters that seeing nothing is by far the worst. A close cousin is seeing only calls that block inside the kernel, such as waiting on mutexes in a multithreaded program. Seeing only blocking calls/suspended threads means you're going to have a nasty time finding the actual performance problem, because your wall-clock performance is dominated by not doing anything.



#7 Oolala   Members   -  Reputation: 707

Posted 22 January 2014 - 02:24 PM

I agree, seeing nothing would be worse.

 

The reason I hate seeing 'new' float to the top of stuff is mostly that it almost guarantees having to make a bunch of tiny tweaks in a bunch of tiny places to implement either a pooling mechanism or a more intelligent copy-by-value mechanism for composite classes.  The second is the one currently killing me.  It either ends up meaning a few high-level structural changes, or a whole bunch of small, annoying changes in a bunch of places.
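A sketch of what cutting down those copy-by-value allocations can look like for a composite class (the Entity type and functions here are invented for illustration):

```cpp
#include <string>
#include <utility>
#include <vector>

struct Entity {
    std::string        name;
    std::vector<float> weights;
};

// Passing by const reference avoids copying the string and vector (and their heap allocations).
float score(const Entity& e) {
    return e.weights.empty() ? 0.0f : e.weights.front();
}

// When ownership really must transfer, move instead of copy so the existing buffers are reused.
void store(std::vector<Entity>& all, Entity e) {
    all.push_back(std::move(e));
}
```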

 

Normally my first pass on code is written to the algorithmic complexity of the problem at hand, without really taking performance into consideration.  Seeing "new" at the top of my profiler hot-spot list is pretty much the computer telling me "no", and demanding that I now mire myself in the fine-grained performance details.  Admittedly, there are worse things the profiler can tell me, but this is always an annoying one that tends to mean far-reaching changes.

 

Yes though, seeing nothing is definitely worse.  I once got a profiler result that had not a single hit over 0.1% without counting children in the call graph.  That one turned out to be not so bad, because there were some very high-level changes that could be made that had a big impact, but it wasn't obvious from the profiler results.

 

Sounds like some others are working on embedded systems, in which case the profiler can't hurt you any more than you're already used to being hurt on a daily basis.  Embedded systems are so painful in general.  I did some work on FPGA & CPU hybrid systems a while back, and it took so much work to get even the smallest things done.



#8 ApochPiQ   Moderators   -  Reputation: 14292

Posted 22 January 2014 - 09:07 PM

If you're using "new" that heavily, you're probably doing something wrong.

#9 Matias Goldberg   Crossbones+   -  Reputation: 3007

Posted 23 January 2014 - 08:12 AM

There are three situations I fear:

  1. The one mentioned: Lots of small things adding up (aka nothing "big" to focus on). This is by far fear #1
  2. Everything is a disaster: There are so many "big" things, the code is so badly written it's just better to rewrite from scratch. It's very similar to #1 (if everything's big, nothing's big).  But the situation is so bad, it needs to be put in its own category.
  3. The "big" things can't be optimized further: Something big is showing up on the profiler, but it is already cache friendly, SIMD optimized, and multithreaded. The reason it's showing up is because... well, it's the nature of the job being performed. There's just too much data to process. The only solution is to search for other algorithmic optimizations, but you have a hard time thinking of anything better than the algo being used. Fortunately this one is really hard to happen, because rarely I see code so well written and designed. Where I see this problem the most is in emulators.

Edited by Matias Goldberg, 23 January 2014 - 08:15 AM.


#10 phantom   Moderators   -  Reputation: 6795

Posted 23 January 2014 - 06:35 PM

I'd agree with all the 'nothing' and 'lots of small things' above but I'd also like to add one more to the mix; something that you caused ;)

#11 frob   Moderators   -  Reputation: 18883

Posted 23 January 2014 - 09:11 PM

I'd agree with all the 'nothing' and 'lots of small things' above but I'd also like to add one more to the mix; something that you caused ;)

I learned years ago to just assume I caused all the problems. It saves time.

Usually I didn't, but approaching it with "What did I break, and how can I fix it?" is much more useful than taking time to assign blame to others. It also looks better to managers.

If it is something particularly nasty you can check version control when done, but most of the time it isn't worth it. I often respond to queries about broken systems with "I wonder how I broke that..." because in some code bases a change in one place really can cause unexpected behavior in the most bizarre places. Being someone who submits a lot of changes affecting core functionality also means you are likely to break other people's stuff. A GPE working on the fringes usually only breaks his own stuff.

One tech lead I worked with had an interesting viewpoint. As developers grow in skill, experience, and domain, the bugs they cause should be even bigger, more powerful, and harder to fix. A good developer will constantly challenge himself. Bugs usually require more brainpower to find and fix than to write, and if some tricky bug took the full brainpower of a senior engineer to construct you know it will be nasty to fix. A junior engineer who breaks the build with a missing semicolon is fine. When a senior engineer breaks the build you can tell they are doing their job well when the breakage is a subtle change that takes the entire team out for three days and five people hunting through the code can't find the bug even after isolating the specific change that broke it.



