How do you use the VS2008 profiler effectively?

I've been using VS2008's profiler in instrumentation mode to benchmark some expensive computational geometry code I've written. But there are quite a few issues making the profiler less effective than I'd hoped.

1) The profiler collects an *enormous* amount of data, and running it for more than a few seconds results in data files hundreds of MB in size. Since my program's render loop is continuous, the profiler collects data like crazy until I pause it. These files take forever to process unless I'm careful to only turn collection on/off immediately before and after the part of the program I want to test.

2) Code runs about 100x slower while being profiled, and the figures the profiler gives me for time spent in various functions are not accurate. Ideally I'd go off the proportion of time spent in each part of the algorithm, but that doesn't seem to be accurate either. Sometimes it records a certain part of the algorithm as a bottleneck, and other times the listed time is almost zero. I reproduce the same conditions each time I profile, so those functions always do the same amount of work.

3) Memory allocation is very slow from inside the Visual Studio IDE, even in release mode. Some of the operations I'm profiling are allocation-heavy, and the algorithm as a whole runs 5x as fast if I run the .exe from outside the IDE. Despite this, new/delete aren't showing up as bottlenecks at all.

4) A few times I got seemingly impossible results, such as the value for "time spent in function" being larger than the value for "time spent in function and its children".

I ended up doing my own tests by recording the time before and after certain important function calls, which gave me much more consistent and accurate data. There was one thing the profiler was good at, though: it records the function call tree and the number of times each (non-inlined) function was called.
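For what it's worth, the manual timing was nothing fancy. Here's a minimal sketch of the approach, assuming Windows and the Win32 high-resolution counter (the label and the wrapped call are made up):

    // Scoped timer: measures from construction to destruction and prints the result.
    #include <windows.h>
    #include <cstdio>

    class ScopedTimer
    {
    public:
        explicit ScopedTimer(const char* label) : m_label(label)
        {
            QueryPerformanceCounter(&m_start);
        }
        ~ScopedTimer()
        {
            LARGE_INTEGER end, freq;
            QueryPerformanceCounter(&end);
            QueryPerformanceFrequency(&freq);
            double ms = 1000.0 * (end.QuadPart - m_start.QuadPart) / freq.QuadPart;
            printf("%s: %.3f ms\n", m_label, ms);
        }
    private:
        const char*   m_label;
        LARGE_INTEGER m_start;
    };

    // Usage (BuildConvexHull is a hypothetical function under test):
    // { ScopedTimer t("BuildConvexHull"); BuildConvexHull(points); }

Because nothing else is instrumented, the numbers stay consistent from run to run.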

Anyway, is there a way to use the profiler to get more accurate results? Is it more effective in cases where you don't already know which parts of the program are bottleneck code?
Quote: Original post by taz0010
Anyway, is there a way to use the profiler to get more accurate results? Is it more effective in cases where you don't already know which parts of the program are bottleneck code?

There are lots of ways to profile.

First (and hopefully this is obvious) you should only profile the optimized build.

You are right that invasive profilers are not useful to just turn on at application start and leave running the entire time. Instead, get your test environment into the state you want to examine, start the profiler, perform the action, and stop the profiler.
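If you'd rather scope collection from code than from the IDE, the VS2008 profiler exposes a control API for exactly this. A rough sketch, assuming the Team System profiling tools (VSPerf.h / VSPerf.lib) are installed and the app was launched with collection initially off; the function under test is hypothetical:

    // Turn data collection on only around the interesting work.
    // Launch beforehand with something like:
    //   VSPerfCmd /start:trace /output:run.vsp /globaloff
    #include <VSPerf.h>

    void ComputeExpensiveGeometry();  // hypothetical: the code under test

    void RunGeometryTest()
    {
        StartProfile(PROFILE_GLOBALLEVEL, PROFILE_CURRENTID);  // begin collecting
        ComputeExpensiveGeometry();
        StopProfile(PROFILE_GLOBALLEVEL, PROFILE_CURRENTID);   // stop collecting
    }

This keeps the .vsp file down to just the region you care about.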


Function-based profilers are useful in measuring call counts, large-scale timings, and call graphs.

You normally want to run the profiler four or five times and throw out outlier data. Usually it will give good results, but oddities occur.

The instrumenting profiler gives exact call counts and exact trees. This is often a very useful tool for finding anomalies.

You can also use a statistical profiler. It is less invasive, just peeking into the application at regular intervals to see what it is doing. Statistical profilers don't require nearly as much space, and they run much faster. Total counts are inexact and precision is lost, but they are generally accurate at getting the big picture of what your code is doing. Like invasive profilers, these should only be used for a brief moment, maybe 50-100 frames.
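Starting a sampling run from the command line looks something like this (file and application names are hypothetical):

    rem Sampling run with the VS2008 command-line tools.
    VSPerfCmd /start:sample /output:geometry.vsp
    VSPerfCmd /launch:MyGeometryApp.exe
    rem ...exercise the code under test, then shut the collector down:
    VSPerfCmd /shutdown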


Many times I will get the application running, drop a breakpoint, run the profiler for a single game frame (until the breakpoint hits again), and use those results. Other times I will use a counted breakpoint to run for 10 or 20 frames. That is more than enough information for call counts and call graphs.
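If you prefer to do it in code rather than with breakpoints, a frame-counter gate around the collection calls achieves the same thing. A sketch using the same control API mentioned above (the frame count and loop body are hypothetical):

    // Collect profiler data for exactly 20 frames, then stop.
    #include <VSPerf.h>

    void UpdateAndRender();  // hypothetical per-frame work

    static int s_frame = 0;

    void OnFrame()
    {
        if (s_frame == 0)
            StartProfile(PROFILE_GLOBALLEVEL, PROFILE_CURRENTID);
        else if (s_frame == 20)
            StopProfile(PROFILE_GLOBALLEVEL, PROFILE_CURRENTID);
        ++s_frame;

        UpdateAndRender();
    }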


You mentioned one area where a system-wide profiler is very useful: function hotspots. If you notice that one small update is calling strlen() a few thousand times, that tells you somewhere to clean up. When you run your profiler for a single frame and see a single function called hundreds of times, you know something is fishy.

Another very useful item is memory profiling. You can identify locations that are performing large or frequent allocations. You might discover that someone is adding thousands of items to a std::vector or std::string without reserving space first, which can result in many thousands of unnecessary slow allocations. Call graphs for memory functions can help identify areas of code that need to be improved.
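To make the reserve() point concrete, here is a small illustration (the sizes are arbitrary):

    #include <vector>

    std::vector<float> BuildVertexBuffer(size_t vertexCount)
    {
        std::vector<float> verts;
        verts.reserve(vertexCount * 3);  // one allocation up front for x,y,z per vertex
        for (size_t i = 0; i < vertexCount; ++i)
        {
            verts.push_back(0.0f);  // x
            verts.push_back(0.0f);  // y
            verts.push_back(0.0f);  // z
        }
        return verts;
    }

Without the reserve() call, the vector reallocates and copies its contents repeatedly as it grows, and every one of those reallocations shows up in a memory profile.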

You might be able to generate usable timing results from a global profile, but the profiler's own overhead can sometimes make this difficult. It is often useful to limit profiling to a specific module or library. This reduces the total output to just the section you are interested in adjusting.
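With the instrumenting profiler, that usually means instrumenting only the binary you care about before starting collection, roughly like this (module and file names are hypothetical):

    rem Instrument just the library under test, then trace it.
    VSInstr MyGeometryLib.dll
    VSPerfCmd /start:trace /output:geometry.vsp
    MyGeometryApp.exe
    VSPerfCmd /shutdown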




Profilers are great at a high level but not very useful for finding local issues. They help you identify slow functions and hotspots, but they aren't much help with the local optimizations themselves.

As an example, a profiler can help you identify when you call normalize too often, and it can even help you identify when one of your matrix functions is taking longer on average than it ought to. But it won't help you make either function faster.

I'm guessing the computational geometry code is a localized block of code, not a huge collection of functions. In that situation code analysis and static tools are a better option.

