Jump to content
  • Advertisement
Sign in to follow this  
swiftcoder

std::set destructor performance

This topic is 3700 days old which is more than the 365 day threshold we allow for new replies. Please post a new topic.

If you intended to correct an error in the post then please contact us.

Recommended Posts

I have written a small simulation (work related), which performs most of its work on std::sets. The profiler is telling me that 40% of the running time is spent in the destructor of std::set, and another 25% is spent in the copy constructor. At the same time, the expected performance sinks (union, intersection and difference operations) each consume less than 5% of the running time. The sets are relatively small (never greater than 100 items), and contain only pointers, so the destructor and copy constructor should have very little work to do. I have tried reusing the sets by calling clear(), which had no performance benefit, and nor does replacing the copy constructor with a call to swap(). I am running Apple's version of GCC 4.0.1, and am compiling with the -O3 -g flags (-g to obtain symbol names for the profiler). Profiler in use is Apple's Shark. Any ideas how I might improve on this performance? It seems more than a little odd, given that the items involved are raw pointers. I would rather not have to replace std::set, but if necessary, that is certainly an option.

Share this post


Link to post
Share on other sites
Advertisement
Quote:
Original post by SiCrane
Did you try using a pooled allocator?
Thanks! I hadn't thought of that, so I just added in boost::fast_pooled_allocator, with the result that the copy constructor cost fell 1-2%, while the destructor cost fell 15%.

That is a very decent speedup, but these two still account ~50% of my running time, and still dwarf the cost of the actual set operations (12%, 8% and 6% respectively).

Share this post


Link to post
Share on other sites
union, intersection and difference doesn't involve allocation or deallocation.

Well, it does -- but that time gets assigned (probably) to the constructor and destructor. :)

Sets and Maps are tree based structures. This makes them exceedingly slow to copy.

How are you using your std sets? Do you add and remove elements, or is it more of a "build up the set" then "read from the finished set" then "throw away the set" sequence?

Because the build/use/dispose can be done quit efficiently with lazy-sorting on a vector buffer. And if duplicate elements is (relatively) rare or not a problem, then you can even defer detection of duplication until the lazy-sort efficiently.

Share this post


Link to post
Share on other sites
Obviously, to provide any useful information you need to show us some sample code. But my hunch is that you're passing/returning std::sets to/from functions by value. Unless you can actually explain why your code should spend any significant amount of time creating/destroying sets.

Share this post


Link to post
Share on other sites
Quote:
Original post by osmanb
Obviously, to provide any useful information you need to show us some sample code. But my hunch is that you're passing/returning std::sets to/from functions by value. Unless you can actually explain why your code should spend any significant amount of time creating/destroying sets.
Code is long and convoluted (this is a simulation of a bittorrent-like p2p network).

All functions producing sets are passed the destination set by reference, and sets are only allocated in the main loop.

In a given cycle of the simulation, there are anywhere between 10 and 100 set operations (union, intersection or difference), and only a single set copy/swap.

However, if each of those operations counts as a copy to fill the destination set, and a destruction to clear the result (as NotAYakk suggests), you can see where the problem may arise.

Share this post


Link to post
Share on other sites
I know this is probably a totally dumb question, but you are running a Release build, right? I've seen problems with sets and maps consuming huge amounts of CPU time due to iterator checking and other debug/safety functionality.

Share this post


Link to post
Share on other sites
Quote:
Original post by swiftcoder
you can see where the problem may arise.
Is it actually slow then?
The union, intersection and difference functions have a linear time complexity, algorithmically quite fast plus they're only dealing with pointers, I'm not actually that surprised that they only account for 26% of the running time, compared to the more-hefty manipulations of the set data structures as well as allocation and deallocation of memory.

Share this post


Link to post
Share on other sites
Well, since you're containing PODs, you could eventually do a region allocator and avoid walking through the structure to free the memory.

Standard containers don't allow that, however. Maybe you could look into a Boost.Intrusive solution.

Why do you have a container of pointers also?

Share this post


Link to post
Share on other sites
Quote:
Original post by ApochPiQ
I know this is probably a totally dumb question, but you are running a Release build, right? I've seen problems with sets and maps consuming huge amounts of CPU time due to iterator checking and other debug/safety functionality.
Ja, highest optimisation level under GCC.
Quote:
Original post by dmatter
Is it actually slow then?
Bit hard to say - I don't really have a baseline. The simulation was originally written in python, but I ported it to C++ when the running times became too large.

The C++ code shows a 4-5x speedup over Python, which means that the (very small) simulations we are running currently take only 10 minutes each. Unfortunately, in the near future I will have to run much larger simulations, and very many of them.

Ultimately, after applying the pool allocator, it is probably fast enough - but the less time I spend waiting for data, the better.
Quote:
Original post by loufoque
Well, since you're containing PODs, you could eventually do a region allocator and avoid walking through the structure to free the memory.

Standard containers don't allow that, however. Maybe you could look into a Boost.Intrusive solution.

Why do you have a container of pointers also?
All the sets contain only pointers to nodes, while the nodes them selves are stored in a sorted, contiguous std::vector. This has the advantage that the sets are only operating on pointers (no expensive copies to be made).

Share this post


Link to post
Share on other sites
Sign in to follow this  

  • Advertisement
×

Important Information

By using GameDev.net, you agree to our community Guidelines, Terms of Use, and Privacy Policy.

GameDev.net is your game development community. Create an account for your GameDev Portfolio and participate in the largest developer community in the games industry.

Sign me up!