HeapAlloc is Hogging My Cycles

Started by
8 comments, last by Kylotan 19 years ago
I've been optomizing a routine I've been working on lately (Profile, Fix Bottleneck, Rinse and Repeat). Now out of the top 2 processor hogs one is a data structure problem I will have to fix, and the other is "RTLAllocateHeap", which is documented as "Heap Alloc" (This is an XP system, if it matters). The code never explicitely call this function, but it is using ~20% of my time. I've pooled the largest chunks of my memory resources so that new and delete are called less, but HeapAlloc is running many times more than the new operator anyway. (31k times for new, but 63 mil for HeapAlloc) Most of what I've read online talks about what it does and how to use it, but does anyone have any idea in what cases it might be called implicitly? Maybe by coding around those situations I can take back that 20% of time. My only guess right now is that when I "clear()" a std::set it also deallocates its memory. I'm tryingt to figure out if thats the case, but my debugger is somehow not reporting the size correctly. Its VS 2003, and the AutoWatch window keeps reporting the size jumping from 0 to some gigantic number back to 0. From looking at my code though there is no possible way I can think of that it could possible be resizing like that. Thanks for any help -Chris
Turring Machines are better than C++ any day ^_~
Advertisement
I think its generally best to not have any (or at least very little) memory allocation/deallocation during the game loop, only at the start and the end of the game (or levels or whatever). Maybe you could look into reducing the amount of memory you allocate/deallocate during your game?
Quote:Original post by BosskIn Soviet Russia, you STFU WITH THOSE LAME JOKES!
It actually only allocates memory once durring each loop, and even then its just a vector that is declared with the exact size it needs to be, unless my memory manager needs to allocate more memory. But as you can see from the number of times new is called, that pales in comparison to HeapAlloc. Also, operator New is pretty low on the profilers list of problem points, so I'm thinking that HeapAlloc is being implicitly called elsewhere.

Any ideas?
Turring Machines are better than C++ any day ^_~
You say you only allocate memory once, but you also say you're clearing an std::set each time, which implies to me that you're also filling it each time, yes? Adding elements to an std::set is going to cause an allocation as it creates a new node to store the element in.
Quote:Original post by Kylotan
You say you only allocate memory once, but you also say you're clearing an std::set each time, which implies to me that you're also filling it each time, yes? Adding elements to an std::set is going to cause an allocation as it creates a new node to store the element in.

Ya know, you are exactly right. For some reason it didn't occur to me that it wouldn't work like its .Net counterpart and reserve memory for that. I guess I need to research how to right an allocator that will manage a memory pool for my set.

Thanks
Turring Machines are better than C++ any day ^_~
I came across this yesterday, which could perhaps be of use: Pooled Allocators for The STL.
Doesn't Boost include such allocators?
I just checked and found boost::pool_allocator and a few variations.
"Walk not the trodden path, for it has borne it's burden." -John, Flying Monk
Quote:Original post by Extrarius
Doesn't Boost include such allocators?
I just checked and found boost::pool_allocator and a few variations.

Yep, for now that is the route I am taking, using the boost::fast_pool_allocator. Looks like I still have some bottlenecks to sort out with my uses of the set, but at least the memory management is probably as good as it is going to get for now.

If anyone is still reading this post, am I an idiot for using a set for an A* open list? I'm doing a ton of inserts and erases on it, and those are the most expensive operations I am doing right now.
Turring Machines are better than C++ any day ^_~
Quote:Original post by intrest86
Ya know, you are exactly right. For some reason it didn't occur to me that it wouldn't work like its .Net counterpart and reserve memory for that. I guess I need to research how to right an allocator that will manage a memory pool for my set.


Just to clarify something as it was only hinted all standard library containers are parameterized by allocator type, generally the last template type parameter and defaults to using std::allocator that generally uses global operators new/delete via explicit invocation & placement new operator (de/allocation & construction/destruction is separated for efficency).

This means you can write or use custom allocator types that uses different memory models/schemes and as mentioned already the boost library has some packaged STL compliant allocator types, example usage:

typedef boost::fast_pool_allocator<foo> foo_alloc;typedef std::set< foo, std::less<foo>, foo_alloc > foo_set;


Note that in general allocators are not really allocators of T (most of the time), they are generally *rebound* (at compile-time) into an allocator of something else internally because generally containers allocate nodes not just T so you can assume that internally a container will do something like as follows:

template < typename Tp >struct foo_node {   //.......   Tp value;};template < typename Tp, typename Alloc = std::allocator<Tp> >class foo_container {   typedef typename Alloc::template rebind< foo_node<Tp> >::other node_allocator;   struct foo_impl : node_allocator { //empty base optimization (EBO)         //....   };   foo_impl imp;   //....   };


Quote:Original post by Extrarius
Doesn't Boost include such allocators?
I just checked and found boost::pool_allocator and a few variations.


So far they are only 2, pool_allocator & fast_pool_allocator. Don't let the names fool you [smile] generally pool_allocator is slightly better for containers that allocate large contiguous chunks of memory as is the case with std::vector.

fast_pool_allocator is slightly better with containers that allocate smaller non contiguous chunks. I found i got superb performance/efficiency with a combination of std::deque & boost::fast_pool_allocator.
Quote:Original post by intrest86
If anyone is still reading this post, am I an idiot for using a set for an A* open list? I'm doing a ton of inserts and erases on it, and those are the most expensive operations I am doing right now.


I use an std::deque, with the default STLPort allocator. Getting the next node is just a front/pop_front, and I just use a linear search to find where to insert new nodes. I have no performance problems yet, and if I do I'll try a binary search there instead.

If you write your A* algorithm in a fairly generic way and don't refer to the data types at all in the main function, you should be able to swap the implementation containers around quite easily and see what works best for you.

This topic is closed to new replies.

Advertisement