\$10

### Image of the Day Submit

IOTD | Top Screenshots

## My Custom Memory Pool Seems Too Slow. Need Advice.

Old topic!

Guest, the last post of this topic is over 60 days old and at this point you may not reply in this topic. If you wish to continue this conversation start a new topic.

11 replies to this topic

### #1away2903  Members

Posted 19 April 2014 - 06:31 AM

Hi guys,

So I'm warming up myself with C++ again, but my C++ isn't so much before anyway. As I am back, I try to get deeper in memory management.

I am currently trying to create my own generic memory pool for any object that inherits a parent class (I call it Object, which has a static memory pool attached to it). I need this memory pool so I could manually manage any child class that inherits Object.h at any time I, or the pool desire (I guess it's a garbage collector in the end; it is simply a pool).

So every time a class that inherits Object, it will have its new and delete overridden. The "new" will have a normal manual memory allocation, and "delete" simply stores the memory back to the pool in case it is needed again/not going back to free store, unless/until I flush them all either manually/automatically depends on the application needs.

Since children classes are always different, I make the memory pool to be generic per child class, therefore I use STL map. I do this to avoid memory corruption as every child class should have different size of memory required. This causes every memory allocation to check which class to find.

The memory pool works, but the performance is severe. I don't know whether it is the architecture, or STL overuse, or the code is too blunt, but I really need someone to give feedback on it. The performance is like 9x slower compare to default "new" when I test it with 100K iteration (1.8 sec compare to 0.2 sec) when allocating the memory. I even tried another test by filling the pool first before the test but it gets even slower! And I don't even include tracker for the pointers yet.

The thing is, I don't know if this is normal or not. But having it 9 times slower is somehow unacceptable for me. So again, please have a check if I miss something, maybe it's unsafe, the design sucks, or anything you find odd (it contains leaks, corruption, etc.). I need guidance.

Soo here's the code:

MemoryPool.h -> The memory pool!

/**
* Generic stack-based memory pool.
*/

#ifndef MEMORY_POOL_H
#define MEMORY_POOL_H

#include <typeinfo>
#include <vector>
#include <map>
#include <string>

#include "Common.h"

namespace Engine
{
using std::vector;
using std::map;
using std::string;

template<typename T>
class MemoryPool
{
public:
MemoryPool();
~MemoryPool();
void* alloc(size_t size);
void dealloc(void* ptr);

private:
map<string, vector<char*>> _pool = map<string, vector<char*>>();
};

template<typename T>
MemoryPool<T>::MemoryPool()
{
}

template<typename T>
MemoryPool<T>::~MemoryPool()
{
}

template<typename T>
void* MemoryPool<T>::alloc(size_t size)
{
char* ptr;
vector<char*> specificPool;
std::map<string, vector<char*>>::const_iterator searcher = _pool.find(typeid(T).name());

if (searcher != _pool.end())
{
specificPool = searcher->second;
}
else
{
_pool.insert(std::pair<string, vector<char*>>(typeid(T).name(), vector<char*>()));
specificPool = _pool[typeid(T).name()];
}

// If there are idle pointers, then reuse one.
if (specificPool.size() > 0)
{
ptr = specificPool.back();

specificPool.pop_back();

return ptr;
}

// If there is nothing to reuse, then allocate new one.
ptr = new char[size];

if (ptr != NULL)
{
return ptr;
}

// Return nullptr when no memory left.
return nullptr;
}

template<typename T>
void MemoryPool<T>::dealloc(void* ptr)
{
std::map<string, vector<char*>>::const_iterator searcher = _pool.find(typeid(T).name());

if (searcher != _pool.end())
{
_pool[typeid(T).name()].push_back((char*)ptr);
}
else
{
cout << "ERROR: No pointer pointing at such object to deallocate!" << endl;
}
}
}

#endif

Object.h -> The parent class!

#ifndef OBJECT_H
#define OBJECT_H

#include <string>
#include <vector>

#include "MemoryPool.h"

namespace Engine
{
using std::string;
using std::vector;

template<typename T>
class Object
{
public:
static MemoryPool<T> memoryPool;

public:
Object(string name);
~Object();
string& getName();
void* operator new(size_t size);
void operator delete(void* ptr);

/**
* Reset the object. Always put all cleaners here instead of destructors!
*/
virtual void reset() = 0;

private:
string _name;
};

template <typename T>
MemoryPool<T> Object<T>::memoryPool = MemoryPool<T>();

template <typename T>
Object<T>::Object(string name) : _name(name)
{
}

template <typename T>
Object<T>::~Object()
{
_name.clear();
}

template <typename T>
void* Object<T>::operator new(size_t size)
{
return memoryPool.alloc(size);
}

template <typename T>
void Object<T>::operator delete(void* ptr)
{
memoryPool.dealloc(ptr);
}

template <typename T>
string& Object<T>::getName()
{
return _name;
}
}

#endif



Entity.h -> Sample usage of a child class that inherits Object.

#ifndef ENTITY_H
#define ENTITY_H

#include <vector>

#include "Common.h"
#include "Object.h"
#include "Component.h"

namespace Engine
{
using std::vector;

class Entity : public Object<Entity>
{
public:
vector<Component*> components;

Entity(string name);
~Entity();
void clearAllComponents();
void reset();
};
}

#endif


And here's how I benchmark them (without filling the pool first):

#include <iostream>
#include <ctime>

#include "Engine\Core\Entity.h"

using std::cout;
using std::endl;

using Engine::Entity;

/* Normal class */
class Car
{
public:
Car() {}
~Car() {}
void getName() {}
};

int main()
{
float startTime2 = (float)clock() / CLOCKS_PER_SEC;
for (int iii = 0; iii < 100000; ++iii)
{
Entity* o = new Entity("Entity");
}
cout << (float)clock() / CLOCKS_PER_SEC - startTime2 << endl;

float startTime = (float)clock() / CLOCKS_PER_SEC;
for (int i = 0; i < 100000; ++i)
{
Car* car = new Car();
}
cout << (float)clock() / CLOCKS_PER_SEC - startTime << endl;

return 0;
}


Note that there's Common.h header file. It contains only standard includes like cout and endl for debugging purposes, nothing else.

Thank you guys!

Edited by mychii, 20 April 2014 - 04:54 AM.

### #2ApochPiQ  Moderators

Posted 19 April 2014 - 06:40 AM

In your memory pool's alloc function, specificPool is copied by value from the map on each allocation. I suspect this is a bug and what you really want is to have specificPool be a pointer to a vector in the map. Is that correct?

Wielder of the Sacred Wands

### #3Waterlimon  Members

Posted 19 April 2014 - 07:07 AM

Do I understand this correctly:

-Entity is Object<Entity>

-Instances of Object<Entity> have MemoryPool<Entity>

-Only objects of type Entity can be allocated into that specific pool

I dont see how the map works here, wont every instantiation of MemoryPool only allow a single type to be allocated anyways, which would mean that there will only every be a single key in the map? Im not very used to template code and static members so i might have misundestood something.

Anyways, I do have suggestions, assuming there is nothing fundamentally wrong in how the pool works:

-Would it be possible to use the hash_code of the type_info struct instead of the name for the map key? This way you wouldnt need to use std::string but could have an int instead which might be faster.

-Reserve some capacity into the vectors when you create them so they dont need to reallocate all the time

-You could try using an unorderer_set instead of a map, access is about O(1) I think. If the map access is a bottleneck.

-In your test code, make sure that the objects you compare are actually the same. In the code you posted, car is probably a single byte, while your Entity might even allocate memory because of the vector contained.

And of course, use a profiler to figure out what it is you are doing that is slowing down your pool.

EDIT:

You might want to consider a less intrusive option. Your classes probably should not know where and how they are allocated.

Edited by Waterlimon, 19 April 2014 - 07:15 AM.

o3o

### #4away2903  Members

Posted 19 April 2014 - 07:17 AM

In your memory pool's alloc function, specificPool is copied by value from the map on each allocation. I suspect this is a bug and what you really want is to have specificPool be a pointer to a vector in the map. Is that correct?

I guess you're right, I can optimize that part. I'll let you know. EDIT: Tried it! it reduces by 0.3 sec! Thanks. ( )

Do I understand this correctly:

-Entity is Object<Entity>

-Instances of Object<Entity> have MemoryPool<Entity>

-Only objects of type Entity can be allocated into that specific pool

I dont see how the map works here, wont every instantiation of MemoryPool only allow a single type to be allocated anyways, which would mean that there will only every be a single key in the map? Im not very used to template code and static members so i might have misundestood something.

Anyways, I do have suggestions, assuming there is nothing fundamentally wrong in how the pool works:

-Would it be possible to use the hash_code of the type_info struct instead of the name for the map key? This way you wouldnt need to use std::string but could have an int instead which might be faster.

-Reserve some capacity into the vectors when you create them so they dont need to reallocate all the time

-You could try using an unorderer_set instead of a map, access is about O(1) I think. If the map access is a bottleneck.

-In your test code, make sure that the objects you compare are actually the same. In the code you posted, car is probably a single byte, while your Entity might even allocate memory because of the vector contained.

And of course, use a profiler to figure out what it is you are doing that is slowing down your pool.

The idea of the map is to avoid that specific situation. Say I have A, B, C class that inherits Object, I assume that Object would be generic on each of them, like Object<A>, Object<B>, Object<C>. So the map contains vectors of pointers for each A, B, and C when required. Say we're allocating memory for A, if the map is not yet have vectors of pointers for A objects, then create such vector. Then A would have its own pool. Whenever Pool for A is filled, when allocating new As, it will automatically reuse the pointers from the vector inside map, which has they key for A.

1. Hmm I haven't check on hash_code yet, maybe it would. Yes! probably string is the case.

2. Will try!

3. I'll try that, yeah probably the map. Thanks for pointing it out!

4. You're right. I was probably got terrified already with the number. I'll post out with comparable items. Still,1.8 sec for such test is somehow unexpected, but that's why I open this thread to make sure if it's okay or not okay.

EDIT: Using all the first 3 suggestions (including ApochPiQ suggestion), it drops to 0.6 seconds for 100K iterations of allocating the Entity! But I used unordered_map here instead of unordered_set, because I am unable to track which object to pool without a key. I'll find a way for that. And also I hardcoded the capacity reserve for vector according to the test. I have to find out on how this should work in real application. Anyway, without reserving the capacity, I will get 0.95x seconds.

I am so not knowing how to use this IDE (Visual Studio 2013), is it related to that? do you mind telling me about this profiler stuff?

Edited by mychii, 19 April 2014 - 08:54 AM.

### #5Waterlimon  Members

Posted 19 April 2014 - 07:24 AM

I have used the Very Sleepy profiler (not part of visual studio) to do my profiling. I dont think visual studio express edition has a profiler.

I understand what you intend to do with the map, but to me it seems as if you would have 3 maps: 1 for A, 1 for B, 1 for C, because the MemoryPool class itself is templated. (so you have MemoryPool<A> with its map, MemoryPool<B> with another map etc.)

Looking at the code it seems as if the MemoryPool class itself does not need to be a template. Only some of the functions (the allocation ones) need to be. This would probably fix it.

Edited by Waterlimon, 19 April 2014 - 07:26 AM.

o3o

### #6away2903  Members

Posted 19 April 2014 - 07:32 AM

I have used the Very Sleepy profiler (not part of visual studio) to do my profiling. I dont think visual studio express edition has a profiler.

I understand what you intend to do with the map, but to me it seems as if you would have 3 maps: 1 for A, 1 for B, 1 for C, because the MemoryPool class itself is templated. (so you have MemoryPool<A> with its map, MemoryPool<B> with another map etc.)

Looking at the code it seems as if the MemoryPool class itself does not need to be a template. Only some of the functions (the allocation ones) need to be. This would probably fix it.

I'll check on that profiler, thanks!

That's the funny thing though, the memoryPool in Object is static, so I am assuming it is only one. What I thought is (here's the dangerous part), the memoryPool attached to Object would adapt to whichever who wants to use it (A, B, or C). So again, since memoryPool is only one as static, I assume that the map in memoryPool is shared and dynamically adapt whoever is using it whether it's A, B, or C. But I can't really tell if this is correct or not (anyone please clarify).

Hmm I thought about that too, I'll give in another shot.

Edited by mychii, 19 April 2014 - 07:32 AM.

### #7ApochPiQ  Moderators

Posted 19 April 2014 - 07:50 AM

The key issue is that MemoryPool<Foo> is a different type than MemoryPool<Bar>. This means that static members are not shared between the two different pools.

Wielder of the Sacred Wands

### #8away2903  Members

Posted 19 April 2014 - 08:02 AM

EDIT:

You might want to consider a less intrusive option. Your classes probably should not know where and how they are allocated.

Any idea of the usage? I don't seem to understand the less intrusive option should be. Is it because it has to inherit Object with template?

The key issue is that MemoryPool<Foo> is a different type than MemoryPool<Bar>. This means that static members are not shared between the two different pools.

Oh my I didn't know that at all. Well that clears up what Waterlimon suggested then, that memoryPool probably won't need to be a template.

I guess I have to learn more on how this template works. Thanks ApochPiQ!

Edited by mychii, 19 April 2014 - 08:26 AM.

### #9away2903  Members

Posted 19 April 2014 - 09:10 AM

After the clarification about the template issue, I tried to remove the map and decided to use the template instead. It removes the needs to find which pool I have to allocate (due to map), which increases the performance slightly (up to 0.1 sec) compare to the one that uses map and without template. If I reserve the capacity (by the total number of the test), it is reduced to 0.23 sec! The number of classes that would need this would probably at hundreds or less, unless doing such template that way is very expensive (I don't know).

EDIT: As Waterlimon suggested, I've changed the benchmark against the Car class to be quite a bit the same as Entity. The Car class now inherits Vehicle, where it has to pass a string through its constructor and save it to a name variable. Apparently this is the culprit. Now the Car increases up to 0.5x seconds!

Now everything looks quite normal, as a matter of fact, better!

Thanks a bunch guys!! you guys are very helpful.

Edited by mychii, 19 April 2014 - 05:32 PM.

### #10swiftcoder  Senior Moderators

Posted 21 April 2014 - 04:36 AM

The Car class now inherits Vehicle, where it has to pass a string through its constructor and save it to a name variable. Apparently this is the culprit.

Are you passing those strings by const reference, or copying them at every step?

Failing to pass complex objects (like string, vector, etc) by const reference will result in a lot of unnecessary copies, which tends to put a serious damper on performance.

Tristam MacDonald - Software Engineer @ Amazon - [swiftcoding] [GitHub]

### #11Waterlimon  Members

Posted 21 April 2014 - 06:14 AM

It would make more sense if you just used the same class for comparing the performance of the allocation method (make a version of the entity class without the Object base class). You only want to compare the allocation method, not the particular properties of the two different classes.

Additionally, I would argue that your test is not very fair to the pool allocator, because you only test allocation, not simultaneous allocation and deallocation, which might be a more realistic situation.

o3o

### #12rip-off  Moderators

Posted 21 April 2014 - 06:56 AM

It isn't clear what problem you're trying to solve. Is allocation currently a bottleneck? Have you investigated trying to avoid heap allocation, rather than trying to speed it up?

Old topic!

Guest, the last post of this topic is over 60 days old and at this point you may not reply in this topic. If you wish to continue this conversation start a new topic.