Designing a stack based Property class

Started by
6 comments, last by IkarusDowned 11 years, 2 months ago

I've been working on a way to have generic properties for an entity-component style system, but without resorting to using pointers of a base class to reference the myriad of different component types a game could have, and using little to no heap allocations.

I'm building a system where the entity has a map of states (equivalent of components), only they are much more fine grained (i'm storing individual types, such as float, int, bool and the like). Whether that's gonna be efficient is yet to be seen, but that's not my question.

Starting from the outside, i have a Statemap class, which is just a wrapper around an unordered_map. The map is templated on .
The State class has template methods, but is not templated itself. It contains an integer handle and a pointer to an IDataStore interface.
The IDataStore has 2 methods, one for acquiring a handle based resource, and another for releasing a handle.
The DataStore class inherits from IDataStore, and is templated on any type (the only requirement is that the type is default constructible, but seeing as i'm currently using only primitive data types for the States makes it ok for now).

I'm at work atm, so i've only come up with an untested sample (so i dont know whether all the template stuff is actually gonna work), but i'd like to get opinions on this anyway. note, the sample ahead contains no sanity checks whether the handle is valid, but they do exist in variuos places inside the DataStore's methods.,>




class IDataStore
{
public:
	int AcquireHandle(int handle = NOT_FOUND) = 0;
	void ReleaseHandle(int handle) = 0;
};

template<typename T>
class DataStore : public IDataStore
{
public:
	DataStore(int defaultSize)
	{
		_cache.reserve(defaultSize);
		...
	}
	
	int AcquireHandle(int handle)
	{
		if(handle == NOT_FOUND)
		{
			if(_freeSlots.empty())
			{
				_refs.push_back(0);
				handle = _cache.size();
				_cache.push_back(T()); //maybe use emplace or something
			}
			else
			{
				handle = _freeSlots.back();
				_freeSlots.pop_back();
			}
		}
		++_refs[handle];
		return handle;
	}
	
	void ReleaseHandle(int handle)
	{
		--_refs[handle];
		if(_refs[handle] == 0)
		{
			_freeSlots.push_back(handle);
		}
	}
	
	T& get(uint handle)
	{
		return _cache[handle];
	}
	
private:
	std::vector<T> _cache;
	std::vector<uint> _freeSlots;
	std::vector<uint> _refs;
};


class Property
{
public:
	template<typename T>
	T& as()
	{
		DataStore& store = getDataStoreReference<T>();
		if(_store != &store)
		{
			throw exception;
		}
		return store.get(_handle);
	}
	
	template<typename T>
	Property(T& value)
	{
		DataStore& store = getDataStoreReference<T>();
		_store = &store;
		_handle = store.Acquire();
		store.get(_handle) = value;
	}
	
	template<typename T>
	Property()
	{
		DataStore& store = getDataStoreReference<T>();
		_store = &store;
		_handle = store.Acquire();
	}
	
	Property(const Property& rhs)
	: _store(rhs._store), _handle(rhs._handle)
	{}
	
	Property& Property::operator=(const Property& rhs)
	{
		if(this != &rhs)
		{
			_store->ReleaseHandle(_handle);
			_store = rhs._store;
			_handle = _store->Acquire(rhs._handle);
		}
		return *this;
	}
	
	~Property()
	{
		_store->ReleaseHandle();
	}
	
private:
	IDataStore* _store; //pointer if a allow resetting, reference if not
	int _handle;
};

This is a sample of the basic outline. Theoretically (i'll test when i get home) you can allocate States on the stack, freely copy them around (as only the handle is copied, not the data itself), to whatever you want with them.
To get the data, you get a copy or reference to the State class, and call the template method as(), and if the types are ok, you get the actuall data by reference.
The function getDataStoreReference() is a bit of a problem tho. Originally i was planning to make it a small function which makes a static DataStore store; and immediatelly returns it, but if that's considered a bad thing (global masked behind a function), the function could make a call to somewhere else and get the actual store which is allocated wherever. Point is, that function can return the Store from wherever it wants, as long as it always returns the same object which doesnt move in memory.

Any thoughts on this are welcome, and criticism (and pointing out holes) is even more welcome.

edit note: the post editor is wonky...

devstropo.blogspot.com - Random stuff about my gamedev hobby

Advertisement

What you are trying to do is basically called the Singleton pattern. Singleton patterns pretty much all use static variables, and their instance is accssed usually through a getInstance() type function. in your case, you have named it getDataStoreReference().

Googling Singleton Pattern would be a good place to start. Also, if you want a good (albeit complex) example of templated singleton-pattern, check out the code for Boost's singleton_pool. It does a similar thing to what you are trying. sepcifically, check the pool/detail/singleton.hpp file and the pool/singleton_pool.hpp file to see how thats done. In fact, the detail/singleton.hpp instance() function does almost exactly what you are worried about being "bad." They have even documented it with comments as to why they are doing it, and should give you some more insight into how you can solve your problem.

Hope that helps!

Yes, i realize the function is basically a global in function disguise, just as a singleton is a global in class disguise. However, the way the stores are instantiated can be up to the programmer, eg. i can make all of the stores in the main loop/engine class explicitly, and put them in some lookup table to be retrieved by a class method. That part is still up for debate on how it could/should be implemented.

The point of the design is to have all data of some type to be in a contiguous block of memory, so it is easier to serialize to disk (for example) and to be more cache friendly. Acquiring the DataStore instance of a specific type isn't the focus of this topic, although it is appreciated to see how it could be improved/changed.

But, you did make me realize that the DataStore has a requirement of always having just one instance, so maybe that justifies the singleton-ish design? Although, i've often read comments like "if you need only one instance, then instantiate only one instance, don't force it". Don't know whether that would make sense in this case. Perhaps acquiring the store instance could go through a proxy object that could have different store instances for different contexts/scenes/screens. Food for thought.

Also, i may have titled the topic incorrectly, by stack based i meant without dynamic allocations :P

devstropo.blogspot.com - Random stuff about my gamedev hobby

So by "stack" you meant "callstack," then? I can't see why this would be a good idea, or even desired. the heap was specifically designed to allow you to hold onto data that beyond stack scope such that it would still be relevant after a call stack unwind.

Out of personal experience with high-speed data systems and general knowledge of how the call-stack / heap works, you should know that you may find huge performance HITS putting non-trivial objects on the callstack.

If you still absolutely want to do this, you'll need to probably:

- extern the decleration to the data you wish to use in your above code

- instantiate it from the top of the main() loop, relying on temporal instantiation.

- ensure that the destructors for said data properly destroy it all on the end of main() scope

if you want to avoid the extern, you need to pass a pointer to the data in all of your functions.

Also, again, what I'm suggesting would allow you to do the linear-allocation scheme using templates. Since you can singleton the linear instance of the type -- ie std::vector<T> and just use that as the storage mechanism for the type.

If under "callstack" you mean automatic storage (vs free storage used for dynamic allocations with the new keyword), then yes.

If the data stores *are* allocated in the main() or any other top layer, then the data they hold will persist through stack unwinding, no?

Note that i never said anything about putting non-trivial objects on the stack. The data would be mostly PODs, with a couple non-POD stuff (Rect, Vec2, Vec3, Color).

As i've been reading the forum a lot these days (quite bored at work), i've read on multiple occasions in multiple posts from multiple people that you should allocate stuff on the stack as much as possible, and ofcourse, trying to keep only one copy of it, so you would pass most of your data around as pointers/references).

Also, many people talked about having the data be cache friendly, storing them in a vector (contiguous memory block) rather than a list (different and unrelated storage locations). Are they wrong, or am i pushing that idea too far?

However, where the data stores are is only part of the above code. The rest of the code allows for the programmer to hold any data types in the same container without using pointers to the data at all (no base classes, no new keyword anywhere), and he only has to know what type the property is. I've tested it in my code, it seems to be working fine so I'm going to use this in my code base for storing Entity properties as long as i dont find anything inherently wrong with it.

devstropo.blogspot.com - Random stuff about my gamedev hobby

[quote name='Strewya' timestamp='1358537173' post='5022960']
If under "callstack" you mean automatic storage (vs free storage used for dynamic allocations with the new keyword), then yes.
[/quote]

Yes, this is called the call stack.

[quote name='Strewya' timestamp='1358537173' post='5022960']
If the data stores *are* allocated in the main() or any other top layer, then the data they hold will persist through stack unwinding, no?
[/quote]

Yes, this isn't a problem. You will still need to have some way of accessing this data, however. In which case, you expose either a global pointer, or you create a singleton. As an implementation detail, I can't really see how this is better than putting things in the heap, but that's your call :)

[quote name='Strewya' timestamp='1358537173' post='5022960']
As i've been reading the forum a lot these days (quite bored at work), i've read on multiple occasions in multiple posts from multiple people that you should allocate stuff on the stack as much as possible, and ofcourse, trying to keep only one copy of it, so you would pass most of your data around as pointers/references).
[/quote]

This is a generalization of a larger problem of not understanding how to properly manage heap memory. Yes, stack-based memory allocation is great if you can guarantee the scope limitations. But if you are planning on making continuous and squential allocations, you may notice a performance penalty. I've tested this with a node-parser library written in C++ i wrote where i attempted to shift the allocations from heap to the callstack. for 1million nested values, the performance went downhill. My assumption was that your data objects would generally be allocated / deallocated frequently; maybe a bad assumption on my part?

[quote name='Strewya' timestamp='1358537173' post='5022960']
Also, many people talked about having the data be cache friendly, storing them in a vector (contiguous memory block) rather than a list (different and unrelated storage locations). Are they wrong, or am i pushing that idea too far?
[/quote]

nope, that's correct. keep in mind that there are some generalizations going on here too (such as cache-line issues and stuff) but i won't get your head muddled with that sort of detail. You should also remember that if you are planning on using the common-implementations of vector type containers (STL for one), all its doing is creating a dynamic array on the heap and providing an access pattern for you. You aren't actually doing it "just on the stack." Welcome to the wide world of C++ :)

Yes, this isn't a problem. You will still need to have some way of accessing this data, however. In which case, you expose either a global pointer, or you create a singleton. As an implementation detail, I can't really see how this is better than putting things in the heap, but that's your call smile.png
Now that you put it like that, yeah, i could. But it really is an implementation detail. I started with the stack for the automatic destruction of objects that go out of scope, but having them wrapped in shared_ptr or unique_ptr gives me the same thing. But at the end, i really need to have global access to those objects. I guess i should wrap them into something to limit project wide access, and allow access only to the few classes that actually use them.
My assumption was that your data objects would generally be allocated / deallocated frequently; maybe a bad assumption on my part?
Well, the objects wouldn't be deallocated at all, they would persist in memory until the program ends. Each time a State is allocated and requests a type handle, if it's requesting a handle that's already in use (copy and assignment), it just points to it. If it needs a new one, the value is push_back'ed into the vector, and remains there forever. If avalue in the vector loses all references to it, only it's index is put into the _freeSlots vector, ready to be reused again when needed. So i only have allocations which happen when acquiring a new data handle (and the occasional resizing of the vector, but that can be controlled with reserving enough memory in advance if needed). And the allocations could occur at any point, though it's most likely i'll construct objects at level/map load or similair waiting point.
nope, that's correct. keep in mind that there are some generalizations going on here too (such as cache-line issues and stuff) but i won't get your head muddled with that sort of detail. You should also remember that if you are planning on using the common-implementations of vector type containers (STL for one), all its doing is creating a dynamic array on the heap and providing an access pattern for you. You aren't actually doing it "just on the stack." Welcome to the wide world of C++ smile.png
Yeah, i'm using STL's vector, and i knew it does that on the heap. I was mostly just concerned with having the actuall data be sequential, than forcing it to be on the stack itself.
My design goals for this were:
- keep data sequential in memory (that's what vector is for)
- avoid typecasting if possible (that's what the templated getDataStoreReference() is for)
- contain multiple data types in the same container of choice without using 'new' (the original idea which got me on this path)

Thanks for the replies and helping me realize some things about my designs smile.png

devstropo.blogspot.com - Random stuff about my gamedev hobby

no problem :)

Also keep in mind that "cache-friendly" is only going to matter if you are planning to access neighboring data in really close intervals. If you are only accessing your DataStore a couple times at once, and on top of that you are doing it with lots of other code in the middle your cache-locality is for the most part not going to matter. ie:


getData(a);
getData(b);
getData(c);
//...continues
//OR
for( int i = 0; i < 100; ++i)
{
    getData(d);
}

in the above case, for example, cache locality will possibly come into play.

if instead its something like


getData(a);
//...do a whole bunch of processing, a few un-related Update() calls, etc
getData(b);
//do some more processing

you will see no practical difference.

That being said, arrays are faster random access lookup than lists, but again its a matter of your needs more than "what is the best container / object / pattern in the world"

This topic is closed to new replies.

Advertisement