Madster

Generic Datastore

Madster    242
Hi, haven't posted in a few months.... grr classes. Anyway, here's my question.

Has anyone implemented, or does anyone have ideas on how to implement, a generic datastore object? The idea is to have a data-centric architecture in which several classes attach arbitrary data to the datastore, which serves it to anyone who wants it, caches it if needed and properly destroys it. Also, a class should only be able to wipe the data it stored itself. Allowing more than one Datastore would be nice, but that's not really critical.

The main issues are:
- Classes should be allowed to store any kind of data in the store, ordered or unordered.
- Data should be cached if needed, and classes should be able to turn caching on and off at will (there's probably no need to toggle it after creating the first object in the store for a given class, though).
- Classes might want to see the data stored by other classes.
- Any class might request the data to be delivered back to it in a certain order, maybe with special ordering types.

For the first and second points I just thought of a template with handles and handle dereference objects, as seen in a Gems book. For the third I'm thinking maybe I could define a separate class with the datatype and #include it in all the interested parties. For the fourth one I'm at a loss. Is this doable without calls to sort and find on each data fetch? Would I need a sort-of-iterator-for-handles thing? I don't even know if the two methods I had considered before would allow this. Plus, it's all theory so far.

A use case would be something like storing the level geometry and then having the physics engine request it as an octree and the renderer request the same geometry as a BSP. All of this without having geometry datatypes in the Datastore class.

Anyone done this before?

PS: please excuse any weirdness in this post.. it's 6am :s
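On the fourth point (ordered delivery without sorting on every fetch), one possible approach is to cache a sorted index per ordering and rebuild it only after the data has changed. The following is a minimal sketch under that assumption. The names `OrderedStore`, `add_ordering`, `view` and `sort_count` are hypothetical and do not come from the thread or from any library.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <functional>
#include <map>
#include <numeric>
#include <string>
#include <utility>
#include <vector>

// Hypothetical sketch: a collection that hands out its elements in a named
// order and re-sorts only when the data changed since the last request.
template <typename T>
class OrderedStore {
public:
    using Compare = std::function<bool(const T&, const T&)>;

    // Register an ordering under a name; its cached index starts out stale.
    void add_ordering(const std::string& name, Compare cmp) {
        View& v = orderings_[name];
        v.cmp = std::move(cmp);
        v.index.clear();
        v.dirty = true;
    }

    // Adding data marks every cached index as stale.
    void push(T value) {
        items_.push_back(std::move(value));
        for (auto& entry : orderings_) entry.second.dirty = true;
    }

    // Returns indices into the store, sorted by the named ordering.
    // The sort runs only if the cached index is stale.
    const std::vector<std::size_t>& view(const std::string& name) {
        View& v = orderings_.at(name);
        if (v.dirty) {
            v.index.resize(items_.size());
            std::iota(v.index.begin(), v.index.end(), std::size_t{0});
            std::sort(v.index.begin(), v.index.end(),
                      [&](std::size_t a, std::size_t b) {
                          return v.cmp(items_[a], items_[b]);
                      });
            v.dirty = false;
            ++sort_count_;
        }
        return v.index;
    }

    const T& at(std::size_t i) const { return items_.at(i); }

    // Number of sorts actually performed (for illustration only).
    int sort_count() const { return sort_count_; }

private:
    struct View {
        Compare cmp;
        std::vector<std::size_t> index;
        bool dirty = true;
    };
    std::vector<T> items_;
    std::map<std::string, View> orderings_;
    int sort_count_ = 0;
};
```

Repeated calls to `view` with the same name reuse the cached index until the next `push`. A spatial structure such as an octree or BSP would need its own builder in place of the comparator, but the same build-once, invalidate-on-change idea might apply.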

Madster    242
Eh I forgot...any class should be able to modify what was stored by another class... and there should be a way to *pair* data, such as attaching physics info to a game actor.

Am I aiming too high? O_o
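One way the *pairing* idea could be sketched is to key every kind of data by a shared ID, so that physics info and actor data stay in separate collections but can be matched up. This is only an illustration: `EntityId`, `Attachments`, `Actor` and `PhysicsInfo` are made-up names, not something proposed in the thread.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <unordered_map>
#include <utility>

// Hypothetical shared key: anything stored under the same id is "paired".
using EntityId = std::uint32_t;

struct Actor {
    std::string name;
};

struct PhysicsInfo {
    float mass;
    float velocity;
};

// One collection per data type, all indexed by the same EntityId.
template <typename T>
class Attachments {
public:
    // Attach (or overwrite) the data paired with this id.
    void attach(EntityId id, T value) {
        data_.insert_or_assign(id, std::move(value));
    }

    // Returns a pointer to the paired data, or nullptr if none is attached.
    // The pointer is only valid until the collection is modified.
    T* find(EntityId id) {
        auto it = data_.find(id);
        return it == data_.end() ? nullptr : &it->second;
    }

    // Removes the pairing; returns true if something was removed.
    bool detach(EntityId id) { return data_.erase(id) > 0; }

private:
    std::unordered_map<EntityId, T> data_;
};
```

Any class holding the id can then read or modify what another class attached, which matches the requirement above, at the cost of one hash lookup per access.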

Madster    242
Okay, just thinking out loud here:

Collections would live inside a static STL map, keyed by name, with the element type given as a template parameter, so that when you want to access a collection of values, you'd ask by type and name.

You would use it this way:

Datastore<mydatatype> *store; // emphasis: you can create it whenever you feel like it; it could be a plain variable instead
store = new Datastore<mydatatype>("nameofmystore");
store->push( mydata ); // where mydata is of type mydatatype
delete store; // destroys only this interface; the static store keeps the data

...

store = new Datastore<mydatatype>("nameofmystore");
i = store->begin();
for (....)


So, since the actual store is static, it doesn't get deleted unless you explicitly do so. You only need to spawn "interfaces" to the store.
The store name is there to avoid confusion when there is more than one store of the same type.

The use for this is decoupling the loaders from the renderers from the memory managers :)
Instead of iterators (store.begin() and such) I am considering using handles.

any thoughts?

[Edited by - Madster on December 30, 2005 9:09:38 PM]
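The scheme described in this post could be sketched roughly as follows: a class template whose instances are lightweight interfaces onto a static map of named collections. This is a guess at the intent, not the poster's implementation. The `destroy` and `size` members are assumptions added for illustration, and `std::vector` stands in for whatever container the real store would use.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Sketch: Datastore<T> objects are cheap interfaces; the data itself lives
// in a function-local static map and outlives any single interface.
template <typename T>
class Datastore {
public:
    // Opens the named collection, creating it empty on first use.
    explicit Datastore(const std::string& name)
        : items_(&storage()[name]) {}

    void push(T value) { items_->push_back(std::move(value)); }

    typename std::vector<T>::iterator begin() { return items_->begin(); }
    typename std::vector<T>::iterator end() { return items_->end(); }
    std::size_t size() const { return items_->size(); }

    // Explicitly destroys a named collection (assumed API). Interfaces that
    // still refer to it must not be used afterwards.
    static void destroy(const std::string& name) { storage().erase(name); }

private:
    // One map per element type T, shared by every Datastore<T>.
    static std::map<std::string, std::vector<T>>& storage() {
        static std::map<std::string, std::vector<T>> stores;
        return stores;
    }

    std::vector<T>* items_;  // non-owning; std::map keeps element addresses stable
};
```

Because the interface owns nothing, it can be an ordinary local variable rather than something created with `new` and `delete`; letting it go out of scope leaves the stored data untouched.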

cypherx    204
I don't really understand what you're trying to do, but whatever it is...use a singleton.

DataStore<T>& store = DataStore<T>::GetRef();
store.push_back(t);

Leave them pointers alone!

-Alex
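For readers unfamiliar with the pattern, a `GetRef`-style singleton might look like the sketch below. Only the names `DataStore`, `GetRef` and `push_back` come from the snippet above; the body, `size` and `operator[]` are assumptions for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Sketch of a per-type singleton: exactly one DataStore<T> exists for each T.
template <typename T>
class DataStore {
public:
    // The instance is created on first call; since C++11 the initialization
    // of a function-local static is thread-safe.
    static DataStore& GetRef() {
        static DataStore instance;
        return instance;
    }

    // Copying is disabled so that no second store can appear by accident.
    DataStore(const DataStore&) = delete;
    DataStore& operator=(const DataStore&) = delete;

    void push_back(T value) { items_.push_back(std::move(value)); }
    std::size_t size() const { return items_.size(); }
    const T& operator[](std::size_t i) const { return items_[i]; }

private:
    DataStore() = default;  // only GetRef() can construct the instance
    std::vector<T> items_;
};
```

Callers bind a reference, as in `DataStore<int>& store = DataStore<int>::GetRef();`, so there is nothing to allocate or delete at the call site.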

Madster    242
Umm, I'm trying to set up a scheme so that arbitrary classes will be able to post data or get data. They don't have to know who posted the data they need, they just assume it's there and use it. This should improve modularity, I think, and clean up dependencies (which have bitten me several times in the past).

Say you have a simple procedural control system that modifies a vector of ints called "inputs". You decide to make a new object-oriented system with bells and whistles... that does the same thing. It's configurable and has all sorts of nice things, but its basic purpose is still posting input data.
So... it just modifies the same vector of ints called "inputs". No need to re-hook with game logic everywhere, just once for the update call in the main loop.
The same could be done with everything from pathfinding AI to textures and sounds.

So that's the purpose.
Hm it is really a singleton, isn't it?
Would getting the instance like that be faster than creating and destroying stuff explicitly? Or does it just make it harder to go wrong?

daerid    354
One idea would be to embed a database engine that uses in-memory storage, and then just use SQL statements to query and update the data.

I really don't recommend you do this. You are going to be writing code imperatively, and storing mutable data in a global data structure. Even if your code is not multithreaded (which in this situation would cause you pain and suffering, and ultimately death), you are still breaking down walls of abstraction and your ability to reason about the behavior of code is severely crippled. Any given computation can potentially modify this global state, and the rest of your code has no idea that this is happening. Thus, the behavior of a function invoked with the same parameters twice can result in two different values. At least with OO (not recommended, but used as an example) we can encapsulate some of this state and minimize these weird effects. You also have the issue of tracking dependencies ... you are going to end up having some data hanging around in this repository when it is no longer needed, which is basically a memory leak. So you are going to implement some reference-tracking system on top of your existing memory management solution. You really have no modularization, you have merely transformed object-to-object dependencies into object-to-mutable-global-state dependencies. It's like the worst of OO combined with the clean, elegant approach of the Windows Registry.

Madster    242
Interesting.

I hadn't considered SQL, and it could be argued that I am reinventing the wheel. I'll look into that, though I'm not too fond of passing strings that need to be parsed for simple accesses.

The Reindeer Effect: thanks for the reply.
For multithreading, mutexes would have to come into play. Accessing the same resource from multiple threads at the same time would mean a simple delay, and a warning would be logged, since one would want to minimize these collisions.
About this:
Quote:
Any given computation can potentially modify this global state, and the rest of your code has no idea that this is happening. Thus, the behavior of a function invoked with the same parameters twice can result in two different values.

This is the purpose of the system, that any given computation could potentially modify the global state. Think of it as a game class, where the datastore is a private member. Any method could modify it, yes, that's why not *all* data goes there, only the kind we need to share.
Calling a stateful function (such as most class methods) will often result in two different values. That's not really a problem.

About data hanging around: yes, could be a problem. I plan to provide erase methods, and the method to get data will create a default object if there is none. Keep in mind this is mostly meant to hold resources and game state. The stored objects can also be reference counted, but I believe that's not what you meant. Pointers to objects in the datastore are volatile, and not to be kept around.
Thread safety is accomplished, as usual, through mutexes.

A memory leak is unfreed data that cannot possibly be freed anymore. This is not the case, as the data is still contained in a class that will free it on closure. If misused, it can produce bad memory management at most.

About this:
Quote:
You really have no modularization, you have merely transformed object-to-object dependencies into object-to-mutable-global-state dependencies. It's like the worst of OO combined with the clean, elegant approach of the Windows Registry.

I believe moving dependencies from object-to-object to object-to-mutable-global-state does produce modularization, since now I can attach and detach modules at will. It was, in fact, the driving idea. Can you elaborate more on why you feel this isn't the case?
I was also aiming for registry-like functionality, without getting Win32-dependent, and also keeping things in RAM. Can you elaborate on why you feel this is the worst of OO? It is not pure OO. Is that it? I feel it's pretty clean, but I'm still deciding on the final form.

Thanks again :)
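The mutex, get-or-create and erase ideas from this post could be combined along the following lines. This is a sketch under assumptions, not the poster's code: `LockedStore`, `with`, `erase` and `contains` are hypothetical names, and logging a warning on contention is left out. Access goes through a callback so that no pointer into the store escapes the lock, in line with the remark that such pointers should not be kept around.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <mutex>
#include <string>
#include <vector>

// Sketch: named collections guarded by a single mutex.
template <typename T>
class LockedStore {
public:
    // Runs fn on the named collection while holding the lock. If the
    // collection does not exist yet, an empty one is created first.
    template <typename Fn>
    auto with(const std::string& name, Fn fn) {
        std::lock_guard<std::mutex> lock(mutex_);
        return fn(collections_[name]);
    }

    // Explicitly frees a collection; returns true if it existed.
    bool erase(const std::string& name) {
        std::lock_guard<std::mutex> lock(mutex_);
        return collections_.erase(name) > 0;
    }

    bool contains(const std::string& name) const {
        std::lock_guard<std::mutex> lock(mutex_);
        return collections_.count(name) > 0;
    }

private:
    mutable std::mutex mutex_;
    std::map<std::string, std::vector<T>> collections_;
};
```

A caller would write something like `store.with("inputs", [](std::vector<int>& v) { v.push_back(1); });`. A second thread calling `with` at the same moment simply waits until the first callback returns.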

T1Oracle    100
I say this is a bad approach to a game engine. You want a data-driven architecture in web design, where the data is the most important thing. Also, in web sites you have to represent that data in a myriad of ways. This is seldom true in games, and even when you need to represent that data in various ways it certainly doesn't need to be done with the same level of performance that is expected of a web server.

Game engines should be Object Oriented with well-defined and optimized hierarchies. The data should be organized according to how it works with the rest of the engine so it is right there next to the algorithms that will need it the most. Queries into a central datastore should not be necessary to process critical game data. Its retrieval by the systems that immediately need it should be as optimal as possible.

hplus0603    11347
What is the querying mechanism? If it's just by name (or name-path), then hashes from Perl, or tables from Lua, do exactly that. You can do something very similar in C++ using various "variant" data types, where "map of variants," "list of variants" and "vector of variants" are data types supported by the variant itself.
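As a rough C++ illustration of the "variant" idea, the sketch below uses the standard `std::any` (C++17) as a stand-in for the variant type, so that a map, list or vector of variants can itself be stored inside a variant. This is a swapped-in technique and not necessarily what the poster had in mind. The aliases and the `lookup` helper are hypothetical names.

```cpp
#include <any>
#include <cassert>
#include <list>
#include <map>
#include <string>
#include <vector>

// std::any plays the role of the "variant" here (an assumption of this sketch).
using Variant = std::any;
using VariantMap = std::map<std::string, Variant>;
using VariantList = std::list<Variant>;
using VariantVector = std::vector<Variant>;

// Looks up a value by name. Returns a pointer to it if it exists and holds
// exactly type T, or nullptr otherwise.
template <typename T>
T* lookup(VariantMap& table, const std::string& name) {
    auto it = table.find(name);
    if (it == table.end()) return nullptr;
    return std::any_cast<T>(&it->second);  // pointer form yields nullptr on type mismatch
}
```

Nesting then works by storing one container inside another, for example assigning a `VariantMap` to `root["physics"]` and retrieving it with `lookup<VariantMap>(root, "physics")`, which gives a name-path style of query.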

Madster    242
Why should pure OO be the paradigm of choice? Only recently has OO replaced procedural programming on the game shelves. I like the flexibility that C++ provides for that exact reason: I'm not forced into any paradigm and I get to choose whatever gives the blend of clarity and performance I want.

Since the variety in game datatypes is usually low, the penalty for queries should be low, as the cost of the call is many orders of magnitude below the amount of calculations and iterations that will be performed on the fetched data. As I mentioned (I hope I did, at least) I plan to store collections in this datastore, not individual objects.

hplus0603, that's exactly what I had in mind. storing either lists, vectors or whatever is needed in hashes. Of course, you could store single POD types as well, but that's probably a bad idea.
Edit: as you mentioned, the query is a string that gets hashed, and its hash is compared to the other ones. Should be much faster than parsing an SQL statement.

If I were to discard this idea, I would have a mess of dependencies. How do you usually organize the data flow inside your code? Who owns the pointers to each type of resource? This is something that still puzzles me.
Thanks for the replies so far, and any further replies.
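The "string gets hashed, hashes get compared" query could look something like the sketch below. FNV-1a is used only because it is a simple, well-known string hash; the thread does not name one. `fnv1a` and `HashedStore` are hypothetical names, and the sketch deliberately ignores the possibility of two different names colliding.

```cpp
#include <cassert>
#include <cstdint>
#include <unordered_map>
#include <vector>

// 32-bit FNV-1a. Being constexpr, the hash of a string literal can be
// computed at compile time, so a lookup need not touch the string at all.
constexpr std::uint32_t fnv1a(const char* s) {
    std::uint32_t h = 2166136261u;  // FNV offset basis
    for (; *s != '\0'; ++s) {
        h ^= static_cast<unsigned char>(*s);
        h *= 16777619u;  // FNV prime
    }
    return h;
}

// Collections of ints keyed by hashed name (element type fixed for brevity).
class HashedStore {
public:
    // Returns the collection for this id, creating it empty if needed.
    // Note: distinct names with equal hashes would share a collection.
    std::vector<int>& get(std::uint32_t id) { return collections_[id]; }

private:
    std::unordered_map<std::uint32_t, std::vector<int>> collections_;
};
```

A module could declare `constexpr std::uint32_t kInputs = fnv1a("inputs");` once and then call `store.get(kInputs)`, which costs one integer hash-table lookup per fetch rather than any parsing.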

T1Oracle    100
Quote:
Original post by Madster
Since the variety in game datatypes is usually low

In a very simple, unmodern engine, I guess. Something with a GUI, a script engine, a physics engine (even a lightweight one), particle systems, a scene graph, some means of organizing polygon data, 3D models, collision detection information, textures, audio samples/music, along with game-specific data can easily lead to many different data types. All of them would have a hierarchy best described in an OO approach, and they certainly would stand to suffer serious performance penalties if forced into a generic storage situation.

Quote:
Original post by Madster
why should pure OO be the paradigm of choice?

Not sure if anything is ever "pure OO" but if it can organize things more logically and in a way that connects information more to how it works and what it is than some generic or less functional way, then it should be applied.

Madster    242
Even if there were no overlap between all the things you named, the variety is still low compared to the number of objects that will be handled in each category.
Also I overlooked that there is no penalty whatsoever for variety of datatypes, so that's not even an issue. The only issue is the hash lookup, which isn't slow enough to be a problem IMO, and the added code complexity, which is easy to duplicate even if you don't fully understand it.

However, I'd still like to look at alternatives. I got this idea from a thesis that proposed data-centered architecture, and then failed to show a realistic implementation (the one shown just included everything from everywhere).

Which is the usual way? I know several people here are fans of smart pointers, but GameDev was the first place I ever heard about them.
How and where do you store all your data in your middle-sized game?
