Handle managers

Published March 19, 2006
After thinking it over, I believe the concept of handles will fit my purpose very well (thanks again ApochPiQ!). The main task in front of me now is working out how best to use them in the GameDB system...

Since the Database class should really just be a storage class, I think we need handle managers that can tap directly into it. If anything gets removed from the storage, the handle managers should instantly reflect this - which means they need direct access to the underlying data.
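A rough sketch of the sort of handle manager I have in mind (all names here are placeholders, nothing like a final GameDB API): each storage slot carries a generation counter, so the instant something is removed, every outstanding handle to it fails to resolve - no notification pass required.

```cpp
#include <cstdint>
#include <vector>

// Illustrative sketch only: a handle is an index plus a generation.
// Removing data bumps the slot's generation, so stale handles
// instantly stop resolving.
struct Handle {
    std::uint32_t index = 0;
    std::uint32_t generation = 0;
};

template <typename T>
class HandleManager {
    struct Slot { T value{}; std::uint32_t generation = 0; bool live = false; };
    std::vector<Slot> slots_;
public:
    Handle add(const T& value) {
        for (std::uint32_t i = 0; i < slots_.size(); ++i) {
            if (!slots_[i].live) {              // reuse a freed slot
                slots_[i].value = value;
                slots_[i].live = true;
                return {i, slots_[i].generation};
            }
        }
        slots_.push_back({value, 0, true});
        return {static_cast<std::uint32_t>(slots_.size() - 1), 0};
    }
    void remove(Handle h) {
        if (resolve(h)) {
            slots_[h.index].live = false;
            ++slots_[h.index].generation;       // old handles now fail to resolve
        }
    }
    // Returns nullptr for any handle whose data has been removed.
    T* resolve(Handle h) {
        if (h.index >= slots_.size()) return nullptr;
        Slot& s = slots_[h.index];
        if (!s.live || s.generation != h.generation) return nullptr;
        return &s.value;
    }
};
```

The nice property is that removal is O(1) and nobody holding a handle needs to be told anything - the staleness check happens lazily at resolve time.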

The result sets returned will now hold only handles to the underlying data, which must be resolved before any data can be accessed. This still raises an interesting question, though, given that a result set is effectively a 'view' of the data. If you delete an item from the database, all views should know about this and remove the data from the result sets themselves - I don't want a view to be rendered 'stale' because the data underneath has changed. This brings back my point about connected and disconnected recordsets. Sometimes we may want a snapshot of the data at a given time, whereas at other times we will want a live view. The best way I feel to achieve this is to run an internal messaging system and send out various messages whenever certain events occur.
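To make the connected/disconnected distinction concrete, here's a minimal sketch of the messaging idea (names are illustrative only): the database broadcasts a message on every removal, a connected result set subscribes and drops the handle, and a disconnected snapshot simply never subscribes, so its view stays frozen.

```cpp
#include <algorithm>
#include <functional>
#include <vector>

// Illustrative sketch: an internal "entity removed" message keeps
// connected views fresh while snapshots deliberately ignore it.
using EntityId = int;

class Database {
    std::vector<std::function<void(EntityId)>> removedListeners_;
public:
    void onEntityRemoved(std::function<void(EntityId)> fn) {
        removedListeners_.push_back(std::move(fn));
    }
    void removeEntity(EntityId id) {
        for (auto& fn : removedListeners_) fn(id);   // broadcast the event
    }
};

struct ResultSet {
    std::vector<EntityId> ids;
    // A live view calls connect(); a snapshot skips it. (For a real
    // version the subscription would need to be revocable, since the
    // lambda captures `this`.)
    void connect(Database& db) {
        db.onEntityRemoved([this](EntityId id) {
            ids.erase(std::remove(ids.begin(), ids.end(), id), ids.end());
        });
    }
};
```

The same mechanism extends naturally to other events (attribute changed, tag added) if more message types are needed.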

The points raised about concurrency are valid. I can see this system effectively being run in another thread to fill up a few result sets, while another thread works with the sets, whether filled or partially filled. The idea of result streams is springing to mind... I've never done any true MT programming before, so this aspect is going to be interesting, to say the least.

I should perhaps clarify what sort of 'database' this is - I'm not sure :P It's not a relational database (of sorts). The database element is effectively based around both attributes and tags; entities are retrievable by specifying conditions on the tags and attributes they possess. There's no concept of a 'table' at all, so the strict classifications of C++ classes and the relational theory of an RDBMS simply don't apply. What we have is tags: an object can be tagged with whatever you like and exist as part of multiple tag collections at once, or none at all. Relationships don't really exist, except that you can store handles to other entities as attributes of an entity. Entities can have new attributes added or removed via the API or via scripting, and they can also have function attributes (so you can call functions on them). The action-query side of the API will let you update the attributes of objects en masse by specifying the selectivity conditions.
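A toy version of that tag/attribute model might look like this (a sketch with made-up names, attributes restricted to ints for brevity): entities are just bags of tags and named attributes, and queries filter on the conditions they satisfy rather than on any table structure.

```cpp
#include <map>
#include <set>
#include <string>
#include <vector>

// Illustrative sketch of the tag/attribute model: no tables, just
// entities carrying arbitrary tags and named attributes.
struct Entity {
    std::set<std::string> tags;
    std::map<std::string, int> attributes;   // ints only, for brevity
};

class GameDB {
    std::map<int, Entity> entities_;
public:
    Entity& entity(int id) { return entities_[id]; }

    // Return ids of entities carrying `tag` whose attribute `attr`
    // exists and is at least minValue - one example of a condition.
    std::vector<int> query(const std::string& tag,
                           const std::string& attr, int minValue) {
        std::vector<int> out;
        for (auto& [id, e] : entities_) {
            auto it = e.attributes.find(attr);
            if (e.tags.count(tag) && it != e.attributes.end()
                && it->second >= minValue)
                out.push_back(id);
        }
        return out;
    }
};
```

An action query would be the same loop with an update in the body instead of collecting ids.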



I'm happy to announce that Spotlight #1 is up - go read it on the frontpage. I'm playing the Starscape game by Moonpod as initial 'research' for another article. Their work is outstanding, very polished, and I love being able to call playing games 'research'!

Comments

jollyjeffers
I'm not sure I can offer much comment on the technical side of your problem beyond the fact that it sounds difficult! It also strikes me as a fairly fundamental choice - one where there is no single correct answer, and whichever you choose will become a core influence on how the whole library/API turns out... [oh]

Although, as a potential user, I'm not sure I like the idea of running a query only to then find that it changes as I work my way through it. I can understand why that might be a good thing, but it's the sort of complexity/complication that makes my job a whole lot more difficult [lol]

Gonna go grab some dinner and read the Sector 13 spotlight. Had a quick scan - it's a lot better for having the screenshots in it!

Btw, did you know your journal is the 6th most replied to?

Cheers,
Jack
March 19, 2006 03:48 PM
ApochPiQ
I know that in typical RDBMSes you don't worry about data "going away" after a query. For instance, say I pull up all the posts from BobThePoster and then iterate the result set to find ones longer than 24 characters. While I'm iterating, Bob deletes one of his posts. My resultset will remain intact - it's a snapshot of the results of my query at the time I made the query. My final count may be off by one because of the deleted post, but it's still accurate as of the time the query was initiated.

Rather than counteract this with a messaging system that morphs resultsets to keep them "unstale," you use referential integrity rules. In most cases, it is not trivial to know what resultsets are affected by a simple change. For instance, the deletion of Bob's post may invalidate the query "SELECT COUNT(*) FROM Posts WHERE PosterID = 384" (assuming Bob is poster 384). However, in order to know that, we have to re-run the query and see if it passes over the deleted record. Repeat that for potentially thousands of active queries with various lifetimes, and you've got a terrifically nasty mess to deal with.

Now add synchronization and simultaneity issues [grin]


Personally, I'd favor a design approach on the side of RI rather than trying to fight stale resultsets. The complement of RI is atomic transactions, where you can do many updates to the data and guarantee that all of them are treated as a single update - i.e. nobody else is rearranging data out from under you during the process of those updates. RI basically lets you define metadata that indicates what data relies on others. For instance, the Posts table probably has a RI rule requiring that the PosterID link to an existing record in the Posters table, or be NULL to allow anonymous posts.

Of course, for this system, the only RI you need could be as simple as "does Object X still exist, and does it have a tag Y?" Beyond that I think it should probably be the application/logic layer's responsibility to ensure that the data it's fooling with is valid.
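That minimal check could be as small as this (a sketch with illustrative names, nothing from GameDB itself):

```cpp
#include <map>
#include <set>
#include <string>

// Illustrative sketch of the minimal RI rule: before acting on a
// result, verify the object still exists and still carries the tag
// the query relied on.
struct TaggedStore {
    std::map<int, std::set<std::string>> tagsById;

    bool isValid(int id, const std::string& requiredTag) const {
        auto it = tagsById.find(id);                 // does X still exist?
        return it != tagsById.end()
            && it->second.count(requiredTag) != 0;   // does it have tag Y?
    }
};
```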

Atomic transactions may be useful as well, but I personally find it a bit hard to think of (for instance) a game world in terms of atomic transactions.


Then again, there's always the Garbage Collection philosophy - permit just about anything, complain if acting on something that doesn't exist, and periodically trim back all the cruft. That may either turn out to be extremely applicable or very deadly; I can't say it's at all clear to me at this point which outcome is more likely.
March 19, 2006 04:56 PM
cbenoi1
I have implemented such a system in the past, and I have made a simpler version for gaming that I use from time to time on various projects. The core DB uses tags, and I use a multi-threaded task queue manager. What you have to do is load up a list of tasks and the relations between them, then run the task manager. Each thread pulls a task from the queue and executes it, putting the results back in specific pre-allocated tags. The TaskQ manager does a quick topological sort to find out which task(s) can be pulled next, depending on what the task graph looks like at any point in time. A task can generate others; a typical example would be tessellation of geometry by subdivision until some curvature criterion is met. Once the last task is done, all worker threads are put to sleep and the master thread can work on the results. I used this system to implement a simple raytracer, and I started using it for an RTS game. I also extended it so that the database and task manager are shared across a cluster of networked machines; this allows your database to be much bigger than available memory, and gives you a larger number of worker threads to play with.
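A single-threaded sketch of that ready-queue idea (illustrative only, not the actual implementation): each task records how many unfinished dependencies it has, and only tasks whose count has dropped to zero can be pulled next - the same ordering a topological sort gives. A real version would have worker threads pulling from the ready queue.

```cpp
#include <functional>
#include <queue>
#include <vector>

// Illustrative dependency-counting task queue (Kahn-style ordering).
struct TaskGraph {
    std::vector<std::function<void()>> work;
    std::vector<std::vector<int>> dependents;  // task -> tasks waiting on it
    std::vector<int> pendingDeps;              // unfinished-dependency counts

    int addTask(std::function<void()> fn) {
        work.push_back(std::move(fn));
        dependents.emplace_back();
        pendingDeps.push_back(0);
        return static_cast<int>(work.size()) - 1;
    }
    void addDependency(int task, int dependsOn) {
        dependents[dependsOn].push_back(task);
        ++pendingDeps[task];
    }
    std::vector<int> run() {                   // returns execution order
        std::queue<int> ready;
        for (int i = 0; i < static_cast<int>(work.size()); ++i)
            if (pendingDeps[i] == 0) ready.push(i);
        std::vector<int> order;
        while (!ready.empty()) {
            int t = ready.front(); ready.pop();
            work[t]();
            order.push_back(t);
            for (int d : dependents[t])        // release tasks waiting on t
                if (--pendingDeps[d] == 0) ready.push(d);
        }
        return order;
    }
};
```

Tasks that spawn other tasks would call addTask/addDependency from inside their work function; the ready queue picks them up the same way.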

One of the particularities of my implementation is that a DB item can be either data or a function, so the relationship between two TaskQ elements can be a function that is evaluated at runtime. I currently use an enum because I have a fixed number of relation types, but it would be easy to have a function evaluated instead.

Ping me if you want to discuss this in more detail.

-cb
April 11, 2006 08:51 PM