Using SQL for Templated Data

Started by
5 comments, last by Bearhugger 8 years, 2 months ago

First, I want to say I'm not an SQL pro. Nested SELECT queries were new to me until mid-January.

Anyway, I'm looking to implement a system where I can create new objects by basing them on other ones and then change only the fields I want to change while maintaining the link to the parent for the other fields. That way, if I change something at the parent level, the changes automatically propagates to the children unless they have an explicit value for that field. Sort of like a prefab, but a prefab that can itself inherit another prefab, forming a sort of inheritance tree.

I've seen something like this being used in Blizzard's RTS map editors starting with Warcraft 3, and this is something I could use for my own projects since I deal with a lot of data.

I know how to implement this in C# and C++, but I'm not sure how to do it at the database level. I can see a table where a line having a NULL for a particular field means to look up the parent's properties (and if it's NULL for the parent, go see the parent's parent, and so on), but when I have things like a many-to-many or one-to-many relation, I'm not quite sure how to do it with a relational model. I want the children to be able to add items to a list of items, or to just replace the list.

Is a relational database suitable for storing this kind of data model, or should I use something else to serialize the data? I know that Blizzard uses XML to represent this so it half-answers this question.

Advertisement
This is usually done with the coalescing function:

with `objects` (`id`,`parent_id`,`name`,`type`,`value`) as ( values
    ( 1, null, 'fruit',     'F', -5 ),
    ( 2, null, 'vegetable', 'B',  7 ),
    ( 3, 1,    'apple',     null, 100 ),
    ( 4, 2,    'carrot',    'X',  null )
) select
    child.id,
    coalesce(child.name, parent.name) `name`,
    coalesce(child.type, parent.type) `type`,
    coalesce(child.value, parent.value) `value`
from
    `objects` child
left outer join
    `objects` parent
on  child.parent_id = parent.id
;
This might be incorrect (modern) SQL syntax, but it should get the point across.

Did you consider writing a code generator?

Store your data as literal values, and write a program that expands those values to full text. (Mostly a bunch of "print" statements.)

There are also templating systems for web development (to generate web pages) that you can use.

I wouldn't bother to use C++ or C# for this, it's just text expansion, Python can do this fine.

One step further would be to store those values in a separate file, or even write them in custom syntax that you parse.

Sorry. I only partially answered the question. You may want to check to see if common table expressions are supported by your database engine. Some popular database engines are a bit behind the times in supporting modern SQL features (*cough* MySQL *cough*). In any case, assuming a schema:
create table `prefabs` (
    `id` int primary key not null auto_increment,
    `parent_id` int null,
    `name` varchar(50) not null
);

create table `prefab_properties` (
    `id` int primary key not null auto_increment,
    `prefab_id` int not null,
    `property_id` int not null,
    `value` varbinary(128) null
);

create table `properties` (
    `id` int primary key not null auto_increment,
    `name` varchar(100) not null,
    `type` varchar(10) not null /* int, string, color, float, and et. al. */
);
Then you could use CTEs to do something like:
1. Walk the inheritance tree. The CTE will build a set of all parents for every node.
2. Aggregate non-null properties for every node, and select the first property of that type.

with `tree` as ( /* set of all prefabs’ parents */
    select  p.id        `root_id`,
            p.id        `current_id`,
            p.parent_id `parent_id`,
            1           `level`
    from    prefabs p
    where   :has_prefab_ids is null
    or      p.id in :prefab_ids
    union all
    select  t.root_id,
            p.id,
            p.parent_id,
            t.level + 1
    from    tree t
    join    prefabs p
    on      t.parent_id = p.id
), `tree_props` as ( /* join prefabs in `tree` with their non-null properties */
    select  t.*,
            pp.property_id,
            pp.id `pp_id`
    from    tree `t`
    join    prefab_properties `pp`
    on      t.current_id = pp.prefab_id
    where   pp.value is not null
) select
    _1.root_id `prefab_id`,
    _1.property_id,
    pp.value,
    t.current_id `inherited_from`
from ( /* first encountered non-null properties of prefab */
    select  tp.root_id,
            tp.property_id,
            min(tp.level) `min_level`
    from    tree_props `tp`
    group by
            tp.root_id,
            tp.property_id
) _1
inner join /* rejoin tree_props to obtain property value */
    tree_props `tp`
on  _1.root_id = tp.root_id
and _1.property_id = tp.property_id
and _1.min_level = tp.level
inner join
    prefab_properties pp
on  tp.pp_id = pp.id
;
Of course, a sane person might consider calculating effective properties in software, rather than rely on a database to do it for them. smile.png

Ok, so to do that in pure SQL, you need have a table that stores the schema, and then you have to store each value of each property of each object in a separate table. You can't have a clean table that lists all simple properties and leave them to NULL if they're not assigned? That's a bit more complex than I hoped it would. I also intend to use SQLite as I have no need for a DBMS powerhouse, and I don't think it has the most complex SQL features if MySQL doesn't. I suppose I could use SQL Server though.

Given a table that has a parent_id field, is it simple to write an SQL SELECT query that will simply walk up the tree and return me the lines in the order that they are encountered? (First line is the leaf, then it's it's parent, then grandparent, all the way to the root.) And then I could sort out the actual properties in software.

A side question that's possibly important: why do you need this in SQL in the first place? Is there a strong reason you can't use a NoSQL object database? Is there a good reason you need the data to be exposed to a query language at all?

Given that this is a game development forum, I'm somewhat disinclined to believe that databases (SQL or otherwise) are a good solution to whatever you're doing.

Even if you have a large amount of data, there's always the option of encoding your data into a big blob and just shoving that into a single database field, leaving all the prototypical composition to your C++/C# code. The database only needs to actually "see" data that you want to index or query, after all.

I would argue that if you're doing this kind of data composition and you do actually need an indexable database, you're possibly _way_ better off use Mongo or Couch or Dynamo or something instead of a first-order relational (SQL) data model.

Sean Middleditch – Game Systems Engineer – Join my team!

At this point I'm simply exploring my options.

Right now I'm at the bottom of the barrel when it comes to serializing game data. To give you an idea, in the current system, the game data is serialized directly to a big ugly "db.dat" file using the .NET binary serializer and 10 year old code that I probably wrote right after noticing that C# had properties. As you can imagine, this is a pain in the ass because if you change anything in any of the classes that you serialize, the .NET serializer may refuse to load it, and then you have to write serializer handlers to read the data and initialize it yourself. Many times I was afraid I would not be able to recover the game data.

Since I'm rewriting the data sub-system of my editor from scratch, I'm also implementing a better way to serialize that data.

A relational database is not my only option, but it interests me because it will make it much easier for me to modify the data, and it would let me recover the data if something goes wrong. (Most are ACID.) In addition, I figure I could use something like Entity Framework or NHibernate to help me deal with data. Installing a DBMS on a system (though I would far prefer SQLite) is not a big deal to me because this is only for the game editor.

My other options would be to serialize to JSON or XML. I will take a more serious look at those CTE, but I'm not quite at ease with the SQL language to do those recursive queries right, so right now JSON looks like my preferred option.

But I will look at those NoSQL databases.

This topic is closed to new replies.

Advertisement