What is the best way to implement this "weird" kind of graph in C++ for a game?

Started by
27 comments, last by mike3 12 years, 2 months ago
Interesting idea. But I don't get what you mean by "mutable during dev versus const at release" here, the way this works is that there is a set-up routine that uses those add functions (in any version) to set up the graphs for the level structures. That initialization needs to be done somewhere, anyways -- how would you propose it would be done? Not to mention that I may want to generate some randomly. And it may even be that the graphs need to be altered during the game process (currently, I don't know of a place where that would happen, but I'd like to leave the door open for it.). I.e. there is no "dev vs. release" distinction I can see here. What I was hoping for here was something like a regular graph type, but with the added structure of "strings"/"branches" overlaid. And the suggestion I got on another forum I asked this on was to use just that: a regular graph, and build off that, and that's what I was wondering how to do. Yet if that's not possible, then maybe I would need something like this.

Though I kind of, sort of, like this approach since it does address the whole thing about what to do when deleting a node or whatever: since we re-build the entire graph from scratch, then when we do a delete we don't need to mess around with updating the strings or whatever, we just blow them away and write out fresh new ones.
Advertisement

Interesting idea. But I don't get what you mean by "mutable during dev versus const at release" here, the way this works is that there is a set-up routine that uses those add functions (in any version) to set up the graphs for the level structures. That initialization needs to be done somewhere, anyways -- how would you propose it would be done?


So the mutable versus const thing is that during development I tend to have various tools enabled which allow me to mess with the world. Those tools/hacks/tweaks would all live in the mutable wrapper. Now, if they do something simple they can reach down into the presentation layer (i.e. the data/interface which the game engine actually uses) and make the change directly or, if it is something big and complicated they can use the un-optimized data and just rewrite the entire presentation so I don't need to write potentially buggy edit functionality. Big changes can/will cause hitching of performance but for these purposes I don't care normally.

Now, as to the initialization function. I try to avoid such things in cases where it is possible. (Doesn't seem you want to limit yourself though, so this may not be appropriate.) Basically the way I try to do things is that the presentation data can be serialized in/out without going through the mutable layer. I.e. it is complete, initialized and "ready to go" in all ways and requires no functionality from the mutable layer to load/save off of disk. The intention here is to eventually remove the mutable interface and run with the constant non-modifiable data.

Now, if this is supposed to be run time dynamic even in the final game, that does throw a wrench into the works. You can keep the mutable variation in the final build and look at it slightly differently. Instead of writing out a complete copy with pointers and such, you could split your data into the raw data bits and a relational container. I.e. just for example:


struct Element
{
.. just some data here, no pointers or reference information beyond the mentioned generic "tuple" type thing.
};

struct ElementRef
{
Element* mpData;
Element* mpRelatedElement;
StringData* mpWhatever;
};


So, in this case the "save" becomes just a process of optimization of the data so the engine can use it efficiently and you don't have to duplicate all the data. Again, this is just one possible example, there are several possible which would not require duplicating the bulk of the data in two different places but this is probably the easiest and best solution to the problem.


Not to mention that I may want to generate some randomly. And it may even be that the graphs need to be altered during the game process (currently, I don't know of a place where that would happen, but I'd like to leave the door open for it.). I.e. there is no "dev vs. release" distinction I can see here. What I was hoping for here was something like a regular graph type, but with the added structure of "strings"/"branches" overlaid. And the suggestion I got on another forum I asked this on was to use just that: a regular graph, and build off that, and that's what I was wondering how to do. Yet if that's not possible, then maybe I would need something like this.

The first thing I would do if you want to just fix up the data structure itself is diagram it out and run use cases through the data. Understanding your data and the usage of it first usually helps more than anything else. While cheesy, one of the tools I use is MySql Workbench to model the data into an sql form and then I write all the operations as sql. If I see the sql getting too complicated, I look at ways to rework the data layout to simplify things. You don't need "real" data for this process, just a reasonable representation of something organized in the expected manner.

The gain from doing this process is a better understanding of your data, the access patterns required and which pieces of functionality will be the most difficult. Mapping the changed data layout back to code is a simple process and very often you can collapse it back to a very understandable structure, though it rarely is organized in any manner similar to what you started with. More than likely the suggestion to do the graph separate and tack on the strings separately would be the end of such an experimental process, but you may see a better solution just by looking at the data and relationships in a different manner.


Though I kind of, sort of, like this approach since it does address the whole thing about what to do when deleting a node or whatever: since we re-build the entire graph from scratch, then when we do a delete we don't need to mess around with updating the strings or whatever, we just blow them away and write out fresh new ones.

There are many things I like about this pattern myself. The simplification and separation of editing code/data from the actual game used data is a great benefit. And, as mentioned above, you can make variations where the mutable data continues to exist. Being able to simply rewrite the data in an optimized manner without having to figure out all the little details of which pointers to update, what items are invalidated etc saves a lot of time. Of course, if you do dynamically changing content through the mutable interface you might eventually have to go in and optimize the functions and potentially end up with doing the complicated modifications of invalidating the correct things etc, but you can do that later when your data and usage patterns have stabilized. Doing those optimizations too early could lock you into a bad data layout and make it very painful to fix at a later time.
A graph is a set of vertices. Some vertices may be connected by a pair of edges.

String mentioned above is a graph as well.

First part:std::map<VertexID, VertexData> Vertices;Vertices represents all vertices used by Strings and main graph itself.

Now we need an edge list. We have two choices here, depending on how they will be used. For in-order traversal, we could just list vertices, but for general graph, we need pairs of vertices. For frequently updated items we need to be able to find edges which connect a specific vertex. Unfortunately, the required structure is missing from standard library, one could use boost multiindex.

Instead, we'll do a bit of a workaround.typedef std::pair<VertexID, VertexID> Edge;
typedef std::vector<Edge> Graph;
Graph is a heap, ordered container, making operations log(n).

Finally, to remove items, we need a reverse index. Using smart pointers and similar has been shown to be problematic, so instead we track things manually:
typedef std::set<Graph *> GraphIndex; // no smart pointers, just weak reference
typedef std::map<VertexID, GraphIndex> GraphLookup;


For algorithms:addVertex(VertexID, VertexData) // log(n)
- Put VertexData in Vertices

addEdge(Graph g, VertexID a, VertexID b) // 3 * log(n)
- insert edge into Graph
- add g to GraphLookup[a]
- add g to GraphLookup

removeVertex(VertexID v) // log(n) + m * log(n)
- get GraphIndex g from GraphLookup[v]
- remove v from all g
- remove v from Vertices


Those should be enough to fully realize all operations. Iteration is omitted since it's simple enough, so is merging of graphs.

Graph is a vector, but using <algorithm> one can update it as a heap, which uses lexicographic ordering (1-1, 1-3, 1-5, 2-1, 2-3, ...). Not perfect, but doesn't require external libraries for partial edge matches (such as searching for all nodes that start at edge 1. Searching by destination edge isn't efficient.

Note that there is no special "String" class, a string is defined simply as an edge list, same as main graph. For small graphs (<64 edges), linear search would work just fine.

Strings can also be maintained by another list:typedef std::map<StringID, Graph> Strings;

All modification operations should be log(n), so it will work well to thousands, even tens of thousands of vertices and edges.

[quote name='mike3' timestamp='1328668068' post='4910737']
Not to mention that I may want to generate some randomly. And it may even be that the graphs need to be altered during the game process (currently, I don't know of a place where that would happen, but I'd like to leave the door open for it.). I.e. there is no "dev vs. release" distinction I can see here. What I was hoping for here was something like a regular graph type, but with the added structure of "strings"/"branches" overlaid. And the suggestion I got on another forum I asked this on was to use just that: a regular graph, and build off that, and that's what I was wondering how to do. Yet if that's not possible, then maybe I would need something like this.

The first thing I would do if you want to just fix up the data structure itself is diagram it out and run use cases through the data. Understanding your data and the usage of it first usually helps more than anything else. While cheesy, one of the tools I use is MySql Workbench to model the data into an sql form and then I write all the operations as sql. If I see the sql getting too complicated, I look at ways to rework the data layout to simplify things. You don't need "real" data for this process, just a reasonable representation of something organized in the expected manner.

The gain from doing this process is a better understanding of your data, the access patterns required and which pieces of functionality will be the most difficult. Mapping the changed data layout back to code is a simple process and very often you can collapse it back to a very understandable structure, though it rarely is organized in any manner similar to what you started with. More than likely the suggestion to do the graph separate and tack on the strings separately would be the end of such an experimental process, but you may see a better solution just by looking at the data and relationships in a different manner.
[/quote]

And that's just it. If I try to look at the data, try to "diagram" it, I get the following:

1. We need to store parameters for each level.
2. We need to store what levels a level is connected to.
3. We need to store parameters for the connections.
4. Finally, we need to be able to group levels together in linear "strings" so as to be able to store parameters for those strings and the levels in a string are all strung together.

Looking at 1 and 2, I see "graph". If we tried to write out a simplified description, like this:


Primary String:
Level 1:
Name (a parameter): "Alpha"
Type (a parameter): Generic level
Connected to:
Level 2 of primary string
Connection params:
Type: Stairway
Level 2:
Name (a parameter): "Beta"
Type (a parameter): Generic level
Connected to:
Level 1 of primary string
Connection params:
Type: Stairway


Or something like that. Even just writing that down seems to scream "graph" to me. We've got levels and connections stored with them, telling what levels are connected to what other levels.

But when we go to use a graph, we run up against the problem I mentioned: how to represent the "strings". If we keep the strings as lists of iterators to the corresponding graph nodes, then we run into the problem of iterator invalidation in deletion and what to do in that case. Seems to me like what we've got so far is a choice of:

1. use my original code,
2. use a different graph implementation, one that does not have invalidating iterators that invalidate on any of the operations we perform on this web,
3. scrap the idea of using a standard graph -- but I've heard that it's often better to stick to proven data structures when possible, and it doesn't seem like it's _impossible_ to use a standard graph here, just tricky,
4. go to the "mutable" approach. But the main difficulty I see here is added complexity. It seems, looking at it, that what we'd need are two kinds of graph class: one that lists nodes and what not with ID codes and stores connections by referencing ID codes (i.e. an adjacency list with the levels storing tuples of ID codes for the nodes connected to + associated connection data), which is easy to edit (ID codes are stable so there is no invalidation), and one that stores with iterators and what not, which has the trouble of invalidation but doesn't require searches to look up nodes and whatnot based on ID codes, i.e. an "editable" graph and an "optimized" graph, with functions to "bake" the former to the latter. So then we need another graph implementation. Whereas in the other approaches we could even not write our own graph class at all and instead just use something like Boost's graph functionality (wow! and I'm thinking of switching to the Boost graph thing just to offload that extra code in my own program for the graph).

But this is where it stands, as far as I can tell. To me, it seems I'm leaning on option #2, but I'm curious what the best way is that's in the vein of 1 and 2, i.e. using just a standard graph, and also am curious how it can be done with invalidating iterators, if there's a way to address the concerns mentioned in my OP in that case.



1. We need to store parameters for each level.
2. We need to store what levels a level is connected to.
3. We need to store parameters for the connections.
4. Finally, we need to be able to group levels together in linear "strings" so as to be able to store parameters for those strings and the levels in a string are all strung together.

Looking at 1 and 2, I see "graph". If we tried to write out a simplified description, like this:


Primary String:
Level 1:
Name (a parameter): "Alpha"
Type (a parameter): Generic level
Connected to:
Level 2 of primary string
Connection params:
Type: Stairway
Level 2:
Name (a parameter): "Beta"
Type (a parameter): Generic level
Connected to:
Level 1 of primary string
Connection params:
Type: Stairway




If this is the graph (levels as nodes; following level to play relationships as directed edges; data attached to both), the "strings" do not exist at all: you simply have one or more first levels and then you can just follow graph edges.
Note that a graph allows game levels to have multiple exits (leading to different levels), multiple entrances (reached from different levels) and loops of connections allowing the player to return to already visited levels.

Omae Wa Mou Shindeiru

But the main difficulty I see here is added complexity. It seems, looking at it, that what we'd need are two kinds of graph class: one that lists nodes and what not with ID codes and stores connections by referencing ID codes (i.e. an adjacency list with the levels storing tuples of ID codes for the nodes connected to + associated connection data), which is easy to edit (ID codes are stable so there is no invalidation), and one that stores with iterators and what not, which has the trouble of invalidation but doesn't require searches to look up nodes and whatnot based on ID codes, i.e. an "editable" graph and an "optimized" graph, with functions to "bake" the former to the latter. So then we need another graph implementation. Whereas in the other approaches we could even not write our own graph class at all and instead just use something like Boost's graph functionality (wow! and I'm thinking of switching to the Boost graph thing just to offload that extra code in my own program for the graph).[/quote]

You're mixing up everything together, from tiny syntactic details of particular classes to various design issues to some specific fixation on the need for something that is a String.

A graph is a graph and is incredibly trivial. It is a list of vertices and a list of edges.

boost::graph is a disaster. Template MPL overload. Flexible, but incredibly verbose to get even the basic case running.


Primary String:
Level 1:
Name (a parameter): "Alpha"
Type (a parameter): Generic level
Connected to:
Level 2 of primary string
Connection params:
Type: Stairway
Level 2:
Name (a parameter): "Beta"
Type (a parameter): Generic level
Connected to:
Level 1 of primary string
Connection params:
Type: Stairway
[/quote]

What is the purpose of all of this? Looking at things I don't understand anything here.

A "level" for me is a game level. You know: Mario, level 1-1. When you complete that, you go to level 1-2. After completing 1-8, you go to world 2 and start with level 2-1.
What is alpha/beta? What is a stairway? And why are there strings?

Is this supposed to be some sort of pathfinding? Because technically, any relation can be expressed as a graph, some can just be simplified to a tree or list. Yet graph structures are rarely used because they simply aren't needed.

And that's just it. If I try to look at the data, try to "diagram" it, I get the following:

1. We need to store parameters for each level.
2. We need to store what levels a level is connected to.
3. We need to store parameters for the connections.
4. Finally, we need to be able to group levels together in linear "strings" so as to be able to store parameters for those strings and the levels in a string are all strung together.

Looking at 1 and 2, I see "graph". If we tried to write out a simplified description, like this:


Primary String:
Level 1:
Name (a parameter): "Alpha"
Type (a parameter): Generic level
Connected to:
Level 2 of primary string
Connection params:
Type: Stairway
Level 2:
Name (a parameter): "Beta"
Type (a parameter): Generic level
Connected to:
Level 1 of primary string
Connection params:
Type: Stairway



With that more simplified outline of the problem I don't see a graph, I see a fairly simple db schema. Ok, it is a graph (anything can be called a graph for the most part) but this is a lot more simplified than I thought earlier. It seems you should be able to normalize this in the db term and look at it from a different perspective which may work out better or give you further ideas on how to rework things. Basically I would start by considering (assuming this works for your needs) that the level and the connections can both be described in a generic manner: A named set of generic parameters. So, you could do something like:


typedef std::string PropType_t;
struct Property
{
PropType_t Type;
PropType_t Data;
};

typedef std::multimap< std::string, Property > PropMap_t;

// C++11 initializer pseudo code.
PropMap_t Levels
{
{ "Level1", { "Name", "Alpha" } },
{ "Level1", { "Type", "Generic Level" } },
{ "Level1", { "Connections", "Level1Connections" } },

{ "Level1Connections", { "Stairwell", "Level2" } },
{ "Level1Connections", { "Corridor", "Level2" } },
{ "Level1Connections", { "Bridge", "Level3" } },
{ "Level2",
<etc>
}
}


So, you now have a generic set of attributes in a single structure and they can refer to each other via "Name" within the map. You loose iteration ability of the "Levels" themselves so you probably want to re-add the concept of a list of levels:


typedef std::vector< std::string > Levels_t;


Now, if you want to get all connections from Level1:


// Convert sql to C++
// SELECT Data FROM tblProperties WHERE Name = (SELECT Data FROM tblProperties WHERE Name='Level1' AND Type = 'Connections')

Levels_t FindConnections( const std::string& levelName )
{
Levels_t result;
typedef std::pair< PropMap_t::const_iterator, PropMap_t::const_iterator > ItPair_t;

ItPair_t props = Levels.equal_range( levelName );
for( ; props.first!=props.second; ++props.first )
{
if( props.second->Type == "Connections" )
{
ItPair_t connections = Levels.equal_range( props.first->Data );
for( ; connections.first!=connections.second; ++connections.first )
result.push_back( connections.first->Data );
break;
}
}

return result;
}


Most other operations you might need would end up along the same basic lines. Yup, I think very database oriented, if i can't represent something with fairly simple SQL I go back to the drawing board and rework the data.

And of course you can use this data as the "mutable" version and build your optimized (with iterators/pointers) version after edits.
"strings do not exist at all"

"don't see a graph"

OK, so maybe my given example was a little too simple. Here's a more complicated one with data fields closer to what we'd actually have in the game (e.g. individual levels are not named, but strings are!):


Web1 (the "weird graph" I'm referring to)
Name: "Mega Caverns"
String 1:
Name: "Main"
Level 1:
Type: Generic level
Connections:
Level 2, String 1
Type: Stairway
Level 2:
Type: Generic level
Connections:
Level 1, String 1
Type: Stairway
Level 1, String 2
Type: Stairway
Level 3, String 1
Type: Stairway
Level 3:
Type: Generic level
Connections:
Level 2, String 1
Type: Stairway
String 2:
Name: "Side Branch"
Level 1:
Type: Generic level
Connections:
Level 2, String 2
Type: Stairway
Level 2, String 1
Type: Stairway
Level 2:
Type: Generic level
Connections:
Level 1, String 2
Type: Stairway
Level 3, String 2
Type: Stairway
Level 3:
Type: Generic level
Connections:
Level 2, String 2
Type: Stairway


(The actual game version also has file name tags and flags indicating whether a given level has been generated, the file names being for the files that store the levels.)

Why I think of graphs is because I usually diagram something like this in this way:


Let "X" denote that connected levels belong to string 1. Let "Y" denote that
connected levels belong to string 2. Let "-" denote an inter-string connection.

Level 1
X
X Stairway
X
Level 2 ------------ Level 1 YYYYYYYYYYYY Level 2 YYYYYYYYYYYY Level 3
X Stairway Stairway Stairway
X Stairway
X
Level 3


and it looks like a graph with nodes connected by edges. More sophisticated examples could even have cycles in them. But it's this diagramming that gave me the graph idea.

Though I suppose you could implement this with the d/b as well, so then it's a toss-off between d/b and graph.
It seems that, in your more complex example, the true graph is still levels and directed reachability links, and the strings are a completely superfluous and artificial grouping of levels; globally unique node identifiers are available and you are already using them.

If grouping your levels makes sense, for example because you have regions in space or episodes in time, you can use collections of pointers to nodes or collections of node names that aren't part of the graph data structure. Freestanding collections of pointers or names are nice because your objects can belong to multiple ones; they don't "own" objects.

Omae Wa Mou Shindeiru


It seems that, in your more complex example, the true graph is still levels and directed reachability links, and the strings are a completely superfluous and artificial grouping of levels; globally unique node identifiers are available and you are already using them.

If grouping your levels makes sense, for example because you have regions in space or episodes in time, you can use collections of pointers to nodes or collections of node names that aren't part of the graph data structure. Freestanding collections of pointers or names are nice because your objects can belong to multiple ones; they don't "own" objects.


That's correct. That's what I've been trying to do. The problem is that the iterators to the nodes can be invalidated upon deletion, and that leads to needing to update them in all strings every time a string is deleted. And this brings some issues, which I mention in my original post. Would the best approach, then, be just to use some kind of graph that does not invalidate iterators on deletion, or has some other type of stable node reference?

This topic is closed to new replies.

Advertisement