( Eventual ) Consistency in an MMOish game database structure

22 posts in this topic

Hello there o/ ,

I've started working on a browser game and I'm trying to figure out the database structure. My problem is that I've managed to come up with a structure, but I'm not sure it makes sense, and if it does, how to handle consistency across several databases of different types.

Here is a schema that I hope makes sense.

According to the current plan:

1. Memory database (Redis): This is where the main action takes place, due to speed concerns.
2. Snapshot database (an in-memory persistent database featuring more than key/value pairs): This database only stores the data the memory DB would need in case of a rebuild; as the name implies, it's a snapshot of current data. It's eventually consistent, and in case of a failure it's authoritative.
3. History database (probably an RDBMS): This stores history data that's relevant to players (no point in keeping the fact that Player345278 reached level 28 ages ago in an in-memory database), data they might want to see one day (in the fashion of ancient tweets).
4. Log database: As the name implies, it stores logs of any kind (again, no point in keeping the fact that Player345278 killed a chicken ages ago if the player won't need such data).
5. Analytics database: A server-side tool to mine the log data for fraud, behavioral patterns, and similar issues.

In the current plan, there are writes to several databases on each action (which is heartbreaking). For example:

Player finds 100 gold: the memory DB changes the total gold, the snapshot DB makes the same change, the log DB writes a log entry, and the history DB is also updated if needed (such as a daily gold finders table).
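To make that fan-out concrete, here is a rough Node.js sketch of what a single action currently touches; every store name and method here is invented for illustration, not a real API:

```javascript
// Hypothetical fan-out for one game event: four stores get written.
// stores.memory / stores.snapshot / stores.log / stores.history are
// illustrative placeholders for the Redis, snapshot, log, and history DBs.
async function onGoldFound(playerId, amount, stores) {
  await stores.memory.incr(`player:${playerId}:gold`, amount);     // Redis-style counter
  await stores.snapshot.update(playerId, { goldDelta: amount });   // eventually consistent snapshot
  await stores.log.append({ type: "gold_found", playerId, amount, at: Date.now() });
  if (amount >= 100) {
    await stores.history.upsertDailyLeaderboard(playerId, amount); // e.g. daily gold finders table
  }
}
```

Four awaited writes per action is exactly the heartbreak: each one is a failure point, and a crash between any two leaves the stores disagreeing.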

My questions are:

1 - Is there a better alternative, in your opinion?

2 - In case of a failure of any kind, what's the best way to ensure minimal data loss? Should it be that the snapshot DB is authoritative, so the memory DB copies everything from it, and logs before the snapshot's latest timestamp are deleted? I have no good idea for the history DB.


I would start at the data end.

Decide what data you have: which tables, which columns and keys, and how the tables are connected, i.e. the database schema. If you normalize that, there is no redundancy, so it's consistent by definition.

That gives you the data structure of your application at the conceptual level. Theoretically, you can push all the stuff into one database and it will run (and that might be a good idea to start with, as you can then make the game).

Your picture is much more about how to make the data accessible fast enough to scale up to more users. That is only relevant if you have (sufficient) users in the first place. I am a newbie here, but I would define some database as authoritative, and everything else as a "cache" (i.e. non-authoritative). As far as I know, how to speed up access depends heavily on the queries that you perform, and without the game you don't know the queries and their frequencies, so making a sane design is pretty much impossible at this stage, unless you've done it a few times already.

As such, I'd suggest dropping as much complexity as you can and making things work first. Then try to figure out where to add complexity to get better <whatever you need to improve>.


First of all, thanks for the replies.

I know it doesn't look like it, but this is a simplified model, because:

1 - I need Redis for fast-paced transactions and some scheduled tasks,

2 - The "snapshot" database is needed primarily for its graph features,

3 - The "history" one is probably an RDBMS, as aggregate functions are common,

4 - Log / analytics are rather off topic, so plain text is of course on the table.

I'm not happy about having that many moving parts; that's why I'm trying to reconcile them.

And "failure of any kind" is a bit of an overstatement. I mostly mean cases where the server has somehow managed to crash (not much to do if the server explodes, after all) or similar common risks.

Well, I'm rather a newbie at this as well. My actual concern is that when there are several databases built on different technologies, their responses to a failure will differ. If not configured to take snapshots, Redis data will be gone; some will use a write-ahead log, etc. I just need a way to ensure they can all be reverted to the same point.


Thanks for the reply. Although I mostly agree with your view, a single-database solution isn't necessarily best practice, in my opinion.

Quote

Most games I've been involved with went down the route of just using the database as a save-game file for each player. If servers stay up and there are no other bugs then it is eventually consistent. The DB becomes sort of a write-back cache. It does mean that 'in-memory' is the authoritative data (rather than the DB, as is the case in more traditional software), you do have to be careful around migrating players between servers (zones, instances, whatever), and for important trades that may have significant repercussions you may want to save all participants at once atomically anyway.

But quoting you from another thread (https://www.gamedev.net/forums/topic/686334-mmo-database/), I think this approach is quite applicable.

So, when everything is fine, Redis (the in-memory one) is authoritative and the other database acts as an eventually consistent save-game store (a.k.a. the snapshot). When there is some sort of reboot or failure, this save-game DB is authoritative (since the in-memory data is obviously gone) and the data is simply copied from it back into Redis.
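The write-behind pattern in that quote could be sketched roughly like this (Node.js, with invented names; persistPlayer stands in for whatever save-game DB call is actually used):

```javascript
// Minimal write-behind sketch: in-memory state is authoritative; a dirty
// set tracks players changed since the last flush, and only those are
// written to the save-game DB on a timer. No DB write on the hot path.
const players = new Map(); // playerId -> state object (authoritative)
const dirty = new Set();   // playerIds changed since last flush

function addGold(playerId, amount) {
  const p = players.get(playerId);
  p.gold += amount;
  dirty.add(playerId);     // just mark for the next flush
}

async function flush(persistPlayer) {
  for (const id of dirty) {
    await persistPlayer(id, players.get(id)); // write-behind to the snapshot DB
  }
  dirty.clear();
}
```

On reboot, the direction reverses: the save-game DB is loaded back into `players`, which is exactly the "authoritative on failure" role described above.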

I'd love it if the in-memory and persistent databases were the same product; since they aren't, I'm not really sure how to handle the issues that might arise.

As for the prior schema, it can actually be simplified to just the in-memory and persistent databases.

What would best practice be when you can't have "one single database to rule the universe"? Would it need something like a monitor checking all databases against timestamps?


What proof do you have that a single standard database, say PostgreSQL, won't do the job? These things can handle real-time airline reservations, ATM transactions, real-time online orders, and whatnot. They don't get scared of 1,000,000 records or more. These things typically don't just die.

One of the problems with software is that it doesn't have a natural upper limit on complexity. Unlike physical building space, in software you can always add one more layer to make it "better" in some way. Unfortunately, better in one way costs something elsewhere, i.e. you're just moving the misery around.

For a concrete system where you know the required performance characteristics precisely, you can design a solution such that the misery is not hurting you (or at least hurts minimally). You can't do that in the general sense, for the simple reason that you need to know where misery will not affect you, which is impossible without knowing the required performance characteristics.

You can in theory distribute the misery evenly over the entire design space, but then it's always hurting, i.e. you make the really well-performing cases unreachable by design. I think this is what you're doing currently: you think a single database isn't enough, so you make this horribly complicated multi-DB system where stuff has to be synchronized all the time in some way, and preventive costs are paid to avoid getting into trouble when unknown, possibly nonexistent shit hits the fan.

This is all fine if a horribly complicated multi-DB system with real unexpected shit is what you eventually really need to have. However, in practice, this may not be the case, as Kylotan said. Thus in all likelihood, your multi-DB synchronization is actually making the database slower than it needs to be; that is, your design efforts at this stage might be counterproductive.

The standard software tactic here is to define an interface to program against, then pick the simplest possible solution as the implementation, and move on to other parts of the problem. When the implementation is no longer performing well enough, you reconsider how to implement the interface (i.e. move the hurt to a new spot, out of the way in that situation).

Effectively, you're reducing and deferring the design effort here. If a simple implementation of the interface suffices, you don't need to design your multi-DB thing at all. You've saved yourself a few months of effort.

If it is really needed, you will at some point run into limitations of the simpler solution, and at that point in time, do the work to design and implement your multi-DB thing. The nice thing is that you then know where the simple solution fails, so you can take that specifically into consideration when designing a better DB implementation. At that point you also know where the misery can be moved to, such that it does not affect you. The result is going to perform better than anything you can cook up now.
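As a sketch of that tactic (all names here invented): the game code programs against a tiny store interface, and the simplest possible implementation is just a map in memory, so a real DB can be swapped in later without touching game logic.

```javascript
// Simplest implementation of a hypothetical PlayerStore interface.
// A DB-backed class with the same load/save methods can replace it later.
class InMemoryPlayerStore {
  constructor() { this.data = new Map(); }
  async load(id)        { return this.data.get(id) ?? null; }
  async save(id, state) { this.data.set(id, state); }
}

// Game code depends only on load/save, not on which store backs them.
async function levelUp(store, id) {
  const p = (await store.load(id)) ?? { level: 0 };
  p.level += 1;
  await store.save(id, p);
  return p.level;
}
```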


Actually, one database might handle the issue effectively (I just stumbled upon https://engineering.hipolabs.com/graphdb-to-postgresql/), but that's another aspect of the issue once the model is simplified to two databases (Redis and X).

My interest in Redis is not about performance but about handling data that doesn't need frequent persistent writes (for a classic MMO, that might be player location or hit points in the middle of a battle).

As I said, I'm also not happy with too many moving parts (which are more error-prone), so I'm trying to find the optimal solution.


As mentioned above, the fact that data changes rapidly does not mean you need a real-time database to handle it. For most MMOs, when player location or hit points change, there is simply no database involved.

I won't quote what you said to me in a direct message, since that is private, but I can tell you that your assumptions about what is necessary for a game like this are incorrect. You seem to want a different database for every type of data you have, plus a database for your in-memory data; if that is what you want, then no, you can't do it more simply. If you're asking what you need, then you don't need most of this, and I'm speaking as someone who has worked on live MMOs and has experience with several databases (including graph DBs).

7 hours ago, SotarOraiste said:

As I said, I'm also not happy with too many moving parts (which are more error-prone), so I'm trying to find the optimal solution.

Without a running game you have no information about what to optimize. No doubt you will have a result, but it's not going to be optimal.


Typically folks deploy Redis as a cache in front of a database, when they have determined the specific bottlenecks in their existing system.

In practice, tuning a Redis cluster to run faster than a decent SQL implementation is... complex. For simple key/value storage operations most databases are blindingly fast, and there is no guarantee that you will have a workload that is actually faster in Redis.



I think what is generally suggested (that the database is authoritative and Redis is a cache layer) contradicts your quote, which made more sense to me. Still, I evaluated the options and came up with a solution of using each database "where it shines most."

------

So due to GraphDB limitations, I will use the in-memory one as the main authoritative database and Redis as a cache (unless there is a use case where Redis makes more sense, to prevent unnecessary workload or bottlenecks), and there will be one database for both logging and some aggregated data.

Is this good enough?

------

And when using two different databases is a technical necessity, what would you recommend for consistency? I thought of using a query queue for both databases and a timestamp to track them, so that if there is an inconsistency one of the databases can roll back (but I'm actually not sure that's doable).


I would love to hear an actual reason why using two different databases is a "technical necessity."

7 hours ago, SotarOraiste said:

And when using two different databases is a technical necessity, what would you recommend for consistency?

There are cases where it is necessary. But you did not describe such a case.

Fortunately for everybody using technology, this is a problem that has been solved since the 1970s. There are many techniques covered in most good books about enterprise-style data management. Usually it involves multiple transactions spanning multiple machines, with a list of steps that must be followed exactly for the process to work.

My recommendation if you were actually in that situation is to refer to any of those hundreds of excellent books on the topic, studying them carefully to be certain you understand the procedures and the risks, then following the detailed instructions precisely.
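One of those classic techniques is two-phase commit. A toy coordinator (leaving out the durable logging and crash recovery that any real implementation must have) might look like this, where each participant store exposes prepare/commit/rollback:

```javascript
// Toy two-phase commit coordinator. Phase 1: every participant votes by
// either preparing successfully or throwing. Phase 2: commit everywhere
// only if every vote was yes; otherwise roll back whatever was prepared.
// Real systems also write a durable coordinator log so recovery works
// after a crash between the phases; that is omitted here.
async function twoPhaseCommit(participants) {
  const prepared = [];
  try {
    for (const p of participants) {
      await p.prepare();   // phase 1: a throw counts as a "no" vote
      prepared.push(p);
    }
  } catch (err) {
    for (const p of prepared) await p.rollback(); // one "no" aborts all
    return false;
  }
  for (const p of participants) await p.commit(); // phase 2: all voted yes
  return true;
}
```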

7 hours ago, ApochPiQ said:

I would love to hear an actual reason why using two different databases is a "technical necessity."

In my case, a graph DB is needed for what graphs do best (shortest path or graph traversals), so some of the data must reside in it, while other data that is a perfect use case for a relational model must be in an RDBMS.

It's a pity graph data can't be serialized efficiently.

12 hours ago, SotarOraiste said:

I think what is generally suggested (that the database is authoritative and Redis is a cache layer) contradicts your quote, which made more sense to me.

What is generally suggested for games is that the in-memory state is authoritative, the database is a write-behind cache of that, and Redis is usually not used at all.

1 hour ago, SotarOraiste said:

In my case, a graph DB is needed for what graphs do best

That is not the definition of "needed". That is the definition of "wanted". I wouldn't recommend using a graph DB for games, because the main benefit of graph DBs is that you can easily perform complex queries spanning multiple entities that are indirectly related; not only can I not think of an example where I've needed to do that in a game, but in a game I would be doing it with data that is in memory anyway, which makes the graph DB irrelevant.


Fair enough, I will check the literature.

Still, to be honest, I didn't get why Redis can't be that in-memory state, in the form of key/value pairs.

And the GraphDB isn't a pure graph DB such as Neo4j, but a "multi-model" one, which in practice means a document DB where vertices and edges live in different collections when you need graph functions; it also works for document storage, otherwise I'd use a solution like the one you mentioned.

Can this data be serialized? Probably to a great extent, but I find a document DB more logical for a game. Maybe I'm wrong.


I'm going to be as gentle as I possibly can here.

All data stored on disk or transmitted over a network is "serialized" by definition. That's what the word means: putting the data into a "serial" stream of bytes, or an ordered sequence if you will. If you're not proficient in the basic terminology of the trade, you are not ready to be making the kind of decisions you seem to want to make here, and you are not ready to make the kind of game you're talking about. You need to do a lot more research and drink a lot less NoSQL Kool-Aid.

A graph database is massive overkill for the kinds of graph searches used in games. For example, the canonical pathfinding algorithm is A* or some variant of it; this algorithm operates on tiny slices of memory in the game itself - not a separate database - and as such will probably be around an order of magnitude more efficient if coded correctly. Even social network mapping can be done with the same graph code in game memory and eliminate the need for a fat DB entirely.

I have no idea how you have reached your conclusion that a document DB makes sense for implementing a game simulation, but you could not be more wrong.

A typical MMO implementation is a custom piece of software - the "game server" - which stores all of the world's information inside its own memory space. It handles everything from physics and moving around the world to item drops from creatures and rolling on loot tables. It handles the core implementation of everything the game needs to do in order to be a game.

There is a reason for this, and it is (plain and simple) compactness. You may have heard of "locality" as well. If I have to ask a DB server in the next datacenter rack for the position of a player, and the response takes ~1 ms to reach me, I've burned literally one hundred thousand times the time it would take to just fetch that position from RAM. If your game can run at speed N in a database-driven configuration, reading documents for all of its game state and whatnot, it can run at speed N*100,000 if it's doing everything in a local process on a single machine.
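The back-of-envelope arithmetic behind that factor, assuming ~1 ms for the network round trip and ~10 ns for a RAM read (both are rough ballpark figures, not measurements):

```javascript
// Rough latency comparison: network round trip vs. local RAM access.
const networkRoundTripNs = 1_000_000; // ~1 ms to a DB in a nearby rack
const ramAccessNs = 10;               // ~10 ns for an uncached RAM read
console.log(networkRoundTripNs / ramAccessNs); // → 100000
```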

Given that most games struggle to support a few hundred people in the same simulation, I think it's fair to say that making a five-order-of-magnitude compromise on your basic simulation performance is... well, stupid.

After a game server exists, it makes sense to talk about persistence, which is storing subsets of the game world data in a way that allows for a few niceties:

- Players can disconnect and reconnect and be in the same place they left off.
- Players who are not connected cost zero resources.
- Servers can be shut down for maintenance or upgrades without destroying the game.
- And so on.

MMO persistence is traditionally very minimalistic, from a certain perspective. You need a few things to do it right, like schema migration and versioning capabilities, but any reasonable DB has those. (N.B. I am indeed implying that if your DB of choice can't handle a schema version upgrade then it's no good.) You don't need anything more than a BLOB column type, or even sparse per-player files on disk. Databases just don't act as a bottleneck in most MMOs unless you design them very poorly.

The one and only time I would suggest splitting out multiple persistence solutions is if you want to generate time-series data like event logs or other metrics. Then use an appropriate tool for metrics aggregation and retention in addition to the persistence DB for game data. But that's a 100,000 CCU problem. You have a ways to go before you should be thinking about that sort of stuff.
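As a hedged sketch of that minimal persistence (JSON stands in here for whatever blob format you actually pick, and the version field is what makes schema migration possible; the surrounding DB call is left out since any store with a BLOB column works):

```javascript
// One blob per player: serialize the whole character state, tagged with a
// schema version so old blobs can be migrated on load.
function serializePlayer(player) {
  return Buffer.from(JSON.stringify({ version: 1, ...player }));
}

function deserializePlayer(blob) {
  const { version, ...state } = JSON.parse(blob.toString());
  if (version !== 1) throw new Error("run a schema migration first");
  return state;
}
```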


My understanding of serialization was in line with the Wikipedia definition: "Serialization is the process of translating data structures or object state into a format that can be stored (for example, in a file or memory buffer) or transmitted (for example, across a network connection link) and reconstructed later." Sorry if the term wasn't right; by "SQL serialization" I mostly meant structuring tables.

From that point of view, a pseudo-JSON structure such as

{
  "id": 3289723498,
  "name": "whatever",
  "level": 216,
  "weapons": {
    "sword": { "stat": "blabla", "stat2": "blabla2" },
    "axe": ...
  },
  "inventory": { .... },
  ....
}

looks more manageable to me (with the trade-off of storing repetitive data) than an SQL JOIN fest, but as I said, I may be wrong. (I at least tend to believe it's a design choice rather than a "never-ever", but I could still be very wrong.)

My primary reason to use a GraphDB is Floyd-Warshall / Dijkstra for "spatial" data. If I decide to use a relational database, I will of course either limit the GraphDB to such functions or switch to a Node module for the graph part. My current point is: "if a JSON structure fits my purpose, why not use a single solution for both?"

And OK, it of course makes perfect sense for the "game server software" (which is Node.js based, as this is a browser game) to handle data in memory. Although I haven't done a thorough search, I doubt Node.js (V8) shines in memory management / garbage collection as much as your beloved choice (C?). And considering Redis is an "in-memory data structure store", I see no harm in using this key/value store for keeping data in memory, as long as no crucial performance hits occur in comparison.

Still, I got the part that using a single database is the cure for all evils, and after that it rather becomes a question of how you store the data. Thanks for your response, it has merit indeed.


Wikipedia's definition agrees with mine; storing data in a table structure is generally considered an example of deserialization, i.e. not storing things in a flat byte buffer of transmissible data. Anyway, my point stands: if you're having trouble with basic terminology, you're going to have a lot more trouble with anything sophisticated, like justifying a multi-DB infrastructure (to pick one wild example).

JSON versus normalized tables is a pointless debate for now; you're still missing fundamental concepts. It will be all but impossible for you to succeed unless you work on your fundamentals first. You need to get rid of this notion that you have a giant database-shaped hammer and everything that involves data must be smacked with it like a nail.

The whole point of my description of game servers is not that they do things in memory. The point is that they do it in the same memory as the game. Your "in memory database solution" is not the same thing as writing your game logic to work independently of a database. You must understand this fundamental concept or any further discussion is going to be a waste of time.

What I'm telling you is that storing all your game data in a DB will have performance implications. More specifically, it will be impossible to scale to more than a handful of players if you do this. You need to stash the data for the game simulation in the same program as the game simulation itself, and only use databases for recording player/character data when it is time to do so. In fact there's really not much point in using a database at all unless you actually know what you're doing with them (and - nothing personal - you clearly do not).

When it comes to game implementation, running Floyd-Warshall, Dijkstra, A*, or any other graph search/analysis algorithm on a database is just silly. You can find implementations of those algorithms in practically any mainstream language you can imagine. Just do the search in your own simulation code and you're done. I virtually guarantee it will perform better than trying to shoehorn a game into a shrink-wrapped DB solution.
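For example, a plain in-process Dijkstra over an adjacency map is a few dozen lines; this sketch uses a linear scan of the frontier where production code would use a binary heap:

```javascript
// In-process Dijkstra: the graph lives in game memory as an adjacency map
// { node: { neighbor: edgeCost } }; returns the cheapest path cost, no DB.
function dijkstra(graph, start, goal) {
  const dist = { [start]: 0 };
  const visited = new Set();
  const frontier = [start];
  while (frontier.length > 0) {
    // Pick the frontier node with the smallest known distance.
    // (A binary heap would replace this sort in real code.)
    frontier.sort((a, b) => dist[a] - dist[b]);
    const node = frontier.shift();
    if (node === goal) return dist[node];
    if (visited.has(node)) continue;
    visited.add(node);
    for (const [next, cost] of Object.entries(graph[node] ?? {})) {
      const d = dist[node] + cost;
      if (dist[next] === undefined || d < dist[next]) {
        dist[next] = d;
        frontier.push(next);
      }
    }
  }
  return Infinity; // goal unreachable
}
```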

By the way, my "choices" (read: experience deploying real, successful MMOs) are largely independent of things like language. You can do all the things I'm talking about atop Node.js just as realistically as you could do it in a homegrown C or C++ framework. Again, pretty much any mature, robust language is fine as long as you can design the architecture right. There will be differences between how well a Node implementation performs versus a highly tuned C++ implementation, sure, but those affect things at very high numbers of users. By the time you need to worry about it, you can probably afford to worry about it (i.e. pay someone to worry for you).


Using a GraphDB for your game's spatial queries and Redis for game data lookups: I look forward to seeing what you can create.

It is not that these choices cannot do those things; it is that you are doing things at a breathtakingly wrong scale. Imagine going fishing with grandpa on a small lake, but instead of a rowboat and poles you've got a battleship and explosive depth charges. Or wanting to find a path to the corner store by bringing in an army of scouts and pathfinders to make sure you've got the path down the street safely and securely.

The tools you are discussing are capable of doing the things you describe, but they are for radically different purposes.

At this point I hope you go forward with your project. It will serve as an enlightening learning experience.


OK o7, I will evaluate the whole structure and come up with a solution that isn't overengineered or overcomplicated without a good reason.

Thanks for the responses.
