(Eventual) Consistency in an MMO-ish game database structure

Started by
26 comments, last by Kylotan 6 years, 8 months ago

Hello there o/ ,

   I've started working on a browser game and I'm trying to figure out its database structure. My problem is that I've managed to come up with a structure, but I'm not sure whether it makes sense and, if it does, how to handle consistency across several databases of different types.

   Here is a schema that I hope makes sense:

[schema image]

According to current plan,

  1. Memory database (Redis): This is where the main action takes place, chosen for speed.
  2. Snapshot database (a persistent in-memory database offering more than key/value pairs): This stores only the data the memory DB would need in case of a rebuild; as the name implies, it is a snapshot of the current data. It's eventually consistent, and in case of a failure it's authoritative.
  3. History database (probably an RDBMS): This stores history data that's relevant to players, i.e. data they might need to see one day (in the fashion of ancient tweets). There's no point keeping the fact that Player345278 reached level 28 ages ago in an in-memory database.
  4. Log database: As the name implies, this is for storing logs of any kind (again, no point keeping the fact that Player345278 killed a chicken ages ago if the player will never need such data).
  5. Analytics database: A server-side tool for mining log data for fraud, behavioral patterns, and similar issues.

 

 In the current plan there are writes to several databases for each action (which is heartbreaking). For example:

A player finds 100 gold: the memory DB changes the total gold, the snapshot DB makes the same change, the log DB logs an entry, and the history DB is also updated if needed (such as a daily gold finders table).

My questions are:

1 - Is there a better alternative, in your opinion?

2 - In case of a failure of any kind, what's the best way to ensure minimal data loss? Should the snapshot be authoritative, so the memory DB copies everything from it and log entries older than the snapshot's latest timestamp are deleted? I have no good idea for the history DB :)

Thanks in advance.

   

5 hours ago, SotarOraiste said:

1 - Is there a better alternative, in your opinion?

Yes. YAGNI.  

Redis is great if you are doing several hundred thousand transactions per second.  That means you'll have millions of customers.  You don't need that yet, and by the time you do, you'll be able to afford people who write that for you.

Working from previously loaded content is something databases do anyway.  They'll use nearly all the available RAM on the box for it, and you can buy boxes with many gigabytes and even terabytes of memory.  Again, if you reach the point where you actually need to code this yourself, you'll have enough customers and money that you can afford people who are expert in that.

A "history database" as you describe it is what many games do. Keep a record of the important things over time, in addition to the current state of things.  Nothing special here, nothing to replicate, just build a bog-standard database.

The log database and analytics database are both too big for now. Just dump your logs to plain-text log files. Eventually you may want bigger systems like Splunk or a big-data pipeline, and that's great, because those work with standard plain-text log files. By logging to plain text you're already on the path to the big systems.

6 hours ago, SotarOraiste said:

2 - In case of a failure of any kind, what's the best way to ensure minimal data loss? Should the snapshot be authoritative, so the memory DB copies everything from it and log entries older than the snapshot's latest timestamp are deleted? I have no good idea for the history DB :)

You probably don't need the "failure of any kind" support.  That is extremely rare, even in the business world.

Databases are extremely tolerant of the common problems, and most of the issues can be resolved with something called isolation levels. You can go from very strict, where everything must be fully committed to disk before anything is allowed to continue, to playing fast and loose, where you can use data immediately, before the transaction is even complete, let alone written to disk.
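Postgres exposes the strictness spectrum directly (`SET TRANSACTION ISOLATION LEVEL ...`). As a rough single-file analogue of the durability end of that trade-off, SQLite lets you choose how hard it works to get commits onto disk via `PRAGMA synchronous`. A sketch, not a benchmark:

```python
import sqlite3

conn = sqlite3.connect("game.db")
# Strict end of the spectrum: every commit is fsync'd to disk before
# the program continues (safe, slower).
conn.execute("PRAGMA synchronous = FULL")
# Fast-and-loose end: commits are handed to the OS and may be lost in
# a power failure, but throughput is much higher.
# conn.execute("PRAGMA synchronous = OFF")

conn.execute("CREATE TABLE IF NOT EXISTS gold (player TEXT PRIMARY KEY, total INTEGER)")
conn.execute("INSERT OR REPLACE INTO gold VALUES ('Player345278', 100)")
conn.commit()
```

The point is that this is a configuration knob on an ordinary database, not something you need a second database technology to get.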

But even then, games typically don't write to databases for every event.  Typically dynamic things like shooting at bunnies, picking up loot, or wandering around the map are accumulated and written every few minutes. They may also be written at certain checkpoint locations.  Sometimes there are critical things that need to be written immediately, like a purchase or trade with another player, and in that case you will want the person to wait around for confirmation that the transaction is complete and serialized. That only takes a few milliseconds or maybe a second if your system is running slowly. But they are fairly rare transactions, so a brief animation or spinner works well.
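The accumulate-then-flush pattern described above can be sketched in a few lines. Here a plain dict stands in for the real database and the names are my own invention:

```python
db = {}            # stands in for the real database
state = {}         # live in-memory game state
dirty = set()      # players whose state changed since the last flush

def record(player, key, value, critical=False):
    """Update in-memory state; persist immediately only if critical."""
    state.setdefault(player, {})[key] = value
    if critical:
        # e.g. a purchase or trade: write through, and make the player
        # wait for confirmation before continuing
        db[player] = dict(state[player])
        dirty.discard(player)
    else:
        dirty.add(player)  # picked up by the next periodic flush

def flush():
    """Called every few minutes, or at checkpoint locations."""
    for player in dirty:
        db[player] = dict(state[player])
    dirty.clear()

record("p1", "position", (10, 4))               # cheap, deferred
record("p1", "gems_bought", 50, critical=True)  # persisted immediately
```

Everything routine rides along for free with the next flush; only the rare, important events pay the cost of a synchronous write.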

A lone developer is unlikely to suffer problems like that, especially if you're getting hosting through AWS or Google that have amazing data centers. You are far more likely to experience database problems through your own bugs and defects, so regular backups of production systems are the answer.

I would start at the data-end.

Decide what data you have, which tables, which columns and keys, and how the tables are connected, i.e. the database schema. If you normalize that, there is no redundancy, so it's consistent by definition.
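As a toy example of what "normalize first" looks like, each fact lives in exactly one place: current gold only on the player row, past events only in a history table. The table and column names are illustrative:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE player (
        id   INTEGER PRIMARY KEY,
        name TEXT NOT NULL UNIQUE,
        gold INTEGER NOT NULL DEFAULT 0
    );
    CREATE TABLE player_history (
        id        INTEGER PRIMARY KEY,
        player_id INTEGER NOT NULL REFERENCES player(id),
        event     TEXT NOT NULL,
        at        TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP
    );
""")
conn.execute("INSERT INTO player (name, gold) VALUES ('Player345278', 100)")
conn.execute("INSERT INTO player_history (player_id, event) "
             "SELECT id, 'reached level 28' FROM player WHERE name = 'Player345278'")
conn.commit()
```

Because nothing is duplicated, there is no second copy of the gold total to drift out of sync.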

That gives you the data structure of your application at the conceptual level. Theoretically, you can push everything into one database and have it running (and that might be a good idea to start with, as it lets you build the game first).

Your picture is much more about how to make the data accessible fast enough to scale up to more users. That is only relevant if you have (sufficient) users in the first place. I am a newbie here, but I would define some database as authoritative and everything else as a "cache" (i.e. non-authoritative). As far as I know, how to speed up access depends heavily on the queries that you perform, and without a running game you don't know the queries and their frequencies, so making a sane design is pretty much impossible at this stage, unless you've done it a few times already.

As such, I'd suggest dropping as much complexity as you can and making things work first. Then try to figure out where to add complexity to get better <whatever you need to improve>.

First of all, thanks for replies.

@frob

I know it doesn't look like it, but this is a simplified model :D, because:

1 - I need Redis for fast-paced transactions and some scheduled tasks,

2 - The "Snapshot" database is needed primarily for its graph features,

3 - The "History" one is probably an RDBMS, as aggregate functions are common,

4 - Log / analytics are rather off-topic :) so plain text is of course on the table.

I'm not happy about having that many moving parts; that's why I'm trying to reconcile them :)

And "failure of any kind" is a bit of an overstatement :) I mainly mean cases where the server has somehow managed to crash (not much to do if the server explodes, after all :) ) or similar common risks.

@Alberth

Well, I'm rather a newbie at this as well. Actually my concern is that when there are several databases built on different technologies, their responses to a failure will be different. If not configured to take snapshots, Redis data will be gone; some will use a write-ahead log, etc. I just need a way to ensure that they all revert back to the same point :)

1 hour ago, SotarOraiste said:

I'm rather a newbie at this as well.

Then you need to trust us when we say that you've overcomplicated things.

MMO database structures I have seen in live, revenue-making production:

  • 1x Postgres database. Several tables. Text based logging.
  • 1x Oracle database, several tables. 1x 'log' database for analytics etc., not sure what was used there.

Nothing more complex than that.

You don't need Redis for 'fast paced transactions'. Relational DBs were handling fast-paced transactions for decades before Redis came along. I would bet $100 that your browser game will not come close to needing the sort of performance that forces you to abandon a traditional relational DB.

For most persistent games you can handle much of your data via 2 simple approaches:

  • if it's mostly inconsequential (e.g. hitpoint loss, position in the world), make the change in memory and schedule an eventual flush to the DB, some seconds (or even minutes) later.
  • if it is important or involves real money (or equivalent), make the in-memory changes wait until a DB transaction for it has finished.

You don't want to be treating the whole game state as an in-memory database; that adds much complexity and some performance issues to a system where 90% of the data is non-critical. Most games simply do not care if the player loses a minute or two of progress if the servers crash because (a) virtually nothing of importance happens in that typical time period, and (b) the servers shouldn't crash anyway. As long as the things which are of importance - again, real money transactions, levelling up, big achievements - are persisted to the database immediately, you have little to worry about.

The snapshot database sounds pointless. You'll need a DB to store your game state and the game state can include past events, i.e. your history database.

The log and analytics databases can be the same. In fact, they probably should be the same, since you won't always know in advance which events are worth logging for business intelligence and which ones are not. Log all interesting events that happen. Create queries to extract business-relevant ones.
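If all events are logged as one JSON object per line, "create queries to extract business-relevant ones" can start out as a simple filter over the file. A sketch with made-up event names:

```python
import json

def business_events(lines, wanted=("purchase", "trade")):
    """Yield only the logged events relevant to business intelligence."""
    for line in lines:
        event = json.loads(line)
        if event.get("event") in wanted:
            yield event

# In practice `log` would be an open log file; a list works the same way.
log = [
    '{"event": "chicken_killed", "player": "p1"}',
    '{"event": "purchase", "player": "p1", "gems": 50}',
]
relevant = list(business_events(log))
```

Because everything was logged, you can change `wanted` later without having lost any history.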

@Kylotan

Thanks for the reply. Although I mostly agree with your view, a single-database solution isn't necessarily best practice in my opinion.

Quote

Most games I've been involved with went down the route of just using the database as a save-game file for each player. If servers stay up and there are no other bugs then it is eventually consistent. The DB becomes sort of a write-back cache. It does mean that 'in-memory' is the authoritative data (rather than the DB, as is the case in more traditional software), you do have to be careful around migrating players between servers (zones, instances, whatever), and for important trades that may have significant repercussions you may want to save all participants at once atomically anyway.

But quoting you from another thread ( https://www.gamedev.net/forums/topic/686334-mmo-database/ ), I think this approach is quite applicable.

So, when everything is fine, Redis/the in-memory store is authoritative and the other database acts as an eventually consistent save-game database (aka the snapshot). When there is some sort of reboot or failure, this save-game DB becomes authoritative (as the in-memory data is obviously gone) and the data is simply copied from it back into Redis.
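The restart path described here is just "reload everything from the save DB". A sketch of the idea, with dicts standing in for Redis and the save-game database (not real Redis calls):

```python
save_db = {"Player345278": {"gold": 100, "level": 28}}  # authoritative on boot
cache = {}  # stands in for Redis

def rebuild_cache():
    """After a crash or reboot, repopulate the in-memory store from the
    save-game database, which is authoritative at that moment."""
    cache.clear()
    for player, snapshot in save_db.items():
        cache[player] = dict(snapshot)

rebuild_cache()
# From here on, the in-memory copy is authoritative again and the save DB
# trails behind as an eventually consistent write-back target.
```

The key property is that authority flips exactly once, at boot; the two stores never have to negotiate during normal play.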

I'd love it if the in-memory and persistent databases were the same, so that would be fine, but I'm not really sure how to handle the issues that might arise otherwise.

And the prior schema can actually be simplified to just the in-memory and persistent databases.

What would best practice be when you can't have "one single database to rule the universe"? Would it need something like a monitor checking all the databases against timestamps?

What proof do you have that a single standard database, say PostgreSQL, won't do the job? These things can handle real-time airline reservations, ATM transactions, real-time online orders, and what not. They aren't scared of a million records or more. These things typically don't just die.

 

One of the problems with software is that it doesn't have a natural upper limit on complexity. Unlike physical building space, in software you can always add one more layer to make it "better" in some way. Unfortunately, better in one way costs something elsewhere, i.e. you're just moving the misery around.

For a concrete system where you know the required performance characteristics precisely, you can design a solution such that the misery doesn't hurt you (or at least hurts minimally). You can't do that in the general case, for the simple reason that you need to know where the misery will not affect you, which is impossible without knowing the required performance characteristics.

You can in theory distribute the misery evenly over the entire design space, but then it's always hurting, i.e. you make the really well-performing cases unreachable by design. I think this is what you're doing currently: you think a single database isn't enough, so you make this horribly complicated multi-DB system where stuff has to be synchronized all the time in some way, and preventive costs are paid to avoid getting into trouble when unknown, possibly non-existent shit hits the fan.

This is all fine if a horribly complicated multi-DB system facing real unexpected shit is what you eventually really need to have. However, in practice this may not be the case, as Kylotan said. Thus in all likelihood your multi-DB synchronization is actually making the database slower than it needs to be; that is, your design efforts at this stage might be counterproductive.

 

The standard software tactic here is to define an interface to program against, then pick the simplest possible solution as the implementation, and move on to other parts of the problem. When the implementation is no longer performing well enough, you reconsider how to implement the interface (i.e. move the hurt to a new spot, out of the way in that situation).

Effectively, you're reducing and deferring the design effort here. If a simple implementation of the interface suffices, you don't need to design your multi-DB thing at all. You've saved yourself a few months of effort.

If it is really needed, you will at some point run into limitations of the simpler solution, and at that point you do the work to design and implement your multi-DB thing. The nice thing is that you then know where the simple solution fails, so you can take that specifically into consideration when designing a better DB implementation. At that point you also know where the misery can be moved to, such that it doesn't affect you. The result is going to perform better than anything you can cook up now.

 

 

@Alberth

Thanks for reply.

Actually, one database might handle the issue effectively ( I just stumbled upon https://engineering.hipolabs.com/graphdb-to-postgresql/ ), but that's another aspect of the issue once the model is simplified to two databases ( Redis and X ).

My interest in Redis is not about performance but about handling data that doesn't need frequent writes ( for a classic MMO, that might be player location or hit points in the middle of battle ).

As I said, I'm also not happy with so many moving parts (which are more error-prone), so I'm trying to find an optimal solution.

As mentioned above, the fact that data changes rapidly does not mean you need a real-time database to handle it. For most MMOs, when player location or hit points change, there is simply no database involved.

I won't quote what you said to me in a direct message since that is private, but I can tell you that your assumptions about what is necessary for a game like this are incorrect. You seem to want a different database for every different type of data you have, and a database for your in-memory data, and if that is what you want, then no, you can't do it more simply. If you're asking what you need, then you don't need most of this, and I'm speaking as someone who has worked on live MMOs and who has experience of several databases (including graph DBs).

7 hours ago, SotarOraiste said:

As I said, I'm also not happy with so many moving parts (which are more error-prone), so I'm trying to find an optimal solution.

Without a running game you have no information about what to optimize. No doubt you will end up with a result, but it's not going to be optimal.

This topic is closed to new replies.
