game data persistence problem

Started by
10 comments, last by hplus0603 9 years, 6 months ago

I've been working on a distributed mmo cluster system, in which different business logic handle is distributed to different server.

In this system, I have a character server and a guild server. Character server owns all characters' base info, guild server owns all guilds' base info and all guild members' base info.

Both the character server and the guild server own character's level. When the character server updates a character's level, it first commits an update transaction to the database asynchronously. After the commit returns, the character server will publish a character-level-update message, the guild server will then subscribe the message and update the corresponding guild member's level.

My problem is, if the character server commits the update transaction, but crashes before the commit returns or before it publishes the character-level-update message, and then we restart the character server, characte level in the character sever will be the newest value, but guild member level in the guild server will still be the old value. Can anyone give any suggestion to solve this problem? Or is there anything wrong in my design? Thanks!

Here below is my cluster design:

---------------------------------------------------------------------------------
<clients> <clients> <clients>
| | |
<gate server> <gate server> <gate server>
|______________|____________|
|
<route server> // route different request to different server
_______________|_____________
| | |
<char server> <guild server> <other server>
|______________|_____________|
|
<shared database>
---------------------------------------------------------------------------------
Advertisement

Put both actions (increase the level in char server and guild server) into a single transaction. You can archive this by a two-phase commit. It is important, that you need to track your transaction state (use a persistent state tracking). If it fails during server shutdown, you either need to roll it back or to continue it once you restart the server .

Put both actions (increase the level in char server and guild server) into a single transaction. You can archive this by a two-phase commit. It is important, that you need to track your transaction state (use a persistent state tracking). If it fails during server shutdown, you either need to roll it back or to continue it once you restart the server .

Thanks to Ashaman73, but maybe there's some simpler solution?

You mentioned two-phase-commit, does that mean I should update two table (within a single transaction) ? The character table and guild member table ? But actually, I don't have a guild member table. I just have a guild table, with a key field (guild_id) and a blob field (guild_member_id_list). The field guild_member_id_list contains all members' id, that is character id. Any member info the guild server needs can be acheived from character server by the member id. So what I want is, when the character server changed some character's level, only one table needs to be updated, that is the character table. Then the character server publishes a specific type of message to told the newest character level to all other servers including the guild server, who are interested with character level and have already subscribed this type of message. Those subscriber servers just store the character level locally, they won't store the character level to any database table.

To make it more clear and simple, let's say I have a character server and a chat server. The chat server doesn't have to maintain any persistent data in any database table, everything it maintains is temporary and is just in its local RAM. What the chat server needs is the name, gender, and some other mutable states (e.g. level, battle state) of those characters who are in some specific chat channel. I wish all these character data the chat server needs would just come from the character server, not from any real database table, and the chat server doesn't have to persist any of these data to any database table. Yeah, it looks like some data replication mechanism.

So does my design I metioned above have any problem? I really can't find a simle and sutable solution now.

your problem is that you're storing the same data in two different databases on two different servers, which can get out of sync.

so the first question is why?

i assume you store the guild member's level on the guild server to reduce the number of times the guild server must ask for data from the character server.

seems you have two choices:

1. revert to a more standard relational database design where data is stored in one and only one place. IE character level is stored on the character server, and the guild server must get the info from the character server every time.

2. do something to guarantee the servers stay in sync, as described above by Ashaman73

Norm Barrows

Rockland Software Productions

"Building PC games since 1989"

rocklandsoftware.net

PLAY CAVEMAN NOW!

http://rocklandsoftware.net/beta.php

Denormalized data is very commonly a consistency problem, and a trade-off you should be careful about.

So, first: Why do you need character servers separate from guild servers? Is this actually an operational requirement, or are you just "planning ahead"? If you can put characters and guilds into the same database, then that's a lot easier. It may also be possible to connect the two servers using a single transaction monitor function, so they look like one database, and support a single commit function (this is implemented in the database using two-phase commit, as suggested above.)
For MySQL, look for example at XA transactions: dev.mysql.com/doc/refman/5.6/en/xa-restrictions.html
For DB/2 or Oracle, they have similar support (for a fee.)

Second: Why does the guild server need the character level at all? Can it fetch the character level from the source of truth (character server) when it's needed?

Third: How bad is it if the guild server is out of date? Can it be out of date for 15 minutes in the rare case that you have a crash involving the character server? If so, keep a last-modified time stamp on the character server side, and run a reconciliation process every 15 minutes that selects everything that changed since the last reconciliation, and re-applies it to the guild server (which may be a no-op, or may actually "heal" the denormalized data.)

In large systems, data denormalization versus source-of-truth is a significant design concern, and there is no "one true" solution. The right choice depends on requirements for the particular case. This is an instance of the more general problem of "cache invalidation," which famously is one of the two actually-hard problems in computer science. (The other being "naming tings.")
enum Bool { True, False, FileNotFound };

your problem is that you're storing the same data in two different databases on two different servers

Thanks to Norman Barrows, but maybe I didn't make my case clear. Actually, I have only one database shared by two or more different servers. The character server updates the character table in the database, while the guild server updates the guild table in the same database.


So, first: Why do you need character servers separate from guild servers? Is this actually an operational requirement, or are you just "planning ahead"?

Yes, the reason I separate character server from guild server is just "planning ahead", I'm just trying "different logic is distributed in different process/server".


If you can put characters and guilds into the same database, then that's a lot easier.

Actually, as I replied above to Norman Barrows, my characters and guilds are just in the same database. So how to to in a lot easier way?


Second: Why does the guild server need the character level at all?

Some logic validation in the guild server will depend on guild member level.


Can it fetch the character level from the source of truth (character server) when it's needed?

Yes, what I want is the guild server gets character level from the character server. But I want the character level "to be informed of/broadcasted/published passively by the character server to any other server including the guild server only when the character server changed it, and the guild server stores the newest level locally as temporary data in it's RAM" instead of "to be fetched proactively by the guild server from the character server whenever the guild server needs it", yes, "passively informed of", not "proactively fetched", because I think "passively informed of" will be simpler, more graceful and more effective. Besides that, the key is, in my logic handle design, the character server must first ensures the character-level-update transaction has already been committed sucessfully before it can inform of the newes character level to the guild server (and other servers). So my problem is, if the character server commits sucessfully, but crashes unfortunately before it informs, then the guild server (and other servers) won't know the newest character level, while the character level in the database already changed to the newest. Still worse, if we then restart the character server, it will reconstruct local memory data from database, then it will hold the newest character level, while the other servers still hold the old character level.

Allow me to propose an alternate plan of action, stolen from tools I'm used to using.

1) Every transaction must be idempotent. This means that you do not commit a "increase level by 1" transaction, but a "iwitggwg has transitioned from level 8 to level 9". If you commit the same transaction multiple times, no big deal.

2) Every non-trivial event goes into a persistent message queue.

3) Messages may only be removed from the message queue if they have been processed in full.

Thus, if your transaction now follows:

Commit "level 8->9 for iwitggwg" -> message queue

Commit "level 9 for iwitggwg" -> db

Commit "level 8->9 for iwitggwg's guild" -> message queue

remove message from character queue

Commit "level 9 for iwitggwg's guild"

remove message from guild queue

Then if at any point the server crashes, the persistent message queue will not have had the message removed (point #3), thus it will re-run the relevant parts of the event. Thus the importance of point #1, as each db transaction may be attempted multiple times.

Shouldnt this be something more systematic for the whole game Database architecture ??

DBs have transaction logging features - built in redundancy so that if there is a 'crash' the DB can build back the lost data states from mutliple sources. Something 'heavyweight' for a MMORPG (and definitely needed for any game that uses Players Real World Money and the things they purchase using it)

This is the old 'Sync' issue for world state (I remember UO long ago where frequent server crashes wiped out everything you did for the current day - bothersome when you were trying to advance your characters and the high frequency of all the effort being lost).

The same problem issue would exist for many significant transactions and coherence between multiple characters (like trading) -- of course depending on your game mechanics which are trivial vs which are significant.

I looked into this issue for a multi-server (AI nodes/Zone nodes/Client front-end nodes/DB nodes) simulation system, where failures had to not lose any significant interactions - it required DB for rolled out zones (simulation running IN-memory) and immediate Backup DB (data updates piped directly to a second DB machine mirror image) AND logging mechanisms for both, as well as frequent Sync saves being done to minimize rebuilding/resume time when there was a major failure (all nodes having spares ready to assume activity when entire server machines failed).

--------------------------------------------[size="1"]Ratings are Opinion, not Fact
The simplest solution is to make character leveling simply update both database tables at the same time in a single transaction. Given that you share a database on the back-end, this is simple to implement.

If you want notifications to go between systems, the typical way of doing this is to keep a work queue of some sort -- either records in a database of "work to be done" or a persistent message queue of "notifications" where processes can subscribe to "topics," or something else along those lines. The guild server would then simply subscribe to the "players that have leveled" notification, and do whatever it needs to do when it hears about it.

However, I think you're putting the cart before the horse a bit here. The likely bottleneck in your system is either the router server (for real-time data,) or the database server (for durable updates,) unless you have some really inefficient performance bugs in the other systems. Thus, I'd suggest trying to separate out the databases to separate stores, and also getting rid of the router server, instead keeping the gateway servers all up to date with what the routing should be (assuming this needs any logic at all.)

Without real load, though, it's impossible to predict what will happen if and when you actually get real users at any real level. A single physical machine or two can likely serve up to 10,000 simultaneous players (excluding physics simulation / interest management,) so splitting along some arbitrary boundary is likely to be, well, arbitrary, until you really know the shape of the workload!
enum Bool { True, False, FileNotFound };

This topic is closed to new replies.

Advertisement