Jump to content

  • Log In with Google      Sign In   
  • Create Account

game data persistence problem


Old topic!
Guest, the last post of this topic is over 60 days old and at this point you may not reply in this topic. If you wish to continue this conversation start a new topic.

  • You cannot reply to this topic
11 replies to this topic

#1 iwitggwg   Members   -  Reputation: 113

Like
0Likes
Like

Posted 28 August 2014 - 02:43 AM

  I've been working on a distributed mmo cluster system, in which different business logic handle is distributed to different server.

  In this system, I have a character server and a guild server. Character server owns all characters' base info, guild server owns all guilds' base info and all guild members' base info.

  Both the character server and the guild server own character's level. When the character server updates a character's level, it first commits an update transaction to the database asynchronously. After the commit returns, the character server will publish a character-level-update message, the guild server will then subscribe the message and update the corresponding guild member's level.

  My problem is, if the character server commits the update transaction, but crashes before the commit returns or before it publishes the character-level-update message,  and then we restart the character server, characte level in the character sever will be the newest value, but guild member level in the guild server will still be the old value. Can anyone give any suggestion to solve this problem? Or is there anything wrong in my design? Thanks!

 

Here below is my cluster design:

---------------------------------------------------------------------------------
  <clients>              <clients>           <clients>
         |                       |                      |
<gate server>    <gate server>     <gate server>
         |______________|____________|
                                 |
                         <route server>  // route different request to different server
         _______________|_____________
         |                      |                     |
<char server>    <guild server>    <other server>
         |______________|_____________|
                                 |
                        <shared database>
---------------------------------------------------------------------------------

Edited by iwitggwg, 28 August 2014 - 09:00 PM.


Sponsor:

#2 Ashaman73   Crossbones+   -  Reputation: 7991

Like
0Likes
Like

Posted 28 August 2014 - 03:09 AM

Put both actions (increase the level in char server and guild server) into a single transaction. You can archive this by a two-phase commit. It is important, that you need to track your transaction state (use a persistent state tracking). If it fails during server shutdown, you either need to roll it back or to continue it once you restart the server .


Edited by Ashaman73, 28 August 2014 - 05:38 AM.


#3 iwitggwg   Members   -  Reputation: 113

Like
0Likes
Like

Posted 28 August 2014 - 07:45 AM

Put both actions (increase the level in char server and guild server) into a single transaction. You can archive this by a two-phase commit. It is important, that you need to track your transaction state (use a persistent state tracking). If it fails during server shutdown, you either need to roll it back or to continue it once you restart the server .

Thanks to Ashaman73, but maybe there's some simpler solution?

You mentioned two-phase-commit, does that mean I should update two table (within a single transaction) ? The character table and guild member table ? But actually, I don't have a guild member table. I just have a guild table, with a key field (guild_id) and a blob field (guild_member_id_list). The field guild_member_id_list contains all members' id, that is character id. Any member info the guild server needs can be acheived from character server by the member id. So what I want is, when the character server changed some character's level, only one table needs to be updated, that is the character table. Then the character server publishes a specific type of message to told the newest character level to all other servers including the guild server, who are interested with character level and have already subscribed this type of message. Those subscriber servers just store the character level locally, they won't store the character level to any database table.

To make it more clear and simple, let's say I have a character server and a chat server. The chat server doesn't have to maintain any persistent data in any database table, everything it maintains is temporary and is just in its local RAM. What the chat server needs is the name, gender, and some other mutable states (e.g. level, battle state) of those characters who are in some specific chat channel. I wish all these character data the chat server needs would just come from the character server, not from any real database table, and the chat server doesn't have to persist any of these data to any database table. Yeah, it looks like some data replication mechanism.

So does my design I metioned above have any problem? I really can't find a simle and sutable solution now.


Edited by iwitggwg, 28 August 2014 - 07:59 AM.


#4 Norman Barrows   Crossbones+   -  Reputation: 2308

Like
0Likes
Like

Posted 28 August 2014 - 08:54 AM

your problem is that you're storing the same data in two different databases on two different servers, which can get out of sync.

 

so the first question is why? 

 

i assume you store the guild member's level on the guild server to reduce the number of times the guild server must ask for data from the character server.

 

seems you have two choices:

 

1. revert to a more standard relational database design where data is stored in one and only one place. IE character level is stored on the character server, and the guild server must get the info from the character server every time.

 

2. do something to guarantee the servers stay in sync, as described above by Ashaman73


Norm Barrows

Rockland Software Productions

"Building PC games since 1989"

rocklandsoftware.net

 

PLAY CAVEMAN NOW!

http://rocklandsoftware.net/beta.php

 

 


#5 hplus0603   Moderators   -  Reputation: 5706

Like
1Likes
Like

Posted 28 August 2014 - 09:26 AM

Denormalized data is very commonly a consistency problem, and a trade-off you should be careful about.

So, first: Why do you need character servers separate from guild servers? Is this actually an operational requirement, or are you just "planning ahead"? If you can put characters and guilds into the same database, then that's a lot easier. It may also be possible to connect the two servers using a single transaction monitor function, so they look like one database, and support a single commit function (this is implemented in the database using two-phase commit, as suggested above.)
For MySQL, look for example at XA transactions: dev.mysql.com/doc/refman/5.6/en/xa-restrictions.html
For DB/2 or Oracle, they have similar support (for a fee.)

Second: Why does the guild server need the character level at all? Can it fetch the character level from the source of truth (character server) when it's needed?

Third: How bad is it if the guild server is out of date? Can it be out of date for 15 minutes in the rare case that you have a crash involving the character server? If so, keep a last-modified time stamp on the character server side, and run a reconciliation process every 15 minutes that selects everything that changed since the last reconciliation, and re-applies it to the guild server (which may be a no-op, or may actually "heal" the denormalized data.)

In large systems, data denormalization versus source-of-truth is a significant design concern, and there is no "one true" solution. The right choice depends on requirements for the particular case. This is an instance of the more general problem of "cache invalidation," which famously is one of the two actually-hard problems in computer science. (The other being "naming tings.")
enum Bool { True, False, FileNotFound };

#6 iwitggwg   Members   -  Reputation: 113

Like
0Likes
Like

Posted 28 August 2014 - 07:07 PM


your problem is that you're storing the same data in two different databases on two different servers

Thanks to Norman Barrows, but maybe I didn't make my case clear. Actually, I have only one database shared by two or more different servers. The character server updates the character table in the database, while the guild server updates the guild table in the same database.


Edited by iwitggwg, 28 August 2014 - 08:16 PM.


#7 iwitggwg   Members   -  Reputation: 113

Like
0Likes
Like

Posted 28 August 2014 - 08:01 PM


So, first: Why do you need character servers separate from guild servers? Is this actually an operational requirement, or are you just "planning ahead"? 

Yes, the reason I separate character server from guild server is just "planning ahead", I'm just trying "different logic is distributed in different process/server".

 


If you can put characters and guilds into the same database, then that's a lot easier.

Actually, as I replied above to Norman Barrows, my characters and guilds are just in the same database. So how to to in a lot easier way?

 


Second: Why does the guild server need the character level at all? 

Some logic validation in the guild server will depend on guild member level.

 


Can it fetch the character level from the source of truth (character server) when it's needed?

Yes, what I want is the guild server gets character level from the character server. But I want the character level "to be informed of/broadcasted/published passively by the character server to any other server including the guild server only when the character server changed it, and the guild server stores the newest level locally as temporary data in it's RAM" instead of "to be fetched proactively by the guild server from the character server whenever the guild server needs it", yes, "passively informed of", not "proactively fetched", because I think "passively informed of" will be simpler, more graceful and more effective. Besides that, the key is, in my logic handle design, the character server must first ensures the character-level-update transaction has already been committed sucessfully before it can inform of the newes character level to the guild server (and other servers). So my problem is, if the character server commits sucessfully, but crashes unfortunately before it informs, then the guild server (and other servers) won't know the newest character level, while the character level in the database already changed to the newest. Still worse, if we then restart the character server, it will reconstruct local memory data from database, then it will hold the newest character level, while the other servers still hold the old character level. 


Edited by iwitggwg, 28 August 2014 - 08:12 PM.


#8 KulSeran   Members   -  Reputation: 2583

Like
0Likes
Like

Posted 28 August 2014 - 09:16 PM

Allow me to propose an alternate plan of action, stolen from tools I'm used to using.

 

1) Every transaction must be idempotent.  This means that you do not commit a "increase level by 1" transaction, but a "iwitggwg has transitioned from level 8 to level 9".  If you commit the same transaction multiple times, no big deal.

 

2) Every non-trivial event goes into a persistent message queue.

 

3) Messages may only be removed from the message queue if they have been processed in full.

 

Thus, if your transaction now follows:

Commit "level 8->9 for iwitggwg" -> message queue

Commit "level 9 for iwitggwg" -> db

Commit "level 8->9 for iwitggwg's guild" -> message queue

remove message from character queue

Commit "level 9 for iwitggwg's guild"

remove message from guild queue

 

Then if at any point the server crashes, the persistent message queue will not have had the message removed (point #3), thus it will re-run the relevant parts of the event. Thus the importance of point #1, as each db transaction may be attempted multiple times.


Edited by KulSeran, 28 August 2014 - 09:17 PM.


#9 wodinoneeye   Members   -  Reputation: 876

Like
0Likes
Like

Posted 28 August 2014 - 11:15 PM

Shouldnt this be something more systematic for the whole game Database architecture ??

 

DBs have transaction logging features - built in redundancy so that if there is a 'crash' the DB can build back the lost data states from mutliple sources.  Something 'heavyweight' for a MMORPG  (and definitely needed for any game that uses Players Real World Money and the things they purchase using it)

 

This is the old 'Sync' issue for world state (I remember UO long ago where frequent server crashes wiped out everything you did for the current day - bothersome when you were trying to advance your characters and the high frequency of all the effort being lost).

 

The same problem  issue would exist for many significant  transactions and coherence between multiple characters (like trading) -- of course depending on your game mechanics which are trivial vs which are significant.

 

I looked into this issue for a multi-server (AI nodes/Zone nodes/Client front-end nodes/DB nodes) simulation system, where failures had to not lose any significant interactions - it required DB for rolled out zones (simulation running IN-memory) and immediate Backup DB (data updates piped directly to a second DB machine mirror image) AND logging mechanisms for both, as well as frequent  Sync saves being done to minimize  rebuilding/resume time when there was a major failure (all nodes having spares ready to assume activity when entire server machines failed).


--------------------------------------------Ratings are Opinion, not Fact

#10 hplus0603   Moderators   -  Reputation: 5706

Like
0Likes
Like

Posted 29 August 2014 - 11:30 AM

The simplest solution is to make character leveling simply update both database tables at the same time in a single transaction. Given that you share a database on the back-end, this is simple to implement.

If you want notifications to go between systems, the typical way of doing this is to keep a work queue of some sort -- either records in a database of "work to be done" or a persistent message queue of "notifications" where processes can subscribe to "topics," or something else along those lines. The guild server would then simply subscribe to the "players that have leveled" notification, and do whatever it needs to do when it hears about it.

However, I think you're putting the cart before the horse a bit here. The likely bottleneck in your system is either the router server (for real-time data,) or the database server (for durable updates,) unless you have some really inefficient performance bugs in the other systems. Thus, I'd suggest trying to separate out the databases to separate stores, and also getting rid of the router server, instead keeping the gateway servers all up to date with what the routing should be (assuming this needs any logic at all.)

Without real load, though, it's impossible to predict what will happen if and when you actually get real users at any real level. A single physical machine or two can likely serve up to 10,000 simultaneous players (excluding physics simulation / interest management,) so splitting along some arbitrary boundary is likely to be, well, arbitrary, until you really know the shape of the workload!
enum Bool { True, False, FileNotFound };

#11 Gl2eenDl2agon   GDNet+   -  Reputation: 297

Like
0Likes
Like

Posted 27 September 2014 - 09:47 PM

One problem I see implied by this design is that the assumption is being made that CPU usage will be the bottleneck when in fact the network and the database will be the bottleneck.

Keep all of the data manipulation logic on the same server and then if you have world physics simulation you may distribute that since physics simulation is CPU expensive.

For example, the main game server could request that a physics sim server do a path check and then accept or reject the request to move from the client based upon what the physics server returns. This means that the main game server can handle other clients while the physics server(s) churn along.

 

Typical you would distribute logic that takes a long time to run so that the overhead and latency of these servers communicating across ethernet is small relative to the time to complete the action. Game logic routines that perform if this than sql update that are not good candidates. Physics simulation could benefit. Having a distributed database server on multiple physical boxes would most defiantly benefit your read access depending on the DBMS. Your game server probably won't benefit.

 

Also your gateway server will bottleneck the design even if it passes everything directly through since its ethernet interface will always be slower than the parallel processing of multiple servers.

 

In reality your WAN bandwidth and database disk access will be your most imminent bottlenecks regardless of how slow and cheap your servers are.

For example, lets say that 100 players stand on top of each other and all cast the same area-of-effect spell.

That means that there are 100 spells hitting 99 targets. Then 100x99 spell damage events. 100x99 hitpoint updates. 100x99 packets x3 (spell start, spell finish, hitpoint update) on the network  which is about 100x99x3x80bytes = 2.3 megabytes consumed in 100 milliseconds or 181 megabits per second required to not lag. Of course that assumes tiny 80 byte binary packets and you aren't sending bloated XML or ascii text which would quickly sap your bandwidth.

Now imagine if 3000 players all stood on top of each other and started spamming area-of-effect spells. (That actually happened in dark age of camelot on a regular basis tongue.png )

Even 3000 players in one location simply typing "hello" in chat could DOS your server from bandwidth starvation. It's unlikely that CPU usage would be impacted severely by the game logic involved unless you had physics bounds testing on the server side.


Edited by Gl2eenDl2agon, 27 September 2014 - 09:48 PM.


#12 hplus0603   Moderators   -  Reputation: 5706

Like
0Likes
Like

Posted 28 September 2014 - 03:28 PM

the assumption is being made that CPU usage will be the bottleneck when in fact the network and the database will be the bottleneck


That depends on the specifics of the game. There are certainly lots of game styles where CPU will be a bottleneck before a 10 Gbps network link or an occasionally-checkpointing sharded database.

Network will be a bottleneck if you have a highly sharded simulation model where lots of entities need to have simulation-tick updates from lots of remote network entities.

Database will be a bottleneck only if you design your game rules to need too much hard (persisted) consistency -- like "oh, I fired a bullet; I need to commit this fact to the database!" There are better ways of doing that (including better ways of durable consumables management, such as buffering logs.)
enum Bool { True, False, FileNotFound };




Old topic!
Guest, the last post of this topic is over 60 days old and at this point you may not reply in this topic. If you wish to continue this conversation start a new topic.



PARTNERS