Failover strategy

Networking and Multiplayer Programming

Started by mynick123123 February 05, 2010 07:06 PM

40 comments, last by phresnel 14 years, 1 month ago

100

Author

February 05, 2010 07:06 PM

Hi! I have an online multiplayer card game that involves players' credit. I want to add some sort of failover system so that if a server shuts down, I can recover the players' lost credit.

Players sit at a table with X credit, and as they play the credit goes up and down. A possible scenario: a player sits down with 100 credit and after 20 minutes has only 20 credit. Then the server shuts down. When it comes back up, I want it to credit the player's account accordingly. How would you go about it?

I could simply write every change to disk, but that would wear out the hard disk and slow down the whole operation, since writing to disk is a blocking operation (yes, I could use a non-blocking approach, but then I'd be continuing the game without the data really being saved).

Another approach I thought about is a "failover server" that acts as a "log server": when a game server fails, the log can be retrieved. That would of course make everything much more complicated, but right now it sounds best. But what happens when that "failover server" fails?

The server runs on Windows. Has anyone here done this before? Is there a way, except "blocking mode" disk writing, to achieve a 100% guarantee?

hplus0603

11,916

February 06, 2010 01:36 AM

Writing changes in credit to disk is what databases do. With SSD storage, you can probably do 10,000 transactions per second through a well-tuned SQL database. Databases also already implement replication and hot failover. So why re-invent that particular wheel?

enum Bool { True, False, FileNotFound };

mynick123123

100

Author

February 06, 2010 05:48 AM

You got a point..
But when I hear database I immediately think about string manipulation (the sql commands).
Besides, there's so much overhead in this, in terms of other features that I don't need..

I'll simply have to try it, and figure out how to set up a DB cluster, because the most I know about databases is how to create a new table :)

Many thanks!

mynick123123

100

Author

February 06, 2010 05:51 AM

By the way, another thing I thought of, and maybe someone can give me some pointers on it, is to have some sort of "memcached" server/cluster.
Maybe a custom server of this type, where each client has a "repository" that lives in memory, and when the connection fails the data is stored to disk.
Fast & furious.

Zimans

237

February 06, 2010 10:54 AM

How do you store a player's credit when you shut down the server? Are you trying to protect against your server application crashing, or against physical hardware failure? Do you really need to store every change in credit? Would writing to disk once every X rounds be sufficient?

Most MMOs save player state based on some kind of event, such as completing a quest, zoning to a different map, player disconnect/logout, etc. Do you have a similar mentality? Does your game have sessions with a beginning and an end?

--Z

Spodi

642

February 06, 2010 11:14 AM

Quote:Original post by mynick123123
But when I hear database I immediately think about string manipulation (the sql commands).

String manipulation is almost always slower than binary, but that doesn't mean the performance is bad. SQL strings aren't usually that long, and the parsing engines are far from dumb. You can also either put them in stored procedures or prepared statements, allowing the server to parse the command once completely then just execute the cached copy with very minimal overhead.
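As a rough illustration of the prepared-statement point, here is a minimal sketch using Python's built-in sqlite3 module. The `credit` table and the `apply_delta` helper are made up for the example. The SQL text stays constant and only the bound values change, so no strings are built per update; the sqlite3 module also keeps a per-connection cache of compiled statements, so the text is not re-parsed on every call.

```python
import sqlite3

# Hypothetical schema, in memory only so the sketch is self-contained.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE credit (player_id INTEGER PRIMARY KEY, chips INTEGER NOT NULL)"
)
conn.execute("INSERT INTO credit VALUES (?, ?)", (1, 100))
conn.commit()

# Constant SQL text with placeholders; values are bound, never concatenated.
UPDATE_CHIPS = "UPDATE credit SET chips = chips + ? WHERE player_id = ?"

def apply_delta(conn, player_id, delta):
    """Apply one credit change as its own transaction."""
    with conn:  # commits on success, rolls back if an exception is raised
        conn.execute(UPDATE_CHIPS, (delta, player_id))

apply_delta(conn, 1, -80)
chips = conn.execute(
    "SELECT chips FROM credit WHERE player_id = ?", (1,)
).fetchone()[0]
```

Other databases expose the same idea through their own client libraries; the API details differ, but the pattern of "parse once, bind many times" is the same.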

Also, notice how hplus said "10,000 transactions per second through a well tuned sql database". Sure, you may roll your own "lite" system that can do maybe twice that, but none of that will matter if you only need 500 transactions per second tops.

NetGore - Open source multiplayer RPG engine

Antheus

2,410

February 06, 2010 11:23 AM

Quote:Is there a way, except "blocking mode" disk writing, to achieve a 100% guarantee?

No. By definition. You need to wait until you get confirmation that the write operation has completed in its entirety.
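To make "wait for confirmation" concrete, here is a minimal sketch of a blocking append in Python. The `durable_append` helper and the record format are invented for the example. The call does not return until the operating system reports that the data has been flushed; whether the drive's own write cache honors that is a hardware and configuration question this sketch does not address.

```python
import os
import tempfile

def durable_append(path, record: bytes):
    """Append a record and block until the OS confirms it was flushed."""
    with open(path, "ab") as f:
        f.write(record)
        f.flush()              # push Python's userspace buffer to the OS
        os.fsync(f.fileno())   # block until the OS has written it to the device

# Hypothetical log file and record format, for illustration only.
log_path = os.path.join(tempfile.mkdtemp(), "credit.log")
durable_append(log_path, b"player=1 chips=100\n")
durable_append(log_path, b"player=1 chips=20\n")

with open(log_path, "rb") as f:
    last_record = f.read().splitlines()[-1]
```

A database commit does essentially this on your behalf, which is why the commit call is the point where the game has to wait.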

The simplest and scalable solution in this case is for each game to be standalone, using something like SQLite. When a game session completes, read the final standings, and update the master database.

This approach avoids scalability problems, since you can spawn an arbitrary number of game session nodes (one per 1000 sessions perhaps), and it uses a proven database for reliable storage.

The only gotcha to consider is the master database becoming a bottleneck, which may mean that a player cannot start a new game until their previous state has been processed and included.

Still, even using MySQL for master, updating thousands of game results per second should not be a problem (it's just an int with new credit count), especially if individual instances are split horizontally.
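A minimal sketch of that layout, assuming Python's sqlite3 for both sides just to keep it runnable (the master would be a separate database server in practice). The `account` and `stack` tables and the helper names are hypothetical, and the sketch assumes the buy-in was already deducted from the master account when the player sat down.

```python
import sqlite3

# Master database: long-term account balances (buy-ins already deducted).
master = sqlite3.connect(":memory:")
master.execute(
    "CREATE TABLE account (player_id INTEGER PRIMARY KEY, chips INTEGER NOT NULL)"
)
master.execute("INSERT INTO account VALUES (1, 400), (2, 250)")
master.commit()

# Per-session database: chips currently on the table.
table_db = sqlite3.connect(":memory:")
table_db.execute(
    "CREATE TABLE stack (player_id INTEGER PRIMARY KEY, chips INTEGER NOT NULL)"
)

def sit_down(player_id, buy_in):
    with table_db:
        table_db.execute("INSERT INTO stack VALUES (?, ?)", (player_id, buy_in))

def record_change(player_id, delta):
    # Every credit change during play is committed locally only.
    with table_db:
        table_db.execute(
            "UPDATE stack SET chips = chips + ? WHERE player_id = ?",
            (delta, player_id),
        )

def settle():
    """At session end, read final standings and update the master once."""
    standings = table_db.execute("SELECT player_id, chips FROM stack").fetchall()
    with master:  # one master transaction for the whole table
        master.executemany(
            "UPDATE account SET chips = chips + ? WHERE player_id = ?",
            [(chips, player_id) for player_id, chips in standings],
        )
    with table_db:
        table_db.execute("DELETE FROM stack")

sit_down(1, 100)
sit_down(2, 100)
record_change(1, -80)
record_change(2, 80)
settle()
balances = dict(master.execute("SELECT player_id, chips FROM account"))
```

Note that the two steps in `settle` are not atomic across the two databases; a real system would need settlement to be safely repeatable after a crash between them.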

mynick123123

100

Author

February 06, 2010 08:47 PM

Yes, I need to store every credit change, although you could say it's a session-based game.
See, it's a Texas hold'em poker game.
I could save a "state of credit" at the start of every match (when dealing hands to players) and whenever a player leaves a match (folds).

Quote:
The simplest and scalable solution in this case is for each game to be standalone, using something like SQLite. When a game session completes, read the final standings, and update the master database.

.. Sounds good, maybe I'll combine it with the above approach... Still wondering.

mynick123123

100

Author

February 17, 2010 01:34 PM

Hello all, again! :)

I have to admit that this failover task is the hardest I've had so far. Believe me, threading, networking, and memory management are a piece of cake compared to this.

I'm still trying to figure out the best approach to do that.
I'll keep writing here more details, maybe some "golden-brain" might come up with a bright idea.

The issue is actually a "family" of issues;
Imagine a terminal server connected to a game server (has the game state).

1. Player has 500 chips.
2. Player asks to sit at a table with 100 chips.
3. Terminal sends a command to the game server, containing the player's account number. It'll subtract the 100 chips upon confirmation from the game server.
4. The game server approves it, and sends a confirmation. The terminal server has just crashed, before getting the confirmation.

How would you handle such a situation?
I think I need some sort of transaction system.
It must be FAST, otherwise the overall operation will be slow, even when everything runs normally with no crashes.

Any ideas folks?

Antheus

2,410

February 17, 2010 01:52 PM

Quote:4. The game server approves it, and sends a confirmation. The terminal server has just crashed, before getting the confirmation.

Terminal server is dumb, it doesn't matter if it crashed.
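One way to read this: if the authoritative balance and the buy-in live on the server side, the terminal only relays requests and can simply resend after a restart. A minimal sketch of that idea follows, assuming a request ID chosen by the client and a SQLite-style transactional store. The table names, `buy_in` function, and `req-42` ID are all invented for illustration and are not from the thread.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
    CREATE TABLE account (player_id INTEGER PRIMARY KEY, chips INTEGER NOT NULL);
    CREATE TABLE buy_in (request_id TEXT PRIMARY KEY, player_id INTEGER, amount INTEGER);
    INSERT INTO account VALUES (1, 500);
""")

def buy_in(request_id, player_id, amount):
    """Debit the account at most once per request_id; safe to retry."""
    with db:  # the debit and the request record commit together or not at all
        seen = db.execute(
            "SELECT 1 FROM buy_in WHERE request_id = ?", (request_id,)
        ).fetchone()
        if seen:
            return True  # duplicate of a request that was already applied
        cur = db.execute(
            "UPDATE account SET chips = chips - ? WHERE player_id = ? AND chips >= ?",
            (amount, player_id, amount),
        )
        if cur.rowcount != 1:
            return False  # unknown player or not enough chips
        db.execute(
            "INSERT INTO buy_in VALUES (?, ?, ?)", (request_id, player_id, amount)
        )
        return True

first = buy_in("req-42", 1, 100)
retry = buy_in("req-42", 1, 100)  # terminal crashed, restarted, and resent
remaining = db.execute(
    "SELECT chips FROM account WHERE player_id = 1"
).fetchone()[0]
```

With this shape, a lost confirmation is harmless: the resend gets the same answer and the chips are deducted only once.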

Quote:It must be FAST, otherwise the overall operation will be slow, even when everything runs normally with no crashes.

If it needs to be fast, then a server that is capable of handling 7 transactions per hour will be enough, right?

Or how do you define fast?

Is 10,000 updates per second fast enough? Why not 20,000?

How about getting a paper napkin out and doing some math.

A 100Mbit connection can handle ~20,000 packets per second. The response is n packets, since each action taken by a player is seen by the others. Let there be 5 players per table. This means each action taken by a player generates 1+5 packets. Across all tables, we get 20,000/6 = ~3.3k actions per second.

What this says is the following: regardless of how many tables run on a server, bandwidth limits us to 3.3k actions per second in total.

How often does a person take an action? Once every 5 seconds? Every 10? If every 5 seconds, then a single server can handle roughly 15,000 users (3.3k actions per second x 5 seconds between actions).

This number is surprisingly close to actual numbers.
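The napkin math above, written out so the inputs are easy to change. The constants are the rough figures from this post, not measurements.

```python
PACKETS_PER_SECOND = 20_000  # rough capacity of a 100Mbit link, per the post
PLAYERS_PER_TABLE = 5
SECONDS_PER_ACTION = 5       # each player acts about once every 5 seconds

# One packet in from the acting player, one out to each player at the table.
packets_per_action = 1 + PLAYERS_PER_TABLE

actions_per_second = PACKETS_PER_SECOND // packets_per_action
concurrent_players = actions_per_second * SECONDS_PER_ACTION

print(actions_per_second, concurrent_players)
```

This gives about 3,333 actions per second and about 16,665 players, which the estimate above rounds down to 15,000. At one action every 10 seconds the player count doubles.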

Now it's up to you to take a selection of databases, perhaps MySQL, redis, MongoDB, .... and see if they can sustain 3.3k updates per second.

And that is without any special hardware or any special considerations, and 15k active, concurrent, sustained users is more than most sites have. It is also possible to buy some pretty heavy hardware and just scale vertically to a second 100Mbit connection if needed.

Next - get this service running and see what happens in reality, where the problems are, etc, and work from there. It might be that scalability issues will come from somewhere else completely.

This topic is closed to new replies.