Distributed server design and MySQL

Started by riffle
9 comments, last by hplus0603 16 years, 9 months ago
Hello developers ;) First of all, I want to apologize for my poor English.

I am working on a network engine for a game and found that my concept might not work well. The world of the game is divided into smaller areas, but the servers are not dedicated to an area. This is because I want to have several servers updating one area when there is more action there.

The problem is that the world is accessed through a MySQL database. To prevent wrong calculations I have to lock the table while a server updates the world (from the initial SELECT until the INSERT query). As far as I can imagine, this means that I can't really make use of all those servers, as they will have to wait for each other.

Another approach would be to have several databases (one per area, for instance), but this won't fix the problem when all players are in one area, and it additionally adds the problem of an NPC or a mob changing areas. So I really can't find a solution for this. Any help and hints will be greatly appreciated!

~ Ivaylo
First, storing any kind of real-time updated data in an on-disk database won't work, period. That's why Sun, for example, recommends that you only run "persistent" updates through their Sun Game Server and use a loose consistency model for anything real-time.

Second, if more than one server can update the same area, then why divide the world? Just make all the servers serve all of the world. There is, however, a general problem with letting more than one server serve the same area: how do you communicate the state of objects between the servers? (I.e., I'm on server A, hitting a monster on server B; how does server A know where the monster is, how many hit points it has, etc.?) There is still active research in this area, but there is no panacea (at least not yet :-)

Third, the player density problem is the toughest problem in MMO games, and is not yet solved. That's why, in Asheron's Call, you have Portal Storms; in WoW you have instanced dungeons; in Planetside you have Pop Locked continents/planets; etc. You have to design for the most players you can cram into a single machine, and then design the game mechanics so that you don't go over that capacity limit.

enum Bool { True, False, FileNotFound };
Thank you very much for your reply!

Quote:Original post by hplus0603
First, storing any kind of real-time updated data in an on-disk database won't work, period. That's why Sun, for example, recommends that you only run "persistent" updates through their Sun Game Server and use a loose consistency model for anything real-time.


Well, I was planning to run the database on a RAM disk, but as you say, it still won't run well in real time.

Quote:Original post by hplus0603
Second, if more than one server can update the same area, then why divide the world? Just make all the servers serve all of the world.

The world is divided into areas only because of level restrictions and terrain design; it has nothing to do with server distribution. All servers will update the whole world. Sorry if I didn't make that clear :)

Quote:Original post by hplus0603
There is, however, a general problem with letting more than one server serve the same area: how do you communicate the state of objects between the servers? (I.e., I'm on server A, hitting a monster on server B; how does server A know where the monster is, how many hit points it has, etc.?) There is still active research in this area, but there is no panacea (at least not yet :-)

Well, that's my problem, and I was thinking of database-based communication, which seems not to be a good idea. Can you give me a hint on where I can get more information on this problem and possible solutions? I am starting to think of RPCs for when a server needs information from another server.

Quote:Original post by hplus0603
Third, the player density problem is the toughest problem in MMO games, and is not yet solved. That's why, in Asheron's Call, you have Portal Storms; in WoW you have instanced dungeons; in Planetside you have Pop Locked continents/planets; etc. You have to design for the most players you can cram into a single machine, and then design the game mechanics so that you don't go over that capacity limit.

Are you talking about a realm server, where all player data is stored, plus several world servers, where the world is updated? If I understood you correctly, that is the way I was planning to design it.

Thanks again, Ivaylo
All currently shipping solutions I know of have only a single server for a given physical part of the world. They may have more than one machine serving a single realm/instance, coupled together in a cluster, but for any given point in the world, only a single server has the authority. Typically, objects are ghosted across the borders between the server processes that make up the same instance, wherever an object comes close to another server's territory. This means that an object will move between servers as it moves through the world. Note that ghosts add load on two servers (the authority and the ghost), so the border area needs to be "small" compared to the served area for this to scale.
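
To make that concrete, here is a minimal sketch of a per-tick border check, assuming a simple one-dimensional border and hypothetical send_ghost_update/transfer_authority calls standing in for the real cluster layer; actual engines track rectangular or irregular regions, but the shape of the logic is the same.

#include <cstdint>

// Hypothetical entity record: owned (authoritative) by exactly one server,
// ghosted read-only on neighbors it comes close to.
struct Entity {
    uint64_t id;
    float    x;          // position along the axis that crosses the border
    int      hitpoints;
};

// Assumed network calls; stand-ins for whatever the cluster layer provides.
void send_ghost_update(int neighbor_server, const Entity& e) { /* ... */ }
void transfer_authority(int neighbor_server, Entity& e)      { /* ... */ }

const float GHOST_MARGIN = 50.0f;   // "small" compared to the served area

// Run each tick on the authoritative server, for each owned entity.
void update_border(Entity& e, float border_x, int neighbor_server) {
    if (e.x >= border_x) {
        // Crossed the border: hand over authority entirely.
        transfer_authority(neighbor_server, e);
    } else if (border_x - e.x < GHOST_MARGIN) {
        // Near the border: push a read-only ghost so the neighbor's players
        // can see and target this entity. Note the double load: both
        // servers now spend cycles on it.
        send_ghost_update(neighbor_server, e);
    }
}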

If what you want to do is make more than one server be able to be authoritative for any given location, then nobody has shipped a working game that I know of that scales that way -- that's pretty much the Holy Grail, and nobody knows how to make that work with "infinite" scalability within a single locale while retaining real-time interactive response. So, no, I can't really give you any pointers on how to make that happen :-)

However, as RPC and MPI style APIs become more efficient and round-trip latencies go down, there may eventually be an architecture that allows multiple servers for the same area and lets objects interact. There are some nasty n-squared problems involved in that, though, especially around inter-entity collisions, so don't hold your breath.
enum Bool { True, False, FileNotFound };
Quote:Original post by hplus0603
All currently shipping solutions I know of have only a single server for a given physical part of the world.
...
If what you want to do is make more than one server be able to be authoritative for any given location, then nobody has shipped a working game that I know of that scales that way -- that's pretty much the Holy Grail, and nobody knows how to make that work with "infinite" scalability within a single locale while retaining real-time interactive response. So, no, I can't really give you any pointers on how to make that happen :-)


There are well-known ways of making it work well up to a large finite limit (more than most current games would need), but the issue right now seems to be more one of "why bother?" - the game designs don't currently require it, so it's a waste of time and money.
Quote:Original post by riffle
I am working on a network engine for a game and found that my concept might not work well.

The world of the game is divided into smaller areas, but the servers are not dedicated to an area. This is because I want to have several servers updating one area when there is more action there.

The problem is that the world is accessed through a MySQL database. To prevent wrong calculations I have to lock the table while a server updates the world (from the initial SELECT until the INSERT query). As far as I can imagine, this means that I can't really make use of all those servers, as they will have to wait for each other.


Standard approaches for this:

1. Ask yourself why you are writing to a database at all. The answer should be something like "because I want to persist data even when the machine has to be shut down, and because I want a rich query language to interrogate and update the data". Neither of those demands *literal* real-time updates: you can delay by seconds (some real-time games delay by minutes, and some even go tens of minutes between writes to the DB). That should solve most of your problem; just buffer up a load of data to write. A lot of your data will get overwritten every few minutes anyway, so you won't need to save as much per unit time if you save less often (see the sketch after this list).

2. Use an in-memory cache. Most games either write a custom cache for data that needs to go to the DB, or use an in-memory SQL DB that then writes back to the "real" SQL DB. The advantage of the latter is that you still get full access to SQL whilst having real-time data. If your game gets big enough, you'll probably need to do both: an in-memory DB in front of your main DB, and a DB cache in front of that.
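
As a minimal sketch of the buffering idea from point 1: persist_row below is a made-up stand-in for the real MySQL UPDATE, and the flush interval and map layout are illustrative choices, not a prescribed design.

#include <chrono>
#include <cstdint>
#include <string>
#include <unordered_map>

// Hypothetical stand-in for the actual MySQL write.
void persist_row(uint64_t entity_id, const std::string& state) { /* ... */ }

// Write-behind buffer: game logic marks entities dirty at full speed; only
// the latest state per entity reaches the database, once per interval.
class WriteBehindBuffer {
public:
    explicit WriteBehindBuffer(std::chrono::seconds interval)
        : interval_(interval), last_flush_(std::chrono::steady_clock::now()) {}

    // Overwrite, never append: repeated updates to one entity collapse
    // into a single row write.
    void mark_dirty(uint64_t entity_id, std::string state) {
        dirty_[entity_id] = std::move(state);
    }

    // Call once per tick; flushes only when the interval has elapsed.
    void maybe_flush() {
        auto now = std::chrono::steady_clock::now();
        if (now - last_flush_ < interval_) return;
        for (auto& [id, state] : dirty_) persist_row(id, state);
        dirty_.clear();
        last_flush_ = now;
    }

private:
    std::chrono::seconds interval_;
    std::chrono::steady_clock::time_point last_flush_;
    std::unordered_map<uint64_t, std::string> dirty_;
};

An entity that moves sixty times between flushes still costs one row write, which is exactly the "overwritten every few minutes" effect described in point 1.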
Many thanks for your tips! Really appreciated!

First of all, the problem is that the game we are making is based on massive PvP events in a single area (like the guild castle). So we expect 100-200 players plus many NPCs acting in a single small area. As a WoW player, I know that the WoW servers can't handle such an event. But from what I learned from your replies, it seems that it's not really possible anyway.

So my concept is now as follows:

There is a server that handles the login data, so every client will connect to this server at the beginning. This server keeps a permanent TCP connection with the client to send important messages (more on that later).

After a successful logon, the client will be redirected to a lobby server, where all the information regarding the characters is held. After the player chooses a character to play, he will receive two IP addresses (through the TCP connection from the login server): one for the server he must send to, and one for the server he will receive from.

The decision is made by a load balancer, which watches the network load of those servers.

The receive servers just forward the data they get to the world-update server responsible for the area where the player is, and the send servers take this information and send it to the clients that are attached to them.

The client stores three IP addresses (those of the lobby, send, and receive servers), and these can change (in case a server goes down or something like that). So if the receive server is down, the login server will send a message to the client (through the TCP channel) telling it the new IP address to send to.
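
To illustrate, the failover notice could be as small as the following; the layout and names are invented for this sketch, not an existing protocol.

#include <array>
#include <cstdint>

// Hypothetical control message pushed over the permanent TCP connection
// when the client must switch to a new send or receive server.
enum class ControlMsg : uint8_t {
    ReassignSendServer    = 1,   // new address to send input to
    ReassignReceiveServer = 2,   // new address to read world updates from
};

#pragma pack(push, 1)
struct ServerReassignment {
    ControlMsg            type;
    std::array<uint8_t,4> ipv4;  // new server address
    uint16_t              port;  // network byte order
};
#pragma pack(pop)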

The main idea behind this structure is to manage the network load of the servers (which will only work if the update servers manage to handle the input data).

So could such a concept work, or is it totally screwed?

Thanks again for your help :)
Quote:First of all, the problem is that the game we are making is based on massive PvP events in a single area (like the guild castle). So we expect 100-200 players plus many NPCs acting in a single small area. As a WoW player, I know that the WoW servers can't handle such an event. But from what I learned from your replies, it seems that it's not really possible anyway.


The number of players is irrelevant.

Here is the breakdown of what the limits are:
- Number of messages.
How many does each player generate per second, and what is the average packet size? These translate directly into CPU/logic load. With 200 players at 3 messages/sec each (1 movement, 1 action, 1 other), 600 messages/second should be manageable by a single node.

- Feedback.
Each action that happens in the world causes a side effect, and each of these needs to be propagated to users. If you use proxy servers, each such message needs to be sent to each proxy server (or one broadcast call). This results in a 1:n mapping; for example, a player using an AoE will hit 5 others. Then again, combat results will likely be cached, with deltas accumulated over time, resulting in much less than a 1:5 message explosion.

- Network bandwidth: X / (n_messages * avg_message_size), where X is the available bandwidth.
This puts a hard cap on your server. During testing, I settled on 20,000 messages per second per 100 Mbit interface. On that basis, I also decided to limit each client to 20k and allow 500 clients per proxy server (no oversubscribing).
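
As a sanity check, here is the arithmetic implied by those caps, using only the figures given above:

#include <cstdio>

int main() {
    // Figures from the list: a 100 Mbit interface, capped at 20,000
    // messages/second and 500 clients per proxy (no oversubscribing).
    const double link_bytes_per_sec = 100e6 / 8.0;  // 12.5 MB/s
    const double msgs_per_sec       = 20000.0;
    const double clients            = 500.0;

    // Implied budget per message, including protocol overhead: ~625 bytes.
    std::printf("max avg message size: %.0f bytes\n",
                link_bytes_per_sec / msgs_per_sec);

    // Implied per-client rate at full load: 40 messages/second.
    std::printf("messages per client per second: %.0f\n",
                msgs_per_sec / clients);
}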

If you connect clients directly to the zone server, your numbers change, since you need to add the extra traffic from the cluster.

The overall impact on CPU from message distribution is rather small if you serialize the data smartly.
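
One reading of "serializing smartly" is giving hot messages a fixed binary layout that needs no parsing; the field choices below are assumptions for illustration only.

#include <cstddef>
#include <cstdint>
#include <cstring>

// Illustrative movement update with a compact, fixed layout.
struct MoveMsg {
    uint8_t  opcode;
    uint32_t entity_id;
    float    x, y;      // quantizing to int16 would halve the position cost
};

// Write the fields back to back: 13 bytes, no padding, no text parsing.
size_t write_move(uint8_t* out, const MoveMsg& m) {
    size_t n = 0;
    std::memcpy(out + n, &m.opcode,    sizeof m.opcode);    n += sizeof m.opcode;
    std::memcpy(out + n, &m.entity_id, sizeof m.entity_id); n += sizeof m.entity_id;
    std::memcpy(out + n, &m.x,         sizeof m.x);         n += sizeof m.x;
    std::memcpy(out + n, &m.y,         sizeof m.y);         n += sizeof m.y;
    return n;
}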

What you now need to determine is what your CPU quota for a given zone will be. With 3 messages/sec from each of 200 users, that gives you at most 1000 ms / 600 messages ≈ 1.67 ms per message. Some of these messages are trivial, others complex. Ideally, you might want to keep this value down to 1 ms, leaving you with 40% idle time for everything else.

Realistically, some messages will be processed much faster than that; many can be handled in well under a microsecond. For actions that you know will take a very long time, you might want to handle them fully asynchronously (if undemanding) or on a separate machine entirely.

In combat, the only real-time part involves movement and skill use. Both of these can be optimized, and to some extent pre-calculated.


Referring to existing MMOs in blanket statements about what they can handle isn't useful. WoW doesn't need 200+ players in a zone; the largest groups are 40-80 or thereabouts, and the game is designed around 100 players max.

The rest of the impact of such large events comes simply from message propagation and distribution. Separate client proxy servers can take a lot of that load, since they take care of the tedious task of multicasting the messages, while the core server runs in a single process (or even a single thread), doing nothing but logic.
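
A bare-bones sketch of that split, with Client and send_bytes as placeholders for the real networking layer: the core server calls broadcast once, and the proxy absorbs the per-client fan-out.

#include <cstddef>
#include <cstdint>
#include <vector>

struct Client { int socket_fd; };
void send_bytes(const Client& c, const uint8_t* data, size_t len) { /* ... */ }

// The proxy's whole job: take one state update from the core server and
// repeat it to every subscribed client, so the core never touches
// per-client sockets.
class ZoneProxy {
public:
    void subscribe(Client c) { clients_.push_back(c); }

    void broadcast(const uint8_t* data, size_t len) {
        for (const Client& c : clients_)
            send_bytes(c, data, len);
    }

private:
    std::vector<Client> clients_;
};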

But having multiple servers handle the same area isn't an instant benefit. If anything, you'll need to slow gameplay down enough to allow for inter-server synchronization, possibly even for roll-backs.

500 or even 1000+ players per zone isn't impossible, far from it. But it's almost never needed, or even desired: clients have trouble rendering 1000 players, all customized with different models and textures, and they hit the rendering cap far before the network limit.


NPCs are a different story. The logic here depends on how and when they are updated, how much work they need to do, and so on. Once again, there are no hard numbers.

And then there are the usual problems: path-finding, collision detection, administrative logging, all of which quickly add up to consume the resources.

Using multiple servers is the same as multi-threading, and the same issues arise, with the exception that synchronization becomes an insanely costly operation: from nanoseconds per lock to over 1 millisecond, roughly a million-fold increase. So you really don't want to distribute dependent systems with plenty of shared resources.
Thank you Antheus!

Following your calculations, I managed to get a better picture of how it all looks internally :) I will inform all of you of our status as soon as we have some progress. Other opinions are still welcome and will be appreciated! ;)

And some feedback on the server architecture would be great too, if I am not asking too much :)

~ Ivaylo
A networked service looks something like this:
#include <condition_variable>
#include <deque>
#include <mutex>

struct Message { /* opcode, payload, sender, ... */ };

class Service {
public:
    // Called from the network thread whenever a packet arrives.
    void on_receive(Message m) {
        std::lock_guard<std::mutex> lock(mutex_);
        queue_.push_back(std::move(m));
        ready_.notify_one();
    }

    // The service's own thread: drain the queue and execute each message.
    void run() {
        while (running_) {
            std::unique_lock<std::mutex> lock(mutex_);
            ready_.wait(lock, [this] { return !queue_.empty(); });
            Message m = std::move(queue_.front());
            queue_.pop_front();
            lock.unlock();
            // parse, execute
        }
    }

private:
    bool running_ = true;
    std::deque<Message> queue_;
    std::mutex mutex_;
    std::condition_variable ready_;
};


And that's it.

Then you simply instantiate these objects (locally or remotely) and implement the execution of messages on the server (the logic). Each service is also connected to a socket that reads from the network and passes the received packets to the on_receive method.
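
For example, building on the Service class above (ZoneService and the hand-fed message are illustrative):

#include <thread>

// A concrete service; a fuller version would dispatch on message opcodes
// inside run()'s "parse, execute" step.
class ZoneService : public Service {};

int main() {
    ZoneService zone;
    std::thread logic([&] { zone.run(); });  // the service's own thread
    zone.on_receive(Message{});              // as if the socket just delivered a packet
    logic.join();                            // in practice: runs until shutdown
}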

You then have various services which perform tasks (login, connection, zone, combat, AI, ...).

The client works in the same way, except that it receives responses from the server.

How you send the messages is merely a question of syntax. And for a cluster, you'll likely need a central directory that keeps track of the services currently running.

And, the way CORBA, RMI, and Ice do it, you can run these services locally or remotely, allowing for flexible distribution of objects across an arbitrary number of nodes.

And at some point you can probably even implement dynamic service instantiation, where an overloaded service may spawn itself onto several nodes. But that's an advanced topic that can't practically be summed up in a few lines.

Just keep in mind: with half a dozen such services, you can scale your server to thousands of users on only a handful of machines.
