C++ Linux load balancing application examples

I'm a developer of an MMO game, and at my company we're currently facing some scalability issues which, I think, can be resolved with proper clustering of the game world. The main idea is to have a number of zones ("locations" in terms of my game) running on dedicated servers. When a player decides to go to some specific location, the load balancer decides which zone server will actually serve the player (that's why I need a Level 7 load balancer). The world in the game is not seamless, so I don't have to worry about complex visibility situations on the edge of zones. Ideally I'd like to make zone switching completely transparent to the client.

In case one of the zone servers fails, I'd like the balancer to redirect users to one of the least loaded zone servers, which will play the role of the missing zone server for some time. As for the load balancer itself being a possible point of failure, I think that can be handled using heartbeat paired with a second load balancer.

I was thinking about using Linux Virtual Server (http://www.linuxvirtualserver.org) with ktcpvs, but folks told me ktcpvs is pretty outdated. What I really like about Linux Virtual Server is its ability to set up the LVS/DR scheme (http://www.linuxvirtualserver.org/VS-DRouting.html). I wonder if that's possible without actually using LVS...

Could you guys please recommend any examples/links of load balancers? I'm currently looking at HAProxy and Perdition as interesting examples of Level 7 load balancers, but maybe someone has more MMO-oriented examples? Thanks!
The problem is that you need your load balancer to be "smarter" about the application than a generic system will allow for.

We've ended up writing our own load balancing system on top of Perl and SSH, and it works just fine. It wasn't that complex, once we understood what we needed to do, either. There are just a few pieces:

1) Something which knows what hardware resources are available. We use a custom Linux image which registers itself with this component when it boots. This image also contains our server software.

2) Something which knows the load on each piece of hardware. I think we used Nagios here, but we may have switched to something else.

3) Something which maps "I want to go to location X" to "you need to connect to IP Y port Z" -- this is totally application specific (see the sketch after this list).

4) Reporting of which application-level areas have high vs low occupancy. This is application specific.

5) Deciding when to add another instance of a particular zone. This is mostly application specific, and it has to interact with option 3) in some way.
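
For item 3), a minimal C++ sketch of what that mapping can look like. The structure is the point here; in practice the zone names, IPs and ports would come from the data gathered in items 1), 2) and 4), not be hard-coded:

#include <optional>
#include <string>
#include <unordered_map>

// Where one zone instance can be reached.
struct Endpoint {
    std::string ip;
    unsigned short port;
};

// "I want to go to location X" -> "you need to connect to IP Y port Z".
// What goes into the table is entirely application specific; the balancer
// fills it from the hardware/occupancy data described above.
class ZoneDirectory {
public:
    void set(const std::string& zone, const Endpoint& ep) { table_[zone] = ep; }

    std::optional<Endpoint> lookup(const std::string& zone) const {
        auto it = table_.find(zone);
        if (it == table_.end()) return std::nullopt;
        return it->second;
    }

private:
    std::unordered_map<std::string, Endpoint> table_;
};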

Note that it's hard to un-instantiate a zone once it's instantiated, because it takes a long time for the last user to log out of the zone, even if you stop adding new users to the zone, and you generally don't want to force users to re-connect to another zone instance.

In general, the most robust solution is the one where running services "check in" with the manager every once in a while (say, once a minute), and report load and the fact that they're alive. When a server doesn't check in for two minutes, it's assumed dead. Also, if a server refuses a request to instantiate a player when the player has gotten that server as his target, the server is assumed dead.
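
As a rough sketch, that check-in bookkeeping can be as small as this; the per-service id/load fields and the two-minute cutoff follow the description above, everything else is assumed:

#include <chrono>
#include <string>
#include <unordered_map>
#include <vector>

using Clock = std::chrono::steady_clock;

struct ServiceStatus {
    double load = 0.0;            // whatever load metric the service reports
    Clock::time_point last_seen;  // when the last check-in arrived
};

class HeartbeatRegistry {
public:
    // Called whenever a service checks in (say, once a minute).
    void check_in(const std::string& id, double load) {
        services_[id] = {load, Clock::now()};
    }

    // Services that haven't checked in for two minutes are assumed dead.
    std::vector<std::string> reap_dead() {
        std::vector<std::string> dead;
        const auto cutoff = Clock::now() - std::chrono::minutes(2);
        for (auto it = services_.begin(); it != services_.end();) {
            if (it->second.last_seen < cutoff) {
                dead.push_back(it->first);
                it = services_.erase(it);
            } else {
                ++it;
            }
        }
        return dead;
    }

private:
    std::unordered_map<std::string, ServiceStatus> services_;
};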

For dead processes, we make sure that the PID and any temp files go away, and the machine capacity then goes back into the pool of available resources. If a machine cycles a lot like this, it's marked as suspect (there's always some bad hardware somewhere causing trouble in a larger cluster of machines).

When a zone server goes dead, the load balancer will generally see that there's too little capacity for that area (including 0 capacity if that was the only server!) and instantiate more resources as necessary -- no particular special-casing is needed for that.
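
A rough sketch of that reaction, assuming the balancer keeps a capacity target per area and a pool of idle machines, and starts new processes over SSH; the ssh command line and paths are made up:

#include <cstdlib>
#include <map>
#include <string>
#include <vector>

struct Area {
    int live_instances = 0;    // healthy instances currently serving this area
    int wanted_instances = 1;  // capacity target derived from occupancy reports
};

// Start one more instance of `area` on an idle machine, e.g. over SSH.
// The command is purely illustrative.
void spawn_instance(const std::string& machine, const std::string& area) {
    std::string cmd = "ssh " + machine + " /opt/game/zoned --area=" + area + " &";
    std::system(cmd.c_str());
}

void rebalance(std::map<std::string, Area>& areas,
               std::vector<std::string>& idle_machines) {
    for (auto& [name, area] : areas) {
        while (area.live_instances < area.wanted_instances &&
               !idle_machines.empty()) {
            spawn_instance(idle_machines.back(), name);
            idle_machines.pop_back();
            ++area.live_instances;  // optimistic; confirmed by its first check-in
        }
    }
}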

Finally -- the main point of failure is the determination of "player wants to go to area X, what process should he/she connect to?" We actually solve that by broadcasting the decision that the load balancer would make, when that decision changes (it never changes faster than once every few minutes in our system). For example, if zones A and C both connect to zone B, they generally know which process handles new-player requests for zone B, so when the player decides to enter zone B, the servers know what to tell the player.
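
The "broadcast only when the decision changes" part might look something like this; push_to_zone_servers is just a stub standing in for whatever messaging already exists between balancer and zone servers:

#include <iostream>
#include <string>
#include <unordered_map>

struct Endpoint {
    std::string ip;
    unsigned short port;
};

// Stub: replace with the real balancer-to-zone-server messaging.
void push_to_zone_servers(const std::string& zone, const Endpoint& preferred) {
    std::cout << "zone " << zone << " -> " << preferred.ip << ":" << preferred.port << "\n";
}

// Announce the preferred instance for a zone, but only when it changes.
class DecisionBroadcaster {
public:
    void announce(const std::string& zone, const Endpoint& preferred) {
        auto it = last_.find(zone);
        if (it != last_.end() &&
            it->second.ip == preferred.ip && it->second.port == preferred.port)
            return;  // unchanged -> stay quiet
        last_[zone] = preferred;
        push_to_zone_servers(zone, preferred);
    }

private:
    std::unordered_map<std::string, Endpoint> last_;
};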

This way, if the load balancer dies, the system will actually keep running just fine, until such time that some zone goes into overload, at which point only that zone will be hurt. This makes the system resilient. There are good Linux scripts to re-start a process if it dies; we use those for the load balancer itself.

Short of catastrophic failure (say, the internal network gets physically partitioned because someone trips on the power cord for a switch), this system can handle load balancing for such an application very well, and it is very robust. In particular, the part where the load balancer tells the other servers about decisions ahead of time, rather than being involved in each query, and the part where each service is responsible for registering/updating itself with the load balancer/monitor, make the system self-healing.
enum Bool { True, False, FileNotFound };
First of all, thank you _very_ much for the detailed answer!

Quote:Original post by hplus0603
The problem is that you need your load balancer to be "smarter" about the application than a generic system will allow for.


Exactly, that's why I mentioned a Level 7 balancer (i.e. application-aware).

Quote:
We've ended up writing our own load balancing system on top of Perl and SSH, and it works just fine. It wasn't that complex, once we understood what we needed to do, either.


This sounds interesting. I'm actually planning to implement the balancer as a simple C++ application built on top of boost::asio, but of course, scripting it with Perl or Python may be a better (and simpler) idea.
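
For what it's worth, here is a bare-bones synchronous boost::asio sketch of such a lookup service: a client (or zone server) sends "WHERE <zone>\n" and gets back "CONNECT <ip> <port>\n". The wire format, port number and addresses are invented, there is no error handling, and a production version would be asynchronous and fill the table from monitoring data:

#include <boost/asio.hpp>
#include <iostream>
#include <string>
#include <unordered_map>

int main() {
    using boost::asio::ip::tcp;

    // Hard-coded for illustration; a real balancer fills this from monitoring data.
    std::unordered_map<std::string, std::string> zone_table = {
        {"forest", "10.0.0.11 5001"},
        {"castle", "10.0.0.12 5002"},
    };

    boost::asio::io_context io;
    tcp::acceptor acceptor(io, tcp::endpoint(tcp::v4(), 7000));

    for (;;) {
        // Blocking, one request per connection -- purely a protocol-shape sketch.
        tcp::socket socket(io);
        acceptor.accept(socket);

        boost::asio::streambuf buf;
        boost::asio::read_until(socket, buf, '\n');
        std::istream in(&buf);
        std::string cmd, zone;
        in >> cmd >> zone;

        std::string reply = "ERROR unknown zone\n";
        if (cmd == "WHERE") {
            auto it = zone_table.find(zone);
            if (it != zone_table.end())
                reply = "CONNECT " + it->second + "\n";
        }
        boost::asio::write(socket, boost::asio::buffer(reply));
    }
}

Since a location change is rare compared to gameplay traffic, the blocking loop is mostly about showing the protocol shape, not about performance.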

Quote:
There are just a few pieces:

1) Something which knows what hardware resources are available. We use a custom Linux image which registers itself with this component when it boots. This image also contains our server software.


I'm planning to implement this as well. BTW, do you use something like SystemImager (http://wiki.systemimager.org/index.php/Main_Page) for imaging?

Quote:
2) Something which knows the load on each piece of hardware. I think we used Nagios here, but we may have switched to something else.


Yep, I'm using Nagios as well, paired with Ganglia (+collectl).

Quote:
3) Something which maps "I want to go to location X" to "you need to connect to IP Y port Z" -- this is totally application specific.


That's what I was actually planning to use the load balancer for.

Quote:
4) Reporting of which application-level areas have high vs low occupancy. This is application specific.

5) Deciding when to add another instance of a particular zone. This is mostly application specific, and it has to interact with option 3) in some way.

Note that it's hard to un-instantiate a zone once it's instantiated, because it takes a long time for the last user to log out of the zone, even if you stop adding new users to the zone, and you generally don't want to force users to re-connect to another zone instance.

In general, the most robust solution is the one where running services "check in" with the manager every once in a while (say, once a minute), and report load and the fact that they're alive. When a server doesn't check in for two minutes, it's assumed dead. Also, if a server refuses a request to instantiate a player when the player has gotten that server as his target, the server is assumed dead.


There is a nice article in Massively Multiplayer Game Development 2 called "MMP Server Cluster Architecture" which describes a similar approach.

Quote:
For dead processes, we make sure that the PID and any temp files go away, and the machine capacity then goes back into the pool of available resources. If a machine cycles a lot like this, it's marked as suspect (there's always some bad hardware somewhere causing trouble in a larger cluster of machines).

When a zone server goes dead, the load balancer will generally see that there's too little capacity for that area (including 0 capacity if that was the only server!) and instantiate more resources as necessary -- no particular special-casing is needed for that.


Thanks for these tips!

Quote:
Finally -- the main point of failure is the determination of "player wants to go to area X, what process should he/she connect to?" We actually solve that by broadcasting the decision that the load balancer would make, when that decision changes (it never changes faster than once every few minutes in our system). For example, if zones A and C both connect to zone B, they generally know which process handles new-player requests for zone B, so when the player decides to enter zone B, the servers know what to tell the player.


Here comes the most interesting part :) It's a bit unclear to me whether the client is aware of the load balancer in this scheme or not. Do you pass all the traffic through the load balancer so that it transparently delegates the client requests to the appropriate zone server? Or is the client aware of the load balancer, asking it for the zone server once the player wants to change location?

Quote:
This way, if the load balancer dies, the system will actually keep running just fine, until such time that some zone goes into overload, at which point only that zone will be hurt. This makes the system resilient. There are good Linux scripts to re-start a process if it dies; we use those for the load balancer itself.

Short of catastrophic failure (say, the internal network gets physically partitioned because someone trips on the power cord for a switch), this system can handle load balancing for such an application very well, and it is very robust. In particular, the part where the load balancer tells the other servers about decisions ahead of time, rather than being involved in each query, and the part where each service is responsible for registering/updating itself with the load balancer/monitor, make the system self-healing.


Sounds really good. Again thanks a lot for sharing!
"Or the client is aware of the load balancer and it asks it for the zone server once the player wants to change the location?"

Every time I've done anything like this we've had the client go ask the load balancer for resources -- otherwise the retransmission of the messages becomes the bottleneck.

It also has the advantage that there are fewer points of critical failure and much less stress on your routing system.
Quote:Original post by Katie
"Or the client is aware of the load balancer and it asks it for the zone server once the player wants to change the location?"

Every time I've done anything like this we've had the client go ask the load balancer for resources -- otherwise the retransmission of the messages becomes the bottleneck.

It also has the advantage that there are fewer points of critical failure and much less stress on your routing system.


I totally agree; however, the LVS/DR scheme, I believe, should resolve these issues while keeping the client unaware of the load balancing details... That's actually why I started looking closely at LVS.

Quote:Here comes the most interesting part :) It's a bit unclear to me whether the client is aware of the load balancer in this scheme or not. Do you pass all the traffic through the load balancer so that it transparently delegates the client requests to the appropriate zone server? Or is the client aware of the load balancer, asking it for the zone server once the player wants to change location?


Neither.

Zoning happens when you hit some boundary. Each zone server knows who its neighbors are, by zone ID. For each zone ID, each zone server subscribes to load balancing information from the load balancer. Thus, whenever a player hits the boundary and must be shunted to the next zone, the zone server itself knows where the player needs to go, so it simply tells the player "please connect to there."
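
On the zone-server side, the state involved can be as small as a table of preferred instances per neighbor zone, kept current by the balancer's pushes; the message format below is made up for illustration:

#include <optional>
#include <string>
#include <unordered_map>

struct Endpoint {
    std::string ip;
    unsigned short port;
};

// Per-zone-server view: preferred instance for each neighbor zone,
// updated whenever the load balancer pushes a new decision.
class NeighborTable {
public:
    void update(int zone_id, const Endpoint& preferred) { preferred_[zone_id] = preferred; }

    std::optional<Endpoint> preferred_for(int zone_id) const {
        auto it = preferred_.find(zone_id);
        if (it == preferred_.end()) return std::nullopt;
        return it->second;
    }

private:
    std::unordered_map<int, Endpoint> preferred_;
};

// Build the "please connect to there" message for a player crossing into
// `zone_id`; empty if we don't currently know an instance for that zone.
std::optional<std::string> handoff_message(const NeighborTable& neighbors, int zone_id) {
    if (auto ep = neighbors.preferred_for(zone_id))
        return "ZONE_HANDOFF " + ep->ip + " " + std::to_string(ep->port);
    return std::nullopt;
}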

Thus, all the load balancer does is collect status/load information from all processes, decide when to instantiate new processes, and inform all zone servers about the preferred zone instance for each of their neighbor zones.

Other odds and ends:

We also support teleport.

When teleporting to a friend, you always end up in the zone instance that the friend is in.

When teleporting to a bookmark, the teleportation process has to look up the preferred instance for the destination zone, which is a round-trip to the load balancer at that point -- but from the server that initiates the teleport, not from the client. (Our system is very server service centric, to avoid client cheats or data exposure.)

When a client just connects to a zone instance, the instance may decide that it is overloaded, and kick the client to the "next" instance, if available. Each zone keeps a list of the "next" instance of itself; updated by the load balancer when the neighbor list is updated. This means that during heavily fluctuating load when the load balancer can't keep up, some zoning operations take longer (because there's one or more "kicks" that mean re-connect to a new zone instance).

Finally, the "preferred" zone instance for a new user may not be the lowest loaded zone. For gameplay reasons, you want players to aggregate enough that they can interact. Thus, for "preferred," we make something like "the zone that has the highest load, but below 80% load" preferred. If there is no such zone (all are 80 or above), and new zone instances can't be instantiated because the cluster is at capacity, we make the lowest-loaded server preferred. The value "80" probably varies based on game design, system update speed, and a number of other factors.
enum Bool { True, False, FileNotFound };
Quote:
Zoning happens when you hit some boundary. Each zone server knows who its neighbors are, by zone ID. For each zone ID, each zone server subscribes to load balancing information from the load balancer. Thus, whenever a player hits the boundary and must be shunted to the next zone, the zone server itself knows where the player needs to go, so it simply tells the player "please connect to there."


Again, thanks a lot for sharing these details! One more question: so I guess once the client sends the "want-to-change-location" packet to the current zone server, the following happens:

1) the current zone server responds with a "use-this-zone-server" packet
2) the client drops its current network connection
3) the client creates a new network connection to the suggested zone server and tries to log on

Correct?

Quote:
Thus, all the load balancer does is collect status/load information from all processes, decide when to instantiate new processes, and inform all zone servers about the preferred zone instance for each of their neighbor zones.

Other odds and ends:

We also support teleport.

When teleporting to a friend, you always end up in the zone instance that the friend is in.

When teleporting to a bookmark, the teleportation process has to look up the preferred instance for the destination zone, which is a round-trip to the load balancer at that point -- but from the server that initiates the teleport, not from the client. (Our system is very server service centric, to avoid client cheats or data exposure.)

When a client just connects to a zone instance, the instance may decide that it is overloaded, and kick the client to the "next" instance, if available. Each zone keeps a list of the "next" instance of itself; updated by the load balancer when the neighbor list is updated. This means that during heavily fluctuating load when the load balancer can't keep up, some zoning operations take longer (because there's one or more "kicks" that mean re-connect to a new zone instance).

Finally, the "preferred" zone instance for a new user may not be the lowest loaded zone. For gameplay reasons, you want players to aggregate enough that they can interact. Thus, for "preferred," we make something like "the zone that has the highest load, but below 80% load" preferred. If there is no such zone (all are 80 or above), and new zone instances can't be instantiated because the cluster is at capacity, we make the lowest-loaded server preferred. The value "80" probably varies based on game design, system update speed, and a number of other factors.


This is _very_ helpful, thanks!
Quote:Original post by pachanga
Again, thanks a lot for sharing these details! One more question: so I guess once the client sends the "want-to-change-location" packet to the current zone server, the following happens:

1) the current zone server responds with a "use-this-zone-server" packet
2) the client drops its current network connection
3) the client creates a new network connection to the suggested zone server and tries to log on

If you have an intermediate proxy system, it might perform steps 2 and 3 for you seamlessly.

Another approach is to perform step 3 before step 2 so that you don't spend any time without connectivity.
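
In sketch form, that ordering is just "bring the new link up, then drop the old one"; Connection, connect_to and log_on below are placeholders for whatever the client networking layer actually provides:

#include <memory>
#include <string>

// Placeholder connection type; stands in for the client's real networking layer.
struct Connection {
    bool log_on(const std::string& /*token*/) { return true; }  // stub
    void close() {}                                             // stub
};

std::unique_ptr<Connection> connect_to(const std::string& /*ip*/, unsigned short /*port*/) {
    return std::make_unique<Connection>();                      // stub
}

// Step 3 before step 2: bring the new link up before dropping the old one,
// so the player is never without connectivity.
std::unique_ptr<Connection> switch_zone(std::unique_ptr<Connection> old_conn,
                                        const std::string& ip, unsigned short port,
                                        const std::string& token) {
    auto next = connect_to(ip, port);
    if (!next || !next->log_on(token))
        return old_conn;   // stay on the old zone if the handoff fails
    old_conn->close();     // only now drop the old connection
    return next;
}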
The client doesn't detect or request anything. It's the server that detects that you need to zone. (We can also change our zone map at runtime, although we seldom use that.)

Because we do seamless zoning, the client connects to the new zone a fair bit before it disconnects from the old zone. In fact, if you stay close to the border, you will be permanently connected to both zones.
enum Bool { True, False, FileNotFound };
Quote:Original post by Kylotan
Quote:Original post by pachanga
Again, thanks a lot for sharing these details! One more question: so I guess once the client sends the "want-to-change-location" packet to the current zone server, the following happens:

1) the current zone server responds with a "use-this-zone-server" packet
2) the client drops its current network connection
3) the client creates a new network connection to the suggested zone server and tries to log on

If you have an intermediate proxy system, it might perform steps 2 and 3 for you seamlessly.


It seems there are two common problems with the proxy: a) it's a single point of failure, and b) it's a possible networking bottleneck.

Quote:
Another approach is to perform step 3 before step 2 so that you don't spend any time without connectivity.


Sounds good, thanks

