Game server DoS / DDoS mitigation strategies?

28 comments, last by hplus0603 10 years, 8 months ago


Lobby and Game appear to be on the net to me. If you're forwarding to them, they're on the net.

The typical behavior is to have a proxy server that sits between the servers that do the game logic and the player. The proxy server does all the validation, load balancing, packet validation, etc. Then it forwards the player requests on to the game logic servers. Those servers do the processing of things like physics, making sure that the actions are acceptable, and then issue a response to the proxy server that tells it aye or nay to the player's request, along with publishing relevant state information.

That sounds like a lot of work unless validation/packet validation is heavyweight - and that should depend a lot on the type of game. I personally worked on poker server backends for a few years and I never ran into a need to do something like that.

That said, the same company worked on a "next gen" poker platform which distributed everything, even worse than what you describe. Scalability and reliability were in the dust for that platform.

So to me that solution sounds like quite the trade-off.

I'm also wondering what sort of attack you'd do on a game server with a single open port. Yes, it's possible to overload it, but killing the proxy by overloading it would amount to the same thing, wouldn't it? Please explain the benefits, because I don't really see them.


As to the encryption, I don't generally trust myself to get all the edge cases correct and would rather use a separate SSL connection for initialization. The SSL connection is only used for login and other secure items. As a benefit, you can use it for the initial NAT traversal since you have the connection anyway. In general though, this TCP connection is only there for a brief period and won't interfere with the UDP traffic once that is initiated.

That's good advice in general. I've dug deep into cryptography to design a protocol which I feel fairly confident in. Mostly because it's basically an implementation combining two well known protocols. Still, I know it's a risk.

(Even if it's broken, I don't actually use any user-generated passwords. In addition, gameplay is mostly anonymized. It's not really possible to determine the e-mail of another player, and since the game is partitioned into separate worlds, if you get hold of someone's credentials it's unlikely to be someone playing in the same world as you.)


Using HTTPS wouldn't protect the game servers directly, but it could do wonders for the login and lobby systems. With Cloudflare's Pro plan you can set the cache to 30 seconds, and they have servers all over the world, so when people hit the lobby they get the cached version. Thirty seconds isn't much of a difference, and the Cloudflare servers would only touch your server once every 30 seconds. You would probably want to whitelist the Cloudflare servers and block everything else at the firewall. Also, the nameservers would point to Cloudflare, not your IP, so the IP of your login/lobby server stays unknown to the public. You could also have one or more fallback servers in case some get in trouble. You would also have the servers report over HTTPS to keep that login/lobby server masked. Overhead would drop too, since you don't have to serve HTTPS from your server: you could serve HTTP and have Cloudflare do all the heavy HTTPS work.

In my case all of login/lobby/game are plain sockets. No http. That said, it would be possible to convert them to http(s) services. However, cached data would not do. Both login and lobby servers serve personalized results, and for security reasons the login server shouldn't serve the same data twice. Some of the lobby functions might be possible to cache, but player registration - to take one example - obviously wouldn't work.

Or am I missing something?

Well, even if the results are personalized, Cloudflare has more DDoS protection than one person with limited funding and limited time can provide. Caching could still be of use, just not as much for the personalized results; things like some of the lobbies and player registration (until posted) could be cached and wouldn't have to touch your server at all. I run quite a few websites and am trying to get into the gaming world in my spare time, so my answers will definitely be web-biased :) but, running my own servers, Cloudflare has opened many doors that would otherwise have cost much more money and time to handle on my own.

And like I mentioned, it can handle the SSL so your server doesn't have to, which is a major plus: a DDoS can quickly exhaust a server's resources with many connections, and it would be much harder to take down a network like Cloudflare.

What kind of game are you making? Is it an MMO or something like an FPS? Some MMOs work entirely over web traffic, and some FPS games use SSL web traffic to log in and get going; Firefall, for example, uses AWS.

I personally feel that if it's an FPS you're better off getting high-powered servers and trying to manage the downtime. From running my Team Fortress and CSS servers, there's just not much you can do other than making it not worth the attack.

Lobby and Game appear to be on the net to me. If you're forwarding to them, they're on the net.

The typical behavior is to have a proxy server that sits between the servers that do the game logic and the player. The proxy server does all the validation, load balancing, packet validation, etc. Then it forwards the player requests on to the game logic servers. Those servers do the processing of things like physics, making sure that the actions are acceptable, and then issue a response to the proxy server that tells it aye or nay to the player's request, along with publishing relevant state information.


That sounds like a lot of work unless validation/packet validation is heavyweight - and that should depend a lot on the type of game. I personally worked on poker server backends for a few years and I never ran into a need to do something like that.

That said, the same company worked on a "next gen" poker platform which distributed everything, even worse than what you describe. Scalability and reliability were in the dust for that platform.

So to me that solution sounds like quite the trade-off.

I'm also wondering what sort of attack you'd do on a game server with a single open port. Yes, it's possible to overload it, but killing the proxy by overloading it would amount to the same thing, wouldn't it? Please explain the benefits, because I don't really see them.


Packet validation, i.e. verifying CRCs, packet types, etc. and then translating those into game events is fairly lightweight, and the perfect job for a proxy device. It allows you to have a buffer between the players and the game play servers, which allows you to invisibly scale game play servers based on CPU load, and add additional proxies to handle network load.
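To make the "lightweight" claim concrete, here is a minimal sketch of what a proxy's validate-and-translate step could look like. The wire format, the packet types, and every name in it (`encode_packet`, `parse_packet`, `GameEvent`, `PAYLOAD_SIZES`) are assumptions made up for illustration, not anything from a real game. Note that a CRC only catches corruption; it is not an integrity or authentication check.

```python
import struct
import zlib
from typing import NamedTuple, Optional

# Hypothetical wire format (an assumption for illustration):
#   1 byte  packet type
#   2 bytes payload length (big-endian)
#   N bytes payload
#   4 bytes CRC32 over everything before it
HEADER = struct.Struct("!BH")
CRC = struct.Struct("!I")

# Hypothetical packet types and the exact payload size each one requires.
PAYLOAD_SIZES = {
    1: 12,  # MOVE: three 32-bit floats (x, y, z)
    2: 4,   # USE_ITEM: one 32-bit item id
}


class GameEvent(NamedTuple):
    kind: str
    fields: tuple


def encode_packet(ptype: int, payload: bytes) -> bytes:
    """Client-side helper: frame a payload and append its CRC32."""
    body = HEADER.pack(ptype, len(payload)) + payload
    return body + CRC.pack(zlib.crc32(body))


def parse_packet(data: bytes) -> Optional[GameEvent]:
    """Return a GameEvent for a well-formed packet, or None to drop it."""
    if len(data) < HEADER.size + CRC.size:
        return None  # too short to even hold a header and a CRC
    body, trailer = data[:-CRC.size], data[-CRC.size:]
    (crc,) = CRC.unpack(trailer)
    if zlib.crc32(body) != crc:
        return None  # corrupted in transit or malformed
    ptype, length = HEADER.unpack_from(body)
    payload = body[HEADER.size:]
    # Unknown types, wrong declared length, and wrong actual length all fail.
    if PAYLOAD_SIZES.get(ptype) != length or len(payload) != length:
        return None
    if ptype == 1:
        return GameEvent("move", struct.unpack("!fff", payload))
    return GameEvent("use_item", struct.unpack("!I", payload))
```

Anything that comes back as `None` never reaches a game play server; only the decoded `GameEvent` would be forwarded.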

Because the proxy must be able to send data out, one could use a remote code execution exploit on the open incoming port to launch a process that establishes an outgoing connection to a remote server. At that point I now have the ability to remotely control the proxy server.

Once I've got the ability to remotely control the proxy server, TO ANY DEGREE AT ALL, I can then work towards moving laterally on the network to nearby devices. This includes things like using the database credentials of the server to establish connections to the database and pulling down the data off of it. Since your database server wasn't nicely hidden on another part of the network, and because database access wasn't performed through some sort of service, I've now obtained a list of all the usernames, passwords, email addresses, and all that other information that traditionally gets leaked. But wait, there's more!

Let us imagine for a second that this is a F2P game, where the company makes money by selling certain perks. Since I now have direct DB access, I can use it to perform actions such as adding certain items to my account. If this were a game like EVE Online, that action could have significant in-game economic impact, along with potential real-world economic impact on THE COMPANY. Imagine adding a thousand PLEX to a single account and selling them on the EVE market. You could trivially depress the market for PLEX well below the current price (something like 600,000,000 ISK), and the company would also lose $17,500.

Furthermore, since this would all be done through a few simple SQL queries, you would have a hell of a time tracking it down and even finding out WHICH accounts had been altered, assuming you even noticed it.

The proxy server is your first line of defense, and it should be set up in such a way as to assume that it WILL be compromised, and from there you design the system such that said compromised system cannot significantly affect the rest of your systems. You do this by isolating the network it is on (typically as a DMZ) and only allowing it to talk to specific servers on the internal net, such as game play servers. This then means that in order for me to penetrate further than a proxy server I now have to engineer an exploit that will hit the game play servers. Furthermore, since you can very explicitly restrict exactly what ports are open for Incoming AND OUTGOING connections on game play servers (which you can't do on the proxy servers, as they must be able to maintain connections to a disparate set of clients) you can limit the ability of the hacker to send and receive from those systems (they would have to inject the code into the current process or crash the game play service, both of which would likely be noticed).

In time the project grows, the ignorance of its devs it shows, with many a convoluted function, it plunges into deep compunction, the price of failure is high, Washu's mirth is nigh.


Packet validation, i.e. verifying CRCs, packet types, etc. and then translating those into game events is fairly lightweight, and the perfect job for a proxy device. It allows you to have a buffer between the players and the game play servers, which allows you to invisibly scale game play servers based on CPU load, and add additional proxies to handle network load.

Well, that depends on the game again. Depending on the setup, you might still be able to scale up game play servers, even without a proxy.

This was already something we could do on the poker servers I mentioned. Basically, tables would spawn on all available game servers, auto-balancing themselves so the server with the lowest load took on the most new tables. If you added a new game server it would report itself to the lobby and start creating tables. Removing a server was as simple as telling the server not to create any more tables.

This was a very straightforward solution and consequently had very few issues.
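A rough sketch of that scheme, for anyone who wants to see how little is involved. This is not the actual poker backend: it assumes the lobby makes the placement decision and that "load" is simply the number of tables a server hosts, and the `Lobby` and `GameServer` names are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class GameServer:
    name: str
    tables: int = 0
    accepting: bool = True  # False once the server is being drained


class Lobby:
    def __init__(self) -> None:
        self.servers: Dict[str, GameServer] = {}

    def register(self, name: str) -> None:
        """A new game server reports itself and becomes eligible for tables."""
        self.servers[name] = GameServer(name)

    def drain(self, name: str) -> None:
        """Stop placing new tables on a server; existing tables play out."""
        self.servers[name].accepting = False

    def create_table(self) -> Optional[str]:
        """Place a new table on the least-loaded server still accepting."""
        candidates = [s for s in self.servers.values() if s.accepting]
        if not candidates:
            return None
        target = min(candidates, key=lambda s: s.tables)
        target.tables += 1
        return target.name
```

A freshly registered server starts at zero tables, so it naturally absorbs new tables until it catches up with the others.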

Similarly, in my game each game server is really game agnostic. Game worlds are persisted to a db, but held entirely in memory once a game server starts "hosting" one. Once gameplay winds down, the new state can again be persisted to the db. Again, this is very domain specific to my particular game, as play is done in short sessions and each game only holds about 100 players.

But I can easily see games that your proxy solution could work great for too.


Because the proxy must be able to send data out, one could use a remote code execution exploit on the open incoming port to launch a process that establishes an outgoing connection to a remote server. At that point I now have the ability to remotely control the proxy server.

Let's say the firewall is a NetBSD installation forwarding ports 6000, 6100, 6200 to three different machines on the local network. This machine again is firewalled from the local network except for the game ports. Typically it's only possible to reach the firewall using ssh from a special machine, everything else is blocked.

It doesn't seem like a trivial hack, breaking into the servers hosting the games (and from those, reaching the db).


All I have to do is find a packet combination that includes the code I want to execute along with the appropriate buffer overflow exploit. It's not impossible to do, and happens frequently enough on the Internets. That's how a majority of the hacks over the last few years have worked: exploiting bugs in software to gain remote access to the machine.

Simply firewalling your game play servers is not sufficient. The servers can still establish outgoing connections on various ports, and if you attempted to prevent that you would find, very quickly, that players were unable to log into the machine, especially if you were using TCP/IP.

Lateral movement, once I'm on your network, is actually very easy. It is, in fact, easier than getting into the initial machine in most cases, because your internal network has less security on it than the gateway does. That leaves me a significantly greater number of potential access paths. Furthermore, since your game play server has to access the database, I simply need to access the configuration file and I've got your database login details. Now I can simply download the database from the server with those credentials.

Washu:

So your point is that with a proxy I can lock the game servers to a single outbound port which is a big security win, right?

You're basically thinking of an exploit that can go right through the firewall and execute arbitrary code on the machines hosting the game servers. But if you are, then locking the game servers shouldn't be much of an issue. It's a few more jumps to attack the service hosting the db and then onto the db, but there's not really any new kind of obstacle added, is there?

But I am assuming a breach so:

1. Login does not depend on permanent passwords. When you log in with an existing account you need to verify the login using your e-mail. You're then issued a new temporary passcode for that device. In case of a breach, all passcodes can be reset server side, with the only inconvenience being that the players need to re-authorize. No passwords are leaked.

2. Game server certificates and keys are stored encrypted in the config files, so even with the config files the private keys can't be recovered.

3. Even if the keys are leaked, changing them will only invalidate issued tickets.

4. If the game server private key is leaked that will only enable a MitM attack, as the keys are only used to authenticate the server to a player.

Not perfect, but better than having user passwords in the db.
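A minimal sketch of point 1, to show the shape of the idea. It assumes the passcode is a random token generated server side, and that only a hash of it is stored so a database dump does not directly yield usable passcodes. The hashing step is my own assumption rather than part of the description above, and `PasscodeStore` and its methods are hypothetical names.

```python
import hashlib
import hmac
import secrets
from typing import Dict, Tuple


class PasscodeStore:
    """Per-device temporary passcodes; only hashes are kept server side."""

    def __init__(self) -> None:
        self._hashes: Dict[Tuple[str, str], bytes] = {}

    @staticmethod
    def _digest(passcode: str) -> bytes:
        # A plain hash is enough here because the token is long and random,
        # unlike a user-chosen password.
        return hashlib.sha256(passcode.encode("utf-8")).digest()

    def issue(self, account: str, device: str) -> str:
        """Call only after the e-mail verification step has succeeded."""
        passcode = secrets.token_urlsafe(32)
        self._hashes[(account, device)] = self._digest(passcode)
        return passcode

    def verify(self, account: str, device: str, passcode: str) -> bool:
        stored = self._hashes.get((account, device))
        if stored is None:
            return False
        # Constant-time comparison to avoid leaking how many bytes matched.
        return hmac.compare_digest(stored, self._digest(passcode))

    def reset_all(self) -> None:
        """After a breach: every device has to re-authorize via e-mail."""
        self._hashes.clear()
```

Resetting after a breach is then a single server-side operation, and nothing the player chose or reuses elsewhere is exposed.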

It's a few more jumps to attack the service hosting the db and then onto the db, but there's not really any new kind of obstacle added, is there?

It's a lot harder than you think. With the game services exposed via a public port, I have a direct attack vector: the port. If I have to run everything through the proxy server, it will take longer, and you're more likely to notice.

2. Game server certificates and keys are stored encrypted in the config files, so even with the config files the private keys can't be recovered.

Encrypted how? If I'm ON the server then I can read anything the SERVER can read. I can also poke around in memory and simply grab the decrypted keys, or even skip that and grab the decrypted configuration file.

4. If the game server private key is leaked that will only enable a MitM attack, as the keys are only used to authenticate the server to a player.

And a MitM attack is exactly what I would want if I didn't have access to the database, as it would let me collect all the logins going to the server. For a GOLD FARMER, that's what I want. I liquidate all the assets on your account, and then sell the gold to some sucker with too much money. EVE had that issue for a while, and so does WoW. Mind you, their servers weren't hacked (and their design matches what I described, btw); instead, the attackers stole the credentials using viruses or spoofed websites.

Let me try and explain this a bit better:

Say you have a set of game servers that are all exposed to the internet (port forward, just behind a firewall, on the DMZ, whatever). These servers, obviously, have to talk to the database, so clearly there's an open connection between them and the database, even if the database is behind another firewall, on a different network segment, or subject to any number of other network design decisions.

But let us take the example of a set of port forwards to the different game servers, with the database server on the same network as the game servers.

If all of the game servers are on the same network, then once one is compromised it becomes fairly trivial to move laterally to all the other machines *on that segment of the network*. I.e., if I compromise the server at port 6000, I have control over that machine. I can now establish LOCAL connections from that machine to all of the other machines on that network segment, INCLUDING the database server.

The chances of you detecting and stopping such an intrusion in time to prevent damage, or the theft of data, are rather minuscule.

Once someone is ON your servers, no amount of encryption will help you. They can poke directly into memory to yank the keys, passwords, algorithms, or anything else they want out. They can decrypt your configuration files and database connection strings with minimal effort. Frankly, once they're on that segment of the network, that segment is compromised and NOTHING on it is guaranteed.

Now, if we change the network layout slightly, we end up improving security immensely. Specifically, if we move the database server off the local network and onto an internal LAN, then set up a firewall between that LAN and the DMZ that the game servers occupy, we have introduced an additional barrier. No longer can I simply remote straight into the database server and, say, make a copy of the raw database file for later examination. I'm reliant on connecting via the game servers into the database and running queries. Those queries will only have the permissions of the user that is used by the game servers, although those permissions are usually pretty wide since the game servers have to write changes to the database. Nevertheless, this introduces an additional step, and increases the chances of my detection significantly. Any serious queries I write will introduce a noticeable database load, and it makes it a lot harder for me to simply disguise my hacking as random crap that's getting thrown at the server. While this is good, it's not great. The chances of detection are still really low, and I can still end up doing bulk queries and getting large amounts of data out of the system before you ever notice.

Now, if we add an intermediary between the client and the game servers we can impose additional layers of security. In addition we get additional scalability options. Firstly, if we have a DMZ with a series of proxy servers on it, with those proxy servers doing nothing but translating client packets into game events, which are sent to the game servers, then we can scale the networking side of things fairly easily (provided we have a decent backbone between the game servers and the proxies). Too much traffic for one proxy to handle? Toss up another and tell the login server about it.

The game servers are then put behind the firewall with the database. The only way in or out of the local area network is via communication with the game servers. The database is not exposed to the proxy servers at all. Since the game servers only communicate with the proxy servers via game events (i.e. internal packets), then the only ways to attack the game servers, and thus gain direct access to the database, would be to work out YET ANOTHER EXPLOIT and inject that into one of the game servers. In addition, since the game servers only respond to game events, I cannot issue bulk queries and obtain copies of your database WITHOUT hacking my way through that barrier.
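The layout described here boils down to a default-deny allowlist of flows between zones. The sketch below models just that policy; the zone names and port numbers are placeholders I picked for illustration, and a real deployment would express this in firewall rules rather than application code. It assumes a stateful firewall, so replies on an established connection are permitted implicitly.

```python
from typing import NamedTuple


class Flow(NamedTuple):
    src: str   # zone that opens the connection
    dst: str   # zone that accepts it
    port: int


# Hypothetical zones and ports; only the overall shape comes from the layout
# described above.
ALLOWED_FLOWS = frozenset({
    Flow("internet", "proxy", 6000),   # clients reach the proxies in the DMZ
    Flow("proxy", "game", 7000),       # proxies forward game events inward
    Flow("game", "database", 5432),    # only game servers query the database
})


def is_allowed(src: str, dst: str, port: int) -> bool:
    """Default deny: a new connection passes only if listed explicitly."""
    return Flow(src, dst, port) in ALLOWED_FLOWS
```

With this policy, `is_allowed("proxy", "database", 5432)` is false, so a compromised proxy has no path to the database except through whatever the game servers accept as game events. Likewise `is_allowed("game", "internet", 443)` is false, which is the outgoing restriction that cannot be applied to the proxies themselves.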

Now, if your proxies are doing the validation and translation of packets, batching up zone events and sending them to the correct game server, etc., then your game servers should be doing validation of those events, such as running physics. This does not exempt the game servers from running basic validation, such as ensuring packet format and length are correct for the event type specified.

This significantly increases the chances of detection. Something as simple as trying a buffer overflow on, say, a player position update request would likely be caught. Too many such requests might ban that account, but if you kept seeing those requests in the admin log, you would probably open up your sniffer and see who's running a bot.
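The bookkeeping for that can be very small. Here is a sketch, assuming rejected requests are counted per account and that a fixed threshold triggers the ban; the `AbuseTracker` name, the threshold value, and the "admin" logger name are all hypothetical.

```python
import logging
from collections import Counter

log = logging.getLogger("admin")


class AbuseTracker:
    def __init__(self, ban_threshold: int = 5) -> None:
        self.ban_threshold = ban_threshold
        self.invalid = Counter()  # account -> number of rejected requests
        self.banned = set()

    def record_invalid(self, account: str, reason: str) -> bool:
        """Log a rejected request; return True if the account is now banned."""
        self.invalid[account] += 1
        # Every rejection lands in the admin log, so repeated probing shows
        # up even before the ban threshold is reached.
        log.warning("invalid request from %s: %s", account, reason)
        if self.invalid[account] >= self.ban_threshold:
            self.banned.add(account)
        return account in self.banned
```

The game server would call `record_invalid` whenever an event fails its format or length check, for example an oversized position update.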

Now, it's not perfect. Someone can STILL get in, and you can still fail to notice them. But the barrier to entry has been significantly increased at relatively little cost. For any REAL MMO you're going to have to have multiple servers running a single shard ANYWAYS.

As a reference, here is the architecture of the EVE node system, the VPN links are firewalled (i.e. DMZ):

[Image: e2lm.png]

(This is from their GDC 2009 presentation)

This topic is closed to new replies.