Game server DoS / DDoS mitigation strategies?

28 comments, last by hplus0603 10 years, 8 months ago

Are there any must-have DoS / DDoS mitigation strategies one should always build into a server?

Regardless of TCP or UDP, it feels like there is very little one could do against a DDoS which tries to saturate the network.

Even if the attacker can't do that, simply sending login packets with spoofed IP addresses/ports could be a problem. For something like SSL, it's possible to hit the CPU hard just by initiating a handshake. CPU exhaustion attacks can be mitigated by puzzle challenge-style logins, but it should still be fairly easy for an attacker to block logins by making sure that all the login "slots" are in use.

(I know some websites that use CloudFlare, but that's for serving http/https content and so isn't an option)

Any opinions on what a reasonable line of defence is?


There are always some things you should do in server code, but they are only going to handle relatively small DoS attempts. Higher-bandwidth, multi-location DDoS attempts are something you have to plan for using colos, multiple IPs, hardware, etc. Assuming the goal is just to make the server able to withstand general and minor attacks, you should handle several things. There is a bit of a difference between TCP and UDP though:

1. With TCP you need to check your OS's SYN flood measures. See http://en.wikipedia.org/wiki/SYN_flood. I believe most OSes have many of the protections already in place, but you might want to tweak them if possible. I believe many Linux servers include automatic exponential SYN reply backoffs to handle DDoS, so you may be about as covered as you can get. It's just a good thing to check.

2. UDP servers will have much the same sort of flood attack vulnerability, but there the OS does not do the work for you, so you have to implement the various bits of mitigation code yourself.

3. Both TCP and UDP need to have bandwidth limiters. The attacker could have a valid login but a hacked client or tool which, once logged in, starts blasting far more traffic at the server than it should. With TCP you just disconnect them; with UDP you effectively do the same thing by dropping their session and ignoring their packets.

4. UDP needs to validate that each packet is not a spoof/injection. Generally the first thing in the packet should be some hash/CRC value that is very quick to validate. The last time I did this I used SSL to get the initial connection from client to server, perform login securely and give the client a unique ID/salt value. Then, when a UDP game packet is received, the first 16 bits are a CRC of the salt, the IP and the packet length. With that, you can at least check the basic validity of a packet cheaply before you start full decode, validation and processing. Obviously, if the attack is coming from someone with a real client or tool which logs in properly, they have the salt value and you might get valid-looking packets that should still be thrown away. You should boot the bastard if you see such thrown-out packets too often, of course.

5. Exponential backoff on failed login attempts, i.e. bad username/password. This should be recorded per IP+username. So, the first time a given IP+username gives you a bad login, you refuse to accept logins for 5 seconds. Next time, 10 seconds, then 20, etc. You record it as IP+username in case the real user attempts to connect after the attack. You don't want the user to have to wait possibly several minutes for something they didn't do. :)

6. Same ip+multiple usernames login failures. Same as #5 basically.
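The backoff in item 5 can be sketched roughly as follows. This is a minimal illustration, not the poster's actual code: the class name `LoginBackoff`, the 5-second base and the one-hour cap are assumptions, and a real server would also need to expire old entries so the table itself cannot be filled up by an attacker.

```python
import time


class LoginBackoff:
    """Tracks failed logins per (ip, username) and doubles the lockout each time."""

    def __init__(self, base_delay=5.0, max_delay=3600.0):
        self.base_delay = base_delay  # first lockout, in seconds (assumed value)
        self.max_delay = max_delay    # cap so the delay cannot grow forever (assumed)
        self._state = {}              # (ip, username) -> (failure_count, locked_until)

    def is_locked(self, ip, username, now=None):
        """True while logins for this ip+username pair should be refused."""
        now = time.monotonic() if now is None else now
        _, locked_until = self._state.get((ip, username), (0, 0.0))
        return now < locked_until

    def record_failure(self, ip, username, now=None):
        """Register a bad login and return the lockout applied: 5, 10, 20, ... seconds."""
        now = time.monotonic() if now is None else now
        failures, _ = self._state.get((ip, username), (0, 0.0))
        delay = min(self.base_delay * (2 ** failures), self.max_delay)
        self._state[(ip, username)] = (failures + 1, now + delay)
        return delay

    def record_success(self, ip, username):
        """A good login clears the history for that pair."""
        self._state.pop((ip, username), None)
```

Because the key is the (ip, username) pair, the real user connecting from a different address is not locked out by the attacker's failures. Item 6 would need a second table keyed on the IP alone.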

These are just some of the larger things I can think of off the top of my head. There are a large number of additions for UDP connections. This is not a small subject, it is actually quite large. But, the basics above are a simple starting selection of the more important ones.
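For the cheap packet check in item 4, a sketch could look like this. The exact layout is an assumption: the post only says the first 16 bits are a CRC of the salt, the IP and the packet length, so the use of `zlib.crc32` truncated to 16 bits, the field order and the helper names `packet_tag`, `wrap` and `cheap_check` are all made up for illustration.

```python
import struct
import zlib


def packet_tag(salt: bytes, ip: str, payload_len: int) -> int:
    """16-bit tag over salt + sender IP + payload length (layout is an assumption)."""
    data = salt + ip.encode("ascii") + struct.pack("!H", payload_len)
    return zlib.crc32(data) & 0xFFFF  # keep only the low 16 bits


def wrap(salt: bytes, ip: str, payload: bytes) -> bytes:
    """Client side: prepend the tag to the payload."""
    return struct.pack("!H", packet_tag(salt, ip, len(payload))) + payload


def cheap_check(salt: bytes, ip: str, packet: bytes) -> bool:
    """Server side: reject obviously bogus packets before any real decoding."""
    if len(packet) < 2:
        return False
    (tag,) = struct.unpack("!H", packet[:2])
    return tag == packet_tag(salt, ip, len(packet) - 2)
```

A CRC is not a cryptographic MAC, so this only filters blind spoofing and junk; it is a pre-filter in front of the full validation, not a replacement for it.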

With the mention of CloudFlare, and since AllEightUp's point 4 uses an SSL connection for the initial ID/salt: I saw recently that the new Firefall beta uses simple HTTPS requests to AWS servers for login. You could do something similar and use CloudFlare to prevent a lot of attacks on the initial login at least. You would still have to worry about the bandwidth issues after login, though. Depending on what kind of game it is, if it's not real time you could possibly get away with making everything HTTP requests and using CloudFlare or other technology that's already in place.

The ability to detect remote IPs that issue an abnormally high rate of requests, or an abnormal distribution of request types, is useful.
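One simple way to spot an abnormal request rate is a sliding window per remote IP. The sketch below is only an illustration; the name `RateWatch` and the window and threshold values are assumptions, and what counts as "abnormal" has to come from measuring your own traffic.

```python
from collections import defaultdict, deque


class RateWatch:
    """Flags an IP once it exceeds max_requests within the last `window` seconds."""

    def __init__(self, window=10.0, max_requests=100):
        self.window = window
        self.max_requests = max_requests
        self._hits = defaultdict(deque)  # ip -> timestamps of recent requests

    def hit(self, ip, now):
        """Record one request; return True if this IP now looks abnormal."""
        q = self._hits[ip]
        q.append(now)
        # Forget requests that have fallen out of the window.
        while q and q[0] <= now - self.window:
            q.popleft()
        return len(q) > self.max_requests
```

The same structure could keep one counter per request type to catch an odd distribution of requests rather than just a high total.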

Also, hardening your system and putting upstream filtering in place is useful. If your ISP can throw away UDP traffic destined to you on ports that you don't care about, then your link doesn't need to worry about it. (The ability to do this varies based on the ISP and your relation to them, i.e. how good a customer you are.)

Making sure you have SYN cookies turned on on your servers and/or load balancers is common sense. Same thing for filtering obviously bad IP fragments, etc.

Regarding CloudFlare, they are a mid-tier DDoS vendor, worrying more about CDN than DDoS, so they don't have a lot of ability to work with custom protocols like games. I think you'd get the same from other CDNs like Akamai.

There are higher-end vendors, where you pay five figures a month plus possibly five figures per event to help mitigate attacks. Those vendors may be able to put in place custom filters that you and they develop together, and can target any services. Names that come to mind include Prolexic, Neustar, and Verisign. Some of those guys even claim that they started out specifically as DDoS mitigation providers for MMO games!

Another option is to run on a public cloud, such as Amazon EC2. Yes, you pay for bandwidth, but if the DDoS is some number of gigabits for a few hours, that's not actually going to accrue all that much traffic. On the other hand, on EC2, a determined attacker will be able to generate pretty big bandwidth bills for your service... and Amazon may find that your service disrupts other customers and cut you off if it gets bad enough.

Finally, you can get high-speed connections to your ISP, with a lower commit level. For example, you might be able to find an ISP that lets you commit to half a gigabit per second, yet allows 100 Gbps interconnect, and bills by the 95th percentile. Since the top 5% of the month's samples are discarded, you need to be under DDoS for roughly a day and a half, full time, before the DDoS actually hits your 95th percentile. Given that renting out botnets is a lucrative market these days, somebody has to be really pissed at you to hit you with dozens or even a hundred gigabits for more than an hour. I've only really heard of that happening to the financial services people, presumably by black-ops teams from their competitors or extortionists.
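To see why a short attack does not move a 95th-percentile bill, here is a small worked sketch. It assumes 5-minute samples and a 30-day month, which is a common arrangement but not something every ISP does the same way; the function names and the traffic figures (500 Mbps normal, 100 Gbps attack) are illustrative only.

```python
SAMPLES_PER_HOUR = 12    # one sample every 5 minutes (assumed billing granularity)
MONTH_HOURS = 30 * 24    # 30-day billing period (assumed)


def billable_mbps(samples):
    """95th percentile: sort the samples and ignore the top 5% of them."""
    ordered = sorted(samples)
    discard = len(ordered) // 20  # the top 5% are thrown away
    return ordered[len(ordered) - discard - 1]


def month_with_attack(attack_hours, normal=500, attack=100_000):
    """Traffic samples (Mbps) for a month with `attack_hours` of full-rate flood."""
    n_attack = attack_hours * SAMPLES_PER_HOUR
    n_total = MONTH_HOURS * SAMPLES_PER_HOUR
    return [attack] * n_attack + [normal] * (n_total - n_attack)


print(billable_mbps(month_with_attack(1)))   # 500: a one-hour flood is discarded
print(billable_mbps(month_with_attack(36)))  # 500: exactly 5% of the month, still discarded
print(billable_mbps(month_with_attack(37)))  # 100000: past 5%, the flood sets the bill
```

Under these assumptions the threshold is 36 hours of attack traffic within the billing month, and it does not have to be contiguous.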

enum Bool { True, False, FileNotFound };

Good ideas all around. Some questions:

AllEightUp:

For bandwidth-limiting, you're thinking of a simple thing to prevent the client from issuing requests and getting response from the server for the request, right? Because nothing's going to prevent the packets actually hitting my server. Or are you thinking of some configuration I could do to firewalls?

For UDP I have encryption, but only after the UDP payload is split into sub-commands. I don't have it on the UDP protocol level. I don't know if that's a problem or not. For TCP the encryption happens after I split the stream into packets, but that feels "safer".

jeff8j:

Doing login over https has occurred to me exactly for that reason. Right now it uses a puzzle-challenge and limited evaluation pipe to prevent overload. Still, that won't protect it against bandwidth saturation. On the other hand, say I go with something like aws, then this might even make it worse. After all, killing the login server with requests is fairly harmless. However, if they fail to do so, or are able to get a large amount of login tickets, then they might start targeting the lobby and game servers - and those are the ones I would like to protect.
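For readers unfamiliar with puzzle-challenge logins, a hashcash-style version can be sketched as below. This is not necessarily the scheme described in the post, whose details are not given; the SHA-256 construction, the 8-byte answer encoding and the function names are assumptions for illustration.

```python
import hashlib
import os


def new_challenge() -> bytes:
    """Server side: a fresh random challenge per login attempt."""
    return os.urandom(16)


def leading_zero_bits(digest: bytes) -> int:
    """Count how many zero bits the digest starts with."""
    bits = 0
    for byte in digest:
        if byte == 0:
            bits += 8
            continue
        bits += 8 - byte.bit_length()
        break
    return bits


def verify(challenge: bytes, answer: int, difficulty: int) -> bool:
    """Server side: a single hash, so checking is cheap."""
    digest = hashlib.sha256(challenge + answer.to_bytes(8, "big")).digest()
    return leading_zero_bits(digest) >= difficulty


def solve(challenge: bytes, difficulty: int) -> int:
    """Client side: brute force, about 2**difficulty hashes on average."""
    answer = 0
    while not verify(challenge, answer, difficulty):
        answer += 1
    return answer
```

The asymmetry is the point: the client pays for many hashes, the server for one. As noted in the post, this limits CPU exhaustion but does nothing against bandwidth saturation.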

I've made secure login for game/lobby as cheap as possible by relying on a Kerberos-style ticket to set up encryption, but they are hitting the db after the security handshake... The nice thing about the login servers is that they don't affect any subsystems if overloaded.

It's hard for me to tell if moving the login to a simple https service would help or not.

hplus0603:

What would I gain by filtering everything but my game port? Assuming I have a firewall closing everything else, what would they get by bombarding other ports, as opposed to simply the game port?

There are also other things you need to be sure to handle, one of the latest and greatest attacks is the simplest of all:

A pseudo slow client.

The trick is to establish valid TCP connections, but send as little data as possible. Since data is only trickling over the connection, the bad connection remains open, consuming a great deal of resources. It is, in effect, the same as a TCP SYN flood attack, except the connections actually complete.
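A common defence against this kind of slow client is to require a minimum amount of progress from connections that have not finished logging in yet. The sketch below shows only the bookkeeping, with no socket code; the class name `SlowClientGuard`, the grace period and the minimum rate are assumptions, and it is meant for the pre-login phase, since an authenticated game connection may legitimately be quiet.

```python
class SlowClientGuard:
    """Picks out connections that stay open without delivering enough data."""

    def __init__(self, grace=5.0, min_bytes_per_sec=50.0):
        self.grace = grace                 # seconds before a connection is judged
        self.min_rate = min_bytes_per_sec  # required average rate (assumed value)
        self._conns = {}                   # conn_id -> [opened_at, bytes_received]

    def opened(self, conn_id, now):
        self._conns[conn_id] = [now, 0]

    def received(self, conn_id, nbytes):
        self._conns[conn_id][1] += nbytes

    def closed(self, conn_id):
        self._conns.pop(conn_id, None)

    def to_drop(self, now):
        """Connections older than the grace period with too low an average rate."""
        victims = []
        for conn_id, (opened_at, nbytes) in self._conns.items():
            age = now - opened_at
            if age > self.grace and nbytes / age < self.min_rate:
                victims.append(conn_id)
        return victims
```

The server's main loop would call `to_drop` periodically and close whatever it returns, which frees the resources the trickling connections were holding.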

There are, of course, the other things you need to handle:

Validate everything, never trust data coming from the client.


What would I gain by filtering everything but my game port? Assuming I have a firewall closing everything else, what would they get by bombarding other ports, as opposed to simply the game port?

A software firewall on a port can "help" to a degree, but the machine still ends up having to handle the data getting sent to it. While the data is simply discarded, it is still being sent to the machine.

Hardware firewalls are a bit better on this, since they're actually designed for throughput and packet streaming, so they can tend to discard blocked traffic at a much faster rate, but again... that's still traffic ON THE WIRE. If you're getting charged for bandwidth, you'll get charged regardless of if you accepted the traffic or not.


I've made secure login for game/lobby as cheap as possible by relying on a Kerberos-style ticket to set up encryption, but they are hitting the db after the security handshake... The nice thing about the login servers is that they don't affect any subsystems if overloaded.

Who is hitting the database? The client should never touch the database. In fact, the service the client talks to should, ideally, not talk directly to the database. The database should be firewalled, locked away behind services that do all the intermediate actions. Those services should be located on servers that are locked away from the rest of your machines, if at all possible. The simple fact is, a server on the internet is a gateway. Once someone gets into your gateway, then you need to limit their lateral movement. A database server sitting on the same network as said gateway is a ripe target to get its data ripped. But a database server locked behind some service gateways and firewalls is a much harder target to access and would give you time to notice and stop said infiltration.

In time the project grows, the ignorance of its devs it shows, with many a convoluted function, it plunges into deep compunction, the price of failure is high, Washu's mirth is nigh.

Washu: In regards to the port blocking, say I have the game port at 6000. What does an attacker gain by flooding 6001 that the attacker can't gain by flooding 6000? That's what I was wondering.

In regards to the db - it's the server that connects to the db of course, I was referring to the fact that a login / request by the client may cause the server to perform a db query.

In the case of a DoS attack on a server, then that server will trigger many db requests. If the queries are good and the server has a connection pool that limits the number of simultaneous queries to the db, that shouldn't be a problem. Still, I prefer to avoid enabling a player to cause a db query as far as possible.
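The idea of capping simultaneous queries can be sketched with a semaphore that sheds load instead of queueing it. This is an illustration only; `QueryGate` and `max_in_flight` are hypothetical names, and a real connection pool from a database driver would normally provide this limit for you.

```python
import threading


class QueryGate:
    """Allows at most max_in_flight database queries at once; extra requests are refused."""

    def __init__(self, max_in_flight=8):
        self._slots = threading.BoundedSemaphore(max_in_flight)

    def try_run(self, query_fn):
        """Run query_fn if a slot is free; otherwise return (False, None) immediately."""
        if not self._slots.acquire(blocking=False):
            return False, None  # overloaded: the caller should reject or retry later
        try:
            return True, query_fn()
        finally:
            self._slots.release()
```

Refusing immediately, rather than blocking, keeps a flood of login attempts from piling up behind the database; the db sees at most `max_in_flight` queries no matter how hard the front-end server is hit.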

Like you said, the db should never be accessible directly from the internet, and preferably thoroughly locked away inside the internal network.

In my case I was imagining the setup looking a bit like this:

                Internet
                    |
                    |
                    V
              +----------+
              | Firewall |  some.domainnamehere.net  <- Firewall connected to internet
              +----------+
            6000 |  |  | 6200          Port forwarding to the right server
     ____________|  |  |____________
    |               |               |
    |               | 6100          |
    |               |               |
    v               v               v
+-------+       +-------+        +------+
| Login |       | Lobby |        | Game |  <- Servers not directly connected to net
+-------+       +-------+        +------+
    |                               |
    |____________       ____________|
                 |     |
                 v     v
               +--------+
               |   Db   |
               +--------+

Washu: In regards to the port blocking, say I have the game port at 6000. What does an attacker gain by flooding 6001 that the attacker can't gain by flooding 6000? That's what I was wondering.

Not much, other than increasing your traffic bill. They're much more likely to attempt to flood an open port than one that's closed. Mainly because the goal is to swamp the machine, and the easiest way to swamp a machine is to use up the kernel's free memory by clogging it up with useless garbage through open ports.

In regards to the db - it's the server that connects to the db of course, I was referring to the fact that a login / request by the client may cause the server to perform a db query.

In the case of a DoS attack on a server, then that server will trigger many db requests. If the queries are good and the server has a connection pool that limits the number of simultaneous queries to the db, that shouldn't be a problem. Still, I prefer to avoid enabling a player to cause a db query as far as possible.

Like you said, the db should never be accessible directly from the internet, and preferably thoroughly locked away inside the internal network.

In my case I was imagining the setup looking a bit like this:

Lobby and Game appear to be on the net to me. If you're forwarding to them, they're on the net.

The typical behavior is to have a proxy server that sits between the servers that do the game logic and the player. The proxy server does all the validation, load balancing, packet validation, etc. Then it forwards the player requests on to the game logic servers. Those servers do the processing of things like physics, making sure that the actions are acceptable, and then issue a response to the proxy server that tells it aye or nay to the player's request, along with publishing relevant state information.

There would be a DMZ for the proxy servers, and then an internal network for the game logic servers.


AllEightUp:

For bandwidth-limiting, you're thinking of a simple thing to prevent the client from issuing requests and getting response from the server for the request, right? Because nothing's going to prevent the packets actually hitting my server. Or are you thinking of some configuration I could do to firewalls?

For UDP I have encryption, but only after the UDP payload is split into sub-commands. I don't have it on the UDP protocol level. I don't know if that's a problem or not. For TCP the encryption happens after I split the stream into packets, but that feels "safer".

The bandwidth limiting is basically a preventative measure. A common attack vector is for a client to log in correctly so that everything looks fine, and then start sending valid commands as fast as possible. For instance, say you have movement messages: instead of sending a movement message packed in with other data at, say, 10 times a second, they start sending just the movement update alone at 1000 times a second. Everything is 'valid', but they are hitting you in two places: the bandwidth, of course, and the CPU overhead of dealing with the valid but excessive packets. When you detect this, you boot the connection and stop accepting the packets, which brings CPU/memory on the server back to normal even if you are still getting slammed with the excessive bandwidth. If they keep it up even though you keep booting and ignoring them, you fall back on normal mitigation, such as telling the router to bounce the packets for an hour and, of course, contacting the source ISP or backbone provider and informing them of the source IP etc.
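A per-connection limiter of this kind is often written as a token bucket. The sketch below is one way to do it, not necessarily how the poster did; the class name `TokenBucket` and the rate and burst numbers in the usage note are assumptions.

```python
class TokenBucket:
    """Per-connection limiter: `rate` messages per second, with bursts up to `burst`."""

    def __init__(self, rate, burst, now=0.0):
        self.rate = rate            # tokens added per second
        self.burst = burst          # bucket capacity
        self.tokens = float(burst)  # start full so normal clients are never delayed
        self.last = now

    def allow(self, now, cost=1.0):
        """Return True if a message arriving at `now` is within the limit."""
        elapsed = max(0.0, now - self.last)
        self.tokens = min(self.burst, self.tokens + elapsed * self.rate)
        self.last = now
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False
```

For the example above, a bucket with `rate=20` and `burst=40` would pass a normal client sending 10 movement updates a second indefinitely, while a client sending 1000 a second would drain the bucket in a fraction of a second; the server can count the refusals and boot the connection after too many. Passing the packet size as `cost` turns the same code into a byte-rate limiter.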

As to the encryption, I don't generally trust myself to get all the edge cases correct, and would rather use the separate SSL connection for initialization. The SSL connection is only used for login and other secure items. As a benefit, you can use it for the initial NAT traversal since you have the connection anyway. In general though, this TCP connection is only there for a brief period and won't interfere with the UDP once initiated. This is mostly paranoia on my part: I just don't want to risk goofing something up in a way that allows someone to crack the UDP encryption somehow and grab everyone's passwords. It's not paranoid if everyone really is out to get ya... :)

Lerno:

Using HTTPS wouldn't protect the game servers directly, but it could do wonders for the login and lobby systems. With the Pro CloudFlare plan you can set the cache to 30 seconds, and they have servers all over the world, so when people hit the lobby they get the cached version. 30 seconds isn't much of a difference, and the CloudFlare servers would then only touch your server once every 30 seconds. You would probably want to whitelist the CloudFlare servers and block everything else with a firewall. Also, the nameservers would point to CloudFlare, not your IP, so you can keep the IP of your login/lobby server unknown to the public. You could also have one or more fallback servers in case some get in trouble. You would also have the servers report over HTTPS to keep that login/lobby server masked. Overhead would also drop, since you don't have to serve HTTPS from your server: you could serve HTTP and have CloudFlare do all the heavy HTTPS work.

Note: If your game could use WebSockets, you could block IPs and CloudFlare's DDoS protection would still work to prevent them from even touching your servers. The downside is that it's not a direct connection, so latency will go up a bit, but I have done this and it's not that bad. I figure that's because CloudFlare tries to be as close to the end user as possible, so it's pretty much in line on the way to the server.

In the end, the only way to be even slightly confident that things won't go down is to have enough servers in enough locations, with big pipes, that it would be very difficult to bring them all down, and if one goes down another handles the load. That's how Google, Facebook and the like handle things, and that's why a web option is easier to get going: something like CloudFlare has many powerful servers in place around the world with massive pipes that I know I couldn't afford.

Of course there are alternatives to CloudFlare, some way better, but none at the same level of cost ($20 a month, or even free for basic protection), so that's why I keep going back to it.

This topic is closed to new replies.
