Hardening a production server
The question came up:
How would I go about 'Hardening' the host server?
In theory, hardening is a simple mixture of good configuration, good software, and good operational practice.
- Don't run any services you don't need. Typically, this means running just an SSH server and your specific game server. Other services, like a local caching DNS server, are generally not a good idea!
- Don't allow connections to management interfaces from the greater internet. For example, if you have a private network, don't allow SSH connections from public addresses, only from private addresses.
- Enforce access rules both in configuration files (assuming your servers let you configure which addresses/interfaces they bind to) and with host-level filtering (iptables on Linux, Windows Firewall on Windows).
- Patch quickly once security updates are available for your OS and for any server software you didn't write yourself.
- For software you write, check the size and validity of every packet. If a field of a UDP packet says "there are 312 bytes of user name" but you only received 270 bytes total, that's a bad packet. Beware signed values (in C and C++, plain char is signed on many platforms and may be negative!) and test all code with extreme edge-case values.
- Religiously read the logs from your server. When there is a new kind of log message you haven't seen before, research what it is and why you see it. Classify log messages into exactly two classes: "can be ignored at runtime, but useful for post-mortem debugging" and "needs human attention pronto." If some log is not useful and doesn't indicate a problem, AND you know what it means, then configure/filter that specific message out.
- Keep a constant eye on metrics such as CPU load, RAM usage, network ingress/egress, CPU temperature, disk space, disk I/O rate, swap usage, and so on. Set alerts when they fall outside normal bands.
- Inside your private network, still enforce access with SSH keys, use TLS where possible, etc. Only allow cross-host connections that make sense. For example, if you have application servers, and game servers, and database servers, the game servers should not be allowed to access the database servers, but should have to go through the application servers. Enforce these rules with network filtering. Another good example is SSH command line access; typically you have a "bastion" host that you can SSH into, and SSH to other hosts is only allowed from that host and some spare/backup host. SSH directly from a game server host to an application server host would indicate a non-standard use, which should be disallowed.
- Keep track of what traffic you see internally. If there is a new protocol or new pair of hosts talking that you haven't seen before and don't expect, alert and investigate!
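To make the packet-checking point above concrete, here is a minimal sketch in Python. The packet layout (a 2-byte unsigned length followed by a UTF-8 user name), the names `parse_login` and `BadPacket`, and the 64-byte limit are all made up for illustration; your own protocol will differ.

```python
import struct

MAX_NAME_LEN = 64  # hypothetical limit chosen for this sketch


class BadPacket(Exception):
    """Raised when a packet fails a size or validity check."""


def parse_login(packet: bytes) -> str:
    """Parse a hypothetical login packet: 2-byte length, then the user name."""
    if len(packet) < 2:
        raise BadPacket("too short to contain the length field")
    # "!H" is an UNSIGNED 16-bit big-endian value. A signed format ("!h")
    # would turn 0xFFFF into -1, which is the kind of signed-value trap
    # to watch out for.
    (name_len,) = struct.unpack_from("!H", packet, 0)
    # The claimed length must match what was actually received.
    if name_len != len(packet) - 2:
        raise BadPacket("claimed length does not match bytes received")
    # The length must also be within sane bounds for the field.
    if name_len == 0 or name_len > MAX_NAME_LEN:
        raise BadPacket("name length out of range")
    try:
        return packet[2:].decode("utf-8")
    except UnicodeDecodeError as exc:
        raise BadPacket("name is not valid UTF-8") from exc
```

A packet that claims 312 bytes of user name but is only 270 bytes long in total gets rejected by the length comparison before anything tries to read the name.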
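The log-reading point above can be sketched the same way. This assumes you keep explicit lists of messages you already understand; the patterns and the names `classify` and `LogClass` are hypothetical. The important part is the default: anything you have not seen before goes to a human.

```python
import re
from enum import Enum
from typing import Optional


class LogClass(Enum):
    POST_MORTEM = "can be ignored at runtime, useful for post-mortem debugging"
    NEEDS_ATTENTION = "needs human attention"


# Hypothetical patterns; fill these in as you research each new message.
KNOWN_NOISE = [re.compile(r"health check ok")]
KNOWN_POST_MORTEM = [re.compile(r"player \d+ (connected|disconnected)")]


def classify(line: str) -> Optional[LogClass]:
    """Return None for understood, useless messages; otherwise a LogClass."""
    if any(p.search(line) for p in KNOWN_NOISE):
        return None  # understood and not useful: filter it out
    if any(p.search(line) for p in KNOWN_POST_MORTEM):
        return LogClass.POST_MORTEM
    # Never seen before: research it, then add a pattern above.
    return LogClass.NEEDS_ATTENTION
```

With these example patterns, `classify("player 42 connected")` ends up in the post-mortem class, while a message nobody has classified yet, such as `"segfault in worker"`, lands in the needs-attention class.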
When you actually get to implement this in practice, you will find that certain practices or requirements will conflict -- maybe the ability to quickly patch servers means that your servers need to be able to SSH outside of your network to grab those patches, but you don't generally want to allow SSH out from the servers to the greater internet. Or maybe there is a traditional "features versus security versus schedule" conflict. How you deal with those determines your culture.
I can say that starting to do these things early is, in the long run, much easier than trying to come from behind later. If you have a mess of internal connections and millions of unaudited log messages per day, trying to bring order is a massive undertaking. On the other hand, if your game sucks because you spent all your time on hardening and not on gameplay and polish, then all that hardening work was wasted (other than as a learning exercise).