have enabled web server
hosting companies to offer $5-per-month web hosting to thousands by lowering their own administrative costs, a refined set of public domain and inexpensive remote administration tools for game servers has made
it relatively easy for game server hosting companies to pass routine maintenance of their rented servers on to even mildly technical customers, from uploading new game maps
and player files to stopping and starting different configurations of game server processes.
available in both Linux and Windows flavors. If the server is a Windows system, the software developer can use their existing Microsoft DevStudio IDE to manage projects and builds, with the
Intel C and C++ compilers underneath. If the server is a Linux system, the user can choose to use the Eclipse CDT software development environment or good old-fashioned command-line editors and
How to start? For this exercise, a solid game engine example was selected: Richard Stanway's R1Q2 3). This is a tightened and enhanced version of the Quake
2 engine, which was released to the Open Source community by id Software back in 2001. Older code? Yes, but many game programmers cut their teeth on Q2 mod development. It's a known space and a good
reference point. R1Q2 was coupled with code from the LOX Q2 mod, an "extreme weapons" mod built by David S. Martin and friends, and enhanced by Geoff Joy and others. The LOX mod is a good
example of performance-challenging code, as the massive number of events that a single player can create with the right weapon selections and feature combinations can bring an otherwise
healthy server to its knees.
For this example, the target server platform is Linux, the default choice among server hosting companies wherever a game server engine has a Linux offering. The test server used was a
vanilla Red Hat Enterprise Linux 3 (Taroon Update 4) server, running on a 3.7 GHz Pentium 4 with 1 GB of RAM and a standard Serial ATA hard drive. Note that all of the steps discussed
here, including the optimization techniques and compiler features, are applicable to or available on Windows as well.
Get the code. Unpacking the code and doing a straight gcc compile with the -O2 optimization switch and the provided makefiles generated usable binaries that performed as expected. A pair of
client machines running on an isolated network connected without issue and achieved pings ranging from 15 to 35 ms. Since this code has had some level of grooming, compiler warnings were
Perform reference benchmarking. In this case, two client machines were connected over a local network to the server, which was running its standard binaries. Their static pings were
recorded, as were their pings when the server was stressed. The stress test involved having the players on both client test machines launch four napalm grenades per second from a fixed
location on the server's default level, generating at least 128 in-game explosions per second. Client "freeze" behavior, typical in this server stress condition, was monitored, as was the frequency of
"RATEDROP" warnings, which the server issues when it detects a significant drop in the server-client data exchange rate.
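The measurements described above can be reduced to a few comparable numbers per build. The following is a minimal sketch in C, not the tooling used for the article: the names `ping_summary`, `summarize_pings`, and `count_ratedrops` are hypothetical, and it assumes the ping samples and server log lines have already been collected into arrays and that a warning line contains the literal text RATEDROP.

```c
#include <stddef.h>
#include <string.h>

/* Summary of the ping samples from one benchmark run (illustrative type). */
typedef struct {
    int min_ms;
    int max_ms;
    double mean_ms;
} ping_summary;

/* Summarize n ping samples given in milliseconds.
   Returns 0 on success, -1 if there is nothing to summarize. */
int summarize_pings(const int *samples, size_t n, ping_summary *out)
{
    if (samples == NULL || out == NULL || n == 0)
        return -1;

    long total = 0;
    out->min_ms = samples[0];
    out->max_ms = samples[0];
    for (size_t i = 0; i < n; i++) {
        if (samples[i] < out->min_ms) out->min_ms = samples[i];
        if (samples[i] > out->max_ms) out->max_ms = samples[i];
        total += samples[i];
    }
    out->mean_ms = (double)total / (double)n;
    return 0;
}

/* Count the log lines that mention RATEDROP.
   Assumption: the warning text appears verbatim somewhere in the line. */
size_t count_ratedrops(const char *const *lines, size_t n)
{
    size_t hits = 0;
    for (size_t i = 0; i < n; i++) {
        if (lines[i] != NULL && strstr(lines[i], "RATEDROP") != NULL)
            hits++;
    }
    return hits;
}
```

Running the same two functions over the static and stressed samples of each build gives a like-for-like table of minimum, maximum, and mean ping plus a warning count, which is easier to compare across builds than raw console output.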
Get the Intel compiler. The Intel C/C++ compiler package is available for demo download, with academic, non-commercial, and commercial licenses. The software installs on nearly all major Linux
distributions, including those not supporting RPM.
Update the makefiles to enable optimization options. In this case, that meant changing "CC=gcc" to "CC=icc". The R1Q2 makefile required no dependency changes or LDFLAGS changes. The LOX makefile
required a minor change to the LDFLAGS setting to accommodate the new library home for a couple of key string functions.
For round one of our compiler optimization exercise, CFLAGS was changed to add the -O2 optimization switch. This is the most commonly recommended option, performing many
optimizations for speed without significant regard to the impact on code size, including but not limited to:
- Forward substitution
- Constant propagation
- Dead static function, code, and store elimination
- Tail recursions
- Partial redundancy elimination
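To make a few of the listed optimizations concrete, here is a small C sketch of source patterns they act on. It is illustrative only: the function names are hypothetical and are not taken from R1Q2 or LOX, and whether a given compiler applies each transformation depends on its version and settings.

```c
/* Tail recursion: the recursive call is the last thing the function does,
   so an optimizer may replace it with a jump and reuse the stack frame,
   turning the recursion into a loop. */
static unsigned long sum_to_acc(unsigned n, unsigned long acc)
{
    if (n == 0)
        return acc;
    return sum_to_acc(n - 1, acc + n);
}

/* Sum of the integers 1..n. */
unsigned long sum_to(unsigned n)
{
    return sum_to_acc(n, 0);
}

/* Constant propagation and dead code elimination: both operands are known
   at compile time, so the product can be folded to 32, and the branch
   below can then be proven unreachable and removed. */
int damage_radius(void)
{
    const int base = 8;
    const int scale = 4;
    int radius = base * scale;

    if (radius < 0)   /* never true once the constants are propagated */
        radius = 0;
    return radius;
}
```

The results are the same with or without optimization; the difference is only in how much work remains at run time.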
One thing that became clear during the initial build with the Intel compiler was that the number of warnings increased, going from 4 to 62. Most of the warnings were variable type checking issues.
Some of them warranted further investigation. In this case, only minor code changes were required. The newly rebuilt binaries were tested and results gathered.
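As an illustration of the kind of type-related warning a stricter compiler tends to raise, consider a signed loop counter compared against an unsigned length. This is a generic example under that assumption, not one of the actual 62 warnings, and `count_spaces` is a hypothetical function.

```c
#include <stddef.h>
#include <string.h>

/* Before (warning-prone):
       int i;
       for (i = 0; i < strlen(s); i++) ...
   Here a signed int is compared with the unsigned size_t that strlen()
   returns. The version below keeps every index and length in size_t, so
   no implicit signed/unsigned conversion takes place. */
size_t count_spaces(const char *s)
{
    size_t len = strlen(s);   /* computed once, already unsigned */
    size_t count = 0;

    for (size_t i = 0; i < len; i++) {
        if (s[i] == ' ')
            count++;
    }
    return count;
}
```

Fixes of this sort change no behavior for valid input, which is consistent with only minor code changes being needed.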
For the next round of optimization, the -O2 CFLAGS option was changed to -O3. This option, according to the documentation, contains "more aggressive
optimizations, such as prefetching, scalar replacement, and loop and memory access transformations". This includes all of the features of the -O2 optimization, plus loop unrolling, code replication
to eliminate branches, and padding of certain power-of-two arrays to improve cache use. Again, the newly built binaries were tested and results were gathered.
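Loop unrolling, one of the transformations named above, can be pictured by writing it out by hand. The sketch below is only an illustration of the idea; the compiler performs its own version automatically, and the function names and the unroll factor of four are assumptions made for the example.

```c
#include <stddef.h>

/* Straightforward loop: one addition and one loop test per element. */
long sum_simple(const int *v, size_t n)
{
    long total = 0;
    for (size_t i = 0; i < n; i++)
        total += v[i];
    return total;
}

/* The same loop unrolled by four: one loop test per four elements,
   followed by a cleanup loop for the zero to three leftover elements. */
long sum_unrolled4(const int *v, size_t n)
{
    long total = 0;
    size_t i = 0;

    for (; i + 4 <= n; i += 4)
        total += (long)v[i] + v[i + 1] + v[i + 2] + v[i + 3];

    for (; i < n; i++)   /* remainder */
        total += v[i];
    return total;
}
```

Both functions return the same sum. The unrolled form trades larger code for fewer branches, which is why this class of optimization is grouped with the more aggressive options.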
For round three, the binaries were built with an added switch: -axN. This switch enables processor-targeted optimization, in this case specifically for Intel Pentium 4 and
compatible chips. Once again the new binaries were tested.
The final round of compiler switch optimization called for changing the -axN switch to -axP. This option optimizes the output for Intel Pentium 4 processors with
Streaming SIMD Extensions 3 (SSE3) instruction support. Once more the resulting binaries were tested.
1) Webmin web-based interface for system administration http://www.webmin.com
2) Intel C/C++ compilers and related software products http://www.intel.com
3) R1Q2 Quake 2 release http://www.r1ch.net/stuff/r1q2