SpatialOS single shard MMO

29 comments, last by hplus0603 6 years, 9 months ago
On 20/06/2017 at 7:31 AM, drainedman said:

I can't afford $50k, but I am planning on building a Beowulf cluster to support a kind of MMO with 100k+ entities (not all human, but all persistent).

I'm just going to go for a simple method: each 2 km sq region is handled by one process. Every process sends sync messages to its neighbours via network messaging. Only the edges of the regions are kept in sync. I'd like to run this over a VLAN to separate the network channels.
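A minimal sketch of the region scheme described above, in Python (not from the original post). The 2 km region size comes from the post; the width of the edge band (`BORDER_M`) and the function names are assumptions made for illustration.

```python
import math

REGION_SIZE_M = 2000.0  # 2 km square regions, as described in the post
BORDER_M = 100.0        # assumed width of the edge band that is kept in sync


def region_of(x, y):
    """Return the (column, row) index of the region containing (x, y)."""
    return (math.floor(x / REGION_SIZE_M), math.floor(y / REGION_SIZE_M))


def neighbours_to_sync(x, y):
    """Return the neighbouring regions that should hear about an entity at (x, y).

    An entity only concerns a neighbour when it sits inside the edge band
    next to that neighbour; entities deep inside a region concern nobody else.
    """
    col, row = region_of(x, y)
    # Offset of the position inside its own region, in metres.
    local_x = x - col * REGION_SIZE_M
    local_y = y - row * REGION_SIZE_M

    dxs = [0]
    if local_x < BORDER_M:
        dxs.append(-1)
    if local_x >= REGION_SIZE_M - BORDER_M:
        dxs.append(1)

    dys = [0]
    if local_y < BORDER_M:
        dys.append(-1)
    if local_y >= REGION_SIZE_M - BORDER_M:
        dys.append(1)

    return [(col + dx, row + dy) for dx in dxs for dy in dys if (dx, dy) != (0, 0)]
```

With this layout an entity in the middle of a region generates no cross-process traffic, while one in a corner is reported to three neighbours.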

Of course this has lots of limitations but if I get this far then I will implement some load balancing e.g. split the regions with heavy loads into 1km sq regions, etc.
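The splitting idea could look roughly like this in Python. This is a sketch under assumptions: regions are represented as `(x0, y0, size)` tuples, load is measured simply as an entity count, and `max_entities` is a hypothetical threshold; none of these come from the post.

```python
def split_if_overloaded(region, entity_positions, max_entities):
    """Split a square region into four quadrants if it holds too many entities.

    region is (x0, y0, size) in metres; entity_positions is a list of (x, y).
    Returns a list with either the original region or its four quadrants.
    """
    x0, y0, size = region
    inside = [
        p for p in entity_positions
        if x0 <= p[0] < x0 + size and y0 <= p[1] < y0 + size
    ]
    if len(inside) <= max_entities:
        return [region]

    half = size / 2
    return [
        (x0, y0, half),
        (x0 + half, y0, half),
        (x0, y0 + half, half),
        (x0 + half, y0 + half, half),
    ]
```

A 2 km region split this way yields four 1 km regions, each of which could then be handed to its own process.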

So long as not too many objects converge in one spot and do too many interactions, it might be OK.

I am currently building a mini cluster of Raspberry Pis. So far I have two of them, but I plan to expand to four.

Each Raspberry Pi has 4 cores, 1 GB RAM and a 16 GB SSD drive.

You could easily support 20,000 - 40,000+ concurrent players on that, but it heavily depends on the game design.
When doing single shard multiplayer games, it is critical to avoid choke points, such as having one capital city or one location for the auction house where players tend to group together.

I wrote my own distributed server in Erlang, and on an 8-core desktop computer it could handle congestion of about two groups of 2,000 players each at the very same spot. This would be similar to players forming a warband in WoW and running around killing stuff.
With those two groups of 2,000 players each, the server was under heavy load but still responsive. When I tried to fire up a third group of that size, things started to lag badly.

In another test, I spread the players evenly across the world. I had divided the world (1024x1024 m) into smaller chunks of 64x64 meters. With about 60 players in each such chunk, the server was running smoothly, handling around 12,000 concurrent players with latencies between 10 and 50 ms.
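For illustration, here is how such a chunk grid could be indexed in Python. The world and chunk sizes are taken from the post; the function names and the idea of using a 3x3 block of chunks as a player's area of interest are assumptions, not a description of the Erlang server.

```python
from collections import defaultdict

WORLD_SIZE_M = 1024
CHUNK_SIZE_M = 64
CHUNKS_PER_SIDE = WORLD_SIZE_M // CHUNK_SIZE_M  # 16 per side, 256 chunks in total


def chunk_of(x, y):
    """Return the (column, row) of the chunk containing position (x, y)."""
    if not (0 <= x < WORLD_SIZE_M and 0 <= y < WORLD_SIZE_M):
        raise ValueError("position is outside the world")
    return (int(x) // CHUNK_SIZE_M, int(y) // CHUNK_SIZE_M)


def interest_chunks(x, y):
    """Return the player's own chunk plus its neighbours, clipped to the world."""
    cx, cy = chunk_of(x, y)
    return [
        (nx, ny)
        for nx in range(cx - 1, cx + 2)
        for ny in range(cy - 1, cy + 2)
        if 0 <= nx < CHUNKS_PER_SIDE and 0 <= ny < CHUNKS_PER_SIDE
    ]


def bucket_players(positions):
    """Group player ids by chunk; positions maps player id -> (x, y)."""
    buckets = defaultdict(list)
    for player_id, (x, y) in positions.items():
        buckets[chunk_of(x, y)].append(player_id)
    return buckets
```

Bucketing players this way makes it cheap to see which chunks are crowded, which is exactly where the choke points mentioned earlier would show up.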

I have not yet tested the performance on my Raspberry Pi cluster, but I am working on it now and should have some interesting results to share later this summer.

Supporting 100k+ entities where most of them are static or mostly inactive should not be very hard, but it is time-consuming to build such software unless you use something that already exists.


You could easily support 20,000 - 40,000+ concurrent players on that

While the Raspberry Pi is amazing, it's not THAT amazing. For example, the Ethernet is a 100 Mbps interface sitting on an internal USB hub. And calling the MicroSD card a "SSD" is ... a little optimistic :-)

I bet, if players aren't doing much, and you don't have real-time simulation requirements, and bandwidth is carefully managed, you could do thousands per Pi. That's still pretty amazing!
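To put a rough number on "bandwidth is carefully managed", the per-player budget on a 100 Mbps link can be estimated with a few lines of Python. The link speed is from the post; `usable_fraction` is an assumed allowance for protocol overhead and the shared USB hub, and the function name is made up for this sketch.

```python
LINK_MBPS = 100  # Ethernet speed quoted in the post


def per_player_budget_bytes_per_sec(players, usable_fraction=0.5):
    """Average outbound bytes per second available to each connected player.

    usable_fraction is an assumption: the share of the raw link speed that is
    actually left for game traffic after overhead and bus contention.
    """
    if players <= 0:
        raise ValueError("players must be positive")
    link_bytes_per_sec = LINK_MBPS * 1_000_000 / 8
    return link_bytes_per_sec * usable_fraction / players
```

For example, `per_player_budget_bytes_per_sec(5000)` gives 1250.0, so at 5,000 players per Pi each one would get a little over 1 KB per second on average under these assumptions.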

 

enum Bool { True, False, FileNotFound };

I have always generally preferred to roll my own solutions where there is no proven technology available. I think the reason behind this is that I like to fully understand how my system works. I also like to deal in precise quantities; I like to know what is and isn't possible. Using third-party stuff, I can never be sure if I'm using it wrong or if the "technology" is limited.

For example, Erlang gets recommended a lot, but I just don't see the speed advantage of using Erlang for low-latency distributed simulation. Perhaps I just don't like Erlang. Maybe the supposed advantage is that it's easier to use, but I'm already invested in my own pet technology, so "it's easier" doesn't really hold true for me either.

In any case my approach is not treading new ground, which gives me some optimism. I have read papers on similar stuff done years ago. 

For example, Erlang gets recommended a lot, but I just don't see the speed advantage of using Erlang for low-latency distributed simulation.

Erlang gives you the advantages of immutable data. It also gives you the advantages of ultra-cheap "processes" (more like "fibers", but the immutable data means you can't accidentally clobber someone else's state.) It also gives you the advantages of being able to upgrade the running code in-place. No rolling re-starts, no socket close and reconnects, just keep on going with the new version of the code. (The immutable-isolated-data concept makes writing these in-RAM migrations possible.) Finally, it gives you very small-size, threaded garbage collection, because each little micro-process has its own heap that's collected in isolation.

The upgrade-in-place is actually quite hard to do with DLL re-loads in C/C++, or in most other languages. The other bits are possible in other languages, with slightly different trade-offs. If you're interested in systems, in general, and haven't used most of those features (especially immutable-data functional programming) on previous projects, it's at least worth checking out for learning. It is a very different paradigm, though (if you haven't already used ML/Haskell/OCaml/F#) so expect the going to be very hard and uphill initially. It really takes time and effort to learn, like anything else that is actually big, different, and worthwhile.

That doesn't mean Erlang is the right choice for you. The IPC is low-latency, but cross-node communication uses TCP. The VM generates native code for execution, but the constant overhead is noticeable (Erlang is slightly slower than Java in my experience.) Not every project needs 100% uptime even through deploys and rollbacks. But, it represents a fundamentally different approach to the problems faced by distributed systems developers, and thus, it's quite worth learning in detail.

 

9 hours ago, hplus0603 said:

 

 

While the Raspberry Pi is amazing, it's not THAT amazing. For example, the Ethernet is a 100 Mbps interface sitting on an internal USB hub. And calling the MicroSD card a "SSD" is ... a little optimistic :-)

I bet, if players aren't doing much, and you don't have real-time simulation requirements, and bandwidth is carefully managed, you could do thousands per Pi. That's still pretty amazing!

 

I was referring to my cluster of four Raspberry Pis; maybe that was not clear in my post, sorry about that. But yeah, the numbers heavily depend on the type of gameplay you have in your game.

For what it's worth, this seems to be based on Docker containers https://docs.docker.com/engine/swarm/how-swarm-mode-works/nodes/, which is interesting in itself and gives a bit more detail on the technical aspects.

I don't make any claims to fully understand all of this, however my initial impressions are that this is more suitable to stateless load balancing (of web apps and so on) and rollouts of installations - rather than being an MMO platform and directly addressing the MMO issues. I cannot see any guarantees about state synchronization, speeds, etc. I tend to prefer to deal in hard numbers and not cuddly flow charts.

 

2 hours ago, drainedman said:

For what it's worth, this seems to be based on Docker containers https://docs.docker.com/engine/swarm/how-swarm-mode-works/nodes/, which is interesting in itself and gives a bit more detail on the technical aspects.

 

Docker is largely a cheaper way of running "virtual machines" of software in isolation from other software. It makes building the "machine images" (containers) straightforward, and has a number of orchestration methods to put containers on physical hardware. The drawback, compared to VMware or Xen or whatever, is that all the containers have to run on the same kernel, and that kernel needs to support containers in turn.

You can run web servers in containers. You can run game servers in containers. You can run memory caches in containers. You can run proxies in containers. You can even run desktop software in containers (if you forward the appropriate desktop.) You can also run databases in containers, although the I/O virtualization ends up making that so slow that nobody really does that for real.

The fact that something "uses containers" or even "is built on containers" doesn't really tell us much at all about what it does, other than that it presumably has some way of breaking up work into smaller, hopefully scalable, units.

2 hours ago, hplus0603 said:

The fact that something "uses containers" or even "is built on containers" doesn't really tell us much at all about what it does, other than that it presumably has some way of breaking up work into smaller, hopefully scalable, units.

That's true. How might one attempt to build scalable units with Docker containers?

The same way you'd build scalable units with any other hosting. You consider that your system is going to need to scale in some way, so you design it to share the load across an arbitrary number of processes.

Containers are a technology designed to make it easy for you to run an arbitrary number of identical processes. So these 2 concepts work well together. These processes don't care if they are web servers or MMO servers or neural nets analysing cat pictures. The key is that you can start up a lot of them, easily.
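As a small illustration of "share the load across an arbitrary number of processes", here is a Python sketch that assigns work units to workers by hashing a key. The key format and function names are hypothetical, and plain modulo hashing is just one simple choice: it remaps most keys whenever the worker count changes, which schemes such as consistent hashing try to avoid.

```python
import hashlib


def worker_for(region_key, worker_count):
    """Pick a worker index for a region key using a stable hash.

    hashlib is used instead of the built-in hash() so that every process,
    on every machine, computes the same assignment for the same key.
    """
    if worker_count <= 0:
        raise ValueError("worker_count must be positive")
    digest = hashlib.sha256(region_key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % worker_count


def assign(region_keys, worker_count):
    """Group region keys by the worker that should own them."""
    owners = {index: [] for index in range(worker_count)}
    for key in region_keys:
        owners[worker_for(key, worker_count)].append(key)
    return owners
```

Each container can then run the same image and work out which regions it owns from nothing more than its own index and the total worker count.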

Containers have nothing to do with state synchronisation - that's what the SpatialOS part does.

The other thing containers do is deliver a "complete working piece of software" in a single download.

All the libraries and data files and whatnot needed by the software in the container are typically included inside the container image itself.

Thus, containers are also a convenient way of delivering pre-configured software for various tasks, even if those tasks aren't inherently scalable. Instead, what you get then is the convenience of "download a thing, click run, and it will (hopefully) Just Work (tm)"

Thus, "it uses containers" tells us pretty much nothing about the innards of the system, but it does tell us that it's probably easy to download something and start it up, and it's probably easy to start up many copies of it. Whether those copies will efficiently form some kind of larger cluster is totally up to the implementation of the software on the inside.


This topic is closed to new replies.
