Peer-to-peer Massive Multiplayer Game

DISCLAIMER: I originally posted this on gamedev.stackexchange.com however I was told by one of the mods that my questions would be better suited for GDNet; and hence this post.

Introduction

I'm currently starting my very first multiplayer game and for the networking model, I'm leaning more towards a peer-to-peer model as opposed to a client-server model due to the benefit of not having to purchase a dedicated server.

However, I want my game to be able to handle a large number of concurrent users (around 50-100) in a reasonable manner.

My Current Proof of Concept

So far, I've come up with a proof of concept that can achieve moderate speed and moderate load balancing across the network; although I'm sure it can be improved on.

For it to work, one of the peers has to be designated as a central authority (i.e. the host).

The host then receives updates from users through a star topology and broadcasts the changes to the other users through a tree topology (see the image below).
[Image: users send updates to the host via a star topology; the host rebroadcasts them down a binary tree of users]

As shown in the image above, half of the users are leaf nodes and therefore won't be responsible for relaying any information.

The other half of the users however will each be responsible for relaying their information to 2 other users.

The advantage of this is that packets can be sent out in parallel, which in turn allows for moderately high speed.

Furthermore, because one half of the users are responsible for relaying information to 2 other users, and the other half is not responsible for relaying any information, this achieves moderate load balancing (better than having one user send out all the packets).
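For a rough sense of scale (purely back-of-the-envelope, not part of my actual implementation): in a balanced binary relay tree the number of hops from the host to the farthest user grows with log2 of the user count, so at 100 users the worst case is about 7 relay hops.

```python
# Rough worst-case relay depth for n users arranged in a balanced binary tree
# under the host (an approximation that ignores the host's exact placement).
import math

def relay_hops(num_users):
    # one hop from the host to the top of the tree, then down to the deepest leaf
    return math.floor(math.log2(num_users)) + 1

for n in (10, 50, 100):
    print(n, relay_hops(n))   # roughly 4, 6, and 7 hops respectively
```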

Also note that since each peer is just relaying information from the host, and not modifying it themselves, we can have the host digitally sign its messages (with a timestamp) so that peers can verify their authenticity (the host would first have to distribute its public key to every user on the network).
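To illustrate, here is a rough sketch of how a relaying peer could verify host-signed updates before forwarding them. This is only a minimal example assuming an Ed25519 library such as PyNaCl; the relay_update helper, the child.send call, and the 5-second freshness window are illustrative placeholders, not part of my actual design.

```python
# Hypothetical sketch: host signs timestamped updates; relays verify before forwarding.
import time
from nacl.signing import SigningKey
from nacl.exceptions import BadSignatureError

# Host side: generate a keypair once and distribute the verify key to every peer.
host_key = SigningKey.generate()
host_verify_key = host_key.verify_key

def host_sign(update: bytes) -> bytes:
    """Prefix the update with a timestamp and sign the whole payload."""
    payload = int(time.time()).to_bytes(8, "big") + update
    return host_key.sign(payload)  # signature + payload in one blob

# Relay side: verify authenticity and freshness, then forward unchanged.
MAX_AGE_SECONDS = 5

def relay_update(signed_blob: bytes, children: list) -> None:
    try:
        payload = host_verify_key.verify(signed_blob)
    except BadSignatureError:
        return  # drop forged or corrupted packets
    timestamp = int.from_bytes(payload[:8], "big")
    if time.time() - timestamp > MAX_AGE_SECONDS:
        return  # drop stale replays
    for child in children:
        child.send(signed_blob)  # forward the original signed blob verbatim
```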

Problem

However one thing that this proof of concept doesn't consider is "churn" (when users join and leave the network).

For example, if a user randomly disconnects without notifying the host, then all of its children (and its children's children) will lose their connection as well.

Possible Solution

Upon searching the web, I came across one paper which uses a "fat tree" topology. In their version, they use "heartbeats" to detect unexpected departures.

From the paper itself:

To detect unannounced departures, Nemo relies on heartbeats exchanged among the cluster’s peers. Unreported members are given a fixed time interval before being considered dead, at which point a repair algorithm is initiated. If the failed peer happens to be a leader, the tree itself must be fixed, the members of the victim’s cluster must elect the replacement leader from among themselves.

One problem I can see with this "heartbeat" method is that it generates a lot of overhead on the network, especially when you're dealing with 50-100 users.
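For reference, a bare-bones version of the heartbeat bookkeeping might look like the sketch below. This is not the paper's actual repair algorithm; the interval, timeout, and names like on_heartbeat are just placeholders to show what each peer would have to track.

```python
# Minimal sketch of heartbeat-based failure detection (illustrative values only).
import time

HEARTBEAT_INTERVAL = 1.0   # each peer pings its cluster once per second
TIMEOUT = 3.0              # a peer silent for this long is considered dead

last_seen = {}             # peer_id -> timestamp of last heartbeat received

def on_heartbeat(peer_id):
    last_seen[peer_id] = time.monotonic()

def find_dead_peers():
    """Called periodically; returns peers that missed the timeout window."""
    now = time.monotonic()
    return [p for p, t in last_seen.items() if now - t > TIMEOUT]
```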

Another Possible Solution

One possible solution that I've thought of is to have nodes "complain" (using TCP) to the host when they don't receive a certain number of packets within a certain time interval (i.e. a threshold).

When the host receives a complaint, it temporarily removes the node's parent from the network and promotes the child to its parent's place. If the parent has not actually disconnected, it will eventually complain as well about not receiving any packets, in which case we can restore the parent to its former position. However, if the parent does not complain, then we can assume that it disconnected and remove it from the network.

An advantage of this method is that it gets rid of freeloaders as well. However a disadvantage is that it does not have any load balancing since all the "complaints" are being sent to the host.
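To make the idea concrete, here is a rough host-side sketch of what I'm picturing. Names like tree.promote/tree.demote and the 5-second grace period are purely illustrative, and the actual tree bookkeeping is omitted.

```python
# Hedged sketch of the "complaint" scheme described above.
import time

COMPLAINT_GRACE = 5.0       # seconds a bypassed parent gets to complain back
suspected = {}              # parent_id -> time at which the parent was bypassed

def on_complaint(child_id, parent_id, tree):
    """A child reports that its parent stopped relaying packets."""
    if parent_id not in suspected:
        suspected[parent_id] = time.monotonic()
        tree.promote(child_id)     # child temporarily takes the parent's slot
        tree.demote(parent_id)     # parent is bypassed but not yet removed

def on_parent_complaint(parent_id, tree):
    """The bypassed parent complains too, so it is still alive; restore it."""
    if parent_id in suspected:
        tree.restore(parent_id)
        del suspected[parent_id]

def on_tick(tree):
    """If a bypassed parent never complains, treat it as disconnected."""
    now = time.monotonic()
    for parent_id, since in list(suspected.items()):
        if now - since > COMPLAINT_GRACE:
            tree.remove(parent_id)
            del suspected[parent_id]
```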

Conclusion

So my question is what are some popular methods employed in peer-to-peer networks for dealing with churn? What are some of their advantages/disadvantages? I'd appreciate it if people can link me to some resources as well.

TL;DR

Trying to implement a peer-to-peer massive multiplayer game (50-100 players) which has the properties of:

  • minimal latency
  • good load balancing
  • good resilience to churn (more emphasis on this)

Additional Note

A common method used in today's "peer-to-peer" networked games is to assign one of the peers as the server and then have all the other peers act as clients. Although this allows an easy way to deal with churn, it does not have any load balancing at all (since it's basically a client-server model). And this doesn't scale well when there are 50-100 users (which is something that I'm trying to achieve).


peer-to-peer data sharing is one thing

distributed computing is something different.

for distributed computing you're going to need some sort of host / master / server to coordinate things, especially if you have nodes dropping into and off the network during runtime.

so you should probably be looking at models for distributed computing, not peer-to-peer networking. peer-to-peer networking is just the hardware architecture your distributed computing system will run on. peer-to-peer is the lower layer of code, and distributed computing is the upper layer of code built on the lower layer.

i think you'll find that in distributed computing they always have some sort of host / master / controller computer, or some node(s) on the network that perform this function.

Norm Barrows

Rockland Software Productions

"Building PC games since 1989"

rocklandsoftware.net

PLAY CAVEMAN NOW!

http://rocklandsoftware.net/beta.php

Hey Norman,

The problem I'm dealing with is not so much a distributed computation problem. All of the computation involved in my game is fairly lightweight and should be able to run on all computers. The problem is dealing with dynamic changes within the game and broadcasting them over the entire network.

So far, I've been able to come up with a proof of concept that has fairly moderate load balancing and speed but I haven't found a good way to deal with fault-tolerance/churn/resilience (i.e. when nodes disconnect).

Thanks for your input,

achew

This design probably won't work for large-scale games, and it will struggle with resyncing data when parts of the tree split and heal. These are basically the same issues faced by IRC networks, which I have a ton of experience with.

You might want to look into high-performance message-passing architectures which by their nature are designed as peer-to-peer, such as OpenDDS and OpenSplice (which I've also had experience with).

These handle synchronisation of data, can decide what data is authoritative (important to combat cheating), and are high performance. They support reliable and unreliable packet types plus persistent network data. Generally they only work on LANs, as they find peers by multicast broadcasting, but this is a simple thing to overcome and in fact OpenSplice and others were working on Internet peer discovery when I last looked. These solutions are used very heavily in military and engineering problems but might be scalable enough for gaming networks. I haven't tried, but I am curious enough to try it myself one day.

The type of network these solutions exhibit is a cyclic graph rather than an acyclic directed graph.

Let me know if this helps at all!
It looks like you are mixing different purposes of the technology haphazardly and expecting the best of all of them.

Your paper is about multicast communications. Direct multicast, and indirect multicast by bouncing off peers, can reduce a host's bandwidth requirements, but think for a moment: is that really what your game system should be doing?

> So my question is what are some popular methods employed in peer-to-peer networks for dealing with churn? What are some of their advantages/disadvantages?

Direct peer-to-peer communication is good in games where you have one person who wants to talk to another. In that case you have effectively built your game as a mesh architecture, either a fully-connected mesh or a partially-connected mesh.

Peer-to-peer communication works in games where the individual machines are self-hosting or cooperatively hosting the game. It is fairly common in match-based games to have the match created on the server, then have the individual game clients run the game simulation as everyone plays the session, be that session a round of golf or a deathmatch.

There is another form of peer-to-peer communication, which your paper describes, more similar to the BitTorrent protocol. That method is great at getting specific information out to a large number of machines, but it does so by taking more clock time, and by assuming that everyone wants to get all of the identical information.

In an MMO that is almost never the case, except perhaps in transferring maps. In such games you only want the players to know about their immediate neighborhood. You do not want the data to be transferred from peer to peer to peer to peer and eventually to the one who cares; you want a direct transfer.

> Trying to implement a peer-to-peer massive multiplayer game (50-100 players) which has the properties of: minimal latency, good load balancing, good resilience to churn (more emphasis on this)

That size of game is not "massive", just regular multiplayer. Until you get over 100K concurrent players or so, you are not "massive". Games with 100+ players interacting in real time have been common since the late 1970s and 1980s in MUD- and MUSH-style games.

Hey braindigitalis,

Yes, you are absolutely right. The lower you are in the tree, the more nodes that have to be traversed (which increases the chance of delivery failure).

I've taken a look at some of the Data Distribution Services that you mentioned as well. Do you know of any good resources for expanding on how it all works?

Thanks,

achew

Hey frob,

Are you suggesting that it would be better to just ditch the whole peer-to-peer networking and go with a client-server architecture?

The peer-to-peer networking that I am trying to achieve falls into this category that you described:

Peer-to-peer communication works in games where the individual machines are self-hosting or cooperatively hosting the game. It is fairly common in match-based games to have the match created on the server, then have the individual game clients run the game simulation as everyone plays the session, be that session a round of golf or a deathmatch.


The game I'm working on also has the property that each player's action will have an effect on all the players in the entire game (unlike some games where a player's actions only affect a small region of the map), and hence multicasting is necessary.

Also when I use the word "massive", I'm using it in the context of peer-to-peer games since I have yet to find a peer-to-peer networked game that supports more than 50 players concurrently.

Thanks,
achew

EDIT: I found a peer-to-peer game called Huntercoin which I think supports an indefinite number of players concurrently. Their method uses a blockchain, which won't work for a game with only 50-100 players.

Communications tend to break down as numbers increase. It gets even worse when you write:

The game I'm working on also has the property that each player's action will have an effect on all the players in the entire game

Read up on the n-squared problem, since it typically applies to games as described. The ultra-short version is that as you start applying changes to more and more game clients' data, the work and the communications grow roughly with n squared. So if 2 players have communications of 4 units, 3 players will have roughly 9 units, 4 players roughly 16 units, 50 players 2,500 units, and 100 players 10,000 units. Such games quickly reach the point of saturating the weak links in the communications network. It takes a long time to saturate if the machines are in a datacenter, but the lines saturate quickly if people are on varied home connections. Similar things start happening with processing, although modern processors are more likely to be able to keep up than the communication links are.
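To put rough numbers on it (purely illustrative figures: 20 updates per second and 40 bytes of state per player, with everyone receiving everyone else's updates):

```python
# Back-of-the-envelope estimate of everyone-knows-everything traffic.
# The update rate and payload size are assumed example values, not measurements.
def total_traffic_bytes_per_sec(players, rate_hz=20, payload_bytes=40):
    # each of the n players' updates must reach the other n - 1 players
    return players * (players - 1) * rate_hz * payload_bytes

for n in (2, 10, 50, 100):
    print(n, total_traffic_bytes_per_sec(n))
# 100 players -> 7,920,000 bytes/s across the network, roughly 63 Mbit/s
```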

This is one reason true MMO-style games operate on a star and primarily communicate only what is local to the player, and they tend to reduce what is communicated to smaller areas when there is more going on. In a game where everyone knows everything the communication requirements grow rapidly, even with a fully connected peer-to-peer mesh where everyone talks to everyone else. Partially connected meshes where peers can clone/route messages along to others will have even more communications needs.

The churn problem means the mesh will be constantly adding and removing new nodes, adding complications but those can be designed around fairly easily relative to the communications growth problems.

"Peer to peer" means two different things.

In some games, the model where one user hosts a game instance and other users connect to that host is called "peer to peer", because the game operator only needs to worry about matchmaking, not game simulation.

In other instances, "peer to peer" really means each peer has some kind of direct relation to the other peers. This has never worked out well for gaming, both because of network topology problems, and because of the n-squared connections.

When you have 100 users all trying to connect in a peer-to-peer mesh, each computer has to connect to 99 other computers, which means a whole ton of overhead on the network. Additionally, because of NAT problems, some of those edges in that graph will not actually work.

Much better to elect some user who has reasonable network and CPU ability the "host" and have everyone else connect to that user. Establish a small number of "backup hosts" that can get elected host if the main host drops out of the game, so that connections are pre-established and migration can be smooth-ish.
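As a rough illustration of what that election and migration could look like (the scoring heuristic and the peer methods below are made-up placeholders, not any specific API):

```python
# Illustrative only: rank peers by a crude quality score, pick a host plus backups,
# and switch everyone to a pre-connected backup when the host drops out.
def pick_host_and_backups(peers, num_backups=2):
    ranked = sorted(peers, key=lambda p: (p.upload_kbps, p.cpu_score), reverse=True)
    return ranked[0], ranked[1:1 + num_backups]

def on_host_lost(backups, all_peers):
    new_host = backups[0]              # everyone already holds a connection to it
    for peer in all_peers:
        if peer is not new_host:
            peer.set_host(new_host)    # redirect traffic; no new handshakes needed
    return new_host
```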
enum Bool { True, False, FileNotFound };

Hello frob and hplus0603,

Correct me if I'm wrong here, but I did some reading up on the n-squared problem (a.k.a. Metcalfe's law) and it seems that the quadratic cost applies to the entire network as a whole. So in a typical client-server model, where the server is responsible for relaying every communication on the network, the cost to maintain that server would be O(n²). However, in a peer-to-peer model where the traffic is divided equally, the cost for each peer would be O(n). This makes sense, since if the number of users increases by a factor of n, the number of connections each peer has to sustain increases by a factor of n as well (think of a fully connected mesh, for example).
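Just to sanity-check my own reasoning, here are the connection counts I'm working from (a quick throwaway calculation, nothing more):

```python
# Connection counts for 100 users under the two topologies discussed above.
def mesh_total_links(n):
    return n * (n - 1) // 2     # every pair of peers holds a link

def mesh_links_per_peer(n):
    return n - 1                # each peer connects to all the others

def star_links_on_host(n):
    return n - 1                # the central host connects to every client

n = 100
print(mesh_total_links(n), mesh_links_per_peer(n), star_links_on_host(n))
# 4950 total mesh links, 99 links per peer, 99 links on the star's host
```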

However, now that you mention it, you guys gave me another idea. Instead of dividing up the users into a perfectly balanced BST, would it be better to use something like a space-partitioning tree that gets maintained by the host?

For example this...
[Image: users arranged in a space-partitioning tree maintained by the host]

Instead of this...
[Image: the balanced binary tree from the original proof of concept]

And since maintaining a space-partitioning tree isn't too computationally expensive, it shouldn't be too hard on the host. Or would it be better to remove the BST and just have each user get associated directly with a subregion, like this (which I think is what both of you guys are suggesting)...
[Image: users associated directly with subregions of the map]
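Something like the quadtree-ish sketch below is what I have in mind for the host's bookkeeping. It is completely illustrative; Region, leaf_for, and assign_user are just names I made up, and the world size and depth are arbitrary.

```python
# Rough sketch: the host keeps a shallow quadtree of the world and assigns each
# user to the leaf subregion containing them.
class Region:
    def __init__(self, x, y, w, h, depth=0, max_depth=2):
        self.x, self.y, self.w, self.h = x, y, w, h
        self.users = []                      # only leaves actually hold users
        self.children = []
        if depth < max_depth:
            hw, hh = w / 2, h / 2
            self.children = [
                Region(x,      y,      hw, hh, depth + 1, max_depth),
                Region(x + hw, y,      hw, hh, depth + 1, max_depth),
                Region(x,      y + hh, hw, hh, depth + 1, max_depth),
                Region(x + hw, y + hh, hw, hh, depth + 1, max_depth),
            ]

    def leaf_for(self, px, py):
        """Descend to the leaf subregion that contains the point (px, py)."""
        node = self
        while node.children:
            node = next(c for c in node.children
                        if c.x <= px <= c.x + c.w and c.y <= py <= c.y + c.h)
        return node

world = Region(0, 0, 1000, 1000)

def assign_user(user_id, px, py):
    """Host-side: put a user into the subregion their position falls in."""
    world.leaf_for(px, py).users.append(user_id)

assign_user("player_1", 120.0, 880.0)
```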

Thanks guys!
achew

