MMORPG and the ol' UDP vs TCP

Quote:Original post by John Schultz
This makes sense for objects that are rapidly changing state when the system is not running lock-step (or don't require ordered state consistency). To date, I have not run into a problem where this method can provide significant bandwidth savings, but I'll keep it in mind as a future option.

We've found that this method is useful for almost all simulation objects in the 3D (or 2D) world. Players, projectiles, vehicles, etc, whose updates constitute a substantial amount of the server->client communication. In reality it doesn't "save" bandwidth, rather it allows you to more optimally use the bandwidth that you have. Because you're not guaranteeing that any particular data are being delivered, the network layer has the latitude to prioritize updates on the basis of "interest" to a particular client. This results in a substantially more accurate presentation to the client for a given bandwidth setting.
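In rough form (a hypothetical sketch, not TNL's actual interface), the idea is just: score each object per client, then fill that client's byte budget highest-interest first; whatever doesn't fit waits, since nothing here is guaranteed anyway:

// Hypothetical C++ sketch of interest-prioritized unreliable updates.
// None of these names come from TNL; they only illustrate the idea.
#include <algorithm>
#include <cstddef>
#include <vector>

struct ObjectUpdate {
    int objectId;
    float interest;                      // e.g. nearer/more visible objects score higher
    std::vector<unsigned char> payload;  // quantized state for this object
};

// Fill one outgoing (unreliable) packet for a client, highest interest first.
// Updates that don't fit are simply skipped; a later tick sends fresher state.
std::vector<unsigned char> buildPacket(std::vector<ObjectUpdate> updates,
                                       std::size_t byteBudget)
{
    std::sort(updates.begin(), updates.end(),
              [](const ObjectUpdate& a, const ObjectUpdate& b) {
                  return a.interest > b.interest;
              });
    std::vector<unsigned char> packet;
    for (const ObjectUpdate& u : updates) {
        if (packet.size() + u.payload.size() > byteBudget)
            continue;                    // dropped for now; it was never guaranteed
        packet.insert(packet.end(), u.payload.begin(), u.payload.end());
    }
    return packet;
}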

Quote:
Given that this thread is titled MMORPG..., have you tested TNL with 2000-3000 player connections, under real-world internet conditions?

The largest "real-world" games we tested with were 100+ player "single zone" Tribes 2 servers. The problem domain is slightly different - i.e. Tribes was a twitch shooter with substantially more interactivity than most MMO type games, but it should translate well into the MMO type domain.

Quote:
Worst case scenario analysis for a MMORPG and 3000 very active players:

3kbytes/sec * 3000 players = 9000kbytes/sec, 72,000kbits/sec, 72Mbits/sec.

This means you'll probably have many fat pipes, as well as extra routing capabilities to deal with varying internet conditions. Given the unpredictability of network conditions, if the server does not actively adapt its bandwidth output, the system is going to fall apart (lots of data, lots of connections, lots of unpredictability).

It seems to me that an MMO service provider is going to want to make sure that it has enough bandwidth to handle peak load, with a healthy margin. It would be a trivial addition to TNL to allow a server-wide maximum bandwidth setting (e.g. 50 Mbit/sec) and then adapt the fixed rates for all connections down as new clients connect.
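Something like the following (hypothetical, not an actual TNL setting) is all it would take: divide the server-wide cap across the current connections and clamp each connection's fixed rate whenever a client connects or disconnects:

// Hypothetical sketch of a server-wide bandwidth cap (not an actual TNL API).
const double kServerMaxBitsPerSec = 50.0e6;   // e.g. 50 Mbit/sec

// Clamp a connection's configured rate to its share of the server-wide cap.
double perConnectionRate(double requestedBitsPerSec, int numConnections)
{
    if (numConnections <= 0)
        return requestedBitsPerSec;
    double share = kServerMaxBitsPerSec / numConnections;
    return (requestedBitsPerSec < share) ? requestedBitsPerSec : share;
}
// Re-run this for every connection whenever a client connects or disconnects.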
Quote:
Thus, I hope it is clear why I have been defending TCP*: it really is an excellent protocol. Some of its features are not ideal for games/simulations, but the RTO calculation (see appendix A in Jacobson's paper) is required if a custom protocol is to be used in a large-scale, real-world internet environment (such as an MMORPG). It's probable that the UDP-based MMORPGs that fail/fall apart do so because of poor bandwidth management.

Yeah, TCP's got some good bandwidth adaptation features. But classifying all data as guaranteed makes for big simulation trouble when you inevitably run into packet loss.
Quote:
In summary, study the history and design of TCP, and use the best feature(s) for custom game/simulation protocols, while leaving out (or retuning) features that hurt game/simulation performance.

Good summary!
Quote:Original post by markf_gg
Quote:Original post by John Schultz
This makes sense for objects that are rapidly changing state when the system is not running lock-step (or don't require ordered state consistency). To date, I have not run into a problem where this method can provide significant bandwidth savings, but I'll keep it in mind as a future option.

We've found that this method is useful for almost all simulation objects in the 3D (or 2D) world. Players, projectiles, vehicles, etc, whose updates constitute a substantial amount of the server->client communication. In reality it doesn't "save" bandwidth, rather it allows you to more optimally use the bandwidth that you have. Because you're not guaranteeing that any particular data are being delivered, the network layer has the latitude to prioritize updates on the basis of "interest" to a particular client. This results in a substantially more accurate presentation to the client for a given bandwidth setting.


I send this type of data in the unreliable/non-guaranteed channel: it makes up most of the data transmitted as well. Data is added and compressed based on each player's viewspace (more compression for far away objects, really far away objects get updated less frequently, etc.). Classic dead reckoning tracks the simulation so that updates get sent only after a divergence threshold is met. That is, even if the data/object is marked 'dirty', an update is not sent unless the interpolated/extrapolated error is significant (the sender simulates what the receiver should be seeing).
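The threshold check itself is small; a rough sketch (illustrative names, not my actual code):

// Sender-side divergence check (illustrative only).
#include <cmath>

struct Vec3 { float x, y, z; };

static float dist(const Vec3& a, const Vec3& b)
{
    float dx = a.x - b.x, dy = a.y - b.y, dz = a.z - b.z;
    return std::sqrt(dx * dx + dy * dy + dz * dz);
}

// Extrapolate what the receiver is currently showing from the last update it
// was sent, and compare against the true simulation state on the sender.
bool needsUpdate(const Vec3& lastSentPos, const Vec3& lastSentVel,
                 float secondsSinceLastSend,
                 const Vec3& truePos, float errorThreshold)
{
    Vec3 predicted = {
        lastSentPos.x + lastSentVel.x * secondsSinceLastSend,
        lastSentPos.y + lastSentVel.y * secondsSinceLastSend,
        lastSentPos.z + lastSentVel.z * secondsSinceLastSend
    };
    return dist(predicted, truePos) > errorThreshold;   // only send past the threshold
}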

The original statement:

Quote:
If OOORS is of limited use, what other class of data beyond guaranteed and non-guaranteed do you see of value? I agree that the biggest problem is too much data sent as guaranteed, which is a network game design issue. I have not yet seen a strong argument for supporting other classes of data. Either the data absolutely has to get there, or it doesn't. Perhaps you can give an example where this is not true?


asked if there were other useful classes of network data beyond reliable/guaranteed and unreliable/non-guaranteed. Your example uses the unreliable channel: it's a management layer between the network layer and the game layer.

Quote:
Quote:
Worst case scenario analysis for a MMORPG and 3000 very active players:

3kbytes/sec * 3000 players = 9000kbytes/sec, 72,000kbits/sec, 72Mbits/sec.

This means you'll probably have many fat pipes, as well as extra routing capabilities to deal with varying internet conditions. Given the unpredictability of network conditions, if the server does not actively adapt its bandwidth output, the system is going to fall apart (lots of data, lots of connections, lots of unpredictability).

It seems to me that an MMO service provider is going to want to make sure that it has enough bandwidth to handle peak load, with a healthy margin. It would be a trivial addition to TNL to allow a server-wide maximum bandwidth setting (e.g. 50 Mbit/sec) and then adapt the fixed rates for all connections down as new clients connect.


Networks are unpredictable. You can also think of bandwidth adaptation as a form of fault tolerance. See Jacobson's paper (also see the graphs/data in the other papers I referenced). The original version of TCP used a more naive approach (see Cerf & Kahn's 1974 paper and read their comments regarding bandwidth); it did not work well in practice, hence Jacobson's 1988 paper.

Quote:
Quote:
Thus, I hope it is clear why I have been defending TCP*: it really is an excellent protocol. Some of its features are not ideal for games/simulations, but the RTO calculation (see appendix A in Jacobson's paper) is required if a custom protocol is to be used in a large-scale, real-world internet environment (such as an MMORPG). It's probable that the UDP-based MMORPGs that fail/fall apart do so because of poor bandwidth management.

Yeah, TCP's got some good bandwidth adaptation features. But classifying all data as guaranteed makes for big simulation trouble when you inevitably run into packet loss.


I see the misunderstanding: I'm only proposing TCP/reliable-only when there is no choice (firewall issues, etc.). A TCP+UDP design can work fine (provided the UDP channel is bandwidth-managed as well). A custom UDP protocol should implement TCP-like bandwidth adaptation (customized for games/simulations) for both reliable and unreliable data, which will typically be sent in the same packet.
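For reference, the RTO estimator from appendix A of Jacobson's paper reduces to a few lines. The 1/8 and 1/4 gains and the SRTT + 4*RTTVAR form are the standard ones; the clamp values below are just game-oriented choices for illustration:

// Jacobson/Karels RTT estimator (1/8 gain for SRTT, 1/4 for RTTVAR).
// The min/max clamps are arbitrary game-oriented values, not from the paper.
#include <cmath>

struct RttEstimator {
    double srtt   = 0.0;   // smoothed round-trip time, seconds
    double rttvar = 0.0;   // smoothed mean deviation
    bool   first  = true;

    void sample(double measuredRtt)
    {
        if (first) {
            srtt   = measuredRtt;
            rttvar = measuredRtt / 2.0;
            first  = false;
        } else {
            double err = measuredRtt - srtt;
            srtt   += 0.125 * err;                        // gain 1/8
            rttvar += 0.25 * (std::fabs(err) - rttvar);   // gain 1/4
        }
    }

    double rto() const
    {
        double t = srtt + 4.0 * rttvar;
        if (t < 0.1) t = 0.1;   // floor: 100 ms
        if (t > 3.0) t = 3.0;   // ceiling: 3 s (games rarely want TCP's long backoffs)
        return t;
    }
};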

Quote:
Quote:
In summary, study the history and design of TCP, and use the best feature(s) for custom game/simulation protocols, while leaving out (or retuning) features that hurt game/simulation performance.

Good summary!


Thanks!
Guild Wars uses TCP exclusively. From what I know there have been no major issues with lag in GW.

That being said, obviously a perfect UDP solution would be better than TCP. The problem is that it is very, very hard to come up with a perfect UDP solution; in fact it is quite tricky to come up with a UDP solution that is better than native TCP. TCP has had years of evolution applied to it, and while it is fairly straightforward to implement a UDP protocol that seems to work, that only lasts until you throw several hundred clients at it.

Unless you really, really need the benefits of UDP, I'd suggest just going with TCP. It shaves weeks off your development time (unless you go with a middleware package) and it is often "good enough". Hell, it's "good enough" for World of Warcraft and Guild Wars.

Great thread! I vote for it being stickied! :D

PS: Personally, I use TCP for my current WoW-killer project. It's fast (as long as you don't send packets every frame, and don't update player positions to players outside a player's zone), and very reliable. :)
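A rough sketch of those two rules (hypothetical names, nothing from my actual project):

// Hypothetical sketch of the two rules above: throttle position broadcasts to
// a fixed rate and skip players outside the sender's zone.
#include <vector>

struct Player {
    int   id;
    int   zoneId;
    float x, y, z;
};

const double kUpdateInterval = 0.1;   // 10 updates/sec, not one per frame

void broadcastPosition(const Player& mover, const std::vector<Player>& others,
                       double now, double& lastSendTime)
{
    if (now - lastSendTime < kUpdateInterval)
        return;                                   // don't send every frame
    lastSendTime = now;

    for (const Player& p : others) {
        if (p.zoneId != mover.zoneId)
            continue;                             // outside the zone: skip
        // sendPositionTo(p, mover);              // hypothetical: write to p's TCP stream
    }
}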
_______________________
Afr0Games
First, I played AC (and AC2) for many years (months for AC2) and never once had a problem with lag. The only problem the game had was when too many players were in one area, and then a portal storm would occur and take you out of the laggy area. I never had a lag problem, so I don't know what AC you were playing.

Secondly, I'd go with UDP.

Build a reliable UDP protocol on top of it.

TCP has too much overhead for everything you need, not to mention the back-off/throttling issues talked about above.
You should use RTP or TCP.
If you are deciding between UDP and TCP, I suggest you take these issues into consideration:
1. Packet loss: UDP cannot guarantee delivery, so a client might send a critical message and it could simply drop along the way.
2. TCP is a relatively heavy protocol. Imagine a situation with 1000 concurrent online users who, in a role-playing game, send messages almost every second; using UDP makes sense because it does not send an ACK for every send and receive.

I have never used RTP, but it seems to be a good replacement for TCP since its acknowledgements are lightweight.
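The usual fix for issue 1 is a thin reliability layer on top of UDP: stamp each critical message with a sequence number and resend it until the peer acknowledges. A bare-bones, illustrative sketch:

// Bare-bones reliability on top of UDP for critical messages only (illustrative).
#include <cstdint>
#include <map>
#include <vector>

struct PendingMessage {
    std::vector<uint8_t> data;
    double lastSendTime;
};

struct ReliableChannel {
    uint32_t nextSeq = 0;
    std::map<uint32_t, PendingMessage> pending;   // unacknowledged messages

    uint32_t send(const std::vector<uint8_t>& data, double now)
    {
        uint32_t seq = nextSeq++;
        pending[seq] = PendingMessage{ data, now };
        // transmit(seq, data);                   // hypothetical UDP send
        return seq;
    }

    void onAck(uint32_t seq) { pending.erase(seq); }

    void resendTimedOut(double now, double rto)
    {
        for (auto& kv : pending) {
            if (now - kv.second.lastSendTime >= rto) {
                // transmit(kv.first, kv.second.data);   // retransmit
                kv.second.lastSendTime = now;
            }
        }
    }
};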
Quote:Original post by PlayerX
Guild Wars uses TCP exclusively. From what I know there have been no major issues with lag in GW.

That being said, obviously a perfect UDP solution would be better than TCP. The problem is that it is very, very hard to come up with a perfect UDP solution; in fact it is quite tricky to come up with a UDP solution that is better than native TCP. TCP has had years of evolution applied to it, and while it is fairly straightforward to implement a UDP protocol that seems to work, that only lasts until you throw several hundred clients at it.

Unless you really, really need the benefits of UDP, I'd suggest just going with TCP. It shaves weeks off your development time (unless you go with a middleware package) and it is often "good enough". Hell, it's "good enough" for World of Warcraft and Guild Wars.


Being "good enough" is what caused all the server problems with WoW. GW is not really an MMO, it spawns an instance of the world for each player / group so I imagine a lot more is left up the client in the case of GW, though I can't really be sure.

WoW is notorious for server problems.

Again, as a tribute to AC: its servers were rarely down, and aside from over-populated places (sub in pre-marketplace days) you'd not notice any lag.

Quote:Original post by John Schultz
Does RakNet use TCP? If not, how do you see IOCP helping a UDP-only based protocol, especially if the server is single-threaded (for maximum performance due to zero (user-level, network) context switching)?
...[snip]...
It would appear that thread context switching overhead might outweigh kernel (paging) advantages with IOCP, especially given the nature of UDP (not using memory/queues as with TCP).


Why would you make a single-threaded server to begin with? You've typically got a bunch of entities to process, AI to manage, etc., and throwing your UDP receive loop into the same thread would degrade performance to an unmanageable level, wouldn't it?

I've tried several UDP scenarios, such as:

- Having a primary thread to process game data, and a worker thread to receive and dispatch UDP packets (no async I/O)
- Same as above, using overlapped I/O and wait handles
- Using IOCP and a thread pool

I basically wrote a front end to blast packets between two machines on a 100BT network to see how many I lost, how far behind my programs got, etc. I also used a dual-xeon receiver and a single CPU sender, and vice-versa.

By far, and without question, the IOCP app ran the fastest, with the least CPU usage and the fewest lost packets. As a matter of fact, I was able to completely saturate a 100BT network with 1400-byte UDP packets to the dual xeon receiver with 0 lost packets and 0 backlog -- and using a fraction of the CPU's time.

None of the other methods I tried scaled up to utilize all available CPUs, nor did they keep up with massive throughput over an extended period of time. They invariably began to backlog and lost tons of packets.

Oh, I also tried running the program on the same machine (used both the dual xeon and a single CPU machine) using the loopback address. With two programs running full-tilt (one receiving and one sending) only the IOCP solution was able to receive all the packets with 0 backlog and 0 lost packets.

The only "flow control" I implemented was to turn off the send buffer on the socket to ensure the network layer didn't discard my outgoing packet due to lack of buffer space to store it.

If anyone's interested, I'll dig out the source code for the IOCP method and toss up a link.

Robert Simpson
Programmer at Large
Quote:Original post by rmsimpson
Quote:Original post by John Schultz
Does RakNet use TCP? If not, how do you see IOCP helping a UDP-only based protocol, especially if the server is single-threaded (for maximum performance due to zero (user-level, network) context switching)?
...[snip]...
It would appear that thread context switching overhead might outweigh kernel (paging) advantages with IOCP, especially given the nature of UDP (not using memory/queues as with TCP).


Why would you make a single-threaded server to begin with? You've typically got a bunch of entities to process, AI to manage, etc., and throwing your UDP receive loop into the same thread would degrade performance to an unmanageable level, wouldn't it?


In the case of 100% resource utilization, where processing incoming packets has the highest priority, it's clear that a single-threaded design should be the fastest: no thread context switching. This would be the limit case: it would not be possible to process packets more efficiently. When AI+physics+game code are factored in, if the incoming packet rate is very high, then packets will be dropped if the incoming buffer cannot be processed fast enough. If threads are used to process incoming packets, efficiency would be reduced due to context switching (unless the OS is doing something that improves efficiency).
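In code form, the structure is nothing more than this (a generic sketch: a non-blocking socket drained at the top of every tick, in the same thread as everything else):

// Generic sketch of a single-threaded server tick: drain the non-blocking UDP
// socket first, then run AI/physics/game code in the same thread.
#include <winsock2.h>

void runServer(SOCKET s)
{
    u_long nonBlocking = 1;
    ioctlsocket(s, FIONBIO, &nonBlocking);

    char buf[1500];
    for (;;) {                                   // one game tick per iteration
        for (;;) {                               // drain everything pending
            sockaddr_in from;
            int fromLen = sizeof(from);
            int n = recvfrom(s, buf, sizeof(buf), 0, (sockaddr*)&from, &fromLen);
            if (n <= 0)
                break;                           // WSAEWOULDBLOCK: queue is empty
            // handlePacket(buf, n, from);       // hypothetical game-side handler
        }
        // updateAI(); updatePhysics(); sendStateUpdates();  // rest of the tick
    }
}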

Quote:Original post by rmsimpson
I've tried several UDP scenarios, such as:

- Having a primary thread to process game data, and a worker thread to receive and dispatch UDP packets (no async I/O)
- Same as above, using overlapped I/O and wait handles
- Using IOCP and a thread pool

I basically wrote a front end to blast packets between two machines on a 100BT network to see how many I lost, how far behind my programs got, etc. I also used a dual-xeon receiver and a single CPU sender, and vice-versa.

By far, and without question, the IOCP app ran the fastest, with the least CPU usage and the fewest lost packets. As a matter of fact, I was able to completely saturate a 100BT network with 1400-byte UDP packets to the dual xeon receiver with 0 lost packets and 0 backlog -- and using a fraction of the CPU's time.

None of the other methods I tried scaled up to utilize all available CPUs, nor did they keep up with massive throughput over an extended period of time. They invariably began to backlog and lost tons of packets.

Oh, I also tried running the program on the same machine (used both the dual xeon and a single CPU machine) using the loopback address. With two programs running full-tilt (one receiving and one sending) only the IOCP solution was able to receive all the packets with 0 backlog and 0 lost packets.

The only "flow control" I implemented was to turn off the send buffer on the socket to ensure the network layer didn't discard my outgoing packet due to lack of buffer space to store it.

If anyone's interested, I'll dig out the source code for the IOCP method and toss up a link.

Robert Simpson
Programmer at Large


This forum's moderator, Jon Watte (hplus), who works on an MMOG (There), stated that they tested various scenarios, including MP+multithreaded, and found that single-threaded was the most efficient. It's not clear if their tests were from MMOG testing, benchmarks, or both.

Your test/benchmark sounds cool: if you could post your benchmark(s) showing that (overlapped I/O+) IOCP+threaded(+MP, etc.) does something extraordinary for UDP in Win32, including a means to compare with single-threaded standard UDP sockets, network developers would be interested in running the benchmark (I can test on various Intel/AMD/MP hardware).
Quote:Original post by John Schultz
This forum's moderator, Jon Watte (hplus), who works on an MMOG (There), stated that they tested various scenarios, including MP+multithreaded, and found that single-threaded was the most efficient. It's not clear if their tests were from MMOG testing, benchmarks, or both.

Your test/benchmark sounds cool: if you could post your benchmark(s) showing that (overlapped I/O+) IOCP+threaded(+MP, etc.) does something extraordinary for UDP in Win32, including a means to compare with single-threaded standard UDP sockets, network developers would be interested in running the benchmark (I can test on various Intel/AMD/MP hardware).


It's been at least a year since I even looked at the code, but I'll blow the dust off and post a link to the benchmark program(s) I wrote.

Robert
