MMORPG and the ol' UDP vs TCP


69 replies to this topic

#41 joew   Crossbones+   -  Reputation: 3679

Posted 16 May 2005 - 04:36 AM

Quote:
Original post by Martin Piper
EQ Only uses UDP for game communication.

Ah, OK. I don't know a lot about EQ, so I just took the info from the emulator site; thanks for the correction. I confused it with DAoC's hybrid system, I guess.

Quote:

Everquest2, which uses UDP for game data, also has much better network performance than WoW.

With less than 1/4 of the players as well, which is a factor.

Quote:

I've also not noticed Asheron's Call problems that are related to them using a specific network protocol. The game might be rubbish, but that is not a network protocol related problem.

My point was exactly that it is not a network-protocol-related problem. I said that they use UDP, but the game runs horribly not because of the protocol but because of the server architecture. Who knows, the same could probably be said about WoW.

#42 markf_gg   Members   -  Reputation: 170

Posted 16 May 2005 - 04:47 AM

I'm still amazed that people even debate this. TCP alone is a poor solution for any kind of a realtime game, if only because even a single dropped packet causes a stall in all network data delivery until that data loss is noted and retransmitted.

Hybrid TCP/UDP systems are needlessly complicated and suffer from problems like bandwidth overconsumption by the TCP stream, maintenance of separate channels for UDP and TCP, misordered delivery of updates, etc.

I was going to go into greater detail, but then I realized I already have in the design fundamentals section of the Torque Network Library reference. The packet loss section gives a good explanation for why neither UDP nor TCP provide the right abstraction level for realtime game network programming.

- Mark

#43 John Schultz   Members   -  Reputation: 807

Posted 16 May 2005 - 10:55 AM

Network developers will agree that TCP and TCP+UDP are not optimal for real-time game/simulation applications. Custom UDP protocols can be more efficient. However, if the networked application sends too much data, any advantage provided by a custom UDP protocol is lost. In all cases, if the reliable queues fill up faster than they can drain, lag will be present (beyond network latency), up to the point the channel must be closed due to queue overflow.

A well-designed network game can work fine using TCP or TCP+UDP. Beginning network programmers will have a much easier time using TCP+UDP than trying to create a custom UDP protocol from scratch. The argument that TCP+UDP is overly complex is invalid, as creating a custom UDP protocol is much more complicated. In the context of marketing material for an existing, well-tuned and debugged custom UDP protocol, such an argument can make sense (more in terms of efficiency than complexity: the developer must now become familiar with third-party code).

The out-of-order (early) unreliable data arrival before reliable data argument (from the TNL site) is easily dealt with by tracking state. Example: a reliable activation message is sent to an object via the reliable channel and gets lost in transit. An unreliable position update packet arrives before the activation message. The object state is checked, and since the object is not active, the position update is ignored.
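
A minimal sketch of that state check, in C++ (the names and message layout are illustrative, not from any particular toolkit):

#include <unordered_map>

struct NetObject {
    bool  active = false;          // set only by the reliable activation message
    float x = 0, y = 0, z = 0;
};

std::unordered_map<int, NetObject> g_objects;  // keyed by network object id

// Reliable channel: (re)delivered activation message.
void OnActivate(int objectId) {
    g_objects[objectId].active = true;
}

// Unreliable channel: position update that may arrive before the activation.
void OnPositionUpdate(int objectId, float x, float y, float z) {
    auto it = g_objects.find(objectId);
    if (it == g_objects.end() || !it->second.active)
        return;                    // activation hasn't arrived yet: ignore the update
    it->second.x = x;
    it->second.y = y;
    it->second.z = z;
}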

How valuable is out-of-order reliable support (OOORS)? In a game/simulation where objects move and stop for long periods of time, or when turning on/off non-state-affecting effects/props, bandwidth can be saved. However, in a game where objects are constantly moving (or stop for very short periods of time), OOORS provides little to no benefit, and if extra packet bits are required to allow for support of OOORS, it's a bandwidth loss.

Quote:
Original post by markf_gg
I'm still amazed that people even debate this. TCP alone is a poor solution for any kind of a realtime game, if only because even a single dropped packet causes a stall in all network data delivery until that data loss is noted and retransmitted.


TCP alone is the only option for games operating in restricted environments (for example, when only HTTP/HTTPS is open at the firewall). As long as the TCP queue is effectively monitored (different methods for *nix and Win32), and decent client-side prediction is implemented, it is possible to work around stall issues.

While TCP will never be ideal for a high-data-rate FPS, it can work fine for RTS games and slower-moving MMOGs. If the game is primarily running lock-step, where everything must be delivered in order, guaranteed, TCP alone will work fine. During high congestion periods, TCP may, by design, slow down faster than a custom UDP protocol. However, this may be an advantage for a MMOG with thousands of players, where a poorly designed custom UDP protocol may fall apart (it keeps sending data at a high(er) rate, preventing the network from recovering). I suspect this is one reason why existing UDP-based MMOGs with thousands of players can fall apart. TCP is well designed to efficiently handle this case.

Quote:
Original post by markf_gg
Hybrid TCP/UDP systems are needlessly complicated and suffer from problems like bandwidth overconsumption by the TCP stream, maintainance of seperate channels for UDP and TCP, misordered delivery of updates, etc.


Bandwidth over-consumption is going to come from the unreliable channel, not the reliable channel. During periods of high congestion, the unreliable channel should be cut until the reliable channel(s) queue(s) can drain (data that is not truly state-critical should never be added to the reliable queue). All other arguments can be ameliorated at the network game design layer.
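
Roughly, that send-side policy can look like this (a sketch with made-up names; the real cut-off point would come from the congestion/bandwidth estimator):

#include <deque>
#include <vector>

struct OutgoingChannel {
    std::deque<std::vector<char>> reliableQueue;   // state-critical data only

    // Serialize the front reliable message into the outgoing packet and
    // return its size in bytes (serialization details omitted).
    int WriteReliable() {
        int size = (int)reliableQueue.front().size();
        reliableQueue.pop_front();
        return size;
    }

    void SendUnreliableUpdates(int budgetBytes) {
        (void)budgetBytes;                         // position deltas, effects, ...
    }

    void FlushTick(int budgetBytes) {
        // Reliable data always drains first.
        while (!reliableQueue.empty() &&
               budgetBytes >= (int)reliableQueue.front().size())
            budgetBytes -= WriteReliable();
        // During congestion the reliable backlog consumes the whole budget,
        // so the unreliable channel is effectively cut until the queue drains.
        if (budgetBytes > 0)
            SendUnreliableUpdates(budgetBytes);
    }
};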

Quote:
Original post by markf_gg
I was going to go into greater detail, but then I realized I already have in the design fundamentals section of the Torque Network Library reference. The packet loss section gives a good explanation for why neither UDP nor TCP provide the right abstraction level for realtime game network programming.
- Mark


The Torque Network Library looks like a good network toolkit (and to be fair, so do RakNet and ReplicaNet). While the arguments given do well to support licensing/purchasing a pre-made, well-tested custom UDP network toolkit, the biggest problem, by far, is network game design as opposed to the underlying network protocol.

I created a custom reliable UDP protocol in a case where the TCP implementation wasn't quite finished. The new UDP protocol ended up being more efficient than a TCP+UDP model (due to packet overhead savings and retransmit optimizations). Even so, the game would grind to a halt during high reliable state data sends. This required a significant redesign of networked game elements. Thus, while every bit of bandwidth helps, the burden of efficiency and game play quality resides in the game design, not the network protocol.

This was for the first full, Xbox Live-enabled game, and it was finished early (network-enabled games tend to ship late due to underestimation of networking issues). While the game only supported 4 players, many more flying and moving objects were active, as well as many rapid reliable state changes (a nature of the game: too late to completely remedy by the time I joined the project). Voice was enabled for all players, all the time (as opposed to only hearing players near each other). The game played with little to no perceptible lag, even below 64kbps (voice took ~32kbps).

Again, I recommend that developers look into developing or licensing/purchasing custom reliable UDP protocols (RakNet, TNL, and ReplicaNet appear to be good choices). However, TCP and TCP+UDP can work fine: the real work in making a game play well under all internet conditions is centered around the network game design itself, not the network protocol. Likewise, if a game plays well/poorly on the internet, it can’t be attributed-to/blamed-on TCP, TCP+UDP, or a custom UDP protocol. It’s the network game design itself.

[Edited by - John Schultz on May 16, 2005 5:55:13 PM]

#44 bit64   Members   -  Reputation: 218

Posted 16 May 2005 - 11:12 AM

Quote:
Original post by graveyard filla
Quote:
Original post by Anonymous Poster
In UDP packets are received in the order that they arrive.


This isn't true. UDP can send packets out of order, and in fact can send duplicates and other nasty things as well.


Yes, I should have explained that, sorry. I meant that under optimal conditions they are received in the order they are sent, but that is certainly not guaranteed. (In fact I did cover this in the other portion of my post; perhaps you didn't read far enough.)

Quote:
In UDP packets are received in the order that they arrive. Which is great for movement, or actions, because the last action is less important than the current action. You can mitigate the problems with out of order sequences on UDP very easily, simply by queuing the messages as they arrive, and then re-requesting those that didn't make it. Once your packets are ordered, then you can process them.


...mitigate the problems with out of order sequences on UDP very easily.....
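
For what it's worth, even the "easy" version needs a sequence number, a hold-back buffer, and a re-request path on top of raw UDP. A rough C++ sketch (wraparound handling and the actual NAK/re-request are omitted; names are made up):

#include <cstdint>
#include <map>
#include <vector>

struct OrderedReceiver {
    uint16_t expected = 0;                            // next sequence to deliver
    std::map<uint16_t, std::vector<char>> pending;    // hold-back buffer

    void Deliver(const std::vector<char>&) { /* hand off to the game */ }

    void OnPacket(uint16_t seq, std::vector<char> payload) {
        if (seq < expected)
            return;                                   // old or duplicate: drop
        pending[seq] = std::move(payload);
        while (!pending.empty() && pending.begin()->first == expected) {
            Deliver(pending.begin()->second);         // in-order delivery
            pending.erase(pending.begin());
            ++expected;
        }
        // Any gaps still sitting in 'pending' are what the re-request would cover.
    }
};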

[Edited by - bit64 on May 16, 2005 5:12:14 PM]

#45 markf_gg   Members   -  Reputation: 170

Posted 17 May 2005 - 07:11 AM

Quote:
Original post by John Schultz
However, if the networked application sends too much data, any advantage provided by a custom UDP protocol is lost. In all cases, if the reliable queues fill up faster than they can drain, lag will be present (beyond network latency), up to the point the channel must be closed due to queue overflow.

This is one of the reasons why TCP+UDP is a poor combination - it gets people thinking in a message-oriented mindset where the only two primitives for data delivery are guaranteed and unguaranteed messages. What often ends up happening in a 3D simulation is that a large portion of messages get tagged as "reliable", overflowing the queue.

Quote:

How valuable is out-of-order reliable support (OOORS)? In a game/simulation where objects move and stop for long periods of time, or when turning on/off non-state-affecting effects/props, bandwidth can be saved.

In practice we've found out-of-order delivered reliable events to be of limited use. There is, however, no per-packet overhead for supporting reliable OOO data in the TNL model, since TNL also supports strictly unreliable event sends.

Quote:

TCP alone is the only option for games operating in restricted environments (for example, when only HTTP/HTTPS is open at the firewall). As long as the TCP queue is effectively monitored (different methods for *nix and Win32), and decent client-side prediction is implemented, it is possible to work around stall issues.

It should be possible for a network system to support TCP connections for those clients that can't connect via UDP. I think I'll add that as an option to TNL :). It would still support all the higher level primitives like prioritization of object updates and fixed bandwidth consumption.

Quote:

However, this may be an advantage for a MMOG with thousands of players, where a poorly designed custom UDP protocol may fall apart (keeps sending data at a high(er) rate, preventing the network from recovering).

Well, I would never suggest using a poorly designed custom UDP protocol ;)

Quote:

Again, I recommend that developers look into developing or licensing/purchasing custom reliable UDP protocols (RakNet, TNL, and ReplicaNet appear to be good choices). However, TCP and TCP+UDP can work fine: the real work in making a game play well under all internet conditions is centered around the network game design itself, not the network protocol. Likewise, if a game plays well/poorly on the internet, it can’t be attributed-to/blamed-on TCP, TCP+UDP, or a custom UDP protocol. It’s the network game design itself.

The network toolkit you use often affects to a great degree the higher level design of your game networking. TNL, for example, isn't just a low level delivery protocol - it supports a rich set of data delivery policies that greatly simplify the higher level game networking design. TNL also uses a fixed per-client bandwidth setting, meaning that no matter how many objects are being updated, clients' network connections will never be flooded.

#46 graveyard filla   Members   -  Reputation: 583

Posted 17 May 2005 - 09:24 AM

Quote:
Original post by Saruman
By large I mean that with anything over 200 players you are going to start having some major issues, maybe even with a lower number of connected players.

The main bottleneck in the RakNet API is memory usage for tracking duplicate packets. In ReliabilityLayer.h you will see a giant array, and this is a problem space that needs to be solved, as with any large (>100) number of connected clients you are going to have issues. I know Kevin has worked on this, but I do not know where he has gotten or what design he chose, and I am pretty sure he does not want to commit until the doxygen and OSX port are complete, as it would set back other people's work.

There are other minor issues that really should be cleaned up, and IOCP support is something that you would definitely want back in if you are running on a Windows platform server.

Hope that helps.


Wow, that's pretty surprising to me, and to think the architecture in my game that uses RakNet should be able to handle way more than 200 players... not that I ever expected that many people to play, but it's always nice to be scalable.



#47 joew   Crossbones+   -  Reputation: 3679

Posted 17 May 2005 - 09:26 AM

Quote:
Original post by graveyard filla
Wow, that's pretty surprising to me, and to think the architecture in my game that uses RakNet should be able to handle way more than 200 players... not that I ever expected that many people to play, but it's always nice to be scalable.

Note that, as I said, Kevin will be fixing this, so it is not like this will be a persistent issue in the future. You could also fix the main problem yourself just by changing that big array to something more feasible.

#48 John Schultz   Members   -  Reputation: 807

Posted 17 May 2005 - 10:55 AM

Quote:
Original post by markf_gg
Quote:
Original post by John Schultz
However, if the networked application sends too much data, any advantage provided by a custom UDP protocol is lost. In all cases, if the reliable queues fill up faster than they can drain, lag will be present (beyond network latency), up to the point the channel must be closed due to queue overflow.

This is one of the reasons why TCP+UDP is a poor combination - it gets people thinking in a message-oriented mindset where the only two primitives for data delivery are guaranteed and unguaranteed messages. What often ends up happening in a 3D simulation is that a large portion of messages get tagged as "reliable", overflowing the queue.

Quote:

How valuable is out-of-order reliable support (OOORS)? In a game/simulation where objects move and stop for long periods of time, or when turning on/off non-state-affecting effects/props, bandwidth can be saved.

In practice we've found out-of-order delivered reliable events to be of limited use. There is, however, no per-packet overhead for supporting reliable OOO data in the TNL model, since TNL also supports strictly unreliable event sends.


If OOORS is of limited use, what other class of data beyond guaranteed and non-guaranteed do you see of value? I agree that the biggest problem is too much data sent as guaranteed, which is a network game design issue. I have not yet seen a strong argument for supporting other classes of data. Either the data absolutely has to get there, or it doesn't. Perhaps you can give an example where this is not true?

Quote:
Original post by markf_gg
Quote:

TCP alone is the only option for games operating in restricted environments (for example, when only HTTP/HTTPS is open at the firewall). As long as the TCP queue is effectively monitored (different methods for *nix and Win32), and decent client-side prediction is implemented, it is possible to work around stall issues.

It should be possible for a network system to support TCP connections for those clients that can't connect via UDP. I think I'll add that as an option to TNL :). It would still support all the higher level primitives like prioritization of object updates and fixed bandwidth consumption.


That's cool. When you get it working, perhaps post benchmarks showing any performance differences between the two (for a variety of bandwidth and network conditions, game types, etc.). I believe you'll need to use overlapped I/O and IOCP to determine the TCP send queue state on Win32 (queue flags/read-options exist for *nix).
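
For the *nix side, the queue check is a one-liner on Linux (SIOCOUTQ; other Unixes differ, and Win32 needs the overlapped/IOCP approach mentioned above):

#include <sys/ioctl.h>
#include <linux/sockios.h>   // SIOCOUTQ (Linux-specific)

// Returns the number of bytes still sitting in the kernel send queue for a
// TCP socket, or -1 on error. If this keeps growing, the connection is
// stalled and the game should stop queuing non-critical updates.
int UnsentBytes(int tcpSocket) {
    int pending = 0;
    if (ioctl(tcpSocket, SIOCOUTQ, &pending) < 0)
        return -1;
    return pending;
}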

Quote:
Original post by markf_gg
Quote:

However, this may be an advantage for a MMOG with thousands of players, where a poorly designed custom UDP protocol may fall apart (keeps sending data at a high(er) rate, preventing the network from recovering).

Well, I would never suggest using a poorly designed custom UDP protocol ;)


The developer may not know that their implementation is poor until it is stressed under real-world internet conditions. WRT the previous link, I discovered that my first custom protocol was quite poor during network simulation and analysis (I thought it was decent before testing). It was during this analysis that I found that even the native TCP implementation (for this device) was broken. This is one advantage to using a pre-made toolkit, provided the toolkit authors can provide benchmarks/statistics showing that their design can survive worst-case internet conditions (thousands of players, etc., as with an MMOG). TCP has been researched/studied for around 20 years: its strengths and weaknesses are well known (internet bandwidth balancing/optimization is still a hard, unsolved problem (not perfected)). RED and WRED are newer designs for router queues that help TCP to behave more efficiently during high congestion situations. This is another reason to start with the basic TCP design when designing a custom protocol.

Quote:
Original post by markf_gg
Quote:

Again, I recommend that developers look into developing or licensing/purchasing custom reliable UDP protocols (RakNet, TNL, and ReplicaNet appear to be good choices). However, TCP and TCP+UDP can work fine: the real work in making a game play well under all internet conditions is centered around the network game design itself, not the network protocol. Likewise, if a game plays well/poorly on the internet, it can't be attributed-to/blamed-on TCP, TCP+UDP, or a custom UDP protocol. It’s the network game design itself.

The network toolkit you use often affects to a great degree the higher level design of your game networking. TNL, for example, isn't just a low level delivery protocol - it supports a rich set of data delivery policies that greatly simplify the higher level game networking design. TNL also uses a fixed per-client bandwidth setting, meaning that no matter how many objects are being updated, clients' network connections will never be flooded.


That's true. My point has been that arguments against TCP and TCP-UDP are without merit in the cases where developers are aware of the network game design issues and are (for whatever reason: time, policy, skill level, cost, target market) limited to a TCP or TCP-UDP solution. In cases of congestion, TNL must drop any clients that haven't allowed the server to drain its reliable queue(s): at some point, no matter how much queue memory is present, you've got to call it quits, and drop the client. While you state that TNL supports a fixed bandwidth option, do you mean fixed maximum bandwidth? Does TNL also implement a filtered mean-deviation estimator to dynamically (and near optimally) adjust bandwidth for live internet conditions? The latter is far more important than the former (I would look at the former as a tuning tool, but in general would not want to artificially limit client bandwidth: there is no reason to do so if the custom protocol is optimally adapting for all the live channels).


#49 markf_gg   Members   -  Reputation: 170

Posted 17 May 2005 - 12:57 PM

Quote:
Original post by John Schultz
If OOORS is of limited use, what other class of data beyond guaranteed and non-guaranteed do you see of value? I agree that the biggest problem is too much data sent as guaranteed, which is a network game design issue. I have not yet seen a strong argument for supporting other classes of data. Either the data absolutely has to get there, or it doesn't. Perhaps you can give an example where this is not true?

TNL, and the Torque and Tribes engines before it, introduced a data delivery policy called "most recent state guarantee", which is to say that for a given object, the current state of the object will, at some point, be reflected to clients interested in that object. This is at the heart of the ghosting facility of TNL and what sets it apart from many other networking packages (i.e. RakNet).

In this system, rather than having simulation objects "push" data events to clients, the objects simply mark themselves as having dirty states. When TNL decides it's time to send another packet to a particular client, it sorts dirty objects based on a user-supplied prioritization function and then writes object updates into the packet until the packet is full. Any dropped packets simply set the dirty state flags for that object for that client that were not subsequently updated in a later packet.
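
As a rough sketch of the shape of that loop (illustrative C++ only, not actual TNL code; the sizes and the priority callback are assumed):

#include <algorithm>
#include <vector>

struct BitWriter { int bitsLeft = 1400 * 8; };        // one outgoing packet

struct Ghost {
    unsigned dirtyMask = 0;                           // one bit per replicated state
    float    priority  = 0;                           // filled by a user callback
    int      EstimateBits() const { return 64; }      // assumed update size
    void     WriteUpdate(BitWriter&) { dirtyMask = 0; }
};

void FillPacket(std::vector<Ghost*>& ghosts, BitWriter& packet) {
    // Highest-priority dirty objects first.
    std::sort(ghosts.begin(), ghosts.end(),
              [](const Ghost* a, const Ghost* b) { return a->priority > b->priority; });
    for (Ghost* g : ghosts) {
        if (g->dirtyMask == 0)
            continue;
        if (g->EstimateBits() > packet.bitsLeft)
            break;                                    // packet is full
        packet.bitsLeft -= g->EstimateBits();
        g->WriteUpdate(packet);
    }
    // If this packet is later reported lost, re-set the dirty bits it carried
    // for the affected ghosts; nothing else needs to be retransmitted.
}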

All remote object (ghost) creation messages are sent using this system as well, thus substantially limiting both the number of guaranteed and unguaranteed messages sent to clients. Because TNL tracks which states exist on which clients, there's no need to pulse unguaranteed messages (in case some position state was lost) or to send lots of guaranteed object creation/deletion messages.

The other data classification I've found useful is the "quickest delivery" data type -- player input for example, where a dropped packet or two shouldn't require a round-trip back to the client for a re-send. This is mainly a presentation issue for other clients in the simulation.
Quote:

The developer may not know that their implementation is poor until stressed under real-world internet conditions.

Well, they could always use a network technology that's been proven successful in AAA networked games back to oh, say 1998...

Quote:

While you state that TNL supports a fixed bandwidth option, do you mean fixed maximum bandwidth? Does TNL also implement a filtered mean-deviation estimator to dynamically (and near optimally) adjust bandwidth for live internet conditions?

TNL does have an adaptive bandwidth option for connections, but it's fairly primitive at this point. In the products I've shipped with it (Starsiege: TRIBES and Tribes 2) we simply fixed the client bandwidth at 2kbytes/sec and 3kbytes/sec respectively. Due to the nature of the most-recent state data guarantee we always filled up each packet to the client. The resulting gameplay was of sufficient quality that we didn't bother attempting to adaptively adjust bandwidth settings on the fly, although we did allow clients to adjust the params slightly upwards if they had broadband connections.

I am currently looking at improving our adaptive rate code to more easily allow TNL's use in higher bandwidth, non-simulation applications. Can you recommend anything I should read on the subject? A near optimal filtered mean-deviation estimator sounds like it might be what I'm looking for :)

#50 John Schultz   Members   -  Reputation: 807

Posted 17 May 2005 - 03:19 PM

Quote:
Original post by markf_gg
Quote:
Original post by John Schultz
If OOORS is of limited use, what other class of data beyond guaranteed and non-guaranteed do you see of value? I agree that the biggest problem is too much data sent as guaranteed, which is a network game design issue. I have not yet seen a strong argument for supporting other classes of data. Either the data absolutely has to get there, or it doesn't. Perhaps you can give an example where this is not true?

In this system, rather than having simulation objects "push" data events to clients, the objects simply mark themselves as having dirty states. When TNL decides it's time to send another packet to a particular client, it sorts dirty objects based on a user-supplied prioritization function and then writes object updates into the packet until the packet is full. Any dropped packets simply set the dirty state flags for that object for that client that were not subsequently updated in a later packet.


This makes sense for objects that are rapidly changing state when the system is not running lock-step (or don't require ordered state consistency). To date, I have not run into a problem where this method can provide significant bandwidth savings, but I'll keep it in mind as a future option.

Quote:
Original post by markf_gg
Quote:

The developer may not know that their implementation is poor until stressed under real-world internet conditions.

Well, they could always use a network technology that's been proven successful in AAA networked games back to oh, say 1998...


Given that this thread is titled MMORPG..., have you tested TNL with 2000-3000 player connections, under real-world internet conditions?

Quote:
Original post by markf_gg
Quote:

While you state that TNL supports a fixed bandwidth option, do you mean fixed maximum bandwidth? Does TNL also implement a filtered mean-deviation estimator to dynamically (and near optimally) adjust bandwidth for live internet conditions?

TNL does have an adaptive bandwidth option for connections, but it's fairly primitive at this point. In the products I've shipped with it (Starsiege: TRIBES and Tribes 2) we simply fixed the client bandwidth at 2kbytes/sec and 3kbytes/sec respectively. Due to the nature of the most-recent state data guarantee we always filled up each packet to the client. The resulting gameplay was of sufficient quality that we didn't bother attempting to adaptively adjust bandwidth settings on the fly, although we did allow clients to adjust the params slightly upwards if they had broadband connections.


Worst case scenario analysis for a MMORPG and 3000 very active players:

3kbytes/sec * 3000 players = 9000kbytes/sec, 72,000kbits/sec, 72Mbits/sec.

This means you'll probably have many fat pipes, as well as extra routing capabilities to deal with varying internet conditions. Given the unpredictability of network conditions, if the server does not actively adapt its bandwidth output, the system is going to fall apart (lots of data, lots of connections, lots of unpredictability). While this example isn't much of a proof, the bandwidth/complexity concepts come from studying the design of TCP, and why it is able to allow millions (billions) of connections to run relatively smoothly over a very complicated network (or web) of dataflows.

Quote:
Original post by markf_gg
I am currently looking at improving our adaptive rate code to more easily allow TNL's use in higher bandwidth, non-simulation applications. Can you recommend anything I should read on the subject? A near optimal filtered mean-deviation estimator sounds like it might be what I'm looking for :)


Van Jacobson's paper, Congestion Avoidance and Control, written in 1988, is an excellent starting point. The history is also fascinating: it describes a time when the early internet could collapse. In the almost 20 years since the paper was written, there have not been significant improvements (for all cases). These algorithms, as well as their variants, make up the core features of, you guessed it, TCP. To go back further in time, see RFC 793, written for DARPA in 1981.

Thus, I hope it is clear why I have been defending TCP*: it really is an excellent protocol. Some of its features are not ideal for games/simulations, but the RTO calculation (see appendix A in Jacobson's paper) is required if a custom protocol is to be used in a large-scale, real-world internet environment (such as a MMORPG). It's probable that the UDP-based MMORPGs that fail/fall apart do so because of poor bandwidth management.
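
For reference, the estimator from appendix A (the same form later standardized for TCP's retransmit timeout) is only a few lines; the constants are the usual 1/8 and 1/4 gains:

#include <cmath>

// Jacobson/Karels smoothed RTT and mean deviation; times in milliseconds.
struct RttEstimator {
    double srtt = 0;       // smoothed round-trip time
    double rttvar = 0;     // mean deviation of the RTT
    bool   first = true;

    double OnRttSample(double rtt) {
        if (first) {
            srtt = rtt;
            rttvar = rtt / 2;
            first = false;
        } else {
            double err = rtt - srtt;
            srtt   += err / 8;                        // gain 1/8
            rttvar += (std::fabs(err) - rttvar) / 4;  // gain 1/4
        }
        return srtt + 4 * rttvar;                     // retransmit timeout (RTO)
    }
};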

More links here.

In summary, study the history and design of TCP, and use the best feature(s) for custom game/simulation protocols, while leaving out (or retuning) features that hurt game/simulation performance.




* I believe this is the first paper to describe TCP, by Vinton Cerf and Robert Kahn in 1974, BSW (Before Star Wars ;-)). TCP/IP allowed ARPANET to become the Internet and later the World-Wide Web. Robert Kahn talks about TCP and the birth of UDP.

[Edited by - John Schultz on May 18, 2005 4:19:23 AM]

#51 markf_gg   Members   -  Reputation: 170

Posted 18 May 2005 - 05:06 AM

Quote:
Original post by John Schultz
This makes sense for objects that are rapidly changing state when the system is not running lock-step (or don't require ordered state consistency). To date, I have not run into a problem where this method can provide significant bandwidth savings, but I'll keep it in mind as a future option.

We've found that this method is useful for almost all simulation objects in the 3D (or 2D) world. Players, projectiles, vehicles, etc, whose updates constitute a substantial amount of the server->client communication. In reality it doesn't "save" bandwidth, rather it allows you to more optimally use the bandwidth that you have. Because you're not guaranteeing that any particular data are being delivered, the network layer has the latitude to prioritize updates on the basis of "interest" to a particular client. This results in a substantially more accurate presentation to the client for a given bandwidth setting.

Quote:

Given that this thread is titled MMORPG..., have you tested TNL with 2000-3000 player connections, under real-world internet conditions?

The largest "real-world" games we tested with were 100+ player "single zone" Tribes 2 servers. The problem domain is slightly different - i.e. Tribes was a twitch shooter with substantially more interactivity than most MMO type games, but it should translate well into the MMO type domain.

Quote:

Worst case scenario analysis for a MMORPG and 3000 very active players:

3kbytes/sec * 3000 players = 9000kbytes/sec, 72,000kbits/sec, 72Mbits/sec.

This means you'll probably have many fat pipes, as well as extra routing capabilities to deal with varying internet conditions. Given the unpredictability of network conditions, if the server does not actively adapt its bandwidth output, the system is going to fall apart (lots of data, lots of connections, lots of unpredictability).

It seems to me that an MMO service provider is going to want to make sure that it has enough bandwidth to handle peak load, with a healthy margin. It would be a trivial addition to TNL to allow a server-wide maximum bandwidth setting (i.e. 50 mbits/sec) and then adapt the fixed rates for all connections down as new clients connect.
Quote:

Thus, I hope it is clear why I have been defending TCP*: it really is an excellent protocol. Some of its features are not ideal for games/simulations, but the RTO calculation (see appendix A in Jacobson's paper) is required if a custom protocol is to be used in a large-scale, real-world internet environment (such as a MMORPG). It's probable that the UDP-based MMORPGs that fail/fall apart do so because of poor bandwidth management.

Yeah, TCP's got some good bandwidth adaptation features. But classifying all data as guaranteed makes for big simulation trouble when you inevitably run into packet loss.
Quote:

In summary, study the history and design of TCP, and use the best feature(s) for custom game/simulation protocols, while leaving out (or retuning) features that hurt game/simulation performance.

Good summary!

#52 John Schultz   Members   -  Reputation: 807

Posted 18 May 2005 - 06:48 AM

Quote:
Original post by markf_gg
Quote:
Original post by John Schultz
This makes sense for objects that are rapidly changing state when the system is not running lock-step (or don't require ordered state consistency). To date, I have not run into a problem where this method can provide significant bandwidth savings, but I'll keep it in mind as a future option.

We've found that this method is useful for almost all simulation objects in the 3D (or 2D) world. Players, projectiles, vehicles, etc, whose updates constitute a substantial amount of the server->client communication. In reality it doesn't "save" bandwidth, rather it allows you to more optimally use the bandwidth that you have. Because you're not guaranteeing that any particular data are being delivered, the network layer has the latitude to prioritize updates on the basis of "interest" to a particular client. This results in a substantially more accurate presentation to the client for a given bandwidth setting.


I send this type of data in the unreliable/non-guaranteed channel: it makes up most of the data transmitted as well. Data is added and compressed based on each player's viewspace (more compression for far away objects, really far away objects get updated less frequently, etc.). Classic dead reckoning tracks the simulation so that updates get sent only after a divergence threshold is met. That is, even if the data/object is marked 'dirty', an update is not sent unless the interpolated/extrapolated error is significant (the sender simulates what the receiver should be seeing).
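
A minimal sketch of that divergence test (the sender runs the same extrapolation the receiver will; names are illustrative):

#include <cmath>

struct TrackedObject {
    float px, py, pz;                    // true current position on the sender
    float sentPx, sentPy, sentPz;        // position in the last update sent
    float sentVx, sentVy, sentVz;        // velocity in the last update sent
    double sentTime;                     // when that update was sent
};

bool NeedsUpdate(const TrackedObject& o, double now, float threshold) {
    float dt = float(now - o.sentTime);
    // What the receiver should currently be extrapolating:
    float ex = o.sentPx + o.sentVx * dt;
    float ey = o.sentPy + o.sentVy * dt;
    float ez = o.sentPz + o.sentVz * dt;
    float dx = o.px - ex, dy = o.py - ey, dz = o.pz - ez;
    // Send a new update only when the error exceeds the divergence threshold.
    return std::sqrt(dx * dx + dy * dy + dz * dz) > threshold;
}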

The original statement:

Quote:

If OOORS is of limited use, what other class of data beyond guaranteed and non-guaranteed do you see of value? I agree that the biggest problem is too much data sent as guaranteed, which is a network game design issue. I have not yet seen a strong argument for supporting other classes of data. Either the data absolutely has to get there, or it doesn't. Perhaps you can give an example where this is not true?


asked if there were other useful classes of network data beyond reliable/guaranteed and unreliable/non-guaranteed. Your example uses the unreliable channel: it's a management layer between the network layer and the game layer.

Quote:

Quote:

Worst case scenario analysis for a MMORPG and 3000 very active players:

3kbytes/sec * 3000 players = 9000kbytes/sec, 72,000kbits/sec, 72Mbits/sec.

This means you'll probably have many fat pipes, as well as extra routing capabilities to deal with varying internet conditions. Given the unpredictability of network conditions, if the server does not actively adapt its bandwidth output, the system is going to fall apart (lots of data, lots of connections, lots of unpredictability).

It seems to me that an MMO service provider is going to want to make sure that it has enough bandwidth to handle peak load, with a healthy margin. It would be a trivial addition to TNL to allow a server-wide maximum bandwidth setting (i.e. 50 mbits/sec) and then adapt the fixed rates for all connections down as new clients connect.


Networks are unpredictable. You can also think of bandwidth adaptation as a form of fault tolerance. See Jacobson's paper (also see the graphs/data present in the other papers I referenced). The original version of TCP used a more naive approach (see Cerf & Kahn's 1974 paper and read their comments regarding bandwidth; it did not work well in practice, hence Jacobson's paper in 1988).

Quote:

Quote:

Thus, I hope it is clear why I have been defending TCP*: it really is an excellent protocol. Some of its features are not ideal for games/simulations, but the RTO calculation (see appendix A in Jacobson's paper) is required if a custom protocol is to be used in a large-scale, real-world internet environment (such as a MMORPG). It's probable that the UDP-based MMORPGs that fail/fall apart do so because of poor bandwidth management.

Yeah, TCP's got some good bandwidth adaptation features. But classifying all data as guaranteed makes for big simulation trouble when you inevitably run into packet loss.


I see the misunderstanding: I'm only proposing TCP/reliable-only when there is no choice (firewall issues, etc.). A TCP+UDP design can work fine (provided the UDP channel is bandwidth-managed as well). A custom UDP protocol should implement TCP-like (customized for games/simulations) bandwidth adaptation for both reliable and unreliable data (which will typically be sent in the same packet).

Quote:

Quote:

In summary, study the history and design of TCP, and use the best feature(s) for custom game/simulation protocols, while leaving out (or retuning) features that hurt game/simulation performance.

Good summary!


Thanks!

#53 PlayerX   Members   -  Reputation: 279

Posted 22 May 2005 - 10:28 AM

Guild Wars uses TCP exclusively. From what I know, there have been no major issues with lag in GW.

That being said, obviously a perfect UDP solution would be better than TCP. The problem is that it is very, very hard to come up with a perfect UDP solution. In fact, it is quite tricky to come up with a UDP solution that is better than native TCP. TCP has had years of evolution applied to it, and on the face of it, it is fairly straightforward to implement a UDP protocol that seems to work - until you throw several hundred clients on it.

Unless you really, really need the benefits of UDP, I'd suggest just going with TCP. It shaves weeks off your development time (unless you go with a middleware package) and it is often "good enough". Hell, it's "good enough" for World of Warcraft and Guild Wars.



#54 Afr0m@n   Members   -  Reputation: 100

Posted 26 May 2005 - 08:06 PM

Great thread! I vote for it being stickied! :D

PS: Personally, I use TCP for my current WoW-killer project. It's fast (as long as you don't send packets every frame, and don't update player positions to players outside a player's zone), and very reliable. :)

#55 anonuser   Members   -  Reputation: 148

Posted 01 June 2005 - 07:00 PM

First, I played AC (and AC2) for many years (months for AC2) and never once had a problem with lag. The only problem the game had was when too many players were in one area; then a portal storm would occur and take you out of the laggy area. Never a lag problem, so I don't know what AC you were playing.

Secondly, I'd go with UDP.

Build a reliable UDP protocol.

TCP has too much overhead for everything you need, not to mention the back-throttling issues talked about above.

#56 rzcodeman   Members   -  Reputation: 130

Posted 01 June 2005 - 11:40 PM

You should use RTP or TCP.
If you are deciding between UDP and TCP, I suggest you take the following issues into consideration:
1. Packet loss: UDP cannot guarantee delivery, so a client might send a critical message and it could drop on the way.
2. TCP is a relatively heavy protocol, so imagine a condition where you have 1000 concurrent online users who, in the case of role-playing games, will send messages almost every second. Using UDP makes sense there, as it never sends an ack for each receive or send.

I have never used RTP, but it seems to be a good replacement for TCP as it sends acks more lightly.

#57 anonuser   Members   -  Reputation: 148

Posted 02 June 2005 - 09:42 AM

Quote:
Original post by PlayerX
Guild Wars uses TCP exclusively. From what I know, there have been no major issues with lag in GW.

That being said, obviously a perfect UDP solution would be better than TCP. The problem is that it is very, very hard to come up with a perfect UDP solution. In fact, it is quite tricky to come up with a UDP solution that is better than native TCP. TCP has had years of evolution applied to it, and on the face of it, it is fairly straightforward to implement a UDP protocol that seems to work - until you throw several hundred clients on it.

Unless you really, really need the benefits of UDP, I'd suggest just going with TCP. It shaves weeks off your development time (unless you go with a middleware package) and it is often "good enough". Hell, it's "good enough" for World of Warcraft and Guild Wars.


Being "good enough" is what caused all the server problems with WoW. GW is not really an MMO, it spawns an instance of the world for each player / group so I imagine a lot more is left up the client in the case of GW, though I can't really be sure.

WoW is notorious for server problems.

Again, as a tribute to AC: AC servers were rarely down, and aside from over-populated places (sub in pre-marketplace days) you'd not notice any lag.



#58 rmsimpson   Members   -  Reputation: 228

Posted 04 June 2005 - 05:40 AM

Quote:
Original post by John Schultz
Does RakNet use TCP? If not, how do you see IOCP helping a UDP-only based protocol, especially if the server is single-threaded (for maximum performance due to zero (user-level, network) context switching)?
...[snip]...
It would appear that thread context switching overhead might outweigh kernel (paging) advantages with IOCP, especially given the nature of UDP (not using memory/queues as with TCP).


Why would you make a single-threaded server to begin with? You've typically got a bunch of entities to process, AI to manage, etc, and throwing your UDP receive loop into the same thread would aggravate performance to an unmanageable level, wouldn't it?

I've tried several UDP scenarios, such as:

- Having a primary thread to process game data, and a worker thread to receive and dispatch UDP packets (no async I/O)
- Same as above, using overlapped I/O and wait handles
- Using IOCP and a thread pool

I basically wrote a front end to blast packets between two machines on a 100BT network to see how many I lost, how far behind my programs got, etc. I also used a dual-xeon receiver and a single CPU sender, and vice-versa.

By far, and without question, the IOCP app ran the fastest, with the least CPU usage and the fewest lost packets. As a matter of fact, I was able to completely saturate a 100BT network with 1400-byte UDP packets to the dual-Xeon receiver with 0 lost packets and 0 backlog -- and using a fraction of the CPU's time.

None of the other methods I tried scaled up to utilize all available CPUs, nor did they keep up with massive throughputs over an extended period of time. They invariably began to backlog and lost tons of packets.

Oh, I also tried running the program on the same machine (used both the dual xeon and a single CPU machine) using the loopback address. With two programs running full-tilt (one receiving and one sending) only the IOCP solution was able to receive all the packets with 0 backlog and 0 lost packets.

The only "flow control" I implemented was to turn off the send buffer on the socket to ensure the network layer didn't discard my outgoing packet due to lack of buffer space to store it.

If anyone's interested, I'll dig out the source code for the IOCP method and toss up a link.

Robert Simpson
Programmer at Large


#59 John Schultz   Members   -  Reputation: 807

Posted 04 June 2005 - 07:32 AM

Quote:
Original post by rmsimpson
Quote:
Original post by John Schultz
Does RakNet use TCP? If not, how do you see IOCP helping a UDP-only based protocol, especially if the server is single-threaded (for maximum performance due to zero (user-level, network) context switching)?
...[snip]...
It would appear that thread context switching overhead might outweigh kernel (paging) advantages with IOCP, especially given the nature of UDP (not using memory/queues as with TCP).


Why would you make a single-threaded server to begin with? You've typically got a bunch of entities to process, AI to manage, etc, and throwing your UDP receive loop into the same thread would aggravate performance to an unmanageable level, wouldn't it?


In the case of 100% resource utilization, where processing incoming packets has the highest priority, it's clear that a single-threaded design should be the fastest: no thread context switching. This would be the limit case: it would not be possible to process packets more efficiently. When AI+physics+gamecode are factored in, if the incoming packet rate is very high, then packets will be dropped if the incoming buffer cannot be processed fast enough. If threads are used to process incoming packets, efficiency would be reduced due to context switching (unless the OS is doing something that improves efficiency).

Quote:
Original post by rmsimpson
I've tried several UDP scenarios, such as:

- Having a primary thread to process game data, and a worker thread to receive and dispatch UDP packets (no async I/O)
- Same as above, using overlapped I/O and wait handles
- Using IOCP and a thread pool

I basically wrote a front end to blast packets between two machines on a 100BT network to see how many I lost, how far behind my programs got, etc. I also used a dual-xeon receiver and a single CPU sender, and vice-versa.

By far, and without question, the IOCP app ran the fastest, with the least CPU usage and the fewest lost packets. As a matter of fact, I was able to completely saturate a 100BT network with 1400-byte UDP packets to the dual-Xeon receiver with 0 lost packets and 0 backlog -- and using a fraction of the CPU's time.

None of the other methods I tried scaled up to utilize all available CPUs, nor did they keep up with massive throughputs over an extended period of time. They invariably began to backlog and lost tons of packets.

Oh, I also tried running the program on the same machine (used both the dual xeon and a single CPU machine) using the loopback address. With two programs running full-tilt (one receiving and one sending) only the IOCP solution was able to receive all the packets with 0 backlog and 0 lost packets.

The only "flow control" I implemented was to turn off the send buffer on the socket to ensure the network layer didn't discard my outgoing packet due to lack of buffer space to store it.

If anyone's interested, I'll dig out the source code for the IOCP method and toss up a link.

Robert Simpson
Programmer at Large


This forum's moderator, Jon Watte (hplus), who works on a MMOG (There), stated that they tested various scenarios, including MP+multithreaded, and found that single-threaded was the most efficient. It's not clear if their tests were from MMOG testing, benchmarks, or both.

Your test/benchmark sounds cool: if you could post your benchmark(s) showing that (overlapped I/O+) IOCP+threaded(+MP, etc.) does something extraordinary for UDP in Win32, including a means to compare with single-threaded standard UDP sockets, network developers would be interested in running the benchmark (I can test on various Intel/AMD/MP hardware).

#60 rmsimpson   Members   -  Reputation: 228

Posted 04 June 2005 - 09:13 AM

Quote:
Original post by John Schultz
This forum's moderator, Jon Watte (hplus), who works on a MMOG (There), stated that they tested various scenarios, including MP+multithreaded, and found that single-threaded was the most efficient. It's not clear if their tests were from MMOG testing, benchmarks, or both.

Your test/benchmark sounds cool: if you could post your benchmark(s) showing that (overlapped I/O+) IOCP+threaded(+MP, etc.) does something extraordinary for UDP in Win32, including a means to compare with single-threaded standard UDP sockets, network developers would be interested in running the benchmark (I can test on various Intel/AMD/MP hardware).


It's been at least a year since I even looked at the code, but I'll blow the dust off and post a link to the benchmark program(s) I wrote.

Robert




