30 seconds to disaster: Why you should NEVER use TCP for multiplayer games

70 comments, last by Kreso 18 years, 2 months ago
A long time ago, when I started working on Galactic Magnate ( http://www.galacticmag.com/ ), I decided to use TCP. Why? Well, it was just what I needed: in-order data delivery. I don't like third-party solutions because they take some control away from me, and I have had problems with third-party software before. But I thought: this is TCP. It's used everywhere. Nothing can possibly go wrong… wrong!

A few months ago I launched Galactic Magnate. Soon the database was full of logs of strange network failures: clients would get disconnected for no apparent reason. For every 15 connections to the server, there was one strange disconnect. I was puzzled, but I thought: maybe users have unreliable internet connections, and this is normal. I tried to investigate it several times, but came to no conclusion.

But recently, something strange happened: the server's internet connection started failing. It looks like there was something wrong with the network adapter or the wiring. The error manifested itself in the following way: the machine would lose its connection to the internet for about one minute, and after that it would continue as if nothing had happened. This occurred about eight times a day.

This had an effect on the Galactic Magnate server that I couldn't explain. When the connection fails (for one minute), all the users get disconnected from the server. If I am playing the game, I get disconnected too. But if I am telnetted to the server at the same time, the telnet connection doesn't break. Notice that the behavior of telnet is normal; it is what I expected: the TCP stack has to keep trying to resend the packets until the connection is working again. But why did all the Galactic Magnate TCP connections break down? And why does this error affect Galactic Magnate, but not telnet?

Without any clues, I decided to run a sniffer. The sniffer (on the client side) showed the client trying to send data, and then the connection going down.
Since the connection is down, this data is never acknowledged by the other side, so the TCP stack (Winsock) tries to resend the data. And now the most important thing: how many retries does it make, and for how long? Or to put it more simply: how long is the TCP send timeout? The answer is: 30 seconds.

But why didn't telnet fail? The answer is simple: telnet is idle most of the time and sends no data over the TCP connection. If the internet connection fails (completely) and telnet is idle, it can sometimes take a few hours for the TCP stack to detect this. The TCP stack detects a connection failure 30 seconds after it actually tries to send data. If no data is sent, the failure is not detected for a long time.

So what are the consequences of this behavior of the TCP stack? Well, most TCP connections don't last long anyway (for example: web browsing), so 30 seconds is OK. On the other hand, most long TCP connections (like telnet) are idle most of the time, so they will survive a temporary internet failure. But what about games? Bad luck. Games usually use long connections, lasting at least half an hour. Also, games are rarely idle. So if the internet connection is down for more than 30 seconds, clients that use TCP will get disconnected.

How often does this happen (how often do internet connections stall for more than 30 seconds)? Well, usually, the server's internet connection is stable. But your users will be connecting over dialup, ADSL, cable, etc… and their connections do stall often. Remember what I said at the start of the article: for every 15 connections there is one disconnect (with one more important parameter: this holds when the average connection lasts 30 minutes).

So the conclusion is: don't use TCP, or you will be seeing a lot of disconnects…

A few notes:

- So far, this has only been tested on Winsock (Windows XP). Other operating systems may have different IP stacks.
- I also tried to find out whether the timeout interval can somehow be changed to more than 30 seconds. I browsed through the MSDN documentation and found nothing. I posted a question on alt.programming.winsock, but received no answer. I found an old post on alt.programming.winsock asking the same question, but again with no answer.
- I will continue to investigate this, because I'm interested in what the timeout is on the Linux IP stack.
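
To put rough numbers on the "how many retries, and for how long?" question: TCP stacks generally wait longer before each successive retransmission (exponential backoff), so the time until the stack gives up depends on the initial timer and the number of retries. The Python sketch below is a toy model only, not Winsock's actual algorithm; the function name `give_up_after`, the 1-second initial timer, and the count of 4 retransmissions are all made-up illustration values.

```python
def give_up_after(initial_rto, retransmissions):
    """Seconds from the first send until the sender gives up.

    Toy model (an assumption, not any real stack's behavior): the
    retransmission timer starts at initial_rto seconds and doubles after
    every retransmission. The connection is dropped when the timer that
    follows the last retransmission expires without an acknowledgement.
    """
    total = 0.0
    rto = initial_rto
    # One wait after the original send, plus one wait after each retransmission.
    for _ in range(retransmissions + 1):
        total += rto
        rto *= 2
    return total


# Hypothetical inputs: 1 s initial timer, 4 retransmissions.
print(give_up_after(1.0, 4))  # 1 + 2 + 4 + 8 + 16 = 31.0
```

With these assumed inputs the sender gives up after 31 seconds, which is in the same range as the 30 seconds reported above, but that match comes from the chosen inputs and not from any documented setting.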
Doesn't World of Warcraft use TCP?
Any suggestions on another protocol to use (e.g. UDP)?


TCP would be good for certain parts: logon, small areas of the map, etc. I don't know why you would do that, though.
Hello?
Quote: Original post by Alpha_ProgDes
Any suggestions on another protocol to use (e.g. UDP)?


I'm planning to switch to ENet. In its header file, I saw that this dreaded timeout can be configured. ENet uses UDP, but also provides in-order packet delivery.
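
For readers wondering how in-order delivery can sit on top of UDP at all, here is a minimal Python sketch of the general idea: the sender numbers each datagram, and the receiver holds back anything that arrives early. This is not ENet's API or implementation; the class `InOrderReceiver` and its method names are hypothetical, and retransmission of lost datagrams is deliberately left out.

```python
class InOrderReceiver:
    """Releases payloads in sequence order, whatever order they arrive in.

    Hypothetical illustration: sequence numbers are assumed to start at 0
    and to be assigned by the sender, one per datagram.
    """

    def __init__(self):
        self.next_seq = 0   # next sequence number the game is waiting for
        self.pending = {}   # early arrivals, keyed by sequence number

    def receive(self, seq, payload):
        """Accept one datagram; return the payloads that are now deliverable."""
        if seq < self.next_seq or seq in self.pending:
            return []  # duplicate of something already seen, ignore it
        self.pending[seq] = payload
        ready = []
        # Drain every consecutive payload starting at next_seq.
        while self.next_seq in self.pending:
            ready.append(self.pending.pop(self.next_seq))
            self.next_seq += 1
        return ready


receiver = InOrderReceiver()
print(receiver.receive(1, "move B"))  # [] (still waiting for 0)
print(receiver.receive(0, "move A"))  # ['move A', 'move B']
```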
Hmm, isn't this just basic stuff that should've been researched when the game was developed? This doesn't sound like a particularly new revelation, it's one of the common knowledge downsides to not using a connectionless protocol like UDP. The reason a lot of games do use UDP is because we don't care if packets are dropped - if a packet is guaranteed delivery and a user's connection stalls, who cares? The packet has arrived too late to be of much use to most games, so it will be discarded anyway. The difference is that TCP insists on trying to deliver that packet, whereas UDP doesn't care if it never gets there and just drops it, as we want anyway.

Of course, UDP doesn't guarantee that a packet arrives intact, which can be a problem, but it's not hard to implement something like that, as well as a few other TCP features that UDP lacks.
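
As a rough illustration of what "implementing something like that" on top of UDP can look like, the Python sketch below tracks unacknowledged packets and decides when to resend them and when to give up. The point is that the give-up time becomes an application setting. Everything here is hypothetical (the class `ResendTracker`, its method names, and the 1-second and 120-second values), and the clock is passed in as a plain number so the logic can be followed without real sockets.

```python
class ResendTracker:
    """Tracks unacknowledged packets for a do-it-yourself reliable layer."""

    def __init__(self, resend_interval, give_up_after):
        self.resend_interval = resend_interval  # seconds between resends
        self.give_up_after = give_up_after      # seconds until we declare failure
        self.unacked = {}                       # seq -> (first_sent, last_sent)

    def sent(self, seq, now):
        """Record that packet `seq` was first sent at time `now`."""
        self.unacked[seq] = (now, now)

    def acked(self, seq):
        """The other side confirmed `seq`; stop tracking it."""
        self.unacked.pop(seq, None)

    def poll(self, now):
        """Return (to_resend, timed_out) lists of sequence numbers."""
        to_resend, timed_out = [], []
        for seq, (first, last) in sorted(self.unacked.items()):
            if now - first >= self.give_up_after:
                timed_out.append(seq)
            elif now - last >= self.resend_interval:
                to_resend.append(seq)
                self.unacked[seq] = (first, now)  # remember this resend
        for seq in timed_out:
            del self.unacked[seq]
        return to_resend, timed_out


# Assumed settings: resend every second, give up only after two minutes.
tracker = ResendTracker(resend_interval=1.0, give_up_after=120.0)
tracker.sent(0, now=0.0)
print(tracker.poll(now=60.0))   # ([0], []) -> still retrying after a 1-minute stall
print(tracker.poll(now=120.0))  # ([], [0]) -> now treated as a dead connection
```

With a give-up time of two minutes, the one-minute outages described in the first post would cause resends but no disconnect.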

A lot of games I've seen recently use TCP for guaranteed delivery of things like login credentials and such, but the actual real-time game data is passed back and forth over UDP, which makes a lot of sense.
You can also instantly send at max speed, whereas TCP ramps up...
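
The "ramps up" remark refers to TCP slow start, where the amount of data in flight begins small and grows with every round trip. A back-of-the-envelope Python model, assuming the window simply doubles each round trip and ignoring packet loss, thresholds, and the receiver's window; the function name and the numbers are illustrative only:

```python
def round_trips_to_send(segments, initial_window=1):
    """Round trips needed to send `segments` if the window doubles every trip.

    Simplified model of slow start (an assumption for illustration):
    no loss, no slow-start threshold, no receiver window limit.
    """
    window = initial_window
    sent = 0
    trips = 0
    while sent < segments:
        sent += window   # send a full window this round trip
        window *= 2      # window doubles for the next one
        trips += 1
    return trips


print(round_trips_to_send(100))                      # 7  (1+2+4+8+16+32+64 >= 100)
print(round_trips_to_send(100, initial_window=100))  # 1  (everything at once)
```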
Xest wrote:
> Hmm, isn't this just basic stuff that should've been researched when the
> game was developed?

And how exactly should I have known that a lot of connections stall for more than 30 seconds, and that this would cause problems? I thought the developers of Winsock had:
- chosen sensible defaults
- made it possible to change the defaults when necessary

Even GameDev's FAQ says that TCP is suitable for multiplayer games. I quote:
"Briefly, if you're turn-based, you should go with TCP because it's simpler"

I don't understand what exactly you expected me to search for.

> This doesn't sound like a particularly new revelation, it's one of the
> common knowledge downsides to not using a connectionless protocol like UDP.

If it's common knowledge, point me to where it says that Winsock has a 30-second timeout. It's certainly not in its documentation on MSDN.

> The reason a lot of games do use UDP is because we don't care if
> packets are dropped - if a packet is guaranteed delivery and a user's
> connection stalls, who cares? The packet has arrived too late to be of
> much use to most games, so it will be discarded anyway.

No, this does not apply to turn-based strategy games, like Galactic Magnate.

> the difference is TCP insists on trying to deliver that packet whereas
> UDP doesn't care if it never gets there and just drops it as we want anyway.

The problem is obviously that TCP doesn't 'insist' strongly enough.

> Of course, UDP doesn't guarantee that a packet arrives intact, which can
> be a problem, but it's not hard to implement something like that, as well
> as a few other TCP features that UDP lacks.

Why should I reinvent the wheel? I thought TCP does this for me.
Hmm, no need to get so aggressive over it, I was just trying to provide some pointers to help you out.

> If it's common knowledge, point me to where it says that Winsock has a 30-second timeout. It's certainly not in its documentation on MSDN.

I'm not sure it's a Winsock-specific thing; it may well be OS-specific or even a standard part of TCP. It's been a while since I've looked in depth at this type of thing.

> The problem is obviously that TCP doesn't 'insist' strongly enough.

For what purpose? Unfortunately, not everything can be developed to suit everyone's specific needs. Whilst it may be problematic for you, it's going to be absolutely fine for a lot of other developers with different needs.

> Why should I reinvent the wheel? I thought TCP does this for me.

This is rather contradictory to your situation, on one hand you accept that TCP isn't doing the job you need it to do, yet at the same time you argue that TCP should fill your every specific need. My point was that if you build up from UDP which is a very basic protocol you can build onto it the features you want whilst avoiding the features you don't want, and that's key to solving your problem here.
> Hmm, no need to get so aggressive over it, I was just trying
> to provide some pointers to help you out.

Sorry if I was aggressive, I didn't mean to be. I never got used to this Usenet notion of 'aggressive'.

> > The problem is obviously that TCP doesn't 'insist' strongly enough.

> For what purpose?

For the purpose of multiplayer games, and especially turn-based strategy games.

> This is rather contradictory to your situation, on one hand you accept
> that TCP isn't doing the job you need it to do, yet at the same time
> you argue that TCP should fill your every specific need. My point was
> that if you build up from UDP which is a very basic protocol you can
> build onto it the features you want whilst avoiding the features you
> don't want, and that's key to solving your problem here.

My point is that I didn't know that TCP wouldn't fill my needs, and I also realized that other developers may run into the same problem because it is not obvious (in my opinion). And I didn't keep it all to myself: I shared what I know on this forum, in the hope that it will be useful to others.

Of course, now I know that I should have used UDP.

This topic is closed to new replies.
