Handling UDP resends

This topic is 3255 days old which is more than the 365 day threshold we allow for new replies. Please post a new topic.

If you intended to correct an error in the post then please contact us.

Hey all, I've written a small network lib on top of UDP. Right now the resend interval is set to 500ms: if I don't receive an ACK from the other side, I resend the packet, then wait another 500ms before resending it again. The problem I see with that approach is that I'm forcing 500ms of lag on my players whenever a single packet is lost. My first idea was to determine the resend delay per client according to its ping, but ping measurement is never really accurate and varies a lot. On the other hand, I don't want my server to start resending packets like crazy and eating all my upload bandwidth. What approach would you guys recommend? The game is very fast-paced and projectile-intensive, uses a client/server architecture, and supports up to 16 players connected to the server at the same time.

Quote:
Original post by md_lasalle
Very good article. Some of his concepts don't apply to me because of my implementation, but I guess that measuring the average RTT of ACK packets would be a good start.


The most important concept you need to implement is asynchronous and unreliable sending. You cannot wait for an ack of every message. You keep sending regardless of acks, but maintain an in-transit buffer so you don't flood the peer. This is the most important part, since your individual simulations can progress even if one or several stall for any reason.

There is an alternate approach which can sometimes be used to minimize resends. Sequential packets can be encoded with a Reed-Solomon code (possibly combined with a convolutional code). Information from several packets is merged, which adds some overhead to each individual packet, but the original data can be reconstructed even when some packets never arrive.

Whether such an approach is practical has to be determined on a per-case basis.
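
To make the redundancy idea concrete, here is a minimal sketch that uses a single XOR parity packet per group instead of Reed-Solomon. It is a much weaker code (it repairs at most one lost packet per group and assumes equal-length packets), and the function names and group layout are illustrative assumptions, not anything from a real library.

```python
from functools import reduce


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """XOR two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def make_parity(packets: list) -> bytes:
    """Build one parity packet for a group of equal-length packets."""
    return reduce(xor_bytes, packets)


def recover_missing(received: dict, parity: bytes, group_size: int) -> dict:
    """Rebuild at most one missing packet of the group from the parity packet.

    `received` maps the index inside the group to the packet bytes.
    """
    missing = [i for i in range(group_size) if i not in received]
    if len(missing) > 1:
        raise ValueError("XOR parity can only repair one loss per group")
    if missing:
        # XOR of the parity with every packet that did arrive yields the lost one.
        rebuilt = reduce(xor_bytes, received.values(), parity)
        received = {**received, missing[0]: rebuilt}
    return received
```

The cost is one extra packet per group, so a group of four adds 25% to the send rate in exchange for never resending after a single loss.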

Quote:
Original post by Antheus
The most important concept you need to implement is asynchronous and unreliable sending. You cannot wait for an ack of every message. You keep sending regardless of acks, but maintain an in-transit buffer so you don't flood the peer. This is the most important part, since your individual simulations can progress even if one or several stall for any reason.


Yes, this is what I did: one physical UDP packet can contain multiple:
- safe user packets
- unsafe user packets
- safe user packet resends

When I say "user packet" I mean game layer packet, meaning the network lib is not game specific.


Quote:
Original post by hplus0603
The terms to google for include "windowing" and "RTT estimation." Both of which are implemented in TCP, by the way -- if you need sequential, in-order delivery, you might as well use TCP.


I have experience with TCP: my first game, which is widely played at the moment, is entirely TCP. What I have noticed, compared to the new version that will use UDP, is that TCP can take forever to detect and resend a lost packet, which happens quite often over wireless networks and with clients on poor ISP connections.

With the current implementation, I'm taking advantage of UDP by letting the game app level tag user packets as Safe/Unsafe depending on the need.

Thanks for the input so far.



Quote:
Yes, this is what I did: one physical UDP packet can contain multiple:


His suggestion was that you window the number of physical UDP packets you can have outstanding on the wire at the same time. Put a sequence number in each physical packet. When you send packets back, send acks for the last N sequence numbers you've received. When the sending end has sent N packets without receiving an ack for the first packet, start re-sending from that packet. As long as (packet send interval * N) is larger than (round trip time) this will be perfectly stable.

The size of "N" is your window size, and it sounds like your current implementation has N == 1, which is usually not the optimal window size.
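
A rough sketch of the sender side of such a window, assuming sequence numbers never wrap and that the transport itself lives elsewhere. `WindowedSender` and its method names are made up for illustration.

```python
class WindowedSender:
    """Allows up to `window` unacknowledged physical packets in flight."""

    def __init__(self, window: int):
        self.window = window
        self.next_seq = 0      # sequence number for the next new packet
        self.in_flight = {}    # seq -> payload, sent but not yet acked

    def can_send(self) -> bool:
        return len(self.in_flight) < self.window

    def send(self, payload: bytes) -> int:
        """Register a new packet and return the sequence number for its header."""
        if not self.can_send():
            raise RuntimeError("window full: wait for acks or resend")
        seq = self.next_seq
        self.in_flight[seq] = payload
        self.next_seq += 1
        return seq

    def on_acks(self, acked_seqs) -> None:
        """Drop every packet the peer reports having received."""
        for seq in acked_seqs:
            self.in_flight.pop(seq, None)

    def packets_to_resend(self) -> list:
        """Once the window is full, resend starting from the oldest unacked packet."""
        if self.can_send():
            return []
        return sorted(self.in_flight.items())
```

For sizing: at 20 packets per second (one every 50 ms) and a 150 ms round trip, the time to send N packets only exceeds the round trip once N is larger than 3.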

Currently, I don't resend physical packets, since they can contain both safe and unsafe data. If a safe user packet needs to be resent, it gets merged into the next physical packet to send, in front of the new data to be sent, until my maximum send size is reached (which is about 1200 bytes per packet on the xbox, and imposing the same limit on PC seems to be good enough for me, my packets rarely go over 600 bytes).

It might not be the best approach, but it is definitely running smoothly with 16 players and intense action, especially since I'm not sending at the same rate for every client on the map, so the upload on the server is distributed over multiple game logic frames.

Going from there, I'm still open to recommendations that fit my current design.

Quote:
Original post by md_lasalle
... (which is about 1200 bytes per packet on the xbox, and imposing the same limit on PC seems to be good enough for me, my packets rarely go over 600 bytes)


By this are you implying that your packets are not a uniform size? I listened to a GDC lecture by Ben Garney in which he says that you should favour packets of uniform size, as routers tend to handle these better than packets of varying sizes.

If you say your packets are 600bytes and you have 16 players with an update every 20th of a second then you have a 8.78kbs upstream usage, which is too much for the xbox360 as the limit is 8kbs.

What is your unit of re-send? Are you saying that if a user-level message has not been delivered in 500ms, that user-level message gets re-queued for physical packet inclusion?

In general, you want to track your physical packets with sequence numbers, and then "know" which user-level messages have "made it" or not by tracking which physical packets have made it across. In general, you also want to allow multiple physical packets to be outstanding before you get an ack from the other end. Once you have those two pieces implemented, doing "optimal" re-send of reliable messages into physical packets based on latency is mostly an exercise in accounting and adding/removing things onto keyed lists.

Perhaps this is what you're already doing, in which case you're doing fine. But then the original question makes no sense to me?
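
The bookkeeping described above might look roughly like this. It is a sketch under assumptions: every reliable message fits in a packet on its own, loss detection (timeout or ack gap) happens elsewhere, and `ReliableQueue` is a made-up name.

```python
from collections import deque


class ReliableQueue:
    """Maps physical packet sequence numbers to the reliable messages they carried."""

    def __init__(self):
        self.pending = deque()   # reliable messages waiting for a packet
        self.carried = {}        # packet seq -> list of messages inside it
        self.next_seq = 0

    def queue(self, message: bytes) -> None:
        self.pending.append(message)

    def build_packet(self, max_bytes: int):
        """Pack pending messages into one physical packet; returns (seq, messages)."""
        messages, used = [], 0
        while self.pending and used + len(self.pending[0]) <= max_bytes:
            msg = self.pending.popleft()
            messages.append(msg)
            used += len(msg)
        seq = self.next_seq
        self.next_seq += 1
        self.carried[seq] = messages
        return seq, messages

    def on_ack(self, seq: int) -> None:
        """The packet arrived, so every message inside it is delivered."""
        self.carried.pop(seq, None)

    def on_lost(self, seq: int) -> None:
        """The packet is presumed lost: put its messages back at the front, in order."""
        for msg in reversed(self.carried.pop(seq, [])):
            self.pending.appendleft(msg)
```

Because the ack refers to the physical packet, individual user-level messages need no sequence numbers of their own.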

Quote:
Original post by dmail
By this are you implying that your packets are not a uniform size? I listened to a GDC lecture by Ben Garney in which he says that you should favour packets of uniform size, as routers tend to handle these better than packets of varying sizes.


I just skimmed the PDF presentation (I don't have time to listen through the entire thing), but it doesn't seem to mention that.

Since UDP and IP are stateless, what does *varying* mean? There's nothing to compare to. The "connection" exists only within the context of the application.

As long as the packet size is under the MTU, I don't really see why that would be the case. Or is there some other context, something about the transport layer, the LAN, or a particular network stack?

He mentions it at about the 39-minute mark. He doesn't go into technical detail about why it happens, and at one point says "strange things happen", but he says it has to do with using a little bandwidth one second and then more the next, which triggers these "strange" packet drops and unresponsive routers.

Quote:
Original post by dmail
If you say your packets are 600bytes and you have 16 players with an update every 20th of a second then you have a 8.78kbs upstream usage, which is too much for the xbox360 as the limit is 8kbs.


I still need to optimize things with a PVS and whatnot :)


Quote:
Original post by hplus0603
What is your unit of re-send? Are you saying that if a user-level message has not been delivered in 500ms, that user-level message gets re-queued for physical packet inclusion?


Yes, the safe user packet gets included in the next outgoing physical packet.

Quote:
Original post by hplus0603
In general, you want to track your physical packets with sequence numbers, and then "know" which user-level messages have "made it" or not by tracking which physical packets have made it across.


I have sequence numbers, but instead of putting them on each physical packet, I put them on each user-level safe packet. Before sending and assigning a sequence number, I merge a few user-level safe packets together so I don't end up with too much overhead.

Quote:
Original post by hplus0603
In general, you also want to allow multiple physical packets to be outstanding before you get an ack from the other end. Once you have those two pieces implemented, doing "optimal" re-send of reliable messages into physical packets based on latency is mostly an exercise in accounting and adding/removing things onto keyed lists.

Perhaps this is what you're already doing, in which case you're doing fine. But then the original question makes no sense to me?


Yes this is what I'm doing, sorry if we misunderstood each other. My initial question is now irrelevant since it mostly depends on one's implementation.

At least I got some nice input from the discussion. I am now using the average round trip time, measured from the moment I send the safe packet to the moment I receive the ACK for it. I impose a certain threshold, and now it seems to behave really well without killing my server bandwidth.
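
For anyone wanting a starting point, one common way to turn RTT samples into a resend delay is the smoothed estimator TCP uses (an exponentially weighted average plus four times the smoothed deviation). The sketch below borrows that formula; it is not necessarily what is described above, and the clamp values (50 ms floor, 500 ms ceiling) are assumptions.

```python
class RttEstimator:
    """Smoothed RTT and resend timeout, in the style of TCP's estimator."""

    def __init__(self, min_timeout=0.05, max_timeout=0.5):
        self.srtt = None       # smoothed round trip time, in seconds
        self.rttvar = 0.0      # smoothed deviation of the samples
        self.min_timeout = min_timeout
        self.max_timeout = max_timeout

    def add_sample(self, rtt: float) -> None:
        """Feed one measurement: ack receive time minus packet send time."""
        if self.srtt is None:
            self.srtt = rtt
            self.rttvar = rtt / 2
        else:
            self.rttvar = 0.75 * self.rttvar + 0.25 * abs(self.srtt - rtt)
            self.srtt = 0.875 * self.srtt + 0.125 * rtt

    def resend_timeout(self) -> float:
        """Average plus a margin for jitter, clamped to the allowed range."""
        if self.srtt is None:
            return self.max_timeout
        timeout = self.srtt + 4 * self.rttvar
        return max(self.min_timeout, min(self.max_timeout, timeout))
```

Samples should only come from packets that were acked on their first transmission; an ack for a resent packet is ambiguous about which copy it answers.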

Quote:
Original post by md_lasalle
Quote:
Original post by dmail
If you say your packets are 600bytes and you have 16 players with an update every 20th of a second then you have a 8.78kbs upstream usage, which is too much for the xbox360 as the limit is 8kbs.


I still need to optimize things with a PVS and whatnot :)




Sorry that calculation is well off and was for one update per second not 20 which would require an upstream of 175kbs and that does not include the UDP overhead of 28bytes per packet. That is a hell of a lot of optimisation. :)
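
Spelling the arithmetic out: the figures in this thread work out if you assume 15 remote clients (16 players minus the host) and 1024-byte kilobytes. Both are inferences from the numbers, not something stated in the posts.

```python
def upstream_bytes_per_second(payload_bytes, clients, updates_per_second, header_bytes=0):
    """Server upload needed to send one packet to every client on every update."""
    return (payload_bytes + header_bytes) * clients * updates_per_second


KIB = 1024
once_per_second = upstream_bytes_per_second(600, 15, 1)
twenty_per_second = upstream_bytes_per_second(600, 15, 20)
with_headers = upstream_bytes_per_second(600, 15, 20, header_bytes=28)

print(f"{once_per_second / KIB:.2f} KiB/s")    # 8.79 KiB/s
print(f"{twenty_per_second / KIB:.2f} KiB/s")  # 175.78 KiB/s
print(f"{with_headers / KIB:.2f} KiB/s")       # 183.98 KiB/s
```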

Yep, quite challenging.

20 updates per second is overkill in most cases: that's why the PVS will come in handy.

Also, since the server upload is the culprit, the server is not updating Entity A to all players in the same game frame. It is distributed over multiple frames to try and keep a constant upload rate on the server, as opposed to big bursts of data to send in one frame at any time.
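
One possible way to spread sends over frames is a plain round-robin over the client list. This is only a sketch of the idea; `clients_for_frame` and the per-frame send count are hypothetical, and the scheme described above may well be per entity rather than per client.

```python
def clients_for_frame(frame: int, client_ids: list, sends_per_frame: int) -> list:
    """Pick which clients get a packet this frame, rotating through all of them."""
    if not client_ids:
        return []
    start = (frame * sends_per_frame) % len(client_ids)
    count = min(sends_per_frame, len(client_ids))
    return [client_ids[(start + i) % len(client_ids)] for i in range(count)]
```

With 15 clients, 5 sends per frame and a 60 fps game loop, every client gets a packet once every 3 frames (20 per second), and the server uploads the same amount each frame instead of bursting.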

Quote:
Original post by dmail
Quote:
Original post by md_lasalle
Quote:
Original post by dmail
If you say your packets are 600bytes and you have 16 players with an update every 20th of a second then you have a 8.78kbs upstream usage, which is too much for the xbox360 as the limit is 8kbs.


I still need to optimize things with a PVS and whatnot :)




Sorry that calculation is well off and was for one update per second not 20 which would require an upstream of 175kbs and that does not include the UDP overhead of 28bytes per packet. That is a hell of a lot of optimisation. :)


The 8kps limit is just to pass the TRC. It's more of a 'guide' than an imposed limit. But typically, a DSL upstream is between 128 and 512 kbps, that's 16 kB/s to 64 kB/s.

At 20 fps, for a 128kbps DSL, you get a limit of just over 800 bytes per frame. With 16 players, that drops down to 54 bytes per packet. With a 28-byte header overhead, you are left with 26 bytes of data! At 15 fps (one in every four updates of a 60 fps game), you are left with 44 bytes. That's not even considering voice.

For a good DSL, you are left with 263 bytes per packet, for 16 players (really, 15 clients), at 15 fps. That's a lot better, but add voice, and there would not be much left (voice encoding is what, 800 bytes / sec?).

It's tight...
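
The budget above can be reproduced with a few lines. The sketch assumes 1 kbps = 1024 bits per second and rounds down at each step, which is what makes the results match the 26, 44 and 263 byte figures; `payload_budget` is just an illustrative helper.

```python
def payload_budget(upstream_kbps, packets_per_second, clients, header_bytes=28):
    """Bytes of game data available per packet once headers are paid for."""
    bytes_per_second = upstream_kbps * 1024 // 8
    per_packet = bytes_per_second // packets_per_second // clients
    return per_packet - header_bytes


print(payload_budget(128, 20, 15))  # 26
print(payload_budget(128, 15, 15))  # 44
print(payload_budget(512, 15, 15))  # 263
```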


If your data load is small you could piggyback the previous send's data on every new packet sent. That would eliminate a great deal of the lost data.
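
A minimal sketch of the piggyback idea, with a made-up 4-byte header (sequence number plus length of the new payload) and no handling of sequence wraparound. Anything after the new payload is treated as the copy of the previous packet's payload.

```python
import struct

HEADER = struct.Struct("!HH")  # sequence number, length of the current payload


def build_packet(seq: int, current: bytes, previous: bytes = b"") -> bytes:
    """Send the new payload followed by a copy of the previous packet's payload."""
    return HEADER.pack(seq, len(current)) + current + previous


def read_packet(packet: bytes, last_seq_seen: int):
    """Return the payloads still needed, oldest first, as (seq, payload) pairs."""
    seq, length = HEADER.unpack_from(packet)
    body = packet[HEADER.size:]
    current, previous = body[:length], body[length:]
    fresh = []
    if previous and seq - 1 > last_seq_seen:
        fresh.append((seq - 1, previous))   # the earlier packet was lost: recover it
    if seq > last_seq_seen:
        fresh.append((seq, current))
    return fresh
```

This roughly doubles the payload per packet, so it only makes sense while packets are small compared to the bandwidth budget.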

You could have your clients constantly sending back an ack list of received packets at a more frequent interval. Usually this will develop into some kind of windowed reliability scheme (which can get quite complicated).


Remember that some data, like position updates, actually becomes useless in the interval (when the next update is sent within a short time). If you are using that kind of data then you can differentiate it from 'event' and state update data, which has to be reliably delivered (sometimes the update data is the majority, and the reliable data is less time-dependent and thus can tolerate slower correction).

Quote:
Original post by dmail
He mentions it at about the 39-minute mark. He doesn't go into technical detail about why it happens, and at one point says "strange things happen", but he says it has to do with using a little bandwidth one second and then more the next, which triggers these "strange" packet drops and unresponsive routers.


The issue is that routers tend to allocate bandwidth based on who's using it. If you are regularly using a certain amount of bandwidth then it assumes you will keep using that much. If you are jumpy (ie you use 5kb/sec for a while then jump to 40kb/sec then back down) then it introduces extra lag as the router has to reallocate bandwidth to meet the sudden demand. In some cases, the router may not have enough storage to keep the packets around until they can be sent and will drop them. In other words, user is playing game, something exciting happens, and suddenly user has a big lag spike - if you don't budget a fixed data rate.

In low-bandwidth situations, having a fixed stream size also lets the system fail fast. That is, if the user doesn't have the bandwidth to play the game, it's probably better for them to not be able to play it at all than be able to join, then have it stall out 10 seconds into gameplay.

I don't have a lot of hard data to back these particular claims up which is why I kind of skimmed over it in my talk. It did help for Tribes1/2, which were widely played, so there's at least anecdotal evidence. If you're really serious about it you would want to do testing across a long link. (Connect to a server in Europe, say.) Otherwise it'll be hard to get reliable results.

I'm working on a project at the moment where the bandwidth constraint means each packet is allowed only a 13-byte payload. That's pretty tight, even when you pull out the heavy weapons ;-) (i.e. hand-tuned data compression, delta compression, etc.)

Because of the tight bandwidth constraint, we have developed a method - amongst others - of reliable packet delivery, where you specify the bandwidth to be used for the resend. If you're working to a 64K bandwidth budget and you have 100 bytes per second free for resends then it will use no more than 100 bytes per second for the resend.

The math / logic couldn't be simpler. Given the size of the packet and the number of bytes to be used per second, you can calculate how frequently to resend the packet.
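
In code the calculation is a single division. Whether the 28 bytes of UDP/IP headers count against the resend budget is an assumption made here; the function name is made up.

```python
def resend_interval(packet_bytes: int, resend_bytes_per_second: float) -> float:
    """Seconds to wait between resends so they stay inside the bandwidth budget."""
    if resend_bytes_per_second <= 0:
        raise ValueError("resend budget must be positive")
    return packet_bytes / resend_bytes_per_second


# 13-byte payload plus 28 bytes of UDP/IP headers, 100 bytes per second for resends
print(resend_interval(13 + 28, 100))  # 0.41
```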

Be sparing ;-)

Sorry for resurrecting this, just got back from vacation.

Quote:
Original post by ben_garney
The issue is that routers tend to allocate bandwidth based on who's using it. If you are regularly using a certain amount of bandwidth then it assumes you will keep using that much. If you are jumpy (ie you use 5kb/sec for a while then jump to 40kb/sec then back down) then it introduces extra lag as the router has to reallocate bandwidth to meet the sudden demand. In some cases, the router may not have enough storage to keep the packets around until they can be sent and will drop them. In other words, user is playing game, something exciting happens, and suddenly user has a big lag spike - if you don't budget a fixed data rate.


If this is true, then I will be padding my packets with some resends until I reach a minimum amount of data, probably based on the max number of players in the game.

Thanks for the input.



Quote:
Original post by dmail
This is the method which many UDP based games that want reliable delivery use, the author also has other articles which may interest you.



I did my own reliable protocol much like that years ago (bitmapped ack windows, flow control, etc.) and it worked fairly well (for what I was using it for). I did it somewhat differently, usually resending when the ACK window indicated a packet was missing (for 'reliable' mode data, versus a separate stream of reliable-unneeded data).
I also had large packets that were parts of long data blocks which had to be sent intact, and frequent bi-directional traffic, so a stream of ack data was constantly traveling in both directions (many games have a majority of the traffic headed out to a client).
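
For readers who haven't seen one, a bitmapped ack window can be sketched as follows. The 32-bit width and the bit layout (bit i stands for packet latest - 1 - i) are assumptions for illustration, and sequence wraparound is ignored.

```python
ACK_BITS = 32


def build_ack(latest_seq: int, received: set) -> int:
    """Bit i is set when packet (latest_seq - 1 - i) has been received."""
    bits = 0
    for i in range(ACK_BITS):
        if latest_seq - 1 - i in received:
            bits |= 1 << i
    return bits


def missing_from_ack(latest_seq: int, bits: int, in_flight: set) -> list:
    """Packets the sender should resend: covered by the bitmap but not marked."""
    missing = []
    for i in range(ACK_BITS):
        seq = latest_seq - 1 - i
        if seq in in_flight and not bits & (1 << i):
            missing.append(seq)
    return sorted(missing)
```

Each packet then carries the latest sequence number seen plus the bitmap, so every ack is repeated in many packets and a lost ack rarely matters.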

The author says to have the application rebuild more data instead of resending the lost packets. That needs more explanation (and examples, as that gets complicated). It might be as simple as having a flag on the packets to prevent resends because the content is data which is constantly updated (versus single-event data that isn't sent again). You may have to segregate reliable vs reliable-unneeded data into different packets.

The application might (depending on the process flows) have to do a lot of work to reconstitute the 'patched' (versus 'resent') data, so automatically using that as a strategy might not fit all usages.

Another scheme I've heard of (when the data isn't large) is to piggyback the last data load onto the subsequent packet, which can cut out a majority of resend scenarios (single packet loss), as long as the bandwidth isn't impacted.
