Tree Penguin

Quick UDP/TCP question

Washu    7829
No. In stream mode, a socket treats the data as a byte stream, so you can read it a small piece at a time. Datagram protocols (of which UDP is one) require you to read each datagram in a single call. If the buffer you pass is smaller than the datagram, the rest of that datagram is dropped.

With stream-based sockets you end up with one socket per client and tend to call send/recv. With datagram-based sockets you typically end up with a single socket and call sendto/recvfrom, passing sendto the destination address and getting the source address back from recvfrom.
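
The read-it-all-at-once behaviour of datagrams can be seen on a loopback socket. The following is a minimal sketch, assuming a POSIX system (Winsock reports an oversized datagram as an error instead); the `TruncationResult` struct and `demo_truncation` function are hypothetical names used only for illustration.

```cpp
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>
#include <sys/time.h>
#include <sys/types.h>
#include <unistd.h>
#include <cstddef>
#include <string>

struct TruncationResult {
    long first_len = -1;   // bytes returned by the undersized read
    std::string second;    // payload returned by the following read
};

// Sends two datagrams over loopback, reads the first one with a buffer
// that is too small, then reads again with a large buffer.
TruncationResult demo_truncation() {
    TruncationResult result;
    int rx = ::socket(AF_INET, SOCK_DGRAM, IPPROTO_UDP);
    int tx = ::socket(AF_INET, SOCK_DGRAM, IPPROTO_UDP);

    sockaddr_in addr{};
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr.sin_port = htons(0);                 // let the OS pick a free port
    socklen_t addr_len = sizeof(addr);
    sockaddr* sa = reinterpret_cast<sockaddr*>(&addr);

    timeval timeout{2, 0};                    // never block forever on a lost datagram
    bool ready = rx >= 0 && tx >= 0 &&
                 ::bind(rx, sa, sizeof(addr)) == 0 &&
                 ::getsockname(rx, sa, &addr_len) == 0 &&
                 ::setsockopt(rx, SOL_SOCKET, SO_RCVTIMEO, &timeout, sizeof(timeout)) == 0;
    if (ready) {
        const std::string first(20, 'A');     // 20-byte datagram
        const std::string second = "second";  // 6-byte datagram
        ::sendto(tx, first.data(), first.size(), 0, sa, sizeof(addr));
        ::sendto(tx, second.data(), second.size(), 0, sa, sizeof(addr));

        char small[8];                        // too small for the first datagram
        result.first_len = ::recvfrom(rx, small, sizeof(small), 0, nullptr, nullptr);

        char big[64];                         // the next read starts at the next datagram
        ssize_t n = ::recvfrom(rx, big, sizeof(big), 0, nullptr, nullptr);
        if (n > 0) result.second.assign(big, static_cast<std::size_t>(n));
    }
    if (rx >= 0) ::close(rx);
    if (tx >= 0) ::close(tx);
    return result;
}
```

The undersized read consumes the whole first datagram even though only part of it fits in the buffer, so the second read returns the second datagram rather than the leftover bytes. A TCP socket given the same small buffer would simply hand back the remaining bytes on the next recv.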

xor    516
Absolutely not.

UDP is a connectionless protocol, as opposed to TCP.
UDP is an unreliable protocol: a packet is not guaranteed to reach its destination, and packets are not guaranteed to arrive in order, as opposed to TCP.

There is a lot more you need to change.

Washu    7829
Quote:
Original post by xor
and is not guaranteed to get there uncorrupted, as opposed to TCP.


Eh, UDP is checksummed. A corrupted packet fails checksum verification and is dropped, unless the corruption happens to leave the checksum unchanged. The chances of getting corrupted data unintentionally are...slim, although that still doesn't mean you shouldn't handle that case, as someone might try to exploit your networking code to gain access to your machine. Note that the checksum is optional over IPv4: the sender can disable it by setting the checksum field in the UDP header to 0.

John Schultz    811
UDP packets can get corrupted by bugs in routers that, for example, fragment packets, and upon reassembly, compute a new checksum. If there is an error in any step, the packet checksum will be valid, but the data it is computed from will not be valid.

For voice data, etc., it may not be catastrophic to get a bad UDP packet, but for game data, it could cause the game to crash. An arithmetic (additive) checksum is not a very strong method to detect errors in a block of data: a CRC (polynomial division) is much stronger. In the same way a checksum can be recomputed on invalid data, so can a CRC be recomputed by an attacker after modifying a packet. Thus, to try to prevent hacking, a cryptographic hash must be used (the data itself should also be encrypted).
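
To make the weakness of an additive checksum concrete, here is a minimal sketch of the 16-bit ones' complement sum that the Internet protocols use. It is computed over the payload bytes only; the real UDP checksum also covers a pseudo-header and the UDP header, which this sketch leaves out. The function name `internet_checksum` is just an illustrative choice.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// 16-bit ones' complement checksum over a byte buffer.
// The 32-bit accumulator is large enough for any buffer up to UDP's 64 KB limit.
std::uint16_t internet_checksum(const std::vector<std::uint8_t>& data) {
    std::uint32_t sum = 0;
    // Add the data as big-endian 16-bit words.
    for (std::size_t i = 0; i + 1 < data.size(); i += 2)
        sum += (static_cast<std::uint32_t>(data[i]) << 8) | data[i + 1];
    // An odd trailing byte is padded with a zero byte.
    if (data.size() % 2 != 0)
        sum += static_cast<std::uint32_t>(data.back()) << 8;
    // Fold the carries back into the low 16 bits.
    while (sum >> 16)
        sum = (sum & 0xFFFFu) + (sum >> 16);
    return static_cast<std::uint16_t>(~sum);
}
```

Because addition is commutative, any reordering of 16-bit words produces the same checksum: the buffers `{0x12, 0x34, 0xAB, 0xCD}` and `{0xAB, 0xCD, 0x12, 0x34}` are indistinguishable to this function. A CRC depends on the position of each bit, so it catches that kind of damage.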

Washu    7829
Yes, but such bugs are rare. Things like that tend to get caught and fixed because they are serious problems. I didn't say you shouldn't expect corrupted data; I said the chances were slim.

John Schultz    811
The following statement implies that a checksum is sufficient for corruption detection, and that additional corruption checking is only necessary in the case of hack-prevention:

Quote:
Original post by Washu
The chances of getting corrupted data unintentionally are...slim, although that still doesn't mean you shouldn't handle that case, as someone might try to exploit your networking code to gain access to your machine.


I would agree with the following statement:

Quote:

While the chances of getting corrupted data unintentionally are slim, one should still handle that case, as such errors do occasionally occur via bad routers and buggy software/drivers. In cases where someone might try to exploit your networking code to gain access to your machine, one should use a cryptographic-strength hash.


Again, an arithmetic additive checksum is a relatively weak method for detecting errors in blocks of data. A CRC or stronger method should be used in any application where fault tolerance is important: the protection is worth far more than the (minuscule) bandwidth cost of a few extra bytes. Link layers can provide CRC protection, but it's up to the application to ensure the integrity of its data.
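
As a rough illustration of what application-level validation can look like, the sketch below appends a CRC-32 trailer to each outgoing packet and checks it on receipt. The 4-byte big-endian trailer layout and the names `seal_packet` and `open_packet` are assumptions made for this example, not a format described in the thread. The CRC is the common reflected CRC-32 with polynomial 0xEDB88320, computed bit by bit for brevity.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

using Bytes = std::vector<std::uint8_t>;

// Bitwise CRC-32 (reflected, polynomial 0xEDB88320) over the first len bytes.
std::uint32_t crc32_ieee(const Bytes& data, std::size_t len) {
    std::uint32_t crc = 0xFFFFFFFFu;
    for (std::size_t i = 0; i < len; ++i) {
        crc ^= data[i];
        for (int bit = 0; bit < 8; ++bit)
            crc = (crc >> 1) ^ (0xEDB88320u & (0u - (crc & 1u)));
    }
    return ~crc;
}

// Returns the payload followed by its CRC-32 as a 4-byte big-endian trailer.
Bytes seal_packet(const Bytes& payload) {
    Bytes packet = payload;
    const std::uint32_t crc = crc32_ieee(payload, payload.size());
    for (int shift = 24; shift >= 0; shift -= 8)
        packet.push_back(static_cast<std::uint8_t>(crc >> shift));
    return packet;
}

// Verifies the trailer. On success copies the payload into payload_out and
// returns true; on any mismatch returns false so the caller can drop the packet.
bool open_packet(const Bytes& packet, Bytes& payload_out) {
    if (packet.size() < 4) return false;      // too short to hold a trailer
    const std::size_t body = packet.size() - 4;
    std::uint32_t stored = 0;
    for (std::size_t i = 0; i < 4; ++i)
        stored = (stored << 8) | packet[body + i];
    if (stored != crc32_ieee(packet, body)) return false;
    payload_out.assign(packet.begin(), packet.end() - 4);
    return true;
}
```

This only guards against accidental corruption. As noted earlier in the thread, an attacker who modifies a packet can recompute the CRC, so tamper resistance needs a cryptographic method instead.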

For example, in the case of an MMOG, customer support costs outweigh the additional bandwidth required to validate data. With faulty routers and/or malicious hackers in the path, the application will drop the bad packets instead of crashing. Bad routers and active hackers can only be dealt with over time, but crashing applications can be prevented up front.

Another example: at the launch of Xbox Live, faulty routers were found in Japan. The short-term solution was to reduce the UDP packet size so that the packets were never fragmented. Since XBL packets are encrypted and authenticated, the apps would not crash, though the corrupted packets would get dropped.
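
The thread does not say which packet size was chosen, but the arithmetic behind "small enough not to fragment" is simple. The sketch below assumes IPv4 with a 20-byte header (no options) and the 8-byte UDP header; the helper names are hypothetical.

```cpp
#include <cstddef>

constexpr std::size_t kIpv4HeaderBytes = 20;  // assumes no IP options
constexpr std::size_t kUdpHeaderBytes = 8;

// Largest UDP payload that fits in one IP packet for the given path MTU.
constexpr std::size_t max_unfragmented_payload(std::size_t path_mtu) {
    return path_mtu > kIpv4HeaderBytes + kUdpHeaderBytes
               ? path_mtu - kIpv4HeaderBytes - kUdpHeaderBytes
               : 0;
}

// Number of datagrams needed to carry message_bytes without fragmentation.
constexpr std::size_t datagrams_needed(std::size_t message_bytes, std::size_t path_mtu) {
    return max_unfragmented_payload(path_mtu) == 0
               ? 0
               : (message_bytes + max_unfragmented_payload(path_mtu) - 1) /
                     max_unfragmented_payload(path_mtu);
}
```

With the usual Ethernet MTU of 1500 bytes this gives a 1472-byte payload limit. A conservative application that assumes only the 576-byte datagram size every IPv4 host must accept would stay at or below 548 bytes of payload.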

John Schultz    811
Quote:
Original post by OrangyTang
TCP only does a checksum as well, so the possibility of getting corrupted data from both is roughly the same.


Right, so when you download a multi-megabyte archive via TCP and it fails the CRC check (or MD5 hash, etc.), you download it again. This happens frequently enough that large downloads are sometimes broken up into smaller files.

This raises another point: TCP data should also be validated with a CRC or stronger method.

Tree Penguin    262
Wow, thanks for all the replies.

First of all, I know about the fragmentation of large (a little over one KB, IIRC) datagrams. I also know that datagrams can be received in a different order than they were sent, because packets can take different routes. I also know that when datagrams are dropped (which can occur at any time), the app isn't notified, and that if a datagram gets fragmented (due to its size) and one of the fragments isn't received, the whole datagram is dropped.

What I really wanted to know is whether or not the initialization code is the same. I guess so now, as no one really said anything about that.

Anyway, thanks everyone and thanks Washu for noting the sendto and recvfrom methods.

Cheers!

John Schultz    811
Quote:
Original post by Tree Penguin
What I really wanted to know is whether or not the initialization code is the same. I guess so now, as no one really said anything about that.


UDP:
s = ::socket(AF_INET,SOCK_DGRAM,IPPROTO_UDP);

TCP:
s = ::socket(AF_INET,SOCK_STREAM,IPPROTO_TCP);
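
Beyond the socket() call itself, the server-side setup differs by one step. The following is a minimal sketch, assuming POSIX sockets (on Windows, Winsock additionally needs WSAStartup and closesocket); the helper names are hypothetical and the loopback address with an OS-chosen port is used only to keep the example self-contained.

```cpp
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>
#include <unistd.h>

// Binds fd to 127.0.0.1 on a port chosen by the OS. Returns true on success.
bool bind_loopback(int fd) {
    sockaddr_in addr{};
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr.sin_port = htons(0);
    return ::bind(fd, reinterpret_cast<sockaddr*>(&addr), sizeof(addr)) == 0;
}

// UDP: socket() + bind() is all the setup; the socket is then ready for
// recvfrom()/sendto() with any number of peers.
int open_udp_server() {
    int s = ::socket(AF_INET, SOCK_DGRAM, IPPROTO_UDP);
    if (s < 0) return -1;
    if (!bind_loopback(s)) { ::close(s); return -1; }
    return s;
}

// TCP: socket() + bind() + listen(); each client then gets its own socket
// from accept(), which is where send()/recv() happen.
int open_tcp_server() {
    int s = ::socket(AF_INET, SOCK_STREAM, IPPROTO_TCP);
    if (s < 0) return -1;
    if (!bind_loopback(s) || ::listen(s, SOMAXCONN) != 0) { ::close(s); return -1; }
    return s;
}
```

On the client side the difference is similar: a TCP client must connect() before sending, while a UDP client can call sendto() right after socket().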

hplus0603    11356
Your data goes through so much checking in the various layers that the weakness of the UDP checksum, in practice, just doesn't matter. You should be concerned about hackers sending malformed packets, and about making sure you deal correctly with those packets. If you deal well with those, then by inference you'll deal well with all kinds of malformed packets, no matter what the cause.

I.e., you can probably be sure that what you receive is what the sender intended for you to receive. You can NOT, however, be sure that the sender is someone who has your best interests at heart. In fact, you should assume the reverse -- for every packet you receive.
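
In code, assuming the reverse means never trusting a field until it has been checked against what actually arrived. The sketch below uses a made-up wire format (1-byte type, 2-byte big-endian length, then the body) purely for illustration; the format, the limits, and the names `ParsedMessage` and `parse_message` are all hypothetical.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

struct ParsedMessage {
    std::uint8_t type = 0;
    std::vector<std::uint8_t> body;
};

constexpr std::size_t kHeaderBytes = 3;    // type (1) + length (2)
constexpr std::size_t kMaxBodyBytes = 512; // hypothetical application limit
constexpr std::uint8_t kMaxType = 3;       // hypothetical highest valid type

// Parses one received datagram. Returns false for anything malformed so the
// caller can drop the packet instead of acting on untrusted values.
bool parse_message(const std::uint8_t* data, std::size_t len, ParsedMessage& out) {
    if (data == nullptr || len < kHeaderBytes) return false;   // truncated header
    const std::uint8_t type = data[0];
    const std::size_t claimed = (static_cast<std::size_t>(data[1]) << 8) | data[2];
    if (type > kMaxType) return false;                          // unknown message type
    if (claimed > kMaxBodyBytes) return false;                  // absurd length
    if (claimed != len - kHeaderBytes) return false;            // length field lies
    out.type = type;
    out.body.assign(data + kHeaderBytes, data + len);
    return true;
}
```

The important check is the comparison between the length the packet claims and the number of bytes recvfrom actually returned. Copying `claimed` bytes without that check is the classic way a hostile packet reads or writes past the end of a buffer.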

Tree Penguin    262
Ok, thanks all. Still got one question though:

With TCP/IP you have a server and a client: the client connects to the server (which has opened a listening port). Once connected, the server can also send messages back to the client.

How is this properly done using UDP? I know it can be done if the client becomes a server (opens a listening port) of its own, but that would raise several issues when the client is behind a router without a proper NAT configuration.

Should it be done in a different way or is there a way to solve the router problems?

Thanks in advance!

rmsimpson    228
Typically a server is on a fixed IP address, and opens a UDP socket on a fixed predetermined port #.

The client opens a UDP socket on any available port and uses sendto() to send a message to the server. The server's IP address and port are predetermined, as above, and are passed as parameters to sendto().

The server receives a packet using recvfrom(), which retrieves both the message and the IP/port the message came from. To reply, the server merely sends a message back using the IP/port it got from recvfrom().

It's extremely simple on the surface. A server can receive messages from literally hundreds of thousands of different senders at a time without the underlying transport layer having to do any management of them.
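
The exchange described above can be sketched in a few lines. This is an illustration under stated assumptions, not production code: it uses POSIX sockets, runs both ends in one thread over loopback, and lets the OS choose the server port instead of using a fixed, predetermined one. The names `open_server`, `echo_once`, and `round_trip` are hypothetical.

```cpp
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>
#include <sys/time.h>
#include <sys/types.h>
#include <unistd.h>
#include <cstddef>
#include <string>

// Opens a UDP socket bound to 127.0.0.1 and reports the address it got.
int open_server(sockaddr_in& bound) {
    int fd = ::socket(AF_INET, SOCK_DGRAM, IPPROTO_UDP);
    if (fd < 0) return -1;
    bound = sockaddr_in{};
    bound.sin_family = AF_INET;
    bound.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    bound.sin_port = htons(0);
    socklen_t len = sizeof(bound);
    sockaddr* sa = reinterpret_cast<sockaddr*>(&bound);
    if (::bind(fd, sa, sizeof(bound)) != 0 || ::getsockname(fd, sa, &len) != 0) {
        ::close(fd);
        return -1;
    }
    return fd;
}

// Server step: receive one datagram and reply to the address recvfrom reported.
bool echo_once(int server_fd) {
    char buf[1500];
    sockaddr_in from{};
    socklen_t from_len = sizeof(from);
    sockaddr* sa = reinterpret_cast<sockaddr*>(&from);
    ssize_t n = ::recvfrom(server_fd, buf, sizeof(buf), 0, sa, &from_len);
    if (n < 0) return false;
    return ::sendto(server_fd, buf, static_cast<std::size_t>(n), 0, sa, from_len) == n;
}

// Client step plus one server step; returns the reply the client received.
std::string round_trip(const std::string& message) {
    sockaddr_in server{};
    int server_fd = open_server(server);
    int client_fd = ::socket(AF_INET, SOCK_DGRAM, IPPROTO_UDP);  // no bind needed
    std::string reply;
    if (server_fd >= 0 && client_fd >= 0) {
        timeval timeout{2, 0};  // UDP gives no delivery guarantee, so never wait forever
        ::setsockopt(server_fd, SOL_SOCKET, SO_RCVTIMEO, &timeout, sizeof(timeout));
        ::setsockopt(client_fd, SOL_SOCKET, SO_RCVTIMEO, &timeout, sizeof(timeout));
        ::sendto(client_fd, message.data(), message.size(), 0,
                 reinterpret_cast<sockaddr*>(&server), sizeof(server));
        if (echo_once(server_fd)) {
            char buf[1500];
            ssize_t n = ::recvfrom(client_fd, buf, sizeof(buf), 0, nullptr, nullptr);
            if (n > 0) reply.assign(buf, static_cast<std::size_t>(n));
        }
    }
    if (server_fd >= 0) ::close(server_fd);
    if (client_fd >= 0) ::close(client_fd);
    return reply;
}
```

Note that the client never binds or listens. The first sendto() gives its socket a local port, and the server learns that port from recvfrom(), which is all it needs in order to reply.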

As for NAT routers, here's how I think they work -- I'm pretty sure this is what they do based on logical deduction. I haven't actually researched it, but I know they work and games do in fact play just fine over UDP/NAT, so this is logically how they must work:

All UDP packets carry the source IP address and port they came from. When a NAT router sees a computer behind the NAT sending a UDP packet to someone outside, it rewrites the packet's source address and port so the recipient has a valid public address to reply to. It also temporarily records a mapping from that public address and port back to the client's internal IP and port. When the router sees an incoming UDP packet addressed to the mapped public port, it forwards the packet to the client.

The router's UDP mapping is temporary and expires if no traffic flows over that port for an extended period of time.

Robert

markr    1692
For NAT purposes, a UDP session is treated exactly like a TCP connection, except that it has no explicit start or end and no state, so the NAT router just has to keep the session's NAT rules in place until some timeout period has expired.

Again, unlike TCP, in UDP, any packet can be the start of a session (there is no SYN flag) - the session-management is entirely application-specific.

As long as the client and server don't make any assumptions about the other one's apparent and local address being the same, it should work.

Basically, the rule is: always send packets back to where they (apparently) came from, not anywhere else. The specific no-no is sending IP addresses or port numbers inside the data.
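
Since session management is left to the application, a server typically keeps its own table of peers, keyed by the apparent source address that recvfrom reports and never by an address carried inside the payload. The sketch below is one hypothetical way to do that; the `SessionTable` class, the millisecond timestamps, and the choice of timeout are assumptions for illustration, and the caller is expected to supply time from a monotonic clock.

```cpp
#include <cstddef>
#include <cstdint>
#include <map>
#include <utility>

// A peer is identified by the source IPv4 address and port seen on the wire.
using PeerKey = std::pair<std::uint32_t, std::uint16_t>;

class SessionTable {
public:
    explicit SessionTable(std::uint64_t timeout_ms) : timeout_ms_(timeout_ms) {}

    // Records a packet from peer at time now_ms.
    // Returns true if this packet starts a new session.
    bool touch(const PeerKey& peer, std::uint64_t now_ms) {
        const bool is_new = last_seen_.find(peer) == last_seen_.end();
        last_seen_[peer] = now_ms;
        return is_new;
    }

    // Removes sessions that have been idle longer than the timeout.
    // Returns the number of sessions removed.
    std::size_t expire(std::uint64_t now_ms) {
        std::size_t removed = 0;
        for (auto it = last_seen_.begin(); it != last_seen_.end();) {
            if (now_ms > it->second && now_ms - it->second > timeout_ms_) {
                it = last_seen_.erase(it);
                ++removed;
            } else {
                ++it;
            }
        }
        return removed;
    }

    std::size_t size() const { return last_seen_.size(); }

private:
    std::uint64_t timeout_ms_;
    std::map<PeerKey, std::uint64_t> last_seen_;
};
```

Because the NAT router expires its own mapping after a period of silence, a client that wants to stay reachable usually sends a small keepalive packet at an interval shorter than that timeout. The thread does not give a figure for it, and it varies between routers.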

Mark
