tom_mai78101

[Question] UDP Checksum Calculation Theory: Pseudo-Header and how to get the values for them.

tom_mai78101    693
Given this information from most sources searched via Google:


[quote]To calculate UDP checksum a "pseudo header" is added to the UDP header. This includes:

IP Source Address 4 bytes
IP Destination Address 4 bytes
Protocol 2 bytes
UDP Length 2 bytes

The checksum is calculated over all the octets of the pseudo header, UDP header and data.
If the data contains an odd number of octets a pad, zero octet is added to the end of data.
The pseudo header and the pad are not transmitted with the packet.
[/quote]


I have to split the IP Source Address and IP Destination Address into two 16-bit words each, then add them together with the Protocol and the UDP Length to form a sum S, then take the 1's complement of S to get the checksum.

[img]http://i1207.photobucket.com/albums/bb464/tom_mai78101/Untitled-1.png[/img]


So, in this packet, the highlighted part is the UDP header, split into 4 groups of 2 bytes. In order, they are Source Port, Destination Port, Length, and Checksum.

According to the quote above, I should get the Source and Destination IP addresses, the protocol number and the UDP length field:

Source IP Address (in hex, located at address 0x0022): 3D E3 7F 45
Destination IP Address (located at address 0x0026): A8 5F C0 01
Protocol (located at address 0x001F): 11
UDP Length (located at address 0x002E): 00 33

Then I add those up as 16-bit words, and I get this for the sum S: 0x25CC

Complementing S does not give the checksum C (located at 0x0030): D0 7A

What have I done wrong? I don't understand which 16-bit segments I should be splitting up, other than the UDP pseudo-header I built. I also noticed the protocol should be 2 bytes, yet all I have is 1 byte, so I don't know where the zero padding byte goes.
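
For reference, here is a small sketch of the arithmetic on just the four pseudo-header fields listed above, assuming the byte values read from the capture are correct. The helper name `fold16` is made up for illustration. It suggests two things: the plain sum overflows 16 bits, so the carry has to be added back in rather than dropped, and even then the result cannot match the packet's checksum, because the UDP header and the data have not been added yet.

```python
# Pseudo-header fields as read from the capture above (assumed correct).
words = [
    0x3DE3, 0x7F45,  # source IP, split into two 16-bit words
    0xA85F, 0xC001,  # destination IP, split into two 16-bit words
    0x0011,          # zero byte followed by the protocol byte (17 = UDP)
    0x0033,          # UDP length
]

def fold16(total):
    """Add any carry above bit 15 back into the low 16 bits (hypothetical helper)."""
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return total

raw = sum(words)       # plain integer sum, wider than 16 bits
partial = fold16(raw)  # one's-complement sum of the pseudo header only

print(hex(raw))      # 0x225cc -> dropping the leading 2 gives the 0x25CC above
print(hex(partial))  # 0x25ce  -> carry added back in
```

The value 0x25CE is only a partial sum. The words of the UDP header (with the checksum field set to zero) and of the data still have to be added before complementing.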

tom_mai78101    693
[quote name='rip-off' timestamp='1305725583' post='4812506']
The UDP checksum is handled automatically by the lower networking layers. Do you want to add an additional layer of checksumming on top?
[/quote]


Of course not. I don't want to get bogged down by it. :rolleyes: But I would like to understand the math behind it, which unfortunately I haven't got a good grasp of. :( All I need is the correct way of calculating the UDP checksum in the Transport Layer, so that I can create a simple protected 1-to-1 server for my homework. My professor thinks I'm working too hard on this, but I like to give myself a bit of a challenge.

hplus0603    11356
[quote name='tom_mai78101' timestamp='1305725930' post='4812510']
[quote name='rip-off' timestamp='1305725583' post='4812506']
The UDP checksum is handled automatically by the lower networking layers. Do you want to add an additional layer of checksumming on top?
[/quote]


Of course not. I don't want to get bogged down by it. :rolleyes: But I would like to understand the math behind it, which unfortunately I haven't got a good grasp of. :( All I need is the correct way of calculating the UDP checksum in the Transport Layer, so that I can create a simple protected 1-to-1 server for my homework. My professor thinks I'm working too hard on this, but I like to give myself a bit of a challenge.
[/quote]

First, how are you generating these packets? Using raw IP sockets?

Second, multi-byte fields in these headers are in network byte order, which is BIG ENDIAN. This means that each 16-bit word has the high-order byte first, so "protocol" is a zero byte followed by the IP protocol byte. This is well illustrated in the Wikipedia article on UDP datagrams:

[url="http://en.wikipedia.org/wiki/User_Datagram_Protocol"]http://en.wikipedia.org/wiki/User_Datagram_Protocol[/url]

Meanwhile, the padding always goes at the end, so the last byte is actually the high byte of the last word in this case!

I found some C code to calculate UDP checksums using a [url="http://lmgtfy.com/?q=udp+header+checksum"]two-second Google[/url] search. You may want to read it for reference:
[url="http://www.netfor2.com/udpsum.htm"]http://www.netfor2.com/udpsum.htm[/url]
Although, it being free on the internet from some random guy, it may or may not be authoritative :-)
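
As a rough illustration of the two points above (byte order and where the pad byte goes), here is a minimal sketch. The function names `pseudo_header` and `to_words` are invented for this example, and the addresses are placeholder documentation addresses, not values from the capture.

```python
import struct

def pseudo_header(src_ip, dst_ip, udp_length, protocol=17):
    """Pack the 12-byte pseudo header in network (big-endian) byte order."""
    # "!" = network byte order; 4s 4s = the two addresses as raw bytes,
    # B B = zero byte then protocol byte, H = 16-bit UDP length.
    return struct.pack("!4s4sBBH", src_ip, dst_ip, 0, protocol, udp_length)

def to_words(data):
    """Split bytes into big-endian 16-bit words, zero-padding an odd tail."""
    if len(data) % 2:
        data += b"\x00"  # pad goes at the end, so the last real byte is the HIGH byte
    return [(data[i] << 8) | data[i + 1] for i in range(0, len(data), 2)]

# Placeholder addresses 192.0.2.1 and 198.51.100.7, UDP length 0x33.
ph = pseudo_header(bytes([192, 0, 2, 1]), bytes([198, 51, 100, 7]), 0x33)
print(ph.hex())                            # c0000201c633640700110033
print([hex(w) for w in to_words(b"abc")])  # ['0x6162', '0x6300']
```

Note how the protocol shows up as the word `0011`, and how the odd trailing byte `0x63` becomes `0x6300` rather than `0x0063`.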

tom_mai78101    693
[quote name='hplus0603' timestamp='1305730781' post='4812540']
First, how are you generating these packets? Using raw IP sockets?

Second, multi-byte fields in these headers are in network byte order, which is BIG ENDIAN. This means that each 16-bit word has the high-order byte first, so "protocol" is a zero byte followed by the IP protocol byte. This is well illustrated in the Wikipedia article on UDP datagrams:

[url="http://en.wikipedia.org/wiki/User_Datagram_Protocol"]http://en.wikipedia....tagram_Protocol[/url]

Meanwhile, the padding always goes at the end, so the last byte is actually the high byte of the last word in this case!

I found some C code to calculate UDP checksums using a [url="http://lmgtfy.com/?q=udp+header+checksum"]two-second Google[/url] search. You may want to read it for reference:
[url="http://www.netfor2.com/udpsum.htm"]http://www.netfor2.com/udpsum.htm[/url]
Although, it being free on the internet from some random guy, it may or may not be authoritative :-)
[/quote]


I capture these packets in Wireshark while running "nslookup" from the command prompt. I don't use raw IP sockets. I don't understand the C code because I don't understand the theory behind it, that is, the walkthrough of how to calculate the checksum, as described below. This is why I am asking everyone: reading code sometimes still confuses me when I have no prior knowledge of computer networking.

Again, back to the theory: you have a pseudo-header, with 4 bytes for the source IP address, 4 bytes for the destination IP address, 2 bytes for the protocol, and 2 bytes for the UDP packet's length. The checksum is calculated over all the octets of the pseudo header, UDP header and data. [color="#ff0000"]Tell me if I'm wrong on this one.
[/color]
1. You split the entire pseudo-header into 1-byte units, with the remaining space filled with zeroes. Then you add them up. (True/False)
2. After obtaining that sum, you also split the UDP header into 1-byte units and add all of them to the sum. (True/False)
3. After obtaining the new sum from above, you also split all of the data into 1-byte units and add all of those octets to the sum. (True/False)
4. Then you take the 1's complement of the sum, and you have obtained the UDP checksum.

Please check. :)

rip-off    10979
The definitive source is the [url="http://tools.ietf.org/html/rfc768"]RFC for UDP[/url].

From skimming that, you seem to be totally incorrect, apart from step 4. The header is "chunked" into groups of 16 bits, and there is only a single zero byte added to the data in the case where there aren't enough 16 bit groups (i.e. an odd sized packet).
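
To make the 16-bit grouping concrete, here is a minimal sketch of the one's-complement checksum over an arbitrary byte string. The name `internet_checksum` is made up here, and the test bytes are the sample values from the worked example in RFC 1071, not from this thread.

```python
def internet_checksum(data: bytes) -> int:
    """One's-complement checksum over big-endian 16-bit words."""
    if len(data) % 2:
        data += b"\x00"  # a single zero byte, only when the length is odd
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]     # one 16-bit group
        total = (total & 0xFFFF) + (total >> 16)  # end-around carry
    return ~total & 0xFFFF                        # 1's complement, kept to 16 bits

print(hex(internet_checksum(bytes.fromhex("0001f203f4f5f6f7"))))  # 0x220d
print(hex(internet_checksum(b"\x01")))                            # 0xfeff
```

For UDP, the bytes passed in would be the pseudo header, then the UDP header with its checksum field zeroed, then the data.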

tom_mai78101    693
[quote name='rip-off' timestamp='1305734944' post='4812572']
The definitive source is the [url="http://tools.ietf.org/html/rfc768"]RFC for UDP[/url].

From skimming that, you seem to be totally incorrect, apart from step 4. The header is "chunked" into groups of 16 bits, and there is only a single zero byte added to the data in the case where there aren't enough 16 bit groups (i.e. an odd sized packet).
[/quote]


Exactly. I always see this diagram, shown below:

[source lang="cpp"]
//  0      7 8     15 16    23 24    31
// +--------+--------+--------+--------+
// |          source address           |
// +--------+--------+--------+--------+
// |        destination address        |
// +--------+--------+--------+--------+
// |  zero  |protocol|   UDP length    |
// +--------+--------+--------+--------+
[/source]


I don't know how to add these up. I could never understand what the RFC meant by "obtaining the sum of the pseudo header ([color="#0000ff"]shown above[/color]), the UDP header ([color="#00ff00"]the entire UDP header or parts of it?[/color]), and the data ([color="#ff0000"]again, all of it??[/color])". I did the calculation and it wasn't correct when I complemented it.

Do I just do:

[color="#4169e1"]NOT(source address + destination address + ([zero] [protocol] [UDP length])) [/color]

to get the checksum?

Or do you split them into 2-octet (16-bit) words:

[color="#4169e1"](First half of the source address) + (Second half of the source address) + (First half of the destination address) + (Second half of the destination address) + ([zeroes] and [protocol] blocks together) + (UDP length)?[/color]
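
A quick experiment on the two forms, using the pseudo-header values from the first post (assumed to be read correctly from the capture). The second form, with 16-bit words, is the one RFC 768 describes. The first form only agrees with it if every carry above bit 15 is folded back into the low 16 bits; `fold16` is a hypothetical helper name.

```python
def fold16(total):
    """Fold carries above bit 15 back into the low 16 bits until none remain."""
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return total

# Form 1: add the three 32-bit rows of the diagram as they are.
rows32 = [0x3DE37F45, 0xA85FC001, 0x00110033]

# Form 2: add the six 16-bit halves.
words16 = [0x3DE3, 0x7F45, 0xA85F, 0xC001, 0x0011, 0x0033]

print(hex(sum(rows32)))           # 0xe6543f79 (not a 16-bit value yet)
print(hex(fold16(sum(rows32))))   # 0x25ce
print(hex(fold16(sum(words16))))  # 0x25ce
```

Either way, this covers the pseudo header only. The NOT is applied once, at the very end, after the UDP header and data words have been added too.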

hplus0603    11356
[quote name='tom_mai78101' timestamp='1305734074' post='4812566']
reading code sometimes still confuses me when I have no prior knowledge of computer networking
[/quote]

To put it another way: If this was a "file" rather than a "network packet," the algorithm and code would be the same, so if you can't read the C code, then you also can't read C code that calculates checksums for files. And if this was a "memory stream" instead of a "network packet," then you can't read C code that calculates checksums for memory streams.
If you don't understand the C code, which is just plain C code -- it sounds as if you need to improve in C programming! Given that the difference between bytes and 16-bit (half)words is one of the fundamentals of C programming, and memory layout and data alignment in general is highly important to any real C program, you should understand C enough to read that snippet before you start looking at systems programming topics (like network programming, or device drivers, or file formats, or whatever).

tom_mai78101    693
[quote name='hplus0603' timestamp='1305772857' post='4812855']
To put it another way: If this was a "file" rather than a "network packet," the algorithm and code would be the same, so if you can't read the C code, then you also can't read C code that calculates checksums for files. And if this was a "memory stream" instead of a "network packet," then you can't read C code that calculates checksums for memory streams.
If you don't understand the C code, which is just plain C code -- it sounds as if you need to improve in C programming! Given that the difference between bytes and 16-bit (half)words is one of the fundamentals of C programming, and memory layout and data alignment in general is highly important to any real C program, you should understand C enough to read that snippet before you start looking at systems programming topics (like network programming, or device drivers, or file formats, or whatever).
[/quote]

So, where in a packet do we start adding 16-bit words? (At what hexadecimal address should I start reading the data, and in what order should I add the words?)

It's not that I don't know C programming; it's the theory behind the calculation that I'm missing. I'm interested in the theory of the UDP checksum, and in calculating checksums by hand from raw datagram packets. I dislike reading the C code purely because it doesn't tell me where in the UDP packet to start reading, or at what address.

It's like you're confusing me with someone who can't read C code, which is really a bad assumption, since throughout the whole thread I was asking how to obtain the checksum by hand, by theory, by mathematical means, instead of relying on a program and taking its result. And I didn't say that I wanted to program it. :(

[color="#ff0000"]I feel like I'm being misunderstood.[/color] To clear this up: I don't want the code, I want step-by-step instructions on how to compute the checksum by hand, when all you have is a UDP packet captured in Wireshark after running "nslookup" in the command prompt.


hplus0603    11356
[quote name='tom_mai78101' timestamp='1305790343' post='4812906']
[color="#ff0000"]I feel like I'm being misunderstood.[/color] To clear this up: I don't want the code, I want step-by-step instructions on how to compute the checksum by hand, when all you have is a UDP packet captured in Wireshark after running "nslookup" in the command prompt.
[/quote]

And I pointed you at C source code that does exactly that. If you feel that you cannot translate C source code to step-by-step instructions, then I don't know what to say. C source code is explicitly and by necessity a step-by-step instruction for how to do it.

I also pointed at the Wikipedia article that explains the process in very clear detail, and others and I have mentioned the most obvious causes of mis-calculation (byte order, 16-bit words, how to treat the padding byte).

If, given these instructions, you cannot calculate a UDP checksum, then I don't know what to do. The code is already written for you! The algorithm is described in detail! The typical gotchas are called out already!

tom_mai78101    693
[b][size="5"]MAJOR EDIT:[/size][/b]

[color="#ff0000"]Dear hplus0603:[/color] I finally found it, at noon on May 20th, 2011. I would like to show what I had been trying to ask people for. These are the step-by-step instructions I have been looking for since Wednesday.

:)

[url="http://irw.ncut.edu.tw/peterju/internet.html"]Source.[/url]


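Notes:
1. The UDP pseudo header takes the Source Address, Destination Address, and Protocol from the IP header.
2. UDP Length appears once in the pseudo header and once in the UDP header, so it is added twice.
3. The data is added one word (2 bytes) at a time; if an odd byte is left over at the end, pad it with a zero byte.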

[source lang="cpp"]
// ------- BEGIN Ethernet HEADER ----------------------------
00 09 5b 4f 64 72 // Destination HW address
00 07 95 e7 79 2d // Source HW address
08 00 // EtherType (0x0800 = IPv4 payload)
// ------- BEGIN IP HEADER ----------------------------
45 // Version (4) and header length (5 x 4 = 20 bytes)
00 // Differentiated Services field
00 38 // Total Length (IP header + payload, in bytes)
5d 02 // Identification (used to reassemble fragments)
00 00 // Flags and Fragment Offset
80 // Time To Live (in network hops)
11 // Protocol (0x11 = 17 = UDP)
33 d6 // IP Header Checksum
c0 a8 00 02 // Source IP
c0 f6 28 3c // Destination IP
// ------- BEGIN UDP HEADER ---------------------------
6d 38 // Source Port
6d 2e // Destination Port
00 24 // Length (UDP header + data, in bytes)
29 b5 // UDP Checksum
ff ff ff ff // Marker (first 4 bytes of the packet data)
67 65 74 73 65 72 76 65 72 73 20 38 32 20 66 75 6c 6c 20 65 6d 70 74 79 // Packet Data
[/source]
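
On the question of which address to start reading from: a rough sketch that pulls the checksum inputs out of the frame above by offset. It assumes a plain 14-byte Ethernet II header in front of the IP header; the variable names are made up for illustration. The capture in the first post reports the protocol byte at 0x001F rather than 0x0017, so that frame evidently has extra bytes ahead of the IP header, which is why the link header length is kept as a separate value here.

```python
frame = bytes.fromhex(
    "00095b4f6472" "000795e7792d" "0800"          # Ethernet header (14 bytes)
    "4500" "0038" "5d02" "0000" "80" "11" "33d6"  # IP header, first 12 bytes
    "c0a80002" "c0f6283c"                         # source IP, destination IP
    "6d38" "6d2e" "0024" "29b5"                   # UDP header
    "ffffffff"                                    # marker
    "676574736572766572732038322066756c6c20656d707479"  # rest of the data
)

LINK = 14                               # Ethernet II header length (assumption)
ihl = (frame[LINK] & 0x0F) * 4          # IP header length in bytes
protocol = frame[LINK + 9]              # offset 0x17 in this frame
src_ip = frame[LINK + 12:LINK + 16]     # offset 0x1a
dst_ip = frame[LINK + 16:LINK + 20]     # offset 0x1e
udp = frame[LINK + ihl:]                # UDP header starts at offset 0x22
udp_length = int.from_bytes(udp[4:6], "big")
checksum = int.from_bytes(udp[6:8], "big")

print(hex(LINK + ihl), protocol, src_ip.hex(), dst_ip.hex())  # 0x22 17 c0a80002 c0f6283c
print(udp_length, hex(checksum))                              # 36 0x29b5
print(udp[12:])                                               # b'getservers 82 full empty'
```

The addition then starts with the pseudo-header fields (`src_ip`, `dst_ip`, a zero byte plus `protocol`, `udp_length`) and continues through every byte of `udp`, with the two checksum bytes treated as zero. The order of the words does not matter, since addition is commutative.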

Add everything up in 2-byte (16-bit) units (Pseudo Header, UDP Header, UDP Data):

[source lang="cpp"]
c0a8 + 0002 + c0f6 + 283c + // Source IP, Dest IP
0011 + // Protocol
0024 + // UDP length (pseudo header)
6d38 + 6d2e + // Source Port, Dest Port
0024 + // UDP length (UDP header)
0000 + // empty checksum
ffff + ffff + 6765 + 7473 + 6572 + 7665 + 7273 + 2038 + 3220 + 6675 + 6c6c + 2065 + 6d70 + 7479 // Data
= 8d642
[/source]

Add the carry back into the low 16 bits:

[source lang="cpp"]
8 + d642 = d64a
[/source]

Take the 1's complement to get the checksum:

[source lang="cpp"]
~d64a = 29b5
[/source]
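
The three steps above can be replayed in a few lines as a sanity check. This is only a sketch: `word_sum` and `fold16` are names invented here, and the byte strings are typed in from the packet dump in this post.

```python
pseudo = bytes.fromhex("c0a80002" "c0f6283c" "0011" "0024")
udp_header = bytes.fromhex("6d38" "6d2e" "0024" "0000")  # checksum field zeroed
data = bytes.fromhex("ffffffff") + b"getservers 82 full empty"

def word_sum(buf: bytes) -> int:
    """Plain integer sum of big-endian 16-bit words (odd tail zero-padded)."""
    if len(buf) % 2:
        buf += b"\x00"
    return sum(int.from_bytes(buf[i:i + 2], "big") for i in range(0, len(buf), 2))

def fold16(total: int) -> int:
    """Add carries above bit 15 back into the low 16 bits."""
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return total

raw = word_sum(pseudo + udp_header + data)  # step 1: add all 16-bit words
folded = fold16(raw)                        # step 2: add the carry back in
checksum = ~folded & 0xFFFF                 # step 3: 1's complement
print(hex(raw), hex(folded), hex(checksum))  # 0x8d642 0xd64a 0x29b5

# Receiver-side check: with the transmitted checksum left in place,
# the folded sum comes out as all ones.
received = pseudo + bytes.fromhex("6d386d2e002429b5") + data
print(hex(fold16(word_sum(received))))       # 0xffff
```

The last line is how a received packet can be checked by hand without zeroing anything: add every word, including the checksum, and expect ffff.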
