Documentation on TCP buffers

I'm interested in TCP buffer sizes, as set with setsockopt(socket, SOL_SOCKET, SO_SNDBUF/SO_RCVBUF, ...). Is there any useful documentation on this that applies broadly to both Linux and Win32?

I once read that setting the send buffer length to zero will force the socket to send straight away (perhaps causing a send operation to block for longer?) and reduce latency. Is this so? What about the receive buffer: if that is set to zero, does it have any benefits in terms of latency? And what happens if the application cannot service such a socket quickly enough: is it possible that data will be dropped silently, or does TCP guarantee more reliable behaviour, as I would expect?

Finally, I believe there is (or can be) a relationship between the buffer sizes and the TCP window size. What are reasonable values for the buffers when trying to maximise throughput over a LAN (e.g. 1 gigabit), and does this affect latency? And what are the implications (if any) for a server socket with default buffer sizes when connected to a client socket that has both buffers set to zero?
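For reference, here's a quick way to see what the kernel actually grants when you set these options. This is a Python sketch (the socket module wraps the same setsockopt/getsockopt calls as the C API); note that on Linux the kernel doubles the requested value to leave headroom for its own bookkeeping, so the value read back is usually larger than the one you asked for:

```python
import socket

# Request specific send/receive buffer sizes, then read back what the
# kernel actually granted. On Linux the returned value is typically
# double the requested one (the kernel reserves headroom for metadata).
s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)

requested = 65536
s.setsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF, requested)
s.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, requested)

granted_snd = s.getsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF)
granted_rcv = s.getsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF)
print("SO_SNDBUF:", granted_snd, "SO_RCVBUF:", granted_rcv)
s.close()
```

Reading the value back with getsockopt is the only reliable way to know what you got, since the kernel clamps requests against its configured limits.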

I think if you want to reduce send latency, disabling Nagle's algorithm works better.

I don't know for sure, but I'd guess that if you make the send buffer zero, every send will block until the previous one has completed, which would be a major downer. Similarly, if the receive buffer is small (or zero), I'd imagine the TCP stack accepting exactly one packet at a time and dropping all the others, forcing resends. Like I said, I don't really know, but this is what would make sense to me.

Having said that, I'm setting my buffers to 256k each, hoping that it improves something. It would be nice to hear what that's worth from someone who actually knows :-)
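For what it's worth, disabling Nagle's algorithm is a one-liner. A Python sketch (the same IPPROTO_TCP / TCP_NODELAY option exists in the C API):

```python
import socket

# Disable Nagle's algorithm: small writes go out immediately instead of
# being coalesced while the stack waits for outstanding ACKs.
s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
s.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)

nodelay = s.getsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY)
print("TCP_NODELAY:", nodelay)  # non-zero once set
s.close()
```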

Quote:
Original post by samoth
I think if you want to reduce send latency, disabling Nagle's algorithm works better.

I don't know for sure, but I'd guess that if you make the send buffer zero, every send will block until the previous one has completed, which would be a major downer.


Yeah, on the code I am looking at, TCP_NODELAY is already set. But I vaguely recall reading that there's a system on some operating systems where sent TCP data is held back in the buffer for a time (200ms?) rather than sent immediately, to see if it can be merged into a larger packet, and that disabling the send buffer disables this feature. (EDIT: this may have something to do with TCP_CORK? EDIT 2: Actually, it looks like TCP_CORK is for something else.)

I'm not sure what having no receive buffer would achieve, and am interested in learning.

[Edited by - Kylotan on May 29, 2008 7:30:54 AM]

The Linux source would likely be a good place to start.

For high throughput, I'd guesstimate that sending more than the MTU in one send() would cause immediate dispatch (there is nothing to buffer, after all).

Quote:
Original post by Kylotan
But I vaguely recall reading that there's a system on some operating systems where sent TCP data is held back in the buffer for a time (200ms?) rather than sent immediately, to see if it can be merged into a larger packet, and that disabling the send buffer disables this feature.
It's even 500 ms if I remember correctly. Yeah, that's Nagle (and delayed ACK, similarly). TCP_NODELAY should take care of that.

Quote:
(EDIT: this may have something to do with TCP_CORK?
That's the opposite: you use TCP_CORK to force TCP *not* to send out small packets.
You need that, for example, in a webserver when you want to write the (dynamic) header first and then sendfile() the content.
With Nagle's algorithm disabled, as you want here, TCP would send out two packets, one containing the header and one containing the web page. This is a nuisance, especially for responses that would otherwise fit entirely into one datagram. TCP_CORK avoids that.

Thanks, those links look quite useful.

I'd still be interested in hearing if anybody knows for certain what happens on the receiving side when the receive buffer is full, though. I seem to have a situation where two consecutive successful recv() calls on a socket appear not to be returning contiguous parts of the incoming data stream (according to Wireshark). I'm just speculating about the buffers, because it only occurs under relatively heavy load. It's probably an application-level problem, but I'm having trouble seeing how.

Are you using threads?
Winsock documentation says somewhere that two receive calls shouldn't be initiated simultaneously on the same socket from two threads, because buffer order might become unpredictable. Of course, several overlapped receive operations can be in progress at the same time; it's just that the second shouldn't be issued before the first has returned. I'm not sure how this works for blocking sockets; perhaps it is not a problem.
According to http://msdn.microsoft.com/en-us/magazine/cc302334.aspx under "Who manages the buffers?", it seems that even when the receive buffer is too small, the data will still be buffered. If it cannot be buffered, I assume it will be discarded, which will lead to an eventual resend from the remote end.
It should be impossible for data to reach the application in the wrong order, or for later data to arrive without the data that came before it.
Also, from http://msdn.microsoft.com/en-us/library/ms740476(VS.85).aspx:
Quote:

SO_RCVBUF int Specifies the total per-socket buffer space reserved for receives. This is unrelated to SO_MAX_MSG_SIZE and does not necessarily correspond to the size of the TCP receive window.

I am using threads, but recv() is only ever called on a given socket from a single thread. What I see is a 'hole' in the data: the data returned by consecutive recv() calls is in the right order, but a large amount of data that should have arrived in between is missing. I would assume it's a problem with the app, but having added plenty of logging, it is hard to see how that could be the case. So I'm just trying to see whether the buffering could possibly be an issue, even if it's very unlikely.
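One application-level bug that produces exactly this symptom is treating each recv() as one whole message: recv() may return any prefix of the stream, so a reader has to loop until it has the byte count it expects. A minimal sketch in Python (recv_exact is a hypothetical helper name, not a standard API), demonstrated over a local socket pair:

```python
import socket

def recv_exact(sock, n):
    """Read exactly n bytes from a stream socket, looping over short reads."""
    chunks = []
    remaining = n
    while remaining > 0:
        chunk = sock.recv(remaining)
        if not chunk:  # peer closed before n bytes arrived
            raise ConnectionError("socket closed mid-message")
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)

# Send 10 bytes in two pieces; read them back as one exact-length message.
a, b = socket.socketpair()
a.sendall(b"hello")
a.sendall(b"world")
print(recv_exact(b, 10))
a.close()
b.close()
```

If the reading code instead assumes one recv() per logical message and discards the remainder, the discarded bytes look exactly like a hole in the stream.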

You could also check the remote application sending the data for an error. Perhaps it disregards some error, such as the send buffer being full, and still removes the data it tried to send from its application buffer.

The data makes it across the network OK; I can see that from Wireshark sniffing the packets. Unfortunately it appears (though it is not obvious, due to obfuscated code) that the last two recv() calls miss out about 5K of that data, which would come after the first call and before the second. Strange.

I suppose if the receive buffer is zero, TCP will receive nothing. That's because the ACK segment will advertise a window size of zero, and the remote machine will not send anything until the window size is greater than zero.

Quote:
I'd still be interested in hearing if anybody knows for definite what happens on the receiving side when the receive buffer is full though


- Nothing fancy happens on the receiving side. The kernel's receive buffer is full, so it stops accepting data, and the sender is informed (the advertised TCP window shrinks).

- The sending side needs to implement error checking for this:

ssize_t bytes_sent;
size_t len = strlen(msg);

bytes_sent = send(sockfd, msg, len, 0);
if (bytes_sent == -1) {
    /* error: inspect errno and handle it */
} else if ((size_t)bytes_sent != len) {
    /* partial send: store what you managed to send and retry the rest later */
}

send() will fire off as much of the data as it can and trust you to send the rest later. If the value returned by send() doesn't match len, it's up to you to send the rest of the buffer.


So, in short, you don't need to handle this on the server; as long as it reads every now and then, it should be fine.

TCP data is guaranteed to be delivered to the application in order.

