Antheus

64k UDP receive buffer limit


I began testing my network code and ran into an unexpected problem: receive buffer overflow. I stumbled across this quite accidentally while running dummy clients with Sleep(x) between their commands. The server stalled at absurdly low rates, with clients timing out after several seconds. Changing that to Sleep(x + rand() % y) helped somewhat, and increasing the network buffer solved the problem completely (800 simulated clients running with zero packet loss at a 4 MB/sec rate).

While this works, it doesn't solve the potential lag spike problem. It seems the original test managed to bring down the client with burst traffic, but how should this be handled on a live server?

I'm using boost::asio to set the send and receive buffer sizes to 1 MB (although 64 KB is enough), but what are the actual limits? ASIO might have its own buffers as well; I'll need to look into that. Is there a common strategy for solving this problem somewhat reliably?

PS: As a side note, since nobody seems to have heard of it before, Boost.Asio works the way all Boost libraries do - horrible API, but once set up, very hard to break.

[Edited by - Antheus on April 5, 2007 4:11:02 PM]
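For reference, the buffers are set with something along these lines (a minimal sketch only; the port number and the 1 MB figure are placeholders, and since the OS is free to clamp the request, the values are read back afterwards):

#include <boost/asio.hpp>
#include <iostream>

int main()
{
    boost::asio::io_service io_service;
    boost::asio::ip::udp::socket socket(io_service,
        boost::asio::ip::udp::endpoint(boost::asio::ip::udp::v4(), 30000)); // placeholder port

    // Ask the OS for 1 MB kernel-side buffers...
    socket.set_option(boost::asio::socket_base::receive_buffer_size(1024 * 1024));
    socket.set_option(boost::asio::socket_base::send_buffer_size(1024 * 1024));

    // ...then read the options back, since the OS may have clamped them.
    boost::asio::socket_base::receive_buffer_size rcv;
    boost::asio::socket_base::send_buffer_size    snd;
    socket.get_option(rcv);
    socket.get_option(snd);
    std::cout << "SO_RCVBUF = " << rcv.value()
              << ", SO_SNDBUF = " << snd.value() << std::endl;
    return 0;
}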

64 kilobytes is the limit on the maximum packet size that can be sent with TCP or UDP (the packet size is stored in a two-byte field).

With UDP, 64K is all you'll ever need in a receive buffer, since UDP messages are delivered one datagram at a time. This means you'll never get more than 64K with one call to recvfrom. BTW, 64K is an enormous packet and has an increased chance of being dropped by the network, since it gets fragmented into many IP fragments and losing any one of them loses the whole datagram. Unless your network is very fast and reliable, the optimal throughput may be achieved with a smaller packet size.

I'm not sure if there's any benefit to having buffers larger than 64K for TCP either, but this will depend on the size of the flow control window used by TCP. I don't know if the OS lets it grow past 64K... maybe someone else can answer this.

Quote:
Original post by myrdos
64 kilobytes is the limit on the maximum packet size that can be sent with TCP or UDP (the packet size is stored in a two-byte field).


I'm not sure if that's correct.

From what I've read, this is the buffer that holds bytes received from the wire until they are picked up by recvfrom.

In my case, the server is receiving 3,200 packets of 1,024 bytes per second, with potential bursts of several hundred KB within a few milliseconds.

Despite an adequate number of threads and 5-10% CPU utilization, the server simply can't drain the Winsock buffer fast enough.

I've read around, and increasing the receive buffer seems to be common when dealing with heavy UDP traffic. It appears that 64 KB becomes inadequate at around 2,000 packets per second; at roughly 1 KB per packet that's about 2 MB/sec, so a 64 KB buffer absorbs only ~30 ms of traffic.

Unfortunately it seems that raising the cap on Linux requires adjusting kernel parameters (net.core.rmem_max). Guess I'll see whether this remains a problem. Like I said, right now increasing the buffer solves it.

I think we may be talking about different buffers here... I thought you meant the size of the buffer you declare in the application. I'm not familiar with ASIO; maybe it's hiding this from you? I now understand that you mean the buffer used internally by the OS.

Anyway, you can easily verify the max packet sizes for TCP and UDP; for example, there's a section on Wikipedia:
http://en.wikipedia.org/wiki/User_Datagram_Protocol#Packet_structure
And for TCP: http://en.wikipedia.org/wiki/TCP_header#Header

Do a search for "TCP length" on that page; it's in the IPv4 section. Note the enormous length allowed in IPv6 - 4 GB packets, yikes.

>Unfortunately it seems that raising the cap on Linux requires adjusting kernel parameters (net.core.rmem_max).

That's my understanding.

You can resize the kernel-side receive buffer to whatever size is reasonable to you. Making it one megabyte for a server intended for a high-traffic game with lots of clients is not uncommon at all.

Note that there is a maximum rate at which you can receive data from the kernel (context switching and synchronization overhead), and it could be possible that a sustained packet stream would fill up the kernel-side buffer faster than you can drain it. In that case, you have to figure out who is flooding your buffer, and add a firewall rule for them, as you are being DOS-ed (Denial of Service).

Lastly, trying to get the maximum possible throughput is when the advanced, platform-dependent socket techniques start mattering. This means overlapped I/O with IOCP on Windows, and whichever flavor of poll/epoll/kqueue works best on your kernel on UNIX. However, before you go there "just to prove it," make sure that you realistically will need that level of performance. If not, you'll be gold-plating some part of the system that nobody will actually ever see.
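If you do go the epoll route on Linux, the receive side boils down to something like the sketch below: register the socket, then drain it completely on every readiness notification so the kernel-side buffer never gets a chance to fill up. (Bare illustration only; it assumes a non-blocking, already-bound UDP socket, and the packet handler is a placeholder.)

#include <sys/types.h>
#include <sys/socket.h>
#include <sys/epoll.h>
#include <errno.h>
#include <unistd.h>

// Drain the socket on every readiness event so the kernel-side buffer stays
// as empty as possible. 'sock' is assumed to be a non-blocking, bound UDP socket.
void run_receive_loop(int sock)
{
    int epfd = epoll_create(64);
    struct epoll_event ev = {};
    ev.events  = EPOLLIN;
    ev.data.fd = sock;
    epoll_ctl(epfd, EPOLL_CTL_ADD, sock, &ev);

    char buf[65536];                        // a single datagram can never exceed 64 KB
    struct epoll_event events[16];

    for (;;)
    {
        int ready = epoll_wait(epfd, events, 16, -1);
        for (int i = 0; i < ready; ++i)
        {
            // Keep reading until the kernel reports the buffer is empty (EAGAIN).
            for (;;)
            {
                sockaddr_storage from;
                socklen_t fromlen = sizeof(from);
                ssize_t len = recvfrom(sock, buf, sizeof(buf), 0,
                                       (sockaddr*)&from, &fromlen);
                if (len < 0)
                {
                    // EAGAIN/EWOULDBLOCK means drained; anything else is a real error.
                    break;
                }
                // handle_packet(buf, len, from);   // placeholder for the real handler
            }
        }
    }
}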

setsockopt
The Windows Sockets setsockopt function sets a socket option.
Header: Declared in winsock2.h. Import library: link with ws2_32.lib.

int setsockopt(
    SOCKET s,
    int level,
    int optname,
    const char FAR* optval,
    int optlen
);

//----------------------------------------------------------------------------------------
// Returns the value of a socket option such as SO_MAX_MSG_SIZE, SO_RCVBUF or SO_SNDBUF.
// 'socket' selects which socket to query: 'S' = SenderSocket, 'L' = ListenerSocket.
INT UDPX::get_socket_statistics(INT32 optname, CHAR socket)
{
    INT dbuf = 0;
    INT dbuflen = sizeof(dbuf);
    SOCKET socketx = INVALID_SOCKET;

    if (socket == 'S')      socketx = SenderSocket;
    else if (socket == 'L') socketx = ListenerSocket;
    else                    return 0;                      // unknown socket selector

    if (getsockopt(socketx, SOL_SOCKET, optname, (CHAR*)&dbuf, &dbuflen) != 0)
    {
        logit2("getsockopt() failed -- need to check WSAGetLastError() ...\n");
        return 0;
    }
    return dbuf;   // 32-bit int
} // end of get_socket_statistics()

//-----------------------------------------------------------------------------------------------------------------------
// Sets the kernel-side UDP buffers (SO_RCVBUF on the listener, SO_SNDBUF on the sender) to dbuf bytes.
VOID setbuffersize(INT32 dbuf)
{
    printf(" getsockopt sender SO_MAX_MSG_SIZE = %d \n", mysockx->get_socket_statistics(SO_MAX_MSG_SIZE, 'S'));
    printf(" getsockopt SO_RCVBUF = %d \n", mysockx->get_socket_statistics(SO_RCVBUF, 'S'));
    printf(" getsockopt SO_SNDBUF = %d \n", mysockx->get_socket_statistics(SO_SNDBUF, 'S'));

    // e.g. dbuf = 16384, 32768, 65536 -- values above 64 KB are accepted as well

    if (setsockopt(mysockx->ListenerSocket, SOL_SOCKET, SO_RCVBUF, (CHAR*)&dbuf, sizeof(dbuf)) != 0)
        printf(" setsockopt(SO_RCVBUF) failed\n");
    else
        printf(" NEW getsockopt SO_RCVBUF = %d \n", mysockx->get_socket_statistics(SO_RCVBUF, 'L'));

    if (setsockopt(mysockx->SenderSocket, SOL_SOCKET, SO_SNDBUF, (CHAR*)&dbuf, sizeof(dbuf)) != 0)
        printf(" setsockopt(SO_SNDBUF) failed\n");
    else
        printf(" NEW getsockopt SO_SNDBUF = %d \n", mysockx->get_socket_statistics(SO_SNDBUF, 'S'));
} // end of setbuffersize()
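Presumably the call site is just something like setbuffersize(1024 * 1024), invoked once right after the sockets have been created and before any traffic starts flowing.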


Quote:
Original post by hplus0603
Lastly, trying to get the maximum possible throughput is when the advanced, platform-dependent socket techniques start mattering. This means overlapped I/O with IOCP on Windows, and whichever flavor of poll/epoll/kqueue works best on your kernel on UNIX. However, before you go there "just to prove it," make sure that you realistically will need that level of performance. If not, you'll be gold-plating some part of the system that nobody will actually ever see.


This is why I chose ASIO. On Windows, the implementation is IOCP. There are also epoll and kqueue implementations.

What I needed to know was how the whole thing scales. Some Boost libraries are known to be inefficient, or at least to perform redundant operations.

This performance observation is relevant for one reason: the initial implementation, despite using IOCP, stalled at 50-70 KB/sec even when the client and server were running on the same machine. After setting those two buffer parameters, the exact same code effortlessly jumped to megabytes per second without even stressing the CPU.
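For the curious, the receive path is just the standard asio pattern: post an async_receive_from and re-arm it from the completion handler. A rough sketch follows (class and buffer names are illustrative, not the actual server code):

#include <boost/asio.hpp>
#include <boost/array.hpp>
#include <boost/bind.hpp>

// Illustrative asio UDP receive loop: each completed read immediately re-arms
// the next async_receive_from so the kernel-side buffer keeps draining.
class UdpServer
{
public:
    UdpServer(boost::asio::io_service& io, unsigned short port)
        : socket_(io, boost::asio::ip::udp::endpoint(boost::asio::ip::udp::v4(), port))
    {
        start_receive();
    }

private:
    void start_receive()
    {
        socket_.async_receive_from(
            boost::asio::buffer(buffer_), sender_,
            boost::bind(&UdpServer::handle_receive, this,
                        boost::asio::placeholders::error,
                        boost::asio::placeholders::bytes_transferred));
    }

    void handle_receive(const boost::system::error_code& error, std::size_t bytes)
    {
        if (!error)
        {
            // process buffer_[0 .. bytes) here; sender_ holds the client's endpoint
            start_receive();    // re-arm right away
        }
        // a real server would inspect 'error' here instead of silently stopping
    }

    boost::asio::ip::udp::socket   socket_;
    boost::asio::ip::udp::endpoint sender_;
    boost::array<char, 2048>       buffer_;  // per-datagram buffer; size is arbitrary here
};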

The other reason is that, due to the modular design, I do preliminary tests so that I know what to expect and at least have a feel for the various concepts.

There are some topics with which I am unfamiliar, so I need to get my bearings. Is 100 messages a lot? 10,000? Is 50 million messages / second possible? It helps to know at least loose bounds.

The only thing such a test shows me is that the server is capable of handling the connections, IP/connection hash mapping, packet de-serialization, new packet generation, bandwidth throttling, and sending for 500+ clients without breaking a sweat. If there were any real problems, I'd need to rethink the whole approach to networking.

Another interesting observation: the clients, for simplicity's sake, use their sockets synchronously, so there is one per thread. On a single Windows machine, running 800+ threads didn't kill the system - something we were having problems with in Java 1.4 at around 150-200 sockets, even though Java was always advertised as thread-friendly, even more so than Windows.

