TCP stream 300 ms spikes due to Nagle algorithm?

18 comments, last by Zurtan 4 years, 5 months ago

I am trying to stream a lot of video over a peer-to-peer connection using a TCP stream (via winsock2) on Windows.

The connection is 10 Gb/s copper.

I need to stream about 2 Gb/s. I am able to get the throughput, but occasionally there is a 200 or 300 ms stall.

The stall occurs around send() on the server and around recv() on the client.

Digging a bit more, I read about the Nagle algorithm and the default delayed-ACK/quickack behavior. I have tried changing the configuration in many ways, but there always seems to be an occasional non-"flat line" in the communication, with spikes of around 200 ms.

Should I stream video over UDP? Are such spikes inherent to a TCP stream? I can smooth the spikes with a buffer, but I would pay for it with latency.

Or is there a setup where the TCP ACK behavior or algorithm won't stall at all?
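For reference, this is roughly how I've been turning Nagle off (just a sketch; sock stands for whatever connected winsock2 socket is in use, and ws2_32.lib must be linked):

// Sketch: disable Nagle on an already connected winsock2 SOCKET.
// 'sock' is a placeholder for whatever connected socket you have.
#include <winsock2.h>
#include <cstdio>

bool DisableNagle(SOCKET sock)
{
    int noDelay = 1;  // nonzero disables Nagle coalescing on this socket
    int rc = setsockopt(sock, IPPROTO_TCP, TCP_NODELAY,
                        reinterpret_cast<const char*>(&noDelay), sizeof(noDelay));
    if (rc == SOCKET_ERROR) {
        std::printf("setsockopt(TCP_NODELAY) failed: %d\n", WSAGetLastError());
        return false;
    }
    return true;
}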


I wouldn't think that Nagle would affect high bandwidth streaming at all.  I was under the impression that you'd only notice its effects if sending small pieces of data infrequently.

Yeah, Nagle doesn't impact anything with large packets. Instead, it's quite likely that you drop a packet now and then, and the TCP stream temporarily stalls. This is what TCP connections do. You need to add enough buffering / read-ahead that a few hundred milliseconds doesn't matter; you just run down to less data available in the buffer while that goes on.

If the additional delay from a playout buffer is not acceptable for you, then you'll likely want to use UDP instead of TCP for streaming. That, of course, opens up other problems. You might want to look into a dedicated protocol on top of UDP, like RTP and RTCP.

enum Bool { True, False, FileNotFound };

I think I made progress.

I figured that the recv thread needs to do nothing apart from copying packets and pushing them on a queue for a different thread.

I tried UDP, and I saw that if the recvfrom thread does anything other than recvfrom and push, I suffer lost packets.

So I think this will be the case in TCP as well, even though TCP has a buffer that hides the issue.

I don't do much in the recv thread, but I think I do enough for it to be a problem.

At the very least, separating recv from constructing the frame will help me isolate the problem.

I am thinking that the reason we have seen both recv and send stutter is that the recv thread wasn't doing the bare minimum like it should.

 

So I am thinking a good solution is to do something like the following (see the sketch below the list):

1) recv into a buffer of about 8000 bytes.

2) Then push those 8000 bytes into a queue.

3) The queue is read by a different thread, which does the same thing the recv thread used to do with the 8000 bytes.
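Rough sketch of what I mean (ProcessChunk is just a stand-in for whatever the recv thread used to do with the bytes):

// Sketch of the recv-then-queue idea: the recv thread only copies and pushes,
// and a worker thread does the frame assembly. ProcessChunk() is a placeholder.
#include <winsock2.h>
#include <condition_variable>
#include <mutex>
#include <queue>
#include <thread>
#include <vector>

static std::mutex g_mtx;
static std::condition_variable g_cv;
static std::queue<std::vector<char>> g_chunks;
static bool g_done = false;

void ProcessChunk(const std::vector<char>& chunk)
{
    // Placeholder: frame assembly / decoding would go here.
    (void)chunk;
}

void RecvLoop(SOCKET sock)
{
    for (;;) {
        std::vector<char> buf(8000);
        int n = recv(sock, buf.data(), static_cast<int>(buf.size()), 0);
        if (n <= 0) break;                     // connection closed or error
        buf.resize(n);
        {
            std::lock_guard<std::mutex> lk(g_mtx);
            g_chunks.push(std::move(buf));     // only copy + push, nothing else
        }
        g_cv.notify_one();
    }
    { std::lock_guard<std::mutex> lk(g_mtx); g_done = true; }
    g_cv.notify_one();
}

void WorkerLoop()
{
    for (;;) {
        std::vector<char> chunk;
        {
            std::unique_lock<std::mutex> lk(g_mtx);
            g_cv.wait(lk, [] { return !g_chunks.empty() || g_done; });
            if (g_chunks.empty()) return;      // recv finished and queue drained
            chunk = std::move(g_chunks.front());
            g_chunks.pop();
        }
        ProcessChunk(chunk);                   // the heavy work happens here
    }
}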

If you want high throughput, use buffers bigger than 8 kB. Try at least 64 kB for your read/write chunk size; often you get more throughput with more.

Also, if you set the buffer size of the socket on both the sending and receiving side, BEFORE you connect/bind/accept the socket, then the TCP connection may be negotiated with TCP window scaling, which can increase how much throughput you can get on high-latency connections, and also increases how much buffering is done in the kernel to mitigate packet loss. It's common to set the socket buffer size to at least 1 MB per socket for high-throughput connections.
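Something along these lines (sketch only; 1 MB is just an example size):

// Sketch: request bigger socket buffers before the connection is established,
// so window scaling can be negotiated. The size is just an example.
#include <winsock2.h>

bool SetSocketBuffers(SOCKET sock, int bytes /* e.g. 1 * 1024 * 1024 */)
{
    if (setsockopt(sock, SOL_SOCKET, SO_RCVBUF,
                   reinterpret_cast<const char*>(&bytes), sizeof(bytes)) == SOCKET_ERROR)
        return false;
    if (setsockopt(sock, SOL_SOCKET, SO_SNDBUF,
                   reinterpret_cast<const char*>(&bytes), sizeof(bytes)) == SOCKET_ERROR)
        return false;
    return true;
}

// Client: call SetSocketBuffers() after socket() but before connect().
// Server: call it on the listening socket before listen(); accepted sockets
// typically inherit the sizes, but you can set them again right after accept().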

 

enum Bool { True, False, FileNotFound };

I set the internal buffers (SNDBUF and RCVBUF) to 128 MB each. I do this for all the sockets.

That includes the client socket, the listening/accept socket, and the newly created socket on the server after a client connects.

I might be getting an abnormal amount of packet loss for NIC-to-NIC communication.

I tested this using iperf with UDP (although iperf on Windows is a bit outdated).

On a 100 Mb/s UDP connection, with a window size (I think that's the internal buffers in iperf) of 32 MB, I think I get no packet loss.

The more I increase the bandwidth, the more packet loss I get.

I am not sure this is reasonable, but on this 10 Gb/s NIC, with a target bandwidth of 1 Gb/s on one socket and a buffer of at least 64 MB, I get packet loss of around 0.3% from time to time. I still think this might be too much.

If I try to transmit 9 Gb/s then, depending on the window size, I get no packet loss until (I assume) the window overflows, and then I get something like 30% packet loss.

If I halve the window size (internal buffer), I can see the jump from 0% to 30% happen twice as fast.

This is with buffers as big as 0.5 GB.

So I am not sure if this packet loss is normal, but it is now clear to me that the stutters I see in TCP are actually TCP recovering from too many packet losses.

 

Is this kind of packet loss normal?

We have both fiber and copper NICs; all of them show this kind of packet loss.

The amount of packet loss you get depends on many things:

  1. The quality of the hardware Ethernet transmitter and receiver

  2. The quality of the hardware in the switch you're using

  3. The quality of the software in the switch you're using

  4. The quality of the wiring (or fibers) you use for connections

  5. The quality of the software (driver) running the Ethernet card in the sending and receiving node

  6. The quality of the network stack implementation for the network protocols in the sending and receiving node

My experience is that the problem can be in any of these places. If your switch is a cheap, unmanaged switch, I would start looking there. Another popular place for problems is the driver or hardware for the Ethernet card, where some temporary processing glitch can cause a lost packet.

In general, unless you use switches from places like Juniper or Arista or Cisco, you can not rule out the switch! And you may need pretty sophisticated analysis and debugging (maybe even special network analyzer hardware) to pinpoint the exact spot of the problem. (Also, even the high-end switches will sometimes have bugs, but at least those vendors will answer bug reports, assuming the equipment is still within support windows.)

To reliably get X Gb/s of throughput between point A and point B, you might consider using some kind of N-out-of-M self-correcting encoding, where you increase the amount of bandwidth you use, and send more packets, using some encoding scheme that lets you recover the underlying message if you receive at least N packets out of M sent. Send this over UDP, so that you don't see head-of-line blocking when packets are dropped.
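As a toy illustration of the idea (not something I'd ship; real schemes like Reed-Solomon or RaptorQ generalize this to "any N of M"): for every group of K data packets you send one extra XOR parity packet, and the receiver can rebuild any single lost packet in that group:

// Toy forward-error-correction sketch: for every group of K data packets,
// send one extra XOR parity packet. If any single packet of the K+1 is lost,
// the receiver rebuilds it by XOR-ing together the packets that did arrive.
#include <cstddef>
#include <vector>

std::vector<char> MakeParity(const std::vector<std::vector<char>>& group,
                             std::size_t packetSize)
{
    std::vector<char> parity(packetSize, 0);
    for (const auto& pkt : group)
        for (std::size_t i = 0; i < packetSize && i < pkt.size(); ++i)
            parity[i] ^= pkt[i];
    return parity;
}

// Rebuild the single missing data packet from the received data packets
// plus the parity packet (same XOR, just leaving out the lost one).
std::vector<char> RecoverMissing(const std::vector<std::vector<char>>& received,
                                 const std::vector<char>& parity)
{
    std::vector<char> missing = parity;
    for (const auto& pkt : received)
        for (std::size_t i = 0; i < missing.size() && i < pkt.size(); ++i)
            missing[i] ^= pkt[i];
    return missing;
}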

enum Bool { True, False, FileNotFound };

We don't have a switch; it's a direct cable from NIC to NIC.

A star setup.

I have a theory though...

Let's say you have a 100 Mb/s NIC. Now let's say that in the first second you send 400 Mb, and then for the following three seconds you don't send anything.

You might think you are sending 100 Mb/s on average, but your NIC might drop a lot of packets in that first second.

Edit: I might need to smooth out the sending. I might be spamming the NIC with too much data in a short period of time.
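Something like this is what I have in mind for smoothing (rough sketch for UDP; the chunk size and rate are made up, and Windows sleep granularity is coarse, so real pacing would have to send in batches):

// Rough pacing sketch: instead of dumping a whole frame at the NIC at once,
// spread the sendto() calls over the frame interval. Numbers are examples only.
#include <winsock2.h>
#include <chrono>
#include <thread>

void PacedSend(SOCKET sock, const sockaddr* dest, int destLen,
               const char* data, int totalBytes,
               double targetBitsPerSecond /* e.g. 1e9 */)
{
    const int chunk = 8000;                                  // bytes per datagram
    const double secondsPerChunk = (chunk * 8.0) / targetBitsPerSecond;
    auto next = std::chrono::steady_clock::now();

    for (int off = 0; off < totalBytes; off += chunk) {
        int len = (totalBytes - off < chunk) ? (totalBytes - off) : chunk;
        sendto(sock, data + off, len, 0, dest, destLen);
        next += std::chrono::duration_cast<std::chrono::steady_clock::duration>(
            std::chrono::duration<double>(secondsPerChunk));
        std::this_thread::sleep_until(next);                 // crude rate limit
    }
}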

 

The NIC cannot send more than 100 Mb/s. The rest gets buffered in the kernel before it's even sent. When that buffer fills up, further packets will be dropped.

The goal of TCP is to manage the connection in between, so that it doesn't start dropping packets because of link overflow. The plain vanilla default TCP implementation will, long term, end up with a single connection that saturates about 75-80% of available bandwidth; tweaks to the algorithm may let certain pairs of client/server operating systems achieve better results. (It's not uncommon to get to 95% or better of theoretical max on undisturbed networks.) Thus, TCP has a send buffer, and only sends packets to the NIC when the NIC can actually use them -- nothing will be discarded on the way to the NIC, as long as you don't run out of buffer space. But even that is unlikely, because the send() (or sendfile(), or write(), or whatever) system call will block when the TCP send buffer fills up, until data can be copied out of the provided buffer into the network queue.

Thus, the main place where packets are dropped for TCP, with blocking sends, is on the wire, in the hardware, or on the receiving side. A NIC that receives packets for which there is no incoming buffer space has no choice but to discard them. So if you burst at 10 Gb/s, and the driver and/or network stack on the receiving end can't keep up with that burst rate for however long, that may cause a packet drop. Given that this happens "occasionally," I wonder if it's some occasional driver or device on the system that disables interrupts for too long or something -- polling a disk, flipping a graphics buffer, or something like that.

 

enum Bool { True, False, FileNotFound };

Well, I think we have figured out that it is highly unlikely that the hardware loses packets.

The packet loss we have seen in UDP is just "software packet loss".

It means the receiving end is unable to read the packets fast enough.

This is more apparent because when the server sends packets to the other computer, it doesn't even care whether there is a receiving client reading those packets. It sends them anyway, and Windows will show bandwidth on the NIC even though you already closed your client, because the server just keeps on sending.

On the other hand, since TCP blocks on send, the blocking itself might affect the bandwidth you see in Task Manager.

Bandwidth-wise, TCP is a lot less stable in Task Manager than the UDP server, which always sends and doesn't care whether anyone reads on the other end.

So why would TCP block the send if it queues the data into a buffer and sends it later? You would think a buffer would smooth out the bandwidth.

 

We are now focusing on UDP, as we don't know why TCP is so unstable and non-uniform. Maybe it's by design.

 

However, even in UDP we have issues.

My latest theory is that a single core cannot handle 3 Gb/s of bandwidth very well.

A CPU core operates at about 3 giga-cycles per second, right? So with 9000-byte packets you have 3 GHz / (3 Gb/s / (9000 bytes * 8)), which leaves you with about 72K cycles per packet on average.

That might not be enough for a single core.

So I think that I need more than one thread to read from a single socket, or to limit the bandwidth per thread.

What is the scale of dealing with 14K packets per second (for 1 Gb/s)?
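Just to put numbers on it (back-of-the-envelope only, assuming 9000-byte jumbo frames and a ~3 GHz core):

// Back-of-the-envelope numbers, assuming 9000-byte (jumbo frame) packets.
#include <cstdio>

int main()
{
    const double cpuHz        = 3e9;            // ~3 GHz core
    const double packetBits   = 9000.0 * 8.0;   // bits per packet
    const double bandwidths[] = { 1e9, 3e9 };   // 1 Gb/s and 3 Gb/s

    for (double bps : bandwidths) {
        double packetsPerSec   = bps / packetBits;       // ~14K at 1 Gb/s
        double cyclesPerPacket = cpuHz / packetsPerSec;  // ~72K at 3 Gb/s
        std::printf("%.0f Gb/s: %.0f packets/s, %.0f cycles/packet\n",
                    bps / 1e9, packetsPerSec, cyclesPerPacket);
    }
    return 0;
}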

