
Theory - Handling TCP data that is split on recv calls


18 replies to this topic

#1 DarkRonin   Members   -  Reputation: 610


Posted 05 May 2011 - 03:40 AM

Hi Guys,

I understand that (with TCP) data sent in one send() call may take multiple recv() calls to receive in full.

But, is there any delay between recv() calls or will the data always be there on the next call?

For example will this happen?

Data sent = "Hello world"

Recv() = "Hell"
Recv() = "o world"

Or is it possible that this may happen:

Recv() = "Hell"
Recv() = NO DATA
Recv() = NO DATA
Recv() = "o world"

Thanks in advance :)


#2 MatsK   Members   -  Reputation: 226


Posted 05 May 2011 - 03:55 AM

The first will always happen with a blocking socket, because recv() blocks until data has been received.
You haven't specified which language you're using, but assuming it's C++, you should use an asynchronous solution like IOCP so that your main application thread doesn't have to block while waiting for data.
That way you can handle data asynchronously. Attach a packet ID to each packet and compare the bytes received against the packet's predetermined size: once you've received PACKETSIZE bytes, the packet has been fully received. Any data in the buffer before the packet ID belongs to the previously received packet, and if you've received more than PACKETSIZE bytes, you've also received the start of the next packet.

#3 DarkRonin   Members   -  Reputation: 610


Posted 05 May 2011 - 04:13 AM

Thanks for the reply.

Yes, I am using C++. But, I am using non-blocking sockets.

So, is it possible the latter may happen? Or, will the data always be there on the second call?

#4 Drew_Benton   Crossbones+   -  Reputation: 1713


Posted 05 May 2011 - 04:15 AM

In principle, yes, the second can happen if the socket is set to non-blocking mode. In that case, recv() returns SOCKET_ERROR and sets an error code if no data is available. For that model, you are just "polling" the socket for data every loop.

"But I, being poor, have only my dreams. I have spread my dreams under your feet; tread softly, because you tread on my dreams." - William Butler Yeats

#5 rip-off   Moderators   -  Reputation: 8163


Posted 05 May 2011 - 04:22 AM

recv() is documented as returning the number of bytes read on success, 0 if the connection has been closed, and an error indication otherwise. If you set your socket into non-blocking mode and no data is available, it will fail with an "error" of EAGAIN/EWOULDBLOCK (BSD sockets) or WSAEWOULDBLOCK (Winsock).
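
A minimal sketch of that polling pattern on Winsock (the function name and return convention are illustrative, not from any of the posts above; it assumes the socket was already made non-blocking with ioctlsocket()):

#include <winsock2.h>

// Polls a non-blocking socket once. Returns the number of bytes read,
// 0 if no data was available yet, or -1 on close or on a real error.
int PollRecv(SOCKET s, char* buffer, int bufferSize)
{
    int received = recv(s, buffer, bufferSize, 0);
    if (received > 0)
        return received;                  // Data arrived.
    if (received == 0)
        return -1;                        // Peer closed the connection.
    if (WSAGetLastError() == WSAEWOULDBLOCK)
        return 0;                         // Nothing yet; try again next loop.
    return -1;                            // A real socket error.
}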

#6 DarkRonin   Members   -  Reputation: 610


Posted 05 May 2011 - 04:29 AM

Actually, that is a good point. So, if I haven't received all of the data I should be getting WSAEWOULDBLOCK and then I can act accordingly. :cool:

#7 frob   Moderators   -  Reputation: 20462


Posted 05 May 2011 - 10:51 AM

What's more, you are also likely to get part of the next block of data, or parts of the next three or four or five data blocks, all at once.


This is one big reason for the design of most networking systems. Look not just at the IP protocol, but at most protocols up and down the OSI model, and you will see this pattern: { header { data } }. The header is simply the size of the payload plus enough information to know what to do with it.

That gets wrapped up at the next level, where the whole block becomes the data for that level's own { header { data } } message, so data = { header { data } }.

So by the time you get to the wire, you end up with:
{ Ethernet header { IP header { TCP header { App data header { your game packet header { actual data } } } } } }

It is generally best to use this model for your own code.

Read a small but known header that includes the total size. Keep buffering until you have at least that much data, then pull off a full block and process it. Repeat until you have extracted and processed all complete messages.
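
As a rough illustration of that buffering loop (a sketch only; the 4-byte length prefix, the buffer type, and the HandleMessage hook are assumptions, not anything from the post above):

#include <cstdint>
#include <cstring>
#include <vector>

void HandleMessage(const char* payload, std::uint32_t size); // your app logic

// Bytes accumulated across recv() calls.
std::vector<char> streamBuffer;

// Pull every complete [uint32 length][payload] block off the buffer.
void ProcessBufferedMessages()
{
    for (;;)
    {
        if (streamBuffer.size() < sizeof(std::uint32_t))
            return; // Not even a full header yet.

        std::uint32_t payloadSize = 0;
        std::memcpy(&payloadSize, streamBuffer.data(), sizeof(payloadSize));

        const std::size_t totalSize = sizeof(payloadSize) + payloadSize;
        if (streamBuffer.size() < totalSize)
            return; // Header is here, but the payload is still in flight.

        HandleMessage(streamBuffer.data() + sizeof(payloadSize), payloadSize);

        // Trim the consumed message off the front; the next header
        // (if any) is now at the start of the buffer.
        streamBuffer.erase(streamBuffer.begin(),
                           streamBuffer.begin() + totalSize);
    }
}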
Check out my personal indie blog at bryanwagstaff.com.

#8 hplus0603   Moderators   -  Reputation: 5181


Posted 05 May 2011 - 12:12 PM

Actually, that is a good point. So, if I haven't received all of the data I should be getting WSAEWOULDBLOCK and then I can act accordingly. :cool:


I recommend reading my description of one solution to the same problem in this other thread:
http://www.gamedev.net/topic/601279-sending-and-receiving-structs-i-read-faq/page__view__findpost__p__4807012
enum Bool { True, False, FileNotFound };

#9 DarkRonin   Members   -  Reputation: 610


Posted 05 May 2011 - 04:43 PM

Thanks for the link hplus0603, checking it out now.

@frob - that was pretty much what I was thinking. So, getting too much data won't be an issue for me.

This is what I had in mind. A header (struct) containing an int for the data size and an int for the data type, where the data type might indicate another struct, compressed data, or raw data. Something like:

struct header
{
    int nType;
    int nSize;
};

So, nType will identify to the app what sort of data to expect and what to do with it.

Does this sound feasible?


[edit]
I am also going to look at IOCP as suggested earlier (or at least a multi-threaded approach), as I foresee problems with waiting around for data too long - like application jitter, if the data gets caught up somewhere.

#10 Drew_Benton   Crossbones+   -  Reputation: 1713


Posted 05 May 2011 - 05:41 PM

So, nType will identify to the app what sort of data to expect and what to do with it.

Does this sound feasible?


Yes, that's how it's usually done in practice for binary protocols. My advice is to keep it simple, and don't try to be clever with it.
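
For illustration, a minimal send-side sketch of that header-plus-payload framing (SendFramed is a made-up helper reusing the header struct from the previous post; for brevity it treats a short send() as failure, whereas real code on a blocking socket should loop until every byte is written):

#include <winsock2.h>

struct header
{
    int nType;
    int nSize;
};

// Sends one framed message: the fixed-size header, then the payload.
bool SendFramed(SOCKET s, int type, const char* data, int size)
{
    header h;
    h.nType = type;
    h.nSize = size;

    if (send(s, reinterpret_cast<const char*>(&h), sizeof(h), 0) != sizeof(h))
        return false;
    if (size > 0 && send(s, data, size, 0) != size)
        return false;
    return true;
}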

I've seen some variations, such as writing a size as a byte, then data, and then a continued size byte for more data and so on (the last size byte is 0 to signal no more data), but I've never seen the point of those implementations. In the end, they use 2 bytes the majority of the time and are not saving anything.

I prefer having the size there to give more flexibility to the protocol rather than determining the size from the ID alone. The reason being, fixed-size packets usually mean fixed-size strings, and those can really take a toll on a system. I've seen games use 512-byte fixed-size strings for every chat packet, which is pretty painful to think about...

I am also going to look at IOCP as suggested earlier (or at least a multi-threaded approach), as I foresee problems with waiting around for data too long - like application jitter, if the data gets caught up somewhere.


Any problems you would have with IOCP in that regard would also happen in any other network implementation. This issue belongs to the application protocol processing layer, not the underlying network layer. If you only care that the client is still sending data, then a client could send your ping packet over and over while not actually doing anything. Of course, you'd never know that if you only handle it at the network layer.

If instead you handle it at the application protocol processing level, you can time between different logical packets. Only then can you know that client A has not sent a login packet in 1 minute, so you should disconnect them; or that a client has not finished sending the packets required to complete an in-game transaction in 10 seconds, so you should roll back their actions and disconnect them. And so on; you have to be aware of such things if you want to avoid timing exploits in your system. It's surprising how many systems are actually vulnerable to them - I've come across a lot of flaws in games that were directly related to this issue.

"But I, being poor, have only my dreams. I have spread my dreams under your feet; tread softly, because you tread on my dreams." - William Butler Yeats

#11 DarkRonin   Members   -  Reputation: 610


Posted 05 May 2011 - 06:37 PM

Thanks for the advice. I agree 100% on variable packet sizes. That's why I was thinking that 'chat' might have an ID of 50 (for example) and then a variable size, to avoid wasting bandwidth on a bunch of zeros.

Client-wise, I was also toying with the thought of using blocking sockets and having a separate thread that monitors incoming data. That way the main loop of the app will never block, but will check the network class on every loop to see if a complete set of data has come in.

The recv thread could set a private variable when there is data ready to be used by the main app, and once the main app has used the data, it could reset the variable (via a public function) to tell the class it is ready for the next data structure.

That way the recv thread can hang around all it likes (within the application rules, etc.) without stopping the flow of the main program at all.

What are your thoughts on this?

#12 Drew_Benton   Crossbones+   -  Reputation: 1713


Posted 05 May 2011 - 11:06 PM

Since I've not seen it mentioned yet in this thread, I'll throw out the obligatory suggestion: consider using boost::asio, ACE, POCO, or any other known and established library that is designed to take care of these things for you. I'd recommend looking into boost::asio first, myself. Once you learn how to use libraries like those, you will rarely find the need to work at this low level again.

But if you still want to do it yourself, I would really start out with the simple design that hplus linked to in his other reply. The client would simply call select() with a 0 timeout so it does not block. If there is data to receive, you can then begin pulling it out and processing any complete messages. Depending on how much traffic and how many messages your game generates, you might not really need to separate the networking logic from the client thread.
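
A sketch of that zero-timeout select() poll, Winsock flavor (the helper name is illustrative; note that on Winsock the first parameter to select() is ignored):

#include <winsock2.h>

// Returns true if the socket has data waiting, without blocking.
bool HasPendingData(SOCKET s)
{
    fd_set readSet;
    FD_ZERO(&readSet);
    FD_SET(s, &readSet);

    timeval timeout = { 0, 0 }; // Zero timeout: poll and return immediately.

    int result = select(0, &readSet, NULL, NULL, &timeout);
    return result > 0 && FD_ISSET(s, &readSet);
}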

They can coexist without any issues as long as you take some precautions in your design. Namely, only check select() at a fixed rate rather than every update cycle, as checking for data each loop is a waste of resources. Once you begin message processing, keep track of the processing time so you can bail out if too much time is being spent; on the next update cycle, you would continue processing messages.

That should be about it, really. You should be able to implement and use a single-threaded solution that will last you a good while. You can consider changing the approach if you determine that running the network logic in your main thread is causing real client performance issues. Usually though, such issues are not related to the actual networking I/O so much as to how you deserialize and process the messages.

To get back to your question, I personally don't like using a synchronized variable to let the system know data is pending in this context. It's far too easy to come up with an implementation that suffers from a race condition or does not properly handle multiple events at the same time. Your actual implementation will vary based on how many producer threads and how many consumer threads you have. In the end, you still have to take a lock, unless you are using a lockless queue.

Because of this, I'd implement a solution like this on Windows (due to the way CRITICAL_SECTION works):

global message queue
global lock (critical section)

Network Thread

while connected
- locals: buffer, size, index, messages
- if we have room left to store messages (we do not want the network thread to flood the client ever)
--- recv to buffer
--- perform protocol specific logic to split buffer into messages
----- running this logic in a thread only gives benefits when there is overhead from
----- packet decryption or other expensive deserialization calls

- if we have messages
-- global lock (enter critical section)
--- for each message in messages
----- add message to global message queue
-- global unlock (leave critical section)

- Sleep only if the message queue is full (the client thread is not consuming
- fast enough), to let it catch up some.

loop

Main Client Thread

while running
-- everything but network stuff --

- if ready to check for network events ( say you check every 1/60 of a second, some games more, others less)
-- global lock (enter critical section)
---- copy global message queue to local var
---- clear global message queue
-- global unlock (leave critical section)

-- process all messages or queue them into a client message queue to
--- process over the next bunch of update cycles that network logic is not called.

loop
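
Translated into a minimal C++ sketch (the Message type and function names are placeholders; this only shows the lock-and-swap handoff the pseudocode describes):

#include <windows.h>
#include <vector>

struct Message { /* opcode, payload, ... */ };

CRITICAL_SECTION g_queueLock;          // InitializeCriticalSection() at startup.
std::vector<Message*> g_messageQueue;  // Shared between the two threads.

// Network thread: publish parsed messages under the lock.
void PublishMessages(std::vector<Message*>& localMessages)
{
    EnterCriticalSection(&g_queueLock);
    g_messageQueue.insert(g_messageQueue.end(),
                          localMessages.begin(), localMessages.end());
    LeaveCriticalSection(&g_queueLock);
    localMessages.clear();
}

// Client thread: swap the queue out under the lock, process outside it.
void ConsumeMessages(std::vector<Message*>& outMessages)
{
    EnterCriticalSection(&g_queueLock);
    outMessages.swap(g_messageQueue); // Pointer swap; the lock is held briefly.
    LeaveCriticalSection(&g_queueLock);
}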


But globals are bad!!1! Not always. In this case, you just want to be able to pass data from one thread to another. The easiest and most efficient way with the lowest overhead (on Windows) is a design like this. Simple and straightforward gets the job done. Since the client loop only attempts to acquire the lock at a fixed rate, the maximum number of lock contentions you can have is the inverse of that rate (i.e. imagine the threads happen to sync perfectly, so both contend for the lock each loop).

All of the CRITICAL_SECTION locks the network thread takes when adding messages have no effect on the client thread until the client thread tries to acquire the lock. At that point, the lock is held for a very short period of time, as the only operations taking place are pointer copies (assuming you use a fixed-size array as the copying medium, which also means you limit the producer thread to storing no more than N messages at a time). Say your global packet queue is just a fixed-size array (or a vector with a pre-allocated size) of 1024 elements or so, and you store an int for the count. The operations that take place under the lock are minuscule enough not to cause any client thread delays (no allocations are happening at this time!). I just threw out a number; you'd probably want to make it a bit smaller depending on the expected number of messages.

There are certainly other ways to do this, but then you are unnecessarily complicating what otherwise should be a very simple design. Other approaches include using PostThreadMessage to post the message objects to the client thread, using QueueUserAPC to invoke a function from the client thread context, attempting to create your own lockless queue, or making use of a library, like boost.

If you wanted something cross-platform, this approach would not work as efficiently, since it relies on CRITICAL_SECTION being so 'cheap' to use. It would still work with other locking mechanisms, but you have to be careful which one you choose, as some carry very high overhead. It should be noted that these things are not premature optimizations; it is simply a matter of choosing the right tools. While the 1024-element array might be borderline premature optimization, its other purpose is to let you gauge packet throughput so you know if something is going wrong, either in the network thread or in your client thread. If you have a lot of messages to process each update cycle, it means you might need to allocate more time to the network processing logic. Alternatively, if the time between runs of the network logic in the main thread is greater than the rate you have set, it means something else in your system is eating up more than its fair share of time.


My final advice would be to start simple and get that working. A single-threaded solution should be possible, and more than enough to handle what you want to throw at it at this stage from what I have read. It also simplifies a lot of other things, so if you can't get a viable solution with it, chances are you are doing something wrong (or perhaps even using the wrong protocol!). If you want to make use of more efficient networking methods right off the bat, then it is strongly recommended to use an existing library. There are tradeoffs to each approach, but I think you just need to get more familiar with each method to understand how it could help you (or not help you at all). Multithreaded programming in C++ should not be taken lightly, so be careful before just jumping right in!

"But I, being poor, have only my dreams. I have spread my dreams under your feet; tread softly, because you tread on my dreams." - William Butler Yeats

#13 DarkRonin   Members   -  Reputation: 610


Posted 06 May 2011 - 12:25 AM

Excellent post.

I'll do a bit of playing over the next few days to see what I can come up with. I think I am at the point where I can come up with something workable and see how it goes.

Thanks again guys! :cool:

#14 MickeyMouse   Members   -  Reputation: 201


Posted 08 May 2011 - 06:39 PM

You may also want to check out my solution to the problem (including source code), which is very similar to what hplus0603 proposed in a different thread, but it handles both sending and receiving for you using non-blocking TCP sockets.
Maciej Sawitus
my blog | my games

#15 ddyer   Members   -  Reputation: 124


Posted 09 May 2011 - 12:15 AM

You cannot make any assumptions about how many recv() calls will be needed or how the received blocks of data will be split up.
You should never wait for data except in a task that does nothing else; otherwise, use non-blocking I/O.
Don't forget that your send() calls may block too, unless you are using non-blocking sends.
You can never make any assumptions, however reasonable, about the buffering capacity of the channel or the time delays that may be involved in delivering the data.
That's TCP. It's the price you pay for guaranteed delivery.

---visit my game site http://www.boardspace.net - free online abstract strategy games

#16 Bozebo   Members   -  Reputation: 108


Posted 09 May 2011 - 05:47 AM

The way I do this is to update a buffer whenever recv() returns something (so recv is called in a loop somewhere; 10 times a second works OK for me at the moment).
The first 2 bytes of the buffer are always meant to be an unsigned short representing an op-code. Each op-code relates to a command (such as login, update health, spawn zoned-in player, etc.).

Every command has a known exact length or a variable length. If they have a variable length, the 2 bytes after the op-code describe this.

so:
[ushort op-code][ushort length][bytes for message data]

or for a message of known length:
[ushort op-code][exact number of bytes for message data]

the buffer could be something like this:
[op][len][arguments][op][arguments][op][arguments][op][len][arguments][op][op][len][arguments]

(nothing stopping you from making messages without any arguments; all that is needed is the op-code)

The only way the buffer can become corrupted (i.e., the first 2 bytes are not in fact an op-code, but any other 2 bytes) is if there is bad programming somewhere or the packets leaving the client machine are being manipulated (a custom client application, a man in the middle, etc.). Input validation should result in notification of a corrupted buffer; my applications' buffers only get corrupted when I purposely make them do so to test the robustness of my code. My code always detects the corrupted buffer, but sometimes there are artifacts. For example, if the first 2 bytes happen to look like a [moveto] message when in fact they were meant to be part of something completely different, the bytes after them are parsed as the moveto location. At the moment my clients' input for some things is instantly trusted, so this can cause the player to teleport to an inaccessible location, rather than the server noticing that the requested location is impossible.


With this format, it can always be known how many bytes are expected for a message, so the char array buffer can be clipped one message at a time:

First, decide how many bytes following the first 2 are part of the message (e.g., login is known to send an unsigned long playerID and a 32-byte MD5 sum).
Second, check that the buffer is at least this many bytes long.
Third, extract your data and trim it off the start of the buffer; the 2 bytes now at the start of the buffer should be the op-code for the next waiting message.

You may want to incorporate things like a maximum number of messages to resolve in one go, and a maximum buffer size to be expected from a well-behaved device on the other end (if you discover your game code only leaves about 200 bytes in the server's buffer on average, cap it at 1500 or so; if a connection has more than 1500 bytes in its buffer for any length of time, kill it).

I also like to have a message that denotes how many messages from that device have been handled, so the client knows the server is keeping up with its message output (this works if you count messages out and in on both sides of the connection).

One problem I find doing things this way is that it is hard to build a proper object-oriented system to handle messages; I tend to resort to hard-coding most messages with switch statements... messy. Complex things, like spawning new players that have just logged in, send the argument bytes to a deserialise method on my object. If I had a command/event system, it would help make this easier.

#17 hplus0603   Moderators   -  Reputation: 5181


Posted 09 May 2011 - 12:57 PM

[ushort op-code][ushort length][bytes for message data]

or for a message of known length:
[ushort op-code][exact number of bytes for message data]


I find that it's more robust to always send the length. This allows you to do things like record/replay, proxying, and better structuring of the network stack.


Specifically, you might want your network stack to have a layer that lets you say "get next complete packet if there is one." If the opcode tells you whether the data is known length or not, then that layer actually has to understand what the opcodes mean. This is a high degree of coupling that's generally not very robust. If you always send the opcode and length in a known format, then that layer can give you a complete packet, without needing to know anything about what might be in that packet, or what opcodes "mean."
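
One way to sketch such a framing layer (illustrative only; the buffer type and return convention are assumptions):

#include <cstdint>
#include <cstring>
#include <vector>

// If the buffer holds a complete [uint16 opcode][uint16 length][payload]
// packet, copies it into outPacket, trims it from the buffer, and returns
// true. Knows nothing about what any opcode means.
bool GetNextCompletePacket(std::vector<char>& buffer,
                           std::vector<char>& outPacket)
{
    const std::size_t headerSize = 2 * sizeof(std::uint16_t);

    if (buffer.size() < headerSize)
        return false;

    std::uint16_t length = 0;
    std::memcpy(&length, buffer.data() + sizeof(std::uint16_t), sizeof(length));

    const std::size_t totalSize = headerSize + length;
    if (buffer.size() < totalSize)
        return false;

    outPacket.assign(buffer.begin(), buffer.begin() + totalSize);
    buffer.erase(buffer.begin(), buffer.begin() + totalSize);
    return true;
}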
enum Bool { True, False, FileNotFound };

#18 Bozebo   Members   -  Reputation: 108


Posted 09 May 2011 - 03:47 PM

If the opcode tells you whether the data is known length or not, then that layer actually has to understand what the opcodes mean. This is a high degree of coupling that's generally not very robust. If you always send the opcode and length in a known format, then that layer can give you a complete packet, without needing to know anything about what might be in that packet, or what opcodes "mean."


That makes sense. I have been struggling to find a proper way to form my messaging system into a true object-oriented way of working. I knew that programming a way to "get the next complete packet if there is one" would be awkward; if all messages have a size, it will be cleaner and simpler.

#19 Drew_Benton   Crossbones+   -  Reputation: 1713


Posted 09 May 2011 - 06:07 PM

That makes sense. I have been struggling to find a proper way to form my messaging system into a true object-oriented way of working. I knew that programming a way to "get the next complete packet if there is one" would be awkward; if all messages have a size, it will be cleaner and simpler.


Take a look at this thread and this thread whenever you get a chance.

The main concepts discussed in those threads allow you to keep an OOP driven design while still allowing you to implement the practical aspects of network communication. You serialize your objects to a stream first. Once that is done, you now have the length of the payload. You can then just build a second 'packet' for the header data (opcode, size, anything else) and send the header followed by the payload. If you wanted to implement encryption or compression, you just have to work with an intermediate buffer, but everything else stays the same. When you receive data, you process the header first then the payload. You pass the payload to the deserialize function and out comes your object! It's a pretty nice design, imo, and it works well.

The downside to the approach is, as with any generic de/serialization code, wasted bytes. Since you control the actual types that are written, you might consider optimizations over the longer run to control bandwidth costs, after you have carefully measured bandwidth usage. For example, rather than writing a size_t for a string length (which is never recommended anyway, since its size can change between 32-bit and 64-bit architectures), you write a UInt16 and then, on read, check that the size is within an acceptable range.
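
For example, a sketch of string de/serialization with an explicit UInt16 length prefix and a range check on read (the stream type, helper names, and the 4096 cap are all assumptions for illustration):

#include <cstdint>
#include <cstring>
#include <stdexcept>
#include <string>
#include <vector>

// Appends [uint16 length][bytes] to the output stream.
void WriteString(std::vector<char>& stream, const std::string& s)
{
    if (s.size() > UINT16_MAX)
        throw std::length_error("string too long to serialize");

    std::uint16_t length = static_cast<std::uint16_t>(s.size());
    const char* lengthBytes = reinterpret_cast<const char*>(&length);
    stream.insert(stream.end(), lengthBytes, lengthBytes + sizeof(length));
    stream.insert(stream.end(), s.begin(), s.end());
}

// Reads a [uint16 length][bytes] string, rejecting absurd lengths.
std::string ReadString(const char* data, std::size_t available)
{
    if (available < sizeof(std::uint16_t))
        throw std::runtime_error("truncated string header");

    std::uint16_t length = 0;
    std::memcpy(&length, data, sizeof(length));

    const std::uint16_t kMaxStringLength = 4096; // acceptable-range check
    if (length > kMaxStringLength || available < sizeof(length) + length)
        throw std::runtime_error("bad string length");

    return std::string(data + sizeof(length), length);
}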

"But I, being poor, have only my dreams. I have spread my dreams under your feet; tread softly, because you tread on my dreams." - William Butler Yeats



