Since I've not seen it mentioned yet in this thread, I'll throw out the obligatory "consider using boost::asio, ACE, POCO, or any other known and established library that is designed to take care of these things for you." I'd recommend looking into boost::asio first myself. Once you learn how to use libraries like those, you will rarely find the need to work at this low level again.
But if you still want to do it yourself, I would really start out with the simple design that hplus linked to in his other reply. The client would simply call select with a 0 timeout so it does not block. If there is data to receive, you can then begin pulling it out and processing any complete messages. Depending on how much traffic and how many messages take place in your game, you might not really need to separate the networking logic from the client thread.
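A minimal sketch of that zero-timeout check, assuming POSIX sockets (Winsock offers the same select call, but the headers and socket type differ). The names has_pending_data and pull_available are made up for illustration, and splitting the received bytes into complete messages is protocol specific, so it is left out here.

```cpp
#include <cassert>
#include <string>
#include <sys/select.h>   // select, fd_set (POSIX; Winsock uses <winsock2.h>)
#include <sys/socket.h>   // recv
#include <sys/time.h>     // timeval

// Returns true if `sock` has something to read right now (data, or a close
// that recv will report). The zero timeout makes select() return immediately.
bool has_pending_data(int sock) {
    assert(sock >= 0 && sock < FD_SETSIZE);
    fd_set read_set;
    FD_ZERO(&read_set);
    FD_SET(sock, &read_set);

    timeval no_wait{};    // tv_sec = 0, tv_usec = 0: poll, do not block
    const int ready = select(sock + 1, &read_set, nullptr, nullptr, &no_wait);
    return ready > 0 && FD_ISSET(sock, &read_set);
}

// Appends whatever is currently available to `out`.
// Returns false if the peer closed the connection or recv failed.
bool pull_available(int sock, std::string& out) {
    char buf[4096];
    while (has_pending_data(sock)) {
        const ssize_t n = recv(sock, buf, sizeof buf, 0);
        if (n <= 0) return false;            // 0 = closed, <0 = error
        out.append(buf, static_cast<std::size_t>(n));
    }
    return true;
}
```

The client loop would call pull_available when it is time to check the network, then hand the accumulated bytes to whatever code splits them into messages.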
They can coexist without any issues as long as you take some precautions in your design. Namely, you only check select at a fixed rate rather than every update cycle, as checking to see if there is data each loop is a waste of resources. Once you begin message processing, you keep track of processing time so you can bail out if too much time is being spent. On the next update cycle, you would continue processing messages.
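Those two precautions could look roughly like the sketch below. RateGate and process_with_budget are hypothetical names, messages are plain strings only to keep the example short, and the choice to always handle at least one message per call is my own assumption so that a tiny budget cannot stall the queue forever.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <deque>
#include <functional>
#include <string>

using Clock = std::chrono::steady_clock;

// Fires at most once per `interval`; call due() every update cycle and only
// touch the network when it returns true.
class RateGate {
public:
    explicit RateGate(Clock::duration interval) : interval_(interval) {}

    bool due(Clock::time_point now) {
        if (now < next_) return false;
        next_ = now + interval_;
        return true;
    }

private:
    Clock::duration interval_;
    Clock::time_point next_{};   // clock epoch, so the first call is due
};

// Handles queued messages until the queue is empty or `budget` is used up.
// Returns how many were handled; the rest stay queued for the next cycle.
std::size_t process_with_budget(
        std::deque<std::string>& pending,
        Clock::duration budget,
        const std::function<void(const std::string&)>& handle) {
    assert(handle);
    const auto deadline = Clock::now() + budget;
    std::size_t handled = 0;
    while (!pending.empty()) {
        handle(pending.front());
        pending.pop_front();
        ++handled;
        if (Clock::now() >= deadline) break;   // out of time, bail out
    }
    return handled;
}
```

In the update loop you would construct something like RateGate gate(std::chrono::milliseconds(16)) once, and on each cycle where gate.due(Clock::now()) is true, poll the socket and then call process_with_budget with whatever slice of the frame you can spare.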
That should be about it really. You should be able to implement and use a single threaded solution that will last you a good while. You can consider changing the method if you determine that running the network logic in your main thread is causing real performance issues in the client. Usually though, such issues are not related to the actual networking IO aspect as much as just how you deserialize and process the messages.
To get back to your question, I personally don't like using a synchronized variable to let the system know data is pending in this context. It's far too easy to come up with an implementation that suffers from a race condition or does not properly handle multiple events at the same time. Your actual implementation will vary based on how many producer threads and how many consumer threads you have. In the end, you still have to take a lock, unless you are using a lockless queue.
Because of this, I'd implement a solution like this on Windows (due to the way CRITICAL_SECTION works):
global message queue
global lock (critical section)
Network Thread
while connected
- locals: buffer, size, index, messages
- if we have room left to store messages (we do not want the network thread to flood the client ever)
-- recv to buffer
-- perform protocol specific logic to split buffer into messages
--- (running this logic in a thread only gives benefits when there is overhead from
--- packet decryption or other expensive deserialization calls)
- if we have messages
-- global lock (enter critical section)
--- for each message in messages
---- add message to global message queue
-- global unlock (leave critical section)
- Sleep only if message queue is full: client thread is not consuming fast enough,
- so let it catch up some.
loop
Main Client Thread
while running
- (everything but network stuff)
- if ready to check for network events (say you check every 1/60 of a second, some games more, others less)
-- global lock (enter critical section)
--- copy global message queue to local var
--- clear global message queue
-- global unlock (leave critical section)
-- process all messages or queue them into a client message queue to
-- process over the next bunch of update cycles in which the network logic is not called.
loop
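To make the hand-off between the two loops concrete, here is a rough portable sketch. It swaps in std::mutex where the pseudocode uses a CRITICAL_SECTION, so it is not the Windows-specific version described here, and it wraps the queue and lock in a class instead of using two globals. MessageMailbox, push_batch and drain are hypothetical names, and messages are plain strings purely for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <mutex>
#include <string>
#include <vector>

// Bounded hand-off queue between one network thread and the client thread.
class MessageMailbox {
public:
    explicit MessageMailbox(std::size_t capacity) : capacity_(capacity) {
        assert(capacity > 0);
        queue_.reserve(capacity);   // allocate up front, not under the lock
    }

    // Network thread: moves as many messages out of `batch` as there is
    // room for and removes them from `batch`. A return value smaller than
    // the batch size means the queue is full and the client is behind.
    std::size_t push_batch(std::vector<std::string>& batch) {
        std::size_t accepted = 0;
        {
            std::lock_guard<std::mutex> lock(mutex_);
            while (accepted < batch.size() && queue_.size() < capacity_) {
                queue_.push_back(std::move(batch[accepted]));
                ++accepted;
            }
        }
        batch.erase(batch.begin(),
                    batch.begin() + static_cast<std::ptrdiff_t>(accepted));
        return accepted;
    }

    // Client thread: "copy then clear" done as a single swap, so only a few
    // pointers change hands while the lock is held. Pass in a vector with
    // reserved capacity to keep later pushes allocation-free.
    void drain(std::vector<std::string>& out) {
        out.clear();
        std::lock_guard<std::mutex> lock(mutex_);
        queue_.swap(out);
    }

private:
    std::mutex mutex_;
    std::vector<std::string> queue_;
    std::size_t capacity_;
};
```

The network thread would call push_batch after splitting its receive buffer into messages and sleep briefly whenever fewer messages were accepted than offered. The client thread calls drain at its fixed rate and processes the local vector outside the lock.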
But globals are bad!!1! Not always. In this case, you just want to be able to pass data from one thread to another. The easiest and most efficient way that has the lowest overhead (on Windows) is using a design like this. Simple and straightforward gets the job done. Since the client loop only attempts to acquire the lock at a fixed rate, the maximum number of lock contentions per second is capped at that rate (e.g. 60 per second if you check every 1/60 of a second, and only if the threads happen to sync perfectly so that both contend for the lock on every loop).
All of the CRITICAL_SECTION locks the network thread takes when adding messages have no effect on the client thread until the client thread tries to acquire the lock. At that point, the lock is held for a very short period of time, as the only operations taking place are pointer copies (assuming you are using a fixed size array as the copying medium, which means you limit the producer thread to no more than N stored messages at a time). Say your global packet queue is just a fixed size array (or vector with a pre-allocated size) of 1024 elements or so and you store an int for the size. The operations that take place between the locks are minuscule enough not to cause any client thread delays (no allocations are taking place at this time!). I just threw out a number; you'd probably want to make it a bit smaller depending on the expected number of messages.
There are certainly other ways to do this, but then you are unnecessarily complicating what otherwise should be a very simple design. Other approaches include using PostThreadMessage to post the message objects to the client thread, using QueueUserAPC to invoke a function from the client thread context, attempting to create your own lockless queue, or making use of a library, like boost.
If you wanted something that was cross-platform, then this approach would not work as efficiently since it relies on CRITICAL_SECTION being so 'cheap' to use. It would still work with other locking mechanisms, but you have to be careful which one you choose, as some carry very high overhead. It should be noted that these things are not premature optimizations; it is simply a matter of choosing the right tools. While the 1024 array might be borderline premature optimization, its other purpose is to allow you to gauge packet throughput to know if something is going wrong, either in the network thread or your client thread. If each update cycle you have a lot of messages to process, it means you might need to allocate more time for the network processing logic. Alternatively, if the time between the network logic processing in the main thread is greater than the rate you have set, it means something else in your system is eating up more than its fair share of time.
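One way to turn those two warning signs into something you can log is sketched below. NetGauge is a made-up helper, and the exact conditions (a drained batch that fills the whole queue, and a gap between polls longer than the chosen interval) are my reading of the two cases above rather than fixed rules, so tune them to your game.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>

// Hypothetical health check run by the client thread after each poll.
class NetGauge {
public:
    NetGauge(std::size_t queue_capacity,
             std::chrono::milliseconds poll_interval)
        : capacity_(queue_capacity), interval_(poll_interval) {
        assert(queue_capacity > 0);
    }

    // The drained batch filled the whole queue, so the network thread was
    // being throttled: network processing may need more time per cycle.
    bool saturated(std::size_t drained) const {
        return drained >= capacity_;
    }

    // More time passed between polls than the chosen rate allows, so
    // something else in the update loop is taking more than its share.
    bool late(std::chrono::milliseconds since_last_poll) const {
        return since_last_poll > interval_;
    }

private:
    std::size_t capacity_;
    std::chrono::milliseconds interval_;
};
```

Counting how often each condition fires over a play session tells you whether the queue size or the polling rate needs adjusting.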
My final advice would be to start simple and get that to work. A single threaded solution should be possible and more than enough to handle what you want to throw at it at this stage from what I have read. It also simplifies a lot of other things so if you can't get a viable solution with it, chances are you are doing something wrong (or perhaps even using the wrong protocol!). If you want to make use of more efficient networking methods right off the bat, then it is strongly recommended to use an existing library. There are tradeoffs to each approach, but I think you just need to get more familiar with each method to understand how they could help you out (or not help you at all). Multithreaded programming in C++ should not be taken lightly, so be careful before just jumping right in!