Handling disconnections with multi-threaded

3 comments, last by Chindril 11 years, 8 months ago
This question is a bit specific to C++ since it is the language I use for my network game.

My game server runs the network in the main loop, and every network message is pushed onto a queue. Then I have a bunch of Workers (each one running in its own thread) that take the network messages and do the appropriate job (let's call them WorkerJobs).

I have 2 classes representing a client: User, which represents the network connection itself, and Player, which represents the logged-in player. A User can have a Player (if logged in). Every WorkerJob has a pointer to the User that sent the network message.

Now the issue is that when I receive a network disconnection (on the main thread), I need to force a logout of the Player and clean it up, as well as the User. The problem arises if WorkerJobs that are currently processing or still in the queue hold pointers to this specific User.

What I'm trying to do on a player's logout (either manual or because the connection was lost in any way) is remove the queued network messages from this player, wait until the WorkerJobs are done, and THEN clean up my pointers. However, I cannot wait, since the disconnection happens in the main thread and blocking there is out of the question.

Does anyone have experience with something similar and could give me tips on pitfalls to avoid?

Thanks,
I used a timestamp to identify the connection a message is coming from. If a user disconnects and reconnects very quickly, this makes it possible to tell that a message about to be sent to the user belongs to a connection that no longer exists, even though it is the same user. This is the scenario for a web server when you hold down the reload button.
The web server gets a request, the networking thread processes the incoming HTTP request, and it sends the parsed request to the loader stage.
While the loader prepares the answer, the connection is closed, and the loader sends a document for a connection that no longer exists. The communication stalls at some point.
The timestamp becomes important because the memory manager may give you the same pointer for the same client but for a new request. When the answer from the loader stage arrives, you do not know whether it is for an old request or for the new one.
The sender drops the message if the timestamp in the loader stage's answer is not the same as the timestamp of the connection.
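To make the stale-reply check concrete, here is a rough sketch of how it could look. All names (Sender, Reply, ConnectionSlot, Stamp) are made up for illustration, and it uses an increasing counter in place of a wall-clock timestamp, which is an assumption on my part and not necessarily how the server described above does it.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <unordered_map>

// Hypothetical names throughout; none of this is taken from real server code.
using ConnectionSlot = int;    // stands in for a reused pointer or socket handle
using Stamp = std::uint64_t;   // stands in for the connection timestamp

struct Reply {
    ConnectionSlot slot;  // which connection the reply is addressed to
    Stamp stamp;          // stamp the connection had when the request came in
    std::string body;
};

class Sender {
public:
    // Register a new connection in a slot and return its unique stamp.
    Stamp open(ConnectionSlot slot) {
        Stamp s = ++next_stamp_;
        live_[slot] = s;
        return s;
    }

    void close(ConnectionSlot slot) { live_.erase(slot); }

    // Returns true if the reply was sent, false if it was dropped as stale.
    bool send(const Reply& r) {
        auto it = live_.find(r.slot);
        if (it == live_.end() || it->second != r.stamp) {
            return false;  // connection is gone, or the slot was reused since
        }
        ++sent_;  // a real sender would write r.body to the socket here
        return true;
    }

    int sent() const { return sent_; }

private:
    std::unordered_map<ConnectionSlot, Stamp> live_;
    Stamp next_stamp_ = 0;
    int sent_ = 0;
};
```

A counter avoids two connections getting the same stamp when they open within the same clock tick; a timestamp works the same way as long as it is unique per connection.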
You can solve this problem either by reference counting the player structure (so the processors will still process messages until all messages are gone), or by giving each user its own message queue and, on disconnect, locking that queue, flushing it, and posting a "user disconnected -- kill this user" message on it.

You probably still need some reference counting if you allow multiple worker threads to process messages for the same user -- a better approach might be to use a boost::asio::strand (or emulate that same functionality for whatever threading you use). This means messages for a particular user are never processed concurrently and always run in order, as if funneled through a single thread, so you just post the "this user disconnected" message, and then messages will be processed in order until the disconnect is reached.
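For anyone who wants to emulate the strand idea without boost::asio, a minimal sketch could look like the following. UserStrand and its executor callback are hypothetical names, not part of any library; the executor is assumed to hand a callable to whatever worker pool you already have.

```cpp
#include <cassert>
#include <deque>
#include <functional>
#include <mutex>
#include <utility>

// Hypothetical strand-like helper: tasks posted to one UserStrand never run
// concurrently with each other and run in the order they were posted.
class UserStrand {
public:
    using Task = std::function<void()>;

    // The executor passes a callable on to some worker thread (assumption:
    // your existing job queue can accept a std::function<void()>).
    explicit UserStrand(std::function<void(Task)> executor)
        : executor_(std::move(executor)) {}

    void post(Task task) {
        bool schedule = false;
        {
            std::lock_guard<std::mutex> lock(mutex_);
            queue_.push_back(std::move(task));
            if (!draining_) {      // nobody is draining this user's queue yet
                draining_ = true;
                schedule = true;
            }
        }
        // At most one drain job per user is in flight at any time.
        if (schedule) executor_([this] { drain(); });
    }

private:
    void drain() {
        for (;;) {
            Task task;
            {
                std::lock_guard<std::mutex> lock(mutex_);
                if (queue_.empty()) {
                    draining_ = false;
                    return;
                }
                task = std::move(queue_.front());
                queue_.pop_front();
            }
            task();  // run outside the lock so post() is never blocked by it
        }
    }

    std::function<void(Task)> executor_;
    std::mutex mutex_;
    std::deque<Task> queue_;
    bool draining_ = false;
};
```

With this, the main thread posts the "user disconnected" task last and never blocks; every message posted earlier runs before it. One thing to watch: the UserStrand must outlive the drain call, so the disconnect task should not destroy the strand it is running on.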
enum Bool { True, False, FileNotFound };
@Tribad: My case is a bit different from an HTTP server. My connections are persistent, and if one is disconnected I will drop the corresponding messages. What I'll probably do is mark the user as disconnected as soon as it happens, but keep it in memory until all its messages are processed or dropped, and then clean up the user.

@hplus0603: I'm probably gonna change my raw pointers to boost::shared_ptr in this case so they are reference counted. This should fix my problem of sharing and memory deletion. About boost::asio::strand, your idea is very interesting. If I can manage to have all of a user's messages processed in the same thread, it would fix a whole lot of my problems. I have never touched boost::asio, however (I'm using boost::thread for my threading), and I'll need to spend a lot of time understanding it before implementing it in my code. I might try to implement something like that myself, as the logic itself is pretty straightforward.
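Roughly what I have in mind for the reference counting plus "disconnected" flag, as a sketch only. It uses std::shared_ptr and std::atomic from C++11 as stand-ins for the boost equivalents, and User and WorkerJob here are stripped-down placeholder versions of my real classes.

```cpp
#include <atomic>
#include <cassert>
#include <memory>

// Placeholder User: only the parts needed to show the lifetime handling.
class User {
public:
    // Called from the main thread as soon as the connection is lost.
    void markDisconnected() { disconnected_.store(true); }
    bool isDisconnected() const { return disconnected_.load(); }

private:
    std::atomic<bool> disconnected_{false};
};

// Placeholder WorkerJob: holds a shared_ptr instead of a raw pointer, so the
// User cannot be deleted while the job is queued or running.
struct WorkerJob {
    std::shared_ptr<User> user;

    // Returns true if the message was handled, false if it was dropped
    // because the user had already disconnected.
    bool run() {
        if (user->isDisconnected()) {
            return false;
        }
        // ... handle the network message for this user ...
        return true;
    }
};
```

The main thread just marks the user and drops its own shared_ptr, without waiting. The User is destroyed when the last queued or running WorkerJob releases its reference. The flag only stops jobs that have not started yet; a job that is already past the check still runs to completion.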

Thanks to both of you,
Just to let you guys know, I implemented the solution of my last post and it seems to be working very well so far. I'll need much more testing and profiling to be sure it's 100% stable but for now I'll stick with this. Thanks for your help.

