Multithreaded server - horrible performance

27 comments, last by LycaonX 14 years ago
I've got a game server with 160-200 clients. Well, a potential server. I'm using .NET's sockets with the async methods (BeginReceive, etc). My problem is, I'm experiencing MAJOR lag with connected clients when there are more than 25. This can be any random mix of actual players and NPC connections. Even clients connected over my LAN, or on the local computer itself, are seeing ping times of 500-3500ms. Now, these are not ICMP pings; they're done over the client's single TCP connection, as I suspect any other online game does it. If I have just a dozen or two player clients, their pings are in the usual range you'd expect from random players all over the world: 50ms to 400ish (for dialup). My personal ping from my client to the server is ALWAYS 0-1ms with any number of clients under 25. Once I add more in, my ping climbs up to 3500ms, which is ridiculous for a self-ping.

From some of the research I've been able to turn up on Google (not much), the async methods of .NET's sockets grab threads from the process ThreadPool, which makes up to 25 threads available to classes that use it. So that means I have 25 threads servicing 200 connections, which I think MIGHT be the problem, but I would like the advice of more experienced coders. Most of the 'clients' are NPC connections. The NPC application is in C++ and not easily modifiable, so I'm not too keen on messing with it at the moment. If I look at Task Manager, I see that the NPC app has a single thread for each NPC connection.

So I'm kind of flailing in the dark here. I'm not sure what the 'name' of the solution is, so I'm not sure what to type into Google to find whatever answer I need. Let me know if you need more information in order to diagnose the problem and/or fix it.
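In case it matters, the receive path is basically the textbook BeginReceive/EndReceive loop. Here's a stripped-down sketch of the pattern (class names and buffer size are made up for illustration, not the actual server code):

using System;
using System.Net.Sockets;

// Rough sketch of the receive loop in question; names are placeholders.
class ClientState
{
    public Socket Socket;
    public byte[] Buffer = new byte[4096];
}

class Receiver
{
    public static void StartReceive(ClientState state)
    {
        // Kicks off an async read; the callback fires on a ThreadPool thread.
        state.Socket.BeginReceive(state.Buffer, 0, state.Buffer.Length,
            SocketFlags.None, OnReceive, state);
    }

    static void OnReceive(IAsyncResult ar)
    {
        var state = (ClientState)ar.AsyncState;
        int read = state.Socket.EndReceive(ar);
        if (read > 0)
        {
            // ...parse the packet, update ping timers, etc...
            StartReceive(state); // queue the next read
        }
        else
        {
            state.Socket.Close(); // 0 bytes means the remote side closed
        }
    }
}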
Thread pools and non-blocking sockets are the way to go. I don't know about the async versions; I've never used them because of the potential thread-locking issues.

I also suspect there's some calling overhead with all the async functions being called.

Actually, having only a single thread (like in haproxy) can be faster than threading.

How does your memory behave? It could be collection resize issues (which can be fixed by reserving more space), or it could be garbage collection kicking in.

On the MMOs I've worked on we had a maximum of 12 threads, and sometimes (during live debugging) we had only 1, and it ran perfectly (in Java, mind you).

Also, did you try profiling the server?
Did you verify that the server only allocates a maximum of 25 threads? (As mentioned, you might speed up the application by lowering the threadpool count, although that might hide the real issue.)
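Something like this would at least confirm what the pool actually allows (just a quick check sketch to run on the server box; the exact limits depend on the runtime version):

using System;
using System.Threading;

// Quick check of the ThreadPool limits; SetMinThreads is commented out
// because raising it blindly can hide the real issue.
class PoolCheck
{
    static void Main()
    {
        int workerMax, ioMax, workerMin, ioMin;
        ThreadPool.GetMaxThreads(out workerMax, out ioMax);
        ThreadPool.GetMinThreads(out workerMin, out ioMin);
        Console.WriteLine("Worker threads: {0} min / {1} max", workerMin, workerMax);
        Console.WriteLine("I/O completion threads: {0} min / {1} max", ioMin, ioMax);

        // ThreadPool.SetMinThreads(50, 50); // avoids the pool's slow ramp-up under bursts
    }
}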
Yeah, I profiled it. I also ramped up the NPC connections to 400 and it still shows a max of 25 threads.

The async socket methods internally use the ThreadPool, according to MSDN and several other sites found via Google.

From what I read, calling a socket Begin* method grabs an idle thread from the process's ThreadPool and holds onto it until the async operation completes, then releases that thread back to the pool.
Ok. Did you try lowering the number of threads? (The whole async thingie could be swamping the CPU(s))

What did the profiler say with regards to memory/cpu usage?

You could implement thread pooling yourself with non-blocking sockets to see if that helps. I did this with great success in Java, so it should be of similar performance.
~20 megs of memory usage, give or take about 500k, and 1m44s of CPU time since I started the server about three hours ago with 160 connections.
Microsoft recommends using async sockets, and others usually recommend this as well on the Windows platform, so it sounds like you are using the preferred method. Could you profile your app and see where it is spending a lot of its time?

I am thinking the problem you are having is that with only 25 threads, you can only perform 25 async read operations at the same time. Any more will have to wait in a queue. And since an async read operation will not return until some data has been received, that means your queued-up clients are waiting on other clients to send data, which would definitely increase ping times.


Another thought I had was that you might be spending too much time in your callback method. This will also tie up your threads.
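If the callbacks are doing real work, one option (a rough sketch, not based on your actual code) is to have the callback do nothing but enqueue the received bytes and let your game thread drain the queue:

using System.Collections.Generic;

// Sketch: the receive callback only enqueues raw data; a game-loop thread
// dequeues and does the actual parsing, so ThreadPool threads are freed quickly.
class PacketQueue
{
    readonly object _lock = new object();
    readonly Queue<byte[]> _pending = new Queue<byte[]>();

    public void Enqueue(byte[] data)        // called from the async callback
    {
        lock (_lock) { _pending.Enqueue(data); }
    }

    public bool TryDequeue(out byte[] data) // called from the game thread
    {
        lock (_lock)
        {
            if (_pending.Count > 0) { data = _pending.Dequeue(); return true; }
            data = null;
            return false;
        }
    }
}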


Quote:Original post by landagen
I am thinking the problem you are having is that with only 25 threads, you can only perform 25 async read operations at the same time.


That's what I'm thinking, as well. Like I mentioned, the NPC application appears to use 1 thread per connection... But it's very well known that the NPC client we have is part of a project that uses... eh... saying 'bad coding practices' would be putting it in a nice light.

I really wouldn't mind recoding the client class to use blocking sockets, each connection with its own thread, but performance-wise, is having 200-400 threads really a smart idea?
I haven't done any network programming in .NET, but have you considered giving the old-fashioned Select a try? I see they still include it as a static method on the Socket class.
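Something along these lines is the rough idea (untested sketch; how you track the client sockets is up to you):

using System.Collections.Generic;
using System.Net.Sockets;

// Rough idea only: poll every client socket from one thread with Socket.Select.
class SelectLoop
{
    public static void PollOnce(List<Socket> clients)
    {
        if (clients.Count == 0) return;

        // Select trims the list down to the sockets that are ready, so pass a copy.
        var readable = new List<Socket>(clients);
        Socket.Select(readable, null, null, 1000); // timeout in microseconds

        foreach (Socket s in readable)
        {
            // read whatever is available on s here without blocking
        }
    }
}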
Quote:
That's what I'm thinking, as well. Like I mentioned, the NPC application appears to use 1 thread per connection... But it's very well known that the NPC client we have is part of a project that uses... eh... saying 'bad coding practices' would be putting it in a nice light.

I really wouldn't mind recoding the client class to use blocking sockets, each connection with its own thread, but performance-wise, is having 200-400 threads really a smart idea?
C# by default allocates 1 MB for thread stacks, so you are talking about an order of magnitude or more increase in memory usage.
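If you do end up with a thread per connection anyway, the Thread constructor overload that takes a stack size lets you shrink that; the 128 KB figure below is just an example, not a recommendation:

using System.Threading;

// Illustrative only: create a client thread with a smaller-than-default stack.
class ClientThreadExample
{
    static void Main()
    {
        var t = new Thread(ClientLoop, 128 * 1024); // 128 KB stack instead of the default 1 MB
        t.IsBackground = true;
        t.Start();
        t.Join();
    }

    static void ClientLoop()
    {
        // a blocking Receive() loop for one connection would go here
    }
}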
Looks like 85% of the total time is spent in Thread.Sleep calls (spread across 3 threads: main, movement, and misc game events). The other 15% is spread pretty evenly across the ~50 or so other methods being called, anywhere between 0.3% and 0.8% of total processing time each.

