Jump to content

  • Log In with Google      Sign In   
  • Create Account


phiwer

Member Since 14 Apr 2013
Offline Last Active Sep 20 2013 12:19 PM
-----

Posts I've Made

In Topic: MMO Network Layer Architecture

20 September 2013 - 12:19 PM

Note that in .NET, proper use of the XxxAsync() functions actually solve the C10k problem for you.

At least with Microsoft's implementation -- I don't know about the mono implementation.

There may be other reasons not to use Microsoft servers, though.

 

I found an implementation on codeproject which I'm using as a base for my server. So far performance seems ok.


In Topic: MMO Network Layer Architecture

20 September 2013 - 12:18 PM

I think you are really trying to "put the cart before the horse". If you're *really* making a MMO, then the funding you have for content-creation should mean that you can afford to hire some experts for the network parts.

 

On the other hand, if you're making a smaller online multiplayer game (which is fine, too!) then you really don't need to worry about "the C10k problem". Pretty much any approach will definitely work.

 

Personally I can't really see a situation where any amateur-developed game is likely to hit the limitations of networking, unless you've made some incredibly bad decisions on the protocol design - which is something you should really think about early on.

 

There are a lot of hard things about network game programming, and the things you've mentioned are not part of them.

 

It's a remake of an old game. It's not a FPS type of game, but a web based one.

 

I just thought it was an interesting read (the articles) and wanted to know if anyone had done any testing/experiments of how to develop a server with that method.


In Topic: MMO Network Layer Architecture

18 September 2013 - 05:53 AM

Given the way the async works, you generally have to consider things a bit differently between input and output. The "input" (i.e. reading from client) side is basically something like:

AsyncBegin
Sleep (or wait for run control messages)
-> AsyncResult, associate message data with internal server object Id, post to queue
-- Issue another AsyncBegin

Now, given how async *write* works, this pattern is not proper for the output side. If you do an Async Write Begin, it typically triggers a result instantly, at least to start with because the buffer is empty until you start pushing stuff onto it of course. So, you don't issue the begins until there is something to actually be sent to the specific session/connection. Even then, you generally don't want to start triggering begin/waits in this case because it is a waste if there is no more data to be posted. (I.e. doing so would cause a mass of those context switches you don't like. smile.png) So the write side is a bit more complicated and delves into architecture bits. Let me just generalize things at a high level, this is not a perfect picture but hopefully gives us common terminology to work with:

Read threads:
1..n network clients --> messages received on server are mapped to the "session" for each client
--> session id/object/whatever + message put on queue (I.e. multiplexing all messages to a queue)

MMO guts, more threads:
message queue -->
. process the available messages into the various objects
.. By nature this is demultiplexing the messages back to internal server objects represented by the session.
. run game loop --> produces messages to "sessions" somewhere in here
. repeat at some simulation rate

Send threads, much more painful than read threads:
Sessions get messages from the game loop/simulation/whatever. I.e. the game loop itself demultiplexed everything.
. Is there an outstanding async in progress?
. Yes- buffer the data.
. No- try a direct send, if would block, send a portion, buffer the rest and kick off the async begin for writing.
.. on async results, if no more data in buffer, don't issue a new begin.

Actual send threads only contain async's outstanding for sessions with buffered data. This means a communications system to wake such threads and tell them, start a send async for xxx session. This has to be carefully designed or you will be beating on mutex's and all that constantly and this can/will cause your nasty context switching issues. (NOTE: this is actually a problem for most AIO networking API's, they are designed with TCP bulk transmission in mind but a game only sends a bit here and there usually.)

Now, a network server can/should be more asynchronous than a standard client game loop so I generalized that considerably. The point of the walk through though is that with three threads in this simple (yeah, right smile.png) outline, you only have two locations to worry about context switching with and they are easy to remove as problems. For instance, the queue between the reads and the game loop does not actually need to be a queue, nor does it need to be shared between threads. Instead of a queue, each "read" thread simply writes to it's own copy of a vector. Once a game loop the server says, let me have all the contents, you start filling this new empty vector. This does impose a mutex lock/contention (potential context switch) but once a loop is very acceptable. (With a centralized queue, things are MUCH more expensive.) The output side is much the same using a different system, posted items would be in pooled blocks of a fixed size, locklessly post new blocks into the outgoing queue (remember, per session, so the likely hood of hitting actual contention is near zero so even a mutex lock would be acceptable) only if there are async operations already in progress, otherwise attempt to send it without bothering to touch the writer thread. (NOTE: May want a queue to let the writer thread actually do the send even if there is no buffer, would have to see if there are any notable costs in calling the API directly.)

Hopefully this gives you a full high level picture of things. This is not the "only" way to do things nor even necessarily the best, but there is nothing here which causes any outstandingly high number of memory copies or context switches unless you implement one of the pieces wrong. A test client and do nothing server using all of this can be written in a couple days. Removing any remaining performance issues a couple more days. Generalizing, getting clean startup/shutdown, yet another couple days. Throwing 1k simulated client connections at it, a day. Fixing things when it falls on it's face with 1k connections sending/receiving 5kb/s each, a couple more days. Making an MMO out of the framework by yourself, see you in your next life. smile.png

 

I'm with you on the details of the IO part, and yes the network layer for my game is probably not something I'll write on my own either way (I'd much rather focus on the game engine :) ). Any recommendations with regards to open source alternatives?

 

Just to clarify exactly how the game server/client will be used in practise: The game server won't be exposed directly to the players. The players will play the game on mobile devices and desktop web browsers (it's a browser based game). The front-end servers will in turn connect to the game server and then return result to player in either html or json (depending on device). In other words, the communication between the front-end and back-end need not handle authentication or be encrypted in any way (they are both on the same network behind a firewall).


In Topic: MMO Network Layer Architecture

18 September 2013 - 05:39 AM

For the most  part what you have learned about threading is 10 year old stuff.  Times have changed.  Thread per connection/user is pretty much a dead approach.

 

The right way to do threading is to have a rather small thread pool,and multiplex jobs between them.  This is usually done with message passing and immutable objects.  

 

Take a look at Akka for a good example of how to do concurrency well.  It's probably the simplest abstraction for concurrency I've used, have been fairly impressed with it for this kind of task. 

 

Trying to do concurrency in C# is an exercise  in frustration, especially on mono.  Every language has things it does well and things it does badly.  Both will work up to a  point, but when you hit that point, you will  start  swearing up and down at C#.

 

Chris

 

I never suggested a thread/connection. Quite the contrary. The discussion is about the message passing (and work to be done with said message) after it has been received by the network layer. I'm merely inquiring about what experience other people have with this type of approach, or if a "standard" shared queue is the way to go.

 

I'll look into Akka.

 

Luckily I won't be using .NET on Mono. :)

 

/Philip


In Topic: MMO Network Layer Architecture

17 September 2013 - 02:34 PM

While I agree that context switching and data copies are a problem for scaling, taking things to far trying to avoid them is equally problematic in other ways. Let's just look at the initial networking layer and what you are considering doing to it. First off, under low load, a typical async network model is going to be something along the following:

1) Issue 1..n async begin's.
2) Eventually go into a wakeable wait state.
a) If there are any waiting async results, there is no context switch, the thread just starts processing them. Assume that this simply reads the message, posts it to a queue and issues a new async begin.
b) Sleep until there is an async result.

Nothing surprising here but unfortunately defined in the manner you seem to want to avoid. The question is if the avoidance makes sense? First off, you need to consider other locations of bottlenecks in an MMO server. One is directly related to the things you are trying to solve, the context switches and data copy/moves. Another is generally something which is often attributed to other things incorrectly and that is over utilization of the receiver thread. I actually had a very specific discussion about the context switch costs and reasons why some of those costs are a "good" thing in several cases. The cost itself is never a "good" thing, the decoupling is, but not simply in the software architecture but also in the server behavior. The little testbed I wrote modified things in the following manner:

1) Issue 1..n async begin's.
2) Eventually go into a wakeable wait state.
a) If there are any waiting async results, there is no context switch, the thread just starts processing them. Simulate 1-10ms processing time for each message, i.e. moving the processing to be in the receiver thread so there is no context switching nor data copies being made.
b) Sleep until there is an async result.

As the old saying goes, everything is fine, until it is not. Initially this sort of thing looks great because the lower CPU utilization and easily provable lower buss traffic. Unfortunately where it fails, it not only fails but it completely falls on it's face because of the unpredictable nature of having the simulation (sim of a sim in this case smile.png) in the same thread. As you scale up, the CPU utilization in the server goes up fairly smooth, the latency on message processing is fairly consistent and everything looks great. Unfortunately as soon as the numbers get high enough the CPU utilization starts varying massively due to the numbers of longer running processing items which just happen to start lining up on the threads, which in turn means more messages pile up which in turn causes the threads to start falling behind, then playing catchup, falling behind again etc. This is unfortunately a self reinforcing issue and if the CPU hits 100% utilization for any extended time, boom, the entire house of cards falls flat on it's face until the server is locked at 100% CPU, the network buffers are completely full, and even the clients end up stalling because their outgoing buffers are full.

The problem with such a system is the inherent lack of predictability tied in at two locations. One location is the network traffic itself, it often comes in in bursts for various reasons ranging from game play events causing everyone to start moving all of a sudden (flash crowds to see something interesting for instance) or the internet itself having to reroute due to a busy router somewhere etc. The other lack of predictability is in trying to process the messages as part of the same thread, some messages require more work than others, that's just a fact of life in games. The combination of the two items self reinforce the failure point on the threads/networking layer and I'd argue that the cure in this case is worse than anything else. Using the queue solution removes one side of predictability and "buffers" the burst data in between in hopes that the actual server processing can play catch up before the queue in between explodes. You can also watch the queue inbetween as a control system to say speed up the processing side (allocate more threads if possible), decrease sending rates of the clients before things explode, etc etc.

All this and I've not even really gotten into the discussion of the context switching. In reality, with a MMO, message processing is/should not be a cause of constant contention on synchronization and as such context switching should be minimal. As with the network side, the actual switches that do take place should be because the overall system is working 'well' and is not stressed out. It is the combination of high frequency switching and a stressed CPU/memory buss environment which will cause scaling fails, normal operation *does* include a reasonable amount of context switching though, that is unavoidable and actually desirable. And, as to the memory copies, if the messages are processed into individual reference counted objects when placed on the queue, there is no copy/move cost beyond that of a pointer.

So, long story short. I would argue that you are worrying about the wrong things. There are many gotcha's involved in this but an over abundance of context switches and/or memory copies is caused by other things than the separation of the network code and the processing code.

 

Thanks for the detailed explanation of your own results!

 

How would you solve the output/result issue with queues though?

 

We have one queue for the main thread, and one for output I presume? But would you have separate threads in the network layer that would only handle the output part, sleeping while no result is available? In this pipe-line approach the only thing bothering me is the output part.

 

I'd appreciate any ideas.

 

Thanks!


PARTNERS