Do you think this is a feasible server architecture design?

9 comments, last by rip-off 10 years, 11 months ago
This post is a bit of an extension of my previous two posts, "Trying to get my IOCP code on par with expected performance" and "[Article] Using UDP with IOCP". I've begun working on what I hope will be a scalable, high-performance server architecture useful in an online game with a large number of concurrent users. This is how I envision it would work:

The public interface to the network as a whole is handled on the first layer. There would be 1..n of these IOCP servers that handle the network traffic for the entire system. Packet security and physical validation would take place in this layer. When packets are deemed to be complete and physically valid, they would be passed to the second layer. I achieve scalability here by simply being able to add more servers and update the client with a list to connect to.

The second layer is a collection of 1..m switch servers that route complete packets from the first layer to the desired destination in the third layer, as well as packets from the third layer to other servers in the third layer. The database is used to store connection-specific data sent from the first and third layers so that information can be kept to route the response packets back to where they need to go. That logic would be handled through a specific packet protocol not yet described. I achieve scalability here by simply being able to add more servers and update the 1st/3rd layer servers with a list of available switches to use.

The third layer is composed of all of the "logical" servers for the system. If servers need to talk to each other, they will go through the second layer so the switch servers can route the packets to the other servers that need the information. Likewise, when a server in layer 3 needs to return data to a client, it just sends the data to the second layer with a specific protocol, and the switch server then sends it back to the correct layer 1 server that holds the client's connection.
I don't have a blueprint of the exact packet protocol for the internal communication, but it would be really simple: it'd contain the local and remote addresses for the client and the original server, possibly some flags, and then the data to be sent. There is no specific UDP or TCP protocol specified right now, but ideally it would be up to the user to choose which one works best. I believe for now I'd just use TCP, but UDP could be used with a simple protocol like enet.

Each "Server" listed on the image does not necessarily mean a physical machine on the network, but the design should allow for that. It could just as well be another process! For starting out, I will have one physical server and have one IOCP process in layer 1, one switch process in layer 2, the database in layer 2, and one server in layer 3 (world).

My reasoning for this design has to do with some recent success with splitting up a networking task into two parts: a low-level part that does all the performance-critical work in C++ and a high-level part that uses readily formatted data in C#. With this setup, I envision all of the performance-critical networking traffic being processed by C++ IOCP processes, while my actual "logical servers" can be done in a variety of languages as I need. The second layer routing servers are there to easily allow all the servers to interact as needed.

As of right now I am working on the protocol between the 2nd and 1st/3rd layers. The layer 1 server design is done, so that code can be used in the 2nd and 3rd layers for my servers there, even though I plan on making another server in C# for the 3rd layer. I already have an ambition to actually complete this design and see how it works in practice myself, but I'd like to know what everyone else thinks about it. Am I overlooking anything, and are there any serious flaws? So please, let me have all your thoughts about it! [smile]

[Edited by - Drew_Benton on October 5, 2009 11:10:39 PM]
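
To make the internal protocol idea above a little more concrete, here is a minimal sketch of one possible envelope. The post only says it would carry the client and original server addresses, some flags, and the data; the IPv4-only addresses, the field widths, and the names `ENVELOPE`, `pack_envelope`, and `unpack_envelope` are all assumptions made for illustration, not the author's actual protocol.

```python
import socket
import struct

# Hypothetical envelope layout (an assumption, not the author's protocol):
#   4s client IPv4 address, H client port,
#   4s origin server IPv4 address, H origin server port,
#   H flags, I payload length. "!" means network byte order, no padding.
ENVELOPE = struct.Struct("!4sH4sHHI")


def pack_envelope(client, server, flags, payload):
    """Prefix payload with the routing envelope; client/server are (ip, port)."""
    header = ENVELOPE.pack(
        socket.inet_aton(client[0]), client[1],
        socket.inet_aton(server[0]), server[1],
        flags, len(payload),
    )
    return header + payload


def unpack_envelope(data):
    """Return (client, server, flags, payload) from one complete envelope."""
    c_ip, c_port, s_ip, s_port, flags, length = ENVELOPE.unpack_from(data)
    payload = data[ENVELOPE.size:ENVELOPE.size + length]
    return ((socket.inet_ntoa(c_ip), c_port),
            (socket.inet_ntoa(s_ip), s_port),
            flags, payload)
```

With these assumed field sizes the envelope adds 18 bytes per packet, which is small next to typical game payloads on a LAN.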
I've always thought that these multi-server solutions, where the first 2 layers do somewhat trivial work (esp. your second layer), are a bit odd and a waste of processor time.

Also, there seems to be a flaw in your logic: if the first and second layers are performance critical and written in C++, and they pass all their traffic to layer three, doesn't that make layer three performance critical too?

I'd personally get rid of the 2nd layer and merge it with the 1st one. I don't see any good reason to have a process that has a sole purpose of receiving a packet, doing a lookup in a database and then sending a packet on. Since you've already got the packet in the 1st layer, you might as well deal with it there, it saves you having to do another receive and send, and decreases the latency a bit.

I'd also go further and merge the 1st and 3rd layers together (I assume you can run native code alongside C# as you can with Java).

Regards
elFarto
Thanks for the reply! I won't address your post yet, but given the lack of replies, I am going to go ahead and just complete the entire prototype using simpler network code and then make a more detailed post for people to comment on with more details.

I went back through my IOCP code and it has some serious flaws, so for my prototype I am just using a simple select() based server and client. That should be more than reasonable for what I will be showing.

Cheers!
I'm not really an expert on network programming or anything, so excuse me if I am talking nonsense. But elFarto had a point...

To me, what you want to do is spread load among different servers. This is cool and scalable, but if all the separate servers talk to the same server again, it just creates overhead.

Say all the nice IOCP servers need to use the same chat server: what is the point of balancing the load? The chat server will need to process all their requests anyway, so you might as well do it directly and save the overhead.

Anyways, that's just my common logic... please correct me if I am wrong.
You are totally over-thinking it. The main cost for gaming servers is the physical simulation. Networking is a very small cost. The main goals in networking middleware should be:

1) Reducing bandwidth consumed
2) Providing good latency hiding
3) Performing well under adverse conditions
4) Providing a simple, robust mechanism to couple networking with in-game data

Unless you're doing something like a chat server where > 10,000 users will be connected at the same time, and mostly do nothing, you needn't worry about more threading than what IOCP gives you by itself.
enum Bool { True, False, FileNotFound };
I have already completed the proposed design and performed some minor tests with it. Everything works just as I had hoped, and it's beautiful. The design allows for a great deal of flexibility and scalability. The design was crafted after reading through all of hplus's replies I linked to in my other thread. It was specifically for an MMO-ish style design.

A lot of the points that have been brought up seem to contradict the design intentions, but when you apply it to a large-scale MMO, you can begin to see the compromises and why it's not as bad as it might sound. Most of these issues are due to me not having a good way to convey the design. I'll try again, but I really have no idea how to show it in a clear, concise, and organized way.

The layer 1 servers' sole purpose is to handle all of the public network connections. With TCP you have to allocate a send/recv buffer for each client (since it's a stream protocol), so the layer 1 servers will have 2 x N x X KB of buffers as well as N socket resources to track on that side. If you want to support a lot of connections, you are going to need quite a bit of system resources to be able to chug through 'a lot'. Since IOCP would be used, each one of those servers could support thousands of connections with the right setup.
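
As a quick worked example of the 2 x N x X estimate above, the sketch below computes the buffer memory for one layer 1 server. The connection count and buffer size are made-up example values, and `buffer_memory_bytes` is a hypothetical helper name.

```python
def buffer_memory_bytes(connections, buffer_kb):
    """Memory for one send and one receive buffer per connection."""
    return 2 * connections * buffer_kb * 1024


# Assumed example: 5000 connections with 8 KB buffers in each direction.
total = buffer_memory_bytes(5000, 8)
print(total / (1024 * 1024), "MiB")  # prints "78.125 MiB"
```

This counts only the application-level buffers; per-socket kernel resources come on top of it.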

When the layer 1 server gets data to process from a client, its only task is to break up the stream into logical packets and forward them to the layer 2 server it is connected to. A layer 1 server has no idea of game state or game logic, only of how to parse the stream into logical packets (based on the custom header). It will never need to talk to a layer 3 server and it will never need to talk to another layer 1 server. It only has to talk to the connected layer 2 server.
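
Breaking a TCP stream into logical packets can be sketched as below. The custom header is not described in the thread, so the 2-byte payload length plus 2-byte opcode layout, and the names `HEADER` and `StreamFramer`, are assumptions for illustration only.

```python
import struct

# Assumed header: payload length and opcode, both unsigned 16-bit,
# network byte order. The real custom header is not given in the thread.
HEADER = struct.Struct("!HH")


class StreamFramer:
    """Accumulates raw stream bytes and yields complete logical packets."""

    def __init__(self):
        self._buf = bytearray()

    def feed(self, data):
        """Append received bytes; return a list of complete (opcode, payload)."""
        self._buf += data
        packets = []
        while len(self._buf) >= HEADER.size:
            length, opcode = HEADER.unpack_from(self._buf)
            end = HEADER.size + length
            if len(self._buf) < end:
                break  # payload not fully received yet; wait for more data
            packets.append((opcode, bytes(self._buf[HEADER.size:end])))
            del self._buf[:end]  # drop the consumed packet from the buffer
        return packets
```

One framer instance would be kept per client connection, since each TCP stream has its own partial-packet state.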

The layer 2 servers are the packet routing servers that will dispatch packets received from layer 1 to the appropriate layer 3 server. The layer 3 servers register with the layer 2 servers which opcodes they want, so all the layer 2 server does from that side is send the packets to the correct layer 3 servers. Think of them as middlemen, which simplifies a lot of things. I did not actually take it one step further to allow for more layer 2 servers, but that would be ideal down the line.
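
The opcode registration described above amounts to a subscription table on the layer 2 server. A minimal sketch, assuming string server ids and integer opcodes (the `OpcodeRouter` class and its method names are hypothetical):

```python
from collections import defaultdict


class OpcodeRouter:
    """Maps each opcode to the layer 3 servers that registered for it."""

    def __init__(self):
        self._subscribers = defaultdict(set)

    def register(self, server_id, opcodes):
        """Record that server_id wants packets with any of these opcodes."""
        for opcode in opcodes:
            self._subscribers[opcode].add(server_id)

    def unregister(self, server_id):
        """Remove a disconnected layer 3 server from every opcode."""
        for servers in self._subscribers.values():
            servers.discard(server_id)

    def recipients(self, opcode):
        """Return the servers that should receive this opcode, sorted."""
        return sorted(self._subscribers.get(opcode, ()))
```

Because several servers can register for the same opcode, a new layer 3 server can listen in on existing traffic without any change to the servers already running.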

The reason why the layer 2 server is so important is because everything revolves around it internally. The layer 1 and layer 3 servers connect to them, so they are the internal servers and the other layers are the clients, hence why the layer 1/2 servers need to have the best performance relatively speaking. By doing this, you can scale on either side and you don't have to worry about connecting specific layer 1 and layer 3 servers to each other in a P2P fashion.

When a layer 3 server wants to talk to other layer 3 servers directly, they just go through the middle man and that's it. The slight overhead of packets on a LAN is negligible for the simplification of the design. It's a matter of keeping to a client/server architecture rather than trying to go peer-to-peer on that end, which is a lot more complicated and uses more resources as you scale higher.

Likewise, when a layer 3 server needs to dispatch data to a layer 1 server, since layer 1 handles all of the connections, it just sends data to layer 2, layer 2 sends it to the correct layer 1 server, and the layer 1 server sends it on the specific connection. A 'disconnect' can be generated from layer 3 server on a specific layer 1 server connection through the use of a custom protocol.

What binds everything together is the custom protocol. Through that, the whole network learns of client connections, disconnections, packet send events, server connections, disconnections, and packet send events. An example of how a typical session might go would be as follows:

1. Start Layer 2 server

2a. Start Layer 1 server
2b. Layer 1 Server Connects to Layer 2 server
2c. Layer 1 Server "registers" itself with Layer 2 server
2d. Layer 2 server accepts or denies the server
2e. Layer 2 server sends all existing servers to the server if accepted

3a. Repeat Step 2 for Layer 3 server
3b. Layer 3 server registers which opcodes to handle with Layer 2 through protocol.

4a. A client connects to a layer 1 server
4b. Layer 1 server sends the event to a layer 2 server using the protocol
4c. Layer 2 server stores the server of the client as well as generates a unique layer 2 id for the client.
4d. Layer 2 server forwards the connection event to all connected Layer 3 servers

5a. Client sends data to layer 1 server
5b. Layer 1 server wraps message using protocol and forwards to layer 2.
5c. Layer 2 server determines which layer 3 servers should be recipients and forwards the data to them.

6a. Layer 3 server sends data to layer 2 with the id(s) of the destination clients.
6b. Layer 2 converts the layer 2 id into the layer 1 server's id (taking into account which server should be used)
6c. Layer 2 sends data to layer 1 servers with the new id.
6d. Layer 1 takes id and sends data on the associated connection.
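
Steps 4c and 6b above depend on layer 2 keeping a mapping between its own client ids and the layer 1 connection each client lives on. Here is a minimal sketch of that bookkeeping; the `ClientDirectory` class, its method names, and the use of a simple counter for ids are assumptions, not the author's implementation.

```python
import itertools


class ClientDirectory:
    """Layer 2 bookkeeping: layer 2 client id <-> (layer 1 server, connection)."""

    def __init__(self):
        self._next_id = itertools.count(1)
        self._by_l2 = {}  # layer 2 id -> (layer 1 server id, layer 1 connection id)
        self._by_l1 = {}  # (layer 1 server id, layer 1 connection id) -> layer 2 id

    def on_connect(self, l1_server, l1_conn):
        """Step 4c: remember where the client lives and assign a layer 2 id."""
        l2_id = next(self._next_id)
        self._by_l2[l2_id] = (l1_server, l1_conn)
        self._by_l1[(l1_server, l1_conn)] = l2_id
        return l2_id

    def on_disconnect(self, l1_server, l1_conn):
        """Forget a client; returns its layer 2 id, or None if unknown."""
        l2_id = self._by_l1.pop((l1_server, l1_conn), None)
        if l2_id is not None:
            del self._by_l2[l2_id]
        return l2_id

    def resolve(self, l2_id):
        """Step 6b: translate a layer 2 id back to (layer 1 server, connection)."""
        return self._by_l2.get(l2_id)
```

Layer 3 servers only ever see the layer 2 id, so they need no knowledge of which layer 1 server holds a given connection.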

So, when everything is put together, you have a modular server setup that is scalable and pretty flexible. The real purpose of breaking it down like this is to accomplish things that are not easily done with a one server setup, most specifically the scalable and modular approach.

If you want to support more clients, you add more layer 1 servers (assuming everything else can support it fine). If you want to add more servers to balance load or new servers for different functionality, you just add a layer 3 server and don't have to modify anything unrelated.

Let's say you have your game setup using this approach and it's complete. You want to add new stats tracking logic and upload to your web page. Rather than having to go in and modify the core game server, you can just develop an additional layer 3 server that receives the opcodes of the packets to process and uses the necessary DB logic for lookups and connect it. Plug and play really, what you really do is up to you, but you have that power.

Since network connections are handled on the layer 1 servers, your networking code for the rest of the servers is a lot simpler, since you are only handling data from "trusted" connections, and only a small number of them. That is why the layer 3 servers do not have to be as performance critical as the layer 1 servers. Each layer 3 server is doing something different. Not all layer 3 servers might actually respond to packets (as I mentioned before in my stats logging example). The layer 3 servers that do handle the performance-critical logic in a game should be performance critical themselves.

The specific logic of what layer 3 server does what is up to the programmer though. If you want only one chat server to handle all chat for your game, then that's fine. However, if you want to make the design so it's handled across different servers, you just reroute the packets once again from the layer 3 server that receives all of the chat packets via the protocol and it's done.

The key thing though is that this is all on a LAN, so you are talking about very, very low latency between the internal servers. On a gigabit LAN, you should not really be filling up the line, since your clients aren't going to be sending you that much data. Since most TCP-based MMOs are not as fast paced as UDP MMOs in terms of how much data is being sent at a time, you should not really have to worry about filling up your internal network lines.

So, hopefully that addresses everything that has been brought up. The first layer servers separate the public networking aspect of the system from the logical aspect. The layer 2 servers act as middlemen to simplify communication between the servers as needed, since TCP is being used and a client/server architecture is still desired (where layer 2 is the server and layers 1/3 are clients to it). Finally, the layer 3 servers are the "modular" aspect of the design that allows for more flexible designs and modifications.

Overall, I am very happy with how it turned out. I ended up just using a select() based client/server design for the code. My code is not fully optimized yet because the main goal was getting the prototype done, which I have now and I couldn't have asked for more. When I have a full project to use it on, I'll be rewriting it again to make sure I get all of the efficiency aspects covered and update the networking code.

I hope that clears it up. I kind of made the thread to talk about the design as I was writing it, but ended up just writing it and testing it because I don't think this is really a topic that makes for good casual discussion [lol]. Oh well, one can only try. I'll be rewriting some more code soon and I'll post a demo showing why I believe in the design so much, maybe that will help some too.

Thanks for the replies though! I'm open to talk about anything else about it now that you have a more complete picture of the design.

-

I just saw your reply hplus, I was typing mine for a while. However, in addition to what I just posted, the reasoning behind this design isn't really driven by the costs of the networking aspect but rather the complexity of a single server design where I don't know how to make it scalable and flexible to changes.

Part of my other thread that I ended up cutting out since it was too much for one thread was the whole idea of how to go about making a scalable system where you can add servers to handle the load as well as implement your game across multiple servers. This design is the best way I could see doing it, it's not really meant to be middleware, but just what my game's networking setup would be like.

Does that clear things up or do you think I'm still going about it the wrong way?
I understand what you're saying. If opcode-per-player is your basic unit of distribution, then that can scale well. The problem is that, generally, players are so coherent in operations that you typically want all operations involving a given player on a single server, so dispatching per player is often what you end up doing. That, in turn, means that layer 1 could immediately forward to layer 3, with one connection from 1 to 3 per layer 3 server that a user is connected to.
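
Dispatching per player, as opposed to per opcode, could be sketched like this: each player is pinned to one simulation server on first contact, and every later packet from that player goes to the same place. The least-loaded assignment policy and the `PlayerDispatcher` name are assumptions chosen for illustration.

```python
class PlayerDispatcher:
    """Pins each player to one layer 3 server so all their operations land there."""

    def __init__(self, servers):
        self._servers = list(servers)
        self._assignment = {}  # player id -> layer 3 server id

    def server_for(self, player_id):
        """Return the player's server, assigning the least-loaded one if new."""
        if player_id not in self._assignment:
            load = {s: 0 for s in self._servers}
            for s in self._assignment.values():
                load[s] += 1
            # min() keeps the first server in list order when loads are tied
            self._assignment[player_id] = min(self._servers, key=lambda s: load[s])
        return self._assignment[player_id]
```

A real game would more likely assign by zone or region than by load, which is where the synchronization question below comes in.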

And, if the number of players per server is limited because of simulation (I've seen numbers between 40, for Second Life, and 200 users per server), then you can just as well have 200 users connect straight to layer 3 -- any server can do 200 connections at a time.

So, the million dollar question is: How can you break the simulation up such that opcode-per-player can be distributed across simulation servers (layer 3), without dying from messaging and synchronization overhead at layer 3?
enum Bool { True, False, FileNotFound };
"The main goals in networking middleware should be.."


I'm going to disagree *very slightly* with you.

I've worked on some fairly hefty networked code, I've been in the room while other stuff has been worked on and I've seen the various attributes of software that does tasks of that sort of size.

(We are talking about the kind of project where people really use units like exabyte and mean it.)

So I'll tell you what the number 1 requirement of serious network software is when you actually want to make it work;

Being able to tell what the hell is going on in there.

Visibility.

No, really. Fast shovelling of data is *nice*, but that's something you can approach incrementally or just scale up for. Low latency is good. Smaller hardware reqs. All nice things to have.

When you actually deploy the network in 50 locations around the globe, what you really, really need is all the pieces so you can very quickly answer questions like "Why can no-one in Spain see any television pictures?" or "How come every ninety seconds, all the connections to the servers in South America exhibit bursty latency?"


Because there's nothing like having Spain actually on the phone shouting at you while you look at ninety-seven terabytes of logfiles trying to work out how all seven layers of your network have colluded to think that everyone in Spain is under 18 and every television station on your network is adult rated...

What you really want then is the ability to go ask your servers WHY they did things. Software, when you put enough of it in one place, starts acting like it's alive. You get these funny beats as different bits of software phase in and out of action. You get these crazy, crazy bugs, you get bits of software DOSing bits of your own network while trying to answer customer questions.

And you need to be able to understand all this. You need to be able to get switches to explain that they aren't sending packets to Spain because half an hour ago, they got a message from another server telling them to stop doing so.
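
One simple way to let a server answer "why" questions is to have it record the cause alongside every routing decision it makes. The sketch below is only an illustration of that idea; the `DecisionLog` class, its fields, and the bounded in-memory history are assumptions, not something described in this thread.

```python
import time
from collections import deque


class DecisionLog:
    """Bounded in-memory trail of decisions and the reason for each one."""

    def __init__(self, capacity=1000):
        self._entries = deque(maxlen=capacity)  # oldest entries fall off

    def record(self, subject, decision, cause, now=None):
        """Store what was decided about subject and what triggered it."""
        self._entries.append({
            "time": time.time() if now is None else now,
            "subject": subject,
            "decision": decision,
            "cause": cause,
        })

    def why(self, subject):
        """Return the most recent recorded decision for subject, or None."""
        for entry in reversed(self._entries):
            if entry["subject"] == subject:
                return entry
        return None
```

For example, a switch that calls `record("route:ES", "stop", "control message from another server")` when it stops forwarding can later answer `why("route:ES")` without anyone having to dig through logfiles.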


And if that means things are a bit less efficient and you need 10% more machines or 10% more disk space, then those extra machines don't cost very much. Not compared to Spain not paying you this month...

People focus too much on efficiency and not enough on making a system which not only works but keeps on working and can be fixed when it breaks -- cos it will.

Make it as easy as possible to debug the network and your network ops centre will be your friends, and they're the ones who keep the payments coming in.

Quote: Being able to tell what the hell is going on in there.


So true! In general, operational requirements are very important once your program starts working. I totally did not include that in my list, so thanks for setting the record straight!

Other things in that same category include logging where you can actually get information about each log message (kind of like how MSVC has a C4899 type error code for each error), and networking that can switch between various UDP/TCP NAT/open/tunneling configurations as needed.
enum Bool { True, False, FileNotFound };

@Drew_Benton, you did a great job on this.

How can I download the source code of this TCP & IOCP project? I want to test it.

Thanks.

