Server design questions

7 comments, last by philipbennefall 13 years ago
Hi all,

I had a quick question regarding ports and splitting up network resources. I have a server that has to be able to accept thousands of clients at once. The number of messages will be relatively small (no real-time communication and certainly no finger-twitching action). My current plan has been to set up a sort of gate that listens on a fixed port and then finds a thread within the server that is able to take on the new client. Each thread listens on a different port, which the client then reconnects to as instructed by the gate.

I am facing some issues with this, and my question is simply this: how important is it to split up network communication between ports for maximum performance, rather than using one single port for all clients? The multithreading is no problem, just the use of different ports. Is this really needed? I have been told that if I have too many clients connected on the same port I will run into packet loss and general congestion. How much of an issue will this be in my case, if I have about 10,000 to 15,000 clients at most?

To give a little extra information. All of the worker threads that listen on individual ports at present, use select to poll their respective list of sockets that are put in non-blocking mode. So one could say that I have a number of single threaded select servers that cooperate with one another, with the gate as the resource manager.

Any information on this would be much appreciated.

Kind regards,

Philip Bennefall


I have a server that has to be able to accept thousands of clients at once.

This is, for all practical purposes, a completely solved problem.

Between node.js, Erlang, Twisted, plain old Java NIO...

I have been told that if I have too many clients connected on the same port I will run into packet loss and general congestion.

Yes, heavy load can cause problems. The trivial solution is to throw more money at it by getting better hardware. But the loads discussed here are nowhere near that, unless you are using some really low-end hosting, where bandwidth will likely become an issue sooner.

How much of an issue will this be in my case, if I have about 10,000 to 15,000 clients at most?

According to that 500k article, a single EC2 instance will have no problem handling 10 times that number.

To give a little extra information. All of the worker threads that listen on individual ports at present, use select to poll their respective list of sockets that are put in non-blocking mode. So one could say that I have a number of single threaded select servers that cooperate with one another, with the gate as the resource manager.

Um... Just use one of the things above. Nobody even knows what select() is today; servers are just a solved problem. There really is nothing much to be learned or gained at this point, especially since most of the suggestions above require about 5 lines of code and just work.


Also, the number of concurrent connections is by far the most mundane and simplest problem to solve. It just involves adding more memory and doing a few kernel tweaks. The actual capacity will depend on the type of actions performed by clients, and those, unless you are using some standardized middleware (message passing, pub/sub, HTTP), cannot be guessed in advance, at least not without some thorough planning and analysis of the actual server-side logic. Just implement whatever you have and see if it works; networking is unlikely to be a problem.
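One of the "kernel tweaks" that usually matters first is the per-process limit on open file descriptors, since every connected client holds one socket. The following is only a minimal sketch, assuming a Unix-like host; the function name `fd_headroom` and the `reserved` allowance are made up for illustration and are not from this thread.

```python
import resource


def fd_headroom(target_clients, reserved=64):
    """Report whether the current process may hold `target_clients` sockets.

    `reserved` is an assumed allowance for the listening socket, log files,
    stdin/stdout and similar descriptors. Returns (soft, hard, enough).
    """
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    needed = target_clients + reserved
    # RLIM_INFINITY means the limit is not enforced at all.
    enough = soft == resource.RLIM_INFINITY or soft >= needed
    return soft, hard, enough


if __name__ == "__main__":
    soft, hard, enough = fd_headroom(15000)
    print("soft limit:", soft, "hard limit:", hard, "enough for 15000:", enough)
```

If `enough` comes back false, the soft limit can typically be raised up to the hard limit (for example with `ulimit -n` in the shell that starts the server); going beyond the hard limit needs administrator configuration.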
Hi there, and thank you for your fast reply.

I am hesitant to use large third party libraries. I write all my code in C or C++, often from scratch. I much prefer building the server from the ground up, as I have much greater control over what goes on behind the scenes that way. You say that select is gone and forgotten? Why is this? What should I be using instead? I am writing on Win32 but the server will be running on a Linux box. For testing purposes I have set up a small Linode where I can compile and run the server during development. Then, once it is stable I will get set up with a more expensive vps solution that has more resources.

Is my approach with non-blocking sockets running with select in separate threads a bad one? And I take it from your answer that I can safely use the same port for considerably more clients than 15000? Or did I misunderstand?

Thanks once again for your help.

Kind regards,

Philip Bennefall


I am hesitant to use large third party libraries. I write all my code in C or C++, often from scratch.
Well, that excludes boost, which comes with asio.

I much prefer building the server from the ground up, as I have much greater control over what goes on behind the scenes that way. You say that select is gone and forgotten? Why is this? What should I be using instead? I am writing on Win32 but the server will be running on a Linux box.

Which means that you'll need to write a portable threading wrapper (considering that boost::thread is large third-party, std::tr1::thread isn't guaranteed to be portable yet, pthreads is third-party again and WinAPI doesn't run on Linux). Then you'll need to write a wrapper over Berkeley sockets. That API is quite old, however, so instead one uses IOCP on Windows, kqueue on BSD or epoll on Linux, which means writing against two different APIs again.

Or, just use boost::asio for C++ or libevent for C or heck, even ACE. But those are large third-party libraries.

And once you get to this point, you might as well drop C and C++ altogether and use the projects mentioned above. They bring other advantages, solving problems such as:
"I can compile and run the server during development." ... no need to compile.
"Then, once it is stable" ... and they are stable by definition, due to heavy production use.

Is my approach with non-blocking sockets running with select in separate threads a bad one? And I take it from your answer that I can safely use the same port for considerably more clients than 15000? Or did I misunderstand?

In theory, that's all true.

In practice, there are millions of little quirks, settings and tweaks that simply require long-term use to iron out.

The problem to focus on is acquiring 15,000 users.

If you have that many already, then for the sake of productivity, run the node.js demo server and watch it handle that traffic on the cheapest instance. It takes about 5 minutes to set up.


But yes, Berkeley sockets work fine, though they may require some threading tweaks to work around a 40-year-old limitation dating from when there weren't 20 clients in the world. There's nothing wrong with that approach; it's just a completely solved problem.
Hello there,

I do have wrappers for multithreading, a library called TinyThread++, and a small-ish wrapper for sockets called the Simple Sockets Library (not thread safe by design but that was easy to add), so both of those problems are already solved. The server that I have presently runs fine on both operating systems. My main question was simply whether I needed to use separate ports or if I could keep everything on one, and how well a select based solution with non-blocking sockets in multiple threads would scale.

I will look at the node.js library again but C or C++ is definitely my first choice, unless node.js offers a lot of advantages that I am not yet aware of.

Kind regards,

Philip Bennefall


To give a little extra information. All of the worker threads that listen on individual ports at present, use select to poll their respective list of sockets that are put in non-blocking mode. So one could say that I have a number of single threaded select servers that cooperate with one another, with the gate as the resource manager.


If you're using TCP, with a properly multi-threaded server, there is zero benefit from listening on multiple ports.
If you're using UDP, with a properly multi-threaded server, and are setting a sufficiently large buffer, there is zero benefit from listening on multiple ports.
If you're using a single-threaded server (like Python, Node, etc) then you may want to run different processes, each of which listens on a different port, and put a static load balancer in front. HAProxy or Nginx are the two most commonly used proxies for that work. (Personally, I prefer HAProxy, but they are both good).
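The TCP case can be checked directly: a connection is identified by the full (local address, local port, remote address, remote port) tuple, so every accepted socket shares the listening port while still being a distinct connection. Below is a rough sketch on localhost only; the function name `connect_many` and the client count are arbitrary choices for illustration, not something from the thread.

```python
import socket


def connect_many(n_clients=50):
    """Open n_clients TCP connections to one listening port on localhost.

    Returns (listen_port, local_ports_seen_by_server, client_addresses).
    """
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))      # port 0: let the OS pick a free port
    listener.listen(n_clients)
    port = listener.getsockname()[1]

    clients, accepted = [], []
    try:
        for _ in range(n_clients):
            clients.append(socket.create_connection(("127.0.0.1", port)))
            conn, _peer = listener.accept()
            accepted.append(conn)
        # Server-side local port of every accepted connection.
        local_ports = {conn.getsockname()[1] for conn in accepted}
        # Remote (client) address of every accepted connection.
        peers = {conn.getpeername() for conn in accepted}
        return port, local_ports, peers
    finally:
        for s in clients + accepted + [listener]:
            s.close()


if __name__ == "__main__":
    port, local_ports, peers = connect_many()
    print("listening port:", port)
    print("local ports used by accepted sockets:", local_ports)
    print("distinct client endpoints:", len(peers))
```

The set of local ports contains only the listening port, while each client shows up with its own remote endpoint. No extra server ports are consumed per client.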

When I say "properly multi-threaded," I mean something that uses one thread per core, and non-blocking/evented I/O. This includes things like kernel poll (epoll on Linux, kqueue on BSD), boost::asio, or Windows I/O completion ports. It does not mean "one thread per client," which would be a terrible use of machine resources.
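As a rough illustration of the evented style (not anyone's actual server from this thread), here is a single event loop that multiplexes all clients of one listening socket. It uses Python's `selectors.DefaultSelector`, which picks epoll on Linux and kqueue on BSD; this also sidesteps the fact that select() on Linux cannot watch descriptors numbered FD_SETSIZE (typically 1024) or higher. The names `serve_echo` and `run_demo` are invented for the example, and a real server would run one such loop per core.

```python
import selectors
import socket
import threading


def serve_echo(listener, stop_after):
    """Echo data back to clients until `stop_after` connections have closed."""
    sel = selectors.DefaultSelector()       # epoll on Linux, kqueue on BSD/macOS
    listener.setblocking(False)
    sel.register(listener, selectors.EVENT_READ, data=None)
    closed = 0
    while closed < stop_after:
        for key, _events in sel.select(timeout=1.0):
            if key.data is None:             # listening socket: new client
                conn, _addr = listener.accept()
                conn.setblocking(False)
                sel.register(conn, selectors.EVENT_READ, data="client")
                continue
            conn = key.fileobj
            try:
                chunk = conn.recv(4096)
            except ConnectionError:
                chunk = b""                  # treat a reset as a close
            if chunk:
                conn.sendall(chunk)          # acceptable for tiny messages only
            else:                            # peer closed the connection
                sel.unregister(conn)
                conn.close()
                closed += 1
    sel.unregister(listener)
    sel.close()


def run_demo(messages=(b"one", b"two", b"three")):
    """Start the loop in a thread, send each message, collect the echoes."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))
    listener.listen()
    port = listener.getsockname()[1]
    loop = threading.Thread(target=serve_echo, args=(listener, len(messages)))
    loop.start()
    replies = []
    for msg in messages:
        with socket.create_connection(("127.0.0.1", port)) as client:
            client.sendall(msg)
            replies.append(client.recv(4096))
    loop.join()
    listener.close()
    return replies
```

The direct `sendall` on a non-blocking socket is a shortcut that only holds for small replies; a production loop would queue outgoing data and wait for write readiness instead.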

If your socket layer was "not thread safe by design" then it's not suitable for a high-performance threaded server. There is no way to implement proper threading support (with kernel poll or I/O completion ports) *without* being thread safe. I suggest you use boost::asio if you want a simple library that's not too big.
enum Bool { True, False, FileNotFound };
Hello there,

Thanks for the detailed reply. I would never use one thread per connection, so no worries on that score. When I say that the socket library I use is not thread safe, I only mean a few of the functions. With it being C and also rather modular, I can easily modify it and have already done so to ensure that the parts that are used in different threads are indeed thread safe.

I just found this library:
http://libeve.dev.jauu.net/

Which seems to be just what I need. I figured I could then run a few IO threads that would all wait on epoll on Linux (not sure if I'll make a Windows version yet), and then the IO threads would either respond directly if the task was a simple one or delegate heavier work to separate worker threads. Is this reasonable?
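The split between answering directly and delegating can be sketched roughly as follows. This is only an illustration of the idea, with the network part left out: `is_simple`, `heavy_work`, `handle_request` and the `PING` rule are all hypothetical names and rules made up for the example.

```python
from concurrent.futures import ThreadPoolExecutor


def is_simple(request):
    # Hypothetical rule: a PING can be answered without real work.
    return request.startswith(b"PING")


def heavy_work(request):
    # Placeholder for something expensive (database lookup, path search...).
    return b"DONE " + request[::-1]


def handle_request(request, workers, send_reply):
    """Called from an IO thread once a complete request has been read.

    Simple requests are answered immediately; anything else is handed to the
    worker pool, and the reply is sent when the work finishes.
    """
    if is_simple(request):
        send_reply(b"PONG")
        return None
    future = workers.submit(heavy_work, request)
    # The callback normally runs on a worker thread, so send_reply must be
    # safe to call from there (or must hand the result back to the IO thread).
    future.add_done_callback(lambda f: send_reply(f.result()))
    return future


if __name__ == "__main__":
    replies = []
    with ThreadPoolExecutor(max_workers=4) as workers:
        handle_request(b"PING", workers, replies.append)
        handle_request(b"abc", workers, replies.append)
    print(replies)
```

The design point is that the IO threads never block on slow work, so a handful of them can keep servicing thousands of mostly idle sockets while the pool absorbs the expensive requests.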

Kind regards,

Philip Bennefall


I am hesitant to use large third party libraries. I write all my code in C or C++, often from scratch.



Will you use the Windows API (or the GNU libraries on Linux) or will you rewrite as much as you can using raw C and C++ by yourself?

Will you use OpenGL or Direct3D, or do you intend to write the code to interact directly with every major video card by yourself?

Will you use an audio product like WWise or FMod, or use OS-provided audio calls, or do you intend to support every hardware variation by yourself?

For scripting, do you intend to use Lua or AngelScript or another major product, or do you intend to write it all yourself?

For your user authentication do you intend to store everything in an off-the-shelf relational database, or do you intend to write all that yourself?



For PC development, the days of doing everything yourself ended about 15 years ago. If you want to get a finished product I suggest you leverage all the external libraries you can apply.
Hello,

I do use external libraries for a lot of things, but not libraries that are as bloated as Boost for example. As you can gather from my posts I also use libraries for threading and sockets etc, but these still are relatively thin layers on top of the operating systems they support. I feel that this is a good balance between using external code and writing things from scratch, and so this is how I intend to work. My main question has already been answered, so I can now proceed with development.

Kind regards,

Philip Bennefall

This topic is closed to new replies.
