beebs1

Which Sockets?

beebs1    398
Hiya,

I'm writing an authentication server using C++, as I'd like to learn how to work with sockets. I've written some basic applications before using blocking sockets, but I don't really have experience with network programming...

I'd like the server to be able to handle 150-200 concurrent clients, and all it needs to do is accept connections, run through a simple authorisation protocol and then drop the clients.

There seem to be a lot of different socket programming methods, including blocking/non-blocking, asynchronous, and IOCP (I'm developing for Windows).

I'm also thinking of a threading strategy. It seems to make sense to accept connections in the startup thread, then pass off the connections to several worker threads which will handle the clients using some balancing scheme. Maybe one worker for each logical processor, and distribute connections between them... I'm not sure yet.

Which socket method should I look into for the worker threads?

- Blocking sockets are not an option I guess.
- Non-Blocking seem better, but apparently they use a lot of CPU while polling?
- Asynchronous seems fine too, I believe this uses the Windows message system? Are there any bad points to using it? Any advantages over non-blocking?
- Also IOCP, which I don't know much about. There don't seem to be many explanations or examples available. What are the advantages here?

Any advice or explanations would really be appreciated!

Cheers!

Dunge    405
I only have experience using Winsock2 on Windows and Berkeley sockets on Linux.

First thing: to have 150-200 concurrent open sockets, you probably need Windows Server, or you will be limited to a small number. My Linux test stopped after 128 sockets.

What I did (probably not the best method, but it works fine) is have a "listen" thread (a thread with a socket in listening mode) that loops indefinitely, calling select() with a timeout of 0 ms (which makes the call non-blocking while the socket stays a normal socket rather than an async one). When a new connection arrives, accept() returns the newly connected socket, which can be passed to another thread to do what you want with it. Of course, polling select() like this can use a bit of CPU, but a small sleep(1) at the end of the loop fixes that.

There's also Boost (a general-purpose C++ library), which contains Boost.Asio. I've heard it is a simpler way to handle sockets, and it is portable.

[Edited by - Dunge on October 14, 2010 6:45:44 PM]

Drew_Benton    1861
Quote:
- Blocking sockets are not an option I guess.


Blocking sockets can still be used in this case. While it might seem far from optimal, regular desktop machines can easily handle spawning 200 threads over and over in the web-server fashion your program is implementing. There are better alternatives, but if you just want something that works and is easy to wrap your head around, then there is nothing wrong with this solution. As a longer-term solution though, if you ever need to handle a lot more connections, this would not be the way to go.

Quote:
- Non-Blocking seem better, but apparently they use a lot of CPU while polling?


The poll function select can be set to blocking so no CPU is used until there are events ready. The caveat to this approach is that you can only service 64 socket handles at a time on Windows with select, so you would need to spawn one thread per 64 connections, or about 4 in your case, to handle all those users while maintaining a blocked wait state.

You can still poll if you want, but then you do start getting into more CPU usage and will need to Sleep() a ms or so to avoid 100% CPU usage if you are simply looping through one thread. In that case, you'd just have one non-blocking select call per set of 64 connections.

So this method is pretty simple to implement too and allows you to modestly scale upwards since you can just add more select blocking threads and get an extra 64 connections for each. Since your logic seems rather simple, this approach too would be good for the short term.

Quote:
- Asynchronous seems fine too, I believe this uses the Windows message system? Are there any bad points to using it? Any advantages over non-blocking?


After having used it in the past for a lot of work, I no longer recommend that anyone take this approach nowadays. It's just cumbersome to work with overall. You have to set up a hidden window to handle the networking stuff, then set up a window procedure to handle network events, then worry about handling the event triggers correctly. If you are just getting started in this networking stuff, it's a bit over-complicated, and there are much easier and just as effective alternatives.

Quote:
- Also IOCP, which I don't know much about. There don't seem to be many explanations or examples available. What are the advantages here?


IOCP is a bit overkill for your simple task here. The benefit you would get for the effort would be extremely low if you were to try to develop your own IOCP server from scratch using Win32 functions only. With that being said though, using a network library such as boost::asio, which wraps IOCP under the hood for you, would give you a much better return: you would be able to scale up as needed, and your development effort goes only into learning how to use boost::asio, which in itself is a good time investment for the future.

Quote:
Any advice or explanations would really be appreciated!


Basically it comes down to this: if you want something that just works and will do the job, then take the simplest method and do not worry about efficiency until it's actually needed. If you take that approach and find it really unsuitable for the task in practice, you can go back and look for something slightly better. Based on your problem description, I think this would be the most fitting given your current network experience.

Now, if you have some time to learn new stuff to put into practice, then I'd recommend going with boost::asio. That will give you a lot of flexibility for the future while performing quite well. The issue here though is that learning how boost::asio works and getting familiar with its concepts does take a bit of time. I spent quite a lot of time with it after initially passing it by, and I will never go back to other methods because of the flexibility and power it provides.

Basically I just wrote my own low-level network library wrapper around boost::asio so I have easy client/server clusters all in one object. While it simplifies a lot of things for me dev-wise, it would not be efficient for scaling towards moderate and bigger servers with intensive logic requirements, which is something I'm not really working with anyway. With that done I can just extend it for each project and save a lot of time since the code is already done and working. This is part of the boost time investment I mentioned earlier. If you put in the time now to learn it and build up your own system for stuff, it pays off later when you have to do a lot of networking projects that can reuse the generic code.

Lastly, do you have to use C++ for this? I only ask because writing network-based stuff in C# is pretty simple compared to C++ and saves you quite a bit of time and code if you are already somewhat familiar with C#. For example, just by referring to simple examples or looking up async server code for C# on Google, you should be able to find really simple code that is able to accept connections, send/recv data and process it. So from a perspective of "just getting the job done" with minor pain, you could consider this idea as well. The same applies to other languages, but I myself have done a bit of stuff in C++ and C#, so that's why I refer to it.

Good luck!

Drew_Benton    1861
Quote:
Original post by Antheus
Quote:
Original post by Drew_Benton

need to Sleep() a ms or so to avoid 100% cpu usage
select() has a timeout parameter.


Yep, I mentioned that too. What I was talking about specifically there was if you want to use only one thread yet still service more than 64 connections at a time with select. In that case, you would just loop through your socket list and pass blocks of up to 64 sockets to the select function until you have them all done.

You would not specify a blocking timeout in this case (that is, you'd pass a zero timeout so select returns immediately), since you need to poll more than 64 sockets in the thread, and you would add a Sleep(1) after the loop so the thread does not just poll as fast as possible, wasting CPU cycles in the process.

A simple example: if you had 1000 connections and wanted to use only one thread for them, you'd end up calling select 16 times per pass (1000 / 64, rounded up), each with a zero timeout. I'm not saying that's the best use of the function here, but I'm just saying you do not have to use one thread per blocking 64 sockets if you really don't want to and have a setup where it's OK to poll in this manner, since it's a lot less complicated than, say, WSAAsyncSelect. [smile]
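The "blocks of up to 64 sockets" step can be sketched in portable C++ as below. This is only an illustration: `split_into_batches` is a made-up helper name, the socket type is left as a template parameter, and the actual select() call on each batch is not shown.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: split `sockets` into consecutive batches of at most
// `batch_size` entries, so that each batch fits into one select() call.
template <typename Socket>
std::vector<std::vector<Socket>> split_into_batches(
    const std::vector<Socket>& sockets, std::size_t batch_size = 64) {
    assert(batch_size > 0);
    std::vector<std::vector<Socket>> batches;
    for (std::size_t i = 0; i < sockets.size(); i += batch_size) {
        const std::size_t end = std::min(i + batch_size, sockets.size());
        batches.emplace_back(
            sockets.begin() + static_cast<std::ptrdiff_t>(i),
            sockets.begin() + static_cast<std::ptrdiff_t>(end));
    }
    return batches;
}
```

With 1000 sockets and the default batch size this yields 15 full batches of 64 plus one batch of 40.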

Of course I know you know this, but I guess I didn't clarify enough in my post what I meant so I explained it here in a little more detail.

beebs1    398
Hmm... interesting indeed!

It looks like I should focus my efforts on Boost ASIO then.

Could anyone offer any insight into a threading scheme? I'm thinking of spawning a worker thread for each logical processor, as the process can run on a dedicated machine. Although I could implement a command line switch to specify the number of threads too, maybe helpful for testing.

This seems tidy as there is no synchronisation needed between the worker threads, unless it's needed by Boost ASIO to send/receive. They will be isolated from each other, and accessing a MySQL database. What I've done before is have each thread maintain its own connection to the database, so I don't have to place mutex locks around database access.

I could keep a queue of accepted connections for each thread, and the startup thread running the ASIO acceptor can just round-robin the queues when distributing connections. Maybe even use a lock-free queue... Does any of this sound reasonable? :)
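A minimal sketch of the round-robin part of that idea, in standard C++ only. `RoundRobin` is a hypothetical name, and the per-thread queues themselves are not shown. Since only the accepting thread would call `next()`, the class itself needs no locking under that assumption.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical dispatcher: hands out worker indices 0, 1, ..., n-1, 0, 1, ...
// Intended to be called from the single accepting thread only.
class RoundRobin {
public:
    explicit RoundRobin(std::size_t worker_count) : count_(worker_count) {
        assert(worker_count > 0);
    }

    // Returns the index of the worker that should receive the next connection.
    std::size_t next() {
        const std::size_t chosen = next_;
        next_ = (next_ + 1) % count_;
        return chosen;
    }

private:
    std::size_t count_;
    std::size_t next_ = 0;
};
```

The acceptor would then push each accepted connection onto the queue at index `next()`.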

Anyway thanks for all your replies, very much appreciated!

Edit:
Oh - assuming a thread has no connections active or queued for it, would calling Sleep(0) be enough to keep the thread from burning too much CPU time? Thanks :)

Drew_Benton    1861
Quote:
Original post by beebs1
Could anyone offer any insight into a threading scheme? I'm thinking of spawning a worker thread for each logical processor, as the process can run on a dedicated machine. Although I could implement a command line switch to specify the number of threads too, maybe helpful for testing.


Making it configurable would be a good option in my opinion. I don't think there is a real need to try and get fancy with various methods for processor detection and cores when you can just have it set in a config file or through the command line for something as simple as this. I mean you can if you want, but I myself would just let users configure it as needed, since you might want to run more threads per processing core for this task than you would others.
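One way this could look, as a rough sketch: take the thread count from the first command-line argument and fall back to the detected number of logical processors when it is missing or invalid. The function name, the argument position and the upper bound of 1024 are all assumptions made for the example.

```cpp
#include <cassert>
#include <cstdlib>
#include <thread>

// Hypothetical helper: returns the worker thread count given in argv[1] if it
// is an integer in [1, 1024]; otherwise falls back to the number of logical
// processors reported by the standard library.
unsigned worker_count_from_args(int argc, const char* const* argv) {
    assert(argv != nullptr);
    if (argc > 1) {
        char* end = nullptr;
        const long requested = std::strtol(argv[1], &end, 10);
        const bool parsed_fully = (end != argv[1]) && (*end == '\0');
        if (parsed_fully && requested > 0 && requested <= 1024) {
            return static_cast<unsigned>(requested);
        }
    }
    // hardware_concurrency() is allowed to return 0 when it cannot tell.
    const unsigned detected = std::thread::hardware_concurrency();
    return detected != 0 ? detected : 1;
}
```

Calling it with `argc` and `argv` from `main` keeps the default sensible on a dedicated machine while still allowing an override for testing.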

Quote:
This seems tidy as there is no synchronisation needed between the worker threads, unless it's needed by Boost ASIO to send/receive. They will be isolated from each other, and accessing a MySQL database. What I've done before is have each thread maintain its own connection to the database, so I don't have to place mutex locks around database access.


Your problem is all the easier to solve when you don't have to worry about any synchronization issues!

Quote:
I could keep a queue of accepted connections for each thread, and the startup thread running the ASIO acceptor can just round-robin the queues when distributing connections. Maybe even use a lock-free queue... Does any of this sound reasonable? :)


boost::asio actually simplifies this for you. All you have to do is invoke the io_service's run function from each thread, and everything else is taken care of for you by boost. One thing you will have to implement is a way to pass the current thread's context to the handler so the generic handler always uses the correct objects for the client being handled. This is really easy with boost::bind once you get used to the syntax.

Quote:
Oh - assuming a thread has no connections active or queued for it, would calling Sleep(0) be enough to keep the thread from burning too much CPU time? Thanks :)


No Sleep is required with boost::asio, since you can have it set up so it just always does blocking work from each thread. In that case you call the io_service's run function rather than poll. It's really convenient that way for your current setup. However, if you did need to poll, you might want to consider Sleep(1), since Sleep(0) only gives up what is left of the thread's current time slice, whereas Sleep(1) will sleep for some granularity period of 15-32ms unless that was changed with timeBeginPeriod or the process uses a different, higher execution priority class.
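The "block instead of Sleep" idea can also be illustrated without Boost. The sketch below is a stand-in written with the standard library, not how boost::asio works internally: `ConnectionQueue` is a hypothetical class, and connections are represented as plain ints. A worker calling `pop()` waits on a condition variable and uses no CPU until a connection is pushed or the queue is closed.

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <queue>

// Hypothetical queue of accepted connections (represented here as ints).
class ConnectionQueue {
public:
    void push(int connection) {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            assert(!closed_);
            pending_.push(connection);
        }
        ready_.notify_one();  // wake one waiting worker
    }

    // After close(), pop() drains what is left and then returns false.
    void close() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            closed_ = true;
        }
        ready_.notify_all();  // wake every waiting worker so it can exit
    }

    // Blocks until a connection is available or the queue is closed.
    // Returns true and fills `out` on success, false once closed and empty.
    bool pop(int& out) {
        std::unique_lock<std::mutex> lock(mutex_);
        ready_.wait(lock, [this] { return closed_ || !pending_.empty(); });
        if (pending_.empty()) {
            return false;
        }
        out = pending_.front();
        pending_.pop();
        return true;
    }

private:
    std::mutex mutex_;
    std::condition_variable ready_;
    std::queue<int> pending_;
    bool closed_ = false;
};
```

A worker thread would simply loop on `while (queue.pop(c)) { /* handle c */ }`, with no Sleep call anywhere.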

Have a look at the boost::asio tutorials and examples to get started.

Antheus    2409
Quote:
Original post by beebs1

This seems tidy as there is no synchronisation needed between the worker threads, unless it's needed by Boost ASIO to send/receive. They will be isolated from each other, and accessing a MySQL database. What I've done before is have each thread maintain its own connection to the database, so I don't have to place mutex locks around database access.


I was trying to come up with a two-liner answer to this, but then realized just how mind-bogglingly complex something like this is.

Meanwhile, millions of people who have never heard of programming, who will never in their life encounter anything called sockets, let alone threads, mutexes or anything similar, are happily setting up their WordPress sites and other LAMP stacks, blissfully ignorant of all such stuff.

Their performance issues? Users who upload 4000x4000 BMP to their blog so it hogs all the bandwidth.


My advice, use PHP. Between the time needed to download a LAMP stack, learning PHP (this is taught literally as the very first lesson of LAMP) and deploying on one of free hosts which will happily serve millions of authentications, it's simply not worth spending any kind of time trying to do this in C.

And if you ever hit the need for millions of authentications per second, then simply spawning more EC2 instances will still be faster, cheaper and more reliable.

Or, something like Ruby, where it literally is 4 lines.

stonemetal    288
Quote:
Original post by Drew_Benton
The caveat to this approach is that you can only service 64 socket handles at a time on Windows with select,


It is 64 by default, but it is easily changed by defining FD_SETSIZE to a larger number (#define FD_SETSIZE number) before including the Winsock header.
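A small sketch of what that looks like in a source file. The value 1024 is purely illustrative, and `select_capacity` is a made-up helper. The define only takes effect on Windows and only if it appears before the first include of winsock2.h; on POSIX systems the limit comes from the system headers, so the sketch leaves it alone there.

```cpp
#include <cassert>

#ifdef _WIN32
// Assumption: 1024 is an arbitrary example value, not a recommendation.
// This must come before the first include of winsock2.h in the file.
#define FD_SETSIZE 1024
#include <winsock2.h>
#else
// POSIX: FD_SETSIZE is fixed by the system headers and is not redefined here.
#include <sys/select.h>
#endif

// Number of sockets a single fd_set can hold in this translation unit.
constexpr int select_capacity() { return FD_SETSIZE; }

static_assert(select_capacity() >= 64,
              "fd_set is smaller than the Winsock default of 64");
```

Every source file that uses fd_set would need to see the same define, which is why it is usually placed in a shared header or set as a compiler option.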

Drew_Benton    1861
Quote:
Original post by Antheus
My advice, use PHP. Between the time needed to download a LAMP stack, learning PHP (this is taught literally as the very first lesson of LAMP) and deploying on one of free hosts which will happily serve millions of authentications, it's simply not worth spending any kind of time trying to do this in C.


I like this idea myself (not using C/C++ for this kind of stuff), which is why I suggested C# at the end of my first reply, but I didn't consider web alternatives. However, I guess it depends on what logic goes on in the authentication sequence. If it's a simple, easy-to-replicate algorithm that requires a db lookup, very little math or that sort of thing, then it'd work out well. But if it was some custom C++ public/private key implementation with custom encryption and the works, then you'd have to work that into a DLL PHP can use. I'm not sure how hard or easy that'd be, but it'd take a free host out of the question in that case.

Quote:
Original post by stonemetal
It is 64 by default, but it is easily changeable through a #define FD_SETSIZE number


Thanks for the correction, my mistake. I got that logic mixed up with WSA_MAXIMUM_WAIT_EVENTS and how you are limited to that in WSAWaitForMultipleEvents on Windows.

flodihn    281
If you are interested in the technically best solution, I would recommend Erlang.
The language itself can handle a million sockets (decrease this value depending on hardware and how much data you send/receive).
Here is a link showing how: http://www.metabrew.com/article/a-million-user-comet-application-with-mochiweb-part-1

For example, my own Erlang connection server has 10 processes (threads) listening on the same socket for new connections, and each accepted connection is managed in a new process.
My current tests show that I have no problem handling about 65 000 connections.

If you are not interested in the technically best solution but would rather just have something working, and you don't really need to handle more than 200 clients, I would recommend PHP/C#.

hplus0603    11356
Quote:
my own Erlang connection server has 10 processes (threads) listening on the same socket for new connections


Why 10? Are you not using tcp_server? I have a hard time thinking of any kind of application (including a static web content server) that would need more than one accepting thread. (Check out mochiweb for an Erlang web server used as a production ad server, for example)

For reference, tcp_server spawns a new Erlang process each time it accepts a socket, so there's a 1:1 mapping between connections and processes. (And it can do this because Erlang processes are more like "reactor objects" and less like "OS threads.")
