Coding a Multiuser Server in C/C++

I've been trying on and off for at least the past year to find the best way to program a TCP server, and I have a question. I asked it in my other topic, "Trouble with sockets", but no one responded. What is the best way to code a server with multi-client support? What I'm talking about is blocking versus non-blocking. Is there an easy way in Windows to poll a sockfd to see whether it has data ready to recv() or a connection waiting to be accept()ed? Or is it better to make a new thread for each client that connects, with a recv() inside each thread waiting for data from that client, plus a thread for the listening sockfd? Or is there a better way to do it without threads or select()? Or is it best to combine select() and threads, where, say, one thread watches 64 different client sockfds? I think 64 is the default max for an fd_set. I'm still a n00b with sockets, and I'd really like to clear these questions up and get on with my programming rather than keep trying to figure out the best way to do this.

The question is very platform specific. For Windows you probably want to use IOCP, for Linux you probably want epoll, for BSD kernels you probably want kqueue, and on Sun I think they use /dev/poll. There are libraries that deal with this for you, such as libevent, which scales very well.

Quote:
Original post by asp_
The question is very platform specific. For Windows you probably want to use IOCP, for Linux you probably want epoll, for BSD kernels you probably want kqueue, and on Sun I think they use /dev/poll. There are libraries that deal with this for you, such as libevent, which scales very well.


Would it be better to use libevent or IOCP, or would giving each client sockfd its own thread work just as well?

Or is it best that I learn one of those rather than using threads?

One thread per connection is the worst thing you can do unless you have very few, long-lived connections. It's the easiest thing to do, though.

I would learn libevent (C) or asio (C++), depending on whether you use C or C++. Asio has a few issues, IMO: I use it myself, and it does a large number of memory allocations, especially if you use its timers as well. It's well abstracted, though, and lets you get started very quickly. It's also part of Boost now, so it has been audited by a few good men.

Quote:
Original post by asp_
One thread per connection is the worst thing you can do unless you have very few, long-lived connections. It's the easiest thing to do, though.

I would learn libevent (C) or asio (C++), depending on whether you use C or C++. Asio has a few issues, IMO: I use it myself, and it does a large number of memory allocations, especially if you use its timers as well. It's well abstracted, though, and lets you get started very quickly. It's also part of Boost now, so it has been audited by a few good men.


Alright, well then, I think that clears things up for me: I need to learn/use Asio.

Thank you very much asp_.

Anyone else have anything to add?

Quote:
Original post by asp_

Asio has a few issues imo, I use it myself and it does a large number of memory allocations, especially if you use their timers as well.


Use 3.9 (and it appears version 1.0 has been released in the meantime). It supports custom allocators. Also, if you don't use boost::function for callbacks but provide your own callback delegates, there won't be any extra allocations.

Timers however are a software resource, which means they should be used sparingly.

I experimented with timers to provide bandwidth throttling. With one timer per client, it was always the NIC that broke first, never the application. Thousands of clients managed this way don't even dent the CPU (literally, echo server should run at idle CPU).

Quote:
Or is there a better way to do it without threads or the use of select()


For non-blocking sockets, have a single thread. recv() for n milliseconds, process the data that arrived, then send. Optionally, you can just recv() the data that is in the network buffer until you get a would-block error (EWOULDBLOCK, or WSAEWOULDBLOCK on Winsock). Then send. Repeat as needed.

[Edited by - Antheus on April 1, 2008 9:35:09 AM]
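The "recv() until you get a would-block error" approach described above can be sketched roughly as follows. This is only a minimal illustration, assuming POSIX sockets (on Winsock you would use ioctlsocket() with FIONBIO and check WSAGetLastError() for WSAEWOULDBLOCK instead). The names setNonBlocking, drainSocket and DrainResult are made up for this example and are not from the thread.

```cpp
#include <cassert>
#include <cerrno>
#include <cstddef>
#include <string>
#include <fcntl.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

// Put a descriptor into non-blocking mode (POSIX only; hypothetical helper).
bool setNonBlocking(int fd) {
    int flags = fcntl(fd, F_GETFL, 0);
    if (flags == -1) return false;
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK) != -1;
}

// Why the drain loop stopped (hypothetical type for this sketch).
enum class DrainResult { WouldBlock, PeerClosed, Error };

// Append everything currently sitting in the socket's receive buffer to
// 'out', stopping as soon as recv() reports that it would have to block.
DrainResult drainSocket(int fd, std::string& out) {
    assert(fd >= 0);
    char buf[4096];
    for (;;) {
        ssize_t n = recv(fd, buf, sizeof buf, 0);
        if (n > 0) {
            out.append(buf, static_cast<std::size_t>(n));
            continue;                       // there may be more buffered data
        }
        if (n == 0) return DrainResult::PeerClosed;   // orderly shutdown
        if (errno == EINTR) continue;                 // interrupted, retry
        if (errno == EAGAIN || errno == EWOULDBLOCK)
            return DrainResult::WouldBlock;           // buffer is empty for now
        return DrainResult::Error;
    }
}
```

A single-threaded server would call drainSocket() on each client in turn, process whatever accumulated in 'out', send its replies, and repeat.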

Quote:

there won't be any extra allocations

Extra being kind of relative here. Shared pointers alone are 2 allocations (I switched them out for intrusive pointers). An async_receive is 1 allocation. expires_from_now and async_wait together cause 3 memory allocations. Currently memory allocations account for about 15-20% of the overall execution time, because the connections are extremely short lived: they send extremely brief requests and get very brief responses. I've now eliminated all the large allocations, so asio is only requesting really small pieces of memory, and since most standard allocators are pooled in one form or another, that's acceptable. ZLIB is by far the worst culprit, allocating almost 300 KB of memory in varying sizes during one compression pass, and it ended up with a custom memory allocator.

ASIO saved me a lot of time, in my opinion it's not perfect, but it's probably one of the best libraries out there for what it tries to achieve and it's pretty customizable. It reminds me of boost in general to be honest and it seems to me it was originally written to fit in with the suite. I've been very happy with it and anything I wasn't happy with I was able to work around without rewriting the library which is awesome.

Quote:
Original post by asp_

ZLIB is by far the worst culprit allocating almost 300 KB of memory in varying sizes during one compression pass and it ended up with a custom memory allocator.


Hmmm. You aren't creating and de-allocating the stream (deflateInit, deflateEnd) on every call, are you?

I allocate one z_stream per active object. Typically this will be a single one per system; there may be others, but this avoids sharing it across threads. This approach has no real problem coping with a 100 Mbit connection. The allocation is also fixed: typically 256k, plus dynamic per-compression overhead, which apparently should be < 64k. This is where a multiplexed design comes in handy: rather than having one global z_stream, or one per connection, it's one per worker thread.

I won't vouch that there really aren't any extra allocations, but between all things, I never noticed that to be a problem.

Antheus, that's a good idea. I could keep a thread-local storage pointer to a z_stream object, which would significantly cut the memory requirements for deflation. I verified that deflate doesn't do any memory allocations at all beyond what deflateInit does. Assuming that deflate can be run multiple times on the same initialized z_stream, and that one thread runs one request from start to finish, this should be safe and a significant saving for me both performance-wise and memory-wise. Off to do some testing.
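The one-context-per-thread idea can be sketched like this, assuming a C++11 compiler with thread_local (which postdates this thread; at the time it would have been platform TLS calls). CompressionContext is a purely hypothetical stand-in for an initialized z_stream, and threadContext is an invented helper name; no real zlib calls are made here.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for expensive codec state, such as a z_stream
// after deflateInit(). The name and members are illustrative only.
struct CompressionContext {
    static std::atomic<int> constructed;   // counts expensive initializations
    std::vector<unsigned char> scratch;    // stands in for the codec's buffers

    CompressionContext() : scratch(256 * 1024) { ++constructed; }

    // Placeholder "work": copies into the preallocated scratch buffer and
    // returns the number of bytes handled. It allocates nothing new.
    std::size_t process(const unsigned char* src, std::size_t len) {
        assert(src != nullptr || len == 0);
        std::size_t n = len < scratch.size() ? len : scratch.size();
        for (std::size_t i = 0; i < n; ++i) scratch[i] = src[i];
        return n;
    }
};

std::atomic<int> CompressionContext::constructed{0};

// Returns this thread's context, constructing it the first time the
// calling thread asks for it and reusing it on every later call.
CompressionContext& threadContext() {
    thread_local CompressionContext ctx;
    return ctx;
}
```

Each worker thread that calls threadContext() pays the initialization cost once; this only stays safe under the assumption stated above, that a request runs on one thread from start to finish.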

TLS is slow, and with ASIO you don't need it.

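#ifndef CAPDUMP_COMPRESSION_HPP
#define CAPDUMP_COMPRESSION_HPP

#include <zlib.h>

class MemoryBuffer;

// Owns one inflate stream and one deflate stream and reuses them across
// calls, so zlib's internal state is allocated once per codec object.
class ZlibCodec
{
public:
    typedef unsigned int size_type;

    ZlibCodec(void);
    ~ZlibCodec(void);

    // Raw-buffer overloads: return the number of bytes written to dest.
    // See the implementation for the values returned on failure.
    int encode(unsigned char *src, size_type srcLen, unsigned char *dest, size_type destLen);
    int decode(unsigned char *src, size_type srcLen, unsigned char *dest, size_type destLen);

    // MemoryBuffer overloads (their definitions are not part of this post).
    int encode( const MemoryBuffer &src, MemoryBuffer &dst );
    int decode( const MemoryBuffer &src, MemoryBuffer &dst );
private:
    z_stream streamR; // inflate (read) state
    z_stream streamW; // deflate (write) state

    void reportError( const z_stream & s) const;
    void initStream(z_stream &s) const;
    void setInput(z_stream &s, unsigned char *dest, size_type destLen, unsigned char *src, size_type srcLen) const;
};


#endif // CAPDUMP_COMPRESSION_HPP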


#include <iostream>
#include "memorybuffer.hpp"

#include "compression.hpp"

// Initialize both streams once; they are reset, not re-created, per call.
ZlibCodec::ZlibCodec(void )
{
    initStream(streamR);

    if (inflateInit(&streamR) != Z_OK) reportError(streamR);

    initStream(streamW);

    if (deflateInit(&streamW, Z_BEST_SPEED) != Z_OK) reportError(streamW);
}

ZlibCodec::~ZlibCodec(void)
{
    (void) inflateEnd(&streamR);
    (void) deflateEnd(&streamW);
}

void ZlibCodec::initStream(z_stream &s) const
{
    // Z_NULL allocators make zlib fall back to its default allocator.
    s.zalloc = Z_NULL;
    s.zfree = Z_NULL;
    s.opaque = Z_NULL;
    // inflateInit() also requires next_in and avail_in to be initialized.
    s.next_in = Z_NULL;
    s.avail_in = 0;
    s.data_type = Z_BINARY;
}

void ZlibCodec::reportError( const z_stream & s) const
{
    std::cout << std::endl << "ZLIB ERROR: ";
    std::cout << ((s.msg) ? s.msg : "Undefined error") << std::endl;
}

// Point the stream at the caller's input and output buffers.
inline void ZlibCodec::setInput(z_stream &s, unsigned char *dest, size_type destLen, unsigned char *src, size_type srcLen) const
{
    s.next_in = src;
    s.avail_in = srcLen;
    s.next_out = dest;
    s.avail_out = destLen;
}

// Inflate src into dest in one shot. Returns the number of bytes written,
// 0 if the stream could not be reset, or the negated count of bytes
// produced so far if the input did not decode to a complete stream.
int ZlibCodec::decode(unsigned char *src, size_type srcLen, unsigned char *dest, size_type destLen)
{
    setInput(streamR, dest, destLen, src, srcLen);

    if (inflateReset(&streamR) != Z_OK)
    {
        reportError(streamR);
        return 0;
    }

    if (inflate(&streamR, Z_SYNC_FLUSH) != Z_STREAM_END)
    {
        reportError(streamR);

        return -((long)streamR.total_out);
    }

    return streamR.total_out;
}

// Deflate src into dest in one shot. Returns the number of bytes written,
// or 0 on failure (for example when dest is too small).
int ZlibCodec::encode(unsigned char *src, size_type srcLen, unsigned char *dest, size_type destLen)
{
    setInput(streamW, dest, destLen, src, srcLen);

    if (deflateReset(&streamW) != Z_OK)
    {
        reportError(streamW);
        return 0;
    }

    if (deflate(&streamW, Z_FINISH) != Z_STREAM_END)
    {
        reportError(streamW);
        return 0;
    }

    return streamW.total_out;
}


Then, you just allocate this on a per-handler basis. ASIO gives you (optionally) a guarantee that handlers are executed safely.

This way, you can re-use the z_stream, and with it by far the most costly allocation, between calls. The above should cut the running time roughly in half for small buffers.

Note: the source is from a utility, so it doesn't cover all the cases.

Okay, I've spent all day trying to figure out how to compile boost with codeblocks and mingw....

I can't figure it out.

Can anyone please link me to pre-compiled libs for boost's newest version?

I did that a while ago and it wasn't difficult, once I found the correct instructions to use [grin]. I think this is the place to start (once you have everything downloaded etc).

It says that the gcc option supports Mingw (which is the compiler that codeblocks uses, IIRC). It is a command line process; you probably shouldn't try to use codeblocks at all until you have boost built properly.

Quote:
Original post by rip-off
I did that a while ago and it wasn't difficult, once I found the correct instructions to use [grin]. I think this is the place to start (once you have everything downloaded etc).

It says that the gcc option supports Mingw (which is the compiler that codeblocks uses, IIRC). It is a command line process; you probably shouldn't try to use codeblocks at all until you have boost built properly.


Great, I hadn't found that tutorial yet :)

I did, however, manage to build bjam, heh.

Now I have another question.

It may sound stupid, but I cannot change the directory in the Windows command prompt.

I've tried typing in e:\codeblocks\boost
I've tried cd e:\....

How the hell do I change directories?

I miss Windows 2000 :( Damn XP... I don't remember it being this hard to change the directory in Windows...

The-Moon, are you sure you'll be using functionality that requires boost to be compiled? Personally I'm using the Boost Consulting installer, so I'm having a hard time keeping track of what they link to via pragmas.

Antheus, TLS had an access time of less than 1 microsecond. I suspect that the times it went over 1 microsecond, it was the scheduler interrupting the process. And I'm about to do a deflate on a few KB of data. If this were a real-time application (i.e. a game), I wouldn't use zlib myself, as it's a tad slow compared to other (commercial) libraries available. As long as there are no synchronization issues, this seems ideal. Right now the connection handler shares a lifetime with the connection itself, so I'd rather have a buffer per thread than one per connection handler if the access time is negligible. Thanks for tipping me off to the idea though; can't believe I missed that one.

Off topic, does anyone know of a commercial library that does deflate according to RFC1951 and which is a good bit faster than zlib, doesn't cost over 500 USD and allows for redistribution in binary form?

To change the working drive letter (I don't know what the right name is... "current filesystem" maybe), just type the drive letter and a colon. Note that cd on its own does not switch drives unless you pass /d, as in cd /d e:\codeblocks.
Quote:

C:\some\path\>d:

D:\>

Quote:
Original post by asp_
The-Moon, are you sure you'll be using functionality that requires boost to be compiled? Personally I'm using the Boost Consulting installer, so I'm having a hard time keeping track of what they link to via pragmas.

Antheus, TLS had an access time of less than 1 microsecond. I suspect that the times it went over 1 microsecond, it was the scheduler interrupting the process. And I'm about to do a deflate on a few KB of data. If this were a real-time application (i.e. a game), I wouldn't use zlib myself, as it's a tad slow compared to other (commercial) libraries available. As long as there are no synchronization issues, this seems ideal. Right now the connection handler shares a lifetime with the connection itself, so I'd rather have a buffer per thread than one per connection handler if the access time is negligible. Thanks for tipping me off to the idea though; can't believe I missed that one.

Off topic, does anyone know of a commercial library that does deflate according to RFC1951 and which is a good bit faster than zlib, doesn't cost over 500 USD and allows for redistribution in binary form?


No, well, maybe not with asio, but I just wanted to have it all pre-built, just in case. That way I don't need to build it later if I decide to use some of the other stuff boost has. I've heard a lot of good things about boost but haven't taken the time to use or learn it. So that's the whole point of me compiling the libs.

So anyway, does anyone know how to change directories in the Windows Command Prompt?


C:\Documents and Settings\The-Moon>e:\
'e:\' is not recognized as an internal or external command,
operable program or batch file.

C:\Documents and Settings\The-Moon>cd e:
C:\Documents and Settings\The-Moon>e:\
'e:\' is not recognized as an internal or external command,
operable program or batch file.

C:\Documents and Settings\The-Moon>c:\
'c:\' is not recognized as an internal or external command,
operable program or batch file.





*sigh* :(

C:\Documents and Settings\The-Moon>cd C:\Program Files
C:\Program Files>cd e:\codeblocks
C:\Program Files>cd E:\CodeBlocks\boost_1_35_0
C:\Program Files>



OK, why am I not allowed to go to my D:\?

Do I have to move boost to c:\?


Edit: OK, I got it working. Apparently XP wants to act like a douchebag and not let me cd to E:\, so I had to move boost to c:\.... :)

[Edited by - The-Moon on April 1, 2008 3:12:47 PM]

Quote:
Original post by rip-off
I did that a while ago and it wasn't difficult, once I found the correct instructions to use [grin]. I think this is the place to start (once you have everything downloaded etc).

It says that the gcc option supports Mingw (which is the compiler that codeblocks uses, IIRC). It is a command line process; you probably shouldn't try to use codeblocks at all until you have boost built properly.


Okay, thank you rip-off, I've managed to build boost now. Thank you very much :D

It would be nice if that tutorial came up when I double-clicked the index file in the boost main folder.... :|


Edit: Boost did not build a lib for "serialization"; is it supposed to? Also, there's no DLL file for "regex".

[Edited by - The-Moon on April 1, 2008 4:51:53 PM]

Quote:
Original post by Antheus
Quote:
Or is there a better way to do it without threads or the use of select()


For non-blocking sockets, have a single thread. recv() for n milliseconds, process the data that arrived, then send. Optionally, you can just recv() the data that is in the network buffer until you get a would-block error (EWOULDBLOCK, or WSAEWOULDBLOCK on Winsock). Then send. Repeat as needed.


I just noticed today what you wrote about this, Antheus.

A few questions.

What I'm doing: I'm making a basic chat/find-game lobby.

It will allow people to connect to one another and play a game I'm working on. The game will only support about 20 people, nothing select() couldn't handle, and it's going to be a fairly simple game, so I think select() will work fine.

I don't know how well this game will do. I might get 100 people using the lobby, maybe 1000. I don't know right now.

However, I like planning ahead. I want to set the lobby up so it can support at least 500-1000 people at once.

After reading what you said, Antheus, I looked at Beej's tutorial again....

http://beej.us/guide/bgnet/output/html/multipage/advanced.html

6.1 is what I'm referring to.

Quote:
By setting a socket to non-blocking, you can effectively "poll" the socket for information. If you try to read from a non-blocking socket and there's no data there, it's not allowed to block—it will return -1 and errno will be set to EWOULDBLOCK.


So suppose I set all my client sockfds to non-blocking.

Then once every second I run a loop calling recv() on each one. If there's no data, it returns -1 and errno is set to EWOULDBLOCK.

Would this be such a bad thing?

I don't see it as being too bad. Maybe it would take a bit more CPU than the other options, but it would save me a sh*tload of time compared with learning asio or something else.

My server has a 2.1 GHz processor. If I have 1000 people connected and I'm sending and recv()ing data, is it going to cause much of a problem when I go into the recv() loop to check for incoming data? I can't see how checking only 1000 recv()s is going to take up that much CPU.

I don't foresee my CPU jumping to 100% each time it checks the recv()s, even if 1000 people are connected.

Can anyone here say different? Is doing this that bad? What I'm doing is just a basic chat lobby to let people play games with other people.

If you guys say different, then I'll go ahead and make a test server and see just how much load it puts on my CPU, if any :)

The socket buffer is 64 KB (8 KB by default on Windows). You need to poll faster than it can fill up; otherwise the buffer fills and, with TCP, the sender stalls until you drain it (with UDP, datagrams get dropped). The rule of thumb is buffer_size / network_bandwidth. At full capacity on a 100 Mbit network, you'd need to poll at least every 5-6 ms (64 KB / roughly 10-12 MB/s).
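The buffer_size / network_bandwidth rule of thumb is easy to check with a few lines of arithmetic. A small sketch follows; the function name maxPollIntervalMs is invented for this example, and it assumes the link really is saturated and ignores protocol overhead.

```cpp
#include <cassert>

// Longest gap between polls, in milliseconds, before a receive buffer of
// 'bufferBytes' can fill up on a link delivering 'bitsPerSecond'.
// Hypothetical helper: it simply evaluates buffer size / bandwidth.
double maxPollIntervalMs(double bufferBytes, double bitsPerSecond) {
    assert(bufferBytes > 0.0 && bitsPerSecond > 0.0);
    double bytesPerSecond = bitsPerSecond / 8.0;   // 8 bits per byte
    return bufferBytes / bytesPerSecond * 1000.0;  // seconds to milliseconds
}
```

For a 64 KiB buffer, maxPollIntervalMs(64.0 * 1024.0, 100e6) comes out at a little over 5 ms on a 100 Mbit/s link, while a client sending at 20 KB/s gives roughly 3.2 seconds.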

IMHO, an async design is considerably more scalable and more suitable for a lobby server, where you don't care about real-time event processing or inter-client event ordering. Even if it takes a second or two to handle an individual request, it has no crucial impact (beyond possibly annoying the user), compared to an FPS, where 10 ms can make the difference between playable and unplayable.

Select comes in handy if you want to process network data on every frame. Read what's in the buffer; if it's a complete message, handle the network events; otherwise render and process input. Network polling is then limited by your frame rate, which will be more than sufficient. If a client is connected over a 20 KB/s link, you can afford random stalls of up to 3 seconds before the 64 KB receive buffer fills.

An example of actual logic, which is as simple as it gets. This one spins on select with a zero timeout (it just polls; if there's nothing there, it doesn't wait), so if you wanted to reduce CPU load you could put a sleep in that loop if you run it in a separate thread, or do actual processing.
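The example referred to above is not reproduced in this thread, so here is a rough sketch of the same idea: polling a set of sockets with select() and a zero timeout. It assumes POSIX sockets, and pollReadable is a made-up name. On Winsock the first argument to select() is ignored, and an fd_set holds at most FD_SETSIZE sockets (64 by default) unless you redefine FD_SETSIZE before including the Winsock header.

```cpp
#include <cassert>
#include <vector>
#include <sys/select.h>
#include <sys/socket.h>
#include <sys/time.h>
#include <sys/types.h>
#include <unistd.h>

// Return the descriptors in 'fds' that can be read right now (data is
// waiting, or the peer has closed). Never waits: the timeout is zero.
// Hypothetical helper for this sketch.
std::vector<int> pollReadable(const std::vector<int>& fds) {
    std::vector<int> ready;
    if (fds.empty()) return ready;

    fd_set readSet;
    FD_ZERO(&readSet);
    int maxFd = -1;
    for (int fd : fds) {
        assert(fd >= 0 && fd < FD_SETSIZE);   // limit of a POSIX fd_set
        FD_SET(fd, &readSet);
        if (fd > maxFd) maxFd = fd;
    }

    timeval timeout;
    timeout.tv_sec = 0;    // zero timeout: check and return immediately
    timeout.tv_usec = 0;

    int n = select(maxFd + 1, &readSet, nullptr, nullptr, &timeout);
    if (n <= 0) return ready;                 // nothing ready, or an error

    for (int fd : fds) {
        if (FD_ISSET(fd, &readSet)) ready.push_back(fd);
    }
    return ready;
}
```

In a game loop you would call pollReadable() once per frame, recv() from whatever it returns, and then carry on rendering. In a dedicated network thread, a short sleep (or a non-zero timeout) keeps the loop from pegging the CPU.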

Know what I'm thinking?

To hell with making a chat room lobby. I'll just make the "lobby" a list of all the other game servers. If people want to talk outside the game, there is Trillian :)

This way I don't need to waste a lot of time doing stuff I don't really need to do.
