Sign in to follow this  
nethackpro

select without a linear search?

Recommended Posts

Im trying to make some changes to my server side code to make it more efficient, so Ive been reading up more on select(). The problem is, all the examples I find do something like this after calling select
	if (FD_ISSET(sock,&socks))
		handle_new_connection();
	/* Now check connectlist for available data */
	
	/* Run through our sockets and check to see if anything
		happened with them, if so 'service' them. */

	for (listnum = 0; listnum < 5; listnum++) {
		if (FD_ISSET(connectlist[listnum],&socks))
			deal_with_data(listnum);
	}
}
My problem with this is the linear search it does. We know from our call to select how many sockets are ready to be read from, but isnt there a way to know specifically which ones they are off the bat without having to do a linear search? With this method, I'd have to search through every single connected player every time any of the clients send a packet to my server. Maybe theres something similar to select that returns a fd_set containing only the sockets ready for reading?

Share this post


Link to post
Share on other sites
Yes, there are much better alternatives, but they depend on your target OS and are not necessarily trivial to implement.

Seeing that your example has an entire 5 sockets, it is probably not worth to spend a lot of time worrying, because there is no really existent overhead with select(). Unless you plan for more than a hundred descriptors, select() works just as fine as everything else.

If you need several hundred or thousand descriptors, look into epoll() or kqueue() for unix-like system and IO completion ports for Windows.

Share this post


Link to post
Share on other sites
Then you want epoll. Note that while epoll can be confusing (read the docs very carefully!), it is flipping awesome, you can for example create special file descriptors that fire notifications when your thread catches a signal, and catch those with the same epoll call as you catch your socket notifications and your disk ready notifications.

If you don't want to mess with the low-level stuff yourself, there is libevent which does that for you. As with all layered libraries, it adds a little overhead (hardly noticeable) and does not do all the cool things that you could possibly do otherwise, but it works and it takes away 99.5% of the pain implementing it.

Share this post


Link to post
Share on other sites
thanks for the helpful reply. I plan on implementing it all myself as the main reason Im doing this project is to learn. I'm currently reading the man pages on epoll because I cant seem to find decent documentation on it online. Do you know of any sites that might be helpfull in learning the ins and outs of epoll?

Share this post


Link to post
Share on other sites
Boost has the "asio" library. It wraps the epoll/iocp calls into a single interface. Works the same on linux and windows.

It's not perfect, but it's portable, and event driven (no select search to do).

www.boost.org

Share this post


Link to post
Share on other sites
I agree; the epoll API is better than the select API for certain heavy use cases.

However, have you ever profiled this loop to be a problem? If you start getting more load, what will happen is that any time spent looping over inactive sockets will be time that the other users can add more data to their buffers, which means there will be more active users in the next iteration. The system is self-balancing that way, and under heavy load, the looping time just won't show up on a profile. Meanwhile, why would you optimize the light load case? :-)

Share this post


Link to post
Share on other sites
Ok, so I've got everything set up right going by the man pages, but it seems to be stuck on epoll_wait. It always returns 0 and I cant figure out why for the life of me. Im going to assume its because Ive been up for over 24 hours and made a really simple mistake somewhere but heres the code I got so far for it.

if((ReadFDs = epoll_wait(epFD, Events, MAX_PLAYERS, 0)) < -1)
{
perror("epoll_wait");
exit(EXIT_FAILURE);
}
// We've got some data to read //
else if(ReadFDs > 0)
{
for(int i = 0; i < ReadFDs; i++)
{
// New Connection //
if(Events[i].data.fd == Listener)
{
if((NewFD = accept(Listener, (struct sockaddr *)&Client, &CliLen))<0)
{
perror("accept");
exit(EXIT_FAILURE);
}
SetNonBlocking(NewFD);
Event.events = EPOLLIN | EPOLLET;
Event.data.fd = NewFD;
if(epoll_ctl(epFD, EPOLL_CTL_ADD, NewFD, &Event) < 0)
{
perror("epoll_ctl");
exit(EXIT_FAILURE);
}
send(NewFD, "Welcome to Paveria...\n", 22, 0);
}
// Handle Input //
else
{
}


The man page says this...

EXAMPLE FOR SUGGESTED USAGE
While the usage of epoll when employed like a Level Triggered interface
does have the same semantics of poll(2), an Edge Triggered usage
requires more clarifiction to avoid stalls in the application event
loop. In this example, listener is a non-blocking socket on which lis-
ten(2) has been called. The function do_use_fd() uses the new ready
file descriptor until EAGAIN is returned by either read(2) or write(2).
An event driven state machine application should, after having received
EAGAIN, record its current state so that at the next call to
do_use_fd() it will continue to read(2) or write(2) from where it
stopped before.

struct epoll_event ev, *events;

for(;;) {
nfds = epoll_wait(kdpfd, events, maxevents, -1);

for(n = 0; n < nfds; ++n) {
if(events[n].data.fd == listener) {
client = accept(listener, (struct sockaddr *) &local,
&addrlen);
if(client < 0){
perror("accept");
continue;
}
setnonblocking(client);
ev.events = EPOLLIN | EPOLLET;
ev.data.fd = client;
if (epoll_ctl(kdpfd, EPOLL_CTL_ADD, client, &ev) < 0) {
fprintf(stderr, "epoll set insertion error: fd=%d0,
client);
return -1;
}
}
else
do_use_fd(events[n].data.fd);
}
}


Do you see anything that I'm doing obviously wrong?

Share this post


Link to post
Share on other sites
Quote:
Original post by nethackpro
Do you see anything that I'm doing obviously wrong?


Umm, you're passing zero for the timeout, which make epoll_wait return immediately regardless of how many events are available? Do I win?

Share this post


Link to post
Share on other sites
sorry, but you dont win(I wish you did, believe me. Then I could be done with this and move on). When I try to connect it shouldn't return 0 anymore. However, I tried setting the value to -1 and > 0 earlier and it didnt help

Share this post


Link to post
Share on other sites
Quote:
Original post by hplus0603
I agree; the epoll API is better than the select API for certain heavy use cases.

However, have you ever profiled this loop to be a problem? If you start getting more load, what will happen is that any time spent looping over inactive sockets will be time that the other users can add more data to their buffers, which means there will be more active users in the next iteration. The system is self-balancing that way, and under heavy load, the looping time just won't show up on a profile. Meanwhile, why would you optimize the light load case? :-)


This.

It is highly unlikely that the performance of select is really going to be a significant bottleneck.

The two golden rules of optimization:

1) Don't do it.
2) (experts only) Don't do it yet.

Share this post


Link to post
Share on other sites
Quote:
Original post by nethackpro
Hey, I got it working. The man page said to do struct epoll_event *events; Thats where it threw me off. I changed it to struct epoll_event events[MAX_PLAYERS]; and that did the trick


Yes, the example in the man page is obviously very wrong.

Ignore the people who recommend against using epoll() in Linux: the code required to use it over select() or poll() is minimal and it really does scale better than either of those two system calls. If you're handling more than a few hundred simultaneous connections the difference becomes obvious.

One cool trick with epoll() is that you can multiplex the epoll() descriptor with other file descriptors and use it with select(). You might wonder why you'd want to do something like that until you encounter something that wants to take control of your main event loop (OpenSSL, OpenLDAP, or Qt, for instance).

Share this post


Link to post
Share on other sites
Quote:
Original post by Bregma
Ignore the people who recommend against using epoll() in Linux: the code required to use it over select() or poll() is minimal and it really does scale better than either of those two system calls. If you're handling more than a few hundred simultaneous connections the difference becomes obvious.

Yeah, I don't see any reason to use select over epoll now. epoll takes like what? maybe 5 more lines to set up then select? After seeing this
http://monkey.org/~provos/libevent/libevent-benchmark2.jpg
theres no way Ill use select in any programs that I'm expecting a massive amount of connections. Although, it does seem like select would be slightly more efficient with less then 600 or so connections. But I imagine that the difference at that point wouldnt be noticeable enough to bog down any program.
And why someone would tell me not to do it when Ive already stated that I have it set up is beyond me...

Another interesting thing I noticed about the man page, is that it made no mention of adding the listener to kdpfd(the epoll file descriptor created with epoll_create). Luckily I assumed it would have to be. I tested to see what would happen if I didnt, and sure enough, it didnt work

[Edited by - nethackpro on January 22, 2009 11:17:19 AM]

Share this post


Link to post
Share on other sites
Quote:
Original post by nethackpro
And why someone would tell me not to do it when Ive already stated that I have it set up is beyond me...

Well, these are discussion forums, not question/answer forums or tech. support forums. There's an expectation that the 'consumer' of any given thread is not just the original poster but any number of other lurkers or future posters. So sometimes it's worth posting something saying "That may have worked for you, but I still don't recommend it to others". Since select() is more portable and better known than epoll() and no worse a performer in 90% of situations, it's probably best for newbies who might stumble across this thread to be told to use select().

Share this post


Link to post
Share on other sites

Create an account or sign in to comment

You need to be a member in order to leave a comment

Create an account

Sign up for a new account in our community. It's easy!

Register a new account

Sign in

Already have an account? Sign in here.

Sign In Now

Sign in to follow this