boost::asio proactor or reactor? true async I/O?

Started by
7 comments, last by phsan 12 years, 10 months ago
I am about to start implementing a new messaging application on Fedora Linux, whose underlying mechanism is an event demultiplexer. I was an ACE user and am now considering switching to Boost::asio. The reason I am only considering, and not switching right away, is that most UNIX platforms do not provide a robust implementation of asynchronous I/O operations the way Windows does. So we used the ACE Reactor implementation, instead of the ACE Proactor, for our old messaging system. See http://www.artima.com/articles/io_design_patterns2.html for the details behind what I just said.

Now, if I switch to Boost::asio, I will be using async I/O on Linux (Fedora). My questions are:

1) If it is true that most Unix/Linux platforms do not provide full, robust async support at the OS level, why do people use Boost::asio? Does Boost::asio rely on underlying OS support to implement async I/O (the Proactor mode/pattern), or does Boost::asio itself provide the OS-level support for implementing async I/O?

2) With Boost::asio, does it mean we can now finally use a Proactor implementation on Linux (I use Fedora), or is Boost::asio just a pseudo-asio that actually wraps the Reactor mode, which is essentially synchronous I/O?

3) Is there anything I should be worried about when switching from the ACE Reactor (synchronous I/O) to Boost::asio (the claimed async I/O)? For example, performance, scalability, or portability issues?

Thanks,
The choice of patterns affects application design; the choice of networking API depends on what is built into the kernel. Both asio and ACE are just high-level wrappers on top of the kernel, so everything depends on what is implemented there and how.

asio has the ability to use epoll, kqueue, or /dev/poll. While the application doesn't need to change, the performance characteristics may, depending on which underlying mechanism is used. Networking tends to be widely supported; file I/O, various APCs, and similar facilities are not. *nix systems have traditionally preferred different facilities for those, and file I/O tends to lag the most.

Check which of these facilities your build of Fedora supports and verify that asio correctly selects it.


There are various issues with asynchronous networking on *nix-based systems, but they need to be examined individually since there are so many variations. I seem to recall Fedora being somewhat conservative. For high-performance networking, the BSD variants seem to enjoy a good reputation.


But I'm not entirely clear on your expectations. ACE is the "original" async networking library, of the same quality as asio; just some minor details differ. asio is preferred because it's considerably more C++-like and makes heavy use of the standard C++ library, while ACE relies on a mostly C-like design. Neither is inherently better.

I am about to start implementing a new messaging application on Fedora Linux, whose underlying mechanism is an event demultiplexer.
...

if it is true that most Unix/Linux platforms do not provide full, robust async support at the OS level



UNIX's "bad" async I/O reputation generally comes from traditional read()/write() on file descriptors backed by disk. On most systems, you'd want to use mmap(), madvise(), and msync(MS_ASYNC) instead to arrange for I/O to happen in the background. The problem with those calls is that you don't actually get told when the I/O completes, and there are also kernel call overhead considerations, as you mmap() only parts of files and then munmap() when you're done.

boost::asio mainly concerns itself with I/O on sockets and socket-like objects. This means that the kernel notification mechanisms (poll, epoll, kqueue, etc.) can be used to signal available I/O, and the semantics of send and recv can be used to make sure I/O does not block. If your application is doing networking, rather than file system I/O, then boost::asio is likely just as efficient on Linux as on Windows.

That being said: if you're building some network event handler or multiplexer, why would you do it in C/C++? There are languages and environments these days that are much better suited to that problem. Anything from Python, to Scala, to Erlang, to Node.js may very well fit the bill, deliver a significant improvement in development speed, and be efficient enough at runtime that you will not go CPU-bound at any current network speed.

Sure, if you're building a layer 7 router for 100 Gbit/s Ethernet, you may need C, or assembly, or custom hardware, or most likely all three, but that's not what you're doing, right? :-)
enum Bool { True, False, FileNotFound };

...Anything from Python, to Scala, to Erlang, to Node.js may very well fit the bill

Just wanted to second the idea of using something like erlang for the networking layer.

Recently I updated some things at work to use an Erlang-based networking layer, and to be honest it simplified a lot of questions about how to do certain things in parallel. Erlang is also becoming a widely sought-after skill in the job market, as it is becoming a buzzword of sorts.
Evillive2

Recently updated some things at work to use an erlang based networking layer and to be honest it simplified a lot of questions about how to do certain things in parallel. Erlang is also becoming a widely sought after skill in the job market as it is becoming a buzz-word of sorts.


And, for those who happen to be in London with some time and money to spare in a few weeks, I'll be talking about how we use Erlang at a large scale at IMVU at the Erlang Factory:
http://www.erlang-factory.com/conference/London2011/speakers/JonWatte
Might be interesting for those who missed my talk at GDC West this year (although this talk will be more in-depth on the Erlang bit).
enum Bool { True, False, FileNotFound };
It turns out that async I/O in boost::asio is implemented through a Reactor pattern due to a lack of async I/O support at the OS level. See here: http://www.boost.org/doc/libs/1_44_0/doc/html/boost_asio/overview/core/async.html. So apparently the async is not at the socket I/O level; I need to dig deeper to find at which level it becomes async. (Any ideas?)


Also, is there an easy way to see whether the Reactor is using select(), epoll(), or kqueue()? I found an example here: http://stackoverflow.com/questions/3106304/boost-asio-on-linux-not-using-epoll. Not sure if there is an easier way.
What is the actual problem you are solving?

Reactor vs. proactor is mostly semantics. There are certain differences in how they are implemented, but they are both high-performance designs. Either one is enough to serve anything today at effectively zero CPU load. On server-grade machines with quality network cards, this part of networking really isn't a problem.

The important parts that asio or ACE solve are those not present in these APIs, such as worker scheduling or timeout handling. If you don't use such a library, this functionality must be implemented separately anyway.

It turns out that async I/O in boost::asio is implemented through a Reactor pattern due to a lack of async I/O support at the OS level. See here: http://www.boost.org...ore/async.html. So apparently the async is not at the socket I/O level; I need to dig deeper to find at which level it becomes async. (Any ideas?)



What problem are you attempting to solve by trying to pin down what "asynchronous I/O" really means?

Here's the Windows model:

1) Try to receive data on a socket
2) Ask for completions
3) Get told about completions
4) You now have some data (possibly less than you initially asked for)

Here's the UNIX model:

1) Ask for notifications
2) Get told about available data
3) Try to receive data on a socket
4) You now have some data (possibly less than you asked for)

The number of system calls is the same. The amount of data copying in the kernel is the same (for network sockets, at least). The flow of the code is almost the same, except the lifetime of the user-space buffer can actually be shorter in the UNIX case.

boost::asio makes both UNIX and Windows look the same by essentially emulating the Windows model, holding on to buffers earlier than strictly necessary on UNIX. However, it is likely to be equally efficient in both implementations (as long as epoll or similar is used on UNIX).

So, again: Please clearly state your specific question or assumption! What is your assumption about "asynchronous I/O," and what is your assumption about what happens when it is, or is not, available? And are you doing this for network sockets, or files on disk?
enum Bool { True, False, FileNotFound };
I am doing network sockets, not files. Thanks for the replies; they answered my question.

This topic is closed to new replies.
