A guide to getting started with boost::asio
4. Serializing our workload with strand
There will come a time when we want to queue up work to be done, where the order in which it is done matters. The strand class was created for exactly such scenarios. The strand class "provides serialised handler execution." This means that if we post work1 -> work2 -> work3 through a strand, no matter how many worker threads we have, they will be executed in that order. Neat!
With great power comes great responsibility though. We must understand the order of handler invocation for the strand class!
Order of handler invocation
Given:
- a strand object s
- an object a meeting completion handler requirements
- an object a1 which is an arbitrary copy of a made by the implementation
- an object b meeting completion handler requirements
- an object b1 which is an arbitrary copy of b made by the implementation

then asio_handler_invoke(a1, &a1) happens-before asio_handler_invoke(b1, &b1) if any of the following conditions are true:
- s.post(a) happens-before s.post(b)
- s.post(a) happens-before s.dispatch(b), where the latter is performed outside the strand
- s.dispatch(a) happens-before s.post(b), where the former is performed outside the strand
- s.dispatch(a) happens-before s.dispatch(b), where both are performed outside the strand
Note that in the following case:
async_op_1(..., s.wrap( a ));
async_op_2(..., s.wrap( b ));
the completion of the first async operation will perform s.dispatch( a ), and the second will perform s.dispatch( b ), but the order in which those are performed is unspecified. That is, you cannot state whether one happens-before the other. Therefore none of the above conditions are met and no ordering guarantee is made.
It is absolutely imperative that we understand these conditions when using the strand class. If we do not, we can code a solution with unspecified ordering that works most of the time, but every once in a while it breaks down, and it is extremely hard to figure out why! I have done this myself and learned quite a lot from it as a result.
Now we can consider an example where we do not use strand. We will remove the output locks on the std::cout object.
The output on my PC was as follows:
This is pretty much expected. Since we no longer lock the std::cout object and have multiple threads writing to it, the output gets jumbled together. Depending on how many worker threads and CPU cores we have, the output might look a little different and might even come out correct! Even when the output happens to look correct, it does not mean anything, since we are not properly synchronizing access to a global shared object!
Now, let us check out the next example: simply comment out all of the io_service->post calls and uncomment the strand.post function calls. Here is one output of the strand program.
No matter how many times we run the program, we should see clean output each time for the x values. This is because the strand object is correctly serializing the event processing so that only one handler runs at a time. It is also very important to notice that a strand does not funnel all work through a single thread. If we check the previous output once again, more than one thread was used. So work will still execute serially, but it will execute through whichever worker thread is available at the time. We cannot program under the incorrect assumption that the same thread will process all of the work! If we do, we will have bugs that will come back to bite us.
As mentioned before, in the past I had used strand the wrong way without realizing it and it caused all sorts of hard to find problems. Let us now take a look at such an example that is syntactically correct but logically incorrect as per our expectations.
If we run this program quite a few times, we should see the expected 1, 2, 3, 4, 5, 6 output. However, every so often, we might see 2, 1, 3, 4, 5, 6 or some other variation where the events are swapped. Sometimes we have to run it many times to make this happen, while other times it happens more frequently. The output remains clean, but the order is just not what we expected. This is because the work we are passing is guaranteed to be executed serially, but there is no guarantee about the order in which it executes, given the API functions we are using!
So if order is important, we have to go through the strand object's API itself. If order is not important, then we can post through the io_service object and wrap our handlers with the strand. It might seem obvious now, but if we were just getting started with this stuff on our own, it would be easy to misunderstand these basic concepts. The type of work we are posting will ultimately determine which interface we want to use, as both are really useful. We will see more examples of the strand wrap member function being used in the future.
That pretty much covers the strand object. It is very powerful, as it allows us to have synchronization without explicit locking. This is absolutely a must-have feature when working with multi-threaded systems while maintaining efficiency across the board.
We almost have enough core concepts covered to move on into the networking aspect of the boost::asio library. The boost::asio library is huge with a ton of awesome features!