Using STL Algorithms

Published March 05, 2015
Advertisement
I have a small confession to make: I have very rarely used STL algorithms, even though I know I should. There is lots of really great reasons why I should, there top C++ developers out there that tell you to use them whenever it applies, but I just don't feel 100% comfortable with them yet. With that in mind, we can explore a recent foray into the use of an algorithm to both make things easier and harder at the same time :)

The Setup


I recently wrote some code that would access a vector of structures, where each structure is sorted according to one of its data members. The data member itself happened to be a float. There is nothing too fancy or special about the data structure at all. One particular use of the data stored in this vector is to find the two elements that surround a provided input value. In this case, the code is being used to perform an interpolation between those two structures, based on the location of the input value with respect to the two enclosing structures.

That sounds easy enough, but in fact it can lead to a morass of code littered with if statements, a for loop, and a few other things that are kind of 'old-school C++'. Since I was accessing two elements at a time, I was stuck with a for loop (or something equivalent) and because you have to handle the cases where the input is outside of the two ends of the vector, there must be some branching involved too. By the time it was up and running, I went back to try and comment the code and ensure that when I came back 3 months from now that I could understand it. I was having a bit of trouble writing a good set of comments, which I thought was a bad sign...

The Swing


It was then that I decided I would give it a shot to try and find an STL algorithm that could get me pretty close to the same result but hopefully with less code to maintain. In case you aren't familiar, I am talking about the contents of the algorithm header, which supplies something like 40 prebuilt functions for some of the most common operations in computing with the STL data structures.

The less code you have to write, the better off you are in the long run. In addition, algorithms have (supposedly) well defined functionality which means that your code should be more expressive to boot. The trick is to know which algorithm will do what you want, and then to understand the semantics of the algorithms to make sure you are using it correctly. This latter part is where I fell down this time...

In case you haven't already guessed it, the algorithm that can be used for the searching task I mentioned above is the std::lower_bound and std::upper_bound functions. If you haven't ever used them, I would recommend writing out a few sample programs to play around with them - they are incredibly easy to use, and you don't need to fuss about the implementation. It just does its thing and returns the result.

So I modified my function to use lower_bound, and then compare the result with the begin and end of the vector. The code was about 1/4 the size, very easy to understand, and easily described with comments. I thought it was a grand testament to the algorithm header - the stories were true, and I should be using algorithms whenever I can!

The Miss


The problem is that when I was more closely scrutinizing the results of the function calls, it turned out that I was always getting an iterator to the element above the input value. That didn't seem to make any sense, since a lower_bound function should be returning the item that is just lower than my input! But no, that isn't how it works...

Let's assume a simple example of a vector of floats, that is populated like this:std::vector values = {1.0f, 2.0f, 3.0f};
Now if you call lower bound and pass in a value of 1.5f, then I would expect to get an iterator to the first element. It turns out that you will always get the second element as a result. The reason is that the iterator that is returned is actually designed to be the iterator where you would insert the new value if you were going to insert it. That's right - it has nothing to do with the lower bound, but rather it is the insertion location.

I thought to myself "That's strange... I wonder what upper_bound does then...". It actually returns an iterator to the item after your input value - which is what I would expect it to do. It turns out that the name lower_bound is not entirely in agreement with what most of us would call a lower bound. The only real difference between these two functions is when you are requesting a value that is already in the vector (or if there are multiples of that value in the vector). It is a very subtle difference, and maddeningly difficult to reason about when you are an hour or two into refactoring a relatively minor function.

The End


So the moral of the story is that algorithms are good, and they can help you. If given the choice, I would take the algorithm version of the function any day. But you take responsibility for fully understanding what the algorithm does - and you will pay the consequences if you jump in without fully understanding what the algorithm does!

If you want to try it out for yourself, here is the simple starter code that I used to play around with it:#include "stdafx.h"#include #include #include int _tmain(int argc, _TCHAR* argv[]){ std::vector values{ 1.0f, 2.0f, 2.0f, 3.0f }; auto lower_bound_result = std::lower_bound(values.begin(), values.end(), 2.0f); auto upper_bound_result = std::upper_bound(values.begin(), values.end(), 2.0f); std::wcout << L"Lower Bound: " << lower_bound_result - values.begin() << std::endl; std::wcout << L"Upper Bound: " << upper_bound_result - values.begin() << std::endl; return 0;}
7 likes 3 comments

Comments

Washu
The behavior is consistent with all other container behaviors, in which "end" is returned when something is not found. Which is to say, lower bound returns an iterator past the last possible point for the object to exist. I will admit it is confusing, but if you just need to check for existence there's binary_search. You could also use equal_range, and if the two iterators are equal then the element was not found.


#include <algorithm>
#include <iostream>
#include <vector>

int main()
{
	std::vector<float> values{ 1.0f, 2.0f, 2.0f, 3.0f };

	auto result = std::equal_range(values.begin(), values.end(), 2.0f);

	std::wcout << L"Lower Bound: " << result.first - values.begin() << std::endl;
	std::wcout << L"Upper Bound: " << result.second - values.begin() << std::endl;

	return 0;
}
March 05, 2015 03:20 AM
Jason Z

Thanks for the suggestion - I'll give that a try and see how it works out. I don't question the consistency of how the algorithm works, but rather just that the name is a bit misleading. I'm certain that with more interactions with these algorithms that things like this will just be a distance memory :)

March 05, 2015 03:48 AM
Krohm

I use algorithm sparingly albeit I'm looking to use it more. Thank you for sharing.

March 05, 2015 07:54 AM
You must log in to join the conversation.
Don't have a GameDev.net account? Sign up!
Advertisement