Jump to content

  • Log In with Google      Sign In   
  • Create Account

Like
0Likes
Dislike

The C++ Standard Library Part 2

By Howard "SiCrane" Jeng | Published Sep 13 2006 01:28 PM in General Programming

vector std::vector int_vector elements size element member function new
If you find this article contains errors or problems rendering it unreadable (missing images or files, mangled code, improper text formatting, etc) please contact the editor so corrections can be made. Thank you for helping us improve this resource




Introduction

In this article I will introduce the basics of the std::vector<> class. This is one of the most frequently used classes in the standard library. It's also one of
the first classes that people not familiar with the standard library are recommended to use.

In C++, there are several ways to use the new operator. There's the familiar new MyObject(); which allocates a single "code">MyObject. There's also new MyObject[n]; which allocates n different MyObjects in a dynamic array. While
dynamic arrays are useful, they are also error-prone and can be cumbersome to use. For instance, it's relatively easy to generate a memory leak by forgetting to delete []
the array. Or you may accidentally call delete instead of delete [] which may cause any number of problems. "fn_top" id="fnt1">1)

The std::vector<> class is a template container class that provides the functionality of a dynamic array that is less error prone, easier to use and largely just as
efficient as manually manipulating dynamic arrays.


Standard Library Conventions

The standard library is exposed through a number of header files. std::vector<> lives in the <vector> header file. So when using
std::vector<> you would do:


#include <vector>

This is not #include <vector.h>, but #include <vector>. Similarly when you include any other standard library header file there
should also be no extension.


#include <list>

#include <algorithm>

#include <iostream>

The C++ Standard Library header files inherited from C are also extensionless. However, they undergo a name transformation: take off the .h at the end and add c to the beginning. For example
#include <cstdio> instead of #include <stdio.h> or #include <cstdlib> instead of "code">#include <stdlib.h>.

As mentioned in the last article, almost all of the components of the C++ standard library are found in namespace std "fnt2">2). In this article series I generally do not use using directives or declarations and preface all the identifiers with std::.

Another thing you may notice about the standard library is that it uses all lower cases identifiers for classes, member function and basically any identifier that isn't a macro. There are
historical reasons for this practice; this is not an example that you should copy in your own code.


The Basics of std::vector<>

As I mentioned, std::vector<> is defined in the <vector> header file. As a template class it has two template arguments: the type of
element the vector holds, which I will often refer to as T in this article, and an allocator. The allocator argument defaults to "code">std::allocator<T>. 3)


std::vector<int> int_vector; <span class="codecomment">// vector of ints</span>

std::vector<int, std::allocator<int> > another_int_vector; <span class="codecomment">// same type as int_vector</span>

 

std::vector<int, boost::pool_allocator<int> > a_third_int_vector; <span class="codecomment">// a different type from the other two</span>

Most of the time the default allocator argument is fine, but knowing it is there is useful when deciphering error messages when you have a compiler error involving "code">std::vector<>. More on that later.

There are a number of different ways to construct a std::vector<>. Some of them are:


std::vector<int> int_vector; <span class="codecomment">// no constructor arguments, creates a vector of size 0 (no elements).</span>

std::vector<float> float_vector(5); <span class="codecomment">// single integer argument, creates a vector of size 5 (5 elements).</span>

std::vector<double> double_vector(5, 3.14159265359); <span class="codecomment">// integer argument followed by T, creates a vector

                                                     //  of size 5, where each element is 3.14159265359</span>

std::vector<double> another_double_vector(double_vector); <span class="codecomment">// copy constructor</span>

A std::vector<> can hold types other than primitives.


std::vector<int *> int_ptr_vector; <span class="codecomment">// holds pointers to integers</span>

std::vector<MyObject> my_object_vector; <span class="codecomment">// holds my_objects</span>

std::vector<std::vector<int> > int_vector_vector; <span class="codecomment">// holds a vector of ints</span>

However, a std::vector<> can't hold just any kind of object. In order to be stored in a std::vector<> a type needs to be copy
constructible
and assignable. There are formal definitions for this, but basically it means that copy construction and assignment should do what you expect.


const T & t = <span class="codecomment">/* get a reference somehow */</span>;

T u(t); <span class="codecomment">// this should compile and t and u should be equivalent afterwards</span>

const T & v = <span class="codecomment">/* get a reference somehow */</span>;

u = v;  <span class="codecomment">// this should compile and u and v shold be equivalent afterwards</span>

In general, most types will be valid to put inside a std::vector<>, including most user created types 4).
Notable exceptions include std::auto_ptr<> and most classes that have const or reference member variables.

And unlike operator new[], the type stored in a vector does not need to be default constructible. However, if it is not default constructible, some member functions won't
work like you may expect, like the std::vector<> constructor that takes a single integer argument.


Manipulating Elements

Once you have a vector, you can access the elements in the vector via operator[], like you would an array.


std::vector<float> float_vector(5); <span class="codecomment">// single integer argument, creates a vector of size 5.</span>

float_vector[0] = 3.0f;

float_vector[4] = -1.1f;

std::vector<> also provides special member functions to get a references to the first and the last elements in the vector.


<span class="codekeyword">float</span> f = float_vector.front(); <span class="codecomment">// equivalent to float f = float_vector[0]</span>

float_vector.back() = f;        <span class="codecomment">// equivalent to float_vector[4] = f;</span>

                                <span class="codecomment">// though it changes properly if you add or remove elements</span>

You can add a new element to the end of the vector (expanding its size) by using the push_back() member function. And you can remove elements from the end of the vector
(shrinking its size) by using the pop_back() member function.


float_vector.push_back(4.0f); <span class="codecomment">// puts 4.0f at the end of the vector, size is now 6</span>

float_vector.pop_back();      <span class="codecomment">// removes the last element of the vector, size is now 5</span>

float_vector.pop_back();      <span class="codecomment">// removes the last element of the vector, size is now 4</span>

push_back() requires you to supply an object to copy into the vector; you can't call push_back() by itself.


Size and Capacity

In addition to the actual elements in the vector, std::vector<> keeps track of two more details: the vector's size and its capacity, which you can access via the
size() and capacity() member functions. The size() function returns the number of elements that the vector contains. The
capacity() member function returns how many elements the vector can store before it needs to allocate new memory.


std::vector<int> int_vector(4); <span class="codecomment">// vector of 4 ints</span>

size_t size = int_vector.size(); <span class="codecomment">// returns 4</span>

size_t capacity = int_vector.capacity(); <span class="codecomment">// This is implementation specific, it can be any number

                                         // greater than or equal to 4. Let's pretend this is 8</span>

int_vector.push_back(5); <span class="codecomment">// adds an element, size is now 5, capacity still 8</span>

int_vector.push_back(3); <span class="codecomment">// adds an element, size is now 6, capacity still 8</span>

int_vector.push_back(9); <span class="codecomment">// adds an element, size is now 7, capacity still 8</span>

int_vector.push_back(4); <span class="codecomment">// adds an element, size is now 8, capacity still 8</span>

 

int_vector.push_back(1); <span class="codecomment">// adds an element and causes vector to allocate new memory</span>

                         <span class="codecomment">// size is now 9, capacity is implementation defined</span>

When you add new elements that cause the capacity of the vector to be exceeded, the vector needs to allocate new memory large enough to hold all the new elements plus some room for later growth,
and then copies all the elements from the old memory to the new memory. When a push_back() call causes the vector to grow, the vector usually allocates a new capacity
somewhere between 1.5 to 3 times the old size. All this work means that reallocations are very expensive.

However, if you know in advance that the vector is going to grow to a given size, you can call the reserve() member function to tell the vector to allocate enough
capacity for the number of elements you expect. For example: if you're reading data from a file and the first line tells you how many elements to process.


std::ifstream ifs(<span class="st0">"myfile.txt"</span>);

<span class="codekeyword">int</span> number_of_lines;

ifs >> number_of_lines;

std::vector<int> numbers;

numbers.reserve(number_of_lines);

<span class="codecomment">// read each line into the vector</span>

You can also change the size of a vector by calling the resize() member function. std::vector<>::resize() takes two arguments, the first
one is the new size, and the second one is the an object to copy into any new elements added to the vector. This defaults to T(). You can also call the "code">clear() member function to erase all elements in the vector.


std::vector<int> int_vector; <span class="codecomment">// size 0</span>

int_vector.resize(5, 4);     <span class="codecomment">// resizes the vector to size 5 with all the elements equal to 4</span>

int_vector.resize(10, 3);    <span class="codecomment">// resizes the vector to size 10 with all the new elements equal to 3</span>

int_vector.resize(6, 2);     <span class="codecomment">// resizes the vector to size 6, the 2 is ignored</span>

int_vector.resize(20);       <span class="codecomment">// resizes the vector to size 20, all the new elements are equal to int() or 0</span>

int_vector.clear();          <span class="codecomment">// clears all elements, size is now 0</span>

A related member function is assign(). When you pass a integer, n, as a first argument and a const T reference as the
second argument, assign() fills the vector so that it contains exactly n elements that are copies of the reference.


std:vector<int> int_vector;

int_vector.resize(5, 100); <span class="codecomment">// vector contains 5 elements each with the value 100</span>

int_vector.assign(4, 10);  <span class="codecomment">// vector now contains 4 elements each with the value 10</span>

Also, std::vector<> also supplies an empty() member function, which returns true if and only if the vector has 0 elements. This may be
faster than using size() == 0, depending on the implementation.


Iterators and Algorithms

std::vector<> doesn't provide member functions to do everything. In fact, its member functions are only related to managing and accessing elements. For other kinds
of functionality, std::vector<> provides iterators that algorithms can use to manipulate the vector's data. For example to sort a vector you can use the
std::sort<>() algorithm found in the <algorithm> header.


std::vector<int> int_vector;

int_vector.push_back(6);

int_vector.push_back(2);

int_vector.push_back(3);

int_vector.push_back(9);

std::sort(int_vector.begin(), int_vector.end()); <span class="codecomment">// sorts the vector so it now contains 2, 3, 6, 9</span>

Iterators are generalizations of the pointer concept. std::vector<> has what are called random access iterators, the type of iterator that are most like
regular pointers. 5) The begin() member function returns a std::vector<T>::iterator
that refers to the first element in the vector. The end() member function returns a std::vector<T>::iterator that points to place one
after the last element. Between the two, the iterators form a half open range, [begin, end), that describes all the elements in the vector.


std::vector<int> int_vector;

<span class="codecomment">// fill vector</span>

std::vector<int>::iterator first = int_vector.begin();  <span class="codecomment">// itr refers to the first element of int_vector;</span>

<span class="codekeyword">int</span> i = *first;                                         <span class="codecomment">// you can dereference it to read the value</span>

*first = 5;                                             <span class="codecomment">// or you can dereference it to write to it</span>

std::vector<int>::iterator middle = first + 4;          <span class="codecomment">// middle now refers to the fifth element (int_vector[4])</span>

std::sort(first, middle);                               <span class="codecomment">// sorts the first four elements</span>

                                                        <span class="codecomment">// (from first to the element before middle)</span>

<span class="codekeyword">for</span> (std::vector<int>::iterator i = int_vector.begin(); <span class="codecomment">// this loop (or the moral equivalent of it) </span>

     i != int_vector.end();                             <span class="codecomment">// appears in many algorithms. The i != int_vector.end()</span>

     ++i) {                                             <span class="codecomment">// is why you need an iterator to one after the last element</span>

  <span class="codecomment">// do stuff</span>

}

std::vector<int>::iterator j = find(int_vector.begin(), <span class="codecomment">// get an iterator into the vector that refers to an</span>

                                    int_vector.end(),   <span class="codecomment">// element whose value is 100. If there is no such element</span>

                                    100);               <span class="codecomment">// then j will be equal to int_vector.end()</span>

With iterators, you can also insert and erase elements at arbitrary points inside the vector. When you use insert() the elements inserted are inserted before the
iterator, or at the end, if you specify the end() vector.


int_vector.insert(int_vector.begin(), 5);  <span class="codecomment">// inserts an element with the value 5 at the beginning of the vector</span>

int_vector.insert(int_vector.begin() + 4, 10, 0); <span class="codecomment">// inserts 10 elements with the value 0 between the fourth</span>

                                                  <span class="codecomment">// and fifth element of the vector</span>

int_vector.erase(int_vector.begin()); <span class="codecomment">// erases the first element of the vector</span>

int_vector.erase(int_vector.begin() + 5, int_vector.begin() + 7); <span class="codecomment">// erases the sixth and seventh elements of the vector</span>

int_vector.erase(int_vector.begin() + 2, int_vector.end()); <span class="codecomment">// erases everything from the third element down</span>

However, iterators to elements in a vector may not stay valid. Some actions will invalidate iterators, which means that those iterators become unsafe to use. Also, when an iterator is
invalidated, all pointers or references to the element that the iterator refers to also become invalidated. Whenever a reallocation occurs, such as when you insert() or
push_back() more elements than the vector currently has capacity, all existing iterators become invalidated. Additionally, whenever you use "code">insert(), all iterators that referenced elements after the point of insertion become unsafe to use, and erase() invalidates all iterators, pointers or
references to the erased elements or the elements after the point of erasure.

In addition to producing iterators, some of std::vector<>'s member functions also accept iterator inputs. In particular, you can construct a vector from an iterator
range, insert() elements from an iterator range into a vector or assign() to a vector from an iterator range. All these function take input
iterators
, which are all iterators that you can read from, including the std::vector<>'s iterators. 6)


std::vector<int> int_vector;

<span class="codecomment">// fill the vector</span>

std::vector<int>::iterator middle = int_vector.begin() + int_vector.size() / 2;

std::vector<int> first_half(int_vector.begin(), middle);

std::vector<int> second_half(middle, int_vector.end());

second_half.insert(second_half.end(), first_half.begin(), first_half.end());

Generally these iterator range functions are more efficient that calling the equivalent single element functions multiple times.


The Remove-Erase Idiom

One frequent operation for vector is removing all elements of a certain value. One way you can do that is repeatedly calling std::find<>() and "code">erase() on the vector until find() returns end(). This isn't very efficient, and writing the loop necessary to do that is annoying.
Another approach is called the remove-erase idiom. It looks like:


std::vector<int> int_vector;

<span class="codecomment">// fill the vector</span>

int_vector.erase( std::remove(int_vector.begin(),

                              int_vector.end(),

                              number_to_remove)

                  int_vector.end() );

Here's what it does: std::remove<>() takes two iterators that describe a range, and a value to remove from that range. It removes the elements by copying on top of
the elements to be removed with values from later on in the range. So if you had values like:


1 2 3 4 2 2 3

and told it to remove all the 2's, the algorithm would do something like:


1 2 3 4 2 2 3

^

|

Here's the first one. It's not a 2, let's move on.



1 2 3 4 2 2 3

  ^

  | 

Here's the next one. It is a 2, let's find the next non 2.



1 2 3 4 2 2 3

  ^ ^

  | | 

The next one is a 3. Ok, copy that on top of the 2.



1 3 3 4 2 2 3

    ^

    | 

Now we need to copy something on top of 3's old position. Let's find the next non 2.



1 3 3 4 2 2 3

    ^ ^

    | |

This guy's a 4, that's not a 2. Let's copy that on top of the old 3.



1 3 4 4 2 2 3

      ^

      |

Now we need to find something to copy on top of the old 4's position.



1 3 4 4 2 2 3

      ^ ^

      | |

No good. That's a 2.



1 3 4 4 2 2 3

      ^   ^

      |   |

No good. That's another 2.



1 3 4 4 2 2 3

      ^     ^

      |     |

Ok that's not a 2. Copy it on top of the old 4's position.



1 3 4 3 2 2 3

        ^     ^

        |     |

Now we're out of inputs. So return the iterator after the last item we copied.

std::remove<>() doesn't try to erase the remaining elements from the container, because it doesn't know about the container, it only knows about the iterators it
was given. Once remove copies forward all the non 2's, it returns an iterator to just after the range that has all the 2's removed. You can use this iterator to actually erase the elements from the
vector, which is what the erase() call does. You need both, which is why this is called the remove-erase idiom.

Similar to std::remove<>(), there's also std::remove_if<>(). Instead of removing elements that are equal to a given value,
std::remove_if() removes values if the satisfy a given criterea. This critirea is given passed to std::remove_if<>() as a function or
function object argument. For example:


<span class="codekeyword">bool</span> greater_than_five(<span class="codekeyword">int</span> n) {

  <span class="codekeyword">return</span> n > 5;

}

 

std::vector<int> int_vector;

<span class="codecomment">// fill the vector</span>

int_vector.erase( std::remove_if(int_vector.begin(),

                                 int_vector.end(),

                                 &greater_than_five)

                  int_vector.end() );

This removes all elements from the vector that are greater than five. The third argument to std::remove_if<>() is called the predicate. The predicate can be
anything that you can call using operator() that takes the element type and returns something convertible to bool. So instead of the greater_than_five() function, you could do something like:


<span class="codekeyword">struct</span> GreaterThan {

  GreaterThan(<span class="codekeyword">int</span> n) : val(n) {}

  <span class="codekeyword">int</span> val;

  <span class="codekeyword">bool</span> <span class="codekeyword">operator</span>()(<span class="codekeyword">int</span> n) {

    <span class="codekeyword">return</span> n > val;

  }

}

 

int_vector.erase( std::remove_if(int_vector.begin(),

                                 int_vector.end(),

                                 GreaterThan(5))

                  int_vector.end() );

This creates a special kind of object, called a function object, that you can call using operator(). In this case it return true of the value passed via operator() is greater than some
value set during construction.

Another common use of std::remove_if<>() is to remove an element if one of it's member functions returns true. For example:


<span class="codekeyword">struct</span> Cow {

  <span class="codekeyword">enum</span> Color { Spotted, Brown };

  Color color;

  Cow(Color c) : color(c) {}

 

  <span class="codekeyword">bool</span> is_brown(<span class="codekeyword">void</span>) { <span class="codekeyword">return</span> color == Brown; }

};

 

<span class="codekeyword">int</span> main(<span class="codekeyword">int</span>, <span class="codekeyword">char</span> **) {

  std::vector<Cow> cow_vector;

  cow_vector.push_back(Cow(Cow::Brown));

  cow_vector.push_back(Cow(Cow::Spotted));

  cow_vector.push_back(Cow(Cow::Brown));

  cow_vector.push_back(Cow(Cow::Spotted));

  cow_vector.push_back(Cow(Cow::Brown));

  cow_vector.push_back(Cow(Cow::Brown));

  cow_vector.push_back(Cow(Cow::Brown));

  

  cow_vector.erase(std::remove_if(cow_vector.begin(),

                                  cow_vector.end(),

                                  std::mem_fun_ref(&Cow::is_brown)),

                   cow_vector.end());

 

  <span class="codekeyword">return</span> 0;

}

This removes all the brown cows from the vector. In this case you use std::mem_fun_ref<>() to turn a pointer to the member function to something that can be called
by std::remove_if<>() as it checks every element in the iterator range.


Const Vectors

Sometimes you may have a const vector or a const reference to a vector. This can happen for a variety of reasons, but often occurs when defining a const member function for a class type that has a
std::vector<> member variable. When you have a const vector or const reference to a vector, then some of the functions change behaviour. For example, using operator[]
on the vector no longer returns a reference to the element indexed. Instead it returns a const reference to the element, so you cannot assign to it. Also begin() and
end() return std::vector<>::const_iterator instead of std::vector<>::iterator. This means that you can't use
those iterators with algorithms that modify the vector, such as std::remove<>().


Tips, Tricks and Gotchas


Polymorphism and Pointers

std::vector<> stores elements by value. This means that storing polymorphic types by value doesn't work like you might expect.


<span class="codekeyword">class</span> Object {

  <span class="codekeyword">public</span>:

    <span class="codekeyword">virtual</span> <span class="codekeyword">void</span> hello(<span class="codekeyword">void</span>) { std::cout << <span class=

"st0">"Object::hello()"</span> << std::endl; }

    <span class="codekeyword">virtual</span> ~Object() {}

};

 

<span class="codekeyword">class</span> Dog : <span class="codekeyword">public</span> Object {

  <span class="codekeyword">public</span>:

    <span class="codekeyword">virtual</span> <span class="codekeyword">void</span> hello(<span class="codekeyword">void</span>) { std::cout << <span class=

"st0">"Dog::hello(). Woof."</span> << std::endl; }

};

 

Dog fido;

std::vector<Object> object_vector;

object_vector.push_back(fido);

object_vector.front().hello(); <span class="codecomment">// outputs "Object::hello()", not "Dog::hello(). Woof."</span>

This only copies the base part of the object into the vector, which is a problem called object slicing. To avoid this problem you can instead store pointers to polymorphic objects. "#fn7" name="fnt7" class="fn_top" id="fnt7">7)


Dog * fido = <span class="codekeyword">new</span> Dog();

std::vector<Object *> object_vector;

object_vector.push_back(fido);

object_vector.front()->hello(); <span class="codecomment">// outputs "Dog::hello(). Woof."</span>

However, if you do this you need to remember to delete the pointer manually; when the vector is destroyed it doesn't call delete on any pointers
it contains. One way to make this easier is to instead store smart pointers, like std::tr1::shared_ptr<> if your compiler's standard library implementation has it
8) or boost::shared_ptr<> if it doesn't. But keep in mind that std::auto_ptr<>,
introduced in the last article, cannot be stored in a std::vector<>. I'll cover more details about smart pointers in article eight.


boost::shared_ptr<Dog> fido(<span class="codekeyword">new</span> Dog());

std::vector<boost::shared_ptr<Object> > object_vector;

object_vector.push_back(fido);

object_vector.front()->hello(); <span class="codecomment">// outputs "Dog::hello(). Woof."</span>

Even if you aren't storing polymorphic objects in a vector, you may wish to use (smart) pointers instead if the objects you wish to store are very expensive to copy.


Your Friend typedef

When using std::vector<>, or really any template code with any degree of complexity, judicious use of typedefs can make your life much easier.


<span class="codekeyword">typedef</span> boost::shared_ptr<Object> ObjectPtr;

<span class="codekeyword">typedef</span> std::vector<ObjectPtr> LiveObjects;

LiveObjects live_objects;

LiveObjects::iterator begin = live_objects.begin();

In this example, you could change the allocator for LiveObjects with a minimum of fuss, or even the type of container.


Shrinking vectors

std::vector<>::reserve() will not reduce the capacity of a vector. If you want to reduce the capacity, you can use try the copy and swap trick.


std::vector<int> my_vector;

<span class="codecomment">// fill the vector</span>

std::vector<int>(my_vector).swap(my_vector); <span class="codecomment">// creates a temporary vector copy constructed from</span>

                                             <span class="codecomment">// the vector to be shurnk, then immediately swaps</span>

                                             <span class="codecomment">// contents with the original vector</span>

This generally will remove some, but not necessarily all, excess capacity from the original vector. Similarly, sometimes instead of just calling clear() to remove all
elements, you may wish to swap with a temporary vector to minimize the memory held by the vector.


std::vector<int> my_vector;

<span class="codecomment">// fill the vector</span>

std::vector<int>().swap(my_vector); <span class="codecomment">// creates an empty temporary vector, then immediately swaps</span>

                                    <span class="codecomment">// contents with the original vector</span>


Interoperating

Some functions expect pointers to buffers. To use std::vector<> with these functions you can pass the address of elements in the vector. For example, if you have a
vector v, &v[0] will return a pointer to the first element. std::vector<>'s elements are guaranteed to be
stored in contiguous memory 9) so you can use that pointer in the same cases you could use a pointer to a dynamic array of the same size as the
vector. (Provided the function doesn't try to delete, realloc() or similarly mess with the actual memory instead of the contents of the
memory.)


Unordered Vectors

Sometimes your vector contains a set of elements where the order is unimportant; one example might be a list of particles. In that case instead of using erase() to remove
a single element, it's often more efficient to swap the element to be removed with the last element in the vector and call pop_back().


std::vector<int>::iterator itr = <span class="codecomment">/* determine element to be erased */</span>;

<span class="codecomment">// instead of int_vector.erase(itr);</span>

std::swap(*itr, int_vector.back());

int_vector.pop_back();

Alternately, you can assign the last element on top of the current element and call pop_back().


std::vector<int>::iterator itr = <span class="codecomment">/* determine element to be erased */</span>;

<span class="codecomment">// instead of int_vector.erase(itr);</span>

*itr = int_vector.back();

int_vector.pop_back();

Which one is more efficient depends on the type used to instantiate the vector. For integers and other primitive types assignment will generally perform better. For some classes using "code">std::swap<>() will be more efficient.


Avoid std::vector<bool>

The C++ Standard defines std::vector<bool> to be implemented in quite a different manner than other std::vector<> instantiations.
Code that works fine with other vector instantiations may explode mysteriously if used with with std::vector<bool>. If you find you need a container to hold bools,
consider instead using std::deque<> (covered in part 4).


Reading Error Messages

Reading error messages from code using std::vector<> (or any template code, including the rest of the standard library) can be something of a chore. Let's start
with an easy example:


#include <vector>

 

<span class="codekeyword">int</span> main(<span class="codekeyword">int</span>, <span class="codekeyword">char</span> **) {

  std::vector<int> int_vector;

  int_vector.push_back(<span class="st0">"string"</span>);

 

  <span class="codekeyword">return</span> 0;

}

Obviously this won't work, because I'm trying to put a string into a vector, but let's pretend that this wasn't so obvious. One compiler, gcc 3.4.4, produces this error message:


temp.cpp: In function 'int main(int, char**)':

temp.cpp:5: error: invalid conversion from 'const char*' to 'int'

temp.cpp:5: error:   initializing argument 1 of 'void std::vector<_Tp, _Alloc>::push_back(const _Tp&)

         [with _Tp = int, _Alloc = std::allocator<int>]'

The "void std::vector<_Tp, _Alloc>::push_back(const _Tp&) [with _Tp = int, _Alloc = std::allocator<int>]" may look a little intimidating, so let's break it down somewhat. It says
that we're dealing with a vector where _Tp is int and _Alloc is std::allocator<int>, so let's substitute those back
into the vector bits to get:


temp.cpp:5: error: invalid conversion from 'const char*' to 'int'

temp.cpp:5: error:   initializing argument 1 of 'void std::vector<int, std::allocator<int> >::push_back(const int &)'

We know that the default allocator for std::vector<T> is std::allocator<T> so we can take that part out, and this breaks down
to:


temp.cpp:5: error: invalid conversion from 'const char*' to 'int'

temp.cpp:5: error:   initializing argument 1 of 'void std::vector<int>::push_back(const int &)'

Now it looks more managable, so it says on line five I tried to stuff a const char * into a std::vector<int>::push_back(), when it
expected an int.

Here's another example:


  std::vector<int> int_vector;

  std::vector<float>::iterator itr = int_vector.begin();

When compiled with MSVC .NET 2003, it produces this error message:


temp.cpp(6) : error C2440: 'initializing' : cannot convert

    from 'std::vector<_Ty>::iterator' to 'std::vector<_Ty>::iterator'

        with

        [

            _Ty=int

        ]

        and

        [

            _Ty=float

        ]

        No constructor could take the source type, or constructor overload resolution was ambiguous

In this case you have two different vector types involved, which shows up as two different _Ty='s lines in the error output. Subbing in the "code">_Ty='s in order you get:


temp.cpp(6) : error C2440: 'initializing' : cannot convert

   from 'std::vector<int>::iterator' to 'std::vector<float>::iterator'

So it says I can't assign an iterator to a int vector to an iterator to a float vector.


Some Performance Considerations

In the introduction I claimed that std::vector<> was largely just as efficient as using standard dynamic arrays. This is quite a large claim, and there are some
caveats. There are times where std::vector<> is less efficient than standard dynamic arrays, and there are times when dynamic arrays are less efficient. The primary
justifying factor for efficiency claims in the standard library is that the standard library is implemented as templates, and as such benefit from inlining. This means that code like:


<span class="codekeyword">for</span> (std::vector<Something>::iterator itr = some_vector.begin();

     itr != some_vector.end();

     ++itr) {

  *itr = Something();

}

compiles to equivalent machine code10) as:


<span class="codekeyword">for</span> (<span class="codekeyword">int</span> i = 0; i < number_of_somethings; ++i) something_array[i] = Something();

Places where std::vector<> differs from normal dynamic arrays come from the fact that std::vector<> holds both initialized and
uninitialized memory, and normal dynamic arrays store just initialized memory. Because std::vector<> manages both initialized and uninitialized memory, it needs to
store both size and capacity in addition to the memory address of the memory it holds. When you hold a pointer to a dynamic array all you need to do is hold the pointer and usually the size (if you
intend to do anything useful with the dynamic array). This means that std::vector<> generally stores an additional word per instance as compared to a dynamic
array.

The lack of uninitialized memory makes adding new elements to a dynamic array costly, as you need to first allocate a new array and then copy over the data from the old array every time new
elements are added. On the other hand, std::vector<>() can just use the uninitialized memory to add the new elements up to the capacity of the vector. This generally
means that the more dynamic the data set, std::vector<>() is more attractive. However, if your data size never changes, (but is still determined at runtime) then a
dynamic array may be more attractive from a memory consumption standpoint.

However, some std::vector<> implementations that uses a class type for std::vector<>::iterator, may exhibit some slower behavior for
things like using std::sort<>() due to less than perfect inline code generation. This can be addressed by instead of passing begin() and
end() to the algorithm, passing pointers to the raw data as arguments. ex: instead of doing std::sort(my_vector.begin(), my_vector.end()) doing
std::sort(&my_vector.front(), &my_vector.front() + my_vector.size()).


Closing and References

This article covered most of the more useful member functions and algorithms used with std::vector<>. However, there are still details that I didn't cover. For a
reference about std::vector<> and other algorithms to used with std::vector<> you may wish to read The C++ Standard Library by
Nicolai Josuttis. Or if you are interested in the language lawyer details, the actual text of the C++ standard: ISO/IEC 14882:2003, which can be found either in hard cover as The C++ Standard:
Incorporating Technical Corrigendum No. 1
from Wiley or in electronic or print form from your national standards body. Another good resource of learning how to use standard library containers
effectively is Effective STL by Scott Meyers.

In the next article I'll go into more details about iterators and algorithms.



1) Such as not calling the destructors for all the elements in the array or corrupting the heap.

2) exceptions being macros and the like

3) Technically, the standard library implementation may also add additional template arguments following these two as long as they have
default arguments.

4) However, you should be careful that you implemented copy construction and assignment properly for user defined types you put in a
std::vector<>.

5) Pointers are actually a special kind of random access iterator. In fact, in some std::vector<> implementations,
std::vector<T>::iterator is just a T *.

6) There are five classes of iterators: input, output, forward, bidirectional and random. All forward iterators are also input and output
iterators. All bidirectional iterators are also forward iterators, and all random access iterators are also bidirectional iterators. std::vector<>::iterator is a
random access iterator, so all std::vector<>::iterators are also input, output, forward and bidirectional iterators. More on this in the next article.

7) Remember that objects with virtual functions should also have a virtual destructor in the base class.

8) As of this writing hardly any do.

9) at least after the 2003 revision of the C++ Standard

10) There may be symbol name differences, and so on. This is in release mode. In debug builds (where performance usually isn't a factor)
std::vector<> usually performs much slower. On the other hand it may do some error checking that raw dynamic arrays with pointers does not, so that performance cost
isn't pure overhead.







Comments

Note: Please offer only positive, constructive comments - we are looking to promote a positive atmosphere where collaboration is valued above all else.




PARTNERS