The Copy&Swap Idiom

31 comments, last by TheComet 10 years, 6 months ago

This thread is about me questioning the efficiency of the copy&swap idiom. Before I can ask this question, however, I will briefly explain what the copy&swap idiom is. For those of you who already know it, you can skip to the next big headline.

The Copy&Swap Idiom

I was reading up on the rule of 3, and found a more elegant way of overloading the assignment operator. The way I initially learned to do it was something like this:


class Foo
{
public:
   Foo& operator=( const Foo& cp )
   {
      if( this == &cp ) return *this;
      // copy resources here
      return *this;
   }
};

HOWEVER, this is not exception safe (for instance, if you are allocating new objects during the copying of resources, and any one of them throws an exception, you'll be looking at a memory leak) and it performs needless checks for self assignment.

The copy&swap idiom solves these problems. It requires your class to have a copy constructor and a swap method in order to work (which is also part of the rule of 3 and a half).

The simplest example of a copy&swap implementation would look like the following:


#include <algorithm> // std::swap before C++11
#include <utility>   // std::swap since C++11
class Foo
{
public:
   Foo() : m_Data(0) {} // default constructor, needed by the test code below
   Foo( const Foo& cp ) : m_Data(cp.m_Data) /*insert any copyable members into initializer list*/ {}
   Foo& operator=( Foo o ) // intentionally not a reference, so copying of the object Foo is forced
   {
      swap( o );
      return *this;
   }
   void swap( Foo& o )
   {
      using std::swap;
      swap( m_Data, o.m_Data );
      /* swap any further copyable members here*/
   }
private:
   int m_Data;
};

Given this test code:


Foo myObject1;
Foo myObject2;
myObject2 = myObject1; // copy and swap

When calling myObject2 = myObject1, the overloaded assignment operator operator=( myObject1 ) is called. Now here's the part where the magic happens: as the comment in the code says, myObject1 is not passed by reference, but by value. What does this mean? Two things:

  1. The copying of myObject1 into a temporary is forced, which means the copy constructor of Foo is called before we even enter the overloaded assignment operator.
  2. Unlike the "traditional" way of overloading assignment, if an exception is thrown within the copy constructor, it happens before myObject2 has been touched, so myObject2 keeps its old state. If the copy succeeds, the temporary holds myObject2's old data after the swap, and its destructor cleans that up as soon as it goes out of scope.

Since std::swap cannot throw when it is swapping built-in types such as ints and pointers, nothing after the copy can fail, so the assignment as a whole is exception safe.
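To make the exception-safety argument concrete, here is a minimal sketch. Buffer, failNextCopy and targetSurvivesFailedAssignment are hypothetical names made up for this illustration; the copy constructor throws on request to simulate a failed allocation, and the helper checks that the assignment target is left exactly as it was.

```cpp
#include <algorithm>  // std::copy, std::swap before C++11
#include <cstddef>    // std::size_t
#include <stdexcept>  // std::runtime_error
#include <utility>    // std::swap since C++11

// Hypothetical buffer type, used only for this illustration.
class Buffer
{
public:
    explicit Buffer( std::size_t n, int fill = 0 )
        : m_Data( new int[n] ), m_Size( n )
    {
        for( std::size_t i = 0; i != m_Size; ++i )
            m_Data[i] = fill;
    }

    // Copy constructor: throws on request to simulate a failed allocation.
    Buffer( const Buffer& o ) : m_Data( 0 ), m_Size( o.m_Size )
    {
        if( failNextCopy )
        {
            failNextCopy = false;
            throw std::runtime_error( "simulated copy failure" );
        }
        m_Data = new int[m_Size];
        std::copy( o.m_Data, o.m_Data + m_Size, m_Data );
    }

    ~Buffer() { delete[] m_Data; }

    // copy&swap: the parameter is copied before the body is entered
    Buffer& operator=( Buffer o ) { swap( o ); return *this; }

    void swap( Buffer& o )
    {
        using std::swap;
        swap( m_Data, o.m_Data );
        swap( m_Size, o.m_Size );
    }

    std::size_t size() const { return m_Size; }
    int front() const { return m_Data[0]; }

    static bool failNextCopy; // test hook, not part of the idiom

private:
    int*        m_Data;
    std::size_t m_Size;
};

bool Buffer::failNextCopy = false;

// Returns true if the target is unchanged after an assignment whose copy threw.
bool targetSurvivesFailedAssignment()
{
    Buffer source( 3, 7 );
    Buffer target( 2, 1 );
    Buffer::failNextCopy = true;
    try
    {
        target = source; // the by-value parameter is copied first, and that copy throws
    }
    catch( const std::runtime_error& )
    {
        return target.size() == 2 && target.front() == 1;
    }
    return false;
}
```

The exception escapes before operator= is ever entered, so there is nothing to roll back.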

A far more detailed description can be found here: http://stackoverflow.com/questions/3279543/what-is-the-copy-and-swap-idiom

Efficiency

So here's my question. Following the completely unoptimized version of the code, there appear to be a total of 4 copies made of the object. The first copy is made by the copy constructor when the assignment operator is called. The second, third, and fourth copies are made inside std::swap, where it copies each member of the object into a temporary, copies one to the other, and then copies the temporary back before destructing it again.

How efficient is this?

Here is some code for you to play around with using the copy&swap idiom:


#include <iostream>
#include <algorithm>
#include <cstddef>  // std::size_t
#include <utility>  // std::swap since C++11

template <class T>
class Container
{
public:

    // default constructor
    Container( void );

    // copy constructors
    Container( const Container<T>& cp );
    Container( const Container<T>* cp );

    // destructor
    ~Container( void );

    // pushes an element into the container
    void push_back( const T& data );

    // returns the number of elements currently in the container
    const std::size_t& size( void ) const;

    // prints the contents of the container to cout
    void print( void );

    // override subscript operators for easy element access
    T& operator[]( const std::size_t& index );
    const T& operator[]( const std::size_t& index ) const;

    // the copy&swap idiom
    Container<T>& operator=( Container<T> o );
    void swap( Container<T>& o );

private:
    T*              m_Data;
    std::size_t     m_Size;
};

// ---------------------------------------------------------------------
template <class T>
Container<T>::Container( void ) : m_Data( 0 ), m_Size( 0 )
{
}
// ---------------------------------------------------------------------
template <class T>
Container<T>::Container( const Container<T>& cp ) : m_Data( new T[cp.m_Size] ), m_Size( cp.m_Size )
{
    for( std::size_t i = 0; i != m_Size; ++i )
        m_Data[i] = cp.m_Data[i];
}
// ---------------------------------------------------------------------
template <class T>
Container<T>::Container( const Container<T>* cp ) : m_Data( new T[cp->m_Size] ), m_Size( cp->m_Size )
{
    for( std::size_t i = 0; i != m_Size; ++i )
        m_Data[i] = cp->m_Data[i];
}
// ---------------------------------------------------------------------
template <class T>
Container<T>::~Container( void )
{
    if( m_Data ) delete[] m_Data;
}
// ---------------------------------------------------------------------
template <class T>
void Container<T>::push_back( const T& data )
{

    // reallocate memory
    T* oldData = m_Data;
    m_Data = new T[m_Size+1];
    if( oldData )
    {
        for( std::size_t i = 0; i != m_Size; ++i )
            m_Data[i] = oldData[i];
        delete[] oldData;
    }

    // add new data
    m_Data[m_Size] = data;
    ++m_Size;
}
// ---------------------------------------------------------------------
template <class T>
void Container<T>::print( void )
{
    for( std::size_t i = 0; i != m_Size; ++i )
        std::cout << m_Data[i] << std::endl;
}
// ---------------------------------------------------------------------
template <class T>
const std::size_t& Container<T>::size( void ) const
{
    return m_Size;
}
// ---------------------------------------------------------------------
template <class T>
T& Container<T>::operator[]( const std::size_t& index )
{
    return m_Data[index];
}
// ---------------------------------------------------------------------
template <class T>
const T& Container<T>::operator[]( const std::size_t& index ) const
{
    return m_Data[index];
}
// ---------------------------------------------------------------------
template <class T>
Container<T>& Container<T>::operator=( Container<T> o )
{
    swap( o );
    return *this;
}
// ---------------------------------------------------------------------
template <class T>
void Container<T>::swap( Container<T>& o )
{
    using std::swap;
    swap( m_Data, o.m_Data );
    swap( m_Size, o.m_Size );
}

// ---------------------------------------------------------------------
// simple test
int main()
{

    // test on stack
    {
        Container<int> test2;
        {
            Container<int> test;
            test.push_back(10);
            test.push_back(20);
            test.print();
            test2 = test; // copy and swap occurs here
        }
        test2.print();
    }

    // test on heap
    Container<int>* test = new Container<int>();
    test->push_back(666);
    test->push_back(999);
    test->print();
    Container<int>* test2 = new Container<int>();
    *test2 = *test; // copy and swap occurs here
    delete test;
    test2->print();
    delete test2;

    return 0;
}

"I would try to find halo source code by bungie best fps engine ever created, u see why call of duty loses speed due to its detail." -- GettingNifty

Only one copy of the object is made; the swap function is only swapping primitive types: a T* and a std::size_t in your case. You can of course use some special hardware instructions, if there are any, but in general that is about as efficient as you can make a swap of your non-primitive Container type.
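One way to check this is to count element copies directly. The sketch below is not the Container from the first post but a stripped-down stand-in (MiniContainer, Counted and copiesPerAssignment are hypothetical names), with an element type that counts every copy construction and copy assignment.

```cpp
#include <algorithm>  // std::swap before C++11
#include <cstddef>    // std::size_t
#include <utility>    // std::swap since C++11

// Hypothetical element type that counts how often it is copied.
struct Counted
{
    static int copies;
    Counted() {}
    Counted( const Counted& ) { ++copies; }
    Counted& operator=( const Counted& ) { ++copies; return *this; }
};
int Counted::copies = 0;

// Stripped-down stand-in for the Container above: fixed size, no push_back.
template <class T>
class MiniContainer
{
public:
    explicit MiniContainer( std::size_t n ) : m_Data( new T[n] ), m_Size( n ) {}

    MiniContainer( const MiniContainer& cp )
        : m_Data( new T[cp.m_Size] ), m_Size( cp.m_Size )
    {
        for( std::size_t i = 0; i != m_Size; ++i )
            m_Data[i] = cp.m_Data[i]; // one element copy per element
    }

    ~MiniContainer() { delete[] m_Data; }

    MiniContainer& operator=( MiniContainer o ) { swap( o ); return *this; }

    void swap( MiniContainer& o )
    {
        using std::swap;
        swap( m_Data, o.m_Data ); // swaps a T*, never a T
        swap( m_Size, o.m_Size );
    }

private:
    T*          m_Data;
    std::size_t m_Size;
};

// Returns the number of element copies made by one container assignment.
int copiesPerAssignment( std::size_t n )
{
    MiniContainer<Counted> a( n );
    MiniContainer<Counted> b( 1 );
    Counted::copies = 0;
    b = a; // copy constructor for the parameter, then a pointer swap
    return Counted::copies;
}
```

Under these assumptions each element is copied once, by the copy constructor, and the swap adds nothing on top.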

Hi Bob,

What you say is true for the example I provided, but not if I were to specify something more complex for T, such as here:


Container<std::string> test;
Container<std::string> test2;

test.push_back( "hello world" );
test.push_back( "how many times will this be copied?" );

test2 = test;
test2.print();

I compressed the whole example down into a much simpler example. Try running this and see for yourself:


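#include <iostream>
#include <algorithm> // std::swap before C++11
#include <utility>   // std::swap since C++11

class Foo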
{
public:
    Foo() : m(0) {}
    Foo( const Foo& o ) : m(o.m) { std::cout << "copy constructor called" << std::endl; }
    ~Foo() {}
    void set( int m ){ this->m = m; }
    int get( void ){ return m; }
    Foo& operator=( Foo o ){ std::cout << "assignment operator called" << std::endl; swap( o ); return *this; }
    void swap( Foo& o ){ using std::swap; swap( m, o.m ); }
private:
    int m;
};

// ---------------------------------------------------------------------
// simple test
int main()
{

    Foo foo;
    Foo bar;
    foo.set(7);
    bar.set(10);

    std::cout << "SWAPPING..." << std::endl;
    std::swap( foo, bar ); // NOTE: this swaps the CONTENT of Foo, not the pointers to Foo

    return 0;
}

On my end, I get the following output:


SWAPPING...
copy constructor called
copy constructor called
assignment operator called
copy constructor called
assignment operator called

There are a total of 3 copies being made of Foo inside std::swap, just as I predicted. These aren't copies of pointers to the member variable, these are actual raw data copies. If I were using a 500 kB std::string in place of an integer, I would be copying this string back and forth 3 times inside std::swap, on top of the copy made by the assignment itself.

Not really. std::swap is overloaded for std::string to swap internals. You can also specialize std::swap for your own types as well. Ex:

template <> void std::swap(Foo & lhs, Foo & rhs) { lhs.swap(rhs); }

Your container class stores a T*, so even if you instance the container with T as a complex and expensive-to-copy class, the swap in your container is for the primitive type T* and not the complex type T.

Ah, that makes much more sense! So I modified it to be the following and now it's only copying once (as it should be). Thanks!


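#include <iostream>
#include <algorithm> // std::swap before C++11
#include <utility>   // std::swap since C++11

class Foo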
{
public:

    // default constructor
    Foo() : m(0) {}

    // copy constructor
    Foo( const Foo& o ) : m(o.m) { std::cout << "copy constructor called" << std::endl; }

    // default destructor
    ~Foo() {}

    // some get and set methods
    void set( int m ){ this->m = m; }
    int get( void ){ return m; }

    // copy and swap idiom
    Foo& operator=( Foo o )
    {
        std::cout << "assignment operator called" << std::endl;
        swap( *this, o );
        return *this;
    }
    void swap( Foo& a, Foo& b )
    {
        using std::swap;
        swap( a.m, b.m );
    }
private:
    int m;
};

namespace std {
    template <> void swap( Foo& a, Foo& b )
    {
        a.swap(a,b);
    }
}

// ---------------------------------------------------------------------
// simple test
int main()
{

    Foo foo;
    Foo bar;
    foo.set(9);
    bar = foo;

    return 0;
}


HOWEVER, this is not exception safe (for instance, if you are allocating new objects during the copying of resources, and any one of them throws an exception, you'll be looking at a memory leak) and it performs needless checks for self assignment.
This is off topic, but: in 99% of the C++ projects I've worked on, exceptions have been banned by the programming guidelines/standards.

Writing exception-safe code is very important if you're writing the standard library that will be used by everyone, but in my own experience, it's not something that is required in order to work on most C++ projects.

Copy and swap isn't limited to C++ exceptions. It can be applied to many domains that have transaction safety requirements no matter what language and error reporting mechanism you use.
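As a rough illustration of that point, the same build-aside-then-commit shape works with plain return codes. Everything here (Settings, loadSettings, the comma-separated input format) is a hypothetical example, not something from this thread: the function parses into a scratch object and only swaps it into the caller's object once parsing has fully succeeded.

```cpp
#include <string>
#include <vector>

// Hypothetical settings holder, used only for this illustration.
struct Settings
{
    std::vector<int> values;
};

// Parses comma-separated non-negative integers such as "1,2,3" into out.
// Returns false on malformed input and leaves out untouched.
// (Integer overflow is ignored to keep the sketch short.)
bool loadSettings( const std::string& text, Settings& out )
{
    Settings scratch;       // all work happens on this temporary
    int current = 0;
    bool haveDigit = false;

    for( std::string::size_type i = 0; i != text.size(); ++i )
    {
        const char c = text[i];
        if( c >= '0' && c <= '9' )
        {
            current = current * 10 + ( c - '0' );
            haveDigit = true;
        }
        else if( c == ',' && haveDigit )
        {
            scratch.values.push_back( current );
            current = 0;
            haveDigit = false;
        }
        else
        {
            return false;   // error reported by return value; out is unchanged
        }
    }

    if( !haveDigit )
        return false;       // empty input or trailing comma

    scratch.values.push_back( current );
    out.values.swap( scratch.values ); // commit step
    return true;
}
```

No exception handling is involved; the caller either sees the complete new state or the complete old one.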

Sorry but I have another problem. How do I specialize std::swap if the arguments are templates?


namespace std {
    template <> void swap( Container<T>& a, Container<T>& b ) // where does T come from now?
    {
        a.swap( b );
    }
}

I'm still getting to grips with templates.


HOWEVER, this is not exception safe (for instance, if you are allocating new objects during the copying of resources, and any one of them throws an exception, you'll be looking at a memory leak) and it performs needless checks for self assignment.
This is off topic, but: in 99% of the C++ projects I've worked on, exceptions have been banned by the programming guidelines/standards.

Writing exception-safe code is very important if you're writing the standard library that will be used by everyone, but in my own experience, it's not something that is required in order to work on most C++ projects.

That's odd, because I've been encouraged to use exceptions if it makes sense. There's an entire section on when and when not to use exceptions here: http://www.parashift.com/c++-faq-lite/exceptions.html

It basically boils down to this: If the method encounters something it was not designed to handle, it makes sense to throw an exception. This definition may sound cool when it's laid out this simply, but in practice it can be very unclear when to use exceptions.

A real world example: You have a login screen and a method to evaluate login information. If said method encounters the username and password to be incorrect, it does not make sense to throw an exception, because it should be designed to handle incorrect login information. If for instance the validation method fails to allocate resources (such as creating the pop up window to tell the user the data is incorrect), it counts as something that cannot be handled from within the method, so it would then make sense to throw an exception.
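In code, the login example might look roughly like this. UserStore and validateLogin are hypothetical names, and the "resource failure" is simulated with a plain flag: wrong credentials are an expected outcome and come back as a return value, while an unavailable store is something the function cannot deal with, so it throws.

```cpp
#include <stdexcept>
#include <string>

// Hypothetical credential store; available stands in for any resource
// the validation depends on but cannot repair itself.
struct UserStore
{
    bool        available;
    std::string user;
    std::string password;
};

// Returns true if the credentials match, false if they don't.
// Throws std::runtime_error if the store itself cannot be used.
bool validateLogin( const UserStore& store,
                    const std::string& user,
                    const std::string& password )
{
    if( !store.available )
        throw std::runtime_error( "user store unavailable" ); // not our job to handle

    // Incorrect login data is a normal result, not an exceptional one.
    return user == store.user && password == store.password;
}
```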

As to exception safety in C++, if you design everything to use RAII semantics (shared_ptr for instance), you're on the safe side.
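A small sketch of what that buys you, assuming a C++11 compiler for std::shared_ptr. Tracked, buildPair and aliveAfterFailure are hypothetical names; Tracked just counts how many instances are currently alive, so a leak would show up as a non-zero count.

```cpp
#include <memory>     // std::shared_ptr
#include <stdexcept>  // std::runtime_error

// Hypothetical type that counts live instances.
struct Tracked
{
    static int alive;
    Tracked()  { ++alive; }
    ~Tracked() { --alive; }
};
int Tracked::alive = 0;

// Acquires two resources; optionally fails between the two acquisitions.
void buildPair( bool failSecond )
{
    std::shared_ptr<Tracked> first( new Tracked );
    if( failSecond )
        throw std::runtime_error( "simulated failure" );
    std::shared_ptr<Tracked> second( new Tracked );
    // both are released automatically when the function returns
}

// Returns the number of Tracked objects still alive after a failed buildPair.
int aliveAfterFailure()
{
    try
    {
        buildPair( true );
    }
    catch( const std::runtime_error& )
    {
        // stack unwinding has already destroyed first
    }
    return Tracked::alive;
}
```

With a raw Tracked* in place of the shared_ptr, the first object would be leaked when the exception is thrown.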

template <typename T> void swap( Container<T>& a, Container<T>& b )
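Note that this line is an overload rather than a specialization: function templates cannot be partially specialized, and adding new overloads to namespace std is not allowed. The usual approach, sketched below under those assumptions, is to put the overload in the same namespace as the class and let argument-dependent lookup find it. The names demo, Box, overloadCalls and genericSwap are hypothetical and only serve the illustration.

```cpp
#include <algorithm>  // std::swap before C++11
#include <utility>    // std::swap since C++11

namespace demo { // hypothetical namespace holding the class and its swap

int overloadCalls = 0; // counts calls to the non-member swap below

template <class T>
class Box
{
public:
    explicit Box( const T& v ) : m_Value( new T( v ) ) {}
    Box( const Box& o ) : m_Value( new T( *o.m_Value ) ) {}
    ~Box() { delete m_Value; }

    Box& operator=( Box o ) { swap( o ); return *this; }

    void swap( Box& o )
    {
        using std::swap;
        swap( m_Value, o.m_Value ); // pointer swap only
    }

    const T& get() const { return *m_Value; }

private:
    T* m_Value;
};

// Non-member overload in the class's own namespace, not in std.
template <class T>
void swap( Box<T>& a, Box<T>& b )
{
    ++overloadCalls;
    a.swap( b );
}

} // namespace demo

// Generic code: make std::swap visible as a fallback, then call swap
// unqualified so a better match in the argument's namespace can win.
template <class T>
void genericSwap( T& a, T& b )
{
    using std::swap;
    swap( a, b );
}
```

Code that calls std::swap( a, b ) with the std:: qualifier will still bypass the overload, which is why the unqualified call after "using std::swap" is the usual convention.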
