A class that needs to manually manage any resources also needs to implement The Big Three. The copy-and-swap idiom applies to such classes, and elegantly assists the assignment operator in achieving two things: removing code duplication, and providing a strong exception guarantee.
Conceptually, it works by using the copy-constructor to create a copy of the data, then a swap function to swap the old data with this new data. The temporary copy is then destructed, taking the old data with it, and we are left with a copy of the new data.
Let's consider a concrete case:
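    #include <algorithm> // std::copy
    #include <cstddef>   // std::size_t

    class dumb_array
    {
    public:
        // (default) constructor: allocates a zero-initialized array
        dumb_array(std::size_t pSize = 0) :
            mSize(pSize),
            mArray(mSize ? new int[mSize]() : 0)
        {}

        // copy constructor: allocates, then deep-copies the elements
        dumb_array(const dumb_array& pOther) :
            mSize(pOther.mSize),
            mArray(mSize ? new int[mSize]() : 0)
        {
            std::copy(pOther.mArray, pOther.mArray + mSize, mArray);
        }

        // destructor
        ~dumb_array(void)
        {
            delete [] mArray;
        }

    private:
        std::size_t mSize;
        int* mArray;
    };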
This class almost manages an array successfully; it only needs a correct operator= and we'll be done. Here's how a naive implementation might look:
    // the hard part
    dumb_array& operator=(const dumb_array& pOther)
    {
        if (this != &pOther) // (1)
        {
            // get rid of the old data
            delete [] mArray; // (2)
            mArray = 0;

            // and put in the new
            mSize = pOther.mSize;
            mArray = mSize ? new int[mSize]() : 0; // (3)
            std::copy(pOther.mArray, pOther.mArray + mSize, mArray);
        }

        return *this;
    }
This now manages an array, without leaks; however, it suffers from three problems.
The first is the self-assignment test. This check is an easy way to avoid running needless code on self-assignment (which also gives self-assignment a no-throw guarantee) and to avoid subtle bugs (such as deleting the array only to then try to copy from it), but in all other cases it merely serves to slow the program down. Self-assignment rarely occurs, so most of the time this check is a waste. It would be better if the operator could work properly without it.
The second is that it only provides a basic exception guarantee. If new int[] fails, *this will have been modified (namely, the size is wrong and the data is gone). For a strong exception guarantee, it would need to be something akin to:
    dumb_array& operator=(const dumb_array& pOther)
    {
        if (this != &pOther) // (1)
        {
            // we need to get the new data ready before we replace the old
            std::size_t newSize = pOther.mSize;
            int* newArray = newSize ? new int[newSize]() : 0; // (3)
            std::copy(pOther.mArray, pOther.mArray + newSize, newArray);

            // replace the old data
            delete [] mArray;
            mSize = newSize;
            mArray = newArray;
        }

        return *this;
    }
The code has expanded...and this is only for one resource! With multiple resources to copy, we need to introduce a try/catch block and keep track of what to free if an exception is thrown. Far too messy. (Though having to manage multiple resources in a single class is a bad thing!)
The third problem is code duplication. In our case, it's only a single line with std::copy, but with more complex resources and/or multiple resources this code bloat can be quite a hassle. And we should strive to never repeat ourselves.
As mentioned, the copy-and-swap idiom will fix all these issues. It requires a working copy-constructor, which we have (as required by implementing The Big Three), and a swap function, which we don't necessarily have. While The Big Three already has us writing the copy-constructor, assignment operator, and destructor, it should really be called "The Big Three and A Half": almost any time you manually manage a resource it also makes sense to provide a swap function, for optimal swaps.
That is, std::swap will normally copy an entire object, perform two assignments, and then discard the copy. This works fine for primitive types, but for expensive classes this won't do. Rather, we should just swap all the internal members of the class, and then the objects are effectively swapped.
We do that as follows:
    class dumb_array
    {
    public:
        // ...

        void swap(dumb_array& pOther) // nothrow
        {
            using std::swap; // allow ADL

            swap(mSize, pOther.mSize); // with the internal members swapped,
            swap(mArray, pOther.mArray); // *this and pOther are effectively swapped
        }
    };

    // the following isn't strictly necessary for the copy-and-swap
    // idiom, but if we're going to implement swap let's do it right.
    namespace std
    {
        // adding things to the std namespace leads to undefined
        // behavior, unless it's a specialization.
        // (note: C++20 no longer permits specializing function templates
        // in std; there, provide a non-member swap in dumb_array's own
        // namespace instead, so that ADL can find it.)
        template <> // <-- so this is important!
        void swap(dumb_array& pFirst, dumb_array& pSecond)
        {
            pFirst.swap(pSecond);
        }
    }
Now using std::swap on dumb_array objects is much more efficient; it just swaps pointers and sizes, rather than copying and assigning entire arrays. Aside from this bonus in functionality and efficiency, we are now ready to implement the copy-and-swap idiom.
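To make the cost difference concrete, here is a minimal sketch that counts deep copies. Everything in it is an assumption for illustration: counted_array is a hypothetical stand-in for dumb_array with a copy counter bolted on, and generic_swap is a hand-written version of the classic copy-based swap (one copy plus two assignments), not the library's actual implementation.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <utility>

// Hypothetical stand-in for dumb_array that counts its deep copies.
class counted_array
{
public:
    static int copies; // deep copies: copy constructions plus copy assignments

    explicit counted_array(std::size_t pSize = 0) :
        mSize(pSize),
        mArray(mSize ? new int[mSize]() : 0)
    {}

    counted_array(const counted_array& pOther) :
        mSize(pOther.mSize),
        mArray(mSize ? new int[mSize]() : 0)
    {
        std::copy(pOther.mArray, pOther.mArray + mSize, mArray);
        ++copies;
    }

    // the hand-written strong-guarantee assignment from earlier
    counted_array& operator=(const counted_array& pOther)
    {
        if (this != &pOther)
        {
            int* newArray = pOther.mSize ? new int[pOther.mSize]() : 0;
            std::copy(pOther.mArray, pOther.mArray + pOther.mSize, newArray);

            delete [] mArray;
            mSize = pOther.mSize;
            mArray = newArray;
            ++copies;
        }
        return *this;
    }

    ~counted_array() { delete [] mArray; }

    void swap(counted_array& pOther) // nothrow
    {
        using std::swap;
        swap(mSize, pOther.mSize);
        swap(mArray, pOther.mArray);
    }

    std::size_t size() const { return mSize; }

private:
    std::size_t mSize;
    int* mArray;
};

int counted_array::copies = 0;

// What a copy-based generic swap does: one copy, then two assignments.
template <typename T>
void generic_swap(T& pFirst, T& pSecond)
{
    T temp(pFirst);
    pFirst = pSecond;
    pSecond = temp;
}

// Number of deep copies performed by the copy-based swap.
int copies_for_generic_swap()
{
    counted_array a(1000), b(2000);
    counted_array::copies = 0;
    generic_swap(a, b);
    assert(a.size() == 2000 && b.size() == 1000);
    return counted_array::copies;
}

// Number of deep copies performed by the member swap.
int copies_for_member_swap()
{
    counted_array a(1000), b(2000);
    counted_array::copies = 0;
    a.swap(b);
    assert(a.size() == 2000 && b.size() == 1000);
    return counted_array::copies;
}
```

Calling copies_for_generic_swap() reports three deep copies of whole arrays, while copies_for_member_swap() reports none: the member swap only exchanges a size and a pointer.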
The assignment operator is thus:
    dumb_array& operator=(dumb_array pOther) // (1)
    {
        swap(pOther); // (2)

        return *this;
    }
And that's it! With one fell swoop, all three problems are elegantly tackled at once. Let's dissect how it works.
We first notice a design choice: the parameter is taken by-value. While one could just as easily do the following (and indeed, many naive implementations of the idiom do):
    dumb_array& operator=(const dumb_array& pOther)
    {
        dumb_array(pOther).swap(*this);

        return *this;
    }
we lose an important optimization opportunity. The article details why, but the guideline is: if you're just going to make a copy anyway, let the compiler do it in the parameter list. (This gives it the opportunity to skip a copy when working with rvalues.) (Before you accept this as a rule to follow for all eternity, note that it does not apply in C++0x! More on this below.)
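The effect on rvalues can be observed with a small counting sketch. The names here are all hypothetical: tracked_array stands in for dumb_array, and assign_by_value / assign_by_ref stand in for the two operator= signatures so that both can live in one class. With copy elision (guaranteed for temporaries like this since C++17, and commonly performed by earlier compilers), the by-value form should make no deep copy when handed a temporary, while the const-reference form always makes one.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <utility>

// Hypothetical stand-in for dumb_array that counts copy constructions.
class tracked_array
{
public:
    static int copies; // deep copies made by the copy constructor

    explicit tracked_array(std::size_t pSize = 0) :
        mSize(pSize),
        mArray(mSize ? new int[mSize]() : 0)
    {}

    tracked_array(const tracked_array& pOther) :
        mSize(pOther.mSize),
        mArray(mSize ? new int[mSize]() : 0)
    {
        std::copy(pOther.mArray, pOther.mArray + mSize, mArray);
        ++copies;
    }

    ~tracked_array() { delete [] mArray; }

    // stands in for operator=(tracked_array): copy made in the parameter list
    tracked_array& assign_by_value(tracked_array pOther)
    {
        swap(pOther);
        return *this;
    }

    // stands in for operator=(const tracked_array&): copy made in the body
    tracked_array& assign_by_ref(const tracked_array& pOther)
    {
        tracked_array(pOther).swap(*this);
        return *this;
    }

    void swap(tracked_array& pOther) // nothrow
    {
        std::swap(mSize, pOther.mSize);
        std::swap(mArray, pOther.mArray);
    }

    std::size_t size() const { return mSize; }

private:
    // declared but not implemented, so the shallow default is never used
    tracked_array& operator=(const tracked_array&);

    std::size_t mSize;
    int* mArray;
};

int tracked_array::copies = 0;

// Returns a temporary (an rvalue at the call site).
tracked_array make_array(std::size_t pSize)
{
    return tracked_array(pSize);
}

// Deep copies made when a temporary is assigned through the by-value form.
int copies_assigning_temporary_by_value()
{
    tracked_array target(10);
    tracked_array::copies = 0;
    target.assign_by_value(make_array(1000));
    assert(target.size() == 1000);
    return tracked_array::copies;
}

// Deep copies made when a temporary is assigned through the const-reference form.
int copies_assigning_temporary_by_ref()
{
    tracked_array target(10);
    tracked_array::copies = 0;
    target.assign_by_ref(make_array(1000));
    assert(target.size() == 1000);
    return tracked_array::copies;
}
```

Comparing the two return values shows the point of the guideline: the by-value form never copies more than the const-reference form, and when the argument is a temporary the compiler is free to construct it directly in the parameter.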
Either way, this method of copying is the key to eliminating code duplication: we get to use the code from the copy-constructor to make the copy, and never need to repeat any bit of it. Now that the copy is made, we are ready to swap.
Observe that upon entering the function all the new data is already allocated, copied, and ready to be used. This is what gives us a strong exception guarantee for free: we won't even get to the swap if construction of the copy fails, and it's therefore not possible to alter the state of *this. (What we manually did before for a strong exception guarantee, the compiler is doing for us; how kind.)
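The strong guarantee can be checked with a sketch like the following. It assumes a hypothetical fragile_array whose copy constructor can be told to throw (simulating a failed allocation), plus size() and operator[] accessors that the original class does not have. If the copy fails, the assignment operator's body is never entered, so the target should be exactly as it was.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <utility>

// Hypothetical stand-in for dumb_array whose copy can be made to fail.
class fragile_array
{
public:
    static bool fail_next_copy; // when true, the next copy construction throws

    explicit fragile_array(std::size_t pSize = 0) :
        mSize(pSize),
        mArray(mSize ? new int[mSize]() : 0)
    {}

    fragile_array(const fragile_array& pOther) :
        mSize(pOther.mSize),
        mArray(0)
    {
        if (fail_next_copy)
        {
            fail_next_copy = false;
            throw std::runtime_error("simulated allocation failure");
        }

        mArray = mSize ? new int[mSize]() : 0;
        std::copy(pOther.mArray, pOther.mArray + mSize, mArray);
    }

    // copy-and-swap: the copy is made (or fails) before the body runs
    fragile_array& operator=(fragile_array pOther)
    {
        swap(pOther);
        return *this;
    }

    ~fragile_array() { delete [] mArray; }

    void swap(fragile_array& pOther) // nothrow
    {
        std::swap(mSize, pOther.mSize);
        std::swap(mArray, pOther.mArray);
    }

    // accessors added for the demonstration only
    std::size_t size() const { return mSize; }
    int& operator[](std::size_t pIndex) { return mArray[pIndex]; }

private:
    std::size_t mSize;
    int* mArray;
};

bool fragile_array::fail_next_copy = false;

// Returns true if a failed assignment threw and left the target untouched.
bool target_survives_failed_assignment()
{
    fragile_array target(3);
    target[0] = 1; target[1] = 2; target[2] = 3;
    fragile_array source(5);

    fragile_array::fail_next_copy = true;
    bool threw = false;
    try
    {
        target = source; // the copy for the parameter throws here
    }
    catch (const std::runtime_error&)
    {
        threw = true;
    }

    return threw && target.size() == 3
        && target[0] == 1 && target[1] == 2 && target[2] == 3;
}
```

Here target_survives_failed_assignment() returns true: the exception escapes from the construction of the parameter, and target still holds its original three elements.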
At this point we are home-free, because swap is non-throwing. We swap our current data with the copied data, safely altering our state, and the old data gets put into the temporary. The old data is then released when the function returns. (Whereupon the parameter's scope ends and its destructor is called.)
Because the idiom cleanly separates the constructive part from the destructive part, self-assignment can no longer corrupt the object: the copy is complete before anything is released. This means we get rid of the need for a self-assignment check, allowing a single uniform implementation of operator=. (Additionally, we no longer have a performance penalty on non-self-assignments.)
And that is the copy-and-swap idiom.
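Putting the pieces together, a sketch of the whole class with a small exercise of ordinary assignment and self-assignment might look like this. The size() and operator[] accessors and the demo function are additions for illustration only; they are not part of the class as presented above.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <utility>

class dumb_array
{
public:
    dumb_array(std::size_t pSize = 0) :
        mSize(pSize),
        mArray(mSize ? new int[mSize]() : 0)
    {}

    dumb_array(const dumb_array& pOther) :
        mSize(pOther.mSize),
        mArray(mSize ? new int[mSize]() : 0)
    {
        std::copy(pOther.mArray, pOther.mArray + mSize, mArray);
    }

    // copy-and-swap: no self-assignment test, no duplicated copying code
    dumb_array& operator=(dumb_array pOther)
    {
        swap(pOther);
        return *this;
    }

    ~dumb_array()
    {
        delete [] mArray;
    }

    void swap(dumb_array& pOther) // nothrow
    {
        using std::swap; // allow ADL

        swap(mSize, pOther.mSize);
        swap(mArray, pOther.mArray);
    }

    // accessors added for the demonstration only
    std::size_t size() const { return mSize; }
    int& operator[](std::size_t pIndex) { return mArray[pIndex]; }

private:
    std::size_t mSize;
    int* mArray;
};

// Exercises assignment and self-assignment; returns true if all checks hold.
bool demo_assignment()
{
    dumb_array a(3);
    a[0] = 10; a[1] = 20; a[2] = 30;

    dumb_array b(1);
    b = a; // a is copied into the parameter, then swapped into b
    assert(b.size() == 3 && b[0] == 10 && b[2] == 30);

    b[0] = 99; // b owns its own storage, so a is unaffected
    assert(a[0] == 10);

    dumb_array& alias = a;
    a = alias; // self-assignment: a is swapped with a copy of itself
    assert(a.size() == 3 && a[0] == 10 && a[1] == 20 && a[2] == 30);

    return true;
}
```

Self-assignment does a needless copy here, but it is correct without any special casing, which is the trade the idiom makes.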
In C++0x, we won't need to implement std::swap manually anymore. This is because instead of copying things around, std::swap will move them around; with proper move-semantics this ultimately results in the same code as our custom swap.
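As a rough check of that claim, the sketch below counts copy and move constructions while the unspecialized std::swap exchanges two objects. movable_array is a hypothetical stand-in for dumb_array with counters added; it assumes a C++11 (or later) compiler.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <utility>

// Hypothetical stand-in for dumb_array that counts copies and moves.
class movable_array
{
public:
    static int copies; // copy constructions
    static int moves;  // move constructions

    explicit movable_array(std::size_t pSize = 0) :
        mSize(pSize),
        mArray(mSize ? new int[mSize]() : nullptr)
    {}

    movable_array(const movable_array& pOther) :
        mSize(pOther.mSize),
        mArray(mSize ? new int[mSize]() : nullptr)
    {
        std::copy(pOther.mArray, pOther.mArray + mSize, mArray);
        ++copies;
    }

    movable_array(movable_array&& pOther) noexcept :
        movable_array() // delegate, then take pOther's data
    {
        swap(pOther);
        ++moves;
    }

    // by-value parameter: copy-constructed from lvalues, move-constructed from rvalues
    movable_array& operator=(movable_array pOther)
    {
        swap(pOther);
        return *this;
    }

    ~movable_array() { delete [] mArray; }

    // swaps the members individually, so it never calls operator= on the class
    void swap(movable_array& pOther) noexcept
    {
        std::swap(mSize, pOther.mSize);
        std::swap(mArray, pOther.mArray);
    }

    std::size_t size() const { return mSize; }

private:
    std::size_t mSize;
    int* mArray;
};

int movable_array::copies = 0;
int movable_array::moves = 0;

// Swaps two arrays with plain std::swap; returns true if no deep copy was made.
bool std_swap_only_moves()
{
    movable_array a(1000), b(2000);
    movable_array::copies = 0;
    movable_array::moves = 0;

    std::swap(a, b); // no specialization provided for movable_array

    assert(a.size() == 2000 && b.size() == 1000);
    return movable_array::copies == 0 && movable_array::moves > 0;
}
```

The arrays change places without a single deep copy; only pointers and sizes are shuffled, much as in the hand-written swap.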
I have expressed the ideal C++0x resource managing class on my blog, which discusses the effects move-semantics have on The Big Three (it's The Big Four now), swapping and swap, and the C++0x version of both copy-and-swap and move-and-swap.
While you can see the rationale on the blog, for the impatient here's the result:
    #include <algorithm> // std::copy
    #include <cstddef>   // std::size_t
    #include <utility>   // std::swap

    class dumb_array
    {
    public:
        // constructor
        dumb_array(std::size_t pSize = 0) :
            mSize(pSize),
            mArray(mSize ? new int[mSize]() : nullptr)
        {}

        // copy constructor
        dumb_array(const dumb_array& pOther) :
            mSize(pOther.mSize),
            mArray(mSize ? new int[mSize]() : nullptr)
        {
            std::copy(pOther.mArray, pOther.mArray + mSize, mArray);
        }

        // move constructor: start empty, then take pOther's data
        dumb_array(dumb_array&& pOther) noexcept :
            dumb_array() // delegate
        {
            swap(pOther);
        }

        // assignment operator (handles both copy and move assignment)
        dumb_array& operator=(dumb_array pOther)
        {
            swap(pOther);

            return *this;
        }

        // destructor
        ~dumb_array(void)
        {
            delete [] mArray;
        }

        // swap: only exchanges the members, so it cannot throw
        void swap(dumb_array& pOther) noexcept
        {
            std::swap(mSize, pOther.mSize);
            std::swap(mArray, pOther.mArray);
        }

    private:
        std::size_t mSize;
        int* mArray;
    };
Enjoy.