Consider the following:

PImpl.hpp

class Impl;

class PImpl
{
    Impl* pimpl;
public:
    PImpl();    // defined in PImpl.cpp, where Impl is a complete type
    ~PImpl();
    void DoSomething();
};

PImpl.cpp

#include "PImpl.hpp"
#include "Impl.hpp"

PImpl::PImpl() : pimpl(new Impl) { }
PImpl::~PImpl() { delete pimpl; }
void PImpl::DoSomething() { pimpl->DoSomething(); }

Impl.hpp

class Impl
{
    int data;
public:
    void DoSomething() {}
};

client.cpp

#include "PImpl.hpp"

int main()
{
    PImpl unitUnderTest;
    unitUnderTest.DoSomething();
}

The idea behind this pattern is that Impl's interface can change, yet clients do not have to be recompiled. Yet, I fail to see how this can truly be the case. Let's say I wanted to add a method to this class -- clients would still have to recompile.

Basically, the only changes I can see that would ever require editing a class's header file are changes to the class's interface. And when that happens, pimpl or no pimpl, clients have to recompile.

What kinds of changes does this pattern let us make without recompiling client code?

+10  A: 

The main advantage is that the clients of the interface aren't forced to include the headers for all your class's internal dependencies. So any changes to those headers don't cascade into a recompile of most of your project. Plus general idealism about implementation-hiding.

Also, you wouldn't necessarily put your impl class in its own header. Just make it a struct inside the single cpp and make your outer class reference its data members directly.

Edit: Example

SomeClass.h

struct SomeClassImpl;

class SomeClass {
    SomeClassImpl * pImpl;
public:
    SomeClass();
    ~SomeClass();
    int DoSomething();
};

SomeClass.cpp

#include "SomeClass.h"
#include "OtherClass.h"
#include <vector>

struct SomeClassImpl {
    int foo;
    std::vector<OtherClass> otherClassVec;   //users of SomeClass don't need to know anything about OtherClass, or include its header.
};

SomeClass::SomeClass() { pImpl = new SomeClassImpl; }
SomeClass::~SomeClass() { delete pImpl; }

int SomeClass::DoSomething() {
    pImpl->otherClassVec.push_back(OtherClass());   // assumes OtherClass is default-constructible
    return static_cast<int>(pImpl->otherClassVec.size());
}
Alan
+1 -- can you provide an example?
Billy ONeal
Your example invokes undefined behavior: you have forgotten the "infamous" Rule of Three >> Whenever you define one of Copy Constructor, Copy Assignment Operator or Destructor, define the other two.
Matthieu M.
Not sure about the undefined behavior, the compiler-generated copy constructor is well-defined but will lead to an eventual double-free if called. Or was the effect of the double-free what you meant by UB?
Ben Voigt
@Matthieu M.: Yes, the class could use a copy constructor and operator=. But that's not relevant to the OP's question and would just clutter up an already over-verbose example.
Alan
@Alan: I disagree, unfortunately many beginners will just copy/paste and adapt your example, and as such they will end up with a sore mess in their hand. Furthermore they may not be trivial, due to exception handling (depending on the implementation you choose). @Ben: the double free was what I meant by UB.
Matthieu M.
+5  A: 

With the PIMPL idiom, if the internal implementation details of the IMPL class change, the clients do not have to be rebuilt. Any change in the interface of the IMPL class (and hence its header file) would obviously require the PIMPL class to change as well.

BTW, In the code shown, there is a strong coupling between IMPL and PIMPL. So any change in class implementation of IMPL also would cause a need to rebuild.

Chubsdad
Err.. isn't that already the case with a .cpp and .hpp pattern? If the implementation in the .cpp changes, the .hpp does *not* change. Hence no other code should need to be recompiled....
Billy ONeal
Implementation includes data members like `data` and private methods. These would change `Impl.h` (if it existed) but do not change `PImpl.h`.
Beta
@Billy ONeal: Also, since the PIMPL is an opaque pointer, it can be made to point to any concrete class derived from the PIMPL abstraction, thereby getting the advantages of the Strategy design pattern. You are no longer dependent on the actual concrete strategies, only on the interface. To my mind, this idiom is basically akin to two OOAD principles: a) program to an interface and not to an implementation, and b) favor aggregation over inheritance. Is my understanding correct?
Chubsdad
@chubsdad: I think so too; indeed, we regularly use a pimpl that points to a base class. I have added an answer with an implementation geared toward this.
Matthieu M.
@chubsdad: I don't see how an abstract base class does not ensure the exact same kind of interface segregation.
Billy ONeal
+3  A: 

In your example, you can change the implementation of data without having to recompile the clients. This would not be the case without the PImpl intermediary. Likewise, you could change the signature or name of Impl::DoSomething (to a point), and the clients wouldn't have to know.

In general, anything that can be declared private (the default) or protected in Impl can be changed without recompiling the clients.
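
A minimal single-file sketch of this point, assuming a hypothetical Counter class (Counter, step, advance and demo_counter are made-up names, not from the question). The part marked as the header is all a client would see; everything in the part marked as the source file can change without the client noticing.

```cpp
#include <cassert>

// ---- what would live in Counter.hpp (all that clients see) ----
class Counter {
public:
    Counter();
    ~Counter();
    int increment();
private:
    Counter(const Counter&);             // copying disabled to keep the sketch short
    Counter& operator=(const Counter&);
    struct Impl;                         // incomplete here: layout unknown to clients
    Impl* pimpl;
};

// ---- what would live in Counter.cpp (free to change) ----
struct Counter::Impl {
    int value;
    int step;                            // adding or removing members here leaves sizeof(Counter) alone
    Impl() : value(0), step(2) {}
    int advance() { value += step; return value; }   // internal helper, can be renamed at will
};

Counter::Counter() : pimpl(new Impl) {}
Counter::~Counter() { delete pimpl; }
int Counter::increment() { return pimpl->advance(); }

// ---- client-side usage ----
void demo_counter() {
    Counter c;
    assert(c.increment() == 2);
    assert(c.increment() == 4);
}
```

Renaming advance, or adding another member next to step, only touches the second part, so only Counter.cpp would need to be rebuilt.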

Beta
+4  A: 

Consider something more realistic and the benefits become more notable. Most of the time that I have used this for compiler firewalling and implementation hiding, I have defined the implementation class within the same compilation unit as the visible class. In your example, I wouldn't have Impl.h or Impl.cpp, and PImpl.cpp would look something like:

#include "PImpl.hpp"

#include <iostream>
#include <boost/thread.hpp>

class Impl {
public:
  Impl(): data(0) {}
  void setData(int d) {
    boost::lock_guard<boost::mutex> l(lock);  // lock_guard is a template: name the mutex type
    data = d;
  }
  int getData() {
    boost::lock_guard<boost::mutex> l(lock);
    return data;
  }
  void doSomething() {
    int d = getData();                        // read once, under the lock
    std::cout << d << std::endl;
  }
private:
  int data;
  boost::mutex lock;
};

PImpl::PImpl(): pimpl(new Impl) {
}

PImpl::~PImpl() {
  delete pimpl;
}

void PImpl::DoSomething() {
  pimpl->doSomething();
}

Now no one needs to know about our dependency on boost. This gets more powerful when mixed together with policies. Details like threading policies (e.g., single vs multi) can be hidden by using variant implementations of Impl behind the scenes. Also notice that there are a number of additional methods available in Impl that aren't exposed. This also makes this technique good for layering your implementation.

D.Shawley
+1  A: 

Without Pimpl, the .hpp file defines both the public and the private components of your class, all in one big bucket.

Privates are closely coupled to your implementation, so this means your .hpp file really can give away a lot about your internal implementation.

Consider something like the threading library you choose to use privately inside the class. Without using Pimpl, the threading classes and types might be encountered as private members or parameters on private methods. Ok, a thread library might be a bad example but you get the idea: The private parts of your class definition should be hidden away from those who include your header.

That's where Pimpl comes in. Since the public class header no longer defines the "private parts" but instead has a Pointer to Implementation, your private world remains hidden from logic which "#include"s your public class header.

When you change your private methods (the implementation), you are changing the stuff hidden beneath the Pimpl and therefore clients of your class don't need to recompile because from their perspective nothing has changed: They no longer see the private implementation members.

http://www.gotw.ca/gotw/028.htm

Allbite
+1  A: 

Not all classes benefit from p-impl. Your example has only primitive types in its internal state which explains why there's no obvious benefit.

If any of the members had complex types declared in another header, you can see that p-impl moves the inclusion of that header from your class's public header to the implementation file, since you form a raw pointer to an incomplete type (but not an embedded field nor a smart pointer). You could just use raw pointers to all your member variables individually, but using a single pointer to all the state makes memory management easier and improves data locality (well, there's not much locality if all those types use p-impl in turn).
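
On the smart pointer remark, here is a sketch assuming C++11 or later: std::unique_ptr can be a member while the pointee is still incomplete, as long as the owning class's destructor is defined where the type is complete. Session, log, entries and demo_session are hypothetical names used only for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <vector>

// ---- Session.hpp: the member itself needs neither <vector> nor <string> ----
class Session {
public:
    Session();
    ~Session();                       // declared only; defined where Impl is complete
    void log(const char* message);
    std::size_t entries() const;
private:
    struct Impl;
    std::unique_ptr<Impl> pimpl;      // allowed while Impl is incomplete
};

// ---- Session.cpp: the heavy headers are only needed here ----
struct Session::Impl {
    std::vector<std::string> lines;
};

Session::Session() : pimpl(new Impl) {}
Session::~Session() = default;        // Impl is complete here, so the deleter can be instantiated
void Session::log(const char* message) { pimpl->lines.push_back(message); }
std::size_t Session::entries() const { return pimpl->lines.size(); }

void demo_session() {
    Session s;
    s.log("open");
    s.log("close");
    assert(s.entries() == 2);
}
```

If the destructor were left implicit, it would be generated in client code where Impl is incomplete, and the build would be expected to fail there rather than silently misbehave.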

Ben Voigt
@Ben Voigt: I disagree, even with only primitive types in the implementation, using Pimpl allows you to change them without any impact on your clients (ABI preserved, no recompilation required).
Matthieu M.
+2  A: 

There have been a number of answers... but no correct implementation so far. I am somewhat saddened that the examples are incorrect, since people are likely to use them...

The "Pimpl" idiom is short for "Pointer to Implementation" and is also referred to as "Compilation Firewall". And now, let's dive in.

1. When is an include necessary?

When you use a class, you need its full definition only if:

  • you need its size (e.g. it is a by-value attribute of your class)
  • you need to access one of its methods

If you only hold a reference or a pointer to it, then, since the size of a reference or pointer does not depend on the type referred to, you need only declare the identifier (forward declaration).

Example:

#include "a.h"
#include "b.h"
#include "c.h"
#include "d.h"
#include "e.h"
#include "f.h"

struct Foo
{
  Foo();

  A a;
  B* b;
  C& c;
  static D d;
  friend class E;
  void bar(F f);
};

In the above example, which includes are "convenience" includes and could be removed without affecting correctness? Most surprisingly: all but "a.h", provided B, C, D and F are forward-declared instead (the friend declaration itself declares E).
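
A compilable sketch of that claim, under a few assumptions made only for illustration: A and C are given trivial made-up bodies, and the constructor takes a C& so the reference member can be bound.

```cpp
#include <cassert>

struct A { int value; };   // stands in for #include "a.h": the full definition is required

struct B;                  // forward declarations replace b.h, c.h, d.h and f.h
struct C;
struct D;
struct F;

struct Foo
{
  explicit Foo(C& ref);

  A a;             // by-value member: needs sizeof(A)
  B* b;            // pointer: size independent of B
  C& c;            // reference: likewise
  static D d;      // static member declaration: the type may be incomplete
  friend class E;  // this also declares E
  void bar(F f);   // by-value parameter, but only in a declaration
};

// Only here, in what would be foo.cpp, are further definitions needed.
struct C { int id; };
Foo::Foo(C& ref) : a(), b(0), c(ref) {}

int demo_foo() {
  C target = { 7 };
  Foo foo(target);
  assert(foo.b == 0);
  return foo.c.id;
}
```

The definitions of B, D and F never appear at all, because nothing in this translation unit uses them beyond naming the types.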

2. Implementing Pimpl

Therefore, the idea of Pimpl is to use a pointer to the implementation class, so as not to need to include any header:

  • thus isolating the client from the dependencies
  • thus preventing a compilation ripple effect

An additional benefit: the ABI of the library is preserved.

For ease of use, the Pimpl idiom can be used with a "smart pointer" management style:

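#include <algorithm> // std::swap (moved to <utility> in C++11)
#include <cassert>   // assert
#include <memory>    // std::auto_ptr (deprecated in C++11, removed in C++17)

// From Ben Voigt's remark
// information at:
// http://en.wikibooks.org/wiki/More_C%2B%2B_Idioms/Checked_delete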
template<class T> 
inline void checked_delete(T * x)
{
    typedef char type_must_be_complete[ sizeof(T)? 1: -1 ];
    (void) sizeof(type_must_be_complete);
    delete x;
}


template <typename T>
class pimpl
{
public:
  pimpl(): m(new T()) {}
  pimpl(T* t): m(t) { assert(t && "Null Pointer Unauthorized"); }

  pimpl(pimpl const& rhs): m(new T(*rhs.m)) {}

  pimpl& operator=(pimpl const& rhs)
  {
    std::auto_ptr<T> tmp(new T(*rhs.m)); // copy may throw: Strong Guarantee
    checked_delete(m);
    m = tmp.release();
    return *this;
  }

  ~pimpl() { checked_delete(m); }

  void swap(pimpl& rhs) { std::swap(m, rhs.m); }

  T* operator->() { return m; }
  T const* operator->() const { return m; }

  T& operator*() { return *m; }
  T const& operator*() const { return *m; }

  T* get() { return m; }
  T const* get() const { return m; }

private:
  T* m;
};

template <typename T> class pimpl<T*> {};
template <typename T> class pimpl<T&> {};

template <typename T>
void swap(pimpl<T>& lhs, pimpl<T>& rhs) { lhs.swap(rhs); }

What does it have that the others didn't?

  • It simply obeys the Rule of Three: defining the Copy Constructor, Copy Assignment Operator and Destructor.
  • It does so implementing the Strong Guarantee: if the copy throws during an assignment, then the object is left unchanged. Note that the destructor of T should not throw... but then, that is a very common requirement ;)
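
To see the Strong Guarantee in action, here is a reduced sketch. It is not the pimpl template above: deep_ptr is a hypothetical cut-down holder that uses copy-and-swap instead of std::auto_ptr (so it also builds as C++17), and Fragile is a made-up payload whose copy can be forced to throw.

```cpp
#include <cassert>
#include <stdexcept>
#include <utility>

// Hypothetical payload whose copy constructor can be made to throw once.
struct Fragile {
  int value;
  static bool fail_next_copy;
  explicit Fragile(int v) : value(v) {}
  Fragile(Fragile const& other) : value(other.value) {
    if (fail_next_copy) {
      fail_next_copy = false;
      throw std::runtime_error("copy failed");
    }
  }
};
bool Fragile::fail_next_copy = false;

// Simplified deep-copying holder obeying the Rule of Three.
template <typename T>
class deep_ptr {
public:
  explicit deep_ptr(T* t) : m(t) {}
  deep_ptr(deep_ptr const& rhs) : m(new T(*rhs.m)) {}
  deep_ptr& operator=(deep_ptr const& rhs) {
    deep_ptr tmp(rhs);      // may throw; *this is untouched so far
    std::swap(m, tmp.m);    // cannot throw
    return *this;           // tmp's destructor frees the old object
  }
  ~deep_ptr() { delete m; }
  T& operator*() { return *m; }
private:
  T* m;
};

void demo_strong_guarantee() {
  deep_ptr<Fragile> a(new Fragile(1));
  deep_ptr<Fragile> b(new Fragile(2));
  Fragile::fail_next_copy = true;
  bool threw = false;
  try { a = b; } catch (std::runtime_error const&) { threw = true; }
  assert(threw);
  assert((*a).value == 1);  // the failed assignment left a unchanged
  a = b;                    // succeeds now that the flag is cleared
  assert((*a).value == 2);
}
```

The order matters: the copy that can fail happens first, and only operations that cannot throw touch the object being assigned to.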

Building on this, we can now define Pimpl'ed classes somewhat easily:

class Foo
{
public:

private:
  struct Impl;
  pimpl<Impl> mImpl;
}; // class Foo

Note: the compiler cannot generate a correct default constructor, copy constructor, copy assignment operator or destructor here, because doing so would require access to the definition of Impl. Therefore, despite the pimpl helper, you will need to define those 4 manually in the .cpp file. However, thanks to the pimpl helper, the compilation will fail instead of dragging you into the land of undefined behavior.

3. Going Further

It should be noted that the presence of virtual functions is often seen as an implementation detail. One of the advantages of Pimpl is that we have the correct framework in place to leverage the power of the Strategy Pattern.

Doing so requires that the "copy" of pimpl be changed:

// pimpl.h
template <typename T>
pimpl<T>::pimpl(pimpl<T> const& rhs): m(rhs.m->clone()) {}

template <typename T>
pimpl<T>& pimpl<T>::operator=(pimpl<T> const& rhs)
{
  std::auto_ptr<T> tmp(rhs.m->clone()); // copy may throw: Strong Guarantee
  checked_delete(m);
  m = tmp.release();
  return *this;
}

And then we can define our Foo like so

// foo.h
#include "pimpl.h"

namespace detail { class FooBase; }

class Foo
{
public:
  enum Mode {
    Easy,
    Normal,
    Hard,
    God
  };

  Foo(Mode mode);

  // Others

private:
  pimpl<detail::FooBase> mImpl;
};

// Foo.cpp
#include "foo.h"

#include "detail/fooEasy.h"
#include "detail/fooNormal.h"
#include "detail/fooHard.h"
#include "detail/fooGod.h"

Foo::Foo(Mode m): mImpl(FooFactory::Get(m)) {}

Note that the ABI of Foo is completely unaffected by the various changes that may occur:

  • there is no virtual method in Foo
  • the size of mImpl is that of a simple pointer, whatever it points to

Therefore your client need not worry about a particular patch that would add either a method or an attribute, and you need not worry about the memory layout etc... it just naturally works.
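
A self-contained sketch of the same idea, with several assumptions: Foo holds a raw pointer and spells out its own special members instead of using the pimpl helper, only two modes are kept, and detail::make is a hypothetical stand-in for FooFactory::Get. The name() method exists only so the behaviour can be observed.

```cpp
#include <cassert>
#include <string>

// ---- foo.h: only a forward declaration of the base is visible ----
namespace detail { class FooBase; }

class Foo {
public:
  enum Mode { Easy, Hard };
  explicit Foo(Mode mode);
  Foo(Foo const& rhs);
  Foo& operator=(Foo const& rhs);
  ~Foo();
  std::string name() const;
private:
  detail::FooBase* mImpl;   // size of one pointer, whatever it points to
};

// ---- foo.cpp: the hierarchy is an implementation detail ----
namespace detail {
  class FooBase {
  public:
    virtual ~FooBase() {}
    virtual FooBase* clone() const = 0;   // polymorphic copy
    virtual std::string name() const = 0;
  };
  class FooEasy : public FooBase {
  public:
    FooBase* clone() const { return new FooEasy(*this); }
    std::string name() const { return "easy"; }
  };
  class FooHard : public FooBase {
  public:
    FooBase* clone() const { return new FooHard(*this); }
    std::string name() const { return "hard"; }
  };
  // Hypothetical factory, standing in for FooFactory::Get.
  FooBase* make(Foo::Mode mode) {
    if (mode == Foo::Easy) return new FooEasy;
    return new FooHard;
  }
}

Foo::Foo(Mode mode) : mImpl(detail::make(mode)) {}
Foo::Foo(Foo const& rhs) : mImpl(rhs.mImpl->clone()) {}
Foo& Foo::operator=(Foo const& rhs) {
  detail::FooBase* tmp = rhs.mImpl->clone();  // may throw; *this untouched so far
  delete mImpl;
  mImpl = tmp;
  return *this;
}
Foo::~Foo() { delete mImpl; }
std::string Foo::name() const { return mImpl->name(); }

void demo_strategy() {
  Foo hard(Foo::Hard);
  Foo copy(hard);                 // clone() preserves the dynamic type
  assert(copy.name() == "hard");
  Foo easy(Foo::Easy);
  easy = hard;
  assert(easy.name() == "hard");
}
```

Adding a third strategy class would only change the second half; the class definition a client compiles against stays the same.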

Matthieu M.
+1: for deriving the idiom from first principles
Chubsdad
+1 -- What kind of circumstance would you ever be in though where you'd want to be accessing `pimpl` objects, but would not want to be accessing `impl` objects? And in such cases, how is this any different than forward declaring `impl` and storing it in a smart pointer class?
Billy ONeal
@Billy: I don't fully understand the question... the main difference with a classic smart pointer is that `pimpl` implements Deep Copy semantics in a Strong Guarantee manner, thus freeing the user from this burden. Otherwise it's very much similar to a `scoped_ptr`.
Matthieu M.
Since your `pimpl<T>` class uses T heavily, it reintroduces the dependencies that p-impl is designed to break. I think that's what your comment about manually defining default constructor, copy constructor, assignment operator, and destructor is all about, but you need to remove the inlined versions.
Ben Voigt
@Ben: Actually I don't need to remove them. As you noted, the necessity to manually redefine the constructors et al. comes from the fact that they can only be instantiated within the .cpp file. I have a much more advanced design based on the `shared_ptr` trick for deleting an unknown type that I actually use in my own software to avoid this manual burden; but it does work as such, since not including the header prevents the actual instantiation of the methods.
Matthieu M.
I assume your "manual definitions" are actually specializations, otherwise you would have an ODR violation. But specializations have to be declared before use. So either your specializations are in the header where they can be seen by clients of Foo, or else the compiler will use the inline definition when compiling Foo clients because it is the best visible definition. You do need to remove the definitions and leave only declarations, then the compiler will generate calls which are resolved statically at link time.
Ben Voigt
It would also be worth adding some smart pointers such as `std::auto_ptr` to your first snippet to highlight that completed types ARE required for the use of smart pointers.
Ben Voigt
Specifically you have advocated putting a std::auto_ptr<T> in pimpl.h, this requires that a complete definition of T be visible to work correctly. If you moved this to a pimpl-internals.h which is included in foo.cpp and left only declarations in pimpl.h everything would be fine. Unfortunately you cannot rely on the compiler to generate errors, because deleting an incomplete type seems to produce a warning only on some compilers.
Ben Voigt
@Ben: no need for specializations. I was speaking of redefining the constructor for `Foo`. The compiler's auto-generated versions of the template'd code are fine, but require access to the definition of `T`, which is only available in the .cpp file; thus the definitions of the constructors et al. have to be located in the .cpp file so that the compiler will have the full definition of `T` when instantiating the methods. I rely on these instantiations being deferred to the .cpp file.
Matthieu M.
Problem is that one forgets to replace the compiler-generated inline destructor for Foo (it is missing from your example, for instance), and then the compiler generates it, instantiating `pimpl<FooBase>::~pimpl` while `FooBase` is incomplete, which invokes undefined behavior if `FooBase` has a destructor. But there's no compile error. See e.g. http://en.wikibooks.org/wiki/More_C%2B%2B_Idioms/Checked_delete
Ben Voigt
@Ben: thanks for that :) It generates a warning on `gcc 3.4.2` though, and I guess any good compiler would too. Anyway the real solution, as I said, is to use the `shared_ptr` idea of tucking a "deleter" alongside the object at construction time, thus forgoing the need to have the complete definition for copy construction / copy assignment / destruction, but that is a tad more advanced... even though that's the form I personally use :)
Matthieu M.
Oh hey, according to the C++0x FCD, the checked deleter can be written as `delete `... see page 93.
Ben Voigt
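
The `shared_ptr` deleter idea mentioned in the comments above can be sketched roughly as follows, assuming C++11's std::shared_ptr (boost::shared_ptr behaves the same way in this respect). Widget, answer and live_impls are hypothetical names, and this is not Matthieu M.'s actual design: unlike the deep-copying pimpl, copies of this Widget would share one Impl.

```cpp
#include <cassert>
#include <memory>

// ---- Widget.hpp ----
class Widget {
public:
  Widget();
  int answer() const;
  // No destructor declared: the implicit one is generated wherever a client destroys a Widget.
private:
  struct Impl;
  std::shared_ptr<Impl> mImpl;   // the deleter is captured when the pointer is constructed
};

// ---- client code, compiled while Widget::Impl is still incomplete ----
inline int use_and_destroy() {
  Widget w;
  return w.answer();
}                                // w is destroyed here through the stored deleter

// ---- Widget.cpp ----
static int live_impls = 0;       // instrumentation for this sketch only
struct Widget::Impl {
  Impl() { ++live_impls; }
  ~Impl() { --live_impls; }
  int value() const { return 42; }
};
Widget::Widget() : mImpl(new Impl) {}   // Impl is complete here, so the right deleter is stored
int Widget::answer() const { return mImpl->value(); }

void demo_widget() {
  assert(use_and_destroy() == 42);
  assert(live_impls == 0);       // Impl's destructor really ran
}
```

Because the deleter is bound in Widget.cpp, the client never needs the definition of Impl, and no hand-written destructor for Widget is required.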