views:

181

answers:

7

I've just come across a nice STL-like tree container class written by Kasper Peeters:

http://tree.phi-sci.com/

However, because it's STL-like, it's geared towards having a single class type in the tree; i.e. template <class T>. The trouble is, like STL-lists, if it suffers from the polymorphic class problem, in that the objects in the tree that are pointers to heap based objects (like pointers to a base class), aren't destroyed when nodes are deleted.

So, my options:

1: Use a tree of boost::shared_ptr, although this is more expensive/overkill than I'd like.

2: Write a little pointer wrapper like the one I've written below. The idea being that it wraps a pointer, which when it goes out of scope, deletes its pointer. It's not ref counted, it's just guarantees the pointer destruction.

template<class T>
class TWSafeDeletePtr
{
private:
    T *_ptr;
public:
    TWSafeDeletePtr() : _ptr(NULL) {}
    TWSafeDeletePtr(T *ptr) : _ptr(ptr) {}
    ~TWSafeDeletePtr()
    { 
        delete _ptr;
    }
    T &operator=(T *ptr)
    {
        assert(!_ptr);
        delete _ptr;
        _ptr=ptr;
        return *ptr;
    }
    void set(T *ptr)
    {
        *this=ptr;
    }

    T &operator*() const { return *_ptr; }
    T *operator->() const { return _ptr; }
};

3: Write my own allocator which allocates the node objects from a pool in the allocate() and deletes the pointed to memory in the deallocate().

4: Specialise the code to make a tree of pointers, avoiding the initial allocation and copy construction, plus innately knowing how to delete the pointed-to data.

I already have option 2 working, but I'm not really happy with it, because I have to actually insert an empty ptr to begin with, then set() the pointer when the insert returns an iterator. This is because the tree uses copy construction, and hence the temporary object passed on the stack will ultimate delete the pointer when it goes out of scope. So I set the pointer upon return. It works, it's hidden, but I don't like it.

Option 3 is looking like the best candidate, however I thought I'd ask if anyone else has already done this, or has any other suggestions?

Ok, I'ved decided to go with option 1 (tree of shared_ptrs), mainly because it's using standard libraries, but also because the extra refcount per node won't break the bank. Thanks for the replies everyone :)

Cheers,

Shane

+1  A: 

I don't like the allocator version, because allocators are supposed to allocate memory, not construct objects. So there's no guarantee that the number of requested allocations to the allocator matches the number of objects to be constructed; it would depend on the implementation whether you get away with it.

The tree calls the copy constructor on an inserted or appended value after the allocator has allocated the memory for it, so you would be hard pressed to write something which worked with polymorphic objects - alloc_.allocate doesn't know the runtime type of x before the constructor is called (lines 886 on):

tree_node* tmp = alloc_.allocate(1,0);
kp::constructor(&tmp->data, x);

Also looking at the code it doesn't seem to use assignment at all, and your wrapper only supplies the default copy constructor, so I can't see any of your suggested mechanisms working - when a node is assigned to the same value as it already holds with this code called (lines 1000 on):

template <class T, class tree_node_allocator>
template <class iter>
iter tree<T, tree_node_allocator>::replace(iter position, const T& x)
    {
    kp::destructor(&position.node->data);
    kp::constructor(&position.node->data, x);
    return position;
    }

your smart pointer would destruct their referent when their destructor is called here; you may get away with it by passing a reference counted pointer instead (so x doesn't destroy its referent until it goes out of scope, rather than the position.node->data destructor destroying it).

If you want to use this tree, then I would either use it as an index into data owned elsewhere rather than the tree owning the data, or stick with the shared_ptr approach.

Pete Kirkham
Storing indexes is an excellent idea, but you'd still have to index a vector of pointers to have polymorphism.
Ben Voigt
I've decided to stay with the shared_ptr approach. I decided the extra 4 bytes for a refcount isn't the end of the world :)
Shane
@Shane I don't mean to deter you from your current solution as it is a perfectly good one, but shared_ptr tends to be greater than 4 bytes. On a 32-bit system, it'd be at least 12 bytes and that's assuming that sp_counted_impl_p was allocated without any wasted allocation overhead (which can be as high as 16 bytes depending on the implementation of malloc).
@Shane If memory does become a concern, then consider a smart pointer wrapper of your own (doesn't have to be a general, template one), which deep copies elements or uses intrusive reference counting (in which case it can be a 4 byte overhead).
Actually if memory becomes a concern, I'll just use an object pool allocator instead :)
Shane
A: 

First of all, you could benefit from move semantics here. If you have access to C++0x.

Otherwise, the Boost Pointer Container library has solved the issue of the STL containers of pointers by... recoding it all.

Another issue with containers of pointers that you did not mention is the copy of the container. In your case the original container and its copy both point to the same objects, so changing one will not change the other.

You can of course alleviate this by writing a proxy class which wraps the pointer and provides deep copying semantic (clone method in the object wrapped). But you will then copy the data more often that if the container is pointer aware.... it's less work though.

/// Requirement: T is a model of Cloneable
template <class T>
class Proxy
{
  template <class> friend class Proxy;
public:
  // Constructor
  Proxy(): mPointer(0) {}
  explicit Proxy(T* t): mPointer(t) {}
  template <class U>
  explicit Proxy(std::auto_ptr<U> t): mPointer(t.release()) {}
  template <class U>
  explicit Proxy(std::unique_ptr<U> t): mPointer(t.release()) {}

  // Copy Constructor
  Proxy(Proxy const& rhs):
    mPointer(rhs.mPointer ? rhs.mPointer->clone() : 0) {}
  template <class U>
  Proxy(Proxy<U> const& rhs):
    mPointer(rhs.mPointer ? rhs.mPointer->clone() : 0) {}

  // Assignment Operator
  Proxy& operator=(Proxy const& rhs)
  {
    Proxy tmp(rhs);
    this->swap(tmp);
    return *this;
  }
  template <class U>
  Proxy& operator=(Proxy<U> const& rhs)
  {
    Proxy tmp(rhs);
    this->swap(tmp);
    return *this;
  }

  // Move Constructor
  Proxy(Proxy&& rhs): mPointer(rhs.release()) {}
  template <class U>
  Proxy(Proxy<U>&& rhs): mPointer(rhs.release()) {}

  // Move assignment
  Proxy& operator=(Proxy&& rhs)
  {
    Proxy tmp(rhs);
    this->swap(tmp);
    return *this;
  }
  template <class U>
  Proxy& operator=(Proxy&& rhs)
  {
    Proxy tmp(rhs);
    this->swap(tmp);
    return *this;
  }

  // Destructor
  ~Proxy() { delete mPointer; }

  void swap(Proxy& rhs)
  {
    T* tmp = rhs.mPointer;
    rhs.mPointer = mPointer;
    mPointer = tmp;
  }

  T& operator*() { return *mPointer; }
  T const& operator*() const { return *mPointer; }

  T* operator->() { return mPointer; }
  T const* operator->() const { return mPointer; }

  T* release() { T* tmp = mPointer; mPointer = 0; return tmp; }

private:
  T* mPointer;
};

// Free functions
template <class T>
void swap(Proxy<T>& lhs, Proxy<T>& rhs) { lhs.swap(rhs); }

Note that as well as providing deep-copying semantics, it provides deep-constness. This may or may not be to your taste.

It would also be good taste to provide equivalent to static_cast and dynamic_cast operations, this is left as an exercise to the reader ;)

Matthieu M.
A: 

1.

I already have option 1 working, but I'm not really happy with it, because I have to actually insert an empty ptr to begin with, then set() the pointer when the insert returns an iterator. This is because the tree uses copy construction, and hence the temporary object passed on the stack will ultimate delete the pointer when it goes out of scope. So I set the pointer upon return. It works, it's hidden, but I don't like it.

Until there is at least one copy of the same shared_ptr, pointed object won't be destroyed so there is no problem you writing about.

2.Your class is pointless. Use shared_ptr.

3.The allocator would have to know what kind of object to create when asked for a piece of bytes, this is not well solution.

4.Too much complications.

I suggest solution 1.

adf88
Like I said in my post, I don't want the overhead of a refcount per node, it's unnecessary. My class gets the same result without the refcount. Effectively it's class that simply deletes its pointer when it goes out of scope.
Shane
You assumed that once created node won't be copied internally by the tree class. With this assumption your TWSafeDeletePtr class would be correct, but in this case it's better to use std::auto_ptr which will work almost the same - additionally it'll eliminate the initialization problem you have. Without the assumption, reference counting is a must-have.
adf88
A: 

It seems that the cleanest solution would be a container adaptor in the style of Boost Pointer Container. This would smooth the syntax a lot as well. However writing such an adaptor is tedious and repetive as you would have to "lift" typedefs and repeat every member function of the class that is to be adapted.

pmr
+1  A: 

[Edit] Shane has chosen to go with the boost::shared_ptr solution and has pointed out that he needs to store polymorphic base pointers. Should memory/processing efficiency ever become a concern (after profiling, of course), consider a base pointer wrapper with safe copying behavior (e.g., deep-copying the pointee through a clone method) and the fast swap idiom shown in #5. This would be similar to suggested solution #2, but safe and without making assumptions on the data type being used (ex: trying to use auto_ptr with containers).

I think you should consider option #5.

1: Use a tree of boost::shared_ptr, although this is more expensive/overkill than I'd like.

First of all, do you realize that any linked structure like std::list, std::set, std::map requires a separate memory allocation/deallocation per node but doesn't require copying nodes to do things like rebalance the tree? The only time the reference counter will amount to any processing overhead is when you insert to the tree.

2: Write a little pointer wrapper like the one I've written below. The idea being that it wraps a pointer, which when it goes out of scope, deletes its pointer. It's not ref counted, it's just guarantees the pointer destruction.

For this tree you might be able to get away with it since you have the tree implementation, but it's a heavy assumption. Consider at least making the pointer wrapper non-copyable so that you'll get a compiler error if you ever try to use it for something which does copy node elements.

3: Write my own allocator which allocates the node objects from a pool in the allocate() and deletes the pointed to memory in the deallocate().

If it's an STL-compliant memory allocator, it should not be making such assumptions about the memory contents in deallocate. Nevertheless, writing a fast memory allocator which can assume fixed allocation sizes can really speed up any linked structure. Writing a fast memory allocator that consistently outperforms malloc/free is non-trivial work, however, and there are issues to consider like memory alignment.

4: Specialise the code to make a tree of pointers, avoiding the initial allocation and copy construction, plus innately knowing how to delete the pointed-to data.

Making a wrapper for the tree is probably the most robust solution. You'll have full control over when to insert and remove elements (nodes) and can do whatever you like in the mean time.

Option #5: just store the element directly in the tree and focus on making the element fast.

This is your best bet if you ask me. Instead of map<int, ExpensiveElement*> or map<int, shared_ptr<ExpensiveElement> >, consider simply map<int, ExpensiveElement>.

After all, you obviously want the tree to be the memory manager (deleting a node deletes the element). That happens when we avoid the indirection of a pointer to the element already.

However, your concern seems to be the overhead of the copy-in policy of insert (copy ctor overhead on ExpensiveElement). No problem! Just use operator[] instead of insert:

map<int, ExpensiveElement> my_map;

// default constructs ExpensiveElement 
// and calls ExpensiveElement::do_something().
// No copies of ExpensiveElement are made!
my_map[7].do_something();

Tada! No copying, no need to worry about proper destruction, and no memory allocation/deallocation overhead per element.

If default constructing ExpensiveElement won't suffice, then consider making default construction super cheap (practically free) and implement a swap method.

map<int, ExpensiveElement> my_map;

// construct an ExpensiveElement and swap it into the map
// this avoids redundant work and copying and can be very 
// efficient if the default constructor of ExpensiveElement
// is cheap to call
ExpensiveElement element(...);
my_map[7].swap(element);

To make the default construction super cheap and allow for a swap method, you could implement a fast pimpl on ExpensiveElement. You can make it so the default ctor doesn't even allocate the pimpl, making it a null pointer, while the swap method swaps the two pimpls of ExpensiveElement. Now you have super cheap default construction and a way to swap properly constructed ExpensiveElements into the map, avoiding the redundancy of deep copies all together.

What if ExpensiveElement cannot have a default ctor?

Then make a wrapper which does. The approach can be similar to the pointer wrapper you suggested, except it will be a complete class with valid (safe) copying behavior. The copy ctor can deep copy the pointee, for example, if reference counting is undesired. The difference may sound subtle, but this way it's a very safe and complete solution which doesn't make assumptions about how the tree is implemented; safe and general like boost::shared_ptr but without the reference counting. Just provide a swap method as your one and only means to shallow swap data without requiring a deep copy and use it to swap things into the tree.

What if we need to store polymorphic base pointers?

See answer immediately above and modify the wrapper to call something like clone (prototype pattern) to deep copy the pointee.

Option 5 doesn't work if `ExpensiveElement` cannot have a default constructor. Otherwise, this looks good.
Mike DeSimone
Option #5 isn't polymorphic.
Ben Voigt
@Ben true, but given what was suggested for option #3, I doubt the author needs polymorphism here.
@Ben and @Mike: I modified the post to address points regarding polymorphism and a lack of default construction.
I'm not convinced. If a default constructor were so easy to provide that a wrapper could do it, then `ExpensiveElement` probably already has one. Therefore, the wrapper constructor doesn't construct the object it wraps, but waits for later (lazy construction?), when the parameters become available. This means you'll have to add a `create_actual_object` method, and either a `has_been_created` method, or have it throw a `DoesNotExist` exception, to the wrapper. At this point, what's the difference between that and #1?
Mike DeSimone
@Mike #1 isn't safe. Consider what happens when we copy TWSafeDeletePtr.
Such a solution would be tightly coupled with a particular tree representation. If used with a solution like #4, then it might be acceptable as an implementation detail. Unfortunately it is difficult to answer this question thoroughly given the lack of details provided.
BTW, I'm not proposing lazy construction either. Consider boost::shared_ptr which stores a null pointer in the default ctor. It doesn't use lazy construction, but supports swap idioms and has safe copying behavior. Calling the parameterized ctor on it causes it to store a valid pointer (not null), and one can swap a null shared_ptr with a non-null one. That's all I'm suggesting really, but it's possible we don't even need a smart pointer-like wrapper at all depending on what ExpensiveElement is in this particular case.
The important point here is that I need a tree of pointers precisely because a single template T isn't polymorphic. I have a tree of pointers to base classes.In any case, I've bitten the bullet and used option 1. Thanks for your reply anyway :)
Shane
A: 

It looks like option 1 is probably the best. shared_ptr is very common and most people should know how it works. Is there a problem with the syntax map_obj[key].reset(new ValueType);?

Unless you have measurement data that your wrapper for option 2 is a significant savings in use compared to shared_ptr, shared_ptr seems safer since people will know about its semantics right away.

Option three seems complex for what it provides. You need to implement the allocate/construct and deallocate/destruct pairs, and make sure that if a node is copied around it will not have deletion problems.

Option four is probably the second-best option. I wouldn't suggest using it unless profiling shows that the shared_ptr operations really are that expensive though, since this requires reinventing code that's already been written and debugged in the standard library.

Mark B
A: 

I'ved decided to go with option 1 (tree of shared_ptrs), mainly because it's using standard libraries, but also because the extra refcount per node won't break the bank. Thanks for the replies everyone :)

Shane