views:

129

answers:

3

I want to derive from std::back_insert_iterator to create a kind of filter for a string type, say back_xml_insert_iterator, that will examine the characters passed through it looking for characters that can not be emitted "naked" into an XML stream, e.g., '"', '&', '<', '>', and '\'', and will on-the-fly insert their character entity references instead, e.g., "&#34;" for '"'.

template<class StringType>
class back_xml_insert_iterator : public std::back_insert_iterator<StringType> {
  typedef std::back_insert_iterator<StringType> base_type;
public:
  typedef typename base_type::container_type container_type;
  typedef typename StringType::value_type value_type;

  explicit back_xml_insert_iterator( StringType &s ) : base_type( s ) { }

  back_xml_insert_iterator& operator=( value_type c ) {
    switch ( c ) {
      case '"':
      case '&':
      case '\'':
      case '<':
      case '>':
        char buf[10];
        this->container->append( "&#" );
        this->container->append( itoa( c, buf ) );
        this->container->push_back( ';' );
        break;
      default:
        this->container->push_back( c );
    }
    return *this;
  }
};

This compiles fine. When I create an instance, I confirmed that the constructor is called, but my operator=() is never called. I think it's because the inherited operator*() returns a back_insert_iterator& and not a back_xml_insert_iterator& so back_insert_iterator::operator=() is called rather than mine (since operator=() is not, nor can not be, virtual).

If that's the case, then it seems impossible to derive from back_insert_iterator in a useful way.

If I instead create my own back_insert_iterator_base class like:

template<class ContainerType,class DerivedType>
class back_insert_iterator_base :
  public std::iterator<std::output_iterator_tag,void,void,void,void> {
public:
  typedef ContainerType container_type;

  DerivedType& operator*() {
    return *static_cast<DerivedType*>( this );
  } 

  DerivedType& operator++() {
    return *static_cast<DerivedType*>( this );
  }

  DerivedType& operator++(int) {
    return *static_cast<DerivedType*>( this );
  }

protected:
  back_insert_iterator_base( ContainerType &c ) : container( &c ) {
  }

  ContainerType *container;
};

and derive from that instead like:

template<class StringType>
class back_xml_insert_iterator :
  public back_insert_iterator_base< StringType, back_xml_insert_iterator<StringType> > {
  // ... as before ...
};

then back_xml_insert_iterator works as desired. So is it possible to derive from std::back_insert_iterator and have it work as desired?

Update

Here's how I'd like to use the back_xml_insert_iterator. First, there would be the helper function:

template<class StringType> inline
back_xml_insert_iterator<StringType> back_xml_inserter( StringType &s ) {
  return back_xml_insert_iterator<StringType>( s );
}

Then writing a to_xml() function would be trivial:

template<class InputStringType,class OutputStringType> inline
void to_xml( InputStringType const &in, OutputStringType *out ) {
  std::copy( in.begin(), in.end(), back_xml_inserter( *out ) );
}
+1  A: 

You should not try to inherit from classes that were not designed with inheritance in mind. In general you should not inherit from standard containers or iterator. Note that in most cases the usage of iterators is in templates that do not require a type hierarchy in the first place.

It is usually better to add the functionality externally either by providing functors or else implementing your own iterator --which should not be really complex. You can take a look at the iterator adaptor library in boost if you really want to implement your own iterators.

David Rodríguez - dribeas
Then why does std::back_insert_iterator have a protected section if you're not supposed to derive from it? That's the confusing bit.
Paul J. Lucas
@Paul: you are right, it does have a protected section, but that is not in itself a reason to derive from it. You should derive from a class only *if it is the simplest, cleanest and most maintainable* way to extend it. Not just because you *can*. The class doesn't have a virtual destructor, which is usually a strong hint that users should *not* derive from it.
jalf
@jalf: Virtual functions do NOT typically occur in iterator classes. I don't think that's an indicator of anything.
Potatoswatter
@Potatoswatter: It is an indicator that they are not meant to be used polymorphically. Maybe it would be better if the discussion is worded in terms of inheriting and expecting dynamic behavior. A class can be designed to be used as a base class with non-polymorphic behavior -- consider trait to be the biggest example there. But that won't allow you to use it polymorphically as it won't be possible.
David Rodríguez - dribeas
@Paul J. Lucas: As I mentioned in the previous comment, I should have stated the issue better. Trying to be generic in the answer --avoid inheriting from classes not designed for it--, I missed the detail. A class in C++ can be designed to be inherited and yet not used polymorphically. There are few examples, but I guess this is one of them. The `protected` block is an indicator that the designer considered the class could be derived. The lack of virtual methods on the other hand, is a clear indicator that no polymorphic behavior is to be expected.
David Rodríguez - dribeas
Certainly, no polymorphism. But iterators are typically used with template idioms. The only `protected` member of `back_insert_iterator` is `Container *container;`. If I wanted to distribute inserted elements among several containers, say round robin, it would seem necessary to use inheritance. Private inheritance would prevent any polymorphism confusion.
Potatoswatter
A: 

You can derive from back_insert_iterator, but you have to override the operators to return your derived type.

Does that render the strategy pointless? It depends on your coding style. The overrides will simply call the base method and return this, so you end up with a run of copy-paste code, much like what you put in back_insert_iterator_base.

template<class StringType>
class back_xml_insert_iterator : public std::back_insert_iterator<StringType> {
  typedef std::back_insert_iterator<StringType> base_type;
public:
  typedef typename base_type::container_type container_type;
  typedef typename StringType::value_type value_type;

    // add trivial overrides to avoid returning base type:
  back_xml_insert_iterator &operator++() { base::operator++(); return this; }
  back_xml_insert_iterator operator++(int) {
      back_xml_insert_iterator ret( *this );
      base::operator++();
      return ret;
  }

  explicit back_xml_insert_iterator( StringType &s ) : base_type( s ) { }

  ...

Of course, a similar block of code could implement passing operator calls to a member back_insert_iterator, which would eliminate the inheritance as others recommend. This is possibly a better approach.

Note that private inheritance, with : private std::back_insert_iterator, would add safety by preventing users from attempting to use your class as something it's not.

Potatoswatter
Yes, it's rather pointless to override all the members. The point was hopefully to get code reuse. Calling the base operator++() is also pointless since all they do it return *this.
Paul J. Lucas
A: 

I could get this to work with the call to operator= successfully. Did I get it right?

// back_insert_iterator example

#include <iostream>
#include <iterator>
#include <vector>
using namespace std;

template<class StringType> 
class back_xml_insert_iterator : public std::back_insert_iterator<StringType> { 
   typedef std::back_insert_iterator<StringType> base_type; 
public: 
   typedef typename base_type::container_type container_type; 
   typedef typename StringType::value_type value_type; 

   explicit back_xml_insert_iterator( StringType &s ) : base_type( s ) { } 

   back_xml_insert_iterator& operator=( value_type c ) { 
      switch ( c ) { 
      case '"': 
      case '&': 
      case '\'': 
      case '<': 
      case '>': 
         break; 
      default: 
         this->container->push_back( c ); 
      } 
      return *this; 
   } 
}; 

int main () {
   vector<int> firstvector, secondvector;
   for (int i=1; i<=5; i++)
   { firstvector.push_back(i); secondvector.push_back(i*10); }

   back_xml_insert_iterator< vector<int> > back_it (firstvector);

   back_it = 2;   // It does call operator=

   copy (secondvector.begin(),secondvector.end(),back_it);

   vector<int>::iterator it;
   for ( it = firstvector.begin(); it!= firstvector.end(); ++it )
      cout << *it << " ";
   cout << endl;

   return 0;
}
Chubsdad
"Did I get it right?" - You are missing that no library is going to assign a value to an iterator without dereferencing it. No code does `back_it = 2;`; all code does `*back_it = 2;`. If the latter doesn't work, you don't have an iterator.
UncleBens
@Uncle: This peculiar "extension" is part of the specification of inserters. See §24.4.2/3.
Potatoswatter