I've always wanted a bit more functionality in the STL's string. Since subclassing STL types is a no-no, the recommended way to extend these classes, as far as I've seen, is to write free functions (not member functions) that take the type as the first argument.

I've never been thrilled with this solution. For one, it's not necessarily obvious where all such methods are in the code, for another, I just don't like the syntax. I want to use . when I call methods!

A while ago I came up with the following:

class StringBox
{
public:
   StringBox( std::string& storage ) :
       _storage( storage )
   {
   }

   // Methods I wish std::string had...
   void Format( const char* fmt, ... );
   void Split();
   double ToDouble(); 
   void Join(); // etc...

private:
  StringBox();

  std::string& _storage;
};

Note that StringBox requires a reference to a std::string for construction... This puts some interesting limits on its use (and, I hope, means it doesn't contribute to the string class proliferation problem)... In my own code, I'm almost always just declaring it on the stack in a method, just to modify a std::string.

A use example might look like this:

string OperateOnString( float num, string a, string b )
{
    string nameS;
    StringBox name( nameS );

    name.Format( "%f-%s-%s", num, a.c_str(), b.c_str() );

    return nameS;
}

My question is: What do the C++ gurus of the StackOverflow community think of this method of STL extension?

A: 

If you want to extend the methods available to act on string, I would extend it by creating a class that has static methods that take the standard string as a parameter. That way, people are free to use your utilities, but don't need to change the signatures of their functions to take a new class.

This breaks the object-oriented model a little, but makes the code much more robust - i.e. if you change your string class, then it doesn't have as much impact on other code.

Follow the recommended guidelines, they are there for a reason :)

Larry Watanabe
Why use a class with static methods instead of a namespace with free functions?
Matthieu M.
How is a class with only static methods different from a namespace?
Potatoswatter
@Potatoswatter: I can do `using namespace N` to import the contents of a namespace, but I can't do that with a static class. Static classes can be used as template parameters, namespaces can't. There are plenty of differences.
jalf
Of course, but no advantages relevant here.
Potatoswatter
+1  A: 

If the scope of the string isn't the same as the StringBox you can get segfaults:

StringBox foo() {
  string s("abc");
  return StringBox(s);  // s is destroyed on return, so the box's reference dangles
}

At least prevent object copying by declaring the assignment operator and copy ctor private:

class StringBox {
  //...
  private:
    void operator=(const StringBox&);
    StringBox(const StringBox&);
};

EDIT: regarding the API, in order to prevent surprises I would make the StringBox own its copy of the string. I can think of two ways to do this:

  1. Copy the string to a member (not a reference), get the result later - also as a copy
  2. Access your string through a reference-counting smart pointer like std::tr1::shared_ptr or boost::shared_ptr, to prevent extra copying
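
A minimal sketch of option 1, assuming the box simply keeps its own std::string. The names OwningStringBox, Append and Str are made up for illustration and do not come from the question; Append only stands in for whatever methods (Format, Split, ...) the real class would have.

```cpp
#include <string>

// Sketch of option 1: the box owns its own copy of the string, so it can
// never refer to a std::string that has already been destroyed.
class OwningStringBox
{
public:
    explicit OwningStringBox(const std::string& initial) : storage_(initial) {}

    // Hypothetical example method standing in for Format(), Split(), etc.
    void Append(const std::string& suffix) { storage_ += suffix; }

    // The result is handed back as a copy, so it outlives the box.
    std::string Str() const { return storage_; }

private:
    std::string storage_;
};
```

The price is one copy going in and one coming out, which is the extra copying that option 2 tries to avoid.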
orip
Yeah, in the actual implementation I did this.
dicroce
In the above comment, I was referring to declaring the operator = and copy ctor private, not the bit about owning the string. I explicitly DID NOT want that...
dicroce
+14  A: 

As most of us "gurus" seem to favour the use of free functions, probably contained in a namespace, I think it safe to say that your solution will not be popular. I'm afraid I can't see one single advantage it has, and the fact that the class contains a reference is an invitation to that becoming a dangling reference.

anon
+1 for free functions. Most of the `std::string`'s member functions should have been free functions in the first place.
avakar
-1 for free functions, lern 2 OOP moar
BlueRaja - Danny Pflughoeft
@BlueRaja: What's not OOP about it? Surely the most OOP solution is the one that gives us the highest degree of encapsulation, code reuse and extensibility. Which is to extend objects with non-member functions. They don't have access to private members of the class, so they can't break anything. The class is *more* encapsulated than if you were to rip it open and add more member methods. Extensibility is improved because I can add functionality without modifying the class. Reuse is improved because a nonmember function can be used on many types of classes.
jalf
Having "extension methods" for types you don't control can be nice. This approach raises controversy (happened in C#, which has this built in) but only because there are points for and against. I think the poster is trying to model that in C++, and that exploring the syntactic options is interesting.
orip
In the actual implementation I declare assignment operator and copy ctor private. This largely prevents scenarios that could result in dangling references.
dicroce
+ for free functions .. live free or die! :)
Larry Watanabe
@dicroce: `std::vector<StringBox*>` >> largely but not completely, I would also prevent the use of all forms of `new` on this class by making the various forms of `operator new` private, though it would still not completely circumvent the problem. Your solution is just unsafe, sorry.
Matthieu M.
@orip: What do extension methods offer that you don't already have in C++ with nonmember functions? They're just syntactic sugar to avoid upsetting the "Java is OOP incarnate" gang who think OOP is synonymous with using dots for function calls.
jalf
@jalf: syntactic sugar is important, and helps convey meaning in code. The question is what sugar to have in a language, what it adds and what it detracts. C++ is full of its own "just syntactic sugar" that C# lacks, for example.
orip
Matthieu: You can't prevent new, e.g. `::new StringBox(s)`; http://codepad.org/8TbRUn64.
Roger Pate
+19  A: 

I've never been thrilled with this solution. For one, it's not necessarily obvious where all such methods are in the code, for another, I just don't like the syntax. I want to use . when I call methods!

And I want to use $!---& when I call methods! Deal with it. If you're going to write C++ code, stick to C++ conventions. And a very important C++ convention is to prefer non-member functions when possible.

There is a reason C++ gurus recommend this:

It improves encapsulation, extensibility and reuse. (std::sort can work with all iterator pairs because it isn't a member of any single iterator or container class. And no matter how you extend std::string, you can not break it, as long as you stick to non-member functions. And even if you don't have access to, or aren't allowed to modify, the source code for a class, you can still extend it by defining nonmember functions)
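
To make the std::sort point concrete, here is a small sketch. SortedCopy is a hypothetical helper, not something from the standard library; it only assumes the container has begin()/end() returning random-access iterators.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical helper: returns a sorted copy of any container whose
// begin()/end() yield random-access iterators. It works because std::sort
// is a non-member function that only sees an iterator pair.
template <typename Container>
Container SortedCopy(Container c)
{
    std::sort(c.begin(), c.end());
    return c;
}
```

The same function accepts a std::vector<int> and a std::string, and neither class had to be modified or subclassed for that.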

Personally, I can't see the point in your code. Isn't this a lot simpler, more readable and shorter?

string OperateOnString( float num, string a, string b )
{
    string nameS;
    Format(nameS, "%f-%s-%s", num, a.c_str(), b.c_str() );
    return nameS;
}

// or even better, if `Format` is made to return the string it creates, instead of taking it as a parameter
string OperateOnString( float num, string a, string b )
{
    return Format("%f-%s-%s", num, a.c_str(), b.c_str() );
}
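
For reference, a `Format` that returns the string could be sketched roughly like this. This is only one possible implementation, assuming C++11 (for va_copy and std::vsnprintf); it is not the asker's code, and like printf it is not type-safe.

```cpp
#include <cstdarg>
#include <cstddef>
#include <cstdio>
#include <string>
#include <vector>

// Hypothetical free function: printf-style formatting into a std::string.
// As with printf, the arguments must match the format string.
std::string Format(const char* fmt, ...)
{
    va_list args;
    va_start(args, fmt);

    // First pass: ask vsnprintf how many characters are needed.
    va_list args_copy;
    va_copy(args_copy, args);
    const int needed = std::vsnprintf(nullptr, 0, fmt, args_copy);
    va_end(args_copy);

    std::string result;
    if (needed > 0) {
        // Second pass: format into a buffer with room for the terminator.
        std::vector<char> buffer(static_cast<std::size_t>(needed) + 1);
        std::vsnprintf(&buffer[0], buffer.size(), fmt, args);
        result.assign(&buffer[0], static_cast<std::size_t>(needed));
    }
    va_end(args);
    return result;
}
```

If vsnprintf reports an error (a negative count), this sketch just returns an empty string; real code might prefer to throw.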

When in Rome, do as the Romans, as the saying goes. Especially when the Romans have good reasons to do as they do. And especially when your own way of doing it doesn't actually have a single advantage. It is more error-prone, confusing to people reading your code, non-idiomatic and it is just more lines of code to do the same thing.

As for your problem that it's hard to find the non-member functions that extend string, place them in a namespace if that's a concern. That's what they're for. Create a namespace StringUtil or something, and put them there.
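
Something along these lines, as a rough sketch. The function names mirror the ones in the question, but the signatures and behaviour (keeping empty fields, the fallback value) are my own assumptions.

```cpp
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

namespace StringUtil {

// Splits `text` on every occurrence of `delimiter`; empty fields are kept.
inline std::vector<std::string> Split(const std::string& text, char delimiter)
{
    std::vector<std::string> parts;
    std::string::size_type start = 0;
    for (;;) {
        const std::string::size_type end = text.find(delimiter, start);
        if (end == std::string::npos) {
            parts.push_back(text.substr(start));
            return parts;
        }
        parts.push_back(text.substr(start, end - start));
        start = end + 1;
    }
}

// Concatenates `parts`, placing `separator` between neighbouring elements.
inline std::string Join(const std::vector<std::string>& parts,
                        const std::string& separator)
{
    std::string result;
    for (std::size_t i = 0; i < parts.size(); ++i) {
        if (i != 0) result += separator;
        result += parts[i];
    }
    return result;
}

// Parses a leading double from `text`; returns `fallback` if none can be read.
inline double ToDouble(const std::string& text, double fallback = 0.0)
{
    std::istringstream stream(text);
    double value = 0.0;
    return (stream >> value) ? value : fallback;
}

} // namespace StringUtil
```

Callers write `StringUtil::Split(line, ',')`, so anyone grepping for `StringUtil::` finds every extension in one place.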

jalf
A: 

The best way is to use templated free functions. The next best is private inheritance, `struct extended_str : private string`, which happens to get easier in C++0x, by the way, as you can inherit the constructors with a using-declaration. Even so, private inheritance is too much trouble and too risky just to add some algorithms. What you are doing is too risky for anything.
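
For the curious, the private-inheritance version might look roughly like this in C++11 or later. The starts_with member is a made-up example of an added algorithm, and which base members get re-exported is an arbitrary choice for the sketch.

```cpp
#include <string>

// Sketch of the private-inheritance alternative (C++11 or later).
struct extended_str : private std::string
{
    using std::string::string;   // inherit std::string's constructors
    using std::string::size;     // re-export only the members you want public
    using std::string::c_str;

    // Hypothetical extra member: true if the string begins with `prefix`.
    bool starts_with(const std::string& prefix) const
    {
        return compare(0, prefix.size(), prefix) == 0;
    }
};
```

Note that every std::string member you still want has to be re-exported by hand, and an extended_str can no longer be passed to code that expects a std::string, which is a large part of the trouble.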

You've just introduced a nontrivial data structure to accomplish a change in code punctuation. You have to manually create and destroy a Box for each string, and you still need to distinguish your methods from the native ones. You will quickly get tired of this convention.

Potatoswatter
+2  A: 

I'll add a little something that hasn't already been posted. The Boost String Algorithms library has taken the free template function approach, and the string algorithms they provide are spectacularly re-usable for anything that looks like a string: std::string, char*, std::vector, iterator pairs... you name it! And they put them all neatly in the boost::algorithm namespace (I often use the alias `namespace algo = boost::algorithm` to make string manipulation code more terse).

So consider using free template functions for your string extensions, and look at Boost String Algorithms on how to make them "universal".
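
As a rough illustration of the idea using only the standard library (this is not Boost's implementation, and ToUpper/ToUpperCopy are names I made up), a free template function can serve every string-like type at once:

```cpp
#include <cctype>
#include <string>
#include <vector>

// Hypothetical helper: upper-cases the characters in [first, last) in place.
// Works with any iterator pair over char, including plain pointers.
template <typename Iterator>
void ToUpper(Iterator first, Iterator last)
{
    for (; first != last; ++first) {
        // Cast through unsigned char: std::toupper is undefined for
        // negative values other than EOF.
        *first = static_cast<char>(
            std::toupper(static_cast<unsigned char>(*first)));
    }
}

// Convenience wrapper for any container of char with begin()/end().
template <typename Sequence>
Sequence ToUpperCopy(Sequence text)
{
    ToUpper(text.begin(), text.end());
    return text;
}
```

The same two functions cover std::string, std::vector<char> and raw char buffers, which a member function on a wrapper class could not do.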

For safe printf-style formatting, check out Boost.Format. It can output to strings and streams.

I too wanted everything to be a member function, but I'm now starting to see the light. UML and doxygen are always pressuring me to put functions inside of classes, because I was brainwashed by the idea that C++ API == class hierarchy.

Emile Cormier