From the documentation of the StringPiece class in Chromium's source code:

// A string-like object that points to a sized piece of memory.
//
// Functions or methods may use const StringPiece& parameters to accept either
// a "const char*" or a "string" value that will be implicitly converted to
// a StringPiece.  
//
// Systematic usage of StringPiece is encouraged as it will reduce unnecessary
// conversions from "const char*" to "string" and back again.

Example of use:

void foo(StringPiece const & str) // Pass by ref. is probably not needed
{
   // str has same interface of const std::string
}

int main()
{
    std::string bar("bar");
    foo(bar); // OK, no mem. alloc.

    // No mem. alloc. either; there would be one if the arg. of "foo" was std::string
    foo("baz");
}
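
To make the mechanism concrete, here is a minimal sketch of a view class of this kind. It is not Chromium's actual StringPiece: the name MiniPiece, its members, and the helper count_char are hypothetical, chosen only to show how two implicit constructors let one parameter type accept both a const char* and a std::string without copying the characters.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical minimal view type; NOT Chromium's actual StringPiece.
class MiniPiece {
public:
    // Implicit on purpose, so callers can pass either kind of string.
    MiniPiece(const char* s) : data_(s), size_(std::strlen(s)) {}
    MiniPiece(const std::string& s) : data_(s.data()), size_(s.size()) {}

    const char* data() const { return data_; }
    std::size_t size() const { return size_; }
    char operator[](std::size_t i) const { return data_[i]; }

    // Copies the viewed bytes into an owning std::string (this may allocate).
    std::string as_string() const { return std::string(data_, size_); }

private:
    const char* data_;  // not owned; the viewed buffer must outlive this object
    std::size_t size_;
};

// Takes the view by value: it is only a pointer and a length.
std::size_t count_char(MiniPiece text, char c) {
    std::size_t n = 0;
    for (std::size_t i = 0; i < text.size(); ++i) {
        if (text[i] == c) ++n;
    }
    return n;
}
```

Calling count_char("banana", 'a') or count_char(some_std_string, 'a') compiles unchanged at the call site, and in both cases the view only records a pointer and a length.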

This seems like such an important and obvious optimization that I can't understand why it's not more widespread, and why a class similar to StringPiece is not already in the standard.

Are there any reasons why I shouldn't replace the use of string and char* parameters in my own code with this class? Is there anything like it already in the C++ standard libraries?

UPDATE. I've learnt that LLVM's source uses a similar concept: the StringRef class.

+2  A: 

Personally, I couldn't use something like that, because supporting Unicode on Windows requires wide character strings (wchar_t and std::wstring), which that library doesn't seem to support.

Additionally, for most cases, the construction cost of a std::basic_string is negligible in the overall runtime cost of a program.

EDIT: That also doesn't appear to allow modification of the underlying object -- i.e. the std::basic_string::append() method is missing.

Billy ONeal
Fair point. I've modified the code to say: "has same interface of CONST std::string"
Manuel
@Billy: How hard would it be to templatize that code so that you could use it with char* and wchar_t*? In fact, I'd rename it to const_string. The only thing I would add to the implementation would be to make operator new and operator delete private so you can't allocate one of these things.
jmucchiello
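
A rough sketch of what such a templatized version might look like, assuming a view that stores only a pointer and a length. The names BasicPiece, Piece, WPiece, narrow_length and wide_length are hypothetical and are not part of Chromium's code; std::char_traits<CharT>::length is the standard way to measure a null-terminated string of any character type.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical character-type-generic view; all names are illustrative only.
template <typename CharT>
class BasicPiece {
public:
    // Implicit conversions from a null-terminated array or a basic_string.
    BasicPiece(const CharT* s)
        : data_(s), size_(std::char_traits<CharT>::length(s)) {}
    BasicPiece(const std::basic_string<CharT>& s)
        : data_(s.data()), size_(s.size()) {}

    const CharT* data() const { return data_; }
    std::size_t size() const { return size_; }

private:
    const CharT* data_;  // not owned
    std::size_t size_;
};

typedef BasicPiece<char> Piece;
typedef BasicPiece<wchar_t> WPiece;

// Each function accepts both the raw-pointer and the basic_string form.
std::size_t narrow_length(Piece text) { return text.size(); }
std::size_t wide_length(WPiece text) { return text.size(); }
```

With this shape, wide_length(L"wide") and wide_length(some_wstring) both work, just as the narrow versions do for const char* and std::string.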
@jmucchiello: See my comment to Mark Ransom's answer: it might also make sense to make the copy ctor. and op.= private.
Manuel
@Manuel: So it has the same interface as `const std::string`; is that really worth introducing another class for?
David Thornley
@David Thornley - It's not about interface but about efficiency. If I have a function that takes a constant string, and I want to pass it some text that I have in a char*, why does that have to imply a call to new? StringPiece solves that in a manner that is transparent to the caller.
Manuel
Would an append that returns a new std::basic_string object make sense?
jmucchiello
@jmucchiello - I think by definition "append" modifies the object it is invoked on, so it makes no sense for an immutable class like StringPiece
Manuel
The reason for `std::basic_string::append` is to modify the underlying string object in linear time. s1.append(s2).append(s3) takes linear time with respect to s2 and s3, while s1 = s1+s2+s3 builds a temporary string for each +, recopying everything accumulated so far, so a long chain of concatenations can take quadratic time. Making append return a new object violates append's semantics.
Billy ONeal
+6  A: 

The standard is trying to move away from const char* in favor of string altogether, so adding more options for conversion is useless.

Also note that a well-formed program should use either string or const char* all around ;).

Kornel Kisielewicz
@Kornel Kisielewicz: But how is it useless if it saves a memory allocation, which can be expensive if the string is long? I thought C++ was about "don't pay for what you don't use".
Manuel
@Manuel - it violates the DRY principle, and has all the pitfalls that come from that
Kornel Kisielewicz
@Kornel Kisielewicz - How is this related to DRY?
Manuel
+1  A: 

StringPiece keeps a pointer to the string data that is passed to its constructor. Thus it relies on that string outliving the StringPiece. This is guaranteed if you are using StringPiece as a function parameter; not so if you're keeping a copy of the StringPiece in a class member, for example.

It's not quite as universal as std::string, and it's definitely not a replacement for it.
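
The lifetime point can be illustrated with a small sketch. The View struct below is a hypothetical stand-in for StringPiece (just a pointer and a length), and KeepsView, KeepsCopy, length_of and make_copy are names invented for this example. The dangling case is left commented out because actually running it would be undefined behaviour.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical non-owning view (pointer + length), standing in for StringPiece.
struct View {
    const char* data;
    std::size_t size;
    View(const std::string& s) : data(s.data()), size(s.size()) {}
};

// Safe: the argument outlives the call, so the view is valid throughout.
std::size_t length_of(View v) { return v.size; }

// Risky: the member keeps pointing at the caller's buffer after the
// constructor returns, so the caller's string must outlive this object.
struct KeepsView {
    View name;
    explicit KeepsView(View v) : name(v) {}
};

// Safe: copies the bytes into an owning std::string.
struct KeepsCopy {
    std::string name;
    explicit KeepsCopy(View v) : name(v.data, v.size) {}
};

KeepsCopy make_copy() {
    std::string temp("temporary");
    return KeepsCopy(temp);  // fine: the returned object owns its own copy
}

// KeepsView make_view() {
//     std::string temp("temporary");
//     return KeepsView(temp);  // would dangle: temp is destroyed on return
// }
```

Storing the view is only safe while the original string is still alive; when the data has to outlive the call, copying into a std::string is the conservative choice.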

Mark Ransom
Yes, you should never keep a copy of it. I think this can be trivially enforced by making its copy ctor. and op= private. As for not being a universal replacement for std::string, I agree. Both are intended to work together. std::string => generic use; StringPiece => only as a read-only function parameter
Manuel
I would disallow assignment only because I would require that all methods on the class be declared const, reinforcing that this is a const object. I suppose the practice of declaring copy ctor and assignment in pairs is the only reason not to allow a copy ctor.
jmucchiello
@jmucchiello - And also to prevent people from storing their own copy of the string and keeping it alive longer than the function. I don't think you can prevent that unless you disable the copy ctor.
Manuel
+1  A: 

Because why bother? With copy elision and/or pass by reference, memory allocations for std::string can usually be avoided as well.

The string situation in C++ is confusing enough as it is, without adding still more string classes.

If the language were to be redesigned from scratch, or if backwards compatibility weren't an issue, then this is one of many possible improvements that could be made to string handling in C++. But now that we're stuck with both char* and std::string, adding a stringref-style class into the mix would cause a lot of confusion, with limited benefit.

Apart from this, isn't the same effect achieved more idiomatically with a pair of iterators? If I want to pass a sequence of characters, whether they belong to a string or a char*, why shouldn't I just use a pair of iterators to delimit them?
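
As a rough sketch of the iterator-pair approach, the function template below accepts any half-open range of characters, whether it comes from a std::string or from a char*. The names count_char and demo are made up for this example; in real code std::count from <algorithm> would do the same job.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

// Counts occurrences of c in the half-open range [first, last).
template <typename Iter>
std::size_t count_char(Iter first, Iter last, char c) {
    std::size_t n = 0;
    for (; first != last; ++first) {
        if (*first == c) ++n;
    }
    return n;
}

std::size_t demo() {
    std::string owned("banana");
    const char* raw = "bandana";
    // Neither call copies characters or constructs a new std::string.
    return count_char(owned.begin(), owned.end(), 'a')
         + count_char(raw, raw + std::strlen(raw), 'a');
}
```

The trade-off is visible at the call site: the function itself is fully generic, but every caller has to spell out both ends of the range.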

jalf
But if you go the "two iterators" route, then you have to invent a class analogous to StringPiece that gives you a convenient string-like interface. And the first line of your functions would always be something like "StringRange str(str_beg, str_end);". Accepted anyway for providing the best assessment so far.
Manuel
True about the convenient string-like interface, but again, remember that iterators are already idiomatic in C++. I don't see why your functions would need to create a stringrange from the iterators. Instead, the entire string manipulation interface should have been provided as free functions on iterators to begin with. Boost's StringAlgo library fixes much of this.
jalf
@jalf - I don't think you would really like what you are suggesting. Imagine something as simple as `strA+strB+strC` in terms of iterators.
Manuel
I never said it was an ideal solution. ;) But then a better solution would be to add a *general* Range object, which could be used not just for strings, but for any type of ranges.
jalf
@jalf - See Boost.Range lib. and its future replacement, Boost.RangeEx. I dream one day they'll be in the standard.
Manuel