ansaurus

Question

Why isn't string assignment optimised when the length is known to the compiler?

Answer 1

+4 A:

The first can't be optimised into the second. In the first, the constructor is not told the length of the string, so it has to be calculated; in the second, you tell it how long the string is, so no calculation is needed.

And using sizeof() makes no difference - that is calculated at compile time too. The constructor that the first case uses is:

 string( const char * s );

There is no way for this constructor to detect that it is being given a string literal, much less to calculate its length at compile time.
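The question's code is not reproduced in this thread. Assuming the two forms being compared are `std::string("foobar")` and `std::string("foobar", 6)`, the following sketch shows conceptually what each constructor has to work with. The names `from_pointer` and `from_pointer_and_length` are hypothetical helpers, not part of the standard library.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical helper mirroring string(const char* s): only a pointer
// arrives, so the length is found by scanning for the terminating '\0'.
std::string from_pointer(const char* s) {
    std::size_t n = std::strlen(s);  // scan for the terminator
    return std::string(s, n);
}

// Hypothetical helper mirroring string(const char* s, size_t n): the
// caller supplies the length, so no scan is needed.
std::string from_pointer_and_length(const char* s, std::size_t n) {
    return std::string(s, n);
}
```

Both helpers produce the same string for `"foobar"`; the only difference is who works out the length.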

Also, constructing strings from C-style string literals happens relatively rarely in real code - it simply isn't worth optimising. And if you do need to optimise it, simply re-write:

while( BIGLOOP ) {
   string s( "foobar" );
   ...
}

as:

string f( "foobar" );
while( BIGLOOP ) {
   string s( f );
   ...
}

anon 2010-01-31 23:00:06

The compiler should be able to figure out the length of the string at compile time, in the same way it can figure out sizeof(...), since the string is constant.

Greg 2010-01-31 23:03:51

I think his point is valid: in the first case, the compiler _does_ know how long the string literal is. If you don't already know how std::string and the compiler work, there is no way to know that the compiler can't make use of its knowledge about the size of the literal.

John Knoeller 2010-01-31 23:07:27

I know that it can't be optimised automatically using C++ syntax; that's why I'm asking why the compiler can't, since the compiler knows it's a constant string, knows the length of constant strings (or can at least find out at compile time), and could be told about various cases in the C++ standard library where it could substitute one thing for another.

Fire Lancer 2010-01-31 23:19:22

Well, the compiler could. But it would take a lot of effort on the part of the compiler writers and would bind the compiler tightly to the library. I suppose the compiler writers didn't think it was worth it, correctly IMHO.

anon 2010-01-31 23:22:02

The compiler certainly could, it would just have to extend the standard to do so. **String constants can't be templates**, so you can't just make a templated constructor. **Default arguments can't rely on other arguments** so you can't sneak a call to `strlen()` in. In C++0x you could perhaps have a literal class containing the string length, but there would be no way to construct it from a normal string literal. This is genuinely a failure at the language level, although not a serious one.

Potatoswatter 2010-02-01 03:12:31

Disagree, because string constants can't be template *parameters*. They certainly can cause Template Argument Deduction, because that merely requires a type, not a value. `template <size_t N> std::string::string(char const (&arg)[N])` is a valid way to make a templated constructor.

MSalters 2010-02-01 11:57:51
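A minimal sketch of the deduction described in this comment, assuming the parameter is declared as a reference to an array (a plain array parameter is adjusted to a pointer, which leaves nothing to deduce the size from). `deduced_extent` is a hypothetical helper name.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: N is deduced from the array type of the argument.
// For a string literal, N includes the terminating '\0'.
template <std::size_t N>
std::size_t deduced_extent(const char (&)[N]) {
    return N;
}
```

For `"Hello"` the deduced extent is 6, not 5, which is why a constructor built this way would need to subtract one.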

@MSalters Yes, but it wouldn't help in this case because you can't assume that the NTS completely fills the array being passed, so you would have to call strlen() on it.

anon 2010-02-01 12:14:03

This extension will indeed cause "incorrect" behavior for `std::string("a\0b")`.

MSalters 2010-02-01 12:55:32
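The `"a\0b"` case from the comment above can be checked directly. This is only a sketch: `array_length` is a hypothetical helper that stands in for what an array-deducing constructor would see.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper: character count taken from the array type,
// excluding the final '\0' that every string literal carries.
template <std::size_t N>
std::size_t array_length(const char (&)[N]) {
    return N - 1;
}

// Existing behaviour: string(const char*) stops at the first '\0'.
std::size_t length_seen_by_pointer_ctor() {
    return std::string("a\0b").size();
}

// What an array-deducing constructor would see for the same literal.
std::size_t length_seen_by_array_ctor() {
    return array_length("a\0b");
}
```

The pointer constructor yields a one-character string, while the array-based length is 3, so adding such a constructor would change the meaning of existing code.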

Answer 2

+2 A:

The compiler undoubtedly could do something like this, and actually you could do this yourself:

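#include <cstddef>
#include <iostream>
#include <string>

template<std::size_t SIZE>
std::string f(const char(&c)[SIZE]) {
    // SIZE counts the terminating '\0', so the text is SIZE - 1 characters long.
    return std::string(c, SIZE - 1);
}

int main() {
    std::string s = f("Hello");
    std::cout << s;
}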

or even with a custom derived type (though there is no reason std::string couldn't have this constructor):

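#include <cstddef>
#include <iostream>
#include <string>

class mystring : public std::string {
public:
    // SIZE counts the terminating '\0', so pass SIZE - 1 as the length.
    template<std::size_t SIZE>
    mystring(const char(&c)[SIZE]) : std::string(c, SIZE - 1) {}
};

int main() {
    mystring s("Hello");
    std::cout << s;
}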

One large drawback is that a version of the function/constructor is generated for every different string size, and the whole class could even be duplicated if the compiler doesn't handle template hoisting very well... These could be deal-breakers in some situations.

joshperry 2010-02-01 03:30:48

`SIZE` should be `size_t`, not `int`. We can't have a string of -1 length.

Chris Lutz 2010-02-01 03:33:12

If you insist! I didn't realize I had -pedantic turned on :)

joshperry 2010-02-01 03:39:15

@joshperry - I always have -pendantic turned on. :P

Chris Lutz 2010-02-01 03:40:49

D'oh! I forgot that string literals are arrays. §2.13.4/1: "string literal has type “array of n const char”" Erasing my answer…

Potatoswatter 2010-02-01 04:06:20

Looks unlikely that the compiler will generate additional code for this. Remember that functions declared inside `class{}` are inline, and you're not declaring a new specialization of `string` so `string` methods *will not* be duplicated. This should be zero-overhead on most platforms.

Potatoswatter 2010-02-01 04:10:02

You have to take into account that the array parameter may not be filled. For example, it may be an array of 10000 characters that contains the empty string; then using sizeof rather than strlen becomes a pessimisation.

anon 2010-02-01 12:15:31
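A minimal sketch of the partially filled array described in this comment. `from_array` is a hypothetical stand-in for the templated constructor, and the 10000-element buffer follows the comment's example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical stand-in for the templated constructor: trusts the array size.
template <std::size_t N>
std::string from_array(const char (&c)[N]) {
    return std::string(c, N - 1);
}

// A 10000-element buffer that holds only the empty string.
std::size_t size_via_array() {
    const char buffer[10000] = "";  // every element is '\0'
    return from_array(buffer).size();
}

// The same buffer measured with strlen, as string(const char*) would do.
std::size_t size_via_strlen() {
    const char buffer[10000] = "";
    return std::string(buffer, std::strlen(buffer)).size();
}
```

Under these assumptions the array-based version copies 9999 null characters, so it is slower and also produces a different string from the strlen-based one, which is empty.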

@Niel: But that problem doesn't exist for string literals... And doing something like `char* c = "Hello";` will throw a deprecation warning on newer compilers. See http://codepad.org/3OQMvLZH for example.

joshperry 2010-02-01 19:17:55
