They're both resizable arrays, and std::basic_string doesn't have any specifically character-related functions like upper(). What's special about string to make it better for character data?

+1  A: 

It was a design decision made early in the STL's creation. I think a lot of people now admit that std::string's interface is too bloated and inconsistent with the rest of the STL, but it's too late to change it.

rlbond
+2  A: 

Strings do have special string-related functions: c_str, substr, concatenation, among others. Also, don't forget the important point that strings automatically add the '\0' to the end of their data (and handle it properly with concatenation etc.), so they don't behave the same way as a vector<char> or something like that.

But yes, they are incredibly similar. They both hold a pointer to a heap-allocated array, but they are certainly not the same.
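
As a rough illustration of the null-termination point, here is a minimal sketch. The helper names `c_length`, `to_c_buffer`, and `demo_null_termination` are made up for this example; only the standard library calls are real.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>
#include <vector>

// Length as a C API would see it, relying on the terminator std::string maintains.
std::size_t c_length(const std::string& s) {
    return std::strlen(s.c_str());  // c_str() is always '\0'-terminated
}

// A vector<char> has no terminator unless the caller appends one explicitly.
std::vector<char> to_c_buffer(const std::vector<char>& chars) {
    std::vector<char> buf(chars);
    buf.push_back('\0');  // the manual step std::string does for us
    return buf;
}

void demo_null_termination() {
    std::string s = "abc";
    s += "def";                  // the terminator moves along with the data
    assert(s.size() == 6);       // size() does not count the terminator
    assert(c_length(s) == 6);

    const std::vector<char> v = {'a', 'b', 'c'};
    assert(v.size() == 3);       // exactly three elements, no '\0'
    const std::vector<char> buf = to_c_buffer(v);
    assert(buf.size() == 4);     // the terminator is an ordinary element here
    assert(std::strlen(buf.data()) == 3);
}
```

With a vector<char> the terminator becomes part of size(), so every caller has to remember whether it is present; std::string keeps it out of the visible contents.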

Peter Alexander
I don't really see substr or concatenation as character-string specific. There are languages that do provide those for arbitrary arrays. You're right about the null-termination, though: c_str() is, unfortunately, the most common function I use on std::string.
dan04
@dan04: std::string has tons of functions and overloads designed to make it work hand in hand with C-style strings (which happens to be what string literals are in C++). It would be meaningless for `vector<T>` to support all those operations for `T*`, because `char*` happens to be a pointer with a very specific meaning.
UncleBens
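
To make the C-string interoperability concrete, the following sketch exercises a few of the `const char*` overloads on std::string. The function name `demo_c_string_overloads` is invented for the example, and the list is only a sample, not an exhaustive catalogue.

```cpp
#include <cassert>
#include <string>

// Each statement below picks an overload that takes a C-style string directly,
// so no temporary std::string has to be spelled out by the caller.
void demo_c_string_overloads() {
    const char* literal = "hello world";

    std::string s(literal);              // constructor from const char*
    s += "!";                            // operator+=(const char*)
    assert(s == "hello world!");         // operator==(const string&, const char*)
    assert(s.find("world") == 6);        // find(const char*)
    assert(s.compare("hello") > 0);      // compare(const char*): s is longer

    s.assign("bye");                     // assign(const char*)
    assert(s == "bye");
}
```

None of these would make sense as `vector<T>` members taking `T*`, since a bare `T*` carries no length and no terminator convention.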
+2  A: 

std::string has a lot of operations that std::vector doesn't:

  • operator + (append string a to string b, + doesn't really make sense for vectors)
  • operator <, >, ==, != (string comparison; std::vector does define these as lexicographical comparison, but ordering makes less sense for arbitrary element types)
  • c_str() (return a "C style" representation)
  • And more (including substring, find, etc., but some of these are implemented elsewhere in the STL and can be used on vectors, sort of)

Admittedly there is little else that std::string has that a vector doesn't or couldn't, but these are important: they cover the majority of use cases for a string.
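
A small sketch of the "can be used on vectors, sort of" remark: the string members on one side, and the generic algorithms that approximate them for vector<char> on the other. The helper names (`find_in_string`, `find_in_vector`, `concat`, `demo_string_vs_vector`) are hypothetical and exist only for this comparison.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <iterator>
#include <string>
#include <vector>

// Member-function style on std::string.
std::size_t find_in_string(const std::string& text, const std::string& word) {
    return text.find(word);  // std::string::npos when absent
}

// Generic-algorithm style on std::vector<char>: std::search stands in for find.
std::size_t find_in_vector(const std::vector<char>& text, const std::vector<char>& word) {
    if (word.empty()) return 0;  // mirror std::string::find for an empty needle
    auto it = std::search(text.begin(), text.end(), word.begin(), word.end());
    if (it == text.end()) return std::string::npos;
    return static_cast<std::size_t>(std::distance(text.begin(), it));
}

// The vector spelling of "a + b".
std::vector<char> concat(std::vector<char> a, const std::vector<char>& b) {
    a.insert(a.end(), b.begin(), b.end());
    return a;
}

void demo_string_vs_vector() {
    const std::string s = std::string("foo") + "bar";  // operator+
    assert(s.substr(3) == "bar");
    assert(find_in_string(s, "bar") == 3);

    const std::vector<char> foo = {'f', 'o', 'o'};
    const std::vector<char> bar = {'b', 'a', 'r'};
    const std::vector<char> v = concat(foo, bar);
    assert(v.size() == 6);
    assert(find_in_vector(v, bar) == 3);
}
```

The vector versions work, but each one needs a few lines of iterator plumbing that the string interface hides.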

SoapBox
+1  A: 

Most of the reason has to do with localization and internationalization (L10N/I18N), performance, and history.

For the L10N/I18N issues, char_traits was added, and you will note that streams have it as well. The intent was to make "smarter characters" in a way, but the outcome was useless. About the only thing char_traits is good for is specializing some of the std::string/wstring compares, copies, etc. as compiler intrinsics.
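
For readers who have not seen char_traits used as a customization point, here is the well-known case-insensitive string exercise as a minimal sketch. The names `ci_char_traits`, `ci_string`, and `demo_ci_string` are assumptions made for the example, and the folding is ASCII-only, which is exactly the kind of limitation that makes the mechanism a poor fit for real internationalization.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical traits class: character comparisons ignore ASCII case.
struct ci_char_traits : std::char_traits<char> {
    static char fold(char c) {
        return static_cast<char>(std::toupper(static_cast<unsigned char>(c)));
    }
    static bool eq(char a, char b) { return fold(a) == fold(b); }
    static bool lt(char a, char b) { return fold(a) < fold(b); }
    static int compare(const char* a, const char* b, std::size_t n) {
        for (std::size_t i = 0; i < n; ++i) {
            if (lt(a[i], b[i])) return -1;
            if (lt(b[i], a[i])) return 1;
        }
        return 0;
    }
    static const char* find(const char* s, std::size_t n, char c) {
        for (std::size_t i = 0; i < n; ++i) {
            if (eq(s[i], c)) return s + i;
        }
        return nullptr;
    }
};

// Same template as std::string, different traits, therefore a distinct type.
using ci_string = std::basic_string<char, ci_char_traits>;

void demo_ci_string() {
    const ci_string a = "Hello";
    const ci_string b = "hELLO";
    assert(a == b);               // compare() goes through the custom traits
    assert(a.find("LLO") == 2);   // so does find()
}
```

Note that `ci_string` does not convert to or from std::string implicitly, which is one practical reason the traits parameter sees little use.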

The failure is mostly due to UNIX streams themselves, which see the character as the main "atom", whereas in internationalized GUIs, web applications, etc., the string is the main "atom." In other words, in C/C++ land, we have "dumb arrays of smart characters" for strings, whereas every other language uses "smart arrays of dumb characters." Unicode takes the latter approach.

Another big difference between basic_string and vector: basic_string can only contain POD types. This can make a difference in some cases; sometimes the compiler has an easier time optimizing basic_string than vector.
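
The element-type restriction can be sketched with type traits. The check below is an approximation written for this example (the names `is_char_like`, `Named`, and `demo_vector_of_named` are hypothetical); it tests for a non-array, trivial, standard-layout type, which is roughly what "POD" meant.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <type_traits>
#include <vector>

// Approximation of the "char-like" requirement on basic_string elements.
template <typename T>
constexpr bool is_char_like() {
    return !std::is_array<T>::value &&
           std::is_trivial<T>::value &&
           std::is_standard_layout<T>::value;
}

// Hypothetical element type with a non-trivial member.
struct Named {
    std::string name;
};

static_assert(is_char_like<char>(), "char qualifies");
static_assert(is_char_like<wchar_t>(), "wchar_t qualifies");
static_assert(!is_char_like<Named>(), "Named has a non-trivial member");

// vector has no such restriction: it constructs and destroys elements properly.
void demo_vector_of_named() {
    std::vector<Named> people;
    people.push_back(Named{"Ada"});
    people.push_back(Named{"Bjarne"});
    assert(people.size() == 2);
    assert(people[1].name == "Bjarne");
}
```

Because the elements are trivially copyable, a basic_string implementation is free to move them around with raw memory copies, which is one plausible source of the optimization advantage mentioned above.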

basic_string sometimes has other optimizations as well, such as Copy on Write and Small String Optimization. These vary from one implementation to the next.
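
One way to poke at the Small String Optimization on a given implementation is to check whether the character buffer lies inside the string object itself. This is a heuristic sketch, not something the standard promises; `stored_inline` and `demo_long_string_is_not_inline` are names invented here, and the result for short strings depends entirely on the library in use.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Heuristic probe: does the buffer returned by data() live inside the object?
bool stored_inline(const std::string& s) {
    const auto obj = reinterpret_cast<std::uintptr_t>(&s);
    const auto buf = reinterpret_cast<std::uintptr_t>(s.data());
    return buf >= obj && buf < obj + sizeof(std::string);
}

void demo_long_string_is_not_inline() {
    // The string object is far smaller than 1000 bytes on common
    // implementations, so this buffer has to live elsewhere.
    const std::string big(1000, 'x');
    assert(sizeof(std::string) < 1000);
    assert(!stored_inline(big));
}
```

Calling `stored_inline(std::string("hi"))` on your own toolchain shows whether short strings avoid a heap allocation there; no particular answer is guaranteed.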

However, probably the main reason there are two nearly identical things is historical: strings predate the STL by quite a bit, and most of the work seemed to center on making them interoperate with the IOStream library. One C++ urban myth is that the STL is a "container library" that was added to C++. It is not; to get it adopted into C++, containers were added. An "STL interface" was also bolted onto the existing string class. std::vector was largely taken from a vector implementation that existed in the AdaSTL.

Lance Diduck