When should I use std::string and when should I use char* to manage arrays of chars in C++?

It seems you should use char* if performance (speed) is crucial and you're willing to accept some risk because of the manual memory management.

Are there other scenarios to consider?

+1  A: 

If you are using the array of chars for text, use std::string: it is more flexible and easier to use. If you use it for something else, such as data storage, use arrays (prefer vectors).

PoweRoy
+4  A: 

Some comments about basic_string<>: "A Policy-Based basic_string Implementation". It might give you some idea of the problems encountered and possible ways of avoiding them.

Anonymous
Interesting article, but it makes a very dubious claim that thread-safe copy-on-write is 'unacceptably slow even in the single-threaded parts of your application'. Efficient thread-safe reference counting is not that difficult to implement.
TrayMan
@TrayMan: the problem is that, in order to be correct, every operation which could potentially mutate the string must do a copy (even getting a reference to a char via operator[]). Add thread safety to this and you get locks everywhere. It often ends up not being a gain over just copying in the first place.
Evan Teran
More info supporting Evan. The original C++ standard put so many constraints on std::string that implementers were essentially required to use a COW implementation. In C++1x they've removed the ability for std::string to be COW. The reasoning is that it is not possible to do correctly and efficiently in a multithreaded environment.
caspin
A: 

Even when performance is crucial you'd better use vector<char>: it allows memory to be allocated in advance (the reserve() method) and will help you avoid memory leaks. Using vector::operator[] may add some overhead, but you can always take the address of the buffer and index it exactly as if it were a char*.
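
A minimal sketch of that idea, assuming a C++11 compiler (for data(); older code would use &buf[0]). The names make_buffer and demo_vector_buffer are made up for illustration:

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical helper: copies a C string into a vector<char>, terminator included.
std::vector<char> make_buffer(const char* text) {
    const std::size_t n = std::strlen(text) + 1;  // +1 for the trailing '\0'
    std::vector<char> buf;
    buf.reserve(n);               // allocate once, in advance
    buf.assign(text, text + n);   // fits in the reserved capacity
    return buf;                   // memory is released by the destructor
}

void demo_vector_buffer() {
    std::vector<char> buf = make_buffer("hello");
    char* raw = buf.data();       // address of the contiguous buffer
    raw[0] = 'H';                 // indexed exactly like a plain char*
    assert(std::strcmp(raw, "Hello") == 0);
    assert(buf.size() == 6);
}
```

The raw pointer stays valid only until the vector reallocates or is destroyed, so it should not be kept across calls that may grow the vector.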

sharptooth
But it would be nice to use some kind of typical string functionality, and have just the option to specify the policy for the storage. For that see the link in my answer.
Anonymous
It is not really true. If you consider that a vector is allocated in contiguous memory, reallocation (to increase the vector size) will not be efficient at all, as it implies copying the previous chunk.
Jérôme
I misunderstood your response: you use the vector instead of char*, not instead of string... In this case I agree.
Jérôme
There shouldn't be an overhead in operator[] usage. See for instance, http://stackoverflow.com/questions/381621/using-arrays-or-stdvectors-in-c-whats-the-performance-gap
Luc Hermitte
+3  A: 

You should consider using char* in the following cases:

  • The array will be passed as a parameter.
  • You know in advance the maximum size of your array (you know it OR you impose it).
  • You will not do any transformation on this array.

Actually, in C++, char* is often used for small fixed words such as options, file names, etc.

Jérôme
+9  A: 

You can pass std::strings by reference if they are large to avoid copying, or a pointer to the instance, so I don't see any real advantage using char pointers.
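
As a rough illustration (the function names here are invented for the example), passing by reference-to-const hands the callee the caller's own object, so nothing is copied however large the string is:

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical function: takes the string by reference-to-const, so no copy is made.
std::size_t count_spaces(const std::string& text) {
    std::size_t n = 0;
    for (char c : text) {
        if (c == ' ') ++n;
    }
    return n;
}

// Returns the address of its argument, to show it is the caller's object.
const std::string* address_of(const std::string& text) { return &text; }

void demo_pass_by_reference() {
    const std::string big(1000, ' ');
    assert(count_spaces(big) == 1000);
    assert(address_of(big) == &big);  // same object: nothing was copied
}
```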

I use std::string/wstring for more or less everything that is actual text. char * is useful for other types of data though and you can be sure it gets deallocated like it should. Otherwise std::vector is the way to go.

There are probably exceptions to all of this.

Skurmedel
Is there a performance difference between the two?
vtd-xml-author
@vtd-xml-author: Some, maybe. Straight `char *` has almost no overhead. Exactly what overhead `std::string` has I don't know; it's likely to be implementation dependent. I hardly expect the overhead to be much greater than that of a bare char pointer. Since I don't own a copy of the standard I can't really detail any guarantees made by the standard. Any performance difference will likely vary depending on the operations to be made. `std::string::size` could store the size next to the character data and thus be quicker than `strlen`.
Skurmedel
+1  A: 

Why use a C++ string:

  • Strings are generally safer than char*. When you work with char* you normally have to do a lot of checking yourself; many of these checks are done for you in the string class.
  • With char* you usually have to free the memory you allocated. You don't have to do that with string, since it frees its internal buffer when it is destroyed.
  • Strings work well with C++ stringstream, so formatted I/O is very easy.

Why use char*:

Using char* gives you more control over what is happening "behind the scenes", which means you can tune performance if you need to.
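
A small sketch of the stringstream point, with made-up helper names. There is no buffer to size and nothing to free:

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Hypothetical helper: formats "name=value" without any manual buffer sizing.
std::string format_setting(const std::string& name, int value) {
    std::ostringstream out;
    out << name << '=' << value;
    return out.str();
}

void demo_stringstream() {
    assert(format_setting("width", 80) == "width=80");

    // Parsing works the same way in the other direction.
    std::istringstream in("42 apples");
    int count = 0;
    std::string word;
    in >> count >> word;
    assert(count == 42 && word == "apples");
}
```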

+1  A: 

If you want to use C libraries, you'll have to deal with C-strings. Same applies if you want to expose your API to C.
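
For illustration, a sketch of crossing that boundary in both directions using only the standard library. The wrapper names are hypothetical; c_str() is the usual way to hand a std::string to a function expecting const char*:

```cpp
#include <cassert>
#include <cstdlib>
#include <cstring>
#include <string>

// Passing a std::string to a C function that expects const char*.
long parse_number(const std::string& digits) {
    return std::strtol(digits.c_str(), nullptr, 10);
}

// Receiving a C string and keeping an owned copy of it.
std::string from_c(const char* text) {
    return text ? std::string(text) : std::string();
}

void demo_c_interop() {
    assert(parse_number("1234") == 1234);
    assert(std::strlen(std::string("abc").c_str()) == 3);
    assert(from_c("xyz") == "xyz");
    assert(from_c(nullptr).empty());
}
```

The pointer returned by c_str() is only valid while the string is alive and unmodified, so it should not be stored by the C side.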

n0rd
+2  A: 

One occasion when you MUST use char* and not std::string is when you need static string constants. The reason is that you have no control over the order in which modules initialize their static variables, and another global object from a different module may refer to your string before it is initialized. http://google-styleguide.googlecode.com/svn/trunk/cppguide.xml#Static_and_Global_Variables
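
A minimal sketch of the difference. The constant name is invented, and the function-local static is an alternative that is not part of the answer above, shown only as one common way to get a std::string that is constructed on first use:

```cpp
#include <cassert>
#include <cstring>
#include <string>

// A char array initialized from a literal needs no constructor call at startup,
// so other globals can read it safely regardless of initialization order.
static const char kAppName[] = "demo-app";  // hypothetical constant

// Assumption, not from the answer: if a std::string is really needed, a
// function-local static is constructed the first time the function is called.
const std::string& app_name() {
    static const std::string name(kAppName);
    return name;
}

void demo_static_constant() {
    assert(std::strcmp(kAppName, "demo-app") == 0);
    assert(sizeof(kAppName) == 9);           // 8 characters plus '\0'
    assert(app_name() == "demo-app");
}
```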

std::string pros:

  • Manages the memory for you (the string can grow, and the implementation will allocate a larger buffer for you).
  • Higher-level programming interface, works nicely with the rest of STL.

std::string cons:

  • Two distinct STL string instances cannot share the same underlying buffer, so if you pass by value you always get a new copy.
  • There is some performance penalty, but I'd say unless your requirements are special it's negligible.

thesamet
Actually, STL implementations often implement copy-on-write semantics for std::string, so passing them by value doesn't cost very much at all. Still, it's better not to rely on that, and generally better to pass a reference-to-const anyway.
Some std::string implementations gave up on COW implementation. Moreover it is not as trivial as it seems to provide a (POSIX) thread safe class compatible with the standard. See http://groups.google.fr/group/ifi.test.boa/browse_frm/thread/cb16ed54c3e78a78/215edbc9c7686fdd or http://groups.google.fr/group/comp.programming.threads/browse_frm/thread/dbdf76a8844bde5c/d8651dd45d13b862
Luc Hermitte
A: 

AFAIK most std::string implementations internally use copy-on-write, reference-counted semantics to avoid overhead, even if strings are not passed by reference.

piotr
This is no longer true, because copy-on-write causes serious scalability issues in a multithreaded environment.
Suma
It's true for GCC's implementation of the STL at least.
+1  A: 

Raw string usage

Yes, sometimes you really can do this. When using const char *, char arrays allocated on the stack, and string literals, you can write the code in such a way that there is no memory allocation at all.

Writing such code often requires more thinking and care than using string or vector, but with proper techniques it can be done, and the code can be safe. When copying into a char [], you always need to make sure that you either have some guarantee on the length of the string being copied, or you check for and handle oversized strings gracefully. Not doing so is what gave the strcpy family of functions the reputation of being unsafe.
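
One possible shape for such a length check, as a sketch only. The helper copy_checked is hypothetical, and the policy of rejecting oversized input is an assumption (truncating would be another option):

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical helper: copies src into dst only if it fits, terminator included.
// On failure dst is left as an empty string and false is returned.
bool copy_checked(char* dst, std::size_t dst_size, const char* src) {
    if (dst_size == 0) return false;
    const std::size_t len = std::strlen(src);
    if (len >= dst_size) {          // no room for the text plus '\0'
        dst[0] = '\0';
        return false;
    }
    std::memcpy(dst, src, len + 1); // length already verified
    return true;
}

void demo_copy_checked() {
    char name[8];                   // stack buffer: no heap allocation at all
    assert(copy_checked(name, sizeof(name), "short"));
    assert(std::strcmp(name, "short") == 0);
    assert(!copy_checked(name, sizeof(name), "far too long"));
    assert(name[0] == '\0');
}
```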

How templates can help writing safe char buffers

As for the safety of char [] buffers, templates can help, as they can encapsulate the handling of the buffer size for you. Templates like this are implemented e.g. by Microsoft to provide safe replacements for strcpy. The example here is extracted from my own code. The real code has a lot more methods, but this should be enough to convey the basic idea:

#include <cstring> // strncpy

template <int Size>
class BString
{
  char _data[Size];

  public:
  BString()
  {
    _data[0]=0;
    // note: last character will always stay zero
    // if not, overflow occurred
    // all constructors should contain last element initialization
    // so that it can be verified during destruction
    _data[Size-1]=0;
  }
  const BString &operator = (const char *src)
  {
    strncpy(_data,src,Size-1);
    return *this;
  }

  operator const char *() const {return _data;}
};

//! overloads that make conversion of C code easier 
template <int Size>
inline const BString<Size> & strcpy(BString<Size> &dst, const char *src)
{
  return dst = src;
}
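
A usage sketch of the same idea. It repeats a trimmed copy of the class only so that it compiles on its own; the buffer size of 8 and the demo function are arbitrary choices for illustration:

```cpp
#include <cassert>
#include <cstring>

// Trimmed copy of the BString idea above, repeated to keep the sketch self-contained.
template <int Size>
class BString {
    char _data[Size];
public:
    BString() { _data[0] = 0; _data[Size - 1] = 0; }
    const BString& operator=(const char* src) {
        std::strncpy(_data, src, Size - 1);  // never writes the final '\0' slot
        return *this;
    }
    operator const char*() const { return _data; }
};

void demo_bstring() {
    BString<8> s;
    s = "abc";
    assert(std::strcmp(s, "abc") == 0);
    s = "0123456789";                        // longer than the buffer
    assert(std::strcmp(s, "0123456") == 0);  // truncated to Size-1 characters
    assert(std::strlen(s) == 7);
}
```

Oversized input is silently truncated here. Whether that or an explicit error is preferable depends on the application.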
Suma
+1  A: 

You can expect most operations on a std::string (such as find) to be as optimized as possible, so they're likely to perform at least as well as a pure C counterpart.

It's also worth noting that std::string iterators quite often map to pointers into the underlying char array. So any algorithm you devise on top of iterators is essentially identical to the same algorithm on top of char * in terms of performance.
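
To make the iterator point concrete, here is a sketch of one algorithm (a made-up vowel counter) written once against iterators and used unchanged with both std::string iterators and plain char pointers:

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical algorithm: works with any forward iterator over chars.
template <typename Iter>
std::size_t count_vowels(Iter first, Iter last) {
    std::size_t n = 0;
    for (; first != last; ++first) {
        switch (*first) {
            case 'a': case 'e': case 'i': case 'o': case 'u': ++n; break;
            default: break;
        }
    }
    return n;
}

void demo_iterators() {
    const std::string s = "education";
    const char* c = "education";
    assert(count_vowels(s.begin(), s.end()) == 5);     // std::string iterators
    assert(count_vowels(c, c + std::strlen(c)) == 5);  // raw char pointers
}
```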

Things to watch out for are e.g. operator[] - most STL implementations do not perform bounds checking, and should translate this to the same operation on the underlying character array. AFAIK STLPort can optionally perform bounds checking, at which point this operator would be a little bit slower.

So what does using std::string gain you? It absolves you from manual memory management; resizing the array becomes easier, and you generally have to think less about freeing memory.

If you're worried about performance when resizing a string, there's a reserve function that you may find useful.
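
A short sketch of reserve in use, with a hypothetical repeat helper. Reserving up front means the appends should not need to reallocate along the way:

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper: builds a string made of `count` copies of `piece`.
std::string repeat(const std::string& piece, std::size_t count) {
    std::string out;
    out.reserve(piece.size() * count);  // request the final capacity once
    for (std::size_t i = 0; i < count; ++i) {
        out += piece;                   // appends stay within that capacity
    }
    return out;
}

void demo_reserve() {
    std::string s;
    s.reserve(256);
    assert(s.capacity() >= 256);
    assert(repeat("ab", 3) == "ababab");
}
```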

+5  A: 

My point of view is:

  • Never use char * if you don't call "C" code.
  • Always use std::string: It's easier, it's more friendly, it's optimized, it's standard, it will prevent you from having bugs, it's been checked and proven to work.
Gal Goldman
+1  A: 

Use (const) char* as parameters if you are writing a library. std::string implementations differ between different compilers.
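
As a sketch of what that can look like, assuming an invented entry point named mylib_count_words: only plain C types appear in the signature, and std::string stays an internal detail of the library.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical library entry point: only C types cross the boundary.
extern "C" std::size_t mylib_count_words(const char* text) {
    if (text == nullptr) return 0;
    const std::string s(text);  // std::string is used internally only
    std::size_t words = 0;
    bool in_word = false;
    for (char c : s) {
        const bool space = (c == ' ' || c == '\t' || c == '\n');
        if (!space && !in_word) ++words;  // a new word starts here
        in_word = !space;
    }
    return words;
}

void demo_library_boundary() {
    assert(mylib_count_words("one two  three") == 3);
    assert(mylib_count_words("") == 0);
    assert(mylib_count_words(nullptr) == 0);
}
```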

Nemanja Trifunovic
If you're writing a library in C++, the layout of std::string isn't the only thing you have to worry about. There are hosts of potential incompatibilities between two implementations. Use libraries in C++ only if available in source or compiled for the exact compiler you're using. C libraries are typically more portable, but in that case you don't have std::string anyway.
David Thornley
True, std::string is not the only problem, but it is a bit too much to conclude "Use libraries in C++ only if available in source or compiled for the exact compiler you're using." There are component systems that work fine with different compilers (COM, for instance), and it is possible to expose a C interface to a library that is internally written in C++ (the Win32 API, for instance).
Nemanja Trifunovic