I have just done what appears to be a common newbie mistake:

First we read one of many tutorials that goes like this:

 #include <fstream>
 int main() {
      using namespace std;
      ifstream inf("file.txt");
      // (...)
 }  

Secondly, we try to use something similar in our code, which goes something like this:

#include <fstream>
#include <string>
int main() {
    using namespace std;
    std::string file = "file.txt"; // Or get the name of the file 
                                   // from a function that returns std::string.
    ifstream inf(file);
    // (...)
}

Thirdly, the newbie developer is perplexed by some cryptic compiler error message.

The problem is that ifstream takes a const char* as its constructor argument.

The solution is to convert the std::string to a const char*.

Now, the real problem is that, for a newbie, "file.txt" or similar examples given in almost all the tutorials very much looks like a std::string.

So, is "my text" a std::string, a c-string or a char*, or does it depend on the context?

Can you provide examples on how "my text" would be interpreted differently according to context?

[Edit: I thought the example above would have made it obvious, but I should have been more explicit nonetheless: what I mean is the type of any string enclosed within double quotes, i.e. "myfilename.txt", not the meaning of the word 'string'.]

Thanks.

+2  A: 

So, is "string" a std::string, a c-string or a char*, or does it depend on the context?

It depends entirely on the context. :-) Welcome to C++.

A C string is a null-terminated string, which is almost always the same thing as a char*.

Depending on the platforms and frameworks you are using, there might be even more meanings of the word "string" (for example, it is also used to refer to QString in Qt or CString in MFC).

James McNellis
I think the poster meant `"example string in double quotes"`, not "the word string".
tc.
@tc is right. I have edited my post accordingly. Sorry for the confusion.
augustin
@augustin: Oh, my mistake :-)
James McNellis
"Hello, World!" might be more traditional ;)
tc.
+3  A: 
std::string file = "file.txt"; 

The right-hand side of the = contains a (raw) string literal (i.e. a null-terminated byte string). Its effective type is array of const char.

The = is a tricky pony here: no assignment happens. The std::string class has a constructor that takes a pointer to const char as an argument. It is called to create a temporary std::string, which is then used to copy-construct (using the copy ctor of std::string) the object file of type std::string.

The compiler is free to elide the copy ctor and directly instantiate file though.

However, note that std::string is not the same thing as a C-style null-terminated string. It is not even required to be null-terminated.

ifstream inf("file.txt");

The std::ifstream class has a ctor that takes a const char * and the string literal passed to it decays to a pointer to the first element of the string.

The thing to remember is this: std::string provides (almost seamless) conversion from C-style strings. You have to look up the signature of the function to see if you are passing in a const char * or a std::string (the latter because of implicit conversions).

dirkgently
I believe the `std::string` is in fact required to be null-terminated; the implementation bundled with GCC 4.4.1 reads "due to 21.3.4 must be kept null-terminated", and as far as I know that comment has been there for quite some time.
Jon Purdy
Not entirely correct; it's an initializer, not an assignment operator. There's no way to parse `T a = b` such that it's assignment; `(T a) = b` doesn't work because (T a) isn't an expression, `T (a=b)` doesn't work because (a=b) isn't an identifier.
tc.
@tc: See second paragraph. I am not sure how else I could have worded `=`.
dirkgently
Updated, on public demand.
dirkgently
According to a slightly dubious copy (http://www.kuzbass.ru:8086/docs/isocpp/lib-strings.html), 21.3.4 requires that the const operator[] return charT() at index size(); this can be special-cased (it's not required for the non-const operator[]). The data() function is not required to return a null-terminated buffer, and c_str() is not required to be O(1). For convenience, though, GNU libstdc++ just uses a null-terminated buffer. IIRC GCC's memory layout of std::string is also equivalent to char*, which is a neat hack.
tc.
@dirkgently: I didn't downvote. I was just mentioning something I remembered, then checked against the GCC implementation of the STL. The referenced section apparently pertains to "`basic_string` element access", so it might be implying that `basic_string<T>[s.size()]` should be `T()`, though I'm not at all certain.
Jon Purdy
@dirkgently: The relevant section reads: "`const_reference operator[](size_type pos) const; reference operator[](size_type pos);` Returns: If `pos < size()`, returns `data()[pos]`. Otherwise, if `pos == size()`, the `const` version returns `charT()`. Otherwise, the behavior is undefined." So it's not required to be null-terminated, but a *reasonable* implementation is likely to be.
Jon Purdy
[Fixing comment:] Note: It's only the result of `c_str()` that is guaranteed to have a terminating null added at offset `size()`
dirkgently
@tc: FWIW, C++0X also mandates that the elements of `std::string` be contiguous (since most implementations already do that).
dirkgently
And that it be null terminated; `data()` and `c_str()` are now just different names for the same function.
Dennis Zickefoose
+1  A: 

The C++ standard library provides a std::string class to manage and represent character sequences. It encapsulates the memory management and is most of the time implemented as a C-string; but that is an implementation detail. It also provides manipulation routines for common tasks.

The std::string type will always be that (it doesn't have a conversion operator to char* for example, that's why you have the c_str() method), but it can be initialized or assigned to by a C-string (char*).

On the other hand, if you have a function that takes a std::string or a const std::string& as a parameter, you can pass a c-string (char*) to that function and the compiler will construct a std::string in-place for you. That would be a differing interpretation according to context as you put it.

David
A: 

As often as possible it should mean std::string (or an alternative such as wxString, QString, etc., if you're using a framework that supplies one). Sometimes you have no real choice but to use a NUL-terminated byte sequence, but you generally want to avoid it when possible.

Ultimately, there simply is no clear, unambiguous terminology. Such is life.

Jerry Coffin
Well, there actually is clear terminology if you use the full name. I.e. "Standard string" or "double you ex string" or "queue string" or "null terminated character array"....
Billy ONeal
+5  A: 

So, is "string" a std::string, a c-string or a char*, or does it depend on the context?

  • Neither C nor C++ has a built-in string data type, so any double-quoted strings in your code are essentially const char * (or const char [] to be exact). "C string" usually refers to this, specifically a character array with a null terminator.
  • In C++, std::string is a convenience class that wraps a raw string into an object. By using this, you can avoid having to do (messy) pointer arithmetic and memory reallocations by yourself.
  • Most standard library functions still take only char * (or const char *) parameters.
  • You can implicitly convert a char * into std::string because the latter has a constructor to do that.
  • You must explicitly convert a std::string into a const char * by using the c_str() method.

Thanks to Clark Gaebel for pointing out constness, and jalf and GMan for mentioning that it is actually an array.

casablanca
Double-quoted strings aren't `char*`, they're `const char*`. Also, you can't convert a `std::string` into a `char*` with `c_str()`, that returns a `const char*` yet again. Sure const-correctness is hard, but that doesn't mean you can ignore it entirely.
Clark Gaebel
Missed that, thanks for pointing it out.
casablanca
Actually, a string literal is a const char *array*, not a pointer. :)
jalf
@jalf: It can be both depending on how you look at it, but it's definitely at least a pointer. Every array evaluates to a pointer, and an array has no individual existence except for the fact that an array declaration statically allocates storage.
casablanca
@casablanca: It's not "at least" a pointer, what does that even mean? Types aren't ranked. Arrays do not "evaluate" to pointers, arrays are arrays and pointers are pointers. Arrays can be implicitly converted to pointers. An array does not statically allocate storage, unless you mean static storage to mean statically sized. (But note static storage has a specific meaning in C++.) And... @Clark: A string literal is an array of const char, with static storage.
GMan
@GMan: I guess I was wrong about the "evaluates to" part, I was only trying to say that since you don't have direct access to the array declaration, it behaves as a pointer for most practical purposes, except for say `sizeof` a string. I've amended the answer anyway.
casablanca
A: 

To use the proper wording (as found in the C++ language standard), a string is one of the varieties of std::basic_string (including std::string) from chapter 21.3 "String classes" (as in C++0x N3092), while the argument of ifstream's constructor is an NTBS (null-terminated byte string).

To quote, C++0x N3092 27.9.1.4/2.

basic_filebuf* open(const char* s, ios_base::openmode mode);

...

opens a file, if possible, whose name is the NTBS s

Cubbi
+7  A: 

"myString" is a string literal, and has the type const char[9], an array of 9 constant char. Note that it has enough space for the null terminator. So "Hi" is a const char[3], and so forth.

This is pretty much always true, with no ambiguity. However, whenever necessary, a const char[9] will decay into a const char* that points to its first element. And std::string has an implicit constructor that accepts a const char*. So while it always starts as an array of char, it can become the other types if you need it to.

Note that string literals have the unique property that const char[N] can also decay into char*, but this behavior is deprecated. If you try to modify the underlying string this way, you end up with undefined behavior. It's just not a good idea.

Dennis Zickefoose