How does it differ from a usual string?

+12  A: 

A null-terminated string is a contiguous sequence of characters, the last one of which has the binary bit pattern all zeros. I'm not sure what you mean by a "usual string", but if you mean std::string, then a std::string is not required (strictly, in C++03; C++11 and later do require contiguous storage) to be contiguous, and is not required to have a terminator. Also, a std::string's string data is always allocated and managed by the std::string object that contains it; for a null-terminated string, there is no such container, and you typically refer to and manage such strings using bare pointers.
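To make the contrast concrete, here is a minimal sketch. The helper names `terminated_length` and `with_embedded_nul` are made up for illustration; the first does roughly what `std::strlen` does, and the second shows that a std::string tracks its own size and so can hold a zero byte in the middle.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: find the length of a null-terminated string by
// walking the bare pointer until the terminating '\0' is reached.
std::size_t terminated_length(const char* s) {
    std::size_t n = 0;
    while (s[n] != '\0') {
        ++n;
    }
    return n;
}

// Hypothetical helper: build a std::string from 5 explicit bytes.
// The object owns the data and stores its size, so the embedded '\0'
// does not end the string.
std::string with_embedded_nul() {
    return std::string("ab\0cd", 5);
}
```

Reading the same data back through `c_str()` and a terminator-based routine stops at the first zero byte, so the two views of the "length" can disagree.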

All of this should really be covered in any decent C++ textbook - I recommend getting hold of Accelerated C++, one of the best of them.

anon
Basically, the zeroed byte determines the length of a character string in C.
Costique
It's easy, thx!
lhj7362
The last character need not have the bit pattern of all zeros, it merely has to have the *value* of 0.
avakar
I'm not aware of any platform where there is this distinction. Note the use of null (or perhaps more properly NUL) as a string terminator is different from the NULL pointer.
anon
Neither am I. However, the standard says very little about bit-patterns of integral values.
avakar
It almost must have bit pattern all 0. unsigned char must be 2's complement. signed char must have the same representation as unsigned for non-negative values. char must have the same representation as one of the two (and hence, for 0, the same as unsigned). `'\0'` must equal `0` so 2's complement all value bits must be 0. The only way `(char)0` can not have all bits zero is if it has one or more padding bits, that is bits of the representation which don't participate in the value. I'm not sure that's illegal, but it's *really nasty*, especially for char. It'd be less wide than unsigned char.
Steve Jessop
You say that std::string is not required to have a terminator. Does that mean that mystring.c_str() is not guaranteed to be null-terminated?
StackedCrooked
No. If the null-terminator doesn't exist, c_str() is guaranteed to append it. See http://www.cplusplus.com/reference/string/string/c_str/
Pukku
Steve, "signed char must have the same representation as unsigned for non-negative values", could you find the relevant quote?
avakar
Also, `unsigned char` must behave according to modulo 2^k arithmetic. I don't think the actual bit pattern is mentioned in the standard (can you please quote or reference, if I'm wrong?).
avakar
Both are implied by 3.9.1:7, "the representation of integral types shall define values by use of a pure binary numeration system", further defined in a footnote. So actually I shouldn't have stated them separately. I also found in 3.9.1:1. "For character types, all bits in the object representation participate in the value representation", which slams the door. I wrongly thought that was only required of unsigned char, hence "less wide", but it's required of signed char too.
Steve Jessop
Steve, thank you, I stand corrected, I missed 3.9.1/7. I don't think that 3.9.1/1 is relevant though, the fact that "all bits in the object representation participate in the value representation" simply means that there are no padding bits, it says nothing about the actual bit patterns. In any case, the answer is absolutely correct now and I'm up-voting. :) Sorry for the prolonged discussion.
avakar
A: 

A null-terminated string means that the end of your string is defined through the occurrence of a null-char (all bits are zero).

"Other strings", e.g., have to store their own length.

Dario
+3  A: 

A "string" is really just an array of chars; a null-terminated string is one where a null character '\0' marks the end of the string (not necessarily the end of the array). All strings in code (delimited by double quotes "") are automatically null-terminated by the compiler.

So for example, "hi" is the same as {'h', 'i', '\0'}.
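A small sketch of that equivalence, which also shows that the terminator need not sit at the end of the array. The names `literal`, `spelled_out`, `same_bytes` and `length_inside_larger_array` are chosen here purely for illustration.

```cpp
#include <cstddef>
#include <cstring>

// The compiler appends '\0' to the literal, so the array has 3 elements.
const char literal[] = "hi";
// The same three bytes written out by hand.
const char spelled_out[] = {'h', 'i', '\0'};

static_assert(sizeof(literal) == 3, "two characters plus the terminator");
static_assert(sizeof(spelled_out) == sizeof(literal), "same size");

// Compare the two arrays byte for byte, terminator included.
bool same_bytes() {
    return std::memcmp(literal, spelled_out, sizeof(literal)) == 0;
}

// The array is 8 chars long, but the string in it ends at the first '\0'.
std::size_t length_inside_larger_array() {
    char buffer[8] = "hi";  // remaining elements are zero-initialized
    return std::strlen(buffer);
}
```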

Ricket
+1  A: 

A null-terminated string is a native string format in C. String literals, for example, are implemented as null-terminated. As a result, a whole lot of code (C run-time library to begin with) assumes that strings are null-terminated.

Seva Alekseyev
+4  A: 

There are two main ways to represent a string:

1) A sequence of characters with an ASCII null (nul) character, 0, at the end. You can tell how long it is by searching for the terminator. This is called a null-terminated string, or sometimes nul-terminated.

2) A sequence of characters, plus a separate field (either an integer length, or a pointer to the end of the string), to tell you how long it is.
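The two representations could be sketched roughly as follows. `CountedString`, `length_of` and `slice` are hypothetical names invented for this example, not anything from a standard library.

```cpp
#include <cstddef>
#include <cstring>

// Type 2 (hypothetical struct): characters plus a separate length field.
struct CountedString {
    const char* data;
    std::size_t length;
};

// Type 1: the length is found by searching for the terminator,
// which takes time proportional to the length.
std::size_t length_of(const char* s) {
    return std::strlen(s);
}

// Type 2: the length is simply read from the stored field.
std::size_t length_of(const CountedString& s) {
    return s.length;
}

// A counted string can describe part of a larger buffer without copying,
// because it does not need a terminator right after its last character.
CountedString slice(const char* buffer, std::size_t offset, std::size_t count) {
    return CountedString{buffer + offset, count};
}
```

One consequence of the design difference: a type 1 string cannot contain the terminator value as data, while a type 2 string can hold any byte.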

I'm not sure about "usual string", but what quite often happens is that when talking about a particular language, the word "string" is used to mean the standard representation for that language. So in Java, java.lang.String is a type 2 string, so that's what "string" means. In C, "string" probably means a type 1 string. The standard is quite verbose in order to be precise, but people always want to leave out what's "obvious".

In C++, unfortunately, both types are standard. std::string is a type 2 string[*], but standard library functions inherited from C operate on type 1 strings.

[*] Actually, std::string is often implemented as an array of characters, with a separate length field and a nul terminator. That's so that the c_str() function can be implemented without ever needing to copy or re-allocate the string data. I can't remember off-hand whether it's legal to implement std::string without storing a length field: the question is what complexity guarantees are required by the standard. For containers in general size() is recommended to be O(1), but isn't actually required to be. So even if it is legal, an implementation of std::string that just uses nul-terminators would be surprising.
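As a rough illustration of moving between the two kinds in C++, assuming only standard calls (`c_str()`, `std::strtol`, and the `std::string` constructor taking a `const char*`), with `parse_number` and `from_c` as made-up helper names:

```cpp
#include <cstdlib>
#include <string>

// Hand a std::string to a function inherited from C: c_str() yields a
// pointer to a null-terminated copy or view of the characters.
long parse_number(const std::string& text) {
    return std::strtol(text.c_str(), nullptr, 10);
}

// Go the other way: this constructor reads characters up to the
// terminator and stores them together with their count.
std::string from_c(const char* s) {
    return std::string(s);
}
```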

Steve Jessop