The behavior you're describing is what you'd expect if you copy the characters into the buffer but forget to add a null character at the end to terminate the string. Try adding a null character after the loop, and make sure you allocate enough space (one more character) for it. Or, better, use the string constructor overload which accepts not just a char * but also a length.
Or, even better, use std::string::substr -- it will be easier and probably more efficient.
string after(int after, string word) {
    return word.substr(after);
}
BTW, you don't need an after method at all, since exactly what you want already exists on the string class.
Now, to answer your specific question about why this only showed up on the 8th and later characters, it's important to understand how "C" strings work. A "C" string is a sequence of bytes terminated by a null (0) character. Library functions (like the string constructor that takes a char *, which you use to copy temp into a string instance) start reading from the first character (temp[0]) and keep reading until the end, where "the end" is the first null character, not the size of the memory allocation. For example, if temp is 6 characters long and you fill all 6 characters, then a library function reading that string to "the end" will read the first 6 characters and then keep going (past the end of the allocated memory!) until it finds a null character or the program crashes (e.g. due to trying to access an invalid memory location).
Sometimes you may get lucky: if temp was 6 characters long and the first byte in memory after the end of your allocation happened to be zero, everything would appear to work. If that byte happened to be non-zero, you'd see garbage characters. The garbage isn't truly random (often the same bytes will be there every time, since they're left by operations like previous method calls that are consistent from run to run of your program), but when you read uninitialized memory there's no way of knowing what you'll find. In a bounds-checking environment (e.g. Java, C#, or the at() method of C++'s string class), an attempt to read beyond the bounds of an allocation throws an exception. But "C" strings don't know where their end is, leaving them vulnerable to problems like the one you saw, or more nefarious problems like buffer overflows.
Finally, a logical follow-up question you'd probably ask: why exactly 8 bytes? Since you're reading memory that you didn't allocate and didn't initialize, what's in that RAM is whatever the previous user of that RAM left there. On 32-bit and 64-bit machines, memory is generally allocated in 4- or 8-byte chunks. So it's likely that the previous user of that memory location stored 8 bytes of zeros there (e.g. one 64-bit integer zero), but the next location in memory had something different left in it. Hence your garbage characters.
Moral of the story: when using "C" strings, be very careful about your null terminators and buffer lengths!