The title is pretty self-explanatory.

char c = std::cin.peek(); // sets c equal to character in stream

I just realized that perhaps native type char can't hold the EOF.

thanks, nmr

+8  A: 

Short answer: No. Use int instead of char.

Slightly longer answer: No. If you can get either a character or the value EOF from a function, such as C's getchar and C++'s peek, clearly a normal char variable won't be enough to hold both all valid characters and the value EOF.

Even longer answer: It depends, but it will never work as you might hope.

C and C++ have three character types (not counting the "wide" types): char, signed char and unsigned char. Plain char can be signed or unsigned, and this varies between compilers.

The value EOF is a negative integer, usually -1, so clearly you can't store it in an unsigned char or in a plain char that is unsigned. Assuming that your system uses 8-bit characters (which nearly all do) and that EOF is -1, it will be converted to (decimal) 255, which no longer compares equal to EOF, and your program will not work.
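
To make the unsigned case concrete, here is a minimal sketch. The helper names (store_in_unsigned_char, still_equals_eof, demo_unsigned_char) are made up for illustration and are not part of any library.

```cpp
#include <cassert>
#include <cstdio>   // EOF

// Hypothetical helper: mimics `unsigned char c = getchar();` by squeezing
// the int result into an unsigned char.
inline unsigned char store_in_unsigned_char(int value) {
    return static_cast<unsigned char>(value);
}

// True if the stored value still compares equal to EOF. The stored value
// widens to a non-negative int and EOF is negative, so this is never true.
inline bool still_equals_eof(unsigned char stored) {
    int widened = stored;
    return widened == EOF;
}

inline void demo_unsigned_char() {
    unsigned char c = store_in_unsigned_char(EOF);
    assert(!still_equals_eof(c));   // the end-of-file marker has been lost
}
```

A loop such as while (c != EOF) built on an unsigned character type would therefore never see the end of the input.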

But if your char type is signed, or if you use the signed char type, then yes, you can store -1 in it, so yes, it can hold EOF. But what happens then when you read a character with code 255 from the file? It will be interpreted as -1, that is, EOF (assuming that your implementation uses -1). So your code will stop reading not just at the end of the file, but also as soon as it finds a 255 character.
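
The signed case can be sketched the same way. The helper names are again invented for illustration, and the collision shown assumes 8-bit characters, an EOF of -1, and the usual wrap-around when 255 is converted to signed char (implementation-defined before C++20).

```cpp
#include <cassert>
#include <cstdio>   // EOF

// Hypothetical helper: mimics `signed char c = getchar();` followed by the
// usual `c == EOF` test.
inline bool signed_char_equals_eof(int value_from_stream) {
    signed char c = static_cast<signed char>(value_from_stream);
    return c == EOF;
}

inline void demo_signed_char() {
    // The real end-of-file marker is recognised...
    assert(signed_char_equals_eof(EOF));
    // ...but, where EOF is -1, so is an ordinary character with code 255.
    if (EOF == -1) {
        assert(signed_char_equals_eof(255));
    }
    // Characters from the basic set are unaffected.
    assert(!signed_char_equals_eof('a'));
}
```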

Thomas Padron-McCarthy
@Thomas: nice answer!
RageZ
Depends on whether you open the file for reading as an ASCII file or a binary file. Though from memory I've never needed to open a file as ASCII, always as binary. This forgoes all the problems with EOF and what people define as EOF. Though my definition of EOF is not when the ASCII document declares EOF but when you reach the end of the file size.
Chad
@Chad: I think you're thinking of something else here. Whether you open a file as text or as binary doesn't change how the value EOF is stored in a char variable.
Thomas Padron-McCarthy
@Thomas: just one minor detail: EOF is always negative, and *usually* -1, but the standard allows other negative numbers.
Jerry Coffin
@Jerry: Thanks.
Thomas Padron-McCarthy
+4  A: 

Note that the return value of std::cin.peek() is actually of type std::basic_ios<char>::int_type, which is the same as std::char_traits<char>::int_type, which is an int and not a char.
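
If you want the compiler to confirm those types, a few static_asserts will do it. This is only a sketch; peek_returns_int is a made-up name used for illustration.

```cpp
#include <istream>      // std::istream
#include <string>       // std::char_traits
#include <type_traits>  // std::is_same
#include <utility>      // std::declval

// The stream's int_type is the traits' int_type...
static_assert(std::is_same<std::istream::int_type,
                           std::char_traits<char>::int_type>::value,
              "istream::int_type comes from char_traits<char>");

// ...which is int for char streams...
static_assert(std::is_same<std::char_traits<char>::int_type, int>::value,
              "char_traits<char>::int_type is int");

// ...and that is the type peek() hands back.
constexpr bool peek_returns_int() {
    return std::is_same<decltype(std::declval<std::istream&>().peek()),
                        int>::value;
}
static_assert(peek_returns_int(), "peek() returns int, not char");
```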

More important than that, the value returned in that int is not necessarily a simple cast from char to int but is the result of calling std::char_traits<char>::to_int_type on the next character in the stream or std::char_traits<char>::eof() (which is defined to be EOF) if there is no character.

Typically, this is implemented in the same way as fgetc: the character is cast to an unsigned char and then to an int for the return value, so that you can distinguish all valid character values from EOF.
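
A small sketch of that behaviour, using a std::istringstream in place of std::cin so it can be run without input. first_peek and demo_first_peek are hypothetical names, and the checks describe typical implementations rather than anything specific to the question.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Hypothetical helper: what does peek() report for the first character of
// `data`?
inline int first_peek(const std::string& data) {
    std::istringstream in(data);
    return in.peek();
}

inline void demo_first_peek() {
    typedef std::char_traits<char> traits;
    const std::string high(1, '\xff');
    // An ordinary character 0xFF comes back as to_int_type('\xff')...
    assert(first_peek(high) == traits::to_int_type('\xff'));
    // ...which is distinct from eof().
    assert(first_peek(high) != traits::eof());
    // Only an exhausted stream yields eof().
    assert(first_peek(std::string()) == traits::eof());
}
```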

If you store the return value of std::cin.peek() in a char then there is the possibility that reading a character with a positive value (say ÿ in an ISO-8859-1 encoded file) will compare equal to EOF.

The pedantic thing to do would be:

typedef std::istream::traits_type traits_type;

traits_type::int_type ch;
traits_type::char_type c;

while (!traits_type::eq_int_type((ch = std::cin.peek()), traits_type::eof()))
{
    c = traits_type::to_char_type(ch);
    // ... (consume the character here, e.g. with std::cin.get(),
    //      or the loop never advances)
}

This would probably be more usual:

int ch;
char c;

while ((ch = std::cin.peek()) != EOF)
{
    c = std::istream::traits_type::to_char_type(ch);
    // ... (consume the character here, e.g. with std::cin.get(),
    //      or the loop never advances)
}
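
For something you can actually run, here is a sketch of the same loop shape wrapped in a function. count_chars and demo_count_chars are invented names, and the std::istringstream only stands in for std::cin. Note the explicit get(): peek() looks at the next character without removing it.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>    // EOF
#include <istream>
#include <sstream>
#include <string>

// Hypothetical helper: counts the characters left in `in`.
inline std::size_t count_chars(std::istream& in) {
    std::size_t n = 0;
    int ch;                              // int, so EOF stays distinguishable
    while ((ch = in.peek()) != EOF) {
        in.get();                        // consume what peek() only looked at
        ++n;
    }
    return n;
}

inline void demo_count_chars() {
    // Three characters, the middle one being 0xFF. The literal is split so
    // that 'b' is not read as part of the hex escape.
    std::istringstream in(std::string("a\xff") + "b");
    assert(count_chars(in) == 3);        // 0xFF does not end the loop early
}
```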

Note that it is important to convert the character value correctly. If you perform a comparison like this: if (ch == '\xff') ... where ch is an int as above, you may not get the correct results. You need to use std::char_traits<char>::to_char_type on ch or std::char_traits<char>::to_int_type on the character constant to get a consistent result. (You are usually safe with members of the basic character set, though.)
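
A sketch of the consistent comparison, with made-up helper names (same_char, demo_same_char):

```cpp
#include <cassert>
#include <string>   // std::char_traits

typedef std::char_traits<char> traits;

// Hypothetical helper: does the int obtained from peek()/get() denote the
// character `c`? Both sides go through the traits, so the signedness of
// plain char does not matter.
inline bool same_char(int ch, char c) {
    return traits::eq_int_type(ch, traits::to_int_type(c));
}

inline void demo_same_char() {
    int ch = traits::to_int_type('\xff');        // what a stream reports for 0xFF
    assert(same_char(ch, '\xff'));               // int-side comparison
    assert(traits::to_char_type(ch) == '\xff');  // char-side comparison
    assert(!same_char(traits::eof(), '\xff'));   // eof() is not a character
}
```

Either direction works as long as both operands end up in the same representation before they are compared.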

Charles Bailey
I appreciated this answer but it was a little more verbose than I was looking for and somewhat confused me.
nmr
Can you point to bits that could usefully be cut out or things that I can clarify? The goal of SO is to collaboratively get to the 'best' answers so any help improving is appreciated.
Charles Bailey
I'm not questioning the lucidity of the statements; I was more so referring to my own ignorance of the language. I truly am just not familiar enough with C++ for your answer to click with me immediately. I guess I should have clarified at the beginning of my question that I wasn't concerned as much with portability.
nmr
I think this is a good answer, and (for C++) more correct in the details than mine. But it does get a bit complicated, because C++ is a complicated language. I mean, std::char_traits<char>::to_char_type!
Thomas Padron-McCarthy