Yesterday I discovered an odd bug in rather simple code that basically gets text from an ifstream and tokenizes it. The code that actually fails does a number of get()/peek() calls looking for the token "/*". If the token is found in the stream, unget() is called so the next method sees the stream starting with the token.
Sometimes, seemingly depending only on the length of the file, the unget() call fails. Internally it calls pbackfail(), which then returns EOF. However, after clearing the stream state I can happily read more characters, so the stream is not actually at EOF.
After digging in, here's the full code that easily reproduces the problem:
#include <iostream>
#include <fstream>
#include <string>

// generate simplest string possible that triggers the problem
void GenerateTestString( std::string& s, const size_t nSpacesToInsert )
{
    s.clear();
    for( size_t i = 0 ; i < nSpacesToInsert ; ++i )
        s += " ";
    s += "/*";
}

// write string to file, then open the same file again in ifs
bool WriteTestFileThenOpenIt( const char* sFile, const std::string& s, std::ifstream& ifs )
{
    {
        std::ofstream ofs( sFile );
        if( ( ofs << s ).fail() )
            return false;
    }
    ifs.open( sFile );
    return ifs.good();
}

// find token, unget if found, report error, show extra data can be read even after error
bool Run( std::istream& ifs )
{
    bool bSuccess = true;
    for( ; ; )
    {
        int x = ifs.get();
        if( ifs.fail() )
            break;
        if( x == '/' )
        {
            x = ifs.peek();
            if( x == '*' )
            {
                ifs.unget();
                if( ifs.fail() )
                {
                    std::cout << "oops.. unget() failed" << std::endl;
                    bSuccess = false;
                }
                else
                {
                    x = ifs.get();
                }
            }
        }
    }
    if( !bSuccess )
    {
        ifs.clear();
        std::string sNext;
        ifs >> sNext;
        if( !sNext.empty() )
            std::cout << "remaining data after unget: '" << sNext << "'" << std::endl;
    }
    return bSuccess;
}

int main()
{
    std::string s;
    const char* testFile = "tmp.txt";
    for( size_t i = 0 ; i < 12290 ; ++i )
    {
        GenerateTestString( s, i );
        std::ifstream ifs;
        if( !WriteTestFileThenOpenIt( testFile, s, ifs ) )
        {
            std::cout << "file I/O error, aborting..";
            break;
        }
        if( !Run( ifs ) )
            std::cout << "** failed for string length = " << s.length() << std::endl;
    }
    return 0;
}
The program fails when the string length gets near multiples of the typical 4096-byte buffer size: 4096, 8192, 12288. Note that each failing length is exactly one past such a boundary, so the '/' is the last character of one buffer fill and the '*' is the first character of the next. Here's the output:
oops.. unget() failed
remaining data after unget: '*'
** failed for string length = 4097
oops.. unget() failed
remaining data after unget: '*'
** failed for string length = 8193
oops.. unget() failed
remaining data after unget: '*'
** failed for string length = 12289
This happens when tested on Windows XP and Windows 7, compiled in both debug and release mode, with both the dynamic and static runtime, on both 32-bit and 64-bit systems/compiles, all with VS2008 and default compiler/linker options. No problem was found when testing with gcc 4.4.5 on a 64-bit Debian system.
Questions:
- can other people please test this? I would really appreciate some active collaboration from SO.
- is there anything incorrect in the code that could cause the problem (leaving aside whether the code makes sense)?
- or any compiler flags that might trigger this behaviour?
- all parser code is rather critical for the application and is tested heavily, but of course this problem was not caught by the test code. Should I come up with extreme test cases, and if so, how do I construct them? How could I ever have predicted that this would cause a problem?
- if this really is a bug, where do I best report it?