There's a ton of information available on overloading operator<< to mimic a toString()-style method that converts a complex object to a string. I'm interested in also implementing the inverse, operator>> to deserialize a string into an object.

By inspecting the STL source, I've gathered that:

istream &operator>>(istream &, Object &);

would be the correct function signature for deserializing an object of type Object. Unfortunately, I have been at a loss for how to properly implement this - specifically how to handle errors:

  1. How to indicate invalid data in the stream? Throw an exception?
  2. What state should the stream be in if there is malformed data in the stream?
  3. Should any flags be reset before returning the reference for operator chaining?
+8  A: 
  1. How to indicate invalid data in the stream? Throw an exception?

You should set the fail bit. If the user of the stream wants exceptions to be thrown, they can configure the stream (using istream::exceptions), and the stream will throw accordingly. I would do it like this:

stream.setstate(stream.rdstate() | ios_base::failbit);
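
To illustrate the caller's side of this, here is a minimal sketch of opting in to exceptions with istream::exceptions. The helper name `extraction_throws` is hypothetical and only exists for the example; the extractor used is the built-in one for int.

```cpp
#include <ios>
#include <sstream>
#include <string>

// Hypothetical helper: returns true if extracting an int from `text`
// throws because the caller asked for exceptions on failbit.
bool extraction_throws(const std::string& text) {
    std::istringstream in(text);
    in.exceptions(std::ios_base::failbit);  // the caller's choice, not the extractor's
    try {
        int value = 0;
        in >> value;  // the extractor only sets failbit; the stream does the throwing
        return false;
    } catch (const std::ios_base::failure&) {
        return true;
    }
}
```

The extractor itself never needs a throw statement: it sets the fail bit, and the stream decides whether that becomes an exception.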
  2. What state should the stream be in if there is malformed data in the stream?

For malformed data that doesn't fit the format you want to read, you usually should set the fail bit. For internal stream-specific errors, the bad bit is used (for example, if there is no buffer connected to the stream).

  3. Should any flags be reset before returning the reference for operator chaining?

I haven't heard of such a thing.


For checking whether the stream is in a good state, you can use the istream::sentry class. Create an object of it, passing the stream and true (to tell it not to skip whitespace immediately). The sentry will evaluate to false if the eof, fail or bad bit is set.

istream::sentry s(stream, true);
if(!s) return stream;
// now, go on extracting data...
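
Putting these pieces together, a complete extractor might look roughly like the sketch below. The `Point` type and its "(x,y)" text format are assumptions made up for the example. It also uses the default sentry (which skips leading whitespace) and leans on the built-in extractors for brevity.

```cpp
#include <istream>
#include <sstream>

// Hypothetical type used only for illustration.
struct Point {
    int x = 0;
    int y = 0;
};

// Reads the assumed text format "(x,y)".
std::istream& operator>>(std::istream& in, Point& p) {
    std::istream::sentry ok(in);  // default sentry: skips leading whitespace
    if (!ok) return in;           // stream was not good; nothing to read

    Point tmp;  // parse into a temporary so p is untouched on failure
    char open = 0, comma = 0, close = 0;
    if (in >> open >> tmp.x >> comma >> tmp.y >> close) {
        if (open == '(' && comma == ',' && close == ')') {
            p = tmp;
        } else {
            // Wrong punctuation: malformed input, so fail bit, not bad bit.
            in.setstate(std::ios_base::failbit);
        }
    }
    // If the chained reads failed, they have already set the fail bit.
    return in;
}
```

Because the stream is returned in every path, `in >> a >> b` chains as usual, and a failure in the first extraction makes the second one a no-op.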
Johannes Schaub - litb
Also make sure you check the `fail` bit before you try doing anything. If it's set already, just return the stream.
KTC
Thanks for the advice, especially using the `fail` bit instead of exceptions. In addition to setting the fail bit, must I make any guarantees about the contents of the stream? E.g., should the stream be unchanged if I set the `fail` bit?
Mike Koval
That's essentially what I was going to say, but you answered faster! I'd add that the right answer is found by looking up what the existing implementations do, which is what you're describing. Also, I would note that there is no such thing as malformed data, so much as a wrong format to read it; in that case, you want to make sure the variable is unchanged, and (if you set the fail bit and not the bad bit) that no characters have been lost from the stream.
Brooks Moses
@KTC, thanks, good point. Updated.
Johannes Schaub - litb
Not consuming any characters from the stream for badly formatted input is nearly impossible. You'd need arbitrary backtracking. The common behavior is to consume the valid prefix.
AProgrammer
How would you recommend handling situations where the prefix is valid, but the actual data is not?
Mike Koval
In the operator>>, do what is done for the simple types (at first thought, floating point is the only one where this comes up): return a failure and consume the valid prefix. In user code, you can provide better diagnostics to your user by reading a full line and then parsing it from a stringstream. But at some point, moving to a full lexer/parser designed for error reporting is the right thing to do.
AProgrammer
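
A rough sketch of that read-a-line-then-parse approach, assuming for illustration that each line should hold two integers. The function name `check_lines` and the message wording are made up for the example.

```cpp
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: expects two integers per line and collects one
// message per line that does not parse, instead of stopping at the first.
std::vector<std::string> check_lines(std::istream& in) {
    std::vector<std::string> errors;
    std::string line;
    for (int number = 1; std::getline(in, line); ++number) {
        std::istringstream fields(line);  // failure stays local to this line
        int a = 0, b = 0;
        if (!(fields >> a >> b)) {
            errors.push_back("line " + std::to_string(number) +
                             ": expected two integers, got \"" + line + "\"");
        }
    }
    return errors;
}
```

Since each line gets its own stringstream, a bad line cannot leave the main stream in a failed state or out of sync with the input.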
I would just consume the valid prefix and leave it at that. If you need tentative parsing without actually consuming anything, then *I* would not use iostreams anymore. They are just too inflexible for this. I would use C streams or the low-level `read` interface of iostreams, and create my own read-ahead buffer.
Johannes Schaub - litb
A: 

As for flags, I don't know if there is any standard somewhere, but it is a good idea to reset them.

Boost has neat RAII wrappers for that: IO State Savers
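
To show the idea without depending on Boost, here is a hand-rolled stand-in. `FlagsGuard` and `read_hex` are hypothetical names and this is not Boost's actual class; it only saves and restores the format flags.

```cpp
#include <ios>
#include <istream>
#include <sstream>

// Hand-rolled illustration of the RAII idea; not Boost's API.
class FlagsGuard {
public:
    explicit FlagsGuard(std::ios_base& s) : stream_(s), saved_(s.flags()) {}
    ~FlagsGuard() { stream_.flags(saved_); }  // restore on every exit path
    FlagsGuard(const FlagsGuard&) = delete;
    FlagsGuard& operator=(const FlagsGuard&) = delete;

private:
    std::ios_base& stream_;
    std::ios_base::fmtflags saved_;
};

// Hypothetical helper: reads a hexadecimal number without leaving
// std::hex set on the caller's stream.
std::istream& read_hex(std::istream& in, unsigned& value) {
    FlagsGuard guard(in);
    in >> std::hex >> value;
    return in;
}
```

After `read_hex` returns, a following `in >> n` parses decimal again, because the guard's destructor has put the original flags back.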

Eugene
I don't think this is designed to be used in extractors but by user code.
AProgrammer
I don't see the difference. They work on input streams as well, so if you want to set the flags back to what they were, you can, and it's better than doing it manually anyway. An extractor that affects the stream in weird ways is essentially broken.
Eugene
+2  A: 

Some additional notes:

  • when implementing the operator>>, you should probably consider using the streambuf directly rather than other overloads of operator>>;

  • exceptions occurring during the operation should be translated to the failbit or the badbit (members of streambuf may throw, depending on the class used);

  • setting the state may throw; if you set the state after catching an exception, you should propagate the original exception and not the one thrown by setstate;

  • the width is a field you should pay attention to. If you take it into account, you should reset it to 0 afterwards. If you use other operator>> overloads to do the basic work, you have to compute the width you pass on from the one you received;

  • consider taking the locale into account.
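
The first four notes can be sketched as follows. This is only an illustration under assumptions: the `Digits` type is hypothetical, the width is treated as a maximum character count (as the std::string extractor does), and the locale is ignored. It is not the template from the book mentioned below.

```cpp
#include <ios>
#include <istream>
#include <sstream>
#include <streambuf>
#include <string>

// Hypothetical type: a run of decimal digits kept as text.
struct Digits {
    std::string text;
};

std::istream& operator>>(std::istream& in, Digits& d) {
    typedef std::istream::traits_type traits;
    std::ios_base::iostate err = std::ios_base::goodbit;
    try {
        std::istream::sentry ok(in);  // skips leading whitespace
        if (ok) {
            const std::streamsize limit = in.width();  // 0 means "no limit"
            std::string tmp;
            std::streambuf* buf = in.rdbuf();
            traits::int_type c = buf->sgetc();  // peek without consuming
            while (!traits::eq_int_type(c, traits::eof()) &&
                   traits::to_char_type(c) >= '0' &&
                   traits::to_char_type(c) <= '9' &&
                   (limit <= 0 ||
                    static_cast<std::streamsize>(tmp.size()) < limit)) {
                tmp += traits::to_char_type(c);
                c = buf->snextc();  // consume this char, peek at the next
            }
            if (traits::eq_int_type(c, traits::eof())) err |= std::ios_base::eofbit;
            if (tmp.empty()) {
                err |= std::ios_base::failbit;  // no digits: wrong format
            } else {
                d.text = tmp;
            }
            in.width(0);  // the width was honoured, so reset it
        }
    } catch (...) {
        // Translate any streambuf exception into badbit. If setting the
        // state throws as well, rethrow the original exception instead.
        bool rethrow = false;
        try {
            in.setstate(std::ios_base::badbit);
        } catch (const std::ios_base::failure&) {
            rethrow = true;
        }
        if (rethrow) throw;
    }
    if (err != std::ios_base::goodbit) in.setstate(err);
    return in;
}
```

The first non-digit character is only peeked at, never consumed, so the valid prefix is taken and the rest of the input stays in the stream for whoever reads next.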

Langer and Kreft (Standard C++ IOStreams and Locales) cover this in even more detail. They give template code for the error handling which takes about one page.

AProgrammer
Why do you suggest using `streambuf` instead of other overloads (e.g. `istream`)? I'll be sure to check out Langer and Kreft's work - it sounds like just what I'm looking for.
Mike Koval