tags:

views:

32

answers:

1

I'd like to parse formatted basic values and a few custom strings from a TextReader - essentially like scanf allows.

  • My input might not have line-breaks, so ReadLine+Regex isn't an option. I could use some other way of chunking text input; but the problem is that I don't know the delimiter at compile time (so that's tricky), and that that delimiter might be localization-dependant. For instance, a float followed by a comma might be "1.5," or "1,5," but in both cases attempting to parse the float should be "greedy".
  • To be safe, I'd like to assume my input is actively hostile (say, streaming in from a network stream): i.e. intentionally missing chunking delimiters.
  • I'd like to avoid custom Regex's: int.Parse and double.Parse work well and are localization-aware. Don't get me started on DateTime's - I might need a few custom patterns anyhow, but writing Regexes to cover that scenario doesn't sound like fun.

For a concrete example, let's say I have a TextReader and that I know the next value should be a double - how can I extract that double and possibly a limited amount of lookahead without reading the entire stream and without manually writing a localizable double-parser?

Similar Questions

There's a previous question "Looking for C# equivalent of scanf" which sounds similar but the Q+A focus on readline+regex (which I'd like to avoid). How can I use Regex against a TextReader? didn't find an answer (beyond chunking), and in any case I'd like to avoid writing my own Regexes.

A: 

Based on that lack of answers and still not having found anything myself, it seems that

  • There is no means to use localized parsing directly from Streams (or TextReaders) in .NET, nor is there a way to know how much of the stream corresponds to a parseable prefix in a systematic way.
  • There is no means to apply regular expressions to Streams (or TextReaders) in .NET, so there's no easy way of implementing something like this yourself.
  • If you really need something like this, the easiest option is a full-fledged parser generator. ANTLR works well for this; it has a lot of existing grammars you can copy-paste for the basics, and it comes with a GUI to help understand your grammar and makes parsers for .NET, java, C and a host of other languages. It's developer friendly, fast... ...but way too powerful and flexible for what I need; like shooting a bug with a shotgun - I'm not thrilled with this solution.
Eamon Nerbonne