views:

52

answers:

2

Hello,

Lately I've realized that one must be good at handling (parsing) text. The task can range from something as simple as interpreting an HTTP response or reading a settings file (*.ini, *.xml, or *.json) to something as hard as writing a compiler or a regex engine.

I agree that we now have library functions/methods for interpreting the popular text formats. But relying on those functions makes me feel that something is missing. I don't know exactly what, but I'm definitely losing confidence by using a library function for everything.

In order to build up some confidence I want to try some text processing in C.

Can anyone suggest a good intermediate-level project? Suggestions for a useful project that is a little more complex are also appreciated.

+1  A: 

Not too hard, but you could implement a nice CSV parser?
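To give a feel for the size of the job, here is a minimal sketch of the core of such a parser: splitting one record into fields. It assumes the common convention where a field may be wrapped in double quotes and a doubled quote ("") inside a quoted field stands for a literal quote. The name csv_split and its signature are made up for illustration, not taken from any library.

```c
#include <stddef.h>

/* Split one CSV record in place (hypothetical helper, illustrative only).
 * Quoted fields may contain commas; "" inside quotes means a literal quote.
 * Stores up to max_fields pointers into `line` and returns the field count.
 * The buffer is modified: separators are overwritten with '\0'. */
size_t csv_split(char *line, char **fields, size_t max_fields)
{
    size_t count = 0;
    char *src = line;

    while (count < max_fields) {
        char *dst = src;                /* write position, never ahead of src */
        int more;

        fields[count++] = dst;

        if (*src == '"') {
            src++;                      /* skip the opening quote */
            while (*src != '\0') {
                if (*src == '"' && src[1] == '"') {
                    *dst++ = '"';       /* doubled quote -> one literal quote */
                    src += 2;
                } else if (*src == '"') {
                    src++;              /* closing quote */
                    break;
                } else {
                    *dst++ = *src++;
                }
            }
        }

        /* Unquoted text, up to the next separator or end of record. */
        while (*src != '\0' && *src != ',' && *src != '\n' && *src != '\r')
            *dst++ = *src++;

        more = (*src == ',');           /* read before dst may overwrite it */
        *dst = '\0';
        if (!more)
            break;
        src++;                          /* step over the comma */
    }
    return count;
}
```

A full parser would still need to read records from a file (including quoted fields that span several lines), report malformed input, and grow its buffers, which is where most of the learning is.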

k_b
+1  A: 

Beginner-level but potentially useful projects:

  • Given a text file that contains C-style comments (/* ... */), write a processor that strips comments from the file.
    • Extend this to handle nested comments.
  • Try parsing a C-style string literal, handling the backslash escape sequences (\n, \t, \\, and so on).
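As a rough starting point for the comment stripper, the sketch below works on an in-memory string rather than a file and keeps a depth counter so the same loop covers both the plain and the nested variant. The function name strip_comments and the nested flag are assumptions made for this example. It deliberately ignores string literals and // comments, so a /* inside a quoted string would be treated as a comment; handling that is a natural next step.

```c
#include <stddef.h>

/* Copy `in` to `out`, leaving out C-style comments (illustrative sketch).
 * If `nested` is nonzero, an opener inside a comment increases the depth,
 * so each opener needs its own closer. `out` must have room for
 * strlen(in) + 1 bytes. Returns the number of characters written. */
size_t strip_comments(const char *in, char *out, int nested)
{
    size_t n = 0;
    int depth = 0;      /* 0 means we are outside any comment */

    while (*in != '\0') {
        if (in[0] == '/' && in[1] == '*' && (depth == 0 || nested)) {
            depth++;                /* comment opener */
            in += 2;
        } else if (depth > 0 && in[0] == '*' && in[1] == '/') {
            depth--;                /* comment closer */
            in += 2;
        } else {
            if (depth == 0)
                out[n++] = *in;     /* ordinary text: keep it */
            in++;                   /* comment body: skip it */
        }
    }
    out[n] = '\0';
    return n;
}
```

With nested set to 0 this behaves like a C compiler would: the first closer ends the comment, and anything after it (including a stray closer) is copied through as text.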

For a more intermediate project, think about a functional domain you're interested in, and try your hand at writing a simple domain-specific language for it. Work on just the front-end part of parsing the language, and tackle small parts of the language at a time.
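One small part to tackle first is the tokenizer. The sketch below assumes a made-up language with only integers, identifiers, and single-character punctuation; the Token type, the TOK_* names, and next_token are all hypothetical and would change with whatever language you design.

```c
#include <ctype.h>
#include <stddef.h>

typedef enum { TOK_NUMBER, TOK_IDENT, TOK_PUNCT, TOK_END } TokenKind;

typedef struct {
    TokenKind kind;
    const char *start;  /* points into the source text, not NUL-terminated */
    size_t length;      /* number of characters in the token */
} Token;

/* Read the next token starting at *cursor and advance *cursor past it.
 * Returns a TOK_END token of length 0 once the input is exhausted. */
Token next_token(const char **cursor)
{
    const char *p = *cursor;
    Token tok;

    while (isspace((unsigned char)*p))      /* skip leading whitespace */
        p++;

    tok.start = p;
    if (*p == '\0') {
        tok.kind = TOK_END;
    } else if (isdigit((unsigned char)*p)) {
        tok.kind = TOK_NUMBER;              /* run of decimal digits */
        while (isdigit((unsigned char)*p))
            p++;
    } else if (isalpha((unsigned char)*p) || *p == '_') {
        tok.kind = TOK_IDENT;               /* letter or _, then letters, digits, _ */
        while (isalnum((unsigned char)*p) || *p == '_')
            p++;
    } else {
        tok.kind = TOK_PUNCT;               /* any other single character */
        p++;
    }
    tok.length = (size_t)(p - tok.start);
    *cursor = p;
    return tok;
}
```

Calling next_token in a loop until it returns TOK_END turns a line such as "rate = 42 + x_1" into five tokens, which a hand-written parser can then consume one at a time.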

I think you'll quickly find that, for more advanced text processing, you'll want to start looking at libraries and tools that help with the parsing. This could lead quite nicely into studying regular expressions, lex/yacc, ANTLR, and maybe even Haskell/Parsec if you really get into this kind of thing. Either way, you won't just be relying on other people's text processors anymore.

Hope this helps!

Owen S.