I would suggest using a lexer and parser for doing this, either the lex/yacc combo or flex/bison.
You basically write code in a .l and a .y file to describe the layout, and the lexer/parser generator creates C code that will process the file for you, calling functions to deliver the data to you.
Lexical analysis and parsing are a pain to do unless you're well versed in the art. Tools like those I've mentioned make the job a lot easier.
In the lexer, you get it to recognise lexical elements such as:
e_account (account)
e_openbrace ({)
e_name (name)
e_string ("[^"]*")
e_semicolon (;)
and so on.
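To make the token list above concrete, here is a minimal hand-written sketch in plain C of the kind of classification a lexer performs. It is not what lex or flex would generate, and it is only meant as an illustration. The names token_kind and next_token are hypothetical, and e_equals, e_closebrace, e_eof and e_error are assumed members of the "and so on" part of the list.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Token kinds named after the lexical elements listed above.
   e_equals, e_closebrace, e_eof and e_error are assumptions. */
typedef enum {
    e_account, e_openbrace, e_closebrace, e_name, e_equals,
    e_string, e_semicolon, e_eof, e_error
} token_kind;

/* Hypothetical helper: classify the token starting at *src and
   advance *src just past it. */
token_kind next_token(const char **src)
{
    assert(src != NULL && *src != NULL);
    const char *p = *src;
    token_kind kind;

    /* Skip whitespace between tokens. */
    while (isspace((unsigned char)*p))
        p++;

    if (*p == '\0') {
        kind = e_eof;
    } else if (*p == '{') {
        kind = e_openbrace; p++;
    } else if (*p == '}') {
        kind = e_closebrace; p++;
    } else if (*p == '=') {
        kind = e_equals; p++;
    } else if (*p == ';') {
        kind = e_semicolon; p++;
    } else if (*p == '"') {
        /* Same shape as the pattern "[^"]*": a quote, any
           non-quote characters, then a closing quote. */
        p++;
        while (*p != '\0' && *p != '"')
            p++;
        if (*p == '"') { kind = e_string; p++; }
        else           { kind = e_error; }   /* unterminated string */
    } else if (isalpha((unsigned char)*p)) {
        /* Scan a word, then check it against the known keywords. */
        const char *start = p;
        while (isalpha((unsigned char)*p) || *p == '_')
            p++;
        size_t len = (size_t)(p - start);
        if (len == 7 && strncmp(start, "account", 7) == 0)
            kind = e_account;
        else if (len == 4 && strncmp(start, "name", 4) == 0)
            kind = e_name;
        else
            kind = e_error;                  /* unknown word */
    } else {
        kind = e_error; p++;                 /* unexpected character */
    }

    *src = p;
    return kind;
}
```

Calling next_token repeatedly on a pointer to the input text hands back one token kind per call until e_eof. With lex or flex you would instead write one pattern per element in the .l file and let the tool generate the scanning code.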
The lexer is used by the parser to detect the lexical elements, and the parser has the higher-level rules for deciding which constructs are valid. For example, an account section is e_account, e_openbrace, zero or more of e_stanza, then finally e_closebrace. Likewise, an e_stanza is (among others) e_name, e_equals, e_string, then e_semicolon.
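The two rules just described can be sketched as a tiny hand-written recursive-descent check in C. This is a swapped-in technique for illustration only: yacc and bison generate a table-driven parser from the .y file rather than code like this. The names parser, match_token, parse_stanza and parse_account are hypothetical, the input is assumed to be an already-tokenised array ending in an assumed e_eof marker, and only the one stanza form mentioned above is handled.

```c
#include <assert.h>
#include <stddef.h>

/* Token kinds as delivered by the lexer; e_eof is an assumed
   end-of-input marker. */
typedef enum {
    e_account, e_openbrace, e_closebrace, e_name, e_equals,
    e_string, e_semicolon, e_eof
} token_kind;

/* Parser state: the token array and the current position in it. */
typedef struct {
    const token_kind *toks;
    size_t pos;
} parser;

/* Consume the current token if it has the expected kind. */
static int match_token(parser *ps, token_kind kind)
{
    if (ps->toks[ps->pos] == kind) {
        ps->pos++;
        return 1;
    }
    return 0;
}

/* stanza: e_name e_equals e_string e_semicolon
   (the source says there are other stanza forms; they are omitted) */
static int parse_stanza(parser *ps)
{
    return match_token(ps, e_name)
        && match_token(ps, e_equals)
        && match_token(ps, e_string)
        && match_token(ps, e_semicolon);
}

/* account: e_account e_openbrace stanza* e_closebrace
   Returns 1 if the whole token array is one valid account section,
   0 otherwise. The array must be terminated by e_eof. */
int parse_account(const token_kind *toks)
{
    assert(toks != NULL);
    parser ps = { toks, 0 };

    if (!match_token(&ps, e_account) || !match_token(&ps, e_openbrace))
        return 0;

    /* Zero or more stanzas: keep going while one appears to start. */
    while (ps.toks[ps.pos] == e_name) {
        if (!parse_stanza(&ps))
            return 0;
    }

    return match_token(&ps, e_closebrace) && match_token(&ps, e_eof);
}
```

In a .y file each of these functions would collapse to a single grammar rule, with the action code attached to the rule being where your own functions get called with the data.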
Most of the intelligence is under the covers (and the generated code is pretty ugly, at least for lex/yacc), but it's better than trying to write it yourself :-)