




What is the best method to implement a system for parsing a configuration file based on a set of rules? I would appreciate any pointers in the direction of best practices or existing implementations.

Edit: I have not decided on a specific language yet, but I am comfortable with both Perl and Python. The files are along the lines of router/switch configuration files with different functional sections.

+2  A: 

Assuming that this is not an XML based configuration file, may I recommend ANTLR?

  • Generates parser code from an EBNF-style grammar rules file that you provide.
  • Has a graphical editor for the rules file as an Eclipse plugin.
  • Very strong and sound parser technology.
  • Flexible in terms of what you want to do with the parsed output
  • Runtime environments permit parsing with ANTLR in C++, C, C#, Java, Python, and Ruby applications.
For XML, see my answer about ANTXR below -- it's based on ANTLR.
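
To make "parsing based on a set of rules" concrete for the router/switch case, here is a minimal sketch in plain Python. It is not ANTLR and not generated code, just a hand-written stand-in that shows a rule table driving the parse. The config layout (unindented section headers, indented body lines, "!" separators), the SECTION_RULES table, and the parse_config name are all assumptions made for illustration, not anything taken from a real device or library.

```python
import re

# Hypothetical rule table: each rule pairs a section-header pattern with a kind.
SECTION_RULES = [
    (re.compile(r"^interface\s+(?P<name>\S+)$"), "interface"),
    (re.compile(r"^router\s+(?P<name>.+)$"), "router"),
]

# Assumed sample input in a router-like style.
SAMPLE = """\
hostname edge-1
!
interface GigabitEthernet0/1
 description uplink
 ip address 10.0.0.1 255.255.255.0
!
router ospf 1
 network 10.0.0.0 0.0.0.255 area 0
"""


def parse_config(text):
    """Return (top_level_lines, sections) for an indentation-based config."""
    top_level, sections, current = [], [], None
    for raw in text.splitlines():
        line = raw.rstrip()
        # Skip blank lines and "!" separator/comment lines.
        if not line.strip() or line.lstrip().startswith("!"):
            continue
        # Indented lines belong to the most recently opened section, if any.
        if line[0].isspace():
            if current is not None:
                current["lines"].append(line.strip())
            else:
                top_level.append(line.strip())
            continue
        # An unindented line closes the open section and is tried against each rule.
        current = None
        for pattern, kind in SECTION_RULES:
            match = pattern.match(line)
            if match:
                current = {"kind": kind, "name": match.group("name"), "lines": []}
                sections.append(current)
                break
        else:
            # No rule matched: keep it as a global setting.
            top_level.append(line)
    return top_level, sections


if __name__ == "__main__":
    top, secs = parse_config(SAMPLE)
    print(top)
    for section in secs:
        print(section["kind"], section["name"], section["lines"])
```

Once the sections nest more deeply or the syntax gets irregular, a table of regexes like this becomes hard to maintain, and that is where a grammar-driven tool such as ANTLR is likely the better fit.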
Scott Stanchfield

I often use YAML for config files; it's lightweight, and there are a ton of libraries supporting it in different languages.


Adam Pope

If you're thinking of XML and using Java, you can try my XML parser generator, ANTXR, which is based on ANTLR 2.7.x.

See http://javadude.com/tools/antxr/index.html for details

An example:

XML File:

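<?xml version="1.0"?>
<people>
  <person ssn="111-11-1111"><firstName>...</firstName><lastName>...</lastName></person>
  <person ssn="222-22-2222"><firstName>...</firstName><lastName>...</lastName></person>
  <person ssn="333-33-3333"><firstName>...</firstName><lastName>...</lastName></person>
</people>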

A parser skeleton:

header {
package com.javadude.antlr.sample.xml;
}

class PeopleParser extends Parser;

document
  : <people> EOF;

<people>
  : (<person>)*;

<person>
  : ( <firstName>
    | <lastName>
    )*;

<firstName>
  : PCDATA;

<lastName>
  : PCDATA;

A parser that actually does something with the data:

header {
package com.javadude.antlr.sample.xml;

import java.util.List;
import java.util.ArrayList;
}

class PeopleParser extends Parser;

document returns [List results = null]
  : results=<people> EOF
  ;

<people> returns [List results = new ArrayList()]
  { Person p; }
  : ( p=<person>  { results.add(p); }   )*
  ;

<person> returns [Person p = new Person()]
  { String first, last; }
  : ( first=<firstName>  { p.setFirstName(first); }
    | last=<lastName>    { p.setLastName(last);   }
    )*
  ;

<firstName> returns [String value = null]
  : pcdata:PCDATA { value = pcdata.getText(); }
  ;

<lastName> returns [String value = null]
  : pcdata:PCDATA { value = pcdata.getText(); }
  ;

I've been using this for years, and when I've shown it to folks at work, after the initial "getting used to a grammar" learning curve, they really love it.

Note that you can use a SAX or XMLPull front-end (and the front-end can do validation if you like). The code to run the parser looks like this:

// Create our scanner (using a simple SAX parser setup)
BasicCrimsonXMLTokenStream stream =
    new BasicCrimsonXMLTokenStream(new FileReader("people.xml"),
                                   PeopleParser.class, false, false);

// Create our ANTLR parser
PeopleParser peopleParser = new PeopleParser(stream);

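// parse the document (the start rule "document" returns the List of Person objects)
List results = peopleParser.document();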
Scott Stanchfield