We have a large number of legacy configuration files in various formats, typically something like KEYWORD DATA KEYWORD DATA KEYWORD DATA.

The format of the data itself is unique within each configuration file.

What we would like to do is define the file data formats in some way and then use those definitions to let an application check the configuration files against the defined file formats.

We have thought about defining them as BNF and using YACC or its equivalent, but the nagging feeling is that there must be a way of doing this using XML.

What we need is a way of defining a configuration file's data format, preferably in XML, and then using that definition to convert the legacy file into valid XML. A way of converting the XML file back to the legacy file format would also be useful.

A: 

Have a look at the Altova tools, especially MapForce. AFAIR they can convert from/to user-defined file formats, and the mapping can be done quite naturally on screen. (The Altova tools can also generate an XSD to check against.)

Leonidas
+1  A: 

For the conversion XML->legacy file, XSLT would probably work fine.
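
If XSLT is not at hand, the same reverse conversion can be sketched in a few lines of Python. This is only an illustration, not XSLT: it assumes the XML is flat, with one child element per keyword under a single root, and the function name `xml_to_legacy` and the sample keywords are made up for the example.

```python
import xml.etree.ElementTree as ET

def xml_to_legacy(xml_text: str) -> str:
    """Turn <root><KEY>data</KEY>...</root> back into 'KEY data' lines.

    Assumes a flat document: every child of the root is one keyword.
    """
    root = ET.fromstring(xml_text)
    lines = []
    for child in root:
        data = (child.text or "").strip()
        # rstrip() avoids a trailing space when a keyword has no data
        lines.append(f"{child.tag} {data}".rstrip())
    return "\n".join(lines) + "\n"

# Hypothetical sample input
sample = "<config-type><HOST>example.org</HOST><PORT>8080</PORT></config-type>"
print(xml_to_legacy(sample), end="")
```

The parser undoes XML escaping (for example `&amp;` becomes `&`), so the legacy file gets the raw data back.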

Touko
A: 

Try to use a simple text processor like awk (or gawk) to generate the XML. The pattern would look like this.

BEGIN { 
    print "<?xml version=\"1.0\" encoding=\"utf-8\"?>";
    print "<config-type>"; 
}
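# $1 is the keyword; everything after it is the data. Blank lines are skipped.
NF {
    data = $0;
    sub(/^[ \t]*[^ \t]+[ \t]*/, "", data);
    gsub(/&/, "\\&amp;", data); gsub(/</, "\\&lt;", data); gsub(/>/, "\\&gt;", data);
    print "    <" $1 ">" data "</" $1 ">";
}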
END { print "</config-type>"; }

Make sure the declared encoding matches the actual encoding of the files. For config files in plain English, "ASCII" is enough (and ASCII is also valid UTF-8).

After that, you can use a wide variety of tools to process that XML. I suggest using this format because it is the simplest to create and process:

<config-type>
    <KEYWORD1>DATA1</KEYWORD1>
    <KEYWORD2>DATA2</KEYWORD2>
    <KEYWORD3>DATA3</KEYWORD3>
</config-type>
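
For comparison, here is a rough Python sketch that produces this format. It assumes one KEYWORD DATA pair per line, that every keyword is already a valid XML element name, and that blank lines can be ignored; the name `legacy_to_xml` and the sample input are invented for the example.

```python
from xml.sax.saxutils import escape

def legacy_to_xml(text: str, root: str = "config-type") -> str:
    """Convert 'KEYWORD DATA' lines into the flat XML format shown above."""
    out = ['<?xml version="1.0" encoding="utf-8"?>', f"<{root}>"]
    for line in text.splitlines():
        parts = line.split(None, 1)  # keyword, then the rest of the line
        if not parts:
            continue  # skip blank lines
        keyword = parts[0]
        data = parts[1].strip() if len(parts) > 1 else ""
        # escape() replaces &, < and > so the data cannot break the markup
        out.append(f"    <{keyword}>{escape(data)}</{keyword}>")
    out.append(f"</{root}>")
    return "\n".join(out) + "\n"

# Hypothetical legacy input
print(legacy_to_xml("NAME A & B\nPORT 80\n"), end="")
```

Passing a different `root` per file type gives each kind of config file its own top-level element.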

Use a different name for "config-type" for each type of config file you have so they are easy to distinguish.

To check the format of the XML, the simplest way is to define a DTD for it. Many XML editors can read an existing XML file and create a DTD for it. That DTD won't be perfect, but it will be a very good starting point.

You can then reference the DTD in a DOCTYPE declaration at the top of the XML file and tell the XML parser to validate the structure (not the data, though).

To check the data, you can use XML Schema, but XML Schema is very complicated and often overkill.
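
If a full schema feels too heavy, a lightweight stand-in is one regular expression per keyword. The sketch below is not XML Schema validation, just a plain Python check; the `RULES` table, its keywords and patterns, and the name `check_data` are all hypothetical and would need to be replaced with the real rules for each file type.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical rules: one pattern per known keyword
RULES = {
    "HOST": re.compile(r"[A-Za-z0-9.-]+"),
    "PORT": re.compile(r"[0-9]{1,5}"),
}

def check_data(xml_text, rules=RULES):
    """Return a list of problems found in a flat keyword/data XML document."""
    errors = []
    for child in ET.fromstring(xml_text):
        value = (child.text or "").strip()
        pattern = rules.get(child.tag)
        if pattern is None:
            errors.append(f"unknown keyword: {child.tag}")
        elif not pattern.fullmatch(value):  # the whole value must match
            errors.append(f"bad value for {child.tag}: {value!r}")
    return errors

doc = "<config-type><HOST>example.org</HOST><PORT>80x</PORT></config-type>"
for problem in check_data(doc):
    print(problem)
```

An empty list means every keyword was known and every value matched its pattern.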

Aaron Digulla
I wouldn't call XML Schema overkill. Especially as it's required if you want to do anything serious with XML.
Joachim Sauer
*lol* I can't fail to notice that you didn't object to "is very complicated" ;)
Aaron Digulla
A: 

This is precisely the type of problem that Gelatin was designed for. (Also, self promotion warning.)

knipknap