Before I dive into ANTLR (because it is apparently not for the faint of heart), I just want to make sure I have made the right decision regarding its usage.

I want to create a grammar that will parse a text file with predefined tags so that I can populate values within my application. (The text file is generated by another application.) So, essentially, I want to be able to parse something like this:

Name: TheFileName
Values: 5 3 1 6 1 3
Other Values: 5 3 1 5 1

In my application, TheFileName is stored as a String, and each set of values is stored in an array. (This is just a sample; the real file is much more complicated.) Anyway, am I at least going down the right path with ANTLR? Any other suggestions?

Edit: The files are created by the user, who marks up the areas with tags. So a file might look something like this:

Name: <string>TheFileName</string>
Values: <array>5 3 1 6 1 3</array>
Important Value: <double>3.45</double>

Something along those lines.

A: 

Well, if it's "much more complicated", then, yes, a parser generator would be helpful. But, since you don't show the actual format of your file, how could anybody know what might be the right tool for the job?

Jonathan Feinberg
The reason I didn't show any other files is that the files are user-defined. I've updated the question to describe the problem a bit better.
JasCav
A: 

I use the free GOLD Parser Builder, which is incredibly easy to use, and can generate the parser itself in many different languages. There are samples for parsing such expressions also.

Zach
+2  A: 

The basic question is: how is the file more complicated? Is it basically more of the same, with a tag, a colon and one or more values, or is the structure of the other lines more complex? If it's just more of the same, code to recognize and read the data is pretty trivial, and a parser generator isn't likely to gain you much. If the other lines have substantially different structure, it'll depend primarily on how they differ.
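To give an idea of how small the "more of the same" case is, here is a minimal sketch in Java. It assumes every line has the form `Tag: value value ...` as in the question's first sample; the class and method names (`TagLineReader`, `parse`) are made up for illustration and are not from any library.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.Map;

class TagLineReader {
    /** Splits each "Tag: value value ..." line at its first colon. */
    static Map<String, String[]> parse(String text) {
        Map<String, String[]> fields = new LinkedHashMap<>();
        for (String line : text.split("\\R")) {
            int colon = line.indexOf(':');
            if (colon < 0) {
                continue; // not a "Tag: ..." line, so skip it
            }
            String tag = line.substring(0, colon).trim();
            String rest = line.substring(colon + 1).trim();
            // A tag with nothing after the colon maps to an empty array.
            fields.put(tag, rest.isEmpty() ? new String[0] : rest.split("\\s+"));
        }
        return fields;
    }

    public static void main(String[] args) {
        String sample = "Name: TheFileName\n"
                + "Values: 5 3 1 6 1 3\n"
                + "Other Values: 5 3 1 5 1\n";
        Map<String, String[]> fields = parse(sample);
        String name = fields.get("Name")[0];
        int[] values = Arrays.stream(fields.get("Values"))
                .mapToInt(Integer::parseInt)
                .toArray();
        System.out.println(name + " " + Arrays.toString(values));
    }
}
```

If the real files stay within this shape, something along these lines is probably all that is needed; once values can contain colons or span several lines, it stops being adequate.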

Edit: Based on what you've added, I'd go one (tiny) step further, and format your file as XML. You can then use existing XML parsers (and such) to read the files, extract data, verify that they fit a specified format, etc.
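As a rough illustration of the XML route, the sketch below reads such a file with the DOM parser that ships with the JDK. The element names (`record`, `name`, `values`, `importantValue`) and the class `XmlRecordReader` are assumptions invented for this example; the question does not define an XML layout.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

class XmlRecordReader {
    /** Parses an XML string with the JDK's built-in DOM parser. */
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            // Malformed input is detected by the XML parser, not by hand-written code.
            throw new IllegalArgumentException("Not well-formed XML", e);
        }
    }

    /** Returns the trimmed text of the first element with the given name. */
    static String text(Document doc, String element) {
        return doc.getElementsByTagName(element).item(0).getTextContent().trim();
    }

    public static void main(String[] args) {
        // Hypothetical layout: these element names are made up for the example.
        String xml = "<record>"
                + "<name>TheFileName</name>"
                + "<values>5 3 1 6 1 3</values>"
                + "<importantValue>3.45</importantValue>"
                + "</record>";
        Document doc = parse(xml);
        String name = text(doc, "name");
        String[] values = text(doc, "values").split("\\s+");
        double important = Double.parseDouble(text(doc, "importantValue"));
        System.out.println(name + " " + values.length + " " + important);
    }
}
```

The point is that well-formedness checking and text extraction come for free; validating against a schema would be a further step not shown here.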

Jerry Coffin
Good response. To answer your question, I cannot necessarily predict how the files will differ. Basically, I want to offer a series of tags that can be placed into a file template, and I will use that template to parse any other file that matches it. This will help bring data into the system quickly. Other than the tags, however, I don't care much about what is in between them.
JasCav
That makes no sense!
Jonathan Feinberg
+1  A: 

It depends on how much control you have over the format of the file you are parsing. If you have no control, then a parser generator such as ANTLR may be valuable. (We do this ourselves for FORTRAN output files over which we have no control.) It's quite a bit of work, but we have now mastered the basic ANTLR lexer/parser strategy and it's starting to work well.

If, however, you have some or complete control over the format, then create it with as much markup as necessary. I would always create such a file in XML, as there are so many tools for processing it (not only parsers, but also XPath, databases, etc.). In general we use ANTLR to parse semi-structured information into XML.

peter.murray.rust
A: 

If the format of the file is up to the user, can you even define a grammar for it?

Seems like you just want a lexer at best. Using ANTLR just for the lexer part is possible, but would seem like overkill.
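For a sense of what "just a lexer" could look like without ANTLR, here is a hedged sketch that uses `java.util.regex` to pick out the `Label: <type>body</type>` fields from the question's edited sample. It assumes one field per line and no nested tags; `TagScanner` and `scan` are illustrative names, not an existing API.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class TagScanner {
    // label ":" "<type>" body "</type>"; the \2 backreference forces the
    // closing tag to have the same name as the opening tag.
    private static final Pattern FIELD = Pattern.compile(
            "^[ \\t]*([^:<\\r\\n]+?)[ \\t]*:[ \\t]*<(\\w+)>(.*?)</\\2>[ \\t]*$",
            Pattern.MULTILINE);

    /** Returns one {label, type, body} triple per matching line. */
    static List<String[]> scan(String text) {
        List<String[]> fields = new ArrayList<>();
        Matcher m = FIELD.matcher(text);
        while (m.find()) {
            fields.add(new String[] { m.group(1), m.group(2), m.group(3) });
        }
        return fields;
    }

    public static void main(String[] args) {
        String sample = "Name: <string>TheFileName</string>\n"
                + "Values: <array>5 3 1 6 1 3</array>\n"
                + "Important Value: <double>3.45</double>\n";
        for (String[] field : scan(sample)) {
            System.out.println(field[0] + " (" + field[1] + ") = " + field[2]);
        }
    }
}
```

Lines that do not fit the pattern are silently skipped, which matches the asker's "I don't care what is in between the tags" but would hide typos in a stricter setting.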

Rob Walker
+1  A: 

If you don't need the format to be custom-built, then you should look into using an existing format such as JSON or XML, for which there are parsers available.

Even if you do need a custom format, you may be better off designing one that is dirt simple so that you don't need a full-blown grammar to parse it. Designing your own scripting grammar from scratch and doing a good job of it is a lot of work.

Writing grammar parsers can also be really fun, so if you're curious then you should go for it. But I don't recommend carelessly mixing learning exercises with practical work code.

Parappa