Hello, I need to parse a text file that has many levels of nesting and a lot of special characters. I've tried several approaches, but I haven't been able to get any of them to work. I've included a sample of the file below. Any suggestions on how I can parse it?

I have denoted the parts of the file I need with TEXTINEED.

(bean name:
       'TEXTINEED
       context:
       (list '/text
             '/content/home/left-nav/text
             '/content/home/landing-page)
       type:
       '/text/types/text
       module:
       '/modules/TEXTINEED
       source:
       '|moretext|
       ((contents
          (list (list (bean type:
                             '/directory/TEXTINEED
                             ((directives
                                (bean ((chartSize (list 600 400))
                                        (showCorners (list #f))
                                        (showColHeader (list #f))
                                        (showRowHeader (list #f)))))))
                      (bean type:
                             '/directory/TEXTINEED
                             ((directives
                                (bean ((displayName (list "MTD"))
                                        (showCorners (list #f))
                                        (showColHeader (list #f))
                                        (showRowLabels (list #f))
                                        (hideDetailedLink (list #t))
                                        (showRowHeader (list #f))
                                        (chartSize (list 600 400)))))))
                      (bean type:
                             '/directory/TEXTINEED
                             ((directives
                                (bean ((displayName (list "QTD"))
                                        (showCorners (list #f))
                                        (showColHeader (list #f))
                                        (showRowLabels (list #f))
                                        (hideDetailedLink (list #t))
                                        (showRowHeader (list #f))
                                        (chartSize (list 600 400))))))))

Thanks!

A: 

You might consider writing a state machine implementation which changes states according to the different tokens you encounter within the file. I have found state-based parsers to be quite easy to write and debug. The most difficult part would likely be defining the tokens you use.
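
As a rough illustration of the idea, here is a minimal sketch of a state-machine tokenizer for the sample above. It is written in Python for brevity, and a C# version would follow the same shape. The state names (DEFAULT, ATOM, STRING, BAR) and the token set are my assumptions based on the sample, not a full grammar for the format. In particular, it ignores escape sequences inside strings and comments.

```python
def tokenize(text):
    """Split S-expression text into tokens using a small state machine.

    Assumed tokens (guessed from the sample): '(' and ')', the quote
    mark, "double-quoted strings", |bar-quoted symbols|, and bare atoms
    such as bean, name:, #f or 600.
    """
    DEFAULT, ATOM, STRING, BAR = range(4)
    state = DEFAULT
    buf = []
    tokens = []

    for ch in text:
        if state == DEFAULT:
            if ch in "()'":
                tokens.append(ch)          # single-character tokens
            elif ch == '"':
                buf = [ch]
                state = STRING
            elif ch == "|":
                buf = [ch]
                state = BAR
            elif not ch.isspace():
                buf = [ch]
                state = ATOM
        elif state == ATOM:
            if ch.isspace():
                tokens.append("".join(buf))
                state = DEFAULT
            elif ch in "()":
                # a parenthesis ends the atom and is a token itself
                tokens.append("".join(buf))
                tokens.append(ch)
                state = DEFAULT
            else:
                buf.append(ch)
        elif state == STRING:
            buf.append(ch)
            if ch == '"':                  # no escape handling in this sketch
                tokens.append("".join(buf))
                state = DEFAULT
        elif state == BAR:
            buf.append(ch)
            if ch == "|":
                tokens.append("".join(buf))
                state = DEFAULT

    if state == ATOM:
        tokens.append("".join(buf))        # flush an atom at end of input
    elif state in (STRING, BAR):
        raise ValueError("unterminated string or |symbol|")
    return tokens


if __name__ == "__main__":
    print(tokenize("(bean type: '/directory/x ((chartSize (list 600 400))))"))
```

Once the input is a flat token list, finding the values you need is mostly a matter of watching for the tokens that follow markers such as name: or type:.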

TreDubZedd
+1  A: 

It looks like you have stumbled upon an S-expression file, the notation Lisp code is written in. It looks complex, but it's actually pretty easy to parse. In fact, if you want to learn a lot about Lisp you could follow these blog posts; a small part of that series is writing a parser for files like this. But that's probably overkill for you. :)

Instead, you should use an existing S-expression parser. Here's a project that has a Lisp interpreter for .NET; you should be able to use either their code or the project itself to parse the file.

The Lispy thing to do would be to read the file as a Lisp program, so instead of 'parsing' it you would just execute it. So another option is to write a small Lisp program that transforms the file into something that's a little more natural to consume from C# (maybe XML?).

For reference, here's another post that talks about Lisp in C#.

EDIT

Here is a Scheme interpreter written in C (it's only about 1000 LOC). The parts you are interested in are the read procedure and its associated helpers. It uses a very simple forward-only parse of an S-expression into a tree of C structs, which you should be able to adapt to C# without much trouble.
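
To give a feel for how small a forward-only reader can be, here is a minimal sketch in Python (chosen for brevity; it is not taken from the interpreter mentioned above). The function names read_sexpr, parse and atom are hypothetical, and so is the token pattern, which I guessed from the sample in the question. It turns lists into Python lists, #t/#f into booleans and integers into ints. String literals lose their quotes, so they are not distinguishable from symbols in this sketch.

```python
import re

# Assumed token shapes: parentheses, the quote mark, "strings",
# |bar-quoted symbols|, and any other run of non-space characters.
TOKEN_RE = re.compile(r"""[()']|"[^"]*"|\|[^|]*\||[^\s()]+""")


def atom(tok):
    """Convert a non-list token into a Python value."""
    if tok == "#t":
        return True
    if tok == "#f":
        return False
    if len(tok) >= 2 and tok[0] == '"' and tok[-1] == '"':
        return tok[1:-1]                   # drop the surrounding quotes
    try:
        return int(tok)
    except ValueError:
        return tok                         # symbols stay as plain strings


def parse(tokens, pos):
    """Parse one expression starting at tokens[pos].

    Returns (value, next_pos). The parse only ever moves forward.
    """
    if pos >= len(tokens):
        raise SyntaxError("unexpected end of input")
    tok = tokens[pos]
    if tok == "(":
        items = []
        pos += 1
        while True:
            if pos >= len(tokens):
                raise SyntaxError("missing closing parenthesis")
            if tokens[pos] == ")":
                return items, pos + 1
            item, pos = parse(tokens, pos)
            items.append(item)
    if tok == ")":
        raise SyntaxError("unexpected closing parenthesis")
    if tok == "'":
        item, pos = parse(tokens, pos + 1)
        return ["quote", item], pos        # 'x is read as (quote x)
    return atom(tok), pos + 1


def read_sexpr(text):
    """Read exactly one S-expression from text into nested lists."""
    tokens = TOKEN_RE.findall(text)
    value, pos = parse(tokens, 0)
    if pos != len(tokens):
        raise SyntaxError("trailing tokens after expression")
    return value


if __name__ == "__main__":
    print(read_sexpr("(bean type: '/directory/x ((chartSize (list 600 400))))"))
```

With the tree in hand, pulling out the values after name:, module: and type: is an ordinary recursive walk over nested lists.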

luke
A: 

Use a parser generator like ANTLR. It takes an EBNF-like description of the grammar and generates parser code in one of its supported target languages, C# among them.

nikie