I'm working on a Boost Spirit 2.0 based parser for a small subset of Fortran 77. The issue I'm having is that Fortran 77 is column oriented, and I have been unable to find anything in Spirit that can allow its parsers to be column-aware. Is there any way to do this?

I don't really have to support the full arcane Fortran syntax, but it does need to be able to ignore lines that have a character in the first column (Fortran comments), and recognize lines with a character in the sixth column as continuation lines.

It seems like folks dealing with batch files would at least have the same first-column problem as me. Spirit appears to have an end-of-line parser, but not a start-of-line parser (and certainly not a column(x) parser).

+2  A: 

Well, since I now have an answer to this, I guess I should share it.

Fortran 77, like probably all other languages that care about columns, is a line-oriented language. That means your parser has to keep track of the EOL and actually use it in its parsing.

Another important fact is that in my case, I didn't care about parsing the line numbers that Fortran can put in those early control columns. All I needed was to know when it is telling me to scan the rest of the line differently.

Given those two things, I could entirely handle this issue with a Spirit skip parser. I wrote mine to

  • skip the entire line if the first (comment) column contains an alphabetic character.
  • skip the entire line if there is nothing on it.
  • ignore the preceding EOL and everything up to the sixth column if the sixth column contains a '.' (continuation line). This tacks it onto the preceding line.
  • skip all non-EOL whitespace (even spaces don't matter in Fortran. Yes, it's a weird language.)
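To make the column rules concrete outside of Spirit, here is a minimal sketch in plain standard C++ that classifies a single line the same way. It is not the skip parser itself: `LineKind` and `classify_line` are hypothetical names made up for this illustration, and the "empty" rule is simplified to "nothing but blanks".

```cpp
#include <cctype>
#include <string>

// Hypothetical names for illustration; not part of Spirit or the real parser.
enum class LineKind { Comment, Empty, Continuation, Statement };

// Classify one source line (without its trailing newline).
// Columns are 1-based in the prose, so column 1 is line[0]
// and column 6 is line[5].
LineKind classify_line(const std::string& line)
{
    // Nothing on the line except blanks (assumed simplification).
    if (line.find_first_not_of(" \t") == std::string::npos)
        return LineKind::Empty;

    // Alphabetic character in column 1: full-line comment.
    if (std::isalpha(static_cast<unsigned char>(line[0])))
        return LineKind::Comment;

    // Blank in column 1 and '.' in column 6: continuation line.
    if (line.size() >= 6 &&
        (line[0] == ' ' || line[0] == '\t') &&
        line[5] == '.')
        return LineKind::Continuation;

    return LineKind::Statement;
}
```

For example, `classify_line("     .  + B")` yields `LineKind::Continuation`, while `classify_line("      X = A")` yields `LineKind::Statement` because column 6 is blank.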

Here's the skip parser:

  skip =
      // Full line comment
      (spirit::eol >> spirit::ascii::alpha >> *(spirit::ascii::char_ - spirit::eol))
      [boost::bind (&fortran::parse_info::skipping_line, &pi)]
    |
      // Remaining line comment
      (spirit::ascii::char_ ('!') >> *(spirit::ascii::char_ - spirit::eol)
       [boost::bind (&fortran::parse_info::skipping_line_comment, &pi)])
    |
      // Continuation
      (spirit::eol >> spirit::ascii::blank >>
       spirit::qi::repeat(4)[spirit::ascii::char_ - spirit::eol] >> ".")
      [boost::bind (&fortran::parse_info::skipping_continue, &pi)]
    |
      // Empty line
      (spirit::eol >>
       -(spirit::ascii::blank >> spirit::qi::repeat(0, 4)[spirit::ascii::char_ - spirit::eol] >>
         *(spirit::ascii::blank) ) >>
       &(spirit::eol | spirit::eoi))
      [boost::bind (&fortran::parse_info::skipping_empty, &pi)]
    |
      // Whitespace (this needs to be the last alternative).
      (spirit::ascii::space - spirit::eol)
      [boost::bind (&fortran::parse_info::skipping_space, &pi)]
    ;

I would advise against blindly using this yourself for line-oriented Fortran, as I ignore line numbers, and different compilers have different rules for valid comment and continuation characters.
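To show how the rules can differ, here is a small sketch of the conventions as I understand them from the ANSI Fortran 77 standard: a comment line has 'C' or '*' in column 1, and a continuation line has columns 1 to 5 blank and any character other than blank or '0' in column 6. This is plain C++, not Spirit, and `is_f77_comment` and `is_f77_continuation` are hypothetical helper names. Many compilers accept more than this (lowercase 'c', for instance), so treat it as a baseline rather than a complete rule set.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helpers for illustration; not part of the skip parser above.

// Standard comment line: 'C' or '*' in column 1.
bool is_f77_comment(const std::string& line)
{
    return !line.empty() && (line[0] == 'C' || line[0] == '*');
}

// Standard continuation line: columns 1-5 blank, and any character
// other than blank or '0' in column 6.
bool is_f77_continuation(const std::string& line)
{
    if (line.size() < 6)
        return false;
    for (std::size_t i = 0; i < 5; ++i)
        if (line[i] != ' ')
            return false;
    return line[5] != ' ' && line[5] != '0';
}
```

Note the differences from my skipper: it treats any alphabetic character in column 1 as a comment and only recognizes '.' as a continuation marker, whereas under these rules a line starting with 'D' is not a comment and '&' in column 6 is a valid continuation.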

T.E.D.
FYI: This is Spirit 2.1 code, which means it works with the newly-released Boost (1.41) but might not compile with earlier versions.
T.E.D.