I'm writing a C/C++/... build system (I understand this is madness ;)), and I'm having trouble designing my parser.

My "recipes" look like this:

global
{
    SOURCE_DIRS src
    HEADER_DIRS include
    SOURCES bitwise.c \
            framing.c
    HEADERS \
            ogg/os_types.h \
            ogg/ogg.h
}
lib static ogg_static
{         
   NAME ogg
}
lib shared ogg_shared
{
    NAME ogg
}

(This being based on the super simple libogg source tree)

# starts a comment and \ is a "newline escape", meaning the line continues on the next line (see QMake syntax). {} delimit scopes, like in C++, and global holds settings that apply to every "target". This is all background, and not that relevant... What I really don't know is how to handle my scopes. I will need to support multiple scopes, and also a form of conditional processing, along the lines of:

win32:DEFINES NO_CRT_SECURE_DEPRECATE

The parsing function will need to know which scope level it is at, and call itself whenever the scope level increases. There is also the problem of brace placement (global { or global{ or on its own line, as in the example).

How could I go about this, using Standard C++ and STL? I understand this is a whole lot of work, and that's exactly why I need a good starting point. Thanks!

What I have already is the whole ifstream and internal string/stringstream storage, so I can read word per word.
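To make the lexical rules above concrete, here is a minimal sketch of a tokenizer, assuming C++11 or later. The names tokenize, flushWord and endLine are hypothetical, and the choices to emit "\n" as an end-of-line token and to always split on ':' are assumptions of this sketch (splitting on ':' would break values such as Windows drive paths). Because braces are always split off as their own tokens, global { and global{ produce the same token stream.

```cpp
#include <cstddef>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: appends `word` to `tokens` if non-empty, then clears it.
static void flushWord(std::string& word, std::vector<std::string>& tokens)
{
    if (!word.empty()) {
        tokens.push_back(word);
        word.clear();
    }
}

// Hypothetical helper: appends an end-of-line token unless one was just emitted.
static void endLine(std::vector<std::string>& tokens)
{
    if (!tokens.empty() && tokens.back() != "\n")
        tokens.push_back("\n");
}

// Splits recipe text into words, the single-character tokens { } :
// and "\n" markers that end each logical line.
std::vector<std::string> tokenize(std::istream& in)
{
    std::vector<std::string> tokens;
    std::string line;
    while (std::getline(in, line)) {
        // '#' starts a comment that runs to the end of the line.
        std::size_t hash = line.find('#');
        if (hash != std::string::npos)
            line.erase(hash);

        // Drop trailing whitespace so a final '\' is easy to detect.
        std::size_t last = line.find_last_not_of(" \t\r");
        if (last == std::string::npos)
            line.clear();
        else
            line.erase(last + 1);

        // A trailing '\' means the logical line continues on the next one.
        bool continued = !line.empty() && line.back() == '\\';
        if (continued)
            line.pop_back();

        std::string word;
        for (char c : line) {
            if (c == ' ' || c == '\t') {
                flushWord(word, tokens);
            } else if (c == '{' || c == '}' || c == ':') {
                flushWord(word, tokens);
                tokens.push_back(std::string(1, c));
            } else {
                word += c;
            }
        }
        flushWord(word, tokens);

        if (!continued)
            endLine(tokens);
    }
    endLine(tokens);  // close the last logical line if the file ended mid-line
    return tokens;
}
```

It can be fed from the existing ifstream, or from a std::istringstream when testing.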

A: 

Unless the point of the project is specifically learning how to write a lexer and shift-reduce parser, I'd recommend using Flex and Bison, which will handle much of the parsing grunt-work for you. Writing the grammar and semantic analysis will still be a whole lot of work, don't worry ;)

Well, I see this as a "Learn C++ the hard way" academic self-tutor trial-and-error thing, if you get what I'm saying :). I think it's a worthwhile cause, and will certainly show me enough of the Standard Library and STL to get a good grip on C++ IMHO. Heck, I'll even learn about compilers and linkers in the process.
rubenvb
@rubenvb: then write a recursive descent parser. The simplest explain-by-doing reference I know for that is the Crenshaw Tutorial (linked in 1669).
dmckee
@dmckee Is recursive descent reasonable for a grammar of the complexity the OP suggests in his post? I'd think that that way madness lies...
@user: Recursive descent would be fine for this. Crenshaw's tutorial builds a recursive descent parser for a Pascal-like language, which has considerably more complexity.
dmckee
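As a rough illustration of how recursive descent could map onto the recipe format in the question, here is a minimal sketch, assuming C++11 or later. It assumes the input has already been split into a std::vector<std::string> in which {, } and : are separate tokens and "\n" marks the end of each logical line. All type and function names (Setting, Scope, Parser, parse) are hypothetical. The function that parses a scope body calls itself, indirectly, each time a { opens a nested scope, so the call stack tracks the scope depth.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

struct Setting {
    std::string condition;            // "win32" in win32:DEFINES ...; empty if none
    std::string key;                  // first word, e.g. SOURCES
    std::vector<std::string> values;  // remaining words on the logical line
};

struct Scope {
    std::string condition;            // optional condition in front of the header
    std::vector<std::string> header;  // words before '{', e.g. lib static ogg_static
    std::vector<Setting> settings;
    std::vector<Scope> children;      // nested scopes
    int depth = 0;                    // 0 for the implicit top-level scope
};

class Parser {
public:
    explicit Parser(const std::vector<std::string>& tokens) : toks_(tokens), pos_(0) {}

    Scope parseFile() {
        Scope root;
        parseBody(root, false);
        return root;
    }

private:
    bool atEnd() const { return pos_ >= toks_.size(); }
    const std::string& peek() const { return toks_[pos_]; }
    void skipNewlines() { while (!atEnd() && peek() == "\n") ++pos_; }

    // Parses statements until the matching '}' (nested) or end of input (top level).
    void parseBody(Scope& scope, bool nested) {
        for (;;) {
            skipNewlines();
            if (atEnd()) {
                if (nested) throw std::runtime_error("missing '}'");
                return;
            }
            if (peek() == "}") {
                if (!nested) throw std::runtime_error("unexpected '}'");
                ++pos_;
                return;
            }
            parseStatement(scope);
        }
    }

    // A statement is either a setting line or a scope header followed by '{'.
    void parseStatement(Scope& parent) {
        std::vector<std::string> words;
        std::string condition;
        while (!atEnd() && peek() != "\n" && peek() != "{" && peek() != "}") {
            if (peek() == ":") {
                if (words.size() != 1 || !condition.empty())
                    throw std::runtime_error("misplaced ':'");
                condition = words[0];
                words.clear();
            } else {
                words.push_back(peek());
            }
            ++pos_;
        }

        // The opening brace may sit on the same line or on a following line.
        std::size_t save = pos_;
        skipNewlines();
        if (!atEnd() && peek() == "{") {
            ++pos_;
            Scope child;
            child.condition = condition;
            child.header = words;
            child.depth = parent.depth + 1;
            parseBody(child, true);  // recursion: one call per nesting level
            parent.children.push_back(child);
            return;
        }
        pos_ = save;  // no brace: the words form an ordinary setting

        if (words.empty()) throw std::runtime_error("expected a setting name");
        Setting s;
        s.condition = condition;
        s.key = words[0];
        s.values.assign(words.begin() + 1, words.end());
        parent.settings.push_back(s);
    }

    std::vector<std::string> toks_;
    std::size_t pos_;
};

// Hypothetical entry point: builds the scope tree from a token list.
Scope parse(const std::vector<std::string>& tokens) {
    return Parser(tokens).parseFile();
}
```

Conditions are only recorded here, not evaluated. Deciding whether win32 is true for the current platform would be a separate pass over the resulting tree.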
+1  A: 
Owen S.
This is what I was trying to get structured in my mind. Heck, the Wikipedia article (why didn't I look there :s) will certainly help me get the basics. Thanks
rubenvb
+1  A: 

Start with ANTLR (use ANTLRWorks); after that you can look at Flex/Bison and others such as Lemon. There are many tools out there, but ANTLR and Flex/Bison should be enough. I personally like ANTLRWorks too much to recommend anything else.

LATER: With ANTLR you can generate parser/lexer code for a variety of languages.

Iulian Şerbănoiu
+1  A: 

boost::spirit is a good recursive descent parser generator that uses C++ templates as a language extension to describe the parser and lexer. It works well with native C++ compilers, but won't compile under Managed C++.

CodeProject has a tutorial article that may help.

spoulson