views:

900

answers:

10

The student adviser we were working with is suddenly on leave.

The plan was to design a compiler for a program which is similar to C (the basic structure of C still exists but keywords have been changed for example : for(i=0;i<10;i++) would now look like : loop i from 0 to 9 up 1 . And things such as #include have been omitted entirely. We thought of many such little changes to make the language simple and unique). We want to make a compiler for our language (still in production), but want to write the programs to implement our language and compiler using C.

Please suggest a book that would help us. I bought the new version of the dragon book, but it contains few examples ,and even those are in java (I dunno). Plus the theory is daunting, and its taken me a week just to get to chapter 4, and I have too many questions already.

Is it possible to write a compiler within a 5 weeks? Should I be spending 2 more weeks just trying really hard to understand the theory? What should we do?

thanks for reading.

+1  A: 

A whole compiler in 5 weeks for students is probably a bit aggressive. I recall when I took a compiler construction course, the instructor supplied a lot of the framework (lexing and code generation, I think) and our job was to add or enhance something in the middle. It required understanding what was going on at each level, without actually having to implement all of it from first principles.

Greg Hewgill
+5  A: 

For a simple language, yes you can write a compiler in 5 weeks.

The single biggest hurdle you have to overcome (IMHO) is the grammar and working with the "lexx/yacc" tools you will (likely) be using.

The biggest payday is working with the grammar to ensure that you can readily implement it with your tools. Fighting the "compiler compiler" is a notorious time sink. But for your simple language it should not be a problem. Likely you biggest problem will be with your expressions. They're notorious.

I would take a look at a Pascal grammar as a start, as it's particularly simple. It can easily be compiled using a recursive decent process, but the key is that the grammar is really simple and "works out of the box" for many projects, even if you don't understand grammars.

Mind, I'm not saying write a Pascal compiler, but you can at least lift expressions and go a long way from there in getting your system started running quickly.

As for books, etc., there are other post on ST that you can refer to to dig those up.

The rest of the compiler is reasonably simple to create (IMHO, depending on your target languge), but the grammar is where most folks have real problems up front.

Will Hartung
your point abuot controlling the grammer for simplicity is an excellent one.
Charlie Martin
thank for the reply. I will look into the pascal grammars.
A: 

So far all weve done, is written a c program to create tokens and link them up with attribute space.

I think it is better to edit the question or use "Add comment" as it is not good to put answer for your question
Ahmed Said
+3  A: 

These are not books, but it might still be of interest as a complement to the book you choose, as well as a bit of inspiration as far as how to structure the course.

University of Massachusetts has a course called Compiler Techniques which has both video lectures and slides available online. It covers subjects such as scanning, parsing, code generation etc.

The book used in the course is Engineering a Compiler by Keith Cooper, and the goal of the course is to implement a subset of Pascal.

University of Edinburgh has a course called Compiler Optimisation which has video lectures available online. It digs a bit deeper in some of the concepts.

Erik Öjebo
these videos seem great. thanks a lot.
+2  A: 

It kind of depends on your knowledge in other areas. With 5 weeks, unless you are already pretty skillful, learning a new language wouldn't be a good plan; otherwise, I'd recommend learning a language like icon, or python, or even Ruby.

Given the five week restriction, you have a couple of options: either use lexer and parser generators, like lexx and yacc, or make sure your grammar is sufficiently simple tat you can build a recursive descent parser. "Sufficiently simple" in this context means two things: not a lot of constructs in the lanague, and a grammar that is LL(k), ideally with k=1.

Figure out what your target language will be. I wouldn't recommend an assembler. Something that compiles to C might be good, but for this kind of project, it might even be acceptable to compile to some kind of pseudo-assembler that can be hand verified.

Learn about, and use, test driven design and build tests. Amazingly enough, it actually will go faster than coding without tests.

Also, make it clear that your compiler won't have much error checking -- the error message for a syntax error should be something useful like "syntax error at line nn, exiting."

With those constraints, and lexx and yacc or similar tools, you should be able to build a simple compiler and get your first "hello, world" program parsed in just a few days work. then you can grow it to fulfill the professors requirements.

Charlie Martin
thanks for your reply. but I will have to keep coming back to it, as you talk about somethings I dont understand as yet. I am just a beginner at this. thanks a lot.
+6  A: 

You may find the book "Basics of Compiler Design" useful. It is provided for free as pdf. Also free is "Compilers and Compiler Generators".

I've been told the book "Crafting a Compiler with C" is also good but I've not seen it.

As general comment, yes you should use this as an opportunity to learn the theory behind. Since you have only 5 weeks I would suggest the following approach: (somehow similar to the one used by Wirth in the classical "Algorithm+Data Structure=Programs" book).

  • Learn about Context Free Grammars and lexical analysis
  • Craft your language so that it can be parsed with a Recursive descent parser
  • Write your grammar as BNF and derive the recursive parser following the BNF rules
  • Alternatively use the very good "RDP" tool by Adrian Johnstone, It is much simpler to use than any lex/yacc derivative.
  • Use an high level language (C could be a good choice) as your target language
  • For now ignore anything about optimization of the generated code, it will take too much time.

It is a lot of work, actually, but I'm sure it will be fun and instructive to do.

Remo.D
Crafting a Compiler with C was a useful textbook. I think it focuses on recursive descent parsers, which may be a good technique to use.
Yuval F
From what I remember from my student days, generating stack-based language is better for future optimization than high-level language. And it's closer to Assembly.
Arkadiy
+1  A: 

I think you can write CFG and make the lexical analyzer or even parser in 5 weeks and you can minimize your effort using parser generator tools (like Antlr)

Ahmed Said
A: 

in Compilerbau you should learn the use proper of lexx/yacc (flex/bison). that is one of the main things in studying Compilerbau, to build a correct and usable grammar.

C for example is bad example because of the dangling-else-problem

if (b1)
  if (b2) 
    statement1;
else
  statement2;

where belongs "else statement2" to? per definitionem to "if (b2)" but you have to know it or use paranthesises.

Peter Miehle
A: 

I have used this book to design a compiler. However, the book is quite old and the code is in pascal.. but it is still readable and this is the closest book that gave step by step instructions on how to write a compiler. You have to brush up on BNF. Your first step should be to have a BNF for your language, you might have issues with verification later if your BNF isn't mathematically well grounded.

IMHO compiler design is much closer to mathematics than software engineering as we know it. A simple warning/error in the compiler is NOT tolerable.

Sridhar Iyer
A: 

To write your first compiler, using C, in 5 weeks is not a reasonable goal. I have written several compilers and have worked on a number of others (a mature compiler is a great wallowing beast and requires many hands at the oars), and even using a well-designed functional language, which is ideal for writing compilers, I would be hard pressed to write a full compiler and produce decent native code in just five weeks. Especially if the language spec is not quite nailed down.

Using the Boehm garbage collector (libgc) will help a little.

If you want to compile to a byte code and implement a virtual machine, that would be doable in 5 weeks. Especially if you avoid yacc and lex and use something like the Elkhound GLR parser or maybe ANTLR. You could lose two weeks just trying to get yacc to swallow your grammar.

Norman Ramsey