Is there a more modern, maybe object-oriented, equivalent to Jack Crenshaw's "Let's Build a Compiler" series?

A while back I stumbled across "Let's Build a Compiler" and could just not resist writing some code. I wrote a recursive-descent C compiler in C# that output .NET CIL. "Write once, leak everywhere" was my slogan.

Too bad I did not realize until too late that parsing C is a nightmare.

I am now interested in writing a Java compiler in Java that outputs .NET CIL or assemblies, with the goal of being self-bootstrapping. I was hoping there might be some newer tutorials kicking around.

As an aside, would you spend more time on up-front design, or would you just write a ton of tests to support the ability to refactor mercilessly? Thinking back, I am leaning towards the latter. The compiler worked but the code was really awful.

A: 

Have you taken a look at the PyPy project? It is a Python implementation of the Python language. Maybe it can provide some inspiration for your goal of self-bootstrapping Java?

Arrieta
A: 

If you want to learn this stuff, you should have a look at the books Language Implementation Patterns and The Definitive ANTLR Reference.

manuel aldana
A: 

I'm a fan of "MiniJava" and associated work based on the "Modern Compiler Implementation in Java" family of books. This doesn't quite meet all the requirements you mention as a MiniJava implementation will, generally, generate native code - but the backend can easily be changed to emit MSIL or whatever.

Richard Cook
A: 

If you like to learn by example, the code for Finch, a little programming language of mine:

  1. Is written in object-oriented C++.
  2. Is very clean.
  3. Includes a bytecode compiler.
munificent
+1  A: 

It sounds like you completely missed the point of Crenshaw's tutorials. LBC isn't about writing pretty, clean, or efficient code. It's all about bringing something that's steeped in formal theory down to a level where the casual coder can easily and rapidly hack out a rudimentary (but working!) compiler.

When I read through LBC years back, I rewrote the examples in C#. I'm sure the class layout isn't the best and the tasks aren't segregated properly, but it's comparable to his Pascal. I'd be happy to share the code with you if you like-- let me know and I can post it online and share the link.

In my spare time I've been hacking out some writing with the aim of unifying the philosophies of LBC and Basics of Compiler Design: walking away with practical, working code at the end of each unit/chapter, and also discussing some of the theory after exploring the ideas, so the reader understands why things are the way they are. But it took Crenshaw years to write his incomplete series, so mine may be a pipe dream... and I use C (exactly because it's not C++ or Java).

Timothy
@Timothy - Sorry you feel I missed the point. As you read in my question, I wrote a C compiler in C# using Crenshaw's tutorial. The result also resembled his Pascal. This was ok because I wanted to rewrite it in C but it makes little sense for C#. It should be possible to write a similarly practical tutorial to "rapidly hack out" a compiler in an OO way that is similar in spirit to Crenshaw but more modern. My question is if there was one. If you do not write yours first, I may write it myself. I have some ideas.
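
To make "rapidly hack out, but in an OO way" a bit more concrete, here is a rough sketch of what one early step might look like in Java: a single-pass recursive-descent translator for arithmetic expressions that parses and emits in one go, as LBC does. It is not taken from any tutorial. The class name `TinyCompiler` and its methods are made up for illustration, and the output is just a list of CIL-style mnemonic strings (`ldc.i4`, `add`, `sub`, `mul`, `div`), not a real assembly.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical single-pass translator: each grammar rule is one method,
// and code is emitted as soon as a construct has been recognized.
final class TinyCompiler {
    private final String src;
    private int pos = 0;
    private final List<String> out = new ArrayList<>();

    TinyCompiler(String src) { this.src = src; }

    // Compiles one expression and returns the emitted instruction list.
    static List<String> compile(String src) {
        TinyCompiler c = new TinyCompiler(src);
        c.expression();
        c.skipSpaces();
        if (c.pos != c.src.length()) {
            throw new IllegalArgumentException("Unexpected input at " + c.pos);
        }
        return c.out;
    }

    // expression := term (('+' | '-') term)*
    private void expression() {
        term();
        while (peek() == '+' || peek() == '-') {
            char op = next();
            term();
            out.add(op == '+' ? "add" : "sub");
        }
    }

    // term := factor (('*' | '/') factor)*
    private void term() {
        factor();
        while (peek() == '*' || peek() == '/') {
            char op = next();
            factor();
            out.add(op == '*' ? "mul" : "div");
        }
    }

    // factor := number | '(' expression ')'
    private void factor() {
        if (peek() == '(') {
            next();
            expression();
            if (peek() != ')') {
                throw new IllegalArgumentException("Expected ')' at " + pos);
            }
            next();
        } else if (Character.isDigit(peek())) {
            int start = pos;
            while (pos < src.length() && Character.isDigit(src.charAt(pos))) {
                pos++;
            }
            out.add("ldc.i4 " + src.substring(start, pos));
        } else {
            throw new IllegalArgumentException("Expected number at " + pos);
        }
    }

    private void skipSpaces() {
        while (pos < src.length() && src.charAt(pos) == ' ') pos++;
    }

    // Returns the next non-space character without consuming it ('\0' at end).
    private char peek() {
        skipSpaces();
        return pos < src.length() ? src.charAt(pos) : '\0';
    }

    // Consumes and returns the next non-space character.
    private char next() {
        char c = peek();
        pos++;
        return c;
    }
}
```

Calling `TinyCompiler.compile("1 + 2 * (3 - 4)")` yields the operands in order followed by `sub`, `mul`, `add`, which is the postfix order a stack machine wants. Everything lives in one class here, which is arguably still "Pascal in Java"; splitting the scanner, parser, and emitter into separate objects would be the next refactoring step.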
Justin
A: 

How about Watt & Brown's Programming Language Processors in Java? It demonstrates what OO patterns to use in (simple) compiler design. I used it with C# successfully.
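
One OO pattern that shows up a lot in simple compiler designs is Visitor: the AST classes stay small, and each compiler pass is its own class. The following is a minimal sketch of that idea in Java, not code from the book. All names (`Expr`, `Visitor`, `Evaluator`, `StackCodeEmitter`) are assumptions made up for illustration, and the emitter just collects CIL-style mnemonics as strings rather than writing a real assembly.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical AST node: every node accepts a visitor.
interface Expr {
    <R> R accept(Visitor<R> visitor);
}

// One method per node type; each compiler pass implements this interface.
interface Visitor<R> {
    R visitNumber(NumberExpr e);
    R visitBinary(BinaryExpr e);
}

final class NumberExpr implements Expr {
    final int value;
    NumberExpr(int value) { this.value = value; }
    public <R> R accept(Visitor<R> visitor) { return visitor.visitNumber(this); }
}

final class BinaryExpr implements Expr {
    final char op;
    final Expr left;
    final Expr right;
    BinaryExpr(char op, Expr left, Expr right) {
        this.op = op;
        this.left = left;
        this.right = right;
    }
    public <R> R accept(Visitor<R> visitor) { return visitor.visitBinary(this); }
}

// Pass 1: evaluate the tree directly (handy for testing the parser).
final class Evaluator implements Visitor<Integer> {
    public Integer visitNumber(NumberExpr e) { return e.value; }
    public Integer visitBinary(BinaryExpr e) {
        int l = e.left.accept(this);
        int r = e.right.accept(this);
        switch (e.op) {
            case '+': return l + r;
            case '-': return l - r;
            case '*': return l * r;
            case '/': return l / r;
            default: throw new IllegalArgumentException("Unknown operator " + e.op);
        }
    }
}

// Pass 2: emit stack-machine instructions as text, operands first.
final class StackCodeEmitter implements Visitor<Void> {
    final List<String> code = new ArrayList<>();
    public Void visitNumber(NumberExpr e) {
        code.add("ldc.i4 " + e.value);
        return null;
    }
    public Void visitBinary(BinaryExpr e) {
        e.left.accept(this);
        e.right.accept(this);
        switch (e.op) {
            case '+': code.add("add"); break;
            case '-': code.add("sub"); break;
            case '*': code.add("mul"); break;
            case '/': code.add("div"); break;
            default: throw new IllegalArgumentException("Unknown operator " + e.op);
        }
        return null;
    }
}

class VisitorDemo {
    public static void main(String[] args) {
        // AST for 1 + 2 * 3
        Expr e = new BinaryExpr('+', new NumberExpr(1),
                new BinaryExpr('*', new NumberExpr(2), new NumberExpr(3)));
        System.out.println(e.accept(new Evaluator()));
        StackCodeEmitter emitter = new StackCodeEmitter();
        e.accept(emitter);
        emitter.code.forEach(System.out::println);
    }
}
```

The appeal for a compiler is that adding a new pass (type checking, a different backend) means adding one new visitor class without touching the AST classes.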

Kyberias