views:

442

answers:

13

Recently, I was going around looking for ideas on what I can build using C this summer and I came across this post: http://stackoverflow.com/questions/1257376/interesting-project-to-learn-c

Implement a programming language. This doesn't have to be terribly hard - I did the language that must not be named - but it will force you to learn a lot of the important parts of C. If you don't want to write a lexer and/or parser yourself, you can use lex/flex and yacc/bison, but if you plan on that you might want to start with a somewhat smaller project.

I was kinda intrigued about the implementing a programming language answer and I'm wondering how do I go about starting this? I've gone through the whole K&R book and I've done some of the exercises as well. I also have a bit of experience in C++ and Java if that matters. Any tips? Thanks!

+1  A: 

Well, I think something like that is really hard to do but also it would be a great pet project. You should have notions of parsers, lexers, flow control, paradigms (imperative, functional, OO) and many other things.

Many people says the Dragon Book is one of the best books for this. Maybe you can take a look at it :)

Good Luck!

Markust
+1 for the dragon book.
Paul Nathan
Don't recommend the Dragon book for a "very simple" toy project. Just don't.
dmckee
@dmckee what is the problem if it's for learning purposes?
Markust
The problem is that offering a comprehensive---indeed encyclopedic---reference written in fully abstract form as a tutorial is silly and counter productive. Compilers are deep, complicated magic to do well, but they are simple to do simply. The dragon book buries that fact.
dmckee
@Markust: The Dragon book is a great reference if you are into heavy computer science theory about compilers and programming languages. Its not a great "getting started" book.
Yann Ramin
@dmckee: I do recommend the dragon book for the small projects. There's no reason to ignore complexity and pretend its not there.
Paul Nathan
dmckee
Well I think that knowing where the dragons live is important, if only so you can avoid visiting their lairs sooner than you have to. If you're starting down this road, shouldn't you read this book? I think so.
Warren P
+1  A: 

you can read some well-written papers by Niklaus Wirth:

  • "Compiler Construction" (available here) is a short, concise introduction to the art of building a compiler.
  • "Algorithms + Data Structure = Programs" (unfortunately out of print), presents a simpler language (named PL/0) in his last chapter.

although those papers are mainly written in Pascal, the concepts exposed are easily translated to C.

Adrien Plisson
+3  A: 

Learn about regular expressions, grammars, and a good parser generator.

Even if you end up implementing your own parser, these are the fundamental concepts to implementing any programming language.

Uri
+3  A: 

Start with a very simple (toy) language; later you can create a more complex syntax.

You could write an interpreter to parse strings like,

integer x
integer y
set x, 2
set y, 5
add x, y // x = x + y
print x

and evaluate each line immediately. If you store the lines in a vector it'd be easy to implement loops with goto command.


An example, Another World (vintage game)
Script editor:

alt text

Nick D
Rather than worrying about how the compiler will be written and what tools will be used, the primary concern is the language design. There are no useful details to get into until what you are designing is obvious. I completely agree with the concept of this answer.
erisco
A: 

Another alternative is to build a language without looking at anything else. Figure out what you can do easily, and go from there. For example, you could parse expressions into a list of tokens, separating with whitespace, and use prefix notation (which is quite simple to deal with). This sort of thing is a huge amount of fun, and you can learn a lot from experimenting.

Justin Frankel
Experimentation can be good. I'd say don't *only* experiment, and definitely look up answers to problems you run into after you've taken an honest crack at them yourself, whether you managed to solve them or not. However, completely unstructured experimentation (dinking around) isn't terribly useful
Merlyn Morgan-Graham
A: 

If you speak French you may be interested in one of my colleagues courses (freely available) http://matthieuamiguet.ch/scientifique/enseignement/langages-et-compilateurs although he uses Python to explain the concepts of language construction and compilation.

English PDF from PyCon 2010 http://matthieuamiguet.ch/assets/files/scientifique/publis/TeachingCompilersWithPython_Paper.pdf

I may have to speak to him about translating his info to English 8)

Reuben Mallaby
A: 

I've made a simple language parser in Java some time ago, basically evaluated mathematical expressions, replaced constants and variables and provided some feedback on syntax/type errors.

The easiest way I found to do such a thing was to make a parse tree. This can be done easily by using two stacks, an operator stack and a result stack. Afterwards you could just parse it recursively using a DFS, maybe use the visitor pattern if you decide to implement this in a object oriented language.

There is a lot to say about these things and if you want to I can explain them more in-depth, I didn't because I thought you'd want to try implementing the above mentioned yourself, but if you do, just notify me and we can talk.

kiwi
A: 

One old compiler tutorial is this one. Though it is in Pascal it is a very good source of information. If you want something more recent you should have a look at ANTLR.

Iulian Şerbănoiu
A: 

To keep things simple, I recommend implementing a simple postfix language. FORTH or the core part of PostScript would be great choices.

Norman Ramsey
A: 

Read through posts on the usenet newsgroup comp.compilers, it is accessible through Google Groups. It has many discussions related to building a language, building a compiler, lex/yacc, grammars and the like. Of course, you'd have to have good familiarity with the classics such as the dragon book, the tiger book among many books on compilers and, good books on algorithms and data structures.

The Original C Compiler is being given a new life. Most of it is being rewritten, and its code base is small enough to be read and understood in a summer vacation. Consider reading the code along with the papers that were used to write the code of this or any working compiler and I'm sure you'd get ideas about where to start, etc.

vpit3833
A nit: PCC wasn't the first C compiler.
Darius Bacon
+1  A: 

Scheme from Scratch is a nice series of blog posts about implementing Scheme in C. The code is very readable, and each version builds on the previous one in a way that's easy to follow.

Here is the first installment: v0.1 - Integers.

namin
+1  A: 

I'd start with a simple desk calculator program that can read things like:

5 + 10 * 3

and print the answer. Then you can progress it to add variables, control flow, even functions.

Walter Bright
Although implementing a compiler to evaluate `5 10 + 3 *` is easier :) http://en.wikipedia.org/wiki/Reverse_Polish_notation. Still a simple desk calculator
Merlyn Morgan-Graham
A: 

Let someone else do the dirty work for you, namely, the lexer and the parser. Use cup, yacc, or bison to handle the syntax. This will let you focus on the more important language design decisions. There are even example parser definitions for many languages that you can use as a template for yours.

Chris