I am looking for a simple compiler that compiles a simple language. I need it to write a paper about it and to learn how compilers work. I am not looking for anything sophisticated, just a simple language with a small code base (gcc, for example, is far too big). Any help is appreciated.
Brainfucked is a compiler for the extremely simple-minded language Brainfuck.
You may look at the calculator example in Bjarne Stroustrup's hilarious book "The C++ Programming Language".
If you want something more advanced, read the source code of boost::spirit.
Depends on your view of simple. You could look at one of the various available Brainfuck compilers. That's an extremely simple language and the compilers are very small. But I don't know how much this will tell you about how a "real" compiler works.
What about looking at a small C compiler? C isn't very complicated and I think this will give you some insight into compiler construction.
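To give a sense of how small such a compiler can be, here is a minimal sketch in Python. It is not taken from any of the existing Brainfuck compilers; the names `compile_bf` and `run_bf`, the 30000-cell tape, and the choice of Python source text as the target are all assumptions made for illustration. Each Brainfuck command maps to one line of target code, and only the loop brackets need any bookkeeping.

```python
# Hypothetical sketch: translate Brainfuck source into Python source text.
TEMPLATES = {
    ">": "ptr += 1",
    "<": "ptr -= 1",
    "+": "tape[ptr] = (tape[ptr] + 1) % 256",
    "-": "tape[ptr] = (tape[ptr] - 1) % 256",
    ".": "out.append(chr(tape[ptr]))",
    ",": "tape[ptr] = read() % 256",
}


def compile_bf(source):
    """Return Python source defining run(), equivalent to the program."""
    lines = [
        "def run(read=lambda: 0):",
        "    tape = [0] * 30000",  # assumed tape size
        "    ptr = 0",
        "    out = []",
    ]
    depth = 1  # current indentation level inside run()
    for ch in source:
        indent = "    " * depth
        if ch in TEMPLATES:
            lines.append(indent + TEMPLATES[ch])
        elif ch == "[":
            lines.append(indent + "while tape[ptr]:")
            depth += 1
        elif ch == "]":
            if depth == 1:
                raise SyntaxError("unmatched ]")
            lines.append(indent + "pass")  # keeps an empty loop body valid
            depth -= 1
        # every other character is a comment in Brainfuck and is skipped
    if depth != 1:
        raise SyntaxError("unmatched [")
    lines.append("    return ''.join(out)")
    return "\n".join(lines)


def run_bf(source):
    """Compile, load the generated code, and return the program's output."""
    namespace = {}
    exec(compile_bf(source), namespace)
    return namespace["run"]()


if __name__ == "__main__":
    # 8 * 8 + 1 = 65, the character code of "A"
    print(run_bf("++++++++[>++++++++<-]>+."))
```

Because the target here is Python, very deeply nested loops would hit Python's limit on nested blocks; a compiler that emits C or assembler would not have that restriction.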
You should read a book on compiler design; it should have the theory you want to know, as well as some appropriately simple examples.
I recommend "the dragon book": Principles of Compiler Design, by Aho and Ullman. It has been many years since I read it, so I don't recall exactly what examples are available, but it is a very good text.
There are a lot you can use; which one you will find easiest depends on your experience.
Firstly as regards the language:
- The simplest is a toy language, for example one that compiles arithmetic expressions.
- Next is an assembler - again really just translating but shows the basics of parsing and turning into op-codes
- Next is probably something like C, which is very close to pure assembler, or something like LISP which is very close to pure theory.
Next, choosing your compiler.
You could start with an assembler - turning assembler into machine code. This was the first step in producing compilers - I'd suggest for a chip like the 6502 or 8080 which are both very simple. Something like the assembler's development kit might work well for you (it comes with examples)
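As a rough illustration of what "turning assembler into machine code" involves, here is a minimal sketch of a 6502 assembler in Python. The opcode bytes are the standard 6502 encodings, but everything else is an assumption made for brevity: the function name `assemble`, the tiny instruction subset, and the simplification that any `$` operand is an absolute address (a real assembler would also handle labels and zero-page addressing).

```python
# (mnemonic, addressing mode) -> opcode byte; a small subset of the 6502 set
OPCODES = {
    ("LDA", "imm"): 0xA9,
    ("ADC", "imm"): 0x69,
    ("STA", "abs"): 0x8D,
    ("JMP", "abs"): 0x4C,
    ("CLC", "imp"): 0x18,
    ("NOP", "imp"): 0xEA,
    ("RTS", "imp"): 0x60,
}


def assemble(text):
    """Translate assembler text into machine code bytes (sketch only)."""
    out = bytearray()
    for line_no, raw in enumerate(text.splitlines(), 1):
        line = raw.split(";", 1)[0].strip()  # drop comments and blanks
        if not line:
            continue
        parts = line.split()
        mnemonic = parts[0].upper()
        operand = parts[1] if len(parts) > 1 else ""

        # Work out the addressing mode from the operand's shape.
        if not operand:
            mode, value, limit = "imp", 0, 0
        elif operand.startswith("#$"):
            mode, value, limit = "imm", int(operand[2:], 16), 0xFF
        elif operand.startswith("$"):
            mode, value, limit = "abs", int(operand[1:], 16), 0xFFFF
        else:
            raise SyntaxError(f"line {line_no}: bad operand {operand!r}")
        if value > limit:
            raise SyntaxError(f"line {line_no}: operand out of range")
        if (mnemonic, mode) not in OPCODES:
            raise SyntaxError(f"line {line_no}: unsupported {mnemonic} ({mode})")

        out.append(OPCODES[(mnemonic, mode)])
        if mode == "imm":
            out.append(value)
        elif mode == "abs":
            out += value.to_bytes(2, "little")  # the 6502 is little-endian
    return bytes(out)


if __name__ == "__main__":
    program = """
        CLC         ; clear carry before adding
        LDA #$01
        ADC #$02
        STA $0200
        RTS
    """
    print(assemble(program).hex(" "))
```

The whole job is a table lookup plus operand encoding, which is why an assembler makes a gentle first step before a compiler with a real parser.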
Many people (including me) would argue the easiest languages to write compilers in are functional - nowadays that probably means Haskell, Scheme or Common Lisp. This blog post is an example of how easy it is. He writes a compiler that just compiles arithmetic expressions in a few lines. This might be minimal enough for you.
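For a feel of what a compiler for arithmetic expressions looks like, here is a minimal sketch. It is not the code from the blog post, and it is written in Python rather than a functional language. The stack-machine instruction names (`PUSH`, `ADD`, and so on) and the functions `compile_expr` and `run` are made up for illustration. It parses with recursive descent and emits instructions as it goes.

```python
import operator
import re

TOKEN = re.compile(r"\s*(\d+|[-+*/()])")
OPS = {"ADD": operator.add, "SUB": operator.sub,
       "MUL": operator.mul, "DIV": operator.floordiv}  # integer division
NAMES = {"+": "ADD", "-": "SUB", "*": "MUL", "/": "DIV"}


def tokenize(text):
    tokens, pos, text = [], 0, text.rstrip()
    while pos < len(text):
        match = TOKEN.match(text, pos)
        if not match:
            raise SyntaxError(f"unexpected character near position {pos}")
        tokens.append(match.group(1))
        pos = match.end()
    return tokens


def compile_expr(text):
    """Compile an infix expression into a list of stack-machine instructions."""
    tokens, code, pos = tokenize(text), [], 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def advance():
        nonlocal pos
        pos += 1

    def factor():  # factor := NUMBER | "(" expr ")"
        tok = peek()
        if tok is not None and tok.isdigit():
            code.append(("PUSH", int(tok)))
            advance()
        elif tok == "(":
            advance()
            expr()
            if peek() != ")":
                raise SyntaxError("expected )")
            advance()
        else:
            raise SyntaxError(f"unexpected token {tok!r}")

    def binary(operand, symbols):  # operand (op operand)*, left-associative
        operand()
        while peek() in symbols:
            op = peek()
            advance()
            operand()
            code.append((NAMES[op],))  # emit the operator after both operands

    def term():  # term := factor (("*" | "/") factor)*
        binary(factor, ("*", "/"))

    def expr():  # expr := term (("+" | "-") term)*
        binary(term, ("+", "-"))

    expr()
    if peek() is not None:
        raise SyntaxError(f"unexpected token {peek()!r}")
    return code


def run(code):
    """A tiny stack machine that executes the compiled instructions."""
    stack = []
    for instr in code:
        if instr[0] == "PUSH":
            stack.append(instr[1])
        else:
            right, left = stack.pop(), stack.pop()
            stack.append(OPS[instr[0]](left, right))
    return stack.pop()


if __name__ == "__main__":
    print(compile_expr("1 + 2 * 3"))
    print(run(compile_expr("(1 + 2) * 3 - 4 / 2")))  # prints 7
```

Operator precedence falls out of the grammar: `term` binds tighter than `expr`, so `2 * 3` is emitted before the `ADD`.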
Almost every introduction to writing compilers at the academic level starts with a minimal language as an example, is always recommended, but there are other good ones.
At University I used C-- which is like C but even easier to write a compiler for. Lots of resources at: http://www.cminusminus.org/qc--.html
If you wanted a compiler and you know a language like Java I'd suggest something like JavaCC, where the language is specified using grammars. There are lots of example grammars here - pick something simple like C to get started.
Lisps (Scheme, etc.) are the simplest actual languages. You can look at how to build a primitive Scheme interpreter in Perl with this book (paper version here on Lulu). Parsing and type checking are similar in interpreters and compilers. Then, here is a more hardcore book on the compiler design subject (also available as dead tree on Lulu).
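To show why Lisp-family languages are so easy to process, here is a minimal sketch of a reader and evaluator for a very small Scheme-like subset. It is in Python rather than Perl and is not taken from either book; `parse`, `evaluate`, and `GLOBAL_ENV` are hypothetical names, and only integers, symbols, `if`, `lambda`, and function application are supported.

```python
import operator

# Assumed built-ins for the sketch; a real Scheme has many more.
GLOBAL_ENV = {"+": operator.add, "-": operator.sub,
              "*": operator.mul, "<": operator.lt}


def parse(text):
    """Read one s-expression into nested Python lists, ints, and strings."""
    tokens = text.replace("(", " ( ").replace(")", " ) ").split()

    def read(pos):
        if pos >= len(tokens):
            raise SyntaxError("unexpected end of input")
        tok = tokens[pos]
        if tok == "(":
            items, pos = [], pos + 1
            while pos < len(tokens) and tokens[pos] != ")":
                item, pos = read(pos)
                items.append(item)
            if pos >= len(tokens):
                raise SyntaxError("missing )")
            return items, pos + 1  # skip the closing parenthesis
        if tok == ")":
            raise SyntaxError("unexpected )")
        try:
            return int(tok), pos + 1
        except ValueError:
            return tok, pos + 1  # anything else is a symbol

    expr, pos = read(0)
    if pos != len(tokens):
        raise SyntaxError("trailing input")
    return expr


def evaluate(x, env):
    """Evaluate a parsed expression in the given environment (a dict)."""
    if isinstance(x, str):  # symbol: look it up
        return env[x]
    if isinstance(x, int):  # literal
        return x
    head = x[0]
    if head == "if":
        _, test, then, alt = x
        return evaluate(then if evaluate(test, env) else alt, env)
    if head == "lambda":
        _, params, body = x
        # The closure captures env; arguments shadow outer bindings.
        return lambda *args: evaluate(body, {**env, **dict(zip(params, args))})
    func = evaluate(head, env)
    args = [evaluate(arg, env) for arg in x[1:]]
    return func(*args)


if __name__ == "__main__":
    program = "((lambda (x y) (+ x (* y 2))) 3 4)"
    print(evaluate(parse(program), GLOBAL_ENV))  # prints 11
```

The parser is a dozen lines because the syntax is already a tree; a compiler would reuse the same `parse` and replace `evaluate` with code generation.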
About 1000 lines of code. Compiles Scheme to LLVM assembler or to C. I would say this is an excellent fit for a paper on compilers. If you want to go deeper, I recommend the book "SICP".
The standard Stack Overflow resource for resources on compiler writing is http://stackoverflow.com/questions/1669/learning-to-write-a-compiler
Look at the simple compiler for PL/0 (a small pascal-like subset - no parameters, only integer data). The source, written in Pascal, is only about 500 lines of code, and is easy to follow. This may be all you need to look at.
However, if you want to go a little farther, once you are comfortable with that, look at the source to Pascal-S. This is a compiler for a larger subset of Pascal, but includes some additional concepts, such as parameter passing, additional data types, and arrays and records (structures). Still it is only about 2000 lines of code, and is easy to follow once you have mastered PL/0.
You can find the sources here:
In my former IT school, we had to develop a compiler in C++, but not from scratch: there were steps, a learning curve, etc.
The concept of the TIGER Compiler and project assignments
All documents are available, but the code itself isn't, so you'd have to do it all by yourself.
There's a lot of understandable and usable information; it could be a good start for learning to code a compiler.
Google UCSD Pascal. It was a ground-breaker in the 70s. Maybe it's more than you want, but it was easily ported to a lot of "micro" chips back then.
If you want to look at code, I'm very impressed with Eijiro Sumii's MinCaml compiler.
It's only 2000 lines long.
It compiles a pretty interesting source language.
It generates real machine code, none of this namby-pamby C or LLVM stuff :-)
Speed of compiled code is competitive with gcc and the native OCaml compilers.
The compiler is designed for teaching.
Did I mention I've been very impressed?
Jack Crenshaw, a Ph.D. who has written extensively about practical numerical methods, was scared of compilers for a long time. He finally got tired of being scared, and wrote a multi-part tutorial on compiler construction, based on what he learned as he was teaching himself about the subject.
See "Let's Build a Compiler" for more information. Note that it isn't complete; he ran out of steam before he finished it, but there is a lot of easily-digestible information in there.
I've started a video tutorial on writing an ANTLR 3.x compiler - check out
http://javadude.com/articles/antlr3xtut
I'll be adding more to it soon! -- Scott