compiler-theory

Syntax analysis question

In school we were assigned to design a language and then to implement it, (I'm having so much fun implementing it =)). My teacher told us to use yacc/lex, but i decided to go with java + regex API, here is how the the language I designed looks: Program "my program" var yourName = read() if { equals("guy1" to yourName) } print("hello m...

LALR(2) dangling else

Hello Is LALR(2) able to handle the dangling else case naturally (without any special rules, as with LALR(1))? Thanks ...

Hash tables optimization

Hi In several hash table implementations I've seen the usage of heuristics like "transpose" or "move to front" for items in a bucket. What are the advantages of using such heuristics? I could't figure it out myself. Which other optimizations can be done at the hash table / bucket level, why, and under which circumstances? Optimizing...

The Dragon Book

Is the Dragon book a good introductory textbook for a compiler class? It seems to be famous, and our professor likes it, but it's not one of the textbooks that we use. ...

Register allocation and spilling, the easy way?

I'm looking for a way to allocate local variables to registers. I'm aware of a couple of serious methods for doing it (namely, those mentioned on Wikipedia), but I'm stuck on how "spilling" is accomplished. Also, the relevant literature is quite intimidating. I'm hoping there's something simpler that will satisfy my priorities: Corr...

What is the name of a function whose result depends only on its parameters?

I'm writing a toy compiler thingy which can optimise function calls if the result depends only on the values of the arguments. So functions like xor and concatenate depend only on their inputs, calling them with the same input always gives the same output. But functions like time and rand depend on "hidden" program state, and calling t...

Anywhere I can find good LR(1) and LALR(1) state generation examples or reading material?

I'm using Kenneth Louden's Compiler Construction book but it lacks examples and the rules stating how this goes are really hard to follow. I'm not sure how to go to the LR(1) states. also, not sure how to go from the LR(1) states to the LALR(1) ones. For example, these LR(1) states: i understand how the "S -> .XX, $" got there, but t...

Questions about implementation of a global register allocator for the tiny c compiler

Hey guys, upcoming summer i will hopefully start writing my masters thesis and i have been quite busy looking for a thesis subject. I now have a pool of subjects that i am interested in and the one that struck me most is the implementation of a global register allocator for the tiny C compiler (graph coloring or linear scan). So i wante...

Will this 'algorithm' for nullable and first work (in a parser)?

Working through this for fun: http://www.diku.dk/hjemmesider/ansatte/torbenm/Basics/ Example calculation of nullable and first uses a fixed-point calculation. (see section 3.8) I'm doing things in Scheme and relying a lot on recursion. If you try to implement nullable or first via recursion, it should be clear you'll recur infinitely ...

Classic Academic Textbooks

I'm familiar with both the Wizard Book: Structure and Interpretation of Computer Programs and the Dragon Book: Compilers: Principles Techniques and Tools However, I'm curious to find out which other classic academic textbooks people would consider essential reading for a programmer. ...

How do C compilers implement functions that return large structures?

The return value of a function is usually stored on the stack or in a register. But for a large structure, it has to be on the stack. How much copying has to happen in a real compiler for this code? Or is it optimized away? For example: struct Data { unsigned values[256]; }; Data createData() { Data data; // initialize d...

Parsing "off-side" (indentation-based) languages

An off-side language is the one where ...the scope of declarations (a block) in that language is expressed by their indentation. Examples of such languages are Python, Boo, Nemerle, YAML and several more. So my question is this: how do I actually parse these? How do I resolve tabs vs spaces problem (are two tabs or 8 spaces equiv...

Why we can't implement polymorphism in C++ without base class pointer or reference?

Hay Dear! First of all have a look at the following code (in this code shape is the base class and line is the derived class) void drawshapes(shape sarray[],int size) { for(int i=0;i< size; i++) sarray[i].draw(); } main() { line larray[10]; larray[0]=line(p1,p2);//assuming that we have a point class larray[1]=line(p2,p...

Advantages of compilers for functional languages over compilers for imperative languages

As a follow up to this question What are the advantages of built-in immutability of F# over C#?--am I correct in assuming that the F# compiler can make certain optimizations knowing that it's dealing with largely immutable code? I mean even if a developer writes "Functional C#" the compiler wouldn't know all of the immutability that the...

I'm attempting to write a .NET compiler using System.Reflection.Emit how do I do type resolution?

I've got a strategy for resolving types from referenced dlls. I'm stuck on trying to resolve types that are defined in the assembly that is being compiled. I'm using the System.Reflection.Emit apis with no 3rd party libraries. For instance: class A {} class B { public A AnInstanceOfA {get; private set;} } What's the best way to r...

"Type and Size Specifier" - Terminology

Take the following snippet: 1 #include <stdio.h> 2 #include <stdlib.h> 3 int foo(char [6]); 4 5 int main(void) { 6 char* bar="hello"; 7 return foo(bar); 8 } 9 10 int foo(char f[6]) { 11 return EXIT_SUCCESS; 12 } 13 What is the right technical term for "char [6]" on line 3? I call it "type and...

yacc: Distinguish Integers from Floating Point Numbers.

I'm supposed to write a program that does 2 + 2 = 4 and 2.2 + 2 = 4.2. I've already done it so that it treats everything as a floating point, but that's "wrong". I have to distinguish them. Here's what I have so far: %{ #include <stdio.h> #include <ctype.h> %} %token <dval> FLOAT %token <ival> INTEGER %union { float dval; int i...

Runtime definition

What is the runtime? And I don't mean "at run time" = as the program/script is running. I mean The <your-interpreted-language-here> runtime ...

What does S-attributed and L-attributed grammar mean?

I'm reading a compiler book and kinda confused when it says "a S-attribute grammar is also a L-attribute grammar". Couldn't understand. Can someone make it clear (an example should be great). Thanks. ...

P6 Architecture - Register renaming aside, does the limited user registers result in more ops spent spilling/loading?

I'm studying JIT design with regard to dynamic languages VM implementation. I haven't done much Assembly since the 8086/8088 days, just a little here or there, so be nice if I'm out of sorts. As I understand it, the x86 (IA-32) architecture still has the same basic limited register set today that it always did, but the internal register...