tags:

views:

162

answers:

4

Hi all , Is there any body who can tell how to create symbol table for compiler using C.

+2  A: 

The standard Stack Overflow resource for building compilers and interpreters is http://stackoverflow.com/questions/1669/learning-to-write-a-compiler

anon
This belongs in a comment. And most of those resources assume that the programmer know enough about his implementation language to not need to ask this basic a question.
dmckee
A: 

That's the easiest part imo, once you have your parser working, when you encounter identifiers in your grammar you already have every bit of information about them, types, if they're part of a function grammar rule or not, and if they're part of a function declaration/definition, you have every parameter with their types.

Once you identify all this information, the most basic symbol table (globals only) is to build a list of unions of either a name and a type (a variable) or a name, a type and a list of name-type combos (function). You can separate them with a flag or something. Once that's done, you can further nest it for functions and later on scope, thus creating a c-style symbol table. At the end, during the code generation phase, this is where you're going to write the registers/labels your symbols will use, so make sure it's easily extensible; you're gonna come back a lot to this repository to add bookkeeping data.

The trickiest part of it is migrating the information as you're still parsing your grammar. This is usually done with a large structure that you fill in as you read it. Take a C-style int f(int x, int y) declaration: once you parse int f you don't know if you're parsing a function or a variable yet, so you have to fill in your structure with just the name and type, then pass it down to the lower tree nodes (in the case of a recursive descent parser) and let them deal with it, then once they're done, return the structure to the caller so they have the full information, even though you, in the specific functions, don't have any idea what that thing you're parsing is.

Blindy
+1  A: 

Absolutely the simplest thing that you can do is provide an array of structures. SOmething like:

typedef struct {
  char *name;
  char type; /* i for int, s for string ... */
  value union {
     int i;
     char c;
     char *s;
     float f;
  }
} symbol;
symbol stable[MAX_SYMBOLS];
int symbolCount=0;

and a set of routines to manipulate it.

You'll need:

int isDefined(char *name);  /* returns trye if the named symbol already exists */
symbol* addSymbol(char *name, char type);  /* Adds a symbol; returns a pointer to it */
symbol* getSymbol(char *name); /* returns a pointer to the named symbol or NULL */

Once this is working, you will want to

  1. Get rid of the global symbol table, and make it a parameter to all you routines
  2. replace the nasty, inefficient fixed array with a tree or hash table
dmckee
A: 

Look up how to make a hash table keyed on string. That's the standard way of doing it.

BCS