I've created an interpreter for a stupid programming language in C++ and the whole core structure is finished (Tokenizer, Parser, Interpreter including Symbol tables, core functions, etc.).

Now I have a problem with creating and managing the function libraries for this interpreter (I'll explain what I mean by that later).

So currently my core function handler is horrible:

// Simplified version (parameter types omitted)
myLangResult SystemFunction( name, argc, argv )
{
      if ( name == "print" )
      {
         if( argc < 1 )
         {
           Error("blah");
         }
         cout << argv[ 0 ];
       } else if ( name == "input" ) {
         if( argc < 1 )
         {
           Error("blah");
         }
         string res;
         getline( cin, res );
         SetVariable( argv[ 0 ], res );
       } else if ( name == "exit" ) {
         exit( 0 );
       }
}

And now think of each else if being 10 times more complicated and there being 25 more system functions. Unmaintainable, feels horrible, is horrible.

So I thought: how can I create some sort of libraries that contain all the functions and that, when imported, initialize themselves and add their functions to the symbol table of the running interpreter?

However this is the point where I don't really know how to go on.

What I want to achieve is, for example, an (external?) string library for my language, say string, that is imported from within a program in that language. Example:

import string
myString = "abcde"
print string.at( myString, 2 ) # output: c

My problems:

  • How to separate the function libs from the core interpreter and load them?
  • How to get all their functions into a list and add it to the symbol table when needed?

What I was thinking to do:

At the start of the interpreter, since all libraries are compiled into it, every single function is registered through something like RegisterFunction( string libraryName, myLangResult (*functionPtr) ), which adds it to a list. When import X is then called from within the language, the functions registered for X are added to the symbol table.
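A minimal sketch of that registration idea, under some assumptions: Value, Args, NativeFn, FunctionTable, Libraries, ImportLibrary and RegisterStringLibrary are hypothetical names, values are plain strings, and RegisterFunction takes an extra function-name parameter that the signature above does not have. It is only meant to show how import could copy a library's functions into the symbol table under a "library.name" key.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-ins for the interpreter's own types.
using Value = std::string;
using Args = std::vector<Value>;
using NativeFn = std::function<Value(const Args&)>;
using FunctionTable = std::map<std::string, NativeFn>;

// All compiled-in libraries, keyed by library name. The function-local
// static sidesteps static-initialization-order problems.
std::map<std::string, FunctionTable>& Libraries() {
    static std::map<std::string, FunctionTable> libs;
    return libs;
}

// Called once per native function when the interpreter starts.
void RegisterFunction(const std::string& library, const std::string& name, NativeFn fn) {
    Libraries()[library][name] = std::move(fn);
}

// Handles `import X`: copies X's functions into the symbol table as "X.name".
// Returns false if no library with that name was registered.
bool ImportLibrary(const std::string& library, FunctionTable& symbols) {
    auto it = Libraries().find(library);
    if (it == Libraries().end()) return false;
    for (const auto& entry : it->second)
        symbols[library + "." + entry.first] = entry.second;
    return true;
}

// Example library: string.at(s, i) returns the character at index i.
// (std::stoul throws on non-numeric input; a real interpreter would check.)
void RegisterStringLibrary() {
    RegisterFunction("string", "at", [](const Args& a) -> Value {
        if (a.size() != 2) return "error: at expects 2 arguments";
        std::size_t i = std::stoul(a[1]);
        if (i >= a[0].size()) return "error: index out of range";
        return Value(1, a[0][i]);
    });
}

// Usage: nothing is visible to the program until it is imported.
void Demo() {
    RegisterStringLibrary();
    FunctionTable symbols;
    assert(symbols.count("string.at") == 0);
    assert(ImportLibrary("string", symbols));
    assert(symbols.at("string.at")(Args{"abcde", "2"}) == "c");
}
```

With this layout the symbol table only grows when a program actually imports a library, so lookups in programs that import nothing are not affected by how many libraries are compiled in.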

Disadvantages that spring to mind:

All libraries live directly in the interpreter core, so its size grows and it will definitely slow down.

+1  A: 

I think that you should look into the Command pattern. Then you can implement each function as a Command and have a map which maps function names to Command objects.

This will also enable you to load additional functions from an external library by letting each library have an initializer function which adds its functions to the map.
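A rough sketch of how that might look, assuming string values throughout. Command, CommandMap, UpperCommand, LengthCommand, InitStringLibrary and Call are hypothetical names chosen for illustration, not part of any existing interpreter.

```cpp
#include <cassert>
#include <cctype>
#include <map>
#include <memory>
#include <string>
#include <vector>

using Value = std::string;  // hypothetical value type
using Args = std::vector<Value>;

// Command interface: one object per language-level function.
class Command {
public:
    virtual ~Command() = default;
    virtual Value Execute(const Args& args) = 0;
};

using CommandMap = std::map<std::string, std::unique_ptr<Command>>;

class UpperCommand : public Command {
public:
    Value Execute(const Args& args) override {
        if (args.size() != 1) return "error: upper expects 1 argument";
        Value out = args[0];
        for (char& c : out)
            c = static_cast<char>(std::toupper(static_cast<unsigned char>(c)));
        return out;
    }
};

class LengthCommand : public Command {
public:
    Value Execute(const Args& args) override {
        if (args.size() != 1) return "error: length expects 1 argument";
        return std::to_string(args[0].size());
    }
};

// The initializer a library exposes: it adds the library's commands to the map.
void InitStringLibrary(CommandMap& commands) {
    commands["upper"].reset(new UpperCommand);
    commands["length"].reset(new LengthCommand);
}

// Dispatch replaces the if/else chain with a single map lookup.
Value Call(const CommandMap& commands, const std::string& name, const Args& args) {
    auto it = commands.find(name);
    if (it == commands.end()) return "error: unknown function " + name;
    return it->second->Execute(args);
}

void Demo() {
    CommandMap commands;
    InitStringLibrary(commands);
    assert(Call(commands, "upper", Args{"abc"}) == "ABC");
    assert(Call(commands, "length", Args{"abcde"}) == "5");
}
```

An externally loaded library would only need to export its own initializer with the same shape as InitStringLibrary.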

Anders Abel
+1  A: 

If your interpreter is implemented as a library, it is going to be called from other people's C++ code. It's not unreasonable for them to have to call functions in your library from their own code to add functions to the interpreter. That's what my own expression evaluator does. Something like this in user code:

Interpreter in;    // an instance of the interpreter
in.AddFunc( lenfun, "length", 1 );
in.AddFunc( catfun, "concat", 2 );

where the caller must supply a pointer to the implementation function, and the name and number of parameters for the function. This works well with an untyped setup - if you do strict typing, there is of course a lot more work to do.
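For illustration, here is one way such an AddFunc interface could be put together for an untyped setup. This is a guess at the shape, not the actual evaluator's code: the Interpreter internals, the Call method, and the string-based Value and Args types are all assumptions.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <stdexcept>
#include <string>
#include <vector>

using Value = std::string;  // hypothetical untyped value
using Args = std::vector<Value>;
using NativeFn = Value (*)(const Args&);

class Interpreter {
public:
    // Registers an implementation under a name with a fixed parameter count.
    void AddFunc(NativeFn fn, const std::string& name, std::size_t paramCount) {
        funcs_[name] = Entry{fn, paramCount};
    }

    // Looks the function up and checks the argument count before calling it,
    // so the implementations themselves do not have to.
    Value Call(const std::string& name, const Args& args) const {
        auto it = funcs_.find(name);
        if (it == funcs_.end())
            throw std::runtime_error("unknown function: " + name);
        if (args.size() != it->second.paramCount)
            throw std::runtime_error("wrong number of arguments for " + name);
        return it->second.fn(args);
    }

private:
    struct Entry {
        NativeFn fn;
        std::size_t paramCount;
    };
    std::map<std::string, Entry> funcs_;
};

// Implementations supplied by the user of the library.
Value lenfun(const Args& a) { return std::to_string(a[0].size()); }
Value catfun(const Args& a) { return a[0] + a[1]; }

void Demo() {
    Interpreter in;
    in.AddFunc(lenfun, "length", 1);
    in.AddFunc(catfun, "concat", 2);
    assert(in.Call("length", Args{"abcde"}) == "5");
    assert(in.Call("concat", Args{"ab", "cd"}) == "abcd");
}
```

Storing the parameter count next to the pointer lets the interpreter reject bad calls centrally instead of repeating the argc check in every function.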

anon
A: 

It's unclear if you're aiming at a fully functional programming language, or if this is a toy, so I don't know how much time you want to put into this. I suspect that dynamic library loading will be significantly more complicated than you need. You would likely need a PATH-style list of library locations, or you would need to register namespaces, which puts you back in the same place you are now.

A simpler approach is to keep a global symbol table, with associated handlers.

Use std::map (or a hash map such as std::unordered_map) for a much faster function lookup than your if/else construct. Have all functions register themselves with the symbol table.

The "handlers" stored in the symbol table can be simple objects (or function pointers) and can do their own argument checking.

class FuncHandler {
public:
  virtual ~FuncHandler() {}
  // Each handler does its own argument checking.
  virtual myLangResult Run(argc, argv) = 0;
};

typedef std::map<string, FuncHandler*> FuncTableType;

// Simplified version (parameter types omitted, as in the question)
myLangResult SystemFunction(name, argc, argv )
{
  FuncTableType::const_iterator it = function_table_.find(name);
  if (it == function_table_.end()) return Error("Unknown function: " + name);
  return it->second->Run(argc, argv);
}

I wrote some code to perform self-registration for classes in this question, but it's not necessarily needed here: http://stackoverflow.com/questions/2535072/accessing-c-functions-from-text-storage/2535228#2535228
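For readers who do want self-registration, here is a small sketch of one common way to do it in C++. It is not the code from the linked answer. Registrar, ReverseHandler, the FunctionTable() accessor and the string-based Value and Args types are assumptions made for the example.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

using Value = std::string;  // hypothetical value type
using Args = std::vector<Value>;

class FuncHandler {
public:
    virtual ~FuncHandler() = default;
    virtual Value Run(const Args& args) = 0;
};

using FuncTableType = std::map<std::string, FuncHandler*>;

// Function-local static: the table exists the first time anyone asks for it,
// regardless of the order in which file-scope objects are constructed.
FuncTableType& FunctionTable() {
    static FuncTableType table;
    return table;
}

// Constructing a Registrar adds a handler to the global table.
struct Registrar {
    Registrar(const std::string& name, FuncHandler* handler) {
        FunctionTable()[name] = handler;
    }
};

class ReverseHandler : public FuncHandler {
public:
    Value Run(const Args& args) override {
        if (args.size() != 1) return "error: reverse expects 1 argument";
        return Value(args[0].rbegin(), args[0].rend());
    }
};

// File-scope objects: the registration runs before main() starts.
static ReverseHandler reverseHandler;
static Registrar reverseRegistrar("reverse", &reverseHandler);

Value SystemFunction(const std::string& name, const Args& args) {
    auto it = FunctionTable().find(name);
    if (it == FunctionTable().end()) return "error: unknown function " + name;
    return it->second->Run(args);
}

void Demo() {
    assert(SystemFunction("reverse", Args{"abc"}) == "cba");
}
```

Each handler can then live in its own source file together with its Registrar, and the dispatcher never needs to be edited when a function is added.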

Stephen