What I'm doing: I'm writing a small interpreter system that can parse a file, turn it into a sequence of operations, and then feed thousands of data sets into that sequence to extract some final value from each. A compiled interpreter consists of a list of pure functions that take two arguments: a data set, and an execution context. Each function returns the modified execution context:
type ('data, 'context) interpreter = ('data -> 'context -> 'context) list
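To make the shape of this type concrete, here is a minimal sketch of how running such a list could work. The `run` function below is a hypothetical stand-in (my actual `Interpreter.run` is not shown here), assuming instructions are applied in list order and the context is threaded through each one:

```ocaml
type ('data, 'context) interpreter = ('data -> 'context -> 'context) list

(* Hypothetical stand-in for Interpreter.run: apply every instruction in
   order, passing the same data set and threading the context through. *)
let run (interp : ('data, 'context) interpreter) (data : 'data)
    (ctx : 'context) : 'context =
  List.fold_left (fun c instr -> instr data c) ctx interp

(* Illustration only: integer data, integer context. *)
let example : (int, int) interpreter =
  [ (fun d c -> c + d);   (* add the data to the context *)
    (fun d c -> c * d) ]  (* then multiply the context by the data *)

let result = run example 3 1
```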
The compiler is essentially a tokenizer with a final token-to-instruction mapping step that uses a map description defined as follows:
type ('data, 'context) map = (string * ('data -> 'context -> 'context)) list
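As a rough illustration of that final mapping step, the sketch below looks each token up in the map and warns about unknown ones. The name `compile` and the warning text are assumptions made for this example, not the real compiler code:

```ocaml
type ('data, 'context) map = (string * ('data -> 'context -> 'context)) list

(* Hypothetical mapping step: turn a token list into an instruction list,
   printing a warning on stderr for every token missing from the map. *)
let compile (map : ('data, 'context) map) (tokens : string list) =
  List.filter_map
    (fun tok ->
      match List.assoc_opt tok map with
      | Some instr -> Some instr
      | None ->
        prerr_endline ("warning: unknown token " ^ tok);
        None)
    tokens

(* Illustration only: integer data and integer context. *)
let calc_map : (int, int) map =
  [ "add", (fun d c -> c + d);
    "mul", (fun d c -> c * d) ]

(* "div" is not in the map, so it triggers a warning and is dropped. *)
let compiled = compile calc_map [ "add"; "mul"; "div" ]
```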
Typical interpreter usage looks like this:
let pocket_calc =
  let map = [ "add", (fun d c -> c # add d) ;
              "sub", (fun d c -> c # sub d) ;
              "mul", (fun d c -> c # mul d) ]
  in
  Interpreter.parse map "path/to/file.txt"

let new_context = Interpreter.run pocket_calc data old_context
The problem: I'd like my pocket_calc interpreter to work with any class that supports add, sub and mul methods, and the corresponding data type (could be integers for one context class and floating-point numbers for another).
However, pocket_calc is defined as a value and not a function, so the type system does not make its type generic: the first time it's used, the 'data and 'context types are bound to the types of whatever data and context I first provide, and the interpreter becomes forever incompatible with any other data and context types.
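The behaviour can be reproduced without my interpreter module. In the sketch below, `parse` and `run` are simplified stand-ins (assumptions for this example only); the point is that a value produced by a function application gets weak, non-generalized type variables:

```ocaml
(* Stand-in for Interpreter.parse: any function application has the
   same effect on generalization. *)
let parse map = List.map snd map

(* The right-hand side is an application, not a syntactic value, so the
   toplevel reports  instrs : ('_weak1 -> '_weak2 -> '_weak2) list  *)
let instrs = parse [ "keep", (fun _d c -> c) ]

(* Stand-in for Interpreter.run. *)
let run interp data ctx = List.fold_left (fun c f -> f data c) ctx interp

(* The first use fixes '_weak1 = int and '_weak2 = float for good. *)
let a = run instrs 1 2.0

(* A second use at other types is now rejected by the type checker:
   let b = run instrs "x" "y" *)
```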
A viable solution is to eta-expand the definition of the interpreter to allow its type parameters to be generic:
let pocket_calc data context =
  let map = [ "add", (fun d c -> c # add d) ;
              "sub", (fun d c -> c # sub d) ;
              "mul", (fun d c -> c # mul d) ]
  in
  let interpreter = Interpreter.parse map "path/to/file.txt" in
  Interpreter.run interpreter data context
However, this solution is unacceptable for several reasons:
It re-compiles the interpreter every time it's called, which significantly degrades performance. Even the mapping step (turning a token list into an interpreter using the map list) causes a noticeable slowdown.
My design relies on all interpreters being loaded at initialization time, because the compiler issues warnings whenever a token in the loaded file does not match a line in the map list, and I want to see all those warnings when the software launches (not when individual interpreters are eventually run).
I sometimes want to reuse a given map list in several interpreters, whether on its own or by prepending additional instructions (for instance, "div").
The questions: is there any way to make the type parametric other than eta-expansion? Maybe some clever trick involving module signatures or inheritance? If that's impossible, is there any way to alleviate the three issues I have mentioned above in order to make eta-expansion an acceptable solution? Thank you!