
I'm in the process of implementing a cross-platform (Mac OS X, Windows, and Linux) application which will do lots of CPU-intensive analysis of financial data. The bulk of the analysis engine will be written in C++ for speed reasons, with a user-accessible scripting engine interfacing with the C++ testing engine. I want to write several scripting front-ends over time to emulate other popular software with existing large user bases. The first front-end will be a VisualBasic-like scripting language.

I'm thinking that LLVM would be perfect for my needs. Performance is very important because of the sheer amount of data; a single run of tests can take hours or days to produce an answer. I believe that using LLVM will also allow me to use a single back-end solution while I implement different front-ends for different flavors of the scripting language over time.

The testing engine itself will be separated from the interface and testing will even take place in a separate process with progress and results being reported to the testing management interface. Tests will consist of scripting code integrated with the testing engine code.

In a previous implementation of a similar commercial testing system I wrote, I built a fast interpreter which easily interfaced with the testing library because it was written in C++ and linked directly to the testing engine library. However, callbacks from scripting code to testing library objects involved translating between the two formats, with significant overhead.

I'm imagining that with LLVM, I could implement the callbacks into C++ directly so that I could make the scripting code work almost as if it had been written in C++. Likewise, if all the code were compiled to LLVM bitcode format, it seems like the LLVM optimizers could optimize across the boundaries between the scripting language and the testing engine code that was written in C++.

I don't want to have to compile the testing engine every time. Ideally, I'd like to JIT compile only the scripting code. For small tests, I'd skip some optimization passes, while for large tests, I'd perform full optimizations during the link.

So is this possible? Can I precompile the testing engine to a .o object file or .a library file and then link in the scripting code using the JIT?

Finally, ideally, I'd like to have the scripting code implement specific methods of subclasses of a specific C++ class. So the C++ testing engine would only see C++ objects, while the JIT setup code compiled scripting code that implemented some of the methods for those objects. It seems that if I used the right name mangling algorithm, it would be relatively easy to set up the LLVM generation for the scripting language to look like a C++ method call which could then be linked into the testing engine.

Thus the linking stage would go in two directions: calls from the scripting language into the testing engine objects to retrieve pricing information and test state information, and calls from the testing engine to methods of particular C++ objects whose code was supplied not from C++ but from the scripting language.

In summary:

1) Can I link in precompiled (either .bc, .o, or .a) files as part of the JIT compilation, code-generation process?

2) Can I link in code using the process in 1) above in such a way that I am able to create code that acts as if it was all written in C++?

+1  A: 
  1. I believe so.
  2. This is hairy. You need to match the C++ ABI of the functions you are calling into, and you need to make sure the generated code uses the same data structures, classes, layout, etc. (via an equivalent of header files). The C++ ABI has quite a number of nuances and portability issues. Perhaps prototype by doing interop with C first. clang has limited support for C++ right now.
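
A minimal sketch of what "interop with C first" could look like, assuming the engine exposes its objects through flat `extern "C"` functions that take an opaque pointer. `PriceSeries`, `engine_series_size`, `engine_series_close` and `script_average_close` are hypothetical names made up for illustration; `script_average_close` is ordinary C++ standing in for the code a script front-end would generate.

```cpp
#include <cassert>
#include <cstddef>
#include <limits>
#include <utility>
#include <vector>

// Hypothetical engine class; stands in for a real testing-engine object.
class PriceSeries {
public:
    explicit PriceSeries(std::vector<double> closes) : closes_(std::move(closes)) {}
    std::size_t size() const { return closes_.size(); }
    bool has(std::size_t i) const { return i < closes_.size(); }
    double close(std::size_t i) const { return closes_[i]; }
private:
    std::vector<double> closes_;
};

// Flat C-ABI entry points. Generated code only needs to know these
// unmangled names and plain C signatures, not the C++ class layout.
extern "C" {

std::size_t engine_series_size(const void* series) {
    assert(series != nullptr);
    return static_cast<const PriceSeries*>(series)->size();
}

// Returns NaN for an out-of-range index rather than throwing, so that no
// C++ exception has to cross the C boundary.
double engine_series_close(const void* series, std::size_t index) {
    assert(series != nullptr);
    const PriceSeries* s = static_cast<const PriceSeries*>(series);
    return s->has(index) ? s->close(index)
                         : std::numeric_limits<double>::quiet_NaN();
}

}  // extern "C"

// Stand-in for script code: it treats the series as an opaque handle and
// reaches the engine only through the two C functions above.
double script_average_close(const void* series) {
    const std::size_t n = engine_series_size(series);
    if (n == 0) return 0.0;
    double sum = 0.0;
    for (std::size_t i = 0; i < n; ++i) sum += engine_series_close(series, i);
    return sum / static_cast<double>(n);
}
```

The point of the opaque `void*` is that the script side never depends on vtable layout, member offsets, or mangled names; the cost is one extra call per access unless the wrappers get inlined.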
Yann Ramin
+4  A: 
  1. Yes we can! The API calls differ depending on the version of LLVM you use; on 2.5 you will need llvm::getBitcodeModuleProvider.
  2. The easiest way to call C++ functions is to declare the function in your module (llvm::Function::Create) with the linkage type llvm::Function::ExternalLinkage, and then call addGlobalMapping on the execution engine to make that declaration point to your C++ function.
Soroush
Thank you for your help. I'll check that out.
inflector
+1  A: 

1) You can load and link .bc files. .o files, if they have been compiled into a .so shared library, should be loadable, and the symbols in them should be usable.

2) As long as you don't want to do horrible things with the callbacks, you can probably just pass standard C function pointers and do callbacks through them. You can do certain other things too, but trying to define C++ objects or templates, or to call member functions, while not being a C++ compiler is something you want to avoid.

You must know the C++ ABI, the platform you target, and all sorts of other things; you effectively must be a C++ compiler to generate code that looks like C++. The name mangler is one of the most annoying parts.
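
One way the function-pointer approach might cover the "engine only sees C++ objects" requirement is a thin adapter subclass, written in real C++, whose virtual method forwards to a plain C function pointer. This is only a sketch under assumptions: `Strategy`, `ScriptedStrategy`, `OnBarFn`, `RunningMax`, `script_on_bar` and `runEngine` are hypothetical names, and `script_on_bar` is ordinary C++ standing in for whatever address the JIT would hand back for the compiled script function.

```cpp
#include <cassert>

// Hypothetical engine-side interface; the engine only ever sees this type.
class Strategy {
public:
    virtual ~Strategy() = default;
    virtual double onBar(double close) = 0;
};

extern "C" {
// Plain C signature that the script back-end has to produce. The void*
// carries whatever per-script state the generated code needs.
typedef double (*OnBarFn)(void* state, double close);
}

// Adapter compiled by the real C++ compiler, so vtables and name mangling
// are handled for us. The script supplies only the function pointer.
class ScriptedStrategy final : public Strategy {
public:
    ScriptedStrategy(OnBarFn fn, void* state) : fn_(fn), state_(state) {
        assert(fn_ != nullptr);
    }
    double onBar(double close) override { return fn_(state_, close); }
private:
    OnBarFn fn_;
    void* state_;
};

// Stand-in for a JIT-compiled script: tracks the highest close seen so far.
struct RunningMax { double value; };

extern "C" double script_on_bar(void* state, double close) {
    RunningMax* m = static_cast<RunningMax*>(state);
    if (close > m->value) m->value = close;
    return m->value;
}

// Engine-side driver: unaware that onBar is backed by script code.
double runEngine(Strategy& strategy, const double* closes, int count) {
    double last = 0.0;
    for (int i = 0; i < count; ++i) last = strategy.onBar(closes[i]);
    return last;
}
```

Constructing `ScriptedStrategy(&script_on_bar, &state)` and passing it to `runEngine` drives the script function through an ordinary virtual call. The trade-off is an indirect call the optimizer cannot see through, in exchange for never having to reproduce the C++ ABI from generated code.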

OmniMancer