views:

590

answers:

7

This is a question I've always wanted to know the answer to, but never really asked.

How does code written in one language, particularly an interpreted language, get called by code written in a compiled language?

For example, say I'm writing a game in C++ and I outsource some of the AI behavior to be written in Scheme. How does the code written in Scheme get to a point that is usable by the compiled C++ code? How is it used by the C++ source code, and how is it used by the C++ compiled code? Is there a difference in the way it's used?

Related

How do multiple-languages interact in one project?

+4  A: 

Typically the C++ code will invoke an interpreter for the scripting language. The degree of interaction between the compiled and scripting code is dependent on the interpreter but there's always a way to pass data between the two. Depending on the interpreter, it may be possible to manipulate the objects on one side from the other side, such as a C++ function calling a method on a Ruby object. There may even be a way to control execution of one from the other.
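As a rough sketch of what "invoking an interpreter" looks like from the C++ side, here is a toy embedded interpreter. Everything in it (the `TinyInterp` class, `define`, `eval`, and the `enemy-distance` function used below) is hypothetical and stands in for a real Scheme implementation's embedding API; it only evaluates integer prefix expressions. The point is the shape of the interaction: the host registers C++ functions under script-visible names, hands script text to the interpreter, and gets a value back.

```cpp
#include <cassert>
#include <cctype>
#include <functional>
#include <map>
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

// Hypothetical toy interpreter: evaluates integer prefix expressions such as
// "(+ 1 (* 2 3))". It is a stand-in for an embedded Scheme, not a real one.
class TinyInterp {
public:
    using Native = std::function<long(const std::vector<long>&)>;

    TinyInterp() {
        // Built-ins supplied by the interpreter itself.
        define("+", [](const std::vector<long>& args) {
            long sum = 0;
            for (long v : args) sum += v;
            return sum;
        });
        define("*", [](const std::vector<long>& args) {
            long product = 1;
            for (long v : args) product *= v;
            return product;
        });
    }

    // The host exposes a C++ function to scripts under a script-visible name.
    void define(const std::string& name, Native fn) {
        natives_[name] = std::move(fn);
    }

    // The host passes script text in and receives the resulting value.
    long eval(const std::string& src) {
        std::size_t pos = 0;
        return evalExpr(src, pos);
    }

private:
    static void skipSpaces(const std::string& s, std::size_t& pos) {
        while (pos < s.size() && std::isspace(static_cast<unsigned char>(s[pos]))) ++pos;
    }

    static std::string readToken(const std::string& s, std::size_t& pos) {
        std::size_t start = pos;
        while (pos < s.size() && s[pos] != '(' && s[pos] != ')' &&
               !std::isspace(static_cast<unsigned char>(s[pos]))) {
            ++pos;
        }
        return s.substr(start, pos - start);
    }

    long evalExpr(const std::string& s, std::size_t& pos) {
        skipSpaces(s, pos);
        if (pos >= s.size()) throw std::runtime_error("unexpected end of input");
        if (s[pos] != '(') {
            return std::stol(readToken(s, pos));  // a bare integer literal
        }
        ++pos;  // consume '('
        skipSpaces(s, pos);
        std::string name = readToken(s, pos);
        std::vector<long> args;
        for (;;) {
            skipSpaces(s, pos);
            if (pos >= s.size()) throw std::runtime_error("missing ')'");
            if (s[pos] == ')') { ++pos; break; }
            args.push_back(evalExpr(s, pos));  // arguments are evaluated first
        }
        auto it = natives_.find(name);
        if (it == natives_.end()) throw std::runtime_error("unknown function: " + name);
        return it->second(args);
    }

    std::map<std::string, Native> natives_;
};
```

A host would create one `TinyInterp`, call `define("enemy-distance", ...)` with a C++ lambda that reads game state, and then call `eval("(+ (enemy-distance) 1)")`. Real interpreters offer the same two directions (host calls script, script calls host) with much richer data types.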

Pesto
+3  A: 

There is a protocol for how modules communicate. Here is a high-level, broad overview of how it works:

  1. A library is created for code you want to 'share'. These are commonly called DLLs or SOs depending on your platform.
  2. Each function you want to expose (an entry point) is made available for the outside world to bind to. Binding follows conventions such as the calling convention, which specifies the order in which parameters are passed, who cleans up the stack, how many parameters are stored in registers and which ones, etc. See cdecl, stdcall, etc. for examples of calling conventions.
  3. The calling module will then either statically or dynamically bind to the shared library.
  4. Once your calling library is bound to the shared library it can then specify it wants to bind to a particular entry point. This is generally done by name, however most platforms also offer the option of binding by index (faster, yet more brittle if your module changes and entry points are reordered).
  5. You will also generally declare the function you want to call somewhere in your module so that your language can do static type checking, know what the calling convention is, etc.
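The dynamic variant of the steps above can be sketched with the POSIX `dlopen`/`dlsym` calls. This is a minimal example under assumptions: the helper name `callFromLibrary` is made up for illustration, the entry point is assumed to have the signature `double f(double)`, and the library name used in the usage note (`libm.so.6`, the glibc math library on Linux) will differ on other platforms. Windows has the analogous `LoadLibrary`/`GetProcAddress` pair.

```cpp
#include <cassert>
#include <dlfcn.h>    // POSIX dynamic loading: dlopen, dlsym, dlclose, dlerror
#include <stdexcept>
#include <string>

// Step 5: declare the entry point's type so the compiler can type-check the call.
using UnaryMath = double (*)(double);

// Builds an error message without dereferencing a null dlerror() result.
static std::string lastDlError(const char* what) {
    const char* err = dlerror();
    return std::string(what) + ": " + (err ? err : "unknown error");
}

// Hypothetical helper: bind to a shared library, look up one entry point by
// name, call it, and unload the library again.
double callFromLibrary(const char* libraryPath, const char* symbolName, double arg) {
    // Step 3: bind to the shared library at run time.
    void* handle = dlopen(libraryPath, RTLD_NOW);
    if (!handle) throw std::runtime_error(lastDlError("dlopen failed"));

    // Step 4: bind to a particular entry point by name.
    void* symbol = dlsym(handle, symbolName);
    if (!symbol) {
        std::string message = lastDlError("dlsym failed");
        dlclose(handle);
        throw std::runtime_error(message);
    }

    // The cast is where the caller asserts the signature; nothing verifies it.
    UnaryMath fn = reinterpret_cast<UnaryMath>(symbol);
    double result = fn(arg);
    dlclose(handle);
    return result;
}
```

On a glibc-based Linux system, `callFromLibrary("libm.so.6", "cos", 0.0)` looks up the C library's `cos` and calls it. Older toolchains may need `-ldl` at link time.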

For your scenario of calling Scheme from C++, the Scheme interpreter most likely exports a function that dynamically binds to a Scheme function/object and calls that. If the Scheme module is compiled it probably has the option of exporting an entry point so your C++ module could bind to that. I am not very familiar with Scheme so someone else can probably answer the specifics of that particular binding better than I.

Adam Markowitz
Haa ha!! Hi Sparky! Fancy seeing you here. (vince)
veefu
+1 for being an awesome guy :)
veefu
Hey there Vince! Haven't seen you in a while. How you've been (hit me up on FB)?
Adam Markowitz
+1  A: 

If you're actually looking for tools to do such a thing, a la Adam's response, see SWIG.

Matt
+1  A: 

You can also integrate the two environments without having to compile the interpreter's library inside your executable. You keep your exe and the Scheme exe as separate programs on your system. From your main exe you can write your Scheme code to a file, then use system() or exec() to run the Scheme interpreter. You then parse the output of the Scheme interpreter.

The approach suggested above keeps the exes separate, so you do not have to worry about 3rd-party dependencies, which can be significant. Problems also stay contained in one exe or the other.
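A minimal sketch of the separate-process approach, using POSIX `popen` rather than `system()` or `exec()` so that the child's output can be read directly. The helper name `runAndCapture` is made up for illustration, and the command you pass would be whatever launches your Scheme interpreter on your script file; that command line is not shown here because it depends on the interpreter.

```cpp
#include <cassert>
#include <stdexcept>
#include <stdio.h>    // popen and pclose are POSIX additions to stdio
#include <string>

// Hypothetical helper: run a command in a child process and return
// everything it wrote to standard output.
std::string runAndCapture(const std::string& command) {
    FILE* pipe = popen(command.c_str(), "r");
    if (!pipe) throw std::runtime_error("popen failed");

    std::string output;
    char buffer[256];
    // fgets null-terminates each chunk, so appending the buffer is safe.
    while (fgets(buffer, sizeof buffer, pipe) != nullptr) {
        output += buffer;
    }

    // pclose waits for the child and reports its exit status.
    int status = pclose(pipe);
    if (status != 0) throw std::runtime_error("command failed: " + command);
    return output;
}
```

The caller then parses the captured text, for example with `std::stol(runAndCapture(...))` if the script prints a single integer. Each call pays the cost of starting a new process.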

If running a separate exe does not satisfy your performance requirements, you can devise a protocol where the Scheme interpreter becomes a server. You need to write some Scheme functions that wait for input on a socket or file, eval that input, then output the result to the same socket or a different file. Another iteration of this is to look at existing servers that may already be running your interpreter; for example, Apache has modules that allow code to be written in many languages.

Marius Seritan
+12  A: 

There is no single answer to the question that works everywhere. In general, the answer is that the two languages must agree on "something" -- a set of rules or a "calling protocol".

At a high level, any protocol needs to specify three things:

  • "Discovery": How to find out about each other.
  • "Linking": How to make the connection (after they know about each other).
  • "Invocation": How to actually make requests to each other.

The details depend heavily on the protocol itself.

Sometimes the two languages conspire to work together. Sometimes the two languages agree to support some outside-defined protocol. These days, the OS or the "runtime environment" (.NET and Java) is often involved as well. Sometimes the ability only goes one way ("A" can call "B", but "B" cannot call "A").

Notice that this is the same problem that any language faces when communicating with the OS. The Linux kernel is not written in Scheme, you know!

Let's see some typical answers from the world of Windows:

  • C with C++: C++ uses a contorted ("mangled") variation of the "C protocol". C++ can call into C, and C can call into C++ (although the names can be quite messy sometimes and it might need external help translating the names). This is not just Windows; it's generally true on all platforms that support both. Most popular OSes use a "C protocol" as well.

  • VB6 vs. most languages: VB6's preferred method is the "COM protocol". Other languages must be able to write COM objects to be usable from VB6. VB6 can produce COM objects too (although not every possible variation of COM objects).

    VB6 can also talk a very limited variation of the "C protocol", and then only to make calls outside: it cannot create objects that can be talked to directly via the "C protocol".

  • .NET languages: All .NET languages compile to the same low-level language (IL). The runtime manages the communication and from that point of view, they all look like the same language.

  • VBScript vs. other languages: VBScript can only talk a subset of the COM protocol.
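To illustrate the "C with C++" item in the list above, here is a small sketch of how C++ code opts out of name mangling so that C code (or anything else that speaks the "C protocol") can bind to it by name. The function names are made up for the example.

```cpp
#include <cassert>

// Ordinary C++ functions: the two overloads can share a name because the
// compiler encodes the parameter types into each linker symbol ("mangling").
// With GCC or Clang these become _Z5scalei and _Z5scaled; other compilers,
// such as MSVC, use a different scheme.
int scale(int value) { return value * 2; }
double scale(double value) { return value * 2.0; }

// extern "C" gives a function C linkage: its symbol is the plain name, so a
// C caller can declare and call it. The price is that overloading is not
// available, so each variant needs its own distinct name.
extern "C" int scale_int(int value) { return scale(value); }
extern "C" double scale_double(double value) { return scale(value); }
```

A C translation unit would declare `int scale_int(int);` and call it like any other C function. In the other direction, a C++ file wraps the C declarations it wants to call in `extern "C" { ... }`, which is what most C library headers do when they detect a C++ compiler.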

One more note: SOAP "Web Services" is really a "calling protocol" as well, like many other web-based protocols that are becoming popular. After all, it's all about talking to code written in a different language (and running in a different box at that!)

Euro Micelli
+1  A: 

From a theoretical point of view, when program A needs to use resources (classes, functions, etc.) from program B, it comes down to passing some information from A to B and getting some information back or having some action performed. So B needs to provide a way for A to pass in information and get a result.

In practice, it usually falls to the languages to handle this process: language B (the one program B is written in) defines a protocol and makes resources in B available in a predefined way, then language A (the one program A is written in) provides some utility or framework to help invoke the exposed resources and get results following B's protocol.

To be more specific to your question: for interpreted languages the process is fairly universal. The protocol is normally along the lines of command-line parameters, HTTP requests and other ways of transmitting plain text. Take the HTTP example: program B receives a call as an HTTP request and processes the request from there on. The actual format of the input is decided entirely by program B.

Things like SOAP are just a way to get programs to take input in a commonly agreed standard.

FlyinFish
+1  A: 

It's been a decade or so, but I did exactly this for my senior capstone (well, I built a back-propagating neural network in C, and used a Scheme program to teach it). The version of Scheme I was using had a compiler as well as an interpreter, and I was able to build it as a .o file. I don't know which version of Scheme I was running, but it appears that RScheme will turn your Scheme code into C.

Matt Poush