As I was reflecting on this subject, it seemed to me that if a language is implemented in C++, it might well have a mechanism for linking to C++. As I recall, Java is this way through JNI, though I don't really remember if it goes through C++ or plain C.

However, it seems that in general languages don't link to C++, and are accessible to C++ only through C for various reasons.

So, out of curiosity,

What languages are there that can link to C++ and how and to what extent?

(No credit for the aforementioned C bridge unless it is done in an elegant or interesting way, like Boost.Python)

+9  A: 

Note: the ABI includes name mangling as well as the calling convention, environment requirements, etc. The EABI (not covered here) is platform-specific but is usually standardized for each platform (though this is not guaranteed), and because it doesn't change much it does not suffer from the problems of a varying ABI. Consult the ABI article on Wikipedia for a bit more on the subject.

Linking to C++ is unlikely because the C++ standard does not define an ABI. Thus, the names of symbols can change at the will of the compiler vendor. GCC mangles names differently from MS's compiler, for example. Even with the same compiler, ABIs may differ between versions (as is the case with some GCC versions).

However, linking to C++ is not impossible. It's common for C++ programs to link to other C++ programs because both are compiled by the same compiler (or at least compilers which use the same ABI). If the ABI is known, non-C++ applications can link to C++ ones.

Linking to C is easy. This is because C's ABI is more stable than C++'s ABI on most platforms, which in turn is due to C's simplicity compared to C++. Because C++ can export C symbols, bypassing the problem with C++'s ABI, you sometimes see C++ libraries export C symbols which allow interaction with the underlying C++ objects without the problems of name mangling, link compatibility, and future-proofing. However, a library that exports only C symbols no longer lets you use its C++ interface directly; you are limited to the C interface.
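As a rough sketch of the "export C symbols" approach: the class `Counter` and the `counter_*` functions below are hypothetical names invented for illustration, not part of any real library. The idea is that only unmangled function names and C-compatible types cross the library boundary, while the C++ object stays hidden behind an opaque handle.

```cpp
#include <cassert>
#include <new>

// Hypothetical C++ class that the library uses internally.
class Counter {
public:
    explicit Counter(int start) : value_(start) {}
    void add(int n) { value_ += n; }
    int value() const { return value_; }
private:
    int value_;
};

// Callers only ever see a forward declaration of this struct,
// so the C++ object inside it stays opaque to them.
struct counter_handle {
    Counter impl;
};

// C linkage: these names are exported unmangled, so any language
// that can call C functions can link against them.
extern "C" {

counter_handle* counter_create(int start) {
    // nothrow so that no C++ exception escapes across the C boundary;
    // returns a null pointer if allocation fails.
    return new (std::nothrow) counter_handle{Counter(start)};
}

void counter_add(counter_handle* h, int n) {
    h->impl.add(n);
}

int counter_value(const counter_handle* h) {
    return h->impl.value();
}

void counter_destroy(counter_handle* h) {
    delete h;
}

}  // extern "C"
```

A caller would pair `counter_create` with `counter_destroy` and pass the handle to every other call. In a real library the declarations would presumably live in a header that compiles as both C and C++.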

But you can still offer a C++ interface on top of the C exports. The library can provide some classes in header files which wrap the C exports for you. This is more work for the library writers (and can cause some problems, because DRY usually isn't practiced with this technique) but provides much benefit for library users (who can choose between both interfaces without worrying about linking issues!). For other languages, the library writers can provide interfaces analogous to the C++ header files which deal with the C exports themselves. (You then have the advantage of using native classes and such.)
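A minimal sketch of such a header-only wrapper, assuming a C API shaped like the hypothetical `counter_*` functions below (they are defined inline here only so the example is self-contained; in practice they would come from the compiled library). The wrapper class `CounterRef` is likewise an invented name. Because the class lives entirely in the header, it is compiled by the user's own compiler and only the C symbols cross the binary boundary.

```cpp
#include <cassert>
#include <utility>

// --- Stand-in for the library's C exports (hypothetical names) ---
struct counter_handle {
    int value;
};

extern "C" {
counter_handle* counter_create(int start) { return new counter_handle{start}; }
void counter_add(counter_handle* h, int n) { h->value += n; }
int counter_value(const counter_handle* h) { return h->value; }
void counter_destroy(counter_handle* h) { delete h; }
}

// --- Header-only C++ wrapper shipped alongside the C API ---
class CounterRef {
public:
    explicit CounterRef(int start) : h_(counter_create(start)) {}

    // The destructor releases the handle, so users get RAII for free.
    ~CounterRef() {
        if (h_) counter_destroy(h_);
    }

    // Copying would double-free the handle, so only moves are allowed.
    CounterRef(const CounterRef&) = delete;
    CounterRef& operator=(const CounterRef&) = delete;

    CounterRef(CounterRef&& other) noexcept
        : h_(std::exchange(other.h_, nullptr)) {}

    CounterRef& operator=(CounterRef&& other) noexcept {
        if (this != &other) {
            if (h_) counter_destroy(h_);
            h_ = std::exchange(other.h_, nullptr);
        }
        return *this;
    }

    void add(int n) { counter_add(h_, n); }
    int value() const { return counter_value(h_); }

private:
    counter_handle* h_;  // owned; null after being moved from
};
```

The duplication the answer mentions is visible here: every method exists once as a C export and once as a wrapper member.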

strager
Yes; GCC changes the names when it makes incompatible changes (eg 4.2.x to 4.3.x, IIRC).
Jonathan Leffler
Interesting anecdote about GCC.
Ellery Newcomer
3.3 to 3.4 was a major break of ABI
Johannes Schaub - litb
Not sure why I am losing votes here. Anyone care to explain?
strager
I was actually wondering that too. I have no idea but here's a vote :-).
tgamblin
i also upvoted you. however "This is because C's ABI is well defined and stable, unlike C++'s ABI" is not quite right. both C's and C++'s ABI can be defined within one platform + compiler. there is no difference between the defined-ness of C and C++. it is just that C is much simpler.
Johannes Schaub - litb
and thus has far fewer problems with different ABIs. like Jonathan says in the other answer, name mangling actually is only a small part of an ABI (and like you said too now). there is more, like register passing and so on. C's ABI isn't defined across platforms+compilers either.
Johannes Schaub - litb
@litb, I remember reading somewhere that C's ABI is defined in a standard (probably C89 and C99). Perhaps only name mangling is defined?
strager
@litb, I have edited my answer. Does it please you? =]
strager
i don't know of any ABI rules put by the standard. i would rather be very surprised if the C standard puts any force here. anyway, of course, i like your answer much better now. thumbs up :)
Johannes Schaub - litb
+5  A: 

There is no language that can do this well because the C++ standard does not specify a name mangling scheme. Because of this, different compilers are free to mangle names differently, and there's no consistent way to link to C++ binaries.

The reason so many things can link to C is that C has simple and consistent linkage. A function foo is called foo in the object file. Likewise, things can link to Fortran reasonably well because foo is one of foo, FOO, foo_, or foo__.
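To make the contrast concrete, here is a small sketch with made-up function names (`scale`, `c_scale`). The symbol names in the comments are what a compiler following the Itanium C++ ABI (GCC and Clang on Linux, for instance) would typically emit; other compilers such as MSVC use an entirely different scheme, which is exactly the problem.

```cpp
#include <cassert>

// Two C++ overloads share one source-level name, so the compiler has to
// encode the parameter types into each symbol to keep them apart.
// Under the Itanium C++ ABI these typically appear as:
//   _Z5scalei   for scale(int)
//   _Z5scaled   for scale(double)
int scale(int x) { return x * 2; }
double scale(double x) { return x * 2.0; }

// With C linkage the symbol is just the plain name "c_scale"
// (some platforms add a leading underscore). Overloading is not
// possible here, because there is nowhere to encode the types.
extern "C" int c_scale(int x) { return scale(x); }

// Quick sanity check that all three entry points behave as expected.
void check_scale() {
    assert(scale(3) == 6);
    assert(scale(1.5) == 3.0);
    assert(c_scale(4) == 8);
}
```

Compiling this to an object file and listing its symbols (for example with `nm`) is one way to see the difference on a given toolchain.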

This is why there are so many wrapper generators for C++ objects (SWIG, Boost.Python, SIP, etc.). They define an interface as a set of C calls precisely because that makes things easy to link against.

Something else to keep in mind is whether you really want to link directly to C++ libraries. Many wrapper generators provide a lot of policy options when you generate wrappers. Remember, C calls boil down to just functions, but C++ calls typically have a lot of OO semantics you'll need to navigate around. You need to specify how your C++ objects should be garbage collected in the host language, how things get copied, where vptrs are, how objects are laid out, where objects get allocated, and all the subtle differences between the host language's object model and that of C++.

I know it sounds ideal to just link to a C++ library and be done with it, but it's not that simple, and you're going to want more than just direct linkage in the end anyway.

tgamblin
And the standard should not specify a name mangling scheme because there are too many other things that would have to be specified for interoperability between different implementations of C++ (memory layout of objects and vptrs, for a start, and that's just the tip of a very large iceberg!)
Jonathan Leffler
yep -- VERY good point. I left it for the long answer -- see above, and feel free to add things you see missing! This question has come up a lot and it would be nice to have an answer that tells people why they don't want this!
tgamblin
+3  A: 

Looked at SWIG lately?

greyfade
Not lately, but I'm aware of its existence.
Ellery Newcomer
Oh yeah, and +1 for actually answering my question. Sort of.
Ellery Newcomer
A: 

There is no standard way of linking with any language, because there is no international standard for linkers, exported names, or even the concept of linking.

anon
There are for platforms. DLLs and ELF files have defined, constant structures which are public.
strager
Platform != standard
anon
It's still a standard, and probably won't change for the platform. This is not the case for compilers.
strager