I recently asked a question on switching from C++ to C for writing an interpreter for speed and I got a comment from someone asking why on earth I would switch to C for that.

So I found out that I actually don't know why, except that C++'s object-oriented system has a much higher level of abstraction and is therefore slower.

  • Why are the interpreters of all popular scripting languages written in C and not in C++?

If you want to tell me about some other language where the interpreter for it isn't in C, please replace all occurrences of "popular scripting languages" in this question with Ruby, Python, Perl and PHP.

+5  A: 

I'd guess it's because C is pretty much the only language that has a reasonably standard compiler for almost every platform in existence.

Matti Virkkunen
+13  A: 

C is a very old language, and is thus supported on pretty much every system available. It is therefore a good choice for any project that needs to be ported everywhere.

Jason Williams
For example, some people are actually using CPython on embedded platforms for which neither a C++ compiler nor a C99 compiler nor GCC exists. There was a great deal of controversy when CPython 2.6 (I think) started using computed gotos (a GCC extension) and even more so when Unladen Swallow was started, which uses LLVM which is written in C++.
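To illustrate the portability concern, here is a minimal sketch of computed-goto dispatch next to its standard C equivalent. The opcodes and the `run` function are hypothetical and are not taken from CPython; input is assumed to be well-formed bytecode ending in `OP_HALT`.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical bytecode for a tiny accumulator machine (not CPython's). */
enum { OP_INC, OP_DEC, OP_DOUBLE, OP_HALT };

/* Runs `code` until OP_HALT and returns the accumulator. */
long run(const unsigned char *code)
{
    long acc = 0;
    size_t pc = 0;

#if defined(__GNUC__)
    /* GCC extension ("labels as values"): jump straight to the handler.
       Assumes every opcode is valid; there is no bounds check here. */
    static void *const targets[] = { &&do_inc, &&do_dec, &&do_double, &&do_halt };
#define DISPATCH() goto *targets[code[pc++]]
    DISPATCH();
do_inc:    acc += 1; DISPATCH();
do_dec:    acc -= 1; DISPATCH();
do_double: acc *= 2; DISPATCH();
do_halt:   return acc;
#undef DISPATCH
#else
    /* Portable fallback in standard C: one switch inside a loop. */
    for (;;) {
        switch (code[pc++]) {
        case OP_INC:    acc += 1; break;
        case OP_DEC:    acc -= 1; break;
        case OP_DOUBLE: acc *= 2; break;
        case OP_HALT:   return acc;
        default:        assert(!"unknown opcode"); return acc;
        }
    }
#endif
}
```

The `#if` guard is the point: the first branch only builds with compilers that implement the GCC extension, while the second builds with any C compiler.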
Jörg W Mittag
+3  A: 

I would hazard a guess that it's in part due to C++ not being standardized until 1998, which made achieving portability that much harder.

All those languages you list were developed before that standardization.

Stephen
+1  A: 

The complexity of C++ is great compared to that of C - many people consider it one of the most complex and error-prone languages in existence.

Many of the features of C++ are problematic as well - the STL was standardized many years ago and it still lacks one great implementation.

OOP is certainly great, but it does not outweigh C++'s deficiencies in many scenarios.

Bozhidar Batsov
There's no great implementation of the STL? So what do all the popular compilers ship with?
jalf
They are shipping an implementation of the STL, but how great it is is another matter altogether. In the past I worked for a couple of years as a C++ developer. On one of the biggest projects in our company the CTO had forbidden the use of any STL classes. As the simplest example, I can point out that he had noticed serious performance issues with the use of the string class, compared to the use of a raw array of characters. And performance issues weren't the only ones we had with the STL. We were using GCC, but the CTO said he had tested other STL implementations as well and found them bug-ridden...
Bozhidar Batsov
You don't say when, but your CTO sounds suspect. You may have specific use cases where a different string implementation would be better, but good luck improving on map or even matching vector's capabilities. You can write slow code with any library, and I haven't come across any STL bugs. I'd have asked to see the data.
Stephen
@Bozhidar Batsov: How long ago was this?
Viktor Sehr
So your reasoning is actually just "a guy once told me it sucked"? Some would say actually testing it for yourself would yield more reliable data. ;)
jalf
Actually, a lot of guys told me similar stuff, which makes the data reliable, and I tend to explore only things that are of interest to me - I'd much rather spend my time with Lisp than with C++ ;-)
Bozhidar Batsov
+1  A: 

Most well-known compiler books are written with examples in C. Also, two of the major tools, lex (builds a lexer) and yacc (translates a grammar to C), have support for C.

Romain Hippeau
+5  A: 

Why are the interpreters of all popular scripting languages written in C and not in C++?

What makes you think that they are written in C? In my experience, the majority of implementations for the majority of scripting languages are written in languages other than C.

Here's a couple of examples:

Ruby

  • BlueRuby: written in ABAP
  • HotRuby: JavaScript
  • Red Sun: ActionScript
  • SmallRuby: Smalltalk/X
  • MagLev: Ruby, GemStone Smalltalk
  • Smalltalk.rb: Smalltalk
  • Alumina: Smalltalk
  • Cardinal: PIR, NQP, PGE
  • RubyGoLightly: Go
  • YARI: Io
  • JRuby: Java
  • XRuby: Java
  • Microsoft IronRuby: C#
  • the original IronRuby by Wilco Bauwer: C#
  • Ruby.NET: C#
  • NETRuby: C#
  • MacRuby: Objective-C
  • Rubinius: Ruby, C++
  • MetaRuby: Ruby
  • RubyVM: Ruby

Python

  • IronPython: C#
  • Jython: Java
  • Pynie: PIR, NQP, PGE
  • PyPy: Python, RPython

PHP

  • P8: Java
  • Quercus: Java
  • Phalanger: C#

Perl6

  • Rakudo: Perl6, PIR, NQP, PGE
  • Pugs: Haskell
  • Sprixel: JavaScript
  • v6.pm: Perl5
  • Elf: CommonLisp

JavaScript

  • Narcissus: JavaScript
  • Ejacs: ELisp
  • Jint: C#
  • IronJS: F#
  • Rhino: Java
  • Mascara (ECMAScript Harmony Reference Implementation): Python
  • ECMAScript 4 Reference Implementation: Standard ML

The HotSpot JVM is written in C++, the Animorphic Smalltalk VM (from which HotSpot and V8 are derived) is written in C++, the Self VM (on which the Animorphic Smalltalk VM is based) is written in C++.

Interestingly enough, in many of the above cases, the implementations that are not written in C, are actually faster than the ones written in C.

As an example of two implementations that are written in C, take Lua and CPython. In both cases, they are actually written in a small subset of a very old version of C. The reason for this is that they want to be highly portable. CPython, for example, runs on platforms for which a C++ compiler doesn't even exist. Also, Perl was written in 1987, CPython in 1990, Lua in 1993, SpiderMonkey in 1995. C++ wasn't standardized until 1998.

Jörg W Mittag
+1 Interesting. But did you notice the use of the word "popular" in the question :-)
anon
@Neil: I would say that Ruby, Python, PHP and Perl are quite popular, and in fact the OP specifically listed those four in his question. And JavaScript is pretty much the most popular programming language *ever* (at least for a more conservative definition of "programming language", otherwise the most popular one would be Excel).
Jörg W Mittag
@Jorg Yes, the languages are popular, but not, I would guess, the majority of the specific implementations you mention.
anon
@Jörg: As my question stated: The popular, official and main implementations of the example languages I listed are *all* written in C.
wndsr
@Neil: Well, the OP was very careful to distinguish between *languages* and *implementations* and he was specifically asking about popular *languages*, not *implementations*. Anyway, I'd think that V8, JRuby and IronPython are somewhat popular, and I predict that MacRuby, Rubinius and PyPy will become somewhat popular. Also, HotSpot and the CLR are *wildly* popular, and both are written in C++ (although I wouldn't necessarily classify them as "scripting", whatever *that* means).
Jörg W Mittag
@Neil Butterworth: agreed. This answer is pretty misleading actually because it doesn't list the canonical or most common or original implementation of any of those languages. Most of these implementations were created for the sake of, and known (if they are known) for, having an implementation in that language or VM.
intuited
@intuited: The question doesn't mention anything about "canonical", "most common", "original" or "popular" implementations. It simply says that the interpreters for popular scripting languages are written in C, and that is not true, therefore invalidating the whole premise of the question. In Ruby, for example, 80% of the currently existing interpreters are not written in C. In Python, about 40% are not written in C. The closest thing JavaScript has to a "canonical" implementation are the reference implementations, which are written in Standard ML, Python and (in the future) ECMAScript.
Jörg W Mittag
+4  A: 

Ruby dates back to 1995. If you were writing an interpreter in 1995, what were your options? Java was released in the same year. (And was painfully slow in v1.0 and in many ways, not really worth using)

C++ was not yet standardized, and compiler support for it was very sketchy. It had also not yet made the transition to the "modern C++" that we use today. I think the STL was proposed for standardization around this time as well. It didn't actually get added to the standard until years later. And even after it was added, it took several more years for 1) compilers to catch up, and 2) people to get used to this generic programming style. Back then, C++ was an OOP language first and foremost, and in many cases, that style of C++ was quite a bit slower than C. (In modern C++ code, that performance difference is pretty much eliminated, partly through better compilers, and partly through better coding styles, with less reliance on OOP constructs and more on templates and generic programming.)

Python was started in 1991. Perl is even older (1987).

PHP is from 1995 as well, but additionally, and importantly, was created by a guy who knew virtually nothing of programming. (and yes, of course this has shaped the language in many important ways)

The languages you mention were started in C because C was the best bet for a portable, future-proof platform back then.

And while I haven't looked this up, I'm willing to bet that apart from the PHP case, which is shaped by incompetence more than anything, the language designers of the other languages chose C because they *already knew it*. So perhaps the lesson is not "C is best", but "the language you already know is best".

There are other reasons why C is often chosen:

  • experience and accessibility: C is a simple language that is fairly easy to pick up, lowering the barrier to entry. It's also popular, and there are a lot of experienced C programmers around. One reason why these languages have become popular might just be that it was easy to find programmers to help develop the interpreters. C++ is more complex to learn and use well. Today, that might not be so much of a problem, but 10 or 15 years ago?
  • interoperability: Most languages communicate through C interfaces. Since your fancy new language is going to rely on components written in other languages (especially in early versions when the language itself is limited and has few libraries), it's always nice and simple to call a C function. So since we're going to have some C code anyway, it might be tempting to go all the way and just write the whole thing in C.
  • performance: C doesn't get in your way much. It doesn't magically make your code fast, but it allows you to achieve good performance. So does C++, of course, or many other languages. But it's true for C as well.
  • portability: Practically every platform has a C compiler. Until recently, C++ compilers were much more hit and miss.
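The interoperability point can be sketched as a plain C function-pointer table, which is roughly the shape many embedding interfaces take. Everything here (`native_fn`, `register_native`, `find_native`, `twice`) is a hypothetical name for illustration and not the API of any real interpreter.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical signature every native (host) function must have. */
typedef double (*native_fn)(double arg);

struct native_entry {
    const char *name;  /* name visible to scripts */
    native_fn   fn;    /* C function that implements it */
};

#define MAX_NATIVES 16
static struct native_entry natives[MAX_NATIVES];
static size_t native_count = 0;

/* Registers a C function under a script-visible name.
   Returns 0 on success, -1 if the table is full. */
int register_native(const char *name, native_fn fn)
{
    if (native_count == MAX_NATIVES)
        return -1;
    natives[native_count].name = name;
    natives[native_count].fn = fn;
    native_count++;
    return 0;
}

/* Looks up a registered function by name; returns NULL if none matches. */
native_fn find_native(const char *name)
{
    size_t i;
    for (i = 0; i < native_count; i++) {
        if (strcmp(natives[i].name, name) == 0)
            return natives[i].fn;
    }
    return NULL;
}

/* Example host function that a script could call. */
double twice(double x)
{
    return 2.0 * x;
}
```

A host program would call `register_native("twice", twice)` once at startup, and the interpreter would use `find_native` when a script calls a name it does not define itself. Because the interface is only structs and function pointers, code written in nearly any language with a C calling convention can plug into it.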

These reasons don't mean that C is in fact a superior language for writing interpreters (or for anything else), they simply explain some of the motivations that have caused others to write in C.

jalf
A: 

If the question is about why C and not C++, the answer comes down to the fact that when you implement a scripting language, the C++ object model gets in your way. It's so restricted that you will not be able to use it for your own objects.

So you can only use it for the internals, and there you usually do not get enough benefits from C++ over the much simpler C language, which is easier to port and distribute.
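As a rough illustration of what "your own objects" tends to look like, here is a minimal sketch of a hand-rolled object model in C. The struct and function names are hypothetical; real interpreters use richer type descriptors, but the idea of a header that points at a table of function pointers is the same.

```c
#include <assert.h>
#include <stddef.h>

struct object;

/* Type descriptor: does the job of a vtable, but it is ordinary data
   that an interpreter could build or change at run time. */
struct type {
    const char *name;
    int (*is_truthy)(const struct object *self);
};

/* Common header shared by every script object. */
struct object {
    const struct type *type;
};

/* Concrete objects embed the header as their first member,
   so a pointer to the object can be cast to a pointer to the header. */
struct int_object { struct object base; long value; };
struct str_object { struct object base; const char *chars; };

static int int_is_truthy(const struct object *self)
{
    return ((const struct int_object *)self)->value != 0;
}

static int str_is_truthy(const struct object *self)
{
    return ((const struct str_object *)self)->chars[0] != '\0';
}

static const struct type int_type = { "int", int_is_truthy };
static const struct type str_type = { "str", str_is_truthy };

/* Dynamic dispatch through the object's own type descriptor. */
int object_is_truthy(const struct object *obj)
{
    assert(obj != NULL && obj->type != NULL);
    return obj->type->is_truthy(obj);
}
```

An object is then created with an initializer such as `struct int_object n = { { &int_type }, 7 };` and passed around as `&n.base`. Nothing here needs C++ classes, which is one way to read the claim that the C++ object model adds little for the script-level objects themselves.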

The only problems when implementing a scripting language in C are the missing coroutine support (you have to switch your stack pointer in some way) and, most important, that there is no way to do exception handling without a lot of overhead (compared to C++).
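One common way to approximate exceptions in standard C is `setjmp`/`longjmp`. The following is a minimal sketch under that assumption, with hypothetical names (`raise_error`, `try_divide`); it is not how any particular interpreter does it. Note that every "try" has to save the execution context up front, whether or not an error occurs, which is the kind of overhead the paragraph above refers to.

```c
#include <assert.h>
#include <setjmp.h>
#include <stddef.h>

/* Hypothetical interpreter state: the innermost handler and the last error. */
static jmp_buf *current_handler = NULL;
static const char *pending_error = NULL;

/* "Throw": records the message and unwinds to the innermost handler. */
void raise_error(const char *message)
{
    assert(current_handler != NULL);
    pending_error = message;
    longjmp(*current_handler, 1);
}

/* An operation somewhere down the call stack that may fail. */
long checked_divide(long a, long b)
{
    if (b == 0)
        raise_error("division by zero");
    return a / b;
}

/* "Try/catch": returns 0 and stores the quotient on success,
   or -1 if an error was raised while computing it. */
int try_divide(long a, long b, long *result)
{
    jmp_buf handler;
    jmp_buf *saved = current_handler;  /* allows nested handlers */
    int status;

    current_handler = &handler;
    if (setjmp(handler) == 0) {
        *result = checked_divide(a, b);
        status = 0;
    } else {
        status = -1;                   /* reached via longjmp */
    }
    current_handler = saved;
    return status;
}
```

After a failed `try_divide(1, 0, &out)`, `pending_error` holds the message and the previous handler is restored. Unlike C++ exceptions, nothing is cleaned up automatically during the jump, so any resources acquired between the `setjmp` and the `longjmp` have to be released by hand.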

Lothar