views:

758

answers:

6

I used Java for like 6-7 years, then some months ago I discovered Groovy and started to save a lot of typing.. then I wondered how certain things worked under the hood (because groovy performance is really poor) and understood that to give you dynamic typing every Groovy object is a MetaClass object that handles all the things that the JVM couldn't handle by itself. Of course this introduces a layer in the middle between what you write and what you execute that slows down everything.

Then somedays ago I started getting some infos about Scala. How these two languages compare in their byte code translations? How much things they add to the normal structure that it would be obtained by plain Java code?

I mean, Scala is static typed so wrapper of Java classes should be lighter, since many things are checked during compile time but I'm not sure about the real differences of what's going inside. (I'm not talking about the functional aspect of Scala compared to the other ones, that's a different thing)

Can someone enlighten me?

From WizardOfOdds comment it seems like that the only way to get less typing and same performance would be to write an intermediate translator that translates something in Java code (letting javac compile it) without alterating how things are executed, just adding synctatic sugar withour caring about other fallbacks of the language itself.

+9  A: 

One thing to be aware of: Java 7 will introduce a new invokedynamic bytecode for the JVM, which will make a lot of Groovy's "metaclass magic" unnecessary and should drastically speed up dynamic language implementations on the JVM.

Michael Borgwardt
+17  A: 

Scala is doing an increasing good job of reducing the cost of abstraction.

In the comments inline in the code, I explain the performance characteristics of Array access, pimped types, Structural Types, and abstracting over primitives and objects.

Arrays

object test {
  /**
   * From the perspective of the Scala Language, there isn't a distinction between
   * objects, primitives, and arrays. They are all unified under a single type system,
   * with Any as the top type.
   *
   * Array access, from a language perspective, looks like a.apply(0), or a.update(0, 1)
   * But this is compiled to efficient bytecode without method calls. 
   */
  def accessPrimitiveArray {
    val a = Array.fill[Int](2, 2)(1)
    a(0)(1) = a(1)(0)        
  }
  // 0: getstatic #62; //Field scala/Array$.MODULE$:Lscala/Array$;
  // 3: iconst_2
  // 4: iconst_2
  // 5: new #64; //class test$$anonfun$1
  // 8: dup
  // 9: invokespecial #65; //Method test$$anonfun$1."<init>":()V
  // 12:  getstatic #70; //Field scala/reflect/Manifest$.MODULE$:Lscala/reflect/Manifest$;
  // 15:  invokevirtual #74; //Method scala/reflect/Manifest$.Int:()Lscala/reflect/AnyValManifest;
  // 18:  invokevirtual #78; //Method scala/Array$.fill:(IILscala/Function0;Lscala/reflect/ClassManifest;)[Ljava/lang/Object;
  // 21:  checkcast #80; //class "[[I"
  // 24:  astore_1
  // 25:  aload_1
  // 26:  iconst_0
  // 27:  aaload
  // 28:  iconst_1
  // 29:  aload_1
  // 30:  iconst_1
  // 31:  aaload
  // 32:  iconst_0
  // 33:  iaload
  // 34:  iastore
  // 35:  return

Pimp My Library

  /**
   * Rather than dynamically adding methods to a meta-class, Scala
   * allows values to be implicity converted. The conversion is
   * fixed at compilation time. At runtime, there is an overhead to
   * instantiate RichAny before foo is called. HotSpot may be able to
   * eliminate this overhead, and future versions of Scala may do so
   * in the compiler.
   */
  def callPimpedMethod {    
    class RichAny(a: Any) {
      def foo = 0
    }
    implicit def ToRichAny(a: Any) = new RichAny(a)
    new {}.foo
  }
  // 0: aload_0
  //   1: new #85; //class test$$anon$1
  //   4: dup
  //   5: invokespecial #86; //Method test$$anon$1."<init>":()V
  //   8: invokespecial #90; //Method ToRichAny$1:(Ljava/lang/Object;)Ltest$RichAny$1;
  //   11:  invokevirtual #96; //Method test$RichAny$1.foo:()I
  //   14:  pop
  //   15:  return

Structural Types (aka Duck Typing)

  /**
   * Scala allows 'Structural Types', which let you have a compiler-checked version
   * of 'Duck Typing'. In Scala 2.7, the invocation of .size was done with reflection.
   * In 2.8, the Method object is looked up on first invocation, and cached for later
   * invocations..
   */
  def duckType {
    val al = new java.util.ArrayList[AnyRef]
    (al: { def size(): Int }).size()
  }
  // [snip]
  // 13:  invokevirtual #106; //Method java/lang/Object.getClass:()Ljava/lang/Class;
  // 16:  invokestatic  #108; //Method reflMethod$Method1:(Ljava/lang/Class;)Ljava/lang/reflect/Method;
  // 19:  aload_2
  // 20:  iconst_0
  // 21:  anewarray #102; //class java/lang/Object
  // 24:  invokevirtual #114; //Method java/lang/reflect/Method.invoke:(Ljava/lang/Object;[Ljava/lang/Object;)Ljava/lang/Object;
  // 27:  astore_3
  // 28:  aload_3
  // 29:  checkcast #116; //class java/lang/Integer

Specialisation

  /**
   * Scala 2.8 introduces annotation driven specialization of methods and classes. This avoids
   * boxing of primitives, at the cost of increased code size. It is planned to specialize some classes
   * in the standard library, notable Function1.
   *
   * The type parameter T in echoSpecialized is annotated to instruct the compiler to generated a specialized version
   * for T = Int.
   */
  def callEcho {    
    echo(1)
    echoSpecialized(1)
  }
  // public void callEcho();
  //   Code:
  //    Stack=2, Locals=1, Args_size=1
  //    0:   aload_0
  //    1:   iconst_1
  //    2:   invokestatic    #134; //Method scala/runtime/BoxesRunTime.boxToInteger:(I)Ljava/lang/Integer;
  //    5:   invokevirtual   #138; //Method echo:(Ljava/lang/Object;)Ljava/lang/Object;
  //    8:   pop
  //    9:   aload_0
  //    10:  iconst_1
  //    11:  invokevirtual   #142; //Method echoSpecialized$mIc$sp:(I)I
  //    14:  pop
  //    15:  return


  def echo[T](t: T): T = t
  def echoSpecialized[@specialized("Int") T](t: T): T = t
}

Closures and For Comprehensions

In Scala for is translated to a chain of calls to higher order functions: foreach, map, flatMap and withFilter. This is really powerful, but you need to be aware that the following code isn't nearly as efficient as a similar looking construct in Java. Scala 2.8 will @specialize Function1 for at least Double and Int, and hopefully will also @specialize Traversable#foreach, which will at least remove the boxing cost.

The body of the for-comprehension is passed as a closure, which is compiled to an anonymous inner class.

def simpleForLoop {
  var x = 0
  for (i <- 0 until 10) x + i
}
// public final int apply(int);   
// 0:   aload_0
// 1:   getfield    #18; //Field x$1:Lscala/runtime/IntRef;
// 4:   getfield    #24; //Field scala/runtime/IntRef.elem:I
// 7:   iload_1
// 8:   iadd
// 9:   ireturn


// public final java.lang.Object apply(java.lang.Object);

// 0:   aload_0
// 1:   aload_1
// 2:   invokestatic    #35; //Method scala/runtime/BoxesRunTime.unboxToInt:(Ljava/lang/Object;)I
// 5:   invokevirtual   #37; //Method apply:(I)I
// 8:   invokestatic    #41; //Method scala/runtime/BoxesRunTime.boxToInteger:(I)Ljava/lang/Integer;
// 11:  areturn

// public test$$anonfun$simpleForLoop$1(scala.runtime.IntRef);
// 0:   aload_0
// 1:   aload_1
// 2:   putfield    #18; //Field x$1:Lscala/runtime/IntRef;
// 5:   aload_0
// 6:   invokespecial   #49; //Method scala/runtime/AbstractFunction1."<init>":()V
// 9:   return

LineNumberTable: line 4: 0

// 0:   new #16; //class scala/runtime/IntRef
// 3:   dup
// 4:   iconst_0
// 5:   invokespecial   #20; //Method scala/runtime/IntRef."<init>":(I)V
// 8:   astore_1
// 9:   getstatic   #25; //Field scala/Predef$.MODULE$:Lscala/Predef$;
// 12:  iconst_0
// 13:  invokevirtual   #29; //Method scala/Predef$.intWrapper:(I)Lscala/runtime/RichInt;
// 16:  ldc #30; //int 10
// 18:  invokevirtual   #36; //Method scala/runtime/RichInt.until:(I)Lscala/collection/immutable/Range$ByOne;
// 21:  new #38; //class test$$anonfun$simpleForLoop$1
// 24:  dup
// 25:  aload_1
// 26:  invokespecial   #41; //Method test$$anonfun$simpleForLoop$1."<init>":(Lscala/runtime/IntRef;)V
// 29:  invokeinterface #47,  2; //InterfaceMethod scala/collection/immutable/Range$ByOne.foreach:(Lscala/Function1;)V
// 34:  return
retronym
I love your answer. You missed for-comprehension and closures, though.
Daniel
+6  A: 

You can transliterate Java into Scala and end up with bytecode that is almost exactly the same. So Scala is perfectly capable of being as fast as Java.

That said, there are lots of ways to write slower, more memory intensive Scala code that are shorter and more readable than the Java equivalent. And that's good! We use Java, not C, because memory protection improves our code. Scala's extra expressiveness means you can write programs that are shorter, than thus less buggier than in Java. Sometimes that hurts performance, but most of the time it doesn't.

David Crawshaw
+6  A: 

retronym and David have covered the main points regarding Scala: it is essentially as fast as Java, and it is that way because it is statically typed (thus requiring no extra runtime checks) and uses lightweight wrappers that the JVM can usually remove entirely.

Scala does make it very easy to use powerful generic library features. As with any powerful generic library in Java, that has some performance penalty associated with it. For example, using a java.util.HashMap to implement a map between bytes and bytes is going to be painfully slow in Java (compared to a primitive array lookup table), and it will be equally slow in Scala. But Scala gives you many more features of this sort, and makes it amazingly easy to invoke them, to the point where you can really ask for an amazing amount of work to be done in very little code. As always, when you make it easy to ask for a lot, people sometimes will ask for a lot, and then wonder why it takes so long. (And the ease of asking makes it all the more surprising when one learns (or thinks carefully about) what must happen behind the scenes.)

The only legitimate criticism one might raise is that Scala does not make it as easy as it possibly could to write high-performance code; most of the ease-of-use features are directed towards generic functional programming, which still is pretty fast, but not as fast as direct access to primitive types. For example, Scala has an incredibly powerful for loop, but it uses generic types, so primitives must be boxed, and hence you can't use it effectively for iterating over primitive arrays; you have to use a while loop instead. (The performance differential is likely to decrease in 2.8 with specializations as retronym mentioned.)

Rex Kerr
+8  A: 

Lots of good answers, I'll try to add something else I got from your question. There is no wrapping of Scala objects. For instance, the following two classes, in Scala and Java respectively, generate exactly the same bytecode:

// This is Scala
class Counter {
  private var x = 0
  def getCount() = {
    val y = x
    x += 1
    y
  }
}

// This is Java
class Counter {
  private int x = 0;

  private int x() {
    return x;
  }

  private void x_$eq(int x) {
    this.x = x;
  }

  public int getCounter() {
    int y = x();
    x_$eq(x() + 1);
    return y;
  }
}

Of particular notice is the fact that Scala always go to the fields through getters and setters, even on other methods of the same class. The point, however, is that there is absolutely no class wrapping going on here. It's the same thing whether it is compiled in Java or Scala.

Now, Scala makes it easier to write slower code. Some examples of it are:

  • Scala's for are notably slower than Java when just incrementing indices -- the solution, so far, is to use while loops instead, though someone wrote a compiler plugin that does that conversion automatically. Sooner or later such an optimization will be added.

  • It's very easy to write closures and pass functions around in Scala. It makes the code much more readable, but it is notably slower than not doing it in tight loops.

  • It's also easy to parameterize functions so that one can pass Int, which can cause bad performance if you are handling primitives (in Scala, AnyVal subclasses).

Here is an example of a class written in Scala in two different ways, where the more compact one is about twice as slow:

class Hamming extends Iterator[BigInt] {
  import scala.collection.mutable.Queue
  val qs = Seq.fill(3)(new Queue[BigInt])
  def enqueue(n: BigInt) = qs zip Seq(2, 3, 5) foreach { case (q, m) => q enqueue n * m }
  def next = {
    val n = qs map (_.head) min;
    qs foreach { q => if (q.head == n) q.dequeue }
    enqueue(n)
    n
  }
  def hasNext = true
  qs foreach (_ enqueue 1)
}

class Hamming extends Iterator[BigInt] {
  import scala.collection.mutable.Queue
  val q2 = new Queue[BigInt]
  val q3 = new Queue[BigInt]
  val q5 = new Queue[BigInt]
  def enqueue(n: BigInt) = {
    q2 enqueue n * 2
    q3 enqueue n * 3
    q5 enqueue n * 5
  }
  def next = {
    val n = q2.head min q3.head min q5.head
    if (q2.head == n) q2.dequeue
    if (q3.head == n) q3.dequeue
    if (q5.head == n) q5.dequeue
    enqueue(n)
    n
  }
  def hasNext = true
  List(q2, q3, q5) foreach (_ enqueue 1)
}

This is also a good example of how it is perfectly possible to balance out performance when required. The faster version uses foreach in the constructor, for instance, where it won't cause performance problems.

In the end, it's all a matter of perspective. Calling methods on objects is slower than calling functions and procedures directly, and that was a major objection to object oriented programming, but it turned out not to be much of a problem most of the time.

Daniel
+1  A: 

The other answers focus on the specifics of scala. I'd like to add some points for the generic case. First of all, it is quite doable to write a bytecode generator that produces javac like code but from a language that is not java. This gets harder as the language semantics differ from java's semantics. Explicit typing however is not a part of the semantics, only of the syntax (and it has error detection properties).

Performance gets lower in the case that types cannot be statically (at compile time) determined, or if the language is dynamic by nature (typing is dynamic like in many scripting languages like javascript, jython, jruby etc). In those cases with the 1.6 jdk you need to do some reflection based dispatching. This is obviously slower and cannot be easily optimized by hotspot / the virtual machine. Jdk 1.7 extends invokedynamic so that it can actually be used to call a function in a dynamic way as supported by scripting languages.

The javac compiler does not do all that many optimizations (the jvm does them at runtime) so the java language maps rather straightforward into java bytecode. That means that languages with the same semantics have an advantage compared to languages with different semantics. This is a disadvantage of the JVM and a place where the CLR (.net runtime) and LLVM have clear advantages.

Paul de Vrieze