Let's say I have

trait fooTrait[T] {
  def fooFn(x: T, y: T) : T 
}

I want to enable users to quickly declare new instances of fooTrait with their own defined bodies for fooFn. Ideally, I'd want something like

val myFoo : fooTrait[T] = newFoo((x:T, y:T) => x+y)

to work. However, I can't just do

def newFoo[T](f: (T, T) => T) = new fooTrait[T] { def fooFn(x: T, y: T): T = f(x, y) }

because this uses closures, and so results in different objects when the program is run multiple times. What I really need is to be able to get the classOf of the object returned by newFoo and then have that be constructable on a different machine. What do I do?

If you're interested in the use case, I'm trying to write a Scala wrapper for Hadoop that allows you to execute

IO("Data") --> ((x: Int, y: Int) => (x, x+y)) --> IO("Out")

The thing in the middle needs to be turned into a class that implements a particular interface and can then be instantiated on different machines (executing the same jar file) from just the class name.

Note that Scala does the right thing with the syntactic sugar that converts (x:Int) => x+5 to an instance of Function1. My question is whether I can replicate this without hacking the Scala internals. If this were Lisp (which I'm used to), this would be a trivial compile-time macro ... :sniff:

+1  A: 

Quick suggestion: why not create an implicit def that converts a FunctionN object to the trait expected by the --> method?

I do hope you won't have to use any macro for this!
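
A minimal sketch of what such an implicit def might look like, assuming the fooTrait from the question. The names FooConversions, functionToFoo and ImplicitDemo are made up for illustration. Note that the wrapper is still built around a closure at run time, so on its own this only gives the syntax, not a class that can be instantiated by name.

```scala
import scala.language.implicitConversions

// The trait from the question, repeated so the sketch is self-contained.
trait fooTrait[T] {
  def fooFn(x: T, y: T): T
}

object FooConversions {
  // Hypothetical helper: wraps any two-argument function in a fooTrait.
  // The wrapper is created at run time and holds a reference to f.
  implicit def functionToFoo[T](f: (T, T) => T): fooTrait[T] =
    new fooTrait[T] {
      def fooFn(x: T, y: T): T = f(x, y)
    }
}

object ImplicitDemo {
  import FooConversions._

  val add: (Int, Int) => Int = (x, y) => x + y

  // The expected type fooTrait[Int] triggers the implicit conversion.
  val addFoo: fooTrait[Int] = add
}
```

With this in scope, ImplicitDemo.addFoo.fooFn(2, 3) simply delegates to the wrapped function.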

Eric
That's the idea. But what should the implicit def say? All I can think of is exactly what I defined for myFoo, and that happens at run time, not at compile time.
bsdfish
+1  A: 

Here's a version that matches the syntax of what you list in the question and serializes/executes the anon-function. Note that this serializes the state of the Function2 object so that the serialized version can be restored on another machine. Just the classname is insufficient, as illustrated below the solution.

You should write your own encode/decode functions, if only to plug in your own Base64 implementation rather than rely on Sun's internal sun.misc classes.

object SHadoopImports {
    import java.io._

    implicit def functionToFooString[T](f:(T,T)=>T) = {
        val baos = new ByteArrayOutputStream()
        val oo = new ObjectOutputStream(baos)
        oo.writeObject(f)
        new sun.misc.BASE64Encoder().encode(baos.toByteArray())
    }

    implicit def stringToFun(s: String) = {
        val decoder = new sun.misc.BASE64Decoder();
        val bais = new ByteArrayInputStream(decoder.decodeBuffer(s))
        val oi = new ObjectInputStream(bais)  
        val f = oi.readObject()
        new {
            def fun[T](x:T, y:T): T = f.asInstanceOf[Function2[T,T,T]](x,y)
        }
    }
}

// I don't really know what this is supposed to do
// just supporting the given syntax
case class IO(src: String) {
    import SHadoopImports._
    def -->(s: String) = new {
     def -->(to: IO) = {
      val IO(snk) = to
      println("From: " + src)
      println("Applying (4,5): " + s.fun(4,5))
      println("To: " + snk)
     }
    }
}

object App extends Application {
  import SHadoopImports._

  IO("MySource") --> ((x:Int,y:Int)=>x+y) --> IO("MySink")
  println
  IO("Here") --> ((x:Int,y:Int)=>x*y+y) --> IO("There")
}

/*
From: MySource
Applying (4,5): 9
To: MySink

From: Here
Applying (4,5): 25
To: There
*/
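
As a rough sketch of the encode/decode step without the sun.misc classes, the following uses java.util.Base64, which assumes Java 8 or later. The names FunctionCodec and CodecDemo are hypothetical, and the round trip only works when the function's class is on the classpath of the JVM doing the decoding.

```scala
import java.io.{ByteArrayInputStream, ByteArrayOutputStream, ObjectInputStream, ObjectOutputStream}
import java.util.Base64

object FunctionCodec {
  // Serialize the function object (including any captured state)
  // and return the bytes as a Base64 string.
  def encode[T](f: (T, T) => T): String = {
    val baos = new ByteArrayOutputStream()
    val oos = new ObjectOutputStream(baos)
    try oos.writeObject(f) finally oos.close()
    Base64.getEncoder.encodeToString(baos.toByteArray)
  }

  // Decode the Base64 string and deserialize the function.
  // The cast is unchecked: the caller must know the right T.
  def decode[T](s: String): (T, T) => T = {
    val bytes = Base64.getDecoder.decode(s)
    val ois = new ObjectInputStream(new ByteArrayInputStream(bytes))
    try ois.readObject().asInstanceOf[(T, T) => T] finally ois.close()
  }
}

object CodecDemo {
  // Round-trips one of the functions used above and applies it to (4, 5).
  def roundTrip(): Int = {
    val encoded = FunctionCodec.encode((x: Int, y: Int) => x * y + y)
    FunctionCodec.decode[Int](encoded)(4, 5)
  }
}
```

These two methods could stand in for the bodies of functionToFooString and stringToFun above.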

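To convince yourself that the classname is insufficient to use the function on another machine, consider the code below, which creates 100 different functions. The compiler emits a single anonymous-function class for the literal in the loop body, yet each of the 100 functions captures a different value of i. Count the class files on the filesystem and compare.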

object App extends Application {
  import SHadoopImports._

  for (i <- 1 to 100) {
      IO(i + ": source") --> ((x:Int,y:Int)=>(x*i)+y) --> IO("sink")
  }
}
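
The same point can be checked without looking at the filesystem. This is a small sketch with made-up names (ClosureStateDemo, makeFunctions): it builds the functions from one function literal, then counts how many distinct runtime classes and how many distinct results they produce.

```scala
object ClosureStateDemo {
  // One function per value of i, all created from the same function literal.
  def makeFunctions(n: Int): Seq[(Int, Int) => Int] =
    (1 to n).map(i => (x: Int, y: Int) => (x * i) + y)

  def main(args: Array[String]): Unit = {
    val fs = makeFunctions(100)
    // Number of distinct runtime classes behind the functions.
    println("distinct classes: " + fs.map(_.getClass.getName).distinct.size)
    // Number of distinct results when each function is applied to (4, 5).
    println("distinct results: " + fs.map(f => f(4, 5)).distinct.size)
  }
}
```

The functions share their class but not their behaviour, because the captured i is part of each object's state. That state is what serialization carries along and what a bare classname loses.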
Mitch Blevins
Thanks a lot for this! I completely agree with you that in the context I was talking about, the classname was insufficient. This looks really promising!
bsdfish