In the code below I create 5 threads and have each of them print a message, sleep, and print another message. I start the threads from my main thread and then join all of them. I would expect the "all done" message to be printed only after all of the threads have finished, yet it gets printed before they are done. Can someone help me understand this behavior?

Thanks. Kent

Here is the code:

  def ttest() = {
     val threads = 
      for (i <- 1 to 5)
        yield new Thread() {
          override def run() {
            println("going to sleep")
            Thread.sleep(1000)
            println("awake now")
          }
        }

    threads.foreach(t => t.start())
    threads.foreach(t => t.join())
    println("all done")
  }

Here is the output:

going to sleep
all done
going to sleep
going to sleep
going to sleep
going to sleep
awake now
awake now
awake now
awake now
awake now
+6  A: 

In your code, threads is lazy (non-strict): each time you iterate over it, the for comprehension is evaluated anew. So you actually create 10 threads: the first foreach creates 5 and starts them, and the second foreach creates 5 more, which are never started, and joins those. Because they were never started, join returns immediately. Call toList on the result of the for comprehension to get a stable snapshot.
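As a minimal sketch of that suggestion (the JoinAll object, the runAll method and the finished counter are hypothetical names added here for illustration, and the sleep is shortened), toList can be applied to the result of the for comprehension so that both foreach calls see the same Thread objects:

```scala
import java.util.concurrent.atomic.AtomicInteger

object JoinAll {
  // Starts n threads, joins them all, and returns how many had
  // finished by the time the last join returned.
  def runAll(n: Int): Int = {
    val finished = new AtomicInteger(0)

    // toList forces the comprehension exactly once, so the list holds
    // n fixed Thread instances instead of a recipe for creating them.
    val threads = (for (i <- 1 to n) yield new Thread() {
      override def run(): Unit = {
        Thread.sleep(100)
        finished.incrementAndGet()
      }
    }).toList

    threads.foreach(_.start())
    threads.foreach(_.join()) // joins the same threads that were started
    finished.get()
  }

  def main(args: Array[String]): Unit = {
    println("finished after join: " + runAll(5))
  }
}
```

Because the joined threads are the ones that were started, the count read after the joins should equal n.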

Pavel Minaev
Thanks, Pavel. That was exactly it. I don't know how to delete the not-really-answer above.
Kent
+6  A: 

It works if you transform the Range into a List:

  def ttest() = {
    val threads =
      // .toList makes the generator strict, so the threads are created only once
      for (i <- (1 to 5).toList)
        yield new Thread() {
          override def run(): Unit = {
            println("going to sleep")
            Thread.sleep(1000)
            println("awake now")
          }
        }

    threads.foreach(t => t.start())
    threads.foreach(t => t.join())
    println("all done")
  }

The problem is that "1 to 5" is a Range, and ranges are not "strict", so to speak. In plain English, when you call the method map on a Range, it does not compute each value right then. Instead, it produces an object (a RandomAccessSeq.Projection on Scala 2.7) which holds a reference to the function passed to map and another to the original range. When you access an element of the resulting collection, the function you passed to map is applied to the corresponding element of the original range. And this happens each and every time you access any element of the result.

This means that each time you refer to an element of threads, you are calling new Thread() { ... } anew. Since you traverse it twice, and the range has 5 elements, you are creating 10 threads. You start the first 5 and join the second 5.

If this is confusing, look at the example below:

scala> object test {
     | val t = for (i <- 1 to 5) yield { println("Called again! "+i); i }
     | }
defined module test

scala> test.t
Called again! 1
Called again! 2
Called again! 3
Called again! 4
Called again! 5
res4: scala.collection.generic.VectorView[Int,Vector[_]] = RangeM(1, 2, 3, 4, 5)

scala> test.t
Called again! 1
Called again! 2
Called again! 3
Called again! 4
Called again! 5
res5: scala.collection.generic.VectorView[Int,Vector[_]] = RangeM(1, 2, 3, 4, 5)

Each time I print t (by having the Scala REPL print res4 and res5), the yielded expression gets evaluated again. The same happens for individual elements:

scala> test.t(1)
Called again! 2
res6: Int = 2

scala> test.t(1)
Called again! 2
res7: Int = 2

EDIT

As of Scala 2.8, Range will be strict, so the code in the question will work as originally expected.
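To see the difference on Scala 2.8 or later, the old lazy behavior can be reproduced by asking for a view explicitly. The following is only a rough sketch: the StrictVsView object and its method names are hypothetical, and it simply counts how many times the yield body runs when the result is traversed twice.

```scala
object StrictVsView {
  // Strict case: the for comprehension over a Range is evaluated once,
  // when xs is built, no matter how often xs is traversed afterwards.
  def strictCalls(n: Int): Int = {
    var calls = 0
    val xs = for (i <- 1 to n) yield { calls += 1; i }
    xs.foreach(_ => ())
    xs.foreach(_ => ())
    calls
  }

  // Lazy case: a view stores the function and re-applies it on every
  // traversal, which mirrors what the question ran into on Scala 2.7.
  def viewCalls(n: Int): Int = {
    var calls = 0
    val xs = for (i <- (1 to n).view) yield { calls += 1; i }
    xs.foreach(_ => ())
    xs.foreach(_ => ())
    calls
  }

  def main(args: Array[String]): Unit = {
    println("strict: " + strictCalls(5)) // body runs 5 times
    println("view: " + viewCalls(5))     // body runs 10 times
  }
}
```

With the view, two traversals of 5 elements run the body 10 times, which is the same doubling that produced 10 threads in the question.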

Daniel