Wednesday 18 May 2011

Vladimir Galore - Lots of Threaded Scala Actors Waiting for Godot

Dear Junior

When we discussed the model for programming actors in Scala we saw that the thread-based model could be compared to the very actively waiting character Vladimir of Waiting for Godot. I also mentioned that the more drowsy, laid-back Estragon of the same play could be a good picture of the other model, the event-based.

However, before proceeding to the event-based actors, I think it enlightening to dive a bit more into the thread-based and see what the problem is.

The point of actor-based programming is that you want to mimic a "society of collaboration" where each actor performs a focused task. And you do not just want one actor per function (or type of task), you actually want one actor per task. So you want a lot of them.

This is similar to object orientation - you do not want one object per class (e.g. representing phone numbers), you want one object per instance of the class (one for each phone number). Actually, it can be claimed that the restricted actor-message model is closer to the original ideas of object orientation than the object-method model we have become accustomed to.

So we really want loads of actors. Let us see how many Vladimirs we can create before my poor laptop cringes.

Let us revisit the code for Vladimir in his waiting for Godot. As we will create a lot of Vladimirs, I have given each an id.

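package godot

import scala.actors.Actor

class Vladimir(id: Int) extends Actor {
  // Id of the JVM thread that is currently running this code.
  def threadid = Thread.currentThread.getId

  def name: String = "Vladimir" + id

  def act() = {
    println(name + " is waiting " + threadid)
    // receive blocks the current thread until a matching message arrives.
    receive {
      case Godot =>
        println(name + " saw Godot arrive! " + threadid)
    }
    println(name + "'s wait is over " + threadid)
  }
}

// The message Vladimir waits for; a case object, since it carries no data.
case object Godot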

Now we also need to create loads of them. Let us make a list of integers and turn each of them into an instance of Vladimir.

object vladimirgalore {
  def main(args: Array[String]): Unit = {
    val actors = args(0).toInt
    val ids = 0 until actors // [0,1,2 ...]
    // turn each int to an actor using the int as id
    val vladimirs = ids map (id => new Vladimir(id))
    println(actors + " actors on stage")
    for (vlad <- vladimirs) { vlad.start() }
  }
}

danbj$ scala godot.vladimirgalore 6
6 actors on stage
Vladimir2 is waiting 12
Vladimir0 is waiting 10
Vladimir1 is waiting 11
Vladimir3 is waiting 13
Vladimir4 is waiting 17
Vladimir5 is waiting 18
^C

Never mind the order of the output - actors are threads and are thus entitled to be scheduled in any order. What we see is six actors, each an instance of Vladimir, and each given a thread of its own. And every one of them is in a wait-state, waiting for the message "Godot". Let us relieve one of them from its wait, just to see one completion. Let us send Vladimir4 the happy message of Godot's arrival.

object vladimirgalore {
  def main(args: Array[String]): Unit = {
    val actors = args(0).toInt
    val ids = 0 until actors
    val vladimirs = ids map (id => new Vladimir(id))
    println(actors + " actors on stage")
    for (vlad <- vladimirs) { vlad.start() }
    // tell Vladimir4 that Godot has arrived
    vladimirs(4) ! Godot
  }
}

danbj$ scala godot.vladimirgalore 6
6 actors on stage
Vladimir1 is waiting 11
Vladimir2 is waiting 12
Vladimir3 is waiting 13
Vladimir0 is waiting 10
Vladimir4 is waiting 17
Vladimir4 saw Godot arrive! 17
Vladimir4's wait is over 17
Vladimir5 is waiting 17
^C

Ahaa. Vladimir4 was started with thread 17, which used "receive" to register the message handler (the code block containing the "case"). It was also thread 17 that later executed the message handler, doing the pattern matching of the case and running the corresponding code for "case Godot". Finally it was thread 17 that continued with the code after the handler block. Same thread all the way - that is why they are called thread-based. They do not only behave as if they were a thread, they are actually implemented using the same thread all the time.

As it happened, thread 17 managed to complete the act-method of Vladimir4 before Vladimir5 was started, so that specific thread could be reused for Vladimir5. In all other cases, however, a fresh thread was required.
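To make the "same thread all the way" point concrete without the actor library, here is a minimal sketch using a plain JVM thread and a blocking queue as inbox. It is only an analogy, not how scala.actors is implemented; the names SameThreadSketch and observedThreadIds are made up for this illustration.

```scala
import java.util.concurrent.LinkedBlockingQueue

// Hypothetical sketch: a plain JVM thread standing in for a thread-based actor.
object SameThreadSketch {
  case object Godot

  /** Thread ids observed before the wait, inside the handler, and after it. */
  def observedThreadIds(): Seq[Long] = {
    val inbox = new LinkedBlockingQueue[Any]()
    val seen = new Array[Long](3)

    val vladimir = new Thread(new Runnable {
      def run(): Unit = {
        seen(0) = Thread.currentThread().getId // about to wait
        inbox.take() match {                   // blocks this very thread
          case Godot => seen(1) = Thread.currentThread().getId
          case _     => ()
        }
        seen(2) = Thread.currentThread().getId // after the handler
      }
    })
    vladimir.start()
    inbox.put(Godot)
    vladimir.join() // join also makes the writes to `seen` visible here
    seen.toSeq
  }

  def main(args: Array[String]): Unit =
    println(observedThreadIds().mkString(", "))
}
```

All three ids are the same: the thread that starts waiting is the one that handles the message and the one that carries on afterwards. While it waits, it can do nothing else.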

Now let us put loads of actors on stage. For clarity we remove the release of Vladimir 4 so that every actor will be waiting and all threads be locked up.

Let us see if we can put 1000 Vladimirs on stage and set them acting.


object vladimirgalore {
  def main(args: Array[String]): Unit = {
    val actors = args(0).toInt
    val ids = 0 until actors
    val vladimirs = ids map (id => new Vladimir(id))
    println(actors + " actors on stage")
    for (vlad <- vladimirs) { vlad.start() }
    // nobody is released this time: vladimirs(4) ! Godot
  }
}

danbj$ scala godot.vladimirgalore 1000
1000 actors on stage
Vladimir0 is waiting 10
Vladimir3 is waiting 13
Vladimir2 is waiting 12
Vladimir1 is waiting 11
Vladimir4 is waiting 16
Vladimir5 is waiting 17
…
Vladimir115 is waiting 128
Vladimir116 is waiting 129
Vladimir117 is waiting 130
Vladimir118 is waiting 131
Vladimir119 is waiting 132
…
Vladimir251 is waiting 264
Vladimir252 is waiting 265
Vladimir253 is waiting 266
Vladimir254 is waiting 267
Vladimir255 is waiting 268
^C

So, the system hangs on Vladimir255 even though it has not yet started all 1000 actors I asked for.

Hmm … "Vladimir0" to "Vladimir255" - that is 256 actors that have been started before the system hangs. Such a number is hardly a coincidence … Here my colleague George Spalding came to the rescue by pointing out the relevant JVM properties, in this case "actors.maxPoolSize (default 256)". So, as I understand it, Scala will not allow the JVM to allocate more than 256 threads for actor stuff. This means that when 256 Vladimirs had been started, all 256 threads were sitting waiting to receive Godot.


    receive {
      case Godot =>
        println(name + " saw Godot arrive! " + threadid)
    }

And if the runtime system refuses to allocate more threads, then no more actors will be started.
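The same effect can be reproduced in miniature with an ordinary fixed-size thread pool. The sketch below is an analogy under my own assumptions, not the scheduler inside scala.actors: the object name, the method name and the pool size are hypothetical. Every task blocks until "Godot" arrives, so once each pool thread is occupied by a waiting task, the remaining tasks never get to start.

```scala
import java.util.concurrent.{CountDownLatch, Executors, TimeUnit}
import java.util.concurrent.atomic.AtomicInteger

// Hypothetical sketch: a capped pool where every task blocks, like 256 waiting Vladimirs.
object PoolStarvationSketch {
  /** Submits `tasks` blocking tasks to a pool of `poolSize` threads and
    * returns how many of them had started while all were still blocked. */
  def startedWhileBlocked(poolSize: Int, tasks: Int): Int = {
    val pool = Executors.newFixedThreadPool(poolSize)
    val godot = new CountDownLatch(1) // released only at the very end
    val started = new AtomicInteger(0)
    val poolFull = new CountDownLatch(math.min(poolSize, tasks))

    for (_ <- 0 until tasks) {
      pool.execute(new Runnable {
        def run(): Unit = {
          started.incrementAndGet()
          poolFull.countDown()
          godot.await() // hold on to the pool thread while waiting
        }
      })
    }

    poolFull.await()          // every pool thread is now occupied
    val count = started.get() // the rest of the tasks are stuck in the queue
    godot.countDown()         // let everybody go so the pool can shut down
    pool.shutdown()
    pool.awaitTermination(10, TimeUnit.SECONDS)
    count
  }

  def main(args: Array[String]): Unit =
    println(startedWhileBlocked(4, 10) + " of 10 tasks started")
}
```

With a pool of 4 threads and 10 tasks, only 4 tasks start before Godot is released. Replace 4 with 256 and 10 with 1000 and you have the situation above.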

Let us run it again, with modified properties, increasing the maximum number of actor threads.


danbj$ scala -Dactors.maxPoolSize=10000 godot.vladimirgalore 1000
1000 actors on stage
Vladimir0 is waiting 10
Vladimir3 is waiting 13
Vladimir2 is waiting 12
Vladimir1 is waiting 11
Vladimir4 is waiting 16
Vladimir5 is waiting 18
…
Vladimir997 is waiting 1010
Vladimir998 is waiting 1011
Vladimir999 is waiting 1012
^C

OK, now it worked … what about 2500?


danbj$ scala -Dactors.maxPoolSize=10000 godot.vladimirgalore 2500
2500 actors on stage
Vladimir0 is waiting 10
Vladimir3 is waiting 13
…
Vladimir2498 is waiting 2511
Vladimir2499 is waiting 2512
^C

Seems to work … and 5000?


danbj$ scala -Dactors.maxPoolSize=10000 godot.vladimirgalore 5000
...
Vladimir2538 is waiting 2552
Vladimir2539 is waiting 2553
godot.Vladimir@35a631cc: caught java.lang.OutOfMemoryError: unable to create new native thread
java.lang.OutOfMemoryError: unable to create new native thread
 at java.lang.Thread.start0(Native Method)
 at java.lang.Thread.start(Thread.java:658)
 at scala.concurrent.forkjoin.ForkJoinPool.createAndStartSpare(ForkJoinPool.java:1609)
 at ...
 at scala.actors.Scheduler$.managedBlock(Scheduler.scala:21)
 at scala.actors.Actor$class.receive(Actor.scala:512)
 at godot.Vladimir.receive(waitingforgodot.scala:38)
 at godot.Vladimir.act(waitingforgodot.scala:49)
 at ...
 at scala.actors.ReactorTask.run(ReactorTask.scala:36)
 at ...
 at scala.concurrent.forkjoin.ForkJoinWorkerThread.mainLoop(ForkJoinWorkerThread.java:340)
 at ...
Vladimir2540 is waiting 2553
^C

Nope, it crashed.

But … systems always reveal interesting information when breaking down.

In this case it was the scala.actors.Scheduler that tried to start a new thread in order to serve Vladimir's "receive" inside his act-method. Creating this "new native thread" was more than the JVM could manage, and it gave up with an OutOfMemoryError.

Let us run this once again, just below the limit where it crashes.

danbj$ scala -Dactors.maxPoolSize=10000 godot.vladimirgalore 2539

and have a look at process status


danbj$ ps -m -O rss
  PID    RSS   TT  STAT      TIME COMMAND
 8164 264280 s001  R+     0:06.80 /usr/bin/java -Xmx256M …

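OK, "RSS" stands for "resident set size" and is basically "real memory". So, memory use is 264280 kB. Also note that the scala runtime has set the JVM "maximum heap size" (Xmx) to 256M. These two numbers are strikingly close to each other, and my first conclusion was that it is the allocated heap that has filled up. That cannot be the whole story, though: thread stacks are allocated outside the Java heap, and "unable to create new native thread" means that the operating system refused to hand out yet another thread. Either way, it is the sheer number of threads that exhausts the resources.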

This does make sense: a few thousand threads will need one allocated stack each, and that will eat the available memory pretty quickly. Trying out different numbers of actors gives us some data on how much memory is needed per actor.

actors    RSS (in kB)
    1    62752
  500    95676
 1000   134132
 2500   239248

So, apart from the roughly 62 MB the system needs just to start, every Vladimir (thread-based actor) seems to need around 70 kB. That is not very slim, not if we want to create truckloads of actors.
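The per-actor figure is simple arithmetic on the table: subtract the RSS of the one-actor run and divide by the number of extra actors. A small sketch of that calculation follows; the object and method names are made up for the illustration, and the numbers are copied from the measurements above.

```scala
// Hypothetical helper: marginal resident memory per actor, from the table above.
object RssPerActor {
  // (number of actors, RSS in kB)
  val samples: Seq[(Int, Int)] =
    Seq(1 -> 62752, 500 -> 95676, 1000 -> 134132, 2500 -> 239248)

  /** Extra kB of resident memory per additional actor, relative to the one-actor run. */
  def marginalKbPerActor(actors: Int, rssKb: Int): Double = {
    val (baseActors, baseRss) = samples.head
    (rssKb - baseRss).toDouble / (actors - baseActors)
  }

  def main(args: Array[String]): Unit =
    for ((actors, rss) <- samples.tail)
      println(f"$actors%5d actors: ${marginalKbPerActor(actors, rss)}%.1f kB per extra actor")
}
```

For 500, 1000 and 2500 actors this comes out at about 66, 71.5 and 70.6 kB per extra actor, so "around 70 kB" is a fair summary.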

Obviously, the thread based actors have their advantage in being pretty easy to understand. But they definitely have their drawback in eating system resources by asking for one thread and a stack allocation each.

But, hey … in our benchmark all the threads were in wait-state. Could they not have been pooled in some way? We could have gotten away with just a few threads, and we would have been able to create many more actors!

Sure we could, and that is exactly what event-based actors do. Instead of being on their feet like Vladimir, we want them to drowse off and take a nap like Estragon. In the meantime the thread can be used by some other actor that has something in its inbox it wants to process. Unfortunately, like Estragon, they will also lose a little bit of their sense of time. But that will have to wait until another letter.

Yours

Dan

ps Check out what it looks like using the event-based execution model - with a very similar programming model.