Harmony Generators in streamline.js

Harmony generators landed in a node.js fork this week. I couldn’t resist: I had to give them a try.

Getting started

If you want to try them, that’s easy. First, build and install node from Andy Wingo’s fork:

$ git clone https://github.com/andywingo/node.git node-generators
$ cd node-generators
$ git checkout v8-3.19
$ ./configure
$ make
# get a coffee ...
$ make install # you may need sudo for this one

Now, create a fibo.js file with the following code:

function* genFibos() {  
  var f1 = 1, f2 = 1;  
  while (true) {  
    yield f1;  
    var t = f1;  
    f1 = f2;  
    f2 += t;  
  }  
}

function printFibos() {
  var g = genFibos();
  for (var i = 0; i < 10; i++) {
    var num = g.next().value;
    console.log('fibo(' + i + ') = ' + num);
  }
}

printFibos();

And run it:

$ node --harmony fibo
fibo(0) = 1
fibo(1) = 1
fibo(2) = 2
fibo(3) = 3
fibo(4) = 5
fibo(5) = 8
fibo(6) = 13
fibo(7) = 21
fibo(8) = 34
fibo(9) = 55
$

Note that generators are not activated by default. You have to pass the --harmony flag to activate them.
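
Each call to g.next() returns an object with a value property and a done flag, which is why the loop above reads .value. Here is a minimal sketch that makes this explicit. The take helper is a hypothetical name introduced for illustration; it is not part of node or streamline.

```javascript
// Same generator as in fibo.js above.
function* genFibos() {
  var f1 = 1, f2 = 1;
  while (true) {
    yield f1;
    var t = f1;
    f1 = f2;
    f2 += t;
  }
}

// Hypothetical helper: collects at most n values from a generator object.
function take(gen, n) {
  var result = [];
  for (var i = 0; i < n; i++) {
    var step = gen.next();  // { value: ..., done: false } while suspended on a yield
    if (step.done) break;   // a finite generator would stop here
    result.push(step.value);
  }
  return result;
}

console.log(take(genFibos(), 5)); // [ 1, 1, 2, 3, 5 ]
```

Because genFibos never returns, done stays false here; the check only matters for generators that terminate.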

Using generators with streamline.js

I implemented generators support in streamline.js a year ago and blogged about it, but at the time I could only test it in Firefox, with a pre-harmony version of generators. I had to make a few changes to bring it on par with harmony, and I published the result to npm yesterday (version 0.4.11).

To try it, install or update streamline:

$ npm install -g streamline@latest # you may need sudo

Then you can run the streamline examples:

$ cp -r /usr/local/lib/node_modules/streamline/examples .
$ cd examples
$ _node_harmony --generators diskUsage/diskUsage
./diskUsage: 4501
./loader: 1710
./misc: 7311
./streamlineMe: 13919
./streams: 1528
.: 28969
completed in 7 ms

You have to use _node_harmony instead of _node to activate the --harmony mode in V8. You also have to pass the --generators option to tell streamline to use generators. If you do not pass this flag, the example will still work but in callback mode, and you won’t see much difference.

To see what the transformed code looks like, you can just pass the -c option to streamline:

$ _node_harmony --generators -c diskUsage/diskUsage._js

This command generates a diskUsage/diskUsage.js source file containing:

/*** Generated by streamline 0.4.11 (generators) - DO NOT EDIT ***/var fstreamline__ = require("streamline/lib/generators/runtime"); (fstreamline__.create(function*(_) {var du_ = fstreamline__.create(du, 0); /*
 * Usage: _node diskUsage [path]
 *
 * Recursively computes the size of directories.
 *
 * Demonstrates how standard asynchronous node.js functions
 * like fs.stat, fs.readdir, fs.readFile can be called from 'streamlined'
 * Javascript code.
 */
"use strict";

var fs = require('fs');

function* du(_, path) {
  var total = 0;
  var stat = (yield fstreamline__.invoke(fs, "stat", [path, _], 1));
  if (stat.isFile()) {
    total += (yield fstreamline__.invoke(fs, "readFile", [path, _], 1)).length;
  } else if (stat.isDirectory()) {
    var files = (yield fstreamline__.invoke(fs, "readdir", [path, _], 1));
    for (var i = 0; i < files.length; i++) {
      total += (yield du(_, path + "/" + files[i]));
    }
    console.log(path + ": " + total);
  } else {
    console.log(path + ": odd file");
  }
  yield ( total);
}

try {
  var p = process.argv.length > 2 ? process.argv[2] : ".";

  var t0 = Date.now();
  (yield du(_, p));
  console.log("completed in " + (Date.now() - t0) + " ms");
} catch (ex) {
  console.error(ex.stack);
}
}, 0).call(this, function(err) {
  if (err) throw err;
}));

As you can see, it looks very similar to the original diskUsage/diskUsage._js source. The main differences are:

  • Asynchronous functions are declared with function* instead of function.
  • Asynchronous functions are called with a yield, and with an indirection through fstreamline__.invoke if they are not directly in scope.

But otherwise, the code layout and the comments are preserved, like in --fibers mode.
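
To give a feel for how a yield can wait on a callback-style API, here is a minimal sketch. It is not streamline’s runtime: the invoke and run functions below are simplified, hypothetical stand-ins written for this illustration, and they only handle a single generator that yields node-style calls.

```javascript
var fs = require('fs');

// Hypothetical helper: wraps a node-style call into a "thunk", a function
// that only needs the final callback to start the operation.
function invoke(obj, name, args) {
  return function (cb) {
    obj[name].apply(obj, args.concat(cb));
  };
}

// Hypothetical driver: runs a generator that yields thunks.
function run(genFn, done) {
  var gen = genFn();
  function step(err, value) {
    var r;
    try {
      // Resume the generator with the result, or throw the error into it.
      r = err ? gen.throw(err) : gen.next(value);
    } catch (ex) {
      return done(ex);
    }
    if (r.done) return done(null, r.value);
    r.value(step); // start the async call; step is its callback
  }
  step();
}

run(function* () {
  var stat = yield invoke(fs, 'stat', ['.']);
  var files = yield invoke(fs, 'readdir', ['.']);
  return { isDirectory: stat.isDirectory(), entries: files.length };
}, function (err, result) {
  if (err) throw err;
  console.log(result);
});
```

The generator body reads like synchronous code, while each yield hands control back to the driver until the callback fires.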

You can execute this transformed file directly with:

$ npm link streamline # make streamline runtime available locally - may need sudo
$ node --harmony diskUsage/diskUsage

Benchmarks

Of course, the next step was to compare performance between the three streamline modes: callbacks, fibers and generators. This is a bit unfair because generators are really experimental and haven’t been optimized like the rest of V8 yet. Still, I wrote a little benchmark that compares the three streamline modes as well as a raw callbacks implementation. Here is a summary of my early findings:

  • In tight benches with lots of calls to setImmediate, raw callbacks outperform the others by a factor of 2 to 3.
  • Fibers always outperform streamline callbacks and generators modes.
  • Fibers beats everyone else, including raw callbacks, when the sync logic dominates the async calls. For example, it is 4 times faster than raw callbacks in the n=25, loop=1, modulo=1000, fn=setImmediate case.
  • Streamline callbacks and generators always come up very close, with a slight advantage to callbacks.
  • The results get much closer when real I/O calls start to dominate. For example, all results are in the 243 to 258 ms range with the simple loop of readMe calls.
  • The raw callbacks bench is more fragile than the others. It overflows the stack when the modulo parameter gets close to 5000. The others don’t.
  • The generators bench crashed when I set the modulo parameter to values below 2.
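
The stack overflow in the raw callbacks bench is typical of callback chains in which many callbacks fire synchronously. The sketch below is not the benchmark code. It assumes a hypothetical async-style function, syncStyle, that always calls back synchronously, and it contrasts a naive loop (loopRaw) with a trampolined one (loopTrampolined). All three names are made up for this illustration.

```javascript
// Hypothetical async-style function that calls back synchronously,
// as a cache hit typically would.
function syncStyle(cb) {
  cb(null, 1);
}

// Raw callbacks: every iteration adds stack frames that are never unwound.
function loopRaw(n, cb) {
  var total = 0;
  function next(i) {
    if (i === n) return cb(null, total);
    syncStyle(function (err, v) {
      total += v;
      next(i + 1);
    });
  }
  next(0);
}

// Trampolined: a synchronous callback only records its result and lets the
// while loop continue; a truly asynchronous one restarts the loop.
function loopTrampolined(n, cb) {
  var total = 0, i = 0;
  function pump() {
    while (i < n) {
      var sync = true, called = false;
      syncStyle(function (err, v) {
        total += v;
        i++;
        called = true;
        if (!sync) pump(); // asynchronous completion: resume the loop
      });
      sync = false;
      if (!called) return; // still pending: wait for the callback
    }
    cb(null, total);
  }
  pump();
}

try {
  loopRaw(1e6, function () {});
} catch (ex) {
  console.log('raw loop failed: ' + ex.message);
}

loopTrampolined(1e6, function (err, total) {
  console.log('trampolined total: ' + total); // trampolined total: 1000000
});
```

The trampolined version keeps the stack depth constant regardless of n, at the price of a few extra flags per call.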

My interpretation of these results:

  • The difference between streamline callbacks and raw callbacks is likely due to the fact that streamline provides some comfort features: long stack traces, automatic trampolining (avoids the stack overflow that we get with raw callbacks), TLS-like context, robust exception handling, etc. This isn’t free.
  • I expected very good performance from fibers when the sync/async code ratio increases. This is because the sync-style logic that sits on top of async calls undergoes very little transformation in fibers mode. So there is almost no overhead in the higher level sync-style code, not even the overhead of a callback. On the other hand fibers has more overhead than callbacks when the frequency of async calls is very high because it has to go through the fibers layer every time.
  • Generators are a bit disappointing but this is not completely surprising. First, they just landed in V8 and they probably aren’t optimized. But this is also likely due to the single frame continuation constraint: when you have to traverse several layers of calls before reaching the async calls, every layer has to create a generator and you need a run function that interacts with all these generators to make them move forward (see lib/generators/runtime.js). This is a bit like callbacks, where the callbacks impact all the layers that sit on top of async APIs, but not at all like fibers, where the higher layers don’t need to be transformed.
  • The fibers and generators benches are based on code which has been transformed by streamline, not on hand-written code. There may be room for improvement with manual code, although I don’t expect the gap to be in any way comparable to the one between raw callbacks and streamline callbacks. The fibers transformation/runtime is actually quite smart (Marcel wrote it). I wrote the generators transform and I think it is pretty efficient but it would be interesting to bench it against other solutions, for example against libraries that combine promises and generators (I think that those will be slower because they need to create more closures/objects but this is just a guess at this point).
  • The crashes in generators mode aren’t really anything to worry about. I was benching with bleeding edge software and I’m confident that the V8 generators gurus will fix them.
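
The single frame continuation constraint can be illustrated with a small sketch. This is not the code in lib/generators/runtime.js: delay, low, mid, top and run are hypothetical names, and the driver is a simplified assumption of how such a run function might work. The point is that each layer between the caller and the async leaf has to create its own generator, and the driver has to keep track of all of them.

```javascript
// Hypothetical async leaf, expressed as a thunk (a function taking a callback).
function delay(value) {
  return function (cb) {
    setImmediate(function () { cb(null, value); });
  };
}

// Three layers of calls: each one must be a generator to be suspendable.
function* low() { return (yield delay(1)) + 1; }
function* mid() { return (yield low()) * 10; }
function* top() { return (yield mid()) + 5; }

// Hypothetical driver: keeps an explicit stack of suspended generators.
function run(gen, done) {
  var stack = [gen];
  function resume(err, value) {
    while (stack.length > 0) {
      var r;
      try {
        var g = stack[stack.length - 1];
        r = err ? g.throw(err) : g.next(value);
        err = null;
      } catch (ex) {
        stack.pop();   // this layer failed: propagate to its caller
        err = ex;
        continue;
      }
      if (r.done) {
        stack.pop();   // layer finished: hand its result to the caller
        value = r.value;
      } else if (typeof r.value === 'function') {
        return r.value(resume); // async leaf: wait for its callback
      } else {
        stack.push(r.value);    // nested generator: descend into it
        value = undefined;
      }
    }
    done(err, value);
  }
  resume();
}

run(top(), function (err, result) {
  if (err) throw err;
  console.log(result); // 25
});
```

Every intermediate layer costs a generator object plus a trip through the driver loop, which is the overhead that fibers mode avoids.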

So yes, generators are coming… but they may struggle to compete head to head with raw callbacks and fibers on pure performance.
