Currying the callback, or the essence of futures…

Wow, this one is not for dummies!!!

I decided to play it pedantic this time because I posted a layman explanation of this thing on the node.js forum and nobody reacted. Maybe it will get more attention if I use powerful words like currying in the title. We’ll see…

Currying

I’ll start with a quick explanation of what currying means so that JavaScript programmers who don’t know it can catch up. After all, it is a bit of a scary word for something rather simple, and many JavaScript programmers have probably already tasted curry at some point without knowing what it was.

The idea behind currying is to take a function like

function multiply(x, y) { return x * y; }

and derive the following function from it:

function curriedMultiply(x) {
  return function(y) { return x * y; }
}

This function does something simple: it returns specialized multiplier functions. For example, curriedMultiply(3) is nothing other than a function which multiplies by 3:

function(y) {
  return 3 * y;
}

Note that curriedMultiply does not itself multiply, because it does not return numbers. Instead, it returns functions that multiply.

It is also interesting to note that multiply(x, y) is equivalent to curriedMultiply(x)(y).
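To make the equivalence concrete, here is a minimal sketch of a generic helper that curries any two-argument function. The name curry2 is an assumption made for this illustration; it does not come from any library.

```javascript
// curry2 is a hypothetical helper name, used only for this illustration.
function curry2(fn) {
  return function(x) {
    return function(y) { return fn(x, y); };
  };
}

function multiply(x, y) { return x * y; }

var curriedMultiply = curry2(multiply);
var triple = curriedMultiply(3); // a function that multiplies by 3

console.log(multiply(3, 4));        // 12
console.log(curriedMultiply(3)(4)); // 12
console.log(triple(5));             // 15
```

The specialized function (triple here) can be stored and passed around like any other value, which is the property the rest of the post relies on.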

Currying the callback

Now, what happens if we apply this currying principle to node APIs, to single out the callback parameter?

For example, by applying it to node’s fs.readFile(path, encoding, callback) function, we obtain a function like the following:

fs.curriedReadFile(path, encoding)

The same way our curriedMultiply gave us specialized multiplier functions, curriedReadFile gives us specialized reader functions. For example, if we write:

var reader = fs.curriedReadFile("hello.txt", "utf8");

we get a specialized reader function that only knows how to read hello.txt. This function is an asynchronous function with a single callback parameter. You would call it as follows to obtain the contents of the file:

reader(function(err, data) {
  // do something with data
});

Of course, we have the same equivalence as we did before with multiply: fs.readFile(path, encoding, callback) and fs.curriedReadFile(path, encoding)(callback) are equivalent.
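The most literal way to get this equivalence is probably something like the sketch below, which simply postpones the call to fs.readFile until the callback shows up. The name naiveCurriedReadFile and the scratch file are assumptions made for illustration. Note that with this version nothing is read until the reader is actually called.

```javascript
var fs = require("fs");
var os = require("os");
var path = require("path");

// Hypothetical name: a literal, "lazy" currying of fs.readFile.
// The read does not start until the returned reader gets its callback.
function naiveCurriedReadFile(file, encoding) {
  return function(callback) {
    fs.readFile(file, encoding, callback);
  };
}

// Usage: write a small scratch file, then read it back through the reader.
var file = path.join(os.tmpdir(), "curry-demo-" + process.pid + ".txt");
fs.writeFileSync(file, "hello", "utf8");

var reader = naiveCurriedReadFile(file, "utf8");
reader(function(err, data) {
  if (err) throw err;
  console.log(data); // hello
  fs.unlinkSync(file);
});
```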

This may sound silly, and you may actually think that this whole currying business is just pointless intellectual masturbation. But it is not! The interesting part is that if we are smart, we can implement curriedReadFile so that it starts the asynchronous read operation as soon as it is called, before any callback has been supplied. And we are not forced to use the reader right away. We can keep it around, pass it to other functions and have our program do other things while the I/O operation progresses. When we need the result, we will call the reader with a callback.

By currying, we have separated the initiation of the asynchronous operation from the retrieval of the result. This is very powerful because now we can initiate several operations in quick succession, let them do their I/O in parallel, and retrieve their results afterwards. Here is an example:

var reader1 = curriedReadFile(path1, "utf8");
var reader2 = curriedReadFile(path2, "utf8");
// I/O is parallelized and we can do other things while it runs

// further down the line:
reader1(function(err, data1) {
  reader2(function(err, data2) {
    // do something with data1 and data2
  });
});
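To see that the operations really do overlap, here is a small sketch that records when each one starts and ends. slowEcho is a made-up asynchronous function standing in for real I/O, and curriedSlowEcho is an eager curried wrapper written in the style the Implementation section describes; both names are assumptions.

```javascript
// slowEcho is a made-up asynchronous function standing in for real I/O.
var log = [];
function slowEcho(value, delay, callback) {
  log.push("start " + value);
  setTimeout(function() {
    log.push("end " + value);
    callback(null, value);
  }, delay);
}

// Eager currying: start the operation now, hand back a function that
// delivers the result to a single callback later.
function curriedSlowEcho(value, delay) {
  var done, err, result;
  var cb = function(e, r) { done = true; err = e; result = r; };
  slowEcho(value, delay, function(e, r) { cb(e, r); });
  return function(_) { if (done) _(err, result); else cb = _; };
}

var f1 = curriedSlowEcho("a", 30);
var f2 = curriedSlowEcho("b", 10);
// Both operations are already running at this point.

f1(function(err, a) {
  f2(function(err, b) {
    console.log(a + b);          // ab
    console.log(log.join(", ")); // start a, start b, end b, end a
  });
});
```

The second operation finishes before the first, yet the results are still picked up in the order the code asks for them.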

Futures

A future is a powerful programming abstraction that does more or less what I just described: it encapsulates an asynchronous operation and allows you to obtain the result later. Futures usually come with some API around them and a bit of runtime to support them.

My claim here is that we can probably capture the essence of futures with the simple currying principle that I just described. The reader1 and reader2 of the previous example are just futures, in their simplest form.

Implementation

This looks good but how hard is it to implement?

Fortunately, all it takes is a few lines of JavaScript. Here is our curried readFile function:

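var fs = require("fs");

function curriedReadFile(path, encoding) {
  var done, err, result;
  // Default callback: remember the outcome until someone asks for it.
  var cb = function(e, r) { done = true; err = e; result = r; };
  // Start the read now; cb is looked up only when the I/O completes.
  fs.readFile(path, encoding, function(e, r) { cb(e, r); });
  // The reader: deliver the stored outcome, or register _ as the callback.
  return function(_) { if (done) _(err, result); else cb = _; };
}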

I won’t go into a detailed explanation of how it works.
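For readers who want to trace it anyway, the sketch below walks through the two possible orderings. It swaps fs.readFile for a hand-triggered stand-in (fakeRead, a hypothetical helper) so that we decide exactly when the "I/O" completes. The trick to watch is the small wrapper function(e, r) { cb(e, r); }: it looks up cb at completion time, so whichever callback is current at that moment gets the result.

```javascript
// A hand-triggered stand-in for fs.readFile (hypothetical, for tracing only):
// it stores the completion callback so we can decide when the "I/O" finishes.
var pending = [];
function fakeRead(name, callback) { pending.push(callback); }

function curriedFakeRead(name) {
  var done, err, result;
  var cb = function(e, r) { done = true; err = e; result = r; };
  fakeRead(name, function(e, r) { cb(e, r); });
  return function(_) { if (done) _(err, result); else cb = _; };
}

var seen = [];

// Case 1: the I/O completes first, the caller asks later.
var early = curriedFakeRead("a.txt");
pending.shift()(null, "A");              // default cb stores the result
early(function(e, r) { seen.push(r); }); // done is true: callback runs at once

// Case 2: the caller asks first, the I/O completes later.
var late = curriedFakeRead("b.txt");
late(function(e, r) { seen.push(r); });  // not done: cb becomes the caller's callback
pending.shift()(null, "B");              // completion goes straight to the caller

console.log(seen); // [ 'A', 'B' ]
```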

Going one step further

Now, we can go one step further and create a generic utility that will help us currify any asynchronous function. Here is a simplified version of this utility (the complete source is on streamline’s GitHub site):

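function future(fn, args, i) {
  var done, err, result;
  // Until a caller supplies its own callback, just record the outcome.
  var cb = function(e, r) { done = true; err = e; result = r; };
  args = Array.prototype.slice.call(args);
  // Put our callback in slot i, where fn expects its callback.
  args[i] = function(e, r) { cb(e, r); };
  // Start the operation right away.
  fn.apply(this, args);
  // The future: a function that takes a single callback.
  return function(_) { if (done) _(err, result); else cb = _; };
}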

With this utility we can rewrite curriedReadFile as:

function curriedReadFile(path, encoding) {
  return future(fs.readFile, arguments, 2);
}

And then, we could even go one step further, and tweak the code of the original fs.readFile in the following way:

function readFile(path, encoding, callback) {
  if (!callback) return future(readFile, arguments, 2);
  // readFile's body
}

With this tweak we obtain a handy API that can be used in two ways:

  • directly, as a normal asynchronous call:
    fs.readFile("hello.txt", "utf8", function(err, data) { ... });
  • indirectly, as a synchronous call that returns a future:
    // somewhere:
    var reader = fs.readFile("hello.txt", "utf8");
    
    // elsewhere:
    reader(function(err, data) { ... });
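Since we cannot easily patch node's own fs.readFile here, the sketch below applies the same tweak to a made-up asynchronous function, delayedSum (an assumption for illustration), and exercises both calling styles. It repeats the simplified future utility so that it runs on its own.

```javascript
// Simplified future utility, as in the post.
function future(fn, args, i) {
  var done, err, result;
  var cb = function(e, r) { done = true; err = e; result = r; };
  args = Array.prototype.slice.call(args);
  args[i] = function(e, r) { cb(e, r); };
  fn.apply(this, args);
  return function(_) { if (done) _(err, result); else cb = _; };
}

// delayedSum is a made-up asynchronous function used for illustration.
function delayedSum(a, b, callback) {
  // No callback supplied: start the work and return a future.
  if (!callback) return future(delayedSum, arguments, 2);
  setTimeout(function() { callback(null, a + b); }, 10);
}

// Direct style
delayedSum(1, 2, function(err, sum) {
  console.log("direct:", sum); // direct: 3
});

// Future style
var pendingSum = delayedSum(3, 4);
pendingSum(function(err, sum) {
  console.log("future:", sum); // future: 7
});
```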

Blending it with streamline.js

I have integrated this into streamline.js. The transformation engine adds the little if (!callback) return future(...) test in every function it generates. Then, every streamlined function can be used either directly, as an asynchronous call, or indirectly, to obtain a future.

Moreover, obtaining a value from a future does not require any hairy callback code any more because the streamline engine generates the callbacks for you. Everything falls nicely into place, as the following example demonstrates:

function countLines(path, _) {
  return fs.readFile(path, "utf8", _).split('\n').length;
}

function compareLineCounts(path1, path2, _) {
  // parallelize the two countLines operations
  // with two futures.
  var n1 = countLines(path1);
  var n2 = countLines(path2);
  // get the results and combine them
  return n1(_) - n2(_);
}
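For comparison, here is roughly what the same logic might look like when written by hand with plain callbacks and the simplified future utility. This is a sketch under my own assumptions, not the code that streamline actually generates; the scratch files exist only to make it runnable.

```javascript
var fs = require("fs");
var os = require("os");
var path = require("path");

// Simplified future utility, as in the post.
function future(fn, args, i) {
  var done, err, result;
  var cb = function(e, r) { done = true; err = e; result = r; };
  args = Array.prototype.slice.call(args);
  args[i] = function(e, r) { cb(e, r); };
  fn.apply(this, args);
  return function(_) { if (done) _(err, result); else cb = _; };
}

function countLines(file, callback) {
  if (!callback) return future(countLines, arguments, 1);
  fs.readFile(file, "utf8", function(err, text) {
    if (err) return callback(err);
    callback(null, text.split("\n").length);
  });
}

function compareLineCounts(file1, file2, callback) {
  // Start both reads before waiting on either.
  var n1 = countLines(file1);
  var n2 = countLines(file2);
  n1(function(err, count1) {
    if (err) return callback(err);
    n2(function(err, count2) {
      if (err) return callback(err);
      callback(null, count1 - count2);
    });
  });
}

// Usage with two scratch files (3 lines and 1 line).
var dir = fs.mkdtempSync(path.join(os.tmpdir(), "futures-"));
var a = path.join(dir, "a.txt");
var b = path.join(dir, "b.txt");
fs.writeFileSync(a, "one\ntwo\nthree");
fs.writeFileSync(b, "one");

compareLineCounts(a, b, function(err, diff) {
  if (err) throw err;
  console.log(diff); // 2
});
```

The nesting and the repeated error checks are the "hairy callback code" that the streamlined version avoids.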

Wrapping it up

I was looking for an elegant way to implement futures in streamline.js and I’m rather happy with this design. I don’t know if it will get wider adoption but I think that it would be nice if node’s API could behave this way and return futures when called without a callback. Performance should not be an issue because all that is required is a simple test upon function entry.

I also find it cute that a future would just be the curried version of an asynchronous function.

This entry was posted in Asynchronous JavaScript, Uncategorized.

17 Responses to Currying the callback, or the essence of futures…

  1. Great stuff, as usual! Just one comment: in your sample implementation code of future(), you say fn(args) when I think you mean fn.apply(this, args).

  2. Pingback: links for 2011-04-05 « Citysearch® Australia Code Monkeys

  3. Andreas Kalsch says:

    Déjà vu for me – some months ago I wrote a memoizer for async functions: http://github.com/akidee/node_memo
    It guarantees that two identical function calls will be executed only once, which is more challenging to handle with callbacks.

  4. Pablo says:

    I won’t go into a detailed explanation of how it works.

    I’ll try to :)

    I’ve created a gist with the currying function explained with comments (I wrote it as I tried to understand it).

    Comments are welcome

  5. gozala says:

    Cool stuff!! I think there is some overlap with what I’m trying to get to with:

    https://github.com/Gozala/streamer/blob/master/readme.js

    • There is some overlap and I’m working on some new stuff which is clearly on the same tracks as yours. The more it goes, the more I get convinced that this “future” design (functions that take a single callback parameter) is a very elegant way to deal with asynchronous programming in JS. I’d like to publish my new stuff but I need to validate it with my employer first (and people are on vacation).

      Bruno

  6. tnlogy says:

    Nice! Was looking for some interesting future implementation, since I liked the feature in the language Mozart-Oz.

    • Thanks for the feedback.

      FYI, the implementation that I gave in the post corresponds to what is currently in streamline’s master branch and what is published on NPM.
      This implementation is a bit limited because it supports only one read attempt on the future. So you cannot distribute the future to several agents who would try to read it independently from each other.

      The runtime branch has a better implementation which supports any number of reads. See https://github.com/Sage/streamlinejs/blob/runtime/lib/compiler/runtime.js for details (search for __future). I’m going to merge this into master when I get the time.
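      To illustrate the difference, here is a minimal sketch of a future that accepts any number of reads by queuing callbacks instead of keeping a single one. The name multiFuture and the hand-triggered manualOp are assumptions made for this example; this is not the code from streamline's runtime branch.

```javascript
// multiFuture is a hypothetical name; it queues callbacks instead of
// remembering only the last one.
function multiFuture(fn, args, i) {
  var done = false, err, result, waiters = [];
  args = Array.prototype.slice.call(args);
  args[i] = function(e, r) {
    done = true; err = e; result = r;
    // Deliver to everyone who asked before completion.
    waiters.splice(0).forEach(function(w) { w(err, result); });
  };
  fn.apply(this, args);
  return function(_) {
    if (done) _(err, result);
    else waiters.push(_);
  };
}

// A hand-triggered operation so the ordering is easy to follow.
var finish;
function manualOp(callback) { finish = callback; }

var f = multiFuture(manualOp, [], 0);
var got = [];
f(function(e, r) { got.push("first:" + r); });
f(function(e, r) { got.push("second:" + r); });
finish(null, 42);
f(function(e, r) { got.push("third:" + r); });

console.log(got); // [ 'first:42', 'second:42', 'third:42' ]
```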

  7. Pingback: A Node.js Experiment: Thinking Asynchronously, Using Recursion to Calculate the Total File Size in a Directory « Procbits

  8. Pingback: 锋谈Node.js开发技巧 - 炫意 HTML5

  9. Pingback: 专家观点——袁锋谈Node.js开发技巧 | chainding

  10. Pingback: what a fuck callback! | BLOG OF PARADOX

  11. Jason McCreary says:

    Great line – “pointless intellectual masturbation”

  12. How to do the following, ideally via https://github.com/Sage/streamlinejs (as it seems quite possibly the best, most flexible asynchronous management for JavaScript I’ve seen –very impressive!): a potentially common need I have: very regularly and largely unpredictably, my code needs to load a typically small amount of data from local and/or remote storage. Each such request can take arbitrarily long to retrieve, so I want to do it async. At the time of request, this data is key, generally bringing the foreground task to a halt until it has this data, so because of that and more, I want it to try retrieving (requesting) the data from all the sources it could be at (up to say 50) as much in parallel as the hardware will support, with say 10 simultaneous/pending requests at all times until fewer than that remain (note the flow funnel https://github.com/Sage/streamlinejs/blob/master/lib/util/flows.md seems like it could be used here), where (the tricky part) as soon as NOT all but *any* one source/request, notably **the first to**, returns the data, I want this main task, now with its data, to immediately resume/continue execution where it left off (which is likely deep inside some call stack). And meanwhile, I don’t want to waste CPU cycles round-robin checking through all (or any) of the pending requests to see which has completed; rather, once requests are pending, further computation should be done only & immediately as each request returns or times out. Moreover, as one would probably guess, I don’t generally know which source/location will have the data nor how long each will take to respond (yes, in future versions I aim to be smarter here, but there will still regularly be cases where which particular locations/sources to check will be partly or entirely unknown). Finally, to conserve bandwidth, once it’s got the data, I likely want to cancel pending requests and not send out any more, or at least make them a lot lower priority.

    So how to do it?

    • Futures https://github.com/Sage/streamlinejs/wiki/Futures as presented do not seem like they will work: the moment my code checks on a future’s value, even just to see if it is now available, it seems my code will suspend until it is available; meanwhile any number of other requests could have come in, making me wait for nothing. And neither my initiation code nor any other code should be polling, which is a waste of CPU.
    • It does appear to me this is all doable by callbacks, but I don’t see how via the super-simplified abstraction streamlinejs provides, though perhaps by explicit named callbacks instead of the usual “_”, which https://github.com/Sage/streamlinejs/wiki/Fast-mode#syntax-summary may suggest is possible (perhaps as an event handler or a named callback). But without examples, how to code it, at least with streamlinejs, isn’t clear to me.
    • Hi,

      What you are asking for is some kind of select call that waits on several async operations, returns as soon as one of the operations completes, and cancels the other pending operations.

      This is similar to a point that was discussed in https://github.com/bjouhier/galaxy/issues/5. The gist was designed for galaxy but I have just added a streamline.js variant. Maybe I should include this waitForOne function in streamline.js.

      This implementation does not handle cancelling. I did experiment with a cancelling API in streamline.js (see https://github.com/Sage/streamlinejs/issues/106) but I did not publish it. If you want to discuss this, we should do it in this github issue.

      Note that waitForOne is implemented with a mix of futures and callbacks rather than with streamline’s magic _ token. The idea behind streamline is not to force you to write everything with futures and _. These features are a natural fit for most of the code we write, but there are a few special functions (like waitForOne or funnel) that cannot be implemented without some form of callbacks.
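      As a rough idea of the shape such a function could take, here is a minimal sketch that accepts an array of futures (functions taking a single callback) and reports the first one to complete. It is written from scratch for illustration and is not the code from the gist; like that implementation, it does not cancel anything. The helper after is a made-up timer-based future.

```javascript
// Sketch only: reports the first future to complete (success or error)
// and ignores the rest. No cancellation.
function waitForOne(futures, callback) {
  var settled = false;
  futures.forEach(function(f, index) {
    f(function(err, result) {
      if (settled) return; // ignore later completions
      settled = true;
      callback(err, result, index);
    });
  });
}

// after is a made-up helper: a future that yields value after ms milliseconds.
function after(ms, value) {
  var done, result;
  var cb = function(e, r) { done = true; result = r; };
  setTimeout(function() { cb(null, value); }, ms);
  return function(_) { if (done) _(null, result); else cb = _; };
}

waitForOne([after(30, "slow"), after(5, "fast")], function(err, result, index) {
  console.log(result, index); // fast 1
});
```

      Because each future is read exactly once here, the single-read limitation of the simple implementation is not a problem.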
