Yield – Resume vs. Asynchronous Callbacks – An Equivalence

I introduced streamline.js about four months ago but I never took the time to explain the theory (if it can be called a theory) behind it. I’ll try to do it in this post.

The material that follows is a lot less entertaining than the little tale I started with. The tale could give the wrong impression that streamline.js is just a cute hack. This post gives a different perspective.

Y-R Javascript

For this investigation, I won’t use streamline.js as is because its syntax does not differentiate it well enough from vanilla Javascript. Instead, I will introduce a fictional language that I will call Y-R Javascript (for Yield and Resume Javascript).

Y-R Javascript is a small extension to Javascript. It introduces a new operator that can be applied to function definitions and function calls:

function myFunc@(p1, p2) {
  // function body here
}

result = myFunc@(arg1, arg2);

The @ sign is the yield and resume operator. When present in a function definition, it means that the function executes asynchronously. Somewhere in its body, the function yields to the system and the system calls it back with a result or an error. It may actually yield and resume more than once before returning to its caller. I will call such a function an asynchronous function, in contrast with normal Javascript functions (defined without @) that I will naturally call synchronous functions.

Y-R Javascript gives you two possibilities to call an asynchronous function.

  • The first one is to call it as myFunc@(arg1, arg2). In this case, the function will yield and resume (maybe several times) and then return a result to its caller (or throw an exception).
  • The second one is to call it as myFunc(arg1, arg2) (without @). In this case, the function will synchronously return a future F that you may call later as F@() to obtain a result. Yielding will happen when you call F@(), not in the initial call to myFunc.

I will also be assuming that the runtime library of Y-R Javascript contains a mixture of synchronous and asynchronous functions. The developer who writes asynchronous programs with Y-R Javascript does not use callbacks. Instead, he uses the @ operator to call asynchronous functions very much like synchronous ones. For example:

function lineCount@(fname) {
  return readFile@(fname, "utf8").split("\n").length;
}

function compareLC@(fname1, fname2) {
  // serial version: start reading fname2 only after
  // getting the count for fname1
  return lineCount@(fname1) - lineCount@(fname2);
}

function fastCompareLC@(fname1, fname2) {
  // parallelized I/O with futures
  var f1 = lineCount(fname1);
  var f2 = lineCount(fname2);
  // both I/O operations have been started.
  // now we yield and resume to obtain the results.
  return f1@() - f2@();
}

Y-R Javascript also supports anonymous asynchronous functions. The syntax is naturally:

function@(args) {
  // body
}

Y-R Javascript imposes the following rule:

Asynchronous calls are contagious: if a function contains an asynchronous call (an @ call), then the function itself must be defined as an asynchronous function (defined with an @).

So, for example, a Y-R Javascript compiler would reject the following function definition:

function lineCount(fname) {
  return readFile@(fname, "utf8").split("\n").length;
}

On the other hand, it would accept the following code:

function getLineCountFunc() {
  return function@(fname) {
    return readFile@(fname, "utf8").split("\n").length;
  }
}

In this last sample, getLineCountFunc is a synchronous function that returns an asynchronous function. The rule given above is not broken because getLineCountFunc does not contain any asynchronous function calls. What it contains is an asynchronous function definition.

There is also an additional rule to deal with top level calls:

Y-R Javascript does not allow asynchronous calls at the top level of a script module because this would conflict with the fact that require works synchronously.

The only exception to this rule is a main script in which we allow top level asynchronous calls.

Equivalence

In this post I want to investigate the relationship between this fictional Y-R Javascript language on one side and Javascript with asynchronous callbacks on the other side.

I will show that any program written in Y-R Javascript can be mechanically translated into an equivalent Javascript program with asynchronous callbacks, and vice versa.

From Y-R to Callbacks

The translation from Y-R Javascript to Javascript with asynchronous callbacks requires many steps. I will not give a complete proof in this post but I will try to give enough information to convince the reader that this transformation is possible and that it preserves the semantics of the Y-R Javascript source.

Synchronous functions

The first thing to observe is that synchronous function definitions may only contain synchronous function calls. So synchronous functions do not need any translation at all.

If a synchronous function contains an asynchronous function definition, like getLineCountFunc above, then the synchronous function itself does not need any translation but the asynchronous function definition that it contains does need to be translated.

Main script with top level asynchronous calls

So we only have to translate asynchronous function definitions and top level asynchronous calls.

We can eliminate the case of top-level asynchronous calls. As mentioned above, these calls are only allowed in main scripts. Here is an example:

var count = readFile@(argv[1], "utf8").split("\n").length;
console.log(argv[1] + " has " + count + " lines.");

This script can be wrapped as follows:

(function@() {
  var count = readFile@(argv[1], "utf8").split("\n").length;
  console.log(argv[1] + " has " + count + " lines.");
})();

This transformation gives us a module that contains a single synchronous call to an anonymous asynchronous function.

The problem has been simplified. We don’t need to worry about top level asynchronous statements any more. If we know how to transform an arbitrary asynchronous function definition then we will have solved our problem.

Canonicalization steps

Our goal now is to eliminate all the @ calls in an arbitrary function definition and replace them by logic based on asynchronous callbacks.

But we are not going to introduce the callbacks right away. Before that, we are going to transform the Y-R Javascript code into a canonical form. The goal is to reduce the number of language constructs that we have to convert to callback form afterwards.

Each of these canonicalization steps is a simple transformation that preserves the semantics of the program. So the program that we obtain after applying all these steps will behave exactly like the original program.

Moving variable declarations

The first canonicalization step moves all the variable declarations to the beginning of the function body. Here is an example:

function f1@() {
  var v1 = 2;
  if (f2@()) {
    var v2 = f3@();
    return v1 + v2;
  }
  return v1;
}

becomes:

function f1@() {
  var v1, v2;
  v1 = 2;
  if (f2@()) {
    v2 = f3@();
    return v1 + v2;
  }
  return v1;
}

Moving function declarations

We apply the same treatment to function declarations (which are subject to hoisting). For example:

function f1@() {
  f2@(f3());
  function f3() {...}
}

becomes:

function f1@() {
  function f3() {...}
  f2@(f3());
}

Referencing this and arguments with variables

If the function body contains the this keyword, we introduce a new variable to reference this and we replace all occurrences of this with that variable. For example:

function f1@() {
  f2@(this.x);
  f3(this);
}

becomes:

function f1@() {
  var __this = this;
  f2@(__this.x);
  f3(__this);
}

The arguments variable is handled in a similar way.

These three steps will allow us to introduce anonymous functions when we convert the body, without having to worry about the scope of variables and functions defined inside the body, and without having to worry about the fact that function calls may modify the value of this.

Canonicalizing loops

The next step is to transform loops. Javascript has several loop constructs but the most general one is the for(init; cond; incr) loop and they can all be converted to this form. We also move the init clause (if any) before the loop. Here are some examples:

while (cond) {
  ...
}

becomes:

for (; cond; ) {
  ...
}

do {
  ...
} while (cond);

becomes:

var __1;
__1 = true;
for (; __1 || cond; ) {
  __1 = false;
  ...
}

for (init; cond; incr) {
  ...
}

becomes:

init;
for (; cond; incr) {
  ...
}

for (key in obj) {
  ...
}

becomes:

var __1, __2, __3;
__1 = __keys(obj);
__2 = __1.length;
__3 = 0;
for (; __3 < __2; __3++) {
  key = __1[__3];
  ...
}

Note: __keys is a little helper function that returns the properties of the object as an array. It is different from Object.keys because it also returns the prototype’s properties. To be completely rigorous we should generate an extra test in the loop to skip properties that may have been deleted during the loop.
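To make this concrete, here is a minimal sketch of what such a helper could look like (my own illustration; the actual streamline helper may differ):

function __keys(obj) {
  // unlike Object.keys, a for-in loop also walks the prototype chain
  var keys = [];
  for (var key in obj) keys.push(key);
  return keys;
}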

If the loop condition is a complex condition containing one or more asynchronous calls, we transform it as follows:

for (; cond; incr) {
  ...
}

becomes:

for (; (function@() {
  return cond;
})@(); incr) {
  ...
}

This last transformation ensures that the whole condition will be evaluated at each iteration, even after it has been disassembled (see step below).

Splitting try/catch/finally

This step is very straightforward: we split try/catch/finally statements in two, as follows:

try {
  ...
} catch (ex) {
  ...
} finally {
  ...
}

becomes:

try {
  try {
    ...
  } catch (ex) {
    ...
  }
} finally {
  ...
}

This will allow us to handle try/catch and try/finally with independent transformations.

Eliminating lazy operators

The next canonicalization step will eliminate the three lazy operators of Javascript (&&, || and ?:) when they combine asynchronous expressions. These operators will be converted into function calls as follows:

exp1 && exp2

becomes:

(function@() {
  var v = exp1;
  if (!v) return v;
  return exp2;
})@();

exp1 || exp2

becomes:

(function@() {
  var v = exp1;
  if (v) return v;
  return exp2;
})@();

cond ? exp1 : exp2

becomes:

(function@() {
  if (cond) return exp1;
  else return exp2;
})@();

Thanks to this transformation, we won’t have to worry about lazy operators any more. They have been converted to asynchronous functions that contain if/else statements.

Disassembling expressions

The next canonicalization step will disassemble all expressions that contain asynchronous calls. Here are three examples:

function f1@() {
  var x, y;
  x = f2@(f3() + f4@());
  y = f5@(x).f6@();
  ...
}

becomes:

function f1@() {
  var x, y, __1, ...
  __1 = f3();
  __2 = f4@();
  __3 = __1 + __2;
  x = f2@(__3);
  __4 = f5@(x);
  y = __4.f6@();
  ...
}

function f1@() {
  if (f1@() < f2() + f3@()) {
    ...
  }
}

becomes:

function f1@() {
  var __1, __2, ...
  __1 = f1@();
  __2 = f2();
  __3 = f3@();
  __4 = __2 + __3;
  __5 = __1 < __4;
  if (__5) {
    ...
  }
}

function f1@() {
  return f1@() + f2() + f3@();
}

becomes:

function f1@() {
  var __1, __2, ...
  __1 = f1@();
  __2 = f2();
  __3 = f3@();
  __4 = __1 + __2 + __3;
  return __4;
}

This transformation can be implemented by processing the parse tree in a bottom up fashion and introducing an intermediate variable (__1, __2, etc.) for every node.
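To make the idea concrete, here is a minimal sketch of such a bottom-up pass over a toy parse tree (the node shapes are my own assumptions, not narcissus’ actual AST):

var __tmp = 0;
// Walks an expression tree bottom-up, pushes one assignment per inner node
// into stmts, and returns the name (or literal) that denotes the node's value.
function disassemble(node, stmts) {
  if (node.type === "call") {
    var args = node.args.map(function(a) { return disassemble(a, stmts); });
    var v = "__" + (++__tmp);
    stmts.push(v + " = " + node.callee + (node.async ? "@(" : "(") + args.join(", ") + ");");
    return v;
  } else if (node.type === "binary") {
    var l = disassemble(node.left, stmts);
    var r = disassemble(node.right, stmts);
    var v = "__" + (++__tmp);
    stmts.push(v + " = " + l + " " + node.op + " " + r + ";");
    return v;
  } else {
    // identifiers and literals need no temporary
    return node.value;
  }
}

Feeding it the tree for f2@(f3() + f4@()) produces the statements __1 = f3();, __2 = f4@();, __3 = __1 + __2; and __4 = f2@(__3);, in evaluation order, just like the first example above.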

As demonstrated by the second example, the transformation moves asynchronous calls outside of if conditions. It also moves them outside of switch, return and throw expressions, as demonstrated by the third example above. On the other hand, it does not move asynchronous calls outside of loop conditions; these will be moved a bit later (see the loop transformation pattern below).

This transformation is valid because we have taken special care of loop conditions and lazy operators. The following transformation would have been incorrect:

if (f1@() && f2@()) {
  ...
}

becomes:

var __1, __2, __3;
__1 = f1@();
__2 = f2@(); // wrong!
__3 = __1 && __2;
if (__3) {
  ...
}

Post-canonicalization state

We haven’t introduced any callbacks so far. We have just transformed our Y-R Javascript source into another Y-R Javascript source. It is relatively easy to verify that the individual transformations that we have applied preserve execution semantics. So, the transformed source that we have at this stage is another Y-R Javascript program which has the same observable behaviors as the original program.

But the new source will be easier to transform than the original one because all the complex expressions have been disassembled. Asynchronous calls will only be found in the following constructs:

  • func@(args) or obj.func@(args) (asynchronous procedure call)
  • v = func@(args) or v = obj.func@(args) (asynchronous function call)
  • for (; func@(args); incr) or for (; obj.func@(args); incr) (loop condition)

We are now ready to introduce the callbacks.

Transforming an asynchronous function definition

An asynchronous function definition is transformed as follows:

function f@(a1, a2) {
  ...
}

becomes:

function f(__cb, a1, a2) {
  if (!__cb)
    return __future.call(this, f, arguments);
  ...
  __cb();
}

The __cb argument is the callback that the transformed f function will use to return a value or an error. It has the standard node.js callback signature:

__cb(err, result)

The callback is passed as first rather than last argument. This choice is different from the standard node.js convention but it makes it completely trivial to handle optional parameters.
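A hypothetical example (my own, not from the streamline runtime) shows why: with the callback always in first position, a trailing parameter can be made optional without any argument shuffling to locate the callback.

// hypothetical transformed function with an optional 'times' parameter;
// the callback never moves: it is always arguments[0]
function repeat(__cb, text, times) {
  if (times == null) times = 2; // plain default, no callback detection needed
  process.nextTick(function() {
    __cb(null, new Array(times + 1).join(text));
  });
}

repeat(function(err, s) { console.log(s); }, "ab");    // "abab"
repeat(function(err, s) { console.log(s); }, "ab", 3); // "ababab"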

__future is a helper function similar to the one I described in my April post.

The transformation adds a __cb() call at the end of the function body because the callback must be called if execution reaches this point without hitting a return or a throw.

Transforming return and throw statements

The disassemble step has moved asynchronous calls outside of return and throw statements. So these statements can now be easily converted:

return exp;

becomes:

return __cb(null, exp);

throw exp;

becomes:

return __cb(exp);

Transforming loops

All loops have already been transformed into a for loop. The loops that do not contain any asynchronous calls do not need to be transformed any further. The loops that contain asynchronous calls will be converted to callback form as follows:

for (; cond; incr) {
  loopBody
}
afterLoop

becomes:

var __beenHere = false;
(function(__break) {
  (function __loop() {
    if (__beenHere) {
      incr
    } else {
      __beenHere = true;
    }
    var __doIt = cond;
    if (__doIt) {
      loopBody;
      __loop();
    } else {
      __break();
    }
  })();
})(function() {
  afterLoop
});

break and continue statements that appear inside the loop will be transformed as follows:

break;

becomes:

return __break();

continue;

becomes:

return __loop();

Transforming if statements

Similarly, if statements that do not contain any asynchronous calls do not need to be transformed. The ones that contain asynchronous calls will be transformed as follows:

if (cond) {
  thenBody
}
else {
  elseBody
}
afterIf

becomes:

(function(__then) {
  if (cond) {
    thenBody
    __then();
  }
  else {
    elseBody
    __then();
  }
})(function() {
  afterIf
});

Transforming switch statements

switch statements have a similar transformation pattern:

switch (exp) {
case v1:
  v1Body
  break;
case v2:
case v3:
  v2v3Body
  break;
default:
  defaultBody
}
afterSwitch

becomes:

(function(__break) {
  switch (exp) {
  case v1:
    v1Body
    return __break();
  case v2:
  case v3:
    v2v3Body
    return __break();
  default:
    defaultBody
    return __break();
  }
})(function() {
  afterSwitch
});

break statements contained in the branches of a switch will be transformed as follows:

break;

becomes:

return __break();

Here we assume that the branches of the switch are all terminated by a break, a return or a throw. If this is not the case and one of the branches falls through into the next one, the transformation is still possible but a bit more complex. I will not describe it here.

Transforming try/catch

try/catch statements are a bit more challenging. The pattern is the following:

try {
  tryBody
}
catch (ex) {
  catchBody
}
afterTryCatch

becomes:

(function(__then) {
  (function(__cb) {
    __tryCatch(__cb, function() {
      tryBody;
      __then();
    });
  })(function(ex, __result) {
    __tryCatch(__cb, function() {
      if (ex) {
        catchBody;
        __then();
      } else {
        __cb(null, __result);
      }
    });
  });
})(function() {
  __tryCatch(__cb, function() {
    afterTryCatch;
  });
});

The __tryCatch helper function is defined as follows:

function __tryCatch(__cb, fn) {
  try {
    fn();
  } catch (e) {
    try {
      __cb(e);
    } catch (ex) {
      console.error("UNCAUGHT EXCEPTION: " 
        + ex.message + "\n" + ex.stack);
    }
  }
}

The pattern is rather heavy but it is based on a simple idea: we execute the try body in a different scope which overrides __cb. The overridden __cb executes the catch body if called with an error. Otherwise, it forwards the result to the enclosing __cb callback.

The amount of try/catch logic is really overwhelming and may look paranoid. But it is not. The problem is that most of this code is executed in the context of a callback, and very strange things (like jumping into catch clauses that belong to functions from which we have already returned) would happen if we did not plug all the holes and if we let some exceptions bubble up through the callback chain. This is why we invoke __cb via the __tryCatch helper function.
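A tiny standalone example (my own, not generated code) shows the danger: by the time a callback throws, the try/catch that lexically surrounds the call has already been exited, so the exception escapes to the event loop instead of reaching the catch clause.

function risky(cb) {
  process.nextTick(function() { cb(null, "ok"); });
}
try {
  risky(function(err, result) {
    throw new Error("thrown from inside the callback");
  });
} catch (e) {
  // never reached: risky() returned before the callback ran,
  // so this catch block is no longer on the stack
  console.log("caught", e.message);
}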

Transforming try/finally

try/finally is our best in show. The pattern is the following:

try {
  tryBody
}
finally {
  finallyBody
}
afterTryFinally

becomes:

(function(__then) {
  (function(__cb) {
    __tryCatch(__cb, function() {
      tryBody;
      __cb(null, null, true);
    });
  })(function(__e, __r, __cont) {
    (function(__then) {
      __tryCatch(__cb, function() {
        finallyBody;
        __then();
      });
    })(function() {
      __tryCatch(__cb, function() {
        if (__cont)
          __then();
        else
          __cb(__e, __r);
      });
    });
  });
})(function() {
  __tryCatch(__cb, function() {
    afterTryFinally;
  });
});

The idea is again to override __cb. I won’t explain it much further here.

Transforming the remaining blocks

This is probably getting very boring, but we are now close to the end of our transformation journey.

We have now transformed all the control flow statements and we are left with blocks that contain asynchronous function calls.

If you look carefully at the transformations that we have used in the callback generation phase, you will notice the following:

  • Every time a control flow statement was encountered, the control flow statement and the statements that followed it have been replaced by a function call like (function(__then) { ... })(function() { ... }).
  • All the code blocks that have been transferred inside these functions are now followed by a function call like __then(), __cb() or __break().

So, we are left with blocks that contain the asynchronous function calls produced by the disassembly pass, and that are terminated by a function call (the two cases above).

The blocks that contained a return or a throw statement before the transformation now have a return __cb(...) statement before their ending function call. These blocks can be simplified by removing all the statements that follow the return __cb(...) statement. After this simplification we can also remove the return keyword on the return __cb(...) statement.

So we are left with blocks that contain:

  • synchronous statements that we don’t need to transform.
  • func@(args); statements
  • obj.func@(args); statements
  • v = func@(args); statements
  • v = obj.func@(args); statements
  • a function call at the end.

If we manage to transform the statements that contain an asynchronous call we will be done!

Transforming asynchronous calls

The asynchronous calls that remain will be transformed with the following patterns:

func@(args);
afterCall

becomes:

func(function(__err) {
  if (__err) return __cb(__err);
  afterCall
}, args);

v = func@(args);
afterCall

becomes:

func(function(__err, __result) {
  if (__err) return __cb(__err);
  v = __result;
  afterCall
}, args);

Note: the obj.func@(args) statements are transformed like the func@(args) statements.

That’s it! We now have a complete mechanical process that can transform any Y-R Javascript program into an equivalent Javascript program with asynchronous callbacks.
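As a sanity check, here is what the process plausibly produces for the lineCount example from the beginning of the post. I derived it by hand-applying the patterns above, so take it as an illustration rather than actual compiler output:

function lineCount(__cb, fname) {
  if (!__cb) return __future.call(this, lineCount, arguments);
  var __1, __2;
  readFile(function(__err, __result) {
    if (__err) return __cb(__err);
    __1 = __result;
    __2 = __1.split("\n").length;
    __cb(null, __2);
  }, fname, "utf8");
}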

From Callbacks to Y-R

If you read this far you probably deserve a medal. Unfortunately for you, we are only half-way through! What about the other direction? Can any Javascript program with asynchronous callbacks be converted into an equivalent Y-R Javascript program?

Fortunately the reverse proof will be much shorter. We just need to prove two things:

  • That a callback-based asynchronous function definition can be converted to a Y-R Javascript function definition.
  • That a callback-based asynchronous function call can be converted to a Y-R Javascript function call.

In this second proof, we will assume that the callback function has the default node.js format and we won’t worry about the callback’s position in the argument list. A complete proof would need to take these variations into account but these details don’t really bring anything fundamental to the table. So we will ignore them here.

Transforming a callback-based function definition

This part is completely trivial. The transformation is the following:

function f(a, b, callback) {
  ...
}

becomes:

function f@(a, b) {
  ...
}

Transforming a callback-based asynchronous call

This one is a little trickier. Here is a solution:

f(a, b, function(err, res) {
  callbackBody
})

becomes:

(function@() {
  var err, res;
  try {
    res = f@(a, b);
  }
  catch (ex) {
    err = ex;
  }
  callbackBody
})();

The try/catch part is not very interesting. We just need it to extract the error and assign it to an err variable.

The (function@() { ... })() wrapper is more interesting. What it does is wrap the function call into an anonymous asynchronous function which is then called synchronously (no @ operator before the parentheses). This call produces a future that we don’t even assign to a variable because we don’t really care about the termination of this anonymous call. What we care about is that the callback body will be called after f produces a result or an error, which is obviously the case!

Wrapping-up

This was a rather long and probably boring explanation and I haven’t even proved anything formally. But I hope that I have given enough elements to convince you that any Y-R Javascript program can be mechanically converted into an equivalent Javascript program with asynchronous callbacks, and vice versa.

You may wonder why this could be of any interest and why I wasted several hours to write this. The reason is that I think that Y-R Javascript is a much better language than Javascript with asynchronous callbacks when it comes to writing asynchronous code.

Y-R Javascript lets you write very lean and elegant code. You can express your logic with the native constructs of the language rather than with flow-control libraries; you can do proper exception handling with try/catch/finally; you can chain calls even if some of them are asynchronous, etc. In short, you feel like a Javascript programmer, not like a super-hero who spends his days taming callbacks.

Y-R Javascript has another quality: it makes it very easy to reason about race conditions. The yield and resume points are very easy to spot: they correspond to all the @ calls in your code. If you see a piece of code that does not contain any @ calls, you immediately know that this code never yields and cannot be subject to a race condition of any kind. I think that this is an important feature.

The first part of this dissertation proves that Y-R Javascript is not a leaky abstraction: Y-R programs can be faithfully translated to Javascript programs with asynchronous callbacks.

The second part proves that Y-R Javascript is not a crippled subset of Javascript with asynchronous callbacks. Anything that you can do with asynchronous callbacks can also be done with Y-R Javascript’s primitives (@ calls and futures).

At this point, I have probably exhausted you (and I got my dose too). I will write another post later to illustrate the differences between Y-R Javascript and asynchronous callbacks. The only remaining point that I want to cover quickly is the relationship of all this to my current pet project: streamline.js.

Y-R Javascript vs. streamline.js

Streamline.js is very similar to Y-R Javascript. The main difference is the syntax: streamline.js uses an _ parameter in place of the @ operator.

I chose this syntax for streamline.js because I wanted to be able to edit and reformat my code with standard Javascript tools. It also made implementation easier because I did not have to modify parsers: I could use narcissus as is and I could piggyback the transformation onto the CoffeeScript compiler. And, last but not least, it made it easier to call native node.js libraries because you can control where the callback goes in the argument list.

But I don’t have any special attachment to the _ parameter, nor to the @ operator. What I like in all this is the concept of a yield-resume marker that you can tag both onto function definitions and onto asynchronous function calls. I also like the idea of having asynchronous functions return a future when they are called without the marker. And I like the minimalist nature of this design: no extra libraries (promises, futures, tasks), no new keywords, just a small marker!

One last point: the CPS transform that I described here is similar to what I implemented in the streamline compiler. It is not completely identical though (and this article may have a few bugs that I introduced when adapting the patterns) but the general spirit is the same.


Asynchronous episode 3 – Adventures in event-land

From callbacks to events

This is my third post on asynchronous Javascript. In the first one, I explored callbacks and I explained how I ended up creating streamline.js, a tool that simplifies programming in callback-land. In the second one, I described how I introduced futures in streamline.js to enable and control parallel execution of asynchronous I/O operations. In this third one, I am going to talk about the other side of node’s asynchronous APIs: events.

But before that, I’ll quickly explain the process that drove me through these investigations: I did not build streamline.js just for the fun of it; I built it because I saw a big potential in node.js and I had a real project that I wanted to build on top of it. But I quickly realized that callbacks would be a showstopper for us. We are building business applications and we have to give our developers a simple and friendly programming environment that lets them concentrate on their problem domain. It does not make sense for us to try to turn our developers into callback ninjas. Also, readability is critical for us: business rules must be expressed in natural and understandable manner. Callbacks just add too much noise, even with libraries to ease the pain. So node.js would have been disqualified if we hadn’t found a way to drastically reduce the overhead introduced by callbacks.

Once I had the tool working, I started to convert several thousand lines that we had written in callback style before. This went very smoothly actually and I am very pleased with the result: the code is leaner, simpler, much easier to write and, more importantly, much, much easier to read, understand and maintain. Looks like node.js has now become a viable platform for us.

But the conversion left us with some modules that were written in event style rather than callback style, mostly middleware modules and specialized communication clients (HTTP and TCP). So I started to investigate these other modules. We had drastically improved the situation in callback-land. Maybe there was room for improvement in event-land too!

When events aren’t so rosy

When I first looked at node.js I thought that there was some beauty in all these observers and emitters. But when I started to investigate our event-style modules, which had been written by a brilliant Javascript programmer of our team, I started to cringe a bit. Of course, the code was correct and all the events that were emitted were getting handled properly by some observer that had been registered beforehand. But the logic was often hard to follow because the flow was jumping abruptly from one place to another instead of following natural call chains. With callbacks, I often had the feeling that the flow had been wired with GOSUB directives. Here, in event-land, it felt more like it had been wired with setjmp/longjmp calls. Node.js ninjas probably get a kick out of this, but is this really a viable approach for large industrial projects?

And then we ran into an interesting bug. We have an authentication module between our main module and the lower-level modules to which the requests are routed. We had chosen to let the lower-level modules register the data and end observers on requests so that they could stream posted data when necessary. It was all working fine, until we connected our authentication module to the database. Suddenly, our data and end event handlers were not getting called any more!

It did not take us long to figure out why: the authentication module had become asynchronous because of the database calls and consequently, the data and end events were being dequeued during authentication, before our lower level modules had got a chance to register their observers.
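Here is a minimal reconstruction of the bug (simplified; authenticate stands in for our real module and its database call):

var http = require("http");

function authenticate(req, callback) {
  // the database call makes this asynchronous: we yield to the event loop
  setTimeout(function() { callback(null, "someUser"); }, 10);
}

http.createServer(function(req, res) {
  authenticate(req, function(err, user) {
    // too late: 'data' and 'end' may already have been dequeued
    // while we were waiting for the database
    var body = "";
    req.on("data", function(chunk) { body += chunk; });
    req.on("end", function() { res.end("got " + body.length + " bytes"); });
  });
}).listen(8080);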

This raised a design issue: how should we fix this? One way would be to let the lower level modules invoke the authentication module after setting their handlers but then every low level module would be responsible for authentication and would have to hold its dispatch until it has received both the end event and the green light from the authentication module. This means a proliferation of non-obvious security sensitive code snippets. Not good! Another way to fix it would be to install the event handlers in the main module and let them buffer the data. Then we would probably need to introduce pause and resume calls to control the amount of buffering.

But to me, all this raised a red flag: this kind of event handling logic seems fundamentally fragile and we would need a great deal of discipline to write code that is robust and easy to maintain.

Funny enough, while I was writing this post, someone asked for help on the node.js forum about a request object losing events. We are obviously not the only ones to have run into this problem.

Inverting the flow

So I started to question the beauty and relevance of all this. Is this event style somehow imposed upon us by the asynchronous nature of node.js? Or is it just an arbitrary API choice that shadows other possible designs?

What about trying to invert the flow around node’s stream objects? Instead of letting the stream push the data to its consumer through events, couldn’t we set up the API so that the consumer pulls the data from the stream by calling an asynchronous read method?

I gave it a shot and found out that this was actually fairly easy to do. The pattern is similar to the one I used for futures (and actually, I discovered the pattern when experimenting with streams, and applied it to futures afterwards). I packaged the solution as a streams module containing small wrappers around node streams, that I posted here.

To keep things simple, I chose to have the basic read method return the data one chunk at a time, and return null once all the chunks have been read and the end event has been received.

With this low level method, I could easily write a readAll method that reads the stream to the end. Here is the streamlined source of this method:

this.readAll = function(_) {
  var chunk, chunks = [];
  while (chunk = this.read(_))
    chunks.push(chunk);
  return concat(chunks);
}

where concat is an auxiliary function that deals with the fact that chunks could be buffers or strings.
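For the curious, concat could be as simple as the following sketch (my assumption: a given stream yields chunks of a single kind):

function concat(chunks) {
  if (chunks.length === 0) return null;
  if (typeof chunks[0] === "string") return chunks.join("");
  // Buffer chunks: allocate once, then copy
  var total = 0, i;
  for (i = 0; i < chunks.length; i++) total += chunks[i].length;
  var result = new Buffer(total), offset = 0;
  for (i = 0; i < chunks.length; i++) {
    chunks[i].copy(result, offset);
    offset += chunks[i].length;
  }
  return result;
}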

I also added lowMark and highMark options to the stream constructor so that you can control how much data will be buffered without having to fool around with pause and resume calls.

Pull-style streams in action

This streams module is still under development. I’ve only published wrappers for HTTP client and server requests at this time and I haven’t written extensive unit tests yet. But I published a small example that calls Google’s search web service. Here is a condensed version of this little Google client:

function google(str, _) {
  var json = streams.httpRequest({
    url: 'http://ajax.googleapis.com/ajax/services/search/web?v=1.0&q=' + str,
    proxy: process.env.http_proxy
  }).end().response(_).checkStatus(200).readAll(_);

  return JSON.parse(json).responseData.results.map(function(entry) {
    return entry.url + '\n\t' + entry.titleNoFormatting;
  }).join('\n');
}

This example demonstrates some of the benefits that we get by inverting the flow:

  • The code is smaller
  • Calls can be chained in a natural way (and even more so with streamline.js which prevents the callbacks from disrupting the chains).
  • The code is more robust: there is no risk of losing events because listeners were not set up early enough; exceptions are naturally propagated along call chains; pause/resume calls are hidden and buffering is controlled declaratively, etc.
  • The code should look familiar to people who have been programming with traditional I/O APIs (read functions).

Wrapping up

I don’t know how the idea of inverting the flow will be received (it may not actually be completely new either) but I anticipate that some node.js aficionados will treat it as heresy. Aren’t callbacks and events the very essence of node? Who dares promote alternate programming styles, first without callbacks (actually not quite), then without events?

To me, the essence of node.js is the combination of an incredible Javascript engine and a simple, fundamentally asynchronous runtime, which treats HTTP as a first class protocol. APIs are just enablers for innovations on top of this foundation.

What I’m looking for are tools and APIs that will allow us to deliver great products, and I feel that they are starting to shape up. The streams module is just one more piece of the puzzle. It blends very well with streamline.js but the transformed source can also be used as a normal standalone node.js module with a callback oriented API.


Currying the callback, or the essence of futures…

Wow, this one is not for dummies!!!

I decided to play it pedantic this time because I posted a layman explanation of this thing on the node.js forum and nobody reacted. Maybe it will get more attention if I use powerful words like currying in the title. We’ll see…

Currying

I’ll start with a quick explanation of what currying means so that Javascript programmers who don’t know it can catch up. After all, it is a bit of a scary word for something rather simple, and many Javascript programmers have probably already eaten the curry some day without knowing what it was.

The idea behind currying is to take a function like

function multiply(x, y) { return x * y; }

and derive the following function from it:

function curriedMultiply(x) {
  return function(y) { return x * y; }
}

This function does something simple: it returns specialized multiplier functions. For example, curriedMultiply(3) is nothing else than a function which multiplies by 3:

function(y) {
  return 3 * y;
}

Attention: curriedMultiply does not multiply because it does not return numbers. Instead, it returns functions that multiply.

It is also interesting to note that multiply(x, y) is equivalent to curriedMultiply(x)(y).
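To see the equivalence in action (using the functions defined above):

var triple = curriedMultiply(3);
triple(5);                                // 15
curriedMultiply(3)(5) === multiply(3, 5); // true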

Currying the callback

Now, what happens if we apply this currying principle to node APIs, to single out the callback parameter?

For example, by applying it to node’s fs.readFile(path, encoding, callback) function, we obtain a function like the following:

fs.curriedReadFile(path, encoding)

The same way our curriedMultiply gave us specialized multiplier functions, curriedReadFile gives us specialized reader functions. For example, if we write:

var reader = fs.curriedReadFile("hello.txt", "utf8");

we get a specialized reader function that only knows how to read hello.txt. This function is an asynchronous function with a single callback parameter. You would call it as follows to obtain the contents of the file:

reader(function(err, data) {
  // do something with data
});

Of course, we have the same equivalence as we did before with multiply: fs.readFile(path, encoding, callback) and fs.curriedReadFile(path, encoding)(callback) are equivalent.

This may sound silly, and you may actually think that this whole currying business is just pointless intellectual masturbation. But it is not! The interesting part is that if we are smart, we can implement curriedReadFile so that it starts the asynchronous read operation. And we are not forced to use the reader right away. We can keep it around, pass it to other functions and have our program do other things while the I/O operation progresses. When we need the result, we will call the reader with a callback.

By currying, we have separated the initiation of the asynchronous operation from the retrieval of the result. This is very powerful because now we can initiate several operations in a close sequence, let them do their I/O in parallel, and retrieve their results afterwards. Here is an example:

var reader1 = curriedReadFile(path1, "utf8");
var reader2 = curriedReadFile(path2, "utf8");
// I/O is parallelized and we can do other things while it runs

// further down the line:
reader1(function(err, data1) {
  reader2(function(err, data2) {
    // do something with data1 and data2
  });
});

Futures

The future is a powerful programming abstraction that does more or less what I just described: it encapsulates an asynchronous operation and allows you to obtain the result later. Futures usually come with some API around them and a bit of runtime to support them.

My claim here is that we can probably capture the essence of futures with the simple currying principle that I just described. The reader1 and reader2 of the previous example are just futures, in their simplest form.

Implementation

This looks good but how hard is it to implement?

Fortunately, all it takes is a few lines of Javascript. Here is our curried readFile function:

function curriedReadFile(path, encoding) {
  var done, err, result;
  var cb = function(e, r) { done = true; err = e, result = r; };
  fs.readFile(path, encoding, function(e, r) { cb(e, r); });
  return function(_) { if (done) _(err, result); else cb = _; };
}

I won’t go into a detailed explanation of how it works, but the key idea is a race between completion and retrieval: if the read completes first, done is set and the result is stashed until the returned function is called; if the returned function is called first, its callback replaces cb and gets invoked when the read completes.

Going one step further

Now, we can go one step further and create a generic utility that will help us currify any asynchronous function. Here is a simplified version of this utility (the complete source is on streamline’s GitHub site):

function future(fn, args, i) {
  var done, err, result;
  var cb = function(e, r) { done = true; err = e, result = r; };
  args = Array.prototype.slice.call(args);
  args[i] = function(e, r) { cb(e, r); };
  fn.apply(this, args);
  return function(_) { if (done) _(err, result); else cb = _; };
}

With this utility we can rewrite curriedReadFile as:

function curriedReadFile(path, encoding) {
  return future(fs.readFile, arguments, 2);
}

And then, we could even get one step further, and tweak the code of the original fs.readFile in the following way:

var original = fs.readFile;
fs.readFile = function(path, encoding, callback) {
  // note: assumes always called with encoding arg
  if (!callback) return future(original, arguments, 2);
  // delegate implementation to original function
  original.apply(this, arguments);
}

With this tweak we obtain a handy API that can be used in two ways:

  • directly, as a normal asynchronous call:
    fs.readFile("hello.txt", "utf8", function(err, data) { ... }
  • indirectly, as a synchronous call that returns a future:
    // somewhere:
    var reader = fs.readFile("hello.txt", "utf8");
    
    // elsewhere:
    reader(function(err, data) { ... });

Blending it with streamline.js

I have integrated this into streamline.js. The transformation engine adds the little if (!callback) return future(...) test in every function it generates. Then, every streamlined function can be used either directly, as an asynchronous call, or indirectly, to obtain a future.

Moreover, obtaining a value from a future does not require any hairy callback code any more because the streamline engine generates the callbacks for you. Everything falls into place very nicely, as the following example demonstrates:

function countLines(path, _) {
  return fs.readFile(path, "utf8", _).split('\n').length;
}

function compareLineCounts(path1, path2, _) {
  // parallelize the two countLines operations
  // with two futures.
  var n1 = countLines(path1);
  var n2 = countLines(path2);
  // get the results and combine them
  return n1(_) - n2(_);
}

Wrapping it up

I was looking for an elegant way to implement futures in streamline.js and I’m rather happy with this design. I don’t know if it will get wider adoption but I think that it would be nice if node’s API could behave this way and return futures when called without callback. Performance should not be an issue because all that is required is a simple test upon function entry.

I also find it cute that a future would just be the curried version of an asynchronous function.


Asynchronous Javascript – the tale of Harry

Catching Up

Before telling you about Harry, I’ll start with a short account of what I went through recently so that those who know me from long ago can understand how I ended up being involved in asynchronous Javascript development, and what that means.

After many years of programming with mainstream OO languages (Java and C#), I eventually decided to give up in the summer of 2009 and I switched to Javascript. The trigger was jQuery. I used it to experiment with HTML 5 and it helped me realize that the world had changed and that many of my previous beliefs were just wrong. Functional programming, which I had put aside 20 years ago to focus on mainstream OO, was coming back, with new ideas and a huge potential. I also realized that strong typing, which I had been worshiping for years, was somewhat of a neurotic thing. It makes you feel safe but you pay a high price for it (you write a lot of code just to please the compiler) and it actually misses the point (you should be relying on unit tests that check the semantics rather than on compilers that only check formal constraints). It also introduces tight coupling between code modules, and makes the code more rigid than necessary. jQuery really made me rediscover the pleasure of functional programming and convinced me that Javascript was not a toy language but most likely a very important language for the future.

Then, I thought that if I was going to invest a lot on Javascript and jQuery for the client side, I might as well try to use it also on the server side. This would make it possible to reuse code between client and server. It would also make the development process simpler: one language to learn, common methodologies and tools, etc. This is how I ended up in the SSJS (server side Javascript) world.

So, about 18 months ago, we (I had taken the lead of a new team for a new project in the meantime) started working with Helma NG (now RingoJS). We quickly switched to Narwhal which seemed to have more traction at the time. And we were keeping an eye on a blip that was getting bigger on our radar screens: node.js. It looked amazing but I wondered if it would be wise to drag a Sage project into it. Selling SSJS as a viable platform for future applications had already been a bold move but with node.js we were crossing the line between being leading-edge and bleeding-edge!

But as we were moving forward with Narwhal, it became clearer every day that node.js was really where the future of SSJS was shaping up. There was a vibrant community around it. And an incredibly fast Javascript engine! So we ended up making the switch in the spring of last year.

Asynchronous Javascript in node.js

Node.js is really cool. Small, simple, very fast, etc. But it takes a radical approach on concurrency: no threads, asynchronous I/O instead. This has a profound impact on the way code is written.

Node.js provides two API styles to deal with asynchronous programming:

  • An event style
  • A callback style

The event style allows you to emit events from various points in your code, and set up listeners that catch these events and act upon them somewhere else. The flow of control is highly non-local and a bit similar to what we classically use for exception handling. It works really well for I/O handling (HTTP client and server for example) but I have a hard time imagining business application developers writing their mundane business logic with this kind of non-local flow of control.
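For readers new to node, here is a bare-bones example of the event style, using the standard EventEmitter API:

var EventEmitter = require("events").EventEmitter;

var emitter = new EventEmitter();
// the listener may be registered far away from the emitting code
emitter.on("done", function(result) {
  console.log("finished with", result);
});
// ... and somewhere else, possibly much later:
emitter.emit("done", 42);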

So we would likely end up writing most of our code in the callback style, which is what node.js proposes for local flows that call asynchronous functions. The emblematic pattern for an asynchronous call is the following:

asyncFunc(args, function(err, result) {
  if (err)
    // error: propagate it or handle it
  else
    // do something with result
});

Every asynchronous function takes an extra argument which is a callback. When the asynchronous operation completes, the callback is executed. If the operation fails, an error object (err) is passed as first argument to the callback. If the operation succeeds, the first argument is set to null and an optional result may be passed through the second argument. A simple and straightforward design (like most things in node.js)! The new thing is that node.js is highly asynchronous, so this pattern is not anecdotal as it may have been before. In node.js, it is omnipresent.

Harry’s first steps

So now comes the time to tell you my little story about Harry. Harry is an experienced programmer who is taking his first steps in node.js. He has done some async programming before but never with a system that is as pervasively asynchronous as node.

To get familiar with node’s APIs, Harry decides to implement a function that traverses directories to compute disk usage. In his previous life he would have written it as:

function du(path) {
  var total = 0;
  var stat = fs.stat(path);
  if (stat.isFile()) {
    total += fs.readFile(path).length;
  }
  else if (stat.isDirectory()) {
    var files = fs.readdir(path);
    for (var i = 0; i < files.length; i++) {
      total += du(path + "/" + files[i]);
    }
    console.log(path + ": " + total);
  }
  else {
    console.log(path + ": odd file");
  }
  return total;
}

In node.js, the fs.stat, fs.readFile and fs.readdir calls are asynchronous. So, the du function itself must be asynchronous too. Its signature becomes:

function du(path, callback)

where callback is a node.js callback function that du will use to return its result. The signature of the callback is the following:

callback(err, result)

So Harry tries to adapt his du implementation for node.js. He quickly reaches the following point:

function du(path, callback) {
  var total = 0;
  fs.stat(path, function(err, stat) {
    if (err) { callback(err); return; }
    if (stat.isFile()) {
      fs.readFile(path, function(err, data) {
        if (err) { callback(err); return; }
        total += data.length;
        // (a) what do I do here?
      });
      // (b) and here?
    }
    else if (stat.isDirectory()) {
      fs.readdir(path, function(err, files) {
        if (err) { callback(err); return; }
        // (c) is this right?
        for (var i = 0; i < files.length; i++) {
          du(path + "/" + files[i], function(err, len) {
            if (err) { callback(err); return; }
            total += len;
            // (d) what do I do here?
          });
          // (e) and here?
        }
        // (f) this does not sound right either!
        console.log(path + ": " + total);
      });
      // (g) what do I do here?
    }
    else {
      console.log(path + ": odd file");
      // (h) and here?
    }
  });
  // (i) sounds right, but not in the right place.
  callback(null, total);
}

He has started to introduce the callbacks but he is hitting some difficulties and a lot of questions arise. After a bit of thinking, Harry figures out the answer to many of them:

At spots (b), (e), (g) and (i) I should not have any code because these statements follow an async call.

(i) is misplaced. I need to make 3 copies of it. The first two will go to spots (a) and (h). The third copy will go somewhere in the for branch where I am a bit lost at this point. I note that I have been lucky in this example because there is a single statement after the if/else if/else branching sequence. If there had been more code, I would have had to copy a whole block 3 times. So I’ll probably need to package the trailing statements into a function the next time I hit this kind of branching code.

So Harry makes the changes and he is now left with (c), (d) and (f), that is the branch with the for loop. Let’s look at it again:

      fs.readdir(path, function(err, files) {
        if (err) { callback(err); return; }
        // (c) is this right?
        for (var i = 0; i < files.length; i++) {
          du(path + "/" + files[i], function(err, len) {
            if (err) { callback(err); return; }
            total += len;
            // (d) what do I do here?
          });
        }
        // (f) this does not sound right either!
        console.log(path + ": " + total);
      });

After a bit more investigation, Harry comes up with the following conclusions:

(c) is clearly wrong. du is an async function. So if I leave the loop like this, all the du calls will execute in parallel and I have no way of collecting all the results and continuing!

(d): this seems to be where I need to continue looping.

(f) is clearly misplaced as it will execute before any of the loop iterations get a chance to run. Maybe I’ll know what to do with it once I have fixed (c) and (d).

Fortunately, Harry is a smart guy. His observation at (d) leads him to conclude that he should restructure the loop and handle it recursively rather than iteratively. After a bit of time, he comes up with the following solution:

      fs.readdir(path, function(err, files) {
        if (err) { callback(err); return; }
        function loop(i) {
          if (i < files.length) {
            du(path + "/" + files[i], function(err, len) {
              if (err) { callback(err); return; }
              total += len;
              loop(i + 1);
            });
          }
          else {
            // loop is done. Execute last statement and then callback to return value
            console.log(path + ": " + total);
            callback(null, total);
          }
        }
        // start the loop
        loop(0);
      });

Bingo! Harry is happy. He can write simple algorithms like this with asynchronous functions. It just seems to require a few more brain cycles than before, but this is feasible.

Patterns

A few weeks have gone by. Harry has written a few modules in node.js and he starts to feel more comfortable with callbacks. He actually came up with some patterns that help him write robust code without too many headaches.

The first pattern is actually a helper function for loops:

function asyncForEach(array, iterator, then) {
  function loop(i) {
    if (i < array.length) {
      iterator(array[i], function() {
        loop(i + 1);
      });
    }
    else {
      then();
    }
  }
  loop(0);
}

With this helper function, he can now write his loops as:

asyncForEach(array, function(item, next) {
  // body of my loop
  somethingAsync(function(err, result) {
    if (err) {callback(err); return; } // I'm starting to get tired of writing this one!
    // do something with item and result
    next(); // don't forget me at the end of every code path
  });
}, function() {
  // this is where execution resumes after the loop
});

He also came up with a small funny construct that helps him deal with branching code (he calls it the branch neutralizer):

(function(next) {
  // if or switch statement with branches that may mix sync and async calls.
  // All code paths must end up calling next() or callback(null, result)
})(function() {
  // this is where execution resumes after the branches
});

Also, he is now mentally wired to automatically replace return statements by statements like { callback(err, result); return; } that he often simplifies as return callback(err, result); as this is more compact and nobody actually cares about the values returned by asynchronous functions.

Harry feels more relaxed. He now has patterns and a methodology that lets him deal with node.js APIs without too many headaches. He has also published these patterns on the team’s wiki to foster consistent coding practices inside his team.

Looking back

Then summer arrives and Harry takes a well deserved vacation break on the French Riviera. When he comes back, he has not lost his newly acquired programming skills but sometimes he wonders:

In my previous life, I could write algorithms in a very natural form. The code that I was writing was directly related to the problem I was trying to solve. There was no noise between the statements. Also, I could easily chain calls. For example I could write something as simple as:

total += fs.readFile(path).length;

Now, despite all the patterns that I have found, I still have to write something like:

fs.readFile(path, function(err, data) {
  total += data.length;
  // continue here ..
});

Isn’t there something wrong here? I can write the same kind of algorithms as before in this amazing node.js environment but the code that I write contains a lot more noise than before, natural chains are broken, etc.

Actually, I often feel like I am working with a crippled language. As soon as I have asynchronous calls in scope, I cannot use while and for loops any more. Of course, I cannot use break and continue either. I also need to neutralize my if and switch statements. And I have completely given up on try/catch/finally for now. I’ll see later if I can find a pattern for it. The only keyword that seems to have survived (and actually prospered) is function.

Sounds like a regression. What am I getting in return?

Sure, I know. I may get a raise. I was a dull classical programmer and I am now part of an elite of bleeding edge programmers who know how to do asynchronous programming in this trendy node.js environment. My management needs to know that!

Also, I can write programs that run fast and consume very little system resources because they run in this amazing new server.

But instead of spending my time crafting clever algorithms with simple control flow statements, I now find myself spending a significant part of my time applying the same silly patterns over and over. The code that I write is harder to read and the beauty of my algorithms is buried under lots of callback noise. Also, I need a lot more concentration because as soon as I miss one of these next() calls, I break a callback chain and my code just goes nowhere without returning a result, and it can take time to find out where I have forgotten to call next().

Also I had a bit of hope that async programming would allow me to write more parallel code and that for example, I could take advantage of the (b), (e), (g) and (i) spots in my code to do something clever after firing the async calls but I find it hard to exploit those spots. I haven’t yet found how I could take advantage of them. Maybe I’m not looking in the right place. I feel frustrated.

Haven’t I become some kind of slave programmer? Wouldn’t it make sense to have a machine take care of all the tedious callback patterns for me, so that I could go back to writing beautiful algorithms?

Harry also remembered something from his CS classes and he added the following:

Node.js is actually a new run-time environment that I am targeting when I am writing code. When I write a function like du, what the function does is no different than what my old du function did in my old environment. Also, the library functions that I am calling (fs.stat, fs.readdir and fs.readFile) are similar to the ones I had before (what they do, not how they do it). And I am using a high level programming language which is supposed to screen me from differences between execution engines. So why should I write different code? It looks like something is interfering when it should not and that I’m being asked to deal with a problem that the language tools should handle. Am I still programming in a high level language?

streamline.js

I’ll leave Harry to his thoughts for a moment.

As you can guess, I went through this too. And what I ended up finding is that Harry is right. He is writing code that a machine could write. So why not let the machine write it for him?

I was actually hoping that a solution would emerge from the community some day. The first signs of hope came from the promise pattern. This is an attempt to solve the problem with special libraries rather than by transforming code. The promise pattern actually improves the situation when the code contains a sequence of async calls that don’t return values. But as soon as the flow hits an asynchronous call that returns a value, a callback pops up. So it only solves part of the problem. I tried to understand where the limit was and a simple topological consideration convinced me that even the smartest of us would not be able to fully solve Harry’s problem and eliminate all the callback noise with any library-based approach. We had to step out and transform the code.
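
To illustrate the limit (a sketch of mine, assuming a hypothetical promise-returning readFile rather than node’s callback-based one): as soon as the call returns a value, the callback reappears:

// readFile is assumed here to return a promise (hypothetical API).
readFile(path).then(function(data) {
  total += data.length;
  // continue here...
});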

The second sign of hope came from Neil Mix’s narrative.js. This is a compiler tool that relies on a small extension to the language (a special yielding operator) to convert synchronous-looking code into its callback-based equivalent. Unfortunately it suffers from two flaws:

  • It introduces new syntax in the language: this makes the source incompatible with all sorts of tools that understand the Javascript syntax
  • It is not modular: The functional beauty and modularity of the original code are destroyed by the narrative.js compiler.

So the project did not gain traction and was abandoned.

More recently, I stumbled upon Oni Labs’ stratified.js. I was excited at first but I quickly realized that this project suffered from the same flaws as narrative.js: it extends the language and the compiler does not preserve the modular quality of the original code.

At this point it became very clear to me that the right way to help Harry was to create a tool that would simply automate what Harry does by himself when he writes code. The tool would take the statements one by one and apply the patterns that Harry has found. This would preserve the modular beauty of Javascript. And if I could do this through a naming convention rather than through new syntax, it would be less disruptive for Harry because he would be able to keep his favorite editing tools (in the long term it makes sense to have a language construct for this, but in the short term a naming convention will work much better).

So I decided to give it a try. The last few weekends were somewhat sacrificed and I had some sleepless nights too, but the tool started to shape up. It is not completely finished today but it now works well enough that I decided to publish it. I called it streamline.js and made it available on Sage’s GitHub site, here.

So I have good news for you Harry. You now have the best of both worlds: the old one, because you can get back to writing real code with simple control flow and easy-to-read algorithms, and the new one, because your code will run in this amazing and exciting node.js server. I’ve put your streamlined du function in diskUsage.js.

And by the way, I have a little extra goodie for you: you can now parallelize your code in a controlled way with two little functions that make parallel programming sound like gardening (or janitoring, your choice): spray and funnel. Just take a look at the diskUsage2.js example!

Happy programming!


Objects and Values – A conceptual model

At last!

I created a blog a long time ago but never took the time to publish anything on it, and I was getting frustrated because I have been ruminating on some ideas for years and wanted to share them with others. So, the time has come!

My first post will be on a subject that has been haunting me for a very long time: “objects and values”. I wrote a short note about it in 1993 and then put the subject to sleep. I went back to it in 2002 and started to write it in book form but I gave up and let it rest. In 2005, I rewrote it as an article but never completed it and again, I let it rest.

I still did not have time to complete this article and it will take me some time to do so because this is a vast subject with lots of ramifications. But I’ve decided to post the first part that I wrote 5 years ago, hoping it will give me the energy to write the rest. As you will see, the examples are outdated but I already had to change them twice and I don’t feel like editing them one more time.

The problem

Programmers use the words “value” and “object” all the time, and they don’t really take the time to define these words. And if they were to define them precisely, I am not sure that they would all agree on what an “object” is and what a “value” is.

Everybody will probably agree on the fact that a window and a button are objects rather than values. But what about an integer, a string? Are they objects too, or are they simple values that don’t deserve the same status as objects? This is the main question that I will try to answer.

The Object Oriented community has a natural tendency to consider that “everything is an object”. For a SmallTalk programmer, a window is an object, a block of code is an object, a string is an object, an integer is an object, etc.

But a FORTRAN programmer is likely to view things differently: he cares primarily about numerical “values” and not as much about “objects”. So, he may have a hard time accepting that an integer or a real number is an “object”, or he may not see any benefit in treating numbers as objects. He is more likely to see them as being values.

So maybe there is no firm answer to a question like “Is an integer an object or a value?”. Maybe the SmallTalk programmer is right in thinking that an integer is an object because this fits his general framework, and maybe the FORTRAN programmer is right too in thinking that an integer is a value. Or maybe an integer is both an object and a value, and they are just seeing two sides of the same thing.

Or maybe there is something deeper, and there are some fundamental reasons to consider that an integer is a value rather than an object, or to take the opposite stand. And maybe there is a simple way to define what an object is and what a value is. This is the conclusion that I have reached and that I will try to explain here.

Semantics and Implementation

Programming issues can usually be attacked from two angles: the semantic angle, in which one focuses on the concepts and their meaning independently from implementations, and the implementation angle, in which one focuses on the implementation and the meaning is somewhat inferred from the implementation.

Here, my line of thought will be driven by the semantics. The distinction that I will be making between objects and values will be a semantic one, and I will deliberately stay away from implementation issues and even from “programming” most of the time.

So, you may be surprised because I will be talking about “Venus” and “Jacques Chirac” at least as much as about pointers and bits. This is because my goal is to give a definition of objects and values that is universal and not related to the specifics of any programming paradigm. Actually, as you will see, this definition has nothing to do with programs and computers.

My intent is to complete this with other posts in which I will deal more directly with the implications of this on programming (and in other areas). So, what follows is just a starting point.

Frege

When we try to come up with definitions for such basic concepts as “object” and “value”, we find ourselves tackling some very abstract issues, almost philosophical in nature, and we are not always equipped for this.

So, we need some “conceptual tools” and a bit of terminology to investigate this fundamental level. The best tools that I found come from Gottlob Frege, a German mathematician, logician and philosopher who worked at the turn of the 20th century.

Frege investigated fundamental aspects of modern logic and mathematics and tackled important questions like “What is a number? What is a function? What is a concept?”. As I was trying to answer questions of a similar nature (What is an object? What is a value?), I found a great source of inspiration in Frege’s writings.

Frege often uses a “linguistic” angle to attack these difficult questions. His rather abstract discussions are often illustrated by simple sentences like “the earth has two poles” or “Venus is the morning star”. I found this approach very interesting because it steps out of the domain and gives more power to the argument. Frege’s question is “What is a number?”, not “What is a number for a mathematician?”

Similarly, the questions that I am trying to answer are “What is an object? What is a value?”, not “What is an object/value for a programmer?”. This is because I believe that we should try to solve the general question first, and then “apply” our findings to the limited domain of programming rather than do the opposite.

This is part of a general belief that progress in computer science and especially in the design of computer languages will come from a better understanding of our own conceptual structures. We need languages that match our mental organization, rather than artificial languages that are at odds with our mental structures and that force us into a difficult and imperfect translation.

So, I find it very useful to “step out” of the domain first, and try to answer the question at a general level. I think that this will also help the reader because he will not be biased by his own programming background.

Sense and Denotation

Amongst the “conceptual tools” that Frege introduced in his mathematical and logical investigations, there is one that I found really enlightening and that really helped me formalize my thoughts about objects and values: the distinction between “sense and denotation”. So, before going into the heart of the subject, I will quickly introduce this part of his theory:

Frege considers that expressions (names, propositions) have two sides, a sense and a denotation:

  • The “sense” accounts for the “meaning” or “cognitive significance” of the expression itself.
  • The “denotation” is the actual object or value “denoted” by the expression.

This is rather abstract but it gets much clearer with a few examples. Let us start with some simple arithmetic expressions:

  • SQRT(4)
  • 3 – 1
  • 2

These expressions have different “senses”. The first one could be defined as “the positive number whose square is 4”, the second one as “the number that we obtain when we subtract 1 from 3”, and the third one simply as “number 2”. So, if we were to define these expressions (at this level, defining is a bit like paraphrasing), we would give them different definitions. These definitions correspond to the “senses” of the expressions, and it is obvious that these three expressions have different “senses”.

On the other hand, these three expressions “denote” the same number: 2. They have the same “denotation”.

Frege’s approach is not restricted to arithmetic. If we take expressions like:

  • The morning star
  • The second planet of the solar system
  • Venus

These expressions have different senses because they resonate differently in our cognitive system and they are not always interchangeable. For example, if we take the following sentence:

John did not know that Venus was the second planet of the solar system because he did not study astronomy well enough.

And if we replace “the second planet of the solar system” with “Venus”, we get:

John did not know that Venus was Venus because he did not study astronomy well enough.

We obtain nonsense. This simple substitution exercise shows that these expressions are not always interchangeable; they have a slightly different “sense”.

On the other hand, these three expressions have the same “denotation”. All three denote the planet Venus, a real object somewhere in the solar system.

Note: of course, “Venus” could also “denote” a goddess or a statue, but this is not relevant here because we will assume that the expressions are interpreted within a “context”. In the example above, the context is “astronomy” and, in this context, “Venus” has an unambiguous denotation.

Now, what happens if we apply Frege’s distinction to whole sentences, for example to assertions about arithmetic or about the outside world? Take the following:

  • SQRT(4) = 3 – 1
  • The second planet of the solar system is Venus

These two expressions obviously have very different “senses”: the first one expresses an arithmetic truth while the second one states something about the solar system. But what are their “denotations”?

The answer is quite simple: these sentences have the same denotation and this denotation is simply the “true” value. In Frege’s theory, assertions (about numbers, about objects) are expressions that have Boolean values as denotation. They may have very different “senses” but their denotations are either “true” or “false”.

One of Frege’s important findings was that an expression cannot be “reduced” to its denotation. The sense is very important, as demonstrated by the substitution exercise and also by the fact that unrelated assertions (one about numbers, one about planets) may have the same denotation.

I will not dwell much longer on Frege, because I will only need the distinction between the “expression” and its “denotation” in what follows. If you want to know more, you will find a good bio with a complete list of references on http://plato.stanford.edu/entries/frege.

Variables and Constants

Before exploring objects and values, I would like to make a short detour through variables and constants. We have two more “what is” questions to answer: What is a variable expression? What is a constant expression?

Let us consider the following expressions:

  • The French president
  • Jacques Chirac

These two expressions have the same denotation: a man called Jacques Chirac. But there is a fundamental difference between them: the denotation of the first one changes every 5 years, when a new president is elected, while the denotation of the second one will never change (Jacques Chirac will always be the same person – of course, we have to consider context to rule out homonyms).

I will say that an expression is “variable” when it denotes different things at different times, and that it is “constant” when it always denotes the same thing. This is rather obvious and does not need much further explanation. I will just give a few additional examples.

First, some variable expressions:

  • The French president
  • The temperature in my room
  • Today
  • The number of hairs on my head

And some constant ones:

  • Jacques Chirac
  • 21°C
  • 5 days after January 25th, 2005
  • 3 * 4 * 10000

Proper nouns like Venus or Jacques Chirac are usually constant expressions; they always denote the same object or person. Literals like 2, 5 or 38 are also constants; they are more or less the proper nouns that we give to numbers.

Objects and Values

Now, we can answer our main questions: What is an object? What is a value?

I propose the following definitions:

  • An object is a mutable denotation.
  • A value is an immutable denotation.

Let us analyze some of the examples that we used earlier in the light of these definitions.

First, we notice that these definitions apply to “denotations” rather than to the expressions themselves. So, when we apply these definitions to Venus for example, it does not matter whether we call it “Venus” or “the second planet of the solar system” because we are interested in the denotation, i.e. the planet itself, the real thing that circles around the sun. If we decide that Venus is an object, then the second planet of the solar system will also be an object (and will be the same object).

And actually, Venus is an object, because it is mutable: its position in space changes continuously, its temperature changes as it moves around its elliptical orbit, and, if we consider long time scales, Venus was very different in the early days of the solar system from what it is today. So, there is no doubt that Venus is “mutable”. Venus is an object.

Similarly, with the definition given above, there is no doubt that Jacques Chirac is an object rather than a value. He gets older every minute, he changes position, he loses hair, etc.

Now, what about numbers? Are they mutable or immutable? If a number like 2 were mutable and could mutate completely and become identical to 3, then, as 2 is also the denotation of SQRT(4), SQRT(4) would mutate as well and would start to denote an odd number. The consequences of such a mutation would be disastrous and all arithmetic rules would fall apart!

Now, what if 2 were to mutate partially rather than completely, and only some of its properties changed over time? For example, 2 could become odd but keep its other properties, or it could stop being prime, or it could become a perfect square. Our imagination can go rather far here, but if we start to analyze the consequences of these mutations, it quickly becomes obvious that the coherence of our arithmetic system would break as soon as we allowed one of these mutations.

So, we can conclude that numbers are immutable. And, with the definitions that I gave above, numbers are values rather than objects.

At this point, it is important to stress the difference between mutable and variable, and the distinction that we made between an expression and its denotation. Let us consider the following expression:

The number of employees in the company.

And let us assume that the company had 26 employees yesterday and just hired a new one. Then, the number of employees in the company went from 26 to 27 and it seems like we have found a counterexample: a number that mutates.

This is because we confused the expression with its denotations. Yesterday, the denotation of the expression was 26 and today, its denotation is 27. But 26 did not mutate into 27. Instead, the expression ceased to denote 26 and started to denote 27, a different number. So, the phenomenon that we observe here is a variable expression which denotes different numbers at different times. The numbers themselves are immutable.

This investigation into the true nature of numbers is not really new. Frege had already analyzed this and he had reached the same conclusion, i.e., that numbers are immutable. He did not relate this to categories like values and objects because his focus was on the nature of functions and variables, but his analysis is very clear:

The expression “the number that expresses the length of this stick in millimeters”, without any indication of time, does not designate any number. If we add a temporal indication, we will designate a number, 1000 for example; but it will then be invariable. Another temporal indication yields another expression, which eventually designates another number, 1001 for example. We can say: “half an hour ago, the number that expresses the length of this stick in millimeters was a perfect cube, the number that expresses the length of this stick in millimeters now is not a perfect cube”, but then the sentence does not have the same subject in its two parts. 1000 did not, in some sort, inflate to 1001, but has been replaced by 1001. Or else, could it be that 1000 is the same thing as 1001 under different clothing? When something varies, different properties and different states affect the same object successively. If it were not the same, there would be no subject of which we could say that it varies. A stick lengthens under the effect of heat; during heating, it remains the same. If, on the contrary, we had taken it away and replaced it with another, longer stick, we could not say that it lengthened. A man gets older, but if we could not recognize in him the same man, there would be nothing of which we could state the age. Let us apply this to numbers. What remains identical when a number varies? Nothing. Then, the number does not vary; there is nothing about which we can state a variation. A cubic number never becomes a prime number; an irrational number never becomes a rational number.
(my own translation here — my Frege books are in French)

This extract of Frege’s “What is a function?” article demonstrates very well the difference that we make between objects and values. A stick and a man are “objects”: they “mutate” because some of their properties change with time, but they are not “replaced” as a whole. The expressions that designate numbers may vary and may denote different numbers at different times, but the numbers themselves are immutable. Numbers are values.

Simple values

Numbers are not the only values that we know of. There are many other things that we can classify as values.

First, physical quantities that we can measure and express with units, for example 1.5 kg, 23°C, 23µm, etc., are values. This is rather obvious: if we accepted that these quantities could mutate, we would run into the same kind of logical nightmare as we did when we considered mutable numbers.

Dates and timestamps are values too. Here also, it is important to distinguish expressions from their denotations. Let us consider the following expression:

The deadline for my project

If my project runs late and I cannot reach the original deadline, let us say February 15th, 2005, I will negotiate with the marketing department, and maybe they will accept a new deadline, for example February 18th, 2005. So, the deadline of my project can change. But does this mean that the date “February 15th” can mutate and become “February 18th”? No, because if this were true, then everything that could be said about February 15th, for example that the weather was rainy on that day or that it was a Tuesday, would suddenly apply to February 18th. This is clearly wrong and we have to accept a different interpretation: “the deadline for my project” is a variable expression that takes different denotations. The individual denotations, i.e., the dates, are immutable.

Boolean values are values too: true and false are immutable. If they were not, and if true could mutate and become false, then everything that is true would become false and our whole logical system would fall apart. Here also, the distinction between expression and denotation is crucial to avoid confusion. For example, if we consider:

The weather is sunny today.

This expression is false today but will probably be true some other day. This does not mean that false will mute into true, it just means that the expression is variable and takes different denotations on different days.

Representations

Before going much further into our exploration of values and objects, I would like to introduce a new concept: representations. The definition that I will give at this point is rather tautological but it will be helpful:

A representation of an object or a value is an object that represents the object or the value.

This is somewhat circular because I am using the verb “represent” to define the noun “representation”, but despite this flaw, this definition is still useful. First, it says that a representation is an object. Then it says that a representation “represents” and is thus different from the thing that it represents.

All this is relatively obvious and can be illustrated easily:

  • The digit 2 (printed in two different typefaces in the original post) and the word “two” are three representations of the number 2.
  • Two portraits (images in the original post) are two representations of Gottlob Frege.

The representations are objects because they are mutable. For example, I can shrink them, colorize them, etc. And, if the mutations are too severe, for example if I tear the images apart, they may even cease to represent what they represent now.

Bits and Bytes

The distinction that we just made between the thing and its representations will be very useful when we start relating this general discussion to computers and programming. It will avoid some serious confusion.

For example, it is very important to make the difference between a value and its representation. The bit pattern 00000010 is only a representation of the number 2, and the fact that this representation can be altered and that a memory location that contained 00000010 can be modified to contain 00000011 does not say anything about 2 as a number, it just says that a given representation of number 2 has been modified, and that it ceases to represent 2 and starts to represent 3 instead. So, when a programmer says that he is “incrementing an integer”, we should not conclude that integers are mutable, because what he is really saying is that he is incrementing an integer variable, and that variable is an object made of bits somewhere in the memory of the computer. At some point in time this sequence of bits represents a certain integer, and at a later time it represents a different integer; but the integers themselves did not change.
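
A minimal JavaScript sketch (mine, not part of the original argument) makes the point obvious:

// The variable n is an object: a storage location whose contents
// change. The numbers 2 and 3 themselves do not change.
var n = 2;   // n now represents the number 2
n = n + 1;   // n now represents 3; 2 itself was not modified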

Along the same lines, a Java programmer may have some difficulties in accepting that dates are immutable because when he deals with dates, he manipulates them as objects that he can modify like any other object. But this does not say anything about the true nature of dates; it just says that the designers of Java chose to represent dates as mutable objects. And, as I will discuss later, it may also indicate that their design choice is poor and that it would have been preferable to represent dates as immutable values.
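
JavaScript happens to make the same design choice as Java here, so we can sketch both styles in it (the addDays helper is hypothetical, just for illustration):

// Mutable representation: the same Date object is modified in place.
var deadline = new Date(2005, 1, 15);  // February 15th, 2005
deadline.setDate(18);                  // same object, new state

// Value-style representation: we derive a new date and leave the
// original untouched.
function addDays(date, days) {
  return new Date(date.getTime() + days * 24 * 60 * 60 * 1000);
}
var original = new Date(2005, 1, 15);
var renegotiated = addDays(original, 3); // February 18th, 2005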

So, we need to be careful here. When we are investigating whether something should be classified as a mutable object or an immutable value, we should not be misled by common language like “incrementing an integer”, or by specific representations like the one that the designers of Java have chosen for dates; we have to analyze the nature of the denotations, not the nature of their representations.

Characters and Strings

Now, we can continue our exploration of values and objects. How should we classify characters and strings? Are they objects or values?

Again, it is important to distinguish characters and strings from their representations. The glyphs A, A and A (rendered in three different typefaces in the original post) are three representations of the character A, and we have to make the difference between the character A and the glyphs that represent it.

The true nature of characters is a bit more difficult to apprehend than the true nature of numbers because characters are human creations that did not always exist, while numbers seem to exist independently of us. What is the character A? There is no unique glyph that corresponds to A, but the various glyphs have a common shape. There is no unique sound either for A, but rather a “family” of sounds (car, cable). So, the best way to define the character A is probably to say that it is a symbol, and that the various glyphs and sounds are just different representations of this common symbol.

If we accept this definition of characters as being symbols, we have to accept that characters are immutable, very much like numbers. If the character A were to mutate, what could it become? If it were to mutate completely and become another character, B for example, then every A that has been printed in every book would become a glyph for B, and we would have the same logical catastrophe as we had when we imagined that 2 would become 3. And it is difficult to imagine that A would only mutate partially and, for example, become a consonant, or stop being a letter. All these properties seem to be intrinsic to A, and if A were to mutate like this, it seems like our only option would be to consider that A has become another character, and we would end up with two characters: the original A and the mutant A. But the original A would remain and the characters A in all the books would continue to represent the original A rather than the mutant. So, we would have created a new character and the original character would still exist unmodified.

Strings are sequences of characters. For example “flower” and “prime number” are two strings. Like characters, strings may have different representations: the word flower can be printed in many different typefaces, for instance. The representations can mutate but the string itself is immutable. If “flower” were to mutate into “prime number”, then flower shops would start selling prime numbers! And even if “flower” were to mutate only partially and become, for example, “floxer”, we would have to adjust our knowledge and our pronunciation of English. To keep our sanity, we have to accept that “flower” and “floxer” are two different strings and that the first one will not mutate and become the second one.
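
As a side note, JavaScript happens to side with this view: its strings are immutable values (a quick sketch of mine):

var s = "flower";
var t = s.toUpperCase(); // builds a new string, "FLOWER"
console.log(s);          // "flower": the original string is unchanged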

Expressions

This leads us naturally to our “expression” examples:

  • The French president
  • SQRT(4)
  • The number of employees in the company.

What are these expressions? Are they objects or values?

It is obvious that the first expression above denotes an object and that the other two denote values. But here, we are not concerned with what these expressions denote but with the expressions themselves, with what they are.

These expressions are actually strings that denote objects or values. So, as strings, they are values and they are immutable. If “the French president” were to mutate into “the Queen of England”, everything that has been written about French presidents would suddenly apply to Queens of England. We would be very confused!

Here, again, it is important to stress that we are talking about the expressions themselves, not about what they denote, nor about the nature of the denotation. An expression like “the French president” denotes different persons at different times. So it is variable but still immutable.

This last point may seem a bit hard to swallow and needs a bit more explanation. How can something variable be immutable? One way to clarify this is to say that “being variable” is a property of expressions. Some expressions are variable, like the first and third ones above. Others are invariable, like the second one above. But all of them are immutable: the first and third expressions above are variable and will always be variable. The second one will always be invariable. When we say that expressions are immutable, we are considering how the expression itself evolves over time and what would happen if the expression were to mutate into a different expression, not how its denotation evolves over time.
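
One way to picture this (a toy JavaScript sketch of mine; the lookup data is just for illustration):

// The expression itself is an immutable string; evaluating it at
// different times just yields different denotations.
function denotationOf(expression, year) {
  var presidents = { 1990: "François Mitterrand", 2005: "Jacques Chirac" };
  if (expression === "the French president") return presidents[year];
}

denotationOf("the French president", 1990); // "François Mitterrand"
denotationOf("the French president", 2005); // "Jacques Chirac"
// The string "the French president" never changed.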

So, to summarize we can say that expressions are values that denote objects or values. Invariable expressions are values that always denote the same object or value. Variable expressions are values that may denote different objects or values at different times.

At this point, it is interesting to notice the duality between expressions and representations:

  • Expressions are values that denote objects or values.
  • Representations are objects that represent objects or values.

Identity and State

So far, we have been rather focused on values and we have classified the following things as values:

  • Numbers
  • Physical quantities
  • Dates and timestamps
  • Boolean values
  • Characters
  • Strings
  • Expressions

We have not paid too much attention to objects but we assumed that the persons and objects that surround us (a person, a planet, a wooden stick) are objects, because they are obviously mutable.

Being mutable means that some of the properties of these objects can change with time, but it does not mean that everything can change. If everything were to change, then there would not be any reason to consider that we are dealing with the same object. The excerpt from Frege that we quoted earlier explains this very clearly:

When something varies, different properties and different states affect the same object successively. If it were not the same, there would be no subject of which we could say that it varies. A stick lengthens under the effect of heat; during heating, it remains the same. If, on the contrary, we had taken it away and replaced it with another, longer stick, we could not say that it lengthened.

So, although objects are mutable, there is something about them that is immutable, and that we rely upon to decide whether we deal with the same object, or with two different objects. This immutable part is the identity of the object, and the mutable part is its state. During the lifetime of an object, its identity remains the same but its state changes.

It is a bit difficult to say precisely what the identity of an object is made of. In the case of a person, the administrative identity is not always sufficient because we have to consider the case of a person who “changes identity”. The birth record would probably be more reliable.

In the case of a wooden stick, there is no administrative record and we have to rely on our perception: intuitively, we perceive the stick as being one object and we symbolically give it an identity, but this identity is weaker than the identity of a person. What happens if we break the stick in two? If we just cut a small piece off a long stick, we will probably consider that the stick keeps its identity and that the small piece is a separate object with another identity. But if the stick is broken in the middle, then we will probably consider that the stick does not exist any more, and that we now have two sticks, with new identities. So, we cannot really say precisely what makes up an identity; the identity is more a symbolic thing that allows us to identify the object.

But for a given object, the identity is immutable, and, given our definitions of objects and values, it sounds logical to conclude that:

the value of an object is its identity.

The state of an object is defined by the set of values that we obtain when we measure properties of the object. For example, a wooden stick has a length, a temperature, a color. We can measure these properties and we obtain values. The values themselves are immutable but the measurement process may yield different values at different times. So, there is no contradiction between the fact that the object is mutable and the fact that values are immutable. The properties of the object are variable, they may denote different values at different times but the individual values are immutable.

Sets

In the previous sections, we have tried to classify individual things (numbers, strings, persons, wooden sticks) as mutable objects or immutable values. But what about sets? Is a set, for example a set of persons or a set of numbers, an object or a value?

Let us consider the following expression:

The employees of the company

And let us assume that the company had 26 employees yesterday and that John joined today. What does the expression “the employees of the company” denote? Does it denote some kind of “bag” object that contained 26 elements yesterday and into which we have dropped John today? Or did it denote a set of 26 employees yesterday and does it denote a different set of 27 employees today?

If we take the OO programmer’s view, there is a good chance that we will adopt the “bag” viewpoint. A programmer is likely to implement a company object as having an “employees” collection, and when John joins the company, he will be added to the collection. The collection is an object and the programmer will not usually create a new collection every time an employee joins or leaves the company.

On the other hand, if we take the mathematician’s view, then the set of 26 employees from yesterday and the set of 27 employees of today are two different sets. When a mathematician adds an element to a set, he obtains a different set and the original set continues to exist (in the mathematician’s mind). It is not modified by the operation.
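
In JavaScript terms, the two views look like this (a sketch of mine; the withMember helper is hypothetical):

// Programmer's "bag" view: one mutable collection, modified in place.
var employees = ["Jack Smith", "Mary Brown" /* ... */];
employees.push("John"); // same array object, new state

// Mathematician's view: adding a member yields a new set and leaves
// the original untouched.
function withMember(set, member) {
  return set.concat([member]); // returns a fresh array
}
var yesterday = ["Jack Smith", "Mary Brown" /* ... */];
var today = withMember(yesterday, "John");
// yesterday still denotes the original set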

So, if we take the programmer’s view, sets are mutable objects. But if we take the mathematician’s view, they look more like immutable values. Who’s right?

To answer this, it is useful to go back to the classical distinction between the “intensional” and “extensional” characterizations of sets.

  • The intensional characterization of a set is a predicate that is true for the elements that belong to the set and false otherwise. In our example the predicate would be “is an employee of the company”.
  • The extensional characterization of a set is the enumeration of the members of the set. In our example, it would be an enumeration of names like “Jack Smith, Mary Brown, etc.”.

Mathematicians deal with immutable worlds. For example, number theory deals with the properties of integers and the set of all integers does not change with time. It will be the same tomorrow as it is today. And, when a mathematician considers a predicate about numbers (for example “is prime”), the set of numbers that satisfies the predicate does not change.

So, for a mathematician, the intensional and the extensional characterizations correspond to “frozen” sets that do not evolve in time.

But in the real world, the set of objects that satisfy a given predicate, for example the set of employees that belong to the company, changes with time. When we see the set under the intensional angle, we have a tendency to see it as one thing because the intension (the predicate) remains the same. But when we see it under the extensional angle, we see a different extension every day.

One way to reconcile these two views is to go back to the distinction between sense and denotation:

  • The intensional characterization (the predicate “is an employee of the company”) corresponds to the “sense” of the expression.
  • The extensional characterizations (the enumerations of all employees) correspond to the different denotations of the expression.

In this approach, there is only one sense, which is given by the predicate, but the different denotations correspond to the different extensions that the set takes over time.

So, what we have here is a variable expression that denotes different sets at different times. The expression is variable but its denotations, the sets, are immutable. If we take this view, we conclude that sets are values.

Symbolic Values and Real Objects

We started this investigation with a simple characterization of objects and values: “objects are mutable, values are immutable”, and we have now identified some objects (persons, planets, wooden sticks) and some values (numbers, strings, Booleans, sets).

If we look at these two lists, we are struck by the fact that objects are things that belong to the real world while values are rather intangible, symbolic constructions that seem to exist only in our minds. This should be no surprise because everything that “exists” and is made of real matter is mutable. Immutable things can only exist in our minds, nothing is truly immutable in the real world.

But we should not conclude too quickly that all objects are real, that all values are symbolic, and that this whole discussion was just a futile exercise to distinguish real objects from symbolic values. The latter is true: values are symbolic and are creations of our minds. But the former is false: not all objects are real. There are objects that don’t exist in the world but that we have to consider as being objects rather than values. For example, fictional characters are objects: they have an identity and they go through different states, just like real objects. They are not frozen, immutable values.

Summary of the Conceptual Model

To conclude this first article, let me quickly summarize the conceptual model that we have explored:

  • Expressions have a sense and a denotation.
  • The sense of an expression accounts for its “cognitive significance”.
  • The denotation of an expression is the thing that the expression denotes.
  • Two expressions may have different senses but the same denotation.
  • A variable expression is an expression that takes different denotations over time.
  • A constant expression is an expression that always has the same denotation.
  • An object is a mutable denotation.
  • A value is an immutable denotation.
  • Objects have an identity which is immutable.
  • Objects have a mutable state.
  • Numbers and physical quantities are values.
  • Dates and timestamps are values.
  • Booleans are values.
  • Characters and strings are values.
  • Sets are values.
  • A representation is an object that represents an object or a value.
  • An expression is a value that denotes an object or a value.
  • The value of an object is its identity.
  • Values are symbolic and don’t exist in the real world.
  • Objects are usually real, but not always.

Links

Objects vs. Values, by Jean-Hugues Robert
