Monday, January 05, 2015

January 5, 2015

(Reason #2534 why Lisp rules)

The Lisp assembly code generated by the compiler in pLisp is already in continuation-passing style, but I wanted to have a fresh look at the whole compilation process so that the current limitations related to compiling the FRAME instruction are overcome, and also wanted to correctly handle continuation objects in the full-native-code mode where we will not have reg_accumulator, reg_current_value_rib and the other machinery that the Lisp assembly depends on.

Time to hit the books; in particular, Appel's Compiling With Continuations.

We first compile the source code into continuation-passing style, apply closure conversion to this, and then do the remaining good stuff like elimination of nested scopes, register spilling, target assembly generation and so on.

In this post I'm just going to talk about the first step, i.e. conversion to CPS-style, as I've progressed only to Chapter 5 in the book so far. But even at this stage, the advantages of programming in Lisp stand out. More specifically, two features: a) 'Code is data' (I might as well print a T-shirt with this emblazoned in the front) and b) gensym (yeah, this is not exactly a universal Lisp feature, but the ability to create and manipulate symbol objects programmatically is, I guess). Oh, and since we're doing the entire thing in pLisp itself, a tip of the hat to the whole meta-circular interpreter thing; the ability to write and test out these things in Lisp itself is a godsend not only in terms of saving time, but is also central to leveraging the features mentioned above.

The CPS version of pLisp source code is a CONS object, the CAR of which is the expression to evaluate, and the CDR of which is a lambda expression to which the result of the evaluation is passed. Textbook CPS stuff. As an example,

((cons a b) <exp>)

is converted to

(cons (lambda (x) 
        (a (lambda (y) 
             (b (lambda (z) 
                  ((x y z) <exp>)

where <exp> is the rest of the computation (also a CPS expression) where our expression fits in.

To convert to and evaluate CPS expressions within the interpreter, we need an environment to map variables--both source expression variables and the lambda variables created as part of the conversion. A normal way to do this is to create an ASSOC-PAIR object and get and put stuff into it. But no, we do things the cool way (this is a nifty technique to mimic environments in interpreters):

(defun env0 (x)
  nil)

(defun lookup (e var)
  (if (primop var)
      var
    (e var)))


(defun update (e var val)
  (lambda (v)
    (if (eq var v)
        val
      (e v))))


The environment is not a data object; it's a closure. We start out with an initial environment that simply returns NIL. When a new symbol is added to it, we produce a new closure that returns the value corresponding to this symbol if the argument matches the symbol, and defers to the original environment otherwise. Symbol lookup is just invoking the environment with the symbol (an added wrinkle is that we need to evaluate symbols corresponding to primitive operators to themselves). Brilliant, if you ask me.

Conversion to CPS-style is done by the to-cps function (from this point on, the code is in image form, apologies; click on the image to see it fully):


The function takes a source expression [e.g. (cons a b)] and a receiver CPS expression, and produces a CPS expression that is the receiver expression into which the converted source expression is embedded. An example:

(to-cps '(cons x (+ x x)) nil)

evaluates to:

(imagine that the LAMBDAs are preceded by a left paren as CONsing them with a symbol removes the left paren in print mode).

All the #:G symbols may look ugly, but if we're counting on GENSYM to generate symbols on the fly for us, we're going to have to live with them.

Evaluation of CPS expressions is done by eval-cps:

 
Since we're doing everything within the pLisp interpreter, and code is data, all we need to do for the actual evaluation is to invoke apply.

(eval-cps (to-cps '(cons x (+ x x))) (update env0 'x 100))

gives

(100 . 200)

And voila, we've implemented Chapter 5 in less than 50 lines of Lisp code (arguably for a more elegant and powerful language). Less than 25 lines if you ignore eval-cps.