Monday, November 25, 2013

November 25, 2013

Nice (italics mine):
It wasn't entirely a shock to Fallon, as he'd always been aware that he was someone especially motivated by power and manipulating others. Additionally, his family line included seven alleged murderers, including Lizzie Borden, infamously accused of killing her father and stepmother in 1892. Many of us would hide this discovery and never tell a soul, out of fear or embarrassment of being labeled a psychopath. Perhaps because boldness and disinhibition are noted psychopathic tendencies, Fallon has gone in the opposite direction, telling the world about his finding in a TED Talk, an NPR interview and now a new book published last month, The Psychopath Inside.

Thursday, November 14, 2013

November 15, 2013

What can you do with, say, Rs 15 crores?
  1. Lifesaving heart surgeries for one thousand children @ Rs 1,50,000 per child
  2. School tuition for a year of Rs 10,000 for 15,000 children (we're talking about a less-than-middling school, nothing fancy)
  3. Splurge it on a luxury condominium with seven-level security, personal elevators, designer sanitary and bath fittings, gym, terrace pool with jacuzzi, a panic room and a quick-safe room (I don't even know what the last two are for; maybe sanctuaries to quickly escape to when you're paralyzed by terror at the thought of what will happen when the Revolution comes for you?)
(In case it wasn't clear by now, this is post #17384 on what's wrong with our country).

The full front-page ad in The Hindu yesterday shows an airport runway with a just landed aircraft, and a well-heeled lady climbing out of a limousine, with a concierge/butler type helping her out, holding her gloved -- naturally -- hand  while protecting her from the elements with an umbrella (disclosure: I do not know for sure whether the umbrella is diamond encrusted). Oh, and in the background you can see a highrise, no doubt housing the above condominium with the panic room.

The sad part is that these condominiums will probably be overbooked and will anyway be on an invitation-only basis to keep out the mere millionaires. You can try having your secretary call the numbers provided (they actually say this in the ad: 'Have your secretary call us at ..."), but unless your net worth is in three figures (we're not talking rupees here, but crores) be prepared to take a number and wait.

The other ridiculous aspect of the whole thing is that, when you step out to your balcony in the morning, with the platinum cup of coffee made from beans shat out by an endangered civet, Central Park you ain't getting: on a lucky day, there will not be any street dogs crapping (if you're feeling bored, you can amuse yourself by wondering what kind of beans you'll find in their poo) in front of your building; on an unlucky (or extra-lucky, if you're into that sort of thing) day the canines might very well be replaced by a member of homo sapiens.

Tuesday, October 22, 2013

Multi-Stage Programming in Lisp

I have been reading up on MSP recently, and trying to grok the examples in Walid Taha's Gentle Introduction. When all you have is a Lispy hammer, every problem calls for a program with parentheses, so the next order of business is to find Lisp equivalents of Bracket, Escape and Run.

'Run' is trivial; to run a Lisp program, we evaluate expressions; so 'eval' it is.

'Escape' is for dropping in actual values into placeholders, so a comma seems appropriate. However, a comma has to occur within the context of a backquote, so it seems we're stuck. But wait, an Escape has to occur within the context of a Bracket, so maybe we can use backquotes for Brackets? Further analysis, in the form of seeing how well the two words rhyme, proves that backquotes are indeed the equivalents of Brackets in Lisp.

I went through a slightly more involved process in arriving at the above mapping, but this version of the story is a lot more fun.

Without further ado, here is the famous MSP power example in Lisp:

CL-USER> (defun mypower (x n)
           (if (eq n 0)
               `1
               `(* ,x ,(mypower x (- n 1)))))
MYPOWER


CL-USER> (mypower 10 3)
(* 10 (* 10 (* 10 1)))


CL-USER> (eval (mypower 10 3))
1000
CL-USER>


Yeah, we're missing the well-typed and well-formed guarantees, but Lisp's 'code-is-data' proves its worth once again. And we're not even using macros.

If we want to bring more structure to the staged expressions, we can express them as, say, closures. That's probably a topic for a future post.

I still have a lingering doubt whether we have accomplished truly multi-stage programming, seeing how trivial it has been to implement.

On a related note, what's common between the power and the logger functions? They are the only use cases that provide credibility to their respective patron saint programming paradigms, viz. MSP and aspect oriented programming. Just kidding.

***
With all the recent changes at the helm, can we expect less of full front-page real estate ads promising free gold coins?

Wednesday, October 16, 2013

October 16, 2013

Deccan Chronicle has an irritating daily feature called 'Word Spy' where they (on second thought, they're not smart enough to do this; it's probably syndicated) tweak regular words slightly, in a way nobody else would, and create supposedly charming and quirky new words that make you -- in theory -- smile and nod your head intelligently.

Example from a recent edition: 'Trialogue'. Before looking at the answer, let's try to figure it out ourselves: hmm, sounds like 'dialogue' which is a conversation between two people, so could this be a conversation among three people? Bingo! You win a free subscription to DC for a year! What? You don't have a to-be-potty-trained pet? Maybe you have a cat litter to fill up? OK then.

To see how easy it is to make up new (crappy) words from regular ones, here are three of my own inventions:
  1. Polisex: 1. Using sex as a weapon in politics 2. The dynamics of physical relations between people
  2. Furvasive: Condition of too much fur on a dog's back
  3. Introduckory: Evasive behaviour of someone who wants to avoid meeting new people.
I did say they're crappy words.They have to be, since I took about 1.2 seconds to make up each.
**** 
Kudos to John Michael Greer for being the first person from the native English-speaking world to use the word 'crores' in a normal way (at least to us Indians).

 

Wednesday, October 09, 2013

October 9, 2013

From the shedding-crocodile-tears department:
“What is the need to privatise the profitable Chennai airport? After investing over Rs. 2,000 crore to modernise the airport, why should a private party run the airport? After Delhi and Mumbai airports were privatised, passengers are being charged an exorbitant User Development Fee,” said L. George, regional secretary of the AAEU.
Yeah, they care for passengers that much. Why not be honest and admit that they fear a threat to their livelihoods because of the possible loss of jobs?

Monday, September 30, 2013

September 30, 2013

Ignoring for the moment the need for the RBI to get intimately involved in transactions between retailers and consumers, this is precisely what is wrong with our country:
Babu Jayaram and his family have been waiting to replace their nine-year-old television with an LCD TV. They have also set their sights on acquiring an SLR camera.
Their dream to own these high-value products by swiping their credit cards is, however, unlikely to be realised this Diwali with the Reserve Bank of India (RBI) coming down on banks offering zero per cent-interest consumer finance schemes.
“We were banking on such schemes. Though a processing fee was levied, it used to be adjusted against the second instalment. I ended up paying only the cost of the product. But with the ban on zero-interest schemes, it will be some time before we get access to such easy consumer finance,” he said.
The need for instant gratification, buy-now-pay-later, pledging future earnings to pay off usurious loan sharks masquerading as your friendly corporate banker/credit card company. It's not obvious how much of our economy is underpinned by easy credit. No wonder everybody's keeping their eyes peeled for the RBI's latest pronouncements about the repo and other policy rates -- if businesses (and consumers) relied only on money in the bank to buy stuff, our GDP would probably be lesser by 20%.

***
"Also, by 2100, all our brains will be in vats, with our needs being fulfilled by electric impulses fed to our neurons, while machines will harvest our brains' excess energy to power themselves in their continued domination of humankind". Oh wait, that was the plot for The Matrix.

Tuesday, August 13, 2013

Look Ma, no console!

pLisp is now officially a GUI app. The code still contains #ifdefs to revert to console mode if required, but the results are decent enough to persist with this change. I zeroed in on GTK+ after dithering over this decision for a long time, and, while doing GUI development in a C environment without any handy things like drag and drop, WYSIWYG layouts, and so on is a bit of a pain, I think I've got the hang of it now. Speaking of GTK+, it looks overwhelming at first, but it's manageable if you adopt an approach of just taking what you need and getting it to work (supplemented with liberal copy/paste of code strewn all over the internet).

Screenshots:




Yeah, I know: Transcript and Workspace? It's almost as if pLisp is a cover tribute to VisualWorks Smalltalk.

Thursday, July 04, 2013

Building a Lisp Interpreter from Scratch -- Part 12: Exception Handling and Object System

(This is Part 12 of a series of posts on pLisp)

This post is almost an afterthought, since I initially thought of winding up with Part 11. But then I got to working on exceptions and an object system, and decided to write about them as well.

Exception handling

Exception handling turned out to be extremely trivial to implement in pLisp, on account of its support for continutations. All we need to do when we invoke a function is to capture the current continuation and create a new continuation object that incorporates the exception code and, optionally, the 'finally' code. This continuation is invoked whenever an exception occurs (through a (throw ...) special form [Update: not exactly a special form, as it's implemented in the library]). The system maintains a stack of these exception handlers, to be unwound as the exception travels up the call stack.

The only support from the core system (i.e., that requires changes to the source code of pLisp as opposed to changes to the pLisp library) is to reset the global exception handler stack after each return to the top level.

Here's the full exception system code from pLisp.lisp (in image form to avoid code wrapping and other ugly things; one of these days I should figure out how to get those pretty boxes with scroll bars in Blogger for displaying code):


Object System

"Give a man functions and procedures, and he'll never have closure. 
Give a man closures, and he'll write his own object system"

(I was googling for Paul Graham's quote on closures and object oriented programming, but couldn't find it, so the above dodgy quote will have to do).

As the quote implies, a more-or-less fully functional object oriented system can be developed just with closures (and macros, of course; what will we do without them?). The meat of pLisp's object system (unfortunately named POS) is in two macros, DEFINE-CLASS and CREATE-INSTANCE, and a closure called CALL-METHOD. Though these macros look quite intimidating, they are conceptually quite simple; they create closures that store the class/instance variables and methods, and return a list of CONS pairs that map symbols to the class'/instance's methods.

Once a class or an instance has been created, we invoke methods on it by invoking CALL-METHOD.

When we define a class, we specify the following pieces of information:
  • The name of the class
  • The name of its superclass
  • List of instance variables with their initial values and getter/setter names
  • Ditto for class variables
  • Closures that perform initialization for the class and its instances
  • List of class methods
  • List of instance methods
An example:

(DEFINE-CLASS 
              ;name of the class
              SQUARE
              ;name of its superclass
              SHAPE
              ;list of class variables
              NIL
              ;list of instance variables
              ((A NIL GET-A SET-A))
              NIL ;class initialization code
              ;instance initialization
              (LAMBDA (VAR) (SET A VAR)) 
              ;class methods
              NIL
              ;instance methods
              ((AREA () (* A A))))  

;MAKE-INSTANCE is a convenience macro
(DEFINE A-SQUARE (MAKE-INSTANCE SQUARE))

;sets instance variable A to 10
(CALL-METHOD A-SQUARE 'SET-A 10)

;invokes the function AREA that returns (* A A), i.e. 100
(CALL-METHOD A-SQUARE 'AREA)

Inheritance, data encapsulation, information hiding -- all there in about 100 lines of pure library code (polymorphism via the dynamic typing already in place).

One current deficiency is the inability to refer directly to class variables from instance methods; the workaround is to invoke CALL-METHOD on the class with the relevant symbol/message.

Monday, June 17, 2013

Building a Lisp Interpreter from Scratch -- Part 11: This and that

(This is the final part of a series of posts on pLisp)

Well, it's time to wind up the series. It's been fun, spending time doing two of my favourite things (programming and writing; the only way this could have been topped is if I were to write about the coding involved in a Lisp-based first-player porn video game [just kidding]). I'll use this post to record stuff that doesn't really fit in any of the topics covered so far, and other random things.
  1. I have continued to work on the code since starting the series, and the posts are already somewhat out of sync with the code. One notable change is that WHILE is now a special form as opposed to a macro. There was nothing wrong with the macro (other than making your head hurt every time you looked at the definition), but since WHILE figures in the definitions of quite a few standard macros (DOLIST, DOTIMES, etc.), this makes sense from a performance perspective.

  2. I am in the process of adding a RETURN-FROM special form to handle non-local returns. This is pretty straightforward to implement using continuations, but things are still flaky -- it works fine for simple function calls, but barfs for functions that call macros internally.

  3. Performance-wise, pLisp has become slow as molasses. I probably need to do something about the implementation of arrays, and look at seriously optimizing the code, particularly memory management. Update: I replaced the custom binary search tree implementation with a red black tree, and the speedup has been phenomenal.

  4. The debug stacktrace at present, to put it politely, sucks big time. This is a direct consequence of moving to a compiler/VM architecture -- the execution trace at the VM level does not make sense for the application programmer; there needs to be a way to map (and display) source forms to the VM instructions, similar to how conventional debuggers manage things.

  5. The next major goal is the IDE -- something that approaches the power of a typical Smalltalk IDE with a system browser where we can browse all the symbols/packages, view/edit the code in a  pane, workspace and transcript windows to replace the command line top-level, a debugger UI, and so on. Man, I miss Smalltalk, especially after mucking around so much with Emacs, gcc and gdb.

  6. I realized that the NUATE instruction (used to invoke continuations) is not used at all, for the simple reason that the compiler never emits it. Just a simple matter of making the compiler inspect a symbol and see if it is bound to a continuation object (similar to how it handles macros now), will probably get around to implementing this soon.

  7. An object system is also in the works (not at the level of sophistication of CLOS or similar), but a functioning system nevertheless, with classes, inheritance, slots, and methods. 

Thursday, June 06, 2013

Building a Lisp Interpreter from Scratch -- Part 10: Foreign Functions

(This is Part 10 of a series of posts on pLisp)

The foreign functions interface in pLisp is the gateway to access and use external code. It is defined by two special forms, LOAD-FOREIGN-LIBRARY and CALL-FOREIGN-FUNCTION. The first form takes a string (literal or otherwise) that represents the name of the shared library to load, and does a dlopen() on it. The successfully created handle to the library is stored internally by pLisp, and is used to service requests through the second form. The second form is the actual invocation of the library's exported function.

We use libffi for implementing these special forms.

The data types supported are integer, float, character and string, and pointers to these data types.

To illustrate with an example, if we have a shared library called libtest.so that exposes a function called fn_ret_float:


float fn_ret_float(int i, float f, char c, char *s)

{
  printf("%d\n",i);
  printf("%f\n",f);
  printf("%c\n",c);
  printf("%s\n",s);
  printf("%f\n", f + f);
  printf("exiting fn_ret_float\n");
  return f+f;
}

we invoke the function as:

$ ./plisp
Welcome to pLISP. Type 'quit' to exit.
USER> (load-foreign-library "libtest.so")
NIL
USER> (call-foreign-function "fn_ret_float" 'float
                             '((10 integer)
                             (6.5 float)
                             (#\a character)
                             ("abc" character-pointer)))
10
6.500000
a
abc
13.000000
exiting fn_ret_float
13.000000
USER>

The first parameter to CALL-FOREIGN-FUNCTION is the name of the to-be-invoked function, expressed as a string; the next parameter is the return type of the function; this is followed by a list containing the actual parameters and their respective data types. Passing the data types too may seem redundant, since pLisp can figure out the types on its own, but we still need to differentiate between pointers and non-pointers, so we have to live with this redundancy.

Another example:

int fn_arg_int_ptr(int *i)
{
  printf("passed value is %d\n", *i);
  *i = 100;
  return 0;
}

USER> (define i 10)
I
USER> (call-foreign-function "fn_arg_int_ptr" 'integer
                             '((i integer-pointer)))
passed value is 10
0
USER> i
100
USER>

This illustrates the above point about the redundancy -- if we do not mention the type of the argument (integer-pointer), pLisp will not know whether we need to pass the value of i or the address of i to the function.

A couple of points with respect to libffi:
  1. There seems to be some issues with using the ffi_arg struct for returning floats; I had to explicitly use a float variable for this.
  2. libffi is also used to implement the FORMAT special form. It handles the variable arguments requirement perfectly. More on FORMAT later.

Monday, May 27, 2013

May 27, 2013

(Obligatory 'Random thoughts on the IPL' post of the year)

1. If there is a single piece of imagery that succinctly captures the essence of this whole sordid IPL season -- no, make that the entire IPL -- it's the super slow motion footage (accompanied by faux operatic music, natch) of the scantily clad oxy blonde East European hooker cheerleader dancing with gay abandon, her face frozen in the rictus of feigned joy while holding an open Pepsi bottle that spews out its carbonated sugary poison all over the place, also in super slow motion. Lust, gluttony, greed, pride -- all there. Throw in  some footage of Sreesanth glaring balefully at a batsman while having a towel stuck in his waistband, and you've got wrath and sloth covered as well (brownie points if you figure out how sloth is involved).

2. In related news, the Oxford English Dictionary has decided to replace the present definitions of the words/phrases 'chutzpah', 'cojones', and 'conflict of interest' with just a single picture of Srinivasan. They claim that this change will save them about 0.13% in space, while adding 78% in clarity and simplicity to the definitions.

3. On a more serious note, the reaction of the BCCI president reminds me of Bill Clinton's impeachment troubles. The spin then was that it was a witch hunt that had to be resisted at any cost. What transpired subsequently, in terms of prosperity and economic growth ("It's the economy, stupid"), is purported to have vindicated this stand (ignoring the dotcom bust, and other downsides to bubble economics). Whether this is true in this case as well remains to be seen.

4. If the BCCI president has been able to stand up so far to the powerful forces arrayed against him, this either means that a) he has powerful backers in his corner supporting him behind the scenes or b) he has a few aces up his sleeve that serve as leverage. If you believe that the match/spot fixing scandal emerged as the result of genuine sleuthing and has nothing to do with the moves on the grand chessboard to control the riches offered by Indian cricket, I have lifetime season tickets for two for the hospitality box in all IPL matches to sell to you at a 90% discount.

5. The electronic media's coverage of the spot fixing scandal has been over the top as usual. While this is expected, Deccan Chronicle have also joined the party with their front-page attacks on Srinivasan. No doubt they're getting their revenge for some slight they suffered at his hands when they had a horse in the IPL race.

Friday, May 24, 2013

Building a Lisp Interpreter from Scratch -- Part 9: Serialization

(This is Part 9 of a series of posts on pLisp)

(Note: This post is quite out of sync with the code base; please see Part 13)

Serialization in pLisp is an all-or-nothing thing, i.e. not at the level of individual objects, but at the image level. You can invoke the CREATE-IMAGE special form at any point (at the top level, in the middle of execution, or while in debug mode) to dump the image to the hard disk to a file of your choosing. Loading the image is done by invoking pLisp with the file name as the argument.


$ ./plisp

Welcome to pLISP. Type 'quit' to exit.

USER> (defun fact (n)

        (apply * (range 1 n 1)))

FACT
USER> (fact 10)
3628800
USER> (create-image "test.image")
NIL
USER> q
Bye.

$ ./plisp test.image
Welcome to pLISP. Type 'quit' to exit.
USER> (fact 10)
3628800
USER> quit
Bye.

Nifty. An even more powerful aspect is the ability to do this in the middle of a debugging session (Hi, Smalltalk. It's been a long time, we should catch up sometime soon.):

$ ./plisp
Welcome to pLISP. Type 'quit' to exit.
USER> (let ((x 10))
        (break)
        (* 2 x))
BREAK
DEBUG> x
10
DEBUG> (create-image "debug.image")
NIL
DEBUG> (resume)
20
USER> q
Bye.

$ ./plisp debug.image
Welcome to pLISP. Type 'quit' to exit.
DEBUG> x
10
DEBUG> (resume)
20
USER> q
Bye.

Astute readers will observe that starting pLisp with the image takes us directly into the debugging session that was active when the image was created. The real utility of this feature is in allowing us to share the image file with somebody and enlist them in our dirty debugging efforts.

All these good things are possible courtesy of the TPL serialization library (thanks Troy, especially for help with the tricky bit involving packing arrays inside structs). The following variables capture the complete state of pLisp, and serializing/deserializing them is all it takes:

strings
top_level_env
free_list
last_segment
heap
current_package
gen_sym_count
packages
debug_mode
debug_continuation
debug_env
execution_stack
debug_execution_stack
reg_accumulator
reg_next_expression
reg_current_env
reg_current_value_rib
reg_current_stack

images.c is the place where all this action is.

Update: One feature not yet implemented is the serialization of references to foreign function libraries: if we load a bunch of .so's and then create an image, the handles to these libraries will be lost when we invoke pLisp with the image file. The way out is to serialize an array of strings containing the shared library names, and then do a dlopen() on these libraries afresh.

Monday, May 20, 2013

May 20, 2013

Quick, what's the first thing that comes to mind when you hear the word "shochaley"? If you answered "Vidya Balan", pat yourself on the back and stand in the line over there for your lollipop. I understand the importance of proper plumbing, the need for celebrities to lend their time to worthwhile causes, and so on, but if somebody told me that roping her in for this ad was a plot concocted by her rivals to damage her brand ("Jahan shochaley, vahan Vidya Balan"), I'll probably nod my head in thoughtful agreement.

Does anybody believe that spot-fixing in the IPL begins and ends with the three Rajasthan Royals players, and nobody else from the other franchises (the word 'franchise' itself reeks of the money-grubbing ethos of the whole thing) is involved? Please return the lollipop if you do. The convenient death of the person responsible for the tapping of the phones? Just a coincidence, please move along.

Must-read article about Ranbaxy.

Monday, May 06, 2013

Building a Lisp Interpreter from Scratch -- Part 8: Garbage Collection

(This is Part 8 of a series of posts on pLisp)

(Note: This post is somewhat out of sync with the code base; please see Part 13)

There are many algorithms available for garbage collection. We go with the simplest (well, the second simplest, since mark-and-sweep is simpler, but needs allocation of one bit in the object, something which we cannot do), i.e. tri-color marking.

Tri-color marking works like this: we have three sets of objects -- white, grey and black. The white set consists of all objects that can be garbage collected, while the black set contains objects that should not be (because somebody is holding on to references to these objects). The grey set contains objects that we're not sure about yet.

Before moving on, something about the when of garbage collection: we do GC after every execution of the REPL.

Whenever an object is created (through object_alloc()), it is added to the white set. If, after execution of the REPL, we find that it's still in the white set, it is GC'd. Objects save themselves by first getting themselves promoted to the grey set and then to the temporary haven that is the black set. Temporary because, as they say, you're only as good as your last hit, so beware the next call of gc() -- nobody may be holding your hand when the music stops.

The grey set starts off with all the root references. A root reference is an object that we're sure is reachable and hence cannot be GC'd. In the case of pLisp, the root references are all the top-level objects, conveniently captured in the top_level_env CONS object. Thus our grey set starts with this single object.

For each root reference, we do the following:
  1. Move it to the black set
  2. Move all the objects it directly references to the grey set.
The above algorithm is applied recursively till we end up with no more objects in the grey set.

Now GC is just a simple matter of freeing up RAW_PTRs corresponding to the objects remaining in the white set.

We use a binary search tree to store the contents of the three sets:

struct node
{
struct node *left;
struct node *right;
OBJECT_PTR key;
} ;

with the OBJECT_PTR values serving as the key. A slight wrinkle is needed to make sure that the BST remains balanced when we remove a node with both children present: we toss a coin to determine whether we go down the left or the right sub tree.

How to determine the objects referenced by a given object is pretty straightforward:
  1. If it's a CONS object, check its CAR and CDR
  2. If it's a closure or a macro object, it will reference three CONS objects: a parameter list, the enclosing environment, and the body of the closure/macro.
  3. For array objects, check all the array elements
  4. If it's a continuation object, the CONS object corresponding to the current call stack will be the referenced object (remember the trickery we resorted to to efficiently store the current call stack in the continuation object?)
  5. Objects of other types (character, integer, float, symbols, strings) do not reference other objects.
The memory needed to store symbols and strings does not come from the heap, but from a dynamic (char **) array called 'strings'. pLisp does not do any GC for this yet; maybe in the future.

Integers and floats, while not referencing other objects, are actually allocated on the heap. Lopping off the tag from objects of these two types yields a RAW_PTR that indexes into the heap; the relevant location contains 32 bits that is the integer/float value of the object. This is very inefficient, actually, since the same number will be stored multiple times on the heap, but this is required if we want to leverage the full 32 bits to store the integer (only 28 bits were being used earlier [see Part 3]). Maybe we can use the flyweight pattern or something.

Friday, May 03, 2013

May 3, 2013

Patriotism is the last refuge of the scoundrel.

Watching Musharraf's travails after his return to Pakistan reminds me of a Tom and Jerry cartoon: the one where Tom sneaks furtively into a yard, only to be ambushed by the bulldog, his tail firmly in the grip of said bulldog's jaws, with periodic 'boing, boing' noises emanating whenever Tom's head makes contact with the hard ground. Not that I have any sympathy for the f*cker. Musharraf, that is; Tom I dearly love.

Speaking of bulldogs, Surendra's recent cartoon about the CBI and the Supreme Court is a classic.

Tuesday, April 23, 2013

Building a Lisp Interpreter from Scratch -- Part 7: Continuations

(This is Part 7 of a series of posts on pLisp)

Ah, continuations. The feature that needed so much wracking of the brain, so much rewrite of the code to implement.

All this frustration stemmed from a pretty innocuous thing: my literal interpretation of the definition of a continuation. A continuation, in the context of an expression being evaluated, is defined as a function of one parameter which, if invoked with the result of the sub expression, would execute the rest of the computation. Sounds straightforward, once you wrap your head around the confusing and poorly worded previous sentence. You look at the examples from Paul Graham's On Lisp and go 'Yeah, that sounds OK, I get it'.

Big mistake. You do not get it.

It's quite easy to form the lambda expression equivalent to the current continuation for a given expression/sub-expression. What is not easy is to figure out how to save the state of the partial computation till that point (some of the arguments may have been evaluated, some may not; if the continuation is invoked later, the context till that point has to be abandoned, and replaced with that of the continuation; and so on), when your interpreter is implemented as nothing more than a call to a function named eval(), which calls itself multiple times (we're implicitly riding on the coattails of the C call stack, basically).

Long story short, Kent Dvybig's PhD thesis to the rescue, continuations trivial to implement if the interpreter is a proper virtual machine with registers, call stack, etc.

Our continuations are modelled on those of Scheme. The relevant primitive is CALL-CC (CALL-WITH-CURRENT-CONTINUATION has only 30 characters, so I went with CALL-CC). We invoke CALL-CC with a lambda form which is passed the current continuation object by pLisp. Textbook Scheme.

$ ./plisp
Welcome to pLISP. Type 'quit' to exit.
USER> (define x 0)
X
USER> (define cont)
CONT
USER> (progn (call-cc (lambda (cc) (set cont cc)))
             (incf x)
             x)
1
USER> x
1
USER> (cont '())
2
USER> x
2
USER>

As I mentioned earlier, we resort to some trickery when we store continuation objects. The only thing a continuation object captures is the current call stack, which is a CONS object (a list of call frames). We lop off the tag bits from this CONS object and simply retag the bits 1010 -- CONTINUATION_TAG -- to it, and call it a continuation object. Cheap trick, but efficient.

Monday, April 22, 2013

Building a Lisp Interpreter from Scratch -- Part 6: Macros

(This is Part 6 of a series of posts on pLisp)

Macros are where pLisp's affinity to Common Lisp is most clearly manifest. The macro definition syntax -- use of backquotes, comma, and comma-at -- pretty much mirrors that of CL:

(defmacro first (lst)
  `(car ,lst))

I guess I don't have much else to say in this post, so maybe I'll ruminate on macros in general.

One key lesson I learned while doing macros in pLisp is that macros and closures are identical (as evidenced by the identical code in vm.c that handles both), except for when they're invoked (compile time versus run time). There is nothing special about the 'special' operators used by macros, i.e., comma and comma-at: you can make calls to these operators from closures and derive the same expected behaviour from them as you would when calling them from a macro:


$ ./plisp

Welcome to pLISP. Type 'quit' to exit.

USER> (defun test (lst)

        `(first ,lst))
TEST
USER> (test '(1 2 3))
(FIRST (1 2 3))
USER> Huh.

Another aspect of pLisp macros is that they are top-level only, i.e., there is no MACROLET or its equivalent. Shouldn't be too difficult to add this, but I'm not (yet) convinced that their absence is that critical.

Tuesday, April 16, 2013

Building a Lisp Interpreter from Scratch -- Part 5: Compiler, Virtual Machine

(This is Part 5 of a series of posts on pLisp)

Using a compiler and a virtual machine for an interpreter may seem somewhat of an overkill, but makes sense for two reasons:
  1. Incorporating continuations in pLisp showed me that a VM-based approach (hence a compiler) is a straightforward way to do so -- virtual registers, call stack, the whole works -- rather than mucking around with code walkers and the like.
  2. Writing a compiler would also be useful when we want to run pLisp code natively.
Before we proceed further, some attributions are in order: Kent Dybvig's PhD thesis is pretty much the reference for much of this post.

The last time we spoke about the interpreter per se, we had just converted an expression_t struct into a native pLisp object (OBJECT_PTR). We are now ready to do further things with this pLisp object:
  1. Compile it
  2. Interpret it
(who could have seen that coming?)

The compiler converts a pLisp source form into a simplified Lisp form that is understood by the VM/interpreter. The following are the constructs in the simplified Lisp:

HALTHalts the virtual machine
REFERLoads a variable reference
CONSTANTLoads a constant
CLOSECreates a closure
MACROCreates a macro
TESTImplements IF/ELSE
ASSIGNStores a value in a variable
DEFINECreates a variable binding in the top level
CONTICreates a continuation
NUATEExecutes a continuation by replacing the current stack with that of the continuation
FRAMEStores the register contents in a new frame and pushes the frame on to the call stack
ARGUMENTAdds the results of the last evaluation to the value rib
APPLYClosure/Macro/Continuation application
RETURNPops a frame, restores registers
BACKQUOTEProcesses a backquote (can't be done by the compiler because we need to do some compiling at run time; see below) 

These constructs are predicated on the existence of the following, for want of a better term, ISA elements:

//register that stores the 
//results of the last computation
OBJECT_PTR reg_accumulator;

//the next expression to be 
//evaluated by the VM
OBJECT_PTR reg_next_expression;

//the environment in which the
//current expression is to be evaluated
OBJECT_PTR reg_current_env;

//holds the arguments evaluated
//till now (for closure applications)
OBJECT_PTR reg_current_value_rib;

//the call stack
OBJECT_PTR reg_current_stack;

The VM interprets the form pointed to by reg_next_expression, by suitably manipulating the registers, and sets the next expression to be evaluated, thereby perpetuating the computation, till it runs out of expressions to evaluate.

The astute reader will observe that all these elements are first-class pLisp objects. This is imperative especially for the current stack, since continuations encapsulate the call stack (although, to be fair, there are ways to reify the stack using native mechanisms and still end up with a first-class continuation object).

The call stack itself is a CONS object made up of frames, themselves pLisp arrays containing the next expression, environment, and the current value rib.

The working of the compiler is somewhat peculiar; the simplified Lisp that it produces from a pLisp source form is something like a Russian-doll-within-doll toy. An example will make this clear:

The source form

(define x 10)

gets compiled into

(CONSTANT 10 (DEFINE X (HALT)))

i.e., we have three 'instructions' to be interpreted by the VM (CONSTANT, DEFINE and HALT), and the first instruction contains the second, the second contains the third, and so on. Looks a bit disconcerting at first, but it works.

The compiler and interpreter work pretty independently of each other, except when it comes to macros and EVAL. If the compiler detects that it's dealing with a macro, it sets things up so that a call is made to the interpreter, and receives the results of the interpreting for compilation. The interpreter enlists the help of the compiler to process BACKQUOTE, COMMA, COMMA-AT and EVAL.

Thursday, April 11, 2013

Building a Lisp Interpreter from Scratch -- Part 4: Memory System

(This is Part 4 of a series of posts on pLisp)

(Note: This post is quite out of sync with the code base; please see Part 13)

As I mentioned briefly in Part 3, memory in pLisp is allocated from a fixed-size heap and objects are referred via indexes (called RAW_PTRs) into this heap.

Memory is addressed in pLisp not at the byte level, but at the word (32 bit) level, as indicated by the typedef for OBJECT_PTR (unsigned int). This is inefficient when we're dealing which characters, for example, but efficiency is not the primary concern for us.

The interface to the memory system is through three calls:

RAW_PTR object_alloc(int);
void set_heap(RAW_PTR, OBJECT_PTR);
OBJECT_PTR get_heap(RAW_PTR);
  1. object_alloc() is the equivalent of C's malloc(). It takes as argument the number of four-byte words to be allocated, and returns a RAW_PTR that points to the newly allocated space in the heap.
  2. set_heap() sets the pointed-to memory location in the heap to the given OBJECT_PTR value.
  3. get_heap() is the read counterpart of set_heap().
There is no deallocation procedure because we have GC that kicks in right after a form has been evaluated and its results reported to the top level. GC is a topic for another post.

The memory system is a straightforward text book implementation. We maintain a list of available chunks/segments of memory in the form of a linked list (initially there is just one big segment as big as the full heap; this gets chopped up as allocation requests are serviced).

The first word of a segment contains its size; the next word points to the next available segment. The subsequent words constitute the space available in that segment:



We maintain two variables called 'free_list' and 'last_segment' to store the start and end of the linked list respectively. To allocate memory of a certain size, we walk the free list and stop when we encounter the first segment that is large enough for our needs. The segment is removed from the list and its space is given to pLisp; whatever is left over is packaged off into a new (smaller) segment. An obvious improvement would be to look for the segment whose size most closely matches the amount of memory requested, rather than settle for the first segment that meets our needs. When it's time to free memory -- through GC -- the segment to be deallocated is simply attached to the free list at the end.

To illustrate this with an example, here's the implementation of the CONS operator:

OBJECT_PTR cons(OBJECT_PTR car, OBJECT_PTR cdr)
{
  log_function_entry("cons");

  RAW_PTR ptr = object_alloc(2);

  set_heap(ptr, car);
  set_heap(ptr+1, cdr);

  insert_node(&white, create_node((ptr << CONS_SHIFT) + CONS_TAG));

  log_function_exit("cons");

  return (ptr << CONS_SHIFT) + CONS_TAG;
}

Ignore the call to insert_node() for now; this is for GC and doesn't have a bearing on what we've discussed so far. Creation of objects of other types in pLisp follows the same template:
  1. A call to object_alloc() to allocate the memory
  2. Call(s) to set_heap() to populate both the type-specific information (e.g. length for array objects) and the object content itself (array elements, for example)
  3. Decorating the RAW_PTR with the relevant object tag and returning it.
And we're done with memory.