Friday, April 05, 2013

Building a Lisp Interpreter from Scratch -- Part 3: The Object System

(This is Part 3 of a series of posts on pLisp)

(Note: This post is quite out of sync with the code base; please see Part 13)

As I mentioned briefly at the end of Part 1, objects in pLisp are denoted by OBJECT_PTR. OBJECT_PTR is nothing but a fancy name for unsigned 32-bit integers:

typedef unsigned int OBJECT_PTR;

The 32 bits in an OBJECT_PTR do two things:

1. Specify what type of object we're looking at
2. Depending on the type of the object, either store its value, or point to where the value is stored.

#1 is accomplished by the last four bits, called the tag bits, and the remaining 28 bits take care of #2. Before dissecting OBJECT_PTR, a brief digression is in order: we need to understand how memory is handled in pLisp. I plan to do a separate post on the memory model, so I'll stop with a few key pieces of information that are relevant here.

Memory is allocated out of a heap, whose size is specified in the code (TODO: pass this as a command line parameter):

#define HEAP_SIZE 8388608

This memory is in the form of an array indexed by RAW_PTR values which are, surprise surprise, unsigned integers in drag:

typedef unsigned int RAW_PTR;

Digression over.
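To make the tag/payload split described before the digression concrete, here is a minimal sketch in C. It assumes that "the last four bits" means the four least significant bits and that unsigned int is 32 bits wide. The helper names (get_tag, get_payload, make_object) and the tag value are made up for illustration and are not taken from the pLisp source.

```c
#include <assert.h>

typedef unsigned int OBJECT_PTR;

/* Assumption: the tag occupies the four least significant bits. */
#define TAG_BITS 4
#define TAG_MASK 0xFu

/* Hypothetical tag value, for illustration only. */
#define EXAMPLE_INTEGER_TAG 4u

/* Returns the four tag bits that identify the object's type. */
unsigned int get_tag(OBJECT_PTR p)
{
  return p & TAG_MASK;
}

/* Returns the remaining 28 bits: either the value itself or a heap index. */
unsigned int get_payload(OBJECT_PTR p)
{
  return p >> TAG_BITS;
}

/* Packs a 28-bit payload and a 4-bit tag into one OBJECT_PTR. */
OBJECT_PTR make_object(unsigned int payload, unsigned int tag)
{
  assert(payload < (1u << 28)); /* payload must fit in 28 bits */
  assert(tag <= TAG_MASK);      /* tag must fit in 4 bits */
  return (payload << TAG_BITS) | tag;
}
```

With these helpers, make_object(42u, EXAMPLE_INTEGER_TAG) yields a value whose tag is EXAMPLE_INTEGER_TAG and whose payload is 42.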

The table below lists the pLisp object types and how an OBJECT_PTR value is deconstructed to yield them:




Most of this is pretty self-explanatory, but a couple of explanations are in order:


First, implementing the full IEEE floating point format is too painful (I had done this for Vajra), so I enlisted the help of the C runtime for this. A float value is stored in a union:


union float_and_uint
{
  unsigned int i;
  float f;
};

When we want to extract the float value from an OBJECT_PTR, we take its first 28 bits, use them to index into the heap, and store the heap content at that index in the 'i' member of the union. Voila, the 'f' member automagically contains the correct float representation. The same process in reverse works for creating the OBJECT_PTR value.
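The round trip through the union can be sketched roughly as follows. This is a toy version under several assumptions: the heap is an array of unsigned int, float and unsigned int have the same size, the tag sits in the low four bits, and allocation is a simple bump counter. The names make_float, get_float, EXAMPLE_HEAP_SIZE and EXAMPLE_FLOAT_TAG are hypothetical, and the heap is kept small here instead of using HEAP_SIZE.

```c
#include <assert.h>

typedef unsigned int OBJECT_PTR;
typedef unsigned int RAW_PTR;

#define TAG_BITS 4
#define TAG_MASK 0xFu
#define EXAMPLE_FLOAT_TAG 5u    /* hypothetical tag value */
#define EXAMPLE_HEAP_SIZE 1024  /* toy size, not pLisp's HEAP_SIZE */

union float_and_uint
{
  unsigned int i;
  float f;
};

/* Assumption: heap cells are unsigned ints indexed by RAW_PTR. */
static unsigned int heap[EXAMPLE_HEAP_SIZE];
static RAW_PTR next_free = 0;   /* toy bump allocator, no reclamation */

/* Stores the float's bit pattern in the heap and tags the heap index. */
OBJECT_PTR make_float(float value)
{
  union float_and_uint u;
  RAW_PTR slot;

  assert(next_free < EXAMPLE_HEAP_SIZE);
  slot = next_free++;
  u.f = value;          /* write as float ... */
  heap[slot] = u.i;     /* ... read back as raw bits */
  return (slot << TAG_BITS) | EXAMPLE_FLOAT_TAG;
}

/* Reads the bit pattern from the heap and reinterprets it as a float. */
float get_float(OBJECT_PTR p)
{
  union float_and_uint u;

  assert((p & TAG_MASK) == EXAMPLE_FLOAT_TAG);
  u.i = heap[p >> TAG_BITS];
  return u.f;
}
```

Because only the bit pattern is copied, get_float(make_float(x)) returns exactly x for any non-NaN float, with no rounding introduced by the encoding.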

Second, we resort to some hackery for continuation objects as well. The only piece of information a continuation object has to store is the stack object active at the time the continuation was created, so it seems wasteful to use the heap for this. Instead, we simply convert the stack object (a CONS) into a continuation object by changing its four tag bits.
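A rough sketch of this retagging trick, again assuming the tag lives in the low four bits. The tag values and the function names cons_to_continuation and continuation_to_cons are invented for illustration; the actual pLisp code may differ.

```c
#include <assert.h>

typedef unsigned int OBJECT_PTR;

#define TAG_MASK 0xFu
#define EXAMPLE_CONS_TAG 6u           /* hypothetical tag value */
#define EXAMPLE_CONTINUATION_TAG 10u  /* hypothetical tag value */

/* Keeps the 28 payload bits (the location of the stack's CONS cell)
   and swaps the tag so the object reads as a continuation. */
OBJECT_PTR cons_to_continuation(OBJECT_PTR stack)
{
  assert((stack & TAG_MASK) == EXAMPLE_CONS_TAG);
  return (stack & ~TAG_MASK) | EXAMPLE_CONTINUATION_TAG;
}

/* Reverses the conversion to recover the saved stack object. */
OBJECT_PTR continuation_to_cons(OBJECT_PTR cont)
{
  assert((cont & TAG_MASK) == EXAMPLE_CONTINUATION_TAG);
  return (cont & ~TAG_MASK) | EXAMPLE_CONS_TAG;
}
```

No heap cell is allocated in either direction; the continuation and the stack object share the same 28 payload bits and differ only in the tag.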