AIM-514, the beginning
14 Sep 2015

In which I take a deliberate swan dive off the deep end.
Returning readers may recall that I am no fan of LispMs. However, this semester at UT Austin I’m taking a research reading class on computer architecture. The focus of the class, and hopefully of next semester’s project, is parallelism in computer architectures. As my professor has a sense of humor, his first assignment for me was to implement exactly one of the linked list interpreting machines which I previously mocked.
So here we go.
The paper in question is AI Memo 514, Steele and Sussman ‘79. This post is just an analysis of the paper, which I wrote to force myself to understand Steele’s memory controller. It is not a presentation of or introduction to either Steele’s design or mine; rather it is a bunch of notes on both which I needed to get out of my head and figured I would publish. Future posts will present and discuss the resulting FPGA implementation.
Memory model
This machine uses a 16 bit address space and 24 bit words. Pointer words are pairs `[tag:8, value:16]`, where `tag` is one of the types or opcodes and `value` is a memory address. Other word interpretations also exist but are special cases.
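As a concrete sketch of that word format (my own illustration, not from the memo), a pointer word can be packed and unpacked like so, assuming the tag occupies the high 8 bits:

```python
TAG_BITS, VALUE_BITS = 8, 16

def make_pointer(tag, value):
    """Pack an 8 bit tag and a 16 bit address into one 24 bit word."""
    assert 0 <= tag < (1 << TAG_BITS) and 0 <= value < (1 << VALUE_BITS)
    return (tag << VALUE_BITS) | value

def tag_of(word):
    return word >> VALUE_BITS

def value_of(word):
    return word & ((1 << VALUE_BITS) - 1)

word = make_pointer(0x03, 0x1234)
assert tag_of(word) == 0x03 and value_of(word) == 0x1234
```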
That structures tend to be small, randomly accessed and short lived is a long established fact in the study of garbage collection and object lifetimes. A naive memory model where literal lists are used to represent all storage, while possible, sacrifices a lot of space storing link pointers, and a machine using this model sacrifices much time traversing link pointers. Were I to implement a strictly linked list based system, a 48 bit word fully encoding a pair `(car, cdr)` would be more natural. However, as hinted above and as studied in the various cdr coding papers (AIM-578 ~ Steele ‘80 & citations), frequently the structure we desire is not truly a linked list although we may encode it as one; rather it is a buffer or an n-tuple which may have some very compact & efficient vector representation given known size.
This suggests that buffer allocation while not logically part of the memory model must be supported, and leads to the conclusion of a 24 bit word size.
Types
Steele observes on Pg. 13 that a constant requirement of the naive list interpreters presented earlier is to test the head of the list for equality against the set of symbols which the interpreter knows how to handle. It would be far more efficient, Steele observes, to simply encode in the type field of the pointer to the list the op to be performed on that list. This encoding saves at least one memory access per expression evaluated and is a subtlety worth noting.
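To make the saving concrete (my own sketch; the tag values are hypothetical and not from the memo): the naive interpreter must fetch the head of the list and compare it against known symbols, while the tagged design dispatches directly on the tag already held in the pointer register:

```python
# Hypothetical tag assignments; Steele does not give concrete values.
TAG_IF, TAG_COMB = 0x01, 0x02

def eval_naive(memory, addr):
    head = memory[addr]          # one extra memory access...
    if head == 'if':             # ...followed by symbol comparisons
        return 'eval-if'
    return 'eval-combination'

def eval_tagged(pointer_tag):
    if pointer_tag == TAG_IF:    # dispatch on the tag already in hand
        return 'eval-if'
    return 'eval-combination'
```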
Note: I shall abuse the → notation here. `→ b` denotes a pointer to whatever `b` is. From Clojure, `^foo` is understood to denote that the following expression has metadata `foo`. `^{:tag int} x` for instance would be a legitimate expression, the shorthand for which is simply `^int x` since the common case of metadata is tags. So `^cons → (e₁ e₂)` would be understood to mean a pointer to the list `(e₁ e₂)` having the tag symbolically meaning `cons`.
Steele proposes for instance that `if` and `cons` may be encoded with such tags, giving an example on Pg. 14 whose encoding is as follows:
```
^if → (^var → (0, 3)
       ^list → (^symbol → x ^list → (^symbol → y ^symbol → nil))
       ^if → (^var → (1, 2)
              ^list → (^symbol → d)
              ^comb → (^var → (0, 1) ^more → (^number → 69 ^cons → nil))))
```
One such optimization noted on Pg. 13 is that while naively local variables could be implemented by maintaining and walking an association list, as Scheme is lexically scoped we can instead statically write a pair `[n, m]`, where `n` is the number of binding stack frames (0 or greater) which we must traverse up to find the scope of the binding in question and `m` is the binding number at that level.
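Concretely (a sketch of my own, not the memo’s example, and assuming 0-indexed slots as in the encoding above): in `(lambda (x y) (lambda (a) x))` the inner reference to `x` crosses one binding frame and lands in slot 0, so it compiles to the address `[1, 0]`. A lookup routine over such addresses might read:

```python
def lookup(env, n, m):
    """Resolve a lexical address [n, m]: walk up n frames, take slot m.
    `env` is a linked list of frames, each a list of bound values."""
    frame, parent = env
    for _ in range(n):
        frame, parent = parent
    return frame[m]

# Environment after applying (lambda (x y) (lambda (a) x)),
# innermost frame first:
outer = (['xv', 'yv'], None)
inner = (['av'], outer)
assert lookup(inner, 1, 0) == 'xv'   # x is [1, 0]
assert lookup(inner, 0, 0) == 'av'   # a is [0, 0]
```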
Steele makes use of several types, but does not discuss any of them at length here. `Number`, `Symbol`, `List`, `Variable` and `Procedure` all occur as tags for values. Furthermore the ops `If`, `Bind`, `Local`, `Funcall`, `Cons`, `Car`, `Cdr`, `Atom`, `Combination` and `More-args` appear. The implication is that there are several more such types which are not mentioned, however I have yet to discover them.
Memory controller
On Pg. 18, Steele discusses a multiple cycle memory controller (seemingly a simple stack machine) which operates on pointers and has the operations `push`, `pop`, `drop`, `setq`, `cons`, `car` and `cdr`. Push is a control state on which the value of the inbound bus is pushed onto the memory controller’s stack. Pop writes the top of the stack out over the result bus. Drop discards the top of the stack. Setq writes the top of the stack to the 2nd of the stack, consuming both. Cons consumes two elements from the stack and pushes one (a new cell) back. Car and cdr both consume the top of the stack, pushing a replacement result value.
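A behavioural sketch of that controller (a model of mine, not Steele’s circuit: the bump allocator, two-word cells, and cons argument order are all assumptions):

```python
class MemoryController:
    """Stack machine over a flat word memory; each cell is two words."""
    def __init__(self, size=8192):
        self.memory = [0] * size
        self.free = 0            # bump allocator; no GC, per the text
        self.stack = []

    def push(self, word):        # latch the inbound bus onto the stack
        self.stack.append(word)

    def pop(self):               # drive top of stack onto the result bus
        return self.stack.pop()

    def drop(self):              # discard top of stack
        self.stack.pop()

    def setq(self):              # write top of stack through 2nd of stack
        value, target = self.stack.pop(), self.stack.pop()
        self.memory[target] = value

    def cons(self):              # consume two operands, push a new cell
        car, cdr = self.stack.pop(), self.stack.pop()
        cell = self.free
        self.memory[cell], self.memory[cell + 1] = car, cdr
        self.free += 2
        self.stack.append(cell)

    def car(self):               # replace a cell address with its car
        self.stack.append(self.memory[self.stack.pop()])

    def cdr(self):               # replace a cell address with its cdr
        self.stack.append(self.memory[self.stack.pop() + 1])
```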
Steele’s design does not feature a garbage collector due to chip area restrictions, and by way of apology to the effect that it could be done he cites much work on the subject of garbage collection. Due to having a large (8k word) memory in my prototype and wanting to get something working, I will also ignore garbage collection for the time being.
One could implement a mark and sweep collector by changing the pointer format from eight bits of tag to seven bits of tag and one gc flag bit. A single bit is kept to count (mod 2) the number of garbage collection passes which have occurred. When allocating storage, the flag bit is set to the value of this bit. When marking, the complement of this bit denotes that a cell is reachable, and that value is recursively written to all objects reachable from the register set. Collection and compaction of freed space into larger regions then occur as one would expect, and finally the counter bit is flipped. This gives the invariant that all storage is tagged as unused at the start of each collection pass, and saves the need to traverse all of storage to reset collection flags.
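A sketch of that flag-flipping invariant (my own model; the memo does not give an implementation, and the per-address flag table here stands in for the one-bit field in each word):

```python
class GcBit:
    """Track liveness with one flipping 'pass parity' bit, so the sweep
    never has to walk all of storage to clear marks afterwards."""
    def __init__(self):
        self.parity = 0          # flipped once per collection pass
        self.flags = {}          # addr -> the cell's single gc flag bit

    def allocate(self, addr):
        # freshly allocated cells carry the current parity: "unmarked"
        self.flags[addr] = self.parity

    def mark(self, addr):
        # reachable cells get the complement of the parity: "marked live"
        self.flags[addr] = 1 - self.parity

    def is_garbage(self, addr):
        # meaningful after marking and before the parity flip
        return self.flags[addr] == self.parity

    def finish_pass(self):
        # flip parity: every survivor's flag now reads "unmarked" for the
        # next pass, with no second traversal to reset flags
        self.parity = 1 - self.parity
```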