Monday, February 28, 2011

I wrote a compiler for 01_ to LLVM (Low Level Virtual Machine) in a week of weekends and evenings. LLVM's static type-checking caught numerous silly mistakes. However, I got bit twice because LLVM does not warn, at least with the default option, when the calling convention declaration of the caller and callee do not match. (I use fastcc because tail-call elimination is important for 01_ programs, and failed to specify it in the caller those two times.) This seems like something that could be checked by the computer and reminds me why I prefer using statically typed programming languages over dynamically typed programming languages.

I wrote the parser in one evening, which I had done before, so it was mostly a matter of getting reacquainted with the Parsec parsing library.

I spent another evening and a weekend learning the LLVM assembly language and writing the runtime library.

I spent another couple of evenings writing the code generator.

I spent the last evening chasing down memory leaks in the generated code.

The code is available at github.com.

Monday, February 14, 2011

I had been thinking about compiling 01_ to LLVM for a while, and finally decided to get started on it by playing around with LLVM assembly language. One thing I like about LLVM assembly language is the static type checking. Anyhow, I started writing a runtime library for 01_. The only data type in 01_ is a lazy list of bits. The data type does not permit circular references, so I'll use reference counting garbage collection.

Here's my first stab at the data type for 01_ values:

%val = type { i32, i1, %val*, { i1, %.val* } (i8*)*, void (i8*)*, i8* }

which, in a C-like syntax would be:

struct val {
int refcount;
bool bit;
struct val *next;
  struct { bool bit, struct val *next } (void*) *eval;
void (void *) *free_env;
void *env;
};

where bit and next are undefined and eval is non-null for unevaluated promises, and bit is undefined and next is null and eval is null for values evaluating to nil, and bit contains the bit value and next points to the next value and eval is null for non-nil values. That's a pretty large data structure for a single element of a bit list. I could shrink it by the size of a pointer by using the same location for next and env and casting, as env and free_env are never valid at the same time as next and bit. I won't do that, though, because it would make the code less clear, and having more understandable code is more important to me in this project.