Kernel Mode Garbage Collector


I've been impressed by papers on managed operating systems, like Singularity. Singularity supports hardware and software process isolation, but either can be disabled. In experiments, software process isolation without hardware process isolation increased the speed of system calls and inter-process communication by 30% to 500%.1 I've started to think hardware process based isolation is completely useless, but I often get push back or confusion when I suggest removing hardware and operating system support for virtual memory. To make my argument against virtual memory, and to help me see if I'm wrong, I've started working on my own managed operating system.

I don't have the time or interest to implement hardware process isolation, much less an operating system that implements POSIX, and I'm starting from scratch so I can design the system however I want. I'm interested in Oberon and Inferno, but not interested enough yet to study them. I have too much to read anyway. Instead, I'm going to make my own operating system based on my misconceptions of them. My operating system will be a kernel level lisp interpreter. Instead of executables, all programs are functions. Instead of a loader, you call a function. Instead of processes, all programs run as threads in the kernel. This design is easy to implement, relatively safe, interesting, and not particularly innovative. Microcontroller operating systems, like Xinu, don't use hardware isolation either; all processes or really kernel threads. To make my system, I'll need, at a bare minimum, a garbage collector, a scheduler, and an interpreter.

Hardware support doesn't matter much to me, but I'd like to at least get a RTL8139 driver written so I can support raw ethernet sockets. The RTL8139 is supported by QEMU, documented by the operating system development wiki, and doesn't look too hard to implement. The real challenge will be seeing how much of the driver I can implement with my lisp. Support for non-volatile storage would be nice too.

To get me started, I went through the aforementioned wiki, built my cross compilers, finished the bare bones tutorial, and got something written out using the VGA text mode on QEMU. I'm stuck with 32 bit x86 for now, but it shouldn't be hard to port the system later. The only places I need assembly are the stub that calls kernel main, the context switcher for the scheduler, and maybe some drivers.


Figure 1: My kernel hello world.

From there, I started working on the garbage collector. It seems silly to work on this before the interpreter, but I had never implemented a garbage collector before and I had a book on them on my desk. I decided to start with a simple mark and sweep garbage collector. For the object format, I used something very similar to the example simple represenation shown in Guile implementation documentation. It's very wasteful. The memory pool is a big array of these objects. The root list is another array.

struct memory_object {
  enum memory_object_type type;
  enum memory_mark mark;
  union {
    size_t fixed_integer;
    struct {struct memory_object * car; struct memory_object * cdr;} pair;
    struct {struct memory_object * start; size_t size;} array;

This ran in kernel mode, but debugging it was frustrating so I wrote test code that uses the garbage collector in user space. Now, the garbage collector seems to be working. I recently added support for cons cells, but I still need to test cyclical data structures.


Figure 2: I can allocated some objects and print them out. With the object pool set to 4, the allocator must run the garbage collector and reclaim memory.

struct memory_object * test = memory_new_cons(


Figure 3: I can print out my list too. Unsurprisngly, cyclical lists overflow the stack.

I will probably need to make the garbage collector generational, for speed, and distributed, to support multiple processes, but I can do that later. Since it's well documented in the CPython Internals book, I might copy CPython's generational garbage collector.

There's a neat Catch-22: I want a good memory inspector, but to implement one I need string support. I don't want my memory inspector to rely on a different form of memory allocation than the rest of the system.

Even with such a short piece of code, I've already been exposed to a lot of design questions:

There's a lot to figure out.

Sharing garbage collected objects between processes is actually a very hard problem.

I could always just say "processes don't share memory", but that seems like an unnecessary restriction.