duck-lisp


Duck-lisp is a programming language I wrote from 2021 to 2024, though I still periodically update it.

I was joking around online with some programmers about horrible code such as i -= -!False, and someone said "Just use #define and write it in regular expressions". Regular expressions don't actually exist in C's preprocessor, but another programmer and I found this to be an interesting idea, interesting enough to write our own programming language with this feature or something like it: C, but with a powerful preprocessor. He named it "Hanabi". Writing a C-like programming language seemed difficult, so I started writing duck-lisp as practice. Unfortunately, the other language didn't work out, but duck-lisp continued to be my primary project for the next three years.

Duck-lisp is a little odd. I barely knew what Lisp was when I started writing it, so I just made stuff up; I didn't learn Emacs Lisp until a few months in. This is why variables are defined using var instead of let and are scoped using parentheses (yes, this can be confusing). In some ways the language was planned, and in some ways it grew. I originally wanted goto, but that turned out to be difficult to implement with how the duck-lisp stack works.

The repository is on GitHub.

Resources

I relied heavily upon Crafting Interpreters for some of the major features; I had a lot of trouble with closures, for example. For most of the other components I merely used Crafting Interpreters as a reference while making up my own mechanisms to get similar behavior.

I learned how to use Lua's C API while writing a game engine. I practically copied its VM API.

Emacs Lisp and Common Lisp have greatly influenced duck-lisp. Duck-lisp's macros are a slightly restricted form of Lisp macros, and many other functions and data types from lisps exist as well.

I didn't find many resources on implementing macros in a bytecode compiler, so here's something I wrote that can hopefully fill that gap: Implementation of duck-lisp macros.

Timeline and explanation of major components

Memory allocator

Duck-lisp has its own memory allocator. I suggest not using it, but if you are compiling for a platform that doesn't have a memory allocator for whatever reason, duck-lisp's allocator might work. It is not what I would call well-written, and from what I recall it tends to do dangerous things when it runs out of memory. It is also slow. I originally wrote the allocator because I wanted duck-lisp to be able to run even when there was no standard library. It can do that, but the vast majority of platforms have memory allocators even when they don't have the full C standard library, so in the end it wasn't too useful. It was a fun challenge though.

Parser

The parser is quite basic. It can read integers, floats, strings, identifiers (symbols), and expressions (which other Lisps call "forms", but this is duck-lisp, and what I say goes :) ).
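To make those categories concrete, here is a minimal reader in the same spirit, written as a Python sketch rather than the real C implementation: it turns text into integers, floats, strings, symbols, and nested expressions.

```python
import re

# One token per match: open paren, close paren, string literal, or atom.
TOKEN = re.compile(r'\s*(?:(\()|(\))|("(?:[^"\\]|\\.)*")|([^\s()"]+))')

def tokenize(text):
    tokens = []
    pos = 0
    while pos < len(text):
        m = TOKEN.match(text, pos)
        if not m:
            break  # trailing whitespace or junk
        pos = m.end()
        tokens.append(next(g for g in m.groups() if g is not None))
    return tokens

def parse(tokens):
    tok = tokens.pop(0)
    if tok == '(':
        expr = []
        while tokens[0] != ')':      # a real parser would handle EOF here
            expr.append(parse(tokens))
        tokens.pop(0)                # drop ')'
        return expr
    if tok.startswith('"'):
        return tok[1:-1]             # string literal (escapes left as-is)
    for convert in (int, float):     # try integer, then float
        try:
            return convert(tok)
        except ValueError:
            pass
    return tok                       # anything else is a symbol

def read(text):
    return parse(tokenize(text))
```

Expressions come back as nested lists, so (var x 3) reads as ['var', 'x', 3].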

Tries

I originally added tries to keep track of variable names. I did this primarily because tries are cool, not because they are simpler or faster. I don't regret this decision, and as of 2025-05-31, duck-lisp still does not have hash tables. Part of the reason is that I wanted the language to run on microcontrollers, and it seemed wasteful of memory to do hashing at runtime when arrays could be used instead. I'm not sure this logic follows, and if I were to do a "duck-lisp 2", some sort of record type would be one of the first things I would add. It's possible that the underlying mechanism would still be tries though. :)
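The idea, sketched in Python (duck-lisp's real tries are in C, and a C implementation would use per-character arrays rather than the dict used here for brevity):

```python
class Trie:
    """Maps string keys to values, one tree node per character."""

    def __init__(self):
        self.children = {}   # one child node per character
        self.value = None    # set only on nodes that end a key

    def insert(self, key, value):
        node = self
        for ch in key:
            node = node.children.setdefault(ch, Trie())
        node.value = value

    def lookup(self, key):
        node = self
        for ch in key:
            node = node.children.get(ch)
            if node is None:
                return None  # no variable by this name
        return node.value

# E.g. a scope mapping variable names to stack indices:
scope = Trie()
scope.insert("x", 0)
scope.insert("xs", 1)
```

Note that "x" and "xs" share a node, which is part of what makes tries pleasant for name lookup.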

Modularity

One of my primary goals was for duck-lisp to be as modular as possible. I did not achieve this as well as I wanted, but it is somewhat modular.

External C functions, which I call "callbacks", can be registered with the compiler and added to the VM. These are called by duck-lisp code at runtime.
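A sketch of the mechanism in Python (the names here are invented for illustration; the real API is C): the compiler side maps a name to an index, and the VM side maps that index to a host function it can invoke at runtime.

```python
class Runtime:
    def __init__(self):
        self.names = {}       # compiler side: callback name -> index
        self.callbacks = []   # VM side: index -> host function

    def register_callback(self, name, fn):
        self.names[name] = len(self.callbacks)
        self.callbacks.append(fn)

    def call(self, name, *args):
        # The compiler would emit the index into bytecode;
        # the VM dispatches through it at runtime.
        return self.callbacks[self.names[name]](*args)

rt = Runtime()
rt.register_callback("sum", lambda a, b: a + b)
```

The indirection through an index is the important part: compiled bytecode never stores host function pointers, only small integers.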

The parser could easily be replaced since all it needs to do is accept a text string and emit an AST.

The current parser has what I call "parser actions". A C function is passed to the parser along with a name. When the parser finds a function call with that name, the parser action is called. A parser action can perform arbitrary transformations on its arguments (which are parsed before the action is called). In "duckLisp-dev.c" I use a parser action to implement include. The action takes the string argument, opens the file with that name, parses the text inside, then replaces the (include "file.dl") form with the resulting AST.
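A toy version of the idea in Python: any form whose head matches a registered action name is replaced by whatever the action returns. (In duck-lisp the action fires as the parser encounters the form; this sketch runs as a pass over the finished tree, and the "include" here just expands a canned form instead of reading a file.)

```python
def expand_actions(ast, actions):
    """Replace (name args...) forms with the result of actions[name](args)."""
    if not isinstance(ast, list):
        return ast
    # Arguments are processed before the action is called.
    ast = [expand_actions(sub, actions) for sub in ast]
    if ast and isinstance(ast[0], str) and ast[0] in actions:
        return actions[ast[0]](ast[1:])
    return ast

# Toy "include": pretend every file contains (var answer 42).
actions = {"include": lambda args: ["var", "answer", 42]}
tree = expand_actions(["progn", ["include", "file.dl"]], actions)
```

The real include action would open "file.dl", parse it, and splice in the resulting AST, but the replace-the-form shape is the same.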

The compiler itself is modular and is primarily composed of "generators". Generators implement keywords, or as other lisps call them, "special forms". These are keywords such as setq, if, and defun. When (if a b c) is encountered, the if generator is called and passed the AST for the form. It then compiles the form into "high level assembly" instructions. Since a, b, and c could be expressions that have other special forms, if recursively calls compilation functions on its arguments.
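Generator dispatch can be sketched like this in Python. The instruction names are invented for illustration, not duck-lisp's real high-level assembly, but the shape is the same: each generator compiles its own form and recurses into subexpressions.

```python
def compile_form(ast, code):
    if not isinstance(ast, list):
        code.append(("push-const", ast))   # constants compile to a push
        return
    head, *args = ast
    if head in GENERATORS:
        GENERATORS[head](args, code)       # dispatch to the keyword's generator
    else:
        raise ValueError(f"unknown form: {head}")

def gen_if(args, code):
    condition, then_branch, else_branch = args
    compile_form(condition, code)          # recurse: condition may be any form
    jump_to_else = len(code)
    code.append(None)                      # placeholder: branch if false
    compile_form(then_branch, code)
    jump_past_else = len(code)
    code.append(None)                      # placeholder: skip the else branch
    code[jump_to_else] = ("branch-if-false", len(code))
    compile_form(else_branch, code)
    code[jump_past_else] = ("jump", len(code))

GENERATORS = {"if": gen_if}

code = []
compile_form(["if", 1, 2, 3], code)
```

The backpatching of the two placeholder jumps is the classic trick: branch targets aren't known until the branches have been compiled.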

The VM is much more monolithic, but it has a couple of extension mechanisms as well. The main one is user-defined types. Users can put their own binary data into an object and give it a custom garbage collector tracing function and a destructor function. Lua also has extensible types, but I think it does them better because it doesn't require users of its API to mess with garbage collection as much.
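The shape of a user-defined type, sketched in Python with hypothetical names (the real interface is C): opaque user data plus two hooks the VM calls, one for the garbage collector to trace references and one to destroy the data when the object is collected.

```python
class UserObject:
    def __init__(self, data, trace=None, destroy=None):
        self.data = data
        self.trace = trace      # called by the GC to mark reachable objects
        self.destroy = destroy  # called when the object is collected

log = []
handle = UserObject("file-handle",
                    destroy=lambda data: log.append(f"closed {data}"))

# The VM would invoke the destructor when the object dies:
handle.destroy(handle.data)
```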

Data stack

Stacks are pretty simple: an array that you push data onto the end of and pop it off of. The idea of a VM stack is also simple. You push data onto the stack, pop it off to operate on it, then push the result back on. That said, stacks are deceptively hard to manage in a large compiler. I had many stack-related bugs due to not being consistent about how and where data was pushed and popped. Eventually I came up with a calling convention for special forms and functions, and that seemed to help.
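The push/pop discipline above, as a tiny Python sketch of a stack machine (the instruction names are made up; duck-lisp's real instruction set is richer): operands go on the stack, and an operation pops them and pushes its result.

```python
def run(program):
    stack = []
    for op, *args in program:
        if op == "push":
            stack.append(args[0])
        elif op == "add":
            b = stack.pop()          # pop operands...
            a = stack.pop()
            stack.append(a + b)      # ...push the result back on
    return stack

# 2 + 3 as stack code:
result = run([("push", 2), ("push", 3), ("add",)])
```

The bugs come from bookkeeping: every generator has to agree on exactly how many values each form leaves on the stack, which is what a calling convention pins down.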

Closures

Closures were painful. I find most other programming concepts pretty easy to understand, but the implementation of closures in a bytecode VM is complicated. From what I recall, I tried two different ways to implement closures, and both failed. Only after reading the whole chapter on closures in Crafting Interpreters was I finally able to implement them correctly.

Interestingly, there are at least two types of closures in programming languages. Scheme introduced the most popular form: if you define a variable and then reference it in a function, you create a closure. In Scheme, you can then assign a new value to that variable from inside the closure, and other functions that reference the variable will see the new value. I haven't used Haskell, but it sounds like that language has a second sort: you simply can't assign a new value to a variable. This is how Church's lambda calculus originally did closures, and Scheme is interesting in that it extended the semantics to allow assignment to captured variables.
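The Scheme-style behavior can be demonstrated in Python, which works the same way here: an assignment to a captured variable inside one closure is visible to every closure that shares it. (In the Haskell-style variant there is no assignment at all, only binding.)

```python
def make_counter():
    count = 0
    def increment():
        nonlocal count   # assign to the captured variable
        count += 1
    def read():
        return count     # reads the same captured variable
    return increment, read

increment, read = make_counter()
increment()
increment()
```

Both closures share one storage cell for count, which is exactly what makes the bytecode implementation hard: that cell has to outlive the stack frame it was born in.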

Macros

I went wrong here. Duck-lisp has a hard split between compile-time and runtime code, and my first attempt tried to make it possible to call any function at both runtime and compile time. This was a disaster. I don't think it's worth explaining here, but I have another page that goes into detail.

Eventually I did a more typical implementation where macros are functions that run at compile time and accept and emit syntax trees. I go into more detail in another article.
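That model can be sketched in a few lines of Python: a macro is just a function from syntax tree to syntax tree, run by the compiler before code generation. ASTs are nested lists here, and the unless macro is my own illustration, not necessarily part of duck-lisp.

```python
def macroexpand(ast, macros):
    """Repeatedly expand macro calls, then recurse into subforms."""
    if not isinstance(ast, list) or not ast:
        return ast
    head = ast[0]
    if isinstance(head, str) and head in macros:
        # The expansion may itself contain macro calls, so expand again.
        return macroexpand(macros[head](*ast[1:]), macros)
    return [macroexpand(sub, macros) for sub in ast]

# (unless c body) => (if c () body)
macros = {"unless": lambda condition, body: ["if", condition, [], body]}
expanded = macroexpand(["unless", ["zero?", "x"], ["print", "x"]], macros)
```

After expansion the compiler only ever sees core forms like if, which is what makes this design so much simpler than mixing runtime and compile-time calls.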

Parenthesis inference

And now for one of my favorite parts of the language. I mentioned in the introduction that this project started as practice for another language called "Hanabi". Not realizing how integral macros are to lisp, we decided on a whim that the language would have lisp syntax, but with fewer parentheses. Well, that project died, but I really liked the idea of making Lisp look more conventional by removing the parentheses. There was one problem though: macros allow creating what are effectively new keywords, and that makes the rules for when to automatically insert parentheses quite complicated. After duck-lisp was nearly complete, I finally figured out how to make it work. All I needed was to add a very simple type system. The type system is described in detail in duck-lisp's docs, so I won't say much more about it here.

I think parenthesis inference is the most interesting part of duck-lisp, but I don't think it is a good system. The worst lisp coding style (without defining new macros) looks nowhere near as bad as you can make inferred duck-lisp look. It can make lisp look cleaner though, and it makes an excellent syntax for assembly languages. I think a better direction to look in would be Pratt's original parser, which was designed for adding syntactic sugar to Lisp. With a parser like that, I expect the user could easily add new syntactic sugar such as curly braces for indicating variable scope, much like C and its descendants do. Another resource to gain inspiration from could be David Moon's language Lunar, which has a built-in Pratt parser instead of the parserless reader macros that Common Lisp has.
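The core of the inference idea can be sketched in Python: if the compiler knows how many arguments each word takes, parentheses can be inserted mechanically. The arity table below is invented for illustration, and duck-lisp's real system is a small type system rather than a bare arity table, but this shows why new macro-defined keywords complicate things: every one of them needs an entry.

```python
def infer(tokens, arities):
    """Consume tokens, wrapping each known word with its arguments."""
    token = tokens.pop(0)
    arity = arities.get(token, 0)   # unknown words are plain values
    if arity == 0:
        return token
    return [token] + [infer(tokens, arities) for _ in range(arity)]

arities = {"if": 3, "+": 2, "print": 1}

# "if x print y print z" => (if x (print y) (print z))
form = infer("if x print y print z".split(), arities)
```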

Uses

So once you have your own programming language, what do you do with it? Let it sit around like a trophy and gather dust? I would not be surprised if that is what usually happens. I've been fortunate in that I found uses for my language even before it was finished. I've used it in a flash programmer, a microcode loader for my TTL computer, an assembler, and a command language for networked microcontrollers. I also intend to make my TTL computer run a variant of it once it is working.

Updated 2025-05-31