Or “How not to design your compiler for macros”
To see how macros can successfully be implemented in a compiler,
see Implementation of duck-lisp macros.
I don't see many people posting their programming failures, so I figured I'd post my failure with implementing
macros in duck-lisp. The moral is that it was better for the language to accurately represent the division
between the compiler and the VM instead of trying to hide it. I didn't want a built-in comptime
keyword. I wanted macros to be able to call functions defined in runtime code, just like in nearly every other
lisp. Unfortunately, this made a mess of the language model. Every function had to be compiled twice:
once for the runtime environment and once for the compile-time environment. That worked fine without macros, but
once I added them to the language model, each compile-time function had to be ready to execute as soon as it was
defined so that a macro could call it immediately afterwards.
(defun f (x) x)
(defmacro m (y)
  (f y))
(defun g (z)
  (m z))
(println (g "g"))
Duck-lisp is lexically scoped, so all variables must be declared before they are referenced. f must be defined
in the compile-time environment before its reference in m. m must be defined in the compile-time environment and
declared in the runtime environment before it is called in the definition of g. g must be defined in the runtime
environment before its reference by (g "g"). The easiest way to achieve this is to do a sort of compilation in
parallel. Sequentially compile f in runtime and compile-time environments, then compile m, then compile g. It
still works nicely… except for the fact that macros are lexically scoped and can be defined inside other functions.
(defun h (w)
  (defun f (x) x)
  (defmacro m (y)
    (f y))
  (defun g (z)
    (m z))
  (println (g w)))
I'm sure it's not at all obvious how to compile this at this point, but the order the functions must be compiled
in is now much more specific.
1. Begin compilation of h in both environments.
2. Compile f in both environments. The compile-time function is required in step 3.
3. Compile m in the compile-time environment and declare m in the runtime environment. It is required in step 5.
4. Begin compilation of g in both environments.
5. Call m in both environments. Both expansions of m result in the same code since the macro only calls a pure function. (But what if it's impure…)
6. Finish compilation of g in both environments. The runtime function is required in step 7.
7. Finish compilation of h in both environments.
;; Separate copies of `x' are kept in the runtime and compile-time environments.
(var x 0)
(defmacro increment-x ()
  (setq x (1+ x)))
(increment-x)
(println x) ; ⇒ 0
(comptime (println x)) ; ⇒ 1
This isn't difficult to reason about as long as you keep in mind that the macro only captures the compile-time copy of x.
The macro can't affect the runtime x. Now let's stick it in a function (without calling it!) and see what happens.
(var x 0)
(defmacro increment-x ()
  (setq x (1+ x)))
(lambda () (increment-x))
(println x) ; ⇒ 0
(comptime (println x)) ; ⇒ 2 Uh oh
As has already been established, functions are compiled twice, so (increment-x) is expanded twice as a
result. This inconsistency is fixable. The top-level code is not part of any function, but wrapping all code in
a function would cause (comptime (println x)) to print the same result for both versions of the
code. Either this can be fixed properly in the compiler, or you can just evaluate the source code inside an
anonymous function call as shown below.
eval("(funcall (lambda () " + source_code + "))")
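The difference between the two results can be reproduced with a small model. The Python sketch below is not duck-lisp's compiler; the class and method names are hypothetical, and it assumes only what the text states: top-level code is expanded once, while a function body is compiled once per environment.

```python
class ToyCompiler:
    """Toy model of the two-environment scheme; not duck-lisp's real compiler."""

    def __init__(self):
        # Separate copies of x, as in the duck-lisp example.
        self.runtime_x = 0
        self.comptime_x = 0

    def expand_increment_x(self):
        # Models (defmacro increment-x () (setq x (1+ x))):
        # the macro body only sees the compile-time copy of x.
        self.comptime_x += 1

    def compile_top_level(self):
        # Assumption: top-level code is compiled once, so the macro expands once.
        self.expand_increment_x()

    def compile_function_body(self):
        # Function bodies are compiled once per environment,
        # so the macro expands twice.
        for _environment in ("runtime", "compile-time"):
            self.expand_increment_x()


top_level = ToyCompiler()
top_level.compile_top_level()
print(top_level.runtime_x, top_level.comptime_x)  # 0 1

in_lambda = ToyCompiler()
in_lambda.compile_function_body()
print(in_lambda.runtime_x, in_lambda.comptime_x)  # 0 2
```

Wrapping the whole program in a function corresponds to always taking the second path, which is why it makes the two versions agree.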
Something interesting about this double evaluation behavior is that it's possible to tell if a macro is being expanded in the runtime environment or the compile-time environment.
(var i 0)
(defmacro is-comptime ()
  (setq i (1+ i))
  (even? i))
This assumes the code is first compiled in the runtime environment and then in the compile-time environment. And now the abstraction of a single unified environment is completely gone.
(var i 0)
;; Only ever call this once in a macro, else this method doesn't work
(defun is-comptime ()
  (setq i (1+ i))
  (even? i))
(defmacro unpredictable ()
  (if (is-comptime)
      "Hardware and software are logically equivalent"
      '(funcall (lambda () (self)))))
(defun f ()
  (unpredictable))
(comptime (println (f))) ; ⇒ "Hardware and software are logically equivalent"
(println (f)) ; Hangs until the stack overflows
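To make the asymmetry concrete, here is a rough Python model of the parity trick. It is only a sketch: ParityCounter, unpredictable, and compile_twice are hypothetical names, the list stands in for the quoted form, and the runtime-then-compile-time order is the assumption stated above.

```python
class ParityCounter:
    """Models the compile-time variable i and the is-comptime helper."""

    def __init__(self):
        self.i = 0

    def is_comptime(self):
        self.i += 1
        return self.i % 2 == 0


def unpredictable(counter):
    # Returns the "expansion": a string in one environment, a list form in the other.
    if counter.is_comptime():
        return "Hardware and software are logically equivalent"
    return ["funcall", ["lambda", [], ["self"]]]


def compile_twice(expand):
    # Assumption from the text: runtime environment first, then compile-time.
    return {environment: expand() for environment in ("runtime", "compile-time")}


counter = ParityCounter()
expansions = compile_twice(lambda: unpredictable(counter))
for environment, expansion in expansions.items():
    print(environment, "->", expansion)
```

The same macro call yields a self-calling lambda for the runtime copy of f and a plain string for the compile-time copy, so the two copies of f no longer describe the same function.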
This is actually a tame example since I wanted to show that this isn't a problem of hygiene or anything like that.
The problem is that unpredictable breaks the symmetry that all code has had up to this
point. You might recall from the compile sequence above that functions appeared to be compiled in parallel in
both the runtime environment and the compile-time environment. This was done because those
functions were identical in both environments, and compiling them this way
hid the fact that there is more than one execution environment. Now that it is possible to generate different
code for each environment, it is no longer practical to do that sort of parallel compilation at all times. It is
not obvious to me how a string and a list would be compiled in parallel, so for this macro expansion, I think it
would be better if one expansion was fully compiled before the other. If the mode of compilation has to change
from parallel to serial during some or all macro expansions, and the language user has to account for that, then
it doesn't seem like this model is a very good abstraction.

However, there turned out to be a simple and elegant solution to that problem.
Give up.
Make a clear separation between the compile-time and runtime environments as God intended, and don't ever think about this cursed macro system ever again.
- No major changes need to be made to the compilation of defun.
- comptime runs arbitrary code at compile-time. Lexical variables defined in its body remain defined until compilation is finished.
- defmacro defines a compile-time function and declares a macro in the runtime environment.
The main work is in comptime, which still isn't very difficult to add.
- It must be possible to switch between the runtime and compile-time environments that code is compiled in. Everything is compiled in the runtime environment by default, but when a comptime is encountered, its body must be compiled in the compile-time environment.
- It must be possible to incrementally compile compile-time code. The results of each incremental compilation are run one-at-a-time on the VM to define variables and functions.
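As a rough illustration of those two requirements, the Python sketch below models a compiler that switches environments on comptime and runs compile-time code incrementally. It is a toy under stated assumptions, not the duck-lisp implementation: the SplitCompiler class, the tuple-shaped forms, and the "expand" and "push" tags are all invented for this example.

```python
class SplitCompiler:
    """Toy model of the separated design; form names are made up for this sketch."""

    def __init__(self):
        self.comptime_env = {}   # variables defined by comptime code; kept until compilation ends
        self.macros = {}         # macro name -> compile-time expander function
        self.runtime_code = []   # forms emitted for the runtime environment

    def compile(self, form):
        kind = form[0]
        if kind == "comptime":
            # Switch to the compile-time environment and run the body right away.
            form[1](self.comptime_env)
        elif kind == "defmacro":
            # Define a compile-time function and declare the macro name.
            _, name, expander = form
            self.macros[name] = expander
        elif kind == "expand":
            # Expand exactly once, then compile the result for the runtime.
            self.compile(self.macros[form[1]](self.comptime_env))
        else:
            # Everything else is compiled in the runtime environment by default.
            self.runtime_code.append(form)


def increment_x(env):
    # Compile-time side effect, then a (hypothetical) runtime form as the expansion.
    env["x"] += 1
    return ("push", env["x"])


compiler = SplitCompiler()
compiler.compile(("comptime", lambda env: env.update(x=0)))
compiler.compile(("defmacro", "increment-x", increment_x))
compiler.compile(("expand", "increment-x"))
print(compiler.comptime_env, compiler.runtime_code)  # {'x': 1} [('push', 1)]
```

Because there is only one compile-time environment and each macro call is expanded once, the count no longer depends on whether the call sits at top level or inside a function.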