Metaprogramming in Lean

A post giving an overview of metaprogramming in Lean. There is of course a book on Metaprogramming in Lean, consider this an introductory chapter that I felt I lacked. 

There are two types of metaprogramming, shallow embedding and deep embedding.
If you've done some metaprogramming in Ruby, Javascript or Python - that was likely the shallow embedding metaprogramming. If you want to write tactics in Lean, you want to learn deep embedding.

Connection to compilers

Metaprogramming in Lean is deeply connected to the compilation steps - parsing, syntactic analysis, transformation, and code generation.

Lean 4 is a reimplementation of the Lean theorem prover in Lean itself. The new compiler produces C code, and users can now implement efficient proof automation in Lean, compile it into efficient C code, and load it as a plugin. In Lean 4, users can access all internal data structures used to implement Lean by merely importing the Lean package.

Leonardo de Moura, Sebastian Ullrich (paper)

First we will have lean code as a string. Then Syntax object. Then Expr object. Then we can execute it.

The entirety of metaprogramming in Lean can be captured in 3 steps - parsing, elaboration, and evaluation. The compiler sees a string of Lean code, say "let a := 2", and the following process unfolds:

  1. apply a relevant syntax rule ("let a := 2" → Syntax)
    During the parsing step, Lean applies all declared syntax rules to turn a string of Lean code into the Syntax object. syntax rules are basically glorified regular expressions - when you write a Lean string that matches a certain syntax rule's regex, that rule will be used to handle subsequent steps.
  2. apply all macros in a loop (Syntax → Syntax)
    Each macro simply turns the existing Syntax object into some new Syntax object. Then, the new Syntax is processed in a similar way (steps 1 and 2), until there are no more macros to apply.
  3. apply a single elab (Syntax → Expr)
    Finally, it's time to infuse your syntax with meaning - Lean finds an elab method that's matched to the appropriate syntax rule by the name argument (both the syntax rule and elab method have this argument, and they must match). The newfound elab returns some Expr object.

Expression (Expr) is then converted to the executable code during the evaluation step - we don't have to specify that in any way, the Lean compiler will handle that for us.

Manual conversions between Syntax/Expr/code

Lean will perform the parsing/elaboration/evaluation steps for you if you use syntax & elab, however when you are actually writing your elab, you frequently need to perform these transitions manually.

3 essential functions and their syntax sugars

Now, when you're reading Lean source code, you will see 11(+?) commands specifying the parsing/elaboration/evaluation process.  Most of them are syntax sugar, and you only need 3 commands to do metaprogramming in Lean:

  • syntax rule: syntax (name := xxx) ... : command

For example, syntax (name := xxx) "#help" "option" (ident)? : command is a syntax rule that will match (remember the regex analogy) all strings in the form of "#help option hi.hello" or just "#help option"
Other widespread syntax categories are tactic and term, all of these are used in different physical places in your code.

  • macro: @[macro xxx] def ourMacro : Macro := ...

For example, @[macro xxx] def ourMacro : Macro := λ stx => match stx with | _ => `(tactic| pick_goal 2). Whenever your Lean code matches the syntax rule with the name xxx, this macro will be applied.
If the syntax rule was syntax (name := xxx) "swap" : tactic, this macro will syntactically turn swap into pick_goal 2 - and the subsequent elaboration step will be handled by the pick_goal 2 syntax rule.

  • elab: @[command_elab xxx] def ourElab : CommandElab := ...

For example, @[term_elab our_help] def ourElab : TermElab := λ stx tp => ( (Expr.const `Nat.add [])).
Now, TermElab stands for Syntax → Option Expr → TermElabM Expr, so the elaboration function is expected to output the Expr object.
There also exists CommandElab, which stands for Syntax → CommandElabM Unit, so it shouldn't return anything.
Tactic stands for Syntax → TacticM Unit, so it shouldn't return anything either.
This corresponds to out intuitive understanding of terms, commands and tactics in Lean - terms return a particular value upon execution, commands modify the environment or print something out, and tactics modify the proof state.

These syntax (name := xxx) ... : command, @[macro xxx] def ourMacro : Macro := ... and @[command_elab xxx] def ourElab : CommandElab := ... are the 3 essential functions, and you can get away with them only. Lean standard library and Mathlib use many syntax sugars, however, so it's well worth the effort to memorize these. I'm summing them up in the next diagram.

elab_rules is sugar for @[command_elab ~] def ~ : CommandElab.
elab is a combination of syntax ~ : command and @[command_elab ~] def ~ : CommandElab (I write CommandElab here for succinctness, but of course it can be a Tactic or TermElab). 

macro_rules is sugar for @[macro] def ~ : Macro.
macro is a combination of syntax ~ : command and @[macro] def ~ : Macro.
notation, prefix, infix, postfix also all combine  syntax ~ : command and @[macro] def ~ : Macro - they differ from macro in that you can only define rather simple syntax using them. With macro you can define any syntax.

Order of execution: syntax, macro, elab

I have hinted at the flow of execution of these three essential functions here and there, however here I want to lay it out explicitly. The order of execution follows the following pseudocodey template: (syntax; macro)+ elab.

Consider the following example.

The process is as follows:

match appropriate syntax rule (happens to have name := xxx) ⇒
apply @[macro xxx]

match appropriate syntax rule (happens to have name := yyy) ⇒
apply @[macro yyy]

match appropriate syntax rule (happens to have name := zzz) ⇒

no macros with name zzz to apply, so apply @[command_elab zzz].

The behaviour of syntax sugars (elab, macro) can be understood from these first principles.

Assigning meaning: macro VS elaboration?

In principle, you can do with a macro (almost?) anything you can do with the elab function. Just write what you would have in the body of your elab as a syntax within macro. However, the rule of thumb here is to only use macros when the conversion is simple and truly feels elementary to the point of aliasing. Here are some discussions from Lean Zulip.

I'm wondering where people think the border between "use a macro" and "use an elaborator" for this actually is. In my mind this has so far been that as soon as types or control flow is involved a macro is probably not reasonable anymore.

Henrik Böving (zulip thread)


I also agree (this was also discussed in another thread) that using Syntax for logic and control is far from ideal, and it is best to work at the Meta/TermElab level for this. Lean 4 macros are superb for notation and for creating some shortcuts. But passing around bits of syntax as proxies for function arguments is very fiddly.

Understanding Syntax is very much necessary. I think the point both Jannis Limperg and I are making is that one should avoid readers driven to using macros and working with syntax where it is better to use Meta/Elab and work with expressions, simply because one can start working with macros a bit quicker.

Siddhartha Gadgil (zulip thread)