A post giving an overview of metaprogramming in Lean. There is of course a book on Metaprogramming in Lean, consider this an introductory chapter that I felt I lacked.
There are two types of metaprogramming, shallow embedding and deep embedding.
If you've done some metaprogramming in Ruby, Javascript or Python - that was likely the shallow embedding metaprogramming. If you want to write tactics in Lean, you want to learn deep embedding.
Metaprogramming in Lean is deeply connected to the compilation steps - parsing, syntactic analysis, transformation, and code generation.
Lean 4 is a reimplementation of the Lean theorem prover in Lean itself. The new compiler produces C code, and users can now implement efficient proof automation in Lean, compile it into efficient C code, and load it as a plugin. In Lean 4, users can access all internal data structures used to implement Lean by merely importing the Lean package.
Leonardo de Moura, Sebastian Ullrich (paper)
The entirety of metaprogramming in Lean can be captured in 3 steps - parsing, elaboration, and evaluation. The compiler sees a string of Lean code, say "let a := 2"
, and the following process unfolds:
syntax
rule ("let a := 2" → Syntax
)syntax
rules to turn a string of Lean code into the Syntax
object. syntax
rules are basically glorified regular expressions - when you write a Lean string that matches a certain syntax
rule's regex, that rule will be used to handle subsequent steps.macro
s in a loop (Syntax → Syntax
)Syntax
object into some new Syntax
object. Then, the new Syntax
is processed in a similar way (steps 1 and 2), until there are no more macro
s to apply.elab
(Syntax → Expr
)elab
method that's matched to the appropriate syntax rule by the name
argument (both the syntax
rule and elab
method have this argument, and they must match). The newfound elab
returns some Expr
object.Expression (Expr
) is then converted to the executable code during the evaluation step - we don't have to specify that in any way, the Lean compiler will handle that for us.
Syntax
/Expr
/codeLean will perform the parsing/elaboration/evaluation steps for you if you use syntax
& elab
, however when you are actually writing your elab
, you frequently need to perform these transitions manually.
Now, when you're reading Lean source code, you will see 11(+?) commands specifying the parsing/elaboration/evaluation process. Most of them are syntax sugar, and you only need 3 commands to do metaprogramming in Lean:
syntax (name := xxx) ... : command
For example,
syntax (name := xxx) "#help" "option" (ident)? : command
is a syntax rule that will match (remember the regex analogy) all strings in the form of"#help option hi.hello"
or just"#help option"
.
Other widespread syntax categories aretactic
andterm
, all of these are used in different physical places in your code.
@[macro xxx] def ourMacro : Macro := ...
For example,
@[macro xxx] def ourMacro : Macro := λ stx => match stx with | _ => `(tactic| pick_goal 2)
. Whenever your Lean code matches the syntax rule with the namexxx
, this macro will be applied.
If the syntax rule wassyntax (name := xxx) "swap" : tactic
, this macro will syntactically turnswap
intopick_goal 2
- and the subsequent elaboration step will be handled by thepick_goal 2
syntax rule.
@[command_elab xxx] def ourElab : CommandElab := ...
For example,
@[term_elab our_help] def ourElab : TermElab := λ stx tp => (Expr.app (Expr.const `Nat.add []))
.
Now, TermElab stands for Syntax → Option Expr → TermElabM Expr, so the elaboration function is expected to output the Expr object.
There also exists CommandElab, which stands for Syntax → CommandElabM Unit, so it shouldn't return anything.
Tactic stands for Syntax → TacticM Unit, so it shouldn't return anything either.
This corresponds to out intuitive understanding of terms, commands and tactics in Lean - terms return a particular value upon execution, commands modify the environment or print something out, and tactics modify the proof state.
These syntax (name := xxx) ... : command
, @[macro xxx] def ourMacro : Macro := ...
and @[command_elab xxx] def ourElab : CommandElab := ...
are the 3 essential functions, and you can get away with them only. Lean standard library and Mathlib use many syntax sugars, however, so it's well worth the effort to memorize these. I'm summing them up in the next diagram.
elab_rules
is sugar for @[command_elab ~] def ~ : CommandElab
.elab
is a combination of syntax ~ : command
and @[command_elab ~] def ~ : CommandElab
(I write CommandElab
here for succinctness, but of course it can be a Tactic
or TermElab
). macro_rules
is sugar for @[macro] def ~ : Macro
.macro
is a combination of syntax ~ : command
and @[macro] def ~ : Macro
.notation
, prefix
, infix
, postfix
also all combine syntax ~ : command
and @[macro] def ~ : Macro
- they differ from macro
in that you can only define rather simple syntax using them. With macro
you can define any syntax.I have hinted at the flow of execution of these three essential functions here and there, however here I want to lay it out explicitly. The order of execution follows the following pseudocodey template: (syntax; macro)+ elab
.
Consider the following example.
The process is as follows:
match appropriate
syntax
rule (happens to havename := xxx
) ⇒
apply@[macro xxx]
⇒match appropriate
syntax
rule (happens to havename := yyy
) ⇒
apply@[macro yyy]
⇒match appropriate
syntax
rule (happens to havename := zzz
) ⇒no macros with name
zzz
to apply, so apply@[command_elab zzz]
.
The behaviour of syntax sugars (elab
, macro
) can be understood from these first principles.
macro
VS elab
oration?In principle, you can do with a macro
(almost?) anything you can do with the elab
function. Just write what you would have in the body of your elab
as a syntax within macro
. However, the rule of thumb here is to only use macro
s when the conversion is simple and truly feels elementary to the point of aliasing. Here are some discussions from Lean Zulip.
I'm wondering where people think the border between "use a macro" and "use an elaborator" for this actually is. In my mind this has so far been that as soon as types or control flow is involved a macro is probably not reasonable anymore.
Henrik Böving (zulip thread)
I also agree (this was also discussed in another thread) that using
Syntax
for logic and control is far from ideal, and it is best to work at theMeta
/TermElab
level for this. Lean 4 macros are superb for notation and for creating some shortcuts. But passing around bits of syntax as proxies for function arguments is very fiddly.Understanding
Syntax
is very much necessary. I think the point both Jannis Limperg and I are making is that one should avoid readers driven to using macros and working with syntax where it is better to useMeta
/Elab
and work with expressions, simply because one can start working with macros a bit quicker.Siddhartha Gadgil (zulip thread)