Dynamic code generation: def as an exception in Macro expansion prioritisation. Understanding Macro.escape(var, unquote: true) Injection vs Transfer

DidactMacros · May 17, 2024, 11:16pm

I have a question regarding a code snippet from part 6 of a really great Macros guide by @sasajuric.

This question ranges from the use of Macro.escape’s unquote: true option to the disparate prioritisation of expansion of def calls compared to other macros.

deftraceable (as seen in the snippet) is a custom macro, so the arguments sent to it are quoted, creating nested quoting when it is called a def do block. I presume this nested quoting prevents the unquote on its arguments from taking immediate effect, which means what is passed to deftraceable should be the AST of the unquotes and their args.

Expansion of the deftraceable macro takes place before execution of its housing function, but this isn’t an issue since the macro itself does not do any evaluations on its inputs; however, after conclusion of expansion, bind_quoted will evaluate the arguments passed, and inject an AST describing and binding the results of evaluation to the stipulated variables.

Since the evaluation is going to yield the AST of the unquote and its arguments, the use of Macro.escape is employed to ensure correct AST form.

I’ve ascertained from the guide and the docs that unquote: true is necessary here because Macro.escape var, unquote: true is supposed to recognise where an unquote call AST is present in the evaluation and do a further eval by which it yields the result of the unquote call as the final AST to be injected.

The one small problem I’m having here is that I don’t know how to understand injection and transfer as distinct phenomena in the context described here.

Injecting the code vs transferring data

Another problem we’re facing is that the contents we’re passing from the macro to the caller’s context are by default injected, rather than transferred. So, whenever you do unquote(some_ast), you’re injecting one AST fragment into another one you’re building with a quote expression.

Occasionally, we want to transfer the data, instead of injecting it. Let’s see an example. Say we have some triplet, we want to transfer to the caller’s context…

def is a macro, so its arguments, including the do block, are quoted.

I was also a bit curious as to why, ostensibly, no such precaution (Macro.escape var, unquote: true) need be taken with use of unquote in passing arguments to def. I read something in the guide about def being the exception in the prioritisation of macro expansion, which I assume would mean that the unquote AST might get processed differently when def is involved in dynamic code generation? I would appreciate even a brief clarification on this, as I’ve tried looking at the def code directly and didn’t really get far.

deftraceable Snippet

defmodule Tracer do
  defmacro deftraceable(head, body) do
    # This is the most important change that allows us to correctly pass
    # input AST to the caller's context. I'll explain how this works a
    # bit later.
    quote bind_quoted: [
      head: Macro.escape(head, unquote: true),
      body: Macro.escape(body, unquote: true)
    ] do
      # Caller's context: we'll be generating the code from here

      # Since the code generation is deferred to the caller context,
      # we can now make our assumptions about the input AST.

      # This code is mostly identical to the previous version
      #
      # Notice that these variables are now created in the caller's context.
      {fun_name, args_ast} = Tracer.name_and_args(head)
      {arg_names, decorated_args} = Tracer.decorate_args(args_ast)

      # Completely identical to the previous version.
      head = Macro.postwalk(head,
        fn
          ({fun_ast, context, old_args}) when (
            fun_ast == fun_name and old_args == args_ast
          ) ->
            {fun_ast, context, decorated_args}
          (other) -> other
      end)

      # This code is completely identical to the previous version
      # Note: however, notice that the code is executed in the same context
      # as previous three expressions.
      #
      # Hence, the unquote(head) here references the head variable that is
      # computed in this context, instead of macro context. The same holds for
      # other unquotes that are occurring in the function body.
      #
      # This is the point of deferred code generation. Our macro generates
      # this code, which then in turn generates the final code.
      def unquote(head) do
        file = __ENV__.file
        line = __ENV__.line
        module = __ENV__.module

        function_name = unquote(fun_name)
        passed_args = unquote(arg_names) |> Enum.map(&inspect/1) |> Enum.join(",")

        result = unquote(body[:do])

        loc = "#{file}(line #{line})"
        call = "#{module}.#{function_name}(#{passed_args}) = #{inspect result}"
        IO.puts "#{loc} #{call}"

        result
      end
    end
  end

  # Identical to the previous version, but functions are exported since they
  # must be called from the caller's context.
  def name_and_args({:when, _, [short_head | _]}) do
    name_and_args(short_head)
  end

  def name_and_args(short_head) do
    Macro.decompose_call(short_head)
  end

  def decorate_args([]), do: {[],[]}
  def decorate_args(args_ast) do
    for {arg_ast, index} <- Enum.with_index(args_ast) do
      arg_name = Macro.var(:"arg#{index}", __MODULE__)

      full_arg = quote do
        unquote(arg_ast) = unquote(arg_name)
      end

      {arg_name, full_arg}
    end
    |> Enum.unzip
  end
end

dynamic code generation Snippet

defmodule Test do
  import Tracer

  fsm = [
    running: {:pause, :paused},
    running: {:stop, :stopped},
    paused: {:resume, :running}
  ]

  for {state, {action, next_state}} <- fsm do
    deftraceable unquote(action)(unquote(state)), do: unquote(next_state)
  end
  deftraceable initial, do: :running
end

josevalim · May 18, 2024, 7:54am

There is no exception to the rules happening here.

Originally, unquote was only allowed inside def, then if you wanted to generate functions dynamically, you would have to do something like:

for {key, value} <- [one: 1, two: 2, three: 3] do
  quote do
    def unquote(key), do: unquote(value)
  end
end
|> Module.eval_quoted()

That’s both not ergonomic and relies on eval. The idea of using Macro.escape(unquote: true) is that we can skip all of that and do:

for {key, value} <- [one: 1, two: 2, three: 3] do
  def unquote(key), do: unquote(value)
end

There are no changes to the order in which macros are expanded. def is expanded like everything else: it still receives AST, and it still emits AST. The only difference is that def chooses to traverse its AST in a way that keeps all of it as is (that’s what Macro.escape does), except for the unquote pieces.
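To make the second form concrete, here is a minimal runnable sketch of it. The module name Numbers is made up for illustration, and the explicit () after unquote(key) is only there to make the zero-arity definition unambiguous; otherwise this is the snippet above wrapped in a module.

```elixir
defmodule Numbers do
  # Module-level code: this comprehension runs while the module is compiled.
  for {key, value} <- [one: 1, two: 2, three: 3] do
    # def keeps its arguments as AST, except for the unquote pieces,
    # which are filled in with the current values of key and value.
    def unquote(key)(), do: unquote(value)
  end
end

IO.inspect(Numbers.one())
# → 1
IO.inspect(Numbers.three())
# → 3
```

No quote block and no Module.eval_quoted call are needed, which is the ergonomic gain described above.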

DidactMacros · May 18, 2024, 8:45am

Thanks for replying.

My understanding now is that def uses Macro.escape, and that using Macro.escape in the custom macro made it behave more like def, which therefore facilitated its use in dynamic code generation.

What does it mean to say that calls to def are evaluated subsequent to macro expansion? (This is what I took to mean that def has different prioritisation, but it seems I’m understanding this wrong)

Also I wanted to correct a typo in my OP:

I intended to say called in a def do block rather than called a def do block.

josevalim · May 18, 2024, 10:25am

100%! You can check the source (I am on my phone so no links), it is all Elixir/Erlang, but it calls a private function that expands and stores the function AST, so it is later used to emit the module byte code.

DidactMacros · May 18, 2024, 12:16pm

Thanks for that clarification.

Just to really confirm, def is expanded at the same phase as other macros, but with def the expanded AST is stored and used later. Which would mean that the below is exemplifying the order of expansion for all intents and purposes, due to actual exposure of the def AST’s effects via emittance of the module byte code?

Order of expansion

As you’d expect, the module-level code (the code that isn’t a part of any function) is evaluated in the expansion phase. Somewhat surprisingly, this will happen after all macros (save for def) have been expanded. It’s easy to prove this:

defmodule MyMacro do
  defmacro my_macro do
    IO.puts "my_macro called"
    nil
  end
end

defmodule Test do
  import MyMacro

  IO.puts "module-level expression"
  my_macro()
end

# Output:
my_macro called
module-level expression

See from the output how my_macro is called before IO.puts, even though the corresponding IO.puts call precedes the macro call. This proves that the compiler first resolves all “standard” macros. Then the module generation starts, and it is in this phase that module-level code, together with calls to def, is evaluated.

excerpt from Understanding Macros Part 6

josevalim · May 18, 2024, 5:21pm

Yes. And the behavior of def can be mirrored by anyone. The quickest way would be to define a macro that stores AST in module attributes and injects them in a @before_compile callback.
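A rough sketch of that idea follows. It is not how def itself is implemented; every name in it (Deferred, defdeferred, @deferred_defs, Greeter) is hypothetical and chosen only for illustration. The macro stores the escaped head and body in an accumulating module attribute, and the __before_compile__ callback turns the stored AST into real def calls.

```elixir
defmodule Deferred do
  # Hypothetical helper module for this sketch; not part of Elixir.
  defmacro __using__(_opts) do
    quote do
      import Deferred, only: [defdeferred: 2]
      Module.register_attribute(__MODULE__, :deferred_defs, accumulate: true)
      @before_compile Deferred
    end
  end

  # Instead of defining the function right away, store its AST.
  # unquote: true keeps unquote fragments live, as def does.
  defmacro defdeferred(head, body) do
    escaped_head = Macro.escape(head, unquote: true)
    escaped_body = Macro.escape(body, unquote: true)

    quote do
      @deferred_defs {unquote(escaped_head), unquote(escaped_body)}
    end
  end

  # Invoked just before the module is compiled: inject the stored definitions.
  defmacro __before_compile__(env) do
    defs =
      env.module
      |> Module.get_attribute(:deferred_defs)
      # Accumulated attributes come back newest first.
      |> Enum.reverse()
      |> Enum.map(fn {head, body} ->
        quote do
          def unquote(head), unquote(body)
        end
      end)

    {:__block__, [], defs}
  end
end

defmodule Greeter do
  use Deferred

  defdeferred hello(name), do: "hello " <> name

  for {fun, value} <- [one: 1, two: 2] do
    defdeferred unquote(fun)(), do: unquote(value)
  end
end

IO.puts(Greeter.hello("world"))
IO.inspect(Greeter.one())
```

The definitions only become functions when the callback runs, which mirrors the "expanded now, stored, used later" behaviour discussed above.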

DidactMacros · May 18, 2024, 8:06pm

Thanks! That’s my query resolved. Really appreciate it.

Regarding my question with respect to Macro.escape and the understanding of AST Injection vs Data Transfer, that’s solved also.

I now understand that the distinction between data transfer and injection was represented by the fact that data transfer (via Macro.escape var, unquote: true) evaluates then makes sure that the AST is correct, while injection (via unquote) only evaluates then injects the result of evaluation by inserting without validating form, and quote does not evaluate so the data is lost if it is not described as belonging to a variable somewhere in the AST block or in the recipient context.

Thanks a lot for all your help, about ready to build a smallish macro utilising program and move on to the next chapter.

God bless.

josevalim · May 19, 2024, 6:17am

One last clarification is that unquote or escape do not evaluate per se. Unquote keeps the AST as is, so it is evaluated when the code executes. Escape will modify the AST so that, when it is executed, it returns itself. Escape with the unquote option escapes everything, except the bits inside unquote.
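A small sketch of the escape behaviour described above, using only Macro.escape and Code.eval_quoted. The variable names are made up for illustration, and the unquote call is written out by hand as AST.

```elixir
# A 3-element tuple is not valid AST on its own. Macro.escape/1 returns AST
# that rebuilds the tuple when it is evaluated.
triplet = {:a, :b, :c}
escaped = Macro.escape(triplet)
IO.inspect(escaped)
# → {:{}, [], [:a, :b, :c]}

{rebuilt, _binding} = Code.eval_quoted(escaped)
IO.inspect(rebuilt == triplet)
# → true

# The AST of an unquote(1) call, written out by hand.
fragment = {:unquote, [], [1]}

# Without the option, the unquote call is escaped like any other tuple.
plain = Macro.escape(fragment)
IO.inspect(plain)
# → {:{}, [], [:unquote, [], [1]]}

# With unquote: true, the contents of the unquote are left unescaped.
with_unquote = Macro.escape(fragment, unquote: true)
IO.inspect(with_unquote)
# → 1
```

Nothing is evaluated by Macro.escape itself; the evaluation only happens in the explicit Code.eval_quoted call.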

DidactMacros · May 19, 2024, 6:39am

Ah, thanks a lot. I’ll make sure to remember that.

I’ve noticed that with macros these distinctions can be real gotchas when you’re trying to keep track of everything.