Compiling files with many function heads is very slow (OTP 26 issue?)

The linked GitHub issue already discusses the optimization and possible fixes. It is about liveness analysis of variables, which can be used for several things, such as performing destructive updates, decrementing reference counters, etc.

1 Like

It can also be used for generating more optimal instructions that don't need to check their arguments, because the compiler already knows their types. It is quite cunning really: we are doing type handling in our dynamically typed language.

The dynamic typing will not go away because of this.

3 Likes

Blast from the past. The number one issue here is that the replace function is public: exporting functions not only increases compile time, it also significantly reduces runtime performance, as the compiler cannot do its job.

I remember José joking that introducing nil was his billion-dollar mistake, a common software-engineering joke, but in all honesty I've come to believe the real mistake is defaulting to exported functions; in Erlang exporting is an explicit action, as all functions are private by default.

You may be misremembering or misquoting it. The phrase above is attributed to Tony Hoare (and it also relates to implicit nullables), not to me.

Also, I am not sure private/public functions are the number one cause of high compile times. It may be in some cases, but it has never been the issue in my experience. So we need to see examples and their frequency to understand how well it generalizes. Also don't forget that compilers are always evolving, so this could be something tackled/improved too.

3 Likes

Thread too long to parse in the early morning, but I remember solving issues with lots of function heads by switching to a case expression.

so instead of


def funct(:a), do: bla
def funct(:b), do: hum

It’s now

def funct(arg1) do
  case arg1 do
    :a -> bla
    :b -> hum
  end
end

Yes, if you are meta-programming code, that can indeed have an impact, since function clauses need to do additional work, such as processing attributes, checking for overridables, and whatnot, but I would expect it to matter only in cases where hundreds of function heads are defined programmatically.

There are a few different ways of improving it too. IIRC, Phoenix Router fixes it by wrapping the function definition in an anonymous function, so it is preexpanded:

route = fn method, path ->
  def match(unquote(method), unquote(path), do: …
end
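To make the shape of that trick concrete, here is a self-contained sketch. Everything in it is invented for illustration (the `TinyRouter` module, its route list, and the `defroute` name); it is not Phoenix's actual implementation, just the same pattern: the `def` call site inside the anonymous function is expanded once, and each invocation of the function supplies new values for the unquote fragments.

```elixir
defmodule TinyRouter do
  # Made-up route data standing in for real route definitions.
  routes = [{"GET", "/"}, {"GET", "/users"}, {"POST", "/users"}]

  # The quoted def exists at a single call site; calling the anonymous
  # function reuses it with fresh unquote-fragment values.
  defroute = fn method, path ->
    def match(unquote(method), unquote(path)), do: {:ok, unquote(path)}
  end

  for {method, path} <- routes do
    defroute.(method, path)
  end

  # Catch-all clause for unmatched routes.
  def match(_method, _path), do: :error
end

TinyRouter.match("GET", "/users")
TinyRouter.match("DELETE", "/nope")
```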
4 Likes

This trick can actually save a lot of time. One system I used to maintain went from 2 min down to 20 s of compilation time on the same machine, because it was generating routes from YAML files metaprogrammatically.

1 Like

That’s very interesting given that ex_cldr and friends generate a lot of function heads at compile time. How would you suggest I apply this technique to (quite a lot of) code I have in the following form:

def define_backend_functions(config) do
  backend = config.backend

  quote location: :keep, bind_quoted: [config: Macro.escape(config), backend: backend] do
    for locale_name <- Cldr.Locale.Loader.known_locale_names(config) do
      delimiters = delimiters_for_locale(locale_name)

      defp quote_marks_for(unquote(locale_name)) do
        unquote(Macro.escape(delimiters))
      end
    end
  end
end
1 Like

If the function heads and return values are simple, wouldn't there be something we could do in the compiler to optimize that? @kip's example is a good one, since it looks like it should compile down to a simple lookup table or something.

Have you tried the case instead of function heads approach? The rewrite is easy, the speedup considerable.

Routex can generate a few helper functions per route for more than 2000 routes (I just stopped benching), while before it would trip over fewer than 200 routes due to compilation limits.[1] Not sure if it's faster than José's suggestion though :slight_smile:

Ps. Or use :persistent_term? It seems a fitting use case, as I see no computations for the key and it's load-once, read-many global state. One can even (re)load more locale data at runtime.

[1] the extension for legacy support of good old route helpers is/was the culprit. Blessed be Verified Routes.
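A minimal sketch of that :persistent_term idea. The module name and delimiter data are invented for illustration (the real data would come from the CLDR locale files): load the data once into :persistent_term, then read it with a cheap lookup that does not copy the term onto the caller's heap.

```elixir
defmodule QuoteMarks do
  # Hypothetical delimiter data; ex_cldr would load this from locale files.
  @delimiters %{
    "en" => %{quotation_start: "\u201C", quotation_end: "\u201D"},
    "fr" => %{quotation_start: "\u00AB", quotation_end: "\u00BB"}
  }

  # Load once, e.g. at application start. Writes are expensive
  # (they trigger a global scan), so keep them rare.
  def load do
    Enum.each(@delimiters, fn {locale, delimiters} ->
      :persistent_term.put({__MODULE__, locale}, delimiters)
    end)
  end

  # Reads are constant-time and copy-free, replacing the generated
  # one-function-head-per-locale lookup.
  def quote_marks_for(locale) do
    :persistent_term.get({__MODULE__, locale}, nil)
  end
end

QuoteMarks.load()
QuoteMarks.quote_marks_for("fr")
```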



Yes.

But I have 17 libraries and many many constructs like that. So I’d like to see if there is an answer to the question I asked.

I am rewriting ex_cldr into a mono-repo localize library that will start to emerge in 2026. It will use :persistent_term (which didn’t exist when I wrote ex_cldr).

4 Likes
def define_backend_functions(config) do
  # I didn't see backend used in the macros shared
  backend = config.backend

  inner_def = fn locale_name, delimiters ->
    quote do
      delimiters = unquote(delimiters)
      locale_name = unquote(locale_name) # I think one of these 2 unquotes is un-needed, maybe both

      defp quote_marks_for(unquote(locale_name)) do
        unquote(Macro.escape(delimiters))
      end
    end
  end

  # inner_def takes two arguments, so look up the delimiters per locale
  asts =
    for locale_name <- Cldr.Locale.Loader.known_locale_names(config) do
      inner_def.(locale_name, delimiters_for_locale(locale_name))
    end

  {:__block__, [], asts}
end

I think it would be something like this in the end. I haven’t run the code, but this is the general feeling of “let’s force the macro to expand only once inside the anonymous function”.

I might be wrong, but I think the trick also requires that var!(name, scope) call to ensure the function is defined only once per compilation

3 Likes

Thank you very much @polvalente, greatly appreciated. Will give that a whirl.

2 Likes

That’s exactly the technique. Define an anonymous function on use and then use it later on. The var! is not doing anything special though, it is just a mechanism to reuse vars across macros.

And yes, the benefit in this case is that it expands macros once (and the impact will be bigger if you have macros that expand macros that expand to a def/defp).

Although I think in your case you are already reaping those benefits, because you only emit defp once (it is wrapped in a for, but it is expanded only once anyway). Unless you are calling define_backend_functions several times over and over? Then you may see a benefit. In other words, the goal is to reduce the amount of quoted code you inject.

3 Likes

Wrote 3 simple examples to benchmark compile times. They perform similarly without the guards, but… it seems the Case variant is immune to adding guards, while the others start to slow down.

Of course the benchmark is not realistic. Kept it simple, no macro in macro.

test1.ex compiled in 3203.686 ms → Modules: [Anon]
test2.ex compiled in 1394.32 ms → Modules: [Case]
test3.ex compiled in 3040.71 ms → Modules: [Heads]

ps. Yes, naming variables is hard :wink:

Snippets:
Case

defmodule Case do
  case_ast =
    for i <- 1..10000 do
      {:->, [], [[{i, i}], i]}
    end

  def foo(x, y) when is_integer(x) and is_integer(y) do
    case {x, y} do
      unquote(case_ast)
    end
  end
end

Heads

defmodule Heads do
  for i <- 1..10000 do
    def foo(z = unquote(i), a = unquote(i)) when is_integer(z) and is_integer(a), do: unquote(i)
  end
end

Anon function

defmodule Anon do
  fun = fn x, y ->
    def foo(z = unquote(x), a = unquote(y)) when is_integer(z) and is_integer(a), do: unquote(y)
  end

  for i <- 1..10000 do
    fun.(i, i)
  end
end
5 Likes

~What happens when you don't use guards and just pattern match? Guards can certainly slow compilation times in my experience, and they also perform worse than pattern matching.~

Nevermind, I missed: "They perform similarly without the guards but…".

Thinking a bit more about this, I wonder if the compiler eliminates redundant guards. The case version is not the same as the others; I believe it would preserve the guards, and that's why you're seeing no difference with guards.

Maybe it's late and I need some sleep, but I see no difference from the others (functionality-wise). There is, obviously, less to check though.

AFAIK the two others would be compiled down to your case example but with the guards, so they become similar to your tests, which all contained guards. I hope I explained myself clearly.

Although I believe the compiler should do a better job and not just throw guards around when they are clearly not necessary.

To be clear, there are two (potentially more) discussions going on. There is one about macros/quote, which can be summarized as the difference between:

defmacro fast do
  quote bind_quoted: [] do
    for i <- 1..10000 do
      def foo(z = unquote(i), a = unquote(i)) when is_integer(z) and is_integer(a), do: unquote(i)
    end
  end
end

and

defmacro slow do
  for i <- 1..10000 do
    quote do
      def foo(z = unquote(i), a = unquote(i)) when is_integer(z) and is_integer(a), do: unquote(i)
    end
  end
end

(i.e., the difference between expanding once vs. expanding 10k times). And in your example, anon and heads are virtually the same, because in both scenarios def is expanded only once.

And now there is another discussion, about def vs case. However, I would say your examples are not apples to apples. In one you generate 10k guards (def), in the other one you generate only one. Even if at the end they boil down to the same thing (the compiler is smart enough), in one you are generating waaaay less code than the others.

4 Likes

This would be a more apples to apples version:

defmodule Case do
  case_ast =
    for i <- 1..10000 do
      hd(
        quote do
          {x = unquote(i), y = unquote(i)} when is_integer(x) and is_integer(y) -> y
        end
      )
    end

  def foo(x, y) do
    case {x, y} do
      unquote(case_ast)
    end
  end
end

Now they all have to go through the same amount of guards and the end result is roughly the same on my machine. I also could not spot any difference between public and private.

5 Likes