Compile Elixir module directly to BEAM bytecode without loading it into memory

I have a system that has thousands of run-time generated modules, some of which I want to cache for future reloading.

Ordinarily, the BEAM bytecode for a runtime generated module can be captured like so (How to read the BEAM bytecode compiled into memory and that is not on the disk?):

{:module, ^module_name, binary_contents, _} =
  module_definition =
  defmodule module_name do
    unquote(contents)
  end

The bytecode can then be saved to disk, and the module can later be loaded into memory with :code.load_binary or similar (Erlang -- code).
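As a rough sketch of that save-and-reload step: the binary returned by defmodule could be written to a file and later handed to :code.load_binary/3. The BeamCache module, its function names, and the cache directory are hypothetical and do not come from the thread.

```elixir
defmodule BeamCache do
  # Hypothetical helper, named here only for illustration.

  # Writes the bytecode captured from `defmodule` to `dir` and returns the path.
  def save(module, binary, dir) when is_atom(module) and is_binary(binary) do
    File.mkdir_p!(dir)
    # For an Elixir module, the atom renders as "Elixir.MyModule".
    path = Path.join(dir, "#{module}.beam")
    File.write!(path, binary)
    path
  end

  # Reads a cached .beam file and loads it into the running VM.
  # Note: this is a normal code load, so the usual hot-reloading rules apply.
  def load(module, path) when is_atom(module) do
    binary = File.read!(path)
    # :code.load_binary/3 expects the file name as a charlist.
    {:module, ^module} = :code.load_binary(module, String.to_charlist(path), binary)
    module
  end
end
```

Calling BeamCache.save(module_name, binary_contents, "some/cache/dir") right after the defmodule above, and BeamCache.load/2 in a later session, would be one way to wire it in.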

However, the issue with using defmodule and then capturing the output is that, because of Erlang’s hot reloading rules, loading successive versions of a module can cause processes that are still running old code to be killed. (Erlang -- code)

So what I want to be able to do is create a BEAM file for a module without loading it into memory.

I see the beginnings of an answer here: Getting each stage of Elixir's compilation all the way to the BEAM bytecode but I’m not sure how to implement what I need using this information.

I am not sure how to create an arbitrary function that will allow me to compile Elixir source code (as string or AST) directly into BEAM without affecting running processes at all.

Does anyone know how to do this? I have been scouring the Elixir source code (elixir/lib/elixir/src/elixir_module.erl at 11a493ec4a07479a74e605b409dd16e23bb5cafe · elixir-lang/elixir · GitHub) to try to understand how this may be done, but I am struggling to get the answer I need.

P.S. Yes I know that when the BEAM file is loaded manually, it will still potentially cause processes to die, but I don’t think that this is avoidable.

You can pass @compile {:autoload, false} in the module body.

@josevalim
Doesn’t that merely delay the issue? Per the docs:

  • @compile {:autoload, false} - disables automatic loading of modules after compilation. Instead, the module will be loaded after it is dispatched to

So that will potentially cause processes to be killed when the module is dispatched to? I assume “dispatched to” means that functions from it are called?

If there is a module already loaded, it won’t load the one that was just compiled. Plus, if the bytecode of the module just compiled was not written anywhere in a code path on disk, then it can’t be found and it won’t be loaded either.
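Putting the suggestion together, a minimal sketch might inject @compile {:autoload, false} into the generated module body and keep only the returned binary. The CompileOnly module and its to_beam/2 function are hypothetical names. The use of Module.create/3 here is an assumption about how the module gets built, not something stated in the thread.

```elixir
defmodule CompileOnly do
  # Hypothetical helper, named here only for illustration.

  # Compiles `contents` (quoted code) into a module called `module_name`
  # and returns the BEAM bytecode as a binary.
  def to_beam(module_name, contents) when is_atom(module_name) do
    body =
      quote do
        # Ask the compiler not to load the module after compiling it.
        @compile {:autoload, false}
        unquote(contents)
      end

    # Module.create/3 returns the same tuple shape as defmodule.
    {:module, ^module_name, binary, _} =
      Module.create(module_name, body, Macro.Env.location(__ENV__))

    binary
  end
end
```

The returned binary can then be written to a cache location that is not in the code path, so that, per the answer above, nothing picks it up until it is loaded explicitly.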