Note: There are a few folks I’d really love to hear from, time permitting. Pinging in case the title isn’t catchy enough @dorgan @scohen @scottming @mhanberg @lukaszsamson @zachdaniel
Proof-of-concept on GitHub: zachallaun/lib_elixir
Motivation
There has been an ongoing effort in Elixir to add or improve APIs to make code analysis easier. Primarily, this has been done to support developer tools, namely language servers, allowing them to use the same mechanisms that Elixir itself uses to analyze code.
The first major fruits of this effort shipped in Elixir 1.17 in the shape of new APIs in Macro.Env, but useful improvements have been added to Code and Macro continuously, and there are a number of open issues under discussion that could lead to further improvements.
The problem, then, becomes accessing these new improvements while still supporting older versions of Elixir. Existing libraries handle this in different ways:
- Sourceror vendors in parts of Code (and related Erlang source) related to formatting in order to support formatting in versions prior to Elixir 1.13.
- Lexical vendors in Code and parts of Macro (and related Erlang source) in order to parse and analyze source code in an environment that doesn’t interfere with user project dependencies.
- Next LS bundles the latest version of Elixir and uses that to compile and analyze user code.
These methods have significant trade-offs. Vendoring in code is a time-consuming, manual, and potentially buggy process, as modules have to be copied in and namespaced so that they don’t conflict with the runtime. Bundling Elixir means user code is compiled in a different environment from the one it will run in in production, which can cause spurious warnings or other subtle differences.
Ultimately, the problem is that Elixir is a shared dependency that library authors do not control.
Elixir as a library
I’ve been experimenting with a new library that allows for the Elixir standard library to be included as a dependency in a way that does not conflict with the runtime version of Elixir.
The idea is to allow library authors to replace their usage of standard library modules with namespaced ones. For the following examples, I’ll use Spitfire, which uses features of Macro.Env that were introduced in Elixir 1.17 (see here). Here’s an example of how Spitfire might use this:
 defmodule Spitfire.Env do
   @moduledoc """
   Environment querying
   """
+  alias Spitfire.LibElixir.Macro, as: Macro
   @env %{
-    Macro.Env.prune_compile_info(__ENV__)
+    Macro.Env
+    |> struct(Map.from_struct(__ENV__))
+    |> Macro.Env.prune_compile_info()
     | line: 0,
       file: "nofile",
       module: nil,
       function: nil,
       context_modules: []
   }
   defp env, do: @env
   ...
 end
This would allow Spitfire to support versions of Elixir earlier than 1.17. (More on that below when I discuss challenges, but I think 1.15+.)
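The struct(...) call in the diff is what lets a runtime %Macro.Env{} be handed to the namespaced module. Here’s a minimal sketch of that conversion using two stand-in structs. RuntimeEnv and NamespacedEnv are made up for illustration and are not part of lib_elixir or Spitfire; the extra fields are only there to show what happens when the two versions disagree about the struct’s shape.

```elixir
# Hypothetical stand-ins: RuntimeEnv plays the role of the runtime's Macro.Env,
# NamespacedEnv the role of Spitfire.LibElixir.Macro.Env from another version.
defmodule RuntimeEnv do
  defstruct file: "nofile", line: 0, aliases: [], only_in_runtime: true
end

defmodule NamespacedEnv do
  defstruct file: "nofile", line: 0, aliases: [], only_in_namespaced: :default
end

# Built with struct!/2 at runtime so the script does not depend on
# compile-time struct expansion for a module defined in the same file.
runtime_env =
  struct!(RuntimeEnv, file: "lib/foo.ex", line: 12, aliases: [{Qux, Foo.Bar.Baz}])

# Map.from_struct/1 drops the :__struct__ key. struct/2 then copies the fields
# both structs share, ignores keys the target does not define, and leaves
# fields that are missing from the source at their defaults.
namespaced_env = struct(NamespacedEnv, Map.from_struct(runtime_env))

IO.inspect(namespaced_env.__struct__)
IO.inspect(namespaced_env.aliases)
```

The upside is that the conversion doesn’t crash when fields differ between versions. The downside is that it silently drops or defaults them, so any field whose meaning changed between versions would need to be handled by hand.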
Namespacing
So, how do we compile a specific version of the Elixir standard library and then use it as in Spitfire.LibElixir.Macro.Env?
The strategy is the same one used by Lexical to ensure that its dependencies don’t conflict with user dependencies at runtime. We call it namespacing. (Hat tip @scohen, who came up with this for Lexical.)
Here’s the gist of it:
- Compile your Elixir and Erlang modules to bytecode: .app and .beam files.
- Read them in as Abstract Forms using :beam_lib.chunks(path, [:abstract_code]).
- Walk the abstract code, rewriting module names to their namespaced counterparts:
  Code -> Spitfire.LibElixir.Code
  :elixir_tokenizer -> :spitfire_lib_elixir_tokenizer
- Recompile the modified abstract forms using :compile.forms(...), writing the resulting binary out to a new .beam:
  Elixir.Code.beam -> Elixir.Spitfire.LibElixir.Code.beam
  elixir_tokenizer.beam -> spitfire_lib_elixir_tokenizer.beam
- Do something similar with the .app:
  elixir.app -> spitfire_lib_elixir.app
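To make the read/rewrite/recompile steps concrete, here’s a rough sketch on a toy module. This is not the lib_elixir or Lexical implementation: NamespaceSketch, Demo.Greeter, and the hard-coded rename table are all hypothetical, it works on in-memory binaries rather than files, and it assumes the module was compiled with debug info (the default) so that abstract code can be recovered.

```elixir
defmodule NamespaceSketch do
  # Hypothetical rename table, hard-coded for this sketch only.
  @renames %{Demo.Greeter => Namespaced.Demo.Greeter}

  # Takes the bytes of a .beam file and returns {new_module, new_beam_bytes}.
  def namespace(beam) when is_binary(beam) do
    # Assumes debug info is present; otherwise :abstract_code comes back
    # as :no_abstract_code and this match fails.
    {:ok, {_module, [abstract_code: {:raw_abstract_v1, forms}]}} =
      :beam_lib.chunks(beam, [:abstract_code])

    {:ok, new_module, new_beam} = :compile.forms(rewrite(forms), [:return_errors])
    {new_module, new_beam}
  end

  # Naive deep walk: replaces every atom found in the rename table, wherever
  # it appears. A real implementation would need to be more careful about
  # atoms that merely look like module names.
  defp rewrite(atom) when is_atom(atom), do: Map.get(@renames, atom, atom)
  defp rewrite(list) when is_list(list), do: Enum.map(list, &rewrite/1)

  defp rewrite(tuple) when is_tuple(tuple) do
    tuple |> Tuple.to_list() |> rewrite() |> List.to_tuple()
  end

  defp rewrite(other), do: other
end

# Compile a tiny module in memory; defmodule returns its .beam bytes.
{:module, Demo.Greeter, original_beam, _} =
  defmodule Demo.Greeter do
    def hello(name), do: "Hello, " <> name
    def me, do: __MODULE__
  end

{new_module, new_beam} = NamespaceSketch.namespace(original_beam)

# Load the namespaced copy alongside the original.
{:module, ^new_module} = :code.load_binary(new_module, ~c"nofile", new_beam)

IO.inspect(new_module)
IO.inspect(apply(new_module, :hello, ["world"]))
IO.inspect(apply(new_module, :me, []))
```

Both Demo.Greeter and the namespaced copy end up loaded at the same time, which is the whole point: the rewritten module refers to itself by its new name, so it doesn’t collide with the original.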
There’s a bit more to it, but this isn’t a hypothetical:
~/dev/forks/spitfire main*
> iex -S mix run --no-compile
Erlang/OTP 25 [erts-13.2.2.10] [source] [64-bit] [smp:32:32] [ds:32:32:10] [async-threads:1] [jit:ns]
Interactive Elixir (1.15.8) - press Ctrl+C to exit (type h() ENTER for help)
iex(1)> alias Foo.Bar.Baz, as: Qux
Foo.Bar.Baz
iex(2)> env = struct(Spitfire.LibElixir.Macro.Env, Map.from_struct(__ENV__))
%Spitfire.LibElixir.Macro.Env{
aliases: [],
...
}
iex(3)> Spitfire.LibElixir.Macro.Env.expand_alias(env, [], [:Qux])
{:alias, Foo.Bar.Baz}
Challenges and open questions
At the moment, this is just a proof-of-concept and there’s a lot left to figure out.
When and how to compile?
The current proof-of-concept library uses a Mix compiler that downloads an Elixir archive from GitHub, compiles only the stdlib (make erlang app stdlib), namespaces the resulting *.beam files and app, and then sticks them in _build/dev/lib/lib_elixir/ebin.
defmodule Spitfire.MixProject do
  ...
  def project do
    [
      ...,
      lib_elixir: [{Spitfire.LibElixir, "v1.17.2"}]
    ]
  end
  ...
  defp deps do
    [
      ...,
      {:lib_elixir, path: "..."}
    ]
  end
end
This almost works, but not quite. Something’s causing protocol consolidation to fail:
14:17:49.007 [error] Task #PID<0.1890.0> started from #PID<0.94.0> terminating
** (FunctionClauseError) no function clause matching in Spitfire.LibElixir.List.Chars.Spitfire.LibElixir.Atom."-inlined-__impl__/1-"/1
Spitfire.LibElixir.List.Chars.Spitfire.LibElixir.Atom."-inlined-__impl__/1-"(:target)
(elixir 1.15.8) lib/protocol.ex:679: Protocol.each_struct_clause_for/3
(elixir 1.15.8) lib/enum.ex:1693: Enum."-map/2-lists^map/1-1-"/2
(elixir 1.15.8) lib/protocol.ex:657: Protocol.change_struct_impl_for/4
(elixir 1.15.8) lib/protocol.ex:619: Protocol.change_debug_info/3
(elixir 1.15.8) lib/protocol.ex:570: Protocol.consolidate/2
(mix 1.15.8) lib/mix/tasks/compile.protocols.ex:140: Mix.Tasks.Compile.Protocols.consolidate/4
(elixir 1.15.8) lib/task/supervised.ex:101: Task.Supervised.invoke_mfa/2
Function: #Function<9.26660727/0 in Mix.Tasks.Compile.Protocols.consolidate/6>
Args: []
I’m not yet sure why this happens. If you compile with mix compile --no-protocol-consolidation and then hop in iex with iex -S mix run --no-compile, it succeeds for Spitfire.LibElixir.List.Chars.Spitfire.LibElixir.Atom.__impl__(:for), but not for :target.
Anyways, I’m not sure whether this is necessarily the right direction and am open to suggestions.
How much to compile/include?
Right now, all of Elixir’s stdlib is being namespaced and included. An alternative might be to whitelist certain modules, like Code, Macro, Module, etc. that are likely useful to library authors.
This might solve the protocol consolidation issue above, but it could also lead to subtle or difficult-to-find bugs when a namespaced module calls into a non-namespaced module expecting certain behavior.
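As a sketch of what an allowlist-based rename decision might look like (AllowlistSketch is hypothetical, and treating nested modules like Macro.Env as covered by their parent is my assumption, not something lib_elixir does today):

```elixir
defmodule AllowlistSketch do
  # Hypothetical allowlist and namespace, taken from the examples in this post.
  @allowlist [Code, Macro, Module]
  @namespace Spitfire.LibElixir

  # Returns the namespaced name for allowed Elixir modules and leaves
  # everything else (including Erlang modules) untouched in this sketch.
  def namespaced(module) when is_atom(module) do
    if elixir_module?(module) and allowed?(module) do
      Module.concat(@namespace, module)
    else
      module
    end
  end

  defp elixir_module?(module) do
    String.starts_with?(Atom.to_string(module), "Elixir.")
  end

  # Assumption: a module is allowed if it, or one of its parents, is listed,
  # so Macro.Env is covered by Macro.
  defp allowed?(module) do
    parts = Module.split(module)

    Enum.any?(@allowlist, fn allowed ->
      List.starts_with?(parts, Module.split(allowed))
    end)
  end
end

IO.inspect(AllowlistSketch.namespaced(Macro.Env))
# => Spitfire.LibElixir.Macro.Env
IO.inspect(AllowlistSketch.namespaced(Enum))
# => Enum
IO.inspect(AllowlistSketch.namespaced(:elixir_tokenizer))
# => :elixir_tokenizer
```

The last two results are exactly the risk: a namespaced Macro from one Elixir version would end up calling the runtime’s Enum and :elixir_tokenizer from another.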
Version compatibility
The exact format of the data in *.beam files may change from version to version, but this strategy relies on files compiled on one version being loadable on another. I already found some incompatibility related to binaries that changed in 1.15, meaning that LibElixir 1.17.2 won’t run on any Elixir earlier than 1.15. This creates “windows of compatibility” that would need to be kept track of.
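One way those windows might be tracked is a table from the bundled version to a requirement on the runtime, checked with Version.match?/2. CompatSketch and the table format are hypothetical; the single entry is just the 1.17.2 / 1.15 case mentioned above.

```elixir
defmodule CompatSketch do
  # Hypothetical table: bundled LibElixir version => requirement on the
  # runtime Elixir version. Only the one known case is listed.
  @windows %{"1.17.2" => ">= 1.15.0"}

  # Returns true/false for known versions and :unknown for anything
  # that has not been checked yet.
  def compatible?(lib_version, runtime_version \\ System.version()) do
    case Map.fetch(@windows, lib_version) do
      {:ok, requirement} -> Version.match?(runtime_version, requirement)
      :error -> :unknown
    end
  end
end

IO.inspect(CompatSketch.compatible?("1.17.2", "1.15.8"))
# => true
IO.inspect(CompatSketch.compatible?("1.17.2", "1.14.5"))
# => false
IO.inspect(CompatSketch.compatible?("1.16.0", "1.15.8"))
# => :unknown
```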
Is this even a good idea?
This is the final question. Is this generally useful and worth the effort? Are there gotchas I’m missing?
Any feedback greatly appreciated.