zachallaun
Working on lib_elixir - Elixir core modules as a library
Note: There are a few folks I’d really love to hear from, time permitting. Pinging in case the title isn’t catchy enough
@dorgan @scohen @scottming @mhanberg @lukaszsamson @zachdaniel
Proof-of-concept on GitHub: zachallaun/lib_elixir
Motivation
There has been an ongoing effort in Elixir to add or improve APIs to make code analysis easier. Primarily, this has been done to support developer tools, namely language servers, allowing them to use the same mechanisms that Elixir itself uses to analyze code.
The first major fruits of this effort shipped in Elixir 1.17 in the shape of new APIs in Macro.Env, but useful improvements have been added to Code and Macro continuously, and there are a number of open issues under discussion that could lead to further improvements.
The problem, then, becomes accessing these new improvements while still supporting older versions of Elixir. Existing libraries handle this in different ways:
- Sourceror vendors in parts of Code (and related Erlang source) related to formatting in order to support formatting in versions prior to Elixir 1.13.
- Lexical vendors in Code and parts of Macro (and related Erlang source) in order to parse and analyze source code in an environment that doesn’t interfere with user project dependencies.
- Next LS bundles the latest version of Elixir and uses that to compile and analyze user code.
These methods have significant trade-offs. Vendoring in code is a time-consuming, manual, and potentially buggy process, as modules have to be copied in and namespaced so that they don’t conflict with the runtime. Bundling Elixir requires user code to be compiled in a different environment than that code will be run in production, which can cause spurious warnings or other subtle differences.
Ultimately, the problem is that Elixir is a shared dependency that library authors do not control.
Elixir as a library
I’ve been experimenting with a new library that allows for the Elixir standard library to be included as a dependency in a way that does not conflict with the runtime version of Elixir.
The idea is to allow library authors to replace their usage of standard library modules with namespaced ones. For the following examples, I’ll use Spitfire, which uses features of Macro.Env that were introduced in Elixir 1.17 (see here). Here’s an example of how Spitfire might use this:
  defmodule Spitfire.Env do
    @moduledoc """
    Environment querying
    """
+   alias Spitfire.LibElixir.Macro, as: Macro
    @env %{
-     Macro.Env.prune_compile_info(__ENV__)
+     Macro.Env
+     |> struct(Map.from_struct(__ENV__))
+     |> Macro.Env.prune_compile_info()
      | line: 0,
        file: "nofile",
        module: nil,
        function: nil,
        context_modules: []
    }
    defp env, do: @env
    ...
  end
This would allow Spitfire to support versions of Elixir earlier than 1.17. (More on that below when I discuss challenges, but I think 1.15+.)
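The struct conversion in the diff above (Map.from_struct followed by struct/2) can be tried in isolation. This is only a sketch: the two Sketch.*.Env modules are hypothetical stand-ins for Macro.Env and its namespaced copy, and their fields are made up for illustration.

```elixir
# Two hypothetical structs with the same fields, standing in for
# Macro.Env and a namespaced copy such as Spitfire.LibElixir.Macro.Env.
defmodule Sketch.Runtime.Env do
  defstruct file: "nofile", line: 0, aliases: []
end

defmodule Sketch.Namespaced.Env do
  defstruct file: "nofile", line: 0, aliases: []
end

defmodule Sketch.Convert do
  # Map.from_struct/1 drops the :__struct__ key; struct/2 then builds the
  # target struct, copying every key the target module defines.
  def to_namespaced(source, target_module) do
    struct(target_module, Map.from_struct(source))
  end
end

runtime_env = struct(Sketch.Runtime.Env, file: "lib/a.ex", line: 12)
converted = Sketch.Convert.to_namespaced(runtime_env, Sketch.Namespaced.Env)
IO.inspect(converted.__struct__)
```

Note that struct/2 silently discards keys the target struct does not define and fills missing ones with defaults, so fields that differ between the two versions would need separate handling.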
Namespacing
So, how do we compile a specific version of the Elixir standard library and then use it as in Spitfire.LibElixir.Macro.Env?
The strategy is the same one used by Lexical to ensure that its dependencies don’t conflict with user dependencies at runtime. We call it namespacing. (Hat tip @scohen, who came up with this for Lexical.)
Here’s the gist of it:
- Compile your Elixir and Erlang modules to bytecode: .app and .beam files.
- Read them in as Abstract Forms using :beam_lib.chunks(path, [:abstract_code]).
- Walk the abstract code, rewriting module names to their namespaced counterparts:
  Code -> Spitfire.LibElixir.Code
  :elixir_tokenizer -> :spitfire_lib_elixir_tokenizer
- Recompile the modified abstract forms using :compile.forms(...), writing the resulting binary out to a new .beam:
  Elixir.Code.beam -> Elixir.Spitfire.LibElixir.Code.beam
  elixir_tokenizer.beam -> spitfire_lib_elixir_tokenizer.beam
- Do something similar with the .app:
  elixir.app -> spitfire_lib_elixir.app
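To make the rewrite-and-recompile steps concrete, here is a minimal sketch. It is not the code lib_elixir or Lexical uses. It works on hand-written abstract forms for a hypothetical module :greeter instead of forms read from disk with :beam_lib.chunks, and it only renames the module attribute and remote calls. A real implementation would have to cover many more places where module names can appear.

```elixir
defmodule NamespaceSketch do
  # Abstract forms for a hypothetical Erlang module `greeter` that exports
  # hello/0 (returns :world) and shout/0 (calls greeter:hello()).
  def example_forms do
    [
      {:attribute, 1, :module, :greeter},
      {:attribute, 1, :export, [hello: 0, shout: 0]},
      {:function, 2, :hello, 0, [{:clause, 2, [], [], [{:atom, 2, :world}]}]},
      {:function, 3, :shout, 0,
       [
         {:clause, 3, [], [],
          [{:call, 3, {:remote, 3, {:atom, 3, :greeter}, {:atom, 3, :hello}}, []}]}
       ]}
    ]
  end

  # Walks the forms and renames modules found in `renames` (old => new).
  def rewrite(forms, renames), do: Enum.map(forms, &walk(&1, renames))

  # The -module(...) attribute itself.
  defp walk({:attribute, anno, :module, name}, renames),
    do: {:attribute, anno, :module, Map.get(renames, name, name)}

  # Remote calls with a literal module, e.g. greeter:hello().
  defp walk({:remote, anno, {:atom, mod_anno, mod}, fun}, renames),
    do: {:remote, anno, {:atom, mod_anno, Map.get(renames, mod, mod)}, walk(fun, renames)}

  # Everything else: recurse through tuples and lists, leave leaves alone.
  defp walk(tuple, renames) when is_tuple(tuple),
    do: tuple |> Tuple.to_list() |> Enum.map(&walk(&1, renames)) |> List.to_tuple()

  defp walk(list, renames) when is_list(list), do: Enum.map(list, &walk(&1, renames))
  defp walk(other, _renames), do: other

  # Recompiles the forms and loads the result instead of writing a .beam file.
  def compile_and_load(forms) do
    {:ok, module, binary} = :compile.forms(forms, [])
    {:module, ^module} = :code.load_binary(module, ~c"nofile", binary)
    module
  end
end

renamed = NamespaceSketch.rewrite(NamespaceSketch.example_forms(), %{greeter: :my_lib_greeter})
module = NamespaceSketch.compile_and_load(renamed)
IO.inspect({module, apply(module, :shout, [])})
```

The renamed module calls its own namespaced hello/0, so no module named :greeter ever needs to exist at runtime.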
There’s a bit more to it, but this isn’t a hypothetical:
~/dev/forks/spitfire main*
> iex -S mix run --no-compile
Erlang/OTP 25 [erts-13.2.2.10] [source] [64-bit] [smp:32:32] [ds:32:32:10] [async-threads:1] [jit:ns]
Interactive Elixir (1.15.8) - press Ctrl+C to exit (type h() ENTER for help)
iex(1)> alias Foo.Bar.Baz, as: Qux
Foo.Bar.Baz
iex(2)> env = struct(Spitfire.LibElixir.Macro.Env, Map.from_struct(__ENV__))
%Spitfire.LibElixir.Macro.Env{
aliases: [],
...
}
iex(3)> Spitfire.LibElixir.Macro.Env.expand_alias(env, [], [:Qux])
{:alias, Foo.Bar.Baz}
Challenges and open questions
At the moment, this is just a proof-of-concept and there’s a lot left to figure out.
When and how to compile?
The current proof-of-concept library is using a Mix compiler that downloads an Elixir archive from GitHub, compiles only the stdlib (make erlang app stdlib), namespaces the resulting *.beam files and app, and then sticks them in _build/dev/lib/lib_elixir/ebin.
defmodule Spitfire.MixProject do
  ...
  def project do
    [
      ...,
      lib_elixir: [{Spitfire.LibElixir, "v1.17.2"}]
    ]
  end
  ...
  defp deps do
    [
      ...,
      {:lib_elixir, path: "..."}
    ]
  end
end
This almost works, but not quite. Something’s causing protocol consolidation to fail:
14:17:49.007 [error] Task #PID<0.1890.0> started from #PID<0.94.0> terminating
** (FunctionClauseError) no function clause matching in Spitfire.LibElixir.List.Chars.Spitfire.LibElixir.Atom."-inlined-__impl__/1-"/1
Spitfire.LibElixir.List.Chars.Spitfire.LibElixir.Atom."-inlined-__impl__/1-"(:target)
(elixir 1.15.8) lib/protocol.ex:679: Protocol.each_struct_clause_for/3
(elixir 1.15.8) lib/enum.ex:1693: Enum."-map/2-lists^map/1-1-"/2
(elixir 1.15.8) lib/protocol.ex:657: Protocol.change_struct_impl_for/4
(elixir 1.15.8) lib/protocol.ex:619: Protocol.change_debug_info/3
(elixir 1.15.8) lib/protocol.ex:570: Protocol.consolidate/2
(mix 1.15.8) lib/mix/tasks/compile.protocols.ex:140: Mix.Tasks.Compile.Protocols.consolidate/4
(elixir 1.15.8) lib/task/supervised.ex:101: Task.Supervised.invoke_mfa/2
Function: #Function<9.26660727/0 in Mix.Tasks.Compile.Protocols.consolidate/6>
Args: []
I’m not yet sure why this happens. If you compile with mix compile --no-protocol-consolidation and then hop in iex with iex -S mix run --no-compile, it succeeds for Spitfire.LibElixir.List.Chars.Spitfire.LibElixir.Atom.__impl__(:for), but not for :target.
Anyways, I’m not sure whether this is necessarily the right direction and am open to suggestions.
How much to compile/include?
Right now, all of Elixir’s stdlib is being namespaced and included. An alternative might be to whitelist certain modules, like Code, Macro, Module, etc. that are likely useful to library authors.
This might solve the protocol consolidation issue above, but it could also lead to subtle or difficult-to-find bugs when a namespaced module calls into a non-namespaced module expecting certain behavior.
Version compatibility
The exact format of the data in *.beam files may change from version to version, but this strategy relies on files compiled on one version being loadable on another. I already found some incompatibility related to binaries that changed in 1.15, meaning that LibElixir 1.17.2 won’t run on any Elixir earlier than 1.15. This creates “windows of compatibility” that would need to be kept track of.
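One way the “windows of compatibility” could be tracked is a table of version requirements checked with Version.match?/2. This is a sketch only: the CompatSketch module and its table are hypothetical, and the single entry reflects the one window mentioned above (LibElixir 1.17.2 needing Elixir 1.15 or later).

```elixir
defmodule CompatSketch do
  # Hypothetical table: namespaced stdlib version => requirement on the
  # Elixir version it can be loaded on. Only this one entry comes from the
  # incompatibility described above.
  @windows %{"1.17.2" => ">= 1.15.0"}

  # Returns true when `lib_version` is known to load on `runtime_version`.
  # Unknown library versions are treated as incompatible.
  def compatible?(lib_version, runtime_version \\ System.version()) do
    case Map.fetch(@windows, lib_version) do
      {:ok, requirement} -> Version.match?(runtime_version, requirement)
      :error -> false
    end
  end
end

IO.inspect(CompatSketch.compatible?("1.17.2", "1.15.8"))
IO.inspect(CompatSketch.compatible?("1.17.2", "1.14.5"))
```

A check like this could run in the Mix compiler before downloading anything, failing early with a clear message instead of a load error later.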
Is this even a good idea?
This is the final question. Is this generally useful and worth the effort? Are there gotchas I’m missing?
Any feedback greatly appreciated.
Most Liked
josevalim
Very cool exploration @zachallaun!
Just some ideas (feel free to fully ignore them):
- You can probably skip protocols and their implementations from lib_elixir. All of our protocols and their implementations are public, so it is very unlikely they will change between versions in an incompatible way.
- Perhaps instead of allowing some modules to be removed, you could ask developers to list which modules they want to use, then you traverse their abstract code and find what they depend on, and convert these too, recursively. This means that Spitfire, which only really needs the tokenizer, gets the minimum stuff they need. You may have some corner cases, for example if we use some module conditionally, but then you can manually add those (and they should be few), such as the string_tokenizer used by the tokenizer.
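The recursive part of the second suggestion could look roughly like the sketch below. It assumes some function that returns the modules a given module depends on. Here that function is backed by a made-up graph with hypothetical module names, whereas a real version would derive the edges from each module’s abstract code.

```elixir
defmodule DependencySketch do
  # Returns the sorted list of all modules reachable from `roots`, where
  # `deps_fun` maps a module to the modules it references directly.
  def closure(roots, deps_fun) do
    roots
    |> visit(deps_fun, MapSet.new())
    |> MapSet.to_list()
    |> Enum.sort()
  end

  defp visit([], _deps_fun, seen), do: seen

  defp visit([mod | rest], deps_fun, seen) do
    if MapSet.member?(seen, mod) do
      # Already handled; this is what makes cycles terminate.
      visit(rest, deps_fun, seen)
    else
      visit(deps_fun.(mod) ++ rest, deps_fun, MapSet.put(seen, mod))
    end
  end
end

# Hypothetical dependency graph; names and edges are invented for illustration.
graph = %{
  tokenizer: [:interpolation, :errors],
  interpolation: [:tokenizer],
  errors: [],
  parser: [:tokenizer],
  string_tokenizer: []
}

deps_fun = fn mod -> Map.get(graph, mod, []) end

# Requested module plus one manually added corner case.
IO.inspect(DependencySketch.closure([:tokenizer, :string_tokenizer], deps_fun))
```

Modules that are only used conditionally would not show up as edges, which is why the manually added roots are passed in alongside the requested ones.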
zachallaun
Yep, that seems to work just fine.
As a quick update, here’s what it currently takes to get Spitfire using lib_elixir with all tests passing while running on Elixir 1.15 (cc @mhanberg):
diff --git a/mix.exs b/mix.exs
index fc7eabf..90f27ab 100644
--- a/mix.exs
+++ b/mix.exs
@@ -12,7 +12,9 @@ defmodule Spitfire.MixProject do
       start_permanent: Mix.env() == :prod,
       deps: deps(),
       docs: [main: "Spitfire"],
-      package: package()
+      package: package(),
+      compilers: [:lib_elixir] ++ Mix.compilers(),
+      lib_elixir: {Spitfire.LibElixir, "v1.17.2", [Code, Macro, Macro.Env, :elixir_tokenizer]}
     ]
   end
@@ -26,6 +28,7 @@ defmodule Spitfire.MixProject do
   # Run "mix help deps" to learn about dependencies.
   defp deps do
     [
+      {:lib_elixir, path: "../lib_elixir", runtime: false},
       {:ex_doc, ">= 0.0.0", only: :dev},
       {:styler, "~> 0.11", only: :dev}
       # {:dep_from_hexpm, "~> 0.3.0"},
diff --git a/lib/spitfire.ex b/lib/spitfire.ex
index 8e92c35..9174bc5 100644
--- a/lib/spitfire.ex
+++ b/lib/spitfire.ex
@@ -1989,7 +1989,7 @@ defmodule Spitfire do
     tokens =
       case code
            |> String.to_charlist()
-           |> :spitfire_tokenizer.tokenize(opts[:line] || 1, opts[:column] || 1, opts) do
+           |> :spitfire_lib_elixir_tokenizer.tokenize(opts[:line] || 1, opts[:column] || 1, opts) do
         {:ok, _, _, _, tokens} ->
           tokens
diff --git a/lib/spitfire/env.ex b/lib/spitfire/env.ex
index 50ee68d..bd76dcc 100644
--- a/lib/spitfire/env.ex
+++ b/lib/spitfire/env.ex
@@ -2,8 +2,14 @@ defmodule Spitfire.Env do
   @moduledoc """
   Environment querying
   """
+
+  alias Spitfire.LibElixir.Code
+  alias Spitfire.LibElixir.Macro
+
   @env %{
-    Macro.Env.prune_compile_info(__ENV__)
+    (Macro.Env
+     |> struct(Map.from_struct(__ENV__))
+     |> Macro.Env.prune_compile_info())
     | line: 0,
       file: "nofile",
       module: nil,
diff --git a/test/spitfire_test.exs b/test/spitfire_test.exs
index 7f0c6b2..30475ff 100644
--- a/test/spitfire_test.exs
+++ b/test/spitfire_test.exs
@@ -1,6 +1,8 @@
 defmodule SpitfireTest do
   use ExUnit.Case
+  alias Spitfire.LibElixir.Code
+
   doctest Spitfire
   describe "valid code" do
zachdaniel
I’m not sure what you mean by “fixed the mistake” WRT Igniter, but I think I was just wrong when explaining our main use case with Spitfire. Sorry about that
Error tolerance is why we use Spitfire.container_cursor_to_quoted/1 as opposed to Code.Fragment, but what we actually use Spitfire for is primarily Spitfire.Env.expand.
The way we are using it currently (and we aim to expand this usage) is, for example, when patching in a module name, we want it to respect existing module aliases.
For example, when installing AshPostgres, we make sure that your Repo module exists, is configured correctly, etc., and then we make sure that it is a child of your application (and that you have an application file, we create it if it doesn’t exist).
This patching logic is naive at the moment, but it will get more robust over time. I’ve also removed a bunch of stuff to keep this example simple:
# in `Igniter.Project.Application`
def do_add_child(igniter, application, to_supervise) do
  path = Igniter.Code.Module.proper_location(application)
  Igniter.update_elixir_file(igniter, path, fn zipper ->
    with {:ok, zipper} <- Igniter.Code.Module.move_to_module_using(zipper, Application),
         {:ok, zipper} <- Igniter.Code.Function.move_to_def(zipper, :start, 2),
         {:ok, zipper} <-
           Igniter.Code.Function.move_to_function_call_in_current_scope(
             zipper,
             :=,
             [2],
             fn call ->
               Igniter.Code.Function.argument_matches_pattern?(
                 call,
                 0,
                 {:children, _, context} when is_atom(context)
               ) &&
                 Igniter.Code.Function.argument_matches_pattern?(call, 1, v when is_list(v))
             end
           ) do
      zipper
      |> Zipper.down()
      |> Zipper.rightmost()
      |> Igniter.Code.List.append_new_to_list(Macro.escape(to_supervise), diff_checker)
    else
      _ ->
        {:warning, "...."}
    end
  end)
end
So when using append_new_to_list, that ultimately ends up using code that expands the environment at the place you are bringing in code, and uses that to
- honor aliases when checking for matches in the list
- use aliases when inserting a module into the AST
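As a small illustration of what resolving an alias against an environment means, here is a sketch using Macro.expand/2 from the standard library. It is not Igniter’s or Spitfire’s actual code path. AliasSketch is a hypothetical module, and Foo.Bar.Baz is a made-up name that does not need to exist.

```elixir
defmodule AliasSketch do
  # Foo.Bar.Baz is a made-up module name; aliasing it does not require
  # the module to exist.
  alias Foo.Bar.Baz, as: Qux

  # What the compiler resolves when the alias is written in source.
  def direct, do: Qux

  # The same resolution done by hand: the AST for `Qux` is expanded
  # against the environment captured at this point in the file.
  def expanded do
    Macro.expand({:__aliases__, [], [:Qux]}, __ENV__)
  end
end

IO.inspect(AliasSketch.direct())
IO.inspect(AliasSketch.expanded())
```

Both calls return the full module name, which is the kind of information needed to tell whether a short alias already present in a list refers to the module about to be inserted.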
Error tolerance can be useful for us because it is theoretically possible to introduce invalid AST “temporarily” while working with igniter. You compose a bunch of AST modifiers, and I didn’t want to necessarily guarantee that, after every single modification you make, the AST is valid. Only when actually writing the file must it be valid.
If it’s not valid, though, you won’t get the nice features of being able to determine env at a location.
This is the entirety of our Spitfire usage currently.
@doc """
Expands the environment at the current zipper position and returns the
expanded environment. Currently used for properly working with aliases.
"""
def current_env(zipper) do
  zipper
  |> do_add_code({:__cursor__, [], []}, :after, false)
  |> Zipper.topmost_root()
  |> Sourceror.to_string()
  |> String.split("__cursor__()", parts: 2)
  |> List.first()
  |> Spitfire.container_cursor_to_quoted()
  |> then(fn {:ok, ast} ->
    ast
  end)
  |> Spitfire.Env.expand("file.ex")
  |> then(fn {_ast, _final_state, _final_env, cursor_env} ->
    {:ok, struct(Macro.Env, cursor_env)}
  end)
rescue
  e ->
    {:error, e}
end







