zachallaun

Working on lib_elixir - Elixir core modules as a library

Note: There are a few folks I’d really love to hear from, time permitting. Pinging in case the title isn’t catchy enough :slight_smile: @dorgan @scohen @scottming @mhanberg @lukaszsamson @zachdaniel

Proof-of-concept on GitHub: zachallaun/lib_elixir

Motivation

There has been an ongoing effort in Elixir to add or improve APIs to make code analysis easier. Primarily, this has been done to support developer tools, namely language servers, allowing them to use the same mechanisms that Elixir itself uses to analyze code.

The first major fruits of this effort shipped in Elixir 1.17 in the shape of new APIs in Macro.Env, but useful improvements have been added to Code and Macro continuously, and there are a number of open issues under discussion that could lead to further improvements.

The problem, then, becomes accessing these new improvements while still supporting older versions of Elixir. Existing libraries handle this in different ways:

  • Sourceror vendors the formatting-related parts of Code (and related Erlang source) in order to support formatting on versions prior to Elixir 1.13.
  • Lexical vendors in Code and parts of Macro (and related Erlang source) in order to parse and analyze source code in an environment that doesn’t interfere with user project dependencies.
  • Next LS bundles the latest version of Elixir and uses that to compile and analyze user code.

These methods have significant trade-offs. Vendoring code is a time-consuming, manual, and potentially buggy process, as modules have to be copied in and namespaced so that they don’t conflict with the runtime. Bundling Elixir means user code is compiled in a different environment than the one it will run in in production, which can cause spurious warnings or other subtle differences.

Ultimately, the problem is that Elixir is a shared dependency that library authors do not control.

Elixir as a library

I’ve been experimenting with a new library that allows for the Elixir standard library to be included as a dependency in a way that does not conflict with the runtime version of Elixir.

The idea is to allow library authors to replace their usage of standard library modules with namespaced ones. For the following examples, I’ll use Spitfire, which uses features of Macro.Env that were introduced in Elixir 1.17 (see here). Here’s an example of how Spitfire might use this:

  defmodule Spitfire.Env do
    @moduledoc """
    Environment querying
    """
  
+   alias Spitfire.LibElixir.Macro, as: Macro
  
    @env %{
-     Macro.Env.prune_compile_info(__ENV__)
+     Macro.Env
+     |> struct(Map.from_struct(__ENV__))
+     |> Macro.Env.prune_compile_info()
      | line: 0,
        file: "nofile",
        module: nil,
        function: nil,
        context_modules: []
    }
    defp env, do: @env

    ...

  end

This would allow Spitfire to support versions of Elixir earlier than 1.17. (More on that below when I discuss challenges, but I think 1.15+.)

Namespacing

So, how do we compile a specific version of the Elixir standard library and then use it as in Spitfire.LibElixir.Macro.Env?

The strategy is the same one used by Lexical to ensure that its dependencies don’t conflict with user dependencies at runtime. We call it namespacing. (Hat tip @scohen, who came up with this for Lexical.)

Here’s the gist of it:

  • Compile your Elixir and Erlang modules to bytecode: .app and .beam files.
  • Read them in as Abstract Forms using :beam_lib.chunks(path, [:abstract_code]).
  • Walk the abstract code, rewriting module names to their namespaced counterparts:
    • Code -> Spitfire.LibElixir.Code
    • :elixir_tokenizer -> :spitfire_lib_elixir_tokenizer
  • Recompile the modified abstract forms using :compile.forms(...), writing the resulting binary out to a new .beam:
    • Elixir.Code.beam -> Elixir.Spitfire.LibElixir.Code.beam
    • elixir_tokenizer.beam -> spitfire_lib_elixir_tokenizer.beam
  • Do something similar with the .app:
    • elixir.app -> spitfire_lib_elixir.app
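
To make the renaming step concrete, here is a minimal sketch of the idea, not the actual lib_elixir or Lexical implementation. It assumes a deliberately naive rule: any atom found in a rename map is rewritten, wherever it appears in the forms. The module `NamespaceSketch`, the hand-written forms, and the names `:demo_tokenizer` and `:my_lib_demo_tokenizer` are all hypothetical; a real version would read the forms with :beam_lib.chunks/2 and be more careful about which atoms it touches.

```elixir
defmodule NamespaceSketch do
  @moduledoc false

  # Naive renamer (an assumption for illustration): rewrites every atom that
  # is a key in `mapping`, not only atoms that sit in module position.
  def rename(forms, mapping) when is_list(forms) do
    Enum.map(forms, &rename(&1, mapping))
  end

  def rename(form, mapping) when is_tuple(form) do
    form |> Tuple.to_list() |> rename(mapping) |> List.to_tuple()
  end

  def rename(atom, mapping) when is_atom(atom), do: Map.get(mapping, atom, atom)
  def rename(other, _mapping), do: other
end

# Hand-written Erlang abstract forms for a tiny hypothetical module:
#
#   -module(demo_tokenizer).
#   -export([hello/0]).
#   hello() -> ok.
forms = [
  {:attribute, 1, :module, :demo_tokenizer},
  {:attribute, 1, :export, [hello: 0]},
  {:function, 1, :hello, 0, [{:clause, 1, [], [], [{:atom, 1, :ok}]}]}
]

renamed = NamespaceSketch.rename(forms, %{demo_tokenizer: :my_lib_demo_tokenizer})

# Recompile the rewritten forms and load the result under its new name.
{:ok, module, binary} = :compile.forms(renamed, [:return_errors])
{:module, ^module} = :code.load_binary(module, ~c"nofile", binary)

IO.inspect(module.hello(), label: inspect(module))
```

In the real flow the binary would be written out to a renamed .beam file rather than loaded directly; loading it here just shows that the namespaced module is callable.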

There’s a bit more to it, but this isn’t a hypothetical:

~/dev/forks/spitfire main*
> iex -S mix run --no-compile
Erlang/OTP 25 [erts-13.2.2.10] [source] [64-bit] [smp:32:32] [ds:32:32:10] [async-threads:1] [jit:ns]

Interactive Elixir (1.15.8) - press Ctrl+C to exit (type h() ENTER for help)

iex(1)> alias Foo.Bar.Baz, as: Qux
Foo.Bar.Baz

iex(2)> env = struct(Spitfire.LibElixir.Macro.Env, Map.from_struct(__ENV__))
%Spitfire.LibElixir.Macro.Env{
  aliases: [],
  ...
}

iex(3)> Spitfire.LibElixir.Macro.Env.expand_alias(env, [], [:Qux])
{:alias, Foo.Bar.Baz}

Challenges and open questions

At the moment, this is just a proof-of-concept and there’s a lot left to figure out.

When and how to compile?

The current proof-of-concept library is using a Mix compiler that downloads an Elixir archive from GitHub, compiles only the stdlib (make erlang app stdlib), namespaces the resulting *.beam files and app, and then sticks them in _build/dev/lib/lib_elixir/ebin.

defmodule Spitfire.MixProject do
  ...
  
  def project do
    [
      ...,
      lib_elixir: [{Spitfire.LibElixir, "v1.17.2"}]
    ]
  end
  
  ...

  defp deps do
    [
      ...,
      {:lib_elixir, path: "..."}
    ]
  end
end

This almost works, but not quite. Something’s causing protocol consolidation to fail:

14:17:49.007 [error] Task #PID<0.1890.0> started from #PID<0.94.0> terminating
** (FunctionClauseError) no function clause matching in Spitfire.LibElixir.List.Chars.Spitfire.LibElixir.Atom."-inlined-__impl__/1-"/1
    Spitfire.LibElixir.List.Chars.Spitfire.LibElixir.Atom."-inlined-__impl__/1-"(:target)
    (elixir 1.15.8) lib/protocol.ex:679: Protocol.each_struct_clause_for/3
    (elixir 1.15.8) lib/enum.ex:1693: Enum."-map/2-lists^map/1-1-"/2
    (elixir 1.15.8) lib/protocol.ex:657: Protocol.change_struct_impl_for/4
    (elixir 1.15.8) lib/protocol.ex:619: Protocol.change_debug_info/3
    (elixir 1.15.8) lib/protocol.ex:570: Protocol.consolidate/2
    (mix 1.15.8) lib/mix/tasks/compile.protocols.ex:140: Mix.Tasks.Compile.Protocols.consolidate/4
    (elixir 1.15.8) lib/task/supervised.ex:101: Task.Supervised.invoke_mfa/2
Function: #Function<9.26660727/0 in Mix.Tasks.Compile.Protocols.consolidate/6>
    Args: []

I’m not yet sure why this happens. If you compile with mix compile --no-protocol-consolidation and then hop in iex with iex -S mix run --no-compile, it succeeds for Spitfire.LibElixir.List.Chars.Spitfire.LibElixir.Atom.__impl__(:for), but not for :target.

Anyways, I’m not sure whether this is necessarily the right direction and am open to suggestions.

How much to compile/include?

Right now, all of Elixir’s stdlib is being namespaced and included. An alternative might be to whitelist certain modules, like Code, Macro, Module, etc. that are likely useful to library authors.

This might solve the protocol consolidation issue above, but it could also lead to subtle or difficult-to-find bugs when a namespaced module calls into a non-namespaced module expecting certain behavior.

Version compatibility

The exact format of the data in *.beam files may change from version to version, but this strategy relies on files compiled on one version being loadable on another. I already found some incompatibility related to binaries that changed in 1.15, meaning that LibElixir 1.17.2 won’t run on any Elixir earlier than 1.15. This creates “windows of compatibility” that would need to be kept track of.
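
One way the “windows of compatibility” could be tracked is a small lookup table checked at compile time. The sketch below is only an illustration: the `CompatWindow` module and its table are hypothetical, and the single entry encodes the one data point mentioned above (LibElixir 1.17.2 needs Elixir 1.15 or later).

```elixir
defmodule CompatWindow do
  @moduledoc false

  # Hypothetical table: lib_elixir version => requirement on the runtime Elixir.
  # Only the 1.17.2 entry is grounded in the observation above.
  @windows %{"1.17.2" => ">= 1.15.0"}

  # Returns true when the running Elixir falls inside the known window for
  # `lib_version`. Unknown versions are treated as incompatible.
  def compatible?(lib_version, runtime_version \\ System.version()) do
    case Map.fetch(@windows, lib_version) do
      {:ok, requirement} -> Version.match?(runtime_version, requirement)
      :error -> false
    end
  end
end

IO.inspect(CompatWindow.compatible?("1.17.2", "1.15.8"), label: "1.17.2 on 1.15.8")
IO.inspect(CompatWindow.compatible?("1.17.2", "1.14.5"), label: "1.17.2 on 1.14.5")
```

A compiler task could call something like this before downloading anything and fail early with a clear message.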

Is this even a good idea?

This is the final question. Is this generally useful and worth the effort? Are there gotchas I’m missing?

Any feedback greatly appreciated.

josevalim

Creator of Elixir

Very cool exploration @zachallaun!

Just some ideas (feel free to fully ignore them):

  1. You can probably skip protocols and their implementations from lib_elixir. All of our protocols and their implementations are public, so it is very unlikely they will change between versions in an incompatible way.

  2. Perhaps instead of allowing some modules to be removed, you could ask developers to list which modules they want to use, then you traverse their abstract code and find what they depend on, and convert these too, recursively. This means that Spitfire, which only really needs the tokenizer, gets the minimum stuff they need. You may have some corner cases, for example if we use some module conditionally, but then you can manually add those (and they should be few), such as the string_tokenizer used by the tokenizer.
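
The recursive part of idea 2 amounts to a transitive closure over module dependencies. Here is a rough sketch under stated assumptions: `DepClosure` and the toy dependency graph are hypothetical, and `deps_fun` stands in for whatever would actually extract referenced modules from a module’s abstract code.

```elixir
defmodule DepClosure do
  @moduledoc false

  # Returns the set of `roots` plus everything reachable from them.
  # `deps_fun` is a hypothetical callback: module -> list of modules it references.
  def closure(roots, deps_fun) when is_list(roots) and is_function(deps_fun, 1) do
    walk(roots, MapSet.new(), deps_fun)
  end

  defp walk([], seen, _deps_fun), do: seen

  defp walk([mod | rest], seen, deps_fun) do
    if MapSet.member?(seen, mod) do
      # Already visited; skipping also protects against dependency cycles.
      walk(rest, seen, deps_fun)
    else
      walk(deps_fun.(mod) ++ rest, MapSet.put(seen, mod), deps_fun)
    end
  end
end

# Toy graph with made-up module names, standing in for real analysis results.
graph = %{a: [:b, :c], b: [:c], c: [], d: [:a]}
deps_fun = fn mod -> Map.get(graph, mod, []) end

# Only :a was requested, so :d is left out.
needed = DepClosure.closure([:a], deps_fun)
IO.inspect(MapSet.to_list(needed), label: "modules to namespace")
```

Conditionally used modules that the traversal misses could simply be passed as extra roots.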

zachallaun

Yep, that seems to work just fine.


As a quick update, here’s what it currently takes to get Spitfire using lib_elixir with all tests passing while running on Elixir 1.15 (cc @mhanberg):

diff --git a/mix.exs b/mix.exs
index fc7eabf..90f27ab 100644
--- a/mix.exs
+++ b/mix.exs
@@ -12,7 +12,9 @@ defmodule Spitfire.MixProject do
       start_permanent: Mix.env() == :prod,
       deps: deps(),
       docs: [main: "Spitfire"],
-      package: package()
+      package: package(),
+      compilers: [:lib_elixir] ++ Mix.compilers(),
+      lib_elixir: {Spitfire.LibElixir, "v1.17.2", [Code, Macro, Macro.Env, :elixir_tokenizer]}
     ]
   end
@@ -26,6 +28,7 @@ defmodule Spitfire.MixProject do
   # Run "mix help deps" to learn about dependencies.
   defp deps do
     [
+      {:lib_elixir, path: "../lib_elixir", runtime: false},
       {:ex_doc, ">= 0.0.0", only: :dev},
       {:styler, "~> 0.11", only: :dev}
       # {:dep_from_hexpm, "~> 0.3.0"},

diff --git a/lib/spitfire.ex b/lib/spitfire.ex
index 8e92c35..9174bc5 100644
--- a/lib/spitfire.ex
+++ b/lib/spitfire.ex
@@ -1989,7 +1989,7 @@ defmodule Spitfire do
     tokens =
       case code
            |> String.to_charlist()
-           |> :spitfire_tokenizer.tokenize(opts[:line] || 1, opts[:column] || 1, opts) do
+           |> :spitfire_lib_elixir_tokenizer.tokenize(opts[:line] || 1, opts[:column] || 1, opts) do
         {:ok, _, _, _, tokens} ->
           tokens

diff --git a/lib/spitfire/env.ex b/lib/spitfire/env.ex
index 50ee68d..bd76dcc 100644
--- a/lib/spitfire/env.ex
+++ b/lib/spitfire/env.ex
@@ -2,8 +2,14 @@ defmodule Spitfire.Env do
   @moduledoc """
   Environment querying
   """
+
+  alias Spitfire.LibElixir.Code
+  alias Spitfire.LibElixir.Macro
+
   @env %{
-    Macro.Env.prune_compile_info(__ENV__)
+    (Macro.Env
+     |> struct(Map.from_struct(__ENV__))
+     |> Macro.Env.prune_compile_info())
     | line: 0,
       file: "nofile",
       module: nil,

diff --git a/test/spitfire_test.exs b/test/spitfire_test.exs
index 7f0c6b2..30475ff 100644
--- a/test/spitfire_test.exs
+++ b/test/spitfire_test.exs
@@ -1,6 +1,8 @@
 defmodule SpitfireTest do
   use ExUnit.Case

+  alias Spitfire.LibElixir.Code
+
   doctest Spitfire

   describe "valid code" do
zachdaniel

Creator of Ash

I’m not sure what you mean by “fixed the mistake” WRT Igniter, but I think I was just wrong when explaining our main use case with Spitfire. Sorry about that :slight_smile: Error tolerance is why we use Spitfire.container_cursor_to_quoted/1 as opposed to Code.Fragment, but what we actually use Spitfire for is primarily Spitfire.Env.expand.

The way we are using it currently (and we aim to expand this usage) is, for example, when patching in a module name, we want it to respect existing module aliases.

For example, when installing AshPostgres, we make sure that your Repo module exists, is configured correctly, etc., and then we make sure that it is a child of your application (and that you have an application file, we create it if it doesn’t exist).

This patching logic is naive at the moment, but it will get more robust over time. I’ve also removed a bunch of stuff to keep this example simple:

# in `Igniter.Project.Application`
  def do_add_child(igniter, application, to_supervise) do
    path = Igniter.Code.Module.proper_location(application)

    Igniter.update_elixir_file(igniter, path, fn zipper ->
      with {:ok, zipper} <- Igniter.Code.Module.move_to_module_using(zipper, Application),
           {:ok, zipper} <- Igniter.Code.Function.move_to_def(zipper, :start, 2),
           {:ok, zipper} <-
             Igniter.Code.Function.move_to_function_call_in_current_scope(
               zipper,
               :=,
               [2],
               fn call ->
                 Igniter.Code.Function.argument_matches_pattern?(
                   call,
                   0,
                   {:children, _, context} when is_atom(context)
                 ) &&
                   Igniter.Code.Function.argument_matches_pattern?(call, 1, v when is_list(v))
               end
             ) do
        zipper
        |> Zipper.down()
        |> Zipper.rightmost()
        |> Igniter.Code.List.append_new_to_list(Macro.escape(to_supervise), diff_checker)
      else
        _ ->
          {:warning, "...."}
      end
    end)
  end

So when using append_new_to_list, that ultimately ends up using code that expands the environment at the place you are bringing in code, and uses that to

  1. honor aliases when checking for matches in the list
  2. use aliases when inserting a module into the AST

Error tolerance can be useful for us because it is theoretically possible to introduce invalid AST “temporarily” while working with igniter. You compose a bunch of AST modifiers, and I didn’t want to necessarily guarantee that, after every single modification you make, the AST is valid. Only when actually writing the file must it be valid.

If it’s not valid, though, you won’t get the nice features of being able to determine env at a location.

This is the entirety of our Spitfire usage currently.

  @doc """
  Expands the environment at the current zipper position and returns the
  expanded environment. Currently used for properly working with aliases.
  """
  def current_env(zipper) do
    zipper
    |> do_add_code({:__cursor__, [], []}, :after, false)
    |> Zipper.topmost_root()
    |> Sourceror.to_string()
    |> String.split("__cursor__()", parts: 2)
    |> List.first()
    |> Spitfire.container_cursor_to_quoted()
    |> then(fn {:ok, ast} ->
      ast
    end)
    |> Spitfire.Env.expand("file.ex")
    |> then(fn {_ast, _final_state, _final_env, cursor_env} ->
      {:ok, struct(Macro.Env, cursor_env)}
    end)
  rescue
    e ->
      {:error, e}
  end
