Tria

hst337 · January 18, 2023, 1:13am

An optimizing compiler for Elixir language

Public and alpha

I’ve made this repository public to finally get some rest and find other developers who may be interested in participating in this project (it sounds so naive when I write it) before getting hands on the second iteration.

Current state

!!!Unstable!!!

Tria passes most of the tests in Phoenix, Ecto, Plug and other projects, while it may fail to compile other projects. You’re welcome to compile projects using Tria and fill an issue if something fails. This kind of help would be heavily appreciated.

Even without considering this, Tria compiler lacks some important features like incremental compilation, warning handling and stuff.

However, you can expect the stable version during 2023-ish

Current features

Compile-time evaluation.
Yes, as simple as it sounds, Elixir and Erlang were incapable of doing this, since they support runtime recompilation. Ahead-of-time nature of Tria allowed to evaluate pure statement in compile time
Enum fusion
Join multiple Enum.maps, optimize for loops to finally be efficient and other magic.

I haven’t benchmarked these yet, but these optimizations actually work and (more importantly) work correctly in most cases

Planned features

Broad documentation (like pathex has)
Inlining
Hot-reloading support
Peephole optimizations for common suboptimal code.
map.field handling (did you know that it is 2-3 times slower than Map.fetch!(map, :field)?)
Type-checker integration

Help wanted

If you’re interested in optimizing compiler development, you’re welcome to take a look around, poke some things, check some stuff. I know that some Brazilian universities are working in Elixir-related compiler development, and I hope we can collaborate on Tria someday

PS: Questions are appreciated

UPD: License is going to change in a near future

sergio · January 18, 2023, 1:44am

I’m still a little confused what this project is/does.

Can you try to restate what it is and what it does in simpler terms?

hst337 · January 18, 2023, 1:47am

“Optimizer” means a thing which makes something more efficient.

So is Tria. It simply makes your code more efficient in terms of time and memory when you add it to your project as a dependency (and a list of compilers). It makes Enum calls and for loops faster, it removes unnecessary calls, it evaluates some stuff in compile time

benwilson512 · January 18, 2023, 1:54am

This is very exciting! Looking forward to testing this on some projects.

Tiny question: In the install guide you add {:tria, github: "hissssst/tria"} to the deps, presumably this could be runtime: false or does it need to exist when the app runs too?

hst337 · January 18, 2023, 1:56am

No, Tria is not called in runtime, though I haven’t testing it with this flag yet. To be honest, I’ve only tested it with path: "../tria" during development

Eiji · January 18, 2023, 3:12am

Ahh, so bad we can’t set compilers via Mix.install/2 (we can only turn off consolidated protocols) … Is there any particular reason why thee is no support for it?

hst337 · January 18, 2023, 8:53am

I think we can create a proposal for the language. I don’t see any boundaries why this shouldn’t be implemented there

Eiji · January 18, 2023, 9:43am

I was thinking the same, but your comment made me sure about it. Here is in issue:

github.com/elixir-lang/elixir

Missing support for compilers config in Mix.install/2

opened 09:41AM - 18 Jan 23 UTC

Eiji7

### Elixir and Erlang/OTP versions Found no support at all up to `main` branch.… `Erlang` version `N/A`. ### Operating system N/A - that's `Elixir` issue regardless of OS ### Current behavior See `[]` (empty list) value for compilers config regardless of passed options. https://github.com/elixir-lang/elixir/blob/6730d669fb319411f8e411d4126f1f4067ef9231/lib/mix/lib/mix.ex#L723-L737 ### Expected behavior Some libraries requires custom `mix` compilers to work. The problem was [found by me](https://elixirforum.com/t/tria-the-elixir-optimizer/53250/6?u=eiji) when I want to test `tria` in my scripts, but applies to other libraries as well, for example `elixir_make`. **Note**: When creating this issue I was thinking about it more **like** a bug report rather than a proposal, because `Mix.install/2` was added to help working with dependencies in single file scripts and missing option blocks developers from doing so in specific use cases.

As wrote in it once there would be a support for it at least in main branch I would try your library with my scripts like scrapers and I would try to submit some benchmarks as well.

josevalim · January 18, 2023, 10:01am

Exciting to see your ideas come to life!

How do you know which functions are pure? Especially when anonymous functions are involved?

Could you please expand on the tricks here (or share a link)? I assume some of those may be implementable in Elixir itself, since for is opaque (although it is unlikely we would do Enum fusion).

Curious: how would that be different from Erlang inlining via @compile :inlineand @compile {:inline, size}?

Any ideas here? We do compile map.field to a case statement but, last time I measured, the pattern matching was faster than invoking a BIF instruction (which is what Map.fetch! does). map.fieldshould be slower if the field does not exist though.

Other than that, nice work! I am sure the Erlang Compiler team would also be interested in further ideas to optimize. Some of the peephole and type-checker integration may already be implemented there. Although we don’t perform protocol related type optimizations and the Erlang compiler does not have enough information to do so, so maybe Elixir could/should.

hst337 · January 18, 2023, 11:48am

Thanks for your questions!

This is a big questions. Generally, there are two types of functions: functions which are defined in Erlang or Elixir, or functions which are undefined (thus contain erlang:nif_error or call to themselves in their bodies like in Elixir bootstrap).

For the latter, I’ve prepared the ets table file with purity for all such function from all available modules. When unexpected NIF function is found in call stack, Tria would just ask the user in TTY if such function is pure. I have a plan to add annotations like

@tria pure: true, safe: true
def my_nif() do
  ...
end

For the functions which have Erlang/Elixir definitions, compiler fetches the abstract code, translates it to Tria and just takes a look at all possible calls. So, calls to impure or anonymous functions are considered impure and this is not always correct, but it works for the first iteration of compiler and it can be partially solved with type checking in the future

It translates Enum.reduce() |> :lists.reverse to Enum.map. Compiler just needs to make sure that Enum.reduce aggregates each item straight to the accumulator list, without any conditions. Here’s the optimization: tria/lib/tria/optimizer/pass/enum_fusion.ex at main · hissssst/tria · GitHub. You can see there tri macro which is a great tool for pattern-matching on Tria AST.

It will be different in some ways. For example, inter-module calls will be inlined. And defaults (like foo(x, y \\ 0) will be expanded too, removing one call in the stack

However, some of these approaches change the behaviour of the program in two ways:

Pure code which raises exceptions may be reordered by EnumFusion optimization. I am working on fixing this behavior to have more correct compiler.
Runtime recompilation is not yet possible. It can be possible in the future, but Tria just needs some tooling for this.

josevalim · January 18, 2023, 12:27pm

What we could do in Elixir itself is to keep a list of the functions in its standard library that are pure. Erlang also keeps a similar list for its standard library and inlines those. WDYT?

We can make this generally applicable but it is worth keeping in mind that each inlined modules becomes a compile-time dependency, and enlarging the compile time graphic can have huge impacts on performance. This is not a concern for Elixir stdlib itself though.

It is also not possible to know if something is pure or not when calling Erlang functions (unless we have an FFI layer where you also explicitly declare those).

Ah, thank you. We had a previous discussion about this. At the time, the cost of traversal in for comparing Enum.reduce vs Enum.map was roughly 13% faster. However, your benchmarks show the body recursive version can be even faster and we could emit Erlang AST with a body recursive version using Erlang’s “named anonymous functions”. Is this something you have an interest in contributing? Or should I open up an issue for someone else to pick?

hst337 · January 18, 2023, 12:38pm

Yeah, I am using Erlang’s list, but unfortunately, :erl_bifs covers only :erlang module. I didn’t know about list of pure Elixir functions, I’ll grep the Elixir sources then, thanks for the tip!

I didn’t quite get what you meant.

Well, it depends on these Erlang functions. If abstract_code for the function can be fetched, it can be analyzed for purity and safety. If the abstract_code is not present, or the function is a NIF/BIF stub, we can ask a user to provide the value for a trait, or just assume the worst (impurity or unsafety)

ityonemo · January 18, 2023, 3:24pm

You don’t need abstract code. You can decompile the module and trace the bytecode to make sure there aren’t any calls that are known to be mutating.

hst337 · January 18, 2023, 3:34pm

Yes, that’s true, but right now I’ve decided to rely on abstract code, since I already have an abstract code translator. Plus, this is a foundation for applying Tria to Erlang projects. However, I’ve planned Erlang support only for the third iteration

josevalim · January 18, 2023, 10:33pm

We would need to build such a list but I guess it is worth it?

If A inlines code from B, every time B changes now A has to recompiled. This has a compilation time cost. Perhaps inlining should be enable for production (think -O3).

hst337 · January 18, 2023, 10:44pm

I don’t know if it will be useful anywhere except Tria

Ahhh, yes, that’s true and I came up with several solutions and chose the solution with separating code into contexts.

Modules will be divided into groups, and each module group will be joined and compiled into separate module. Inlinings between groups won’t be possible.

So, the problem is to divide modules into groups. Right now Tria compiles everything into one group (called TriaGlobalContext) And my ideas for this are

Using @sasajuric 's Boundary
Using explicit tria annotation
Profiling application during testing and using this data to statistically find optimal groups
Just compiling everything into one module

josevalim · January 19, 2023, 1:18pm

The goal would be to start doing some of those optimizations in Elixir too.

hst337 · January 19, 2023, 3:39pm

Haha, you can always hire me to do so

But anyway, it would change behavior of a language heavily and would require reimplementing about 99% of what I already wrote in Tria

josevalim · January 19, 2023, 3:45pm

I thinking inlining calls to pure Elixir functions should be ok, with no large changes to the semantics, especially if we do it on the pass we convert to Erlang.

hst337 · January 19, 2023, 3:55pm

Well, it might be an issue when someone recompiles the project in runtime. For example, Patch project allows recompilation of Elixir’s modules, and this code might be broken