I suspect that most BEAM files contain all of the functions defined in each referenced module (including frameworks and libraries). Although this may not matter in many cases, it would be nice to have a way to make distributions (e.g., escripts) smaller.
A tree-shaking compiler might be able to substantially reduce the size of many BEAM files. Tree-shaking compilers remove unconnected branches from the code tree. For example, ClojureScript translates Clojure code into JavaScript, then passes the result through the Google Closure compiler to remove unused functions.
I’d like to have a tool that performs the analogous task for BEAM files, escripts, etc. Would anyone like to offer approaches, caveats, comments, etc?
This might be hard to do in general because of dynamic calls, for example apply/3 or even mod.foo(...), and it gets worse when these are combined with String.to_atom or Module.concat. These would dramatically limit how much can be eliminated, since every module is globally accessible (unlike in JavaScript). In other words, it could take a lot of effort, and a single place that uses a dynamic feature might reduce the gains significantly, if not wipe them out completely.
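To make the concern concrete, here is a minimal sketch of a call site whose target is only known at run time. The module and function names (TreeShakeDemo.Greeter, TreeShakeDemo.Dispatcher, hello/1, call/3) are made up for illustration and do not come from any real library.

```elixir
defmodule TreeShakeDemo.Greeter do
  # Nothing in this file calls hello/1 by name.
  def hello(name), do: "hello " <> name
end

defmodule TreeShakeDemo.Dispatcher do
  # Builds the module and function names from strings at run time,
  # then dispatches with apply/3.
  def call(module_part, function_name, args) do
    module = Module.concat(TreeShakeDemo, module_part)
    function = String.to_atom(function_name)
    apply(module, function, args)
  end
end
```

A call such as TreeShakeDemo.Dispatcher.call("Greeter", "hello", ["world"]) reaches Greeter.hello/1 even though the source never mentions that function at the call site, so a tool that only follows literal references would consider it unused.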
Although some programs have dynamically linked functions that could complicate the situation, most do not. In any case, a configuration file could list all of these functions. Assuming that these sorts of problems could be handled, what other issues do we need to consider?
A single apply with a dynamic module and function anywhere in the codebase is enough to make it impossible to remove any public module or any public function, since that apply could potentially call it. The alternative is extensive static analysis of the possible values, but I would imagine that to be extremely difficult in practice.
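For comparison, the easy part of such an analysis is reading the references the compiler already recorded. The sketch below assumes the module's object code can be found on the code path, and uses :code.get_object_code/1 and :beam_lib.chunks/2 to list the remote calls stored in a BEAM file's import table. BeamRefs and static_imports/1 are hypothetical names.

```elixir
defmodule BeamRefs do
  # Returns {:ok, [{module, function, arity}]} with the remote calls
  # recorded in the module's BEAM import table, or :error when no
  # object code can be found for the module.
  def static_imports(module) do
    with {^module, binary, _path} <- :code.get_object_code(module),
         {:ok, {^module, [imports: imports]}} <- :beam_lib.chunks(binary, [:imports]) do
      {:ok, imports}
    else
      _ -> :error
    end
  end
end
```

BeamRefs.static_imports(Enum) returns a list of {module, function, arity} tuples. A target that is computed at run time cannot appear in that list under its own name, which is exactly the gap described above.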
Hot code loading is not a big concern if the primary target is escripts. I don’t think anybody does hot reloading in this context.
I definitely wouldn’t do it in releases - the most powerful troubleshooting tool is the shell, which is inherently dynamic (since it’s based on eval) - removing all directly unused functions could make the shell almost unusable.
Sounds like an automated tree-shaking approach wouldn’t be possible. However, if I only need one module from a 700+ module library, I would be happy to explicitly list the modules I need in mix.exs. Perhaps something like the following would be possible?
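defp deps do
  [
    # ...
    # Hypothetical syntax: today Mix's :only option restricts a dependency
    # to build environments, not to individual modules.
    {:google_api_drive, "~> 0.42", only: [GoogleApi.Content.V21.Api.Accounts]}
  ]
end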
Our application uses this dependency for a single module. It’s sad to watch the compiler compile all 717 files in the dependency.
Nitpick: your application only uses “a single module”, but that module depends on plenty more files from inside that package, especially model structs.
IMO the biggest problem with trying to do JS-style tree-shaking is that modules are never referenced by their location on the filesystem (the way JS’s import statement does) - the only way to know what file defines the module named GoogleApi.Content.V21.Api.Accounts is to compile ALL of the files and read the resulting BEAM file.
the only way to know what file defines the module named GoogleApi.Content.V21.Api.Accounts is to compile ALL of the files and read the resulting BEAM file
Would it be possible to run a simple search within the dependency to find which file defines the module with this name? Then only compile this found file, along with the modules which it depends on (would require some recursive searching)? I realize this is starting down a rabbit-hole, but maybe there is a solution here which could dramatically improve build times.
Not in general, short of parsing the file all the way to AST. For instance, imagine trying to search for the definition of Foo.Bar when there’s a file like:
defmodule Foo do
  defmodule Bar do
    def bar do
      :bar
    end
  end
end
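As a rough illustration of what "parsing all the way to AST" involves, the sketch below uses Code.string_to_quoted!/1 and walks the quoted form, tracking the enclosing module names. ModuleFinder and defined_modules/1 are hypothetical names. It assumes modules are written as literal defmodule Name do ... end blocks, and since it does not expand macros, modules generated by macros or given computed names would be missed.

```elixir
defmodule ModuleFinder do
  # Returns the fully qualified names of modules defined with a literal
  # `defmodule Name do ... end` in the given source string.
  def defined_modules(source) do
    source
    |> Code.string_to_quoted!()
    |> collect([])
  end

  # A literal defmodule: record its full name, then search its body
  # with the extended parent path.
  defp collect({:defmodule, _meta, [{:__aliases__, _, parts}, [do: body]]}, parents) do
    path = parents ++ parts
    [Module.concat(path) | collect(body, path)]
  end

  # Any other call or block: search its arguments.
  defp collect({_form, _meta, args}, parents) when is_list(args),
    do: Enum.flat_map(args, &collect(&1, parents))

  defp collect(list, parents) when is_list(list),
    do: Enum.flat_map(list, &collect(&1, parents))

  # Keyword pairs such as {:do, body}.
  defp collect({left, right}, parents),
    do: collect(left, parents) ++ collect(right, parents)

  defp collect(_other, _parents), do: []
end
```

Given the source of the Foo example, this finds both Foo and Foo.Bar, which a plain text search for "defmodule Foo.Bar" would not.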
Also keep in mind that you’d need to do even more almost-compiling work to correctly resolve aliases:
defmodule UsesAlias do
  def wat do
    alias Foo.Bar, as: Huh
    Huh.bar()
  end
end
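The alias problem shows up directly in the unexpanded AST. The sketch below (AliasProbe and written_aliases/1 are hypothetical names) collects every alias exactly as it is spelled in the source, using Code.string_to_quoted!/1 and Macro.prewalk/3. It performs no expansion, which is the point: the call site still says Huh.

```elixir
defmodule AliasProbe do
  # Collects every alias written in the source, exactly as spelled,
  # without resolving `alias ... as:` directives.
  def written_aliases(source) do
    {_ast, found} =
      source
      |> Code.string_to_quoted!()
      |> Macro.prewalk([], fn
        {:__aliases__, _meta, parts} = node, acc ->
          # Skip aliases with non-literal segments such as __MODULE__.Bar.
          if Enum.all?(parts, &is_atom/1) do
            {node, [Module.concat(parts) | acc]}
          else
            {node, acc}
          end

        node, acc ->
          {node, acc}
      end)

    found |> Enum.reverse() |> Enum.uniq()
  end
end
```

Run on the UsesAlias source, the result includes both Foo.Bar and Huh as separate names. Mapping Huh back to Foo.Bar requires tracking alias directives and their lexical scope, which is the "almost-compiling" work mentioned above.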