The :filelib.wildcard/1 function that the Elixir compiler uses to discover Elixir source files is relatively slow for projects with many files. The project I work on at the time of writing has 1570 Elixir files. Profiling that project with eflambe and speedscope showed that iterating to find source files takes a substantial amount of time.
This led me to see if I could improve the performance. I wrote a very small Rust NIF using Rustler to see if I could outperform :filelib.wildcard/1.
Rust NIF
use std::ffi::OsStr;
use walkdir::WalkDir;

// Returns the path of every entry under `dir` whose extension equals `extension`.
#[rustler::nif]
fn walkdir(dir: &str, extension: &str) -> Vec<String> {
    let mut files = Vec::new();
    for entry in WalkDir::new(dir) {
        // Panics if an entry cannot be read (for example, on a permission error).
        let entry = entry.unwrap();
        if entry.path().extension() == Some(OsStr::new(extension)) {
            files.push(entry.path().display().to_string());
        }
    }
    files
}

rustler::init!("Elixir.Walkdir.Native");
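For anyone who wants to try the same traversal without Rustler or the walkdir crate, here is a minimal standard-library sketch of the idea. It is not the code I benchmarked: `collect_by_extension` is a hypothetical name, and unlike the NIF above it returns only non-directory entries, reports I/O errors instead of panicking, and sorts its result.

```rust
use std::ffi::OsStr;
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Collects every non-directory entry under `dir` whose extension equals `ext`.
/// Hypothetical helper for illustration; it is not part of the NIF.
fn collect_by_extension(dir: &Path, ext: &str) -> io::Result<Vec<PathBuf>> {
    let mut found = Vec::new();
    // Explicit stack instead of recursion, so deep trees cannot overflow the call stack.
    let mut pending = vec![dir.to_path_buf()];

    while let Some(current) = pending.pop() {
        for entry in fs::read_dir(&current)? {
            let entry = entry?;
            // file_type() does not follow symlinks, so symlinked directories
            // are not descended into.
            let file_type = entry.file_type()?;
            let path = entry.path();
            if file_type.is_dir() {
                pending.push(path);
            } else if path.extension() == Some(OsStr::new(ext)) {
                found.push(path);
            }
        }
    }

    // read_dir order is platform dependent, so sort for a stable result.
    found.sort();
    Ok(found)
}
```

Calling `collect_by_extension(Path::new("lib"), "ex")` corresponds roughly to the `lib/**/*.ex` pattern. According to the Rust standard library documentation, `DirEntry::file_type` needs no extra system call on Windows and most Unix platforms, which keeps the "is this a directory?" check cheap.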
Bench Results
Comparing this NIF to the :filelib.wildcard/1 function shows that a substantial amount of time could be saved when compiling a project with no changes. Roughly a third of the compile time for a no-op compile is spent iterating to find the source files (~100ms of ~300ms). This is most notable when hot reloading a Phoenix project, as the compiler is invoked on page load even if no files have changed (see Phoenix.CodeReloader, Phoenix v1.7.14).
Benchee.run(
  %{
    "nif" => fn -> Walkdir.Native.walkdir("lib", "ex") end,
    "elixir" => fn -> :filelib.wildcard('lib/**/*.ex') end
  }
)
Name             ips        average  deviation         median         99th %
nif            81.88       12.21 ms     ±8.98%       11.98 ms       16.20 ms
elixir          9.49      105.35 ms    ±16.43%      105.28 ms      151.09 ms

Comparison:
nif            81.88
elixir          9.49 - 8.63x slower +93.14 ms
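As a sanity check on these figures, the arithmetic can be sketched as below. The two averages are copied from the Benchee output; the ~300ms no-op compile time is my rough estimate from above, and the saved fraction assumes the rest of the compile is unchanged. The function and constant names are only illustrative.

```rust
/// Benchee averages copied from the run above, in milliseconds per call.
const NIF_AVG_MS: f64 = 12.21;
const WILDCARD_AVG_MS: f64 = 105.35;
/// Approximate no-op compile time quoted in the text (an estimate, not a measurement here).
const NOOP_COMPILE_MS: f64 = 300.0;

/// Iterations per second for a given average in milliseconds.
fn ips(avg_ms: f64) -> f64 {
    1000.0 / avg_ms
}

/// How many times slower `slow_ms` is than `fast_ms`.
fn slowdown(slow_ms: f64, fast_ms: f64) -> f64 {
    slow_ms / fast_ms
}

/// Fraction of `total_ms` saved by replacing a `slow_ms` step with a `fast_ms` one,
/// assuming everything else in `total_ms` stays the same.
fn saved_fraction(slow_ms: f64, fast_ms: f64, total_ms: f64) -> f64 {
    (slow_ms - fast_ms) / total_ms
}
```

With these inputs, `slowdown(WILDCARD_AVG_MS, NIF_AVG_MS)` comes out at about 8.63 and `saved_fraction(WILDCARD_AVG_MS, NIF_AVG_MS, NOOP_COMPILE_MS)` at about 0.31, which lines up with the "roughly a third" estimate.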
I do not see an easy way to get these changes into the Elixir compiler, since this code relies on a NIF. I do not see a more performant set of functions in the BEAM that would help either. The issue seems to be that iterating over every file, checking whether it is a directory, iterating further, and so on carries some overhead when using the :filelib module and, by extension, the :file module. Creating a specialized BIF in the BEAM that returns the matching files in a directory seems like the most straightforward way to get these potential performance improvements without using a NIF. I would appreciate any feedback or additional ideas on how this could be improved or included in Elixir proper.