Any idea why the Tails library may be contributing to longer compile time?

I’ve been using the SaladUI library a bit, and one of the things I’ve noticed is that compiling dependencies has slowed down significantly. Just from watching the console logs, I can tell that the Tails library is likely causing the issue.

I come from C# (work) and I like pursuing and fixing performance bottlenecks, but I feel like my knowledge doesn’t transfer to Elixir. I was wondering what causes this relatively small library to take so long to compile, not out of malice toward the library but purely out of curiosity. I’ve heard macros can cause these issues, but the macro seems benign?

If anyone has any insight into this specific library, or knows of any tools I could use to find out myself and maybe play around with making it faster, I’d love the help.


:wave: @firesidewing

I was curious so I tried it out and it’s indeed slow:

iex(1)> :timer.tc fn -> Mix.install([:tails], force: true) end
{9605421, :ok} # microseconds, so about 9.6 seconds

Skimming the files, it seems to do attribute and function generation from external (?) resources in lib/custom.ex (zachdaniel/tails on GitHub), so I guess that’s what is causing the long compilation time.

With profiler on:

[profile]     15ms compiling +      0ms waiting while compiling lib/color_classes.ex
[profile]     21ms compiling +      0ms waiting while compiling lib/doc.ex
[profile]     26ms compiling +      0ms waiting for module Tails.ColorClasses while compiling lib/colors.ex
[profile]    151ms compiling +      0ms waiting while compiling lib/custom.ex
[profile]   7878ms compiling +    143ms waiting for module Tails.Custom while compiling lib/tails.ex
[profile] Finished compilation cycle of 8 modules in 8024ms
[profile] Finished group pass check of 8 modules in 41ms

And by “attribute and function generation” I mean something like this

# try running it in IEx
defmodule SlowModule do
  for i <- 1..100_000 do
    def to_string(unquote(i)), do: unquote(to_string(i))
  end
end
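
To get a feel for how compile time grows with the number of generated clauses, a rough timing sketch along these lines could be run in IEx. The `time_compile` helper, the `Gen<n>` module names, and the clause counts are made up for illustration; the numbers it prints will vary by machine.

```elixir
# Hypothetical helper: compiles a module with n generated clauses
# and returns how long compilation took, in microseconds.
time_compile = fn n ->
  source = """
  defmodule Gen#{n} do
    for i <- 1..#{n} do
      def f(unquote(i)), do: unquote(Integer.to_string(i))
    end
  end
  """

  {micros, _modules} = :timer.tc(fn -> Code.compile_string(source) end)
  micros
end

for n <- [100, 1_000, 10_000] do
  IO.puts("#{n} clauses: #{div(time_compile.(n), 1000)} ms")
end
```

This only measures one synthetic module, so it is a way to build intuition rather than a measurement of any real library.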

Tails doesn’t do that exact thing, but I think it’s something similar. One possible fix is to do less of that, maybe by splitting the work across modules or by using “smaller” data structures. I think Tz had a similar problem in the past and no longer does, so how it was solved there might be worth a look :slight_smile:
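
As one illustration of “doing less of that”, here is a minimal sketch (not how Tails is implemented) that swaps per-value function clauses for a single clause backed by a map built at compile time. The module name `LookupModule`, the function `int_to_string/1`, and the `1..10_000` range are all hypothetical.

```elixir
defmodule LookupModule do
  # Hypothetical example: the map is built once, while the module compiles,
  # and is embedded in the compiled module as a literal.
  @lookup Map.new(1..10_000, fn i -> {i, Integer.to_string(i)} end)

  # A single function clause instead of one clause per value.
  # Raises KeyError for integers outside the range.
  def int_to_string(i) when is_integer(i), do: Map.fetch!(@lookup, i)
end

IO.puts(LookupModule.int_to_string(42))
```

The compiler then handles one clause rather than thousands, and the runtime cost becomes a map lookup instead of a match on the clause head. Whether that trade is acceptable for something called on every render would need to be measured.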

Mix.install([:tails, :beam_file])
ex = BeamFile.elixir_code!(Tails)
File.write!("…/tails.ex", ex)

This module gets blown up to 70k lines (formatted) once macros are expanded.


Oh cool, I didn’t know you could do that in Elixir. I can definitely see that adding to the compilation time.

Yes, as @ruslandoga pointed out, it generates all the variants at compile time. The idea is to ensure it runs extremely fast at runtime, since it will be called many times during every render. There are other possible approaches, but the choice to do the work at compile time was intentional.

Tails is currently in need of a maintainer, as Zach no longer has time to work on it with all the other things he has going on.


It makes sense to make that trade-off; I would rather have the faster runtime performance as well. I can also understand not having the time to put into the library. With a 3-year-old and a 3-month-old myself, I doubt I could meaningfully contribute to the ecosystem.

I would be curious how well a NIF would do for this use case (using the Rust/Leptos Tailwind merger, for example). With all the string manipulation, I could see Rust performing better than even an optimized Elixir version. Can NIF wrappers be packaged up as a Hex package?

Regardless, when I get a moment to myself I’d like to try to benchmark a NIF vs Tails just for fun.

Yeah, NIFs can be in Hex packages. For Rust specifically, there’s Rustler and also RustlerPrecompiled. Tailwind 4 (coming soon) is actually starting to use Rust in parts of its toolchain. Hopefully, at some point there will be a library for this sort of logic upstream that we can just package up into a NIF.

As an aside, I think Tailwind 4 affords better tools to handle the problem. It uses native CSS layers, which enables you to slot in extra layers for things like variants that have precedence between components and utilities, eliminating the need for merging at all.

For future reference, I was able to compare a quick wrapper NIF against Tails for a few different sizes of class lists. I got about a 2-8x speed boost depending on size, with identical outputs and without the slow compile. I’ve never written a Hex package, but I might just throw it up when I have time.
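
For anyone repeating the comparison, a check for matching outputs could be sketched roughly as below. `OutputCheck` and the two stand-in functions are hypothetical, and the sketch assumes that class order and extra whitespace do not matter when comparing results.

```elixir
defmodule OutputCheck do
  # Hypothetical helper: returns every input for which the two
  # implementations produce a different set of classes.
  def mismatches(inputs, fun_a, fun_b) do
    Enum.reject(inputs, fn input ->
      normalize(fun_a.(input)) == normalize(fun_b.(input))
    end)
  end

  # Assumption: class order and extra whitespace are not significant,
  # so results are compared as sorted lists of class names.
  defp normalize(result) do
    result |> to_string() |> String.split() |> Enum.sort()
  end
end

# Stand-in functions so the sketch runs without either library installed.
join = fn classes -> Enum.join(classes, " ") end
reversed = fn classes -> classes |> Enum.reverse() |> Enum.join(" ") end

IO.inspect(OutputCheck.mismatches([["p-4", "m-2"], ["flex"]], join, reversed))
```

With both libraries available, the stand-ins could be swapped for `&Tails.classes/1` and the NIF wrapper’s list-merging function; an empty list means no differences were found for the inputs tried.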

I do agree there should just be a universal tool for this, but right now it’s useful to me without having to mess around with layers too much.


That sounds like an interesting experiment. Care to share more about it?
What did you try exactly?

All super experimental, probably missed a few things and could have made things nicer (the weird duplicate functions), but I just wanted to get a quick comparison.

The Rust code

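use rustler::{Encoder, Env, NifResult, Term};
// tw_merge! must also be brought into scope from the Tailwind merge crate in use (not named in this post).

#[rustler::nif]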
fn merge2<'a>(env: Env<'a>, class1: &str, class2: &str) -> NifResult<Term<'a>> {
    let result = tw_merge!(class1, class2);

    Ok(result.encode(env))
}

#[rustler::nif]
fn merge<'a>(env: Env<'a>, class: &str) -> NifResult<Term<'a>> {
    let result = tw_merge!(class);

    Ok(result.encode(env))
}

#[rustler::nif]
fn merge_list<'a>(env: Env<'a>, classes: Vec<&str>) -> NifResult<Term<'a>> {
    let result = tw_merge!(classes.join(" "));

    Ok(result.encode(env))
}

rustler::init!("Elixir.TailwindMerge");

The Elixir wrapper

defmodule TailwindMerge do
  use Rustler, otp_app: :tailwind_merge, crate: "tailwind_merge"

  def merge2(_base, _classes), do: :erlang.nif_error(:nif_not_loaded)
  def merge(_classes), do: :erlang.nif_error(:nif_not_loaded)
  def merge_list(_classes), do: :erlang.nif_error(:nif_not_loaded)
end

And the benchmark

  def run_benchmark do
    simple_classes = ["p-4 text-red-500", "p-6 text-blue-500"]

    medium_classes = [
      "p-4 m-2 text-red-500 bg-blue-300 flex items-center",
      "p-6 m-4 text-blue-500 bg-green-400 justify-between"
    ]

    complex_classes = [
      "p-4 px-6 py-2 m-2 mx-4 text-red-500 bg-blue-300 flex items-center justify-start rounded-lg border border-gray-200 shadow-sm hover:shadow-md transition-all duration-200",
      "p-6 px-8 m-4 mx-6 text-blue-500 bg-green-400 justify-between items-start rounded-xl border-2 border-blue-300 shadow-md hover:shadow-lg",
      "p-8 py-4 m-6 my-8 text-green-600 bg-yellow-200 flex-col items-end rounded-2xl border-4 border-green-400"
    ]

    large_classes = Enum.map(1..100, fn _ -> Enum.random(complex_classes) end)

    arbitrary_classes = [
      "text-[#custom] p-[14px] w-[232.5px] grid-cols-[1fr,auto,30%]",
      "text-[rgb(24,25,26)] p-[3vh] w-[calc(100%-3rem)] grid-cols-[repeat(auto-fit,minmax(200px,1fr))]"
    ]

    Benchee.run(
      %{
        "TailwindMerge - Simple" => fn ->
          TailwindMerge.merge_list(simple_classes)
        end,
        "Tails - Simple" => fn ->
          Tails.classes(simple_classes)
        end,
        "TailwindMerge - Medium" => fn ->
          TailwindMerge.merge_list(medium_classes)
        end,
        "Tails - Medium" => fn ->
          Tails.classes(medium_classes)
        end,
        "TailwindMerge - Complex" => fn ->
          TailwindMerge.merge_list(complex_classes)
        end,
        "Tails - Complex" => fn ->
          Tails.classes(complex_classes)
        end,
        "TailwindMerge - Large" => fn ->
          TailwindMerge.merge_list(large_classes)
        end,
        "Tails - Large" => fn ->
          Tails.classes(large_classes)
        end,
        "TailwindMerge - Arbitrary" => fn ->
          TailwindMerge.merge_list(arbitrary_classes)
        end,
        "Tails - Arbitrary" => fn ->
          Tails.classes(arbitrary_classes)
        end
      },
      time: 10,
      memory_time: 2,
      pre_check: true,
      formatters: [
        {Benchee.Formatters.Console, extended_statistics: true}
      ]
    )
  end

Which, on my machine, results in:

Name                                ips        average  deviation         median         99th %
TailwindMerge - Simple         523.26 K        1.91 μs   ±995.11%        1.83 μs        2.09 μs
TailwindMerge - Arbitrary      335.41 K        2.98 μs   ±634.92%        2.89 μs        3.07 μs
TailwindMerge - Medium         224.60 K        4.45 μs   ±276.75%        4.33 μs        4.79 μs
TailwindMerge - Complex         67.16 K       14.89 μs    ±42.19%       14.60 μs       21.65 μs
Tails - Simple                  54.74 K       18.27 μs    ±31.61%       18.04 μs       24.12 μs
Tails - Arbitrary               49.74 K       20.11 μs    ±31.14%       19.38 μs       37.77 μs
Tails - Medium                  47.43 K       21.08 μs    ±33.35%       20.70 μs       28.98 μs
Tails - Complex                 17.56 K       56.96 μs    ±13.94%       55.72 μs       82.48 μs
TailwindMerge - Large            2.75 K      363.23 μs     ±2.78%      360.60 μs      401.51 μs
Tails - Large                    0.84 K     1191.36 μs     ±3.54%     1184.49 μs     1291.30 μs

Comparison: 
TailwindMerge - Simple         523.26 K
TailwindMerge - Arbitrary      335.41 K - 1.56x slower +1.07 μs
TailwindMerge - Medium         224.60 K - 2.33x slower +2.54 μs
TailwindMerge - Complex         67.16 K - 7.79x slower +12.98 μs
Tails - Simple                  54.74 K - 9.56x slower +16.36 μs
Tails - Arbitrary               49.74 K - 10.52x slower +18.19 μs
Tails - Medium                  47.43 K - 11.03x slower +19.17 μs
Tails - Complex                 17.56 K - 29.80x slower +55.05 μs
TailwindMerge - Large            2.75 K - 190.07x slower +361.32 μs
Tails - Large                    0.84 K - 623.39x slower +1189.45 μs

Extended statistics: 

Name                              minimum        maximum    sample size                     mode
TailwindMerge - Simple            1.71 μs    21580.94 μs         4.64 M                  1.82 μs
TailwindMerge - Arbitrary         2.67 μs    18349.99 μs         3.13 M                  2.89 μs
TailwindMerge - Medium            4.04 μs     8567.07 μs         2.14 M                  4.30 μs
TailwindMerge - Complex          13.83 μs     3345.28 μs       660.40 K                 14.56 μs
Tails - Simple                   17.50 μs     2992.43 μs       539.68 K                    18 μs
Tails - Arbitrary                18.52 μs     2546.79 μs       490.53 K                 19.34 μs
Tails - Medium                   19.79 μs     3187.23 μs       468.31 K                 20.65 μs
Tails - Complex                  53.40 μs     1980.29 μs       174.36 K                 54.67 μs
TailwindMerge - Large           346.85 μs      654.30 μs        27.49 K                357.53 μs
Tails - Large                  1140.16 μs     2437.37 μs         8.39 K               1173.54 μs

Memory usage statistics:

Name                              average  deviation         median         99th %
TailwindMerge - Simple          0.0625 KB     ±0.00%      0.0625 KB      0.0625 KB
TailwindMerge - Arbitrary       0.0625 KB     ±0.00%      0.0625 KB      0.0625 KB
TailwindMerge - Medium          0.0625 KB     ±0.00%      0.0625 KB      0.0625 KB
TailwindMerge - Complex         0.0625 KB     ±0.00%      0.0625 KB      0.0625 KB
Tails - Simple                    5.55 KB     ±0.00%        5.55 KB        5.55 KB
Tails - Arbitrary                 7.70 KB     ±0.00%        7.70 KB        7.70 KB
Tails - Medium                    9.27 KB     ±0.00%        9.27 KB        9.27 KB
Tails - Complex                  28.65 KB     ±0.00%       28.65 KB       28.65 KB
TailwindMerge - Large           0.0625 KB     ±0.00%      0.0625 KB      0.0625 KB
Tails - Large                   680.54 KB     ±0.00%      680.54 KB      680.54 KB

Comparison: 
TailwindMerge - Simple          0.0625 KB
TailwindMerge - Arbitrary       0.0625 KB - 1.00x memory usage +0 KB
TailwindMerge - Medium          0.0625 KB - 1.00x memory usage +0 KB
TailwindMerge - Complex         0.0625 KB - 1.00x memory usage +0 KB
Tails - Simple                    5.55 KB - 88.88x memory usage +5.49 KB
Tails - Arbitrary                 7.70 KB - 123.25x memory usage +7.64 KB
Tails - Medium                    9.27 KB - 148.38x memory usage +9.21 KB
Tails - Complex                  28.65 KB - 458.38x memory usage +28.59 KB
TailwindMerge - Large           0.0625 KB - 1.00x memory usage +0 KB
Tails - Large                   680.54 KB - 10888.62x memory usage +680.48 KB

Extended statistics: 

Name                              minimum        maximum    sample size                     mode
TailwindMerge - Simple          0.0625 KB      0.0625 KB       185.28 K                0.0625 KB
TailwindMerge - Arbitrary       0.0625 KB      0.0625 KB       161.84 K                0.0625 KB
TailwindMerge - Medium          0.0625 KB      0.0625 KB       144.94 K                0.0625 KB
TailwindMerge - Complex         0.0625 KB      0.0625 KB        77.44 K                0.0625 KB
Tails - Simple                    5.55 KB        5.55 KB        46.88 K                  5.55 KB
Tails - Arbitrary                 7.70 KB        7.70 KB        32.52 K                  7.70 KB
Tails - Medium                    9.27 KB        9.27 KB        38.40 K                  9.27 KB
Tails - Complex                  28.65 KB       28.65 KB        14.86 K                 28.65 KB
TailwindMerge - Large           0.0625 KB      0.0625 KB         5.26 K                0.0625 KB
Tails - Large                   680.49 KB      680.54 KB         1.25 K                680.54 KB