steffend

Phoenix Core Team

Memory explosion when accessing a large map while using Nx+EXLA

Hey there,

I’m currently benchmarking some sentence transformers (see Nx vs. Python performance for sentence-transformer encoding) and stumbled upon something weird:

Basically, I’m encoding a sentence into a vector using Bumblebee (Nx+EXLA) and then calculating the cosine similarity against a list of ~150k pre-calculated vectors. I’m benchmarking the whole process to see what performance I can achieve:

[%{embedding: query}] = Nx.Serving.batched_run(SentenceTransformer, ["this is a test input"])

sim =
  for chunk <- vectors do
    Bumblebee.Utils.Nx.cosine_similarity(query, chunk)
  end
  |> Nx.concatenate()

{similarity, labels} = Nx.top_k(sim, k: 10)

indexes = Nx.to_flat_list(labels)
scores = Nx.to_flat_list(similarity)

for {idx, score} <- Enum.zip(indexes, scores) do
  # accessing the sentence_map here leads to the issue
  %{sentence: sentence_map[idx], score: score}
end

This works fine until I map the resulting top-k indexes back to the sentences using a map of 150k index → sentence entries. Once I do that, EXLA suddenly starts to consume a LOT of memory.

I think the memory usage is in EXLA, as it is not visible in the observer.

Running on CUDA seems to confirm this: it even runs out of memory completely.
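
One way to cross-check the observer reading from inside the VM is to print what the BEAM itself reports. The sketch below uses only :erlang.memory/1; the module name BeamMemory is made up for illustration. Memory that a native library allocates outside the BEAM's own allocators does not show up in these numbers, so a large gap between them and the process size reported by the operating system would be consistent with the growth happening on the native side.

```elixir
defmodule BeamMemory do
  # Hypothetical helper: returns selected BEAM memory counters in bytes.
  # :erlang.memory/1 only covers memory handled by the VM's own allocators.
  def snapshot do
    for type <- [:total, :processes, :binary, :ets], into: %{} do
      {type, :erlang.memory(type)}
    end
  end

  # Prints the snapshot in megabytes for easier reading.
  def print do
    for {type, bytes} <- snapshot() do
      IO.puts("#{type}: #{Float.round(bytes / 1_048_576, 1)} MB")
    end

    :ok
  end
end

BeamMemory.print()
```

Calling BeamMemory.print() before and during the benchmark and comparing against the OS-level numbers gives a rough idea of which side is growing.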

This is my script to reproduce:

# System.put_env("XLA_TARGET", "cuda118")

Mix.install([
  {:bumblebee, github: "elixir-nx/bumblebee", ref: "23de64b1b88ed3aad266025c207f255312b80ba6"},
  {:nx, github: "elixir-nx/nx", sparse: "nx", override: true},
  {:exla, github: "elixir-nx/nx", sparse: "exla", override: true},
  {:axon, "~> 0.5.1"},
  {:kino, "~> 0.9"}
])

Nx.global_default_backend(EXLA.Backend)
# Nx.Defn.global_default_options(compiler: EXLA, client: :cuda)
Nx.Defn.global_default_options(compiler: EXLA, client: :host)

model_name = "sentence-transformers/all-MiniLM-L6-v2"
{:ok, model_info} = Bumblebee.load_model({:hf, model_name})
{:ok, tokenizer} = Bumblebee.load_tokenizer({:hf, model_name})

serving =
  Bumblebee.Text.TextEmbedding.text_embedding(model_info, tokenizer,
    compile: [batch_size: 64, sequence_length: 128],
    defn_options: [compiler: EXLA],
    output_attribute: :hidden_state,
    output_pool: :mean_pooling
  )

Kino.start_child({Nx.Serving, serving: serving, name: SentenceTransformer, batch_size: 64, batch_timeout: 50})

defmodule ConcurrentBench do
  def run(fun, concurrency \\ System.schedulers_online(), timeout \\ 10_000) do
    # use an erlang counter to count the number of function invocations
    counter = :counters.new(1, [:write_concurrency])

    # returns time in microseconds
    {taken, _} =
      :timer.tc(fn ->
        tasks =
          for _i <- 1..concurrency do
            Task.async(fn ->
              Stream.repeatedly(fn ->
                fun.()
                # only count after the function ran successfully
                :counters.add(counter, 1, 1)
              end)
              |> Stream.run()
            end)
          end

        results = Task.yield_many(tasks, timeout)

        # kill all processes
        Enum.map(results, fn {task, res} ->
          res || Task.shutdown(task, :brutal_kill)
        end)
      end)

    runs = :counters.get(counter, 1)
    ips = runs / (taken / 1_000_000)

    %{runs: runs, ips: ips}
  end
end

n = 150000

sentence_map = Map.new(1..n, fn i -> {i, i} end)
[%{embedding: vector}] = Nx.Serving.batched_run(SentenceTransformer, ["This is a test sentence"])

IO.puts("encoded sample input")

 
vectors =
  Stream.duplicate(vector, n)
  |> Stream.chunk_every(10000)
  |> Stream.map(fn chunk -> Nx.stack(chunk) end)
  |> Enum.to_list()

IO.puts("created dummy comparison vectors")
IO.puts("Running concurrent bench now")

ConcurrentBench.run(
  fn ->
    [%{embedding: query}] = Nx.Serving.batched_run(SentenceTransformer, ["this is a test input"])

    sim =
      for chunk <- vectors do
        Bumblebee.Utils.Nx.cosine_similarity(query, chunk)
      end
      |> Nx.concatenate()

    {similarity, labels} = Nx.top_k(sim, k: 10)

    indexes = Nx.to_flat_list(labels)
    scores = Nx.to_flat_list(similarity)

    # uncomment this
    # for {idx, score} <- Enum.zip(indexes, scores) do
    #   %{sentence: sentence_map[idx], score: score}
    # end

    nil
  end,
  16, 60_000
) |> IO.inspect()

See the “uncomment this” section.

I guess this is a bug? At least accessing the map should not lead to EXLA allocating memory, right?

Thanks for any suggestions!

polvalente

Nx Core Team

Last week we looked into this a bit. There is probably something in the benchmarking infrastructure itself that provokes data copying between processes, and we think the only difference the commented-out for comprehension actually makes is that the processes last longer, which changes how garbage collection can work.

The initial hypothesis was that the loop would make the processes take longer to finish, therefore holding onto the memory copied to them, but I didn’t measure this. Instead, it behaved the same as a long Process.sleep where the memory usage increased more slowly, but to the same level as without the for/Process.sleep call.
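
As a general illustration of data being copied between processes (not a diagnosis of this particular issue), here is a minimal stdlib-only sketch. A closure that references a large map carries that map in its environment, and starting a task with that closure copies the map onto the new process's heap. The module name ClosureCopyDemo and the key 42 are made up for the example.

```elixir
defmodule ClosureCopyDemo do
  # Runs `fun` in a fresh Task and returns that task's memory in bytes.
  def task_memory(fun) do
    Task.async(fn ->
      fun.()
      # :memory covers the whole process, including its heap
      {:memory, bytes} = Process.info(self(), :memory)
      bytes
    end)
    |> Task.await()
  end

  def run(n \\ 150_000) do
    big_map = Map.new(1..n, fn i -> {i, i} end)

    %{
      # this closure does not reference big_map, so nothing large is copied
      without_map: task_memory(fn -> nil end),
      # this closure references big_map, so the map is copied into the task
      with_map: task_memory(fn -> big_map[42] end)
    }
  end
end

IO.inspect(ClosureCopyDemo.run(), label: "task memory in bytes")
```

The code lives in a module on purpose: in IEx or a notebook cell, anonymous functions may capture every variable in scope, which would blur the comparison.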

I tried a few things, like storing the vectors variable in :persistent_term and reading it from there, but without success.
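
For readers unfamiliar with that approach, a minimal sketch of the :persistent_term pattern could look like the following. It uses plain lists of floats as stand-ins for the real tensor chunks, and the module name and key are hypothetical. Terms read with :persistent_term.get/1 are not copied onto the reading process's heap, which is why it is a common thing to try when large data is shared by many processes.

```elixir
defmodule PersistentTermDemo do
  # Hypothetical key; any term works, a tuple with the module name avoids clashes.
  @key {__MODULE__, :vectors}

  # Stores the data once, globally. Updating persistent terms is expensive,
  # so this is meant for write-once, read-many data.
  def store(vectors), do: :persistent_term.put(@key, vectors)

  # Reads the data from inside a separate process without passing it
  # through the closure.
  def count_in_task do
    Task.async(fn -> length(:persistent_term.get(@key)) end)
    |> Task.await()
  end
end

# Stand-in data: three "chunks" of four floats each.
PersistentTermDemo.store(for i <- 1..3, do: List.duplicate(i * 1.0, 4))
IO.inspect(PersistentTermDemo.count_in_task(), label: "chunks read inside task")
```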

@josevalim perhaps there’s a chance each process is getting a JIT cache miss and getting its own compilation?

josevalim

Creator of Elixir

We lock the cache key to avoid that, so it shouldn’t be the case, unless there is a bug.
