High memory usage when transcribing with whisper-large-v3 in Nx/EXLA/Bumblebee

This may explain what you’re seeing; I only just learned about it myself: Bumblebee: Slow load_model in GenServer, slow Nx.Serving.run in exs file - #12 by jonatanklosko