What processor constraints does this stack have?
GPU?
Is there a deployment README.md somewhere?
Phoenix app examples with deployment considerations can be found here: https://github.com/elixir-nx/bumblebee/tree/main/examples/phoenix
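For context, those examples roughly follow the pattern of loading the model once at application startup and putting an `Nx.Serving` in the supervision tree, so request processes share one (batched) model. A minimal sketch, assuming placeholder names (`MyApp.*`) and an example model rather than anything specific from the linked repo:

```elixir
# lib/my_app/application.ex
defmodule MyApp.Application do
  use Application

  @impl true
  def start(_type, _args) do
    # Load model and tokenizer once at boot (downloads from Hugging Face hub)
    {:ok, model_info} =
      Bumblebee.load_model({:hf, "distilbert-base-uncased-finetuned-sst-2-english"})

    {:ok, tokenizer} =
      Bumblebee.load_tokenizer({:hf, "distilbert-base-uncased-finetuned-sst-2-english"})

    serving = Bumblebee.Text.text_classification(model_info, tokenizer)

    children = [
      # Nx.Serving batches concurrent requests from many processes
      {Nx.Serving, serving: serving, name: MyApp.Serving, batch_size: 8, batch_timeout: 100},
      MyAppWeb.Endpoint
    ]

    Supervisor.start_link(children, strategy: :one_for_one, name: MyApp.Supervisor)
  end
end
```

A controller or LiveView can then call `Nx.Serving.batched_run(MyApp.Serving, text)` and requests arriving within the batch window get batched together automatically.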
Thanks, I was thinking more about GPU-friendly vendor support, e.g. fly.io or …
I’ve come across https://www.vultr.com/products/cloud-gpu/ as an option but haven’t tried it out (also not sure how the pricing compares to other options…)
Vultr has a good value-to-performance ratio, for sure.
There is a related thread here that may be of interest:
(Might be worth posting some of those cloud providers there as well :D)
Also came across this: a Google Colab that runs an Elixir Livebook with Bumblebee and CUDA acceleration (probably only useful for development / personal use though).
Edit: it is indeed limited; some attempts at using Bumblebee fail with an error like `Allocator (GPU_0_bfc) ran out of memory trying to allocate 4.41GiB requested by op`.
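On a memory-constrained GPU like Colab's, one thing worth trying is telling EXLA not to preallocate most of the GPU memory up front. A sketch of a Livebook setup cell, assuming EXLA's documented client options; the exact `XLA_TARGET` value and package versions depend on your environment:

```elixir
# Livebook setup cell (versions and CUDA target are illustrative)
Mix.install(
  [
    {:bumblebee, "~> 0.4"},
    {:exla, "~> 0.6"}
  ],
  system_env: %{"XLA_TARGET" => "cuda118"}
)

# By default XLA grabs nearly all GPU memory at startup; growing
# on demand instead can avoid some of these OOM errors.
Application.put_env(:exla, :clients,
  cuda: [platform: :cuda, preallocate: false]
)

# Run tensor operations on the CUDA client by default
Application.put_env(:nx, :default_backend, {EXLA.Backend, client: :cuda})
```

This only helps when the memory pressure comes from preallocation, though; a model that genuinely needs 4.41GiB for one op will still fail on a smaller card.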
Consider carefully whether you actually need a GPU in production. Inference is far less compute-intensive than training, so ask whether you can meet your business needs with just the CPU version of XLA or Torchx. Many initial product MVPs can live with the sub-second latency of some models running on the CPU.
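A cheap way to check this before paying for a GPU is to time a single inference on CPU. A minimal sketch, assuming a CPU-only EXLA build (the default `XLA_TARGET`) and an example model:

```elixir
# CPU-only latency spot check (model and versions are illustrative)
Mix.install([
  {:bumblebee, "~> 0.4"},
  {:exla, "~> 0.6"}  # default XLA_TARGET is the CPU build
])

Application.put_env(:nx, :default_backend, EXLA.Backend)

{:ok, model_info} =
  Bumblebee.load_model({:hf, "distilbert-base-uncased-finetuned-sst-2-english"})

{:ok, tokenizer} =
  Bumblebee.load_tokenizer({:hf, "distilbert-base-uncased-finetuned-sst-2-english"})

serving = Bumblebee.Text.text_classification(model_info, tokenizer)

# :timer.tc returns {elapsed_microseconds, result}
{micros, result} = :timer.tc(fn ->
  Nx.Serving.run(serving, "Elixir makes ML deployment fun!")
end)

IO.inspect(result)
IO.puts("Inference took #{micros / 1000} ms on CPU")
```

If that number (measured after a warm-up run, since the first call includes compilation) is within your latency budget, you can likely skip the GPU hosting question entirely for the MVP.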