What processor constraints does this stack have?
GPU?
Is there a deployment README.md somewhere?
Phoenix app examples with deployment considerations can be found here: https://github.com/elixir-nx/bumblebee/tree/main/examples/phoenix
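For context, those examples roughly follow the pattern of loading the model once at application startup and putting an `Nx.Serving` in the supervision tree, so request processes share one (batched) model. A minimal sketch, assuming placeholder names (`MyApp.*`) and an example model rather than anything specific from the linked repo:

```elixir
# lib/my_app/application.ex
defmodule MyApp.Application do
  use Application

  @impl true
  def start(_type, _args) do
    # Load model and tokenizer once at boot (downloads from Hugging Face hub)
    {:ok, model_info} =
      Bumblebee.load_model({:hf, "distilbert-base-uncased-finetuned-sst-2-english"})

    {:ok, tokenizer} =
      Bumblebee.load_tokenizer({:hf, "distilbert-base-uncased-finetuned-sst-2-english"})

    serving = Bumblebee.Text.text_classification(model_info, tokenizer)

    children = [
      # Nx.Serving batches concurrent requests from many processes
      {Nx.Serving, serving: serving, name: MyApp.Serving, batch_size: 8, batch_timeout: 100},
      MyAppWeb.Endpoint
    ]

    Supervisor.start_link(children, strategy: :one_for_one, name: MyApp.Supervisor)
  end
end
```

A controller or LiveView can then call `Nx.Serving.batched_run(MyApp.Serving, text)` and requests arriving within the batch window get batched together automatically.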
Thanks, I was thinking more about GPU-friendly vendor support, e.g. fly.io or …
I’ve come across https://www.vultr.com/products/cloud-gpu/ as an option but haven’t tried it out (also not sure how the pricing compares to other options…)
Vultr has a good value-to-performance ratio, for sure.
There is a related thread here that may be of interest:
(Might be worth posting some of those cloud providers there as well :D)
Also came across this: a Google Colab that runs an Elixir Livebook with Bumblebee and CUDA acceleration (probably only useful for development / personal use though).
Edit: it is indeed limited; some attempts at using Bumblebee fail with an error like `Allocator (GPU_0_bfc) ran out of memory trying to allocate 4.41GiB requested by op`.
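On a memory-constrained GPU like Colab's, one thing worth trying is telling EXLA not to preallocate most of the GPU memory up front. A sketch of a Livebook setup cell, assuming EXLA's documented client options; the exact `XLA_TARGET` value and package versions depend on your environment:

```elixir
# Livebook setup cell (versions and CUDA target are illustrative)
Mix.install(
  [
    {:bumblebee, "~> 0.4"},
    {:exla, "~> 0.6"}
  ],
  system_env: %{"XLA_TARGET" => "cuda118"}
)

# By default XLA grabs nearly all GPU memory at startup; growing
# on demand instead can avoid some of these OOM errors.
Application.put_env(:exla, :clients,
  cuda: [platform: :cuda, preallocate: false]
)

# Run tensor operations on the CUDA client by default
Application.put_env(:nx, :default_backend, {EXLA.Backend, client: :cuda})
```

This only helps when the memory pressure comes from preallocation, though; a model that genuinely needs 4.41GiB for one op will still fail on a smaller card.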
Consider carefully whether you actually need a GPU in production. Inference is far less compute-intensive than training, so ask whether you can meet your business needs with just the CPU version of XLA or Torchx. Many initial product MVPs can live with the sub-second latency of some models running on the CPU.
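A cheap way to check this before paying for a GPU is to time a single inference on CPU. A minimal sketch, assuming a CPU-only EXLA build (the default `XLA_TARGET`) and an example model:

```elixir
# CPU-only latency spot check (model and versions are illustrative)
Mix.install([
  {:bumblebee, "~> 0.4"},
  {:exla, "~> 0.6"}  # default XLA_TARGET is the CPU build
])

Application.put_env(:nx, :default_backend, EXLA.Backend)

{:ok, model_info} =
  Bumblebee.load_model({:hf, "distilbert-base-uncased-finetuned-sst-2-english"})

{:ok, tokenizer} =
  Bumblebee.load_tokenizer({:hf, "distilbert-base-uncased-finetuned-sst-2-english"})

serving = Bumblebee.Text.text_classification(model_info, tokenizer)

# :timer.tc returns {elapsed_microseconds, result}
{micros, result} = :timer.tc(fn ->
  Nx.Serving.run(serving, "Elixir makes ML deployment fun!")
end)

IO.inspect(result)
IO.puts("Inference took #{micros / 1000} ms on CPU")
```

If that number (measured after a warm-up run, since the first call includes compilation) is within your latency budget, you can likely skip the GPU hosting question entirely for the MVP.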