Is there any way to spawn a process on the GPU?

I am wondering if it is possible to run a process from a BEAM language on the GPU, and if so, how?

Being able to spawn a process on the GPU could be cool because it might (depending on how it’s implemented) make the BEAM much better at handling lots of floating-point math very quickly. And since GPUs generally have many more cores than CPUs, it would be possible to run a lot more processes truly in parallel.

I would also like to know some downsides of doing this.


GPUs are completely different animals from CPUs: GPU cores are very small and very limited in what kinds of instructions they can execute, but for that reason they are very fast and plentiful.

Because they are so different, I do not think that a BEAM scheduler would be able to run there.

If you’d like to make use of the GPU in your code, you might want to look into writing a NIF (such as using rustler and writing it in Rust).
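To give a flavor of the NIF route, here is a minimal sketch of the kind of numeric kernel you might push down into native code. The rustler glue (the `#[rustler::nif]` attribute and the `rustler::init!` registration) is omitted so the sketch stays self-contained and runnable on its own; the function name and shape are illustrative, not from any real library:

```rust
// Hypothetical numeric kernel you might expose to the BEAM as a NIF.
// With rustler you would annotate this with #[rustler::nif] and register
// it via rustler::init!; here it is a plain function so it runs standalone.
fn dot_product(a: &[f64], b: &[f64]) -> f64 {
    // Elementwise multiply-and-sum: the kind of dense floating-point
    // work that is worth moving off the BEAM (and, eventually, to a GPU).
    a.iter().zip(b.iter()).map(|(x, y)| x * y).sum()
}

fn main() {
    let a = [1.0, 2.0, 3.0];
    let b = [4.0, 5.0, 6.0];
    println!("{}", dot_product(&a, &b)); // prints 32
}
```

From Elixir you would then call the exported NIF like any other function, while the BEAM keeps doing what it is good at: scheduling and supervising.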


There was research done in the EU on integrating Erlang and CUDA. I just don’t know the details.


Kevin Smith - Erlang & CUDA: Concurrent and Fast (2011)


Machine Learning in Erlang and CUDA

As already alluded to - the BEAM is used for orchestrating computation, because that is what it is good at.

Aside: Simon Marlow: Parallel and Concurrent Programming in Haskell (; GPU Programming with Accelerate)


There is also the Scenic library that was presented at ElixirConf.

On the surface it would seem unrelated, as it is dedicated to drawing graphics - but it draws those graphics through OpenGL. There is a small step between scheduling graphics on the GPU and scheduling programs to run on the GPU. Scenic may or may not have the primitives needed to take that step, but it certainly shows that communicating with the GPU is possible.


There are some BEAM runtimes that use LLVM, so it might be possible to run a process on the GPU using the SPIRV-LLVM-Translator.


Bleh. I feel like this is such a ripe area of research and development, but Kevin Smith’s Pteracuda library hasn’t been touched in a decade. I feel like there should be a way to just inform the BEAM that a GPU is available and have it throw processes at it. It’s awesome that the BEAM is so good at scheduling, but it seems irrefutable that never having to swap out a process is way better. I’m not a hardware guy though, so maybe there are limits to what a GPU core can do?

Just to throw out a possible use-case, how about a web scraper that spawns many, many processes to make HTTP connections and parse the responses? I’m seeing Nvidia GPUs with over 1500 cores these days! That sounds super parallel to me!

GPU cores are SIMD. One way to think of it is that you must dispatch the same instruction to all of the cores simultaneously (this is not exactly correct for most modern GPUs, but it gives you the flavor of the problem). I do think that the idea of having a process manage a GPU is correct, though, and every other programming language that does GPU stuff gets this abstraction wrong. I tried to convince the Julia folks that they should treat the GPU as a virtual “distributed node” that you can dispatch Julia code to (you can do this with distributed CPU nodes in Julia), but they did not pick that idea up.
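The lockstep problem above can be sketched in a few lines. This is a toy model of SIMD execution, not real GPU code: every “lane” runs the same instruction stream, so a branch cannot send lanes down different code paths independently; both paths execute and a mask selects the result per lane. The function and variable names are made up for illustration:

```rust
// Toy model of SIMD execution: one instruction stream, many lanes.
// Lanes cannot branch independently; both sides of a branch execute
// and a per-lane mask selects the result. Lanes masked off are idle,
// which is why divergent, branchy workloads (like a web scraper making
// HTTP calls) map poorly onto GPU cores.
fn simd_step(lanes: &[f32], mask: &[bool]) -> Vec<f32> {
    // Every lane executes the "then" path...
    let then_path: Vec<f32> = lanes.iter().map(|x| x * 2.0).collect();
    // ...and every lane also executes the "else" path.
    let else_path: Vec<f32> = lanes.iter().map(|x| x + 1.0).collect();
    // The mask then picks one result per lane; the other work is wasted.
    mask.iter()
        .enumerate()
        .map(|(i, &m)| if m { then_path[i] } else { else_path[i] })
        .collect()
}

fn main() {
    let lanes = [1.0, 2.0, 3.0, 4.0];
    let mask = [true, false, true, false];
    println!("{:?}", simd_step(&lanes, &mask)); // [2.0, 3.0, 6.0, 5.0]
}
```

Uniform floating-point math keeps every lane busy; independent, branchy processes leave most lanes masked off, which is the core mismatch with the BEAM’s process model.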


@ityonemo thanks for the explanation! I hope that tech/libs catch up and we can make the BEAM easily dispatch onto 1k+ cores!