Using Elixir for Data Science and Machine Learning

I see that this post is from 2016 and quite a lot has probably changed since then. I’m curious to hear your experiences with using Elixir/Erlang for AI and neural networks since then, if you’re still at it.

2 Likes

I stumbled upon these blog posts by @TheQuengineer :

http://www.automatingthefuture.com/blog/2017/2/20/deep-learning-building-and-training-a-multi-layered-neural-network-in-elixir

http://www.automatingthefuture.com/blog/2016/11/30/training-elixir-processes-to-learn-like-neurons

Perhaps they will be useful to others who follow this thread.

3 Likes

This is going to sound harsh, but those posts seem quite naive - a three-layer neural network is not “deep learning”, and using Erlang/Elixir processes as individual neurons would scale terribly. DL networks are essentially a pipeline of tensors that transform input tensors into output tensors. Horizontal scaling via distributed computing is horribly inefficient for this - you need massive hardware parallelism, e.g., GPUs via CUDA. Elixir might have some promise as a front end to TensorFlow or another C/C++/Rust library, but IMO implementing things in pure Elixir is a non-starter for anything but toy problems.
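To make the tensor-pipeline point concrete, here is a toy two-layer forward pass in pure Elixir (the `ToyNet` module and weights are made up for illustration). Every multiply walks linked lists, which is exactly why list-based code like this cannot compete with BLAS- or GPU-backed tensor libraries:

```elixir
# Toy two-layer forward pass in pure Elixir, using lists of lists as "tensors".
# Purely illustrative: each multiply traverses linked lists element by element.
defmodule ToyNet do
  # Matrix-vector product: list of rows x vector -> vector
  def matvec(rows, vec) do
    Enum.map(rows, fn row ->
      row
      |> Enum.zip(vec)
      |> Enum.map(fn {w, x} -> w * x end)
      |> Enum.sum()
    end)
  end

  # Element-wise ReLU activation
  def relu(vec), do: Enum.map(vec, &max(&1, 0.0))

  # The "pipeline of tensors": input -> hidden (ReLU) -> output
  def forward(input, w1, w2) do
    hidden = w1 |> matvec(input) |> relu()
    matvec(w2, hidden)
  end
end

w1 = [[1.0, -1.0], [0.5, 0.5]]
w2 = [[1.0, 1.0]]
ToyNet.forward([1.0, 2.0], w1, w2)
```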

6 Likes

Yes, I agree with what you’re saying, @jamesnorton. I found those articles to be interesting nonetheless, despite the limited practical utility of the proposed approach.

I don’t see many blog posts and articles about using Elixir for machine learning and deep learning in particular, likely due to the limitations you mentioned. I’m easily excited.

I welcome any writings on the subject, if only as food for thought :slight_smile:

As you can see in my post above from a couple of days ago, I share your concerns about the computational aspects of machine learning in pure Elixir, accessing GPUs via CUDA, etc.

Personally, I’m more hopeful about the possibilities of using Elixir for the non-computational and operational aspects of data science, such as data gathering and wrangling, model serving/exposure, model distribution for federated/collaborative learning, monitoring model behaviour, etc.

1 Like

I am not a data scientist, but I’m interested in the subject and stumbled upon this library. I haven’t tried or tested it, but since it was mentioned that any serious machine learning needs access to CUDA for GPU processing, I thought it could be of interest to you.

The library has not been updated in a while but it seems someone already looked into CUDA bindings for Elixir/Erlang.

2 Likes

OpenCL seems better than CUDA, as then you’d be able to use other stream processors, FPGAs, etc. CUDA locks you in to Nvidia hardware only, when there is so much other hardware available. Even if you did just want to constrain yourself to GPUs, then Vulkan’s compute layer would be the way to go, not CUDA.

1 Like

This Erlang library for OpenCL might be worth checking out then. It’s been around for a while, it seems, but has been recently updated: https://github.com/tonyrog/cl. It should be semi-compatible with Elixir.

2 Likes

Disclaimer: I’m not a GPGPU expert. Please be skeptical about what I’m about to say.


The issue seems to be that the major projects such as Google’s TensorFlow only support CUDA and (unfortunately) not OpenCL (yet). To my limited knowledge in this area, I believe the primary reason for this is because TensorFlow depends on Eigen, which currently only supports CUDA.

It would be awesome to have a vendor-agnostic platform layer of sorts, independent of OpenCL, CUDA and any future GPGPU interfaces. Back when I was working in the games industry, I led a rendering team which developed an analogous proprietary platform layer for OpenGL and DirectX. Building something like that is a massive and time-consuming undertaking.

It seems to me like Nvidia has an edge with CUDA, simply because that is what most established frameworks, libraries, tools and applications have chosen to target. Data scientists who make use of said products rarely have the know-how, time and interest to develop low-level GPU interfaces.

1 Like

You might be interested in Intel’s nGraph:

nGraph Library is an open-source C++ library and runtime / compiler suite for Deep Learning ecosystems. With nGraph Library, data scientists can use their preferred deep learning framework on any number of hardware architectures, for both training and inference.

1 Like

Interesting, thanks for the tip! I’ll check it out.

Let’s keep in mind that supervised learning is only one approach to ML, and backpropagation (the heaviest number-crunching part of deep learning) is falling out of favor. DL is not very scalable, and reinforcement learning seems to be expected to be the key path to AGI (artificial general intelligence), according to Richard Sutton and others.

With that in mind, I have been successfully using Matrex for high-dimensional vectorized computation for multi-armed bandits (elementary reinforcement learning). You can check it out here; it’s part of The Automata Project. Down the road I may need to use Python or Julia via ErlPort (or maybe even Docker containers, as used here) for the vectorized parts, but for prototyping, things are going well so far.
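For readers new to multi-armed bandits, here is a minimal epsilon-greedy bandit in pure Elixir. This is a sketch of the general technique, not code from The Automata Project; the `Bandit` module and arm names are made up for illustration:

```elixir
# Minimal epsilon-greedy multi-armed bandit.
# Tracks a running mean reward per arm; explores with probability epsilon.
defmodule Bandit do
  defstruct counts: %{}, values: %{}

  def new(arms) do
    %__MODULE__{
      counts: Map.new(arms, &{&1, 0}),
      values: Map.new(arms, &{&1, 0.0})
    }
  end

  # With probability epsilon pick a random arm; otherwise exploit the best arm.
  def select(%__MODULE__{values: values}, epsilon) do
    if :rand.uniform() < epsilon do
      values |> Map.keys() |> Enum.random()
    else
      values |> Enum.max_by(fn {_arm, mean} -> mean end) |> elem(0)
    end
  end

  # Incremental mean update: v_new = v + (reward - v) / n
  def update(%__MODULE__{counts: counts, values: values} = bandit, arm, reward) do
    n = counts[arm] + 1
    v = values[arm]

    %{bandit |
      counts: Map.put(counts, arm, n),
      values: Map.put(values, arm, v + (reward - v) / n)}
  end
end

# Example: two pulls of arm :a with rewards 1.0 and 0.0
bandit =
  Bandit.new([:a, :b])
  |> Bandit.update(:a, 1.0)
  |> Bandit.update(:a, 0.0)
```

With `epsilon = 0.0`, `Bandit.select/2` always exploits the arm with the highest running mean; raising epsilon trades that off for exploration.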

In my view, neuroevolution via Topology and Weight Evolving Artificial Neural Networks (TWEANN) with Novelty Search is one of the most promising alternative approaches, and Elixir has a head start in this sub-field of ML thanks to Gene Sher’s book.

As far as Python interop goes, something like this looks pretty appealing for scaling ML as well.

The Automata Project is seeking contributors if anyone here is interested.

3 Likes

I wrote a comment on a separate thread with some more relevant info on this topic:

I’m leaving a reference here for future adventurers to discover.

2 Likes

Hey @ericsteen, what is the landscape now that we have Nx at our disposal?
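For later readers: with Nx, the “pipeline of tensors” style discussed above becomes natural in Elixir itself. A minimal sketch, assuming the `:nx` dependency is available (version constraint and the `NxNet` module are illustrative, not from any post in this thread):

```elixir
# Minimal Nx sketch: the same tensor-pipeline view, now backed by a library
# that can delegate to compiled/accelerated backends (e.g. EXLA).
Mix.install([{:nx, "~> 0.6"}])

defmodule NxNet do
  # input -> hidden (ReLU) -> output, expressed as tensor operations
  def forward(x, w1, w2) do
    hidden = w1 |> Nx.dot(x) |> Nx.max(0)
    Nx.dot(w2, hidden)
  end
end

w1 = Nx.tensor([[1.0, -1.0], [0.5, 0.5]])
w2 = Nx.tensor([[1.0, 1.0]])
x = Nx.tensor([1.0, 2.0])

NxNet.forward(x, w1, w2) |> Nx.to_flat_list()
```

The point is that the tensor math stays in Elixir syntax while the heavy lifting can be handed off to a native backend, which addresses the “pure Elixir is a non-starter” concern raised earlier in the thread.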

1 Like