Using Elixir for Data Science and Machine Learning

I’m a data scientist who loves Elixir.

There is an incredible amount of potential for Elixir in data science and machine learning, since the language provides excellent facilities for data transformation through pattern matching, piping, etc.
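To make that concrete, here is a minimal sketch of cleaning a small data set with pattern matching and the pipe operator. The module name `TidyExample`, the field names, and the sample rows are all made up for illustration; only standard library functions (`Enum`, `Float.parse/1`) are used.

```elixir
defmodule TidyExample do
  # Hypothetical raw records; field names and values are illustrative only.
  def raw_rows do
    [
      %{"name" => "a", "value" => "1.5"},
      %{"name" => "b", "value" => "n/a"},
      %{"name" => "c", "value" => "4"}
    ]
  end

  # Pattern matching in the function head extracts the fields we need.
  def parse(%{"name" => name, "value" => raw}) do
    case Float.parse(raw) do
      # Accept only strings that parse completely as a number.
      {number, ""} -> {:ok, {name, number}}
      _ -> :error
    end
  end

  # Pipes chain the steps: parse each row, drop the bad ones, then average.
  # Assumes at least one row parses; an empty result would divide by zero.
  def mean_value(rows) do
    values =
      rows
      |> Enum.map(&parse/1)
      |> Enum.flat_map(fn
        {:ok, {_name, number}} -> [number]
        :error -> []
      end)

    Enum.sum(values) / length(values)
  end
end
```

Calling `TidyExample.mean_value(TidyExample.raw_rows())` averages only the rows whose value parses cleanly, so the "n/a" row is skipped.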

Also, the functional programming paradigm is a much more natural fit for data science than object-oriented and imperative programming, being conceptually closer to the problem space.

At my workplace, we primarily use Python (along with some R and Scala) for data science; these languages are the de facto standards in this domain. I think the reason is their extensive and useful libraries, such as pandas, scikit-learn, TensorFlow (by Google), Keras, and many others.

Elixir has a lot of catching up to do in the library department with respect to Python.

One other aspect of Elixir (and the BEAM in general) that is of interest to data science is how easily it allows for distribution and scaling, at least on the CPU side. However, I’m not sure how to interface with GPUs from Elixir (e.g. via CUDA), which is essential for effective machine learning.

At the moment, I’m looking into using Elixir for hosting pre-trained machine learning models (trained using TensorFlow and Keras) and exposing them to consuming applications via APIs, etc. @anshuman23 has already created Tensorflex for that purpose, which looks very promising.

It is also sensible to use Elixir for the aspects of data science which have to do with gathering and preparing the data set you need to train your models (aka “tidy data”).
