joegiralt

Comparing neural network training performance between Elixir and Python

With a wide range of libraries focused on the machine learning market, such as TensorFlow, NumPy, Pandas, Keras, and others, Python has made a name for itself as one of the main programming languages. In February 2021, José Valim and Sean Moriarity published the first version of the Numerical Elixir (Nx) library, a library for tensor operations written in Elixir. Nx aims to make the language a good choice for GPU-intensive operations. This work aims to compare the results of Python and Elixir on training convolutional neural networks (CNNs) using the MNIST and CIFAR-10 datasets, concluding that Python achieved overall better results, and that Elixir is already a viable alternative.

Why would Python achieve “overall better results”? What does that mean? Is the Elixir code they used even idiomatic or current?

Most Liked

josevalim

Creator of Elixir

It is also worth noting that the paper uses Nx v0.2, and a lot has changed since then, given that it is relatively new technology. In particular, the new Axon version has many improvements to training, so I would be eager to see more recent results and see if those improvements are proven on paper!

josevalim

Creator of Elixir

We fixed those in the next Axon release. We did have some code that would take longer the more epochs you had, but it is all fixed now.

We would also recompile the network between epochs, but that is fixed too.

ityonemo

Looking at the graphs, it looks like the time difference can be accounted for by 1) late startup of Elixir GPU usage and 2) mysterious gaps in the GPU usage. My SWAG (“scientific wild-ass guess”) here is that the late startup likely scales with training set size and not with training epochs, while the gaps scale in number with epoch count. So for a more useful machine learning training problem, it’s likely to end up somewhere between 15-25% slower (MNIST and CIFAR-10 very nearly represent an upper bound on the pessimization).
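To make that scaling argument concrete, here is a rough sketch of the arithmetic. It assumes a one-off startup delay plus a fixed gap per epoch, and every number in it is made up for illustration; none of them come from the paper. The function name `relative_overhead` and its parameters are hypothetical.

```python
def relative_overhead(startup_s, gap_s, compute_s, epochs):
    """Extra wall time as a fraction of an ideal run with no startup delay and no gaps.

    startup_s: one-off delay before the GPU starts working (hypothetical)
    gap_s:     idle GPU time added per epoch (hypothetical)
    compute_s: useful GPU time per epoch (hypothetical)
    """
    baseline = epochs * compute_s
    extra = startup_s + epochs * gap_s
    return extra / baseline


# Made-up numbers: same startup delay and per-epoch gap, different amounts of real work.
small = relative_overhead(startup_s=10.0, gap_s=1.0, compute_s=5.0, epochs=10)
large = relative_overhead(startup_s=10.0, gap_s=1.0, compute_s=50.0, epochs=10)

print(f"small problem: {small:.0%} slower")
print(f"large problem: {large:.0%} slower")
```

Under these assumptions the same fixed costs weigh far less once each epoch does more real work, which is why small benchmarks like MNIST would sit near the worst case.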

Maybe I missed something, but it doesn’t appear that the paper tries to explain what is happening in those gaps… My gut feeling is that there’s some GPU data shuffling back and forth with the CPU that is blocking progress and could probably be run concurrently towards the end of the first chunk. I don’t know if the Python libs proactively figure that out and schedule those data transfers concurrently in advance; it would be interesting to find out.
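For anyone unfamiliar with the idea of overlapping transfers with compute, here is a minimal sketch of the general prefetching pattern using only the Python standard library. It is not how Nx, Axon, or any Python framework actually does it. `prefetch`, `fake_transfer`, and `train_step` are hypothetical stand-ins, with `fake_transfer` playing the role of a host-to-device copy.

```python
import queue
import threading


def prefetch(batches, transfer, depth=2):
    """Yield transfer(batch) in order, running transfer in a background thread.

    Up to `depth` transferred batches are buffered, so the next transfer can
    proceed while the consumer is still busy with the current batch.
    """
    buffer = queue.Queue(maxsize=depth)
    done = object()  # sentinel marking the end of the stream

    def worker():
        for batch in batches:
            buffer.put(transfer(batch))
        buffer.put(done)

    threading.Thread(target=worker, daemon=True).start()
    while True:
        item = buffer.get()
        if item is done:
            return
        yield item


def fake_transfer(batch):
    # Hypothetical stand-in for copying a batch from CPU to GPU memory.
    return [float(x) for x in batch]


def train_step(device_batch):
    # Hypothetical stand-in for one training step; returns a fake "loss".
    return sum(device_batch)


losses = [train_step(b) for b in prefetch([[1, 2], [3, 4], [5, 6]], fake_transfer)]
print(losses)
```

The point of the pattern is that the consumer only waits when the buffer is empty. If the gaps in the graphs come from blocking transfers, this kind of overlap would be expected to shrink them, but that is a guess until someone profiles it.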
