Deep Learning library with GPU (CUDA/cuBLAS)

I am making a Deep Learning library that uses the GPU. Its name is DeepPipe2, an improvement of my previous DeepPipe. It can run simple SGD on MNIST. You can use the GPU from Elixir.


I added the momentum and AdaGrad methods for updating the weight matrix and bias matrix.
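For readers unfamiliar with the two update rules, here is a minimal scalar sketch of what momentum and AdaGrad compute. This is an illustration only, not DeepPipe2’s actual code (which works on matrices on the GPU); the module name, function names, and default hyperparameters are all hypothetical.

```elixir
defmodule UpdateSketch do
  # Momentum: v <- lr*g + alpha*v ; w <- w - v
  # Returns the updated weight and the new velocity.
  def momentum(w, v, g, lr \\ 0.1, alpha \\ 0.9) do
    v1 = lr * g + alpha * v
    {w - v1, v1}
  end

  # AdaGrad: h <- h + g*g ; w <- w - lr * g / (sqrt(h) + eps)
  # Returns the updated weight and the accumulated squared gradient.
  def adagrad(w, h, g, lr \\ 0.1, eps \\ 1.0e-7) do
    h1 = h + g * g
    {w - lr * g / (:math.sqrt(h1) + eps), h1}
  end
end
```

Momentum accumulates past gradients to damp oscillation, while AdaGrad scales the step size down for parameters that have already received large gradients.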


You’ve got a typo in your GitHub description.



I fixed it. Thank you very much.

I am working to get CNN support running. I will write the convolution (im2col) myself in CUDA.
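For anyone wondering what im2col does: it unrolls every kernel-sized patch of the image into a row of a matrix, so convolution becomes a single matrix multiplication. Here is a minimal sketch in plain Elixir for a single-channel image with stride 1 and no padding; the module and function names are hypothetical, and the real DeepPipe2 version is written in CUDA.

```elixir
defmodule Im2colSketch do
  # image: a list of rows; kh, kw: kernel height and width.
  # Returns one row per kernel position, each row a flattened patch.
  def im2col(image, kh, kw) do
    h = length(image)
    w = length(hd(image))

    for i <- 0..(h - kh), j <- 0..(w - kw) do
      image
      |> Enum.slice(i, kh)
      |> Enum.flat_map(&Enum.slice(&1, j, kw))
    end
  end
end
```

With the patches laid out as rows, multiplying by the flattened filter weights computes all output pixels at once, which is exactly the kind of work cuBLAS does well on the GPU.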


I wrote a brief introduction.


I’m studying NVIDIA’s cuDNN. I plan to incorporate cuDNN into DeepPipe2, as other standard Deep Learning frameworks do.


I decided not to include cuDNN. There is very little information and there are few examples of how to use its API. I am writing my own CNN code in CUDA instead. I will complete the necessary CNN additions in the first half of 2020.


I made a video introducing DeepPipe2.

I’m sorry for my poor English. I will go to an English conversation school.


This is amazing. I am not an expert in deep learning, but very curious about your project. I love what you are bringing to the Elixir community.

Impressive work! I hope to learn to use it soon.


Thank you for your reply.
Your reply is very encouraging.
Thank you very much.


Cool video!

I didn’t know it was possible to make NIFs in CUDA like this, so this was an interesting point.

However, could you explain a bit more about what happens when you run Test.sgd/2? That part I couldn’t really understand.


Thank you for your reply.

I will add an explanation.
SGD stands for Stochastic Gradient Descent.
The call form of Test.sgd/2 is sgd(size, epoch).
size is the size of each mini-batch.
epoch is the number of times mini-batch learning is repeated.
In the video, it is Test.sgd(100, 100).
DeepPipe2 randomly extracts 100 samples from the MNIST training data, learns from them, and repeats this 100 times. After learning is over, it checks whether learning succeeded against the test data.
As a result, approximately 86% of the answers are correct.
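The overall loop can be sketched in plain Elixir. This is only an illustration of the size/epoch structure described above, not DeepPipe2’s code: the module name is hypothetical, and instead of a neural network the "model" is a single weight minimizing mean((w*x - 2*x)^2), whose optimum is w = 2.

```elixir
defmodule SgdSketch do
  # Repeat `epoch` times: draw a random mini-batch of `size` samples
  # and take one gradient-descent step.
  def sgd(size, epoch) do
    data = Enum.map(1..1000, fn _ -> :rand.uniform() end)

    Enum.reduce(1..epoch, 0.0, fn _, w ->
      batch = Enum.take_random(data, size)
      # gradient of mean((w*x - 2*x)^2) with respect to w
      g = Enum.reduce(batch, 0.0, fn x, a -> a + 2.0 * (w * x - 2.0 * x) * x end) / size
      w - 0.5 * g
    end)
  end
end
```

Calling SgdSketch.sgd(100, 100) mirrors the shape of Test.sgd(100, 100): 100 random samples per step, 100 steps, and the weight converges toward the optimum.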

There is a test module in the test.ex file. It contains the description code of the neural network, as well as the definition of sgd/2.
sgd/2 calls Cumatrix, a matrix-calculation library that uses the GPU. Learning requires a large amount of matrix computation; thanks to the GPU, Cumatrix can perform those calculations quickly.


I’m writing CNN features in CUDA and debugging. It now works partially.

# for CNN test
defnetwork init_network4(_x) do
  _x |> f(5,5) |> full
  |> w(576,300) |> b(300) |> relu
  |> w(300,100) |> b(100) |> relu
  |> w(100,10) |> b(10) |> softmax
end

iex(1)> Test.cnn(100,100)
preparing data

accuracy rate = 0.821
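A note on where the 576 in w(576,300) comes from, assuming a 28x28 MNIST input, a 5x5 filter, stride 1, and no padding: the feature map after the convolution is 24x24, and flattening it gives the input size of the first fully connected layer.

```elixir
# Output side length of a valid convolution: input - kernel + 1
out = 28 - 5 + 1       # 24
flat = out * out       # 576 inputs to w(576, 300)
IO.inspect(flat)
```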


Pooling now works.
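For readers new to pooling, here is a minimal 2x2 max-pooling sketch in plain Elixir (stride 2, single channel, even-sized input). The module name is hypothetical; DeepPipe2’s pooling is implemented in CUDA, and this only shows the idea.

```elixir
defmodule PoolSketch do
  # Take each 2x2 block of the image and keep only its maximum,
  # halving both dimensions.
  def max_pool(image) do
    image
    |> Enum.chunk_every(2)
    |> Enum.map(fn [r1, r2] ->
      Enum.zip(Enum.chunk_every(r1, 2), Enum.chunk_every(r2, 2))
      |> Enum.map(fn {a, b} -> Enum.max(a ++ b) end)
    end)
  end
end
```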


Great job!!! I’ve been thinking: why not use the power of distributed Erlang to handle big matrices? Something like TensorFlow or Spark, but using Elixir. I work with Artificial Intelligence; PM me if you need some help.


Thank you very much.
Currently, processing large data can cause a segmentation fault. I am investigating the cause.


I fixed the bug.
The cause was a memory leak in deconvolution.


I have added basic CNN functionality and am now testing it. To handle large matrices, at least 16GB of memory is required. My GPU is an old GTX 960, so the mini-batch size cannot be increased. A GTX 1660 is recommended.


I have added garbage collection that runs manually when learning on CNN. The segmentation fault is now avoided. Thanks, Mr. NobbZ.