Ulam - Elixir interface to the Stan probabilistic programming language

Ulam

Elixir interface to Stan, inspired by the Python project CmdStanPy. Why should Python programmers have all the Bayesian fun?

Why?

I’m interested in developing probabilistic programming languages that compile to Stan. I have found that doing so in Python is pretty inconvenient, so I decided to give Elixir a try and see how far I can go.

Installation

The package must be installed from GitHub. It’s not currently stable enough to be uploaded to Hex.
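A GitHub dependency in `mix.exs` would look something like this (the repository path below is a placeholder, not the actual location of the package):

```elixir
defp deps do
  [
    # Placeholder path — point this at the actual Ulam repository on GitHub
    {:ulam, github: "user/ulam"}
  ]
end
```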

Examples

See the example in the tests. Relevant code:

alias Ulam.Stan.StanModel

# Some simple data for the model
data = %{
  N: 10,
  y: [0, 1, 0, 0, 0, 0, 0, 0, 0, 1]
}

# Compile the model from the stan program file
model = StanModel.compile_file("test/stan/models/bernoulli/bernoulli.stan")

# Sample from the model and save it in a dataframe
dataframe =
  StanModel.sample(model, data,
    nr_of_samples: 1000,
    nr_of_warmup_samples: 1000,
    nr_of_chains: 8,
    show_progress_bars: false
  )

Is this a good idea?

Of course not, but if Elixir can have classical machine learning with Nx for big data, why can’t it have cool Bayesian model fitting for “small data”?


Kudos for the name

For anyone interested: because the Rust toolchain is much friendlier than the C++ build chain required by Stan (Stan effectively writes and compiles a C++ program for each model), I’m working on becoming independent of Stan by implementing a sampler based on this Rust project: GitHub - pymc-devs/nuts-rs: A implementation of NUTS in rust

The sampler seems to be at least as good as the stan sampler according to some benchmarks.

The idea would be to compile the model into a Rust logp program using the above Rust library, and then use Rustler to compile that into a NIF that can interact with Elixir. Bonus points if I can make it work without generating any Rust source files.
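On the Elixir side, the Rustler wrapper could be a minimal sketch like the following (the module, crate, and function names here are hypothetical — Rustler is a real library, but none of this exists in Ulam yet):

```elixir
defmodule Ulam.Nuts do
  # Hypothetical NIF module: the :ulam app name and "ulam_nuts" crate
  # are placeholders for whatever the generated Rust crate would be called.
  use Rustler, otp_app: :ulam, crate: "ulam_nuts"

  # Each function raises unless the compiled Rust NIF is loaded
  # and overrides it at load time.
  def sample(_data, _nr_of_samples, _nr_of_chains),
    do: :erlang.nif_error(:nif_not_loaded)
end
```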

The hard part in this is to write vectorized implementations of the functions that will make up our likelihood, probably in Elixir (if we don’t vectorize, and instead implement everything element by element directly from the mathematical definitions, the dynamic compute graph will blow up like crazy), and to implement the gradient calculations on the Rust side.
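As a toy illustration of why vectorization matters (plain Elixir, no external libraries; the function names are mine, not part of Ulam): for the Bernoulli model above, the naive element-wise log-likelihood is a sum of N terms, while the vectorized form collapses to two terms via the sufficient statistic sum(y):

```elixir
defmodule BernoulliLogp do
  # Naive, element-wise version: one log term per observation,
  # so a dynamic compute graph would grow with N.
  def naive(y, p) do
    Enum.reduce(y, 0.0, fn
      1, acc -> acc + :math.log(p)
      0, acc -> acc + :math.log(1.0 - p)
    end)
  end

  # Vectorized version: collapse to the sufficient statistic sum(y),
  # so the expression has a fixed size regardless of N.
  def vectorized(y, p) do
    n = length(y)
    k = Enum.sum(y)
    k * :math.log(p) + (n - k) * :math.log(1.0 - p)
  end
end

y = [0, 1, 0, 0, 0, 0, 0, 0, 0, 1]
IO.inspect(abs(BernoulliLogp.naive(y, 0.25) - BernoulliLogp.vectorized(y, 0.25)) < 1.0e-12)
# prints true
```

Both versions compute the same number, but the vectorized one stays the same size no matter how many observations there are — which is exactly what keeps the compute graph from blowing up.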