Have you heard of MPI? How does it compare to Erlang?

My new company is using MPI to do some crazy stuff, so that is basically what I have to learn and master. Could anyone recommend some resources on it? It would be great to get some insight from an Erlang perspective.

MPI is a completely different kettle of fish. MPI is a message passing library developed primarily for Fortran and C/C++ applications in High Performance Computing. It’s largely meant to allow existing scientific codes, with millions of lines and hundreds of man-years of programming behind them, to transition to current “supercomputing” hardware.

It is meant for passing large sections of arrays around in high-bandwidth environments on tightly coupled clusters (i.e. what is called a supercomputer these days). Many of the hard problems in distributed computing are simply assumed not to exist in MPI (e.g. all the nodes that start a computation finish it, you have unrestricted access to the resources of each node, etc.).
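
To give a flavor of the model: every rank runs the same program (SPMD), and array data moves via explicit library calls. Here's a minimal sketch of the "scatter a big array, crunch locally, reduce back" pattern, assuming an MPI installation with mpicc/mpirun available; it's a hypothetical example, not from any particular code base:

```c
/* Minimal MPI sketch: scatter a large array across ranks,
 * compute a local partial sum, and reduce the result to rank 0.
 * Compile: mpicc sum.c -o sum
 * Run:     mpirun -np 4 ./sum
 */
#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>

int main(int argc, char **argv) {
    MPI_Init(&argc, &argv);

    int rank, size;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    const int chunk = 1000000;   /* elements per rank (arbitrary) */
    double *data = NULL;

    if (rank == 0) {
        /* Rank 0 owns the full array; in a real code this might
         * come from a parallel filesystem. */
        data = malloc((size_t)chunk * size * sizeof(double));
        for (int i = 0; i < chunk * size; i++)
            data[i] = 1.0;
    }

    /* Every rank receives its contiguous slice of the array. */
    double *local = malloc((size_t)chunk * sizeof(double));
    MPI_Scatter(data, chunk, MPI_DOUBLE,
                local, chunk, MPI_DOUBLE, 0, MPI_COMM_WORLD);

    double partial = 0.0;
    for (int i = 0; i < chunk; i++)
        partial += local[i];

    /* Combine the partial sums back on rank 0. */
    double total = 0.0;
    MPI_Reduce(&partial, &total, 1, MPI_DOUBLE, MPI_SUM,
               0, MPI_COMM_WORLD);

    if (rank == 0)
        printf("total = %f\n", total);

    free(local);
    free(data);
    MPI_Finalize();
    return 0;
}
```

Note there is no supervision, no restart, no "what if rank 3 dies" in sight; the library assumes all ranks live for the whole run, which is exactly the assumption Erlang refuses to make.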

Comparing MPI and Erlang is largely apples and oranges. They really don’t have much of an intersection in terms of the kinds of problems each is designed to solve.


Well, I don’t think they are totally incomparable. Say you want to do some computation: you can have the job distributed across a Spark cluster or an MPI cluster. And Spark is built on Akka, which is in a way equivalent to Erlang/OTP.

You can pound screws with a hammer too; that doesn’t make it the right tool for the job. MPI was primarily intended to allow number crunchers to safely use threads and shared memory to emulate the well-understood algorithms from the old days of vector supercomputers with gather/scatter vector processing. MPI was designed for compute-bound problems with tightly coupled data. Very high-speed interconnects like InfiniBand have allowed MPI to extend beyond a single machine to a cluster, but the underlying model is still the same.
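
The "tightly coupled data" part shows up in idioms like the halo exchange, where every rank swaps boundary values with its neighbors on every time step of a simulation. A rough sketch, assuming a hypothetical 1-D domain decomposition (again, illustration only, not any real code):

```c
/* Sketch of a 1-D halo exchange: each rank owns N interior
 * cells plus two ghost cells, and swaps boundary values with
 * its left and right neighbors each iteration.
 */
#include <mpi.h>
#include <stdio.h>

#define N 1024   /* interior cells per rank (arbitrary) */

int main(int argc, char **argv) {
    MPI_Init(&argc, &argv);

    int rank, size;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    /* u[0] and u[N+1] are ghost cells holding neighbor data. */
    double u[N + 2];
    for (int i = 0; i <= N + 1; i++)
        u[i] = (double)rank;

    /* Edge ranks talk to MPI_PROC_NULL, a built-in no-op peer. */
    int left  = (rank == 0)        ? MPI_PROC_NULL : rank - 1;
    int right = (rank == size - 1) ? MPI_PROC_NULL : rank + 1;

    /* Send our rightmost interior cell right while receiving the
     * left neighbor's rightmost cell, and vice versa. In a real
     * solver this runs every time step, so all ranks march in
     * lockstep: if one stalls or dies, the whole job does. */
    MPI_Sendrecv(&u[N], 1, MPI_DOUBLE, right, 0,
                 &u[0], 1, MPI_DOUBLE, left,  0,
                 MPI_COMM_WORLD, MPI_STATUS_IGNORE);
    MPI_Sendrecv(&u[1], 1, MPI_DOUBLE, left,  1,
                 &u[N + 1], 1, MPI_DOUBLE, right, 1,
                 MPI_COMM_WORLD, MPI_STATUS_IGNORE);

    printf("rank %d: ghosts = %f, %f\n", rank, u[0], u[N + 1]);

    MPI_Finalize();
    return 0;
}
```

The communication pattern is dictated by the data layout and is fully synchronous, which is the tight coupling in a nutshell.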

The Actor model works best on loosely coupled problems that are largely I/O-bound or require robust failure handling.


I can imagine using MPI to duplicate some of the kinds of things you can do in Spark, using its I/O interface to large parallel filesystems. Spark and Hadoop MapReduce are kind of the inversion of classic supercomputing: they transport the executable to the data nodes; traditional supercomputing is the reverse, you bring the data to the compute nodes.

Given the hype around “Big Data” I can imagine that the MPI folks have created some tooling to get on the bandwagon, but MPI was meant to solve a fundamentally different class of problems. In Spark/Hadoop MapReduce you start with large amounts of data and end up with a much smaller data set that is some kind of “summary” of the data.

In the more traditional supercomputing that MPI is typically used for, you start with a small set of data (think the initial state of a simulation) and end up with a lot more data (the entire time series of the simulation). MapReduce computing environments are a terrible choice for that kind of problem.