MADlib on Postgres

ppiechota · September 14, 2018, 1:08pm

Hi, has anyone worked with the MADlib library for Postgres, and could you share your experience?
http://madlib.apache.org/

In a world of ever increasing data size, many existing analytics solutions are not up to the task. The MADlib project seeks to address this need by creating a framework built to take advantage of modern computing capabilities to provide robust solutions that scale with the needs of the business.

Our approach is to leverage the efforts of commercial practice, academic research, and the open-source development community.

Key philosophies driving the architecture of MADlib:

Operate on the data locally in-database. Do not move data between multiple runtime environments unnecessarily.

Utilize best of breed database engines, but separate the machine learning logic from database specific implementation details.

Leverage MPP shared nothing technology, such as the Greenplum Database and Apache HAWQ (incubating), to provide parallelism and scalability.

Open implementation maintaining active ties into Apache community and ongoing academic research.

I don’t have any special requirements for now; I’m simply exploring the available options for a simplified ML solution with Elixir as part of the stack (time to market). What are the pros and cons of doing machine learning directly in the database? Is MADlib worth the candle, or should I not look beyond Python through Ports?
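To make the in-database idea concrete, here is a minimal sketch of what a client would actually send: just a SQL statement, while the rows stay in Postgres. It builds a call to MADlib's `madlib.linregr_train(source_table, out_table, dependent_varname, independent_varname)`. The table `houses`, the output table `houses_model`, and the columns `price`, `tax`, `bath`, and `size` are hypothetical names chosen for illustration, and the helper functions are not part of MADlib. It is written in Python only to show the shape of the call; nothing here connects to a database.

```python
def sql_literal(value: str) -> str:
    """Quote a Python string as a SQL string literal, doubling single quotes."""
    return "'" + value.replace("'", "''") + "'"


def linregr_train_sql(source_table: str, out_table: str,
                      dependent: str, independent_expr: str) -> str:
    """Build the SQL text for a MADlib linear regression training call.

    MADlib takes table names and column expressions as text arguments,
    so each one is passed as a quoted SQL literal.
    """
    args = ", ".join(
        sql_literal(arg)
        for arg in (source_table, out_table, dependent, independent_expr)
    )
    return f"SELECT madlib.linregr_train({args});"


# Hypothetical table and column names, for illustration only.
train_stmt = linregr_train_sql(
    "houses", "houses_model", "price", "ARRAY[1, tax, bath, size]"
)
print(train_stmt)
```

Any Postgres client could send the resulting statement, including an Elixir one, so no separate Python runtime would be needed for the training step. The trade-off is that you are limited to the algorithms the library ships with.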

OvermindDL1 · September 14, 2018, 2:51pm

For what it's worth, someone is currently building an ML library for Elixir based on TensorFlow, if that helps in any way.