I’m starting to learn Elixir and from reading different blogs out there, it looks like Ecto is a default/preferred way to deal with DBs in Elixir web development.
While reading about Ecto, I mostly see references to Postgres, MySQL, SQLite. So the question is does it work with Duckdb and a more generic question, is Ecto in general a common way to deal with SQL based DBs and therefore works with any flavour of SQL based DB?
Ecto is the most popular way to interact with a database from an elixir application.
There are essentially two parts of Ecto: the Ecto core core, which handles general data mapping things like changesets and schemas, and Ecto SQL, the SQL adapter layer which provides support for Ecto to work with an SQL database. It includes support for Postgres, MySQL, and MSSSQL already, but there’s no out of the box support for DuckDB.
If you wanted to use DuckDB with Ecto you could create an adapter for it based on Ecto SQL.
Or you could skip Ecto and just query DuckDB directly. It looks like duckdbex supports DuckDB directly without Ecto.
Since there is duckdbex already (which is the hard part of the equation) you might get away with using the ecto postgresql adapter and simply swap the postgrex calls with those of duckdbex.
@mmmrrr@msmithstubbs - thanks. I see I’m very far from understanding yet how the ecosystem works. I didn’t know about all these details of Ecto Core and Ecto Adapters. I thought it’s all one package. I’ll investigate more and thanks for duckdbex reference.
@acrolink - I’m sorry, I’m not going to go into the a VS b thing here. Suffice it to say I’m using duckdb extensively and almost exclusively (with some polars sprinkles) for my Python data analytics pipeline. It’s insanely fast for my use case and deals with out of memory size files with much grace.
This is how I’m starting my journey into Elixir. I want to recreate what I already have in Python, but in Elixir. Also, my through process is “…if I do almost 90% of my data gymnastics with duckdb, let’s see if I can also use it for the web app. The fewer tools to master the better…” (at least for me)
@josevalim - thanks for the tip. I have actually already used your advice from github issue and was able pretty easily make duckdb work in Livebook. My question here stems from the ignorance on how things work not in Livebook but for a normal web app based on Phoenix/Liveview. Because I see references to Ecto everywhere where DBs are involved I thought that by default I need to use it for every DB related task in a web app.
The ADBC is new to me as is web dev, but I’ll see if I can find an example somewhere on how to use duckdb + ADBC + web app.
You should be able to get any code from Livebook and move to your web application. The steps are:
Get the relevant dependencies in Mix.install/2 (the top of your notebook) and add them to the def deps section of your mix.exs. In this case, you only need the :adbc dependency (you don’t kino nor any of the kino_* ones)
Get any process you pass to Kino.start_child!(...) and instead start them as part of your application supervision tree (typically in lib/my_app/application.ex)
Now the code doing the queries and so on should just work™! Feel free to ask if you have questions.
The cool thing about the Smart Cells in Livebook is that they only generate code at the end of the day and you can see. If you focus on the Connection smart cell, you will see the <> icon on its top right. If you click, you can see how the ADBC connection is initialized.
As per your questions:
Yes, you can. Try out Adbc.Connection.query(conn, query, params) instead of Explorer.DataFrame.from_query!(conn, query, params)
Columnar storage is the main reason. For analytical queries this will be much faster than postgres (unless you use a columnar format in Postgres such as timescaledb or tembo)