MarioFlach

MarioFlach

Postgres recursive CTE on single table

Hello, I have a single SQL table looking as follow:

create table("git_commits", primary_key: false) do
  add :oid, :binary, primary_key: true
  add :parents, {:array, :binary}, null: false
end

Each commit has one or multiple parents. I would like to create a recursive Common Table Expressions in order to traverse the entire commit history.

Let’s say I want to start from commit 161b3955043064f33e3684183837fb7160cccfea and traverse all the parent commits.

I came with the following SQL query:

WITH RECURSIVE dg AS (
  SELECT oid FROM git_commits c WHERE c.oid = '\x161b3955043064f33e3684183837fb7160cccfea'
  UNION
  SELECT p.oid FROM git_commits AS c JOIN git_commits AS p ON p.oid = ANY(c.parents)
)

SELECT COUNT(oid) FROM dg;

The query is not invalid but does not return the number of parents (it returns more that it should).

So my question is how is the first (non-recursive) SELECT statement related to the second?
Is this kind of recursive CTE traversal even possible when working on a single table?

Thank you in advance.

Most Liked Responses

MarioFlach

MarioFlach

I’m working on a Git hosting platform and trying to get better performances when browsing repositories. I’m using libgit2 and traversing/walking the commit history is quite slow by design.

When working on big repositories (linux kernel source tree, >800,000 commits) it can take up to 10 seconds when computing something like commit count. I’ve been able to work around quiet a few performance bottlenecks using different tricks (clever pagination in combination with Stream, etc.) but it still needs to get faster than that…

I’m currently experimenting with storing/caching some meta informations in the database and hope to achieve better results.

Thank you for the query, it works like a charm :hugs: I’m surprised that it counts merges correctly. I would have assumed that it would return twice as much as commits each time the recursion queries more than one parent…

MarioFlach

MarioFlach

That’s what I’m doing right now. I use exclusively libgit2 to interact with the underlying git objects.
I currently have two interchangeable storage backends available :

  • :filesystem (default) – .git directory
  • :postgres (experimental) – using git_references and git_objects tables.

The problem is that traversing the commit history is quiet slow by design. There is no fast way of querying the history of a given pathspec (tree, blob) or even counting commit ancestors.

I’m not quite sure how you would implement caching effectively. I don’t want to cache on demand, the first hit will always be slow…

Currently, I’m experimenting with storing meta data for commit and tree objects and the performance boost I got so far is impressive.

In future I would like to extend my :postgres backend to write Git objects in separated tables (references, commits, tags, blobs, trees). Being able to perform more complex queries (authors, committers, query avatars, GPG signatures, ACLs, etc.) directly on the DB would be a :raised_hands:

https://github.com/almightycouch/gitgud

hauleth

hauleth

You can cache on upload once.

You could use git notes attached to commits for storing metadata and cached values, like the commit “depth”. Alternatively instead of storing all commit data you could just store cached metadata in Postgres, however I would say that this still would be a little overdoing. But this is how I would solve that, if your way works for you then ok.

The problem is that this will not scale well in time as PSQL wasn’t meant to store BLOBs.

Where Next?

Popular in Questions Top

aadeshere1
I have a another noob question about loop. Since elixir is immutable, while loop is not directly possible. total = 10 while total != 0 ...
New
sergio
In Ruby, I can go: User.find_by(email: "foobar@email.com").update(email: "hello@email.com") How can I do something similar in Elixir? ...
New
marius95
Hello everyone, I try to use an Javascript Event Handler in my root.html.leex file. Therefore I created a function in the app.js file: ...
New
mcarvalho
What is the difference between System.get_env and Application.get_env? For example, what are best practices to use one versus another.
New
jerry
Good day to you all. I have been struggling to get a query involving like and ilike to work. Can anyone assist me on this, please? pro...
New
LegitStack
I’m trying to make a websocket server in Phoenix or raw Elixir. I heard about gun, I think I could use cowboy, but since I’m not that sma...
New
jay1
Why is it that the mnesia database isn’t the most preferred database for use in Elixir/Phoenix?
New
alice
Hey, Just curious what are the main benefits of Elixir compared to Clojure? When is Elixir more useful than Clojure and vice versa? Th...
New
freewebwithme
Using vs code and installed ElixirLS: support and debugger. And I got an error popped up on start up says Failed to run ‘elixir’ comma...
New
Qqwy
Original source of discussion: This topic on the Pragmatic Programmers’ Functional Web Development with Elixir, OTP, and Phoenix forum. ...
New

Other popular topics Top

sen
Hi All, I set a environment variables in dev.exs , like below code. when i start server, how can i set the ${enable} value? thanks. d...
New
fireproofsocks
Forgive me if this is obvious, but how does one delete a database record WITHOUT selecting it first? Ecto.Repo — Ecto v3.14.0 has exampl...
New
chrismccord
This release brings a number of exciting features, including integration with the new Phoenix LiveDashboard and Phoenix LiveView. There h...
New
josevalim
Hi everyone, One of the features added to Elixir early on to help integration with Erlang code was the idea of overridable function defi...
New
vrod
I am using the Starship cross-shell prompt – it seems pretty nice, but I get some errors: [WARN] - (starship::utils): Executing command ...
New
Emily
I have VueJS GUIs with the project generated using Webpack. I have Elixir modules that will need to be used by the VueJS GUIs. I forese...
New
vonH
When I run the Plug and I recompile I wind up having to use Ctrl C to quit iex and start again. Witht the help of rlwrap I can use the cu...
New
freewebwithme
Using vs code and installed ElixirLS: support and debugger. And I got an error popped up on start up says Failed to run ‘elixir’ comma...
New
AstonJ
We’ve put together this wiki for Phoenix LiveView - please feel free to add any info you feel is worth including. What is Phoenix LiveV...
New
marick
I had some trouble figuring out how to make many-to-many associations work. Once I got it working, I wrote a blog post. Because I’m a nov...
New

We're in Beta

About us Mission Statement