GitGud, GitHub clone entirely written in Elixir

tme_317 · March 22, 2018, 6:50pm

As I currently have a lot of things going on (2nd baby arriving in a few weeks, house construction and lots of work to cover the costs for everything), I’m not sure if the project will be a usable product anytime soon.

Congratulations! And I definitely understand.

Indeed, I started with the idea of having a Vue SPA. I’m not really a front-end guy and wanted to give Vue a try. I later decided to go the EEX/HTML path because I was not really comfortable writing that much JavaScript. In the end, I will try to render most of the interface using server-side rendered HTML and React/Relay for complex components (branch/tag for example).

Yeah, this is exactly how I felt in my first iteration. Nice to know I’m not alone! I always looked forward to days that were heavy in backend Elixir/Phoenix/Ecto/SQL after spending a few days in JS-land. Also, I like Vue but found it messy and too tightly coupled when sprinkling it into EEX, targeting it by ID, then writing the JS Vue instances in another per-page .js file.

I felt encapsulated components were a cleaner answer but couldn’t figure out how to embed Vue SFCs with EEX (the way geolessel/react-phoenix or your react_components.ex file works). I found no examples in the wild doing this with Vue/webpack without going full SPA. At that point I decided to learn/try React components instead.

A while ago, I wrote a small SPA using React/Relay + GraphQL. I was pretty amazed how easy it was to implement once the backend was ready (Absinthe). But still, I’m not a JavaScript person and for bigger projects, I prefer to stick to rendering plain, boring HTML from the server…

I realize I’m probably overcomplicating things at the moment which will bite me later, but currently experimenting with a fully decoupled React frontend (written in ReasonML instead of JS) progressively rendered server-side by NextJS talking to Phoenix using GraphQL. It’s super fast for the client and most things just work even if they have JS disabled! It’s early and so far, so good, but as a fallback if my experiment fails I’ll probably just follow your lead and go back to EEX rendering with some React components for the complex stuff.

Again, thank you very much for sharing your code and your insights on making this architecture change!

Virviil · September 19, 2018, 7:29pm

@MarioFlach

Is it on your roadmap to extract the Git server functionality into separate Hex packages?

I’m trying to make a small project which will use the Git server approach (meaning that you can push something to it), but cloning the whole of GitGud and making significant changes to the web part seems to be a very bad idea, because it would be impossible to pull future updates to the git-server part.

I can help by contributing documentation, testing and usage feedback to the git-server part, if it gets separated.

cnck1387 · September 20, 2018, 12:49pm

Any chance of putting up a demo site somewhere so we can see it without installing it locally?

ConnorRigby · September 20, 2018, 2:16pm

2 things:

This is really cool.
Hosting the source on GitHub is so meta

MarioFlach · September 20, 2018, 6:57pm

It’s an umbrella application with three apps:

:gitrekt – Git NIFs, wire-protocol and packfile implementation.
:gitgud – Ecto schemas and queries for users and repositories, authorization and Git HTTP and SSH transport protocol implementation.
:gitgud_web – Web application and GraphQL.

The dependency graph looks as follows:

gitrekt <- gitgud <- gitgud_web

Currently, I do not plan to split the project into multiple independent apps. At least not until I release a more stable version.

If you wish to do so, you can extract :gitrekt and :gitgud and use them independently without Phoenix and Absinthe dependencies.

MarioFlach · September 20, 2018, 7:13pm

I’m doing my best to release a more polished version soon. In the long term, a public launch is planned and moving the project’s codebase to its own service will be the first logical step. Stay tuned =)

MarioFlach · September 20, 2018, 7:24pm

Thanks a lot =)
It is and feels ~~wrong~~ weird in many ways =)

bryanhuntesl · July 17, 2019, 12:35pm

@MarioFlach impressive!

I’ve got a question. When building Erlang/Elixir software in docker - private repositories (git ssh) are always a problem. The current options are:

(Linux only, doesn’t work on Mac) - Dockerfile which mounts ssh-agent
Copy id_rsa into docker container
Fetch dependencies, then run the docker build - possible but ugly - and doesn’t work well with build caching (invalidates the cache).

I’ve had an idea for a possible solution: an SSH Git proxy running in the same Docker network, with the hostname github.com.

You provide a machine user SSH key as credentials when starting the git proxy - the SSH key can read all required repositories, but not write.
Your build container is also configured to use this network (let’s call it ‘ci’).
Your build runs, it makes git ssh requests to download dependencies, but thanks to the magic of docker hostnames it’s actually talking to the proxy.
The proxy delegates to GitHub, pulls the repo, maybe caches it, and returns it to the build container.
$profit$ - well actually no profit, it’s open-source

I was wondering whether some part of your project could be reused to achieve this goal. It would really improve the CI story for Erlang/Elixir vs Go (where all the dependencies are just ‘vendored’, i.e. copy-n-pasted, into the repository).


   +-------------------------Network 'ci'-----------------------------+
   |                                                                  |
   |                                                                  |
   |   +----------------+    Fetch dependency  +------------------+   |
   |   |                |    (git ssh)         | Docker container |   |
   |   |  docker build  +----------------------> (github.com)     |   |
   |   |                |                      |                  |   |
   |   +----------------+                      +------------------+   |
   |                                                     |            |
   |                                                     |            |
   +------------------------------------------------------------------+
                                                         |
                                                         |
                         +--------------------------+    |
                         |                          |    |
                         |     Github.com           +<---+
                         |                          |  Fetch and cache
                         |                          |
                         |                          |
                         +--------------------------+

I’m thinking of tools like :

Thanks,

Bryan

tristan · July 17, 2019, 3:32pm

Check out Docker 19.03 (not GA as of this comment but there are release candidates) for secure ways of doing the id_rsa “copy”. Note you can also use a .netrc file to use HTTPS instead of SSH for cloning from GitHub.

There is a good blog post about using it for ssh and secrets for netrc here https://medium.com/@tonistiigi/build-secrets-and-ssh-forwarding-in-docker-18-09-ae8161d066

You could also use it for hexpm:

RUN --mount=type=secret,id=hex.config,target=/root/.config/rebar3/hex.config rebar3 compile

Then build with:

$ docker build --secret id=hex.config,src=~/.config/rebar3/hex.config .

There is also a new build cache that allows you to store artifacts like fetched hex deps separate from the image and automatically bring them in at build time. Plus mounting . instead of COPY . ..

I got it working, but it was hacky: rebar3’s _build had to be stored outside of ./ so that it could be cached by the new Docker build cache. That way you don’t lose your built dependencies or git clones either. I’m sure something similar can be done with mix.

I have written all this up with more details and examples, but I’m still waiting for 19.03 to be released before publishing.

wolfiton · July 17, 2019, 4:45pm

Very interesting and cool project @MarioFlach, it makes you think that the sky is the limit with Elixir and Phoenix.

bryanhuntesl · July 17, 2019, 6:16pm

Thanks Tristan - I see in Docker for Mac edge release notes - https://docs.docker.com/docker-for-mac/edge-release-notes/

Upgrades
* [Docker 19.03.0-rc2](https://github.com/docker/docker-ce/releases/tag/v19.03.0-rc2)

Oh joy - have you managed to get the secrets stuff working on OSX? I reached out a couple of months ago and there was no progress.

B

bryanhuntesl · July 17, 2019, 6:17pm

Tristan’s reply notwithstanding, an SSH Git proxy would be useful in a variety of environments.

tristan · July 17, 2019, 6:30pm

Na, I’ve only used the secrets stuff on linux.

MarioFlach · July 18, 2019, 4:57pm

You can use or extract different parts of the project to write an SSH Git proxy. The umbrella project has three separate apps:

The :gitrekt app provides:

Git plumbing & porcelain commands (libgit2 wrapper using NIFs)
Git wire-protocol implementation
Git PACK format implementation

The :gitgud app provides the core functionality and building blocks such as User, Repo and many other schemas. It also implements the SSHServer and SmartHTTPBackend, both required to support Git over SSH/HTTP.

The :gitgud_web app provides the Phoenix application and a fully-featured GraphQL API.

I’m also very interested in a CI/CD solution for my project but need to finish/polish things first. There is https://github.com/AlloyCI/alloy_ci which looks promising (based on GitLab CI runner).

MarioFlach · July 18, 2019, 5:12pm

Thank you, I really enjoy working on this project! For me the interesting thing is that there are a lot of scaling possibilities.

I’m currently working with medium-sized repositories (elixir-lang > 15,000 commits, phoenix > 6,000 commits) and I’m quite happy with the performance so far. But it’s a completely different story to work on big repositories such as linux-kernel (> 850,000 commits).

There are a few Git operations that are very slow. For example, getting the total number of ancestors of a given commit can take up to one second on large repositories. Even worse, getting the history for a specific blob or tree can take up to 20 seconds.

I’ve begun to cache metadata each time a Git repository is pushed to in order to optimise performance, but there is still much to do…
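To illustrate the caching idea, here is a minimal sketch in Python rather than Elixir. It is not GitGud's actual code: the in-memory commit graph, `count_ancestors` and `AncestorCountCache` are hypothetical names made up for this example. The expensive graph walk runs once when a push arrives, and later reads are a dictionary lookup.

```python
from typing import Dict, List

# Hypothetical in-memory commit graph: commit id -> parent ids.
Graph = Dict[str, List[str]]


def count_ancestors(graph: Graph, oid: str) -> int:
    """Count the distinct commits reachable from `oid`, excluding `oid` itself."""
    seen = set()
    stack = list(graph[oid])
    while stack:
        current = stack.pop()
        if current in seen:
            continue
        seen.add(current)
        stack.extend(graph[current])
    return len(seen)


class AncestorCountCache:
    """Precomputes the ancestor count of a pushed commit so reads are cheap."""

    def __init__(self, graph: Graph) -> None:
        self.graph = graph
        self.counts: Dict[str, int] = {}

    def on_push(self, new_head: str) -> None:
        # The expensive walk happens once, at push time.
        self.counts[new_head] = count_ancestors(self.graph, new_head)

    def ancestor_count(self, oid: str) -> int:
        # Fall back to a full walk for commits that were never cached.
        if oid not in self.counts:
            self.counts[oid] = count_ancestors(self.graph, oid)
        return self.counts[oid]


graph = {"a": [], "b": ["a"], "c": ["a"], "d": ["b", "c"]}
cache = AncestorCountCache(graph)
cache.on_push("d")
print(cache.ancestor_count("d"))  # 3: a, b and c (the shared ancestor is counted once)
```

A real implementation would walk the repository through libgit2 and persist the counts, but the trade-off is the same: pay the cost at push time, not on every page view.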

wolfiton · July 18, 2019, 5:56pm

@MarioFlach You are an ambitious person and with patience, I think you will do great things.

I am a beginner in Elixir (I only have one week of experience with it), but if I can help with your project in any way, let me know.
As for experience, I have only created some Absinthe toy apps.

I want to learn more about Elixir and Phoenix and have decided to make it my go-to stack when it comes to web dev.

I’m currently learning Vue and Nuxt, so it would be cool if I could help with the frontend.

I also come with a node background and have some skills with graphql.

In any case good luck with the project.

MarioFlach · August 20, 2019, 2:51pm

I’ve been working a lot on this project lately. Just released version 0.2.7.

Also I decided to publish the project on a dedicated server for testing purposes.

So here it is: https://git.limo

Or here if you want to browse some repositories:

https://git.limo/redrabbit

This is pretty limited currently because you are not allowed to create new repositories. I need to provide monitoring tools and other things before letting anybody host their repositories.

Here’s a brief summary of the project’s current state:

Basic CRUD operations

User registration/authentication
- with login/email and password credentials.
- with OAuth2.0 for GitHub and GitLab.
Email management:
- Each registered user needs at least one verified email address in order to create new repositories and maintain others’ repositories.
- Each verified email address is used to associate Git commits. This is also the case for GPG signed commits.
- The primary email address is also used to retrieve a user’s Gravatar.
SSH and GPG key management:
- We support Git over SSH and HTTPS. By adding SSH public keys, a user can authenticate without the need of a password.
- GPG public keys are used to verify that a Git commit has been signed by a given user. In order to show the verification icon:
  1. The committer’s email must match a verified email address
  2. The committer’s email must also match the GPG key email address
Repository management:
- Users can create repositories if they have at least one verified email address.
- Each repository can have multiple maintainers with read/write/admin rights.

Git Repositories

We support both HTTPS and SSH transport protocols. HTTPS supports Basic Authentication via credentials, while SSH supports authentication via public key and password.

Access to repositories depends on the user’s authorisations:

Anybody can view/clone a public repo.
Only owner and maintainers can view/clone a private repo.
Only owner and maintainers with at least :write access can write/push to a repo.
Only owner and maintainers with :admin access can edit settings of a repo.
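
The four rules above can be sketched as a single ordered check. This is a minimal illustration in Python, not GitGud's Elixir authorization code; the `Repo` class, the `maintainers` mapping and the `can` function are hypothetical names, and treating the owner as having admin access is an assumption of the sketch.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional

# Access levels ordered from weakest to strongest.
LEVELS = {"read": 0, "write": 1, "admin": 2}


@dataclass
class Repo:
    owner: str
    public: bool
    # Hypothetical mapping of maintainer login -> "read" | "write" | "admin".
    maintainers: Dict[str, str] = field(default_factory=dict)


def access_level(repo: Repo, user: Optional[str]) -> Optional[str]:
    """Return the strongest access level `user` holds on `repo`, if any."""
    if user is not None:
        if user == repo.owner:
            return "admin"  # assumption: the owner can do everything
        if user in repo.maintainers:
            return repo.maintainers[user]
    # Anonymous users and non-maintainers can only read public repositories.
    return "read" if repo.public else None


def can(repo: Repo, user: Optional[str], action: str) -> bool:
    """`action` is "read" (view/clone), "write" (push) or "admin" (settings)."""
    level = access_level(repo, user)
    return level is not None and LEVELS[level] >= LEVELS[action]


repo = Repo(owner="alice", public=False, maintainers={"bob": "write"})
print(can(repo, "bob", "write"))  # True
print(can(repo, "bob", "admin"))  # False
print(can(repo, None, "read"))    # False
```

Ordering the levels keeps the check to one comparison: anyone allowed to push can also clone, and anyone allowed to edit settings can also push.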

Browsing repositories is similar to GitHub/GitLab. You can browse trees, view blobs (with syntax-highlighting), walk the commit history, walk the commit history for a given tree/blob, display diffs, etc.

Accessing a Git repository’s content is done via the GitRekt.GitAgent abstraction. Basically the agent provides an API to manipulate Git objects and refs.

The nice part here is that the access to a repository can be done from multiple processes simultaneously (by wrapping the NIFs resources in a GenServer and serialising function calls).

This will be important when implementing support for clustering, a.k.a. distributing repositories across multiple nodes.

The GitRekt.GitAgent provides two different modes:

:inproc - calling GitRekt.Git functions (NIFs) directly.
:shared - serialise function calls through GenServer.
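
The difference between the two modes can be illustrated with a small Python analogy. This is a sketch of the general pattern only, not the GitRekt.GitAgent API: the `Agent` class and its `call` method are hypothetical, and a worker thread with a queue stands in for the GenServer and its mailbox.

```python
import queue
import threading
from concurrent.futures import Future


class Agent:
    """Runs calls either directly (inproc) or through one worker thread (shared)."""

    def __init__(self, mode: str) -> None:
        if mode not in ("inproc", "shared"):
            raise ValueError(f"unknown mode: {mode}")
        self.mode = mode
        if mode == "shared":
            self._calls: queue.Queue = queue.Queue()
            threading.Thread(target=self._loop, daemon=True).start()

    def _loop(self) -> None:
        # A single consumer guarantees that calls never overlap.
        while True:
            future, func, args = self._calls.get()
            try:
                future.set_result(func(*args))
            except Exception as exc:
                future.set_exception(exc)

    def call(self, func, *args):
        if self.mode == "inproc":
            return func(*args)  # executed in the caller's own thread
        future: Future = Future()
        self._calls.put((future, func, args))
        return future.result()  # block until the worker has replied


agent = Agent("shared")
print(agent.call(len, "HEAD"))  # 4
```

In the shared mode any number of callers can use the same agent safely, at the cost of one round trip per call; the inproc mode skips that overhead when only one caller touches the underlying resource.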

The GitRekt.GitRepo protocol helps to implement custom logic (clustering) for both modes. Currently, only GitGud.Repo implements this protocol.

GraphQL API

The available GraphQL API is quite fully featured. It also provides subscriptions via Phoenix WebSockets.

Issues and reviews

There is no issue system at the moment. I hope to get started with issues soon.

I also want to provide a review mechanism, basically the possibility to review changes across commits. Currently, any registered user can comment on any commit. Comments are stored globally (shown at the end of a diff) or on a per-line basis.

Wiki

I’ve played a little bit with a Git-based Wiki implementation, but this will come later.

LTheGreats · August 20, 2019, 4:03pm

The project looks really cool. The front page even has my favorite language, lorem ipsum! Seriously though, this is an awesome use for elixir and looking through this project’s source code is interesting and fun.

MarioFlach · August 20, 2019, 5:30pm

Thank you. Haha, yes you have to fill that front-page with something

The overall design of the site is very “minimal”. I’m using Bulma but did not bother to customise things yet.

scouten · August 22, 2019, 4:16am

FYI it’s not ready for you to use yet, but I’d love to hear your thoughts on what it would take for xgit to serve your core git needs in the long term.