Architecting Systems with Elixir

epitaph · September 18, 2022, 7:04pm

Hey Guys,

I’ve recently been playing around with Elixir (and Phoenix) for a personal project, and it has been a pleasant experience so far. However, there’s something that’s been nagging at the back of my mind regarding how “enterprise” Elixir systems should be architected.

A bit of background - professionally I work for a small product team building and supporting an internal platform (can’t say much more than that, unfortunately) which is made up of microservices written primarily in Go which communicate using gRPC and are deployed to Kubernetes. Overall, this works exceptionally well. gRPC and Protobufs make it easy to talk to different services in a way that feels “native” with relatively low overhead. Kubernetes and Linkerd make auto-scaling and load balancing work without me having to think about them much, and I can spin up a new service and have it deployed in a matter of minutes.

From what I have read about Elixir and more so the Erlang VM, software that runs on the BEAM is not too dissimilar to what I just described. You can have many different applications running inside the runtime at once (à la Microservices) and the BEAM handles orchestrating these applications (à la Kubernetes). But where my confusion comes in is when you progress from something like a CRUD application to something more complex.

As a (hypothetical) example, if I wanted to build some sort of real-estate analysis application, at a high level it might involve the following “services”:

Broadway to ingest data from various real-estate data providers.
Nx/Torchx for doing some sort of analysis on the ingested data.
Oban for background tasks, like downloading images of listings and persisting them to S3 or sending scheduled property update emails to users.
Phoenix for the web frontend to view the analysed data.

What is considered the best practice for architecting something like this? From what I have managed to gather these are the two ways that seem most intuitive to me:

Creating an umbrella project to keep all the services together in a mono-repo but with some level of separation (however, I have seen that some people seem hesitant to recommend using umbrella projects). Coupling this with something like libcluster and/or Horde means it could easily be deployed to something like fly.io and would have good utilization of the underlying hardware.
Creating a project per “service” and having them communicate via some sort of API (I see there is a gRPC package for Elixir, though it doesn’t seem production-ready, so I guess this is not a common way of doing things). The downside here is that we now have network latency between service calls and would get worse utilization of the underlying hardware if using something like fly.io.

Sorry if this was a bit of an essay, but would appreciate any advice and if there are any good resources for me to read on the topic that would be great too!

tj0 · September 18, 2022, 8:26pm

There’s no need to do either, actually, since each GenServer pool is supervised and could be considered a service. The BEAM has its own communication fabric, so gRPC is also unnecessary unless you are planning to do polyglot services.

I’d just start with a regular Phoenix project and add modules / GenServers as required.
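
As a rough sketch of what that could look like (Listings.Cache and its functions are made-up names for illustration, not from any library), a supervised GenServer acting as a small in-process “service”:

```elixir
defmodule Listings.Cache do
  # Hypothetical in-memory "service": a GenServer holding a map of listings.
  use GenServer

  # Client API
  def start_link(_opts), do: GenServer.start_link(__MODULE__, %{}, name: __MODULE__)
  def put(id, listing), do: GenServer.call(__MODULE__, {:put, id, listing})
  def get(id), do: GenServer.call(__MODULE__, {:get, id})

  # Server callbacks
  @impl true
  def init(state), do: {:ok, state}

  @impl true
  def handle_call({:put, id, listing}, _from, state) do
    {:reply, :ok, Map.put(state, id, listing)}
  end

  def handle_call({:get, id}, _from, state) do
    {:reply, Map.get(state, id), state}
  end
end

# In a Phoenix project this child would normally sit in the list in application.ex;
# it is started directly here so the sketch runs on its own.
{:ok, _sup} = Supervisor.start_link([Listings.Cache], strategy: :one_for_one)

Listings.Cache.put(1, %{city: "Oslo"})
Listings.Cache.get(1)
```

If the nodes are connected, the same process can be reached from another node with GenServer.call({Listings.Cache, node}, message), which is the built-in fabric I mean, with no gRPC layer involved.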

cjbottaro · September 18, 2022, 8:28pm

Ohh buddy, you’re gunna stir the hornet’s nest with this one!

My day job has a similar setup to yours. We started with an umbrella app, but then just merged everything into a monolithic Elixir app with multiple top level namespaces… and I think it’s MUCH simpler.

We also use Kubernetes… but it’s more about individually scalable deployments, not code separation. We have dozens of Kubernetes deployments, most of them hooked up to HPA, but they all use a shared “Elixir app” build image.

Also, the concept of “RPC” in a dynamically typed language is interesting to me… We have “RPC” which is really just API calls over BSON+NATS (as opposed to JSON+HTTP). But is that really RPC? Isn’t RPC all about static type analysis, i.e. making remote calls look like local calls, including static type checking?

josevalim · September 18, 2022, 9:48pm

My suggestion would be to roll with a single repository, no umbrella, especially if you are a small team.

For example, your Broadway pipeline may need to use some database resources, and if those live in a separate application, you now need either a separate RPC service or an additional application to encapsulate the shared logic.

The benefit of the Erlang VM is exactly to juggle many things at once (Go as well?), umbrella or not.

I would go with umbrella and/or separate services if you have a larger team and you would naturally organize around some segments or if certain parts of the application have distinct scalability needs.

PS: using gRPC with Elixir is also completely fine.

LostKobrakai · September 19, 2022, 6:25am

I’d add to the suggestions already given that Elixir is usually rather nice to refactor. So this is really less of an “either-or” question than one might expect. In other languages I’ve seen people opt to start out with the most complexity from day 1, because the expectation is that things become a big ball of mud and you cannot change things later. To a degree this might happen with Elixir as well, but the functional nature of Elixir lets you change things quite a bit more easily, so you really can start with the MVP (even in architecture) and change things only when there’s demand for it.

bugant · September 19, 2022, 7:03am

Do you mean using a single mono-repo, with multiple apps/projects in there?

josevalim · September 19, 2022, 7:05am

Ah, good catch. I meant a single project/app.

thojanssens1 · April 18, 2023, 2:22am

Does your folder structure look like this?
/foo_app/lib/foo_app
/foo_app/lib/foo_app_web
/foo_app/test
/bar_app/lib/bar_app
/bar_app/lib/bar_app_web
/bar_app/test

Do those apps share the same Beam VM?
How many application.ex files do you have? If only one, is it in the root folder?
How are third-party dependencies managed? Does each app have its own mix.exs file?
How do you manage the dependencies between the apps? And how do they communicate?

Considering this is an alternative to umbrella apps for many cases, for which cases will we need umbrella apps? In other words, what will umbrella apps solve that top level namespaces can’t solve?

Exadra37 · April 18, 2023, 8:34am

I would use Phoenix exclusively for the web logic and then everything else that needs business logic not tied up with serving the request will go into a poncho project.

A poncho project is just a simple Elixir project in a parent folder that is then required as a dependency by the Phoenix web app.

See an example from following @PragDave’s course in this repo:

.
├── dictionary
│ ├── _build
│ ├── assets
│ ├── lib
│ └── test
├── hangman
│ ├── _build
│ ├── deps
│ ├── lib
│ └── test
├── hangman_live
│ ├── _build
│ ├── assets
│ ├── config
│ ├── deps
│ ├── lib
│ ├── priv
│ ├── rel
│ └── test
└── text_client
  ├── _build
  ├── lib
  └── test

The Phoenix app is in hangman_live while the core logic for the game is in hangman. The text_client is another way of playing the game, thus it also uses hangman, which is agnostic of how the game is being played: in a browser or in a terminal, it doesn’t matter. The dictionary was extracted from hangman because it’s generic and can be reused in other apps.

Example of requiring hangman as a dependency from hangman_live can be seen here:

{:hangman, path: "../hangman"},
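
For context, here is a sketch of where that line would sit in the mix.exs of hangman_live. The module name, version number and Phoenix version requirement are assumptions for illustration and are not copied from the repo:

```elixir
# hangman_live/mix.exs (sketch; a real Phoenix project has more options and deps)
defmodule HangmanLive.MixProject do
  use Mix.Project

  def project do
    [
      app: :hangman_live,
      version: "0.1.0",
      deps: deps()
    ]
  end

  defp deps do
    [
      # Assumed version requirement, for illustration only.
      {:phoenix, "~> 1.7"},
      # Sibling project pulled in by relative path; this is the poncho link.
      {:hangman, path: "../hangman"}
    ]
  end
end
```

The path is relative to the project that declares the dependency, so the sibling folders have to keep the layout shown in the tree above.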

This is what I like to call a clean architecture that gives you the benefits of microservices in a mono-repo, but without adding any great deal of complexity and while keeping things decoupled and reusable. If later you decide that you really need a microservices architecture, then you are almost there.

cjbottaro · April 18, 2023, 8:25pm

No, our dir structure is a vanilla Phoenix app, but with top level namespaces in lib…

/our_app/lib/our_app/application.ex
/our_app/lib/our_app/*
/our_app/lib/our_app_web/*

/our_app/lib/foo/*
/our_app/lib/bar/*

I.e. Foo and Bar are top level namespaces, and we have only one BEAM application which is :our_app.

If it wasn’t clear, I meant we reverted from umbrella apps to just normal Elixir/Phoenix apps, i.e. whatever is spit out by mix new or mix phx.new. And we don’t mind having top level namespaces. So instead of OurApp.Datastores, we simply have Datastores.

We conditionally start processes/supervisors based on environment variables, so something like this in application.ex…

children = if System.get_env("START_WORKER") do
  [OurApp.Worker | children]
else
  children
end
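
Fleshed out a bit, as a sketch rather than our actual code: OurApp.Children and the Agent-based OurApp.Worker below are stand-ins invented for illustration, and the environment is passed in as a map so the logic can be tried without touching real env vars.

```elixir
defmodule OurApp.Worker do
  # Placeholder worker; stands in for whatever process a deployment should run.
  use Agent

  def start_link(_opts), do: Agent.start_link(fn -> :idle end, name: __MODULE__)
end

defmodule OurApp.Children do
  # Builds the child list for the supervisor from a map of environment variables.
  # Defaults to the real environment (System.get_env/0 returns a map).
  def build(env \\ System.get_env()) do
    base = []

    if env["START_WORKER"] do
      [OurApp.Worker | base]
    else
      base
    end
  end
end

# The same image can then start different process trees per deployment.
{:ok, _sup} =
  OurApp.Children.build(%{"START_WORKER" => "1"})
  |> Supervisor.start_link(strategy: :one_for_one)
```

One thing to watch: the check is only about whether the variable is set. Any string is truthy in Elixir, so START_WORKER=false or START_WORKER=0 would still start the worker.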

I don’t see much benefit to umbrella apps.