Questions about Umbrella apps

I’ve been reading the Umbrella apps threads (one, two) and have come away with some questions.

If I understand correctly, we can think of Umbrella apps as contexts in Phoenix 1.3. If I were to break up a Phoenix 1.3 app, the natural thing to do would be to break its contexts into separate apps and include them as dependencies in the mix file (assuming it is well designed, to begin with).

For example, an online store app might have Cart and Catalog contexts. It might have a web storefront and an admin section. In an Umbrella setup, it would be something like:

Repo app
Catalog app
Carts app
Store Web
Admin Web

So far that’s fairly straightforward. I don’t see including a context in a controller as too different to including an app (which essentially seems like a context to me).

Other questions come to mind, though:

  1. Do env variables get set at the Umbrella level?
  2. What does the workflow look like? Can I keep the Umbrella app in a single git repo and treat it as a single project?
  3. How does it affect deployment? In the above scenario, can I deploy the Umbrella app and have two web apps running on separate ports?
  4. Inter-app communication would be via shared modules and not HTTP? (Umbrella vs micro-service, I suppose)
  5. Where would migrations live? At the Umbrella level, or the app level?

Yes

That is the usual workflow in my experience

Yes, two separate apps would listen on two separate ports. There is some endpoint trickery you can do to let them share the same port, but the default would be as you suggested. So your Admin Web will listen on a different port than Store Web, which might be a good thing.
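To make the shared configuration and the separate ports concrete, here is a minimal sketch of what the umbrella's root config could look like. The app names (:store_web, :admin_web), the endpoint modules and the port numbers are assumptions for illustration. It also assumes Elixir 1.9 or later for `import Config`; older projects would use `use Mix.Config` instead.

```elixir
# config/config.exs at the umbrella root (sketch; names and ports are hypothetical)
import Config

# Each child app is configured under its own application name,
# but the settings live in the umbrella's shared config.
config :store_web, StoreWeb.Endpoint,
  http: [port: 4000]

config :admin_web, AdminWeb.Endpoint,
  http: [port: 4001]
```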

Yes, the whole runtime including processes and modules and all the code is shared. It all runs on the same VM if you use a single node and do not do any clustering just yet, and communicates using function calls and local message passing (i.e. call/cast).
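As a rough illustration of both styles, the sketch below has a hypothetical Carts module making a plain function call into a hypothetical Catalog.PriceCache, which in turn uses GenServer.call for local message passing. None of these names come from the thread, and in a real umbrella each module would live in its own app.

```elixir
# Would live in the (hypothetical) catalog app.
defmodule Catalog.PriceCache do
  use GenServer

  def start_link(prices) do
    GenServer.start_link(__MODULE__, prices, name: __MODULE__)
  end

  # Public API: a normal function that hides the message passing.
  def price_of(sku), do: GenServer.call(__MODULE__, {:price_of, sku})

  @impl true
  def init(prices), do: {:ok, prices}

  @impl true
  def handle_call({:price_of, sku}, _from, prices) do
    # Map.fetch/2 returns {:ok, price} or :error.
    {:reply, Map.fetch(prices, sku), prices}
  end
end

# Would live in the (hypothetical) carts app, which depends on catalog.
defmodule Carts do
  def line_total(sku, quantity) do
    # A direct function call across apps: no HTTP, no serialization.
    with {:ok, unit_price} <- Catalog.PriceCache.price_of(sku) do
      {:ok, unit_price * quantity}
    end
  end
end

{:ok, _pid} = Catalog.PriceCache.start_link(%{"mug" => 500})
Carts.line_total("mug", 3)
```

An unknown SKU falls through the `with` and returns :error to the caller.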

In your Repo app.

Most of your technical questions have already been answered but I feel the need to point out that the breakdown you showed might be a bit too micro (that depends on your needs of course). What I usually do is to have:

  • data app, contains all data structures and all storage mechanisms’ drivers (PostgreSQL, Mnesia, Riak, Redis, Cassandra, what have you). Only schemata and migrations here, zero business logic if possible.
  • domain app, or, lately I started naming that app with the name of the business itself. Contains every single piece of functionality to make your web app or API work – like the Phoenix contexts, more or less. Stuff like User.forgot_password or Cart.add_line_item or Order.finalize or Customers.send_marketing_emails goes here. It’s also a good idea to include your backend-agnostic authentication and authorization logic here, or, if that proves to be too big, refactor it out into a separate app (so far I have found that to be overkill though).
  • web app, where I put everything needed for the website (or websites) of the business (Phoenix, Plug). Uses the functions in the domain app heavily, has no idea about the data storage mechanism at all. Should use functions like Cart.delete or Accounts.confirm_registration that change the database below and must never ever use a single Ecto-specific function or module (or whatever direct storage library). Might include authentication / authorization modules, as long as they are specific for the websites only.
  • api app, where everything that exposes REST or GraphQL endpoints lives. Same as above: uses the domain app heavily, must have no idea about the data storage mechanism. API-specific authentication / authorization modules included.
  • Lately I started adding another one in my projects: reports. Reports are a very weird beast and more often than not it’s best if they are just stored SQL procedures but when that’s not possible (namely when no dev wants to get their hands dirty) it pays really well to have all the ugly compromises and flaky performance optimizations in an entirely separate app so you can change stuff around without affecting your business logic or anything else.

One important caveat though: when using Ecto – like most projects do – having business logic wrapped in changeset functions is very valuable. Code like this:

params
|> cast(:your_association)
|> validate_required(...)
|> ensure_enough_stock(...)
|> apply_discounts(...)

…is very intuitive and the pipe-able nature of the changeset functions that usually live in your Ecto schema files makes validation and any extra requirements and data reshaping very convenient. In the end I left only validation in these (and they have to 100% mirror any database-level constraints which are encoded in the migrations; such duplication is one of the very few flaws of using Ecto) and moved the extra checks and processing into the business logic app – and I am calling them directly from the Ecto schema file changeset function.

That’s kind of ugly and creates a mutual dependency but so far hasn’t been an issue and still gives you a pretty strict separation of concerns.

In general: don’t let frameworks shape the project’s file structure for you. Think for yourself what makes sense, which apps/modules should be mutually dependent, which should be black boxes to the apps/modules that use them, and make the call. Elixir isn’t conservative in this regard; Phoenix just gives you sensible and easy to work with defaults but they are by no means mandatory. Trust in your own judgement! :+1:

My $0.02.

Thanks. Would it not make more sense for the schemas to live in the app (or context if you will, as is the case with Phoenix 1.3)?

I may be wrong, but reading your reply makes me a little apprehensive. It seems like added complexity, and in my mind it makes reasoning about the app difficult, but of course I’ve never seen your code or worked on your app, so I might have the wrong impression.

Thank you.

In your Repo app.

The schemas would be in the app, though?

I put them in the repo/db app. But I don’t generally put a business logic in them either.

I put them in the repo/db app. But I don’t generally put a business logic in them either.

I take that to mean that your changeset functions also live in the app or context?

I am brand new to Elixir and haven’t even started learning Phoenix yet.
I just read about Umbrella Apps yesterday and like the idea of dividing an App by ‘context’ to prevent creating one big, messy, monolithic app.

Simple question: Can one ‘sub-app’ easily inter-communicate with another ‘sub-app’ (i.e. sub-app A gets information from sub-app B and stores information to sub-app C)?

Yes. See “Dependencies and umbrella projects”:

defp deps do
  [{:kv, in_umbrella: true}]
end

…in mix.exs should be enough.

Yes, but note that you should only be calling in one direction. So B can depend on and call functions from C, but C cannot depend on or call functions from B. There are ways around it (such as registering a listener), but generally you want to have just one-way dependencies.
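One way the "registering a listener" workaround could look is sketched below. The lower-level module (standing in for C) accepts callbacks and never names its callers, while the higher-level module (standing in for B) registers one. Orders, Notifications and their functions are hypothetical names, and the Agent-based registry is just one simple choice among several.

```elixir
# Stands in for app C: knows nothing about the apps that depend on it.
defmodule Orders do
  use Agent

  def start_link(_opts \\ []) do
    Agent.start_link(fn -> [] end, name: __MODULE__)
  end

  # Higher-level apps register a one-argument function to run on each new order.
  def subscribe(listener) when is_function(listener, 1) do
    Agent.update(__MODULE__, fn listeners -> [listener | listeners] end)
  end

  def place(order) do
    listeners = Agent.get(__MODULE__, & &1)
    # Listeners run in the caller's process, in no guaranteed order.
    Enum.each(listeners, fn listener -> listener.(order) end)
    {:ok, order}
  end
end

# Stands in for app B: depends on Orders, never the other way around.
defmodule Notifications do
  def attach(recipient) do
    Orders.subscribe(fn order -> send(recipient, {:order_placed, order}) end)
  end
end

{:ok, _pid} = Orders.start_link()
:ok = Notifications.attach(self())
{:ok, _order} = Orders.place(%{id: 1})
```

The dependency in mix.exs still points only from the higher-level app to the lower-level one; the callback is plain data handed over at runtime.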

Thanks dimitarvp and axelson! Much appreciated

The suggestion to store schemas and migrations in a Repo app is still not very clear to me. This means centralising validations and business logic in the Repo app, or moving the changesets and validation rules down to the context, or app, in this case. Is that what usually happens?

The migrations and Ecto schemas are (hopelessly) tightly coupled with the underlying datamodel of the RDBMS. So as such they will always belong to the Ecto repo.

This means centralising validations and business logic in the Repo app

No. Most of the confusion arises from the desire to leverage the Ecto.Changeset functionality for validation within the context (app) in concert with the Ecto schema structs. From a DDD purist perspective that is an anti-pattern.

A DDD repository would typically unmarshal the RDBMS data into data types defined by the context - rather than the context depending on data types defined by the repository. You are free to implement a repository this way but at the cost of easily using Ecto.Changeset for validation within the context. (Inside the repository you are free to use Ecto.Changeset as much as you like - but any data entering or leaving the repository is supposed to be valid domain data).
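A minimal sketch of that direction of dependency, in plain Elixir with no Ecto involved: the context defines the struct, and the repository converts a stored row into it or rejects the row. Billing.Customer, Billing.CustomerRepository and the row shape are assumptions made up for this example.

```elixir
# Domain type owned by the context, not by the storage layer.
defmodule Billing.Customer do
  @enforce_keys [:id, :email]
  defstruct [:id, :email]
end

defmodule Billing.CustomerRepository do
  alias Billing.Customer

  # `row` stands in for whatever the storage layer returned.
  # Only valid domain data leaves the repository.
  def to_domain(%{id: id, email: email}) when is_binary(email) do
    {:ok, %Customer{id: id, email: email}}
  end

  def to_domain(_row), do: {:error, :invalid_row}
end

row = %{id: 7, email: "ann@example.com", password_hash: "not-needed-here"}
Billing.CustomerRepository.to_domain(row)
```

Storage-only fields such as password_hash never reach the context, which is the point of unmarshalling at the boundary.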

I suspect that is one of the beefs that Dave Thomas may have with Ecto.

I don’t think we should go all the way to the biggest heights of absurdity while chasing DDD purism as Java’s “enterprise” stacks often do, though.

A couple of points about Ecto:

  1. The maintainers are considering splintering Ecto in two: core and SQL. Core would handle changesets, validations and overall managing changes in structures per operation / transaction. That’s not bound to SQL at all. You can do it in hobby projects while exploring append-only immutable DBs, for example. Ecto’s SQL-neutral parts will work just fine regardless.
  2. To me, Ecto is a fantastic glue and manager when working with pieces of data that change over time. No need to wrap that in an even higher-level abstraction which is gonna be specific for your project only and will only increase confusion while onboarding, and force people into practices which might or might not help their career prospects in the future.

I see nothing wrong in accepting Ecto as a hard dependency. In commercial projects we have to prioritize readability, predictability and minimal WTFs per minute, and most of the time making the rational and practical calls is quite enough.

EDIT: All that being said, I still don’t use Ecto outside of my data app (it lives inside an umbrella which handles all storage and validation; that’s how I usually structure my commercial Elixir apps). But I don’t see anything wrong with using Ecto’s SQL-neutral parts outside these bounds.

Dogmatic pursuit of “purity” for its own sake is rarely beneficial. However the idea behind Ecto’s schemas is that the shape of your data will likely closely mirror the shapes that already exist in your RDBMS’s datamodel - an idea that is rooted in the realm of CRUD applications.

However, as explored in the references in this post, in some “enterprise-y situations” the shape of the data used in the business context(s) may be some strange projection of the data as it actually exists within the RDBMS - at which point DB schema based structures and generic changeset functionality will be much less relevant - so the context defines the shape of the data that it needs to work with (not the repository).

Now if your business context has autonomy over its persisted data (i.e. data is not shared on the level of persistent storage) then there should be fewer reasons for divergent data shapes in persistent storage and the context.

That’s a problem only in legacy apps – like some heavily denormalized databases in 10+ year old systems where data duplication and the impossibility to maintain the stored procedures that held the whole thing together are a real threat. If your RDBMS model makes total sense for your business and code then the only problem would be possible worries about future-proofing (which are admittedly always 100% legitimate since none of us can accurately predict the future).

Oh, absolutely. That’s why I have User.get(user_id) and Accounts.get_customer(user_id) and Accounts.get_billable_profile(user_id) etc.; the contextual functions either transform the data structure or just return a subset of the struct’s keys. If you have the slight extra bandwidth you should IMO do that because it forces you to work with limited information and only include keys which are truly needed for the particular subsystem of the bigger business app. This is not about privacy or personally-identifiable information; this is about not leaking info where it should not be needed (the functions that operate on the billable profile should not care about the encrypted password or the last signed in IP or date, for example).

This should not be overdone, however; instead of using other structs, most of the time I just use maps and pattern-match the required keys in the related functions. (Although it might still be a better idea to have not only a User struct but Customer and BillableProfile as well.)
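A small sketch of the maps-plus-pattern-matching approach, under assumptions: the key names, `Accounts.billable_profile/1` and `Billing.invoice_header/1` are made up for illustration, and `user` stands in for whatever a function like User.get/1 returns.

```elixir
defmodule Accounts do
  # Only the keys the billing subsystem actually needs (hypothetical list).
  @billable_keys [:id, :name, :billing_address]

  # Returns a plain map holding a subset of the user's keys.
  def billable_profile(user) when is_map(user) do
    Map.take(user, @billable_keys)
  end
end

defmodule Billing do
  # Pattern-matches just the required keys; it never sees password hashes or IPs.
  def invoice_header(%{name: name, billing_address: address}) do
    "#{name}, #{address}"
  end
end

user = %{
  id: 1,
  name: "Ann",
  billing_address: "1 Main St",
  password_hash: "secret-hash",
  last_sign_in_ip: "127.0.0.1"
}

user |> Accounts.billable_profile() |> Billing.invoice_header()
```

Swapping the plain map for a dedicated BillableProfile struct later would only change `billable_profile/1` and the pattern in the function head.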

@dimitarvp @peerreynders can’t we create a “Repo” context?

As it stands, I’m not sure I see the value added of umbrella apps when I can split my app into contexts and namespace my web layer (as well as write a router for each namespace). In Phoenix 1.4, you have to import the router into the controller anyway.

What would be the gain of going a further step and, instead of a Repo context, split up the app into multiple ones?

Consider this app dependency graph: web -> domain -> storage.

The umbrella is a soft limiter in terms of dependencies and gets the message across that your web app shouldn’t touch your storage app; it should only go through the domain app modules and functions. At least for the several Elixir devs I have worked with so far, it conveyed the message clearly.

It’s obvious that there’s nothing stopping you – but when your web app has {:domain, in_umbrella: true} but does not have the same for the storage app, your teammates (and you) should understand not to cross the boundary and thus preserve some sanity and simpler dependency graphs in the project.
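Spelled out as mix.exs fragments, that boundary might look like the sketch below. The app names :web, :domain and :storage are taken from the dependency graph discussed here; the file paths are assumptions about a conventional umbrella layout.

```elixir
# apps/web/mix.exs (fragment)
defp deps do
  [
    {:domain, in_umbrella: true}
    # Deliberately no {:storage, in_umbrella: true}:
    # web is only meant to go through domain.
  ]
end

# apps/domain/mix.exs (fragment)
defp deps do
  [
    {:storage, in_umbrella: true}
  ]
end
```

Because all umbrella apps end up on the same VM, this signals intent rather than enforcing isolation, which matches the "soft limiter" description.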

Additionally, I find umbrellas refreshing because they allow you to have several piles of code in different directories as opposed to one huge pile of unrelated code in the same directory. But that’s highly subjective of course and I am not trying to convince anyone that it’s “the one true way”.

And to expand a little on what @dimitarvp said. The idea is that you should be able to swap web with something like console and have console -> domain -> storage and not have to touch any of the code in domain as web or console is “just an interface” to the business logic within domain – i.e. “(web|console) is not your application.”

Right, but you have that with contexts.

Users.register(params)

In this case, you can call that from the web, the console, or whatever. It doesn’t matter here if the Users module is a context module or in an umbrella app. I understand the philosophical separation. But contexts seem to be “umbrella light,” without some of the added complexity. Unless I’m mistaken, in hard technical terms, I’m not sure I see a hard delimiter. Of course, the granular code organisation is nice, but I rather enjoy working with contexts.