Project structure and layering

Hey everyone,

After some discussions here on the forum, I decided to write a blog post about structuring a Phoenix app with a big focus on properly layering your domains: https://dev.to/pedromtavares/blazing-with-phoenix-project-structure-463l

I’m really curious to know how other people do this as I struggled to find material when I was tackling this problem myself, so hopefully this discussion will provide further guidance to Phoenix newcomers.

12 Likes

This is the way I structure my apps too. To avoid a lot of boilerplate, I use macros to create many default functions (CRUD, for example) which can be overridden when needed.
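For illustration, a minimal sketch of what such a macro might look like (this is not code from the thread; `MyApp.CRUD`, the `:schema` option, and `Repo` are all assumed names):

```elixir
defmodule MyApp.CRUD do
  # Hypothetical macro that injects default CRUD functions into a
  # context module; the :schema option picks the Ecto schema to use.
  defmacro __using__(opts) do
    schema = Keyword.fetch!(opts, :schema)

    quote do
      alias MyApp.Repo

      def get(id), do: Repo.get(unquote(schema), id)

      def create(attrs) do
        unquote(schema)
        |> struct()
        |> unquote(schema).changeset(attrs)
        |> Repo.insert()
      end

      # Let a context redefine only the functions it needs to.
      defoverridable get: 1, create: 1
    end
  end
end
```

A context would then do `use MyApp.CRUD, schema: MyApp.Hero` and override `create/1` only where the default doesn't fit.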

The blog post is a good explanation for newcomers struggling to find a good structure :slight_smile:

2 Likes

Really enjoyed the article. This is how my app is structured as well. It took me a long time to arrive here and I wish I had read something like this early on. There is a chance that I did, but that it didn’t click yet.

I don’t know that this is much of a difference, but for cross-domain orchestration I have been creating dedicated service/use-case modules for cases where more than just a couple lines of code are needed. These are located in a “/services” directory for lack of a better term. The top-level domain then delegates to those functions. Do you have any sort of convention for more complicated cases?

I also wonder how much you are thinking about the direction of dependencies between your contexts. Most of my contexts can query my Accounts context, but the opposite is not true. I try to enforce a hierarchy amongst my contexts. I haven’t used the Boundary library yet, but am very intent on exploring it to avoid tight coupling and circular dependencies.

@BartOtten - I’ve specifically tried to avoid macros and such for context functions until I have enough knowledge/examples of how my contexts typically shake out. I’m now at the point where I have a lot of ceremony and duplication. It is easy to read, but I’d like to push some of the things that I do every single time to macros for readability and to cut down on the need to write the same tests over and over. If you have any specific examples of macros that you would like to share, I’d love to see them.

1 Like

Great to hear!

Yeah, I think the idea of moving larger orchestration to its own module is fine; I place mine as direct children of the app-level domain, App.Conductor for example. I think moving these to their own folder would encourage filling that folder up, and if you start having too many of these then maybe that’s a sign that the current contexts aren’t very well abstracted, as the bulk of the work needs to be moved downwards, not sideways.

Forcing hierarchy amongst contexts is exactly what needs to be done; this way you create more layers of abstraction and avoid the problem that I mentioned above. I haven’t used that library either but really like the idea behind it, definitely helpful for teams to stay consistent with domain encapsulation.

Really appreciate the thoughtful post!

1 Like

Very nice walkthrough! This is definitely something I wish I read when I first started trying to wrap my head around the Phoenix ‘contexts’ concept.

I found it very interesting that you start with an ‘app’ context (the Rpg module) and then introduce more conventional contexts as a layer between that and the schemas. I really like that from an “agile” perspective because you don’t have to decide what divisions will emerge in your domain before you’ve done more than add a single schema.

Conversely though, on the idea of starting with query and services modules, I wonder if this caveat at the end of the article doesn’t also apply:

It’s up to you if you would like to make the app-level Rpg context your only public API. The downside of this is that you would have a lot of functions that just defer to either Gameplay or Accounts and on a larger app that can get very repetitive

Splitting schema logic into multiple modules seems to me like an optimization on par with maintaining separate layers of contexts like you do in your example. After all, why not start with one app context? If, or rather, when, your app context needs to be broken up into multiple “contexts” do that. But doesn’t the same apply to the schema module? If your schema is complex enough that it makes sense to break out all the query functions and all the service functions then by all means, do it. But why before? In your example, you start with the former optimization but not the latter, which is opposite to the conventional Phoenix approach, but I am inclined to take the most minimal approach from both first: 1 app context, 1 schema. Go from there.

To play devil’s advocate, it seems like Phoenix introduces the context convention somewhat aggressively in order to firmly dissuade the bad practice of putting everything in your schemas in a Rails-esque way. Part of what your article seems to imply is that maybe it should also encourage starting with a query and services module for the same purpose. But short of some framework level mechanism that makes the dev’s job easier as a result, I’d rather leave those decisions to the architect.

1 Like

Thank you!

When writing the post, I decided to focus on one convention at a time: first the Service/Query/Schema split, and then the top-level domain structure. That doesn’t mean you should follow that order when building your app. The important message I wanted to convey was the final structure; if you have an idea of how it should look in the future, then how you choose to start is completely up to you.

I personally like that Phoenix introduces contexts more aggressively; this raises the overall level of most Phoenix apps because people get interested in learning about them and are thus more prone to make proper architectural decisions.

I do agree with this. I think Rails would have been better served by making model namespaces a stronger convention instead of falling into the global services concept.

I’ve read the article and I was wondering if you have considered ‘defdelegate’ for things like ‘Accounts.create_user/1’?

Not having seen defdelegate used much, nor having used it myself, I wasn’t sure of its pros and cons. From what I’ve seen it’s just an extra touch of syntactic sugar, so I wonder if it’s worth introducing as part of a project convention.

There is a great discussion on that here: Is defdelegate necessary?

Personally, I think that less code is better: fewer chances for bugs, less to maintain (generally speaking, of course). Aside from the fact that I wouldn’t use layering with contexts, if I had to do it I would do it with defdelegate to simplify the code base.

I have to admit that I need to read your article more carefully, and I also want to check out the moba code base, but for now I do not see a reason why I should put the Users context under the Accounts context. I’ve worked in many larger codebases over the years and I’ve found that spreading code over multiple layers almost always becomes very hard to maintain in the end. This is a personal preference though, and it might not apply to everyone.

I’ll start using it in my projects from now on and see how it plays out; I’ll edit the article later if it does make things easier. Not having to retype the whole call is definitely better, especially when you have default values.

I’m curious how you structure these larger codebases: do you place Users as a standalone top-level domain? If you could write a small gist laying out the contexts for us to discuss, that would be fantastic.

All of my contexts use defdelegate. The file layout is exactly the same as in your blog post. The only difference is that a function name like Users.create/1 becomes Users.create_user/1. To me, this becomes slightly simpler to reason about. The main context file (Accounts) is just filled with a bunch of delegated functions which is easily scannable. The cost is slightly more verbose function names within Accounts.Users.
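To sketch what that layout implies (module and function names borrowed from the article’s example; this is illustrative, not the poster’s actual code):

```elixir
defmodule Rpg.Accounts do
  alias Rpg.Accounts.Users

  # The top-level context reads as a scannable list of delegations.
  # Note that default arguments have to be restated in the delegate head.
  defdelegate create_user(attrs \\ %{}), to: Users
  defdelegate get_user(id), to: Users
  defdelegate update_user(user, attrs), to: Users
end
```

Since the verbose `create_user` name also lives in `Rpg.Accounts.Users`, no `:as` option is needed on the delegations.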

1 Like

Larger codebases usually mean more developers. When you have to lead multiple developers, I found it’s easier to have a flatter structure. In the case of the RPG app, a (new/junior) developer could have a hard time finding things because functions they might need can live in multiple layers. For example, you would need to explain that a developer should only use RPG.Accounts.create_user/1 in their controller, never RPG.Accounts.Users.create_user/1.

I cannot think of a reason why Users cannot be a top-level context. That said, I also really don’t mind “large files”. Is it really a problem if the Accounts context becomes a few hundred lines, or a couple of thousand? Do you really need a separate file for the queries, or is it also fine to place them in the schema?

All this comes down to preference. I like to stuff my schema full of queries, create large context files, and call them Users, Products, Orders, etc. If that ever becomes a problem I’ll refactor, but for now I’d like to think I saved myself a lot of time by not worrying about context names, layers, and too many files.

@hlx - I’m trying to understand how you are defining a “context”. I think of a context as grouping similar modules/resources under a namespace to help with organization and keep track of dependencies.

For example, a MyApp.Intercoms context might have resources like Listing and Device, which would end up as MyApp.Intercoms.Listing and MyApp.Intercoms.Device. Are you saying you might opt for something like MyApp.IntercomListing and MyApp.IntercomDevice, whereby each of those resources is on its own (i.e. not really using contexts at all)?

There are parts of this article that I enjoyed, but there are some I don’t agree with. In general, I find the design of this example overly elaborate, and some suggestions overly general. For example:

… how we can structure our business logic with a simple convention focused on developer productivity. For every database table, I suggest having 3 supporting modules: a Schema, a Query and a Service.

I have to say that I don’t find this proposal very productivity-focused. Granted, it adds a strong structure to the code, which can be followed mechanically, but I feel that such structure is pretty shallow, i.e. it is focused on splitting the code by non-essential properties. As a result, things which would fit better together from a reader’s perspective are kept separate. For example, let’s consider the create function:

def create(attrs) do
  %Hero{}
  |> Hero.changeset(attrs)
  |> Repo.insert()
end

Here we immediately delegate to another module, and now I have to open a second file to see what’s happening before the insert. It’s also worth noting that Rpg.create_hero already delegated to Heroes, so basically one line of code in, I already have three files open, keeping the stack trace in my mind, and I’m no wiser about what goes on. I have to say I don’t find this particularly helpful or productive :slight_smile:

Another indication that there is something amiss with this design is the fact that changeset is only meant to be used by the Heroes service. Yet we return the Hero struct to the web layer, so it is free to invoke Hero.changeset/2 even though that makes no sense.

Yet another clue emerges if we try to type this function:

@spec create(map) :: {:ok, Hero.t()} | {:error, Ecto.Changeset.t()}

This is a very concrete abstraction, and yet the signature of its API function is completely generic. What’s in this changeset? I have no clue; I have to read the code to understand. Basically the delegation doesn’t bring anything useful, but forces me to jump back and forth in the code.

In this particular case, I’d address this by moving the changeset function to the service layer, and given its size, I’d inline it directly into Heroes.create/1. Going further, I feel that the whole Heroes service is overkill for such a small program. I’d move the code of Heroes.create/1 to the Rpg context, and we end up with:

defmodule Rpg do
  import Ecto.Changeset

  def create_hero(attrs) do
    %Hero{}
    |> cast(attrs, [:level, :is_enabled, :gold])
    |> Repo.insert()
  end
end

which I believe is much easier to read and maintain :slight_smile:

Going further, I’d also inline queries into the query functions, and move those to Rpg as well. With that, we lose two modules, and consolidate the code of all operations in a single place, making it easier for the reader to understand what each operation does.
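To illustrate, a hedged sketch of what an inlined query in Rpg could look like (`list_enabled_heroes` and the ordering are my assumptions; the `is_enabled` and `level` fields come from the article’s Hero schema):

```elixir
defmodule Rpg do
  import Ecto.Query
  alias Rpg.{Hero, Repo}

  # The query lives next to its only call site, instead of in a
  # separate per-schema Query module.
  def list_enabled_heroes do
    from(h in Hero, where: h.is_enabled, order_by: [desc: h.level])
    |> Repo.all()
  end
end
```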

Now, granted, as the code grows, the Rpg module will become bloated. However, instead of upfront design based on guesswork and some bureaucratic rules, I prefer to let the context grow a bit, and then perform the split based on the actual code which at the time exactly corresponds to the requirements. The gain is that at that point we have a deeper understanding of the world (because we’ve spent some time working on it), and we have a better understanding of requirements (because we implemented them), and so we have some concrete data from which we can see distinct groups of code.

A nice example could be the items concern. In the article, the code is small enough that I’d place this logic in Rpg. But over time, I can imagine some split might be needed. I prefer to wait until we have enough real code implementing actual requirements to see how that split will happen. Many times, I’ve witnessed that if I wait, things turn out much different than I originally guessed, because the requirements change and because our understanding of the domain deepens.

Sometimes it’s worth splitting upfront. A nice example of this is the accounts logic. We can be pretty certain that this logic will have some complexity (registration, authentication, password resets/changes, profiles, etc.), and that it will be mostly independent from our main logic. Hence, immediately placing account operations into the Accounts context makes sense, though I personally wouldn’t mind if the initial implementation is stashed in Rpg.

I hope you won’t take this criticism too hard. I’ve seen real damage arise from applying these principles mechanically. A team I consulted did that, and they ended up with a huge number of micro-modules in a relatively small code base. There was a strong sense of structure, and yet the code was incredibly confusing (not just to me; the original developers also didn’t like it), because of the large amount of code delegation and inter-module dependencies. Basically, it was hard to tell the forest from the trees in such a codebase, and to me this is not particularly better than stashing all code in a single bloated module.

Good modularity is IMO obtained by keeping together things which naturally belong together, while separating the things which don’t. This is not done to satisfy some academic principles, but to simplify the lives of the people who have to plow through the code on a daily basis. Ideally, a single module contains everything I need to know, while pushing aside everything I don’t. This will never be completely possible, as there is always some work cutting through concerns, but the code can be organized to be close to that ideal in most typical cases.

16 Likes

I was hoping you might weigh in here, @sasajuric - Admittedly, I read the article from the mindset of an already large codebase with fairly clear sub-systems, but can definitely see this as overkill at the outset. I originally had all functions for a given context in a single file, but have gradually moved to separate service, policy, and query modules with a context module that defdelegates to all of those functions. For larger contexts, I appreciate being able to see the entire context API on one page since the functions only take up a line apiece. Clearly a tradeoff there though.

Thanks for noting concerns about how changesets are used. My next refactoring is moving changesets from schema files to contexts; they never felt right being there in the first place!

3 Likes

Hey Sasa, thank you for the thoughtful post, it’s quite fantastic to be able to discuss these matters with someone of your expertise, everyone learns so much which to me is the whole point of all of this.

I’m considering changing some wording in the article because I wasn’t clear: I do not suggest starting with such a rigid structure from the beginning. I chose to slowly present my proposed convention with easy-to-understand examples while keeping the post short and dense, as I feel that is a more effective way of communicating to a broader range of people.

As @baldwindavid mentioned, the article is aimed at providing structure to a larger codebase. I completely agree that small abstractions need to have a small code footprint, and until you start having multiple sub-systems, keeping everything under one context can be much better, as you pointed out.

I’ll have to test your suggestion of putting changesets in the service module as well. I’ve had situations where some business logic leaked into changesets and questioned whether they should really be in schemas, but given that this is how it’s presented in the Phoenix guides, I chose to keep that convention as is. Again, that is why I think talking about this is so valuable.

Edit: added this section to the introduction

It’s important to note that the conventions laid out here are focused on optimizing larger codebases, so if you have a small project, following the patterns set by the Phoenix generators is completely fine and will make you more productive. It’s always best to start with simple abstractions and refactor as the project evolves.

:slight_smile:

4 Likes

Yeah, both Phoenix and Ecto docs propose keeping changeset functions in schemas. This is one of the things where I disagree with the “blessed way” (some other examples being overuse of app config and organization of web files by controller/view/template role).

When it comes to business logic leaking in changesets, I’d argue that this will be the case anyway, no matter where you put these functions. The thing is that Ecto is a somewhat “dirty” concept because it mixes domain modeling and data storage concerns. So, for example, changeset functions typically contain some business validations. Now, to be clear, I think this is actually a good thing, b/c we can start simple without all the ceremony of transferring the data to library/framework independent data structures and back. I also believe that this simple approach can scale far with respect to complexity. And finally, I feel that it should be easy enough on the beginners, without limiting the options for more experienced programmers.

But either way, the consequence of typical Ecto usage is this mixing of persistence and domain concerns, as well as the fact that the application view of the world maps exactly to the relational model, which is not always perfect.

Regardless, I think that intertwining changeset operations with some amount of business logic is fine, as long as it’s not overdone. However, keep in mind that this will cause the context to grow more rapidly, and when it becomes too large, I suggest splitting by paying attention to cohesion, i.e. extracting things which naturally belong together, and separating things which are completely independent. When I’m doing this, I first perform a casual scan through the module to bootstrap my brain and grow some refactoring ideas, i.e. pick some group of functions which could be extracted to another module. Then I cut-paste one function, and rely on compiler warnings/errors to move the rest of the associated code. Once I get the project to compile and the tests to pass, I reflect on the new shape of the code. If I’m not happy, I might revert and try something else, or I might try further splits, or maybe move some things back. Sometimes I find that the new state is not really better than the previous one, and then I just postpone refactoring until another time.

It’s not a straightforward process, and it requires some critical thinking, but I feel it leads to a much better separation of concerns compared to mechanical division based on secondary properties (like e.g. putting changesets here, and queries there).

There will be more complex domains where mixing Ecto with business logic is going to be noisy, but I feel that in such cases it’s better to explore pure domain modeling, i.e. transferring data from Ecto schemas to plain data structures (which might be organized significantly differently from the relational model), and doing all business logic on this pure model. In such an approach, Ecto will still be useful for data transfer, but it then becomes more like supporting infrastructure, and should probably be moved out of the context, or buried deeper in some internal persistence subcontext.

3 Likes

Yeah, that’s my sentiment too. I also initially wrote changesets in schemas (b/c docs said so :slight_smile:), but I never got comfortable with it. About a year ago I proposed to a client team to try writing changesets as private funs in contexts. There was some initial skepticism (b/c docs said otherwise :slight_smile:), but it turned out great, and I definitely don’t see myself going back to writing changesets in schemas.
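A sketch of that arrangement (field names borrowed from the article’s Hero example; the gold validation is a made-up placeholder):

```elixir
defmodule Rpg.Heroes do
  import Ecto.Changeset
  alias Rpg.{Hero, Repo}

  def create(attrs) do
    %Hero{}
    |> changeset(attrs)
    |> Repo.insert()
  end

  # Private: the web layer can no longer call the changeset
  # directly on a Hero struct returned by the context.
  defp changeset(hero, attrs) do
    hero
    |> cast(attrs, [:level, :is_enabled, :gold])
    |> validate_number(:gold, greater_than_or_equal_to: 0)
  end
end
```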

3 Likes

I’ve gone through a lot of that within my contexts and they are still evolving. Separating resources into commands/services and queries has been nice in some larger contexts and unnecessary in others.

Each context in my app also currently has a single authorization policy file. I’ve tried this in a few other places: completely outside any contexts, inline within each resource’s service code, and as separate per-resource files. I tend to think that a context should contain the information about how its resources are authorized so have kept it within the context. Inline felt too cluttered and unfocused while separate per-resource files was overkill for the number of auth rules currently being used.

For the most part, I’m fairly content with “in-context” file organization. It’s easy to change and experiment because all of the tests are against the context’s public API. Where I’m less content is in dealing with “cross-context” operations. As mentioned above, I’m currently placing those into a “services” directory under the main app. These are modules like MoveInTenant, CaptureContactRequest, and MoveOutTenant, which necessarily need to interact with multiple disparate contexts. They typically have their own lifecycle with their own new, change, and call functions, as well as their own embedded structs and changesets for custom forms. This has worked mostly fine, but as @pedromtavares mentioned above, it’s possible this is a design issue that I’m not quite sure how to work my way out of. Or maybe it is just an organizational issue. At any rate, it feels a bit ad hoc compared to the “in-context” conventions.
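For context, a skeleton of such a service module might look like the following (MoveInTenant is the name mentioned above, but every field, and the Leasing/Billing contexts, are my assumptions, not the poster’s real code):

```elixir
defmodule MyApp.Services.MoveInTenant do
  # Avoid clashing with our local change/2 below.
  import Ecto.Changeset, except: [change: 1, change: 2]
  alias MyApp.{Leasing, Billing}

  defmodule Form do
    use Ecto.Schema

    # Embedded schema backing the custom form for this use case.
    embedded_schema do
      field :tenant_id, :integer
      field :unit_id, :integer
      field :move_in_date, :date
    end
  end

  def new, do: change()

  def change(form \\ %Form{}, attrs \\ %{}) do
    form
    |> cast(attrs, [:tenant_id, :unit_id, :move_in_date])
    |> validate_required([:tenant_id, :unit_id, :move_in_date])
  end

  # call/1 orchestrates across contexts; each context keeps its
  # own persistence details behind its public API.
  def call(attrs) do
    with %{valid?: true} = changeset <- change(%Form{}, attrs),
         form = apply_changes(changeset),
         {:ok, lease} <- Leasing.start_lease(form),
         {:ok, _invoice} <- Billing.open_account(form) do
      {:ok, lease}
    else
      %{valid?: false} = changeset -> {:error, changeset}
      {:error, reason} -> {:error, reason}
    end
  end
end
```

In a real app the `call/1` body would likely run inside an `Ecto.Multi` or `Repo.transaction` so the cross-context steps succeed or fail together.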

2 Likes