Lonestar ElixirConf 2017 - KEYNOTE: Phoenix 1.3 by Chris McCord

35 Likes

Fantastic talk!

I think the changes make a lot of sense and I really like how Chris wants to get the guides and book updated asap as well - this will definitely be a huge help to quickly get people up to scratch on the 1.3 way of doing things :slight_smile:

3 Likes

One question I would have is how one should approach the sharing of certain entities between Bounded Contexts. For example, see the slide at 16:05—Consumer and Product “belong” both in the Support and Sales contexts. However the code structure being proposed would require us to place these schemas in either one or the other.

What’s the recommended approach here? I see three possible ways to reconcile this:

  1. Place the schemas in one of the contexts, and simply accept that the other context is coupled to this one by allowing it to directly use the schema modules, execute Repo operations, etc. This seems bad, since we would end up e.g. having support-related changesets in the sales context, just because the Consumer schema happens to live there.
  2. Extract the schemas to be “global”, so that they don’t exist in any bounded context. They have the base repo functionality needed by all contexts, and the two contexts call in and use them as needed. This seems to be a return to the Phoenix 1.2 approach, but perhaps it is warranted in these cases.
  3. Build a new context for these shared modules, and make the Sales and Support contexts interact with them across a context boundary. This initially seems nice, but then it would not make much sense to place Consumer and Product in a context together just because they’re both shared by the same two contexts—what if a third context also wanted access to Consumer but not Product? Therefore we would probably end up building a separate context for every schema which may be shared by more than one other context. This seems like a quite involved approach, especially for larger apps where we would end up with quite a lot of these one-schema contexts, but perhaps this would encourage us to think better about isolating our contexts as much as possible to minimise these shared schemas.

I would love to hear others’ thoughts on this, and if @chrismccord could weigh in, even better!

5 Likes

Actually we specifically want to promote the opposite. You are misinterpreting the diagram in this case. The diagram shows that each context has its own independent Consumer and Product entities. So this is something we want to push folks towards and I could have covered better in the talk. For example, you’d have this dir structure:

|– lib
|–––– sales/
|––––––– consumer.ex # schema
|––––––– product.ex  # schema
|––––––– sales.ex    # context
|
|–––– support/
|––––––– consumer.ex # schema
|––––––– product.ex  # schema
|––––––– support.ex  # context

[quote=“mjadczak, post:3, topic:3908”]Extract the schemas to be “global”, so that they don’t exist in any bounded context. They have the base repo functionality needed by all contexts, and the two contexts call in and use them as needed. This seems to be a return to the Phoenix 1.2 approach, but perhaps it is warranted in these cases.
[/quote]

The opposite. You won’t have global schemas and stray Repo calls. Contexts will manage their own schemas, in different tables. When you run the generators, we namespace the schema tables under the context name, so you’d have "sales_products". Your Support.Product schema could have its own database table, or it could be an embedded schema that fetches the canonical %Sales.Product{} through the bounded API, and then decorates it as a support product as needed. Whether you use two separate database tables or the fetch-and-decorate approach will depend on your app. In either case, you don’t have a global schema or a break in the bounded API.
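
To make the fetch-and-decorate option a bit more concrete, here is a minimal sketch in plain Elixir. It is only an illustration: the function names (`fetch_product/1`), the `troubleshooting_url` field, and the in-memory maps standing in for database tables are all assumptions, not something the generators produce.

```elixir
defmodule Sales.Product do
  # Canonical product data, owned by the Sales context.
  defstruct [:id, :name, :price]
end

defmodule Sales do
  # Bounded API of the Sales context. An in-memory map stands in for the
  # "sales_products" table (hypothetical data; price is in cents).
  @products %{
    1 => %Sales.Product{id: 1, name: "BeanTron 3000", price: 49_900}
  }

  def fetch_product(id) do
    case Map.fetch(@products, id) do
      {:ok, product} -> {:ok, product}
      :error -> {:error, :not_found}
    end
  end
end

defmodule Support.Product do
  # Support's own view of a product: a few canonical fields plus
  # support-only data.
  defstruct [:id, :name, :troubleshooting_url]
end

defmodule Support do
  # Support-only data, keyed by the canonical product id (hypothetical).
  @troubleshooting %{1 => "https://example.com/support/beantron-3000"}

  # Fetches the canonical product through the Sales API, then decorates it.
  def fetch_product(id) do
    with {:ok, %Sales.Product{} = product} <- Sales.fetch_product(id) do
      {:ok,
       %Support.Product{
         id: product.id,
         name: product.name,
         troubleshooting_url: Map.get(@troubleshooting, id)
       }}
    end
  end
end
```

Support never touches the storage behind Sales here; it only calls `Sales.fetch_product/1` and builds its own struct from the result.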

9 Likes

Thanks for great work! Many interesting changes.
The whole idea of Bounded Contexts is quite questionable though. It makes app logic a bit too complex imo

  1. joining mjadczak’s question about shared models (schemas)
  2. in the short run, having multiple contexts and coordinating code makes things too complex without many benefits
  3. in the long run, having multiple contexts means the “lib/” folder is overly populated and it’ll be hard to find something in there
  4. the context file (e.g. blog.ex in the video) looks like it is going to grow big, absorbing various semi-related features, mostly stuff that previously belonged to models and is now gathered in one place. I don’t think it’ll help make the code clearer
  5. P.S. where does the “embedded schema” for User Registration from the video go? It’s not clear from the presentation. imo that’s a separate kind of thing, not just a regular “schema”, and there will be loads of them, so maybe it makes sense to move them to a separate folder

1 Like

We aren’t saying you should have two-dozen Product schemas and product tables, but we are saying you should consider when it makes sense to represent a “Product” independently in parts of your domain. If you have a “sales” system selling products, the sales context is likely going to be the authority on your product system and manage the core data for your products. So other parts of your app can interact with products, update them, create them, etc, but they will do so from the Sales boundary, not by reaching into the internal storage.

In our experience seeing real applications in the wild, a dumping ground like web/models is exactly what caused an overpopulated structure that is hard to understand at a glance. One of the explicit goals of 1.3 was to let you understand what the app does, its features, and how to use those features just by looking at the dir structure, so I don’t agree with this point.

It’s worth reiterating that these suggestions are the same things you should think about when designing any Elixir API. If you start feeling like your module is too large and handling too many concerns, then it’s often the case that you can extract the concerns to different modules and functions. As far as making the code clear, gathering related, dependent parts of your domain in one place and isolating them is exactly what helps to make the code clear :slight_smile: It will be easier to find docs, easier to see relationships, and most importantly, easier to maintain in the long run, so I don’t agree with this assertion.

Inside lib/accounts/registration.ex. In this case the module could serve as both the API and the embedded schema. You could also expose Accounts.register_user, which would have been a better example for that slide.
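
For anyone wondering what that could look like, here is a rough sketch as a standalone script (it assumes Elixir 1.12+ for `Mix.install` and Ecto 3.x). The field names and the password rule are made up for illustration, and actually persisting the user is left out; `register_user/1` only validates the input here.

```elixir
Mix.install([{:ecto, "~> 3.0"}])

defmodule Accounts.Registration do
  use Ecto.Schema
  import Ecto.Changeset

  # Not backed by a table: it only models the registration form input.
  @primary_key false
  embedded_schema do
    field :email, :string
    field :password, :string
  end

  def changeset(registration, attrs) do
    registration
    |> cast(attrs, [:email, :password])
    |> validate_required([:email, :password])
    |> validate_length(:password, min: 8)
  end
end

defmodule Accounts do
  alias Accounts.Registration

  # Public entry point of the context. Returns {:ok, registration} for
  # valid input or {:error, changeset} otherwise. Creating the actual
  # user record would happen here in a real app.
  def register_user(attrs) do
    %Registration{}
    |> Registration.changeset(attrs)
    |> Ecto.Changeset.apply_action(:insert)
  end
end
```

A controller would then call `Accounts.register_user(params)` and never needs to know that an embedded schema is involved.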

Hope that helps!

4 Likes

Thanks! Just fearing too deep && wide directory tree (unlike rails and phoenix <=1.2) can scare off newcomers

2 Likes

One thing to notice is that having multiple “Product” schemas does not mean you need to have multiple tables in database!
Thanks to explicit schema declarations in ecto, you can have all of them refer to the same table in database and pick only the fields this context is interested in - leaving the others untouched. This also has the advantage of performing more optimal queries - retrieving only the columns we care for.

Database and application logic are separate - one doesn’t need to be coupled to another.
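
A small sketch of what this looks like, written as a standalone script (assuming Elixir 1.12+ for `Mix.install` and Ecto 3.x). The column names `price` and `support_notes` are just examples; the point is that both schemas name the same `"products"` source while declaring different fields.

```elixir
Mix.install([{:ecto, "~> 3.0"}])

defmodule Sales.Product do
  use Ecto.Schema

  # Sales cares about pricing.
  schema "products" do
    field :name, :string
    field :price, :integer
  end
end

defmodule Support.Product do
  use Ecto.Schema

  # Same table, but only the columns Support is interested in.
  schema "products" do
    field :name, :string
    field :support_notes, :string
  end
end

# Schema reflection shows both modules map to the same table
# while exposing different fields.
IO.inspect(Sales.Product.__schema__(:source), label: "Sales source")
IO.inspect(Support.Product.__schema__(:source), label: "Support source")
IO.inspect(Sales.Product.__schema__(:fields), label: "Sales fields")
IO.inspect(Support.Product.__schema__(:fields), label: "Support fields")
```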

13 Likes

There is nothing that Phoenix can do that won’t require developers to think about their domain. If you don’t break into contexts, you will end up in the situation you have today with a large web/models directory without any idea of what your application does (schemas reflect the database structure and not your application structure), how those models relate to each other and what are the proper boundaries.

However, if you break into too many contexts, or if you have contexts violating each other’s boundaries, you are also going to end up with messy code anyway.

I believe questions such as where to put an embedded schema are exactly the kind of confusion that comes from having something such as web/models in the past. A schema is nothing more than an Elixir module and a struct. An embedded schema as well. So you should structure them in the same way you would structure any other Elixir code.

A context is not made of schemas. A context is made of Elixir code, data structures or even processes.

10 Likes

Interesting, I guess it makes sense that it is valid to define multiple Ecto schemas with the same table name, given the decoupled nature of how queries are actually performed. It would probably not have occurred to me. There is an issue there about keeping common fields in sync when making changes, however.

One thing to think about would be the conversion of a given structure that we hold from one context to the other—say we do take the approach that we have multiple schemas storing data in the same table, we may (say in some external code) receive a Support.Product (say from viewing a support ticket) and wish to e.g. create an order with that product, requiring a Sales.Product. Would we explicitly extract the id of our product and re-fetch it in a different context? Or would it be appropriate for Sales to be aware of the other “side” of our product and expose a from_support_product function which would take the appropriate steps (ideally communicating through the bounded API) to give us the relevant structure?

Another thing which I’m wondering about is the coupling at schema level in terms of associations. We may well have belongs_to or other associations in our schema which reference entities in another context. Is this still valid in keeping with the Bounded Context abstraction? Is it ok as long as we treat any such “foreign” entities as black-box structures that we use the bounded API to modify or transform inside our logic?

3 Likes

The solution to the problem above is dependent on the domain. For example, in case of sales, you may want to copy all of the product information because, if you change the name, the price and other characteristics, you don’t want that reflected on previous sales. Maybe you only keep a reference to the original product for tracking purposes.

It is not recommended to define associations across contexts, as contexts should be decoupled code-wise. However, a context does know other contexts exist. As in the examples above, Sales and Support likely know there is a canonical product representation with an ID, which they use as the main reference. So you could think of a Store.Product with ID=1, and a Sales.Product somewhere will have MAIN_PRODUCT_ID=1 when creating a sale for that product.
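
Both ideas (copying product details at the time of sale, and keeping only a plain ID back to the canonical product) can be sketched in plain Elixir. The module layout and the `snapshot_product/1` function are assumptions for illustration, and `main_product_id` is just a field holding an ID, not an Ecto association.

```elixir
defmodule Store.Product do
  # Canonical product; its name and price may change over time.
  defstruct [:id, :name, :price]
end

defmodule Sales.Product do
  # Snapshot of a product as it was when the sale happened.
  # main_product_id is a plain reference kept for tracking purposes.
  defstruct [:main_product_id, :name, :price]
end

defmodule Sales do
  # Copies the details Sales needs so later changes to the canonical
  # product are not reflected on previous sales.
  def snapshot_product(%Store.Product{} = product) do
    %Sales.Product{
      main_product_id: product.id,
      name: product.name,
      price: product.price
    }
  end
end

# struct!/2 is used at the top level because a script cannot expand a
# %Store.Product{} literal in the same file that defines the struct.
canonical = struct!(Store.Product, id: 1, name: "BeanTron 3000", price: 49_900)
sold = Sales.snapshot_product(canonical)

# Renaming the canonical product later leaves the snapshot untouched.
renamed = %{canonical | name: "BeanTron 3000 Pro"}
IO.inspect({renamed.name, sold.name, sold.main_product_id})
```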

3 Likes

Sure, I think perhaps my wall of text obscured the intent of the question. I believe what you’re referring to is the practice of copying down the details of a product at the time of sale, so that when viewing an order you can see what was bought at that time even if the actual Product being referenced has changed or no longer exists. FWIW I tend to dump this kind of denormalised data into jsonb columns.

There were two parts to my questions and two parts to your answer, so I’ll provide a context and then continue with the two parts of the conversation.

The situation

Assume now that we run a company supplying jellybean vending machines to shopping centres and we have two contexts in our app: Sales and Support.

The Sales context deals with our web store where we list our many offerings, take orders, process payments, track shipping to our customers and so on. We, of course, have a Product entity and a Customer entity within that context. A Product will hold information about the name, price, image, description, while a Customer will hold information about a particular customer’s name, address, email, contact name, shipping address and so on. When we create an Order we copy in the relevant Product information (as well as storing a foreign key too), and store a reference/foreign key to the Customer who made the order.

We also have a support portal section of the site where people not happy with our service, or who encounter any problems can go and vent to our poor customer service people through the creation of Tickets. Our employees have special access to this panel where they can view all the tickets, manage them, issue refunds, respond to queries. Note that these tasks will need to access different contexts (tickets vs refunds—Support vs Sales) but that’s ok, since we’ve decoupled our controllers from the business logic layer! :tada:

Now, in this Support context we clearly need Customers, since they are the ones creating tickets. This Support.Customer is different from the Sales.Customer though—we don’t need to care about their payment information, or shipping address, but in turn we might care about whether they are a VIP support client and we need to get to them right away, or we simply want to model that a Support.Customer has_many Tickets. This is still fine and in keeping with the spirit of the Bounded Context principle.

We also need Support.Products—again we probably don’t care about stock quantities, or pricing, but we may care about special instructions for supporting certain products, internal links to scripts or the specific set of instructions to walk a customer through getting into the back control panel of the BeanTron 3000—again, things which are relevant to us in the Support context.

Cross-context entities at the data layer

This relates to my original question. Chris made the point that you want to have two data structures, one per context, even if they “represent” the same thing, as their respective views of that thing may differ. Then, Michał made the point that Ecto allows us to define multiple schemas on the same DB table, which allows us to consolidate the storage of these different fields at the DB layer, while keeping them separate at the business logic layer. All good so far.

This is, in fact, the case with our Customers. Support.Customer and Sales.Customer are two distinct data structures in our code, but conceptually, both represent the same customer—one could say they are two views of the same data. Note however, that they will have shared fields such as name or email. We want to keep these fields consistent between the two views, and we want to preserve the conceptual identity of the customer—for example, if we delete one of these Customers, the other one gets deleted too.

Now, we can take the following approaches to storing things in the database:

  • Have two separate tables, `support_customers` and `sales_customers`, perhaps with one of them being the "canonical" or main one, storing all of the shared data as well as its context-specific data. Let's say it's the `sales_customers` table (because that was initially the only table, and we only added the support portal after our phone line started being busy 24/7).
    

    The Support context, when asked to fetch a customer, actually asks the Sales context for a customer, and then strips away irrelevant fields and adds support-relevant ones, populated with data from the support_customers table looked up by having a sales_customer_id foreign key in that table.

    We need to ensure that the Ecto schemas for the two structures are the same regarding the shared fields like name and email, and we need to ensure that when one gets deleted, so does the other (yay for DB constraints/cascades).

  • Have three tables, customers, sales_customers and support_customers, add a new context Customers, and have our Sales.Customer and Support.Customer be created by fetching the “canonical” customer through the bounded API and then embellishing it with relevant fields from our own table. We’re going for full normalisation here, and our invariants can be enforced with DB constraints, but it seems like overkill. Again we end up creating extra contexts for anything which is shared.

  • Have one table, customers, and have Sales.Customer and Support.Customer both be an Ecto schema on the same table. This is what I believe Michał was talking about. Again we need to ensure that our shared fields are the same in the schema. But we will likely want to have shared validation logic for our shared fields for our changesets, so where does that go now?

Which one is best, or is there another, better way?

Context-switching entities

Suppose we are viewing a ticket in our support system. We have our Support.Customer Bob, the author of the ticket, loaded and available in our controller and we are displaying his name and other tickets in the sidebar. But now our support staff tell us it would really be helpful if they had a list of Bob’s recent orders in the sidebar too. To get that, we’ll need the Sales.Customer representing Bob. How do we go about it? If we were using the “one table” method above, we could simply write

sales_customer = Sales.get_customer_by_id(support_customer.id)

but now we’ve coupled our controller to our database layer! If we change things around so that the two views of a customer no longer share an id, suddenly we are showing Jane’s orders in the sidebar and the customer support people will hate us even more.

What I propose is better, and what my question was about, is to be able to write

sales_customer = Sales.customer_from_support_customer(support_customer)

where this function will do the right thing to fetch me a different view of the same customer. If we have entities which cross contexts in our domain, they are coupled in the domain, and so I propose it’s impossible to avoid coupling in our business logic model. What we can and should do is to push that into a small amount of code at the points where the two contexts interface, so that if we did change how we represent customers at the data layer, we would only need to change a few access and conversion functions in the two context modules.
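
Roughly what I have in mind, as a plain-Elixir sketch. The in-memory map stands in for whatever storage is used, the fields are invented, and I’ve made the function return `{:ok, customer}` / `{:error, :not_found}` rather than a bare struct, which is my own choice here.

```elixir
defmodule Support.Customer do
  defstruct [:id, :name, :vip]
end

defmodule Sales.Customer do
  defstruct [:id, :name, :shipping_address]
end

defmodule Sales do
  # In-memory stand-in for the customers storage (hypothetical data).
  @customers %{
    7 => %{id: 7, name: "Bob", shipping_address: "1 Jellybean Lane"}
  }

  def fetch_customer(id) do
    case Map.fetch(@customers, id) do
      {:ok, attrs} -> {:ok, struct!(Sales.Customer, attrs)}
      :error -> {:error, :not_found}
    end
  end

  # The only place that knows how a Support.Customer maps onto a
  # Sales.Customer. Today the two views happen to share an id; if that
  # ever changes, only this function has to change, not the controllers.
  def customer_from_support_customer(%Support.Customer{id: id}) do
    fetch_customer(id)
  end
end
```

The controller then only ever writes `Sales.customer_from_support_customer(support_customer)` and stays ignorant of how the two views are linked.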

Cross-context associations

This was the second part of the question. I see your point that associations should not go across contexts, because strong associations between entities usually imply they belong in the same context. What I’m referring to are “loose associations” I suppose. Let me demonstrate.

Let’s suppose that we want to be able to associate Tickets with a specific order, to ease things like billing or shipping enquiries. That is, we want to associate a Sales.Order to a Support.Ticket—a cross-context association! Now, our Support context isn’t really going to do any processing on this order, that’s not its job. All it’s really going to do is just give us the order so we can use the functions in the Sales context to process it appropriately.

The question is in relation to actual Ecto schemas now. Is it ok to do this?

schema "tickets" do
  ...
  belongs_to :order, Sales.Order
  ...
end

Now we have signed up to validating this linked order in changesets as well I suppose, though this could delegate out to the appropriate Sales functions.

One idea which comes to mind is to just do this:

schema "tickets" do
  ....
  field :order_id, :id
  ...
end

that way, we store the relevant information, but we don’t pretend to know anything about its domain.

— Hey, Support, what’s the date of the order associated with this ticket?
— I don’t know what orders are, or if they have dates, or where they’re stored, but hey, here’s this arbitrary id that Sales gave me to hold on to, why don’t you go ask her about it?
— Oh ok, thanks!
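
In code, that conversation could look something like this (a plain-Elixir sketch; `order_token/1`, `fetch_order/1` and the sample data are hypothetical names I made up, with a map standing in for the orders table):

```elixir
defmodule Sales.Order do
  defstruct [:id, :placed_on]
end

defmodule Sales do
  # In-memory stand-in for the orders table (hypothetical data).
  @orders %{42 => %{id: 42, placed_on: ~D[2017-02-11]}}

  def fetch_order(order_id) do
    case Map.fetch(@orders, order_id) do
      {:ok, attrs} -> {:ok, struct!(Sales.Order, attrs)}
      :error -> {:error, :not_found}
    end
  end
end

defmodule Support.Ticket do
  # order_id is an opaque token handed out by Sales. Support stores it
  # but never interprets it.
  defstruct [:id, :subject, :order_id]
end

defmodule Support do
  # "Here's this arbitrary id that Sales gave me to hold on to."
  def order_token(%Support.Ticket{order_id: order_id}), do: order_id
end

# The caller (e.g. a controller) asks Support for the token and then
# asks Sales what it means.
ticket = struct!(Support.Ticket, id: 1, subject: "Jammed dispenser", order_id: 42)
{:ok, order} = ticket |> Support.order_token() |> Sales.fetch_order()
IO.inspect(order.placed_on)
```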

Thoughts?

1 Like

I just watched the video and I think Phoenix is going in the right direction. Before Phoenix 1.3 I was thinking that the framework was taking over the app, even if Phoenix has always been very modular and flexible.

The new structure is very good. I especially like that it doesn’t have a default dumping ground but instead makes developers think upfront before they add a new feature.

6 Likes

This is not a guideline. You may want to have two data structures, one per context. But you may not. It is completely fine for the Support context to ask Sales for the customer information and then perform operations on the customer as long as it does so through the Sales API.

Think about any other Elixir application. Sometimes you are fine with using the Plug.Conn struct, but sometimes you want to augment it by wrapping it in your own struct. Sometimes you want only some fields and put them into a Phoenix.Socket. All of those cases are valid as long as everyone is respecting Plug.Conn’s APIs. For example, Plug.Conn ships with a private field for custom annotations. So maybe it would be best to add support for annotations in Sales.Customer which the Support people will then add information to, without a need to have their own Support.Customer.
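
As a rough sketch of that last idea in plain Elixir: the `private` field and `put_customer_private/3` below are hypothetical and only modelled on `Plug.Conn`'s `private` field and `Plug.Conn.put_private/3`; the `:support_vip` key is likewise just an example.

```elixir
defmodule Sales.Customer do
  # `private` is a free-form map where other contexts may store their
  # own annotations (hypothetical field, modelled on Plug.Conn).
  defstruct [:id, :name, private: %{}]
end

defmodule Sales do
  # Part of the Sales API: the sanctioned way to annotate a customer.
  def put_customer_private(%Sales.Customer{private: private} = customer, key, value)
      when is_atom(key) do
    %{customer | private: Map.put(private, key, value)}
  end
end

defmodule Support do
  # Support annotates the Sales customer instead of defining its own
  # Support.Customer struct.
  def mark_vip(%Sales.Customer{} = customer) do
    Sales.put_customer_private(customer, :support_vip, true)
  end

  def vip?(%Sales.Customer{private: private}) do
    Map.get(private, :support_vip, false)
  end
end
```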

And if you have shared code between contexts, then create a directory that both can depend on. In the same way that, if I have two libraries that have the same dependency, a third library is made for the dependency.

Our goal is to push people towards data structures and modules. How you organize Elixir code is how you organize contexts. It should not become a set of rules such as “one Customer struct per context” because then it is not going to work. The reason why we mentioned Sales.User and Support.User is because we have seen applications with 100 columns in the User “model”. However, if that goes to the extreme opposite and becomes 20 user schemas in different contexts, I have a hunch it won’t work as well.

Other than that, it is really hard to provide concrete advice and have a fruitful discussion on a fictitious domain. The best we can do is to say what not to do in certain cases. I believe those discussions would be better in front of an existing project with a known domain (Hex Web comes to mind).

Yes, that’s what we have in mind.

1 Like

This is exactly what I proposed. I really don’t worry about things like common fields. Ecto schema syntax is very declarative in its nature, so the data is not buried somewhere deep in the functions - it’s in a very prominent place. Nonetheless, let’s consider when schema data would become invalid and what repercussions that has. There are 3 possible scenarios:

  • you add a field - if you haven’t noticed, it means you don’t need it in the other context - no problem
  • you remove a field - any test that touches this schema will bork, so it’s very easy to detect
  • you modify the type of a field - this is probably the trickiest, but I would argue the rarest. Additionally many type changes on database level maintain the type at ecto level (e.g. text → citext).

Divergent validations could be a bigger concern, but are they really? Can support edit customer data? If so, do they have exactly the same rules that the customer has, or can they “walk around” some rules for exceptional situations?
Fortunately validations are just functions - you can extract common patterns to separate modules (outside of any context) and use them. I don’t think other solutions besides modules and functions are needed here.
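
Here is a sketch of that, as a standalone script (assuming Elixir 1.12+ for `Mix.install` and Ecto 3.x). The module name `CustomerValidations`, the fields, and the rules themselves are invented for the example; the shared module lives outside both contexts and each context adds its own rules on top.

```elixir
Mix.install([{:ecto, "~> 3.0"}])

defmodule CustomerValidations do
  # Shared rules for the fields both views have in common.
  import Ecto.Changeset

  def validate_contact(changeset) do
    changeset
    |> validate_required([:name, :email])
    |> validate_format(:email, ~r/@/)
  end
end

defmodule Sales.Customer do
  use Ecto.Schema
  import Ecto.Changeset

  schema "customers" do
    field :name, :string
    field :email, :string
    field :shipping_address, :string
  end

  def changeset(customer, attrs) do
    customer
    |> cast(attrs, [:name, :email, :shipping_address])
    |> CustomerValidations.validate_contact()
    # Sales-only rule.
    |> validate_required([:shipping_address])
  end
end

defmodule Support.Customer do
  use Ecto.Schema
  import Ecto.Changeset

  schema "customers" do
    field :name, :string
    field :email, :string
    field :vip, :boolean, default: false
  end

  def changeset(customer, attrs) do
    customer
    |> cast(attrs, [:name, :email, :vip])
    |> CustomerValidations.validate_contact()
  end
end
```

If support ever needs looser rules than sales, it simply stops calling the shared function, or calls a different one.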

Surprisingly my answer will be exactly the same here - I would define a Support.Order referring to the same table as Sales.Order and declaring the fields that are important for support.

4 Likes

Btw @mjadczak, I am not implying you wanted to do any of the things above. I am only using your examples to elaborate more general points since I assume those topics will be discussed frequently in the upcoming months. :heart:

1 Like

Haha definitely, I know you were illustrating the larger point :slight_smile:

I think I perhaps read too much into that slide, and didn’t pay enough attention to Chris’ assertion that “we’re not applying this pattern, it’s just a good analogy to how we want to structure code”. If the guideline is “structure the code as you would any other Elixir code”, even when that violates a strict interpretation of the Bounded Context pattern (for example refactoring common changeset code out into a separate module, which doesn’t need to be a context, just a utility module), then I’m happy with that.

I see your point, however in the case I was proposing, the only thing you would want to do with a Support.Order would be to convert it to a Sales.Order—we want to grab the relevant order and then just query it in the Sales context. It seems like in that case the valid way would be to store an order id but treat it as an arbitrary token from the Sales system.

José, I also get your point about a fictitious domain being prone to contrivances which don’t contribute to giving specific guidelines. I’m sure after Phoenix 1.3 is released (apparently Soon™ ;)) we will see many questions on how to apply this structure to real-world domains.

1 Like

I’m actually in the middle of updating HexWeb’s project structure to the “v1.3” one. It’s very very early [1] [2], in particular I’m just moving files around to see what feels right (before changing module names), but expect updates soon.

[1] https://github.com/hexpm/hex_web/pull/483
[2] https://github.com/hexpm/hex_web/tree/wm-phoenix-1-3-structure/lib/hex_web

5 Likes

What if Support wants to change the Support.Order? Would you convert it back to a Sales.Order and use the Sales context to modify it, or would you write a Support.update_order function? This would lead to duplicate code, wouldn’t it?

Anyway, @chrismccord @josevalim, I feel like there should be a guide to show the ways this could be done.

1 Like

I think the idea is to have more domain-specific functions like Support.add_ticket_to_order and Sales.set_order_payment_status, instead of a generic update_order. That way you separate the different responsibilities across the contexts.
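
A plain-Elixir sketch of how I picture it. The structs, the list of statuses, and the argument order are my own guesses; only the two function names come from above.

```elixir
defmodule Sales.Order do
  defstruct [:id, payment_status: :pending]
end

defmodule Support.Ticket do
  defstruct [:id, :subject, :order_id]
end

defmodule Sales do
  @payment_statuses [:pending, :paid, :refunded]

  # Sales owns everything about payment; no generic update_order needed.
  def set_order_payment_status(%Sales.Order{} = order, status)
      when status in @payment_statuses do
    {:ok, %{order | payment_status: status}}
  end

  def set_order_payment_status(%Sales.Order{}, _status) do
    {:error, :invalid_status}
  end
end

defmodule Support do
  # Support only records which order a ticket is about; it does not
  # modify the order itself.
  def add_ticket_to_order(%Support.Ticket{} = ticket, order_id)
      when is_integer(order_id) do
    {:ok, %{ticket | order_id: order_id}}
  end
end
```

Each function changes only the data its own context is responsible for, so there is nothing to duplicate.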

2 Likes