How to keep Ecto out of web layer

axelson · February 24, 2018, 2:38am

Continuing down the path of an umbrella application with multiple backing applications/services I am working on creating a clean Phoenix application without Ecto to be the web interface to the rest of the applications.

On one hand, it would be nice to be able to rely on Ecto so that we can better support displaying changesets via JSON (such as using Ecto.Changeset.traverse_errors/2). But on the other, I’d really like to not need to include Ecto as a dependency in the web layer at all, partly to remove the temptation to use it more, which would blur the lines and increase coupling.

Should we not even return changesets from the other applications? Any other structuring ideas? Just reimplement our own version of Ecto.Changeset.traverse_errors/2?
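For the "reimplement our own version" option, a minimal sketch in plain Elixir, assuming the backing application hands errors to the web layer as plain data in the `[{field, {message, opts}}]` shape that changesets use internally. The `WebErrors` module and its function names are hypothetical; nothing here depends on Ecto.

```elixir
defmodule WebErrors do
  # Hypothetical helper for a web app that does not depend on Ecto.
  # `errors` is assumed to be a list of {field, {message, opts}} tuples.

  # Groups errors by field, running `fun` on each {message, opts} pair.
  # Messages keep their original order within a field.
  def traverse(errors, fun) when is_list(errors) and is_function(fun, 1) do
    errors
    |> Enum.reverse()
    |> Enum.reduce(%{}, fn {field, error}, acc ->
      msg = fun.(error)
      Map.update(acc, field, [msg], &[msg | &1])
    end)
  end

  # Replaces "%{key}" placeholders in the message with values from opts.
  # Placeholders without a matching key are left untouched.
  def interpolate({message, opts}) do
    Regex.replace(~r/%\{(\w+)\}/, message, fn whole, key ->
      case Enum.find(opts, fn {k, _v} -> to_string(k) == key end) do
        {_k, value} -> to_string(value)
        nil -> whole
      end
    end)
  end
end

errors = [
  name: {"should be at least %{count} character(s)", [count: 3, validation: :length]},
  email: {"can't be blank", [validation: :required]}
]

IO.inspect(WebErrors.traverse(errors, &WebErrors.interpolate/1))
```

The result is a map of field names to lists of human-readable strings, which can be encoded to JSON directly.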

cmkarlsson · February 24, 2018, 5:10am

This is my feeling with ecto and phoenix as well. They are theoretically not coupled but in practice it is too easy to fall back into it.

I think the best approach is to not return changesets from your application. Rather return pure semantically correct data-structures. They can be treated like that today but the changeset stuff is still in there.

It has been discussed before. The web layer should do validation on its form, and this is separate from the database layer. I see this coupling as “accidental coupling”, i.e. it looks the same, so let’s bundle database and web validation together. In practice I feel they are separate in everything but the simplest case, where your form and your data model match.

In some cases though it makes sense to have it together but if they are pushing “phoenix is not your application” then it should be completely separated.

axelson · February 24, 2018, 5:38am

Yeah, that makes a lot of sense. Do you know of any good in depth examples of this sort of architecture?

cmkarlsson · February 24, 2018, 6:42am

No, not really. I tend to think a lot about APIs and how I would like to use my application (and I try this out in the repl/shell), and I try to develop it without the web layer altogether. I’ve used this design quite a bit in various languages and I feel it really helps with maintenance going forward.

I found that this approach was better than alternatives when I was developing a python web app a while back. Because I couldn’t quite decide which web framework was best, I developed the actual app without any web coupling. Then I decided on a template framework (I ended up using mako templates in python). Because of this loose coupling I could quickly change the web layer between various frameworks. It also gave me an understanding that the actual web layer is a very small part of your application. This app, which I developed for a customer over 10 years ago, is still going strong, and because the core app is not tied too hard to any other component, upgrading software along the way has been straightforward.

idi527 · February 24, 2018, 7:21am

I include only phoenix_ecto and an umbrella application that depends on ecto. I also define humanize_errors in the umbrella app which uses Ecto.Changeset.traverse_errors/2, and use it from the web application.

I don’t see any problems with passing changesets into the web layer, they are data, not functionality.

cmkarlsson · February 24, 2018, 7:59am

My problem with this is that it couples your API to ecto (or changeset behaviour). If I return something from an API the only way I want someone to interact with that data is through my API. The type should be opaque. I see changesets as an implementation detail of data validation which should never be used directly if returned from an API, otherwise you have a hard time replacing this dependency.

Qqwy · February 24, 2018, 8:13am

What makes this even harder, of course, is the fact that some things can only be validated once you touch your data store (like ‘is this a unique e-mail address?’).

I fully agree with you, @cmkarlsson, that it is very easy and feels (too) natural to combine form validation logic with database updating logic.

But then again, the only alternative to Changesets I know of is the map_diff package (disclaimer: I’m its original author), which checks structs/maps for changes recursively but does not do any validation on its own at all.

josevalim · February 24, 2018, 9:40am

I am not sure I agree with validation being on the web front. Checking length, uniqueness, inclusion of a value in a set are all part of your domain rules. However, I do think data casting should be done in the web layer and it is only done later for convenience.

And I do agree that relying on Ecto can lead to tighter coupling. On the other hand, we need a way to communicate rules between dependencies. Giving the before and after is not enough in case of errors, you need the error reasons, you need the attempted rules, and you need the attempted changes. You could use pure data structures but you would most likely need something more structured. Ecto.Changeset is one of those options. Even if you implement YourOwnChangesetLikeThing, you would need it to be available on both dependencies and you would get coupling there too.

I think the reason why Ecto is seen more like accidental coupling is because you can’t use just part of it, so the cause and consequence always goes like “my database has to use it and therefore I have to use it”. That’s why we are considering at least removing the ecto_sql and ecto_migrations part out of Ecto, so that it becomes more of a toolkit for data mapping and data change than a database library.

What do you think?

To be fair, the “phoenix is not your application” slogan is more about the fact that phoenix doesn’t run your applications, they are simply Elixir applications. i.e. “application” here means the runtime component. But that’s why I don’t like such slogans, it can be interpreted in multiple ways and the original intent is lost along the way.

jeremyjh · February 24, 2018, 12:57pm

It really doesn’t. You could have an opaque wrapper type over %Ecto.Changeset{} if you want to, and then you could change the implementation if you had to, but it seems unlikely most people need this.
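A rough sketch of such a wrapper, assuming a hypothetical `Accounts.Result` module owned by the domain application. To stay runnable without Ecto, `wrap/1` accepts any map with `:valid?` and `:errors` keys (an `%Ecto.Changeset{}` has both fields); in a real app the wrapped value would be the changeset itself. Note that `@opaque` is only enforced by Dialyzer, not by the compiler.

```elixir
defmodule Accounts.Result do
  # Hypothetical opaque wrapper: callers only use the functions below,
  # so the wrapped validation structure can be swapped out later.
  @enforce_keys [:inner]
  defstruct [:inner]

  @opaque t :: %__MODULE__{inner: map()}

  # Called only inside the domain application.
  def wrap(%{valid?: _, errors: _} = inner), do: %__MODULE__{inner: inner}

  def valid?(%__MODULE__{inner: inner}), do: inner.valid?

  # Errors as plain data: %{field => [message]}, with opts dropped.
  def errors(%__MODULE__{inner: inner}) do
    Enum.group_by(
      inner.errors,
      fn {field, _error} -> field end,
      fn {_field, {message, _opts}} -> message end
    )
  end
end

# Stand-in for a changeset returned by the domain layer.
fake_changeset = %{valid?: false, errors: [email: {"has already been taken", []}]}
result = Accounts.Result.wrap(fake_changeset)
IO.inspect({Accounts.Result.valid?(result), Accounts.Result.errors(result)})
```

The web layer would depend on `Accounts.Result` only, so it never needs Ecto in its own dependency list.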

Nothing about using Ecto.Changeset actually forces you to use an Ecto Adapter or even a database at all. If you wrote the features yourself - data validation and error tracking - you may find you had a pretty useful utility module with applicability to other apps/APIs you are developing. You’d find yourself using it in multiple projects. You’d eventually publish it on hex. It would become wildly popular.

Then someone would start this thread about how they didn’t want to be coupled to it, and they’d encourage people to write all this stuff themselves.

axelson · February 24, 2018, 5:14pm

I definitely agree. If the validations and changesets were usable without pulling in the SQL bits then it would go a long way. One of the benefits is it would let you analyze and ensure properties of your application just by looking at the application and dependency level, without even needing to look at the code (i.e. if the application doesn’t depend on ecto_sql then it cannot do any database access).

josevalim · February 24, 2018, 8:13pm

Well, this particular aspect also applies today because if you don’t depend on postgrex/mariaex, you don’t have database access.

brightball · February 24, 2018, 9:00pm

Personally, I think taking a lesson from Django would be an ideal solution here. Django has a structure for creating web forms where you basically define a “form object” which is very similar to a changeset.

A form object basically just takes your datatypes in the order you want to present them and then lets you spit out structured HTML via .as_p, .as_ul or .as_table, generating the whole form in a single command. You can also render on a field by field basis if needed.

What you get with that form object combines validation, field white listing and form generation in much the same way as having a changeset per form. And all of that happens before you even touch what the form will be submitted to “if valid”, whether it will go to an API or database. Probably the closest equivalent to this in Phoenix is the formex library. The most Phoenix-focused tweak to that would be a clean way to validate a single field as the user was completing the form via a web-socket.

I like the idea of defining things that way since 9/10 times when I’m building a site I want the form rendering to be uniform across the app, but the whitelisting of variables will differ on a per-form basis. The formex approach abstracts the repetitive HTML and lets you benefit from focusing on the security side of whitelisting.

It’s a challenge though, because there are a number of benefits that also come from hooking “form” validation to the database especially since the database is going to have strict type rules already defined. Maybe if there were a clean way to merge form validation with database level validation (type transformations/validations, length validations, constraints)?

Defining that interface would provide a good way to create the same structure for apis (both for connected elixir apps or outside apis). Changesets and formex are pretty close right now, IMO.

Just to play devil’s advocate to myself though, I always get concerned when anything dovetails from getting things done productively to hand-wringing over architectural purity. This feels like something that should come up more as a minor re-factor opportunity…because even though we get a lot of extra options from Elixir vs non-BEAM languages…a substantial portion of web applications are still just an interface in front of a database, no matter how much we work to convince ourselves otherwise. In that way, having a seamless form->validation->database flow is very beneficial.

EDIT: Looking at formex again it looks like they’ve got almost everything here. There’s even a formex_ecto library that will spit out a changeset for you.

LostKobrakai · February 25, 2018, 8:34am

Form rendering in Phoenix is polymorphic via the Phoenix.HTML.FormData protocol. From the resulting Phoenix.HTML.Form struct you can automate the generation of HTML as much as you like (I’ve some simple helpers for that) and I’m personally really happy to have that intermediate step, because if your form does ever need a more custom field setup you can just ditch the helper function and implement the field with the custom needs. E.g. a money input is often two fields (currency and amount), which should be displayed as one field. This has always been a pain in the form generation tools I’ve used before. There’s also a blogpost of @josevalim on writing those helper functions for form generation: Dynamic forms with Phoenix « Plataformatec Blog

All this is already fully independent of the data struct that actually implements Phoenix.HTML.FormData and of how fields/validation work for it. I often prototype the markup with the (not as feature-heavy) implementation for Plug.Conn before switching over to an Ecto.Changeset, and you could also always implement your own.

drl123 · February 25, 2018, 1:22pm

@josevalim so if I understand your comment about removing the ecto_sql and ecto_migrations correctly, this would be akin to what Rails did by offering ActiveModel to provide the validation functionality on any class that may or may not be database backed?

Effectively, you get all of the validation functionality on a ‘struct’. It doesn’t matter if you would be persisting that struct or just taking further action on the data, so it could be used as part of a ‘form model’, allowing the web application to validate form inputs and, only if that data is ‘valid’, pass it off to the rest of the application to operate on. If another application in the umbrella happens to persist it, that application would likely have its own changeset that at least partially lines up with what the web application is passing by means of inter-app contract/design.

LostKobrakai · February 25, 2018, 1:53pm

This is already possible. Ecto.Changeset is only dependent on a db where it’s necessary (constraints and such) but everything else is also available for embedded_schemas as well as for schemaless usage (which is basically any map/struct). You can also already use ecto today without starting any database connection.
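A small sketch of the schemaless usage described above, run as a script. It assumes network access so `Mix.install/1` can fetch Ecto; the `SignupForm` module, its fields and its rules are made up for illustration. No Repo, adapter or database connection is involved.

```elixir
# Fetches Ecto for this script only; no Repo or database is started.
Mix.install([{:ecto, "~> 3.0"}])

defmodule SignupForm do
  import Ecto.Changeset

  # Field types for a plain map; there is no schema module behind this.
  @types %{email: :string, age: :integer}

  def validate(params) do
    {%{}, @types}
    |> cast(params, Map.keys(@types))
    |> validate_required([:email])
    |> validate_format(:email, ~r/@/)
    |> validate_number(:age, greater_than_or_equal_to: 18)
    # Returns {:ok, map} when valid, {:error, changeset} otherwise.
    |> apply_action(:insert)
  end
end

IO.inspect(SignupForm.validate(%{"email" => "a@example.com", "age" => "21"}))
IO.inspect(SignupForm.validate(%{"age" => "12"}))
```

On success the caller gets a plain map with cast values, which can be handed to another application without that application ever seeing the changeset.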

michalmuskala · February 25, 2018, 2:10pm

Ecto.Changeset knows nothing about database. It’s a pure data structure. It is used to perform database operations when passed to repo, but changeset itself is not concerned with database.

As far as I can see, changesets are very much like form objects from Django. I’m not sure what practical differences there are between the two (excluding HTML generation), besides that changesets live in a database library.

josevalim · February 25, 2018, 7:05pm

When Rails offered ActiveModel, they were offering new features and a separate package. We have already had the feature for some years; it is just a matter of splitting the packages or not.

cmkarlsson · February 26, 2018, 1:31am

@josevalim, I think your response is very articulate and balanced, thanks for taking the time to write it up. Having read it, I think I agree with your view of things.

I really like the idea of separating ecto_sql and ecto_migrations.

I believe that there are multiple types of validation in a system that are separate and only joined because they look the same.

Database validation
Business rules validation
UI validation

They may be the same, but the larger, more complex and older the application gets, the more the rules start to diverge.

You mention the late casting. I think that is because we are mixing UI and business rules validation. UI validation is done as soon as possible. Once we have formatted our UI input into an acceptable format we hand it off to the business layer. The business layer then validates according to the business rules; if still successful it may or may not hand off to yet another layer (storage), which will also do its own validation.
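The hand-off between the three layers could be sketched like this in plain Elixir. The `Layers` module, the fields and the rules are all hypothetical, and `taken_emails` is a stand-in for a unique index in a real store. Each layer tags its errors so the caller can tell where validation stopped.

```elixir
defmodule Layers do
  # UI layer: cast raw form input into the types the domain expects.
  def cast_ui(%{"email" => email, "age" => age}) when is_binary(email) and is_binary(age) do
    case Integer.parse(age) do
      {n, ""} -> {:ok, %{email: String.trim(email), age: n}}
      _ -> {:error, {:ui, %{age: ["must be a whole number"]}}}
    end
  end

  def cast_ui(_params), do: {:error, {:ui, %{base: ["email and age are required"]}}}

  # Business layer: a rule that holds no matter which UI or store is used.
  def check_rules(%{age: age} = input) when age >= 18, do: {:ok, input}
  def check_rules(_input), do: {:error, {:business, %{age: ["must be at least 18"]}}}

  # Storage layer: `taken_emails` stands in for a unique index in a database.
  def store(%{email: email} = input, taken_emails) do
    if email in taken_emails do
      {:error, {:storage, %{email: ["has already been taken"]}}}
    else
      {:ok, input}
    end
  end

  # Runs the layers in order and stops at the first one that fails.
  def register(params, taken_emails) do
    with {:ok, input} <- cast_ui(params),
         {:ok, input} <- check_rules(input) do
      store(input, taken_emails)
    end
  end
end

IO.inspect(Layers.register(%{"email" => "a@example.com", "age" => "30"}, []))
IO.inspect(Layers.register(%{"email" => "a@example.com", "age" => "12"}, []))
```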

Yes, I interpreted it in my way I guess: that the Domain Logic is its own little black box with a clearly defined API, with as little coupling to any UI code as possible.

Yes, I didn’t quite express myself the way I wanted. I don’t want people to rewrite Changesets. They are very useful.

Perhaps most people do not need to make Changesets opaque by wrapping but from my experience it is important to at least be aware that by exposing an implementation detail you run the risk of it being depended on, especially if your application is so big that you need to start having a separate UI layer.

I agree. I should add that I am mostly talking about larger applications here where the business layer is the absolute largest part of the application and is not just a “glue” layer between the database and UI.

In addition nothing is black and white. There are trade-offs with different designs. I know I recommend a less convenient approach in favour of loose coupling. I believe that helps maintaining and swapping out layers in the future.

LostKobrakai · February 26, 2018, 7:43am

I’m using changesets for the first/last case separately in some places. The controller does use schemaless changesets for form UI validation, the resulting map is passed into a context for business rule validation/database validation.

The only thing I’m missing in that approach is a slick way to consolidate errors back to the UI in case field names do not align.
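One possible shape for that consolidation step, as a sketch only: an explicit mapping from domain field names to form field names, applied to the error list before it is shown. `ErrorMapping` and the field names are hypothetical, and errors are assumed to be in the `[{field, {message, opts}}]` shape found in a changeset's `errors` field. Fields without a mapping keep their name.

```elixir
defmodule ErrorMapping do
  # Renames the field of each error according to `mapping`
  # (%{domain_field => form_field}); unmapped fields are left as they are.
  def to_form(errors, mapping) when is_list(errors) and is_map(mapping) do
    Enum.map(errors, fn {field, error} ->
      {Map.get(mapping, field, field), error}
    end)
  end
end

domain_errors = [
  email_address: {"has already been taken", []},
  name: {"can't be blank", [validation: :required]}
]

IO.inspect(ErrorMapping.to_form(domain_errors, %{email_address: :email}))
```

With Ecto available in the controller, each remapped entry could then be attached to the form changeset using `Ecto.Changeset.add_error/4`.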

In the end a changeset by itself is just a data structure, and a quite well thought-through one. I personally don’t feel it’s an implementation detail if it’s a conscious decision to use it for communication of changes/errors in changes. A separate UI layer is still free to use ecto to work with the changeset or use any other library which can work with the data transferred in the changeset.