What goes into your Ecto schema modules?

By default, Phoenix generators put two things in an Ecto schema module: the schema definition (obviously) and a changeset function. I wonder what else you put in your schema modules and what unwritten rules you have for that.

I sometimes add functions that hide details of the underlying data storage, especially around optional associations or odd storage choices (I have a DB where some boolean fields are represented C-style as 0 or 1 integers, for example).

def premium_package_name(user) do
  case user.premium_package do
    nil -> nil
    package -> package.name
  end
end

# or

def finished?(session) do
  case session.is_finished do
    0 -> false
    1 -> true
  end
end

I try not to put much more in schemas and keep them relatively minimal.

However, I know some people put functions returning Ecto queries there. I’m not a fan of that. It feels weird that a schema needs to know how it will be queried. On the other hand, I cannot totally reject this approach. If we treat the schema module as the place that hides database complexity from the outside world, putting a query there (for which you need to know the database intricacies) makes some sense.

# requires `import Ecto.Query` in the module; runtime values must be pinned with ^
def latest_by_user(user_id, limit \\ 10) do
  from(o in Order, where: o.user_id == ^user_id, order_by: [desc: o.created_at], limit: ^limit)
end

Maybe someone thinks changeset does not belong in schema? That would be an interesting discussion… So what would I find in your schema modules and why? :wink:

That’s how I treat them, yes. If not for them I might as well write raw SQL so why not put all the SQL-like stuff in there? And if not, where else?

One example: randomly generated data you need for security, e.g. put_change(:token, :crypto.strong_rand_bytes(32) |> Base.encode16()) or similar.
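
A minimal sketch of that idea, assuming Ecto is available (the standalone script fetches it with Mix.install). The Invite schema, its fields, and the put_token/1 helper are hypothetical names used only for illustration:

```elixir
# Assumption: run as a standalone script; inside a Mix project, drop this line.
Mix.install([{:ecto, "~> 3.10"}])

defmodule Invite do
  use Ecto.Schema
  import Ecto.Changeset

  # Hypothetical table and fields.
  schema "invites" do
    field :email, :string
    field :token, :string
  end

  def create_changeset(invite, attrs) do
    invite
    |> cast(attrs, [:email])
    |> validate_required([:email])
    |> put_token()
  end

  # The token is never cast from user input; it is generated and set
  # directly on the changeset. 32 random bytes become 64 hex characters.
  defp put_token(changeset) do
    token = :crypto.strong_rand_bytes(32) |> Base.encode16(case: :lower)
    put_change(changeset, :token, token)
  end
end
```

Note that generating random data makes the changeset function non-deterministic, which is worth keeping in mind if you want schema functions to stay pure.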

There are definitely people who don’t think changeset functions belong in schemas!

I put changesets and queries in there, I don’t care :laughing: So long as they stay pure it doesn’t matter so much to me, it’s a nice obvious place for them. Though I also have a generic query module that has more generalized dynamic and “duck type” queries (like for_user or what have you). Also, anything that would be used by the “outside world”, such as the predicates in your example, is defdelegated to from the context, so if I ever feel the need to slim down the schemas, refactoring will be trivial (not to mention that it keeps good boundaries).

All that to say so long as you keep them pure, you won’t run into any trouble with fat schemas (like horrendous compile time dependencies).
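
A rough sketch of the defdelegate arrangement described above. The module names MyApp.Sessions and MyApp.Sessions.Session are hypothetical, and a plain struct stands in for the Ecto schema so the example has no dependencies:

```elixir
defmodule MyApp.Sessions.Session do
  # Stand-in for an Ecto schema; the field mirrors the 0/1 integer
  # column from the original post.
  defstruct is_finished: 0

  # Pure predicate: no Repo calls, no side effects.
  def finished?(%__MODULE__{is_finished: 1}), do: true
  def finished?(%__MODULE__{is_finished: 0}), do: false
end

defmodule MyApp.Sessions do
  alias MyApp.Sessions.Session

  # The "outside world" calls MyApp.Sessions.finished?/1 and never
  # reaches into the schema module directly.
  defdelegate finished?(session), to: Session
end
```

If the predicate later moves out of the schema module, only the `to:` target changes; callers of the context are unaffected.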

I put them in separate modules. Depending on the size of the context and if it has “subcontexts”, either one or more. They are usually called Queries or Querying, although perhaps I should just call them repositories.

Defining the data and querying the data are separate concerns IMO. I don’t understand the remark about writing raw SQL.

I dislike having too many separate files. Especially in this case I don’t feel they provide any value. Having a schema plus the functions that manipulate its changesets in one module is, to me, entirely valid.

Queries are still static definitions of how we want to shape data, so they’re pretty closely related. For that reason I think you should stick with Queries, because Repository makes me think of the actual repository with side effects.

I worked on a codebase where they insisted on putting queries in their own modules. There were so many of them that had just one or two small queries in them, which was insanely annoying and worked against DX. Separation of concerns is an important principle but, like any programming principle, it doesn’t need to be followed dogmatically. Having queries in schemas is perfectly readable, workable, and discoverable, doesn’t cause any harm in compile times or any such things, and in my experience gives a far better DX, so I always vouch for them to go there. I do keep a visual separation between queries and changesets, so they are still separated, just in the same module :slight_smile: In fact, I do a thing that most would probably find cavalier: I put my import Ecto.Query above the first query as a kind of “header,” though that’s not something I would argue hard for in a shared codebase.

I usually find the opposite. The number of queries grows almost without bound over time, so I prefer to keep them in a separate module, where they don’t obscure the other purposes of the unit of code. Even worse, the queries usually come more or less directly from UI requirements, and to me it feels extremely weird to have the UI dictate what the module closest to persistence looks like.

This can be partially addressed by keeping the query parts in schemas very generic and atomic. After all, composability is one of the main strengths of Ecto. But in that case you need some place where you stitch together the atomic query functions into something larger. I guess that would be a context. But having too many UI-driven query functions in a context is a whole other discussion I’d like to have one day.
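
A sketch of what that split could look like, assuming Ecto is available (fetched here with Mix.install for a standalone script). The Order fields, the Orders context, and all function names are hypothetical:

```elixir
# Assumption: run as a standalone script; inside a Mix project, drop this line.
Mix.install([{:ecto, "~> 3.10"}])

defmodule Order do
  use Ecto.Schema
  import Ecto.Query

  schema "orders" do
    field :user_id, :integer
    field :status, :string
    timestamps()
  end

  # Small, generic query pieces. Each takes a queryable and returns a
  # query, so they can be piped together in any order.
  def for_user(query, user_id), do: from(o in query, where: o.user_id == ^user_id)
  def with_status(query, status), do: from(o in query, where: o.status == ^status)
  def newest_first(query), do: from(o in query, order_by: [desc: o.inserted_at])
end

defmodule Orders do
  import Ecto.Query

  # The context stitches the atomic pieces into the query that one
  # particular screen needs. Nothing touches the database here; the
  # result would be passed to a Repo elsewhere.
  def latest_paid_query(user_id, count \\ 10) do
    Order
    |> Order.for_user(user_id)
    |> Order.with_status("paid")
    |> Order.newest_first()
    |> limit(^count)
  end
end
```

The schema module stays ignorant of any specific screen, while the UI-shaped function lives in the context.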

But yeah, I get your point.

LOL ya sorry, I got a little emotional there and was channeling my frustration at past situations :sweat_smile: My point wasn’t that I think separate query modules are bad, just that collocating them in schemas is fine if the situation allows for it.

Yes, the UI-driven query functions are a different discussion that I am interested in, but I’ll keep this topic focused :slight_smile:

I put my schema modules in a top-level namespace called Schema (for example, Schema.Profile). They contain just the schema definition and a @type.

And then, I have a core module (for example, Core.People.Profile*) that contains changeset functions as well as a Query submodule that has composable query functions, which are assembled in context functions.

I use Boundary to separate my Core modules from my Web modules, but because both of them need to access schemas, I let both of them access the Schema modules. I’m okay with that because the Schema modules don’t contain any code.

It might sound like a pain to have the schema and changeset functions in different files, but I wrote an editor plugin to jump between related files easily so it hasn’t been an issue for me.

*The Phoenix generators create top-level modules called MyApp and MyAppWeb; I rename them to Core and Web which has the advantages of being shorter and being easier to copy between projects.

I have a separate top-level module MyAppStorage. It exposes functions for querying and manipulating data which don’t return Ecto schemas. The schemas are only used internally. This way I keep full flexibility in the way I persist data. I can optimise as much as I want. At the same time my business entities don’t have to mirror my database tables’ structure.
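
A dependency-free sketch of that boundary. Only the MyAppStorage name comes from the post; MyApp.Order, OrderRecord, and get_order/1 are hypothetical, and a hard-coded record stands in for an Ecto schema loaded through a Repo:

```elixir
defmodule MyApp.Order do
  # Business entity: a plain struct with no Ecto metadata.
  defstruct [:id, :total_cents, paid?: false]
end

defmodule MyAppStorage do
  # Internal row shape. In a real app this would be an Ecto schema
  # that is never exposed outside this module.
  defmodule OrderRecord do
    @moduledoc false
    defstruct [:id, :total_cents, :is_paid]
  end

  alias MyApp.Order

  # Public API: returns business entities, never records.
  def get_order(id) do
    case fetch_record(id) do
      nil -> {:error, :not_found}
      record -> {:ok, to_entity(record)}
    end
  end

  # Stand-in for a Repo lookup.
  defp fetch_record(1), do: %OrderRecord{id: 1, total_cents: 1999, is_paid: 1}
  defp fetch_record(_id), do: nil

  # The only place that knows how the table maps to the entity,
  # including the 0/1 integer used for the boolean.
  defp to_entity(%OrderRecord{} = record) do
    %Order{id: record.id, total_cents: record.total_cents, paid?: record.is_paid == 1}
  end
end
```

Because callers only ever see MyApp.Order, the table layout can change without touching them.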

I do like this approach, but I have mostly found it worthwhile only after a certain scale is reached. Most projects are just fine being tightly integrated with the “raw” Ecto schemas.

Yes, but usually by the time you reach that scale, you never manage to decouple from the database. Schemas keep bubbling up to views and so on.

I put the module doc which describes what the fields of my schema are and then I call a macro which abstracts Ecto’s schema macro.

The macro is defined in my EctoAbstractionMacros module and takes in two arguments: the first is the name of the source and the second is an anonymous function. The anonymous function is defined through a module attribute and returns a list of the field names and types. Right now it can’t do anything fancy like parameterized types but I’ll get there.

Then inside of my schema abstraction macro it calls a changeset abstraction macro to get my changeset functions. These are defined through a DSL I created just to abstract the definition of my changeset functions.

Right now I can’t get past all the errors the compiler is throwing at me but I know one day this will all be worth it.

You can still open a repo and link it here. Your idea sounds interesting.