TokenOperator - Dependency-free helper most commonly used for making clean keyword APIs to Phoenix context functions

Documentation and Clarity is very important. ^.^

Just curious about the naming. What is meant by token here?

By token, I just mean a data structure that can be passed around and operated upon. Two common examples of tokens are an Ecto query and Plug.Conn. This post explains it much better than I can… https://rrrene.org/2018/03/26/flow-elixir-using-plug-like-token/

TokenOperator is actually pretty abstracted in that it doesn’t care what type of token you are passing around or what naming you use for the options. I chose an obvious use-case to explain how it can be used. I don’t think I did a particularly good job explaining what it is though.

I started out calling it MaybeQuery, dependent upon Ecto, and with hard-coded opinionated option names like filter, order_by, and preload. That would have been easier to explain, but less flexible. Instead, this allows you to configure your own API conventions and naming and use it beyond that example use case if one presents itself.

I should note that even with the controller/context use case that token structure might be an Ecto.Multi as opposed to an Ecto query.

Version 0.2.0 has been released. This release adds the ability to use functions with an arity of 1. Previously, every referenced function had to have an arity of 2, receiving the token as the first argument and the opts as the second. For many usage scenarios the opts are unused, so this requirement has been removed.

Previously, you might have configured a function to filter published articles with the following:

def list_posts(opts \\ []) do
  Post
  |> TokenOperator.maybe(opts, :filter, published: &published/2)
  |> Repo.all()
end

defp published(query, _) do
  from(p in query, where: p.is_published)
end

That can be simplified to:

def list_posts(opts \\ []) do
  Post
  |> TokenOperator.maybe(opts, :filter, published: &published/1)
  |> Repo.all()
end

defp published(query) do
  from(p in query, where: p.is_published)
end

It’s a small change but these functions (e.g. published) probably already existed in your context without a second argument. They should be able to stay that way rather than having arguments dictated by this package unless needed for the use case.
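
As a rough sketch of the idea only (this is not TokenOperator's actual implementation), supporting both arities could come down to a guard on the function's arity. AritySketch and apply_fun/3 are hypothetical names, and a plain list of maps stands in for an Ecto query so the example runs without a database:

```elixir
defmodule AritySketch do
  # Hypothetical stand-in, not TokenOperator's real code: call `fun` with
  # just the token when it has arity 1, or with the token and the opts
  # when it has arity 2.
  def apply_fun(token, fun, _opts) when is_function(fun, 1), do: fun.(token)
  def apply_fun(token, fun, opts) when is_function(fun, 2), do: fun.(token, opts)
end

# A plain list of maps stands in for an Ecto query here.
posts = [%{title: "a", is_published: true}, %{title: "b", is_published: false}]

published_1 = fn list -> Enum.filter(list, & &1.is_published) end
published_2 = fn list, _opts -> Enum.filter(list, & &1.is_published) end

# Both shapes of function are accepted and give the same result.
result_1 = AritySketch.apply_fun(posts, published_1, [])
result_2 = AritySketch.apply_fun(posts, published_2, [])
```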

This plugin is working just fine for me, but I still can’t kick the nagging feeling that there might be a simpler, better way that is more maintainable and obvious.

One thing that can be done is to expose some of these queries in the context to the controller, as @josevalim mentions in the thread “Phoenix Contexts - is it common to preload all resources on the context methods with whatever everything needs access to?”

For example, say you have users with profiles that you want to “preload”, but not “join”. This is going to result in two queries no matter what.

In the controller, you can do something like…

user = Accounts.get_user!(id)
user_with_profile = user |> Accounts.preload_profile()

However, if you want to join or run a query prior to get_user! having hit the database, I don’t think that is going to work.

Maybe something like this could provide for flexible querying, but with more transparency and easier maintainability than the pattern that this plugin results in…

def get_user!(id, queries \\ []) do
  from(User)
  |> (fn query -> Enum.reduce(queries, query, & &1.(&2)) end).()
  |> Repo.get!(id)
end

def join_profile(query) do
  from u in query, join: p in assoc(u, :profile), preload: [profile: p]
end

Now in the controller, you are free to reference queries exposed from the context, but without being coupled to Ecto.

user = Accounts.get_user!(id, [&Accounts.join_profile/1])

Filtering and passing additional arguments is also fine like in this contrived example in a Blog context:

def list_posts(queries \\ []) do
  from(Post)
  |> (fn query -> Enum.reduce(queries, query, & &1.(&2)) end).()
  |> Repo.all()
end

def published_posts(query) do
  from p in query, where: p.is_published
end

def order_posts_by(query, attr_name) do
  from query, order_by: ^attr_name
end

Those functions can be referenced in the controller via a list:

posts = Blog.list_posts([
  &Blog.published_posts/1,
  &Blog.order_posts_by(&1, :title)
])

The thing I like about this is that it is easy to see and track down the context functions that are being used in the controller, which might make maintainability easier.
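
To make the mechanics concrete without a database, here is a minimal sketch under some assumptions: BlogSketch is a hypothetical module, a list of maps plays the role of the Ecto query, and returning that list plays the role of Repo.all. Only the reduce-over-functions shape matches the context code above:

```elixir
defmodule BlogSketch do
  # Hypothetical in-memory stand-in for the Blog context: a list of maps
  # plays the role of the Ecto query, and returning it plays Repo.all/1.
  @posts [
    %{title: "b", is_published: true},
    %{title: "c", is_published: false},
    %{title: "a", is_published: true}
  ]

  def list_posts(queries \\ []) do
    # Apply each function to the accumulated "query", in list order.
    Enum.reduce(queries, @posts, fn query_fun, acc -> query_fun.(acc) end)
  end

  def published_posts(posts), do: Enum.filter(posts, & &1.is_published)

  def order_posts_by(posts, attr_name) do
    Enum.sort_by(posts, &Map.fetch!(&1, attr_name))
  end
end

# The "controller" side: functions are applied in the order they are listed.
titles =
  [&BlogSketch.published_posts/1, &BlogSketch.order_posts_by(&1, :title)]
  |> BlogSketch.list_posts()
  |> Enum.map(& &1.title)
```

With no functions passed, list_posts/0 returns all three posts untouched, which mirrors the empty-list default in the context function.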

I don’t like how much larger the context api becomes and it doesn’t look as clean in the controller, but I’m not sure I’d prioritize that over the transparency and easier reasoning about direct references to queries.

I guess I think of TokenOperator as providing the ability to set up an API that is similar to the sort of thing you would use when hitting an external API. However, maybe that is needlessly abstracting away from just exposing a few more queries from the context and referencing them directly in the controller.

@OvermindDL1 - I’d love to hear your thoughts if any of this strikes you as a useful (or abhorrent) pattern. Perhaps this is just a nuanced enough thing that there is no right answer though.

How do I know from that API that list_posts returns structs, but that published_posts returns a query?
I think there should at least be a naming convention for such context functions. One option is a query_ prefix for functions that return queries. Another is the convention that any function other than the generic list_things, get_thing, etc. returns a query. Is it the latter you had in mind? That pattern would force the developer to start with a generic function such as list_things or get_thing in the controller, which can then be customized further with the other available functions.

Great point. In my usage so far, I’ve been using the same types of terms as had become solidified with my usage of TokenOperator: order_by, filter, and include (though not in my post above). So all of my “queries” are named like “filter_published_posts”, “filter_featured_posts”, “order_by_last_name”, etc. Since I’m thinking about being slightly more explicit, I’m thinking that rather than using “include”, I will use “join” and “preload”. Thus, I’d end up with function names like “join_user_profile” and “preload_post_images”.

It is pretty clear to me by that naming which are queries and which return structs, but perhaps appending something like “_query” would be better.

Or just have queries in a XXX.Queries module
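
A minimal sketch of what that split might look like, assuming hypothetical module names (QueriesSketch.Blog and QueriesSketch.Blog.Queries) and using a list of maps in place of an Ecto query so it runs standalone. The `source` argument is also an assumption that stands in for the table:

```elixir
defmodule QueriesSketch.Blog.Queries do
  # Hypothetical module: every function here takes a "query" and returns a
  # "query". A list of maps stands in for an Ecto query.
  def published(posts), do: Enum.filter(posts, & &1.is_published)
  def order_by(posts, field), do: Enum.sort_by(posts, &Map.fetch!(&1, field))
end

defmodule QueriesSketch.Blog do
  # The context itself only exposes functions that return data.
  def list_posts(source, queries \\ []) do
    Enum.reduce(queries, source, fn query_fun, acc -> query_fun.(acc) end)
  end
end

alias QueriesSketch.Blog
alias QueriesSketch.Blog.Queries

source = [%{title: "b", is_published: true}, %{title: "a", is_published: false}]

# The module name alone now says which functions return queries.
posts = Blog.list_posts(source, [&Queries.published/1, &Queries.order_by(&1, :title)])
```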

I’ve now moved one project fully to this alternate style and it feels pretty good thus far. It is certainly more verbose in the controller, but very easy to see what is going on.

In the controller…

featured_projects =
  ProjectMgmt.list_projects([
    &ProjectMgmt.order_projects_by_name/1,
    &ProjectMgmt.preload_project_photos/1,
    &ProjectMgmt.preload_project_categories/1,
    &ProjectMgmt.filter_featured_projects/1,
    &ProjectMgmt.filter_published_projects/1
  ])

As opposed to how it is using TokenOperator…

featured_projects =
  ProjectMgmt.list_projects(include: [:photos, :categories], filter: [:featured, :published])

Clearly it is much cleaner in the controller with TokenOperator. Additionally, the concept of “defaults” is built in. Thus, I would have pre-configured list_projects to sort by name as an overridable default.

This alternative route loses the built-in defaults, so order_projects_by_name is passed explicitly, but I don’t think that’s so bad. And there is no question about the order in which these queries will be run.
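
For anyone unfamiliar with the idea of an overridable default, here is a generic sketch of it using Keyword.merge/2. This is not how TokenOperator implements defaults; DefaultsSketch is a hypothetical module and a list of maps stands in for the query:

```elixir
defmodule DefaultsSketch do
  # Assumed default for illustration: sort by name unless told otherwise.
  @defaults [order_by: :name]

  def list_projects(projects, opts \\ []) do
    # Caller-supplied options win over the defaults.
    opts = Keyword.merge(@defaults, opts)
    Enum.sort_by(projects, &Map.fetch!(&1, opts[:order_by]))
  end
end

projects = [%{name: "b", rank: 1}, %{name: "a", rank: 2}]

by_default = projects |> DefaultsSketch.list_projects() |> Enum.map(& &1.name)
overridden = projects |> DefaultsSketch.list_projects(order_by: :rank) |> Enum.map(& &1.name)
```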

For more complex queries involving scoping based upon user permissions, I think this alternate method is easier to follow. Suppose I want to scope the rooms that a user can reserve. I was using the following with TokenOperator in the controller…

Inventory.list_rooms_for(location,
  scope: {user, :reserve}
)

That eventually calls a scope function within my context, but there are some gymnastics involved where it isn’t immediately obvious how it gets there.

This alternate method is more obvious in the controller…

Calendar.list_rooms([
  &Calendar.order_rooms_by_name/1,
  &Calendar.scope_rooms(&1, user, :reserve)
])

In practice, I’m hiding that ugly bit of code to run those queries off in a utility:

defmodule Utilities.QueryRunner do
  def run(query, additional_queries) when is_list(additional_queries) do
    Enum.reduce(additional_queries, query, fn additional_query, query ->
      additional_query.(query)
    end)
  end

  def run(query, additional_query) do
    run(query, [additional_query])
  end
end

This is then referenced within the context:

alias Utilities.QueryRunner

def list_projects(queries \\ []) do
  Project
  |> QueryRunner.run(queries)
  |> Repo.all()
end

As of now I’m leaning toward this simplified method, but both have their advantages.

I think you’ve seen that I built a library inspired by TokenOperator a few weeks ago.

In my projects, I would write the above in just a few lines, like below, without having to write any queries. The code you show in the controller, including those 5 query functions, is a lot of functions and lines of code. I guess these functions are small and pure, but every time you want to add an order, another preload, etc. you have to write such a function, which might get tedious.

ProjectMgmt.list_projects(
  order_by: :name,
  where: [featured: true, published: true],
  preload: [:photos, :categories]
)

Do you see disadvantages doing that? The preload function also knows if it’s better to join the association (one-to-one cardinality) or execute a separate query (one-to-many and many-to-many cardinality).

I see an advantage in this keyword-based approach: a list of closures can just be applied as is, but there’s no way to control what those functions do. With a keyword list of atoms, ProjectMgmt.list_projects is in control of which functionality it supports and which it doesn’t.

Say there are two functions concerning the same schema and a helper to preload a certain association, but the helper should only be allowed to be applied to one of those two functions. With the closures there isn’t that level of control.
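
A small sketch of that level of control, under stated assumptions: ControlSketch and its filters are hypothetical, and a list of maps stands in for the query. The function only understands the atoms it pattern matches on and rejects everything else:

```elixir
defmodule ControlSketch do
  # Hypothetical context function: only the filters matched below are
  # supported, so callers cannot inject arbitrary behaviour.
  def list_projects(projects, opts \\ []) do
    opts
    |> Keyword.get(:filter, [])
    |> Enum.reduce(projects, &apply_filter/2)
  end

  defp apply_filter(:featured, projects), do: Enum.filter(projects, & &1.featured)
  defp apply_filter(:published, projects), do: Enum.filter(projects, & &1.published)

  defp apply_filter(other, _projects) do
    raise ArgumentError, "unsupported filter: #{inspect(other)}"
  end
end

projects = [
  %{name: "x", featured: true, published: true},
  %{name: "y", featured: true, published: false}
]

names =
  projects
  |> ControlSketch.list_projects(filter: [:featured, :published])
  |> Enum.map(& &1.name)

# An atom the function does not know about is refused outright.
rejected =
  try do
    ControlSketch.list_projects(projects, filter: [:deleted])
  rescue
    e in ArgumentError -> e.message
  end
```

A second function on the same schema could simply leave out a clause it should not support, which a list of closures cannot enforce.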

Is the library published on github?

See

and

@mathieuprog I think using something like QueryBuilder is a valid way to do it and that does some cool stuff in figuring out whether to preload/join, query associated columns, etc. It is just a different take on it.

I originally started down the path of something similar, but ended up with the more abstract TokenOperator (and now tossing around the idea of an alternative method without a package at all) with the following goals:

  1. No dependency on an ORM. Unopinionated about whether this is a query or multi or whatever token.
  2. Unopinionated about the language used for interacting with a context (e.g. include, preload, join, paginate, etc.). Use the words that make sense for your app.
  3. Consistency of usage. Do the same thing for the least AND most complex queries. Write them out in the context and then reference them via atoms in the controller. No matter how powerful QueryBuilder (or any opinionated Ecto wrapper) becomes, there are still going to be times when you need to do something more complex than its API supports. You can keep adding features, but at some point it might become so big and complex that it might be easier to just write the queries in the context since Ecto has a great API already. While verbose, they are generally easy to write and I often need those queries anyway for operations within the context.
  4. Flexibility - No need for a separate list_posts, list_published_posts, and list_published_featured_posts for every little tweak to a query. This is probably the core thing that TokenOperator, QueryBuilder, and this alternate method all do, so it barely needs mentioning here.
  5. Transparency - Easy to see (and change) the queries that are being run. The alternate method I’m talking about here uses direct references to context functions, which doesn’t get any more obvious or transparent. TokenOperator uses atoms and still makes it easy to get at the functions, but with a layer of abstraction that makes it slightly less so. As @LostKobrakai mentioned, passing atoms like in TokenOperator and QueryBuilder allows the context to dictate what queries are made. That has been useful but, in practice, has provided me less value and transparency than I thought it might, which is why I brought up this alternate method.
  6. It probably also goes without saying, but all of these methods try to keep the controller from interacting with Ecto directly, which is what contexts are pushing us to do.

As an aside, I considered writing an opinionated Ecto “adapter” for TokenOperator as an example of how to use it. It is made to be the core of higher-level abstractions like QueryBuilder.

Anyway, this is a nuanced thing and I’m trying to tease out some of the differences between the approaches, but I think they are all useful. I’m clearly still undecided on how I want to do it so I appreciate things like QueryBuilder being around as an option.

I think this is a really nice pattern, and might use it (in other contexts than building queries).

I am just a little worried that we do not really have an API here that is easy to document and use. The same goes for TokenOperator actually, as I can add any keyword atoms (as you said, use the words that make sense for your app). This means that we must then rely on conventions (otherwise one dev can use :preload and another dev can use :include for preloading). I’m not sure this is necessarily a bad thing; I guess it is, as it can lead to bugs more easily. With an Ecto wrapper like QueryBuilder, there is no need to document all the keywords for every function or rely on agreed conventions between devs, because they are always the same limited set of keyword atoms. Of course, it’s limited to building Ecto queries.

In this specific example quoted above, all these function names contain the word “project”, so in this case it’s easy to find the functions “compatible” with list_projects/1. We do not even need to rely on conventions or heavy documentation here, as the query function names are self-explanatory. Actually, it would be nice to think about contexts/examples other than query building, because we have only one context to base our discussion on :smiley:

In any case, I prefer the convention on function names shown in your alternative method to the keywords that need to be agreed on and documented in TokenOperator.

I can see how documenting and ensuring consistency might be more difficult with this alternate method. For TokenOperator, in practice, I would always lock down the API with wrapper functions as noted at https://github.com/baldwindavid/token_operator#making-it-our-own

Thus, you end up only using maybe_filter, maybe_include, maybe_order_by, etc. and those only respond to the pre-defined atoms of :filter, :include, :order_by, etc.

You can interact with TokenOperator directly, but I would usually suggest writing a simple wrapper around it. It is pretty low-level; so much so that it’s a little hard to explain without writing a higher-level abstraction on top of it as an example. Things like QueryBuilder don’t suffer from that as it is immediately obvious how to use it.

I don’t have a great answer for locking down / documenting the API for this alternate method. Maybe what @chrisjowen mentioned of putting them in a XXX.Queries module helps. It hasn’t been a problem for me yet. I break my contexts into sub-contexts (ex. MyApp.Calendar.Reservations), so they are usually not so large that I need to split them up further.

@mathieuprog - I was thinking of a couple more goals for TokenOperator that I missed…

  • Be decoupled from the database - I’d like to avoid leaking the implementation details of “how” to get the data within functions called in the controller. I think that TokenOperator mostly allows for this using words like “include”. However, in my “alternate method” examples it is questionable since I’m using both the words “preload” and “join”. Perhaps the controller shouldn’t care about such things since this could be grabbing data from an external API, flat file, etc.
  • Avoid referencing actual tables and columns in the functions - Making a change to my database should ideally not require a change at the call site within a controller.

Whether you use the term “include” and use Ecto’s preload behind the scenes, or simply adopt the term “preload” from Ecto, doesn’t make a difference. And of course preloading/including, whatever you want to call it, applies to whatever data source you use, so there’s no leak here to me.

Regarding “join”, your example showing a usage of “join” is the following:

Maybe you should then write a more specific version of the preloading function, namely one that preloads a profile for a user:

# preloading profiles for a user
def preload_user_profile(query) do
  from u in query, join: p in assoc(u, :profile), preload: [profile: p]
end

# preloading profiles for other entities if needed
def preload_profile(query) do
  from q in query, preload: [:profile]
end

The controller doesn’t have to know about joining (which is specific to databases) anymore; up to the context. If you change your data source, preload_user_profile can simply delegate to the more general preload_profile function if there’s no difference.
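
As a hedged illustration of that delegation, here is a sketch that assumes the data source no longer needs a join. PreloadSketch is a hypothetical module and “preloading” is faked by filling in a :profile key on plain maps:

```elixir
defmodule PreloadSketch do
  # Hypothetical in-memory stand-in: "preloading" fills in the :profile key.
  def preload_profile(records) do
    Enum.map(records, &Map.put(&1, :profile, %{owner_id: &1.id}))
  end

  # With no database-specific join to do, the specific function can simply
  # delegate to the general one. The controller keeps calling the same name.
  defdelegate preload_user_profile(users), to: __MODULE__, as: :preload_profile
end

users = PreloadSketch.preload_user_profile([%{id: 1}])
```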

I don’t see table and column names; I see struct names and fields, and those are rightly used everywhere, from the models to the plugs to the templates. Let me know if I missed something.

Yep, no problem with the words “include” or “preload”. It is more that “join” is used and that there needs to be some way to designate whether you want to join or not. I like where you’re going with the concept of using more generalized terms like “preload_user_profile” and “preload_profile”.

I don’t see them directly referenced in TokenOperator or the alternate method, which was the goal.
