Build dynamic bindings for Ecto.Query.order_by

Hi there,

I tried to make ransack analogue and faced with the next problem: all Ecto.Query macros like where, order_by, select, etc. need to pass binding variables as second arguments for build queries with schema relations.
In my case I just try to translate map structure like this:

%{post_body_cont: "body", author_name_eq: "author"}

to Ecto.Query like this:

Comment
|> join(:inner, [c], p in assoc(c, :post)) 
|> join(:inner, [c], a in assoc(a, :author)) 
|> where([c, p, a], like(field(p, :body), "%body%"))
|> where([c, p, a], field(p, :body) == "author")

The problem is that I don’t know how many relation can be represented in the map. There is some approach to build and pass that bindings dynamically?

1 Like

@valyukov - I’m curious about your thoughts on the solution you ended up with on this. I found your project and looked at your code; you are using Code.eval_quoted to accomplish the binding right? This is the most relevant bit I think:

  @spec build(Ecto.Queryable.t, list(Sort.t), Macro.t) :: Ecto.Query.t
  def build(query, sorts, binding) do
    query
    |> Macro.escape
    |> OrderBy.build(binding, Enum.map(sorts, &expr/1), __ENV__)
    |> Code.eval_quoted
    |> elem(0)
  end

At first I was wary of run-time code generate like this - but relevant topics were discussed in this thread and it seems like this code should get garbage collected like any other closure. Any other comments or thoughts on this, are you happy with how ex_sieve turned out?

I’d also be curious if anyone else has thoughts about dynamically binding to an unknown (at compile time) number of tables that may be joined in an Ecto query. We’re working on a library with some similar requirements.

This is a private API, so it can break in any new Ecto version and in ways that we may not even provide an equivalent API.

order_by can actually be applied multiple times to the query, can’t you rely on this behaviour instead?

query = Post
query = from q in Query, order_by: q.foo
query = from q in Query, order_by: q.bar
...etc...

Thanks for your input Jose.

We’re using multiple order_by applications, but at compile time I do not know where to find q in your example, it might be [q] in queryable or [_, q] in queryable etc. Our library will take run-time input to determine which field is requested for sort - it will confirm it can find that column in the schema (or associated schema) attributes or a white-list of fields supplied by the user.

Then it determines the join order for the relation with that column (could also be supplied by the user, or determined by inspecting Ecto.Query.joins which also looks like its a private API).

This index join_order is used to determine the bindings by calling into a function like the below. If I cannot find a better answer I’m left with using a macro to generate these clauses for arbitrary N number of bindings (maybe 25 joins is enough? probably will have to be configurable).

This code makes me very sad, but maybe it will convey the problem I’m trying to solve and it has a better solution:

  defp order_relation(queryable, 0, dir, column) do
    order_by(queryable, [t], [{^dir, field(t, ^column)}])
  end
  defp order_relation(queryable, 1, dir, column) do
    order_by(queryable, [_, t], [{^dir, field(t, ^column)}])
  end
  defp order_relation(queryable, 2, dir, column) do
    order_by(queryable, [_, _, t], [{^dir, field(t, ^column)}])
  end
...

Yeah I have this same issue a lot in codegen. I wish we could optionally name the bindings as well like (completely convoluted example, I actually have combinations over over 1000 variations):

query = from [aName: a] in Abra

query =
  case something do
    nil -> query
    i -> join(query, [a], [bName: b] in Bloop, a.id == b.id and b.i == ^i)
  end

query =
  case somethingElse do
    nil -> query
    i -> join(query, [a], [cName: c] in Choo, a.id == c.id and c.i == ^i)
  end

query =
  if name_exists?(query, :cName) do
    where(query, [aName: a, cName: c], a.blah > 42 and c.bloob <16)
  else
    query
  end

Basically just some way to refer by name instead of just position.

@jeremyjh couldn’t you write something like:

  defp order_relation(queryable, 0, dir, column) do
    order_by(queryable, [t], [{^dir, field(t, ^column)}])
  end
  defp order_relation(queryable, _, dir, column) do
    order_by(queryable, [..., t], [{^dir, field(t, ^column)}])
  end

From “Query bindings” section of: Ecto.Query — Ecto v3.11.1

In other words, … will include all the binding between the first and the last, which may be no binding at all, one or many. Using … can be handy from time to time but most of its uses can be avoided by relying on the keyword query syntax when writing queries.

@axelson I wasn’t aware of that feature but I do not see how it helps here. Binding on the last join would only work if the column the user was trying to sort by happened to belong to that table. I really need to be able to bind to any one of say, say 7 tables, and which one I need to bind to is something I will only learn at runtime. At compile time in the project of my library user they may know the exact bindings; if so I only let them tell me by index position and my library turns that index into a specific binding using above code.

@jeremyjh The order_relation/4 method I defined is identical to the one that you defined, but I can see how it is deficient relative to what you may actually want. However as far as I understand you may call Ecto.Query.order_by/3 at any point, so you do not have to call it after creating all the joins. This means that as soon as you create the join (assuming you create them one at a time) you can call order_relation.

Also thinking about this a little further I think you don’t even need the first function head at all, you can just use:

defp order_relation(queryable, dir, column) do
  order_by(queryable, [..., t], [{^dir, field(t, ^column)}])
end

Am I missing anything?

I’m writing a library which will be handed a query that has 0-N joins already done on it, along with a list of 0-N columns to order by. My library doesn’t know anything about the queries it will process at compile time.

Yeah in that case this approach won’t really help you (unless you change that contract so that the same library can handle the joins and the ordering).

@jeremyjh I’m curious if you ever found a solution for this. I need to be able to choose the binding at runtime as well.

Ecto 3.0 and higher has named bindings, which now makes this trivial. :slight_smile:

2 Likes

Yep! I had been hanging my head against the wall on named bindings all day and just figured it out.

Thank you!

1 Like

Can you elaborate on this a bit? I am working on a feature where I planned to use named bindings but I ran into this trying to dynamically name the join: as must be a compile time atom`

Without being to pass in a var here, I don’t see how one avoids maintaining some kind of map of params to join defs, whether it be pattern matching or some other means.

If you‘re dynamically building up a query, including the bindings then yes - you still need some kind of mapping. But for many use cases the possible bindings are finite and known at compile time, but the conditions applied to them are what is highly dynamic.

It looks like ex_sieve still uses some macro magic to eval generated code in order to allow conditionally setting the value of as:

It sounded as if José was strongly recommending against that above, so I’m wondering what the potential pitfalls are and if there is a better way (besides storing a map somewhere of keys to joins).

I’d also be interested to learn why the value of as is required to be an atom there…I’m guessing it has something to do with safety, but not clear if it’s an Elixir concern or a database concern.

The internals of ecto query structs are private API and therefore prone to changes at any given time.

There’s not – afaik named binding are meant to help developers in composing queries, by supplying names instead of maintaining knowledge about the order of bindings. If bindings (and their names) are composed dynamically then no developer is involved in the naming in the first place – e.g. there cannot be a helper function referring to a known binding name when binding names are not known in advance.

A map external to the query can map arbitrary keys to bindings without ecto needing to be involved at all and given things are built dynamically anyways it shouldn’t really matter where the mapping is stored. The code, which built up the bindings must already be able to know the positions of those bindings, because it’s the piece which is composing them.

The issue is less related to the name needing to be an atom (it’s just the datatype, which makes the most sense), but the fact it must be set at compile time and therefore cannot be dynamic. I’d expect this is to keep things simpler with all the compile time stuff ecto queries do. An option supporting names not known at compile time would likely be way more complex/hard to maintain (if at all possible). But this I’m not 100% sure about.

I think I’m missing something because it doesn’t look like the code snippet above, which handles dynamically named bindings is referencing any Ecto internals, it looks like it’s essentially using an eval statement to get around the compile time atom restriction.

Ah, should’ve taken a closer look. I just expected it to use internal api like many other libraries do. I guess this does work. But I really don’t see how this would be needed in the first place:

Either one knows the possible bindings in advance and can name them statically; Or one does not. In the latter case bindings will be composed by some code (as opposed to by some developer) and that code should be able to maintain knowledge over the order those bindings were added.

It’s not necessary anymore than the named bindings themselves are necessary. It just simplifies the logic necessary to compose complex queries with variable parts.

If I’m following this discussion correctly it sounds like the possibility of handling dynamic named bindings was raised but ultimately tabled for the time being.

It does seem reasonable that for this pretty narrow set of use cases you have to resort to macros, rather than burdening the Ecto code base with a lot of extra complexity.