Error with Enum.sort_by/3 (ArgumentError) you attempted to apply :field_name on)

dstpierre · April 6, 2020, 11:05am

Hi everyone,

I’m prototyping some queries that I would need at work should we decide to go with Elixir and I wanted to see how Elixir, Ecto, Plug, and Cowboy would feel to me in real world context.

I’m getting this error when doing a Enum.sort_by/3 and I’m not understanding what I’m doing wrong. Here’s my Ecto schema (abbreviated):

@primary_key {:id, :integer, source: :RefID}
  schema "Supplier" do
    field :company_id, :integer, source: :CompanyID
    field :supplier_id, :integer, source: :SupplierID
    # ...
    field :last_modified, :naive_datetime, source: :LastModified

I’m auto-generating all Schema from a C# Linq-To-SQL DBML (but that’s a story for a later post). I’d like to migrate to Elixir from outdated 15-20 years old C# legacy business web application.

This is the function I created that I’m running from IEx:

defmodule Data.Test do
  import Ecto.Query

  alias DB.HiddenName.Supplier
  alias DB.HiddenName.Repo

  def report(id) do
    sup = Supplier
      |> where([s], s.company_id == ^id)
      |> order_by([s], desc: s.last_modified)
      |> select([s], s)
      |> Repo.all()

    sup
    |> Enum.group_by(fn x -> x.supplier_id end)
    |> Enum.map(fn {_k, v} -> Enum.take(v, 4) end)
    |> Enum.sort_by(&(&1.last_modified), {:desc, NaiveDateTime})
  end
end

The error I’m getting:

** (ArgumentError) you attempted to apply :last_modified on [%DB.HiddenName.Supplier{supplier_id: 123, company_id: 6, last_modified: ~N[2019-06-18 15:32:00]], :last_modified, [])

When I comment out the last Enum.sort_by there’s no error. I tried replacing the &1.last_modified with the primary key id and just use :desc as the 3 argument, but I’m getting the same error with :id instead of the NaiveDateTime :last_modified.

In short, I just want to sort by that field, and I’ll be completely honest I’m not very sure what &(&1.last_modified) does exactly (I’m in my day 2 with Elixir),

Probably basics question but I’m not phrasing that properly to get any answer myself.

Thanks for your time and help.

hauleth · April 6, 2020, 11:12am

Not an answer, but why not do all of that in the query (I assume PostgreSQL)?

from s in Supplier,
  where: s.company_id == ^id,
  order_by: [desc: :last_modified],
  group_by: s.supplier_id,
  select: fragment("array_agg(?)", s)

About your question, use Enum.flat_map/2 as your Enum.map/2 will return list of lists and in Enum.sort_by/3 you expect list of maps.

dstpierre · April 6, 2020, 11:29am

Thanks for the clarification re: flat_map vs map.

Regarding doing that on the Ecto query itself, (it’s MS SQL Server here) .

Maybe but I’m not sure if your example is what I need though, I need to take the first 4 rows of each “supplier_id”, for instance:

id | company_Id | supplier_id | last_modified
1 | 6 | 123 | 2015-01-02
1 | 6 | 123 | 2015-02-02
1 | 6 | 123 | 2015-03-02
1 | 6 | 123 | 2015-04-02
1 | 6 | 123 | 2015-05-02
1 | 6 | 789 | 2020-04-06

In the above resultset the query needs to return the first 4 rows for supplier_id = 123 order by last_modified, and the one from 789.

I’ll need to investigate the fragment Ecto function as I’m not familiar with it, but I thought this query would be harder to do on the Ecto side of things but I might be wrong there.

hauleth · April 6, 2020, 12:29pm

In SQL:2003 it could be written as:

SELECT *
FROM (
  SELECT
    s.*,
    dense_rank() over (PARTITION BY s.supplier_id ORDER BY s.last_modified DESC) rank
  FROM suppliers s
  ORDER BY s.last_modified DESC) q
WHERE rank < 5
ORDER BY q.last_modified DESC

http://sqlfiddle.com/#!17/bc507/12

Unfortunately Ecto AFAIK still do not allow to have select from subquery so we need to “hack around” by using JOIN:

SELECT s.*
FROM suppliers s
INNER JOIN (
  SELECT
    id,
    dense_rank() over (PARTITION BY supplier_id ORDER BY last_modified DESC) rank
  FROM suppliers) q
  ON q.id = s.id
WHERE q.rank < 5
ORDER BY s.last_modified DESC

http://sqlfiddle.com/#!17/f7eae/2

This one should be possible to express in Ecto syntax:

from s in Supplier,
  inner_join: q in subquery(from ns in Supplier,
                              select: %{
                                id: ns.id,
                                rank:
                                  row_number()
                                  |> over(partition_by: :supplier_id, order_by: {:desc, :last_modified})
                              }),
  on: s.id == q.id,
  where: q.rank < 5, # row_number is 1-based
  order_by: {:desc, :last_modified}

However this is all based on the assumption that the T-SQL supports window functions in SQL:2003 compatible way or that Ecto can translate above query to syntax used by MS SQL.

dstpierre · April 6, 2020, 1:13pm

Yes thanks for the reply.

This is mostly the kind of query we have for this in C# currently in production.

That’s why I was trying to express this using the pipe operator and list comprehension to get more readability.

To be frank, if I’m going to choose Elixir and Ecto I would not want to use the ~Linq syntax from x in “table” bla.

This is mostly what I want to exit actually with C#.

I’m aware that doing the proper query against SQL engine is certainly most performant, but I’m lucky enough that performance comes 2nd after readability (or maybe not, depending on how much more slow this will be using Enum vs. the SQL optimiser).

Because that’s the real question. To be able to justify Elixir, the data access will need to be more comfortable than the current C# code, which is similar to your example above.

I’m having a battle between Go and Elixir (I’ve been writing Go for the last 5 years).

Am I to understand that the Linq syntax from x in "table" where: x.something > 0 is preferred in this community than:

Table
|> where([x], s.something > 0)

for instance? I guess my question is really: Is there something that the pipe operation way cannot do that the DSL can that I should be aware of? Or they are 100% identical and only differ in their usage hence depend on taste of the developers?

Thanks

hauleth · April 6, 2020, 1:30pm

These are 100% identical. IIRC once one was implemented using another. There is no strong preference on one over another AFAIK but with time I personally prefer from macro as this do not force me to repeat the match over and over again. Use what you want TBH. Additional advantage of from for me is that it encourages people to write whole query in place as I found the “building” quite confusing (but that was pre-named joins, so nowadays it is probably nicer).

So answer is no, pipe and from are completely equivalent solutions and you can even mix them as you please.

dstpierre · April 6, 2020, 1:35pm

Very clear, thank you very much.