how to select when you're doing group_by in ecto?

I have a table activity that has fields as following

user_id
role_id
cost
inserted_at
updated_at
deleted_at

So using ecto query I have written a query like this

from t in activity, where: t.id == ^id, group_by: [t.user_id, t.role_id], select: %{inserted_at: t.inserted_at}

Since I can not use select without using an aggregate function. So what I need is that I need to choose the last element which is inserted for the cost field and it’s not possible to use aggregate on this. What are my options?

how can I achieve this?

also is there any alternative for group_by and I can achieve the same thing as group_by

Try using max:
select: %{inserted_at: max(t.inserted_at)}

Actually, I wanted to use it for cost field. Which I’ve mentioned here

I need some examples of this. I know about max aggregate function but in this case it will just give me max value.

Yoi could try looking into window function for this. From what you said it looks like it could be a problem for window to handle.

https://hexdocs.pm/ecto/Ecto.Query.WindowAPI.html

I am currently outside, I’ll look into it once I’m back 2 hours and try to come up with examples.

Thanks. looking forward to it.

So I re-read your post and didn’t quite understands what you meant by last element which is inserted for cost field? Does it mean latest inserted_at value where cost is not null? In which case could you not just filter out the null-s?

No Just the last record which is there for the cost field. Nothing to do with the inserted_at field. Let’s ignore the inserted_at field for now.

Suppose its a table like this

user_id | role_id | cost 

1 | 2 | 300
2 | 5 | 400
3 | 5 | 500
4 | 6 | 800

In this example, I want 800 which is the last element inserted for the cost field.
Is this helpful?

I know you said:

But, I’m not quite sure what you want when you say:

In your example 800 is the maximum value, which you could get using

from t in activity, select: %{cost: max(t.cost)}

What I really think you are looking for is something like this, which gives you the cost for the activity with the greatest inserted_at value (a.k.a. most recent as long as you are not putting future dates in that field):

from t in activity,
  order_by: [desc:  t.inserted_at],
  limit: 1,
  select: %{cost: t.cost}

You can change the column used in the order by clause if you’re wanting something a little different, like “the cost for the activity with the greatest user_id value.”

1 Like

Sorry for the late reply.

I was sort of confused by that example too. I had thought, when I made that WindowAPI comment, you would be looking for something like:

INPUT

user_id | role_id | cost | inserted_at

1 | 2 | 300 | 1/1/2021
1 | 2 | 50 | 1/2/2021
1 | 2 | 2 | 1/3/2021
2 | 5 | 400 | ...
3 | 5 | 500 | ...
3 | 5 | 900 | Assuming time in ascending order per (user * role)
4 | 6 | 800 | ...
4 | 6 | 0 | ...

DESIRED OUTPUT

user_id | role_id | most recent cost
1 | 2 | 2
2 | 5 | 400
3 | 5 | 900
4 | 6 | 0

I guess I assumed wrong.

As 800 is what you desired then I’d do it the way @jswanner suggested.

Thanks. This works for me