I am learning Elixir and Ecto at the moment but I’m having some trouble expressing this query.
I have a Post schema and want to create an Ecto query that calculates a score for each post based on different “scorers” modules.
So far, here is what I have:
scorers = [{PostActivityScorer, 3}]
total_weight = Enum.map(scorers, fn {_, weight} -> weight end) |> Enum.sum()
total_score = fn post ->
Enum.reduce(scorers, 0, fn {scorer, weight}, current_score ->
# Each scorer.score function would return another expression,
# for example: fragment("CASE WHEN p.type = ? THEN 1 ELSE 0 END)", "some type")
^scorer.score(post) * ^weight + ^current_score
end)
end
posts_with_score_query = from(p in Post, select: {p, ^total_score.(p) / ^total_weight})
Is this possible in Ecto? I have tried with dynamic() but as far as I can tell there is no way to interpolate it inside the “select” clause.
What you’re trying is not possible. Ecto.Query is an abstraction around sql queries (native queries to the db you’re using). They cannot express arbitrary elixir code. If you find a way to express the same in sql though, then they should be ways to make that work as ecto querys or even parts of one you can compose.
What I want to do is expressible in SQL, the code would generate a single SQL query. If I take a simplified example it would generate something like this:
SELECT
p.*,
(
-- This expression would come from PostActivityScorer.score()
(CASE WHEN p.inserted_at >= (NOW() - '1 month'::INTERVAL) THEN 1 ELSE 0 END) * 3 +
-- This expression would come from another scorer, like PostTagSimilarityScorer.score()
(CASE WHEN p.tags @> ARRAY['some_tag'] THEN 1 ELSE 0 END) * 1
) / 4
as score
FROM posts as p
ORDER BY score DESC;
Basically I’m looking for a way to abstract all these scoring formulas into modules that would produce the necessary SQL float expression at runtime.
One function would produce (CASE WHEN p.inserted_at >= (NOW() - '1 month'::INTERVAL) THEN 1 ELSE 0 END), and another one (CASE WHEN p.tags @> ARRAY['some_tag'] THEN 1 ELSE 0 END), and I then I can compose them to produce the final query.
I hope this is a bit clearer, please let me know otherwise!
You can hide some of the complexity in fragment macros:
defmodule ActivityScore do
defmacro activity_score(inserted_at) do
quote do
fragment(
"CASE WHEN ? >= (NOW() - '1 month'::INTERVAL) THEN 1 ELSE 0 END",
unquote(inserted_at)
)
end
end
end
defmodule TagScore do
defmacro tag_score(tags) do
quote do
fragment("CASE WHEN (? :: text[]) @> ARRAY['some_tag'] THEN 1 ELSE 0 END", unquote(tags))
end
end
end
defmodule PostQueries
import Ecto.Query
import ActivityScore
import TagScore
def posts_by_score do
activity_weight = 3
tag_weight = 1
total_weight = activity_weight + tag_weight
post_scores =
from post in Post,
select: %{
post_id: post.id,
score:
(activity_score(post.inserted_at) * ^activity_weight +
tag_score(post.tags) * ^tag_weight) /
^total_weight
}
Post
|> with_cte("post_scores", as: ^post_scores)
|> join(:inner, [post], scores in "post_scores", on: post.id == scores.post_id)
|> order_by([_post, scores], desc: scores.score)
|> select([post, score], {post, score.score})
end
Which generates the SQL:
WITH "post_scores" AS (
SELECT b0."id" AS "post_id",
((CASE WHEN b0."inserted_at" >= (NOW() - '1 month'::INTERVAL) THEN 1 ELSE 0 END * $1) +
(CASE WHEN (b0."tags" :: text[]) @> ARRAY['some_tag'] THEN 1 ELSE 0 END * $2)) / $3 AS "score"
FROM "blog_posts" AS b0
)
SELECT b0."id", b0."title", b0."type", b0."tags", b0."inserted_at", b0."updated_at", p1."score"
FROM "blog_posts" AS b0
INNER JOIN "post_scores" AS p1 ON b0."id" = p1."post_id"
ORDER BY p1."score" DESC