Group_by query with two conditions

Hello all,

I want to do a query that get all elements that meet two different conditions.
What I have so far is:

def count_signed_documents_group_by_date do
    from(_ in Document, as: :document)
    |> join(:left, [document: document], dv in DocumentVersion,
      on: dv.document_id == document.id and dv.current_version == true,
      as: :document_version
    )
    |> join(:left, [document_version: document_version], au in AuditLog,
      on: au.document_version_id == document_version.id,
      as: :event
    )    
    |> where([document: document], document.status == "current_signature")   
    |> where([event: event], event.action == "signature.completed")
    |> where(
      [event: event],
      fragment("DATE(?) > NOW() - INTERVAL '12 months'", event.inserted_at)
    )
    |> group_by(
      [event: event],
      fragment("to_char(?, ?)", event.inserted_at, "Month YYYY")
    )
    |> select(
      [document: document, event: event],
      {fragment("to_char(?, ?)", event.inserted_at, "Month YYYY"), count(document.id)}
    )
    |> Repo.all()
    |> Map.new()
  end

The problem is that I also need to get all documents that have a “signed” status. But for those documents, the date that needs to be considered is the inserted_at date of the document itself and not the one from event. I created an helper function to get that:

def count_signed_documents_by_date do
    from(_ in Document, as: :document)      
    |> where([document: document], document.status == "signed")  
    |> where(
      [document: document],
      fragment("DATE(?) > NOW() - INTERVAL '12 months'", document.inserted_at)
    )
    |> group_by(
      [document: document],
      fragment("to_char(?, ?)", document.inserted_at, "Month YYYY")
    )
    |> select(
      [document: document],
      {fragment("to_char(?, ?)", document.inserted_at, "Month YYYY"), count(document.id)}
    )
    |> Repo.all()
    |> Map.new()
  end

My question is, is there a way to accomplish all of that in a single function?

Thanks a lot in advance

What would the query to do this in a single SQL statement look like? That answer will tell you what Ecto machinery to use.

A straight UNION of the two queries would give the right result shape, but could get duplicates if there’s a documents.inserted_at and an events.inserted_at with the same value.

Maybe a subquery that uses UNION to produce tuples of documents.id, some_timestamp, then do the counting + grouping over that? I don’t know if the DB will produce better or worse results with that much-more-complicated flavor of query, though.

1 Like