Hello, I’m trying to join 3 different tables with the only field in common as account_id with no success after 3 days of failure. I have no knowledge of SQL, but I have the following function being looped through 3x and merged to get the result I need. I’d like to compress it into one call.
I’m essentially running the function below on 3 different modules (read schemas). I’m sure there’s a better way.
def get_data(module) do
module
|> order_by(desc: :id)
|> preload(account: :users])
|> Repo.all()
end
Is your current approach working, just a bit messy, or is it not working?
A typical pure SQL approach would be something like this (depending on your SQL dialect):
SELECT combined.f1, combined.f2, combined.table_name, accounts.* FROM
(SELECT f1, f2, 'table1' AS table_name, account_id from table1
UNION ALL SELECT f1, f2, 'table2' AS table_name, account_id from table2
UNION ALL SELECT f1, f2, 'table3' AS table_name, account_id from table3
) AS combined
LEFT JOIN accounts on combined.account_id = account.id
What you are looking for is a UNION query either with a preload, or pre-joined at the database. The above is in raw SQL. I had a quick hunt around the Ecto repository and it looks like there is some Ecto support for Unions - see https://github.com/elixir-ecto/ecto/pull/2678 and https://github.com/elixir-ecto/ecto/issues/1549. I am not familiar with the implementation, but hopefully that points you in the right direction.
Hi @sfusato - I’m not sure join would work here - I think @tio407 is trying to combine the tables by adding extra rows, where as joining them adds the data in extra columns. @tio407 - can you confirm?
I’m trying to create a CSV file with the exact same 4 or so columns, which exist on all the tables but I want them as rows on the CSV file if that makes sense.
Let’s say I had 3 columns: nyc_taxes, sf_taxes, la_taxes. I want the same 3 columns on these tables as headers on my CSV file; and the data from all 3 columns…
All 3 columns belong to account_id in their schemas which is why I mentioned it; not sure how relevant that is.
Maybe it would be best if you provided the following:
database schema (or migration code used to create schema)
some sample data
ecto schema code
sample csv output
Maybe not the answer you are looking for, but I don’t see anything wrong with 3 separate SQL queries. Your DB will handle it no problem and your code will be cleaner, without hand-written SQL.
What you should consider, regardless if you go with unions or not - is the table size. You don’t want to load all records from the table into memory. It is fine for a small amount, but if it’s millions of records then it will get ugly.
General comment: I recommend you address that before trying to grapple with Ecto. It’s close enough to the underlying SQL that you really need to understand what you’re trying to send to the database server when you build queries.
To echo others in this thread, I’d recommend getting a working solution that’s built in a straightforward way with three queries. It’s entirely possible that’s “fast enough” for amount of data in your database. If it isn’t fast enough, then you probably don’t want to be doing Repo.all against giant tables anyways - consider something like Repo.stream instead.
Union was the correct solution however my tables do not have the same # of columns (and Ecto does not allow this). On the second table I’m selecting one additional column (and setting it as a virtual field for the others)…