Postgresql Ecto query speed

Yes, fetching 17k rows can be a problem. Question is if you need to fetch them all at once? What do you do with them after that?

1 Like

You mean to select using filter on date but not include date in the resulting data set?

Try using this query:

from m in MerchantTransaction,
  where: m.trans_date > ^date,
  select: struct(m, [:merchant_id, :transaction_id, :usd_amount, :trans_amount, :trans_currency_code])

and let us know if it goes faster

edit: sorry I don’t think the original query I gave you will run

1 Like

well, it’s definitely 2-6x faster:

  def transactions_after_date(date) do
    from m in MerchantTransaction,
         where: m.trans_date > ^date,
         select: [:merchant_id, :transaction_id, :usd_amount, :trans_amount, :trans_currency_code]
  end
iex(20)> :timer.tc(fn ->    transactions = MerchantTransaction.transactions_after_date(str_hour_ago) |> Markers.Repo.all end) |> elem(0)
[debug] QUERY OK source="merchant_transactions" db=21.5ms decode=81.0ms
SELECT m0."merchant_id", m0."transaction_id", m0."usd_amount", m0."trans_amount", m0."trans_currency_code" FROM "merchant_transactions" AS m0 WHERE (m0."trans_date" > $1) [{{2018, 2, 20}, {12, 29, 39, 482418}}]
112730

iex(21)> :timer.tc(fn ->    transactions = MerchantTransaction.transactions_after_date(str_hour_ago) |> Markers.Repo.all end) |> elem(0)
[debug] QUERY OK source="merchant_transactions" db=21.1ms decode=31.4ms
SELECT m0."merchant_id", m0."transaction_id", m0."usd_amount", m0."trans_amount", m0."trans_currency_code" FROM "merchant_transactions" AS m0 WHERE (m0."trans_date" > $1) [{{2018, 2, 20}, {12, 29, 39, 482418}}]
54239

Seems the dates interpolation is taking too much time!
Thanks for the clue!

1 Like

we have separate tables for transactions and their locations and we do grouping and aggregating of transaction amounts in code. generally, it should be done with joins and SUM() in one DB query if I get your point, and we should definitely do it this way! But so far we were prototyping and 0.5 sec results versus ms in psql put us into doubts.

Thanks!

1 Like

Yeah, so as you can see from the Ecto logs, the original data reports around 250ms for database and 100-300ms for decode. The db as reported by ecto also includes the network roundtrip. If your database is on another host, you need to transfer all of that data - with more data it takes longer. This time is not reported by psql - it only reports the internal database time.
The second measurement - the decode time is how long it takes for ecto to go from the bytes on the wire into the elixir data structures. The more data you have, the more it takes.

It seems that the issue is just the amount of data you’re pushing over the network or then decoding on the elixir side.

Also - what versions of ecto and postgrex are you using?

1 Like
ecto 2.2.10
postgrex 0.13.5

interestingly, I have almost same results for

    from m in MerchantTransaction,
         where: m.trans_date > ^date,
         select: [:merchant_id, :transaction_id, :usd_amount, :trans_amount, :trans_currency_code]

[debug] QUERY OK source="merchant_transactions" db=26.5ms decode=69.8ms
SELECT m0."merchant_id", m0."transaction_id", m0."usd_amount", m0."trans_amount", m0."trans_currency_code" FROM "merchant_transactions" AS m0 WHERE (m0."trans_date" > $1) [{{2018, 2, 21}, {14, 49, 47, 275557}}]
159374

and

    from m in MerchantTransaction,
         where: m.trans_date > ^date,
         select: [:merchant_id, :transaction_id, :usd_amount, :trans_amount, :trans_currency_code, :trans_date]

[debug] QUERY OK source="merchant_transactions" db=80.9ms decode=144.9ms
SELECT m0."merchant_id", m0."transaction_id", m0."usd_amount", m0."trans_amount", m0."trans_currency_code", m0."trans_date" FROM "merchant_transactions" AS m0 WHERE (m0."trans_date" > $1) [{{2018, 2, 21}, {14, 49, 47, 275557}}]
230789

Even all fields in select is faster than no select key:

    from m in MerchantTransaction,
         where: m.trans_date > ^date,
         select: [:merchant_id, :transaction_id, :usd_amount, :trans_amount, :trans_currency_code, :trans_date, :trans_created_at, :trans_updated_at]

gives

:timer.tc(fn ->     transactions = MerchantTransaction.transactions_after_date(str_hour_ago) |> Markers.Repo.all end) |> elem(0)
[debug] QUERY OK source="merchant_transactions" db=142.2ms decode=291.2ms
SELECT m0."merchant_id", m0."transaction_id", m0."usd_amount", m0."trans_amount", m0."trans_currency_code", m0."trans_date", m0."trans_created_at", m0."trans_updated_at" FROM "merchant_transactions" AS m0 WHERE (m0."trans_date" > $1) [{{2018, 2, 21}, {14, 25, 45, 460508}}]
441323

Three fields are added if I don’t specify select: id, inserted_at, updated_at. They add the delay.

@peerreynders Calecto doesn’t help too much. it gives results between 69-126 ms while Timex is between 78-143 ms.

Ecto 3 should offer much better decode performance for the datetime fields with the native types - once they are decoded in the driver they are not shuffled anymore.

In Ecto 2 the drivers would return the Erlang-style tuples and then Ecto would parse them into structs. In Ecto 3 everything is using the built-in structs.

3 Likes

db is significantly more than network round trip time, it will include decoding from bytes to elixir terms in postgrex. decode time is the time to go from rows (list of lists) to ecto preprocessed result. Ecto will also do post processing not including in this time. Therefore should think of db more like time using the connection.

4 Likes

Ecto 3:

    from m in MerchantTransaction,
         where: m.trans_date > ^date,
         select: [:merchant_id, :transaction_id, :usd_amount, :trans_amount, :trans_currency_code, :trans_date]

[debug] QUERY OK source="merchant_transactions" db=96.3ms decode=1.6ms
SELECT m0."merchant_id", m0."transaction_id", m0."usd_amount", m0."trans_amount", m0."trans_currency_code", m0."trans_date" FROM "merchant_transactions" AS m0 WHERE (m0."trans_date" > $1) [#DateTime<2018-02-24 15:26:32Z>]
147457

decoding is significantly faster!