PostgreSQL Ecto query speed

Yeah, sorry. The :geom field was in an additional migration that I missed copying in; it comes from the geo package.
I uncommented it in the model to check if the speed changed, and it did not. Same result.

I cannot share the source code due to company rules, but let me see if I can create a new Phoenix app with the exact same models and make that a public repo.

Stay tuned :slight_smile:

2 Likes

When querying Postgres directly with select [all fields] from users; it responds in 0.4xx ms.

That is the internal PostgreSQL server speed, not necessarily including any of the socket communication and memory allocation.

When doing Repo.all User, IEx reports a 35 to 40 ms query time.

Most of my Ecto queries take between 1-20ms; most are on the low end, and very few (mostly big reports) are on the high end (with a couple taking a couple hundred ms, though that is expected for those). The raw round-trip no-op time averages 0.7ms. All of these times are after the query is cached; you should never really look at the uncached times.
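For reference, a quick way to measure that raw round-trip no-op time yourself is to time a trivial query (a sketch only; `MyApp.Repo` is a placeholder for your own repo module):

```elixir
# Time a no-op query to measure the client<->server round trip,
# excluding any real query planning or result-decoding work.
# `MyApp.Repo` stands in for your application's Ecto repo.
{micros, _result} = :timer.tc(fn -> MyApp.Repo.query!("SELECT 1") end)
IO.puts("raw roundtrip: #{micros / 1000} ms")
```

Run it a few times and discard the first result, since that one includes query preparation.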

1 Like

Yeah, I get the same results as you OvermindDL1 when running locally on my MacBook (PostgreSQL locally). The first request is high at around 50ms, but all later queries clock in at about 4-9ms.

The only differences on my local machine are an i7 (lower clock speed than the prod server), 16 GB of memory (8 in the prod server), and PostgreSQL 10. I’ve tried PostgreSQL 10 on the production server and it does not help, so I guess it’s not that.

I will create a test project during the day, try it out on the production server, and publish it to a GitHub repo.

2 Likes

Just a note: the prod server we were using got reinstalled for an unknown reason.
When everything was installed again (same setup as before), I got decent query speeds again.

First request for a user = 30ms; all queries after the first load are ~8 ms.

That’s a good number, and I cannot reproduce the high numbers I had before, so I am unsure how to debug this further.
It must have been a software error somewhere or something, I don’t know.

2 Likes

Also want to add that the reinstalled software now runs on FreeBSD 11.1; it was CentOS 7 before. Not that I think it makes a huge difference.

Most of my endpoints were sitting at 250-500 ms from Postman before.
Now many of them have gone from 300+ms to ~30ms in response time.
The server is the same, the network is the same, the test client is the same, but the server OS has changed, and the DB, OTP, Elixir, and Phoenix were all reinstalled.

3 Likes

Resurrecting the topic: same problem with Postgres/Ecto.

Migration:

    create table(:merchant_transactions) do
      add :merchant_id, :integer
      add :transaction_id, :integer
      add :usd_amount, :decimal
      add :trans_amount, :decimal
      add :trans_currency_code, :string
      add :trans_date, :naive_datetime
      add :trans_created_at, :naive_datetime
      add :trans_updated_at, :naive_datetime

      timestamps()
    end
    create index("merchant_transactions", [:merchant_id])
    create index("merchant_transactions", [:inserted_at])
    create index("merchant_transactions", [:trans_date])

Model:

  schema "merchant_transactions" do
    field :merchant_id, :integer
    field :transaction_id, :integer
    field :usd_amount, :decimal
    field :trans_amount, :decimal
    field :trans_currency_code, :string
    field :trans_date, Timex.Ecto.DateTime
    field :trans_created_at, Timex.Ecto.DateTime
    field :trans_updated_at, Timex.Ecto.DateTime

    timestamps([autogenerate: {Timex, :now, []}])
  end

  def transactions_after_date(date) do
    from m in MerchantTransaction,
         where: m.trans_date > ^date
  end

Query:

    options = %{time_term: "months", time_value: -1, grouping: "sum"}
    now = Timex.now
    hour_ago = Timex.shift(now, [{String.to_atom(options.time_term), options.time_value}])
    {:ok, str_hour_ago} = Timex.format(hour_ago, "{ISO:Extended:Z}")

    transactions = MerchantTransaction.transactions_after_date(str_hour_ago) |> Markers.Repo.all

When called in IEx, the last line executes in 300-500ms:

:timer.tc(fn ->     transactions = MerchantTransaction.transactions_after_date(str_hour_ago) |> Markers.Repo.all end) |> elem(0)
[debug] QUERY OK source="merchant_transactions" db=453.7ms decode=172.6ms
SELECT m0."id", m0."merchant_id", m0."transaction_id", m0."usd_amount", m0."trans_amount", m0."trans_currency_code", m0."trans_date", m0."trans_created_at", m0."trans_updated_at", m0."inserted_at", m0."updated_at" FROM "merchant_transactions" AS m0 WHERE (m0."trans_date" > $1) [{{2018, 2, 20}, {10, 36, 0, 624233}}]
630562

iex(28)> :timer.tc(fn ->     transactions = MerchantTransaction.transactions_after_date(str_hour_ago) |> Markers.Repo.all end) |> elem(0)
[debug] QUERY OK source="merchant_transactions" db=403.6ms decode=145.7ms
SELECT m0."id", m0."merchant_id", m0."transaction_id", m0."usd_amount", m0."trans_amount", m0."trans_currency_code", m0."trans_date", m0."trans_created_at", m0."trans_updated_at", m0."inserted_at", m0."updated_at" FROM "merchant_transactions" AS m0 WHERE (m0."trans_date" > $1) [{{2018, 2, 20}, {10, 36, 0, 624233}}]
552560

iex(29)> :timer.tc(fn ->     transactions = MerchantTransaction.transactions_after_date(str_hour_ago) |> Markers.Repo.all end) |> elem(0)
[debug] QUERY OK source="merchant_transactions" db=245.1ms decode=127.9ms
SELECT m0."id", m0."merchant_id", m0."transaction_id", m0."usd_amount", m0."trans_amount", m0."trans_currency_code", m0."trans_date", m0."trans_created_at", m0."trans_updated_at", m0."inserted_at", m0."updated_at" FROM "merchant_transactions" AS m0 WHERE (m0."trans_date" > $1) [{{2018, 2, 20}, {10, 36, 0, 624233}}]
375806

iex(30)> :timer.tc(fn ->     transactions = MerchantTransaction.transactions_after_date(str_hour_ago) |> Markers.Repo.all end) |> elem(0)
[debug] QUERY OK source="merchant_transactions" db=228.7ms decode=292.4ms
SELECT m0."id", m0."merchant_id", m0."transaction_id", m0."usd_amount", m0."trans_amount", m0."trans_currency_code", m0."trans_date", m0."trans_created_at", m0."trans_updated_at", m0."inserted_at", m0."updated_at" FROM "merchant_transactions" AS m0 WHERE (m0."trans_date" > $1) [{{2018, 2, 20}, {10, 36, 0, 624233}}]
526772

Direct call in psql is very fast:

explain analyze SELECT m0."id", m0."merchant_id", m0."transaction_id", m0."usd_amount", m0."trans_amount", m0."trans_currency_code", m0."trans_date", m0."trans_created_at", m0."trans_updated_at", m0."inserted_at", m0."updated_at" FROM "merchant_transactions" AS m0 WHERE (m0."trans_date" > '2018-02-20 10:36:00.624233');
                                                          QUERY PLAN                                                          
------------------------------------------------------------------------------------------------------------------------------
 Seq Scan on merchant_transactions m0  (cost=0.00..491.24 rows=17379 width=610) (actual time=0.009..4.742 rows=17379 loops=1)
   Filter: (trans_date > '2018-02-20 10:36:00.624233'::timestamp without time zone)
 Planning time: 0.150 ms
 Execution time: 6.082 ms
(4 rows)

Even if I call the query directly like

SELECT m0."id", m0."merchant_id", m0."transaction_id", m0."usd_amount", m0."trans_amount", m0."trans_currency_code", m0."trans_date", m0."trans_created_at", m0."trans_updated_at", m0."inserted_at", m0."updated_at" FROM "merchant_transactions" AS m0 WHERE (m0."trans_date" > '2018-02-20 10:36:00.624233');

the resulting output is shown very fast. I understand that psql is not taking the full set at once, and there is no Elixir decoding, but still: why does it take SO long? Half a second!

The resulting data set is 17379 rows; is that a problem?

Postgres and IEx are running on the same machine locally, without Docker.

Tried the config with both localhost and 127.0.0.1 - same result.

Any clues?
Thanks!

What happens when you select only these fields: [:merchant_id, :transaction_id, :usd_amount, :trans_amount, :trans_currency_code]? Do you notice any change in speed?

Yes, fetching 17k rows can be a problem. The question is whether you need to fetch them all at once. What do you do with them after that?
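If you end up processing the rows one by one anyway, you could stream them in chunks instead of materializing the whole list - a sketch, assuming a hypothetical `process_row/1` callback (note that `Repo.stream/2` must run inside a transaction):

```elixir
# Stream the result set in chunks of 500 rows instead of one big list.
# Repo.stream/2 requires an enclosing transaction.
Markers.Repo.transaction(fn ->
  date
  |> MerchantTransaction.transactions_after_date()
  |> Markers.Repo.stream(max_rows: 500)
  |> Stream.each(&process_row/1)
  |> Stream.run()
end)
```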

1 Like

You mean to filter on the date but not include the date in the resulting data set?

Try using this query:

from m in MerchantTransaction,
  where: m.trans_date > ^date,
  select: struct(m, [:merchant_id, :transaction_id, :usd_amount, :trans_amount, :trans_currency_code])

and let us know if it goes faster

edit: sorry I don’t think the original query I gave you will run

1 Like

well, it’s definitely 2-6x faster:

  def transactions_after_date(date) do
    from m in MerchantTransaction,
         where: m.trans_date > ^date,
         select: [:merchant_id, :transaction_id, :usd_amount, :trans_amount, :trans_currency_code]
  end
iex(20)> :timer.tc(fn ->    transactions = MerchantTransaction.transactions_after_date(str_hour_ago) |> Markers.Repo.all end) |> elem(0)
[debug] QUERY OK source="merchant_transactions" db=21.5ms decode=81.0ms
SELECT m0."merchant_id", m0."transaction_id", m0."usd_amount", m0."trans_amount", m0."trans_currency_code" FROM "merchant_transactions" AS m0 WHERE (m0."trans_date" > $1) [{{2018, 2, 20}, {12, 29, 39, 482418}}]
112730

iex(21)> :timer.tc(fn ->    transactions = MerchantTransaction.transactions_after_date(str_hour_ago) |> Markers.Repo.all end) |> elem(0)
[debug] QUERY OK source="merchant_transactions" db=21.1ms decode=31.4ms
SELECT m0."merchant_id", m0."transaction_id", m0."usd_amount", m0."trans_amount", m0."trans_currency_code" FROM "merchant_transactions" AS m0 WHERE (m0."trans_date" > $1) [{{2018, 2, 20}, {12, 29, 39, 482418}}]
54239

Seems that decoding the date fields is taking too much time!
Thanks for the clue!

1 Like

We have separate tables for transactions and their locations, and we do the grouping and aggregating of transaction amounts in code. Generally it should be done with joins and SUM() in one DB query, if I get your point, and we should definitely do it this way! But so far we were prototyping, and 0.5 sec results versus milliseconds in psql put us in doubt.
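For the record, pushing that aggregation into the database could look roughly like this in Ecto - a sketch only, since grouping by :merchant_id and summing :usd_amount are assumptions about the actual report:

```elixir
import Ecto.Query

# Let Postgres do the grouping and summing instead of doing it in Elixir;
# only one small row per merchant crosses the wire and needs decoding.
query =
  from m in MerchantTransaction,
    where: m.trans_date > ^date,
    group_by: m.merchant_id,
    select: {m.merchant_id, sum(m.usd_amount)}

Markers.Repo.all(query)
```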

Thanks!

1 Like

Yeah, so as you can see from the Ecto logs, the original query reports around 250ms for the database and 100-300ms for decode. The db time as reported by Ecto also includes the network round trip. If your database is on another host, you need to transfer all of that data - with more data it takes longer. This time is not reported by psql - it only reports the internal database time.
The second measurement, the decode time, is how long it takes for Ecto to go from the bytes on the wire to Elixir data structures. The more data you have, the longer it takes.

It seems that the issue is simply the amount of data you’re pushing over the network and then decoding on the Elixir side.

Also - what versions of ecto and postgrex are you using?

1 Like
ecto 2.2.10
postgrex 0.13.5

Interestingly, I get almost the same results for

    from m in MerchantTransaction,
         where: m.trans_date > ^date,
         select: [:merchant_id, :transaction_id, :usd_amount, :trans_amount, :trans_currency_code]

[debug] QUERY OK source="merchant_transactions" db=26.5ms decode=69.8ms
SELECT m0."merchant_id", m0."transaction_id", m0."usd_amount", m0."trans_amount", m0."trans_currency_code" FROM "merchant_transactions" AS m0 WHERE (m0."trans_date" > $1) [{{2018, 2, 21}, {14, 49, 47, 275557}}]
159374

and

    from m in MerchantTransaction,
         where: m.trans_date > ^date,
         select: [:merchant_id, :transaction_id, :usd_amount, :trans_amount, :trans_currency_code, :trans_date]

[debug] QUERY OK source="merchant_transactions" db=80.9ms decode=144.9ms
SELECT m0."merchant_id", m0."transaction_id", m0."usd_amount", m0."trans_amount", m0."trans_currency_code", m0."trans_date" FROM "merchant_transactions" AS m0 WHERE (m0."trans_date" > $1) [{{2018, 2, 21}, {14, 49, 47, 275557}}]
230789

Even listing all fields in select is faster than having no select key at all:

    from m in MerchantTransaction,
         where: m.trans_date > ^date,
         select: [:merchant_id, :transaction_id, :usd_amount, :trans_amount, :trans_currency_code, :trans_date, :trans_created_at, :trans_updated_at]

gives

:timer.tc(fn ->     transactions = MerchantTransaction.transactions_after_date(str_hour_ago) |> Markers.Repo.all end) |> elem(0)
[debug] QUERY OK source="merchant_transactions" db=142.2ms decode=291.2ms
SELECT m0."merchant_id", m0."transaction_id", m0."usd_amount", m0."trans_amount", m0."trans_currency_code", m0."trans_date", m0."trans_created_at", m0."trans_updated_at" FROM "merchant_transactions" AS m0 WHERE (m0."trans_date" > $1) [{{2018, 2, 21}, {14, 25, 45, 460508}}]
441323

Without a select, three extra fields are added: id, inserted_at, and updated_at. They account for the added delay.

@peerreynders Calecto doesn’t help much: it gives results between 69-126 ms, while Timex is between 78-143 ms.

Ecto 3 should offer much better decode performance for datetime fields thanks to the native types - once they are decoded in the driver, they are not reshuffled anymore.

In Ecto 2 the drivers would return Erlang-style tuples and then Ecto would parse them into structs. In Ecto 3 everything uses the built-in calendar structs.
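You can see the difference in the debug logs above: under Ecto 2 the bound date parameter prints as an Erlang-style tuple, under Ecto 3 as a DateTime struct. A minimal illustration of the two representations:

```elixir
# Ecto 2: Postgrex returned Erlang-style tuples that Ecto reparsed row by row.
ecto2_style = {{2018, 2, 20}, {10, 36, 0, 624233}}

# Ecto 3: the driver decodes straight into the built-in calendar structs,
# so no second parsing pass is needed.
ecto3_style = NaiveDateTime.from_erl!({{2018, 2, 20}, {10, 36, 0}}, {624233, 6})
# ~N[2018-02-20 10:36:00.624233]
```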

3 Likes

db is significantly more than the network round-trip time; it also includes decoding from bytes to Elixir terms in Postgrex. The decode time is the time to go from rows (a list of lists) to the Ecto preprocessed result. Ecto also does post-processing that is not included in this time. Therefore you should think of db more as the time spent using the connection.

4 Likes

Ecto 3:

    from m in MerchantTransaction,
         where: m.trans_date > ^date,
         select: [:merchant_id, :transaction_id, :usd_amount, :trans_amount, :trans_currency_code, :trans_date]

[debug] QUERY OK source="merchant_transactions" db=96.3ms decode=1.6ms
SELECT m0."merchant_id", m0."transaction_id", m0."usd_amount", m0."trans_amount", m0."trans_currency_code", m0."trans_date" FROM "merchant_transactions" AS m0 WHERE (m0."trans_date" > $1) [#DateTime<2018-02-24 15:26:32Z>]
147457

decoding is significantly faster!