Hello all,
When I repeat a query to select a “large” number of records (around 470k records in my test) , the query works as expected till the nth repetition that consumes all the RAM allocated to my pod and write on disk. So far, I am unable to figure out the root cause.
My environment:
I test on 2 pods :
Elixir: 1.10.3 - Ecto: 3.7.1 Postgrex: 0.15.13
or
Elixir: 1.11.4 - Ecto: 3.9.0 Postgrex: 0.16.5
AWS RDS Postgresql 13.7 - I use multi-tenancy and materialized view.
The requested material view represents customers, it has 650K records and the size is 100MB with btree indexes on email and phone columns.
In order to reproduce my issue, I execute this piece of code
{:ok, pid} = Postgrex.start_link( hostname: ..., username: ..., password: ..., database: ...)
query = "SELECT DISTINCT ON (c0.email) c0.email, c0.phone, c0.country FROM my_schema.customers AS c0 WHERE (NOT (c0.phone IS NULL)) AND (c0.phone != '') AND (c0.email != '');"
Postgrex.query(pid, query, [])
I repeat the Postgrex.query(pid, query, []) till it fails, most of the time it fails at the 2nd attempt. Sometimes it happens at the first shot.
When the query works fine it takes a few seconds and I get the result:
{:ok,
%Postgrex.Result{
columns: ["email", "phone", "country"],
command: :select,
connection_id: 17465,
messages: [],
num_rows: 471782,
rows: [..]
}}
the RAM increases of about 80Mo which is consistent and looking at performance insight in AWS console only CPU is used
When the request fails, it times out
{:error,
%DBConnection.ConnectionError{
message: "tcp recv: closed (the connection was closed by the pool, possibly due to a timeout or because the pool has been terminated)",
reason: :error,
severity: :error
}}
All the RAM is consumed, the disk IO increases of 5Mbps, on performance insight side I can see 5% CPU usage and 95% Client:ClientWrite (the client stucks)
I have also tested using stream, I faced the same behaviour.
Postgrex.transaction(pid, fn(conn) ->
conn
|> Postgrex.stream(query, [], max_rows: 5000)
|> Stream.map(fn %Postgrex.Result{rows: rows} -> rows end)
|> Enum.to_list()
end)
Anyone has already get in this trouble? Any idea know how I can move on the investigation?
Thanks in advance for your help
Regards