The only way I know is a bit hacky, because it reaches directly into the process dictionary (pdict), but it’s the easiest way I know. Happy to improve it if someone can suggest a better approach.
def load(stream, columns) do
  statement = """
  COPY readings (#{Enum.join(columns, ", ")})
  FROM STDIN
  WITH (FORMAT csv, HEADER false)
  """

  Repo.transaction(fn ->
    conn = get_conn(Repo)
    # copy_data: true tells Postgrex the prepared query will receive
    # COPY data streamed from the client
    query = Postgrex.prepare!(conn, "", statement, copy_data: true)

    stream
    |> Stream.chunk(2000, 2000, [])
    |> Stream.into(Postgrex.stream(conn, query, []))
    |> Stream.run()
  end, timeout: 3_600_000)

  :ok
end

# Reaches into the process dictionary to grab the raw Postgrex
# connection that Ecto checked out for the current process.
defp get_conn(repo) do
  {pool, _} = repo.__pool__
  Process.get({Ecto.Adapters.SQL, pool})
end
I am running similar code, and I noticed that when I set logging to :debug and do not disable logging on the stream, I get one log statement per line I copy into the table (edit: it actually seems to be N + 2 log lines for N rows). Is this “normal”, or could it slow down the copy process (even when all logging is disabled)?
Here is the code I tested it with:
out_stream = Ecto.Adapters.SQL.stream(repo, "copy numbers from stdin with csv")

repo.transaction(fn ->
  repo.query!("truncate numbers")

  1..10
  |> Enum.map(fn i -> [i] end)
  |> CSV.encode()
  |> Enum.into(out_stream)
end)
and I get this log output:
[debug] QUERY OK db=0.3ms idle=9457.8ms
begin []
[debug] QUERY OK db=0.5ms
truncate numbers []
[debug] QUERY OK db=0.1ms
copy numbers from stdin with csv []
[debug] QUERY OK db=0.0ms
copy numbers from stdin with csv []
[debug] QUERY OK db=0.0ms
copy numbers from stdin with csv []
[debug] QUERY OK db=0.0ms
copy numbers from stdin with csv []
[debug] QUERY OK db=0.0ms
copy numbers from stdin with csv []
[debug] QUERY OK db=0.0ms
copy numbers from stdin with csv []
[debug] QUERY OK db=0.0ms
copy numbers from stdin with csv []
[debug] QUERY OK db=0.0ms
copy numbers from stdin with csv []
[debug] QUERY OK db=0.0ms
copy numbers from stdin with csv []
[debug] QUERY OK db=0.0ms
copy numbers from stdin with csv []
[debug] QUERY OK db=0.0ms
copy numbers from stdin with csv []
[debug] QUERY OK db=40.2ms
copy numbers from stdin with csv []
[debug] QUERY OK db=4.0ms
commit []
I was wondering exactly the same thing. I would expect only one log entry, since we are executing a single COPY statement and just feeding its STDIN. @tobstarr, did you find an answer to your question?
This is an issue for me: with the logger in debug mode, a big load of ~500,000 rows generates the same number of log entries, which slows everything down.
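One workaround that should help, assuming your Ecto version supports the per-query `:log` option (a shared query option in Ecto), is to pass `log: false` to the stream and to the other queries so the per-chunk debug entries are suppressed entirely. A sketch based on the test code above (I have not benchmarked the speedup):

```elixir
# log: false disables debug logging for this query only,
# so each 2000-row chunk no longer emits a log entry
out_stream =
  Ecto.Adapters.SQL.stream(repo, "copy numbers from stdin with csv", [], log: false)

repo.transaction(fn ->
  repo.query!("truncate numbers", [], log: false)

  1..500_000
  |> Stream.map(fn i -> [i] end)
  |> CSV.encode()
  |> Enum.into(out_stream)
end, timeout: :infinity)
```

Setting the logger level above :debug globally would also hide the entries, but `log: false` avoids the cost of producing them at all for just this hot path.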