Hi, I am working on a script to insert huge CSV files into MySQL and Postgres databases.
I am using Ecto's Repo.insert_all for the bulk inserts.
My code runs fine against MySQL, but I run into some issues with Postgres.
Since I am inserting CSVs I don't have schemas defined, so I just call Repo.insert_all(table_name, data_chunk), roughly as sketched below.
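The relevant part looks something like this (the CSV path, table name, column list, and chunk size here are simplified placeholders, not my real code):

```elixir
columns = [:id, :count, :created_at]

"huge.csv"
|> File.stream!()
|> Stream.drop(1)                 # skip the header row
|> Stream.map(fn line ->
  values = line |> String.trim() |> String.split(",")
  columns |> Enum.zip(values) |> Map.new()
end)
|> Stream.chunk_every(1_000)      # insert in batches
|> Enum.each(fn data_chunk ->
  MyApp.Repo.insert_all("my_table", data_chunk)
end)
```

Note that every value comes out of the CSV as a string.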
Postgres gives me these errors:
** (DBConnection.EncodeError) Postgrex expected an integer in -2147483648..2147483647, got "9". Please make sure the value you are passing matches the definition in your table or in your query or convert the value accordingly.
Also…
21:46:45.501 [error] GenServer #PID<0.313.0> terminating
** (DBConnection.EncodeError) Postgrex expected %DateTime{} or %NaiveDateTime{}, got "2017-05-02 15:28:20.2000000 -04:00". Please make sure the value you are passing matches the definition in your table or in your query or convert the value accordingly.
I am using these dependencies…
{:myxql, "~> 0.3.4"},
{:postgrex, ">= 0.0.0"},
{:ecto, "~> 3.4"},
{:ecto_sql, "~> 3.1"},
While MySQL does not have a problem with this, Postgres is stricter about data types. The only idea I can think of is to iterate over the data and convert the strings to integers and the timestamps to NaiveDateTime, something like String.slice(x, 0..18) |> NaiveDateTime.from_iso8601!, but I don't think this is a good solution since it means converting every row, which might take a long time for huge CSVs. A rough sketch of what I mean is below.
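For the sake of discussion, the per-row casting I have in mind would be something like this (:count and :created_at are hypothetical stand-ins for the integer and timestamp columns in my real table):

```elixir
# Hypothetical per-row cast applied to each map before insert_all.
cast_row = fn row ->
  row
  |> Map.update!(:count, &String.to_integer/1)
  |> Map.update!(:created_at, fn ts ->
    # "2017-05-02 15:28:20.2000000 -04:00" -> "2017-05-02 15:28:20"
    ts |> String.slice(0..18) |> NaiveDateTime.from_iso8601!()
  end)
end

data_chunk = Enum.map(data_chunk, cast_row)
```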
Are there any better solutions to this? Am I missing something obvious?
Any help or advice would be much appreciated; I have been stuck on this for quite some time now.