You have a mistake in your data sample file. The first line contains a shell prompt:
anonymized@ANONYMIZED:~/file_watch_example$ head -100
The rest is valid CSV.
You are sorted. Here is the PR: Make it work by dimitarvp · Pull Request #1 · pascal-chenevas/file_watch_example
I made it use batch inserting again and it works just fine. It only takes 108 seconds on my Intel Mac to insert exactly 10 million records.
Try my branch on your machine and let us know. There is no reason for it not to work unless your MySQL instance is broken, or you are not showing all your code and something else in it is causing the problem.
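For anyone following along, the batch-inserting approach boils down to something like the sketch below. This is not the exact PR code; the Product schema, the two-column headerless CSV layout, and the chunk size are assumptions for illustration:

defmodule IngestCSV do
  alias FileWatchExample.{Repo, Product}

  # Chunk size is a guess; MySQL caps one prepared statement at 65_535
  # bind parameters, so keep rows_per_chunk * columns below that.
  @chunk_size 5_000

  def load(path) do
    path
    |> File.stream!()
    |> Stream.map(&String.trim/1)
    # Naive comma split; a real loader would use a CSV library
    # such as NimbleCSV to handle quoting and headers.
    |> Stream.map(fn line ->
      [name, price] = String.split(line, ",")
      %{name: name, price: String.to_float(price)}
    end)
    |> Stream.chunk_every(@chunk_size)
    # insert_all issues one multi-row INSERT per chunk instead of one
    # round-trip per record, which is where the speedup comes from.
    |> Enum.each(&Repo.insert_all(Product, &1))
  end
end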
@dimitarvp, I have merged your modifications and then cloned the repo into a new local directory.
Afterward, I ran the Ecto migration with mix ecto.migrate (so at that moment the table products is empty).
When the migration was finished I copied the CSV file into data/in.
I am very surprised at how fast the records are created in the database!
mysql> select count(*) from products;
+----------+
| count(*) |
+----------+
| 84040000 |
+----------+
As you can see, the number I got is much higher than the number of rows the file contains:
cat products.csv | wc -l
10000000
Maybe file_watcher.ex is not properly configured (and the CSV file is endlessly loaded)?
The command ecto.migrate does not delete old data; it runs pending migrations, and if everything is already up to date it does nothing.
In your case you want to delete the entire database before inserting new data. This can be done by explicitly running ecto.reset, or ecto.drop followed by ecto.create and ecto.migrate.
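Concretely, that sequence is (note that ecto.reset is not a built-in Mix task; it is an alias that generated projects commonly define in mix.exs, so check your aliases first):

mix ecto.drop
mix ecto.create
mix ecto.migrate

After that the products table starts empty again, so a fresh load is actually measurable.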
To have better visibility into what happens in your database, I would strongly recommend using a CLI tool to query it, or even better a visual tool like DBeaver.
Man, this is getting tiring. Just drop the DB, create it again, run the migration, and then start the code.
And indeed, as @D4no0 said, post some command output, e.g. mysql -e 'select count(id) from products;', before and after.
The code I did is working perfectly; you are missing something else, and I believe it's time you get a grip and make it work.
Are you opening iex -S mix and then running IngestCSV.load("10_million_products.csv")? Is there any other code you're not showing?
Thank you for giving me the command you use to load the file. After using the same command I can confirm that I came to the same result as you. It works!
That confirms my suspicion from my previous comment:
Maybe file_watcher.ex is not properly configured (and the CSV file is endlessly loaded)?
Here is how I proceeded to run into this error:
cp products.csv data/in
In the Elixir app, I have a "file watcher" that should check for new incoming files in the directory data/in. If a new file is detected, it should be loaded by IngestCSV (see line #20). So I think this process is wrongly defined/configured and seems to load the CSV file endlessly…
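For reference, a watcher built on the file_system package usually looks something like the sketch below. The directory names, the event filtering, and the move to a processed/ directory are assumptions, not this project's actual code; the point is that each file must be handled exactly once:

defmodule FileWatcher do
  use GenServer

  @watch_dir "data/in"
  @done_dir "data/processed"

  def start_link(_opts), do: GenServer.start_link(__MODULE__, nil)

  def init(nil) do
    {:ok, watcher} = FileSystem.start_link(dirs: [@watch_dir])
    FileSystem.subscribe(watcher)
    {:ok, %{}}
  end

  def handle_info({:file_event, _watcher, {path, events}}, state) do
    # One copied file fires several events (the exact names vary by
    # OS backend). Reacting to every event, or re-scanning the whole
    # directory each time, loads the same CSV over and over.
    if File.exists?(path) and (:closed in events or :modified in events) do
      IngestCSV.load(path)
      # Move the file out of the watched directory so it cannot
      # trigger another load.
      File.mkdir_p!(@done_dir)
      File.rename!(path, Path.join(@done_dir, Path.basename(path)))
    end

    {:noreply, state}
  end

  def handle_info({:file_event, _watcher, :stop}, state), do: {:noreply, state}
end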
Looks like you don't have detection of duplicates, so you're constantly inserting the same records.
Your next step is to add a unique index on the name (or another field) to make sure you don't duplicate so much.
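A migration for that can be as small as the following (a sketch; the module prefix is an assumption, the table and column come from this thread):

defmodule FileWatchExample.Repo.Migrations.AddUniqueIndexToProducts do
  use Ecto.Migration

  def change do
    # The database itself will now reject a second row with the same name.
    create unique_index(:products, [:name])
  end
end

With the index in place, you can also pass on_conflict: :nothing to Repo.insert_all/3 so that re-inserting rows that already exist becomes a no-op instead of an error.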
And it would have been useful for you to mention your exact steps before. We're not mind readers.
I consider this thread completed. Mark one of my answers as the solution (if you don't mind), and if you have other problems, they are for a new thread IMO.
Looks like you don't have detection of duplicates, so you're constantly inserting the same records.
Your next step is to add a unique index on the name (or another field) to make sure you don't duplicate so much.
Yes I will do that!
And it would have been useful for you to mention your exact steps before. We're not mind readers. :smiley:
I will take care of that in my future posts. Sorry about that!
I consider this thread completed. Mark one of my answers as the solution (if you don't mind), and if you have other problems, they are for a new thread IMO.
I already did it.
Thank you, all of you, for your support!