arpan

arpan

Csv2Sql - Load csv files to database

Hi, everyone. I am an elixir nooby and have been learning elixir for about 6 months now. I would like to share a new project I have been working on recently.

My work involved loading lots of CSV files into MySQL databases, this was very problematic, although it was possible to read the CSV files and load it into the database using some tools or scripts, the main problem was to create the database tables before inserting the CSVs. Manually going through each CSV and preparing the query to create the corresponding table was very tedious and time taking. There were some popular tools that I found, but none of them could solve my problem completely, additionally, I also wanted to automate the whole process. So, I decided to solve the problem using elixir, this is how I made csv2sql.

Csv2Sql is a blazing fast fully automated tool to load huge CSV files into a MySQL database.

Csv2Sql can automatically…

  • Read CSV files and infer the database table structure

  • Create the database and the required tables

  • Insert all the CSV into the database

  • Validate that all the CSVs have been correctly imported to the database.

I have made an escript, which accepts command-line arguments so you can load a directory full of CSVs into the database just using a single command like:


./csv2sql --source-csv-directory "/home/user/Desktop/csvs" --db-connection-string "root:mysql@localhost/test_csv"

I used nimble CSV which is super fast when parsing the CSV files, I used ecto to easily bulk insert the CSVs into the database, but the best part is everything is done in parallel taking full advantage of elixirs cheap processes, gen servers, and supervisors. That is, multiple CSVs are processed in parallel, inferring the database schema by reading each CSV is also done in parallel, inserting the CSVs into the database is done parallelly, this makes the app very fast and makes full use of the power of the processor. I have used streams, to lazily read huge CSV files, thus it has minimal memory footprint.

I am sure that there is lots of room for improvements and bugs that will surface as I test the app further, but I wanted to share my project with the community.

Check out the project here.

Any suggestions, advice, bug report, or comments are welcome :slightly_smiling_face:.

Where Next?

Popular in Discussions Top

gausby
I asked this very same question on twitter and got some interesting feedback, but I thought it would be a good question to ask here as we...
1207 39247 209
New
Nvim
Elixir appears to be a superior language to Python. I don’t see any advantage of Python over Elixir. Are there any?
New
crabonature
I’m still quite new to Elixir. As I understand we got in Elixir “multi guards” as convention to simplify one large guard with or’s?: de...
New
sashaafm
Piggy backing a bit on @dvcrn topic BEAM optimization for functions with static return type?, I’ve been trying to understand in a deeper ...
New
fireproofsocks
I’ve been working on an Elixir project that has required a lot of scripting. I usually reach for Elixir because I like it more (and in th...
New
rms.mrcs
A couple of days ago I was discussing with a friend about different approaches to write microservices. He said that if he was going to w...
New
klo
Got a question about when to concat vs. prepending items to list then reversing to achieve appending. So i know lists boil down to [1 | ...
New
tomekowal
Hey guys! I want to create a toy project that shows a chart of temperature over time and updates every 5 seconds. I feel LiveView is per...
New
pdgonzalez872
If this has been asked here before, please point me to where it was asked as I didn’t find it when I searched the forum. Maybe a mailing ...
New
Markusxmr
Since Drab has been developed for a while in the open, introducing the Liveview functionality in a way it happend appears to undermine th...
New

Other popular topics Top

malloryerik
Hi, this is for people who, like me, have had some friction using .html.heex templates in VSCode. The solution seems to be, in a hyphena...
New
greenz1
I have a phoenix application from which a user can download multiple(5-6) files of size 1MB. I couldn’t find anything related to sending ...
New
Nvim
Anybody knows a comprehensive comparison of Django and Phoenix, thanks for the help. Where are they similar? Where do they differ the m...
New
shahryarjb
Hello, I have map which I want to convert it to string like this: the map: %{last_name: "tavakkoli", name: "shahryar"} the string I ne...
New
RisingFromAshes
I've read in another post that it may be possible with a router helper - but I couldn't find an appropriate one, and tbh, I'm still just ...
New
baxterw3b
Hi guys, i’m new in the Elixir world, and i have to say, that i love it! i’m having some problem to understand anonymous functions with ...
New
KronicDeth
Elixir plugin for JetBrain’s IntelliJ Platform (including Rubymine) This is a plugin that adds support for Elixir to JetBrains IntelliJ...
289 35953 110
New
dblack
I’ve got an issue with an app and I’ve no idea of how to troubleshoot it. I’m hoping someone here might have seen something similar. I p...
New
AstonJ
Please see the new poll here: Which code editor or IDE do you use? (Poll) (2022 Edition) It’s been a while since we first asked this, I...
208 31107 143
New
dogweather
I wrote this comment on r/haskell, and it’s not popular there. :wink: But I think I’m on to something… Haskell reminds me of Java, and e...
New

We're in Beta

About us Mission Statement