collegeimprovements

Elixir for ETL Tasks!

Hello Guys,
Is Elixir a good choice for ETL work?
We need to sync a few tables and transfer about 200 million rows in the first iteration of the ETL.
The transfer is between an Oracle DB and Postgres.

Can gen_stage be used for this kind of task?

Appreciate any guidance and pointers here :slight_smile:

Most Liked

aseigo

GenStage is perfect for this kind of workload, as it allows the program to be throttled using back-pressure based on what your target system can take (as well as how fast your data sources can provide rows). If you can partition the source data and your transform steps are the bottleneck (as opposed to the extract or load being the bottleneck due to capacity at either source or destination), then you can also build your application to be clustered, allowing for horizontal scaling of the ETL workload.
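To make the back-pressure idea concrete, here is a minimal sketch of a producer and a consumer. The module names (`EtlSketch.Source`, `EtlSketch.Sink`), the demand settings, and the integer "rows" are all made up for illustration; a real producer would read from Oracle and a real consumer would write to Postgres. It assumes Elixir 1.12+ (for `Mix.install`) and network access to fetch the `gen_stage` package.

```elixir
Mix.install([{:gen_stage, "~> 1.2"}])

defmodule EtlSketch.Source do
  use GenStage

  # Hypothetical source: hands out the integers first..last as stand-in rows.
  def start_link({first, last}), do: GenStage.start_link(__MODULE__, {first, last})

  def init({first, last}), do: {:producer, {first, last}}

  # Emit at most `demand` rows: the consumer decides how much it receives.
  def handle_demand(demand, {next, last}) when demand > 0 do
    upper = min(next + demand - 1, last)
    events = if next > last, do: [], else: Enum.to_list(next..upper)
    {:noreply, events, {upper + 1, last}}
  end
end

defmodule EtlSketch.Sink do
  use GenStage

  def start_link(producer, report_to),
    do: GenStage.start_link(__MODULE__, {producer, report_to})

  # max_demand caps how many rows are in flight towards this consumer.
  def init({producer, report_to}) do
    {:consumer, report_to, subscribe_to: [{producer, max_demand: 100, min_demand: 50}]}
  end

  # Stand-in for the load step: report the batch size instead of inserting.
  def handle_events(rows, _from, report_to) do
    send(report_to, {:loaded, length(rows)})
    {:noreply, [], report_to}
  end
end

{:ok, source} = EtlSketch.Source.start_link({1, 1_000})
{:ok, _sink} = EtlSketch.Sink.start_link(source, self())

# Sum the reported batch sizes until all 1_000 rows are accounted for.
collect = fn collect, total ->
  receive do
    {:loaded, n} when total + n >= 1_000 -> total + n
    {:loaded, n} -> collect.(collect, total + n)
  after
    5_000 -> total
  end
end

total = collect.(collect, 0)
IO.puts("rows loaded: #{total}")
```

The producer never pushes more than the consumer has asked for, so a slow target database slows the extract side down automatically instead of filling up memory.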

Moreover, Elixir makes it trivial to run multiple work orders at the same time, spreading the work out across however many cores you have in the machine running your ETL application.
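As a rough illustration of spreading work across cores, the sketch below uses `Task.async_stream/3` from the standard library rather than GenStage. The module name and the `transform/1` step are hypothetical placeholders for real transform logic.

```elixir
defmodule EtlSketch.Parallel do
  # Hypothetical transform step: upcases a name field. Stands in for real work.
  def transform(%{id: id, name: name}), do: %{id: id, name: String.upcase(name)}

  # Runs transform/1 concurrently, with at most one task per online scheduler.
  def run(rows) do
    rows
    |> Task.async_stream(&transform/1,
      max_concurrency: System.schedulers_online(),
      ordered: false,
      timeout: 30_000
    )
    |> Enum.map(fn {:ok, row} -> row end)
  end
end

rows = for i <- 1..10, do: %{id: i, name: "row#{i}"}
result = EtlSketch.Parallel.run(rows)
IO.inspect(Enum.sort_by(result, & &1.id))
```

With `ordered: false` results come back as they finish, which is usually fine for ETL where the load step does not depend on row order.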

That said: I would not expect an Elixir application to give the absolute fastest execution times for a single-core, single-threaded approach when compared to other languages out there (C++, Java, ...), but for concurrency (local multi-core and/or multi-system distribution), durability (fault tolerance), and developer productivity it is really hard to beat for these sorts of applications, in my experience.

minhajuddin

The Stream module is your friend for moving huge volumes of data. You will need to use the native tools provided by PostgreSQL and Oracle. I did a screencast on how to do fast inserts into PostgreSQL using Postgrex: https://www.youtube.com/watch?v=YQyKRXCtq4s
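A minimal sketch of the lazy, batched shape this usually takes with `Stream` is shown here. It is not the approach from the screencast: `load_batch/1` is a hypothetical stand-in that only counts rows, where a real loader would issue a bulk insert or `COPY` per batch, and the integer range stands in for rows read from the source database.

```elixir
defmodule EtlSketch.Batches do
  # Hypothetical loader: a real one would write the batch to the database.
  # Here it only returns the batch size so the sketch runs without one.
  def load_batch(batch), do: length(batch)

  def run(source, batch_size) do
    source
    # Transform each row lazily; nothing runs until the stream is consumed.
    |> Stream.map(fn id -> %{id: id, value: id * 2} end)
    # Group rows into batches so each write carries many rows.
    |> Stream.chunk_every(batch_size)
    # "Load" each batch as it becomes available.
    |> Stream.map(&load_batch/1)
    |> Enum.to_list()
  end
end

batch_sizes = EtlSketch.Batches.run(1..2_500, 1_000)
IO.inspect(batch_sizes)
```

Because every step is lazy, only one batch is held in memory at a time, which is what makes this workable for hundreds of millions of rows.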

aseigo

Stream is great, but of course it does not bring parallelism along with it. So if one is dealing with small amounts of data, or with processes that absolutely must be strictly serialized (and even then there are sometimes tricks that can be played), Stream makes lots of sense. For larger data sets where you have many cores (in one or more machines) to throw at the problem, GenStage and/or Flow can often produce better results.
