Where I work, we are facing the need to implement a small improvement in our in-house provisioning solution.
Today it consists of shell scripts hosted on a server; each host that needs provisioning downloads them and then executes them sequentially in a predefined order.
However, some of those scripts could be executed in parallel, which would save us a lot of time.
My boss brought up the idea of using Elixir, since he knows I work with it in my freelance projects, so it's a chance to put it up for consideration at my office.
What I have in mind is basic: download all the scripts, start each one with its own GenServer, and run it independently of the rest. Is :os.cmd a solution for this problem? Or Porcelain? Is it even a good approach?
My most recent Elixir project is sshing to 20K machines. Elixir works really well for this kind of stuff.
What you use depends a lot on how much interaction the scripts need and how much you want to deal with error handling.
At the most basic level, System.cmd works reasonably well and has minimal overhead. I generally write a wrapper module to run each script, and then create a list of inputs and use basic Task.async/Task.await_many to gather results.
System.cmd doesn’t run a shell, so if you need shell redirection, :os.cmd works. Porcelain is great if you need to interact with the process and monitor its output.
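A minimal sketch of that wrapper-plus-Task pattern, assuming the scripts are already downloaded locally (`ScriptRunner` and the timeout are my own made-up names/values, not anything from a library):

```elixir
defmodule ScriptRunner do
  # Thin wrapper around System.cmd/3. System.cmd does not go
  # through a shell, so we invoke sh explicitly with the script path.
  def run_script(path) do
    {output, status} = System.cmd("sh", [path], stderr_to_stdout: true)
    {path, status, output}
  end

  # One Task (BEAM process) per script; Task.await_many/2
  # blocks until every script has finished and gathers results.
  def run_all(paths) do
    paths
    |> Enum.map(fn p -> Task.async(fn -> run_script(p) end) end)
    |> Task.await_many(:timer.minutes(5))
  end
end
```

Each script gets its own lightweight process, and the results come back in the same order as the input list.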
Of course the environment you are running in needs to have enough horsepower and cores to take advantage of what Elixir offers. Elixir’s concurrency can help with I/O bound tasks on even a single core VM, but if the problem is compute bound, then the more cores the better.
Most shells already let you run scripts concurrently; if you want to do it via Elixir, it is very simple, as @bbense showed above. Rewriting the scripts entirely in Elixir would probably be a much bigger win if you truly have a highly concurrent problem, though.
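For reference, the plain-shell version of that is just background jobs plus `wait` (the two jobs here are stand-in `sh -c` commands, not anyone's real provisioning scripts):

```shell
#!/bin/sh
# Launch each independent step as a background job...
sh -c 'sleep 1; echo "step A done"' &
sh -c 'sleep 1; echo "step B done"' &
# ...then block until every background job has exited.
wait
echo "all steps done"
```

The two one-second steps finish in roughly one second total instead of two.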
If the scripts respond to EOF properly, then just using :os.cmd or ports is fine; if not, then Porcelain could handle them best.
While lots of tools can manage things via fork/exec, Elixir’s “green threads” allow you to monitor all the processes in parallel. This management can be quite tricky at scale. If the problem is big enough to require throttling (i.e. for whatever reason you can’t just fork all the jobs in a single loop), Elixir makes the scheduling relatively easy.
Where Elixir really shines is when you need to do further processing on the output of the commands you run. If that isn’t required, then Elixir may be overkill for the problem.
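If throttling is what you need, `Task.async_stream/3` gives it almost for free via `:max_concurrency`. A sketch, with the “scripts” stubbed out as `sh -c sleep` commands so it is self-contained:

```elixir
# Eight stub "scripts"; at most two run at any given moment.
stub_scripts = Enum.map(1..8, fn _ -> ["-c", "sleep 1"] end)

results =
  stub_scripts
  |> Task.async_stream(fn args -> System.cmd("sh", args) end,
    max_concurrency: 2,
    timeout: 30_000
  )
  |> Enum.to_list()

# Each entry is {:ok, {output, exit_status}} in input order.
true = Enum.all?(results, fn {:ok, {_out, status}} -> status == 0 end)
```

The scheduler only ever keeps `:max_concurrency` processes in flight, which is exactly the throttling case described above.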
In fact it is a set of fairly simple scripts.
At the start, one of them downloads some packages with apt-get (they are Debian hosts).
Then there is the generation of samples for our project, and the post-configuration of the packages.
Run sequentially, that takes almost 3 minutes, and since those steps are independent of each other, I understand I can parallelize them to gain time.
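That ordering (apt-get first, then the independent steps in parallel) could be sketched like this; the three commands are `echo` placeholders standing in for the real scripts:

```elixir
# Placeholder runner: the real thing would invoke the downloaded
# script files instead of these echo stand-ins.
run = fn cmd -> System.cmd("sh", ["-c", cmd], stderr_to_stdout: true) end

# 1. The package install must complete before anything else,
#    so run it on its own and assert a zero exit status:
{_out, 0} = run.("echo apt-get install placeholder")

# 2. Sample generation and post-configuration are independent,
#    so run them concurrently and wait for both:
[gen, conf] =
  ["echo generate samples", "echo post-configure packages"]
  |> Enum.map(fn cmd -> Task.async(fn -> run.(cmd) end) end)
  |> Task.await_many(:timer.minutes(3))
```

The sequential apt-get step bounds the minimum runtime; only the tail of the pipeline is parallelized.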
I am very grateful for the speed of your answers; they give me hope that I am on the right track.
Depending on what the scripts do, parallelising them might give no gain. I have already observed and experimented with this on some machines that I provisioned for our internal IT department. Most of the tasks were heavily limited by the read/write and transfer rates of the hard disks or the network, rather than by the available CPU.
Also, provisioning should be a one-time job, not something that gets repeated every hour or worse.
At my company we don’t care about a minute or two, but we make sure that everything works correctly.
If I were you, I would write some batch programs for downloading/provisioning/etc. in the most suitable language for the requirements (mostly Ruby or Python, sometimes Elixir), and decouple the concurrency logic from those.
Concretely, I would write a simple wrapper shell script that executes those batch programs as background jobs, while logging to STDOUT.
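A sketch of such a wrapper, with a per-job log file on top of plain background jobs (the job commands are placeholders, not real batch programs):

```shell
#!/bin/sh
# Run each batch program as a background job, capturing its
# STDOUT/STDERR in its own log file under /tmp.
for job in "echo downloading" "echo provisioning"; do
  name=$(echo "$job" | tr ' ' '_')
  sh -c "$job" > "/tmp/${name}.log" 2>&1 &
done
wait   # block until all background jobs have finished
echo "all jobs finished; logs are in /tmp"
```

Keeping the concurrency in one dumb wrapper like this means the batch programs themselves stay strictly sequential and easy to test on their own.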
I once wrote an awesome concurrent script executor in posix shell (I only had access to KSH 88 at my old job for that, talk about old!) that brought some XML processing time down from 2 minutes for the old script to <10s for my new script, it was awesome. ^.^
It was a script that was run 10-100 times a day, I’ve no clue why the old one was written so poorly. I made my executor fully generic so I ended up writing a lot more scripts to use it to shave off more time (I always tested and only used it on things that did not communicate or have any ties between the scripts or files) but none as much as that 2m -> 10s one. ^.^
I still have my old executor somewhere if I should go find it?