This is my first post. I recently started working with Elixir and I absolutely love it! I’m working through the course from Pragmatic Studio.
We have a backend system that was developed using Node.js and part of the workflow involves generating and verifying digital signatures. It’s quite common for us to have to verify or sign 20,000 signatures per request.
Right now, we are distributing the work across multiple Node.js processes, each working on its own batch. For example, with 20,000 signatures, we'd have 10 workers each processing 2,000 signatures. A single process is just way too slow. Our processes communicate via RabbitMQ or HTTP, depending on the situation.
Working with unrelated processes in Node.js is a royal pain in the a$$. It's horrible and dirty, and we want to run far, far away from it.
At first glance, Elixir looks like a dream come true and appears to solve nearly all the challenges we've had to date with a distributed architecture in Node.js.
I'm considering porting the next version of this backend system to Elixir and was wondering whether Elixir would be more performant for this asymmetric encryption workload (I see that we'd actually use Erlang's crypto). Instead of 10 processes, could the load be distributed across 50, 100, or more processes? Could we have 100 processes, each processing 200 signatures?
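For context, here's a minimal sketch of what that fan-out could look like in Elixir, using `Task.async_stream/3` and Erlang's `:crypto`. The module name, the `{message, signature}` item shape, and the use of an `[e, n]` RSA public key are my assumptions for illustration, not a fixed design:

```elixir
defmodule SigVerifier do
  # Verify a batch of RSA signatures concurrently: one lightweight BEAM
  # process per signature, capped at the number of scheduler threads by
  # default (raise `concurrency` to get the "100 processes" behaviour).
  def verify_batch(items, rsa_public_key, concurrency \\ System.schedulers_online()) do
    items
    |> Task.async_stream(
      fn {message, signature} ->
        :crypto.verify(:rsa, :sha256, message, signature, rsa_public_key)
      end,
      max_concurrency: concurrency,
      ordered: false
    )
    |> Enum.all?(fn {:ok, valid?} -> valid? end)
  end
end
```

The key point is that BEAM processes are cheap, so the tuning knob is `max_concurrency` rather than the number of OS processes; the schedulers spread the work over all cores automatically.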
We had considered even writing a C library that used CUDA cores but that is not a simple task.
It could be distributed over however many cores you have, or if you have multiple servers you can distribute across those as well (though take into account network transmission time then).
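As a rough sketch of the multi-server case, you can fan chunks of work out to connected nodes with `:erpc` (OTP 23+). Everything here is illustrative; on a single, non-distributed node it degrades to plain local parallelism, and for remote nodes the relevant modules must be loaded there:

```elixir
defmodule Fanout do
  # Split `items` into chunks and run `fun` over each chunk on a node in
  # the cluster (this node plus every connected node), in parallel.
  def map_chunks(items, fun) do
    nodes = [Node.self() | Node.list()]
    chunk_size = max(div(length(items), length(nodes)), 1)

    items
    |> Enum.chunk_every(chunk_size)
    |> Enum.zip(Stream.cycle(nodes))
    |> Enum.map(fn {chunk, node} ->
      Task.async(fn -> :erpc.call(node, Enum, :map, [chunk, fun]) end)
    end)
    |> Enum.flat_map(&Task.await(&1, :infinity))
  end
end
```

Note the caveat from above still applies: shipping multi-megabyte signature batches between nodes costs real network time, so measure before assuming more nodes means faster.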
Well, I'd not touch proprietary crap like CUDA personally; I'd rather use OpenCL. This is really not that hard either way, and since OpenCL can distribute work to CPUs as well as GPUs, even just writing it in OpenCL will let you be far more performant on a CPU than you would otherwise be anyway.
Honestly, with the work that you are doing, it sounds like I'd be dropping back down to native code like stink on a monkey, and OpenCL (or CUDA, if you really, really want to bind yourself to only NVIDIA things) is practically tailor-made for this kind of work. Of course, Elixir or whatever could orchestrate many of those workers across servers as well, but a single server would probably be fast enough for it at that point anyway.
To give a more detailed answer, I need to know a few things first. You said a single request could have 20k signatures: is the client itself sending 20k things to calculate, or is that generated server-side via a client seed or something?
Thanks for the response, OvermindDl. The 20k signatures in my example are stored in a database (Neo4j). The client simply sends a request, and the data is taken from the database and processed. A response then tells the client whether the operation was a success or failure.
OpenCL looks really cool. I was not aware of it, so thank you for that. This looks like exactly what we should do.
Well, grabbing that data from the DB will definitely take some time, so yes, using something like OpenCL will make the rest of that time practically vanish in comparison. No matter your server-side stack, I'd definitely recommend something like that for the processing if you want it done in a timely manner. Just mind the actual data-transmission costs for what sounds like a multi-megabyte (or more) amount of data, especially with it coming from a comparatively slow database.
For example, I rewrote the RSA decrypt operation to use a port instead of the built-in crypto module, which uses NIFs. The built-in version didn't work so well under high concurrency, as it interfered with the scheduler. The port version was much more stable and able to keep Erlang's low-latency promises, even though it was marginally slower than the NIF.
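A minimal sketch of the port approach, for anyone curious. The `rsa_helper` executable here is hypothetical; the only contract is that it reads and writes 4-byte length-prefixed messages matching the `{:packet, 4}` framing, so the heavy crypto runs in an OS process that can never block a BEAM scheduler:

```elixir
defmodule RsaPort do
  # Send a ciphertext to an external helper over a Port and wait for the
  # plaintext reply. `helper` defaults to a hypothetical `rsa_helper`
  # binary; any executable speaking the same framing works.
  def decrypt(ciphertext, helper \\ "rsa_helper") do
    path = System.find_executable(helper)
    port = Port.open({:spawn_executable, path}, [:binary, {:packet, 4}])
    send(port, {self(), {:command, ciphertext}})

    receive do
      {^port, {:data, plaintext}} ->
        Port.close(port)
        plaintext
    after
      5_000 ->
        Port.close(port)
        {:error, :timeout}
    end
  end
end
```

In a real system you'd keep one long-lived port (or a pool) inside a GenServer rather than spawning one per call; this is just the shape of the message exchange.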