Background processing for a rule-based system (and sandboxing)

Hi, I have some experience with Ruby and have never touched Elixir, but I'm now considering using it for the background processing service in my expert system. I can't decide whether I should.

A fraction of the rules in my expert system are "procedural", i.e. they do map-reduce style processing by running custom code against user data. This code will be crowdsourced. So even though all custom code will have to pass some kind of review before it ever runs in production, a social-engineering type of attack is still possible.

My main user-facing application is a Rails + PostgreSQL app: users put their "facts" into the DB, which triggers the background inference engine to evaluate these facts against the existing rules. The results are then also written to the DB, because some rules rely on the results of other rules (like AND/OR constructs). Job throughput will probably be bottlenecked by database I/O, since the calculation itself will just be mapping over short hashes and arrays. What worries me most is what happens when a bunch of new rules are introduced (or existing ones changed): the system will need to process all existing user data against the new rules, which may take tens of minutes per job (in case the app ever gets traction).
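To make the "rules that rely on other rules" part concrete, here is a minimal Ruby sketch of what I have in mind. Everything in it is hypothetical: the rule names, the `RULES` hash format, and the `evaluate` helper are made up for illustration, and the in-memory `cache` only stands in for the results that would really be stored in the DB.

```ruby
# Hypothetical rule format: a rule is either a :fact check against user data,
# or an :and / :or over the names of other rules.
RULES = {
  adult:       { type: :fact, key: :age,   test: ->(v) { v >= 18 } },
  has_email:   { type: :fact, key: :email, test: ->(v) { !v.to_s.empty? } },
  has_phone:   { type: :fact, key: :phone, test: ->(v) { !v.to_s.empty? } },
  can_signup:  { type: :and, of: [:adult, :has_email] },
  contactable: { type: :or,  of: [:has_email, :has_phone] }
}.freeze

# Evaluates one rule against a user's facts. Results are memoized in `cache`
# so a rule shared by several composite rules is computed only once.
# Note: this sketch does not detect cyclic rule definitions.
def evaluate(name, facts, rules = RULES, cache = {})
  return cache[name] if cache.key?(name)

  rule = rules.fetch(name)
  cache[name] =
    case rule[:type]
    when :fact then facts.key?(rule[:key]) && rule[:test].call(facts[rule[:key]])
    when :and  then rule[:of].all? { |dep| evaluate(dep, facts, rules, cache) }
    when :or   then rule[:of].any? { |dep| evaluate(dep, facts, rules, cache) }
    else raise ArgumentError, "unknown rule type: #{rule[:type]}"
    end
end
```

With this shape, adding or changing one rule means re-running `evaluate` for that rule (and anything that depends on it) across every user's facts, which is the bulk job I'm worried about.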

I initially planned to use Sidekiq for all processing, but then I learned about Elixir and OTP, which seem much better at managing complex processing flows (for example, Sidekiq has "batches", but only in the Pro version). On the other hand, the job processing service will be accessing the DB, so I would like to execute crowdsourced code in some kind of sandbox, and that seems harder to do in Elixir than in Ruby. I've googled sandboxing with Elixir and everybody says to use Lua, but introducing another dependency is uncool and, more importantly, it will affect throughput, won't it?

So, what questions should I be asking myself, and what am I missing that would help me make the decision?

I'm not sure how sandboxing works in Ruby, but Lua is a very common approach here; the alternative is to delegate the untrusted code to another process.

Adding a dependency is just adding a line in your mix.exs file, so I don't see it as an issue there either. :)
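For the "another process" option, a rough sketch in Ruby (since that's what you know) could look like the following. The `run_isolated` helper and its parameters are hypothetical names, and this only gives you a process boundary, a wall-clock limit, and a child that doesn't inherit `DATABASE_URL`. It is not a real sandbox: the child can still touch the filesystem and network unless you also run it as a restricted user or inside a container/VM.

```ruby
require "open3"
require "rbconfig"

# Runs a Ruby snippet in a separate OS process, feeding `input` on stdin.
# The child is killed if it runs longer than `timeout_s` seconds.
# Assumes small inputs and outputs; large output could fill the pipe and
# make the child look like it timed out.
def run_isolated(code, input, timeout_s: 2)
  env = { "DATABASE_URL" => nil } # nil removes the variable for the child
  Open3.popen3(env, RbConfig.ruby, "-e", code) do |stdin, stdout, stderr, wait_thr|
    begin
      stdin.write(input)
    rescue Errno::EPIPE
      # The child exited before reading its input; fall through to the result.
    ensure
      stdin.close
    end

    if wait_thr.join(timeout_s)
      { ok: wait_thr.value.success?, out: stdout.read, err: stderr.read }
    else
      Process.kill("KILL", wait_thr.pid)
      { ok: false, out: "", err: "timed out" }
    end
  end
end
```

You would call it with the reviewed snippet and the serialized user data, e.g. `run_isolated(snippet, facts_as_text)`, and only write `out` back to the DB from the trusted parent process. The same idea carries over to Elixir, where the parent would talk to the external process through a port.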

Pointers:

IMO either use an embedded script engine (like https://github.com/rvirding/luerl), or have a separate small machine to which you delegate possibly unsafe external code.

Somebody used a pretty cheap setup to achieve something similar: "How to setup an Elixir sandbox for a nerves project in 2 simple steps with Nanobox"