We at Purple are trying to build the first scalable decentralized computing platform and we chose Elixir for two reasons:
The Erlang VM and its hot-code loading capabilities can be leveraged for compiling and executing code dynamically.
Decentralized software needs to be bulletproof and the Erlang VM is just good at this.
The only problem is that, in order to prevent malicious software from running on nodes (e.g. infinite recursion), there has to be a way to count the opcodes executed by the Erlang VM at runtime for a given piece of code.
We have not yet found an API for doing such a thing. Is it even possible?
Source analysis only gives all the possible opcodes that a certain program can call. What we are trying to do is to count the number of opcodes that have been called by a script while it runs. The problem is that scripts have to be killed if the number of opcodes they call exceeds a specific limit.
We cannot use timers since the script is run on every node that is connected to the network and consensus on the execution of the script is required among nodes with varied hardware and different owners. Opcode counting allows for a script to always stop at a specific call on each node, regardless of the amount of time it took for that node to compute the script.
It seems to me that you are considering running user code inside the same BEAM environment as your management code. This is a bad idea; it will be very difficult to properly isolate user-code from breaking the security restrictions you might want to put on it, because the BEAM does not do this for you.
However, spinning up BEAM instances inside some virtual sandbox environment and having these communicate with your management-code and with each-other is definitely possible. You can hidden-remote-shell-connect an instrumentation node into a running cluster to perform monitoring that way.
That would definitely be possible. You have one process which monitors the processes that could run malicious code and keeps track of how many reductions they do. There is no problem with counting reductions, as everything is done by calling functions, so it is a reasonable way to keep track of how much work is done. For example, Erlang has no loops as such, so looping is done with function calls.
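A minimal sketch of such a monitor, assuming a hypothetical `ReductionWatch` module (the name and the return shapes are made up for illustration). It polls `Process.info(pid, :reductions)` and kills the worker once the counter passes a limit. Note that polling is not deterministic: the worker stops at whatever point it has reached when the poll fires, so this bounds the work done but would not by itself make every node stop at the same call.

```elixir
defmodule ReductionWatch do
  # Hypothetical helper: runs `fun` in a monitored process and kills it
  # once its reduction count exceeds `limit`. Polls every `poll_ms` ms.
  def run(fun, limit, poll_ms \\ 1) do
    # The result travels back in the exit reason, so only :DOWN messages
    # need to be handled.
    {pid, ref} = spawn_monitor(fn -> exit({:done, fun.()}) end)
    loop(pid, ref, limit, poll_ms)
  end

  defp loop(pid, ref, limit, poll_ms) do
    receive do
      {:DOWN, ^ref, :process, ^pid, {:done, value}} -> {:ok, value}
      {:DOWN, ^ref, :process, ^pid, reason} -> {:error, reason}
    after
      poll_ms ->
        case Process.info(pid, :reductions) do
          {:reductions, n} when n > limit ->
            Process.exit(pid, :kill)

            # Wait for the :DOWN so no stray message stays in the mailbox.
            receive do
              {:DOWN, ^ref, :process, ^pid, _} -> {:error, {:reduction_limit, n}}
            end

          _under_limit_or_already_dead ->
            loop(pid, ref, limit, poll_ms)
        end
    end
  end
end

# A runaway loop is stopped, a small computation finishes normally.
ReductionWatch.run(fn -> Stream.run(Stream.cycle([1])) end, 100_000)
ReductionWatch.run(fn -> Enum.sum(1..10) end, 1_000_000)
```

The limit here is a soft one: the reported count is whatever the worker had accumulated at the moment of the poll, which will differ between runs and machines.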
As @Qqwy pointed out, it is very difficult, in fact impossible, to safely restrict what code, any code, can do. Basically any code is allowed to do anything it wishes, and there is no way to make an internal sandbox which could limit what a process can access or do. The BEAM was never designed to be “safe” in this way.
The only way is to run separate BEAMs inside their own virtual sandboxes to be able to completely restrict them. A thing to note here is that if you run distributed Erlang then you are basically opening up all the nodes in the distributed system to each other. You can restrict access, but it is difficult and I wouldn’t trust it. This means that opening up your system to allow remote shells is definitely a no-no. This holds even if you are hidden, as hidden nodes can be found once they are connected.
The network talks via a gossip protocol and nodes only need to know the IPs of other nodes. The user code is passed around in UDP packets, so the code can be analysed before it is loaded into the system.
In order to prevent malicious code from interacting with the VM, the plan is to expand its AST and check whether it calls any Erlang or Elixir functions which interact with the VM. If this is the case, the nodes would simply reject the received source code.
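A rough sketch of what such a check could look like, assuming a hypothetical `AstCheck` module with made-up allowlists. It parses the source with `Code.string_to_quoted/1`, walks the quoted form with `Macro.prewalk/3`, and rejects any remote call, local call, operator or special form that is not explicitly allowed. It only inspects the unexpanded AST, so it is an illustration of the idea rather than a complete defence. Also note that parsing untrusted source already creates atoms for the identifiers it contains.

```elixir
defmodule AstCheck do
  # Hypothetical allowlists, for illustration only.
  @allowed_modules [Enum, List, Map]
  @allowed_locals [:+, :-, :*, :=, :fn, :->, :__block__]
  # Structural AST nodes that are not calls themselves.
  @structural [:., :__aliases__]

  # Returns {:ok, ast} if every call is allowlisted,
  # {:error, violations} otherwise (or the parser's own error tuple).
  def check(source) when is_binary(source) do
    with {:ok, ast} <- Code.string_to_quoted(source) do
      {_ast, found} = Macro.prewalk(ast, [], fn node, acc -> {node, visit(node, acc)} end)
      if found == [], do: {:ok, ast}, else: {:error, Enum.reverse(found)}
    end
  end

  # Remote call on a literal Elixir alias, e.g. Enum.map(...).
  defp visit({{:., _, [{:__aliases__, _, parts}, fun]}, _, args}, acc) when is_list(args) do
    if Enum.all?(parts, &is_atom/1) and Module.concat(parts) in @allowed_modules,
      do: acc,
      else: [{:remote, parts, fun, length(args)} | acc]
  end

  # Remote call on anything else: an Erlang module atom or a runtime value.
  defp visit({{:., _, [target, fun]}, _, args}, acc) when is_list(args) do
    [{:remote, target, fun, length(args)} | acc]
  end

  # Local call, operator or special form, e.g. spawn(...), apply(...), x * 2.
  # Variables have an atom (not a list) in the third position, so they are skipped.
  defp visit({name, _, args}, acc) when is_atom(name) and is_list(args) do
    if name in @allowed_locals or name in @structural,
      do: acc,
      else: [{:local, name, length(args)} | acc]
  end

  defp visit(_other, acc), do: acc
end

AstCheck.check("Enum.map([1, 2, 3], fn x -> x * 2 end)") # accepted
AstCheck.check(":erlang.halt()")                          # rejected: remote call into :erlang
AstCheck.check("spawn(fn -> 1 end)")                      # rejected: local call to spawn/1
```

Allowlisting local names as well as modules matters here, because imported functions such as `spawn` and `apply` show up in the AST as plain local calls.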
Regarding opening up Erlang nodes to each other: since nodes speak via gossip, wouldn’t setting a different Erlang cookie for each node effectively prevent access from other nodes?
Yes, but then you disallow all access through distribution, for example running remote shells. So it’s an all-or-nothing deal with distribution and cookies. Then you would have to decide whether to run the nodes alive or not at all. There is no reason to have them alive if you are not going to allow distribution.
What I do in a couple of apps with user code (one in Lua, another in a few things simultaneously) is run their code in another process that must stop within a certain time bound or it is brutally killed (and a report is sent out to tell the user their code is borked). It has worked well for me so far, but it is definitely not opcode-count-specific.
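For reference, a minimal sketch of that time-bound pattern using the standard `Task` API. The `BoundedRun` module name and the return shapes are hypothetical. `Task.Supervisor.async_nolink/2` is used so that a crash in the user code does not take the caller down with it. Being wall-clock based, this does not stop at the same point on different hardware.

```elixir
defmodule BoundedRun do
  # Runs `fun` under the task supervisor `sup`; if it has not finished
  # after `timeout_ms` milliseconds it is killed without further notice.
  def run(sup, fun, timeout_ms) do
    task = Task.Supervisor.async_nolink(sup, fun)

    case Task.yield(task, timeout_ms) || Task.shutdown(task, :brutal_kill) do
      {:ok, result} -> {:ok, result}
      {:exit, reason} -> {:error, reason}
      nil -> {:error, :timeout}
    end
  end
end

{:ok, sup} = Task.Supervisor.start_link()
BoundedRun.run(sup, fn -> 1 + 2 end, 1_000)
BoundedRun.run(sup, fn -> Process.sleep(:infinity) end, 50)
```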
Yeah, this definitely becomes harder. In that case have you thought about just running a custom language that handles its own opcode counting? That is what I did with my safe_script library (complete enough for my use, but I doubt it’s complete enough for anyone else’s, as obvious features are still missing). I just count the instructions and ‘suspend’ after a certain number of instructions (returning a continuation object). It’s not ‘too’ hard to write actually, especially if it is a pure whitelisted interaction (no direct BEAM calls)…
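To make the idea concrete, here is a toy sketch of an instruction-counting interpreter. It is not how safe_script works; the `TinyVM` module and its instruction set are invented for illustration. Each executed instruction costs one unit of budget, and when the budget runs out the interpreter returns its own state as the continuation, so the same program with the same budget stops at the same instruction on every machine.

```elixir
defmodule TinyVM do
  # Hypothetical instruction set: {:push, n} | :add | {:jump, target} | :halt
  defstruct program: {}, pc: 0, stack: []

  def new(instructions), do: %__MODULE__{program: List.to_tuple(instructions)}

  # Executes at most `budget` instructions. Returns {:done, stack},
  # {:suspended, vm} (the continuation) or {:error, {:bad_instruction, pc}}.
  def run(%__MODULE__{program: prog, pc: pc, stack: stack}, _budget)
      when pc < 0 or pc >= tuple_size(prog),
      do: {:done, stack}

  def run(%__MODULE__{} = vm, budget) when budget <= 0, do: {:suspended, vm}

  def run(%__MODULE__{program: prog, pc: pc} = vm, budget) do
    case step(elem(prog, pc), vm) do
      {:cont, next} -> run(next, budget - 1)
      :halt -> {:done, vm.stack}
      :error -> {:error, {:bad_instruction, pc}}
    end
  end

  defp step({:push, n}, vm) when is_integer(n),
    do: {:cont, %{vm | stack: [n | vm.stack], pc: vm.pc + 1}}

  defp step(:add, %{stack: [a, b | rest]} = vm),
    do: {:cont, %{vm | stack: [a + b | rest], pc: vm.pc + 1}}

  defp step({:jump, target}, vm) when is_integer(target),
    do: {:cont, %{vm | pc: target}}

  defp step(:halt, _vm), do: :halt
  # Unknown instruction or stack underflow.
  defp step(_other, _vm), do: :error
end

# An endless loop is suspended once the budget is used up.
{:suspended, _vm} = TinyVM.run(TinyVM.new([{:push, 1}, {:jump, 0}]), 1_000)

# A finite program can be paused and later resumed from its continuation.
program = TinyVM.new([{:push, 2}, {:push, 3}, :add, :halt])
{:suspended, paused} = TinyVM.run(program, 2)
{:done, [5]} = TinyVM.run(paused, 10)
```

Because the user program can only ever execute these instructions, there is nothing to whitelist against the BEAM itself; the interface is whatever instructions the interpreter chooses to expose.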
Yeah if you could ‘inject’ calls periodically into the running user code that do this and test, that would work well.
As long as it is fully whitelisted-only code, it is far easier to handle (though watch your interface properly!).
I would not run straight-beam code on the beam to do this though, I’d still use a higher level language on top…
And this is why.
The BEAM is a VM, not a sandbox of any real form.
And this is only one of many injection points.
However, please answer this:
What precisely are you trying to accomplish, and why? Not ‘how’ are you trying to accomplish it, but what end goal are you trying to accomplish? There might be a better way…
We are trying to implement smart contracts on a decentralized ledger. Normally, a virtual machine would need to be written in order to achieve this.
However, this is no easy task, and since Erlang already provides tools for loading code at runtime, we are exploring the possibility of implementing smart contracts without the need to write an entire VM from scratch.
I’m not sure I would do this then. At the very least, it would be very easy to run out of atoms if you are starting to talk about loading that code at runtime. There are quite a variety of VMs out there, but one customized for the (seemingly restrictive) language would be best for this task, I’d bet.