How to create a sandbox to run untrusted code/modules?

There are a great many ways to call a function: in a lot of places you can pass {module, function, arguments} tuples, and you can create atoms at runtime in many ways as well. I don’t see a viable way of filtering all of this.

OS-process-level sandboxing is definitely a much better idea.
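
To make the filtering problem concrete, here is a loose analogy in Python rather than Erlang/Elixir (the language swap is purely for illustration). `FORBIDDEN`, `naive_filter` and `payload` are hypothetical names, not any real library's API. The sketch shows a substring blacklist being bypassed by assembling the module and function names at runtime, which is roughly what runtime-created atoms and {module, function, arguments} tuples allow on the BEAM.

```python
import importlib

# Hypothetical blacklist of source fragments that the filter refuses.
FORBIDDEN = ("os.remove", "os.system", "import os")


def naive_filter(source):
    """Return True if the source passes the naive substring blacklist."""
    return not any(bad in source for bad in FORBIDDEN)


# The module and function names are built from fragments at runtime,
# so no forbidden substring ever appears in the source text.
payload = 'getattr(importlib.import_module("o" + "s"), "rem" + "ove")'

passes = naive_filter(payload)   # the filter lets the payload through
resolved = eval(payload)         # only resolves the function, never calls it
```

The filter accepts the payload, yet `resolved` ends up being the very function the blacklist was meant to keep out of reach.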

I know I am being a Cassandra at this point and that I am spoiling the fun a bit.

But Docker is not a sandbox. Is it better than the BEAM isolation? Yes, probably. But if you need a secure sandbox with Docker, the advice right now, and probably for the year to come, is to have one VM per Docker container. If you want a more secure sandbox/container, have a look at FreeBSD Jails or SmartOS Zones. You can run a Docker container on SmartOS Zones.

Exactly this! Docker is NOT a sandbox: it is pretty trivial to ‘break out’ of it, it is designed for process segmentation rather than sandboxing, and the developers are upfront about that. Anyone using Docker to sandbox has a ticking time-bomb…

SmartOS Zones would be a good sandbox though. FreeBSD Jails ‘mostly’ work, but they can be broken out of with relative ease now too. I’d still say just support Lua or something, not Elixir.

I’ve found another thread discussing sandboxing here:
https://groups.google.com/forum/#!topic/elixir-lang-talk/RzWVMrHaZF8

Also, I found one Elixir Lua implementation (not sure if it is sandboxed though):

There are a lot of Lua implementations for the BEAM:

  • https://github.com/rvirding/luerl by rvirding himself. This is implemented in Erlang, not super fast, but very safe. It implements ‘most’ of Lua, good enough for simple scripting. However, you can implement things in Erlang/Elixir that can be called from Lua very fast, so it makes a great simple scripting layer.

  • https://github.com/bendiken/exlua is just an Elixir wrapper around luerl that changes the API a bit; it does not add anything though.

  • https://github.com/Eonblast/Erlualib is an embedded driver that adds Lua to the BEAM. Full power and normal Lua speed, but you can crash the VM if Lua crashes (rare, but possible). ^.^

  • Do it yourself via a port: This is probably what you should do. Make your own C Lua port system (and release it as a hex.pm library!) that connects to the BEAM via a port: full safety, full sandbox (as long as you leave out Lua standard library pieces such as filesystem and network I/O), etc…
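
For the port option, the external program mostly needs to speak the port's framing. Below is a minimal sketch of that side, assuming the port is opened with the `{packet, 4}` option, so every message carries a 4-byte big-endian length prefix. It is written in Python rather than the C suggested above, purely to show the framing; `read_packet`, `write_packet`, `serve` and the handler are hypothetical names, and a real version would hand each request to the embedded Lua interpreter.

```python
import io
import struct


def read_packet(stream):
    """Read one length-prefixed message; return None once the port closes."""
    header = stream.read(4)
    if len(header) < 4:
        return None
    (length,) = struct.unpack(">I", header)  # 4-byte big-endian length
    payload = stream.read(length)
    if len(payload) < length:
        return None
    return payload


def write_packet(stream, payload):
    """Write one message with the same 4-byte big-endian length prefix."""
    stream.write(struct.pack(">I", len(payload)) + payload)
    stream.flush()


def serve(inp, out, handler):
    """Answer every request with handler(request) until the input ends."""
    while (request := read_packet(inp)) is not None:
        write_packet(out, handler(request))


# In-memory demonstration; a real port program would pass
# sys.stdin.buffer and sys.stdout.buffer instead.
demo_in = io.BytesIO(struct.pack(">I", 5) + b"hello")
demo_out = io.BytesIO()
serve(demo_in, demo_out, lambda request: request.upper())
```

On the Elixir side this would pair with something like `Port.open({:spawn_executable, path}, [:binary, packet: 4])`, where `path` points at the external program. If the program crashes, only the port closes and the BEAM keeps running.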

EDIT: It would be fun to build a new simple safe scripting language on Elixir though, not hard to do. ^.^

Thank you for all the advice.

What would a “simple safe scripting language on Elixir” look like to you? Would this also use ports? Implemented in C?

I’d personally use either luerl or actual Lua through a port, depending on the feature set I needed.

If I were to make a new scripting language, it would probably be SML-like, probably just implemented in Elixir itself… ^.^

It is basically impossible to sandbox an Erlang/Elixir system if you allow it to run untrusted code, especially if you allow loading already compiled code. You *could* interpret Erlang/Elixir code, but the interpreter would have to be very limiting, and even then it is still possible to get around it. If you want to run the code within the Erlang/Elixir system, then the only safe way is to interpret/run another language where you can control how it interacts with the system outside it. For example, running Lua with luerl.

Another way would be to run a “safe” language/system outside Erlang/Elixir. Or, perhaps best of all, run a special system inside a DMZ.

I know of people who use LXC to run untrusted code in a sandbox instead. How safe is this?

Better than Docker, but still not secure, as it is possible to talk across kernel boundaries, and it still has overhead compared to, say, Illumos’s Containers.

Thanks for clarifying. I was always under the impression that Docker provided secure isolation if the process inside the container was running as a non-root user. There are many posts which talk about creating a standard non-sudo user in the container and then running the container process as that user:

#!/bin/bash
# /opt/build/entry.sh
# create the user
useradd --home /opt/build --shell /bin/bash "statictrain"
# run our script with this new non-sudo user
su --command /opt/scripts/build.sh "statictrain"

# then, on the host, start the container with that entry script
docker run --rm .... /opt/build/entry.sh

Is this approach not secure?

I just created an account with Joyent (https://www.joyent.com/smartos) to play around with SmartOS containers (had no idea they existed :slight_smile: ). They are giving a $250 credit for new accounts, so that is nice :slight_smile:

No, it is not secure. The simplest escape to explain is this: once a process escapes the container limits, the only “security” you have left is the one provided… by the user system. And you are back to just normal Linux kernel security on the whole box.

Even LXC is not that good, and I would not advise using it until the code has had another couple of years of cleanup.

Docker is good at writing Dockerfiles and as a build tool. It is not really a good production environment. In general, Linux is not the best environment for containers right now.

About SmartOS: you can run your Docker container there and it will be secure, because it runs in a Zone. They use the Docker Remote API.

I’m curious why the following approach may not work: suppose you forked Elixir and removed any modules that might run or load untrusted code, like File, eval, IO, compile, etc. Assuming you compiled code with only this compiler, might this be an effective way to sandbox?

A popular article on Hacker News today is by another person looking to run multi-tenant code:

Unfortunately, from what I’ve seen, no technology ticks all the boxes for my use case. I’m still looking for an audited, lightweight, easy to use sandbox, geared for running multi-tenant code.

https://idea.popcount.org/2017-03-28-sandboxing-lanscape/

Unfortunately it is still possible, and quite easy in fact, to load a module into the system. Even if you were to remove the code module, the basic BIFs are still there. You would need very, very tightly controlled access to stop this. And I still wouldn’t trust it. :grinning:

Exactly this, a blacklist sandbox will never work on a full language.

You would have to whitelist each individual system call, with each one audited in detail for how it can be used (including automatic conversion of atoms to modules and all sorts of other things). It is possible to make a sandbox, but it would be limited on purpose and you would have to make a lot of safe stubs and such…
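
As a rough sketch of what a whitelist-only interpreter looks like, here is a tiny expression evaluator in Python (an illustration of the idea, not an Elixir sandbox). `ALLOWED_BINOPS`, `ALLOWED_FUNCS` and `safe_eval` are hypothetical names. Every syntax node and every callable has to be listed explicitly, and anything else is rejected, which also shows how limited such a sandbox is on purpose.

```python
import ast
import operator

# Only these operators and functions exist inside the sandbox.
ALLOWED_BINOPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
}
ALLOWED_FUNCS = {"abs": abs, "min": min, "max": max}


def safe_eval(source):
    """Evaluate an arithmetic expression using whitelisted nodes only."""
    return _eval(ast.parse(source, mode="eval").body)


def _eval(node):
    # Plain numeric literals (bools, strings and the rest are rejected).
    if isinstance(node, ast.Constant) and type(node.value) in (int, float):
        return node.value
    # Whitelisted binary operators.
    if isinstance(node, ast.BinOp) and type(node.op) in ALLOWED_BINOPS:
        return ALLOWED_BINOPS[type(node.op)](_eval(node.left), _eval(node.right))
    # Calls to whitelisted functions by bare name, positional arguments only.
    if (
        isinstance(node, ast.Call)
        and isinstance(node.func, ast.Name)
        and node.func.id in ALLOWED_FUNCS
        and not node.keywords
    ):
        return ALLOWED_FUNCS[node.func.id](*[_eval(arg) for arg in node.args])
    # Everything else (attributes, imports, names, lambdas, ...) is refused.
    raise ValueError(f"disallowed syntax: {type(node).__name__}")
```

Attribute access, name lookups and unknown calls never reach the host runtime because they are not on the list. Note that this only restricts what the code can reach; it does nothing about CPU or memory use.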

cough cough Illumos Zones cough cough

I’ve read about Erlang/BEAM’s preemptive multitasking, which seems to be a good way to keep sandboxed code from hogging all the CPU resources.

Do you know of any strategies/papers/examples on how to prevent untrusted code from hogging memory resources?

The best I’ve seen so far is just limiting the amount of memory that could be allocated/used but I was wondering if a better strategy is out there.
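
For reference, the "just cap the memory" approach can be sketched at the OS-process level like this. It is a Python illustration for Linux-like systems, not something that runs inside the BEAM. `run_limited` and its parameters are hypothetical names; the `resource` and `subprocess` calls are from the Python standard library.

```python
import resource
import subprocess
import sys


def run_limited(code, max_bytes=512 * 1024 * 1024, timeout_s=10.0):
    """Run a Python snippet in a child process with a capped address space."""

    def apply_limits():
        # Runs in the child just before exec. Never ask for more than the
        # hard limit that is already in place.
        _, hard = resource.getrlimit(resource.RLIMIT_AS)
        limit = max_bytes if hard == resource.RLIM_INFINITY else min(max_bytes, hard)
        # Lower both soft and hard limits so the child cannot raise them again.
        resource.setrlimit(resource.RLIMIT_AS, (limit, limit))

    return subprocess.run(
        [sys.executable, "-c", code],
        preexec_fn=apply_limits,
        capture_output=True,
        text=True,
        timeout=timeout_s,  # raises subprocess.TimeoutExpired if exceeded
    )
```

A snippet that stays under the cap runs normally, while one that tries to allocate far more than `max_bytes` fails inside the child and comes back with a non-zero exit code. This caps one process's address space only; file and network access are not restricted by it.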

You could run it in a process and have the sandboxed code test its memory usage on each reduction or so; if it exceeds a value then GC it, and if it is still exceeded then kill it? I’m doing that currently.

But then in one sense you are not really running untrusted code. The trouble with code is that it can do anything it wants to; it’s pretty much like having root privileges on a machine. The only way is to interpret the code in some way to check what it is doing. For example, even if it were running in a process with max memory set, there is nothing stopping it from starting another process and doing what it wants there.

This was a problem we did not attack.
