Sand: an Elixir sandbox

bopjesvla · November 23, 2020, 2:59pm

I’m very excited to share this!

Sand is a language-level Elixir sandbox. It’s fast and very much experimental. It uses max_heap_size to limit memory usage, reduction monitoring to limit CPU usage, and AST whitelisting to make sure all code is nice and side-effectless. Atom renaming is used to combat the atom table filling up, and binaries are limited to 64 bytes to make sure nothing is stored off-heap.
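To illustrate the AST whitelisting idea, here is a minimal sketch in plain Elixir. It is not Sand's actual implementation: the module name `WhitelistSketch`, the `safe?/1` function, and the tiny set of allowed operators are all assumptions made up for this example. It parses the source into a quoted expression and rejects any node that is not explicitly allowed.

```elixir
defmodule WhitelistSketch do
  # Hypothetical whitelist: a few operators plus integers and variables.
  @allowed_ops [:=, :+, :-, :*, :==, :<, :>]

  # Returns true only if every node of the parsed source is whitelisted.
  # Caveat: parsing creates atoms for unknown names, which a real sandbox
  # would have to guard against (Sand uses atom renaming for that).
  def safe?(source) when is_binary(source) do
    case Code.string_to_quoted(source) do
      {:ok, ast} -> allowed?(ast)
      {:error, _reason} -> false
    end
  end

  # Integer literals are harmless.
  defp allowed?(n) when is_integer(n), do: true

  # Several expressions in a row: every one must pass.
  defp allowed?({:__block__, _meta, args}) when is_list(args),
    do: Enum.all?(args, &allowed?/1)

  # A plain variable such as `x`.
  defp allowed?({name, _meta, ctx}) when is_atom(name) and is_atom(ctx), do: true

  # A whitelisted operator applied to whitelisted arguments.
  defp allowed?({op, _meta, args}) when op in @allowed_ops and is_list(args),
    do: Enum.all?(args, &allowed?/1)

  # Everything else (remote calls, strings, tuples, ...) is rejected.
  defp allowed?(_other), do: false
end

WhitelistSketch.safe?("x = 1 + 2 * 3")
WhitelistSketch.safe?("System.halt()")
```

The first call passes the check and the second does not, because a remote call like `System.halt()` has no matching whitelist clause. Rejecting by default keeps the attack surface limited to what was explicitly reviewed.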

The demo can be found here. The code can be found here. I’m very curious to see if you guys can manage to break out of this thing!

Phillipp · November 23, 2020, 3:57pm

Nice.

The highest number I can use in the demo is factorial.(866). It breaks with 867 and above.

bopjesvla · November 23, 2020, 7:02pm

That’s by design, you only get a small share of RAM.

I will update the error message to make this clearer.

bopjesvla · November 23, 2020, 7:26pm

Done!

On the demo server, I’ve given each program a generous 1 MB of memory. This is configurable, so more computationally heavy stuff could also be run in the sandbox.
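A per-program memory cap of this kind can be sketched with the BEAM's `max_heap_size` process flag, which the first post names. The module `HeapLimitSketch`, its `run/3` function, the word counts, and the timeout are assumptions for illustration, not Sand's actual code. Note that the flag counts machine words rather than bytes, so 1 MB is roughly 131,072 words on a 64-bit system.

```elixir
defmodule HeapLimitSketch do
  # Runs `fun` in its own process with the heap capped at `max_words` words.
  # Returns {:ok, result}, {:error, exit_reason}, or {:error, :timeout}.
  def run(fun, max_words, timeout \\ 1_000) do
    parent = self()
    ref = make_ref()

    # kill: true makes the runtime kill the process once it exceeds the cap.
    heap_opts = %{size: max_words, kill: true, error_logger: false}

    {pid, monitor} =
      :erlang.spawn_opt(
        fn -> send(parent, {ref, fun.()}) end,
        [:monitor, {:max_heap_size, heap_opts}]
      )

    receive do
      {^ref, result} ->
        Process.demonitor(monitor, [:flush])
        {:ok, result}

      {:DOWN, ^monitor, :process, ^pid, reason} ->
        {:error, reason}
    after
      timeout ->
        # Crude stand-in for a CPU limit: give up after a wall-clock timeout.
        Process.exit(pid, :kill)
        Process.demonitor(monitor, [:flush])
        {:error, :timeout}
    end
  end
end

# A small computation fits comfortably in the budget.
HeapLimitSketch.run(fn -> Enum.sum([1, 2, 3]) end, 131_072)

# Building a huge list blows through a 10,000-word budget and gets killed.
HeapLimitSketch.run(fn -> Enum.to_list(1..10_000_000) end, 10_000)
```

Raising `max_words` is all it takes to give heavier programs more room, which matches the "configurable" point above.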

madlep · November 23, 2020, 8:55pm

Nice! This really looks cool. Will have to have a play with it.

The whitelist is pretty sparse though, which limits what you can do. I understand why, as that’s the whole point of a sandbox. It would be interesting to see how it could be expanded without affecting the isolation/security of code running in the sandbox.

ityonemo · November 23, 2020, 11:23pm

if you’re interested in trying something more ambitious:

same principle could go for pre-compiled code

bopjesvla · November 24, 2020, 1:29pm

I’m hesitant to expand the whitelist, but many functions can be re-implemented in the sandbox. This is preferred because it doesn’t increase the attack surface. To take Enum.map as an example:

r enum_map = fn
  [h | t], fun -> [fun.(h) | enum_map.(t, fun)]
  [], _ -> []
end

enum_map.([1, 2, 3], &(&1 * &1))
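For comparison, plain Elixir without `r` does not let an anonymous function refer to itself by name. One common workaround is to pass the function to itself as an extra argument. The sketch below shows that technique; it is not a claim about how `r` is implemented, and the inner name `go` is just illustrative.

```elixir
# Plain-Elixir version of the same map, with no macros involved.
enum_map = fn list, fun ->
  # `go` receives itself as its last argument so it can recurse.
  go = fn
    [h | t], self -> [fun.(h) | self.(t, self)]
    [], _self -> []
  end

  go.(list, go)
end

squares = enum_map.([1, 2, 3], &(&1 * &1))
IO.inspect(squares)
```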

These function re-implementations can be included in Sand. Then, we’d only need one more macro, load, that expands to the relevant function definition:

load(enum_map)

enum_map.([1,2,3], &(&1 * &1))

benwilson512 · November 24, 2020, 1:30pm

@bopjesvla I agree that expanding the surface area increases the attack risk.

I think the question is: What are desired use cases for Sand? Of course one can try to reimplement a safe version of the standard library’s important functions, but this really just creates an “Elixir like dialect” language because you can’t actually run Elixir code in it, only code that looks like Elixir but has a bunch of differences.

bopjesvla · November 24, 2020, 1:42pm

That’s a good question. It should be noted that this limitation is one-directional. You can run Sand code in Elixir if you import the macro r (and, in the future, possibly load).

An example use case would be allowing untrusted users to run bots on a chat site without using a lot of server resources. The server provides user messages to the sandbox and sends the sandbox output back to the chat.

The reason why you’d use this over existing (OS-level) sandbox solutions is that it doesn’t require additional moving parts (just one dependency) and because of the speed and low overhead.

bopjesvla · November 24, 2020, 1:55pm

So I’m not aiming to create something that can run existing Elixir projects, just something that allows users to program select parts of websites.

mat-hek · November 25, 2020, 9:37pm

Looks great, thanks for sharing that! Have you thought about allowing Enum.map(enum, fun) syntax? I mean only supporting the syntax, not real modules. It would be more straightforward for people already familiar with Elixir imho.

bopjesvla · November 27, 2020, 11:34am

Mhmm, you’d have to rewrite Enum.map(l, f) to enum_map_2.(l, f) and &Enum.map/2 to enum_map_2. You’d have to check the left-hand side of the dot at compile time, so some valid Elixir code wouldn’t work:

e = Enum
e.map(l, f)

But that’s rarely used anyway. Seems possible, but there might be some issues I haven’t considered.
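The rewrite described in this post could look roughly like the sketch below, which walks the quoted expression with `Macro.prewalk/2`. The module `RewriteSketch` is hypothetical, and the `enum_<name>_<arity>` variable naming simply follows the post. Because the match is on the literal `Enum` alias to the left of the dot, the `e = Enum; e.map(l, f)` case is left untouched, which is the limitation mentioned above.

```elixir
defmodule RewriteSketch do
  # Rewrites Enum.f(args...) into enum_f_N.(args...) and &Enum.f/N into
  # the variable enum_f_N, where N is the arity. Other code is unchanged.
  def rewrite(ast) do
    Macro.prewalk(ast, fn
      # Capture form: &Enum.map/2. Must come before the call clause,
      # because the capture contains a zero-argument Enum.map call node.
      {:&, _, [{:/, _, [{{:., _, [{:__aliases__, _, [:Enum]}, fun]}, _, []}, arity]}]}
      when is_atom(fun) and is_integer(arity) ->
        sandbox_var(fun, arity)

      # Call form: Enum.map(l, f).
      {{:., _, [{:__aliases__, _, [:Enum]}, fun]}, _, args}
      when is_atom(fun) and is_list(args) ->
        {{:., [], [sandbox_var(fun, length(args))]}, [], args}

      other ->
        other
    end)
  end

  # Builds the variable node, e.g. enum_map_2. This creates an atom at
  # rewrite time, so a real sandbox would restrict it to known names.
  defp sandbox_var(fun, arity) do
    Macro.var(String.to_atom("enum_#{fun}_#{arity}"), nil)
  end
end

"Enum.map(l, f)"
|> Code.string_to_quoted!()
|> RewriteSketch.rewrite()
|> Macro.to_string()
|> IO.puts()
```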