Dune - Sandbox for Elixir

sabiwara · September 19, 2021, 1:01pm

Dune is a sandbox for Elixir and aims to safely evaluate user-provided code.

You can try it out using this basic Elixir playground made with Dune
and LiveView.

iex> Dune.eval_string("IO.puts('Hello world!')")
%Dune.Success{inspected: ":ok", stdio: "Hello world!\n", value: :ok}

iex> Dune.eval_string("File.cwd!()")
%Dune.Failure{message: "** (DuneRestrictedError) function File.cwd!/0 is restricted", type: :restricted}

GitHub: https://github.com/functional-rewire/dune

While only a subset of Elixir is supported due to safety concerns, Dune aims to keep this subset fairly large, including:

a good chunk of the standard library
atoms (without leaks)
module definition (without actual modules), including recursion

My hope is that Dune can help build fun projects such as tutorials or coding games in Elixir.
But use it carefully: evaluating user-provided code on a server is highly dangerous and Dune is still early-stage.

If you manage to break out of the sandbox or find any bug, please open an issue on GitHub or PM me on this forum (sorry, I don’t offer any bounty).

Happy coding!

bryanhuntesl · September 19, 2021, 1:55pm

Interesting. There has been very little work in this area. In Java, sandboxes (class loader + security policy) have facilitated the popularity of Hadoop and large-scale distributed computing workloads. Definitely a small but very interesting step in that direction.

Qqwy · September 19, 2021, 8:07pm

Very cool!
I wonder what @arjan thinks of this library since, if memory serves, he has been working on ‘sandboxed Elixir’ / ‘DSLs built on top of Elixir’ for quite some time.

sbuttgereit · September 25, 2021, 3:01pm

This looks very interesting and may provide another answer to a problem that I’m going to run into in a landscape without many existing solutions (that I can see): user-customizable business logic. I’ve been looking at Luerl for a while (nice to see it mentioned in the README). I haven’t reached this part of my project yet, and won’t for some time, but it’s nice to see others working on this problem.

I noticed that the README in the repo mentions customizable business logic, but your message here really only says it’s targeted more at fun/educational projects. Does that discrepancy reflect what you’re really targeting, or just the fact that this is very early stage?

It will be interesting to see how this plays out and how well it performs… performance being my other concern about this sort of thing on the BEAM. There’s so much more interest in writing BEAM languages than in writing languages that run on the BEAM (if that distinction makes sense) that it does have me wondering if these kinds of sandboxed runtimes are really viable on the BEAM.

Anyway, best of luck!

sabiwara · September 26, 2021, 4:03am

Thanks for your interest in Dune!

I noticed that the README in the repo mentions customizable business logic, but your message here really only says it’s targeted more at fun/educational projects. Does that discrepancy reflect what you’re really targeting, or just the fact that this is very early stage?

Customizable business logic is definitely a use case I have in mind for Dune, and I’m sorry that my message might have given the wrong impression (squeezing a README into a short forum post can be a bit difficult). This was more about my initial motivation for starting the project in the first place: I would be happy to see Dune help build projects that might play a role in Elixir adoption, like tutorials/games which could help people get a taste of Elixir without needing to install anything. But this doesn’t impact the design of the library in any way: Dune is a generic sandbox and is totally agnostic about the use case.

It will be interesting to see how this plays out and how well it performs… performance being my other concern about this sort of thing on the BEAM. There’s so much more interest in writing BEAM languages than in writing languages that run on the BEAM (if that distinction makes sense) that it does have me wondering if these kinds of sandboxed runtimes are really viable on the BEAM.

Regarding the viability aspect, I think the BEAM has some strong advantages over other stacks in some regards: lightweight processes with preemptive scheduling and all the introspection capabilities make it a breeze to run the code in isolation within limited resources: memory, CPU and time. Some other aspects, like the way atoms or modules are globally shared on the VM, were trickier to handle and needed some convoluted workarounds to support. Regarding Elixir in particular, having AST manipulation as a first-class citizen helped a lot, even if it comes with its own challenges because the AST can be rather complex: there are many cases to consider in order to properly restrict what can be executed.
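To illustrate the point about running code in isolation within limited resources, here is a minimal sketch using only standard BEAM facilities. It is not Dune's implementation: the `LimitedRun` module, its option names and its default values are hypothetical and chosen for this example. It bounds memory with the `:max_heap_size` process flag and time with a `receive` timeout; CPU could be bounded in a similar spirit by polling `Process.info(pid, :reductions)`, which is left out here.

```elixir
defmodule LimitedRun do
  # Hypothetical helper, not part of Dune: runs `fun` in a separate process
  # with a heap cap (in words) and a wall-clock timeout (in milliseconds).
  def run(fun, opts \\ []) do
    timeout = Keyword.get(opts, :timeout, 100)
    max_heap = Keyword.get(opts, :max_heap_size, 50_000)
    parent = self()
    tag = make_ref()

    # The VM kills the process as soon as a garbage collection finds its
    # heap above `max_heap` words; `error_logger: false` keeps that quiet.
    {pid, monitor} =
      Process.spawn(
        fn -> send(parent, {tag, fun.()}) end,
        [:monitor, {:max_heap_size, %{size: max_heap, kill: true, error_logger: false}}]
      )

    receive do
      {^tag, value} ->
        Process.demonitor(monitor, [:flush])
        {:ok, value}

      {:DOWN, ^monitor, :process, ^pid, reason} ->
        {:error, reason}
    after
      timeout ->
        Process.exit(pid, :kill)

        # Wait for the process to be gone, then drop a result that may
        # have been sent right before the kill.
        receive do
          {:DOWN, ^monitor, :process, ^pid, _reason} -> :ok
        end

        receive do
          {^tag, _late_value} -> :ok
        after
          0 -> :ok
        end

        {:error, :timeout}
    end
  end
end
```

With these defaults, `LimitedRun.run(fn -> 1 + 1 end)` returns `{:ok, 2}`, `LimitedRun.run(fn -> Process.sleep(:infinity) end, timeout: 50)` returns `{:error, :timeout}`, and building a ten-million-element list inside the function returns `{:error, :killed}`. Note that the result is copied into the caller's heap, which is not limited here, so a real sandbox would also need to bound the size of what comes back.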

I hope that Dune will be able to help when you reach the stage where you need a sandbox, and I look forward to getting feedback if anything is missing or needs to be fixed.

ityonemo · September 26, 2021, 4:49am

If you can really lock this down, there could be something interesting with publicly hosting LiveBooks, though the really tricky thing to secure is going to be “giving access to limited amounts of memory”.

I think it should also be possible to go really low in the stack and disassemble modules, then reassemble them with very carefully injected substitutes for certain things (calls, etc).

voltone · September 26, 2021, 5:29am

The bytecode sandbox approach has been attempted in robinhilliard/safeish on GitHub. From its description: “NOT FOR PRODUCTION USE: Safe-ish is an experimental sandbox for BEAM modules that examines and rejects BEAM bytecode containing instructions that could cause side effects. You can provide an optional whitelist of opcodes and functions the module is allowed to use.”

It turns out to be quite difficult to implement at that level, because of the various ways high-level language features are compiled down to bytecode, after the various optimizations and handling of special cases. Once all the sandbox escape paths have been fully blocked (if that ever happens) I suspect it will also block many perfectly harmless and possibly essential language features just because they happen to compile down to bytecode that is considered too powerful.

Personally I would stick with Luerl until the BEAM offers a native, runtime sandboxing feature…

ityonemo · September 26, 2021, 2:02pm

Oh, I wouldn’t propose blocking bytecode, I would propose rewriting bytecode. E.g. rewriting “send” to actually call a safe registry. There are only about 180-ish opcodes. Shouldn’t be so terribly hard, ha!

But you’re right. A natively sandboxed vm is probably worth waiting for.

sabiwara · September 27, 2021, 12:47am

If you can really lock this down, there could be something interesting with publicly hosting LiveBooks, though the really tricky thing to secure is going to be “giving access to limited amounts of memory”.

Publicly shared LiveBooks would be an awesome application indeed! I think it should be possible to have a working version for simply exploring and sharing code. But for it to be actually useful for data analysis, one would need access to not-so-small amounts of memory, some long-running processing, and probably the ability to fetch data from external sources as well. These requirements might be quite hard to reconcile with a publicly hosted service.

I think it should also be possible to go really low in the stack and disassemble modules, then reassemble them with very carefully injected substitutes for certain things (calls, etc).

Oh, this is a really interesting idea. Thanks @voltone for sharing safeish, I will check it out!

voltone · September 27, 2021, 5:20am

You may want to check out this video of @robinhilliard talking about the experiment (with a cameo by yours truly)…

robinhilliard · September 27, 2021, 5:45am

Hi Sabiwara,

To be clear, Bram was involved in successfully hacking my attempts to make byte code secure with Safeish (a late December rainy holiday project for me), and he in no way condones or supports this approach, it is entirely my fault :-).

Cheers,
Robin

sabiwara · September 27, 2021, 12:20pm

Hi Robin!
Thanks for clarifying the context. This was already more or less my understanding, but I am still curious about this approach.
Even if it didn’t turn out to be successful, it is always interesting to learn from past experiments.
I’ll watch the video, thank you guys for sharing!

sabiwara · September 28, 2021, 1:56am

Thanks a lot for the video, it was really instructive and fun!
TIL about erl_eval, I’ll look into it.

Even if the bytecode approach is pretty different from Dune’s, it was interesting to see that one of the biggest difficulties (and attack surfaces) was also around vetting the allowlist, especially avoiding all these nasty functions like :timer.tc which accept module-function-arguments (MFA) and can be used to escape the sandbox. This is something I tried to be mindful of, but I have also been thinking about adding automated checks, like an audit feature that checks function signatures and docs to try to detect these MFA kinds of patterns.
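The MFA problem can be made concrete with a small sketch. This is neither Dune's nor safeish's code: the `NaiveAllowlist` module and its allowlist entries are hypothetical, chosen only to show why vetting is hard. The checker walks the AST and rejects any remote call that is not on the list, which catches a direct `File.cwd!()` but misses the same call smuggled through `:timer.tc/3` as plain arguments.

```elixir
defmodule NaiveAllowlist do
  # Hypothetical allowlist for illustration only; this is not Dune's list.
  @allowed [{:timer, :tc, 3}, {String, :upcase, 1}]

  # Returns the remote calls found in `code` that are not on the allowlist.
  def rejected_calls(code) do
    {_ast, calls} =
      code
      |> Code.string_to_quoted!()
      |> Macro.prewalk([], fn
        # Matches remote calls such as `Mod.fun(args)` or `:mod.fun(args)`.
        {{:., _, [mod, fun]}, _, args} = node, acc when is_atom(fun) and is_list(args) ->
          {node, [{normalize(mod), fun, length(args)} | acc]}

        node, acc ->
          {node, acc}
      end)

    calls |> Enum.reverse() |> Enum.reject(&(&1 in @allowed))
  end

  # Turns an alias node like `File` into the module atom; leaves the rest as is.
  defp normalize({:__aliases__, _, parts}), do: Module.concat(parts)
  defp normalize(mod), do: mod
end
```

`NaiveAllowlist.rejected_calls("File.cwd!()")` returns `[{File, :cwd!, 0}]`, while `NaiveAllowlist.rejected_calls(":timer.tc(File, :cwd!, [])")` returns `[]`: the checker only sees an allowed `:timer.tc/3` call whose arguments happen to be a module, an atom and a list. Any allowlisted function that applies an MFA on the caller's behalf has the same effect, which is why each entry needs to be vetted for what it does with its arguments and not only for what it is.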

Kudos @voltone for your successful hacking attempts on safe-ish! I should probably put up a bounty as well.

I also took note of your erlref article which is mentioned in the video. I’ll add the link with a warning to my readme.

Thanks again for sharing all of this. I understand that my approach is in contradiction with your recommendations, but I still hope Dune can offer something useful and safe-ish enough for some practical use cases. Even prior to watching your talk, I would never have felt confident recommending it as something that can be run directly on your main Phoenix application that has access to your DB and everything.