Private dependencies - where source code is not exposed?

lok0613 · February 23, 2021, 12:39pm

defp deps do
  [
  ]
end

The packages that mix deps.get fetches are actually source code. What if I have a private package and don’t want to expose my source code? Is there a way to build a package of beam files instead?

Archives and escripts can ship beam files, but it seems people don’t recommend them because they are installed globally.

LostKobrakai · February 23, 2021, 12:45pm

There are no good ways to do what you’re looking for. This is even besides the fact that beam files can be decompiled as well. Generally the consensus seems to be to use licenses and lawyers to keep your code from being misused rather than tech. solutions, which most of the time do just make the lives of legitimate users more difficult.
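To make the decompilation point concrete, here is a minimal sketch, assuming a module compiled with the default compiler options. `Leaky` is a made-up module name. It reads the debug info chunk from the compiled binary with `:beam_lib`, then shows that `:beam_lib.strip/1` removes that chunk:

```elixir
# Compile a throwaway module in memory; `Leaky` is a hypothetical name.
[{mod, beam}] = Code.compile_string("defmodule Leaky do\n  def secret, do: :hunter2\nend")

# With default options the binary carries a debug_info chunk that tools
# can turn back into readable code.
{:ok, {^mod, [debug_info: with_info]}} = :beam_lib.chunks(beam, [:debug_info])
IO.inspect(with_info, label: "debug_info", limit: 5)

# Stripping removes every chunk the loader does not need, debug info included.
{:ok, {^mod, stripped}} = :beam_lib.strip(beam)
{:ok, {^mod, [debug_info: after_strip]}} = :beam_lib.chunks(stripped, [:debug_info])
IO.inspect(after_strip, label: "after strip")
```

Stripping only removes the high-level debug information. The instructions themselves stay in the file and can still be disassembled.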

xlphs · February 23, 2021, 2:51pm

Why not make the package run as a BEAM node and talk to it from your main application?

hauleth · February 23, 2021, 7:59pm

If you allow full rpc and load code from .beam files (the default in releases), then a malicious party can still get code that can be decompiled.

Licenses and lawyers is the only reasonable and safe way to achieve what you want.

lok0613 · February 25, 2021, 10:44am

But why can’t we have it like Java (jar files), with beam files rather than source code? It could be shared and distributed via the package manager as well.

NobbZ · February 25, 2021, 10:52am

You can share precompiled libraries, though there is currently nothing in the ecosystem that helps you with that. You need to manually take care of putting the files where they are expected.

The current ecosystem favors open source.

Also, the problem with precompiled BEAMs is that you need to provide them for each supported major version to get the most out of them. Using OTP 20 compiled BEAMs within an otherwise OTP 22 application structure might cause a slowdown, if I recall some discussion in a chat correctly. And if your libraries require NIFs, things get even more complicated.

eksperimental · February 25, 2021, 12:18pm

I don’t know of any source code obfuscator, but it sounds like that is what might help you.

dimitarvp · February 25, 2021, 2:17pm

You can set up a private Hex repository where only you and your organization will have access to the source code of the dependencies. People have been doing that successfully.

One example: Elixir application deployment using a CI and private Hex.pm dependencies | by Bruno Ripa | Medium

Another, much easier option is to simply point at your dependencies through their GIT repo – and make the repo private. Anyone that needs to use it will use an auth token so they can actually fetch the dependencies and compile them.
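Both options could look roughly like this in mix.exs. This is only a sketch: the package names, the organization name and the repository URL are placeholders, not anything from this thread.

```elixir
# mix.exs (fragment); every name and URL below is a placeholder
defp deps do
  [
    # Private Git repository, fetched with whatever credentials the
    # developer or CI machine has for that host
    {:my_private_lib, git: "git@github.com:my_org/my_private_lib.git", tag: "v1.0.0"},

    # Package published under a private Hex organization
    {:my_private_pkg, "~> 1.0", organization: "my_org"}
  ]
end
```

Either way, whoever can fetch the dependency receives its source code.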

eksperimental · February 25, 2021, 6:10pm

Sorry, I am failing to see how this will prevent the user from having access to the source code.

ityonemo · February 25, 2021, 6:24pm

It won’t. I think that if you were clever you could load an encrypted binary, decrypt it, then load it as a module. You can do it with debug info off, too, which will make disassembly harder.
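A rough sketch of that idea, assuming AES-256-CTR from `:crypto` and a key that is simply generated on the spot. `EncryptedLoader` and `Sideloaded` are made-up names, and a real setup would need to decide where the key comes from:

```elixir
# Hypothetical helper; the cipher choice and key handling are assumptions.
defmodule EncryptedLoader do
  @cipher :aes_256_ctr

  # Build time: encrypt the compiled BEAM bytes.
  def encrypt(beam, key, iv), do: :crypto.crypto_one_time(@cipher, key, iv, beam, true)

  # Run time: decrypt in memory and hand the bytes straight to the code server,
  # so no plain .beam file is written to disk.
  def load(module, encrypted, key, iv) do
    beam = :crypto.crypto_one_time(@cipher, key, iv, encrypted, false)
    :code.load_binary(module, ~c"encrypted", beam)
  end
end

# "Build time": compile a module to bytes and encrypt them.
[{mod, beam}] = Code.compile_string("defmodule Sideloaded do\n  def answer, do: 42\nend")
key = :crypto.strong_rand_bytes(32)
iv = :crypto.strong_rand_bytes(16)
encrypted = EncryptedLoader.encrypt(beam, key, iv)

# Unload the copy that Code.compile_string/1 loaded as a side effect.
:code.delete(mod)
:code.purge(mod)

# "Run time": decrypt, load, call.
{:module, ^mod} = EncryptedLoader.load(mod, encrypted, key, iv)
result = apply(mod, :answer, [])
IO.inspect(result, label: "answer")
```

The key still has to live somewhere the VM can reach, so this makes things harder rather than impossible.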

If you weren’t afraid to dig into the Erlang source code, this would probably be relatively easy to implement in C.

dimitarvp · February 25, 2021, 6:24pm

Ah, it will not prevent access to an internal team. Only to everybody else. Many found that to be good enough so I’m including it here.

olivermt · February 26, 2021, 6:48am

Once you have a module, it’s trivial to disassemble.

ityonemo · February 26, 2021, 7:00am

That’s not really true. Do a defmodule in iex (or an exs file) and try to disassemble it.

hauleth · February 26, 2021, 8:59am

FTFY. It is actually impossible to fetch the code for the in-memory modules which was my problem when I wanted to use Concuerror in ExUnit.
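A small sketch of what that looks like from inside the VM, assuming a module that exists only in memory. `OnlyInMemory` is a made-up name. The module is loaded and callable, but the code server has no object code to hand back because there is no .beam file on the code path:

```elixir
# Define a module without ever writing a .beam file; the name is hypothetical.
[{mod, _beam}] = Code.compile_string("defmodule OnlyInMemory do\n  def hello, do: :world\nend")

# The module is loaded and callable...
loaded? = :code.is_loaded(mod) != false
greeting = apply(mod, :hello, [])

# ...but :code.get_object_code/1 searches the code path for a .beam file,
# finds none, and returns :error.
object_code = :code.get_object_code(mod)
IO.inspect({loaded?, greeting, object_code})
```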

olivermt · February 26, 2021, 11:24am

TIL, I stand fully corrected.

That raises some interesting scenarios where you can sideload code based on a licensing model etc if you are forced to deploy into a customer environment.

That is actually very cool.
Is there really no way to fetch the module code at runtime if you control the VM? That sounds weird to me for an interpreted language.

Edit: Sloppy wording, virtualized language then. In Java it’s fairly easy to grab a classloaded class and decompile it; that’s where my assumption came from.

NobbZ · February 26, 2021, 1:26pm

Elixir isn’t interpreted. It’s compiled into beam bytecode.

al2o3cr · February 26, 2021, 1:37pm

The BEAM VM was designed with a significantly different set of requirements than the Java VM; the Java designers were specifically aiming to make dynamic code-loading from network sources work (remember it was for set-top boxes back when it was called Oak).

As a result, Java included internal security barriers etc for isolating code (the ClassLoader infrastructure has been there since the beginning) that aren’t present in the BEAM.

Compatibility is also a concern: compiled BEAM code is not designed to be particularly portable, because the assumption is that it will be compiled specifically for the target machine. IIRC there’s no guarantee of forward compatibility (compiling with OTP 24 and running on less → no go) and only a limited window of backward compatibility (OTP 24 only promises to run OTP 22+ output).

derek-zhou · February 26, 2021, 3:51pm

I agree that you need licenses and lawyers to protect your intellectual property. However, there could still be valid cases where someone wants to hide source code. For example, maybe I just don’t want my customer to see my embarrassingly sloppy source?

Is it possible to distribute intermediate compilation results, like core erlang or textual erlang AST? I would prefer that to a stripped BEAM code.

ityonemo · February 26, 2021, 7:10pm

Is there really no way to fetch the module code at runtime if you control the VM? That sounds weird to me for an interpreted language.

Not directly, via the VM. Beam bytecode looks dramatically different from beam assembly, and each is optimized for a different use case: bytecode is (roughly speaking) optimized for in-memory access and fast CPU instruction selection and switching, while assembly is (roughly speaking) optimized to have a universally understandable data layout on persistent media.

If you have access to the memory of the machine, you could in theory reverse engineer much of the code from the in-memory contents, but I think that would be very difficult.

Maybe an enterprising person who cares about IP and DRM (not I) could figure out how to run enough of a BEAM process inside of an SGX enclave to do “interesting things” but… Haha probably not worth the effort, and besides SGX is a terrible idea IMO. Already we see exploits where someone can run malicious code inside the SGX enclave and therefore hide it from the hypervisor lololol

Is it possible to distribute intermediate compilation results, like core erlang or textual erlang AST? I would prefer that to a stripped BEAM code.

Yes. All the tools are there in modules that ship with the beam.
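As a hedged illustration of which tools those might be: the sketch below, assuming a module compiled with debug info left on, turns the module into textual Erlang with `:erl_pp` and into Core Erlang with `:compile` and `:core_pp`. `Shipped` is a made-up module name.

```elixir
# Compile a sample module in memory; the name is hypothetical.
[{mod, beam}] = Code.compile_string("defmodule Shipped do\n  def double(x), do: x * 2\nend")

# Pull the debug info chunk and ask its backend for Erlang abstract forms.
{:ok, {^mod, [debug_info: {:debug_info_v1, backend, data}]}} =
  :beam_lib.chunks(beam, [:debug_info])

{:ok, forms} = backend.debug_info(:erlang_v1, mod, data, [])

# Textual Erlang, one pretty-printed form at a time.
erlang_source = forms |> Enum.map(&:erl_pp.form/1) |> IO.chardata_to_string()

# Core Erlang, produced by stopping the Erlang compiler after the core pass.
{:ok, ^mod, core} = :compile.noenv_forms(forms, [:to_core])
core_source = core |> :core_pp.format() |> IO.chardata_to_string()

IO.puts(erlang_source)
IO.puts(core_source)
```

Both outputs are plain text that could be shipped and compiled on the target machine, though they remain quite readable.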

cenotaph · February 27, 2021, 3:45pm

I may be coming too late to the conversation, but why don’t you put your package into your private repository and serve the dependency from there?

If you never want to share your source code, well, then make it a black-box service and offer it as a SaaS dependency to the project?