Proposal: Private modules (general discussion)

josevalim · January 15, 2019, 11:09am

Moderator's note:

@josevalim has posted a conclusion in Proposal: Private modules (general discussion), post #143.

While Elixir has private functions, it does not have the concept of private modules. This makes it harder for applications and libraries to define clear boundaries and communicate intent clearly.

As an example, when Elixir v1.7 was released, it broke some libraries that were using Elixir’s private APIs. This gives an impression of instability and immaturity in the ecosystem. Even more worrying is that this practice can be really harmful in the long term as systems grow in size. If we, as a community, fail to define boundaries and fail to respect compatibility, updating only a small part of the system becomes impossible, because a minimal change breaks many unwarranted things along the way. It usually goes like this: let’s update Elixir! Unfortunately, updating Elixir breaks package X because X used a private API. So we have to update package X too, but wait! That breaks Y and Z. Soon you find yourself having to update the whole system at once.

For these reasons, it is desirable to have a better way to outline boundaries and communicate intent. This is not only useful when working with dependencies. Even within the same library or application, developers can use well-defined boundaries to better organize their codebase and reveal intent to their coworkers.

However, one of the questions on this topic is how strict the private module system has to be, as there are many situations in which we would like to bypass it.

As @scarfacedeb mentioned in another thread:

I can think of at least 2 uses of open private modules:

As @JEG2 said, sometimes you have to use private modules in IEx in prod. You may encounter an unexpected error that you didn’t anticipate in your code and the quickest way to debug it is to call internal modules by hand and check the results. You could also use tracing in these cases, but I don’t see why we can’t have both.

When I’m learning how a new library (or app) works, I often call its internal modules directly to experiment and get a better idea of how they work under the hood. Right now that is easy to do in IEx and it doesn’t require knowing about any new concepts (such as private modules, their visibility, etc).

Making code easy to explore is useful in production and while learning too.

With this in mind, this proposal is going to highlight four possible implementations for further discussion. Before we get to the possible implementations, we need to establish some common ground. Note that the APIs in this proposal are not final and are meant to be examples. Once an approach is chosen, we can have a separate discussion to refine its APIs.

Best-effort warnings

One possible implementation of “private modules” is to provide best-effort warnings. This would work by annotating the visibility of a module, such as:

defmodule MyApp.Private do
  @module_visible_to [MyApp]
end

Now invoking MyApp.Private outside of MyApp will emit a warning that the module is private and may not be accessible externally.

It is important to note that, when code is compiled, Elixir does not actually guarantee that the module you are calling exists. For example, if you have this function:

defmodule Foo do
  def bar, do: Bar.baz
end

The code will compile even if Bar is not defined. While mix does warn in cases like this, those warnings are “best-effort”. For example, the code below, while semantically the same as the code above, won’t warn:

defmodule Foo do
  def bar do
    mod = Bar
    mod.baz
  end
end

which means that “best-effort warnings” for private modules can be easily bypassed by doing:

defmodule Foo do
  def bar do
    mod = MyApp.Private
    mod.baz
  end
end

Therefore, the only way we could consistently warn when a private module boundary is broken is if private modules must be required (via require/2) before they are used. Otherwise, we can only provide best-effort warnings, which are extremely easy to bypass and may not be displayed as frequently.
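A minimal sketch of why dynamic calls escape compile-time checks, runnable on current Elixir. DynamicCall is a hypothetical name used only for illustration; Bar and baz follow the examples above, with Bar defined here so the snippet runs.

```elixir
defmodule Bar do
  def baz, do: :ok
end

# Hypothetical helper: the module arrives as a plain value, so the
# compiler cannot tell at compile time which module is being called.
defmodule DynamicCall do
  def call(mod), do: mod.baz()
end

# An alias is just an atom with an "Elixir." prefix.
IO.inspect(Bar == :"Elixir.Bar")
# => true

# Both calls are resolved only at runtime.
IO.inspect(DynamicCall.call(Bar))
# => :ok
IO.inspect(apply(Bar, :baz, []))
# => :ok
```

Because the target module is only known at runtime, any check that relies on seeing the module name at the call site cannot cover these calls.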

Guaranteed warnings/errors (defmodulep)

If we want to have guaranteed warnings, private modules must be explicitly required before usage. One possible implementation of such a mechanism is to introduce a defmodulep construct that defines a module in a separate namespace:

defmodulep MyApp.Private, visible_to: [MyApp] do
  def hello do
    IO.puts "hello world"
  end
end

In the definition above, only MyApp and modules nested under it can access MyApp.Private. To access a private module, you must explicitly require and alias it:

defmodule MyApp.Other do
  require MyApp.Private, as: Private
  Private.hello
end

The require is necessary to validate the visibility rules. The alias is required to bring the private module to the current namespace. The require+alias mechanism is essential to this alternative.

If we decide to go the defmodulep route, we have three options:

Modules must be explicitly required+aliased and it will error if you break its boundaries. The namespace the module will be assigned to is private, which means you have no official ways of accessing a private module beyond its original intent.
Modules must be explicitly required+aliased and it will error if you break its boundaries. However, the namespace the module will be assigned to is public, which means you can access it directly, without any visibility check, by using its long name. For example, defmodulep Foo.Bar would be accessible directly via :"Elixirp.Foo.Bar", which could also be stored in a variable and passed around.
Modules must be explicitly required+aliased but it warns instead of erroring if you break its boundaries.
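To make the public-namespace option concrete: on current Elixir a module name is just an atom, so a module can already be defined and called under an arbitrary atom. The sketch below only simulates what a name such as :"Elixirp.Foo.Bar" would look like to callers. defmodulep does not exist today, and the Elixirp prefix is the illustrative name from the example above, not a real namespace.

```elixir
# Simulation only: a regular module defined directly under the
# illustrative "Elixirp." atom name used in the proposal.
defmodule :"Elixirp.Foo.Bar" do
  def hello, do: "hello world"
end

# Direct call through the long name, with no visibility check involved.
IO.inspect(:"Elixirp.Foo.Bar".hello())
# => "hello world"

# The name is an ordinary atom, so it can be stored and passed around.
mod = :"Elixirp.Foo.Bar"
IO.inspect(mod.hello())
# => "hello world"
```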

Rejected ideas

The following ideas were rejected:

Declaring the module visibility per package or application. The Elixir language and the compiler do not have the concept of “applications”. Applications and packages are purely a build tool construct. In a way this is great, because the language is small and we build features on top, but it also means we cannot implement a construct such as visibility per package as part of the language.

Proposals

With this in mind, we have four proposals (A, B, C and D). Please critique these options and share your rationale: why you like some and why you dislike others.

They are:

A. Provide @module_visible_to annotations with best-effort warnings

B. Provide defmodulep where modules must be explicitly required+aliased and it will error if you break its boundaries. The namespace the module will be assigned to is private, which means you have no official ways of accessing a private module beyond its original intent.

C. Provide defmodulep where modules must be explicitly required+aliased and it will error if you break its boundaries. However, the namespace the module will be assigned to is public, which means you can access it directly, without any visibility check, by using its long name. For example, defmodulep Foo.Bar would be accessible directly via :"Elixirp.Foo.Bar".

D. Provide defmodulep where modules must be explicitly required+aliased but it warns instead of erroring if you break its boundaries.

If you can think of other implementations and approaches, please drop a comment so we can amend the proposal accordingly.

Thank you!

josevalim · January 15, 2019, 11:13am

Here is my personal opinion (I have tried to be unbiased in the proposal as much as possible):

Proposal A, with “Best-effort warnings”, is not going to cut it. It is too easy to bypass, and the warnings are not guaranteed to be emitted either.

Proposal B, given the community feedback, is not a good choice either. It means debugging live systems becomes very hard. In the worst scenario, everyone will use hacks to access private modules. So I believe the solution is to make private modules accessible but “ugly” enough to signal that accessing them directly is discouraged.

Therefore, I am personally ok with C and D, but with a stronger preference for C, as I believe errors send a stronger message.

hauleth · January 15, 2019, 11:16am

The checks could be bypassed in proposals B, C, and D as well. It takes a slightly more explicit route, but it is still possible (and it cannot be prevented, as preventing it would break behaviours, i.e. you could not have a private GenServer):

defmodulep Private.Foo, visible_to: [Private] do
  def hello, do: "Hello World"
end

defmodule Private do
  require Private.Foo, as: Foo

  def call, do: Bar.call(Foo)
end

defmodule Bar do
  def call(mod), do: mod.hello()
end

As I am generally in favour of the proposal, I would vote for warnings, and potentially for proposing an EEP as well to provide such a feature to the rest of the platform (then, possibly, with a hard error).

On second thought, the C solution is okay-ish as well, I think: it provides enough “obscurity” to discourage direct usage, but at the same time it allows direct calls.

Shikada · January 15, 2019, 11:24am

C seems just right to me. Clear signal of intent without complicating debugging or intentional bypassing.

lpil · January 15, 2019, 11:51am

Could you clarify what this means? How is this privacy achieved?

josevalim · January 15, 2019, 11:57am

A proof of concept of this mechanism can be seen in the closed proposal: Proposal: Private modules (implementation specific) (closed) - #25 by josevalim. In a nutshell, the name is arbitrary and it could be changed at any time or on any new Elixir release without notice.

NobbZ · January 15, 2019, 11:57am

I’m pretty sure there will be no way to really lock private modules down so that they are unusable by outside code. But we already agreed that we do not want that, since it makes debugging live systems harder than necessary.

Also, I’m not very keen on warnings. Just pick a random package from hex and compile it. Most of them still have a lot of warnings because they still use the deprecated name for charlists but do not want to update, because they want to support older versions of Elixir as well. Or because of imperative assignment or because… So just another warning will simply not be visible through the noise.

If compilation fails, though, then people will actually know they are doing something wrong, and they have to actively work around the error.

Therefore A and D are a no-go for me. B would make it hard to debug through iex, as I always need to be in a defmodule first to be able to circumvent visibility (or find out the mangled name of the module as it is valid at least during this session).

As C would make using those modules in production still possible and relatively convenient, I’m in favor of this.

Eiji · January 15, 2019, 12:00pm

I 100% agree with @NobbZ!

lpil · January 15, 2019, 12:01pm

Thank you.

With option C would :"Elixirp.Foo.Bar".some_fn() generate a warning?

LostKobrakai · January 15, 2019, 12:03pm

What @NobbZ ( C ) said and a way to disable the hard failing via a cli flag for the local project.

josevalim · January 15, 2019, 12:04pm

No. But I may randomly show up at your doorstep…

…to discuss the importance of respecting contracts.

NobbZ · January 15, 2019, 12:05pm

No warning, no error. Just look at the sentence right before the example you quoted here.

So no switches are even necessary, as @Eiji asks for.

lpil · January 15, 2019, 12:08pm

Lovely! I’ll put the kettle on and await your arrival

I’m voting C, though would prefer it if a warning was printed when the module atom is called directly.

NobbZ · January 15, 2019, 12:10pm

That’s probably impossible to implement without teaching the BEAM about hidden modules…

lpil · January 15, 2019, 12:12pm

It’d be trivial to add a warning for :"Elixirp.Whatever".call(), just check what the prefix of the atom is after parsing in the compiler.

I’m not suggesting that mod = :"Elixirp.Whatever"; mod.call() print a warning.

josevalim · January 15, 2019, 12:15pm

Unfortunately that may have false positives. For example, if you do this:

require SomethingPrivate, as: Private
some_macro(Private)

And then some_macro(Private) calls Macro.expand and returns a quoted expression that calls something on the expanded atom. We would need to track whether the atom came from the context or was written by the user, but atoms do not have metadata. Alternatively, it could be done in the parser, but that’s likely not the time or place to do it.
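A rough sketch of that false-positive scenario, runnable on current Elixir with ordinary modules standing in for private ones. MacroDemo, Caller and the hello function are hypothetical names for illustration; SomethingPrivate and some_macro follow the snippet above.

```elixir
# Ordinary module standing in for a private one.
defmodule SomethingPrivate do
  def hello, do: "hello world"
end

defmodule MacroDemo do
  defmacro some_macro(alias_ast) do
    # Expands the alias AST (e.g. Private) into the full module atom
    # using the caller's environment.
    mod = Macro.expand(alias_ast, __CALLER__)

    # The returned AST contains only the bare atom. Nothing records
    # that it came from an alias rather than being typed by the user.
    quote do
      unquote(mod).hello()
    end
  end
end

defmodule Caller do
  require MacroDemo
  require SomethingPrivate, as: Private

  def run, do: MacroDemo.some_macro(Private)
end

IO.inspect(Caller.run())
# => "hello world"
```

Here the user only ever wrote the alias, yet the code handed back to the compiler calls the expanded atom directly, which is the shape a prefix-based warning would flag.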

derekkraan · January 15, 2019, 12:15pm

Currently we signal private modules with @moduledoc false. A isn’t much different from this; it only adds a warning.

In B, you say:

you have no official ways of accessing a private module beyond its original intent.

But, there is still a way: copy and paste the source code into your own function. Maybe this isn’t so bad?

LostKobrakai · January 15, 2019, 12:25pm

The critical usage outside the original intent is when you’re debugging at runtime. I’m not sure copy/pasting is an option for that use case.

derekkraan · January 15, 2019, 12:46pm

That’s true, but maybe developers will make public modules to help with debugging if the private modules are completely unavailable. It’s something that people will consider when applying defmodulep.

tcoopman · January 15, 2019, 12:52pm

I like this goal of outlining boundaries and communicating intent explicitly! For that reason I don’t think really locking down all possible workarounds is necessary.

I think my current preference goes to C and D (public namespace). I’m picking these because I find that defmodulep with the visible_to syntax communicates intent more obviously than @module_visible_to. The public namespace makes sure that exploring and IEx still work.

I’m not sure I like the require part, although another possible advantage of the require part is that it makes the user of a module explicit.

No real preference on errors over warnings.