Separating concerns with Supervision trees

Fl4m3Ph03n1x · August 21, 2019, 9:11am

Background

I have recently finished @pragdave 's online course and I was rather happy with all the architectural insight I got from it. Dave specifies in his course that his approach differs from the one used by the community (no surprises here for me) but I didn’t think this would impact me that much … until I started using process trees.

Here is a small example on how Dave would organize an app:

Interface file. It delegates to a server in this case.

defmodule FootbalEngine.Populator do
  @moduledoc """
  Interface for the populator that fills up the memory table (populates it) with
  data.
  """

  alias FootbalEngine.Populator.Server

  @spec new(String.t) :: GenServer.on_start
  def new(path), do: Server.start_link(path)
end

Server file. It has all the OTP logic and GenServer behaviours and callbacks:

defmodule FootbalEngine.Populator.Server do
  @moduledoc """
  Server for the Cache. Tries to populate it with data and if it gets anything
  other than a complete success for the indexation, it will keep trying to
  repopulate the memory tables.
  """

  use GenServer

  alias FootbalEngine.Populator.Cache

  ###############
  # Public API  #
  ###############

  @spec start_link(String.t) :: GenServer.on_start
  def start_link(path), do:
    GenServer.start_link(__MODULE__, path)

  ###############
  #  Callbacks  #
  ###############

  @impl GenServer
  @spec init(String.t) :: {:ok, String.t} | {:stop, any}
  def init(file_path) do
    :persistent_term.put(:indexation_status, :initializing)
    check_file_with_msg(file_path, {:ok, file_path})
  end

  @impl GenServer
  def handle_info({:check_status}, file_path), do:
    check_file_with_msg(file_path, {:noreply, file_path})

  ###############
  # Aux Functs  #
  ###############

  @spec check_file_with_msg(String.t, any) :: any
  defp check_file_with_msg(file_path, msg) do

    case Cache.populate(file_path) do
      status = {:ok, :indexation_successful} ->
        :persistent_term.put(:indexation_status, status)

      bad_status ->
        :persistent_term.put(:indexation_status, bad_status)
        {:ok, _ref} = :timer.send_after(15_000, {:check_status})
    end

    msg
  end

end

Logic file. Contains the logic used by the GenServer.

defmodule FootbalEngine.Populator.Cache do
  @moduledoc """
  Reads the CSV file, validates and parses its data and then populates the
  memory tables (the DB) with it's information.
  """

 #logic code here
 # def populate .....

end

Here we have a really good separation of concerns:

one file for the interface
one file for OTP and GenServer behaviours
one file for the program’s logic

The challenge

So, now that we have this neat interface it’s time to use it. Let’s say I have an OTP app, and I want to add populator to my supervision tree, as is normal in Elixir apps.

How would I do it?

The ideal solution would be to use the Interface file, namely using FootbalEngine.new/1.

    children = [
      {FootbalEngine, file_path}
    ]

    opts = [strategy: :one_for_one, name: FootbalInterface.Supervisor]

    Supervisor.start_link(children, opts)

But if you try it, you will soon realize it fails. It fails because FootbalEngine is not an OTP compatible behaviour, it doesn’t even have a childspec function. It is simply an interface that delegates to another module.

The obvious solution here is to fix it the following way:

    children = [
      {FootbalEngine.Populator.Server, file_path}
    ]
    opts = [strategy: :one_for_one, name: FootbalInterface.Supervisor]
    Supervisor.start_link(children, opts)

But this breaks the encapsulation principle Dave has tried to create by placing the Server module behind an interface. We, the dummy users, are not supposed to know Populator uses a GenServer behind the scenes. We are only supposed to know about it’s interface.

Paradox

So now I have a paradox. I want to have an interface that hides implementation details (such as, does this use a GenServer, or GenStage or does this even use processes?) but at the same time, if I want to make an OTP supervision tree, I need to expose these details and break the encapsulation of my interface.

Questions

Is this a signal my interface is poorly designed?
How can I hide implementation details while still making supervision trees possible?
Are these 2 approached incompatible in nature? (should I just quit trying to separate concerns like Dave does in his courses?)

Your opinions and ideas are welcome !

LostKobrakai · August 21, 2019, 9:57am

The paradox is not in the architectual part of your design. It’s a technical issue, because the public interface you support doesn’t match the public interface expected by the supervisor. For your public interface to support being used in supervision trees like you showed it needs to implement child_spec/1, which is what’s used by the supervisor.

You can implement it in FootbalEngine.Populator like so:

defdelegate child_spec(args), to: FootbalEngine.Populator.Server

Now your public interface supports that usage, but still doesn’t do anything by itself. It’s still just a plain wrapper module.

Fl4m3Ph03n1x · August 21, 2019, 10:08am

Hey there!
I considered this approach as well but … if my FootbalEngine interface has a child_spec, doesn’t it mean I am breaking encapsulation? By doing so I am telling my users “Hey, this module/library/component is/has a GenServer!”

This is an implementation detail. Is this the only way to fix the paradox?
Do you have any other opinions regarding my 3 questions? Don’t be shy, I always love your feedback !

LostKobrakai · August 21, 2019, 10:41am

This is not really the case though. Having a child_spec doesn’t mean you have a GenServer. It means “This library has components, which can be started as a child of a supervisor”. It could be a GenServer, Agent, Supervisor or even a whole tree of things. The user doesn’t come in contact with any implementation details and you’re still free to switch out what actually get’s started at any time, without the user needing to change a line, which is normally the whole reason for interfaces. child_spec/1 isn’t an implementation detail imho. What it actually starts is.

As the intended usage for your library seems to include “being startable as child of a supervisor” I don’t see any reason for not fulfilling the expected contract though. What exactly do you gain by not implementing child_spec/1 besides probably a lot of unnecessary indirection?

rvirding · August 21, 2019, 2:34pm

I personally think that separating a server into 3 files is actually a bit too much. I use file and just be careful to keep the different types of functions in in separate sections of the file. So there will one section with the user/admin interface functions, one section with the behaviour callbacks, and a final section with any extra internal logic functions needed. This has one benefit in the it limits the exports to only those functions which actually need to be exported.

pragdave · August 21, 2019, 3:14pm

I think the issue you’re seeing is a common one among Elixir developers: people tend to conflate the supervision structure with the design of the code. In reality, the two are distinct: supervision is about starting and stopping things, and the code is about doing things.

So the interface to your server; the API; does not include the code that knits it into the application when it starts.

I’d just put the server module in the supervisor parameters.

Dave

Fl4m3Ph03n1x · August 22, 2019, 7:38am

So, you would follow my original solution and forego the FootbalEngine interface altogether?

A consideration to have in mind is that FootbalEngine.Populator is a component of FootbalEngine itself (following your folder structure for projects).

The interface my component exposes to the world is that of FootbalEngine which has a couple of methods. Populator is a part, a component if you may, of the FootbalEngine.

If I understand your quote correctly:

I’d just put the server module in the supervisor parameters.

then interpret that you would do the following:

    children = [
      {FootbalEngine.Populator.Server, file_path}
    ]
    opts = [strategy: :one_for_one, name: FootbalInterface.Supervisor]
    Supervisor.start_link(children, opts)

But thinking about this, now the client that uses my FootbalEngine knows about a Populator thing. And a Populator.Server thing. And it shouldn’t ! Right?

Doesn’t this approach go against the message you try to pass on your courses ?
Am I miss understanding / miss interpreting something?

I truly appreciate you replying to this post, thank you !

Fl4m3Ph03n1x · August 22, 2019, 8:00am

Excellent points, I find hard to refute. What do I have to gain?
Well, I can’t answer that question, at least not for now, because I don’t yet have a clear solution to compare your suggestion to.

I am trying to understand if my design choices are good and how @pragdave would attack this issue, so then I can make up my mind. At the moment I have your solution quite clear in my head and I like it, but it is not clear to me what I am supposed to do based on Dave’s courses, so I can’t yet tell you the tradeoffs both answers would imply.

(Story time!)

Many years ago, I used to do C# in a consultant company.

In C# there was (I believe there still is) a primitive that would allow you to divide your code into sections, so that you could fold/hide your code when working with your IDE of choice. So, as I was working with files hundreds or even thousands of lines long, I started to think “Hey, this section primitive (or whatever its name was) looks really good. I can have files 1000 lines long and only see 100 at a time. I should learn more about it so I can use it more efficiently.”

And so I embarked on a journey to learn about this awesome primitive, hoping to create sections and subsections in my files hundreds of lines long.

However, as I read through blogs and documentation I realized something: this feature wasn’t awesome, it was horrible. By far, the general consensus of the community was that if you are creating sections like that in your code, then you should probably create another file/class for that section, because it means you are mixing several responsibilities in the same file/class. So the overall suggestion was: separate your responsibilities and don’t use this feature. In fact, many even considered the usage of the feature a code smell.

Truth is, managing behemoth classes was hard, and sections were not making it easier. With both years and time, I ended up ditching that feature and refactored the code to be more concise.

I understand you like to have everything in the same file. I am not here to tell you are wrong. Perhaps your files are only 200 lines long. Perhaps your GenServers are really isolated or simple and perhaps your teams is already used to that system and is really efficient with it. By all means, use what is best for you.

However, from my experience, the thing that works best for me is to have things into more files. Obviously, if I have a 10 lines GenServer, I am not going to separate it into 3 files, but our apps rarely are that small. They rarely are small, with code easily reaching 500 lines (the smallest ones).

Now, I am a slow person. I can rarely go 100% into a PR and review more than 200/300 lines of code in an hour. 500 to just seems like a lot, at 2 hours into a PR I just want to take a break for a coffee or jump out of a window - whichever is closest.

Your experience is valuable and I really appreciate you sharing your opinion, I also use your suggested approach from time to time after all. But for this specific case, I am looking to explore new terrain and I hope you are as excited as I am.

Also, I hope I didn’t bore you to death with my Story time (trademark!)

peerreynders · August 22, 2019, 2:37pm

I’m having difficulty understanding the point you are making here:

Fl4m3Ph03n1x:

then interpret that you would do the following:
    children = [
      {FootbalEngine.Populator.Server, file_path}
    ]
    opts = [strategy: :one_for_one, name: FootbalInterface.Supervisor]
    Supervisor.start_link(children, opts)
But thinking about this, now the client that uses my FootbalEngine knows about a Populator thing. And a Populator.Server thing. And it shouldn’t ! Right?

What are you referring to as “the client” here (in reference to the code snippet you have shown)?

Aside

I’m also wondering whether you are basing your notion of an interface (or contract) too closely on the codified manifestation found in statically typed languages like Java, C#, Go, etc.

Programming to an Interface, not an Implementation (GoF, p.17) doesn’t require the existence of an explicit interface.

The “implicit supervisory management interface” used by a supervisor against it’s children is used specifically to

spawn a child process
terminate a child process
detect the unexpected termination of a child process

and the means to do so are generically based on the primitives exposed by the run-time system, the BEAM - without any direct dependency on the child implementation.

It’s the child processes that have to adhere to some constraints to be able to be supervised.

Terminating a child process
At the lowest level the supervisor can simply send an OTP stop message to the child process. This is typically handled on the level of OTP behaviour modules and only manifests in the callback module as a terminate/2 function. However non-responsive children can be terminated with signals via Process.exit/2 - with :kill if necessary.

So here we’re using process mailboxes (send/2 and receive/1) and signals - all generic BEAM primitives.

Detecting child process termination
This is accomplished by trapping exits along process links or by using process monitors - both of which are generic BEAM primitives.

Java has tried to force error handling into their static typing with checked exceptions with mixed results and reception and none of these languages have anything like links and monitors on the language level.

Spawning a child process
Spawning a process is another BEAM primitive. In terms of a supervisor

{FootbalEngine.Populator.Server, file_path}

is a short form of

%{
  id: FootbalEngine.Populator.Server,
  start: {FootbalEngine.Populator.Server, :start_link, file_path}
  restart: :permanent,
  shutdown: :infinity,
  type: :worker,
  modules: [FootbalEngine.Populator.Server]
}

The point being - it’s configuration data - there is no dependency on any particular (child) implementation.

Granted

    children = [
      {FootbalEngine.Populator.Server, file_path}
    ]
    opts = [strategy: :one_for_one, name: FootbalInterface.Supervisor]
    Supervisor.start_link(children, opts)

chooses to hardcode that configuration in this particular place - but that “configuration” could be hypothetically injected from anywhere.

The point of this longwinded exploration is this - there certainly is a conceptual (management) interface/contract between the supervisor and the child process - but:

The supervisor implementation “defines” the “supervision control interface” that has to be satisfied by the child process
The child process has to adhere to that “interface”
The supervisor is isolated from the child implementation by only using BEAM primitives to interact with the child process

So the supervisor-child interface is never fully, explicitly expressed in code and the supervisor has no direct dependency on the implementation of the child process. The supervisor simply is handed “data” that determines which type of process is started and how it will be started and run and it only interacts with the running child processes indirectly via generic BEAM primitives.

The supervisor is never coupled with the child implementation.
The child process exhibits Consumer-to-Contract coupling (i.e. the “good” kind). The child process is the “client” because it wants to be supervised.

So all is as it should be - even in the absence of a full textual (i.e. code) representation of the interface - the interface emerges from the supervision principles that are part of the design of OTP which itself leverages the primitives implemented by the BEAM.

LostKobrakai · August 22, 2019, 3:19pm

I know from prev. discussions on the topic, that the library won’t start the server by itself, but any user of the library can start a Populator or maybe even multiple ones. So the client is the user of the FootbalEngine library, which would add the populator to their own supervision tree.

Fl4m3Ph03n1x · August 22, 2019, 3:23pm

And since Populator is an internal detail from FootbalEngine, it hurts my soul to see the client, which should only know and use the methods FootbalEngine exposes, to know about an internal detail of the library it is using.

peerreynders · August 22, 2019, 3:28pm

Would

    children = [
      FootbalEngine.process_spec(file_path)
    ]
    opts = [strategy: :one_for_one, name: FootbalInterface.Supervisor]
    Supervisor.start_link(children, opts)

make you happier?

Fl4m3Ph03n1x · August 22, 2019, 8:43pm

@LostKobrakai made the same suggestion, which I very much enjoyed. Safe to say it is the solution winning the most traction.

I just don’t quite get if I learned Dave’s lesson well enough, so I was wondering how he, according to his philosophy, would attack the issue.

Now I am thankful for his reply, but I am having trouble correctly interpreting it. Do you think my interpretation of his post is correct?

Also, thanks for the supervisor explanation, I appreciate it!

peerreynders · August 22, 2019, 9:15pm

You’re asking me to interpret two things

pragdave’s statement
AND your interpretation

My impression is that it is seen as a acceptable engineering tradeoff to expose implementation details for the purpose supervision configuration given that it is not linked to the run-time behaviour of the process (the “server’s API”).

That being said I’ve run across examples in Erlang source code where functions were being used to “help” assemble child specs or for the purpose of hiding the implementation details of those child specs (which can still be subject to modification before they are passed to the supervisor for configuration).

In the past I’ve felt a bit odd about the “spread of responsibilities” of the functions in the GenServer (or gen_server) callbacks:

handle_call/3, handle_cast/2, handle_info/2 seem to be the functions that relate directly to being a server.
The others seems to be more oriented towards OTP life cycle management.

Now in OO it wouldn’t be uncommon to have a separate lifecycle interface and a separate behaviour interface and have the objects of a class implement both at the same time.

However even in OO there is a similar tension between the constructor and the rest of the class. The constructor is concerned with building the thing while the rest is concerned with being the thing.

So in the end mixing life cycle concerns with run-time concerns in the same callback module doesn’t really seem all that odd in relation to processes.

rvirding · August 22, 2019, 10:17pm

I find that the interface part is generally quite small as all the functions generally do is just send of a request most of the code is in the callbacks. Separating them doesn’t give you that much. Also the interface calls and the code in the callbacks are very closely linked so they fir well together in the same module.

Now, of course, sometimes the implementation of the callbacks can result in quite a lot of code so breaking some of it out into a separate library module is a reasonable thing to do but doing that as a rule does not seem right. And finding a reasonable divide is often not that easy.

mindriot · August 23, 2019, 7:24am

I think the C# feature you are thinking of is #region. While I wouldn’t advocate cramming all kinds of things in the same file, it might be worth also considering why this is actually a problem. I think there is a bit of a difference between doing this in an OO world vs a functional world. Some reasons why you would work hard to extract anything you possibly can include:
• Reducing the size of the things you have to reason about as a whole (which is a class mainly due to the amount of mutation that is or can be happening)
• increasing reuse opportunity by being able to use the extracted code in a different context
• increasing composition options by being able to configure more easily how extracted components are combined via IOC or similar.

Having regions in a class is a strong indicator that you can improve any/all of the above factors and due to those reasons we often find ourselves taking any opportunity to separate anything even if it might make more sense not to in the absence of reasons created by the language constructs.

I’m not sure what the answer to your original question really is or where that should really land (especially since you are looking for a particular persons approach and there are good answers here already), however I would probably rethink the relevance of the region story and reconsider how many of the issues that drive that notion in C# are real problems that should force us to move things into another file when already our units of reasoning, reuse and composition are smaller.