Elixir for Programmers course - should we adopt Dave Thomas' way of organizing GenServers?

cdegroot · May 14, 2018, 1:16am

I’m a bit over halfway through the course, and so far I really like it, especially the very logical order of moving up from basic code through state handling to genservers.

I just finished a video where @pragdave defends the fact that he styles his application a bit differently than usual, and he assumes he’s the sole person agreeing with him. On the contrary, I think we should adopt it as the standard style (it’s never too late to change!).

What I never liked about GenServers is the mix of concerns - it’s api stuff, call implementations, and business logic all wrapped in one usually very long and hard to follow file. I think Dave is right in assuming that this just bubbled down through the ages from the Erlang world but that that doesn’t necessarily mean that it is the best way to do things.

For those who aren’t on the course (really… you should), here’s how Dave organizes things:

lib/myapp.ex - API and just API. It will have the genserver message sends. This is the only module that a user of the app ever imports;
lib/myapp/the_business_logic.ex - Just the business logic.
lib/myapp/myapp_server.ex - Just the genserver implementation which should mostly just forward stuff to the business logic.

I really like this style, as genservers are often not nice to test but in this style they are so obviously correct that testing them is either simple or unnecessary. Lifting the Single Responsibility Principle to the module level just makes for way nicer code, and from my Smalltalk days I still tend to fear the time when a scrollbar appears because your code is too long ;-).

Am I the only person not-Dave-Thomas that really digs this style, is going to adopt it, and thinks that the core team should seriously think of adapting/promoting it? Clean code trumps pretty much everything…

cmkarlsson · May 14, 2018, 1:42am

Yes, I like this layout. I haven’t done/seen the course though so I am speaking without the full context.

This is the only thing I am questioning. From my experience the layout Dave suggests is a common way to layout code in erlang.

My standard layout in erlang (which I have seen in multiple erlang projects as well, I probably took it or was inspired by Erlang and OTP in action) is:

src/myapp.erl - Includes the API. For smaller projects this is the only module used and all others are internal implementation detail. All access to the application is done though here.
src/myapp_some_server.erl - Gen server implementation. Should usually be thin and only handle messages.
src/myapp_whatever.erl - To support business logic and other things. These modules are generally referential transparent to help with testing, robustness. Often doesn’t have any API guarantees between versions as they are internal to the application.

Yes, but as with everything there is a trade-off. With large and complex applications even following the single responsibility principle will give you large modules which may be hard to grok at a quick glance and require plenty of domain knowledge to understand.

bmitch · May 14, 2018, 4:08am

I’m new to the Elixir world and just finished @pragdave’s course. But I really like the way he laid out the structure and in my limited Elixir experience, his way just seems to make a lot of sense.

Mostly because it separates the concerns and think it would make testing easier.

So no, you are not the only one

StefanHoutzager · May 14, 2018, 5:17am

Robert Virding mailed sometime ago on this list that he is not a fan of Dave Thomas’ idea of splitting API functions and server functions. See Elixir Blog Posts. I for me am not convinced by Thomas’ reasoning, I don’t find his idea interesting. I still want to look at https://github.com/sasa1977/exactor to simplify genserver implementations.

michalmuskala · May 14, 2018, 6:19am

I’ll play a devil’s advocate here for a little.

What is the GenServer’s callback module other than the raw “business logic” module already? It’s already an isolated implementation that extracts the common logic from what is specific to your app. I’m not sure adding another layer in-between is actually helping with anything.

oldpond · May 14, 2018, 7:04am

I think what Dave was demonstrating in his course was not the “best way” to organize code, but the process and logic behind refactoring in Elixir. It’s interesting to compare Lance Halvorsen’s approach to factoring out business rules in his book Functional Web Development with Elixir, OTP and Phoenix. They both do a great job of breaking business rules out in their API designs but with slightly different naming conventions. In both cases they end up with well organized applications that have clear separation of concerns.

sasajuric · May 14, 2018, 7:36am

I believe that there are two different discussions here:

Should “business logic” (i.e. state management) reside in a separate module?
Should GenServer API be placed in a separate module?

When it comes to first question, you and Dave are certainly not the only ones thinking that way. You’ll find the same line of thoughts in my writings in Elixir in Action and To spawn or not to spawn. I believe that the same train of thought is also present in Lance’s Functional Web Development.

Besides just public writing, I tend to use this style in “real life” projects. However, I do not go for it 100%. If the state is fairly simple, I just keep it inside the GenServer module. When it becomes more complex, I extract it.

In some situations, I decide upfront that the state handling is complex, and so I decide to extract it upfront. In fact, in such situations, I immediately start working on the state, and not on the GenServer. A bit unusual example is my Parent library, where I first wrote the state logic, then I added a procdict wrapper, and only then did I wrap it inside a GenServer.

There are also some cases where I extract the state management into a separate module, but I keep some orthogonal parts, such as handling timer messages in the GenServer.

But as a rule, I definitely think that state management mostly belongs outside of the GenServer callback module. This leads to a nice separation of concerns, improves code reading experience, supports testability, and promotes code reusability.

OTOH, I’m mostly not convinced that API should be separated from the server side call (GenServer callbacks). I believe the two are naturally coupled, and thus usually belong together.

I can see some possible situations where this wouldn’t hold, for example if there are lot of complex transformations in interface functions, or if you want to vary the server implementation but use the same interface (basically a process-based polymorphism). In my opinion such cases are not very common, and so I think that prematurely extracting API into a separate module will mostly leads to unnecessary level of indirection and reduces the reading experience. It sort of reminds me of the OO style where each member variable in a class is immediately wrapped behind getter/setter functions.

So while I agree that there are some situations where that separation would be beneficial, I think that a good default is to keep API close to the server side handles (because after all they are naturally coupled), and extract if there’s particular need.

minhajuddin · May 14, 2018, 12:25pm

I love a few ideas from his course, he is obviously a very experienced guy who has worked in a ton of languages. However, there is one thing I find a bit unfunctional and a bit more OOPy using of the @me constant to refer to the same module:

defmodule Tasks.Sync do

  @me __MODULE__
  def start_link(args) do
    GenServer.start_link(@me, args, name: @me)
  end
end

I prefer adding an alias and using the Module’s name instead of a module attribute:

defmodule Tasks.Sync do

  alias __MODULE__

  def start_link(args) do
    GenServer.start_link(Sync, args, name: Sync)
  end
end

rvirding · May 14, 2018, 1:08pm

As @sasajuric says there are really two issues here: where to put the API; and where to put the “business logic”.

For the first question I think that they should definitely both be in the same module as they are totally integrated with each other so splitting them up doesn’t feel right. And there is honestly not much code in the client side.

The second question is more complex. In my mind there are a number of factors on which this depends. I think it is easiest if I (briefly) present how I structure my gen_server modules. So they all basically follow the same structure and ordering of parts:

First comes the “management API” section by which I mean the calls to start/stop the server: start, start_link and stop.

Then the “user/client API” section which is the calls users/clients make to access the server. These are the ones calling GenServer.call/cast.

Then come the section with all the callback functions but in a specific order:

init/1 and terminate/2
handle_call/3, handle_cast/2 and handle_info/2
code_change/3

I keep this order as it makes easier to find the functions. Also I generally include all the callbacks even if they are not used as I can directly see what each one does and there is no chance that I miss one and get it wrong. Being explicit rules in the long run (like next week ) and the extra code is negligible, seriously it is. I also generally do very little inside the actual callback functions themselves but call local functions to do most of the work, unless what is done is trivial. This make reading the callback functions easier, especially when there are many clauses. I find this more important with elixir than erlang as the elixir syntax tends to hide that we are really talking about multiple clauses of the same function and not multiple functions.

Finally come the section with all the local functions which generally implement most of the logic. I never ever mix the callbacks with the local functions as, again, keeping them separate makes it easier to find things.

This organisation means that breaking out the business logic into a separate module becomes less important as it is already separated into its own section. Again whether to break it out or not depends on a number of things, for example how much code there is. Also if you do break out the code into a separate module then you must move all the logic to this module so you don’t split it into different modules which tends to be a bad thing. I am assuming that this logic is just used in the gen_server and not in other places, if not then it should definitely be moved into a separate module as you only want to call the gen_server module for accessing the server.

So my thoughts on the matter. It became a bit longer than I had intended.

Final note here: I think that being explicit is extremely important in the long run. The longer your code is used the more being explicit helps, especially if it gets to the stage where someone apart from you has to manage the code. And seriously, it is generally not much code we are talking about.

cdegroot · May 14, 2018, 1:10pm

The advantage of Dave’s method, I find, is that you now have one module you can open (either its source code or its documentation) which announces “I’m the public API, everything I have can be touched” - so even if it maybe makes it harder for the author to organize things, it makes things much simpler for the reader, and in my opinion, code is meant to be written for the person reading it (including the proverbial you, one year down the road, in the middle of the night, while stuff is down).

I do agree that having three modules for a GenServer that maybe just one or two things on top of what an Agent would to is a think you can discuss - design decisions are never black and white. The point I’m trying to make is that, as a community, we’re pushing (by making it the default and using it in code everywhere) a pattern of “it’s ok to have API, GenServer glue, and logic in a simple module” and I think that Dave makes a point that that pattern is more wrong than right and we maybe should stop advertising it.

sasajuric · May 14, 2018, 1:31pm

I’m not sure I follow this point. It seems that you refer to the fact that with the API module you can see the interface of the module, and nothing else. If that’s the point, I personally don’t find it very relevant, but it admittedly depends on the editor you’re using and the code culture.

At my company, we use @impl for all the callbacks, which implicitly includes @doc false, and so a callback function is not considered to be a public API, and won’t be included in generated docs. Moreover, I’m using VSCode with ElixirLS, and so I get a proper member list which includes only API functions.

Finally, we organize our code to place API functions into a separate section at the top of the file, which means that even without the editor support it’s fairly easy to quickly scan the file and see all of the API functions.

cdegroot · May 14, 2018, 1:41pm

(I use Spacemacs/Alchemist, so I guess we’re both good. But frankly, code should be obvious in less as well)

I guess I’m trying to make the point that the SRP applied to modules is a nice thing. And this time around I don’t feel like I’m trying to transplant the wrong bits of my OO knowledge into Elixir. It’s a design gut feeling that’s been well honed over the decades and languages ranging across the spectrum: the single responsibility principle is something to take seriously and the dangers of taking it far or too far are small. I know that there’s no big functional difference between having one “do-everything” module separated by comments and broken up into sections, but qualitatively, I think that the difference is huge. If only because it stresses good design much more than the community currently does around genservers. When I watched Dave’s video on his solution (is it publicly available?), something just clicked and the uneasiness I’ve always had when writing GenServers and looking at the resulting code mess broken up with # API functions comments like I was coding large C programs again in the '80s got a name.

I guess it just boils down to: should we ruthlessly apply the SRP to modules? I guess the answer is, to me, yes. Even though I have enough experience in large OO systems to know where delegation hell can lead you

peerreynders · May 14, 2018, 1:45pm

And one also needs to keep in mind that we are talking about a single GenServer - all that code can only execute in a strictly sequential manner. Once you start focusing on modules that Genserver could take on too many responsibilities and become a bottleneck in the system. And lines of code or number of modules don’t tell the full story - given a realistic load does the GenServer remain responsive?

As @sasajuric outlined, there is always a range of design and implementation choices. It’s important:

To know what the full range of choices are
To be able to justify the choice made in reference to the relevant context (at the time) rather than just sticking to some default (“that’s the way we do it around here” is rarely good judgement).
Be willing to change the choice if either the relevant context or the knowledge about the relevant context changes.

In short, don’t make the approach your “new default” (i.e. don’t have a default. Defaults have a habit of to turning into dogma). Add it to the toolbox, sure, but only use it when the trade offs (which @sasajuric and @rvirding get into) can be justified.

FYI: SOLID Elixir - Georgina McFadyen - ElixirConf EU 2018

cdegroot · May 14, 2018, 1:50pm

Well, that’s the issue - it is, usually, three things:

The API to the server;
The business logic that the server implements;
The “glue” logic that sits between the GenServer protocol and the business logic.

rvirding · May 14, 2018, 2:00pm

But all those are tightly integrated so having a rule which splits them up feels very wrong. Yes, there are cases where it would be reasonable, but not as the general rule.

EDIT: Just a short comment about rules in this context. I see them as the good starting point for working out a solution. So they should always work but sometimes they might not be the best solution and then it can be reasonable to think about a better solution.

joaquinalcerro · May 14, 2018, 3:51pm

Can you share some code showing how do you structure GenServer modules? Thanks,

Best Regards

StefanHoutzager · May 14, 2018, 4:12pm

Posting code yourself where you have doubts about might give interesting comments. So post your code, share your doubts.

joaquinalcerro · May 14, 2018, 6:34pm

So I have also taken the Dave’s course and really love the separation of concerns but in some way I agree with @michalmuskala that to many layers might not help much.

My doubt is how can I structure GenServers better. I am sure we will all learn something looking at others code.

This is one example on how I do it:

  defmodule FindattachV2.Inbox do
  use GenServer
  require Logger

  ###################
  # Client API
  ###################

  def start_link(user) do
    GenServer.start_link(__MODULE__, [], name: String.to_atom(user))
  end

  def full_sync(user, token) do
    inbox = FindattachV2.MailServer.find_inbox(user)
    t = Date.utc_today()
    params = [params: [{"q", "has:attachment after:#{t.year}/#{t.month - 1}/#{t.day}"}]]
    GenServer.cast(inbox, {:full_sync, user, token, params})
  end

  def all_data(inbox) do
    GenServer.call(inbox, {:get_all})
  end

  def get_state(inbox) do
    GenServer.call(inbox, {:get_state})
  end

  def get_contact_attachments(inbox, contact) do
    GenServer.call(inbox, {:get_contact_attachments, contact}) 
  end

  def get_contacts(inbox) do
    GenServer.call(inbox, {:get_contacts})
  end

  def get_contact(inbox, contact_email) do
    GenServer.call(inbox, {:get_contact, contact_email})
  end

  ###################
  # Server API
  ###################

  def init(state) do
    new_state = [:ets.new(:data, [:bag, :public, read_concurrency: true, write_concurrency: true]) | state]
    {:ok, new_state}
  end

  def handle_cast({:full_sync, user, token, params}, state = [table | _]) do
    download_all_messages(table, user, token, params)
    {:noreply, state}
  end

  def handle_call({:get_contact_attachments, contact}, _from, state = [table | _]) do
    attachments =
      :ets.lookup(table, contact)
      |> Enum.map(fn({_, _, id, date, attachments}) ->
           Enum.map(Kernel.elem(attachments,0), fn(x) ->
             Map.put(x, :msg_id, id)
             Map.put(x, :date, date)
           end)
         end)
      |> List.flatten()
    {:reply, attachments, state}
  end

  def handle_call({:get_contacts}, _from, state = [table|_]) do
    contacts =
      :ets.match(table, {:'$1',:'$2',:'_',:'_',:'_'})
      |> Enum.reduce(%{}, fn([k,v], acc) -> Map.put(acc, k,v) end)
    {:reply, contacts, state}
  end

  def handle_call({:get_contact, contact_email}, _from, state = [table | _]) do
    {_,contact,_,_,_} = :ets.match_object(table, {contact_email, :"_", :"_", :"_", :"_"})  |> List.first()
    {:reply, contact, state}
  end

  def handle_call({:get_all}, _from, state = [table | _]) do
    data = :ets.tab2list(table)
    {:reply, data, state}
  end

  def handle_call({:get_state}, _from, state) do
    {:reply, state, state}
  end
  
  ###################
  # Private Functions
  ###################

  defp download_all_messages(table, user, token, params) do

    case FindattachV2.GmailService.list_messages(token, user, params) do
      {:ok, data} ->  
        if Map.has_key?(data, "messages") do 
          spawn_link(FindattachV2.Pmap, :pmap, [table, data["messages"], &FindattachV2.Message.download_messages_details/3, token, user]) 
        
          if Map.has_key?(data, "nextPageToken") do
            download_all_messages(
              table, user, token,
              [params: [{"q","has:attachment"},{"pageToken", data["nextPageToken"]}]]
            )
          else
            Logger.info("Finish importing messages IDs for user")
          end
        end

      {:error, reason} -> 
        if reason == :api_timeout do
          Logger.error("Error: --- #{inspect reason} ---") 
          download_all_messages(table, user, token, params)
        end
    end
  end
end

rvirding · May 14, 2018, 8:43pm

Ok. I found most of the servers that I could distribute are pretty small but this one covers most of the ground. It is in Erlang which I mostly use but it should illustrate what I mean. The code is the simulation master from a spaceship simulation where the ships are implemented in Lua using my Luerl package.

-module(sim_master).

-behaviour(gen_server).

-define(SERVER, sim_master).
-define(TABLE, sim_ship_array).

%% User API.
-export([start/3,start_link/3,stop/1]).
-export([start_run/1,start_run/2,stop_run/0,stop_run/1]).
-export([get_ship/1,get_ship/2]).

%% Behaviour callbacks.
-export([init/1,terminate/2,handle_call/3,handle_cast/2,
         handle_info/2,code_change/3]).

%% Test functions.
-export([init_lua/0,load/3]).

-record(st, {xsize,ysize,n,arr,tick=infinity,st}).

%% Management API.

start(Xsize, Ysize, N) ->
    gen_server:start({local,?SERVER}, ?MODULE, {Xsize,Ysize,N}, []).

start_link(Xsize, Ysize, N) ->
    gen_server:start_link({local,?SERVER}, ?MODULE, {Xsize,Ysize,N}, []).

stop(Pid) ->
    gen_server:call(Pid, stop).

%% User API.

start_run(Tick) ->
    gen_server:call(?SERVER, {start_run,Tick}).

start_run(Sim, Tick) ->
    gen_server:call(Sim, {start_run,Tick}).

stop_run() ->
    gen_server:call(?SERVER, stop_run).

stop_run(Sim) ->
    gen_server:call(Sim, stop_run).

get_ship(I) ->
    gen_server:call(?SERVER, {get_ship,I}).

get_ship(Sim, I) ->
    gen_server:call(Sim, {get_ship,I}).

%% Behaviour callbacks.

init({Xsize,Ysize,N}) ->
    process_flag(trap_exit, true),
    {ok,_} = esdl_server:start_link(Xsize, Ysize),
    {ok,_} = sim_renderer:start_link(Xsize, Ysize),
    {ok,_} = sim_sound:start_link(),
    {ok,_} = universe:start_link(Xsize, Ysize), %Start the universe
    random:seed(now()),                         %Seed the RNG
    Arr = ets:new(?TABLE, [named_table,protected]),
    St = init_lua(),                            %Get the Lua state
    lists:foreach(fun (I) ->
                          {ok,S} = start_ship(I, Xsize, Ysize, St),
                          ets:insert(Arr, {I,S})
                  end, lists:seq(1, N)),
    {ok,#st{xsize=Xsize,ysize=Ysize,n=N,arr=Arr,st=St}}.

terminate(_, #st{}) -> ok.

handle_call({start_run,Tick}, _, #st{arr=Arr}=St) ->
    %% We don't need the Acc here, but there is no foreach.
    Start = fun ({_,S}, Acc) -> ship:set_tick(S, Tick), Acc end,
    ets:foldl(Start, ok, Arr),
    {reply,ok,St#st{tick=Tick}};
handle_call(stop_run, _, #st{arr=Arr}=St) ->
    %% We don't need the Acc here, but there is no foreach.
    Stop = fun ({_,S}, Acc) -> ship:set_tick(S, infinity), Acc end,
    ets:foldl(Stop, ok, Arr),
    {reply,ok,St#st{tick=infinity}};
handle_call({get_ship,I}, _, #st{arr=Arr}=St) ->
    case ets:lookup(Arr, I) of
        [] -> {reply,error,St};
        [{I,S}] -> {reply,{ok,S},St}
    end;
handle_call(stop, _, St) ->
    %% Do everything in terminate.
    {stop,normal,ok,St}.

handle_info({'EXIT',S,E}, #st{arr=Arr}=St) ->
    io:format("~p died: ~p\n", [S,E]),
    ets:match_delete(Arr, {'_',S}),             %Remove the ship
    {noreply,St};
handle_info(_, St) -> {noreply,St}.

%% Unused callbacks.
handle_cast(_, St) -> {noreply,St}.

code_change(_, St, _) -> {ok,St}.

%% Local functions.

%% init_lua() -> LuaState.
%%  Initialise a LuaState to be used for each ship process.

init_lua() ->
    L0 = luerl:init(),
    L1 = lists:foldl(fun({Name,Mod}, L) -> load([Name], Mod, L) end, L0,
                     [{esdl_server,luerl_esdl_server},
                      {universe,luerl_universe},
                      {sound,luerl_sound},
                      {ship,luerl_ship}]),
    %% Set the default ship.
    {_,L2} = luerl:do("this_ship = require 'default_ship'", L1),
    L2.

load(Key, Module, St0) ->
    {Lk,St1} = luerl:encode_list(Key, St0),
    {T,St2} = Module:install(St1),
    luerl:set_table1(Lk, T, St2).

start_ship(I, Xsize, Ysize, St) ->
    %% Spread out the ships over the whole space.
    X = random:uniform(Xsize) - 1,
    Y = random:uniform(Ysize) - 1,
    {ok,S} = ship:start_link(X, Y, St),
    %% Random speeds from -0.25 to 0.25 sectors per tick (very fast).
    Dx = 2.5*random:uniform() - 1.25,
    Dy = 2.5*random:uniform() - 1.25,
    ship:set_speed(S, Dx, Dy),
    {ok,S}.

Yes, it is pretty comment-free but it is not for release yet.

joaquinalcerro · May 14, 2018, 11:11pm

Thanks for sharing.