Should we adopt Dave's way of building applications as a series of components? (Dave's talk has now been added!)

In this case, how do you introduce ETS to support concurrent reads (assuming :thing_i_want_to_do is :get_some_process_state)?

The problem here is that messaging is an internal mechanism for supporting concurrency. It isn’t the only mechanism, and it isn’t always the fastest one. By directly exposing what used to be a private concurrency concern, you limit your ability to change your concurrency strategy later. It’s also currently miserable to document public message-based APIs.

4 Likes

I think this is actually an interesting idea. The problem is that the share-nothing approach of Erlang processes leaks here and makes this approach extremely inefficient, given all the copying of messages when multiple “proxy” processes are involved.

There’s also the problem that a very common optimisation technique involves maintaining an ETS table for state: all writes go through the owning process, ensuring they are linear and exclusive, while reads are “dirty”, going directly to the ETS table. A message interface completely breaks down in cases like this.
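For readers unfamiliar with that pattern, here is a minimal sketch (the module and function names are my own invention): the owning process serializes writes, while the client-side read function bypasses the process and hits the table directly. Note that this only works because the interface is functions; a raw message contract could never let the read path skip the process.

```elixir
defmodule Cache do
  use GenServer

  ## Client API

  # Reads go straight to the ETS table: concurrent, no process hop.
  def get(key) do
    case :ets.lookup(__MODULE__, key) do
      [{^key, value}] -> {:ok, value}
      [] -> :error
    end
  end

  # Writes are funnelled through the owning process, so they stay
  # linear and exclusive.
  def put(key, value), do: GenServer.call(__MODULE__, {:put, key, value})

  ## Server

  def start_link(opts), do: GenServer.start_link(__MODULE__, opts, name: __MODULE__)

  @impl true
  def init(_opts) do
    # :protected lets any process read, but only the owner write.
    table = :ets.new(__MODULE__, [:named_table, :protected, read_concurrency: true])
    {:ok, table}
  end

  @impl true
  def handle_call({:put, key, value}, _from, table) do
    :ets.insert(table, {key, value})
    {:reply, :ok, table}
  end
end
```

This also answers the earlier question directly: Cache.get/1 keeps the same signature whether it is backed by a GenServer.call or an :ets.lookup, so introducing ETS for concurrent reads is invisible to callers.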

1 Like

In your example I would put the lookup behind a function call as you described, doing away with the call entirely. I almost always know up front if there will be resource contention and will use this approach from the start.

If I didn’t think there would be contention, which is by far the more common use case, I would stick with my original choice and save the intermediate function call. If my assumption is wrong or the system changes to necessitate concurrent reads, it’s not a tall order to refactor the calls. But that particular evolution has been rare enough that I prefer to optimize for other use cases.

1 Like

Except it isn’t. :slight_smile: We have fantastic tooling around modules and functions. We have documentation. We have xref to find where a function is invoked. We have @spec, @since and @deprecated. If you rename a function and miss a call site, xref will likely catch it at compile time. A bad message will manifest itself as an error on the receiver (and not on the caller!). The amount of tooling around modules and functions is an order of magnitude superior to the contracts we have around messages and processes.

So even if you don’t agree that a GenServer is an implementation detail, there are many reasons to prefer exporting modules and functions. Any connotation of sync/async could be easily encoded in a naming convention such as cast_foo (although you rarely should be using cast anyway). Skipping the function is putting the burden on users of the code.

12 Likes

This pattern is exactly what I think makes it superfluous! The client interface is such a transparent wrapper.

TBH, I don’t use the @deprecated and @since tags in my app code, and we could easily @spec the callbacks. Although the tooling is currently superior for functions/modules over processes and messages, that’s just reinforcing the handed-down traditions that Dave was originally trying to question. As @joeerl pointed out above, the key feature of Erlang/Elixir is processes and messages.

If we invested in proper tooling around message passing, couldn’t we reap the same benefits you describe without the additional layer of indirection? Couldn’t xref be made smart enough to catch malformed messages? Not that I’m volunteering for this : )

2 Likes

I agree that hiding the fact that a GenServer is used is not a big benefit of client functions. In the past 8 years of working with Erlang, I think I had only a few cases where this was helpful.

IMO a much bigger benefit of interface functions is that they consolidate the communication into one place. So things like:

  • message format
  • process discovery
  • sync/async nature of each particular message

end up being a part of a single module.

I don’t want to have that knowledge spread around the code, because that makes it much harder to change (and these properties do change more frequently IME). In my mind, these things are part of a single responsibility (communication), and the fact that we can consolidate the details of that responsibility in a single module, for both the client and the server, is IMO a benefit and a good pattern.
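To illustrate (with invented module and message names), here is what such a consolidating module might look like. The registered name (discovery), the message formats, and the sync/async decision for each message all live in one file, so any of them can change without touching callers:

```elixir
defmodule Sensor do
  use GenServer

  # Process discovery consolidated here: callers never see the name.
  @name __MODULE__

  def start_link(opts), do: GenServer.start_link(__MODULE__, opts, name: @name)

  # Synchronous: the caller needs the answer.
  def reading, do: GenServer.call(@name, :reading)

  # Asynchronous, fire-and-forget (using the cast_ naming convention
  # mentioned earlier in the thread).
  def cast_recalibrate, do: GenServer.cast(@name, :recalibrate)

  @impl true
  def init(_opts), do: {:ok, %{reading: 0}}

  @impl true
  def handle_call(:reading, _from, state), do: {:reply, state.reading, state}

  @impl true
  def handle_cast(:recalibrate, state), do: {:noreply, %{state | reading: 0}}
end
```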

6 Likes

Thanks a lot for the longish post! It’s educational, and I’m certainly intrigued. I’m new to processes, so I hope you can excuse my longish response and don’t mind me asking a few follow-up questions. :wink:

Sounds like, overall, you’re suggesting that we as a community spend more effort on messages and protocols, not functions and APIs.

My first question: how are black boxes different from functions?
Here is my understanding after reading your post; please correct me if I’m wrong. Black boxes are independent services/components/applications/functions/foo_bars that run on their own processes and are constantly listening to requests that come their way, a.k.a. their input ports. The things that distinguish black boxes from functions are:

  1. A black box runs on its own process and is always listening to requests that come its way. A function lives with its clients (the clients who invoke the function) in the same process. As a result, a black box can run concurrently with its clients, but a function runs sequentially with its client.
  2. A black box takes input from its designated input port/stream and writes to its output port and error port. A function takes arguments and returns a result or raises an error. Sounds like the difference is that a black box’s inputs and outputs are “hard”-wired to its three ports, whereas a function is always in the same stream as its client. As I write it out, I have trouble seeing the difference. I’m probably missing the point. Maybe you can help me out here? :joy:

My second question: what are the advantages black boxes have over functions?
Sounds like the advantage is that black boxes can run concurrently alongside their clients, hence they’re easier to compose, scale, and manage (kill & restart). Is that correct? I think there’s more to it. I think you are also suggesting a benefit related to thinking about our programs differently: no longer as sequential chained functions, but as individual processes. Can you expand on this?

^^ this is a benefit you called out directly. I think you are suggesting that an expressive way of showing how the components are wired up is (way) better than what we have right now, where you can’t really tell without reading (lots of) code. That is something I have been wanting for a long time! :smiley:

Lastly, I just realized I might have missed your point entirely. :see_no_evil: Maybe it’s not so much about black boxes vs functions. It’s about the messages and the messaging mechanism among black boxes/functions. Sounds like that’s what you want the community to make progress on.

Thinking and building such a messaging mechanism/protocols for black boxes (i.e. “how to wire up the black boxes”) makes sense to me. But what does “thinking message” mean? :thinking:

^^ how are protocols different from APIs?

Again, I still want to ask what’s the advantage of such a “messaging system” over the current way of programming? What’re the advantages of that way of thinking?

Sorry for my long-form thinking-out-loud. I would really appreciate it if you can confirm or correct some of my points and share your insights. I’m sure other folks new to this concept will benefit from it as well. :pray:

Cheers
/Sihui

3 Likes

I am not saying you should do it though. But you could if it was relevant for you. :stuck_out_tongue:

The first thing I want to clarify is that calling a GenServer explicitly is not what Joe or Dave is proposing. In fact, Dave advocates for an explicit API, so you can change implementation details, and Joe proposes to abandon GenServer altogether and rely on send/receive patterns.

When you are using a GenServer, you have already foregone messages in favor of a function-based GenServer API that encapsulates the process messages in an opaque way (except for handle_info). That’s why I keep emphasizing that a GenServer is an implementation detail. So to me, you need to either fully encapsulate it, as Dave/I propose, or you should abandon it altogether in favor of an actual message-based contract, like Joe proposes. Otherwise you are neither here nor there and you end up with the worst of both worlds: you don’t get the compiler to yell at you and you don’t really leverage everything processes can offer, because you are tied to a GenServer.

Regarding the tooling, there are improvements we can do, and the community should definitely pursue those, but you also need to remember that processes and messages provide late binding, meaning that you don’t really know what you are calling until you call it. As a simplified example, you can consider send(pid, message) to be equivalent to some_mod.message(args). xref, intellisense, etc won’t work for the latter, because we only know some_mod at runtime, and that’s why it would be hard for it to assert anything about the messages/processes.

So the late binding approach makes it very hard to perform any kind of static analysis, unless we start moving towards type systems and explicitly outlining the message contract between processes. But here is the catch: we can’t write any message contract for a GenServer, because the call/cast messages it uses are private/opaque! Which brings us back to our original point.

11 Likes

These are very good questions :slight_smile: I’ve split my answer into several sections - feel free to ask more. I didn’t manage to put these in-line so they’re all at the end of the post :frowning:

  1. Difference between black boxes and functions

I’ll only talk about PURE functions. A pure function is one where all the inputs come in the arguments and the only output comes through the return value - it has no side effects.

A side effect can be thought of as “leaving something behind” after the function has returned.

Most math functions are pure:

 double(X) -> 2*X.

 add(X, Y) -> X+Y.

etc.

Most OS calls are impure.

Imagine we open a file:

Handle = file:open("filename"),

The side effect is that the file is now open. Now we can do file:read(Handle), which will return data (or eof); the side effect is that after this call, a position pointer within the file will have been updated.

Since functions cannot remember anything after they have returned, we need a mechanism to remember things after the functions have returned.

We can do this inside a process:

In Erlang

  loop(State) ->
      receive
          Msg ->
              NewState = F(Msg, State),
              loop(NewState)
      end.

F can be a pure function which, given Msg and State, returns NewState.
The current value of State is remembered by the process.

So processes basically remember state for you.
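For Elixir readers, here is the same idea rendered as a minimal sketch (Keeper is an invented name); the pure update function is supplied by the caller:

```elixir
defmodule Keeper do
  # Spawn a process that remembers `state`, transforming it with the
  # pure function `fun` on every message received.
  def start(state, fun), do: spawn(fn -> loop(state, fun) end)

  defp loop(state, fun) do
    receive do
      msg -> loop(fun.(msg, state), fun)
    end
  end
end

# pid = Keeper.start(0, fn {:add, n}, total -> total + n end)
# send(pid, {:add, 5})
```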

So what about black boxes? These are things with inputs and outputs - the inputs and outputs are MESSAGES - inside a black box there can be one or more processes. The point about a black box is that we’re not supposed to know how it works - i.e. we’re not allowed to peep inside and see how it’s made up.

  2. Difference between APIs and protocols

Imagine a file system with a simple API

file:open(FileName) => Handle
file:read(Handle) => {ok, Data} | eof
file:close(Handle) => ok

The notation F(Args) => A | B
means that the function F can return an A or a B :slight_smile:

This is an API but it says NOTHING about the allowable legal sequences of function calls.

For example the program fragment

      Handle = file:open(FileName)
      file:close(Handle)
      file:read(Handle)

is NONSENSE - we can’t read a file after we have closed it - but the type system and the API do not tell us this.

A protocol is a description of the legal sequences of messages that a black box can process.

Now imagine the file API as messages going into a black box
so there is an open message that returns a handle, a read message that returns data or eof and so on.

Then a valid program could be represented by the regular expression

  open read* close

i.e. we can open a file, do any number of reads, then close the file

We might write the protocol as something like:

 filesystem =
   start -> open read* close start

Assuming the filesystem starts off in some magic state ‘start’

We can be more precise if we use a state machine:

   open x closed -> Handle x opened
   read x opened -> {ok,Data} x opened | eof x closed
   close x Handle -> ok x closed

The file has state opened or closed; the messages are open, read and close. So rule 2 reads:

if a file is in the state opened and gets a read request, then we either emit an {ok,Data} message and the next state is opened, or we emit an eof message and go to the state closed

Note that the protocol provides far more information than the API in that it specifies the legal sequences of operations that the black box can perform.
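As a toy illustration of how such a protocol could be checked mechanically (all names here are my own invention), the open read* close protocol above can be encoded as a state machine in a few lines of Elixir:

```elixir
defmodule FileProtocol do
  # States: :closed | :opened. step/2 returns the next state, or an
  # error for messages that are illegal in the current state.
  def step(:closed, :open), do: {:ok, :opened}
  def step(:opened, :read), do: {:ok, :opened}  # reads may repeat
  def step(:opened, :close), do: {:ok, :closed}
  def step(state, msg), do: {:error, {:illegal, msg, :in, state}}

  # Check a whole sequence of messages against the protocol.
  def valid?(msgs), do: check(:closed, msgs)

  defp check(_state, []), do: true

  defp check(state, [msg | rest]) do
    case step(state, msg) do
      {:ok, next} -> check(next, rest)
      {:error, _} -> false
    end
  end
end

# FileProtocol.valid?([:open, :read, :read, :close])  #=> true
# FileProtocol.valid?([:open, :close, :read])         #=> false, the nonsense sequence
```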

  3. Principle of Observational Equivalence

Two systems are the same if they cannot be distinguished by observing their inputs and outputs - this is very important - it allows us to swap an implementation for a better one and change what happens inside the black boxes.

  4. Wiring things up and starting and stopping

Now we have to consider the wiring, and how we start and stop black boxes. To do this we need yet another language, something like:

 system X is:
      start component a
      start component b
      ...
      connect out1 of a to in2 of b
      connect out2 of b to in2 of c
      ..
      send {logging,on} to control2 of c
      ..
     send run to all

and we’re off - I assume start starts a black box and that run
makes it operational

Then

 operation update means
    send pause to all
    send newcode1 to control2 of b
    send resume to all

Or something :slight_smile:

Most of this post was just thinking out loud. More about state machines for describing APIs is in the appendix of my PhD thesis; it’s a system called UBF.

The ideas of describing protocols with communicating state machines come from Tony Hoare’s CSP (Communicating Sequential Processes).

The ideas of wiring up black boxes come from Flow-Based Programming (Paul Morrison).

That’s enough for one posting :slight_smile:

Cheers

/Joe

20 Likes

A while ago, I had some vague idea about trying to at least mimic a typed message contract, but never got to play with it. The idea is that a developer specifies a GenServer contract, e.g. along the lines of

contract do
  cast {:foo, type1(), type2(), ...}
  cast :bar
  call ...
end

and based on this, some metaprogramming code would generate generic client-side functions and server-side handlers, with type specifications.

One benefit would be that the messaging protocol is consolidated (instead of being sprinkled around various functions). In addition, since we have typespecs on the client and the server, and the delivery is performed by the generic code (the code generated by metaprogramming, and the GenServer code), we can be more confident that the specs on the client side match the specs on the server side.

There are a lot of open questions here, like where should the state be specified, how should developers provide their server-side code, what about discovery (aliases, via tuples), initialization, call timeouts, info messages, code_change, and terminate and so on.

I didn’t really pursue this idea further, neither in thought nor in code :slight_smile: In case anyone finds it interesting, feel free to take it further. At the very least, it might serve as a nice metaprogramming exercise :slight_smile:
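As a starting point for that exercise, here is a minimal, hedged sketch of what the generated client side could look like, for the cast case only (Contract, defcast, Counter and reset are all invented names, not an existing library):

```elixir
defmodule Contract do
  # Given a message name, generate a client function of the same name,
  # with a typespec, that casts that message to the server. A fuller
  # version would also take argument types and generate call clauses.
  defmacro defcast(name) do
    quote do
      @spec unquote(name)(GenServer.server()) :: :ok
      def unquote(name)(server), do: GenServer.cast(server, unquote(name))
    end
  end
end

defmodule Counter do
  use GenServer
  import Contract

  def start_link(_opts), do: GenServer.start_link(__MODULE__, 0)

  # Generates `reset/1` with a @spec; the message format lives only here.
  defcast :reset

  @impl true
  def init(count), do: {:ok, count}

  @impl true
  def handle_cast(:reset, _count), do: {:noreply, 0}
end
```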

8 Likes

What you said makes sense. I do think it is a solvable problem but it is one extra problem in an already long list of non trivial problems when it comes to statically verifying message contracts. Especially because the verification of this contract system would need to be generic as other behaviours may want to leverage it too. There is some interesting research on meta-programming and type systems which may be a good fit for Elixir.

5 Likes

This is precisely it, the module interface is exactly that, the interface.

+1

4 Likes

Thanks. You are right.

I agree. Although you don’t technically have to start it (you just need to make sure that its dependencies are started before you use the app), it makes much more sense to treat all the applications the same, as the application controller (and likely other tooling) already does.

It doesn’t necessarily change the way I interact with them, but it helps me understand how an application can fail, and it lets me know what I am communicating with. Am I sending messages to a process or am I calling a pure function? This can affect the way I use the library quite a bit.

3 Likes

If you’re not already aware of them, you might be interested in session types. They allow systematic checking that messages conform to a defined protocol:

Short session types intro: Session types in programming languages - a collection of implementations, by Simon Fowler
Up-to-date list of implementations: From Data Types to Session Types: A Basis for Concurrency and Distribution

There is a research Erlang implementation, but the checking happens at runtime. There are some static implementations in other languages, however.

2 Likes

How about using the graphql specification for defining typed message contracts?

The schema introspection of graphql is extremely powerful, even if it’s perhaps more intended for dynamic rather than static use. But if you can query a black box for its message contract at runtime, I guess it shouldn’t be hard to define rules for how to get that information at compile time (guessing that’s what you mean by static verification?).

@sasajuric, it sounds like your idea about metaprogramming to define the message contract is very similar to how the schema macros work in https://hexdocs.pm/absinthe. Is that about right?

If you don’t read too much into the graph and the (s)ql parts, it really seems like graphql is just a way to define types and messages, but with the awesome add-on that you can use built-in messages to understand how to talk to other components.

It’s the wiring-up @joeerl describes that is the hard part, I think. It would be great if you could just define the messages you expect as input and the messages you will send as output, and let someone else wire it up: drop the box in some place where the defined input can arrive and other boxes are interested in the output. With a unified way of asking all the boxes what they want as input and what they can provide as output, wiring-up can be an optimization/mapping task that doesn’t require any knowledge about the actual business problem that should be solved.

That’s at least what I would want if I were a programmer :grinning:

3 Likes

It is, I’ve used it entirely server-side as API call points. ^.^

1 Like

I have missed an enormous amount of this discussion, but I think part of the problem is about naming. Unfortunately, an Erlang/Elixir application is not what the rest of the world means by an application. It (the Erlang/Elixir application) is definitely intended to be a component, not a complete application (what the rest of the world means). This is very obvious when you look at OTP and see all the applications in it, which are definitely components. While this is obvious for an “insider”, it can be very confusing for newcomers.

Also, in this respect a build tool like mix doesn’t help, as people (again, especially newcomers) use it to build an application by making an application. If you get my meaning. No, there is not much that mix or any other tool can do about this; it’s all in the naming.

When I give OTP courses I explicitly point out that OTP applications should be viewed as components and not as what the rest of the world means by applications.

I don’t know why the name was chosen.

I think someone in an earlier post also mentioned the naming problem.

16 Likes

Do you think the name will ever get changed to component, Robert?

I think it’s one of those things where the meaning has changed over the years - in this case partly due to Apple, when they started calling programs ‘Apps’ (applications).

No, I don’t think the name will ever change. It would be such a deep change and affect things everywhere, so you would break too much. Unfortunately.

1 Like

For what it’s worth, I feel the incomplete meaning of the word “application” in BEAM land is what the rest of the tech world should adopt, rather than us adopting their meaning. Singular, self-sufficient and complete applications do not exist. Everything depends on something else, inevitably.

(Even Golang’s single big binaries are ultimately constructed from your code + all your dependencies.)

Add the loads of hard work on distributed architectures in recent years to the mix, and IMO the BEAM’s definition of an application makes more sense.