Should we adopt Dave's way of building applications as a series of components? (Dave's talk has now been added!)

joeerl · June 5, 2018, 10:21am

These are very good questions I’ve split my answer into several sections - feel free to ask more. I didn’t manage to put these in-line so they’re all at the end of the post

Difference between black boxes and functions

I’ll only talk about PURE functions. A pure function is one where all the inputs come in the arguments and the only output comes through the return value - it has no side effects.

A side effect can be thought as “leaving something behind” after the function has returned.

Most math functions are pure:

 double(X) -> 2*X.

 add(X, Y) -> X+Y.

etc.

Most OS calls are impure.

Imagine we open a file:

Handle = file:open("filename"),

The side effect is that the file is now opened, now we can do
file:read(Handle) this will return data (or eof) and the side effect
is that after this call the some position pointer with the file will have been updated.

Since functions cannot remember anything after they have returned we need a mechansim to remember things after the functions have returned.

We can do this inside a process:

In Erlang

  loop(State) ->
      receive
          Msg ->
              NewState = F(State),
              loop(NewState)
       end.

F can be a pure function, which given State returns NewState.
The current value of State is remembered by the process.

So processes basically remember state for you.

So what about Black boxes? These are things with inputs and outputs - the inputs and outputs are MESSAGES - inside a black box there can be one or more processes - the point about a black box is we’re not supposed to know how it works - ie we’re not allowed to peep in the inside and see how it’s made up.

Difference between APIs and protocols

Imagine a file system with a simple API

file:open(FileName) => Handle
file:read(Handle) => {ok, Data} | eof
file:close(Handle) => ok

The notation F(Args) => A | B
means that the function F can return an A or N

This is an API but it says NOTHING about the allowable legal sequences of function calls.

For example the program fragment

      Handle = file:open(FileName)
      file:close(Handle)
      file:read(Handle)

is NONSENCE - we can’t read a file after we have closed it - but the type system and the API does not tell us this.

A protocol is a description of the legal sequences of messages that a black box can process.

Now imagine the file API as messages going into a black box
so there is an open message that returns a handle, a read message that returns data or eof and so on.

Then a valid program could be represented by the regular expression

  open read* close

ie we can open a file, do any number of reads then close the file

We might write the protocol as something like:

 filesystem =
   start -> open read* close start

Assuming the filesystem starts of in some magic state ‘start’

We can be more precise if we use a state machine:

   open x closed -> Handle x opened
   read x opened -> {ok,Data} x opened | eof x closed
   close x Handle -> ok x closed

The file is has state open or closed, the messages are open
read and close, so rule 2 reads

if a file is in the state opened and gets a read request when
we either emit a {ok,Data} message and the next state in opened or we emit an eof message and go to the state closed

Note that the protocol provides far more information than the API in that it specifies the legal sequences of operations that the black box can perform.

Principle of Observational Equivalence

Two systems are the same if they cannot be distinguished by observing their inputs and outputs - this is very important - it allows us to swap an implementation for a better one, change what happens inside the black boxes.

Wiring things up and starting and stopping

Now we have to consider the wiring - and how we start and stop black boxes? - To do this we need yet another language, something like:

 system X is:
      start component a
      start component b
      ...
      connect out1 of a to in2 of b
      connect out2 of b to in2 of c
      ..
      send {logging,on} to control2 of c
      ..
     send run to all

and we’re off - I assume start starts a black box and that run
makes it operational

Then

 operation update means
    send pause to all
    send newcode1 to control2 of b
    send resume to all

Or something

Most of this post was just thinking out loud - more about state machines for describing APIs is in the appendix of my PhD thesis it’s a system called UBF.

The ideas of describing protocols with communicating state machines comes from Tony Hoare’s CSP (Communicating Sequentail Programs)

The ideas of wiring up black boxes comes from Flow based Programming (Paul Morrison).

That’s enough for one posting

Cheers

/Joe