How does the pipe operator really work?

Fl4m3Ph03n1x · September 11, 2018, 11:47pm

Background

Coming from a functional background, I am used to functions taking their data last. This lets you compose them like the following (read from bottom to top):

Example using Javascript with Ramda:

compose(
    count,
    filter( user => user.name === "Mary" ),
    getUsers,    // returns list of users
)

Because the result of getUsers is a list, we pass it to filter, which is called like filter( condition )( list ). The filtered list is then passed to count.

Elixir

When I first came to Elixir (a few days ago) I saw that most functions are data first instead of data last. This was a huge disappointment because I thought I couldn’t compose functions like in the previous example.

But I was wrong. Turns out I can compose functions, it just works differently:

def count(strand, nucleotide) do
  strand
  |> Enum.filter(fn char -> char == nucleotide end)
  |> Enum.count()
end

Correct me if I am wrong

According to what I understand, Elixir’s API is data first because the |> operator pushes the result of the last executed function as the 1st argument to the next function.
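To make the "first argument" claim concrete, here is a minimal sketch comparing a piped expression with the nested call it stands for. The list and the even-number filter are made up for illustration and do not come from the exercise above.

```elixir
# Piped form: each result becomes the first argument of the next call.
piped =
  [1, 2, 3, 4]
  |> Enum.filter(fn n -> rem(n, 2) == 0 end)
  |> Enum.count()

# The same computation written by hand as nested calls.
nested = Enum.count(Enum.filter([1, 2, 3, 4], fn n -> rem(n, 2) == 0 end))

# Both forms evaluate to the same value.
IO.inspect(piped == nested)
```

Read top to bottom, the piped form follows the order in which the data flows, which is the same benefit compose gives in the Ramda example, just in the opposite reading direction.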

However I noticed this operator also has some special behaviors ( like every time you use it, you have to use () in your functions ).

Questions

Is my assessment of the use of the |> correct?
What other quirks and special cases should I be aware of when using the pipe operator?

peerreynders · September 11, 2018, 11:54pm

Kernel.|>/2

If you scan to the right you’ll see macro </> - click on it and it will take you to the source code.

you have to use () in your functions

That is a peculiarity because it actually is implemented as a macro.
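Since |> is a macro, the rewrite can be inspected as data. The following is a small sketch, not the actual Kernel source: it quotes a pipeline, expands it once, and compares that with Macro.pipe/3. The choice of Integer.to_string/2 is arbitrary.

```elixir
# Quote a small pipeline so we can look at it as AST instead of running it.
ast = quote(do: 1 |> Integer.to_string(2))

# Expand the |> macro once; the piped value is spliced in as the first argument.
expanded = Macro.expand_once(ast, __ENV__)
IO.puts(Macro.to_string(expanded))

# Macro.pipe/3 performs the same splice directly; position 0 is the first argument.
manual = Macro.pipe(quote(do: 1), quote(do: Integer.to_string(2)), 0)
IO.puts(Macro.to_string(manual))
```

Both lines should print the nested call with 1 moved into the argument list. The macro works on the syntax of the right-hand side, which is why that side has to look like a function call.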

Then there are some stylistic issues:

I seem to also recall a preference for one-pipe-operator-per-line - but maybe I’m remembering Elm’s forward function application. - Found it.

Soon you’ll be asking - what about errors? Time to look at:

Kernel.SpecialForms.with/1

not quite ROP but it works.
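As a rough illustration of how with handles the failure path that a plain pipe cannot, here is a minimal sketch. The Parse module and its sum/2 function are hypothetical names invented for this example.

```elixir
defmodule Parse do
  # Hypothetical helper: adds two integers given as strings.
  # Each <- clause must match for the chain to continue; the first
  # value that does not match falls through to the else block.
  def sum(a, b) do
    with {x, ""} <- Integer.parse(a),
         {y, ""} <- Integer.parse(b) do
      {:ok, x + y}
    else
      _ -> {:error, :not_an_integer}
    end
  end
end

IO.inspect(Parse.sum("1", "2"))
IO.inspect(Parse.sum("1", "x"))
```

The first call returns {:ok, 3}; the second stops at the failed parse and returns {:error, :not_an_integer}.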

easco · September 12, 2018, 2:19am

Things that I found quirky:

• You can pipe into a cond expression. One member of my team loves the technique, I find it weird.

• You can pipe into an anonymous function but you have to express it as a call to that function (as you point out):

"Hello" |> (fn x -> IO.puts(x) end).()

gon782 · September 12, 2018, 6:22am

I used to find this weird but it legitimately helps in reducing the scope’s unnecessary variables, etc… At this point I think it’s preferable to the alternative.

id
|> by_id()
|> Repo.one()
|> case do
  %__MODULE__{} = thing ->
    # do stuff with thing

  nil ->
    {:error, :no_thing_found, id}
end

The above is nicer (IMO) because the variables you do bind are bound closer to (and are restricted to) the scope you use them in. You’re also reducing the possible binding for thing to only the success case, so you’re effectively guaranteeing that whenever there is a thing it’s never nil. You could do this by casing on a variable, but then you have a variable that is effectively useless, because you’re binding in the subscope anyway, and the original variable will still be available in the nil scope.

Piping into an anonymous function, however, I really dislike seeing. I’d actually rather see a private function at that point.

peerreynders · September 12, 2018, 12:53pm

Which really shouldn’t be too surprising to anyone who has used IIFEs in JavaScript.

In my mind the preferences to both separate points seem somewhat disjoint.

well_named_fn = fn(x) ->
  case x do
    %__MODULE__{} = thing ->
      # do stuff with thing

    nil ->
      {:error, :no_thing_found, id}
  end
end

id
|> by_id()
|> Repo.one()
|> well_named_fn.()

Now at this point I’m likely to just turn well_named_fn into a static function, likely using multiple clauses with pattern matching, regardless of the parameters which I may have to pass.
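A minimal sketch of that last step, assuming a made-up Thing module: fetch/1 is a hypothetical stand-in for the by_id() |> Repo.one() lookup, and handle_result/2 is an invented name for the extracted function.

```elixir
defmodule Thing do
  defstruct [:id]

  # Hypothetical stand-in for the database lookup; returns a struct or nil.
  def fetch(1), do: %__MODULE__{id: 1}
  def fetch(_), do: nil

  def find(id) do
    id
    |> fetch()
    |> handle_result(id)
  end

  # One clause per outcome; the id is passed explicitly instead of
  # being captured from the enclosing scope.
  defp handle_result(%__MODULE__{} = thing, _id), do: {:ok, thing}
  defp handle_result(nil, id), do: {:error, :no_thing_found, id}
end

IO.inspect(Thing.find(1))
IO.inspect(Thing.find(2))
```

The pipeline in find/1 now reads as a sequence of named steps, and the branching lives in the function heads rather than in a case expression.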

Throwing a conditional in the middle (or end) of a pipe may seem convenient but to me still adheres to a flowchart style of programming. FP to me is suggestive of breaking things down into tiny little concepts as functions which can be appropriately composed (even if I only intend to compose them once).

Not saying that you are wrong, just pointing out that there is some room for a different point of view - or in this case that there are some ideas that are worth pushing further.

gon782 · September 12, 2018, 1:21pm

peerreynders:

In my mind the preferences to both separate points seem somewhat disjoint.

well_named_fn = fn(x) ->
  case x do
    %__MODULE__{} = thing ->
      # do stuff with thing

    nil ->
      {:error, :no_thing_found, id}
  end
end

id
|> by_id()
|> Repo.one()
|> well_named_fn.()

I don’t see any real value in naming this function through a variable binding instead of just making it a private function in the module, to be honest. It takes no advantage of being bound to a variable at all. On top of that, as you alluded to, casing on an input variable is sloppy. If a case statement somehow gets out of control I’d rather someone jumped straight to a defp instead of taking the roundabout way through a binding to a lambda.

Having a case at the end of a pipe is essentially a much cleaner version of your lambda example and if you want to name something there’s a much better way to do that as well. The bit I quoted would/should never pass review, IMO.

peerreynders · September 12, 2018, 1:31pm

This is where we disagree - reading the name reminds me what that function is supposed to accomplish - rather than having to mentally parse through the code and having to divine what it is actually trying to accomplish from how it is doing something. This is my main beef with anonymous functions - and throwing in a case expression in between (or the end of) a chain of functions essentially creates the same problem.

gon782 · September 12, 2018, 1:39pm

Just to be clear, let’s acknowledge that you cut off that sentence before the bit about naming with private function definitions instead. I’m not at all against naming things, but using variable bindings to do it when I’m not actually using them as variables is usually just a waste and it clutters up function definitions. We don’t have nested function definitions so let’s not pretend we do.

To reiterate: I’m not advocating for cramming whatever you want in these piped case statements, but I’m most definitely saying that binding lambdas to variables is a useless step in between that should be skipped entirely. Lambdas are slower and in this case you’ve gained nothing from binding to a variable. Were you to actually pass the variable to something I’d see the point, but you’re just using it as a name. On top of that you’d get the less awkward function call out of using a defp.

peerreynders · September 12, 2018, 2:00pm

The intent was to

replace naked code with a name.
give that code access to all identifiers via the closure - if I happen to be too lazy to express the necessary information as parameters, which I’m not 99.99999% of the time.

On top of that you’d get the less awkward function call out of using a defp .

I wholeheartedly agree.

I’m not advocating for cramming whatever you want in these piped case statements,

The example given is, to me, a Broken Window, the “thin edge of the wedge”. The lambda I gave was simply setting the stage for “just make it another function - even if you have to define/pass some arguments”.

The whole “Naming is difficult” excuse is used far too often to justify less readable or inferior code - to get good at naming things you have to keep practicing.

ianrumford · September 14, 2018, 2:36pm

My first post on Elixir delved into the pipe operator.

Bit old now but may be of interest.

dimitarvp · September 26, 2018, 7:36pm

While I definitely agree with gradual refactoring, I believe @gon782 has the slightly better point here: if you find yourself having to modify such a weird piece of code (which I also would not pass during a review), you had better just go all the way and break it down into more readable, single-responsibility pieces using well-named functions. If you have good tests, they don’t care whether you change things 5 lines of code at a time or modify the whole thing and run them then.

I see what you meant with your idea above, it’s just that in this particular example I see no reason to go through the intermediate step.