When to use closures

I have been playing with closures today to learn when to use them, but each time I find myself thinking: what’s the advantage over a normal function?

For example split_sentence is a closure and returns a function which returns a value.

And split_sentence2 is not a closure and just returns the same value as split_sentence but without being wrapped in a function.

(They both take a string such as “one two three” and return a new string with spaces replaced with “*”, “|”, “#” or “,”. They don’t completely check the given parameters for correctness, as I’m only creating them to learn closures.)

#closure example (if I understand closures correctly)
def split_sentence do
    token_options = ["*", "|", "#", ","]  
    fn(str, token) -> 
        if is_bitstring(str || token) do
            if Enum.member? token_options, token do 
                String.replace str, " ", token, global: true
            else [:error, "#{token} is not a valid token. Valid tokens are #{token_options}"]
            end
        else [:error, "two strings expected"]
        end
    end
end
 
#not a closure
def split_sentence2(str, token) do 
    token_options = ["*", "|", "#", ","] 
    if is_bitstring(str || token) do
        if Enum.member? token_options, token do 
            String.replace str, " ", token, global: true
        else [:error, "#{token} is not a valid token. Valid tokens are #{token_options}"]
        end
    else [:error, "two strings expected"]
    end
end

Why use one over the other? Maybe these aren’t good examples? What is a good use case for closures? I have googled this question a lot and still can’t translate the answers into my own coding.

Closures can be used for lots of things, but at heart they’re about separating “what should be computed” from “when it should be computed”.

A good example to look at is Map.get/3 versus Map.get_lazy/3:

Map.get(some_map, :some_key, ExpensiveModule.run())
# always executes ExpensiveModule.run, even when the value isn't used

Map.get_lazy(some_map, :some_key, fn -> ExpensiveModule.run() end)
# only executes ExpensiveModule.run if the value is needed

Here, wrapping ExpensiveModule.run() (the “what should be computed”) in a closure allows Map.get_lazy to evaluate it on demand (the “when it should be computed”).

This operation is so common there’s a shorthand for it, the “capture operator” &. The second call above could instead be spelled Map.get_lazy(some_map, :some_key, &ExpensiveModule.run/0)

(aside: a closure can also be handy if the value is sensitive and you don’t want it to appear in debug traces, etc)
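To watch the eager/lazy difference happen, here is a minimal sketch. LazyDemo is a hypothetical stand-in for ExpensiveModule (it is not from the original example); it sends a message to the current process each time it runs, so we can check afterwards whether the "expensive" work was actually executed.

```elixir
defmodule LazyDemo do
  # Hypothetical stand-in for ExpensiveModule.run/0.
  # Leaves a message in the caller's mailbox so each run can be observed.
  def expensive do
    send(self(), :expensive_ran)
    :fallback
  end

  # Reports whether expensive/0 has run since the last check.
  def ran? do
    receive do
      :expensive_ran -> true
    after
      0 -> false
    end
  end
end

some_map = %{some_key: 1}

# Eager: the default is computed before Map.get/3 is even called.
eager_value = Map.get(some_map, :some_key, LazyDemo.expensive())
eager_ran = LazyDemo.ran?()

# Lazy: the key exists, so the closure is never invoked.
lazy_value = Map.get_lazy(some_map, :some_key, fn -> LazyDemo.expensive() end)
lazy_ran = LazyDemo.ran?()

# Lazy with a missing key: now the captured function is invoked.
missing_value = Map.get_lazy(%{}, :some_key, &LazyDemo.expensive/0)
missing_ran = LazyDemo.ran?()
```

Both lookups on some_map return 1, but only the eager one pays for the expensive call.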


Closures see two kinds of variables: arguments and environment.

  • arguments are supplied where the closure is used. A simple example: Enum.map(some_list, fn arg -> IO.inspect(arg) end)
  • the closure’s environment is “closed” when the closure is created, and contains the local variables visible at that point in the code. For instance:
def split_many(strings, sep) do
  Enum.map(strings, fn s ->
    String.split(s, sep)
  end)
end

Here s is an argument to the closure, but sep is part of the environment.


The closure’s environment is part of the closure, so it doesn’t change even when the closure is passed around:

def splitter(sep) do
  fn s ->
    String.split(s, sep)
  end
end

def split_many_with_splitter(strings, splitter) do
  Enum.map(strings, splitter)
end

# called like
split_many_with_splitter(strings, splitter("\t"))

Here sep is part of the environment in the closure returned by splitter, still available despite the splitter function having returned.
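A small sketch of what "the environment doesn't change" means in practice. The SplitterSketch module name and the sample strings are made up for illustration; the splitter function is the same one as above. Each closure keeps its own sep, and rebinding a variable after a closure is created does not affect what the closure sees.

```elixir
defmodule SplitterSketch do
  # Same idea as splitter/1 above: sep is captured in the returned closure.
  def splitter(sep), do: fn s -> String.split(s, sep) end
end

by_tab = SplitterSketch.splitter("\t")
by_comma = SplitterSketch.splitter(",")

# Each closure carries its own environment.
tab_result = Enum.map(["a\tb", "c,d"], by_tab)
comma_result = Enum.map(["a\tb", "c,d"], by_comma)

# Rebinding a captured variable later does not change the closure.
sep = ";"
split_on_sep = fn s -> String.split(s, sep) end
sep = ","

inside_closure = split_on_sep.("x;y,z")      # still splits on ";"
outside_closure = String.split("x;y,z", sep) # uses the rebound ","
```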

The BEAM takes this one level farther: code can send a closure to a different PROCESS and it will work without issues:

origin_pid = self()
closure = fn other_pid ->
  IO.inspect(origin_pid, label: "origin pid")
  IO.inspect(other_pid, label: "other pid")
end

listener_pid = spawn(fn ->
  receive do
    a_closure ->
      a_closure.(self())
  end
end)

send(listener_pid, closure)

should print something like:

origin pid: #PID<0.106.0>
#Function<44.97283095/1 in :erl_eval.expr/5>
other pid: #PID<0.152.0>

(the exact order of the lines varies because of concurrency; the #Function<...> line is iex printing the return value of send/2, which is the closure itself)

Here closure is created in the initial process and sent to the listener_pid process, but it retains the value of origin_pid.
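If you would rather check this in code than read printed output, here is a variant of the same idea as a sketch. Instead of calling IO.inspect, the closure sends a message back to the captured origin_pid (the {:ran_in, pid} message shape is my own invention for this example), so the original process can confirm where the closure actually ran.

```elixir
origin_pid = self()

# The closure captures origin_pid from the process that creates it.
closure = fn other_pid ->
  send(origin_pid, {:ran_in, other_pid})
end

# The listener waits for a closure and runs it, passing its own pid.
listener_pid =
  spawn(fn ->
    receive do
      a_closure -> a_closure.(self())
    end
  end)

send(listener_pid, closure)

# Back in the original process: wait for the report from the closure.
ran_in =
  receive do
    {:ran_in, pid} -> pid
  after
    1_000 -> :timeout
  end
```

ran_in is the listener's pid, not the origin's: the closure executed in the other process but still knew where to send its answer.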

This is used in the standard library for things like Agent.get, where it’s used to avoid copying the entire Agent state back to the calling process - the idea is that the closure passed to Agent.get can access the agent’s state directly and return only what it’s interested in. For instance, you might use an Agent to maintain a large shared data structure in-memory and use Agent.get with a closure that extracts a small piece.
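A minimal sketch of that Agent pattern. The data (a map of the first 10,000 squares) and the variable names are made up for illustration. The closure runs inside the agent process, reads the big map there, and only the single extracted value is sent back; note that it also captures wanted from the caller's environment.

```elixir
# A large-ish state held by the agent (illustrative data only).
{:ok, agent} =
  Agent.start_link(fn ->
    Map.new(1..10_000, fn i -> {i, i * i} end)
  end)

wanted = 12

# The closure executes in the agent process; only its result is copied back.
square = Agent.get(agent, fn state -> Map.fetch!(state, wanted) end)

Agent.stop(agent)
```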

This also works over Distributed Erlang, for the ultimate “bring the computation TO the data” experience.


Edit: add some general Elixir notes

  • [:error, "two strings expected"]: the usual convention for this kind of value is a tuple ({:error, "two strings expected"}). The list form will work, but future readers may be confused

  • if is_bitstring(str || token) do: this likely doesn’t do what you mean. It reads like the way we would say it (“if str or token is a bitstring, do this”), but str || token evaluates to str unless str is nil or false, so it actually means “if str is a bitstring, or str is nil/false and token is a bitstring”. Consider if is_bitstring(str) and is_bitstring(token) do, or see below for another option

  • consider doing type-checking with guard clauses and multiple function heads to keep the main code path clear:

def split_sentence(str, token) when is_bitstring(str) and is_bitstring(token) do
  ...
end

def split_sentence(_, _), do: {:error, "two strings expected"}
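Putting those notes together, here is one possible way the guarded version could look. This is a sketch, not the only way to fill in the body: the SentenceSketch module name, the @token_options attribute, and the use of Enum.join in the error message are my own choices. The last lines show the || pitfall from the second note.

```elixir
defmodule SentenceSketch do
  @token_options ["*", "|", "#", ","]

  # Main path: only reached when both arguments pass the guard.
  def split_sentence(str, token) when is_bitstring(str) and is_bitstring(token) do
    if token in @token_options do
      String.replace(str, " ", token)
    else
      {:error,
       "#{token} is not a valid token. Valid tokens are #{Enum.join(@token_options, " ")}"}
    end
  end

  # Fallback head: anything that failed the guard ends up here.
  def split_sentence(_, _), do: {:error, "two strings expected"}
end

replaced = SentenceSketch.split_sentence("one two three", "*")
bad_type = SentenceSketch.split_sentence("one two three", 42)

# The || pitfall: str is truthy, so token is never even looked at.
str = "one two three"
token = 42
loose_check = is_bitstring(str || token)
strict_check = is_bitstring(str) and is_bitstring(token)
```

loose_check comes out true even though token is an integer, which is why the original check lets bad input through.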

This is amazing thank you. I’m going to look at this in the morning when my brain isn’t so fried, so I can reply properly.

I think I’m getting there. Would you say a closure is like an object (as in OO languages) with one method that gets passed into other objects to have its functionality used when needed?

I think I may need to switch my thinking from “how to make use of closures” to code modularity and small reusable functions that can be passed around. That way closures should naturally become part of the code as needed.

Very interesting. Thank you again for the examples, they really help.

In languages like Elixir, functions are first class (as function expressions, or lambda expressions): you can bind them to variables and pass them around.

When a function has variables in its body that are not among its arguments (“free” variables, which make it an open expression), their values come from the surrounding context (like the “parent” scope). Once every variable is bound to a value, the expression is closed and a computer can evaluate it. That context is called the closure of the function expression, because it closes an open expression.

Many people call the function expressions themselves closures, though, because implementations often wrap both the context and the function definition in a single Closure struct, and people just got used to that. So in practical terms you can call the function a closure and that’s alright, but conceptually the closure is the context.

Would you say a closure is like an object (as in OO languages) with one method that gets passed into other objects to have its functionality used when needed?

Conceptually a closure is not an object, it’s just a function value. In an object oriented language like C# you may see something like this:

using System.Linq;

int[] arr = { 1, 2, 3 };
int[] doubledArr = arr.Select(el => el * 2).ToArray();

Where el => el * 2 is the closure. It’s not a method of any object. It’s kind of like a “literal function”.

nitpick: I’d call this an anonymous function. Closures are those that also capture the lexical scope of the environment. Most sane languages allow both; and will optimize a clean closure into an anonymous function.

Both capture a subset of the surrounding environment (which may be the empty set!). Whether you use it in the expression or not is up to you, and may or may not trigger some compiler optimizations.
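On the BEAM you can actually look at what a function value captured. This is a sketch assuming the functions are defined in a compiled module (functions typed directly into iex are evaluated differently and report a different environment); EnvSketch and its function names are hypothetical. Function.info/2 with :env returns the captured free variables.

```elixir
defmodule EnvSketch do
  # y is a free variable of the inner function, so it must be captured.
  def adder(y), do: fn x -> x + y end

  # No free local variables here: nothing to capture.
  def add_two, do: fn x -> x + 2 end
end

{:env, captured} = Function.info(EnvSketch.adder(10), :env)
{:env, nothing_captured} = Function.info(EnvSketch.add_two(), :env)

# The captured value is what makes the two adders behave differently.
thirteen = EnvSketch.adder(10).(3)
five = EnvSketch.add_two().(3)
```

Here captured holds the 10 that was passed to adder/1, while nothing_captured is an empty list.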

The correct term wouldn’t be closure either, but rather function expression, or lambda expression (as in C++, where the closure is more evident since you have to capture it explicitly). When you’re implementing lambda expressions in a language you always need both the lambda definition and the closure (the context it needs to be a closed expression). The first implementations called that pair a “closure”, which is not 100% correct, but naming things is hard and we keep using it.

There’s no “anonymous function” without a closure, because otherwise it wouldn’t make any sense to the computer: it can’t evaluate open expressions. Even in an expression like fn x -> x + 2 end in Elixir you’re capturing an environment (Elixir.Kernel, otherwise + would be undefined).

For that particular example, I tried to stress the fact that it’s an expression that can be defined like any other value and is not bound to a particular data structure, like foo(Utils.Toupper) where Utils is a class with static methods.

PS: correct me if I’m wrong, but that’s the understanding I got a while ago when researching about λ-calculus and the differences between closures and lambda expressions

There is no disagreement here. What I was trying to say was that some languages have anonymous functions but not closures, for example C. In a language that supports full closures, it is sometimes important to refactor code to use pure anonymous functions wherever possible, as a way to optimize for performance.