# Merging Intervals the "Elixir Way"

Hi there! Newbie to the language looking for tips on how to write more idiomatic elixir.

The problem: Given a list of intervals (start, end), merge any overlapping intervals and return the new list.

Ex:

``````iex(1)> intervals = [%{start: 1, end: 2},%{start: 2,end: 3}, %{start: 4, end: 6}, %{start: 7, end: 10}, %{start: 9, end:
12}]
iex(2)> Intervals.merge_intervals intervals
[%{end: 3, start: 1}, %{end: 6, start: 4}, %{end: 12, start: 7}]
``````

Here’s what I’ve got working:

``````defmodule Intervals do
def merge_intervals(intervals) do
Enum.sort(intervals, &(&1.start <= &2.start))
|> _merge_intervals([])
|> Enum.reverse
end

defp _merge_intervals([], merged_intervals), do: merged_intervals
defp _merge_intervals([%{end: cur_end} = cur, %{start: next_start} = next | tail], merged_intervals) when cur_end >= next_start do
collapsed = _collapse_intervals(cur,next)
_merge_intervals(tail, [collapsed | merged_intervals])
end

defp _collapse_intervals(a, b), do: %{start: a.start, end: b.end}
end
``````

Obviously looking for any no-brainer idiomatic improvements, but my one nagging question is if inclusion of the second merge_intervals definition is the right way to do things. It seems significantly more difficult to read (especially pattern matching to extract cur_end and next_start) than throwing a control-flow if into the body of the function definition. I know that pattern matching is powerful, but is there a compelling reason to use that functionality in a case like this one? Is there a better way to do this?

Thanks!

2 Likes

Looks idomatic to me. A few changes I would make are:

1. Make the collapse interval inline. That should improve readability.
2. Remove `_` prefixes, I haven’t seen used this in Elixir for private functions. You can just use the same name as the arity is different.
3. Move the `when` a new line
4. Rename _merge_intervals to merge_sorted_intervals

Would my above solution be preferable to a standard reduce?

``````def merge_intervals(intervals) do
[head | tail] = Enum.sort(intervals, &(&1.start <= &2.start))
|> Enum.reverse
end

def merge_step(int, acc) do
[ head | tail ] = acc
else
end
end
``````

Three questions related to that solution:

1. Notice that I now am only piping two functions instead of three. It seemed to be required to start off the reduce with the first interval in the accumulator already and I couldn’t figure out how to achieve the [head | tail] match without removing my first pipe. Is there a way do that?

2. Is there a way to define the merge_step function as an anonymous function on multiple lines? I come from the javascript world where something like:

intervals.reduce( (acc, int) => {
…work
});

is standard. I tried the fn(int,acc) -> end syntax but it seems like that only works if the end is on the same line.

1. I still feel like what I’ve read suggests avoiding if statements when possible, but this seems like a pretty bread and butter use case. Is there a better option?

To point 2 that you made: This was suggested in Programming Elixir 1.3. I’m happy to choose away from it, as I’ve never liked leading underscores anyway, but would be interested in what the consensus is.

Leading underscores for private functions is a terrible idea IMHO. Leading underscores have meaning to the compiler for variable names. Adding them to functions just leads to confusion.

Correct me if I’m wrong, but don’t you want to sort by `&(&1.start <= &2.start)` ?

Yep. Fixed that in the above code.

I think in this case the `reduce` version is more readable - but that doesn’t mean that `reduce` is always better than recursion - context is everything.

Actually according the Elixir style guide that code should be written as

``````def merge_intervals(intervals) do
[head | tail] = Enum.sort(intervals, &(&1.start <= &2.start))
tail
|> Enum.reverse
end
``````

Well you could do this

``````def merge_intervals(intervals) do
intervals
|> Enum.sort(&(&1.start <= &2.start))
|> (fn ([]) -> []
([h|t]) -> [[h]|t] end.())
|> Enum.reduce(&merge_step/2)
|> Enum.reverse
end
``````

… but I don’t see anything to recommend that approach.

Works fine, no problems …

``````def merge_intervals(intervals) do
[head | tail] = Enum.sort(intervals, &(&1.start <= &2.start))
tail
cond do
true ->
end
end)
|> Enum.reverse
end
``````

… but why for heavens sake would you choose to write code like this??? I find this infinitely more readable:

``````defmodule Intervals do

defp compare_start(lhs,rhs), do: lhs.start <= rhs.start

defp merge(lhs, rhs), do: %{start: (min lhs.start, rhs.start), end: (max lhs.end, rhs.end )}

cond do
next.end >= head.start -> # intervals overlap
true ->
end
end

defp prep([]), do: []

def merge_intervals(intervals) do
intervals
|> Enum.sort(&compare_start/2)
|> prep()
|> Enum.reduce(&cons/2)
|> Enum.reverse
end

end

IO.inspect Intervals.merge_intervals [%{start: 1, end: 2},%{start: 2,end: 3}, %{start: 4, end: 6}, %{start: 7, end: 10}, %{start: 9, end: 12}]
``````

i.e. no anonymous functions “cleverly” interleaving code everywhere, but rather well-named functions that describe what they do (so OK, `cons` is a bit conceited). Anonymous functions have their place but I use them when I need a closure - other than that I prefer well named functions.

And you’ve just hit one of my pet peeves with the “JavaScript community style” - to quote my own rant:

The main issue with `if` is that it looks like a statement when in fact it is an expression - an `if` without an `else` will return a value of `nil` (which actually is `:nil`, i.e. its an atom) when the condition fails. So I personally focus on using `cond` and `case` (as shown in the example above) to remind myself that everything is an expression.

Also unless you are planning to add more information to the maps, I think its perfectly OK to use tuples in this case:

``````defmodule Intervals do

defp compare_start({lhs,_},{rhs,_}), do: lhs <= rhs

defp merge({lhs_start,lhs_end}, {rhs_start,rhs_end}),
do: {(min lhs_start, rhs_start), (max lhs_end, rhs_end)}

cond do
next_end >= head_start -> # intervals overlap
true ->
end
end

defp prep([]), do: []

def merge_intervals(intervals) do
intervals
|> Enum.sort(&compare_start/2)
|> prep()
|> Enum.reduce(&cons/2)
|> Enum.reverse
end

end

IO.inspect Intervals.merge_intervals [{1,2},{2,3}, {4,6}, {7,10}, {9,12}]``````
1 Like

Great stuff, thanks!

One question: What’s going on here?

`do: lhs <= rhs`

Haven’t seen this syntax yet. Is it effectively a merge?

Nope, all this does is the body of the function (delineated via the `do:`) just compares the left and right bindings of the names `lhs` and `rhs` using the `<=` operator, basically just like `1 <= 2`, and returns `true` or `false` depending on if the `lhs` is less than or equal to the `rhs`.

This could be considered another refinement

``````defmodule Intervals do

defp lte_start({lhs_start, _},{rhs_start, _}), do: lhs_start <= rhs_start

when next_end >= head_start,   # intervals overlap
do: [{next_start, (max next_end, head_end)} | tail]

defp prep([]), do: []

def merge_intervals(intervals) do
intervals
|> Enum.sort(&lte_start/2)
|> prep()
|> Enum.reduce(&cons/2)
|> Enum.reverse
end

end

IO.inspect Intervals.merge_intervals [{1,2},{2,3}, {4,6}, {7,10}, {9,12}]
``````

It’s very important to recognize that pattern matching is a conditional construct - unlike JavaScript’s Destructuring Assignment.

I personally also view the guard sequence (`when`-clause) as an extension of the pattern match as the guard sequence can supply match conditions that would be difficult to express within the pattern itself.

1 Like

Nope, all this does is the body of the function (delineated via the do:) just compares the left and right bindings of the names lhs and rhs using the <= operator, basically just like 1 <= 2, and returns true or false depending on if the lhs is less than or equal to the rhs.

…that makes more sense

1 Like