saudade
Merging Intervals the "Elixir Way"
Hi there! Newbie to the language looking for tips on how to write more idiomatic elixir.
The problem: Given a list of intervals (start, end), merge any overlapping intervals and return the new list.
Ex:
iex(1)> intervals = [%{start: 1, end: 2},%{start: 2,end: 3}, %{start: 4, end: 6}, %{start: 7, end: 10}, %{start: 9, end:
12}]
iex(2)> Intervals.merge_intervals intervals
[%{end: 3, start: 1}, %{end: 6, start: 4}, %{end: 12, start: 7}]
Here’s what I’ve got working:
defmodule Intervals do
def merge_intervals(intervals) do
Enum.sort(intervals, &(&1.start <= &2.start))
|> _merge_intervals([])
|> Enum.reverse
end
defp _merge_intervals([], merged_intervals), do: merged_intervals
defp _merge_intervals([%{end: cur_end} = cur, %{start: next_start} = next | tail], merged_intervals) when cur_end >= next_start do
collapsed = _collapse_intervals(cur,next)
_merge_intervals(tail, [collapsed | merged_intervals])
end
defp _merge_intervals([head | tail], merged_intervals), do: _merge_intervals(tail, [head | merged_intervals])
defp _collapse_intervals(a, b), do: %{start: a.start, end: b.end}
end
Obviously looking for any no-brainer idiomatic improvements, but my one nagging question is if inclusion of the second merge_intervals definition is the right way to do things. It seems significantly more difficult to read (especially pattern matching to extract cur_end and next_start) than throwing a control-flow if into the body of the function definition. I know that pattern matching is powerful, but is there a compelling reason to use that functionality in a case like this one? Is there a better way to do this?
Thanks!
Most Liked
peerreynders
I think in this case the reduce version is more readable - but that doesn’t mean that reduce is always better than recursion - context is everything.
Actually according the Elixir style guide that code should be written as
def merge_intervals(intervals) do
[head | tail] = Enum.sort(intervals, &(&1.start <= &2.start))
tail
|> Enum.reduce([head], &merge_step/2)
|> Enum.reverse
end
Well you could do this
def merge_intervals(intervals) do
intervals
|> Enum.sort(&(&1.start <= &2.start))
|> (fn ([]) -> []
([h|t]) -> [[h]|t] end.())
|> Enum.reduce(&merge_step/2)
|> Enum.reverse
end
… but I don’t see anything to recommend that approach.
Works fine, no problems …
def merge_intervals(intervals) do
[head | tail] = Enum.sort(intervals, &(&1.start <= &2.start))
tail
|> Enum.reduce([head],fn (int, [head|tail]) ->
cond do
head.end >= int.start ->
[collapse_intervals(head, int) | tail]
true ->
[int, head | tail]
end
end)
|> Enum.reverse
end
… but why for heavens sake would you choose to write code like this??? I find this infinitely more readable:
defmodule Intervals do
defp compare_start(lhs,rhs), do: lhs.start <= rhs.start
defp merge(lhs, rhs), do: %{start: (min lhs.start, rhs.start), end: (max lhs.end, rhs.end )}
defp cons(head, [next|tail]) do
cond do
next.end >= head.start -> # intervals overlap
[merge(head, next) | tail]
true ->
[head, next | tail]
end
end
defp prep([]), do: []
defp prep([head | tail]), do: [[head]|tail]
def merge_intervals(intervals) do
intervals
|> Enum.sort(&compare_start/2)
|> prep()
|> Enum.reduce(&cons/2)
|> Enum.reverse
end
end
IO.inspect Intervals.merge_intervals [%{start: 1, end: 2},%{start: 2,end: 3}, %{start: 4, end: 6}, %{start: 7, end: 10}, %{start: 9, end: 12}]
i.e. no anonymous functions “cleverly” interleaving code everywhere, but rather well-named functions that describe what they do (so OK, cons is a bit conceited). Anonymous functions have their place but I use them when I need a closure - other than that I prefer well named functions.
And you’ve just hit one of my pet peeves with the “JavaScript community style” - to quote my own rant:
The main issue with if is that it looks like a statement when in fact it is an expression - an if without an else will return a value of nil (which actually is :nil, i.e. its an atom) when the condition fails. So I personally focus on using cond and case (as shown in the example above) to remind myself that everything is an expression.
Also unless you are planning to add more information to the maps, I think its perfectly OK to use tuples in this case:
defmodule Intervals do
defp compare_start({lhs,_},{rhs,_}), do: lhs <= rhs
defp merge({lhs_start,lhs_end}, {rhs_start,rhs_end}),
do: {(min lhs_start, rhs_start), (max lhs_end, rhs_end)}
defp cons({head_start,_} = head, [{_, next_end} = next|tail]) do
cond do
next_end >= head_start -> # intervals overlap
[(merge head, next) | tail]
true ->
[head, next | tail]
end
end
defp prep([]), do: []
defp prep([head | tail]), do: [[head]|tail]
def merge_intervals(intervals) do
intervals
|> Enum.sort(&compare_start/2)
|> prep()
|> Enum.reduce(&cons/2)
|> Enum.reverse
end
end
IO.inspect Intervals.merge_intervals [{1,2},{2,3}, {4,6}, {7,10}, {9,12}]
peerreynders
This could be considered another refinement
defmodule Intervals do
defp lte_start({lhs_start, _},{rhs_start, _}), do: lhs_start <= rhs_start
defp cons({head_start, head_end}, [{next_start, next_end} | tail])
when next_end >= head_start, # intervals overlap
do: [{next_start, (max next_end, head_end)} | tail]
defp cons(head, rest),
do: [head | rest]
defp prep([]), do: []
defp prep([head | tail]), do: [[head]|tail]
def merge_intervals(intervals) do
intervals
|> Enum.sort(<e_start/2)
|> prep()
|> Enum.reduce(&cons/2)
|> Enum.reverse
end
end
IO.inspect Intervals.merge_intervals [{1,2},{2,3}, {4,6}, {7,10}, {9,12}]
It’s very important to recognize that pattern matching is a conditional construct - unlike JavaScript’s Destructuring Assignment.
I personally also view the guard sequence (when-clause) as an extension of the pattern match as the guard sequence can supply match conditions that would be difficult to express within the pattern itself.
saudade
Nope, all this does is the body of the function (delineated via the do:) just compares the left and right bindings of the names lhs and rhs using the <= operator, basically just like 1 <= 2, and returns true or false depending on if the lhs is less than or equal to the rhs.
…that makes more sense








