Why is [1, 2, 3 | 4] a valid list if lists are linked lists?

I did not formally study Computer Science, so please bear with me if I am missing something obvious. I am studying the inner workings of functional programming concepts and wrote this simplistic map function:

defmodule EnumMap do
  def map(proc, [head | []]) do
    proc.(head)
  end

  def map(proc, [head | tail]) do
    [proc.(head)] ++ map(proc, tail)
  end
end

EnumMap.map(fn x -> x + 1 end, [1, 2, 3, 4])

This produced: [2, 3, 4 | 5].

I thought “that’s weird, why is 5 being explicitly called with a list constructor?”

Aha, then I caught my mistake:

defmodule EnumMap do
  def map(proc, [head | []]) do
    [proc.(head)] # I needed to wrap this result in a list
  end

  def map(proc, [head | tail]) do
    [proc.(head)] ++ map(proc, tail)
  end
end

EnumMap.map(fn x -> x + 1 end, [1, 2, 3, 4])

Producing: [2, 3, 4, 5]

Hooray! However, that then made me wonder!

If Elixir lists are linked lists, why is [2, 3, 4 | 5] allowed? Isn’t that explicitly saying “this is a linked list, except for the final element which is just an integer.”? Is there a benefit to having linked lists where the final element is not in fact a list? Are there certain data structures or algorithms that use these sort of odd linked lists?

This is allowed but this is actually invalid list.

Because you have a subtle bug in your code, as you find out later.

[2, 3, 4 | 5] is an improper list. You should avoid it, but there is nothing in the language to forbid it.

If you are interested in improper lists, you can read: Making sense of Elixir (improper) lists

9 Likes

Thank you for sharing that article. Super interesting read! Also made me realize my use of lists could be optimized:

My original implementation used appending to create the list:

[proc.(head)] ++ map(proc, tail)

However I could instead use the cons operator to actually form a linked list as it is intended:

[proc.(head) | map(proc, tail)]

It also was redundant in its application of [proc.(head)] and could be simplified for just matching against an empty array.

Better version:

defmodule EnumMap do
  def map(_proc, []), do: []
  def map(proc, [head | tail]) do
    [proc.(head) | map(proc, tail)]
  end
end

EnumMap.map(fn x -> x + 1 end, [1, 2, 3, 4])
1 Like

You can also refactor that recursion to use tail call optimization.

2 Likes

I have read this article countless times. It has the best explanation of improper/io lists I’ve read since I first started working with Elixir. I’m constantly recommending this to other people; it’s a great article! :heart:

It is a very good suggestion as an exercise, but for this particular case which is building a list it might not be the best approach, actually :lists.map/2 (used under the hood by Enum.map/2) is implemented with body recursion:

Also see: Erlang -- The Seven Myths of Erlang Performance.

1 Like

As somebody new to functional programming, can I ask whether this means simply using an accumulator? Or is there some more idiomatic-Elixir method for this? E.g.

defmodule EnumMapTail do
  def map(proc, list) do
    run_map(proc, list, [])
  end
  defp run_map(_proc, [], acc), do: Enum.reverse(acc)
  defp run_map(proc, [head | tail], acc) do
    run_map(proc, tail, [proc.(head) | acc])
  end
end
1 Like

The cons operator is indeed much more idiomatic in this case, but this is mostly a style preference, there is no performance or behavior implication.
Both will generate the same bytecode when prepending a known number of elements with ++.

2 Likes

Indeed, this is exactly what tail recursion is!
Your code is exactly how you would write it.

1 Like