Zip list of different lengths

I’m having trouble trying to zip lists of different lengths because elixir only returns the first match of each list.

I have the following list of lists:

[
  ["e5", "e7", "e7", "e8", "e8", "e2", "e2", "e0", "e0", "e0"],
  ["B5", "B5", "B5", "B3", "B1", "B1", "B1", "B0", "B1", "B1"],
  ["G5", "G5", "G5", "G2", "G2", "G2"],
  ["D7"]
]

Elixir return this:
[{“e5”, “B5”, “G5”, “D7”}]

And I would like to get the following list:

[
  ["D7", "G5", "B5", "e5"],
  ["G5", "B5", "e7"],
  ...,
  ["B1", "e0"]
]

Is there any way to accomplish this with zip or any other function?

not directly possible.

The zipping finishes as soon as any enumerable in the given collection completes. [Enum — Elixir v1.16.0]

Sth like python’s zip_longest would be nice in Enum.
But even that would leave you with some handy work.

Hint: you could pad the lists with nil to fill all up to length of the longest list, and then remove the nils after that.

2 Likes

If you don’t want to write your own recursive function helpers for this then your best bet is indeed filling up the smaller lists with a marker value – nil or something else – and then just throw those away after the zipping operation, as @Sebb suggested.

2 Likes

yes, that seems to be the better way here.

@dimitarvp already suggested recursion, this is one way to do that using unfold

data = [
  ["e5", "e7", "e7", "e8", "e8", "e2", "e2", "e0", "e0", "e0"],
  ["B5", "B5", "B5", "B3", "B1", "B1", "B1", "B0", "B1", "B1"],
  ["G5", "G5", "G5", "G2", "G2", "G2"],
  ["D7"]
]

Stream.unfold(data, fn list_of_list ->
  List.foldr(list_of_list, {[], []}, fn
    [], {heads, tails} -> {heads, tails}
    [h], {heads, tails} -> {[h | heads], tails}
    [h | t], {heads, tails} -> {[h | heads], [t | tails]}
  end)
  |> case do
    {[], _tails} -> nil
    {heads, tails} -> {heads, tails}
  end
end)
|> Enum.to_list()
2 Likes
  def zip([], zipped), do: zipped

  def zip(lists, zipped) do
    {zipped_, rest_} =
      for [h | t] <- lists, reduce: {[], []} do
        {hs, ts} -> {[h | hs], ts ++ [t]}
      end

    zip(Enum.reject(rest_, &(&1 == [])), zipped ++ [zipped_])
  end
zip(data, []) #=> [
  ["D7", "G5", "B5", "e5"],
  ["G5", "B5", "e7"],
  ["G5", "B5", "e7"],
  ["G2", "B3", "e8"],
  ["G2", "B1", "e8"],
  ["G2", "B1", "e2"],
  ["B1", "e2"],
  ["B0", "e0"],
  ["B1", "e0"],
  ["B1", "e0"]
]

or using Enum.zip with help

  def zip2(lists) do
    max_len = lists |> Enum.map(&length(&1)) |> Enum.max()

    lists
    |> Enum.map(&(&1 ++ List.duplicate(nil, max_len - length(&1))))
    |> Enum.zip()
    |> Enum.map(& &1 |> Tuple.to_list() |> Enum.reject(fn x -> x == nil end))
  end
1 Like

Here you go:

defmodule Example do
  def sample(input) do
    input
    |> Enum.reduce({[], []}, fn
      # when list contains only one element
      # we add it as others and skipping empty list in next steps
      [head], {left_acc, right_acc} -> {[head | left_acc], right_acc}
      # in other case
      # we add list's head and pass its tail, so it would be used in next steps
      [head | tail], {left_acc, right_acc} -> {[head | left_acc], [tail | right_acc]}
    end)
    |> case do
      # since we have lists of lists we need to wrap last list in another one-element list
      # so it's added as one, last element of result
      # instead of adding all elements of last list as separate result elements
      # [
      #   list1,
      #   list2,
      #   # …
      #   listn_elem1,
      #   listn_elem2
      # ]
      {left_acc, []} -> [left_acc]
      # left_acc here is a list of heads for each step
      # tail is a recursive call of input lists except their heads
      # we need to reverse it as otherwise each step would have opposite order to previous step
      {left_acc, right_acc} -> [left_acc | right_acc |> Enum.reverse() |> sample()]
    end
  end
end

input = [
  ["e5", "e7", "e7", "e8", "e8", "e2", "e2", "e0", "e0", "e0"],
  ["B5", "B5", "B5", "B3", "B1", "B1", "B1", "B0", "B1", "B1"],
  ["G5", "G5", "G5", "G2", "G2", "G2"],
  ["D7"]
]

iex> Example.sample(input)

Helpful resources:

  1. Enum.reduce/3
  2. Enum.reverse/1
1 Like

What should happen when the input is the following?

[
  ["e5", "e7", "e7", "e8", "e8", "e2", "e2", "e0", "e0", "e0"],
  ["G5", "G5", "G5", "G2", "G2", "G2"],
  ["B5", "B5", "B5", "B3", "B1", "B1", "B1", "B0", "B1", "B1"],
  ["D7"]
]

what do you think is special about this input?

lists #=> [
  ["e5", "e7", "e7", "e8", "e8", "e2", "e2", "e0", "e0", "e0"],
  ["G5", "G5", "G5", "G2", "G2", "G2"],
  ["B5", "B5", "B5", "B3", "B1", "B1", "B1", "B0", "B1", "B1"],
  ["D7"]
]
|> Enum.map(&(&1 ++ List.duplicate(nil, max_len - length(&1)))) #=> [
  ["e5", "e7", "e7", "e8", "e8", "e2", "e2", "e0", "e0", "e0"],
  ["G5", "G5", "G5", "G2", "G2", "G2", nil, nil, nil, nil],
  ["B5", "B5", "B5", "B3", "B1", "B1", "B1", "B0", "B1", "B1"],
  ["D7", nil, nil, nil, nil, nil, nil, nil, nil, nil]
]
|> Enum.zip() #=> [
  {"e5", "G5", "B5", "D7"},
  {"e7", "G5", "B5", nil},
  {"e7", "G5", "B5", nil},
  {"e8", "G2", "B3", nil},
  {"e8", "G2", "B1", nil},
  {"e2", "G2", "B1", nil},
  {"e2", nil, "B1", nil},
  {"e0", nil, "B0", nil},
  {"e0", nil, "B1", nil},
  {"e0", nil, "B1", nil}
]
|> Enum.map(&(&1 |> Tuple.to_list() |> Enum.reject(fn x -> x == nil end))) #=> [
  ["e5", "G5", "B5", "D7"],
  ["e7", "G5", "B5"],
  ["e7", "G5", "B5"],
  ["e8", "G2", "B3"],
  ["e8", "G2", "B1"],
  ["e2", "G2", "B1"],
  ["e2", "B1"],
  ["e0", "B0"],
  ["e0", "B1"],
  ["e0", "B1"]
]

(isnt dbg the coolest thing?)

1 Like

Maybe it should be e2, nil, B1? Or something else.

But my real hypothesis was that the OP is not coming back. So far satisfied…

I don’t think so:

I’m having trouble trying to zip lists of different lengths

nil should stay only if it’s a part of input. In theory any solution which is adding nil values in the middle of process and later removes all nil values could be wrong. “Could” because author have specified String values as the only possibility. Otherwise there should be at least one atom or integer in code examples.

Since in Elixir it’s common to use nil values similarly to undefined (in other languages) the mistake “is” at author side even if there are other solutions without such problem. I’m quoting “is” because it’s his/her first post and question is relatively simple which in my opinion should lead community to provide more “safe” answers anyway.

The other argument against such solution could be question itself (without code). While examples shows list(list(String.t())) as the only possibility, the question is more generic. Please pay attention that in question there is no word on what’s the contents of said lists which means we have here list(list(term)) instead, especially that input here is extremely simple (it looks like String values are different only to easily catch incorrect zip cases).

Also I would suggest to provide “safe” solutions “by default” as many people especially including junior developers are not used to look for all possible edge cases.

In the OP’s example, the lists decrease in length. For any other shape of input, we can only assume what the desired behaviour is. What you say is a “safe” solution may be different to what the user is expecting.

  • Maybe lists that do not decrease in length is an error in their case, not something that should be transparently handled.
  • Maybe their code makes assumptions based on the position in the list.

These implementations make assumptions that have caveats. It’s possible the OP (or reader) had not considered this and is why I asked for clarification.

good catch, so the pad-with-nil solution is out.

I would use make_ref() as marker value, guaranteed to be unique… but I vote for a recursive function :slight_smile:

1 Like