Can you improve this? Zipping two lists, the result must be same size as the first list

lud · February 5, 2022, 12:21pm

As a micro optimization I got the habit to put clauses matching [] below clauses matching [_|_].

dimitarvp · February 5, 2022, 12:43pm

Yes, absolutely. Not all threads should be about the 2763th person panicking when iex shows them two characters when trying to print e.g. [10, 13]. Or “how do I make this Ecto query” or “how do I do X with Phoenix”. Or “I haven’t read even the basics of Elixir but please write this code for me”…

Threads like this one give people a chance to go deep and work on a small but (hopefully) interesting problem and show their talents. We indeed need more of them and I hope others will not be shy and start such threads.

dimitarvp · February 5, 2022, 1:34pm

Marking @LostKobrakai’s post/code as the solution. He won fair and square.

Also might be a good place for a shameless plug, a post of mine a while ago that got zero traction: Algorithm challenge: the Partition Problem

eksperimental · February 5, 2022, 1:48pm

Hi Dimitar,could you please share the whole source code as a Git repo?

dimitarvp · February 5, 2022, 2:04pm

Sure, I planned to anyway. Not sure about today but it will be soon.

eksperimental · February 5, 2022, 2:04pm

or at least the benchmark file as a gist.

adamu · February 5, 2022, 3:55pm

Why did you give the award to @LostKobrakai for the solution that @derek-zhou had already posted earlier? The code is the same and the difference in benchmark is negligible and can be attributed to randomness

(edit: sorry tried to combine the below posts but I can’t delete them so leaving as-is)

adamu · February 5, 2022, 4:13pm

On my machine, the winner between pair_with_recursion and pair_with_tail_recursion depends on the order they are supplied to Benchee. Put pair_with_tail_recursion last, and it wins…

adamu · February 5, 2022, 4:32pm

Similarly I didn’t see a significant difference here.

Comparison:
pair_with_recursion                     43.53 K
pair_with_tail_recursion_improved       43.09 K - 1.01x slower +0.24 μs
pair_with_tail_recursion                42.62 K - 1.02x slower +0.50 μs

On another run:

Comparison:
pair_with_tail_recursion                46.00 K
pair_with_tail_recursion_improved       45.77 K - 1.01x slower +0.112 μs
pair_with_recursion                     44.37 K - 1.04x slower +0.80 μs

dimitarvp · February 5, 2022, 5:05pm

Hm. Can you share the exact order of the benchmarks that you used? Let me run them again myself.

moogle19 · February 5, 2022, 5:11pm

Arent pair_with_tail_recursion and pair_with_recursion (except for a litte syntax like binding the copy of the list to list1) the same solutions?

adamu · February 5, 2022, 5:18pm

Yes, that’s my point. It doesn’t make sense to declare one person a winner over the other for the same answer, unless you give the recognition to the person who got there first

@dimitarvp same as your file - I just moved pair_with_tail_recursion to the end because I suspected less than 1 millionth of a second was not a significant difference and was probably more to do with the execution environment than the algorithm (which is the same).

dimitarvp · February 6, 2022, 12:57am

So I had a little time and reviewed again.

pair_with_recursion is @LostKobrakai’s code.

pair_with_tail_recursion is @derek-zhou’s code (amended slightly by @moogle19).

I have put pair_with_tail_recursion at the end – this didn’t change the order with which the benchmarks were being run (predictably, because the order with which you iterate through map’s pairs is not guaranteed in Elixir) but you (@adamu) are indeed correct that this tilted the benchmark slightly in favour of @derek-zhou’s code.

And now I remember why I left both in the module. @derek-zhou’s code uses Enum.reverse in the public function, at the end, while @LostKobrakai’s does so in the terminating condition in the private function. The difference should be inconsequential indeed as we’re seeing but I’ve learned not to presume anything and just measure so that’s why I left both variants intact.

But yep, seems they are identical so instead of proclaiming a single winner I say this:

@LostKobrakai and @derek-zhou practically arrived at the same solution with one super minor difference so they’re both winners. (and kudos to @moogle19 for slightly improving upon the code)

eksperimental · February 6, 2022, 4:47am

This is my micro-optimized version based on @derek-zhou

  def pair_with_tail_recursion_micro_optimized([h0 | t0] = _list0, [h1 | t1] = list1),
    do: pair_with_tail_recursion_micro_optimized(t0, t1, list1, [{h0, h1}])

  def pair_with_tail_recursion_micro_optimized(list0, list1),
    do: pair_with_tail_recursion_micro_optimized(list0, list1, list1, [])

  defp pair_with_tail_recursion_micro_optimized([h0 | t0], [h1 | t1], l1, acc),
    do: pair_with_tail_recursion_micro_optimized(t0, t1, l1, [{h0, h1} | acc])

  defp pair_with_tail_recursion_micro_optimized([h0 | t0], [], [h1 | t1] = l1, acc),
    do: pair_with_tail_recursion_micro_optimized(t0, t1, l1, [{h0, h1} | acc])

  defp pair_with_tail_recursion_micro_optimized([], _, _, acc),
    do: :lists.reverse(acc)

I haven’t bench-marked it. If it’s faster I can explain the changes in detail if needed.

eksperimental · February 6, 2022, 4:48am

Yes, I always do that. I think it is good practice. You only return once, so why check first in every iteration. If that loop is the returning one it does not hurt to check it last.

I have submitted a solution that uses that trick amongst others

moogle19 · February 6, 2022, 9:19am

Shouldn’t the second function just return an empty list, since if either (initial) list is empty the zipping doesn’t work? E.g:

  def pair_with_tail_recursion_micro_optimized([h0 | t0] = _list0, [h1 | t1] = list1),
    do: pair_with_tail_recursion_micro_optimized(t0, t1, list1, [{h0, h1}])

  def pair_with_tail_recursion_micro_optimized(_list0, _list1), do: []
  ...

Otherwise something like pair_with_tail_recursion_micro_optimized([:foo, :bar], []) would fail.

eksperimental · February 6, 2022, 2:51pm

Correct.
Too bad I cannot update it.
Thank you.

dimitarvp · February 11, 2022, 10:58pm

Thanks everyone for participating.

I made a GitHub repo here: GitHub - dimitarvp/elixir-zipleft-arena: Present competing implementations of zipping two lists while cycling through the second (assumed to be the smaller one)

Its central module is responsible for running the benchmarks of all other module/function implementations of the algorithm. Every module is named after the person who contributed the code, and those modules do contain more than one implementation (each still belongs to the same authoring person, of course).

I have included no tests but probably should have. I might get to it, PR welcome if somebody feels up to it. Preferrable: property tests with stream_data.

@derek-zhou / @eksperimental / @LostKobrakai / @moogle19 If you feel your code is misrepresented or wrongly copied, just ping me or make a PR – I’ll accept it!

Remarks:

Both the code of @derek-zhou and @eksperimental is amended with a small change proposed by @moogle19.
To @adamu: the performance difference between the code of @LostKobrakai and @derek-zhou is miniscule but still there and is actually reproducible on my machine so I am leaving both implementations as they are (they have a small difference which I explained here: Can you improve this? Zipping two lists, the result must be same size as the first list - #33 by dimitarvp). I also included @eksperimental’s variation.

adamu · February 12, 2022, 7:01am

I still don’t think the benchmarks that close are telling you anything about the algorithms, the difference is probably down to environment.

Here’s mine on a 2015 iMac

LostKobrakai.pair_with_recursion                             47.09 K
eksperimental.pair_with_tail_recursion_micro_optimized       46.69 K - 1.01x slower +0.183 μs
DerekZhou.pair_with_tail_recursion                           46.58 K - 1.01x slower +0.23 μs
moogle19.pair_with_tail_recursion_two_way                    45.94 K - 1.02x slower +0.53 μs
DerekZhou.pair_with_body_recursion                           30.63 K - 1.54x slower +11.41 μs
dimitarvp.pair_recursively                                   22.94 K - 2.05x slower +22.36 μs

And on a 2021 Macbook Pro

eksperimental.pair_with_tail_recursion_micro_optimized       59.35 K
DerekZhou.pair_with_tail_recursion                           58.63 K - 1.01x slower +0.21 μs
LostKobrakai.pair_with_recursion                             58.25 K - 1.02x slower +0.32 μs
dimitarvp.pair_recursively                                   33.53 K - 1.77x slower +12.98 μs
DerekZhou.pair_with_body_recursion                           31.54 K - 1.88x slower +14.86 μs
moogle19.pair_with_tail_recursion_two_way                    25.24 K - 2.35x slower +22.77 μs

Both running elixir 1.13.3-otp-24 erlang 24.2.1

dimitarvp · February 12, 2022, 3:24pm

Same Erlang/Elixir versions here, guess we have to write it off as CPU generation differences indeed.

But in any case, I kept both versions because they still have a tiny code difference. People can clone the repo, run benchmarks, see the minute and inconsistent performance difference, get curious and experiment on their own.

That’s why I left both in.