I came across something weird today as I was doing some optimization.
I have a function that splits a list at some point. I was exploring the differences in using Enum.take/2 and Enum.drop/2. I ran some benchmarks on a list of 1,000,000 elements with 4 different scenarios:
Getting the first 500,000 elements using Enum.take(list, 500_000)
Getting the first 500,000 elements using Enum.drop(list, -500_000)
Getting the last 500,000 elements using Enum.take(list, -500_000)
Getting the last 500,000 elements using Enum.drop(list, 500_000)
Name ips average deviation median 99th %
Enum.drop/2 second 500,000 elements 142.99 6.99 ms ±3.27% 6.97 ms 7.22 ms
Enum.take/2 first 500,000 elements 50.22 19.91 ms ±33.95% 19.52 ms 26.40 ms
Enum.drop/2 first 500,000 elements 30.62 32.66 ms ±24.34% 32.35 ms 77.36 ms
Enum.take/2 second 500,000 elements 23.11 43.28 ms ±19.79% 42.81 ms 112.65 ms
Enum.drop/2 get second 500,000 elements 142.99
Enum.take/2 get first 500,000 elements 50.22 - 2.85x slower +12.92 ms
Enum.drop/2 get first 500,000 elements 30.62 - 4.67x slower +25.66 ms
Enum.take/2 get second 500,000 elements 23.11 - 6.19x slower +36.28 ms
The results are KIND of as expected. Take is better at retrieving the first half of the list and drop is better at receiving the back half of the list. My real misunderstanding is WHY is drop/2 so much faster than take/2 at doing what it’s intended to do vs. what take is intended to do.
Also, shouldn’t the benchmarks be pretty similar no matter whether you’re dropping or taking forward or backward? Wouldn’t it make sense to just call drop/2 when the second input for take/2 is a negative integer and vice versa with drop/2?
Given that lists are linked lists, it makes sense that taking from start is faster than taking from middle. But if You want to optimize this, I would suggest Stream instam of Enum, because it implements lazy loading, while Enum is loading all the list in memory.
I would always add from start, never from the end, then I would reverse the list if needed.
It might not be useful if You just want to split the list…
This exact code would never return if you were to pass a negative n because it would continually match the n < 0 guard on both functions. If you meant
def take(list, n) when n < 0 do
drop(list, n * -1)
def drop(list, n) when n < 0 do
take(list, n * -1)
then I’m not sure that would give you what you are looking for. If I had some list of size 10 and wanted the last 2 elements, I would do Enum.take(list, -2) which would end up becoming drop(list, 2) which would return the first 8 elements of the list.
drop with a positive value: Just returns the tail at that position
drop with negative value: Has to rebuild the front list
take with a positive value: Has to rebuild the front list
take with a a negative value: Is currently rebuilding the back list. Now technically it shouldn’t necessarily need to if it were implemented differently (implement as Enum.drop in that case?), a PR could fix it. ^.^