Why are streams better for large collections if you still need to enumerate over the entire stream?

Pistrie · November 4, 2023, 11:45am

I’m reading the documentation page, but it’s not landing. A stream basically defines how you want to go over the collection, without actually executing it at that time. Nothing happens until you enumerate over it.

This is where I get lost. What is the point of the stream if you still need to enumerate over it with Enum? I can understand it a bit better if you don’t want to enumerate over the entire stream, but what if you do want to do that? Are you better off skipping the stream altogether?

Given this example (slightly modified from the documentation):

iex> range = 1..1_000_000
iex> stream = Stream.map(range, & &1)
iex> Enum.map(stream, & &1)
[1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22,
 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42,
 43, 44, 45, 46, 47, 48, 49, 50, ...]

Here the entire stream is being enumerated, so why not just use Enum.map and skip the stream?

josevalim · November 4, 2023, 11:47am

It depends on how many operations you want to perform. If it is a single map, the stream is likely to be slower. But if you want to map, filter, reduce, etc., each Enum operation builds a new intermediate list, whereas the Stream traverses the collection only once. Comprehensions written with for are highly optimized too. If you feel the docs could be clearer, a PR is welcome!
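
As a rough sketch of the difference, here is the same multi-step pipeline written three ways. The doubling and divisible-by-three steps are made up purely for illustration and are not taken from the documentation. No timings are claimed here, so measure with your own data before choosing.

```elixir
range = 1..1_000_000

# Eager: Enum.map/2 and Enum.filter/2 each build a full intermediate
# list before the next step starts.
eager =
  range
  |> Enum.map(&(&1 * 2))
  |> Enum.filter(&(rem(&1, 3) == 0))
  |> Enum.sum()

# Lazy: the Stream calls only describe the steps. Enum.sum/1 then drives
# a single pass, pushing each element through map and filter in turn.
lazy =
  range
  |> Stream.map(&(&1 * 2))
  |> Stream.filter(&(rem(&1, 3) == 0))
  |> Enum.sum()

# Comprehension: generator, intermediate binding, filter, and reduction
# in one construct, with no intermediate lists.
via_for =
  for n <- range, doubled = n * 2, rem(doubled, 3) == 0, reduce: 0 do
    acc -> acc + doubled
  end

# All three compute the same value; only the traversal strategy differs.
true = eager == lazy and lazy == via_for
```

With a single step, as in the question's example, the lazy version has nothing to fuse, so the extra stream machinery is pure overhead.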