# How to handle very big enumerables in comprehensions?

For a school assignment, I need to esentially brute-force check every possible solution for a puzzle. This is acceptable, as the task is meant to teach us about prefiltering and optimizing. I have done both of those as best as I could, and now my code will run out of memory for longer puzzles where the number of possible solutions can range from a couple hundred-thousand to a few millions.

``````# From a list of sublists, generate all possible permutations, using one element from each sublist
def perms_from_lists([]), do: []
def perms_from_lists(list) do
Enum.reduce(list, fn sublist, acc ->
for a <- acc, b <- sublist do
[b | List.wrap(a)]
end
end)
|> Enum.map(&Enum.reverse/1)
end

def solutions(puzzle) do
# This would generate a lot of solutions, possibly (for longer puzzles) in the millions,
# depending on how effective is my prefiltering on the given puzzle
Enum.filter((for x <- perms_from_lists(transform_and_prefilter(puzzle)),
do: if sol_okay?(puzzle, x), do: x), fn x -> x != nil end)
end
``````

Iâ€™m guessing the problem may be that my program wants to store every value given by perms_from_lists() in the memory, but how do I prevent this?

Or maybe perms_from_lists() performs poorly for such large operations?

I have heard about Streams but havenâ€™t used them, could this be possibly solved by them?

Hi,

Iâ€™d suggest benchmarking different methods until you find one that works. You can use Benchee for that.

You might like to avoid putting the entire function on one long line. If you give things names and keep each line relatively short it is easier to read.

Did you by chance read the documentation for Stream? At the top or explains how to swap from Enum to Stream.

Source of this code: How to generate permutations from different sets of elements? - #12 by Eiji

1 Like

I believe that for new user of Elixir (or any functional language for that matter), it is better not to learn list comprehension. Just use recursion (both body and tail), There is nothing you can do with list comprehension but not recursion.

4 Likes

Maybe you should show us the source of these so we can help you in detail.

Especially if the data itself is coming from a file or a DB or a network service, you can just ingest it in chunks â€“ by using the `Stream` module â€“ and your code will have predictable memory and CPU footprint.

1 Like

The thing about for comprehensions is that they are eagerly evaluated, as you discovered. You are generating every possible solution before filtering whether they are acceptable solutions. (I also question whether you really need to reverse too).

Streams are one approach because they will lazily generate a candidate solution, but it will not be trivial to write the stream. Another approach is to figure out a way to call `sol_okay` within your `reduce`

1 Like

You are generating all possible permutations of your lists and checking them for some property, as you said, brute-force the check.

The space complexity of such task is exponential in the number of lists and I donâ€™t see a way around that. Recursion, comprehension or what not wonâ€™t make any difference.

I think your options are to either deal with the increasing space requirements by swapping your permutations on disk, or understand the underlying structure of the solutions youâ€™re looking for and try to generate only those. Knowing what `transform_and_prefilter` and `sol_okay?` do and the nature of this puzzle would help.

Here goes a simplified example:

``````defmodule Example do
# in case where empty list is passed as an input
def sample([]), do: []

# in case first element is an empty list
def sample([[] | tail]), do: sample(tail)

# we are using list 2 times
# the first would be iterated
# the second is a copy for `sol_okay?/2` function
def sample(list) do
case sample(list, [], list) do
[] -> {:error, :not_found}
result -> result
end
end

# in case we already have result skip everything else
def sample(_list, {:ok, _result} = data, _list_copy), do: data

# data here is finished element of output list
# due to prepending the data is reversed
# which is fixed by `:lists.reverse/1`
def sample([], data, list_copy) do
solution = :lists.reverse(data)
if sol_okay?(list_copy, solution), do: {:ok, solution}, else: []
end

# when heads i.e. elements of first list ends
# all we need to do is to return an empty list
def sample([[] | _tail], _data, _list_copy), do: []

# collecting data and recursion part
# nested recursion
new_data = sample(tail, [sub_head | data], list_copy)

if new_data == [] do
else
# solution is already found and there is no need to return it
new_data
end
end

def sol_okay?(_puzzle, [:b, 2]), do: true
def sol_okay?(_puzzle, _solution), do: false
end
``````

Why simplified? Well â€¦ first of all copying a full list for every processed combination is not the best idea. Perhaps you would like to store them somewhere, for example in `:ets` table. I just give you a simplified code which could be refactored by changing minimal amount of code and without any problem.

Also `sol_okay?/2` function is naive as we have no idea what you want there and again for simplicity I made a simplest possible example i.e. pattern matching. However this function is still really useful. If you debug `solution` variable in 2nd clause you would find that the code is properly skipping other combinations when valid combination was found.

1 Like

This seems like a good application for streams; they trade additional runtime complexity for memory usage. They can also be useful to skip unnecessary calculations by stopping the stream early, but thatâ€™s not applicable since you need every solution.

Hereâ€™s a version of your code above that uses `Enum.filter` all-at-once:

``````def solutions_enum(puzzle) do
puzzle
|> perms_from_lists()
# in here: all the permutations are in memory
|> Enum.filter(&sol_okay?(puzzle, &1))
end
``````

and the same code but with streams:

``````def solutions_enum(puzzle) do
puzzle
|> stream_perms_from_lists()
# in here: only one permutation is in memory at a time
|> Stream.filter(&sol_okay?(puzzle, &1))
# this accumulates all the results
|> Enum.to_list()
end
``````

The new `stream_perms_from_lists/1` will involve a bunch of functions from `Stream`. For some inspiration, hereâ€™s a similar-but-different problem (generating permutations) solved with streams:

4 Likes

Nice, but my solution do almost the same (the only difference is thatâ€™s not `Stream` based) with just pattern-matching and I guess it would be much easier especially for new developers.

I think that `Stream` would be much more reasonable comparing to my solution if `puzzle` would be `Stream` as well from the first create/generate.

Thank you all for your kind replies, I very much appreciate your time!

@dimitarvp The â€śdataâ€ť here is a bunch of .txt files for puzzles, and the generated solution candidates. Think of it like solving Sudokus brute-force.

@gregvaughn

The thing about for comprehensions is that they are eagerly evaluated

I was suspicious, but I wasnâ€™t sure until now, so thank you for that Makes sense now that it doesnâ€™t run

(I also question whether you really need to reverse too)

I use the position of the elements as the link between tents and trees. It would be avoidable, but Iâ€™d try other stuff before rewriting a bunch of code to skip this reverse. If nothing else makes it faster, Iâ€™ll try it.
And also thanks for the suggestions, I will try them tomorrow!

@trisolaran

Knowing what `transform_and_prefilter` and `sol_okay?` do and the nature of this puzzle would help.

Itâ€™s a Tents and Trees puzzle solver.
`transform_and_prefilter` weeds out invalid tent positions to cut down on the permutations, it traverses a ~ 20x20 matrix of integer tuples once, shouldnt take that long.
`sol_okay?` checks if solution candidate (given by the direction each tent is compared to their tree, n/s/e/w) is valid. Itâ€™s against a few criteria, but returns as soon as one fails, I used `throw-catch` there. Shouldnâ€™t take that long in itself, but runs for each solution candidate.

@Eiji Thank you, Iâ€™m not very good at recursions so it will take me a while to understand how this function works Also I might not said it in the topic, but all possible solutions are needed, so the skipping over part is an bonus functionality

@al2o3cr Thanks a lot, I will try to look into streams a bit more and maybe I can make this work with them.

Great, then you can stream them and not read all of them at once. That way youâ€™d be okay with system resources load.

I guess generally your approach could make a significant difference, however in my case, the â€śbunch of filesâ€ť means around 20 files, each <500 characters, so I wouldnâ€™t expect any sort of noticable performance problem stemming from reading them all at once. Sure it could be optimized, but for now I think based on the replies in this topic the main problem is how I handle the large number of candidate solutions. (Before posting this topic I did a bit of testing, and the program ate up all 16 GBs of my PCâ€™s memory, while trying to solve a mid-sized puzzleâ€¦ )

I see. In that case try one of the proposed solutions?

It took me a while to realize this is the right solution. After thinking about how to cut down more on the number of candidate solutions, I recognized your code basically does what I need to do. So thanks a lot, today I learned something about recursions

1 Like