# How can I speed up this code?

I have some code for generating a list of possible combinations. It works fine with fewer than 11 digits, but beyond 11 it gets very slow and sometimes even freezes my computer. I would like to generate a very large collection of combinations. I am new to Elixir.
Here's the code:

```elixir
def calculate_permutations(data) do
  permute(String.codepoints(data), "")
end

def permute(list, locked_el) when length(list) > 2 do
  Enum.flat_map(list, &permute(List.delete(list, &1), locked_el <> &1))
end

def permute(list, locked_el) do
  [
    locked_el <> Enum.at(list, 0) <> Enum.at(list, 1),
    locked_el <> Enum.at(list, 1) <> Enum.at(list, 0)
  ]
end
```

When working with large datasets, the `Stream` module is faster. Looking at your code, you're using `Enum.flat_map`, which is an eager operation, meaning it immediately builds the entire result. You're also calling `List.delete` on every item being passed to the function.

So if you have a list containing 11 digits, `List.delete` is called 11 times, once for every single element, and each call has to walk the list to find the element to remove. The more elements in a list, the longer it takes to enumerate. That's most likely your bottleneck. You probably need to learn more about recursion and pattern matching.
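A sketch of what recursion and pattern matching buy you here (illustrative module and function names, not the original code): one recursive pass can pair every element with the list of remaining elements, instead of re-scanning with `List.delete/2` for each element:

```elixir
defmodule EachOut do
  # Pairs every element with the remaining elements in one recursive pass,
  # e.g. selections([1, 2, 3]) -> [{1, [2, 3]}, {2, [1, 3]}, {3, [1, 2]}]
  def selections([]), do: []

  def selections([h | t]) do
    [{h, t} | for({x, rest} <- selections(t), do: {x, [h | rest]})]
  end

  # Permutations built on top of selections/1, with no List.delete/2 calls.
  def permutations([]), do: [[]]

  def permutations(list) do
    for {el, rest} <- selections(list), perm <- permutations(rest), do: [el | perm]
  end
end
```

The pattern match `[h | t]` splits the list once per recursion step, so the "everything except this element" lists are built as the recursion unwinds rather than by repeated scans.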

I cleaned up your code a bit using `Stream` and pattern matching. `Stream` is lazy, so it defers the work and only produces the result when it is consumed at the end.

```elixir
def calculate_permutations(data) do
  data
  |> String.codepoints()
  |> permute("")
end

def permute(list, locked_el) when length(list) > 2 do
  Stream.flat_map(list, &permute(List.delete(list, &1), locked_el <> &1))
  |> Enum.to_list()
end

def permute(list, locked_el) do
  [first, last] = list
  p1 = "#{locked_el}#{first}#{last}"
  p2 = "#{locked_el}#{last}#{first}"
  [p1, p2]
end
```

This is sort of true. Where `Stream` tends to win is on memory, if you can avoid loading the entire dataset into memory. It can also win on speed if you have several composed calls: if the maps in `data |> map(a) |> map(b) |> map(c)` are `Stream.map` calls, then it'll do the work in a single pass instead of 3. BUT for a single pass, eager iteration is faster, because streams impose some overhead to allow them to gain the other benefits. Thus if you fuse the 3 map operations into one, e.g. `data |> Enum.map(&(&1 |> a.() |> b.() |> c.()))`, eager is probably faster.
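To make that trade-off concrete, here's a throwaway sketch (illustrative data and functions) comparing three eager passes, a fused lazy pipeline, and a hand-fused single eager pass; all three produce the same list:

```elixir
data = Enum.to_list(1..1_000_000)

# Three eager passes: each Enum.map materializes a full intermediate list.
eager = data |> Enum.map(&(&1 + 1)) |> Enum.map(&(&1 * 2)) |> Enum.map(&(&1 - 3))

# Lazy pipeline: the three Stream.map calls are fused into a single pass with
# no intermediate lists; Enum.to_list/1 forces the result at the end.
lazy =
  data
  |> Stream.map(&(&1 + 1))
  |> Stream.map(&(&1 * 2))
  |> Stream.map(&(&1 - 3))
  |> Enum.to_list()

# Hand-fused single eager pass: one traversal and no stream bookkeeping,
# usually the fastest of the three.
fused = Enum.map(data, &((&1 + 1) * 2 - 3))

true = eager == lazy and lazy == fused
```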

@anuarsaeed The logic is slow here for a few reasons. The main reason it's slow is this: computing permutations is intrinsically slow. Some algorithms, like counting how many items are in a list, get a little bit slower each time you make the list longer. Some algorithms though, like computing permutations, get a LOT slower as you add items. There is a "mathy" or formal way of looking at this kind of "slowness relative to inputs" called Big O notation, and the big O for permutations is pretty bad: the time it takes grows very fast.
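That factorial growth is easy to see by just printing the counts (a throwaway sketch):

```elixir
# n items have n! permutations: each extra item multiplies the work
# by the new list length, so the counts explode quickly.
factorial = fn n -> Enum.reduce(1..n, 1, &*/2) end

for n <- [5, 10, 11, 12, 15] do
  IO.puts("#{n} items -> #{factorial.(n)} permutations")
end
```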

The second reason it's slow, though, does have to do with how you wrote it. I'd check out "Most elegant way to generate all permutations?" for an alternative implementation.


Permuting 11 objects will generate 11! = 39,916,800 results.

Permuting 12 objects will try to generate 12 times more. I'm not surprised that `flat_map` gets bogged down; appending to a list of 39,916,800 elements is going to be exceptionally slow.

Also note that lists are built of cons cells, so each element takes at least as much space as two pointers - so 16 bytes on a 64-bit machine. As a result, that list of all 12-object permutations will need at least 7.6GB of memory. (!)
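Spelling out that lower bound (a sketch; the 16 bytes per element is only the list's cons-cell spine, and the permutation strings themselves cost extra on top):

```elixir
# 12! permutations, each occupying one cons cell of two 64-bit words
# in the outer list.
perms_of_12 = Enum.reduce(1..12, 1, &*/2)   # 479_001_600
spine_bytes = perms_of_12 * 2 * 8

IO.puts("#{Float.round(spine_bytes / 1.0e9, 2)} GB")   # about 7.66 GB
```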

+1 to the recommendations to look at `Stream` to avoid generating the entire list, but I'd also recommend you consider an alternative algorithm for whatever's consuming the list of permutations, as even streaming isn't going to save you if you need to permute 20 elements (list would be 2.4e18 elements, so 60 BILLION times longer than the one for 11 elements).


I also thought it could be the size of the list which is slowing down the computation.
Maybe if I use processes I could speed up the run.

Using the Stream module and then converting to an Enum list at the end will actually slow down the whole process.

Going to rewrite the whole code to serve the output piece by piece for large datasets, using recursion and pattern matching.

(my emphasis) The intent of using `Stream` is to avoid ever holding the whole list in memory, as that's the most expensive part. For instance, here's a naive implementation of permutations with streams:

```elixir
defmodule StreamPerms do
  def permutations_of([]), do: [[]]

  def permutations_of(ary) do
    Stream.flat_map(ary, fn el ->
      rest = ary -- [el]

      permutations_of(rest)
      |> Stream.map(&[el | &1])
    end)
  end
end

# run with
StreamPerms.permutations_of(Enum.to_list(1..11))
|> Stream.into(File.stream!("scratch_output", [{:delayed_write, 1_000_000, 2000}]))
|> Stream.run()
```

This generates the 439MB output file in a handful of minutes locally. The VM uses less than 50MB of memory total the whole time.

Changing 11 to 19 would, if you had the time and the space, produce a file roughly the size of ALL data storage in the world circa 2011 (295EB).

HOWEVER

I still don't understand what your ultimate goal is - permutations are easy to compute on demand, there are just a LOT of them. Storing them on disk or transmitting them over a network is going to take almost as long as (and cost almost as much as) generating them directly at the point of use. "Check all permutations" is one of the worst-case scenarios for time complexity, seen for instance in brute-force solutions to the traveling salesman problem. You likely want a better algorithm for whatever is consuming these permutations.
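As a sketch of "compute on demand" (illustrative module name, not a library API): the factorial number system lets you jump straight to the k-th permutation without enumerating any of the earlier ones:

```elixir
defmodule NthPerm do
  # Returns the k-th (0-based) permutation of `items` in lexicographic
  # order, using the factorial number system: the leading element is
  # items at index div(k, (n-1)!), then recurse on the remainder.
  def at(items, k) do
    n = length(items)
    fact = Enum.reduce(1..n, 1, &*/2)
    build(items, k, div(fact, n), n)
  end

  defp build([last], _k, _f, 1), do: [last]

  defp build(items, k, f, n) do
    {el, rest} = List.pop_at(items, div(k, f))
    [el | build(rest, rem(k, f), div(f, n - 1), n - 1)]
  end
end

NthPerm.at([1, 2, 3, 4], 0)    # [1, 2, 3, 4]
NthPerm.at([1, 2, 3, 4], 23)   # [4, 3, 2, 1]
```

Each call does O(n²) work at worst, so checking a sampled or targeted subset of permutations stays cheap even when materializing all of them is hopeless.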
