Learning idiomatic Elixir - Q1

nikiiv · August 29, 2023, 1:14pm

Hi all, I am trying to learn how to write idiomatic Elixir. I decided to write an implementation of a coding challenge I was given long time ago.
I wrote the implementation and tests, but… I don’t like it
The problem I see is that I use too many exit checks . The algo is explained on the github page. If someone can take a look and give me some pointers I will be very very grateful.

I should note that the goal is O(n) including how any internal or external libraries could potentially affect it

defmodule CodingTest do
  @moduledoc """
  * Implement a method that will determinate if exists a range of
  * non-negative elements in an array that sum up to a specific non-
  * negative number
  *
  * Example 1 - targetSum 3, numbers = [1,7,1,1,1,5,6,1] - output true (the sum of the elements in range 2 to 4 is 3
  * Example 2 - targetSum 7, numbers = [0,4,5,1,8,9,12,3,1] - output false (no range sums to 7)
  *
  """

  def solution([h|t], target_sum) do
      find_solution([h],t,h, target_sum)
  end

  defp find_solution(_,_,target_sum, target_sum) when target_sum > 0, do: :OK
  defp find_solution(_,[0 | _], _, 0), do: :OK

  defp find_solution(arr, [h | t], current_sum, target_sum) when current_sum < target_sum do
    #Slog.log ["case_less", arr, [h | t], current_sum, target_sum]
    current_sum = current_sum + h
    find_solution(arr ++ [h], t, current_sum, target_sum)
  end

  defp find_solution([h | t], right, current_sum, target_sum) when current_sum > target_sum do
    current_sum = current_sum - h
    find_solution(t, right ,current_sum, target_sum)
  end

  defp find_solution([], [h|t], _, target_sum), do: find_solution([h],t,h, target_sum)

  defp find_solution(_,[],current_sum, target_sum) when current_sum != target_sum do
    :KO
  end

  defp find_solution([],[],_, _), do: :KO

end

TEST

defmodule CodingTestTest do
  use ExUnit.Case
  doctest CodingTest

  test "solution" do
    assert CodingTest.solution([1,2,3,4],6) == :OK
    assert CodingTest.solution([1,7,1,1,1,5,6,1],3) == :OK
    assert CodingTest.solution([0,4,5,1,8,9,12,3,1],7) == :KO
    assert CodingTest.solution([5,3,3,3,4,100],13) == :OK
    assert CodingTest.solution([5,4,3,2,1,0,1,2,3,4,5],0) == :OK
    assert CodingTest.solution([5,4,3,2,1,1,1,2,3,4,5],0) == :KO
  end
end

tomekowal · August 29, 2023, 2:41pm

I’d do it this way:

defmodule CodingTest do
  @moduledoc """
  * Implement a method that will determinate if exists a range of
  * non-negative elements in an array that sum up to a specific non-
  * negative number
  *
  * Example 1 - targetSum 3, numbers = [1,7,1,1,1,5,6,1] - output true (the sum of the elements in range 2 to 4 is 3
  * Example 2 - targetSum 7, numbers = [0,4,5,1,8,9,12,3,1] - output false (no range sums to 7)
  *
  """

  def solution([h | t], target_sum) do
    # start by taking first element as the current sublist
    find_solution([h], t, h, target_sum)
  end

  # defp find_solution(current_sublist, rest_of_list, current_sum, target_sum)
  # if target sum is zero, we require zero number somewhere
  # it is not enough to simply say that zero element sublist has zero length
  # NOTE: that edge case should be clarified because it complicates simple recursion
  # in which empty range sums up to zero (even if there no zeros)

  # we've found the sublist!
  # NOTE: previous solution where `target_sum` was twice in parameters was perfectly fine and idiomatic
  # However, in this particular example, I wanted to see three `when` clauses with `==`, `<` and `>`
  defp find_solution(current_sublist, _rest_of_list, current_sum, target_sum) when current_sum == target_sum and length(current_sublist) > 0, do: :OK

  # we need more, so we take add first element from the reset to the current sublist
  # NOTE: this merges a case where the current sum is too low AND where current_sublist has zero elements and both current_sum and target_sum are 0
  defp find_solution(current_sublist, [h | t] = _rest_of_list, current_sum, target_sum) when current_sum <= target_sum do
    find_solution(current_sublist ++ [h], t, current_sum + h, target_sum)
  end

  # we are over target, so we drop first item from the current sublist
  defp find_solution([h | t] = _current_sublist, rest_of_list, current_sum, target_sum) when current_sum > target_sum do
    find_solution(t, rest_of_list, current_sum - h, target_sum)
  end

  defp find_solution(_, _, _, _), do: :KO
end

Thoughts:

The algorithm would be simpler if zero element sublist would be allowed for zero sum; In fact, it is so much simpler, that I am 90% sure that it was the intended solution (unless it is some kind of class about algorithms where you should look for edge cases)
you can use [h | t] = _name to convey variable meaning
you can use and in guards, so having length(current_sublist) > 0 drops one of the cases that started from scratch and merges it with the case where we have too little elements (they were doing the same thing conceptually - taking a new element from the rest of the list)
it is often good practice to put all the terminal cases first and then all the recursive ones; however, this problem is simpler if we:

check if we have the solution
check if we should extend the current_sublist
check if we should shrink the current_sublist
stop if we can’t do any of those

We could potentially still put stop conditions first, but the check would be we would be more compiicated and I’d rather reduce the number of clauses.

stefanluptak · August 29, 2023, 2:50pm

I would probably do it this way. You can paste that code into the Livebook app and play more with it.

It’s less effective than the pure recursion in the solution above, but I find it much more readable.

Regarding the Elixir itself, function returning :OK or :KO is not very idiomatic. I understand you probably decided this way for the sake of simplicity, but it’s worth pointing out for future readers. My solution returns range of the correct sublist or nil.
Also, naming your Elixir project with a Test suffix is not a good idea too, because that’s usually used for the modules that contains tests. (e.g. MyApp.RegistrationTest is a module that is testing MyApp.Registration functionality).

defmodule Sublist do
  @doc """
      iex> Sublist.find([1, 2, 3], 3)
      0..1

      iex> Sublist.find([1, 7, 1, 1, 1, 5, 6, 1], 3)
      2..4

      iex> Sublist.find([0, 4, 5, 1, 8, 9, 12, 3, 1], 7)
      nil
  """
  def find(list, sum, offset \\ 0)
  def find([], _sum, _offset), do: nil
  def find([_head | tail] = list, sum, offset) do
    list
    |> Enum.scan(0, &Kernel.+/2)
    |> Enum.find_index(&(&1 == sum))
    |> case do
      nil ->
        find(tail, sum, offset + 1)

      index ->
        Range.new(offset, index + offset)
    end
  end
end

nikiiv · August 29, 2023, 3:15pm

Actually an empty list cannot be a solution, so …
The solution I am after is O(n) and length on a list will increase the complexity.
I should’ve mentioned it. My initial solution disregarding the zero and empty list was pretty mich similar to yours

But there are some valid pointers like putting terminal cases first. Thanks

nikiiv · August 29, 2023, 3:17pm

scan and find_index will increase the complexity. I am after O(n).
Thanks for pointing out the suffix shouldn’t be _test. I started with mix new coding_test

100phlecs · August 29, 2023, 5:02pm

Here’s mind attempt to create less checks.

defmodule CodingTest do
  def solution([num | rst], target) do
    q = :queue.new()
    find(rst, :queue.in(num, q), num, target)
  end

  defp find([], _, _, _), do: nil
  defp find([num | rst], q, sum, target) when sum == target do
    if :queue.is_empty(q), do: find(rst, :queue.in(num, q), num, target), else: :ok
  end

  defp find([num | rest], q, sum, target) when sum < target,
    do: find(rest, :queue.in(num, q), sum + num, target)

  defp find(lst, q, sum, target) when sum > target do
    {{:value, prev}, q} = :queue.out(q)
    find(lst, q, sum - prev, target)
  end 
end

In order to maintain an O(n) runtime complexity, I reached for the :queue erlang module which allows O(1) inserts and pops for a FIFO data structure.

This allows us to maintain a ‘sliding window’ view of the list, moving the right side it when the range sum is too small and moving the left side when the range sum is too large.

Originally I reached for Enum.reduce but because we have to process nodes within the list more than once (see when sum > target function) recursion makes the most sense.

Idiomatic Elixir usually keeps atoms lowercase, and represents empty states either with nil or {:error, reason}, so I changed those values.

As to why I changed find_solution to find is just personal preference

Without the :queue data structure we could take advantage of Enum.reverse to simulate a queue, but then the runtime is no longer O(n), but O(n * k) where k is the max length of the simulated queue (I think, feel free to correct if wrong)

Edit: I found a case where my solution fails, which is the case where the range is at the very end of the list. It requires modifying the case of when sum == target:

  defp find(lst, q, sum, target) when sum == target do
    case {:queue.is_empty(q), lst} do
      {true, []} ->
        nil

      {true, [num | rst]} ->
        find(rst, :queue.in(num, q), num, target)

      {false, _} ->
        :ok
    end
  end

I’m sure there’s more elegant ways to solve this, but I went with this because it’s more readable to me.

Here’s a new test case to confirm it works:

    assert CodingTest.solution([1,7,1,1,1,5,13,1],14) == :ok

I updated the Livebook (here is where it is hosted)

eksperimental · August 29, 2023, 7:26pm

Hi @nikiiv,
Here’s my solution.
This is as idiomatic as I could make it be.

Naming convention, guards, types, specs, pattern matching, optimization (avoids unnecessary traversing), documentation, doctests, and code formatting.

Feel free to study the code and ask questions.

Later on tonight when I’ve got a bit more of free time I will come back to make a few comments about your implementation and what’s idiomatic in Elixir.

– Cheers

defmodule Coding do
  @moduledoc """
  Implement a function that will determinate if exists a range consecutive of
  non-negative elements in an list, that sum up to a specific non-negative number.
  """

  @type element :: non_neg_integer()
  @type element_list :: nonempty_list(element())
  @type sum :: non_neg_integer()
  
  defguardp is_non_neg_integer(term) when is_integer(term) and term >= 0

  @doc """
  Determinate whether consecutive number of non-negative integers in an list
  sum up to a specific non-negative number. Returns `true` if this condition is
  met, otherwise returns `false`.

  ## Examples

      # The sum of the elements in range 2 to 4 is 3
      iex> Coding.valid?([1,7,1,1,1,5,6,1], 3)
      true

      # No range sums to 7
      iex> Coding.valid?([0,4,5,1,8,9,12,3,1], 7)
      false

  """
  @spec valid?(element_list(), sum()) :: boolean()
  def valid?(list, target_sum)
      when is_list(list) and list != [] and is_non_neg_integer(target_sum) and target_sum >= 0 do
    result =
      Enum.reduce_while(list, [], fn
        # If element is target_sum, we immediately return
        ^target_sum, _acc ->
          {:halt, true}

        element, acc ->
          case sum_list(acc, element, target_sum) do
            true ->
              {:halt, true}

            new_list ->
              {:cont, [element | new_list]}
          end
      end)

    case result do
      true -> true
      list when is_list(list) -> false
    end
  end

  @spec sum_list(
          nonempty_list(sum()),
          element(),
          sum()
        ) :: true | list()
  defp sum_list(list, element, target_sum)
       when is_list(list) and is_non_neg_integer(element) and is_non_neg_integer(target_sum) do
    Enum.reduce_while(list, [], fn sum, acc ->
      case sum(sum, element, target_sum) do
        {:halt, true} ->
          # We abort inmediately
          {:halt, true}

        {:halt, _sum} ->
          # We do not include this result
          {:cont, acc}

        # This is an optimization.
        # We don't need to store 0
        {:cont, 0} ->
          {:cont, acc}

        {:cont, sum} ->
          # We add this result
          {:cont, [sum | acc]}
      end
    end)
  end

  @spec sum(
          nonempty_list(sum()),
          element(),
          sum()
        ) :: true | list()
  defp sum(sum, element, target_sum) when sum + element == target_sum,
    do: {:halt, true}

  defp sum(sum, element, target_sum) when sum + element < target_sum,
    do: {:cont, sum + element}

  defp sum(sum, element, _target_sum),
    do: {:halt, sum + element}
end

All tests pass:

true = Coding.valid?([1, 7, 1, 1, 1, 5, 6, 1], 3)
false = Coding.valid?([0, 4, 5, 1, 8, 9, 12, 3, 1], 7)
true = Coding.valid?([1, 2, 3, 4], 6)
true = Coding.valid?([1, 7, 1, 1, 1, 5, 6, 1], 3)
false = Coding.valid?([0, 4, 5, 1, 8, 9, 12, 3, 1], 7)
true = Coding.valid?([5, 3, 3, 3, 4, 100], 13)
true = Coding.valid?([5, 4, 3, 2, 1, 0, 1, 2, 3, 4, 5], 0)
false = Coding.valid?([5, 4, 3, 2, 1, 1, 1, 2, 3, 4, 5], 0)

nikiiv · September 3, 2023, 10:35am

Thank you so much!
It gives a lot of color to how structure Elixir code.
The.problem with this implementation is the complexity. I am after O(n), though adding an element to the end of the list is spoiling my solution also

eksperimental · September 3, 2023, 4:45pm

How adding elements to a list spoils your solution?
It may look complex but is the most optimal IMO.
You only keep what is needed, and stops as soon as a solution is found.

nikiiv · September 3, 2023, 5:03pm

Because Elixir lists are single linked so adding an element to the end requires is an O(n) operation

Niki

eksperimental · September 4, 2023, 9:04pm

My solution is of O(n²) complexity.
I cannot see how you can solve this problem linearly (O(n)).

nikiiv · September 5, 2023, 7:24am

Of course, the proper solution would be a sliding window within the original structure
The problem with Elixir is that it lacks an internal O(1) access data structure outside of tuples and to solve this one, one needs O(1) access. There are two ways I’ve found so far to solve it. Either use :queue or convert the List to tuple (given that it is fixed size and we won’t be changing the size, it should work).
I don’t know much yet of Elixir and BEAM internals, this may not be super optimal but…

defmodule CodingChallange do
  def solve_problem(arr,target_sum) do
    tarr = List.to_tuple(arr)
    size = Kernel.length(arr);
    current_sum = elem(tarr,0)
    left = 0
    right = 0
    solve(tarr,size,left,right,current_sum, target_sum)
  end


  defp solve(_,_,left,right, target_sum,target_sum) when left <= right, do: :ok
  defp solve(tarr,size,left,right, _current_sum,target_sum) when left > right do
    right = left
    current_sum = elem(tarr, left)
    solve(tarr,size,left,right, current_sum,target_sum)
  end
  defp solve(tarr,size,left,right,current_sum, target_sum) when current_sum < target_sum and right < size-1 do
    if Mix.env() != :test, do: Slog.log(["LESS CASE", tarr,size," left:",left, " right:", right, "curretn_sum:" ,current_sum, "target_sum", target_sum])
    right = right + 1
    current_sum = current_sum + elem(tarr,right)
    solve(tarr,size,left,right,current_sum, target_sum)
  end

  defp solve(tarr,size,left,right,current_sum, target_sum) when current_sum > target_sum and left <= right and left < size-1 do
    if Mix.env() != :test, do: Slog.log(["GREATER CASE", tarr,size," left:",left, " right:", right, "curretn_sum:" ,current_sum, "target_sum", target_sum])
    current_sum = current_sum - elem(tarr,left)
    left = left + 1
    solve(tarr,size,left,right,current_sum, target_sum)
  end
  defp solve(_,_,_,_, current_sum,target_sum) when current_sum != target_sum, do: nil

end

Tests:

defmodule CodingChallangeTest do
  use ExUnit.Case
  doctest CodingChallange

  test "check solution" do
    assert CodingChallange.solve_problem([1,2,3,4],6) == :ok
    assert CodingChallange.solve_problem([1,7,1,1,1,5,6,1],3) == :ok
    assert CodingChallange.solve_problem([0,4,5,1,8,9,12,3,1],7) == nil
    assert CodingChallange.solve_problem([5,3,3,3,4,100],13) == :ok
    assert CodingChallange.solve_problem([5,4,3,2,1,0,1,2,3,4,5],0) == :ok
    assert CodingChallange.solve_problem([5,4,3,2,1,1,1,2,3,4,5],0) == nil
  end
end

TBH I find the lack of proper arrays a bit frustrating. The Arrays module doesn’t explicitly state that the access time is O(1). Outside of that so far Elixir is a delight
In my line of work (and probably everybody else’s) there are like 15% of code that executes 90% of the time and performance needs to be optimal. I am at the phase where I research Elixir for the appropriate data structures to ditch Java and Akka in Elixir’s favor. Over the years I developed the habit of reading the internal implementation to make sure I won’t get tricked by assuming a one line of code would execute in linear time, but I don’t read Elixir’s code that easily yet

stefanluptak · September 5, 2023, 7:39am

I did a benchmark of the solutions presented here and @100phlecs’s solution is still the fastest, but your second solution uses the least memory. See: benchmark.exs · GitHub

Results:

Name                    ips        average  deviation         median         99th %
100phlecs            2.53 M      395.84 ns  ±7509.36%         291 ns        1416 ns
nikiiv2              1.98 M      504.83 ns  ±4331.43%         458 ns         583 ns
nikiiv               1.38 M      724.03 ns  ±2450.27%         666 ns         875 ns
tomekowal            1.28 M      783.66 ns  ±2716.34%         708 ns         958 ns
eksperimental        0.21 M     4653.02 ns   ±216.90%        4291 ns       11750 ns
stefanluptak       0.0484 M    20668.09 ns    ±21.22%       20167 ns       29125 ns

Comparison:
100phlecs            2.53 M
nikiiv2              1.98 M - 1.28x slower +108.99 ns
nikiiv               1.38 M - 1.83x slower +328.18 ns
tomekowal            1.28 M - 1.98x slower +387.81 ns
eksperimental        0.21 M - 11.75x slower +4257.17 ns
stefanluptak       0.0484 M - 52.21x slower +20272.24 ns

Memory usage statistics:

Name             Memory usage
100phlecs             2.34 KB
nikiiv2               0.79 KB - 0.34x memory usage -1.55469 KB
nikiiv                7.03 KB - 3.00x memory usage +4.69 KB
tomekowal                7 KB - 2.99x memory usage +4.66 KB
eksperimental        30.13 KB - 12.86x memory usage +27.79 KB
stefanluptak         16.90 KB - 7.21x memory usage +14.55 KB

eksperimental · September 5, 2023, 8:05am

Do you mind sharing the benchmark file. I would like to experiment with different approaches.

stefanluptak · September 5, 2023, 8:11am

It’s linked in my post.

eksperimental · September 5, 2023, 8:14am

Sorry, it is too late here

eksperimental · September 5, 2023, 8:20am

There’s a relevant discussion in the forum related to constant time access.

The possible options seem to be using tuples or ETS tables.
Another alternative would be to create your own constant-time-access structure, which is a struct, but internally it uses a tuple. You could take as an inspirations @Qqwy’s library Array, GitHub - Qqwy/elixir-arrays: Well-structured Arrays with fast random-element-access for Elixir, offering a common interface with multiple implementations with varying performance guarantees that can be switched in your configuration.

I will play with different solutions and come back to it when I have time.

wanton7 · September 5, 2023, 8:27am

I just use Elixir in hobby project but my two cents are that’s why in Elixir it’s sometimes better for performance to add items into beginning of list and reverse it later. Or solution might be to put items into multiple lists then reverse some and combine lists.

nikiiv · September 5, 2023, 9:05am

That’s not going to work in this approach. The original code was trying to have a sliding window within the original list by appending or removing from both end of the window based on the numbers. Otherwise, you are right

wanton7 · September 6, 2023, 6:45am

I mean I would avoid recreating list must as possible with something like this.

defmodule CodingTest do
  @moduledoc """
  * Implement a method that will determinate if exists a range of
  * non-negative elements in an array that sum up to a specific non-
  * negative number
  *
  * Example 1 - targetSum 3, numbers = [1,7,1,1,1,5,6,1] - output true (the sum of the elements in range 2 to 4 is 3
  * Example 2 - targetSum 7, numbers = [0,4,5,1,8,9,12,3,1] - output false (no range sums to 7)
  *
  """

  def solution([h | t], target_sum) do
    find_solution(1, [h], t, h, target_sum)
  end

  defp find_solution(_, _, _, target_sum, target_sum) when target_sum > 0, do: :OK
  defp find_solution(_, _, [0 | _], _, 0), do: :OK

  defp find_solution(size, arr, [h | t], current_sum, target_sum) when current_sum < target_sum do
    current_sum = current_sum + h
    find_solution(size + 1, [h | arr], t, current_sum, target_sum)
  end

  defp find_solution(size, arr, right, current_sum, target_sum) when current_sum > target_sum do
    size = size - 1
    current_sum = current_sum - Enum.at(arr, size)
    find_solution(size, arr, right, current_sum, target_sum)
  end

  defp find_solution(0, _, [h | t], _, target_sum), do: find_solution(1, [h], t, h, target_sum)

  defp find_solution(_, _, [], current_sum, target_sum) when current_sum != target_sum, do: :KO
  defp find_solution(0, _, [], _, _), do: :KO
end

I actually don’t know how to benchmark in Elixir so maybe someone else can check how does this perform.