Hey fellow Elixir people,
I am still an Elixir noob, and I spend a lot of time on some problems.
In order to nest a flat list into Elixir structs / maps (e.g. from CSV or a relational DB), I wrote a small parser.
Most of that time went into not understanding Enum.chunk_while/4. To check that I am using it correctly now, I wrote this small example (in the form of a test). See below.
My question up front: Is there a more idiomatic Elixir way to do that (keep in mind that I combine multiple such chunkers and use a dynamic schema to chunk into, so I guess filtering or grouping alone would not do…)?
If this is totally OK in principle: do you think this example (maybe modified) would be worth adding to the Elixir documentation? And if the answer is “yes” again: how would that work, just a pull request to the elixir repo? The specific part that took so much time is annotated below.
Many thanks for your time and consideration!
defmodule ExampleChunkWhileTest do
  @moduledoc """
  Proposed additional example for the
  documentation of `Enum.chunk_while/4`.
  """
  use ExUnit.Case

  test "chunker_test" do
    list_of_maps = [
      %{a: 5, b: 9},
      %{a: 5, b: 9},
      %{a: 7, b: 15},
      %{a: 360, b: 15},
      %{a: 360, b: 15}
    ]

    expected_result = [
      [%{a: 5, b: 9}, %{a: 5, b: 9}],
      [%{a: 7, b: 15}],
      [%{a: 360, b: 15}, %{a: 360, b: 15}]
    ]

    chunk_fun = fn element, acc ->
      # Check the *initial* case: nothing has been accumulated yet.
      if acc == [] do
        {:cont, [element]}
      else
        # If not empty: compare with the most recently added element.
        # The accumulator is built by prepending, so that element is its head.
        [previous | _] = acc
        previous_code = Map.get(previous, :a)

        case element.a do
          # Same group: only prepend here; the chunk is reversed once, when it is emitted.
          ^previous_code -> {:cont, [element | acc]}
          # The following line did cost me some time to figure out!
          # In case you want to group by some features but also allow
          # entries which result in a group of "one entry", you need
          # to emit the finished chunk and return the element as the
          # acc for the next processing step.
          _ -> {:cont, Enum.reverse(acc), [element]}
        end
      end
    end

    after_fun = fn
      # Nothing left over: emit no final chunk.
      [] -> {:cont, []}
      # Emit whatever is still accumulated as the last chunk.
      acc -> {:cont, Enum.reverse(acc), []}
    end

    result = Enum.chunk_while(list_of_maps, [], chunk_fun, after_fun)
    assert result == expected_result
  end
end
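For comparison, here is a minimal sketch of the same grouping with `Enum.chunk_by/2`, which starts a new chunk whenever the return value of the given function changes between consecutive elements. It only covers this simplified example (consecutive elements sharing the same `:a`). It does not address combined chunkers or a dynamic schema, which is where `Enum.chunk_while/4` may still be needed.

```elixir
list_of_maps = [
  %{a: 5, b: 9},
  %{a: 5, b: 9},
  %{a: 7, b: 15},
  %{a: 360, b: 15},
  %{a: 360, b: 15}
]

# Enum.chunk_by/2 closes the current chunk each time the value
# returned by the function differs from the previous element's value.
chunks = Enum.chunk_by(list_of_maps, fn element -> element.a end)

IO.inspect(chunks)
```

A single-element group such as `[%{a: 7, b: 15}]` needs no special handling here, because `Enum.chunk_by/2` emits every run of equal keys, including runs of length one.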
PS: I could maybe “open-source” my “parser”, but it is still quite messy and changes hourly.
I guess you guys already have your libraries for such things (which I didn’t really find, to be honest). I would be glad for pointers! I don’t currently use Ecto; I am trying to build a completely DB-agnostic app to begin with and add a persistence layer later on. The parser is going to be used on file streams (with Stream.chunk_while) and on “complete” smaller files and results from outgoing DB requests. Then the data is sent further down the “pipeline” and eventually gets added to the state.
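To illustrate the streaming case, here is a rough sketch of the same chunker used lazily with `Stream.chunk_while/4`, which takes the same arguments as `Enum.chunk_while/4`. The module name `LazyChunkerSketch`, the function `chunk_by_a/1`, and the sample rows are hypothetical and only stand in for the real parser and a real file stream.

```elixir
defmodule LazyChunkerSketch do
  # Hypothetical helper, for illustration only: groups consecutive
  # maps that share the same :a value, without forcing the input.
  def chunk_by_a(enumerable) do
    chunk_fun = fn
      # Initial case: nothing accumulated yet.
      element, [] ->
        {:cont, [element]}

      # The accumulator is built by prepending, so its head is the previous element.
      element, [previous | _] = acc ->
        if element.a == previous.a do
          {:cont, [element | acc]}
        else
          # Emit the finished chunk and start the next one with this element.
          {:cont, Enum.reverse(acc), [element]}
        end
    end

    after_fun = fn
      [] -> {:cont, []}
      acc -> {:cont, Enum.reverse(acc), []}
    end

    Stream.chunk_while(enumerable, [], chunk_fun, after_fun)
  end
end

# Stand-in for rows parsed from a file stream (made-up sample data).
rows = Stream.map([{5, 9}, {5, 10}, {7, 15}], fn {a, b} -> %{a: a, b: b} end)

# Only as many rows as needed for the first chunk are consumed.
first_chunk = rows |> LazyChunkerSketch.chunk_by_a() |> Enum.take(1)
all_chunks = rows |> LazyChunkerSketch.chunk_by_a() |> Enum.to_list()
```

Because the result is a stream, nothing is read until a consumer such as `Enum.take/2` or `Enum.to_list/1` pulls from it, so the same chunker should work for both the file streams and the smaller in-memory lists.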