Common data transform patterns - please add yours!

So I’m really loving Elixir. BY FAR the most excruciating part of learning a functional language for me is having to “transform” all my input data so that it includes ALL desired output data. In a mutable language like Ruby you can have loops and iterators that just update variables (maps, counters, etc.) outside their scope. This makes it super easy to wrangle complex data structures… but in Elixir it’s taking me quite a while to get used to all this.

I wanted to make a little “database” in this thread that I can keep coming back to: a reference for the best ways to do simple/medium/advanced data transformations. The advanced ones are too many to include in this thread, but I feel like there are a lot of simple/medium ones, few enough to count on two hands, that you use over and over and over.

I will keep adding to this thread but here are a few to start off…

A. Transform all values inside a map

input: %{a: "hello", b: "world"}
output: %{a: "HELLO", b: "WORLD"}

Answers:

x |> Enum.reduce(%{}, fn {k, v}, acc -> Map.put(acc, k, String.upcase(v)) end)

for({k, v} <- x, into: %{}, do: {k, String.upcase(v)})

:maps.map(fn _k, v -> String.upcase(v) end, x)

A2. Transform value inside a map for a specific key

input: %{a: "hello", b: "world"}
output: %{a: "hello", b: "WORLD"}
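
No answer is listed for A2 in the opening post. A minimal sketch, assuming the key is expected to be present: Map.update!/3 raises a KeyError when :b is missing, whereas Map.update/4 stores the given default as-is (the function is not applied to it).

```elixir
input = %{a: "hello", b: "world"}

# Map.update!/3 applies the function to the existing value and
# raises KeyError if the key is absent.
output = Map.update!(input, :b, &String.upcase/1)

# Map.update/4 takes a default that is inserted unchanged when the
# key is missing; the function is only used for existing values.
with_default = Map.update(%{a: "hello"}, :b, "THERE", &String.upcase/1)

IO.inspect(output)
IO.inspect(with_default)
```

Which of the two fits depends on whether a missing key should be treated as an error or quietly filled in.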

A3. Transform value for specific key inside a nested map

input: %{person: %{first: "Jose", last: "valim"}, age: 100}
output: %{person: %{first: "Jose", last: "VALIM"}, age: 100}

Answers:

x |> update_in([:person, :last], &String.upcase(&1))

B. Transform all values inside a list of maps

input: [%{a: "hello", b: "world"}, %{a: "fizz", b: "buzz"}]
output: [%{a: "HELLO", b: "WORLD"}, %{a: "FIZZ", b: "BUZZ"}]
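
No answer for B is listed in the opening post. One possible sketch, assuming every value is a string: map over the list and rebuild each inner map with Map.new/2, which takes a function returning {key, value} tuples.

```elixir
input = [%{a: "hello", b: "world"}, %{a: "fizz", b: "buzz"}]

# The outer Enum.map/2 walks the list; Map.new/2 rebuilds each
# inner map from the transformed {key, value} tuples.
output =
  Enum.map(input, fn m ->
    Map.new(m, fn {k, v} -> {k, String.upcase(v)} end)
  end)

IO.inspect(output)
```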

B2. Transform one value inside a list of maps for a specific key

input: [%{a: "hello", b: "world"}, %{a: "fizz", b: "buzz"}]
output: [%{a: "hello", b: "WORLD"}, %{a: "fizz", b: "BUZZ"}]

Answers:

x |> Enum.map(fn m -> Map.update(m, :b, nil, &String.upcase(&1)) end)

C. Push to a list in a map

input: %{list: []}
output: %{list: [1]}

Answers:

x |> update_in([:list], &[1 | &1])
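
Note that &[1 | &1] prepends. A sketch of a few variations, assuming you might also want to append or to handle a map where :list does not exist yet (neither requirement is stated in the post above):

```elixir
input = %{list: [2, 3]}

# Prepending with [head | tail] does not copy the existing list.
prepended = update_in(input, [:list], &[1 | &1])

# Appending with ++ has to copy the left-hand list first.
appended = update_in(input, [:list], &(&1 ++ [4]))

# Map.update/4 covers a missing key: the default [1] is stored as-is.
created = Map.update(%{}, :list, [1], &[1 | &1])

IO.inspect({prepended, appended, created})
```

If order matters and the list grows a lot, a common approach is to prepend while building and call Enum.reverse/1 once at the end.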

I will add more as I go, feel free to add your own

My small contrib for A :)

iex> i = %{a: "hello", b: "world"}
iex> o = %{i | a: "HELLO", b: "WORLD"}

# or

iex> o = i |> Enum.reduce(%{}, fn {k, v}, acc -> Map.put(acc, k, String.upcase(v)) end)

Amazing, I already like that more than Enum.map + Enum.into blah blah blah. thanks!

For C…

input = %{list: []}
output = update_in(input, [:list], &[1 | &1])
IO.inspect(output)
iex(1)> input = %{list: []}
%{list: []}
iex(2)> output = update_in(input, [:list], &[1 | &1])
%{list: [1]}
iex(3)> IO.inspect(output)
%{list: [1]}
%{list: [1]}

For A3…

input = %{person: %{first: "Jose", last: "valim"}, age: 100}
output = update_in(input, [:person, :last], &String.upcase(&1))
IO.inspect(output)
iex(1)> input = %{person: %{first: "Jose", last: "valim"}, age: 100}
%{age: 100, person: %{first: "Jose", last: "valim"}}
iex(2)> output = update_in(input, [:person, :last], &String.upcase(&1))
%{age: 100, person: %{first: "Jose", last: "VALIM"}}
iex(3)> IO.inspect(output)
%{age: 100, person: %{first: "Jose", last: "VALIM"}}
%{age: 100, person: %{first: "Jose", last: "VALIM"}}

For A2…

input = %{a: "hello", b: "world"}
output = Map.update(input, :b, "THERE", &String.upcase(&1))
IO.inspect(output)
iex(1)> input = %{a: "hello", b: "world"}
%{a: "hello", b: "world"}
iex(2)> output = Map.update(input, :b, "THERE", &String.upcase(&1))
%{a: "hello", b: "WORLD"}
iex(3)> IO.inspect(output)
%{a: "hello", b: "WORLD"}
%{a: "hello", b: "WORLD"}

For B2…

iex> input = [%{a: "hello", b: "world"}, %{a: "fizz", b: "buzz"}]
iex> input |> Enum.map(fn m -> Map.update(m, :b, nil, &String.upcase(&1)) end)

For A…

input = %{a: "hello", b: "world"}
output = for({k, v} <- input, do: {k, String.upcase(v)}) |> Map.new()
iex(1)> input = %{a: "hello", b: "world"}
%{a: "hello", b: "world"}
iex(2)> output = for({k, v} <- input, do: {k, String.upcase(v)}) |> Map.new()
%{a: "HELLO", b: "WORLD"}

You might also use the :into option…

iex> output = for({k, v} <- input, into: %{}, do: {k, String.upcase(v)})

For B…

input = [%{a: "hello", b: "world"}, %{a: "fizz", b: "buzz"}]
output = for(m <- input, do: for({k, v} <- m, into: %{}, do: {k, String.upcase(v)}))
iex(1)> input = [%{a: "hello", b: "world"}, %{a: "fizz", b: "buzz"}]
[%{a: "hello", b: "world"}, %{a: "fizz", b: "buzz"}]
iex(2)> output = for(m <- input, do: for({k, v} <- m, into: %{}, do: {k, String.upcase(v)}))
[%{a: "HELLO", b: "WORLD"}, %{a: "FIZZ", b: "BUZZ"}]

into is OK as long as you stay with keyword lists while you are still transforming the data, i.e. you are not constantly creating maps only to turn them back into keyword lists again.
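
One way to read this advice, sketched with made-up steps (the filter and the "skip" value are only illustrations, not from the thread): keep the data as a list of {key, value} tuples through the intermediate steps and build the map once at the end.

```elixir
input = %{a: "hello", b: "world", c: "skip"}

output =
  input
  # Enum functions hand back a list of {key, value} tuples ...
  |> Enum.reject(fn {_k, v} -> v == "skip" end)
  # ... and the following steps keep working on that list.
  |> Enum.map(fn {k, v} -> {k, String.upcase(v)} end)
  # Only the last step builds a map again.
  |> Map.new()

IO.inspect(output)
```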

input = [%{a: "hello", b: "world"}, %{a: "fizz", b: "buzz"}]
output = Enum.map(input, &(Enum.map(&1, fn {k, v} -> {k, String.upcase(v)} end) |> Map.new()))
iex(1)> input = [%{a: "hello", b: "world"}, %{a: "fizz", b: "buzz"}]
[%{a: "hello", b: "world"}, %{a: "fizz", b: "buzz"}]
iex(2)> output = Enum.map(input, &(Enum.map(&1, fn {k, v} -> {k, String.upcase(v)} end) |> Map.new()))
[%{a: "HELLO", b: "WORLD"}, %{a: "FIZZ", b: "BUZZ"}]

For B2…

input = [%{a: "hello", b: "world"}, %{a: "fizz", b: "buzz"}]
output = for(m <- input, do: update_in(m, [:b], &String.upcase(&1)))
iex(1)> input = [%{a: "hello", b: "world"}, %{a: "fizz", b: "buzz"}]
[%{a: "hello", b: "world"}, %{a: "fizz", b: "buzz"}]
iex(2)> output = for(m <- input, do: update_in(m, [:b], &String.upcase(&1)))
[%{a: "hello", b: "WORLD"}, %{a: "fizz", b: "BUZZ"}]

# or

input = [%{a: "hello", b: "world"}, %{a: "fizz", b: "buzz"}]
output = input |> Enum.map(fn m -> update_in(m, [:b], &String.upcase(&1)) end)
iex(1)> input = [%{a: "hello", b: "world"}, %{a: "fizz", b: "buzz"}]
[%{a: "hello", b: "world"}, %{a: "fizz", b: "buzz"}]
iex(2)> output = input |> Enum.map(fn m -> update_in(m, [:b], &String.upcase(&1)) end)
[%{a: "hello", b: "WORLD"}, %{a: "fizz", b: "BUZZ"}]

Another solution for A is

iex(6)> i = %{a: "hello", b: "world"}
%{a: "hello", b: "world"}
iex(7)> o = :maps.map(fn (_k,v) -> String.upcase(v) end, i)
%{a: "HELLO", b: "WORLD"}

I don’t know why the Map module doesn’t include a map function. Yes, you can do it with Enum.map, but that always returns a list rather than a result of the same type as the input, which I feel is a bit off.
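
For comparison, Map.new/2 accepts a transform function and returns a map, which gets close to a map-in, map-out version of map. A small sketch (the function has to return {key, value} tuples, not bare values):

```elixir
input = %{a: "hello", b: "world"}

# Enum.map/2 returns a list of tuples ...
as_list = Enum.map(input, fn {k, v} -> {k, String.upcase(v)} end)

# ... while Map.new/2 with the same function returns a map.
as_map = Map.new(input, fn {k, v} -> {k, String.upcase(v)} end)

IO.inspect(is_list(as_list))
IO.inspect(as_map)
```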

Another solution would be to use for.

Just me hypothesizing:

  • Even :maps.map/2 generates a keyword list as an intermediate result.
  • For Enum the list is a kind of universal output (iterable) data structure.
  • Elixir has the pipe (|>) operator and Enum functions are used in pipelines quite heavily. So once something is a list, it should stay a list throughout the pipeline and not be “rehydrated” until the end of the pipeline (Map.new/1).
  • By not offering a Map.map/2 there is a speed bump to mindlessly putting Map.map/2 at the beginning (or in the middle) of a pipeline (where map is typically used) - i.e. creating a map that is just going to be converted to a list again.

What I meant was that I think Enum should go type -> type and not type -> list. It feels wrong, IMO.

The reasoning behind that is explained a bit here: https://hexdocs.pm/elixir/Collectable.html#module-why-collectable

Example:

m |> Enum.map(f1) |> Enum.reduce(init, f2)

If m is a Map, it would also be a Map emerging from map/2 which would then have to be converted to a list again before entering reduce/3 - which introduces an unnecessary type->list conversion. It makes more sense to simply leave it as a list and have Collectable reconstitute everything (as desired) at the end of the pipe.
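
A concrete version of that pipeline, with f1, init and f2 filled in by assumption (doubling and summing are only illustrations): both stages work on the list of tuples, and a map is rebuilt only if you ask for one at the end.

```elixir
m = %{a: 1, b: 2, c: 3}

# f1 doubles each value, f2 sums the values; both are illustrative.
f1 = fn {k, v} -> {k, v * 2} end
f2 = fn {_k, v}, acc -> acc + v end

# Enum.map/2 yields a list, which Enum.reduce/3 consumes directly.
total = m |> Enum.map(f1) |> Enum.reduce(0, f2)

# If a map is wanted at the end, collect explicitly via Collectable.
doubled = m |> Enum.map(f1) |> Enum.into(%{})

IO.inspect(total)
IO.inspect(doubled)
```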

Enum.reduce/3 can already work directly on a map, so I don’t see the problem here. I just think it should be enumerable in, enumerable out, not enumerable in, list out. It makes it cleaner, IMO.

While I don’t have an issue with the implementation compromise for efficiency’s sake, it’s actually the name map that, from my perspective, fails the “principle of least astonishment”. When I see a map function I expect it to produce the same shape of data that it consumed, perhaps with a different contained type. That’s because I first consciously encountered map in Haskell, and that is how it behaves there. So in a way Enum.map/2 sets expectations that it doesn’t deliver on; a name like Enum.map_to_list/2 would be more accurate.

Once I got over my initial “astonishment” I simply reclassified the Enum module as a set of functions that are capable of consuming a range of collections but (mostly) produce output collections as lists (i.e. I took the _to_list suffix as implied).

There are cases where input type == output type doesn’t work, though:

range = Date.range(~D[1999-01-01], ~D[2000-01-01])
result = Enum.map(range, fn date -> date.day end)
# or
set = MapSet.new([1, 2, 3, 4, 5, 6])
result = Enum.map(set, fn number -> rem(number, 2) end)
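
To make the MapSet case concrete, a sketch comparing the two possible results: the list keeps one result per element, while collecting the same results into a MapSet collapses the duplicates.

```elixir
set = MapSet.new([1, 2, 3, 4, 5, 6])

# Enum.map/2 yields six results, one per element.
as_list = Enum.map(set, fn number -> rem(number, 2) end)

# Collecting the same results into a MapSet leaves only 0 and 1.
as_set = MapSet.new(set, fn number -> rem(number, 2) end)

IO.inspect(length(as_list))
IO.inspect(MapSet.size(as_set))
```

So a same-type map over a MapSet would silently change the number of elements, which is one reason a list is the safer default output.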

Which is why in Haskell the fmap function is tied to the type or more accurately typeclass - so neither Range nor MapSet would be a member of the Functor typeclass.

The original complaint was that there should be a Map.map/2 that produces a Map (while Enum.map_to_list can do whatever it needs to do).