Help cleaning up this data transformation code

Phillipp · May 29, 2019, 9:17am

Hey,

I got a list of structs which I need to convert to a map (or similar useful data structure) that I can then further use to create some Prometheus metrics.

Here is what I got. The code is quite dirty in my opinion and maybe there is a smarter way to do it. I wrote it last night at 1am just to get it done, don’t judge me.

Input data:

trackers = [
  %{
    pause: true,
    pause_timeout_in_ms: 72276156,
    position: %{"lat" => 11, "lng" => 1},
    response: nil,
    timeout_in_ms: 309722,
    user_id: "5cb6fb2c2071c963d21c517f"
  },
  %{
    pause: false,
    pause_timeout_in_ms: false,
    position: %{"lat" => 11, "lng" => 1},
    response: %{
      "status" => "ok",
      "vehicle_position" => %{"vehicle" => %{"tracking" => nil}}
    },
    timeout_in_ms: 1747683,
    user_id: "5cb6fb2d2071c963d21c5180"
  }
]

Then I do the following:

trackers
    |> Enum.map(
         fn x ->
           state = case x[:pause] do
             true -> "paused"
             false -> "active"
           end
           health = case x[:response] do
             nil -> "unhealthy"
             _ -> "healthy"
           end
           %{state: state, health: health}
         end
       )
    |> Enum.reduce(
         %{
           "active" => %{
             "healthy" => 0,
             "unhealthy" => 0
           },
           "paused" => %{
             "healthy" => 0,
             "unhealthy" => 0
           }
         },
         fn (x, acc) ->
           put_in(acc[x[:state]][x[:health]], get_in(acc, [x[:state], x[:health]]) + 1)
         end
       )

(Ignore the inline functions, gonna clean it up after the implementation is set)

Which gives me the following output:

%{
  "active" => %{"healthy" => 1, "unhealthy" => 0},
  "paused" => %{"healthy" => 0, "unhealthy" => 1}
}

Which I then use the following way:

series = for {state, health} <- extract_labeled_data(trackers), # extract_labeled_data is the code above
    {health, value} <- health do
  {[state: state, health: health], value}
end

Prometheus.Model.gauge_metrics(series)

To generate Prometheus metrics like:

app_gateway_tracking_sessions_count{state="active",health="healthy"} 1
app_gateway_tracking_sessions_count{state="active",health="unhealthy"} 0
app_gateway_tracking_sessions_count{state="paused",health="healthy"} 0
app_gateway_tracking_sessions_count{state="paused",health="unhealthy"} 1

I am sure there are some things that can be improved to get from the input data to the final metrics.
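As a side note on the reduce step in the code above: reading the counter with get_in and writing it back with put_in can probably be collapsed into a single update_in/3 call. A minimal sketch, assuming the list has the shape produced by the Enum.map step (the `labeled` and `zeroes` names are made up for illustration):

```elixir
# Sample data in the shape produced by the Enum.map step (illustrative only)
labeled = [
  %{state: "paused", health: "unhealthy"},
  %{state: "active", health: "healthy"}
]

# Same zero-initialised accumulator as in the original reduce
zeroes = %{
  "active" => %{"healthy" => 0, "unhealthy" => 0},
  "paused" => %{"healthy" => 0, "unhealthy" => 0}
}

counts =
  Enum.reduce(labeled, zeroes, fn %{state: state, health: health}, acc ->
    # Increment the nested counter in one step instead of get_in + put_in
    update_in(acc, [state, health], &(&1 + 1))
  end)

IO.inspect(counts)
```

The result should be the same nested map as the original version, so the for comprehension that follows would not need to change.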

mudasobwa · May 29, 2019, 9:41am

The first step could probably be simplified to:

input
|> Enum.group_by(& {
  (if &1.pause, do: "paused", else: "active"), 
  (if is_nil(&1.response), do: "unhealthy", else: "healthy")
})

Now one might use Enum.reduce/3 if zeroes are indeed required; otherwise Enum.map/2 would work:

... |> Enum.map(fn {{state, health}, v} ->
  {[state: state, health: health], Enum.count(v)}
end)
#⇒ [
#     {[state: "active", health: "healthy"], 1},
#     {[state: "paused", health: "unhealthy"], 1}
#  ]
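
To make the intermediate shape easier to see, here is a runnable sketch of the two steps above. It assumes a trimmed-down copy of the sample trackers (only the fields the grouping looks at), and the `grouped` and `counts` names are just for illustration:

```elixir
# Trimmed-down sample input: only the fields the grouping inspects
trackers = [
  %{pause: true, response: nil, user_id: "5cb6fb2c2071c963d21c517f"},
  %{pause: false, response: %{"status" => "ok"}, user_id: "5cb6fb2d2071c963d21c5180"}
]

# Step 1: group by a {state, health} tuple
grouped =
  Enum.group_by(trackers, fn tracker ->
    {
      if(tracker.pause, do: "paused", else: "active"),
      if(is_nil(tracker.response), do: "unhealthy", else: "healthy")
    }
  end)

# Keys are {state, health} tuples, values are lists of matching trackers
grouped |> Map.keys() |> Enum.sort() |> IO.inspect()

# Step 2: turn each group into {labels, count}
counts =
  grouped
  |> Enum.map(fn {{state, health}, members} ->
    {[state: state, health: health], length(members)}
  end)
  |> Enum.sort()

IO.inspect(counts)
```

Only combinations that actually occur in the input show up here, so there are no zero entries at this point.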

Phillipp · May 29, 2019, 9:48am

I thought about the group_by before but couldn’t wrap my head around it last night.

Zeros are indeed required, otherwise the time series are missing. What would the Enum.reduce/3 version look like?

mudasobwa · May 29, 2019, 10:06am

Actually, Enum.into/3 would suffice:

|> Enum.into(%{
    [state: "active", health: "healthy"] => 0,
    [state: "active", health: "unhealthy"] => 0,
    [state: "paused", health: "healthy"] => 0,
    [state: "paused", health: "unhealthy"] => 0
  }, fn {{state, health}, v} -> {[state: state, health: health], Enum.count(v)} end)
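
Put together, the whole transformation might look roughly like the sketch below. The module name `TrackerMetrics` and the private helpers are assumptions for illustration; `extract_labeled_data` is the name used earlier in the thread. Building the zero map from two label lists is an optional variation, not something the thread prescribes:

```elixir
defmodule TrackerMetrics do
  # Hypothetical module; label values taken from the thread
  @states ["active", "paused"]
  @healths ["healthy", "unhealthy"]

  def extract_labeled_data(trackers) do
    # Every label combination starts at zero so no time series goes missing
    zeroes =
      for s <- @states, h <- @healths, into: %{} do
        {[state: s, health: h], 0}
      end

    trackers
    |> Enum.group_by(&{state(&1), health(&1)})
    # Enum.into/3 overwrites the zero entries for combinations that occur
    |> Enum.into(zeroes, fn {{state, health}, members} ->
      {[state: state, health: health], length(members)}
    end)
  end

  defp state(%{pause: true}), do: "paused"
  defp state(%{pause: false}), do: "active"

  defp health(%{response: nil}), do: "unhealthy"
  defp health(%{response: _}), do: "healthy"
end

# Usage with a trimmed-down copy of the sample input
trackers = [
  %{pause: true, response: nil},
  %{pause: false, response: %{"status" => "ok"}}
]

result = TrackerMetrics.extract_labeled_data(trackers)
IO.inspect(Map.fetch!(result, state: "active", health: "healthy"))
IO.inspect(Map.fetch!(result, state: "paused", health: "healthy"))
```

Note that the map keys are keyword lists, so a lookup has to use the same key order (state first, then health).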

Phillipp · May 29, 2019, 10:29am

That worked wonderfully.

I was even able to simplify my for comprehension.

series = for {[state: state, health: health], value} <- extract_labeled_data(trackers) do
  {[state: state, health: health], value}
end

mudasobwa · May 29, 2019, 10:30am

Eh. Map.to_list/1 after Enum.into/3 would do exactly the same.
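
A quick way to check that claim, as a sketch with a small hand-written map in the shape Enum.into/3 produces (the `labeled` name is made up):

```elixir
# Hand-written sample in the shape produced by Enum.into/3 above
labeled = %{
  [state: "active", health: "healthy"] => 1,
  [state: "paused", health: "unhealthy"] => 1
}

# The comprehension destructures each entry and rebuilds it unchanged
via_for =
  for {[state: state, health: health], value} <- labeled do
    {[state: state, health: health], value}
  end

via_to_list = Map.to_list(labeled)

IO.inspect(via_for == via_to_list)
# => true
```

One difference worth knowing: a for comprehension silently skips entries that do not match the pattern, while Map.to_list/1 returns every entry.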

Phillipp · May 29, 2019, 10:33am

Yeah true. But I will keep the for in my code so it’s easy to see what the data looks like.

mudasobwa · May 29, 2019, 10:47am

At least you might be slightly more DRY: