How to improve sorting a list of maps that have duplicate keys

Hello friends,
I have a JSON file in which I save information about my apps and their dependencies. Some apps may share the same dependencies, and I need to merge them. The version is my priority: for each dependency I should keep the entry with the higher version and reject the lower ones.

[
    {"app":"testi","dependencies":[{"app":"phoenix","min":"3.7","max":"4"},{"app":"phoenix_live_view","max":"0.17.7","min":"4.01.0"},{"app":"ueberauth","max":"0.17.7","min":"1.17.7"},{"app":"ueberauth_github","min":"0.8.1"},{"app":"ueberauth_google","min":"0.10.1"}],"dependency_type":"soft_update","git_tag":null,"timeout":null,"type":"hex","update_server":null,"url":"https://hex.pm/packages/mishka_social","version":"0.0.2 "},
    {"app":"testi2","dependencies":[{"app":"phoenix","min":"1.6"},{"app":"phoenix_live_view","max":"0.1.0","min":"0.01.78"},{"app":"ueberauth","max":"0.17.7","min":"0.17.7"},{"app":"ueberauth_github","min":"0.8.1"},{"app":"ueberauth_google","min":"0.10.1"},{"app":"test","max":"0.10.25","min":"0.10.1"}],"dependency_type":"soft_update","git_tag":null,"timeout":null,"type":"hex","update_server":null,"url":"https://hex.pm/packages/mishka_social","version":"0.0.2 "}
]

I decode the JSON above with Jason into a list and take the dependencies key of each app. The output:

[
  [
    %{"app" => "phoenix", "max" => "4", "min" => "3.7"},
    %{"app" => "phoenix_live_view", "max" => "0.17.7", "min" => "4.01.0"},
    %{"app" => "ueberauth", "max" => "0.17.7", "min" => "1.17.7"},
    %{"app" => "ueberauth_github", "min" => "0.8.1"},
    %{"app" => "ueberauth_google", "min" => "0.10.1"}
  ],
  [
    %{"app" => "phoenix", "min" => "1.6"},
    %{"app" => "phoenix_live_view", "max" => "0.1.0", "min" => "0.01.78"},
    %{"app" => "ueberauth", "max" => "0.17.7", "min" => "0.17.7"},
    %{"app" => "ueberauth_github", "min" => "0.8.1"},
    %{"app" => "ueberauth_google", "min" => "0.10.1"},
    %{"app" => "test", "max" => "0.10.25", "min" => "0.10.1"}
  ]
]

As you can see, some apps appear more than once, and their versions may or may not differ.

For example, phoenix appears twice. The first entry has min 3.7 and the second has min 1.6 (the sort should be by min, since this key is the important one for the version), so I need to keep

%{"app" => "phoenix", "max" => "4", "min" => "3.7"},

and reject the others.

My code:

Enum.map(json_data, &(&1["dependencies"]))
|> Enum.concat
|> Enum.group_by(&(&1["app"]))
|> Map.to_list()
|> Enum.map(fn {_key, list} ->
    Enum.sort_by(list, &(&1["min"]))
    |> List.last()
end)

I think I'm going about this the wrong way. Do you have any suggestions?

This is the final output

[
  %{"app" => "phoenix", "max" => "4", "min" => "3.7"},
  %{"app" => "phoenix_live_view", "max" => "0.17.7", "min" => "4.01.0"},
  %{"app" => "test", "max" => "0.10.25", "min" => "0.10.1"},
  %{"app" => "ueberauth", "max" => "0.17.7", "min" => "1.17.7"},
  %{"app" => "ueberauth_github", "min" => "0.8.1"},
  %{"app" => "ueberauth_google", "min" => "0.10.1"}
]

By the way, some of my other apps may have empty dependencies.

Thank you in advance.

You cannot sort versions by their string representation. For semantic versions, Elixir ships the Version module, which has an API to compare two versions.
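To illustrate the point above, here is a minimal sketch comparing plain string ordering with Version.compare/2. The version strings are made-up examples, and passing a module as the sorter to Enum.max/2 assumes Elixir 1.10 or later.

```elixir
# String comparison goes character by character, so "9.0.0" beats "10.0.0".
string_max = Enum.max(["9.0.0", "10.0.0"])

# Version.compare/2 compares major, minor and patch as numbers.
comparison = Version.compare("10.0.0", "9.0.0")

# Passing the Version module as the sorter makes Enum use Version.compare/2.
version_max = Enum.max(["9.0.0", "10.0.0"], Version)

IO.inspect({string_max, comparison, version_max})
```

The string-based maximum picks "9.0.0", while the Version-based one picks "10.0.0".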

I have no idea how to do that with Version.compare, because we may have, for example, 4 different versions of the :test app.

Enum.min_max_by(["3.0.0", "1.5.0", "3.0.0-rc.1"], &Function.identity/1, Version)
# {"1.5.0", "3.0.0"}

I changed the code with your help. Is this what you meant? If yes, a huge thank you, because I would not have found this even in 100 years. I even had to read this block of code several times to understand what it is doing :))))

Enum.map(json_data, &(&1["dependencies"]))
|> Enum.concat
|> Enum.group_by(&(&1["app"]))
|> Map.to_list()
|> Enum.map(fn {_key, list} ->
  Enum.min_max_by(list, &Function.identity(&1["min"]), Version)
  |> Tuple.to_list()
  |> List.last()
end)

So with this change, can I improve it further, especially in performance?

If you only need the max, you can use Enum.max_by. I thought you needed both ends.

Enum.map + Enum.concat could just be Enum.flat_map. There is also no need for Map.to_list, since maps are enumerables too.

Enum.flat_map(json_data, &(&1["dependencies"]))
|> Enum.group_by(&(&1["app"]))
|> Enum.map(fn {_key, list} ->
  Enum.max_by(list, &Function.identity(&1["min"]), Version)
end)

Thank you, this was a full course of learning for me
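One caveat worth noting: Version.compare/2 raises Version.InvalidVersionError for strings such as "3.7" or "4", which appear in the sample data, because it expects full MAJOR.MINOR.PATCH versions. A possible workaround, as a rough sketch only: the DepMerge module and its normalize/1 and highest_min/1 functions below are hypothetical names, and the helper assumes purely numeric, dot-separated strings with at most three parts (no pre-release tags). The `|| []` also covers apps whose dependencies are missing or null.

```elixir
defmodule DepMerge do
  # Hypothetical helper: turns "3.7", "4" or "4.01.0" into a three-part
  # version string. Assumes numeric, dot-separated input only.
  def normalize(version) do
    parts =
      version
      |> String.trim()
      |> String.split(".")
      |> Enum.map(&String.to_integer/1)

    # Pad a missing minor/patch part with zeros, then keep three parts.
    (parts ++ [0, 0])
    |> Enum.take(3)
    |> Enum.join(".")
  end

  # Keeps, for each dependency name, the entry with the highest "min".
  def highest_min(json_data) do
    json_data
    # `|| []` covers apps whose "dependencies" key is missing or null.
    |> Enum.flat_map(&(&1["dependencies"] || []))
    |> Enum.group_by(& &1["app"])
    |> Enum.map(fn {_app, deps} ->
      Enum.max_by(deps, &normalize(&1["min"]), Version)
    end)
  end
end

json_data = [
  %{"app" => "testi", "dependencies" => [%{"app" => "phoenix", "min" => "3.7", "max" => "4"}]},
  %{"app" => "testi2", "dependencies" => [%{"app" => "phoenix", "min" => "1.6"}]},
  %{"app" => "testi3", "dependencies" => []}
]

result = DepMerge.highest_min(json_data)
IO.inspect(result)
```

The original maps are returned unchanged; normalization is only used for the comparison.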

The &Function.identity(&1["min"]) could be just & &1["min"]


I have read the max_by documentation and source code, but I could not understand what it actually does.

For example

Enum.max_by(list, &(&1["min"]), Version)

When we pass the Version module, what does this function do with it? Which functions of the Version module are used?

Sorry, I can't figure it out. I am asking just to learn.

Thank you


Update

I found it

It uses Version.compare to determine order and the min/max from that.
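For anyone else reading, a small sketch of what that means in practice. As far as I understand it, passing a module as the sorter is roughly the same as passing a function that calls that module's compare/2 and keeps the first argument unless it is lower. The deps list below is made-up sample data.

```elixir
deps = [
  %{"app" => "test", "min" => "0.10.1"},
  %{"app" => "test", "min" => "0.9.0"},
  %{"app" => "test", "min" => "0.10.25"}
]

# Module as sorter: Enum calls Version.compare/2 on the mapped values.
with_module = Enum.max_by(deps, & &1["min"], Version)

# Roughly equivalent explicit sorter: true when `a` is not lower than `b`.
with_fun =
  Enum.max_by(deps, & &1["min"], fn a, b -> Version.compare(a, b) != :lt end)

IO.inspect(with_module)
IO.inspect(with_module == with_fun)
```

Both calls pick the entry with min "0.10.25", and this works the same way no matter how many versions of the same app are in the list.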
