it seems there are several (too many, I think) options for making sets, but no clue which one to choose in which situations, like number of elements in the collection, speed and space complexity, to name a few factors.
How do you choose between all these options when you need to use element sets in your applications?
Map is not a set. You can implement one on top of it, then you get a MapSet.
sets and gb_sets are some erlang implementations of different sets, also there is ordsets.
I’m not sure about how sets is implemented, ordsets though is just an ordered list of elements. gb_sets uses a self balancing tree to implement an ordered set.
In elixir code, I’d always prefer MapSet, as its interface just fits into elixir, as its designed for/in it.
In erlang code though I quite often just use sets, because it has the shortest module name
Some of the erlang modules are effectively legacy implementations from before Maps existed on which you could build MapSet. I believe there are benchmarks which show MapSet to be faster on all operations than the original sets.
I would say that by default one should then opt for MapSet, and to consider the other options if rewriting the application-level algorithm is not enough to get the desired increase in performance.
I see a use of Map to implement a set in those cases where a MapSet.pop would be handy. I believe that mimicking such non-existent function via MapSet.member? followed by MapSet.delete can be slower when data is large, otherwise Map would not need to provide such function.
I see this as an optimization of edge functionality - I want this value removed from the set and know whether it existed in the set - in the service of still using a type that communicates the nature of a set rather than that of a key value store for its overall intended use.
The tradeoff I see here is that I can still use MapSet where appropriate (rather than Map) but am potentially forced to take extra steps to placate dialyzer for the optimized edge functionality (i.e. pop on a set).