Unexpected behavior from libring (HashRing). Unlucky number 14?

I’ve been kicking the tires of the libring package to help distribute work/data across multiple nodes in a deterministic way.

Given a ring like so:

iex> ring = HashRing.new() 
  |> HashRing.add_node(:a@localhost) 
  |> HashRing.add_node(:b@localhost)
#<Ring[:b@localhost, :a@localhost]>

You can ask it which n nodes a given key should map to, e.g.

iex> HashRing.key_to_nodes(ring, 13, 2)
[:a@localhost, :b@localhost]

But what’s strange is that certain numbers seem to cause problems. For example, the lucky number 14 comes out with only ONE node assigned to it instead of the 2 that were asked for:

iex> HashRing.key_to_nodes(ring, 14, 2)
[:a@localhost]

But more curious is that the problematic keys seem to be related to the names of the hosts, e.g. using hostnames :a and :b works as expected:

iex> ring2 = HashRing.new() |> HashRing.add_node(:a) |> HashRing.add_node(:b)
#<Ring[:a, :b]>
iex> HashRing.key_to_nodes(ring2, 14, 2)
[:b, :a]

The docs say that “Will return either count results or the number of nodes, depending on which is smaller.”, but this seems to not be the case with the lucky number 14.

Can someone explain this? This is easy enough to avoid by wrapping calls to HashRing.key_to_nodes/3 to return all nodes when the count is equal to the number of available nodes, but it sure caught me by surprise!

:wave:

libring seems to be using gb_trees and iterates over it. Maybe it didn’t implement wrapping around?

Here: libring/lib/ring.ex at c64f12ded165eef37a987595dd1815e5846e8e6d · bitwalker/libring · GitHub

Maybe it needs to restart iteration from some appropriate point (e.g. :gb_trees.smallest(r) or some 0 equivalent) if next returnes :none and count > 0


Off-topic: Erlang people have some sort of unusual affinity towards hash rings (probably because of Riak) but I like Rendezvous hashing - Wikipedia better :upside_down_face:

Here’s a basic but simple (no deps) implementation: Check if there is interest in using highest random weight distribution strategy · Issue #277 · derekkraan/horde · GitHub – modifying it to k > 1 to pick top-K nodes is a matter of switching from Enum.max_by to Enum.sort and Enum.take.

1 Like

Not particularly familiar with this algorithm, but re: @ruslandoga’s point about wrapping around - 14’s phash value is definitely large :thinking:

iex(2)> :erlang.phash2(14, hash_range)
4251188017

iex(3)> :erlang.phash2(13, hash_range)
780247510

Of course, Riak was a Dynamo clone and the Dynamo paper was very influential. They in turn got it from Akamai, and I have seen it speculated that the popularity of consistent hashing was essentially a cargo cult on top of Akamai, which was a big deal during the boom. There was even an old Apple keynote from the time where Steve Jobs talked up Akamai’s CDN and announced they had invested.

The funny part is that Rendezvous hashing is not only better but, on top of that, way easier to understand.

Side note: if you are using this method to store data, make sure you read the Copysets paper!

1 Like

Thanks for the insights! I’m new to all the algorithms here. I see GitHub - timdeputter/Rendezvous: Implementation of the Rendezvous or Highest Random Weight (HRW) hashing algorithm in the Elixir Programming Language but it doesn’t seem to be on hex? (No docs, anyway… it’s 10 years old). I have some other headier problems with distributed nodes that are probably more in need of my brainpower, but I’d be happy to kick tires on this… anyone know if there is a more current implementation of rendezvous hashing in Elixir?

anyone know if there is a more current implementation of rendezvous hashing in Elixir?

It might be the curse of this approach. In Elixir it can be expressed with a single Enum.max_by

key = 14
nodes = Node.list([:this, :visible])
chosen_node = Enum.max_by(nodes, fn n -> {:erlang.phash2({key, n}), n} end)

:stuck_out_tongue:


The linked implementation seems to be using very heavy hashing functions and attempts to make them configurable, there is no need for this. :erlang.phash2 is good and fast enough.

3 Likes

Which linked implementation are you referring to? Are you saying that libring’s implementation is too complex? Or that the rendezvous repo is trying to be configurable and it’s too complex.

How to adjust :erlang.phash2 so it returns a sorted list?

What he’s saying is that the rendezvous hash algorithm is so simple that a library isn’t really necessary.

The algorithm is literally just: hash the key with each node and sort the hashes, then take the top N results and those are your nodes. The max_by example he gave would be N=1, but you could instead do Enum.sort_by(...) |> Enum.take(n).

3 Likes

ah, gotcha. Makes sense. Yep. Thanks for the clarification!