Why does the Heap Library by James Harton sort the elements in a different manner from a normal min-heap?

belteconti · November 2, 2021, 6:08am

Hi! So I was exploring some libraries of Elixir and I came across this heap library by James Harton.

The thing which I wanted to ask is that behind the scenes, the result produced by this is quite different from a normal min-heap. For example, I push the following elements to my heap: [4, 8, 9]. Then, I pop the first element and push 11 which gives us: [8, 9, 11]. Similarly, I again pop the topmost element and push 20. Once again, I pop the topmost element and push 15 this time.

My final heap in a normal min-heap would look like: [11, 20, 15]. However, when using the library, my result comes out to be [11, 15, 20].

If anyone has used this library and is familiar with it, does anyone know the reason why this is the case? And is there any way of getting the normal min-heap results? I also tried using the Heap.min() part of this library as well, however, that doesn’t work as well.

stevensonmt · December 26, 2022, 11:16pm

Necro bumping this because I want to make sure my understanding of how heap structures should work is correct. The misunderstanding in the OP seems to me comes from observing the heap as a list for display but expecting it to follow “[parent-left-right]” order down a tree structure. With the operations described by OP 20 would be the left child of the root node and 15 the right child. Displaying the structure as a flat array however the sorted array is more visually aligned with the expectation of a min heap and the underlying implementation as a sorted list of tuples rather than a true binary tree structure.
Am I way off?
Also there are a few different improvements of heaps and priority queues floating around. I am thinking about a crate that describes a behaviour for such structures that could then be used to adopt any implementation without needing to change a lot of code in the end user application. Any value to such an exercise?

Aetherus · December 26, 2022, 11:39pm

I checked the source code and the implementation is pairing heap, which is also what I always use when solving LeetCode problems with Elixir. Each node in a pairing heap can have any number of children, and they are not in any particular order. Your final state of the heap could be

{11, [
  {15, []},
  {20, []}
]}

or

{11, [
  {20, []},
  {15, []}
]}

or

{11, [
  {15, [
    {20, []}
  ]}
]}

It doesn’t matter as long as the value of every node is smaller than all of its children’s value.

stevensonmt · December 27, 2022, 12:09am

One quick difference that jumps out to me with pairing heap vs binary tree heap is that with the latter it must be a complete tree where children are filled at each level and if not possible then a single child is assigned to the left. So when you pop and then insert if the intermediate result is not a complete tree the final child is assigned a left position.
Also, I assume that while the three structures you have are equivalent in result the path taken to reach them is necessarily different. Is that accurate?

Aetherus · December 27, 2022, 12:43am

I’m sorry but my English is kinda rusty, so I don’t fully understand your question.

Anyway, I show an example of what pairing heaps do in the simplest way, without custom comparator, so the smallest items should appear at the top.

An empty pairing heap is just nil.

Push 1

{1, []}

Then push 2

{1, [
  {2, []}
]}

Then push 3

{1, [
  {3, []},
  {2, []}
]}

Note that 3 appears before 2, because adding items to the head of a list is cheaper than adding items to the end.

Then push 4 then 5 then 6

{1, [
  {6, []},
  {5, []},
  {4, []},
  {3, []},
  {2, []}
]}

When pushing items greater than the root into a non-empty pairing heap, the newly added items always appear at the second-top layer.

Then push 0

{0, [
  {1, [
    {6, []},
    {5, []},
    {4, []},
    {3, []},
    {2, []}
  ]}
]}

Pushing an item smaller than the root, the newly added item becomes the new root, and the existing heap becomes its child.

What happens when pushing an item equal to the root?

Well, you can make it the new root, you can also make it the new child of the existing root. It depends on how you implement the pairing heap. Pairing heap does not ensure FIFO of the equal items.

Let’s do a pop

Obviously we should remove 0. Since it has only one child, we are done.

{1, [
  {6, []},
  {5, []},
  {4, []},
  {3, []},
  {2, []}
]}

Pop again

This time, we should pop the root 1 out of the heap, but just removing 1 makes the heap no longer a tree. So we have to elect a new root. Here comes the pairing procedure.

We take the second layer, and make them 3 pairs.

# Pair1
{
  {6, []},
  {5, []}
}

# Pair 2
{
  {4, []},
  {3, []}
}

# Pair 3
{
  {2, []},
  nil
}

Note that {2, []} has no other child to pair with, so we pair it with nil.

Which child pairs with which child is kinda arbitrary, as long as a child only gets paired once. I just use the easiest way.

Then, for each pair, we build a sub pairing heap. This is called meld. The melding procedure is very simple. Pick the smaller one in the pair to be the root, and add the bigger one to the second layer of the smaller one.

# The heap built with the first pair
{5, [
  {6, []}
]}

# The heap built with the second pair
{3, [
  {4, []}
]}

# The heap built with the third pair
{2, []}

There are still more than 1 heaps, so we do the pair-and-meld again.

Pairing:

# Pair 1
{
  {5, [
    {6, []}
  ]},
  {3, [
    {4, []}
  ]}
}

# Pair 2
{
  {2, []},
  nil
}

Melding:

# Heap built with pair 1
{3, [
  {5, [
    {6, []}
  ]},
  {4, []}
]}

# Heap built with pair 2
{2, []}

Well, there are still 2 heaps, so we pair them up and meld again. The result is

{2, [
  {3, [
    {5, [
      {6, []}
    ]},
    {4, []}
  ]}
]}

The idea of pairing heap is to postpone the restructuring until the heap become multiple trees.

Aetherus · December 27, 2022, 1:21am

By the way, adding a new item to an existing heap can also be viewed as pairing the heap with {new_item, []} and melding them.

stevensonmt · December 27, 2022, 2:29pm

@Aetherus, thank you so much for the detailed replies. I think some typos from typing on my phone probably contributed to making my question more confusing rather than your English which is excellent as far as I can tell. What I meant was that because the pop and insert actions are sequential there is an intermediate state that can affect the ultimate shape of the end result.
Following the OP’s steps:
Push 4
{4, []}
Push 8

{4, [ 
    {8, []}
    ]}

Push 9

{4, [
    {9, []},
    {8, []}
  ]}

Pop first element

{ {9, []}, {8, []} } => pairing/melding => {8, [ {9, []} ] }

Push 11

{8, [
    {9, []},
    {11, []}
    ]}

Pop first element (after pair/meld)

{9, [ {11, [] } ] }

Push 20

{9, [ {11, []}, {20, []} ] }

Pop first element

{ 11, [ {20, []} ] }

Push 15

{11, [ {15, []}, {20, []} ] }

But the only way I can see to end up with your third example

would be to push 15, then push 20, then push 11. So the end result is equivalent in function but the underlying structure depends on the order of operations. As opposed to a binary tree heap where the structure is “rebalanced” on each operation. (I’m not sure if that’s the technically correct use of rebalancing, hence the quotation marks.) Have I misunderstood the pairing heap?

Aetherus · December 27, 2022, 10:02pm

You understand it quite well. Yes, pairing heap does not even try to rebalance. It just keeps the amortized O(log n) time per pop. A small detail though, push does not need to keep the second layer sorted. Any O(1) operation to add a child works fine. For an Elixir list, it’s [new_child | old_children].

stevensonmt · December 27, 2022, 11:09pm

Oh, right, thanks.
This then

should have been
{9, [ {20, []}, {11, []} ] }

Really appreciate your patient explanation.