Elixir struct that implements `hd | tl`

rdrey · September 20, 2017, 2:37pm

Hi everyone

I am using James Harton’s Heap / Priority Queue and I would like to know if it’s possible to implement List-like pattern matching for a custom datastructure. I.e. the heap is implemented as a struct, but I would like to:

# my heap actually contains {DateTime, object} tuples.
heap = 1..10 |> Enum.shuffle() |> Enum.into(Heap.min)
root = hd heap
rest = tl heap
# and
[root | rest] = heap

I first looked at Protocols and Behaviours, but found that the Heap already implements Enumerable and Collectable, which I thought might enable this. Then I looked at the source of hd and saw that it’s just calling into Erlang.

Is what I’m trying to do possible at the moment? (It seems strange to me that hd would only work on lists and not tuples for example.)

Thanks!

NobbZ · September 20, 2017, 3:10pm

Enumerable enables you to use the functions from Enum. They do not have anything to do with hd and tl, those are for lists only.

Since Heap does implement Enumerable, you can use Enum.to_list/1 and work with that.

rdrey · September 20, 2017, 4:07pm

Ah thanks.

[root | rest] = Enum.to_list heap is not ideal, since it turns the rest into a List. I guess the Heap module will have to implement a function that returns a tuple with {root, rest} (or a 2 element list), if I want to access both with a one liner.

OvermindDL1 · September 20, 2017, 4:08pm

You can always use Enum.take(heap, 1) in place of hd(head).

rdrey · September 20, 2017, 4:10pm

Yeah, thanks The Heap does implement Heap.root/1 to get the first element and Heap.pop/1 to get the rest. I was just looking for an Elixiridiomatic way to implement something that returns both.

OvermindDL1 · September 20, 2017, 4:14pm

Hmm, Enum.split/2 returns lists for both regardless of the data structure, don’t think there is a protocol’ish way off hand?

rdrey · September 20, 2017, 4:25pm

Yeah, Enum.split/2 is the closest to what I’m aiming for. (Might also be a good name, since pop is already taken.) Would also be cool if there was a protocol’ish way to implement “match like a list” and “match like a map”.

Follow-up question: What’s cheaper / more idiomatic, returning a tuple or a list containing head and tail?

Tuple!
List!

0 voters

OvermindDL1 · September 20, 2017, 4:36pm

There is such a way in my ProtocolEx library. ^.^

Depends, is it a fixed-length set? Then tuples. Is it an unknown length at runtime, then lists. ^.^

They are both very cheap, the BEAM is highly optimized for both. Tuples are slightly faster on longer sizes than lists technically, since their length can be matched on immediately if that is an important attribute for your matchers.

rdrey · September 20, 2017, 4:41pm

Hehe, cool. I’ll definitely check it out when I’ve got a working prototype. Probably shouldn’t dive too deep into metaprogramming just for syntactic sugar before my code works.

NobbZ · September 20, 2017, 5:04pm

To be honest, after what I’ve learned from my datastructure lessons, I’d have expected there were an operation which is equivalent to {Heap.root(h), Heap.pop(h)} (perhaps a PR against the repository helps, biggest problem is to come up with an appropriate name… This is exactly what I had pop/1 expected to do…).

I’d not use anything from Enum if you want to retain the Heaps attributes! Anything in Enum will convert your heap into an ordinary list!

OvermindDL1 · September 20, 2017, 5:21pm

I as well.

I’ve been thinking of using my ProtocolEx system to build a replacement Enum (EnumEx?) module that implements the proper styles and names of the algorithms. I could even use the new property testing library in it automatically to ensure that things follow the interfaces properly too. ^.^

But it would be so nice to have an enum library that actually returns the same types, a proper Monad Enum library. Enum just breaks SO many of the Monad Laws (the most basic of which is to always return the same-bleeping-type of monad when a monad is returned unless explicitly casting). >.>

tallakt · September 22, 2017, 7:04am

HI. I believe what you think of is not possible. Here are some suggestions based on speculation (ie I havent tried this code in practice)

To get hd/tl perhaps create your own module Heap.hd and Heap.tl and then include them into your module when using Heap. Your implementations of hd and tl must fall back to Kernel module if the argument is not a Heap struct.

For [h | t] you could use sigil syntax, for example %H[h | t] and solve that with macros quite easily. You should probably have a named fallback in case of naming collisions for that one. For examples look at the Kernel module.

My suggestion overall would be to skip the fancy stuff and just create functions called split, hd and tl in the Heap module. The split function should return a two item tuple. That extra syntactic sugar is probably not worth it, considering the complexity added…