Writing a keyword list guard?

I’d like a guard to determine if the value is a keyword list or not. I’ve tried the following macro:

defmacro is_keyword([{key, _value} | _tail]), do: quote do: is_atom(unquote(key))

but when I pass in a keyword list I get this error: ** (FunctionClauseError) no function clause matching in is_keyword/1

however if I remove the pattern matching in the function signature everything seems to work ok. So I’m lead to believe that pattern matching isn’t permitted in guards. If that’s the case how best can I write a single guard for keyword list detection? (yeah I know it should be a recursive function)

Never mind, rubber ducking this one out and realized it is a compile-time thing and not a runtime thing.

I believe the only way to make a guard for this would be:
defguard is_keyword(value) when is_list(value) and length(value) > 0 and is_tuple(hd(value)) and is_atom(elem(hd(value),0))

Unfortunately you can’t use pattern matching in guards

This is a non-starter (there is a reason the language doesn’t come with one).

TL;DR:
You cannot reliably fully traverse a list fast enough to be viable in guards.
As such, you cannot exhaustively ensure every item of a list is a Keyword pair in a guard.


Keyword lists are just lists (of atom/any two-tuples {atom, term}), which are singly-linked car/cons cells that require O(n) time to traverse.

Matches and guards are required to have near-constant-time access to data structures, since they are used in places in the VM with tight performance requirements (like selecting a function head to invoke from many given certain arguments).

This is why tuple and map indexing are allowed in guards (with integer-based elem(tuple, n) and key-based is_map_key(map, key) or :erlang.map_get(map, key)), their structures offer constant-time access to the data they compose. (Also why you can destructure on them in matches.)

I say “near” constant-time because you are also allowed to “peek” into the heads of O(n)-traversable data structures when such an operation is known to be fast and bounded (with hd in guards for lists, binary_part in guards for binaries, and destructuring in match patterns: [first | rest] for lists, "start" <> rest for strings, <<leading, rest::binary>> for binaries).

In practice this peeking is just enough to enable function clauses to recursively consume these O(n) datastructures piece-by-piece, using matching and guards to decide how to handle each chunk. Useful enough, we make concessions on the constant-time constraint.

However, fully accessing every item in a list is an unbounded operation, so, not allowed—even a theoretical “recursive” implementation (that guards are, by design, not expressive enough to support).

What your original macro (and what @Schultzer’s guard) attempts to do is just evaluate the first “peeked” term of a list for Keyword-iness, which could be useful in some contexts, but is generally insufficient to ensure that every item in a list is a keyword pair. And you can’t traverse every item in a list in guards, for reasons explained above.

3 Likes

I would say it is doable. You can mimic what the Kernel.in/2 operator does. Instead of checking for values you check for every element in the list being 2-element tuple and the first element being an atom (all operations allowed in guards). If the context of the call is not a match or a guard, you delegate to Keyword.keyword?/1.

Here’s the in/2 implementation: elixir/lib/elixir/lib/kernel.ex at v1.18.4 · elixir-lang/elixir · GitHub

What you end up doing is building are nested conditions similar to this

but with :erlang.andalso/2.

1 Like

I guess when is_atom(elem(hd(value), 0)) or [] == value is enough as guards can fail.

@christhekeele peeking on the first item only is not so bad, it’s generally enough when a function has different clauses to work with keywords or maps or scalars for instance. But yeah it gives no guarantee.

1 Like

But in only works that way in guards for fixed compile time lists. It doesn’t work with runtime lists.

EDIT: Elaborating a bit now that I’m not on mobile.

You can do:

def foo(x) when x in [1,2,3] do

which will turn into

def foo(x) when x == 1 or x == 2 or x ==3 do

but what you can’t do is:

def foo(x, list) when x in list do
6 Likes

Yeah, my example is gross, hard to give a good example without knowing the constraints the function would be used for.

But I do wish that defguard would allow certain patterns in it’s function head.

1 Like

That would be hard since guards do not support patterns, so the pattern would need to be ported back into the “call” part of the function head!

But I’m OK with the language not doing that kind of stuff, Erlang or Elixir, and sticking to what the VM can do. That’s something I like about those languages ; you can “feel” the underlying VM right behind your code, there isn’t a deep mass of unknown mechanisms like between Java code and the JVM, and despite that you can still write very expressive and high level code.

I don’t believe that would be necessary, as the pattern could be translated to a valid guard.

I’m not sure any pattern could be translated. But yeah many patterns could, that would be nice.

Yeah, I did state certain patterns.

1 Like

No, the restriction is artificial. User-defined guards running arbitrary code would not be difficult to support from a technical point of view (we already do it, see prim_eval.S and its interaction with :erl_eval), it’s just very annoying to define what’s supposed to happen in corner cases like such a guard running a receive while in a receive, how exception tracing is supposed to work, and the likes.

So why don’t we support it?

Partially because it’s a bit of a footgun. If you end up throwing a user-defined guard into an infinite loop, which specific clause was it “called from?” If you accidentally make it side-effecting, which combination of the (say) three hidden invocations derailed things?

The main reason however is historical and there’s been no pressing need to change it. You can case over the result in the function body, and that’s been enough so far.

6 Likes

To follow up on @jhogberg answer of this not being true: length/1 is a guard and completes in linear time by my understanding.

3 Likes