Discussion: Incorporating Erlang/OTP 21 map guards in Elixir

josevalim · June 22, 2018, 4:00pm

Hi everyone,

Erlang/OTP 21 comes with two new guards contributed by @michalmuskala: map_get/2 and is_map_key/2. Now we need to discuss how we would integrate those in Elixir. This proposal will be broken in three parts, each with their own discussion. Feel free to comment on any conclusion individually or as a group, but please let us know why.

NOTE: this is a focused thread, so we appreciate if everybody stayed on topic. Feel free to comment anything in regards to maps and guards but avoid off-topic or loosely related topics. For example, if you would like to know when other OTP 21 features will be added to Elixir, please use a separate thread.

The `map_get/2` and `is_map_key/2` guards

The first topic to consider is if we want to add the map_get/2 and is_map_key/2 guards to Kernel. Here are some things to consider.

First of all, in Elixir we don’t duplicate Kernel functionality in other modules. For example, since we have Kernel.map_size/1, there is no such thing as Map.size/1. However, given that we have Map.fetch!/2 (equivalent to map_get/2) and Map.has_key?/2 (equivalent to is_map_key/2), adding those two new guards means we would break those rules, as we simply can’t remove the Map functions.

Second of all, I am not particularly sure that map_get/2 and is_map_key/2 are beneficial in actual code, since they don’t add anything that you can’t do with a pattern matching. Let’s see the website example, without guards:

def serve_drinks(%User{age: age} = user) when age >= 21 do
  ...
end

Now written with guards:

def serve_drinks(%User{} = user) when map_get(user, :age) >= 21 do
  ...
end

Or if we assume we don’t need to check for the struct name:

def serve_drinks(user) when map_get(user, :age) >= 21 do
  ...
end

In both cases, I personally prefer the original example. It is clearer and promote better and more performant practices (pattern matching instead of field access).

In my opinion, the real advantage of map_get/2 and is_map_key/2 is that we can use them to build composable guards. For example, now we can finally implement guards such as is_struct/1:

defguard is_struct(struct)
         when is_map(struct) and
                :erlang.is_map_key(:__struct__, struct) and
                is_atom(:erlang.map_get(:__struct__, struct))

Luckily, to implement those guards, we don’t need to add is_map_key/2 and map_get/2 to Elixir. We can just invoke them from the :erlang module.

Discussion #1: Should we add map_get/2 and is_map_key/2 to Kernel?

Expand defguard

If the main goal of map_get/2 and is_map_key/2 is to help in the construction of composable guards, then one possible route to take is to expand the defguard construct, introduced in Elixir v1.6, to also support patterns. Such that the following:

def serve_drinks(%User{age: age} = user) when age >= 21 do
  ...
end

Could be written as:

defguard is_drinking_age(%User{age: age}) when age >= 21

def serve_drinks(user) when is_drinking_age(user)

The guard above would internally be written as:

defguard is_drinking_age(user)
         when is_map(user) and
                :erlang.is_map_key(:__struct__, user) and
                :erlang.map_get(:__struct__, user) == User and
                :erlang.is_map_key(:age, user) and
                :erlang.map_get(:age, user) >= 21

We can translate almost any pattern to guards. The only exception are binary matches, which won’t be supported.

Discussion #2: Should we allow patterns in defguard/1?

Support `map.field` in guards

Yet another approach to take is to support only map.field in guards. Again, if we take the original example:

def serve_drinks(%User{age: age} = user) when age >= 21 do
  ...
end

We could write it as:

def serve_drinks(%User{} = user) when user.age >= 21 do
  ...
end

Or even as:

def serve_drinks(user) when user.age >= 21 do
  ...
end

Where the latest is, IMO, by far the most readable.

However, this also has its own issues. First of all, as we said in the first section, pattern matching is typically more performant and clearer. Second of all, the foo.bar syntax in Elixir can mean either map.field OR module.function. This may be the source of confusion since invoking remote functions is not allowed in guards and those will make the guard to fail. For example, if somebody passes serve_drinks(User) to def serve_drinks(user) when user.age >= 21 do, you will get a FunctionClauseError, as only passing maps would be supported. However, I personally don’t think this would be a large issue in practice, as guards have always been limitted, remote calls are never supported in guards and dynamic remote calls are rare anyway, especially zero-arity ones.

Discussion #3: Should we allow map.field in guards?

ryanwinchester · June 22, 2018, 4:31pm

I agree with all of your opinions.

I sometimes have long-ish struct names so I’d prefer the look of map_get guards, but would still keep pattern-matching because of the mentioned performance benefits.

Also, expanding the defguard to add patterns looks like a brilliant idea.

edit:

#1: no
#2: yes
#3: no

gregvaughn · June 22, 2018, 4:32pm

Discussion #1: Should we add map_get/2 and is_map_key/2 to Kernel?
no
Discussion #2: Should we allow patterns in defguard/1?
yes
Discussion #3: Should we allow map.field in guards?
no

The reason behind it is basically the same for each. We should encourage pattern matching. Having more ways of doing the same things will lead to more confusion.

grych · June 22, 2018, 4:47pm

Ad. 1: No.

Ad. 2: YES! Pattern matching is a killer feature and should be used in every corner of Elixir world.

Ad. 3: Yes, it is pretty and useful. What about non-atom keys? Will

def serve_drinks(user) when user["age"] >= 21 do
  ...
end

be also supported?

josevalim · June 22, 2018, 4:55pm

FWIW, I tend to use this rule: if the key is necessary when matching the pattern, keep it in the pattern, otherwise move it to the body. So I end-up with code like this:

def some_fun(%{field1: :value} = struct) do
  %{field2: value2, field3: value3} = struct
  ...
end

No. foo["bar"] also works for lists and it would be hard to argue why it works only for maps and not lists. It is probably another reason against map.field.

michalmuskala · June 22, 2018, 5:31pm

Discussion #1: Should we add map_get/2 and is_map_key/2 to Kernel?

Yes, we should. There are things that can’t be expressed with the other proposals (or any other way in a guard for that matter), most notably:

def check_key(map, key) when is_list(map_get(map, key)), do: ...

It’s effectively implementing the def foo(key, %{key => value}) ... that is such a frequent request - now we’d have a way to do this.

However, I’m not a huge fan of the map_get name (pretty ironic considering it’s me who contributed it to Erlang). It feels very “imperative” while I find pattern matching and guards to be mostly about declarative code.

For example, for tuples, we use elem, not get_elem and for lists hd not get_hd (though head would probably be better). Because of that, I think we should explore adding this function under some different name. Some proposals: field, map_field, map_value, key_value, …

Discussion #2: Should we allow patterns in defguard/1?

Yes, I think we should. I like this idea a lot.

The only problem is that we might hit some performance issues with complex patterns when the value is used multiple times. Since we can’t bind inside guards, we’d have to generate the “access” code at each place the value is used. It becomes especially problematic when coupled with the in operator. It might lead to gigantic code explosion. I don’t think compiler would be smart enough to eliminate all those common sub-expressions (a long-term solution to this would be compiling to core erlang that allows binding in guards, that’s another set of issues, though).

With all of that said, I don’t think this should prevent us from exploring the implementation of this feature.

Discussion #3: Should we allow map.field in guards?

No. This feature would break the consistency of the language and make some construct mean different things inside and outside guards.

It’s true, today there’s difference in how code fails inside and outside guards (e.g. hd([]) might fail with an exception or just fail the current guard clause). When the expression fails or succeeds is always the same, though - this would break it and break it completely silently.

As a counter example, I could easily imagine somebody trying to write code like that:

def update_or_rollback(repo, changeset) do
  case repo.update(changeset) do
    {:error, reason} when repo.in_transaction?() -> repo.rollback(reason)
    other -> other
  end
end

Which would just silently fail. And it’s not a completely artificial example - in one of the projects I have a very similar function:

def update_or_rollback(repo, changeset) do
  case repo.update(changeset) do
    {:error, reason} = error ->
      if repo.in_transaction?(), do: repo.rollback(reason), else: error
    other -> other
  end
end

josevalim · June 22, 2018, 6:00pm

Shouldn’t we just compile it internally to a map_get then?

OvermindDL1 · June 22, 2018, 6:09pm

I would say no, however based on a change of mind later in this post, I say yes (with the erlang names).

This would cover 80% of the use-cases that I want proper matcher guards for, so I’m all for it!

Yep that’s the rest sadly…

By ‘patterns’ you mean ‘map patterns’, I’d say yes. It would, however, be awesome to expand that to some tuple and list things too, it would ease writing the guards substantially!

I would say yes as it is more uniform and expected, but I’ll go with No overall as it is better to encourage matchers (still wish defguard supported injecting matchers, it would a bit of addition into the elixir compiler to do so, not ‘that’ big ^.^).

Another point:

These performance losses listed could be removed if an optimization pass in Elixir was able to transform guards into the matcher equivalents where possible and it makes sense to do so.

Ahh true, until matchers have a way to match inside map keys to other static arguments then this is indeed useful…

It is fairly inconsistent yeah, but that’s because they use erlang names, and while the erlang guard names are still in use then that pattern should still be followed.

That would be awesome, such an optimization pass that can operate over a matcher/guard pair and optimize it between them both ways, moving things to matchers that are better there and moving things to guards to add functionality there, would be quite nice, you could then implement a guard like get_binary_bytes or so that just compiles to the matcher equivalent as well as an example.

jeremyjh · June 22, 2018, 6:18pm

I think presently Kernel includes all the Erlang guards with their original Erlang names, right? I see some benefit in maintaining that symmetry, and I agree with Michael’s reasoning otherwise so for me:

1: Yes - Erlang names
2: Yes
3: No

dimitarvp · June 22, 2018, 6:36pm

Discussion #1: Should we add map_get/2 and is_map_key/2 to Kernel?

No, it breaks the “only one way to do thing X” rule that won so much fans of Elixir and, partially, Go. That is still one of my top 5 favourite things about Elixir – the very minimal and expressive stdlib.

After reading some more though, I can see the value in adding them with the Erlang names. I suppose we will never get Map.fetch! and Map.has_key? allowed in guards? They can be replaced with the Erlang function calls when used in guards and thus we avoid the duplication Jose mentioned? Sorry if this is a naive suggestion, I am not aware of the realities of writing a compiler.

Not a strictly a fact, more like an opinion and a call for discussion if you are willing to entertain it: this is scratching and battering the gate between static and dynamic typing and since the BEAM has no ambitions to be statically typed, I’d say let’s skip it for now. I’m with @josevalim and @OvermindDL1 here: let’s expand and improve defguard instead.

Discussion #2: Should we allow patterns in defguard/1?

Yes. As @michalmuskala said, this might pose some performance problems but it seems they will be addressed in the future SoonEnough™. Famous last words I know, but I feel the feature is worth it. defguard is an amazing readability and productivity improvement mechanism and it should be developed further. The huge benefit is the smaller cognitive load: people need to limit parameter types or values, people know they have to turn to defguard or inline pattern matching for help.

Discussion #3: Should we allow map.field in guards?

Absolutely not. Many people come to Elixir from Ruby and Javascript, and they just expect certain syntax constructs to just work everywhere; they will be very disappointed when such a mechanism has only partial application and will probably leave. I am not saying we should strive for universal adoption or maximum users – if you follow my replies lately you’ll see I am actually against big growth – but less WTFs per minute and more explicitness should remain among the very top priorities of Elixir.

In my eyes, Elixir (and parts of modern JS) brought functional programming to the masses and the community should do its very best to keep it minimal yet very capable and thus average-programmer-friendly.

Maybe this third discussion should be resurrected in the future, if/when evolutionary changes in Erlang or Elixir make it relevant and feasible again, and without the compromises several people already outlined.

josevalim · June 22, 2018, 6:41pm

No, we changed the name of a couple of them. elem and put_elem for example. We can’t adopt the Erlang names for map because our semantics for get mean returning nil if a key is not found. The “correct” name would be map_fetch!/2. We also need to swap the argument order.

I actually meant maps, lists and tuples patterns. Good catch!

This is a very good point that we have discussed internally but I did not include in the proposal, so thank you for bringing this up. I don’t think we should allow those in guards because remotes are generally disallowed. So this would also be seen as a special case too.

dimitarvp · June 22, 2018, 6:45pm

Yes, but am I understanding correctly that it would be a special case only during compilation? If so, maybe this special case could reuse an already existing code in the compiler that aliases or inlines certain function calls to something else?

Again, sorry if this is naive or diminishes the required energy to achieve the idea. Easy for me to say “just do this or that” when I am not the one coding it.

josevalim · June 22, 2018, 6:46pm

It is a special case semantically speaking. For users of the language. We would be teaching people that they can do Map.fetch!(...) in a guard but in fact it is only possible for this particular function. When they try something like Keyword.fetch!(...) it won’t work and they will be puzzled.

Implementation wise it is very trivial.

dimitarvp · June 22, 2018, 6:59pm

Oh, I see now – such inconsistency would be puzzling indeed.

Maybe consider adding map_fetch! and is_map_key? not as real functions, but only as macros / special functions that are only usable in guards (and are internally replaced with Map.fetch! and Map.has_key? during compilation)? Is this possible? Here I might just be talking crap.

Alternatively, maybe a new function tag should be added to Elixir’s stdlib; something like @usable_guard true? And have a dedicated documentation page that is generated by code that gathers all tagged functions so we have a dedicated place dealing exclusively with functions allowed to be used in guards?

IMO at certain point this documentation should be made much more explicit. Instead of having the allowed guard functions listed in a page in the official website, I think the website’s page should point to the dedicated documentation page.

Final thought: maybe we should also have a Kernel.guard_functions/0 function returning a list of {Module, :function, arity} tuples. It should return inlined code-generated list and not calculate it at every call, of course.

bitwalker · June 22, 2018, 7:04pm

For #1, I agree in general with the arguments made against including these in Kernel.

For #2, I definitely like this, and I like the suggestion of compiling %{key => value} patterns to the corresponding guard functions to eliminate the need for using the guards by hand. Seems to be the biggest win here with the new guards for sure.

For #3, I think it looks clean, but not that much cleaner, and not a style I would likely use myself; so I’m leaning pretty heavily towards no. I just don’t think it adds much, and in the example provided, actually loses something: the match against the struct type - add that back and it’s essentially the same level of verbosity as the others.

mischov · June 22, 2018, 7:10pm

#1: Unsure, there have been good points presented on both sides.

#2: Yes, for reasons already mentioned.

#3: No, for reasons already mentioned.

michalmuskala · June 22, 2018, 7:33pm

There are even more changes - especially in operators, where almost all have different names. The most notable examples being andalso is and and orelse is or, while and and or don’t exist in Elixir at all. I don’t think “keeping close to erlang” should be a factor here.

axelson · June 22, 2018, 7:42pm

Discussion #2: Should we allow patterns in defguard/1?

I guess I’ll voice a dissenting opinion. If we can’t support all of pattern matching (with the existing semantics) in defguard/1 then I think we shouldn’t allow it at all.

yurko · June 22, 2018, 7:46pm

My 2 cents - mostly agreed with original thoughts.

Discussion #1: no
Discussion #2: yes
Discussion #3: no

Discussion: Incorporating Erlang/OTP 21 map guards in Elixir

The map_get/2 and is_map_key/2 guards

Expand defguard

Support map.field in guards

Another point:

The `map_get/2` and `is_map_key/2` guards

Support `map.field` in guards