```elixir
defp has_char_in_string?(value), do: Regex.match?(~r/[^\d]/, value)

def somefun(arg) do
  case has_char_in_string?(arg) do
    true -> foo()
    false -> bar()
  end
end
```
And I really want to keep this regex within this module and not externalize this logic. But I also want to use pattern matching or guards instead of a `case`. Is this even possible? And if so, how?
It’s possible, especially if the regular expression is simple - you can translate the regex into its corresponding finite state machine and represent that with pattern matching:
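As a sketch of what that translation can look like (the module name is mine, and `foo/0` and `bar/0` are assumed helpers that just return atoms here so the example is self-contained):

```elixir
defmodule RegexAsStateMachine do
  def somefun(arg) when is_binary(arg), do: call_bar(arg)

  # State "call_bar": stay in this state while each byte is an ASCII digit.
  defp call_bar(<<ch, rest::binary>>) when ch in ?0..?9, do: call_bar(rest)
  # Ran out of input without seeing a non-digit: the regex did not match.
  defp call_bar(<<>>), do: bar()
  # Any other leading character matches [^\d], so we call foo/0.
  defp call_bar(_), do: foo()

  defp foo, do: :foo
  defp bar, do: :bar
end
```

Note this walks bytes, which is consistent with `~r/[^\d]/` without the `u` modifier; for full Unicode input you would match on `::utf8` code points instead.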
Here, the regex [^\d] translates to a state machine that stays in call_bar as long as each character is 0-9, actually calls bar given an empty string, and calls foo otherwise.
If the regex doesn’t involve backreferences or lookaheads (so it’s a theory-of-languages regular expression), it’s always possible to do this.
HOWEVER
The example above shows how this approach obfuscates what should have been straightforward code like this:
```elixir
defmodule RegexRecursionSimple do
  def somefun(arg) when is_binary(arg) do
    if Regex.match?(~r/[^\d]/, arg) do
      foo()
    else
      bar()
    end
  end

  defp foo(), do: IO.puts("foo")
  defp bar(), do: IO.puts("bar")
end
```
I’m very curious what’s motivating the preference for pattern matching here; it’s not the right tool for the job.
Well, I think there is no pretty solution here. Though Hauleth presented something that would work, I now see that the best way is through a well-crafted regex. Thank you all for the answers.
These all operate on Unicode character classes, so they cover what passes for a digit in a more complete sense. It might help or give you some ideas.
Note that it works on code points, since only a limited set of underlying functions can be used in guards.
There is another set of functions that might be helpful, including Cldr.Unicode.alphanumeric?/1, which returns a boolean and also uses the full Unicode definitions (not just Latin-1).
Is there a way to use guard clauses from Erlang? I just found this in the Erlang master class course from the University of Kent, and I'm trying to solve the problems in Elixir. It looks similar to the OP's problem:
```elixir
# Note: Elixir spells less-than-or-equal as <=, not Erlang's =<
def parse([ch | rest]) when ?a <= ch and ch <= ?z do
  {succeeds, remainder} = get_while(&is_alpha/1, rest)
  {{:var, List.to_atom([ch | succeeds])}, remainder}
end
```
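For completeness, here's a self-contained sketch of that clause; `get_while/2` and `is_alpha/1` aren't shown in the course snippet, so these are my own minimal stand-ins:

```elixir
defmodule Parser do
  # Parse a variable name: a lowercase letter followed by more letters.
  def parse([ch | rest]) when ?a <= ch and ch <= ?z do
    {succeeds, remainder} = get_while(&is_alpha/1, rest)
    {{:var, List.to_atom([ch | succeeds])}, remainder}
  end

  # Hypothetical helper: is this code point a lowercase ASCII letter?
  defp is_alpha(ch), do: ch in ?a..?z

  # Hypothetical helper: take leading elements while pred holds,
  # returning {taken, remainder}.
  defp get_while(pred, [ch | rest] = list) do
    if pred.(ch) do
      {taken, remainder} = get_while(pred, rest)
      {[ch | taken], remainder}
    else
      {[], list}
    end
  end

  defp get_while(_pred, []), do: {[], []}
end
```

For example, `Parser.parse(~c"abc=1")` returns `{{:var, :abc}, ~c"=1"}`.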
Thanks @peerreynders, I found a workaround that can probably also be used (I'm using double-quoted strings for that):
```elixir
defmodule Guards do
  defguard is_lower(ch) when ch in ~w(q w e r t y u i o p a s d f g h j k l z x c v b n m)
  defguard is_digit(ch) when ch in ~w(1 2 3 4 5 6 7 8 9 0)
end
```
That will check whether the value is one of those single-character *strings*, not whether a character (code point) is one of them:
```elixir
iex(1)> ~w(q w e r t y u i o p a s d f g h j k l z x c v b n m)
["q", "w", "e", "r", "t", "y", "u", "i", "o", "p", "a", "s", "d", "f", "g", "h",
 "j", "k", "l", "z", "x", "c", "v", "b", "n", "m"]
iex(2)> ?a in ~w(q w e r t y u i o p a s d f g h j k l z x c v b n m)
false
```
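One way around that (a sketch keeping the original guard names): match against code-point ranges, so the guards work on values like `?a` rather than single-character strings:

```elixir
defmodule CodepointGuards do
  # ?a..?z and ?0..?9 are the ASCII ranges of the code points themselves,
  # so these guards accept integers such as ?a, not strings such as "a".
  defguard is_lower(ch) when ch in ?a..?z
  defguard is_digit(ch) when ch in ?0..?9
end
```

After `import CodepointGuards`, `is_lower(?a)` is true, and the guards are usable in function heads.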
I know I'm probably more focused on i18n than many, but Elixir strings are Unicode strings. So if the use case is ASCII-only, I think the intent would be clearer if the guards were called is_ascii_lower/1 and is_ascii_digit/1.
Unicode has 2,151 lower case and 630 digit characters as of Unicode 12.1.
Not forgotten! Both forms have canonical equivalence, but they're not identical, as you say. That's why it's quite important to normalise to :nfc before checking casing, for consistent results (unless implementing a full casing algorithm that is normal-form independent). By the way, this is not only an issue for diacritics; it also applies to Hangul. And not all decompositions are canonically equivalent.
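To illustrate with the `String` module (a quick sketch: `"\u00E9"` is precomposed "é", while `"e"` plus U+0301 is its canonical decomposition):

```elixir
composed = "\u00E9"     # precomposed "é", one code point
decomposed = "e\u0301"  # "e" followed by a combining acute accent

# Canonically equivalent, but not byte-identical:
composed == decomposed                           # false
String.normalize(decomposed, :nfc) == composed   # true

# Casing without normalising first yields different binaries:
String.upcase(composed) == String.upcase(decomposed)  # false
```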
For example, half-width and full-width katakana characters have the same compatibility decomposition and are thus compatibility equivalents; however, they are not canonical equivalents. They also aren't cased, so at least that's not an issue here.