Is it possible to write a custom guard to check for a blank string?

Sometimes I want to check if the input into a function is not a blank string.

My first approach:

defmodule Example do

  def do_stuff(string1, string2)
    when is_binary(string1)
    and byte_size(string1) > 0
    and is_binary(string2)
    and byte_size(string2) > 0 do

    # Do some stuff with string1 and string2, because now we know they cannot be:
    #  * ""
    #  * " "
    #  * "       "
  end
end

The problem here is that when a string contains only spaces (" ", "   ", etc.), the guard clauses above will not detect it and I reach the body of the function. What I want is to detect the blank string in the guard clause.

So I tried to make a custom guard:

defmodule BlankGuard do

  defmacro is_not_blank?(string) do

    callback = fn
                <<" " :: binary, rest :: binary>>, func -> func.(rest, func)
                _string = "", _func -> true
                _string, _func -> false
              end

    is_blank = fn string, cb ->
                cb.(string, cb)
               end

    quote do
      unquote(string) |> is_binary()
      and unquote(string) |> byte_size > 0
      and unquote(string) |> is_blank.(callback)
    end
  end

end

And tried to use it like this:

defmodule ExampleGuard do

  import BlankGuard

  def do_stuff(string1, string2) when is_not_blank?(string1) and is_not_blank?(string2) do

    # Do some stuff with string1 and string2, because now we know they cannot be:
    #  * ""
    #  * " "
    #  * "       "
  end
end

And I get the error:

== Compilation error in file lib/play.ex ==
** (CompileError) lib/play.ex:81: invalid expression in guard, anonymous call is not allowed in guards. To learn more about guards, visit: https://hexdocs.pm/elixir/guards.html

Following the link in the error it seems that custom guards can only invoke other guards.
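For reference, here is a minimal sketch of what a custom guard can do, assuming Elixir 1.6 or later for `defguard`. The module and guard names (`NonEmptyGuard`, `is_non_empty_binary`) are made up for illustration. It only combines guard-safe functions, so it rejects "" but still lets " " through:

```elixir
defmodule NonEmptyGuard do
  # Hypothetical module and guard names, for illustration only.
  # is_binary/1 and byte_size/1 are both guard-safe, so they can be combined.
  defguard is_non_empty_binary(term) when is_binary(term) and byte_size(term) > 0
end

defmodule NonEmptyGuardExample do
  import NonEmptyGuard

  # Matches any binary with at least one byte, including whitespace-only ones.
  def do_stuff(string) when is_non_empty_binary(string), do: {:ok, string}
  def do_stuff(_other), do: {:error, :empty_or_not_a_string}
end
```

Calling `NonEmptyGuardExample.do_stuff(" ")` still reaches the first clause, which is exactly the limitation described above.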

So my question is whether I am missing something, or whether this is really not possible to implement in a guard clause.

I know I can implement this in each module where I want to check for a blank string, but that looks like unnecessary code repetition:

# @link https://rocket-science.ru/hacking/2017/03/28/is-empty-guard-for-binaries
def empty?(<<" " :: binary, rest :: binary>>), do: empty?(rest)
def empty?(string = ""), do: true
def empty?(_string), do: false

Guard functions are limited to this list. What they have in common is that they execute in constant time. Since binaries (strings), as well as some other structures, are variable in size, there are no guard functions that operate on the structure as a whole, except for length/1, which seems to be a bit of an exception to the constant-time expectation.

Therefore I don’t think you’ll find a way to perform the checks you want as a guard.

Typically I would normalise input at an outer boundary layer, for example by calling String.trim/1 in a changeset that processes user input. My guard would then only have to check for "", which is possible.
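A rough sketch of that idea, with hypothetical module and function names (`BoundaryExample`, `handle_input/1`, `greet/1`) that are not from any real project: the input is trimmed once at the boundary, so the inner function only needs a guard against "".

```elixir
defmodule BoundaryExample do
  # Hypothetical names, for illustration only.

  # Outer boundary: normalise the raw input once.
  def handle_input(raw) when is_binary(raw) do
    raw |> String.trim() |> greet()
  end

  # Inner function: the input is already trimmed, so a blank string can only be "".
  def greet(name) when is_binary(name) and name != "", do: {:ok, "Hello, " <> name}
  def greet(_name), do: {:error, :blank}
end
```

This only holds as long as every caller goes through the boundary function; calling `greet/1` directly with " " would still pass the guard.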

Lastly, I would just note that the nature of empty? is open to a lot of interpretation and variability in the Unicode world given the wide range of code points that can be interpreted as whitespace.

I was checking in this moment your unicode library. Nice work :slight_smile:

Well this is really unfortunate and annoying, because sometimes this is not the case:

Sometimes I just have a function that can be called internally from other modules and I want to be sure that I don’t get the blank string " " .

It seems that I don’t have any alternative other than to keep checking the input in the body of the function :frowning:

defmodule UtilsFor.Hash.Sha256 do

  alias UtilsFor.Text.Empty

  def salted_base64_encoded(content, salt)
    when is_binary(content)
    and byte_size(content) > 0
    and is_binary(salt)
    and byte_size(salt) > 64 do

    case Empty.check?(content) || Empty.check?(salt) do
      true ->
        raise "Content and/or salt cannot be a blank string."
      false ->
        content_salted = content <> salt

        :crypto.hash(:sha256, content_salted)
        |> Base.encode64()
    end
  end
end

For my purposes I would be happy if I could get a guard clause to work for the common white space :wink:

Thanks for the comment! Unicode simple regexes are coming up soon, and then the full Unicode transform spec. Not quite so soon :slight_smile:

Even given the situation you describe I would tend to still enforce a boundary condition separately from processing. For example:

defmodule UtilsFor.Hash.Sha256 do
  alias UtilsFor.Text.Empty

  def salted_base64_encode(content, salt) when is_binary(content) and is_binary(salt) do
    do_salted_base64_encode(String.trim(content), String.trim(salt))
  end
  
  def do_salted_base64_encode("", _), do: {:error, :invalid_content}
  def do_salted_base64_encode(_, ""), do: {:error, :invalid_content}
  
  def do_salted_base64_encode(content, salt) do
    :sha256
    |> :crypto.hash(content <> salt)
    |> Base.encode64()
    |> wrap(:ok)
  end
  
  # Wraps the result in a tagged tuple, e.g. {:ok, hash}.
  def wrap(term, atom) do
    {atom, term}
  end
end

Just my 2.354c worth :slight_smile:

This is exactly the type of solution that I am tired of using all over the place each time I need to check for a blank string.

It’s OK when you need to do it once, not when you need to keep repeating it across projects.

As you may have noticed, the module is named UtilsFor because it is a package that I use across my projects. So I would like to have a blank-string check that I can use from a guard clause, not from the body of a function or via multiple function heads.

The salted_base64_encode function is just one example where I need to check for a blank string, and it happens to also be inside the UtilsFor package. I just gave it to help illustrate why it’s needed.

Anyway thanks for your insights :slight_smile:

I hear you, boilerplate gets to be a pain, no disagreement there. In most cases I’ve converged on using structs a lot more: a new function does the validation, and afterwards I trust that the struct has data of the right form and never check it again. It is another approach to creating a boundary between data validation/casting and operations on data.

This has dramatically reduced the amount of boilerplate - but not necessarily for the kind of function you describe above. Over and out :slight_smile:
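As a minimal sketch of the struct approach, assuming a hypothetical `NonBlankString` struct and `Greeter` module (neither comes from a real library): validation happens once in `new/1`, and later functions just pattern match on the struct in the function head.

```elixir
defmodule NonBlankString do
  # Hypothetical struct, for illustration only.
  @enforce_keys [:value]
  defstruct [:value]

  # Validate once, at construction time.
  def new(raw) when is_binary(raw) do
    case String.trim(raw) do
      "" -> {:error, :blank}
      trimmed -> {:ok, %__MODULE__{value: trimmed}}
    end
  end

  def new(_other), do: {:error, :not_a_string}
end

defmodule Greeter do
  # Matching on the struct stands in for a "not blank" guard,
  # provided the struct is only ever built through NonBlankString.new/1.
  def greet(%NonBlankString{value: value}), do: "Hello, " <> value
end
```

The trade-off is that the guarantee rests on convention: nothing stops code from building `%NonBlankString{value: " "}` by hand.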

Your approach doesn’t do exactly what I want, i.e. a blank string "lots of spaces here" will still go through, thus jeopardizing the hash, which could end up being generated from blank content and salt.

Assuming I make do_salted_base64_encode/2 private (which I forgot to do), the String.trim/1 calls on the 5th line will strip all leading and trailing whitespace resulting in "" for a string that is only whitespace before the call to the private function and therefore I think it does what you are after.

I did miss the call to String.trim/1, and I also use it in my UtilsFor package for the same effect. I think I should take a break, because I spent the entire day trying to figure out this issue, and I am starting to not see things clearly :wink:

Thanks for all your insights.

# @link https://rocket-science.ru/hacking/2017/03/28/is-empty-guard-for-binaries
def empty?(<<" " :: binary, rest :: binary>>), do: empty?(rest)
def empty?(string = ""), do: true
def empty?(_string), do: false

This can simply be injected with a macro module via something like use EmptyStringGuard.
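A rough sketch of what that could look like, assuming `EmptyStringGuard` is a module you write yourself (the `UsesEmptyCheck` module is likewise hypothetical). Note that, despite the name, the injected `empty?/1` is an ordinary function and still cannot be called from a guard:

```elixir
defmodule EmptyStringGuard do
  # Sketch: `use EmptyStringGuard` defines empty?/1 in the calling module.
  defmacro __using__(_opts) do
    quote do
      # Strip one leading space at a time, then check what is left.
      def empty?(<<" ", rest::binary>>), do: empty?(rest)
      def empty?(""), do: true
      def empty?(_string), do: false
    end
  end
end

defmodule UsesEmptyCheck do
  use EmptyStringGuard

  # empty?/1 is called from the function body, not from the guard.
  def describe(string) when is_binary(string) do
    if empty?(string), do: :blank, else: :present
  end
end
```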

To use in the guard itself?

def my_func(string) when not empty?(string) do
  # do stuff
end

Long story short, you can’t check for blank strings in a guard, as you would have to iterate over the string to do so, and that’s not possible.

No, I meant those three lines of code. :slightly_smiling_face:

Sorry, but I am not seeing what the advantage is, or I am not getting what you are suggesting.

If what you suggest is to inject the three empty?/1 clauses via a macro and then use them in the body of the function, then that’s just a different way of achieving what I already do, isn’t it?

I’m wondering why you inject or copy paste at all… Why not just do a remote call?

Remote call from a guard?

I am assuming you mean this:

def my_func(string) when not UtilsModule.empty?(string) do
  # do stuff
end

Remote calls to arbitrary functions are not possible from a guard, and your function can’t be called from a guard anyway. That’s why I am confused.

Yes, I know remote calls are not possible from a guard clause, and that is why I wrote this post to ask how I could write a guard to check for a blank string. In the first post I have the example of the macro I tried to build.

The other example in a subsequent post shows how I am checking for a blank string in the body of a function, and it’s there just to illustrate how I have to repeat this code every time I need to check for the blank string.

In my opinion, since the language doesn’t support this check in the core, it would be nice if it at least let me build my own guard. But that seems to be mission impossible unless the core of Elixir one day supports it or allows developers to build their own true guard clauses. Right now the name "custom guard" seems misleading to me, because it only allows us to group together other guard clauses.

Not being able to write this guard goes much deeper than Elixir. It’s a limitation of the BEAM, which only allows a handful of whitelisted functions in guards.

And I still do not understand why you have to repeat your function everywhere. You have to define it once and call it everywhere…

I am not defining it everywhere I want to use it, I just don’t want to call it from the body everywhere…

See example usage from above:

As you can see I am just doing the remote call as you suggested, and this is precisely what I was trying to avoid by having a guard to check for the blank string.