String validation with Regex and ExUnit

Hey folks,
I’ve written a regular expression for validating input strings. For testing purposes, I want to set up a scenario where a set of test vectors is automatically injected as input into a test and matched against the regex. I’m currently testing each vector manually with the code below. I was also thinking of reading the strings from a file instead of hard-coding them. Is there a nice way to do this? Any feedback is much appreciated.

    setup_all do
      sample_strings = %{
        sample_str1: "test_string",
        sample_str2: "cz-dev",
        sample_str3: "new___cczdj",
        sample_str4: "czar,,,nic",
      }
    end

    defp input_length_valid(inp) do
      case String.length(inp) > 100 do
        true -> false
        false -> true
      end
    end

    test "Regular expression for string - test #1", sample_strings do
      assert sample_strings.sample_str3
             |> String.replace("_", "-")
             |> String.match?(~r/^(my-regex)*$/)
      assert sample_strings.sample_str3 |> input_length_valid() == true
    end

    test "Regular expression for string - test #2", sample_strings do
      assert sample_strings.sample_str2
             |> String.replace("_", "-")
             |> String.match?(~r/^(my-regex)*$/)
      assert sample_strings.sample_str2 |> input_length_valid() == true
    end

If your sample strings are constants, you could put them in a module attribute:

@sample_strings %{
  sample_str1: "test_string",
  sample_str2: "cz-dev",
  sample_str3: "new___cczdj",
  sample_str4: "czar,,,nic",
  ...
}
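
If you would rather keep the strings in a file, as you mentioned, one option is to read the file into that same attribute. This is only a sketch: the module name SampleStrings and the one-sample-per-line format are assumptions of mine, not something from your code.

```elixir
defmodule SampleStrings do
  # Hypothetical helper: reads one sample string per line.
  # Empty lines are dropped; both "\n" and "\r\n" line endings are accepted.
  def load(path) do
    path
    |> File.read!()
    |> String.split(["\r\n", "\n"], trim: true)
  end
end
```

In the test module you could then write something like @sample_strings SampleStrings.load(path), with the helper compiled before the test module and the path pointing wherever you keep the file. Note that this gives you a list rather than a map, so there are no labels to destructure when iterating.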

General comment on input_length_valid: there are only four functions that take a single boolean value and return a single boolean value (constant true, constant false, identity, and invert) - whenever you find yourself writing a case with booleans on both sides of the ->s, figure out which one of the four you have and remove the case.

In this case you have “invert” - so input_length_valid simplifies to not(String.length(inp) > 100) or even just String.length(inp) <= 100
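
Side by side, as a small sketch (the module name LengthCheck and the _verbose suffix are just for illustration, and the functions are public so the snippet runs on its own):

```elixir
defmodule LengthCheck do
  # Original shape: a case that maps true -> false and false -> true ("invert").
  def input_length_valid_verbose(inp) do
    case String.length(inp) > 100 do
      true -> false
      false -> true
    end
  end

  # Same behavior with the inverted comparison written directly.
  def input_length_valid(inp), do: String.length(inp) <= 100
end
```

Both versions accept a string of exactly 100 characters and reject one of 101.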


As to the test itself, unless there are a LOT of cases in @sample_strings you may find a single test easier to write. For instance, this is a very simple approach:

defp regex_matches?(sample) do
  sample
  |> String.replace("_", "-")
  |> String.match?(~r/^(my-regex)$/)
end

test "sample strings match" do
  Enum.each(@sample_strings, fn {_label, sample} ->
    assert regex_matches?(sample)
    assert input_length_valid(sample)
  end)
end

One downside of this approach is that it will fail the test on the first string that doesn’t match; if you want to see all the failures in a single run you could accumulate them explicitly:

test "sample strings match" do
  failed_matches =
    Enum.reject(@sample_strings, fn {_label, sample} ->
      regex_matches?(sample)
    end)

  assert [] == failed_matches

  # similar for input_length_valid
end

The main thing I like to keep in mind when structuring asserts like these: what do I want to learn when one fails?

Hey Matt,
Thanks a lot for the feedback.
My goal is to provide a test bed for trying out regular expressions. I’m currently displaying the strings that fail either test with IO.inspect.

On another note, I was thinking of adding the length check to the pipeline in regex_matches?/1. Since the check returns a boolean rather than the string, I would have to short-circuit the pipeline, so I don’t think a plain pipe is a good idea. Would you recommend a with statement?

 with   true   <- input_length_valid(sample),   
    (the crux) <- String.replace(sample, "_", "-"),  
        true   <- String.match?(~r/^(my-regex)$/)  do

Thank you very much for your guidance.

To be frank, I don’t understand the intent of either input_length_valid or the String.replace here since the inputs are literals in the file - so it’s hard to make specific recommendations.

You could use an = clause for the String.replace if you wanted to use with.
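
Roughly like this, as a sketch only: the module name Validator and the pattern ~r/^[a-z-]+$/ are placeholders I made up, since I don't know your real regex.

```elixir
defmodule Validator do
  # Hypothetical placeholder pattern; substitute the real regex.
  defp pattern, do: ~r/^[a-z-]+$/

  def valid?(sample) do
    with true <- String.length(sample) <= 100,
         # A bare match always succeeds, so "=" is enough here; no "<-" needed.
         replaced = String.replace(sample, "_", "-"),
         true <- String.match?(replaced, pattern()) do
      true
    else
      # Either boolean check failing lands here.
      false -> false
    end
  end
end
```

With the placeholder pattern, Validator.valid?("new___cczdj") passes and Validator.valid?("czar,,,nic") does not.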

In my particular use case, the input strings should have less than 100 characters. I think you are correct that the replace statement is a preprocessing step and should not be part of the test. Thank you for putting that into perspective.

As an alternative to @al2o3cr’s excellent answer, you can just code-generate test cases for each value so you have one failing test for each value that doesn’t match well:

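defmodule YourApp.YourModuleTest do
  use ExUnit.Case

  @values ["test_string", "cz-dev", "new___cczdj", "czar,,,nic"]

  # String.match?/2 expects a compiled regex, not a plain string.
  defp regex, do: ~r/your_regex_here/

  defp input_length_valid(inp), do: String.length(inp) <= 100

  describe "value" do
    # The comprehension runs at compile time and defines one test per value.
    for value <- @values do
      test "#{value} matches" do
        value = unquote(value)
        value = String.replace(value, "_", "-")
        assert String.match?(value, regex())
        assert input_length_valid(value)
      end
    end
  end
end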

Or you can draw out the heavy guns and use property-based testing (with some limitations of the input values so they are not infinite).

Thanks a lot @dimitarvp. :v: