kerryb
November 25, 2025, 3:26pm
1
Is there a known input length limitation with the Regex module (or, I assume, with :re under the covers)?
I just spent a while debugging some code that generally worked, but was failing to match for certain inputs which seemed valid. Eventually I narrowed it down to the sheer size of the strings where it failed.
Here’s a simplified example. This works:
text = "START\n" <> Enum.map_join(1..100_000, "\n", &"Line #{&1}") <> "\nEND\n"
Regex.run(~r/START(.*?)END/ms, text)
# => ["START\nLine 1\nLine 2\nLine 3\nLine 4\nLine 5\n\n" <> ...]
But if I increase the number of lines by another order of magnitude, the same pattern stops matching:
text = "START\n" <> Enum.map_join(1..1_000_000, "\n", &"Line #{&1}") <> "\nEND\n"
Regex.run(~r/START(.*?)END/ms, text)
# => nil
No doubt this is a terrible regex, and I can imagine it having performance implications, but I was a bit surprised that it silently failed, indicating no matches, rather than raising an exception.
hauleth
November 25, 2025, 4:02pm
2
Assuming OTP 28, there are flags for re:run/3. The ones that are interesting for you are:
:report_errors
:match_limit, quoting docs:
The default value 10,000,000 is compiled into the Erlang VM.
:match_limit_recursion
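To see those flags in action, here is a minimal sketch that calls :re.run/3 directly with :report_errors, so that hitting a limit surfaces as an error tuple instead of looking like a plain non-match. The module and function names (RegexLimitSketch, run/3) are made up for illustration, and the option list is only my assumption of how the `ms` modifiers from the original pattern map onto :re options. Treat it as a starting point, not a verified fix.

```elixir
defmodule RegexLimitSketch do
  # Hypothetical helper, not part of Elixir or OTP.
  # Runs `pattern` against `subject` via :re.run/3 with :report_errors enabled,
  # so runtime errors (such as an exceeded match limit) are returned explicitly.
  def run(subject, pattern, extra_opts \\ []) do
    opts =
      [:dotall, :multiline, :report_errors, {:capture, :all, :binary}] ++ extra_opts

    case :re.run(subject, pattern, opts) do
      {:match, captures} -> {:ok, captures}
      :nomatch -> :nomatch
      # With :report_errors, limit and compile problems arrive here.
      {:error, reason} -> {:error, reason}
    end
  end
end

# Small input: behaves like an ordinary match.
RegexLimitSketch.run("START\na\nEND\n", "START(.*?)END")
```

According to the re docs, a match that exhausts its budget should then come back as {:error, :match_limit} (or {:error, :match_limit_recursion}) rather than :nomatch. The third argument of this helper is just a pass-through, so something like {:match_limit, n} can be tried there.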
Also, in this particular case using a regex is IMHO pointless, as binary pattern matching would be faster and cleaner.
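# The input must begin with the START line; everything after it is scanned.
def extract("START\n" <> input), do: do_extract(input, <<>>)
def extract(_other), do: {:error, :no_start}
# Stop at the first END line; anything following it is ignored.
defp do_extract("\nEND\n" <> _rest, data), do: {:ok, data}
defp do_extract("", _acc), do: {:error, :no_end}
# Otherwise move one byte from the input to the accumulator.
defp do_extract(<<c>> <> rest, acc), do: do_extract(rest, acc <> <<c>>)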
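A different way to get the same result without walking the input byte by byte is to locate the two markers with :binary.match and slice out the part in between. This is a sketch of a swapped-in technique, not the code above: the module name MarkerExtract and the default marker arguments are hypothetical, and unlike the version above it accepts START anywhere in the input and keeps the newlines around the body, as the original regex capture does.

```elixir
defmodule MarkerExtract do
  # Hypothetical module for illustration only.
  # Returns the bytes between the first `start_marker` and the first
  # `end_marker` that follows it.
  def extract(text, start_marker \\ "START", end_marker \\ "END") do
    with {:start, {s_pos, s_len}} <- {:start, :binary.match(text, start_marker)},
         body_from = s_pos + s_len,
         # Only search for the end marker after the start marker.
         scope = {body_from, byte_size(text) - body_from},
         {:end, {e_pos, _len}} <- {:end, :binary.match(text, end_marker, scope: scope)} do
      {:ok, binary_part(text, body_from, e_pos - body_from)}
    else
      {:start, :nomatch} -> {:error, :no_start}
      {:end, :nomatch} -> {:error, :no_end}
    end
  end
end

MarkerExtract.extract("START\nLine 1\nEND\n")
```

Since this only does two substring searches and one slice, the regex match limit never comes into play, whatever the input size.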
kerryb
November 25, 2025, 4:16pm
3
> Assuming OTP 28 there are flags for re:run/3. Flags that are interesting for you are:
> :report_errors
> :match_limit, quoting docs:
> The default value 10,000,000 is compiled into the Erlang VM.
Thanks!
> Also, in this particular case using regex is IMHO pointless as using binary pattern matching would be faster and cleaner.
Yeah, it was a simplified example just to demonstrate the issue I was having. But thanks for the pattern matching example anyway.