Regex weirdness -- Aargh!

I have a Regex that works correctly in tests but not when my run my app. No idea what is going on. Here is the regex expression: Regex.scan(~r/^\[([a-z].*)\].--.(.*).--/msU, text) .
Here is the string I apply it to:

[quote]
--
foo, bar
baz
--

fdfd

The correct scan result is

# [quote]
# --
# foo, bar
# baz
# --
# fdfd

["[quote]\n--\nfoo, bar\nbaz\n--",
"quote",
 "foo, bar\nbaz"]

This is what I get using the regex tester at elixre.uk and also when I run tests with mix test ... However, when I use the above regex expression in my app, the result is an empty list – the regex recognizes nothing. What to do?

2 Likes

Your regular expression fails for me if I put windows line endings in the string ("\r\n") instead of Linux ones ("\n"). The option

This is, because . does match only \r then, after that the regex demands to get a - but there is \n, so it fails.

So you have 3 options:

  1. Demand the client to deliver only Linux-Endings, but depending on the tech knowledge of the client that may be impossible
  2. Normalize the input before feeding to the RE-scan, which should be the easiest one
  3. replace the dots that shall match line endings by (?:\r|\n|\r\n), but the output needs to be cleaned up as well.
5 Likes

Thank you! I choose your suggested option (2). Simple and works like a charm.

3 Likes