@hubertlepicki String.normalize with :nfd decomposes each accented character into a base character followed by one or more combining marks, so that the sequence still represents the original character. A simple example:
iex(11)> "á" |> String.codepoints
["á"]
iex(12)> "á" |> String.normalize(:nfd) |> String.codepoints
["a", "́"]
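The combining accent is hard to see in terminal output, so the same check can be sketched with integer code points instead. The values in the comments are what Unicode NFD specifies for U+00E1, not output copied from my machine, and the variable names are only for illustration:

```elixir
# U+00E1 is the precomposed "á"; the escape makes the input form unambiguous.
precomposed = "\u00E1"
decomposed = String.normalize(precomposed, :nfd)

# charlists: :as_lists forces integers to be printed rather than characters.
IO.inspect(String.to_charlist(precomposed), charlists: :as_lists) # [225]
IO.inspect(String.to_charlist(decomposed), charlists: :as_lists)  # [97, 769]
```

769 is U+0301 (COMBINING ACUTE ACCENT), the second element shown in the output above.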
However, for some reason it doesn’t work when the accented character is not the first one in the string:
iex(7)> "aá" |> String.normalize(:nfd) |> String.codepoints
["a", "á"]
@KronicDeth Here’s my output:
iex(15)> "árboles más grandes" |> String.normalize(:nfd)
"árboles más grandes"
iex(16)> "árboles más grandes" |> String.normalize(:nfd) |> String.replace(~r/[^A-z\s]/u, "")
"arboles ms grandes"
iex(17)> "árboles más grandes" |> String.normalize(:nfd) |> String.replace(~r/[^A-z\s]/u, "") |> String.replace(~r/\s/, "-")
"arboles-ms-grandes"
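For comparison, the result I'm after can be sketched with a pattern that removes only the combining marks (\p{Mn}) left behind by decomposition. SlugSketch and slugify are hypothetical names, and the sketch assumes String.normalize(:nfd) decomposes every accented character, which is exactly what is not happening in my output above:

```elixir
defmodule SlugSketch do
  # Hypothetical helper for illustration; not part of Elixir or any library.
  def slugify(text) do
    text
    # Split accented characters into base character + combining marks.
    |> String.normalize(:nfd)
    # Drop only the combining marks (Unicode category Mn), keeping base letters.
    |> String.replace(~r/\p{Mn}/u, "")
    # Collapse each run of whitespace into a single hyphen.
    |> String.replace(~r/\s+/u, "-")
  end
end

IO.puts(SlugSketch.slugify("árboles más grandes"))
```

With correct decomposition this should print arboles-mas-grandes, because the "a" survives and only the accent is removed.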
My machine is running Arch Linux. This is the output of running locale in the terminal:
LANG=en_US.UTF-8
LC_CTYPE="en_US.UTF-8"
LC_NUMERIC="en_US.UTF-8"
LC_TIME="en_US.UTF-8"
LC_COLLATE="en_US.UTF-8"
LC_MONETARY="en_US.UTF-8"
LC_MESSAGES="en_US.UTF-8"
LC_PAPER="en_US.UTF-8"
LC_NAME="en_US.UTF-8"
LC_ADDRESS="en_US.UTF-8"
LC_TELEPHONE="en_US.UTF-8"
LC_MEASUREMENT="en_US.UTF-8"
LC_IDENTIFICATION="en_US.UTF-8"
LC_ALL=
I wonder what the problem could be…