fchabouis
String compare depends on upper/lower case?
I am wondering why I get this behavior, which seems counterintuitive to me:
iex(1)> "x" > "a"
true
iex(2)> "X" > "a"
false
When I sort a list of strings, I don’t expect this result:
iex(1)> Enum.sort(["b", "a", "X"])
["X", "a", "b"]
I couldn’t find where this behavior is explained in the documentation. If somebody knows, I’m interested.
Thanks!
Marked As Solved
LostKobrakai
Bitstrings are compared byte by byte; incomplete bytes are compared bit by bit.
Uppercase letters have smaller byte values than lowercase ones (in ASCII, "X" is 88 and "a" is 97), so "X" sorts before "a".
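To make the byte-by-byte comparison concrete, here is a small sketch that can be pasted into iex. The variable names are only illustrative, and `charlists: :as_lists` is passed so the byte values print as integers rather than as a charlist.

```elixir
# Raw bytes of the strings from the question.
upper_bytes = :binary.bin_to_list("X")
lower_bytes = :binary.bin_to_list("a")

IO.inspect(upper_bytes, charlists: :as_lists) # [88]
IO.inspect(lower_bytes, charlists: :as_lists) # [97]

# Comparing the strings gives the same answer as comparing their first bytes.
upper_first = "X" < "a"
IO.inspect(upper_first) # true

# When the first bytes are equal, the comparison moves on to the next byte.
second_byte_decides = "aX" < "aa"
IO.inspect(second_byte_decides) # true
```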
Also Liked
kip
Also be aware that if you are sorting non-ASCII strings, you should normalise the strings first. For example, String.downcase(string) |> String.normalize(:nfkd).
Lastly, collation rules are language and culture dependent, even for the same strings, so depending on what you’re trying to do this is a much more complex topic than it seems on the surface.
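As a rough sketch of that suggestion, the snippet below sorts by a downcased, NFKD-normalised key. The word list and the `sort_key` name are made up for illustration, and this is still a plain byte comparison of the keys, not locale-aware collation.

```elixir
# Hypothetical helper: builds the comparison key described above.
sort_key = fn string ->
  string |> String.downcase() |> String.normalize(:nfkd)
end

# "\u00E9" is a precomposed "é", written as an escape to avoid editor ambiguity.
words = ["Zebra", "\u00E9clair", "apple"]

# Plain sort compares raw bytes: uppercase first, then ASCII lowercase, then "é".
plain = Enum.sort(words)
IO.inspect(plain) # ["Zebra", "apple", "éclair"]

# With the key, "é" is decomposed to "e" plus a combining accent, so it sorts near "e".
by_key = Enum.sort_by(words, sort_key)
IO.inspect(by_key) # ["apple", "éclair", "Zebra"]
```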
sbuttgereit
Just off the cuff… maybe something like this.
iex(4)> Enum.sort(["b", "X", "a"], &(String.downcase(&1) <= String.downcase(&2)))
["a", "b", "X"]
There could be some string handling caveats that I’m not thinking about, but it’s the general idea.
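A possible variant of the same idea uses Enum.sort_by/2, which computes the lowercase key once per element instead of on every comparison. This is only a sketch; the second list is an illustrative input that is not from the thread.

```elixir
# Sort by the downcased key; the original strings are returned unchanged.
sorted = Enum.sort_by(["b", "X", "a"], &String.downcase/1)
IO.inspect(sorted) # ["a", "b", "X"]

# Enum.sort_by/2 is stable, so elements with equal keys keep their input order.
stable = Enum.sort_by(["b", "B", "a"], &String.downcase/1)
IO.inspect(stable) # ["a", "b", "B"]
```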
derpycoder
JavaScript has the same behavior as well:
["a", "b", "X"].sort(); // ["X", "a", "b"]
Internally the byte values (the ASCII codes, for these letters) are being compared, which you can check in iex:
iex(15)> ?X
88
iex(16)> ?a
97
iex(17)> ?x
120
So convert to lowercase before comparing if you want a case-insensitive order.
For instance, see how uppercase letters sort first:
iex(22)> ["derpycoder", "Derpycoder", "DerpyCoder"] |> Enum.sort()
["DerpyCoder", "Derpycoder", "derpycoder"]
See the answer by @sbuttgereit.