Numero - a micro library for converting non-English digits in Elixir

Hi everyone.

A few days ago I was developing a JSON API on Phoenix that receives a user's phone number from the client and starts sending messages to that number. There was a small problem in production, however.

The problem was that some people used Arabic or Persian keyboards to enter their phone number, so the number was not passed correctly to the message-sending API. So I’ve developed a micro library called Numero to tackle this problem in Elixir.

Here is Numero: https://hex.pm/packages/numero

Currently it supports Persian, Arabic and NKO digits.
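
To make the idea concrete, here is a minimal sketch of this kind of digit normalization. It is not Numero's actual code or API: the module name DigitSketch and the function normalize/1 are made up for illustration, and it only covers the Persian (U+06F0 to U+06F9) and Arabic-Indic (U+0660 to U+0669) ranges, leaving NKO out.

```elixir
defmodule DigitSketch do
  # Hypothetical illustration, not Numero's implementation.
  # Maps Persian and Arabic-Indic digits to ASCII digits and
  # leaves every other codepoint untouched.
  def normalize(string) when is_binary(string) do
    string
    |> String.to_charlist()
    |> Enum.map(&to_ascii/1)
    |> List.to_string()
  end

  # Persian digits: U+06F0..U+06F9
  defp to_ascii(cp) when cp in 0x06F0..0x06F9, do: cp - 0x06F0 + ?0
  # Arabic-Indic digits: U+0660..U+0669
  defp to_ascii(cp) when cp in 0x0660..0x0669, do: cp - 0x0660 + ?0
  defp to_ascii(cp), do: cp
end
```

With this sketch, a phone number typed on a Persian keyboard comes out as plain ASCII digits, which is what a downstream messaging API would typically expect.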

6 Likes

Numero 0.2.0 is released.

Hi everyone.

In this version I have added two new functions.

  • is_digit_only?/1, which checks whether every character of a string is a digit.
  • remove_non_digits/2, which removes all non-digit characters from a given string, with an optional list of characters to keep as exceptions.
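
As a rough sketch of what these two functions do (ASCII digits only, and not the library's real code: the module name, the empty-string behaviour and the charlist form of the exceptions are all assumptions):

```elixir
defmodule DigitFilterSketch do
  # Hypothetical illustration of the two helpers described above.

  # Assumption: an empty string is not considered "digit only".
  def digit_only?(""), do: false

  def digit_only?(string) when is_binary(string) do
    string
    |> String.to_charlist()
    |> Enum.all?(&(&1 in ?0..?9))
  end

  # Assumption: exceptions are given as a list of codepoints, e.g. [?+].
  def remove_non_digits(string, exceptions \\ []) when is_binary(string) do
    string
    |> String.to_charlist()
    |> Enum.filter(&(&1 in ?0..?9 or &1 in exceptions))
    |> List.to_string()
  end
end
```

For example, keeping a leading plus sign while stripping spaces, brackets and dashes from a phone number would be remove_non_digits("+98 (912) 123-4567", [?+]).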

I hope it will be useful for you guys.

Thanks.

3 Likes

@alisinabh: Here are my ideas:

  1. Look at Naming Conventions: Trailing question mark (foo?)
  2. I think that some things could be written more simply

For example:

defmodule Example do
  @digits String.graphemes("0123456789")

  def digit_only?(""), do: false
  def digit_only?(string), do: do_digit_only?(string)

  defp do_digit_only?(""), do: true
  defp do_digit_only?(<<char::binary-size(1), rest::binary>>) when char in @digits, do: do_digit_only?(rest)
  defp do_digit_only?(_), do: false
end
2 Likes

Thank you @Eiji for reminding me of my problem with function naming. I should remove is_ in the next minor version.

This example looks nice and clean, but the reason I did it with a char list was so that I could detect UTF-8 digits too (but now that I look at my code, I see I just forgot to support UTF-8 numbers :frowning:), like “۱”, which is 1 in Farsi.

Doing binary operations on a UTF-8 string is a bit hard, and maybe ugly.
For example, if I wanted to handle those digits the way you did, I would have to do this:

defmodule Example do
  @digits String.graphemes("0123456789")

  def digit_only?(""), do: false
  def digit_only?(string), do: do_digit_only?(string)

  defp do_digit_only?(""), do: true
  defp do_digit_only?(<<char::binary-size(1), rest::binary>>) when char in @digits, do: do_digit_only?(rest)
  defp do_digit_only?(<<char::binary-size(2), rest::binary>>) when char in @digits, do: do_digit_only?(rest)
  defp do_digit_only?(_), do: false
end

Which I think may lead to some unwanted exceptions.

Any ideas on how to deal with them?
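
To illustrate the concern about fixed-size matches, here is a small sketch (the module name is made up) showing that a Persian digit takes two bytes in UTF-8, so a one-byte match cuts the character in half:

```elixir
defmodule Utf8WidthSketch do
  # Hypothetical helper: returns the first byte of a binary,
  # the way a binary-size(1) pattern would see it.
  def first_byte(<<byte::binary-size(1), _rest::binary>>), do: byte
end

# "\u06F1" is the Persian digit one.
persian_one = "\u06F1"

# The digit occupies two bytes, so its first byte alone
# is not a valid UTF-8 string.
IO.inspect(byte_size(persian_one))
IO.inspect(String.valid?(Utf8WidthSketch.first_byte(persian_one)))
```

So each clause has to know the byte width of the digits it is meant to match, which is what makes this approach fragile.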

1 Like

I’d not use that guarded version at all… Unicode has a lot of possible digits:

http://www.fileformat.info/info/unicode/category/Nd/list.htm

You don’t really want to iterate over that list again and again, which is what using when char in @digits does. Instead it should look roughly like this:

@digits ~c[1234567890…] # include them all!

Enum.each(@digits, fn digit ->
  defp do_digit_only?(<<unquote(digit)::utf8, rest::binary>>), do: do_digit_only?(rest)
end)
defp do_digit_only?(<<_::utf8, _::binary>>), do: false
defp do_digit_only?(""), do: true
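
A self-contained version of that idea might look like the sketch below. The module name is made up, and the digit list is deliberately limited to ASCII, Arabic-Indic and Persian digits rather than the full Unicode Nd category.

```elixir
defmodule CompiledDigits do
  # Assumption: a small subset of Unicode digits, for illustration only.
  @digits Enum.concat([?0..?9, 0x0660..0x0669, 0x06F0..0x06F9])

  def digit_only?(""), do: false
  def digit_only?(string) when is_binary(string), do: do_digit_only?(string)

  # One function clause is generated per digit at compile time,
  # so no list is walked at runtime.
  Enum.each(@digits, fn digit ->
    defp do_digit_only?(<<unquote(digit)::utf8, rest::binary>>), do: do_digit_only?(rest)
  end)

  defp do_digit_only?(""), do: true
  # Anything else (other characters or invalid UTF-8) is not a digit.
  defp do_digit_only?(_other), do: false
end
```

Because the ::utf8 segment matches a whole codepoint, the clauses do not need to care whether a digit is one or two bytes wide.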
4 Likes

Thank you very much. :slight_smile:

This is so helpful. I will try that soon.

1 Like

Hey @NobbZ

I’ve been working on Numero today and did as you said, and I have to say it was a significant improvement in performance and it resulted in more beautiful code.

I tested the old version (the iteration solution) with a very long string and it took 1948ms; with your suggestion, normalizing the same string took only 5ms. Wow…

I knew binary pattern matching is fast in Elixir but I didn’t think it could improve performance this much! :anguished:
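
For anyone who wants to repeat this kind of rough measurement, one simple way is Erlang's :timer.tc/1. This is only a sketch with a made-up helper name, not necessarily how the numbers above were obtained; for serious comparisons a benchmarking library would be a better fit.

```elixir
defmodule TimingSketch do
  # Hypothetical helper: runs a zero-arity function once and returns
  # {elapsed_milliseconds, result}. :timer.tc/1 reports microseconds.
  def time_ms(fun) when is_function(fun, 0) do
    {micros, result} = :timer.tc(fun)
    {micros / 1000, result}
  end
end

# Example: time an arbitrary piece of work on a long string.
long_string = String.duplicate("0123456789", 10_000)
{ms, _size} = TimingSketch.time_ms(fn -> byte_size(String.reverse(long_string)) end)
IO.puts("took #{ms}ms")
```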

So thank you for your suggestion and teaching me. :heart:

Anyway, I’d be really thankful if you could look at this code and tell me:

  • Should I use one Enum.each/2, or is using multiple ones OK?
  • Have I done this the right way?

https://github.com/alisinabh/Numero/blob/fix-iteration/lib/numero.ex

Thank you

1 Like

I think you are talking about these lines:

https://github.com/alisinabh/Numero/blob/fix-iteration/lib/numero.ex#L135-L158

Since they all generate clauses for different functions, you have to keep them separate.
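
As a generic illustration (not the code from the linked file, and with made-up names): one loop per generated function keeps all clauses of that function grouped together, which is what the compiler expects.

```elixir
defmodule GroupedClauses do
  # Assumption: Persian digits paired with their ASCII counterparts.
  @persian Enum.zip(0x06F0..0x06F9, ?0..?9)

  # First loop: generates only to_ascii/1 clauses.
  Enum.each(@persian, fn {from, to} ->
    defp to_ascii(<<unquote(from)::utf8>>), do: <<unquote(to)>>
  end)

  defp to_ascii(other), do: other

  # Second loop: generates only persian?/1 clauses.
  Enum.each(@persian, fn {from, _to} ->
    defp persian?(<<unquote(from)::utf8>>), do: true
  end)

  defp persian?(_other), do: false

  def convert(grapheme), do: to_ascii(grapheme)
  def persian_digit?(grapheme), do: persian?(grapheme)
end
```

If a single loop emitted a to_ascii/1 clause and a persian?/1 clause on each iteration, the clauses of the two functions would be interleaved.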

1 Like

Thanks @NobbZ

I will release this version after doing some more tests on it.

Numero 0.3.0 released

Hi everyone.

I’ve released Numero v0.3.0, which is much faster than 0.2.0 thanks to binary pattern matching and avoiding a conversion back and forth with charlists (which @NobbZ suggested).

You can read release notes here if you are interested.

Thanks.

P.S. The reason I changed the minor version is that there are two slight incompatibilities with the previous version in the Numero API.

1 Like