Bridging locale name differences between ex_cldr & gettext

I did search on the forums here to see if there was a previous answer and didn’t come up with a clear match.

I’m working on a Phoenix app that is using gettext for text translations. A user will have a preferred language, so we’ll be serving up pages using that. However, for unauthenticated pages, we’ll need to read and respond to the Accept-Language headers.

I’ve been evaluating the Plug from ex_cldr_plug and it’s remarkably robust. However, it seems like the locale naming for Gettext isn’t aligned with the naming in CLDR.

Compiling 893 files (.ex)
warning: The locales ["de-DE", "en-US"] are configured in the Elixir.MyPhoenixApp.Gettext gettext backend but are unknown to CLDR. They will be ignored by CLDR.

Generating MyPhoenixApp.Cldr for 3 locales named [:de, :en, :und] with a default locale named :en

config/config.exs

# Configure for Internationalization (i18n)
config :myphxapp, MyPhoenixApp.Gettext,
  default_locale: "en_US",
  locales: ["en_US", "de_DE"]

config :ex_cldr,
  default_locale: "en",
  default_backend: MyPhoenixApp.Cldr,
  json_library: Jason

lib/myphxapp/cldr.ex

defmodule MyPhoenixApp.Cldr do
  use Cldr,
    locales: ["en", "de"],
    default_locale: "en",
    add_fallback_locales: false,
    gettext: MyPhoenixApp.Gettext,
    data_dir: "./priv/cldr",
    otp_app: :myphxapp,
    precompile_number_formats: ["¤¤#,##0.##"],
    precompile_transliterations: [{:latn, :arab}, {:thai, :latn}],
    # we could include providers for Numbers, etc. For now: empty
    providers: [],
    generate_docs: true,
    force_locale_download: false
end

lib/myphxapp/gettext.ex

defmodule MyPhoenixApp.Gettext do
  use Gettext, otp_app: :myphxapp
end

I believe that Gettext uses the POSIX locale name format, from what I can see in an earlier version of the ex_cldr readme:

Since Gettext uses the Posix locale name format (locales with an ‘_‘ in them) and Cldr uses the Unicode format (a ‘-‘ as the subtag separator), Cldr will transliterate locale names from Gettext into the Cldr canonical form.

I imagine the warning is hinting at this being more than just a format definition though (so not just underscore _ vs dash -).
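
To make the separator difference concrete, here is a minimal sketch of the name conversion the readme describes. The module and function names are hypothetical and this is not ex_cldr's own code; it only swaps the separator, which, as the warning suggests, is not the whole story.

```elixir
defmodule LocaleNameSketch do
  @moduledoc false

  # Hypothetical helper: converts a POSIX-style name ("en_US")
  # to the Unicode style ("en-US") by swapping the separator.
  def posix_to_unicode(name) when is_binary(name) do
    String.replace(name, "_", "-")
  end

  # Hypothetical helper: the reverse direction ("de-DE" to "de_DE").
  def unicode_to_posix(name) when is_binary(name) do
    String.replace(name, "-", "_")
  end
end
```

A name without a territory, such as "en", passes through unchanged in both directions.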

I realize that de_DE is not populated in CLDR from Kip’s response here:

Is there a way to address the warning and allow Gettext’s locales to coexist with CLDR’s definitions?

Or is the correct move to change from en_US & de_DE to simply en and de? I’m not sure.

I know that the application will need to support en_GB and en_CA soon, so that’s why I was attempting to use the language-code / country-code format.

Hopefully I’ve provided enough configuration details (and not too much).

Thanks in advance


@lenards, thanks for the detailed message. This is a poor error message. It’s correct as far as it goes, but it doesn’t explain well enough.

TLDR; While en-US isn’t known to CLDR, setting your CLDR to en-US will correctly link it to the Gettext locale en_US.

During the configuration phase (i.e. at compilation time), ex_cldr tries to resolve the required locale names by combining the requested ex_cldr locale names with the configured Gettext locale names (as you correctly identified).

At configuration time the key question is “Does CLDR have a data repository for this locale?” For en_US and de_DE it does not, since by the rules of CLDR the data in en is the data for en-US and the data in de is the data for de-DE.

At runtime, however, when validating a locale derived from the Accept-Language header or elsewhere, ex_cldr will still try to find a Gettext locale to match with the requested locale. Using your example configuration you could try this:

iex> {:ok, locale} = MyPhoenixApp.Cldr.validate_locale("en-US")
{:ok, #Cldr.LanguageTag<en-US [validated]>}
iex> locale.gettext_locale_name
"en_US"

If for some reason that’s not what you see (I tested it in my dev environment and it appears to be working as expected), please do open an issue.


Thanks Kip!

All that information I included and I forgot to mention that I was seeing #Cldr.LanguageTag<de-DE [validated]> when I was sending in a header value Accept-Language: de-DE,de;q=0.9,en-US;q=0.8,en;q=0.7 when using the Plug.

I’ll try changing the CLDR locales quick.


Yes, that’s exactly what you would expect to see!

Requesting a locale de-DE (which has a 1.0 weighting in the header and therefore the highest priority to match) will match the ex_cldr configured locale of de. It will use the extra provided information of the territory DE to note that that’s the user’s preferred territory for localisation.
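
For readers unfamiliar with the header weighting, here is a rough sketch of how an Accept-Language value can be split into tags ordered by their q weighting. The module name is hypothetical and this is not how ex_cldr_plug parses the header; it only illustrates why de-DE, with its implicit weighting of 1.0, is tried first.

```elixir
defmodule AcceptLanguageSketch do
  @moduledoc false

  # Parses an Accept-Language header value into {tag, quality} tuples,
  # highest quality first. Entries without a q parameter default to 1.0.
  def parse(header) when is_binary(header) do
    header
    |> String.split(",", trim: true)
    |> Enum.map(&parse_entry/1)
    |> Enum.sort_by(fn {_tag, quality} -> quality end, :desc)
  end

  defp parse_entry(entry) do
    case String.split(String.trim(entry), ";") do
      [tag] -> {String.trim(tag), 1.0}
      [tag | params] -> {String.trim(tag), quality(params)}
    end
  end

  # Finds the first well-formed q=<number> parameter, falling back to 1.0.
  defp quality(params) do
    Enum.find_value(params, 1.0, fn param ->
      with ["q", value] <- String.split(String.trim(param), "="),
           {quality, _rest} <- Float.parse(value) do
        quality
      else
        _ -> nil
      end
    end)
  end
end
```

Calling AcceptLanguageSketch.parse/1 with the header from this thread puts de-DE at the head of the list, followed by de, en-US and en.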

In short, configuration is about “what data do I have available in CLDR to use for localisation”. Runtime is about “what’s the best-fit data available to match with what the user requested”.
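
As a loose illustration of the runtime “best fit” idea, here is a simplified sketch that takes requested tags in priority order and falls back from a full tag to its language subtag. The module name is hypothetical, and ex_cldr’s actual matching is more involved than this; treat it as a mental model only.

```elixir
defmodule BestFitSketch do
  @moduledoc false

  # Returns {:ok, name} for the first requested tag that matches a configured
  # locale name, either exactly or by its language subtag alone.
  # Returns :error when nothing matches.
  def best_fit(requested, configured) do
    Enum.find_value(requested, :error, fn tag ->
      language = tag |> String.split("-") |> hd()

      cond do
        tag in configured -> {:ok, tag}
        language in configured -> {:ok, language}
        true -> nil
      end
    end)
  end
end
```

With the configured names ["en", "de"], a request led by "de-DE" resolves to "de", which mirrors the behaviour described above.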


One way to explore what is being resolved is to look at some of the data in the language tag struct:

iex> {:ok, locale} = TestBackend.Cldr.validate_locale "en-US"
{:ok, #Cldr.LanguageTag<en-US [validated]>}

# What we asked for
iex> locale.requested_locale_name 
"en-US"

# The CLDR data repository to use
iex> locale.cldr_locale_name     
:en

# What Gettext locale we linked to
iex> locale.gettext_locale_name
"en_US"

# What territory is associated
iex> locale.territory          
:US

I might have misunderstood about how to eliminate the warning.

I left the MyPhoenixApp.Gettext configuration the same.

If I change the CLDR configuration to:

config :ex_cldr,
  default_locale: "en-US",
  default_backend: MyPhoenixApp.Cldr,
  json_library: Jason

And then alter lib/myphxapp/cldr.ex to the following:

defmodule MyPhoenixApp.Cldr do
  use Cldr,
    locales: ["en-US", "de-DE"],
    default_locale: "en-US",
    add_fallback_locales: false,
    gettext: MyPhoenixApp.Gettext,
    data_dir: "./priv/cldr",
    otp_app: :myphxapp,
    precompile_number_formats: ["¤¤#,##0.##"],
    precompile_transliterations: [{:latn, :arab}, {:thai, :latn}],
    # we could include providers for Numbers, etc. For now: empty
    providers: [],
    generate_docs: true,
    force_locale_download: false
end

I’m failing a compile time check:

Compiling 263 files (.ex)
warning: The locales ["de-DE", "en-US"] are configured in the Elixir.MyPhoenixApp.Gettext gettext backend but are unknown to CLDR. They will be ignored by CLDR.


== Compilation error in file lib/myphxapp/cldr.ex ==
** (Cldr.UnknownLocaleError) Failed to install the locale named "en-US". The locale name is not known.
    (ex_cldr 2.31.0) lib/cldr/install.ex:84: Cldr.Install.do_install_locale_name/3
    (elixir 1.13.3) lib/enum.ex:937: Enum."-each/2-lists^foreach/1-0-"/2
    (ex_cldr 2.31.0) lib/cldr/install.ex:28: Cldr.Install.install_known_locale_names/1
    (ex_cldr 2.31.0) lib/cldr.ex:84: Cldr.install_locales/1
    (ex_cldr 2.31.0) expanding macro: Cldr.Backend.Compiler.__before_compile__/1
    lib/myphxapp/cldr.ex:1: MyPhoenixApp.Cldr (module)

So I’m still not sure I see how to tackle the warning here.

Exploring through iex is a great reminder. I definitely need to lean more on that as an approach :bowing_man:

There isn’t a way to eliminate the warning in this case. During compilation I try to be as explicit as possible - locale matching is complicated enough as is. But if you’re running with “warnings as errors” this would be problematic, so it sounds like this needs to be revisited: it is just a notification and there is nothing “wrong”.

Open to suggestions - what would you suggest is the correct behaviour? Emit the message, but not as a warning? Something else?


The locales configuration must be an exact match to the locales available in CLDR. There’s no clever matching in this part.


I was seeking to understand the warning, so I’m not sure I have a great suggestion in mind right now.

The concern for me was the phrase “are unknown to CLDR” in the warning:

warning: The locales ["de-DE", "en-US"] are configured in the Elixir.MyPhoenixApp.Gettext gettext 
backend but are unknown to CLDR. They will be ignored by CLDR.

Because the locales appear in the - format, which is the convention for ex_cldr and not for gettext, I was even more curious about what was up (because that seemed to imply some “smart, or savvy, evaluation” of the configuration by CLDR).

Let me ponder a bit and see if there is a more helpful or “wiser” response inside me that isn’t available at the moment.

Thank you again for all the effort on the ex_cldr* libraries and your prompt, thorough responses!


Thanks for the encouragement!

I have changed both the error message and the way it’s printed. It no longer uses IO.warn/2, so it won’t trigger compilation errors if you use warnings_as_errors: true. The message now reads:

note: The locale "en_US" is configured in the MyApp.Gettext gettext backend but is unknown to CLDR. It will not be used to configure CLDR but it will still be used to match CLDR locales to Gettext locales at runtime.
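
To illustrate the difference in approach, here is a minimal sketch of printing a plain “note” instead of calling IO.warn. The module name is hypothetical and this is not ex_cldr’s actual implementation; the point is only that a line written to stderr is not recorded as a compiler warning, so warnings-as-errors is unaffected.

```elixir
defmodule CompileNoteSketch do
  @moduledoc false

  # Prints an informational "note:" line to stderr and returns :ok.
  # Unlike IO.warn, this does not register a compiler warning.
  def note(message) when is_binary(message) do
    IO.puts(:stderr, "note: " <> message)
  end
end
```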

BTW, you can also add suppress_warnings: true to your CLDR backend configuration and it won’t print warnings. I’ve just seen that it’s not honoured in all cases, but it will be in this next release.
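
For reference, a sketch of where that option would sit, reusing the backend from the original question with most options trimmed for brevity. It assumes ex_cldr is a dependency and that the CLDR locales stay as "en" and "de", which is the configuration that compiled earlier in the thread.

```elixir
defmodule MyPhoenixApp.Cldr do
  use Cldr,
    # Must exactly match locale names that CLDR has data for
    locales: ["en", "de"],
    default_locale: "en",
    gettext: MyPhoenixApp.Gettext,
    otp_app: :myphxapp,
    # Silences the compile-time messages about unmatched Gettext locales
    suppress_warnings: true
end
```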

I’ll push a release in a few hours with this change.


I have published ex_cldr version 2.32.1 with a change to the error message and no longer using IO.warn/2. The changelog entry reads:

Bug Fixes

  • Don’t use IO.warn/2 when compiling a backend and a known Gettext locale can’t be matched to a Cldr locale. IO.warn/2 will cause errors if the compilation setting warnings_as_errors: true is set. Instead, these messages will be output as a “note” that does not trigger warnings. In addition the error message has been improved to make clear that although the Gettext locale has no Cldr equivalent, it will still be matched at runtime. See the conversation at https://elixirforum.com/t/bridging-locale-name-differences-between-ex-cldr-gettext. Thanks to @lenards for the report.

Wow! Thank you! :bowing_man: