Is there a Hex package to parse the "Accept-Language" header?

Is there a Hex library suitable for parsing an “Accept-Language” header, including the edge cases of the “quality value” syntax?

Examples (see MDN):

Accept-Language: en
Accept-Language: en-US
Accept-Language: *

// Multiple types, weighted with the quality value syntax:
Accept-Language: fr-CH, fr;q=0.9, en;q=0.8, de;q=0.7, *;q=0.5

The RFC would be https://tools.ietf.org/html/rfc7231#section-5.3.5

I suppose splitting the string and handling edge cases myself would not be terribly complicated, just wondering if someone had already done this :smiley:
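
For reference, here is a minimal sketch of what the hand-rolled approach could look like. The module name AcceptLanguageSketch and its functions are hypothetical and not taken from any Hex package. It only splits on "," and ";" and reads the q parameter. It does not validate language tags against the RFC, and treating a malformed q value as 0.0 is an assumption of this sketch.

```elixir
defmodule AcceptLanguageSketch do
  # Hypothetical hand-rolled parser; not part of any library.
  # Returns [{tag, quality}] sorted by descending quality.
  def parse(header) when is_binary(header) do
    header
    |> String.split(",", trim: true)
    |> Enum.map(&parse_range/1)
    |> Enum.reject(&is_nil/1)
    |> Enum.sort_by(fn {_tag, q} -> q end, :desc)
  end

  # Splits one "tag;q=0.8" element into {tag, quality}; blank elements are dropped.
  defp parse_range(range) do
    case range |> String.split(";") |> Enum.map(&String.trim/1) do
      ["" | _] -> nil
      [tag] -> {tag, 1.0}
      [tag | params] -> {tag, quality(params)}
    end
  end

  # A missing q parameter means a quality of 1.0.
  defp quality(params) do
    Enum.find_value(params, 1.0, fn param ->
      case String.split(param, "=", parts: 2) do
        [key, value] ->
          if String.downcase(String.trim(key)) == "q" do
            to_quality(String.trim(value))
          end

        _ ->
          nil
      end
    end)
  end

  # Assumption: a malformed or out-of-range q value is treated as 0.0.
  defp to_quality(value) do
    case Float.parse(value) do
      {q, ""} when q >= 0.0 and q <= 1.0 -> q
      _ -> 0.0
    end
  end
end

AcceptLanguageSketch.parse("fr-CH, fr;q=0.9, en;q=0.8, de;q=0.7, *;q=0.5")
```

The result is a list of {tag, quality} tuples with "fr-CH" first at 1.0 and "*" last at 0.5. Matching those tags against the locales an application actually supports is a separate step that this sketch leaves out.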

Sounds great. I stumbled on a missing HexDocs page for this package and assumed it was no longer maintained.

I believe cowlib can also handle it (among other headers).

I fixed the documentation link issue: one of the recent versions of ex_doc lowercases all the doc files but I had “README” in the docs section of mix.exs. Thanks for the heads-up.

Below is an example using your data. Note that the list is returned sorted by descending quality value, so hd will return the highest-priority entry in the list.

See:

  • Cldr.AcceptLanguage.parse/1
  • Cldr.AcceptLanguage.parse!/1
  • Cldr.AcceptLanguage.best_match/1
  • Cldr.AcceptLanguage.errors/1

If you have ex_cldr locales configured then it will populate the :cldr_locale_name with the nearest matching configured locale. You’ll notice that only “en” is configured in this instance.

The matching algorithm implements all of the CLDR aliases and does subtag replacement. It also parses, almost completely, the full definition of a language tag as defined by RFC 5646 (ask me if you want to know the exception :slight_smile:)

iex> Cldr.known_locale_names
["af", "bs", "en", "en-GB", "he", "it", "pl", "root", "ru", "th"]

# Return the locale that most nearly matches the configured Cldr locales above.
# Since "fr", "fr-CH" and "de" are not configured it returns "en"
iex> Cldr.AcceptLanguage.best_match "fr-CH, fr;q=0.9, en;q=0.8, de;q=0.7, *;q=0.5"
{:ok,
 %Cldr.LanguageTag{
   canonical_locale_name: "en-Latn-US",
   cldr_locale_name: "en",
   extensions: %{},
   gettext_locale_name: "en",
   language: "en",
   locale: %{},
   private_use: [],
   rbnf_locale_name: "en",
   requested_locale_name: "en",
   script: "Latn",
   territory: "US",
   transform: %{},
   variant: nil
 }}

# Returns the parsed list of locales in descending "q" order, with
# any parse errors appended to the end of the list.
iex> Cldr.AcceptLanguage.parse "fr-CH, fr;q=0.9, en;q=0.8, de;q=0.7, *;q=0.5"
{:ok,
 [
   {1.0,
    %Cldr.LanguageTag{
      canonical_locale_name: "fr-Latn-CH",
      cldr_locale_name: nil,
      extensions: %{},
      gettext_locale_name: nil,
      language: "fr",
      locale: %{},
      private_use: [],
      rbnf_locale_name: nil,
      requested_locale_name: "fr-CH",
      script: "Latn",
      territory: "CH",
      transform: %{},
      variant: nil
    }},
   {0.9,
    %Cldr.LanguageTag{
      canonical_locale_name: "fr-Latn-FR",
      cldr_locale_name: nil,
      extensions: %{},
      gettext_locale_name: nil,
      language: "fr",
      locale: %{},
      private_use: [],
      rbnf_locale_name: nil,
      requested_locale_name: "fr",
      script: "Latn",
      territory: "FR",
      transform: %{},
      variant: nil
    }},
   {0.8,
    %Cldr.LanguageTag{
      canonical_locale_name: "en-Latn-US",
      cldr_locale_name: "en",
      extensions: %{},
      gettext_locale_name: "en",
      language: "en",
      locale: %{},
      private_use: [],
      rbnf_locale_name: "en",
      requested_locale_name: "en",
      script: "Latn",
      territory: "US",
      transform: %{},
      variant: nil
    }},
   {0.7,
    %Cldr.LanguageTag{
      canonical_locale_name: "de-Latn-DE",
      cldr_locale_name: nil,
      extensions: %{},
      gettext_locale_name: nil,
      language: "de",
      locale: %{},
      private_use: [],
      rbnf_locale_name: nil,
      requested_locale_name: "de",
      script: "Latn",
      territory: "DE",
      transform: %{},
      variant: nil
    }}
 ]}
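
A short sketch of how the sorted result might be consumed. To keep it runnable without ex_cldr, plain maps stand in for the %Cldr.LanguageTag{} structs and carry only two of the fields shown above; the variable names are made up for illustration.

```elixir
# Stand-in for the {:ok, [{quality, tag}, ...]} shape shown above.
# Plain maps replace the %Cldr.LanguageTag{} structs (an assumption of this sketch).
parsed =
  {:ok,
   [
     {1.0, %{requested_locale_name: "fr-CH", cldr_locale_name: nil}},
     {0.9, %{requested_locale_name: "fr", cldr_locale_name: nil}},
     {0.8, %{requested_locale_name: "en", cldr_locale_name: "en"}},
     {0.7, %{requested_locale_name: "de", cldr_locale_name: nil}}
   ]}

{:ok, entries} = parsed

# Highest-priority entry as requested by the client.
{top_q, top_tag} = hd(entries)

# First entry that resolved to a configured locale, if any.
first_configured =
  Enum.find(entries, fn {_q, tag} -> tag.cldr_locale_name != nil end)
```

Here hd gives the "fr-CH" entry, while the first entry with a non-nil :cldr_locale_name is the "en" one, which is in line with what best_match returned above.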

As @g-andrade said, we can use cowlib.

accept_language = "ja,en-US;q=0.9,en;q=0.8,zh-CN;q=0.7,zh;q=0.6"

:cow_http_hd.parse_accept_language(accept_language)

This code returns [{"ja", 1000}, {"en-us", 900}, {"en", 800}, {"zh-cn", 700}, {"zh", 600}].

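Note that cowlib lowercases the language tags, so the country code comes back as "us" rather than "US". The quality values are returned as integers from 0 to 1000 rather than as floats.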
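
If the conventional casing and float weights are needed, the output can be post-processed. This is a minimal sketch under stated assumptions: CowlibOutputSketch is a hypothetical helper, not part of cowlib, and it only handles simple "language-REGION" tags with a two-letter region. Script subtags and longer tags are left untouched.

```elixir
defmodule CowlibOutputSketch do
  # Hypothetical helper; not part of cowlib.
  # Converts [{tag, 0..1000}] into [{tag, 0.0..1.0}] and upcases a
  # two-letter region subtag ("en-us" becomes "en-US").
  def normalize(ranges) do
    Enum.map(ranges, fn {tag, weight} -> {restore_case(tag), weight / 1000} end)
  end

  defp restore_case(tag) do
    case String.split(tag, "-") do
      [language, region] when byte_size(region) == 2 ->
        language <> "-" <> String.upcase(region)

      _ ->
        tag
    end
  end
end

# Input copied from the cowlib result quoted above.
CowlibOutputSketch.normalize([
  {"ja", 1000},
  {"en-us", 900},
  {"en", 800},
  {"zh-cn", 700},
  {"zh", 600}
])
```

Language tags are case-insensitive, so this step only matters when the tags are compared case-sensitively against locale names such as "en-US".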
