How to approach a multi-language content management system?

I have a web-based application that uses Gettext for multi-language support, and it works fine. The only limitation is that translations must be done upfront. Now I’m thinking about letting users enter content in each of the languages the system uses, so I’m considering storing those translations in the database.
I couldn’t find much information about this and I’m not sure about the best approach: should I create N identical schemas (one per language) and select the right one in the controller? Use a package? I’m using a plug in the router to set the language for Gettext.


Hi!
There are multiple approaches available to solve this problem.

One of them is having a parallel, separate schema for storing the translations of each model. For example, you may have an articles schema and an article_translations schema. This is the traditional approach used by the Ruby gem Globalize. There are some libraries that provide similar (though I think not exactly the same) functionality for Elixir.

Having said that, I want to propose an alternative solution.
Keeping the basic model data and its translations in separate schemas has some difficulties and affects query performance, since it requires multiple JOINs per translated model.

Since modern databases support unstructured data such as JSON, and Ecto supports this kind of data, I’ve built a library that leverages this to store model translations in a single column of the model’s own table.
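A rough sketch of the single-column idea, using plain Elixir maps instead of Ecto schemas and deliberately not using Trans's actual API. The module name `TranslationsColumnSketch`, the `:translations` key and the string-keyed layout (`locale => field => value`) are assumptions made for illustration only.

```elixir
defmodule TranslationsColumnSketch do
  # Returns the value of `field` in `locale`. When no translation is stored
  # for that locale, falls back to the untranslated value on the record.
  def translate(record, field, locale) do
    record
    |> Map.get(:translations, %{})
    |> get_in([to_string(locale), to_string(field)])
    |> case do
      nil -> Map.fetch!(record, field)
      value -> value
    end
  end
end

# A record as it might look after loading a row whose JSON column
# holds every translation (hypothetical data).
article = %{
  title: "Hello world",
  translations: %{
    "es" => %{"title" => "Hola mundo"}
  }
}

# Spanish is stored, so the translation is used.
spanish = TranslationsColumnSketch.translate(article, :title, :es)

# French is missing, so the base title is used.
french = TranslationsColumnSketch.translate(article, :title, :fr)
```

Because everything lives on the same row, reading a translated record needs no JOIN; the trade-off is that filtering by a translated value depends on the database's JSON operators.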

The library is called Trans; it has some examples and documentation that you may find useful. There is also this article, which explains why Trans was created and what improvements it provides.

Hope you find it useful :sunflower:


Thank you.
I’ll have a look and come back with feedback.

We also went the embed way, but did it directly with Ecto. Depending on your app, if the contents don’t have to be queried (they can be, but the syntax is not that nice yet), it’s relatively straightforward with nested embeds. Still not really simple, but these things rarely are :slight_smile:

Here’s a simplified schema:

defmodule App.Thing do
  use App.Web, :model

  alias App.Thing.Content
  alias App.OtherThing

  schema "thing" do
    field :name, :string
    field :order, :integer

    embeds_one :en, Content, [on_replace: :delete]
    embeds_one :de, Content, [on_replace: :delete]

    has_many :other_things, OtherThing

    timestamps()
  end
end

Each content can have its own embeds_one / embeds_many. Example query:

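# Requires `import Ecto.Query` in the enclosing module.
@spec only_lang(Ecto.Queryable.t(), atom()) :: Ecto.Query.t()
def only_lang(queryable, lang) do
  queryable
  # keep only the rows that have content for the requested language
  |> where([t], not is_nil(field(t, ^lang)))
  |> select([t], %{content: field(t, ^lang), id: t.id, uuid: t.uuid, name: t.name})
end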

These fields are nullable, so the user can have content for only one of the supported languages. One thing to note is that we mostly deliver contents over the API, without keeping the language in server-side state.
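Since any language embed may be nil, the caller has to decide what to show when the requested language is missing. A minimal sketch of one option, assuming the client sends its preferred languages with each request; `ContentPicker`, `pick/2` and the sample data are hypothetical, and plain maps stand in for the loaded schema struct.

```elixir
defmodule ContentPicker do
  # Languages that exist as embeds on the schema (:en and :de in the post above).
  @supported [:en, :de]

  # Returns {lang, content} for the first preferred language that has
  # content, or nil when none of the preferred languages is filled in.
  def pick(thing, preferred) do
    preferred
    |> Enum.filter(&(&1 in @supported))
    |> Enum.find_value(fn lang ->
      case Map.get(thing, lang) do
        nil -> nil
        content -> {lang, content}
      end
    end)
  end
end

# Only the German content was entered for this record.
thing = %{name: "intro", en: nil, de: %{title: "Hallo"}}

picked = ContentPicker.pick(thing, [:en, :de])
missing = ContentPicker.pick(thing, [:en])
```

Returning the language together with the content lets the API response state which language was actually served.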


Looks great! :raised_hands:

@PJextra: You need to have 2 types of translations:

  1. Database data, for example Article could have translated :body and :title
  2. Normal translations
    Here it’s more difficult. In an advanced system you should define rules per language so that a word can easily be changed to its singular or plural form, and inflected by declension.
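To make "rules per language" concrete, here is a small sketch of plural-category rules for whole numbers in two languages. English distinguishes only 1 from everything else, while Polish has separate "few" and "many" forms. The `PluralRules` module and its function names are hypothetical; in a real system the rules could equally be loaded from the database.

```elixir
defmodule PluralRules do
  # English: :one for exactly 1, :other for every other count.
  def category(:en, 1), do: :one
  def category(:en, n) when is_integer(n) and n >= 0, do: :other

  # Polish (whole numbers): :one for 1; :few for counts ending in 2..4,
  # except those ending in 12..14; :many for everything else.
  def category(:pl, 1), do: :one

  def category(:pl, n) when is_integer(n) and n >= 0 do
    if rem(n, 10) in 2..4 and rem(n, 100) not in 12..14, do: :few, else: :many
  end

  # Picks the word form that matches the count in the given language.
  def pluralize(forms, lang, count) do
    Map.fetch!(forms, category(lang, count))
  end
end

file_pl = %{one: "plik", few: "pliki", many: "plików"}
file_en = %{one: "file", other: "files"}

few_form = PluralRules.pluralize(file_pl, :pl, 22)
many_form = PluralRules.pluralize(file_pl, :pl, 12)
english_form = PluralRules.pluralize(file_en, :en, 22)
```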

Here is an example:

defmodule Lang do
  use Ecto.Schema

  schema "langs" do
    # has_many :translations :-)
    field :country, :string
    field :family, :string
    timestamps()
  end

  # Joins family and country into one tag, e.g. "en_US" for family "en", country "US".
  def tag(lang), do: lang.family <> "_" <> lang.country
end


defmodule Article do
  use Ecto.Schema

  schema "articles" do
    has_many :translations, Article.Translation
    timestamps()
  end
end

defmodule Article.Translation do
  use Ecto.Schema

  schema "article_translations" do
    belongs_to :article, Article
    field :body, :string
    # Lang is a schema, so reference it through an association
    belongs_to :lang, Lang # or: field :lang, :string
    field :title, :string
    timestamps()
  end
end

defmodule Translation do
  use Ecto.Schema

  schema "translations" do
    field :code, :string
    # Lang is a schema, so reference it through an association
    belongs_to :lang, Lang
    # or:
    # field :lang, :string
    field :value, :string
    timestamps()
  end
end

defmodule Translator do
  def declension(word, type, lang) do
    # apply your rules for the given declension type here;
    # of course nothing stops you from inserting the rules into the database
    # and fetching them here
  end

  def for_context(model, :index, lang) do
    plural_model =
      model
      |> for_model(lang)
      |> plural(:many, lang)
    # here you should generate the where part of the Ecto.Query
    # (`where` below is a placeholder for that generated condition)
    translate(where, %{plural: plural_model})
  end
  # of course you should define other context actions here ...
  # for RESTful you need only:
  # [:edit, :index, :insert, :new, :update]

  def for_field(model, field, lang) do
    # here you should generate where Ecto.Query part
    translate(where)
  end

  def for_field_description(model, field, lang) do
    # here you should generate where Ecto.Query part
    # custom method for example for form fields pop-up with additional data
    translate(where)
  end

  def for_model(model, lang) do
    # here you should generate where Ecto.Query part
    translate(where)
  end

  def singular(word, lang) do
    # here use your rules per language;
    # of course nothing stops you from inserting the rules into the database
    # and fetching them here
  end

  def plural(word, count, lang) when is_integer(count) or count == :many do
    # here use your rules per language;
    # of course nothing stops you from inserting the rules into the database
    # and fetching them here
  end

  # Function head: with several clauses, the default may be declared only once.
  def translate(code_or_where, data \\ %{})

  def translate(code, data) when is_binary(code) do
    # here you should generate the where part of the Ecto.Query
    # from the translation code (custom translations)
    translate(where, data)
  end

  def translate(where, data) do
    # here you should query using where
    # and then replace your keys with the given data;
    # if you do not find a translation,
    # you can either create an empty one here or just return an error
  end
end

Example generated translation codes:

  1. models.model_name_to_snake_case.description
  2. models.model_name_to_snake_case.name
  3. model_fields.model_name_to_snake_case.field_name.description
  4. model_fields.model_name_to_snake_case.field_name.name
  5. custom.your_code_goes_here

You can also define a code fallback, for example looking in the database in this order:

  1. model_fields.model_name_to_snake_case.field_name.description
  2. global_fields.field_name.description
  3. model_fields.model_name_to_snake_case.field_name.name
  4. global_fields.field_name.name
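The fallback order above can be sketched as a list of candidate codes that is walked until one of them has a value. In this sketch a plain map stands in for the translations table, and `CodeFallback`, `candidates/2` and `lookup/3` are hypothetical names rather than part of the Translator module above.

```elixir
defmodule CodeFallback do
  # Candidate codes for a field, from most to least specific,
  # in the same order as the list above.
  def candidates(model, field) do
    [
      "model_fields.#{model}.#{field}.description",
      "global_fields.#{field}.description",
      "model_fields.#{model}.#{field}.name",
      "global_fields.#{field}.name"
    ]
  end

  # Returns the first translation found, or nil when no candidate matches.
  def lookup(translations, model, field) do
    model
    |> candidates(field)
    |> Enum.find_value(&Map.get(translations, &1))
  end
end

# Stand-in for rows of the translations table (code => value).
translations = %{
  "global_fields.title.description" => "Title shown at the top of the page",
  "model_fields.article.title.name" => "Headline"
}

found = CodeFallback.lookup(translations, "article", "title")
not_found = CodeFallback.lookup(translations, "article", "body")
```

With a real table, the same candidate list could drive a single query on the code column, with the ordering applied afterwards in Elixir.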

Of course nothing stops you from inserting/fetching translations for fake models like:

Translator.for_field(MyApp, :description, :en)
# or:
Translator.for_field(MyApp, :version, 5) # 5 is language id from database

After some testing I can see that all the suggestions are good options.
Nevertheless, I was thinking of a simpler approach using the new Contexts mindset:

  1. In my schema I’ll have a duplicated field for each language’s translation (post, postDe, postFr, …);
  2. In the client I’ll make sure that when this field is populated, the translations are populated too (by default with the same original content);
  3. In my Context interface I’ll define each query so that, depending on the language, it searches the translation fields rather than the original.

Theoretically I don’t see any problem and the logic seems very simple: translations are explicit (in the schema, on the client side and in the queries), and since not all schema fields are suitable for translation, the schema doesn’t get much more complex (it is explicit which fields expect translations).
Does this make sense to someone who has already implemented this functionality, or is it just a newbie’s wrong idea?
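For reference, the field-per-language idea could look roughly like this inside a context module. This is only a sketch with plain maps instead of Ecto structs: the `Posts` module, the locale atoms and the fallback to the original `post` field are assumptions, and only the field names post, postDe and postFr come from the description above.

```elixir
defmodule Posts do
  # Locale => translated field; any other locale uses the original `:post` field.
  @translated %{de: :postDe, fr: :postFr}

  # Name of the field that holds the text for `locale`.
  def field_for(locale), do: Map.get(@translated, locale, :post)

  # Text for `locale`, falling back to the original when the translation is empty.
  def localized_post(record, locale) do
    Map.get(record, field_for(locale)) || record.post
  end
end

record = %{post: "Hello", postDe: "Hallo", postFr: nil}

german = Posts.localized_post(record, :de)
french = Posts.localized_post(record, :fr)
```

Keeping the locale-to-field mapping in one place means the rest of the context only ever deals with locales, not with field names.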