Multilanguge CMS database schema, ideas, opinions

Hi everyone,

I only found this on this topic How to approach multi language content management system?

and wondering if in 2020 there is a better way to structure the database schema except for embedding the translations in the article table?

How would you solve this problem?

Any suggestion or ideas is welcomed.

I am also interested in using absinthe and phoenix templates apps how to handle this?

Thanks in adavance

Without having read that link I’d do it by having a column that encodes the language key as well, makes it for easy and efficient lookup, making it trivial to even fall back to another with minimal network use compared to embedding and so forth.

2 Likes

Thanks @OvermindDL1 for the suggestion and explanations.

1 Like

Also i would like to be sure that understood your point of view so I am writing the following schema
Languages migration

create table(:languages) do
      add :name, :string
      timestamps()
end

Article migration

create table(:posts) do
      add :title, :string
      add :content, :string
      add :published, :boolean, default: false, null: false
      add :language_id, references(:languages, on_delete: :nothing)
      timestamps()
end

Category migration

create table(:categories) do
      add :name, :string
      add :language_id, references(:languages, on_delete: :nothing)
      timestamps()
end

Something like this?

But the language_id to be optional only if the category or article written in another language?

I wouldn’t even have the languages table at all, I’d just use the ISO-standard language codes for the language mappings.

For posts the language_id would just be a string type for that ISO mapping. Use the elixir cldr library (I think that was it?) for handling the language codes and making sure they are correct and all. Nicely by following this style you can easily lookup codes like en-us for a US specific entry, and if it doesn’t exist you can fall back to an en generic lookup, and if that fails you can fall back to an en-% prefix lookup, and if that fails fallback to whatever you want, all in a single query is easy to do. :slight_smile:

And for the categories table the language_id would again just be a string for the ISO language code.

I haven’t done anything super advanced with internationalization though so I could easily be missing something, but this is both what I’ve done in the past and what I’ve seen in a large variety of significantly used databases.

2 Likes

So something like this

create table(:categories) do
      add :name, :string
      add :language_code, :string
      timestamps()
end

and i would use a where for every language i support i guess.

Thanks @OvermindDL1 i think i finally got it.

That’s what I’d do. Plus add an index for the language code (along with probably the name after it, so an index on [:language_code, :name] perhaps would be the easiest and fastest index to make for the most usual queries I’d wager, but it all depends on what kind of queries you do).

2 Likes

The lib mentioned by @OvermindDL1 https://hexdocs.pm/ex_cldr_languages/readme.html#installation
https://github.com/elixir-cldr/cldr

1 Like

Ah yep, that looks like it, in addition to ex_cldr itself too!

EDIT: Or perhaps just ex_cldr itself?

1 Like

What are the difference because i couldn’t tell from the docs of https://hexdocs.pm/ex_cldr_languages/

1 Like

Looks like ex_cldr_languages just uses ex_cldr under the hood for an easier language name conversion, but it seems you can do it all with normal ex_cldr I think?

1 Like

Thanks i will have to try and see how it goes.

For the table, I would add a translation table, if You don’t want to embed translation…

create table(:posts) do
      add :title, :string
      add :content, :string
      add :published, :boolean, default: false, null: false
      timestamps()
end

create table(:posts_translations) do
      add :post_id, references(:posts, on_delete: :delete_all)
      add :language_id, references(:languages, on_delete: :nothing)
      # Add fields You want to translate
      add :title, :string
      add :content, :string
      timestamps()
end

This is more or less what globalize does.

3 Likes

Hmm interesting @kokolegorille but wouldn’t i have more joins to do this way?

Also i have more things that need to have a relations to an article(post):

category
tags
type

Yes, one join to do, but only one, not many…

That is why embedding translation in a json field is a good alternative.

1 Like

Got it the translations are required to be there just once when i select the language the content is displayed.

Thanks @kokolegorille for sharing this database diagram

I’ve generally seen both a side translation table and an embedded translation thing as crutches for fixing up originally broken designs. Designing the tables with the translation keys from the start is far superior as there is less joining, more efficient indexes, etc…

3 Likes

Yes it is quicker, but You also duplicate values that doesn’t need to be translated.

So the diagram vision is speed vs data duplication?

Wouldn’t you just duplicate the languages_code in @OvermindDL1 database diagram suggestion?

If it doesn’t need to be translated then just have a default of a blank locale key. ^.^

But really, I’m curious what wouldn’t ever need to be translated?

I don’t see that there’s any duplication at all with my setup? Can you elaborate with an example I’m quite curious?