Trans - Embedded translations for Elixir

I’ve released a new version of Trans :rocket:

Trans 1.1.0 makes Ecto an optional dependency.

This update addresses one of the main concerns of Trans since its inception: leveraging a database when one is available, while remaining usable without it. The Trans.QueryBuilder component requires Ecto to work, but the Trans.Translator component can be used with any struct or map and does not require a database.
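
For example, this is roughly what using Trans.Translator on a plain map could look like (a minimal sketch that assumes the default :translations container holding a map of locales, each with its translated fields):

# A plain map: no Ecto schema and no database involved.
article = %{
  title: "Hello",
  body: "Hello world",
  translations: %{
    "es" => %{"title" => "Hola", "body" => "Hola mundo"}
  }
}

# Trans.Translator looks the translation up directly in the data.
Trans.Translator.translate(article, :body, :es)
#=> "Hola mundo"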

As usual, any comments, suggestions, issues or pull-requests are more than welcome!
Love :heart:

4 Likes

Trans 2.0 is out! :rocket:

This release of Trans is focused on improving the library interface and making it safer and more usable.
The Trans.QueryBuilder module has been completely rewritten. It now exposes the translated/3 macro, which generates an SQL fragment that can be used when building Ecto queries.
The new translated/3 macro is compatible with all the functions and macros in Ecto.Query and Ecto.Query.API and provides safe checks against translations of non-existing or non-translatable fields.

Compare how you would create a query with Trans 2.0 and before:

# Now: Trans 2.0
iex> Repo.all(from a in Article,
...> where: ilike(translated(Article, a.body, :es), "%elixir%"))

# Before: Trans 1.0
iex> Article
...> |> Trans.QueryBuilder.with_translation(:es, :title, "%République%", type: :like)
...> |> Repo.all

More detailed release notes can be found on GitHub. I also plan to publish an article soon explaining the changes and the reasoning behind them.

EDIT: the promised article about the changes in Trans 2.0 is now published on Medium :raised_hands:

As usual, any comments, suggestions, issues or pull-requests are more than welcome!
Love :heart:

1 Like

@belaustegui: Can you explain why you decided to have all translated columns in one big jsonb?
After reading your description here I thought that you did something like:

defmodule Article do
  # use Ecto.Schema
  use Trans.Schema

  schema "articles" do
    trans_field :body, :string
    trans_field :title, :string
  end
end

so each trans_field would be a separate map column.
How about performance when you have a big jsonb map (with lots of fields) and lots of records?

Do you validate language codes?

For your #12 and #14 issues: in some cases developers prefer to store locales in the database; for example, a table named locales could have the fields id, locale, fallback_locale_id, name and description.
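
A hypothetical Ecto schema for such a locales table could look like this (the module name and layout are only an illustration of the suggestion, not something Trans provides):

defmodule MyApp.Locale do
  use Ecto.Schema

  schema "locales" do
    field :locale, :string
    field :name, :string
    field :description, :string

    # Each locale may point at another locale to fall back to,
    # through a fallback_locale_id foreign key.
    belongs_to :fallback_locale, MyApp.Locale
  end
end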

btw. You still have the 2.0.0 milestone open even though the 2.0.0 version has already been released :smiley:

1 Like

Hi @Eiji , I really appreciate your comment.
I completely forgot about the milestone :sweat_smile:, it is closed now, thanks!

Your idea of trans_field looks really good indeed. Could you open an issue in the project so we can discuss it further?

I went with this approach for Trans because I wanted to port the hstore_translate gem to Elixir.
I share your concerns about the big jsonb field containing all the translations, in particular when fetching lots of data in queries. I haven’t been able to test Trans in any project with a high data volume. I used the hstore_translate gem in a Ruby project and this approach was faster than having the translations in their own separate tables (the globalize approach). I think that this performance gain will still apply in Elixir.

I may run a test comparing the performance of Trans against having the translations in separate tables. It would be a very interesting comparison :thinking:

Thank you very much!
Cheers.

1 Like

@belaustegui: I thought a lot about it today, but I still don’t have one good approach. I read some articles about it today and there are lots of pros and cons for every case (also for not using jsonb). I will think more about it another time, but I will probably have more proposals for your API.

1 Like

I like the approach, but it might be worth it to create a trans_LANGUAGE column. Committing to multiple languages on a site isn’t a small task, and this would separate each translation out into its own column.

You’d get a couple of benefits from that approach.

First, with updates. Variable-sized columns that can store blobs have to worry about space reallocation on updates, which can stress the database a good bit as the data grows, especially with any significant update frequency. By having a column per language you’ll end up with multiple smaller fields that each update less frequently.

Second, if you know which translation you want to get back, you can request it in the select rather than fetching the entire JSONB for every language. If there are multiple translated fields, you’ll be able to easily query the whole set of translations for that language on that row with one column, rather than having to separate it out of the JSONB. It will also speed up parsing the JSONB by keeping the size consistent.
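
To illustrate the idea, here is a hypothetical comparison in Ecto (the trans_es column is made up for this example; it is not how Trans currently stores data):

import Ecto.Query

# Column per language: select only the Spanish translations column.
per_language = from a in "articles", select: %{id: a.id, es: a.trans_es}

# Single jsonb column: the wanted language has to be extracted from the
# full multi-language map.
single_column =
  from a in "articles",
    select: %{id: a.id, es: fragment("?->'es'", a.translations)}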

1 Like

I’ve published an article explaining the main changes, improvements and future plans for Trans.

If you want to have a taste of the improvements in Trans 2.0 you should read it!

2 Likes

@belaustegui: I don’t have much time, but I had a quick look at the article and one point is interesting:

Make translations an embedded schema and create a custom Ecto type.

I was thinking about it too, especially a configuration that allows using it either as an embedded schema or as a normal association. I think about something like that because, after looking through lots of answers, articles and comments, I couldn’t find a clear answer about what the best way is. It’s related to data size and which database functions the developer would like to use. For the simplest scenarios jsonb is recommended, but it’s not so simple in bigger and/or special projects. I don’t want to copy and paste too much, but I just want to point out that it’s not easy to determine the best strategy, and if someone is going to share code then I think it should be as configurable as possible. What do you think about it?

1 Like

That looks pretty cool! I’m looking forward to it! :slight_smile:

1 Like

This currently has idea status only; I have to think about it more deeply.

On the one hand it would require more work to set up translations, since users would have to create embedded schemas and specify which data they should contain. On the other hand it would make translations safer by letting us specify valid fields, changesets, etc.
Since this would be a big change for Trans, it requires more thought and analysis. Some other issues should be fixed before addressing it.

I agree with you that it is not easy to determine the best strategy for content translation management. There are currently different approaches provided by libraries other than Trans, such as ecto_translate or translecto.
I also think that a library should have a single, clear concern. Users can then choose which approach best fits their needs and pick the right library for it.

Thank you very much :bow: !

Mostly agree, except in situations where two libraries could share ~90% of their code. I think in those cases developers prefer configuration over library count and, finally, if you implement your plan then it will be easy to make my suggestion a configuration option; otherwise it won’t be, and a second library will be the better choice.

Hi again! We got a new version of Trans!! :slight_smile:

Version 2.0.1 is a minor release which contains the following main changes:

  • Fixed some issues with documentation examples.
  • Use Ebert to check code quality.
  • Relax the dependency restrictions on Poison.

You can see the release notes on GitHub.

As usual, any comments, suggestions, issues or pull-requests are more than welcome!
Love :heart:

2 Likes

This may be more of an Ecto/macro mystery than something Trans-specific, but how can I compose queries with the translated macro? I am trying to find combinations of translations and have tried the following, which won’t compile:

def find_option(values, locale) do
  query = values
  |> Enum.reduce(Option, fn {f, val}, query ->
    from o in query, where: translated(Option, field(o, ^f), locale) == ^val
  end)
  # do query...
end

# => find_option(%{name: "NAME", category: "CATEGORY"}, :nl)

gives

** (Ecto.Query.CompileError) `field(field(o, ^f), :translations)` is not a valid query expression.

So the idea is to find Options (having two translated fields) with the same combination of translated values. I admittedly have a hard time getting my head around macros, and Ecto in particular, but how does one do this?

Hi again! We got a new version of Trans!! :slight_smile:

The only change in the new 2.1.0 version is support for Ecto 3.0 as a dependency. Client code using Trans should keep working as usual without any modification.

You can see the release notes on GitHub or fetch the new version from hex.pm.

As usual, any comments, suggestions, issues or pull-requests are more than welcome!
Love :heart:

5 Likes

Awesome. Just upgraded. The only change in my case was dropping {:ecto, "~> 3.0", override: true} from my deps. Thank you!

Hi everybody! I’ve just published a new release of Trans!

Thanks to @sfusato and @gorav, the main change in this 2.2.0 version is that the locale can now be passed as a string, so we can use Trans.Translator.translate/3 with the value returned by Gettext.get_locale/0 :raised_hands: For example, we can now write something like:

Trans.Translator.translate(article, :title, Gettext.get_locale())

This release also removes Faker as a dependency, updates ExDoc, and drops support for Elixir versions older than 1.6 (which are no longer officially supported by the core team).

You can see the release notes on GitHub or fetch the new version from hex.pm.

As usual, any comments, suggestions, issues or pull-requests are more than welcome!
Love :heart:

3 Likes

This is great to see, it makes it really easy for ex_cldr users to get in on the party too. For example:

Trans.Translator.translate(article, :title, MyApp.Cldr.get_locale().cldr_locale_name)

One thought for your consideration is locale naming.

  • A locale name in Gettext is an arbitrary string, as long as it matches a directory name. That is no issue for simple locale names like es, but it does create ambiguity when you want Spanish as spoken in Chile: is it es-CL or es_CL or es-cl or ES_CL?

  • POSIX locale names use _ as the separator between a language code and a territory code, therefore es_CL. POSIX locale names aren’t full language tags (in the Unicode sense) but of course they are very common and fulfil most requirements.

  • Unicode’s CLDR says that the "-" and "_" separators are treated as equivalent, although "-" is preferred. Therefore es-CL.

  • BCP 47 is the formal definition applied for locale names in ex_cldr, which therefore uses "-" in its canonical form. ICU, a common CLDR implementation, uses "_" since it’s primarily targeted at the POSIX-compliant world.

  • Last point: all identifier field values are case-insensitive, at least as defined by ICU and CLDR.

I’ve found this to be a source of occasional confusion: do I use - or _? Is es-cl the same as es-CL? Since both Gettext and CLDR allow either, I force one canonical approach internally in ex_cldr.

I’d like to suggest that Trans consider the implications of this too, since it now takes a string locale name as a key. Perhaps:

  • At least treat - and _ as equivalent (a minimal sketch of this kind of normalization follows the list below).
  • And desirably treat all identifiers as case-insensitive (which also happens to be what I do in ex_cldr). This might have backward-compatibility issues since it’s easy on a case-insensitive file system (most Mac file systems, Windows) but not on Linux and friends.
  • Or maybe just be really clear about the canonical form you expect and make it the developer’s concern.
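
For illustration, here is a minimal normalization sketch (not part of Trans or ex_cldr; the module and function names are made up) that treats - and _ as equivalent and compares identifiers case-insensitively:

defmodule LocaleName do
  # Normalizes a locale name for comparison purposes only: "es_CL", "ES-cl"
  # and "es-CL" all become "es-cl". A real implementation would also restore
  # the CLDR canonical casing (language lowercase, territory uppercase).
  def normalize(locale) when is_binary(locale) do
    locale
    |> String.replace("_", "-")
    |> String.downcase()
  end
end

LocaleName.normalize("es_CL")
#=> "es-cl"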

Thanks for a great library, hope this wasn’t too much of a dive into esoterica!

2 Likes

Hi again! I’ve just published a new release of Trans.

Thanks to @sfusato, @RxAssim and Philipp Waldmann for their awesome contributions:

  • Translations can now be stored using embedded schemas. This is now the preferred way of using Trans, since it provides stronger, more explicit and easier-to-use translations (a rough sketch follows the list below). The old plain maps are still supported, of course.
  • Entire structs can now be translated using the new translate/2 function.
  • We can automatically raise an error if a translation does not exist by using the new translate!/3 function.
  • Documentation has been greatly improved to showcase the new structured-translations approach and to include specs for the public functions.
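
In plain Ecto terms, storing translations as embedded schemas could look roughly like this (a hedged sketch; the exact options and the Trans-specific wiring are described in the documentation and release notes):

defmodule MyApp.Article.Translation do
  use Ecto.Schema

  # One set of translated fields for a single locale.
  embedded_schema do
    field :title, :string
    field :body, :string
  end
end

defmodule MyApp.Article.Translations do
  use Ecto.Schema

  # One embedded struct per supported locale.
  embedded_schema do
    embeds_one :es, MyApp.Article.Translation, on_replace: :update
    embeds_one :fr, MyApp.Article.Translation, on_replace: :update
  end
end

defmodule MyApp.Article do
  use Ecto.Schema

  schema "articles" do
    field :title, :string
    field :body, :string
    embeds_one :translations, MyApp.Article.Translations, on_replace: :update
  end
end

Because each translation is a proper embedded schema, missing or misspelled fields can be caught by changesets instead of silently ending up in a plain map.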

This release also updates the Trans dependencies to ensure clean compilations without warnings and requires Elixir 1.7 or higher.

You can see the release notes on GitHub or fetch the new version from hex.pm.

As usual, any comments, suggestions or pull-requests are more than welcome.
Love :heart:

7 Likes

Hi again! I’ve just published Trans 3.0.0.

Thanks to @kip , Kian-Meng Ang and Daniel Tinoco for their awesome contributions:

  • Unstructured translations are not supported anymore. They have been discouraged for some time, but this release removes them in favour of structured translations using regular schemas, since those provide much more explicitness and robustness.
  • The new Trans.translations/2 macro generates the translation code for you. In most cases this macro will generate the required nested schemas so you don’t have to write the code manually. If you prefer to write the translation schemas yourself, you can always continue to do so just like before.
  • Locale fallback chains are now supported. When translating a struct or querying the database you may provide a list of locales instead of a single locale, and Trans will fall back to the next locale in the list until it finds a translation (see the sketch after this list).
  • Docs and tests have been improved to showcase the new functionality.
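
For example, translating with a fallback chain might look like this (a small sketch based on the release notes; Trans falls back through the list until one locale has a translation):

# Try French first, then Spanish, before falling back to the source fields.
Trans.Translator.translate(article, :title, ["fr", "es"])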

This release also updates the Trans dependencies and bumps the minimum supported Elixir version to 1.11.

You can see the release notes on GitHub or fetch the new version from hex.pm.

As usual, any comments, suggestions or pull-requests are more than welcome.
Love :heart:

3 Likes