Trans - Embedded translations for Elixir

Hi all.
A few days ago I published my first package on Hex.pm. It is called Trans and aims to provide an easy way to leverage database support for JSON datatypes to store translations. Trans is heavily inspired by the incredible gem hstore_translate.

The traditional approach of storing translations in adjacent tables quickly increases the number of JOINs required to retrieve data, especially when a single query involves multiple models. Trans instead stores the translations in a single column of each model, so when a model is retrieved, so are its translations. Modern RDBMSs support this kind of schemaless data and allow conditions to be applied to it in queries.
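
To make the idea concrete, here is a minimal sketch of what a record with embedded translations could look like once loaded into Elixir. The field names, the `translations` key and the locale codes are assumptions chosen for illustration, not something Trans mandates:

```elixir
# Hypothetical article as a plain map; all names here are illustrative only.
article = %{
  title: "How to write a spelling corrector",
  body: "A wonderful article by Peter Norvig",
  # One container holds every translation, keyed by locale and then by field.
  translations: %{
    "es" => %{
      "title" => "Cómo escribir un corrector ortográfico",
      "body" => "Un artículo maravilloso de Peter Norvig"
    },
    "fr" => %{
      "title" => "Comment écrire un correcteur orthographique",
      "body" => "Un merveilleux article de Peter Norvig"
    }
  }
}

# The translation is already in memory: no extra query or JOIN is needed.
spanish_title = get_in(article, [:translations, "es", "title"])
IO.puts(spanish_title)
```

Since everything travels with the row, reading a translation is just a nested map lookup.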

If you find it interesting, take a more detailed look at:

Any suggestions, issues, ideas and contributions are more than welcome.
Love :hearts:

12 Likes

Trans is now on version 1.0.0 :slight_smile:!!!
You can see the release notes on GitHub. The main changes of this version are the improved support for Elixir 1.3.x and the new requirement of Ecto 2.0.

You can update the version of Trans in your project by adding {:trans, "~> 1.0"} to your mix.exs and then running mix deps.update trans.

As usual, any comments, suggestions, issues or pull-requests are more than welcome!
Love :heart:

2 Likes

Hi again! We got a new version of Trans!! :slight_smile:

The version 1.0.1 is a minor release that focuses mainly on making Trans more comprehensible by improving the documentation and adding a changelog that conforms to the Keep a Changelog format.

You can see the release notes on GitHub. There are also some nice improvements planned.

As usual, any comments, suggestions, issues or pull-requests are more than welcome!
Love :heart:

3 Likes

After a long time we have a new version of Trans.

The main changes in version 1.0.2 are:

  • Trans now compiles cleanly and is tested on Elixir 1.4.
  • The dependency earmark has been removed, since it is already required by ex_doc.
  • A CONTRIBUTING.md file has been added, detailing the contribution guidelines.

You can see the release notes and the planned improvements on GitHub.

The next Trans version will focus on making Ecto an optional dependency that is only required when using the QueryBuilder component.

As usual, any comments, suggestions, issues or pull-requests are more than welcome!
Love :heart:

4 Likes

So this is not so much for translations of the application, but rather for easily allowing users to create their own translated content for a given set of data? Looks useful. :slight_smile:

2 Likes

This is awesome for user-generated sites that want to manage multi-language content effortlessly in PostgreSQL by leveraging JSONB types.

Thank you very much for posting about it. I already had it in my GitHub stars but had forgotten about the project, and now it might come in useful for a side project.

3 Likes

Thank you very much for your words @OvermindDL1 and @schp :slight_smile:

The mission of Trans is to provide an easy way to retrieve translations from structs or maps, and (optionally) provide an interface for generating Ecto queries by adding conditions on translated fields.

Trans has two main components:

  • The Translator's mission is to retrieve a translation in the desired language, or fall back to the default one if no translation exists. (I also plan to allow more flexibility in the fallback process.)
  • The QueryBuilder's mission is to allow creating or modifying queries based on translated values. This component does require Ecto and leverages the power of the JSONB data type of PostgreSQL databases to look into the translations for the queries.

At the moment Trans has a hard dependency on Ecto, but I intend to make this dependency optional in the next version. Then, you will be able to use the Translator component without Ecto in any application.
The QueryBuilder will still require Ecto to work, but it won’t even be compiled if Ecto is not present in the application.
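
The lookup-with-fallback behaviour described above needs nothing beyond plain maps. The following is a minimal sketch of the idea and not the actual Trans API: the module name `TranslatorSketch`, the function `translate/4` and the default `:translations` container key are all assumptions made for illustration.

```elixir
defmodule TranslatorSketch do
  @moduledoc """
  Illustrative only: this is not the Trans API. It looks up a translated
  field in a plain map or struct and falls back to the untranslated value.
  """

  # `container` is the (assumed) key under which translations are stored.
  def translate(data, locale, field, container \\ :translations) do
    translations = Map.get(data, container) || %{}

    with %{} = for_locale <- Map.get(translations, to_string(locale)),
         value when not is_nil(value) <- Map.get(for_locale, to_string(field)) do
      value
    else
      # No translation for this locale or field: use the default value.
      _ -> Map.fetch!(data, field)
    end
  end
end

article = %{title: "Hello", translations: %{"es" => %{"title" => "Hola"}}}

IO.puts(TranslatorSketch.translate(article, :es, :title))
IO.puts(TranslatorSketch.translate(article, :fr, :title))
```

The first call finds the Spanish title, while the second has no French translation and falls back to the default `title`. Nothing here touches a database, which is the point of keeping this part independent of Ecto.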

Edit: I actually plan to support MySQL as well, since newer versions also have a JSON type. But there is an open issue in the Mariaex adapter to add support for this type that must be addressed first. I could look into it myself, but I would need some guidance on where to look first :sweat_smile:

5 Likes

I’ve released a new version of Trans :rocket:

Trans 1.1.0 makes Ecto an optional dependency.

This update addresses one of the main concerns about Trans since its inception: to leverage a database while remaining usable without one. The Trans.QueryBuilder component requires Ecto to work, but the Trans.Translator component can be used with any struct or map and does not require a database.

As usual, any comments, suggestions, issues or pull-requests are more than welcome!
Love :heart:

4 Likes

Trans 2.0 is out! :rocket:

This release of Trans is focused on improving the library interface and making it safer and more usable.
The Trans.QueryBuilder module has been completely rewritten. It now exposes the translated/3 macro that generates an SQL fragment that can be used when building Ecto queries.
The new translated/3 macro is compatible with all the functions and macros in Ecto.Query and Ecto.Query.API and provides safety checks against translations on nonexistent or non-translatable fields.
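
For readers unfamiliar with JSONB, here is a rough idea of the kind of SQL expression such a lookup boils down to in PostgreSQL. This is only a sketch: it assumes the translations live in a column named `translations`, keyed by locale and then by field, and it is not necessarily the exact fragment that translated/3 emits. The `JsonbPathSketch` module is hypothetical and only builds text for display; real queries should go through Ecto with bound parameters and never interpolate user input.

```elixir
defmodule JsonbPathSketch do
  @moduledoc "Illustrative only: builds the text of a PostgreSQL JSONB lookup."

  # In PostgreSQL, `->` returns a JSON value and `->>` returns it as text.
  # Never build real queries like this from untrusted input.
  def lookup_sql(column, locale, field) do
    "#{column}->'#{locale}'->>'#{field}'"
  end
end

condition = JsonbPathSketch.lookup_sql("translations", "es", "body") <> " ILIKE '%elixir%'"
IO.puts(condition)
# prints: translations->'es'->>'body' ILIKE '%elixir%'
```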

Compare how you would create a query with Trans 2.0 and before:

# Now: Trans 2.0
iex> Repo.all(from a in Article,
...> where: ilike(translated(Article, a.body, :es), "%elixir%"))

# Before: Trans 1.0
iex> Article
...> |> Trans.QueryBuilder.with_translation(:es, :title, "%République%", type: :like)
...> |> Repo.all

More detailed release notes can be found on GitHub. I also plan to publish an article soon explaining the changes and the reasoning behind them.

EDIT: the promised article about the changes in Trans 2.0 is now published on Medium :raised_hands:

As usual, any comments, suggestions, issues or pull-requests are more than welcome!
Love :heart:

1 Like

@belaustegui: Can you explain why you decided to keep all translated columns in one big jsonb?
After reading your description here I thought that you did something like:

defmodule Article do
  # use Ecto.Schema
  use Trans.Schema

  schema "articles" do
    trans_field :body, :string
    trans_field :title, :string
  end
end

so any trans_field is a separate map column.
What about performance when you have a big jsonb map (with lots of fields) and lots of records?

Do you validate language codes?

For your #12 and #14 issues: in some cases developers prefer to store locales in the database. For example, a table named locales could have the fields id, locale, fallback_locale_id, name and description.
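
As a rough sketch of how such a table could drive fallbacks, the rows can be walked through fallback_locale_id to build an ordered list of locales to try. Everything below is hypothetical (the `LocaleChainSketch` module, the sample rows, and the choice to stop on cycles); the rows are plain maps standing in for database records, with name and description left out for brevity.

```elixir
defmodule LocaleChainSketch do
  @moduledoc "Illustrative only: resolves a fallback chain from locale rows."

  # `locales` is a list of maps shaped like rows of the suggested table.
  def chain(locales, locale_code) do
    by_id = Map.new(locales, &{&1.id, &1})
    start = Enum.find(locales, &(&1.locale == locale_code))
    walk(start, by_id, [])
  end

  # No more fallbacks (or unknown locale): return the chain in lookup order.
  defp walk(nil, _by_id, acc), do: Enum.reverse(acc)

  defp walk(row, by_id, acc) do
    if row.locale in acc do
      # Guard against cycles in fallback_locale_id.
      Enum.reverse(acc)
    else
      walk(Map.get(by_id, row.fallback_locale_id), by_id, [row.locale | acc])
    end
  end
end

locales = [
  %{id: 1, locale: "en", fallback_locale_id: nil},
  %{id: 2, locale: "es", fallback_locale_id: 1},
  %{id: 3, locale: "ca", fallback_locale_id: 2}
]

IO.inspect(LocaleChainSketch.chain(locales, "ca"))
# prints: ["ca", "es", "en"]
```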

By the way, you still have the 2.0.0 milestone open even though version 2.0.0 is already released :smiley:

1 Like

Hi @Eiji , I really appreciate your comment.
I completely forgot about the milestone :sweat_smile:, it is closed now, thanks!

Your idea of trans_field looks really good indeed. Could you open an issue in the project so we can discuss it further?

I went with this approach for Trans because I wanted to port the hstore_translate gem to Elixir.
I share your concerns about the big jsonb field containing all the translations, in particular when fetching lots of data in queries. I’ve not been able to test Trans in any project with a high data volume. I’ve used the hstore_translate gem in a Ruby project, and there this approach was faster than keeping the translations in their own separate tables (the globalize approach). I think that this performance gain will still apply in Elixir.

I may perform a test comparing Trans performance versus having the translations separated in different tables. It would be a very interesting comparison :thinking:

Thank you very much!
Cheers.

1 Like

@belaustegui: I thought a lot about it today, but I still don’t have one good way. I read some articles about it today and there are lots of pros and cons for all cases (including not using jsonb). I will think more about it another time, but I will probably have more proposals for your API.

1 Like

I like the approach but it might be worth creating a trans_LANGUAGE column. Committing to multiple languages on a site isn’t a small task and this would separate each translation out into its own column.

You’d get a couple of benefits from that approach.

First, with updates. Variable sized columns that can store blobs have to worry about space reallocation on updates which can stress the database a good bit as it grows, especially with any significant update frequency. By having a column-per-language you’ll end up with multiple smaller fields that update less frequently on their own.

Second, if you know the translation that you want to get back, you can request it in the select rather than the entire JSONB for every language. If there are multiple fields for the variation, you’ll be able to easily query the whole set of translations on that row for that language with one field, rather than having to separate it out of the JSONB. It will also speed up parsing the JSONB by keeping the size consistent.

1 Like

I’ve published an article explaining the main changes, improvements and future plans for Trans.

If you want to have a taste of the improvements in Trans 2.0 you should read it!

2 Likes

@belaustegui: I don’t have much time, but I had a quick look at that article and one point is interesting:

Make translations an embedded schema and create a custom Ecto type.

I was thinking about it too, especially about a configuration that allows it to be used either as an embedded schema or as a normal association. I am thinking of something like that because, after looking through lots of answers, articles and comments, I couldn’t find a clear answer about which way is best. It depends on the data size and on which database functions the developer would like to use. For the simplest scenarios jsonb is recommended, but it’s not so simple in bigger and/or special projects. I don’t want to copy and paste too much; I just want to point out that it’s not easy to determine the best strategy, and if someone is going to share code then I think it should be as configurable as possible. What do you think about it?

1 Like

That looks pretty cool! I’m looking forward to it! :slight_smile:

1 Like

This is currently just an idea; I have to think about it more deeply.

On the one side it will require more work to set up translations, since users should create embedded schemas and specify which data they should contain. On the other side it would make translations safer by letting us specify valid fields, changesets, etc.
Since this would be a big change for Trans, it requires more thought and analysis. Some other issues should be fixed before addressing this.

I agree with you that it is not easy to determine the best strategy for content translation management. Currently there are different approaches provided by libraries other than Trans, such as ecto translate or translecto.
I also think that a library should have a single and clear concern. Users can then choose which approach fits best for their needs and pick the right library for it.

Thank you very much :bow: !

I mostly agree, except for situations where two libraries would share ~90% of their code. I think that in those cases developers prefer configuration over a higher library count. Finally, if you implement your plan then it will be easy to add a configuration option for my suggestion; otherwise a second library would be better.

Hi again! We got a new version of Trans!! :slight_smile:

The version 2.0.1 is a minor release which contains the following main changes:

  • Fixed some issues with documentation examples.
  • Use Ebert to check code quality.
  • Relax the dependency restrictions on Poison.

You can see the release notes on GitHub.

As usual, any comments, suggestions, issues or pull-requests are more than welcome!
Love :heart:

2 Likes