Ecto padding fixed length fields with white space

I am running into an issue where Ecto seems to be padding white space to :char fields when the length of the text is less than the allotted size. For instance I have a languages table where the primary key is the language code. Most of the time this is two characters, but it can be 3. Here is the relevant portion of the migration script.

create table(:languages, primary_key: false) do
  add(:code, :char, primary_key: true, size: 3)
  add(:name, :string, null: false)
  add(:alternate_names, :map, null: false, default: %{})
  timestamps()
end

And the schema:

@primary_key {:code, :string, autogenerate: false}
@foreign_key_type :string
@derive {Phoenix.Param, key: :code}

schema "languages" do
  field(:name, :string)
  field(:alternate_names, :map)

  timestamps()
end

Inserting records seems to work as expected.

myapp_dev=# select code, LENGTH(code), name from languages where code = 'en';
 code | length |  name   
------+--------+---------
 en   |      2 | English
(1 row)

myapp_dev=# select code, LENGTH(code), name from languages where code = 'arz';
 code | length |      name       
------+--------+-----------------
 arz  |      3 | Egyptian Arabic
(1 row)

When I load the struct, two letter codes contain a trailing space. Notice the code: "en " bit.

iex(21)> en = Myapp.Repo.get(Myapp.Language, "en")
[debug] QUERY OK source="languages" db=0.7ms queue=1.0ms idle=1415.7ms
SELECT l0."code", l0."name", l0."alternate_names", l0."inserted_at", l0."updated_at" FROM "languages" AS l0 WHERE (l0."code" = $1) ["en"]
↳ anonymous fn/4 in :elixir.eval_external_handler/1, at: src/elixir.erl:312 
%Myapp.Language{
  __meta__: #Ecto.Schema.Metadata<:loaded, "languages">,
  code: "en ",
  name: "English",
  alternate_names: %{},
  inserted_at: ~N[2023-09-05 00:24:39],
  updated_at: ~N[2023-09-05 00:24:39]
}

Is there any way to avoid this padding?

Thanks in advance!

What database are you using?

If you’re using Postgres this makes sense as it “stores and displays” the char type with the padding. If you are indeed using pg, then there really isn’t any good reason reason to use char. The only difference between the two is essentially that padding, there are no performance improvements. I would use varchar with a size limit and this will solve your problem:

create table(:table) do
  add :field, :string, size: 10
end

This will make field varchar(10).

If you are using any other db then sorry for taking up your time!

2 Likes

It is stored with 3 characters it’s just that Postgres treats them as insignificant. An excerpt from their docs:

Values of type character are physically padded with spaces to the specified width n , and are stored and displayed that way. However, trailing spaces are treated as semantically insignificant and disregarded when comparing two values of type character .

So Ecto is giving you the real data. But Postgres is treating 'en ’ and ‘en’ as the same when doing your query in psql. Also like the above user said char is used explicitly when you want padding. If padding is not desired then varchar is the way to go.

reference: PostgreSQL: Documentation: 15: 8.3. Character Types

3 Likes

Yes, I am using Postgres. Your suggested change has removed the padding. My DB admin knowledge is a little weak. The idea of using string keys is a bit icky to me, so I thought using a char field might be better suited for this.

string is just the generic “type” that Ecto uses. The database adapter—in this case, postgrex—will choose the most appropriate type under the hood. Of course, in Postgres, there is very little difference between using varchar and text and it’s often preferable to just use text. There’s a decent answer about it here. I did just find that now and only skimmed it, it’s just one of those things I’ve known for a long time and can’t explain very well myself, lol. I still use vachar for things I know are going to be small anyway as a holdover from Rails’ simple_form introspecting the db to know whether to make a <input type="text"> or a <textarea> (which is not a good reason).

3 Likes

I’m coming from this in the other direction. My database development skills are substantially stronger than my Elixir skills.

Regarding strings as keys. Strings can make completely sensible keys especially where your using a “natural key” design which you seem to be doing here (natural keys are the part of the actual attributes of thing the record describes which can uniquely identify it). I wouldn’t sweat it unless the strings are rather “large”; what I mean by large here are strings in the several hundred or more characters per string
 and even then my concern is less to do with performance issues such as data size but more because it would raise some red flags about how “key-like” the data really is; I would start to look for other design issues at that point. Of course as data gets bigger at some point you should expect noticeable time & space performance issues using that data in key contexts, but most sensible keys don’t ever have to worry about that in any but the very most demanding applications.

As @sodapopcan suggests, on the PostgreSQL side I simply use the (effectively) unbounded text data type when writing string data including for key usage; even in cases where I could assign a reasonable limit in the database I will still use text; I favor enforcing any string size limits in the application over the database. The reason for this has mostly to do with the costs of changing the database later, when it’s used for a production application with live data. Changing the database schema can involve things like interruptions to data availability or require similar specialized handling. Changing length enforcement in the application is just changing the acceptable length value in code (unless you’re shortening the acceptable length); the only downside is if your application implements data changing functionality in different functions/methods/etc
 then you have to change it in multiple places
 but that’s an application design issue which should be managable.

4 Likes

This is extremely helpful.