How do you model your nullable strings?

I try to avoid nullable strings, so I don’t have to handle an extra scenario:

%Email{subject: ""}
%Email{subject: "Hello"}
%Email{subject: nil} # not permitted

I thought it was cleaner to allow the string to be whatever length, with "" as the empty state. This way I never have to check whether it’s nil before interacting with it. But I’ve spent the last few hours fighting with how Ecto and its changesets interpret and transform empty strings (e.g. a change of name: "" is transformed to name: nil), so I’m curious what other people do.

Do you allow nil values, and deal with converting nil throughout your application?


That’s because Ecto deals with that on a completely different level. Ecto needs to support e.g. URL-encoded form data, which only supports string values. Therefore it needs to treat an empty string and “no input” the exact same way, because the encoding in which the data is sent doesn’t allow differentiating between the two.

You can customize the behaviour with the empty_values option on Ecto.Changeset.cast/4. The default behaviour makes sense for most use cases, where no submitted value means the value is missing. If you consider "" to not be a missing value then use the option to adjust the behaviour.
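As a rough illustration of that option, here is a minimal sketch using a schemaless changeset. It assumes Ecto 3.x (fetched with Mix.install/1 so the script stands alone); the field name and params are made up for the example.

```elixir
# Assumption: Ecto 3.x; Mix.install/1 fetches it for a standalone script.
Mix.install([{:ecto, "~> 3.10"}])

import Ecto.Changeset

# Schemaless changeset: {data, types}. Field name and params are illustrative.
data = %{}
types = %{subject: :string}
params = %{"subject" => ""}

# Default behaviour: "" counts as an empty value and is cast to nil.
# The data has no subject either, so no change is recorded.
default_cast = cast({data, types}, params, [:subject])

# With no empty values configured, "" is kept as an ordinary string.
keep_blank = cast({data, types}, params, [:subject], empty_values: [])

IO.inspect(default_cast.changes, label: "default")
IO.inspect(keep_blank.changes, label: "empty_values: []")
```

With the defaults the blank subject never shows up in changes; with empty_values: [] it is recorded as "".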

I’d however not consider "" a null value. It’s not the absence of a value, but a string like any other.


To add to @LostKobrakai’s response to this:

Even if you went and used "" to mean empty anyway, you’ll now just be checking for an empty string instead of nil, which doesn’t really solve anything. For example, you won’t be able to just do send_email(to: ""). Email is of course a bit of a strange example, since we usually make sure we have a validated email address before it gets into the system, but it applies to any nullable value. When it comes to presenting, in the scenarios where you just want to show a null value as nothing without any special messaging, <%= nil %> works just fine.

I’ve seen more than one person thrown off by this as well. So the idea that Ecto needs to behave this way is a bit strong. It is a design choice to behave in this manner because many of its users find it to be a good default. But it does throw off or confuse other users.

Yeah, though if I want to pass the string to a function, I now need to ensure there’s a base case for nil, e.g. when passing it to some regex match. HEEx and other stringifying options are generally fine: they’ll coerce it to an empty string.

Agreed. I like the idea of it for fields that always exist but permit empty (I think email subject is the best example for this), but it’s obviously a poor choice for something like user.homepage… in that case I’d expect either a validated URL or nil.

But it does take a fair amount of fighting to treat it differently…

validate_change will not run for nil values. validate_required treats empty strings as missing. cast converts empty strings to nil. So if I want to use empty strings the way I was, I’m really going against the current. This is what made me step back…generally when I’m fighting Ecto, it means I’m using it wrong.

So now your regex has to accommodate for empty string which hides the fact that you’re dealing with a nullable field. A nullable field is a nullable field! Ideally most of your fields aren’t but you don’t want to hide it when they are.

Very true.

So I’m guessing the way most people would address this is:

  • can it have an empty state? nullable / nil
  • otherwise: it should be a non-nil string of length >= 1, likely with additional validation, e.g. min-length for a subject, URI/email regex validation…

Ya, which is what Ecto Changesets are for! :slight_smile:

The other problem you have with "" for email is that now "" is a special case valid email which is going to cause more complicated checks. You either want to have a valid value or nil. That keeps the consuming code simple and mostly focused on the happy path.

Agreed for something like email address. But for something like email.subject, an empty value is valid. So by allowing null for that, I get the possibility of two empty values: nil and "". That’s really all I was trying to avoid with having some non-null strings. I thought if it can be empty, it’s easier to know it’s always a string, rather than a string OR a nil (another type). It would make a counter below a text field easier: I can just do String.length(email.subject), rather than String.length(email.subject || "").
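To make that trade-off concrete: String.length/1 raises when given nil, so a nullable subject needs either the `|| ""` fallback or an explicit nil clause. A minimal sketch, where SubjectCounter is a hypothetical helper and not something from the thread:

```elixir
defmodule SubjectCounter do
  # Hypothetical helper: counts characters, treating nil as "no text".
  def count(nil), do: 0
  def count(subject) when is_binary(subject), do: String.length(subject)
end

IO.inspect(SubjectCounter.count(""))
IO.inspect(SubjectCounter.count("Hello"))
IO.inspect(SubjectCounter.count(nil))
```

Pattern matching on nil keeps the nullable case visible at the call site instead of hiding it behind a fallback.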

But everyone (and the Ecto fight) is convincing me that I should model the bottom case as null, and deal with the few instances where I need to treat that field as a string.

Ohhhhh boy ok I gotta admit my brain sort of auto-corrected Email to User and subject to email, haha :grimacing:

So, rewinding a bit, I’m actually not super opposed to subject being non-nullable with a default of "". It could be argued that an email has three required fields: to, subject, and message, where the latter two have defaults of "". The user isn’t really saying, “I don’t yet know what the subject is,” they are essentially making a conscious decision to enter them as blank (I believe this is where your mind was going). In an email draft nil would make sense, but once it’s sent I think it’s ok to store an empty subject and message as "".

Not taking that draft point into consideration (and it’s a shaky one), you can easily get a blank string like this:

field :subject, :string, default: ""

I would enforce this in the db as well.

This is just my thought process, though, some may well disagree with me and I’m not even 100% sure I agree with me here :sweat_smile: Basically it’s one of those things I see both arguments for. And I do have default: "" in my projects but also plenty of nils. It’s all situational!

My understanding is this default only applies to new structs, it doesn’t do anything for change casts.

To prevent writing nil to the db I think you need to make a custom validator as well:

  # Adds an error for every listed field whose change is explicitly nil.
  defp validate_not_null(changeset, fields) do
    Enum.reduce(fields, changeset, fn field, changeset ->
      case {changed?(changeset, field), get_change(changeset, field)} do
        {true, nil} -> add_error(changeset, field, "can't be null")
        _ -> changeset
      end
    end)
  end

And then hopefully overriding empty_values to not include "" will keep the changeset from transforming empty strings to nil.

But it’s quite a bit of fighting against Ecto’s standard behavior.

This is not correct. validate_change only runs on changes, which means that setting a column that is already nil to nil (without forcing the change) is not considered a change. If you default a column to "" and set it to nil, it will be considered a change.

So this might just be it:

import Ecto.Changeset

{%{subject: ""}, %{subject: :string}}
|> cast(params, [:subject], empty_values: [nil])
|> validate_change(:subject, :required, fn
  field, nil -> [{field, "can't be empty"}]
  _, _ -> []
end)

validate_required is indeed a bit of an oddball in the set of validations, because it’s the only validation that doesn’t just look at changes, but also at the existing value in the base data of a changeset.

I mean, sure, you can call it a design decision, but “casting” in Ecto is explicitly the act of converting weakly typed representations of a value into the corresponding Elixir runtime value (see Ecto.Type.cast/1). That means there will be some tradeoffs to be made, and not every external input provides the same level of information. Ecto caters to a common type of input that provides the least amount of information: form data, where all you get is strings (or the lack thereof in some situations).

If you already have proper types, you can skip casting by using Ecto.Changeset.change or Ecto.Changeset.put_change.
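A minimal sketch of that route, again with a schemaless changeset and illustrative data, assuming Ecto 3.x:

```elixir
# Assumption: Ecto 3.x; Mix.install/1 fetches it for a standalone script.
Mix.install([{:ecto, "~> 3.10"}])

import Ecto.Changeset

# Illustrative data and types for a schemaless changeset.
data = %{subject: "Hello"}
types = %{subject: :string}

# change/2 takes already-typed values and records them without casting,
# so "" is not converted to nil.
via_change = change({data, types}, %{subject: ""})

# put_change/3 does the same for a single field.
via_put = {data, types} |> change() |> put_change(:subject, "")

IO.inspect(via_change.changes, label: "change/2")
IO.inspect(via_put.changes, label: "put_change/3")
```

Because no cast happens, empty_values never comes into play here; the flip side is that no type conversion or filtering of permitted fields happens either.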


Oof, I’m so confused now. If I use empty_values: [nil], then after casting %{subject: nil} I’ll have changes: %{subject: ""}. Then validate_change will be called with an empty string (so I actually don’t need it anymore; it looks like nil can’t make it through the cast).

  # Runs validate_change for each field and rejects an explicit nil change.
  defp validate_not_null(changeset, fields) do
    Enum.reduce(fields, changeset, fn field, changeset ->
      validate_change(changeset, field, fn
        field, nil -> [{field, "can't be empty"}]
        _, _ -> []
      end)
    end)
  end

Without adding empty_values: [nil], my changes are %{subject: nil}, and my validate_change is never called. I think that’s expected behaviour: `Ecto.Changeset.validate_change` and setting fields to `nil` - #3 by josevalim

Then I’d suggest updating ecto. Because on my end I get changes: %{subject: nil} with %{"subject" => nil} as input.

I was wrong: with empty_values set to [nil], Ecto does actually fall back to using the schema’s default value. So empty_values + default does exactly what I was looking for, without needing any custom validators.
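For anyone landing here later, this is roughly what that combination looks like. It is a sketch under assumptions: Ecto 3.x, and a hypothetical Email embedded schema (embedded only so the example needs no database).

```elixir
# Assumption: Ecto 3.x; Mix.install/1 fetches it for a standalone script.
Mix.install([{:ecto, "~> 3.10"}])

defmodule Email do
  use Ecto.Schema
  import Ecto.Changeset

  # Hypothetical schema; embedded so no database is required.
  embedded_schema do
    field :subject, :string, default: ""
  end

  def changeset(email, params) do
    # Only nil counts as empty, so "" is kept and nil falls back to the default.
    cast(email, params, [:subject], empty_values: [nil])
  end
end

existing = %Email{subject: "Hello"}

# nil is an empty value here, so it is replaced by the schema default "".
cleared = Email.changeset(existing, %{"subject" => nil})

# "" is no longer an empty value, so it is kept as-is.
blank = Email.changeset(existing, %{"subject" => ""})

IO.inspect(cleared.changes, label: "nil param")
IO.inspect(blank.changes, label: "blank param")
```

Both params end up as a change to "", so nil never reaches the field through cast. I’d still back it with a NOT NULL constraint in the database, as suggested earlier.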

Thanks everyone!

Perhaps you have some code that is causing this to be the case? You made me question myself so I set a random nullable string field to default: "" in a project, saved an existing record where it was already NULL and it got updated to "". Then I removed the default, saved it again and it got set back to NULL.

In any event, glad you found a solution!

Yep, I was wrong about this. Sorry this was a flurry of comments as I kept trying different things out.

Again, apologies for misreading your example! My first messages were written under the misconception that you were advocating for defaulting an email address to "" and I was a little concerned :sweat_smile: