Preventing error states in Elixir

laiboonh · September 11, 2020, 3:22am

It's good practice to code in such a way that it is impossible to create error states. We can do so using “smart constructors” such as new_user:

defmodule User do
  defstruct(~w[name age]a)

  defguard is_legal_age(age) when is_integer(age) and age < 120 and age > 0

  def new_user(name, age) when is_binary(name) and is_legal_age(age) do
    %User{name: name, age: age}
  end
end

However, this does not stop people from creating a struct directly, like so:

 %User{name: "John", age: -1}

Is there any way in Elixir to stop people from creating structs on their own, so that structs can only be created through the smart constructors?
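To make the gap concrete, here is a minimal sketch. The module name SmartUser and the tagged-tuple return value are assumptions for illustration; the original new_user raises a FunctionClauseError on bad input instead.

```elixir
defmodule SmartUser do
  defstruct [:name, :age]

  # Same bounds as the guard in the post: 0 < age < 120.
  defguard is_legal_age(age) when is_integer(age) and age > 0 and age < 120

  # Smart constructor: returns a tagged tuple instead of raising.
  def new(name, age) when is_binary(name) and is_legal_age(age) do
    {:ok, %__MODULE__{name: name, age: age}}
  end

  def new(_name, _age), do: {:error, :invalid_user}
end

# The constructor rejects bad input...
{:error, :invalid_user} = SmartUser.new("John", -1)

# ...but nothing stops a caller from building the struct without it.
bypassed = struct(SmartUser, name: "John", age: -1)
IO.inspect(bypassed.age)
```

The struct literal syntax bypasses the constructor in the same way as struct/2 does here.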

r8code · September 11, 2020, 4:29am

try these

fuelen · September 11, 2020, 10:23am

nope, only through code review

opsb · September 11, 2020, 12:43pm

I believe you can use dialyzer to enforce this using opaque types https://medium.com/erlang-battleground/help-dialyzer-help-you-94db66bfbc5a
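A rough sketch of what the opaque-type idea could look like for the User example. The module name OpaqueUser and the accessor age/1 are hypothetical. Note that @opaque changes nothing at runtime: it only gives Dialyzer grounds to report code outside the module that inspects or pattern matches on the struct's internals, and how much it reports depends on the Dialyzer version and the specs around the call site.

```elixir
defmodule OpaqueUser do
  defstruct [:name, :age]

  # Outside this module, Dialyzer treats t() as a black box.
  @opaque t :: %__MODULE__{name: String.t(), age: pos_integer()}

  # Smart constructor: the only intended way to obtain a t().
  @spec new(term(), term()) :: {:ok, t()} | {:error, :invalid_user}
  def new(name, age)
      when is_binary(name) and is_integer(age) and age > 0 and age < 120 do
    {:ok, %__MODULE__{name: name, age: age}}
  end

  def new(_name, _age), do: {:error, :invalid_user}

  # Accessor, so callers never need to look inside the struct.
  @spec age(t()) :: pos_integer()
  def age(%__MODULE__{age: age}), do: age
end
```

This is still a tooling check rather than a hard guarantee: code that ignores Dialyzer can construct the struct as before.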

laiboonh · September 11, 2020, 1:16pm

Thanks. I think this could be a viable solution, but after reading the blog I still do not see how it can be used in my scenario. Do enlighten me if you know how. I am going to read up on the typespec docs to see if I can figure something out.

chulkilee · September 13, 2020, 3:10pm

For what?

“You” and “they” here are not meant personally. Let’s say you’re the owner of the struct and its validation.

If you want to stop a bad value from being passed to “your” function (e.g. at a boundary), you can use guards on the functions to enforce it at runtime.
If you want to stop a bad value from being created in their functions, that’s their fault. You don’t care as long as “they” don’t pass it to you. It’s their bug.
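The first point can be sketched as follows. The module UserOps and both functions are hypothetical; the idea is only that each function guards the fields it actually depends on.

```elixir
defmodule UserOps do
  # Only needs the name, so a bad age does not matter here.
  def hello(%{name: name}) when is_binary(name), do: "Hello, " <> name

  # Depends on the age, so the guard rejects illegal ages at runtime.
  def adult?(%{age: age}) when is_integer(age) and age > 0 and age < 120 do
    age >= 18
  end
end
```

Calling UserOps.adult?/1 with an age of -1 raises a FunctionClauseError, while UserOps.hello/1 accepts the same value without complaint.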

You can find such struct creation in the AST. In this example you can see :%{}. Maybe this could be part of your own compiler step?

Code.string_to_quoted("%Data{}")
{:ok,
 {:%, [line: 1], [{:__aliases__, [line: 1], [:Data]}, {:%{}, [line: 1], []}]}}

Code.string_to_quoted("Data.utc_now()")
{:ok,
 {{:., [line: 1], [{:__aliases__, [line: 1], [:Data]}, :utc_now]}, [line: 1],
  []}}
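A minimal sketch of such a check, assuming a hypothetical module StructLiteralFinder. It walks the quoted form with Macro.prewalk/3 and collects every struct literal written with an explicit alias. It does not cover %__MODULE__{} or structs built through struct/2, and it also reports update syntax such as %Data{d | x: 1}.

```elixir
defmodule StructLiteralFinder do
  # Returns a list of {module, line} for each `%Alias{...}` found in the source.
  def find(source) when is_binary(source) do
    ast = Code.string_to_quoted!(source)

    {_ast, found} =
      Macro.prewalk(ast, [], fn
        # Same node shape as in the output above: {:%, meta, [alias, map]}
        {:%, meta, [{:__aliases__, _, parts}, {:%{}, _, _}]} = node, acc ->
          {node, [{Module.concat(parts), Keyword.get(meta, :line)} | acc]}

        node, acc ->
          {node, acc}
      end)

    Enum.reverse(found)
  end
end
```

A custom check could run this over each source file and reject literals for a protected struct that appear outside its defining module.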

One sneaky solution is… to add a dummy enforced key which is unlikely to be given in such a case:

defmodule Foo do
  @enforce_keys [:_]
  defstruct [:name, :year, :_]
end

%Foo{}
# ** (ArgumentError) the following keys must also be given when building struct Foo: [:_]
#     expanding struct: Foo.__struct__/1

olivermt · September 13, 2020, 3:17pm

The shorter version of the above poster's answer is that you use structs for documentation and default values in a structured format.

Everything else should be validated as part of the data pipelining that is the essence of functional programming.

Such validation could live in the defining module, like we do for Ecto changesets.

The idea of object self-validation should stay in OOP land and be ignored in Elixir.

laiboonh · September 14, 2020, 1:06am

Hmm, I disagree. Ideally we should adopt good software engineering practices whether it's OOP, FP, or whateverP.

tfwright · September 14, 2020, 1:45am

I don’t understand this. I understand the idea that it should be impossible for client input to create error states, but you seem to be saying “good practice” requires making bugs impossible.

Maybe I’m being naive, but how is that not the only possible answer here? One can add typespecs, or tests, or do QA, or what have you, but surely these are all matters of process and not the code itself? All of these certainly make bugs less likely, but it is certainly impossible to prevent someone with commit access from introducing a bug regardless. To try would seem to me to by definition lead to overly defensive programming, which itself is not good practice.

mbuhot · September 14, 2020, 3:44am

That’s a pretty common approach in statically typed FP, e.g. google for “making illegal states unrepresentable” and there are posts like this one

Since we don’t have those typing tools in Elixir, I like to consider the distinction between states and values. Any module can construct an invalid value - but it’s mostly harmless until that value is incorporated into the state of the system by storing it in a database or similar. For state that we want to protect, we can put a process around it. Now the only way to change state is by sending messages to the owning process, which can validate all incoming data to ensure integrity.
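A minimal sketch of that idea using a GenServer. The module UserStore and its message shapes are assumptions for illustration. Callers can build whatever values they like, but only validated data ends up in the process state.

```elixir
defmodule UserStore do
  use GenServer

  # Client API
  def start_link(opts \\ []), do: GenServer.start_link(__MODULE__, %{}, opts)
  def put(server, name, age), do: GenServer.call(server, {:put, name, age})
  def get(server, name), do: GenServer.call(server, {:get, name})

  # Server callbacks
  @impl true
  def init(state), do: {:ok, state}

  @impl true
  def handle_call({:put, name, age}, _from, state)
      when is_binary(name) and is_integer(age) and age > 0 and age < 120 do
    {:reply, :ok, Map.put(state, name, age)}
  end

  # Invalid data is rejected and the state is left untouched.
  def handle_call({:put, _name, _age}, _from, state) do
    {:reply, {:error, :invalid_user}, state}
  end

  def handle_call({:get, name}, _from, state) do
    {:reply, Map.fetch(state, name), state}
  end
end
```

After UserStore.start_link/1, a call such as UserStore.put(pid, "Jane", -1) returns {:error, :invalid_user} and a later UserStore.get(pid, "Jane") returns :error.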

fuelen · September 14, 2020, 7:11am

My post was only about forcing people to use smart-constructors, not about preventing error states in Elixir at all.

chulkilee · September 14, 2020, 8:24am

Here is my alternative explanation:

In any language, it is impossible (or impractical) to prevent a bad value from being built.
The bad value becomes a problem when side effects happen because of it (e.g. calling a database, calling APIs, updating a cache…).
Elixir makes it easy to find where side effects happen, and naturally encourages organizing code around the data flow, not around the data itself (like an “object” in OOP).
Once you use such a pattern, it is easy to place “validation” in the right place.

For example, if you put the validation around the data, you’ll end up with one giant module with lots of validation for all different purposes, like Rails fat models!

Going back to the author of this post… @laiboonh

Why do we want to make it impossible for a user struct to have a negative integer for age? Only because it doesn’t make sense? The real reason is that “it is incorrect input for functions using the age”. It is totally fine to pass such a value to functions that use only the name, isn’t it?

Example - Date.

There is Date.new/4, and the Date module doc says:

Developers should avoid creating the Date structs directly and instead rely on the functions provided by this module as well as the ones in third-party calendar libraries.

But it is possible for any code to create such a value. How does Date handle it?

# you can create an invalid value
%Date{year: 1, month: 0, day: -1}

# some operations do not care about the bad value
%Date{year: 1, month: 0, day: -1} |> Date.to_erl()
{1, 0, -1}

# it is validated when needed
%Date{year: 1, month: 0, day: -1} |> Date.add(1)
# ** (ArgumentError) invalid date: 0001-00--01
# (elixir 1.10.4) lib/calendar/iso.ex:1331: Calendar.ISO.ensure_day_in_month!/3
# (elixir 1.10.4) lib/calendar/iso.ex:521: Calendar.ISO.date_to_iso_days/3
# (elixir 1.10.4) lib/calendar/date.ex:544: Date.add/2
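For comparison, the constructor route reports the problem as a value rather than an exception. The sketch below assumes a hypothetical helper DateCheck.validate/1 that re-runs the checks of Date.new/4 on a struct that may have been built by hand.

```elixir
defmodule DateCheck do
  # Feeds the struct's fields back through Date.new/4, which returns
  # {:ok, date} or {:error, :invalid_date}.
  def validate(%Date{year: year, month: month, day: day, calendar: calendar}) do
    Date.new(year, month, day, calendar)
  end
end

# A date built through the module passes the check.
{:ok, _date} = DateCheck.validate(~D[2020-09-14])

# The hand-built value from above is reported as invalid.
{:error, :invalid_date} = DateCheck.validate(%Date{year: 1, month: 0, day: -1})
```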

Please note that I’m not saying static typing is meaningless. There are some benefits (e.g. not only catching certain errors at compile time, but potentially better performance in certain cases by avoiding expensive runtime validation).

However, in Elixir (or dynamically typed FP) you can run such expensive validation only when needed, at the exact time, instead of relying on “magic behind encapsulation” (e.g. you can do it when you transform API inputs before passing them to your domain module), so you can keep your world “clean” easily.
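A sketch of that boundary step, assuming a hypothetical UserParams module and string-keyed params as they might arrive from an HTTP request. The input is parsed and validated once, and the domain code behind it only ever sees the cleaned-up map.

```elixir
defmodule UserParams do
  # Turns raw, untrusted params into a validated map for the domain layer.
  def parse(%{"name" => name, "age" => age}) when is_binary(name) and is_binary(age) do
    case Integer.parse(age) do
      # Accept only a fully consumed integer within 0 < n < 120.
      {n, ""} when n > 0 and n < 120 -> {:ok, %{name: name, age: n}}
      _ -> {:error, :invalid_age}
    end
  end

  def parse(_params), do: {:error, :invalid_params}
end
```

With this in place, UserParams.parse(%{"name" => "John", "age" => "-1"}) returns {:error, :invalid_age}, and the domain module never has to re-check the age.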

laiboonh · September 14, 2020, 8:55am

Good example shown via the Date module. I guess that’s the way Elixir handles this.

If validation happens during data construction, it runs only once each time the data is constructed. Having validation happen every time the data is used is less efficient. But having said that, in FP, data mutation is handled by creating a new instance of the data, so I guess it's more or less the same number of validation calls in the end.

tfwright · September 14, 2020, 2:32pm

In that case I think it was truer than you intended then

I didn’t realize this question was implicitly about static typing. I’m not really sure I see the distinction between an “illegal state” and a bug, and if there is one, what it could possibly mean to render such a thing impossible. Leaving that issue aside, my guide for validating state in non-statically typed languages is to think in terms of clients and servers. Servers should validate messages from the client, but they should not validate internal messages. If I have an application that needs an email input, I might want to validate its presence or format, but once I’ve accepted it, I’m not going to add logic to validate it again every time it is passed from one part of the application to another (tests, documentation, and I guess, types/type specs are another matter).

Obviously there is a lot of gray area here. I think it is usually best to consider the DB as a server to the extent that it should apply at least some validations to data, even if that data has already been validated by the application, especially if there are other DB clients (which, of course, there always are, unless the app is the only client with write access). Conversely, I have seen API+SPAs where the FE is littered with unnecessary validations on API responses to prevent invalid data from being displayed to the user. In those cases, I think it’s much better to let the FE render fail and show a generic message to the user.

mbuhot · September 14, 2020, 11:45pm

Just for fun - I think you can enforce smart constructors by hiding the internal state inside a closure

https://gist.github.com/mbuhot/3ff150eee164cd8e73e3db799d1ea1fe

opaque_user.ex

defmodule Opaque.User do
  @moduledoc """
  Demonstrates a ridiculous method for enforcing that a type remains in a valid state using a closure and internal secret
  """
  
  alias __MODULE__

  # Ensure that the internal closure can only be called from within this module
  @update_key :crypto.strong_rand_bytes(8)

This file has been truncated.

NobbZ · September 15, 2020, 5:03am

Now that you say it, there is even a library for that:

IvanR · November 15, 2021, 12:20am

One option is to use Domo to validate structs. It generates the smart constructor and the ensure function with all the necessary guards from the @type t(), taking the associated preconditions into account, like the following:

defmodule User do
  use Domo

  defstruct name: "", age: 0

  @type t :: %__MODULE__{name: String.t(), age: age()}

  @type age :: non_neg_integer()
  precond age: &(&1 < 120)
end

john = User.new!(name: "John", age: 25)
%User{age: 25, name: "John"}

User.ensure_type(%{john | age: 800})
{:error,
 [
   age: "Invalid value 800 for field :age of %User{}. \
Expected the value matching the non_neg_integer() type. \
And a true value from the precondition function \"&(&1 < 120)\" defined for User.age() type."
 ]}