Validate associations consistency

Hi :wave:

I have a question that I think is not necessarily related to ecto and phoenix themselves but more about application design. I need your help.

Let’s say my app model is:

  • User has one Portfolio
  • Portfolio belongs to User
  • Portfolio has many Brokers and many Transactions
  • Broker belongs to Portfolio
  • Transaction belongs to Portfolio and to Broker

Now my app has a form that allows user to creates new transaction, the portfolio_id of the transaction is known by the backend (derived from the logged-in User) so it is not required from the user to input it. The Broker on the other hand can be chosen by the User from a dropdown.

My question is, at which level should I have a check that ensure that the selected Broker actually belongs to the Portfolio?

The Broker dropdown is already pre-populated with Broker that belongs to the logged-in user’s portfolio. But this alone doesn’t prevent a hacker from changing the broker_id in the POST /transactions request, correct?

Related question, all models that are created by the user only holds a portfolio_id reference (that links back the user). Is it an okay practice? Or should I still store the user_id in addition of the portfolio_id?

Thanks in advance

My question is, at which level should I have a check that ensure that the selected Broker actually belongs to the Portfolio?

Yes, this should be enforced at some level to avoid records being created with brokers that are not part of a given portfolio.

I would typically validate anything that doesn’t require querying the repo to a changeset within the transaction schema file.

However, I think that ensuring valid brokers would require a query and I typically run those types of validations at the context level. Using an ecto multi, you can run a series of steps. The first might be to validate fields that don’t need to query the database via the changeset. The next step might then be to make a query and validate that the broker can make a transaction for the portfolio.

I do wonder if this sort of thing might be best handled at the database level via a check constraint or trigger though. I would be curious of others’ thoughts on that and, if so, how to go about writing such a constraint/trigger. That would seem to be the most bullet-proof method.

Related question, all models that are created by the user only holds a portfolio_id reference (that links back the user). Is it an okay practice? Or should I still store the user_id in addition of the portfolio_id?

I think that is probably okay to the extent that no records that might lead back to the user will ever be deleted.

2 Likes

I found that using ex_machina in combination with faker and manually building valid and invalid object graphs works best. It’s really easy to miss cast_assoc or assoc_constraint in our own Ecto changeset code so having a safety net in our own tests that makes sure that e.g. no user can lack a portfolio (and that portfolio has at least one broker) is really a bliss.

It honestly doesn’t matter. Just make sure that you and everyone in your team can find the test if they need to modify it.

Data duplication (what is often called denormalisation) is usually not okay but many people found it more useful to not preload four levels of associations and just prefer to duplicate the user_id.

It mostly depends on how many queries you would end up doing per single request if the schema was more normalised (i.e. only the direct descendants of the user in the object graph have a direct back-reference). If you find that a single visit to a page that is supposed to show 3 objects from the database ends up doing 10+ queries then you might duplicate the user_id here and there.

1 Like

late to your response but every time you mutate (insert, update, delete) data you should verify the integrity or validity of your data.

imagine a simple blog scenario where two different requirements in place, anyone can create a post but only certain authors can edit and only the status field is allowed to be edited.

in your Post schema

def create_changeset(post, attrs) do
	post
	|> cast(attrs, [:category_id, :status, .....])
	|> validate_required([:status, ....])
    |> unique_constraint(:slug) #<-- 1st business logic protection
	|> assoc_constraint([:category]) # <--------- must have category
    |> **extra_business_logic_validations()** # <--------- must have 'mustard'
end

def edit_changeset(post, attrs) do
	post
	|> cast(attrs, [:author_id, :status])
	|> validate_required([:status])
	|> assoc_constraint([:author]) # <---------
    |> validate_author_permission() # <---------
end

that will guarantee entity-level validation.

given you operate on entities through your context only

Blog.create_new_post() // has to have a valid categoy otherwise fails
Blog.update_post() // has to have a valid author with edit permission

All your entity logic is encapsulated around the entity and all other logic is in your context module with domain separation.

1 Like