Ecto Postgres Index vs Unique Index

Owens · September 19, 2020, 9:50pm

Hello all,

Does anyone know if you create a unique_index in postgres for two fields on a table, does that mean you DON’T need to create an additional index on those two tables?

e.g.

create unique_index(:projects, [:organization_id, :name])

means you DON’T need to do create index(:projects, [:organization_id])

create unique_index(:team_users, [:team_id, :user_id])

means you DON’T need to do

create index(:teams_users, [:team_id])
create index(:teams_users, [:user_id])

AND if you already setup those indexes before the unique_indexes, should you drop/delete those indexes before migrating the unique_indexes?

benwilson512 · September 20, 2020, 12:31am

It does not mean that. The index will only be used if you query both of those fields. If you query only one of them, it cannot use that index. If you plan to query the table by those fields individually then you also want indices on those fields.

Owens · September 20, 2020, 12:38am

Interesting, thanks for the answer.

Does that mean where you already have an index like create index(:projects, [:organization_id])

But you want to make sure that every project within an organization has a unique name, you should add a unique constraint instead of a unique index?

But with joins tables like :teams_users you should create a unique_index on the :team_id and :user_id rather than an index on each individually?

benwilson512 · September 20, 2020, 12:40am

Unique constraints will create a unique index in the background anyway, it’s the only way to enforce the constraint in a performant manner.

Basically, create unique indices to enforce uniqueness rules. Also, create indices to reflect query patterns. If those two reasons to create indices happen to overlap, great! But it isn’t a big deal if they don’t, and they’re both perfectly valid reasons to create an index.

Owens · September 20, 2020, 12:43am

Thank you again, one follow-up question to this answer.

Unique constraints create a unique index in the background, does that mean one should also create a constraint to enforce uniqueness rules in addition to a unique index?

benwilson512 · September 20, 2020, 1:02am

As far as I can tell there isn’t any practical difference, however I think from a clarity perspective it is better to use constraints to enforce uniqueness. Just know that you don’t need to make an index for it, one will be created already.

Owens · September 20, 2020, 1:07am

Thanks, and apologies as I think my last question may not make sense. I was referring to unique constraint within postgres and not in the Ecto schema.

But it looks to me like unique constraint like in the above style is not implemented in Ecto Migration and it just uses unique_index?

Thanks for all your help.

kip · September 20, 2020, 1:14am

With some material limitations, Postgres can use a leading subset of multi-column indexes. From the documentation:

A multicolumn B-tree index can be used with query conditions that involve any subset of the index’s columns, but the index is most efficient when there are constraints on the leading (leftmost) columns. The exact rule is that equality constraints on leading columns, plus any inequality constraints on the first column that does not have an equality constraint, will be used to limit the portion of the index that is scanned. Constraints on columns to the right of these columns are checked in the index, so they save visits to the table proper, but they do not reduce the portion of the index that has to be scanned. For example, given an index on (a, b, c) and a query condition WHERE a = 5 AND b >= 42 AND c < 77 , the index would have to be scanned from the first entry with a = 5 and b = 42 up through the last entry with a = 5. Index entries with c >= 77 would be skipped, but they’d still have to be scanned through. This index could in principle be used for queries that have constraints on b and/or c with no constraint on a — but the entire index would have to be scanned, so in most cases the planner would prefer a sequential table scan over using the index.

But as Ben says, not appropriate to apply unique constraints.

benwilson512 · September 20, 2020, 1:28am

Yup so was I, in fact ecto unique constraints rely on postgres constraints, they don’t replace them.

amnu3387 · September 20, 2020, 12:51pm

From what I understood from this (Multicolumn indexes pg 9.6 (but same up to 12) the indexes can be used when the query involves columns from the index, going from the left to right.

In this case the second index on organization_id doesn’t give you anything extra since the leftmost column of the unique index you created before, is that same column.

Here you wouldn’t need a team_id index, but you would need a user_id if you are planning on querying the teams_users table solely by user_id (without team_id that is).
The team_id being the leftmost column declared on the unique_index would be used when a query containing a team_id column is made.

Theoretically this applies to indexes with more than 2 columns too.
E.g.

unique index on a, b, c
Means that pg can use the index on a query containing:
a
a,b
and a,b,c

(though it still goes through the planner ofc, so it might not use it, but that wouldn’t be because it can’t, at least this is my reading from the docs, it also means that the order in which you declare the columns for the index plays a role too)

sorentwo · September 20, 2020, 1:18pm

As noted above, you can use one or more parts of a compound index, from the left to the right. It can be a fantastic space and performance saving tool. In fact, that’s why Oban only declares two indexes for all of the standard operations.

Here is the “biggest” one, which composes 5 fields:

github.com

sorentwo/oban/blob/master/lib/oban/migrations.ex#L417


    add_if_not_exists(:tags, {:array, :string})
  end

  alter table(:oban_jobs, prefix: prefix) do
    modify :priority, :integer, default: 0
    modify :tags, {:array, :string}, default: []
  end

  drop_if_exists index(:oban_jobs, [:queue, :state, :scheduled_at, :id], prefix: prefix)

  create_if_not_exists index(:oban_jobs, [:queue, :state, :priority, :scheduled_at, :id],
                         prefix: prefix
                       )

  v2_oban_notify(prefix)

  record_version(prefix, 8)
end

def down(prefix) do
  drop_if_exists index(:oban_jobs, [:queue, :state, :priority, :scheduled_at, :id],