How to prevent LIKE-injections

Hello, I have read the Ecto query and I see this line:

You should be very careful when allowing user sent data to be used as part of LIKE query, since they allow to perform LIKE-injections.

and Im afraid of this line because I don’t know Postgress without Ecto and how can I preventLIKE-injections, Do I need to sanitize the like input like use regex? or my sample code doesn’t need anything and it is right?

my code:

def search_codes(sub_brand_id, pagenumber, search_term) do
    search_string = "%#{search_term}%"
    query = from u in ErrorSchema,
        join: c in assoc(u, :error_brands),
        join: j in assoc(u, :error_sub_brands),
        join: g in assoc(u, :error_categories),
        where: u.status == true,
        where: u.sub_brand_id == ^sub_brand_id,
        where: ilike(u.title, ^search_string),
        or_where: ilike(u.error_code, ^search_string),
        order_by: [desc: u.inserted_at],
        select: %{
          id: u.id,
          title: u.title,
          short_description: u.short_description,
          seo_alias_link: u.seo_alias_link,
          brand_image: c.image,
          brand_title: c.title,
          brand_id: c.id,
          category_title: g.title,
          category_id: g.id,
          model_title: j.title,
          model_id: j.id
        }
    Repo.paginate(query, %{page: pagenumber, page_size: 20})
  end

ref: https://hexdocs.pm/ecto/Ecto.Query.API.html#like/2

There’s a link in the docs describing the problem and an example solution in ruby, which basically does a regex replace of /([\%_])/ with \\1.

1 Like

AFAIK the vulnerability lies not in an attacker being able to craft arbitrary SQL, since the input value is being quoted by Ecto, but some LIKE patterns can be quite heavy on the database and cause DoS if used maliciously.

Have a look here: https://github.blog/2015-11-03-like-injection/

In short, you might want to quote or strip the wildcard characters before passing them to Ecto.

5 Likes

is there sample code with elixir ?, why do 2 /([\%_])/ with \\1 regex exist ? regex has a input!! :thinking:

Plus in most cases LIKE is the wrong tool for the job, postgresql has a fantastic language-knowing full-text search system that is easy to use and much better and faster than LIKE. :slight_smile:

5 Likes

I wanted yo use full-text search, but Ecto has no function for this!! and I don’t know how to write postgress

Ah but Ecto does, that’s the fragment call. :slight_smile:

Like I have a line in my code of fragment("to_tsvector(?) @@ to_tsquery(?)", ...args...), pretty simple overall, and postgres’ docs on it are fantastic. If you need speed don’t forget to add an index for it. :slight_smile:

3 Likes

You can create a migration for adding a tsv field to your table, as in the following example:

defmodule Your.Repo.Migrations.AddTsVectorsToUsers do
  use Ecto.Migration

  def change do
    alter table("users") do
      add :full_name_tsv,      :tsvector
    end

    execute "UPDATE users SET full_name_tsv = to_tsvector('english', COALESCE(full_name, ''))"
    create index(:users, [:full_name_tsv], using: :gin)

    execute """
    CREATE TRIGGER users_name_tsv_trigger BEFORE INSERT OR UPDATE
    ON users FOR EACH ROW EXECUTE PROCEDURE
    tsvector_update_trigger(full_name_tsv, 'pg_catalog.english', full_name);
    """
  end
end

This will create a tsv field and populate it from the existing field full_name and also index it. It then creates a trigger so that whenever you insert or update an existing record, it recomputes the ts vector.

Then you can define a macro, e.g.:

defmacro tsquery(field, text) do
    quote do
      fragment("?::tsvector @@ to_tsquery('english', ?)", unquote(field), unquote(text))
    end
end

I also have a helper function to split text input into a tsvector sequence, which also replaces invalid characters (only for my use case you might need a different set)

@spec split_names_for_tsquery(String.t()) :: String.t()
def split_names_for_tsquery(text) do
    String.split(text, " ", trim: true)
    |> Enum.reject(fn(text) -> Regex.match?(~r/\(|\)\[|\]\{|\}/, text) end)
    |> Enum.map(fn(token) ->  token <> ":*" end)
    |> Enum.intersperse(" & ")
    |> Enum.join
end

And I use it, similarly to:

import HelperModule, only: [tsquery: 2, split_names_for_tsquery: 1]
#....
n_text = split_names_for_tsquery(text)

Users
|> where([a], tsquery(a.full_name_tsv, ^n_text))
|> order_by([a], asc: a.full_name)
|> limit(^limit_results)
9 Likes

would you mind showing your schema and cast table for add :full_name_tsv, :tsvector ? pleas

I don’t manipulate the tsv field at the app level so I just don’t cast it in the change sets, since I set the trigger at the db level to re-update the full_name_tsv.
For the field type in the schema definition you need to create a simple wrapper. In my case I don’t do anything at all with it.

defmodule TSVectorType do
  @behaviour Ecto.Type

  def type, do: :tsvector

  def cast(tsvector), do: {:ok, tsvector}
  def load(tsvector), do: {:ok, tsvector}
  def dump(tsvector), do: {:ok, tsvector}
end

And then,
field :full_name_tsv, TSVectorType

2 Likes

LIKE and the Postgres fulltext search have some important differences, you should carefully consider what you actually need. The fulltext search processes the text and e.g. by default removes the ending of words (stemming) and also removes certain special characters.

You can speed up LIKE queries with a trigram index, speed is not necessarily a reason to use the fulltext search. It really depends on what kind of searches you want to allow.

5 Likes

I hear the warnings around the dangers of like injection.

I figured out how to alternatively write the queries I need for these cases using Ecto fragments and PostgreSQL’s own string functions like position() and lower().

However, it would be nice to able to use like/ilike… Can anybody see any remaining risks if we sanitized the user input with the following function (using @LostKobrakai’s proposed approach above, plus sanitizing for nil)?

sanitized_user_input = Regex.replace(~r/^$|([\%_])/, raw_user_input, "[\\1]")

The regexes in this thread are still vulnerable. They fail to sanitise escape character \.

1 Like