How to prevent LIKE-injections

Hello, I have read the Ecto query and I see this line:

You should be very careful when allowing user sent data to be used as part of LIKE query, since they allow to perform LIKE-injections.

and Im afraid of this line because I don’t know Postgress without Ecto and how can I preventLIKE-injections, Do I need to sanitize the like input like use regex? or my sample code doesn’t need anything and it is right?

my code:

def search_codes(sub_brand_id, pagenumber, search_term) do
    search_string = "%#{search_term}%"
    query = from u in ErrorSchema,
        join: c in assoc(u, :error_brands),
        join: j in assoc(u, :error_sub_brands),
        join: g in assoc(u, :error_categories),
        where: u.status == true,
        where: u.sub_brand_id == ^sub_brand_id,
        where: ilike(u.title, ^search_string),
        or_where: ilike(u.error_code, ^search_string),
        order_by: [desc: u.inserted_at],
        select: %{
          id: u.id,
          title: u.title,
          short_description: u.short_description,
          seo_alias_link: u.seo_alias_link,
          brand_image: c.image,
          brand_title: c.title,
          brand_id: c.id,
          category_title: g.title,
          category_id: g.id,
          model_title: j.title,
          model_id: j.id
        }
    Repo.paginate(query, %{page: pagenumber, page_size: 20})
  end

ref: https://hexdocs.pm/ecto/Ecto.Query.API.html#like/2

2 Likes

There’s a link in the docs describing the problem and an example solution in ruby, which basically does a regex replace of /([\%_])/ with \\1.

2 Likes

AFAIK the vulnerability lies not in an attacker being able to craft arbitrary SQL, since the input value is being quoted by Ecto, but some LIKE patterns can be quite heavy on the database and cause DoS if used maliciously.

Have a look here: https://github.blog/2015-11-03-like-injection/

In short, you might want to quote or strip the wildcard characters before passing them to Ecto.

8 Likes

is there sample code with elixir ?, why do 2 /([\%_])/ with \\1 regex exist ? regex has a input!! :thinking:

Plus in most cases LIKE is the wrong tool for the job, postgresql has a fantastic language-knowing full-text search system that is easy to use and much better and faster than LIKE. :slight_smile:

6 Likes

I wanted yo use full-text search, but Ecto has no function for this!! and I don’t know how to write postgress

Ah but Ecto does, that’s the fragment call. :slight_smile:

Like I have a line in my code of fragment("to_tsvector(?) @@ to_tsquery(?)", ...args...), pretty simple overall, and postgres’ docs on it are fantastic. If you need speed don’t forget to add an index for it. :slight_smile:

6 Likes

You can create a migration for adding a tsv field to your table, as in the following example:

defmodule Your.Repo.Migrations.AddTsVectorsToUsers do
  use Ecto.Migration

  def change do
    alter table("users") do
      add :full_name_tsv,      :tsvector
    end

    execute "UPDATE users SET full_name_tsv = to_tsvector('english', COALESCE(full_name, ''))"
    create index(:users, [:full_name_tsv], using: :gin)

    execute """
    CREATE TRIGGER users_name_tsv_trigger BEFORE INSERT OR UPDATE
    ON users FOR EACH ROW EXECUTE PROCEDURE
    tsvector_update_trigger(full_name_tsv, 'pg_catalog.english', full_name);
    """
  end
end

This will create a tsv field and populate it from the existing field full_name and also index it. It then creates a trigger so that whenever you insert or update an existing record, it recomputes the ts vector.

Then you can define a macro, e.g.:

defmacro tsquery(field, text) do
    quote do
      fragment("?::tsvector @@ to_tsquery('english', ?)", unquote(field), unquote(text))
    end
end

I also have a helper function to split text input into a tsvector sequence, which also replaces invalid characters (only for my use case you might need a different set)

@spec split_names_for_tsquery(String.t()) :: String.t()
def split_names_for_tsquery(text) do
    String.split(text, " ", trim: true)
    |> Enum.reject(fn(text) -> Regex.match?(~r/\(|\)\[|\]\{|\}/, text) end)
    |> Enum.map(fn(token) ->  token <> ":*" end)
    |> Enum.intersperse(" & ")
    |> Enum.join
end

And I use it, similarly to:

import HelperModule, only: [tsquery: 2, split_names_for_tsquery: 1]
#....
n_text = split_names_for_tsquery(text)

Users
|> where([a], tsquery(a.full_name_tsv, ^n_text))
|> order_by([a], asc: a.full_name)
|> limit(^limit_results)
14 Likes

would you mind showing your schema and cast table for add :full_name_tsv, :tsvector ? pleas

I don’t manipulate the tsv field at the app level so I just don’t cast it in the change sets, since I set the trigger at the db level to re-update the full_name_tsv.
For the field type in the schema definition you need to create a simple wrapper. In my case I don’t do anything at all with it.

defmodule TSVectorType do
  @behaviour Ecto.Type

  def type, do: :tsvector

  def cast(tsvector), do: {:ok, tsvector}
  def load(tsvector), do: {:ok, tsvector}
  def dump(tsvector), do: {:ok, tsvector}
end

And then,
field :full_name_tsv, TSVectorType

3 Likes

LIKE and the Postgres fulltext search have some important differences, you should carefully consider what you actually need. The fulltext search processes the text and e.g. by default removes the ending of words (stemming) and also removes certain special characters.

You can speed up LIKE queries with a trigram index, speed is not necessarily a reason to use the fulltext search. It really depends on what kind of searches you want to allow.

7 Likes

I hear the warnings around the dangers of like injection.

I figured out how to alternatively write the queries I need for these cases using Ecto fragments and PostgreSQL’s own string functions like position() and lower().

However, it would be nice to able to use like/ilike… Can anybody see any remaining risks if we sanitized the user input with the following function (using @LostKobrakai’s proposed approach above, plus sanitizing for nil)?

sanitized_user_input = Regex.replace(~r/^$|([\%_])/, raw_user_input, "[\\1]")

1 Like

The regexes in this thread are still vulnerable. They fail to sanitise escape character \.

1 Like

So, everyone is cooking their own flawed escape code instead of centralizing this on ecto :eyes:

For now i’m going with

String.contains?(user_query, ["\\", "%", "_"])
... raise Ecto.NoResultsError, queryable: "unsafe redacted"

:person_shrugging: but mostly because I am just temporarily using ILIKE and will move to SIMILARITY soon.

in context:

  def search(user_query) do
    case String.contains?(user_query, ["\\", "%", "_"]) do # TODO: move to app config in case you change db backend?
      true -> raise Ecto.NoResultsError, queryable: "unsafe redacted"
      false ->
        name_pattern = "#{user_query}%"
        Content
        |> where(ilike("name", ^name_pattern))
        |> limit(10)
        |> Repo.all
    end
  end
2 Likes

This thread reminds me of a similar question I asked about Perl back in 2016 that never got a good answer. Quoting the comment from there

A little surprised this is not supported out of the box. I’m wondering if there are driver-specific issues that make this difficult to do.

For something similar to the Rails approach would work like this:

  @doc """
  This escapes the default SQL escape character `\\`, and any wildcard characters (`%` and `_`)
  so they can be used in a `LIKE` statement.

  This may not be perfect and is based off of the Rails implementation and the MySQL docs.

  - https://apidock.com/rails/v4.2.1/ActiveRecord/Sanitization/ClassMethods/sanitize_sql_like
  - https://dev.mysql.com/doc/refman/8.0/en/string-comparison-functions.html#operator_like
  """
  def sanitize_sql_like(string, escape_character \\ "\\") do
    {:ok, pattern} =
      [escape_character, "%", "_"]
      |> Enum.map(&Regex.escape/1)
      |> Enum.join("|")
      |> Regex.compile()

    String.replace(string, pattern, fn x -> escape_character <> x end)
  end

There’s half a dozen other ways to do it, but this way should work fine, and supports the ability to pass a custom escape character so if you need to do that as part of your SQL query you can, which PostgreSQL and MySQL support (string LIKE pattern [ESCAPE escape-character])

It’s even worse than missing the fact that the backslash in the pattern wasn’t escaped (ie. doubled) so ends up just escaping the %, the backslashes in the replacement (\\1 ) also needed to be escaped/doubled: as it stands the ruby (and elixir based on it) not only replaces just % and _ but it replaces them with themselves :slight_smile:

@felix-starman ftw here for accurately reproducing Rails’s version, shame we don’t have a Regex.union

1 Like