Putting unaccent in existing query

Hi,

how do i add unaccent (already works on psql server) to the following query:

|> where( [a, b], ilike(a.title, ^“%#{search_string}%”) or ilike(b.name, ^“%#{search_string}%”)

I know something with fragment. But not sure how to add it. Would really appreciate any help here.

thanks

Hello Max,

I am having trouble understanding your questions. Do you mean escaping? Do you want to escape the search string?

Could you provide us with a small example of what you are looking for?

1 Like

Sorry for not beeing clear. For example: I have a user Horváth. Now if i search the name by typing Horvath ( a instead of á) there is no result. To ignore the accent postgres can use the extension unaccent.

SELECT * FROM users WHERE unaccent(lastname) = unaccent(‘Horváth’)

Now i don´t know how to combine ilike with unaccent.

Thanks for looking into it.

Would something like this be what you are looking for ?

No. But thank you for your suggention. Might come handy another time. I need a query with unaccent and ilike together.

To be honest, this will probably result in a linear scan of the table. Therefore I’d add an indexed column, which contains the unaccented name while not using like and friends unless necessary.

Thanks. But i don´t get it where the problem lies. The raw postgres query is somehow simple. Besides from ilike i tried the following but get an error that there is no function unaccent:

|> where(
      [a],
      fragment("unaccent(?)", ^search_string) == fragment("unaccent(?)", a.lastname)
    )

A second column would only be the last ( very last :slight_smile: ) option

But a solution that can benefit from an index and therefore make the query returning much faster…

Aside of that, please take a look at the generated query, does it look like you expect it? Does it work if you send that query to the DB directly?

You are absolutly right with the index and the performance. But it is not so relevant here.

But …

Got it working :slight_smile:

Here for anybody might looking for the same thing:

    |> where(
      [a],
      fragment("unaccent(?) ILIKE unaccent(?)", a.lastname, ^"%#{search_string}%")
    )

For note, just unaccenting won’t find slight misspelling and similar phenomes and so forth. If you are using PostgreSQL (as stated above) then you really should use it’s language lookup system (tsvector’s and all). It is significantly more powerful, more useful, and will also be faster than using ILIKE.

Also, you really don’t want to accept user string when using ILIKE/LIKE as it is quite possible to craft some exponentially bad lookups via it that can bring your database to a crawl or death.

A quick no-index setup but significantly more useful and safe than ILIKE while returning much better results would be something like changing your example into:

    |> where(
      [a],
      fragment("to_tsvector(?) @@ to_tsquery(?)", a.lastname, ^search_string)
    )

And there are tons of options on how to control it as well, but the defaults are perfectly fine for 95% of cases.

2 Likes