Putting unaccent in existing query

Max · April 29, 2019, 11:43am

Hi,

how do i add unaccent (already works on psql server) to the following query:

|> where( [a, b], ilike(a.title, ^“%#{search_string}%”) or ilike(b.name, ^“%#{search_string}%”)

I know something with fragment. But not sure how to add it. Would really appreciate any help here.

thanks

Fl4m3Ph03n1x · April 29, 2019, 3:57pm

Hello Max,

I am having trouble understanding your questions. Do you mean escaping? Do you want to escape the search string?

Could you provide us with a small example of what you are looking for?

Max · April 30, 2019, 8:15am

Sorry for not beeing clear. For example: I have a user Horváth. Now if i search the name by typing Horvath ( a instead of á) there is no result. To ignore the accent postgres can use the extension unaccent.

SELECT * FROM users WHERE unaccent(lastname) = unaccent(‘Horváth’)

Now i don´t know how to combine ilike with unaccent.

Thanks for looking into it.

Fl4m3Ph03n1x · April 30, 2019, 8:35am

Would something like this be what you are looking for ?

Max · April 30, 2019, 8:38am

No. But thank you for your suggention. Might come handy another time. I need a query with unaccent and ilike together.

NobbZ · April 30, 2019, 9:44am

To be honest, this will probably result in a linear scan of the table. Therefore I’d add an indexed column, which contains the unaccented name while not using like and friends unless necessary.

Max · April 30, 2019, 9:57am

Thanks. But i don´t get it where the problem lies. The raw postgres query is somehow simple. Besides from ilike i tried the following but get an error that there is no function unaccent:

|> where(
      [a],
      fragment("unaccent(?)", ^search_string) == fragment("unaccent(?)", a.lastname)
    )

A second column would only be the last ( very last ) option

NobbZ · April 30, 2019, 10:10am

But a solution that can benefit from an index and therefore make the query returning much faster…

NobbZ · April 30, 2019, 10:15am

Aside of that, please take a look at the generated query, does it look like you expect it? Does it work if you send that query to the DB directly?

Max · April 30, 2019, 10:41am

You are absolutly right with the index and the performance. But it is not so relevant here.

But …

Got it working

Here for anybody might looking for the same thing:

    |> where(
      [a],
      fragment("unaccent(?) ILIKE unaccent(?)", a.lastname, ^"%#{search_string}%")
    )

OvermindDL1 · April 30, 2019, 2:46pm

For note, just unaccenting won’t find slight misspelling and similar phenomes and so forth. If you are using PostgreSQL (as stated above) then you really should use it’s language lookup system (tsvector’s and all). It is significantly more powerful, more useful, and will also be faster than using ILIKE.

Also, you really don’t want to accept user string when using ILIKE/LIKE as it is quite possible to craft some exponentially bad lookups via it that can bring your database to a crawl or death.

A quick no-index setup but significantly more useful and safe than ILIKE while returning much better results would be something like changing your example into:

    |> where(
      [a],
      fragment("to_tsvector(?) @@ to_tsquery(?)", a.lastname, ^search_string)
    )

And there are tons of options on how to control it as well, but the defaults are perfectly fine for 95% of cases.