Is it possible to filter an ETS table with a contains or a starts_with?

I’m implementing a search feature for data that’s backed by an ETS table. Given the following:

:ets.new(:nodes, [:named_table])
:ets.insert(:nodes, {1, "Fred"})
:ets.insert(:nodes, {2, "Frederich"})
:ets.insert(:nodes, {3, "Barney"})

Is there a way to filter the second column with a starts_with or a contains? To demonstrate the sentiment of what I’d like to do (in spite of it not being a valid guard clause):

match_spec = :ets.fun2ms(fn {id, name} when String.contains?(name, "Fred") ->
    {id, name}
end)

:ets.select(:nodes, match_spec)

Is this a case where my only option is to traverse the table?


binary_part is allowed in matchspecs, so in principle you could use that for the starts_with search. That won’t help with contains, though…
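As a rough sketch of the binary_part idea, the match spec can be written by hand instead of through fun2ms. The module and function names here (NodePrefix, starts_with/2) are made up for illustration, and this assumes an OTP release recent enough to accept binary_part in a match spec guard. Rows whose name is shorter than the prefix make binary_part fail, which in a match spec guard simply means the row does not match.

```elixir
defmodule NodePrefix do
  # Hypothetical helper, not part of ETS: selects {id, name} rows whose
  # name starts with `prefix`, using binary_part/3 inside the guard.
  def starts_with(table, prefix) when is_binary(prefix) do
    size = byte_size(prefix)

    match_spec = [
      {
        # Head: bind id to $1 and name to $2.
        {:"$1", :"$2"},
        # Guard: the first `size` bytes of the name equal the prefix.
        # If the name is too short, binary_part fails and the row is skipped.
        [{:==, {:binary_part, :"$2", 0, size}, prefix}],
        # Body: return the whole matched object.
        [:"$_"]
      }
    ]

    :ets.select(table, match_spec)
  end
end
```

With the table from the question, NodePrefix.starts_with(:nodes, "Fred") should give back the rows for ids 1 and 2 (in no particular order, since the table is a :set).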

While it’s faster than many other approaches, :ets.select is still going to be time-consuming if the table is large; it’s like always doing a sequential scan in a full-size DBMS.
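For contains, where no guard is available, the scan has to be spelled out as an explicit traversal. A minimal sketch with :ets.foldl, assuming rows shaped like {id, name} as in the question (NodeSearch is a hypothetical module name):

```elixir
defmodule NodeSearch do
  # Hypothetical helper: walks every row of the table and keeps those
  # whose name contains `substring`. This is a full scan, so the cost
  # grows linearly with the size of the table.
  def contains(table, substring) when is_binary(substring) do
    :ets.foldl(
      fn {_id, name} = row, acc ->
        if String.contains?(name, substring), do: [row | acc], else: acc
      end,
      [],
      table
    )
  end
end
```

The order of the returned rows is not specified, so sort the result if the order matters.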

The solution for ETS parallels the DBMS one: create additional tables that speed up important queries. For instance, you can make “starts with” queries efficient if you maintain a second :bag table that maps every prefix of each name to that row’s id, like:

{"F", 1}
{"Fr", 1}
{"Fre", 1}
{"Fred", 1}
{"F", 2}
{"Fr", 2}
{"Fre", 2}
{"Fred", 2}
{"Frede", 2}
{"Freder", 2}
{"Frederi", 2}
{"Frederic", 2}
{"Frederich", 2}
{"B", 3}
{"Ba", 3}
{"Bar", 3}
{"Barn", 3}
{"Barne", 3}
{"Barney", 3}

Like any index, this trades memory usage for retrieval speed.
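A minimal sketch of maintaining and querying such a prefix table might look like this. PrefixIndex and its functions are hypothetical names, and the sketch ignores deletes and renames, which would also have to remove the old prefix entries:

```elixir
defmodule PrefixIndex do
  # Hypothetical helper module for the secondary :bag table.

  # A :bag allows many {prefix, id} objects under the same key.
  def new, do: :ets.new(:prefix_index, [:bag])

  # "Fred" -> ["F", "Fr", "Fre", "Fred"]; an empty name yields [].
  def prefixes(name) do
    name
    |> String.graphemes()
    |> Enum.scan(fn grapheme, acc -> acc <> grapheme end)
  end

  # Call this alongside every insert into the main table.
  def insert(index, id, name) do
    :ets.insert(index, Enum.map(prefixes(name), fn prefix -> {prefix, id} end))
  end

  # A single key lookup instead of a scan; returns the matching ids.
  def starts_with(index, prefix) do
    index
    |> :ets.lookup(prefix)
    |> Enum.map(fn {_prefix, id} -> id end)
  end
end
```

The ids that come back can then be looked up in the main table with :ets.lookup/2.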

Indexing for “contains” is harder, but there are well-documented techniques like trigram indexing that can help.
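To give a feel for the trigram approach, here is a simplified sketch. It assumes the main table holds {id, name} rows, and TrigramIndex is a hypothetical module name. Every three-character window of a name is stored in a :bag table; a query intersects the id sets of its own trigrams to get candidates, then confirms each candidate with String.contains?, since sharing trigrams does not guarantee a real match. Queries shorter than three characters have no trigrams, so they fall back to a full scan.

```elixir
defmodule TrigramIndex do
  # Hypothetical helper module; a sketch, not a tuned implementation.

  def new, do: :ets.new(:trigram_index, [:bag])

  # "Fred" -> ["Fre", "red"]; strings under three graphemes yield [].
  def trigrams(string) do
    string
    |> String.graphemes()
    |> Enum.chunk_every(3, 1, :discard)
    |> Enum.map(&Enum.join/1)
    |> Enum.uniq()
  end

  # Call this alongside every insert into the main table.
  def insert(index, id, name) do
    :ets.insert(index, Enum.map(trigrams(name), fn tri -> {tri, id} end))
  end

  # Returns the {id, name} rows of `nodes` whose name contains `query`.
  def contains(nodes, index, query) do
    case trigrams(query) do
      [] ->
        # Query too short to index: fall back to scanning the main table.
        nodes
        |> :ets.tab2list()
        |> Enum.filter(fn {_id, name} -> String.contains?(name, query) end)

      tris ->
        # Candidates must appear under every trigram of the query.
        candidate_ids =
          tris
          |> Enum.map(fn tri ->
            index |> :ets.lookup(tri) |> MapSet.new(fn {_tri, id} -> id end)
          end)
          |> Enum.reduce(&MapSet.intersection/2)

        # Verify each candidate against the real name.
        for id <- candidate_ids,
            {_id, name} = row <- :ets.lookup(nodes, id),
            String.contains?(name, query),
            do: row
    end
  end
end
```

Like the prefix table, this costs extra memory and extra work on every write, so it only pays off when “contains” searches are frequent and the table is large.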
