Partial matching ETS data

Is there a way to do the equivalent of the following with ETS?

:ets.fun2ms(fn {_id, m} 
when String.starts_with?(m.name, "charles") 
or String.starts_with?(m.mobile, "0809133") 
or String.starts_with?(m.email, "charles") -> 
m end)

I know we cannot use String.starts_with?, or similar functions, in a when clause, but is there a way to do the equivalent of this?


iex> :ets.fun2ms(fn {_id, m} when m.name == "charles" or m.mobile == "0809133" or m.email == "charles" -> m end)

[{{:"$1", :"$2"}, [{:orelse, {:orelse, {:==, {:map_get, :name, :"$2"}, "charles"}, {:==, {:map_get, :mobile, :"$2"}, "0809133"}}, {:==, {:map_get, :email, :"$2"}, "charles"}}], [:"$2"]}]

The above attempt only handles exact string matches, not partial matches like SQL's LIKE clause.

Thanks.

I found this on the mailing list, [erlang-questions] Binary pattern matching on ETS

It doesn’t look like ETS supports binary pattern matching.

It’s a bit dense, but you can also review the match spec, Erlang -- Match Specifications in Erlang
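If prefix checks can't go in the guard, one fallback is to scan the whole table and filter in ordinary Elixir code, where String.starts_with? is available. Here is a minimal sketch of that, assuming records shaped like {id, map} as in the question. The module name EtsScan and the function starts_with_any/2 are made up for illustration, and this is a full table scan, so it only makes sense for small tables.

```elixir
defmodule EtsScan do
  # Hypothetical helper, not part of ETS.
  # `filters` is a keyword list such as [name: "charles", mobile: "0809133"].
  # Returns every map where at least one listed field starts with its prefix.
  def starts_with_any(table, filters) do
    :ets.foldl(
      fn {_id, m}, acc ->
        if matches_any?(m, filters), do: [m | acc], else: acc
      end,
      [],
      table
    )
  end

  defp matches_any?(m, filters) do
    Enum.any?(filters, fn {field, prefix} ->
      value = Map.get(m, field)
      # Skip missing or non-string fields instead of raising.
      is_binary(value) and String.starts_with?(value, prefix)
    end)
  end
end
```

Usage would look like EtsScan.starts_with_any(table, name: "charles", mobile: "0809133", email: "charles"). The order of the results is not defined for a set table.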


Can you split out the part of the value you want to match on insert or is it dynamic?

It’s dynamic. Think of a report page where users can select a bunch of filters to apply.

Answering a “starts with” query like that efficiently (i.e. without a full table scan) in Postgres requires a B-tree index; with ETS you’ll need to construct something similar yourself.
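To make the "construct something similar yourself" idea concrete, here is a rough sketch of a secondary index kept in an ordered_set table. It relies on two things: binaries that share a prefix sort next to each other in Erlang term order, and :ets.next/2 on an ordered_set accepts a key that is not in the table. The module PrefixIndex and its functions are hypothetical names, and the sketch assumes ids are non-negative integers. You would keep one such table per searchable field and update it alongside the main table.

```elixir
defmodule PrefixIndex do
  # Hypothetical helper, not part of ETS.
  def new, do: :ets.new(:prefix_index, [:ordered_set, :public])

  # The key is {value, id} so the same value can be indexed for several ids.
  def put(table, value, id) when is_binary(value) and is_integer(id) and id >= 0 do
    :ets.insert(table, {{value, id}})
  end

  # Returns the ids whose indexed value starts with `prefix`, in key order.
  def ids_starting_with(table, prefix) when is_binary(prefix) do
    # {prefix, -1} sorts just before any {value, id} with value >= prefix,
    # given the assumption that real ids are never negative.
    collect(table, :ets.next(table, {prefix, -1}), prefix, [])
  end

  defp collect(_table, :"$end_of_table", _prefix, acc), do: Enum.reverse(acc)

  defp collect(table, {value, id} = key, prefix, acc) do
    if String.starts_with?(value, prefix) do
      collect(table, :ets.next(table, key), prefix, [id | acc])
    else
      # Matching keys are contiguous, so the first miss ends the walk.
      Enum.reverse(acc)
    end
  end
end
```

The walk only touches keys that match (plus one), so the cost depends on the number of hits rather than the table size. It only covers "starts with" queries, not "contains".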

The Efficiency Guide goes into some more detail about this approach.

If you are feeling ambitious, you could create a probabilistic data structure like a Bloom filter or xor_filter for each entry’s possible matches, e.g. by adding ['c', 'ch', 'cha', 'char', 'charl', 'charle', 'charles'], then iterate over each filter to check whether the partial match is in the filter. This only covers very simple matching, but if your use case doesn’t need complex filters or ranking matches by ‘best match’, it could work. There is a small chance of false positives. I’ve used this before for very simple first + last name matches for ~2k static values and it worked well enough.
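For what it's worth, a toy version of the per-entry filter idea could look like the sketch below. This is not a tuned Bloom filter and not any particular library's API: the module name PrefixBloom, the 256-bit size and the 3 hash positions are arbitrary assumptions, and the filter is just an integer used as a bitmask with :erlang.phash2/2 as the hash.

```elixir
defmodule PrefixBloom do
  # Hypothetical toy Bloom filter; sizes are arbitrary assumptions.
  import Bitwise

  @bits 256
  @hashes 3

  # Builds a filter containing every prefix of `string`,
  # e.g. "c", "ch", "cha", ... for "charles".
  def new(string) when is_binary(string) do
    string
    |> String.graphemes()
    |> Enum.scan(fn grapheme, acc -> acc <> grapheme end)
    |> Enum.reduce(0, fn prefix, filter -> add(filter, prefix) end)
  end

  # true means "possibly a prefix" (false positives can happen),
  # false means "definitely not a prefix".
  def maybe_prefix?(filter, candidate) when is_binary(candidate) do
    Enum.all?(positions(candidate), fn pos ->
      ((filter >>> pos) &&& 1) == 1
    end)
  end

  defp add(filter, item) do
    Enum.reduce(positions(item), filter, fn pos, acc -> acc ||| (1 <<< pos) end)
  end

  # Derives @hashes bit positions by hashing the item with different seeds.
  defp positions(item) do
    for seed <- 1..@hashes, do: :erlang.phash2({seed, item}, @bits)
  end
end
```

You would store the filter next to each entry and call PrefixBloom.maybe_prefix?(filter, "char") for every entry, so it is still a scan, just a cheap one. Because of the false positives, confirm hits with String.starts_with? if exactness matters.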
