Key Prefix Scan in ETS

Is it possible to perform a prefix scan on keys using ETS? Like searching for all keys that start with a string “myprefix” using a query like “myprefix*”.

Another question is; is it possible to search key terms, by providing a partial key (acts similar to prefix scan)? For example a query like {“part1”} returns rows with keys {“part1”, “part2”} and {“part1”, “]art2”, “part3”}.

1 Like

You can use matchspecs and :ets.select to sort of do this.

For your “string” prefix you need to use charlists (because they are lists)

defmodule M do

  def setup() do
    :ets.new(:t, [:public, :named_table])
  end

  def insert(key, value) do
    :ets.insert(:t, {key, value})
  end

  def prefix_search(prefix) do
    :ets.select(:t, [{{prefix, :_}, [], [:"$_"]}])
  end

end


M.setup()
M.insert('hello', 'world')
M.insert('hey', 'joe')
M.insert('other', 'key')

IO.inspect(M.prefix_search([?h, ?e, ?l | :_]))
IO.inspect(M.prefix_search([?h, ?e | :_]))

Which gives the result.

$ elixir tst.exs
[{'hello', 'world'}]
[{'hey', 'joe'}, {'hello', 'world'}]

This does not work with elixir strings which are binaries.

The example can be transformed to match your second question:

defmodule M do

  def setup() do
    :ets.new(:t, [:public, :named_table])
  end

  def insert(tuple) do
    :ets.insert(:t, tuple)
  end

  def part_search(first_part) do
    ms = [{:"$1", [{:"==", first_part, {:element, 1, :"$1"}}, {:">=", {:size, :"$1"}, 2}], [:"$_"]}]
    :ets.select(:t, ms)
  end

end

M.setup()
M.insert({"part1", "part2", "somevalue"})
M.insert({"part1", "]art2", "part3", "somevalue"})
M.insert({"part4", "]art5", "part6", "somevalue"})

IO.inspect(M.part_search("part1"))
IO.inspect(M.part_search("part4"))

The tricky thing is to write proper matchspecs. Here is the specification: http://erlang.org/doc/apps/erts/match_spec.html

3 Likes

It’s not as simple as ETS, but mnesia_leveldb claims to be efficient for prefix-matching like your second question.

Thanks!

In the second sample it returns only the second record for “part1”, while both first and second records should be part of the result.

The first sample works perfectly! Are there any implications regarding performance? Having some ten millions of records.

The proposed prefix search solution using select is a good one - one thing to note is that if you want to do this, you probably want to use :ordered_set tables. Since they are implemented as trees, they can optimise and special-case prefix searches on keys. That works both with lists and with tuples - using keys like {"prefix", value} or similar will probably yield the most efficient solution.

5 Likes

Thanks @michalmuskala!

In case someone else comes here, :ets.fun2ms(...) is a very useful tool which converts a function to match_specs and makes writing queries far easier.

1 Like

Could you perhaps show an example of using :ets.fun2ms to generate a match spec that is a prefix search? I always have trouble finding ets searching examples that match what I’m trying to do.

Like this?

iex(1)> :ets.fun2ms(fn {:prefix_in_tuple, x} -> x end)
[{{:prefix_in_tuple, :"$1"}, [], [:"$1"]}]
1 Like

Yes. It is because I messed up :smiley:

The tuple I inserted have “part1” as key and therefore the second insert replaces the first input.

I find it easier to use lists than tuples for this. Then you can do:

defmodule M2 do

  def setup() do
    :ets.new(:t2, [:public, :named_table])
  end

  def insert(key, value) do
    :ets.insert(:t2, {key, value})
  end

  def prefix_search(prefix) do
    :ets.select(:t2, [{{prefix, :_}, [], [:"$_"]}])
  end

end


M2.setup()
M2.insert(["part1", "part2"], "somevalue")
M2.insert(["part1", ">part2", "part3"], "othervalue")
M2.insert(["part4", "part5"], "nomatchvalue")

IO.inspect(M2.prefix_search(["part1" | :_]))
IO.inspect(M2.prefix_search(["part4" | :_]))

But it is possible for tuples as well. If the tuples are of the same size it is easy:

ms = [{{{"part1", :_, :_}, [], [:"$_"]}]

If they are not I need to work on a working example :smiley:

Ok. so the match-spec for matching the first element of various sized tuples should be:

[{{:"$1", :_}, [{:==, {:element, 1, :"$1"}, "part1"}], [:"$_"]}]
2 Likes

Also useful is :ets.test_ms to test that your match-specs actually work. Something which I should’ve used in my examples

4 Likes

What @OvermindDL1 presented is the core of it. In addition to that it is possible to use guards:

iex(1)> p = :ets.fun2ms(fn ({{1,_,x,_},_} = rec) when 3 < x and x < 7 -> rec end)
[
  {{{1, :_, :"$1", :_}, :_}, [{:andalso, {:<, 3, :"$1"}, {:<, :"$1", 7}}],
   [:"$_"]}
]
3 Likes

Sorry for dragging an old thread up from the depths of … something, but I’m a bit stumped here.

I have these keys in Cachex:

iex(10)> Cachex.keys(:query)
{:ok,
 [
   {:list, "projects_assets", "0771CEF4"},
   {:list, "projects_assets", "07380C0D"},
   {:list, "projects_tags", "044FEDED"},
   {:list, "projects_projects", "07D9A0A2"}
 ]}
iex(11)>

I’m trying to get all with {:list, "projects_tags", _} but I’m failing:

iex(11)> :ets.select(:query, [{{:list, "projects_tags", :_}, [], [:"$_"]}])
[]
iex(12)> :ets.select(:query, [{{:list, :_, :_}, [], [:"$_"]}])
[]

A straight through query works:

:ets.select(:query, [{:"$1", [], [:"$1"]}])
[
  {:entry, {:list, "assets_assets", "0771CEF4"}, 1606923373175, 900000,
   [...

From what I read in this thread I thought this could be possible? Thanks for any advice! :slight_smile:

Well it helped just writing it out it seems. My error was that the format stored in Cachex is different:

:ets.select(:query, [{{:entry, {:list, "projects_tags", :_}, :_, :_, :_}, [], [:"$_"]}])

That did the job :slight_smile:

1 Like