Best way to keep read-only table in memory

How many values are we talking here (order of magnitude)?

We’re talking about 15K records

How does your current implementation work? Structs don’t store data in a way that is accessible from different processes.

The list of structs is in a GenServer. I did that precisely to allow concurrent access

One option is to have two tables: one that stores k1 -> v1, and another that stores the k1 -> k1 and k2 -> k1 mappings. You do a lookup in table 2 first to get the canonical key, and then a lookup in table 1 to get the real value.
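
A minimal sketch of that two-step lookup with ETS. The module name `TwoKeyLookup`, the table names `:records` and `:aliases`, and the sample data are all made up for illustration; the tables are owned by whichever process calls `setup/0`.

```elixir
defmodule TwoKeyLookup do
  # Hypothetical module and table names, for illustration only.
  def setup do
    # Table 1: canonical key -> value.
    :ets.new(:records, [:named_table, :set, :protected, read_concurrency: true])
    # Table 2: any key (canonical or alternate) -> canonical key.
    :ets.new(:aliases, [:named_table, :set, :protected, read_concurrency: true])

    :ets.insert(:records, {:k1, %{name: "first record"}})
    :ets.insert(:aliases, [{:k1, :k1}, {:k2, :k1}])
    :ok
  end

  # Resolve the canonical key first, then fetch the real value.
  def fetch(key) do
    with [{^key, canonical}] <- :ets.lookup(:aliases, key),
         [{^canonical, value}] <- :ets.lookup(:records, canonical) do
      {:ok, value}
    else
      _ -> :error
    end
  end
end
```

After `TwoKeyLookup.setup()`, both `TwoKeyLookup.fetch(:k1)` and `TwoKeyLookup.fetch(:k2)` return the same `{:ok, value}` tuple, and an unknown key returns `:error`.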

Yes, that’s what I clumsily tried to explain before:
"Obviously, I can use a bit of brute force, and have a 2nd ETS table with the 2nd key as key, get the 1st key, and then query the 1st table. Kind of a reverse index table."

And my question still is whether I should use something like Amnesia instead. I think that will allow me to more easily support queries on other columns that are not unique keys but that group data nevertheless.

axelson thinks Cachex is a better option for me, but I don’t see it. I still need to support queries on different columns.

If you need in-memory multi-column (non-key) lookups, you’re going to start making things pretty complex. No matter what you do, if you want to avoid linear scans you’ll need to maintain secondary indices. Adding in Mnesia won’t really change that, nor would something fancy like an in-memory SQLite table.

15k items isn’t really all that many. You could easily just have 1 table for canonical key -> value pairs, and then N tables, one per column you want to query on, containing value -> canonical key pairs.
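
As a rough sketch of that layout, assuming the rows are maps and the indexed columns may hold repeated values (hence `:bag` tables for the indexes). The module name `IndexedTable` and its function names are hypothetical.

```elixir
defmodule IndexedTable do
  # Hypothetical sketch: one primary table plus one :bag index table per column.
  def build(rows, key, index_columns) do
    # Primary table: canonical key -> full row.
    primary = :ets.new(:primary, [:set, read_concurrency: true])
    :ets.insert(primary, Enum.map(rows, &{Map.fetch!(&1, key), &1}))

    # Index tables: column value -> canonical key (duplicate values allowed).
    indexes =
      for column <- index_columns, into: %{} do
        table = :ets.new(:index, [:bag, read_concurrency: true])
        :ets.insert(table, Enum.map(rows, &{Map.fetch!(&1, column), Map.fetch!(&1, key)}))
        {column, table}
      end

    %{primary: primary, indexes: indexes}
  end

  # Look up canonical keys in the index, then fetch each row from the primary table.
  def by_column(%{primary: primary, indexes: indexes}, column, value) do
    for {_value, id} <- :ets.lookup(Map.fetch!(indexes, column), value),
        [{^id, row}] <- [:ets.lookup(primary, id)] do
      row
    end
  end
end
```

With `tables = IndexedTable.build(rows, :id, [:color])`, a call such as `IndexedTable.by_column(tables, :color, :red)` returns every row whose `:color` is `:red`, or an empty list when nothing matches. Each extra column costs one more table and one more insert per row, which is cheap at 15K records.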


Yes, that was my first idea. I just thought I should ask before going with it.

Looking at the Mnesia/Amnesia documentation it seems straightforward to declare multiple indexes on the same table. But, I’m very new to Elixir so I don’t really know what I’ll be getting myself into. Any reason why I should avoid this, and implement all the tables/indexes on my own?


I definitely think giving :mnesia a shot may be worthwhile, particularly if you’re just doing in-memory tables and not worrying about any of the distributed bits. Amnesia is OK as wrapper libraries go, but it’s VERY macro heavy, so it can be a bit hard to debug. Your use case is relatively simple though, so maybe just time box it to a couple of hours and see how it goes?
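
For a feel of what plain :mnesia looks like without Amnesia, here is a small sketch of a RAM-only table with two secondary indexes. The table name `:item`, its attributes, and the module name `MnesiaSketch` are invented for the example; it uses dirty operations for brevity, which skip transactions.

```elixir
defmodule MnesiaSketch do
  # Hypothetical table and attribute names, for illustration only.
  def setup do
    :ok = :mnesia.start()

    # RAM-only table on the local node, with secondary indexes on two columns.
    {:atomic, :ok} =
      :mnesia.create_table(:item,
        attributes: [:id, :code, :category],
        index: [:code, :category],
        ram_copies: [node()]
      )

    :ok = :mnesia.dirty_write({:item, 1, "A1", :tools})
    :ok = :mnesia.dirty_write({:item, 2, "B7", :tools})
    :ok
  end

  # Primary-key lookup.
  def by_id(id), do: :mnesia.dirty_read(:item, id)

  # Secondary-index lookups; each returns a list of matching records.
  def by_code(code), do: :mnesia.dirty_index_read(:item, code, :code)
  def by_category(category), do: :mnesia.dirty_index_read(:item, category, :category)
end
```

Records come back as tuples such as `{:item, 1, "A1", :tools}`, so a thin conversion layer to structs or maps would still be needed on top of this.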


Yes, that’s precisely my use case. Thank you so much. You’ve been very patient and helpful


I would compile those into a module with a macro, similar to the new :persistent_term.

Thank you for the info. I initially didn’t think this would help me because it doesn’t allow me to have two keys on the same record. However, I could use the same idea we discussed in this thread: implement the two indexes in their own structures, and have a third structure for the non-key based query.

I’d like to implement this after the Mnesia based solution, and compare both. Can you tell me where to find more info? I know I can get the latest Erlang to be able to use :persistent_term but you’re suggesting to manually implement something similar with macros. Correct?

I might be severely off the mark here but I believe they were referring to something like FastGlobal. It produces a compiled module for you that would contain code lines like these:

def get("column1-key1"), do: "value1"
def get("column2-key1"), do: "value1"
def get("column1-key2"), do: "value2"

(This is the code generated by the library. You don’t write that. You do FastGlobal.put(:key, "value") and that’s it.)

…which is the fastest access you can get in Erlang / Elixir. However, every change of a value has a heavy runtime cost (make sure to go through the README file). So only use FastGlobal for very rare writes and a ton of reads.

As Ben said, 15K records is nothing. Caching every value several times (on as many keys as you need) is a perfectly fine strategy at that small scale.
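
The same "cache each record under every key" idea can be sketched with :persistent_term, mentioned earlier in the thread (it needs a recent enough Erlang/OTP). This is not FastGlobal's API. The module name `TermCache` and the `{TermCache, column}` term keys are arbitrary choices, and the sketch assumes each indexed column holds unique values.

```elixir
defmodule TermCache do
  # Hypothetical module; the {__MODULE__, column} term keys are an arbitrary choice.
  def load(rows, columns) do
    for column <- columns do
      # One map per column: column value -> full row (assumes unique values).
      index = Map.new(rows, &{Map.fetch!(&1, column), &1})
      :persistent_term.put({__MODULE__, column}, index)
    end

    :ok
  end

  # Reads do not copy the stored map into the caller's heap.
  def get(column, value) do
    {__MODULE__, column}
    |> :persistent_term.get(%{})
    |> Map.get(value)
  end
end
```

As with FastGlobal, writes are the expensive part, so this only suits data that is loaded once and then read many times.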


This is awesome sir. Thanks a lot

We will be happy if you share which solution you eventually ended up with. It’s interesting for future readers.

Will do. I’ll try to implement at least two of the ones suggested here, and compare it against the baseline based on structs.

Thanks again

Yes, it’s super simple if you don’t need to add things at runtime.

defmodule Data do
  # load external data
  @data [
    %{id: 1, other_key: :a},
    %{id: 2, other_key: :b},
    %{id: 3, other_key: :c},
  ]


  for key <- [:id, :other_key] do
    for row <- @data do
      def unquote(:"by_#{key}")(unquote(row[key])) do
        unquote(Macro.escape(row))
      end
    end

    def unquote(:"by_#{key}")(_) do
      nil
    end
  end

end

And if I want to get the value for any key, do I call value = Data."by_#{key}"?

Sorry, I’m really struggling with macros

In this example, you do Data.by_id(1) or Data.by_other_key(:a).

Or if you want a unified api, you can put the key in argument as well:

def by(unquote(key), unquote(row[key])) do

Then call it like Data.by(:id, 1)
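
Spelled out in full, the unified variant might look like the sketch below. The module name `UnifiedData` is hypothetical and the data is the same three sample rows as before. Note that the catch-all clause sits outside the comprehension, so it is defined once, after all generated clauses.

```elixir
defmodule UnifiedData do
  # Same sample rows as the earlier example.
  @data [
    %{id: 1, other_key: :a},
    %{id: 2, other_key: :b},
    %{id: 3, other_key: :c}
  ]

  # One by/2 clause per {column, value} pair, generated at compile time.
  for key <- [:id, :other_key], row <- @data do
    def by(unquote(key), unquote(row[key])), do: unquote(Macro.escape(row))
  end

  # Catch-all defined once, after every generated clause.
  def by(_key, _value), do: nil
end
```

Because the column is now an ordinary argument, a lookup on a column chosen at runtime is just `UnifiedData.by(column, value)`, with no need to build a function name dynamically.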


Got it. Thank you

OK. I hit an issue. I have the data in a CSV file, and to load it, I’m doing this:

  def csv_to_map do
    "../test0.csv"
    |> Path.expand(__DIR__)
    |> File.stream!()
    |> CSV.decode(headers: true)
    |> Enum.to_list()
  end

That works fine, but if I now store the result in a variable instead of the module attribute you used, I get an error:

data = csv_to_map()

for key <- [:id, :other_key] do
  for row <- data do
    def unquote(:"by_#{key}")(unquote(row[key])) do
      unquote(Macro.escape(row))
    end
  end
end

== Compilation error in file lib/catalog.ex ==
** (CompileError) lib/catalog.ex:15: undefined function key/0
(elixir) src/elixir_bitstring.erl:142: :elixir_bitstring.expand_expr/4
(elixir) src/elixir_bitstring.erl:27: :elixir_bitstring.expand/7
(elixir) src/elixir_bitstring.erl:20: :elixir_bitstring.expand/4
(stdlib) lists.erl:1354: :lists.mapfoldl/3
(elixir) expanding macro: Kernel.def/2

That key variable is defined in the outer for comprehension, so I don’t understand why switching from an attribute to a variable makes a difference.

Thanks for your help

Try this. I’ve seen nested for comprehensions cause issues at times, so it’s best to combine them into one regardless:

data = csv_to_map()

for key <- [:id, :other_key], row <- data do
  def unquote(:"by_#{key}")(unquote(row[key])) do
    unquote(Macro.escape(row))
  end
end

Thank you, but I still get the same error. The compiler doesn’t see “key” as being defined and expands it as a function. This is the code I have now:

defmodule Catalog do
  def csv_to_map do
    "../test0.csv"
    |> Path.expand(__DIR__)
    |> File.stream!()
    |> CSV.decode(strip_fields: true, headers: true)
    |> Enum.to_list()
  end

  def map_list_to_mem do
    data = csv_to_map()

    for key <- [:id, :other_key], row <- data do
      def unquote(:"by_#{key}")(unquote(row[key])) do
        unquote(Macro.escape(row))
      end

      def unquote(:"by_#{key}")(_) do
        nil
      end
    end
  end
end

And, I still get:
== Compilation error in file lib/catalog.ex ==
** (CompileError) lib/catalog.ex:15: undefined function key/0
(elixir) src/elixir_bitstring.erl:142: :elixir_bitstring.expand_expr/4
(elixir) src/elixir_bitstring.erl:27: :elixir_bitstring.expand/7
(elixir) src/elixir_bitstring.erl:20: :elixir_bitstring.expand/4
(stdlib) lists.erl:1354: :lists.mapfoldl/3
(elixir) expanding macro: Kernel.def/2

This code needs to live directly in the module body, not inside a function, if you’re trying to generate functions. It all runs at compile time. If compile time isn’t when you want to do this, then you’ll want to pick one of the other strategies.
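
A sketch of what "in the module body" means here. To keep it self-contained, it parses a small inline string (a stand-in for the CSV file, with made-up contents) using `String.split/3` rather than the CSV library; in a real project the loading code would read the file at compile time, and it could not call a function defined in the same module, because that function is not compiled yet. The module name `CatalogSketch` is hypothetical.

```elixir
defmodule CatalogSketch do
  # Hypothetical inline data standing in for the CSV file.
  raw = """
  id,other_key
  1,a
  2,b
  3,c
  """

  # Plain variables in the module body are evaluated at compile time.
  [header | lines] = String.split(raw, "\n", trim: true)
  columns = header |> String.split(",") |> Enum.map(&String.to_atom/1)

  rows =
    for line <- lines do
      columns |> Enum.zip(String.split(line, ",")) |> Map.new()
    end

  # This comprehension runs in the module body, so `key` and `row` are
  # compile-time variables that unquote/1 can see.
  for key <- columns do
    for row <- rows do
      def unquote(:"by_#{key}")(unquote(row[key])), do: unquote(Macro.escape(row))
    end

    # Catch-all clause, defined once per column after the generated clauses.
    def unquote(:"by_#{key}")(_), do: nil
  end
end
```

All values parsed this way are strings, so the generated clauses match `CatalogSketch.by_id("1")` rather than `CatalogSketch.by_id(1)`.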


I see. Beginner’s mistake :) I don’t know why I didn’t try that before.

BTW, I was able to do the ETS based solution, and it’s working fine. I decided not to bother with Mnesia as I just have one additional index and I don’t think I’ll gain anything else from it.

But I’d like to finish at least one of the macro based solutions. Even if I don’t use it, I’m learning a lot.

Thanks a lot
