Understanding the ets cache

Maxtonx · September 23, 2020, 10:31am

I’m trying to learn caching techniques, and because Elixir has ETS built in I decided to use it as my cache. I have a module that fetches a list of books from Wikipedia, but I don’t want to hit the server too many times, so I decided to cache the result.

  def get_books(limit, offset) do
    cache = Module.get_book(limit, offset)
  end

I have created a GenServer module that hits the Wikipedia server and stores the result in the database, and now I want to cache this result. As you may have noticed, the function takes a limit and an offset, so the GenServer will hit the server again after some time and store the new result in the database.

If I cache the result every time it hits the server, does that mean everything is stored in the same table? That is what I’m assuming. Is this the right way to approach it:

storing the result in a variable and passing that variable here

:ets.insert(cache, [:named_table])

Can someone give me a suggestion here on how to approach this?

ityonemo · September 23, 2020, 6:03pm

What level of experience do you have with Elixir? For very early beginners I would not dive into ETS. There are a lot of issues around process lifetimes that it would be good to understand first, and it also breaks the basic immutability/messaging concepts of the VM, which it would help to be familiar with before going down the path of something “different”. Arguably you should also be “ready to read Erlang documentation”.

That said, you shouldn’t be passing [:named_table] into insert; that is an option for :ets.new, where the table is created. For insert, pass the name of the table as the first parameter and a tuple with the key in its first position as the second parameter. http://erlang.org/doc/man/ets.html

Then when you’re ready to look it up, use :ets.lookup, passing the name of the table and the key. Keep in mind that what you get back will be a list of tuples, and the key will be in the tuple, as well as the “value”, whatever you choose that to be.
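As a rough illustration of the calls described above, here is a minimal sketch. The table name :books_cache, the {limit, offset} key, and the book titles are made up for the example; only :ets.new, :ets.insert, and :ets.lookup come from ETS itself.

```elixir
# Create the table once. :books_cache is a hypothetical name;
# :named_table lets later calls refer to the table by that atom.
:ets.new(:books_cache, [:named_table, :set, :public])

# insert takes the table name and a tuple whose first element is the key.
# Here the key is an assumed {limit, offset} pair.
:ets.insert(:books_cache, {{10, 0}, ["Book A", "Book B"]})

# lookup takes the table name and the key, and returns a list of tuples.
hit = :ets.lookup(:books_cache, {10, 0})
miss = :ets.lookup(:books_cache, {10, 10})

IO.inspect(hit, label: "hit")
IO.inspect(miss, label: "miss")
```

Note that the key comes back inside each tuple, and that a missing key gives an empty list rather than nil.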

If you’re exploring ETS, I highly recommend using IO.inspect to take a peek at what data are flying around. What happens is not necessarily intuitive if you are coming from other databases, Redis, memcached, or the like, so read the documentation carefully before proceeding.

mpope · September 23, 2020, 6:08pm

I think the caching strategy you’re looking for is called a ‘Read-Through Cache’. Basically, you have an ETS table that you check for a certain key. If the key isn’t present, the value is fetched from the DB and, if it is found, stored in ETS. Some issues come with this, like controlling cache size, multiple writers conflicting, coherency between VMs, etc. ETS is a bit barebones, but it’s a fun way to get creative with your caching, if you’re looking for a fun challenge!
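A minimal sketch of the read-through idea, under some assumptions: BookCache, its table name, and the fetch_fun argument are all hypothetical, and fetch_fun stands in for whatever actually loads the books from the DB, returning {:ok, books} or :error. It does not deal with cache size, expiry, or concurrent writers.

```elixir
defmodule BookCache do
  # Hypothetical module and table name, for illustration only.
  @table :book_cache

  # Call once from a long-lived process: the table is deleted
  # when the process that created it exits.
  def init do
    :ets.new(@table, [:named_table, :set, :public, read_concurrency: true])
    :ok
  end

  # fetch_fun is an assumed stand-in for the real DB call. It must
  # return {:ok, books} when something is found and :error otherwise.
  def get_books(limit, offset, fetch_fun) do
    key = {limit, offset}

    case :ets.lookup(@table, key) do
      # Cache hit: return the stored value without calling fetch_fun.
      [{^key, books}] ->
        {:ok, books}

      # Cache miss: fetch, and store the result only if it was found.
      [] ->
        case fetch_fun.(limit, offset) do
          {:ok, books} ->
            :ets.insert(@table, {key, books})
            {:ok, books}

          :error ->
            :error
        end
    end
  end
end
```

Two processes that miss on the same key at the same time will both call fetch_fun here, which is one form of the "multiple writers" issue mentioned above.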

egze · September 23, 2020, 7:02pm

Look at a battle-tested library: https://github.com/sasa1977/con_cache

or if you want to build something yourself, look at this implementation: https://gist.github.com/raorao/a4bb34726af2e3fa071adfa504505e1d

Maxtonx · September 25, 2020, 12:10pm

Really appreciate your comment. I have been using Elixir on and off for the last year, but never really got into OTP because I wasn’t using it at the production level. I understood a lot more after reading the documentation, and I have written a basic GenServer that handles the ETS cache. I’ve added it to the application start function and it’s working fine.
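For anyone following along, a GenServer of that kind might look roughly like the sketch below. This is not the actual code from my project; CacheServer, the table name, and the get/put functions are illustrative names only.

```elixir
defmodule CacheServer do
  use GenServer

  # Hypothetical table name, for illustration only.
  @table :cache_server_table

  def start_link(opts) do
    GenServer.start_link(__MODULE__, opts, name: __MODULE__)
  end

  # Reads go straight to ETS, so callers do not wait on the server.
  def get(key) do
    case :ets.lookup(@table, key) do
      [{^key, value}] -> {:ok, value}
      [] -> :error
    end
  end

  # Writes go through the server, which is the table's only writer.
  def put(key, value) do
    GenServer.call(__MODULE__, {:put, key, value})
  end

  @impl true
  def init(_opts) do
    # The GenServer owns the table, so the table lives as long as it does.
    # :protected means any process may read but only the owner may write.
    table = :ets.new(@table, [:named_table, :set, :protected, read_concurrency: true])
    {:ok, table}
  end

  @impl true
  def handle_call({:put, key, value}, _from, table) do
    :ets.insert(table, {key, value})
    {:reply, :ok, table}
  end
end
```

Adding CacheServer to the children list of the application's supervisor starts it with the application. If the server crashes and is restarted, the table is recreated empty.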

Now that I have a basic understanding of this, can you point me to some documentation about the immutability and messaging concepts in the VM? I’m just more curious now.