Simple 1-file ETS demonstration you can try

Rob1 · January 26, 2024, 2:10pm

Have you ever used ETS (Erlang Term Storage), the famous in-memory store that Elixir inherits from Erlang and that is often used for caching?

If not, I have prepared this short code sample you can clone and run as a single file to see it in use. Just start it from the command line with elixir cloudcomputing.exs

I am just learning ETS myself, so any feedback about my file is welcome from those who have run it.

tj0 · January 26, 2024, 2:40pm

Nice. FYI, there’s an Elixir wrapper for ETS.

https://hexdocs.pm/ets/ETS.html

I’ve never used Elixir for scripting, so I’m sure there’s a way to add dependencies, but not sure offhand.
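
For scripts, one way to add dependencies is Mix.install/1 (available since Elixir 1.12). A minimal sketch, assuming the wrapper is published on Hex under the package name :ets and that the machine has network access on the first run; the file name is hypothetical:

```elixir
# deps_demo.exs (hypothetical file name); run with: elixir deps_demo.exs
# Mix.install/1 fetches and compiles the listed Hex packages on first run
# and caches them for later runs.
Mix.install([:ets])

# After installation, the wrapper's top-level module can be loaded
# like any other dependency.
IO.inspect(Code.ensure_loaded?(ETS), label: "ETS wrapper loaded")
```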

kokolegorille · January 26, 2024, 4:28pm

I can see some possible enhancements…

use handle_continue for a probably long load in the init
use a direct ets call to retrieve data, without calling the genserver
currently You use the table as if it was private… but You set it as public.
I prefer protected
use spawn_link to link both processes (main and task)
it’s a good place to use trap_exit, and manage task death in the genserver

Just some quick things I would do

I also prefer Process.send_after instead of :timer.sleep
If I can I would avoid using :timer

schneebyte · January 28, 2024, 6:52pm

Well, for match specs I might use a wrapper/library, but otherwise I try to just use the Erlang stuff directly, with no unnecessary abstraction.
That wrapper is ~3k LOC just so I can write ETS instead of :ets.
And it’s missing :ets.lookup_element/4.

I would replace all those
:ets.tab2list(:sensor_data) |> Enum.filter
with :ets.match/match_delete/select/select_delete
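
To make the difference concrete, here is a small sketch comparing the two approaches. The table name :sensor_demo, the sample rows, and the cutoff are made up for illustration; only the {time, temp} row shape comes from the thread.

```elixir
table = :ets.new(:sensor_demo, [:set, :public])
:ets.insert(table, [{1, 20.5}, {2, 21.0}, {3, 19.8}, {4, 22.3}])
cutoff = 3

# Approach 1: copy the whole table into the calling process, then filter.
filtered =
  table
  |> :ets.tab2list()
  |> Enum.filter(fn {time, _temp} -> time < cutoff end)
  |> Enum.sort()

# Approach 2: let ETS do the filtering; only matching rows are copied out.
# :"$_" in the body means "return the whole stored object".
match_spec = [{{:"$1", :"$2"}, [{:<, :"$1", cutoff}], [:"$_"]}]
selected = table |> :ets.select(match_spec) |> Enum.sort()

IO.inspect(filtered == selected, label: "same rows")
```

Both lists are sorted before comparing because a :set table gives no ordering guarantee.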

Rob1 · January 28, 2024, 7:25pm

Thanks for the tip! I’ve made this replacement.

Here’s the explanation of the $1 and $2 there, which I wasn’t familiar with:

"
The code snippet you’re asking about uses a feature of Erlang’s ETS (Erlang Term Storage) that’s also available in Elixir, as Elixir is built on top of the Erlang VM. The syntax with :"$1", :"$2", etc., is specific to ETS and represents a pattern matching and guard expression used in ETS select and delete operations. Let’s break down the specific line:

:ets.select_delete(:sensor_data, [{{:"$1", :"$2"}, [{:<, :"$1", oldest_allowed_time}], [true]}])

:ets.select_delete/2: This is an ETS function used to delete entries from an ETS table based on a match specification. The first argument is the name of the ETS table (:sensor_data in your case), and the second argument is a match specification that determines which records to delete.
The match specification [{{:"$1", :"$2"}, [{:<, :"$1", oldest_allowed_time}], [true]}] is a list that describes how to match and delete the records. Let’s dissect this:

{:"$1", :"$2"}: This part is the pattern (the match head). It matches two-element tuples, binding the first element to :"$1" and the second element to :"$2". In ETS match specifications, :"$1", :"$2", etc., are placeholders that correspond to the elements of the tuples stored in the ETS table. In your case, each tuple in the :sensor_data table is {time, simulated_temp}, so :"$1" matches time and :"$2" matches simulated_temp.
[{:<, :"$1", oldest_allowed_time}]: This is the guard sequence. It applies additional conditions to the matched records. Here, it checks if the time (:"$1") is less than (:<) oldest_allowed_time. Only records that satisfy this condition will be considered for deletion.
[true]: This is the match body (the result). :ets.select_delete/2 deletes an object only when the body returns true for it, so every record that matches the pattern and passes the guard sequence is deleted.

So, in simpler terms, this line of code tells ETS to delete all records from the :sensor_data table where the timestamp is older than oldest_allowed_time.

This kind of operation is both powerful and efficient, as it allows complex match and guard conditions to be executed directly within the ETS table, avoiding the overhead of pulling data into process memory for filtering and processing. The syntax might seem unusual at first, especially if you’re primarily used to Elixir’s syntax, but it’s a direct use of Erlang’s powerful pattern matching capabilities within ETS.
"

What do you think of this explanation?
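
The select_delete call explained in the quoted text can be tried in isolation. This is only a sketch: the table name :sensor_demo, the integer timestamps, and the cutoff value are invented for the demonstration.

```elixir
table = :ets.new(:sensor_demo, [:set, :public])
:ets.insert(table, [{100, 20.5}, {200, 21.0}, {300, 19.8}])
oldest_allowed_time = 250

# Pattern {time, temp}, guard time < oldest_allowed_time, body true (delete).
match_spec = [{{:"$1", :"$2"}, [{:<, :"$1", oldest_allowed_time}], [true]}]

# select_delete/2 returns the number of objects it deleted.
deleted = :ets.select_delete(table, match_spec)
remaining = :ets.tab2list(table)

IO.inspect(deleted, label: "deleted")
IO.inspect(remaining, label: "remaining")
```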

Rob1 · January 28, 2024, 7:37pm

Thanks for these tips! You mentioned “just some quick things”, so if it doesn’t take you long, could you make a pull request? I am not confident in applying your three changes myself; I am still learning Elixir and had help generating the code to begin with. (Specifically: I haven’t used handle_continue and am not totally clear about where to add it, I am not completely clear on the private/public/protected distinction you mention, and I am not confident applying your spawn changes.)

It sounds like you know all three of your suggestions (four with send_after) and could do it quickly, so if it will improve the demonstration code I would merge your PR to improve the example.

Otherwise I would have to wait until I learn the concepts you’ve mentioned in a bit more detail.

So far this is still a learning exercise for me too, and I appreciate the feedback.

kokolegorille · January 29, 2024, 12:04am

Your code…

  # Initializes the ETS table and starts the sensor simulation
  def init(:ok) do
    # Use `:named_table` to allow access from other processes
    :ets.new(:sensor_data, [:set, :public, :named_table])
    schedule_sensor_read()
    schedule_cleanup()
    {:ok, %{}}
  end

In FP, functions return something… It’s nice to write pure functions: fn input -> output end

You are writing methods, probably with side effects; we don’t know the input, we don’t know the output

    schedule_sensor_read()
    schedule_cleanup()

This would be my init

  @impl GenServer
  def init(:ok) do
    # `:named_table` lets processes refer to the table by name;
    # `:protected` lets any process read it, but only the owner can write
    :ets.new(:sensor_data, [:set, :protected, :named_table])

    # HERE TRAP EXIT, so You don't die when a linked task dies
    # You will instead receive an {:EXIT, pid, reason} info message
    Process.flag(:trap_exit, true)
    {:ok, %{}, {:continue, :load_data}}
  end

  @impl GenServer
  def handle_continue(:load_data, state) do
    # at least the init is very short!
    # I would not write this code... but data = schedule_...
    schedule_sensor_read()
    schedule_cleanup()
    {:noreply, state}
  end
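
For readers who want to see these pieces run together, here is a self-contained sketch of the init / handle_continue / trap_exit combination. The module name SensorDemo, the state keys, the state/1 helper, and the deliberately failing linked process are all assumptions made for the demo; they are not part of the original file.

```elixir
defmodule SensorDemo do
  use GenServer

  def start_link(opts \\ []), do: GenServer.start_link(__MODULE__, :ok, opts)

  # Hypothetical helper so the caller can inspect the server state.
  def state(server), do: GenServer.call(server, :state)

  @impl GenServer
  def init(:ok) do
    # Convert exit signals from linked processes into messages.
    Process.flag(:trap_exit, true)
    {:ok, %{loaded: false, exits: []}, {:continue, :load_data}}
  end

  @impl GenServer
  def handle_continue(:load_data, state) do
    # Runs right after init/1 returns, before any other message is handled.
    # The linked process exits on purpose to show the EXIT handling.
    _pid = spawn_link(fn -> exit(:boom) end)
    {:noreply, %{state | loaded: true}}
  end

  @impl GenServer
  def handle_info({:EXIT, _pid, reason}, state) do
    # Because exits are trapped, the server records the reason and keeps running.
    {:noreply, %{state | exits: [reason | state.exits]}}
  end

  @impl GenServer
  def handle_call(:state, _from, state), do: {:reply, state, state}
end

{:ok, server} = SensorDemo.start_link()
# Give the linked process time to exit and its EXIT message time to arrive.
Process.sleep(100)
demo_state = SensorDemo.state(server)
IO.inspect(demo_state, label: "state")
```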

Just above init, I would write the API, You just wrote the callbacks

Something like this

# THIS IS API 
def get_ets_data do
  # Here is the fun part with public, or protected ets...
  # You can read it without having to call the server

  # In case your table is private, You need to call, or cast the server
end

def write_ets_data(data) do
  # Here You call the server if protected
  # or You write directly if public
end
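
One possible way to fill in those two stubs for a protected table is sketched below. The module name SensorStore, the table name :sensor_store_demo, and the function arguments are assumptions for illustration; the original file may be organized differently.

```elixir
defmodule SensorStore do
  use GenServer

  @table :sensor_store_demo

  # API
  def start_link(opts \\ []), do: GenServer.start_link(__MODULE__, :ok, opts)

  # Reads go straight to ETS in the caller's process; the server is not involved.
  def get_ets_data(time) do
    case :ets.lookup(@table, time) do
      [{^time, temp}] -> {:ok, temp}
      [] -> :error
    end
  end

  # Writes go through the server, because only the owner may write
  # to a :protected table.
  def write_ets_data(server, {time, temp}) do
    GenServer.call(server, {:write, time, temp})
  end

  # Callbacks
  @impl GenServer
  def init(:ok) do
    :ets.new(@table, [:set, :protected, :named_table])
    {:ok, %{}}
  end

  @impl GenServer
  def handle_call({:write, time, temp}, _from, state) do
    :ets.insert(@table, {time, temp})
    {:reply, :ok, state}
  end
end

{:ok, server} = SensorStore.start_link()
:ok = SensorStore.write_ets_data(server, {1, 20.5})
IO.inspect(SensorStore.get_ets_data(1), label: "direct read")

# A direct write from a process that does not own the table is rejected.
direct_write =
  try do
    :ets.insert(:sensor_store_demo, {2, 21.0})
  rescue
    ArgumentError -> :not_allowed
  end

IO.inspect(direct_write, label: "direct write from non-owner")
```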

Your code…

  defp schedule_sensor_read do
    Process.send_after(self(), :read_sensor, 1)
  end

Do You know it returns a ref?

  defp schedule_sensor_read do
    ref = Process.send_after(self(), :read_sensor, 1)
  end

This ref could be stored in the server state

Because some day, You might want to cancel it
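
A short sketch of what cancelling looks like, run outside a GenServer to keep it small. The 5-second delay is arbitrary; in a server the ref would come from the state instead of a local variable.

```elixir
ref = Process.send_after(self(), :read_sensor, 5_000)

# cancel_timer/1 returns the milliseconds that were left,
# or false if the timer already fired or was already cancelled.
remaining = Process.cancel_timer(ref)
second_cancel = Process.cancel_timer(ref)

IO.inspect(is_integer(remaining), label: "cancelled before firing")
IO.inspect(second_cancel, label: "second cancel")

# The message never arrives because the timer was cancelled in time.
outcome =
  receive do
    :read_sensor -> :message_arrived
  after
    100 -> :no_message
  end

IO.inspect(outcome, label: "outcome")
```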

Your code…

  # Periodically analyze temperature data
  defp analyze_temperature do
    analyze_and_report()
    :timer.sleep(10_000)  # Use :timer.sleep for more reliable behavior
    analyze_temperature()
  end

My code… with some modifications. I would keep the ref in the state
If the state is a struct, with ref defined, I would then return the modified state with
%{state | ref: ref}

Be careful with this syntax… It works only if state already has a ref key (a struct with ref as a field, or a map that already contains that key); otherwise it raises a KeyError
But You get the idea… the function takes an input and outputs a modified version of the state

defp tick(state) do
  # Do something here...
  ref = Process.send_after(self(), :tick, 10_000)
  %{state | ref: ref}
end
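
The same idea can be tried as a script. This sketch uses an anonymous function and a plain map with ref and ticks keys, both chosen only for the demo; a struct that defines those fields would behave the same way with the update syntax.

```elixir
# tick takes the state and returns a modified state, as in the function above.
tick = fn state ->
  ref = Process.send_after(self(), :tick, 10)
  %{state | ref: ref, ticks: state.ticks + 1}
end

state = tick.(%{ref: nil, ticks: 0})

# Wait for the scheduled message instead of blocking with :timer.sleep.
received =
  receive do
    :tick -> :tick
  after
    1_000 -> :timeout
  end

# The update syntax raises KeyError when the key is not already in the map.
missing_key =
  try do
    tick.(Map.new(ticks: 0))
  rescue
    error in KeyError -> {:key_error, error.key}
  end

IO.inspect(state.ticks, label: "ticks")
IO.inspect(received, label: "received")
IO.inspect(missing_key, label: "update without a ref key")
```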

Your code… it’s too low level to use spawn or spawn_link
Prefer the Task module

  # Public function to start the analysis process
  def start_analysis do
     # Use `spawn` instead of `spawn_link` to avoid linking the process
     spawn(fn -> analyze_temperature() end)
  end

Also… it should at least be explicit about what it returns, even if You don’t use the pid

_pid = spawn(fn -> analyze_temperature() end)
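
As a rough sketch of what the Task module offers in place of spawn: Task.start_link/1 is the fire-and-forget form and returns {:ok, pid}, while Task.async/1 with Task.await/1 hands the result back to the caller. The analyze function and its hard-coded readings are stand-ins for analyze_temperature and are not taken from the original file.

```elixir
# Stand-in for the real analysis: average of some made-up readings.
analyze = fn -> Enum.sum([20.5, 21.0, 19.5]) / 3 end

# Fire and forget, linked to the caller; the return value says what You get back.
{:ok, pid} = Task.start_link(fn -> analyze.() end)

# When the caller needs the result, use async/await.
task = Task.async(analyze)
average = Task.await(task)

IO.inspect(is_pid(pid), label: "task started")
IO.inspect(average, label: "average")
```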

Your code reflects something You would write in other languages

OTP is a delightful piece of software, but it can be tricky to write, and Functional Programming can be tricky too, in particular if You are an experienced OOP programmer

kokolegorille · January 29, 2024, 12:13am

Also I would try not to mix the gen_server code with the business logic code.

I would prefer having a Sensor context, with only pure functions, separated from time concern…

…and use the genserver only to call these context module functions, when time comes.

Rob1 · January 29, 2024, 12:33am

Great observation, and I like your style such as ending a function on a line like:
ref = Process.send_after(self(), :read_sensor, 1)

or your other example

_pid = spawn(fn -> analyze_temperature() end)

I think that makes it very clear what the function is returning (a reference and pid in these two cases, and you also show when you are not using it via the underscore.)

I think that makes the code read almost like literate programming, so I will start using this style in my code when I can.

Where you write:

I would prefer having a Sensor context, with only pure functions, separated from time concern…

They couldn’t really be pure functions, though, since reading the sensors will not always return the same value.

Thank you again for the code review. As I get more advanced I will practice more of those techniques. I’ll try to incorporate some of these changes now and post the updates.

Rob1 · January 29, 2024, 1:02am

I’ve now made some of your suggested changes here. I didn’t totally rework the structure, but I see what you mean about a separate API, and about changing the init to return {:ok, %{}, {:continue, :load_data}} and then having a def handle_continue(:load_data, state).

The reason I didn’t make the change totally is that you only provided part of the code here and I am still just getting used to dealing with passing data around like this, so I preferred to keep the application in a working state. But I see what you were referring to.

kokolegorille · January 29, 2024, 1:39am

You should look at how Ecto deals with external data. You could have a SensorApi providing data. This data is cast and validated into valid Sensor data

Once cast and validated, You can have a pure Core

I don’t know what You retrieve, but You could have pure functions doing Celsius ↔ Fahrenheit conversion

Of course it’s overkill for a simple project, but I would not mind using multiple modules

Sensors
  Workers
  Core
  Api
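
A minimal sketch of what the Core part of that layout could contain, assuming readings arrive as {time, temp} tuples in Celsius. The function names, the returned map shape, and the absolute-zero check are hypothetical; the impure sensor read would stay in the Api or Workers modules and pass its raw result through cast_reading/1.

```elixir
defmodule Sensors.Core do
  # Pure functions only: the same input always gives the same output.

  def celsius_to_fahrenheit(c) when is_number(c), do: c * 9 / 5 + 32

  def fahrenheit_to_celsius(f) when is_number(f), do: (f - 32) * 5 / 9

  # Cast and validate a raw reading before the rest of the Core sees it.
  # Temperatures below absolute zero are rejected as invalid.
  def cast_reading({time, temp})
      when is_integer(time) and is_number(temp) and temp >= -273.15 do
    {:ok, %{time: time, celsius: temp * 1.0}}
  end

  def cast_reading(_other), do: {:error, :invalid_reading}
end

IO.inspect(Sensors.Core.celsius_to_fahrenheit(100), label: "100 C in F")
IO.inspect(Sensors.Core.cast_reading({1, 20}), label: "valid reading")
IO.inspect(Sensors.Core.cast_reading(:junk), label: "invalid reading")
```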