GenServer pass file path

Fl4m3Ph03n1x · August 7, 2019, 7:56am

Background

I have an OTP application that reads a CSV file and puts it into memory. After following some advice from this community I went with the following approach:

My app file has 2 modules: Populator and Reader. Populator opens the file and indexes its contents into a persistent_term table, while Reader is a client that queries said table.
Populator is a GenServer that starts in an initializing state. Once it loads the file into memory, its state changes to :ready or :error depending on the outcome.
Before making a request, the Reader checks the Populator's state. If it's :ready, it queries the memory table; otherwise it refuses the request.
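
For readers who want something concrete, here is a minimal sketch of the arrangement described above. It is not the actual library code: the module names (Sketch.Populator, Sketch.Reader), the persistent_term key, and the populate/1 function that takes already-parsed rows are all assumptions made for illustration.

```elixir
defmodule Sketch.Populator do
  use GenServer

  # Hypothetical persistent_term key; the real library may use another one.
  @key {__MODULE__, :rows}

  def start_link(_opts \\ []), do: GenServer.start_link(__MODULE__, :ok, name: __MODULE__)
  def status, do: GenServer.call(__MODULE__, :status)
  def populate(rows), do: GenServer.call(__MODULE__, {:populate, rows})
  def key, do: @key

  @impl true
  def init(:ok), do: {:ok, :initializing}

  @impl true
  def handle_call(:status, _from, state), do: {:reply, state, state}

  # A non-empty map of rows is stored and the state becomes :ready.
  def handle_call({:populate, rows}, _from, _state) when is_map(rows) and map_size(rows) > 0 do
    :persistent_term.put(@key, rows)
    {:reply, :ok, :ready}
  end

  # Anything else moves the state to :error.
  def handle_call({:populate, _rows}, _from, _state) do
    {:reply, {:error, :no_valid_data_to_save}, :error}
  end
end

defmodule Sketch.Reader do
  # Only touches the table when the Populator reports :ready.
  def lookup(id) do
    case Sketch.Populator.status() do
      :ready -> Map.fetch(:persistent_term.get(Sketch.Populator.key()), id)
      other -> {:error, {:not_ready, other}}
    end
  end
end
```

The Reader never reads the table before the Populator has reported :ready, so a missing or half-written table is never observed.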

Best practices

Now, the issue here is initialization. As a GenServer, my Populator has the file’s path hardcoded. This app is a library, and the guidelines for libraries say that hardcoding it is a terrible idea.

I could put the file path into a config file inside a config folder, but the directives for good libraries discourage this. They instead encourage me to have a function in the Populator called populate_memory (for lack of a better name) that upon receiving a file name, opens the file and fills the table.

Problem

But the problem with this approach is that the projects using the library will then have to deal with the engine failing on startup, and their code will end up looking like this:


defmodule FootbalInterface.Application do
  @moduledoc false

  use Application

  alias FootbalEngine

  require Logger

  def start(_type, _args) do

    children = []
    opts = [strategy: :one_for_one, name: FootbalInterface.Supervisor]

    case FootbalEngine.new(file_path) do
      {:ok, :indexation_successful} ->
        Logger.info("""
        Application launched successfully.
        Waiting for requests.
        """)
        Supervisor.start_link(children, opts)

      {:ok, :partial_indexation_successful, fails} ->
        Logger.warn("""
        Application launched with errors. Check the CSV file for:
        #{inspect fails}
        Waiting for requests.
        """)
        Supervisor.start_link(children, opts)

      {:error, :no_valid_data_to_save} ->
        Logger.error("""
        Failed to launch application. CSV file has no valid data.
        Shutting down.
        """)
        {:error, :invalid_csv}

      {:error, reason} ->
        Logger.error("""
        Failed to launch application. An unknown error occurred:
        #{inspect reason}
        Shutting down.
        """)
        {:error, reason}
    end

  end
end

Which has also been heavily discouraged by the community in previous posts.

Question

So, how do I follow best practices for libraries and still follow the community recommendations for my use case?

LostKobrakai · August 7, 2019, 8:29am

My guidelines for config are as follows:

If it’s functions, receive config via params
If it’s processes, receive config via params given to the childspec
If the above is not convenient for the default use case, additionally fall back to application env or other stateful configuration methods.
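
As a rough illustration of that precedence, the sketch below resolves a path from explicit options first and falls back to the application environment only when no option was given. The module name Sketch.Config, the :sketch_lib application name, and the :path key are hypothetical.

```elixir
defmodule Sketch.Config do
  # Hypothetical application name used for the env fallback.
  @app :sketch_lib

  # Explicit params win. Each step only runs if the previous one returned :error;
  # the first {:ok, path} falls through `with` and is returned unchanged.
  def resolve_path(opts) when is_list(opts) do
    with :error <- Keyword.fetch(opts, :path),
         :error <- Application.fetch_env(@app, :path) do
      {:error, :no_path_configured}
    end
  end
end
```

A process would call this from its start_link/1 or init/1 with the options it received through the child spec, so the application env is only ever a convenience default.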

For your case I’m wondering why you need a process (genserver) at all. If all you do is either startup or not you don’t really need to check the genserver at runtime anymore. If things failed the app doesn’t start to begin with.

I’d go for:

with {:ok, result} <- Lib.populate(path) do
  case result do
    :indexation_successful -> Logger.info("Application launched successfully.")
    {:partial_indexation_successful, fails} -> Logger.info("Application launched with errors. Check the CSV file for: #{inspect fails}")
  end
  Logger.info("Waiting for requests.")
  Supervisor.start_link(children, opts)
end

You could add more logging to the failing part, but an application that fails to start already causes logs to be generated. If you skip the logging for the success case, it’s even simpler than that.

An alternative I can see, which doesn’t require wrapping Supervisor.start_link could be:

children = [
  {Lib.Populator, [path]},
  …
]
Supervisor.start_link(children, opts)

Where the Populator succeeds or fails to start depending on the result of loading the file. After a success the process could just hibernate. Here the population would be done at init/1 and block till finished and everything starting later can just use the data as needed.
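
A minimal sketch of that blocking variant, under a few assumptions: the module name Sketch.BlockingPopulator and the persistent_term key are made up, and splitting the file into lines stands in for real CSV parsing.

```elixir
defmodule Sketch.BlockingPopulator do
  use GenServer

  # Hypothetical persistent_term key.
  @key {__MODULE__, :rows}

  # Takes a one-element list to match the {Lib.Populator, [path]} child spec.
  def start_link([path]), do: GenServer.start_link(__MODULE__, path)

  def key, do: @key

  @impl true
  def init(path) do
    # The supervisor waits here until init/1 returns, so children started
    # later can rely on the data being present.
    case File.read(path) do
      {:ok, contents} ->
        # Stand-in for real CSV parsing: one entry per non-empty line.
        :persistent_term.put(@key, String.split(contents, "\n", trim: true))
        # Nothing left to do after loading, so hibernate.
        {:ok, path, :hibernate}

      {:error, reason} ->
        # start_link/1 returns {:error, ...} and the supervisor fails to start.
        {:stop, {:cannot_load, path, reason}}
    end
  end
end
```

Because the failure happens inside init/1, the host application does not need to wrap Supervisor.start_link in a case expression; a bad file simply prevents the supervision tree from starting.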

If you don’t want to block on startup you can use the above as well and do the loading after init/1, but then you essentially need to query the state of the file load whenever you try to access the data. It’s also not as straightforward in terms of stopping the application, as more things might already have been started.
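
One way to sketch the non-blocking variant is with handle_continue/2, which runs right after init/1 returns. The module name Sketch.LazyPopulator, the key, and the line-splitting "parser" are assumptions for illustration only.

```elixir
defmodule Sketch.LazyPopulator do
  use GenServer

  # Hypothetical persistent_term key.
  @key {__MODULE__, :rows}

  def start_link(path), do: GenServer.start_link(__MODULE__, path, name: __MODULE__)
  def status, do: GenServer.call(__MODULE__, :status)

  # Every access has to check the loading state first.
  def fetch_rows do
    case status() do
      :ready -> {:ok, :persistent_term.get(@key)}
      other -> {:error, other}
    end
  end

  @impl true
  def init(path) do
    # Return immediately so the supervisor can start the next child.
    {:ok, :initializing, {:continue, {:load, path}}}
  end

  @impl true
  def handle_continue({:load, path}, _state) do
    case File.read(path) do
      {:ok, contents} ->
        :persistent_term.put(@key, String.split(contents, "\n", trim: true))
        {:noreply, :ready}

      {:error, _reason} ->
        # The process stays alive; readers are refused instead.
        {:noreply, :error}
    end
  end

  @impl true
  def handle_call(:status, _from, state), do: {:reply, state, state}
end
```

Note that the continue step runs before the process handles any message, so a status/0 call made during loading waits in the mailbox until loading finishes. The supervisor is not blocked, but callers are; keeping them unblocked as well would need the load to run in a separate process, which this sketch does not attempt.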

Fl4m3Ph03n1x · August 7, 2019, 8:38am

For your case I’m wondering why you need a process (genserver) at all. If all you do is either startup or not you don’t really need to check the genserver at runtime anymore. If things failed the app doesn’t start to begin with.

This decision stems from this discussion:

Where I was advised to create a GenServer that monitors the file and self-heals the application once the file is corrected. Once the file is corrected, the GenServer does nothing and in theory I could kill it, but for the time being I am leaving it be (for simplicity).

An alternative I can see, which doesn’t require wrapping Supervisor.start_link could be:

I am already using this alternative, but then the path needs to be hardcoded, which is against the guidelines for libraries. I honestly do not see a way to abide by both the best-practice guidelines and the advice given by the community.

This is already being done and is expected. It’s a cheap operation and not a problem.

LostKobrakai · August 7, 2019, 8:43am

This kinda contradicts your snippet in the entry post, which does fail hard if the file could not be loaded, while self-healing means you just start up no matter what and wait for it to heal. Maybe there’s some timeout, but you initially accept that it does not work.

Maybe you could first define what should happen in case of errors. Should the application fail to start up, stop after a timeout, not fail at all and just deal with data not being available or should even the user of your application make the decision?

LostKobrakai · August 7, 2019, 8:44am

If that’s part of your library users’ supervision tree, your library itself does not hardcode anything. And yeah, all my code examples would live in the user’s MyApp.Application, not your library. I missed specifying that.

Fl4m3Ph03n1x · August 7, 2019, 9:08am

Let me see if I get this correctly. The app that uses my library, would then have the following code:

defmodule FootbalInterface.Application do
  @moduledoc false

  use Application

  alias FootbalEngine

  require Logger

  def start(_type, _args) do

    children = [
      {FootbalEngine, "path_to_file"}
    ]

    opts = [strategy: :one_for_one, name: FootbalInterface.Supervisor]
    # start/2 must return the result of starting the supervisor
    Supervisor.start_link(children, opts)
  end
end

Correct?

This is actually what I have been trying to achieve, except I don’t quite know how to prepare my library to be used like that.

Questions

Is there any documentation or tutorial you could point me to ?
Would my FootbalEngine app still be an OTP application, or would it be a simple library?
Would it still have an application.ex file with use Application?
Would it simply return the Supervisor from the FootbalEngine.new function?

Would really appreciate some guidelines.

LostKobrakai · August 7, 2019, 9:12am

It would be a library without its own supervision tree. It just has module(s), which can be started by other applications. There’s therefore no need for an application.ex in your library.

I’m not sure what supervisor you’d need in your library if FootbalEngine is a genserver.
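
To make that concrete, here is a minimal sketch of a library module that can be dropped into a host application's children list. It relies on the fact that use GenServer defines a default child_spec/1 that calls start_link/1 with the given argument. Sketch.Engine is a hypothetical stand-in for FootbalEngine, and the path/0 helper exists only to show that the argument arrives.

```elixir
defmodule Sketch.Engine do
  # `use GenServer` generates child_spec/1, which is what makes the
  # {Sketch.Engine, arg} tuple usable in a supervisor's children list.
  use GenServer

  def start_link(path) when is_binary(path),
    do: GenServer.start_link(__MODULE__, path, name: __MODULE__)

  # Illustration only: returns the path the process was started with.
  def path, do: GenServer.call(__MODULE__, :path)

  @impl true
  def init(path), do: {:ok, path}

  @impl true
  def handle_call(:path, _from, path), do: {:reply, path, path}
end

# In the host application (for example inside its start/2 callback):
children = [{Sketch.Engine, "path_to_file"}]
{:ok, _sup} = Supervisor.start_link(children, strategy: :one_for_one)
```

The library ships no application.ex and no supervisor of its own; the path is chosen entirely by whoever adds the module to their supervision tree.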