Design pattern for a genserver that needs to get state from the database

apog · September 5, 2018, 2:40pm

I’ve been making a basic todo-list application to learn Phoenix, and I’ve been looking through the forums for answers on how to properly design a Phoenix application. The consensus seems to be to only make calls to the database through Repo in either the controller or some separate service object dedicated to that task. But there doesn’t seem to be a consensus on a design pattern for getting such data to other modules that need it.

In my case I have a genserver (called Todo.Server) that has a specific name and keeps its own list of todo-items in its state. It’s started from a dynamic supervisor. My issue comes in here: when Todo.Server is started I check the database for its initial state. When an entry is added I have it persist to the database as well as update its local state.

From what I’ve been seeing this is a bad practice for a Phoenix app, and it sounds like the convention is to have that initial state passed in from a higher level. But this makes things more complex. I would then have to first find out if a server with the given name is already running (since it only reads from the database on startup, otherwise it uses its local state), and if not then make a call to Repo to get the data, and then pass it all the way down to the server. And where would that logic live? That logic seems very out of place for a controller. Does this mean I should make a separate service object responsible for getting the data needed for the dynamic supervisor and the Todo servers and pass it to them through the controller?

I’ve been having a very difficult time figuring out how to properly handle such a situation in phoenix, any help would be greatly appreciated!

peerreynders · September 5, 2018, 2:47pm

If I understand you correctly it sounds like Ecto was installed as part of the Phoenix project setup.

The “pattern” that you seem to be looking for would make Ecto part of the Todo application - not Phoenix.

The “pattern” is demonstrated in

hangman is the equivalent to your Todo application. gallows is the Phoenix based web interface for hangman. Now hangman doesn’t use a database - but if it did Ecto would be part of hangman - not gallows.

Therefore your Todo application should be designed to use Ecto (or whatever other persistent storage you use) even before Phoenix gets involved.

You list Elixir in Action 2e as one of your books. If you look at the Chapter 11 example you have:

A “Database”.
The Todo.Server which directly depends on the “Database”.
And finally the Plug-based web server which depends directly on the Todo.Server (and Todo.Cache).
Meanwhile application.ex/system.ex are responsible for starting everything up.
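
The dependency chain in this list, together with the "is a server with this name already running?" check from the original question, can be sketched in a single script. This is only a minimal sketch under stated assumptions: every `Sketch.*` module is a hypothetical name, and `Sketch.Database` is an in-memory Agent standing in for a file- or Ecto-backed store. It is not the book's code.

```elixir
# Hypothetical stand-in for the "Database": an Agent keyed by list name.
defmodule Sketch.Database do
  use Agent

  def start_link(_opts), do: Agent.start_link(fn -> %{} end, name: __MODULE__)
  def get(name), do: Agent.get(__MODULE__, &Map.get(&1, name, []))
  def store(name, entries), do: Agent.update(__MODULE__, &Map.put(&1, name, entries))
end

# The server depends directly on the database: it loads on start, persists on write.
defmodule Sketch.Server do
  use GenServer, restart: :temporary

  def start_link(name), do: GenServer.start_link(__MODULE__, name, name: via(name))
  def add_entry(pid, entry), do: GenServer.call(pid, {:add_entry, entry})
  def entries(pid), do: GenServer.call(pid, :entries)

  defp via(name), do: {:via, Registry, {Sketch.Registry, name}}

  @impl GenServer
  def init(name), do: {:ok, {name, Sketch.Database.get(name)}}

  @impl GenServer
  def handle_call({:add_entry, entry}, _from, {name, entries}) do
    entries = [entry | entries]
    Sketch.Database.store(name, entries)
    {:reply, :ok, {name, entries}}
  end

  def handle_call(:entries, _from, {_name, entries} = state) do
    {:reply, entries, state}
  end
end

# Callers never check whether a server exists; they just ask for it.
defmodule Sketch.Cache do
  def server_process(name) do
    case DynamicSupervisor.start_child(Sketch.ServerSupervisor, {Sketch.Server, name}) do
      {:ok, pid} -> pid
      {:error, {:already_started, pid}} -> pid
    end
  end
end

# Plays the role of application.ex: start everything in dependency order.
{:ok, _sup} =
  Supervisor.start_link(
    [
      {Registry, keys: :unique, name: Sketch.Registry},
      Sketch.Database,
      {DynamicSupervisor, name: Sketch.ServerSupervisor, strategy: :one_for_one}
    ],
    strategy: :one_for_one
  )

pid = Sketch.Cache.server_process("bob")
:ok = Sketch.Server.add_entry(pid, "buy milk")
IO.inspect(Sketch.Server.entries(pid))
```

A web layer (Plug or Phoenix) would only ever call `Sketch.Cache.server_process/1` and the `Sketch.Server` functions; the start-or-reuse decision and the database read stay inside this layer.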

Now the big difference here is that it’s organized as one single Mix project. The gallows/hangman approach would organize Todo.Server and Todo.Database in a separate OTP application (i.e. separate Mix project) that can then be used as a dependency for Todo.Web (in a different Mix project).

Similarly a Phoenix application could simply use a “Todo OTP Application” (that uses Ecto internally) as a dependency without directly getting Ecto involved. Meanwhile the Phoenix project acts as “the application” that starts everything up but the “Todo Application” is responsible for managing its persistent storage (e.g. through Ecto or possibly yet another OTP application).

However that is probably the most complicated way of using Phoenix.

You can build “Phoenix is your application” style applications where simply each request to the web server initiates some interaction with the database that results in a response.
The next level of refinement is to use “Phoenix contexts” - i.e. organizing code into domain/business (i.e. context) specific modules rather than simply leaving all the code in the various controllers.
For even better separation there are umbrella projects, which allow multiple OTP applications to run under the same configuration.
Finally the gallows/hangman approach which relies on bare path dependencies (which maximizes decoupling but makes many things less convenient (tradeoffs …)).
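
As a rough illustration of the last option, the web project's `mix.exs` could point at the domain project through a path dependency. The project names (`:todo_web`, `:todo`) and the sibling directory layout are assumptions made for this sketch, and the Phoenix dependencies are left out.

```elixir
# todo_web/mix.exs (hypothetical project and directory names)
defmodule TodoWeb.MixProject do
  use Mix.Project

  def project do
    [
      app: :todo_web,
      version: "0.1.0",
      deps: deps()
    ]
  end

  defp deps do
    [
      # The domain application lives in a sibling directory and brings its
      # own persistence dependencies (e.g. Ecto) with it.
      {:todo, path: "../todo"}
    ]
  end
end
```

With this layout the web project does not list Ecto itself; it only sees whatever public functions the `:todo` application exposes.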

kokolegorille · September 5, 2018, 3:14pm

There is nothing wrong with doing a todo list in Phoenix without using gen_server, for example using data from db.

IIRC there is no ecto involved in the todo list of Elixir in Action.

In case You want to do both… for example having a gen_server loading state from db, there is a recommendation: try to have the quickest init possible. You can achieve this like so.

  @impl GenServer
  def init(args) do
    send(self(), {:set_state, args})
    # Do not use timeout here, it will be sent by set_state
    {:ok, fresh_state(args)}
  end

  # Initialize handler, separate from init for fast init unlock.
  @impl GenServer
  def handle_info({:set_state, args}, state) do

    # Do the loading here! You might return state from db queries

    {:noreply, state, @timeout}
  end

What would be a service object in FP?

david_ex · September 5, 2018, 4:04pm

Note that if you’re using OTP >= 21, you can use handle_continue to avoid race conditions when deferring your initialization:

  def init(args) do
    state = fresh_state(args)
    # Do not use timeout here, it will be set by handle_continue
    {:ok, state, {:continue, {:init_state, args}}}
  end

  # Initialize handler, separate from init for fast init unlock.
  def handle_continue({:init_state, args}, state) do

    # Do the loading here! You might return state from db queries

    {:noreply, state, @timeout}
  end

The advantage of this is that when using named processes, it prevents a message from being processed after init finished, but before the :set_state message is handled. Using continue will ensure the code to finish initialization is run before accepting a new message from the mailbox.
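
A small self-contained sketch of that guarantee, assuming OTP >= 21 and Elixir >= 1.7. `Sketch.LazyServer` and the `slow_loader` function are hypothetical; the loader stands in for a database query.

```elixir
defmodule Sketch.LazyServer do
  use GenServer

  def start_link(loader), do: GenServer.start_link(__MODULE__, loader)
  def entries(pid), do: GenServer.call(pid, :entries)

  @impl GenServer
  def init(loader) do
    # Return immediately with a placeholder; the slow load runs in handle_continue/2.
    {:ok, :not_loaded, {:continue, {:load, loader}}}
  end

  @impl GenServer
  def handle_continue({:load, loader}, :not_loaded) do
    # Runs right after init/1, before any message is taken from the mailbox.
    {:noreply, loader.()}
  end

  @impl GenServer
  def handle_call(:entries, _from, entries), do: {:reply, entries, entries}
end

# Simulated slow database read (assumption: 50 ms is "slow" for the demo).
slow_loader = fn ->
  Process.sleep(50)
  ["buy milk"]
end

{:ok, pid} = Sketch.LazyServer.start_link(slow_loader)

# start_link/1 has already returned, yet this call is only served after the load finishes.
IO.inspect(Sketch.LazyServer.entries(pid))
```

The caller never observes the `:not_loaded` placeholder, which is the case the `send(self(), ...)` variant cannot rule out for a named process.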

More info here.

Note that if you want to be able to @impl ... the handle_continue/2 function, you need to have Elixir >= 1.7

apog · September 5, 2018, 4:17pm

Thank you for all of the feedback! Yeah, my todo application is based off of the one from Elixir in Action 2e and I was trying to modify it to make use of the Phoenix framework. But it looks like I am currently building Phoenix as my application and I need to simply view it as a web interface (this is very new to me coming from a Rails background) and keep the Todo app as its own separate thing. If I am understanding what you are saying, there is nothing wrong with me making a call to Repo from directly within my Todo.Server genserver rather than having that be passed in?

apog · September 5, 2018, 4:21pm

The gen_server was for efficiency since after it’s started I can get data from its state rather than hitting the db every time. And yeah Elixir in Action just uses file IO as the database, but I modified the project to see how adding a relational database would work. And I should have said ‘service module’ instead of ‘object’. What I meant was a module dedicated to making calls to Repo for data. (i.e. if I had some complicated query for getting a combination of lists, it could live in the service module rather than the query happening directly in the controller)

kokolegorille · September 5, 2018, 4:25pm

Which is what contexts are made for

apog · September 5, 2018, 4:35pm

ah, i’m still getting the terminology down. I think contexts are what I mean. So I guess my question boiled down to whether I should get the data from within a context and pass it to the genserver through the supervisor, or if it’s okay to just get the data from directly within a genserver. It’s also just confusing that the built in generators for phoenix go against the recommended design of an application. For example, based on what peerreynders said above, I wouldn’t want any of this to live in the phoenix app and so the phx.gen.context command would actually be guiding me in the wrong direction.

kokolegorille · September 5, 2018, 4:39pm

Not really, it creates the context in the module connected to ecto.

If You create an app, You will have app, and app_web. And contexts are generated app side.

Phoenix still is an interface for your application, separated from your business logic.

peerreynders · September 5, 2018, 4:43pm

It depends a bit on the architectural style that you are using.

“Phoenix is your Application” (kinda “Rails-style”) wouldn’t bother with caching the todo list in a process and would interact straight with the database. In memory caching isn’t always a total win (unless the data is entirely ephemeral).
The Elixir In Action 2e “Database” uses the file system - but for all Todo.Server knows it could be using Ecto/PostgreSQL. With that in mind there is some value in hiding the details from Todo.Server behind a Todo.Database module which is the only one who knows about Ecto, the Repo and the queries. Most people are not willing to go to that extreme as it cuts them off from the functionality in Ecto.Changeset for data validation.
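
One way to picture the second point is a storage behaviour with interchangeable implementations. The following is a sketch under assumptions, not the book's `Todo.Database`: the `Sketch.*` names, the `:sketch` application key and the in-memory Agent implementation are all hypothetical. An Ecto-backed module implementing the same two callbacks would be the only place that knows about the Repo and its queries.

```elixir
defmodule Sketch.Storage do
  # The only contract the rest of the code relies on.
  @callback load(list_name :: String.t()) :: [String.t()]
  @callback save(list_name :: String.t(), entries :: [String.t()]) :: :ok
end

defmodule Sketch.Storage.InMemory do
  @behaviour Sketch.Storage
  use Agent

  def start_link(_opts), do: Agent.start_link(fn -> %{} end, name: __MODULE__)

  @impl Sketch.Storage
  def load(list_name), do: Agent.get(__MODULE__, &Map.get(&1, list_name, []))

  @impl Sketch.Storage
  def save(list_name, entries), do: Agent.update(__MODULE__, &Map.put(&1, list_name, entries))
end

defmodule Sketch.TodoList do
  # The implementation is looked up from config; the default is the in-memory one.
  defp storage, do: Application.get_env(:sketch, :storage, Sketch.Storage.InMemory)

  def add(list_name, entry) do
    entries = storage().load(list_name) ++ [entry]
    :ok = storage().save(list_name, entries)
    entries
  end
end

{:ok, _pid} = Sketch.Storage.InMemory.start_link([])
IO.inspect(Sketch.TodoList.add("bob", "buy milk"))
```

Note what the contract costs: it exchanges plain lists of strings, so nothing on the caller's side can lean on `Ecto.Changeset`, which is the tradeoff mentioned above.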

apog · September 5, 2018, 4:58pm

Well part of my question is trying to figure out what architectural style to use. I was hoping there was some sort of convention in the Phoenix community around where to access the database and what the general structure should be.

For your second bullet, why wouldn’t you still be able to use Ecto.Changeset for data validation? The database module would still check the validity of the data before calling repo to persist.

peerreynders · September 5, 2018, 5:51pm

Programmers know the benefit of everything and the tradeoffs of nothing.

Phoenix gives you a range of choices and the different choices are about different sets of tradeoffs - so it is ultimately up to the developer to choose the tradeoffs that are most beneficial to the situation at hand.

“Phoenix is your application” is easy to learn and a fast initial development style; i.e. has a short time-to-initial-success but tends to sacrifice maintainability.
“Phoenix contexts” and “umbrella projects” attempt to improve maintainability at the cost of slowing you down with refactoring to maintain the appropriate level of separation and boundaries, which tend to introduce a bit more code and possibly complexity required for the improved decoupling.
Bare path dependencies maximize decoupling (if done correctly) but also introduce more overhead.

why wouldn’t you still be able to use Ecto.Changeset for data validation?

Because the idea behind Ecto hiding behind Todo.Database is to not let the fact that Ecto is being used leak out - otherwise what is the point in encapsulating it, it is supposed to be an implementation detail.

That being said there are plans to move the SQL and migration functionality into separate packages so that core Ecto is about data, not databases. However even having a canonical data schema throughout the entire system can lead to problems with unnecessary coupling.

MrDoops · September 5, 2018, 6:32pm

Say you go the path of using a GenServer to handle your commands and mutate the state of your entity (e.g. %Todo{}). The reason for doing so is to decouple your business logic of managing Todos from the database implementation details. (There’s a good discussion here).

If you use Ecto.Changeset and/or Ecto.Schema in your GenServer, or the state uses the database schema / changesets you lose that decoupling.

Here’s a rule of thumb: if your entities are simple like a Todo or a Post where pretty much all the logic is captured by create, read, update, and delete, just use Ecto to represent your entity. If CRUD is fine, build your functions and changesets as you see in the Phoenix/Ecto docs. Think about your context boundaries and you’ll be served well.

But what if your entity is not simply represented by CRUD? Here are some things to consider using other patterns like you described:

Run-time characteristics: Maybe your entity encapsulates a business process. It has a Status field. You’ve got state-machine use cases to consider. You can still use CRUD here, but you’ll need more domain specific commands than CRUD.
You have a persistence layer that isn’t as nice as Ecto. Your persistence/query layers are nasty enough you plan to switch out the database later. You really don’t want to do anything else other than change a config to switch the persistence strategy you’re using.
Your business logic calls for a representation of state that is different from what your well-designed relational database looks like.
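
To make the first consideration concrete, here is a minimal sketch of domain-specific commands guarding a status field. The `Sketch.Todo` struct, its three statuses and the error tuple shape are assumptions for illustration only.

```elixir
defmodule Sketch.Todo do
  defstruct [:title, status: :open]

  # Each command allows exactly one transition and rejects everything else.
  def start(%__MODULE__{status: :open} = todo), do: {:ok, %{todo | status: :in_progress}}
  def start(%__MODULE__{status: status}), do: {:error, {:invalid_transition, status, :in_progress}}

  def complete(%__MODULE__{status: :in_progress} = todo), do: {:ok, %{todo | status: :done}}
  def complete(%__MODULE__{status: status}), do: {:error, {:invalid_transition, status, :done}}
end

todo = %Sketch.Todo{title: "Write report"}

# Completing an item that was never started is rejected.
IO.inspect(Sketch.Todo.complete(todo))

{:ok, started} = Sketch.Todo.start(todo)
IO.inspect(Sketch.Todo.complete(started))
```

A plain "update" would happily write any status; the named commands are where the business rule lives, whether they are called from a GenServer or from a context function.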

Like what @peerreynders said, it’s about trade-offs and knowing what they are and when to use them. Using a behavior to decouple your database in a config is extra development. Using a GenServer as a command handler and persisting when necessary also has overhead in development time and complexity. In many cases the trade-offs aren’t worth it.

apog · September 5, 2018, 8:12pm

This is essentially what I was trying to figure out. I don’t know what the boundaries are supposed to be and wanted to avoid any code-smells around how I was using my ecto repo. But it sounds like what I am doing now (calling the database through repo from within my genserver) is an acceptable thing to do, but that there are other high level design choices I should be aware of (like umbrella apps, and what @peerreynders calls bare path dependencies).

dimitarvp · September 22, 2018, 1:42pm

IMO accepting Ecto as a dependency even in your domain modules is quite fine. In my eyes the point is not to achieve 100% isolation; the point is to achieve agnosticism about storage details in your higher-level code.

I have successfully used Ecto.Changeset and Ecto.Multi to carry around changes, validation and an accumulated transaction (to be executed later when a certain user workflow finishes, say when finalizing a cart and actually making an order) in my business methods. The idea of those business methods for me was to not care what queries, updates and inserts are needed to get the job done, and not that underneath there might be no DB at all. This is further supported by the fact that you can use Ecto.Changeset without any database if you so desire. (As @peerreynders mentioned, Ecto will be split in two: a DB-specific and a DB-agnostic library.)
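
The point about using `Ecto.Changeset` without a database can be sketched with a schemaless changeset. Assumptions: Elixir >= 1.12 for `Mix.install/1`, network access to fetch Ecto, and a hypothetical `Sketch.TodoParams` module with made-up fields and limits. No Repo or schema module is defined anywhere.

```elixir
# Assumes Elixir >= 1.12 (for Mix.install/1) and network access to fetch Ecto.
Mix.install([{:ecto, "~> 3.0"}])

defmodule Sketch.TodoParams do
  import Ecto.Changeset

  # Field types for a schemaless changeset; no Ecto.Schema or Repo is involved.
  @types %{title: :string, done: :boolean}
  @defaults %{title: nil, done: false}

  def validate(params) do
    {@defaults, @types}
    |> cast(params, Map.keys(@types))
    |> validate_required([:title])
    |> validate_length(:title, max: 80)
    |> apply_action(:insert)
  end
end

# Valid input comes back as a plain map, invalid input as an error changeset.
IO.inspect(Sketch.TodoParams.validate(%{"title" => "Buy milk"}))
IO.inspect(Sketch.TodoParams.validate(%{"title" => ""}))
```

The result of `validate/1` can be handed to whatever storage code exists, or to none at all.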

And finally, if you have a business method that uses your storage-knowing methods and only depends on Ecto.Changeset and Ecto.Multi, decoupling those is a minor-to-moderate refactoring effort. And that would only ever be necessary if your app(s) evolve to the point of needing fully storage-agnostic code; not something you see every day (most corporations would sooner finance a Mars exploration campaign than switch away from their DB engine of choice). It’s a good tradeoff between productivity now and less refactoring pain later.

Fixating too much on academic purity can and will hurt productivity and velocity. Do what your future self will thank you for.