Should I use GenServer for this?

Problem

I’m building a retail store on-demand dispatch system. We have 3 store locations. Incoming delivery orders will be assigned to the nearest store location. Orders will then be grouped into batches based on destination proximity, ETA, etc… Each batch consists of 2-3 orders. One dispatcher will be assigned to each batch.

My Solution

  1. Create 1 GenServer for each store and perform order grouping as orders come in.
  2. Create 1 GenServer for each store and perform order grouping at 2-minute intervals.
  3. Create 1 background job (using Quantum) for each store to perform order grouping at 2-minute intervals.

Solutions #1 and #2 require no reads from the database. If I go with #2, I might as well go with #3.
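
For reference, here is a minimal sketch of what option #2 could look like. The module name `StoreBatcher`, the `:interval` and `:dispatch` options, and the `flush/1` helper are all made up for illustration, and grouping is done purely by arrival order in chunks of up to 3; the real proximity/ETA logic would replace `group/1`.

```elixir
defmodule StoreBatcher do
  @moduledoc "Hypothetical per-store process that groups buffered orders on a timer."
  use GenServer

  @default_interval :timer.minutes(2)
  @max_per_batch 3

  # Client API

  def start_link(opts) do
    # Split registration options (e.g. name: :store_a) from init options.
    {server_opts, init_opts} = Keyword.split(opts, [:name])
    GenServer.start_link(__MODULE__, init_opts, server_opts)
  end

  def add_order(server, order), do: GenServer.cast(server, {:add_order, order})

  # Groups whatever is buffered right now and returns the batches.
  def flush(server), do: GenServer.call(server, :flush)

  # Server callbacks

  @impl true
  def init(opts) do
    interval = Keyword.get(opts, :interval, @default_interval)
    # :dispatch is an assumed callback; by default batches are simply dropped.
    dispatch = Keyword.get(opts, :dispatch, fn _batch -> :ok end)
    schedule(interval)
    {:ok, %{orders: [], interval: interval, dispatch: dispatch}}
  end

  @impl true
  def handle_cast({:add_order, order}, state) do
    {:noreply, %{state | orders: [order | state.orders]}}
  end

  @impl true
  def handle_call(:flush, _from, state) do
    {batches, state} = group(state)
    {:reply, batches, state}
  end

  @impl true
  def handle_info(:tick, state) do
    {batches, state} = group(state)
    Enum.each(batches, state.dispatch)
    schedule(state.interval)
    {:noreply, state}
  end

  # Placeholder grouping: arrival order, at most @max_per_batch orders per batch.
  defp group(state) do
    batches = state.orders |> Enum.reverse() |> Enum.chunk_every(@max_per_batch)
    {batches, %{state | orders: []}}
  end

  defp schedule(interval), do: Process.send_after(self(), :tick, interval)
end
```

Note that the buffered orders live only in process memory here, so anything not yet dispatched is gone if the process crashes.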

My Question

How would you approach this problem using the Elixir way?

I would probably go with option two, but I’m not sure about the actual business requirements. Why do you need to batch? 2-3 orders per 2 minutes doesn’t seem like something you need to batch. How do you send them to the store? HTTP, a database, RabbitMQ?

You also need to think about persistence in the GenServer in case you restart the server.

Sorry for the confusion. “Trip” might be a better word to describe the situation. Dispatchers will deliver 2-3 orders per trip. Each trip must be completed within 30 minutes. So the order-grouping GenServer will wait for orders and group them into trips.

Store managers will retrieve those orders through an app.

I’m thinking of storing all orders in both Postgres and the GenServer. If the server crashes, the GenServer will load ungrouped orders from Postgres on restart.
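
A rough sketch of that recovery step, under some assumptions: `RecoveringBatcher` and the `:load_pending` option are hypothetical names, and the loader is just a zero-arity function standing in for the real Postgres query (for example "orders not yet assigned to a trip"). The load happens in `handle_continue/2` so that `init/1` returns quickly while the process still has its state before it handles any other message.

```elixir
defmodule RecoveringBatcher do
  @moduledoc "Hypothetical GenServer that reloads ungrouped orders whenever it (re)starts."
  use GenServer

  def start_link(opts), do: GenServer.start_link(__MODULE__, opts)

  def pending(server), do: GenServer.call(server, :pending)

  @impl true
  def init(opts) do
    # :load_pending is an assumed option: a function returning the ungrouped orders.
    loader = Keyword.fetch!(opts, :load_pending)
    {:ok, %{orders: []}, {:continue, {:load, loader}}}
  end

  @impl true
  def handle_continue({:load, loader}, state) do
    # Runs right after init/1, before any call or cast is processed.
    {:noreply, %{state | orders: loader.()}}
  end

  @impl true
  def handle_call(:pending, _from, state), do: {:reply, state.orders, state}
end
```

Under a supervisor, the same `init/1` path runs again after a crash, so the process comes back with whatever the loader returns at that moment.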

Hi there, I hope this diagram helps a bit. Each person object represents a GenServer.

I would not use GenServer for this. What happens if the GenServer crashes before its batch is dispatched? Also, running multiple app instances would require your app to know on which node a specific GenServer is running, making it more difficult to scale horizontally. In general, I would avoid singleton GenServers (in your case, the correspondence between store and GenServer) unless there’s a good reason for them.

One possibility would be to save orders in the database and group them “lazily” upon receiving an order. Something like this:

  • Save orders in the database as soon as they are placed, including all the data necessary for batching
  • When a new order is placed, compute its batch. If the batch can be considered complete (for example because the max number of orders per batch is reached), then dispatch it. Otherwise, schedule a job after an appropriate timeout to batch it anyway in case no orders for the same batch were placed within the timeout.
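
The decision in the second step can be sketched as a pure function. This is only an illustration: `LazyBatching`, the 3-order limit, and the 2-minute timeout are assumptions, and `pending` stands for the not-yet-dispatched orders for the same batch as read from the database.

```elixir
defmodule LazyBatching do
  @moduledoc "Hypothetical decision logic for grouping lazily when an order arrives."

  @max_per_batch 3
  @timeout_ms :timer.minutes(2)

  # Called after the new order has been saved. `pending` is the list of
  # earlier, not-yet-dispatched orders that belong to the same batch.
  def on_order_placed(pending, new_order) do
    batch = pending ++ [new_order]

    if length(batch) >= @max_per_batch do
      # Batch is complete: dispatch it right away.
      {:dispatch, Enum.take(batch, @max_per_batch)}
    else
      # Not complete yet: the caller should schedule a job after this timeout.
      {:schedule_flush, @timeout_ms}
    end
  end

  # Called by the scheduled job with whatever is still pending at that time.
  def on_timeout([]), do: :noop
  def on_timeout(pending), do: {:dispatch, pending}
end
```

Because the function only takes data in and returns a decision, any node can run it against the current database state, and the scheduled job becomes a no-op if the batch was already dispatched in the meantime.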

The advantage of this is that all the state stays in the database, so the app can be trivially scaled, and a crash of a GenServer will not lose orders.

Does this make sense, or did I misunderstand your problem?

++ on not using a GenServer.

What’s wrong with reading from the database?

It takes time to build the request, send it, wait for the response, deserialize it.

Especially when the database is not on the same host, those database queries can add hundreds of milliseconds to request processing time.

Especially if there is more than a single query.

Agreed, but the posed problem didn’t seem like those sorts of performance requirements would be a consideration.

Avoiding a query is/should always be part of the considerations.

Avoiding it in the first place is usually much easier than building a system on it and then either having to rewrite it or introducing a cache with all of its problems…

I see the point about minimizing database queries, but I (respectfully) do not agree that avoiding database queries should be the first consideration. Of course one should not make unnecessary queries, but in my experience it’s a lot simpler to optimize on top of a well-modeled database than to tweak an ad-hoc solution designed before it was known what the real bottlenecks are.

My reasoning is the following: keeping the state in the database only and avoiding a single global process for each store is simpler, failsafe, and horizontally scalable. If and when the database becomes a bottleneck, one can optimize differently, but even in that case the single GenServer strategy is probably not the best if performance is the concern.

A single global GenServer would easily become the bottleneck, and when running multiple nodes one would still have to pay the latency cost of reaching the node running the GenServer.

That said, I do not know the performance requirements in this case, and @NobbZ’s point about the performance penalty of using a database is relevant in some demanding cases.

I did not say it’s the first consideration; I said the penalty should always be part of your considerations.

I fully agree that premature optimization is a bad thing, though just using a database, because we do so anyway and might need it later, is probably as bad…

Of course, this is a hairy situation…

If the same data might be needed in another application, which might not even be written in Elixir, I’d prefer a database over a GenServer + exposing an endpoint to request the current value.

Fair enough, and I agree that these considerations are very dependent on the specific requirements.

I just think that, to the original question “Should I use a GenServer for this?” I would reply “no”. I feel that running a single global process is more problematic than using the database.

If performance is a concern, one could cache the batches in Redis or another in-memory datastore fit for the case, but I would do so as a pure optimization layer (if the cache is lost, the batch can be recalculated from the database state). Of course there are ways to implement an equivalent cache with GenServer and ETS, but even then I would do that as an optimization on top of a good database model.
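
To make the "pure optimization layer" idea concrete, here is a minimal read-through cache on ETS. Everything in it is an assumption for illustration: the `BatchCache` module, the `:batch_cache` table name, and the `recompute` function, which stands in for recalculating a store's batches from the database state.

```elixir
defmodule BatchCache do
  @moduledoc "Hypothetical read-through cache; the database stays the source of truth."

  @table :batch_cache

  # Creates the table. The calling process owns it, so in a real app this
  # would be called from a long-lived, supervised process.
  def init do
    :ets.new(@table, [:named_table, :public, :set, read_concurrency: true])
    :ok
  end

  # Returns cached batches for a store, or recomputes and caches them on a miss.
  def fetch(store_id, recompute) when is_function(recompute, 1) do
    case :ets.lookup(@table, store_id) do
      [{^store_id, batches}] ->
        batches

      [] ->
        batches = recompute.(store_id)
        :ets.insert(@table, {store_id, batches})
        batches
    end
  end

  # Call this whenever the store's orders change in the database.
  def invalidate(store_id), do: :ets.delete(@table, store_id)
end
```

If the table disappears (owner crash, node restart), the next `fetch/2` is just a cache miss and the value is rebuilt, so no order can be lost through the cache.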

Finally, since all solutions are contextual, let me make my assumptions explicit. I assumed that:

  • The expected number of orders per minute is low enough to be saved in the database
  • It’s more important to avoid losing an order than to dispatch it in sub-second time
  • Running the app on more than one node is desirable (for high availability or scaling)

If these assumptions are wrong, my proposed solution needs to be reconsidered.