Resilient and battle-tested event store with PostgreSQL

  1. We need to store a lot of events in a PostgreSQL database.
    There can be spikes sometimes, so the rate of inserts can overload the database (up to 2000/sec).

These inserts are made in many places in our codebase, during important tasks, so an insert request should not block.
I mean an EventStore.add(...) should not hang and lead to a GenServer timeout on the caller side.

defmodule Work do
  use GenServer

  def handle_call(...) do
     ...
     EventStore.add(...)
     ...
  end
end

Of course, we cannot afford to lose inserts, but we can tolerate a delay (it does not need to be real-time).
In your opinion, what is the best architecture for this?

  2. We also need to handle the case of the database crashing or not responding, maybe by enqueuing inserts in RAM/on disk and dequeuing them once the database is up again.
    In your opinion, what is the best solution for this?

Communication with DB

Batching

I understand that your main objective is not to lose information. So, IMO, the best way to do this is to send the information to the DB in batches. You would have a queue that holds data in RAM and, every X seconds, writes it to the DB.

This approach has the benefit that it will be harder to overload the DB with a billion requests per minute, but it means that if your machine crashes, you lose the data in the queue (because it is in RAM). A possible solution would be to use DETS for this (write changes to disk) and clean the DETS table once the writes are done.
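For illustration, a minimal sketch of such a batching process could look like this. It assumes a hypothetical EventStore.add_batch/1 that inserts a list of events in one query; the module name and flush interval are made up:

defmodule EventBatcher do
  use GenServer

  @flush_interval :timer.seconds(5)

  def start_link(_opts), do: GenServer.start_link(__MODULE__, [], name: __MODULE__)

  # Callers enqueue and return immediately; nothing here blocks on the DB.
  def add(event), do: GenServer.cast(__MODULE__, {:add, event})

  def init(events) do
    schedule_flush()
    {:ok, events}
  end

  def handle_cast({:add, event}, events), do: {:noreply, [event | events]}

  def handle_info(:flush, []) do
    schedule_flush()
    {:noreply, []}
  end

  def handle_info(:flush, events) do
    # Write the whole buffer in one batch, oldest event first.
    EventStore.add_batch(Enum.reverse(events))
    schedule_flush()
    {:noreply, []}
  end

  defp schedule_flush, do: Process.send_after(self(), :flush, @flush_interval)
end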

Pooling

Another approach would be to have a pool of workers that can write to the DB. If a process wants to write and there are no free workers, it has to wait. The danger here is: what happens if I can't wait indefinitely? That depends on your application.
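As a rough sketch of the pooling idea with poolboy (the pool name, the worker's {:insert, event} protocol and the timeout are made up for the example):

# Check out a worker, let it do the insert, and check it back in.
# With :infinity the caller waits until a worker is free.
:poolboy.transaction(
  :event_writers,
  fn worker -> GenServer.call(worker, {:insert, event}) end,
  :infinity
)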

Personal opinion

I would likely use a combination of the previous two approaches: an infinite wait time for the process that needs a worker to write to the DB, while making sure that waiting processes are added to a queue. HTTPoison, an HTTP library, uses this approach via hackney.

What if the DB crashes?

DB crashes are hell. The only way to make sure your data is safe is to store it on disk. However, the longer the DB is down, the less space your production machines will have, which means they will eventually break down due to lack of space (the same happens with RAM). To avoid this I recommend a cleaner process that runs permanently and deletes data older than X (seconds/minutes/etc.) from your DETS tables or files.
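As an illustration, a cleaner along those lines could look roughly like this, assuming the buffered events live in a DETS table of {id, event, inserted_at} tuples (the table name, TTL and interval are invented):

defmodule BufferCleaner do
  use GenServer

  @table :event_buffer
  @ttl_seconds 3_600
  @interval :timer.minutes(1)

  def start_link(_opts), do: GenServer.start_link(__MODULE__, :ok, name: __MODULE__)

  def init(:ok) do
    {:ok, _} = :dets.open_file(@table, file: ~c"event_buffer.dets", type: :set)
    schedule()
    {:ok, nil}
  end

  def handle_info(:clean, state) do
    cutoff = System.system_time(:second) - @ttl_seconds

    # Delete every buffered event older than the cutoff.
    :dets.select_delete(@table, [{{:_, :_, :"$1"}, [{:<, :"$1", cutoff}], [true]}])

    schedule()
    {:noreply, state}
  end

  defp schedule, do: Process.send_after(self(), :clean, @interval)
end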

Other than that there is really not much you can do, unless you are willing to stream data into your DB via queues (Apache Kafka, RabbitMQ, etc.).


Hope it helps!


Interesting @Fl4m3Ph03n1x!

If I'm not wrong, when I read the question it's quite clear that the insertion into the DB should not block the caller process. So I am not sure your pooling solution with the indefinite wait will work.

I like the batching solution! What about using GenStage instead of building a pool of GenServers for this feature?
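For reference, a buffering producer/consumer pair with GenStage could look roughly like this (module names and the hypothetical EventStore.add_batch/1 are assumptions; the producer keeps events in a queue and only dispatches as much as the consumer demands):

defmodule EventProducer do
  use GenStage

  def start_link(_opts), do: GenStage.start_link(__MODULE__, :ok, name: __MODULE__)

  # Callers enqueue and return immediately.
  def add(event), do: GenStage.cast(__MODULE__, {:add, event})

  def init(:ok), do: {:producer, {:queue.new(), 0}}

  def handle_cast({:add, event}, {queue, demand}) do
    dispatch(:queue.in(event, queue), demand, [])
  end

  def handle_demand(incoming, {queue, demand}) do
    dispatch(queue, demand + incoming, [])
  end

  # Emit as many buffered events as the consumer has asked for, no more.
  defp dispatch(queue, 0, events), do: {:noreply, Enum.reverse(events), {queue, 0}}

  defp dispatch(queue, demand, events) do
    case :queue.out(queue) do
      {{:value, event}, queue} -> dispatch(queue, demand - 1, [event | events])
      {:empty, queue} -> {:noreply, Enum.reverse(events), {queue, demand}}
    end
  end
end

defmodule EventConsumer do
  use GenStage

  def start_link(_opts), do: GenStage.start_link(__MODULE__, :ok)

  def init(:ok), do: {:consumer, :ok, subscribe_to: [{EventProducer, max_demand: 500}]}

  def handle_events(events, _from, state) do
    # Hypothetical batch insert into Postgres.
    EventStore.add_batch(events)
    {:noreply, [], state}
  end
end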

Have you considered using a purpose-built event store, such as Greg Young's Event Store?

“Store at around 15,000 writes per second and 50,000 reads per second!”
https://eventstore.org/

I’ve written an Elixir Event Store using Postgres for persistence. The performance when running the benchmark suite on my laptop is 4,929 events/sec for a single writer and 8,586 events/sec for 50 concurrent writers.

To persist events without blocking you could do:

Task.start(fn ->
  EventStore.append_to_stream(stream_uuid, :any_version, events, :infinity)
end)

Alternatively you could send the GenServer process a message to store the events outside of the request:

defmodule Work do
  use GenServer

  def handle_call(...) do
    send(self(), {:persist_events, events})
    # ...
  end

  def handle_info({:persist_events, events}, state) do
    EventStore.append_to_stream(stream_uuid, :any_version, events, :infinity)
    # ...
  end
end

You could even use an approach with two GenServers: one to accept requests (e.g. "store these events") and forward them to a second process that actually persists them to the event store. This allows the first GenServer to reply immediately, without any blocking, since it offloads the work to another process.
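A bare-bones sketch of that idea (module names and the stream UUID are placeholders; the EventStore call is the one shown above):

defmodule EventIngress do
  use GenServer

  def start_link(_opts), do: GenServer.start_link(__MODULE__, :ok, name: __MODULE__)

  def store(events), do: GenServer.call(__MODULE__, {:store, events})

  def init(:ok), do: {:ok, nil}

  def handle_call({:store, events}, _from, state) do
    # Hand the events off and reply immediately; the caller never waits on the DB.
    GenServer.cast(EventPersister, {:persist, events})
    {:reply, :ok, state}
  end
end

defmodule EventPersister do
  use GenServer

  def start_link(_opts), do: GenServer.start_link(__MODULE__, :ok, name: __MODULE__)

  def init(:ok), do: {:ok, nil}

  def handle_cast({:persist, events}, state) do
    EventStore.append_to_stream("some-stream-uuid", :any_version, events, :infinity)
    {:noreply, state}
  end
end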

How do you deal with storing events when the database is inaccessible? You could push them onto a queue and have one or more consumers taking events from the queue and writing them to the event store. However the same problem applies when the queue is unavailable.


Interesting!

By using your implementation with:

Task.start(fn ->
  EventStore.append_to_stream(stream_uuid, :any_version, events, :infinity)
end)

You will inevitably spawn thousands, or hundreds of thousands, of processes depending on the number of caller processes. I think the second approach is more controllable! WDYT?

If you can't lose any event (but you don't need real-time) and the spikes can cause timeouts, I would put something like RabbitMQ or Kafka between the application and the database.

app event --> kafka/rabbit --> event importer --> postgres

RabbitMQ/Kafka should handle the spikes much better, and they can act as a durable buffer. If the importer (which takes the events from RabbitMQ/Kafka and stores them in Postgres) hits a timeout and crashes, or even if Postgres locks up, the event is not lost and will be reprocessed once the importer or Postgres is healthy again.
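As a rough sketch of the importer side using the amqp library (the queue name is made up, decode/1 assumes events were published with :erlang.term_to_binary/1, and EventStore.add/1 is assumed to return :ok on success); the key point is that the message is only acked after the insert succeeds, so nothing is lost if Postgres is down:

defmodule EventImporter do
  use GenServer

  def start_link(_opts), do: GenServer.start_link(__MODULE__, :ok, name: __MODULE__)

  def init(:ok) do
    {:ok, conn} = AMQP.Connection.open()
    {:ok, chan} = AMQP.Channel.open(conn)
    {:ok, _} = AMQP.Queue.declare(chan, "app_events", durable: true)
    {:ok, _consumer_tag} = AMQP.Basic.consume(chan, "app_events")
    {:ok, chan}
  end

  def handle_info({:basic_consume_ok, _meta}, chan), do: {:noreply, chan}

  def handle_info({:basic_deliver, payload, %{delivery_tag: tag}}, chan) do
    # Insert first, ack only on success; if this process crashes,
    # the unacked message is redelivered by RabbitMQ.
    :ok = EventStore.add(decode(payload))
    :ok = AMQP.Basic.ack(chan, tag)
    {:noreply, chan}
  end

  defp decode(payload), do: :erlang.binary_to_term(payload)
end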


It depends on whether it's better to have thousands of processes doing one thing, or one process with a mailbox containing thousands of messages. The only way to reliably answer that question is to try both and benchmark the performance.


Great observation. In our production code we actually get around this issue by making our main process create a child that does all the work:

  1. asks for a worker
  2. gives the worker a job to do
  3. waits for a response and logs results

We use Task.start for two reasons:

  1. We don't care whether the child succeeds or not; this will be logged
  2. We don't want our main process to crash or handle extra work merely because 1 of its 1000 children failed to get a worker and died

Your mileage may vary :smiley:

As for GenStage, I would probably skip it and move directly to Flow, which has a back-pressure mechanism built in to make sure nothing gets overwhelmed, and which uses GenStage under the hood. Basically, it would keep you from re-inventing the wheel.
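A rough sketch of what that could look like, assuming a GenStage producer such as the EventProducer sketched earlier in the thread (max_demand and the EventStore.add/1 call are placeholders):

# Run the flow as its own process, pulling events from the producer
# with back-pressure and writing each one to the store.
{:ok, _pid} =
  [EventProducer]
  |> Flow.from_stages(max_demand: 500)
  |> Flow.map(&EventStore.add/1)
  |> Flow.start_link()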


Yes, and that is not an issue. You mentioned you do up to 2000 ops/sec. Assuming that a single DB write takes 3 seconds to save (which is a long time), you would have at most around 6000 Tasks active at any given second. Hell, let's go crazy and say that, thanks to some overhead, you would have 10_000 Tasks (let's say the cost of creating and killing Tasks is high). At that rate you would be using less than 7% of the total number of processes you can create with Elixir (and I am being pessimistic here; the real numbers would be lower, IIRC).

Not a problem basically, the way I see it.