Pachyderm - Virtual actor framework for durable, globally unique processes

Summary

Durable, globally unique processes (Entities) are created by controlling side effects.
(In short, side effects are deferred until the entity's state change has been committed to storage.)

The purpose of an entity is to handle data that must be strongly consistent within a given execution context.

Example (as of 0.2.0)

Entity behaviour

defmodule MyApp.Counter do
  @behaviour Pachyderm.Entity

  alias MyApp.Counter.{Increment, ...}
  alias MyApp.Counter.{Increased, ...}

  def init() do
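    # A brand new counter starts with a count of zero.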
    %{count: 0}
  end

  def handle(%Increment{}, %{count: count}) do
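    # Decide which events to record. Side effects are returned as data and
    # only executed after the events have been committed to storage.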
    events = [%Increased{amount: 1}]

    if count == 9 do
      effects = [{MyApp.AdminMailer, %{threshold: 10}}]
      {:ok, {events, effects}}
    else
      {:ok, events}
    end
  end

  def update(%Increased{amount: amount}, state = %{count: current}) do
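    # Apply a committed event to the entity's state.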
    %{state | count: current + amount}
  end
end

Calling Entities

type = MyApp.Counter
id = UUID.uuid4()
reference = {type, id}

{:ok, state} = Pachyderm.call(reference, %Increment{})
# => {:ok, %{count: 1}}
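For illustration, a second counter can be addressed with a fresh id; each reference routes to its own durable state (a sketch assuming only the call API shown above):

other = {MyApp.Counter, UUID.uuid4()}

{:ok, %{count: 1}} = Pachyderm.call(other, %Increment{})
{:ok, %{count: 2}} = Pachyderm.call(other, %Increment{})

# The original counter is unaffected by calls to `other`.
{:ok, %{count: 2}} = Pachyderm.call(reference, %Increment{})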

Design notes

A detailed discussion of why this API was chosen, along with further examples, is available in the project README.

Why

The single global process is very easy to reason about, but it has many pitfalls (see “The dangers of the single global process”).
Pachyderm tries to rescue this simple model by offering a standard way of managing some of those pitfalls.

Prior Art

Describing this project is quite difficult because there are many related but distinct approaches to this problem.

Other terminology includes:


Very cool!

What is your approach in the face of netsplits?
EDIT: I’ve read through the README, and it seems, if I interpret the explanation in the deferred side-effects section correctly, that Pachyderm defers netsplit handling to the storage layer: the first event to be ‘committed to storage’ wins, and other events that were based on a stale state are then dropped (though you can possibly retry them manually).

‘Committed to storage’ is not a very clear thing to say. What is the storage? What does it mean to be committed to it?
If the storage is distributed, then multiple nodes might each save something thinking they were first. On the other hand, if the storage lives in a single place, you create a single point of failure and a bottleneck (since all events go through this storage).


It says it is based on Orleans, so it is hopefully a safe assumption that it handles storage similarly. If so, then when it saves state it writes an etag alongside the state of a grain, and on the write it compares the etag it read in at start against the one currently in the store. If the etags don’t match, the write fails.

In SQL this looks like https://github.com/erleans/erleans/blob/master/priv/blob_provider.sql#L60 for an update. This is from my project Erleans, which is also based on Orleans.
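For anyone who prefers Elixir to SQL, a minimal sketch of the same etag-style optimistic write, assuming a Postgrex connection and a hypothetical grains table with id, etag and state columns (illustrative only, not Erleans’ actual code):

defmodule EtagStore do
  # Write `new_state` only if nothing has been written since `expected_etag` was read.
  def save(conn, id, expected_etag, new_state) do
    new_etag = :erlang.unique_integer([:positive, :monotonic])

    {:ok, result} =
      Postgrex.query(
        conn,
        "UPDATE grains SET state = $1, etag = $2 WHERE id = $3 AND etag = $4",
        [new_state, new_etag, id, expected_etag]
      )

    case result.num_rows do
      1 -> {:ok, new_etag}
      # No row matched: another writer got there first, so the caller must
      # re-read the state and retry (or give up).
      0 -> {:error, :etag_mismatch}
    end
  end
end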


‘Committed to storage’ depends on the storage being used. At the moment it means that a database transaction has completed.

However, an append-only log can be implemented in multiple ways; it could be written to Kafka, for example.

“If the storage is distributed, then multiple nodes might each save something thinking they were first”

That would be a bug in the append-only log. The log must offer a strongly consistent API, and only consider a write to have succeeded once it is sure that its write is definitely the next entry in the log.
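A log behaviour with that contract might look something like this (a hypothetical sketch, not Pachyderm’s actual storage API): a conditional append that only succeeds when the writer’s expected position really is the current end of the log.

defmodule EventLog do
  @type entity_ref :: {module(), String.t()}

  # Append events for an entity, expecting the log to currently end at
  # `expected_version`. Implementations must return {:error, :conflict}
  # if any other write has already taken that position.
  @callback append(entity_ref(), expected_version :: non_neg_integer(), events :: [term()]) ::
              {:ok, new_version :: non_neg_integer()} | {:error, :conflict}

  # Read all committed events for an entity, in order.
  @callback read(entity_ref()) :: {:ok, [term()]}
end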

Thank you for your response!

Would it be correct to say that ‘the database’, ‘Kafka’ or whatever other thing that manages the append-only log is a bottleneck and failure-point for the Pachyderm system? And that Pachyderm therefore scales as well as the technology used to manage the append-only log can scale?

Ah, so Pachyderm is based on storing an event log instead of storing state like I described?

Yes, inasmuch as any coordination point for strong consistency MUST BE (by definition) a single point of failure. Pachyderm’s goal is to make strong consistency easy to work with.

There are no consistency guarantees between entities. Pachyderm runs each entity concurrently and so is concurrent to the maximum possible degree. There is nothing stopping a log from also being fully partitioned by entity.
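To make “partitioned by entity” concrete, a small hypothetical sketch: route each entity to one of several independent, strongly consistent logs, so the coordination point is per partition rather than truly global.

defmodule PartitionedLog do
  @partitions 16

  # Deterministically map an entity reference to one of @partitions logs.
  # Every write for a given entity always hits the same log, while
  # unrelated entities can be served by different logs.
  def partition_for({_type, id}), do: :erlang.phash2(id, @partitions)
end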


Yes, Pachyderm stores events. I wasn’t sure if you were originally talking about what happens inside the storage layer.

https://github.com/rabbitmq/ra implements Raft for consensus and allows defining side effects that run on the leader only. I’m using it with disk_log as the storage layer for an event-sourced system.
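For context, a minimal sketch of a ra state machine, assuming ra’s :ra_machine behaviour (init/1 and apply/3) and its convention of returning side effects as data that are executed on the leader; the details here are illustrative rather than taken from a real system:

defmodule CounterMachine do
  @behaviour :ra_machine

  # apply/3 would otherwise clash with the auto-imported Kernel.apply/3.
  import Kernel, except: [apply: 3]

  @impl true
  def init(_config), do: %{count: 0}

  @impl true
  def apply(_meta, {:increment, notify_pid}, state) do
    state = %{state | count: state.count + 1}
    # The side effect is plain data; ra runs it on the leader only.
    effects = [{:send_msg, notify_pid, {:count, state.count}}]
    {state, :ok, effects}
  end
end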


Ra looks interesting. At first I thought it might be cool to use it to build an alternative log storage for Pachyderm.

However, it could be even more useful than that, because it handles side effects as data in much the same way as Pachyderm does. I just finished watching this talk, which is really informative.


Instead of presenting virtual actors as actors, would it actually make more sense to create virtual actors the way dapr.io does, by treating actor calls basically as web requests? That way even reentrancy could be supported for virtual actors in Elixir.

Reentrancy is not currently supported by Dapr, but it’s coming: https://github.com/dapr/dapr/issues/21
Here are the actor-related docs for Dapr: https://github.com/dapr/docs/tree/master/concepts/actors
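For anyone curious what “actor calls as web requests” looks like in practice, here is a rough Elixir sketch against Dapr’s HTTP actor API (POST /v1.0/actors/{actorType}/{actorId}/method/{method}); the sidecar port, actor type and payload below are illustrative:

defmodule DaprActor do
  # Invoke a method on a Dapr virtual actor through the local Dapr sidecar,
  # treating the actor call as a plain HTTP request.
  def invoke(actor_type, actor_id, method, json_body) do
    {:ok, _apps} = Application.ensure_all_started(:inets)

    url =
      String.to_charlist(
        "http://localhost:3500/v1.0/actors/#{actor_type}/#{actor_id}/method/#{method}"
      )

    :httpc.request(:post, {url, [], String.to_charlist("application/json"), json_body}, [], [])
  end
end

# DaprActor.invoke("MyCounter", "abc-123", "increment", ~s({"amount": 1}))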