Anyone care about Temporal?

This is a hugely valuable document, thank you. Just lately I’ve been trying to do something similar. Glad that you beat me to it.

Here’s my take. I just started the project an hour ago. I don’t want a Rustler solution for the durable Elixir consumer, and I don’t want to implement their SDK. So I made alkahest, which leans on the official Go client, uses the official protobufs, and uses official workers. Minimal and safe on the Elixir side.

After some review, I decided against this ridiculous Rube Goldberg machine.

temporalex needs a bit of work, and its NIF needs vetting, but otherwise it is a much more cohesive option. The MIT license helps.

The native Temporal SDK implementation is complex due to the Temporal Server’s extensive feature set and the resulting complexity of the Temporal API. To simplify this, one can build a language-specific SDK on top of the Core SDK provided by Temporal.
A key design goal of the Core SDK is to support highly diverse platforms, such as TypeScript or .NET, that may not be well-suited to implementing the native SDK complexity on their own. As a result, the Core SDK functions in practice as a large, abstracted black box built on cross-platform compatibility trade-offs.
Using the Core SDK means sacrificing language-specific optimizations, customizations, and innovation opportunities in exchange for convenience.
Note, for example, that the Temporal Go and Java SDKs are not built on the Core SDK.

Building a native Temporal SDK for an ecosystem such as the BEAM VM unlocks entirely new possibilities, thanks to the BEAM’s unique and powerful features.

The current native Temporal SDK for Erlang and Elixir already introduces several innovations and OTP-specific optimizations, such as:

  • OTP message handling in workflows and activities.
  • Smart workflow eviction (WIP) based on workflow execution properties and business logic, rather than the simple LRU policy used in other SDKs.
  • Single workflow execution per Erlang cluster; combined with smart eviction, this can reduce infrastructure costs by orders of magnitude compared to other SDKs.
  • Highly flexible dynamic concurrency, fixed window and leaky bucket task rate limiters.
  • Direct execution activities (WIP) - zero execution latency activities.
  • Parallel workflow executions and composite await() functions, enabling a new semantic level for implementing workflow business logic. No need to use “context propagation” which is notoriously hard to use in other SDKs.
  • Superior handling of activity heartbeats compared to other SDKs.
  • Temporal events are a first class citizen - almost everything can be awaited as an event in this SDK.
  • ETS tables are used to store workflow (WF) awaitables and Temporal events, which enables powerful ETS queries over the WF history. For example, one can search the WF history for all activities from a specific task queue (TQ) that failed more than X times and deterministically branch the WF logic.
  • Mutable markers that can reset WF execution during WF replay, for example, in response to environment variable changes.
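
To make the history-query point concrete, here is a minimal, language-neutral sketch in Python. It is not the SDK's API: `ActivityRecord`, `flaky_activities`, and `choose_branch` are hypothetical names, and a plain list stands in for the ETS table. The only idea it illustrates is that a branch decided purely from recorded history will take the same path on replay.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ActivityRecord:
    """One activity entry in a hypothetical workflow history."""
    activity_id: str
    task_queue: str
    failed_attempts: int


def flaky_activities(history, task_queue, max_failures):
    """Return activities from `task_queue` that failed more than `max_failures` times."""
    return [
        rec for rec in history
        if rec.task_queue == task_queue and rec.failed_attempts > max_failures
    ]


def choose_branch(history, task_queue="payments", max_failures=3):
    """Branch on recorded history only, so a replay takes the same path."""
    if flaky_activities(history, task_queue, max_failures):
        return "fallback"
    return "primary"


# Hypothetical history: two task queues, mixed failure counts.
history = [
    ActivityRecord("charge-1", "payments", failed_attempts=5),
    ActivityRecord("email-1", "notifications", failed_attempts=7),
    ActivityRecord("charge-2", "payments", failed_attempts=1),
]

branch = choose_branch(history)
```

Because `choose_branch` reads nothing but the history, calling it again with the same records always yields the same branch.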

A native BEAM Temporal SDK implementation, in the long term, opens the door to future BEAM-specific opportunities, something not achievable with the Core SDK.

Using the Core SDK will introduce a large, resource-intensive NIF into your application. The OTP documentation includes numerous reservations and warnings regarding the use of NIFs. Addressing production system issues in such a setup may prove quite challenging. You will not be able to use Observer or Phoenix LiveDashboard to debug or introspect the Core SDK NIF if something goes wrong.

Right, which is why I am apprehensive about using the Rustler solution. However, I can’t build around a non-permissive license.

Very comprehensive comparison! One small correction, Oban Web has been OSS and free since last year.

Seems like a description of Oban Pro to me :winking_face_with_tongue:. Anyhow, the latest release of Oban Web has a dedicated workflows view: Changelog — Oban Web v2.12.3

Chris McCord discussing durable_server

Durable Server on GitHub

Analysis of durable_server from that video, along with comparison with temporal

This has barely anything to do with libraries like Temporal or Oban btw. It’s more of a “durable Phoenix dead-view / LiveView session object” than anything, from my quick skim of it.

My understanding of Temporal is a bit out-of-date, but I wanted to share my perspective. I’m not dismissing it, but rather offering a different way of thinking about “workflows” for those evaluating their options.

My view is that Temporal provides durable execution backed by an event history, but that history represents the workflow execution, not necessarily the domain’s business events as one would find in CQRS/ES.

  • Temporal tracks what the system did (execution history).
  • CQRS/ES tracks what the business decided (business events).

That distinction matters to me. I want the source of truth to be explicit business decisions, not just a record of how a workflow executed.

Temporal’s appeal is the convenience of managing long-running processes in an imperative-style programming model, letting developers focus on step-by-step logic while abstracting away time, retries, and side-effects.

I think THAT is the crux of Temporal’s selling point: the intersection of “imperative programming logic” and deterministic handling of time and side-effects – timing, delays, etc.

“Individual customer’s monthly subscription billing” is an example I’ve seen come up as a long running process.

If your workflows are core business processes similar to the above rather than low-level mechanical tasks, you’ve effectively externalized the logic and history of state. This shifts the core business process logic and history into an external runtime, rather than keeping it as first-class domain concerns. I’d rather maintain internal control by modeling such processes as first class citizens in BEAM.

I would rather keep it all in the BEAM, using CQRS/ES to model long-running BUSINESS process logic and state, keeping business events (history) first-class. Its event-driven approach integrates cleanly with the real-time nature of the BEAM ecosystem: CQRS/ES naturally handles Temporal’s concepts of “signals, queries, updates” with event injection, projections and commands.

In a BEAM + CQRS/ES setup, I handle Temporal-like concerns (timers, retries, side-effects) explicitly using Oban:

  • Event Handlers enqueue robust Oban Jobs that are scheduled in the future, to inject a “timer triggered” event for process managers to react to; or dispatch a “handle timeout” command directly to an aggregate to decide what to do.
  • Event Handlers enqueue robust Oban Jobs to perform a mechanical side-effect (e.g. API calls), and also inject result events (or a “handle result” command).
  • Recurring things like “charge monthly”: again, various techniques using Oban inject regular timing events for the business process aggregate to decide what to do.
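
The pattern in the bullets above can be sketched roughly as follows. This is a language-neutral illustration in Python, not Oban or any CQRS library: `Subscription`, `FakeScheduler`, and the event shapes are hypothetical stand-ins. The scheduler only delivers a “timer triggered” command, and the aggregate alone decides what that means, including ignoring a duplicate delivery.

```python
from dataclasses import dataclass, field


@dataclass
class Subscription:
    """Event-sourced aggregate: state is derived only from business events."""
    events: list = field(default_factory=list)

    @property
    def billed_periods(self):
        return {e["period"] for e in self.events if e["type"] == "ChargeRequested"}

    def handle_timer(self, period):
        """Command dispatched when the scheduled job fires for `period`."""
        if period in self.billed_periods:
            return []  # duplicate timer delivery: decide nothing (idempotent)
        new_events = [{"type": "ChargeRequested", "period": period}]
        self.events.extend(new_events)
        return new_events


class FakeScheduler:
    """Hypothetical stand-in for a job queue that supports scheduled jobs."""

    def __init__(self):
        self.jobs = []

    def enqueue(self, run_at, period):
        self.jobs.append((run_at, period))

    def run_due(self, now, aggregate):
        """Run every job whose time has come and collect the emitted events."""
        due = [j for j in self.jobs if j[0] <= now]
        self.jobs = [j for j in self.jobs if j[0] > now]
        emitted = []
        for _, period in sorted(due):
            emitted.extend(aggregate.handle_timer(period))
        return emitted


sub = Subscription()
sched = FakeScheduler()
sched.enqueue(10, "2024-01")
sched.enqueue(10, "2024-01")  # simulated duplicate delivery of the same timer
sched.enqueue(40, "2024-02")

first = sched.run_due(now=15, aggregate=sub)
second = sched.run_due(now=50, aggregate=sub)
```

In this sketch the duplicate job for `"2024-01"` produces no second `ChargeRequested` event, which is the idempotency concern mentioned in the trade-offs: the business event log, not the job queue, is the source of truth.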

I am ok with the trade-offs:

  • A more explicit state-machine / decision-modeling approach instead of imperative workflow code
  • Less batteries-included convenience for long processes.
  • Taking ownership of business process modeling instead of outsourcing it.
  • A steeper learning curve for CQRS/ES
  • The need to think carefully about “communication, coordination and consistency” (Ford and Richards)
  • The need to think carefully about idempotency.
  • More plumbing around scheduling, timing, and orchestration+choreography

In return:

  • Business events and history remain first-class
  • Strong alignment with BEAM + Event-Driven Architecture
  • Fewer external moving parts

Some additional references:

I am not an expert, but to me, CQRS is a low-level abstraction/pattern that demands significant experience and careful upfront planning. In many cases, fine-grained, low-level CQRS control is unnecessary, and “batteries-included” solutions may be preferable. In my opinion, Temporal excels in scenarios where business logic that is easy to reason about and fast shipping are the top priorities.

Building your own custom durable execution implementation can be challenging, especially when integrating disparate components such as CQRS and Oban. The greatest challenge is likely ensuring robust transactionality: what happens if the system crashes when execution is at a component boundary? What happens when an Oban job fails after the CQRS handler has completed? How do you secure transactional compensations across unrelated components? How do you query execution state, or provide transactional cancellations and execution state updates? Writing “integration” tests for business logic using this architecture may present another problem. Visibility into such a solution will be limited: you will only be able to inspect Oban jobs via the Oban Web dashboard, and the absence of the CQRS component from that view may cause confusion. I am confident that all the challenges outlined here can ultimately be addressed, though doing so will likely require substantial effort.

Transactionality in Temporal is an intrinsic system property, enabled by the workflow event history and replay mechanism. Integration testing of Temporal workflows primarily focuses on mocking activities. Recently, an automatic time-skipping feature was added to the Temporal API, which may provide SDKs a convenient way to handle timers during testing. You can inspect and manage Temporal workflows using the web UI, the terminal CLI, and TUI apps such as tempo or TemPurview.
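
For readers unfamiliar with the replay idea, here is a heavily simplified Python sketch. It is a toy model and not Temporal's actual protocol or any SDK's API: `Replayer`, `order_workflow`, and the history format are all hypothetical. It shows only the core mechanism: activity results are recorded in a history, and after a crash the workflow code is re-run from the top, with recorded results returned instead of re-executing the side effects.

```python
class Replayer:
    """Runs workflow code against a recorded history (toy model)."""

    def __init__(self, history=None):
        self.history = list(history or [])  # list of (activity_name, result)
        self.cursor = 0
        self.executed = []  # activities actually executed during this run

    def activity(self, name, fn, *args):
        if self.cursor < len(self.history):
            # Replay: reuse the recorded result, do not run the side effect again.
            recorded_name, result = self.history[self.cursor]
            if recorded_name != name:
                raise RuntimeError(
                    f"non-deterministic workflow: expected {recorded_name}, got {name}"
                )
        else:
            # New progress: execute the activity and record its result.
            result = fn(*args)
            self.history.append((name, result))
            self.executed.append(name)
        self.cursor += 1
        return result


def order_workflow(ctx, amount):
    """Imperative-style workflow: charge first, then ship."""
    charge_id = ctx.activity("charge", lambda a: f"charge-{a}", amount)
    return ctx.activity("ship", lambda c: f"shipped-{c}", charge_id)


# Simulate a crash after "charge" completed: only its result is in the history.
recovered = Replayer(history=[("charge", "charge-42")])
result = order_workflow(recovered, 42)
```

After recovery, only `"ship"` appears in `recovered.executed`: the charge is not performed a second time, yet the workflow code still reads as one straight-line function. The name check in `activity` hints at why workflow code must be deterministic: the replayed code has to request the same steps in the same order as the recorded history.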

Temporal was already compared here to Oban and CQRS. One could also compare Temporal to event-driven microservices architecture. I really like this slide from Maxim Fateev’s keynote presentation at the Replay 2023 conference, which contrasts event-driven microservices architecture (left) with a durable execution platform like Temporal (right):

The diagram above illustrates the evolution of Solar System architecture models: the geocentric model (pre-Copernican, left) and the heliocentric model (post-Copernican, right).

As previously mentioned, I am working on the Elixir and Erlang SDK. The SDK Samples repository currently includes around 20 code examples, ranging from “Hello World” to “Saga Pattern”, so you can check what it feels like to work with Temporal in Elixir.

Interesting blog post featuring a section that compares self-made durable execution with Temporal in a Haskell-based financial systems context: