How best to approach a live migration of pipeline segments across BEAM nodes?

Alex66 · January 19, 2026, 6:16pm

Running an FBP pipeline where each node is a GenServer. Need to move a segment of the running pipeline (multiple connected nodes) from one BEAM node to another without losing messages or breaking the flow.

Example: nodes A → B → C → D running on server 1. Want to move B → C to server 2 while pipeline is processing.

Questions:

Drain and restart, or something smarter?
How to handle in-flight messages during migration?
How to coordinate the cutover so upstream (A) starts sending to the new location?
Is there a pattern for this, or just stop/migrate/restart?

Using Pulsar/Artemis as message broker, so could potentially buffer there during migration. But curious about pure BEAM approaches too (mailbox draining, etc).

allen-munsch · January 19, 2026, 6:39pm

I haven’t done this with FBP or BEAM, so I am drawing on my experience with database migrations (Postgres) from one cloud to another with near-zero downtime (very little buffered draining).

The way I have done something similar in the past was via WAL streaming, with a timed synchronization window where queries timed out for a short while as the buffers drained over.

Proxy stream original B → C to the 2nd B → C

verify 2nd B matches 1st B, then stop sending to 1st B
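
One way to read the proxy-and-verify idea above, as a rough sketch rather than anything BEAM-specific: mirror each message to both copies of the segment, keep serving results from the old one, and only cut over once the outputs agree. It is written in Python purely as a neutral model. `ShadowProxy`, `safe_to_cut_over`, and `cut_over` are hypothetical names, and the comparison only makes sense if the segment is deterministic and stateless.

```python
class ShadowProxy:
    """Hypothetical proxy: serves from the old segment, mirrors to the new one."""

    def __init__(self, primary, shadow):
        self.primary = primary    # callable for the old B -> C segment
        self.shadow = shadow      # callable for the new B -> C segment
        self.seen = 0             # messages compared so far
        self.mismatches = []      # messages where the two segments disagreed

    def handle(self, msg):
        result = self.primary(msg)
        if self.shadow is not None:
            # Mirror the message; the shadow result is only used for comparison.
            self.seen += 1
            if self.shadow(msg) != result:
                self.mismatches.append(msg)
        return result

    def safe_to_cut_over(self, min_samples):
        # Assumption: "matches" means enough samples compared with no mismatch.
        return self.seen >= min_samples and not self.mismatches

    def cut_over(self):
        # Stop sending to the old segment; the new one becomes primary.
        self.primary = self.shadow
        self.shadow = None
```

While mirroring is active the new segment processes every message twice over (once there, once on the old side), so this only works if the segment has no side effects, or if the shadow copy's output is discarded.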

Kinda like: Blue–green deployment (Wikipedia)

Someone told me once it’s a combination of the 2, a Cyan deployment.

If it’s okay to take it offline for maintenance for an hour or 2, that’d be easiest, just imo

manhvu · January 20, 2026, 12:34am

In this case, you can use hot code reloading for the GenServer, update its state in the code_change/3 callback, and route requests to the new node.

Alex66 · January 20, 2026, 11:08am

Thanks everyone for the input! Wanted to share what I ended up implementing.

Already had:

Remote Output / Remote Input nodes for cross-BEAM communication
Circuit breaker with message buffering when target unavailable
Auto-recovery when connection restored

What I added:

Runtime pipeline modification (add/remove nodes while running)
Runtime property injection (change node config without restart)
Dynamic connection changes (rewire targets mid-flow)

The migration approach:

Circuit break at the split point (Remote Output buffers messages)
Send pipeline segment config (JSON) to target server
Start the segment on new server
Update Remote Output’s target channel at runtime
Buffer drains to new location

Nodes are stateless transformations — state lives in the message flow. So “hot migration” is just runtime topology modification.
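
The steps above can be sketched roughly as follows. This is a language-neutral model in Python, not the actual Elixir implementation. `RemoteOutput`, `migrate`, and `start_segment` are hypothetical names, and a plain callable stands in for the target channel.

```python
import json
from collections import deque


class RemoteOutput:
    """Hypothetical stand-in for a Remote Output node with a circuit breaker."""

    def __init__(self, target):
        self.target = target      # callable that delivers one message
        self.broken = False       # True while the circuit is broken
        self.buffer = deque()     # messages held while the circuit is broken

    def send(self, msg):
        if self.broken:
            self.buffer.append(msg)   # hold instead of delivering
        else:
            self.target(msg)

    def break_circuit(self):
        self.broken = True

    def retarget(self, new_target):
        self.target = new_target

    def close_circuit(self):
        # Drain in FIFO order first, so buffered messages stay ahead of new ones.
        while self.buffer:
            self.target(self.buffer.popleft())
        self.broken = False


def migrate(output, start_segment, segment_config):
    """Break the circuit, start the segment elsewhere, retarget, then drain."""
    output.break_circuit()
    # Assumption: the target server accepts a JSON config and returns
    # something the output can deliver to.
    new_target = start_segment(json.dumps(segment_config))
    output.retarget(new_target)
    output.close_circuit()


if __name__ == "__main__":
    server1, server2 = [], []
    out = RemoteOutput(server1.append)
    out.send("m1")                      # delivered to the old location
    out.break_circuit()
    out.send("m2")                      # buffered during migration
    migrate(out, lambda cfg: server2.append, {"nodes": ["B", "C"]})
    out.send("m3")                      # delivered to the new location
    print(server1, server2)
```

The sketch is single-threaded, so the drain cannot interleave with new sends. In a GenServer the same ordering would presumably come from the process handling one message at a time. It also does not cover messages that had already reached the old B → C before the circuit was broken; those still need to finish on the old segment before it is stopped.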

Appreciate the discussion!