Call for feedback: multi-node deployment tool (Edeliver 2.0) design


TL;DR: multi-node deployment tool can be implement with Supervisor-alike strategies to ensure consistent state across all targets: one4all, one4rest, one4one.

Goals: make it simple and flexible. But simple above all.

The structure can be following:

Application-wide settings

  • Release store.
    • Local.
    • Remote.
    • S3.

Nodes - individual node settings.

  • Host.
  • Port.
  • Environment - free form text labels like staging, prod, etc.

Environment (Node Group on a picture) - settings applied for a specific environment.

  • Full release or hot code upgrade.
  • Strategy.
    • One4all - fallback action will be applied to all nodes in the group.
    • One4rest - fallback action will be applied to all other nodes that were not yet updated.
    • One4rest - fallback action will be applied to all other nodes that were not yet updated.
  • Name - free form text labels like staging, prod, etc.
  • Fallback action - action that will be taken in case node update failed. So far I can think only of :rollback, :ignore and :stop.
  • Build strategy
    • Local.
    • Remote.
    • Docker.

Tasks - deploy and rollback tasks. For some reason right now I’d prefer to use directories as a grouping criteria, meaning having dirs like 10_assets and 20_deploy but that’s up for discussion. There can be also an-app wide setting.
On the other hand tasks can be either order by file names or by dependency graph.

Tasks will be just a set of Elixir modules conforming to a protocol akin to a Plug.
For example they can have:

  • “Stage” setting. At very least backwards compatibility with those hooks will be maintained
  • “Optional” setting. There should be a mechanism that will allow to execute tasks only in certain situations, say for the initial deployment only. There might be an optionality_group setting that can be passed through the cli. Like combination of include/exclude and tags in ExUnit
  • “Critical” setting. Will define whether deployment to node should considered to be failed if task failed.
  • “Cleanup” callback. Will be run in case of a failed deploy to optionally revert changes.
  • “Run” callback. The body of a task :slight_smile:
  • “Environment” settings. If set - those will apply only for specific environment.

One implied rule is that rollout process across all nodes in a given environment should be uniform - probably it will be impossible to deploy on Linux and OpenBSD within one environment.

Now please tell me if that will cover your particular use-cases :slight_smile: