PagerDuty engineering blog

blog-posts
pipeline
business-logic

#1

#2

Very nice read! For reminds me of René’s 208 ElixirConf talk: 7) ElixirConf US 2018 – Architecting Flow in Elixir - From Leveraging Pipes to Designing Token APIs – René Föhring


#3

A very interesting read!!

It occurs to me that after you’ve finished this refactoring, your modules/(public) functions will look like:

  • Processor.process
  • Parser.parse
  • Sender.send
  • Receiver.receive

Maybe this is of course only because of the simplified code, and I’m not sure if I like or dislike it, but anyway it stood out to me.

Other things to note:

  • In your with example you pattern-match on {:ok} at the end, which is a very unconventional way of returning a ‘sccess’-result. All other code I know, including Elixir’s standard library, uses either {:ok, some_value} in case some_value matters, or plain :ok.
  • check_webhook_expired ensures that the webhook has not yet expired rather than the other way around. My subjective opinion would be to rename it to ensure_webhook_not_expired or instead return a boolean value that you pattern-match on rather than an ok/error tuple.
  • Of course, again, this might be a simplification from your real code, but it might make sense to introduce one additional layer of indirection to the check_webhook_expired-stuff, so that you have an webhook_valid?-function (returning a boolean), whose internals might do multiple different checks (which each of course have their own function). Depending on when which checks make sense, you might be able to give it an even more explicit name like webhook_sendable?.
  • What you are manipulating is a simple Webhook-call, which is probably better named e.g. WebhookCall or WebhookRequest than Webhook or WebhookState.
  • It is possible to extract the non-effectful code (such as the validation checks, but also the different state-transitions of the WebhookState-struct) to its own dedicated module, making it even easier to test in isolation, and making sure that you do not mess up how you manipulate the WebhookState-struct in your code.
  • Instead of creating multi-line function head pattern-matches, match on everything you know you’ll want there, and introduce a case-statement in the function to match on what will make your implementation vary, such as:
  defp save_response_to_db(webhook = %WebhookState{
    result: "success",
    endpoint: endpoint,
    payload: payload,
    response: {_status_code, response_body}
  }) do
    # this was moved from Sender to here :) !
    DB.save_successful_response(endpoint, payload, response_body)
    {:ok}
  end

  defp save_response_to_db(webhook = %WebhookState{
    result: "failure",
    endpoint: endpoint,
    payload: payload,
    response: {_status_code, response_body}
  }) do
    # this was moved from Sender to here :) !
    DB.save_failed_response(endpoint, payload, response_body)
    {:ok}
  end

becoming

  defp save_response_to_db(webhook = %WebhookState{
    result: result,
    endpoint: endpoint,
    payload: payload,
    response: {_status_code, response_body}
  }) do
    # this was moved from Sender to here :) !
    case result of
      "success" ->
        DB.save_successful_response(endpoint, payload, response_body)
      "failure" ->
        DB.save_failed_response(endpoint, payload, response_body)
  end
  {:ok} # Instead of returning `:ok` here, you could also return the result of the DB statement, which might maybe fal?l
end

#4

That talk looks great! Sounds quite similar. I will check it out, the Token approach sounds interesting.


#6

Thanks for taking the time to read it and for all the feedback!

I agree that naming is quite redundant. One way to improve it is to have a PipelineStage behaviour with a function called process that the different stages could implement.

I like that. We have something similar in our actual code which does some “pre-processing” checks to make sure the webhook is valid before sending it out.

+1

I don’t follow this. The different state transitions are “non-effectful” even though they modify the state? And I’m having a hard time understanding how this new module would be any different from the current one. Maybe there’d be one module with just the pipeline defined and another module where each pipeline stage is defined?

+1


#7

I was not very clear on this in my post. For the given example, it might also be slightly overkill (but of course, since it is a simplification, it might make more sense in the real code).

If you split it like follows:

defmodule WebhookRequest do
  defstruct [:endpoint_url, :payload, :created_at, :response, :result]
  @expired_threshold_seconds 5

  def new(endpoint_url, payload) do
    %__MODULE__{created_at: DateTime.utc_now, endpoint_url: endpoint_url, payload: payload}
  end

  def expired?(webhook_request = %__MODULE__{created_at: created_at}, sending_time \\ DateTime.utc_now) do
    DateTime.diff(created_at, sending_time > @expired_threshold_seconds
  end

  def sendable?(webhook_request = %__MODULE__{}, sending_time \\ DateTime.utc_now()) do
    not expired?(webhook_request, sending_time)
  end

  def update_response(webhook_request = %__MODULE__{response: nil}, status_code, response_body, sending_time \\ DateTime.utc_now) when is_integer(status_code) and is_string(response_body) do
    case sendable?(webhook_request, sending_time) do
      false -> {:error, "webhook expired"}
      true ->
        webhook_request = %__MODULE__{
                            response: {status_code, response_body},
                            result: normalize_result(status_code)
                          }
        {:ok, webhook_request}
    end
  end
 
  def endpoint_url(%__MODULE__{endpoint_url: endpoint_url}), do: endpoint_url
  def payload(%__MODULE__{payload: payload}), do: payload

  def result(%__MODULE__{result: nil}), do: {:error, :not_sent_yet}
  def result(%__MODULE__{result: result, response: response}), do: {:ok, result, response}



  defp normalize_result(status) when status in 200 .. 299, do: :success
  # We do not handle 3xx codes because we follow redirects to the end
  defp normalize_result(status) when status in 400..499, do: :failure
  defp normalize_result(status),do: :unknown
end

Notice how the module ensures that:

  • The status is always normalized to one of the known symbols (We can use dyalizer typechecks to make it even harder to mis-type one of the options here).
  • All %WebhookRequest{}s will follow the new -> update_response lifecycle.
  • It is impossible to call update_response multiple times on a %WebhookRequest{} because it requires response to be not filled in yet.
  • Other modules will pattern-match on the outcome of WebhookRequest.result/1, rather than on its internals directly, allowing you to change the internals of the module.
  • The same is true for endpoint_url and payload; probably overkill here and the most dubious refactor I did for such a small example (you could just pattern match on them instead since you use their values directly), but there are cases in which this is helpful, because it decreases coupling between the modules.
  • In this case, I decided to put the expirity-check inside update_response, which is of course a subjective choice. Do note that in any case I decided to pass the current time as extra parameter, so that we can test the system against a fixed timestamp, as well as ‘just using the current time’ when running it in production.

The outside module (Processor, here renamed to WebhookDispatcher because it is a somewhat less ambiguous name) would then become something like:

defmodule WebhookDispatcher do
  def run(webhook_request = %WebhookRequest{}) do
    webhook_request
    |> send_webhook
    |> save_response
  end

  defp send_webhook(webhook_request) do
    endpoint_url = WebhookRequest.endpoint_url(webhook_request)
    payload = WebhookRequest.payload(webhook_request)
    {status_code, response_body} = Sender.send(endpoint_url, payload)
    
    WebhookRequest.update_response(webhook_request, status_code, response_body)
  end
  
  defp save_response(webhook_request) do
    case WebhookRequest.result(webhook_request) do
      {:ok, :success, response_body} -> DB.save_successful_response(endpoint, payload, response_body)
      {:ok, :failure, response_body} -> DB.save_failure_response(endpoint, payload, response_body)
      {:ok, :unknown, response_body} -> Logger.error "Unknown return value of Webhook request #{inspect(webhook_request)}"
      {:error, error}                -> Logger.error "Improper usage of Webhook request #{inspect(webhook_request)}, #{inspect(error)}"
    end
  end
end

Writing tests that for instance ensure that no expired webhook-requests will run is very easy now, and many other errors are catched either at compile-time, or very early in development, since the modules are very strict in what kind of structures they allow to be passed. (For instance, calling WebhookRequest.handle_response twice with the same structure will immediately trigger a crash.)


#8

I’m pretty sure on the call to sendable?/2 you meant to use sending_time instead of DateTime.utc_now


#9

Thanks! Fixed :slightly_smiling_face:


#10

Thanks for the detailed reply (again)! I didn’t know much about type checking in Elixir so this example helped a lot and now that I see that improving type safety isn’t that hard I’ll definitely consider doing this sometime in the future.