Halting a pipe

Once again, I am a total beginner with Elixir and FP…

I have a pipeline that looks like:

url
|> get_html_data()
|> check_data_exist()
|> process_html_data()
|> parse_data_list()

When I pull the HTML data from the URL, I want to use check_data_exist to check whether an identical result has already been persisted in the database and, if so, stop the pipe and be done.

This wouldn’t be an error situation, just no more processing needed.

Is there a way to just jump out of the pipe on some return of the function?

Thanks!

You could use the with special form to handle the error condition. It is Elixir's monad-like way of checking for a failure at each step.

Something like this…

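# assumes each function returns {:ok, value} on success
with {:ok, html_data} <- get_html_data(url),
     {:ok, new_data} <- check_data_exist(html_data),
     {:ok, data_processed} <- process_html_data(new_data),
     {:ok, data_list} <- parse_data_list(data_processed) do
  # do something useful with data_list
  data_list
else
  # any step that did not return {:ok, value} ends up here
  other -> other
end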

You can catch a failure at any step of your pipeline.
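
Since the early exit in the question is not an error, the else branch can also carry a plain "nothing to do" value. Here is a minimal sketch of that idea. The module name, the stub function bodies, and the :already_stored and :noop atoms are all assumptions made up for illustration; the real functions would talk to the network and the database.

```elixir
defmodule PipelineSketch do
  # Hypothetical stand-ins for the functions named in the question.
  def get_html_data(""), do: {:error, :empty_url}
  def get_html_data(url), do: {:ok, "<html>" <> url <> "</html>"}

  # Pretend this one page is already in the database.
  # :already_stored means "stop here", it is not an error.
  def check_data_exist("<html>seen</html>"), do: :already_stored
  def check_data_exist(html), do: {:ok, html}

  def process_html_data(html), do: {:ok, String.upcase(html)}
  def parse_data_list(processed), do: {:ok, [processed]}

  def run(url) do
    with {:ok, html} <- get_html_data(url),
         {:ok, new_html} <- check_data_exist(html),
         {:ok, processed} <- process_html_data(new_html),
         {:ok, list} <- parse_data_list(processed) do
      {:ok, list}
    else
      # The first value that does not match {:ok, _} is handled here.
      :already_stored -> :noop
      {:error, reason} -> {:error, reason}
    end
  end
end
```

Calling PipelineSketch.run("seen") skips the last two steps and returns :noop, while any other non-empty URL runs the whole chain. Keeping the halt value distinct from {:error, reason} lets the caller tell "already done" apart from a real failure.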

I think this ElixirConf talk will be worth a watch for you: ElixirConf US 2018 – Architecting Flow in Elixir - From Leveraging Pipes to Designing Token APIs – René Föhring

The solution will be context dependent. Based on your problem description, I would probably just break the pipeline into two separate parts:

html = get_html_data(url)

case check_data_exist(html) do
  # identical data is already stored, so stop here
  {:ok, fetched_data} ->
    fetched_data

  # nothing stored yet, so continue with the fetched HTML
  {:error, :not_fetched} ->
    html
    |> process_html_data()
    |> parse_data_list()
end

But it’ll depend on your specific scenario (which is why the video and associated blog posts are worth viewing).

You might also enjoy this talk.

A discussion currently taking place that may also interest you (and mentions a design pattern that does exactly what you want) can be found here:

I like this method a lot! Really clean and readable. Thanks for this!

Oh! I like this… I should note that this is not really an error case. The idea is that if I fetch the data and it matches data I have already fetched, then I don't need to process it… so I might return something like:

{:ok, fetched_data} -> process_data
{:ok, :data_current} -> do_nothing

Does this seem okay?
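
One thing to watch with those two return shapes: {:ok, fetched_data} binds any second element, so it also matches {:ok, :data_current}. A minimal sketch of a case that handles both, assuming a hypothetical two-argument check_data_exist where the second argument stands in for whatever the database lookup returns (the module name and result atoms are likewise invented for the example):

```elixir
defmodule CheckSketch do
  # Hypothetical: `stored` represents the previously persisted data.
  def check_data_exist(html, stored) when html == stored, do: {:ok, :data_current}
  def check_data_exist(html, _stored), do: {:ok, html}

  def handle(html, stored) do
    case check_data_exist(html, stored) do
      # The specific clause must come first, because the
      # {:ok, fetched_data} clause below would match it as well.
      {:ok, :data_current} -> :nothing_to_do
      {:ok, fetched_data} -> {:processed, String.length(fetched_data)}
    end
  end
end
```

With the clauses in the opposite order, the :data_current atom would be treated as fetched data. Returning a bare atom such as :data_current instead of wrapping it in {:ok, _} is another way to avoid the overlap.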