Effective way to enable a modular filtering pipeline

Hello everyone,

I currently have the following use case:
I need to enable configurable filtering of stream data (e.g. filtering out or modifying incoming data based on a specific identifier or similar). To specify the rules, I constructed a JSON format that gets parsed.
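
Just to illustrate (the exact fields aren't final yet, so take this as a made-up example of my own format), a single rule might look like:

{
  "id": "sensor_42",
  "action": "drop"
}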

Two ways of handling this have come to mind:

  1. Use dedicated config files that generate modules on application start. Those modules are then called in the pipeline inside my consumer module.
  2. Store the JSON configs in a database and load them on start_link into a process in my consumer module that holds the filtering state (a rough sketch follows below).
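
To make option 2 more concrete, here's a rough sketch of what I mean (all module and function names are placeholders, and the database loading is stubbed out):

defmodule FilterStore do
  use GenServer

  # client API

  def start_link(opts \\ []) do
    GenServer.start_link(__MODULE__, opts, name: __MODULE__)
  end

  # look up the filter stored for a given id
  def get_filter(id) do
    GenServer.call(__MODULE__, {:get_filter, id})
  end

  # server callbacks

  @impl true
  def init(_opts) do
    # placeholder: query the database and parse the JSON rule configs here
    {:ok, load_rules_from_db()}
  end

  @impl true
  def handle_call({:get_filter, id}, _from, rules) do
    {:reply, Map.get(rules, id), rules}
  end

  defp load_rules_from_db, do: %{}
end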

Is there already a library or “best practice” for handling such a use case?
I'm also wondering which approach is more efficient in terms of computational runtime.

I appreciate any tips and tricks or further information about this topic.

No one has an opinion on this topic?

Do you have multiple filters per id, or just a single filter for each?

Just a single filter for each id.

And the id is a field within the JSON payload?

If by payload you mean the rules configuration, then yes.

Well, I was considering something like this pseudocode:

stream
...
|> Stream.filter(fn payload ->
  # look up the filter function for this payload's id
  # (Config stands in for wherever the id -> function mapping lives)
  filter_fun = Config.get_filter(payload["id"])
  # anonymous functions are invoked with a dot in Elixir
  filter_fun.(payload)
end)

Where the config would hold a mapping of id -> filter function, e.g.:

%{"id_1" => &Filters.filter_fun_1/1}

However, I don’t know of an ‘official’ way to choose functions for filtering.
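
One option (just a sketch, and I'm assuming the parsed rules are maps with "id", "type", "field" and "value" keys, which may not match your format) would be to compile the parsed JSON rules into that map once at startup:

defmodule Filters do
  # turn one parsed rule into a predicate function;
  # the rule shape here is an assumption about your format
  def compile(%{"type" => "equals", "field" => field, "value" => value}) do
    fn payload -> payload[field] == value end
  end

  def compile(%{"type" => "not_equals", "field" => field, "value" => value}) do
    fn payload -> payload[field] != value end
  end

  # build the id -> function mapping from a list of parsed rules
  def compile_all(rules) do
    Map.new(rules, fn %{"id" => id} = rule -> {id, compile(rule)} end)
  end
end

Building the map once at startup means each stream element should then only cost one map lookup plus the filter call itself, which ought to keep the runtime overhead low.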
