Hi everyone,
I’m working on a project that processes billing data using Broadway to consume messages from Kafka and stores them in a database via Ecto. I need help ensuring that Broadway’s lifecycle is tightly coupled to the connection state of the database. Here’s the problem:
- Ecto.Repo is a supervisor: It can be running even when no database connections are active.
- Broadway depends on Ecto: Starting Broadway after Ecto.Repo doesn’t guarantee that it will be able to write to the database, as connections to the database might not yet be established.
- Handling dropped connections: If database connections are dropped for any reason, I want Broadway to stop immediately and only restart after Ecto has successfully reconnected to the database.
The main motivation behind this behaviour is the nature of the data. Since we’re processing billing data, we need to avoid situations where Kafka messages are consumed but fail to be stored in the database. Additionally, Kafka in this setup doesn’t retry failed messages, so ensuring Broadway only runs when the database is available would be very nice.
I was thinking I could have a supervisor with a :rest_for_one
strategy, have Project.Repo
as its first child, and then add a “sentinel” process, that would not start until it checks that ecto is completely online, and would die if it notices that ecto connections are dying (by monitoring connections). Broadway would come after the sentinel. This feels like I’m depending on ecto internals a bit too much though, and maybe I’m just over complicating the issue. Any advice would be greatly appreciated