Cainophile - Change Data Capture for Postgres using Elixir

I recently announced my work on Cainophile, and it was suggested I post it here for further dissemination and feedback. Cainophile is a library to assist you in building change data capture systems with Postgres and Elixir. I also have released a separate library for decoding the logical Postgres replication log directly in case that is helpful. You can find full details on the announcement.

Announcment
GitHub

9 Likes

Looks really useful and is something I’ve been wanting to do with Postgres in Elixir for a while.

Does using :temporary as the replication slot mean that only new changes will be received by the subscriber and it won’t receive any historical changes when restarted? For a use case I have in mind this is the behaviour I want whereby the subscriber only receives the changes made while its running.

Glad to hear that it might be useful! Honestly I have not attempted a full sync with it, I’m not sure how that would work but presumably if you kept all of you WAL logs around and postgres has access to them, you could request the initial WAL position and go from there. As far as temporary, it really just refers to if you want a persistent slot or not. Persistent slots keep track of what has been acknowledged by the subscriber, and survive across connections. For production use cases, persistent slots are probably the best option, as you can be sure that you’re not missing out on changes due to network issues/subscriber failure. However, if you have a replication slot hanging around and not being acknowledged, I believe it will eventually fill up and cause problems.

Temporary slots just create an ephemeral replication slot that goes away when the subscriber disconnects. This is useful for testing or quick trials as it doesn’t keep garbage around. If you want to try this on a production database for example (after staging!), you could start with temporary to minimize the impact.

As far as what you’re trying to do, it sounds like temporary or named slots would work. What you’ll want to do is specify {0,0} for the wal position (it defaults to this), which means “let postgres decide what to send me”. In the case of temporary slots, you will just receive what is new with {0,0}, as there will be no previously acknowledged wal position that Postgres will try to send. With a named slot, it seems that postgres will start sending you info from where you left off.

Thanks for the reply. Useful to know that {0, 0} for temporary means to receive only new data.