Oban Pro hex repo unreachable, breaking CI

We’re seeing failures in our CI pipeline starting about an hour ago when the runner tries to add the Oban Pro hex repo:

mix hex.repo add oban https://getoban.pro/repo

Downloading public key failed
{:failed_connect, [{:to_address, {~c"oban.pro", 443}}, {:inet, [:inet], :closed}]}

It’s been failing consistently across multiple runs and is blocking all of our CI jobs. The failure happens before any of our code even compiles.

Our CI runs on Ubicloud (GitHub Actions), so it’s possible this is a networking issue on their end, but wanted to check with you first. Is there a known issue with getoban.pro right now?

2 Likes

Yes, there’s an issue with our DNSSEC rotation. We’ve disabled DNSSEC, and it’s taking a while to propagate. As a short-term workaround (just for today), you can add the domain to your hosts file:

  • Add 151.101.189.242 repo.oban.pro to your /etc/hosts
  • As a single CI step: echo "151.101.189.242 repo.oban.pro" | sudo tee -a /etc/hosts
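Note that if the one-liner above runs on every job, it will append a duplicate line each time. A guarded version keeps it idempotent (a sketch, assuming a POSIX shell on the runner; shown here against a temp file so it can run without root — in CI you’d target /etc/hosts via sudo tee):

```shell
# Sketch: idempotent hosts-file pin, demonstrated on a temp file.
hosts_file=$(mktemp)
entry="151.101.189.242 repo.oban.pro"

# Append only if the host isn't already pinned, so re-runs don't duplicate it.
for run in 1 2; do
  grep -qF "repo.oban.pro" "$hosts_file" || echo "$entry" >> "$hosts_file"
done
```

Even run twice, the file ends up with a single entry, so the step is safe to leave in the pipeline until DNS recovers.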

Side note: you should use https://repo.oban.pro instead; the getoban.pro domain has been deprecated for a few years now.

3 Likes

Thanks!

Just a heads up that we’re still experiencing the issue.

Sorry, we’re definitely aware. We’re getting paged about it at least once an hour. There’s absolutely nothing we can do but wait for the DNS change to propagate.

2 Likes

Is there something we should have been doing to cache the hex packages locally? Are there recommended steps for when your servers aren’t available? (I think this is the second time in recent memory that we couldn’t run CI tests or deploy because of server accessibility issues.)

Same thing for the docs: mix hex.outdated is saying I should upgrade Oban and Oban Web, and those releases say Oban Pro needs to be updated too, but I can’t read the docs there to know which migrations need to be run. Will things break if I update only those two packages?

The previous issue was also caused by DNSSEC rotation. We initiated a request to disable DNSSEC at that point, but it didn’t go through, which caused this second flare-up of the exact same issue. This time we’ve confirmed, both via the API and with our registrar, that DNSSEC is disabled, so this won’t happen again.

They should all be upgraded together. There aren’t any urgent fixes in those releases, it’s all features and nice-to-have improvements. It’s best to hold off until the DNS issue is fully resolved.

Any update on the expected timeline for resolution? We’re still running into this issue and are implementing a workaround to get through CI for now, but we’re wondering how long the outage will last.

It would be nice to have a banner or message on the Oban Pro website when this is happening!

1 Like

Do you do any deps caching like Oban Web CI does, for example? That, plus an allow-failure on the Oban repo pull should work, I’m pretty sure.
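For reference, a deps-caching step plus an allow-failure on the repo-add could look something like this in GitHub Actions (a sketch only; the cache key, paths, and secret name are assumptions, not taken from Oban Web’s actual CI):

```yaml
# Hypothetical GitHub Actions steps (step names, paths, and keys are assumptions).
- name: Cache Mix deps
  uses: actions/cache@v4
  with:
    path: |
      deps
      _build
    key: mix-${{ runner.os }}-${{ hashFiles('**/mix.lock') }}

- name: Add Oban repo (allowed to fail during the outage)
  run: |
    mix hex.repo add oban https://repo.oban.pro \
      --auth-key "${{ secrets.OBAN_LICENSE_KEY }}"
  continue-on-error: true
```

With the cache warm, a failed repo-add only hurts when mix.lock changes and a fresh fetch from the repo is actually needed.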

Disabling DNSSEC was reported to take 24-48 hours, and by our measure it took about 30 hours to propagate fully. DNSSEC is fully disabled now and there won’t be any key rotation incidents in the future.

We’d love to have a centralized place to notify people, but not being able to resolve the oban.pro domain is the crux of the problem.

In my experience, the CI issue is caused by adding the repo, not by fetching the packages. That’s because adding the repo with the --fetch-public-key flag forces it to pull the public key immediately, so the command fails before it ever gets a chance to fetch. A workaround is to omit the --fetch-public-key flag up front and let it validate the key lazily when it fetches packages:

 mix hex.repo add oban https://repo.oban.pro \
-  --fetch-public-key SHA256:4/OSKi0NRF91QVVXlGAhb/BIMLnK8NHcx/EWs+aIWPc \
   --auth-key $OBAN_LICENSE_KEY
1 Like