Intermittent Bamboo Auth Failure

I have a web app that handles a fair amount of traffic and sends a good deal of transactional email, around 1,000 messages a day. About 99% of the time there are no issues, but about three months ago emails started failing with this error:

   message: "There was a problem sending the email through SMTP.\n\nThe error is :no_more_hosts\n\nMore detail below:\n\n{:permanent_failure, 'port', :auth_failed}\n",
   raw: {:no_more_hosts, {:permanent_failure, 'port', :auth_failed}}

We get this error between 0 and 20 times a day but, in almost all cases, manually sending the email again solves the problem. This application has been in production, processing about the same number of orders, since 2019, and we only started seeing this issue in April or May of this year. Any ideas?

auth_failed means the initial authentication handshake (before trying to send a message) failed. Your provider may be able to offer more insight from their logs.

One odd thing: the error tuple is {:permanent_failure, smtp_host, :auth_failed} - is 'port' the intended value, a placeholder for the real one, or something else?

@al2o3cr sorry for the late response here.

Yeah, 'port' is a placeholder.
Due to the intermittent nature of this error, I ended up just pattern matching on the failure and retrying the send from that point. Since then we have not had any issues, but I’d still love to know what the initial cause was. You mentioned that it is the initial authentication handshake, so would that be an AWS issue, either with EC2 or SES, or something else?
My initial hypothesis was that our rate limit was throttling sends, but after upping our limit to some impossibly high amount the issue persisted with the same frequency.
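
A minimal sketch of what that kind of pattern-match-and-retry workaround could look like. This is not the code from the post: the module name `RetryOnAuthFailure`, the `deliver/3` function, the attempt count, and the backoff values are all hypothetical, and it assumes the send function returns `{:ok, _}` or `{:error, error}`, where `error` carries the `raw` field shown in the error output above.

```elixir
defmodule RetryOnAuthFailure do
  @moduledoc """
  Hypothetical helper: retries a send function when the error's `raw`
  term matches the intermittent :auth_failed shape from this thread.
  """

  # `send_fun` is any zero-arity function returning {:ok, result} or
  # {:error, error}. Only the :auth_failed shape is retried; every other
  # result (success or a different error) is returned unchanged.
  def deliver(send_fun, attempts \\ 3, backoff_ms \\ 500) when attempts > 0 do
    case send_fun.() do
      {:error, %{raw: {:no_more_hosts, {:permanent_failure, _host, :auth_failed}}}}
      when attempts > 1 ->
        # Wait, then retry with one fewer attempt and a doubled delay.
        Process.sleep(backoff_ms)
        deliver(send_fun, attempts - 1, backoff_ms * 2)

      other ->
        other
    end
  end
end
```

Assuming a Bamboo version where `deliver_now/1` returns a tuple rather than raising, usage might look like `RetryOnAuthFailure.deliver(fn -> MyApp.Mailer.deliver_now(email) end)`, where `MyApp.Mailer` and `email` are placeholders. On a version that raises instead, the send function would need to rescue the error and return it as `{:error, error}`.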

Hey @travisf, did you ever get to the bottom of this? We’re seeing the same thing - most emails work fine, but roughly once per hour we get the following. We have retries set at the Bamboo level and outside of that too.

There was a problem sending the email through SMTP.

The error is :no_more_hosts

More detail below:

{:permanent_failure, '', :auth_failed}

No, we never did. As I mentioned in my response, the workaround was to just pattern match on the error and try again if we encountered it. That’s worked fairly well.

Are you, by any chance, using SES?

Yup we’re using SES as well. We also retry (100x!) and still observe ~1 failure per hour.

I’m considering moving to the SES plugin.

I haven’t tried the SES package. Our solution is working pretty well; most of the issues we have now are with specific email addresses, which is likely more of an SES problem than a Bamboo or codebase one. Ultimately there is some talk of moving away from SES to Mandrill because they have much better logging and retry features.

Did you check the service quotas of SES?

Yeah. Under quota.

I moved to the SES adapter. It was super easy and I’ve seen zero send errors in the 12 hours since going to prod.
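
For reference, switching adapters in Bamboo is mostly a config change. The thread does not say which package was used, so this is only a sketch assuming the `bamboo_ses` package and its `Bamboo.SesAdapter` module. `:my_app` and `MyApp.Mailer` are placeholder names, and AWS credentials and region are assumed to be configured separately.

```elixir
# config/config.exs
import Config

# Assumption: the bamboo_ses package is in deps and provides Bamboo.SesAdapter.
# :my_app and MyApp.Mailer are placeholders for the real app and mailer module.
config :my_app, MyApp.Mailer,
  adapter: Bamboo.SesAdapter
```

With this adapter, mail goes through the SES API rather than the SMTP endpoint, so the SMTP authentication handshake that was failing is no longer part of the send path.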