How do I properly rate-limit my own requests to an external API?

I use Oban to make external API requests every N minutes. It’s turned out that this frequency is too high: it triggers a “too many requests” error from the API, since the API enforces a rate limit.

How would I properly throttle my requests? I could a) increase N, or b) insert a few Process.sleep(10_000) calls inside the code that makes the requests.

An issue with (a) is that as my DB grows, I’ll have to keep increasing N too, by trial and error.

An issue with (b) is that if Process.sleep(M) is too large, the next iteration of Oban.perform(...) may start while the current one is still executing. If it’s too small, it may still cause an API limit exceeded error.

How to do this properly? And in a simple manner too.

You could discard the current request that got a “too many requests” error:

  1. Pattern match on the error body or HTTP status
  2. If it’s a “too many requests” error, return {:cancel, reason} from the Oban worker
  3. The job will be canceled, and the next scheduled run (every N minutes) will try again
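A minimal sketch of those steps, where MyApp.Client.call/1 is a placeholder for whatever HTTP client wrapper you already have:

```elixir
defmodule MyApp.ApiWorker do
  use Oban.Worker, queue: :api

  @impl Oban.Worker
  def perform(%Oban.Job{args: args}) do
    # MyApp.Client.call/1 stands in for your HTTP call; the point is matching
    # on the 429 status (or the "too many requests" error body).
    case MyApp.Client.call(args) do
      {:ok, _response} ->
        :ok

      {:error, %{status: 429}} ->
        # Cancel this job; the next cron run (every N minutes) tries again.
        {:cancel, :rate_limited}

      {:error, reason} ->
        {:error, reason}
    end
  end
end
```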

This is a simple solution I can think of without much knowledge of the problem.

It’s not a single API request per perform(...) but multiple ones. Namely, in each perform(...) I iterate over DB rows, making multiple API requests.

I don’t know much about how Oban works under the hood, but you could see if there’s a way to integrate Hammer into the execution loop. It’s a rate-limiter for Elixir that acts as a kind of traffic signaling device.
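For example, a hedged sketch of wrapping each call in the row loop with Hammer’s classic check_rate/3 API (newer Hammer versions restructure this, and MyApp.Client.call/1 is again a placeholder):

```elixir
defp request_with_limit(row) do
  # Allow at most 30 calls to the external API per 60-second window.
  case Hammer.check_rate("external-api", 60_000, 30) do
    {:allow, _count} ->
      MyApp.Client.call(row)

    {:deny, _limit} ->
      # Window exhausted: back off briefly, then retry this row.
      Process.sleep(5_000)
      request_with_limit(row)
  end
end
```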

2 Likes

It seems like you need to throttle the requests made between records inside the perform call.

Is there a way to ask Oban to start the next job N minutes after the current one finishes execution? Instead of scheduling it on a predefined interval?

Also, you may need to “pause” between each API call within the perform function.
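Something as simple as this inside perform/1, with MyApp.Client.call/1 standing in for the actual request:

```elixir
Enum.each(rows, fn row ->
  MyApp.Client.call(row)
  # Fixed pause between calls to stay under the provider's limit.
  Process.sleep(2_000)
end)
```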

I always try to create a plain GenServer for things like this, since it’s easier to handle, and reach for Oban when I need to deal with more complicated scenarios.

1 Like

That would be an overcomplication.

I don’t know. Is there?

Eliminate the Oban cron plugin and instead insert a new job manually, dynamically, at the end of perform(...)? Huh?

Can you elaborate on why you think this?

Oban Pro has rate limiting (Smart Engine — Oban v2.11.0). That, however, operates at the job level, not the API request level. If those are roughly 1:1, though, it might work fine.

5 Likes

That might work. Combining this with a pause between each request could solve the issue. Having said that, maybe there’s a better “perform” logic: one API request per job somehow? Querying only as many records as the API’s threshold allows and scheduling the next job accordingly?
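A hedged sketch of that fan-out idea: the cron-driven job only enqueues work, one job per row, staggered so that each enqueued job makes a single API call (MyApp.Row and MyApp.RequestWorker are hypothetical names):

```elixir
defmodule MyApp.FanOutWorker do
  use Oban.Worker, queue: :fan_out

  import Ecto.Query

  @impl Oban.Worker
  def perform(_job) do
    query = from r in MyApp.Row, select: r.id

    query
    |> MyApp.Repo.all()
    |> Enum.with_index()
    |> Enum.each(fn {id, index} ->
      # Spread the per-row jobs out over time instead of firing them at once.
      %{row_id: id}
      |> MyApp.RequestWorker.new(schedule_in: index * 5)
      |> Oban.insert()
    end)

    :ok
  end
end
```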

I’ve done something 99% the same a long time ago, GenServer-based, but I’m not willing to dig it up at the moment.

You could write your own GenServer that is responsible for contacting the 3rd-party API (when you send it a message) and have it preserve state that tracks how many requests you have left for, e.g., the next 5 minutes; only when you are about to hit the rate limit do you Process.sleep.

Furthermore, using a GenServer for this immediately rids you of any potential race conditions, i.e. making 2 or more requests just before you hit the rate limit, because sending messages to a GenServer is serial, on a first-come-first-served (FIFO) basis.
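A rough sketch of that idea, with purely illustrative limits; callers wrap each API call as MyApp.ApiGate.request(fn -> MyApp.Client.call(row) end):

```elixir
defmodule MyApp.ApiGate do
  # All API calls funnel through this process. It tracks how many calls were
  # made in the current window and sleeps before a call that would exceed the
  # limit. The limit and window below are made-up numbers.
  use GenServer

  @limit 30
  @window_ms 60_000

  def start_link(_opts), do: GenServer.start_link(__MODULE__, nil, name: __MODULE__)

  def request(fun) when is_function(fun, 0),
    do: GenServer.call(__MODULE__, {:request, fun}, :infinity)

  @impl true
  def init(_), do: {:ok, %{count: 0, window_started: now()}}

  @impl true
  def handle_call({:request, fun}, _from, state) do
    state = maybe_reset(state)

    state =
      if state.count >= @limit do
        # About to exceed the limit: sleep out the rest of the window.
        Process.sleep(@window_ms - (now() - state.window_started))
        %{count: 0, window_started: now()}
      else
        state
      end

    {:reply, fun.(), %{state | count: state.count + 1}}
  end

  defp maybe_reset(%{window_started: started} = state) do
    if now() - started >= @window_ms, do: %{count: 0, window_started: now()}, else: state
  end

  defp now(), do: System.monotonic_time(:millisecond)
end
```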

Or you can use opq or :jobs (that one is in Erlang but fairly easy to use). I have used both successfully.

FWIW, I am not a huge fan of Oban even though it works perfectly; I feel it confuses people, so I never reach for it unless I need persistence for the jobs. That is a real requirement a good chunk of the time, though, so maybe you’re better off just using Oban with uniqueness rules and maximum concurrency settings. That works quite fine as well.

2 Likes

An additional dependency that hasn’t yet been shown to even be required.

There are a couple of things that may work for you; it’s difficult to say what the best solution would be without knowing more about the specifics of the problem.

Yes, Oban supports scheduling, either after N seconds or at a specific datetime:

https://hexdocs.pm/oban/Oban.html#module-scheduling-jobs
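From those docs, scheduling looks roughly like this (MyApp.ApiWorker stands in for your own worker module):

```elixir
# Run 10 minutes from now.
%{row_id: 123}
|> MyApp.ApiWorker.new(schedule_in: 600)
|> Oban.insert()

# Or at an exact timestamp.
%{row_id: 123}
|> MyApp.ApiWorker.new(scheduled_at: DateTime.add(DateTime.utc_now(), 600, :second))
|> Oban.insert()
```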

If this is the route you go for, make sure you read the “Reliable Scheduling” docs to avoid some unexpected behaviour:

https://hexdocs.pm/oban/reliable-scheduling.html

Something else to look at is custom backoff:

https://hexdocs.pm/oban/Oban.Worker.html#module-contextual-backoff
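For instance, overriding the worker’s backoff/1 callback (the numbers here are arbitrary) so retries spread out once the API starts rejecting requests:

```elixir
@impl Oban.Worker
def backoff(%Oban.Job{attempt: attempt}) do
  # Delay in seconds before the next retry: 32, 34, 38, 46, 62, ...
  trunc(:math.pow(2, attempt)) + 30
end
```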

Hope that helps.

2 Likes

People have already mentioned Oban.

I’ll just point to a real world config example I came across recently:

2 Likes

As I’ve explained, I’m using Oban in cron mode. Therefore, it’s already scheduled, with repetition, statically via the config.
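For reference, that static setup looks roughly like this in config (the interval and worker name here are placeholders):

```elixir
config :my_app, Oban,
  repo: MyApp.Repo,
  queues: [api: 1],
  plugins: [
    {Oban.Plugins.Cron,
     crontab: [
       # Kick off the API sync every 10 minutes.
       {"*/10 * * * *", MyApp.ApiWorker}
     ]}
  ]
```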

So Oban is capable of slowing down the requests on its own then? This will work for me.

However, what will happen if

the next iteration of Oban.perform(...) starts while the current one is STILL being executed

?

As we don’t actively proselytize, this is a marvelous compliment. We’ll take it!

7 Likes

Yes, but only with Pro’s Smart Engine. In OSS you’d be manually chaining jobs, handling edge cases, etc. yourself.
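Manual chaining in OSS can be as simple as enqueueing the next run at the end of perform/1 instead of relying on the cron plugin; a rough sketch:

```elixir
@impl Oban.Worker
def perform(%Oban.Job{} = _job) do
  # process_all_rows/0 is a placeholder for the existing per-row request loop.
  process_all_rows()

  # Schedule the next run N minutes after this one finishes, so runs
  # never overlap the way fixed cron intervals can.
  %{}
  |> __MODULE__.new(schedule_in: 10 * 60)
  |> Oban.insert()

  :ok
end
```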

1 Like

Hey :wave:
a bit late to the party, but I think it’s worth sharing this library, Regulator, which provides adaptive concurrency limits around external resources.

It does not integrate with Oban, but it might be worth a look to understand how they implemented the logic.

Cheers :v:

1 Like

Oh, that’s a good one; bookmarked it right away. Not surprising, though, because the author is fantastic.

1 Like

You could also check out ExWaiter — ex_waiter v1.3.1

For better or worse, it’s not opinionated about how you keep track of whether requests can be made.

1 Like