Oban - Caching Ecto Statements ETS Table memory leak -> OOM

Hello

  • We’re using Oban.Pro 1.14
  • we have 23 queues
  • we use partitioning
  • we use uniqueness
  • we use local limits
  • we use global limits
  • we use rate limits
  • we use DynamicCron heavily (1800+ cron entries)
  • we have multiple nodes
  • we process millions of jobs a day

We are having an issue where our Ecto.Repo ETS table is slowly growing until we get an OOM (currently reaching our limit of 8GB).

Our Repo is mainly used for Oban.

I am digging deeper to isolate where this comes from, but it seems related to Oban internals :sweat_smile:

Here we are using 105MB after a few hours:

Some of the ETS table details at this point:

  • has 16178 references to the oban_jobs table
  • has 12 references to name: "ecto_*"

You can find a small portion of it at https://gist.githubusercontent.com/mathieurousseau/6e6529da36bc09927fb9227ed4acbebf/raw/72f46766b6acf9aede031d820e70df59577afd10/gistfile1.txt

Something I have isolated:

  • The leader node's Ecto ETS table does not grow if that node does not process any jobs (it is probably only handling core Oban duties and the Cron)

What can I provide to help with this investigation?
Is there an obvious configuration change that could mitigate this? (Setting prepare: :unnamed in the Ecto configuration did not change anything.)

Thanks.

Mathieu

What version of Ecto are you using? I seem to recall an issue in Ecto related to this that was fixed at one point.

Probably talking about this?

As indicated, this is fundamentally an Ecto issue. There are improvements in more recent versions of Pro to compensate for this issue by disabling prepared statements for highly dynamic queries.

Aside from upgrading Ecto and Pro, you could disable prepared statements for all Repo queries with prepare: :unnamed (double check the options in the Ecto docs).
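For reference, a minimal sketch of what that Repo-wide setting could look like, assuming the Postgres adapter (where :prepare is a Postgrex connection option). The names :my_app and MyApp.Repo are placeholders, not taken from this thread:

```elixir
# config/config.exs
import Config

# :my_app and MyApp.Repo are hypothetical names; substitute the
# OTP app and the Repo module that Oban is configured to use.
config :my_app, MyApp.Repo,
  # Ask the driver to use unnamed prepared statements for all queries
  prepare: :unnamed
```

One way to confirm the option actually reached the running Repo is to evaluate `MyApp.Repo.config()[:prepare]` in a remote IEx session on the deployed node.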

About the Ecto fix post:
You can find my comment on that post saying that the latest version of Ecto did not solve the issue for us :sweat_smile:

We tested 1.5 RC but the issue remained. We are on the latest Ecto, and prepare: :unnamed did not work :slightly_frowning_face:

I will try to create a dummy project to reproduce this behaviour.

Whoops, sorry. No clue then.

That should disable prepared statement caching entirely. Are you certain you configured that for the Repo that Oban is using?

Please do. If you have a reproduction I’ll gladly investigate.

yeah I am sure I have configured prepare: :unnamed on the Repo :slight_smile:

The catch, though: I have double-checked our deploy and:

  • prepare: :unnamed done in runtime.exs does not work

I will put it in the config.exs and test again.

Well, that did not make any difference. I double-checked that the config was loaded, and the ETS table still grows :frowning:

That sounds like an issue with ecto or dbconnection itself then. Oban makes judicious use of prepare: :unnamed for frequent, variable queries, and it would be a problem if the option isn’t being respected.

yes, that’s probably it.

I have a dummy project that I need to clean up that I will share shortly.

But here are the queries that are still being cached:

all, delete_all, update_all

Example of the largest one (141k):

Hi

here is the POC:

https://github.com/mathieurousseau/simple_oban

You should be able to see the ETS table growing.

The homepage gives you a list of the entries in the SimpleOban.Repo ETS table.

The repo appears to be private. Will you make it public, or invite us (@sorenone and @sorentwo)?

That was private :frowning:

Made it public.

Also on our real app, after 11h we had: 2695 entries of :update_all for a total of 2714 entries.
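For anyone who wants to track the same numbers on their own nodes, here is a rough sketch of how entry counts and memory could be read with :ets.all/0 and :ets.info/1. The EtsReport module and its function name are made up for illustration. It looks tables up by their name attribute rather than assuming the table is registered as a named table, and the demo uses a throwaway table instead of a real Repo:

```elixir
defmodule EtsReport do
  @moduledoc "Hypothetical helper: summarises ETS tables that carry a given name."

  # Returns one map per ETS table whose :name attribute equals `name`.
  # Going through :ets.all/0 works whether or not the table was created
  # with the :named_table option.
  def tables_named(name) do
    word_size = :erlang.system_info(:wordsize)

    for tid <- :ets.all(),
        info = :ets.info(tid),
        # :ets.info/1 returns :undefined if the table vanished meanwhile
        is_list(info),
        Keyword.get(info, :name) == name do
      %{
        table: tid,
        entries: Keyword.get(info, :size),
        # :memory is reported in machine words, so convert to bytes
        bytes: Keyword.get(info, :memory) * word_size
      }
    end
  end
end

# Demo with a throwaway table; for the POC above the name to pass
# would presumably be SimpleOban.Repo.
demo = :ets.new(:demo_cache, [:set, :public])
:ets.insert(demo, [{:a, 1}, {:b, 2}])
IO.inspect(EtsReport.tables_named(:demo_cache))
```

Calling this periodically (for example from a remote IEx session) and logging the result would show whether the entry count keeps climbing over time.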