lucasaas98

lucasaas98

Upgrading Oban/Pro/Web causes database load spike

Hello again!

Following yesterday’s topic success we have tried to deploy the new version with the updated code to production but we are running into issues where the database load spikes to 100%. I have a feeling this would not happen if we had shorter pruning time for our jobs table since the same issues does not happen in other environments like our Sandbox environment but would be great to really understand the underlying issue.

For reference, this is the update:

  • Oban: from 2.17.3 to 2.17.10
  • Oban Pro: from 1.3.0 to 1.4.9
  • Oban Web: from 2.10.2 to 2.10.4

And our database:

  • PostgreSQL 13.13
  • 2 vcpus
  • 7.5GB ram

I have some screenshots and evidence from our Cloud SQL instance that should help to understand the issue, I will include them in details tags so the thread looks nice.

Database Load starts when we deploy

Top 3 offending queries and respective load

Query 1 and call times

SELECT
  $1
FROM
  "public"."oban_jobs" AS o0
WHERE
  (o0."meta" ? $19)
  AND (o0."meta" ->> $20 = $2)
  AND (o0."meta" ->> $21 IS NULL)
  AND (o0."state" = $3)
UNION ALL (
  SELECT
    $4
  FROM
    "public"."oban_jobs" AS o0
  WHERE
    (o0."meta" ? $22)
    AND (o0."meta" ->> $23 = $5)
    AND (o0."meta" ->> $24 IS NULL)
    AND (o0."state" = $6)
  LIMIT
    $25)
UNION ALL (
  SELECT
    $7
  FROM
    "public"."oban_jobs" AS o0
  WHERE
    (o0."meta" ? $26)
    AND (o0."meta" ->> $27 = $8)
    AND (o0."meta" ->> $28 IS NULL)
    AND (o0."state" = $9)
  LIMIT
    $29)
UNION ALL (
  SELECT
    $10
  FROM
    "public"."oban_jobs" AS o0
  WHERE
    (o0."meta" ? $30)
    AND (o0."meta" ->> $31 = $11)
    AND (o0."meta" ->> $32 IS NULL)
    AND (o0."state" = $12)
  LIMIT
    $33)
UNION ALL (
  SELECT
    $13
  FROM
    "public"."oban_jobs" AS o0
  WHERE
    (o0."meta" ? $34)
    AND (o0."meta" ->> $35 = $14)
    AND (o0."meta" ->> $36 IS NULL)
    AND (o0."state" = $15)
  LIMIT
    $37)
UNION ALL (
  SELECT
    $16
  FROM
    "public"."oban_jobs" AS o0
  WHERE
    (o0."meta" ? $38)
    AND (o0."meta" ->> $39 = $17)
    AND (o0."meta" ->> $40 IS NULL)
    AND (o0."state" = $18)
  LIMIT
    $41)
LIMIT
  $42
Query 2 and call times

SELECT
  $1
FROM
  "public"."oban_jobs" AS o0
WHERE
  (o0."meta" ? $4)
  AND (o0."meta" ->> $5 = $2)
  AND (o0."meta" ->> $6 = $3)
  AND (o0."state" IN ($7,
      $8,
      $9,
      $10,
      $11,
      $12,
      $13))
Query 3 and call times

SELECT
  $1
FROM
  "public"."oban_jobs" AS o0
WHERE
  (o0."meta" ? $13)
  AND (o0."meta" ->> $14 = $2)
  AND (o0."meta" ->> $15 IS NULL)
  AND (o0."state" = $3)
UNION ALL (
  SELECT
    $4
  FROM
    "public"."oban_jobs" AS o0
  WHERE
    (o0."meta" ? $16)
    AND (o0."meta" ->> $17 = $5)
    AND (o0."meta" ->> $18 IS NULL)
    AND (o0."state" = $6)
  LIMIT
    $19)
UNION ALL (
  SELECT
    $7
  FROM
    "public"."oban_jobs" AS o0
  WHERE
    (o0."meta" ? $20)
    AND (o0."meta" ->> $21 = $8)
    AND (o0."meta" ->> $22 IS NULL)
    AND (o0."state" = $9)
  LIMIT
    $23)
UNION ALL (
  SELECT
    $10
  FROM
    "public"."oban_jobs" AS o0
  WHERE
    (o0."meta" ? $24)
    AND (o0."meta" ->> $25 = $11)
    AND (o0."meta" ->> $26 IS NULL)
    AND (o0."state" = $12)
  LIMIT
    $27)
LIMIT
  $28
What our job numbers looks like

Any ideas? Did I miss something in the changelogs that could have caused this?

Thanks for the help and sorry for the fat finger ghost thread I created just before :cry:

Most Liked

benwilson512

benwilson512

Author of Craft GraphQL APIs in Elixir with Absinthe

Can you \d oban_jobs? I’m curious to see exactly what indexes are / aren’t in place.

As a workaround for efficiently clearing out completed jobs like this I have sometimes used this approach:

begin;
lock oban_jobs in access exclusive mode;
create TEMPORARY table useful_oban_jobs as (select * from oban_jobs where state != 'completed');
truncate oban_jobs;
insert into oban_jobs (select * from useful_oban_jobs);
commit;

This ends up being much faster than a delete + vacuum believe it or not.

Overall I’d consider using partitioned tables if you want millions of jobs around, although Postgres 13 being nearly 4 years old at this point I don’t recall how good its support for partitioned tables is.

sorenone

sorenone

Oban Core Team

Heya,

Those ^ are queries from Workflows.

Here is a link to the specific migration that should help!

https://getoban.pro/docs/pro/1.4.10/changelog.html#upgrading-workflows-in-v1-4

We’re currently working on better centralized migrations.

benwilson512

benwilson512

Author of Craft GraphQL APIs in Elixir with Absinthe

And to add on to this if you’ve been dealing with a large number of jobs deleted I’d consider a vacuum full oban_jobs but do note that this will lock the table while it rebuilds. We’ve had issues with table bloat on the oban_jobs table when we get heavy delete activity.

Where Next?

Popular in Questions Top

Tee
can someone please explain to me how Enum.reduce works with maps
New
chokchit
** (DBConnection.ConnectionError) connection not available and request was dropped from queue after 2733ms. You can configure how long re...
New
JeremM34
Hello, how can I check the Phoenix version ? Thanks !
New
alice
Hey, Just curious what are the main benefits of Elixir compared to Clojure? When is Elixir more useful than Clojure and vice versa? Th...
New
Emily
I have VueJS GUIs with the project generated using Webpack. I have Elixir modules that will need to be used by the VueJS GUIs. I fore...
New
nobody
How to bind a phoenix app to a specific ip address? could not find anything about that, nowhere, unfortunately, but for me this is qui...
New
hariharasudhan94
lets say i have a sample like a = 20; b = 10; if (a > b) do {:ok, "a"} end if (a < b) do {:ok, b} end if (a == b) do {:ok, "eq...
New
sergio_101
I am VERY much an elixir newbie. I have taken one elixir course and one phoenix course on Udemy. During that course, I saw the instructor...
New
JDanielMartinez
Hi! May someone helps me, please! I have two apps into an umbrella project: the first one is Database, which manages queries, and the se...
New
PeterCarter
There are pre-rolled solutions for other frameworks that do work. However, Phoenix does not seem to have these. Have people had good expe...
New

Other popular topics Top

Harrisonl
We have an ECS cluster with 4 services, where each task joins a single cluster, via discovery ECS discovery service. Currently when I de...
New
chrismccord
Phoenix 1.4.0 released Phoenix 1.4 is out! This release ships with exciting new features, most notably with HTTP2 support, improved deve...
688 30840 112
New
josevalim
Hi everyone, One of the features added to Elixir early on to help integration with Erlang code was the idea of overridable function defi...
New
aesmail
Hello guys, I have finally made it. I created an admin interface for a framework. It’s been on my todo list for years and with the curre...
New
belgoros
I’m not a pro in using Regex and can’t figure out why the following behaviour happens, especially if we take into account the difference ...
New
vonH
When I run the Plug and I recompile I wind up having to use Ctrl C to quit iex and start again. Witht the help of rlwrap I can use the cu...
New
saif
Hello everyone, Long time lurker first time poster here. I’ve recently begun working on Elixir full-time again! :raised_hands: It’s been...
New
Qqwy
Update: How to use the Blogs & Podcasts section You can post links to your blog posts or podcasts either in one of the Official Blog...
3271 126226 1237
New
PeterCarter
There are pre-rolled solutions for other frameworks that do work. However, Phoenix does not seem to have these. Have people had good expe...
New
jononomo
For some reason my phoenix channels are working for me in my local dev environment, but as soon as I deploy via Docker, I get a 403 error...
New

We're in Beta

About us Mission Statement