Problem with Oban Workers

I’m developing process automation using Oban, but lately I’m facing a problem with updating the code in worker modules.

Whenever I update a worker module and start the application, the worker keeps processing jobs as if the code hadn’t been updated. Even if I comment out everything in the worker’s perform(_job) function, Oban continues running the code that was there before. It’s like there’s some kind of cache.

For example, this job runs successfully:

  @impl Oban.Worker
  def perform(_job) do
    Users.list_users_from_last_day()
    |> Emails.digest_email()
    |> Mailer.deliver_now()
  end

If I change it to:

  @impl Oban.Worker
  def perform(_job) do
    IO.inspect("Nothing to do")
  end

the old version of the job keeps running successfully. Likewise, if I change the tags for testing, they don’t change in the database.

Another common error I’m experiencing is Oban’s “unknown worker” error, which appears when I change the worker set in config.exs.

I’ve already tried:

  • clearing the cache files → sudo rm -rf ./_build ./.elixir_ls ./mix.lock ~/.hex/cache.ets
  • recompiling the app from scratch

Everything works when running through IEx or running the automated tests. The problem only occurs in the homolog environment.

Could anyone help me debug this problem?

It seems like you’re experiencing a combination of multiple versions of a module in memory, and lingering data in the database. Is it possible you have another app instance running somewhere?

No instances are running in the background. The problem is the same whether I run via Docker or locally without Docker. The only thing that works is updating the crontab. Can you suggest any tests I can run to check whether it’s something in the app’s startup config?

Ty!

There isn’t anything in the config or worker definitions that would prevent code reloading. As far as Oban knows, the worker is a string that represents a module name. When the job executes it looks up the module name and executes the job—the module definition isn’t stored in the database.
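To illustrate the point about workers being stored as names, here is a minimal sketch of resolving a string to a module at execution time. It is not Oban's source code; WorkerLookup and DemoWorker are hypothetical names, and the error strings merely mirror the ones Oban reports. Because the lookup happens against whatever module is loaded on the node that runs the job, stale behaviour points to stale compiled code or to a different node, not to the database.

```elixir
defmodule DemoWorker do
  # Hypothetical stand-in for a real worker module.
  def perform(_job), do: :ok
end

defmodule WorkerLookup do
  # Illustrative only: resolves a module name stored as a string,
  # the way a job runner could do it just before executing a job.
  def resolve(name) when is_binary(name) do
    module = Module.concat([name])

    cond do
      not Code.ensure_loaded?(module) ->
        {:error, "unknown worker: #{name}"}

      not function_exported?(module, :perform, 1) ->
        {:error, "module is not a worker: #{name}"}

      true ->
        {:ok, module}
    end
  end
end
```

Calling WorkerLookup.resolve("DemoWorker") returns the module currently loaded under that name, so redefining the module changes what runs without touching any stored job.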

Can you call the perform/1 function directly from a test to ensure that it works as intended?

Yes, it works. The “bug” appears when running mix phx.server with MIX_ENV=dev

Lately I’ve had a somewhat similar problem: after many job executions, debug messages stop showing up, although the jobs still run normally. This happens in my dev environment; it doesn’t happen in production.

This sounds like the telemetry handler for logging gets detached in development. There will be a warning in your logs stating that the handler encountered an error if that’s the case.

That’s strange.

As a workaround you can always delete the _build directory (or just _build/dev/lib/myapp).

Also you can call :code.which(MyWorker) to retrieve the beam file for your module so you could check the modification time of this file, or load it in an iex session to check stuff.
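A small sketch of that check, assuming it is pasted into an iex session of the running app. BeamInfo is a hypothetical helper name; :code.which/1 and File.stat/2 are the standard Erlang and Elixir calls.

```elixir
defmodule BeamInfo do
  # Reports which .beam file a module resolves to and when that file
  # was last modified (local time). Pass your own worker module.
  def inspect_module(module) when is_atom(module) do
    with path when is_list(path) <- :code.which(module),
         path = List.to_string(path),
         {:ok, stat} <- File.stat(path, time: :local) do
      {:ok, %{path: path, modified_at: NaiveDateTime.from_erl!(stat.mtime)}}
    else
      # :non_existing, :preloaded, :cover_compiled, or a File.stat error
      other -> {:error, other}
    end
  end
end
```

For example, BeamInfo.inspect_module(MyWorker) should point into _build/dev/lib/... with a modification time close to your last edit; an older timestamp suggests the file was not recompiled.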

Curious to find out more what you’re building. We are also starting a process automation project soon. Would appreciate if you drop me a message.

Update:

Running Oban tests using IEx in dev:

  iex> Oban.Worker.from_string("Oban.Integration.Worker")
  {:ok, Oban.Integration.Worker}

  iex> defmodule NotAWorker, do: []
  ...> Oban.Worker.from_string("NotAWorker")
  {:error, %RuntimeError{message: "module is not a worker: NotAWorker"}}

  iex> Oban.Worker.from_string("RandomWorker")
  {:error, %RuntimeError{message: "unknown worker: RandomWorker"}}

Everything worked as it should, and the module is recognized. I also ran the app in production with Oban workers and it worked! The error only persists locally in the dev environment, with iex -S mix phx.server and mix phx.server: the code doesn’t update and I get the unknown worker error from Oban.

I believe it’s a problem with the compilation of my code, because when I start the app I sometimes get the error: ** (File.Error) could not write to file "/home/desktop/projects/api/_build/dev/lib/api/ebin/my-worker-file.beam": permission denied. Running with a release in prod works. Thanks for the help; I’ll look for the cause of this compilation issue.
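One way to narrow down a permission error like this is to list the build artifacts the current user cannot write to. The sketch below is an assumption-laden helper, not something from the thread: BuildPerms is a hypothetical name, the default path assumes a standard Mix layout, and the possible cause (files created by another user, for instance root inside Docker or via sudo) is only a guess.

```elixir
defmodule BuildPerms do
  # Returns every file or directory under `dir` that the current OS
  # user cannot write to. An unwritable .beam file or ebin directory
  # would stop the compiler from replacing stale code.
  def unwritable(dir \\ "_build/dev/lib") do
    dir
    |> Path.join("**/*")
    |> Path.wildcard()
    |> Enum.filter(fn path ->
      case File.stat(path) do
        {:ok, %File.Stat{access: access}} -> access not in [:write, :read_write]
        {:error, _reason} -> true
      end
    end)
  end
end
```

Running BuildPerms.unwritable() from the project root returns an empty list when everything is writable; any paths it does return are worth checking for ownership.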

I will observe, thanks for the reply.

Have you been able to figure out what is causing this? I have the same issue: even after removing _build, it still uses the old worker code when running through iex -S mix phx.server. It seems to be cached somewhere?

Hahaha, never mind. Old code really was running, but only because I am using GitLab review environments, which start automatically after a git push. So, for some reason, the review server was always the first to pick up the job and send out the email, not my dev laptop. Leaving this here as it might save someone else from wasting two hours trying to figure out what was causing this :-/
