How to handle max concurrency for ChromicPDF?

I’m using ChromicPDF in my application to allow a user to download a pdf. In my application, I have a button they can click that sends a request to my controller which generates the pdf with ChromicPDF and sends it back to the user with send_download from the controller.
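
For context, the setup described above might look roughly like the following sketch. The controller name, action, and HTML are hypothetical, and it assumes print_to_pdf/2 returns a Base64-encoded blob when no output option is given.

```elixir
defmodule MyAppWeb.InvoiceController do
  # Hypothetical Phoenix controller; `MyAppWeb` and the action name are assumptions.
  use MyAppWeb, :controller

  def download(conn, %{"id" => id}) do
    # Placeholder HTML; a real app would render a template here.
    html = "<h1>Invoice #{id}</h1>"

    # Blocks the request process until a pool worker has printed the PDF.
    # Assumed return shape: {:ok, base64_blob} when no :output option is passed.
    {:ok, base64_pdf} = ChromicPDF.print_to_pdf({:html, html})

    send_download(conn, {:binary, Base.decode64!(base64_pdf)},
      filename: "invoice-#{id}.pdf"
    )
  end
end
```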

ChromicPDF docs state:

To increase or limit the number of concurrent workers, you can pass pool configuration to the supervisor. Please note that this is a non-queueing session pool. If you intend to max it out, you will need a job queue as well.
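
A minimal sketch of passing pool configuration to the supervisor, assuming a standard application supervision tree. The module names and the size of 5 are illustrative assumptions; only the session_pool: [size: ...] option is taken from the ChromicPDF docs and error message quoted in this post.

```elixir
defmodule MyApp.Application do
  # Hypothetical application module; adjust names to your project.
  use Application

  @impl true
  def start(_type, _args) do
    children = [
      # `size` caps how many print jobs may run concurrently.
      # 5 is an arbitrary example value.
      {ChromicPDF, session_pool: [size: 5]}
    ]

    Supervisor.start_link(children, strategy: :one_for_one, name: MyApp.Supervisor)
  end
end
```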

I was able to generate the error in testing:

** (ChromicPDF.Browser.ExecutionError) Caught EXIT signal from NimblePool.checkout!/4

      ** (EXIT) time out

This means that your operation was unable to acquire a worker from the pool
within 5000ms, as all workers are currently occupied.
You're experiencing this error randomly under load. This would indicate that
   the number of concurrent print jobs exceeds the total number of workers in
   the pool, so that all workers are occupied.

   To fix this, you need to increase your resources, e.g. by increasing the number
   of workers with the `session_pool: [size: ...]` option.

   However, please be aware that while ChromicPDF (by virtue of the underlying
   NimblePool worker pool) does perform simple queueing of worker checkouts,
   it is not suitable as a proper job queue. If you expect to peaks in your load
   leading to a high level of concurrent use of your PDF printing component,
   a job queue like Oban will provide a better experience.

The error message suggests Oban. However, I’m not sure Oban is the best option for a couple reasons:

  1. Oban isn’t designed for the caller to block awaiting the result of the job. However, it does have the concept of a Notifier which can be used to return the pdf to the controller process to send back to the user.
  2. I would prefer not to rebuild the conn and assigns within the job because I’m reusing some functions in my controller. But I also don’t want to pass the results of ChromicPDF.Template.source_and_options/1 to the job directly.

The other options I’ve looked at are:

Any suggestions for the best option to queue the pdf printing work when I don’t want the work to be done in the background?

Currently on mobile, so I can add more info when I am in front of a computer. I make heavy use of ChromicPDF in eaglemms.com so that customers can generate invoices. I think I used to have issues similar to what you are describing so I put poolboy in front of ChromicPDF and haven’t had any issues. I also do something similar to what you are describing where a controller generates the PDF on demand versus generating and storing it in S3 or something and poolboy+ChromicPDF has no problem keeping up (at least for my current customer load).

I ended up using poolboy and it worked well during testing. Thanks!

Using a dedicated pooler is a great option.

As an alternative for others in the future, there is a blocking option in Oban called Relay, like a transparently distributed task with concurrency limits (Pro only though).

Hi @axelclark , could you please expand on this a little bit more? I haven’t worked much with hand writing pools/queues and would like to understand/learn more about what an implementation would look like if you could share.

Glad this worked for you, I’m just trying to close the loop on this in my head. Also, loop is pool in reverse, whoa :slight_smile:

Check out the Elixir School article on poolboy for an overview on poolboy:

Why use Poolboy?

Let’s think of a specific example for a moment. You are tasked to build an application for saving user profile information to the database. If you’ve created a process for every user registration, you would create an unbounded number of connections. At some point the number of those connections can exceed the capacity of your database server. Eventually your application can get timeouts and various exceptions.

The solution is to use a set of workers (processes) to limit the number of connections instead of creating a process for every user registration. Then you can easily avoid running out of your system resources.

Then for my use case, in the worker, I swapped out :math.sqrt(x) in the article for ChromicPDF.print_to_pdf/2. ChromicPDF.print_to_pdf/2 is the function that uses chromium so I want to limit and queue the processes that can access chromium at any one time.

  def handle_call({:square_root, x}, _from, state) do
    IO.puts("process #{inspect(self())} calculating square root of #{x}")
    Process.sleep(1000)
    {:reply, :math.sqrt(x), state}
  end
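
A minimal sketch of that swap, assuming a poolboy pool registered as :pdf_worker and started the same way as the pool in the Elixir School article. The module names, the pool name, and the timeout values are hypothetical and are not the code actually used in this thread.

```elixir
defmodule MyApp.PdfWorker do
  # Hypothetical poolboy worker: same shape as the article's worker,
  # with the square root replaced by a ChromicPDF call.
  use GenServer

  def start_link(_args), do: GenServer.start_link(__MODULE__, nil)

  @impl true
  def init(state), do: {:ok, state}

  @impl true
  def handle_call({:print, source, opts}, _from, state) do
    {:reply, ChromicPDF.print_to_pdf(source, opts), state}
  end
end

defmodule MyApp.Pdf do
  # Example values: how long to wait for a free worker, and how long
  # to wait for the worker's reply once one has been checked out.
  @checkout_timeout 30_000
  @call_timeout 30_000

  # Blocks the caller until the PDF is printed or one of the timeouts fires.
  def print(source, opts \\ []) do
    :poolboy.transaction(
      :pdf_worker,
      fn pid -> GenServer.call(pid, {:print, source, opts}, @call_timeout) end,
      @checkout_timeout
    )
  end
end
```

The third argument to :poolboy.transaction/3 is the time to wait for an available worker, which is separate from the GenServer.call/3 timeout inside the function.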

thanks @axelclark!

I had seen that article before, but for some reason thought it would be more involved than that. Pretty awesome that it seems to have just worked TM for you :joy:

Thanks again!

Hi, author of ChromicPDF here.

I’m currently scratching my head a bit as the accepted solution in this thread does not make sense to me (putting a worker pool in front of a worker pool), yet two people independently report good experience with it :slight_smile: - I wonder if this points to an obscure bug in Chromic, or merely to an opportunity to improve its documentation.


@axelclark Let me first address your original post. I believe the docs you cited led to some confusion, rather than being helpful. Especially this line:

Please note that this is a non-queueing session pool.

This is definitely misleading. I should have written that it is not a persistently queueing session pool. The NimblePool library used in Chromic’s SessionPool does in fact queue “worker checkout” commands (by virtue of OTP message boxes & GenServer). But its queue isn’t persistent: if you overwhelm your system, worker checkouts will time out and requests will be dropped, which leads to the exception you have posted. That’s also perfectly fine depending on your use case. If you expect only a few concurrent requests, or if you’re fine with the occasional failure when you exceed the pool concurrency, there’s nothing you need to do about it at all :slightly_smiling_face:.

In other words, what I meant to say in the docs is this: If you expect peaks in concurrent demand (= greater than max throughput of your system) and you are required to handle them gracefully, you must begin to write requests down for later and asynchronously process them (e.g. with a job queue like Oban, backed by a database table, and so on). This is actually not specific to ChromicPDF in any way, just generally job processing. And vice versa, if your scenario does not have peaks or you don’t care about them, that’s 100% cool, too, and you don’t need any of this.

I probably should just ditch that statement from the docs.


Now, with regards to putting poolboy in front of it: I’m honestly curious how that would change anything in your test setup? You should now get a poolboy exception instead of ChromicPDF’s when you overwhelm your system. Even the default checkout timeout (think, “max queue wait time”) is the same, I believe, at 5 seconds. Question goes to @akoutmos , too, of course.


Fun fact: ChromicPDF used to depend on poolboy before v0.6.0. Which is when I realized that a process pool in front of a single connection process is pointless. NimblePool eliminates the worker processes and only pools “resources”.

Here is the module in the project I used to test the different approaches:

It contains start_pdf/0, a function using Task.async/1 to call Chromic directly.

It also contains start_pdf_with_poolboy/0, a function using Task.async/1 that wraps the call to Chromic in a poolboy transaction with a worker.

The difference between the two was that I could set a timeout on the call to :poolboy.transaction to wait for an available worker. But with my call to Chromic directly, NimblePool would crash as soon as a worker was not available, see the error message in the original post.

When I updated the session pool timeout per this section of the Chromic docs, I think it increases the timeout for an individual job but not the timeout to get an available worker.

You are correct that if I exceed the timeout for either getting a worker or for the GenServer call itself (defaults to 5s), poolboy will also crash. But with the timeout for an available worker in the transaction, I was better able to handle spikes in requests.

Maybe I could get the same results if I could pass in a checkout timeout through the config_options here:

Maybe I could get the same results if I could pass in a checkout timeout through the config_options here

Hehe, I was kind of hoping you wouldn’t call me out on the missing configurability of the checkout timeout :grimacing:

Reason for this omission is that I would really like that to be a “global” option instead of a per-job option on print_to_pdf/2, as IMO that makes way more sense API-wise. So, ideally I’d like it to be part of

Unfortunately, the others are all options that I can keep around in the “pool state” of NimblePool; only the checkout timeout needs to be known by the caller. And at that point in print_to_pdf/2, I don’t currently have easy access to the global config anymore. Need to refactor a few things. Ticket is here: Make checkout timeout configurable · Issue #276 · bitcrowd/chromic_pdf · GitHub.

@axelclark pushed a PR adding the checkout_timeout option [#276] Make checkout timeout configurable by maltoe · Pull Request #278 · bitcrowd/chromic_pdf · GitHub

reviews welcome, especially wrt. the changes in the docs. Do you think it’s more clear now?

Clearly you have put effort into this, which makes me wonder why you didn’t try something else, e.g. @fredwu’s library: GitHub - fredwu/opq: Elixir queue! A simple, in-memory queue with worker pooling and rate limiting in Elixir.

Not to disparage @maltoe’s work (just adding one more option to the pile) and I love maintainers who follow up on forums! :heart: And his checkout_timeout is more directly related to what you want to accomplish whereas opq has an interval after which to periodically check if there are more workers available, which is not strictly the same thing.

I like the updates! I think both the docs and error message are more clear.

OPQ is a great suggestion! I tried it in my test project after my original post:

However, I ran into a situation similar to the one with Oban that I mentioned in my original post: OPQ.enqueue/2 returns :ok, but I want to block in my controller awaiting the result. I think I would need to implement something like Oban.Notifier to get the result back.

Note: My test outputs to a folder, but in my actual project, I want to send the file as a download.

Hmm. In this case I’d think DynamicSupervisor + a few copies of your own GenServer that you can await on (for them to send you a message saying “job done”) might be the best way to go. It’s super easy to make your own worker pool anyway, I’ve done it 3 times until I got pointed at GitHub - uwiger/jobs: Job scheduler for load regulation (which solved one of my cases; btw that’s an Erlang library but pretty easy to use from Elixir).
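
To illustrate the general idea of a small hand-rolled pool that a caller can block on, here is a simplified sketch. Note that it is not the DynamicSupervisor + worker GenServers layout suggested above: it swaps in a single GenServer acting as a counting semaphore, which is shorter to show. The BlockingLimiter module and all of its function names are hypothetical.

```elixir
defmodule BlockingLimiter do
  # Hypothetical counting semaphore: at most `size` callers run at once,
  # the rest wait in a FIFO queue inside this process.
  use GenServer

  def start_link(opts) do
    size = Keyword.fetch!(opts, :size)
    GenServer.start_link(__MODULE__, size, name: Keyword.get(opts, :name, __MODULE__))
  end

  # Runs `fun` in the calling process once a slot is free. The caller blocks
  # for up to `checkout_timeout` ms (or indefinitely with :infinity).
  def run(server, fun, checkout_timeout \\ 5_000) when is_function(fun, 0) do
    :ok = GenServer.call(server, :checkout, checkout_timeout)

    try do
      fun.()
    after
      GenServer.cast(server, {:checkin, self()})
    end
  end

  @impl true
  def init(size), do: {:ok, %{free: size, waiting: :queue.new(), holders: %{}}}

  @impl true
  def handle_call(:checkout, from, %{free: 0} = state) do
    # No slot free: park the caller; it gets its reply on a later checkin.
    {:noreply, %{state | waiting: :queue.in(from, state.waiting)}}
  end

  def handle_call(:checkout, {pid, _tag}, state) do
    {:reply, :ok, grant(state, pid)}
  end

  @impl true
  def handle_cast({:checkin, pid}, state), do: {:noreply, release(state, pid)}

  @impl true
  def handle_info({:DOWN, _ref, :process, pid, _reason}, state) do
    # A slot holder died without checking in: free its slot.
    {:noreply, release(state, pid)}
  end

  defp grant(state, pid) do
    ref = Process.monitor(pid)
    %{state | free: state.free - 1, holders: Map.put(state.holders, pid, ref)}
  end

  defp release(state, pid) do
    case Map.pop(state.holders, pid) do
      {nil, _holders} ->
        state

      {ref, holders} ->
        Process.demonitor(ref, [:flush])
        state = %{state | free: state.free + 1, holders: holders}

        # Hand the freed slot to the longest-waiting caller, if any.
        case :queue.out(state.waiting) do
          {{:value, {next_pid, _tag} = from}, waiting} ->
            GenServer.reply(from, :ok)
            grant(%{state | waiting: waiting}, next_pid)

          {:empty, _waiting} ->
            state
        end
    end
  end
end
```

A controller could then wrap its print call, for example BlockingLimiter.run(MyApp.PdfLimiter, fn -> ChromicPDF.print_to_pdf(source, opts) end, 30_000), where MyApp.PdfLimiter is an assumed registered name. One known limitation of this sketch: a caller that times out while waiting but stays alive remains in the queue, so its slot is only reclaimed when that process exits.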

I checked OPQ’s source but couldn’t quickly find a way to await on a job. And I’ve done my own mini pool a while ago that could do that.

@axelclark Thanks for reviewing! Released as v1.14.0

Btw, since :timeout and :checkout_timeout are Erlang timeout values (i.e. used in a receive ... after construct or as a GenServer timeout, respectively), you can even pass :infinity to them, effectively creating an infinite queue. The sky is the limit! (In fact, the heap / max message queue size is the limit…) Just emphasizing this as we seem to be discussing additional pools in front of ChromicPDF still.
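
As a sketch, the two options might be set like this in the ChromicPDF child spec. The option names :timeout and :checkout_timeout under :session_pool are the ones named in this thread (available from v1.14.0); the size and the values shown are arbitrary examples.

```elixir
children = [
  {ChromicPDF,
   session_pool: [
     # Example pool size; pick one that matches your resources.
     size: 5,
     # Example value for the :timeout option.
     timeout: 10_000,
     # How long a caller waits in the checkout queue for a free worker.
     # :infinity would make callers wait indefinitely instead of timing out.
     checkout_timeout: 30_000
   ]}
]
```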

I upgraded to ChromicPDF version 1.14 and updated the :session_pool :timeout and :checkout_timeout to match what I had in poolboy. ChromicPDF and NimblePool handle my concurrency test just like poolboy.

For anyone interested in the updated ChromicPDF config options, check out the session pool docs.

Thanks for the reply! Looking at the git blame for the pooling code that I wrote…it looks like I wrote it in December 2021. Not sure what version of Chromic I was using back then, but I do recall putting poolboy in front of Chromic fixing the issues that we were having when we had too many requests come in to render PDFs.

It sounds like I will no longer need this pool logic though :tada:. Thanks for the explanation and the great library!
