FLAME slow cold starts

eileennoonan · October 2, 2024, 12:47am

I’m deploying an app on Fly.io, and it’s going to involve running headless browsers similarly to World Page Speed. So naturally I’d like to use FLAME to manage the headless browser portion.

The problem is my FLAME nodes have very long cold starts - upwards of 20 seconds even when I boost the machine resources.

I am assuming that this is because I have so many dependencies in my main app - Phoenix, Nx, and Ash + a bunch of Ash extensions are probably the main contenders.

Is that assumption correct?

If so, would it make sense to just create a separate mix app for my headless browser FLAME calls? I think just about the only dependency it would need to boot would be web_driver_client and req. No Ash, no Phoenix. And I see that with FLAME.FlyBackend we can specify which docker image we want our node to spin up.
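A rough sketch of what pointing one pool at a different image might look like, assuming the :backend option of FLAME.Pool and the :image option of FLAME.FlyBackend. The pool name MyApp.BrowserRunner, the pool sizes, and the image reference are hypothetical placeholders, not values from this thread.

```elixir
# lib/my_app/application.ex (one entry of the supervision tree's children list)
# Assumption: MyApp.BrowserRunner and the image reference are made-up
# placeholders used only for illustration.
{FLAME.Pool,
 name: MyApp.BrowserRunner,
 min: 0,
 max: 5,
 max_concurrency: 2,
 # Override the image for this pool only; other FlyBackend settings
 # (such as the token) are assumed to come from the app config.
 backend: {FLAME.FlyBackend, image: "registry.fly.io/my-browser-runner:latest"}}
```

One caveat worth checking before going down this road: FLAME.call sends a closure to the runner, so the module that defines that closure most likely still has to be present in the runner image, not just the dependencies.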

Is this a normal / recommended way of using FLAME? Any tips before I head down that road? It seems like I might not even need to write any code in the headless browser runner app, just make sure the correct dependencies are loaded.

~Eileen Noonan

jswanner · October 2, 2024, 6:13am

Welcome Eileen! We met at ElixirConf, good to see you on the forum.

I haven’t used FLAME yet, so I don’t know if there are recommended ways to handle your situation, and I’m not speaking from experience. If I were in your position, I would start by modifying the MyApp.Application.start/2 callback (lib/my_app/application.ex) so that it starts a different set of children depending on whether the application is running as a FLAME runner. FLAME.Parent.get/0 can be used to tell which way the application is being started.
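A minimal sketch of that idea, modeled on the pattern in the FLAME README. The names MyApp.BrowserRunner and MyAppWeb.Endpoint and the pool sizes are assumptions for illustration; FLAME.Parent.get/0 returns nil on a regular node and a parent struct on a FLAME runner.

```elixir
defmodule MyApp.Application do
  use Application

  @impl true
  def start(_type, _args) do
    # nil on the parent app, a %FLAME.Parent{} struct on a FLAME runner.
    children = children(FLAME.Parent.get())
    Supervisor.start_link(children, strategy: :one_for_one, name: MyApp.Supervisor)
  end

  # Kept as a separate function so the branching is easy to read.
  # Hypothetical names: MyApp.BrowserRunner, MyAppWeb.Endpoint.
  def children(flame_parent) do
    [
      {FLAME.Pool, name: MyApp.BrowserRunner, min: 0, max: 5, max_concurrency: 2},
      # Only the parent node needs to serve web traffic;
      # on a runner this entry evaluates to false and is filtered out.
      !flame_parent && MyAppWeb.Endpoint
    ]
    |> Enum.filter(& &1)
  end
end
```

Anything else that the runner does not need (repos, PubSub, background jobs) could be guarded the same way.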

josevalim · October 2, 2024, 7:38am

If you are concerned that the number of dependencies is hurting boot time, maybe setting RELEASE_MODE=interactive as an env var will help by loading less code at boot? You should also consider trimming your Application.start supervision tree.
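One possible way to set that variable on the runners only is a sketch like the following. It assumes the :env option of FLAME.FlyBackend is passed through to the runner machines, and that nothing else in the image (a Dockerfile ENV line or rel/env.sh.eex) overrides RELEASE_MODE.

```elixir
# config/runtime.exs
import Config

config :flame, :backend, FLAME.FlyBackend

config :flame, FLAME.FlyBackend,
  token: System.fetch_env!("FLY_API_TOKEN"),
  # Assumption: these variables are set on each runner machine, where the
  # release boot script reads RELEASE_MODE. "interactive" loads modules on
  # demand instead of preloading everything at boot ("embedded", the default).
  env: %{"RELEASE_MODE" => "interactive"}
```

The trade-off is that code is then loaded lazily on first use, so the first call into a module may be slightly slower.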

chrismccord · October 2, 2024, 12:50pm

What is the size of your app image? What region(s) is your app located in? There’s nothing to compile or build, since the runner launches the prebuilt docker image from the parent. Are you certain the time is the cold start and not something app-specific, like the time to load your headless chromedriver process and start your sup tree? worldpagespeed cold starts are in the 5-10s range, and that includes the actual time to start chrome/chromedriver and start driving the browser. Are you consistently seeing these times, or only occasionally? Depending on machine placement, the docker image layers may need to be pulled to a new host, which adds time that grows with the size of the image, but we need to know more. Your fly logs should also report the image pull time.

You can also look into starting with a warm pool (min: whatever) and setting min_idle_shutdown_after so the pool can idle down below min when no work is needed. A warm pool keeps deploys from causing users to hit a cold pool.
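A sketch of such a pool entry, using the :min, :idle_shutdown_after, and :min_idle_shutdown_after options of FLAME.Pool. The pool name and all numbers are illustrative assumptions, not recommendations.

```elixir
# lib/my_app/application.ex (one entry of the supervision tree's children list)
# Assumption: MyApp.BrowserRunner and the values below are placeholders.
{FLAME.Pool,
 name: MyApp.BrowserRunner,
 # Boot one runner up front so the first request after a deploy is warm.
 min: 1,
 max: 10,
 max_concurrency: 5,
 # Runners above :min shut down after 30 seconds without work.
 idle_shutdown_after: :timer.seconds(30),
 # The :min runners may also shut down, after a longer idle period.
 min_idle_shutdown_after: :timer.minutes(10)}
```

Without min_idle_shutdown_after, the :min runners stay up indefinitely, which trades steady cost for consistently warm starts.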

eileennoonan · October 2, 2024, 6:42pm

Hey Jacob, nice to see you here!

I am already modifying the supervision tree for FLAME-specific nodes. Basically I’m following the pattern from the world page speed repo, borrowing its children function.

eileennoonan · October 2, 2024, 6:43pm

Thank you José, I will definitely experiment with this!

I ended up finally reading the docs on releases because of this too, so now that’s not such a scary topic.

eileennoonan · October 2, 2024, 6:44pm

Thanks Chris! I’m marking this as the solution here, but my actual reply is over in the fly.io forum. In the future I will just keep these questions to a single forum.