Idiomatic place for this code when the application boots?

Simple, I have a Phoenix application that publishes to some Kafka topics, and I’d like to log the partitions found for each topic when the application boots. What’s an idiomatic place to put such code? (The question is not posted in a Phoenix-specific channel because I believe it is valid for any Elixir application that has a supervision tree.)

Some spots where this could technically be done would be:

  1. MyApp.Application.start/2: Once the supervisor has been launched, you can log before returning {:ok, pid} to the caller. However, that hijacks a very important call and doesn’t feel quite right to me (but maybe it is considered OK?).

  2. Another option is to write a GenServer for the sole purpose of being initialized as part of the supervision tree, print, and shut down. This feels like using the wrong abstraction just for its side effects.

  3. Including a one-off Task as part of the supervision tree. Same idea as (2), but a Task seems a more concise abstraction than a GenServer, and perhaps communicates the intent better.

  4. Something else?

What do you recommend?


I’ve used both 1 and 2. Personally I prefer 2, though. A GenServer can deal with side effects in init/1 but return :ignore, so it’s never attached to the supervision tree in the first place.
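
A minimal sketch of that idea, assuming a hypothetical BootLogger module that receives a zero-arity function producing whatever should be logged (neither name comes from a real library):

```elixir
defmodule BootLogger do
  # Hypothetical module illustrating option 2.
  use GenServer, restart: :temporary
  require Logger

  def start_link(fun) when is_function(fun, 0) do
    GenServer.start_link(__MODULE__, fun)
  end

  @impl true
  def init(fun) do
    # Runs synchronously: the supervisor waits here before starting
    # the next child.
    Logger.info("boot info: #{inspect(fun.())}")

    # Returning :ignore makes start_link/1 return :ignore, so no
    # process is left running after the side effect.
    :ignore
  end
end
```

In a children list this would look like `{BootLogger, fn -> ... end}`, placed after whatever processes the function depends on.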

By now my preferred option is a Task, because it communicates better for my taste, and the code that logs the partitions is so small:

children = [
  elsa_supervisor(),
  {Task, &log_partitions/0},
  ...
]

Hi @fxn!

I’ve seen this done in MyApp.Application.start/2. We used SQS quite heavily in one app and the queues were all created there, as part of a with. Folks then started to move that pattern to other apps.

with {:ok, _} <- SQS.create_queue( ... ),
     {:ok, _} <- SQS.create_queue( ... ) do
  :ok
else
  error ->
    Logger.error("log the error here #{inspect(error)}")
end

children = [...]
opts = [...]
Supervisor.start_link(children, opts)

This has worked well, and it seems to be a pattern folks started to follow after a while. Your case may be different. In ours we want things to fail loudly right when things boot, because we need the queues ready to go and don’t want the setup to be async.

Hope this helps!

PS: Big fan of your work, super happy that you are part of the Elixir community :heart:


:wave:

I think that 2 and 3 are similar but not quite the same. 2 (with the side effect in init) blocks further supervision tree startup (until init finishes), whereas 3 is async and doesn’t block other children from starting.
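
To make the difference concrete, here is a small script-style sketch. The Agent merely stands in for any child that does its work during init (Agent.start_link/1 runs its function while initializing), and the sleeps are placeholders for slow work; none of this is taken from the original application.

```elixir
parent = self()

children = [
  # Blocking: Agent.start_link/1 runs this function during init, so the
  # supervisor waits for it before starting the next child.
  {Agent,
   fn ->
     Process.sleep(50)
     send(parent, :blocking_child_done)
     :no_state
   end},
  # Async: the Task process is spawned and the supervisor moves on at once.
  {Task,
   fn ->
     Process.sleep(200)
     send(parent, :task_done)
   end}
]

{:ok, _sup} = Supervisor.start_link(children, strategy: :one_for_one)

# Helper that checks the mailbox without waiting.
received? = fn msg ->
  receive do
    ^msg -> true
  after
    0 -> false
  end
end

# The blocking child has finished by the time start_link/2 returns,
# while the task is still running in the background.
blocking_done_at_boot? = received?.(:blocking_child_done)
task_done_at_boot? = received?.(:task_done)
```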

Is it always possible to avoid hijacking Application.start? I faced a similar issue recently, where I needed to run some code on start that I definitely did not want to block the rest of app start. It seemed possible to use handle_continue to do this with GenServer, but in my case I wanted to store some state based on the result, so I was reaching for Agent, and was surprised to find that handle_continue is not available there (since my vague impression was that Agent was a flavor of GenServer). I encountered problems having the code run anywhere as part of the tree itself because the Supervisors were not themselves started, so I ended up adding an Agent.cast call after the start_link call, and I agree with @fxn, it does not feel clean.

Its implementation is indeed a GenServer, but that doesn’t mean Agents expose that fact to their users. Agents are processes to store state, that’s it. They don’t expose a means of initializing that state in a non-blocking manner.

Why is it “hijacking” if you clearly want to start a process when your application starts? That’s exactly what the Application.start/2 callback is there for. If you don’t want to block the supervision tree startup, then a GenServer + handle_continue can do that.
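
As a rough sketch of the GenServer + handle_continue approach, which also keeps the result as state: PartitionCache and the `fetch` function are hypothetical names, with `fetch` standing in for whatever slow call is needed at boot.

```elixir
defmodule PartitionCache do
  # Hypothetical module; `fetch` is a zero-arity function supplied by the
  # caller that performs the slow lookup.
  use GenServer

  def start_link(fetch) when is_function(fetch, 0) do
    GenServer.start_link(__MODULE__, fetch, name: __MODULE__)
  end

  def get, do: GenServer.call(__MODULE__, :get)

  @impl true
  def init(fetch) do
    # Return right away so the supervisor can start the next child;
    # the slow work is deferred to handle_continue/2.
    {:ok, nil, {:continue, {:load, fetch}}}
  end

  @impl true
  def handle_continue({:load, fetch}, _state) do
    # Runs immediately after init/1, before any message in the mailbox
    # is handled, so callers of get/0 never observe the nil state.
    {:noreply, fetch.()}
  end

  @impl true
  def handle_call(:get, _from, state), do: {:reply, state, state}
end
```

The supervisor is not blocked, but a caller of get/0 simply waits until the load has finished (subject to the usual GenServer.call timeout).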

That was the word that OP used to express what I think is a valid concern about putting arbitrary code in Application.start. Perhaps it’s not arbitrary to put code I want to run on start there, but my feeling is that it is probably not good practice. I am a novice at process management with Elixir though, so maybe I am wrong, but it seems like startup code should be executed as part of the tree. Is that not why you say you prefer option 2?

So any code that should run as part of your application being started should be triggered from the Application.start/2 callback implementation. If the code can run without any started processes (of your app), this can be a plain function call. If the code depends on other processes of your application already running, then letting it be triggered by the supervision tree is one way to go. Another would be starting the supervision tree first and then calling the function in start/2. Both are equally viable. I like the supervision tree option more, because it allows the triggered code to run as soon as possible, while I can put e.g. starting the Phoenix endpoint later. Also, if the triggered code is executed synchronously, then I can be sure the endpoint is never started before the code is done running.

Depends on what you need. Sometimes 1 will be the best option, sometimes 2/3 (these are almost the same to be honest), and sometimes 4 (start_phase/3).

So as with most things - it depends.
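
For reference, a minimal sketch of the start_phase/3 route. Demo.Application, the :log_partitions phase name, and fetch_partitions/0 are all made up for illustration; the phase has to be declared in mix.exs for the callback to be invoked.

```elixir
defmodule Demo.Application do
  # Hypothetical application module. For start_phase/3 to be called,
  # mix.exs must declare the phase next to `mod:`, for example:
  #
  #   def application do
  #     [mod: {Demo.Application, []}, start_phases: [log_partitions: []]]
  #   end
  use Application
  require Logger

  @impl true
  def start(_type, _args) do
    # Real children elided; an empty tree keeps the sketch self-contained.
    children = []
    Supervisor.start_link(children, strategy: :one_for_one, name: Demo.Supervisor)
  end

  @impl true
  def start_phase(:log_partitions, _start_type, _phase_args) do
    # Called after start/2 has returned, so the whole tree is already up.
    Logger.info("partitions: #{inspect(fetch_partitions())}")
    :ok
  end

  # Placeholder for the real metadata lookup (an assumption, not a real API).
  defp fetch_partitions, do: %{"events" => 3}
end
```

Note that start phases run synchronously: the tree is up, but the application is not considered started until every phase has returned.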


TIL about start_phase/3, how perfect for this situation!

Does “finding partitions” involve a remote network call? If so, then I’d avoid running it during the blocking part of the app start, because a slow connection might affect the startup latency, and no connection might prevent the entire system from booting. Therefore, a task would be my first choice here.

Otherwise, if this information is obtained locally (say from some config file), I’d do it synchronously, e.g. in app start, or start_link or init of some singleton server process.


That’s a good point.

Conceptually, it does: that is metadata you request from the server. Technically, depending on execution order, it could be the case that brod had it cached in ETS, but that is internal anyway. Thinking about a maintainer, I would put it in a place that makes sense assuming a network call; that raises fewer questions.

Also, logging is something I want to do at some point while the application boots, but it is definitely not something that has to happen at any specific time. I believe async also makes sense from that point of view.


I’m a bit confused by this point, probably because I don’t understand what else you want to log.

Either way, I typically sprinkle log expressions close to the place where it makes sense, i.e. in the code where the related action is happening. E.g. if I want to log the url of the external service I’m connecting to, I’d do this immediately before I’m establishing a connection. In any case, I wouldn’t move all logging to async task(s), save for the previously mentioned caveat (data to be logged has to be obtained from a remote service).

If you want to see application boot messages in the log then you can do so by simply setting:

config :logger, handle_sasl_reports: true

And you will get messages fired by Erlang itself.

Hmmm, I want to log what the post says: some specific metadata from the Kafka server. That is all.

Agreed. Remember my snippet above?

children = [
  elsa_supervisor(),
  {Task, &log_partitions/0},
  ...
]

That’s logging close to the spot. The connection to Kafka and all involved processes are managed entirely by Elsa (which uses brod behind the scenes). So, right after Elsa has initialized its main supervisor, there’s code that logs Kafka metadata (using the Elsa API).

This is a one-off trace I want to do when the application boots, in case I need to verify these values and I only have application logs. This metadata can be considered static (it doesn’t change during the lifetime of the application).

@sasajuric @hauleth Guess this sentence was confusing:

Also, logging is something I want to do at some point while the application boots.

There, “logging” was meant as “the logging we are talking about in this thread: logging partitions per topic during application startup”.

Ah, gotcha. I got the impression that you might be talking about logging during startup in general, so I just wanted to make sure that no general conclusions arise from what I said earlier. Anyway, given your further clarification, I agree that starting a task would be the way to go.
