We’re in the process of adding additional error logging for unhandled exceptions in a Phoenix application. One of the issues we’re experiencing is that because the unhandled exception is crashing the request process, we sometimes lose unflushed Logger messages.
To solve this, I’d like to use Logger.flush/0. However, the docs say:
> This is useful for testing and it should not be called in production code.
What exactly is the downside of calling this in production code?
We really want our logs to be flushed in these error cases, and we’re okay with the process blocking for a bit until that happens. But is there some other downside of this function that would make this undesirable to call in our case?
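To make the intent concrete, here is a hedged sketch of the kind of wrapper we have in mind (the module and function names are hypothetical, not from our actual code): run the work, and on an unhandled exception log it, flush Logger so buffered messages reach the backend, then re-raise so normal crash handling still happens.

```elixir
defmodule MyApp.CrashLogger do
  # Hypothetical sketch: log an unhandled exception and block on
  # Logger.flush/0 before re-raising, so buffered log messages are
  # not lost when the request process dies.
  require Logger

  def with_flush_on_crash(fun) when is_function(fun, 0) do
    fun.()
  rescue
    exception ->
      Logger.error(Exception.format(:error, exception, __STACKTRACE__))
      # Block until pending Logger messages have been written out.
      Logger.flush()
      reraise exception, __STACKTRACE__
  end
end
```

The happy path pays nothing; the flush only happens on the rare crash path.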
I tried crashing a process by raising an exception right after logging, and it works as expected.
I don’t think your test is valid. The docs say that Logger alternates between sync and async modes to remain performant.
There is a very large difference between spinning up one process and logging one message in an IEx shell, and a production Phoenix server handling hundreds to thousands of requests per second, each of which potentially creates dozens of log messages. In our case it seems likely that Logger is running in async mode, while in your IEx session it’s probably running synchronously.
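For reference, in Elixir versions up to 1.14 the sync/async switchover is governed by Logger's application configuration. A sketch (the values shown are the documented defaults as I recall them; verify against your version's Logger docs):

```elixir
# config/config.exs -- Logger mode thresholds in Elixir <= 1.14
config :logger,
  # Switch from async to sync mode once the Logger message queue
  # exceeds this many entries.
  sync_threshold: 20,
  # Start discarding messages once the queue exceeds this length.
  discard_threshold: 500
```

So whether a given log call is sync or async depends on queue depth at that moment, which is exactly why a one-off IEx experiment and a loaded production node can behave differently.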
Logger.flush/0 seems to be a no-op by default
You seem to be making assumptions about which logging backend we’re using. We are not logging only to stdout, as you assumed; we’re logging to a file.
I guess I should specify also that we’re using Elixir 1.14.5 and OTP 24. FWIW, here’s the Logger.flush implementation for Elixir 1.14. I am very skeptical that Logger.flush/0 is a no-op.
What does “may” mean? In what situations? How much and what kind of performance degradation?
If it’s just a bit of slowdown in the process calling it, that’s probably acceptable. I’m only talking about using this in an unhandled exception logger, which is less than 1% of total requests.
In general it is pointless to use in production.
What about in my specific scenario, i.e. ensuring logs are flushed before a process crashes?
If your process crashes, the crash will be logged anyway; Logger.flush won’t make any difference. The exception is when there are too many log entries and the logger discards some, but Logger.flush won’t change that either.
The reason it is not encouraged is that it tells all loggers to stop what they are doing and write to disk. If you call it several times, they will stop and write to disk every single time, even when they believe batching and waiting is the efficient thing to do.
As a generally huge fan of Elixir documentation (if that’s not obvious from my comments here over the years), OP’s sentiment is one I have felt quite a bit. I don’t have any concrete examples on hand, but I have often read similar warnings in the docs and really wanted to know the “why.” Is keeping this kind of thing out of the docs intentional (perhaps to keep them lean), or would PRs generally be accepted to clarify these things?
Of course, we can always improve the docs and PRs are welcome. If this is happening quite a bit, then we are not being told about those cases, because they would certainly have been addressed otherwise.
Awesome, thanks! I honestly wasn’t sure if it was intentional or not, as documentation bloat can be a very real problem and many people are happy to read “don’t do this” and take it at face value. I will try to be more diligent about opening PRs/issues.
Looking at the code, the function will only do something if the handler exposes a filesync/1 callback, which isn’t part of any behaviour to my knowledge. And if it does, the performance implication will depend on the exact handler, which in the majority of cases will be LoggerBackends, but can be something else as well. Perhaps the right place to put the bulk of the documentation is the LoggerBackends docs?
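That check is easy to reproduce against Erlang's `:logger` API. A hedged sketch (this mirrors the behaviour described above; verify against your Elixir version's actual Logger.flush source):

```elixir
# List the installed :logger handlers and report which ones export
# filesync/1 -- i.e. which handlers a flush would actually ask to
# sync to disk. On a node with only the default handler, this list
# may well be empty.
flushable =
  for %{id: id, module: module} <- :logger.get_handler_config(),
      Code.ensure_loaded?(module),
      function_exported?(module, :filesync, 1),
      do: id

IO.inspect(flushable, label: "handlers with filesync/1")
```

If your file backend's handler module doesn't export `filesync/1`, calling Logger.flush/0 would indeed do nothing for it, which would explain the "no-op by default" observation upthread.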