I recently run into a situation where my system would eventually stop logging and become very very slow. I traced all this to the mailbox of the Logger process containing around 2 million messages, which means the Logger had switched to “sync” mode and therefore all the calling code had become either extremely slow or just blocked forever.
This was using the default console back that was then handled by systemd on a CentOS VM.
I blogged about it and I proposed that the design of the Logger has severe limitations, which I think I observed. I think instead that the Logger should drop messages if the backend can’t handle them on time (and just add a synthetic message so that people can see it happened).
I’m not sure if I need to open an issue, go to the mailing list, so I start here…
That isn’t really true. In the Java world, for example, the norm is for configurable fixed-size buffers to be used to drop log messages when the production rate exceeds the rate they can be processed. Nobody wants an application that falls over because of logging.
Log messages should always be considered indicative, not canonical and created with the understanding that they are lossy. If you want transactional semantics, you need to look to another tool.
When we designed Logger, we made sure it wouldn’t run into some of such situations. There is an application that was created when Lager was first designed that reproduces such situations and Logger is resilient to it.
We borrowed a lot from Lager, such as the sync/async behaviour that you blogged about. However, the sync behaviour is only useful if the same processes are logging over and over again. I am imagining that in your case you have newly spawned processes doing the logging and it is clear now that this back-pressure does not really stop new processes from being created.
It seems Lager already has mechanisms to handle those cases and we should probably follow their footsteps once more. Would you be so kind to open up an issue? I want to work on this ASAP and include a solution on v1.6.