Scheduler utilization and process death (howto transparency)

I’m writing a kind of Dwarf Fortress-like game, so of course there are processes everywhere.

What I’ve noticed via the :observer tool is that when scheduler utilization reaches 100%, a bunch of processes die a fiery death.

Part of this is my fault, of course: the more processes there are, the more messages eventually get sent (a mob leaving a room generates a notification that gets sent to every other mob in the room), and I have a very limited number of rooms at present.
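To make the growth concrete, here’s a rough sketch of that fan-out (module and function names are made up for illustration): if every mob in a room of size n leaves once, each departure notifies the n - 1 others, so message volume grows quadratically with room population.

```elixir
defmodule FanOut do
  # Hypothetical illustration: messages generated when every mob in a
  # room of n mobs leaves once, each departure notifying the n - 1 others.
  def messages_per_cycle(n), do: n * (n - 1)
end

FanOut.messages_per_cycle(10)   # => 90
FanOut.messages_per_cycle(100)  # => 9900
```

A tenfold increase in mobs per room means roughly a hundredfold increase in notifications, which is why utilization climbs so fast.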

Still, this is a problem. I’d like to tell my system to stop what it’s doing (stop reproducing!) before it kills everything – very much like an ecosystem in balance, where a species doesn’t grow so much that it destroys its own environment. How can I get some transparency into the system so I know when it’s approaching a “danger zone”?

[ Edit - so far, got a recommendation from Slack for http://erlang.org/doc/man/erlang.html#statistics_scheduler_wall_time and for https://github.com/ferd/recon/ ]
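Following the `scheduler_wall_time` recommendation above, here’s a minimal sketch of how utilization could be sampled from Elixir. It follows the pattern in the linked Erlang docs: enable the flag, take two snapshots an interval apart, and divide accumulated active time by total time. The module name, `@danger_zone` threshold, and interval are my own placeholders, not anything from the docs.

```elixir
defmodule SchedulerWatch do
  # Arbitrary threshold for illustration; tune for your own system.
  @danger_zone 0.8

  # Returns overall scheduler utilization (0.0..1.0) over the interval.
  def utilization(interval_ms \\ 1000) do
    # scheduler_wall_time stats are only collected when this flag is on.
    :erlang.system_flag(:scheduler_wall_time, true)

    # Each snapshot is a list of {scheduler_id, active_time, total_time};
    # sort so the two snapshots line up scheduler-by-scheduler.
    t0 = :lists.sort(:erlang.statistics(:scheduler_wall_time))
    Process.sleep(interval_ms)
    t1 = :lists.sort(:erlang.statistics(:scheduler_wall_time))

    {active, total} =
      Enum.zip(t0, t1)
      |> Enum.reduce({0, 0}, fn {{_id, a0, tot0}, {_id2, a1, tot1}}, {a, t} ->
        {a + (a1 - a0), t + (tot1 - tot0)}
      end)

    active / total
  end

  def danger_zone?(interval_ms \\ 1000) do
    utilization(interval_ms) > @danger_zone
  end
end
```

A supervisor-level process could poll `danger_zone?/1` periodically and flip a “stop reproducing” flag for the mobs. Note that recon (also linked above) wraps this same measurement as `:recon.scheduler_usage(1000)`, which returns per-scheduler usage and saves you the bookkeeping.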