I think this is highly dependent on what the system is designed for. I also think that “let it crash” applies more to distributed systems than to regular web app flows, and it needs a concrete use case before it can be assessed.
For instance, in a small project I’m working on, I have GenServers holding state. Each one saves its initial state to the db when created, then keeps the record updated after each action by saving through casts (so the write doesn’t block the GenServer). The state is always served from the GenServer directly, because it’s set during its own initialisation. If it crashes for some reason, on restart it always tries to fetch the state from the db record, and if there is none it simply creates a new one.
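A minimal sketch of that flow, under some assumptions: `GameStore` is a hypothetical in-memory stand-in for the db layer (the real project uses Ecto), and `GameServer`, `move/2` and the shape of the state are made up for illustration.

```elixir
defmodule GameStore do
  # Hypothetical stand-in for the database: a process holding a map of records.
  use GenServer

  def start_link(_opts \\ []), do: GenServer.start_link(__MODULE__, %{}, name: __MODULE__)

  # Synchronous read, used only when a game server (re)starts.
  def fetch(id), do: GenServer.call(__MODULE__, {:fetch, id})

  # Asynchronous write: the caller does not wait for the save to finish.
  def save(id, state), do: GenServer.cast(__MODULE__, {:save, id, state})

  @impl true
  def init(records), do: {:ok, records}

  @impl true
  def handle_call({:fetch, id}, _from, records), do: {:reply, Map.fetch(records, id), records}

  @impl true
  def handle_cast({:save, id, state}, records), do: {:noreply, Map.put(records, id, state)}
end

defmodule GameServer do
  use GenServer

  def start_link(id), do: GenServer.start_link(__MODULE__, id)
  def get(pid), do: GenServer.call(pid, :get)
  def move(pid, move), do: GenServer.call(pid, {:move, move})

  @impl true
  def init(id) do
    # On every (re)start, try the stored record first; create one if missing.
    state =
      case GameStore.fetch(id) do
        {:ok, saved} ->
          saved

        :error ->
          fresh = %{id: id, moves: []}
          GameStore.save(id, fresh)
          fresh
      end

    {:ok, state}
  end

  @impl true
  def handle_call(:get, _from, state), do: {:reply, state, state}

  def handle_call({:move, move}, _from, state) do
    new_state = %{state | moves: [move | state.moves]}
    # Persist through a cast so this process is not blocked by the write.
    GameStore.save(state.id, new_state)
    {:reply, :ok, new_state}
  end
end
```

Reads never touch the store after `init/1`; the store only matters again if the game process is restarted.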
I don’t handle any of the possible Ecto errors, because I make some assumptions up to that point: no data is changed or written to the db that I haven’t validated, and since I control all the entry points (a single one in this case), I can rely on Ecto not crashing arbitrarily unless there’s a bug. The entry point performs some generic validation and then dispatches according to the request, where it’s validated further. Since the state in use is never exposed to the user for direct change, I can rely on it being stable.
Another example is an app I’m currently building that relies on several OS processes to control Chrome instances, opened as ports, along with websockets to connect to the remote dev tools of each of these Chrome instances, and, so far, two GenServers to control the spawning of new Chrome instances and keep track of sockets. I’m cleaning up the failing paths and tidying it up, and in the end I’ll just trap exits in the main GenServer that holds all the accountable state, so that if it breaks it dumps that state into a DETS table to be reused. But the thing is, this process should never crash unless there’s a power failure or something really “exceptional”. Because it is important, I have to make sure that all interactions with it are sanitised, so I’m moving the logic that can break towards the edges, where processes can fail and be restarted (if necessary fetching what they need from the state of the important GenServer).
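Roughly what I have in mind for the trap exit and dump part, as a sketch only: `BrowserTracker`, the `:browser_tracker_dump` table name and the shape of the entries are all hypothetical, and the real process tracks much more than this.

```elixir
defmodule BrowserTracker do
  use GenServer

  # Hypothetical DETS table name used for the dump.
  @table :browser_tracker_dump

  def start_link(dets_file), do: GenServer.start_link(__MODULE__, dets_file)
  def track(pid, key, info), do: GenServer.call(pid, {:track, key, info})
  def lookup(pid, key), do: GenServer.call(pid, {:lookup, key})

  @impl true
  def init(dets_file) do
    # With exits trapped, terminate/2 also runs on a supervisor shutdown,
    # and linked processes dying arrive as {:EXIT, pid, reason} messages.
    Process.flag(:trap_exit, true)
    {:ok, %{file: dets_file, entries: restore(dets_file)}}
  end

  @impl true
  def handle_call({:track, key, info}, _from, state) do
    {:reply, :ok, %{state | entries: Map.put(state.entries, key, info)}}
  end

  def handle_call({:lookup, key}, _from, state) do
    {:reply, Map.fetch(state.entries, key), state}
  end

  @impl true
  def handle_info({:EXIT, _pid, _reason}, state) do
    # A linked edge process died; real code would clean up its entries here.
    {:noreply, state}
  end

  @impl true
  def terminate(_reason, state) do
    # Dump the accountable state so the next start can reuse it.
    {:ok, table} = :dets.open_file(@table, file: String.to_charlist(state.file))
    :ok = :dets.insert(table, {:entries, state.entries})
    :dets.close(table)
  end

  defp restore(file) do
    {:ok, table} = :dets.open_file(@table, file: String.to_charlist(file))

    entries =
      case :dets.lookup(table, :entries) do
        [{:entries, saved}] -> saved
        [] -> %{}
      end

    :dets.close(table)
    entries
  end
end
```

Note that `terminate/2` is not called on a brutal kill or a power failure, so this only covers the cases where the process gets a chance to shut down. That is one more reason to keep the risky logic in the edge processes.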
The two will have completely different strategies, since in one I have a DB and in the other I don’t. In one I deal with state that only pertains to a single game, so it can be restarted without a problem, whereas in the other the GenServer holds a whole bunch of linked information (OS PIDs, process PIDs, socket PIDs, URLs, etc.) for many different processes, and I can’t just “let it crash”. So I’m trying to move stuff to the edges and make sure that all interactions are sane whenever another process tries to talk to this one.
Having said this, I would also like to see some more guides on handling errors and exceptions with useful patterns.
Regarding your particular question and comments, I think you need to bubble the error up, but in the form of `{:error, your_own_error_struct}`, so that you have only two possible outcomes for the case: either `{:ok, something}` or `{:error, something}`. I also think that you should separate things into layers. For instance, a missing password on a submission shouldn’t even reach your function logic; it should be taken care of, perhaps through function pattern matching on the args, before it has any opportunity to cascade into the inner workings of your processes. If you really have 50 errors that can come out of any given flow, then perhaps write a module that works as an error interface, from which you can call a function like `error_explanation(:some_error)` to return specific error messages or data pertaining to that error.
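To make that concrete, here is a small sketch. Everything in it is hypothetical (the `AppError` struct, the `Accounts.login/1` function, the error codes and the hard-coded password check); it only illustrates the shape: a boundary clause that pattern matches on the args, a single `{:ok, _}` or `{:error, %AppError{}}` result, and one module that knows how to explain each error.

```elixir
defmodule AppError do
  # Hypothetical error struct carried inside every {:error, _} tuple.
  defstruct [:code, :message]

  def new(code), do: %__MODULE__{code: code, message: error_explanation(code)}

  # The "error interface": one place that maps error codes to messages.
  def error_explanation(:missing_password), do: "A password is required."
  def error_explanation(:invalid_credentials), do: "Email or password is incorrect."
  def error_explanation(:invalid_params), do: "The request is malformed."
  def error_explanation(_unknown), do: "Something went wrong."
end

defmodule Accounts do
  # Only well-formed input reaches the inner logic.
  def login(%{"email" => email, "password" => password})
      when is_binary(email) and is_binary(password) and password != "" do
    authenticate(email, password)
  end

  # An email without a usable password is rejected at the boundary.
  def login(%{"email" => _email}), do: {:error, AppError.new(:missing_password)}

  # Anything else is rejected as malformed.
  def login(_params), do: {:error, AppError.new(:invalid_params)}

  # Placeholder check for illustration only; not how real auth should work.
  defp authenticate(email, "secret"), do: {:ok, %{email: email}}
  defp authenticate(_email, _password), do: {:error, AppError.new(:invalid_credentials)}
end
```

The caller then only needs a `case` with two branches, `{:ok, user}` and `{:error, %AppError{message: message}}`, and decides how to present `message` for its own output format.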
But again, I think that “let it crash” is more about system engineering: moving “may-break” parts towards the edges, where a failure of a process doesn’t take down whatever is important to keep up, along with setting up the relevant links, supervisors, and restart/trap/init logic. When it comes to error translation for the user, you really have no option other than to write something that translates that error, because I don’t think we’re anywhere near a language that can infer what it should output for a user. And mostly because it’s highly subjective: if you’re dealing with API calls, the errors will be of a certain type and structured in a certain way (and different again depending on whether you’re outputting JSON, XML, or plain text); if you’re dealing with form submission, the error will need a different structure to signal which fields are wrong, and so on.