Design: how to monitor a "deferred task"?

I’m trying to determine the best way for a caller to monitor work that will be done at some point in the future, so that it can react to crashes (by resubmitting the work, or whatever else it decides should be done).

I’m writing a service that will rate-limit requests made to a remote host according to various criteria.

When a caller asks the service to execute a request, it will immediately receive an AllocatedRequest struct, which is very similar to the Task struct. When the request has been executed, the caller will receive a {ref, result} message, where ref is the same reference as the one contained in the AllocatedRequest. Basically, it works similarly to tasks (and uses them under the hood).

Requests can be queued within the service (i.e. performed much later than when the caller submitted them). The service periodically dequeues a request, starts a task for it, then forwards the result to the caller, tagged with the ref from the AllocatedRequest already sent to the caller.
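To make the flow concrete, here is a minimal sketch of the design as described so far. It is not my actual code: the module name RateLimiter.Partition, the owner field, the 50 ms tick and the use of Task.async (rather than a supervised task) are all simplifications assumed for illustration.

```elixir
defmodule AllocatedRequest do
  # Hypothetical shape: the real struct may carry more fields.
  defstruct [:ref, :owner]
end

defmodule RateLimiter.Partition do
  # Hypothetical partition: queues requests and runs one per tick.
  use GenServer

  @tick_interval 50

  def start_link(opts \\ []), do: GenServer.start_link(__MODULE__, :ok, opts)

  # Returns immediately; the result arrives later as a {ref, result} message.
  def request(server, fun) when is_function(fun, 0) do
    allocated = %AllocatedRequest{ref: make_ref(), owner: self()}
    GenServer.cast(server, {:enqueue, allocated, fun})
    allocated
  end

  @impl true
  def init(:ok) do
    schedule_tick()
    {:ok, %{queue: :queue.new(), running: %{}}}
  end

  @impl true
  def handle_cast({:enqueue, allocated, fun}, state) do
    {:noreply, %{state | queue: :queue.in({allocated, fun}, state.queue)}}
  end

  @impl true
  def handle_info(:tick, state) do
    schedule_tick()

    case :queue.out(state.queue) do
      {{:value, {allocated, fun}}, queue} ->
        # Task.async links the task to this GenServer; a supervised,
        # unlinked task would be the more robust choice.
        task = Task.async(fun)
        running = Map.put(state.running, task.ref, allocated)
        {:noreply, %{state | queue: queue, running: running}}

      {:empty, _queue} ->
        {:noreply, state}
    end
  end

  # Reply from a finished task: forward it, tagged with the caller's ref.
  def handle_info({task_ref, result}, state) when is_reference(task_ref) do
    case Map.pop(state.running, task_ref) do
      {nil, _running} ->
        {:noreply, state}

      {allocated, running} ->
        Process.demonitor(task_ref, [:flush])
        send(allocated.owner, {allocated.ref, result})
        {:noreply, %{state | running: running}}
    end
  end

  def handle_info(_other, state), do: {:noreply, state}

  defp schedule_tick, do: Process.send_after(self(), :tick, @tick_interval)
end
```

The point to notice is that the ref handed to the caller comes from make_ref and is unrelated to the task's own ref, so if this process dies while requests are still in its queue, the caller has nothing to observe.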

In my current implementation, the ref used in the AllocatedRequest simply comes from make_ref (i.e. it doesn’t correspond to a monitor ref like Task uses). It would be nice for the caller to be able to monitor an allocated reference and know if it will never get a result (due to crashes). What’s the best way to achieve this?

If the task were triggered immediately, I could forward its ref, but that’s not the case. The requests get routed to GenServer partitions, and I basically want to properly handle the case where a partition’s GenServer crashes. I’ve implemented a terminate/2 callback, but that’s not guaranteed to run. If the GenServer goes down, it loses all of its queued requests, and I’d like the caller to be able to know about that…
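One direction I’m considering, sketched below under assumptions: have the caller monitor the partition process itself and use that monitor ref as the request ref, much like Task does with its own process. The MonitoredRequest module, the {:enqueue, from, ref, fun} message shape and the {:partition_down, reason} error tuple are hypothetical names for illustration, and this assumes the caller can get hold of the partition pid.

```elixir
defmodule MonitoredRequest do
  # Hypothetical caller-side handle; the ref doubles as a monitor on the partition.
  defstruct [:ref, :server]

  # Monitor the partition first, then hand it the work tagged with that ref.
  def submit(server, fun) when is_pid(server) and is_function(fun, 0) do
    ref = Process.monitor(server)
    send(server, {:enqueue, self(), ref, fun})
    %__MODULE__{ref: ref, server: server}
  end

  # Wait for either the tagged result or the partition going down.
  def await(%__MODULE__{ref: ref}, timeout \\ 5_000) do
    receive do
      {^ref, result} ->
        # The result arrived, so the monitor is no longer needed.
        Process.demonitor(ref, [:flush])
        {:ok, result}

      {:DOWN, ^ref, :process, _pid, reason} ->
        # The partition died before replying: the queued request is lost.
        {:error, {:partition_down, reason}}
    after
      timeout ->
        Process.demonitor(ref, [:flush])
        {:error, :timeout}
    end
  end
end
```

With this, a caller that gets {:error, {:partition_down, reason}} knows the result will never arrive and can resubmit. The cost is one monitor per outstanding request, and it only covers the partition dying, not a task crashing while the partition survives, so I’m not sure it’s the best approach.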