System.stacktrace returning a GenServer call tuple

I’m currently experiencing an issue where the stacktrace being returned by System.stacktrace/0 isn’t a stacktrace at all but a GenServer call:

{:gen_server, :call, [MyApp.Redis.Pool, {:checkout, #Reference<0.0.2.97>, true}, 5000]}

I know that the underlying reason the error is getting thrown is that the GenServer hasn’t been started yet in the supervision tree. (Per https://github.com/edgurgel/verk/issues/58)

What is confusing me is why the stacktrace would yield the GenServer call instead of an actual stacktrace. I know that System.stacktrace/0 gets inlined to :erlang.get_stacktrace/0. That being said, the format that :erlang.get_stacktrace/0 is supposed to return is very specific: a list of tuples, where each tuple is a stack_item.

Is this an expected behavior by the VM when the GenServer hasn’t been started that I need to account for?

3 Likes

Can you provide a way to reproduce this? I would expect System.stacktrace to always return a list. If it does not, it is surely a bug.

1 Like

I’m working to isolate the issue in a reproducible way. At the moment, I only have firm reproduction steps for my company’s codebase, which isn’t that helpful for a bug report. The strange thing is that SASL seems to have a full stacktrace, since this is the output produced to the console:

[error] GenServer #PID<0.395.0> terminating
** (stop) exited in: :gen_server.call(MyApp.Redis.Pool, {:checkout, #Reference<0.0.2.176>, true}, 5000)
    ** (EXIT) no process
    :erlang.send(MyApp.Redis.Pool, {:"$gen_cast", {:cancel_waiting, #Reference<0.0.2.176>}}, [:noconnect])
    (stdlib) gen_server.erl:416: :gen_server.do_send/2
    (stdlib) gen_server.erl:232: :gen_server.do_cast/2
    (poolboy) src/poolboy.erl:58: :poolboy.checkout/3
    (poolboy) src/poolboy.erl:74: :poolboy.transaction/3
    (my_app) lib/my_app/redis/cache.ex:76: MyApp.Redis.Cache.delete/1
    (verk) lib/verk/worker.ex:27: Verk.Worker.handle_cast/2
    (stdlib) gen_server.erl:615: :gen_server.try_dispatch/4
Last message: {:"$gen_cast", {:perform, "MyApp.Workers.RedisCacheInvalidationWorker", [1341],     "6d3d459df292e6e1611dc286", #PID<0.396.0>}}
State: nil

I’m going to see how far I can isolate it. Based on the trace above and tracing the failure through erlyberly, I’ve been trying to see if I can trigger it in isolation by sending messages to non-existing processes.

Quite honestly, if I didn’t see it happening with my own eyes, I would think it impossible. :sweat: But the third party library we use for this codebase is definitely just calling System.stacktrace/0 and then crashing when it passes the value to Exception.format_stacktrace/1:

[error] Manager terminating, reason: {{%Protocol.UndefinedError{description: nil, protocol: Enumerable, value: {:gen_server, :call, [MyApp.Redis.Pool, {:checkout, #Reference<0.0.1.138>, true}, 5000]}}, [{Enumerable, :impl_for!, 1, [file: 'lib/enum.ex', line: 1]}, {Enumerable, :reduce, 3, [file: 'lib/enum.ex', line: 116]}, {Enum, :reduce, 3, [file: 'lib/enum.ex', line: 1486]}, {Enum, :map_join, 3, [file: 'lib/enum.ex', line: 1118]}, {Exception, :format_stacktrace, 1, [file: 'lib/exception.ex', line: 388]}, {Verk.QueueManager, :build_retry_job, 5, [file: 'lib/verk/queue_manager.ex', line: 135]}, {Verk.QueueManager, :handle_call, 3, [file: 'lib/verk/queue_manager.ex', line: 117]}, {:gen_server, :try_handle_call, 4, [file: 'gen_server.erl', line: 629]}]}, {GenServer, :call, [:"elixir.queue_manager", {:retry, %Verk.Job{args: [1341], class: "MyApp.Workers.RedisCacheInvalidationWorker", enqueued_at: 1462511589.706, error_backtrace: nil, error_message: nil, failed_at: nil, finished_at: nil, jid: "58592642725653573b8e8fdb", original_json: "{\"queue\":\"elixir\",\"class\":\"MyApp.Workers.RedisCacheInvalidationWorker\",\"args\":[1341],\"retry\":true,\"jid\":\"58592642725653573b8e8fdb\",\"created_at\":1462511589.705,\"enqueued_at\":1462511589.706}", queue: "elixir", retried_at: nil, retry_count: 0}, 1462511592, %RuntimeError{message: ":noproc"}, {:gen_server, :call, [MyApp.Redis.Pool, {:checkout, #Reference<0.0.1.138>, true}, 5000]}}, 5000]}}

All that is to say…working on something clearer cut that can be reproduced.

Have you tried the simple test of adding a printout to see what value is actually being returned from the System.stacktrace call? And extending that to explicitly follow the stacktrace by printing it out from where it was created to where you try to format it? That SASL prints it correctly implies that the value is initially correct.

That this is a GenServer has no effect on the VM. The VM doesn’t actually know about GenServers, from its POV there are just processes.

1 Like