GenServer terminating without a helpful stack trace

A memorable morning today, as I got my first serious Phoenix error in production. To be honest, the error is quite horrible, because it is a crash in my “Game Master” GenServer, which monitors all game sessions.

The problem is, the error I got from the logs wasn’t as useful as I’d hoped. The stack trace specifically doesn’t make sense to me, which is the first time I can’t pinpoint a problem in my Elixir code from the logs. I understand it’s a Poison encode error, which is super clear, but … where? How did we get here? Why did the game_master terminate?

(my best guess would be a truncated stack trace, but I don’t recall ever running into that… if so, how do I increase the limit?)

Any insights are greatly appreciated:

22:58:31.492 [error] Game result error: Repo update failure: [paid_coins: {"must be greater than or equal to %{number}",
  [validation: :number, number: 0]},
 paid_coins: {"must be greater than or equal to %{number}",
  [validation: :number, number: 0]}]
22:58:31.496 [error] GenServer :game_master terminating
** (Poison.EncodeError) unable to encode value: {:ended, :white_player, :normal}
    (poison) lib/poison/encoder.ex:354: Poison.Encoder.Any.encode/2
    (poison) lib/poison/encoder.ex:213: anonymous fn/4 in Poison.Encoder.Map.encode/3
    (poison) lib/poison/encoder.ex:214: Poison.Encoder.Map."-encode/3-lists^foldl/2-0-"/3
    (poison) lib/poison/encoder.ex:214: Poison.Encoder.Map.encode/3
    (poison) lib/poison/encoder.ex:213: anonymous fn/4 in Poison.Encoder.Map.encode/3
    (poison) lib/poison/encoder.ex:214: Poison.Encoder.Map."-encode/3-lists^foldl/2-0-"/3
    (poison) lib/poison/encoder.ex:214: Poison.Encoder.Map.encode/3
    (poison) lib/poison/encoder.ex:213: anonymous fn/4 in Poison.Encoder.Map.encode/3
Last message: {:DOWN, #Reference<0.0.130809857.143582>, :process, #PID<0.22428.17>, {:shutdown, %{...}}}
State: %{...}}
1 Like

The VM only keep the last 8 or 10 stacktrace entries, but you can increase the value. Phoenix has a configuration you can set. Look at your dev/test config files, you likely have an example there. But you should be able to reconstruct why that is happening. You know the last message you have received is a DOWN. If you are calling anything that renders JSON, such as a Phoenix view or broadcasting to a Phoenix endpoint, in handle_info/2, that will be the source. Somewhere in your data you have a tuple like {:ended, :white_player, :normal} which you cannot correctly encode to JSON.

5 Likes

Thanks for the reply always, Jose. I had actually written a test with the payload, but it passed, which made me a bit frustrated… But the majority of this issue is my fault, of course.

Turns out, after tackling the issue with a clear head, the problem manifests itself within Sentry. Obviously, I use a mock instance during tests, so I don’t really test what happens if I send sentry an object that isn’t JSON encodable. I’ll raise an issue with them and see if it makes sense not to crash because of this.

1 Like

… Actually, I do test talking with Sentry, just that the @client within sentry is set to nil during testing. This means the Sentry Event object is created, but is never encoded nor sent out during the tests.

1 Like

Well, the event should be at least encoded during tests, that’s for sure. That’s why tools like bypass are good for integration tests.

3 Likes