So im in the process of migrating an app to GenStage. i have three major systems that interact and hopefully i can generalize it enough to get a good idea of what is going on…
+`Farmbot.Supervisor` <- main entry point
|
+------- `Farmbot.BotState.Supervisor`
| |
| + `Farmbot.BotState` - GenStage producer consumer. subscribe_to: Farmbot.Firmware
| |
| + `Farmbot.BotState.Transport.Supervisor`
| |
| +--- `Farmbot.BotState.Transport.GenMQTT` - ` - genstage consumer subscribes to Farmbot.BotState
| |
| +---- `Farmbot.BotState.Transport.HTTP` - genstage consumer subscribes to `Farmbot.BotState`
|
|
+-------`Farmbot.Firmware.Supervisor`
|
+ `Farmbot.Firmware` - GenStage producer-consumer `subscribe_to: Farmbot.Firmware.Handler`
|
+ `Farmbot.Firmware.Handler` - handles UART communication - GenStage producer
so Farmbot.BotState
consumes events from theFarmbot.Firmware
which consumes events that get sent over Nerves.UART
and sends those events over various transports (MQTT and HTTP) All of these use dispatcher: GenStage.BroadcastDispatcher
.
and any transport may receive commands that command the Farmbot.Firmware
to do something.
What happens is when a command message comes in over a transport. so for example a transport receives a message to move to a coordinate (0, 1, 2), and that fails for some reason, and Farmbot.Firmware.Handler
crashes. this crashes Farmbot.Firmware
which crashes Farmbot.BotState
which crashes all of the transports. That is all fine, but the app doesn’t ever come back up, In the logs i see it flail around for number of resets, and i can see all the init
functions successfully complete (via log messages), but the app still crashes. but some random things are still left running but i can’t see them in Observer…
heres a log that may or may not help.
iex(3)>
Elixir.Farmbot.BotState.Transport.GenMQTT.Client received %{args: %{label: "4c0723a7-e08b-447b-864a-81eb913491ee"}, body: [%{args: %{speed: 100, x: 1, y: 0, z: 0}, body: [], comment: nil, kind: "move_relative"}], comment: nil, kind: "rpc_request"}
Elixir.Farmbot.CeleryScript doing: #RpcRequest<[:label]>
Elixir.Farmbot.CeleryScript doing: #MoveRelative<[:speed, :x, :y, :z]>
*lixir.Farmbot.Firmware.UartHandler Got unhandled gcode: R08 *G00 X1 Y0 Z0 A100 B100 C100
Elixir.Farmbot.Firmware.UartHandler Got unhandled gcode: 1.00 Y0.00 Z0.00 Q0
Elixir.Farmbot.BotState.Transport.GenMQTT.Client Connected to real time services.
Elixir.Farmbot.BotState.Transport.GenMQTT.Client Connected to real time services.
18:38:27.594 [error] GenServer Farmbot.Firmware terminating
** (stop) exited in: GenServer.call(#PID<0.636.0>, {:move_absolute, %Farmbot.Firmware.Vec3{x: 1, y: 0, z: 0}, 100}, 5000)
** (EXIT) time out
(elixir) lib/gen_server.ex:774: GenServer.call/3
(farmbot) lib/farmbot/firmware/firmware.ex:96: Farmbot.Firmware.handle_call/3
(gen_stage) lib/gen_stage.ex:1958: GenStage.handle_call/3
(stdlib) gen_server.erl:636: :gen_server.try_handle_call/4
(stdlib) gen_server.erl:665: :gen_server.handle_msg/6
(stdlib) proc_lib.erl:247: :proc_lib.init_p_do_apply/3
Last message (from #PID<0.807.0>): {:move_absolute, [%Farmbot.Firmware.Vec3{x: 1, y: 0, z: 0}, 100]}
State: %Farmbot.Firmware.State{handler: #PID<0.636.0>, handler_mod: Farmbot.Firmware.UartHandler, idle: true, pins: %{}}
Client #PID<0.807.0> is dead
18:38:27.594 [info] GenStage consumer Farmbot.BotState is stopping after receiving cancel from producer #PID<0.635.0> with reason: {:timeout,
{GenServer, :call,
[#PID<0.636.0>,
{:move_absolute, %Farmbot.Firmware.Vec3{x: 1, y: 0, z: 0}, 100}, 5000]}}
Elixir.Farmbot.BotState.Transport.GenMQTT.Client Connected to real time services.
18:38:27.603 [error] GenServer Farmbot.BotState terminating
** (stop) exited in: GenServer.call(#PID<0.636.0>, {:move_absolute, %Farmbot.Firmware.Vec3{x: 1, y: 0, z: 0}, 100}, 5000)
** (EXIT) time out
Last message: {:DOWN, #Reference<0.2682793687.2540175366.49774>, :process, #PID<0.635.0>, {:timeout, {GenServer, :call, [#PID<0.636.0>, {:move_absolute, %Farmbot.Firmware.Vec3{x: 1, y: 0, z: 0}, 100}, 5000]}}}
State: %Farmbot.BotState{configuration: %{"auto_sync" => false, "current_repo" => "A", "firmware_hardware" => nil, "first_boot" => false, "first_party_farmware" => true, "first_party_farmware_url" => "https://raw.githubusercontent.com/FarmBot-Labs/farmware_manifests/master/manifest.json", "first_sync" => false, "ignore_external_logs" => true, "os_auto_update" => false, "timezone" => nil}, informational_settings: %{busy: false, commit: "4e20709", controller_version: "6.0.1-alpha.1", env: :dev, sync_status: :sync_now, target: "host"}, jobs: %{}, location_data: %{position: %{x: 0, y: 0, z: 0}, raw_encoders: %{x: 0, y: 0, z: 0}, scaled_encoders: %{x: 0.0, y: 0.0, z: 0.0}}, mcu_params: %{}, pins: %{}, process_info: %{}, user_env: %{}}
18:38:27.604 [info] GenStage consumer Farmbot.BotState.Transport.HTTP is stopping after receiving cancel from producer #PID<0.639.0> with reason: {:timeout,
{GenServer, :call,
[#PID<0.636.0>,
{:move_absolute, %Farmbot.Firmware.Vec3{x: 1, y: 0, z: 0}, 100}, 5000]}}
18:38:27.604 [info] GenStage consumer Farmbot.BotState.Transport.GenMQTT is stopping after receiving cancel from producer #PID<0.639.0> with reason: {:timeout,
{GenServer, :call,
[#PID<0.636.0>,
{:move_absolute, %Farmbot.Firmware.Vec3{x: 1, y: 0, z: 0}, 100}, 5000]}}
18:38:27.604 [error] GenServer Farmbot.BotState.Transport.GenMQTT terminating
** (stop) exited in: GenServer.call(#PID<0.636.0>, {:move_absolute, %Farmbot.Firmware.Vec3{x: 1, y: 0, z: 0}, 100}, 5000)
** (EXIT) time out
Last message: {:DOWN, #Reference<0.2682793687.2540175366.49824>, :process, #PID<0.639.0>, {:timeout, {GenServer, :call, [#PID<0.636.0>, {:move_absolute, %Farmbot.Firmware.Vec3{x: 1, y: 0, z: 0}, 100}, 5000]}}}
State: {%{client: #PID<0.642.0>}, %Farmbot.BotState{configuration: %{"auto_sync" => false, "current_repo" => "A", "firmware_hardware" => nil, "first_boot" => false, "first_party_farmware" => true, "first_party_farmware_url" => "https://raw.githubusercontent.com/FarmBot-Labs/farmware_manifests/master/manifest.json", "first_sync" => false, "ignore_external_logs" => true, "os_auto_update" => false, "timezone" => nil}, informational_settings: %{busy: false, commit: "4e20709", controller_version: "6.0.1-alpha.1", env: :dev, sync_status: :sync_now, target: "host"}, jobs: %{}, location_data: %{position: %{x: 0, y: 0, z: 0}, raw_encoders: %{x: 0, y: 0, z: 0}, scaled_encoders: %{x: 0.0, y: 0.0, z: 0.0}}, mcu_params: %{}, pins: %{}, process_info: %{}, user_env: %{}}}
18:38:27.605 [error] GenServer Farmbot.BotState.Transport.HTTP terminating
** (stop) exited in: GenServer.call(#PID<0.636.0>, {:move_absolute, %Farmbot.Firmware.Vec3{x: 1, y: 0, z: 0}, 100}, 5000)
** (EXIT) time out
Last message: {:DOWN, #Reference<0.2682793687.2540175366.49999>, :process, #PID<0.639.0>, {:timeout, {GenServer, :call, [#PID<0.636.0>, {:move_absolute, %Farmbot.Firmware.Vec3{x: 1, y: 0, z: 0}, 100}, 5000]}}}
State: %{bot_state: %Farmbot.BotState{configuration: %{"auto_sync" => false, "current_repo" => "A", "firmware_hardware" => nil, "first_boot" => false, "first_party_farmware" => true, "first_party_farmware_url" => "https://raw.githubusercontent.com/FarmBot-Labs/farmware_manifests/master/manifest.json", "first_sync" => false, "ignore_external_logs" => true, "os_auto_update" => false, "timezone" => nil}, informational_settings: %{busy: false, commit: "4e20709", controller_version: "6.0.1-alpha.1", env: :dev, sync_status: :sync_now, target: "host"}, jobs: %{}, location_data: %{position: %{x: 0, y: 0, z: 0}, raw_encoders: %{x: 0, y: 0, z: 0}, scaled_encoders: %{x: 0.0, y: 0.0, z: 0.0}}, mcu_params: %{}, pins: %{}, process_info: %{}, user_env: %{}}, public_key: %JOSE.JWK{fields: %{}, keys: :undefined, kty: {:jose_jwk_kty_rsa, {:RSAPublicKey, 22285049158885340376326070958724265578825988326592561891505335517981697112663115627605336016312974648727152712825232672703606901833990104527742064940818896351125254041668220713149710519971929393596540889444662981852109138863362312382240414604690780890284495352157133567803500537833974423927025044579584717594115081637678287985602165208959626135586335293657212186069038239670817395935529798146206303315156219733820787146161466247576911481782975175283080062638007814045263061446090972610985470798813113429879314241226322495616246179054635918379249452581046287333578216120004125518974406573192664202976595616504079460129, 65537}}}, sockets: []}
Elixir.Farmbot.BotState.Transport.GenMQTT.Client Connected to real time services.
Elixir.Farmbot.BotState.Transport.GenMQTT.Client Connected to real time services.
Elixir.Farmbot.BotState.Transport.GenMQTT.Client Connected to real time services.
Elixir.Farmbot.Bootstrap.Supervisor Beginning authorization: Elixir.Farmbot.Bootstrap.Authorization - admin@admin.com - http://localhost:3000
Elixir.Farmbot.Bootstrap.Supervisor Successful authorization: Elixir.Farmbot.Bootstrap.Authorization - admin@admin.com - http://localhost:3000
Elixir.Farmbot.BotState.Transport.GenMQTT.Client Connected to real time services.
Elixir.Farmbot.Bootstrap.Supervisor Beginning authorization: Elixir.Farmbot.Bootstrap.Authorization - admin@admin.com - http://localhost:3000
Elixir.Farmbot.Bootstrap.Supervisor Successful authorization: Elixir.Farmbot.Bootstrap.Authorization - admin@admin.com - http://localhost:3000
Elixir.Farmbot.BotState.Transport.GenMQTT.Client Connected to real time services.
Elixir.Farmbot.Bootstrap.Supervisor Beginning authorization: Elixir.Farmbot.Bootstrap.Authorization - admin@admin.com - http://localhost:3000
Elixir.Farmbot.Bootstrap.Supervisor Successful authorization: Elixir.Farmbot.Bootstrap.Authorization - admin@admin.com - http://localhost:3000
Elixir.Farmbot.BotState.Transport.GenMQTT.Client Connected to real time services.
18:38:27.923 [info] Application farmbot exited: shutdown
The first 5 lines are the transport receiving a command,
Elixir.Farmbot.Bootstrap.Supervisor Beginning authorization: Elixir.Farmbot.Bootstrap.Authorization - admin@admin.com - http://localhost:3000
as far as i can tell should not be called. It is much lower in the bootstrap process. This is fetching a JWT.
Elixir.Farmbot.BotState.Transport.GenMQTT.Client Connected to real time services.
is the mqtt client authenticating with a JWT. (that is stored in a sqlite db from that above auth task). I’m not exactly sure why this happens.