Well, that could still be a better UI. As you receive messages through handle_info, update the state and trigger a render, or stream it. I believe the issue with the receive/blocking approach is that the selective receive will have to weed out the system messages and any other messages sitting in the mailbox. With the handle_info approach you could update the UI as elements come in from your external program.
If the number of messages on a rarely used LV page is a problem, then many of the systems out there wouldn't work.
I don't think I made myself clear. I don't believe it's necessarily a problem, but as I understand it a receive in a gen_server will have to selectively weed out system messages, gen_server-specific messages… I don't see how that would not affect performance, even in the slightest.
That being said, I also believe that it's not great to block in a callback when it's possible to avoid it. Sure, it would also be possible to wrap a System.cmd in a task and handle the result of the task. But then you would spawn a new process for the task and a Port.
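To be concrete, here is a rough, untested sketch of the task approach (shell_out here just stands in for whatever System.cmd call you would make):

def handle_event("refresh", _params, socket) do
  # Run the external command off the LiveView process so the callback returns immediately
  task = Task.async(fn -> shell_out() end)
  {:noreply, assign(socket, task_ref: task.ref)}
end

# The Task.async result arrives later as a plain message: {ref, result}
def handle_info({ref, result}, %{assigns: %{task_ref: ref}} = socket) do
  # Stop monitoring the task now that it has replied
  Process.demonitor(ref, [:flush])
  {:noreply, assign(socket, output: result, task_ref: nil)}
end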
I'm not claiming that performance would be badly degraded by blocking, simply that it may make more sense not to block (and possibly improve the UI).
System.cmd wraps the whole Port machinery in a nice shortcut where receives are handled for you; in a gen_server it might be better to use the primitives that are given to you.
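To illustrate, a minimal (untested) sketch of using the Port primitives directly in a LiveView, so output arrives via handle_info instead of blocking a callback; the "ls -la" command is just a placeholder:

def mount(_params, _session, socket) do
  # The port sends us messages as the external program produces output
  port =
    Port.open({:spawn_executable, System.find_executable("ls")}, [
      :binary,
      :exit_status,
      args: ["-la"]
    ])

  {:ok, assign(socket, port: port, output: "")}
end

# Each chunk of output shows up here; append it and let LiveView re-render
def handle_info({port, {:data, data}}, %{assigns: %{port: port}} = socket) do
  {:noreply, update(socket, :output, &(&1 <> data))}
end

# The program finished; nothing left to do in this sketch
def handle_info({port, {:exit_status, _status}}, %{assigns: %{port: port}} = socket) do
  {:noreply, socket}
end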
It only affects performance if there are in fact system messages in the queue. Why would there be in this case? It would take a LOT of messages to make receive performance measurably degrade; it is very well optimized.
Shot in the dark: what happens if you call :erlang.garbage_collect() before and after :timer.tc/1? My theory is that the garbage collector is being triggered during System.cmd because enough work has been done before handle_event (and that's more likely to happen there than in mount).
And, just to add to that, this isn't my first rodeo and I'm sensitive to premature optimization. You know what app is really slow? The one that didn't get made because we exceeded budget optimizing crap that doesn't matter.
There are tradeoffs in all things. It's just a bit scary with an intermittent 10x performance cost for reasons I can't explain. But like I said, there's not enough data to actually understand what is happening, and it's apparently not a universal problem, so I'm just going to move on and we'll see what happens in prod. I'll definitely update this thread with any new info if it comes up.
DANGIT! No dice (but I super thought that was it)
[debug] MOUNT PtestWeb.PerfLive.Index
Parameters: %{}
Session: %{"_csrf_token" => "yWT637rNb1MlJQv4eoSvoJ_6"}
shell_out took 3220 microseconds from mount
[debug] Replied in 3ms
[debug] HANDLE EVENT "refresh" in PtestWeb.PerfLive.Index
Parameters: %{"value" => ""}
shell_out took 20130 microseconds from handle_event
[debug] Replied in 24ms
[debug] HANDLE EVENT "refresh" in PtestWeb.PerfLive.Index
Parameters: %{"value" => ""}
shell_out took 13889 microseconds from handle_event
[debug] Replied in 14ms
[debug] HANDLE EVENT "refresh" in PtestWeb.PerfLive.Index
Parameters: %{"value" => ""}
shell_out took 6235 microseconds from handle_event
[debug] Replied in 6ms
Here's the code change, in case I screwed something up: Force garbage collection by treshenry · Pull Request #1 · treshenry/phoenix_perftest · GitHub
def handle_event("refresh", _params, socket) do
  :erlang.garbage_collect()
  {timer, _} = :timer.tc(fn -> shell_out() end)
  :erlang.garbage_collect()
  IO.puts("shell_out took #{timer} microseconds from handle_event")
  {:noreply, socket}
end
Can't reproduce
[debug] MOUNT PtestWeb.PerfLive.Index
Parameters: %{}
Session: %{"_csrf_token" => "a1x_wxsgi4gy6NV-oOcG7TDt"}
shell_out took 2221 microseconds from mount
[debug] Replied in 2ms
[debug] HANDLE EVENT "refresh" in PtestWeb.PerfLive.Index
Parameters: %{"value" => ""}
shell_out took 2161 microseconds from handle_event
[debug] Replied in 5ms
[debug] HANDLE EVENT "refresh" in PtestWeb.PerfLive.Index
Parameters: %{"value" => ""}
shell_out took 4526 microseconds from handle_event
[debug] Replied in 4ms
[debug] HANDLE EVENT "refresh" in PtestWeb.PerfLive.Index
Parameters: %{"value" => ""}
shell_out took 1888 microseconds from handle_event
[debug] Replied in 2ms
[debug] HANDLE EVENT "refresh" in PtestWeb.PerfLive.Index
Parameters: %{"value" => ""}
shell_out took 1604 microseconds from handle_event
[debug] Replied in 1ms
[debug] HANDLE EVENT "refresh" in PtestWeb.PerfLive.Index
Parameters: %{"value" => ""}
shell_out took 1598 microseconds from handle_event
[debug] Replied in 1ms
So what browser are you using?
By the way, I wouldn't judge anything related to performance from benchmarks with <100 runs and +/- 600% deviation. But it is still very interesting to find out what causes such behaviour.
Can the symptom be reproduced by System.cmd("true", [])? true is the simplest external command that does nothing.
Huh, did not know that! Sadly, it still reproduces. As shown above, the issue isn't the time it takes the OS to execute the command, but the time it takes Elixir to ask the OS to execute the command and then get the result.
Great news! Hopefully it's just something weird on a couple of local machines then.
Is there anything different about those local machines, e.g. is the Elixir app inside Docker or virtualization?
FWIW, I am also not able to reproduce locally. I just did a few runs, but I'm seeing that mount is on par with or a little slower than handle_event. Running Linux 6.3.1 on Arch with Firefox 112.0.2 and Vivaldi 6.0.2979.
Firefox
[debug] MOUNT PtestWeb.LsLive.Index
Parameters: %{}
Session: %{"_csrf_token" => "qjZyNlgmfueoXuFR_d2nzSGD"}
get_files took 3954 microseconds, on mount
[debug] Replied in 4ms
[debug] HANDLE EVENT "refresh" in PtestWeb.LsLive.Index
Parameters: %{"value" => ""}
get_files took 3000 microseconds, on handle
[debug] Replied in 3ms
[debug] HANDLE EVENT "refresh" in PtestWeb.LsLive.Index
Parameters: %{"value" => ""}
get_files took 2384 microseconds, on handle
[debug] Replied in 2ms
[debug] HANDLE EVENT "refresh" in PtestWeb.LsLive.Index
Parameters: %{"value" => ""}
get_files took 2703 microseconds, on handle
[debug] Replied in 3ms
[debug] HANDLE EVENT "refresh" in PtestWeb.LsLive.Index
Parameters: %{"value" => ""}
get_files took 2685 microseconds, on handle
[debug] Replied in 2ms
[debug] HANDLE EVENT "refresh" in PtestWeb.LsLive.Index
Parameters: %{"value" => ""}
get_files took 3104 microseconds, on handle
[debug] Replied in 3ms
[debug] HANDLE EVENT "refresh" in PtestWeb.LsLive.Index
Parameters: %{"value" => ""}
get_files took 2019 microseconds, on handle
Vivaldi
[debug] MOUNT PtestWeb.LsLive.Index
Parameters: %{}
Session: %{"_csrf_token" => "zl31ZF6ilVtybyAu1AvFHvK_"}
get_files took 3376 microseconds, on mount
[debug] Replied in 3ms
[debug] HANDLE EVENT "refresh" in PtestWeb.LsLive.Index
Parameters: %{"value" => ""}
get_files took 2796 microseconds, on handle
[debug] Replied in 5ms
[debug] HANDLE EVENT "refresh" in PtestWeb.LsLive.Index
Parameters: %{"value" => ""}
get_files took 2705 microseconds, on handle
[debug] Replied in 3ms
[debug] HANDLE EVENT "refresh" in PtestWeb.LsLive.Index
Parameters: %{"value" => ""}
get_files took 2452 microseconds, on handle
[debug] Replied in 2ms
[debug] HANDLE EVENT "refresh" in PtestWeb.LsLive.Index
Parameters: %{"value" => ""}
get_files took 2516 microseconds, on handle
[debug] Replied in 2ms
[debug] HANDLE EVENT "refresh" in PtestWeb.LsLive.Index
Parameters: %{"value" => ""}
get_files took 2661 microseconds, on handle
[debug] Replied in 2ms
[debug] HANDLE EVENT "refresh" in PtestWeb.LsLive.Index
Parameters: %{"value" => ""}
get_files took 2106 microseconds, on handle
[debug] Replied in 2ms
Just to give an idea of the variability on my machine, another run took 1634 usec on mount and averaged ~2700 usec on handle_event, but then on refreshing the page mount took 5715 usec and handle_event averaged ~2200 usec. So maybe just noise with low sample sizes?
I added :msacc.start() to the router function in lib/ptest_web.ex and then added :msacc.print() to the mount and handle_event functions, right after get_files is called in each. This didn't seem to show much discrepancy between the two (output below).
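For reference, the instrumentation looked roughly like this (just a sketch; get_files is the helper from my test app, and the exact placement of the calls is approximate):

# :msacc.start() was added in lib/ptest_web.ex, so the accumulators are already
# collecting by the time the LiveView callbacks run.
def handle_event("refresh", _params, socket) do
  {timer, _} = :timer.tc(fn -> get_files() end)
  IO.puts("get_files took #{timer} microseconds, on handle")
  # Dump the per-thread aux/gc/port/sleep percentages accumulated so far
  :msacc.print()
  {:noreply, socket}
end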
get_files took 6805 microseconds, on mount
Average thread real-time : 209483891 us
Accumulated system run-time : 1632155 us
Average scheduler run-time : 372646 us
Thread aux check_io emulator gc other port sleep
Stats per thread:
async( 0) 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 100.00%
aux( 1) 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 100.00%
dirty_cpu_( 1) 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 100.00%
dirty_cpu_( 2) 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 100.00%
dirty_cpu_( 3) 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 100.00%
dirty_cpu_( 4) 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 100.00%
dirty_io_s( 1) 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 100.00%
dirty_io_s( 2) 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 100.00%
dirty_io_s( 3) 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 100.00%
dirty_io_s( 4) 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 100.00%
dirty_io_s( 5) 0.00% 0.00% 0.01% 0.00% 0.04% 0.00% 99.95%
dirty_io_s( 6) 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 100.00%
dirty_io_s( 7) 0.00% 0.00% 0.00% 0.00% 0.01% 0.00% 99.99%
dirty_io_s( 8) 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 100.00%
dirty_io_s( 9) 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 100.00%
dirty_io_s(10) 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 100.00%
poll( 0) 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 100.00%
scheduler( 1) 0.01% 0.00% 0.09% 0.01% 0.08% 0.00% 99.81%
scheduler( 2) 0.01% 0.00% 0.08% 0.01% 0.05% 0.00% 99.85%
scheduler( 3) 0.01% 0.00% 0.10% 0.00% 0.10% 0.00% 99.79%
scheduler( 4) 0.01% 0.00% 0.08% 0.00% 0.07% 0.01% 99.84%
Stats per type:
async 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 100.00%
aux 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 100.00%
dirty_cpu_sche 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 100.00%
dirty_io_sched 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 99.99%
poll 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 100.00%
scheduler 0.01% 0.00% 0.09% 0.00% 0.07% 0.00% 99.82%
get_files took 2206 microseconds, on handle
Average thread real-time : 215479101 us
Accumulated system run-time : 1701034 us
Average scheduler run-time : 387481 us
Thread aux check_io emulator gc other port sleep
Stats per thread:
async( 0) 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 100.00%
aux( 1) 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 100.00%
dirty_cpu_( 1) 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 100.00%
dirty_cpu_( 2) 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 100.00%
dirty_cpu_( 3) 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 100.00%
dirty_cpu_( 4) 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 100.00%
dirty_io_s( 1) 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 100.00%
dirty_io_s( 2) 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 100.00%
dirty_io_s( 3) 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 100.00%
dirty_io_s( 4) 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 100.00%
dirty_io_s( 5) 0.00% 0.00% 0.01% 0.00% 0.04% 0.00% 99.95%
dirty_io_s( 6) 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 100.00%
dirty_io_s( 7) 0.00% 0.00% 0.00% 0.00% 0.01% 0.00% 99.98%
dirty_io_s( 8) 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 100.00%
dirty_io_s( 9) 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 100.00%
dirty_io_s(10) 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 100.00%
poll( 0) 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 100.00%
scheduler( 1) 0.01% 0.00% 0.10% 0.01% 0.08% 0.00% 99.81%
scheduler( 2) 0.01% 0.00% 0.08% 0.01% 0.05% 0.00% 99.85%
scheduler( 3) 0.01% 0.00% 0.10% 0.00% 0.10% 0.00% 99.79%
scheduler( 4) 0.01% 0.00% 0.08% 0.00% 0.07% 0.01% 99.83%
Stats per type:
async 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 100.00%
aux 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 100.00%
dirty_cpu_sche 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 100.00%
dirty_io_sched 0.00% 0.00% 0.00% 0.00% 0.01% 0.00% 99.99%
poll 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 100.00%
scheduler 0.01% 0.00% 0.09% 0.00% 0.08% 0.00% 99.82%
Again, this is only a single run, so it might not be valid data.
At least on my machine, handle_event seems to perform much more consistently than mount, with mount varying between 1400 and 7000 usec and handle_event between 1800 and 3200 usec.
Also, in reply to @benwilson512 above, it does look like there is something about these local machine environments that may be the cause. After these last two posts I went back to our beefy Linux box, and there's basically no difference there on second look and with a larger data set. Previously I didn't grab a big sample size, because I just assumed that if it repro'd twice, it would repro thrice. Bad assumption. I think I know what the issue is, or at least what field we are playing ball in.
Can anyone try to reproduce this with an M1 Mac? The two machines we are seeing crazy slowdowns on are Apple silicon. However, I happen to have an Intel Mac we keep around for testing and it does not reproduce there. Consistently seeing ~4ms.
Did you do José's garbage collection trick on the M1 machines?
Correct, and that didn't resolve the issue. But here's my new theory, which again, if someone else can repro on an M1, that would be super helpful: something about mount is eating the translation performance hit up front (maybe something prior is shelling out too, like Tailwind compilation?), whereas handle_event is eating the hit when I'm calling System.cmd. Something along those lines. That might be wildly off, but I'm thinking the issue has something to do with System.cmd, in the handle_event path, specifically on Apple chips. Totally just a theory, but it currently fits the facts.
If someone else has an M1 and can't reproduce the issue, then I officially throw up my hands.
Same results here it seems:
[debug] MOUNT ChannelsTestWeb.TestLive
Parameters: %{}
Session: %{"_csrf_token" => "v_1aco0pTl98KPuowfY6ThJz"}
System.cmd took 2675 microseconds
[debug] Replied in 2ms
[debug] HANDLE EVENT "refresh" in ChannelsTestWeb.TestLive
Parameters: %{"value" => ""}
System.cmd took 6074 microseconds
[debug] Replied in 6ms
[debug] HANDLE EVENT "refresh" in ChannelsTestWeb.TestLive
Parameters: %{"value" => ""}
System.cmd took 5035 microseconds
[debug] Replied in 5ms
[debug] HANDLE EVENT "refresh" in ChannelsTestWeb.TestLive
Parameters: %{"value" => ""}
System.cmd took 7705 microseconds
[debug] Replied in 7ms
[debug] HANDLE EVENT "refresh" in ChannelsTestWeb.TestLive
Parameters: %{"value" => ""}
System.cmd took 8343 microseconds
[debug] Replied in 8ms
[debug] HANDLE EVENT "refresh" in ChannelsTestWeb.TestLive
Parameters: %{"value" => ""}
System.cmd took 6803 microseconds
[debug] Replied in 7ms
[debug] HANDLE EVENT "refresh" in ChannelsTestWeb.TestLive
Parameters: %{"value" => ""}
System.cmd took 6116 microseconds
[debug] Replied in 6ms
[debug] HANDLE EVENT "refresh" in ChannelsTestWeb.TestLive
Parameters: %{"value" => ""}
System.cmd took 15607 microseconds
[debug] Replied in 15ms
[debug] HANDLE EVENT "refresh" in ChannelsTestWeb.TestLive
Parameters: %{"value" => ""}
System.cmd took 5405 microseconds
[debug] Replied in 5ms
[debug] HANDLE EVENT "refresh" in ChannelsTestWeb.TestLive
Parameters: %{"value" => ""}
System.cmd took 7124 microseconds
[debug] Replied in 7ms
[debug] HANDLE EVENT "refresh" in ChannelsTestWeb.TestLive
Parameters: %{"value" => ""}
System.cmd took 6561 microseconds
[debug] Replied in 6ms
[debug] HANDLE EVENT "refresh" in ChannelsTestWeb.TestLive
Parameters: %{"value" => ""}
System.cmd took 6113 microseconds
[debug] Replied in 6ms
[debug] HANDLE EVENT "refresh" in ChannelsTestWeb.TestLive
Parameters: %{"value" => ""}
System.cmd took 8810 microseconds
[debug] Replied in 9ms
[debug] HANDLE EVENT "refresh" in ChannelsTestWeb.TestLive
Parameters: %{"value" => ""}
System.cmd took 7972 microseconds
[debug] Replied in 8ms
[debug] HANDLE EVENT "refresh" in ChannelsTestWeb.TestLive
Parameters: %{"value" => ""}
System.cmd took 7700 microseconds
[debug] Replied in 7ms
WOOT! OMG, thank you @LostKobrakai. Haha. I thought I was losing my mind.
Great, now I can officially not worry about this as we will not be hosting the app on M1 in prod any time soon hah. I guess the remaining question is: should I file a bug?
Also, is there a good way to close this thread out and summarize in the original post or something? It doesn't look like I can edit the post or the title… Forum newb here.
You can mark a post as the solution. Even your own.