Scotty, I need warp speed in three minutes - my journey in optimising an Elixir codebase

I have recently spent some time optimising the hell out of my fork of Supabase’s Supavisor. I wanted to make it as fast as possible to compare it against “state of the art” and “native” solutions like PgBouncer and PgDog.

I haven’t achieved comparable performance, but I think it may still be an interesting read for some people out there.


Nice post! And thank you for the shout-out, and your many contributions to Peep :slight_smile:

I think your post is currently missing the links to Ultravisor and Speedoscope.

This project now lives as [Ultravisor][]
[…]
That has disadvantages, but at least it was workable within [Speedoscope][]

Aside: Did you happen to explore using Linux’s perf(1) for less-obtrusive tracing? It’s a bit clunky to use (or maybe I haven’t found a better way of wielding it), but I find it super helpful, especially since the BEAM emits perf maps for jitted functions. When I was spending a lot of time trying to optimize stuff that runs in a Docker container in prod, I repeatedly ran perf(1) and generated flame graphs (using brendangregg/FlameGraph from GitHub) like so:

# copy perf map out of Docker container, renaming it from the in-container PID (6) to the host PID (2969015), making it visible to perf running on the host.
sudo docker cp container:/tmp/perf-6.map /tmp/perf-2969015.map
sudo perf record --call-graph=fp --pid 2969015 -- sleep 10
sudo perf script > out.perf
sudo chown richard:richard out.perf
./FlameGraph/stackcollapse-perf.pl out.perf > out.folded
# strip [0-9]+_ and/or _[0-9]+ from all scheduler names
sed -e 's/^[0-9]\+_//' -e 's/^erts_\([^_]\+\)_[0-9]\+/erts_\1/' out.folded > out.folded_sched
./FlameGraph/flamegraph.pl out.folded_sched > out_sched.svg

These flame graphs weren’t totally complete (i.e. they missed stuff from NIFs compiled in the container without debug symbols), but it was nice to get a mix of BEAM and native code in a single flame graph.
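
To see what the `sed` step in the pipeline above does, here is a minimal sketch that runs the same two expressions over a few folded-stack lines. The thread and function names in the sample input are hypothetical, and `normalize_sched` is just an illustrative wrapper name. Note that `\+` relies on GNU sed, as in the original command.

```shell
# Wrap the two sed expressions from the pipeline in a helper (hypothetical name).
normalize_sched() {
  sed -e 's/^[0-9]\+_//' -e 's/^erts_\([^_]\+\)_[0-9]\+/erts_\1/'
}

# Hypothetical folded stacks: "<thread name>;<frames> <sample count>".
printf '%s\n' \
  '1_scheduler;lists_map 12' \
  '2_scheduler;lists_map 8' \
  'erts_sched_3;process_main 5' \
  'erts_dcpus_1;process_main 2' | normalize_sched
# prints:
#   scheduler;lists_map 12
#   scheduler;lists_map 8
#   erts_sched;process_main 5
#   erts_dcpus;process_main 2
```

With the per-thread numbers stripped, stacks from different scheduler threads share the same root frame, so they can be drawn together in the flame graph instead of as one tower per scheduler.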


Yes, I did. It took a little bit of twiddling to get it right, but I have managed to run it in OrbStack, as unfortunately +JPperf true does not work on macOS with Instruments.app.
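
For anyone who has not used the flag, a minimal sketch of turning it on, assuming a Linux environment (for example a container or VM) and a node started from the same shell. The `mix run --no-halt` command in the comments is only an example of how the node might be started, not how the project discussed here is launched.

```shell
# ERL_FLAGS is read by erl at startup; +JPperf true asks the JIT to emit perf support data.
# Note: this overwrites any ERL_FLAGS already set in this shell.
export ERL_FLAGS="+JPperf true"

# Then start the node as usual from this shell, e.g. (assumed command):
#   mix run --no-halt
# and check for the map file written for the running VM:
#   ls /tmp/perf-*.map
```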

This was an enjoyable read, thank you. I’ll try using Peep soon as a result.
