PubSub, too many subscriptions?

I’m a little hazy on what the best practice is for designing a scalable PubSub system. Should there be a few subscriptions, or even just one, from which relevant information is filtered? Or should there be hundreds or more subscriptions, each carrying very specific information and thus needing less filtering?

Many specific subscriptions seem like the way to go, but at the same time, having users subscribe to hundreds of topics when they log in also seems like it’d slow things down.


Note that you can add metadata to the subscriptions that can later be used for filtering during dispatch.

Phoenix.PubSub.subscribe(YourApp.PubSub, "topic:user-id", metadata: [some: :extra_info])

which (if you are familiar with Registry) is similar to

Registry.register(YourApp.PubSub, "topic:user-id", [some: :extra_info])
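To make the "filtering during dispatch" part concrete, here is a minimal sketch of a custom dispatcher passed as the fourth argument to Phoenix.PubSub.broadcast/4. The module name RoleDispatcher, the Demo.PubSub name, the "events" topic, the :role metadata key and the {:for_role, role, payload} message shape are all made up for illustration; only subscribe/3 with :metadata and the dispatcher argument come from Phoenix.PubSub itself.

```elixir
Mix.install([:phoenix_pubsub])

defmodule RoleDispatcher do
  # Hypothetical dispatcher: Phoenix.PubSub calls dispatch/3 with the list of
  # {pid, metadata} entries registered for the topic. This sketch only handles
  # the made-up {:for_role, role, payload} shape and ignores `from`.
  def dispatch(entries, _from, {:for_role, role, payload}) do
    for {pid, metadata} <- entries,
        # Subscribers without metadata have `nil` here, so skip them.
        is_list(metadata),
        Keyword.get(metadata, :role) == role do
      send(pid, payload)
    end

    :ok
  end
end

{:ok, _pid} = Phoenix.PubSub.Supervisor.start_link(name: Demo.PubSub)

parent = self()

# Two subscribers on the same topic, distinguished only by metadata.
for role <- [:admin, :guest] do
  spawn_link(fn ->
    :ok = Phoenix.PubSub.subscribe(Demo.PubSub, "events", metadata: [role: role])
    send(parent, {:ready, role})

    receive do
      message -> send(parent, {:got, role, message})
    end
  end)

  # Wait until the subscription is in place before broadcasting.
  receive do
    {:ready, ^role} -> :ok
  end
end

:ok =
  Phoenix.PubSub.broadcast(Demo.PubSub, "events", {:for_role, :admin, :hello}, RoleDispatcher)

admin_got =
  receive do
    {:got, :admin, :hello} -> true
  after
    1_000 -> false
  end

guest_got =
  receive do
    {:got, :guest, _message} -> true
  after
    100 -> false
  end

IO.inspect({admin_got, guest_got}, label: "admin received, guest received")
```

The idea is that both processes share one topic, and the dispatcher decides per entry whether to send, so the filtering happens once on the broadcasting side instead of in every subscriber.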

In general, the number of subscriptions doesn’t matter much: a broadcast takes time roughly proportional to the number of subscribers on that topic, so the total delivery throughput should stay about the same:

bench.exs

Mix.install([:benchee, :phoenix_pubsub])

{:ok, _pid} = Phoenix.PubSub.Supervisor.start_link(name: App.PubSub)

defmodule Sub do
  use GenServer

  def start_link(opts \\ []) do
    GenServer.start_link(__MODULE__, opts)
  end

  @impl true
  def init(opts) do
    topic = Keyword.fetch!(opts, :topic)
    Phoenix.PubSub.subscribe(App.PubSub, topic)
    {:ok, nil}
  end

  @impl true
  def handle_info(_message, state) do
    {:noreply, state}
  end
end

Enum.each(1..200_000, fn _ -> Sub.start_link(topic: "topic") end)
Enum.each(1..10, fn _ -> Sub.start_link(topic: "topic2") end)

map = %{"some" => "key", "then" => "some other key"}

Benchee.run(
  %{
    "message" => fn topic -> Phoenix.PubSub.broadcast(App.PubSub, topic, "message") end,
    "map" => fn topic -> Phoenix.PubSub.broadcast(App.PubSub, topic, map) end
  },
  inputs: %{"lots subs" => "topic", "few subs" => "topic2"}
)

> elixir bench.exs

Operating System: macOS
CPU Information: Apple M1
Number of Available Cores: 8
Available memory: 8 GB
Elixir 1.13.1
Erlang 24.2

Benchmark suite executing with the following configuration:
warmup: 2 s
time: 5 s
memory time: 0 ns
parallel: 1
inputs: few subs, lots subs
Estimated total run time: 28 s

Benchmarking map with input few subs...
Benchmarking map with input lots subs...
Benchmarking message with input few subs...
Benchmarking message with input lots subs...

##### With input few subs #####
Name              ips        average  deviation         median         99th %
map          147.35 K        6.79 μs  ±1531.35%        4.99 μs       24.99 μs
message       95.00 K       10.53 μs  ±9207.44%        2.99 μs        8.99 μs

Comparison:
map          147.35 K
message       95.00 K - 1.55x slower +3.74 μs

##### With input lots subs #####
Name              ips        average  deviation         median         99th %
map              3.12      320.16 ms     ±9.22%      313.18 ms      387.97 ms
message          2.78      360.19 ms    ±16.91%      352.94 ms      479.31 ms

Comparison:
map              3.12
message          2.78 - 1.13x slower +40.02 ms
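To read these numbers as total throughput, multiply broadcasts per second by the number of subscribers on the topic. The short script below does that arithmetic with the ips values copied from the tables above; "deliveries per second" is just a name picked here for that product, not something Benchee reports.

```elixir
# {benchmark, broadcasts per second (ips from the tables above), subscribers}
results = [
  {"map, few subs", 147_350, 10},
  {"message, few subs", 95_000, 10},
  {"map, lots subs", 3.12, 200_000},
  {"message, lots subs", 2.78, 200_000}
]

deliveries =
  for {name, ips, subscribers} <- results do
    # One broadcast sends one message to every subscriber on the topic.
    {name, round(ips * subscribers)}
  end

for {name, per_second} <- deliveries do
  IO.puts("#{name}: ~#{per_second} deliveries/s")
end
```

That works out to roughly 0.55 to 1.5 million deliveries per second in all four cases, so a single broadcast to 200,000 subscribers is slow mostly because it does 20,000 times more sends, not because each send got much more expensive.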

Thanks for putting this together @ruslandoga :clap: