Tsung load test stuck exactly at 5000 connected users

<?xml version="1.0"?>
<!DOCTYPE tsung SYSTEM "tsung-1.0.dtd">
<tsung loglevel="warning" version="1.0">
  
  <clients>
    <client host="localhost" use_controller_vm="true" maxusers="60000"/>
  </clients>
  
  <servers>
    <server host="141.145.208.139" port="4000" type="tcp"/>
  </servers>
  
  <load>
    <arrivalphase phase="1" duration="400" unit="second">
      <users maxnumber="60000" arrivalrate="500" unit="second"/>
    </arrivalphase>
  </load>
  
  <options>
    <option name="ports_range" min="1025" max="65535"/>
  </options>
  
  <sessions>
    <session name="websocket" probability="100" type="ts_websocket">
      
      <request>
        <websocket type="connect" path="/socket/websocket"/>
      </request>
      
      <request>
        <websocket type="message">{"topic":"conversation:83","event":"phx_join","payload":{},"ref":"1"}</websocket>
      </request>
      
      <for var="i" from="1" to="1000" incr="1">
        <thinktime value="30"/>
        <request>
          <websocket type="message">{"topic":"phoenix","event":"heartbeat","payload":{},"ref":"heartbeat"}</websocket>
        </request>
      </for>
      
    </session>
  </sessions>
</tsung>

What I tried:

Increasing ulimit on both the client and the server.

When I start the test, I can see CPU usage rise in the Phoenix service, and it logs the connected message and the topic join; then it stops.

These are the Tsung logs:

stats: dump at 1772737296

stats: users 4466 4466
stats: {cpu,"tsung_controller@unknown"} 1 23.92556014996428 0 23.92556014996428 23.92556014996428 0 0
stats: {load,"tsung_controller@unknown"} 1 0.30078125 0 0.30078125 0.30078125 0 0
stats: {freemem,"tsung_controller@unknown"} 1 5084.1953125 0 5084.1953125 5084.1953125 0 0
stats: users_count 4466 4466
stats: finish_users_count 0 0
stats: request 8380 65.26576169451066 24.256316804754 198.083 34.12 0 0
stats: connect 4229 40.902805864270455 4.201730056014052 60.217 34.467 0 0
stats: page 4177 130.49040052669372 17.767087540512446 248.559 105.436 0 0
stats: size_rcv 1329777 1329777
stats: size_sent 5318132 5318132
stats: connected 4229 4229
stats: websocket_succ 4203 4203

stats: dump at 1772737306

stats: users 9393 9393
stats: {cpu,"tsung_controller@unknown"} 1 18.802265983274886 0.0 23.92556014996428 18.802265983274886 23.92556014996428 1
stats: {load,"tsung_controller@unknown"} 1 0.25 0.0 0.30078125 0.25 0.30078125 1
stats: {freemem,"tsung_controller@unknown"} 1 4877.2734375 0.0 5084.1953125 4877.2734375 5084.1953125 1
stats: users_count 4927 9393
stats: finish_users_count 0 0
stats: request 1764 64.8651275510203 34.46547744678856 1127.223 34.12 65.26576169451066 8380
stats: connect 843 42.19985646500603 36.21356727868989 1084.529 34.467 40.902805864270455 4229
stats: page 895 130.3516491620113 37.330267446701676 1167.76 105.436 130.49040052669372 4177
stats: size_rcv 278047 1607824
stats: size_sent 1062444 6380576
stats: connected 843 5072
stats: websocket_succ 869 5072

stats: dump at 1772737316

stats: users 14204 14204
stats: {cpu,"tsung_controller@unknown"} 1 11.827101564571732 0.0 23.92556014996428 11.827101564571732 21.36391306661958 2
stats: {load,"tsung_controller@unknown"} 1 0.2890625 0.0 0.30078125 0.25 0.275390625 2
stats: {freemem,"tsung_controller@unknown"} 1 4617.7890625 0.0 5084.1953125 4617.7890625 4980.734375 2
stats: users_count 4811 14204
stats: finish_users_count 0 0
stats: request 2 561.432 523.1210000000001 1127.223 34.12 65.19609305993683 10144
stats: connect 1 1044.818 0.0 1084.529 34.467 41.11838426656148 5072
stats: page 1 1122.941 0.0 1167.76 105.436 130.46591660094634 5072
stats: size_rcv 317 1608141
stats: size_sent 1258 6381834
stats: connected 1 5073
stats: websocket_succ 1 5073

stats: dump at 1772737326

stats: users 17763 17763
stats: {cpu,"tsung_controller@unknown"} 1 10.584228451663611 0.0 23.92556014996428 10.584228451663611 18.1849758992703 3
stats: {load,"tsung_controller@unknown"} 1 0.3203125 0.0 0.3203125 0.25 0.2799479166666667 3
stats: {freemem,"tsung_controller@unknown"} 1 4332.6484375 0.0 5084.1953125 4332.6484375 4859.752604166667 3
stats: users_count 3559 17763
stats: finish_users_count 0 0
stats: request 4207 50.92889160922288 353.6956122984609 19550.499 34.12 65.29391208357964 10146
stats: size_rcv 416850 2024991
stats: connect 3 11651.867666666667 6268.992401118612 19507.13 34.467 41.3162355608121 5073
stats: page 4204 50.96614486203623 354.8746595210742 19597.113 34.146 130.66155529272618 5073
stats: size_sent 325178 6707012
stats: connected 3 5076
stats: websocket_succ 3 5076

stats: dump at 1772737336

stats: users 20868 20868
stats: {cpu,"tsung_controller@unknown"} 1 46.22984919396776 0.0 46.22984919396776 10.584228451663611 16.284789037368625 4
stats: {load,"tsung_controller@unknown"} 1 0.4296875 0.0 0.4296875 0.25 0.2900390625 4
stats: {freemem,"tsung_controller@unknown"} 1 4099.703125 0.0 5084.1953125 4099.703125 4727.9765625 4
stats: users_count 3105 20868
stats: finish_users_count 0 0
stats: request 871 41.58996900114803 4.761006355350374 19550.499 34.059 61.083388768898466 14353
stats: connect 0 0 0 19507.13 34.467 48.17826359338058 5076
stats: page 871 41.58996900114803 4.761006355350374 19597.113 34.059 94.54648517839821 9277
stats: size_rcv 86229 2111220
stats: size_sent 64068 6771080
stats: connected 0 5076
stats: websocket_succ 0 5076

stats: dump at 1772737346

stats: users 23648 23648
stats: {cpu,"tsung_controller@unknown"} 1 32.80853224987303 0.0 46.22984919396776 10.584228451663611 22.273801068688453 5
stats: {load,"tsung_controller@unknown"} 1 0.51953125 0.0 0.51953125 0.25 0.31796875 5
stats: {freemem,"tsung_controller@unknown"} 1 3922.98828125 0.0 5084.1953125 3922.98828125 4602.321875 5
stats: users_count 2780 23648
stats: finish_users_count 0 0
stats: request 19 2171.656894736842 3081.405941854377 19550.499 34.059 59.96812545980029 15224
stats: page 10 4126.455400000001 3216.7239622951856 19597.113 34.059 90.00124221521483 10148
stats: size_rcv 2952 2114172
stats: size_sent 11398 6782478
stats: connect 9 4489.216555555556 3070.07860660223 19507.13 34.467 48.17826359338058 5076
stats: connected 9 5085
stats: websocket_succ 9 5085

stats: dump at 1772737356

stats: users 27230 27230
stats: {cpu,"tsung_controller@unknown"} 1 32.48730964467005 0.0 46.22984919396776 10.584228451663611 24.02958959888588 6
stats: {load,"tsung_controller@unknown"} 1 0.44140625 0.0 0.51953125 0.25 0.3515625 6
stats: {freemem,"tsung_controller@unknown"} 1 3720.27734375 0.0 5084.1953125 3720.27734375 4489.099609375 6
stats: users_count 3582 27230
stats: finish_users_count 0 0
stats: request 4250 42.41863152941167 21.352453923474716 19550.499 34.059 62.6002901659778 15243
stats: page 4249 42.42863403153671 21.777244209394205 19597.113 34.059 93.97491238432764 10158
stats: connect 1 1054.813 0.0 19507.13 34.467 56.038508357915404 5085
stats: size_rcv 420869 2535041
stats: size_sent 325626 7108104
stats: connected 1 5086
stats: websocket_succ 1 5086

stats: dump at 1772737367

stats: users 28237 28249
stats: {cpu,"tsung_controller@unknown"} 1 48.11715481171548 0.0 48.11715481171548 10.584228451663611 25.23783531971219 7
stats: {load,"tsung_controller@unknown"} 1 0.609375 0.0 0.609375 0.25 0.36439732142857145 7
stats: {freemem,"tsung_controller@unknown"} 1 3482.6640625 0.0 5084.1953125 3482.6640625 4379.267857142857 7
stats: session 696 213.49577729885067 33.9491718719882 353.745 137.804 0 0
stats: users_count 1703 28933
stats: finish_users_count 696 696
stats: request 829 54.91196501809405 391.5845597034931 19550.499 34.059 58.20014400041036 19493
stats: page 828 54.97843357487919 393.45779514016743 19597.113 34.059 78.7725706948011 14407
stats: size_rcv 82190 2617231
stats: connect 1 11273.349 0.0 19507.13 34.467 56.234885568226474 5086
stats: error_connect_eaddrinuse 2794 2794
stats: size_sent 62590 7170694
stats: connected 1 5087
stats: websocket_succ 1 5087
stats: error_abort_max_conn_retries 691 691
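
To make the plateau easier to see, here is a minimal Python sketch that pulls the cumulative value of one counter out of a log like the one above. It assumes counter lines have the shape `stats: <name> <count in this interval> <cumulative total>`, which matches the `connected` lines here (4229 + 843 = 5072). The function name `cumulative_counter` and the `sample` excerpt are made up for illustration and are not part of Tsung.

```python
import re

def cumulative_counter(log_text, name="connected"):
    """Return [(dump_timestamp, cumulative_value), ...] for one counter."""
    series = []
    timestamp = None
    for line in log_text.splitlines():
        line = line.strip()
        # A "dump at" line starts a new reporting interval.
        m = re.match(r"stats: dump at (\d+)$", line)
        if m:
            timestamp = int(m.group(1))
            continue
        # Assumed counter shape: name, interval count, cumulative total.
        m = re.match(r"stats: (\S+) (\d+) (\d+)$", line)
        if m and m.group(1) == name and timestamp is not None:
            series.append((timestamp, int(m.group(3))))
    return series

# Short excerpt copied from the first three dumps above.
sample = """\
stats: dump at 1772737296
stats: connected 4229 4229
stats: dump at 1772737306
stats: connected 843 5072
stats: dump at 1772737316
stats: connected 1 5073
"""

series = cumulative_counter(sample)
for ts, total in series:
    print(ts, total)
```

On the full log the cumulative `connected` value stays between 5072 and 5087 from the second dump onward, while `users` keeps growing.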

Now I’m stuck and I don’t know exactly what’s wrong. Please help.


Okay guys, it turned out that testing from a local machine was a bad idea. I ran the test on the same VPS that hosts Phoenix, and it handled it perfectly. It’s a 4 OCPU, 24 GB RAM VPS; the test used only 1 GB of RAM and some CPU, because the connect function has to decrypt a token. Realistically, how many connections could I handle on this machine?

That depends on what those connected users do.


A user connects, puts their online status in an ETS table, and broadcasts it to their friends. They also join all of their friends’ channels to listen for status changes (online and typing); when a friend’s status changes, it is broadcast.

Are you planning on having more than 10’000 users connected at the same time?

Why worry about it? If you find that you are hitting a limit you stop the instance, give it some more CPU and off you go.

My personal advice would be to stop worrying about it and instead build your thing and then try to get 10’000 simultaneous users (which is very hard).

I don’t think it will be that hard. All I’m managing here is chat, typing, and online status. I’m using an ETS table, which has O(1) lookups, so I guess that will never be a problem. The only problem I’m thinking about now is that when a user comes online, they have to send a message to all their friends; with 500 friends, that would be heavy. One solution I’m considering is sending the message only to friends who are already online. What scares me now is the typing status.
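
The "online friends only" idea can be sketched roughly like this. It is a language-neutral illustration in Python, not Phoenix code: a plain `set` stands in for the ETS table, and `come_online`, `friends_of`, and `online_users` are made-up names.

```python
def come_online(user_id, friends_of, online_users):
    """Mark user_id as online and return only the friends worth notifying."""
    online_users.add(user_id)
    # Set membership is O(1) on average, similar in spirit to an ETS lookup.
    return [f for f in friends_of.get(user_id, ()) if f in online_users]

# Hypothetical data: one user with 500 friends, 20 of whom are online.
friends_of = {"alice": [f"friend{i}" for i in range(500)]}
online_users = {f"friend{i}" for i in range(20)}

targets = come_online("alice", friends_of, online_users)
print(len(friends_of["alice"]), len(targets))  # 500 20
```

With these numbers the fan-out drops from 500 messages to 20, so the cost follows the number of online friends rather than the size of the friend list.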