Performance issue when running in Kubernetes

Hello, I have a performance issue when running my Elixir app in Kubernetes.

I have a function that does a query to Cassandra and then decodes the result. The average execution time is around 2 ms for the query and 1 ms for the decode, but when I do several calls, I sometimes observe a spike in execution time, reaching 70 to 90 ms on either the query or the decode step (but never both for the same call).
I guess it is not related to the network, since it happens on the decode step too.
The Cassandra connection uses a pool, and the decode step runs after the connection is released.

I tried reproducing the issue locally, but it never happened (with either a local Cassandra or the one in my Kubernetes cluster). I am starting to think that the way my app runs on the Kubernetes node might have an impact. I don’t really know what I should do to get a better understanding of what is causing these spikes.
If anyone has ideas, it would help me a lot.
I am using Elixir 1.10.
The app is running on a node with 4 vCPUs and 3.6 GB of memory.

I tried with these flags, without success:

+sbwt none
+sbwtdcpu none
+sbwtdio none
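If the app ships as a Mix release, these emulator flags would normally live in rel/vm.args.eex (a sketch assuming the standard Mix release layout):

```
## rel/vm.args.eex -- disable scheduler busy waiting
+sbwt none
+sbwtdcpu none
+sbwtdio none
```

Alternatively, the same flags can be passed at startup through the ELIXIR_ERL_OPTIONS environment variable.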

Hi @Kalaww. You mention Kubernetes, but then don’t elaborate much about your K8s environment. Is “4vCPU - 3.6Gb memory” your node properties or pod spec? Are you running other things in your cluster?

Thanks for your reply.
I have a lot of deployments running in my k8s cluster (managed by Google Cloud); my app always runs alongside other pods on its node (between 5 and 20 pods).
The “4vCPU - 3.6Gb memory” is the spec of the node where my app is running.
The issue seems to appear when I do read and write requests in parallel (around 6 to 10 simultaneous requests). When I run only a few read requests, the execution times are normal. My pool size is 20.

Are you setting the correct number of schedulers (+S)?

It’s explained under “Container Resources” in https://adoptingerlang.org/docs/production/kubernetes/

3 Likes

I am not setting the number of schedulers; I checked my pod and it is automatically set to 4.
Thank you for this documentation, there are a lot of interesting topics in there to improve my deployment.
I will experiment with +S. My limits.cpu is at 1000m, which might cause my app to be throttled when doing several tasks in parallel.

You should set +S to the number of full CPUs allocated to your pod, minimum one. Setting 4 schedulers but only allowing 1 CPU is definitely going to cause contention.
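As a sketch, with limits.cpu: 1 the matching vm.args entry would pin both the total and the online scheduler count (the +S Schedulers:SchedulersOnline syntax):

```
## vm.args -- one scheduler, matching a limits.cpu of 1
+S 1:1
```

You can confirm the active value at runtime with System.schedulers_online/0.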

3 Likes

Won’t k8s balance your CPU usage across cores transparently? So you can use 4 cores with an allocation of 1000m (millicores), and it will allow you fast usage of all of the cores but throttle your total usage to the equivalent of 1. You may easily max out that amount, in which case you’ll see the pod’s usage at 1000m consistently, and you should raise it.

My understanding is that k8s won’t say “you can access 1 CPU” if you set the core limit to 1. Instead, it limits you to 1 core’s worth of CPU usage, even if it’s spread across multiple cores. This allows you to take advantage of parallelism while still maintaining a CPU limit. (This is CFS quota enforcement: with the default 100 ms period, a 1-CPU limit grants 100 ms of CPU time per period, so four busy schedulers can burn through it in roughly 25 ms and then stall until the next period starts, which lines up with latency spikes in the tens of milliseconds.)

I’m not sure what your apps are, but 5-20 pods on a node with 4 vCPUs and 3.6 GB of memory seems like a lot of pods for the size of the node. At least for the apps I’m running.

I have just tested with limits.cpu=4000m and +S 4, and there are no more spikes. Everything runs perfectly. Thank you!!!
I am new to Kubernetes, and I definitely didn’t understand how to set the CPU values.

I have now

requests:
  cpu: 1
limits:
  cpu: 4

Is it fine to have limits.cpu set to the node’s number of cores to make sure my app won’t be throttled? Isn’t it risky for the other pods running on the same node if my app can potentially use too much of the CPU?

And what would be the reasoning for choosing the value of requests.cpu? I guess if I set it too high, I might end up with an underused node, because fewer pods would be scheduled on it.

I would read up on this: https://kubernetes.io/docs/tasks/configure-pod-container/quality-service-pod/

For my core services I always aim for the Guaranteed QOS class, which means that the pod requests and limits must be identical. This does mean you may need some more nodes to ensure that you can actually provide the guaranteed level of resources. For secondary services Burstable is fine, and what’s left can get Best Effort.
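For reference, a pod lands in the Guaranteed class only when every container’s requests equal its limits for both cpu and memory. A spec fragment in the spirit of this thread (the memory figures are placeholders):

```yaml
resources:
  requests:
    cpu: 4
    memory: 3Gi
  limits:
    cpu: 4
    memory: 3Gi
```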

2 Likes

Yes, it averages around 7-10 pods, but we have quite a lot of deployments without requests/limits set. We are correcting them to make better use of each of our nodes.

Thanks again, I didn’t know about k8s QoS classes. I definitely want my app to aim for the Guaranteed QoS class, because uptime and performance are critical for it.

1 Like

If you set your limit to 4 on a 4-vCPU node, your app could potentially starve everything else on the node of CPU.

We used to have some Ruby apps on the same hardware as Elixir apps without a limit set (oops). When the Elixir app was 100% slammed (8 schedulers at 100%), the Ruby app’s response times jumped from 50 ms to >30 s per request. The only connection between the two apps was that they ran on the same node.

1 Like

:+1: You’ve got this essentially right: the limits are applied through cgroups. The problems arise because, prior to an as-yet-unreleased version of Erlang/OTP, the runtime itself is not cgroups-aware. If you do not explicitly set the number of BEAM schedulers with +S, you get a default based on the physical core count of the Docker host, not on the CPU shares dictated by the resource requests/limits. This means that if you’ve scaled up to many-core machines under the hood, you’ll get unnecessary contention when, e.g., 16 BEAM schedulers fight over 2000 millicores of CPU share.
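Until a cgroups-aware OTP ships, one workaround is to derive +S yourself at container start. A minimal sketch assuming cgroup v1 paths (cpu.cfs_quota_us / cpu.cfs_period_us; cgroup v2 exposes cpu.max instead):

```shell
#!/bin/sh
# Sketch: derive a BEAM scheduler count from the cgroup v1 CFS quota.
# Rounds up so a fractional limit (e.g. 1500m) still gets 2 schedulers.
schedulers_from_quota() {
  quota=$1; period=$2
  if [ "$quota" -le 0 ]; then
    nproc 2>/dev/null || echo 1   # no quota set: fall back to host core count
  else
    echo $(( (quota + period - 1) / period ))
  fi
}

# Kernel defaults: the period is 100000 us; a quota of -1 means "no limit".
QUOTA=$(cat /sys/fs/cgroup/cpu/cpu.cfs_quota_us 2>/dev/null || echo -1)
PERIOD=$(cat /sys/fs/cgroup/cpu/cpu.cfs_period_us 2>/dev/null || echo 100000)
S=$(schedulers_from_quota "$QUOTA" "$PERIOD")
echo "Using +S $S"
```

A release’s rel/env.sh.eex could then export ELIXIR_ERL_OPTIONS="+S $S:$S" so the BEAM boots with the computed count (rel/env.sh.eex is the standard Mix release hook; the thread does not show the actual deployment, so treat the wiring as an assumption).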

2 Likes

Cannot endorse this enough! I need a very compelling argument to do otherwise for anything that is user-facing or that has response-time expectations measured in less than “minutes”.