Kubernetes Horizontal Pod AutoScaling - HPA

- October 21, 2022

In Kubernetes, a HorizontalPodAutoscaler automatically updates a workload resource (such as a Deployment or StatefulSet), with the aim of automatically scaling the workload to match demand.

Horizontal scaling means that the response to increased load is to deploy more Pods.

Horizontal pod autoscaling does not apply to objects that can't be scaled (for example: a DaemonSet.)

From the most basic perspective, the HorizontalPodAutoscaler controller operates on the ratio between desired metric value and current metric value:

desiredReplicas = ceil[currentReplicas * ( currentMetricValue / desiredMetricValue )]

For example, if the current metric value is 200m and the desired value is 100m, the number of replicas will be doubled, since 200.0 / 100.0 == 2.0.

If the current value is instead 50m, you'll halve the number of replicas, since 50.0 / 100.0 == 0.5.

The control plane skips any scaling action if the ratio is sufficiently close to 1.0 (within a globally-configurable tolerance, 0.1 by default).
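The formula and the tolerance check can be sketched in a few lines of Python. This is only an illustration of the arithmetic described above, not the controller's actual code; the function name `desired_replicas` and the starting count of 4 replicas are assumptions made for the example.

```python
import math


def desired_replicas(current_replicas, current_metric, desired_metric, tolerance=0.1):
    """Replica count suggested by the basic HPA ratio formula."""
    ratio = current_metric / desired_metric
    # No scaling action when the ratio is close enough to 1.0.
    if abs(ratio - 1.0) <= tolerance:
        return current_replicas
    return math.ceil(current_replicas * ratio)


# Hypothetical starting point: 4 replicas, target value 100m.
print(desired_replicas(4, 200, 100))  # ratio 2.0  -> 8
print(desired_replicas(4, 50, 100))   # ratio 0.5  -> 2
print(desired_replicas(4, 105, 100))  # ratio 1.05 -> stays at 4
```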

The HorizontalPodAutoscaler controller works from collected metrics, so make sure Metrics Server is installed in the cluster before you continue.

Once Metrics Server is in place, create a Deployment with a Service attached to it.

root@master-node:/kubernetes# kubectl apply -f php.yml
deployment.apps/php-apache created
service/php-apache created

root@master-node:/kubernetes# kubectl get svc
NAME         TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)   AGE
php-apache   ClusterIP   10.104.10.167   <none>        80/TCP    18s

Create an HPA for the Deployment.

root@master-node:/kubernetes# kubectl autoscale deployment php-apache --cpu-percent=50 --min=1 --max=5
horizontalpodautoscaler.autoscaling/php-apache autoscaled

root@master-node:/kubernetes# kubectl get hpa
NAME         REFERENCE               TARGETS         MINPODS   MAXPODS   REPLICAS   AGE
php-apache   Deployment/php-apache   <unknown>/50%   1         5         0          6s

YAML for reference:

root@master-node:/kubernetes# cat hpa.yml
apiVersion: autoscaling/v1
kind: HorizontalPodAutoscaler
metadata:
  creationTimestamp: null
  name: php-apache
spec:
  maxReplicas: 5
  minReplicas: 1
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: php-apache
  targetCPUUtilizationPercentage: 50

Load testing:

root@master-node:/kubernetes# kubectl run -i --tty load-generator --rm --image=busybox --restart=Never -- /bin/sh -c "while sleep 0.01; do wget -q -O- http://php-apache; done"

root@master-node:/kubernetes# kubectl get pods --all-namespaces | grep -i metrics
kube-system   metrics-server-847d45fd4f-j2fpc   0/1   Running   0   93s

root@master-node:/kubernetes# kubectl top pod
NAME                          CPU(cores)   MEMORY(bytes)
php-apache-6766b988d9-wjbb9   1m           9Mi

root@master-node:/kubernetes# kubectl get hpa
NAME         REFERENCE               TARGETS   MINPODS   MAXPODS   REPLICAS   AGE
php-apache   Deployment/php-apache   0%/50%    1         5         1          27m
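The TARGETS column shows CPU utilization as a percentage of the CPU the pods request, not of a whole core. The sketch below illustrates that relationship. Since php.yml is not shown here, the 200m CPU request is a hypothetical value, and the function name and the truncation to a whole percent are assumptions made for the example.

```python
def cpu_utilization_percent(usage_millicores, request_millicores):
    """Average CPU utilization across pods, as a whole percent of requested CPU."""
    total_usage = sum(usage_millicores)
    total_request = sum(request_millicores)
    # Integer division: fractions of a percent are dropped (assumption).
    return (total_usage * 100) // total_request


# One idle pod using 1m, with an assumed (hypothetical) request of 200m.
print(cpu_utilization_percent([1], [200]))    # 0
# The same pod under load, using 254m of its assumed 200m request.
print(cpu_utilization_percent([254], [200]))  # 127
```

With these assumed numbers, an idle pod at 1m reads as 0%, which is consistent with the 0%/50% shown by kubectl get hpa.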

With the load-generator pod running, watch the HPA react:

root@master-node:~# kubectl get hpa --watch
NAME         REFERENCE               TARGETS         MINPODS   MAXPODS   REPLICAS   AGE
php-apache   Deployment/php-apache   <unknown>/50%   1         5         1          10m
php-apache   Deployment/php-apache   <unknown>/50%   1         5         1          13m
php-apache   Deployment/php-apache   <unknown>/50%   1         5         1          17m
php-apache   Deployment/php-apache   <unknown>/50%   1         5         1          18m
php-apache   Deployment/php-apache   0%/50%          1         5         1          26m
php-apache   Deployment/php-apache   127%/50%        1         5         1          28m
php-apache   Deployment/php-apache   213%/50%        1         5         3          28m
php-apache   Deployment/php-apache   126%/50%        1         5         5          28m
php-apache   Deployment/php-apache   98%/50%         1         5         5          28m

The HPA has now scaled the Deployment up to its maximum of 5 replicas.
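The replica counts in the watch output can be reproduced with the ratio formula plus the min/max bounds set on the HPA. The following is a rough sketch of that arithmetic only; the function name `next_replicas` is made up for the example, and the real controller applies further checks (tolerance, pod readiness, scaling policies) that are left out here.

```python
import math


def next_replicas(current, utilization, target, min_replicas=1, max_replicas=5):
    """Ratio formula followed by clamping to the HPA's min/max bounds."""
    raw = math.ceil(current * utilization / target)
    return max(min_replicas, min(max_replicas, raw))


# 1 replica at 127% against a 50% target: ceil(2.54) = 3
print(next_replicas(1, 127, 50))  # 3
# 3 replicas at 213%: ceil(12.78) = 13, capped at maxReplicas = 5
print(next_replicas(3, 213, 50))  # 5
```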

HPA scaling behavior:

The behavior field (available in the autoscaling/v2 API) controls how quickly the HPA may scale up or down once the metrics call for a change.

The following example limits the rate of scaling down:

behavior:
  scaleDown:
    policies:
    - type: Pods
      value: 4
      periodSeconds: 60
    - type: Percent
      value: 10
      periodSeconds: 60

The scaleDown behavior above has two policies:

1) Remove at most 4 pods every 60 seconds.

2) Remove at most 10% of the current replicas every 60 seconds.

When more than one policy is defined, the HPA by default picks the policy that allows the largest change in replica count.
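To see which of the two policies wins at a given size, the selection can be sketched as follows. This is an illustration under assumptions, not the controller's implementation: the function name `max_scale_down` is hypothetical, and rounding the percentage up is an assumption based on how the Kubernetes documentation describes this example.

```python
import math


def max_scale_down(current_replicas, policies, select_policy="Max"):
    """Largest number of pods that may be removed in one period."""
    allowed = []
    for policy in policies:
        if policy["type"] == "Pods":
            allowed.append(policy["value"])
        elif policy["type"] == "Percent":
            # Percentage of the current replicas, rounded up (assumption).
            allowed.append(math.ceil(current_replicas * policy["value"] / 100))
        else:
            raise ValueError("unknown policy type: " + str(policy["type"]))
    # "Max" picks the policy allowing the biggest change; "Min" the smallest.
    return max(allowed) if select_policy == "Max" else min(allowed)


policies = [
    {"type": "Pods", "value": 4, "periodSeconds": 60},
    {"type": "Percent", "value": 10, "periodSeconds": 60},
]
print(max_scale_down(80, policies))  # 10% of 80 = 8 beats 4 -> 8
print(max_scale_down(20, policies))  # 10% of 20 = 2, so Pods wins -> 4
```

With these two policies, the Percent policy only takes over once there are more than 40 replicas; below that, the fixed limit of 4 pods is the larger allowance.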

Stabilization Window:

The stabilization window prevents the replica count from flapping when the metrics keep fluctuating.

behavior:
  scaleDown:
    stabilizationWindowSeconds: 300
    policies:
    - type: Pods
      value: 5
      periodSeconds: 120

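With stabilizationWindowSeconds: 300, the HPA looks at the desired replica counts computed over the previous 5 minutes and acts on the highest of them, so a brief dip in load does not trigger a scale down. Once it does scale down, the policy allows removing at most 5 pods every 2 minutes.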
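The effect of the stabilization window on scaling down can be sketched like this. It is a simplified model under assumptions: the function name, the (timestamp, replicas) history format, and the sample values are all made up for the example and do not come from a real cluster.

```python
def stabilized_scale_down(recommendations, now, window_seconds=300):
    """Pick the highest recommendation seen inside the stabilization window.

    recommendations: list of (timestamp_seconds, desired_replicas) tuples.
    """
    recent = [
        replicas
        for timestamp, replicas in recommendations
        if now - timestamp <= window_seconds
    ]
    if not recent:
        raise ValueError("no recommendations inside the window")
    return max(recent)


# Hypothetical history: load drops steadily over six minutes.
history = [(0, 10), (120, 6), (240, 4), (360, 3)]
print(stabilized_scale_down(history, now=360))  # window holds 6, 4, 3 -> 6
print(stabilized_scale_down(history, now=600))  # only (360, 3) remains -> 3
```

The latest recommendation at t=360 is 3 replicas, but the HPA would still hold at 6 until the higher values age out of the 5-minute window.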
