Kubernetes Pod Priority and Preemption
Pod priority indicates the importance of a pod relative to other pods, and the scheduler queues pending pods based on that priority.
Pod preemption allows the cluster to evict, or preempt, lower-priority pods so that higher-priority pods can be scheduled if there is no available space on a suitable node. Pod priority also affects the scheduling order of pods and out-of-resource eviction ordering on the node.
Priority classes help you steer the Kubernetes scheduler's decisions so that it favors higher-priority pods over lower-priority pods.
The scheduler can even preempt (remove) running lower-priority pods so that pending higher-priority pods can be scheduled.
By setting pod priority, you can help prevent lower-priority workloads from impacting critical workloads in your cluster, especially when the cluster starts to reach its resource capacity.
root@masterk8s:~# kubectl describe pod kube-scheduler-masterk8s -n kube-system | grep -i priority
Priority: 2000001000
Priority Class Name: system-node-critical
root@masterk8s:~#
By default, Kubernetes (and OKD) provides two reserved priority classes that give critical system pods guaranteed scheduling.
* system-node-critical:
This priority class has a value of 2000001000 and is used for all pods that should never be evicted from a node.
* system-cluster-critical:
This priority class has a value of 2000000000 (two billion) and is used with pods that are important for the cluster.
Pods with this priority class can be evicted from a node in certain circumstances.
For example, pods configured with the system-node-critical priority class can take priority. However, this priority class does ensure guaranteed scheduling.
A user-defined priority class object can take any 32-bit integer value smaller than or equal to 1000000000 (one billion).
Values larger than one billion are reserved for the built-in system priority classes, which cover critical pods that should not be preempted or evicted.
How to use priority and preemption?
You apply pod priority and preemption by creating PriorityClass objects and associating pods with a priority class through the "priorityClassName" field in the pod specification.
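As a minimal sketch of that association, the manifest below shows a standalone pod that references a priority class by name. It assumes a PriorityClass called high-priority already exists in the cluster; the pod name priority-demo and the container name are placeholders chosen for illustration.

```yaml
# Hypothetical pod for illustration; "priority-demo" is not part of the walkthrough.
apiVersion: v1
kind: Pod
metadata:
  name: priority-demo
spec:
  # Must match the metadata.name of an existing PriorityClass;
  # otherwise the pod is rejected at admission.
  priorityClassName: high-priority
  containers:
  - name: nginx
    image: nginx
```

At admission time the class name is resolved to its integer value and stored in the pod's spec.priority field, which is the number the scheduler compares.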
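globalDefault: This field is false by default. When it is set to true, the value of the priority class is applied to pods that do not specify a priorityClassName; only one priority class in the cluster can have globalDefault set to true.
Adding a priority class with "globalDefault: true" affects only pods created after the priority class is added and does not change the priorities of existing pods.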
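A cluster-wide default could be sketched as follows. The class name default-priority and the value 10 are assumptions made for this example, not values used elsewhere in this post.

```yaml
# Hypothetical cluster-wide default class; name and value are illustrative only.
apiVersion: scheduling.k8s.io/v1
kind: PriorityClass
metadata:
  name: default-priority
value: 10
# Pods created without a priorityClassName receive this class's value
# instead of 0. Pods that already exist keep their current priority.
globalDefault: true
description: "Default priority for pods that do not name a priority class"
```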
root@masterk8s:/kube# kubectl get priorityclass
NAME VALUE GLOBAL-DEFAULT AGE
system-cluster-critical 2000000000 false 23d
system-node-critical 2000001000 false 23d
root@masterk8s:/kube#
Note: If you delete a PriorityClass, existing pods that use the name of the deleted PriorityClass remain unchanged, but you cannot create more pods that use that name.
Example:
Create two priority classes, low-priority and high-priority.
apiVersion: scheduling.k8s.io/v1
kind: PriorityClass
metadata:
  name: low-priority
value: 50
globalDefault: false
description: "Low-priority Pods"
---
apiVersion: scheduling.k8s.io/v1
kind: PriorityClass
metadata:
  name: high-priority
value: 100
globalDefault: false
description: "High-priority Pods"
root@masterk8s:/kube# kubectl get priorityclass
NAME VALUE GLOBAL-DEFAULT AGE
high-priority 100 false 3s
low-priority 50 false 2m47s
system-cluster-critical 2000000000 false 23d
system-node-critical 2000001000 false 23d
root@masterk8s:/kube#
Now let's create a deployment that uses the low-priority class:
root@masterk8s:/kube# cat low_prio_deployment.yml
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app: nginx-deployment
  name: nginx-deployment
spec:
  replicas: 10
  selector:
    matchLabels:
      app: nginx-deployment
  template:
    metadata:
      labels:
        app: nginx-deployment
    spec:
      priorityClassName: "low-priority"
      containers:
      - image: nginx
        name: nginx-deployment
        resources:
          limits:
            memory: 100Mi
root@masterk8s:/kube#
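The manifest above sets only a memory limit. When a container specifies a limit but no request, and no LimitRange or other admission-time default applies, Kubernetes copies the limit into the request, and the scheduler places pods based on requests. Under that assumption, the container's resources stanza is effectively equivalent to the fragment below, which is why ten replicas reserve roughly 1000Mi of node memory.

```yaml
# Effective resources for the nginx container, assuming no LimitRange
# supplies a different default request in the namespace.
resources:
  requests:
    memory: 100Mi   # copied from the limit because no request was given
  limits:
    memory: 100Mi
```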
root@masterk8s:/kube# kubectl apply -f low_prio_deployment.yml
deployment.apps/nginx-deployment created
root@masterk8s:/kube# kubectl get deployment nginx-deployment --watch
NAME READY UP-TO-DATE AVAILABLE AGE
nginx-deployment 0/10 10 0 3s
nginx-deployment 1/10 10 1 4s
nginx-deployment 2/10 10 2 6s
nginx-deployment 3/10 10 3 8s
nginx-deployment 4/10 10 4 12s
nginx-deployment 5/10 10 5 14s
nginx-deployment 6/10 10 6 18s
nginx-deployment 7/10 10 7 22s
nginx-deployment 8/10 10 8 24s
nginx-deployment 9/10 10 9 27s
nginx-deployment 10/10 10 10 29s
Next, define a second deployment that uses the high-priority class:
root@masterk8s:/kube# cat high_prio_deployment.yml
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app: nginx-deployment
  name: high-nginx-deployment
spec:
  replicas: 10
  selector:
    matchLabels:
      app: nginx-deployment
  template:
    metadata:
      labels:
        app: nginx-deployment
    spec:
      priorityClassName: "high-priority"
      containers:
      - image: nginx
        name: nginx-deployment
        resources:
          limits:
            memory: 100Mi
root@masterk8s:/kube#
Let's deploy the high-priority deployment.
root@masterk8s:/kube# kubectl apply -f high_prio_deployment.yml
deployment.apps/high-nginx-deployment created
root@masterk8s:/kube#
root@masterk8s:/kube# kubectl get deployment --watch
NAME READY UP-TO-DATE AVAILABLE AGE
high-nginx-deployment 10/10 10 10 2m3s
nginx-deployment 5/10 10 5 5m8s
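When the higher-priority deployment was created, the scheduler started preempting lower-priority pods on the nodes to make room. Five of the ten low-priority pods were evicted, and their replacements stay Pending: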
root@masterk8s:/kube# kubectl get pods
NAME READY STATUS RESTARTS AGE
high-nginx-deployment-768b657896-25pfd 1/1 Running 0 169m
high-nginx-deployment-768b657896-2nnvc 1/1 Running 0 169m
high-nginx-deployment-768b657896-2vzwj 1/1 Running 0 169m
high-nginx-deployment-768b657896-55s5p 1/1 Running 0 169m
high-nginx-deployment-768b657896-5jhfp 1/1 Running 0 169m
high-nginx-deployment-768b657896-5vk7x 1/1 Running 0 169m
high-nginx-deployment-768b657896-d6lq9 1/1 Running 0 169m
high-nginx-deployment-768b657896-gnvn9 1/1 Running 0 169m
high-nginx-deployment-768b657896-mbfm7 1/1 Running 0 169m
high-nginx-deployment-768b657896-rgpkr 1/1 Running 0 169m
nginx-deployment-54f6864c7b-2mnkk 0/1 Pending 0 169m
nginx-deployment-54f6864c7b-g6r98 1/1 Running 0 172m
nginx-deployment-54f6864c7b-gqlzp 1/1 Running 0 172m
nginx-deployment-54f6864c7b-j6lvc 1/1 Running 0 172m
nginx-deployment-54f6864c7b-lfghw 0/1 Pending 0 169m
nginx-deployment-54f6864c7b-mhbhc 0/1 Pending 0 169m
nginx-deployment-54f6864c7b-ngqcp 0/1 Pending 0 169m
nginx-deployment-54f6864c7b-p6289 1/1 Running 0 172m
nginx-deployment-54f6864c7b-ssqdn 1/1 Running 0 172m
nginx-deployment-54f6864c7b-tmlcp 0/1 Pending 0 169m
root@masterk8s:/kube#
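Preemption is the default behaviour, but a priority class can opt out of it. With preemptionPolicy set to Never, pods in the class still move ahead of lower-priority pods in the scheduling queue, but they wait for free capacity instead of evicting running pods. A minimal sketch, where the class name high-priority-nonpreempting is a hypothetical name chosen for illustration:

```yaml
# Hypothetical non-preempting variant of the high-priority class.
apiVersion: scheduling.k8s.io/v1
kind: PriorityClass
metadata:
  name: high-priority-nonpreempting
value: 100
# The default is PreemptLowerPriority; Never disables eviction of running
# pods on behalf of pods in this class.
preemptionPolicy: Never
globalDefault: false
description: "High priority without preempting running pods"
```

Had the second deployment used a class like this, its pods would presumably have stayed Pending rather than displacing the low-priority pods.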
Let's delete the high-priority deployment. Once its pods are gone, the pending low-priority pods are scheduled again.
root@masterk8s:/kube# kubectl delete deployment high-nginx-deployment
deployment.apps "high-nginx-deployment" deleted
root@masterk8s:/kube# kubectl get deployment --watch
NAME READY UP-TO-DATE AVAILABLE AGE
nginx-deployment 5/10 10 5 173m
nginx-deployment 6/10 10 6 173m
nginx-deployment 7/10 10 7 173m
nginx-deployment 8/10 10 8 173m
nginx-deployment 9/10 10 9 173m
nginx-deployment 10/10 10 10 173m
root@masterk8s:/kube# kubectl get pods
NAME READY STATUS RESTARTS AGE
nginx-deployment-54f6864c7b-2mnkk 1/1 Running 0 171m
nginx-deployment-54f6864c7b-g6r98 1/1 Running 0 174m
nginx-deployment-54f6864c7b-gqlzp 1/1 Running 0 174m
nginx-deployment-54f6864c7b-j6lvc 1/1 Running 0 174m
nginx-deployment-54f6864c7b-lfghw 1/1 Running 0 171m
nginx-deployment-54f6864c7b-mhbhc 1/1 Running 0 171m
nginx-deployment-54f6864c7b-ngqcp 1/1 Running 0 171m
nginx-deployment-54f6864c7b-p6289 1/1 Running 0 174m
nginx-deployment-54f6864c7b-ssqdn 1/1 Running 0 174m
nginx-deployment-54f6864c7b-tmlcp 1/1 Running 0 171m
root@masterk8s:/kube#
Pods without a PriorityClass have a priority of 0, unless a PriorityClass with globalDefault set to true exists, in which case they receive that class's value.