AWS - EKS Failover


 In this post, we will see how to do an AZ failover for an EKS Cluster.


Above is the typical multi-region failover setup, where we will have 2 DNS records:

Primary EKS cluster in US-EAST-1 with DNS east.eks.com

Secondary EKS cluster in US-WEST with DNS west.eks.com

This means we will have 2 A records for the address www.app.com.

DNS failover will happen whenever there is a health check failure.
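As a rough sketch of how those records could be set up, Route 53 failover routing ties each A record to a health check. The hosted zone ID, health check ID, and IP address below are placeholders, not values from a real setup; a matching record with "Failover": "SECONDARY" would be created for the west cluster.

# aws route53 change-resource-record-sets --hosted-zone-id Z0123456789ABC --change-batch file://primary-record.json

Where primary-record.json contains:

{
  "Changes": [{
    "Action": "UPSERT",
    "ResourceRecordSet": {
      "Name": "www.app.com",
      "Type": "A",
      "SetIdentifier": "primary-us-east-1",
      "Failover": "PRIMARY",
      "TTL": 60,
      "HealthCheckId": "<primary-health-check-id>",
      "ResourceRecords": [{ "Value": "<IP behind east.eks.com>" }]
    }
  }]
}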

But in this post, we will see how to achieve AZ-level resiliency within a single region. A region is a collection of AZs.

Let's imagine an EKS cluster created in a single region, us-east-1, with 2 AZs: us-east-1a and us-east-1b.


In this single-region architecture, we have 1 or 2 core nodes in each AZ (nodes where Kubernetes core components such as the kube-apiserver, kube-scheduler, etcd, and others run).

Similarly, we will have a set of worker nodes running in both AZs (US-EAST-1A and US-EAST-1B).

When we create a deployment with 2 replicas, one replica will be scheduled in each AZ:

1st copy in US-EAST-1A 
2nd copy in US-EAST-1B
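By default the scheduler only tries to spread replicas across zones; a topology spread constraint makes it explicit. Below is a minimal sketch (the deployment name "web", the label app=web, and the nginx image are made up for illustration; the zone label is the standard one EKS sets on its nodes):

# cat <<EOF | kubectl apply -f -
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web
spec:
  replicas: 2
  selector:
    matchLabels:
      app: web
  template:
    metadata:
      labels:
        app: web
    spec:
      topologySpreadConstraints:
        - maxSkew: 1
          topologyKey: topology.kubernetes.io/zone    # standard AZ label on EKS worker nodes
          whenUnsatisfiable: DoNotSchedule
          labelSelector:
            matchLabels:
              app: web
      containers:
        - name: web
          image: nginx:1.25
EOF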

This can be seen using:

# kubectl get pods -o wide
# kubectl get pods --field-selector spec.nodeName=node1

Now, how can we do the failover?

One important thing before we do the failover: we must ensure each deployment is running with a minimum of 2 replicas, otherwise during the failover we will have downtime while new pods are created in the other AZ.
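For example, the current replica counts can be checked, and any single-replica deployment scaled up, before starting (the deployment name "web" is just an illustration):

# kubectl get deployments -A
# kubectl scale deployment web --replicas=2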

Once that's validated, we can perform a TAINT and a DRAIN.

TAINT -> Marks the node so that no new workloads are scheduled on it; with the NoExecute effect used below, existing pods without a matching toleration are also evicted.

So, we TAINT all the worker nodes in US-EAST-1A using:

Eg:

# kubectl taint node us-east-1a-workernode key:NoExecute

root@kubemaster:~/.kube# kubectl describe node kubeclient1 | grep -i taint
Taints:             key:NoExecute
root@kubemaster:~/.kube#
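Rather than tainting node by node, all worker nodes in the AZ can be tainted in one command using the zone label as a selector (assuming the standard EKS topology label; the taint key and effect match the example above):

# kubectl taint nodes -l topology.kubernetes.io/zone=us-east-1a key:NoExecute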

Once all the worker nodes are TAINTED, we then need to DRAIN all the worker nodes in US-EAST-1A.

DRAIN -> Evicts the pods from the worker node so that their controllers recreate them on other available worker nodes. In our case, we drain all the US-EAST-1A worker nodes, which forces the workloads to be recreated on US-EAST-1B.

# kubectl drain us-east-1a-workernode  --ignore-daemonsets
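To drain every worker node in the AZ, a small loop over the same zone label works (a sketch; add --delete-emptydir-data if the pods use emptyDir volumes):

# for node in $(kubectl get nodes -l topology.kubernetes.io/zone=us-east-1a -o jsonpath='{.items[*].metadata.name}'); do kubectl drain "$node" --ignore-daemonsets; done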

Once all the worker nodes in us-east-1a are drained, you should see the workloads running on us-east-1b.

This can be verified with:

# kubectl get pods -o wide
# kubectl get pods --field-selector spec.nodeName=node1
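To map each node back to its AZ while checking, the zone label can be printed as an extra column (again assuming the standard EKS topology label):

# kubectl get nodes -L topology.kubernetes.io/zone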



