K8s - QnA
1. What are the key differences between a Deployment and a StatefulSet in Kubernetes?
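-> A Deployment targets stateless workloads whose pods are interchangeable; a StatefulSet targets stateful workloads and gives each pod its own persistent volume (through volumeClaimTemplates).
-> Deployment pod names end in a random suffix; StatefulSet pod names carry a stable ordinal (for example web-0, web-1).
-> Deployment pods are scaled in no particular order; StatefulSet pods are created in ordinal order (0 to N-1) and removed in reverse order (highest ordinal first).
-> A replaced Deployment pod gets a new name; a replaced StatefulSet pod keeps the same name and reattaches to the same volume.
-> Deployment rolling updates can replace several pods in parallel (bounded by maxSurge and maxUnavailable); StatefulSet rolling updates are ordered, one pod at a time, from the highest ordinal down.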
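To make the naming and storage behavior concrete, here is a minimal StatefulSet sketch. The names (web, web-headless, data), the image, and the storage size are illustrative assumptions, not taken from any real cluster.

```yaml
# Illustrative sketch only: names, image, and sizes are assumptions.
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: web
spec:
  serviceName: web-headless   # headless Service that gives each pod a stable DNS name
  replicas: 3                 # pods are created in order: web-0, web-1, web-2
  selector:
    matchLabels:
      app: web
  template:
    metadata:
      labels:
        app: web
    spec:
      containers:
        - name: web
          image: nginx:1.27
          volumeMounts:
            - name: data
              mountPath: /usr/share/nginx/html
  volumeClaimTemplates:       # one PVC per pod: data-web-0, data-web-1, data-web-2
    - metadata:
        name: data
      spec:
        accessModes: ["ReadWriteOnce"]
        resources:
          requests:
            storage: 1Gi
```

If web-1 is deleted, its replacement is again called web-1 and binds to the existing claim data-web-1.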
2. How would you safely perform a node upgrade in a Kubernetes cluster?
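-> Assuming the question refers to a worker node in the data plane.
-> Cordon the node first (kubectl cordon <node>) so it is marked SchedulingDisabled.
-> Drain the node (kubectl drain <node>) so its pods are evicted and their controllers recreate them on other nodes.
-> DaemonSet pods cannot be evicted this way, so pass --ignore-daemonsets while draining.
-> Upgrade the node, then run kubectl uncordon <node> to make it schedulable again.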
kubectl cordon: Prevent New Pods
- Purpose: Marks a node as unschedulable.
- Effect: No new pods will be scheduled on the node.
- Existing Pods: Continue running undisturbed.
- Use Case: When you want to stop future workloads from landing on a node but aren’t ready to evict current ones.
kubectl drain: Evict Existing Pods
- Purpose: Safely evicts all non-DaemonSet pods from the node.
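- Effect: Evicted pods are recreated on other nodes by their controllers (Deployment, StatefulSet, and so on); pods with no controller are not recreated.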
- Includes Cordon: Automatically cordons the node first if not already done.
- Use Case: Ideal before rebooting, upgrading, or decommissioning a node.
3. How do you handle Kubernetes manifest version mismatches across environments?
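-> Keep the manifests in a version control system (for example Git) integrated with the CI/CD pipeline, so every environment is deployed from a known revision.
-> On top of that, keep per-environment files such as dev.yaml, test.yaml, qa.yaml, and prod.yaml, so differences between environments are explicit and reviewable.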
4. What happens to a pod if the node it’s running on suddenly crashes?
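-> The kubelet on the crashed node stops sending heartbeats to the API server.
-> The kube-controller-manager on the control plane detects the missed heartbeats and treats the node as unhealthy.
-> kube-controller-manager is the control plane component that runs a collection of built-in controllers.
- Examples include:
- Deployment Controller
- ReplicaSet Controller
- Job Controller
- Node Controller ← Responsible for node management.
-> The Node Controller marks the node NotReady (Ready condition set to Unknown) and taints it as unreachable; the updated Node object is stored in etcd through the API server.
-> Pods on that node are evicted once their toleration for the taint expires (300 seconds by default), and controllers such as the ReplicaSet Controller create replacement pods. Pods not managed by a controller are not recreated.
-> kube-scheduler reads node information through the API server (not from etcd directly) and places the replacement pods on healthy nodes.
-> FYI, this behavior varies with affinities, tolerations, and the workload type.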
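How quickly pods leave a dead node is controlled by tolerations. Kubernetes adds tolerations for the not-ready and unreachable taints with tolerationSeconds: 300 to every pod by default; the sketch below overrides them so replacements start sooner. The Deployment name, image, and the 60-second value are assumptions chosen for illustration.

```yaml
# Illustrative sketch only: name, image, and the 60s value are assumptions.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web
spec:
  replicas: 3
  selector:
    matchLabels:
      app: web
  template:
    metadata:
      labels:
        app: web
    spec:
      containers:
        - name: web
          image: nginx:1.27
      tolerations:
        # Evict pods 60s after the node becomes unreachable (default is 300s).
        - key: node.kubernetes.io/unreachable
          operator: Exists
          effect: NoExecute
          tolerationSeconds: 60
        # Same window for a node that reports NotReady.
        - key: node.kubernetes.io/not-ready
          operator: Exists
          effect: NoExecute
          tolerationSeconds: 60
```

A shorter window speeds up recovery but also makes pods move on brief network blips, so the value is a trade-off.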
5. How do you configure and use an Admission Controller in Kubernetes?
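-> Admission controllers are plugins that run inside the API server.
-> They intercept API requests after authentication and authorization, but before the object is persisted.
-> Mutating admission controllers can modify the request; validating admission controllers accept or reject it. Together they enforce administrative and compliance rules.
-> Built-in admission controllers are turned on with the kube-apiserver flag --enable-admission-plugins; custom logic is plugged in through admission webhooks.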
6. What strategies would you use to minimize container startup time?
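-> Container startup time depends mostly on the image and the application, which are outside Kubernetes itself.
-> Keep the image lightweight (small base image, only the files needed at runtime).
-> Instead of baking every dependency into each image build, keep dependencies in separate, shared layers so nodes can reuse layers they have already cached.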
7. What are Mutating and Validating Webhooks in Kubernetes, and when would you use them?
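In short, both are HTTP callbacks that the API server invokes during admission. Mutating webhooks run first and may change the object (inject a sidecar, set defaults); validating webhooks run afterwards and can only accept or reject it (policy enforcement). A minimal registration sketch for a validating webhook follows; the webhook name, the Service (pod-policy-webhook in namespace policy-system), and the path are hypothetical, and the caBundle is left as a placeholder.

```yaml
# Illustrative sketch only: names, namespace, and path are hypothetical.
apiVersion: admissionregistration.k8s.io/v1
kind: ValidatingWebhookConfiguration
metadata:
  name: pod-policy.example.com
webhooks:
  - name: pod-policy.example.com
    admissionReviewVersions: ["v1"]
    sideEffects: None
    failurePolicy: Fail          # reject requests if the webhook is unreachable
    rules:
      - apiGroups: [""]
        apiVersions: ["v1"]
        operations: ["CREATE", "UPDATE"]
        resources: ["pods"]
    clientConfig:
      service:
        namespace: policy-system
        name: pod-policy-webhook
        path: /validate
      caBundle: <base64-encoded CA certificate>
```

A mutating webhook is registered the same way with kind: MutatingWebhookConfiguration.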
8. How does the kubectl drain command behave, and what does it do under the hood?
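-> kubectl drain first cordons the node (SchedulingDisabled), then evicts its pods through the Eviction API, which respects PodDisruptionBudgets.
-> The kubelet on the node handles pod termination and cleanup (SIGTERM, then SIGKILL after the grace period).
-> The unschedulable flag is set on the Node object through the API server and stored in etcd.
-> The scheduler avoids the cordoned node for new pods, and controllers recreate the evicted pods on other nodes.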
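Because drain evicts pods through the Eviction API, a PodDisruptionBudget can limit how many pods of an application go down at once during a drain. A minimal sketch, assuming a workload labelled app: web with at least three replicas (both are assumptions):

```yaml
# Illustrative sketch only: the label and the minAvailable value are assumptions.
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: web-pdb
spec:
  minAvailable: 2        # drain waits rather than drop below 2 ready pods
  selector:
    matchLabels:
      app: web
```

With this in place, kubectl drain retries an eviction until it can proceed without violating the budget.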
9. How do you troubleshoot slow image pulls in Kubernetes?
-> This varies from scenario to scenario.
-> Is the slowness happening for all images or only for specific images?
-> If it affects only one image, the issue is isolated to that image (for example its size or its registry).
-> If it affects all images, check whether the issue is on a specific node or on all nodes.
-> If it is a specific node, we can repave the node instead of troubleshooting it. Otherwise triage by checking CPU, memory, netstat, and iostat; ensure the filesystem is not full; check TCP buffer limits; check ENI stats and similar.
-> If the issue is on all nodes, triage at the networking layer.
-> Check whether the registry (where the image is stored) is reachable and whether there is measurable latency.
-> Engage the networking team to triage the path.
-> Check whether the issue is limited to nodes in a specific subnet.
10. What is an emptyDir volume and how does it behave during pod restarts?
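-> emptyDir is a volume type that starts as an empty directory on the host node and is mounted into the pod.
-> It lives as long as the pod runs on that node: a container restart inside the pod keeps the data, but when the pod is deleted or rescheduled the volume is deleted.
-> It is not persistent storage; a memory-backed emptyDir (medium: Memory) is also cleared on node reboot.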
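A minimal sketch of an emptyDir shared by two containers in one pod. The pod name, the busybox image, and the file path are assumptions for illustration.

```yaml
# Illustrative sketch only: names, image, and paths are assumptions.
apiVersion: v1
kind: Pod
metadata:
  name: scratch-demo
spec:
  containers:
    - name: writer
      image: busybox:1.36
      command: ["sh", "-c", "while true; do date >> /cache/out.log; sleep 5; done"]
      volumeMounts:
        - name: cache
          mountPath: /cache
    - name: reader
      image: busybox:1.36
      command: ["sh", "-c", "touch /cache/out.log; tail -f /cache/out.log"]
      volumeMounts:
        - name: cache
          mountPath: /cache
  volumes:
    - name: cache
      emptyDir:
        sizeLimit: 500Mi   # optional cap on how much the volume may hold
```

If the writer container crashes and restarts, out.log is still there; if the pod itself is deleted and recreated, the file is gone.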
11. Describe how you would use Kubernetes to run scheduled data pipelines.
-> Use "CronJob" resources to schedule pipeline tasks at fixed intervals; each run creates a Job, which runs the pipeline step in a pod.
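A minimal CronJob sketch for a nightly pipeline step. The job name, image (registry.example.com/etl:1.4.0), arguments, and schedule are hypothetical.

```yaml
# Illustrative sketch only: name, image, args, and schedule are hypothetical.
apiVersion: batch/v1
kind: CronJob
metadata:
  name: nightly-etl
spec:
  schedule: "0 2 * * *"          # every day at 02:00
  concurrencyPolicy: Forbid      # skip a run if the previous one is still going
  successfulJobsHistoryLimit: 3
  failedJobsHistoryLimit: 3
  jobTemplate:
    spec:
      backoffLimit: 2            # retry a failed run at most twice
      template:
        spec:
          restartPolicy: OnFailure
          containers:
            - name: etl
              image: registry.example.com/etl:1.4.0
              args: ["--date", "yesterday"]
```

concurrencyPolicy: Forbid is a common choice for pipelines where two overlapping runs would process the same data twice.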
12. How does Kubernetes handle IP address assignment for pods?
-> Kubernetes assigns IP addresses to Pods using a modular and extensible system built around CNI (Container Network Interface) plugins.
-> Kubernetes delegates networking to a CNI plugin (e.g., Calico, Flannel, Cilium), which allocates each pod a unique IP address when the pod is created.
13. Explain the differences between using ConfigMaps and environment variables for configuration.
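-> A ConfigMap stores key-value pairs as a separate object; it can be mounted into pods as a volume or injected as environment variables.
-> A ConfigMap is decoupled from the pod specification.
-> A ConfigMap can be updated independently. When it is mounted as a volume, the files are refreshed without a pod restart (the application still has to reread them).
-> Environment variables are defined as part of the pod spec.
-> They cannot be updated on the fly; a change takes effect only after the pod is restarted, even when the value comes from a ConfigMap.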
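The sketch below consumes one ConfigMap both ways, as an environment variable and as a mounted volume, to show the difference. The names (app-config, config-demo), keys, and values are assumptions.

```yaml
# Illustrative sketch only: names, keys, and values are assumptions.
apiVersion: v1
kind: ConfigMap
metadata:
  name: app-config
data:
  LOG_LEVEL: info
  app.properties: |
    cache.size=128
---
apiVersion: v1
kind: Pod
metadata:
  name: config-demo
spec:
  containers:
    - name: app
      image: busybox:1.36
      command: ["sh", "-c", "echo $LOG_LEVEL; cat /etc/app/app.properties; sleep 3600"]
      env:
        - name: LOG_LEVEL            # read once at container start
          valueFrom:
            configMapKeyRef:
              name: app-config
              key: LOG_LEVEL
      volumeMounts:
        - name: config               # files here are refreshed after an update
          mountPath: /etc/app
  volumes:
    - name: config
      configMap:
        name: app-config
```

After editing app-config, /etc/app/app.properties eventually shows the new content, while $LOG_LEVEL keeps its old value until the pod is recreated.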
14. What is a Resource Quota and how is it enforced in a namespace?
-> ResourceQuota is a K8s resource that limits aggregate resource consumption in a namespace.
-> Once the namespace is created, we configure a quota for it.
-> Limits such as the number of objects (pods, services, PVCs) and total CPU and memory requests/limits can be configured.
-> It is enforced by creating a resource of type "ResourceQuota" in the namespace; the API server then rejects any request that would exceed the quota.
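A minimal ResourceQuota sketch. The namespace (team-a) and every number are assumptions picked for illustration.

```yaml
# Illustrative sketch only: namespace and all limits are assumptions.
apiVersion: v1
kind: ResourceQuota
metadata:
  name: team-a-quota
  namespace: team-a
spec:
  hard:
    pods: "20"                    # max number of pods in the namespace
    persistentvolumeclaims: "10"
    requests.cpu: "4"             # sum of CPU requests across all pods
    requests.memory: 8Gi
    limits.cpu: "8"               # sum of CPU limits across all pods
    limits.memory: 16Gi
```

Once CPU or memory is under quota, pods in the namespace must declare the corresponding requests and limits, otherwise their creation is rejected.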
15. How do you use the Horizontal Pod Autoscaler with custom metrics?
-> The backbone of HPA is the Metrics Server.
-> CPU and memory are the standard metrics it collects.
-> Custom metrics can be collected with Prometheus and exposed to the HPA through a metrics adapter (for example Prometheus Adapter).
-> With custom metrics, HPA can scale on HTTP requests/sec, error rate, or queue length.
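A minimal HPA sketch that scales on a per-pod custom metric. It assumes a Deployment named web and a metric named http_requests_per_second already exposed through the custom metrics API by an adapter; both names and the target value are hypothetical.

```yaml
# Illustrative sketch only: target name, metric name, and values are hypothetical.
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: web-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: web
  minReplicas: 2
  maxReplicas: 10
  metrics:
    - type: Pods
      pods:
        metric:
          name: http_requests_per_second
        target:
          type: AverageValue
          averageValue: "100"    # aim for about 100 req/s per pod
```

The HPA adds replicas when the average across pods rises above the target and removes them when it falls below.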
16. What are Kubernetes finalizers and why might a resource be stuck in Terminating state because of one?
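-> Finalizers are keys in a resource's metadata that block deletion until cleanup is complete, protecting the resource from accidental termination.
-> A common example is persistent volumes and claims (kubernetes.io/pv-protection, kubernetes.io/pvc-protection).
-> If a finalizer is set, a delete request only marks the resource for deletion; it stays in Terminating until the responsible controller removes the finalizer entry. If that controller is gone or cannot finish its cleanup, the resource is stuck.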
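For illustration, this is roughly what the metadata of a claim stuck in Terminating looks like. The claim name is hypothetical and the timestamp is shown as a placeholder rather than a real value.

```yaml
# Illustrative fragment only: the name is hypothetical, the timestamp is a placeholder.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: data-web-0
  deletionTimestamp: "<set by the API server when delete was requested>"
  finalizers:
    - kubernetes.io/pvc-protection   # removed once no pod uses the claim
```

The object disappears only when the finalizers list is empty; removing an entry by hand skips whatever cleanup it was guarding, so it is a last resort.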
17. How would you expose multiple services in a single Ingress resource?
-> Using path-based or host-based routing rules, we can expose multiple services in a single Ingress resource.
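A minimal sketch combining both routing styles in one Ingress. The hostnames, Service names, ports, and the nginx ingress class are assumptions.

```yaml
# Illustrative sketch only: hosts, services, ports, and class are assumptions.
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: shop-ingress
spec:
  ingressClassName: nginx
  rules:
    - host: shop.example.com         # path-based routing under one host
      http:
        paths:
          - path: /api
            pathType: Prefix
            backend:
              service:
                name: api-svc
                port:
                  number: 8080
          - path: /
            pathType: Prefix
            backend:
              service:
                name: web-svc
                port:
                  number: 80
    - host: admin.example.com        # host-based routing to a third service
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: admin-svc
                port:
                  number: 80
```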
18. What’s the purpose of the --force flag in kubectl delete and when should it be used cautiously?
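-> The flag deletes the resource forcefully (typically with --grace-period=0), removing it from the API without waiting for the graceful shutdown sequence to finish.
-> Typical use: force deleting pods that are stuck in the Terminating or Unknown state.
-> Use it cautiously: the containers may still be running on the node, which is risky for StatefulSet pods and anything writing to shared storage.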
19. How does the terminationGracePeriodSeconds setting affect pod termination?
-> It is a pod-level setting that determines how long the kubelet waits after sending SIGTERM before it force-kills the containers with SIGKILL (default: 30 seconds).
-> spec:
     terminationGracePeriodSeconds: 60
-> This is useful when your application needs more time to shut down; it prevents the pod from being killed before the application has actually stopped.
-> The extra time can also be used to flush logs, close DB connections, and finish other cleanup.
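A minimal sketch of the setting in a full pod spec, combined with a preStop hook. The pod name, image, and the 10-second sleep are assumptions.

```yaml
# Illustrative sketch only: name, image, and the sleep duration are assumptions.
apiVersion: v1
kind: Pod
metadata:
  name: graceful-demo
spec:
  terminationGracePeriodSeconds: 60   # total budget before SIGKILL
  containers:
    - name: app
      image: nginx:1.27
      lifecycle:
        preStop:
          exec:
            # Runs before SIGTERM is sent; gives load balancers time to
            # stop routing traffic to this pod.
            command: ["sh", "-c", "sleep 10"]
```

The preStop hook's run time counts against the same 60-second budget, so here the application has about 50 seconds left to handle SIGTERM.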
20. What are some anti-patterns you've seen in Kubernetes resource definitions?
-> Hardcoding a replica count with no HPA for workloads whose load varies.
-> Omitting resource requests and limits.
-> Using the 'latest' image tag instead of a pinned version, which makes rollbacks unreliable.
-> Hardcoding credentials and IP addresses.
-> Using emptyDir as storage for stateful data.
-> Omitting readiness and liveness probes.
-> Keeping everything in one large YAML file instead of breaking it into modules.
-> Running privileged containers.
-> Managing the same infrastructure with multiple tools (IaC, EKS console, Helm), which leads to drift.
21. Describe how Kubernetes handles rolling back a failed deployment.
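-> Kubernetes does not automatically roll back a failed deployment; the rollout stalls while the remaining pods of the previous version keep running.
-> Each new version of a Deployment is tracked by a "revision" number (kubectl rollout history deployment/<name>).
-> The user can roll back to the previous version with kubectl rollout undo deployment/<name>, or to a specific one with --to-revision=<n>.
-> This can be automated with a shell script or a CI/CD pipeline, for example by running the undo when kubectl rollout status reports a failure.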
22. How can you use Network Policies to secure communication within a namespace?
-> Network policies (NP) act like a firewall for pods in K8s.
-> By default K8s applies no network isolation: every pod can talk to every other pod. Policies are enforced by the CNI plugin, so the plugin must support them.
-> NP can be used to control ingress and egress traffic between pods and namespaces.
-> NP can also control traffic between a pod and endpoints outside the cluster.
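A common pattern is to deny all ingress in the namespace and then allow only the flows that are needed. A minimal sketch, assuming a namespace called shop with pods labelled app: web and app: api, and an API port of 8080 (all assumptions):

```yaml
# Illustrative sketch only: namespace, labels, and port are assumptions.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-ingress
  namespace: shop
spec:
  podSelector: {}              # selects every pod in the namespace
  policyTypes: ["Ingress"]     # no ingress rules listed, so all ingress is denied
---
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-web-to-api
  namespace: shop
spec:
  podSelector:
    matchLabels:
      app: api                 # policy applies to the api pods
  policyTypes: ["Ingress"]
  ingress:
    - from:
        - podSelector:
            matchLabels:
              app: web         # only web pods in the same namespace may connect
      ports:
        - protocol: TCP
          port: 8080
```

Policies are additive: a connection is allowed if any policy selecting the destination pod permits it.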
23. How do you inspect pod logs for a job that has already completed?
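-> kubectl logs job/<job-name> (or kubectl logs <pod-name>). The pod of a completed job stays in the Completed state, so its logs remain available until the Job or pod is deleted.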