Posts

Apple Interview QnA - Part II

A pod in Kubernetes cannot reach an external API, but curl works fine from the node. What is your debugging flow?

Because the endpoint is reachable from the node where the pod is running, the problem most likely sits at the pod layer. A pod runs in its own network namespace and does not share the node’s network directly, so I would start with the checks below.

Check whether the endpoint resolves from inside the pod. This separates a DNS resolution problem from a network connectivity problem. If DNS fails, check the CoreDNS pods, which usually run as a Deployment (managed through a ReplicaSet) in the kube-system namespace rather than on every worker node. It is worth checking their health and resource consumption.

Let’s say DNS works fine and we get a timeout while connecting to the external API. This could be due to an egress network policy configured...
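The DNS-versus-connectivity split described above can be sketched in a few lines of Python run from inside the pod. This is only an illustration: the function name `diagnose_egress`, the result labels, and the default port and timeout are assumptions, not part of the original post.

```python
import socket


def diagnose_egress(host, port=443, timeout=3.0,
                    resolve=socket.getaddrinfo,
                    connect=socket.create_connection):
    """Classify an outbound connection attempt (hypothetical helper).

    resolve and connect are injectable so the logic can be exercised
    without real network access.
    """
    # Step 1: does the name resolve at all? A failure here points at DNS
    # (for example CoreDNS), not at the network path.
    try:
        infos = resolve(host, port, type=socket.SOCK_STREAM)
    except socket.gaierror:
        return "dns_failure"

    # Step 2: DNS works, so try a TCP connection to the first address.
    sockaddr = infos[0][4]
    try:
        conn = connect(sockaddr[:2], timeout=timeout)
    except (socket.timeout, TimeoutError):
        # Silent drops usually look like timeouts (e.g. an egress policy).
        return "connect_timeout"
    except ConnectionRefusedError:
        return "connection_refused"
    except OSError:
        return "network_error"
    conn.close()
    return "ok"
```

A `dns_failure` result suggests looking at CoreDNS first, while a `connect_timeout` with working DNS suggests looking at network policies or other egress controls.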

Apple Interview QnA - Part I

A server’s CPU is pegged at 100% but top shows no process consuming that much. How do you debug?

When the CPU is at 100%, top alone is of limited use: it needs CPU and memory itself to collect statistics, and here it shows no obvious culprit. We can start by running sar or vmstat to see how the CPU time is split. CPU stats are divided into %user, %system, %idle and %iowait.

If %user is high, an application is consuming the CPU. I would then filter the processes using ps -ef <followed_by_cpu_flags> to find which process is consuming the most CPU. If the process is owned by a non-root application, we can restart the application. If the process is owned by root, we can dig into the logs to see what is causing the issue. If nothing else works, the fallback is to reboot the server.

If %iowait is high, the CPU is waiting for I/O to complete. Check mpstat -P ALL 1 to see per-core usage and %ir...
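To make the %user / %system / %idle / %iowait split concrete, here is a minimal sketch that derives the same kind of breakdown from two snapshots of the aggregate `cpu` line in `/proc/stat` on Linux. The helper names are hypothetical, and tools such as sar or vmstat may group the counters slightly differently.

```python
# Counter order of the aggregate "cpu" line in /proc/stat (Linux).
FIELDS = ("user", "nice", "system", "idle", "iowait", "irq", "softirq", "steal")


def parse_cpu_line(line):
    """Turn 'cpu  u n s i w q sq st ...' into a dict of tick counters."""
    parts = line.split()
    return dict(zip(FIELDS, map(int, parts[1:1 + len(FIELDS)])))


def cpu_percentages(before, after):
    """Share of CPU time per category between two snapshots, in percent."""
    deltas = {name: after[name] - before[name] for name in FIELDS}
    total = sum(deltas.values())
    if total <= 0:
        return {name: 0.0 for name in FIELDS}
    return {name: 100.0 * ticks / total for name, ticks in deltas.items()}


def read_cpu_line(path="/proc/stat"):
    """Read the aggregate cpu line; only works on Linux."""
    with open(path) as handle:
        return handle.readline()
```

On a Linux host, calling `parse_cpu_line(read_cpu_line())` twice about a second apart and passing both results to `cpu_percentages` shows whether the time is going to user code, the kernel, or I/O wait.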

Availability vs Consistency

Let’s understand a few terms from the question.

Availability refers to a system or application that is online, with uptime always guaranteed (for example, a 100% success rate for the ping command). Reliability refers to the ability of the system or application to perform the work it is intended to do (for example, responding with an HTTP 200 to every request). Stale data refers to data that is old or not up to date. Let’s say I updated my WhatsApp profile picture, but my contacts still see the old picture for a while. This is called “eventual consistency”. Strong consistency means the data is the same whenever it is queried. Let’s say I send $100 to my friend; my bank balance should immediately reflect the debit and show the updated balance.

Depending on the critical user journey of the application, we can combine the above terms to build the application. In an e-commerce application, keeping the application always up matters more than strong consistency. User-facing is the f...
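The difference between a stale (eventually consistent) read and a strongly consistent read can be illustrated with a toy store that has one primary and one lagging replica. The class and method names are invented for this sketch and do not correspond to any real database.

```python
from collections import deque


class ReplicatedStore:
    """Toy key-value store: one primary, one asynchronously updated replica."""

    def __init__(self):
        self.primary = {}
        self.replica = {}
        self._pending = deque()  # writes not yet applied to the replica

    def write(self, key, value):
        # Writes land on the primary immediately and reach the replica later.
        self.primary[key] = value
        self._pending.append((key, value))

    def read_eventual(self, key):
        # Served by the replica: always available, but possibly stale.
        return self.replica.get(key)

    def read_strong(self, key):
        # Served by the primary: always reflects the latest write.
        return self.primary.get(key)

    def replicate(self):
        # Apply all pending writes to the replica, in order.
        while self._pending:
            key, value = self._pending.popleft()
            self.replica[key] = value
```

Until `replicate()` runs, `read_eventual` keeps returning the old profile picture while `read_strong` already returns the new one, which is the trade-off the bank-balance example cannot tolerate.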

Circuit Breaker - SDLC

It’s a software design best practice for avoiding cascading failures when an application cannot process a request or fails for some reason. Let’s say service A calls service B. When service B has issues, a calling service like A gets 5xx errors, and this can have cascading effects on the services that invoked A. To avoid this, it’s good to use the circuit breaker pattern. When calls to a service fail 3 times in a row, we open the “circuit breaker”:
• Service B no longer receives requests from service A; the calls are short-circuited.
• Requests can be queued and processed later, once service B has recovered.
• Requests can fail over to some other service.
• The breaker can go half-open after a certain amount of time to see whether the service has recovered.

Simple Python Program Implementing Circuit Breaker: The program uses the random.random function to generate numbers between 0.0 and 1.0. If the generated number is less than 0.5 3 times, it calls another function ...
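The open, half-open and closed states described above can be sketched as a small wrapper around the outgoing call. This is not the program from the post (which is cut off here and uses random.random to simulate failures); it is an independent minimal sketch, and the class name, the 3-failure threshold default, and the 5-second reset timeout are assumptions.

```python
import time


class CircuitOpenError(Exception):
    """Raised when a call is skipped because the circuit is open."""


class CircuitBreaker:
    def __init__(self, failure_threshold=3, reset_timeout=5.0,
                 clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.reset_timeout = reset_timeout
        self.clock = clock          # injectable so tests need no sleeping
        self.state = "closed"
        self.failures = 0
        self.opened_at = None

    def call(self, func, *args, **kwargs):
        if self.state == "open":
            if self.clock() - self.opened_at >= self.reset_timeout:
                # Let one trial call through to probe for recovery.
                self.state = "half_open"
            else:
                raise CircuitOpenError("circuit is open; call skipped")
        try:
            result = func(*args, **kwargs)
        except Exception:
            self._on_failure()
            raise
        self._on_success()
        return result

    def _on_failure(self):
        self.failures += 1
        # A failed probe, or too many consecutive failures, opens the circuit.
        if self.state == "half_open" or self.failures >= self.failure_threshold:
            self.state = "open"
            self.opened_at = self.clock()

    def _on_success(self):
        self.failures = 0
        self.state = "closed"
```

A caller would catch `CircuitOpenError` and then queue the request or fail over to another service, as in the bullet points above.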

K8s - QnA

1. What are the key differences between a Deployment and a StatefulSet in Kubernetes?
-> A Deployment is meant for stateless workloads (nothing stored); a StatefulSet is meant for workloads that keep data in persistent volumes.
-> Deployment pod names do not follow an order; StatefulSet pod names follow an ordinal sequence (0, 1, 2, and so on).
-> Deployment pods scale in no particular order; StatefulSet pods scale in strict order (up from the lowest ordinal, down from the newest to the oldest).
-> A replaced Deployment pod gets a new name; a replaced StatefulSet pod keeps the same name.
-> Deployment rolling updates can be fast and parallel; StatefulSet updates are ordered and controlled.

2. How would you safely perform a node upgrade in a Kubernetes cluster?
-> Assuming the question refers to a node in the data plane.
-> Cordon the node first to mark it as SchedulingDisabled.
-> Drain the node so that its pods are recreated on other nodes.
-> While draining, we can ignore DaemonSet pods (--ignore-daemonsets).

kubectl cordon: Prevent New Pods
- Purpose: Marks a node as unschedulable.
- Effect: No new pods will be scheduled on the node.
- Ex...
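The cordon-then-drain sequence from question 2 could be scripted roughly as follows. The helper names are hypothetical, and the final uncordon step is an addition that is not in the excerpt above: it is the usual way to make the node schedulable again after the upgrade.

```python
import subprocess


def drain_commands(node):
    """kubectl commands to run before upgrading a worker node."""
    return [
        # Mark the node unschedulable so no new pods land on it.
        ["kubectl", "cordon", node],
        # Evict existing pods; DaemonSet-managed pods are left in place.
        ["kubectl", "drain", node, "--ignore-daemonsets"],
    ]


def uncordon_command(node):
    """kubectl command to run after the upgrade (assumed follow-up step)."""
    return ["kubectl", "uncordon", node]


def run_all(commands, runner=subprocess.run):
    """Run each command in order and stop at the first failure."""
    for command in commands:
        runner(command, check=True)
```

For example, `run_all(drain_commands("worker-1"))` before the upgrade and `run_all([uncordon_command("worker-1")])` afterwards, where `worker-1` is a placeholder node name.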

AWS - Immutable Deployment

The AWS service most commonly associated with immutable deployments is AWS Elastic Beanstalk. Immutable deployment is often confused with the blue/green deployment strategy. The word immutable means something that is unchanging over time or unable to be changed. It’s often used to describe things that are fixed, permanent, or resistant to modification.

Let's say your application is deployed as below. We have a load balancer with a target group attached to an Auto Scaling group (ASG), and there are 3 instances serving traffic for application version 1. Now you want to deploy application version 2 without any downtime using the immutable deployment strategy.

This starts by creating another ASG with one instance running application version 2. At this moment, there are 4 instances in total, and the load balancer routes traffic to all 4 instances in round-robin fashion. Once the new instance looks good, the ASG count on v2 is increased by 2 and the ASG count on v1 is s...

AWS - Serving S3 Static Content Via CDN

Amazon CloudFront is a content delivery network (CDN) service. You can speed up the delivery of static files over HTTP or HTTPS. Each CloudFront distribution has a unique cloudfront.net domain name that can be used to reference objects through the global network of edge locations. You can also monitor and receive notifications on the operational performance of CloudFront distributions using CloudWatch, and track trends in data transfer and requests by checking the usage charts.

Let's start by creating an S3 bucket and uploading an image. Now, let's create a CDN. Next, I select S3 as my origin. Origin refers to where my actual content lives. Select the bucket we created earlier. Next, enable the option "Allow private S3 bucket access to CloudFront". This updates the S3 bucket policy so that the S3 objects are accessible only through the CDN. We are do...
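For reference, the bucket policy that restricts reads to a single CloudFront distribution typically looks like the one built below. This sketch assumes the distribution uses origin access control; the function name and the example bucket, account and distribution values are placeholders, and the exact policy the console writes may differ in details such as the statement ID.

```python
import json


def cloudfront_read_policy(bucket, account_id, distribution_id):
    """Build an S3 bucket policy allowing reads only via one distribution."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "AllowCloudFrontServicePrincipalReadOnly",
                "Effect": "Allow",
                # Only the CloudFront service may read objects...
                "Principal": {"Service": "cloudfront.amazonaws.com"},
                "Action": "s3:GetObject",
                "Resource": f"arn:aws:s3:::{bucket}/*",
                # ...and only on behalf of this specific distribution.
                "Condition": {
                    "StringEquals": {
                        "AWS:SourceArn": (
                            f"arn:aws:cloudfront::{account_id}"
                            f":distribution/{distribution_id}"
                        )
                    }
                },
            }
        ],
    }


# Placeholder values for illustration only.
policy_json = json.dumps(
    cloudfront_read_policy("my-static-bucket", "111122223333", "EDFDVBD6EXAMPLE"),
    indent=2,
)
```

With a policy of this shape in place, direct requests to the S3 object URL are denied while requests through the distribution's cloudfront.net domain succeed.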