Posts

Showing posts from September, 2025

Apple Interview QnA - Part II

A pod in Kubernetes cannot reach an external API, but curl works fine from the node. What is your debugging flow? This points to the pod layer, because the endpoint is reachable from the node where the pod is running. Unless it uses hostNetwork, a pod runs in its own network namespace and does not share the node’s network stack. So, I would start with the checks below: Check whether the endpoint resolves from inside the pod. This tells us whether we are facing a DNS resolution issue or a network connectivity issue. If DNS fails, check the CoreDNS pods, which typically run as a Deployment (backed by a ReplicaSet) in the kube-system namespace rather than on every worker node. It is worth checking their health and resource consumption. Let’s say DNS works fine and we get a timeout while connecting to the external API. This could be due to the network policy (egress) configured...
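The first two checks above (does the name resolve, and does the connection succeed) can be sketched as a small probe to run from inside the pod. This is only an illustration: the function name `classify_connectivity` and its return labels are made up for this example, and the `resolve` and `connect` parameters exist only so the logic can be exercised without a real network.

```python
import socket

def classify_connectivity(host, port=443, timeout=3.0,
                          resolve=socket.getaddrinfo,
                          connect=socket.create_connection):
    """Return 'dns_failure', 'connect_failure', or 'ok' for host:port."""
    try:
        # Step 1: name resolution only. A failure here points at DNS
        # (for example CoreDNS), not at egress rules.
        resolve(host, port, proto=socket.IPPROTO_TCP)
    except socket.gaierror:
        return "dns_failure"
    try:
        # Step 2: TCP handshake. A timeout or refusal here, with DNS working,
        # points at network policy, routing, or a firewall.
        conn = connect((host, port), timeout=timeout)
    except OSError:
        return "connect_failure"
    conn.close()
    return "ok"
```

Calling `classify_connectivity("api.example.com")` (a placeholder hostname) from the pod and again from the node shows at which step the two environments diverge.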

Apple Interview QnA - Part I

A server’s CPU is pegged at 100% but top shows no process consuming that much. How do you debug? When the CPU is at 100%, there is little point in running top, because the top command itself needs CPU and memory to fetch the statistics. We can start by running # sar or # vmstat to check CPU performance. CPU stats are divided into %user, %system, %idle and %iowait. If the CPU% is high on the user side, an application is consuming the CPU time. Then, I would filter for the process that is consuming the most CPU using # ps -ef <followed_by_cpu_flags>. If the process is owned by a non-root application, we can restart the application. If the process is owned by root, then we can dig into the logs to see what is causing the issue. If that does not help, the fallback would be to reboot the server. If the CPU% is high on the iowait side, the CPU is waiting for I/O to complete. Check mpstat -P ALL 1 to see per-core usage and %ir...
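To make the user / system / idle / iowait split concrete, here is a minimal sketch that computes the percentages from two samples of the aggregate `cpu` line in `/proc/stat` on Linux. Tools such as sar and vmstat read the same kernel counters. The function name `cpu_breakdown` is invented for this example.

```python
def cpu_breakdown(before, after):
    """Percent of CPU time per state between two 'cpu' lines from /proc/stat."""
    # Field order of the aggregate line after the 'cpu' label.
    names = ["user", "nice", "system", "idle", "iowait", "irq", "softirq", "steal"]
    first = [int(x) for x in before.split()[1:9]]
    second = [int(x) for x in after.split()[1:9]]
    # The counters are cumulative, so only the difference between samples matters.
    deltas = [b - a for a, b in zip(first, second)]
    total = sum(deltas)
    if total == 0:
        return {name: 0.0 for name in names}
    return {name: 100.0 * d / total for name, d in zip(names, deltas)}
```

A high `iowait` share with a low `user` share suggests looking at disks or network storage instead of hunting for a busy process.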

Availability vs Consistency

Let’s understand a few terms from the question. Availability refers to a system or application being online, with uptime always guaranteed (100% success rate to ping command). Reliability refers to the ability of the system or application to perform the work that it is intended to do (responding with a 200 HTTP response for every request). Stale data refers to data which is old or not up to date. Let’s say I updated my WhatsApp profile picture, but my contacts are still seeing the old picture. They are seeing stale data, which is what “eventual consistency” allows. Strong consistency means the data is the same whenever it is queried. Let’s say I send $100 to my friend; my bank balance should then immediately reflect the debited amount and show the updated balance. Depending on the critical user journey of the application, we can combine the above terms to build the application. In an e-commerce application, it is more important to keep the application always up than to have strongly consistent data. User-facing is the f...
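The difference between a stale read and a strongly consistent read can be illustrated with a toy model. This is a sketch under simplifying assumptions: one primary, one replica, and an explicit `replicate()` step standing in for replication lag. `ToyStore` and its methods are hypothetical names, not any real database API.

```python
class ToyStore:
    """One primary and one lagging replica, to illustrate stale reads."""

    def __init__(self):
        self.primary = {}
        self.replica = {}
        self.pending = []  # writes not yet copied to the replica

    def write(self, key, value):
        # Writes always land on the primary first.
        self.primary[key] = value
        self.pending.append((key, value))

    def read(self, key, strong=False):
        # A strong read goes to the primary; otherwise the replica answers
        # and may return stale data until replicate() has run.
        source = self.primary if strong else self.replica
        return source.get(key)

    def replicate(self):
        # Apply queued writes in order; afterwards the replica has caught up.
        for key, value in self.pending:
            self.replica[key] = value
        self.pending.clear()
```

In this model a profile picture can be served from the replica, while a bank balance would be read with `strong=True`.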