Posts

Showing posts from December, 2025

SRE Interview Questions and Answers - Part II

What is latency and how do you reduce it? Latency is the time taken to respond to a request, that is, the time the application spends processing it. It can be measured for a user request, an application-to-application request, or an application-to-data-store request (for example S3, a database, or Redshift). A latency issue can have various causes. 1) A client-side issue is fairly straightforward to identify: our internal telemetry and observability look GREEN. 2) Possible causes of a client-side issue include the ISP, the user agent (such as the browser), and geolocation-based issues. If it is a server-side issue, it is better to start by probing to pinpoint the problem: Let's say one of the microservices called "inventory" ...
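
As a rough illustration of what "measuring latency" can look like in practice, here is a minimal sketch that times repeated calls and summarises them with nearest-rank percentiles. The helper names (`measure_latency_ms`, `percentile`) and the `fake_request` stand-in are assumptions made for this example; they do not come from the post, and a real service would normally read these numbers from its telemetry stack.

```python
import math
import time


def measure_latency_ms(fn, samples=50):
    """Call fn repeatedly; return each call's wall-clock duration in milliseconds."""
    durations = []
    for _ in range(samples):
        start = time.perf_counter()
        fn()
        durations.append((time.perf_counter() - start) * 1000.0)
    return durations


def percentile(values, pct):
    """Nearest-rank percentile: the smallest sample with at least pct% of samples at or below it."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def fake_request():
    # Hypothetical stand-in for a real call (HTTP request, database query, S3 read).
    time.sleep(0.001)


if __name__ == "__main__":
    samples = measure_latency_ms(fake_request, samples=20)
    print("p50 ms:", round(percentile(samples, 50), 2))
    print("p95 ms:", round(percentile(samples, 95), 2))
```

Percentiles such as p95 are usually more telling than the average, because a handful of slow requests can hide behind a healthy-looking mean.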

SRE Interview Questions and Answers - Part I

What is SRE and how is it different from DevOps? SRE stands for Site Reliability Engineering, which primarily focuses on managing the application and its infrastructure in PRODUCTION. SREs aim to improve the reliability and resiliency of applications, improve monitoring and observability, apply a SHIFT LEFT approach to address issues at the development stage of the software, and monitor the promised SLAs, SLOs, and SLIs. They approach every problem from a software development perspective, identify and eliminate toil, focus on automation and runbooks to improve the reliability and resiliency of applications and systems, and take part in Root Cause Analysis and postmortem calls after a major incident. What are...
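
To make the SLI and SLO terms above a little more concrete, here is a minimal sketch of how an availability SLI and the remaining error budget could be computed from request counts. The function names and the sample numbers are assumptions for illustration only, not something taken from the post.

```python
def availability_sli(good_requests, total_requests):
    """SLI: fraction of requests that were served successfully."""
    if total_requests == 0:
        return 1.0  # no traffic, so nothing has failed
    return good_requests / total_requests


def error_budget_remaining(good_requests, total_requests, slo_target):
    """Fraction of the error budget still unspent (negative means the SLO is breached)."""
    allowed_failures = (1 - slo_target) * total_requests
    actual_failures = total_requests - good_requests
    if allowed_failures == 0:
        return 1.0 if actual_failures == 0 else 0.0
    return 1 - actual_failures / allowed_failures


if __name__ == "__main__":
    # Hypothetical month: 1,000,000 requests, 500 failures, 99.9% availability SLO.
    good, total, slo = 999_500, 1_000_000, 0.999
    print("SLI:", availability_sli(good, total))
    print("Budget remaining:", round(error_budget_remaining(good, total, slo), 3))
```

In this made-up month the SLO allows about 1,000 failed requests, so 500 failures leave roughly half of the budget unspent.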