Lead Site Reliability Engineer

UKG
Pune | 3-8 LPA | Posted 4 Jun 2025
FULL TIME
Network Design
System Design
Cloud Architecture
IT Architecture

Job Description

Job Responsibilities

  • Engage in and improve the lifecycle of services from conception to end of life (EOL), including system design consulting and capacity planning.
  • Define and implement standards and best practices for system architecture, service delivery, metrics, and the automation of operational tasks.
  • Support the services, product, and engineering teams by providing common tooling and frameworks that increase availability and improve incident response.
  • Improve system performance, application delivery, and efficiency through automation, process refinement, postmortem reviews, and in-depth configuration analysis.
  • Collaborate closely with engineering professionals within the organization to deliver reliable services.
  • Increase the operational efficiency, effectiveness, and quality of services by treating operational challenges as software engineering problems (reducing toil).
  • Guide junior team members and serve as a champion for Site Reliability Engineering.
  • Actively participate in incident response, including on-call responsibilities.
  • Partner with stakeholders to influence and help drive the best possible technical and business outcomes.

Required Qualifications

  • Degree in engineering or a related technical discipline, or equivalent work experience.
  • Experience coding in higher-level languages (e.g., Python, JavaScript, C++, or Java).
  • Knowledge of cloud-based applications and containerization technologies.
  • Demonstrated understanding of best practices in metric generation and collection, log aggregation pipelines, time-series databases, and distributed tracing.
  • Working experience with industry-standard tools such as Terraform and Ansible.
  • Demonstrable fundamentals in two of the following: computer science, cloud architecture, security, or network design.

Experience, Education, Certification, License and Training

  • At least five years of hands-on experience in engineering or cloud roles.
  • Minimum five years' experience with public cloud platforms (e.g., GCP, AWS, Azure).
  • Minimum three years' experience configuring and maintaining applications and/or systems infrastructure for a large-scale, customer-facing company.
  • Experience with distributed system design and architecture.
