
SRE Engineer

Nomiso
Gurgaon
5-8 LPA
Posted 30 Jun 2025
FULL TIME
Docker
Kubernetes
Terraform
Grafana
Prometheus

Job Description

Roles and Responsibilities:

  • Monitor application and infrastructure metrics; build dashboards and alerts (Prometheus, Grafana, ELK).
  • Automate health checks, incident remediation, and reliability guardrails.
  • Manage on-call rotations, conduct root cause analysis, and implement postmortem action plans.
  • Define and track SLOs, SLIs, and error budgets.
  • Use chaos engineering and resilience testing to ensure fault tolerance.

Must Have Skills:

  • 4-5 years of experience managing production-grade Kubernetes clusters and cloud-native platforms.
  • Proficiency in Linux system internals, containers, and networking.
  • Scripting/automation expertise in Python/Go/Shell.
  • Familiarity with incident management, runbooks, and observability standards.
  • Exposure to service discovery, DNS routing, and load balancing is a bonus.

Qualification:

  • BE/BTech/MCA/ME/MTech/MS in Computer Science or a related technical field, or equivalent practical experience.
