Site Reliability Engineer

IDESLABS
Hyderabad
8-10 LPA
Posted 25 Nov 2025
FULL TIME
Docker
Kubernetes
Terraform
Grafana
AWS

Job Description

  • Handling major incidents through CIRS (Critical Issue Response System).
  • Sending frequent updates on CES-based CIRS until the issue is stabilized.
  • Application deep dive troubleshooting.
  • Identifying & creating CIRS preventive action items.
  • Handling CIRS-based requests (DFs, feature toggles, deployments).
  • Following up on major production incidents.
  • Familiarity with monitoring tools such as Dynatrace and Kibana.
  • Driving & monitoring planned activities.
  • Writing new monitoring & enhancing existing monitoring scope.
  • Handling customer escalations.