Production Engineer

IBM
Bangalore
5-9 LPA
Posted 13 Nov 2025
FULL TIME
Helm
Debugging
Kubernetes
Networking
Cloud

Job Description

Key Responsibilities:

  • Monitor, troubleshoot, and optimize production systems running on Kubernetes (EKS, GKE, AKS) to ensure platform reliability and performance.
  • Develop and maintain automation for infrastructure provisioning, scaling, and incident response.
  • Participate in on-call rotations to detect, mitigate, and resolve production incidents.
  • Drive Kubernetes version upgrades, node pool scaling, and security patching.
  • Implement and refine observability and monitoring tools (Datadog, Prometheus, Splunk, VictoriaMetrics) for proactive alerting.
  • Manage infrastructure using Terraform, Terragrunt, Helm, and Kubernetes manifests.
  • Build, maintain, and improve CI/CD pipelines using GitHub Actions, Argo CD, and related tools.
  • Collaborate with developers, SREs, and other teams to enhance platform stability.
  • Analyze and optimize cloud and containerized workloads for cost efficiency and high availability.
  • Uphold platform security best practices, support incident response, and ensure compliance adherence.
