Senior Site Reliability Engineer - Logging Metrics and Monitoring

Athenahealth Technology Private Limited
Bangalore, 5-8 LPA, Posted 12 Jun 2025
FULL TIME
Ansible
Ruby
Puppet
Bash
Golang

Job Description

Job Responsibilities

  • Automate the deployment of logging, metrics, and monitoring services using Puppet for configuration management.
  • Address and resolve production incidents by applying Linux administration and engineering expertise.
  • Lead projects from inception to completion, including designing technical solutions, managing timelines, and executing deliverables.
  • Design and implement metrics dashboards and alert criteria to effectively monitor and scale services.
  • Participate in a week-long on-call rotation every few weeks, shared with team members.
  • Assist development teams in enhancing their logging and metrics collection processes.

Typical Qualifications

  • Possess 5 to 8 years of prior experience in a production environment, with strong system administration and DevOps skills for managing services in a Linux environment.
  • Demonstrate hands-on experience with configuration management tools such as Puppet or Ansible.
  • Strong experience troubleshooting production services in a Linux environment and participating in on-call rotations.
  • Proficient in programming with experience writing and maintaining scripts in the following languages: Bash, Ruby, Python, Perl, C++, Java, and Golang.
  • Experience developing Infrastructure as Code utilizing Terraform and CloudFormation.
  • Display adaptability and flexibility in response to changing environmental and business demands.

Additional Qualifications

  • Demonstrated experience in managing production server fleets at a scale of thousands.
  • Subject matter expertise in relevant technologies, including FluentD, Kafka, Elasticsearch, Graphite, Clickhouse, Prometheus, Grafana, Graylog, Terraform, CloudFormation, Docker, Jenkins, and Git.
  • Exposure to Amazon Web Services (AWS) for deploying, managing, and scaling applications, with a foundational understanding of AWS services, architecture, and best practices.
  • Proficient in using protocol analyzers such as tcpdump and Wireshark.