SGSgtc India
Command Center Engineer, CloudOps
Bangalore ₹4-7 LPA Posted 1 Aug 2025
FULL TIME
Windows
RHEL
Zabbix
Datadog
Azure
Job Description
Key Responsibilities:
- Monitor and manage system alerts, incidents, and performance metrics 24/7, ensuring timely resolution and escalation as necessary.
- Serve as the first point of contact for operational issues/alerts, ensuring effective communication with internal and external stakeholders.
- Lead customer communication, ensuring timely status updates and case resolution.
- Lead collaboration efforts between the company and third parties to troubleshoot and resolve escalated customer issues.
- Report product defects and enhancement requests.
- Review and collaborate on product documentation for accuracy before new releases.
- Design and maintain troubleshooting runbooks.
- Author and review knowledge base articles for internal and external use.
- Provide formal and informal training to co-workers, customers, and partners.
- Develop tools, scripts, and programs to improve the quality of our customer support.
- Coordinate incident response efforts, conducting root cause analysis and documenting findings to improve future response strategies.
- Collaborate with IT teams to ensure systems operate optimally and identify potential issues before they escalate.
- Maintain detailed logs of incidents and responses, providing regular reports and updates to management.
Who We Want
- Detail-oriented process improvers. Critical thinkers who naturally see opportunities to develop and optimize work processes, finding ways to simplify, standardize, and automate.
- Self-directed initiators. People who take ownership of their work and need no prompting to drive productivity, change, and outcomes.
- Analytical problem solvers. People who go beyond just fixing to identify root causes, evaluate optimal solutions, and recommend comprehensive upgrades to prevent future issues.
What You Will Need
- Bachelor's degree in computer science or a related field of study, or 6+ years of equivalent experience.
- 4+ years of experience in a command center or similar environment, preferably in Healthcare IT.
- IT Infrastructure & Cloud: Expertise in cloud platforms (AWS, Azure, VMware), system administration (Windows, RHEL), and network management.
- Incident Management: Experienced in ITIL and command center operations, particularly in Healthcare IT environments.
- Automation & Scripting: Proficient in Python and Bash for automating tasks.
- Essential Monitoring Tools: Proficient in using Zabbix, Datadog, and CloudWatch for infrastructure monitoring.
- Ticketing Systems: Strong experience with ServiceNow and Salesforce for incident and request management.
- Problem Solving & Communication: Excellent under pressure with strong communication and team collaboration skills.
- Flexible Work : Available to work shifts, including nights, weekends, and holidays.