Blend360 India
ML Ops Manager
Hyderabad ₹3-15 LPA Posted 25 Jun 2025
FULL TIME
Docker
Ansible
Terraform
MLOps
Job Description
Key Responsibilities
- Maintain and enhance existing ML pipelines in on-premises environments, with a focus on infrastructure as code.
- Implement minimal but essential pipeline extensions to support ongoing data science workstreams.
- Convert data science notebooks into production-ready, deployable components.
- Build ML pipelines for training, inference, and monitoring.
- Document infrastructure usage, architecture, and design using tools like Confluence, GitHub Wikis, and system diagrams.
- Act as the internal infrastructure expert, collaborating with data scientists to guide and support ML model deployments.
- Research and implement optimization strategies for ML workflows and infrastructure.
- Work independently and collaboratively with cross-functional teams to support ML product
Key Responsibilities
- Lead the design, development, and management of robust ML pipelines and infrastructure in on-premises or private cloud environments.
- Define and drive MLOps strategy and best practices for model deployment, monitoring, and lifecycle management.
- Oversee the implementation and governance of Infrastructure as Code (IaC) using tools like Ansible, Terraform (for private cloud), or Puppet.
- Manage, mentor, and guide MLOps engineers, fostering a high-performing and collaborative team.
- Collaborate with cross-functional teams to align MLOps solutions with business and data science objectives.
- Drive automation and standardization of CI/CD pipelines, model versioning, and container orchestration (e.g., Docker, Kubernetes, OpenShift).
- Ensure comprehensive documentation of infrastructure, architecture, and operational workflows using tools like Confluence, GitHub Wikis, and system diagrams.
- Identify and implement optimization opportunities for ML infrastructure performance, cost, and scalability.
- Stay updated on industry trends and emerging technologies to continuously enhance MLOps capabilities.