ML Inference & Optimization Engineer

Careerfit Ai
Mumbai
2-4 LPA
Posted 11 Sept 2025
FULL TIME
PyTorch

Job Description

You Bring

  • 3+ years of experience deploying and optimizing machine learning models in production, with 1+ year of experience deploying deep learning models
  • Experience deploying async inference APIs (FastAPI, gRPC, Ray Serve, etc.)
  • Understanding of PyTorch internals and inference-time optimization
  • Familiarity with LLM runtimes: vLLM, TGI, TensorRT-LLM, ONNX Runtime, etc.
  • Familiarity with GPU profiling tools (Nsight, nvtop) and model quantization pipelines
