Careerfit Ai
ML Inference & Optimization Engineer
Mumbai ₹2-4 LPA Posted 11 Sept 2025
FULL TIME
PyTorch
Job Description
You Bring
- 3+ years of experience deploying and optimizing machine learning models in production, with 1+ years of experience deploying deep learning models
- Experience deploying async inference APIs (FastAPI, gRPC, Ray Serve, etc.)
- Understanding of PyTorch internals and inference-time optimization
- Familiarity with LLM runtimes: vLLM, TGI, TensorRT-LLM, ONNX Runtime, etc.
- Familiarity with GPU profiling tools (Nsight, nvtop) and model quantization pipelines