Data Engineer

PHOTON
Pune
3-7 LPA
Posted 25 Apr 2025
FULL TIME
Docker
DevOps Tools
Kubernetes
SQL
Data Engineering

Job Description

Key Responsibilities

  • Data Integration & Pipeline Development : Design and implement data pipelines to support model training and finetuning on knowledge base and user data, ensuring data quality, scalability, and efficiency.
  • Data Processing & Transformation : Develop data transformation processes to prepare data for Natural Language Processing (NLP) models, facilitating personalized and accurate health recommendations.
  • Privacy & Security Compliance : Ensure all data handling practices comply with privacy and security standards, focusing on user data protection within AI model training and deployment.
  • Infrastructure Setup & Management : Build and maintain foundational cloud infrastructure on GCP to host, deploy, and scale securely and efficiently across platforms.
  • Collaboration with AI & DevOps Teams : Partner with AI/ML and DevOps teams to finetune, test, and optimize NLP models for production, focusing on deployment performance and user experience.
  • Website & Mobile Integration Support : Work alongside frontend developers to ensure smooth data flow and integration between the backend, website and mobile app.
  • Monitoring & Optimization : Implement monitoring, logging, and automated alerts to ensure data pipelines, model interactions, and infrastructure meet performance and reliability requirements.

Qualifications

  • Education : Bachelor's or Master's in Computer Science, Data Engineering, or a related field.
  • Experience :
      • 3+ years in data engineering, preferably within Generative AI or NLP-focused projects.
      • Hands-on experience with Google Cloud Platform (GCP), including BigQuery, Dataflow, and Cloud Storage.
      • Proven ability in data pipeline design and data transformations for AI model training.
  • Skills :
      • Strong programming skills in Python and familiarity with SQL.
      • Experience with DevOps tools (e.g., Kubernetes, Docker) and CI/CD pipelines in GCP.
      • Proficient in data management practices, data privacy, and security protocols.
      • Familiarity with AI/ML workflows, specifically NLP model training and finetuning.
  • Nice to Have :
      • Experience working with Contentful or React Native integrations.
      • Knowledge of MLOps practices to support continuous model training and deployment.
