PySpark Developer

Sigma Allied Services
Pune · 7-17 LPA · Posted 20 Feb 2026
FULL TIME
Spark
ETL
Data Integration
PySpark
Distributed Computing

Job Description

  • Design, implement, and optimize ETL pipelines and data processing workflows using PySpark
  • Work on distributed computing frameworks for large-scale data processing
  • Collaborate with Databricks and other cloud platforms for data storage and transformation
  • Perform data analysis, validation, and integration from multiple sources
  • Troubleshoot and resolve data pipeline and processing issues
  • Maintain proper documentation of data workflows, pipelines, and processes
  • Ensure best practices for performance, scalability, and data governance

Key Performance Indicators

  • Timely delivery of data pipelines and ETL workflows
  • Accuracy, consistency, and integrity of processed data
  • Performance and scalability of data processing solutions
  • Effective collaboration with cross-functional teams