Alike Thoughts
Big Data Developer (with PySpark)
Pune | ₹5-8 LPA | Posted 27 Jun 2025
FULL TIME
DevOps
Apache Spark
PySpark
Python
Job Description
Key Responsibilities:
- Design, develop, and optimize batch and streaming data pipelines primarily using PySpark (Python with Apache Spark).
- Write efficient, reusable, and testable code for large-scale data processing.
- Work with large-scale datasets from diverse sources, including Kafka, Hive, S3, and Parquet files.
- Collaborate closely with data scientists, data analysts, and DevOps teams to build robust and scalable data pipelines.
- Tune Spark jobs for optimal performance and resource efficiency.
- Implement comprehensive data quality checks, logging, and error-handling mechanisms within data pipelines.
- Contribute to the overall architecture and strategy for big data solutions.