Big Data Developer (with PySpark)

Alike Thoughts
Pune | 5-8 LPA | Posted 27 Jun 2025
FULL TIME
DevOps
Apache Spark
PySpark
Python

Job Description

Key Responsibilities:

  • Design, develop, and optimize batch and streaming data pipelines primarily using PySpark (Python with Apache Spark).
  • Write efficient, reusable, and testable code for large-scale data processing.
  • Work with diverse and large-scale datasets from various sources, including but not limited to Kafka, Hive, S3, and Parquet files.
  • Collaborate closely with data scientists, data analysts, and DevOps teams to build robust and scalable data pipelines.
  • Tune Spark jobs for optimal performance and resource efficiency.
  • Implement comprehensive data quality checks, logging, and error-handling mechanisms within data pipelines.
  • Contribute to the overall architecture and strategy for big data solutions.
