STStack Digital
Data Engineering
Kolkata ₹3–7 LPA Posted 19 Jun 2025
FULL TIME
Data Modeling
ETL
Apache Hive
Apache Hadoop
PySpark
+1 more
Job Description
- Design, develop, and optimize large-scale data processing pipelines using PySpark.
- Utilize Apache tools and frameworks (e.g., Hadoop, Hive, HDFS) for data ingestion, transformation, and management.
- Ensure high performance and reliability of ETL jobs in production environments.
- Collaborate with Data Scientists, Analysts, and stakeholders to deliver robust data solutions.
- Implement data quality checks and maintain data lineage for transparency and auditability.
- Handle ingestion, transformation, and integration of structured and unstructured data sources.
- (If applicable) Leverage Apache NiFi for automated, repeatable data flow management.
- Write clean, efficient, and maintainable code in Python and Java.
- Contribute to architecture, performance tuning, and scalability strategies.
Required Skills:
- 5–7 years of experience in data engineering.
- Strong hands-on experience with PySpark and distributed data processing.
- Deep knowledge of Apache ecosystem: Hadoop, Hive, Spark, HDFS.
- Solid understanding of ETL principles, data warehousing, and data modeling.
- Experience with large-scale datasets and performance tuning.
- Familiarity with SQL and NoSQL databases.
- Proficiency in Python and intermediate knowledge of Java.
- Experience with Git and CI/CD pipelines.
Nice-to-Have Skills:
- Hands-on experience with Apache NiFi.
- Real-time streaming pipeline development experience.
- Exposure to cloud platforms like AWS, Azure, or GCP.
- Familiarity with Docker or Kubernetes.
Soft Skills:
- Strong analytical and problem-solving capabilities.
- Excellent communication and collaboration skills.
- Self-motivated with the ability to work both independently and in teams.