Data Engineer

IDESLABS
Pune | 8-10 LPA | Posted 25 Nov 2025
FULL TIME
ETL
Apache Spark
Data Warehousing
SQL
AWS

Job Description

  • 5+ years of relevant experience in the field of Data Engineering
  • Advanced skills in big data technologies such as Hadoop, Python, Spark, and SQL.
  • Must have experience building data APIs.
  • Bachelor's degree in Computer Science or a related discipline.
  • Knowledge of data structures and algorithms.
  • Strong Python programming skills, with the ability to apply object-oriented (OOP) and functional programming. Knowledge of Scala/Java would be a plus.
  • Strong knowledge of RDBMS and NoSQL databases, with the ability to implement them from scratch. Knowledge of graph databases will be a plus.
  • Strong expertise in building & optimizing data pipelines, architectures, and data sets.
  • Experience working with different file formats like Parquet, ORC, Avro, RC, etc.
  • Experience with big data infrastructure, including MapReduce, Hive, HDFS, YARN, HBase, Oozie, etc.
  • Knowledge of and experience with orchestration frameworks such as Airflow, Oozie, Luigi, etc.
  • Experience using Spark and building jobs in Python/Scala/Java.
  • Experience with or knowledge of building stream processing platforms using Spark Streaming, Storm, etc. Knowledge of Kafka/Flink+Beam would be a plus.
  • Knowledge of building REST API endpoints for data consumption.
  • Experience in building scalable data pipelines for both real-time and batch workloads, using best practices in data modeling and ETL/ELT processes, with various technologies such as Spark, Kafka, Presto, SAP HANA, Airflow, and Informatica.
  • Perform data analysis using Python, complex SQL queries, and other tools.
  • Perform root cause analysis of issues from a platform standpoint on Kubernetes, containers, Hadoop, Spark, Hive, and Presto.
  • Excellent oral and written communication is a must.

Preferred

  • Master's degree in Computer Science or a related discipline.
  • Experience building self-service tools for analytics would be a plus.
  • Knowledge of ELK stack would be a plus.
  • Knowledge of implementing CI/CD for data pipelines is a plus.
  • Knowledge of containerization (Docker/Kubernetes) will be a plus.
  • Experience working with one of the popular public cloud platforms is preferred.