Job Description
Key Responsibilities:
- Design, implement, and optimize data pipelines for large-scale data processing.
- Develop and maintain ETL/ELT workflows using Spark, Hadoop, Hive, and Airflow.
- Collaborate with data scientists, analysts, and engineers to ensure data availability and quality.
- Write efficient and optimized SQL queries for data extraction, transformation, and analysis.
- Leverage PySpark and cloud tools (preferably Google Cloud Platform) to build reliable and scalable solutions.
- Monitor and troubleshoot data pipeline performance and reliability issues.
Required Skills:
- 4-6 years of experience in a Data Engineering role.
- Strong hands-on experience with PySpark and SQL.
- Good working knowledge of GCP or another major cloud platform (AWS, Azure).
- Experience with Hadoop, Hive, and distributed data systems.
- Proficiency in data orchestration tools such as Apache Airflow.
- Ability to work independently in a fast-paced, agile environment.
Good to Have:
- Experience with data modeling and data warehousing concepts.
- Exposure to DevOps and CI/CD practices for data pipelines.
- Familiarity with other programming/scripting languages (Python, shell scripting).
Educational Qualification:
- Bachelor's or Master's degree in Computer Science, Information Technology, Engineering, or a related field.