
Big Data Engineer (GCP, Hadoop, PySpark)

Teamware Solutions
Hyderabad · 4-7 LPA · Posted 16 Jul 2025
Full Time
HIPAA
PySpark
GDPR
Python
Hadoop

Job Description

Key Responsibilities:

  • Design, develop, and optimize big data pipelines and ETL workflows using PySpark and the Hadoop ecosystem (HDFS, MapReduce, Hive, HBase).
  • Develop and maintain data ingestion, transformation, and integration processes on Google Cloud Platform services such as BigQuery, Dataflow, Dataproc, and Cloud Storage.
  • Ensure data quality, security, and governance across all pipelines.
  • Monitor and troubleshoot performance issues in data pipelines and storage systems.
  • Collaborate with data scientists and analysts to understand data needs and deliver clean, processed datasets.
  • Implement batch and real-time data processing solutions.
  • Write efficient, reusable, and maintainable code in Python and PySpark.
  • Automate deployment and orchestration using tools like Airflow, Cloud Composer, or similar.
  • Stay current with emerging big data technologies and recommend improvements.
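As a rough illustration of the reusable transformation logic this role calls for, here is a minimal sketch in plain Python; in production the same cleaning step would typically be expressed with PySpark DataFrame operations. The record schema and field names (`user_id`, `amount`) are illustrative assumptions, not taken from the posting:

```python
# Minimal ETL-style transform: drop incomplete records and normalize
# fields before loading. Field names here are illustrative assumptions.

def clean_records(records):
    """Drop records missing required fields and normalize the rest."""
    cleaned = []
    for rec in records:
        # Require a non-empty user_id.
        if not rec.get("user_id"):
            continue
        # Require a numeric amount; skip records that fail to parse.
        try:
            amount = float(rec.get("amount"))
        except (TypeError, ValueError):
            continue
        cleaned.append({
            "user_id": rec["user_id"].strip().lower(),
            "amount": round(amount, 2),
        })
    return cleaned

raw = [
    {"user_id": " Alice ", "amount": "10.456"},
    {"user_id": "", "amount": "5"},        # dropped: empty user_id
    {"user_id": "bob", "amount": "oops"},  # dropped: non-numeric amount
]
print(clean_records(raw))
```

The same filter-and-normalize pattern maps directly onto PySpark's `filter` and `withColumn` calls when the data no longer fits on one machine.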

Qualifications and Requirements:

  • Bachelor's or Master's degree in Computer Science, Engineering, or a related field.
  • 3+ years of experience in big data engineering or related roles.
  • Strong hands-on experience with Google Cloud Platform (GCP) services for big data processing.
  • Proficiency in Hadoop ecosystem tools: HDFS, MapReduce, Hive, HBase, etc.
  • Expert-level knowledge of PySpark for data processing and analytics.
  • Experience with data warehousing concepts and tools such as BigQuery.
  • Good understanding of ETL processes, data modeling, and pipeline orchestration.
  • Programming proficiency in Python and scripting.
  • Familiarity with containerization (Docker) and CI/CD pipelines.
  • Strong analytical and problem-solving skills.

Desirable Skills:

  • Experience with streaming data platforms like Kafka or Pub/Sub.
  • Knowledge of data governance and compliance standards (GDPR, HIPAA).
  • Familiarity with ML workflows and integration with big data platforms.
  • Experience with Terraform or other infrastructure-as-code tools.
  • Google Cloud Professional Data Engineer certification or equivalent.
