
Databricks PySpark

Fusion Plus Solutions
Hyderabad · 5-8 LPA · Posted 18 Jul 2025
FULL TIME
PySpark
SQL
Airflow
AWS
Databricks

Job Description

  • Develop and optimize data processing jobs using PySpark to handle complex data transformations and aggregations efficiently.
  • Design and implement robust data pipelines on the AWS platform, ensuring scalability and efficiency.
  • Leverage AWS services such as EC2 and S3 for comprehensive data processing and storage solutions.
  • Manage SQL database schema design, query optimization, and performance tuning to support data transformation and loading processes.
  • Design and maintain scalable and performant data warehouses, employing best practices in data modeling and ETL processes.
  • Utilize modern data platforms for collaborative data science, integrating seamlessly with various data sources and types.
  • Ensure high data quality and accessibility by maintaining optimal performance of Databricks clusters and Spark jobs.
  • Develop and implement security measures, backup procedures, and disaster recovery plans using AWS best practices.
  • Manage source code and automate deployment using GitHub along with CI/CD practices tailored for data operations in cloud environments.
  • Provide expertise in troubleshooting and optimizing PySpark scripts, Databricks notebooks, SQL queries, and Airflow DAGs.
  • Stay updated on the latest developments in cloud data technologies and recommend adoption of new tools and practices.
  • Use Apache Airflow to orchestrate and automate data workflows, ensuring timely and reliable execution of data jobs.
  • Collaborate with data scientists and business analysts to design data models and pipelines that support advanced analytics and machine learning projects.
