
Databricks PySpark

Fusion Plus Solutions
Hyderabad · 5-8 LPA · Posted 18 Jul 2025
FULL TIME
PySpark
SQL
Airflow
AWS
Databricks

Job Description

  • Develop and optimize data processing jobs using PySpark to handle complex data transformations and aggregations efficiently.
  • Design and implement robust data pipelines on the AWS platform, ensuring scalability and efficiency.
  • Leverage AWS services such as EC2 and S3 for comprehensive data processing and storage solutions.
  • Manage SQL database schema design, query optimization, and performance tuning to support data transformation and loading processes.
  • Design and maintain scalable and performant data warehouses, employing best practices in data modeling and ETL processes.
  • Utilize modern data platforms for collaborative data science, integrating seamlessly with various data sources and types.
  • Ensure high data quality and accessibility by maintaining optimal performance of Databricks clusters and Spark jobs.
  • Develop and implement security measures, backup procedures, and disaster recovery plans using AWS best practices.
  • Manage source code and automate deployment using GitHub along with CI/CD practices tailored for data operations in cloud environments.
  • Provide expertise in troubleshooting and optimizing PySpark scripts, Databricks notebooks, SQL queries, and Airflow DAGs.
  • Stay updated on the latest developments in cloud data technologies and recommend adoption of new tools and practices.
  • Use Apache Airflow to orchestrate and automate data workflows, ensuring timely and reliable execution of data jobs.
  • Collaborate with data scientists and business analysts to design data models and pipelines that support advanced analytics and machine learning projects.
