Fusion Plus Solutions
Databricks PySpark
Hyderabad ₹5-8 LPA Posted 18 Jul 2025
FULL TIME
PySpark
SQL
Airflow
AWS
Databricks
Job Description
- Develop and optimize data processing jobs using PySpark to handle complex data transformations and aggregations efficiently.
- Design and implement robust data pipelines on the AWS platform, ensuring scalability and efficiency.
- Leverage AWS services such as EC2 and S3 for comprehensive data processing and storage solutions.
- Manage SQL database schema design, query optimization, and performance tuning to support data transformation and loading processes.
- Design and maintain scalable and performant data warehouses, employing best practices in data modeling and ETL processes.
- Utilize modern data platforms such as Databricks for collaborative data science, integrating seamlessly with varied data sources and types.
- Ensure high data quality and accessibility by maintaining optimal performance of Databricks clusters and Spark jobs.
- Develop and implement security measures, backup procedures, and disaster recovery plans using AWS best practices.
- Manage source code and automate deployment using GitHub along with CI/CD practices tailored for data operations in cloud environments.
- Provide expertise in troubleshooting and optimizing PySpark scripts, Databricks notebooks, SQL queries, and Airflow DAGs.
- Stay updated on the latest developments in cloud data technologies and recommend adoption of new tools and practices.
- Use Apache Airflow to orchestrate and automate data workflows, ensuring timely and reliable execution of data jobs.
- Collaborate with data scientists and business analysts to design data models and pipelines that support advanced analytics and machine learning projects.