Azure Databricks Lead
Job Description
Responsibilities:
1. Design and Develop Data Pipelines:
o Create scalable data processing pipelines using Azure Databricks and PySpark/Scala.
o Implement ETL (Extract, Transform, Load) processes to ingest, transform, and load data from various sources.
o Collaborate with data engineers and architects to ensure efficient data movement and transformation.
2. Data Quality Implementation:
o Establish data quality checks and validation rules within Azure Databricks.
o Monitor data quality metrics and address anomalies promptly.
o Work closely with data governance teams to maintain data accuracy and consistency.
3. Unity Catalog Integration:
o Leverage Azure Databricks Unity Catalog to manage metadata, tables, and views.
o Integrate Databricks assets seamlessly with other Azure services.
o Ensure proper documentation and organization of data assets.
4. Delta Lake Expertise:
o Understand and utilize Delta Lake, which provides ACID transactions and time travel capabilities on top of data lakes.
o Implement Delta Lake tables for reliable data storage and versioning.
o Optimize performance by leveraging Delta Lake features.
5. Performance Tuning and Query Optimization:
o Profile and analyze query performance.
o Optimize SQL queries, Spark jobs, and transformations for efficiency.
o Tune resource allocation to achieve optimal execution times.
6. Resource Optimization:
o Manage compute resources effectively within Azure Databricks clusters.
o Scale clusters dynamically based on workload requirements.
o Monitor resource utilization and cost efficiency.
7. Source System Integration:
o Integrate Azure Databricks with various source systems (e.g., databases, data lakes, APIs).
o Ensure seamless data ingestion and synchronization.
o Handle schema evolution and changes in source data.
8. Stored Procedure Conversion in Databricks:
o Convert existing stored procedures (e.g., from SQL Server) into Databricks-compatible code.
o Optimize and enhance stored procedures for better performance within Databricks.
o Bring experience with SSRS (SQL Server Reporting Services) conversion.
Qualifications and Skills:
· Education: Bachelor's degree in Computer Science, Information Technology, or a related field.
· Experience:
o Minimum 5-6 years of relevant experience architecting and building data platforms on Azure.
o Proficiency in Azure Databricks, PySpark, and SQL.
o Familiarity with Delta Lake concepts.
· Certifications (preferred):
o Microsoft Certified: Azure Data Engineer Associate or similar.
o Databricks Certified Associate Developer for Apache Spark.
· Soft Skills:
o Strong problem-solving abilities.
o Excellent communication and collaboration skills.
o Ability to work in a fast-paced, agile environment.