PradeepIT Consulting Services
Azure Specialist - CDM Smith
Remote ₹6-11 LPA Posted 22 Jul 2025
FULL TIME
DevOps
PySpark
Data Warehouses
Automation
Databricks
Job Description
Responsibilities:
- Databricks Platform Expertise: Act as a subject matter expert for the Databricks platform within the Digital Capital team, providing technical guidance, best practices, and innovative solutions.
- Databricks Workflows and Orchestration: Design and implement complex data pipelines, orchestrating them with Databricks Workflows, Azure Data Factory, or Qlik Replicate.
- End-to-End Data Pipeline Development: Design, develop, and implement highly scalable and efficient ETL/ELT processes using Databricks notebooks (Python/Spark or SQL) and other Databricks-native tools.
- Delta Lake Expertise: Utilize Delta Lake for building reliable data lake architecture, implementing ACID transactions, schema enforcement, time travel, and optimizing data storage for performance.
- Spark Optimization: Optimize Spark jobs and queries for performance and cost efficiency within the Databricks environment. Demonstrate a deep understanding of Spark architecture, partitioning, caching, and shuffle operations.
- Data Governance and Security: Implement and enforce data governance policies, access controls, and security measures within the Databricks environment using Unity Catalog and other Databricks security features.
- Collaborative Development: Work closely with data scientists, data analysts, and business stakeholders to understand data requirements and translate them into Databricks-based data solutions.
- Monitoring and Troubleshooting: Establish and maintain monitoring, alerting, and logging for Databricks jobs and clusters, proactively identifying and resolving data pipeline issues.
- Code Quality and Best Practices: Champion best practices for Databricks development, including version control (Git), code reviews, testing frameworks, and documentation.
- Performance Tuning: Continuously identify and implement performance improvements for existing Databricks data pipelines and data models.
- Cloud Integration: Integrate Databricks with other cloud services (e.g., Azure Data Lake Storage Gen2, Azure Synapse Analytics, Azure Key Vault) for a seamless data ecosystem.
- Traditional Data Warehousing & SQL: Design, develop, and maintain schemas and ETL processes for traditional enterprise data warehouses. Demonstrate expert-level proficiency in SQL for complex data manipulation, querying, and optimization within relational database systems.
Required Skills:
- Proficiency in Databricks, including Databricks Workflows and job orchestration.
- Hands-on experience in Python for automation and scripting.
- Strong knowledge of Azure Data Lakes, Data Warehouses, and cloud architecture.
- Proficiency in designing web applications and data engineering solutions (Solution Architecture).
- Familiarity with DevOps Basics, including Jenkins and CI/CD pipelines.
- Excellent verbal and written communication skills.
- Ability to quickly grasp new technologies and adapt to changing requirements (Fast Learner).
- Experience integrating Databricks with other cloud services (e.g., Azure Data Lake Storage Gen2, Azure Synapse Analytics, Azure Key Vault) for a seamless data ecosystem.
- Extensive experience with Spark (PySpark, Spark SQL) for large-scale data processing.