SI
Job Description
Required Skills and Qualifications:
- Proficient in Python scripting and PySpark for data processing tasks
- Strong SQL skills, with hands-on experience managing big data using ETL tools such as Informatica
- Experience with the AWS cloud platform and its data services, including S3, Redshift, Lambda, EMR, Airflow, Postgres, SNS, and EventBridge
- Skilled in Bash shell scripting
- Understanding of data lakehouse architecture, particularly the Apache Iceberg table format, is a plus
- Experience with Kafka and MuleSoft APIs is preferred
- Understanding of healthcare data systems is a plus
- Experience in Agile methodologies
- Strong analytical and problem-solving skills
- Effective communication and teamwork abilities
Responsibilities:
- Develop and maintain data pipelines and ETL processes to manage large-scale datasets
- Collaborate to design and test data architectures that align with business needs
- Implement and optimize data models for efficient querying and reporting
- Assist in the development and maintenance of data quality checks and monitoring processes
- Support the creation of data solutions that enable analytical capabilities
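The data-quality checks mentioned in the responsibilities above can be sketched in a few lines of Python. This is an illustrative sketch only: the column names, the null-rate threshold, and the function names are assumptions, not part of the role; in the stack described here, logic like this would typically run inside a PySpark or Airflow task.

```python
# Minimal sketch of a pipeline data-quality check (illustrative assumptions
# throughout: column names, threshold, and row format are hypothetical).

def null_rate(rows, column):
    """Fraction of rows where `column` is missing or None."""
    if not rows:
        return 0.0
    missing = sum(1 for r in rows if r.get(column) is None)
    return missing / len(rows)

def run_quality_checks(rows, required_columns, max_null_rate=0.05):
    """Return a dict of column -> (null rate, passed?) for each required column."""
    results = {}
    for col in required_columns:
        rate = null_rate(rows, col)
        results[col] = (rate, rate <= max_null_rate)
    return results

if __name__ == "__main__":
    # Hypothetical healthcare-style batch, echoing the domain named in the posting.
    batch = [
        {"patient_id": 1, "visit_date": "2024-01-02"},
        {"patient_id": 2, "visit_date": None},
        {"patient_id": None, "visit_date": "2024-01-05"},
    ]
    checks = run_quality_checks(batch, ["patient_id", "visit_date"], max_null_rate=0.10)
    for col, (rate, ok) in checks.items():
        print(f"{col}: null rate {rate:.2f} -> {'PASS' if ok else 'FAIL'}")
```

A check like this would normally gate a pipeline stage: a failed column blocks the downstream load and raises a monitoring alert rather than silently propagating bad data.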