Senior Software Engineer
Job Description
Overview
At its core, Epsilon's ETS (Epsilon Technology Services) team establishes the foundation of our IT infrastructure. This team champions innovation and efficiency through cutting-edge technology across all Epsilon platforms and business verticals. From initial infrastructure needs to final deployment, ETS provides end-to-end solutions for our client-facing platforms, supporting all revenue-generating aspects of the business and setting architectural direction for enterprise deployments. By embracing technologies such as cloud, automation, and artificial intelligence, the team is at the forefront of transforming our digital business and seizing new opportunities.
We're looking for someone with strong expertise in public cloud service providers who wants to grow their Python experience and make a significant impact on the automation of complex business processes. You'll have opportunities to explore cloud-first technologies and to develop and automate repeatable patterns with a software engineering approach. If you enjoy new challenges and are solution-oriented, this role is for you.
As part of the Epsilon Solutions Enablement team, you'll help mature and enable access to the services that power most business processes across Epsilon. You'll have the chance to present ideas and influence technology approaches, gaining significant experience with public cloud providers and other SaaS services in an open, transparent environment that values innovation and efficiency.
Responsibilities
- Design, develop, and implement automated observability services across our entire infrastructure.
- Champion Site Reliability Engineering (SRE) principles and actively contribute to the evolution of our monitoring and observability strategy.
- Build and improve complex, repeatable business processes by developing and implementing robust automation solutions.
- Analyze telemetry data (logs, metrics, traces) to proactively identify performance bottlenecks, security vulnerabilities, and areas for improvement in our systems and applications.
- Integrate and enable automated observability services within our public cloud environments (e.g., AWS, Azure, GCP), leveraging cloud-native monitoring tools and services.
- Collaborate with other teams to explore and contribute new ideas that improve our services.
Qualifications
- 5+ years of experience in software engineering with a strong focus on designing, implementing, and maintaining robust observability solutions, including deep expertise in monitoring patterns, best practices, and industry standards.
- Proven ability to develop and deploy high-quality software using Python.
- Strong experience with CI/CD principles and tools (e.g., Jenkins, GitLab CI, Azure DevOps) for automated builds, testing, and deployments.
- Experience with infrastructure-as-code tools (e.g., Ansible, Terraform, Puppet) to automate infrastructure provisioning, configuration management, and deployments.
- Strong system administration experience, preferably with Linux/Unix environments; familiarity with Windows systems is a plus.
- Expertise in utilizing and integrating various observability tools (e.g., Prometheus, Grafana, Jaeger, ELK Stack, Splunk) to collect, analyze, and visualize logs, metrics, and traces.
- Ability to analyze and understand business workflows to design and implement software solutions that automate and orchestrate business processes using RESTful APIs, GraphQL, or other integration technologies.
- Cloud certifications in AWS, Azure, or Google Cloud are highly preferred.
- Experience with cloud-native technologies and services is a must.
- Excellent written and verbal communication skills.
- Ability to clearly document technical designs, procedures, and best practices.