Epsilon Data Management
Lead Software Engineer
Bangalore ₹8-12 LPA Posted 3 Jun 2025
FULL TIME
Spark
Scala
GCP
AWS
Hadoop
Job Description
Job Summary
- What will you enjoy in this role? You will focus on designing, developing, and supporting all our online data solutions
- This person will work closely with business managers to design and build innovative solutions
What you'll do
- We seek Software Engineers with experience building and scaling services in on-premises and cloud environments
- As a Lead Engineer in the Epsilon Attribution/Forecasting Product Development team, you will play a key role in implementing and optimizing advanced data processing solutions using Scala, Spark, and Hadoop
- You will collaborate with cross-functional teams to deploy scalable big data solutions on our on-premises and cloud infrastructure
- Your responsibilities will include building, scheduling, and maintaining complex workflows, as well as performing data integration and transformation tasks
- You will troubleshoot issues, document processes, and communicate technical concepts clearly to both technical and non-technical stakeholders
- Additionally, you will focus on continuously enhancing our attribution and forecasting engines, ensuring they effectively meet evolving business needs
- Strong written and verbal communication skills (in English) are required to facilitate work across multiple countries and time zones
- Good understanding of Agile methodologies (Scrum)
Qualifications
- Over 8 years of strong experience in Scala programming and extensive use of Apache Spark for developing and maintaining scalable big data solutions on both on-premises and cloud environments, particularly AWS and GCP
- Proficient in performance tuning of Spark jobs, optimizing resource usage, shuffling, partitioning, and caching for maximum efficiency
- Skilled in implementing scalable, fault-tolerant data pipelines with comprehensive monitoring and alerting
- Hands-on experience with Python for developing infrastructure modules
- Deep understanding of the Hadoop ecosystem, including HDFS, YARN, and MapReduce
- Proficient in writing efficient SQL queries for handling large volumes of data in various database systems
- Experienced in building, scheduling, and maintaining DAG workflows
- Familiar with data warehousing concepts and technologies
- Capable of taking end-to-end ownership in defining, developing, and documenting software objectives and requirements in collaboration with stakeholders
- Experienced with Git or equivalent source control systems
- Proficient in developing and implementing unit test cases to ensure code quality and reliability, and experienced in using integration testing frameworks to validate system interactions
- Effective collaborator with stakeholders and teams to understand requirements and develop solutions
- Ability to work within tight deadlines, prioritize tasks effectively, and perform under pressure
- Experience in mentoring junior staff
Advantageous to have experience with the following:
- Hands-on experience with Databricks for unified data analytics, including Databricks Notebooks, Delta Lake, and Catalogs
- Proficiency in using the ELK (Elasticsearch, Logstash, Kibana) stack for real-time search, log analysis, and visualization
- Strong background in analytics, including the ability to derive actionable insights from large datasets and support data-driven decision-making
- Experience with data visualization tools like Tableau, Power BI, or Grafana
- Familiarity with Docker for containerization and Kubernetes for orchestration
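The Spark performance-tuning qualification above (resource usage, shuffling, partitioning, and caching) can be illustrated with a minimal Scala sketch. This is a hypothetical example, not part of the role's actual codebase: the input path, the `user_id`/`event_date` columns, and the partition setting are all assumed for illustration.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.storage.StorageLevel

object TuningSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("attribution-tuning-sketch")
      // Shuffle partition count is workload-dependent; 200 is Spark's default.
      .config("spark.sql.shuffle.partitions", "200")
      .getOrCreate()
    import spark.implicits._

    // Hypothetical input path; in practice this would come from configuration.
    val events = spark.read.parquet("/data/events")

    // Repartition by the aggregation key to reduce shuffle skew, then persist
    // because the dataset is reused by more than one downstream action.
    val byUser = events
      .repartition($"user_id")
      .persist(StorageLevel.MEMORY_AND_DISK)

    val daily = byUser.groupBy($"user_id", $"event_date").count()
    daily.write.mode("overwrite").parquet("/data/daily_counts")

    byUser.unpersist()
    spark.stop()
  }
}
```

In an interview setting, a candidate would be expected to explain the trade-offs here: why repartitioning before a wide aggregation can reduce skew, and why `persist` only pays off when the cached dataset is reused.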