Tata Consultancy Services Limited
Scala Developer
Mumbai ₹5-9 LPA Posted 24 Nov 2025
FULL TIME
HDFS
Hive
YARN
Hadoop
Job Description
Role Overview
The role focuses on designing, developing, and optimizing large-scale data processing solutions using Spark Scala and Hadoop ecosystem technologies. The position requires strong expertise in big data components, distributed processing, SQL optimization, and end-to-end pipeline development in both batch and streaming environments.
Key Responsibilities
- Create Spark Scala jobs for data transformation, aggregation, and large-scale data processing
- Design and implement data processing pipelines using Hadoop ecosystem tools such as HDFS, Hive, YARN, MapReduce, and Sqoop
- Write and optimize Spark jobs, Spark SQL queries, and streaming/batch data processing flows
- Develop and optimize complex Hive and SQL queries involving UDFs, joins, views, and large datasets
- Debug Spark code and enhance performance for distributed applications
- Utilize UNIX commands and shell scripting for automation and environment handling
- Work with Autosys and Gradle for job scheduling and build management
- Produce unit tests for Spark transformations and associated helper methods
- Write clear Scaladoc-style documentation for all developed code
- Collaborate with SMEs and stakeholders to meet timelines and ensure accurate status reporting
- Create and maintain detailed documentation for developed mappings and processes
- Work effectively within an agile environment
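As a flavor of the work described above, the sketch below shows the kind of transformation and aggregation logic a Spark Scala job applies, written over plain Scala collections so it is unit-testable without a cluster, with Scaladoc-style comments as the responsibilities call for. The schema and field names (`Txn`, `userId`, `amount`) are illustrative assumptions, not part of the role description.

```scala
/** A single transaction record (hypothetical schema for illustration). */
final case class Txn(userId: String, amount: Double)

object TxnAggregation {

  /** Sums transaction amounts per user.
    *
    * This is the core logic that a Spark job would express as a
    * `Dataset[Txn]` groupBy/aggregation; keeping it as a pure function
    * makes it easy to cover with unit tests, as the role requires.
    */
  def totalByUser(txns: Seq[Txn]): Map[String, Double] =
    txns.groupBy(_.userId).map { case (user, rows) =>
      user -> rows.map(_.amount).sum
    }

  def main(args: Array[String]): Unit = {
    val sample = Seq(Txn("a", 10.0), Txn("a", 5.0), Txn("b", 2.5))
    println(totalByUser(sample))
  }
}
```

In a real pipeline the same function body would run inside a Spark transformation, while the unit test exercises it directly on in-memory data.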
Required Experience & Skills
- 5+ years of experience in Spark Scala development
- Strong experience with Hadoop ecosystem components (HDFS, Spark, Hive, Parquet, YARN, MapReduce, Sqoop)
- Experience with batch and streaming data processing
- Strong SQL and Hive query optimization skills
- Experience in debugging and performance tuning Spark applications
- Knowledge of UNIX commands and shell scripting
- Hands-on experience with Autosys and Gradle
- Strong analytical and problem-solving abilities
- Ability to work with multiple teams, manage timelines, and maintain documentation