Job Information
Nielsen Senior Data Engineer (Spark, Presto, Pandas, relational database systems (RDBMS)) in Bangalore, India
You’ll be working within an international group of teams spanning India, Europe, and the US.
As a Senior Data Engineer, you will be responsible for data pipelines covering both scheduled and real-time use cases. Scheduled pipelines typically run in Airflow, while real-time processing may run in Airflow or as integrated querying in backend services.
Responsibilities
Continuously discuss the Cost of Change (i.e., code quality) with your team members
Write unit tests, integration tests, and API tests
Support the application 24/7 as part of the team’s on-call rotation
Write clean code (mindful about coupling, separation of concerns, etc.)
Work closely with team leads and backend developers to design and develop functional, robust data pipelines to support internal and customer needs
Manage and optimize scalable pipelines in the cloud
Optimize internal and external applications for performance and scalability
Communicate regularly with project managers, quality engineers, and other developers regarding progress on long-term technology roadmap
Recommend systems solutions by comparing advantages and disadvantages of custom development and purchased alternatives
Participate as a team lead on projects, including training, coaching, and sharing technical knowledge with less experienced staff
Rapidly identify and resolve technical incidents as they emerge
Key Skills
Required
4-8 years of experience as a software/data engineer
Bachelor’s or Master’s degree in computer science or a related field, or equivalent work experience
Advanced experience with relational database management systems (RDBMS), including advanced proficiency in SQL for data querying and manipulation, as well as complex data analysis, optimization, and performance tuning (MySQL, PostgreSQL, Amazon RDS)
Advanced experience with analytical data stores and strong understanding of distributed computing principles (e.g., Spark, Presto, Pandas)
Experience with data pipeline orchestration tools like Apache Airflow for workflow management and automation.
2+ years of experience working with Docker
Advanced proficiency with data integration and ETL (Extract, Transform, Load) processes to move and transform data between systems.
Hands-on experience with Python.
Advanced proficiency with cloud service technologies, e.g., AWS (EMR, RDS, EC2, S3, Athena, Lambda)
Advanced knowledge of data warehouse design principles and best practices, including schema design, indexing strategies, and partitioning techniques.
Familiarity with storage, network, and compute services, as well as multi-zone and region-based design
Experience with Git source control systems (GitLab preferred)
Fluent in English, both spoken and written, with a strong vocabulary (C1 English level)
Commitment to following best practices for security, scalability, and performance.
Excellent problem-solving skills and ability to troubleshoot complex technical issues in production environments.
Strong communication skills to collaborate effectively with cross-functional teams, stakeholders, and third-party vendors.
Continuous improvement mindset to identify opportunities for automation, optimization, and efficiency gains in infrastructure and deployment processes.
Ability to document processes, procedures, and technical architectures for knowledge sharing and future reference.
Preferred
Leadership qualities and the ability to inspire and motivate a team
At least 1 year of experience with test-driven development (TDD)
Familiarity with Java and frontend development
Experience with data visualization tools such as Tableau, Power BI, or Looker for creating interactive dashboards and reports
Knowledge of UI/UX principles and best practices for designing intuitive and user-friendly interfaces.
Knowledge of networking principles and security best practices.