Work with stakeholders to understand needs for data structure, availability, scalability, and accessibility.
Develop tools to improve data flows between internal/external systems and the data lake/warehouse.
Build robust and reproducible data ingest pipelines to collect, clean, harmonize, merge, and consolidate data sources.
Understand existing data applications and infrastructure architecture.
Build and support new data feeds for various data management layers and data lakes.
Evaluate business needs and requirements.
Support the migration of existing data transformation jobs in Oracle and MS SQL Server to Snowflake.
Lead the migration of existing data transformation jobs from Oracle, Hive, Impala, etc. to Spark and Python on AWS Glue (a minimal Glue job sketch follows this list).
Document processes and steps.
Develop and maintain datasets.
Improve data quality and efficiency.
Lead business requirements gathering and deliver accordingly.
Collaborate with data scientists, architects, and the wider team on several data analytics projects.
Collaborate with DevOps engineers to improve system deployment and monitoring processes.
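To illustrate the Glue migration duty above, here is a minimal sketch of a PySpark job on AWS Glue. It is an illustration of the general pattern only, not this team's actual pipeline; the catalog database (legacy_oracle_db), table (orders), column names, and S3 output path are hypothetical placeholders.

    import sys
    from awsglue.utils import getResolvedOptions
    from awsglue.context import GlueContext
    from awsglue.job import Job
    from pyspark.context import SparkContext

    # Standard Glue job bootstrap: resolve the job name and build contexts.
    args = getResolvedOptions(sys.argv, ["JOB_NAME"])
    sc = SparkContext()
    glue_context = GlueContext(sc)
    job = Job(glue_context)
    job.init(args["JOB_NAME"], args)

    # Read the source table registered in the Glue Data Catalog
    # (database and table names here are hypothetical).
    source = glue_context.create_dynamic_frame.from_catalog(
        database="legacy_oracle_db", table_name="orders"
    )

    # A transformation of the kind previously expressed as an Oracle job:
    # keep completed orders and project the columns the warehouse needs.
    orders = (
        source.toDF()
        .filter("order_status = 'COMPLETE'")
        .select("order_id", "customer_id", "order_total")
    )

    # Write the harmonized output to the data lake as Parquet
    # (bucket and prefix are hypothetical).
    orders.write.mode("overwrite").parquet("s3://example-data-lake/curated/orders/")

    job.commit()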
Requirements
Bachelor's degree in computer science or a related STEM (science, technology, engineering, or mathematics) field.
At least 8 years of strong data warehousing experience using RDBMS and non-RDBMS databases.
At least 5 years of recent hands-on professional experience (actively coding) working as a data engineer (back-end software engineer considered).
Professional experience working in an agile, dynamic, and customer-facing environment is required.
Understanding of distributed systems and cloud technologies (AWS) is highly preferred.
Understanding of data streaming and scalable data processing is preferred.
Experience with large-scale datasets and data lake/data warehouse technologies such as Amazon Redshift, Google BigQuery, and Snowflake; Snowflake is highly preferred.
At least 2 years of experience with ETL (AWS Glue), Amazon S3, Amazon RDS, Amazon Kinesis, AWS Lambda, Apache Airflow, and AWS Step Functions (a minimal Airflow orchestration sketch follows this list).
Strong knowledge of Python, UNIX shell scripting, and Spark is required.
Understanding of RDBMS concepts, data ingestion, data flows, and data integration.
Understanding of network protocols including TCP/IP, TLS/SSL, and HTTP.
Understanding of AWS VPC networking and IAM access control policies.
Technical expertise with data models, data mining and segmentation techniques.
Experience with the full software development life cycle (SDLC) and Lean or Agile development methodologies.
Knowledge of CI/CD and Git-based deployments.
Ability to work in a team within a diverse, multi-stakeholder environment.
Ability to communicate complex technology solutions to diverse audiences, namely technical, business, and management teams.
Ability to work in a collaborative environment and coach other team members on coding practices, design principles, and implementation patterns that lead to high-quality maintainable solutions.
Ability to work in a dynamic, agile environment within a geographically distributed team.
Ability to focus on promptly addressing customer needs.
Ability to work within a diverse and inclusive team.
Technically curious, self-motivated, versatile and solution oriented.
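As referenced above, the orchestration tools listed in the requirements (Apache Airflow in particular) are typically used to tie these AWS services together. A minimal Airflow DAG sketch follows, assuming the Amazon provider package is installed; the DAG id, Glue job name, region, and IAM role are hypothetical placeholders, not part of this role's actual stack.

    from datetime import datetime

    from airflow import DAG
    from airflow.providers.amazon.aws.operators.glue import GlueJobOperator

    # Hypothetical daily pipeline: trigger the Glue transformation job once per day.
    with DAG(
        dag_id="daily_orders_ingest",
        start_date=datetime(2024, 1, 1),
        schedule_interval="@daily",
        catchup=False,
    ) as dag:
        transform_orders = GlueJobOperator(
            task_id="transform_orders",
            job_name="orders_transform",        # hypothetical Glue job name
            region_name="us-east-1",            # hypothetical region
            iam_role_name="glue-service-role",  # hypothetical IAM role
        )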