Data Engineer PySpark SG C1

SG, Singapore

Job Description


-------------------

The Data Engineer should be able to understand the business, functional, and technical requirements and build effective data transformation jobs in Python, PySpark, or Scala. Responsibilities include:

  • Build optimized data pipelines in PySpark/Python/Scala, with strong hands-on expertise.
  • Produce unit tests for Spark transformations and helper methods.
  • Understand complex transformation logic and build data pipelines in PySpark/Spark-SQL/Hive to ingest data from source systems into the Data Lake (Hive/HBase/Parquet) and Enterprise Data Domain tables.
  • Work closely with the Business Analyst team to review test results and obtain sign-off.
  • Prepare the necessary design and operations documentation for future use.
  • Perform peer code-quality reviews and act as gatekeeper for quality checks.
  • Code hands-on, usually in a pair-programming environment, working in highly collaborative teams and building quality code.
  • Exhibit a good understanding of data structures, data manipulation, distributed processing, application development, and automation.
  • Familiarity with Oracle, Spark Streaming, Kafka, and ML.
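The pipeline and unit-testing responsibilities above can be sketched as follows. This is a minimal illustration, not part of the posting; the table names, partition column, and helper names are hypothetical. Keeping the Spark-SQL construction in a plain function lets it be covered by ordinary unit tests without a running cluster:

```python
# Minimal sketch of a Spark-SQL ingestion step with a unit-testable
# helper. All table/column names are hypothetical examples.

def build_ingest_sql(source_table: str, target_table: str,
                     business_date: str) -> str:
    """Build the Spark-SQL that loads one business date from a source
    system table into a partitioned Hive/Parquet target table."""
    return (
        f"INSERT OVERWRITE TABLE {target_table} "
        f"PARTITION (business_date = '{business_date}') "
        f"SELECT * FROM {source_table} "
        f"WHERE business_date = '{business_date}'"
    )


def run_ingest(spark, source_table: str, target_table: str,
               business_date: str) -> None:
    """Execute the ingestion on a live SparkSession, e.g. one created
    with SparkSession.builder.enableHiveSupport().getOrCreate()."""
    spark.sql(build_ingest_sql(source_table, target_table, business_date))
```

A unit test can then assert on the generated statement directly (for example, that the partition clause carries the requested business date) without touching Hive or HDFS.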


Knowledge of RDBMS concepts and hands-on experience with PL/SQL are expected. Develop applications using the Hadoop tech stack and deliver them effectively, efficiently, on time, to specification, and in a cost-effective manner. Ensure smooth production deployments as per plan, with post-deployment verification. This Hadoop Developer will play a hands-on role, developing quality applications within the desired timeframes and resolving team queries.
Technical Requirements:


  • Hadoop data engineer with 4-6 years of total experience and strong experience in Hadoop, Spark, PySpark, Scala, Hive, Spark-SQL, Python, Impala, CI/CD, Git, Jenkins, Agile methodologies, DevOps, and the Cloudera Distribution.
  • Strong knowledge of data warehousing methodology and Change Data Capture.
  • 5+ years of relevant Hadoop and Spark/PySpark experience is mandatory.
  • Good knowledge of and experience with at least one RDBMS (MariaDB, SQL Server, MySQL, or Oracle); knowledge of stored procedures is an added advantage.
  • Exposure to TWS jobs for scheduling.
  • Strong in enterprise data architectures and data models.
  • Good experience in the Core Banking / Finance domain.
  • Exposure to the AML domain preferred, not mandatory.



Job Detail

  • Job Id
    JD1527567
  • Industry
    Not mentioned
  • Total Positions
    1
  • Job Type
    Full Time
  • Salary
    Not mentioned
  • Employment Status
    Permanent
  • Job Location
    SG, Singapore
  • Education
    Not mentioned