Hadoop Data Engineer

SG, Singapore

Job Description

We are looking for an experienced and highly skilled Hadoop Data Engineer to join our dynamic team. The ideal candidate will have hands-on expertise in developing optimized data pipelines using Python, PySpark, Scala, Spark-SQL, Hive, and other big data technologies. You will be responsible for translating complex business and technical requirements into efficient data pipelines and ensuring high-quality code delivery through collaboration and code reviews.


Roles & Responsibilities:

Data Transformation & Pipeline Development:

Design and implement optimized data pipelines using PySpark, Python, Scala, and Spark-SQL. Build complex data transformation logic and ensure data ingestion from source systems to Data Lakes (Hive, HBase, Parquet). Produce unit tests for Spark transformations and helper methods.

Collaboration & Communication:

Work closely with Business Analysts to review test results and obtain sign-offs. Prepare comprehensive design and operational documentation for future reference.

Code Quality & Review:

Conduct peer code reviews and act as a gatekeeper for quality checks. Ensure quality and efficiency in the delivery of code through pair programming and collaboration.

Production Deployment:

Ensure smooth production deployments and perform post-deployment verification.

Technical Expertise:

Provide hands-on coding and support in a highly collaborative environment. Contribute to development, automation, and continuous improvement practices.

System Knowledge:

Strong understanding of data structures, data manipulation, distributed processing, and application development. Exposure to technologies such as Kafka, Spark Streaming, and ML is a plus.

RDBMS & Database Management:

Hands-on experience with RDBMS technologies (MariaDB, SQL Server, MySQL, Oracle). Knowledge of PL/SQL and stored procedures is an added advantage.

Other Responsibilities:

Exposure to TWS jobs for scheduling. Knowledge and experience in the Hadoop tech stack, Cloudera Distribution, and CI/CD pipelines using Git and Jenkins. Experience with Agile methodologies and DevOps practices.

Technical Requirements:

Experience:

6-9.5 years of experience in Hadoop, Spark, PySpark, Scala, Hive, Spark-SQL, Python, Impala, CI/CD, and Git. Strong understanding of Data Warehousing methodology and Change Data Capture (CDC). In-depth knowledge of the Hadoop and Spark ecosystems, with hands-on experience in PySpark and Hadoop technologies. Proficiency in working with RDBMS such as MariaDB, SQL Server, MySQL, or Oracle. Experience with stored procedures and TWS job scheduling. Solid experience with enterprise data architectures and data models. A background in Core Banking or Finance domains is preferred; experience in the AML (Anti-Money Laundering) domain is a plus.

Skills & Qualifications:

  • Strong hands-on coding skills in Python, PySpark, Scala, and Spark-SQL.
  • Proficient in the Hadoop ecosystem (Hive, HBase, etc.).
  • Knowledge of CI/CD, Agile, and DevOps methodologies.
  • Good understanding of data integration, data pipelines, and distributed data systems.
  • Experience with Oracle, PL/SQL, and large-scale databases.
  • Strong analytical and problem-solving skills, with an ability to troubleshoot complex data issues.



Job Detail

  • Job Id
    JD1676818
  • Industry
    Not mentioned
  • Total Positions
    1
  • Job Type:
    Full Time
  • Salary:
    Not mentioned
  • Employment Status
    Permanent
  • Job Location
    SG, Singapore
  • Education
    Not mentioned