Extract, transform, and ingest real estate data into the data warehouse
Build robust data extraction programs that retry on failure and support various query methods (RESTful API, GraphQL, Selenium, etc.)
Design and implement data extraction solutions in a distributed system
Build efficient data pipelines (using Airflow, DBT, AWS, etc.)
Build data models and applications to support products and solve specific business needs (using SQL, Python, etc.)
Perform data validation to ensure integrity, accuracy, and consistency; identify the root causes of data inconsistencies and process defects; and implement timely corrective actions
Maintain data pipelines to ensure smooth operation
Collaborate with data analysts, data scientists, and product teams to improve data flows and application performance
Requirements:
Degree in computer science, computer engineering, data analytics, or another related technical field preferred
Minimum 2 years of experience in data-related fields
Experience handling large data sets and working with structured, unstructured, and geographical datasets
Deep understanding of databases and engineering best practices, including handling and logging errors, monitoring the system, building human-fault-tolerant pipelines, understanding how to scale up, addressing continuous integration, knowledge of database administration, maintaining data cleaning, and ensuring a deterministic pipeline
Strong experience in developing with Python and SQL
Experience with distributed computing, parallel processing, and working with large datasets
Strong experience with data lakes (integrating different data sources into the data lake)
Experience using Apache Airflow, DBT, and/or AWS is an added advantage
Excellent communication and teamwork skills with the ability to work effectively in cross-functional teams