Site Reliability Engineer Job in EXASOFT PTE. LTD.

Site Reliability Engineer

SG, Singapore

EXASOFT PTE. LTD.

33 Current Job Openings

Job Description

Job Summary:

We are seeking a Senior Site Reliability Engineer (SRE) with 10-15 years of proven experience in building, managing, and maintaining highly available, scalable, and secure infrastructure across multi-cloud and hybrid cloud environments, including on-premises data centers.

The ideal candidate will have deep knowledge of SRE principles, strong hands-on experience in automation, observability, incident response, and infrastructure resilience, and the ability to architect solutions that span cloud and traditional data center environments.

Key Responsibilities:

* Design, implement, and manage reliable and scalable systems across public clouds (AWS, Azure, GCP) and on-premises data centers.
* Apply SRE best practices, including SLIs, SLOs, error budgets, incident management, and postmortems, across cloud and non-cloud environments.
* Develop and maintain Infrastructure as Code (IaC) using tools like Terraform, Ansible, or CloudFormation.
* Drive automation for deployment, scaling, monitoring, and infrastructure management.
* Implement and enhance observability practices (monitoring, logging, tracing) using tools like Prometheus, Grafana, ELK, Datadog, and New Relic.
* Work with application teams to ensure high availability, performance, and cost optimization across hybrid environments.
* Lead and participate in on-call rotations and improve overall incident response processes.
* Collaborate with security and compliance teams to enforce best practices in data protection, access control, and system hardening in hybrid setups.
* Evaluate and recommend emerging tools and technologies for resilience engineering, disaster recovery, and infrastructure modernization.

Required Qualifications:

* 10-15 years of experience in SRE, DevOps, or infrastructure engineering roles.
* Proven experience managing infrastructure in multi-cloud (AWS, Azure, GCP) and hybrid cloud/on-prem environments.
* Solid understanding of networking, load balancing, storage, virtualization, and container orchestration (Kubernetes, Docker).
* Strong scripting and programming skills (e.g., Python, Go, Bash).
* Experience with CI/CD pipelines and tools like Jenkins, GitLab CI, and ArgoCD.
* In-depth knowledge of SRE methodologies and real-world application of SLAs, SLOs, and error budgets.
* Hands-on experience with monitoring and observability stacks.
* Strong analytical and troubleshooting skills for production incidents across complex, distributed systems.

Job Detail

Job Id: JD1635998
Industry: Not mentioned
Total Positions: 1
Job Type: Full Time
Salary: Not mentioned
Employment Status: Permanent
Job Location: SG, Singapore
Education: Not mentioned
