Site Reliability Engineer

Singapore, Singapore

Job Description

Job Title: Site Reliability Engineer
Location: Singapore
Job Type: Full-time
Responsibilities:

  • Cluster Operations & Management
    • Manage and maintain container clusters (Kubernetes, Docker) and open-source component clusters (Kafka, Redis, Elasticsearch) across multiple business units
    • Ensure optimal performance, scalability, and reliability of distributed systems
  • Infrastructure Platform Development
    • Design, build, and enhance infrastructure operation platforms
    • Develop and maintain systems for infrastructure management, CI/CD pipelines, monitoring/alerting, and centralized logging
    • Drive platform standardization and automation initiatives
  • High Availability & Reliability
    • Ensure maximum uptime for production services through proactive monitoring and incident response
    • Continuously optimize service architecture, deployment strategies, and operational processes
    • Implement and maintain SLA/SLO frameworks and reliability engineering practices
  • Automation & Process Improvement
    • Lead the development of automated operations and maintenance systems
    • Create self-service tools and workflows to improve team productivity
    • Establish best practices for infrastructure as code and configuration management
Required Qualifications
  • Experience & Education
    • 2+ years of hands-on experience in Systems Operations, DevOps, or Site Reliability Engineering (SRE)
    • Bachelor's degree in Computer Science, Engineering, or related technical field preferred
  • Cloud & Infrastructure
    • Experience with public cloud platforms (AWS, Azure, or GCP) is highly valued
    • Strong understanding of large-scale internet architecture and distributed systems
    • Proven experience with infrastructure monitoring, logging, and observability tools
  • Technical Skills
    • Proficiency in scripting and automation using Shell, Python, or similar languages
    • Strong knowledge of containerization technologies (Kubernetes, Docker)
    • Hands-on experience operating production-grade container clusters and managing CI/CD pipelines
    • Strong familiarity with common infrastructure components: Nginx, MySQL, Redis, Kafka, Elasticsearch
Advanced Networking (Preferred)
  • Experience with Service Mesh architectures, Cilium CNI, and eBPF technologies
  • Understanding of network security, load balancing, and traffic management
  • Knowledge of cloud-native networking patterns and best practices


Job Detail

  • Job Id
    JD1696263
  • Industry
    Not mentioned
  • Total Positions
    1
  • Job Type
    Full Time
  • Salary
    Not mentioned
  • Employment Status
    Permanent
  • Job Location
    Singapore, Singapore
  • Education
    Not mentioned