Site Reliability Engineer

Singapore, Singapore

Job Description


Responsibilities

  • Design and implement monitoring solutions using application performance monitoring (APM) products.
  • Create and maintain monitoring dashboards to provide real-time visibility into system health and performance.
  • Collaborate with development and operations teams to define and implement alerting rules based on established best practices and specific system requirements.
  • Monitor system performance, availability, and capacity to proactively identify and address potential issues.
  • Continuously analyze monitoring data to identify opportunities for optimization and efficiency improvements.
  • Collaborate with cross-functional teams to ensure the reliability, scalability, and performance of our infrastructure.
  • Document monitoring and alerting configurations, processes, and best practices.

Requirements

  • Bachelor's degree in Computer Science or a related technical discipline, or equivalent practical experience.
  • Proven experience as a Site Reliability Engineer (SRE) or a similar role with a focus on monitoring and alerting.
  • Proficiency with APM tools and technologies such as SolarWinds, IBM Instana, Prometheus, and Grafana.
  • Experience in creating and maintaining monitoring dashboards and writing alerting rules.
  • Understanding of cloud platforms (e.g., AWS, Azure, GCP) and container orchestration (e.g., Kubernetes) is a plus.
  • Good communication and teamwork skills.

Shortlisted candidates will be offered a 1-year agency contract.

Jobline Resources


Job Detail

  • Job Id
    JD1446023
  • Industry
    Not mentioned
  • Total Positions
    1
  • Job Type
    Full Time
  • Salary
    Not mentioned
  • Employment Status
    Permanent
  • Job Location
    Singapore, Singapore
  • Education
    Not mentioned