Observability / Site Reliability Principal Engineer

Jurong East, Singapore, Singapore

Job Description


We are seeking a skilled Observability Principal Engineer with at least 2-3 years of experience in observability to join our dynamic team. In this role, you will be responsible for implementing, managing, and optimizing observability tools. You will work closely with cross-functional teams to ensure that our systems are monitored effectively and that issues are identified and resolved proactively.

Key Responsibilities:

  • Design, implement, and maintain observability frameworks using tools such as Prometheus, Grafana, ELK Stack, Tableau, or similar.
  • Design, implement, and maintain monitoring tools such as BMC, CA, SolarWinds, SCOM, Dynatrace, Datadog, or similar.
  • Create and manage dashboards, visualizations, and reports to communicate system health and performance metrics.
  • Collaborate with the sales team to understand client requirements and demonstrate how our observability solutions can address their specific needs.
  • Prepare and deliver presentations, demos, and workshops to potential clients showcasing the capabilities and benefits of our observability tools.
  • Troubleshoot and resolve tool-related issues in a timely manner.
  • Assist in the training and mentoring of team members on observability and monitoring tools and practices.
Job Requirements:
  • 2-3 years of experience in software development, implementation, operations, or a related field with a focus on observability tools.
  • Proficiency in implementing and managing observability tools.
  • Solid understanding of cloud platforms (AWS, Azure, GCP) and containerization technologies (Docker, Kubernetes).
  • Experience with scripting languages (Python, Bash, etc.) for automation tasks.
  • Knowledge of best practices in monitoring, logging, and incident management.
  • Strong analytical skills with the ability to diagnose issues and propose effective solutions.
  • Excellent communication and collaboration skills, with a proactive approach to problem-solving.
  • Technical experience in enterprise monitoring tools such as Dynatrace, Grafana, BMC.
  • Knowledge of automation tools, cloud technologies and DevOps concepts, open systems, and networking technologies.
  • Good knowledge of various monitoring tools (e.g., BMC, SolarWinds, CloudWatch, and Azure).
  • Experience with configuration management tools (Ansible, Terraform, etc.).
  • Familiarity with APM (Application Performance Management) tools such as New Relic, Dynatrace, or similar.
  • Understanding of network protocols and architectures.
  • Experience with orchestration tools (e.g., BMC, Kubernetes, Apache Airflow, Jenkins) to create and manage automated workflows for deploying, monitoring, and scaling observability solutions.
Preferred Qualifications:
  • Proficiency in observability tools (e.g., Grafana, ELK Stack, Datadog, Prometheus).
  • Proficiency in ITOM tools (e.g., BMC, Dynatrace, CA, SCOM, IBM, SolarWinds).
  • Strong understanding of monitoring and logging frameworks.
  • Experience with distributed systems and microservices architecture.
  • Ability to write scripts for automation and data analysis.
  • Experience with cloud platforms (AWS, Azure, GCP) and their monitoring services.
  • Experience with CI/CD pipelines and infrastructure as code (IaC) tools like Terraform or Ansible.
  • Relevant certifications in cloud computing, DevOps, or observability tools are a plus.
Work location: Jurong East

ST Engineering

Job Detail

  • Job Id
    JD1488514
  • Industry
    Not mentioned
  • Total Positions
    1
  • Job Type
    Full Time
  • Salary
    Not mentioned
  • Employment Status
    Permanent
  • Job Location
    Jurong East, Singapore, Singapore
  • Education
    Not mentioned