Job Summary
We are seeking a highly skilled and experienced Cloud Engineer Lead (Level 3) to support cloud
infrastructure for commercial and Singapore Government-appointed agencies operating across
commercial cloud platforms. This role requires experience managing multi-cloud environments,
predominantly on Amazon Web Services (AWS), with knowledge of Microsoft Azure and Google
Cloud Platform (GCP). The ideal candidate will demonstrate strong Infrastructure-as-Code (IaC)
capabilities, comprehensive OS lifecycle and patching operations, application deployment and
troubleshooting expertise, and proactive operational leadership. This role emphasizes hands-on
technical proficiency, security awareness, automation-driven practices, mentorship capabilities,
and familiarity with strict uptime, compliance, and audit requirements in network-separated
environments.
Key Responsibilities
Multi-Cloud Infrastructure Operations
• Operate and maintain cloud-native services in production across AWS, Microsoft Azure, and
Google Cloud Platform:
  • Hands-on experience with cloud services including: Lambda, ECS/EKS, FSx, Glue, SES,
  GuardDuty, WAF, Shield Advanced, Security Hub, KMS, Secrets Manager, SNS, SQS,
  EventBridge, API Gateway, EC2, S3, CloudWatch, Systems Manager, Azure Virtual Machines,
  Azure Kubernetes Service (AKS), Azure Functions, Azure Storage, Azure Monitor, Compute
  Engine, Google Kubernetes Engine (GKE), Cloud Functions, Cloud Storage, Cloud Monitoring
• Monitor and troubleshoot infrastructure performance, uptime, and scalability across all
platforms
• Support production and staging environments with 24/7 reliability objectives
• Participate in a 24/7 shift rotation to provide round-the-clock operational support and
assist a team of L2 engineers with hands-on troubleshooting of technical issues
Infrastructure as Code (IaC)
• Maintain infrastructure deployment pipelines, with working knowledge of at least one of the
following: Terraform, Ansible, or Azure Resource Manager (ARM) templates
• Troubleshoot environment drift and pipeline failures across multi-cloud environments
• Promote and drive automation in cloud operations and continuous improvement
initiatives
• Implement and maintain GitOps practices for infrastructure deployment
Operating System Lifecycle & Patch Management
• Lead OS patching operations across RHEL (v8 to v10) and Windows Server (2016 to 2025)
using AWS Patch Manager, Azure Update Management, WSUS, SCCM, and YUM/DNF
• Maintain basic knowledge of Linux administration with deep expertise in Wintel Operating
System patching and management
• Schedule, automate, and track patches across all environments
• Coordinate patch approvals and ensure compliance with organizational policies
• Execute monthly and quarterly patch cycles with minimal disruption
• Perform post-patch validation and remediation activities
Application Deployment & Troubleshooting
• Deploy and troubleshoot applications across Windows and Linux operating systems
• Support application teams with OS-level diagnostics and performance optimization
• Collaborate with development teams to resolve infrastructure and OS-related application
issues
• Implement and maintain application monitoring and alerting frameworks
Security & Compliance
• Execute CIS (Center for Internet Security) security remediations across cloud platforms
• Perform security hardening based on CIS Benchmarks and government security baselines
• Conduct vulnerability remediation using tools such as Trend Micro Vision One, Qualys,
Tenable, and AWS Config
• Track SSL certificate renewals across all environments
• Identify and remediate End-of-Life (EOL) components including OS versions and Lambda
runtimes
• Support compliance with government-level security, audit, and regulatory requirements
Container & DevSecOps
• Demonstrate knowledge of container technologies (Docker, Kubernetes, ECS, EKS, AKS, GKE)
• Be familiar with DevSecOps practices using SHIP-HATS (Secure Hybrid Integration
Pipeline - Hive Agile Testing Solutions) under the Singapore Government technology stack
• Support CI/CD pipeline operations and integration with security scanning tools
ITIL & Service Management
• Adhere to ITIL processes including Incident, Problem, Change, and Request Management
• Manage and resolve ITSM tickets via ServiceNow, Jira, or similar platforms
• Drive ITSM ticket escalation between engineering teams and stakeholders
• Coordinate change management activities and participate in Change Advisory Board (CAB)
reviews with junior engineers
• Maintain service level agreements (SLAs) and operational level agreements (OLAs)
Tool Integration & Observability
• Integrate third-party tools including NGINX, monitoring dashboards, and observability stacks
• Configure and maintain observability tools for metrics, logs, and alerts across multi-cloud
environments
• Implement log aggregation and analysis using CloudWatch, Azure Monitor, and GCP Cloud
Logging
Documentation & Knowledge Management
• Create and maintain comprehensive infrastructure runbooks, system documentation,
change tracking logs, and infrastructure architecture designs for assigned applications
• Develop standard operating procedures (SOPs) and knowledge base articles
• Ensure audit-readiness through meticulous documentation discipline
• Maintain configuration management databases (CMDB) and asset inventories
Leadership & Mentorship
• Provide technical guidance and mentorship to Level 2 and junior engineers
• Lead technical discussions and architecture reviews
• Facilitate knowledge transfer sessions and training programs
• Act as escalation point for complex technical issues
• Drive continuous improvement initiatives and best practice adoption
Soft Skills & Competencies
• Problem Solving - Advanced troubleshooting of complex multi-cloud systems
• Communication - Clear and effective communication with technical and non-technical
teams, stakeholders, and management
• Leadership - Ability to guide teams and drive technical initiatives
• Collaboration - Cross-functional teamwork across engineering, security, and business teams
• Adaptability - Responsive and effective in rapidly changing environments
• Accountability / Attention to Detail - Takes ownership of outcomes and service delivery,
ensures accurate and secure implementations
• Customer Focus - Supportive, service-oriented approach with stakeholder management
• Continuous Learning - Stays current with evolving cloud and security practices
• Resilience - Performs effectively under pressure and during incident response
• Mentorship - Develops and supports junior engineers
SME Expectations - Role Behavior
This Subject Matter Expert (SME) role requires:
• Proficiency across Amazon Web Services with working knowledge of Azure and GCP
• Proven experience in uptime-critical and compliance-driven environments
• Strong mentorship and leadership capabilities for junior and mid-level engineers
• Proactive initiative in incident prevention and operational excellence
• Calm, structured, and methodical approach to incident handling with strict adherence to
change management and incident response processes
• Audit-readiness mindset with comprehensive documentation practices
• Ability to drive escalations and manage stakeholder communications effectively
• Experience working within Singapore Government technology frameworks
Technical Skills & Experience
Area: Skills Required
Cloud Platforms: Hands-on production experience with Amazon Web Services, Microsoft Azure, or Google Cloud Platform
Infrastructure as Code: Terraform, ARM Templates
Operating Systems: Windows Server (2012/2016/2019/2022/2025), basic to intermediate Linux/RHEL administration
Patch Management: AWS Patch Manager, Azure Update Management, WSUS, SCCM, YUM/DNF, air-gapped Linux repositories
Application Support: OS-level application deployment, troubleshooting, and performance optimization
Security & Hardening: CIS Benchmarks, security remediation, vulnerability management, IAM best practices
Containers: Docker, Kubernetes, ECS, EKS, AKS
DevSecOps: Familiarity with SHIP-HATS and/or DevSecOps frameworks
ITIL & ITSM: Incident, Problem, Change, Request Management; ServiceNow, Jira
SSL/Certificate Management: End-to-end SSL certificate lifecycle and renewal tracking
Scripting & Automation: PowerShell, Bash, Python, AWS CLI, Azure CLI, gcloud CLI
Documentation: Runbooks, SOPs, logs, and technical documentation
Required Qualifications
• Bachelor's degree in Computer Science, Information Systems, or related field
• Minimum 3 years of experience in Commercial Cloud Engineering roles
• At least 2 years of experience in public sector or regulated cloud environments
• Minimum 3 years of hands-on experience with AWS, Microsoft Azure, or Google Cloud
Platform
• Experience in 24/7 operational support environments with shift rotation
• Demonstrated experience in mentoring and leading junior engineers
• Strong background in ITIL processes and ITSM platforms, with experience in CIS security
hardening and remediation
• Familiarity with Singapore Government technology standards and frameworks (e.g.,
SHIP-HATS, IM8 Policy)
Preferred Certifications
• AWS Certified Solutions Architect - Associate / Professional
• AWS Certified SysOps Administrator - Associate (preferred)
• Microsoft Certified: Azure Administrator Associate or Azure Solutions Architect Expert
• Microsoft Certified: Windows Server Hybrid Administrator Associate
• RHCE or Linux Professional Institute Certification (LPIC)
• ITIL v3/v4 Foundation
Work Arrangements
• This role requires participation in 24/7 shift rotation to support critical infrastructure
operations
• Extended work hours may be required during incidents, maintenance windows, and change
implementations
• On-call support responsibilities as part of rotation schedule
• Flexibility to work outside normal office hours for patching activities and emergency
response
Job Type: Full-time
Pay: $3,989.08 - $11,746.67 per month