Project: Cloud File Transfer (CFT)
Part 1: Mandatory Requirements
Responsibilities:
Spearhead cloud operations with a strong focus on monitoring, performance tuning, and release management within AWS environments.
Ensure L2 incident management and escalation procedures are robust and proactive, prioritizing multiple issues effectively.
Coordinate with internal and external teams to swiftly resolve application and security incidents in line with SLAs.
Develop and refine operational support processes, including daily checklists, work dashboards, and communication protocols to maintain clear timelines and issue tracking.
Regularly analyse operational metrics, report on cloud system status, and provide insightful updates to stakeholders.
Exhibit excellent communication skills to convey key findings and maintain strong relationships across the board.
Lead change management initiatives by assessing impacts thoroughly, crafting strategies, and developing risk mitigation measures.
Oversee a validation team tasked with rigorous QA and security assessments to ensure stakeholder changes are thoroughly vetted before release.
Organize and execute maintenance schedules and system upgrades to optimize cloud infrastructure performance, liaising with vendors and teams for seamless cloud environment stability.
Set and evolve OKRs and SLAs, striving for continuous enhancement of cloud operation performance.
Experience and Skills:
Degree or equivalent in Computer Science, Information Technology, or related fields, supplemented by relevant experience.
At least 2 years of hands-on management of public cloud services, preferably AWS.
Acute problem-solving skills within varied cloud infrastructures and applications.
Exceptional customer service acumen, with a strong sense of urgency and detail-oriented approach to issue resolution.
Track record in developing and enforcing IT processes, procedures, and policies.
Proficient in managing cloud production environments and instituting preventative measures to mitigate potential business impact.
Competent in operational cloud technology activities, including impact assessments and service improvement execution.
Key Technologies:
Experience with infrastructure as code, specifically Terraform, for efficient resource provisioning and management.
Proficiency in GitLab for continuous integration/continuous deployment (CI/CD) pipelines and version control.
Strong understanding of AWS services and architecture, underpinning the majority of our cloud operations.
Part 2: General Requirements
As a DevOps Specialist, you will:
Develop automation and processes that enable teams to deploy, manage, scale, and monitor their applications in data centers and in the cloud.
Troubleshoot systems and solve problems across platform and application domains, and participate in on-call escalations to troubleshoot customer-facing issues.
Take ownership of end-to-end solutions provided by teams across the organisation.
Deploy and manage tools for monitoring infrastructure performance, utilization, and health.
Implement a configuration management system for business continuity management and automate disaster recovery measures.
Provision virtual machines, databases, application containers, and licenses for the development team.
As a DevOps Specialist, you need to bring to the team:
Passion for automation, standardization and best practices
Excellent understanding of Software Development Life Cycle, Test Driven Development, Continuous Integration and Continuous Delivery
Experience working with high-availability, high-performance, high-security, multi-data-centre systems and hybrid cloud environments
Demonstrable skills in three or more programming/scripting languages
Experience with version control systems such as Git
Experience with cloud platforms such as GPC and GCC (e.g. AWS, Azure, Google Cloud)
Ability to troubleshoot complex issues ranging from system resource to application stack traces
Comfortable with Agile methodologies and working closely with product development teams
Strong on collaboration and communication including documentation
Degree or Diploma in Computer Science, Computer or Electronics Engineering, Information Technology or related disciplines.
Experience required:
Experience in one or more automated provisioning tools such as Vagrant, Ansible, Puppet, Chef
Experience in one or more automated infrastructure testing tools such as Serverspec
Experience in one or more cloud infrastructure platforms such as OpenStack, CloudStack, vSphere
Knowledge of RPM file deployment, management and design
Knowledge of disaster recovery, system backup and restore
Experience in one or more virtualization technologies (KVM, Xen, VMware, Hyper-V)