What you will be working on
As an Infrastructure Resiliency Consultant, you will be involved in the planning, operation, and continual improvement of CPF Board's infrastructure platforms. This includes overseeing compute, storage, backup, and recovery systems, ensuring they remain secure, reliable, and recovery-ready. You will be responsible for shaping the Disaster Recovery (DR) strategy and advancing practices for infrastructure resiliency, acting as a bridge between technical execution and strategic resilience planning.
In this role, you will:
- Plan, manage and conduct DR exercises in partnership with stakeholders, users and system owners to ensure business continuity preparedness.
- Design and develop recovery strategies, considering system dependencies, multi-site configurations, and various failure scenarios.
- Maintain and update disaster recovery playbooks, streamlining recovery workflows to enhance response times and efficiency.
- Collaborate with internal teams to ensure preparedness for service disruptions, address resiliency gaps and implement improvements.
- Manage and optimise infrastructure platforms, including converged infrastructure and NAS storage systems.
- Oversee system lifecycle management, capacity planning, and platform hardening across compute, storage, and virtualisation layers.
- Maintain performance, patch compliance, and operational reliability of core systems.
- Oversee backup platform operations, coordinate backup and recovery testing for systems, and validate integrity and recoverability of data.
What we are looking for
We value the diverse talents and experiences that each individual brings to the table. While mastery of every requirement may not be necessary, familiarity with or expertise in some of the following areas will position you for success within this team.
- Relevant experience in infrastructure or systems engineering, with exposure to backup, DR, or resiliency planning.
- Prior experience managing converged and storage infrastructures such as EMC Isilon, Hitachi Storage Navigator, Device Manager, and Isilon OneFS.
- Experience in configuring and managing backup policies, performing backup testing, restore operations, and disaster recovery drills.
- Good understanding of Business Continuity Planning (BCP) and DR frameworks, key metrics such as RTO and RPO, and high availability architectures.
- Knowledge of failover mechanisms and redundancy setups across application, database and infrastructure layers will be a plus.
- Availability for on-call rotation, emergency response and after-hours maintenance windows where necessary.
The seniority of appointment and actual corporate job title will be commensurate with the individual's work experience.
The position is on a 2-year full-time contract directly under the payroll of CPF Board, with potential for emplacement into a permanent position.