The Technical Architect is accountable for the design, development, and validation of end-to-end disaster recovery (DR) runbooks, establishing a consistent, technically credible recovery approach for each in-scope IT service.
The role ensures alignment with DR standards, BIA-driven recovery requirements, and operational executability.
- Conduct detailed analysis of BIA outputs to understand service criticality, RTO/RPO, and recovery sequencing requirements.
- Work with stakeholders to:
  - Understand application architecture, dependencies, and failure modes,
  - Define “minimum viable recovery” states per service.
- Design technically sound DR recovery approaches, including severe failure scenarios where environments are fully lost.
- Develop, test, and refine service-specific DR runbooks, ensuring:
  - Clear recovery sequencing,
  - Explicit decision points,
  - Operational clarity.
- Lead structured walk-through and table-top testing of recovery runbooks.
- Provide formal handover to Application Support teams, ensuring clarity of ownership and ongoing maintenance.
Disaster Recovery & Resilience Expertise
- Strong understanding of Disaster Recovery (DR) concepts, patterns, and best practices
- Experience designing end-to-end recovery approaches rather than partial or component-level recovery
- Practical experience defining minimum viable recovery (MVR/MVI) states
- Understanding of severe failure scenarios (e.g. full environment loss, data corruption)
Business Impact Analysis (BIA) Interpretation
- Ability to analyse and interpret BIA outputs
- Translate business recovery requirements into technical recovery steps
- Strong grasp of RTO/RPO principles and their technical implications
- Ability to challenge unrealistic recovery objectives based on technical constraints
Runbook Design & Technical Documentation
- Proven experience creating operational runbooks from scratch
- Strong technical writing skills, producing:
  - Clear, structured, unambiguous instructions
  - Explicit pass/fail and go/no-go decision points
- Ability to define standards, templates, and quality guides
- Experience producing documentation suitable for operational execution under pressure
Systems & Application Architecture Understanding
- Deep understanding of:
  - Application architectures (monolith, microservices, SaaS, COTS)
  - Infrastructure components (compute, storage, network)
  - Cloud and hybrid environments
- Ability to identify dependencies, sequencing, and failure points
- Strong skills in dependency mapping and recovery order definition
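
As an illustration of dependency mapping and recovery order definition, the following is a minimal sketch rather than a prescribed method. It assumes a hypothetical dependency map (the service names are invented for the example) and uses Python's standard `graphlib` module to group services into recovery waves, where each wave depends only on services recovered in earlier waves.

```python
from graphlib import TopologicalSorter, CycleError

# Hypothetical dependency map (illustrative names only):
# each service lists the services that must be recovered before it.
DEPENDENCIES = {
    "identity": set(),
    "network": set(),
    "database": {"network", "identity"},
    "app-server": {"database", "identity"},
    "web-frontend": {"app-server"},
}


def recovery_waves(dependencies):
    """Group services into waves; a wave depends only on earlier waves."""
    sorter = TopologicalSorter(dependencies)
    sorter.prepare()  # raises CycleError if the map contains a circular dependency
    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # services whose prerequisites are all done
        waves.append(ready)
        sorter.done(*ready)
    return waves


if __name__ == "__main__":
    for number, wave in enumerate(recovery_waves(DEPENDENCIES), start=1):
        print(f"Wave {number}: {', '.join(wave)}")
```

Services within the same wave have no dependencies on one another in this map, so they are candidates for parallel recovery. A circular dependency surfaces as a `CycleError`, which is a useful prompt to revisit the mapping with the relevant SMEs.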
Cloud & Platform Knowledge
- Experience with cloud recovery patterns (backup/restore, rebuild, redeploy, rehydrate)
- Understanding of:
  - Cloud native services (IaaS, PaaS, SaaS recovery considerations)
  - Identity, networking, and access dependencies during recovery
- Ability to vet and validate cloud DR solutions for feasibility
Testing, Validation & Assurance
- Experience designing and leading:
  - Table-top exercises
  - Runbook walk-throughs
- Strong ability to:
  - Validate recovery logic and sequencing
  - Identify gaps, risks, and unrealistic steps
- Comfort owning issue identification, refinement, and resolution tracking
Stakeholder Engagement & Collaboration
- Strong ability to work with:
  - Application Support SMEs
  - Infrastructure and platform teams
  - Solution Architects and business stakeholders
- Skilled at eliciting undocumented system knowledge
- Ability to gain operational buy-in for recovery approaches
Analytical & Problem-Solving Skills
- Structured, methodical approach to problem analysis
- Ability to design recovery under incomplete or imperfect information
- Strong judgement to identify:
  - Stop points
  - Escalation triggers
  - Risk acceptance scenarios
Governance, Standards & Assurance Mindset
- Comfortable working within:
  - DR standards and patterns (e.g. CIP-aligned frameworks)
  - Quality gates and assurance criteria
- Ability to define “what good looks like” and assess compliance
- Strong sense of accountability for technical correctness
Communication & Leadership Skills
- Clear, confident communication of technical recovery designs
- Ability to explain complex recovery concepts in practical terms
- Comfortable leading recovery discussions and testing sessions
- Strong documentation and handover discipline