VCDX Study Plan – BC/DR

Business Continuity and Disaster Recovery will be the weakest point for most VCDX candidates. I know it was for me: I flopped miserably during my first VCDX attempt. So make it your strongest point and become a Disaster Recovery guru.

List of articles in my VCDX Deep-Dive series (more than 70 posts)

You may already have this addressed in your VCDX design. If so, collect all of the Recoverability and Availability requirements with your design decisions and then map these to an A3 diagram to ensure that it all fits together. Run through the scenarios you are protecting against and validate them. You will be surprised at the inconsistencies you will expose.

Here is the list (in no particular order, I typed them as they occurred to me):

Where do your DR requirements come from?
Do you just make them up?
What is a Business Impact Analysis?
Understand the relationship between MTD, WRT, RTO, RPO and Availability
What does five/four/three 9s of availability actually mean?
How will that impact the cost of the solution?
Should I protect every system?
What should I protect against?
How do I protect against Site Failure?
How do I protect against Hardware or Software failure?
How do I protect against Datastore/LUN failure?
How do I protect against the Accidental/Malicious Deletion of Data?
What is a Runbook?
Can I automate it?
Are manual Runbooks a good idea?
How do I automate DR Failover/Failback?
When should I execute Disaster Recovery Drills?
How does Switchover differ from Failover?
Does my Availability calculation include Planned Downtime or just Unplanned?
What availability will the VMware vSphere availability/recovery mechanisms actually give me?
What will they protect against?
VMware vSphere HA – Host-HA, VM-HA, App-HA
VMware SRM/vSphere Replication
VMware vSphere Data Protection
VMware vSphere VADP/CBT
VMware vCenter Server Heartbeat
Storage Replication – Asynchronous and Synchronous
What other mechanisms are there to protect my Customer’s services and data?
Application Clustering
DB replication/protection mechanisms
Do I need a third site for Witness/Quorum?
Multi-site Data Center connectivity – Bandwidth and Latency
Backup times, RECOVERY TIMES
Backup/Recovery – Physical Tape, IP-based
Tape Movement procedures
Global Site Load Balancing
DNS – Intranet and Internet
Load Balancing
How do I implement DR Automation for multi-platform solutions (e.g. vSphere, z/OS and AIX)?
Who are the DR Automation players out there?
Who are the Backup/Recovery experts?
How do I design solutions with the right blend of Disaster Recovery, High Availability and Backup/Recovery?
What are the operational considerations?
When I implement my Disaster Recovery plan, how does it impact my customers?
What will they see?
How long will they be interrupted for?
Do they have to do anything?
Do I have to communicate with them?
How will I communicate with them?
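
Two of the questions above come down to simple arithmetic: what a given number of 9s allows in downtime per year, and whether a set of recovery targets fits inside the MTD. Here is a minimal Python sketch of both. It assumes a 365-day year and the commonly quoted relationship RTO + WRT ≤ MTD; the function names and the example targets (4 h RTO, 2 h WRT, 8 h MTD) are hypothetical and only there for illustration.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes in a non-leap year


def downtime_budget_minutes(nines: int) -> float:
    """Allowed downtime per year, in minutes, for N nines of availability."""
    unavailability = 10.0 ** -nines  # e.g. 3 nines -> 0.001 (99.9% available)
    return MINUTES_PER_YEAR * unavailability


def fits_within_mtd(rto_hours: float, wrt_hours: float, mtd_hours: float) -> bool:
    """True if recovery time (RTO) plus work recovery time (WRT) stays within the MTD."""
    return rto_hours + wrt_hours <= mtd_hours


for nines in (3, 4, 5):
    budget = downtime_budget_minutes(nines)
    print(f"{nines} nines: {budget:.2f} minutes of downtime per year")

# Hypothetical targets: 4 h RTO plus 2 h WRT against an 8 h MTD.
print(fits_within_mtd(rto_hours=4, wrt_hours=2, mtd_hours=8))
```

Three 9s allows roughly 8.76 hours of downtime per year, four 9s about 52.6 minutes, and five 9s about 5.26 minutes. Each extra 9 cuts the budget by a factor of ten, which is why it drives the cost of the solution so sharply.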

Resources you may want to consider:

Validate these scenarios:

My customer has an RPO of 60 minutes. The data being protected is 5TB in size. My design intends to use the existing LTO-2 tape library for recovery. Will I meet the customer’s RPO in the event of data corruption?
My customer has an availability requirement of five 9s for a Tier-1 Business Critical Application. My design uses vSphere HA and vSphere Replication to meet this SLA with a manual runbook for site failover/failback. Is this a major risk to the SLA when the primary site fails?
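
Both scenarios can be sanity-checked with back-of-the-envelope numbers before you defend them. The sketch below is one way to do that, not a definitive model. It assumes the LTO-2 native (uncompressed) peak rate of 40 MB/s per drive, no compression, no tape mount or seek overhead, and decimal units. The drive counts and the 120-minute manual runbook duration are hypothetical values; substitute your own.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes in a non-leap year

LTO2_NATIVE_MB_PER_S = 40.0  # assumed: LTO-2 native peak rate per drive


def restore_hours(data_tb: float, mb_per_s: float, drives: int = 1) -> float:
    """Best-case hours to stream data_tb from tape across parallel drives."""
    data_mb = data_tb * 1_000_000  # decimal units: 1 TB = 1,000,000 MB
    return data_mb / (mb_per_s * drives) / 3600


def availability_after_outage(outage_minutes: float) -> float:
    """Yearly availability if the only downtime is one outage of this length."""
    return 1 - outage_minutes / MINUTES_PER_YEAR


# Scenario 1: restoring 5 TB from LTO-2.
one_drive = restore_hours(5, LTO2_NATIVE_MB_PER_S)
four_drives = restore_hours(5, LTO2_NATIVE_MB_PER_S, drives=4)  # hypothetical library
print(f"1 drive:  {one_drive:.1f} hours")
print(f"4 drives: {four_drives:.1f} hours")

# Scenario 2: five 9s versus a manual site-failover runbook.
five_nines_budget = MINUTES_PER_YEAR * 1e-5
runbook_minutes = 120.0  # hypothetical duration of one manual failover
achieved = availability_after_outage(runbook_minutes)
print(f"Five 9s budget: {five_nines_budget:.2f} minutes per year")
print(f"One {runbook_minutes:.0f}-minute failover: {achieved:.5%} availability")
```

Under these assumptions a single drive needs about 35 hours just to stream the data, and a single two-hour manual failover leaves the year at roughly 99.977% availability. Note that the tape figure is a recovery-time estimate. How much data is lost (the RPO side) depends on how often backups run, which the scenario leaves unstated.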

Published by

vcdx133
