vSphere Design Considerations

This is the VMware vSphere Design Deep-Dive. I have aggregated all of the design considerations I could find that need to be assessed in a VMware vSphere architecture design. Brevity and bullet points keep the information concise and readable. If you are going down the VMware vSAN route, take the vSAN design considerations into account in the physical design as well.

List of articles in my VCDX Deep-Dive series (more than 80 posts)

I have separated the design decisions into the areas specified by the VCDX6-DCV blueprint.

Business Goals/Problems

  • What are the business goals of the solution?
  • What are the business problems to be solved?

Customer Use-Cases

  • What are the customer use-cases for the solution?

Requirements/Constraints/Assumptions

  • What are the requirements, constraints and assumptions of the solution?

Risks

  • What are the risks of the solution and how have you mitigated them?  Are there specific implementation tasks, tests or procedures that can be referenced?

A. Virtual Data Center Management

Logical Design Decisions

  • Number of Pooled Compute, Network and Storage resources?
  • What services are you delivering?
  • Required availability levels of virtualisation management systems?
  • 3rd party integrations: IT Service Management, Infrastructure Management systems, Enterprise services (DNS, LDAP, NTP, PKI, Syslog, SNMP Traps), Vendor Data collection?
  • Advanced Operations?
  • Hypervisor Workload Protection mechanisms?
  • Hypervisor Workload Resource Balancing mechanisms?

Physical Design Decisions

  • Hypervisor: ESXi and which version?
  • Dedicated Management Cluster?
  • Standalone or Linked-Mode vCenters?
  • vCenter Server version and installation type?
  • vCenter Server database?
  • vCenter Server protection mechanism?
  • PSC design? Single SSO domain?
  • PKI design? SSL requirements?
  • vSphere components that will be used?
  • Host profiles?
  • Update management of ESXi, VM Hardware version and VMware Tools?
  • Antivirus integration via vShield/NSX?
  • vRealize Suite being used for advanced operations and cloud management?
  • vRO for workflows? PowerCLI for scripts?
  • Enterprise Management solution to integrate with?
  • Automated vendor support mechanisms? How will VMware Skyline be used?
  • Any Service Desk or Change Management requirements that must be met?
  • What are the vSphere HA and vSphere DRS requirements?
  • If using HCI, make sure you understand any mandatory vendor HCI configuration settings for vSphere.
  • DNS and NTP integration?
  • Role Based Access Control and LDAP integration?
  • Which vCenter Server User Interface for Administration and Operations?
  • What vSphere licensing is required?
  • 3rd party software licensing considerations? Per physical socket/core or vCPU? DRS VM-Host rules required?

B. Compute

Logical Design Decisions

  • Traditional Monolithic Compute, Server-Side Flash Cache Acceleration with legacy infrastructure, Converged Infrastructure or Hyper-Converged Infrastructure? This choice must align with the Storage section.
  • Minimum number of Hypervisor Hosts per Cluster?
  • Host sizing: Scale Up or Scale Out?
  • Homogeneous or Heterogeneous nodes?
  • Number of Sockets per Host?
  • Host Spanning for Failure Domains?
  • Required CPU Capacity?
  • Required Memory Capacity?
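
The CPU and memory capacity questions above usually reduce to a host count. The following is a minimal sketch of that arithmetic, not a sizing tool: the function name, the 80% utilisation ceiling and the single spare host for HA are assumptions chosen for illustration, and the demand figures in the usage line are invented.

```python
import math


def hosts_required(cpu_demand_ghz, mem_demand_gb, host_cpu_ghz, host_mem_gb,
                   max_utilisation=0.8, ha_spare_hosts=1):
    """Return the number of hosts that covers CPU and memory demand.

    Assumptions (illustrative only): every host is identical, each host
    is loaded to at most `max_utilisation`, and `ha_spare_hosts` extra
    hosts are added as failover capacity (N+1 by default).
    """
    usable_cpu_ghz = host_cpu_ghz * max_utilisation
    usable_mem_gb = host_mem_gb * max_utilisation
    hosts_for_cpu = math.ceil(cpu_demand_ghz / usable_cpu_ghz)
    hosts_for_mem = math.ceil(mem_demand_gb / usable_mem_gb)
    # The scarcer resource drives the host count.
    return max(hosts_for_cpu, hosts_for_mem) + ha_spare_hosts


# Hypothetical workload: 400 GHz and 3000 GB on hosts with
# 2 sockets x 16 cores x 2.6 GHz (83.2 GHz) and 512 GB of RAM.
print(hosts_required(400, 3000, 83.2, 512))
```

In this hypothetical case memory, not CPU, determines the cluster size, which is worth knowing before deciding between Scale Up and Scale Out.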

Physical Design Decisions

  • Server vendor?
  • Processor type?
  • CPU Features: VT-x supported, Hyper-threading, Turbo Boost, NUMA enabled?
  • Server Hardware and Configuration?
  • Number of CPU sockets per node?
  • Model of Processor, number of cores and GHz per core?
  • GPU required?
  • Host locations?
  • Single Rack, Multi-Rack with striping?
  • Cluster Availability requirements?
  • Align compute availability with storage availability?
  • Future expansion?

C. Storage

Logical Design Decisions

  • Traditional Monolithic Storage, Server-Side Flash Cache Acceleration with legacy infrastructure, Converged Infrastructure or Hyper-Converged Infrastructure? This choice must align with the Compute section.
  • Block-based or IP-based Storage Access?
  • Automated storage management?
  • RDM devices allowed?
  • Hypervisor boot method? DAS, LUN or PXE?
  • Thin or Thick provisioning for Back-end and VMs?
  • Required storage resources (performance and capacity)?
  • Storage replication?

Physical Design Decisions

  • Storage vendor?
  • Usable Storage Calculation, considering Storage Pools, Replication Factor, Usable Capacity and Usable Performance?
  • Number of SSD and HDD drives per storage system?
  • What about Free-Space Reservations, Deduplication, Compression, Erasure Coding and Storage APIs?
  • Active Working Set required?
  • Storage Auto-Tiering thresholds?
  • Self-Encrypting Disks?  KMS?
  • Storage network?
  • ESXi host boot mechanism?
  • VM DirectPath I/O and SR-IOV?
  • Datastores per cluster?
  • Storage DRS and SIOC being used?
  • Different VMDK shares being used?
  • VAAI being used?
  • VASA and VM storage profiles? Where VASA is not supported, VM storage profiles can be configured manually, for example for a multi-container cluster with different settings.
  • Asynchronous DR, Metro or Synchronous DR required? (revisited in the Data Protection and BC/DR sections)
  • Future expansion?
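
The usable storage calculation mentioned above can be sketched as follows. This is a deliberately simplified model under stated assumptions: capacity is divided by the Replication Factor after holding back a free-space reservation, and deduplication, compression, erasure coding and vendor-specific overheads are ignored. The function name, defaults and drive sizes are hypothetical.

```python
def usable_capacity_tb(nodes, drives_per_node, drive_tb,
                       replication_factor=2, free_space_reserve=0.2):
    """Estimate usable capacity in TB for a replicated storage pool.

    Assumptions (illustrative only): identical drives in every node,
    `free_space_reserve` is the fraction of raw capacity kept free,
    and each block is stored `replication_factor` times.
    """
    if replication_factor < 1:
        raise ValueError("replication_factor must be at least 1")
    raw_tb = nodes * drives_per_node * drive_tb
    return raw_tb * (1 - free_space_reserve) / replication_factor


# Hypothetical pool: 8 nodes, 6 x 1.92 TB drives each, RF2, 20% reserved.
print(round(usable_capacity_tb(8, 6, 1.92), 2))
```

A real design would also model usable performance and the vendor's own sizing guidance; this only shows how quickly raw capacity shrinks once replication and reservations are applied.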

D. Network

Logical Design Decisions

  • Legacy 3-Tier Switch, Collapsed Core or Clos-type Leaf/Spine?
  • Clustered Physical or Standalone EoR/ToR Switches?
  • Stretched or Per Rack VLANs?
  • Functional traffic types separated with vSwitches or VLANs?
  • Jumbo Frames?
  • Quality of Service?
  • Load Balancing?
  • IP version?
  • Inter-Data Center links, including RTT?
  • Required Network Capacity?
  • Single vNIC or Multi vNIC VMs allowed?

Physical Design Decisions

  • Clos-type Leaf/Spine vendor selection for large installations? Or legacy Core/Agg/Dist/Access switch infrastructure?
  • Blocking or non-blocking Data Center switch fabric?
  • If blocking, what is the over-subscription ratio?
  • What is the traffic path for North/South and East/West traffic?
  • Where are the Layer 3 gateways for each IP Subnet?
  • Any Dynamic Routing requirements?
  • Is Multicast required?
  • End-to-End Jumbo Frames?
  • Host interfaces: 1GbE and/or 10GbE? How many per node?
  • LAGs or unbonded host interfaces?
  • Management overlay required for KVM and iLO/iDRAC/IPMI?
  • Physical LAN Performance?
  • Host interface connectivity matrix?
  • Metro Ethernet required between Data Centers?
  • QoS, Network Control and vSphere Network I/O Control?
  • Edge QoS enforced or End-to-End QoS?
  • vSphere NIOC System and User-Defined Network Resource Pools?
  • Multi-NIC vMotion?
  • VLAN Pruning?
  • Spanning Tree considerations?
  • VM DirectPath I/O and SR-IOV?
  • TCP Offload enabled?
  • VSS or VDS?
  • Separate vSwitches per Cluster or shared?
  • Teaming and Load Balancing?
  • VMkernel ports?
  • Portgroups?
  • VMware NSX-v required?
  • Future Expansion?
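
For the blocking versus non-blocking question above, the over-subscription ratio of a leaf switch is commonly taken as host-facing bandwidth divided by uplink bandwidth. A minimal sketch, assuming that definition; the function name and port counts are hypothetical.

```python
def oversubscription_ratio(downlink_ports, downlink_gbps,
                           uplink_ports, uplink_gbps):
    """Return host-facing bandwidth divided by uplink bandwidth.

    A result of 1.0 or less means the switch is non-blocking for
    traffic leaving the rack; 3.0 means a 3:1 over-subscription.
    """
    uplink_total = uplink_ports * uplink_gbps
    if uplink_total <= 0:
        raise ValueError("uplink bandwidth must be greater than zero")
    return (downlink_ports * downlink_gbps) / uplink_total


# Hypothetical leaf: 48 x 10GbE host ports, 4 x 40GbE uplinks to the spine.
ratio = oversubscription_ratio(48, 10, 4, 40)
print(f"{ratio:g}:1")
```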

E. Data Protection

Logical Design Decisions

  • VM Image Backup Frequency?
  • Application and Database Consistent Backup Frequency?
  • Backup Restore Times?
  • Physical Separation of Operational Data and Backup Data?
  • Required Backup Resources?
  • Required Backup and Restore Performance?

Physical Design Decisions

  • VADP used?
  • Backup/Recovery solution?
  • Backup/Restore mechanism?
  • VM-Centric Snapshots?
  • Async DR Replication of VM-Centric Snapshots to remote cluster/public cloud?
  • Backup frequency?
  • Retention period?
  • Backup capacity and performance?
  • Fast restore of management cluster direct to host?
  • Future expansion?
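
Backup frequency, retention period, capacity and restore performance are linked, and a rough model makes the trade-off visible. The sketch below assumes one full copy plus one incremental per day for the whole retention period, with no deduplication or compression, and decimal units (1 TB = 1,000,000 MB). All names and figures are hypothetical.

```python
def backup_capacity_tb(protected_tb, daily_change_rate, retention_days):
    """Estimate backup capacity: one full copy plus daily incrementals.

    Assumption (illustrative only): each incremental is
    `daily_change_rate` times the protected data set.
    """
    return protected_tb * (1 + daily_change_rate * retention_days)


def restore_hours(restore_tb, throughput_mb_per_s):
    """Estimate restore time in hours at a sustained throughput."""
    if throughput_mb_per_s <= 0:
        raise ValueError("throughput must be greater than zero")
    seconds = restore_tb * 1_000_000 / throughput_mb_per_s
    return seconds / 3600


# Hypothetical: 50 TB protected, 3% daily change, 30 days retained,
# and a 2 TB restore at a sustained 500 MB/s.
print(round(backup_capacity_tb(50, 0.03, 30), 1))
print(round(restore_hours(2, 500), 2))
```

The restore estimate can then be compared against the required Backup Restore Times from the logical design.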

F. Virtual Machine

Logical Design Decisions

  • Standard VM T-shirt sizes?
  • VM CPU and RAM management mechanisms used?
  • Location of VM files?
  • Guest OS standardisation?
  • 64-bit and 32-bit?
  • Templates used?

Physical Design Decisions

  • Standard VMs of what size?
  • vApps and Resource Pools?
  • VM files on shared storage?
  • Standard vDisk setups per VM?
  • Thin provisioned vDisks?
  • Nutanix or vSphere Snapshots allowed?
  • CBT enabled?
  • 64-bit/32-bit Guest OS versions?
  • vSCSI adapters?
  • vNIC adapters?
  • VM Hardware version?
  • VMware Tools installed and version?
  • VM Options?
  • VM Templates?
  • VM Template Repository?
  • Mission-Critical/Business-Critical Application considerations?

G. Security

Logical Design Decisions

  • Zones of Trust?
  • Defence-in-Depth?
  • Multi-Vendor?
  • Physical separation requirements?
  • Compliance standards?
  • Virtualisation security requirements?
  • Required Network Security Capacity?

Physical Design Decisions

  • Physical and Virtual Network Zoning?
  • Application-level, Network-level Firewalls?
  • IDS and IPS?
  • SSL and IPsec VPNs?
  • Unified Threat Management?
  • Vendor selection?
  • VMware NSX-v/T required?
  • Anti-Virus? Endpoint Protection?
  • Network Security Performance?
  • Security Information & Event Management (SIEM)?
  • Public Key Infrastructure (PKI)?
  • vSphere Cluster security? STIG?
  • ESXi host security?
  • Network security?
  • Storage security?
  • Backup security?
  • VM security?
  • Future Expansion?

H. BC/DR

Logical Design Decisions

  • Protection Mechanisms?
  • Manual or Automated Run-books?
  • RPO, RTO, WRT and MTD of Mission-Critical, Business-Critical and Non-Critical applications?
  • Global Site Load Balancers?
  • DNS TTL for clients?

Physical Design Decisions

  • DR Automation solution?
  • VMware SRM?
  • GSLB solution?
  • Internal and External DNS servers?
  • Metro or Synchronous DR to remote clusters?
  • Multi-Site Application, Database or Message Queue clustering/replication?
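
The RPO, RTO, WRT and MTD targets from the logical design can be sanity-checked against each other and against the chosen replication schedule. The sketch below assumes the commonly used relationship that RTO plus WRT must fit within MTD, and that the replication interval must not exceed the RPO. The class, function and tier values are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class RecoveryTargets:
    """Recovery targets for one application tier, all in minutes."""
    name: str
    rpo_min: float
    rto_min: float
    wrt_min: float
    mtd_min: float


def check_targets(targets, replication_interval_min):
    """Return a list of human-readable problems; empty means consistent."""
    issues = []
    if targets.rto_min + targets.wrt_min > targets.mtd_min:
        issues.append(f"{targets.name}: RTO + WRT exceeds MTD")
    if replication_interval_min > targets.rpo_min:
        issues.append(f"{targets.name}: replication interval exceeds RPO")
    return issues


# Hypothetical tier: RPO 15 min, RTO 60 min, WRT 30 min, MTD 120 min.
tier = RecoveryTargets("Mission-Critical", 15, 60, 30, 120)
print(check_targets(tier, replication_interval_min=5))
print(check_targets(tier, replication_interval_min=60))
```

With a 5-minute replication interval the tier is consistent; an hourly interval would break the 15-minute RPO, which is the kind of mismatch that decides between Asynchronous and Metro or Synchronous DR.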

Published by

vcdx133

Chief Enterprise Architect and Strategist, 4xVCDX#133, NPX#8, DECM-EA.
