vRA Design Considerations

This is the VMware vRealize Automation (vRA) Design Deep-Dive. I have aggregated all of the design considerations that Gregg Robertson (VCDX#205), Andrea Siviero (VCDX#240) and I needed to answer for our VMware vRealize Automation architecture design. Brevity and bullet points are used to keep the information concise and readable.

List of articles in my VCDX Deep-Dive series (more than 80 posts)

I have separated the design decisions into the areas specified by the VCDX7-CMA blueprint. Be aware that VCDX7-CMA allows for vCD, vRA and vIO, but the design logic of this article follows a VMware SDDC solution selection (vRealize Suite, vSphere, NSX-v).

Business Goals/Problems

  • What are the business goals of the solution?
  • What are the business problems to be solved?

Customer Use-Cases

  • What are the customer use-cases for the solution?

Requirements/Constraints/Assumptions

  • What are the requirements, constraints and assumptions of the solution?

Risks

  • What are the risks of the solution and how have you mitigated them?  Are there specific implementation tasks, tests or procedures that can be referenced?

A. Cloud Management Infrastructure – Management Components

Logical Design Decisions

  • Private Cloud, Public Cloud or Hybrid Cloud?
  • Unified Cloud Management Platform with SDDC?
  • Single or Multiple Sites?
  • Number of Pooled Compute, Network and Storage resources?
  • Are the Management, Control and Data Planes separated?
  • Is the Cloud Management infrastructure highly available?  Across all sites?
  • IT Service Management will be a separate layer?
  • Service Catalogue resides in IT Service Management Layer or Cloud Management Platform?
  • Centralised Logging and Advanced Operations?

Physical Design Decisions

  • VMware vCloud Director, VMware vRealize Automation, VMware vSphere Integrated OpenStack or another CMP vendor?  What version of CMP software?
  • CMP integration with other SDDC components (server virtualisation, network virtualisation, storage virtualisation)?  What versions of vSphere and NSX-v?
  • Deployment location of Management and Control Plane infrastructure?  Dedicated Management Cluster?  Dedicated Control Plane Cluster?  Relationship to each site?
  • What APIs will be used?  What Orchestration Engine will be used?  What version of Orchestration software?
  • vRA Sizing?  Standalone or Distributed?
  • vRealize Identity Appliance or vCenter SSO?  Availability choices?
  • vRA RBAC? LDAP or Local Accounts?  Who will hold each role (System Administrator, IaaS Administrator, Fabric Administrator, Tenant Administrator, Service Architect, Business Group Manager, Support User, Business User, Approval Administrator, Approver, etc.)?
  • vRA deployment method, SSL termination and SSL certificate store? Load balancers?
  • vRA Appliance Database?
  • vRA IaaS Web deployment method, SSL termination and SSL certificate store? Load balancers?
  • vRA IaaS Model Manager deployment method, SSL termination and SSL certificate store? Load balancers?
  • vRA IaaS DEM Orchestrator deployment location?
  • vRA IaaS DEM Worker site placement and quantity?
  • vRA IaaS Agent Servers deployment and mapping to vCenter Servers?
  • vRA Endpoints? (vSphere selected, KVM and Hyper-V dropped)
  • SMTP Integration?
  • vRO endpoints?
  • vRO Appliances or VMs?  SSL certificate store?
  • vRO Database?
  • vRO highly available? vSphere DRS and Storage DRS considerations?
  • vRO Load Balancing?
  • Branding and Corporate logos required?  Copyright notices and legal policy required?
  • Single or Multi-Tenant Fabric Groups?
  • Default Tenant usage?
  • Approval Workflows?
  • Virtual Machine infrastructure?
  • Single VM and Multi-Machine Blueprints?
  • Tenant default reservations?
  • Machine blueprint reservations?
  • CPU/RAM reservations?
  • Storage/Network reservations?
  • Build Profiles? Global Build Profile?
  • Tenant specific blueprints?
  • Machine Prefixes?
  • Business Groups?
  • Configuration Management?
  • Software Update Management?
  • Showback or Chargeback?
  • Network Addressing of vRA blueprints?
  • Advanced Operations?
  • Centralised Logging?
  • Load Balancing of vRA management components?
  • vCenter Server instances?
  • VMware Availability & vSphere DRS?

B. Cloud Management & Resource Infrastructure – Compute Layer

Logical Design Decisions

  • Traditional Monolithic Compute, Server-Side Flash Cache Acceleration with legacy infrastructure, Converged Infrastructure or Hyper-Converged Infrastructure?
  • Minimum number of Hypervisor Hosts per Cluster?
  • Host sizing: Scale Up or Scale Out?
  • Homogeneous or Heterogeneous nodes?
  • Number of Sockets per Host?
  • Host Spanning for Failure Domains?
  • Required CPU Capacity?
  • Required Memory Capacity?
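
The capacity questions above can be turned into a first-pass host count. The following is a minimal sketch, not a sizing tool: the function name, the 80% utilisation ceiling, the single spare host (N+1) and the example figures are all assumptions made for illustration, and it ignores CPU overcommit, memory overhead and NUMA effects.

```python
import math


def hosts_required(required_ghz, required_ram_gb, host_ghz, host_ram_gb,
                   max_utilisation=0.8, spare_hosts=1):
    """Return the host count that satisfies both CPU and RAM demand.

    max_utilisation and spare_hosts are illustrative assumptions:
    keep each host below 80% busy and add one host for failover (N+1).
    """
    if not 0 < max_utilisation <= 1:
        raise ValueError("max_utilisation must be in the range (0, 1]")
    if host_ghz <= 0 or host_ram_gb <= 0:
        raise ValueError("host capacity must be positive")

    # Size for each resource separately, then take the larger requirement.
    cpu_hosts = math.ceil(required_ghz / (host_ghz * max_utilisation))
    ram_hosts = math.ceil(required_ram_gb / (host_ram_gb * max_utilisation))
    return max(cpu_hosts, ram_hosts) + spare_hosts


if __name__ == "__main__":
    # Hypothetical host: 2 sockets x 12 cores x 2.6 GHz, 512 GB RAM.
    per_host_ghz = 2 * 12 * 2.6
    print(hosts_required(required_ghz=400, required_ram_gb=6000,
                         host_ghz=per_host_ghz, host_ram_gb=512))
```

Running the same function with a larger and a smaller host specification is a quick way to compare Scale Up against Scale Out for the same demand.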

Physical Design Decisions

  • Compute Vendor?
  • Processor type?
  • Processor Features required?
  • Number of hosts per cluster?
  • Number of CPU sockets per node?
  • Model of Processor, number of cores and GHz per core?
  • GPU required?
  • Host locations?
  • Single Rack, Multi-Rack with striping?
  • Cluster Availability requirements?
  • Failure Domains?
  • Align compute availability with storage availability?
  • Future expansion?

C. Cloud Management & Resource Infrastructure – Network Layer

Logical Design Decisions

  • Legacy 3-Tier Switch, Collapsed Core or Clos-type Leaf/Spine?
  • Clustered Physical or Standalone EoR/ToR Switches?
  • Stretched or Per Rack VLANs?
  • Functional traffic types separated with vSwitches or VLANs?
  • Jumbo Frames?
  • Network Virtualisation required?
  • Quality of Service?
  • Load Balancing?
  • IP version?
  • Inter-Data Center links, including Round-Trip-Time?
  • Required Network Capacity?
  • Single vNIC or Multi vNIC VMs allowed?

Physical Design Decisions

  • Legacy or Clos-type Leaf/Spine vendor selection?
  • Blocking or non-blocking Data Center switch fabric?
  • If blocking, what is the over-subscription ratio?
  • What is the traffic path for North/South and East/West traffic?
  • Where are the Layer 3 gateways for each IP Subnet?
  • Any Dynamic Routing requirements?
  • Is Multicast required?
  • End-to-End Jumbo Frames?
  • Host interfaces: 1GbE and/or 10GbE/25GbE/40GbE? How many per node?
  • LAGs or unbonded host interfaces?
  • Management overlay required for KVM and IPMI?
  • Physical LAN Performance?
  • Host interface connectivity matrix?
  • Metro Ethernet required between Data Centers?
  • QoS, Network Control and vSphere Network I/O Control?
  • Edge QoS enforced or End-to-End QoS?
  • vSphere NIOC System and User-Defined Network Resource Pools?
  • Multi-NIC vMotion?
  • VLAN Pruning?
  • Spanning Tree considerations?
  • VM DirectPath I/O and SR-IOV?
  • TCP Offload enabled?
  • vSS or vDS?
  • Separate vSwitches per Cluster or shared?
  • Teaming and Load Balancing?
  • VMkernel ports?
  • Portgroups?
  • VMware NSX-v/NSX-T required?
  • Future Expansion?
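
For the blocking versus non-blocking question above, the over-subscription ratio of a leaf (ToR) switch is commonly expressed as host-facing bandwidth divided by uplink bandwidth. A minimal sketch follows; the function names and port counts are hypothetical, and whole-number Gbps values are assumed.

```python
from fractions import Fraction


def oversubscription_ratio(downlink_ports, downlink_gbps,
                           uplink_ports, uplink_gbps):
    """Return host-facing bandwidth over uplink bandwidth as a Fraction.

    Arguments are assumed to be integers (port counts and whole Gbps).
    """
    downstream = downlink_ports * downlink_gbps
    upstream = uplink_ports * uplink_gbps
    if upstream <= 0:
        raise ValueError("uplink bandwidth must be positive")
    return Fraction(downstream, upstream)


def describe(ratio):
    """Format the ratio as 'N:M' and label it blocking or non-blocking."""
    label = f"{ratio.numerator}:{ratio.denominator}"
    kind = "non-blocking" if ratio <= 1 else "blocking"
    return f"{label} ({kind})"


if __name__ == "__main__":
    # Hypothetical leaf: 48 x 10GbE host ports, 4 x 40GbE uplinks to the spine.
    print(describe(oversubscription_ratio(48, 10, 4, 40)))
```

This only covers a single leaf; the end-to-end figure also depends on the spine layer and on how much East/West traffic stays within the rack.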

D. Cloud Management & Resource Infrastructure – Storage Layer

Logical Design Decisions

  • Traditional Monolithic Storage, Server-Side Flash Cache Acceleration with legacy infrastructure, Converged Infrastructure or Hyper-Converged Infrastructure?
  • Block-based or IP-based Storage Access?
  • Homogeneous or Heterogeneous storage design?
  • Storage Policy control?
  • Automated storage management?
  • RDM devices allowed?
  • Hypervisor boot method? DAS, LUN or PXE?
  • Thin or Thick provisioning for Back-end storage and VMs?
  • Required storage resources (performance and capacity)?
  • Storage replication?

Physical Design Decisions

  • Storage Vendor?
  • Usable Storage Calculation, considering Storage Pools, RAID/FTT/Replication Factor, Usable Capacity and Usable Performance?
  • Number of SSD and HDD drives per array/node?
  • Also consider Free-Space Reservations, Deduplication, Compression, Erasure Coding and Storage APIs.
  • Self-Encrypting Disks?  If yes, consider the KMS requirements.
  • ESXi host boot method?
  • Tiers of storage and Auto-Tiering thresholds?
  • VM DirectPath I/O and SR-IOV?
  • Datastores per cluster?
  • Storage DRS and SIOC being used?
  • Different VMDK shares being used?
  • VAAI being used?
  • VASA and VM storage profiles?
  • Asynchronous DR, Metro or Synchronous DR required?
  • Future expansion?
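
The usable storage calculation mentioned above can be sketched as a chain of simple factors. This is a rough model under stated assumptions: the function name, the 25% free-space reserve and the example numbers are illustrative only, and real figures depend on the vendor's RAID/FTT/Replication Factor implementation and on how well the data actually reduces.

```python
def usable_capacity_tb(raw_tb, protection_overhead,
                       free_space_reserve=0.25, data_reduction_ratio=1.0):
    """Estimate usable capacity in TB from raw capacity.

    protection_overhead: raw capacity consumed per unit of stored data,
        e.g. 2.0 for a two-copy mirror (replication factor 2), or
        4/3 for a 3+1 parity/erasure-coding layout.
    free_space_reserve: fraction of capacity kept free (assumed 25% here).
    data_reduction_ratio: combined deduplication/compression ratio,
        1.0 meaning no reduction.
    """
    if protection_overhead < 1:
        raise ValueError("protection_overhead must be >= 1")
    if not 0 <= free_space_reserve < 1:
        raise ValueError("free_space_reserve must be in the range [0, 1)")

    after_protection = raw_tb / protection_overhead
    after_reserve = after_protection * (1 - free_space_reserve)
    return after_reserve * data_reduction_ratio


if __name__ == "__main__":
    # Hypothetical: 100 TB raw, two-copy mirror, 25% reserve, no reduction.
    print(usable_capacity_tb(100, 2.0))
    # Same raw capacity with a 3+1 erasure-coded layout.
    print(usable_capacity_tb(100, 4 / 3))
```

Usable performance needs its own calculation (IOPS and latency per tier); capacity alone does not confirm that the design meets the storage requirements.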

E. Cloud Management & Resource Infrastructure – Data Protection

Logical Design Decisions

  • DPaaS required for blueprints?
  • VM Image Backup Frequency?
  • Application and Database Consistent Backup Frequency?
  • Backup Restore Times?
  • Physical Separation of Operational Data and Backup Data?
  • Required Backup Resources?
  • Required Backup and Restore Performance?

Physical Design Decisions

  • VADP used?
  • Backup/Recovery solution?
  • Backup/Restore mechanism?
  • VM-Centric Snapshots?
  • Async DR Replication of VM Snapshots to remote site/cloud?
  • Backup frequency?
  • Retention period?
  • Backup capacity and performance?
  • Fast restore of management cluster direct to host?
  • Future expansion?
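
Backup frequency, retention period and capacity are linked, so it helps to estimate them together. The sketch below assumes a simple scheme of retained full copies plus daily incrementals, with a flat daily change rate; the function names, the scheme and every example number are assumptions, and an actual backup product will behave differently.

```python
def backup_capacity_tb(protected_tb, daily_change_rate, retention_days,
                       full_copies=1, reduction_ratio=1.0):
    """Estimate backup repository capacity in TB.

    Assumes full_copies retained fulls plus one incremental per day for
    retention_days, each sized at protected_tb * daily_change_rate.
    reduction_ratio is the assumed deduplication/compression ratio.
    """
    if reduction_ratio <= 0:
        raise ValueError("reduction_ratio must be positive")
    fulls = protected_tb * full_copies
    incrementals = protected_tb * daily_change_rate * retention_days
    return (fulls + incrementals) / reduction_ratio


def restore_hours(data_tb, throughput_mb_per_s):
    """Estimate restore time in hours at a sustained throughput (decimal units)."""
    if throughput_mb_per_s <= 0:
        raise ValueError("throughput must be positive")
    seconds = data_tb * 1_000_000 / throughput_mb_per_s
    return seconds / 3600


if __name__ == "__main__":
    # Hypothetical: 50 TB protected, 5% daily change, 30-day retention, 2:1 reduction.
    print(backup_capacity_tb(50, 0.05, 30, reduction_ratio=2.0))
    # Hypothetical: restoring 10 TB at a sustained 500 MB/s.
    print(round(restore_hours(10, 500), 1))
```

Comparing the restore estimate against the agreed Backup Restore Times shows early on whether the chosen mechanism can meet them.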

F. Cloud Management & Resource Infrastructure – Virtual Machines

Note: if you have containers in your design, do the same for containers.

Logical Design Decisions

  • Standard VM T-shirt sizes for blueprints? What about Multi-Machine Blueprints (MMBs)? What about CMP infrastructure?
  • VM CPU and RAM management mechanisms used?
  • Location of VM files?
  • Guest OS standardisation?
  • 64-bit and 32-bit?
  • Templates used?

Physical Design Decisions

  • Standard VMs of what size?
  • vApps and Resource Pools?
  • VM files on shared storage?
  • Standard vDisk setups per VM?
  • Thin provisioned vDisks?
  • Storage sub-system or vSphere Snapshots allowed?
  • CBT enabled?
  • 64-bit/32-bit Guest OS versions?
  • vSCSI adapters?
  • vNIC adapters?
  • VM Hardware version?
  • VMware Tools installed and version?
  • VM Options?
  • VM Templates?
  • VM Template Repository?
  • Mission-Critical/Business-Critical Application considerations?

G. Cloud Management & Resource Infrastructure – Security

Logical Design Decisions

  • Zones of Trust?
  • Defence-in-Depth?
  • Multi-Vendor?
  • Physical separation requirements?
  • Microsegmentation?
  • Compliance standards?
  • Virtualisation security requirements?
  • Required Network Security Capacity?

Physical Design Decisions

  • Physical and Virtual Network Zoning?
  • Application-level, Network-level Firewalls?
  • IDS and IPS?
  • SSL and IPsec VPNs?
  • Unified Threat Management?
  • Vendor selection?
  • VMware NSX-v/NSX-T required?
  • Anti-Virus? Endpoint Protection?
  • Network Security Performance?
  • Security Information & Event Management (SIEM)?
  • Public Key Infrastructure (PKI)?
  • vRA security? RBAC?
  • ESXi host security?
  • Network security?
  • Storage security?
  • Backup security?
  • VM security?
  • Future Expansion?

H. Cloud Management & Resource Infrastructure – BC/DR

Logical Design Decisions

  • DRaaS required for blueprints?
  • Protection Mechanisms?
  • Manual or Automated Run-books?
  • RPO, RTO, Work Recovery Time (WRT) and Maximum Tolerable Downtime (MTD) of Mission-Critical, Business-Critical and Non-Critical applications?
  • Global Site Load Balancers?
  • DNS TTL for clients?
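
When the recovery targets above are written down per application tier, one sanity check is that recovery time plus work recovery time fits within the maximum tolerable downtime (RTO + WRT ≤ MTD). The following is a minimal sketch of that check; the class name, tier names and minute values are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RecoveryTargets:
    """Recovery targets for one application tier, all in minutes."""
    tier: str
    rpo_minutes: int
    rto_minutes: int
    wrt_minutes: int
    mtd_minutes: int

    def is_consistent(self) -> bool:
        # Systems must be recovered (RTO) and verified/caught up (WRT)
        # before the maximum tolerable downtime (MTD) is exceeded.
        return self.rto_minutes + self.wrt_minutes <= self.mtd_minutes


if __name__ == "__main__":
    # Hypothetical tiers and values, for illustration only.
    tiers = [
        RecoveryTargets("Mission-Critical", 15, 60, 60, 240),
        RecoveryTargets("Business-Critical", 240, 480, 240, 600),
        RecoveryTargets("Non-Critical", 1440, 2880, 720, 4320),
    ]
    for targets in tiers:
        status = "OK" if targets.is_consistent() else "RTO + WRT exceeds MTD"
        print(f"{targets.tier}: {status}")
```

Client-side DNS caching can extend the outage seen by users beyond the infrastructure RTO, so the DNS TTL is worth reviewing alongside these numbers.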

Physical Design Decisions

  • DR Automation solution?
  • VMware SRM?
  • GSLB solution?
  • Internal and External DNS servers?
  • Metro or Synchronous DR to remote clusters?
  • Multi-Site Application, Database or Message Queue clustering/replication?

I. Automation & Extensibility

Logical Design Decisions

  • API Integrations?
  • Orchestration Engines?
  • VM Templates or Application Blueprints?  Multi-machine or Single-machine?
  • Application and Guest OS Orchestration?
  • 3rd Party integrations with existing systems?

Physical Design Decisions

  • XaaS required?
  • Platform-as-a-Service and Software-as-a-Service?
  • Workflow Stubs?
  • Property Dictionary Entries?

Published by

vcdx133

Chief Enterprise Architect and Strategist, 4xVCDX#133, NPX#8, DECM-EA.
