When you start learning about virtualisation as an architect, the first design question you encounter is: “Scale Up or Scale Out?” The basis of the question is quite simple: “Do I buy a small number of hosts with a large amount of resources, or do I buy a large number of hosts with a small amount of resources?” Simply put: “Can I afford to put all of my eggs in one basket?” The answer depends on the business requirements and constraints of your customer.
The Compute design actually has multiple components that need to be considered:
- Rack or Blade?
- Number of Sockets per Host?
- Number of Cores per Socket?
- GHz per Core?
- GB of RAM per NUMA Node (Socket)?
- Power Consumption per Chassis/Host/Socket/Core/GHz/DIMM/NUMA Node?
- Cost per Chassis/Host/Socket/Core/GHz/DIMM/NUMA Node?
- NUMA limits?
- Benefits of Hyper-Threading?
- Failure Domains within the Chassis/Rack/Row/Data Center?
- Hyper-converged computing? – You have to consider network and storage as well
The next questions are: “How do I make the right selection?” and “What tools should I use?”
Initially, answer the following questions:
- What is the workload? (e.g. Choppy or steady? OLTP or OLAP?) – At least two months of performance data should be collected to capture daily, weekly and monthly cycles. Use VMware Capacity Planner.
- What is the budget? – This constraint can directly conflict with the workload requirement.
- Are there Space/Power/Cooling constraints in the Data Center?
- What is the level of protection required? (e.g. Assuming vSphere HA is used – N+1, N+2, etc.)
- What percentage of headroom is required for overhead, future growth and risk mitigation?
The CPU relationship is illustrated in the figure below.
Secondly, use the following formulas to determine the number of hosts required. Treat the equations as a sliding scale: the calculation that yields the largest number of hosts is your magic number. The trick is to select a host configuration for which each computation produces a similar value, so that no single resource is heavily over-provisioned:
- vCPU to pCore ratio – This is commonly used as a rough estimate for how many hosts are required. Generally accepted ratios range from 1:1 to 12:1, depending upon the application vendor. The maximum “vCPUs per Core” of vSphere 5.5 is 32. CPU speed and RAM are not considered here.
- Peak GHz calculation
- Peak pRAM calculation
The formulas are illustrated in the figure below. The “Protection” value is the number of tolerated host failures. You need to build the required headroom into each calculation (e.g. 15% overhead for memory = Peak_pRAM_Per_Cluster × 1.15).
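Because the figure's formulas are not reproduced in the text, the following is only a minimal sketch of how the three calculations could be combined. It assumes each one takes the form ceil(peak demand × (1 + headroom) ÷ capacity per host) + protection. The function names, parameter names and input figures are all hypothetical and chosen for illustration.

```python
import math


def hosts_required(peak_demand, capacity_per_host, headroom=0.0, protection=1):
    """Hosts needed for peak demand plus headroom, plus tolerated host failures.

    Assumed form (not taken from the figure):
    ceil(peak_demand * (1 + headroom) / capacity_per_host) + protection
    """
    if capacity_per_host <= 0:
        raise ValueError("capacity_per_host must be positive")
    working_hosts = math.ceil(peak_demand * (1 + headroom) / capacity_per_host)
    return working_hosts + protection


def size_cluster(total_vcpus, peak_ghz, peak_ram_gb,
                 cores_per_host, ghz_per_core, ram_gb_per_host,
                 vcpus_per_core, headroom=0.15, protection=1):
    """Run the three calculations and keep the largest host count."""
    results = {
        # vCPU to pCore ratio: capacity is cores x the accepted ratio.
        "vcpu_ratio": hosts_required(total_vcpus, cores_per_host * vcpus_per_core,
                                     headroom, protection),
        # Peak GHz: capacity is cores x clock speed per core.
        "peak_ghz": hosts_required(peak_ghz, cores_per_host * ghz_per_core,
                                   headroom, protection),
        # Peak pRAM: capacity is the physical RAM per host.
        "peak_ram": hosts_required(peak_ram_gb, ram_gb_per_host,
                                   headroom, protection),
    }
    results["hosts"] = max(results.values())
    return results


# Made-up workload and host configuration, for illustration only.
sizing = size_cluster(total_vcpus=600, peak_ghz=700, peak_ram_gb=4000,
                      cores_per_host=32, ghz_per_core=2.6, ram_gb_per_host=512,
                      vcpus_per_core=4, headroom=0.15, protection=1)
print(sizing)
# {'vcpu_ratio': 7, 'peak_ghz': 11, 'peak_ram': 10, 'hosts': 11}
```

With these inputs the GHz calculation sets the host count at 11, while the vCPU ratio alone would suggest 7. A gap that wide hints that a host with more or faster cores might bring the three values closer together.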
Pay particular attention to the Configuration Maximums of vSphere; there is no sense designing a 40-node cluster when the configuration maximum is 32, or 7,000 VMs within a cluster when the configuration maximum is 4,000.
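A check like this is easy to automate. The sketch below uses the two cluster limits quoted above as defaults; the function names are hypothetical, and the limits should be replaced with the values from the Configuration Maximums document for the vSphere version being deployed.

```python
import math

# Limits quoted in the text; confirm them for your vSphere version.
MAX_HOSTS_PER_CLUSTER = 32
MAX_VMS_PER_CLUSTER = 4000


def check_cluster_limits(hosts, vms,
                         max_hosts=MAX_HOSTS_PER_CLUSTER,
                         max_vms=MAX_VMS_PER_CLUSTER):
    """Return a list of violations; an empty list means the design fits."""
    problems = []
    if hosts > max_hosts:
        problems.append(f"{hosts} hosts exceeds the cluster maximum of {max_hosts}")
    if vms > max_vms:
        problems.append(f"{vms} VMs exceeds the cluster maximum of {max_vms}")
    return problems


def clusters_needed(hosts, vms,
                    max_hosts=MAX_HOSTS_PER_CLUSTER,
                    max_vms=MAX_VMS_PER_CLUSTER):
    """Smallest number of clusters that keeps both limits satisfied."""
    return max(math.ceil(hosts / max_hosts), math.ceil(vms / max_vms))


for problem in check_cluster_limits(hosts=40, vms=7000):
    print(problem)
print(clusters_needed(hosts=40, vms=7000))  # 2
```

This covers only the raw limits. Splitting a design into several clusters also changes the protection calculation, because each cluster needs its own spare hosts.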
If you are running modern, NUMA-aware operating systems and applications on vSphere 5.x, then NUMA is not an issue. However, if you are virtualising physical servers that are not NUMA-aware, or using a pre-vSphere 5.x release (vNUMA not supported), then you need to ensure that the widest VM fits within the cores of a single socket (with or without HT) and the RAM of a single NUMA node.
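The widest-VM rule can be expressed as a simple check. This is a minimal sketch under two assumptions: one NUMA node per socket, and two logical processors per core when Hyper-Threading is counted. The function and parameter names are hypothetical.

```python
def fits_single_numa_node(vm_vcpus, vm_ram_gb,
                          cores_per_socket, ram_gb_per_node,
                          count_hyperthreads=False):
    """True if the VM fits within one socket's cores and one node's RAM.

    Assumes one NUMA node per socket and, if count_hyperthreads is set,
    two logical processors per physical core.
    """
    logical_cpus = cores_per_socket * (2 if count_hyperthreads else 1)
    return vm_vcpus <= logical_cpus and vm_ram_gb <= ram_gb_per_node


# Widest VM in a made-up estate: 12 vCPUs and 96 GB RAM,
# on hosts with 8 cores per socket and 128 GB RAM per NUMA node.
print(fits_single_numa_node(12, 96, cores_per_socket=8, ram_gb_per_node=128))
# False
print(fits_single_numa_node(12, 96, cores_per_socket=8, ram_gb_per_node=128,
                            count_hyperthreads=True))
# True
```

Whether to count Hyper-Threads is a design decision in its own right. A VM that only fits when they are counted shares physical cores between its vCPUs, so the stricter check is the safer default.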
It is your job to adjust the variables of each calculation and ensure that the final values fit within the Requirements, Constraints, Assumptions and Risks accepted by your customer. If there are conflicts that cannot be resolved, discuss the impacts and risks with your customer.