This article builds upon the white papers and blogs that are available (listed below). It took me a long time to understand how these mechanisms work together, since there was no single document (that I could find) describing their integration and correct usage, which left me to experiment in my Pre-Production environment. This post is an attempt to close that knowledge gap.
There are a number of design decisions to consider when using vSphere Storage DRS and Storage I/O Control with EMC’s FAST (VP or DP) on the EMC Symmetrix VMAX. In particular, you need to understand how your design choices relate to the correct system-wide settings within the EMC and VMware products, some of which are well known and some not.
The diagram below provides an overview of the use case.
The use case mechanisms are:
- Thin Provisioned VMDK files: Optimised space utilisation of VMs.
- Storage DRS: Initial Placement and Automatic Datastore space load balancing of Thin Provisioned VMs. Anti-Affinity rules separate Highly Available VMs to separate Datastores.
- Storage I/O Control: Datastore-wide “Noisy Neighbour” congestion protection and per-VM (or per-VMDK) QoS with Resource Pools
- PowerPath/VE: Host to Storage Array Intelligent Multi-Pathing (Target-based load balancing)
- EMC FAST-VP: Sub-LUN Auto-Tiering with over-subscription (or not) of higher tiers of storage
VAAI, VASA and VM Storage Profiles are out of the scope of this blog post.
- Requirement 1: End-to-End QoS for Storage
- Requirement 2: VM Space Optimisation
- Requirement 3: Automated space management
- Constraint 1: EMC Symmetrix VMAX Storage
- Constraint 2: VMware vSphere Enterprise Plus Licences
- Constraint 3: EMC PowerPath/VE Licences
- Assumption 1: At least two HBAs/vHBAs per host
- Assumption 2: Correctly sized, redundant, Enterprise SAN switch network
Design Decision 1: Automation Level: Automated or Manual (for Thin Provisioned VMs)?
This is discussed in the FAST-VP section below.
Design Decision 2: Utilised Space Threshold percentage?
A setting of 75% leaves 25% of headroom per datastore, which provides enough protection against rapid expansion of VMDK files.
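The headroom arithmetic can be sketched in a few lines of Python. This is an illustration only: the function names and the 2 TB datastore size are hypothetical, nothing here calls a vSphere API, and the "above threshold" check is a simplified assumption about when SDRS would start recommending moves.

```python
def space_headroom_gb(capacity_gb: float, threshold_pct: float = 75.0) -> float:
    """Capacity (GB) left above the utilised-space threshold."""
    if not 0 < threshold_pct <= 100:
        raise ValueError("threshold_pct must be in (0, 100]")
    return capacity_gb * (100.0 - threshold_pct) / 100.0


def exceeds_threshold(used_gb: float, capacity_gb: float,
                      threshold_pct: float = 75.0) -> bool:
    """True when used space is above the threshold (simplified assumption)."""
    if capacity_gb <= 0:
        raise ValueError("capacity_gb must be positive")
    return used_gb / capacity_gb * 100.0 > threshold_pct


# Hypothetical 2 TB datastore with the 75% threshold discussed above.
headroom = space_headroom_gb(2048)        # 512.0 GB of growth room
needs_attention = exceeds_threshold(1600, 2048)  # 78.1% used
```

With these hypothetical numbers, a 2 TB datastore keeps 512 GB free for thin-provisioned growth before the threshold is crossed.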
Design Decision 3: I/O Metric enabled?
Should be disabled to prevent Storage DRS and FAST-VP from competing against each other (commonly known and well documented within the industry).
Storage I/O Control
Design Decision 4: Automatic Threshold Detection or Manual Congestion Threshold?
Automatic Threshold Detection should never be used with FAST-VP. This is because the SIOC Injection mechanism will measure a lower latency than is actually true due to the promotion of hot blocks of storage; SIOC will then throttle I/O prematurely across the entire datastore (all connected hosts) when congestion is not actually occurring.
Set the Manual Congestion Threshold for each Datastore in the SDRS Cluster (generally accepted values: 50ms for SATA/NL-SAS, 30ms for 15K FC/SAS, 15ms for SSD).
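As a rough aid for keeping these values consistent across datastores, the mapping can be written down as a small lookup. This is a sketch only: the dictionary keys, function names and the simple "latency above threshold" test are my own illustrative assumptions, not part of any VMware or EMC tool.

```python
# Generally accepted manual congestion thresholds (ms), as quoted above.
CONGESTION_THRESHOLD_MS = {
    "SATA/NL-SAS": 50,
    "15K FC/SAS": 30,
    "SSD": 15,
}


def manual_threshold_ms(tier: str) -> int:
    """Look up the manual congestion threshold for a disk tier."""
    try:
        return CONGESTION_THRESHOLD_MS[tier]
    except KeyError:
        raise ValueError(f"unknown tier: {tier!r}") from None


def is_congested(observed_latency_ms: float, tier: str) -> bool:
    """Simplified model: congestion when latency exceeds the tier threshold."""
    return observed_latency_ms > manual_threshold_ms(tier)
```

For example, an observed latency of 35ms would count as congestion against the 15K FC/SAS value but not against the SATA/NL-SAS value.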
Design Decision 5: Storage Resource Pools (per VM/VMDK)?
You can set per VM/VMDK Share values that SIOC will enforce during congestion.
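To see what Share values mean in practice, here is a minimal sketch of proportional allocation. It is a simplified model, not how SIOC is implemented internally (SIOC adjusts device queue depths per host); the VM names are hypothetical, and the values 2000 and 1000 correspond to the vSphere "High" and "Normal" disk share presets.

```python
from typing import Dict


def share_fractions(shares: Dict[str, int]) -> Dict[str, float]:
    """Fraction of datastore I/O each VM is entitled to during congestion."""
    total = sum(shares.values())
    if total <= 0:
        raise ValueError("total shares must be positive")
    return {vm: value / total for vm, value in shares.items()}


# Hypothetical VMs: one "High" (2000) and two "Normal" (1000) share values.
fractions = share_fractions({"vm-a": 2000, "vm-b": 1000, "vm-c": 1000})
```

In this model vm-a is entitled to half of the available I/O while the datastore is congested, and the other two VMs to a quarter each. When there is no congestion, the Share values have no effect.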
If you have the budget, purchase PowerPath/VE licences. The VMware Native Multi-Pathing mechanisms are Initiator-based load balancing mechanisms, which are not aware of the array, whereas PowerPath/VE is Target-based (Intelligent).
Design Decision 6: LB Policy?
Will be “SymmOpt” for Symmetrix arrays. Other supported policies are “ClarOpt” for CLARiiON/VNX and “Adaptive” for supported 3rd Party Active/Active arrays.
Design Decision 7: Electronic Licence Management: Served or Unserved?
“Served” requires the “rtools” server (RHEL, W2K8 or Virtual Appliance), which provides licence keys automatically to the host; “Unserved” requires an activation key from Powerlink each time you add a host to the array. The EMC VSI plugin for vCenter will provide target-based visibility status of PowerPath/VE from vCenter.
Design Decision 8: System-wide Performance Time Window?
Design Decision 9: System-wide Move Time Window?
FAST-VP has a system-wide policy that defines the Performance Time Window and the Move Time Window.
If Storage DRS is configured to Automatic (less Administrator intervention), then the performance collection and data movement windows should be 24/7, since you want FAST-VP to promote hot blocks of storage immediately after a VMDK has moved between datastores (onto cold storage).
If consistent performance (with Thin Provisioned VMs) is more important than reducing management overhead, then Manual should be selected and operators will have to approve the SDRS recommendations at a time of day that is outside the Performance Time and Move Time Windows of FAST-VP. A possible use case for this configuration is an organisation that requires optimal performance during business hours (FAST-VP Performance Time Window) and has the optimisation tasks configured for out-of-hours (FAST-VP Move Time Window and Operator approval of SDRS recommendations). There could be a window of time when VMDKs moved by SDRS will be on cold blocks and FAST-VP could have promoted the original hot blocks to faster tiers of storage. If you have an environment that is relatively static, but space optimisation and consistent performance is an important design consideration, then this could work for you.
Design Decision 10: Over-subscribe Thin Pools?
Each FAST-VP policy (e.g. Bronze, Silver, Gold, Platinum) is associated with a Storage Group. The percentages set for each tier are “over-subscribed” if the sum of percentages for each Tier across all FAST-VP policies exceeds 100%. You can also configure different priorities for each FAST-VP policy.
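The over-subscription rule above can be checked with a short calculation. This is a sketch under stated assumptions: the policy names come from the example above, but the tier names and every percentage are hypothetical values chosen for illustration, not recommendations or output from any EMC tool.

```python
from typing import Dict, List

# Hypothetical per-policy tier percentages (illustrative values only).
POLICIES: Dict[str, Dict[str, int]] = {
    "Gold":   {"SSD": 40, "FC": 40, "SATA": 20},
    "Silver": {"SSD": 10, "FC": 40, "SATA": 50},
    "Bronze": {"SSD": 0,  "FC": 10, "SATA": 90},
}


def tier_totals(policies: Dict[str, Dict[str, int]]) -> Dict[str, int]:
    """Sum the percentage for each tier across all policies."""
    totals: Dict[str, int] = {}
    for tiers in policies.values():
        for tier, pct in tiers.items():
            totals[tier] = totals.get(tier, 0) + pct
    return totals


def oversubscribed_tiers(policies: Dict[str, Dict[str, int]]) -> List[str]:
    """Tiers whose summed percentage across all policies exceeds 100%."""
    return sorted(t for t, total in tier_totals(policies).items() if total > 100)


totals = tier_totals(POLICIES)               # SSD 50, FC 90, SATA 160
oversubscribed = oversubscribed_tiers(POLICIES)  # only SATA exceeds 100
```

With these made-up figures only the SATA tier is over-subscribed; raising the SSD percentages in the policies would be how you over-subscribe the higher tiers.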
Every design needs to be tested before deployment to a Production environment; Capacity Planning and Operational Monitoring processes then need to be in place to ensure that the design targets are being met.
There are many risks associated with each design decision. Listed below are some of the lesser-known ones, which I mention with the intent of sharing knowledge:
Risk 1: Every storage solution that uses global caching (mirrored or not) is at risk of over-utilisation if not properly managed (i.e. no capacity planning, no operational monitoring = enormous risk); this includes the EMC Symmetrix VMAX 40K. The mechanisms described in this blog attempt to minimise that risk and provide an End-to-End Storage QoS design. However, if a solution is misconfigured after implementation, an I/O intensive workload is placed on the lowest tier of storage and the global cache has frequent forced flushes, then you risk an outage on all of your workloads, because the global cache is common to all tiers of storage. This risk is credited to Michael Webster, who explained it to me.
Risk 2: If you use SRDF with FAST-VP as part of your Disaster Recovery solution, be aware that your DR site will use cold blocks of storage upon failover, which will mean reduced performance until FAST-VP promotes hot blocks to faster tiers.
Risk 3: If using SIOC Automated Threshold Detection: The C# client in 5.1 does not support the configuration of Automated Threshold Detection; you must use the Web Client for this. However, after it has been configured, if you use the C# client to modify the SIOC Manual Congestion Threshold value, this will disable Automated Threshold Detection and set the manual value. Ensure that your Standard Operating Procedures are updated and that your Administrators are aware of this caveat to reduce this operational risk.
Alternate Solution 1:
If your business requirements cannot accept any type of performance impact, then SDRS could be used for initial placement only, with your VMs Thick Provisioned Eager Zeroed and the system-wide FAST-VP settings optimised for the required performance window.
Alternate Solution 2:
Investigate performance increases with VMDirectPath I/O and RDMs.
If you have any additional insights or comments about these use cases, please add them and share your thoughts.
7 thoughts on “EMC FAST-VP & PowerPath/VE with Storage DRS & Storage I/O Control”
Thanks for this nice initiative for consolidating and sharing this valuable information.
I found this post by Duncan Epping, which can serve as a supplement to what you have mentioned here.
Link to post.
I have a question about the extract below:
“Automatic Threshold detection should never be used with FAST-VP, this is because the SIOC Injection mechanism will measure a lower latency than is actually true due to the promotion of hot blocks of storage; SIOC will then throttle I/O prematurely across the entire datastore (all connected hosts) when congestion is not actually occurring.”
When the measured latency is lower than actual, why would SIOC throttle I/O ?
I think SIOC throttles I/O only when congestion is detected and in this case it would be further delayed as detected latency is lower than actual due to promotion of hot blocks.
Can you also please explain how the promotion of hot blocks works?
Thanks in anticipation,
With SIOC Automatic Threshold Detection configured, the measured latency of I/O to the Datastore is compared to the calculated congestion threshold (this is the congestion detection mechanism). If the storage used by the Injection Mechanism was promoted to SSD, then SIOC will kick in sooner than it should, because the indicator for congestion of SSD (approx. 15ms) is much lower than the congestion indicator for 15K FC/SAS (approx. 30ms) or SATA/NL-SAS (approx. 50ms). Therefore, if I have a Datastore with 5% SSD, 10% SAS and 85% NL-SAS, and the calculated congestion threshold was 15ms (Injection Mechanism was on SSD), then SIOC will kick in at 15ms when I still have 15-35ms of headroom on the other tiers, which make up 95% of the storage. This is a “false-positive” that you want to avoid, so a manual congestion threshold should be selected instead.
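The arithmetic in that example can be worked through as a quick sketch. The tier mix and the approximate thresholds are the ones quoted above; the function and variable names are hypothetical and only illustrate the calculation.

```python
from typing import Dict

# Approximate congestion indicators (ms) and the example tier mix (%).
TIER_THRESHOLD_MS = {"SSD": 15, "15K FC/SAS": 30, "SATA/NL-SAS": 50}
TIER_MIX_PCT = {"SSD": 5, "15K FC/SAS": 10, "SATA/NL-SAS": 85}


def unused_headroom_ms(detected_threshold_ms: int) -> Dict[str, int]:
    """Latency headroom per tier that is given up at the detected threshold."""
    return {
        tier: limit - detected_threshold_ms
        for tier, limit in TIER_THRESHOLD_MS.items()
    }


# Injection Mechanism landed on SSD, so the calculated threshold is 15ms.
lost = unused_headroom_ms(15)
affected_pct = sum(pct for tier, pct in TIER_MIX_PCT.items() if lost[tier] > 0)
```

Here `lost` shows 15ms of unused headroom for 15K FC/SAS and 35ms for SATA/NL-SAS, and `affected_pct` is the 95% of the Datastore that sits on those two tiers.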
Auto-Tiering, known in the EMC world as FAST (“Fully Automated Storage Tiering”), allows you to take a large amount of cheap storage (SATA/NL-SAS) and mix it with a percentage of faster storage types (15K FC/SAS and SSD). The storage array controllers then detect “hot spots” of high I/O and promote these blocks or sub-blocks to a faster storage tier automatically. Assuming that your workloads have been analysed correctly and that the tiers of storage have been selected accordingly, this then allows you to build a cost-effective storage solution that has the optimum blend of speed (SSD/15K FC/SAS) and capacity (SATA/NL-SAS) for the cheapest price. Without Auto-Tiering, you would have to purchase larger amounts of fast storage (SSD or 15K FC/SAS) and dedicate an entire LUN or Meta-LUN to a workload that may only use 10% of that tier of performance; the remaining 90% would be wasted.
Hope this clarifies things.
Rene Van Den Bedem
Thanks Rene! It's crystal clear now.
Read your response and in the meantime revisited an article from Frank Denneman on SIOC and a Pluralsight blog on SIOC. Also read the EMC whitepaper on FAST.
Now I can relate to all this much better and I can have a good sleep 😉
Hope to read some more excellent posts from you VCDX folks.