What is PCI DSS? The Payment Card Industry (PCI) Data Security Standard (DSS) is a standard that any organisation that stores, processes or transmits payment card data for the major card brands (Visa, MasterCard, American Express, Discover and JCB) must adhere to. The standard is about securing and protecting Cardholder data (e.g. Primary Account Number – PAN, Cardholder Name, Service Code, Expiry Date) and identifies any system that stores, transmits or processes Cardholder data as a “PCI Asset”. If you work for a bank, financial organisation or retail chain, then you will be well aware of the PCI Data Security Standard.
How does the PCI DSS audit work? The audit is initiated from within your organisation: typically someone from your Information Security Department will be the owner/project manager, and a representative from each technology silo of the IT Division (Application, Network, Compute, Storage, Recovery & Archive) will participate in the project team. Regardless of whether it is your first audit or the annual recertification, there is a set of operational processes and procedures that will be in constant use, and a lot of preparation will be required before the Qualified Security Assessor (QSA – a PCI-certified auditor) arrives.
When the QSA arrives, they will typically have 7 to 10 days scheduled with your organisation; to save time, you should have all of your documentation, interviews and evidence prepared. It is important that you understand the PCI DSS compliance requirements and do not waste the QSA’s time. The Data Security Standard can be open to interpretation, and you must justify your design decisions and demonstrate your operational processes and procedures to the QSA. Ideally, the QSA would like to see a physically separate “PCI” infrastructure with encryption of all Cardholder data while it is being stored, processed or transmitted. Some organisations do this, but in the real world it is somewhere in between, with some physical separation and a lot of logical segregation backed by compensating controls. However, here is the compliance risk: if your PCI and non-PCI workloads are deemed to be on a common infrastructure without any logical separation or compensating controls, then the entire “shared” infrastructure will be treated as a “PCI Asset”. This, of course, turns into an operational nightmare, because instead of scanning and patching a small number of assets (e.g. 20), you will be scanning and patching a large number (e.g. 200), which increases both your attack surface and the time required to maintain it.
At VMworld 2013 Barcelona last year, I talked to a guy from Catbird who claimed to have the methodology and products that would allow a single vSphere cluster of mixed workloads to achieve PCI DSS compliance with a QSA (see the use case below). Every QSA I have dealt with has always required physically separate hosts; however, that is probably changing with vCNS. My experiences with outsourcing PCI DSS compliance have been troubled, and my current stance is that you should own the process and make sure it is incorporated into your current operational processes. Do not treat it as a yearly “chore”; your patching and vulnerability scans should run quarterly or bi-monthly. If you are certifying for the first time, by all means bring in consulting experts to assist you, but aim to perform the annual recertification under your own steam.
These lessons learned can also assist in achieving compliance with the following security standards:
- DIACAP
- HIPAA/HHS
- FISMA/NIST
- SOX/COBIT
Listed below are the CRITICAL vSphere design choices that I can verify have passed PCI DSS 2.0 audits for organisations that are 99% virtualised with a shared Compute, Network, Storage and Backup/Recovery infrastructure:
- Physically Separate Compute Clusters – DMZ, PCI, Production, Test, Unified Communications and VDI workloads run on physically separate hosts, which can share the same racks/chassis.
- Full RBAC – used for securing and separating administrator and operator roles within vSphere (a permissions audit sketch follows this list).
- ESXi Lockdown – all hosts are locked down, and the ESXi firewall is enabled and configured to permit essential services only (see the lockdown/firewall audit sketch after this list).
- Dedicated Management Subnet – vCenter, vCOps and the ESXi hosts are connected to a secured VLAN and Management subnet, with access restricted to Administrators and Operators only.
- Host Profiles – used for the regular configuration compliance audit of all hosts.
- Logically Separate Datastores – as with the compute clusters, use separate Datastores for DMZ, PCI, Production, Test, Unified Communications and VDI workloads. If these sit on common shared storage, each Datastore must be protected by storage security mechanisms that ensure hosts cannot “cross-access” other Datastores. This is normally achieved with Masking Views, SAN Switch Zoning (Single Initiator – Single/Multiple Target), Separate Non-Routed Subnets for iSCSI and NFS, NFS Exports, etc. (the datastore mount audit sketch after this list shows one way to verify this).
- Virtual Switches – use a separate VSS, VDS or Nexus 1000V for each of the functional groups (DMZ, PCI, Production, Test, Unified Communications and VDI) and define the sub-functional Portgroups on each virtual switch. This ensures that PCI hosts are the only hosts that can access the PCI networks. It is therefore CRITICAL that your organisation has a network design with AT LEAST logical separation of the PCI, DMZ and Non-PCI networks. This is the most common mistake organisations make; if you have mixed workloads in one subnet, take the time and effort to migrate those workloads to a schema that maintains PCI separation (see the portgroup/VLAN audit sketch after this list).
- Use VLAN Pruning from the Physical Switch network to the Virtual Switch Uplinks (VSS, VDS or Nexus 1000V) – as with the compute cluster separation, only the dedicated VLANs that a DMZ, PCI, Production, Test, Unified Communications or VDI host actually requires should be trunked from the core to that host’s uplinks; all other VLANs are pruned.
- Private VLANs – these can be used as a compensating control to separate Non-PCI and PCI workloads that share a common network; however, it is better to separate them as described in the previous point. PVLANs can also be used to further enhance the security of a PCI subnet by enforcing micro-segmentation and isolating PCI workloads that do not need to communicate with each other.
- Shared Storage Array Disk Encryption – if your storage array supports hardware disk encryption (e.g. EMC D@RE), this will be accepted by the QSA as a compensating control for the encryption of PCI data at rest. Otherwise, you will have to implement a software- or hardware-based encryption mechanism per Guest OS (painful and expensive).
- Backup and Recovery – you have to prove that your PCI backups are stored in an encrypted state and show that your restore procedures are tested regularly. If you are using LTO tape technology, make sure that your backup system supports encryption and that your encryption keys are available at your DR site; otherwise you will fail the restore test.
- Archive – use WORM policies and disk encryption to secure data at rest.
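To make the Full RBAC point concrete, here is a minimal permissions audit sketch using pyVmomi (the standard Python SDK for vSphere). It dumps every permission assignment held in vCenter and flags full Administrator grants for review. The vCenter address and credentials are placeholders; adapt the check to your own role model.

```python
# Minimal vSphere permissions audit sketch (pyVmomi).
# "vcenter.example.com" and the credentials are placeholders.
import ssl
from pyVim.connect import SmartConnect, Disconnect

ctx = ssl._create_unverified_context()  # lab only; use trusted certs in production
si = SmartConnect(host="vcenter.example.com",
                  user="auditor@vsphere.local", pwd="***", sslContext=ctx)
authz = si.RetrieveContent().authorizationManager

# Map roleId -> role name so the report is readable.
roles = {r.roleId: r.name for r in authz.roleList}

# List every permission; roleId -1 is the built-in Administrator role.
for perm in authz.RetrieveAllPermissions():
    entity = perm.entity.name if perm.entity else "(root)"
    flag = "  <-- full admin, review!" if perm.roleId == -1 else ""
    print(f"{perm.principal:30} {roles.get(perm.roleId, '?'):20} on {entity}{flag}")

Disconnect(si)
```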
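Next, the lockdown/firewall audit sketch referenced above: it walks every ESXi host, reports lockdown status and lists the enabled firewall rulesets so they can be compared against your approved list of essential services. Again, the connection details are placeholders, and `adminDisabled` is the vSphere 5.x-era lockdown flag.

```python
# Minimal ESXi lockdown and firewall audit sketch (pyVmomi).
import ssl
from pyVim.connect import SmartConnect, Disconnect
from pyVmomi import vim

ctx = ssl._create_unverified_context()
si = SmartConnect(host="vcenter.example.com",
                  user="auditor@vsphere.local", pwd="***", sslContext=ctx)
content = si.RetrieveContent()
view = content.viewManager.CreateContainerView(
    content.rootFolder, [vim.HostSystem], recursive=True)

for host in view.view:
    # adminDisabled is True when the host is in lockdown mode (vSphere 5.x API).
    locked = bool(host.config.adminDisabled)
    print(f"{host.name}: lockdown {'ON' if locked else 'OFF  <-- fix'}")
    # Enabled rulesets should match the approved list of essential services.
    rulesets = host.configManager.firewallSystem.firewallInfo.ruleset
    enabled = sorted(rs.key for rs in rulesets if rs.enabled)
    print(f"  enabled firewall rulesets: {', '.join(enabled)}")

view.DestroyView()
Disconnect(si)
```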
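The datastore mount audit mentioned in the Logically Separate Datastores point is equally simple: if your Masking Views, zoning and subnet separation are correct, a PCI Datastore should only ever list PCI hosts in its mounts, so any stray hostname stands out immediately. This is a rough sketch under the same placeholder assumptions as above.

```python
# Minimal Datastore cross-access audit sketch (pyVmomi).
import ssl
from pyVim.connect import SmartConnect, Disconnect
from pyVmomi import vim

ctx = ssl._create_unverified_context()
si = SmartConnect(host="vcenter.example.com",
                  user="auditor@vsphere.local", pwd="***", sslContext=ctx)
content = si.RetrieveContent()
view = content.viewManager.CreateContainerView(
    content.rootFolder, [vim.Datastore], recursive=True)

for ds in view.view:
    # Every host that has this Datastore mounted; a non-PCI host appearing
    # against a PCI Datastore indicates broken masking/zoning.
    mounts = sorted(m.key.name for m in ds.host)
    print(f"{ds.summary.name}: {', '.join(mounts)}")

view.DestroyView()
Disconnect(si)
```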
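Finally, the portgroup/VLAN audit sketch for the Virtual Switches and VLAN Pruning points: it prints every standard-vSwitch portgroup per host with its VLAN ID, so a PCI VLAN appearing on a non-PCI host (or a mixed vSwitch) is easy to spot. This only covers VSS; a VDS or Nexus 1000V would need the equivalent distributed-switch queries.

```python
# Minimal vSwitch/portgroup/VLAN audit sketch (pyVmomi, standard vSwitches).
import ssl
from pyVim.connect import SmartConnect, Disconnect
from pyVmomi import vim

ctx = ssl._create_unverified_context()
si = SmartConnect(host="vcenter.example.com",
                  user="auditor@vsphere.local", pwd="***", sslContext=ctx)
content = si.RetrieveContent()
view = content.viewManager.CreateContainerView(
    content.rootFolder, [vim.HostSystem], recursive=True)

for host in view.view:
    print(host.name)
    # One line per portgroup: which vSwitch it hangs off and its VLAN ID.
    for pg in host.config.network.portgroup:
        spec = pg.spec
        print(f"  {spec.vswitchName:15} {spec.name:30} VLAN {spec.vlanId}")

view.DestroyView()
Disconnect(si)
```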
PCI DSS 2.0 is the standard you must comply with today; however, from 1 January 2015 compliance with PCI DSS 3.0 becomes mandatory. Refer to the links section below for additional information.
My ideal use case for PCI is one mega-cluster of hosts with vCNS/NSX separating the DMZ, PCI, Production and Unified Communications workloads into virtual data centers (vCNS Edge) with micro-segmentation and PCI data security scanning (vCNS App). In conjunction with vCOps Infrastructure Navigator (INAV), the PCI data security scanning would automatically isolate unsecured PCI data using a rule feed from INAV into vCNS Edge and App.
Another great use case is dedicated clusters for DMZ and PCI built on Hyper-Converged nodes. This way you achieve an “Air-Gap” for compute and storage. Read my blog post for additional information.
Useful links:
- VMware Security & Compliance Blog – PCI DSS 3.0 Compliance Toolkit
- VMware Architecture Design Guide for Payment Card Industry
- VMware Payment Card Industry Solution Guide
- VMware PCI DSS 2.0 Validated Reference Architecture
- VMware Compliance Checker for PCI
- Coalfire PCI DSS Compliance and VMware
- Payment Card Industry Security Standards Documents
- PCI DSS 2.0 Standard
- PCI DSS 3.0 Standard
- Approved Companies and Providers
Don’t forget PCI DSS is a minimum requirement – it’s always good to aim higher than the minimum.
I’d especially want vulnerability scans and associated patching very frequently – some of the nicer solutions integrate 3rd-party scans with “virtual patching” of vulnerabilities on your Web Application Firewall, which enables you to protect against, if not day-0, then at least day-1 or day-2 vulnerabilities. Certainly faster than waiting for a real patch to be released, then testing it, change control, etc.
Very true, thank you for the comment.
Great post Rene, especially the mapping of design choices – very useful.
thanks
Thank you for the feedback, Gareth.
I know this post is old, but I see that you’ve specified a dedicated management subnet – do you know if that needs to be split into two, one for PCI traffic and the other for non-PCI traffic? It seems odd to separate hosts physically, separate VLANs, etc., but then mix the management traffic, no?
Hello Kyle, assuming a single operations/administration entity has control of both PCI and Non-PCI assets, then in conjunction with RBAC, ESXi firewalls, ESXi lockdown, a dedicated management subnet, etc., a single management network for all ESXi hosts with a single vCenter is perfectly acceptable and will comply with the PCI DSS 2.0 standard. If you want to increase operational complexity and solution cost then, yes, you could have separate vCenters and management subnets for PCI and Non-PCI assets. Cheers, Rene.