There are a few questions that come up quite often regarding vCenter Server upgrades and mixed-version deployments that we would like to address. In this blog post we will discuss and attempt to clarify the guidance in the vSphere Documentation for Upgrade or Migration Order and Mixed-Version Transitional Behavior for Multiple vCenter Server Instance Deployments. This doc breaks down what happens during the vCenter Server upgrade process and describes the impact of having the components – vCenter Server and the Platform Services Controller (PSC) – running at different versions during the upgrade window. For example, once you have upgraded some vCenter Server instances, say to 6.5 Update 1, you will no longer be able to manage those upgraded instances from any 5.5 instances. While most of the functionality limitations manifest themselves when upgrading from 5.5 to 6.x, there can also be quirks in environments running a mix of 6.0 and 6.5. A couple of additional questions tend to arise from this doc, so let’s see if we can address them.
The Upgrade Process
I’m not going to go through the entire process here, but it is important to understand the basics of how a vCenter Server upgrade works. Remember that there are two components to vCenter Server – the Platform Services Controller (PSC), which runs the vSphere (SSO) Domain, and vCenter Server itself. For a vCenter Server upgrade, the vSphere Domain and all PSCs within it must be upgraded first. Once that is complete, the vCenter Servers can be upgraded. Obviously, if you have a standalone vCenter Server with an embedded PSC, this is a much simpler proposition. But for those requiring external PSCs because of other requirements such as Enhanced Linked Mode, just remember the PSCs need to be upgraded first.
The other important point to make here is that upgrading by site is not supported. Consider an environment with two sites, each with an external PSC and a vCenter Server. It is common for a customer to want to upgrade an entire site, test, and then move on to the next site. Unfortunately, this is not supported: all PSCs within the vSphere Domain, across all sites, must be upgraded first.
Now, on to the questions mentioned earlier. The first question is, “Can I run vCenter Servers and Platform Services Controllers (PSCs) of different versions in my vSphere Domain?” The answer here is yes, but only during the time of an upgrade. VMware does not support running different versions of these components under normal operations within a vSphere Domain. The exact verbiage from the article is, “Mixed-version environments are not supported for production. Use these environments only during the period when an environment is in transition between vCenter Server versions.” So, do not plan on running different versions of vCenter Server and PSC in production on an ongoing basis.
The second question is then, “How long can I run in this mixed-version mode?” This question is a bit tougher to answer. There is no magic date or time bomb when things will just stop working. This is really more of a question of understanding the risks and knowing how problems may affect the environment should something go wrong while in this mixed-version state.
An example of one such risk would be upgrading from vSphere 5.5 to 6.5. Let’s say your vSphere Domain (i.e. PSCs) and one vCenter Server are already upgraded, leaving you with one or more vCenter Server 5.5 instances. Imagine that something happens leaving a vCenter Server 5.5 completely wiped out. As long as you have a good, current backup, you could restore that vCenter Server 5.5 instance and be back in production. But if the only backup you have was taken prior to the start of the vSphere Domain upgrade, you would not be able to use it to restore: the restored vCenter Server instance would expect a 5.5 vSphere Domain, and communication between it and the 6.5 PSC would not work. The alternative would be to roll back the entire vSphere Domain and any other vCenter Servers that were upgraded.
Another risk is being unable to restore that instance because the backups were bad (it does happen) or because you cannot accept losing the data written since that backup was taken. The result is that you would be forced to rebuild that vCenter Server instance and re-attach all the hosts. This may not be desirable because the new vCenter Server instance would have a new UUID, and all of the hosts, VMs, and other objects would also have new MoRef IDs. Any backup tools or monitoring software would see these as net new objects, and you would lose continuity of backups or monitoring. You would also have to rebuild the vCenter Server instance as 6.5, which may not be desirable either: if an application or other constraint requires a specific version of vCenter Server, rebuilding the instance as 6.5 may break that application.
Finally, let’s consider a PSC failure instead of the loss of a vCenter Server. What happens? Normally, you could easily repoint a vCenter Server instance to another external PSC within the same SSO Site. However, this is not possible if the vCenter Server is not running the same version as the PSC you are attempting to repoint to. For example, if a vCenter Server 5.5 or 6.0 instance is pointing to a 6.5 PSC (because the PSCs have already been upgraded) and that PSC fails, you cannot repoint that vCenter Server to another PSC – remember that all PSCs must be upgraded first, so they are all running 6.5 already. The only way to recover from this scenario is to restore or redeploy the failed PSC, which may take longer than repointing.
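For 6.x environments where the versions do match, the repoint itself is a one-liner. A minimal sketch, assuming a vCenter Server 6.0 Update 1 or later deployment and a hypothetical PSC FQDN (psc02.example.local):

```shell
# Run from a shell on the vCenter Server (Appliance syntax shown).
# Repoints this vCenter Server to another external PSC in the same SSO site.
# Only works when the vCenter Server and the target PSC run matching versions.
cmsso-util repoint --repoint-psc psc02.example.local
```

Again, this only helps when the versions match – a 5.5 or 6.0 vCenter Server cannot be repointed to a 6.5 PSC, which is exactly the mixed-version exposure described above.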
So, given the above scenarios, what do we tell a customer who asks, “My upgrade plan spans multiple sites over multiple months. How should I plan my upgrade?” Here are our recommendations:
1. Minimize the upgrade window
2. Follow the upgrade documentation
3. Take full backups before, during, and after the upgrade
4. Check the interop matrices and test the upgrade first
The first recommendation is to minimize the upgrade window as much as possible. We understand that there’s only so much you can do here, but it is important to reduce the amount of time you’ll be running different versions of vCenter Server (and PSC) in the same vSphere Domain. The second recommendation is to, no matter how tempting to do otherwise, upgrade the entire vSphere Domain (SSO Instances and PSCs) first as is called out in the vSphere Documentation. It is not supported to upgrade everything in one site and then move onto the next. You must upgrade all SSO Instances and PSCs in the vSphere Domain, across ALL sites and locations, first. Third, make sure you have good backups every step of the way. While snapshots can be a path to a quick rollback, when dealing with SSO, PSCs, and vCenter Server they don’t always work. Taking a full backup ensures the ability to restore to a known clean state. Last, and certainly not least, do your interoperability testing and test the upgrade in a lab environment that represents your production environment as much as possible.
Emad has a great 3-part series on upgrades (Part 1, Part 2, Part 3) so be sure to check it out prior to testing and beginning your upgrade. Also know and understand the risks and impacts of problems during the upgrade process. Finally, know how the upgrade process is going to affect all of the yet-to-be-upgraded parts of your environment and have good rollback and mitigation plans if any issues come up.
About the Author
Adam Eckerle manages the vSphere Technical Marketing team in the Cloud Platform Business Unit at VMware. This team is responsible for vSphere launch, enablement, and ongoing content generation for the VMware field, Partners, and Customers. In addition, Adam’s team is also focused on preparing Customers and Partners for vSphere upgrades through workshops, VMUGs, and other events.
New Features in vSphere 6.0
Following the hands-on labs module HOL-SDC-1410:
Scalability – Configuration Maximums
The Configuration Maximums have increased across the board for vSphere Hosts in 6.0. Each vSphere Host can now support:
• 480 Physical CPUs per Host
• Up to 12TB of Physical Memory
• 1000 VMs per Host
• 64 Hosts per Cluster
Scalability – Virtual Hardware v11
This release of vSphere gives us Virtual Hardware v11. Some of the highlights include:
• 128 vCPUs
• 4 TB RAM
• Hot-add RAM now vNUMA aware
• WDDM 1.1 GDI acceleration features
• xHCI 1.0 controller compatible with OS X 10.8+ xHCI driver
• A virtual machine can now have a maximum of 32 serial ports
• Serial and parallel ports can now be removed
Local ESXi Account and Password Management Enhancements
In the latest release of vSphere 6.0, we expand support for account management on ESXi Hosts.
New ESXCLI Commands:
• CLI interface for managing ESXi local user accounts and permissions
• Coarse grained permission management
• ESXCLI can be invoked against vCenter instead of directly accessing the ESXi host. Previously, account and permission management for ESXi hosts was available only with direct host connections.
• Previously, customers had to hand-edit the file /etc/pam.d/passwd; now the same settings can be changed through the VIM API (OptionManager.updateValues()).
• Advanced options can also be accessed through vCenter, so there is no need to make a direct host connection.
• PowerCLI cmdlet allows setting host advanced configuration options
• Security.AccountLockFailures – “Maximum allowed failed login attempts before locking out a user’s account. Zero disables account locking.”
• Default: 10 tries
• Security.AccountUnlockTime – “Duration in seconds to lock out a user’s account after exceeding the maximum allowed failed login attempts.”
• Default: 2 minutes
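As a sketch, the account and lockout controls above map onto ESXCLI roughly as follows. The account name and values are hypothetical examples; run these in an ESXi shell or remotely via ESXCLI as described above:

```shell
# Create a local account and grant it a coarse-grained role (ESXi 6.0+)
esxcli system account add -i svc-backup -p 'VMware1!' -c 'VMware1!'
esxcli system permission set -i svc-backup -r Admin
esxcli system account list

# Tighten the lockout policy from its defaults (10 failures, 120 seconds)
esxcli system settings advanced set -o /Security/AccountLockFailures -i 5
esxcli system settings advanced set -o /Security/AccountUnlockTime -i 900
```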
vCenter Server 6.0 – Platform Services Controller:
The Platform Services Controller (PSC) includes common services that are used across the suite.
• These include SSO, Licensing and the VMware Certificate Authority (VMCA)
• The PSC is the first piece that is either installed or upgraded. When upgrading, an SSO instance becomes a PSC.
• There are two models of deployment, embedded and centralized.
o Embedded means the PSC and vCenter Server are installed on a single virtual machine. – Embedded is recommended for sites with a single SSO solution such as a single vCenter Server.
o Centralized means the PSC and vCenter Server are installed on different virtual machines. – Centralized is recommended for sites with two or more SSO solutions such as multiple vCenter Servers, vRealize Automation, etc. When deploying in the centralized model, it is recommended to make the PSC highly available so that it is not a single point of failure: in addition to utilizing vSphere HA, a load balancer can be placed in front of two or more PSCs to create a highly available PSC architecture.
The PSCs and vCenter Servers can be mixed and matched, meaning you can deploy appliance-based PSCs alongside Windows PSCs, with both Windows and appliance-based vCenter Servers. Any combination uses the PSC’s built-in replication.
What’s New in vSphere 6.0 – Networking and Security
Networking in vSphere 6.0 has received some significant improvements, which have led to the following new vMotion capabilities:
• Cross vSwitch vMotion
• Cross vCenter vMotion
• Long Distance vMotion
• vMotion across Layer 3 boundaries
More detail on each of these follows as well as details on the improved Network I/O Control (NIOC) version 3.
Cross vSwitch vMotion
Cross vSwitch vMotion allows you to seamlessly migrate a VM across different virtual switches while performing a vMotion.
• You are no longer restricted by the networks created on the vSwitches when vMotioning a virtual machine.
• Requires the source and destination portgroups to share the same L2. The IP address within the VM will not change.
• vMotion will work across a mix of switches (standard and distributed). Previously, you could only vMotion from vSS to vSS or within a single vDS. This limitation has been removed.
The following Cross vSwitch vMotion migrations are possible:
• vSS to vSS
• vSS to vDS
• vDS to vDS
• vDS to vSS is not allowed
Another added feature is that vDS to vDS migration transfers the vDS metadata to the destination vDS (network statistics).
Cross vCenter vMotion
Expanding on the Cross vSwitch vMotion enhancement, we are also excited to announce support for Cross vCenter vMotion.
vMotion can now perform the following changes simultaneously.
• Change compute (vMotion) – Performs the migration of virtual machines across compute hosts
• Change storage (Storage vMotion) – Performs the migration of the virtual machine disks across datastores
• Change network (Cross vSwitch vMotion) – Performs the migration of a VM across different virtual switches and finally…
• Change vCenter (Cross vCenter vMotion) – Migrates the VM to a different vCenter Server instance.
All of these types of vMotion are seamless to the guest OS. As with Cross vSwitch vMotion, Cross vCenter vMotion requires L2 network connectivity since the IP of the VM will not change. This functionality builds upon Enhanced vMotion, and shared storage is not required. Target support includes local (single site), metro (multiple well-connected sites), and cross-continental sites.
Long Distance vMotion
Long Distance vMotion is an extension of Cross vCenter vMotion, targeted at environments where vCenter Servers are spread across large geographic distances and where the latency across sites is 100 ms or less. Although spread across a long distance, all the standard vMotion guarantees are honored.
This does not require VVOLs to work. A VMFS/NFS system will work also.
• The requirements for Long Distance vMotion are the same as for Cross vCenter vMotion, with two additions: the maximum latency between the source and destination sites must be 100 ms or less, and there must be 250 Mbps of available bandwidth.
• To stress the point: The VM network will need to be a stretched L2 because the IP of the guest OS will not change. If the destination portgroup is not in the same L2 domain as the source, you will lose network connectivity to the guest OS. This means in some topologies, such as metro or cross-continental, you will need a stretched L2 technology in place. The stretched L2 technologies are not specified. Any technology that can present the L2 network to the vSphere hosts will work, because it’s unknown to ESX how the physical network is configured. Some examples of technologies that would work are VXLAN, NSX L2 Gateway Services, or GIF/GRE tunnels.
• There is no defined maximum supported distance as long as the network meets these requirements. Your mileage may vary, but you are ultimately constrained by the laws of physics.
• The vMotion network can now be configured to operate over an L3 connection.
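The two hard numbers above lend themselves to a trivial pre-flight sketch. The measured values below are placeholders you would replace with real RTT and bandwidth results from your own inter-site testing:

```shell
#!/bin/sh
# Placeholder site-to-site measurements (hypothetical values)
LATENCY_MS=85
BANDWIDTH_MBPS=400

# Long Distance vMotion needs <= 100 ms RTT and >= 250 Mbps available
if [ "$LATENCY_MS" -le 100 ] && [ "$BANDWIDTH_MBPS" -ge 250 ]; then
  echo "requirements met"
else
  echo "requirements NOT met"
fi
```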
Network I/O Control v3
Network I/O Control Version 3 allows administrators or service providers to reserve or guarantee bandwidth to a vNIC in a virtual machine or at a higher level the Distributed Port Group.
This ensures that other virtual machines or tenants in a multi-tenancy environment don’t impact the SLA of other virtual machines or tenants sharing the same upstream links.
• Allows private or public cloud administrators to guarantee bandwidth to business units or tenants. –> This is done at the VDS port group level.
• Allows vSphere administrators to guarantee bandwidth to mission critical virtual machines. –> This is done at the vNIC level.
What’s New in vSphere 6.0 Storage & Availability
At a high level, these are the new Storage & Availability features of vSphere 6.0.
You will find more details on some of the features below.
VMware Virtual Volumes
VVOLS changes the way storage is architected and consumed. With external arrays that do not use VVOLS, the LUN is typically the unit of both capacity and policy. In other words, you create LUNs with fixed capacity and fixed data services. Then, VMs are assigned to LUNs based on their data service needs. This can result in problems when a LUN with a certain data service runs out of capacity while other LUNs still have plenty of room to spare. The effect is that admins typically overprovision their storage arrays, just to be on the safe side.
With VVOLS, it is totally different. Each VM is assigned its own storage policy, and all VMs use storage from the same common pool. Storage architects need only provision for the total capacity of all VMs, without worrying about different buckets with different policies. Moreover, the policy of a VM can be changed, and this doesn’t require that it be moved to a different LUN.
VVOLS – VASA Provider
The VASA Provider is the component that exposes the storage services which a VVOLS array can provide. It also understands VASA APIs for operations such as the creation of virtual volume files. It can be thought of as the “control plane” element of VVOLS. A VASA provider can be implemented in the firmware of an array, or it can be in a separate VM that runs on the cluster which is accessing the VVOLS storage (e.g., as a part of the array’s management server virtual appliance).
VVOLS – Storage Container (SC)
A storage container is a logical construct for grouping Virtual Volumes. It is set up by the storage admin, and the capacity of the container can be defined. As mentioned before, VVOLS allows you to separate capacity management from policy management. Containers provide the ability to isolate or partition storage according to whatever need or requirement you may have. If you don’t want to have any partitioning, you could simply have one storage container for the entire array. The maximum number of containers depends upon the particular array model.
VVOLS – Storage Policy-Based Management
Instead of being based on static, per-LUN assignment, storage policies with VVOLS are managed through the Storage Policy-Based Management framework of vSphere. This framework uses the VASA APIs to query the storage array about what data services it offers, and then exposes them to vSphere as capabilities. These capabilities can then be grouped together into rules and rulesets, which are then assigned to VMs when they get deployed. When configuring the array, the storage admin can choose which capabilities to expose or not expose to vSphere.
To get more detailed information on VVOLS consider taking HOL-SDC-1429 – Virtual Volumes (VVOLS) Setup and Enablement.
vSphere 6.0 Fault Tolerance
The benefits of Fault Tolerance are:
• Protect mission critical, high performance applications regardless of OS
• Continuous availability – Zero downtime, zero data loss for infrastructure failures
• Fully automated response
The new version of Fault Tolerance greatly expands the use cases for FT to approximately 90% of workloads with these new features:
• Enhanced virtual disk support – Now supports any disk format (thin, thick or EZT)
• Now supports hot configure of FT – No longer required to turn off VM to enable FT
• Greatly increased FT host compatibility – If you can vMotion a VM between hosts you can use FT
The new technology used by FT is called Fast Checkpointing and is basically a heavily modified version of an xvMotion (cross-vCenter vMotion) that never ends and executes many more checkpoints (multiple/sec).
FT logging (traffic between the hosts where the primary and secondary are running) is very bandwidth intensive, so a dedicated 10GbE NIC on each host is highly recommended. This isn’t strictly required, but an FT-protected VM will at a minimum use more bandwidth than an unprotected one, and if FT doesn’t get the bandwidth it needs, the protected VM will run slower.
vSphere FT 6.0 New Capabilities
DRS is supported for initial placement of VMs only.
Backing Up FT VMs
FT VMs can now be backed up using standard backup software, the same as all other VMs (FT VMs could always be backed up using agents). They are backed up using snapshots through VADP.
Snapshots are not user-configurable – users can’t take snapshots of FT VMs. Snapshots are supported only as part of VADP.
Availability – vSphere Replication
The following features are new in vSphere Replication (VR) 6.0:
• Compression can be enabled when configuring replication for a VM. It is disabled by default.
• Updates are compressed at source (vSphere host) and stay compressed until written to storage. This does cost some CPU cycles on source host (compress) and target storage host (decompress).
• Uses FastLZ compression libraries. Fast LZ provides a nice balance between performance, compression, and limited overhead (CPU).
• Typical compression ratio is 1.7 to 1
Best results come from using vSphere 6.0 at both source and target along with the vSphere Replication (VR) 6.0 appliance(s). Other configurations are supported – for example, a vSphere 6.0 source with a vSphere 5.5 target, in which case the vSphere Replication Server (VRS) must decompress packets internally (costing VR appliance CPU cycles) before writing to storage.
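To put the ~1.7:1 ratio in perspective, here is a back-of-the-envelope sketch. The delta size is hypothetical, and the ratio is expressed in tenths so that shell integer arithmetic works:

```shell
#!/bin/sh
# Hypothetical 1700 MB of changed blocks in one replication cycle,
# compressed at the typical 1.7:1 ratio (ratio expressed in tenths).
DELTA_MB=1700
RATIO_TENTHS=17
COMPRESSED_MB=$(( DELTA_MB * 10 / RATIO_TENTHS ))
SAVED_MB=$(( DELTA_MB - COMPRESSED_MB ))
echo "on the wire: ${COMPRESSED_MB} MB (saved ${SAVED_MB} MB)"
```

At a 1.7:1 ratio, 1700 MB of deltas becomes roughly 1000 MB on the wire – that saving is what you trade the source and target CPU cycles for.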
• With VR 6.0, VR traffic can be isolated from other vSphere host traffic.
• At source, a NIC can be specified for VR traffic. NIOC can be used to control replication bandwidth utilization.
• At target, VR appliances can have multiple vmnics with separate IP addresses to separate incoming replication traffic, management traffic, and NFC traffic to target host(s).
• At target, NIC can be specified for incoming NFC traffic that will be written to storage.
• The user must, of course, set up the appropriate network configuration (vSwitches, VLANs, etc.) to separate traffic into isolated, controllable flows.
VMware Tools in vSphere 6.0 includes a “freeze/thaw” mechanism for quiescing certain Linux distributions at the file system level for improved recovery reliability. See vSphere documentation for specifics on supported Linux distributions.
- What’s New in VMware vSphere 6.0? – Overview
- What’s New in VMware vSphere 6.0 Platform? – Technical Whitepaper – more deep-dive
- vSphere 6 – Compare vSphere Editions (Standard, Enterprise, Enterprise Plus)
- HOL-SDC-1410 – What’s New with vSphere 6 (Register for the free Hands-On-Lab and try vSphere 6 live!)
- Comparing vSphere 5.0, 5.1 and 5.5 Features to the features in vSphere 6.0
- Understanding the vSphere Upgrade Path (Upgrade Flowcharts)
For information on upgrading to vSphere 6.0, visit the vSphere Upgrade Center
This article provides the default location of the vCenter Server logs.
- vCenter Server 5.x and earlier versions on Windows XP, 2000, 2003: %ALLUSERSPROFILE%\Application Data\VMware\VMware VirtualCenter\Logs\
- vCenter Server 5.x and earlier versions on Windows Vista, 7, 2008: C:\ProgramData\VMware\VMware VirtualCenter\Logs\
- vCenter Server 5.x Linux Virtual Appliance: /var/log/vmware/vpx/
- vCenter Server 5.x Linux Virtual Appliance UI: /var/log/vmware/vami
Note: If the service is running under a specific user, the logs may be located in the profile directory of that user instead of %ALLUSERSPROFILE%.
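On the Linux-based appliance, the paths above make it easy to follow the main log live. A sketch (the exact rotation naming varies by version):

```shell
# Follow the main vpxd log on the vCenter Server Appliance 5.x
tail -f /var/log/vmware/vpx/vpxd.log

# List rotated copies (vpxd-nnn.log; compressed on some platforms)
ls -lt /var/log/vmware/vpx/vpxd-*.log* | head
```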
vCenter Server logs are grouped by component and purpose:
- vpxd.log: The main vCenter Server logs, consisting of all vSphere Client and WebServices connections, internal tasks and events, and communication with the vCenter Server Agent (vpxa) on managed ESX/ESXi hosts.
- vpxd-profiler.log, profiler.log and scoreboard.log: Profiled metrics for operations performed in vCenter Server. Used by the VPX Operational Dashboard (VOD) accessible at https://VCHostnameOrIPAddress/vod/index.html.
- vpxd-alert.log: Non-fatal information logged about the vpxd process.
- cim-diag.log and vws.log: Common Information Model monitoring information, including communication between vCenter Server and managed hosts’ CIM interface.
- drmdump\: Actions proposed and taken by VMware Distributed Resource Scheduler (DRS), grouped by the DRS-enabled cluster managed by vCenter Server. These logs are compressed.
- ls.log: Health reports for the Licensing Services extension, connectivity logs to vCenter Server.
- vimtool.log: Dump of strings used during the installation of vCenter Server, with hashed information for DNS, username, and output for JDBC creation.
- stats.log: Provides information about the historical performance data collection from the ESXi/ESX hosts
- sms.log: Health reports for the Storage Monitoring Service extension, connectivity logs to vCenter Server, the vCenter Server database and the xDB for vCenter Inventory Service.
- eam.log: Health reports for the ESX Agent Monitor extension, connectivity logs to vCenter Server.
- catalina.<date>.log and localhost.<date>.log: Connectivity information and status of the VMware Webmanagement Services.
- jointool.log: Health status of the VMwareVCMSDS service and individual ADAM database objects, internal tasks and events, and replication logs between linked-mode vCenter Servers.
Note: As each log grows, it is rotated over a series of numbered component-nnn.log files. On some platforms, the rotated logs are compressed.
vCenter Server logs can be viewed from:
- The vSphere Client connected to vCenter Server 4.0 and higher – Click Home > Administration > System Logs.
- The Virtual Infrastructure Client connected to VirtualCenter Server 2.5 – Click Administration > System Logs.
- From the vSphere 5.1 and 5.5 Web Client – Click Home > Log Browser, then from the Log Browser, click Select object now, choose an ESXi host or vCenter Server object, and click OK.