I’ve been doing a lot of virtual machine migration work, both physical-to-virtual (P2V) and virtual-to-virtual (V2V). During a recent migration I faced a challenging scenario and decided to share my experience.
- There should be no outage during the migration
  - The loss of one or two packets is acceptable
- Two clusters in the same vCenter Server:
  - Each cluster has 4 ESXi servers
- Storage is FC-based but not shared across clusters, and the storage systems are different:
  - Source_Cluster uses Storage_VendorA
  - Destination_Cluster uses Storage_VendorB
- Each cluster has its own dvSwitch:
  - All ESXi servers in Source_Cluster & Destination_Cluster have two 10GbE uplinks
- Different network configuration on the physical interfaces:
  - LAG & LACP are enabled on Destination_Cluster
  - Active/Active on Source_Cluster, i.e. no LAG
- Management & vMotion VMkernels are in different VLANs:
  - VLAN 1000 for Source_Cluster
  - VLAN 2000 for Destination_Cluster
- The vMotion VMkernels are in different subnets
- There is only one shared VLAN, VLAN 3000
The plan was to migrate virtual machines from the source cluster to the destination cluster with the new migration type “Change both host and datastore” (note that it’s available in the vSphere Web Client only).
To use this new feature, some prerequisites had to be met:
- The source and destination ESXi servers must share a common dvSwitch, with at least one uplink from each server connected to it
- The vMotion VMkernels of the source and destination ESXi servers must be in the same subnet
- The virtual machines must be using a portgroup that is available on both the source and destination ESXi servers
- Storage doesn’t have to be shared
- Storage vMotion without shared storage relies on the network, i.e. the vMotion network
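The prerequisites above can be expressed as a simple pre-flight check. The sketch below is illustrative only; the dictionaries are hypothetical stand-ins for data you would normally query from vCenter (e.g. via PowerCLI or pyVmomi), and shared storage is deliberately not checked because this migration type does not require it:

```python
# Illustrative pre-flight check for a "Change both host and datastore" migration.
# The host descriptions are made-up placeholders, not real vSphere API objects.
import ipaddress

def can_migrate(src_host, dst_host):
    """Return a list of unmet prerequisites (an empty list means good to go)."""
    problems = []
    # 1. Both hosts must participate in at least one common dvSwitch.
    if not set(src_host["dvswitches"]) & set(dst_host["dvswitches"]):
        problems.append("no shared dvSwitch between source and destination")
    # 2. The vMotion VMkernels must be in the same subnet.
    src_net = ipaddress.ip_interface(src_host["vmotion_vmk"]).network
    dst_net = ipaddress.ip_interface(dst_host["vmotion_vmk"]).network
    if src_net != dst_net:
        problems.append("vMotion VMkernels are in different subnets")
    # 3. The virtual machines' portgroup must exist on both hosts.
    if not set(src_host["portgroups"]) & set(dst_host["portgroups"]):
        problems.append("no common portgroup for the virtual machines")
    return problems

# Hypothetical state of a source and destination host after the changes
# described later in this post (shared dvSwitch, VMkernels in VLAN 3000).
src = {"dvswitches": ["dvSwitch_Dst"],
       "vmotion_vmk": "10.30.0.11/24",
       "portgroups": ["PG_VM_Dst"]}
dst = {"dvswitches": ["dvSwitch_Dst"],
       "vmotion_vmk": "10.30.0.21/24",
       "portgroups": ["PG_VM_Dst"]}

print(can_migrate(src, dst))  # [] -> all prerequisites met
```

Running the same check against the clusters in their original state (different dvSwitches, different vMotion subnets, no common portgroup) would report all three problems, which is exactly why the changes below were needed.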
How could we achieve this? Let’s take an in-depth look; a figure is attached below.
The above figure represents how the clusters are set up (before migration). To allow vMotion and Storage vMotion to work, a single dvSwitch must first be shared across both clusters. Since the destination ESXi servers are configured with LAG & LACP, it was decided to pull one uplink out of the source dvSwitch and add it to the destination dvSwitch. It will look like the following:
The next task is to put the vMotion VMkernels in the same subnet, and this needs to be performed on all 8 ESXi servers. Make sure you record the previous VMkernel settings, as they will be needed when reverting back.
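Because every VMkernel change has to be reverted afterwards, it helps to snapshot the original settings before touching anything. A minimal sketch of that record-then-revert bookkeeping, with made-up host names, VLANs and addresses (in practice you would read and write these via the vSphere client or API):

```python
# Record each host's original vMotion VMkernel settings so they can be
# restored after the migration.  All names and addresses are hypothetical.
original = {}  # host -> (vlan, ip/prefix) before the change

def rehome_vmk(host, current, shared):
    """Remember the current settings and return the migration-time settings."""
    original[host] = current
    return shared

def revert_vmk(host):
    """Return the settings recorded before the migration."""
    return original.pop(host)

# Move a source-cluster host's vMotion VMkernel into the shared VLAN 3000.
print(rehome_vmk("esxi-src-01", ("VLAN 1000", "10.10.0.11/24"),
                 ("VLAN 3000", "10.30.0.11/24")))
# ...migration happens...
print(revert_vmk("esxi-src-01"))  # ('VLAN 1000', '10.10.0.11/24')
```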
The last requirement is that all virtual machines must use a portgroup that is shared between the clusters. This can be accomplished simply by using a portgroup in the destination dvSwitch; please refer to the figure above.
There are a couple of ways of doing this:
- Modifying the portgroup in each virtual machine’s Edit Settings
- Using the dvSwitch Manage Portgroup option
Whichever is used, the outcome will be the same.
To summarise, the required changes are:
- Add the destination dvSwitch to a source ESXi server
- Pull one uplink out of the source dvSwitch and add it to the destination dvSwitch
- Re-configure the vMotion VMkernels on all ESXi servers to be in VLAN 3000
- Change the portgroup of the virtual machines to one in the destination dvSwitch
For the above changes, only one source ESXi server was reconfigured. The reason is that once a 10GbE uplink is pulled out of the source dvSwitch, the virtual machines on that host will be running with a single 10GbE uplink. This could cause a performance issue for virtual machines that require extensive network bandwidth. To minimise the impact, it was decided to dedicate one source ESXi server purely to the migration. I will call it ESXi_For_Migration.
The migration flow was:
1. Migrate virtual machines to ESXi_For_Migration
2. Change the portgroup of the virtual machines running on ESXi_For_Migration to one in the destination dvSwitch
3. vMotion & Storage vMotion the virtual machines to the destination cluster
4. Once done, vMotion virtual machines from one of the remaining 3 ESXi servers to ESXi_For_Migration (alternatively, that ESXi server can be placed in maintenance mode if DRS is set to fully automated)
5. Repeat steps 2 to 4 until the migration is finished
6. Once all done, bring the uplink back to the source dvSwitch and remove the destination dvSwitch from ESXi_For_Migration
7. Put the vMotion VMkernels on all ESXi servers back in their original VLANs, i.e. VLAN 1000 for the source and VLAN 2000 for the destination
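The staging pattern above can be simulated to sanity-check the ordering: virtual machines are funnelled through ESXi_For_Migration one source host at a time, so only that single host ever runs with one uplink. A toy model of the flow, with made-up host and VM names:

```python
# Toy simulation of the staged migration: VMs are drained through one
# dedicated source host (ESXi_For_Migration) so the other source hosts
# keep both 10GbE uplinks throughout.  All names are hypothetical.
source_hosts = {
    "esxi-src-01": [],                 # ESXi_For_Migration (emptied first)
    "esxi-src-02": ["vm-a", "vm-b"],
    "esxi-src-03": ["vm-c"],
    "esxi-src-04": ["vm-d", "vm-e"],
}
migration_host = "esxi-src-01"
destination = []

def migrate_cluster(hosts, staging):
    for host, vms in hosts.items():
        if host == staging:
            continue
        # Step 4: vMotion the next host's VMs onto ESXi_For_Migration
        # (same cluster, so shared storage is still available here).
        for vm in list(vms):
            vms.remove(vm)
            hosts[staging].append(vm)
        # Steps 2-3: change the portgroup, then vMotion + Storage vMotion
        # each VM across to the destination cluster.
        while hosts[staging]:
            destination.append(hosts[staging].pop(0))

migrate_cluster(source_hosts, migration_host)
print(sorted(destination))                            # ['vm-a', 'vm-b', 'vm-c', 'vm-d', 'vm-e']
print(all(not vms for vms in source_hosts.values()))  # True
```

The key design choice the model makes visible: the source cluster is drained host by host rather than all at once, so the reduced-bandwidth condition is confined to the one sacrificial host.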
Hopefully the migration scenario explained above will help you if you are on a migration project and looking for a real-life example.
In the near future, I will upload another migration scenario that might be useful.