Last week, the migration of virtual machines to vMSC was almost complete, and it was time to move one cluster to another vCenter server to distribute the workload. During the testing phase I ran into an unexpected result, and I would like to share the experience with you.
The VMware infrastructure was set up as follows:
- 2 x vCenter servers
- vCenter_A and vCenter_B
- 1 x Metro Storage Cluster
- 6 x ESXi servers
- dvSwitch with the same name and configuration on both vCenter servers
The migration plan in the document was as follows:
- Export the dvSwitch configuration from vCenter_A and import it on vCenter_B
- This was done in advance
- Disconnect and remove one ESXi server from vCenter_A
- Connect the ESXi server to vCenter_B
- Repeat the disconnect and connect steps for the remaining ESXi servers
Before working on the production ESXi servers, it was decided to move one test ESXi server (hereafter ESXi_Test) across to vCenter_B to validate the migration plan. Following the document, I performed the procedure above:
- Disconnected and removed ESXi_Test from vCenter_A
- Connected ESXi_Test on vCenter_B
- Found that the dvSwitch appeared to be missing
- The dvSwitch did exist; the problem was that some of the portgroups had been modified in vCenter_A but not in vCenter_B, i.e. the configurations didn't match
- It was decided to export the dvSwitch from vCenter_A, import it into vCenter_B again, and repeat the test
- Disconnected and removed ESXi_Test from vCenter_B
- Connected ESXi_Test back to vCenter_A
At this stage, ESXi_Test failed to connect to vCenter_A, with the following error message:
ODBC error: (23000) – [Microsoft][SQL Server Native Client 10.0][SQL Server]The INSERT statement conflicted with the FOREIGN KEY constraint “FK_COMP_RES_VSAN_REF_HOST”. The conflict occurred in database “vcenter_database”, table “dbo.VPX_HOST”, column ‘ID’.” is returned when executing SQL statement “INSERT INTO VPX_COMPUTE_RESOURCE_VSAN_HOST WITH (ROWLOCK) (COMP_RES_ID, HOST_ID, NODE_UUID, CLUSTER_UUID, AUTOCLAIM_STORAGE, ENABLED) VALUES (?, ?, ?, ?, ?, ?)”
Something went wrong while inserting a row into the vCenter database. I tried again and hit the same issue. The worst part was that, after a few minutes, vCenter_A went offline.
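For readers unfamiliar with this class of error: a foreign key conflict on INSERT means the new row points at a parent row that does not exist. Judging by the message, the row being added to VPX_COMPUTE_RESOURCE_VSAN_HOST referred to a host ID that was not present in VPX_HOST. The sketch below reproduces the same kind of failure in Python with an in-memory SQLite database. The two tables are heavily simplified stand-ins (the real vCenter schema has more columns, and treating HOST_ID as the referencing column is my assumption), so it illustrates the mechanism only, not why the host row was missing.

```python
import sqlite3


def demo_foreign_key_conflict():
    """Insert a child row whose host ID has no parent row; return the error text."""
    conn = sqlite3.connect(":memory:")
    try:
        # SQLite only enforces foreign keys when this pragma is switched on.
        conn.execute("PRAGMA foreign_keys = ON")

        # Simplified stand-ins for the vCenter tables named in the error message.
        conn.execute("CREATE TABLE VPX_HOST (ID INTEGER PRIMARY KEY)")
        conn.execute(
            "CREATE TABLE VPX_COMPUTE_RESOURCE_VSAN_HOST ("
            " COMP_RES_ID INTEGER,"
            " HOST_ID INTEGER REFERENCES VPX_HOST(ID))"  # assumed FK column
        )

        # Host 1 exists, so a child row pointing at it is accepted.
        conn.execute("INSERT INTO VPX_HOST (ID) VALUES (1)")
        conn.execute(
            "INSERT INTO VPX_COMPUTE_RESOURCE_VSAN_HOST (COMP_RES_ID, HOST_ID)"
            " VALUES (?, ?)",
            (10, 1),
        )

        # Host 2 has no row in VPX_HOST, so this insert is rejected.
        try:
            conn.execute(
                "INSERT INTO VPX_COMPUTE_RESOURCE_VSAN_HOST (COMP_RES_ID, HOST_ID)"
                " VALUES (?, ?)",
                (10, 2),
            )
        except sqlite3.IntegrityError as exc:
            return str(exc)
        return None
    finally:
        conn.close()


if __name__ == "__main__":
    print(demo_foreign_key_conflict())
```

SQL Server words the error differently from SQLite, but the condition is the same: the parent row has to exist before the child row can be inserted.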
To troubleshoot this:
- Checked the vCenter services
- Everything was running
- Checked the database services
- Everything was running
This indicated that the vCenter and database services were running, but the communication between them through ODBC had an issue. The first thing I tried was restarting the vCenter server services using the Heartbeat console. Luckily, this brought the vCenter server back online.
I can't say for certain that this happened because of the dvSwitch mismatch between vCenter_A and vCenter_B. Hence, I will log a support request with VMware Support and update this post in the near future.
In the meantime, if you are planning to do the same thing, bear in mind that bringing an ESXi server back to its original vCenter server might cause an outage.
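Whether or not the portgroup mismatch was the cause, it is cheap to confirm that both vCenter servers agree on the dvSwitch portgroups before moving a host. Below is a minimal Python sketch of such a pre-check. It assumes the portgroup settings from each vCenter have already been collected into plain dictionaries; the function name, the portgroup names and the `vlan`/`ports` fields are hypothetical and do not reflect the actual vSphere export format.

```python
def diff_portgroups(source, target):
    """Compare two {portgroup_name: settings_dict} mappings.

    Returns the portgroups missing from target, the ones only in target,
    and per-setting differences as (source_value, target_value) pairs.
    """
    missing = sorted(set(source) - set(target))
    extra = sorted(set(target) - set(source))

    changed = {}
    for name in sorted(set(source) & set(target)):
        keys = set(source[name]) | set(target[name])
        delta = {
            key: (source[name].get(key), target[name].get(key))
            for key in sorted(keys)
            if source[name].get(key) != target[name].get(key)
        }
        if delta:
            changed[name] = delta

    return {
        "missing_in_target": missing,
        "extra_in_target": extra,
        "changed": changed,
    }


# Hypothetical example data standing in for vCenter_A and vCenter_B.
vcenter_a = {
    "PG_Prod": {"vlan": 100, "ports": 128},
    "PG_Test": {"vlan": 200, "ports": 8},
    "PG_Mgmt": {"vlan": 10, "ports": 8},
}
vcenter_b = {
    "PG_Prod": {"vlan": 100, "ports": 128},
    "PG_Test": {"vlan": 201, "ports": 8},
}

if __name__ == "__main__":
    print(diff_portgroups(vcenter_a, vcenter_b))
```

If every entry in the result is empty, the two sets of portgroups match on the fields that were compared; anything else is worth resolving before disconnecting a host.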
Hope this helps.