PowerCLI Report Tip – Part 4

Introduction

Finally, we’ve reached the last post in the PowerCLI Report Tip series! In this blog post, I will discuss how PLINK can be used for advanced reporting. This post will be quite short compared to the others; if you want to read the previous posts in the series, they can be found below:

What is PLINK?

Many of you will already be familiar with an SSH client: PuTTY if you are running Windows, or Terminal on a Mac. PLINK is a command-line interface to the PuTTY back ends. More information can be found, and the tool downloaded, here.

Why PLINK with PowerCLI?

Normally, PowerCLI is enough for reports based on information from vCenter Server and ESXi servers. Sometimes, however, a report requires additional information from external sources. I will explain the reasons in the Examples section.

How do you run PLINK with PowerCLI?

Running PLINK with PowerCLI is quite simple:

  1. Download the file
  2. Place it in a folder
  3. Run it

Example attached below:

PowerCLI C:\script\plink> .\PLINK.EXE
PuTTY Link: command-line connection utility
Usage: plink [options] [user@]host [command]
("host" can also be a PuTTY saved session name)
Options:
  -V        print version information and exit
  -pgpfp    print PGP key fingerprints and exit
  -v        show verbose messages
  -load sessname  Load settings from saved session
  -ssh -telnet -rlogin -raw -serial
            force use of a particular protocol
  -P port   connect to specified port
  -l user   connect with specified username
  -batch    disable all interactive prompts
The following options only apply to SSH connections:
  -pw passw login with specified password
  -D [listen-IP:]listen-port
            Dynamic SOCKS-based port forwarding
  -L [listen-IP:]listen-port:host:port
            Forward local port to remote address
  -R [listen-IP:]listen-port:host:port
            Forward remote port to local address
  -X -x     enable / disable X11 forwarding
  -A -a     enable / disable agent forwarding
  -t -T     enable / disable pty allocation
  -1 -2     force use of particular protocol version
  -4 -6     force use of IPv4 or IPv6
  -C        enable compression
  -i key    private key file for authentication
  -noagent  disable use of Pageant
  -agent    enable use of Pageant
  -m file   read remote command(s) from file
  -s        remote command is an SSH subsystem (SSH-2 only)
  -N        don't start a shell/command (SSH-2 only)
  -nc host:port
            open tunnel in place of session (SSH-2 only)
  -sercfg configuration-string (e.g. 19200,8,n,1,X)
            Specify the serial configuration (serial only)

Specifically, you can execute the following to run a command on a remote server:

C:\script\PLINK\plink.exe -pw "Password of the user" "Username@ServerName" "Command you want to run"

Assuming the connection is successfully made, let’s go through the tips.

Tips, PLINK with PowerCLI

First tip

When you run an SSH command against a server for the first time, PLINK asks you to accept the server’s host key, like the following:

PowerCLI C:\script\plink> .\PLINK.EXE -pw "Password" "Username@ServerName" "Command you want to run"
The server's host key is not cached in the registry. You have no guarantee that the server is the computer you think it is.
The server's dss key fingerprint is:
ssh-dss 1024 aa:bb:cc:dd:ee:ff:gg:aa:bb:cc:dd:ee:ff:gg:aa:bb
If you trust this host, enter "y" to add the key to
PuTTY's cache and carry on connecting.
If you want to carry on connecting just once, without
adding the key to the cache, enter "n".
If you do not trust this host, press Return to abandon the connection.
Store key in cache? (y/n)

Instead of typing “y” every time you connect to a new host, you can simply pipe it in with “echo”. Use the following:

echo y | C:\script\PLINK\plink.exe -pw "Password of the user" "Username@ServerName" "Command you want to run"

Second tip

When you save output from PLINK to a PowerCLI variable, formatting it is not straightforward. Let’s look at an example. Below is output from a Dell Remote Access Controller (DRAC):

PowerCLI C:\script\plink> .\PLINK.EXE -pw "Password" "root@10.10.10.1" "racadm getniccfg"
IPv4 settings:
 NIC Enabled = 1
 IPv4 Enabled = 1
 DHCP Enabled = 0
 IP Address = 10.10.10.1
 Subnet Mask = 255.255.255.0
 Gateway = 10.10.10.254
IPv6 settings:
 IPv6 Enabled = 0
 DHCP6 Enabled = 1
 IP Address 1 = ::
 Gateway = ::
 Link Local Address = ::
 IP Address 2 = ::
 IP Address 3 = ::
 IP Address 4 = ::
 IP Address 5 = ::
 IP Address 6 = ::
 IP Address 7 = ::
 IP Address 8 = ::
 IP Address 9 = ::
 IP Address 10 = ::
 IP Address 11 = ::
 IP Address 12 = ::
 IP Address 13 = ::
 IP Address 14 = ::
 IP Address 15 = ::
LOM Status:
 NIC Selection = Dedicated
 Link Detected = Yes
 Speed = 100Mb/s
 Duplex Mode = Full Duplex

Let’s say you would like to pull out the network settings, i.e. IP Address/Subnet Mask/Gateway. How can we pull only those three elements out? First of all, save the output to a variable:

PowerCLI C:\script\plink> $output = .\PLINK.EXE -pw "Password" "root@10.10.10.1" "racadm getniccfg"

The good news is that the output is saved as an array of lines, not as a single string, meaning you can do the following:

PowerCLI C:\script\plink> $output[5]
IP Address = 10.10.10.1
PowerCLI C:\script\plink> $output[6]
Subnet Mask = 255.255.255.0
PowerCLI C:\script\plink> $output[7]
Gateway = 10.10.10.254

Then you can simply use the -replace and -split operators to get the information you are after:

PowerCLI C:\script\plink> $output[5] -replace " "
IPAddress=10.10.10.1 

PowerCLI C:\script\plink> ($output[5] -replace " ") -split "="
IPAddress
10.10.10.1

PowerCLI C:\script\plink> $ipaddress = (($output[5] -replace " ") -split "=")[1]
PowerCLI C:\script\plink> $ipaddress
10.10.10.1

Not the most elegant approach, but it’s achievable.
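Rather than indexing each line separately, you can loop over the output and build a lookup table keyed by setting name. A minimal self-contained sketch — the sample lines below are hardcoded from the racadm output above in place of a live PLINK call:

```powershell
# Sample lines as returned by plink (IPv4 section of "racadm getniccfg")
$output = @(
    'IPv4 settings:',
    ' NIC Enabled = 1',
    ' IPv4 Enabled = 1',
    ' DHCP Enabled = 0',
    ' IP Address = 10.10.10.1',
    ' Subnet Mask = 255.255.255.0',
    ' Gateway = 10.10.10.254'
)

# Keep only "name = value" lines, strip spaces, and split once on "="
$settings = @{}
foreach ($line in $output | Where-Object { $_ -match '=' }) {
    $name, $value = ($line -replace ' ') -split '=', 2
    $settings[$name] = $value
}

$settings['IPAddress']    # 10.10.10.1
$settings['SubnetMask']   # 255.255.255.0
$settings['Gateway']      # 10.10.10.254
```

The same loop works on the live $output array, but note that the full getniccfg output repeats names (the IPv6 section also has a Gateway line), so you may want to stop processing at the first “IPv6 settings:” line.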

Let’s take a look at another example. This time, it’s output from an IBM SVC:

PowerCLI C:\script\plink> .\PLINK.EXE -pw "Password!" "UserName@ServerName" "lsvdisk"
id name             IO_group_id IO_group_name status mdisk_grp_id mdisk_grp_name capacity type    FC_id FC_name RC_id RC_name vdisk_UID                        fc_map_count copy_count fast_write_state se_copy_count RC_change compressed_copy_count
0  vDisk1           0           io_grp0       online 0            MDISK1         1.00TB   striped                             6005076812345678A000000000000000 0            1          not_empty        0             no        0
1  vDisk2           0           io_grp0       online 1            MDISK2         1.00TB   striped                             6005076812345678A000000000000003 0            1          empty            0             no        0

Looking at the output, it won’t be as easy as the first example, since the columns are separated by a varying number of spaces.

Fortunately, the SVC command has a parameter to output with a delimiter, i.e. CSV output:

PowerCLI C:\script\plink> .\PLINK.EXE -pw "Password!" "Username@ServerName" "lsvdisk -delim ,"
id,name,IO_group_id,IO_group_name,status,mdisk_grp_id,mdisk_grp_name,capacity,type,FC_id,FC_name,RC_id,RC_name,vdisk_UID,fc_map_count,copy_count,fast_write_state,se_copy_count,RC_change,compressed_copy_count
0,vDisk1,0,io_grp0,online,0,MDISK1,1.00TB,striped,,,,,6005076812345678A000000000000000,0,1,not_empty,0,no,0
1,vDisk2,0,io_grp0,online,1,MDISK2,1.00TB,striped,,,,,6005076812345678A000000000000003,0,1,empty,0,no,0

Why would you want to output with the delimiter “,”? The reason is simple: PowerShell has the Import-Csv cmdlet, and with it, formatting the output is quite easy:

  1. Use > to save the output as a CSV file
  2. Then, run Import-Csv

PowerCLI C:\script\plink> $output = .\PLINK.EXE -pw "Password!" "Username@ServerName" "lsvdisk -delim ,"
PowerCLI C:\script\plink> $output > test.csv
PowerCLI C:\script\plink> Import-Csv .\test.csv

Now, you will see formatted output:

id                    : 0
name                  : vDisk1
IO_group_id           : 0
IO_group_name         : io_grp0
status                : online
mdisk_grp_id          : 0
mdisk_grp_name        : MDISK1
capacity              : 1.00TB
type                  : striped
FC_id                 :
FC_name               :
RC_id                 :
RC_name               :
vdisk_UID             : 6005076812345678A000000000000000
fc_map_count          : 0
copy_count            : 1
fast_write_state      : not_empty
se_copy_count         : 0
RC_change             : no
compressed_copy_count : 0

id                    : 1
name                  : vDisk2
IO_group_id           : 0
IO_group_name         : io_grp0
status                : online
mdisk_grp_id          : 1
mdisk_grp_name        : MDISK2
capacity              : 1.00TB
type                  : striped
FC_id                 :
FC_name               :
RC_id                 :
RC_name               :
vdisk_UID             : 6005076812345678A000000000000003
fc_map_count          : 0
copy_count            : 1
fast_write_state      : empty
se_copy_count         : 0
RC_change             : no
compressed_copy_count : 0

This formats the output nicely and simply. One thing to note is that it saves the output to your local disk, e.g. the C drive. I suggest deleting the file once you have saved its contents to a variable, like this:

PowerCLI C:\script\plink> .\PLINK.EXE -pw "Password!" "Username@ServerName" "lsvdisk -delim ," > output.csv
PowerCLI C:\script\plink> $output = Import-Csv output.csv
PowerCLI C:\script\plink> Remove-Item output.csv
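The run/import/delete steps above can be wrapped into a small reusable helper. A hedged sketch — Get-PlinkCsv is a hypothetical function name, the default plink path is a placeholder, and it assumes the remote command emits CSV (as lsvdisk -delim , does):

```powershell
# Hypothetical helper: run a remote command via PLINK and parse its CSV output
function Get-PlinkCsv {
    param(
        [string]$PlinkPath = "C:\script\PLINK\plink.exe",
        [Parameter(Mandatory)][string]$Target,      # e.g. "Username@ServerName"
        [Parameter(Mandatory)][string]$Password,
        [Parameter(Mandatory)][string]$Command      # e.g. "lsvdisk -delim ,"
    )
    $tempFile = [System.IO.Path]::GetTempFileName()
    try {
        # "echo y" accepts the host key on a first connection (first tip)
        echo y | & $PlinkPath -pw $Password $Target $Command | Out-File $tempFile
        Import-Csv $tempFile
    }
    finally {
        Remove-Item $tempFile   # don't leave the output lying around on disk
    }
}

# Usage:
# $vdisks = Get-PlinkCsv -Target "Username@ServerName" -Password "Password!" -Command "lsvdisk -delim ,"
```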

Time to go through examples.

Example 1

I already wrote a blog post, Virtual Machines running on vDisks in relationships. The purpose of that report was to find which virtual machines are mapped to VMFS volumes that are in a Metro Mirror relationship, i.e. synchronous replication. Why would you need PLINK for this?

Even though the naming convention across VMFS volumes is solid enough to indicate which ones are in a relationship, administrators make mistakes. If any VMFS volumes follow the wrong naming convention, the report will not be accurate. Hence, I decided to match against the vDisk UID so the report is 100% right! For more information, refer to the link attached above.

Example 2

To check custom services running on ESXi servers:

PowerCLI C:\script\plink> Get-VMHostService -VMHost (Get-VMHost -Name "ESXi.test.com")
Key            Label                          Policy Running Required
---            -----                          ------ ------- --------
DCUI           Direct Console UI              on     True    False
TSM            ESXi Shell                     off    False   False
TSM-SSH        SSH                            on     True    False
lbtd           lbtd                           on     True    False
lsassd         Local Security Authenticati... off    False   False
lwiod          I/O Redirector (Active Dire... off    False   False
netlogond      Network Login Server (Activ... off    False   False
ntpd           NTP Daemon                     on     True    False
sfcbd-watchdog CIM Server                     on     True    False
snmpd          snmpd                          on     False   False
vmware-fdm     vSphere High Availability A... off    False   False
vprobed        vprobed                        off    False   False
vpxa           vpxa                           on     True    False
xorg           xorg                           on     False   False

What if there is a custom VIB installed, for example HP AMS, and you want to check its status? Unfortunately, Get-VMHostService won’t show it.

The way to check the status is to SSH to the ESXi server directly and run /etc/init.d/hp-ams.sh status:

PowerCLI C:\script\plink> .\PLINK.EXE -pw "Password!" "Username@ESXiServer" "/etc/init.d/hp-ams.sh status"
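To turn this into a report, the same command can be run against every host in a cluster. A hedged sketch — the cluster name, root password, and plink path are placeholders, and it assumes the HP AMS VIB is installed (and SSH enabled) on each host:

```powershell
# Sketch: collect hp-ams status from every ESXi host in a cluster via PLINK
$plink = "C:\script\PLINK\plink.exe"
$report = foreach ($vmhost in Get-Cluster "Production" | Get-VMHost) {
    # "echo y" accepts the host key on a first connection (first tip)
    $status = echo y | & $plink -pw "Password!" "root@$($vmhost.Name)" "/etc/init.d/hp-ams.sh status"
    [PSCustomObject]@{
        Host   = $vmhost.Name
        Status = ($status | Select-Object -Last 1)   # keep the final status line
    }
}
$report | Export-Csv hp-ams-report.csv -NoTypeInformation
```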

Wrap-Up

Hope the PowerCLI Report Tip series helped. In the near future, I will come back with PowerCLI automation tips.

As always, feel free to leave a comment for any clarifications.

PowerCLI Report Tip – Part 3

Introduction

In PowerCLI Report Tip part 3, I will be going through advanced reporting using ESXCLI.

For the contents below, the version of ESXi server used was 5.5 No Update.

What’s ESXCLI?

I suggest you go and read the following blogs:

Why ESXCLI For Reporting?

PowerCLI itself already has most of the functions needed to generate a report. However, some ESXCLI commands can produce a report in a faster and easier way. This will be discussed in the later sections with specific examples (scenarios).

In this blog post, there will be two examples discussed:

  1. Storage 
  2. Network & Firewall

Preparation

Before starting: one of the advantages of running ESXCLI through PowerCLI is that SSH does not have to be enabled. It does require certain privileges, though, so you might as well run it with a read-only account and add permissions accordingly.

First of all, let’s run ESXCLI and save it to a variable: $esxcli = Get-VMHost -Name "ESXi" | Get-EsxCli. Then, calling $esxcli will output the following:

PowerCLI C:\> $esxcli
===============================
EsxCli: esx01.test.com

Elements:
---------
device
esxcli
fcoe
graphics
hardware
iscsi
network
sched
software
storage
system
vm
vsan

The output looks very similar to running ESXCLI in the ESXi shell. The difference is that namespaces are joined with dots instead of spaces; for example, to call the storage namespace you run $esxcli.storage rather than esxcli storage.
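For example, the leaf commands become methods that you invoke with parentheses. A quick sketch, assuming $esxcli was saved as above:

```powershell
# "esxcli system version get" on the ESXi shell becomes:
$esxcli.system.version.get()

# "esxcli storage core device list" becomes:
$esxcli.storage.core.device.list() | Select-Object -First 1
```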

With the preparation work above, we are ready to go through some examples! 🙂

Storage

Let’s start with an example. There is a request from the storage team to generate a report across all VMFS volumes (mapped to FC disks), essentially the same as the following screenshot:
PowerCLI Report Tip #3 1
 

NOTE: For the report below, I am assuming the virtual disks from the storage array are mapped to all ESXi servers in a cluster (I guess this is usual for most people, to benefit from HA/DRS).

Looking at the screenshot above, the report should contain:

  • Cluster
  • Adapter, e.g. vmhba2 or vmhba3
  • Device, i.e. UID
  • TargetIdentifier, e.g. 50:05:07……
  • RuntimeName, e.g. C0:T1:L11
  • LUN, e.g. 11
  • State, e.g. Active

Using ESXCLI, this can be achieved quite simply. Assuming you have already saved ESXCLI to a variable $esxcli, save the following to variables accordingly:

  • $esxcli.storage.core.path.list()
    • It outputs the list of all paths of storage devices attached to this ESXi server.
  • $esxcli.storage.core.device.list()
    • It outputs the list of all storage devices attached to this ESXi server.

Then, using the device list, filter to query only Fibre Channel devices, and for each of them select the required elements from the paths that match the device.

Combining the above, it becomes:

$path_list = $esxcli.storage.core.path.list()
$device_list = $esxcli.storage.core.device.list()
$cluster = Get-Cluster -VMHost (Get-VMHost -Name $esxcli.system.hostname.get().FullyQualifiedDomainName)

$device_list | where {$_.DisplayName -match "Fibre Channel"} | ForEach-Object {
    $device = $_.Device
    $path_list | where {$_.Device -match $device} |
        select @{N="Cluster";E={$cluster.Name}}, Adapter, Device, TargetIdentifier, RuntimeName, LUN, State
}

Example Output:

Cluster : Development
Adapter : vmhba3
Device : naa.60050768018d8303c000000000000003
TargetIdentifier : fc.5005076801000002:5005076801100002
RuntimeName : vmhba3:C0:T0:L11
LUN : 11
State : active

Cluster : Development
Adapter : vmhba3
Device : naa.60050768018d8303c000000000000003
TargetIdentifier : fc.5005076801000001:5005076801100001
RuntimeName : vmhba3:C0:T1:L11
LUN : 11
State : active 

Cluster : Development
Adapter : vmhba2
Device : naa.60050768018d8303c000000000000003
TargetIdentifier : fc.5005076801000002:5005076801200002
RuntimeName : vmhba2:C0:T0:L11
LUN : 11
State : active

Cluster : Development
Adapter : vmhba2
Device : naa.60050768018d8303c000000000000003
TargetIdentifier : fc.5005076801000001:5005076801200001
RuntimeName : vmhba2:C0:T1:L11
LUN : 11
State : active

Quite easy, isn’t it?

Another example: the virtualisation team manager asked for the virtual disks (FC type) that are attached to ESXi servers but not formatted as VMFS. To be more specific, he expected the following:

  • Cluster
  • Device
  • Device file system path
  • Display Name
  • Size

With the report above, it is very handy to identify which virtual disks are being wasted.

Using ESXCLI, the report above can be produced simply. Save the following to variables accordingly:

  • $esxcli.storage.core.device.list()
    • It outputs the list of all storage devices attached to this ESXi server.
  • $esxcli.storage.vmfs.extent.list()
    • It outputs the list of all storage devices partitioned (formatted) as VMFS volumes attached to this ESXi server.

Using the device list, run a where filter to:

  • Make sure the device is not formatted as VMFS
    • I used -notmatch against all VMFS device names joined by "|", which means "or"
  • Make sure the type is Fibre Channel

Combining the above, it becomes:

$device_list = $esxcli.storage.core.device.list()
$vmfs_list = $esxcli.storage.vmfs.extent.list()
$cluster = Get-Cluster -VMHost (Get-VMHost -Name $esxcli.system.hostname.get().FullyQualifiedDomainName)

$device_list | where {$_.Device -notmatch ([string]::Join("|", $vmfs_list.DeviceName)) -and $_.DisplayName -match "Fibre Channel" } | select @{N="Cluster";E={$cluster.Name}}, Device, DevfsPath, DisplayName, @{N="Size (GB)";E={$_.Size / 1024}}

Example output:

Cluster : Development
Device : naa.60050768018d8303c000000000000006
DevfsPath : /vmfs/devices/disks/naa.60050768018d8303c000000000000006
DisplayName : IBM Fibre Channel Disk (naa.60050768018d8303c000000000000006)
Size (GB) : 128

Hope the examples above were easy to follow. Let’s move on to Network.

Network

In this Network section, I will be giving two examples with:

  1. Firewall
  2. LACP

Let’s start with Firewall.

One of the VMware administrators deployed vRealize Log Insight, and before configuring the ESXi servers to point to Log Insight, he wanted to check which allowed IP addresses had been configured previously and remove them in advance. (Access to the syslog service had been restricted to specific IP addresses for security purposes.)

This time, we will use the $esxcli.network.firewall namespace. First of all, save the list of rulesets with allowed IP addresses:

  • $esxcli.network.firewall.ruleset.allowedip.list()

Then, use a filter to query only the syslog service. Combining the above:

$esxi= $esxcli.system.hostname.get().FullyQualifiedDomainName
$ruleset_list = $esxcli.network.firewall.ruleset.allowedip.list() 
$ruleset_list | where {$_.ruleset -eq "syslog"} | select @{N="ESXi";E={$esxi}}, Ruleset, AllowedIPAddresses

Example output:

ESXi : esx01.test.com
Ruleset : syslog
AllowedIPAddresses : {10.10.1.10}
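The same check scales out by fetching an EsxCli object per host. A hedged sketch, assuming a cluster named "Production":

```powershell
# Sketch: report the syslog allowed-IP list for every host in a cluster
$report = foreach ($vmhost in Get-Cluster "Production" | Get-VMHost) {
    $esxcli = $vmhost | Get-EsxCli
    $esxcli.network.firewall.ruleset.allowedip.list() |
        Where-Object { $_.Ruleset -eq "syslog" } |
        Select-Object @{N="ESXi";E={$vmhost.Name}}, Ruleset, AllowedIPAddresses
}
$report
```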

Another example: the network team wanted output from the ESXi servers to check the following:

  1. The status of LACPDUs, i.e. transmitted/received, and whether there are any errors
  2. The LACP configuration, especially the LACP period (either Fast or Slow)

I wrote an article about Advanced LACP Configuration using ESXCLI; I suggest reading it if you are not familiar with LACP configuration on ESXi.

Similar to the above, save the LACP stats to a variable and select the following:

  • Name of ESXi
  • Name of dvSwitch
  • NIC, e.g. vmnic0
  • Receive errors
  • Received LACPDUs
  • Transmit errors
  • Transmitted LACPDUs

And the script would be:

$esxi= $esxcli.system.hostname.get().FullyQualifiedDomainName
$lacp_stats = $esxcli.network.vswitch.dvs.vmware.lacp.stats.get()
$lacp_stats | select @{N="ESXi";E={$esxi}}, DVSwitch, NIC, RxErrors, RxLACPDUs, TxErrors, TxLACPDUs

Example Output:

ESXi : esx01.test.com
DVSwitch : dvSwitch_Test
NIC : vmnic1
RxErrors : 0
RxLACPDUs : 556096
TxErrors : 0
TxLACPDUs : 555296

ESXi : esx01.test.com
DVSwitch : dvSwitch_Test
NIC : vmnic0
RxErrors : 0
RxLACPDUs : 556096
TxErrors : 0
TxLACPDUs : 555296

For the configuration report, you might be interested in the Fast/Slow LACP period, as mentioned above.

Similarly, save the LACP status output to a variable. Then, for each object in its NicList, select the following:

  • Name of ESXi server
  • Name of dvSwitch
  • Status of LACP
  • NIC, e.g. vmnic0
  • Flag Description
  • Flags

Combining the above:

$esxi= $esxcli.system.hostname.get().FullyQualifiedDomainName
$information = $esxcli.network.vswitch.dvs.vmware.lacp.status.get()

$information.NicList | ForEach-Object { $_ | Select @{N="ESXi";E={$esxi}}, @{N="dvSwitch";E={$information.dvSwitch}}, @{N="LACP Status";E={$information.Mode}}, Nic, @{N="Flag Description";E={$information.Flags}}, @{N="Flags";E={$_.PartnerInformation.Flags}} }

Example Output:

ESXi : esx01.test.com
dvSwitch : dvSwitch_Test
LACP Status : Active
Nic : vmnic1
Flag Description : {S - Device is sending Slow LACPDUs, F - Device is sending fast LACPDUs, A - Device is in active mode, P - Device is in passive mode}
Flags : SA

ESXi : esx01.test.com
dvSwitch : dvSwitch_Test
LACP Status : Active
Nic : vmnic0
Flag Description : {S - Device is sending Slow LACPDUs, F - Device is sending fast LACPDUs, A - Device is in active mode, P - Device is in passive mode}
Flags : SA

With the report above, the network team can find out which ESXi servers are configured with Fast and which with Slow, so they can configure LACP accordingly (an LACP period mismatch is not good!).
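To make that check explicit, the period can be derived from the partner Flags field, using the flag legend shown above (S = slow LACPDUs, F = fast). A minimal sketch reusing the $information and $esxi variables from the script above:

```powershell
# Sketch: label each NIC's LACP timer from the partner Flags field
$information.NicList | ForEach-Object {
    $period = if ($_.PartnerInformation.Flags -match "F") { "Fast" } else { "Slow" }
    [PSCustomObject]@{
        ESXi   = $esxi
        Nic    = $_.Nic
        Period = $period
    }
}
```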

Wrap-Up

In this blog post, I discussed using ESXCLI commands to generate advanced reports. I didn’t go through the properties as deeply as I did in Part 2; you can take a look at the properties on your own.

Hope it was easy to follow and understand. In the next post in the series, I will discuss how to use PLINK to generate a combined report from ESXi and non-ESXi sources.

As always, you are welcome to leave a reply with any questions or clarifications.

vSphere Migration Scenario #3

Introduction

We’ve been running one vCenter Server 5.1 and two vCenter Servers 5.5 for almost two years. Management decided to decommission the vCenter Server 5.1 instance, as it contained only a few clusters. The work I was involved in was migrating those clusters from vCenter Server 5.1 to vCenter Server 5.5. During the migration work, I noticed a few interesting behaviours, and in this blog post I will go through:

  • Issues
  • Solutions
  • Workarounds

Environment

The following is the production vSphere environment I worked on:

Two vCenter Servers

  • Destination vCenter Server 5.5 No Update
    • vcs5.5
  • Source vCenter Server 5.1 No Update
    • vcs5.1

2 x ESXi Servers 5.1

  • esxi5.1_A
  • esxi5.1_B

2 x dvSwitches (version 5.0)

  • dvSwitch_VM_Network
    • 2 x 10GbE
    • NIOC enabled with a custom resource pool
    • Mapped to one portgroup
  • dvSwitch_iSCSI_Network
    • 2 x 10GbE
    • NIOC disabled

Requirement

Same as in the previous migration scenarios, no outage is allowed. A few packet drops are acceptable.

Risk

Migration of the software iSCSI VMkernel ports configured on dvSwitch_iSCSI_Network.

Mitigation

Import/export dvSwitch_iSCSI_Network and its corresponding portgroups to vcs5.5, preserving their identifiers.

Issues & Solutions

The initial migration plan I came up with was the following:

  1. Export dvSwitch_VM_Network & dvSwitch_iSCSI_Network and import them to vCenter Server 5.5 (pre-work)
  2. Create a new cluster in vcs5.5, same configuration as in vcs5.1
  3. Disable HA & DRS on the cluster
  4. Disconnect & remove esxi5.1_A & esxi5.1_B from vcs5.1
  5. Register esxi5.1_A & esxi5.1_B to the cluster in vcs5.5
  6. Migrate dvSwitches, dvSwitch_VM_Network and dvSwitch_iSCSI_Network to the imported ones in vcs5.5
  7. Enable HA & DRS on the cluster
  8. Delete the cluster instance in vcs5.1
  9. Repeat the steps above for the rest of clusters

For step 1, because it doesn’t affect the production system, I decided to do it before the change. The reason we want to preserve the original distributed switch and portgroup identifiers (mentioned in Mitigation above) is to make sure the ESXi servers at the destination vCenter Server pick up the new dvSwitch without any interruption. Since there are iSCSI VMkernel ports bound to dvSwitch_iSCSI_Network, migrating bound iSCSI VMkernel ports to another dvSwitch live is not allowed; this is the main reason for preserving the original identifiers. While exporting the dvSwitch configuration from vCenter Server 5.1 and importing it into vCenter Server 5.5, it failed with the following error message:

VM Migration Scenario #3

Looking at the vCenter log located under the ProgramData… folder:

2015-01-12T15:18:39.937+13:00 [05388 error 'corevalidate' opID=2f3dc91a] [Validate::CheckLacpFeatureCapability] LACP is not supported on DVS [dvSwitch_VM_Network] 
2015-01-12T15:18:39.937+13:00 [04620 info 'commonvpxLro' opID=E7F30A2D-0004F88E-7e] [VpxLRO] -- FINISH task-internal-33360670 -- -- vmodl.query.PropertyCollector.retrieveContents -- 
2015-01-12T15:18:39.937+13:00 [05388 error 'dvsvpxdMoDvsManager' opID=2f3dc91a] [MoDvsManager::CreateNewEntity] Import Failed while creating DVS from Backup with key[51 4d 2d 50 93 51 73 6b-46 47 d0 fa 09 af 88 fc]. Fault:[vmodl.fault.NotSupported] 
2015-01-12T15:18:39.937+13:00 [05388 error 'dvsvpxdMoDvsManager' opID=2f3dc91a] [MoDvsManager::CreateNewEntity] Import Failed while creating DVPG from Backup with key[dvportgroup-62]. Fault:[vim.fault.NotFound] 
2015-01-12T15:18:39.937+13:00 [05388 error 'dvsvpxdMoDvsManager' opID=2f3dc91a] [MoDvsManager::CreateNewEntity] Import Failed while creating DVPG from Backup with key[dvportgroup-66]. Fault:[vim.fault.NotFound] 
2015-01-12T15:18:39.937+13:00 [05388 error 'dvsvpxdMoDvsManager' opID=2f3dc91a] [MoDvsManager::CreateNewEntity] Import Failed for some hosts 
2015-01-12T15:18:39.937+13:00 [05388 info 'commonvpxLro' opID=2f3dc91a] [VpxLRO] -- FINISH task-290085 -- -- vim.dvs.DistributedVirtualSwitchManager.importEntity -- 
2015-01-12T15:18:39.937+13:00 [05388 info 'Default' opID=2f3dc91a] [VpxLRO] -- ERROR task-290085 -- -- vim.dvs.DistributedVirtualSwitchManager.importEntity: vim.fault.NotFound: --> Result: --> (vim.fault.NotFound) { --> dynamicType = <unset>, --> faultCause = (vmodl.MethodFault) null, --> faultMessage = (vmodl.LocalizableMessage) [ --> (vmodl.LocalizableMessage) { --> dynamicType = <unset>, --> key = "com.vmware.vim.vpxd.dvs.notFound.label", --> arg = (vmodl.KeyAnyValue) [ --> (vmodl.KeyAnyValue) { --> dynamicType = <unset>, --> key = "type", --> value = "DVS", --> }, --> (vmodl.KeyAnyValue) { --> dynamicType = <unset>, --> key = "value", --> value = "51 4d 2d 50 93 51 73 6b-46 47 d0 fa 09 af 88 fc", --> } --> ], --> message = <unset>, --> } --> ], --> msg = "" --> } --> Args: -->

I could find a related KB article; this was a known bug in vCenter Server 5.1 No Update. According to the resolution field, it was fixed in vCenter Server 5.1 Update 2 and in 5.5, so I raised another change in advance to upgrade the vCenter Server to 5.1 Update 2. Even after the vCenter Server was upgraded to 5.1 Update 2, no luck. After consulting VMware Support, it turned out that upgrading the dvSwitch version to 5.1 was also required. Once the dvSwitch was upgraded, import/export worked without a problem.

After this pre-work, I was quite confident about the rest of the migration. On the day of the work, I disconnected/removed esxi5.1_A from the source cluster and added it to the cluster in vcs5.5. The next step was to rejoin esxi5.1_A to the dvSwitch imported into vcs5.5. Before doing this, I was constantly pinging a few virtual machines and the ESXi server to ensure there was no outage.

VM Migration Scenario #3 1

The work was quite simple:

  1. Navigate to Networking View
  2. Right-click on dvSwitch_VM_Network and click Add Host

  3. Ignore migrating VMKernels and VM networking; click Next and Finish

VM Migration Scenario #3 2
VM Migration Scenario #3 3
VM Migration Scenario #3 4

Yup – another issue. The error messages are attached below:

vDS operation failed on host prod.esxi.com, Received SOAP response fault from [<cs p:00000000f3627820, 
TCP:prod.esxi.com:443>]: invokeHostTransactionCall
An error occurred during host configuration. got (vim.fault.PlatformConfigFault) exception
An error occurred during host configuration.
Operation failed, diagnostics report: Unable to set network resource pools list (8) 
(netsched.pools.persist.nfs;netsched.pools.persist.mgmt;netsched.pools.persist.vmotion;netsched.pools.persist.vsan;netsched.pools.persist.hbr;netsched.pools.persist.iscsi;netsched.pools.persist.vm;netsched.pools.persist.ft;) to dvswitch id (48 59 2d 50 06 30 c4 39-96 74 bb 0e c1 73 fc 87); Status: Busy

Screenshot:

VM Migration Scenario #3 6

Investigating the log, it looked like the dvSwitch imported into vcs5.5 had an issue with the network resource pool, i.e. NIOC. Gotcha – the custom NIOC resource pool wasn’t completely imported. Hence, I created one (with exactly the same configuration as defined in vcs5.1) and mapped it to the appropriate portgroup. However, no luck; I still had the same issue as above. I guess the configuration has to be imported rather than created manually.

Workaround

I guessed that the virtual machines using the portgroup with the custom resource pool were causing the issue. One attempt I made was to update the dvSwitch with the ESXi server in maintenance mode, i.e. with no virtual machines running. I was correct – if there are no virtual machines at all, the dvSwitch update is successful. Once this was done, the next update required was dvSwitch_iSCSI_Network. The expected behaviour, “migration of iSCSI VMkernel ports can cause an APD state on some LUNs”, appeared as attached below. However, since we had preserved the identifiers of the dvSwitch and portgroups, it was safe to continue without resolving the errors:

VM Migration Scenario #3 7

After the work on esxi5.1_A, I migrated esxi5.1_B to vcs5.5 and placed it in maintenance mode to evacuate its virtual machines to esxi5.1_A. Once vMotion finished, I updated the dvSwitch, and it was successful!

Final Migration Plan

The following is the final migration plan:

  1. Export dvSwitch_VM_Network & dvSwitch_iSCSI_Network and import them to vcs5.5 (pre-work)
  2. Ensure vCenter Server is 5.1 Update 2 or above on the source
  3. Create a new cluster in vcs5.5, same configuration as in vcs5.1
  4. Disable HA & DRS on the cluster in vcs5.1
  5. Place esxi5.1_A on maintenance mode
  6. Disconnect & remove esxi5.1_A vcs5.1 and register it in vcs5.5 cluster
  7. Re-join esxi5.1_A to dvSwitch_VM_Network and dvSwitch_iSCSI_Network in vcs5.5
  8. Place esxi5.1_B on maintenance mode
  9. Disconnect & remove esxi5.1_B vcs5.1 and register it in vcs5.5 cluster
  10. Exit esxi5.1_A and esxi5.1_B maintenance mode
  11. Enable DRS only to fully automated
  12. Place esxi5.1_B on maintenance mode
  13. Once done, Re-join esxi5.1_B to dvSwitch_VM_Network and dvSwitch_iSCSI_Network to the imported ones in vcs5.5
  14. Enable HA on the cluster in vcs5.5
  15. Delete the cluster instance in vcs5.1
  16. Repeat above for the rest of clusters

Recommended Post Work

I think I am facing another bug: the migrated ESXi servers don’t recognise their current network settings. Screenshot attached below:

VM Migration Scenario #3 5

There is no problem selecting portgroups in the VM configuration, but I found that if these ESXi servers are part of vCAC and you try to create reservations, the network adapters don’t show up 😦 I recommend restarting vCenter Server, as it fixes the issue above (do not tell me you have vCenter Heartbeat installed!).

Wrap-Up

Hope the real-life migration scenario described above helps. If you want other examples, they can be found in the following:

More than welcome if you’ve got any questions or clarifications.