Introduction
Getting Started
Let’s get started with a simple script:
$report = foreach ($vm in (Get-VM -Location (Get-Cluster -Name cluster1))) {
    $vm | Select @{N="ESXi";E={$vm.VMHost.Name}}, Name, NumCPU, MemoryGB,
        @{N="Datastore";E={ [string]::Join(",", (Get-Datastore -VM $vm | %{$_.Name})) }}
}
Example output:
ESXi,VM,CPU,Memory,Datastore
ESXi_test1,test1,1,1,datastore1,datastore2,datastore3
ESXi_test1,test2,1,1,datastore1
ESXi_test2,test3,1,4,datastore4,datastore5
For the test, I selected a cluster with 300 virtual machines and measured how long the script above takes to run: 5 minutes. That might look acceptable, but what about 2,000 to 3,000 virtual machines? At the same rate of about one second per virtual machine, the script would run for roughly 35 to 50 minutes, which is very inefficient.
How could we improve the performance?
Trick
Many people assume that retrieving the full list of objects takes longer than applying a filter, for example Get-Datastore versus Get-Datastore -VM $vm.
Is this the case? Let’s have a look:
- Get-Datastore => retrieved 565 datastores in 2.2 seconds
- Get-Datastore -VM "VM" => took 8 seconds
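Surprising result, right? Get-Datastore without a filter is approximately 4 times faster, even though it returned 565 datastores. The script above runs Get-Datastore 300 times, once per virtual machine. At a full 8 seconds per call that would be 300 * 8 = 2400 seconds, or 40 minutes. The measured 5 minutes works out to about 1 second per virtual machine, so later calls appear to be cheaper than the single call timed above. Either way, the repeated Get-Datastore calls are the obvious cost to remove.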
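Timings like the ones above can be collected with the built-in Measure-Command cmdlet. The sketch below wraps it in a small helper; the name Measure-Seconds is made up for illustration, and a Start-Sleep call stands in for the real workload so the snippet runs without a vCenter connection.

```powershell
# Hypothetical helper (not part of PowerCLI): run a script block
# and return the elapsed time in seconds, rounded to one decimal.
function Measure-Seconds {
    param([Parameter(Mandatory)][scriptblock]$Script)
    # Measure-Command discards the block's output and returns a TimeSpan.
    [math]::Round((Measure-Command -Expression $Script).TotalSeconds, 1)
}

# Stand-in workload so the sketch runs anywhere.
$elapsed = Measure-Seconds { Start-Sleep -Milliseconds 200 }
"Stand-in call took $elapsed seconds"
```

In a connected PowerCLI session, Measure-Seconds { Get-Datastore } and Measure-Seconds { Get-Datastore -VM "VM" } would give the two numbers to compare.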
To improve the performance, one approach I came up with was to:
- Save the Get-Datastore output in a variable, e.g. $datastore_list = Get-Datastore
- Use $datastore_list to find which datastores each virtual machine uses
This way, instead of executing Get-Datastore 300 times, the script runs Get-Datastore once, saves the result in a variable and looks up datastore information in that variable. This looks much more efficient. However, how could we achieve this?
If you look closely at the properties of the objects returned by Get-VM (run Get-VM | Select * to view all properties; this will be discussed in depth in the next post of this series), there is a property called "DatastoreIdList". Each datastore has a unique datastore ID, and each virtual machine object lists the IDs of its datastores in this property. This means we could:
- Run a foreach loop against DatastoreIdList
- If the datastore ID equals the ID of a datastore in the $datastore_list variable, output that datastore
Translating the above into a PowerCLI command:
$datastore_list = Get-Datastore
(Get-VM -Name "VM").DatastoreIdList | Foreach-Object { $datastore_id = $_; $datastore_list | where {$_.Id -eq $datastore_id} }
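When comparing IDs like this, the choice of operator matters: -match treats its right-hand side as a regular expression, so an ID that is a prefix of another ID matches both, while -eq compares the whole string. The sketch below illustrates the difference with mock objects; the IDs and names are made up and only imitate the shape of real datastore objects.

```powershell
# Made-up IDs: the first is a prefix of the second.
$datastore_list = @(
    [pscustomobject]@{ Id = 'Datastore-datastore-10';  Name = 'datastore1' }
    [pscustomobject]@{ Id = 'Datastore-datastore-101'; Name = 'datastore2' }
)
$datastore_id = 'Datastore-datastore-10'

# -match is a regex test, so the longer ID matches as well.
$regex_hits = @($datastore_list | Where-Object { $_.Id -match $datastore_id })

# -eq compares the whole string and returns only the intended datastore.
$exact_hits = @($datastore_list | Where-Object { $_.Id -eq $datastore_id })

"match: $($regex_hits.Count), eq: $($exact_hits.Count)"
```

With this mock data the regex comparison returns two datastores and the exact comparison returns one, which is why an exact comparison is the safer choice for ID lookups.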
Converting the script from the Getting Started section:
$datastore_list = Get-Datastore
$report = foreach ($vm in (Get-VM -Location (Get-Cluster -Name cluster1))) {
    $vm | Select @{N="ESXi";E={$_.VMHost.Name}}, Name, NumCPU, MemoryGB,
        @{N="Datastore";E={ [string]::Join(",", ( $_.DatastoreIdList | Foreach-Object { $datastore_id = $_; $datastore_list | where {$_.Id -eq $datastore_id} | %{$_.Name} } )) }}
}
The script above took 27 seconds and produced the same output. That is roughly 11 times faster than the original 5 minutes.
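The lookup in the script still scans $datastore_list once for every datastore ID. A possible further refinement, which was not measured for this post, is to index the list by ID in a hash table so that each lookup is a direct key access. The sketch below uses mock objects with made-up IDs and names in place of real Get-Datastore and Get-VM output, and $datastore_by_id is a name introduced here for illustration.

```powershell
# Mock data standing in for Get-Datastore / Get-VM output (values are made up).
$datastore_list = @(
    [pscustomobject]@{ Id = 'Datastore-datastore-10'; Name = 'datastore1' }
    [pscustomobject]@{ Id = 'Datastore-datastore-11'; Name = 'datastore2' }
    [pscustomobject]@{ Id = 'Datastore-datastore-12'; Name = 'datastore3' }
)
$vm = [pscustomobject]@{
    Name            = 'test1'
    DatastoreIdList = @('Datastore-datastore-10', 'Datastore-datastore-12')
}

# Build the index once: datastore Id -> datastore Name.
$datastore_by_id = @{}
foreach ($ds in $datastore_list) { $datastore_by_id[$ds.Id] = $ds.Name }

# Resolve each ID with a key lookup instead of a Where-Object scan.
$names  = $vm.DatastoreIdList | ForEach-Object { $datastore_by_id[$_] }
$joined = $names -join ','
$joined
```

In the real script, the index would be built right after $datastore_list = Get-Datastore, and the Datastore column expression would read from $datastore_by_id. Whether this makes a noticeable difference with a few hundred datastores is an open question; it mainly helps as the list grows.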
Wrap-Up
This post discussed a few ways to improve the performance of PowerCLI scripts:
- Instead of applying a filter on every call, retrieve the whole output once and save it
- Avoid executing the same command over and over
- Take a closer look at object properties to avoid running an extra command
I hope this helped. In the next post of this series, I will take a deep dive into properties.