Power On Virtual Machine Fails – VMFS Heapsize

Introduction

A few days ago, an incident was assigned to me: an end user could not power on one of their virtual machines. I tried to power it on myself and an error message popped up, but it was not very helpful. All it said was “A general system error occurred: The virtual machine could not start”.

Symptom

As described in the introduction, the error message did not say what the actual problem was. A screenshot is attached below.

[Screenshot of the error dialog: Screen Shot 2014-06-30 at 11.29.12 am]

The next thing I looked at was vmware.log, where I found a problem with creating the swap file.

vmware.log
2014-06-29T22:27:18.702Z| vmx| CreateVM: Swap: generating normal swap file name.
2014-06-29T22:27:18.704Z| vmx| Swap file path: '/vmfs/volumes/521439a7-bd74efb3-d915-d4ae52a522bf/aaa/aaa-aa1fa36d.vswp'
2014-06-29T22:27:18.704Z| vmx| VMXVmdb_GetDigestDiskCount: numDigestDisks = 0
2014-06-29T22:27:18.705Z| vmx| Msg_Post: Error
2014-06-29T22:27:18.705Z| vmx| [msg.dictionary.writefile.truncate] An error occurred while truncating configuration file "/vmfs/volumes/521439a7-bd74efb3-d915-d4ae52a522bf/aaa/aaa.vmx":Cannot allocate memory.
2014-06-29T22:27:18.705Z| vmx| [vob.heap.grow.max.reached] Heap vmfs3 already at its maximum size of 83887056. Cannot expand.
2014-06-29T22:27:18.705Z| vmx| [vob.heap.grow.max.reached] Heap vmfs3 already at its maximum size of 83887056. Cannot expand.
2014-06-29T22:27:18.705Z| vmx| [vob.swap.poweron.createfailure.status] Failed to create swap file '/vmfs/volumes/521439a7-bd74efb3-d915-d4ae52a522bf/shrwebprd02/shrwebprd02-aa1fa36d.vswp' : Out of memory
2014-06-29T22:27:18.705Z| vmx| [msg.vmmonVMK.creatVMFailed] Could not power on VM : Out of
2014-06-29T22:27:18.705Z| vmx| [msg.monitorLoop.createVMFailed.vmk] Failed to power on VM.
2014-06-29T22:27:18.705Z| vmx| ----------------------------------------
2014-06-29T22:27:18.838Z| vmx| Module MonitorLoop power on failed.
2014-06-29T22:27:18.838Z| vmx| VMX_PowerOn: ModuleTable_PowerOn = 0
2014-06-29T22:27:18.840Z| vmx| Vix: [28466190 mainDispatch.c:4084]: VMAutomation_ReportPowerOpFinished: statevar=1, newAppState=1873, success=1 additionalError=0
2014-06-29T22:27:18.840Z| vmx| Vix: [28466190 mainDispatch.c:4103]: VMAutomation: Ignoring ReportPowerOpFinished because the VMX is shutting down.
2014-06-29T22:27:18.840Z| vmx| Vix: [28466190 mainDispatch.c:4084]: VMAutomation_ReportPowerOpFinished: statevar=0, newAppState=1870, success=1 additionalError=0
2014-06-29T22:27:18.840Z| vmx| Vix: [28466190 mainDispatch.c:4103]: VMAutomation: Ignoring ReportPowerOpFinished because the VMX is shutting down.
2014-06-29T22:27:18.840Z| vmx| Transitioned vmx/execState/val to poweredOff
2014-06-29T22:27:18.842Z| vmx| Vix: [28466190 mainDispatch.c:4084]: VMAutomation_ReportPowerOpFinished: statevar=0, newAppState=1870, success=0 additionalError=0
2014-06-29T22:27:18.842Z| vmx| Vix: [28466190 mainDispatch.c:4103]: VMAutomation: Ignoring ReportPowerOpFinished because the VMX is shutting down.
2014-06-29T22:27:18.842Z| vmx| VMX idle exit
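
In a longer vmware.log the relevant entries are easy to miss, so it can help to filter for them. Below is a minimal Python sketch of that idea, not something used in the original troubleshooting. The function name `find_heap_errors` and the two-line sample are illustrative; the message tags it searches for are simply the ones visible in the log above.

```python
import re

# Message tags copied from the vmware.log excerpt above.
TAGS = ("vob.heap.grow.max.reached", "vob.swap.poweron.createfailure.status")

# Matches "<timestamp>| vmx| [<tag>] <message>", the layout seen in the excerpt.
LINE_RE = re.compile(r"^(?P<ts>\S+)\| vmx\| \[(?P<tag>[^\]]+)\] (?P<msg>.*)$")


def find_heap_errors(lines):
    """Return (timestamp, tag, message) for every line carrying one of TAGS."""
    hits = []
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match and match.group("tag") in TAGS:
            hits.append((match.group("ts"), match.group("tag"), match.group("msg")))
    return hits


# Two lines taken from the excerpt: one unrelated, one heap error.
sample = [
    "2014-06-29T22:27:18.704Z| vmx| VMXVmdb_GetDigestDiskCount: numDigestDisks = 0",
    "2014-06-29T22:27:18.705Z| vmx| [vob.heap.grow.max.reached] "
    "Heap vmfs3 already at its maximum size of 83887056. Cannot expand.",
]

for ts, tag, msg in find_heap_errors(sample):
    print(ts, tag, msg)
```

To run it against a real file, pass an open file handle instead of `sample`, for example `find_heap_errors(open("vmware.log"))`, where the path is a placeholder for wherever the log was copied to.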

To summarise the log above:

  1. The ESXi server tried to create a swap file so that the virtual machine could power on
  2. Creating the swap file failed because the vmfs3 heap was already at its maximum size
  3. The virtual machine failed to power on

To understand this issue, we need to look at the VMFS heap size.

VMFS HeapSize

The ESXi servers in this environment run 5.0 Update 1. As per the KB article, with the default VMFS heap size of 80MB, a single ESXi host can have about 8TB of active (open) VMDK storage. This matches the maximum of 83887056 bytes, roughly 80MB, reported in the log. It means that once the virtual machines on a single ESXi server have more than 8TB of active VMDKs, that server will refuse to power on any more virtual machines.
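
As a quick sanity check, the heap maximum from the log can be converted to megabytes and compared with the 80MB default. This is only a small arithmetic sketch in Python; the variable names are illustrative.

```python
# Maximum vmfs3 heap size reported in vmware.log, in bytes.
heap_max_bytes = 83887056

# Convert bytes to MiB (1 MiB = 1024 * 1024 bytes).
heap_max_mib = heap_max_bytes / (1024 * 1024)

print(f"vmfs3 heap maximum: {heap_max_mib:.2f} MiB")
```

The result is a hair over 80, which is consistent with the host still running the default heap size.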

To check how much VMDK capacity each ESXi server is hosting, I ran a simple PowerCLI script. The script and its output (sums in GB) are shown below:

# For each host in the cluster, sum the capacity (GB) of every virtual disk on its VMs
foreach ($esxi in (Get-Cluster -Name "Name of Cluster" | Get-VMHost | Sort-Object Name)) {
  $esxi | Select-Object Name, @{N="Sum";E={ ($_ | Get-VM | Get-HardDisk | Measure-Object -Property CapacityGB -Sum).Sum }}
}

Name,Sum
ESXiA,3335.7681760788
ESXiB,3035.02425670624
ESXiC,3942.765625
ESXiD,4861
ESXiE,4538.28125
ESXiF,16272.9050750732

ESXiA to ESXiE look fine, but ESXiF is hosting approximately 16TB of VMDKs. I checked the history of the virtual machine to see where it had been running previously and yes, it was on ESXiF.

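One thing to note is that this doesn’t necessarily mean ESXiF has 16TB of active VMDKs. The script sums the capacity of every disk on the host, including those of powered-off virtual machines, so the active figure could be much lower than 16TB. However, there is a high chance that the host exceeded 8TB, and this specific virtual machine was the one that hit the limit.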
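
The comparison against the 8TB default can also be done mechanically. The following Python sketch parses the script output shown earlier and lists the hosts whose total exceeds the limit. The function name `hosts_over_limit` is made up for this example, and treating 1TB as 1024GB is an assumption. As noted above, it flags candidates only, since the totals are not a measure of active VMDKs.

```python
import csv
import io

# Output of the PowerCLI script (total VMDK capacity per host, in GB).
output = """Name,Sum
ESXiA,3335.7681760788
ESXiB,3035.02425670624
ESXiC,3942.765625
ESXiD,4861
ESXiE,4538.28125
ESXiF,16272.9050750732
"""

# 8TB default limit from the KB article; 1TB = 1024GB is an assumption here.
LIMIT_GB = 8 * 1024


def hosts_over_limit(text, limit_gb=LIMIT_GB):
    """Return {host: total_gb} for hosts whose total exceeds limit_gb."""
    rows = csv.DictReader(io.StringIO(text))
    totals = {row["Name"]: float(row["Sum"]) for row in rows}
    return {name: gb for name, gb in totals.items() if gb > limit_gb}


for name, gb in hosts_over_limit(output).items():
    print(f"{name}: {gb / 1024:.1f} TB of VMDKs, above the {LIMIT_GB // 1024} TB default")
```

With the figures from this cluster, only ESXiF is reported.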

Solution

The solution is quite simple: I vMotioned the virtual machine to ESXiB, which has the lowest total VMDK size, and it was happy to power on.

Ultimately, you will want to upgrade the ESXi servers to 5.5, which improves the VMFS heap handling. There is an excellent article by Cormac Hogan explaining the enhancement.

Hope this helps.
