After upgrading an ESXi host to 6.5 U1, the host may experience a PSOD (purple screen of death). This post helps verify the symptoms and summarizes the cause and the solution for the issue.
Symptoms:
In the vmkernel logs, the following messages appear prior to the PSOD:
04:40:54.761Z cpu21:14392120)WARNING: UserMem: 14034: vmx-vthread-6: vpn 0xa00bc795 status: "Invalid address" (bad0026)
04:40:54.763Z cpu21:14392120)WARNING: UserMem: 14034: vmx-vthread-6: vpn 0xa00bc7b5 status: "Invalid address" (bad0026)
04:40:54.764Z cpu21:14392120)WARNING: UserMem: 14034: vmx-vthread-6: vpn 0xa00bc7d5 status: "Invalid address" (bad0026)
04:40:54.765Z cpu21:14392120)WARNING: UserMem: 14034: vmx-vthread-6: vpn 0xa00bc7f5 status: "Invalid address" (bad0026)
04:40:54.766Z cpu21:14392120)WARNING: UserMem: 14034: vmx-vthread-6: vpn 0xa00bc815 status: "Invalid address" (bad0026)
04:40:54.768Z cpu21:14392120)WARNING: UserMem: 14034: vmx-vthread-6: vpn 0xa00bc835 status: "Invalid address" (bad0026)
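
To check a collected vmkernel log for this symptom, a small script can search for the same warning. The following is a minimal sketch: the regular expression is derived only from the sample lines above, and the function name `find_usermem_warnings` is hypothetical. Other builds may format the message differently.

```python
import re

# Pattern derived from the sample lines in this post (an assumption, not an official format).
USERMEM_RE = re.compile(
    r'WARNING: UserMem: \d+: (?P<thread>\S+): vpn (?P<vpn>0x[0-9a-fA-F]+) '
    r'status: "Invalid address"'
)

def find_usermem_warnings(log_text):
    """Return a (thread, vpn) pair for each matching UserMem warning."""
    return [(m.group("thread"), m.group("vpn")) for m in USERMEM_RE.finditer(log_text)]

# Two of the lines quoted above, used as sample input.
sample = (
    '04:40:54.761Z cpu21:14392120)WARNING: UserMem: 14034: vmx-vthread-6: '
    'vpn 0xa00bc795 status: "Invalid address" (bad0026)\n'
    '04:40:54.764Z cpu21:14392120)WARNING: UserMem: 14034: vmx-vthread-6: '
    'vpn 0xa00bc7d5 status: "Invalid address" (bad0026)\n'
)

for thread, vpn in find_usermem_warnings(sample):
    print(thread, vpn)
```

A burst of these warnings from the same vmx thread shortly before the crash matches the pattern described here. A single isolated hit is weaker evidence.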
PSOD stack:
VmMemCowPShareRemoveWithCheck@vmkernel#nover+0x10f stack: 0x418011d0
VmMemCow_CopyPageWithMPN@vmkernel#nover+0x19f stack: 0x3fffffffff,0
VmMemPf@vmkernel#nover+0x133 stack: 0x449fd475255d69,
PShareHashTableWalkMatchMPN@vmkernel#nover+0x2d stack: 0x3110dc
PShare_RemoveHint@vmkernel#nover+0xb3 stack: 0x4391ccaa7000
VmMemCow_PShareRemoveHint@vmkernel#nover+0x72 stack: 0x4391ccc1bef8
VmMemCowPFrameRemoveHint@vmkernel#nover+0xc6 stack: 0x304
VmMemCowPShareFn@vmkernel#nover+0x5c3 stack: 0x6422bec
VmAssistantProcessTasks@vmkernel#nover+0x144 stack: 0x0
CpuSched_StartWorld@vmkernel#nover+0x99 stack: 0x0
Multiple PCPUs are reported as possibly locked up:
10:49:52.654Z cpu4:155673)WARNING: Heartbeat: 794: PCPU 51 didn't have a heartbeat for 9 seconds; *may* be locked up.
10:49:52.654Z cpu52:172048)WARNING: Heartbeat: 794: PCPU 27 didn't have a heartbeat for 12 seconds; *may* be locked up.
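
The heartbeat warnings can be summarized per PCPU in the same way. This is a rough sketch that assumes the message wording shown in the two lines above; the helper name `stalled_pcpus` is hypothetical.

```python
import re

# Wording taken from the sample heartbeat lines in this post (an assumption).
HEARTBEAT_RE = re.compile(
    r"WARNING: Heartbeat: \d+: PCPU (?P<pcpu>\d+) didn't have a heartbeat "
    r"for (?P<secs>\d+) seconds"
)

def stalled_pcpus(log_text):
    """Map each reported PCPU to the longest stall (in seconds) seen for it."""
    worst = {}
    for m in HEARTBEAT_RE.finditer(log_text):
        pcpu, secs = int(m.group("pcpu")), int(m.group("secs"))
        worst[pcpu] = max(secs, worst.get(pcpu, 0))
    return worst

# The two lines quoted above, used as sample input.
sample = (
    "10:49:52.654Z cpu4:155673)WARNING: Heartbeat: 794: PCPU 51 didn't have "
    "a heartbeat for 9 seconds; *may* be locked up.\n"
    "10:49:52.654Z cpu52:172048)WARNING: Heartbeat: 794: PCPU 27 didn't have "
    "a heartbeat for 12 seconds; *may* be locked\n"
)

print(stalled_pcpus(sample))
```

Several distinct PCPUs appearing in the result within a short time window is what "multiple PCPUs" means in this context.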
Cause: The PSOD is due to memory corruption that occurs when an MPN (machine page number) mapped to the VMX process is incorrectly updated.
Resolution: The issue is fixed in ESXi 6.5 P02 (build 7312210), released on 2017-12-19.
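
To confirm whether a host already runs a build at or above the fixed one, the build number can be compared against the value quoted in the resolution. The sketch below assumes the version string has the `build-<number>` form printed by `vmware -v` on the ESXi shell. The sample string and the function names are illustrative, and the comparison is only meaningful within the 6.5 release line.

```python
import re

FIXED_BUILD = 7312210  # build number quoted in the resolution above

def parse_build(version_line):
    """Extract the integer build number from a 'build-<number>' string."""
    m = re.search(r"build-(\d+)", version_line)
    if m is None:
        raise ValueError("no build number found in: %r" % version_line)
    return int(m.group(1))

def has_fix(version_line, fixed_build=FIXED_BUILD):
    """True if the reported build is at or above the fixed build."""
    return parse_build(version_line) >= fixed_build

# Illustrative sample string, not output captured from a real host.
sample_version = "VMware ESXi 6.5.0 build-5969303"
print(parse_build(sample_version), has_fix(sample_version))
```

If `has_fix` returns False for a 6.5 host, patching to the build named above (or later) is the next step.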