After upgrading an ESXi host to 6.5 U1, the host may fail with a purple diagnostic screen (PSOD).
The PSOD backtrace looks similar to:
2017-09-16T15:34:30.908Z cpu6:65645)@BlueScreen: #PF Exception 14 in world 65645:HELPER_UPLIN IP 0x41802c496258 addr 0x0
PTEs:0x292379a027;0x2efe54c027;0xbfffffffff001;
2017-09-16T15:34:30.908Z cpu6:65645)Code start: 0x41802c200000 VMK uptime: 4:02:26:10.151
2017-09-16T15:34:30.908Z cpu6:65645)0x4390c369bd00:[0x41802c496258]UplinkTreePackQueueFilters@vmkernel#nover+0x188 stack: 0xe15427000
2017-09-16T15:34:30.909Z cpu6:65645)0x4390c369bd90:[0x41802c49e142]UplinkLB_LoadBalanceCB@vmkernel#nover+0x1e42 stack: 0x1
2017-09-16T15:34:30.909Z cpu6:65645)0x4390c369bf20:[0x41802c4916f2]UplinkAsyncProcessCallsHelperCB@vmkernel#nover+0x116 stack: 0x43048761eac0
2017-09-16T15:34:30.910Z cpu6:65645)0x4390c369bf50:[0x41802c2c9e0d]helpFunc@vmkernel#nover+0x3c5 stack: 0x4300b9b2a050
2017-09-16T15:34:30.910Z cpu6:65645)0x4390c369bfe0:[0x41802c4c91b5]CpuSched_StartWorld@vmkernel#nover+0x99 stack: 0x0
2017-09-16T15:34:30.913Z cpu6:65645)base fs=0x0 gs=0x418041800000 Kgs=0x0
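To compare a PSOD from another host against this backtrace, a small script can check for the page-fault exception and the two topmost frames. The following is a minimal sketch, not a VMware tool: the function name matches_signature, the choice of which frames count as the signature, and the shortened sample text are assumptions made for illustration. Addresses and offsets will differ between hosts, so only the symbol names are compared.

```python
import re

# Symbol names taken from the top two frames of the backtrace in this article.
# Treating these two frames as "the signature" is an assumption for illustration.
SIGNATURE_FRAMES = ("UplinkTreePackQueueFilters", "UplinkLB_LoadBalanceCB")

# Captures the symbol in a frame such as
# "0x4390c369bd00:[0x41802c496258]UplinkTreePackQueueFilters@vmkernel#nover+0x188"
FRAME_RE = re.compile(r"\[0x[0-9a-fA-F]+\](\w+)@")


def matches_signature(log_text):
    """Return True if log_text shows a #PF Exception 14 with both signature frames."""
    is_page_fault = "#PF Exception 14" in log_text
    frames = FRAME_RE.findall(log_text)
    return is_page_fault and all(name in frames for name in SIGNATURE_FRAMES)


# Shortened sample built from the backtrace above (timestamps omitted).
sample = (
    "cpu6:65645)@BlueScreen: #PF Exception 14 in world 65645:HELPER_UPLIN "
    "IP 0x41802c496258 addr 0x0\n"
    "cpu6:65645)0x4390c369bd00:[0x41802c496258]"
    "UplinkTreePackQueueFilters@vmkernel#nover+0x188 stack: 0xe15427000\n"
    "cpu6:65645)0x4390c369bd90:[0x41802c49e142]"
    "UplinkLB_LoadBalanceCB@vmkernel#nover+0x1e42 stack: 0x1\n"
)

print(matches_signature(sample))
```

A match only suggests that the host may be hitting the same issue. Confirm the ESXi version and NIC type against the contributing factors before acting on it.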
Contributing factors:
- The host has been upgraded to 6.5 U1.
- The host has NICs of 10 Gb or higher capacity, such as elxnet FlexFabric 20Gb or FlexFabric 10Gb adapters. The issue is not restricted to Emulex adapters.
Cause:
The NetQueue commit phase stops abruptly when the hardware activation of an Rx queue fails.
Workaround:
To work around this issue, downgrade the ESXi host to 6.0 U2, which contains the fix for this issue.
Resolution:
This issue is resolved in VMware ESXi 6.5 P02 (ESXi-6.5.0-20171204001-standard).
Reference: