Dead SD-Card in ESXi vSAN node
Yesterday I was contacted by a customer who needed to reboot an ESXi host on an HPE server, because it needed a NAND reset to clear an error in the iLO. The customer had had some bad experiences with rebooting their ESXi/vSAN hosts, where a host could not boot from the SD-Card afterwards, so they wanted me on standby to help if that happened again.
Before they did the reboot, I required that they take an ESXi configuration backup. The host was already in maintenance mode. We used PowerCLI to create the configuration backup. VMware has a KB on how to do this: KB2042141.
Import-Module -Name VMware.PowerCLI
Connect-VIServer esxi01.domain.local
Get-VMHostFirmware -VMHost esxi01.domain.local -BackupConfiguration -DestinationPath C:\Temp
We also noted down the ESXi version and Build.
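The version and build can also be captured with PowerCLI instead of being written down by hand. This is a minimal sketch, assuming the session opened with Connect-VIServer for the backup is still connected. Version and Build are properties of the host object returned by Get-VMHost. The output file name is only an example and is not part of the original procedure.

```powershell
# Save the ESXi version and build next to the configuration backup.
# C:\Temp\esxi01-version.txt is an example path, chosen for illustration.
Get-VMHost -Name esxi01.domain.local |
    Select-Object Name, Version, Build |
    Format-List |
    Out-File -FilePath C:\Temp\esxi01-version.txt
```

Keeping this file together with the backup bundle makes it easy to pick the right ESXi installer later.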
Note: We did not remove the host from vCenter or move it out of the HA/DRS/vSAN cluster. If we had, this procedure would not work.
Afterwards we rebooted and did the NAND reset. The ESXi host started to boot, but at some point it just hung. I checked the ESXi log screen and could see a lot of errors about reading from the SD-Card. On the iLO information page the SD-Card was shown as “Not present”, so the SD-Card in the host was also defective. After opening a support case with HPE, the customer got a new SD-Card and replaced the old one, and the new card showed up in the iLO.
Then we simply reinstalled ESXi on the host with the same version and build, and configured it with a temporary password and network configuration. We then put the host into maintenance mode and did a configuration restore. In this case we used the original IP address of the host as the temporary IP address.
Note: Do not touch the disks that are part of vSAN disk groups.
Note: The restore only works on the same build version.
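Because the restore depends on a matching build, it can be worth checking the build before restoring. The following is a rough sketch, not part of the original procedure. It assumes you are already connected to the reinstalled host with Connect-VIServer, and the variable names and placeholder values are hypothetical and must be filled in.

```powershell
# Build number noted before the reboot (placeholder, fill in your own value).
$expectedBuild = '<build noted before the reboot>'

# Build of the freshly installed host, read over the direct connection to it.
$currentBuild = (Get-VMHost -Name '<temporary IP>').Build

if ($currentBuild -ne $expectedBuild) {
    Write-Warning "Build mismatch: host is on $currentBuild, backup was taken on $expectedBuild. Do not restore."
} else {
    Write-Output "Build $currentBuild matches the backup."
}
```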
Import-Module -Name VMware.PowerCLI
Connect-VIServer <temporary IP>
Set-VMHost -VMHost <temporary IP> -State "Maintenance"
Set-VMHostFirmware -VMHost <temporary IP> -Restore -Force -SourcePath c:\temp\<backup file name>
Afterwards the host booted and came up. In vCenter we just connected the host again, which is necessary because the vCenter agent (vpxa) needs to be installed. We then took the host out of maintenance mode, vSAN started a resync of all the vSAN objects, and afterwards it was happy again. The customer was also happy that the operation went so smoothly.
This was done on ESXi 6.7 Update 3. I have not tested this with other versions.
Note: I don't know if this is 100% supported by VMware. If you do this, it is at your own risk.
Note: This will probably also work with ESXi installed on USB and disks.
Note: Do not use a configuration backup from another server.