Time: somewhere around 3:50AM this morning.
A delightful screech of house alarms woke me up, only for me to discover we still had power (???). After nearly falling out of bed grabbing my phone, I found that Northern Power Grid had indeed noticed a cut nearby. Then the WiFi went off.
The WiFi went off?!
F*ck sake – tired me at 4AM
Turns out the power had been gone for long enough that the ESXi host had shut down. Interestingly, the PSU LED was showing amber and system health was red, which means the PSU had gone into standby. (Also interesting to note: the storage server next to it didn’t flinch, still showing an uptime of 172 days.)
Poking the power button seemed to make it fire back up, including a full-speed fan test, just to guarantee everyone else was also awake.
Cue later that morning: there was still no WiFi when I woke up. Plugging in a monitor led me down the rabbit hole of "oh god, everything failed".
Not the message I want to be seeing 😱😱 pic.twitter.com/uXNyQlAN1V
— Callum Snowden (@callum_snowden_) April 2, 2018
Turns out it was trying to boot off a USB HDD rather than the proper drives. D’oh. With that unplugged, it rebooted fine, and up came ESXi shortly afterwards, followed by all the VMs. While flapping about the drives being dead, I also fixed a dislodged SAS cable, and now all 8 drives are showing happy green lights.
Lesson learnt: batteries and a UPS would be useful, even though power cuts rarely happen.