StrikedOut Posted August 27, 2019

Hi All.

One of my sites is having a very strange power issue that I can't figure out. We have 2 servers: an older Dell PowerEdge 2900 due for retirement and an HP DL385 Gen8.

Recently, the HP server became unresponsive, so I asked a user to take a look and tell me what they could see. The screen showed no input signal, so after confirming the server was connected to the monitor correctly, we did a hard reset. This brought the server back up and everything became available again. I logged in to the Dell server and it, along with the HP server, showed the unexpected power-off message, which indicated to me that both lost power at the same time. I looked at the logs and found nothing to indicate any issues.

I suspected the batteries in the UPS, an APC Smart UPS 2200, were faulty (they hadn’t been replaced for some time), so I ordered replacements and went on site the next day for the install. Before I had a chance to replace them, I was working on the server when, for a split second, all servers lost power and went into the same loop as the day before. I checked whether this was isolated within the building, and it was just the servers that lost power, both at the same time. The server room lights didn’t flicker. I can’t tell you if the switches restarted, but I am sure the router didn’t (though I may be wrong).

So, I replaced the batteries and simulated a power cut by pulling the plug on the UPS. All fine. I also changed the power socket, as the original was in a floor box.

A couple of days later, I got the call that it had happened again. I logged onto the server and again saw the unexpected shutdown message, so now I suspected the UPS was faulty. I ordered a replacement for next-day delivery and was on site in the morning to install it.

This is what I then did and tested:
- Installed a new APC UPS and a new APC PDU, and plugged one PSU from each server into the PDU.
- Plugged the other power supply on both servers into a power lead straight into a socket, so there are 2 power feeds from different sources.
- During the transition, tested both power supplies in both servers; all 4 are fine.
- Simulated a power cut from both sockets. All fine.
- Ran a full virus scan with Panda Adaptive Defence on both servers. All clear.
- Checked the logs; it seems that because it is a power failure, no log entry is created to indicate any issue.

So, my thoughts are that it isn’t the UPS, as the same thing happens on a new unit, plus I have an independent feed to the second PSU on both servers. The leads are fine, as surely a bad lead would not cause both servers to restart? Mains power supply isn’t the issue, as the UPS should take care of this and a simulated power cut caused no issues. An application would not cause a power cut to both servers unless it is a virus, but a straight power cut? If it was an application, I would expect a graceful restart/shutdown.

At this point, I am out of ideas, so I hope you have some for me.
+BudMan MVC Posted August 27, 2019

What exact error are you seeing in the log? Just an event ID 6008 saying the previous shutdown was unexpected?
StrikedOut Posted August 27, 2019 Author

Correct, just 6008.
spikey_richie Posted August 27, 2019

You got minidumps turned on? If so, crack one open with WinDbg.
StrikedOut Posted August 27, 2019 Author

Minidumps are on, but none are being generated. I suspect that is due to the manner in which it is losing power.
spikey_richie Posted August 27, 2019

Eep, that's a pretty catastrophic failure then. You got another PSU you can try?
StrikedOut Posted August 27, 2019 Author

Both servers have 2 PSUs, and each seemed to pick up the load when I tested by pulling the other's plug from the socket. Plus, this affects both servers, and the one time I was in the room, both lost power at the same time. Never seen anything like it, and damned if I can figure out the common component that may be failing.