Server problems, 31/03/05



Weirdly enough, MY site, hosted at ASO in The Planet was working. :blink:


timdorr has quite a few servers across the different datacenters there. The first one, which houses two of Neowin's servers, was entirely down.

I'll post the official explanation from ThePlanet when they get it up... but looking at the routing graphs in Orbit... this was *bad*

Here's the reason, peeps:

At approximately 4:15AM CST, a pair of redundant Powerware 500KVA UPS units failed, creating a power failure in section B of our DLLSTX2 datacenter. Emergency teams were deployed within minutes and power was restored within minutes, but intermittent power outages did occur until 6:45AM CST. Powerware, JT Packard, and electricians are currently onsite with over 100 Planet technicians working to resolve the issue. We do not anticipate any further outages. A formal RFO will be released once the team debriefs. We apologize for any issues this has caused.

In layman's terms: a couple of UPSes blew up and took a section of the DC with them. Neowin's servers are in that section.

Edited by Homer

Latest statement from theplanet.com

Location: DLLSTX2

Severity: Level 5

Description: At 4:09AM CST, a power failure occurred in a redundant pair of Powerware UPS units feeding power to section B of the DLLSTX2 datacenter. The power failure was caused by a faulty fuse in UPS unit B-1. As the load transferred to UPS unit B-2, the spike in the load created an overloaded breaker and UPS unit B-2 also lost power. This resulted in a power outage to the main power distribution unit feeding section B of the datacenter floor. Emergency teams were notified and the power was restored within 20 minutes. The power continued to cycle until 6:45AM due to the faulty fuse and the inability of the redundant UPS units to remain in bypass mode. Customers may have noticed several power cycles during this time period. Powerware, JT Packard, and electricians found the problem and replaced the faulty hardware and fuses. At 6:45AM, all electrical service was restored to normal and the NOC team began to bring all servers back online. The technical staff is currently placing a console on all servers to verify server restarts. Customers with operating systems that require a file check may have experienced extended downtime during the file check. Powerware and JT Packard will continue to monitor the UPS systems for the next 24 to 48 hours. The Planet does not anticipate any further outages.
