Microsoft blames heat for Outlook.com, Hotmail, SkyDrive outage

You may recall that on March 12 a number of Microsoft's online services went down unexpectedly, which resulted in nearly 16 hours of outage for services such as Outlook.com, Hotmail, Calendar and SkyDrive. Today Microsoft has offered an explanation as to why the services went down, stating that the data center responsible for keeping these things running encountered a "rapid and substantial temperature spike".

Basically, the data center overheated due to a firmware update failure, according to Microsoft Vice President Arthur de Haan. Human intervention, alongside traditional software fixes, was required to fix the issues, adding significant time to the restoration process; everything was fully restored by 5:43 AM PDT on March 13.

Yesterday wasn't green across the board, unlike today.

While Microsoft's online services have traditionally had exceptional uptime, this is the second disruption to the service in recent times, with the last outage occurring on February 1st.

Source: Outlook Blog via BBC


27 Comments


we had a temperature spike one time with one of our HP DL380 G4s... not sure why, but most of the fans failed at the same time, the temp kill switch in it didn't go off, and the Northbridge got so hot it melted the solder holding the clip connector down. The connector flew up, the heat sink dislodged, and the server basically killed itself....

this is even with redundant cooling systems keeping our server rooms pretty cool

neufuse said,
we had a temperature spike one time with one of our HP DL380 G4s... not sure why, but most of the fans failed at the same time, the temp kill switch in it didn't go off, and the Northbridge got so hot it melted the solder holding the clip connector down. The connector flew up, the heat sink dislodged, and the server basically killed itself....

this is even with redundant cooling systems keeping our server rooms pretty cool

That's one crazy IT horror story, gulp!!

alwaysonacoffebreak said,
Strangely, I've heard stories about HP servers doing that before. Not on the same model you've got, but still an HP one. Thankfully I've been lucky so far.

that was the only HP server we've ever had do that... thankfully we had spare similar hardware lying around that we could swap drives with and keep going... but it was definitely a WTF moment when we realized what happened

alwaysonacoffebreak said,
Strangely, I've heard stories about HP servers doing that before. Not on the same model you've got, but still an HP one. Thankfully I've been lucky so far.
Based on my experiences with HP, they've never been good with managing heat.

I've had problems with laptops and some PCs. I even remember coming home one day to find my HP CRT monitor going up in smoke.

I reset my password thinking this was the cause (fearing a hack), and since then I have had an issue adding my MS account to my phone and also can't access any API-based service (all web access is fine).

I get a CAPTCHA pop-up when trying to add my MS account on my phone; this CAPTCHA does NOT WORK (over 40 attempts).

Same when trying to load the SkyDrive desktop app or OneNote (notebooks stored on SkyDrive).

I can log in to my W8 laptop fine with the new password, though, and, like I said, all other services.

Very, very frustrating.

so, why did it encounter a "rapid and substantial temperature spike"?

bad choice of hardware?
is the hardware now in its planned obsolescence stage?

Server farms like this need a ton of air conditioning to keep them stable, and air flow is a carefully engineered thing. If something affected the A/C even for a short period, the recovery time on ambient temperature and the heat build-up close to devices could easily lead to hardware downtime.

Brony said,
"While Microsoft's online services have traditionally had exceptional uptime"

yeah, sure!


That's why "traditionally". They used to be quite stable, but in the past six months there were a couple of severe downtime events. Although, I have to say, the event described in this article didn't affect me in any way, and I rely solely on SkyDrive and Outlook.

billyea said,
Windows Update has incredible uptime!

true, but apparently Windows Update is controlled by another company, maybe Akamai?

Brony said,
"While Microsoft's online services have traditionally had exceptional uptime"

yeah, sure!


Go ahead and prove them wrong.

Luke Baldwin said,
I didn't even notice and I use Outlook.com and SkyDrive all the time

Same, I've been filling in job applications a lot over the last couple of days and thus rely on my mail non-stop. Not a single issue whatsoever. However, I am using my own domain with Live Domains.

Good one, Microsoft. I don't recall a datacentre yet that ever required a "firmware" upgrade.

You do know what a datacentre is, right?