Office 365 Exchange/OWA down for 4+ hours


IME, it's primarily down to internet connectivity and not the provider as much. To the enterprise, it doesn't matter either way.

Most on-premises shops running 24/7 operations don't have 4-hour outages for email or phone systems (Lync). The threat of losing one's job prevents it. Four hours in a year isn't much, but a single 4-hour outage is more than enough to be catastrophic, depending on the organization.

The fact that the organization is not in control, and can't simply make a phone call and say "get it back up in an hour or you'll be looking for a job tomorrow," is just not enough control for most whose livelihood relies on uptime. Being down one hour every quarter may be better than a single 4-hour outage in a year, depending on the effect it has on your business.

The cloud definitely has value, but it is not a blanket solution. Much thought has to go into whether moving to the cloud is viable for "your" organization.

 

This is why I have ZERO desire to work in a large IT department.


I definitely read the article. Even after reading what you quoted, I don't get why engineers had to do anything and why it took 4-5 hours or whatever. Load balancing, failover: these services were created to be automatic. Surely Microsoft has the resources and money. So I'm curious wth really happened!
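
To make the "it should just fail over" expectation concrete, here's a minimal sketch of the kind of automated health-check/failover loop people assume is running behind a service like this. Every endpoint name, threshold, and interval is hypothetical; this is not Microsoft's actual setup, just an illustration of the idea.

```python
import time
import urllib.request

# Hypothetical endpoints and tuning values -- illustration only.
PRIMARY = "https://mail-primary.example.com/healthz"
STANDBY = "https://mail-standby.example.com/healthz"
FAIL_THRESHOLD = 3    # consecutive failed probes before declaring the primary dead
PROBE_INTERVAL = 10   # seconds between probes

def healthy(url: str, timeout: int = 5) -> bool:
    """True if the endpoint answers its health check with HTTP 200."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except OSError:  # covers connection refused, timeouts, DNS failures, HTTP errors
        return False

def monitor() -> None:
    failures = 0
    while True:
        if healthy(PRIMARY):
            failures = 0
        else:
            failures += 1
            if failures >= FAIL_THRESHOLD:
                # A real system would repoint DNS or a load balancer at the
                # standby here, rather than just printing.
                print("Primary unhealthy; failing over to", STANDBY)
                return
        time.sleep(PROBE_INTERVAL)
```

The loop itself is trivial; the hard part is trusting it to act on a large, stateful service like Exchange, where a bad automatic failover can split or corrupt mail flow. That gap between "detect" and "safely act" is usually where the engineers come back in.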

 

Sometimes things go bad; I know of cases where things that shouldn't happen still do.


Sometimes things go bad; I know of cases where things that shouldn't happen still do.

I agree, but Tier 3 data centers hold up pretty well.


This is why I have ZERO desire to work in a large IT department.

I'm not so sure it's the actual size of the IT department so much as the size of the organization and the criticality of its services.


I definitely read the article. Even after reading what you quoted, I don't get why engineers had to do anything and why it took 4-5 hours or whatever. Load balancing, failover: these services were created to be automatic. Surely Microsoft has the resources and money. So I'm curious wth really happened!

Sounds familiar. I work in healthcare, and we have redundant servers in two data centers (in close proximity, though). While the technical people are trained and ready to fail servers over to the other data center, it's the managers who lag; they hate making decisions like that. If the problem can be solved in under 2 hours, we wait it out, which just kills me, knowing that redundant hardware could be used. Maybe something like that happened, though at a big company like Microsoft that would be surprising. In my eyes, ###### does happen, but for a company like Microsoft, with the resources and money to have redundant data centers and servers across the US and worldwide, a four-hour downtime is simply unacceptable.
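
The "wait it out under 2 hours" rule is exactly the kind of policy that could be encoded once instead of re-litigated by a manager during every incident. A toy sketch, with made-up numbers rather than anything from an actual runbook:

```python
from datetime import datetime, timedelta

# Hypothetical policy mirroring the "wait it out if under 2 hours" rule above.
WAIT_IT_OUT = timedelta(hours=2)

def should_fail_over(outage_start: datetime, estimated_fix: timedelta) -> bool:
    """Fail over only when riding it out would blow past the policy window."""
    elapsed = datetime.now() - outage_start
    return elapsed + estimated_fix > WAIT_IT_OUT
```

Of course, codifying the rule doesn't solve the genuinely hard input (estimating the fix time), which is presumably why managers hesitate in the first place.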


I definitely read the article. Even after reading what you quoted, I don't get why engineers had to do anything and why it took 4-5 hours or whatever. Load balancing, failover: these services were created to be automatic. Surely Microsoft has the resources and money. So I'm curious wth really happened!

It depends. People were speculating that Exchange Online Protection was hit with a DDoS attack, which could have made it difficult for automated backup systems to come online.

Most companies aren't going to publicly admit to a DDoS attack, so we may never know.
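
For what it's worth, a DDoS is one of the few scenarios where failover automation plausibly stalls: if probes time out everywhere, the system can't tell "primary dead" from "network saturated." A rough sketch of quorum-based probing (all probe names hypothetical, and each "vantage point" here is just a thread rather than a real remote machine) shows why:

```python
import concurrent.futures
import socket

# Hypothetical probe locations -- in a real setup these would be separate
# machines on different networks, not threads on one box.
VANTAGE_POINTS = ["probe-us", "probe-eu", "probe-asia"]

def probe(vantage: str, target: str = "outlook.office365.com", port: int = 443) -> bool:
    """Stand-in for a probe running at `vantage`; here it just tries a TCP connect."""
    try:
        with socket.create_connection((target, port), timeout=5):
            return True
    except OSError:
        return False

def quorum_says_down(quorum: int = 2) -> bool:
    """Only declare an outage when most vantage points agree."""
    with concurrent.futures.ThreadPoolExecutor() as pool:
        results = list(pool.map(probe, VANTAGE_POINTS))
    return results.count(False) >= quorum
```

Under a volumetric attack the quorum trips, but failing over doesn't necessarily help, because the standby sits behind the same saturated front door. That ambiguity is one plausible reason engineers had to intervene by hand.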

