Just learned the hard way not to use New File System Technologies

February 16, 2013

I just learned the hard way not to trust New File System Technologies, and wanted to share my experience with you all, in the hope that you can learn from my mistakes.

I received two new drives in the mail yesterday, and after 24 hours of stress testing in a separate system, I went to put them in my server case. My server consists of the following:

LSI 9260-8i RAID controller

Intel RES2SV240NC SAS expander

4x Supermicro 5 in 3 drive cages

4x Samsung HD204UI 2TB drives (RAID 10)

5x Seagate 3TB drives (RAID 5)

The Samsung drives were formatted as ReFS and were used solely for my virtual machines (email, NZB downloads, etc.). When I went to insert the two new drives, something happened, and the drive lights lit up on all occupied bays. I rebooted the server, and when it came back up, the controller reported that the cache was lost but that it had recovered. Everything came back up fine, except Hyper-V would not load any VMs. I checked in Windows, and drive E (my VMs/ReFS drive) did not have a full/empty bar. I clicked it and got a message saying that the drive repair was unsuccessful.

So what does this mean? Something happened to the supposedly "Resilient" file system, it couldn't repair the issue, and it basically wiped my drives (well, not wiped, but it thinks the volume is empty and won't let me access it; see image).

The only fix is to reformat and restore from backup, which in my case is about a month old.

Moral of the story? Don't trust ReFS yet, and keep better backups. My two other volumes, both of which are NTFS, were fine. I have reformatted the RAID 10 volume as NTFS instead of ReFS, and will be investing in a battery backup unit for the LSI controller, just in case this happens again.
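For anyone who wants to act on the "keep better backups" part, here is a minimal sketch of a staleness check. It assumes backups land as ordinary files under a single folder; the function names and the 7-day threshold are hypothetical choices for illustration, not anything taken from the setup described above.

```python
import time
from pathlib import Path


def newest_backup_age_days(backup_dir, now=None):
    """Return the age in days of the most recently modified file under
    backup_dir, or None if the folder contains no files at all."""
    now = time.time() if now is None else now
    mtimes = [p.stat().st_mtime for p in Path(backup_dir).rglob("*") if p.is_file()]
    if not mtimes:
        return None
    return (now - max(mtimes)) / 86400  # 86400 seconds per day


def backup_is_stale(backup_dir, max_age_days=7, now=None):
    """Treat a missing backup, or one older than max_age_days, as stale."""
    age = newest_backup_age_days(backup_dir, now)
    return age is None or age > max_age_days
```

Run from a scheduled task, something like this could send a warning long before the newest backup reaches a month old. It only looks at file modification times, so it says nothing about whether the backup can actually be restored.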

February 16, 2013

http://redmondmag.com/articles/2012/05/11/microsoft-offering-improved-chkdsk-utility-in-windows-8.aspx

?

Also, I find it highly annoying that most (big) backup providers have not even come out with a solution yet.

February 16, 2013

I tried chkdsk, but STUPIDLY enough, it won't run, giving the message "This volume cannot be checked because it cannot be accessed", yet diskpart has no trouble selecting it and even says that the volume is healthy. I posted a shot showing that diskpart thinks the volume is completely empty.

February 16, 2013

Did you try any software to recover the partition information? I've got almost all of my servers running on 2012 now and haven't had any issues (most are on RAID 1, though).

February 16, 2013

I didn't. Fortunately it was only VMs, and since I had a backup from 20 days ago, I only lost about 80 emails' worth of data. I've got a RAID 1 offline box that is only powered on to back up VMs and other critical data, so I'm fairly protected. I also found out the cause of the original issue: one of the pins in a molex connector that powers the SAS expander came out of the connector, cutting off power to the SAS expander and taking all drives offline at the same time.

In your Server 2012 instance, are you using ReFS?

February 16, 2013

I've got a mixed environment right now for the most part. The mission-critical ones (Exchange, SharePoint) I keep on NTFS, just because I didn't want to be a guinea pig with over 500 employees' worth of data.

A power loss will do it in a heartbeat, but doesn't the expander have a battery on it as well?

February 19, 2013

Nah, the expander receives power directly from the motherboard or, in my case, from a 4-pin molex cable (which in this case had a pin come loose). There's no BBU for Intel's expanders; HP's may have one. I suppose in hindsight it's a good thing it was power to the expander that failed, since if something had happened to, say, 2 drives in the RAID 5 array, or 2 in one of the RAID 10 array's mirror sets, I'd probably have lost EVERYTHING.

The entire box has a 1-hour UPS on it, and until I get a BBU for the card, I've disabled write-back on all VDs as an added precaution.
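The point about losing 2 drives can be sketched in a few lines. This is a simplified model, assuming a RAID 5 array tolerates exactly one failed member and a RAID 10 array survives as long as no mirror pair loses both of its drives. The drive labels and the pairing below are made up for illustration; the real controller's layout may differ.

```python
def raid5_survives(failed):
    """RAID 5 keeps its data as long as at most one member drive has failed."""
    return len(set(failed)) <= 1


def raid10_survives(mirror_pairs, failed):
    """RAID 10 keeps its data as long as no mirror pair has lost both drives."""
    failed = set(failed)
    return all(not set(pair) <= failed for pair in mirror_pairs)


# Hypothetical labels for the 4x 2TB RAID 10 array: two mirrored pairs.
pairs = [("s0", "s1"), ("s2", "s3")]

print(raid10_survives(pairs, {"s0", "s2"}))  # True: one drive lost per pair
print(raid10_survives(pairs, {"s0", "s1"}))  # False: a whole pair is gone
print(raid5_survives({"d0"}))                # True: single failure
print(raid5_survives({"d0", "d1"}))          # False: two failures
```

This only models permanent drive failures. It does not cover what happened here, where every drive dropped offline at once and came back after a reboot.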

SirEvan

+Neowin User 007 Subscriber²