I just learned the hard way not to trust new file system technologies, and wanted to share my experience with you all, in the hope that you can learn from my mistakes.

I received two new drives in the mail yesterday, and after 24 hours of stress testing in a separate system, I went to put them in my server case. My server consists of the following:

LSI 9260-8i RAID controller

Intel RES2SV240NC SAS expander

4x Supermicro 5-in-3 drive cages

4x Samsung HD204UI 2TB drives (RAID 10)

5x Seagate 3TB drives (RAID 5)

The Samsung drives were formatted as ReFS, and were used solely for my virtual machines (email, NZB downloads, etc.). When I went to insert the two new drives, something happened, and the drive lights lit up on all occupied bays. I rebooted the server, and when it came back up, it said the cache was lost but the controller had recovered. Everything came back up fine, except that Hyper-V would not load any VMs. I checked in Windows, and drive E (my VM/ReFS drive) did not show a full/empty bar. I clicked it, and got a message saying that the drive repair was unsuccessful.

So what does this mean? Something happened to the supposedly "Resilient" file system, it couldn't repair the issue, and it basically wiped my drives (well, not wiped, but it thinks the volume is empty and won't let me access it). See image:

post-26332-0-93795900-1360985609.png

The only fix is to reformat and restore from backup, which in my case is about a month old.

Moral of the story? Don't trust ReFS yet, and keep better backups. My two other volumes, both of which are NTFS, were fine. I have reformatted the RAID 10 volume as NTFS instead of ReFS, and will be investing in a battery backup unit for the LSI controller, just in case this happens again.

http://redmondmag.com/articles/2012/05/11/microsoft-offering-improved-chkdsk-utility-in-windows-8.aspx

?

Also, I find it highly annoying that most (big) backup providers have not even come out with a solution yet.

  On 16/02/2013 at 03:32, Tech Greek said:

http://redmondmag.co...-windows-8.aspx

?

Also, I find it highly annoying that most (big) backup providers have not even come out with a solution yet.

I tried chkdsk, but STUPIDLY enough, it won't run, giving the message "This volume cannot be checked because it cannot be accessed". Yet diskpart has no trouble selecting the volume, and even reports it as healthy. I posted a screenshot showing that diskpart thinks the volume is completely empty.

  On 16/02/2013 at 03:48, Tech Greek said:

Did you try any software to recover the partition information? I've got almost all of my servers running on 2012 now and haven't had any issues (most are on RAID 1, though).

I didn't. Fortunately it was only VMs, and since I had a backup from 20 days ago, I only lost about 80 emails' worth of data. I've got an offline RAID 1 box that is only powered on to back up VMs and other critical data, so I'm fairly protected. I also found the cause of the original issue: one of the pins in the Molex connector that powers the SAS expander came out of the connector, cutting power to the expander and taking all the drives offline at the same time.

In your Server 2012 instance, are you using ReFS?

  On 16/02/2013 at 04:24, SirEvan said:

I didn't. Fortunately it was only VMs, and since I had a backup from 20 days ago, I only lost about 80 emails' worth of data. I've got an offline RAID 1 box that is only powered on to back up VMs and other critical data, so I'm fairly protected. I also found the cause of the original issue: one of the pins in the Molex connector that powers the SAS expander came out of the connector, cutting power to the expander and taking all the drives offline at the same time.

In your Server 2012 instance, are you using ReFS?

I've got a mixed environment right now for the most part. The mission-critical ones (Exchange, SharePoint) I keep on NTFS, just because I didn't want to be a test pig with over 500 employees' worth of data.

A power loss will do it in a heartbeat, but doesn't the expander have a battery on it as well?

  On 16/02/2013 at 05:01, Tech Greek said:

I've got a mixed environment right now for the most part. The mission-critical ones (Exchange, SharePoint) I keep on NTFS, just because I didn't want to be a test pig with over 500 employees' worth of data.

A power loss will do it in a heartbeat, but doesn't the expander have a battery on it as well?

Nah, the expander receives power directly from the motherboard or, in my case, from a 4-pin Molex cable (which is where the pin came loose). There's no BBU for Intel's expanders; HP's may have one. I suppose in hindsight it's a good thing it was power to the expander that failed. If instead something had happened to, say, 2 drives in the RAID 5 array, or both drives in one of the RAID 10 array's mirrored pairs, I'd probably have lost EVERYTHING on that array.
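To put rough numbers on the "2 drives" point, here is a minimal Python sketch that counts which two-drive failures would be fatal for each array. The drive numbering and the mirror-pair layout are assumptions for illustration only; the actual pairing depends on how the controller built the RAID 10 spans.

```python
from itertools import combinations

# Assumed layout (not from the original post): drives are numbered from 0,
# and the 4-drive RAID 10 is built from two mirrored pairs, (0, 1) and (2, 3).
RAID5_DRIVES = 5
RAID10_PAIRS = [(0, 1), (2, 3)]


def raid5_survives(failed):
    """Single-parity RAID 5 can rebuild from at most one missing drive."""
    return len(failed) <= 1


def raid10_survives(failed, pairs=RAID10_PAIRS):
    """RAID 10 survives unless both members of some mirrored pair are gone."""
    failed = set(failed)
    return not any(set(pair) <= failed for pair in pairs)


def fatal_two_drive_failures():
    """Count the two-drive failure combinations that destroy each array."""
    raid5_fatal = sum(
        not raid5_survives(f) for f in combinations(range(RAID5_DRIVES), 2)
    )
    raid10_drives = [d for pair in RAID10_PAIRS for d in pair]
    raid10_fatal = sum(
        not raid10_survives(f) for f in combinations(raid10_drives, 2)
    )
    return raid5_fatal, raid10_fatal


if __name__ == "__main__":
    raid5_fatal, raid10_fatal = fatal_two_drive_failures()
    print(f"RAID 5:  {raid5_fatal} of 10 two-drive failures are fatal")
    print(f"RAID 10: {raid10_fatal} of 6 two-drive failures are fatal")
```

Under these assumptions, any two-drive loss kills the RAID 5 array, while the RAID 10 array only dies when both failures land in the same mirrored pair.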

The entire box is on a 1-hour UPS, and until I get a BBU for the card, I've disabled write-back caching on all VDs (virtual drives) as an added precaution.

This topic is now closed to further replies.