RAID-0: HDD going bad, or Controller?



This has happened to me three times in the past couple of weeks. One of the drives in my RAID 0 stops working, rendering the logical drive useless. Windows 7 is installed on a separate SSD, so I can still use my system when this happens. The HDD stays offline until I restart my PC, and restarting restores everything to working order. I'm wondering what is causing it. Is the drive going bad? Is it the controller? Could it be a software problem? Earlier today it happened as I was loading PlanetSide 2.



It might be a case of the hard drive's error retry mechanism versus the RAID's timeout system. On some hard drives, if the drive hits an error while reading or writing, it will retry the operation, which takes much longer than the few ms it should normally take. The RAID controller can then decide the drive has failed completely when it might just be trying to work around an error.

Only some hard drives do this. Others don't, and some (WD enterprise drives, for example) let you enable, disable, and configure how long each retry can take before the drive gives up.

It could be that, which would be indicative of a pending drive failure. The fact that it has happened several times also fits a pending failure, so I'd take the PC offline, clone the drive to another one, put the clone in, and see if it still happens. I very much doubt it's a RAID card/controller error.

(See: http://en.wikipedia.org/wiki/Error_recovery_control)
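If you boot Linux for the cloning anyway, smartmontools can show and change this retry timeout on drives that support it. This is only a sketch: the helper name `erc_cmd` is made up for illustration, /dev/sdX is a placeholder, and it only prints the command so you can check it before running it. Many desktop drives will simply report that the feature is unsupported.

```shell
# Show the current error recovery control (ERC) setting, if the drive has one:
#   smartctl -l scterc /dev/sdX

# Hypothetical helper: build (but do not run) the smartctl command that sets
# the read and write ERC timeouts. smartctl expects tenths of a second.
erc_cmd() {
  erc_dev=$1
  erc_secs=$2
  erc_ds=$((erc_secs * 10))   # seconds -> tenths of a second
  printf 'smartctl -l scterc,%d,%d %s\n' "$erc_ds" "$erc_ds" "$erc_dev"
}
```

For example, `erc_cmd /dev/sdX 7` prints the command for a 7 second limit. On many drives the setting does not survive a power cycle, so treat it as a test rather than a permanent fix.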


When I last had this type of issue I blamed the controller (AMD) and the HDD power-down options in the BIOS and in Windows 7. I replaced the controller with a Marvell-based PCI-E x4 card and had no further issues or errors. It appeared to be resolved, but 2 months later the same drive died and was replaced under warranty by Seagate (Samsung HDD). From what I could gather, the logic board had failed; there was no physical fault.

I would run some diagnostics on the drive and see if something else will report a fail.
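One way to do that from a Linux live disc is smartmontools: `smartctl -H /dev/sdX` gives the drive's overall health verdict and `smartctl -A /dev/sdX` lists the SMART attributes. The small filter below is a sketch for pulling one raw value out of that attribute table. The name `smart_raw` is made up, and it assumes the usual ten-column ATA attribute layout with the raw value in the last column.

```shell
# Hypothetical helper: read `smartctl -A` style output on stdin and print the
# raw value (10th column) of the attribute named in $1.
smart_raw() {
  awk -v name="$1" '$2 == name { print $10 }'
}

# Typical use (sdX is a placeholder for the suspect drive):
#   smartctl -A /dev/sdX | smart_raw Reallocated_Sector_Ct
#   smartctl -A /dev/sdX | smart_raw Current_Pending_Sector
```

Non-zero reallocated or pending sector counts would point at the drive rather than the controller. If the logic board is what's failing, SMART may still look clean, so a clean result doesn't clear the drive.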


Thanks for the replies. Here are a few more details:

It's happened three times, each time it's been the same physical HDD that has stopped.

The drives are 1 TB Western Digital Blacks.

I'm using the onboard RAID function on a Gigabyte 970A-UD3 motherboard.

As for cloning the drive in question, what would you suggest I use to do so? There are still 2 available SATA ports on the motherboard, as far as I can recall.


Turn the PC off, take out the drive that's having problems, and put it into another PC along with a blank 1TB drive. Boot up Linux and use either GParted or dd to copy it over.

If you use dd, MAKE SURE YOU GET THE RIGHT DEVICE ID FIRST!

Then use: dd if=/dev/sdX of=/dev/sdY bs=64M (sdX is the problem drive you are copying from, sdY is the blank drive you are copying to)

Use GParted to find out which device is which (sda, sdb, sdc, sdd, etc.).
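Since mixing up the two devices would wipe the drive you are trying to save, it may be worth building the command with a small guard first. This is a sketch only: `clone_cmd` is a made-up helper that prints the dd line instead of running it, and it refuses when source and target are the same device.

```shell
# List the disks with size, model and serial so the two drives can be told apart:
#   lsblk -o NAME,SIZE,MODEL,SERIAL

# Hypothetical dry-run helper: prints the dd command, never executes it.
clone_cmd() {
  clone_src=$1
  clone_dst=$2
  if [ -z "$clone_src" ] || [ -z "$clone_dst" ]; then
    echo "usage: clone_cmd SOURCE TARGET" >&2
    return 2
  fi
  if [ "$clone_src" = "$clone_dst" ]; then
    echo "refusing: source and target are the same device ($clone_src)" >&2
    return 1
  fi
  printf 'dd if=%s of=%s bs=64M\n' "$clone_src" "$clone_dst"
}
```

Check the printed line against the lsblk output, then run it as root. The guard only catches identical names. It can't tell you which drive is the blank one, so the serial numbers are still the thing to compare.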

