To RAID or not to RAID? If not, what then?

raid data integrity

6 replies to this topic

#1 Glassed Silver

Glassed Silver

    ☆♡Neowin's portion of Crazy♡☆

  • Joined: 10-June 04
  • Location: MY CATFORT in Kassel, Germany
  • OS: OS X ML; W7; Elementary; Android 4
  • Phone: iPhone 5 64GB Black (6.0.2)

Posted 01 December 2011 - 13:04

Heya guys!

I have a concept in mind and would like to discuss it with you guys! :)
I have no date for deploying this and I have to wait a while either way, but I want to get some perspective.

To sort this out right in the beginning:
I'm paranoid with my data integrity, I want to minimize risk of bad sectors and the like as much as possible.
Simply put, I want all my data to be as bullet proof as possible at okay prices and not too many hoops.
It's a lot of data and I want it to scale over time.

Here we go, my basic concept:

Mac -> externally connected RAID
-> Mac OS X also doing Time Machine backups from the RAID to a RAID0(?) NAS or some other solution.

I read a bit about RAIDs and am scared that I get a faulty controller/it dies, or that some rarely accessed files get damaged without my noticing in time to restore them from the Time Machine backup.
So it's also a lot about the integrity I'm worried about.

Basically I want to keep my data till I die, literally. Of course this one setup won't outlive me, but it should be upgradable and scalable.

We're talking about starting with 6-8ish TB of capacity before redundancy.

I really am worried about a worst case:
RAID controller breaks -> what then?
Simply get a new one of the same type and dandy?

The filesystem I use is HFS+ case insensitive.

Should the idea of servers arise: Not sure whether I'm too fond of it... Rather not.

I want to always have rock solid file metadata, too. (Creation dates and modification dates, both in the Mac-handled sense, not the Windows sense, meaning: a copy of a file is not "created" at the moment it is copied! That's how I roll!)

Maaaany thanks in advance!
Cheers,

Glassed Silver


#2 +zhiVago

zhiVago

    Pax Orbis

  • Tech Issues Solved: 2
  • Joined: 04-October 01
  • Location: The Heartland
  • OS: Windows Seven

Posted 01 December 2011 - 13:15

To raid! :D

I'm paranoid with my data integrity, I want to minimize risk of bad sectors and the like as much as possible.
Simply put, I want all my data to be as bullet proof as possible at okay prices and not too many hoops.
It's a lot of data and I want it to scale over time.


RAID 1 is what you want. :)

We're talking about starting with 6-8ish TB of capacity before redundancy


wow, that's a lot. Have you got a full tower case already?

RAID controller breaks -> what then?
Simply get a new one of the same type and dandy?


:yes:

#3 +patseguin

patseguin

    Neowin Addict

  • Tech Issues Solved: 1
  • Joined: 21-May 02
  • Location: Buffalo, NY
  • OS: Windows 8.1
  • Phone: iPhone 6

Posted 01 December 2011 - 18:56

Correct me if I'm wrong, but if that's a boot drive, you'll need an EFI BIOS and have to format the drive as GPT.

#4 REM2000

REM2000

    Neowinian Senior

  • Joined: 20-July 04
  • Location: UK

Posted 01 December 2011 - 19:23

As far as I know, there is only one filesystem that could be used in an everyday way to do what you want, and that is ZFS.

RAID only protects against mechanical failure: if a disk fails, the system keeps operating. This of course does nothing to protect data integrity. The main problem facing the storage industry at the moment is data rot, where firmware, controller and other defects silently corrupt data, and you never know until you come to access and use the file.

ZFS tackles this by checksumming your data: if it detects that something has become corrupted, it will automatically repair it from a redundant copy (a mirror or parity), provided the pool has redundancy. The RAIDZ system built into ZFS is incredibly powerful; it lets you quickly set up a 3-disk, 5-disk or 10-disk array with your choice of 1, 2 or 3 disks of redundancy. If you are paranoid about your data and want to keep it forever, you want at least 2 disks of redundancy: sometimes when a disk fails and you replace it, the stress of rebuilding the array, scrubbing etc. overworks the remaining disks and causes another to fail. Having at least two means you can survive both a lost disk and a further failure during the rebuild/scrub.
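To make the detect-and-repair idea concrete, here is a toy Python sketch. It is not how ZFS is actually implemented; `ToyMirror` and its in-memory "disks" are made up purely for illustration. It only shows the principle: keep a checksum for every block, verify it on read, and overwrite a bad copy with a good one.

```python
import hashlib


def checksum(data: bytes) -> str:
    """SHA-256 digest of a block, as a hex string."""
    return hashlib.sha256(data).hexdigest()


class ToyMirror:
    """Hypothetical two-way mirror: two in-memory 'disks' plus one checksum per block."""

    def __init__(self):
        self.disks = [{}, {}]   # block_id -> bytes, one dict per disk
        self.checksums = {}     # block_id -> expected digest

    def write(self, block_id, data: bytes):
        # Record the checksum, then store the same block on every disk.
        self.checksums[block_id] = checksum(data)
        for disk in self.disks:
            disk[block_id] = data

    def read(self, block_id) -> bytes:
        expected = self.checksums[block_id]
        for disk in self.disks:
            data = disk[block_id]
            if checksum(data) == expected:
                # Found a good copy: repair any copy that fails verification.
                for other in self.disks:
                    if checksum(other[block_id]) != expected:
                        other[block_id] = data
                return data
        # No copy matches its checksum: only a real backup can help now.
        raise IOError(f"block {block_id}: all copies are corrupt")


mirror = ToyMirror()
mirror.write(0, b"family photos")
mirror.disks[0][0] = b"family phXtos"           # simulate silent corruption on disk 0
assert mirror.read(0) == b"family photos"       # the read still returns good data
assert mirror.disks[0][0] == b"family photos"   # and the bad copy was healed
```

A plain mirror without the checksum would see two different copies and have no way of telling which one is right, which is the gap the checksum fills.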

As for implementing this, you'd be surprised how easy it is to set up an SMB server with Solaris 11. It works with most hardware and is free to download. You only pay for support, but help can be had from the forums, and you are not running this in an enterprise environment.

The other option is to use FreeNAS, which is a web-based spin of FreeBSD 8 with ZFS built in. ZFS is not a hack job there; it is completely supported in FreeBSD and thus FreeNAS. There is a good community and the product is well supported and mature. The ZFS filesystem is still being developed and is versioned, e.g. newer versions add encryption. I think the FreeBSDs are a version or two behind Solaris; however, for what you want to use it for, the version supported in FreeBSD/FreeNAS is all you need.

ZFS also does some really cool stuff with compression and deduplication. The dedup is spectacular and really makes your jaw drop when you see it running. It does exactly what it says on the tin: if you upload two files that are the same, it only stores one copy. I tried this out by copying a video file to various folders on the ZFS system; the file system reported the total size of all the copies, but the free space only went down by the size of one file.
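The principle behind that can be sketched in a few lines of Python. This is a simplification under stated assumptions: `ToyDedupStore` is a made-up name, and it deduplicates whole files by their hash, whereas ZFS deduplicates at the block level. It does reproduce the effect described above, where the logical size counts every copy but the physical size counts the data once.

```python
import hashlib


class ToyDedupStore:
    """Hypothetical content-addressed store: identical data is kept only once."""

    def __init__(self):
        self.blocks = {}  # digest -> bytes (the data actually stored)
        self.files = {}   # path -> digest (what each path points at)

    def put(self, path: str, data: bytes):
        digest = hashlib.sha256(data).hexdigest()
        # Store the bytes only if this content has not been seen before.
        self.blocks.setdefault(digest, data)
        self.files[path] = digest

    def get(self, path: str) -> bytes:
        return self.blocks[self.files[path]]

    def logical_size(self) -> int:
        """Total size as the user sees it: every copy counts."""
        return sum(len(self.blocks[d]) for d in self.files.values())

    def physical_size(self) -> int:
        """Space actually consumed: each unique content counts once."""
        return sum(len(b) for b in self.blocks.values())


store = ToyDedupStore()
video = b"\x00" * 1000  # stand-in for a video file
for folder in ("movies", "backup", "tmp"):
    store.put(f"{folder}/clip.mov", video)

assert store.logical_size() == 3000   # three copies reported
assert store.physical_size() == 1000  # one copy stored
```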

The only drawback with ZFS is that it's quite a heavy filesystem and needs good specs to work well. You will need a computer with at least a dual core (quad if possible, or a really fast dual such as a Core 2 Duo) and at least 2GB RAM, although I would recommend 4GB minimum, with 8GB if you want really fast network/disk transfer speeds, for example if you're transferring a lot of media files over a 1 Gb/s network.

If, however, you don't want to go down this route (you said you wanted to stay away from a server), then the only other option is to keep multiple copies of your data: copy it to external hard disks, the cloud, other computers, DVD-R etc. Work on the principle of storage redundancy, so that if one storage pool goes down, e.g. an external hard disk, you still have a cloud copy. Perhaps get some backup software that performs hash-based file integrity checks after the backup.
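If you'd rather roll the integrity check yourself, a minimal Python sketch could look like the following. The function names (`file_sha256`, `build_manifest`, `verify`) are my own, not from any particular backup tool. The idea is to record a hash for every file once, store that manifest alongside the backup, and re-check it periodically so that damage to rarely opened files gets noticed while a good copy still exists.

```python
import hashlib
from pathlib import Path


def file_sha256(path, chunk_size=1 << 20) -> str:
    """Hash a file in chunks so large media files don't need to fit in RAM."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def build_manifest(root) -> dict:
    """Map each file's path (relative to root) to its SHA-256 digest."""
    root = Path(root)
    return {
        p.relative_to(root).as_posix(): file_sha256(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def verify(root, manifest: dict) -> list:
    """Return the relative paths that are missing or no longer match their hash."""
    root = Path(root)
    damaged = []
    for rel, expected in manifest.items():
        p = root / rel
        if not p.is_file() or file_sha256(p) != expected:
            damaged.append(rel)
    return damaged
```

Typical use would be to call `build_manifest` right after a backup, save the result (as JSON, say), and run `verify` against both the original and the backup every so often. Note that this only detects damage and deliberate edits alike; it cannot tell them apart, and it cannot repair anything, so it is a complement to multiple copies rather than a replacement.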

One thing I would also recommend is not betting everything on HFS+. It's not a great filesystem, and I have had two FS crashes resulting in complete hard disk corruption; it was purely software, the hard disks were mechanically fine. I'm a big Mac fan, but I will always say that HFS+ is the Mac's weakest link.

Happy trails

#5 +theblazingangel

theblazingangel

    Software Engineer

  • Tech Issues Solved: 6
  • Joined: 25-March 04
  • Location: England, UK

Posted 01 December 2011 - 20:25

I read a bit about RAIDs and am scared that I get a faulty controller/it dies

As someone just mentioned, RAID is about redundancy - keeping your system running should a drive fail - a backup is still essential!

#6 htcz

htcz

    Neowinian Senior

  • Joined: 22-July 11

Posted 01 December 2011 - 21:41

I have a RAID5, and well, I see you are going for a RAID1, so it isn't too much of an issue. My RAID5 is very expensive to upgrade (which I want to do).

Just sharing the thought :)

#7 +BudMan

BudMan

    Neowinian Senior

  • Tech Issues Solved: 100
  • Joined: 04-July 02
  • Location: Schaumburg, IL
  • OS: Win7, Vista, 2k3, 2k8, XP, Linux, FreeBSD, OSX, etc. etc.

Posted 01 December 2011 - 22:01

"Basically I want to keep my data till I die, literally"

Keeping your data has NOTHING to do with RAID, as mentioned already:

"keeping your system running should a drive fail - a backup is still essential!"

I cannot stress this enough!!! You could be running RAID 6, with backup controllers, the whole 9 yards -- all this does is minimize the risk of data being offline due to a hardware failure. This does not prevent loss of data.

Comes down to this -- if your data was "offline", would it be a concern for you if it took, say, 24 hours, or to go to the extreme, X weeks to restore from backup? Is that your concern, or is your concern loss of data?

RAID can never be considered a "backup" of your data. If what you are worried about is loss, then you need a backup solution. This will prevent loss of data no matter the reason, be it hardware failure, theft, disaster, etc.

If your data is that critical to you, then you need a backup solution that takes into account disaster recovery, i.e. the loss of not only your hardware, but say the whole building. A tornado or fire, for example, could take out your RAID, your backup, etc.

Don't get me wrong - the correct RAID level can be very useful and save you loads and loads of time in the case of drive failure (and all drives fail eventually). But it is not a backup solution by any means.