Ever wished that you had set up your Mac OS X Server with a boot RAID before you installed everything? Afraid that making a RAID mirror now will require you to back up, format, and restore?
Read on to see how to create a RAID mirror without formatting your drive.
MAKE SURE YOU HAVE A GOOD BACKUP BEFORE YOU ATTEMPT THIS!
If you’ve ever wanted to add the redundancy of a RAID but didn’t want to go through the backup, format, restore dance, this is your lucky day. The command-line version of Disk Utility, diskutil, has a nifty verb, ‘enableRAID’. It takes an existing disk and turns it into a degraded RAID set. You can then use Disk Utility or diskutil to add a second member to the set and rebuild it.
It’s even pretty easy to do.
- Make sure you have a good backup. We are going to be rewriting the partition map of your disk. If something were to go wrong you could lose all of your data. That said, this seems to be pretty reliable.
- If you want to mirror your boot device you will need to start up from something else first. Next, open the Terminal and type
diskutil list
You will get some output like this:
[macxmv2:~] admin% diskutil list
/dev/disk0
   #:                   type name            size  identifier
   0: Apple_partition_scheme            *172.6 GB  disk0
   1:    Apple_partition_map              31.5 KB  disk0s1
   2:         Apple_Driver43              28.0 KB  disk0s2
   3:         Apple_Driver43              28.0 KB  disk0s3
   4:       Apple_Driver_ATA              28.0 KB  disk0s4
   5:       Apple_Driver_ATA              28.0 KB  disk0s5
   6:         Apple_FWDriver             256.0 KB  disk0s6
   7:     Apple_Driver_IOKit             256.0 KB  disk0s7
   8:          Apple_Patches             256.0 KB  disk0s8
   9:              Apple_HFS Server HD   172.5 GB  disk0s9
/dev/disk1
   #:                   type name            size  identifier
   0: Apple_partition_scheme            *172.6 GB  disk1
   1:    Apple_partition_map              31.5 KB  disk1s1
   2:         Apple_Driver43              28.0 KB  disk1s2
   3:         Apple_Driver43              28.0 KB  disk1s3
   4:       Apple_Driver_ATA              28.0 KB  disk1s4
   5:       Apple_Driver_ATA              28.0 KB  disk1s5
   6:         Apple_FWDriver             256.0 KB  disk1s6
   7:     Apple_Driver_IOKit             256.0 KB  disk1s7
   8:          Apple_Patches             256.0 KB  disk1s8
   9:              Apple_HFS SecondHD    172.5 GB  disk1s9
The main thing we are looking for here is the device name of our disks. Let’s pretend we are going to make a mirror of our boot disk, disk0.
- We’ve decided that we want to create a mirror of disk0 so type
diskutil enableRAID mirror disk0
and the drive in question will vanish from the desktop! Wait for it… Wait for it… Is that fear I see in your eyes, soldier?!? Are you thinking about your backup yet? Pop! The drive should remount after 30 seconds or so with its contents intact. (It may come back quicker on your Mac; I tested this procedure on a G4 500 with two 18GB SCSI drives. Bleh…)
- Now that your drive is back you can rebuild the new RAID set with either a
diskutil repairMirror (RAID device name) (Bad slice) (Good member) (New member)
or just open up Disk Utility, drag and drop, and then watch the pretty progress bar.
If you mirrored your boot device you can start up from it and get back to work before you rebuild, since you can now rebuild in the background. In fact, when writing this tip I was letting the boot RAID rebuild in the background of a G4 AGP 500 running Mac OS X 10.3.4 client. I then followed up with a G4 Xserve running 10.3.5 Server.
(It’s occurred to me that I should let you know that the rebuild takes a really, really long time. Disk Utility warns you about this, but it doesn’t drive the point home with enough force. When I rebuilt my RAID for this article it took over an hour for less than 8 GB of data to mirror. This is as good a reason to reboot and rebuild in the background as any I can think of!)
And that’s really all there is to it! Now when a drive dies you get to decide when to replace it. If you have an Xserve you don’t even need to shut down to swap drives and rebuild your boot RAID.
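For anyone who would rather script the steps above than type them, here is a minimal Python sketch. It only strings together the diskutil invocations already shown in this article. The helper names build_commands and run, and the dry-run default, are my own assumptions and not anything Apple ships. The repairMirror arguments are passed through untouched because they depend on your own RAID set.

```python
import subprocess


def build_commands(source_disk, repair_args=None):
    """Return the diskutil invocations from the article, in order.

    source_disk is the device from `diskutil list` (disk0 in the article).
    repair_args, if given, are the four repairMirror arguments described
    above; they are not guessed here.
    """
    commands = [
        ["diskutil", "list"],
        ["diskutil", "enableRAID", "mirror", source_disk],
    ]
    if repair_args:
        commands.append(["diskutil", "repairMirror", *repair_args])
    return commands


def run(commands, dry_run=True):
    """Print each command; only execute it when dry_run is False."""
    for argv in commands:
        print(" ".join(argv))
        if not dry_run:
            # check=True stops at the first failing step.
            subprocess.run(argv, check=True)


if __name__ == "__main__":
    # Dry run only: shows what would be typed, changes nothing.
    run(build_commands("disk0"))
```

Leaving dry_run at True prints the commands without touching any disk, which seems like the right default for something that rewrites a partition map.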
Things to remember
Enjoy!
I’m about to try this right now! I have my G5 Xserve booted in target disk
mode off my PowerBook and have created a NetBoot Install image of the
Xserve, so if it bombs, I can restore directly off of the image I created. I’ll
post in how it goes!
Worked for me!
One note for anyone using FireWire Target Disk Mode: either boot the server
up off the boot drive and then complete the mirroring process, or be
prepared to leave your laptop with the server! It took about 20 hours for a
250GB drive to mirror in my dual 2.0GHz Xserve G5, when booted off my
laptop at least.
Just a heads up! 🙂
It’s a bit hard to tell from the KB, but it looks like it was added with 10.3.3.
The 10.3.5 article also claims to add this ability, but you can tell it is a cut and
paste job from the 10.3.3 update. Given that, and the fact that I was able to rebuild
in the background on a 10.3.4 client, I would say that the 10.3.3 article is
accurate.
Just select the RAID set in Disk Utility, click the RAID tab, and then drag a new
member into the set and click “Rebuild”.
—
Breaking my server to save yours.
Josh Wisenbaker
http://www.afp548.com
Wondering if anyone’s seen this….
So, I created my boot mirror, which went well, but now I’m noticing odd
amounts of CPU activity. For example, I came in today to discover that one
CPU had been hovering at 50% for quite some time, without any users
connected or any tasks running. Also, I was unable to run fsck when I
attempted in single user mode. Is there a different set of commands when
using a RAID mirror? Please excuse me if this is a simple question, but I don’t
have a ton of experience running mirrors and such and I just want to be sure
I’m not seeing the beginning of something larger.
Thanks!
I haven’t seen the CPU issue on any of the boot RAIDs that our customers have.
I deleted my test RAID mirror at work the other day, but I can re-RAID it and see what happens.
I think you should be able to use fsck on the volume just fine. If you can boot to it in single user mode then you obviously can read the device. I have found a few KB articles that speak of running it on a boot RAID mirror, but none that say you can’t.
Josh
—
Breaking my server to save yours.
Josh Wisenbaker
http://www.afp548.com
Tuesday morning, when I’m back in the office, I’ll see what I’m at as far as
CPU usage goes. It appeared that it was gradually gaining after my reboot
mid-day on Friday. It stayed at 0% – 3% for a while, then it just kept moving
up slowly. I’ll also try running fsck on the server, especially if the CPU usage
is abnormally high again.
I’ll write back in and let you all know what’s up! I was also considering
removing that drive mirror and rebuilding, but I’d prefer not to, of course,
especially if it’s not necessary.
Ok, so I don’t leave all of you sitting at the edges of your seats wondering
what I’ve found… 🙂
So, I came in on Tuesday to see that the server was using a very small amount
of CPU, about 3%, which is more than acceptable. I did notice, however, that
it restarted sometime on Saturday. So far, since that reboot, it’s been running
much better. It’s showing a bit more network traffic than before, but that’s
probably due in part to the new OD Master running on it for MCX, which rocks, by
the way.
I’m going to keep my eye on it though, of course. I’ll have to reboot that guy
anyway when I install the security patch, so I will try fsck once again at that
time and see if I receive an error and I’ll report back what I find.
It’s a bit hard to tell from the KB, but it looks like it was added with 10.3.3.
Had no choice but to try this remotely on 10.2.8 and it worked! Had to reboot for the changes to be seen.
Worked perfectly on a friend’s G5, thanks!
10.3.5 will keep rebuilding on its own once the RAID rebuild is started. I discovered
this by accidentally quitting Disk Utility part-way through the rebuild; when I
re-opened it the progress window popped up immediately…
One question though: when rebuilding from the CLI, you mention the RAID
device name. Would this be the name of the drive you are mirroring from?
‘diskutil repairMirror (RAID device name) (Bad slice) (Good member) (New
member)’
BTW, on a 1.6 GHz G5 with 2 GB of RAM and two 80 GB SATA drives with about
45 GB of data on them, the rebuild took 45 minutes.
Cheers,
Dave
Thanks for the excellent article. I have used this half a dozen times since it was posted. In most cases, I’ve advised my customers to purchase a stock G5 with 160 drive and replace with two Maxtor SATA 250s ($200 from MicroCenter).
It appears as though the time it takes to remount the partition is roughly 30 seconds even on a G5.
I’ve done a couple of the configs remotely without problem.
Takes about 2-2.5 hours to rebuild the mirror on a 250 with 50gb + worth of data. I’m just starting the rebuild of my drives (90gb on a 160) so we’ll see how it does.
Thanks again.
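The rebuild times reported in this thread are easier to compare as throughput. The sketch below is plain arithmetic in Python. It assumes the rebuild copies the whole member disk rather than just the used space (nothing in the thread confirms that) and uses decimal units (1 GB = 1000 MB). The function names are made up for illustration.

```python
def effective_mb_per_s(disk_gb, hours):
    """Average copy rate if a disk_gb member is fully written in `hours`."""
    return (disk_gb * 1000.0) / (hours * 3600.0)


def estimate_hours(disk_gb, mb_per_s):
    """Rough rebuild time for a disk_gb member at a steady mb_per_s."""
    return (disk_gb * 1000.0) / mb_per_s / 3600.0


# Figures quoted by posters in this thread: (label, disk size in GB, hours).
reports = [
    ("250 GB Xserve G5 drive over FireWire Target Disk Mode", 250, 20.0),
    ("80 GB SATA drive in a 1.6 GHz G5", 80, 0.75),
]

for label, disk_gb, hours in reports:
    rate = effective_mb_per_s(disk_gb, hours)
    print(f"{label}: {rate:.1f} MB/s")
```

If the whole-disk assumption holds, estimate_hours gives a ballpark for a different drive size at the same rate. Treat it as a planning number only; a busy server will likely rebuild slower.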
I am running Mac OS X 10.3.5 on a G5 with external FW800 LaCie twin 305GB drives as RAID 1. After a freeze I had to shut down the machine, and on restart I got an unrecognizable disk in the RAID array, so I shut the machine down and went to bed.
This morning it boots fine and Disk Utility shows the RAID array with both disks degraded.
But… I checked the RAID status in Terminal as suggested in
http://docs.info.apple.com/article.html?artnum=106987
and it says both disks have status OK:
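Name:        Hydra
Unique ID:   Hydrab2e9e566214a11d99823000a95afef40
Type:        Mirror
Status:      Degraded
Device Node: disk6
-------------------------------------------------------------
#  Device Node  Status
-------------------------------------------------------------
0  disk5        OK
1  disk2        OK
-------------------------------------------------------------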
I also ran DiskWarrior to rebuild the directory and ran Repair in Disk Utility, and both checked out OK.
Is it possible there isn’t a problem? If there is one, how would I find it?
Apple notes that with OS X Server it is possible to ignore the degraded status in some cases, implying it might be mistakenly reported:
http://docs.info.apple.com/article.html?artnum=107406
Has anyone ever heard of this? Any suggestions?
thanks
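A status report like the one above can be checked mechanically for this kind of contradiction (set marked Degraded while every member reads OK). This is a minimal Python sketch that assumes the report looks like the text pasted in this comment. The function names are hypothetical, and the output format of other OS versions may well differ.

```python
import re


def parse_raid_status(text):
    """Return (set_status, {member_device: member_status}) from a report."""
    set_status = None
    members = {}
    for raw in text.splitlines():
        line = raw.strip()
        m = re.match(r"^Status:\s*(\S+)", line)
        if m:
            set_status = m.group(1)
            continue
        # Member rows look like: "0 disk5 OK"
        m = re.match(r"^(\d+)\s+(disk\d+(?:s\d+)?)\s+(\S+)$", line)
        if m:
            members[m.group(2)] = m.group(3)
    return set_status, members


def is_contradictory(set_status, members):
    """True when the set is Degraded but every listed member says OK."""
    return (
        set_status == "Degraded"
        and len(members) > 0
        and all(status == "OK" for status in members.values())
    )


sample = """
Name: Hydra
Type: Mirror
Status: Degraded
Device Node: disk6
# Device Node Status
0 disk5 OK
1 disk2 OK
"""

status, members = parse_raid_status(sample)
print(status, members, is_contradictory(status, members))
```

A True result only says the report disagrees with itself. It does not say which half is right.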
OK,
So when attempting the first step on this drive w/ partitions, I get
Error enabling disk to RAID Disk is not suitable for enabling RAID (-9694)
Wonder if there is a way around this.
oh, and the link to the manpage has changed & can now be found at
http://developer.apple.com/documentation/Darwin/Reference/ManPages/man8/diskutil.8.html
That is correct. A software mirror RAID in 10.3 cannot have partitions.
—
Changing the world, one server at a time.
Joel Rennich
http://www.afp548.com
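Given the restriction above, it may help to spot multi-volume disks before trying enableRAID. Here is a minimal Python sketch that reads `diskutil list` text in the layout shown in the article and groups Apple_HFS slices by disk. The function names and the sample listing are illustrative assumptions, and treating "more than one Apple_HFS slice" as the thing that triggers the -9694 error is my reading of this thread, not documented behaviour.

```python
from collections import defaultdict


def hfs_volumes_per_disk(listing):
    """Map each whole disk (e.g. 'disk1') to its Apple_HFS slice identifiers."""
    volumes = defaultdict(list)
    current_disk = None
    for line in listing.splitlines():
        tokens = line.split()
        if not tokens:
            continue
        if tokens[0].startswith("/dev/"):
            # Section header such as "/dev/disk1"
            current_disk = tokens[0][len("/dev/"):]
            continue
        is_row = tokens[0].rstrip(":").isdigit() and len(tokens) >= 3
        if current_disk and is_row and tokens[1] == "Apple_HFS":
            # The identifier (e.g. disk1s10) is always the last column.
            volumes[current_disk].append(tokens[-1])
    return dict(volumes)


def multi_volume_disks(listing):
    """Disks carrying more than one Apple_HFS volume."""
    per_disk = hfs_volumes_per_disk(listing)
    return sorted(d for d, slices in per_disk.items() if len(slices) > 1)


# Made-up listing for illustration only.
sample = """
/dev/disk0
#: type name size identifier
0: Apple_partition_scheme *172.6 GB disk0
9: Apple_HFS Server HD 172.5 GB disk0s9
/dev/disk1
#: type name size identifier
0: Apple_partition_scheme *149.1 GB disk1
9: Apple_HFS Data 115.9 GB disk1s10
10: Apple_HFS Scratch 32.9 GB disk1s12
"""

print(multi_volume_disks(sample))
```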
Anyone have any statistics on drive performance during a rebuild? We have a
bit of a high usage server (editing server) that will be undergoing a raid
rebuild – I need to know if I should tell our editors to not bother working for
the hour it takes to rebuild, or to proceed as normal…. Any experience
anywhere?
The chase: I have removed two disks from a 10.3.9 software mirror RAID
set using the 10.4GM diskutil.
One now has a partition map with slice 4 of type "Apple_RAID", the
other disk has slice 4 of type "Apple_RAID_OfflineV2". How do I get one
or both back to being normal Journaled HFS+ volumes without wiping their
data?
At the very least, I need technical docs for Software RAID – a Google search for
Apple_RAID yields absolutely no hits – nor does a search on Developer Connection.
—
Disclaimer: I know what I did here was wrong and that I need to go back
to a backup. No need for anyone to waste their time on a helpful post to
tell me this – but thanks in advance for the thoughts anyway.
The detail:
It started out as a sunny day in Sydney, Australia. Life seemed good.
The night before I had upgraded a 2x2Ghz G5 PowerMac from Panther Server
to Tiger Server successfully.
Kerberos, AFP, DHCP, DNS were all smoothly up and running, as was Samba,
Apache and even JBoss.
I had gotten IMAP and POP running where others had failed – Apple
includes an invalid TLSNames attribute in the cyrus config file that
needs to be deleted to allow mail delivery to kick off.
The jewel in the crown… My HP5510 printer was even behaving properly
with the new server and print serving was finally working again using
IPP – it had been broken under Panther.
So, I thought, before I go to bed I had better just check the world is
good with disk utility.
Alas, I found that my software mirror was in a ‘degraded’ state. I had
forgotten that before I did the Tiger upgrade I had attempted to split
one drive from the array and leave it powered off Just In Case.
The array failed to boot with the missing disk, so I had reluctantly
plugged it back in, intending to rebuild the array before the Tiger
upgrade. Mistake #1 – I forgot to do so.
In any case, to its credit, the Tiger Server installer happily upgraded
the OS on the degraded array and left it in the same state.
So, I thought, I know I can’t repair that array volume because it is the
startup disk. I’ll have to reboot with the Tiger Server DVD and rebuild
the array then.
Once booted with the Tiger DVD, I found Disk Utility to be a little
whacked. (I later learned that there is a difference between Panther
RAIDsets and Tiger RAIDsets but at the time I was ignorant). It was
unable to rebuild the array or do anything with the existing
configuration – except show little padlock icons next to each disk. I
now attribute this to differences between Tiger and Panther RAID.
So, I thought, this can’t be right. And off I trundled into the Terminal
app to hunt down the problem with Disk Utility.
I determined my only option was to attempt to rebuild the RAIDset using
diskutil repairMirror <mirrorname> <offlinediskname>.
I kicked this off – no dice.
Figured it didn’t ‘take’ – so kicked it off again.
Nothing. No activity whatsoever. Mistakes #2 and #3 had just happened.
Figured it might be like some old Promise RAID implementations I’ve used
in the past and that rebuild might start only on reboot.
Rebooted the system (that had never previously failed to boot).
It failed to boot – with a big "no" symbol where the Apple normally is.
(That’s ‘no’ as in ‘no smoking’ but without the cigarette)
So, I thought, time to boot the server install DVD again.
Once more, the graphical Disk Utility was useless. Back to terminal.
Sure enough, a quick diskutil showRAID revealed that there was one
"online" disk, one "offline" disk and two unknown disks – a total of
four drives – now involved in my 2-physical-disk mirror RAID set.
No amount of checking, removing, or otherwise could do anything to get
rid of the "unknown" disks from the RAIDset – I now suspect these were
created by trying to rebuild a 10.3 array in 10.4 Disk Utility.
On a whim (mistake #4) I thought I’d try to rebuild the mirror again,
having no real other option, so I fed diskutil the disk marked as
‘offline’. Lo and behold, the mirror began to rebuild. A quick "diskutil
checkRAID" confirmed progress was being made.
Sometime between 80% and 100% something bad must have happened, because
when I came back in the morning I found that the rebuild had terminated
normally, but there was no change in the mirror status – it was still
degraded, with one disk offline, one online, and two phantom unknown
disks.
Finally, in desperation I decided to split the offline disk from the
array using diskutil removeFromRAID. It worked!!!
The array was still all funky with those other drives so I also then
removedFromRAID the online drive, thus completely destroying the RAIDset
– but at least those phantom disks are gone. Mistake #5
Now I figured, I could simply add either (but preferably the formerly
online) disk back into a new array using diskutil enableRAID mirror
diskname, then rebuild it with repairMirror feeding it the other.
Unfortunately, enableRAID requires its disk to be mounted. A quick
diskutil list showed me neither of my offline mirror halves were indeed
mounted.
So, I thought, a quick diskutil mountDisk will fix my last problem.
Mistake #6. It turns out that the 4th slice of each of the individual
mirror disks now have non standard partition types. One is Apple_RAID.
The other is Apple_RAID_OfflineV2.
However the rest of the disk partitions look good and I know these disks
have not been written to since I started this whole messy process.
My belief is that incompatibilities between Apple_RAID v1 and v2 exist
and that I am partially a victim of this problem – though largely of my own
stupidity and lack of caution.
So, the question is, does anybody out there know about how I can convert
these disks back to standard bootable drives – or even where to find
details on the partitioning scheme used by Apple. And ideally, how can I
overwrite the drive partition map data but not mess up the data?
I since found the Monster Disk Technote and tried using pdisk to change
the ContentHint for the Apple_RAID partition back to Apple_HFS.
No dice – can’t verify the volume in Disk Utility.
It started off as such a nice day…
Thanks for reading such a longwinded post. All help accepted humbly and
gratefully.
Marc
Oh my.
Did you take a look at the convertRAID verb for diskutil? It should take a 1.x
RAID (10.3-) and convert it into a 2.x RAID (10.4+). If you got the removeFromRAID
verb to work, though, it may have already been converted.
If you think the data partition is good you might be able to copy it off with
something like dd or Data Rescue. The drive you pdisked might not ever mount
again. 🙁
—
Breaking my server to save yours.
Josh Wisenbaker
http://www.afp548.com
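For the "copy it off with something like dd" idea, the core of such a copy is just a block-by-block read and write. Below is a minimal Python sketch of that loop, not a replacement for dd or Data Rescue. The function name copy_raw and the 1 MiB block size are my own choices. Reading a raw device node would also need root and an unmounted source, and this sketch does no bad-sector handling at all.

```python
def copy_raw(src_path, dst_path, block_size=1024 * 1024):
    """Copy src_path to dst_path in fixed-size blocks; return bytes copied."""
    copied = 0
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        while True:
            block = src.read(block_size)
            if not block:
                # Empty read means end of the source.
                break
            dst.write(block)
            copied += len(block)
    return copied


# Hypothetical usage: image a slice onto a different, healthy volume.
# The paths below are placeholders, not taken from this thread.
# copy_raw("/dev/rdiskXsY", "/Volumes/Rescue/slice.img")
```

Always write the image to a different physical disk than the one being rescued.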
No joy for me. I booted from the 10.4 Server CD-ROM, and ran Disk Utility
from the menu. enableRAID fails to shrink (‘grow’) the volume, although
there’s plenty of free space.
Here’s what I saved off before booting back to normal:
-sh-2.05b# diskutil list
/dev/disk0
#: type name size identifier
0: Apple_partition_scheme *115.0 GB disk0
1: Apple_partition_map 31.5 KB disk0s1
2: Apple_Driver43 28.0 KB disk0s2
3: Apple_Driver43 28.0 KB disk0s3
4: Apple_Driver_ATA 28.0 KB disk0s4
5: Apple_Driver_ATA 28.0 KB disk0s5
6: Apple_FWDriver 256.0 KB disk0s6
7: Apple_Driver_IOKit 256.0 KB disk0s7
8: Apple_Patches 256.0 KB disk0s8
9: Apple_HFS www 114.9 GB disk0s10
/dev/disk1
#: type name size identifier
0: Apple_partition_scheme *149.1 GB disk1
1: Apple_partition_map 31.5 KB disk1s1
2: Apple_Driver43 28.0 KB disk1s2
3: Apple_Driver43 28.0 KB disk1s3
4: Apple_Driver_ATA 28.0 KB disk1s4
5: Apple_Driver_ATA 28.0 KB disk1s5
6: Apple_FWDriver 256.0 KB disk1s6
7: Apple_Driver_IOKit 256.0 KB disk1s7
8: Apple_Patches 256.0 KB disk1s8
9: Apple_HFS www-spare 115.9 GB disk1s10
10: Apple_HFS 33gb 32.9 GB disk1s12
/dev/disk2
#: type name size identifier
0: Apple_partition_scheme *71.6 GB disk2
1: Apple_partition_map 31.5 KB disk2s1
2: Apple_Driver43 28.0 KB disk2s2
3: Apple_Driver43 28.0 KB disk2s3
4: Apple_Driver_ATA 28.0 KB disk2s4
5: Apple_Driver_ATA 28.0 KB disk2s5
6: Apple_FWDriver 256.0 KB disk2s6
7: Apple_Driver_IOKit 256.0 KB disk2s7
8: Apple_Patches 256.0 KB disk2s8
9: Apple_HFS 71gb 71.5 GB disk2s10
/dev/disk3
#: type name size identifier
0: Apple_partition_scheme *232.9 GB disk3
1: Apple_partition_map 31.5 KB disk3s1
2: Apple_Driver43 28.0 KB disk3s2
3: Apple_Driver43 28.0 KB disk3s3
4: Apple_Driver_ATA 28.0 KB disk3s4
5: Apple_Driver_ATA 28.0 KB disk3s5
6: Apple_FWDriver 256.0 KB disk3s6
7: Apple_Driver_IOKit 256.0 KB disk3s7
8: Apple_Patches 256.0 KB disk3s8
9: Apple_HFS 232gb 232.7 GB disk3s10
/dev/disk4
#: type name size identifier
0: Apple_partition_scheme *2.6 GB disk4
1: Apple_partition_map 31.5 KB disk4s1
2: Apple_Driver_ATAPI 4.0 KB disk4s2
3: Apple_HFS Mac OS X Server Install Disc 2.6 GB disk4s3
/dev/disk6
#: type name size identifier
0: untitled *467.0 KB disk6
/dev/disk7
#: type name size identifier
0: untitled *95.0 KB disk7
/dev/disk8
#: type name size identifier
0: untitled *95.0 KB disk8
/dev/disk9
#: type name size identifier
0: untitled *95.0 KB disk9
/dev/disk10
#: type name size identifier
0: untitled *219.0 KB disk10
-sh-2.05b# diskutil enableRAID
Disk Utility Tool
Usage: diskutil enableRAID [mirror|concat] [Device Node|Device Identifier]
Convert a single filesystem disk into a degraded mirror or concatenated RAID
set.
This will not work with all disks. Filesystem must be mounted and
shrinkable.
(i.e. Journaled HFS+, or its derivates). There must be enough room
to insert RAID information. Ownership of the affected disk is required.
Enabling RAID is an inherently dangerous operation. Please make
backups
of all affected data before proceeding.
Example: diskutil enableRAID mirror /Volumes/Target
-sh-2.05b# diskutil enableRAID mirror /Volumes/www
changing filesystem size on disk ‘disk0s10’…
Attempting to change filesystem size from 123376680960 to 123387248640
bytes
Filesystem grow failed, 1
Disk Management could not shrink the filesystem to fit the new RAID headers
Error enabling disk to RAID Invalid request (-9998)
-sh-2.05b# diskutil enableRAID mirror /dev/disk0
The target disk must be a volume, not a whole disk
Disk Utility Tool
Usage: diskutil enableRAID [mirror|concat] [Device Node|Device Identifier]
Convert a single filesystem disk into a degraded mirror or concatenated RAID
set.
This will not work with all disks. Filesystem must be mounted and
shrinkable.
(i.e. Journaled HFS+, or its derivates). There must be enough room
to insert RAID information. Ownership of the affected disk is required.
Enabling RAID is an inherently dangerous operation. Please make
backups
of all affected data before proceeding.
Example: diskutil enableRAID mirror /Volumes/Target
-sh-2.05b# diskutil enableRAID mirror /dev/disk0s10
changing filesystem size on disk ‘disk0s10’…
Attempting to change filesystem size from 123376680960 to 123387248640
bytes
Filesystem grow failed, 1
Disk Management could not shrink the filesystem to fit the new RAID headers
Error enabling disk to RAID Invalid request (-9998)
-sh-2.05b# df -hl
Filesystem Size Used Avail Capacity Mounted on
/dev/disk4s3 2.6G 2.5G 94M 96% /
/dev/disk6 467K 11K 433K 2% /Volumes
/dev/disk7 95K 64K 27K 70% /private/var/tmp
/dev/disk8 95K 13K 78K 14% /private/var/run
/dev/disk3s10 233G 146G 87G 63% /Volumes/232gb
/dev/disk2s10 71G 40G 31G 57% /Volumes/71gb
/dev/disk1s10 116G 52M 116G 0% /Volumes/www-spare
/dev/disk1s12 33G 34M 33G 0% /Volumes/33gb
/dev/disk9 95K 18K 73K 20% /private/tmp
/dev/disk10 219K 29K 180K 14% /private/var/db/netinfo
/dev/disk0s10 115G 40G 74G 35% /Volumes/www
-sh-2.05b# diskutil
Disk Utility Tool
Utility to manage local disks and volumes.
Most options require root access to the device
Usage: diskutil <verb> <options>
<verb> is one of the following:
list (List the partitions of a disk)
information | info (Get information on a disk or volume)
unmount (Unmount a single volume)
unmountDisk (Unmount an entire disk (all volumes))
eject (Eject a disk)
mount (Mount a single volume)
mountDisk (Mount an entire disk (all mountable volumes))
rename (Rename a volume)
enableJournal (Enable HFS+ journaling on a mounted HFS+ volume)
disableJournal (Disable HFS+ journaling on a mounted HFS+ volume)
verifyVolume (Verify the structure of a volume)
repairVolume (Repair the structure of a volume)
verifyPermissions (Verify the permissions of a volume)
repairPermissions (Repair the permissions of a volume)
repairOS9Permissions (Repair the permissions for the current
Classic boot volume)
eraseDisk (Erase an existing disk, removing all volumes)
eraseVolume (Erase an existing volume)
reformat (Reformat an existing volume)
eraseOptical (Erase an optical media (CD/RW, DVD/RW, etc.))
zeroDisk (Erase a disk, writing zeros to the media)
randomDisk (Erase a disk, writing random data to the media)
secureErase (Securely erase a disk or freespace on a volume)
partitionDisk ((re)Partition a disk, removing all volumes)
createRAID (Create a RAID set on multiple disks)
destroyRAID (Destroy an existing RAID set)
checkRAID (Check a RAID set for errors)
enableRAID (Convert a disk to a degraded RAID mirror set)
convertRAID (Convert a RAID 1.x (pre-Tiger) to a RAID 2.x (Tiger))
updateRAID (Update the settings of an existing RAID)
addToRAID (Add a spare or member disk to an existing RAID)
removeFromRAID (Remove a spare or member disk from an existing
RAID)
repairMirror (Repair a damaged RAID mirror set)
diskutil <verb> with no options will provide help on that verb
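One thing worth pulling out of the enableRAID output above is the pair of byte counts. A quick bit of Python arithmetic (the variable names are mine) shows the requested size is larger than the current one, which fits the ‘Filesystem grow failed’ line, though it does not explain why the grow failed.

```python
# Sizes copied from the "Attempting to change filesystem size" line.
current_bytes = 123376680960
requested_bytes = 123387248640

delta = requested_bytes - current_bytes

print(delta, "bytes")                      # 10567680 bytes
print(delta / (1024 * 1024), "MiB")        # 10.078125 MiB
print(delta // 512, "blocks of 512 bytes")  # 20640 blocks of 512 bytes
```

So the tool was asking for roughly 10 MiB more, not less, on disk0s10.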
I typed up a PDF with my experiences with Tiger Server, building on the post
from above. Ideally it doesn’t involve much downtime at all. It just requires a
reboot, and the server can be headless and remote from your location.
http://escamuel.org/tiger_raid.pdf