becks.strugglers.net

http://gallery.strugglers.net/albums/Hardware/becks.sized.jpg


 * AMD Sempron(tm) Processor 3100+
 * 512MB RAM
 * nVidia nForce3-based motherboard
 * 100Mbit onboard ethernet (forcedeth)
 * 3xMaxtor 120GB, 1xSeagate 120GB, configured mostly as RAID5

RAID mishap, 2005-11-07
Here's an email I sent to a mailing list about this. It occurred to me afterwards that other people might be interested, if only to point and laugh.

Date: Tue, 8 Nov 2005 23:04:37 +0000
From: Andy Smith
Subject: Software RAID slight problem

I have a cheap fileserver, and when I built it I decided I'd use 4x120GB Maxtor SATA disks, cos they were cheap. Yes -- Maxtor -- you can stop laughing now.

On Monday morning, some 11 months after the machine was built, /dev/sdb decided it didn't want to read from 50 or so of its sectors, that it had reached its maximal sector reallocation limit, that it was very unwell, and that it would degrade my array. I might have known sooner with smartd, but smartd does not currently work with libata SATA drives in Linux without a kernel patch.

Anyway so I immediately ordered a replacement and figured I wouldn't scrape the bottom of the barrel this time, and ordered a Seagate 120GB. Now I'm sure some of you have problems with Seagate too but let's just agree they aren't as shonky as Maxtor and get on with the tale.

I identified sdb, removed it and threw it in a pile marked "to swear at and then RMA" and stuck in the Seagate. Here's what I get when I boot the machine now:

    scsi0 : sata_nv
      Vendor: ATA      Model: Maxtor 6Y120M0    Rev: YAR5
      Type:   Direct-Access                     ANSI SCSI revision: 05
    scsi1 : sata_nv
      Vendor: ATA      Model: ST3120827AS       Rev: 3.42
      Type:   Direct-Access                     ANSI SCSI revision: 05
    scsi2 : sata_nv
      Vendor: ATA      Model: Maxtor 6Y120M0    Rev: YAR5
      Type:   Direct-Access                     ANSI SCSI revision: 05
    scsi3 : sata_nv
      Vendor: ATA      Model: Maxtor 6Y120M0    Rev: YAR5
      Type:   Direct-Access                     ANSI SCSI revision: 05
    SCSI device sda: 240121728 512-byte hdwr sectors (122942 MB)
    SCSI device sdb: 234441648 512-byte hdwr sectors (120034 MB)
    SCSI device sdc: 240121728 512-byte hdwr sectors (122942 MB)
    SCSI device sdd: 240121728 512-byte hdwr sectors (122942 MB)

Anyone spotted the current thorn in my side yet?

Yes, thanks Seagate, your "120GB" hard disk is 2908MB smaller than the 3 "120GB" Maxtors.
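
For anyone who wants to check the arithmetic, the shortfall follows directly from the sector counts in the boot log. This is only a minimal sketch in Python; the variable and function names are illustrative, and "MB" is assumed to mean 10^6 bytes, which is what the figures in the kernel messages work out to.

```python
SECTOR_BYTES = 512  # "512-byte hdwr sectors" in the boot log

maxtor_sectors = 240121728   # sda, sdc, sdd
seagate_sectors = 234441648  # sdb


def to_mb(sectors):
    """Convert a sector count to decimal megabytes (10**6 bytes), rounded down."""
    return sectors * SECTOR_BYTES // 10**6


shortfall_sectors = maxtor_sectors - seagate_sectors
shortfall_mb = to_mb(maxtor_sectors) - to_mb(seagate_sectors)

print(shortfall_sectors, "sectors short")
print(shortfall_mb, "MB short")
```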

So I haven't even bothered trying to partition it the same as sd[acd] yet as it's not going to bloody work is it.

Fortunately, I can probably bodge this. You see, the partition table of each of the 3 Maxtors is like this:

    Partition  Start sector  End sector  Purpose
    1          63            144584      /boot
    2          144585        10635029    /stripe
    3          10635029      229472460   everything else

/boot is a RAID-1 on sda1 and sdb1 with sdc1 and sdd1 being hot spares.[1] (md0)

/stripe is/was a RAID-0 across sd[abcd]2 for use as scratch space. (md1)

"everything else" is a RAID-5 of /dev/sd[abcd]3 which I use as an LVM PV and carve up lots of LVs for my root filesystem and all other things.

So the situation right now is:

    $ cat /proc/mdstat
    Personalities : [raid0] [raid1] [raid5] [multipath]
    md2 : active raid5 sdd3[3] sdc3[2] sda3[0]
          344208384 blocks level 5, 256k chunk, algorithm 2 [4/3] [U_UU]

    md0 : active raid1 sdd1[1] sdc1[2] sda1[0]
          72192 blocks [2/2] [UU]

    unused devices:

i.e. md0 (/boot) has rebuilt itself onto sdd1 and has sdc1 as a hot spare. md1 (/stripe) is totally screwed and was commented out of /etc/fstab since it is now dead. There wasn't anything on it anyway; it really was scratch space. md2 is running as a degraded RAID-5 with sdb3 missing.
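
The degraded state can also be read mechanically from the `[4/3] [U_UU]` part of the mdstat output, where each `_` marks a missing slot. The following is a rough Python sketch, not a robust parser: the function name is hypothetical, and the sample text assumes the usual two-lines-per-array layout of /proc/mdstat.

```python
import re

# Sample text taken from the /proc/mdstat output quoted in the email;
# the line breaks assume the usual /proc/mdstat layout.
MDSTAT = """\
md2 : active raid5 sdd3[3] sdc3[2] sda3[0]
      344208384 blocks level 5, 256k chunk, algorithm 2 [4/3] [U_UU]
md0 : active raid1 sdd1[1] sdc1[2] sda1[0]
      72192 blocks [2/2] [UU]
"""


def degraded_arrays(mdstat_text):
    """Return {array name: [missing slot numbers]} for arrays showing a '_' flag."""
    result = {}
    current = None
    for line in mdstat_text.splitlines():
        header = re.match(r"(md\d+) :", line)
        if header:
            current = header.group(1)
            continue
        status = re.search(r"\[(\d+)/(\d+)\] \[([U_]+)\]", line)
        if status and current:
            flags = status.group(3)
            missing = [slot for slot, flag in enumerate(flags) if flag == "_"]
            if missing:
                result[current] = missing
    return result


print(degraded_arrays(MDSTAT))
```

Slot 1 of md2 is the one that sdb3 used to occupy, since sda3, sdc3 and sdd3 hold slots 0, 2 and 3.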

I figure I can create sdb3 first and make it the same size as partition 3 on the other disks. I can then compromise by taking the excess out of sdb2, and not bother about sdb1 at all. md1 will end up using less than the full extent of sd[acd]2 since sdb2 will still be smaller than those others, but it's only scratch space so I don't care.
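
As a sanity check on that plan, the space each RAID-5 member has to provide can be estimated from the mdstat figures alone. This is a back-of-the-envelope Python sketch under stated assumptions: /proc/mdstat blocks are 1 KiB, and the md superblock and partition-table overhead are ignored, so the real sdb3 has to be slightly larger than the figure computed here. The variable names are illustrative.

```python
SECTORS_PER_KIB = 2          # 512-byte sectors, 1 KiB blocks in /proc/mdstat

md2_blocks = 344208384       # size of md2 from /proc/mdstat
md2_devices = 4              # [4/3]: four members, three currently present
seagate_sectors = 234441648  # sdb, from the boot log

# RAID-5 spends one member's worth of space on parity,
# so the usable size is spread over (n - 1) members.
per_member_kib = md2_blocks // (md2_devices - 1)
per_member_sectors = per_member_kib * SECTORS_PER_KIB

# Whatever is left on the Seagate is available for sdb1 and sdb2 combined
# (an upper bound, since overhead is ignored in this sketch).
left_over_sectors = seagate_sectors - per_member_sectors

print(per_member_sectors, "sectors of data per md2 member")
print(left_over_sectors, "sectors left over on the Seagate")
```

The leftover comes to roughly 2.5GB, so the Seagate has room for a full-sized sdb3 with a smaller sdb2 in front of it.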

But what's the easiest way to do this? I can't start creating partitions at the end can I? Or maybe I can, in something other than fdisk.

It would no doubt be easier to create sdb1 first as the correct size to match sd[acd]3 then use the rest for sdb2, but then I end up with a "less pretty" md setup where md2 is made of {sda3, sdb1, sdc3, sdd3}.

So what would you guys do? :)

Answers of the form "remove all Maxtor disks, replace with all the same drives from one sane vendor, and restore from backups" not welcome thanks ;)

Oh, and, will I need to do anything special (like shrink sd[acd]2) when trying to rebuild the stripe (md1) with the smaller partition from sdb?

Cheers, Andy

[1] Did it like that because I wanted /boot on a RAID 1, but I keep it read only most of the time and didn't see the point in having it a 4-way RAID 1. But then, there isn't much else to do with two 70-odd MB disk partitions so I made them hot spares.

Suggestions
Neil Brown suggested I use cfdisk to create the correctly-sized partitions, and pointed out that md's RAID-0 doesn't need its devices to be all the same size. Hugo Mills and Adrian Bridgett both of HantsLUG also suggested cfdisk.
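
To illustrate the point about unequal RAID-0 members: md splits such an array into zones, striping across every member until the smallest is full and then carrying on across the members that still have space. Below is a simplified Python model of that idea. It is a sketch only: it ignores chunk-size rounding, the function names are hypothetical, and the sizes are made-up units rather than the real partition sizes.

```python
def raid0_zones(member_sizes):
    """Split unequal members into zones.

    Each zone stripes over the members that still have free space.
    Returns a list of (members_in_zone, size_taken_from_each_member) tuples.
    """
    zones = []
    used = 0  # how much of every remaining member is already allocated
    remaining = sorted(member_sizes)
    while remaining:
        smallest = remaining[0]
        depth = smallest - used
        if depth > 0:
            zones.append((len(remaining), depth))
            used = smallest
        remaining.pop(0)
    return zones


def raid0_capacity(member_sizes):
    """Total capacity is the sum over zones of members * depth."""
    return sum(count * depth for count, depth in raid0_zones(member_sizes))


# Hypothetical sizes: three equal members and one smaller one.
sizes = [10, 10, 10, 4]
print(raid0_zones(sizes))
print(raid0_capacity(sizes))
```

In this model no space is lost: the capacity equals the sum of the member sizes, which is why there is no need to shrink sd[acd]2 to match a smaller sdb2.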

"Tim" also from HantsLUG suggested that I replace all the Maxtor drives with Seagate ones. It's not clear how that would have fixed my immediate problem and it would have been costly.

"Seanie" suggested nuking it all and recreating from backups. I have backups but they are all over the place, due to perhaps foolishly building a fileserver bigger than I can comfortably back up. As a result this would take a long time, so it was a last resort really.

Simon Amor suggested using the 120GB Seagate for something else and buying a 160GB one instead. This would have taken a few days longer, and given that I had 215GB of data running off a degraded RAID-5, I really wanted this sorted out as soon as possible. Also, I intend to upgrade this server perhaps as soon as Christmas anyway.

Resolution
I chose the cfdisk solution and everything is up and working redundantly again now.