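My /dev/md0 RAID 6 array contains the following devices: /dev/sd[abcdef]
The following drives are also present and have nothing to do with the RAID: /dev/sd[gh]
The following drives belong to a card reader and are likewise unrelated: /dev/sd[ijkl]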
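Analysis
The SATA cable of sdf went bad (you could say it was unplugged while in use), and sdf was subsequently kicked out of the /dev/md0 array. I replaced the cable and the drive is back, now as /dev/sdm. Please don't challenge my diagnosis; there is nothing wrong with the drive.
mdadm --detail /dev/md0 showed sdf(F), i.e. sdf was marked as faulty, so I removed the faulty drive with mdadm --manage /dev/md0 --remove faulty.
Now mdadm --detail /dev/md0 shows "removed" in the slot where sdf used to be: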
    root@galaxy:~# mdadm --detail /dev/md0
    /dev/md0:
            Version : 1.2
      Creation Time : Wed Jul 30 13:17:25 2014
         Raid Level : raid6
         Array Size : 15627548672 (14903.59 GiB 16002.61 GB)
      Used Dev Size : 3906887168 (3725.90 GiB 4000.65 GB)
       Raid Devices : 6
      Total Devices : 5
        Persistence : Superblock is persistent

      Intent Bitmap : Internal

        Update Time : Tue Mar 17 21:16:14 2015
              State : active, degraded
     Active Devices : 5
    Working Devices : 5
     Failed Devices : 0
      Spare Devices : 0

             Layout : left-symmetric
         Chunk Size : 512K

               Name : eclipse:0
               UUID : cc7dac66:f6ac1117:ca755769:0e59d5c5
             Events : 67205

        Number   Major   Minor   RaidDevice State
           0       8        0        0      active sync   /dev/sda
           1       8       32        1      active sync   /dev/sdc
           4       0        0        4      removed
           3       8       48        3      active sync   /dev/sdd
           4       8       64        4      active sync   /dev/sde
           5       8       16        5      active sync   /dev/sdb
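For some reason, the RaidDevice of the "removed" device now matches that of an active device. Anyway, let's try adding the former device (now known as /dev/sdm), since that was the original intent: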
    root@galaxy:~# mdadm --add /dev/md0 /dev/sdm
    mdadm: added /dev/sdm
    root@galaxy:~# mdadm --detail /dev/md0
    /dev/md0:
            Version : 1.2
      Creation Time : Wed Jul 30 13:17:25 2014
         Raid Level : raid6
         Array Size : 15627548672 (14903.59 GiB 16002.61 GB)
      Used Dev Size : 3906887168 (3725.90 GiB 4000.65 GB)
       Raid Devices : 6
      Total Devices : 6
        Persistence : Superblock is persistent

      Intent Bitmap : Internal

        Update Time : Tue Mar 17 21:19:30 2015
              State : active, degraded
     Active Devices : 5
    Working Devices : 6
     Failed Devices : 0
      Spare Devices : 1

             Layout : left-symmetric
         Chunk Size : 512K

               Name : eclipse:0
               UUID : cc7dac66:f6ac1117:ca755769:0e59d5c5
             Events : 67623

        Number   Major   Minor   RaidDevice State
           0       8        0        0      active sync   /dev/sda
           1       8       32        1      active sync   /dev/sdc
           4       0        0        4      removed
           3       8       48        3      active sync   /dev/sdd
           4       8       64        4      active sync   /dev/sde
           5       8       16        5      active sync   /dev/sdb

           6       8      192        -      spare   /dev/sdm
As you can see, the device shows up as a spare and refuses to sync with the rest of the array:
    root@galaxy:~# cat /proc/mdstat
    Personalities : [raid6] [raid5] [raid4]
    md0 : active raid6 sdm[6](S) sdb[5] sda[0] sde[4] sdd[3] sdc[1]
          15627548672 blocks super 1.2 level 6, 512k chunk, algorithm 2 [6/5] [UU_UUU]
          bitmap: 17/30 pages [68KB], 65536KB chunk

    unused devices: <none>
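The symptom shown above (an array with fewer active members than slots, yet holding a device flagged `(S)`) can be spotted mechanically. The following is only a rough sketch: the function name `find_idle_spares` is hypothetical, and it assumes the `/proc/mdstat` layout seen in this post, which may differ on other kernels.

```shell
#!/bin/sh
# Hypothetical helper: print the names of md arrays that are degraded
# (e.g. [6/5]) while still listing a spare marked "(S)".
# Reads /proc/mdstat by default; a different file can be passed as $1.
find_idle_spares() {
    awk '
        # Array header line, e.g. "md0 : active raid6 sdm[6](S) sdb[5] ..."
        /^md[0-9]+ :/ { name = $1; spare = ($0 ~ /\(S\)/) }
        # Status line containing "[total/active]", e.g. "[6/5]"
        match($0, /\[[0-9]+\/[0-9]+\]/) {
            split(substr($0, RSTART + 1, RLENGTH - 2), n, "/")
            if (n[1] + 0 > n[2] + 0 && spare) print name
        }
    ' "${1:-/proc/mdstat}"
}
```

Calling `find_idle_spares` with no argument on the machine in this state would be expected to list `md0`; an empty result means no array matched the pattern.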
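I also tried running mdadm --zero-superblock /dev/sdm before adding the drive, with the same result.
The reason I use RAID 6 is to provide high availability. I will not accept stopping /dev/md0 and reassembling it with --assume-clean or similar workarounds as a way to fix this. This needs to be resolved online, otherwise I don't see the point of using mdadm.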
Solution
For some (currently unknown) reason, the RAID's sync state had become frozen. The winning command was cat /sys/block/md0/md/sync_action:
    root@galaxy:~# cat /sys/block/md0/md/sync_action
    frozen
Simply put, that is why it was not using the available spare. All my hair is gone, at the cost of one simple cat command!
So, just unfreeze the array:
    root@galaxy:~# echo idle > /sys/block/md0/md/sync_action
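For repeated use, the check and the fix can be wrapped together. This is a minimal sketch, not a vetted tool: the function name `unfreeze_md` is made up for illustration, and it assumes the sysfs path used in this post. It only writes `idle` when the current value is exactly `frozen`, so an array that is already recovering is left alone.

```shell
#!/bin/sh
# Hypothetical helper: write "idle" to an md array's sync_action file,
# but only if it currently reads "frozen".
# $1 is the sync_action path (defaults to the md0 path from this post).
unfreeze_md() {
    sync_file="${1:-/sys/block/md0/md/sync_action}"
    state=$(cat "$sync_file") || return 1
    if [ "$state" = "frozen" ]; then
        echo idle > "$sync_file" || return 1
        echo "was frozen, wrote idle"
    else
        echo "no action needed: $state"
    fi
}
```

On a real system this has to run as root, for example `unfreeze_md /sys/block/md0/md/sync_action`.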
There you go!
    root@galaxy:~# cat /sys/block/md0/md/sync_action
    recover
    root@galaxy:~# cat /proc/mdstat
    Personalities : [raid6] [raid5] [raid4]
    md0 : active raid6 sdm[6] sdb[5] sda[0] sde[4] sdd[3] sdc[1]
          15627548672 blocks super 1.2 level 6, algorithm 2 [6/5] [UU_UUU]
          [>....................]  recovery =  0.0% (129664/3906887168) finish=4016.8min speed=16208K/sec
          bitmap: 17/30 pages [68KB], 65536KB chunk

    unused devices: <none>
    root@galaxy:~# mdadm --detail /dev/md0
    /dev/md0:
            Version : 1.2
      Creation Time : Wed Jul 30 13:17:25 2014
         Raid Level : raid6
         Array Size : 15627548672 (14903.59 GiB 16002.61 GB)
      Used Dev Size : 3906887168 (3725.90 GiB 4000.65 GB)
       Raid Devices : 6
      Total Devices : 6
        Persistence : Superblock is persistent

      Intent Bitmap : Internal

        Update Time : Tue Mar 17 22:05:30 2015
              State : active, degraded, recovering
     Active Devices : 5
    Working Devices : 6
     Failed Devices : 0
      Spare Devices : 1

             Layout : left-symmetric
         Chunk Size : 512K

     Rebuild Status : 0% complete

               Name : eclipse:0
               UUID : cc7dac66:f6ac1117:ca755769:0e59d5c5
             Events : 73562

        Number   Major   Minor   RaidDevice State
           0       8        0        0      active sync   /dev/sda
           1       8       32        1      active sync   /dev/sdc
           6       8      192        2      spare rebuilding   /dev/sdm
           3       8       48        3      active sync   /dev/sdd
           4       8       64        4      active sync   /dev/sde
           5       8       16        5      active sync   /dev/sdb
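With a rebuild estimated at several thousand minutes, it may be handy to pull just the progress figure out of `/proc/mdstat`. The sketch below assumes the `recovery = N%` wording shown above; the function name `recovery_percent` is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: print the recovery percentage(s) found in an
# mdstat-style file, one per line; prints nothing if no recovery is running.
# $1 is the file to read (defaults to /proc/mdstat).
recovery_percent() {
    sed -n 's/.*recovery *= *\([0-9.]*\)%.*/\1/p' "${1:-/proc/mdstat}"
}
```

If the goal is simply to block until the rebuild is done, `mdadm --wait /dev/md0` should also do the job without any parsing.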
Bliss :-)