Operating system   Debian squeeze (debian-6.0.7-amd64-netinst.iso)
Disks              2, with 4096 bytes per sector*
RAID type          RAID 1
Live CD            GParted (gparted-live-0.14.1-6-i486.iso)

* Also known as Advanced Format. If the partitioning tool does not handle it properly, the resulting partitions are misaligned.

Note: I went back and forth through the process countless times, so this is a summary of the steps performed.

1. (Netinst) I delete all the partitions

2. (Netinst) I create the two partitions on each disk (/ and swap). I forget to mark / as bootable

3. (GParted) I run fdisk. The partitions are NOT aligned.

3.1 (GParted / fdisk) I delete and recreate partition #2 on each disk

3.2 (GParted / gparted) I create a new partition inside #2 (it becomes #5) and format it as swap (on each disk).

3.3 (GParted / fdisk) Now the partitions ARE aligned

4. (Netinst) I mark the / and swap partitions on each disk as RAID members. I mark / as bootable

5. (Netinst) I configure the two RAID 1 arrays, one for the / partitions (sda1 and sdb1) and one for the swap partitions (sda5 and sdb5)

6. On boot I see that the partitions are once again NOT aligned

6.1 I run parted and delete/recreate them. Summary:

Number  Start        End          Size         Type      File system     Flags
 1      2048s        1921824767s  1921822720s  primary   ext3            boot, raid
 2      1921824768s  1953525160s  31700393s    extended                  lba
 5      1921826816s  1953525160s  31698345s    logical   linux-swap(v1)  raid

7. I install GRUB on the second disk:

sudo grub-install /dev/sdb
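
The alignment check done by eye in steps 3 and 6 can also be scripted. What follows is a minimal sketch, not part of the original procedure: it assumes 512-byte logical sectors and 4096-byte physical sectors (as on these disks), and the function name is_aligned is made up for illustration. A partition is aligned when its start sector multiplied by the logical sector size is a multiple of the physical sector size.

```shell
# is_aligned START_SECTOR [LOGICAL_BYTES] [PHYSICAL_BYTES]
# Prints "yes" if the partition start falls on a physical-sector boundary.
# The defaults (512/4096) are an assumption matching Advanced Format disks.
is_aligned() {
    start=$1
    logical=${2:-512}
    physical=${3:-4096}
    if [ $(( start * logical % physical )) -eq 0 ]; then
        echo yes
    else
        echo no
    fi
}

# Start sectors from the parted summary above, plus the classic
# misaligned start (sector 63) for comparison.
for s in 2048 1921824768 1921826816 63; do
    echo "$s: $(is_aligned "$s")"
done
```

The three start sectors from the summary report "yes" and sector 63 reports "no". Recent versions of parted also offer an align-check command that may be a better fit when it is available.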


Creating the RAID

mdadm --create --verbose /dev/md0 --raid-devices=4 --level=raid5 /dev/sda /dev/sdb /dev/sdc /dev/sdd

Format the RAID

mkfs.ext3 /dev/md0

Installing RAID on Debian

One thing should be added to this nice article in case the installation is being done on brand-new, pristine disks.

If GRUB is being installed on the RAID 1 boot sector rather than the MBR and you are on x86 or x86_64, the Debian installer will probably prompt you about installing an MBR (as this is required for the BIOS to initially access the disk).

At this step you can only pick one of the physical devices, not the RAID partitions. So the MBR should be installed manually on the other disks as a post-installation task, to ensure that no disk is left without an MBR and therefore unusable by the BIOS.

This should hold for PATA hardware, and it is something I went through when performing RAID sanity tests after an etch install (a year ago or so).

Most of the time I have no specific requirements for the MBR, so I usually install the bootloader on the MBR and then duplicate it by hand on the other disks.

For the record, here's how I do the MBR replication:

# grub --no-floppy
device (hd0) /dev/sda
root (hd0,0)
setup (hd0)

device (hd0) /dev/sdb
root (hd0,0)
setup (hd0)

device (hd0) /dev/sdc
root (hd0,0)
setup (hd0)

... and so on.

* --no-floppy speeds up grub's loading
* the 'device' trick ensures that the second stage and the kernel are loaded from the same disk as the MBR, and provides some independence from the BIOS settings (I've seen some voodoo cases where this was required)
* note that after the first disk the grub shell history is of great use: 3x up, bksp, b, enter, 3x up, enter, 3x up, enter, and so on ;)
* take great care that the RAID 1 is in sync, to ensure that all the required files are in their final position on disk
* thanks to grub's architecture, this only has to be done when upgrading grub or when changing a disk, not on every reconfiguration or kernel upgrade.
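
Instead of replaying the shell history by hand, the same three commands can be generated once per disk. This is only a sketch under assumptions: the helper name grub_mbr_commands is hypothetical, it assumes GRUB legacy (the grub shell used above) and that /boot is on the first partition of every disk, i.e. (hd0,0) after the device remap.

```shell
# grub_mbr_commands DISK...
# Prints the GRUB legacy shell commands used above, once per disk.
# Assumes /boot lives on the first partition of every disk, i.e. (hd0,0).
grub_mbr_commands() {
    for disk in "$@"; do
        printf 'device (hd0) %s\n' "$disk"
        printf 'root (hd0,0)\n'
        printf 'setup (hd0)\n'
    done
    printf 'quit\n'
}

# Preview what would be sent to GRUB; nothing is written to disk here.
grub_mbr_commands /dev/sda /dev/sdb /dev/sdc

# To actually apply it (as root, with the RAID 1 in sync):
#   grub_mbr_commands /dev/sda /dev/sdb /dev/sdc | grub --no-floppy --batch
```

Printing the commands first makes it easy to review the disk list before anything touches an MBR.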

Replacing a disk

List the disks currently in the array:

cat /proc/mdstat
  Personalities : [raid1] [raid6] [raid5] [raid4]
  md1 : active raid5 sda2[0] sdc2[2] sdb2[1]
      957232896 blocks level 5, 64k chunk, algorithm 2 [3/3] [UUU]
mdadm --detail /dev/md1
        Version : 00.90.03
  Creation Time : Thu Oct 25 21:16:03 2007
     Raid Level : raid5
     Array Size : 957232896 (912.89 GiB 980.21 GB)
    Device Size : 478616448 (456.44 GiB 490.10 GB)
   Raid Devices : 3
  Total Devices : 3
  Preferred Minor : 1
    Persistence : Superblock is persistent

    Update Time : Fri Oct 26 11:48:40 2007
          State : clean
  Active Devices : 3
  Working Devices : 3
  Failed Devices : 0
  Spare Devices : 0

         Layout : left-symmetric
     Chunk Size : 64K

           UUID : 141d4151:1b6badaa:ac063430:591eaac6
         Events : 0.10

    Number   Major   Minor   RaidDevice State
       0       8        2        0      active sync   /dev/sda2
       1       8       18        1      active sync   /dev/sdb2
       2       8       34        2      active sync   /dev/sdc2

All 3 disks of the RAID 5 are working correctly.

Force a failure on one disk:

mdadm --manage --set-faulty /dev/md1 /dev/sdb2
  mdadm: set /dev/sdb2 faulty in /dev/md1
cat /proc/mdstat
  Personalities : [raid1] [raid6] [raid5] [raid4]
  md1 : active raid5 sda2[0] sdc2[2] sdb2[3](F)
      957232896 blocks level 5, 64k chunk, algorithm 2 [3/2] [U_U]
mdadm --detail /dev/md1
        Version : 00.90.03
  Creation Time : Thu Oct 25 21:16:03 2007
     Raid Level : raid5
     Array Size : 957232896 (912.89 GiB 980.21 GB)
    Device Size : 478616448 (456.44 GiB 490.10 GB)
   Raid Devices : 3
  Total Devices : 3
  Preferred Minor : 1
    Persistence : Superblock is persistent

    Update Time : Fri Oct 26 12:04:08 2007
          State : clean, degraded
  Active Devices : 2
  Working Devices : 2
  Failed Devices : 1
  Spare Devices : 0

         Layout : left-symmetric
     Chunk Size : 64K

           UUID : 141d4151:1b6badaa:ac063430:591eaac6
         Events : 0.16

    Number   Major   Minor   RaidDevice State
       0       8        2        0      active sync   /dev/sda2
       1       0        0        1      removed
       2       8       34        2      active sync   /dev/sdc2

       3       8       18        -      faulty spare   /dev/sdb2

In /var/log/syslog we see these lines:

  Oct 26 12:04:03 servidor kernel:  --- rd:3 wd:2 fd:1
  Oct 26 12:04:03 servidor kernel:  disk 0, o:1, dev:sda2
  Oct 26 12:04:03 servidor kernel:  disk 1, o:0, dev:sdb2
  Oct 26 12:04:03 servidor kernel:  disk 2, o:1, dev:sdc2
  Oct 26 12:04:03 servidor kernel: RAID5 conf printout:
  Oct 26 12:04:03 servidor kernel:  --- rd:3 wd:2 fd:1
  Oct 26 12:04:03 servidor kernel:  disk 0, o:1, dev:sda2
  Oct 26 12:04:03 servidor kernel:  disk 2, o:1, dev:sdc2
  Oct 26 12:04:03 servidor mdadm: Fail event detected on md device /dev/md1, component device /dev/sdb2

Remove the disk from the RAID 5. It can be hot-removed as long as it is not active in the array:

mdadm  /dev/md1 --remove /dev/sdb2
  mdadm: hot removed /dev/sdb2

It now shows as removed:

mdadm --detail /dev/md1
        Version : 00.90.03
  Creation Time : Thu Oct 25 21:16:03 2007
     Raid Level : raid5
     Array Size : 957232896 (912.89 GiB 980.21 GB)
    Device Size : 478616448 (456.44 GiB 490.10 GB)
   Raid Devices : 3
  Total Devices : 2
  Preferred Minor : 1
    Persistence : Superblock is persistent

    Update Time : Fri Oct 26 12:14:21 2007
          State : clean, degraded
  Active Devices : 2
  Working Devices : 2
  Failed Devices : 0
  Spare Devices : 0

         Layout : left-symmetric
     Chunk Size : 64K

           UUID : 141d4151:1b6badaa:ac063430:591eaac6
         Events : 0.62

    Number   Major   Minor   RaidDevice State
       0       8        2        0      active sync   /dev/sda2
       1       0        0        1      removed
       2       8       34        2      active sync   /dev/sdc2

Add it back:

mdadm  /dev/md1 -a /dev/sdb2
  mdadm: re-added /dev/sdb2
mdadm --detail /dev/md1
        Version : 00.90.03
  Creation Time : Thu Oct 25 21:16:03 2007
     Raid Level : raid5
     Array Size : 957232896 (912.89 GiB 980.21 GB)
    Device Size : 478616448 (456.44 GiB 490.10 GB)
   Raid Devices : 3
  Total Devices : 3
  Preferred Minor : 1
    Persistence : Superblock is persistent

    Update Time : Fri Oct 26 12:15:12 2007
          State : active, degraded, recovering
  Active Devices : 2
  Working Devices : 3
  Failed Devices : 0
  Spare Devices : 1

         Layout : left-symmetric
     Chunk Size : 64K

  Rebuild Status : 0% complete

           UUID : 141d4151:1b6badaa:ac063430:591eaac6
         Events : 0.65

    Number   Major   Minor   RaidDevice State
       0       8        2        0      active sync   /dev/sda2
       3       8       18        1      spare rebuilding   /dev/sdb2
       2       8       34        2      active sync   /dev/sdc2

If we run the command again:

  Rebuild Status : 18% complete

We can watch the disk being rebuilt:

cat /proc/mdstat
  Personalities : [raid1] [raid6] [raid5] [raid4]
  md1 : active raid5 sdb2[3] sda2[0] sdc2[2]
      957232896 blocks level 5, 64k chunk, algorithm 2 [3/2] [U_U]
      [>....................]  recovery =  0.8% (3914236/478616448) finish=112.2min speed=70464K/sec
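
The finish estimate on the recovery line is just the remaining blocks divided by the current speed. A small sketch of that arithmetic, assuming the counters are in KiB and the speed in KiB per second as /proc/mdstat prints them; rebuild_eta_min is an illustrative name, not an mdadm tool.

```shell
# rebuild_eta_min DONE_KIB TOTAL_KIB SPEED_KIB_PER_SEC
# Prints the remaining rebuild time in whole minutes (rounded down).
rebuild_eta_min() {
    done_kib=$1
    total_kib=$2
    speed=$3
    echo $(( (total_kib - done_kib) / speed / 60 ))
}

# Figures from the recovery line above: 3914236/478616448 at 70464K/sec.
rebuild_eta_min 3914236 478616448 70464
```

With the figures above this prints 112, in line with the finish=112.2min that the kernel reports. The estimate moves with the speed, so it is only a rough guide.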


Checking the RAID status

# cat /proc/mdstat

md1 : active raid5 sda2[0] sdc2[2] sdb2[1]
      957232896 blocks level 5, 64k chunk, algorithm 2 [3/3] [UUU]

md0 : active raid1 sda1[0] sdc1[2]
      9767424 blocks [3/2] [U_U]

unused devices: <none>
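
In this output md0 is degraded: the status map [U_U] has an underscore where a member is missing. A minimal sketch for spotting that automatically, assuming the /proc/mdstat layout shown above; degraded_arrays is a made-up helper name.

```shell
# degraded_arrays: reads /proc/mdstat-style text on stdin and prints the
# name of every array whose status map (e.g. [U_U]) contains a "_".
degraded_arrays() {
    awk '
        /^md[0-9]+ :/ { name = $1 }                 # remember current array
        name != "" && match($0, /\[[U_]+\]/) {      # line with the status map
            if (index(substr($0, RSTART, RLENGTH), "_") > 0) print name
            name = ""
        }
    '
}

# Example using the output shown above; on a live system use:
#   degraded_arrays < /proc/mdstat
degraded_arrays <<'EOF'
md1 : active raid5 sda2[0] sdc2[2] sdb2[1]
      957232896 blocks level 5, 64k chunk, algorithm 2 [3/3] [UUU]

md0 : active raid1 sda1[0] sdc1[2]
      9767424 blocks [3/2] [U_U]

unused devices: <none>
EOF
```

For the sample text this prints md0 only. mdadm can also monitor arrays itself and send mail on failures, which is probably the better choice for a permanent setup.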

Adding a disk
If we have a RAID 5 with 3 disks and want to add another one, we first add the new disk to the array.
Currently the RAID looks like this:
# cat /proc/mdstat

Personalities : [raid1] [raid6] [raid5] [raid4]
md0 : active raid1 sda1[0] sdd1[2] sdc1[1]
      9767424 blocks [3/3] [UUU]
#mdadm --detail /dev/md0
        Version : 00.90.03
  Creation Time : Thu Oct 25 21:15:28 2007
     Raid Level : raid1
     Array Size : 9767424 (9.31 GiB 10.00 GB)
    Device Size : 9767424 (9.31 GiB 10.00 GB)
   Raid Devices : 3
  Total Devices : 3
 Preferred Minor : 0
    Persistence : Superblock is persistent
    Update Time : Sat Nov  3 15:07:36 2007
          State : clean
 Active Devices : 3
 Working Devices : 3
 Failed Devices : 0
  Spare Devices : 0

           UUID : a912d356:3a213509:fb13e982:631824f5
         Events : 0.1284

    Number   Major   Minor   RaidDevice State
       0       8        1        0      active sync   /dev/sda1
       1       8       33        1      active sync   /dev/sdc1
       2       8       49        2      active sync   /dev/sdd1

Add the disk:

#mdadm /dev/md0 -a /dev/sdb1
mdadm: added /dev/sdb1

The disk now shows up as a spare:

servidor:~# mdadm --detail /dev/md0

        Version : 00.90.03
  Creation Time : Thu Oct 25 21:15:28 2007
     Raid Level : raid1
     Array Size : 9767424 (9.31 GiB 10.00 GB)
    Device Size : 9767424 (9.31 GiB 10.00 GB)
   Raid Devices : 3
  Total Devices : 4
Preferred Minor : 0
    Persistence : Superblock is persistent

    Update Time : Sat Nov  3 15:12:17 2007
          State : clean
 Active Devices : 3
 Working Devices : 4
 Failed Devices : 0
  Spare Devices : 1

           UUID : a912d356:3a213509:fb13e982:631824f5
         Events : 0.1284

    Number   Major   Minor   RaidDevice State
       0       8        1        0      active sync   /dev/sda1
       1       8       33        1      active sync   /dev/sdc1
       2       8       49        2      active sync   /dev/sdd1

       3       8       17        -      spare   /dev/sdb1

Grow the array so that it uses the new disk:

# mdadm --grow /dev/md0 --raid-devices=4

md0 : active raid1 sdb1[4] sda1[0] sdd1[2] sdc1[1]
      9767424 blocks [4/3] [UUU_]

md1 : active raid5 sda2[0] sdb2[3] sdd2[2] sdc2[1]
      957232896 blocks super 0.91 level 5, 64k chunk, algorithm 2 [4/4] [UUUU]
      [>....................]  reshape =  1.4% (7020736/478616448) finish=539.3min speed=14570K/sec
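
The reshape of md1 shown above goes from 3 to 4 member devices. As a sanity check of the capacity to expect afterwards, here is a sketch of the usual RAID 5 arithmetic (one device's worth of space goes to parity). The function name raid5_usable_kib is illustrative, and the device size is the "Device Size" value from the mdadm --detail output earlier on this page.

```shell
# raid5_usable_kib DEVICES DEVICE_SIZE_KIB
# RAID 5 stores one device's worth of parity, so usable = (n - 1) * size.
raid5_usable_kib() {
    echo $(( ($1 - 1) * $2 ))
}

raid5_usable_kib 3 478616448   # 3-disk array, as in the mdadm --detail output
raid5_usable_kib 4 478616448   # after the reshape to 4 disks
```

The first call prints 957232896, which matches the Array Size reported for the 3-disk array. The second prints 1435849344 KiB, roughly 1.34 TiB.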

To make the array use the full size of the disks:

#mdadm --grow /dev/md1 --size=max

The physical volume has not picked up the full size yet:

# pvdisplay

  --- Physical volume ---
  PV Name               /dev/md1
  VG Name               servidor
  PV Size               912.89 GB / not usable 0
  Allocatable           yes (but full)
  PE Size (KByte)       4096
  Total PE              233699
  Free PE               0
  Allocated PE          233699
  PV UUID               FWHDaX-piDe-3962-ThyA-xUoX-I49J-v2qOoF

Tell it to use all of it:
# pvresize /dev/md1

Physical volume "/dev/md1" changed
1 physical volume(s) resized / 0 physical volume(s) not resized

servidor:~# pvdisplay

  --- Physical volume ---
  PV Name               /dev/md1
  VG Name               servidor
  PV Size               1.34 TB / not usable 0
  Allocatable           yes
  PE Size (KByte)       4096
  Total PE              350549
  Free PE               116850
  Allocated PE          233699
  PV UUID               FWHDaX-piDe-3962-ThyA-xUoX-I49J-v2qOoF

First do a dry run (-t):

# lvresize -v -d -t -L +457G /dev/servidor/servidor_home

Then do it for real:

#lvresize -v -d -L +457G /dev/servidor/servidor_home
    Found volume group "servidor"
    Loading servidor-servidor_home table
    Suspending servidor-servidor_home (253:3)
    Found volume group "servidor"
    Resuming servidor-servidor_home (253:3)
  Logical volume servidor_home successfully resized

Once the logical volume has grown, enlarge the filesystem online. This requires the ext2resize package:

# ext2online /dev/servidor/servidor_home

Replacing a disk cold (powering off the server)

1. Power off the server

2. Disconnect the disk that we believe is failing

3. Boot the server

4. Enter the BIOS

5. Check that the only connected disk is set as master (in my case, in an Asus BIOS)

6. (Optional) If the disk was not connected as master, power off the server, connect the disk as master by swapping the cables, and power on again

7. Boot the server

8. Press the up/down arrow keys so that the GRUB menu is shown

9. Boot in recovery mode

10. Confirm that you want to start the degraded RAID

11. Boot normally

TODO: provide menu label

12. List the partitions on the good disk, in my case /dev/sda

sudo fdisk /dev/sda
The device presents a logical sector size that is smaller than
the physical sector size. Aligning to a physical sector (or optimal
I/O) size boundary is recommended, or performance may be impacted.

Command (m for help): p
Disk /dev/sda: 1000.2 GB, 1000204886016 bytes
255 heads, 63 sectors/track, 121601 cylinders, total 1953525168 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 4096 bytes
Disk identifier: 0x000ab48f

   Device Boot      Start         End      Blocks   Id  System
/dev/sda1   *        2048  1953523711   976760832   fd  Linux RAID autodetect

Command (m for help): 

13. Replicate the same partitioning on the new disk:

13.1. Create the partition

sudo fdisk /dev/sdb
The device presents a logical sector size that is smaller than
the physical sector size. Aligning to a physical sector (or optimal
I/O) size boundary is recommended, or performance may be impacted.

Command (m for help): p
Disk /dev/sdb: 1000.2 GB, 1000204886016 bytes
255 heads, 63 sectors/track, 121601 cylinders, total 1953525168 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 4096 bytes
Disk identifier: 0x4d1b267d

   Device Boot      Start         End      Blocks   Id  System

Command (m for help): n
Partition type:
   p   primary (0 primary, 0 extended, 4 free)
   e   extended
Select (default p): 
Using default response p
Partition number (1-4, default 1): 
Using default value 1
First sector (2048-1953525167, default 2048): 
Using default value 2048
Last sector, +sectors or +size{K,M,G} (2048-1953525167, default 1953525167): 
Using default value 1953525167

Command (m for help): w
The partition table has been altered!

Calling ioctl() to re-read partition table.
Syncing disks.

13.2. Change the partition type

sudo fdisk /dev/sdb
The device presents a logical sector size that is smaller than
the physical sector size. Aligning to a physical sector (or optimal
I/O) size boundary is recommended, or performance may be impacted.

Command (m for help): t
Selected partition 1
Hex code (type L to list codes): fd
Changed system type of partition 1 to fd (Linux RAID autodetect)

Command (m for help): w
The partition table has been altered!

Calling ioctl() to re-read partition table.
Syncing disks.

14. Check the RAID status:

cat /proc/mdstat
Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5] [raid4] [raid10] 
md0 : active raid1 sda1[1]
      976629568 blocks super 1.2 [2/1] [_U]
unused devices: <none>

15. Add the partition (in my case /dev/sdb1) to the RAID (in my case md0)

sudo mdadm -a /dev/md0 /dev/sdb1
mdadm: added /dev/sdb1

16. Watch the progress

cat /proc/mdstat
Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5] [raid4] [raid10] 
md0 : active raid1 sdb1[2] sda1[1]
      976629568 blocks super 1.2 [2/1] [_U]
      [>....................]  recovery =  0.0% (19072/976629568) finish=15349.0min speed=1059K/sec
unused devices: <none>

17. Install GRUB on the MBR of the new hard disk

sudo grub-install /dev/sdb
/usr/sbin/grub-probe: warning: Couldn't find physical volume `(null)'. Some modules may be missing from core image..
Installation finished. No error reported.

TODO: maybe wait for the rebuild to finish
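
If waiting turns out to be the right call, the wait can be scripted. This is a rough sketch, not something the original procedure did: both function names and the polling interval are made up, and it only looks for the recovery/resync/reshape progress lines that /proc/mdstat shows in step 16.

```shell
# rebuild_in_progress: reads /proc/mdstat-style text on stdin and succeeds
# (exit status 0) while any array reports a recovery, resync or reshape.
rebuild_in_progress() {
    grep -Eq '(recovery|resync|reshape) *='
}

# wait_for_rebuild [MDSTAT_FILE] [INTERVAL_SECONDS]
# Polls until no array is rebuilding. Both parameters are illustrative.
wait_for_rebuild() {
    mdstat=${1:-/proc/mdstat}
    interval=${2:-60}
    while rebuild_in_progress < "$mdstat"; do
        sleep "$interval"
    done
}

# Typical use after step 15 (not run here):
#   wait_for_rebuild && sudo grub-install /dev/sdb
```

mdadm also has a --wait option that blocks until resync, recovery or reshape activity on the given array has finished, which may be simpler than polling.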


incrementally starting raid arrays

This happened to me when 1 of the 2 disks of a RAID 0 failed and I tried to boot from one of the disks.

The boot enters an infinite loop and never completes:

incrementally starting raid arrays
mdadm: Create user root not found 
mdadm: create group disk not found 
incrementally started raid arrays

Cause: I had moved the disk to a different connector on the motherboard. That is, it used to be the master disk and I connected it to the cables that made it the slave.

1. Put the disk back in the "position" it previously occupied in the RAID. This is basically trial and error: power off the server, rearrange the cables and power on. If it does not boot, power off again, swap the cables, and so on.

2. Once it has booted:

sudo su
echo mpt2sas >>/etc/initramfs-tools/modules
# regenerate the initramfs so the module is actually included in it
update-initramfs -u

3. Power off the server. The disk can now be connected in any position and the system will boot.
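
Running the echo from step 2 more than once leaves duplicate lines in the modules file. A small sketch of an idempotent variant follows; the helper name add_initramfs_module is hypothetical, and the demonstration writes to a scratch file rather than to /etc. On a real system the change only takes effect once the initramfs is regenerated with update-initramfs -u.

```shell
# add_initramfs_module MODULE [MODULES_FILE]
# Appends MODULE to the file only if it is not already listed.
add_initramfs_module() {
    module=$1
    file=${2:-/etc/initramfs-tools/modules}
    grep -qxF "$module" "$file" 2>/dev/null || echo "$module" >> "$file"
}

# Demonstration on a scratch file, so nothing under /etc is touched.
tmp=$(mktemp)
add_initramfs_module mpt2sas "$tmp"
add_initramfs_module mpt2sas "$tmp"   # second call is a no-op
cat "$tmp"
rm -f "$tmp"

# On the real system (as root):
#   add_initramfs_module mpt2sas && update-initramfs -u
```

The scratch file ends up with a single mpt2sas line even though the helper is called twice.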

Recovering the RAID

# mdadm --detail --scan
ARRAY /dev/md/0 metadata=1.2 name=proxmox01:0 UUID=ccc2aadb:7808895c:d4339489:b9b0e569
ARRAY /dev/md/2 metadata=1.2 name=proxmox01:2 UUID=bc05007c:2855e55e:45d0671c:86e24616
ARRAY /dev/md/1 metadata=1.2 name=proxmox01:1 UUID=4b92d787:c7f43fb7:ab52584c:c6207c8a