Blog - Michael's Musings

Sun, 05 Feb 2012

LVM mirroring: the right way

LVM now supports mirroring inside LVM itself, rather than requiring that you put mirrors underneath LVM physical volumes. This provides much more flexibility: some volumes can be mirrored and some not (such as swap partitions), and different RAID algorithms can be used. LVM uses the same underlying mechanisms as the Linux RAID system (mdadm) to do the RAID operations, so there is no change in overall performance.

Lucas and I learnt on the Hydra project that creating a mirror as follows:

lvconvert -m 1 --corelog /dev/nv0/time1root

or at lvcreate time:

lvcreate -L 4G --name time1root -m 1 --corelog --nosync /dev/nv0

while it works, produces a mirror that keeps its mirror log (the record of which regions are in sync) in memory only. Should the machine reboot in an uncontrolled way, that log is lost, so the mirror will be marked as bad and rebuilt in order to validate the meta-data.
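To see which existing mirrors are affected, one option is to ask lvs for each volume's log device. The sketch below is a minimal filter, assuming that a mirror created with --corelog reports an empty mirror_log field. The function name list_corelog_mirrors is made up for illustration, and it expects colon-separated lvs output on stdin.

```shell
# Hypothetical helper: print the device path of every mirror LV with no log LV.
# Expects input from:
#   lvs --noheadings --separator : -o vg_name,lv_name,segtype,mirror_log
# Assumption: a core-log mirror shows an empty mirror_log field.
list_corelog_mirrors() {
    awk -F: '
        {
            # lvs indents each row, so trim the first and last fields
            gsub(/^[ \t]+|[ \t]+$/, "", $1)
            gsub(/^[ \t]+|[ \t]+$/, "", $4)
        }
        $3 == "mirror" && $4 == "" { print "/dev/" $1 "/" $2 }
    '
}
```

It could be used as: lvs --noheadings --separator : -o vg_name,lv_name,segtype,mirror_log | list_corelog_mirrors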

On a machine with VMs running (nvxen-0, crtlXX), the mirrors can take hours to rebuild after a reboot. The correct answer, it turns out, is to use --mirrorlog mirrored, together with an option that lets the mirror logs be placed anywhere:

lvconvert -m 1 --mirrorlog mirrored --alloc anywhere /dev/nv0/time1root

The allocation policy of "anywhere" permits the two 4M mirror logs (4M is the minimum allocation that LVM can do) to be kept on the same disks as the data they are mirroring. Otherwise, if you have only two physical volumes, there is nowhere to put the logs, because the default policy (which I think is wrong) insists that the mirror logs go on different volumes than the data. (I don't know why this is necessary.)

Converting between the two is a pain: the only way I found to do this is to remove the mirroring and then re-create it.

ionice -c3 lvconvert -m 0 /dev/nv0/time1root
ionice -c3 lvconvert -m 1 --mirrorlog mirrored --alloc anywhere /dev/nv0/time1root

I wrote a script to process the output of lvs and do this. The ionice -c3 puts the process in the idle I/O class, so it stays in the background and does not chew up I/O.
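A minimal sketch of such a script might look like the following. It is not the original script: it assumes lvs reports mirrored volumes with a segtype of mirror, and the function name gen_convert_cmds is hypothetical. It only prints the commands rather than running them, so the list can be reviewed first.

```shell
# Hypothetical helper: emit the two lvconvert commands for each mirrored LV.
# Expects "vg lv segtype" rows on stdin, such as produced by:
#   lvs --noheadings -o vg_name,lv_name,segtype
gen_convert_cmds() {
    while read -r vg lv segtype; do
        # Skip anything that is not a mirror (linear, striped, ...)
        [ "$segtype" = "mirror" ] || continue
        # Drop the mirror, then re-create it with a mirrored on-disk log
        echo "ionice -c3 lvconvert -m 0 /dev/$vg/$lv"
        echo "ionice -c3 lvconvert -m 1 --mirrorlog mirrored --alloc anywhere /dev/$vg/$lv"
    done
}
```

Once the printed commands look right, piping them into sh runs the conversions one volume at a time: lvs --noheadings -o vg_name,lv_name,segtype | gen_convert_cmds | sh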

On the fresh boot after the crash, however, you may find your system is almost completely unresponsive as it tries to resync dozens of mirrors. (On that, /dev/md0-style RAID devices get it right.) How to fix: find the kcopyd kernel processes and run ionice on them:

ps ax | grep '[k]copyd' | awk '{print $1}' | while read pid; do sudo ionice -c3 -p "$pid"; done

Once you have done this, you can get in long enough to run the lvconvert. I suggest you remove all the mirrors first (-m 0), as that stops the current resync from getting in the way of the resync you will have to do anyway.
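To check how far a resync has progressed, the copy_percent column of lvs can be filtered for volumes still below 100. This is only a small sketch: the function name list_resyncing is hypothetical, and it assumes percentages are printed with a decimal point.

```shell
# Hypothetical helper: list LVs whose mirror copy is not yet complete.
# Expects input from:
#   lvs --noheadings -o lv_name,copy_percent
# Non-mirrored LVs have an empty copy_percent column and are skipped.
list_resyncing() {
    awk 'NF >= 2 && $2 + 0 < 100 { print $1, $2 }'
}
```

For example: lvs --noheadings -o lv_name,copy_percent | list_resyncing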