Recovery of RAID and LVM2 Volumes

April 28th, 2006 by Richard Bullington-McGuire

RAID and logical volume managers are great, until you lose data.

The combination of Linux software RAID (Redundant Array of Inexpensive Disks) and LVM2 (Logical Volume Manager, version 2) in modern Linux operating systems offers both robustness and flexibility, but at the cost of complexity should you ever need to recover data from a drive formatted with software RAID and LVM2 partitions. I found this out the hard way when I recently tried to mount a system disk created with RAID and LVM2 on a different computer. The first attempts to read the filesystems on the disk failed in a frustrating manner.

I had attempted to put two hard disks into a small-form-factor computer that was designed to hold only one hard disk, running the disks as a mirrored RAID 1 volume. (I refer to that system as raidbox for the remainder of this article.) This attempt did not work, alas. After running for a few hours, it would power off with an automatic thermal shutdown. I already had taken the system apart and started re-installing with only one disk when I realized there were some files on the old RAID volume that I wanted to retrieve.

Recovering the data would have been easy if the system did not use RAID or LVM2. The steps would have been to connect the old drive to another computer, mount the filesystem and copy the files from the failed volume. I first attempted to do so, using a computer I refer to as recoverybox, but this attempt met with frustration.

Why Was This So Hard?

Getting to the data proved challenging, both because the data was on a logical volume hidden inside a RAID device, and because the volume group on the RAID device had the same name as the volume group on the recovery system.

Some popular modern operating systems (for example, Red Hat Enterprise Linux 4, CentOS 4 and Fedora Core 4) can partition the disk automatically at install time, setting up the partitions using LVM for the root device. Generally, they set up a volume group called VolGroup00, with two logical volumes, LogVol00 and LogVol01, the first for the root directory and the second for swap, as shown in Listing 1.

The original configuration for the software RAID device had three RAID 1 devices: md0, md1 and md2, for /boot, swap and /, respectively. The LVM2 volume group was on the biggest RAID device, md2. The volume group was named VolGroup00. This seemed like a good idea at the time, because it meant that the partitioning configuration for this box looked similar to how the distribution does things by default. Listing 2 shows how the software RAID array looked while it was operational.

If you ever name two volume groups the same thing, and something goes wrong, you may be faced with the same problem. Creating conflicting names is easy to do, unfortunately, as the operating system has a default primary volume group name of VolGroup00.

Restoring Access to the RAID Array Members

To recover, the first thing to do is to move the drive to another machine. You can do this pretty easily by putting the drive in a USB2 hard drive enclosure. It then will show up as a SCSI hard disk device, for example, /dev/sda, when you plug it into your recovery computer. This reduces the risk of damaging the recovery machine while attempting to install the hardware from the original computer.

The challenge then is to get the RAID setup recognized and to gain access to the logical volumes within. You can use sfdisk -l /dev/sda to check that the partitions on the old drive are still there.

To get the RAID setup recognized, use mdadm to scan the devices for their RAID volume UUID signatures, as shown in Listing 3.

This format is very close to the format of the /etc/mdadm.conf file that the mdadm tool uses. You need to redirect the output of mdadm to a file, join the device lines onto the ARRAY lines and put in a nonexistent second device to get a RAID 1 configuration. Running the md array in degraded mode will allow data recovery:

[root@recoverybox ~]# mdadm --examine --scan /dev/sda1 /dev/sda2 /dev/sda3 >> /etc/mdadm.conf
[root@recoverybox ~]# vi /etc/mdadm.conf

Edit /etc/mdadm.conf so that the devices statements are on the same lines as the ARRAY statements, as they are in Listing 4. Add the “missing” device to the devices entry for each array member to fill out the raid1 complement of two devices per array. Don't forget to renumber the md entries if the recovery computer already has md devices and ARRAY statements in /etc/mdadm.conf.
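The hand edit described above can also be scripted. What follows is a minimal sketch rather than the procedure used for this article. It assumes the scan output puts each devices= entry on an indented line directly under its ARRAY line; the helper name join_array_lines and the all-zero UUID are made up for illustration.

```shell
#!/bin/sh
# Join each indented "devices=" line onto the ARRAY line before it and
# append ",missing" so a two-disk RAID 1 can be started with one member.
# join_array_lines is a hypothetical helper; it reads scan output on stdin.
join_array_lines() {
    awk '
        /^ARRAY/ {
            if (line != "") print line      # flush an ARRAY that had no devices= line
            line = $0
            next
        }
        /^[ \t]+devices=/ && line != "" {
            sub(/^[ \t]+/, "")              # strip the leading indentation
            print line " " $0 ",missing"
            line = ""
            next
        }
        {
            if (line != "") { print line; line = "" }
            print                           # pass any other line through unchanged
        }
        END { if (line != "") print line }
    '
}

# Demo on sample text (the UUID is a placeholder, not a real array).
printf '%s\n' \
    'ARRAY /dev/md2 level=raid1 num-devices=2 UUID=00000000:00000000:00000000:00000000' \
    '   devices=/dev/sda3' | join_array_lines
```

Writing the result to a scratch file and reviewing it before touching /etc/mdadm.conf keeps a mistake in the script from affecting arrays the recovery machine already uses.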

Then, activate the new md devices with mdadm -A -s, and check /proc/mdstat to verify that the RAID array is active. Listing 5 shows how the RAID array should look.

If md devices show up in /proc/mdstat, all is well, and you can move on to getting the LVM volumes mounted again.
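The check against /proc/mdstat can be reduced to a one-line filter. This is a small sketch, assuming the usual "mdN : active ..." line layout; the helper name active_md_devices and the sample text are illustrative only.

```shell
#!/bin/sh
# List md devices reported as active in mdstat-formatted text.
# active_md_devices is a hypothetical helper; pass a file name, or omit it
# to read the real /proc/mdstat.
active_md_devices() {
    awk '$1 ~ /^md[0-9]+$/ && $2 == ":" && $3 == "active" { print $1 }' "${1:-/proc/mdstat}"
}

# Demo on sample text shaped like a degraded RAID 1 entry.
sample=$(mktemp)
cat > "$sample" <<'EOF'
Personalities : [raid1]
md2 : active raid1 sda3[1]
      77521536 blocks [2/1] [_U]
unused devices: <none>
EOF
active_md_devices "$sample"
rm -f "$sample"
```

An empty result means no array came up, in which case the ARRAY lines in /etc/mdadm.conf are the first thing to recheck.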

Recovering and Renaming the LVM2 Volume

The next hurdle is that the system now has two sets of LVM2 disks with VolGroup00 on them. Typically, the vgchange -a y command would allow LVM2 to recognize a new volume group. That won't work if devices containing identical volume group names are present, though. Issuing vgchange -a y will report that VolGroup00 is inconsistent, and the VolGroup00 on the RAID device will be invisible. To fix this, you need to rename the volume group that you are about to mount on the system by hand-editing its LVM configuration file.

If you made a backup of the files in /etc on raidbox, you can edit a copy of the file /etc/lvm/backup/VolGroup00 so that it reads VolGroup01, RestoreVG or whatever name you want the volume group to have on the recovery system. Rename the copy, and also change the volume group name inside the file itself.
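The rename inside the backup file is a one-line substitution. The sketch below assumes the volume group block opens with the group name at the start of a line followed by " {", as it does in the corrected Listing 6 given in the comments; rename_vg_in_backup is a hypothetical helper name, and the stub file stands in for a real backup.

```shell
#!/bin/sh
# Rename the volume group block in a COPY of an LVM2 backup file.
# rename_vg_in_backup is a hypothetical helper: arguments are old name,
# new name, source file, destination file. Only the top-level "<name> {"
# line is changed; the source file is left untouched.
rename_vg_in_backup() {
    sed "s/^$1 {/$2 {/" "$3" > "$4"
}

# Demo on a stub file that has only the lines that matter here.
src=$(mktemp)
dst=$(mktemp)
printf '%s\n' 'contents = "Text Format Volume Group"' 'VolGroup00 {' 'seqno = 1' '}' > "$src"
rename_vg_in_backup VolGroup00 VolGroup01 "$src" "$dst"
grep -n '{' "$dst"
rm -f "$src" "$dst"
```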

If you don't have a backup, you can re-create the equivalent of an LVM2 backup file by examining the LVM2 header on the disk and editing out the binary stuff. LVM2 typically keeps copies of the metadata configuration at the beginning of the disk, in the first 255 sectors following the partition table in sector 1 of the disk. See /etc/lvm/lvm.conf and man lvm.conf for more details. Because each disk sector is typically 512 bytes, reading this area will yield a file of just under 128KB (255 × 512 = 130,560 bytes). LVM2 may have stored several different text representations of the configuration in this area. Extract these to an ordinary file as follows, then edit the file:

dd if=/dev/md2 bs=512 count=255 skip=1 of=/tmp/md2-raw-start
vi /tmp/md2-raw-start

You will see some binary gibberish, but look for the bits of plain text. LVM treats this metadata area as a ring buffer, so there may be multiple configuration entries on the disk. On my disk, the first entry had only the details for the physical volume and volume group, and the next entry had the logical volume information. Look for the block of text with the most recent timestamp, and edit out everything except the block of plain text that contains LVM declarations. This has the volume group declarations that include logical volume information. Fix up physical device declarations if needed. If in doubt, look at the existing /etc/lvm/backup/VolGroup00 file to see what is there. On disk, the text entries are not as nicely formatted and are in a different order than in the normal backup file, but they will do. Save the trimmed configuration as VolGroup01. This file should then look like Listing 6.
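Before opening the raw dump in an editor, it can help to filter out the binary bytes so that only the text blocks remain. This is a rough sketch of one way to do that, not part of the original procedure; printable_text is a hypothetical helper name, and the demo builds a tiny fake dump instead of reading a real disk.

```shell
#!/bin/sh
# Print only the runs of printable text from a raw metadata dump, one run
# per line, so the LVM2 text blocks are easier to spot.
# printable_text is a hypothetical helper; it takes the dump file name.
printable_text() {
    # Turn every byte that is not printable, a tab or a newline into a
    # newline, then drop the empty lines that this leaves behind.
    LC_ALL=C tr -c '[:print:]\t\n' '\n' < "$1" | LC_ALL=C grep -v '^[[:space:]]*$'
}

# Demo on a small fake dump: NUL and control bytes around a text block.
raw=$(mktemp)
printf 'LABELONE\000\000\001\002VolGroup00 {\nseqno = 1\n}\000\000' > "$raw"
printable_text "$raw"
rm -f "$raw"
```

The filtered output still contains every entry in the ring buffer, so choosing the most recent block and trimming it remains a manual step.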

Once you have a volume group configuration file, migrate the volume group to this system with vgcfgrestore, as Listing 7 shows.

At this point, you can mount the old volume on the new system and gain access to the files within, as shown in Listing 8.

Now that you have access to your data, a prudent final step would be to back up the volume group information with vgcfgbackup, as Listing 9 shows.
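The restore, mount and backup steps can be strung together as follows. This is a sketch under stated assumptions, not a copy of the article's listings: the run helper, the DRY_RUN variable, and the names VolGroup01, LogVol00 and /mnt/recovered are illustrative, and the read-only mount is an extra precaution that the article does not mention. By default the script only prints the commands it would run.

```shell
#!/bin/sh
# Print, or with DRY_RUN=0 actually run, the restore-and-mount sequence.
# run, DRY_RUN and the default names below are assumptions for illustration.
run() {
    if [ "${DRY_RUN:-1}" = 1 ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Arguments: volume group name, edited config file, logical volume, mount point.
restore_and_mount() {
    vg=$1
    cfg=$2
    lv=$3
    mnt=$4
    run vgcfgrestore -f "$cfg" "$vg" &&       # load the edited metadata
    run vgscan &&                             # rescan so the new name is seen
    run vgchange -a y "$vg" &&                # activate its logical volumes
    run mkdir -p "$mnt" &&
    run mount -o ro "/dev/$vg/$lv" "$mnt" &&  # read-only: an added precaution
    run vgcfgbackup "$vg"                     # save a fresh metadata backup
}

restore_and_mount VolGroup01 ./VolGroup01 LogVol00 /mnt/recovered
```

Reading through the printed commands first, and only then setting DRY_RUN=0 as root, gives a chance to catch a wrong volume group name before anything is written to disk.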

Conclusion

LVM2 and Linux software RAID make it possible to create economical, reliable storage solutions with commodity hardware. One trade-off involved is that some procedures for recovering from failure situations may not be clear. A tool that reliably extracted old volume group information directly from the disk would make recovery easier. Fortunately, the designers of the LVM2 system had the wisdom to keep plain-text backup copies of the configuration on the disk itself. With a little patience and some research, I was able to regain access to the logical volume I thought was lost; may you have as much success with your LVM2 and RAID installation.

Resources for this article: /article/8948.

Richard Bullington-McGuire is the Managing Partner of PKR Internet, LLC, a software and systems consulting firm in Arlington, Virginia, specializing in Linux, Open Source and Java. He has been a Linux sysadmin since 1994. You can reach him at rbulling@pkrinternet.com.

__________________________


Thanks, this helped a lot.

On May 6th, 2008 cactuz (not verified) says:

The system-disk in my file-server just crashed the other day and as I was installing it all over again tonight I had no problem figuring out how to mount my old LVM's that wasn't in RAID-configuration.... but then my precious backup-volume for documents/photos I was using RAID and couldn't figure out how to get it back online until I googled and found this article.

So thank you for saving me many hours of work!

Excellent Article - Another system saved!

On May 11th, 2008 Anonymous (not verified) says:

Excellent stuff! I had attached two VolGroup00s to one system. Realizing that I could not access the data on the second one, I removed it following instructions at http://www.linuxtopia.org/online_books/linux_lvm_guide/removepvsfromvg.html. OOPS! This article saved me.

Another "life" saved!

On January 15th, 2008 Michael Hill (not verified) says:

While juggling drives and trying to fix an annoying boot problem, I managed to overwrite the MBR of one of the drives. I had unwisely chosen to use the entire device as an LVM PV (instead of a partition spanning the whole drive), so that whacked the PV metadata. Many thanks to Richard for writing the original article, and in particular to Toby Fruth, whose reply led me through the steps to recover my PV and all the LVs on it. I was fortunate in not having to reconstruct the VG config file from raw sectors; LVM made backup copies of the VG configs every time I made a change, so I had a recent backup copy at hand.

Thanks again for helping me recover access to my data!

vgrename with UUID?

On November 6th, 2007 Elrond (not verified) says:

Many thanks for the insight into LVM2's internal working for metadata. I always like to have an idea how stuff is laid out on disk, so I can *worst case* do dmsetup myself.

The subject mainly gives my question:

I found in vgrename(8), that it seems to support vgrename VG-UUID NewName. This looks like the perfect way to rename conflicting VGnames. Did anyone try this?
(Yes, modulo all the MD-trouble)

Thanks so much

On September 11th, 2007 JR Peck (not verified) says:

I've spent this morning trying to mount a 2.5" drive from a failed laptop that I had placed in a USB enclosure. No joy until this article got me going, and I can't say how much I appreciate it.

You are welcome

On September 14th, 2007 Richard Bullington-McGuire says:

I am glad you were able to retrieve your data.

Many of the people who left comments on this article had helpful suggestions that are even more simple than the methods I outlined in the article.

THANKS!

On August 24th, 2007 Anonymous (not verified) says:

I ran into this issue while trying to recover my original drive for my home server, and after a few Google searches I found this thread. Thank you so much for the excellent explanation of this issue!

I took a different approach, however, once I understood what was in conflict. I popped in a new drive that I wanted to recover the data to, and just reinstalled my OS but did NOT use LVM this time. Just a good old-fashioned swap/boot/root partition scheme. I then re-ran linux rescue, which mounted the old filesystem easily, and it was very easy for me to mount the secondary disk. Copied all files over, and put back my configs. All said and done, just a couple hours and my system is back up to normal after a drive failure. AWESOME!

It sounds as if you have another good solution to the basic problem, as long as RAID is not involved.

LVM recovery

On August 17th, 2007 srinivas Chamarthi (not verified) says:

hey! thanks a lot for clearing the confusion regarding recovery of LVM2 volumes. I got a successful recovery! hats off for u

Am i missing something?

On July 15th, 2007 Anonymous (not verified) says:

After you installed the failed raid disk into the recovery box (or hooked it up via usb), couldn't you have booted the recovery box with a Live CD and simply mounted only the drive partitions you needed?

In otherwords, just don't mount the drive in the recovery box that had the equivalent vol group. That way there would have been no conflict right?

If i understand the problem correctly, the problem is NOT that the raid drive does NOT HAVE AN LVM CONFIG (or that it was damaged), it's just that it's the SAME as the recovery boxes LVM config (e.g. has the same volgroup name) which prevents it from being seen (i think?)

Another way of asking the question is this. If the recovery box did NOT have any LVM partitions or LVM config native to it.. could i simply plug the raid drive in and the recovery box would automagically find the raid LVM partitions or would I still have to something else to make it work? If I have to do something else to make it work, i'd totally appreciate it if you could explain what i would need to do (either a subset of the above article steps or just a streamlined set of guidelines).

That would fully help me understand this topic completely because i imagine at some point, if i have a system just like this, i'm going to need to recover it some day. And it would be pretty easy for me to NOT use LVM on the target recovery box.

thanks

> ... couldn't you have booted the recovery box with a Live CD and simply mounted only the drive partitions you needed?

That was what I was originally hoping to do, but that did not work automatically. RAID arrays on USB-connected drives are not available to the system when it does its first scan for RAID arrays. Also, if the recovery box has a volume group with the same name, it will not recognize the newly-attached volume group.

I have used USB RAID arrays in production, and you have to take some extra steps to activate them late in the boot process. I typically use a script similar to this to do the job:


#!/bin/sh
#
# Mount a USB raid array
#
# Call from /etc/rc.d/rc.local

DEVICE=/dev/ExampleVolGroup/ExampleVol00
MOUNTPOINT=/mnt/ExampleVol00

# Activate the array. This assumes that /etc/mdadm.conf has an entry for it already
/sbin/mdadm -A -s
# Look for LVM2 volume groups on all connected partitions, including the array
/sbin/vgscan --mknodes
# Activate all LVM partitions, including that on the array
/sbin/vgchange -a y
# Make sure to fsck the device so it stays healthy long-term
fsck -T -a "$DEVICE"
mount "$DEVICE" "$MOUNTPOINT"

> In otherwords, just don't mount the drive in the recovery box that had the equivalent vol group. That way there would have been no conflict right?

That's mostly right. You'd still need to scan for the RAID arrays with 'mdadm --examine --scan $MYDEVICENAME' , then activate them after creating /etc/mdadm.conf.

If you had other md software RAID devices on the system, you might have to fix up the device numbering on the md devices.

> If the recovery box did NOT have any LVM partitions or LVM config native to it.. could i simply plug the raid drive in and the recovery box would automagically find the raid LVM partitions or would I still have to something else to make it work?

On a recovery box without any software RAID or LVM configuration, if you plugged the RAID drive directly into the IDE or SATA connector, it might automagically find the RAID array and LVM volume. I have not done that particular experiment, you might try it and let me know how it goes.

If the drive was attached to the recovery box using a USB enclosure, the RAID and LVM configurations probably won't be autodetected during the early boot stages, and you'll almost certainly have to do a scan / activate procedure on both the RAID and LVM layers.

You might have to scan for RAID partitions, build an /etc/mdadm.conf file, and then scan for volume groups and activate them in either case.

The most difficult part of the recovery outlined in the article was pulling the LVM configuration out of the on-disk ring buffer. You can avoid that by making sure you have a backup of the LVM configuration for that machine stored elsewhere.

LVM, This Article, the Author, and Success!

On June 3rd, 2007 Toby Fruth (not verified) says:

I emailed Mr. Bullington-McGuire, for I had created a self-inflicted dilemma. I had run the following command:

pvremove /dev/sdb2 -f

Why? Because I thought I needed to remove LVM data from a drive in order to mount it under a new install, which had been on a different drive. I could have done it this way:

mount /dev/VolGroup00/LogVol00 /mnt

assuming that another LogVol00 was not already mounted and that a /dev/VolGroup00/LogVol00 did not already exist. Of course, they originally did exist under the new install on the new drive, so I did another new install, using different LVM names on the new drive.

So, I managed to recover from the pvremove by doing a pvcreate, using a restore file created with the instructions in this article.

lvm> pvcreate --restorefile /tmp/VolGroup00 --uuid O3tLZO-ZvUq-oggv-yuIZ-kEtv-eAMi-zgN0aB /dev/sdb2
Couldn't find device with uuid 'O3tLZO-ZvUq-oggv-yuIZ-kEtv-eAMi-zgN0aB'.
Physical volume "/dev/sdb2" successfully created

lvm> vgcfgrestore --file /tmp/VolGroup00 VolGroup00
Restored volume group VolGroup00

Once this was done, I was able to use the mount command I listed earlier in this post to mount up my old drive's LVM group.

After the mount command, I issued the following commands:

df -h

ls -l /mnt

I can now see all my old data, which I am promptly copying to the new drive, as soon as I make a backup of the LVM data!

Glad to help

On June 22nd, 2007 Richard Bullington-McGuire (not verified) says:

Thank you for contacting me regarding your problem. I am glad you managed to recover your data. It looks as if the procedure I sent you worked.

:0) Saved my marriage!

On March 2nd, 2007 Anonymous (not verified) says:

just making a backup and poof the power goes :( on reboot i can't get to my lvm and my 50gig backup is awol!!
your ickle guide saved my life as the missus sims2 data was on there and its more then my life is worth to lose that

Another day, another marriage saved

On June 22nd, 2007 Richard Bullington-McGuire (not verified) says:

Thank you for your kind words. I am glad you were able to recover.

Bacon saved!

On February 17th, 2007 Jason (not verified) says:

Just another saved me comment! Thanks! I thought I was hosed, but this article pointed me in the direction I needed to go to recover my essential data. Yes, yes, I do backups, monthly and archive media every 6 months, but now, I have learned: always RAID1 or RAID5, no LVM, test UPS control regularly, and invest in large external eSATA/USB/Firewire drives to do nightly incrementals and keep them unmounted when not in use.

Oh, and avoid drawing power from BG&E if at all possible...they suck.

You are welcome

On June 22nd, 2007 Richard Bullington-McGuire (not verified) says:

If you live in the Mid-Atlantic as I do, your real enemy may be the trees.

Thanks

On December 10th, 2006 Anonymous (not verified) says:

Count another life saved. In spite of destroying one HD out of a two HD LVM set, we still recovered some data thanks to these tips. Not too shabby.

Alternate recovery method

On November 5th, 2006 Garth Webb (not verified) says:

In my situation I did not have any explicit RAID arrays. I just had the standard RedHat FC5 configuration of a single VolGroup00 volume group with two logical volumes. I pulled that 20GB drive from my system recently and installed a 320GB drive in its place and reinstalled FC5. Because this new drive had the same VolGroup00 volume group created, I could not mount the 20 GB drive I had in a USB enclosure.

It seemed that most of this article was aimed at teasing out the LVM metadata and rewriting it to effect a volume group name change. Since all of the LVM tools require that you address the volume groups on your physical drives by their name, you encounter the naming conflict (how hard would it have been to include a rename command that took a physical path and renamed the volume there?).

Rather than fight that battle, I booted off my FC5 rescue CD (or any bootable tools CD with the LVM tools on it) and did a:

vgrename VolGroup00 Seagate320

The naming conflict didn't really matter here it seems. It just renamed whatever VolGroup00 it found first, which happened to be my new 320 G drive. I could then activate both with:

vgchange -ay

and then mount the volumes and copy, etc.

Nice recovery method

On November 22nd, 2006 rbulling says:

This looks like a much simpler way to do things, as long as you are not dealing with software RAID.

One thing you'd need to be careful about is making sure that you leave the new VolGroup00 named VolGroup00 at the end of the recovery process.

I suspect that using vgrename / vgscan / vgchange in combination would allow you to rename both volume groups to something else, then rename the newest VolGroup00 back to VolGroup00, so that the system would continue to work on boot.

You could probably use the same technique after you recovered the RAID configuration, and avoid the messy surgery on raw disk information. Next time I encounter this issue, I'll give that a try.
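As a sketch of how that might look: vgrename(8) documents that the old volume group may be given by UUID instead of by name, which sidesteps the duplicate-name problem. The following is untested against the setup in this article. The run helper, the DRY_RUN variable, the function name rename_vg_by_uuid and the placeholder UUID are all assumptions for illustration, and by default the commands are only printed.

```shell
#!/bin/sh
# Print, or with DRY_RUN=0 actually run, a rename of a volume group that is
# selected by UUID rather than by its (conflicting) name.
# run, DRY_RUN and rename_vg_by_uuid are hypothetical names.
run() {
    if [ "${DRY_RUN:-1}" = 1 ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Arguments: UUID of the volume group to rename, new volume group name.
rename_vg_by_uuid() {
    run vgrename "$1" "$2" &&   # old VG given by UUID, as vgrename(8) allows
    run vgscan &&               # rescan so both groups are seen under distinct names
    run vgchange -a y "$2"      # activate the renamed group
}

# The UUID below is a placeholder; a real one would come from the LVM tools,
# for example from vgdisplay or from vgs -o vg_name,vg_uuid.
rename_vg_by_uuid XXXXXX-XXXX-XXXX-XXXX-XXXX-XXXX-XXXXXX RecoveredVG
```

Renaming the group that came from the old disk, rather than the one the recovery system boots from, avoids having to rename anything back afterward.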

Thanks!

On October 20th, 2006 Juan (not verified) says:

This article saved my day!

Thanks!

On October 6th, 2006 Anders Båtstrand (not verified) says:

This worked great for me. Thanks for putting it together!

Doesn't quite work for me

On September 21st, 2006 jweage (not verified) says:

I just ran into a similar problem attempting to move a disk from one machine to another, with both disks configured as VolGroup00. I worked through your example, but when it came to restoring VolGroup01 (Listing 6), vgcfgrestore refused claiming it couldn't find a contents line. In my dump, there are 5 additional header lines before the VolGroup01 { line, which vgcfgrestore requires.

After I figured this out and restored the volume group, I could not get any logical volumes to show up. lvscan did not pick up the three logical volumes on the volume group! Those were also in the dd extracted file, so I had to add all of that back into the config file and do another vgcfgrestore and vgchange -a y.

This is really disconcerting, as this is likely to be a common problem. Unfortunately it seems that LVM is NOT the way to go for the typical workstation, unless someone really needs the ability to resize a volume.

Correction to Listing 6

On November 22nd, 2006 rbulling says:

It appears that Listing 6 got truncated somewhere along the line before publication.

The full Listing 6 should be:


Listing 6: Modified Volume Group Configuration File

VolGroup01 {
id = "xQZqTG-V4wn-DLeQ-bJ0J-GEHB-4teF-A4PPBv"
seqno = 1
status = ["RESIZEABLE", "READ", "WRITE"]
extent_size = 65536
max_lv = 0
max_pv = 0

physical_volumes {

pv0 {
id = "tRACEy-cstP-kk18-zQFZ-ErG5-QAIV-YqHItA"
device = "/dev/md2"

status = ["ALLOCATABLE"]
pe_start = 384
pe_count = 2365
}
}

# Generated by LVM2: Sun Feb 5 22:57:19 2006
logical_volumes {

LogVol00 {
id = "i17qXJ-Blzu-u1Dr-bSlR-0kNC-yuBH-lnbkSi"
status = ["READ", "WRITE", "VISIBLE"]
segment_count = 1

segment1 {
start_extent = 0
extent_count = 2364

type = "striped"
stripe_count = 1 # linear

stripes = [
"pv0", 0
]
}
}
}
}

contents = "Text Format Volume Group"
version = 1

description = ""

creation_host = "localhost.localdomain" # Linux localhost.localdomain 2.6.9-11.EL #1 Wed Jun 8 20:20:13 CDT 2005 i686
creation_time = 1139180239 # Sun Feb 5 22:57:19 2006

what if the machine your

On August 16th, 2006 dave (not verified) says:

what if the machine you're using for recovery has raid itself?? when you append to mdadm.conf can md0,1,2 be renumbered to 3,4,5?

You should be able to do that without any problems, as long as you explicitly keep the UUID signature in the renamed device line.
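A minimal sketch of such a renumbering, assuming ARRAY lines in the /etc/mdadm.conf style with the device name as the second field: it shifts each md number by a fixed offset and leaves the UUID and everything else on the line alone. The helper name renumber_arrays and the sample line are made up for illustration.

```shell
#!/bin/sh
# Shift the md numbers on ARRAY lines by a fixed offset, leaving the UUID
# and the rest of each line untouched.
# renumber_arrays is a hypothetical helper: offset is $1, input on stdin.
renumber_arrays() {
    awk -v off="$1" '
        $1 == "ARRAY" && $2 ~ /^\/dev\/md[0-9]+$/ {
            n = $2
            sub(/^\/dev\/md/, "", n)                      # keep only the number
            sub(/\/dev\/md[0-9]+/, "/dev/md" (n + off))   # rewrite the first md name
        }
        { print }
    '
}

# Demo: md2 becomes md5 with an offset of 3 (the UUID is a placeholder).
printf '%s\n' \
    'ARRAY /dev/md2 level=raid1 num-devices=2 UUID=00000000:00000000:00000000:00000000 devices=/dev/sda3,missing' \
    | renumber_arrays 3
```

The offset should be chosen so that none of the new numbers collide with md devices the recovery machine already has.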

Experienced this exact

On September 13th, 2006 Neekofab (not verified) says:

Experienced this exact problem. moved a md0/md1 disk to a recovery workstation that already had an md0/md1 device. they could not coexist, and I could not find a way to move the additional md0/md1 devices to md2/md3. I ended up disconnecting the system md0/md1 devices, booting up with sysresccd and shoving the data over the network.

bleah

I ran into the same issue

On May 9th, 2007 Anonymous (not verified) says:

I ran into the same issue and solved it with a little reading about mdadm. All you have to do is create a new array from the old disks.

# MAKEDEV md1
# mdadm -C /dev/md1 -l 1 -n 2 missing /dev/sdb1

Voila. Your raid array has now been moved from md0 to md1.
