Recovery of RAID and LVM2 Volumes

RAID and the Logical Volume Manager are great, until you lose data.
Restoring Access to the RAID Array Members

To recover, the first thing to do is to move the drive to another machine. You can do this easily by putting the drive in a USB2 hard drive enclosure; it then will show up as a SCSI hard disk device, for example, /dev/sda, when you plug it into your recovery computer. This reduces the risk of damaging the recovery machine while attempting to install the hardware from the original computer.

The challenge then is to get the RAID setup recognized and to gain access to the logical volumes within. You can use sfdisk -l /dev/sda to check that the partitions on the old drive are still there.
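
The members of a Linux software RAID set normally show up with partition type fd, Linux raid autodetect, so the quick check is simply:

sfdisk -l /dev/sda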

To get the RAID setup recognized, use mdadm to scan the devices for their RAID volume UUID signatures, as shown in Listing 3.

This format is very close to the format of the /etc/mdadm.conf file that the mdadm tool uses. You need to redirect the output of mdadm to a file, join the device lines onto the ARRAY lines, and put in a nonexistent second device to fill out the RAID1 configuration. Bringing up the md array in degraded mode will still allow data recovery:

[root@recoverybox ~]# mdadm --examine --scan /dev/sda1 /dev/sda2 /dev/sda3 >> /etc/mdadm.conf
[root@recoverybox ~]# vi /etc/mdadm.conf

Edit /etc/mdadm.conf so that the devices statements are on the same lines as the ARRAY statements, as they are in Listing 4. Add the “missing” device to the devices entry for each array member to fill out the RAID1 complement of two devices per array. Don't forget to renumber the md entries if the recovery computer already has md devices and ARRAY statements in /etc/mdadm.conf.
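
Each finished entry ends up looking roughly like this (a sketch only; the UUID is a placeholder for the one mdadm reported, and there is one such line per array):

ARRAY /dev/md0 level=raid1 num-devices=2 UUID=xxxxxxxx:xxxxxxxx:xxxxxxxx:xxxxxxxx devices=/dev/sda1,missing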

Then, activate the new md devices with mdadm -A -s, and check /proc/mdstat to verify that the RAID array is active. Listing 5 shows how the RAID array should look.
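
In command form (output omitted; compare what /proc/mdstat shows against Listing 5):

[root@recoverybox ~]# mdadm -A -s
[root@recoverybox ~]# cat /proc/mdstat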

If md devices show up in /proc/mdstat, all is well, and you can move on to getting the LVM volumes mounted again.

Recovering and Renaming the LVM2 Volume

The next hurdle is that the system now will have two sets of LVM2 disks with VolGroup00 in them. Typically, the vgchange -a y command would allow LVM2 to recognize a new volume group. That won't work if devices containing identical volume group names are present, though. Issuing vgchange -a y will report that VolGroup00 is inconsistent, and the VolGroup00 on the RAID device will be invisible. To fix this, you need to rename the volume group that you are about to mount on the system by hand-editing its LVM configuration file.
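
With both sets of disks attached, you can see the collision for yourself; something like the following will fail to activate the recovered group (the exact messages vary with the LVM2 version):

vgscan          # finds two volume groups named VolGroup00
vgchange -a y   # complains that VolGroup00 is inconsistent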

If you made a backup of the files in /etc on raidbox, you can start from a copy of its /etc/lvm/backup/VolGroup00 file. Rename the copy to VolGroup01, RestoreVG, or whatever you want the group to be called on the system you are going to restore under, and edit the file itself so that the volume group declaration inside it carries the same new name.
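
A minimal sketch of that edit, assuming the saved copy of raidbox's /etc is unpacked somewhere handy (the path below is a placeholder) and the new name will be VolGroup01:

cp /path/to/raidbox-etc-backup/lvm/backup/VolGroup00 /etc/lvm/backup/VolGroup01
vi /etc/lvm/backup/VolGroup01    # change the "VolGroup00 {" declaration to read "VolGroup01 {"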

If you don't have a backup, you can re-create the equivalent of an LVM2 backup file by examining the LVM2 header on the disk and editing out the binary stuff. LVM2 typically keeps copies of the metadata configuration at the beginning of the disk, in the first 255 sectors following the partition table in sector 1 of the disk. See /etc/lvm/lvm.conf and man lvm.conf for more details. Because each disk sector is typically 512 bytes, reading this area will yield a file of roughly 128KB. LVM2 may have stored several different text representations of the configuration in this first 128KB of the partition itself. Extract the area to an ordinary file as follows, then edit the file:

dd if=/dev/md2 bs=512 count=255 skip=1 of=/tmp/md2-raw-start
vi /tmp/md2-raw-start

You will see some binary gibberish, but look for the bits of plain text. LVM treats this metadata area as a ring buffer, so there may be multiple configuration entries on the disk. On my disk, the first entry had only the details for the physical volume and volume group, and the next entry had the logical volume information. Look for the block of text with the most recent timestamp, and edit out everything except the block of plain text that contains LVM declarations. That block holds the volume group declaration, including the logical volume information. Fix up physical device declarations if needed. If in doubt, look at the existing /etc/lvm/backup/VolGroup00 file to see what is there. On disk, the text entries are not as nicely formatted and are in a different order than in the normal backup file, but they will do. Save the trimmed configuration as VolGroup01. This file should then look like Listing 6.
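
To locate the configuration copies inside the dump before trimming, grep can treat the binary file as text (the -a flag) and report where each copy starts:

grep -a -n "VolGroup00 {" /tmp/md2-raw-start    # line numbers of each metadata copy in the ring buffer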

Comments

Thanks

Anonymous

Count another life saved. In spite of destroying one HD out of a two HD LVM set, we still recovered some data thanks to these tips. Not too shabby.

Alternate recovery method

Garth Webb

In my situation I did not have any explicit RAID arrays. I just had the standard Red Hat FC5 configuration of a single VolGroup00 volume group with two logical volumes. I recently pulled that 20GB drive from my system, installed a 320GB drive in its place and reinstalled FC5. Because this new drive had the same VolGroup00 volume group created, I could not mount the 20GB drive I had in a USB enclosure.

It seemed that most of this article was aimed at teasing out the LVM metadata and rewriting it to effect a volume group name change. Since all of the LVM tools require that you address the volume groups on your physical drives by their name, you encounter the naming conflict (how hard would it have been to include a rename command that took a physical path and renamed the volume there?).

Rather than fight that battle, I booted off my FC5 rescue CD (or any bootable tools CD with the LVM tools on it) and did a:

vgrename VolGroup00 Seagate320

The naming conflict didn't really matter here, it seems. It just renamed whatever VolGroup00 it found first, which happened to be my new 320GB drive. I could then activate both with:

vgchange -ay

and then mount the volumes and copy, etc.

Nice recovery method

rbulling

This looks like a much simpler way to do things, as long as you are not dealing with software RAID.

One thing you'd need to be careful about is making sure that you leave the new VolGroup00 named VolGroup00 at the end of the recovery process.

I suspect that using vgrename / vgscan / vgchange in combination would allow you to rename both volume groups to something else, then rename the newest VolGroup00 back to VolGroup00, so that the system would continue to work on boot.

You could probably use the same technique after you recovered the RAID configuration, and avoid the messy surgery on raw disk information. Next time I encounter this issue, I'll give that a try.
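
A rough, untested sketch of that idea; vgrename also accepts a volume group UUID in place of the name, which avoids the ambiguity of two groups both being called VolGroup00 (the UUID argument below is a placeholder taken from the vgs output):

vgs -o vg_name,vg_uuid                          # list both groups with their UUIDs
vgrename <uuid-of-recovered-group> RestoreVG    # rename the transplanted copy out of the way
vgscan
vgchange -a y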

With my distro (Ubuntu), and

Anonymous

With my distro (Ubuntu), and I suspect many others these days, drives are actually mounted by UUID, not by device name *or* by VG/LV name. So renaming the VG, even the one containing root, should* not be a problem.

What I did to recover a RAID1/LVM stack that blew one of its drives was boot the Ubuntu desktop live CD, sudo apt-get install mdadm lvm2, then mdadm --assemble /dev/mdX (be careful of RAID device names that occur in multiples -- i.e., run this on the recovery PC with the old drives attached and the recovery PC's own drives disconnected -- but it will simply scan the attached drives for appropriate partitions). The LVM stuff will then simply, magically appear, and you can vgrename easily if necessary. Once you've done that, connect all the drives and reboot, and after the mdadm.conf magic you will have all your data.
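
Spelled out as commands, that is roughly (a sketch only; everything beyond the two commands named in the comment is an assumption about a typical live CD session):

sudo apt-get install mdadm lvm2      # inside the live CD session
sudo mdadm --assemble --scan         # assemble whatever arrays it finds on the attached drives
cat /proc/mdstat                     # confirm the array is up (possibly degraded)
sudo vgscan && sudo vgchange -a y    # activate the volume groups if they do not appear on their own
sudo vgrename VolGroup00 RestoreVG   # only if a name clash needs resolving before reconnecting the other drives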

Incidentally, on my home server, RAID1 OS disks, RAID5 storage disks and LVM over the top, hell yeah that makes sense. On my desktop, not so much.

* Yeah. I haven't tested this explicitly.

Thanks!

Juan

This article saved my day!

Thanks!

Anders Båtstrand

This worked great for me. Thanks for putting it together!

Doesn't quite work for me

jweage

I just ran into a similar problem attempting to move a disk from one machine to another, with both disks configured as VolGroup00. I worked through your example, but when it came to restoring VolGroup01 (Listing 6), vgcfgrestore refused, claiming it couldn't find a contents line. In my dump, there were five additional header lines before the VolGroup01 { line, which vgcfgrestore requires.

After I figured this out and restored the volume group, I could not get any logical volumes to show up. lvscan did not pick up the three logical volumes in the volume group! Those were also in the dd-extracted file, so I had to add all of that back into the config file, do another vgcfgrestore and reactivate the volume group.

This is really disconcerting, as this is likely to be a common problem. Unfortunately it seems that LVM is NOT the way to go for the typical workstation, unless someone really needs the ability to resize a volume.

Correction to Listing 6

rbulling

It appears that Listing 6 got truncated somewhere along the line before publication.

The full Listing 6 should be:


Listing 6: Modified Volume Group Configuration File

VolGroup01 {
    id = "xQZqTG-V4wn-DLeQ-bJ0J-GEHB-4teF-A4PPBv"
    seqno = 1
    status = ["RESIZEABLE", "READ", "WRITE"]
    extent_size = 65536
    max_lv = 0
    max_pv = 0

    physical_volumes {

        pv0 {
            id = "tRACEy-cstP-kk18-zQFZ-ErG5-QAIV-YqHItA"
            device = "/dev/md2"

            status = ["ALLOCATABLE"]
            pe_start = 384
            pe_count = 2365
        }
    }

    # Generated by LVM2: Sun Feb 5 22:57:19 2006
    logical_volumes {

        LogVol00 {
            id = "i17qXJ-Blzu-u1Dr-bSlR-0kNC-yuBH-lnbkSi"
            status = ["READ", "WRITE", "VISIBLE"]
            segment_count = 1

            segment1 {
                start_extent = 0
                extent_count = 2364

                type = "striped"
                stripe_count = 1    # linear

                stripes = [
                    "pv0", 0
                ]
            }
        }
    }
}

contents = "Text Format Volume Group"
version = 1

description = ""

creation_host = "localhost.localdomain"    # Linux localhost.localdomain 2.6.9-11.EL #1 Wed Jun 8 20:20:13 CDT 2005 i686
creation_time = 1139180239    # Sun Feb 5 22:57:19 2006

What if the machine you're

dave

What if the machine you're using for recovery has RAID itself? When you append to /etc/mdadm.conf, can md0, md1 and md2 be renumbered to md3, md4 and md5?

Renumbering md0 to md3, for example, works

rbulling

You should be able to do that without any problems, as long as you explicitly keep the UUID signature in the renamed device line.
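
For example, the recovered array's entry might be renumbered like this (a sketch; the UUID is a placeholder for the signature mdadm --examine --scan reported for the old md0):

ARRAY /dev/md3 level=raid1 num-devices=2 UUID=xxxxxxxx:xxxxxxxx:xxxxxxxx:xxxxxxxx devices=/dev/sda1,missing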

Experienced this exact

Neekofab

Experienced this exact problem. Moved an md0/md1 disk to a recovery workstation that already had an md0/md1 device. They could not coexist, and I could not find a way to move the additional md0/md1 devices to md2/md3. I ended up disconnecting the system md0/md1 devices, booting up with sysresccd and shoving the data over the network.

bleah

I ran into the same issue

Anonymous

I ran into the same issue and solved it with a little reading about mdadm. All you have to do is create a new array from the old disks.

# MAKEDEV md1
# mdadm -C /dev/md1 -l 1 -n 2 missing /dev/sdb1

Voila. Your RAID array has now been moved from md0 to md1.
