[arch-general] dmraid Partitions Lost - Recovered -> Howto
List,

I thought I would pass this along should anyone else experience a loss of all partitions on a drive or array. It may help somebody out someday:

dmraid Partition Loss with dmraid-1.0.0rc15

While testing dmraid-1.0.0rc15 on a box with two separate dmraid arrays, I experienced the total loss of all partitions on the second dmraid array. The first array held an openSuSE install running dmraid-1.0.0rc14, while the second held Archlinux with dmraid-1.0.0rc15, where the testing was being done. All testing of dmraid-1.0.0rc15 on Archlinux went fine; the problem occurred when the machine was booted back into openSuSE. Regardless of the situation, whether using a raid setup or not, partition loss is serious business.

dmraid Partition Recovery

Recovery of dmraid partitions proceeds in the same manner as recovering partitions from a single drive. If you haven't destroyed the information on the array, you should be able to put the pieces of the puzzle back together again. The basic outline for the process is to locate and restore the partitions on the array and then reinstall the boot loader so your box is functional again. (Note: if you were smart enough to save the "fdisk -l" information for your drives, you can simply fdisk your array and be done)

Tools Required

Partition location and recovery software (I used testdisk)

    http://www.cgsecurity.org/
    http://www.cgsecurity.org/wiki/TestDisk_Download
    http://www.cgsecurity.org/testdisk-6.11.linux26.tar.bz2

Rescue CD for your OS (generally your install CD/DVD, or knoppix, etc.)

Using testdisk

testdisk is a great piece of GPL code written by Christophe Grenier. testdisk can be used with most operating systems; it will scan your disk or array, locate partition boundaries, and give you the opportunity to recover them. I had 4 partitions dedicated to my Archlinux install totaling roughly 70G on a 750G raid array. To start testdisk, for Linux26, untar the bzip2 archive and then cd into the linux subdirectory.
The prebuilt binary is:

    ./testdisk_static

The first thing you will need to do is set the correct disk geometry. In my case the disk reported 254 heads and needed to be changed to 255 heads to work properly. (This is recommended if the first Quick Scan doesn't find your partitions.) After setting the geometry, just choose "Analyze" and "Quick Scan" and go get a coffee or something. In my case, since the 70G I was using was at the front of the 750G array, it had found my partitions within 5 minutes or so. Once all of your partitions are found you can "Stop" the scan by hitting the return key.

You are then presented with the list of found partitions. They will initially be labeled "D" for deleted, and you simply toggle on the partitions you need to recover by selecting a type ("P" Primary, "*" Primary Boot, "L" Logical, or leave as "D" for Deleted). testdisk will check your selections for partition overlap and give you confirmation in green if your partition layout is OK. Just hit return to continue. Don't worry about the extended partition boundary, it will be provided. Review the partitions to be recovered, choose "Write", and you are done. (A reboot is required to activate the partitions.)

If no partitions were found during the "Quick Scan", then (1) check your drive geometry setting; and (2) you will be given the option to do an "In Depth Scan" (go get 4 cups of coffee, walk the dog, etc...)

Have Your Rescue CD Handy

Once the partition information has been changed, there is a near 100% chance your boot loader configuration will be messed up. Don't worry, everything is still there; you just have to reinstall grub or lilo into the boot record to recover from the situation.
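For the testdisk start-up steps described earlier (untar, cd into the linux subdirectory, run the static binary), a minimal sketch might look like the following. It only prints the commands unless DRY_RUN=0 is set. The helper names run and start_testdisk, the unpacked directory name, and the array path are assumptions made for illustration, and "/log" is testdisk's logging switch as I understand it; check "./testdisk_static /help" on your copy before relying on it.

```shell
#!/bin/sh
# Sketch only: prints the commands by default; set DRY_RUN=0 to execute them.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# start_testdisk ARCHIVE ARRAY   (hypothetical helper, not part of testdisk)
#   ARCHIVE: the downloaded tarball, e.g. testdisk-6.11.linux26.tar.bz2
#   ARRAY:   the dmraid device node to scan
start_testdisk() {
    archive=$1
    array=$2
    # Assumption: the tarball unpacks into a directory named after the
    # archive minus ".linux26.tar.bz2" (testdisk-6.11 for the file above).
    dir=$(basename "$archive" .linux26.tar.bz2)
    run tar -xjf "$archive" &&
    run cd "$dir/linux" &&
    run ./testdisk_static /log "$array"
}
```

Calling "start_testdisk testdisk-6.11.linux26.tar.bz2 /dev/mapper/my_array" (my_array is a placeholder for whatever name "ls /dev/mapper" shows) prints the three commands so you can review them before running anything against a damaged array.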
Reinstalling Grub

Here you will be booting from your CD or DVD into rescue mode, using dmraid to activate the arrays, and then using the information about the dm nodes in /dev/mapper and the partition information from "cat /proc/partitions" to create a chroot of your install to repair the boot loader:

(1) boot from the install DVD

(2) choose "Rescue System", login as "root" (no password needed)

(3) activate the dmraid arrays with "dmraid -ay"

(4) check which device nodes to use to create the chroot with "ls -al /dev/dm*" or "ls -al /dev/mapper". I was dealing with 2 separate arrays and 9 partitions (duplicated by having both dmraid-1.0.0rc14 and dmraid-1.0.0rc15 metadata), which left me with dm-0 to dm-20 to deal with. Compare the size shown for dm-X, /dev/mapper/raiddevice_name and the size shown from "cat /proc/partitions" to determine your "/", "/home", and "/boot" and any other partitions you need to set up in your chroot.

(5) mount all dm-X devices or /dev/mapper devices under /mnt to create your actual filesystem, and then bind dev/, proc/ and sys/ to their respective mount points under /mnt and chroot.

**Note: you need to mount the device containing the / (root) filesystem first, before mounting /boot and /home. Otherwise, the /boot and /home mount points will not exist.

Example:

    mount /dev/dm-5 /mnt
    mount /dev/dm-7 /mnt/boot
    mount /dev/dm-6 /mnt/home
    mount -o bind /dev /mnt/dev
    mount -o bind /proc /mnt/proc
    mount -o bind /sys /mnt/sys
    cd /mnt
    chroot /mnt

(6) Reinstall grub to fix the mbr on your raid discs (mine were hd0 and hd1). See http://wiki.archlinux.org/index.php/Installing_with_Fake-RAID#Install_GRUB for my notes on getting the (hdX,Y) numbers right. When you start grub, you get a small ">" prompt; just use the following as a guide.
If you only have a single array, you will only need to worry about setting up hd0:

    grub
    >root (hd0,4)
    >setup (hd0)
    >*** few lines of grub output ***
    >root (hd1,5)
    >setup (hd1)
    >*** more lines of grub output ***
    >quit

(7) check your /etc/grub.conf to make sure it agrees with the way you have just configured grub. For the example above, it should look like this for hd0 (I boot to hd0 and then chainload to get to hd1 and the second array):

    setup --stage2=/boot/grub/stage2 (hd0) (hd0,4)
    quit

(8) exit (to exit chroot) and reboot, and if you were successful (or just damn lucky), your system will be 100% again.

Now immediately do "fdisk -l" on each of your arrays and drives and save that information remotely so if this happens again, you have a shortcut ;-)

--
David C. Rankin, J.D.,P.E.
Rankin Law Firm, PLLC
510 Ochiltree Street
Nacogdoches, Texas 75961
Telephone: (936) 715-9333
Facsimile: (936) 715-9339
www.rankinlawfirm.com
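The closing advice in the message above, saving the "fdisk -l" output for every array, could be scripted roughly as follows. This is only a sketch: save_layouts is a hypothetical helper name, the output file naming is my own choice, and the device paths you pass in are whatever your own system shows under /dev/mapper.

```shell
#!/bin/sh
# save_layouts OUTDIR DEVICE...   (hypothetical helper)
# Writes the "fdisk -l" listing of each DEVICE to OUTDIR/<name>.fdisk.txt.
# Needs enough privilege to read the devices (normally root).
save_layouts() {
    outdir=$1
    shift
    mkdir -p "$outdir" || return 1
    for dev in "$@"; do
        # Use the last path component as the file name,
        # e.g. /dev/mapper/my_array -> my_array.fdisk.txt
        name=$(basename "$dev")
        fdisk -l "$dev" > "$outdir/$name.fdisk.txt" || return 1
    done
}
```

For example, "save_layouts /root/layouts /dev/mapper/my_array_a /dev/mapper/my_array_b" (placeholder array names) leaves one text file per array. Copy the directory to another machine afterwards, since a listing stored only on the array it describes is no help once that array's partitions are gone.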
On Thu, 2009-06-25 at 22:49 -0500, David C. Rankin wrote:
List,
[putolin]
(Note: if you were smart enough to save the "fdisk -l" information for your drives, you can simply fdisk your array and be done)
This may be of some use to you:

http://linux.die.net/man/8/sfdisk

I use

The fourth type of invocation:

    sfdisk device

will cause sfdisk to read the specification for the desired partitioning of device from its standard input, and then to change the partition tables on that disk. Thus, it is possible to use sfdisk from a shell script. When sfdisk determines that its standard input is a terminal, it will be conversational; otherwise it will abort on any error.

BE EXTREMELY CAREFUL - ONE TYPING MISTAKE AND ALL YOUR DATA IS LOST

As a precaution, one can save the sectors changed by sfdisk:

    % sfdisk /dev/hdd -O hdd-partition-sectors.save
    ...
    %

Then, if you discover that you did something stupid before anything else has been written to disk, it may be possible to recover the old situation with

    % sfdisk /dev/hdd -I hdd-partition-sectors.save
    %

(This is not the same as saving the old partition table: a readable version of the old partition table can be saved using the -d option. However, if you create logical partitions, the sectors describing them are located somewhere on disk, possibly on sectors that were not part of the partition table before. Thus, the information the -O option saves is not a binary version of the output of -d.)
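The "-d" option mentioned at the end of the excerpt produces a readable dump that sfdisk can read back on standard input, which makes it a reasonable way to keep a restorable copy of a partition table. A minimal sketch, assuming hypothetical helper names dump_table and restore_table and a device path of your own:

```shell
#!/bin/sh
# dump_table DEVICE FILE   (hypothetical helper)
# Saves a readable, re-loadable description of DEVICE's partition table.
dump_table() {
    sfdisk -d "$1" > "$2"
}

# restore_table DEVICE FILE   (hypothetical helper)
# DESTRUCTIVE: rewrites DEVICE's partition table from a dump made earlier.
restore_table() {
    # Refuse to run with a missing or empty dump file.
    if [ ! -s "$2" ]; then
        echo "dump file $2 is missing or empty" >&2
        return 1
    fi
    sfdisk "$1" < "$2"
}
```

Typical use would be "dump_table /dev/mapper/my_array my_array.sfdisk" while the array is healthy, and "restore_table /dev/mapper/my_array my_array.sfdisk" only after double-checking the device name, for the reason given in the capitalized warning above.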
On Friday 26 June 2009 07:06:22 am Baho Utot wrote:
[snip]
Thanks Baho,

That is great information to add to my bag of tricks, and I hope I don't have to use it anytime soon ;-)

What I liked about testdisk that I hadn't found anywhere else was its 'scan' feature, where it just scans the drive looking for partition boundaries and reports what it found, giving you the opportunity to select what needs to be restored. It gives you a way to approach partition recovery without needing any prior knowledge of what is on the disk. One other neat trick it has is the ability to "View" the files on the partitions it has found, which provides another check for determining the correct partitions.

You can safely run testdisk on your hard drive and see what it does. Just don't hit "Write" and you will be fine. (It also forces a confirmation prompt, so you would have to make two mistakes to do any damage.)

I'll look at sfdisk, and thank you for the information.

--
David C. Rankin, J.D.,P.E.