I was faced with an interesting challenge tonight. But first, why I was faced with the challenge to begin with.
I had been teaching Linux to system administrators for more than a year before taking my current job as a system administrator myself. During a couple of the courses, the students had the chance to learn about the boot procedure from pressing the power button to the login screen and everything in between. This meant getting a deep, personal understanding of your MBR, the Linux kernel and the System V init scripts.
During the lecture on the MBR, I always explained that GRUB is the de facto standard, and that LILO has basically been replaced. I would then walk the students through situations they might face should their MBR be corrupt or missing, and how to fix it. Of course, fixing it meant troubleshooting, and learning to use the rescue media that ships with your distribution, be it RHEL or SLES. Well, I never thought that I would face the situation personally, as I tend to be much more careful with my own machines than with someone else’s (like a training center’s).
First, when I installed my desktop, I wanted to take advantage of LVM. I have two disks in the machine, so taking full advantage of the space was important. LVM fits the bill nicely. However, I didn’t plan well enough, and put my boot partition on a logical volume. This means that GRUB doesn’t get installed by default, and instead, you get LILO. It also means that there is no pretty splash screen while booting; it’s back to the old kernel and init script output. Oh well. This system stays up 90% of the time anyway, so no big deal.
Then, earlier today, I thought to myself, “GRUB 2 is supposed to handle the boot partition on a logical volume”. I would surely rather have GRUB than LILO, plus I miss the slick boot splash screen. So, I pulled up a terminal, installed the ‘grub2’ package and all its dependencies, ran ‘grub-install /dev/sda’, then rebooted to a black screen saying it couldn’t find my boot partition. GREAT! Here I am, thinking this would be no sweat, and I’m left without a bootable computer. No worries, I thought. I’ve taught my students over and over again how to get out of this jam, so surely I can do it myself. So, I grabbed an Ubuntu LiveCD, and went to work.
First things first. I need to make a decision about GRUB or LILO. Do I want to figure out why GRUB puked on me, or should I just stick with LILO, and be done with it? I decide to stick with LILO. So, I boot into the live environment, pull up a terminal, and get to work. As mentioned, every last bit of disk space is on a logical volume. The LiveCD doesn’t come with LVM support by default, so I need to install it, and load the module:
sudo aptitude install lvm2
sudo modprobe dm_mod
Now with LVM installed, and the module loaded, I can mount the volume and get to work. Hold on though. Not so fast. In order to mount the volume, I need to refer to it by its device node, and it’s not there if I look under /dev. Get back in your terminal to find out why by running ‘lvscan’:
sudo lvscan
This will scan for logical volumes and list any it finds. ‘lvscan’ also tells us whether each volume is active or inactive. In my case (and in yours, if you’re following along with this tutorial), the volumes are inactive. So, we need to make them active:
sudo lvchange -a y /dev/janus/root
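If there are several volumes, picking the inactive ones out of the ‘lvscan’ output by eye gets tedious. Here is a minimal sketch of a filter for that. The function name ‘inactive_volumes’ is my own invention, and the sample output below (the ‘swap’ volume and the sizes) is made up for illustration; it also assumes ‘lvscan’ prints the device path in single quotes, as it does on the systems I have seen.

```shell
# Print the device path of every volume that lvscan reports as inactive.
# Reads lvscan-style output on stdin and splits each line on single quotes.
inactive_volumes() {
    awk -F"'" 'tolower($1) ~ /inactive/ { print $2 }'
}

# Illustrative sample only; the swap volume and the sizes are hypothetical.
sample="  inactive          '/dev/janus/root' [74.00 GiB] inherit
  ACTIVE            '/dev/janus/swap' [2.00 GiB] inherit"

printf '%s\n' "$sample" | inactive_volumes
```

On the live system you would feed it the real thing, ‘sudo lvscan | inactive_volumes’, and run ‘lvchange -a y’ on each path it prints. Running ‘sudo vgchange -a y janus’ should also work if you simply want every volume in the group activated at once.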
/dev/janus/root is the volume that contains my root filesystem, including /boot, which is needed to bring my box back into working order. Now with my volume active, I can reference it, and mount it. But that’s not all I need to mount. I also need to mount /proc for the kernel, and bind-mount /dev so that the devices the kernel sees are available inside the mounted filesystem. So, back into the terminal we go:
sudo mount /dev/janus/root /mnt
sudo mount -t proc none /mnt/proc
sudo mount -o bind /dev /mnt/dev
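The order matters here: the root volume has to be mounted before /proc and /dev can be mounted underneath it. As a rough sketch, the three mounts can be wrapped in a small helper that only prints the commands, so you can check them before running anything as root. The name ‘rescue_mounts’ and the /mnt default are my own choices, not anything the LiveCD provides.

```shell
# Print (do not run) the mount commands for a chroot rescue, in order.
# $1 = device holding the root filesystem, $2 = mount point (defaults to /mnt)
rescue_mounts() {
    root_dev=$1
    target=${2:-/mnt}
    printf 'mount %s %s\n' "$root_dev" "$target"
    printf 'mount -t proc none %s/proc\n' "$target"
    printf 'mount -o bind /dev %s/dev\n' "$target"
}

rescue_mounts /dev/janus/root /mnt
```

Once the printed commands look right for your own volume name, they can be run by hand with ‘sudo’ in front of each one.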
Okay. We’re getting there, slowly but surely. Now that everything is mounted, it’s time to change to that filesystem, and start fixing stuff. This is done with the ‘chroot’ command. I won’t go into great detail about ‘chroot’. Basically, it just changes your root filesystem to the directory you specify. In this case, we’re going to change our root filesystem to /mnt, where my logical volume is mounted. That volume holds the actual filesystem that resides on my disk, not the LiveCD’s:
sudo chroot /mnt
Cool. Our prompt should show that we are on the ‘/’ filesystem, meaning that any files we alter, we’re altering on disk. So, back at the beginning of the post, I mentioned that I installed GRUB, and it failed to boot, so I decided to stick with LILO rather than figure out why it failed (I can do that later). I uninstalled LILO when I installed GRUB, so I need to reinstall it. Remember that because we ran ‘chroot’ just now, we are operating on the disk itself, so any packages we install will persist across boots. If this is the first time you’re installing LILO, then you’ll need to take an extra step, which is to run ‘liloconfig’ to create /etc/lilo.conf. If you’re just rescuing an existing LILO system as I am, that step isn’t needed. Either way, finish by running ‘lilo’ itself:
sudo aptitude install lilo
sudo liloconfig # Answer yes to everything
sudo lilo
LILO should now be installed in the MBR of the disk, and you should have a bootable box at this point. So, the only things left to do are leave the chrooted environment and reboot, hoping everything works. If you received any warnings from LILO when installing to the MBR, ignore them. If you received any FATAL errors, you won’t have a bootable box, and will have to troubleshoot further from there.
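If the output scrolls by too fast to tell warnings from fatal errors, a tiny filter can make the call for you before you leave the chroot. This is only a sketch: ‘lilo_ok’ is a name I made up, the sample output is invented, and it assumes LILO starts its fatal messages with the word “Fatal”, which is what I have seen but may differ between versions.

```shell
# Read lilo's output on stdin; succeed only if no line starts with "Fatal".
# Assumption: fatal messages begin with "Fatal" (matched case-insensitively).
lilo_ok() {
    ! grep -qi '^fatal'
}

# Invented sample output for illustration: one warning, no fatal error.
sample_output="Warning: LBA32 addressing assumed
Added Linux *"

if printf '%s\n' "$sample_output" | lilo_ok; then
    echo "no fatal errors, safe to reboot"
else
    echo "fatal error, do not reboot yet"
fi
```

Inside the chroot, the real thing would be piped in with ‘lilo 2>&1 | lilo_ok’. Checking the exit status of ‘lilo’ itself is another option.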
exit
sudo reboot
At this point, my box booted fine, and I’m typing this post right after the rescue. Everything is in place. I hope this helps someone who is in the same boat I was just in, fixing their LILO on LVM.