Host Startup; Booting into LVM Disk Fails
-
@Wayne-Workman : Oh! Right - that was the other concern of mine I have on my whiteboard. Thanks for bringing that up again. Should I just… delete /var/www/fog and call .foginstall again on the FOG Server? Or what would be the best way to invalidate this?
-
@dholtz-docbox Looking at a thread from a while ago here:
https://forums.fogproject.org/topic/8736/chainloading-failed
Assign an image to this host - it doesn’t need to be a legit image, just make one up in Image Management, and assign it. See if the chainloading error goes away or not.
-
I also wanted to update you on this…
I believe some services needed to be restarted after upgrading to RC-14. After looking at the DHCP issues and talking with George, I queried it again for EXIT mode, and now I get…
Edit> I removed the output. It was misleading, because I had the PXE menu enabled. George and I are talking, and when looking at the SANBOOT option, it tries to boot into 0x80 instead of 0x81, which would be the second drive - which is the drive I am looking to boot from.
My SANBOOT output is…
#!ipxe
set fog-ip 10.1.10.42
set fog-webroot fog
set boot-url http://${fog-ip}/${fog-webroot}
sanboot --no-describe --drive 0x80
The EXIT output is still the same, with the last line being exit.
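For anyone following along, here is a minimal sketch of the drive numbering being discussed. It assumes the usual BIOS INT 13h convention, where hard disks are numbered from 0x80 upward in the order the firmware enumerates them, which need not match Linux's /dev/sda and /dev/sdb ordering. The helper names are made up for illustration and are not part of FOG or iPXE.

```python
def bios_drive_number(disk_index: int) -> str:
    """Return the BIOS drive number for the Nth hard disk (0-based).

    Assumption: hard disks start at 0x80 and count upward in firmware order.
    """
    if not 0 <= disk_index < 0x80:
        raise ValueError("disk_index must be between 0 and 127")
    return f"0x{0x80 + disk_index:02x}"


def sanboot_line(disk_index: int) -> str:
    """Build the sanboot line seen in the output above for a given disk."""
    return f"sanboot --no-describe --drive {bios_drive_number(disk_index)}"


print(sanboot_line(0))  # first disk, what FOG currently sends
print(sanboot_line(1))  # second disk, the one I actually want
```

So the second disk the firmware reports would be 0x81, assuming the firmware lists the drives in the order I expect.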
-
Immediately after calling the link I gave you earlier, check the apache error logs for entries from the moment prior.
Web Interface -> FOG Configuration -> Log Viewer -> Apache Error
Sort by newest first, be aware of the timestamps. Copy/paste what you find.
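If you have shell access, the same check can be sketched outside the web interface. This is a rough example, not part of FOG: it assumes Apache 2.4-style error log lines with English day and month names, and the function names are hypothetical. On Ubuntu the log usually lives at /var/log/apache2/error.log, but treat that path as an assumption for your install.

```python
import re
from datetime import datetime

# Matches the leading "[Wed Oct 19 18:15:32.174284 2016]" stamp of an error log line.
STAMP = re.compile(r"^\[(\w{3} \w{3} +\d{1,2} \d{2}:\d{2}:\d{2}\.\d+ \d{4})\]")


def parse_stamp(line):
    """Return the line's timestamp as a datetime, or None if it has no stamp."""
    match = STAMP.match(line)
    if match is None:
        return None
    return datetime.strptime(match.group(1), "%a %b %d %H:%M:%S.%f %Y")


def newest_first(lines):
    """Return only the timestamped lines, most recent entry first."""
    stamped = [(parse_stamp(line), line) for line in lines]
    stamped = [pair for pair in stamped if pair[0] is not None]
    stamped.sort(key=lambda pair: pair[0], reverse=True)
    return [line for _, line in stamped]


sample = [
    "[Wed Oct 19 18:12:18.080349 2016] [mpm_prefork:notice] [pid 24123] AH00169: caught SIGTERM, shutting down",
    "[Wed Oct 19 18:15:32.174284 2016] [core:notice] [pid 1564] AH00094: Command line: '/usr/sbin/apache2'",
]
for entry in newest_first(sample):
    print(entry)
```

To use it on a real log, read the file's lines and pass them to `newest_first`, then look at the entries stamped just after you called the link.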
-
@Wayne-Workman : Nothing too interesting, just some shutdown/startup commands I made earlier today.
[Wed Oct 19 18:15:32.174284 2016] [core:notice] [pid 1564] AH00094: Command line: '/usr/sbin/apache2'
[Wed Oct 19 18:15:32.173684 2016] [mpm_prefork:notice] [pid 1564] AH00163: Apache/2.4.7 (Ubuntu) OpenSSL/1.0.1f configured -- resuming normal operations
[Wed Oct 19 18:15:13.109673 2016] [mpm_prefork:notice] [pid 4501] AH00169: caught SIGTERM, shutting down
[Wed Oct 19 18:12:39.854083 2016] [core:notice] [pid 4501] AH00094: Command line: '/usr/sbin/apache2'
[Wed Oct 19 18:12:39.853942 2016] [mpm_prefork:notice] [pid 4501] AH00163: Apache/2.4.7 (Ubuntu) OpenSSL/1.0.1f configured -- resuming normal operations
[Wed Oct 19 18:12:18.080349 2016] [mpm_prefork:notice] [pid 24123] AH00169: caught SIGTERM, shutting down
What George and I were discussing was the sanboot switches, in particular --drive 0x80. His thought was that whatever needs to be done should assign this to 0x81, since I am trying to boot from /dev/sdb, which is the second drive. Are these drives numbered according to the order the BIOS lists them? Or is there any way to change which drive is mapped to 0x80?
-
@Tom-Elliott the chainloading issue with the exit type in this thread is very similar to this thread:
https://forums.fogproject.org/topic/8736/chainloading-failed
Especially suspicious since it was worked on between 13 and 14 and is still having issues.
-
@Wayne-Workman : Agreed! I am actually trying things from that topic as we speak.
-
@dholtz-docbox The issue here is totally different from the stuff in that thread.
-
I say this, because you’re seeing stuff populate after the boot-url portion. From what I can tell, it appears No menu is on?
-
@Tom-Elliott said in Host Startup; Booting into LVM Disk Fails:
I say this, because you’re seeing stuff populate after the boot-url portion. From what I can tell, it appears No menu is on?
Right, no menus were on. I had him turn it on and he posted the results in chat. If you ignore everywhere this thread has gone, the issue is that, by their specific design, his /dev/sda is empty (no partitions) and /dev/sdb has Ubuntu on it. If he changes the boot order in the BIOS so that it does not PXE boot, the target computer boots correctly, but if it boots through iPXE, sanboot tries to exit to the blank disk (/dev/sda). This is outside of FOG imaging so far. Right now he is testing whether his reference image will boot through iPXE to the OS.
-
@Tom-Elliott : Yeah, I wasn’t having any luck trying anything from the other thread. I have been tossing around the question of why these drives are set up this way and what we can do to change it. In my head, is it as easy as just swapping the SATA cables? I am not sure why the drives are the way they are, but I would not mind changing it if I can. I just haven’t looked into it yet - it just sounds like a lot of issues would be resolved if the OS was on the first disk?
-
http://askubuntu.com/questions/371049/how-are-dev-sda-and-dev-sdb-chosen
According to that, you might want to remove one of the disks for Ubuntu’s installation only, then add it back afterwards and see how that goes. And also as that link suggests, use UUIDs instead of /dev/sdX, follow the documentation.
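To see which UUID belongs to which device before editing anything, something like the following could help. It is only a sketch: it assumes a Linux system where udev populates /dev/disk/by-uuid with symlinks, and `uuid_map` is a name I made up for the example.

```python
import os


def uuid_map(by_uuid_dir="/dev/disk/by-uuid"):
    """Map each filesystem UUID to the device node its symlink points at."""
    result = {}
    if not os.path.isdir(by_uuid_dir):
        return result  # directory is absent on non-udev systems
    for name in sorted(os.listdir(by_uuid_dir)):
        result[name] = os.path.realpath(os.path.join(by_uuid_dir, name))
    return result


if __name__ == "__main__":
    for uuid, device in uuid_map().items():
        print(f"UUID={uuid}  ->  {device}")
```

The UUID= form printed here is what /etc/fstab accepts in place of a /dev/sdX path, so the mounts keep working even if the kernel swaps sda and sdb between boots.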
-
@dholtz-docbox outside of your design requirements, the one question in my mind is: if you did switch the cables (as a test), does it PXE boot properly to the OS? That way we can just focus on getting the proper exit to work and boot to the second hard drive. I know I’ve changed the boot drive in the past, but I can’t remember how I did it until I can get my hands on a FOG server.
-
I am looking to have someone open the enclosure so I can swap the cables - darn special screwdriver that I don’t have. Until then, I am looking at how to handle how /dev/sda is assigned, and it looks like it comes down to udev rules? That’s what I am looking at right now. I know a lot of this is hitting a bridge I was looking to prolong, but I have been looking to more intimately setup the partitions. I guess this is the start of understanding how some of this is handled.
-
@dholtz-docbox Remember, at this point in time you are NOT in any OS. When the iPXE menu exits you are still running the iPXE kernel. It knows nothing about your target OS. It's just passing (or trying to pass) boot control off to some boot device.
-
George is right on that.
-
@george1421 OK I have a hackish way to test this idea.
But first, I went through each of the exit modes and all pick the first hard drive. Looking into the code all of the exit modes are hard coded to boot to the first hard disk.
Sooo… as a test please edit this file /var/www/html/fog/lib/fog/bootmenu.class.php and search for some fragment of “sanboot --no-describe --drive 0x80”; it should be listed only once in that file. Change the 0x80 to 0x81, then save the class file. Be sure that you have the host set to sanboot and then use the browser trick I chatted to you to confirm that the sanboot is now sending to the 0x81 disk. Once confirmed, try to boot that target computer again. If it works then we have to find the next steps, but we can prove the process works.
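If you would rather script the edit than do it by hand, the substitution could look roughly like this. It is a sketch only: `retarget_sanboot` is a made-up helper, the sample string is a stand-in and not the real contents of bootmenu.class.php, and you should back up the file before touching it.

```python
import re

# Only touch 0x80 when it follows the exact sanboot fragment quoted above.
SANBOOT = re.compile(r"(sanboot --no-describe --drive )0x80\b")


def retarget_sanboot(source: str, new_drive: str = "0x81"):
    """Return (patched_text, replacement_count) with the drive number swapped."""
    return SANBOOT.subn(lambda m: m.group(1) + new_drive, source)


# Hypothetical stand-in for the one matching line in the class file.
sample = "sanboot --no-describe --drive 0x80"
patched, count = retarget_sanboot(sample)
print(count, patched)
```

The returned count is worth checking: it should be exactly 1 if the fragment really appears only once in the file. Anything else means the file is not what we expect and it should not be written back.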
-
That makes sense. I guess my thought here is that, after installing Ubuntu and before trying to boot through the network, I run a set of udev rules that remap which disks /dev/sda and /dev/sdb are associated with.
Also, I can’t switch the cables around: one drive is SATA, whereas the other is… like an onboard PCI-e disk. I wanted to handle it when installing Linux, but it looks like they have their own set of udev rules which always maps the larger drive to /dev/sda instead of /dev/sdb. Which is why I was thinking, maybe we’ll just have our own rules that we run as necessary to correct this, prior to building our golden image.
At the end of the day, the primary disk should be /dev/sda anyway, so I want to figure out how to get the smaller drive mapped to /dev/sda instead of /dev/sdb if I can.
Anyway, my time has run up for today. I will try a few more things tomorrow morning, specifically regarding udev rules first, most likely.
Thanks again for everyone’s help.
-Dustin
-
@george1421 said in Host Startup; Booting into LVM Disk Fails:
@george1421 OK I have a hackish way to test this idea.
But first, I went through each of the exit modes and all pick the first hard drive. Looking into the code all of the exit modes are hard coded to boot to the first hard disk.
Sooo… as a test please edit this file /var/www/html/fog/lib/fog/bootmenu.class.php and search for some fragment of “sanboot --no-describe --drive 0x80”; it should be listed only once in that file. Change the 0x80 to 0x81, then save the class file. Be sure that you have the host set to sanboot and then use the browser trick I chatted to you to confirm that the sanboot is now sending to the 0x81 disk. Once confirmed, try to boot that target computer again. If it works then we have to find the next steps, but we can prove the process works.
Haha, yeah… I tried that earlier. The disk just boot looped with no prompt, indefinitely, until I force-killed-power to the machine.
Edit> That was my immediate hack idea too, just to see if it would work
PS> Everything checked out, regarding your hack for reviewing the settings passed - 0x81 was assigned when tested.
-
@george1421 said in Host Startup; Booting into LVM Disk Fails:
Sooo… as a test please edit this file /var/www/html/fog/lib/fog/bootmenu.class.php
Along the same lines, you will find this at line 142 concerning grub that is easy to snipe:
'rootnoverify (hd0);chainloader +1'
Maybe change hd0 to hd1 and switch the host to grub and see what happens.