Host Startup; Booting into LVM Disk Fails

dholtz-docbox

@george1421 : Haha, okay, let’s start over, knowing what we know now…

george1421

@dholtz-docbox If you are capturing a Linux image that uses LVM, the last I knew the LVM disk structure wasn’t supported. Understand that I don’t work with cloning Linux systems, so I can’t give you firsthand experience with them. But the last I knew, only normal partitions were supported.

@Developers Is LVM now supported with FOG 1.3.0RCx?

dholtz-docbox

After discussing what I have, the current issue is as follows…

My system has two (2) drives; the first drive /dev/sda is not formatted or touched, and the second drive /dev/sdb is formatted with Ubuntu 14.04.5 - Server using LVM. There is nothing complex about the LVM partition - other than it being LVM - as it only contains one primary partition currently - along with any of its other typical partitions.

This is where the issue comes into play.

I have the host registered with the FOG server, and all I am trying to do is boot the host into its system through PXE. However, every time it goes to boot into the drive, it either hangs - if using SANBOOT - or hits a chainloading loop - when using EXIT.

I am told the chainloading loop is easier to deal with than the SANBOOT issue.

Since then, I have been trying to narrow down various things, and DHCP configuration was the first to square away.

george1421

@george1421 Just thinking after I hit the submit button: deploy to a target computer (it will of course fail on reboot), then go into FOG and schedule another deployment, but be sure you pick the DEBUG deploy option. Then PXE boot the target computer. That will boot you into the FOG Engine and drop you to a command prompt after you press Enter a few times. From there you can check the disk using standard Linux tools to see if you can detect anything wrong with the disk structure.
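As a rough sketch of what "standard Linux tools" could look like at that debug prompt, the helper below prints a disk's layout in three ways. The function name `inspect_disk` is hypothetical, and it is an assumption that `lsblk`, `fdisk`, and `blkid` are all present in the FOG debug environment; any one of them alone is still useful.

```shell
#!/bin/sh
# Hypothetical helper for a debug shell: summarize one disk's layout.
# Assumes lsblk, fdisk and blkid are available in the environment.
# Usage: inspect_disk /dev/sdb
inspect_disk() {
    disk="$1"
    # Block devices, sizes and filesystem types as the kernel sees them
    lsblk -o NAME,SIZE,TYPE,FSTYPE "$disk"
    # Partition table of the disk
    fdisk -l "$disk"
    # Filesystem and LVM signatures; an LVM physical volume
    # is reported with TYPE="LVM2_member"
    blkid
}
```

Running `inspect_disk /dev/sda` and then `inspect_disk /dev/sdb` should make it obvious which disk is blank and which one carries the LVM physical volume.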

Wayne Workman

@george1421 To bring you up to speed, scroll down and look at the big photo he posted. In it, iPXE is saying: “Duplicate option 66 (next server) from DHCP proxy and DHCP server.”

This is the least of the problems, though. For some reason the iPXE boot script isn’t getting generated right by the FOG server.

dholtz-docbox

@Wayne-Workman : Oh! Right - that was the other concern of mine I have on my whiteboard. Thanks for bringing that up again. Should I just… delete /var/www/fog and call .foginstall again on the FOG Server? Or what would be the best way to invalidate this?

Wayne Workman

@dholtz-docbox Looking at a thread from a while ago here:
https://forums.fogproject.org/topic/8736/chainloading-failed

Assign an image to this host - it doesn’t need to be a legit image, just make one up in Image Management, and assign it. See if the chainloading error goes away or not.

dholtz-docbox

I also wanted to update you on this…

I believe some services needed to be restarted after upgrading to RC-14, or something needed to be restarted. But after looking at the DHCP issues and talking w/ George, I have queried it again, for EXIT mode, and now I get…

Edit> I removed the output. It was misleading, because I had the PXE menu enabled. George and I are talking, and when looking at the SANBOOT option, it tries to boot into 0x80 instead of 0x81, which would be the second drive - which is the drive I am looking to boot from.

My SANBOOT output is…

#!ipxe
set fog-ip 10.1.10.42
set fog-webroot fog
set boot-url http://${fog-ip}/${fog-webroot}
sanboot --no-describe --drive 0x80

The EXIT output is still the same, with the last line being exit.

Wayne Workman

Immediately after calling the link I gave you earlier, check the apache error logs for entries from the moment prior.
Web Interface -> FOG Configuration -> Log Viewer -> Apache Error

Sort by newest first, be aware of the timestamps. Copy/paste what you find.

dholtz-docbox

@Wayne-Workman : Nothing too interesting, just some shutdown/startup commands I made earlier today.

[Wed Oct 19 18:15:32.174284 2016] [core:notice] [pid 1564] AH00094: Command line: '/usr/sbin/apache2'
[Wed Oct 19 18:15:32.173684 2016] [mpm_prefork:notice] [pid 1564] AH00163: Apache/2.4.7 (Ubuntu) OpenSSL/1.0.1f configured -- resuming normal operations
[Wed Oct 19 18:15:13.109673 2016] [mpm_prefork:notice] [pid 4501] AH00169: caught SIGTERM, shutting down
[Wed Oct 19 18:12:39.854083 2016] [core:notice] [pid 4501] AH00094: Command line: '/usr/sbin/apache2'
[Wed Oct 19 18:12:39.853942 2016] [mpm_prefork:notice] [pid 4501] AH00163: Apache/2.4.7 (Ubuntu) OpenSSL/1.0.1f configured -- resuming normal operations
[Wed Oct 19 18:12:18.080349 2016] [mpm_prefork:notice] [pid 24123] AH00169: caught SIGTERM, shutting down

What George and I were discussing concerned the sanboot switches, in particular --drive 0x80. His thought was that whatever generates this line should be assigning 0x81 if I am trying to boot to /dev/sdb, which is the second drive. Are these drives numbered according to the order the BIOS lists them? Or is there any way to change which drive is mapped to 0x80?
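For reference, BIOS INT 13h numbers hard disks upward from 0x80, in the order the firmware enumerates them, so the second BIOS disk is 0x81. That order is not guaranteed to match Linux's /dev/sdX naming. The sketch below only illustrates the arithmetic and what the sanboot line would look like for a given disk index; the helper names `bios_drive` and `sanboot_line` are hypothetical, and it does not change what the FOG server actually generates.

```shell
#!/bin/sh
# Map a zero-based BIOS hard-disk index to its INT 13h drive number.
# The first hard disk is 0x80, the second 0x81, and so on.
bios_drive() {
    printf '0x%02x\n' $((0x80 + $1))
}

# Hypothetical helper: build the iPXE sanboot line for that disk,
# using the same switches that appear in the output above.
sanboot_line() {
    printf 'sanboot --no-describe --drive %s\n' "$(bios_drive "$1")"
}

sanboot_line 0   # first BIOS disk
sanboot_line 1   # second BIOS disk
```

Under the assumption that the BIOS lists the blank disk first, the second line is the one that would point at the Ubuntu disk.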

Wayne Workman

@Tom-Elliott the chainloading issue with the exit type in this thread is very similar to this thread:
https://forums.fogproject.org/topic/8736/chainloading-failed

Especially suspicious since it was worked on between 13 and 14 and is still having issues.

dholtz-docbox

@Wayne-Workman : Agreed! I am actually trying things from that topic as we speak.

Tom Elliott

@dholtz-docbox The issue here is totally different from the stuff in that thread.

Tom Elliott

I say this, because you’re seeing stuff populate after the boot-url portion. From what I can tell, it appears No menu is on?

george1421

@Tom-Elliott said in Host Startup; Booting into LVM Disk Fails:

I say this, because you’re seeing stuff populate after the boot-url portion. From what I can tell, it appears No menu is on?

Right, no menus were on. I had him turn it on and he posted the results in chat. If you ignore everywhere this thread has gone, the issue is that, by their specific design, his /dev/sda is empty (no partitions) and /dev/sdb has Ubuntu on it. If he changes the boot order in the BIOS so that it does not PXE boot, the target computer boots correctly, but if it boots through iPXE, sanboot tries to exit to the blank disk (/dev/sda). This is outside of FOG imaging so far. Right now he is testing whether his reference image will boot through iPXE to the OS.

dholtz-docbox

@Tom-Elliott : Yeah, I wasn’t having any luck trying anything from the other thread. I have been tossing around the question of why these drives are set up this way and what we can do to change it. In my head, is it as easy as just swapping the SATA cables? I am not sure why the drives are the way they are, but I would not mind changing it if I can. I just haven’t looked into it yet - it just sounds like a lot of issues would be resolved if the OS was on the first disk?

Wayne Workman

http://askubuntu.com/questions/371049/how-are-dev-sda-and-dev-sdb-chosen

According to that, you might want to remove one of the disks for Ubuntu’s installation only, then add it back afterwards and see how that goes. And also as that link suggests, use UUIDs instead of /dev/sdX, follow the documentation.

https://help.ubuntu.com/community/UsingUUID
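As a small illustration of the UUID approach, the sketch below formats an /etc/fstab line that mounts a filesystem by UUID instead of by /dev/sdX name. The helper name `fstab_entry`, the UUID, and the mount options are all made up for the example; on a real system the UUID would come from something like `blkid -s UUID -o value /dev/sdb1`. This mainly applies to plain partitions such as /boot, since logical volumes are typically referenced by their /dev/mapper path.

```shell
#!/bin/sh
# Hypothetical helper: print an fstab line that mounts by UUID.
# Arguments: UUID, mount point, filesystem type (supplied by the caller).
fstab_entry() {
    printf 'UUID=%s %s %s defaults 0 2\n' "$1" "$2" "$3"
}

# Example with a made-up UUID; the real value would be read with blkid.
fstab_entry "0a1b2c3d-1111-2222-3333-444455556666" /boot ext2
```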

george1421

@dholtz-docbox Outside of your design requirements, the one question in my mind is: if you did switch the cables (as a test), does it PXE boot properly to the OS? That way we can just focus on getting the proper exit to work and boot to the second hard drive. I know I’ve seen a way to change the boot drive in the past, but I can’t remember what I did until I can get my hands on a FOG server.

dholtz-docbox

I am looking to have someone open the enclosure so I can swap the cables - darn special screwdriver that I don’t have. Until then, I am looking at how /dev/sda gets assigned, and it looks like it comes down to udev rules? That’s what I am looking at right now. I know a lot of this is a bridge I was hoping to put off crossing, but I have been looking to set up the partitions more deliberately. I guess this is the start of understanding how some of this is handled.

george1421

@dholtz-docbox Remember, at this point in time you are NOT in any OS. When the iPXE menu exits you are still running the iPXE kernel. It knows nothing about your target OS. It’s just passing (or trying to pass) boot control off to some boot device.
