Cannot deploy images
-
Let’s start with some background information.
- What is the manufacturer and model of the hardware you are trying to PXE boot?
- Is the firmware in UEFI or BIOS (legacy) mode?
- What exactly do you have programmed for DHCP option 67 {boot-file}?
- Is this a new install, or new hardware that won’t boot?
I assume that you see it download bzImage (Linux OS) and init.xz (virtual hard drive) but then it doesn’t proceed from there.
Let’s start there and see where we go with the new information.
-
Just want to add one more question to the list: what kind (model and vendor) of Ethernet is this? Onboard or external USB NIC? Best if you can provide the outputs of
lspci -n
and
lsusb
PS: If I had to guess I’d say it’s one of these Realtek-based USB NICs…
-
@george1421
1. What is the manufacturer and model of the hardware you are trying to PXE boot? Dell Latitude 7240; I tried several machines.
2. Is the firmware in UEFI or BIOS (legacy) mode? Legacy.
3. What exactly do you have programmed for DHCP option 67 {boot-file}? undionly.kkpxe
The system was stable and working perfectly, suddenly it stopped.
Please watch the short video here
Thanks,
Tom
-
@tom said in Cannot deploy images:
The system was stable and working perfectly, suddenly it stopped.
What exactly do you mean by that?
What kind (model and vendor) of Ethernet is this? Onboard or external USB NIC? Best if you can provide the outputs of
lspci -n
and
lsusb
For that, boot a Linux live CD…
-
@tom I guess the first question I have is: is the firmware updated on the 7240s?
When you say it suddenly stopped, can you identify an event that happened between when it worked and when it didn’t (like upgrading FOG)?
I have the 7240s on my campus, so if needed I can go grab one and test. The only time I’ve personally seen this happen was with a Lenovo in UEFI mode: the system would download 69% of init.xz and then switch to a black screen and cursor.
Is this the only hardware showing this behavior, or is it all of it?
BTW: The video is perfect since we can see how you got to the error. Bravo.
-
@sebastian-roth
The system is a Latitude E7240 with an onboard Intel I218-LM NIC.
Attached is the result of lspci -n.
When I tried to run lsusb, I got “unable to initialize libusb: -99”.
-
@george1421
We have only Latitudes in our environment, and the issue repeats on the 7280 as well. All firmware is up to date.
I cannot tell what changed. I upgraded to 1.4.4 a while ago and it was working properly.
Is there any log I can look into?
Thanks,
Tom
-
@Tom OK, I was wrong on this one. It’s not one of these Realtek USB NICs, thank god! Searching the forum, I can see that many people have used this model, so it definitely works or has worked. Let’s see if George can replicate the issue or not.
-
@sebastian-roth OK, I can PXE boot these 7240s with no problem. I set the BIOS back to factory defaults, put the laptop in BIOS (legacy) mode, enabled PXE booting on the built-in network interface and then booted into FOG registration.
This is just for documentation purposes: this laptop has BIOS A21 with 4GB of RAM and a 128GB SSD.
-
@tom Can we say that all Latitudes have the issue, or that all computers have the issue?
While this is just a wild (strange) idea, do you have an unmanaged switch you can place between the PXE booting computer and the building network switch? This really doesn’t sound like a spanning tree issue, but it just might be. Those 7240s are pretty quick.
-
Actually, there is already an unmanaged switch in between.
Tom
-
@tom OK, watching your video frame by frame (well, almost), I see your FOG server is on 192.168.2.196 but your PXE client computer and DHCP server are on 10.141.32.x/24. Is this expected?
Unfortunately the video doesn’t show me where it’s getting bzImage from (it’s just off the top of the screen).
-
@george1421
Yes, the IPs are correct and expected.
-
@tom OK, just to rule out a routing/router issue, can you try moving the PXE booting client to the same subnet as the FOG server? I know I’m grasping at straws here, but it should be working… We have to find out where the issue isn’t at this point.
-
@george1421
That was not it. I put the laptop on the 192.168.2.x subnet and I got the same results.
Thanks,
Tom
-
@tom These next steps are shotgunning in several directions.
-
Do you have any other models we can try, like an older Latitude or even a desktop? I’m interested to see if we can get FOS to boot at all.
-
Let’s assume that FOS is corrupted. From the FOG server’s Linux console, navigate to
/var/www/html/fog/service/ipxe
and then let’s run the md5sum command and compare your images to mine.
md5sum bzImage
dfba0a3d8b0e8652857c69c45393465a
md5sum bzImage32
87d61b318f111434f880e0d4b500e539
md5sum init.xz
edbd5936f81144d1afb888f4b8712de5
-
If we put the PXE booting computer on the same subnet as the FOG server, we can use the FOG server to listen in on the PXE booting process of the target computer. (I’m not sure this one is going to tell us much, but then again your PXE booting process is not typical either.) Follow the instructions here: https://forums.fogproject.org/topic/9673/when-dhcp-pxe-booting-process-goes-bad-and-you-have-no-clue then upload the pcap to Google Drive or Dropbox and I’ll take a look at it. Either post the link here or send me an IM and I will review it off-line. Use the provided instructions, except use this tcpdump string:
tcpdump -w output.pcap port 67 or port 68 or port 69 or port 4011 or port 80
because I want to see what is going on with the init.xz download, which uses HTTP.
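The checksum comparison in the second step can also be scripted. Below is a minimal sketch, assuming a POSIX shell with md5sum available on the FOG server; check_md5 and FOS_DIR are hypothetical names made up for this example, not part of FOG. The reference hashes are the ones posted above, so they only apply if both servers run the same FOG release.

```shell
#!/bin/sh
# Hypothetical helper (not part of FOG): compare a file's MD5 checksum
# against a known-good reference value.
check_md5() {
    file="$1"
    expected="$2"
    if [ ! -f "$file" ]; then
        echo "MISSING  $file"
        return 2
    fi
    # md5sum prints "<hash>  <filename>"; keep only the hash field.
    actual=$(md5sum "$file" | cut -d ' ' -f 1)
    if [ "$actual" = "$expected" ]; then
        echo "OK       $file"
    else
        echo "MISMATCH $file (got $actual)"
        return 1
    fi
}

# Directory taken from the post above; override FOS_DIR if yours differs.
FOS_DIR="${FOS_DIR:-/var/www/html/fog/service/ipxe}"

# Reference hashes as posted in this thread.
failures=0
check_md5 "$FOS_DIR/bzImage"   dfba0a3d8b0e8652857c69c45393465a || failures=$((failures + 1))
check_md5 "$FOS_DIR/bzImage32" 87d61b318f111434f880e0d4b500e539 || failures=$((failures + 1))
check_md5 "$FOS_DIR/init.xz"   edbd5936f81144d1afb888f4b8712de5 || failures=$((failures + 1))
echo "$failures file(s) did not match the reference"
```

Three OK lines mean the FOS files match the reference copy; a MISMATCH or MISSING line would suggest the file was damaged or replaced.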
-
-
@george1421
Wait, I remember another situation just like this, where the OP was sending pxelinux.0 or any of the undionly.0 files as DHCP option 67 {boot-file} and it would run just fine but blow up while it was downloading init.xz. This sure seems to fit that situation.
-
@george1421
seems to be correct.
-
@tom OK, then we can rule out that something happened to the FOS image during or after the upgrade, since the checksums match.
-
@Tom Would you please also comment on George’s question about how other laptops/PCs behave when PXE booting? Is it only one single device or all of the 7240s? Please try other devices to see if you have a general issue or if it’s model-related in your case.