FOG imaging stopped working with DHCP error
-
Hello,
we have a FOG server for imaging, and it suddenly stopped working for all our notebooks.
You can boot into FOG and start the imaging task, but after the restart the following happens (see screenshot).
The FOG server is in the same subnet as the clients, and we can reach the FOG website.
FOG version: 1.5.8
Do you have any explanation for this behavior? All other clients work normally in this subnet and are getting a DHCP IP address.
Best regards
Nick
-
@nick Terminate all running tasks.
Set up another capture or deploy task (it doesn’t matter which), but before you hit the schedule task button, tick the debug checkbox.
PXE boot the target computer. After a few screens of text that you need to clear with the Enter key, you will be dropped to the FOS Linux command prompt.
Key in the following:
uname -a
ip a s
lspci -nn | grep -i -e net
Post the results here. Then we can work on the next steps. It does look like you have two network adapters in this computer. So it will be interesting to see what the commands produce.
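If typing the three commands one by one is awkward, they can be wrapped in a small helper. This is only a sketch: the function name collect_fos_diag and the default output path /tmp/fos-diag.txt are made up for illustration and are not part of FOG or FOS.

```shell
# Hypothetical helper (not part of FOG): run the three diagnostics from this
# post and store the output in one file so nothing scrolls off the screen.
collect_fos_diag() {
    out="${1:-/tmp/fos-diag.txt}"
    {
        echo "== uname -a =="
        uname -a 2>&1 || true
        echo "== ip a s =="
        ip a s 2>&1 || true
        echo "== lspci -nn | grep -i -e net =="
        # grep exits non-zero when nothing matches; do not treat that as fatal
        lspci -nn 2>&1 | grep -i -e net || true
    } > "$out"
    # Print the file name so the caller knows where to look
    echo "$out"
}
```

Run collect_fos_diag and then page through the result with cat /tmp/fos-diag.txt before taking the photo.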
-
@george1421 Helo George1421,
here is a Picture of the Commands.
Also i also want to mention that the PXE Boot takes very long (about 3 Minutes) before the Client starts to boot into FOG.
Also it only Boots into FOG Weboverlay only without an Deploy process, with one it dosent Boot like seen in the Screenshot in my Original Post.Kind Regards
Nick -
@nick OK, good, this output rules out about a dozen issues.
You have a new(ish) kernel. I checked the network adapter ID, and this network adapter has been supported since Linux version 4.12.
It’s strange that it’s not picking up an IP address.
While on the FOS Linux command line, I want you to key in:
/sbin/udhcpc -i enp0s31f6 --now
To see if it picked up an IP address, now issue the
ip a s
command like before. See if the ethernet adapter enp0s31f6 gets an IP address now that some time has passed. I think the slowness in PXE booting is all related to this and the cause is environmental, but let’s see.
We need to catch this in the broken state to find a solution.
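To put a number on "after time", the manual step above can be repeated in a loop that reports how long it took to get a lease. This is a rough sketch, not something FOS ships: the function names and the UDHCPC variable are invented here, and the extra -q flag (BusyBox udhcpc: exit after obtaining a lease) is added so the loop does not leave clients running.

```shell
# Path to the DHCP client; overridable so the sketch can be tried elsewhere.
UDHCPC="${UDHCPC:-/sbin/udhcpc}"

# Return 0 if the interface currently has an IPv4 address.
# 'inet ' (with the trailing space) does not match 'inet6'.
iface_has_ipv4() {
    ip addr show dev "$1" 2>/dev/null | grep -q 'inet '
}

# Hypothetical helper: retry DHCP up to max_tries times and report the
# elapsed seconds until the interface has an address.
wait_for_lease() {
    iface="$1"
    max_tries="${2:-5}"
    start=$(date +%s)
    try=1
    while [ "$try" -le "$max_tries" ]; do
        # --now: give up if no lease is obtained; -q: exit once a lease is obtained
        "$UDHCPC" -i "$iface" --now -q >/dev/null 2>&1
        if iface_has_ipv4 "$iface"; then
            echo "lease on $iface after $(( $(date +%s) - start ))s (attempt $try)"
            return 0
        fi
        try=$((try + 1))
    done
    echo "no lease on $iface after $max_tries attempts"
    return 1
}
```

Usage would be wait_for_lease enp0s31f6 right after the debug prompt appears. If the lease only arrives after several attempts, that points to a delay on the network side rather than a driver problem.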
EDIT
Oh wait, I see you have kernel 5.6.18. Let’s also have you update to the latest 5.10.xx. A lot of new hardware support was added in the 5.10.x branch. I don’t think that is your issue here, but let’s rule out an older kernel too.
-
@george1421 Thanks for the fast reply.
Here is the photo after entering the commands.
I will start the kernel update later today, but after the command the client got an IP address and a DNS server.
-
@nick OK, so we can say that time fixes the issue with getting an IP address. In the second picture the ethernet adapter did not get an IP address, but when you keyed in the same command it picked one up. So the only thing that changed is time??
What I want you to do now is find one of those cheap 5 or 8 port unmanaged switches. Place it between the PXE booting computer and the building enterprise switch. Now see if the target computer gets an IP address every time when PXE booting normally.
If an unmanaged switch fixes the issue, then have your networking folks look at the enterprise switch and make sure that fast-STP, RSTP, or port-fast (or whatever your switch manufacturer calls it) is enabled.
-
@george1421 So the unmanaged switch fixed the problem with the IP address.
I checked the STP mode on our enterprise switches (Cisco C3560G) and it is on Rapid PVST, so rapid spanning tree.
Can there be another reason why the client doesn’t get an IP address?
Also, do you have a kernel update guide somewhere?
The update on the FOG server will be done by a colleague, and a guide would be helpful.
Thanks in advance
-
@nick said in FOG imaging stopped working with DHCP error:
I checked the STP mode on our enterprise switches (Cisco C3560G) and it is on Rapid PVST, so rapid spanning tree.
Can there be another reason why the client doesn’t get an IP address?
I don’t know Cisco well enough to say whether Rapid PVST is the same thing as portfast or not. But if a dumb switch fixes the problem consistently, then, from what we’ve seen before, standard spanning tree is being used. Standard spanning tree listens for a BPDU for 27 seconds before it starts forwarding data, whereas RSTP starts forwarding right away while it listens for a BPDU packet (loop detection). Dumb/unmanaged switches don’t typically do spanning tree, so they keep the enterprise switch port in forwarding mode while the target computer boots up. Not only will this cause a problem with FOS Linux, but PXE booting will also be delayed for at least 27 seconds while spanning tree has the port in a hold state.
As for the kernel update: from the FOG web UI -> FOG Configuration -> Kernel Update. Then download the latest 32-bit and 64-bit versions.
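After the download it can be worth confirming on the FOG server that both kernel files are actually in place. The sketch below assumes the kernels are named bzImage and bzImage32 and that the directory is passed in by the caller. The path in the usage line is where they commonly live on Debian-based installs, but treat that as an assumption and check your own setup. The helper name is made up.

```shell
# Hypothetical helper: show the 64-bit and 32-bit FOS kernels in a directory.
# Uses 'file' (which prints the embedded kernel version) when it is installed,
# and falls back to 'ls -l' otherwise.
list_fog_kernels() {
    dir="$1"
    for k in "$dir/bzImage" "$dir/bzImage32"; do
        if [ -f "$k" ]; then
            if command -v file >/dev/null 2>&1; then
                file "$k"
            else
                ls -l "$k"
            fi
        else
            echo "missing: $k"
        fi
    done
}
```

For example: list_fog_kernels /var/www/html/fog/service/ipxe (assumed path; on other distributions it may be /var/www/fog/service/ipxe).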
-
@george1421 Thanks for the Answer, we updated the Kernel today.
We also now have a new error with iPXE on AMD mainboards/CPUs. Can we discuss this in this thread, or should I make a new one for it?
My assumption is that there is a problem with the DHCP boot file.
-
@nick Since we had STP issues before, test it again with the unmanaged switch and see if the problem is resolved. This is also a symptom of standard STP. If that doesn’t solve it, then we can dig deeper by changing the iPXE boot loader from ipxe.efi to snponly.efi or recompiling the latest version of iPXE.
-
@george1421 So we tested over an unmanaged switch and this was the result. So I think we need to dig deeper into the iPXE boot loader.
-
@nick So the next thing I would do is update the firmware on the motherboard, just to be sure that isn’t it.
Then try the snponly.efi boot loader. You will need to update your DHCP server for this. The snponly.efi boot loader uses the driver built into the UEFI NIC. It’s akin to undionly.kpxe for BIOS. The ipxe.efi boot loader is more like a Linux kernel in that it has the common and known drivers built in, most but not all.
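Before changing the boot file name on the DHCP server, it helps to confirm that snponly.efi is really available from the FOG server. A minimal sketch, assuming the TFTP root is /tftpboot (the usual FOG default) and that curl on the testing machine was built with TFTP support. Both function names are invented for this example.

```shell
# Return 0 if the named boot file exists and is non-empty in the TFTP root.
# Run this on the FOG server itself.
bootfile_present() {
    [ -s "$1/$2" ]
}

# Return 0 if the file can be downloaded over TFTP from the given server.
# Run this from another machine on the client subnet.
bootfile_fetchable() {
    curl --silent --fail --max-time 10 --output /dev/null "tftp://$1/$2"
}
```

For example: bootfile_present /tftpboot snponly.efi && echo "file is there", and from a second machine bootfile_fetchable 192.0.2.10 snponly.efi && echo "TFTP works" (192.0.2.10 is a placeholder for the FOG server address).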
If that fails, then here is how to build the latest version of iPXE. Note the build number on your current version of iPXE. It’s the hex characters in the brackets after the version number. https://forums.fogproject.org/topic/15826/updating-compiling-the-latest-version-of-ipxe
If this doesn’t work, then we will need to get Wireshark on a witness computer (a second computer on the same subnet as the PXE booting computer). This is a bit more involved, so let’s start with the easier stuff first.