boot.php........Connection timed out
-
@george1421 It doesn’t make sense to me either. We don’t have a proxy, and we currently have no issue imaging the Dell All-In-One 7440, even across VLANs. The problem is only with the Dell All-In-One 7450 and 7460, which imaged fine during the summer. I’m thinking it’s some type of filter by hardware, maybe a Palo Alto setting. Sorry, I’m not an expert in those realms.
-
@george1421 Also, to answer your question, it doesn’t reach the iPXE menu. Based on the default.ipxe file, perhaps the MAC is not binding? Is IPv6 also a possible issue?
#!ipxe
cpuid --ext 29 && set arch x86_64 || set arch i386
params
param mac0 ${net0/mac}
param arch ${arch}
param platform ${platform}
param product ${product}
param manufacturer ${product}
param ipxever ${version}
param filename ${filename}
isset ${net1/mac} && param mac1 ${net1/mac} || goto bootme
isset ${net2/mac} && param mac2 ${net2/mac} || goto bootme
:bootme
chain http://10.20.164.93/fog/service/ipxe/boot.php##params
-
Well, if it’s downloading default.ipxe, that tells me the iPXE boot loader is working and that the subnet can communicate with the FOG server to (1) download the iPXE boot loader and (2) transfer default.ipxe. Where it’s failing is when it talks to the Apache server over HTTP.
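To separate "HTTP is blocked" from "Apache is down", a plain TCP connect to port 80 from a machine on the affected subnet can help. The following is only a minimal sketch: `probe_tcp` is a hypothetical helper name, the 3 second timeout is an arbitrary choice, and the FOG server address is the one from the `chain` line above.

```python
import socket


def probe_tcp(host: str, port: int = 80, timeout: float = 3.0) -> str:
    """Try one TCP connect and classify the outcome.

    Returns 'open', 'refused', 'timeout', or 'error: <details>'.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "open"
    except ConnectionRefusedError:
        # The host answered, but nothing is listening on that port.
        return "refused"
    except socket.timeout:
        # No answer at all within the timeout, e.g. packets silently dropped.
        return "timeout"
    except OSError as exc:
        # Anything else, such as "no route to host".
        return f"error: {exc}"


# Example call (address taken from the default.ipxe chain line):
# print(probe_tcp("10.20.164.93", 80))
```

A `timeout` result from the client subnet alongside `open` from the server's own subnet would point at something in the path dropping the traffic rather than at Apache itself.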
The 7440 AIOs are working but the 7450 and 7460 AIOs are not. Since you mentioned “can no longer image”, we can assume that they did image at one time.
What version of FOG are you using?
Many years ago I used to work in an industrial plant. We had a saying: “When things just ‘magically’ stop working, follow the forklift tracks to the problem.” Can your networking folks think of any reason why HTTP might not be working from these devices?
Is the FOG server on the same subnet as the target (PXE booting) computers? If not, what happens if you put a PXE booting computer on the same subnet as the FOG server? The idea here is to test whether the FOG server or the network is at fault. If the PXE booting computer works on the same subnet as the FOG server, then it’s (maybe) not the FOG server.
Did the networking folks change any settings on the FOG server, like enabling a firewall or such?
-
@george1421 We are running 1.3.5 and they didn’t modify FOG itself. I’m leaning heavily toward the FOG server NOT being the issue, because the 7450 and 7460 did image properly and we did over a thousand of them. Now, after network changes, those 2 models no longer image, whether on the same or a different subnet. The 7440 works on both the same and a different subnet. I’m really looking for guidance as to what, outside of FOG, can cause this type of filtering: allowing one PC model to reach FOG over HTTP yet not the other.
I have packet captures if needed for further scrutiny.
-
@cmachado This IS an interesting issue, because it happens on the same subnet and is model specific.
Did they implement some kind of NAC/NAP/802.1x authorization? (Understand I have to guess here, because what you are telling me isn’t possible, or at least isn’t “simple to implement”.)
For a packet capture you would have to be on a mirrored port of the PXE booting client. The TFTP and HTTP requests are unicast, so you can’t pick that traffic up on a witness computer; the capture has to be listening in the data stream.
You are getting a connection timeout. So the client is trying to reach the FOG server but the FOG server isn’t responding to the client. Do the Apache logs show the client talking to the FOG server during that time, at least requesting data with the FOG server not responding? Look in /var/log; depending on the target OS, the logs may be in the apache2 or httpd directory there. There should be an access log as well as an error log.
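One way to check the access log for a single client is sketched below. It assumes Apache's default common/combined log format, where the first field on each line is the remote host. `requests_from` is a hypothetical helper, and the log path in the comment is only a guess that depends on the distribution.

```python
from typing import Iterable, List


def requests_from(lines: Iterable[str], client_ip: str) -> List[str]:
    """Return the access-log lines whose first field equals client_ip."""
    hits = []
    for line in lines:
        fields = line.split(maxsplit=1)
        # Compare the whole first field so 10.20.162.6 does not match 10.20.162.62.
        if fields and fields[0] == client_ip:
            hits.append(line.rstrip("\n"))
    return hits


# Example call (path is an assumption; it may be /var/log/httpd/access_log instead):
# with open("/var/log/apache2/access.log") as log:
#     for hit in requests_from(log, "10.20.162.62"):
#         print(hit)
```

If this returns nothing for the failing client during a boot attempt, the request most likely never reached Apache.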
The Web UI works as expected without any errors?
For the packet capture, you can do it with tcpdump on the FOG server. I would create a filter that matches UDP port 69 or TCP port 80, restricted to the host IP address of the PXE booting computer. That will give you the TFTP transfer of undionly.kpxe as well as any HTTP requests from the target computer. We want to filter on the IP address of the target computer so we only see the traffic of the computer of interest. That way, if the server is busy with lots of FOG Client traffic, we can filter out that data.
Again, I’m a bit all over the place on this since I’m trying to think how this is possible.
-
@george1421 Web UI is good. No issues there. I followed this link to perform the packet capture: https://forums.fogproject.org/topic/9673/when-dhcp-pxe-booting-process-goes-bad-and-you-have-no-clue
I attached 2 files, Output-7440 (successful) and Output-7450 (unsuccessful), both from same subnets.
I’ll find out about the NAC/NAP/802.1x authorization.
-
@cmachado The filter in that article was for detecting PXE boot issues (i.e. delivering the boot loader to the target computer). We will need to use a different one since the problem is past PXE booting.
I had to look this one up myself, but this filter should give us what we need.
tcpdump -w output.pcap "(port 69 or port 80) and host 10.20.162.62"
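If the capture has to be repeated for several clients, the filter string can be generated rather than retyped. This is just a sketch under assumptions: `build_capture_filter` and `tcpdump_argv` are hypothetical helper names, and the only tcpdump option used is the `-w` already shown in the command above.

```python
import ipaddress
from typing import List, Sequence


def build_capture_filter(client_ip: str, ports: Sequence[int] = (69, 80)) -> str:
    """Build a capture filter: (port A or port B ...) and host <client_ip>."""
    # Raises ValueError on a mistyped address before tcpdump is ever started.
    ipaddress.ip_address(client_ip)
    port_expr = " or ".join(f"port {p}" for p in ports)
    return f"({port_expr}) and host {client_ip}"


def tcpdump_argv(client_ip: str, outfile: str = "output.pcap") -> List[str]:
    """Return the argument list for the tcpdump command shown above."""
    return ["tcpdump", "-w", outfile, build_capture_filter(client_ip)]


# Example call (needs root and tcpdump installed on the FOG server):
# import subprocess
# subprocess.run(tcpdump_argv("10.20.162.62"), check=True)
```

Passing the filter as a single list element avoids shell quoting problems with the parentheses.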
I see the 7040 and 7050 are both UEFI-based systems, so that rules out another issue I had on my list.
-
@cmachado From the PCAPs it looks like the machines are on different subnets - unless you use one really big 10.x.x.x subnet. So I would expect a router to be doing the filtering. Can you move one of the 7450 (unsuccessful) machines to the subnet where you have a 7440 (successful) and see if it loads the iPXE menu? That would rule out many things easily.
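Whether two addresses share a subnet depends entirely on the netmask, which this thread does not state. A rough sketch of the check follows. `same_subnet` is a hypothetical helper, the two addresses are the ones that appear earlier in the thread, and the /24 and /8 prefix lengths are assumptions for illustration only.

```python
import ipaddress


def same_subnet(ip_a: str, ip_b: str, prefix_len: int) -> bool:
    """Return True if both addresses fall in the same network for prefix_len."""
    net_a = ipaddress.ip_interface(f"{ip_a}/{prefix_len}").network
    net_b = ipaddress.ip_interface(f"{ip_b}/{prefix_len}").network
    return net_a == net_b


# Assumed /24 netmask: 10.20.164.93 and 10.20.162.62 are on different subnets.
print(same_subnet("10.20.164.93", "10.20.162.62", 24))
# Assumed "one really big" /8 netmask: both addresses are on the same subnet.
print(same_subnet("10.20.164.93", "10.20.162.62", 8))
```

With a /24 the check prints False, so traffic between the two would pass through a router; with a /8 it prints True.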
-
@george1421 OK. Got it. File attached.
-
@Sebastian-Roth Wow! Yes it works if in same subnet 164.
-
@Sebastian-Roth Looks like we have enough info to solve. I’ll keep you guys posted. Thanks again fellas!
-
@cmachado said in boot.php........Connection timed out:
Wow! Yes it works if in same subnet 164.
Same subnet as the FOG server, or the 7050 on the same subnet as the 7040?
-
@cmachado said in boot.php........Connection timed out:
@Sebastian-Roth Looks like we have enough info to solve. I’ll keep you guys posted. Thanks again fellas!
Just to complete the picture, FOG uses the HTTP, TFTP, FTP, and NFS protocols for imaging.
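For anyone reviewing firewall or router rules, the four protocols can be turned into a short checklist. The sketch below uses only the well-known default ports; a given install may differ, and FTP and NFS normally need extra ports (FTP data connections, rpcbind on 111) that are not listed here. `FOG_IMAGING_PORTS` and `firewall_checklist` are hypothetical names.

```python
from typing import Dict, List, Tuple

# Well-known default ports only; an assumption, not read from any FOG config.
FOG_IMAGING_PORTS: Dict[str, Tuple[str, int]] = {
    "http": ("tcp", 80),
    "tftp": ("udp", 69),
    "ftp": ("tcp", 21),
    "nfs": ("tcp", 2049),
}


def firewall_checklist(server_ip: str) -> List[str]:
    """Return one human-readable line per protocol to verify on the path."""
    return [
        f"{name}: allow {proto}/{port} to {server_ip}"
        for name, (proto, port) in FOG_IMAGING_PORTS.items()
    ]


# Example call:
# for item in firewall_checklist("10.20.164.93"):
#     print(item)
```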
-
@cmachado said in boot.php........Connection timed out:
Only issue that I can see is with fogreplicator. Keep getting this error and the replication keeps overwriting img files.
Would you mind opening a new topic on this issue? We try to keep things sorted so other people find answers more easily. Thanks in advance!
-
@Sebastian-Roth OK. No problem.