Hosts are looking for tftp server.
-
@george1421 I recently reinstalled the server with HTTPS.
The first time I installed it I entered the DNS IP for the server, but this time I just skipped it. Maybe this is the problem. Our DNS server is also the DHCP server.
-
@marted As I see it right now this is NOT a FOG problem. You are not even getting to FOG yet. That holds as long as the FOG server is NOT your DHCP server. There isn’t a FOG configuration that would cause what I’m seeing so far. I’m not saying this to redirect any fault here. It’s just that at this stage the communication is between the target computer and your DHCP server.
-
@george1421 When a host boots I see it get an IP from DHCP. I see the DHCP address, I see the FOG server listed as the proxy DHCP, and everything is fine up to the "tftp server:" prompt. If I type in the TFTP server, it boots into the FOG menu. As I said, 25 hosts boot and 10 to 15 hosts ask for the TFTP server, BUT they already have an IP and are ready; only the TFTP server is missing. The next time I send a task, other hosts ask for the TFTP server. It is always different hosts.
-
@marted Well, let’s start tomorrow by fixing the dnsmasq settings, then grab another pcap of the PXE boot process. Let’s make new assumptions based on the correct (and well tested) dnsmasq file.
There ARE tweaks we can make to the dnsmasq configuration file to cover certain circumstances.
Like in this section
pxe-service=X86PC, "Boot to FOG", undionly.kpxe
pxe-service=X86-64_EFI, "Boot to FOG UEFI", ipxe.efi
pxe-service=BC_EFI, "Boot to FOG UEFI PXE-BC", ipxe.efi
we can specify the boot server in the services line. It would look like this
pxe-service=X86PC, "Boot to FOG", undionly.kpxe, 192.168.149.43
pxe-service=X86-64_EFI, "Boot to FOG UEFI", ipxe.efi, 192.168.149.43
pxe-service=BC_EFI, "Boot to FOG UEFI PXE-BC", ipxe.efi, 192.168.149.43
It’s not typical that we need to do that, but in certain environments it is necessary.
-
@george1421 Thank you so much. I’ll fix that tomorrow and will post the results. Thanks again!
-
@george1421 said in Hosts are looking for tftp server.:
I still don’t understand why we don’t see an offer packet from your main dhcp server.
Because the capture was taken on the FOG server (I guess) and the DHCP offers and ACKs are not broadcast but sent directly (unicast MAC) to the client.
-
@Sebastian-Roth Please tell me how to run the test with Wireshark to see the actual situation. Thanks
-
@marted For a single client you could use a monitoring port on the switch or connect it to a hub to capture the traffic. But it’s quite a task to do and you still don’t get the full truth. You’d need to capture on the DHCP server to get all the packets. But make sure you do filter on capture or later on using display filters and export to a new PCAP so we don’t have all your network traffic in it.
Capture filter:
port 67 or port 68 or port 4011
On the other hand you won’t see the TFTP requests on the FOG server this way.
-
@george1421 @Sebastian-Roth I changed the options in dnsmasq and restarted, and nothing changed. There are always 5 to 10 different hosts that ask for the TFTP server after getting an IP from the University's DHCP.
-
@marted Are these 5-10 hosts all on the same subnet as the fog server? There is something going on here that isn’t apparent.
-
@george1421 Yes, all of them. The next time, when I shut the hosts down and started them with wake on LAN, another 10-15 asked for the TFTP server, always in the same room of 25 hosts.
-
@george1421 if I just press enter without entering tftp server it gives this
Same host on the next boot
-
@george1421 Now one more thing: when I boot manually, host by host, with F12, every host boots correctly with no problems. The problem only comes when I try to boot them all with a task and wake on LAN. I have the impression that there is a limit on the number of hosts that can connect to the TFTP server at the same time. This is a new, very fast model with an 8th-gen i7 and a 1 TB SSD, and it boots in a few seconds.
-
@marted Well let me say that iPXE is working exactly as it was programmed to do. If it doesn’t receive pxe boot information from either the dhcp server or a proxydhcp server then it will prompt the user.
ref: https://github.com/FOGProject/fogproject/blob/master/src/ipxe/src/ipxescript
This is a dhcp (proxydhcp) issue and not anything to do with tftp. If it was a tftp issue the iPXE boot loader would not be running on the target computer asking for a boot server.
Along the lines of a random dhcp issue, that can come from having two or more dhcp servers on your campus that have different configurations for the subnets, where the first dhcp server that responds wins the election. Now that a proxydhcp server is involved, if the proxydhcp server doesn’t respond in time, or responds too late, the client will not use (or have) any pxe boot information.
I’ll ask the question again: is the computer that is showing this random ask for tftp server on the same subnet as the fog server? If so then there is something wrong with the proxy dhcp process, because since it’s on the same subnet as the pxe booting computer it should hear the discover every time (the first pcap was showing that). What I did not see in the first pcap was the main dhcp server responding. Based on what I’m seeing in the pcap I would say the main dhcp is either responding sluggishly or random dhcp servers are in play.
If the target computer is on a different subnet, then you will need to load wireshark on a witness computer with the capture filters that Sebastian provided. This will only allow us to see the dhcp process, but at least we can see what actors are involved here.
IMO the issue at the moment is a network infrastructure one and not anything to do with FOG, other than we need network booting to work to get FOG to work. Since we don’t know your networking infrastructure we can only make suggestions where to look based on our experiences and intimately knowing how FOG works.
-
@george1421 @Sebastian-Roth You’re right. This is an issue with dnsmasq (the DHCP proxy server), not FOG. If you want, move the topic to another place.
Dnsmasq is not able to handle many requests at once. In all the tests I ran yesterday, I found that with up to 10-12 computers at a time there are no issues. As I said earlier in my posts, the problem occurs ONLY with this new model we have, because they boot simultaneously and, I guess, almost all of them "ask" the proxy DHCP for information at the same time. As you said, if the proxy cannot handle the request for a host, that host moves on to the next DHCP server in the network, and because we don’t have a third DHCP server in the network, it returns to the main DHCP server (the University's DHCP). We see this request in the Wireshark file as a request to the exit IP 192.168.148.1 and an answer from it.
Now the question is how to fix this situation. In this closed private network we have 10 rooms, each with 25 computers, all of them (250) spread over 4 subnets: 192.168.148.0, 192.168.149.0, 192.168.150.0 and 192.168.151.0. The FOG server is a virtual server fixed at 192.168.149.43 and configured on our private switch in the lab as an IP helper (DHCP proxy). For almost 5 months now we have had no issues booting with FOG. As I said, this is the first time we have a problem like this, simply because in the other rooms, with the old models, when I send a task to 25 hosts they don’t wake on LAN at exactly the same time, and so they don’t "ask" the proxy DHCP for information at the same time. The new model hosts, as far as I can see, do.
My questions (I am just asking; I don’t know whether these are the right questions or not):
Is it possible to setup the dnsmasq to handle requests one at a time and like this to be able to proceed all requests?
Can we have second port open to handle part of the requests?
or second dnsmasq on the same server?
or second server only with dnsmasq installed which will transfer only the information which leads to the real FOG server?
or getting (installing) a better network card?
If you have some other suggestion I am open to listen.
I know it is always possible just to boot the hosts one at a time with F12 and it will work 100%, or to make small groups of 5-10 hosts for this model, but I very much like the way FOG can handle many hosts at a time.
Another thing: I turn off all hosts in the evening, and when I wake them on LAN room by room in the morning, in this room alone I have to go and reboot them again manually or enter the TFTP server info.
I hope to find some solution!
Thanks again for all your help
-
@marted That’s an interesting one. From what you describe it really sounds as if dnsmasq is not able to serve all of them at the same time. If that’s the case we should be able to see this in the logs. First figure out which log file is used:
grep "dnsmasq" /var/log/messages /var/log/syslog /var/log/daemon.log
Depending on the Linux OS you have the logging might be in a different file. When you have found it schedule a deploy task for those Dell AIO 7470 hosts and run
tail -f /var/log/syslog | grep --line-buffered "dnsmasq" | tee /tmp/dnsmasq.log
to see all the log messages coming in live as well as save those to a separate log file in /tmp/dnsmasq.log. Together with a list of MAC addresses of the Dell AIO 7470 hosts and the log file you should be able to see which one got the PXE/TFTP information and which didn’t on that run. Maybe there are hints in the log that one was skipped. Not sure. Upload the log file here if you need help with finding anything in it.
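If comparing the MAC list against the log by hand gets tedious, here is a minimal sketch of how it could be scripted. It assumes the log lines look like the dnsmasq-dhcp excerpt posted in this thread, where a proxy reply shows up as a "PXE(<interface>) <MAC> proxy" line. The function names and the sample text are made up for illustration, and a missing line only means dnsmasq did not log a reply, not a proof of what the client saw.

```python
import re

# Proxy reply line as it appears in this thread's log excerpt, e.g.
# "... dnsmasq-dhcp[744]: 1635745377 PXE(ens32) 00:4e:01:c6:36:08 proxy"
PROXY_LINE = re.compile(
    r"dnsmasq-dhcp\[\d+\]:\s+\d+\s+PXE\(\S+?\)\s+([0-9A-Fa-f:]{17})\s+proxy"
)


def macs_answered(log_text):
    """Return the set of MACs (lower case) with a logged proxy reply."""
    return {m.group(1).lower() for m in PROXY_LINE.finditer(log_text)}


def split_hosts(log_text, macs):
    """Split the given MACs into (answered, not answered) lists."""
    answered = macs_answered(log_text)
    wanted = [mac.lower() for mac in macs]
    got = [mac for mac in wanted if mac in answered]
    missed = [mac for mac in wanted if mac not in answered]
    return got, missed


# Hypothetical sample: one real-looking proxy line, nothing for the second MAC.
SAMPLE_LOG = """\
Mar 9 12:55:48 foglabunix dnsmasq-dhcp[744]: 1635745377 user class: iPXE
Mar 9 12:55:48 foglabunix dnsmasq-dhcp[744]: 1635745377 PXE(ens32) 00:4e:01:c6:36:08 proxy
Mar 9 12:55:48 foglabunix dnsmasq-dhcp[744]: 1635745377 bootfile name: ipxe.efi
"""

if __name__ == "__main__":
    got, missed = split_hosts(
        SAMPLE_LOG, ["00:4e:01:c6:36:08", "00:4e:01:c5:f4:67"]
    )
    print("proxy reply logged:", got)
    print("no proxy reply logged:", missed)
```

To use it on a real run, read /tmp/dnsmasq.log into a string and pass in the MAC addresses of the hosts that were part of the task.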
-
@Sebastian-Roth I got the log file dnsmasq.log
These are the MAC addresses which asked for the TFTP server:
00:4e:01:c5:f4:67
00:4e:01:c5:fa:98
00:4e:01:c5:e7:c4
00:4e:01:c5:a5:9a
-
@marted Well done!
First thing I notice is that we see pretty much every request coming in twice in the logs. Makes me wonder if this might confuse the clients as they probably get two responses from that as well. Probably these duplicated messages come from the IP helper?!
Though it’s interesting you get a 100% success rate on PXE booting when it’s not a multicast.
As well further down in the log we see it repeat the same log messages three times before it goes on to actually send out the information:
Mar 9 12:55:47 foglabunix dnsmasq-dhcp[744]: 1635745377 available DHCP subnet: 192.168.149.43/255.255.252.0
Mar 9 12:55:47 foglabunix dnsmasq-dhcp[744]: 1635745377 vendor class: PXEClient:Arch:00007:UNDI:003010
Mar 9 12:55:47 foglabunix dnsmasq-dhcp[744]: 1635745377 user class: iPXE
Mar 9 12:55:47 foglabunix dnsmasq-dhcp[744]: 1635745377 available DHCP subnet: 192.168.149.43/255.255.252.0
Mar 9 12:55:47 foglabunix dnsmasq-dhcp[744]: 1635745377 vendor class: PXEClient:Arch:00007:UNDI:003010
Mar 9 12:55:47 foglabunix dnsmasq-dhcp[744]: 1635745377 user class: iPXE
Mar 9 12:55:48 foglabunix dnsmasq-dhcp[744]: 1635745377 available DHCP subnet: 192.168.149.43/255.255.252.0
Mar 9 12:55:48 foglabunix dnsmasq-dhcp[744]: 1635745377 vendor class: PXEClient:Arch:00007:UNDI:003010
Mar 9 12:55:48 foglabunix dnsmasq-dhcp[744]: 1635745377 user class: iPXE
Mar 9 12:55:48 foglabunix dnsmasq-dhcp[744]: 1635745377 PXE(ens32) 00:4e:01:c6:36:08 proxy
Mar 9 12:55:48 foglabunix dnsmasq-dhcp[744]: 1635745377 tags: UEFI, ens32
Mar 9 12:55:48 foglabunix dnsmasq-dhcp[744]: 1635745377 bootfile name: ipxe.efi
Mar 9 12:55:48 foglabunix dnsmasq-dhcp[744]: 1635745377 server name: 192.168.149.43
Mar 9 12:55:48 foglabunix dnsmasq-dhcp[744]: 1635745377 next server: 192.168.149.43
Mar 9 12:55:48 foglabunix dnsmasq-dhcp[744]: 1635745377 sent size: 1 option: 53 message-type 5
Mar 9 12:55:48 foglabunix dnsmasq-dhcp[744]: 1635745377 sent size: 4 option: 54 server-identifier 192.168.149.43
Mar 9 12:55:48 foglabunix dnsmasq-dhcp[744]: 1635745377 sent size: 9 option: 60 vendor-class 50:58:45:43:6c:69:65:6e:74
Mar 9 12:55:48 foglabunix dnsmasq-dhcp[744]: 1635745377 sent size: 17 option: 97 client-machine-id ...
See if you can figure out why all the DHCP messages seem to be duplicates in your network. This might be the key. Not sure though but it’s still worth looking at and fixing it.
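To get a quick overview of how often messages repeat, something like the following could help. It is only a sketch under the assumption that the log format matches the excerpt in this thread (a transaction number followed by the message text after "dnsmasq-dhcp[pid]:"). The function name and sample are illustrative. Note that a repeat count alone cannot tell whether the copies come from a relay or IP helper or from the client simply retrying with the same transaction number.

```python
import re
from collections import Counter

# "<date> <host> dnsmasq-dhcp[pid]: <transaction number> <message>"
DHCP_LINE = re.compile(r"dnsmasq-dhcp\[\d+\]:\s+(\d+)\s+(.*)$")


def repeated_messages(log_text):
    """Map (transaction number, message) to its count, repeats only."""
    counts = Counter()
    for line in log_text.splitlines():
        m = DHCP_LINE.search(line.strip())
        if m:
            counts[(m.group(1), m.group(2))] += 1
    return {key: n for key, n in counts.items() if n > 1}


# Sample built from lines of the excerpt posted in this thread.
SAMPLE_LOG = """\
Mar 9 12:55:47 foglabunix dnsmasq-dhcp[744]: 1635745377 vendor class: PXEClient:Arch:00007:UNDI:003010
Mar 9 12:55:47 foglabunix dnsmasq-dhcp[744]: 1635745377 user class: iPXE
Mar 9 12:55:47 foglabunix dnsmasq-dhcp[744]: 1635745377 vendor class: PXEClient:Arch:00007:UNDI:003010
Mar 9 12:55:47 foglabunix dnsmasq-dhcp[744]: 1635745377 user class: iPXE
Mar 9 12:55:48 foglabunix dnsmasq-dhcp[744]: 1635745377 vendor class: PXEClient:Arch:00007:UNDI:003010
Mar 9 12:55:48 foglabunix dnsmasq-dhcp[744]: 1635745377 user class: iPXE
Mar 9 12:55:48 foglabunix dnsmasq-dhcp[744]: 1635745377 PXE(ens32) 00:4e:01:c6:36:08 proxy
"""

if __name__ == "__main__":
    for (xid, message), n in repeated_messages(SAMPLE_LOG).items():
        print(f"{n}x  {xid}  {message}")
```

Comparing the repeat counts of hosts that booted fine with those that asked for the TFTP server might show whether the duplicates matter at all.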
-
@Sebastian-Roth said in Hosts are looking for tftp server.:
grep “dnsmasq” /var/log/
I have just looked at the tftpd log and something seems wrong. Note the times: I tested twice today.
root@foglabunix:/var/log# systemctl status tftpd-hpa
● tftpd-hpa.service - LSB: HPA's tftp server
   Loaded: loaded (/etc/init.d/tftpd-hpa; generated)
   Active: active (running) since Mon 2020-03-09 12:13:55 EDT; 2h 2min ago
     Docs: man:systemd-sysv-generator(8)
  Process: 1473 ExecStart=/etc/init.d/tftpd-hpa start (code=exited, status=0/SUCCESS)
    Tasks: 1 (limit: 4915)
   CGroup: /system.slice/tftpd-hpa.service
           └─1509 /usr/sbin/in.tftpd --listen --user root --address :69 -s /tftpboot
Mar 09 12:55:51 foglabunix in.tftpd[3843]: tftp: client does not accept options
Mar 09 12:55:51 foglabunix in.tftpd[3845]: tftp: client does not accept options
Mar 09 12:55:51 foglabunix in.tftpd[3849]: tftp: client does not accept options
Mar 09 12:55:51 foglabunix in.tftpd[3851]: tftp: client does not accept options
Mar 09 12:55:51 foglabunix in.tftpd[3853]: tftp: client does not accept options
Mar 09 13:24:03 foglabunix in.tftpd[6395]: tftp: client does not accept options
Mar 09 13:24:03 foglabunix in.tftpd[6406]: tftp: client does not accept options
Mar 09 13:24:03 foglabunix in.tftpd[6419]: tftp: client does not accept options
Mar 09 13:24:03 foglabunix in.tftpd[6421]: tftp: client does not accept options
Mar 09 13:24:03 foglabunix in.tftpd[6432]: tftp: client does not accept options
and all log from today
Mar 9 11:34:40 foglabunix in.tftpd[10796]: tftp: client does not accept options
Mar 9 11:35:35 foglabunix in.tftpd[10979]: tftp: client does not accept options
Mar 9 12:24:07 foglabunix in.tftpd[14779]: tftp: client does not accept options
Mar 9 12:25:07 foglabunix in.tftpd[14950]: tftp: client does not accept options
Mar 9 12:33:42 foglabunix in.tftpd[15559]: tftp: client does not accept options
Mar 9 12:13:55 foglabunix tftpd-hpa[1473]: * Starting HPA's tftpd in.tftpd
Mar 9 12:13:55 foglabunix tftpd-hpa[1473]: ...done.
Mar 9 12:39:34 foglabunix in.tftpd[2389]: tftp: client does not accept options
Mar 9 12:39:36 foglabunix in.tftpd[2391]: tftp: client does not accept options
Mar 9 12:39:36 foglabunix in.tftpd[2393]: tftp: client does not accept options
Mar 9 12:39:36 foglabunix in.tftpd[2395]: tftp: client does not accept options
Mar 9 12:39:44 foglabunix in.tftpd[2411]: tftp: client does not accept options
Mar 9 12:39:44 foglabunix in.tftpd[2413]: tftp: client does not accept options
Mar 9 12:39:44 foglabunix in.tftpd[2415]: tftp: client does not accept options
Mar 9 12:39:44 foglabunix in.tftpd[2417]: tftp: client does not accept options
Mar 9 12:39:44 foglabunix in.tftpd[2419]: tftp: client does not accept options
Mar 9 12:39:45 foglabunix in.tftpd[2421]: tftp: client does not accept options
Mar 9 12:40:05 foglabunix in.tftpd[2455]: tftp: client does not accept options
Mar 9 12:40:05 foglabunix in.tftpd[2457]: tftp: client does not accept options
Mar 9 12:40:05 foglabunix in.tftpd[2458]: tftp: client does not accept options
Mar 9 12:40:05 foglabunix in.tftpd[2461]: tftp: client does not accept options
Mar 9 12:40:05 foglabunix in.tftpd[2462]: tftp: client does not accept options
Mar 9 12:55:41 foglabunix in.tftpd[3796]: tftp: client does not accept options
Mar 9 12:55:41 foglabunix in.tftpd[3798]: tftp: client does not accept options
Mar 9 12:55:41 foglabunix in.tftpd[3800]: tftp: client does not accept options
Mar 9 12:55:51 foglabunix in.tftpd[3815]: tftp: client does not accept options
Mar 9 12:55:51 foglabunix in.tftpd[3817]: tftp: client does not accept options
Mar 9 12:55:51 foglabunix in.tftpd[3819]: tftp: client does not accept options
Mar 9 12:55:51 foglabunix in.tftpd[3821]: tftp: client does not accept options
Mar 9 12:55:51 foglabunix in.tftpd[3823]: tftp: client does not accept options
Mar 9 12:55:51 foglabunix in.tftpd[3825]: tftp: client does not accept options
Mar 9 12:55:51 foglabunix in.tftpd[3826]: tftp: client does not accept options
Mar 9 12:55:51 foglabunix in.tftpd[3827]: tftp: client does not accept options
Mar 9 12:55:51 foglabunix in.tftpd[3831]: tftp: client does not accept options
Mar 9 12:55:51 foglabunix in.tftpd[3833]: tftp: client does not accept options
Mar 9 12:55:51 foglabunix in.tftpd[3834]: tftp: client does not accept options
Mar 9 12:55:51 foglabunix in.tftpd[3837]: tftp: client does not accept options
Mar 9 12:55:51 foglabunix in.tftpd[3838]: tftp: client does not accept options
Mar 9 12:55:51 foglabunix in.tftpd[3840]: tftp: client does not accept options
Mar 9 12:55:51 foglabunix in.tftpd[3843]: tftp: client does not accept options
Mar 9 12:55:51 foglabunix in.tftpd[3845]: tftp: client does not accept options
Mar 9 12:55:51 foglabunix in.tftpd[3847]: tftp: client does not accept options
Mar 9 12:55:51 foglabunix in.tftpd[3849]: tftp: client does not accept options
Mar 9 12:55:51 foglabunix in.tftpd[3851]: tftp: client does not accept options
Mar 9 12:55:51 foglabunix in.tftpd[3853]: tftp: client does not accept options
Mar 9 12:56:12 foglabunix in.tftpd[3890]: tftp: client does not accept options
Mar 9 13:02:59 foglabunix in.tftpd[4521]: tftp: client does not accept options
Mar 9 13:04:02 foglabunix in.tftpd[4599]: tftp: client does not accept options
Mar 9 13:23:53 foglabunix in.tftpd[6370]: tftp: client does not accept options
Mar 9 13:23:53 foglabunix in.tftpd[6372]: tftp: client does not accept options
Mar 9 13:23:53 foglabunix in.tftpd[6374]: tftp: client does not accept options
Mar 9 13:24:03 foglabunix in.tftpd[6394]: tftp: client does not accept options
Mar 9 13:24:03 foglabunix in.tftpd[6395]: tftp: client does not accept options
Mar 9 13:24:03 foglabunix in.tftpd[6398]: tftp: client does not accept options
Mar 9 13:24:03 foglabunix in.tftpd[6401]: tftp: client does not accept options
Mar 9 13:24:03 foglabunix in.tftpd[6400]: tftp: client does not accept options
Mar 9 13:24:03 foglabunix in.tftpd[6402]: tftp: client does not accept options
Mar 9 13:24:03 foglabunix in.tftpd[6406]: tftp: client does not accept options
Mar 9 13:24:03 foglabunix in.tftpd[6408]: tftp: client does not accept options
Mar 9 13:24:03 foglabunix in.tftpd[6409]: tftp: client does not accept options
Mar 9 13:24:03 foglabunix in.tftpd[6411]: tftp: client does not accept options
Mar 9 13:24:03 foglabunix in.tftpd[6413]: tftp: client does not accept options
Mar 9 13:24:03 foglabunix in.tftpd[6416]: tftp: client does not accept options
Mar 9 13:24:03 foglabunix in.tftpd[6418]: tftp: client does not accept options
Mar 9 13:24:03 foglabunix in.tftpd[6419]: tftp: client does not accept options
Mar 9 13:24:03 foglabunix in.tftpd[6421]: tftp: client does not accept options
Mar 9 13:24:03 foglabunix in.tftpd[6424]: tftp: client does not accept options
Mar 9 13:24:03 foglabunix in.tftpd[6426]: tftp: client does not accept options
Mar 9 13:24:03 foglabunix in.tftpd[6428]: tftp: client does not accept options
Mar 9 13:24:03 foglabunix in.tftpd[6430]: tftp: client does not accept options
Mar 9 13:24:03 foglabunix in.tftpd[6432]: tftp: client does not accept options
Mar 9 13:24:24 foglabunix in.tftpd[6479]: tftp: client does not accept options
Mar 9 13:24:33 foglabunix in.tftpd[6490]: tftp: client does not accept options
Mar 9 13:25:35 foglabunix in.tftpd[6553]: tftp: client does not accept options
Mar 9 13:31:52 foglabunix in.tftpd[7174]: tftp: client does not accept options
Mar 9 13:32:53 foglabunix in.tftpd[7241]: tftp: client does not accept options
Mar 9 13:52:48 foglabunix in.tftpd[9009]: tftp: client does not accept options
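A log like this is easier to read as counts per second, which shows how many in.tftpd messages land at the same moment during a wake on LAN run. The following is a rough sketch only; it assumes the syslog line format shown above, treats each in.tftpd line as one event (which may not map exactly to one client), and uses a made-up function name and a shortened sample.

```python
import re
from collections import Counter

# "<Mon> <day> <HH:MM:SS> <host> in.tftpd[pid]: ..."
TFTPD_LINE = re.compile(
    r"^(\w{3}\s+\d+\s+\d\d:\d\d:\d\d)\s+\S+\s+in\.tftpd\[\d+\]:"
)


def events_per_second(log_text):
    """Count in.tftpd log lines per timestamp (one-second resolution)."""
    counts = Counter()
    for line in log_text.splitlines():
        m = TFTPD_LINE.match(line.strip())
        if m:
            # Normalise spacing so "Mar  9" and "Mar 9" count together.
            counts[" ".join(m.group(1).split())] += 1
    return counts


# Shortened sample in the same format as the log above.
SAMPLE_LOG = """\
Mar 9 12:55:41 foglabunix in.tftpd[3796]: tftp: client does not accept options
Mar 9 12:55:41 foglabunix in.tftpd[3798]: tftp: client does not accept options
Mar 9 12:55:51 foglabunix in.tftpd[3815]: tftp: client does not accept options
Mar 9 12:55:51 foglabunix in.tftpd[3817]: tftp: client does not accept options
Mar 9 12:55:51 foglabunix in.tftpd[3819]: tftp: client does not accept options
Mar 9 12:56:12 foglabunix in.tftpd[3890]: tftp: client does not accept options
"""

if __name__ == "__main__":
    for stamp, n in events_per_second(SAMPLE_LOG).most_common():
        print(stamp, n)
```

Run against the full log, the busiest seconds should line up with the two test runs.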
-
@marted said in Hosts are looking for tftp server.:
tftp: client does not accept options
As far as I know this is ok. It means that the client requests the size and TFTP server just says it doesn’t support querying size. I have seen this often. Should not cause a problem.
Have you looked at why DHCP queries come in duplicated?