Mounting File System Failed
-
I had new issues this time…
I brought my PC with the FOG VM into work (was testing at home) and I had to change the gateway IP and DNS server.
I thought re-running the installer was the best way to do it?
I ran it once or twice before I realised I had to change the router option in the Linux network card settings. Then the installer picked it up.
This is where I was having “tftp connection timed out” issues.
After following this: https://wiki.fogproject.org/wiki/index.php/Tftp_timeout…#Testing_TFTP
and getting to the storage management section and checking the password, I noticed the “interface” was set to “eth0” instead of the strange “eno16777736” that VMware allocated.
So changing that fixed THAT issue. But why the heck did this issue suddenly appear? Did the installer change it on me?
No, that wasn’t it! After another reboot it doesn’t work again, still getting “tftp connection timed out”.
Tested tftp from another machine OK.
A couple of times now it has downloaded the files quickly and shown the FOG menu. Then it won’t work again for the next 20 reboots or so.
Some images from Wireshark: in between the two images is just the rest of the file transfer.
-
Does restarting the tftp service help?
service xinetd restart
Check netstat to see if it is listening on port 69:
netstat -antup | grep ":69"
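A slightly stricter version of that check, as a rough sketch: the plain grep also matches ports such as 6900, so this filters on the protocol and the exact port. The helper name `tftp_listening` is made up for illustration, and it assumes the usual Linux netstat column layout where the local address is the fourth field.

```shell
# Succeeds if netstat-style input on stdin shows a UDP socket bound to port 69.
# Assumes the Linux netstat layout: protocol in field 1, local address in field 4.
tftp_listening() {
    awk '$1 ~ /^udp/ && $4 ~ /:69$/ { found = 1 } END { exit !found }'
}

# Example usage (run as root so the owning process is shown):
# netstat -antup | tftp_listening && echo "TFTP is listening" || echo "nothing on UDP 69"
```

If this reports nothing on UDP 69 right after the xinetd restart, the problem is on the server side rather than in DHCP.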
-
@Rusty the intermittent hit-or-miss you’re describing with TFTP suggests that you have more than one DHCP server in your environment and not all of them are configured with options 066 and 067 properly.
-
@Sebastian-Roth
[root@localhost bin]# netstat -antup | grep ":69"
udp 0 0 0.0.0.0:69 0.0.0.0:* 7860/xinetd
@Wayne-Workman Funny you say that. Last night (after my problems) our old DHCP server was restarted and its DHCP service started again, so this morning we had issues with two running.
However, that’s fixed now and, like yesterday, it’s still not working.
What’s the best way to check for other DHCP servers?
Shouldn’t the Wireshark capture I posted show it?
-
@Rusty Do another capture just to be sure.
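One possible way to read the new capture for this, sketched under a few assumptions: if you can pull out the source address of every DHCP offer, more than one distinct address means more than one DHCP server answered. The function names and the capture file name below are made up for illustration, and the tshark display filter field is `dhcp.option.dhcp` on recent Wireshark builds but `bootp.option.dhcp` on older ones.

```shell
# Reads one DHCP server address per line on stdin and prints each distinct one.
distinct_dhcp_servers() {
    grep -v '^[[:space:]]*$' | sort -u
}

# Succeeds only when exactly one distinct server address is seen on stdin.
one_dhcp_server() {
    [ "$(distinct_dhcp_servers | wc -l | tr -d ' ')" -eq 1 ]
}

# Example usage (capture.pcapng is a hypothetical file name; message type 2 is a DHCP offer):
# tshark -r capture.pcapng -Y 'dhcp.option.dhcp == 2' -T fields -e ip.src | distinct_dhcp_servers
```

Any address in that list that is not your intended DHCP server is worth tracking down.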
-
@Wayne-Workman Attached is the last part of another capture. Just like before, the PXE client gets an IP and immediately downloads undionly.kpxe, but then:
“iPXE initialising devices”
tftp://10.0.0.51/default.ipxe… <<- This is where you see the ARP broadcasts start again.
After several seconds, it gives up and boots the HDD.
-
I did another capture filtered with ip.src==10.0.0.51 || ip.dst==10.0.0.51 out of curiosity. This capture runs from immediately after the download of undionly.kpxe to when PXE gives up and boots the HDD.
-
@Rusty Packets 328 and 329 in the last capture you posted look interesting. It happens again at 559 and 560. What name was it trying to resolve?
Looks like it was trying to find a PTR record for 51.0.0.10? Is that right? Can we see more?
-
What is happening at this stage?
To get undionly.kpxe, TFTP needs to have worked, which means options 66/67 are working on the DHCP server.
But now it seems to not be working…
-
@Wayne-Workman Sure, what’s a PTR record?
-
It just worked again, so I uploaded an image. Then it worked a second time.
Now it’s not working again (tried about 6 times now).
-
A PTR record is a reverse lookup for DNS. Basically, it asks DNS: for this IP address, what is its canonical host name?
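To make that concrete, here is a small sketch of how the reverse-lookup name is built for an IPv4 address: the octets are reversed and `.in-addr.arpa` is appended, which is why the capture shows 51.0.0.10 for the address 10.0.0.51. The helper name `ptr_name` is made up for illustration.

```shell
# Prints the reverse-lookup (PTR) name for a dotted IPv4 address.
# Prints nothing if the argument does not have exactly four dot-separated parts.
ptr_name() {
    echo "$1" | awk -F. 'NF == 4 { print $4 "." $3 "." $2 "." $1 ".in-addr.arpa" }'
}

# Example usage:
# ptr_name 10.0.0.51      # prints 51.0.0.10.in-addr.arpa
# dig -x 10.0.0.51        # asks your DNS server for that PTR record
```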
While I haven’t read this entire thread, for DHCP it’s best for option 66 to be an IP address and not a canonical host name. Some PXE clients are just not smart enough to know how to look up a name for option 66, so set it to the IP address of the FOG server.
Also, I know they have been working on the boot image (in the last 3-4 days) where it was getting stuck at “initialising devices”. You may want to try updating to the latest SVN. It sounds like you are almost there.
-
@george1421 Thanks.
My DHCP options use the IP of the FOG server, so I’m not sure why that is the case then…
OK, thanks.
-
After updating SVN, do I re-run the installer?
Does anything need to be backed up/recorded, etc.?
-
Yes, after you do the svn up, cd to the bin folder and run setup again.
-
Not to send you down a rabbit hole here, but I take it you don’t have a reverse lookup zone created on your DNS server, because in the capture something is looking up the 10.0.0.51 address and the reply from DNS is “name not found”. Can I guess that 10.0.0.51 is your FOG server? Although it is outside the scope of this project, you should ensure you have a reverse lookup zone created in DNS.
Here is a how-to for Windows Server 2012: http://www.tomshardware.com/faq/id-1954333/create-reverse-primary-dns-zone-windows-server-2012.html
-
@Rusty said:
Anything need to be backed up/recorded etc ?
If it’s virtualized, I recommend taking a snapshot before updating Trunk versions of FOG.
If it’s not virtualized (and a production server), do a DB export (FOG Configuration -> Configuration Save -> Export) and a Host export (Host Management -> Export hosts -> Export) at minimum. Append the SVN version you’re using to the file names.
If you decide to blast away the current build you have and start over, copy your images and grab a copy of /opt/fog because it has your SSL keys in it.
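For the file naming part, a minimal sketch of one way to do it. The function name, the example file names and the revision number are all made up for illustration, and it expects a bare file name rather than a full path; running `svnversion` inside the checkout is one way to find the revision you are on.

```shell
# Inserts an SVN revision tag before the file extension of a bare file name.
# Falls back to appending the tag when the name has no extension.
tag_with_revision() {
    file=$1
    rev=$2
    case "$file" in
        *.*) printf '%s_r%s.%s\n' "${file%.*}" "$rev" "${file##*.}" ;;
        *)   printf '%s_r%s\n' "$file" "$rev" ;;
    esac
}

# Example usage (hypothetical export file name and revision):
# mv fog_backup.sql "$(tag_with_revision fog_backup.sql 4100)"
```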
-
@george1421 OK. So this isn’t contributing to my issues*?
Also, excuse my ignorance, but why should I add the reverse DNS lookup zone? (What does it do for me practically?)
Thanks!
*It’s possible there is a compatibility issue with the notebook I was using to image. I grabbed another old Dell and it seems to work every time. I swapped back and Dell1 didn’t work after several attempts; back to Dell2 and it is working 100%.
-
@Rusty Trade out patch cables. If you have more than one of these problematic Dells, I would suggest trying with another just to see what happens.
Are you using a managed switch?
Are DHCP helper addresses set on the managed switch?
Is portfast enabled on the managed switch?
What if you created a reverse lookup zone and it worked more reliably?
A lot of things use PTR records. There’s no conformity: printers sometimes, computers finding printers, computers with IP printers installed and drivers that want to use the FQDN, some client/server software. The list goes on.
I wouldn’t be surprised at all if the firmware on the device you’re using is trying to query for the PTR record just to fill some internal system variable (whether it’s used or not).
As @george1421 said, it is good practice to have a reverse lookup zone. More stuff will work and you’ll have fewer mysteries.
-
@Rusty Some systems will try to do a two-way match: canonical name -> IP address -> canonical name. This is a standard function of DNS. I’m kind of surprised that this issue (not having a reverse lookup zone) hasn’t been a problem.