Registration Issues
-
So, i realize i made the mistake of posting this on a Friday afternoon, but i’m still as stuck as i was previously. Has anyone else run into this?
-
@cklemm Are you able to access that URL in the browser? http://10.18.10.182/fog//index.php ?
-
@sebastian-roth Yes, i have no problems getting into the web page.
-
@cklemm The message you see is twofold. So it could be either a DHCP issue which I’d dismiss for now as DHCP has worked several times till a client gets to that point. Then there is the HTTP connection check which might fail but I can’t see why. I don’t like guessing much so we should just take a look at the facts - the packets on the network that is. So get your client ready but don’t start it yet. Go to your FOG server and install tcpdump (again guessing as we don’t know your server system, either apt-get install tcpdump or yum install tcpdump as root should do the trick). Then run the following command and substitute x.x.x.x with the client’s IP address:
tcpdump -w /tmp/boot_issue.pcap host x.x.x.x
Leave that command sitting there, boot up the client till it shows the error “Either DHCP failed or …”. Now stop tcpdump (Ctrl+c). Upload the generated file /tmp/boot_issue.pcap to your dropbox/google drive and post a link here or send me a private message if you don’t want to share this with the rest of the world.
-
@sebastian-roth The file came out to be, well, bigger than i was expecting. Anyway, here it is:
https://drive.google.com/open?id=1S-O-TSydukduMd6CKs04595qYTBW8I99
If it’ll help, i’m running Ubuntu 17.10. After doing this a few more times, it looks like there isn’t anything that shows up during the time that the client is trying to register. kadath-fog-test is the hostname of the client, in case that isn’t clear.
-
@cklemm Yeah it’s that big because in the packet dump we have the kernel and initrd (essentially the TCP packets transferring those files from the server to the client).
So it seems I was wrong in thinking that this is an HTTP issue. The last thing I see in the packet dump is the request and response for the initrd. Nothing after that. So to me this means something is going wrong with DHCP at that point. What is serving DHCP in your network?
-
@sebastian-roth We have a separate DHCP server running Windows Server 2012. I have options 66 and 67 set up for the scope in question. There are also the various vendor classes and such that i can’t remember the specifics of, but that were mentioned in a FOG Wiki entry about running a separate DHCP server. Those were set up as well. Is there some specific information from this server that you need?
-
@cklemm I kind of guessed that would be the case. It’s not an issue by itself but it’s just a little harder to get the information we need. Best would be to capture the full DHCP traffic going between the client and the DHCP server. I suspect you are not allowed to install wireshark on that server and let it capture the client’s traffic as we did on the FOG server before?!
But we might be lucky. I am not absolutely sure but I think I have seen Windows DHCP servers responding with broadcast DHCP answers so we could see those even when capturing packets on the FOG server. Let’s just give it a try. So get your tcpdump ready:
tcpdump -w /tmp/dhcp_issue.pcap port 67 or port 68 or port 69
Now boot up the client and wait till you see the DHCP error at least once. Stop tcpdump and upload that file again - should be way smaller this time. To make it a little easier for us we need to know the client’s MAC address. So please post that here as well.
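The MAC address matters because packets that pass through a router or relay no longer carry the client's MAC in the Ethernet header; it only survives in the chaddr field inside the DHCP payload. Below is a minimal sketch of a read-back filter for that, assuming the standard BOOTP layout (chaddr starts 28 bytes into the payload, so 36 bytes after the start of the 8-byte UDP header). The function name dhcp_filter_for_mac and the MAC 0011.2233.4455 are made up for illustration.

```shell
# Hypothetical helper: build a tcpdump filter that matches DHCP packets by the
# client hardware address stored inside the DHCP payload (chaddr).
# Accepts a MAC written as 0011.2233.4455, 00:11:22:33:44:55 or 00-11-22-33-44-55.
dhcp_filter_for_mac() {
  # Strip separators and lowercase the hex digits.
  hex=$(printf '%s\n' "$1" | tr -d '.:-' | tr 'A-F' 'a-f')
  # chaddr is compared as a 4-byte and a 2-byte chunk, the sizes tcpdump allows.
  first=$(printf '%s\n' "$hex" | cut -c1-8)
  last=$(printf '%s\n' "$hex" | cut -c9-12)
  printf '(port 67 or port 68) and udp[36:4] = 0x%s and udp[40:2] = 0x%s\n' "$first" "$last"
}

# Example (reads an existing capture file; nothing is captured live):
# tcpdump -nn -r /tmp/dhcp_issue.pcap "$(dhcp_filter_for_mac 0011.2233.4455)"
```

Reading the file back with -r keeps the original capture untouched, so the filter can be adjusted and rerun as often as needed.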
-
@sebastian-roth Here’s the file:
https://drive.google.com/open?id=1a7o5ylngM3ixO1wgzXZh9Mt0v_4o5NmS
We’ll definitely have to try what we can without touching the DHCP server before i go playing around with it. You’re also right that the file was less than a third the size this time. The client MAC address is 40B0.341C.FB6D.
Also thanks for sticking with this, i’d be really stuck without the help.
-
@cklemm Ok, there is one major thing adding to the complexity that you haven’t mentioned yet, and I didn’t look closely enough to spot it when going through the big packet dump.
All the packets seen by the FOG server do not come straight from the DHCP server or client but go through a DHCP relay/router first. This is not an issue per se. It’s actually good network design to structure those things rather than put them all together into one huge subnet. But it adds to the complexity and can cause weird issues. Unfortunately we only see half of the traffic and I am not able to guess what’s wrong with so little information in the packet dump. Sorry.
So let’s recap this thing from the start. Have you ever had a fully functioning FOG server or is this your first try to set this up and get it to work? I assume the latter from what I read but just wanna make sure to not head in the wrong direction with this.
The best way to go about this would be to capture the full traffic right at the client. You can do this by configuring a monitoring port on the switch your client is directly connected to. The other, usually easier, option is to grab an old network hub and connect that between your client and the network switch. Jack another PC/notebook into that hub and you can capture all the network traffic being sent back and forth from and to the client.
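On the PC plugged into the hub or mirror port, the client's own frames are visible, so filtering on the Ethernet address is enough. Here is a sketch, assuming tcpdump is installed on that PC. The interface name eth0, the output path, the helper name mac_to_colons and the sample MAC are all placeholders.

```shell
# Hypothetical helper: rewrite a MAC such as 0011.2233.4455 into the
# colon-separated form (00:11:22:33:44:55) that tcpdump's "ether host" expects.
mac_to_colons() {
  printf '%s\n' "$1" | tr -d '.:-' | tr 'A-F' 'a-f' | sed 's/../&:/g; s/:$//'
}

# Example, run as root on the capturing PC (interface and path are assumptions):
# tcpdump -i eth0 -w /tmp/client_side.pcap ether host "$(mac_to_colons 0011.2233.4455)"
```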
-
@cklemm said in Registration Issues:
but when i try to register the device it restarts and says that it fails to get an IP address via DHCP.
I’ve only loosely scanned this thread. But let me see if I can set the picture here. You can pxe boot and get into FOS for registering, but then when FOS is booting you get an error message 3 times that you can’t get an IP address and then FOS gives up?
If this is the case, did you ever change the IP address of the fog server since FOG was installed?
-
Also here is the full text behind the tcpdump command Sebastian provided. https://forums.fogproject.org/topic/9673/when-dhcp-pxe-booting-process-goes-bad-and-you-have-no-clue If you can narrow down your capture to the time you are trying to boot the client that would be helpful.
The pcap you provided did not give any insight into what is going on here. What I can tell is that your fog server is on one subnet and the pxe booting client is on another subnet. Also it appears your real dhcp server is on yet another subnet from the fog server. It almost appears that you have a dhcp-relay service running on your vlan router and you are sending dhcp inform packets to your FOG server. Are you running dnsmasq on your FOG server? That is the only reason I can think of why your router would be sending unicast dhcp inform and discovery packets directly to your FOG server.
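A quick way to answer the dnsmasq question on the FOG server is sketched below, assuming pgrep (part of procps on most Linux distributions) is available. Unlike ps -ef piped into grep, it does not list the search command itself. The function name is_running is illustrative.

```shell
# Hypothetical helper: succeed if a process with exactly this name exists.
# pgrep -x matches the process name exactly and never reports itself.
is_running() {
  pgrep -x "$1" >/dev/null 2>&1
}

if is_running dnsmasq; then
  echo "dnsmasq is running"
else
  echo "dnsmasq is not running"
fi
```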
-
@sebastian-roth So, this is my first time setting up a fog server, but apparently there used to be one that covered all our clients. It should work.
Both the server and the client are connected to the same switch already, would that matter as far as capturing the packets?
As far as the network, the DHCP server isn’t even in the same building. Sorry i didn’t mention that, i didn’t know it was important.
-
@george1421 You seem to understand the problem exactly. You are also correct that the server itself, the client, and the DHCP server are all on different vlans. This is important to the way our network is set up, and i really need this to be able to work across vlans. If it matters, i could move the fog server to the same vlan as the DHCP server, but the client absolutely will not be on that vlan.
I don’t think that the fog server is running dnsmasq, unless it was automatically installed when i installed fog.
ps -ef | grep dnsmasq
The above only shows the command itself. I’m not an expert (obviously), but i think it isn’t running. If there is something else you’d like me to check, just let me know.
-
@cklemm I’m only working from the perspective of what is in the pcap file you captured.
Because we are dealing with broadcast and unicast messages, ideally we would like to see the fog server, dhcp server, and pxe booting computer on the same subnet so we can get an accurate picture of what is going wrong. We can get what we need captured in a roundabout way.
FOG WILL work in the setup as you described.
OK just to reconfirm, you did change the fog server IP address after FOG was installed? This is important since when FOS (the custom linux that runs on the pxe booting computer) boots, part of its network testing for dhcp is to try to communicate with the FOG server. FOS will report that it doesn’t have an IP address, but what is really happening is that it’s trying to ping the FOG server and it’s getting no response. So FOS assumes it doesn’t have a good network connection.
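One place to compare against the server's current address is the settings file the installer writes. This is only a sketch: it assumes the usual location /opt/fog/.fogsettings and a line of the form ipaddress='...' in it, both of which should be verified on the actual install. The function name fog_recorded_ip is made up.

```shell
# Hypothetical helper: print the value of the ipaddress= line from a FOG
# settings file (path passed as $1), with surrounding quotes removed.
fog_recorded_ip() {
  sed -n "s/^ipaddress=//p" "$1" | tr -d "'\""
}

# Example (the path is an assumption about a default install):
# fog_recorded_ip /opt/fog/.fogsettings
# ip -4 addr show     # compare with the addresses the server has right now
```

If the two addresses differ, that would fit the failure described above, where FOS cannot reach the FOG server and reports a DHCP problem instead.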
-
@sebastian-roth said in Registration Issues:
The best way to go about this would be to capture the full traffic right at the client. You can do this by configuring a monitoring port on the switch your client is directly connected to. The other, usually easier, option is to grab an old network hub and connect that between your client and the network switch. Jack another PC/notebook into that hub and you can capture all the network traffic being sent back and forth from and to the client.
-
@george1421 I did not change the IP after FOG was installed. Is there something i can check that would show which IP FOS is pointing at so i can be sure that is correct?
@sebastian-roth Ok, i’m working on it. I’m finding it hard to get a working hub, but i’ll post here when i have it.
-
So i still don’t have the packet info from the hub, but i restarted the computer while it was hooked into the hub, and just for kicks i decided to try to register the client. It worked.
A month of restarting the client, and as soon as i plug it into a hub of all things it works. Right now it is capturing the image (at least, i’m pretty sure that’s what it’s doing). Does that give you any information? It makes me think it is a network configuration issue. I’m still going to try to capture the packets, but i thought this was worth posting.
-
@cklemm Nice one! Don’t bother about capturing the network packets.
Read up on spanning tree and port fast settings. It’s either that or possibly EEE (Energy-Efficient Ethernet, the ethernet energy saving stuff) or an auto-negotiation issue.
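To see why spanning tree is a suspect: with classic 802.1D defaults, a switch port without port fast spends one forward-delay interval in listening and another in learning after every link-up, and forwards nothing during that time. The link can drop and come back when the booted OS reinitializes the network card, and a hub between client and switch hides that from the switch. Below is a rough arithmetic sketch. The 15-second forward delay is the standard default, while the client's total DHCP wait is a made-up number, not a measured FOS value.

```shell
# Succeeds when a freshly linked port would still be discarding traffic
# at the moment the client gives up on DHCP.
# $1: STP forward delay in seconds (15 is the classic 802.1D default)
# $2: total seconds the client keeps retrying DHCP (hypothetical value)
stp_outlasts_client() {
  blocked_for=$((2 * $1))   # listening + learning
  [ "$2" -lt "$blocked_for" ]
}

if stp_outlasts_client 15 20; then
  echo "port still not forwarding when the client gives up"
fi
```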
-
@sebastian-roth Just an update, i was able to get this to work even in another building. Apparently this was a problem specific to the random old switch i was using to test this stuff. Thanks so much Sebastian and George.