2 NIC in host problem (loops at sending discovery...)
Currently I am setting up FOG in a lab environment.
I am not very experienced with most services FOG runs on but I am going along fine.
Everything is going all right, I can use FOG with most hosts perfectly as expected.
However there is an issue with specific hosts containing 2 network cards when trying to register them. I can boot to the PXE menu just fine with those hosts, no issues. But whenever I try to do a full registration it keeps looping in “Sending discovery…” for a very long time.
The hosts are connected to 2 separate networks.
Network number 1 contains FOG server, DHCP and a connection to the internet.
Network number 2 is a closed-off network without a DHCP server (IP addresses are assigned statically).
See my crude quick network drawing for visual representation of relevant parts.
I have found 2 workarounds for this issue, which might help narrow down the cause.
1.) Disconnect the cable from the host in Network 2 before booting.
2.) Temporarily connect the switches in Network 2 and Network 1 (where FOG and DHCP servers reside).
Even after I use a workaround to register them, the issue still persists when trying to run image tasks. And as you can see, neither workaround is really desirable.
It seems that both of the network adapters need an IP and/or connection to the FOG server. I think this issue can be solved by removing the check for NIC 2 but I don’t have the knowledge to fix this. If anyone could point me in the correct direction I would be very thankful.
I am currently running FOG SVN 3508 on a hardware server.
Providing a link in this thread for future readers to a similar problem on a later revision: https://forums.fogproject.org/topic/5928/r4930-fedora-21-unable-to-ignore-mac-for-imaging
The change that fixed the excessive DHCP timeouts was 3541. This may not be the most stable commit, but anything after that point will have it fixed.
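For anyone comparing their own install against that number, the check is just a revision comparison. A tiny Python sketch; the function name is made up for illustration, and the only fact it encodes is the 3541 figure quoted above:

```python
# Revision that introduced the DHCP timeout fix, as stated in this thread.
DHCP_TIMEOUT_FIX_REVISION = 3541


def has_dhcp_timeout_fix(svn_revision: int) -> bool:
    """Return True if the given FOG SVN revision is at or after the fix."""
    return svn_revision >= DHCP_TIMEOUT_FIX_REVISION


print(has_dhcp_timeout_fix(3508))  # revision the original poster was running
print(has_dhcp_timeout_fix(3553))  # revision later tested in this thread
```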
I think 3554 was pushed this morning.
I just tested SVN 3553 on a different server; from what I can tell, the issue seems to be solved for this set-up.
@cspence just to be sure, what SVN number can be considered the next SVN?
Really glad to see FOG is so actively maintained, thanks everyone!
I received word that the init images have been updated. A reinstall of the latest SVN will fix this problem.
I still highly suggest waiting for the next SVN to be released though. The bug that @Tom-Elliott is working on affects multiple NIC setups.
@Knut That is definitely an edge case…
I did some more troubleshooting with @ch3i and we narrowed down the issue pretty closely.
@cspence , I hope this provides sufficient information to fix this bug correctly.
NIC 1 (Network 1) has access to DHCP and the FOG server.
NIC 2 (Network 2) is completely isolated; there is NO DHCP service.
When trying to run a task it loops into “Sending discover…”
The second card has no way of obtaining an IP whatsoever. (Addresses are assigned statically afterwards.)
I set up a temporary DHCP server on Network 2, and this completely solves the issue.
Network 1 and Network 2 have different network addresses.
So the issue seems to be that the host needs an IP address on all active interfaces, or else the task loops at “Sending discover…”.
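To illustrate the behaviour that would avoid this, here is a rough Python sketch. It is not FOG's actual init code; the function name, the interface names (eth0, eth1), the retry count, and the try_dhcp callback are all assumptions made up for the example. The idea is to give each interface a bounded number of DHCP attempts and carry on as soon as any one of them gets a lease, instead of waiting on every NIC.

```python
from typing import Callable, Iterable, Optional


def first_interface_with_lease(
    interfaces: Iterable[str],
    try_dhcp: Callable[[str], bool],
    max_attempts: int = 3,
) -> Optional[str]:
    """Return the first interface that obtains a DHCP lease, or None.

    try_dhcp is a hypothetical callback that performs one DHCP attempt
    on the named interface and reports whether a lease was obtained.
    """
    for name in interfaces:
        # Bounded retries: an isolated NIC cannot stall the boot forever.
        for _ in range(max_attempts):
            if try_dhcp(name):
                return name
    return None


# Simulated host from this thread: only the NIC on Network 1 reaches DHCP.
def fake_dhcp(name: str) -> bool:
    return name == "eth0"


# eth1 (isolated network) is tried first, fails, and is skipped.
print(first_interface_with_lease(["eth1", "eth0"], fake_dhcp))
```

With this shape, an interface on a network without DHCP costs a few failed attempts at most, and the task continues on whichever NIC can actually reach the FOG server.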
For our set-up it is desirable not to have a DHCP server running on the second network; there isn’t much on this network anyway.
Thanks to you both for your support.
I just pushed up a potential adjustment as commit 03c9451 on git. @Tom-Elliott is in the process of fixing a nasty bug on the portal. As soon as he finishes that, he’ll be able to get you an SVN to test out.
@Knut Hi, sorry for the late answer… Could you run an upload or download as a debug task? Just check the debug option after selecting upload or download.
Noting this forum post on the git repo issues. I’ll take a closer look at how networking is handled. It would be good for us to handle multiple NICs more gracefully.
Thanks for reporting.
No, this issue also persists when downloading or uploading an image, whether I register the computer manually or via the workarounds I mentioned. I can also use the workarounds for imaging tasks.
What is weird is that if I just let it loop for a while (like 15 minutes) it sometimes ends up in the registration screen but other times it does not (did not try it with imaging). But if I use one of the workarounds it works as expected.
Thanks for your input, regards.
I have the same configuration at my school without any problems. Do you have the problem only in full registration?