Building a test environment
-
Hello!
With the new year I got my hands on a few machines for a test project (the goal is to get both BIOS and UEFI booting to work). For this, I have the following:
network: 1 switch, and behind it 1 server and some clients, on the 10.0.0.x network
server: 10.0.0.1; I let the FOG setup configure everything needed on this machine (DNS, DHCP, I left it all on)
client: I think it is not relevant to start with, but it is a small HP machine that can boot in both legacy and UEFI mode
fog version: 1.4.4
issue 1: TFTP was not set up properly (I was testing whether it works at all, and on the client I got file-not-found errors).
fix attempt: opened the setup folder and copied the TFTP files to the server's /srv/tftp folder
outcome 1: now it connects, but it says it can't find the default.ipxe file. Well, it was not there; should it be? As it was a legacy box, I gave it a try with undionly.kkpxe to see if it loads anything at all. Now it does, but it gets into a load loop: after the iPXE initialisation part (status OK is printed) it restarts the iPXE config part.
My question (leaving aside the really strange issue with the unpopulated TFTP directory): why does it want default.ipxe at all? It is not even mentioned in the DHCP config (stock ISC DHCP conf, generated by the FOG setup script). The next question would be: why is there that loop?
I just wanted to make the simplest setup, and at this stage I can't even start testing what I wanted. Is it the result of Christmas food, or what the hell did I do wrong?
-
I remember seeing a similar issue over the holidays (sorry, too much celebration to remember exactly when). But I think the key was a problem with Debian 9 causing TFTP to not install correctly. I don’t think there was a resolution for it. I know the FOG Project tests the distros daily to ensure they install properly.
Let me see if I can find that post.
[Edit] Not a solid resolution: https://forums.fogproject.org/topic/11253/cant-pxe-boot-pxe-t01-file-not-found
Now if you don’t have Debian 9 installed, we will need to dig a bit deeper.
-
I came across that post a bit earlier, and yes, Debian 9 has issues with the DHCP config (actually I don't get why it can't kill a process with the service stop commands…). Manually it is solved. BUT, as there is always a but… if I rerun the setup, why is the tftpboot directory populated? And if I populated it with files, and it finds them (seemingly), why does it look for a nonexistent file (meaning default.ipxe, as it was not mentioned in the config file at all)?
-
@Foglalt said:
if I rerun the setup, why is the tftpboot directory populated? And if I populated it with files, and it finds them (seemingly), why does it look for a nonexistent file (meaning default.ipxe, as it was not mentioned in the config file at all)?
Sorry, but I don’t get what you mean by that. The issue is that in Debian 9 the file /etc/default/isc-dhcp-server needs a specific variable set to make things work. So re-running the installer (after fixing the issue) lets the setup run all the way through, and it also populates the /tftpboot directory.
I am working on this issue and will push a fix soon.
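For anyone who wants to check this by hand before the fix lands, here is a rough Python sketch. It assumes the variable in question is INTERFACESv4 (the post above does not name it; that is the IPv4 interface setting in Debian 9's defaults file), and the helper names are made up for illustration.

```python
import shlex


def read_defaults(text):
    """Parse simple KEY="value" shell assignments, skipping comments."""
    settings = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        parts = shlex.split(value)  # removes the surrounding quotes
        settings[key.strip()] = parts[0] if parts else ""
    return settings


def dhcp_interfaces(text, variable="INTERFACESv4"):
    """Return the interfaces listed in `variable` (assumed name), or []."""
    return read_defaults(text).get(variable, "").split()
```

Feeding it the contents of the defaults file and getting an empty list back would suggest the DHCP server has not been told which interface to listen on.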
-
Maybe I got something wrong, either in describing it or in doing it. After another retry (fully wiping and redoing FOG in the test environment) the legacy hardware booted via DHCP. And, as my luck goes, it got stuck at another point:
After initializing the random number generator, it starts configuring the interface (on the client computer). Then it prints details about the IP it got and the lease time, and then “failed to get ip via dhcp tried on eth0 interface”. That is strange, as it started with DHCP, so it must have already gotten an address once in the process.
I will check whether it still needs the default.ipxe file, which was never in the DHCP config but was requested by the client.
-
@foglalt The default.ipxe is a FOG-supplied config file. It is chained by undionly.kpxe or ipxe.efi.
The “failed to get an IP via DHCP” error message is sometimes related to networking. More specifically, it can be caused by the network switch having spanning tree enabled but not one of the fast spanning tree protocols (RSTP, MSTP, Fast-STP, or whatever your switch vendor calls it). A quick test to find out if it's a spanning tree issue is to place an unmanaged switch between the PXE-booting computer and the building switch. If the target computer then boots correctly, you have a spanning tree issue with your building switch.
Also a clear picture of the error taken with a mobile and posted here would help to see the exact spot the error is being generated. The context of the error is almost as important as the error message itself.
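Since the thread started with an unpopulated TFTP directory, a quick check along these lines may help rule that out first. This is only a sketch: the three file names are the ones mentioned in this thread, the real directory holds more files, and the helper name is made up.

```python
from pathlib import Path

# Files named in this thread; a real FOG TFTP root contains more than these.
EXPECTED = ("default.ipxe", "undionly.kpxe", "ipxe.efi")


def missing_boot_files(tftp_root, expected=EXPECTED):
    """Return the expected boot files that are not present in tftp_root."""
    root = Path(tftp_root)
    return [name for name in expected if not (root / name).is_file()]
```

Calling missing_boot_files("/tftpboot") on the server and getting an empty list back means at least these files are where the client will ask for them.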
-
@Foglalt The FOG boot process consists of several steps and does DHCP three times. At one point it requests default.ipxe. So this is all fine.
Then it prints details about the IP it got and the lease time, and then “failed to get ip via dhcp tried on eth0 interface”. That is strange, as it started with DHCP, so it must have already gotten an address once in the process.
This might happen when spanning tree, auto-negotiation or EEE is an issue. The latter two do not play a role in the earlier DHCP rounds, as the other drivers don’t do many special things; only the Linux kernel (maybe) does. So it can be many different things causing this issue. But it could also be a different problem, because at that stage, after requesting an IP via DHCP, it also checks that it can connect to the web UI, and if that fails the error looks very similar. So you had better take a picture of the actual error and post it here so we can have a look.
Besides that, you can connect an unmanaged mini switch or hub between the client and your main switch and see if the problem goes away.
-
Actually the whole setup consists of 1 switch (a dumb one, not managed) and 2 machines (the server and 1 client), for the purest environment.
I am a bit confused about what is written here about the default.ipxe file and its chainloading. Can you please be a bit more detailed? I am not an iPXE expert, sorry.
When I am in the office and have a bit of time, I will take some pictures to be more specific.
(Just for the record, the unpopulated TFTP folder had a good reason: when the DHCP setup fails (well, it may be a Debian-specific issue on 9.3), the later part of the script, which sets up a new TFTP location and puts the files there as usual, fails too. I was not aware of it at first, my bad, sorry.)
-
@foglalt said in Building a test environment:
I am a bit confused about what is written here about the default.ipxe file and its chainloading. Can you please be a bit more detailed?
This is kind of a long story, but I'll try to give you a picture:
- PC/laptop boots and is set to PXE boot
- DHCP request to get an IP and PXE information (next-server and filename, called options 66 and 67 in DHCP speak)
- Requests filename from next-server (BIOS uses undionly.kpxe and UEFI uses ipxe.efi by default; see the DHCP config, e.g. /etc/dhcp/dhcpd.conf)
- This FOG-specific iPXE binary comes with an embedded script (code here) which sends another DHCP request to get an IP and be able to communicate with the FOG server to load the default.ipxe file
- From default.ipxe it chainloads to the FOG boot menu or runs a scheduled task (this part is auto-generated depending on which client sends the request…)
- If there is a task for this client, iPXE will chainload the Linux kernel and boot up the FOS mini Linux to do the task.
- Here the mini Linux environment will send a third DHCP request, again to be able to communicate with the FOG server.
You might wonder why we need so many DHCP handshakes? This is because FOG combines different open source software like iPXE and the Linux kernel to do what it does. Each part needs to set up network communication on its own.
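To make the BIOS/UEFI split in the filename step a bit more concrete, here is a small Python sketch of the decision the DHCP server makes. It is an illustration only, not the FOG or ISC DHCP implementation: it assumes the choice is keyed on the client architecture code the client sends (DHCP option 93, RFC 4578), and it only covers legacy BIOS and 64-bit UEFI. The function name is made up.

```python
# Client architecture codes as defined in RFC 4578 (DHCP option 93).
# Only the two cases discussed in this thread are covered.
FILENAME_BY_ARCH = {
    0: "undionly.kpxe",  # Intel x86PC, i.e. legacy BIOS
    7: "ipxe.efi",       # EFI BC, commonly sent by 64-bit UEFI firmware
    9: "ipxe.efi",       # EFI x86-64
}


def boot_filename(arch_code):
    """Pick the first-stage iPXE binary for a client architecture code."""
    try:
        return FILENAME_BY_ARCH[arch_code]
    except KeyError:
        raise ValueError(f"no boot file mapped for arch code {arch_code}")
```

So a legacy client (code 0) would be handed undionly.kpxe and a 64-bit UEFI client ipxe.efi; everything after that (the embedded script, default.ipxe, the kernel) is the same chain in both cases.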
-
Thanks a lot for the description; it is clear, and I think I actually needed it to understand.
Here is the place where it fails:
It fails with no DHCP response, but before that point DHCP did respond, even the lease time was received, and of course the init files are loaded properly before this part. What is it then?
-
@Foglalt Ok, I just saw that we added an extra message to make this more clear after 1.4.4 release. So this is only in the RC version and will be in the next release. But that’s ok for now. You don’t need that message.
Please open the URL http://10.0.0.12/fog/service/ipxe/boot.php?mac=00:11:22:33:44:55 in your browser (exchange 00:11:22:33:44:55 with the client’s MAC address). Is 10.0.0.12 your FOG server IP? Copy and paste the full text output you get in the browser here.
-
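The boot.php check suggested above can also be done from a script instead of a browser. A minimal Python sketch, assuming the URL layout is exactly the one given in the post; the function names are made up for illustration.

```python
from urllib.parse import urlencode
from urllib.request import urlopen


def boot_script_url(server_ip, mac):
    """Build the boot.php URL for one client, keeping the colons in the MAC."""
    query = urlencode({"mac": mac}, safe=":")
    return f"http://{server_ip}/fog/service/ipxe/boot.php?{query}"


def fetch_boot_script(server_ip, mac, timeout=5):
    """Download the iPXE script the FOG server would hand to this client."""
    with urlopen(boot_script_url(server_ip, mac), timeout=timeout) as resp:
        return resp.read().decode("utf-8", errors="replace")
```

Running print(fetch_boot_script("10.0.0.12", "00:11:22:33:44:55")) with the real server IP and client MAC shows the same text the browser would; any IP addresses in that output that do not match the server are worth a closer look.
-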
@Sebastian-Roth
Thanks a lot; following your guidelines I found the problematic part of the config (actually I made mistakes, and then correcting them took more steps than I previously thought). The IP of the server was the key: it was mistakenly entered during my installation test attempts.
Now it is testing imaging (bah, finally I can start the true test of making hybrid BIOS/UEFI combos work automatically, and selecting the proper Win10 image types).
-
Anyway, can I have rights to delete my own post? I need more coffee: I edited and re-edited my previous post, then wanted to purge it, but I have no rights.
-
@Foglalt Great you figured it out! Too easy if you know what to look for, right?
I purged the post for you. Not sure who’s in charge of the forum rights at the moment. Pinging @joe.schmitt ?!
-
Bah, imaging was OK, but that mistake I made somehow still lingers on. I know I could correct it with a full reinstall, but I think I learn more if I understand more pieces of the mechanism. Can you maybe take a look at this error I got?
-
@Foglalt It’s either the wrong IP again or username/password wrong. Take a look at FOG Configuration -> FOG Settings -> TFTP Server. First the items.
As well take a look at this: https://wiki.fogproject.org/wiki/index.php/Troubleshoot_FTP
And to fix all the IP things, check this out: https://wiki.fogproject.org/wiki/index.php/Change_FOG_Server_IP_Address
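To tell the two causes apart (wrong IP versus wrong username/password), something like the following Python sketch could be run from another machine. It uses only the standard ftplib module; the function name, the return strings and the ftp_factory parameter are made up for illustration, and the host, user and password have to be the values shown in your own FOG settings.

```python
from ftplib import FTP, error_perm


def check_ftp_login(host, user, password, timeout=5, ftp_factory=FTP):
    """Return 'ok', 'unreachable' or 'bad credentials' for an FTP login."""
    try:
        ftp = ftp_factory(host, timeout=timeout)  # connects immediately
    except OSError:
        return "unreachable"  # wrong IP, firewall, or FTP service down
    try:
        ftp.login(user, password)
        return "ok"
    except error_perm:
        return "bad credentials"  # server answered but rejected the login
    finally:
        ftp.close()
```

An "unreachable" result points at the IP side of things, while "bad credentials" points at the username/password stored in the FOG settings.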
-
It seems everything now works as wanted. Thanks a lot for the help. (Can I mark my topic as solved somehow? In the old forum I knew how to do it, I guess, but not now.)