040ee119 error on boot
-
Hello All,
iPXE developers have acknowledged an issue with the iPXE software and have pushed a potential fix for the problem. As always, the files are on my website at: [url]https://mastacontrola.com/ipxe/[/url]
The revision to test is d6300-DEFAULT-TEST, and all of our pre-tests appear to be clear of this 040ee119 issue. If the error still occurs now, it’s likely an actual DHCP issue rather than a problem with the software, as it was before.
As the prospects seem good at the moment, I’ve pushed SVN 1769 and 1770 with all of the undionly/ipxe files in svn trunk. undionly.kpxe is the one that we want to work, but if it isn’t working for your systems, try one of the kkpxe or pxe files as your undionly.kpxe.
[url]https://svn.code.sf.net/p/freeghost/code/trunk/packages/tftp[/url] is the location.
Special thanks to Wolfbane8653 for testing all of these and keeping an amazing log that we could easily track and find out where the issue was.
The FOG Dev team went through many builds, as you can guess, but we don’t have all the hardware, or the issues, that others were having. I hope this is the end-all-be-all fix for this issue, and thank you all for your patience. The FOG Dev team’s debugging, suggestions, patience, and testing were instrumental in finding what we could do to “fix” the issue, which led the iPXE dev team to find an entirely separate, but related, issue behind this problem.
So again, I thank you all personally and understand your frustration but appreciate your patience with this. I think I speak for the entire Dev team in saying thank you all very much.
Now on to the regularly scheduled updates. Go forth and test and report as normal. If you’re still having issues, please don’t hesitate to let us know so we can track them and help further.
Thank you,
-
This newest Undionly.kpxe is working for my DC 7900’s.
-
Bit of a noob on this, but do I just literally download the files provided in the link and use them to replace the files in the boot folder? Or is there something else I should do?
-
the tftpboot folder, yes
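For anyone else asking the same thing, here is a rough sketch of that swap in shell. It assumes you have already downloaded the boot files into a directory of your own and that your TFTP root is /tftpboot (both are arguments, so adjust as needed). The function name install_boot_files and the .bak backup naming are made up for this example and are not something FOG ships.

```shell
#!/bin/sh
# Sketch: copy downloaded iPXE boot files into the TFTP root,
# keeping a .bak copy of anything that gets overwritten.
# Usage: install_boot_files <download_dir> [tftp_root]
install_boot_files() {
    src=$1
    dest=${2:-/tftpboot}    # assumed default TFTP root

    [ -d "$src" ] || { echo "no such directory: $src" >&2; return 1; }
    [ -d "$dest" ] || { echo "no such directory: $dest" >&2; return 1; }

    for f in "$src"/*.pxe "$src"/*.kpxe "$src"/*.kkpxe; do
        [ -f "$f" ] || continue          # pattern matched nothing
        name=$(basename "$f")
        # Back up the existing file once so it can be rolled back later.
        if [ -f "$dest/$name" ] && [ ! -f "$dest/$name.bak" ]; then
            cp -p "$dest/$name" "$dest/$name.bak"
        fi
        cp "$f" "$dest/$name"
        echo "installed $name"
    done
}
```

You would most likely need to run it as root, e.g. install_boot_files ~/d6300-DEFAULT-TEST /tftpboot (the download path here is just a placeholder). If undionly.kpxe still doesn’t work afterwards, copying one of the kkpxe or pxe files over /tftpboot/undionly.kpxe is the next thing to try, as mentioned in the first post.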
-
Hi,
Another day, another bash at trying to get this working again…
I’ve replaced the files in the tftpboot folder with the files from the above link. When I PXE boot the client now, I get an IP address etc. (as before), it shows version iPXE 1.0.0+ (d630), and I get a message saying configuring … OK
then I get /default.ipxe… No such file or directory
and it just sits there.
Any thoughts?
Cheers
Matt
-
My guess is that the TFTP server is not serving files from the expected location.
See if your FOG server has /var/lib/tftp or /var/lib/tftpboot, and verify that the /tftpboot/default.ipxe file exists. If it does, and one of the other directories exists as well, try:
[code]ln -s /tftpboot/default.ipxe /var/lib/{WHICHEVER}/default.ipxe[/code]
Try booting again and hopefully all will start working.
-
Hi Tom,
OK my server has only a /var/lib/tftpboot folder BUT does not have any files inside of it… where will I find a copy of the correct default.ipxe to put inside of it?
I’m hoping that this will then fix the issue?!?
Thanks
Matt
-
You should have a folder in your root called /tftpboot. If not, then you will need to re-install FOG.
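A quick way to check all of this at once, sketched in shell: it looks for default.ipxe under /tftpboot and lists which of the /var/lib candidates mentioned above exist. The function name check_tftp_layout and the optional prefix argument (only there so it can be tried against a scratch directory) are just illustrative.

```shell
#!/bin/sh
# Sketch: report whether /tftpboot/default.ipxe exists and which of the
# candidate TFTP directories are present.
# Usage: check_tftp_layout [prefix]   (prefix defaults to empty, i.e. the real /)
check_tftp_layout() {
    prefix=${1:-}
    if [ -f "$prefix/tftpboot/default.ipxe" ]; then
        echo "found: /tftpboot/default.ipxe"
    else
        echo "missing: /tftpboot/default.ipxe"
    fi
    for d in /var/lib/tftp /var/lib/tftpboot; do
        if [ -d "$prefix$d" ]; then
            if [ -e "$prefix$d/default.ipxe" ]; then
                echo "dir: $d (has default.ipxe)"
            else
                echo "dir: $d (no default.ipxe)"
            fi
        fi
    done
}
```

If it prints something like "dir: /var/lib/tftpboot (no default.ipxe)" while /tftpboot/default.ipxe is found, the ln -s suggestion above is the thing to try next. If /tftpboot/default.ipxe itself is missing, that points to the re-install.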
-
and copied it now into /var/lib/tftpboot - I now get the FOG boot menu
BUT
If I now click on Quick Image I get as shown in the attached image…
[url=“/_imported_xf_attachments/0/877_IMG_2051.JPG?:”]IMG_2051.JPG[/url]
-
I think i may have spoken too soon about my HP 7900’s. It works fine on a box i’ve been building for a new user. But several of them in production had issues with Looping restarts this morning. Escaping out of PXE boot mode allows them to get going.
There are no hardware differences between the boxes. Only their physical location is different, so they may not share the same switches/networking gear. They are both in the same office building, and the network, subnet, and DHCP server are all the same.
I’m going to pack the new box out to the station that was having trouble and test it there to see if that makes a difference. I’ll let you know.
-
Wow never expected that result.
Yes there appears to be something in the network itself causing this newer problem. Took my test machine out to the “Production” workstation and it too failed to boot (reboot loop). Brought it back to the “Test” desk and it works fine.
Grabbed the box from the “Production” workstation and brought it to my “test” desk. It booted up flawlessly.
Upon initial inspection, the only major difference in network protocol is that the “Test” desk is 10/100, while the “Production” desk is running on gigabit.
Further inspection shows that all the core switches are either Dell Powerconnect or Netgear Prosafe Gigabits. The “Test” desk has a cheap-o C-net 10/100 24 port to give me the number of ports i need at the desk, which effectively drops the network speed to 10/100 for that cable run. The “Production” desk is running directly off the core switches at full Gigabit speed.
I then pulled the cable from my 10/100 switch and tried booting a pc on the “Test” desk from the direct line, without the 10/100 switch reducing the transmission speed. The reboot loop occurred once again.
Therefore i must conclude that there is an issue with Undionly and gigabit speeds.
-
[QUOTE]Netgear Prosafe Gigabits. [/QUOTE]
Be absolutely sure that STP is disabled on these switches. Otherwise, PXE or iPXE will not work.
-
The netgear switches are isolated to our DMZ, i didn’t take the time to trace all the cables. But to allay your fears, .33 worked perfectly for all our machines w/pxe.
The “Production” desk is directly connected to a dell Powersafe, that i know for certain.
This is a recent issue since 1.0.1 and more specifically, if STP were an issue i would not expect it to suddenly “go away” just because i put another switch in-line after it. In fact i would expect it to get worse or at least stay the same since there would be more usage of STP with another switch in place.
In this particular office, we only have 2 subnets. Our DMZ, and the main network. In fact, i only ever use Fog here in this office to clone new machines, and i’ve done that well over 100 times on .33 with no problems
Well, other than multicast taking the network down when we try to use it, but that’s neither here nor there and probably has more to do with our highly distributed network and massive VPN usage to handle it all. In either case, i’m not worried about multi-casting, and we don’t push images over the WAN anyway.
-
[QUOTE]But to allay your fears, .33 worked perfectly for all our machines w/pxe.[/QUOTE]
Fears eased. Just had to throw that out there just in case. I have Netgear Prosafes (1GB) w/ 1GB GBIC’s here and it works…except for the few bugs i keep hitting…cough TOM!..
Have you tried any of the other undionly/ipxe files from [url]https://mastacontrola.com/ipxe/[/url]? I would strongly suggest [URL=‘https://mastacontrola.com/ipxe/d6300-DEFAULT-GOOD/’]d6300-DEFAULT-GOOD/[/URL]. This is the last one I tried in my rounds of testing here, and it seems to be running smoothly.
-
When connected to the 10/100 switch everything functions as well as can be expected. They boot just fine right into the menu and then on to windows.
Not that it’s necessarily relevant yet, but it locks up at “loading boot sector… booting…” when trying to do a memtest. Let’s not trouble Tom with other issues here until we can get everyone to the menu first
However, bump up the speed to 1gb (by removing the 10/100 switch) and it picks up DHCP, goes to configuring, twirls there for a bit, then flashes an error message which i can’t read because it instantly reboots.
-
Tom,
I got rid of the “Operation not Supported” message on a Dell 390 today by using the “e0478-DEFAULT-TEST” ipxe.pxe . Though that version did not work on the Dell 990’s. I went through all of the most current files you have posted and here is what I have found.
e0478-DEFAULT-TEST
undionly.kkpxe - reboot loop on Dell 990, operation not supported
ipxe.pxe - operation not supported, then freezes
undionly.kpxe - operation not supported, then freezes
undionly.pxe - assuming it errors, but it goes way too fast to read the error, then reboots
ipxe.kpxe - operation not supported, then freezes
ipxe.kkpxe - operation not supported, then freezes
d6300-DEFAULT-GOOD
ipxe.pxe - operation not supported, then freezes
undionly.kpxe - operation not supported, then freezes
undionly.pxe - assuming it errors, but it goes way too fast to read the error, then reboots
ipxe.kpxe - operation not supported, then freezes
ipxe.kkpxe - operation not supported, then freezes
undionly.kkpxe - operation not supported, then reboots
-
The undionly.kpxe file is not your problem. Something else is not configured properly.
Use the d6300-DEFAULT-GOOD file.
-
What would you suggest? Should I run a backup and re-install? Or would that carry the same problem over? It’s just weird that the same issue happened on a Dell 390, and when I changed to ipxe.pxe it worked fine.
-
[quote=“Tribble, post: 29072, member: 17221”]When connected to the 10/100 switch everything functions as well as can be expected. They boot just fine right into the menu and then on to windows.
Not that it’s necessarily relevant yet, but it locks up at “loading boot sector… booting…” when trying to do a memtest. Let’s not trouble Tom with other issues here until we can get everyone to the menu first
However, bump up the speed to 1gb (by removing the 10/100 switch) and it picks up DHCP, goes to configuring, twirls there for a bit, then flashes an error message which i can’t read because it instantly reboots.[/quote]
In reference to the “twirling” you’re experiencing, do your switches have STP enabled on them? Do you have an option for portfast on these switches?
Being connected to the 10/100 doesn’t really mean anything; it simply means the DHCP request from iPXE is being received. If you take out all managed switches and connect directly to the FOG server, does everything work as expected at 1gb?
This will let you know if it’s a 10/100->1000 problem. I run gig switches on all my tests and don’t experience the boot loop issue. (BTW, for everyone: the boot loop is kind of intentional.)
This won’t necessarily be easy, but if you can, connect your FOG server, client test system, and DHCP server to the same switch. Start off easy and use a “dummy” switch, preferably rated for 1GB.
This will, again, determine if it’s truly the undionly file or if something on your network is not sending the DHCP reply back in time.
-
[quote=“Buddy, post: 29100, member: 225”]What would you suggest? Should I run a backup and re-install? Or would that carry the same problem over? It’s just weird that the same issue happened on a Dell 390, and when I changed to ipxe.pxe it worked fine.[/quote]
The reason I asked in past posts whether you performed a schema update is that pieces of information are pulled from the database. If the DB is not accessible, it can’t properly create the boot scripts.
That all said, you stated that the schema is fine and you can log in to the FOG Web GUI, correct? Using the “latest-ish” undionly/ipxe files has helped you get further, but the 990’s are still not working, correct?
Is it possible these systems aren’t receiving all the data intended? Meaning, it’s only downloaded a portion of the default.ipxe, if any?
Is your system actually trying to follow the right path (pxe -> download undionly.kpxe -> load ipxe -> receive new dhcp -> load the /tftpboot/default.ipxe -> load menu system at [url]http://FOGIP/fog/service/ipxe/boot.php?mac=XX:XX:XX:XX:XX:XX[/url])? We already know they are pxe booting, downloading the undionly.kpxe, and attempting to load default.ipxe.
While the link to boot.php is not 100% accurate, as the ipxe loading uses POST variables as opposed to GET variables, if you go to the link you should see your tasking or menu information. Replace the XX’s with the client MAC, or just stop at boot.php without sending the MAC at all. What do you see in your browser if you go to this link?
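If a browser isn’t handy on that network, the same check can be sketched from a shell. This is only an illustration: the function names boot_php_url and looks_like_ipxe_script are made up for this example, and it assumes boot.php answers with an iPXE script, which by iPXE convention starts with a #!ipxe line. The IP and MAC in the comment are placeholders.

```shell
#!/bin/sh
# Sketch: build the boot.php URL for a FOG server and sanity-check the reply.

# boot_php_url <fog_ip> [mac]  -> prints the URL to request
boot_php_url() {
    url="http://$1/fog/service/ipxe/boot.php"
    if [ -n "${2:-}" ]; then
        url="$url?mac=$2"
    fi
    echo "$url"
}

# looks_like_ipxe_script: reads a response on stdin and succeeds only if
# its first line begins with the "#!ipxe" marker.
looks_like_ipxe_script() {
    first=$(head -n 1)
    case $first in
        '#!ipxe'*) return 0 ;;
        *) return 1 ;;
    esac
}

# Example (needs curl and a reachable FOG server):
#   curl -s "$(boot_php_url 192.168.1.10 00:11:22:33:44:55)" | looks_like_ipxe_script \
#       && echo "boot.php returned an iPXE script"
```

If the check fails, looking at the raw curl output (an HTML error page, a PHP/database error, or nothing at all) should say more about where the chain is breaking than the client screen does.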
All of your other systems are working, just not the 990’s. Is there a wireless nic, or some other nic on the system?