Optiplex 3020 TFTP Issue
-
[quote=“Jack S., post: 44818, member: 29223”]Well unfortunately, we do not have any hubs on hand to do this, but I’d welcome some guesses![/quote]
Guessing at this is like searching a house for drugs: not very efficient without one of those trained dogs with you… My trained dog for network hiccups is wireshark/tcpdump. If you don’t have a hub, you can either connect your capture laptop to the switch, since some DHCP servers broadcast all the time (so you see that traffic anyway), or run tcpdump on the server where your dnsmasq DHCP is running:
[CODE]sudo tcpdump -i eth0 -w dhcp_dump.pcap udp[/CODE]
You’ll only see one message from tcpdump; all the captured packets are written to the file. After you have booted one of your failing clients, stop tcpdump (Ctrl+C), copy the file off your server (e.g. with WinSCP) and check it out in Wireshark, or upload it here so we can do that.
Are you absolutely sure that there is no other DHCP server or DHCP proxy on the net???
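One way to answer that question from the capture itself is to count which addresses actually sent DHCP replies. This is only a rough sketch: the function name is made up, it assumes the capture file from the command above, and it relies on tcpdump’s default one-line output for plain IPv4 (timestamp, "IP", then source address and port), so VLAN-tagged captures would need adjusting.

```shell
# Hypothetical helper: list every address that sent a DHCP/proxyDHCP reply.
# $1 = capture file written by "tcpdump -w"
dhcp_reply_sources() {
    tcpdump -nn -r "$1" 'udp and (src port 67 or src port 4011)' 2>/dev/null |
        awk '$2 == "IP" { src = $3; sub(/\.[0-9]+$/, "", src); n[src]++ }
             END { for (s in n) print n[s], s }' |
        sort -rn
}
# Usage: dhcp_reply_sources dhcp_dump.pcap
```

More than one address in the output means more than one box is answering DHCP requests on that segment.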
-
Okay I can give tcpdump a shot when I get in the office later today.
[QUOTE]Are you absolutely sure that there is no other DHCP server or DHCP proxy on the net??? [/QUOTE]
I am fairly certain that there isn’t any DHCP server that can hand out PXE information. We do have DHCP running in our Windows domain, but it was never touched while we set up FOG, nor has the Windows DHCP ever been configured for PXE.
-
Okay, I have some captures to take a look at and I’ll upload them here as well. I have a capture from a machine that can successfully boot into FOG (a Dell Latitude E5540) and one from a machine that cannot (one of the Optiplex desktops in question). It’s very curious, because from just looking over the captures both seem to TFTP from the right server (10.1.0.53), but the Optiplex times out.
Virustotal link for my ZIP file: [url]https://www.virustotal.com/cs/file/e8529017b59fa10f49b4005fbe84432977b3e3a32163cc15e39b024f3604f227/analysis/1427906958/[/url]
[url=“/_imported_xf_attachments/1/1843_DHCP_DUMP.zip?:”]DHCP_DUMP.zip[/url]
-
Does the Optiplex 3020 have solid state drives?
And, out of curiosity, can you enable the long POST check on one? Like, a full memory count before booting… or something similar?
I read something the other day that claims this can help newer/faster systems PXE boot properly… Again, just a curiosity.
-
The hard drives are standard HDDs. Any ideas on how I would enable a “long” POST check?
-
Hi,
we have some optiplex 380, 390, 3010 and 3020 and no problem with PXE. Just the 3020 is very slow to start.
-
I just checked on my Optiplex 7010,
My particular BIOS does not allow Long-POST, or Through-POST.
I know older systems used to allow configuring fast POSTing and full POSTing…
Maybe that’s gone to the wayside??? Either way, you don’t know till you just get in there and look.
-
[quote=“davido38, post: 44849, member: 29185”]Hi,
we have some optiplex 380, 390, 3010 and 3020 and no problem with PXE. Just the 3020 is very slow to start.[/quote]
Do you happen to know which version of FOG/which kernel you are running?
[quote=“Wayne Workman, post: 44850, member: 28155”]I just checked on my Optiplex 7010,
My particular BIOS does not allow Long-POST, or Through-POST.
I know older systems used to allow configuring fast POSTing and full POSTing…
Maybe that’s gone to the wayside??? Either way, you don’t know till you just get in there and look.[/quote]
I just checked my BIOS options and there was barely anything there for POST, so I think I’m SOL on that path.
I’m thinking this is definitely a hardware issue with this particular batch of computers (8 in total), but I’m holding on to hope.
-
So the Cisco VoIP phone is an indicator to me. The TFTP requests that VoIP phones make for their configuration are a known potential problem. What’s more concerning is that the crappy 8-port switch with this VoIP phone only seems to affect the 3020s. Does this somewhat match what you’re seeing? Is there a way to find out what the VoIP system is looking at for its configuration?
-
[quote=“Tom Elliott, post: 44855, member: 7271”]So the Cisco VoIP phone is an indicator to me. The TFTP requests that VoIP phones make for their configuration are a known potential problem. What’s more concerning is that the crappy 8-port switch with this VoIP phone only seems to affect the 3020s. Does this somewhat match what you’re seeing? Is there a way to find out what the VoIP system is looking at for its configuration?[/quote]
The VoIP system gets its configuration information from a separate subnet. All end devices (PCs, laptops, servers) are on 10.x.x.x while the VoIP system is on 172.16.x.x. What gets even weirder is that we’ve been able to FOG without issue for months now through the VoIP phone’s built-in switch. Not all desks have the 8-port switch to worry about, so maybe tomorrow I’ll try a port straight from the wall and see where that gets us.
-
Okay, so bad news: using a cable straight from a wall jack into the computer made no difference. The OptiPlex 3020 still tries to grab default.ipxe from 10.1.0.1 instead of 10.1.0.53. Even worse news: I just tried to image a Dell OptiPlex 380, and while it could boot the FOG menu (using the same wall jack as the OptiPlex 3020), PartClone did not start and it just sat at a black screen for about 10 minutes. I also tried to create a task through the FOG web console and the task just sat at “Queued” even when I tried to force it to start. I’ve restored the FOG VM to what I thought was a stable configuration from before we even tried to mess with the 3020s, but it looks like I need to go back further.
-
What is running DHCP in your environment? Do you have access to it?
-
[quote=“Jack S., post: 44806, member: 29223”]I should point out that 10.1.0.1 is not the IP for our FOG server, but rather it is the router for our network, which does not do DHCP, nor is it specified anywhere in the FOG configs that I’ve checked.[/quote]
The packet dumps tell a completely different story… I see two answers to every DHCP request. One is coming from 10.1.0.53 (next-server and filename are properly set but no IP is being offered!) and another one is coming from 10.1.0.1 (IP is being offered but no PXE options!). [B]Looks like a Proxy DHCP setup to me![/B]
From what I can see it’s working for some of your devices, so I guess it’s set up more or less properly. The question remains: why does the Optiplex 3020 fail to find the correct TFTP server? [I]Well, in fact it does find the correct TFTP server on the first run[/I], when the PC comes up and the NIC itself requests an IP and loads undionly.kpxe from the TFTP server. I see that this works great in both cases!! But after that, things differ. On working devices the iPXE binary (undionly.kpxe) requests another IP and loads default.ipxe, fine! On the failing device I don’t see another DHCP request, weird!
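For anyone who wants to see the same split without opening Wireshark, here is a rough sketch that prints just the relevant BOOTP fields for one responder. The function name is made up, and the field labels (Your-IP, Server-IP, file) are what tcpdump prints with -v on the builds I know of, so treat them as an assumption for yours. As far as I know tcpdump leaves out the Your-IP line when that field is 0.0.0.0, so a missing line means no address was offered.

```shell
# Hypothetical helper: show offered IP, next-server and boot file per reply.
# $1 = capture file, $2 = address of the responder to inspect
dhcp_reply_fields() {
    tcpdump -nn -v -r "$1" "udp src port 67 and src host $2" 2>/dev/null |
        grep -E 'Your-IP|Server-IP|file "'
}
# Usage: dhcp_reply_fields dhcp_dump.pcap 10.1.0.53
#        dhcp_reply_fields dhcp_dump.pcap 10.1.0.1
```

In a proxy DHCP setup you would expect the first call to show Server-IP and file lines only, and the second to show Your-IP lines only.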
[quote=“Jack S., post: 44806, member: 29223”]tftp://10.1.0.1/default.ipxe…Connection Timed Out[/quote]
Could you please provide more info on what happens before this! Take a picture with your camera or type it from the screen. We need to know exactly what messages you see on the screen before this error comes up… Maybe the Optiplex 3020 is a bit buggy when it comes to iPXE and Proxy DHCP??
-
[quote=“Wayne Workman, post: 44901, member: 28155”]What is running DHCP in your environment? Do you have access to it?[/quote]
Windows is running DHCP in our environment, and yes, I can get access to it.
[quote=“Uncle Frank, post: 44902, member: 28116”]
Could you please provide more info on what happens before this! Take a picture with your camera or type it from the screen. We need to know exactly what messages you see on the screen before this error comes up…
Maybe the Optiplex 3020 is a bit buggy when it comes to iPXE and Proxy DHCP??[/quote]
Thanks for going through those dump files! It is really hard to tell what comes up on the screen before the error since it goes by so quickly, but it looks like the OptiPlex initially pulls the correct information like you said.
It looks like the screen says the IP of the PC, the FOG server IP, subnet mask, default gateway of 10.1.0.1 and then it flashes to iPXE initializing devices…ok. Then it goes to tftp://10.1.0.1/default.ipxe…Connection timed out ([url]http://ipxe.org/4c126035[/url]) Selected boot device failed.
The PCs that work have the same information but they pull tftp://10.1.0.53/default.ipxe
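To double-check that 10.1.0.1 really has nothing to serve, the same request can be made from any Linux machine on that subnet. A minimal sketch, assuming a curl build with TFTP support; the function name is made up.

```shell
# Hypothetical helper: try to fetch one file over TFTP and report the result.
# $1 = server address, $2 = file name
tftp_check() {
    if curl --silent --max-time 10 --output /dev/null "tftp://$1/$2"; then
        echo "$1: served $2"
    else
        echo "$1: no answer for $2"
    fi
}
# Usage: tftp_check 10.1.0.53 default.ipxe
#        tftp_check 10.1.0.1 default.ipxe
```

If the FOG server answers and the router does not, the timeout is down to the client picking the wrong address rather than to the network path.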
-
[quote=“Jack S., post: 44905, member: 29223”]Windows is running DHCP in our environment and yes I can get access to it[/quote]
So maybe you turn off FOG DHCP (‘service dnsmasq stop’) for a moment and configure your Windows DHCP server with DHCP options 066 and 067 ([url]http://www.fogproject.org/wiki/index.php/Windows_DHCP_Server[/url]). I don’t know much about this but Wayne does! I guess he can tell you all about it!!
-
[quote=“Uncle Frank, post: 44907, member: 28116”]So maybe you turn off FOG DHCP (‘service dnsmasq stop’) for a moment and configure your Windows DHCP server with DHCP options 066 and 067 ([url]http://www.fogproject.org/wiki/index.php/Windows_DHCP_Server[/url]). I don’t know much about this but Wayne does! I guess he can tell you all about it!![/quote]
Oooo man, looks like I lied: DHCP is now handed out by our router, 10.1.0.1, and that I don’t have access to. Although that may start to explain some issues.
-
Ask your networking guy to configure DHCP options 066 and 067 on the router’s DHCP service.
Then turn off DNSMASQ like Uncle Frank suggested… see what happens…
----------------------Resources----------------------------
Best one:
[url]http://www.symantec.com/business/support/index?page=content&id=HOWTO8974[/url]
Others:
[url]http://www.networking-forum.com/viewtopic.php?t=25022[/url]
[url]http://www.cisco.com/c/en/us/td/docs/net_mgmt/network_registrar/6-1-1/user/guide/users/UserApB.html[/url]
[url]http://www.cisco.com/c/en/us/td/docs/net_mgmt/network_registrar/6-1-1/user/guide/users/UserApB.html#wp1094757[/url]
[url]http://blog.kolbash.net/2012/08/cisco-ios-dhcp-server-supporting.html[/url]
-
Well, today I tried to update to the latest svn 3200, but after the script ran nothing changed: everything said I was still on 2993, even after a server reboot (I wish I’d had the foresight to grab the logs). Then on a whim I decided to revert the VM to a snapshot I had from before I updated to svn 2993, so it was just native FOG 1.2.0, and now everything works!
-
huh…
Please do us a favor… please…
[B]Keep your working snapshot[/B]
And update FOG by like 10 revisions at a time… test each one with MemTest, perhaps.
So, 2920, then 2930, then 2940… find out where it breaks… Then revert to your working snapshot, take the last range you used, and update one revision at a time… Figure out EXACTLY where it breaks. This will allow the developers to fix this issue for you and everyone else.
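The stepping can be written down as a plan first. This sketch only prints the commands for one pass and runs nothing. The function name is made up, the revision numbers are just the examples from above, and the installer path assumes you run it from the top of an svn checkout of FOG trunk.

```shell
# Hypothetical helper: print one update-and-test step per revision.
# $1 = first revision, $2 = last revision, $3 = step size
plan_updates() {
    rev=$1
    while [ "$rev" -le "$2" ]; do
        echo "svn update -r $rev && (cd bin && sudo ./installfog.sh)  # then PXE boot a 3020"
        rev=$((rev + $3))
    done
}
# Coarse pass:  plan_updates 2920 2990 10
# Fine pass:    plan_updates 2931 2940 1   (if 2930 worked and 2940 broke)
```

Revert to the working snapshot before starting the fine pass, so that each test starts from a known good state.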