PXE & startnet.cmd Questions
DHCP, HTTP, NFS, SMB etc. (another server is DNS)
All the above is on ESXi as a VM, test host is a physical Dell 3620 Workstation
So my question(s) pertain to using FOG to PXE boot Windows installs using wimboot, WinPE, etc. The Windows version doesn't really matter at this point (the issues are pre-Windows), but in the end I'll want it to work with 7, 10, Server 2012 R2, and Server 2016.
I am running a Server 2016 VM with latest MDT, ADK, etc. I followed this thread and am using the batch file to generate the winpe image with the startnet.cmd batch file in it.
So here is the strange bit. If I boot the machine in BIOS mode, everything works with the script pretty much as is. WinPE boots, startnet.cmd starts to run, and almost immediately (after pinging) it connects to the share and runs setup.exe.
My issue is mostly when booting UEFI (secure boot off). WinPE boots, and then it's inconsistent.
Sometimes it hangs for anywhere from 1-3 minutes at the ‘net use’ command. At that point it will either connect and move on, or give me an error 53 (something about not finding the path). If I get error 53, I can then type the command by hand in that same window and it will work.
If it does move past the ‘net use’ command, it executes the setup.exe command and usually nothing happens right away (on rare occasions it opens quickly, as expected). After this line executes I have waited anywhere from 1-10 minutes before setup actually opens and runs.
The really odd part is that if I then continue through setup after that slow start, setup itself is also super slow, when typically it's not.
Here are a few things I tried:
- Adding ‘sleep 2’ lines in between ping, net use and setup.exe. When it runs I get errors that ‘sleep’ isn't recognized. However, the error alone seems to increase the rate at which this succeeds. Could be placebo.
- Adding -n 20 to ping, making it run 20x instead of the typical 4x, which puts about 20 seconds between the initial run and the ‘net use’ attempt. This may marginally increase success, but not substantially enough.
- Creating a loop in which ping and ‘net use’ re-run if ‘net use’ didn't connect. Ping is re-run only to provide a reasonable delay. This works in that it always eventually connects, but it still puts me in the 1-10 minute time frame to get setup.exe running.
My startnet.cmd looks basically like this:
wpeinit
:loop
ping 10.0.0.2
net use i: \\10.0.0.2\pxeshare\os\win\s16 /user:username password || goto :loop
i:
i:\setup.exe
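A bounded variant of the loop above might look like the following. This is only a sketch: the retry limit of 30 and the roughly 5-second pause are assumed values, not tested ones, and the pause uses ping against 127.0.0.1 because ‘sleep’ was not recognized in this WinPE image.

```bat
@echo off
rem Sketch only: same server, share and credentials as the script above.
rem MAX_TRIES and the pause length are assumed values, tune as needed.
setlocal
set MAX_TRIES=30
set TRIES=0

rem Start networking in WinPE.
wpeinit

:loop
set /a TRIES+=1
rem Pause about 5 seconds (6 pings to localhost, one per second).
ping -n 6 127.0.0.1 >nul
net use i: \\10.0.0.2\pxeshare\os\win\s16 /user:username password
rem errorlevel 0 means the share was mapped.
if not errorlevel 1 goto :mapped
if %TRIES% geq %MAX_TRIES% goto :failed
echo net use failed, attempt %TRIES% of %MAX_TRIES%
goto :loop

:mapped
i:
i:\setup.exe
goto :eof

:failed
echo Could not map the share after %MAX_TRIES% attempts.
```

If the limit is reached, the script prints a message and returns to the prompt instead of looping forever, so ‘net use’ can still be typed by hand.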
The user is one I set up on the FOG server, in a group created for SMB (smbgrp) specifically for startnet.cmd (as the password is visible). This user has SMB credentials and access to the \pxeshare folder and its subfolders. I have used it from various other Linux and Windows machines to connect to or map the share without issue.
Any help would be greatly appreciated, I have been struggling with this for a couple days and feel like I am just spinning my wheels at this rate. Thanks
@george1421 Ok so not an issue with the network, R620 works just fine off the same switch.
I am gonna chalk it up to the Dell 3620’s firmware (which is the latest) being problematic. It's not the first time it has proved to be, and likely won't be the last.
I guess I can live with the issue as it always eventually runs setup.exe. As long as it's not a widespread issue and is limited to that model of machine, I am gonna move on and just live with it.
Thanks for the help though, always good to have another person to bounce ideas off and to get other ideas from.
@george1421 So I tried it on an R620, booted multiple times and connected to share and ran setup.exe almost instantly. Granted the machine takes ~4 mins to reboot and cycle back to the pxe menu.
I injected the drivers from the pack into the wim and just tried again on the Dell 3620, “same” problem in that it had issues. Actually (likely coincidentally) at the worse end of the spectrum of problems. Got error 53 trying to mount share, my loop took 12 cycles to finally connect, then setup.exe took roughly 1 min to show up.
So it looks like drivers may not have resolved things. I am beginning to think it's these Dell 3620 workstations and their firmware that's just bad.
I am also wondering if it's a networking issue, as the data passes through a dumb switch to where I am testing the 3620. I'll roll an R620 over to it. If the R620 suddenly has issues too, then maybe it's a network issue and not specific to the machine.
If all else fails I'll try a share on a Windows server to serve the install media.
@zer0cool OK good choice on testing with windows server hosting the files. That was going to be my next suggestion.
I also would suggest using the dell winpe10 drivers (from my instructions) and loading those into your winpe image. There is nothing “dell” in there, they are common network and disk subsystem drivers. They do work on most hardware, even not dell hardware (an intel nic is an intel nic).
If you get super twisted up, it would be interesting to know what it's doing during this pause. I’m thinking wireshark on a mirrored port, to see if it's trying to reach out to something on the network while it is waiting and then just gets tired and gives up.
@george1421 I didn't feed it any drivers, was hoping for a machine-agnostic WinPE image… maybe wishful thinking. I can add the drivers to it.
Dell Precision 3620 is the box with the delay, I'll try something else and see if it's an issue there too.
The image I am using starts up to a command prompt and runs the commands. When it fails to connect to the share I can manually type in the ‘net use’ part and get it to connect.
setup.exe, however, appears to hang in the command window but given enough time eventually runs. I haven't had an instance of it failing yet (as long as the share is mapped).
Gonna try another machine first, if it works on another machine differently I will try drivers in PE.
I am starting to lean towards a possible Samba issue. I am even considering placing the Windows installation/WinPE stuff on a Windows server machine and serving it from there (I had it this way when I initially started, then moved the install media to my FOG server). If it works from the Windows-based share I will know it's a Samba issue on the FOG/CentOS server.
@george1421 Ok scrap some of my prior comments…I went for a “sanity check” only to have it backfire on me.
It is now happening in BIOS boot as well, however seemingly not as slow (1-3 mins when the issue occurs vs 5-10 mins under UEFI).
The observation that seems the most sure is that time between runs appears to have an impact. If I run it and have not done so recently it works pretty much instantly. If however I run it, cancel and reboot and run it again I then have issues. Giving it some time between attempts appears to have the most noticeable impact of anything I have tried yet.
Does samba server (run from CentOS) have anything about it that may cause slow auth, prevent multiple connections from the same user, etc.?
Does the act of mounting the share in WinPE get stored server side in any way? Like if I reboot and don't give it time to flush out, it thinks it's still connected or won't let it reconnect right away?
@zer0cool That was something I didn’t touch on before, what built in drivers are you installing in the winpe image? Did you use my instructions for creating the winpe image?
Also what hardware are you pxe booting that has this delay?
Using my winpe image I sent you, that will drop you to a command prompt. What happens if you execute the net use command by hand then?
‘timeout 10’ gave me the same “command not recognized” error as before.
Still, if it was an SMB issue wouldn't it also be a problem when BIOS booting the machine and running WinPE via PXE?
It's odd that BIOS booted it works instantly every time as expected, but when UEFI booting it works right maybe 1 in 10 times.
I could live with an extra 1-2 minutes to run setup.exe, maybe, but waiting 5-10 minutes for it is a no go (especially if others are to use this).
It's also weird that the share will mount but then setup.exe takes forever to execute. In other words, the command to run setup.exe is executed but it then takes a long time for the setup dialog to appear.
Is it possible winpe uses different NIC drivers under BIOS/UEFI?
The BCD, boot.sdi and boot.wim are the same for BIOS and UEFI, correct? I don't need to pass iPXE a different set for BIOS vs UEFI, do I?
I am kind of grasping at straws here :/
@george1421 Set logging to 10 and added a line to smb.conf to generate a log for each machine. Reviewing the log for the problem machine doesn't appear to reveal any clues.
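For reference, a per-machine debug log of this kind is usually set up in the [global] section roughly as below. The exact lines used here were not posted, so treat this as a sketch: the log path is an assumption, and %m is Samba's substitution for the client's NetBIOS name.

```ini
[global]
    # Debug level mentioned in the post.
    log level = 10
    # Assumed path; %m expands to the client machine's NetBIOS name,
    # giving one log file per connecting machine.
    log file = /var/log/samba/log.%m
```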
I get some lines about it finding the user, then something like ‘adding homes service for user...’ and ‘lp_servicenumber: couldnt find homes.’
The log looked the same when it booted, connected instantly and ran setup.exe within seconds as it did on the last pass, when ‘net use’ worked within about 30 seconds and then setup.exe took ~5 mins to run.
It seems to me almost as if it runs fine after a period of time has passed but if I exit the setup and try too soon it has issues.
I am still at a loss with this, gonna try the timeout option peppered in, not sure what else to do beyond that.
@zer0cool I see that my smb.conf is a bit more complex, but I’m not seeing anything that jumps out at me as blatantly wrong with your setup: https://forums.fogproject.org/topic/10944/using-fog-to-pxe-boot-into-your-favorite-installer-images/2
The DNS names should be OK, but remember we are dealing with SMB naming and not IP/DNS naming (not sure how to describe it). It's more on the WINS side.
@george1421 you know, you keep helping me you are gonna have to start charging me :P
Some of your ideas were thoughts of mine too, but I have since realized some of them are not the culprit.
‘wpeinit’ I would presume has the network up, since the ‘ping’ command works prior to ‘net use’. I will give ‘timeout’ a try, however, as it's basically the same concept I was trying before.
I think the ping command is in that script mostly to add a delay and the verifying connection is kind of secondary. That was the only rationale I could come up with for pinging 4 times or as you pointed out I could simply ping 1 time and use it to verify the connection.
I am curious about your DNS statement though. Since I am using IP instead of hostname, shouldn’t that “circumvent” the DNS being an issue? If not, maybe some sort of delay as you mention is needed, however I found ‘ping -n 20’ to not be long enough to make a difference. Also, it doesn't explain why it's not an issue when BIOS booted.
My samba entry is super simple, just:
[global]
unix extensions = no

[pxeshare]
comment = PXE Server SMB Share
path = /pxeshare
valid users = @smbgrp
guest ok = no
browsable = yes
writable = yes
follow symlinks = yes
wide links = yes
I will take a look at the samba logs and see if any errors are present.
From your post I can’t tell if you are having a networking issue (tcp/ip) or a SMB (network share) issue.
Your ping command might be improved a bit to test whether the host is reachable.
ping -n 1 192.168.1.1 | find "TTL=" >nul
if errorlevel 1 (
    echo host not reachable
) else (
    echo host reachable
)
Another thought: ‘wpeinit’ starts the networking subsystem. You may need to create a small delay while the network comes up and the target gets an IP address.
So you could do something like:
ping 127.0.0.1 -n 11 > nul
To give the networking a chance to come up before trying to mount the network share.
While I have no basis for this, you have to consider a freshly booted computer may not be listed in DNS right away. Is your samba server doing any type of computer validation? Is samba showing any error in the error log when you try to connect?