FOG 1.2
All our PCs have VMware Workstation installed (hence the two additional MAC addresses).
Posts made by Trevelyan
-
RE: Fog client weirdness
After trying some other PCs, I have a feeling this might be down to how quickly the PCs with SSDs boot. If anyone can confirm or correct that, please do. What I’ll do first is reimage one HDD machine and one SSD machine to see whether they behave the same way. Then I’ll set the FOG service to delayed start and reimage some machines to see if that changes anything.
-
RE: Fog client weirdness
Two log files here - the first is from after I restarted the FOG service (left side) and the second is from after a reboot (right side). It just makes no sense to me why errors are being returned like this. The only hardware difference between our rooms is the presence of an SSD in these labs, and that has never caused an issue before!
[url="/_imported_xf_attachments/1/1492_worksdoesnt1.png?:"]worksdoesnt1.png[/url] [url="/_imported_xf_attachments/1/1493_worksdoesnt2.png?:"]worksdoesnt2.png[/url]
-
RE: Fog client weirdness
The weird thing is that in this room of 21 PCs, 2 apparently managed to take a snapin after I wiped the snapins, tasks and multicast sessions. No idea why, though, as their fog.log files are spammed with GUI notification/dispatch failed entries (which apparently stop once you log in).
But, really bizarrely, manually restarting the FOG service on one of the PCs seemed to fix this issue. But after a restart, it persists. This just makes no sense at all!
-
RE: Fog client weirdness
Mental.
[url="/_imported_xf_attachments/1/1491_mental.png?:"]mental.png[/url]
-
RE: Fog client weirdness
Not using wireless at all. And as you say, the hosts are all in the database (which is apparent from the fog log too). It is really confusing and kind of frustrating too to be honest as I have no idea really what to even look for to resolve this.
I’ll try the wiping of those sessions though, cheers!
-
RE: Fog client weirdness
Weird thing here also is that even after I cancelled a load of tasks, they still show up under “Active Multicast Tasks”. I can’t even cancel them from this page - cancel doesn’t actually do anything. But many of those tasks were deleted already by me from the all tasks page…
[url="/_imported_xf_attachments/1/1480_er.png?:"]er.png[/url]
-
Fog client weirdness
So there I am, looking at why some of our rooms aren’t taking snapin tasks, and I look through the fog.log file to see this sometimes:
[QUOTE]
04/11/2014 13:07 FOG::SnapinClient Attempting to connect to fog server…
04/11/2014 13:07 FOG::SnapinClient General Error Returned:
04/11/2014 13:07 FOG::SnapinClient #!er:No Host Found
04/11/2014 13:12 FOG::SnapinClient Attempting to connect to fog server…
04/11/2014 13:12 FOG::SnapinClient General Error Returned:
04/11/2014 13:12 FOG::SnapinClient #!er:No Host Found
04/11/2014 13:18 FOG::SnapinClient Attempting to connect to fog server…
04/11/2014 13:18 FOG::SnapinClient General Error Returned:
04/11/2014 13:18 FOG::SnapinClient #!er:No Host Found
And so on…
[/QUOTE]
It seems consistent across the rooms I am looking at today but I can’t figure out what it means. When I restart, the log looks like this (from the startup, I am guessing, to when it has finished):
[CODE] 04/11/2014 13:19 FOG Service Engine Version: 3
04/11/2014 13:19 Starting all sub processes
04/11/2014 13:19 14 modules loaded
04/11/2014 13:19 * Starting FOG.AutoLogOut
04/11/2014 13:19 * Starting FOG.SnapinClient
04/11/2014 13:19 * Starting FOG.DirCleaner
04/11/2014 13:19 * Starting FOG.DisplayManager
04/11/2014 13:19 * Starting FOG.GreenFog
04/11/2014 13:19 * Starting FOG.GUIWatcher
04/11/2014 13:19 * Starting FOG.HostNameChanger
04/11/2014 13:19 FOG::GUIWatcher Starting GUI Watcher…
04/11/2014 13:19 FOG::AutoLogOut Starting process…
04/11/2014 13:19 FOG::DirCleaner Sleeping for 37 seconds.
04/11/2014 13:19 * Starting FOG.HostRegister
04/11/2014 13:19 FOG::ClientUpdater Starting client update process…
04/11/2014 13:19 FOG::ClientUpdater Sleeping for 326 seconds.
04/11/2014 13:19 * Starting FOG.MODDebug
04/11/2014 13:19 FOG::GreenFog Sleeping for 48 seconds.
04/11/2014 13:19 FOG::MODDebug Start Called
04/11/2014 13:19 FOG::DisplayManager Starting display manager process…
04/11/2014 13:19 FOG::MODDebug Sleeping for 100 Seconds
04/11/2014 13:19 * Starting FOG.PrinterManager
04/11/2014 13:19 * Starting FOG.SnapinClient
04/11/2014 13:19 * Starting FOG.TaskReboot
04/11/2014 13:19 FOG::PrinterManager Starting interprocess communication process…
04/11/2014 13:19 FOG::HostnameChanger Starting hostname change process…
04/11/2014 13:19 FOG::HostnameChanger Yielding to other subservices for 6 seconds.
04/11/2014 13:19 * Starting FOG.UserCleanup
04/11/2014 13:19 * Starting FOG.UserTracker
04/11/2014 13:19 FOG::TaskReboot Taskreboot in lazy mode.
04/11/2014 13:19 FOG::TaskReboot Starting Task Reboot…
04/11/2014 13:19 FOG::UserTracker Starting user tracking process…
04/11/2014 13:19 FOG::SnapinClient Starting snapin client process…
04/11/2014 13:19 FOG::UserCleanup Sleeping for 17 seconds.
04/11/2014 13:19 FOG::DisplayManager Attempting to connect to fog server…
04/11/2014 13:19 FOG::SnapinClient Sleeping for 386 seconds.
04/11/2014 13:19 FOG::HostRegister Attempting to connect to fog server…
04/11/2014 13:19 FOG::UserTracker Attempting to connect to fog server…
04/11/2014 13:19 FOG::TaskReboot Attempting to connect to fog server…
04/11/2014 13:19 FOG::HostnameChanger Attempting to connect to fog server…
04/11/2014 13:19 FOG::PrinterManager General Error Returned:
04/11/2014 13:19 FOG::PrinterManager #!er:No Host Found
04/11/2014 13:19 FOG::AutoLogOut Unknown error, module will exit.
04/11/2014 13:19 FOG::TaskReboot #!er:No Host Found
04/11/2014 13:19 FOG::TaskReboot No task found for client.
04/11/2014 13:19 FOG::UserTracker Unknown error, module will exit.
04/11/2014 13:19 FOG::DisplayManager Unknown error, module will exit.
04/11/2014 13:19 FOG::HostnameChanger Module is active…
04/11/2014 13:19 FOG::HostnameChanger AD mode requested, confirming settings.
04/11/2014 13:19 FOG::HostnameChanger Hostname is up to date
04/11/2014 13:19 FOG::HostnameChanger Attempting to join domain if not already a member…
04/11/2014 13:19 FOG::HostnameChanger Domain Error! (‘SetupAlreadyJoined’ Code: 2691)
04/11/2014 13:19 FOG::UserCleanup Starting user cleanup process…
04/11/2014 13:19 FOG::UserCleanup Attempting to connect to fog server…
04/11/2014 13:19 FOG::UserCleanup Module is disabled globally on the FOG Server, exiting.
04/11/2014 13:20 FOG::DirCleaner Starting directory cleaning process…
04/11/2014 13:20 FOG::DirCleaner Attempting to connect to fog server…
04/11/2014 13:20 FOG::DirCleaner Module is disabled globally on the FOG Server.
04/11/2014 13:20 FOG::GreenFog Starting green fog…
04/11/2014 13:20 FOG::GreenFog Attempting to connect to fog server…
04/11/2014 13:20 FOG::GreenFog Module is disabled globally on the FOG Server, exiting.
04/11/2014 13:21 FOG::MODDebug Reading config settings…
04/11/2014 13:21 FOG::MODDebug Reading of config settings passed.
04/11/2014 13:21 FOG::MODDebug Starting Core processing…
04/11/2014 13:21 FOG::MODDebug Operating System ID: 6
04/11/2014 13:21 FOG::MODDebug Operating System Minor: 1
04/11/2014 13:21 FOG::MODDebug MAC ID 0 00:50:56:C0:00:01
04/11/2014 13:21 FOG::MODDebug MAC ID 1 00:50:56:C0:00:08
04/11/2014 13:21 FOG::MODDebug MAC ID 2 54:BE:F7:37:B2:E1
04/11/2014 13:21 FOG::MODDebug MAC POST String: 00:50:56:C0:00:01|00:50:56:C0:00:08|54:BE:F7:37:B2:E1
04/11/2014 13:21 FOG::MODDebug A user is currently logged in
04/11/2014 13:21 FOG::MODDebug Username: (my username)
04/11/2014 13:21 FOG::MODDebug Hostname: (hostname)
04/11/2014 13:21 FOG::MODDebug Attempting to open connect to: http://fog.cst.beds.ac.uk/fog/service/debug.php
04/11/2014 13:21 FOG::MODDebug Server responded with: Hello FOG Client
04/11/2014 13:21 FOG::MODDebug Module has finished work and will now exit.
04/11/2014 13:25 FOG::ClientUpdater Attempting to connect to fog server…
04/11/2014 13:25 FOG::ClientUpdater Module is active…
04/11/2014 13:25 FOG::ClientUpdater Checking Status : AutoLogOut.dll
04/11/2014 13:25 FOG::ClientUpdater Zero byte response returned
04/11/2014 13:25 FOG::ClientUpdater Checking Status : ClientUpdater.dll
04/11/2014 13:25 FOG::ClientUpdater Zero byte response returned
04/11/2014 13:25 FOG::ClientUpdater Checking Status : DirCleaner.dll
04/11/2014 13:25 FOG::ClientUpdater Zero byte response returned
04/11/2014 13:25 FOG::ClientUpdater Checking Status : DisplayManager.dll
04/11/2014 13:25 FOG::ClientUpdater Zero byte response returned
04/11/2014 13:25 FOG::ClientUpdater Checking Status : GreenFog.dll
04/11/2014 13:25 FOG::ClientUpdater Zero byte response returned
04/11/2014 13:25 FOG::ClientUpdater Checking Status : GUIWatcher.dll
04/11/2014 13:25 FOG::ClientUpdater Zero byte response returned
04/11/2014 13:25 FOG::ClientUpdater Checking Status : HostRegister.dll
04/11/2014 13:25 FOG::ClientUpdater Zero byte response returned
04/11/2014 13:25 FOG::ClientUpdater Checking Status : MODDebug.dll
04/11/2014 13:25 FOG::ClientUpdater Zero byte response returned
04/11/2014 13:25 FOG::ClientUpdater Checking Status : PrinterManager.dll
04/11/2014 13:25 FOG::ClientUpdater Zero byte response returned
04/11/2014 13:25 FOG::ClientUpdater Checking Status : SnapinClient.dll
04/11/2014 13:25 FOG::ClientUpdater Zero byte response returned
04/11/2014 13:25 FOG::ClientUpdater Checking Status : TaskReboot.dll
04/11/2014 13:25 FOG::ClientUpdater Zero byte response returned
04/11/2014 13:25 FOG::ClientUpdater Checking Status : UserCleanup.dll
04/11/2014 13:25 FOG::ClientUpdater Zero byte response returned
04/11/2014 13:25 FOG::ClientUpdater Checking Status : UserTracker.dll
04/11/2014 13:25 FOG::ClientUpdater Zero byte response returned
04/11/2014 13:25 FOG::ClientUpdater Checking Status : config.ini
04/11/2014 13:25 FOG::ClientUpdater Zero byte response returned
04/11/2014 13:25 FOG::ClientUpdater Zero byte response returned
04/11/2014 13:25 FOG::ClientUpdater 0 new modules found!
04/11/2014 13:25 FOG::ClientUpdater Client update will be applied during next service startup.
04/11/2014 13:25 FOG::ClientUpdater Client update process complete, exiting…
04/11/2014 13:26 FOG::SnapinClient Attempting to connect to fog server…
04/11/2014 13:26 FOG::SnapinClient General Error Returned:
04/11/2014 13:26 FOG::SnapinClient #!er:No Host Found
[/CODE]
I don’t get what is wrong. Some additional notes:
[LIST]
[*]Other rooms work fine
[*]Host can ping fog through DNS and IP
[*]Hosts are all registered on FOG
[*]Boot menu shows it is registered
[*]Host is joined to AD (this is how I log onto the PCs)
[*]Hosts have successfully had snapin tasks applied before
[*]Out of ~500 PCs, these ~100 machines that don’t deploy their snapin all happen to have SSDs. The task I was trying to deploy to them was to extend their hard drive - manually, this same snapin works fine on the same PCs. And half of one of the rooms did seem to deploy fine. Just, bizarrely, about 100 haven’t and they all are in the SSD labs.
[/LIST]
I can well believe the cause isn’t anything complicated, but I would really appreciate any pointers anyone can give. Thanks!
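For anyone wanting to compare machines, here is a minimal Python sketch that tallies the "No Host Found" replies per module from a fog.log in the format shown above. It also splits the MAC POST String, on the assumption that the 00:50:56 prefix (a VMware-assigned OUI) marks the VMware Workstation adapters. Whether those extra MACs have anything to do with the error is not established here; the function names are made up for illustration.

```python
import re
from collections import Counter

# Matches lines such as:
#   04/11/2014 13:19 FOG::TaskReboot #!er:No Host Found
LINE_RE = re.compile(
    r"^\s*(?P<date>\d{2}/\d{2}/\d{4}) (?P<time>\d{2}:\d{2}) "
    r"FOG::(?P<module>\w+) (?P<message>.*)$"
)

VMWARE_OUI = "00:50:56"  # VMware-assigned MAC prefix


def count_no_host_found(lines):
    """Return a Counter of module name -> number of 'No Host Found' replies."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.match(line)
        if match and "#!er:No Host Found" in match.group("message"):
            counts[match.group("module")] += 1
    return counts


def split_mac_post(post_string):
    """Split a 'MAC POST String' into (non-VMware MACs, VMware MACs)."""
    macs = [m for m in post_string.split("|") if m]
    virtual = [m for m in macs if m.upper().startswith(VMWARE_OUI)]
    physical = [m for m in macs if not m.upper().startswith(VMWARE_OUI)]
    return physical, virtual


# Sample lines copied from the log above.
SAMPLE = """\
04/11/2014 13:19 FOG::PrinterManager General Error Returned:
04/11/2014 13:19 FOG::PrinterManager #!er:No Host Found
04/11/2014 13:19 FOG::TaskReboot #!er:No Host Found
04/11/2014 13:26 FOG::SnapinClient General Error Returned:
04/11/2014 13:26 FOG::SnapinClient #!er:No Host Found
""".splitlines()

for module, hits in sorted(count_no_host_found(SAMPLE).items()):
    print(f"{module}: {hits}")

print(split_mac_post("00:50:56:C0:00:01|00:50:56:C0:00:08|54:BE:F7:37:B2:E1"))
```

Running the same tally on a working machine and a failing one would at least show whether every module gets the error or only some.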
-
Multicast in 1.2 - vlan and subnet differences and partial completion
So there’s recently been an upgrade to all of the switches where I work and many computers seem to be in incorrect subnets and vlans now. Which makes very little difference in practice except when imaging. Which works. Just slowly.
Or, in the case of FOG 1.2, I suspect it causes some tasks to either fail outright or only partially complete. I set a load of tasks going last night to test this and the results are quite bizarre: some rooms have fully failed, yet some hosts in them didn’t fail. The multicast log shows that some even finished first - how is this even possible with multicast?! The PC check-in numbers that get retransmissions usually seem to belong to hosts in a different subnet from the previous address in the list, but is that assumption actually correct?
The point is, I’m interested in what is actually happening, so I can look in the right areas before asking the central ICT department for help (I need to be able to ask them to change specific ports to a vlan etc). Does each subnet correspond to a separate broadcast, essentially allowing hosts to potentially finish the tasks before other hosts in another subnet have finished?
Also, it seems lots of hosts in queues end up sticking at waiting for slots, even when left overnight. Weird one really. I’d set about 90 going on unicast, stacked up in a queue for 10, and now most are still in a queue waiting for nothing!
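One way to check the subnet theory before going to ICT is to group the client addresses from the multicast log by subnet and see whether the failures line up with the groups. A minimal sketch, assuming /24 subnets (the real prefix length needs to come from the network team) and using made-up addresses:

```python
import ipaddress
from collections import defaultdict


def group_by_subnet(addresses, prefix_len=24):
    """Group IPv4 address strings by the subnet they fall in.

    prefix_len is an assumption; use whatever mask the ports really have.
    """
    groups = defaultdict(list)
    for addr in addresses:
        # strict=False lets us pass a host address and get its network back.
        network = ipaddress.ip_network(f"{addr}/{prefix_len}", strict=False)
        groups[str(network)].append(addr)
    return dict(groups)


# Hypothetical client addresses, for illustration only.
clients = ["10.0.1.5", "10.0.1.9", "10.0.2.7"]

for subnet, members in group_by_subnet(clients).items():
    print(subnet, members)
```

If the hosts that fail or lag are consistently the ones outside the main group, that gives a concrete list of ports to ask about. Note that multicast only reaches another routed subnet if the routers in between forward it, so this is a starting point for the conversation rather than a diagnosis.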
-
RE: Slow Deploy Speed.
Just as a note; if you are using multicast for a 10/100 set of hosts from a gigabit server, I reckon you will get slow speeds as a result of a whole load of re-transmits. If you were to force the server to run at 10/100, I reckon you will get the same performance as you would for unicast.
-
Multicast timeout?
I use FOG 1.2 now and am just going over some of our labs that we use FOG in. For many of them, we seem to be able to multicast fine but for a couple we can’t.
All hosts will sit at the blue screen waiting and nothing happens.
I checked the multicast log and apparently all 31 hosts have timeout notAnswered and notReady and eventually are just all dropped.
[CODE]Udp-sender 20120424
Using mcast address 234.1.63.202
UDP sender for (stdin) at 10.1.63.202 on eth0
Broadcasting control to 224.0.0.1
New connection from 10.9.110.68 (#0) 00000009
New connection from 10.9.110.172 (#1) 00000009
Starting transfer: 00000009
Timeout notAnswered=[0,1] notReady=[0,1] nrAns=0 nrRead=0 nrPart=2 avg=10000
Timeout notAnswered=[0,1] notReady=[0,1] nrAns=0 nrRead=0 nrPart=2 avg=10000
Timeout notAnswered=[0,1] notReady=[0,1] nrAns=0 nrRead=0 nrPart=2 avg=10000
Timeout notAnswered=[0,1] notReady=[0,1] nrAns=0 nrRead=0 nrPart=2 avg=10000
…
[/CODE]
I’m not really sure where to start but what happened to /opt/fog/service/etc/config.php? I was going to try and increase UDPSENDER_MAXWAIT - but this doesn’t even exist in there. Is there any way to increase the timeout at all somewhere?
Cheers
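For picking through a longer multicast log, a minimal Python sketch along these lines can map the receiver numbers in the Timeout lines back to the IPs from the "New connection" lines. It assumes the log format is exactly as in the excerpt above; the function names are hypothetical.

```python
import re

CONN_RE = re.compile(r"New connection from (?P<ip>\S+) \(#(?P<id>\d+)\)")
TIMEOUT_RE = re.compile(
    r"Timeout notAnswered=\[(?P<not_answered>[\d,]*)\] "
    r"notReady=\[(?P<not_ready>[\d,]*)\] "
    r"nrAns=(?P<nr_ans>\d+) nrRead=(?P<nr_read>\d+) "
    r"nrPart=(?P<nr_part>\d+) avg=(?P<avg>\d+)"
)


def _ids(text):
    """Turn '0,1,2' into [0, 1, 2]; an empty string gives an empty list."""
    return [int(part) for part in text.split(",") if part]


def parse_timeout(line):
    """Return the fields of a udp-sender Timeout line, or None if no match."""
    match = TIMEOUT_RE.search(line)
    if match is None:
        return None
    return {
        "not_answered": _ids(match.group("not_answered")),
        "not_ready": _ids(match.group("not_ready")),
        "nr_part": int(match.group("nr_part")),
    }


def silent_receivers(lines):
    """IPs of receivers listed as notAnswered in every Timeout line seen."""
    receivers = {}
    silent = None
    for line in lines:
        conn = CONN_RE.search(line)
        if conn:
            receivers[int(conn.group("id"))] = conn.group("ip")
            continue
        timeout = parse_timeout(line)
        if timeout:
            ids = set(timeout["not_answered"])
            silent = ids if silent is None else silent & ids
    return [receivers[i] for i in sorted(silent or ()) if i in receivers]


# Lines copied from the log excerpt above.
LOG = [
    "New connection from 10.9.110.68 (#0) 00000009",
    "New connection from 10.9.110.172 (#1) 00000009",
    "Starting transfer: 00000009",
    "Timeout notAnswered=[0,1] notReady=[0,1] nrAns=0 nrRead=0 nrPart=2 avg=10000",
    "Timeout notAnswered=[0,1] notReady=[0,1] nrAns=0 nrRead=0 nrPart=2 avg=10000",
]

print(silent_receivers(LOG))
```

In the excerpt every participant is silent, which suggests the clients can reach the server to register but are not answering once the transfer itself starts. That is only a reading of the log, not a confirmed cause.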
-
RE: Network/Fog issue - Machines won't boot from fog
Right! It works!
A few days ago I had tried defining the next-server and the bootfile in the dhcp pool, which didn’t work.
Portfast also hasn’t worked.
I have now tried portfast again AND defining the next-server and bootfile… and it works!
I wouldn’t have thought I would need to define these, as I already use the fog server as the helper address… but perhaps the switch - acting as a DHCP server itself - just ignored it? I really have no idea. I am still not 100% happy, but it works and works well…
Thanks for the portfast tip!
-
RE: Network/Fog issue - Machines won't boot from fog
Just enabled Portfast and nothing - although thanks for reminding me about that, it probably should be enabled on all host ports!
As for Windows Firewall - it wouldn’t be that, as the PC does it fine when connected to another switch.
I just can’t work out what this could be. The error I get is usually “No boot filename received” on one hardware set and on another it’s “No DHCP or proxyDHCP offers received”. Different errors but identical room setups (different NICs though).
-
Network/Fog issue - Machines won't boot from fog
We have used FOG for a while where I work (a University) and it has worked well so far. However, we have run into a weird issue where machines on a new network setup just won’t fog boot.
We have 3 Cisco 3750 switches stacked/fibred that serve 3 vlans (3 rooms). Our linux box, acting as a router, connects each of the vlans through separate interfaces, each assigned only to the vlan they are serving. The router then connects to the internet/rest of the university network.
On the same physical switch as the router (but not the rooms), is connected our fog server. This switch is part of the University network which we can’t configure, however this is highly unlikely to be anything to do with the issue we have.
The switch configurations are very basic. I configured three DHCP pools (Network, DNS, default gateway) that serve three subnets. Each of the vlans that serve the three rooms obtain their IP from the respective DHCP pools, relative to their subnets. They can all then boot into Windows, connect to the internet through the router just fine and even ping the FOG server (and connect to the web interface, SFTP, etc).
However, the machines still do not PXE boot, receiving the message that no DHCP or proxyDHCP offers were received. From within Windows, running “tftp <fog IP> get pxelinux.0” times out.
Go across the corridor to another room and they can PXE boot absolutely fine into fog and get the file from within windows fine. The machines are of the same hardware spec, so PXE isn’t the issue (they even imaged from the previous infrastructure).
What on earth could the problem be?
Some additional points, relating to the potential issues:
[LIST]
[*]In each of the vlan interfaces, the “ip helper-address <fog server>” is configured.
[*]The vlans are configured correctly, hosts can all ping fog, get internet connectivity etc.
[*]Running wireshark on the router interface connected to the switch serving the hosts shows that tftp requests are being sent out from hosts and replies from fog are also being sent back, although it appears no acknowledgement is being sent back for each of the blocks.
[*]Running wireshark on hosts shows that the tftp replies are being received and, as before, are not being acknowledged. Acknowledgements are sent back for hosts where fog is “working”.
[*]The switches have no access control lists or any filtering set.
[*]The switches do [B]not[/B] have options 66/67 set (for next-server and filename).
[*]The machines did all, at one time or other, previously boot into fog fine when on a previous network infrastructure, which indicates that the new setup has some issue…
[*]…however, it looks like a host issue if no tftp acknowledgements are being sent back, even though the hosts themselves are known to be fine.
[*]dnsmasq.d does serve the subnets to which these hosts belong (and every IP possibility in-between)
[/LIST]
I really have no idea what the issue could be. In many senses, it is a very simple setup - even if it was just one room, one switch, a router out to fog, the setup would likely be the same. The same issue is present on subnets in the rest of the University where the ip helper hasn’t been set on the core switches, but on our switches it definitely is.
Any suggestions or pointers would be a seriously big help!
Thanks!
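Since the captures above hinge on whether each TFTP data block gets acknowledged, here is a minimal Python sketch of the three packets involved, following RFC 1350: the read request the host sends, the DATA block the server returns, and the ACK that is missing. The `fetch_first_block` helper is only an illustration for testing from a host on the affected vlan; the server address is whatever your fog server is, and pxelinux.0 is the filename from the tftp test above.

```python
import socket
import struct

OP_RRQ, OP_DATA, OP_ACK = 1, 3, 4  # opcodes from RFC 1350


def build_rrq(filename, mode="octet"):
    """Read request: opcode 1, filename, NUL, transfer mode, NUL."""
    return (
        struct.pack("!H", OP_RRQ)
        + filename.encode("ascii") + b"\x00"
        + mode.encode("ascii") + b"\x00"
    )


def build_ack(block):
    """Acknowledgement: opcode 4 followed by the 16-bit block number."""
    return struct.pack("!HH", OP_ACK, block)


def parse_data(packet):
    """Return (block number, payload) from a DATA packet."""
    if len(packet) < 4:
        raise ValueError("packet too short to be TFTP DATA")
    opcode, block = struct.unpack("!HH", packet[:4])
    if opcode != OP_DATA:
        raise ValueError(f"expected DATA (3), got opcode {opcode}")
    return block, packet[4:]


def fetch_first_block(server, filename="pxelinux.0", timeout=5.0):
    """Request a file, read the first DATA block and acknowledge it.

    Raises socket.timeout if no reply arrives, which is the symptom
    described above.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        sock.sendto(build_rrq(filename), (server, 69))
        # The DATA packet comes from a new server port (the transfer ID),
        # not from port 69, so the ACK must go back to that port.
        packet, sender = sock.recvfrom(4 + 512)
        block, payload = parse_data(packet)
        sock.sendto(build_ack(block), sender)
        return block, len(payload), sender


# Example (not run here): fetch_first_block("<fog IP>")
```

Running this on a working room and a failing room while wireshark is capturing would show whether the first DATA block ever reaches the application on the failing hosts.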
-
RE: Computers are not renaming themselves and joining the domain after imaging.
Had similar problems, although I had thought that this step was done before the machines loaded their image. Are you using active directory at all?
One issue we had was network card drivers in one particular room. Windows 7, generalize’d and OOBE’d, didn’t like the Intel NIC that was in those particular machines. Added the drivers to the base image and it was fine - computers then all joined and renamed fine.
Can you actually login to Windows?
-
RE: Multicast speed issues
Lethal - Multicasting should be as fast as Unicast, yes, but if FOG is sending packets via UDP at 1Gbps, the switch would surely drop a lot of the incoming packets because it can’t send them out of the 100Mbps ports fast enough (and thus lots of retransmissions)? Even with some buffering, what’s coming in will be ~10 times faster than what’s being sent out, and thus it would be SLOWER, due to the constant retransmissions and dropped packets.
For this reason also, I had an idea that a 10 host multicast was the most efficient thing to do, and so far it’s seemed about right - a single host takes 1.5 hours, but 10 take just under an hour to do. My guess was: fewer retransmissions because of the slower rate to each host.
If one PC was slowing the rest down, that would show in the log (have had this issue on mixed speed rooms). But it seems like they’re all hiccuping from the log
As for the switch things… it’s tricky because we have no way to access the switches. I work in the academic computing department at a University - normally, all departments are managed via a central ICT department, but we have moved away from their systems and recently decided to use FOG. They keep all their switches pretty much under lock and key, so even physical access is limited. However, it seems that the particular switch (note, there are many switches in a stack) has no other hosts on it as far as I can tell. There might be a phone or a printer somewhere but I have no way of knowing. I’ll make a note to ask - face-to-face communication is difficult as they have to manage the infrastructure for the entire University, which involves multiple campuses. So there is a growing list of things to ask!
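To put rough numbers on the 1Gbps-in, 100Mbps-out argument, here is a small Python sketch. It assumes the sender really does transmit at line rate (udp-sender has its own flow control, so this is a worst case), and the 512 KiB port buffer is a made-up figure, not a spec for any particular switch.

```python
def drop_fraction(ingress_mbps, egress_mbps):
    """Share of traffic that cannot be forwarded once the buffer is full."""
    if ingress_mbps <= egress_mbps:
        return 0.0
    return 1.0 - egress_mbps / ingress_mbps


def buffer_fill_ms(buffer_bytes, ingress_mbps, egress_mbps):
    """Milliseconds until a buffer fills at the given in/out rates."""
    surplus_bps = (ingress_mbps - egress_mbps) * 1_000_000
    if surplus_bps <= 0:
        return float("inf")  # the port keeps up, so the buffer never fills
    return buffer_bytes * 8 / surplus_bps * 1000


HYPOTHETICAL_BUFFER = 512 * 1024  # bytes, assumed for illustration

print(f"dropped once full: {drop_fraction(1000, 100):.0%}")
print(f"buffer full after: {buffer_fill_ms(HYPOTHETICAL_BUFFER, 1000, 100):.2f} ms")
```

Under those assumptions the buffer only absorbs a few milliseconds of burst, after which roughly nine packets in ten would have nowhere to go.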
-
Multicast speed issues
Problem: for a 100Mbps room, Unicast speeds show as about 780Mib/min, yet on multicast they drop to under 100Mib/min
A little snippet of the log, for imaging 30 machines
[QUOTE]
Timeout notAnswered=[0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29] notReady=[0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29] nrAns=0 nrRead=0 nrPart=30 avg=99
bytes= 8 164 796 640 re-xmits=0000133 ( 0.0%) slice=0058 73 709 551 615 [/QUOTE]
Seems to be a common issue; some people live with it or switch to unicast only, but for the computers here we would really like to use multicast, so I want to get to the heart of the issue.
Things to note:
- The FOG server is on a 1Gbps link. Many of the machines here are connected to a switch with 100Mbps links.
- Physical - and remote - access to the switches is somewhat limited, because of the way the organisation works here. Yet we have ~300 - 400 PCs to manage.
My thinking is that because the switch ports are only 100Mbps out, the buffer for each 100Mbps port could be overflowing with the 1Gbps traffic coming in. UDP just gets thrown out there and the poor 100Mbps clients drop most of the stuff being thrown at them. As a result, FOG has to retransmit loads of data and thus takes more time with more clients, I would imagine (whereas, I guess, with more unicast clients it would take proportionally less time, because the additional load from multiple machines slows down the rate to each client, meaning fewer dropped packets).
I could be wrong there, but the reason I’m posting under FOG issues is that I was wondering if there is a way to, let’s say, “cap” how many packets are sent out at a time, or adjust the rate in FOG (i.e. slow things down by adjusting the transmission rate of UDP packets from within FOG). Another solution would be to adjust the NIC rate, I guess, but anything that needs to be done to the switches could be problematic, as it means talking to the people who manage our network equipment.
Any thoughts or comments about the above are most appreciated. And any help, of course!
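On the “cap” idea: udp-sender itself has a `--max-bitrate` option, so one approach is to run the sender with a limit a little under the slowest link. The Python sketch below converts the MiB/min figures above into Mbit/s for comparison and builds such a command line. How FOG 1.2 passes options to udp-sender is not covered here, and the image path, interface and 80 Mbit/s figure are placeholders, not recommendations.

```python
MIB = 1024 * 1024


def mib_per_min_to_mbps(rate_mib_per_min):
    """Convert MiB per minute to megabits per second."""
    return rate_mib_per_min * MIB * 8 / 60 / 1_000_000


def build_udp_sender_args(image_path, interface, max_mbps):
    """Build an argv list for udp-sender with a bitrate cap.

    --file, --interface, --max-bitrate and --nokbd are udp-sender options;
    the values passed in are up to the caller.
    """
    return [
        "udp-sender",
        "--file", image_path,
        "--interface", interface,
        "--max-bitrate", f"{max_mbps}m",
        "--nokbd",
    ]


print(f"unicast:   {mib_per_min_to_mbps(780):.1f} Mbit/s")
print(f"multicast: {mib_per_min_to_mbps(100):.1f} Mbit/s")

# Hypothetical image path and cap, for illustration only.
print(" ".join(build_udp_sender_args("/images/lab.img", "eth0", 80)))
```

The unicast figure works out slightly above 100 Mbit/s, which presumably reflects the imaging tool reporting uncompressed data, so treat the conversion as a rough comparison only.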