Latest FOG 0.33b
-
No, it only needs to uncompress once. The slowdown comes from making sure all systems stay at the same level.
-
One multicast task with 30 clients -> one thread/slot -> one uncompress process
Two different multicast tasks, one with 30 clients and the other with 15 clients -> two threads/slots -> two uncompress processes
Three different multicast tasks, with 30, 15 and 18 clients -> three threads/slots -> three uncompress processes
…Each multicast task has one uncompress process, no? And the gunzip process is heavier than the udp-sender process, and will overload the CPU.
-
So why, if I have no task scheduled or running,
do I have these processes? I tried killing them and restarting multicast.
root@fog:/opt/fog/log# ps -ef|grep udp-sender
root 567 24601 0 16:10 ? 00:00:00 sh -c exec gunzip -c "/images//labinfociro/d1p1.img"|/usr/local/sbin/udp-sender --min-receivers 27 --portbase 27198 --interface eth0 --half-duplex --ttl 32 --nokbd;gunzip -c "/images//labinfociro/d1p2.img"|/usr/local/sbin/udp-sender --min-receivers 27 --portbase 27198 --interface eth0 --half-duplex --ttl 32 --nokbd;gunzip -c "/images//labinfociro/d1p3.img"|/usr/local/sbin/udp-sender --min-receivers 27 --portbase 27198 --interface eth0 --half-duplex --ttl 32 --nokbd;
root 570 567 0 16:10 ? 00:00:02 /usr/local/sbin/udp-sender --min-receivers 27 --portbase 27198 --interface eth0 --half-duplex --ttl 32 --nokbd
root 10972 29116 0 18:15 pts/1 00:00:00 grep --color=auto udp-sender
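To see at a glance what a leftover sender is waiting for, the options can be pulled out of the ps output. This is only a rough sketch: `sender_options` is a made-up helper, not part of FOG, and it assumes the command line looks like the ones pasted above.

```python
import re

# Hypothetical helper, not part of FOG: extract the udp-sender options from a
# `ps -ef` line so a stale session can be recognised quickly.
def sender_options(ps_line):
    """Return --min-receivers and --portbase found in ps_line, or None."""
    # Crude filter: skip lines without udp-sender and the grep process itself.
    if "udp-sender" not in ps_line or "grep" in ps_line:
        return None
    receivers = re.search(r"--min-receivers (\d+)", ps_line)
    portbase = re.search(r"--portbase (\d+)", ps_line)
    if not (receivers and portbase):
        return None
    return {"min_receivers": int(receivers.group(1)),
            "portbase": int(portbase.group(1))}

line = ("root 570 567 0 16:10 ? 00:00:02 /usr/local/sbin/udp-sender "
        "--min-receivers 27 --portbase 27198 --interface eth0 "
        "--half-duplex --ttl 32 --nokbd")
print(sender_options(line))  # {'min_receivers': 27, 'portbase': 27198}
```

Here the sender is still waiting for 27 receivers on portbase 27198, even though no task is scheduled.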
-
I tried out multicast:
clicked on group -> basic task -> multicast.
It is a group of 5 PCs.
These PCs are woken up by WOL, but too fast, so some PCs start the process while others bypass it and boot from the local disk.
So I reset them by hand, powering off and then on, and then all of them start the multicast process.
The problem is that all PCs stay on the empty gray partclone screen.
There is also a bug: even though the scheduled group has 5 members, for some reason it expects 29 connections before starting.
As a note, my PCs are members of 3 groups. I think the check of how many PCs are scheduled should look at how many PCs are in the group that I’ve scheduled, without counting other group memberships…
My situation:
total # of PCs in MySQL: 27
PCs in first group: 25
PCs in second group: 3
PCs in third group: 5
on the server:
root@fog:/opt/fog/log# ps -ef|grep fog
avahi 507 1 0 10:03 ? 00:00:02 avahi-daemon: running [fog.local]
root 12747 1 0 18:30 ? 00:00:00 /usr/bin/php -q /opt/fog/service/FOGTaskScheduler/FOGTaskScheduler
root 12781 1 0 18:30 ? 00:00:02 /usr/bin/php -q /opt/fog/service/FOGMulticastManager/FOGMulticastManager
root 12816 1 0 18:30 ? 00:00:00 /usr/bin/php -q /opt/fog/service/FOGImageReplicator/FOGImageReplicator
root 13467 29116 4 18:36 pts/1 00:00:00 grep --color=auto fog
multicast.log:
[01-31-14 6:36:16 pm] * [01-31-14 6:36:16 pm] I am the group manager.
[01-31-14 6:36:27 pm] * [01-31-14 6:36:27 pm] Checking if I am the group manager.
[01-31-14 6:36:27 pm] * [01-31-14 6:36:27 pm] I am the group manager.
[01-31-14 6:36:38 pm] * [01-31-14 6:36:38 pm] Checking if I am the group manager.
[01-31-14 6:36:38 pm] * [01-31-14 6:36:38 pm] I am the group manager.
[01-31-14 6:36:49 pm] * [01-31-14 6:36:49 pm] Checking if I am the group manager.
[01-31-14 6:36:49 pm] * [01-31-14 6:36:49 pm] I am the group manager.
[01-31-14 6:37:00 pm] * [01-31-14 6:37:00 pm] Checking if I am the group manager.
[01-31-14 6:37:00 pm] * [01-31-14 6:37:00 pm] I am the group manager.
multicast.log.udpcast.50:
Udp-sender 20120424
Using mcast address 232.168.0.3
UDP sender for (stdin) at 192.168.0.3 on eth0
Broadcasting control to 224.0.0.1
New connection from 192.168.0.133 (#0) 00000009
New connection from 192.168.0.113 (#1) 00000009
New connection from 192.168.0.141 (#2) 00000009
root@fog:/opt/fog/log# ps -ef|grep udp
root 13001 12781 0 18:31 ? 00:00:00 sh -c exec gunzip -c "/images//labinfociro/d1p1.img"|/usr/local/sbin/udp-sender --min-receivers 29 --portbase 27198 --interface eth0 --half-duplex --ttl 32 --nokbd;gunzip -c "/images//labinfociro/d1p2.img"|/usr/local/sbin/udp-sender --min-receivers 29 --portbase 27198 --interface eth0 --half-duplex --ttl 32 --nokbd;gunzip -c "/images//labinfociro/d1p3.img"|/usr/local/sbin/udp-sender --min-receivers 29 --portbase 27198 --interface eth0 --half-duplex --ttl 32 --nokbd;
root 13003 13001 0 18:31 ? 00:00:00 /usr/local/sbin/udp-sender --min-receivers 29 --portbase 27198 --interface eth0 --half-duplex --ttl 32 --nokbd
-
[quote=“fabritrento, post: 22341, member: 21607”]
These PCs are woken up by WOL, but too fast, so some PCs start the process while others bypass it and boot from the local disk.
[/quote]
Check the BIOS setup. On HP machines you must set WOL to boot from Remote Server and not from the local machine.
-
Doing some testing on XP, single partition resizable, with init.gz from different revisions:
1110 will deploy an image created by 1110
1110 will deploy an image created by 1170
1170 will not deploy an image created by 1170
1170 will not deploy an image created by 1110
1142 will not deploy an image created by 1110
1115 will deploy an image created by 1110
1115 will deploy an image created by 1170
1132 will deploy an image created by 1170
1137 will deploy an image created by 1170
1139 will deploy an image created by 1170
So it looks like image upload is still OK at 1142, however deploy is broken from then on.
Hope this helps.
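For anyone wanting to narrow it down further, the results above can be boiled down to the last working and first broken init.gz revision. A small sketch, assuming a single breaking change somewhere in between (`breaking_range` is just an illustrative name):

```python
# Deploy results from the list above: init.gz revision -> did deploy work?
# (1110, 1115 and 1170 were each tested with two images and behaved the same both times.)
deploy_ok = {1110: True, 1115: True, 1132: True, 1137: True, 1139: True,
             1142: False, 1170: False}

def breaking_range(results):
    """Return (last_good, first_bad), assuming a single breaking change."""
    good = [rev for rev, ok in results.items() if ok]
    bad = [rev for rev, ok in results.items() if not ok]
    last_good, first_bad = max(good), min(bad)
    if last_good >= first_bad:
        raise ValueError("results do not fit a single breaking change")
    return last_good, first_bad

print(breaking_range(deploy_ok))  # (1139, 1142)
```

So, going by these tests only, the deploy breakage was introduced somewhere after 1139 and no later than 1142.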
-
I expect that 1141 and earlier should deploy SDR images with little to no problem. I suspect that something in the download is screwed up, and I will focus as much of my time as possible on that script this weekend.
I have to go get surgery on Monday, so if I’m not fully responsive then you all know why.
-
r1171 released.
Progress bar now matches the Active tasks table.
The Active Tasks table now includes “created by”, so we know who’s doing what and when.
Minor elements added for the location patch by Lee Rowlett. (Not working yet but in progress.)
Adjusts some elements of the fog.download script to maybe get Windows XP working? (PLEASE PLEASE PLEASE)
Fix for the “No host found for …” as I was finally able to replicate the issue.
-
Just tried a quick test of XP, single partition resizable, on 1171, using an image created previously.
It worked.
Many thanks for all your good work, hope all goes OK Monday.
-
Finally making progress. Thanks for the testing and faith that I’d get it.
-
[quote=“Fernando Gietz, post: 22339, member: 13”]One multicast task with 30 clients -> one thread/slot -> one uncompress process
Two different multicast tasks, one with 30 clients and the other with 15 clients -> two threads/slots -> two uncompress processes
Three different multicast tasks, with 30, 15 and 18 clients -> three threads/slots -> three uncompress processes
…Each multicast task has one uncompress process, no? And the gunzip process is heavier than the udp-sender process, and will overload the CPU.[/quote]
While you’re right about this, and maybe I’m overthinking it, let’s say you have 3 multicast sessions running: session one with 30 clients, session two with 15 clients, and session three with 18 clients.
If all of them, for some reason, start getting their data at (more or less) the same time and each client decompresses the image file itself, we’d actually be doing more work to accomplish the same result. What I mean is that this opens 63 gunzip tasks (individual as they may be). And while that load is on the individual hosts, it’s way more work overall. Some systems may decompress faster than others, causing delays and possibly timeouts on the udp session.
While you’re right that it could become CPU intensive on the server, it would ultimately take much longer if each of the clients performed its own decompression. On the server we’re only performing three gunzip tasks versus 63.
Client-side decompression isn’t necessarily a bad approach, as it keeps resources on the server available so other imaging/snapin (or what have you) tasks perform better. However, each of these techniques has its pros and cons.
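To put numbers on the comparison, here is the arithmetic for the three example sessions. The variable names are only for illustration:

```python
# Client counts for the three example sessions discussed above.
sessions = [30, 15, 18]

# Server-side decompression: one gunzip per multicast session.
server_side_gunzips = len(sessions)

# Client-side decompression: every client runs its own gunzip.
client_side_gunzips = sum(sessions)

print(server_side_gunzips, client_side_gunzips)  # 3 63
```

Either way the same image data is decompressed; the difference is whether it happens three times in one place or 63 times spread across the clients.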
-
[quote=“fabritrento, post: 22341, member: 21607”]I tried out multicast:
clicked on group -> basic task -> multicast.
It is a group of 5 PCs.
These PCs are woken up by WOL, but too fast, so some PCs start the process while others bypass it and boot from the local disk.
So I reset them by hand, powering off and then on, and then all of them start the multicast process.
The problem is that all PCs stay on the empty gray partclone screen.
There is also a bug: even though the scheduled group has 5 members, for some reason it expects 29 connections before starting.
As a note, my PCs are members of 3 groups. I think the check of how many PCs are scheduled should look at how many PCs are in the group that I’ve scheduled, without counting other group memberships…
My situation:
total # of PCs in MySQL: 27
PCs in first group: 25
PCs in second group: 3
PCs in third group: 5
on the server:
root@fog:/opt/fog/log# ps -ef|grep fog
avahi 507 1 0 10:03 ? 00:00:02 avahi-daemon: running [fog.local]
root 12747 1 0 18:30 ? 00:00:00 /usr/bin/php -q /opt/fog/service/FOGTaskScheduler/FOGTaskScheduler
root 12781 1 0 18:30 ? 00:00:02 /usr/bin/php -q /opt/fog/service/FOGMulticastManager/FOGMulticastManager
root 12816 1 0 18:30 ? 00:00:00 /usr/bin/php -q /opt/fog/service/FOGImageReplicator/FOGImageReplicator
root 13467 29116 4 18:36 pts/1 00:00:00 grep --color=auto fog
multicast.log:
[01-31-14 6:36:16 pm] * [01-31-14 6:36:16 pm] I am the group manager.
[01-31-14 6:36:27 pm] * [01-31-14 6:36:27 pm] Checking if I am the group manager.
[01-31-14 6:36:27 pm] * [01-31-14 6:36:27 pm] I am the group manager.
[01-31-14 6:36:38 pm] * [01-31-14 6:36:38 pm] Checking if I am the group manager.
[01-31-14 6:36:38 pm] * [01-31-14 6:36:38 pm] I am the group manager.
[01-31-14 6:36:49 pm] * [01-31-14 6:36:49 pm] Checking if I am the group manager.
[01-31-14 6:36:49 pm] * [01-31-14 6:36:49 pm] I am the group manager.
[01-31-14 6:37:00 pm] * [01-31-14 6:37:00 pm] Checking if I am the group manager.
[01-31-14 6:37:00 pm] * [01-31-14 6:37:00 pm] I am the group manager.
multicast.log.udpcast.50:
Udp-sender 20120424
Using mcast address 232.168.0.3
UDP sender for (stdin) at 192.168.0.3 on eth0
Broadcasting control to 224.0.0.1
New connection from 192.168.0.133 (#0) 00000009
New connection from 192.168.0.113 (#1) 00000009
New connection from 192.168.0.141 (#2) 00000009
root@fog:/opt/fog/log# ps -ef|grep udp
root 13001 12781 0 18:31 ? 00:00:00 sh -c exec gunzip -c "/images//labinfociro/d1p1.img"|/usr/local/sbin/udp-sender --min-receivers 29 --portbase 27198 --interface eth0 --half-duplex --ttl 32 --nokbd;gunzip -c "/images//labinfociro/d1p2.img"|/usr/local/sbin/udp-sender --min-receivers 29 --portbase 27198 --interface eth0 --half-duplex --ttl 32 --nokbd;gunzip -c "/images//labinfociro/d1p3.img"|/usr/local/sbin/udp-sender --min-receivers 29 --portbase 27198 --interface eth0 --half-duplex --ttl 32 --nokbd;
root 13003 13001 0 18:31 ? 00:00:00 /usr/local/sbin/udp-sender --min-receivers 29 --portbase 27198 --interface eth0 --half-duplex --ttl 32 --nokbd[/quote]
My guess here is that what you’re seeing is something I was actually trying to accomplish in the interim. That being said, here is my guess for how your systems are set up:
Group 1 (with 25 clients) uses image name labinfociro.
Group 2 (with 3 clients) uses image name labinfociro.
Group 3 (with 5 clients) uses image name labinfociro.
Does this sound correct?
My methodology (while maybe incorrect at this point) was to use the image name as the session generating factor.
My thought on this is:
If the client, not initially in the group tasking, has the same image name as a currently running session, regenerate the cmd (which I haven’t figured out how to do yet) to add the new client to the same multicast group. This way, it’s less taxing on the server than opening multiple threads (at this point) of gunzip and udp-senders, as multicast can wreak havoc on a network.
On the host page, the three deploy icons (Upload: the up arrow, Unicast Download: the down arrow, Multicast Download: the four arrows) perform different functions (as described).
My guess as to why you saw 27, then 29, and so on, is that you used the four arrows to deploy the task to the systems. Then you killed the udp-sender manually, and FOGMulticastManager performed its checks and re-created the command. So you had a multicast deploy job set for your Group 1 setup (25 systems).
Then you used the Multicast Deploy option to image another machine. That machine’s image is the same as that of your originally deployed multicast task, so it’s trying to join the currently running multicast session (not working yet, I must stress this).
Then you did the same on another machine. Once again, this image is the same as your multicast session’s, so its task is generated into the same portbase operation.
Hopefully this makes sense as to what you’re seeing.
If you’re trying to image individual systems, I’d recommend unicast. Heck, I’d recommend unicast deployments anyway as, from what I’ve seen, it works much faster than multicast does.
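To illustrate the idea of using the image name as the session generating factor, here is a toy model. It is not FOG’s code: `SessionTable`, the host names and the counts are all made up, and it only shows how the expected receiver count grows when individual multicast tasks with the same image name land in an existing session.

```python
from collections import defaultdict

# Illustration only, not FOG's implementation. Sessions are keyed by image
# name, so a task for a host with the same image joins the existing session.
class SessionTable:
    def __init__(self):
        self.receivers = defaultdict(set)  # image name -> set of host names

    def add_task(self, image_name, host):
        """Add host to the session for image_name; return the receiver count."""
        self.receivers[image_name].add(host)
        return len(self.receivers[image_name])

table = SessionTable()
for n in range(25):                      # a group task covering 25 hosts
    table.add_task("labinfociro", f"pc{n:02d}")
count = table.add_task("labinfociro", "extra-a")   # single-host multicast task
count = table.add_task("labinfociro", "extra-b")   # and another one
print(count)  # 27
```

Each extra task raises the number of receivers the session expects, which is why a regenerated sender command would carry a larger --min-receivers than the size of the group you actually scheduled.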
-
Just finishing a few tests but looking good so far.
Upload and Deploy are working for me, including renaming early.
I even have my test setup running over two sites connected via VPN.
With the setup of multiple TFTP servers… the pxelinux.cfg directory needs to be mounted from the master server… however, if my VPN goes down the clients no longer boot successfully. They error out because the files are missing.
Would you suggest it is better to NFS mount the TFTPboot directory instead of pxelinux.cfg, so that if the VPN is down the clients just error out and boot from their next configured device? Or could it be set up another way so that it works with multiple TFTP servers without an NFS mount?
The only slight annoyance is that the replication service doesn’t seem to do anything on its own; it needs a manual restart to work. It might be an idea to make it configurable or get rid of it, since there could be better solutions for those using many sites, and those on one site with a storage server could just use rsync or something simple.
-
r1172 released.
Should fix the login history page so it verifies whether it’s already an instance of a previous session. If not, it will not generate the graphs, so that should be good.
Snapin deployment tasks actually get cleared from the queue now. (Still to do: make it possible to create tasks over snapin deployments. You can currently create snapin deployments over tasks, but not vice versa.)
Service scripts (PrinterManager, Hostname changer, etc…) now use all the MACs presented to verify whether a host is registered. Still need to figure out how to add a MAC to the host if it’s not already there.
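As a rough picture of the MAC check, matching on any presented MAC could look like the sketch below. The data layout and `find_host` are hypothetical; FOG’s real schema and service code are not shown in this thread.

```python
# Hypothetical data and names for illustration only.
def find_host(registered, client_macs):
    """Return the name of the first host owning any of client_macs, else None."""
    wanted = {mac.lower() for mac in client_macs}
    for host, macs in registered.items():
        if wanted & {mac.lower() for mac in macs}:
            return host
    return None

registered = {"lab-pc-01": ["00:11:22:33:44:55"]}
# The client presents two MACs; only the second one is registered.
print(find_host(registered, ["AA:BB:CC:DD:EE:FF", "00:11:22:33:44:55"]))  # lab-pc-01
```

With a single-MAC check the same client would only be recognised if it happened to present the registered adapter first.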
-
r1173 released.
If you create a multicast job from an individual host, it will generate a new udp-cast session. The caveat to this, however, is groups.
Multicast was designed for groups. So normally, if you are multicasting to different groups, you’re (presumably) imaging those groups with separate images. If you have multiple groups with the same image id, it will assume you’re trying to use the same session. I haven’t worked out the kinks in that yet.
-
r1174 released.
More elements added for Lee Rowlett’s Location Plugin. Still not functional, but building the necessary elements seems more important than getting it to work right now. That part’s relatively easy.
Fixed multicast SDR resizing issues. (I didn’t add it originally DOH!)
-
r1175 released.
Even more elements in the Location Plugin. Schema actually creates the information. Can create/remove locations. No associations yet so not useful, but getting there!
Tweaks to the register.php scripts.
-
Nice, looking forward to this location addition, I may well be making good use of it in the future
-
r1176 released.
Tested pigz decompression of multicast task on the client vs. gunzip. This way it can take advantage of multiple cores. Changed multicast to use full-duplex over half-duplex so gig networks should see better speeds.
Added location scripts for checking when registering the host. Nothing implemented into the fog.man.reg or quickreg scripts yet, but will work it out shortly.
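For reference, the sender commands pasted earlier in this thread used --half-duplex. Below is a sketch of how such a command line would look with the duplex flag switched. It simply mirrors the options visible in the earlier ps output; whether r1176 changes anything else in the generated command is an assumption, and `sender_command` is not a FOG function.

```python
import shlex

# Sketch only: mirrors the option order seen in the ps output earlier in the
# thread. Only the duplex flag is varied; everything else is assumed unchanged.
def sender_command(image_parts, min_receivers, portbase=27198,
                   interface="eth0", full_duplex=True):
    duplex = "--full-duplex" if full_duplex else "--half-duplex"
    sender = (f"/usr/local/sbin/udp-sender --min-receivers {min_receivers} "
              f"--portbase {portbase} --interface {interface} "
              f"{duplex} --ttl 32 --nokbd")
    # One gunzip | udp-sender pipeline per partition image, run in sequence.
    return "".join(f"gunzip -c {shlex.quote(part)}|{sender};"
                   for part in image_parts)

print(sender_command(["/images/labinfociro/d1p1.img"], 5))
```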
-
r1177 released.
More location management page tweaking. Still no associative properties yet, but will be simple to implement. You can search by storagegroup name or id, location name. Will work on implementing host searching once associations are made.
Removed the “old” fog.bkup script and the one left is the one from before 1142.