Latest FOG 0.33b
-
Are you sure your FOG installation is at the latest revision?
-
Finally, with r1170 everything seems to work.
But now there is another problem: in 0.32, a multicast task waits for all clients of the multicast process to connect, then sends to all of them at a maximum speed of 100 Mbit.
In 0.33b this doesn't work, even if you schedule a multicast: the first PC starts the data transfer immediately without waiting for the others.
I will test more thoroughly in an hour (right now I have other urgent work to do).
-
Multicast imaging on r1170 doesn't work: the empty grey partclone ncurses screen is on the monitors, but the transfer doesn't start. Here are some logs:
root@fog:/opt/fog/log# ps -ef|grep fog
avahi 507 1 0 10:03 ? 00:00:02 avahi-daemon: running [fog.local]
root 1161 29116 0 16:15 pts/1 00:00:00 grep --color=auto fog
root 24601 1 0 14:51 ? 00:00:19 /usr/bin/php -q /opt/fog/service/FOGMulticastManager/FOGMulticastManager
root 24611 1 0 14:51 ? 00:00:00 /usr/bin/php -q /opt/fog/service/FOGImageReplicator/FOGImageReplicator
root 24624 1 0 14:51 ? 00:00:03 /usr/bin/php -q /opt/fog/service/FOGTaskScheduler/FOGTaskScheduler
root@fog:/opt/fog/log# vi multicast.log
[01-31-14 4:14:21 pm] * [01-31-14 4:14:21 pm] I am the group manager.
[01-31-14 4:14:31 pm] * [01-31-14 4:14:31 pm] Checking if I am the group manager.
[01-31-14 4:14:31 pm] * [01-31-14 4:14:31 pm] I am the group manager.
[01-31-14 4:14:41 pm] * [01-31-14 4:14:41 pm] Checking if I am the group manager.
[01-31-14 4:14:41 pm] * [01-31-14 4:14:41 pm] I am the group manager.
[01-31-14 4:14:52 pm] * [01-31-14 4:14:52 pm] Checking if I am the group manager.
[01-31-14 4:14:52 pm] * [01-31-14 4:14:52 pm] I am the group manager.
[01-31-14 4:15:03 pm] * [01-31-14 4:15:03 pm] Checking if I am the group manager.
[01-31-14 4:15:03 pm] * [01-31-14 4:15:03 pm] I am the group manager.
[01-31-14 4:15:13 pm] * [01-31-14 4:15:13 pm] Checking if I am the group manager.
[01-31-14 4:15:13 pm] * [01-31-14 4:15:13 pm] I am the group manager.
[01-31-14 4:15:24 pm] * [01-31-14 4:15:24 pm] Checking if I am the group manager.
[01-31-14 4:15:24 pm] * [01-31-14 4:15:24 pm] I am the group manager.
[01-31-14 4:15:34 pm] * [01-31-14 4:15:34 pm] Checking if I am the group manager.
[01-31-14 4:15:34 pm] * [01-31-14 4:15:34 pm] I am the group manager.
[01-31-14 4:15:45 pm] * [01-31-14 4:15:45 pm] Checking if I am the group manager.
[01-31-14 4:15:45 pm] * [01-31-14 4:15:45 pm] I am the group manager.
[01-31-14 4:15:55 pm] * [01-31-14 4:15:55 pm] Checking if I am the group manager.
[01-31-14 4:15:56 pm] * [01-31-14 4:15:56 pm] I am the group manager.
[01-31-14 4:16:06 pm] * [01-31-14 4:16:06 pm] Checking if I am the group manager.
[01-31-14 4:16:06 pm] * [01-31-14 4:16:06 pm] I am the group manager.
root@fog:/opt/fog/log# vi multicast.log.udpcast.50
Udp-sender 20120424
Using mcast address 232.168.0.3
UDP sender for (stdin) at 192.168.0.3 on eth0
Broadcasting control to 224.0.0.1
New connection from 192.168.0.41 (#0) 00000009
New connection from 192.168.0.128 (#1) 00000009
New connection from 192.168.0.42 (#2) 00000009
-
How many clients are supposed to be connected?
On the fog server run:
[code]ps -ef|grep udp-sender[/code]
Look for --max-clients. There should be a number there.
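To compare against that number, the clients udp-sender has accepted so far can be counted from the multicast.log.udpcast.50 output posted earlier. This is only a rough sketch: the sample text is copied from that log, and the helper name `connected_clients` is made up for illustration.

```python
import re

# Sample copied from the multicast.log.udpcast.50 output posted in this thread.
SAMPLE_LOG = """\
Udp-sender 20120424
Using mcast address 232.168.0.3
UDP sender for (stdin) at 192.168.0.3 on eth0
Broadcasting control to 224.0.0.1
New connection from 192.168.0.41 (#0) 00000009
New connection from 192.168.0.128 (#1) 00000009
New connection from 192.168.0.42 (#2) 00000009
"""

# Matches the "New connection from <ip>" lines written by udp-sender.
CONNECTION = re.compile(r"^New connection from (\d+\.\d+\.\d+\.\d+)", re.MULTILINE)


def connected_clients(log_text):
    """Return the client IPs that the log reports as connected (hypothetical helper)."""
    return CONNECTION.findall(log_text)


clients = connected_clients(SAMPLE_LOG)
print(len(clients), clients)
```

If this count stays below the number the sender is waiting for, the transfer will not start on its own.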
-
[quote=“Tom Elliott, post: 22331, member: 7271”]How many clients are supposed to be connected?
On the fog server run:
[code]ps -ef|grep udp-sender[/code]
Look for --max-clients. There should be a number there.[/quote]
Hung processes? I have no active multicast deployment at the moment, only 3 separate single downloads. What is the difference between download and deploy? Aren't they the same?
root@fog:/opt/fog/log# ps -ef|grep udp-sender
root 567 24601 0 16:10 ? 00:00:00 sh -c exec gunzip -c "/images//labinfociro/d1p1.img"|/usr/local/sbin/udp-sender --min-receivers 27 --portbase 27198 --interface eth0 --half-duplex --ttl 32 --nokbd;gunzip -c "/images//labinfociro/d1p2.img"|/usr/local/sbin/udp-sender --min-receivers 27 --portbase 27198 --interface eth0 --half-duplex --ttl 32 --nokbd;gunzip -c "/images//labinfociro/d1p3.img"|/usr/local/sbin/udp-sender --min-receivers 27 --portbase 27198 --interface eth0 --half-duplex --ttl 32 --nokbd;
root 570 567 0 16:10 ? 00:00:00 /usr/local/sbin/udp-sender --min-receivers 27 --portbase 27198 --interface eth0 --half-duplex --ttl 32 --nokbd
root 2886 29116 4 16:42 pts/1 00:00:00 grep --color=auto udp-sender
root@fog:/opt/fog/log#
-
I just had my syntax wrong.
Your udp-sender command is currently waiting for 27 systems, as set by:
[code]--min-receivers 27[/code]
That number is taken from the group you deployed from. (Deploy and download are the same. Deploy can also mean scheduling the task, but in the case of download and multicast it means the same thing.)
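For anyone who wants to pull that number out of the `ps -ef` output without reading it by eye, here is a small sketch. The sample line is the udp-sender process from the listing above; the helper name `min_receivers` is only illustrative and not part of FOG.

```python
import re

# The udp-sender process line from the ps -ef listing posted above.
PS_LINE = (
    "root 570 567 0 16:10 ? 00:00:00 /usr/local/sbin/udp-sender "
    "--min-receivers 27 --portbase 27198 --interface eth0 "
    "--half-duplex --ttl 32 --nokbd"
)


def min_receivers(ps_line):
    """Return the --min-receivers value from a ps line, or None if it is absent."""
    match = re.search(r"--min-receivers\s+(\d+)", ps_line)
    return int(match.group(1)) if match else None


print(min_receivers(PS_LINE))
```

The sender only starts once that many clients have connected, so comparing this value with the number of machines actually booted into the task shows why a session is stuck waiting.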
-
[quote=“Tom Elliott, post: 22156, member: 7271”]Multicast decompression happens on the server, otherwise, the clients, during upload, compress the image.[/quote]
Hi Tom,
Isn't it better to compress and uncompress on the clients? If you have four, five, six or more multicast threads, the server must run four, five, six or more decompressions, and the CPU load increases a lot, no?
-
No, it only needs to uncompress once. The slowdown comes from making sure all systems are at the same point.
-
A multicast task with 30 clients -> one thread/slot -> one uncompress process
Two different multicast tasks, one with 30 clients and the other with 15 clients -> two threads/slots -> two uncompress processes
Three different multicast tasks, with 30, 15 and 18 clients -> three threads/slots -> three uncompress processes
… Each multicast task has one uncompress process, no? And the gunzip process is heavier than the udp-sender process, and will overload the CPU.
-
So why, if I have no task scheduled or running, do I have these processes? I tried to kill them and restart multicast.
root@fog:/opt/fog/log# ps -ef|grep udp-sender
root 567 24601 0 16:10 ? 00:00:00 sh -c exec gunzip -c "/images//labinfociro/d1p1.img"|/usr/local/sbin/udp-sender --min-receivers 27 --portbase 27198 --interface eth0 --half-duplex --ttl 32 --nokbd;gunzip -c "/images//labinfociro/d1p2.img"|/usr/local/sbin/udp-sender --min-receivers 27 --portbase 27198 --interface eth0 --half-duplex --ttl 32 --nokbd;gunzip -c "/images//labinfociro/d1p3.img"|/usr/local/sbin/udp-sender --min-receivers 27 --portbase 27198 --interface eth0 --half-duplex --ttl 32 --nokbd;
root 570 567 0 16:10 ? 00:00:02 /usr/local/sbin/udp-sender --min-receivers 27 --portbase 27198 --interface eth0 --half-duplex --ttl 32 --nokbd
root 10972 29116 0 18:15 pts/1 00:00:00 grep --color=auto udp-sender
root@fog:/opt/fog/log#
-
I tried out multicast:
clicked on group -> basic task -> multicast.
It is a group of 5 PCs.
These PCs are woken up by WOL, but too fast, so some PCs start the process while others bypass it and boot from the local disk.
So I reset them by hand, powering them off and then on, and then they all started the multicast process.
The problem is that all the PCs stay on the empty grey partclone screen.
There is a bug: even though the scheduled group has 5 PCs, for some reason it expects 29 connections before starting.
As a note, my PCs are members of 3 groups. I think the check of how many PCs are scheduled should look at how many PCs are in the group that I scheduled, without their other group memberships…
my situation:
total # of pc in mysql: 27
pc in first group: 25
pc in second group: 3
pc in third group: 5
on the server:
root@fog:/opt/fog/log# ps -ef|grep fog
avahi 507 1 0 10:03 ? 00:00:02 avahi-daemon: running [fog.local]
root 12747 1 0 18:30 ? 00:00:00 /usr/bin/php -q /opt/fog/service/FOGTaskScheduler/FOGTaskScheduler
root 12781 1 0 18:30 ? 00:00:02 /usr/bin/php -q /opt/fog/service/FOGMulticastManager/FOGMulticastManager
root 12816 1 0 18:30 ? 00:00:00 /usr/bin/php -q /opt/fog/service/FOGImageReplicator/FOGImageReplicator
root 13467 29116 4 18:36 pts/1 00:00:00 grep --color=auto fog
multicast.log:
[01-31-14 6:36:16 pm] * [01-31-14 6:36:16 pm] I am the group manager.
[01-31-14 6:36:27 pm] * [01-31-14 6:36:27 pm] Checking if I am the group manager.
[01-31-14 6:36:27 pm] * [01-31-14 6:36:27 pm] I am the group manager.
[01-31-14 6:36:38 pm] * [01-31-14 6:36:38 pm] Checking if I am the group manager.
[01-31-14 6:36:38 pm] * [01-31-14 6:36:38 pm] I am the group manager.
[01-31-14 6:36:49 pm] * [01-31-14 6:36:49 pm] Checking if I am the group manager.
[01-31-14 6:36:49 pm] * [01-31-14 6:36:49 pm] I am the group manager.
[01-31-14 6:37:00 pm] * [01-31-14 6:37:00 pm] Checking if I am the group manager.
[01-31-14 6:37:00 pm] * [01-31-14 6:37:00 pm] I am the group manager.
multicast.log.udpcast.50:
Udp-sender 20120424
Using mcast address 232.168.0.3
UDP sender for (stdin) at 192.168.0.3 on eth0
Broadcasting control to 224.0.0.1
New connection from 192.168.0.133 (#0) 00000009
New connection from 192.168.0.113 (#1) 00000009
New connection from 192.168.0.141 (#2) 00000009
root@fog:/opt/fog/log# ps -ef|grep udp
root 13001 12781 0 18:31 ? 00:00:00 sh -c exec gunzip -c "/images//labinfociro/d1p1.img"|/usr/local/sbin/udp-sender --min-receivers 29 --portbase 27198 --interface eth0 --half-duplex --ttl 32 --nokbd;gunzip -c "/images//labinfociro/d1p2.img"|/usr/local/sbin/udp-sender --min-receivers 29 --portbase 27198 --interface eth0 --half-duplex --ttl 32 --nokbd;gunzip -c "/images//labinfociro/d1p3.img"|/usr/local/sbin/udp-sender --min-receivers 29 --portbase 27198 --interface eth0 --half-duplex --ttl 32 --nokbd;
root 13003 13001 0 18:31 ? 00:00:00 /usr/local/sbin/udp-sender --min-receivers 29 --portbase 27198 --interface eth0 --half-duplex --ttl 32 --nokbd
-
[quote=“fabritrento, post: 22341, member: 21607”]
These PCs are woken up by WOL, but too fast, so some PCs start the process while others bypass it and boot from the local disk.
[/quote]
Check the BIOS setup. On HP machines you must set WOL to boot from the remote server and not from the local machine.
-
I did some testing on XP (single partition, resizable) with init.gz from different revisions:
1110 will deploy an image created by 1110
1110 will deploy an image created by 1170
1170 will not deploy an image created by 1170
1170 will not deploy an image created by 1110
1142 will not deploy an image created by 1110
1115 will deploy an image created by 1110
1115 will deploy an image created by 1170
1132 will deploy an image created by 1170
1137 will deploy an image created by 1170
1139 will deploy an image created by 1170
So it looks like image upload is still OK at 1142; however, deploy is broken from then on.
Hope this helps.
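For what it's worth, the results above can be tabulated to narrow down where deploy broke. This is just a sketch over the numbers reported in this post; `deploy_boundary` is a made-up helper, and it assumes that once deploy broke it stayed broken in later revisions.

```python
# (init.gz revision used to deploy, revision that created the image, deploy worked?)
# Values are the test results reported in this post.
RESULTS = [
    (1110, 1110, True),
    (1110, 1170, True),
    (1170, 1170, False),
    (1170, 1110, False),
    (1142, 1110, False),
    (1115, 1110, True),
    (1115, 1170, True),
    (1132, 1170, True),
    (1137, 1170, True),
    (1139, 1170, True),
]


def deploy_boundary(results):
    """Return (last revision that deployed, first revision that failed to deploy)."""
    good = {deploy_rev for deploy_rev, _, ok in results if ok}
    bad = {deploy_rev for deploy_rev, _, ok in results if not ok}
    return max(good), min(bad)


print(deploy_boundary(RESULTS))
```

With these results the break falls somewhere after 1139 and no later than 1142; revisions 1140 and 1141 were not tested here.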
-
I expect that 1141 and earlier should deploy SDR images with little to no problem. I suspect that something in the download is screwed up, and I will focus as much of my time as possible on that script this weekend.
I have to go get surgery on Monday, so if I’m not fully responsive then you all know why.
-
r1171 released.
Progress bar now matches the Active Tasks table.
The Active Tasks table now includes who created each task, so we know who's doing what and when.
Minor elements added for the location patch by Lee Rowlett (not working yet, but in progress).
Adjusts some elements of the fog.download script to maybe get Windows XP working? (PLEASE PLEASE PLEASE)
Fix for the “No host found for …” as I was finally able to replicate the issue.
-
Just tried a quick test of XP, single partition resizable, on 1171, using an image created previously.
It worked.
Many thanks for all your good work, hope all goes OK Monday.
-
Finally making progress. Thanks for the testing and faith that I’d get it.
-
[quote=“Fernando Gietz, post: 22339, member: 13”]A multicast task with 30 clients -> one thread/slot -> one uncompress process
Two different multicast tasks, one with 30 clients and the other with 15 clients -> two threads/slots -> two uncompress processes
Three different multicast tasks, with 30, 15 and 18 clients -> three threads/slots -> three uncompress processes
… Each multicast task has one uncompress process, no? And the gunzip process is heavier than the udp-sender process, and will overload the CPU.[/quote]
While you’re right about this, and maybe I’m overthinking it, let’s say you have 3 multicast sessions running: session one with 30 clients, session two with 15 clients, and session three with 18 clients.
If all of them, for some reason, start getting their data at (more or less) the same time and each client decompresses the image file individually, we’d actually be doing more work to accomplish the same result. What I mean is that this opens 63 gunzip tasks (individual as they may be). And while that load is on the individual hosts, it’s far more work overall. Some systems may decompress faster than others, causing delays and possibly timeouts on the udp session.
While you’re right that it could become CPU intensive on the server, it would ultimately take much longer if each client performed its own decompression. We’re only running three gunzip tasks versus 63.
Client-side decompression isn’t necessarily a bad approach, as it keeps resources on the server available for other imaging, snapin (or what have you) tasks to perform better. Each of these techniques has its pros and cons.
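To put numbers on that comparison, here is a tiny sketch using the three example sessions from this post. The function name and the `decompress_on` parameter are made up for illustration; it only counts gunzip processes and says nothing about how heavy each one is.

```python
# Client counts for the three example multicast sessions in this post.
SESSIONS = [30, 15, 18]


def gunzip_processes(sessions, decompress_on="server"):
    """Count gunzip processes needed for the given sessions.

    Server-side: one gunzip per multicast session.
    Client-side: one gunzip per client in every session.
    """
    if decompress_on == "server":
        return len(sessions)
    if decompress_on == "clients":
        return sum(sessions)
    raise ValueError("decompress_on must be 'server' or 'clients'")


print(gunzip_processes(SESSIONS, "server"))   # one per session
print(gunzip_processes(SESSIONS, "clients"))  # one per client
```

The trade-off is where the work lands: three processes concentrated on one server, or 63 spread across the clients.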
-
[quote=“fabritrento, post: 22341, member: 21607”]I tried out multicast:
clicked on group -> basic task -> multicast.
It is a group of 5 PCs.
These PCs are woken up by WOL, but too fast, so some PCs start the process while others bypass it and boot from the local disk.
So I reset them by hand, powering them off and then on, and then they all started the multicast process.
The problem is that all the PCs stay on the empty grey partclone screen.
There is a bug: even though the scheduled group has 5 PCs, for some reason it expects 29 connections before starting.
As a note, my PCs are members of 3 groups. I think the check of how many PCs are scheduled should look at how many PCs are in the group that I scheduled, without their other group memberships…
my situation:
total # of pc in mysql: 27
pc in first group: 25
pc in second group: 3
pc in third group: 5
on the server:
root@fog:/opt/fog/log# ps -ef|grep fog
avahi 507 1 0 10:03 ? 00:00:02 avahi-daemon: running [fog.local]
root 12747 1 0 18:30 ? 00:00:00 /usr/bin/php -q /opt/fog/service/FOGTaskScheduler/FOGTaskScheduler
root 12781 1 0 18:30 ? 00:00:02 /usr/bin/php -q /opt/fog/service/FOGMulticastManager/FOGMulticastManager
root 12816 1 0 18:30 ? 00:00:00 /usr/bin/php -q /opt/fog/service/FOGImageReplicator/FOGImageReplicator
root 13467 29116 4 18:36 pts/1 00:00:00 grep --color=auto fog
multicast.log:
[01-31-14 6:36:16 pm] * [01-31-14 6:36:16 pm] I am the group manager.
[01-31-14 6:36:27 pm] * [01-31-14 6:36:27 pm] Checking if I am the group manager.
[01-31-14 6:36:27 pm] * [01-31-14 6:36:27 pm] I am the group manager.
[01-31-14 6:36:38 pm] * [01-31-14 6:36:38 pm] Checking if I am the group manager.
[01-31-14 6:36:38 pm] * [01-31-14 6:36:38 pm] I am the group manager.
[01-31-14 6:36:49 pm] * [01-31-14 6:36:49 pm] Checking if I am the group manager.
[01-31-14 6:36:49 pm] * [01-31-14 6:36:49 pm] I am the group manager.
[01-31-14 6:37:00 pm] * [01-31-14 6:37:00 pm] Checking if I am the group manager.
[01-31-14 6:37:00 pm] * [01-31-14 6:37:00 pm] I am the group manager.
multicast.log.udpcast.50:
Udp-sender 20120424
Using mcast address 232.168.0.3
UDP sender for (stdin) at 192.168.0.3 on eth0
Broadcasting control to 224.0.0.1
New connection from 192.168.0.133 (#0) 00000009
New connection from 192.168.0.113 (#1) 00000009
New connection from 192.168.0.141 (#2) 00000009
root@fog:/opt/fog/log# ps -ef|grep udp
root 13001 12781 0 18:31 ? 00:00:00 sh -c exec gunzip -c "/images//labinfociro/d1p1.img"|/usr/local/sbin/udp-sender --min-receivers 29 --portbase 27198 --interface eth0 --half-duplex --ttl 32 --nokbd;gunzip -c "/images//labinfociro/d1p2.img"|/usr/local/sbin/udp-sender --min-receivers 29 --portbase 27198 --interface eth0 --half-duplex --ttl 32 --nokbd;gunzip -c "/images//labinfociro/d1p3.img"|/usr/local/sbin/udp-sender --min-receivers 29 --portbase 27198 --interface eth0 --half-duplex --ttl 32 --nokbd;
root 13003 13001 0 18:31 ? 00:00:00 /usr/local/sbin/udp-sender --min-receivers 29 --portbase 27198 --interface eth0 --half-duplex --ttl 32 --nokbd[/quote]
My guess here is that what you’re seeing is something I was actually trying to accomplish in the interim. That said, here is my guess at how your systems are set up:
Group 1 (with 25 clients) uses image name labinfociro.
Group 2 (with 3 clients) uses image name labinfociro.
Group 3 (with 5 clients) uses image name labinfociro.
Does this sound correct?
My methodology (while maybe incorrect at this point) was to use the image name as the session-generating factor.
My thought on this is:
If a client that was not initially in the group tasking has the same image name as a currently running session, regenerate the cmd (which I haven’t figured out how to do yet) to add the new client to the same multicast group. This way, it’s less taxing on the server than opening multiple threads (at this point) of gunzip and udp-sender, as multicast can wreak havoc on a network.
On the host page, the three deploy icons (Upload, the up arrow; Unicast Download, the down arrow; Multicast Download, the four arrows) perform different functions, as described.
My guess as to why you saw 27, then 29, and so on, is that you used the four arrows to deploy the task to the systems. Then you killed udp-sender manually, and FOGMulticastManager performed its checks and re-created the command. So you had a multicast deploy job set for your Group 1 setup (25 systems).
Then you used the Multicast Deploy option to image another machine. That machine’s image is the same as that of your originally deployed multicast tasking, so it’s trying to join the currently running multicast session (this is not working yet, I must stress).
Then you did the same on another machine. Once again, this image is the same as your multicast session’s, so its tasking is being generated into the same portbase operation.
Hopefully this makes sense as to what you’re seeing.
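To illustrate the idea only, here is a rough sketch of sessions keyed by image name. It is not the FOGMulticastManager code: the class, method names, and host names are hypothetical, and it simply counts distinct hosts per image name as the number of receivers to wait for.

```python
class MulticastSessions:
    """Hypothetical model: one multicast session per image name."""

    def __init__(self):
        # image name -> set of host names tasked with that image
        self._sessions = {}

    def join(self, image_name, host):
        """Add a host to the session for its image and return the new client count."""
        clients = self._sessions.setdefault(image_name, set())
        clients.add(host)
        return len(clients)

    def min_receivers(self, image_name):
        """Number of clients the sender would wait for (assumed: all joined hosts)."""
        return len(self._sessions.get(image_name, ()))


sessions = MulticastSessions()

# Group 1 multicast deploy: 25 systems, all using image labinfociro.
for n in range(25):
    sessions.join("labinfociro", f"group1-pc{n}")

# Two further machines tasked individually with the same image.
sessions.join("labinfociro", "extra-pc1")
sessions.join("labinfociro", "extra-pc2")

print(sessions.min_receivers("labinfociro"))
```

Under these assumptions a 25-system group plus two individually tasked machines gives a sender waiting for 27 clients, which matches the kind of growth seen in the ps output.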
If you’re trying to image individual systems, I’d recommend unicast. Heck, I’d recommend unicast deployments anyway as, from what I’ve seen, it works much faster than multicast does.
-
Just finishing a few tests but looking good so far.
Upload and Deploy are working for me, including renaming early.
I even have my test setup running over two sites connected via VPN.
With the setup of multiple TFTP servers, the pxelinux.cfg directory needs to be mounted from the master server. However, if my VPN goes down, the clients no longer boot successfully. They error out because the files are missing.
Would you suggest it is better to NFS mount the TFTPboot directory instead of pxelinux.cfg, so that if the VPN is down the clients just error out and boot from their next set device? Or could it be set up another way so that it works with multiple TFTP servers without an NFS mount?
The only slight annoyance is that the replication service doesn’t seem to do anything on its own; it needs a manual restart to work. It might be an idea to make it configurable, or to get rid of it, since there could be better solutions for those using many sites, and those on one site with a storage server could just use rsync or something simple.