Latest Development FOG
-
Tom,
2484 on multicast works like a champ now. Imaging is a lot faster now than doing 3 UDP casts at once. Originally it was pushing 800 mbps; now I’m doing 3 laptops at 1.62 gig. That’s huge. One thing I noticed: when it loads, the session still says send method UDPCAST. A cool addition would be a screen, either at the Partclone screen or just before it, stating how many computers we are waiting on until the session starts, with a timeout counter. For example, a computer joins the session and sits at a please-wait screen like in previous versions, but it says “Waiting on 2 of 3 computers until session starts” with the timeout counter below. As other computers join the session, the number of computers decreases: “Waiting on 1 of 3 computers until session starts.” That would be very helpful if you are imaging a ton of computers at once.
Ray Z
-
Ray,
I’m psyched that imaging now works through the session name for multicast. I had missed something, but it should now work properly. As for your feature request, I do understand why you’d want it.
However, I’m not the developer for Partclone. That’s not to say I couldn’t do what you’re requesting, but it’d be a lot of work to code in a replacement field while waiting. Hopefully you understand, as my C skills aren’t quite there yet.
-
Tom,
No problem buddy. In time it will get there. Thanks for all your help. I’ll continue to test. Now we gotta get this UEFI for the tablets working. I’m pumped as well; multicasting is working great now too. Thanks again
-
SVN 2487 released.
This just brings more proper fixes for the multicast session joining feature, as well as iPXE file updates. Nothing major to note (I don’t think); I just wanted to keep everyone informed.
-
Tom,
Is there a reason why, every time I update to the latest SVN, it inserts a 298 into my MySQL password? I have $$$ at the end of my password, and it inserts 298$ and deletes the other two dollar signs.
-
Sorry, it inserts 2983.
-
My guess is the .fogsettings file.
Edit the file:
/opt/fog/.fogsettings (yes, that’s slash opt slash fog slash period fogsettings)
Change the snmysqlpass field, and where you have the $$$ you may need to replace it with: Backslash$Backslash$Backslash$
So it would show something like \$\$\$
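A possible explanation, assuming the installer reads .fogsettings as a shell script: in bash, `$$` expands to the shell's process ID, so an unescaped `$$$` would turn into a number followed by a single `$`, which is consistent with the 2983 reported above. The escaping step itself can be sketched in Python. The helper name and the sample password are made up for illustration, and the sketch assumes the value is unquoted or double-quoted in the file.

```python
def escape_for_fogsettings(password: str) -> str:
    """Prefix every dollar sign with a backslash so a shell keeps it literal.

    Hypothetical helper, not part of FOG.
    """
    return password.replace("$", "\\$")


# Hypothetical password ending in three dollar signs.
print(escape_for_fogsettings("secret$$$"))
```

The printed value is what would go into the snmysqlpass field in place of the raw password.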
-
Tom,
Two things… Ever since the update to 2487, when you go to register a host from the iPXE menu or join a multicast session, it hangs at sending network discovery and then finally goes in, with about a 60 sec lag. Another thing I noticed: in order for a host to join a multicast session, that host has to be registered with the image that is being broadcast. I’m not sure if that’s a bug or by design. However, if the host is not registered with the image being broadcast, it just hangs at the Broadcom “all rights reserved” screen. If this is by design, it should give an error message saying this host isn’t registered with the following image and kick you back to the main iPXE menu.
Ray Z
-
Right now, the host must be registered, not necessarily with the particular image, just registered, to join the session. I need to figure out the logic on how to get the host to join when it isn’t registered though and that’s going to be tough.
The 60 sec lag time is a bit odd, but it fits what I specified. Basically, I’ve set the retries to 100 and each retry has a 60 second timeout. So my guess is that the first request is failing but the second request is working properly. I will likely change the timeout to 5 or 10 seconds but keep the number of retries the same.
-
Tom,
Just tried doing a multicast with 6 hosts and the timeout set to 5 minutes, and they just sat at the Partclone screen. I reran that multicast session with 3 hosts and a 1 min timeout, and now it’s working. So I’m not sure if the issue is related to the timeout or the number of hosts. How does the timeout work? Is the timeout reset back to 5 minutes every time a host joins the session and then counts down to 0, or does the timeout start when the first host joins, so that when 5 minutes is up the multicast starts regardless of how many PCs were specified?
Ray Z
-
SVN 2488 released.
This should decrease the timeout value from 60 seconds to 20 seconds.
There are a total of 100 retries performed. Each retry has a timeout of 20 seconds. This means, at 3 retries, you’re waiting approximately 1 minute. It’s pretty safe to assume that if you have not received an IP at this point, you need to fix your network, as it’s not something FOG is failing to do. For those with USB NICs, you should not just try booting, as this leaves the potential of waiting a full 2000 seconds for no good reason. We know what the issue is and have a workaround that will work if followed correctly. Please don’t try to pin the tail on me for failing to do something if you miss this step.
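The wait times above are just retries multiplied by the per-retry timeout. A quick Python check of the arithmetic (the function name is made up for illustration; the numbers are the ones quoted in this thread):

```python
def worst_case_wait(retries: int, timeout_seconds: int) -> int:
    """Total seconds spent waiting if every one of the retries times out."""
    return retries * timeout_seconds


print(worst_case_wait(3, 20))    # 60: three failed retries, about 1 minute
print(worst_case_wait(100, 20))  # 2000: all retries fail with the new timeout
print(worst_case_wait(100, 60))  # 6000: all retries fail with the old timeout
```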
-
[quote=“Ray Zuchowski, post: 38335, member: 24449”]Tom,
Just tried doing a multicast with 6 hosts and the timeout set to 5 minutes, and they just sat at the Partclone screen. I reran that multicast session with 3 hosts and a 1 min timeout, and now it’s working. So I’m not sure if the issue is related to the timeout or the number of hosts. How does the timeout work? Is the timeout reset back to 5 minutes every time a host joins the session and then counts down to 0, or does the timeout start when the first host joins, so that when 5 minutes is up the multicast starts regardless of how many PCs were specified?
Ray Z[/quote]
The multicast timeout works as a maximum “wait”. If the timeout is reached but not all the hosts have connected, it should start imaging. If all the hosts in the task connect, it should start imaging immediately. I don’t know whether it resets the timeout every time a host connects, though.
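The start condition described above can be sketched roughly like this. It is only an illustration, not FOG’s or udp-sender’s actual code: the function name and parameters are made up, and it assumes the timer runs from a single starting point and is not reset when a host joins, which is exactly the part that is still unknown.

```python
def should_start_imaging(connected: int, expected: int,
                         elapsed_seconds: float, timeout_seconds: float) -> bool:
    """Decide whether a multicast session should begin sending.

    Assumption: elapsed_seconds is measured from one fixed start point and
    is never reset when another host joins.
    """
    if connected >= expected:
        # Every host in the task has joined, so start right away.
        return True
    # Otherwise start only once the wait timeout has been reached.
    return elapsed_seconds >= timeout_seconds


print(should_start_imaging(3, 3, 10, 300))   # all hosts joined
print(should_start_imaging(4, 6, 120, 300))  # still waiting
print(should_start_imaging(4, 6, 300, 300))  # timeout reached
```

If the timer were in fact reset on every join, elapsed_seconds would have to be measured from the most recent join instead.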
-
Tom,
Just tried adding a host that is registered with no image selected to a multicast, and it won’t join the session. Next I’ll have to try a host that is registered with a different image, to determine whether multicast only plays nice with hosts registered with that particular image.
-
By “no image selected” I mean the step where you register the host and it asks which image that host should be registered with.
-
Tom,
I narrowed the multicasting issue down to this: if a host is registered with no image selected in the host registration section, the host has issues and won’t join the multicast session. If a host is registered with a different image than the one being broadcast, it loads the script like it’s about to join the session and then reboots. So the hosts definitely have to be registered with the image that is being broadcast.
Ray Z
-
Bugs Found SVN 2488
- Host has to be registered with the image being broadcast or it won’t join the session.
- If joining a session on multicast and the username and password are entered incorrectly, you are brought back to the iPXE boot screen; however, upon retrying to join the session, the old username and password are still filled in and you have to delete them manually. They should be cleared automatically when retrying.
- On the multicast image management screen, when the session starts the client count is incorrect. I’m currently imaging 5 clients in a multicast and it says I’m doing 10.
- Active Multicast - Task Management displays the session in State Queued even though the task is started and imaging.
- Under the imaging log reports, the failed client hosts are being shown. However, only successful images should be displayed, or failed host images should go in a separate report. This throws off stats.
An easy way I noticed to determine from the logs whether an image failed is End Date = -42 or End Time = 0:00:00. Maybe have the code discard entries from the imaging log if those values are present.
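The filtering idea in the last item could look something like the Python sketch below. The field names (`host`, `end_date`, `end_time`), the sample rows, and the assumption that the values come through as strings are all made up for illustration; only the two failure markers (-42 and 0:00:00) come from the report above.

```python
from typing import Dict, List


def is_failed(entry: Dict[str, str]) -> bool:
    """Treat an entry as failed if it carries either marker from the report."""
    return entry.get("end_date") == "-42" or entry.get("end_time") == "0:00:00"


def successful_only(entries: List[Dict[str, str]]) -> List[Dict[str, str]]:
    """Keep only the entries that do not look like failed imaging runs."""
    return [entry for entry in entries if not is_failed(entry)]


# Hypothetical log rows.
log = [
    {"host": "laptop-01", "end_date": "2014-10-20", "end_time": "0:12:31"},
    {"host": "laptop-02", "end_date": "-42", "end_time": "0:00:00"},
    {"host": "laptop-03", "end_date": "2014-10-20", "end_time": "0:00:00"},
]
for entry in successful_only(log):
    print(entry["host"])
```

The same predicate could just as well be used to route failed entries into a separate report instead of dropping them.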
-
SVN 2490 should fix the registered host only working with the host’s image.
Also adds the image id to the display of the image. This should make finding the image for registration that little bit easier.
Ray:
- Yes, but this should now be corrected for.
- Should now be fixed.
- I think that’s my fault. I’m trying to ensure the client count is set, so maybe once the udp-sender commands run I simply need to reset the client count to zero? I’ll give that a shot.
- It’s kind of the nature of the way multicast sessions happen. I will see what I can do, but it’s kind of dealing with the client count parameter, which we have no real clue of.
- Again, kind of how it was intended. While the tasking itself failed, the fact it checked in is still valid. Hopefully once I get the quirks of the bugs taken out of the scenario, it should help you out.
-
2492 should fix the client count issue.
-
Theoretically, 2495 should fix the client count but I don’t know. This client count is compared and is what sets the stateID. If they’re equal, it should set it to in-progress.
I’ve added a sess client table that just stores the numbers. It creates the tasking based on this parameter as well if there are no associations (which there shouldn’t be).
This should allow the original clients field to be left as is. This should also then set the tasking to in progress if there’s even one host that’s checked in under this setting. It’s not 100% accurate but should give a fairly good idea.
This should address all of the issues you reported.
-
I have tried to use the code from GitHub and found a small issue in both files in the bin folder. It looks like there was a merge conflict that wasn’t fully resolved.
Example:
[url]https://github.com/mastacontrola/fogproject/blob/dev-branch/bin/.install.sh#L1[/url]
Please fix and ty,
Albatros
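For anyone who wants to check a checkout for leftover conflict markers like the ones reported above, here is a minimal Python sketch. It only looks for the standard Git marker lines (seven `<`, `=` or `>` characters at the start of a line); the function name and the sample text are made up for illustration and are not the actual contents of the installer.

```python
import re
from typing import List

# Standard Git conflict markers: "<<<<<<< ours", "=======", ">>>>>>> theirs".
CONFLICT_MARKER = re.compile(r"^(<{7}|={7}|>{7})( |$)")


def find_conflict_markers(text: str) -> List[int]:
    """Return the 1-based line numbers that look like Git conflict markers."""
    return [
        number
        for number, line in enumerate(text.splitlines(), start=1)
        if CONFLICT_MARKER.match(line)
    ]


# Hypothetical file contents with an unresolved conflict.
sample = (
    "#!/bin/sh\n"
    "<<<<<<< HEAD\n"
    "echo a\n"
    "=======\n"
    "echo b\n"
    ">>>>>>> dev-branch\n"
)
print(find_conflict_markers(sample))
```

To check a real file, read it with `open(path).read()` and pass the text to the function; an empty list means no marker lines were found.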