Multicast timeout
-
@processor You have to tell the system to boot. If it’s starting after the fact, why not just physically power it off?
There wouldn’t be a nice way to do this from the web UI.
-
To put things in context: I work in an education center and often have to install a bunch of PCs on Friday afternoon.
Some of the images we load are heavy, and our NAS core has some limitations.
Even with the high-performance FOG setup, when I load 6 or 7 classes at the same time the loading time can be quite long, and I can wait part of the night for it. It happens that a multicast machine does not attach properly or crashes during loading.
If there is a crash and the machine restarts, it is usually not really an issue: I create a unicast session remotely (usually on Saturday) for this PC, and at restart it loads its image.
If the machine is stuck on the Partclone blue screen, I can’t do anything. So if the machine itself could check for the presence of the multicast session and restart when it no longer exists, it would be a salvation.
“There wouldn’t be a nice way to do this from the web UI”
That would be even better, but I have not heard of such a solution.
If anyone knows of one, please tell us.
-
@processor I apologize, I guess, for this free solution that provides help in the best way it can?
Sorry if I’m seeming rude, but we cannot possibly fix all potentialities.
Yes, there may be some “pain” points, but I would think what our utility provides in normal circumstances far outweighs these occasional issues you’re finding?
I’m not aware of any magical utility that could do exactly what you’re looking for:
- Boot to some different environment in multicast.
- Check whether the multicast session is still running. I’m not doing anything to forcibly shut down the system, though.
-
“Sorry if I’m seeming rude, but we cannot possibly fix all potentialities.”
Ahaha, no worries, I do not take it personally. You are taking some of your time to answer, so I would be the rude one if I could not understand your position.
Anyway, I have never seen this option before:
Could it be my solution?
-
@processor That deals with the “FOGClient”, which would not do what you’re requesting while the machine is in a tasking.
-
@Tom-Elliott Yes, I saw it 5 minutes after I posted my question.
SSH access to the FOS client would be a solution for me (if I understood correctly, as my English is far from perfect).
I may have found a solution with this:
Can you confirm it works?
-
@processor it does work.
-
@Tom-Elliott
So I have been able to modify the init.xz.
Now I can remotely access the client, reboot it, etc.
I’ll try to make a script to reboot it and move it to a standalone deploy if the multicast crashed; this way everything will be automated.
Thanks for your time,
Proc.
-
Hi,
I have dealt with my problem: crashed multicast clients can now be stopped and relaunched automatically once the session has finished.
The fact that every machine in our environment uses an IP reservation made things easy.
I handle only crashed clients. It would be easy to add a WOL or a client restart, but I consider that if the WOL or client restart did not work the first time, there is no reason it would work better 1h later without any intervention.
If someone is interested in such a feature, here is the script. It could be triggered every hour; this way, whenever the multicast session is finished, the client is reinstalled.
I have only used it in a test environment for now, so there could be some issues. If you find some or have any idea to make it better, please share and I’ll correct it:
#!/bin/bash
# Infos
# Works on a Linux machine (tested on Ubuntu 22.04)
# Requirements:
#   - sshpass, jq and curl installed
#   - mysql table hostsIP with host names and IPs
#   - a modified init.xz containing a modified shadow file with a root password set inside
#     (more information here: https://wiki.fogproject.org/wiki/index.php?title=Modifying_the_Init_Image)

IFS=$'\n'
userToken='userToken'
fogToken='serverToken'
fogserver='fogServerIP'
rootPass='set password in shadow file of init.xz'

# Call the FOG API with both token headers; all arguments are passed on to curl.
fogApi() {
    curl -s -L -H "fog-user-token: $userToken" -H "fog-api-token: $fogToken" "$@"
}

# Names of all multicast sessions currently running.
multicastSessions=$(fogApi "http://$fogserver/fog/multicastsession/current" | jq -r '.multicastsessions[].name')

for session in $multicastSessions; do
    hostArray=()
    # Reset per session so IDs from a previous session are not reused.
    taskID=''
    imageID=''
    # The group name is the last word of the session name.
    group=$(echo "$session" | rev | cut -d' ' -f1 | rev)
    # Fetch the current tasks once and reuse them for every lookup below.
    tasks=$(fogApi "http://$fogserver/fog/task/current")
    groupHostsQty=$(fogApi "http://$fogserver/fog/group/list" | jq -r --arg group "$group" '.groups[] | select(.name == $group) | .hostcount')
    currentHostQty=$(jq --arg session "$session" '.tasks[] | select(.name == $session) | .host.name' <<< "$tasks" | wc -l)

    # The counts differ when some hosts finished and the others are stuck.
    if [[ "$groupHostsQty" != "$currentHostQty" ]]; then
        hosts=$(jq -r --arg group "$group" '.tasks[] | select(.host.name | startswith($group)) | .host.name' <<< "$tasks")
        for host in $hosts; do
            hostIP=$(mysql fog -Bse "SELECT ip FROM hostsIP WHERE name = '$host'")
            hostID=$(jq -r --arg host "$host" '.tasks[] | select(.host.name == $host) | .host.id' <<< "$tasks")
            # One task ID is enough: cancelling it stops the whole multicast.
            if [[ -z $taskID ]]; then
                taskID=$(jq -r --arg host "$host" '.tasks[] | select(.host.name == $host) | .id' <<< "$tasks")
            fi
            if [[ -z $imageID ]]; then
                imageID=$(jq -r --arg host "$host" '.tasks[] | select(.host.name == $host) | .image.id' <<< "$tasks")
            fi
            hostArray+=("$hostIP,$hostID")
        done

        echo "Task : $taskID"
        fogApi -X DELETE "http://$fogserver/fog/task/$taskID/cancel"

        # Wait until the multicast session has really disappeared.
        message="Multicast session still active"
        while sessionActive=$(fogApi "http://$fogserver/fog/multicastsession/current" | jq -r --arg group "$group" '.multicastsessions[] | .name | select(contains($group))'); [[ -n $sessionActive ]]; do
            message="${message}."
            echo "$message"
            sleep 1
        done

        # Create a unicast deploy task for each remaining host, then reboot it.
        for host in "${hostArray[@]}"; do
            hostIP=$(echo "$host" | cut -d',' -f1)
            hostID=$(echo "$host" | cut -d',' -f2)
            echo "Relaunch JOB for : $hostID ($hostIP)"
            # The quotes are closed around the variables so that they are expanded inside the JSON.
            fogApi -X POST -d '{"taskTypeID": 1, "imageID": "'"$imageID"'", "taskName": "'"$hostID"' : deploy", "shutdown": false, "debug": false, "deploySnapins": false, "wol": false}' "http://$fogserver/fog/host/$hostID/task"
            echo "Reboot PC"
            sshpass -p "$rootPass" ssh -o ConnectTimeout=5 -o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null "root@$hostIP" 'nohup reboot -f > /dev/null 2>&1 & exit'
        done
    fi
done
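For the hourly trigger mentioned above, cron is one option. The following is only a sketch: the script path, log path and lock file are hypothetical names, and `flock -n` is added on the assumption that a run may still be waiting for a session to end when the next hour starts.

```bash
#!/bin/bash
# Sketch: build an hourly crontab entry for the recovery script.
# All three paths below are hypothetical; adjust them to your server.
scriptPath='/usr/local/sbin/fog-multicast-recover.sh'
logPath='/var/log/fog-multicast-recover.log'
lockPath='/var/lock/fog-multicast-recover.lock'

# "0 * * * *" means minute 0 of every hour.
# flock -n skips this run if the previous one still holds the lock.
cronLine="0 * * * * flock -n $lockPath $scriptPath >> $logPath 2>&1"

echo "$cronLine"

# To install it for the current user, something like this should work:
#   (crontab -l 2>/dev/null; echo "$cronLine") | crontab -
```

Since the script calls `mysql fog` without credentials, the cron entry would need to belong to a user who can open that database without a password prompt.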
Bye
Proc.
-
@processor You created a table in FOG called hostsIP?
Note: This isn’t a bad thing, I’m just trying to reference the information.
The way I understand this, you’re looking at the multicastSessions table, getting a group, and getting the name of the hosts.
You’re at least aware of the IPs which most others will not be, but just trying to think this through.
-
@Tom-Elliott said in Multicast timeout:
You created a table in FOG called hostsIP?
Yes. In our environment, hosts have IP reservations, and I did not find the last IP used by each host in the FOG tables (maybe I missed something), so I created a table associating IPs with host names.
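For readers who want to reproduce this, the table could look something like the sketch below. Only the table name `hostsIP` and the columns `name` and `ip` come from the script’s query; the column types, the primary key, the helper function names and the sample row are assumptions.

```bash
#!/bin/bash
# Sketch: print SQL that creates and fills a hostsIP table.
# Table and column names match the script's query
# (SELECT ip FROM hostsIP WHERE name = '...'); types and key are assumptions.
hostsIPSchema() {
    cat <<'SQL'
CREATE TABLE IF NOT EXISTS hostsIP (
    name VARCHAR(64) NOT NULL PRIMARY KEY,  -- FOG host name, e.g. R1-P1
    ip   VARCHAR(45) NOT NULL               -- reserved IP address
);
SQL
}

# Print one INSERT statement for a host name / IP pair.
hostsIPRow() {
    printf "INSERT INTO hostsIP (name, ip) VALUES ('%s', '%s');\n" "$1" "$2"
}

hostsIPSchema
hostsIPRow 'R1-P1' '192.168.1.11'   # hypothetical host and address

# To apply on the FOG server, the output could be piped into the fog database:
#   { hostsIPSchema; hostsIPRow 'R1-P1' '192.168.1.11'; } | mysql fog
```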
@Tom-Elliott said in Multicast timeout:
Note: This isn’t a bad thing, I’m just trying to reference the information.
No worries.
@Tom-Elliott said in Multicast timeout:
The way I understand this, you’re looking at the multicastSessions table, getting a group, and getting the name of the hosts.
- First I get all running multicast sessions.
- For each session I derive the associated group from the session name.
- I get the maximum number of machines in the group.
- I get the machines currently associated with the session.
- Since a machine is removed from the session’s machine list when it finishes, I compare the number of machines in the group with the number of machines currently running for this session. If the two differ, it means that at least one machine reached the end but the others did not.
- Now that you make me talk about it, I realize how specific our setup is: all of our machine names start with the group name (e.g. all machines in room 1 follow the pattern R1-P1, R1-P2 … R1-P10). So, since I have the group name and I know that the hosts belonging to it have it in their name, I get all machines whose name starts with the name of the group we are working on.
- In the same way I get the machine IDs.
- If not already done, I get the image ID.
- Same for the task ID (one client is enough, since in multicast killing the task on one client kills the whole multicast).
- For each remaining host, I store its IP and ID in an array.
- I kill the task and run a 1-second sleep loop until the multicast session disappears. (It can take a few seconds for the multicast to be removed, and without this loop some clients can’t be reinstalled because they are still part of another process.)
- Finally, for each host in the array, I create a unicast session (using the host ID) and reboot the PC using the host IP.
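The naming-based steps above can be sketched in plain bash without touching the API. The session name, host names and counts below are made-up sample data, and the function names are hypothetical. The only assumptions taken from the post are that the group name is the last word of the session name and that host names start with the group name.

```bash
#!/bin/bash
# Sketch of the naming conventions used in the steps above.

# The group is taken to be the last space-separated word of the session name.
groupOfSession() {
    local session=$1
    echo "${session##* }"
}

# A session is treated as stalled when the number of hosts still attached
# differs from the number of hosts in the group.
isStalled() {
    local groupHostsQty=$1 currentHostQty=$2
    (( groupHostsQty != currentHostQty ))
}

# Keep only the host names that start with the group name.
hostsOfGroup() {
    local group=$1 host
    shift
    for host in "$@"; do
        [[ $host == "$group"* ]] && echo "$host"
    done
    return 0
}

session='Multicast Task - R1'            # made-up session name
group=$(groupOfSession "$session")
echo "group: $group"
hostsOfGroup "$group" R1-P1 R1-P2 R2-P1  # made-up host names
if isStalled 10 2; then
    echo "session stalled: recover the remaining hosts"
fi
```

One thing to watch with a plain prefix match: if one group name is the beginning of another (R1 and R10, for instance), the hosts of both rooms are selected. Matching on the group name plus the separator would probably avoid that, assuming the `-` separator shown in the example names.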
@Tom-Elliott said in Multicast timeout:
You’re at least aware of the IPs which most others will not be, but just trying to think this through.
I searched for a more elegant and universal way to get the hosts’ IPs, but lack of time led me to this crude approach.
But if someone has an idea on how to do it, I would be very pleased to update this.
To be honest, I thought about modifying the init.xz further so that it automatically populates the hostsIP table when it loads. But I have a working solution, and this is not even a priority for us, just something more comfortable; if I had spent more time on it, someone would have come to break my fingers.
Sorry for the long post.
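One way the automatic population of hostsIP might look, purely as a sketch: a helper that turns a reported host name and IPv4 address into an upsert statement. How the client would report these values to the server is left open here. The function name is hypothetical, and the statement assumes that `name` is a primary or unique key of the table.

```bash
#!/bin/bash
# Sketch: build an upsert statement for the hostsIP table.
# Assumes hostsIP(name, ip) with a primary or unique key on name.
hostsIPUpsert() {
    local name=$1 ip=$2
    # Accept only simple host names and dotted IPv4 addresses, so the
    # values can be placed in the SQL string without further escaping.
    [[ $name =~ ^[A-Za-z0-9_-]+$ ]] || return 1
    [[ $ip =~ ^([0-9]{1,3}\.){3}[0-9]{1,3}$ ]] || return 1
    printf "INSERT INTO hostsIP (name, ip) VALUES ('%s', '%s') ON DUPLICATE KEY UPDATE ip = VALUES(ip);\n" "$name" "$ip"
}

hostsIPUpsert 'R1-P1' '192.168.1.11'   # hypothetical values
# On the FOG server the output could be piped into: mysql fog
```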