I’ll reply to myself.
Finally I found that the image had only been partially copied to the storage folder.
So yes, no images were found!
I just had to re-upload a backup to solve this issue.
Sorry for the disturbance.
Hi,
I have a FOG server 1.5.10 on Ubuntu 22.04.
We can dump and load images on/from this server without any issue, but since Friday we have been trying to load an old image (not loaded for 2 years now) and we get this message:
no image file(s) found that would match the partition(s) to be restored etc…
In the image folder, this is what we have in the config files:
fstypes:
/dev/sda1 ntfs
fixed_size_partitions:
empty
partitions:
label: dos
label-id: 0x33799606
device: /dev/sda
unit: sectors
/dev/sda1 : start= 2048, size= 1000212992, type=7, bootable
Minimum partitions:
cat d1.minimum.partitions
label: dos
label-id: 0x33799606
device: /dev/sda
unit: sectors
/dev/sda1 : start= 2048, size= 402749760, type=7, bootable
We tried to load the image using every image type setting in the FOG server web interface.
The error message never changed, except with RAW, where the machine rebooted without any visible error message, before the image load of course.
We tried to load it on several physical machines with disks from 500 GB to 2 TB.
We also tried on a VM with a 1 TB disk and it did not help either.
I’ll take any help or advice on this case.
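As a server-side sanity check, a small script can compare the partitions listed in d1.partitions against the d1pN.img files actually present in the image folder. This is only a minimal sketch: the `check_image_files` function name is mine, and it assumes the single-disk /dev/sda layout shown in the config files above.

```shell
#!/bin/bash
# Sketch: verify that every partition listed in d1.partitions has a
# matching d1pN.img file in the image directory (single /dev/sda disk).
check_image_files() {
    local dir="$1" missing=0 part n
    while read -r part; do
        n="${part##*sda}"                     # partition number, e.g. "1"
        if [[ ! -f "$dir/d1p${n}.img" ]]; then
            echo "missing: d1p${n}.img"
            missing=1
        fi
    done < <(grep -o '^/dev/sda[0-9]*' "$dir/d1.partitions")
    return $missing
}
```

Running it against the image folder (e.g. `check_image_files /images/myImage`) would immediately show a partially copied image as a missing d1pN.img file.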
Proc.
@Joe-Gill
I work in a particular environment: a school.
Everything we install is used for a very short period (no more than 2 weeks), so I don’t need to Sysprep, for instance.
It’s nearly the same for the software pre-installed in our images: it is usually free, and for the programs under a trial license it depends from one piece of software to another.
So I’m not using FOG in a “real” production environment.
Hi @Tom-Elliott,
OK, thanks for your answers.
Is it possible to check for replication maybe once a day, at night, instead of 24/7 as it is running right now?
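In the meantime, one possible workaround (a sketch, not something I have verified everywhere) is to stop the replication service during the day and start it at night via cron. The unit name FOGImageReplicator is what FOG 1.5.x installs on Ubuntu; verify it with `systemctl list-units 'FOG*'` before relying on this.

```shell
# /etc/cron.d/fog-replication-window (hypothetical file name)
# Start image replication at 22:00, stop it at 06:00.
0 22 * * * root /usr/bin/systemctl start FOGImageReplicator
0 6  * * * root /usr/bin/systemctl stop  FOGImageReplicator
```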
Proc.
First, let me set the scene:
I have 2 FOG servers: one on the main site, the other on a branch site.
Both are Ubuntu 22.04.
Each FOG server’s storage node is master on its own site.
On the main site, the branch storage node has been added to the default group, but not as master. This way we should have one-way replication only.
I can see both storage nodes on the main site FOG server dashboard.
This strange behaviour appeared:
On the main site FOG server (which also holds the master storage node) I detected massive bandwidth utilisation (input and output).
The output does not surprise me, as the master should replicate the images not already synced to the branch site.
But the input I don’t really understand, as the master should not get anything from the branch other than the image list.
So I tailed the replicator log and this is what I got:
[06-21-24 9:00:34 am] * Starting Image Replication.
[06-21-24 9:00:34 am] * We are group ID: 1. We are group name: default
[06-21-24 9:00:34 am] * We are node ID: 1. We are node name: DefaultMember
[06-21-24 9:00:34 am] * Attempting to perform Group -> Group image replication.
[06-21-24 9:00:34 am] | Replicating postdownloadscripts
[06-21-24 9:00:34 am] * Found Image to transfer to 1 node
[06-21-24 9:00:34 am] | File Name: postdownloadscripts
[06-21-24 9:00:35 am] # postdownloadscripts: No need to sync fog.postdownload (FOG-01)
[06-21-24 9:00:36 am] * All files synced for this item.
[06-21-24 9:00:36 am] | Replicating postinitscripts
[06-21-24 9:00:36 am] * Found Image to transfer to 1 node
[06-21-24 9:00:36 am] | File Name: dev/postinitscripts
[06-21-24 9:00:37 am] # dev/postinitscripts: No need to sync fog.postinit (FOG-01)
[06-21-24 9:00:37 am] * All files synced for this item.
[06-21-24 9:00:37 am] * Not syncing Image between groups
[06-21-24 9:00:37 am] | Image Name: Android Studio
[06-21-24 9:00:37 am] | There are no other members to sync to.
[06-21-24 9:00:37 am] * Not syncing Image between groups
....
and it does this for all the images we host on the master node
....
[06-21-24 9:00:40 am] | Image Name: Android Studio
[06-21-24 9:00:42 am] # Android Studio: No need to sync d1.fixed_size_partitions (FOG-01)
[06-21-24 9:00:43 am] # Android Studio: No need to sync d1.mbr (FOG-01)
[06-21-24 9:00:45 am] # Android Studio: No need to sync d1.minimum.partitions (FOG-01)
[06-21-24 9:00:45 am] # Android Studio: No need to sync d1.original.fstypes (FOG-01)
[06-21-24 9:00:45 am] # Android Studio: No need to sync d1.original.swapuuids (FOG-01)
[06-21-24 9:00:46 am] # Android Studio: No need to sync d1.partitions (FOG-01)
[06-21-24 9:00:48 am] # Android Studio: No need to sync d1p1.img (FOG-01)
[06-21-24 9:00:50 am] # Android Studio: No need to sync d1p2.img (FOG-01)
[06-21-24 9:00:50 am] * All files synced for this item.
...
and it does this for all the images we host on the master node, then loops back to the first part endlessly.
In about 1 hour I get a log file of nearly 10,000 lines for only about 90 images, where I expected fewer than 1,000.
Is it normal that it loops over and over?
This is how I configured the branch storage node on the main server:
Many thanks in advance for any help.
Hi,
I have been using FOG for years now (since April 2019) and nothing has changed since.
We image W10, W11, Windows Server 2019 (I don’t remember if we do more recent ones), and a bunch of Linux distributions (all BIOS and UEFI).
And we have not changed anything since 2019.
Oh, I did not see the second part of your message regarding software; for that I don’t know.
I quickly read about your issue; could this be related to your problem?
https://forums.fogproject.org/topic/10006/ubuntu-is-fog-s-enemy?page=1
@Tom-Elliott said in Multicast timeout:
You created a table in FOG called hostsIP?
Yes, in our environment hosts have IP reservations, and I did not find in the FOG tables the last IP used by each host (maybe I missed something), so I created a table associating IPs to host names.
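For reference, the table is nothing fancy. A hypothetical DDL matching the `SELECT ip FROM hostsIP WHERE name = ...` query the script in this thread performs could look like this (the column types are my guess; this table is not part of FOG itself):

```shell
# Hypothetical schema for the custom hostsIP table (adapt as needed).
mysql fog <<'EOF'
CREATE TABLE IF NOT EXISTS hostsIP (
  name VARCHAR(255) NOT NULL PRIMARY KEY,  -- FOG host name
  ip   VARCHAR(45)  NOT NULL               -- reserved IP address
);
EOF
```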
@Tom-Elliott said in Multicast timeout:
Note: This isn’t a bad thing, I’m just trying to reference the information.
No worries.
@Tom-Elliott said in Multicast timeout:
The way I understand this, you’re looking at the multicastSessions table, getting a group, and getting the name of the hosts.
@Tom-Elliott said in Multicast timeout:
You’re at least aware of the IPs which most others will not be, but just trying to think this through.
I searched for a more elegant and universal way to get host IPs, but lack of time led me to this crude approach.
But if someone has an idea of how to do it, I would be very pleased to update this.
To be honest, I thought about modifying the init.xz further so it would populate the hostsIP table automatically when it loads, but I have a working solution and this is not even a priority for us, just a comfort feature; if I had spent more time on it, someone would have come to break my fingers.
Sorry for the long post.
Hi,
I dealt with my problem; now crashed multicast clients can be stopped and relaunched automatically once the session has finished.
The fact that every machine in our environment uses an IP reservation made things easy.
I only handle crashed clients, but it would be easy to add a WOL or a client restart; however, I consider that if the WOL or client restart did not work the first time, there is no reason it would work better 1 hour later without any intervention.
If someone is interested in such a feature, here is the script. It could be triggered every hour; this way, whenever the multicast session finishes, the client is reinstalled.
I have only used it in a test environment for now, so there could be some issues. If you find any, or have any idea to make it better, please share and I’ll correct it:
#!/bin/bash
# Info:
# Works on Linux machines (tested on Ubuntu 22.04)
# Requirements:
#  - sshpass installed
#  - a mysql table hostsIP mapping host names to IPs
#  - a modified init.xz containing a modified shadow file with a root password set inside
#    (more information here: https://wiki.fogproject.org/wiki/index.php?title=Modifying_the_Init_Image)
IFS=$'\n'
userToken='userToken'
fogToken='serverToken'
fogserver='fogServerIP'
rootPass='password set in the shadow file of init.xz'

multicastSessions=$(curl -X 'GET' -s -L -H "fog-user-token: $userToken" -H "fog-api-token: $fogToken" http://$fogserver/fog/multicastsession/current | jq -r '.multicastsessions[].name')
for session in $multicastSessions; do
    hostArray=()
    taskID=''    # reset for each session, otherwise values leak from the previous iteration
    imageID=''
    group=$(echo "$session" | rev | cut -d' ' -f1 | rev)
    groupHostsQty=$(curl -X 'GET' -s -L -H "fog-user-token: $userToken" -H "fog-api-token: $fogToken" http://$fogserver/fog/group/list | jq '.groups[] | select(.name == "'${group}'") | .hostcount')
    currentHostQty=$(curl -X 'GET' -s -L -H "fog-user-token: $userToken" -H "fog-api-token: $fogToken" http://$fogserver/fog/task/current | jq '.tasks[] | select(.name == "'${session}'") | .host.name' | wc -l)
    if [[ $groupHostsQty != $currentHostQty ]]; then
        hosts=$(curl -X 'GET' -s -L -H "fog-user-token: $userToken" -H "fog-api-token: $fogToken" http://$fogserver/fog/task/current | jq -r '.tasks[] | select(.host.name | startswith("'${group}'")) | .host.name')
        for host in $hosts; do
            hostIP=$(mysql fog -Bse "SELECT ip FROM hostsIP WHERE name = '$host'")
            hostID=$(curl -X 'GET' -s -L -H "fog-user-token: $userToken" -H "fog-api-token: $fogToken" http://$fogserver/fog/task/current | jq -r '.tasks[] | select(.host.name == "'${host}'") | .host.id')
            if [[ -z $taskID ]]; then
                taskID=$(curl -X 'GET' -s -L -H "fog-user-token: $userToken" -H "fog-api-token: $fogToken" http://$fogserver/fog/task/current | jq -r '.tasks[] | select(.host.name == "'${host}'") | .id')
            fi
            if [[ -z $imageID ]]; then
                imageID=$(curl -X 'GET' -s -L -H "fog-user-token: $userToken" -H "fog-api-token: $fogToken" http://$fogserver/fog/task/current | jq -r '.tasks[] | select(.host.name == "'${host}'") | .image.id')
            fi
            hostArray=("${hostArray[@]}" "$hostIP,$hostID")
        done
        echo "Task: $taskID"
        # Cancel the stuck multicast task, then wait until the session disappears.
        curl -X "DELETE" -s -L -H "fog-user-token: $userToken" -H "fog-api-token: $fogToken" http://$fogserver/fog/task/$taskID/cancel
        echo -n "Multicast session still active"
        while sessionActive=$(curl -X 'GET' -s -L -H "fog-user-token: $userToken" -H "fog-api-token: $fogToken" http://$fogserver/fog/multicastsession/current | jq -r '.multicastsessions[] | .name | select(contains("'${group}'"))'); [[ -n $sessionActive ]]; do
            echo -n "."
            sleep 1
        done
        echo
        for host in "${hostArray[@]}"; do
            hostIP=$(echo "$host" | cut -d',' -f1)
            hostID=$(echo "$host" | cut -d',' -f2)
            echo "Relaunching job for: $hostID ($hostIP)"
            # hostID must sit outside the single quotes so the shell expands it in the JSON
            curl -X 'POST' -s -L -H "fog-user-token: $userToken" -H "fog-api-token: $fogToken" -d '{"taskTypeID": 1, "imageID": "'${imageID}'", "taskName": "'${hostID}' : deploy", "shutdown": false, "debug": false, "deploySnapins": false, "wol": false}' http://$fogserver/fog/host/$hostID/task
            echo "Rebooting PC"
            sshpass -p "$rootPass" ssh -o ConnectTimeout=5 -o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null root@$hostIP 'nohup reboot -f > /dev/null 2>&1 & exit'
        done
    fi
done
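For completeness, a hypothetical cron entry to trigger the script hourly (the path and file names are assumptions, adapt to wherever you install it):

```shell
# /etc/cron.d/fog-multicast-rescue (hypothetical path and file name)
0 * * * * root /usr/local/sbin/fog-multicast-rescue.sh >> /var/log/fog-multicast-rescue.log 2>&1
```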
Bye
Proc.
Thanks Tom!!
Sorry for this post, I feel so ashamed of this noob mistake.
So it’s solved!
Hi,
Assuming a few things:
Running the following command should kill the client task, or am I wrong?
curl -s -L -H "fog-user-token: [token]" -H "fog-api-token: [token]" http://10.0.0.1/fog/task/3001/[cancel|remove|delete]
Whatever option I choose (cancel, remove or delete), this API request results in a JSON response which is the same as the one I get when requesting the task info:
curl -s -L -H "fog-user-token: [token]" -H "fog-api-token: [token]" http://10.0.0.1/fog/task/3001
Can anybody tell me what I’m doing wrong?
I use this document as my FOG API reference: https://news.fogproject.org/simplified-api-documentation/
Thanks in advance for any help.
Proc.
@Tom-Elliott
So I have been able to modify the init.xz.
Now I can remotely access the client, reboot it, etc.
I’ll try to make a script to reboot it and move it to a standalone deploy if the multicast crashed; this way everything will be automated.
Thanks for your time,
Proc.
@Tom-Elliott Yes, I saw it 5 minutes after I posted my question.
The SSH FOS client would be a solution for me (if I understood well, as my English is far from perfect).
I may have found a solution with this:
Can you confirm it works?
Sorry if I’m seeming rude, but we cannot possibly fix all potentialities.
Ahaha, no worries, I do not take it personally, and you took some of your time to answer, so I would be the rude one if I could not understand your position.
Anyway, I have never seen this option before:
Could it be my solution?
To put things in context: I work in an education center and often have to install a bunch of PCs on Friday afternoons.
Some of the images we load are heavy and our NAS core has some limitations.
Even with FOG high performance, when I load 6 or 7 classrooms at the same time the loading time can be quite long, and I can wait part of the night for it.
It happens that a multicast machine does not attach properly or crashes during loading.
If there is a crash and the machine restarts, it’s usually not really an issue: I create a unicast session remotely (on Saturday, usually) for that PC, and at restart it loads its image.
But if the machine is stuck on the partclone blue screen I can’t do anything, so if the machine itself could check for the presence of the multicast session and restart if it does not exist anymore, it would be a salvation.
There wouldn’t be a nice way to do this from the web UI
It would be even better, but I have not heard of such a solution.
If anyone knows of one, please tell us.
@Tom-Elliott Do you know if the client can be edited?
I mean, a solution for me would be to cron a job on the client to check if the multicast session is still available on the server. If not, send a kill to force the machine to reboot.
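As a sketch of that idea: the `session_gone` helper name is my own, the tokens and server are placeholders, and the endpoint is the /fog/multicastsession/current one used elsewhere in this thread.

```shell
#!/bin/bash
# session_gone: reads the list of active multicast session names on stdin
# and succeeds (exit 0) when the watched session is no longer listed.
session_gone() {
    ! grep -qF "$1"
}

# Hypothetical cron job on the client: reboot once the session vanishes.
# curl -s -L -H "fog-user-token: $userToken" -H "fog-api-token: $fogToken" \
#      "http://$fogserver/fog/multicastsession/current" \
#   | jq -r '.multicastsessions[].name' | session_gone "mySession" && reboot -f
```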
Hi,
Does anybody know if it’s possible to set a timeout on a partclone multicast client?
I mean, if a multicast session starts and a client connects to the session too late, it won’t get any data and will be stuck on the partclone blue screen even when the multicast ends.
Is there any way to make it reboot if nothing happens for it after 10 minutes, for instance?
Thanks
Hi, I know this topic is quite old, but we are experiencing exactly the same issue with the Optiplex 3000 series.
On any other series we deploy at between 7-8 GB/min and 15 GB/min (depending on hard drives, cables, unicast/multicast, etc.).
On the 3000 series, we never go above 2.4 GB/min in multicast and 4-5 GB/min in unicast.
It does not help a lot, but you are not alone.
So I found it by myself.
The issue was that /etc/exports was still referencing the original /images folder instead of /cloning/fog.
So this part is solved.
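For anyone hitting the same thing after moving the image store: the two NFS export lines just have to point at the new location. A sketch of the fixed /etc/exports (the export options shown are the FOG 1.5.x defaults on my install; verify yours before copying):

```shell
# /etc/exports, after changing /images to the relocated /cloning/fog:
# /cloning/fog *(ro,sync,no_wdelay,no_subtree_check,insecure_locks,no_root_squash,insecure,fsid=0)
# /cloning/fog/dev *(rw,async,no_wdelay,no_subtree_check,no_root_squash,insecure,fsid=1)
# Then re-export without restarting the NFS server:
exportfs -ra
```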