Imaging Jobs Freezing
-
@Sebastian-Roth Below are the screenshots from fdisk -l.
I will be running the tests shortly with old init.xz and will get back to you.
Thanks,
Anthony
-
@atarone So the partition table on sda (first HD) looks pretty straightforward. I don’t think there is much that can go wrong. Why did you run fdisk -l /dev/sda1? Just out of curiosity or on purpose? There shouldn’t be a partition table within the first partition, I reckon. But maybe that’s just a coincidence?!
Can you please post a picture (or text output) of running the following commands on your FOG server:
ls -al /images/Vertix
cat /images/Vertix/d1.partitions
cat /images/Vertix/d1.minimum.partitions
cat /images/Vertix/d1.fixed_size_partitions
-
@Sebastian-Roth The deployments started freezing again with the new init.xz and the old one.
Below are the screen captures you requested:
ls -al Vertix
total 9526116
drwxrwxrwx 2 fog root 4096 Jun 15 11:07 .
drwxrwxrwx 11 fog root 4096 Jun 15 11:07 ..
-rwxrwxrwx 1 root root 1 Jun 15 10:21 d1.fixed_size_partitions
-rwxrwxrwx 1 root root 512 Jun 15 10:21 d1.mbr
-rwxrwxrwx 1 root root 132 Jun 15 10:21 d1.minimum.partitions
-rwxrwxrwx 1 root root 15 Jun 15 10:21 d1.original.fstypes
-rwxrwxrwx 1 root root 0 Jun 15 10:21 d1.original.swapuuids
-rwxrwxrwx 1 root root 9754708186 Jun 15 11:07 d1p1.img
-rwxrwxrwx 1 root root 132 Jun 15 10:21 d1.partitions
cat /images/Vertix/d1.partitions
label: dos
label-id: 0x708be90c
device: /dev/sda
unit: sectors
/dev/sda1 : start= 63, size= 78134424, type=7, bootable
cat /images/Vertix/d1.minimum.partitions
label: dos
label-id: 0x708be90c
device: /dev/sda
unit: sectors
/dev/sda1 : start= 63, size= 24918072, type=7, bootable
cat /images/Vertix/d1.fixed_size_partitions
fogadmin@INC-FOG01:~$
I think I have found part of it given this output. Please let me know.
Thanks,
Anthony
-
@atarone said in Imaging Jobs Freezing:
The deployments started freezing again with the new init.xz and the old one.
Are you able to switch between virtual terminal one and two (as described earlier) when deployment freezes?
The numbers in the outputs you posted look pretty ok to me. I can’t see where things are going wrong here yet.
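For a quick sanity check of those numbers, the sector counts can be converted to bytes. This is only a minimal sketch: it assumes 512-byte sectors (the dump just says "unit: sectors"), and sectors_to_bytes is a made-up helper name, not part of FOG.

```shell
#!/bin/sh
# Convert a sector count to bytes.
# Assumption: 512-byte sectors; the partition dump does not state the sector size.
sectors_to_bytes() {
  echo $(( $1 * 512 ))
}

# Full partition size from d1.partitions
sectors_to_bytes 78134424   # prints 40004825088

# Shrunk size from d1.minimum.partitions
sectors_to_bytes 24918072   # prints 12758052864
```

Under that assumption the full partition is about 40 GB and the minimum size about 12.8 GB.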
I think I have found part of it given this output. Please let me know.
What do you mean by that?
-
@Sebastian-Roth “I think I have found part of it given this output. Please let me know.” I thought that the last output capture being blank might have been a problem, but you said it all looks good to you. I am unable to switch between the VTYs.
Thanks,
Anthony
-
@atarone said in Imaging Jobs Freezing:
I am unable to switch between the VTYs.
So the client really seems to freeze completely. By the way, what happens if you hit Caps Lock? Does the LED on the keyboard change state when it hangs? Just want to make sure…
-
As well, could you please try this: Boot the client into the deploy task using the new init.xz as normal. As FOG starts to prepare the disk for imaging (before the blue partclone screen), switch to VT2 (Ctrl+Alt+F2) and run this command:
tail -f /var/log/messages
Just let it sit there. You should see all (kernel) messages coming in. Maybe this will give us a hint on what’s causing the hang. Please take a picture and upload it here.
Unfortunately you can’t see when it freezes while you are in VT2, but you can run a ping from another machine to check if the client is still alive…
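The ping check from another machine could be scripted roughly like this. It is a sketch, not a FOG tool: CLIENT_IP is a placeholder address, is_alive and watch_host are made-up names, and the -W timeout option assumes the Linux iputils version of ping.

```shell
#!/bin/sh
# Placeholder address (assumption): replace with the IP of the client being imaged.
CLIENT_IP="${CLIENT_IP:-192.0.2.10}"

# Succeeds if the host answers a single ping within 2 seconds.
# Assumption: Linux iputils ping, where -W is the reply timeout in seconds.
is_alive() {
  ping -c 1 -W 2 "$1" >/dev/null 2>&1
}

# Poll the host once per second until the check fails,
# then print the time at which it stopped answering.
# The optional second argument names the check command (default: is_alive).
watch_host() {
  host="$1"
  check="${2:-is_alive}"
  while "$check" "$host"; do
    sleep 1
  done
  echo "$host stopped answering at $(date '+%H:%M:%S')"
}

# Usage (runs until the client stops replying):
# watch_host "$CLIENT_IP"
```

Start it just before the deploy task begins; the printed time can then be compared with the last kernel message visible on VT2.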
-
@Sebastian-Roth Do you have a VT2 version for init_32.xz? I can only use the NCRs, and they are 32-bit.
-
@atarone Yes, no problem. Find a fresh version of both 32 bit and 64 bit in the same place.
@Tom-Elliott What do you think about adding a virtual terminal to the official initrds? Would this use too much resources on the clients for no reason?
-
@Sebastian-Roth I don’t think it would. I think it’s just the access to those terminals can be rather limited, which is all the more reason I added the openssh utils. Anybody can remote in much easier than have a device that’s having issues right next to them the whole time.
Using the openssh elements also allows us (devs and whatnot) to remote in over SSH and look at the machine too.
Pair the postinit scripts with a means to associate the root password and you don’t even, fully, need debug mode to test things (though I’ll admit you’d be strained for time to get information).
-
@Tom-Elliott You are absolutely right about SSH being the more advanced method to get access to such a client. But in a case like this, when the network connection is lost or the client actually freezes, a virtual terminal is quite handy, I suppose. On the other hand, I see that we have never had such a case before. So maybe just leave it.
-
@Sebastian-Roth Thanks! What is the key combination to switch VTY lines? I am unable to switch them using CTRL-ALT-Fx or any other combination of those keys.
Thanks!
-
@atarone You mean you aren’t able to switch even when the imaging did not freeze yet? Yeah, Ctrl+Alt+Fx is the key…
-
@Sebastian-Roth Correct. I am not able to switch when using CTRL-ALT-Fx.
-
@atarone Before or after it freezes??
-
Yes, before it freezes I am not able to switch. On another interesting note, I tried updating to the latest client and I get the error message below:
Could this be related?
Thanks,
Anthony
-
@atarone This message is typically generated during an upload if/when someone mucks about with the Linux user called fog. This is a service account that is owned by the FOG backend and should not be used for general system administration. If you happened to change this password, or the installer got this account out of sync, you will see the above error. I’m not saying that is your case, because what you have is a bit unique.
But I would start by inspecting the /opt/fog/.fogsettings file for the password, then make sure in the web GUI that the storage node password matches. If they all match, connect to the FOG server using an FTP client on a Windows box. Use the user ID (fog) and the password found in the .fogsettings file. Confirm that you can log in.
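A quick way to pull the relevant line out of that file on the FOG server is a simple grep. This is a rough sketch: show_password_lines is a made-up helper, it only prints lines containing the word "password" rather than assuming a specific key name, and reading the file may need root.

```shell
#!/bin/sh
# Print the lines of a FOG settings file that mention a password.
# Defaults to the path named above; pass another path as the first argument.
show_password_lines() {
  grep -i 'password' "${1:-/opt/fog/.fogsettings}"
}

# Usage on the FOG server:
# show_password_lines
```

Compare what it prints with the storage node password shown in the web GUI.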
-
@george1421 I confirmed the passwords and they do match. I can connect via a Windows FTP client, but when I run the update I still get the error listed below.
Thanks,
Anthony
-
@atarone Well, my last post was way off base. You are not capturing an image, just trying to update the client settings.
There is another place where the fog user account info is hidden: in the fog settings under tftp server. I can’t say for certain why this is there. I’m not saying that is your issue either. It’s just strange that you have your root issue. I don’t think it’s related to updating the client settings.
-
The settings under TFTP Server are from the days of PXE Boot (plain and simple). They were used to define the TFTP server, and the fog username/password were used to upload the latest PXE file when tasking a client machine. This, then, was also used for updating the kernels. I just don’t have a logical way to achieve a more dynamic means of updating these things, so these “settings” are now used when updating kernels.