Imaging Jobs Freezing
-
@atarone said in Imaging Jobs Freezing:
So I updated init.xz with the one created by Sebastian-Roth and now images deploy
This all sounds totally weird! Is this in your small simple test environment? The only thing I did was download the most current init.xz and add this one line to /etc/inittab:
tty2::askfirst:-/bin/bash
From my point of view it’s impossible that this change is actually solving your freezing problem!
but Windows no longer boots after imaging.
Which error do you see? Please post a picture!
It did the capture but returned errors trying to update the database using the username “fog”. I tested that login and it is still the default. What password is it trying to use and could this be the root cause of my deployments freezing?
Check this wiki page about database password. Again I highly doubt that this could cause the freezing!
-
This has been the weirdest issue since day one haha. This is all being done in the production environment. I got the password stuff sorted out. Below is what the target is doing. This occurs anytime FOG turns over booting to the OS.
-
@atarone I am not really sure where this is taking us. Would you mind testing with the provided init.xz file (second virtual terminal) over and over till you run into the freezing issue again. Then see if you can still switch terminals. If freezes don’t occur anymore then you might want to go back to the init.xz you had before and try again. Maybe there was something screwed with that initrd you had. Would be good to rule that out.
I am not sure about your other issue. Looks like deploy is not working properly. Is this a completely new image you uploaded freshly? Could you please run a debug deploy task on this client again (like when you schedule a normal deploy but tick the checkbox for debug). When you get to the shell, run
fdisk -l /dev/sda
and post a clear picture of that here.
-
@Sebastian-Roth Below are the screenshots from the fdisk -l
I will be running the tests shortly with old init.xz and will get back to you.
Thanks,
Anthony
-
@atarone So the partition table on sda (first HD) looks pretty straightforward. I don’t think there is much that can go wrong. Why did you run
fdisk -l /dev/sda1
? Just out of curiosity or by intention? There shouldn’t be a partition table within the first partition I reckon. But maybe that’s just a coincidence?! Can you please post a picture (or text output) when running the following commands on your FOG server:
ls -al /images/Vertix
cat /images/Vertix/d1.partitions
cat /images/Vertix/d1.minimum.partitions
cat /images/Vertix/d1.fixed_size_partitions
-
@Sebastian-Roth The deployments started freezing again with the new init.xz and the old one.
Below are the screen captures you requested:
ls -al Vertix
total 9526116
drwxrwxrwx 2 fog root 4096 Jun 15 11:07 .
drwxrwxrwx 11 fog root 4096 Jun 15 11:07 ..
-rwxrwxrwx 1 root root 1 Jun 15 10:21 d1.fixed_size_partitions
-rwxrwxrwx 1 root root 512 Jun 15 10:21 d1.mbr
-rwxrwxrwx 1 root root 132 Jun 15 10:21 d1.minimum.partitions
-rwxrwxrwx 1 root root 15 Jun 15 10:21 d1.original.fstypes
-rwxrwxrwx 1 root root 0 Jun 15 10:21 d1.original.swapuuids
-rwxrwxrwx 1 root root 9754708186 Jun 15 11:07 d1p1.img
-rwxrwxrwx 1 root root 132 Jun 15 10:21 d1.partitions
cat /images/Vertix/d1.partitions
label: dos
label-id: 0x708be90c
device: /dev/sda
unit: sectors
/dev/sda1 : start= 63, size= 78134424, type=7, bootable
cat /images/Vertix/d1.minimum.partitions
label: dos
label-id: 0x708be90c
device: /dev/sda
unit: sectors
/dev/sda1 : start= 63, size= 24918072, type=7, bootable
cat /images/Vertix/d1.fixed_size_partitions
fogadmin@INC-FOG01:~$
I think I have found part of it given this output. Please let me know.
Thanks,
Anthony
-
@atarone said in Imaging Jobs Freezing:
The deployments started freezing again with the new init.xz and the old one.
Are you able to switch between virtual terminal one and two (as described earlier) when deployment freezes?
The numbers in the outputs you posted look pretty ok to me. I can’t see where things are going wrong here yet.
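To put those numbers side by side, the size= fields can be pulled out of the two sfdisk-style dumps and converted from sectors to GiB. This is only a rough sketch: the function name partition_sizes is made up for illustration, and it assumes 512-byte sectors, which the dumps do not state explicitly.

```shell
#!/bin/sh
# Print every partition found in an sfdisk-style dump ($1) with its size
# in sectors and in GiB.
# ASSUMPTION: 512-byte sectors; the dump itself does not say so.
partition_sizes() {
    awk -F'size=' '/size=/ {
        split($2, rest, ",")      # rest[1] is the (space-padded) sector count
        sectors = rest[1] + 0     # force numeric, drops the padding
        split($1, dev, " ")       # dev[1] is the device name, e.g. /dev/sda1
        printf "%s %d sectors %.2f GiB\n", dev[1], sectors, sectors * 512 / (1024 * 1024 * 1024)
    }' "$1"
}

# Example:
#   partition_sizes /images/Vertix/d1.partitions
#   partition_sizes /images/Vertix/d1.minimum.partitions
```

Under that sector-size assumption, the full layout (78134424 sectors) works out to roughly 37.26 GiB and the minimum layout (24918072 sectors) to roughly 11.88 GiB, so the shrunken partition fits comfortably inside the original one.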
I think I have found part of it given this output. Please let me know.
What do you mean by that?
-
@Sebastian-Roth “I think I have found part of it given this output. Please let me know.” I thought that the last output capture being blank might have been a problem, but you said all looks good to you. I am unable to switch between the VTYs.
Thanks,
Anthony
-
@atarone said in Imaging Jobs Freezing:
I am unable to switch between the VTYs.
So the client really seems to fully freeze. What if you hit Caps Lock, by the way? Does the LED on the keyboard change state when it hangs? Just want to make sure…
-
As well, could you please try this: Boot the client into deploy task using the new init.xz as normal. As FOG starts to prepare the disk for imaging (before the blue partclone screen) switch to VT2 (Ctrl+Alt+F2) and run this command:
tail -f /var/log/messages
Just let it sit there. You should see all (kernel) messages coming in. Maybe this will give us a hint on what’s causing the hang. Please take a picture and upload here. Unfortunately you can’t see when it freezes while you are in VT2, but you can run a ping from another machine to check if the client is still alive…
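The ping check from a second machine could look something like the sketch below. It assumes a Linux box with iputils ping (where -W is a timeout in seconds); probe_once is a hypothetical helper name and 192.168.1.50 is only a placeholder for the client's address.

```shell
#!/bin/sh
# Send a single ping to the host given as $1 and print a timestamped result,
# so the time of the last reply before a freeze can be read off the output.
# ASSUMPTION: Linux iputils ping; -W is the reply timeout in seconds there.
probe_once() {
    if ping -c 1 -W 2 "$1" >/dev/null 2>&1; then
        echo "$(date '+%H:%M:%S') $1 alive"
    else
        echo "$(date '+%H:%M:%S') $1 NO REPLY"
    fi
}

# Example (placeholder address, stop with Ctrl+C):
#   while true; do probe_once 192.168.1.50; sleep 1; done
```

If the replies stop at about the same moment the screen stops updating, the whole client is hung; if they keep coming, only the display or the imaging process is stuck.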
-
@Sebastian-Roth Do you have a VT2 version for init_32.xz? I can only use the NCR and they are 32 bit.
-
@atarone Yes, no problem. Find a fresh version of both 32 bit and 64 bit in the same place.
@Tom-Elliott What do you think about adding a virtual terminal to the official initrds? Would this use too many resources on the clients for no reason?
-
@Sebastian-Roth I don’t think it would. I think it’s just the access to those terminals can be rather limited, which is all the more reason I added the openssh utils. Anybody can remote in much easier than have a device that’s having issues right next to them the whole time.
Using the openssh elements of it all allows us (devs and what not) to remote in and ssh in to see the machine too.
Pair the postinit scripts with a means to associate the root password and you don’t even, fully, need debug mode to test things (though I’ll admit you’d be strained for time to get information).
-
@Tom-Elliott You are absolutely right about SSH being the more advanced method to get access to such a client. But in this case when network connection is lost or it actually freezes it’s quite handy I suppose. On the other hand, I see that we never ever had such a case yet. So maybe just leave it.
-
@Sebastian-Roth Thanks! What is the key combination to switch VTY lines? I am unable to switch them using CTRL-ALT-Fx or any other combination of those keys.
Thanks!
-
@atarone You mean you aren’t able to switch even when the imaging did not freeze yet? Yeah, Ctrl+Alt+Fx is the key…
-
@Sebastian-Roth Correct. I am not able to switch when using CTRL-ALT-Fx.
-
@atarone Before or after it freezes??
-
Yes, before it freezes I am not able to switch. On another interesting note, I tried updating to the latest client and I get the error message below:
Could this be related?
Thanks,
Anthony
-
@atarone This message is typically generated during an upload if/when someone mucks about with the Linux user called fog. This is a service account that is owned by the FOG backend and should not be used for general system administration. If you happened to change this password or the installer got this account out of sync you will see the above error. I’m not saying that is your case, because what you have is a bit unique.
But I would start by inspecting the /opt/fog/.fogsettings file for the password, then make sure in the web GUI that the storage node password matches. If they all do, then connect to the FOG server using an FTP client on a Windows box. Use the user ID (fog) and password found in the .fogsettings file. Confirm that you can log in.
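If a Linux shell is handier than a Windows FTP client, the same check could be sketched roughly as follows. Treat it as an illustration only: fog_password and check_ftp_login are made-up helper names, the script assumes the settings file stores the password on a line starting with password= (optionally quoted), and it assumes curl is installed.

```shell
#!/bin/sh
# Print the value of the first password= line in a .fogsettings-style file ($1),
# with one pair of surrounding quotes removed.
# ASSUMPTION: the line looks like password='secret' or password="secret".
fog_password() {
    sed -n 's/^password=//p' "$1" | head -n 1 | sed -e "s/^[\"']//" -e "s/[\"']\$//"
}

# Try an FTP login on host $1 as user $2 with password $3.
# curl lists the FTP root directory and exits non-zero if the login is refused.
check_ftp_login() {
    curl --silent --show-error --user "$2:$3" "ftp://$1/" >/dev/null
}

# Example, run on the FOG server itself:
#   pw=$(fog_password /opt/fog/.fogsettings)
#   check_ftp_login localhost fog "$pw" && echo "FTP login OK" || echo "FTP login FAILED"
```

A failed login here points at the fog account password being out of sync, which is then the thing to fix before comparing it with the storage node password in the web GUI.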