FOG will not recognise valid image
-
Server
- FOG Version: 1.3.5
- OS: Ubuntu
Client
- Service Version:
- OS:
Description
Last night a multicast deployment to 26 clients (multi-disk, non-resizable; previous tests on groups of 1, 2, 4 and 8 clients went well) got stuck halfway through the first disk on a particular block. Ultimately, the server became unreachable over the network, the udp-sender process could not be killed (even forcibly), and the web portal was inaccessible. To make the server usable again we gracefully rebooted it.
Looking in /var/logs I cannot find the logs for that particular multicast deployment, but the logs for the previous deployments and for the unsuccessful subsequent ones are present.
The subsequent multicast attempts failed because FOG would not recognize the ~60GB image. In the images tab the size on client was reported as 0 and it said no valid data.
The image was successfully deployed to all of the clients manually using PXE Clonezilla. While that happened I installed FOG on a new server, imported the image definition via CSV, and copied the image files over to the new /images/ directory on the new server, but the new instance of FOG does not recognize the image either (same symptoms as above).
At this point I am looking for assistance in figuring out why FOG will not recognize an image that is valid from Clonezilla’s standpoint, and for any possible way to avoid having to recapture the image.
-
Let’s focus on the new FOG server and the images you copied over.
The images actually have two parts to them: the raw files in the /images/<image_name> directory, and the metadata in the database.
I assume you copied the data files over, and then exported the image definitions and imported them into the new FOG server?
If so, then let’s get some outputs:
- Post a screen shot of the image definition for the image in question from the fog management gui
- Post the output of the following command (adjusted for the proper image name)
ls -la /images/<image_name>
- Post the contents of the following file
cat /images/<image_name>/d1.partitions
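The raw-file half of the image can also be sanity-checked with a small script. The sketch below is only an illustration: the function name check_image_dir is hypothetical, and the set of expected files (d1.mbr, d1.partitions and at least one d1pN.img) is an assumption based on the file names that appear in the listing posted in this thread, not an official FOG validation.

```shell
#!/bin/sh
# Hypothetical helper: report which expected files are missing from an
# image directory. The expected names are an assumption taken from the
# directory listing in this thread, not from FOG documentation.
check_image_dir() {
    dir=$1
    missing=0

    # Partition table backup and partition layout for the first disk.
    for f in d1.mbr d1.partitions; do
        if [ ! -f "$dir/$f" ]; then
            echo "missing: $f"
            missing=1
        fi
    done

    # At least one partition image (d1p1.img, d1p2.img, ...) should exist.
    found=0
    for img in "$dir"/d1p*.img; do
        if [ -f "$img" ]; then
            found=1
            break
        fi
    done
    if [ "$found" -eq 0 ]; then
        echo "missing: d1p*.img"
        missing=1
    fi

    # Exit status 0 means every expected file was found.
    return "$missing"
}
```

For example, check_image_dir /images/<image_name> prints nothing and returns 0 when all of the expected files are present.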
-
1:
2:
total 48972612
drwxrwxr-x 4 pederskz pederskz 4096 Apr 20 15:20 .
drwxrwxrwx 3 fog root 4096 Apr 20 15:20 ..
-rwxrwxr-x 1 pederskz pederskz 1048576 Apr 20 01:40 d1.mbr
-rwxrwxr-x 1 pederskz pederskz 283 Apr 20 01:40 d1.original.uuids
-rwxrwxr-x 1 pederskz pederskz 354080437 Apr 20 01:40 d1p1.img
-rwxrwxr-x 1 pederskz pederskz 12913474 Apr 20 01:40 d1p2.img
-rwxrwxr-x 1 pederskz pederskz 7562563 Apr 20 01:40 d1p3.img
-rwxrwxr-x 1 pederskz pederskz 41711471677 Apr 20 01:52 d1p4.img
-rwxrwxr-x 1 pederskz pederskz 792 Apr 20 01:52 d1.partitions
-rwxrwxr-x 1 pederskz pederskz 1048576 Apr 20 01:52 d2.mbr
-rwxrwxr-x 1 pederskz pederskz 47 Apr 20 01:52 d2.original.swapuuids
-rwxrwxr-x 1 pederskz pederskz 195 Apr 20 01:52 d2.original.uuids
-rwxrwxr-x 1 pederskz pederskz 948534 Apr 20 01:52 d2p1.img
-rwxrwxr-x 1 pederskz pederskz 8058820203 Apr 20 01:54 d2p2.img
-rwxrwxr-x 1 pederskz pederskz 532 Apr 20 01:54 d2.partitions
drwxrwxrwx 3 fog root 4096 Apr 20 01:35 dev
drwxrwxrwx 2 fog root 4096 Apr 20 01:35 postdownloadscripts
3:
label: gpt
label-id: 33836BEB-AAD5-499E-8B4D-4EE5FD3FBC00
device: /dev/sda
unit: sectors
first-lba: 34
last-lba: 976773134
/dev/sda1 : start= 2048, size= 921600, type=DE94BBA4-06D1-4D40-A16A-BFD50179D6AC, uuid=BA584409-010B-461D-9060-30A9047953DA, name="Basic data partition"
/dev/sda2 : start= 923648, size= 204800, type=C12A7328-F81F-11D2-BA4B-00A0C93EC93B, uuid=BFB6FCAD-46C9-436A-8CF9-155F4B7F5540, name="EFI system partition"
/dev/sda3 : start= 1128448, size= 32768, type=E3C9E316-0B5C-4DB8-817D-F92DF00215AE, uuid=5B8329A1-AD63-4C73-8425-14CB9D8895DE, name="Microsoft reserved partition"
/dev/sda4 : start= 1161216, size= 975609856, type=EBD0A0A2-B9E5-4433-87C0-68B6B72699C7, uuid=56AFEE5D-2CC8-474A-8EED-9FDACF32731E, name="Basic data partition"
And thanks in advance for the help, george1421.
-
@pederskz on the image definition, can you go into the definition and post the settings? I’m not so concerned with the image size or captured information since this image was imported. Those values are recorded at capture time.
-
Of course:
-
@pederskz OK, two things jump out at me. Your operating system is "other" and your image type is "all disks non resizable". Is this how the image was captured?
-
Yes. The two disks (sda and sdb) are an Ubuntu and Windows 10 image. All 26 machines have the exact same hardware configuration.
-
@pederskz OK, just doing a sanity check on your data. With multi-disk non-resizable images, if even one system has a smaller disk, deployment will fail.
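One way to check the disk-size condition on a target machine is to compare the disk's sector count with the last-lba value recorded in d1.partitions. This is a rough sketch under stated assumptions: the function name disk_fits is hypothetical, the file is assumed to be a GPT sfdisk-style dump like the one posted in this thread, sectors are assumed to be 512 bytes, and the comparison is only a lower bound (it ignores the space the backup GPT needs after last-lba).

```shell
#!/bin/sh
# Hypothetical helper: does a disk with the given sector count reach past
# the last-lba recorded in a d1.partitions file?
#   $1 = path to the partitions file, $2 = disk size in sectors
# Returns 0 if the disk is large enough, 1 if it is too small,
# 2 if no last-lba line was found (e.g. a non-GPT layout).
disk_fits() {
    partfile=$1
    disk_sectors=$2

    # Extract the number after "last-lba:"; works whether the dump is
    # on one line or split across several lines.
    last_lba=$(grep -o 'last-lba: *[0-9]*' "$partfile" | head -n 1 | awk '{print $2}')
    if [ -z "$last_lba" ]; then
        echo "no last-lba found in $partfile" >&2
        return 2
    fi

    # Lower bound only: the disk must have more sectors than last-lba.
    [ "$disk_sectors" -gt "$last_lba" ]
}

# Example (as root, on a target machine; blockdev --getsz reports the
# size in 512-byte sectors):
#   disk_fits /images/<image_name>/d1.partitions "$(blockdev --getsz /dev/sda)"
```

With the last-lba of 976773134 shown earlier in the thread, a disk of 976773168 sectors passes this check and any disk of 976773134 sectors or fewer does not.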
-
Are you still having issues with multicast, or worried about the fact the main page doesn’t “see” the “On client” or Capture dates?
The On Client and Capture date information has no bearing on the imaging capabilities. Both are there as informative elements, not as elements telling you whether or not an image can be used.
-
@george1421 These settings look good except for the file ownership. The files should be owned by fog.root.
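To see which entries have the wrong owner before changing anything, a small helper along these lines could be used. It is a sketch only: the function name list_wrong_owner is hypothetical, and the chown line in the comment assumes that fog.root above means user fog and group root.

```shell
#!/bin/sh
# Hypothetical helper: list every entry under a directory that is NOT
# owned by the given user (a user name or a numeric user ID).
list_wrong_owner() {
    dir=$1
    owner=$2
    find "$dir" ! -user "$owner" -print
}

# Example: show files under the image directory not owned by fog:
#   list_wrong_owner /images/<image_name> fog
#
# Assumed fix (run as root), matching the fog.root ownership suggested above:
#   chown -R fog:root /images/<image_name>
```

In the listing posted earlier, every file in the image directory is owned by pederskz, so each of them would be reported.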
-
@Tom-Elliott There should be a multicast deployment log some place like /opt/fog/logs??
-
@george1421 correct.
-
@Tom-Elliott In short, yes to the multicast, and sorry about the delay in responding.
Multicast is working now (the issues on the old server were quite likely related to an… interesting hack, as well as a switch misconfiguration), but it’s on a machine with a “bad” hardware configuration (both ports plugged into the same network).
All of the PCs are HP EliteDesk 800 G1 TWRs with an extra PCI Intel gigabit NIC installed. The built-in NIC, which shows up as eth0, is connected to a switch that is (practically speaking) never powered on. eth1 is the PCI NIC, which is connected to an always-on network that can reach the FOG server. When the client boots, eth1 predictably fails to get a DHCP address, but when the client says “no link detected on eth0 for 35 seconds, skipping it!” it attempts to start eth0 again.
I tried using the kernel update feature to update to kernel “Kernel - 4.10.10 TomElliott” and tried the new inits you suggested here, but I’m still having the same issue (both as previously described and as @kleanthis described in the previously mentioned thread).
-
@pederskz said in FOG will not recognise valid image:
All of the PCs are HP EliteDesk 800 G1 TWRs with an extra PCI Intel gigabit NIC installed. The built-in NIC, which shows up as eth0, is connected to a switch that is (practically speaking) never powered on. eth1 is the PCI NIC, which is connected to an always-on network that can reach the FOG server. When the client boots, eth1 predictably fails to get a DHCP address, but when the client says “no link detected on eth0 for 35 seconds, skipping it!” it attempts to start eth0 again.
It’s not clear where this error is being thrown. Is this in the iPXE kernel or the FOS Engine kernel? Can you snap a clear picture of the error with a mobile phone and post it here? It will allow us to see the context of the error. Make sure you grab the other messages around the error.
-
@george1421 The fog engine kernel. I can photograph the error in about an hour.
-
@pederskz Would you mind installing FOG 1.4.0-RC-8? A fix was put in place to ensure we don’t look at only a single interface, which seems to be what is happening here.
-
@Tom-Elliott Not at all, I’ll get started on that now.
-
@pederskz What you are seeing was an issue with 1.3.5 stable; it was fixed in 1.4.0 RC2 or RC3.
-
@george1421 https://news.fogproject.org/fog-1-4-0-rc-2/ just for reference