Imaging from Storage node fails



  • Hi, I upgraded to FOG 1.5.6 and have a problem with imaging from the storage node.
    It crashes randomly (see photo).
    On the ‘defaultmember’ it works fine.
    Storage nodes run on Ubuntu 18.04.2 LTS.
    IMG_3253.jpg


  • Senior Developer

    @ErwinBullen said in Imaging from Storage node fails:

    I tried to mount and extract the image, but do not get beyond this error

    Sorry for that. Please try this:

    mkdir -p /mnt/images
    mount -t nfs 192.168.100.32:/images /mnt/images
    zcat /mnt/images/B-Blok-v18-v4/d1p2.img | partclone.ntfs --restore --restore_raw_file --ignore_crc -N -f 1 -s - -O /tmp/d1p2_deployed.img
    

    Hint: Again, make sure you do this on a system where you have enough space in /tmp, or wherever else you dump the extracted image for testing. It may be best to mount an empty extra disk at /mnt/test/ and write the output file there.

    Also, please run ls -alR /images on your storage node and post the output here.
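
A directory listing shows sizes and timestamps but not whether the replicated files are byte-identical. As a complement (not something suggested in the thread), the following is a minimal sketch that compares SHA-256 checksums of two image trees. The function name `compare_image_dirs` is hypothetical, and it assumes the storage node's images are reachable as a local path, for example through the NFS mount at /mnt/images used above.

```shell
#!/bin/sh
# compare_image_dirs DIR_A DIR_B   (hypothetical helper, not part of FOG)
# Prints one line for every file under DIR_A that is missing from DIR_B
# or whose SHA-256 checksum differs from the copy in DIR_B.
compare_image_dirs() {
    ( cd "$1" && find . -type f | sort ) | while IFS= read -r f; do
        f=${f#./}
        if [ ! -f "$2/$f" ]; then
            echo "MISSING: $f"
        else
            a=$(sha256sum < "$1/$f" | cut -d' ' -f1)
            b=$(sha256sum < "$2/$f" | cut -d' ' -f1)
            [ "$a" = "$b" ] || echo "DIFFERS: $f"
        fi
    done
}

# Example (assumed paths: master copy in /images, storage node copy
# mounted at /mnt/images):
# compare_image_dirs /images/B-Blok-v18-v4 /mnt/images/B-Blok-v18-v4
```

No output means every file in the first tree has an identical copy in the second. Hashing large images takes a while, since each file is read in full on both sides.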



  • Isn’t this strange? All images report the same size?
    abf75ba7-8b61-40f4-8070-a730a96d50d0-image.png



  • This is the initial error, which only shows for a second. SN-error.jpg
    It looks like a corruption error, but a local extract works fine (see previous post).



  • I created a storage node on the VMware host where FOG is running, with the same Ubuntu and FOG 1.5.6, and it still crashes, but only when a client connects to the storage node. Any suggestions?
    I tried to mount and extract the image, but do not get beyond this error
    partclone_error.jpg


  • Senior Developer

    @ErwinBullen said in Imaging from Storage node fails:

    Do you know a way to manually extract the image over the network to another machine/folder so I can test the network card of the server?

    You could probably test this by mounting the NFS share from a different system and running the image through partclone like this:

    mkdir -p /mnt/images
    mount -t nfs 192.168.100.32:/images /mnt/images
    zcat /mnt/images/B-Blok-v18-v4/d1p2.img | partclone.restore --ignore_crc -O /tmp/d1p2_deployed.img -N -f 1
    

    Make sure you have enough free space on the destination machine you are running this command from! You also need to have partclone installed on that system.
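
The free-space warning above can be turned into a quick check before starting the extraction. This is only a sketch: `has_free_space` is a hypothetical helper, and the required size has to be supplied by hand, since the thread does not state how large the restored partition will be.

```shell
#!/bin/sh
# has_free_space DIR REQUIRED_KIB   (hypothetical helper)
# Succeeds if the filesystem holding DIR has at least REQUIRED_KIB
# kibibytes available, according to POSIX-format df output.
has_free_space() {
    avail_kib=$(df -Pk "$1" | awk 'NR==2 {print $4}')
    [ "$avail_kib" -ge "$2" ]
}

# Example (the 100 GiB figure is an assumption, adjust it to the size
# of the partition being restored):
# has_free_space /tmp $((100 * 1024 * 1024)) || echo "not enough room in /tmp"
```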



  • @Sebastian-Roth Hi Sebastian. It runs on VMware. I’ll try with another network adapter.
    Do you know a way to manually extract the image over the network to another machine/folder so I can test the network card of the server?


  • Senior Developer

    @ErwinBullen Ok, thanks for the update. So I definitely went down the wrong track assuming this was only on one machine.

    Is it always sda2 failing as seen in the two pictures? Maybe it’s some kind of network (driver) issue?! Just blindly guessing here.



  • @Sebastian-Roth Unfortunately it is not that simple.
    I see the same on every machine I tested (more than 10) and on two different storage nodes (each with a different Ubuntu Desktop version).
    When I disable the storage nodes and use the defaultmember (the initial FOG instance with the SQL DB), the same PCs all get their image perfectly.


  • Senior Developer

    @ErwinBullen That’s interesting. The compressed image file seems to de-compress just fine. Not what I expected.

    The error does not appear all the time. Sometimes (rare) it succeeds

    From my point of view this sounds like a hardware issue. Please do RAM checks (memtest) and see if they find any errors on this particular client!

    I expect the client/host to be the trouble maker here and not the storage node.



  • @Sebastian-Roth Hi, this is the output.
    It looks fine to me.
    P1.jpg
    P2.jpg
    The error does not appear all the time. Sometimes (rare) it succeeds


  • Senior Developer

    @ErwinBullen Hmmm, seems like the compressed image file cannot be read properly. Possibly it’s corrupt? Do you have enough space on your server to do an extraction test?

    cd /images/B-Blok-v18-v4
    file d1p2.img
    zcat d1p2.img > partition2_extraction_test.img
    

    Please take a picture of the output of those commands and post it here.
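
If disk space on the server is tight, the compressed stream can also be verified without writing the extracted data anywhere. The sketch below assumes the image was captured with gzip compression, which the use of zcat in this thread implies; `check_gzip_image` is a hypothetical helper name. It only tests the gzip container and says nothing about the partclone data inside it.

```shell
#!/bin/sh
# check_gzip_image FILE   (hypothetical helper)
# Runs gzip's built-in integrity test on FILE, reading it via stdin so
# the missing .gz suffix does not matter. Nothing is written to disk.
check_gzip_image() {
    if gzip -t < "$1" 2>/dev/null; then
        echo "OK: $1"
    else
        echo "CORRUPT: $1"
        return 1
    fi
}

# Example (path taken from the thread):
# check_gzip_image /images/B-Blok-v18-v4/d1p2.img
```

Running the same check once on the server's local disk and once through the NFS mount from another machine would show whether the file itself or the transfer is at fault.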



  • Hi,
    Yes, all nodes are on 1.5.6.
    I upgraded Ubuntu on the SN without success.
    I deleted the images on the SN so that a fresh copy was made from the default to the SN.
    I noticed there was another error before the one in the picture above; see the attachment.
    IMG_3255.jpg


  • Senior Developer

    @ErwinBullen Did you update all your nodes to 1.5.6?

