Getting "disk read errors have occurred" after image completion on Fog trunk v7198



  • So I’m on Fog trunk version 7198 on Ubuntu 14.04 LTS. I recently upgraded from v1.2.0, which shortly before that had been upgraded from 0.32. Images were working fine on v1.2.0, both upload and download. Now I’m trying to deploy on v7198 and am getting “disk read errors have occurred” directly on the first boot after the imaging process, on at least 2 different “partimage” images. I have always used “single disk” resizable and the images are set as this in v7198. If I try to switch to any other option the task errors out before starting, so I’m sure this is all correct. Is there anything you need to do to the old 0.32 / 1.2.0 images before you use them in v7198, other than pointing to them and setting them as “partimage”?


  • Developer

    @Developers OK, I actually found the information on where the first partition is meant to start within the rec.img.000 file. Or, better said, how many hidden sectors are expected. Let me explain. The MBR code in the first sector of a disk looks for the first primary partition marked active, reads that partition’s VBR (volume boot record) and hands off execution to it. At this stage it does not matter whether that partition starts at sector 2048 or 63 or any other sector! But within that VBR the number of so-called hidden sectors (sectors before the VBR starts) is stored at offset 0x1C. Starting with Windows Vista this number defaults to 2048!! You run into the A disk read error occurred message if the partition start sector and the hidden sector count mismatch. I was able to get past that message by partitioning a disk with sda1 starting at sector 63 and modifying the hidden sector count to also be 63 (0x3F) using a hex editor. I just tried to prove to myself that I am on the right track with this and that Windows 7 does actually boot from sector 63 (I guess there are people around who have messed with this and we as the FOG team will have to face it at some point).
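
    As a rough illustration of the mismatch described above, here is a minimal Python sketch that compares the start sector in the MBR partition table with the hidden sector count stored at offset 0x1C of the VBR. The function names are made up for this example and nothing here is taken from the FOG scripts; it only assumes the standard MBR layout (four 16-byte entries starting at 0x1BE) and 512-byte sectors.

```python
import struct

SECTOR_SIZE = 512
PARTITION_TABLE_OFFSET = 0x1BE  # first of four 16-byte MBR partition entries
HIDDEN_SECTORS_OFFSET = 0x1C    # 32-bit little-endian field in the VBR


def active_partition_start(mbr: bytes) -> int:
    """Return the start LBA of the first active primary partition in an MBR."""
    if len(mbr) < SECTOR_SIZE or mbr[510:512] != b"\x55\xaa":
        raise ValueError("not a valid MBR sector")
    for i in range(4):
        off = PARTITION_TABLE_OFFSET + 16 * i
        entry = mbr[off:off + 16]
        if entry[0] == 0x80:  # boot flag set: this is the active partition
            # Bytes 8..11 of the entry hold the start LBA (little-endian).
            return struct.unpack_from("<I", entry, 8)[0]
    raise ValueError("no active partition found")


def hidden_sectors(vbr: bytes) -> int:
    """Return the hidden sector count stored in a volume boot record."""
    if len(vbr) < SECTOR_SIZE or vbr[510:512] != b"\x55\xaa":
        raise ValueError("not a valid boot sector")
    return struct.unpack_from("<I", vbr, HIDDEN_SECTORS_OFFSET)[0]


def start_matches_hidden_sectors(mbr: bytes, vbr: bytes) -> bool:
    """True if the partition start agrees with what the VBR expects."""
    return active_partition_start(mbr) == hidden_sectors(vbr)
```

    In a debug task one could, in principle, feed it the first 512 bytes of /dev/sda and the first 512 bytes of /dev/sda1; a result of False would correspond to the situation that produces the disk read error.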

    If you are keen to dive into this I suggest reading through these three articles: http://www.dewassoc.com/kbase/hard_drives/master_boot_record.htm, http://thestarman.pcministry.com/asm/mbr/NTFSBR.htm and http://thestarman.pcministry.com/asm/mbr/W7MBR.htm
    Valuable information for the quick reader:

    Note: Under Windows™ Vista and Windows™ 7, the number of “Hidden” or Reserved Sectors for the first partition has been increased to 2048 (0x800) rather than 63 (0x3F).

    and

    Basically, since the starting offset for many disks, including the majority of Windows XP OS installs, was 63 …

    So back to the very practical side of this. Instead of checking for image format or legacy type (fog.download line 99) I’d opt for defaulting to 2048 as this case branch is Windows 7/8.1/… only anyway! Sure there is a chance of people having messed things up by making it start at 63 but we should definitely default to 2048 I reckon! Hope it’s not going to break things for a lot of people out there.

    For all of you who haven’t followed this all the way: this whole story only matters for images that are missing the d1.partitions information, where we have to make assumptions about the partition layout (FOG pre 1.2.0 I think)!!

    If we are not happy with the default assumption on sector 2048 I can put together some kind of code to extract the hidden sector count value from the rec.img.000 file. Any votes for this?
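
    A minimal sketch of what such an extraction could look like, in Python. It makes assumptions that are not confirmed anywhere in this thread: that the caller passes in already decompressed image data, and that the NTFS boot sector appears verbatim somewhere in it (the partimage container format is not parsed at all). The function name is hypothetical. It searches for the NTFS OEM ID, which sits at offset 0x03 of the boot sector, checks for the 0x55AA signature at the end of that sector, and reads the 32-bit hidden sector count at offset 0x1C.

```python
import struct

NTFS_OEM_ID = b"NTFS    "  # 8 bytes at offset 0x03 of an NTFS boot sector


def find_hidden_sector_counts(data: bytes) -> list:
    """Return (offset, hidden_sectors) for every plausible NTFS boot sector.

    `data` is assumed to be raw, uncompressed bytes that contain the boot
    sector unchanged; this is an assumption, not a known property of
    partimage files.
    """
    results = []
    pos = data.find(NTFS_OEM_ID)
    while pos != -1:
        start = pos - 3  # the OEM ID begins 3 bytes into the sector
        sector = data[start:start + 512] if start >= 0 else b""
        # Only accept candidates that end with the boot sector signature.
        if len(sector) == 512 and sector[510:512] == b"\x55\xaa":
            hidden = struct.unpack_from("<I", sector, 0x1C)[0]
            results.append((start, hidden))
        pos = data.find(NTFS_OEM_ID, pos + 1)
    return results
```

    If several candidates turn up with different values, the result would have to be treated as inconclusive and the default start sector used instead.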


  • Developer

    @Tom-Elliott I’ve read a couple of things about windows bootup process, BCD, bootmgr and so far I haven’t found a way to extract the start sector information from rec.img.000 or sys.img.000!
    The problem is actually really simple from what I understand. The boot code in the MBR (if we don’t find one in the image we just deploy our e.g. win7.mbr) is pointing to the start of the first partition to try and find bootmgr.exe. Our win7.mbr comes with start sector 2048 and we must fail if we re-create the first partition starting at sector 63.

    Now in FOG trunk we re-create both partitions - which is not a really bad idea.

    I guess I was wrong with what I said. I think recreating the first partition on a different start sector is not very wise. I am able to reproduce the exact error in a VM. I just need some more time to play with this and hopefully I will find something we can do. The assumptions on start sector and partition size that we make when layout information is missing will only work by pure luck.


  • Developer

    @Zourous See in the top right corner there is a speech bubble, that’s for private chats. I just sent you a message.



  • @Sebastian-Roth

    Hi Sebastian, yes, the file is only 8.5 MB so I can probably send it as an attachment. I’m not sure how I private message you on here?


  • Developer

    @Zourous Great to hear that this has worked for you. Please be aware that the information I posted might only match the one particular image you gave the information for. It might work with other disks (disk size…) and images if you are really lucky.

    @Tom-Elliott said:

    Do you know of a way I might be able to take this information and use it within the inits?

    I have thought about this as well! Looking at the 1.2.0 version to see why it worked for Zourous, I was hoping to find a special trick that was used back then. But unfortunately it was pure luck, because in 1.2.0 we simply dump the win7.mbr (which has 2048 set as start for the first partition) to disk and only recreate partition number two. Now in FOG trunk we re-create both partitions - which is not a really bad idea. Just the fact that we guess the start sector is not wise.

    I am trying to see if we are able to extract the original partition start sector information from the rec.img.000 file. That’s our only hope I suppose.
    @Zourous Would you be able and willing to upload your rec.img.000 file and send me a private message on where I can download it? I will have a look and see if I can find a general solution to this problem.


  • Senior Developer

    @Zourous I don’t know that I am able to automate a process for this, as we don’t know it won’t work until after the first imaging attempt. In most cases older images from 0.32 and before defaulted to a start sector of 63, as @Sebastian-Roth stated. This wasn’t ALWAYS the case, but it was more often than not, which is why it got set up in such a way to begin with.

    @Sebastian-Roth Do you know of a way I might be able to take this information and use it within the inits?



  • Success!

    Thanks very much for your prompt responses and efforts, which are better than some of the paid support I have to face on a day-to-day basis as a technician. I can now move all my old images over to the fog trunk server.


  • Developer

    @george1421 Good catch! But taking a closer look I kind of doubt that this is the same issue. Here we have a legacy image (partition layout information not available!) but with the Optiplex 3020 the image seems to be a fresh new one from what I understand - totally different I guess.

    @Zourous To get around the default start sector 63 you can “simply” add partition layout description files to your image on the FOG server. Create two text files with the following content (make sure you get all the numbers right!):

    • /images/BASE32STD2015/d1.fixed_size_partitions
    1:
    
    • /images/BASE32STD2015/d1.partitions
    label: dos
    label-id: 0x86308630
    device: /dev/sda
    unit: sectors
    
    /dev/sda1 : start=        2048, size=      204800, type=7, bootable
    /dev/sda2 : start=      206848, size=   312374960, type=7
    

    Then try deploying the image again. Please let us know if this is working for you!?
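
    Since a single wrong number in d1.partitions breaks the deploy, a quick sanity check before deploying can help. The following Python sketch parses the partition lines shown above and checks that the first partition starts at the expected sector and that no partition overlaps the previous one. It is an illustration only: the function names are made up, and it assumes the line format exactly as shown in this post rather than the full sfdisk dump syntax.

```python
import re

# Matches lines such as "/dev/sda1 : start=  2048, size=  204800, type=7"
PARTITION_LINE = re.compile(
    r"^(?P<dev>/dev/\S+)\s*:\s*start=\s*(?P<start>\d+),\s*size=\s*(?P<size>\d+)"
)

SAMPLE = """\
label: dos
label-id: 0x86308630
device: /dev/sda
unit: sectors

/dev/sda1 : start=        2048, size=      204800, type=7, bootable
/dev/sda2 : start=      206848, size=   312374960, type=7
"""


def parse_partitions(text: str) -> list:
    """Return a list of (device, start, size) tuples, in file order."""
    parts = []
    for line in text.splitlines():
        m = PARTITION_LINE.match(line.strip())
        if m:
            parts.append((m["dev"], int(m["start"]), int(m["size"])))
    return parts


def check_layout(parts: list, expected_first_start: int = 2048) -> list:
    """Return a list of human-readable problems; empty means nothing found."""
    if not parts:
        return ["no partition lines found"]
    problems = []
    if parts[0][1] != expected_first_start:
        problems.append(
            f"{parts[0][0]} starts at {parts[0][1]}, expected {expected_first_start}"
        )
    # Each partition must begin at or after the end of the previous one.
    for (dev_a, start_a, size_a), (dev_b, start_b, _) in zip(parts, parts[1:]):
        if start_b < start_a + size_a:
            problems.append(f"{dev_b} overlaps {dev_a}")
    return problems


if __name__ == "__main__":
    for problem in check_layout(parse_partitions(SAMPLE)):
        print(problem)
```

    For the values above the check finds nothing to complain about: sda1 ends at 2048 + 204800 = 206848, which is exactly where sda2 starts.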


  • Moderator

    @Sebastian-Roth Could this case be the issue with this one too? https://forums.fogproject.org/topic/7299/deploy-problem-with-optiplex-3020/39 It might explain why the two drives of the same size but different models would give different results.


  • Developer

    @Zourous I think we found it! From what I remember looking through our latest script code FOG trunk defaults to start sector 63 for legacy images where we don’t have any partition layout information. As your first partition starts at sector 2048 (which is perfectly fine!) FOG’s assumption is simply wrong and therefore fails to image properly so that the OS can boot. Please give me a little more time over the weekend and I am sure we’ll find a nice solution for this.



  • @Sebastian-Roth

    OK, rechecking with debug mode, it doesn’t look like there is any specific error message as I thought, using FOG trunk and an old 0.32 image. Here is the screenshot anyway. Not sure it tells you much.

    0_1462528797207_1.jpg

    “Important information we need is the OS ID configured for the “old” image on the new FOG trunk server!”

    OS ID on the new server is “Windows 7 (5)” just like it was on 0.32 & 1.2.0

    “As well I need to ask you to deploy that image via your old 0.32 or 1.2.0 FOG server so the client can boot properly. Then schedule a debug (capture or deploy does not matter) task on that client and run fdisk -l /dev/sda when you get to the shell. Please take a picture as we need the exact numbers (no typos allowed! ;-)”

    See below, hopefully it will help:

    0_1462535396343_3.jpg


  • Developer

    @Zourous I have to admit that I obviously didn’t properly read all your posts from start to end, sorry! Nothing personal, just me having a lot less time lately. As I have a quiet moment right now I’ll try to get my head around your issue.
    A lot of our script code that does all the deploy stuff has changed since FOG 1.2.0 - so I guess we just haven’t tested all the code handling those legacy image things.

    I think there might have been a message at the end of the image process before the reboot

    Please schedule the next deploy as debug (right before you click create deploy task in FOG trunk there is a checkbox for debug mode). This way you need to step through the process and you can read all the messages on screen. You could actually connect to the client via SSH to be able to copy&paste all the messages on screen when running the fog command via SSH.

    Ok, from what I get between the lines this could be an issue in how we handle the partitioning of legacy images. Good to know that you are able to do a fresh capture/deploy on FOG trunk without a problem.

    Important information we need is the OS ID configured for the “old” image on the new FOG trunk server!
    As well I need to ask you to deploy that image via your old 0.32 or 1.2.0 FOG server so the client can boot properly. Then schedule a debug (capture or deploy does not matter) task on that client and run fdisk -l /dev/sda when you get to the shell. Please take a picture as we need the exact numbers (no typos allowed! ;-)



  • @Sebastian-Roth said in Getting "disk read errors have occurred" after image completion on Fog trunk v7198:

    @Zourous Would you please be so kind and upload a picture of the error on screen or at least post the exact error message here in the forum.

    0_1462394528756_image.png

    This is the message (replica grabbed from the internet) and it happens straight after the first reboot once the image completes. I think there might have been a message at the end of the image process before the reboot, but I missed it due to a distraction at work. If I get a chance I’ll try an image again and see if I can catch the message at the end on my camera.


  • Developer

    @Zourous Would you please be so kind and upload a picture of the error on screen or at least post the exact error message here in the forum. I tried googling for “disk read errors have occurred” but there is not much I can find.

    If FOG wouldn’t copy the MBR there wouldn’t be a partition table and no partitions…

    Thanks for testing (re-capture and deploy in trunk) and elaborating (images worked in 1.2.0) as this is valuable information! I hope we can find out a little bit more about the error message, and then we should be able to find what’s wrong. One possible problem could be the newer version of partimage in FOG trunk not handling your images properly - I really hope this is not the case.


  • Moderator

    @Zourous apologies. @Sebastian-Roth ideas?



  • @Wayne-Workman said in Getting "disk read errors have occurred" after image completion on Fog trunk v7198:

    @Zourous I think these are partimage type images, instead of the newer format partclone.
    FOG 1.2.0 and FOG Trunk support partimage but I think you have to tell it to use partimage for that image - or maybe even for the whole server, not sure. Asking @Sebastian-Roth or @Tom-Elliott to clarify.

    Just to reiterate, I have set the old 0.32 images to “partimage” instead of “partclone” in the image properties. I’ve tried a brand new image upload and download in Fog trunk and this worked fine, it just doesn’t seem to like my old images at the moment. This means I’ve got a separate server built on 0.32 so I can just restore old images. Trouble is I also need to play with W10, so I appreciate the new stuff in the trunk version.


  • Moderator

    @Zourous I think these are partimage type images, instead of the newer format partclone.
    FOG 1.2.0 and FOG Trunk support partimage but I think you have to tell it to use partimage for that image - or maybe even for the whole server, not sure. Asking @Sebastian-Roth or @Tom-Elliott to clarify.



  • “Can you list the contents of this particular image’s directory? The command should be something like this:
    ls -laht /images/ImageNameGoesHere”

    total 21G
    drwxrwxr-x 2 build build 4.0K Jun 19 2015  .
    drwxrwxrwx 5 root root   4.0K Apr 29 09:05 ..
    -rw-rw-r-- 1 build build  21G Jun 19 2015  sys.img.000
    -rw-rw-r-- 1 build build 8.7M Jun 19 2015  rec.img.000
    

    I tested one of my old backed-up 0.32 images back on Fog v0.32 and it imaged fine. I have just built a new fog server, now on v7356, and I’m back to the same issue again upon first boot after image deployment: “disk read errors have occurred”

