Tablet PC hangs on bzImage



  • I’m trying to get FOG to image a batch of Windows 10 point-of-sale tablets that my company sets up for our customers. The tablets connect over USB to a base station that has an Ethernet adapter built in. They are 32-bit UEFI machines with no option to switch to Legacy BIOS. When I try a full host registration, it hangs on bzImage and never continues past that. So I tried adding the host via the web interface and starting an image capture task. Once again the tablet gets to bzImage and hangs.

    I followed the BIOS and UEFI Co-Existence wiki entry https://wiki.fogproject.org/wiki/index.php/BIOS_and_UEFI_Co-Existence in case it wasn’t getting the boot image it needed as a UEFI machine, but it’s still hanging on bzImage.


  • Developer

    Wow, we finally made it. To make a long story short, the fix was simply compiling a special iPXE binary that includes a native ASIX driver. Keep in mind that this disables other USB devices.

    Also note George’s advice on the inits.

    I am wondering whether we should add this special ASIX iPXE binary to our repo and installer.



  • Capture and deploy completed successfully! Looks like the problem is solved. Thanks for all your help, guys!



  • @george1421 It looks like that did it! It’s currently capturing the image. I’ll keep an eye on it to make sure it completes without any errors. Then I’ll try and deploy the image back to the tablet to make sure that works as well.


  • Moderator

    @Sebastian-Roth Instead of messing with the init names, I would just update the Host Init field in the host definition to init_32.xz, with the Host Kernel set to bzImage32. That way other hosts will still boot correctly. Technically the 64 bit systems will boot with the 32 bit kernels (as with the older versions of FOG), but the next FOG update will undo your switch around.
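
If init.xz has been replaced with a copy of init_32.xz on the FOG server, a later FOG update can put the 64-bit file back without any notice. A minimal sketch for checking which state the server is in, assuming the /var/www/fog/service/ipxe directory mentioned in this thread; the function name `init_swap_state` is made up for illustration and is not part of FOG:

```shell
#!/bin/sh
# Report whether init.xz is currently a byte-for-byte copy of init_32.xz.
# The default directory is the one used in this thread; pass another as $1.
init_swap_state() {
    dir=${1:-/var/www/fog/service/ipxe}
    if [ ! -f "$dir/init.xz" ] || [ ! -f "$dir/init_32.xz" ]; then
        echo "missing"
        return 1
    fi
    # cmp -s prints nothing and returns 0 only when both files are identical
    if cmp -s "$dir/init.xz" "$dir/init_32.xz"; then
        echo "32-bit-copy"   # init.xz holds the 32-bit init
    else
        echo "different"     # init.xz is some other file, e.g. the 64-bit init
    fi
}
```

Running `init_swap_state` after an update shows whether the swap survived. Setting Host Init per host avoids the question entirely, since no shared file is renamed.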

    The unclean file system message appears because Windows was not shut down properly before you attempted to capture it. Either use sysprep to power off the device or run the command shutdown -s -t 0 to power it off properly. The simple Start button -> Shut down will not properly power off the device.



  • @Sebastian-Roth Alright, that got me a bit further. It looked like it was actually going to start capturing, but I got “An error has been detected!”

    * Could not mount /dev/mmcblk2p3 (/bin/fog.upload->beginUpload)
    Args Passed:
    Reason: The disk contains an unclean file system (0, 0).
    Metadata kept in Windows cache refused to mount.
    Failed to mount '/dev/mmcblk2p3' : Operation not permitted
    The NTFS partition is in an unsafe state. Please resume and shutdown Windows fully (no hibernation or fast restarting), or mount the volume read-only with the 'ro' mount option.
    

    I allowed the tablet to boot into Windows and did a full shutdown using the command Shutdown -f -s -t 0. Then I booted it back up, and now when it tries to boot over IPv4 it immediately skips over it and tries IPv6 before booting back into Windows.


  • Developer

    @Zerpie Ohh wow, I think we are very much ahead of the initial issue!!! Sorry if you posted about this already and I didn’t notice it!

    Starting init: /sbin/init exists but couldn’t execute it (error -8)

    AFAIK this means again the same thing, the CPU is pinned to 32 bit but this time the FOS init is 64 bit and fails to execute on this CPU. Would you be able to manipulate those binaries just for the moment of testing?

    The kernel can be set using the host kernel setting but the init can’t. So I ask you to copy the init binary on the FOG server:

    sudo -i
    cd /var/www/fog/service/ipxe
    mv init.xz init.xz.orig
    cp init_32.xz init.xz
    

    Still leave bzImage32 set in the host’s kernel setting and try again.
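
Error -8 in the kernel output is ENOEXEC ("Exec format error"), which is what a 32-bit-only system returns when asked to run a 64-bit binary. As a rough way to confirm which kind a given binary is, the sketch below reads the ELF class byte (offset 4: 1 means 32-bit, 2 means 64-bit). The function name `elf_class` is made up for illustration. It works on an ELF file such as /sbin/init taken from an unpacked init image; unpacking init.xz itself is not shown here.

```shell
#!/bin/sh
# Print "32-bit" or "64-bit" for an ELF file by reading its EI_CLASS byte.
elf_class() {
    # The first four bytes of every ELF file are 0x7f 'E' 'L' 'F'
    magic=$(od -An -tx1 -N4 "$1" | tr -d ' \n')
    if [ "$magic" != "7f454c46" ]; then
        echo "not-elf"
        return 1
    fi
    # Byte at offset 4 is EI_CLASS: 1 = ELFCLASS32, 2 = ELFCLASS64
    class=$(od -An -tu1 -j4 -N1 "$1" | tr -d ' \n')
    case "$class" in
        1) echo "32-bit" ;;
        2) echo "64-bit" ;;
        *) echo "unknown" ;;
    esac
}
```

For example, `elf_class /bin/sh` on the FOG server reports the class of the server's own shell, which makes a quick sanity check that the function behaves as expected.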



  • Ok I still had debug earlyprintk=efi loglevel=7 set in the Host Kernel Arguments so I cleared that out and tried again. I didn’t get the wall of text this time, just the Kernel panic. Here’s everything I see.

    Starting init: /sbin/init exists but couldn't execute it (error -8)
    Starting init: /bin/sh exists but couldn't execute it (error -8)
    Kernel panic - not syncing: No working init found.  Try passing init= option to kernel.  See Linux Documentation/admin-guide/init.rst for guidance.
    Kernel Offset: disabled
    ---[ end Kernel panic - not syncing: No working init found.  Try passing init= option to kernel.  See Linux Documentation/admin-guide/init.rst for guidance.
    


  • @Sebastian-Roth Yup. I got these errors at some point before and posted about it earlier in the thread.

    Ok, I could have sworn I tried it with bzImage32 set for Host Kernel earlier today, but I just tried it now and I got a little further. It gave me a whole page of text and it looked like it was going to work, but then it stopped and I see these lines at the bottom.

    Kernel panic - not syncing: No working init found.  Try passing init= option to kernel.  See Linux Documentation/admin-guide/init.rst for guidance.
    Kernel Offset: disabled
    ---[ end Kernel panic - not syncing: No working init found.  Try passing init= option to kernel.  See Linux Documentation/admin-guide/init.rst for guidance.
    

  • Developer

    @Zerpie Are you sure we have had the Exec format error before? As far as I remember we only got to the point where it would hang on bzImage. Now the Exec format error could actually be what George said, a CPU being pinned to 32 bit mode. In the host’s settings, change the Kernel setting to bzImage32 and see if that makes a difference…



  • I probably should have posted this a while back. I’m not sure if it’ll help, but it’ll at least be good to know what we’re working with. The device is a Touch Dynamics Quest 7 Tablet mobile POS (point of sale).

    https://www.touchdynamic.com/products/mobile-pos-tablets/quest-tablet-and-premium-dock/

    There’s not a whole lot out there about it that I could find.



    @Sebastian-Roth Ok, I gave that a try and now it’s back to what I was getting before.

    BzImage_debug... ok
    Could not select: Exec format error (http://ipxe.org/2e008081)
    Could not boot: Exec format error (http://ipxe.org/2e008081)
    Could not boot: Exec format error (http://ipxe.org/2e008081)
    

    And then it just continues to boot into the OS.


  • Developer

    @george1421 @JJ-Fullmer We have been through a lot of debugging with this, and after we hunted down the kernel it turned out to be a stupid network driver issue. iPXE does not seem to use a native driver for the ASIX AX88179 USB 3.0 adapter but the more general SNP. For whatever reason this ASIX chip does not play by the rules, and the network connection stalls after it starts to transfer the first bytes of the kernel binary (seen in Wireshark dumps!). Just to give you a quick heads up on what we’ve been into so far.

    @Zerpie Would it be an option to use a different USB network adapter? I know you’ve tried three different ones, but maybe there are more out there. Other than that, it would be a case of getting into the source code and either debugging why this NIC doesn’t play properly with SNP or trying to get the AXGE driver to work… For me it’s very hard to tell which way is the more promising. Not having your hardware here, I can’t even do any testing myself.

    EDIT: I just re-read that post in the iPXE forums where I got the information about how to build a binary with native AXGE driver included and saw this:

    When building the ipxe target, drivers for usb devices is not included (due to it disabling all other connected USB devices)

    Sorry for obviously having overlooked this important information. So I guess it does not hang at all but just sits there waiting for input and is not getting any as USB devices are shut off. So instead of the simple shell.txt create a new text file dhcp.txt with the following content:

    #!ipxe
    ifopen net0
    dhcp net0
    chain tftp://${next-server}/default.ipxe
    

    Then compile and copy the new binary:

    make bin-i386-efi/ncm--ecm--axge.efi EMBED=dhcp.txt
    cp path_to_ipxe_code/src/bin-i386-efi/ncm--ecm--axge.efi /tftpboot/i386-efi/ipxe.efi
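
The steps above can be collected into a small script. This is only a sketch under assumptions: the function names `write_dhcp_embed` and `build_axge_ipxe` are made up for illustration, the iPXE source directory is passed in by the caller, and the backup of the existing ipxe.efi is an addition that the post does not mention.

```shell
#!/bin/sh
# Write the embedded iPXE script from the post above to the path given as $1.
write_dhcp_embed() {
    # The quoted 'EOF' stops the shell from expanding ${next-server}.
    # Unquoted, sh would read it as "$next, or the word 'server' if unset".
    cat > "$1" <<'EOF'
#!ipxe
ifopen net0
dhcp net0
chain tftp://${next-server}/default.ipxe
EOF
}

# Build the binary with the native AXGE driver and install it.
# $1 = iPXE src directory (placeholder, e.g. path_to_ipxe_code/src)
# $2 = TFTP directory, defaulting to the one used in this thread
build_axge_ipxe() {
    ipxe_src=$1
    tftp_dir=${2:-/tftpboot/i386-efi}
    write_dhcp_embed "$ipxe_src/dhcp.txt" || return 1
    ( cd "$ipxe_src" && make bin-i386-efi/ncm--ecm--axge.efi EMBED=dhcp.txt ) || return 1
    # Keep one backup of the binary that was there before (not in the original post)
    [ -e "$tftp_dir/ipxe.efi.orig" ] || cp "$tftp_dir/ipxe.efi" "$tftp_dir/ipxe.efi.orig" || return 1
    cp "$ipxe_src/bin-i386-efi/ncm--ecm--axge.efi" "$tftp_dir/ipxe.efi"
}
```

Calling `build_axge_ipxe path_to_ipxe_code/src` as root would then run the whole sequence. Note that this overwrites the shared ipxe.efi, so every 32-bit UEFI client gets the binary that disables other USB devices.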
    


  • @Quazz said in Tablet PC hangs on bzImage:

    @Zerpie Did you ever end up trying the has_usb_nic=1 kernel argument?

    I did. Still no luck.


  • Moderator

    There are a couple of points here.

    1. I’ve seen issues with some 32 bit tablets (the models escape me at the moment) that have a 64 bit processor which has been hardware locked in 32 bit mode. iPXE will detect that it’s a 64 bit processor (because it really is) and send the wrong kernel to the target computer. I have not checked, but you might be able to bypass this auto detection by specifically calling out bzImage32 in the host configuration properties.
    2. FOG doesn’t currently include the 32 bit refind as part of the package. Only the 64 bit refind is included. This will cause a problem when exiting from the iPXE menu, in that the wrong refind binary will try to boot and fail.

    I can’t say that either of these is the specific case here, but I have seen issues before with 32 bit tablets.
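
One way to look for the mismatch described in point 1 is to compare what the CPU advertises with what the firmware runs. The sketch below assumes a Linux system already booted on the tablet in UEFI mode (for example from a USB stick) and assumes the lock shows up as 32-bit UEFI firmware on a 64-bit-capable CPU, which this thread does not confirm. The function name `platform_bits` is made up for illustration; /sys/firmware/efi/fw_platform_size is a Linux sysfs file and is absent on BIOS boots.

```shell
#!/bin/sh
# Print the CPU's capability and the UEFI firmware's word size side by side.
# $1 and $2 override the input files, which is mainly useful for testing.
platform_bits() {
    cpuinfo=${1:-/proc/cpuinfo}
    fwfile=${2:-/sys/firmware/efi/fw_platform_size}
    # 'lm' (long mode) among the CPU flags means the CPU can run 64-bit code
    if grep '^flags' "$cpuinfo" | head -n 1 | grep -qw lm; then
        cpu=64
    else
        cpu=32
    fi
    # The sysfs file contains 32 or 64 when the system was booted through UEFI
    if [ -r "$fwfile" ]; then
        fw=$(cat "$fwfile")
    else
        fw=unknown
    fi
    echo "cpu=$cpu firmware=$fw"
}
```

An output of `cpu=64 firmware=32` would match the situation George describes, where auto detection picks the 64 bit files even though only 32 bit ones can run.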


  • Testers

    @Zerpie You can also try booting from the iPXE shell. If that isn’t built in to the tablet as a boot option (sometimes it is, sometimes it isn’t), you can make a rEFInd USB disk and add all the iPXE EFI boot options. Then you can create a startup.nsh script that switches to the fs#: of the USB drive and boots whichever 32 bit iPXE file ends up working. It would be tricky and still involve USB drives, but you could in theory make it work.

    Another possibility would be to customize FOG’s built-in refind for those tablets, if that happens to boot successfully (i.e. if boot to hard drive from the FOG menu is working). You could change the default boot settings. I believe you can add some conditions to it; I know you can set different default boot options for different times of the day. So one possibility would be to add the refind EFI shell to the boot options in FOG’s refind.conf, make it the default during the time slot in which you are going to image, and find a way to link a startup.nsh script. I haven’t actually tested this idea, it’s just another possibility if you want network boot to work. But all of that is moot if none of the iPXE EFI boot files get you through the bzImage32 boot.


  • Moderator

    @Zerpie Did you ever end up trying the has_usb_nic=1 kernel argument?



    @Sebastian-Roth That’s a shame. I was hoping we’d be able to find a network boot solution for imaging the tablets, as we typically have to image several at once for orders. I haven’t found any imaging or cloning software with network boot/PXE boot that works with these tablets. We’re currently using a USB boot option, which works, but we still have to do one at a time.


  • Developer

    @Zerpie I think we should start looking into other ways of booting this tablet. George posted a great manual on how to build a bootable USB medium:
    https://forums.fogproject.org/topic/7727/building-usb-booting-fos-image

    Sure this is not perfect but give it a try to see if we can get ahead of iPXE not being able to boot this tablet.



  • @Sebastian-Roth said in Tablet PC hangs on bzImage:

    Try enabling debug make bin-i386-efi/ncm--ecm--axge.efi EMBED=shell.txt DEBUG=ncm,ecm,axge, copy the binary over, try again and see if you get any further output on screen that could help us.

    Same results. It drops me to the ipxe shell and doesn’t accept any input from the keyboard. There’s no firmware upgrade available either.


 
