Tablet PC hangs on bzImage
-
@george1421 @JJ-Fullmer We have been through a lot of debugging with this and after we have hunted down the kernel it turned out to be a stupid network driver issue. iPXE does not seem to use a native driver for the ASIX AX88179 USB 3.0 adapter but the more general SNP. For whatever reason this ASIX chip does not play by the rules and network connection stalls after it started to transfer the first bytes of the kernel binary (seen in wireshark dumps!). Just to give you a quick heads up on what we’ve been into so far.
@Zerpie Would it be an option to use a different USB network adapter? I know you’ve tried three different ones, but maybe there are more out there. Other than that, it would be a case of getting into the source code and either debugging why this NIC doesn’t play SNP properly or trying to get the AXGE driver to work… For me it’s very hard to tell which way is more promising. Not having your hardware here, I can’t even do any testing myself.
EDIT: I just re-read that post in the iPXE forums where I got the information about how to build a binary with native AXGE driver included and saw this:
When building the ipxe target, drivers for usb devices is not included (due to it disabling all other connected USB devices)
Sorry for obviously having overlooked this important information. So I guess it does not hang at all but just sits there waiting for input and is not getting any as USB devices are shut off. So instead of the simple shell.txt, create a new text file dhcp.txt with the following content:
    #!ipxe
    ifopen net0
    dhcp net0
    chain tftp://${next-server}/default.ipxe
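If you would rather create dhcp.txt from the command line, something like the following should do. This is only a sketch: the function name write_embed_script is made up for illustration, and it assumes a POSIX shell. The one detail that matters is the quoted 'EOF', because an unquoted heredoc would let the shell expand ${next-server} itself instead of leaving it for iPXE.

```shell
#!/bin/sh
# Hypothetical helper: write the embedded iPXE script to the path given as $1.
write_embed_script() {
    # Quoting 'EOF' keeps ${next-server} literal so that iPXE, not the shell,
    # resolves it at boot time.
    cat > "$1" <<'EOF'
#!ipxe
ifopen net0
dhcp net0
chain tftp://${next-server}/default.ipxe
EOF
}
```

Calling write_embed_script dhcp.txt inside the iPXE src directory would then give you the file that EMBED=dhcp.txt picks up.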
Then compile and copy the new binary:
    make bin-i386-efi/ncm--ecm--axge.efi EMBED=dhcp.txt
    cp path_to_ipxe_code/src/bin-i386-efi/ncm--ecm--axge.efi /tftpboot/i386-efi/ipxe.efi
-
@Sebastian-Roth Ok I gave that a try and now it’s back what I was getting before.
    BzImage_debug... ok
    Could not select: Exec format error (http://ipxe.org/2e008081)
    Could not boot: Exec format error (http://ipxe.org/2e008081)
    Could not boot: Exec format error (http://ipxe.org/2e008081)
And then it just continues to boot into the OS.
-
I probably should have posted this a while back. I’m not sure if it’ll help, but it’ll at least be good to know what we’re working with. The device is a Touch Dynamics Quest 7 Tablet mobile POS (point of sale).
https://www.touchdynamic.com/products/mobile-pos-tablets/quest-tablet-and-premium-dock/
There’s not a whole lot out there about it that I could find.
-
@Zerpie Are you sure we have had the Exec format error before? As far as I remember we only got to the point where it would hang on bzImage. Now the Exec format error could actually be what George said, a CPU being pinned to 32 bit mode. In the host’s settings change the Kernel setting to bzImage32 and see if that makes a difference…
-
@Sebastian-Roth Yup. I got these errors at some point before and posted about it earlier in the thread.
Ok I could have sworn I tried it with bzimage32 set for Host Kernel earlier today, but I just tried it now and I got a little further. It gave me a whole page of text and it looked like it was going to work but then it stopped and I see these lines at the bottom.
    Kernel panic - not syncing: No working init found. Try passing init= option to kernel. See Linux Documentation/admin-guide/init.rst for guidance.
    Kernel Offset: disabled
    ---[ end Kernel panic - not syncing: No working init found. Try passing init= option to kernel. See Linux Documentation/admin-guide/init.rst for guidance.
-
Ok I still had debug earlyprintk=efi loglevel=7 set in the Host Kernel Arguments so I cleared that out and tried again. I didn’t get the wall of text this time, just the Kernel panic. Here’s everything I see.
    Starting init: /sbin/init exists but couldn't execute it (error -8)
    Starting init: /bin/sh exists but couldn't execute it (error -8)
    Kernel panic - not syncing: No working init found. Try passing init= option to kernel. See Linux Documentation/admin-guide/init.rst for guidance.
    Kernel Offset: disabled
    ---[ end Kernel panic - not syncing: No working init found. Try passing init= option to kernel. See Linux Documentation/admin-guide/init.rst for guidance.
-
@Zerpie Ohh wow, I think we are very much ahead of the initial issue!!! Sorry if you posted about this already and I didn’t notice it!
Starting init: /sbin/init exists but couldn’t execute it (error -8)
AFAIK this means again the same thing, the CPU is pinned to 32 bit but this time the FOS init is 64 bit and fails to execute on this CPU. Would you be able to manipulate those binaries just for the moment of testing?
The kernel can be set using the host kernel setting but the init can’t. So I ask you to copy the init binary on the FOG server:
    sudo -i
    cd /var/www/fog/service/ipxe
    mv init.xz init.xz.orig
    cp init_32.xz init.xz
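As a side note, error -8 is ENOEXEC, the kernel's "Exec format error". If you ever want to confirm whether a given binary is 32 or 64 bit, the fifth byte of the ELF header (EI_CLASS) tells you: 1 means 32 bit, 2 means 64 bit. A rough sketch, assuming a POSIX shell with od available; the name elf_bits is made up for illustration, and the file you pass in would have to be a binary extracted from the init image, not the compressed init.xz itself:

```shell
#!/bin/sh
# Hypothetical helper: print 32 or 64 for an ELF file, "unknown" for anything else.
elf_bits() {
    # Read the first five bytes as hex: the ELF magic (7f 45 4c 46) plus EI_CLASS.
    hdr=$(od -An -tx1 -N5 "$1" | tr -d ' \n')
    case "$hdr" in
        7f454c4601) echo 32 ;;
        7f454c4602) echo 64 ;;
        *) echo unknown ;;
    esac
}
```

For example, elf_bits /sbin/init on a 64 bit Linux system should print 64.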
Still leave bzImage32 set in the host’s kernel setting and try again.
-
@Sebastian-Roth Alright, that got me a bit further. It looked like it was actually going to start capturing, but I got “An error has been detected!”
    * Could not mount /dev/mmcblk2p3 (/bin/fog.upload->beginUpload)
    Args Passed:
    Reason: The disk contains an unclean file system (0, 0).
    Metadata kept in Windows cache refused to mount.
    Failed to mount '/dev/mmcblk2p3' : Operation not permitted
    The NTFS partition is in an unsafe state. Please resume and shutdown Windows fully (no hibernation or fast restarting), or mount the volume read-only with the 'ro' mount option.
I allowed the tablet to boot into Windows and did a full shutdown using the command Shutdown -f -s -t 0. Then I booted it back up and now when it tries to boot over ipv4 it immediately skips over it and tries ipv6 before booting back into Windows.
-
@Sebastian-Roth Instead of messing with the init names, I think I would just update the Host Init field in the host definition to init_32.xz, with the Host Kernel set to bzImage32. That way other hosts will still boot correctly. Technically the 64 bit systems will boot with the 32 bit kernels (as with the older versions of FOG). But the next FOG update will undo your switch around.
The unclean file system message is because Windows was not shut down properly before you attempted to capture it. Either use sysprep to power off the device or run the command shutdown -s -t 0 to properly power off the device. The simple Start button->Shutdown will not properly power off the device.
-
@george1421 It looks like that did it! It’s currently capturing the image. I’ll keep an eye on it to make sure it completes without any errors. Then I’ll try and deploy the image back to the tablet to make sure that works as well.
-
Capture and deploy completed successfully! Looks like the problem is solved. Thanks for all your help, guys!
-
Wow, we finally made it. So to make a long story short, it was simply a matter of compiling a special iPXE binary that has a native ASIX driver included. Keep in mind that other USB devices are disabled by this.
Also note George’s advice on the inits.
I am wondering whether we should add this special ASIX iPXE binary to our repo and installer.
-
@Zerpie Ok I don’t know what happened in the last couple weeks, but it’s not working anymore. I’m back to getting the following errors when trying to capture an image from these tablets.
    bzImage32... ok
    Could not select: Exec format error (http://ipxe.org/2e008081)
    Could not boot: Exec format error (http://ipxe.org/2e008081)
    Could not boot: Exec format error (http://ipxe.org/2e008081)
I haven’t made any changes to Fog since I got this working originally. I told my team that it was working and they could move forward with using Fog to image these tablets. Once they got around to it they brought it to my attention that it wasn’t working.
I also don’t know if I need to start a new thread since I already marked this one as solved.
-
@Zerpie Remember this tablet seems to disguise its architecture being 32 or 64 bit. In the process of trying to make this work we renamed kernels and inits and also set those parameters in the host’s settings.
Probably best if you reset kernels and inits to the latest and then check the host’s settings again.
    sudo -i
    cd /var/www/fog/service/ipxe
    mkdir backup
    mv bzImage* backup/
    wget https://fogproject.org/kernels/bzImage
    wget https://fogproject.org/kernels/bzImage32
    mv init* backup/
    wget https://fogproject.org/inits/init.xz
    wget https://fogproject.org/inits/init_32.xz
Now go to the tablet’s host settings in the web UI and make sure you have Host Kernel set to bzImage32 and Host Init set to init_32.xz.
-
@Sebastian-Roth The first thing I checked was that I had Host Kernel set to bzImage32 and Host Init set to init_32.xz since that was the last thing we were missing when we got it working last.
I followed your instructions of moving the original files and getting the latest ones, but I’m still getting the same Exec format error.
-
@Zerpie Are you sure you are using the exact same physical tablet device this time? Maybe it’s just another one, same model but different firmware version or BIOS settings?
What happens if you set back to bzImage/init.xz in the host’s settings? Same exec error?
To me it sounds as if secure boot is enabled on the tablet.
-
@Sebastian-Roth It’s the same model, but not the exact same tablet I got working before.
I tried changing the host settings back to bzimage and init.xz and I got the same error.
These tablets do not have the option to enable/disable secure boot in BIOS setup.
I’ll see if we still have the original tablet that we got working to see if it’s also getting the errors.
EDIT: Nope. It’s impossible to find the original tablet. It was added back to the stock of all the others and I didn’t get the serial# from it.
-
Ok I don’t know what happened, but after I exited BIOS settings this time on the tablet, it booted into Fog and started a capture task. I have no idea why. I didn’t change any of the settings. I’ll have to play with it some more.
-
@Zerpie Oh dear, doesn’t sound like reliable devices you have there. Hope you figure out how to handle those and make them work properly. Doesn’t sound like this is something we can handle with FOG. But keep us posted on how you go.
-
@Sebastian-Roth Yeah I was thinking the same thing. So now that tablet is booting reliably every time. I’ve moved to another one and it’s getting the errors again. This is totally a tablet-related issue and not a Fog related one.