another init.xz issue



  • Oh, I thought that was a typo, sorry for that. But it doesn’t help: the set command doesn’t give me anything mac-related:

    [Tue Jun 21 root@fogclient ~]# set |grep mac
    SHELLOPTS=braceexpand:emacs:hashall:histexpand:history:interactive-comments:monitor:posix
    [Tue Jun 21 root@fogclient ~]#
    

    And here is the lspci output:

    [Tue Jun 21 root@fogclient ~]# lspci -m
    00:00.0 "Host bridge" "Intel Corporation" "Atom Processor Z36xxx/Z37xxx Series SoC Transaction Register" -r11 "Intel Corporation" "Device 7270"
    00:02.0 "VGA compatible controller" "Intel Corporation" "Atom Processor Z36xxx/Z37xxx Series Graphics & Display" -r11 "Intel Corporation" "Device 7270"
    00:13.0 "SATA controller" "Intel Corporation" "Atom Processor E3800 Series SATA AHCI Controller" -r11 -p01 "Intel Corporation" "Device 7270"
    00:17.0 "SD Host controller" "Intel Corporation" "Atom Processor E3800 Series eMMC 4.5 Controller" -r11 -p01 "Intel Corporation" "Device 7270"
    00:1a.0 "Encryption controller" "Intel Corporation" "Atom Processor Z36xxx/Z37xxx Series Trusted Execution Engine" -r11 "Intel Corporation" "Device 7270"
    00:1b.0 "Audio device" "Intel Corporation" "Atom Processor Z36xxx/Z37xxx Series High Definition Audio Controller" -r11 "Intel Corporation" "Device 7270"
    00:1c.0 "PCI bridge" "Intel Corporation" "Atom Processor E3800 Series PCI Express Root Port 1" -r11 "" ""
    00:1c.1 "PCI bridge" "Intel Corporation" "Atom Processor E3800 Series PCI Express Root Port 2" -r11 "" ""
    00:1c.2 "PCI bridge" "Intel Corporation" "Atom Processor E3800 Series PCI Express Root Port 3" -r11 "" ""
    00:1c.3 "PCI bridge" "Intel Corporation" "Atom Processor E3800 Series PCI Express Root Port 4" -r11 "" ""
    00:1d.0 "USB controller" "Intel Corporation" "Atom Processor Z36xxx/Z37xxx Series USB EHCI" -r11 -p20 "Intel Corporation" "Device 7270"
    00:1f.0 "ISA bridge" "Intel Corporation" "Atom Processor Z36xxx/Z37xxx Series Power Control Unit" -r11 "Intel Corporation" "Device 7270"
    00:1f.3 "SMBus" "Intel Corporation" "Atom Processor E3800 Series SMBus Controller" -r11 "Intel Corporation" "Device 7270"
    02:00.0 "Network controller" "Qualcomm Atheros" "AR9580 Wireless Network Adapter" -r01 "Qualcomm Atheros" "Device 3123"
    

  • Moderator

    @bmaster001 The actual command for funcs is . /usr/share/fog/lib/funcs.sh (dot space <path>). Without the preceding dot the variables are lost when the script quits running. Unfortunately this information is important for debugging since the mac variable is what is passed to FOG. But you are right, the mac address of eth0 changes at random. This is crazy.
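
A minimal sketch of why the leading dot matters. It uses a hypothetical stand-in script that only sets a made-up `mac` value, not the real funcs.sh, so it shows the shell behaviour and says nothing about what funcs.sh itself defines:

```shell
#!/bin/sh
# Hypothetical stand-in for a script that sets a variable (NOT the real funcs.sh).
demo=$(mktemp)
printf 'mac="00:11:22:33:44:55"\n' > "$demo"

unset mac

# Executed as a child process: the assignment happens in the child and is lost.
sh "$demo"
after_exec="${mac:-<unset>}"

# Sourced with ". <path>": the assignment happens in the current shell and stays.
. "$demo"
after_source="${mac:-<unset>}"

rm -f "$demo"
echo "after executing: $after_exec"
echo "after sourcing:  $after_source"
```

Only the sourced run leaves `mac` visible to a later `set | grep mac`.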

    What do you get from lspci -m and then lspci -k?



  • Thanks for the “secret” :-)

    First run:

    [Tue Jun 21 root@fogclient ~]# ip addr show
    1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue qlen 1
        link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
        inet 127.0.0.1/8 scope host lo
           valid_lft forever preferred_lft forever
    2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast qlen 1000
        link/ether 06:a7:78:e5:c9:ed brd ff:ff:ff:ff:ff:ff
        inet 10.1.14.214/16 brd 10.1.255.255 scope global eth0
           valid_lft forever preferred_lft forever
    [Tue Jun 21 root@fogclient ~]# /usr/share/fog/lib/funcs.sh
    [Tue Jun 21 root@fogclient ~]# set |grep mac
    SHELLOPTS=braceexpand:emacs:hashall:histexpand:history:interactive-comments:monitor:posix
    

    I think the funcs.sh script doesn’t output anything mac-address-related into a variable. I looked in the script, and saw that it runs /sbin/ip which returns basically the same info as ip add show or ifconfig.

    Second run:

    [Tue Jun 21 root@fogclient ~]# ip add show
    1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue qlen 1
        link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
        inet 127.0.0.1/8 scope host lo
           valid_lft forever preferred_lft forever
    2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast qlen 1000
        link/ether 96:c1:60:13:b0:09 brd ff:ff:ff:ff:ff:ff
        inet 10.1.14.248/16 brd 10.1.255.255 scope global eth0
           valid_lft forever preferred_lft forever
    

    Third run:

    [Tue Jun 21 root@fogclient ~]# ip addr show
    1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue qlen 1
        link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
        inet 127.0.0.1/8 scope host lo
           valid_lft forever preferred_lft forever
    2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast qlen 1000
        link/ether ce:82:f1:bb:00:6c brd ff:ff:ff:ff:ff:ff
        inet 10.1.14.190/16 brd 10.1.255.255 scope global eth0
           valid_lft forever preferred_lft forever
    

    I think this confirms what I already saw: the mac address changes on each reboot :-/


  • Moderator

    @bmaster001 This is totally strange, I can’t understand why the mac address is coming from random places. I’m not going to rule out something strange going on in the FOS engine, but this hardware is the only one doing this (so far)

    What I’m going to suggest is that you update the kernel and inits on that usb stick. Just download the files using these urls
    https://fogproject.org/inits/init.xz
    https://fogproject.org/inits/init_32.xz
    https://fogproject.org/kernels/bzImage
    https://fogproject.org/kernels/bzImage32

    And replace the files in the /boot folder on the stick. This will put the latest kernels and inits on that stick.

    Once that is done boot from the usb stick and select the debug boot. That will drop you to a command prompt on the device.

    1. Then key in ip addr show and record the network mac addresses
    2. Key in . /usr/share/fog/lib/funcs.sh
    3. Key in set | grep mac
    4. Record the value
    5. Reboot
    6. Test 2 more times. See if the mac address is dynamic (for some reason).

    Hopefully some pattern will show up. The mac variable is what fog uses to identify the target to the FOG server.
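
To make step 1 easier to compare across reboots, the MAC can be pulled out of the `ip addr show` output with a small filter. This is only a sketch: `list_macs` is a hypothetical helper name, and it assumes awk is available in the debug shell.

```shell
#!/bin/sh
# Print "<interface> <mac>" for every Ethernet link found in `ip addr show`
# output read from stdin. list_macs is a hypothetical helper, not part of FOS.
list_macs() {
    awk '
        /^[0-9]+: /    { iface = $2; sub(/:$/, "", iface) }  # "2: eth0:" -> eth0
        /link\/ether/  { print iface, $2 }
    '
}

# Sample taken from the first run posted in this thread.
sample='1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue qlen 1
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast qlen 1000
    link/ether 06:a7:78:e5:c9:ed brd ff:ff:ff:ff:ff:ff
    inet 10.1.14.214/16 brd 10.1.255.255 scope global eth0'

result=$(printf '%s\n' "$sample" | list_macs)
echo "$result"
```

On the device itself the equivalent would be `ip addr show | list_macs`, noted down by hand or copied out over the ssh session after each boot.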

    I’ll tell you a secret for debugging the FOS engine. If you use ip addr show you can find the IP address of the target. Then if you set a password for the logged-in user (root) with passwd, you can use putty to ssh into the FOS engine. This makes it easier to copy/paste and take screenshots of what is going on in the FOS environment.



    • I’m now on version 8185, and pxe boot hangs again on “init.xz…ok”. Weird, I’m sure it worked yesterday.
    • Disabled uefi again, booted the fos usb stick, and noticed that the MAC address is totally different from before. It seems that the mac address is different after each reboot. That makes this fos image pretty hard to use for deployment/capture tasks with fog because it gives the “fatal error: unknown request type :: Null” error each time. @george1421: any idea why this happens? I tried starting from the stick in debug mode, updated the host on the server so that it matches the virtual mac, and ran ‘fog’ on the host, but that gave the same error. Maybe this is fixed when the real mac is used when booting … ?


  • I didn’t build the img myself. But apparently the newer kernels seem to help me a lot :-)

    The inventory task didn’t work either so I had to create the host manually in the fog server, and I used the MAC address I saw during network boot for that. No idea where the other address comes from. It seems to work though, when I ping from another pc to this device, I see that same mac address in my ARP table. Very strange. EDIT: If you want me to do more tests with your image, just let me know, I have this device on my desk for a few more days :-)

    • Next problem: there seems to be a problem with the NTFS partition on the device, so the capture doesn’t start. I have to boot to windows and run “chkdsk /f”. Can’t do that of course, since I want to capture the image as it is :-/

    • Quick registration and Full Host Registration still don’t work. After init.xz the screen goes black

    • memtest gives me “Exec format error”

    So we’re not entirely there yet…


  • Moderator

    @bmaster001 said in another init.xz issue:

    Ok, now I am very confused:

    1. Did you guys change something in the latest trunk version to make this thing boot correctly?

    Tom said that the kernels were updated last monday or tuesday to the latest release. My fos-usb.img file has the older kernels on it. If you built your own then it should have the latest kernels on it.

    I did wonder myself if the fos client was picking the wrong network adapter as compared to what was captured during inventory.



  • I already did a lookup for both addresses, and it’s the second one that’s strange because it couldn’t find a manufacturer:

    • 00:40:fd:0a:41:a8 is the one I see when network-booting, which is LXE. That’s correct, because the manufacturer of this VM3 device was called LXE a few years ago

    • f6:c5:25:ec:b1:ff is the address I see when booting from the usb stick. No vendor can be found for this one… maybe it’s something virtual?
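
One way to check the "something virtual" guess: in a MAC address, bit 0x02 of the first octet marks the address as locally administered, meaning it does not come from a vendor's registered block, so no vendor lookup will match. The sketch below (the function name is a hypothetical helper) tests that bit for addresses posted in this thread. A set bit is consistent with an address generated in software rather than read from the NIC, but it does not say why that happens on this device.

```shell
#!/bin/sh
# Return success (0) if the MAC's first octet has the "locally administered"
# bit (0x02) set. is_locally_administered is a hypothetical helper name.
is_locally_administered() {
    first_octet=${1%%:*}                 # "f6:c5:..." -> "f6"
    [ $(( 0x$first_octet & 2 )) -ne 0 ]
}

# Addresses posted in this thread: the PXE-boot one, the USB-boot one,
# and one of the per-reboot values seen in the debug shell.
for mac in 00:40:fd:0a:41:a8 f6:c5:25:ec:b1:ff 06:a7:78:e5:c9:ed; do
    if is_locally_administered "$mac"; then
        echo "$mac locally administered"
    else
        echo "$mac vendor-assigned (OUI lookup applies)"
    fi
done
```

Of the addresses listed, only 00:40:fd:0a:41:a8 has the bit clear, which matches the vendor lookup succeeding for that one alone.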


  • Developer

    @bmaster001 said:

    1. Did you guys change something in the latest trunk version to make this thing boot correctly?

    Possibly a newer kernel that does handle this hardware properly? @Tom-Elliott?

    1. Is it normal that a pxe boot gives me a different MAC address than an USB-boot ?

    Should not be the case if both are using the same NIC! Would you mind posting both MACs here so we can have a look. Usually MAC addresses have a vendor part and we might shed a light on this if we see the two different addresses.



  • Ok, now I am very confused:

    • First I updated to the latest trunk version
    • I tried to choose the first grub-menu item on the FOS stick, with and without a capture task in the server, and both failed with the same error
    • I tried running “fog” from within the debug menu item -> with the same error
    • then I noticed that the MAC address that I see with ifconfig is totally different from what I have in the fog server, so I created a new host on the server, with that MAC address, but that doesn’t help either
    • Then I removed the USB stick, and re-enabled UEFI boot in the bios to check the MAC address, and suddenly the network boot works?!
    1. Did you guys change something in the latest trunk version to make this thing boot correctly?
    2. Is it normal that a pxe boot gives me a different MAC address than an USB-boot ?

  • Developer

    @bmaster001 Great to hear you are making progress. If I remember correctly George and Tom have changed some code in the init files lately. Please upgrade to the very latest trunk version and see if you still get that fatal error: unknown request type :: Null thing.



  • I first tried the last menu entry, which is the debug mode. That’s when I wrote my previous “yes, progress!” post. Then I did schedule a capture task on the server, booted the host, selected the first menu item (capture/deploy), and then it gave me the “type null” error. At that point I added the update to my previous post.

    @george1421 As far as I remember, I didn’t downgrade the fog server… but I’ll make sure to check that on monday!

    I can’t test anything until after the weekend (I’m already home and don’t have access to this device of course), but if you guys can think of more things to try on monday, just let me know! Always happy to test things with this weird piece of hardware… and hopefully it can help someone with the same issues in the future!


  • Moderator

    @Tom-Elliott FWIW, on the grub menu there is the capture/deploy that should not give him debug mode. The very last menu entry in the grub menu IS debug. From what the OP describes (with the type–null) that task was not scheduled / ready for that specific host when FOS was booted (or something is going sideways with hostinfo.php).


  • Senior Developer

    @bmaster001 I don’t know what the value looks like that you’re selecting (as you’re using the FOGS-L USB System), but from the sounds of things you’re in the “debug” console? If that is the case, when you type fog, the init your host is currently booted into has no understanding of the parameters the normal PXE boot would hand out. To make it recognize things, you (still) need to schedule a tasking on the FOG GUI for that host.

    All you should need to do is schedule the tasking on the GUI and then type the command fog as you had before. The script will then attempt to make a request to the fog server to setup the parameters needed for that tasking.


  • Moderator

    @bmaster001 Hey that’s great. I didn’t have high expectations because of the class of computer it is. Yes, you need to schedule a capture task on the fog server and then boot the FOS client. We need to clean up the FOS client a bit so that, if someone forgets to schedule a capture/deploy before booting FOS, FOS waits. But that’s another issue.

    Now to the other issue. If you are using FOG r8050 or newer you should not get the type-null error as long as you schedule the task first in FOG and then boot the FOS client. The capture deploy step is the only action that requires a job to be scheduled on the fog server first. Your OP stated you are on 8099. Is that still accurate?



  • Finally, some progress! :-D
    When I disable UEFI Boot in the bios, and add ACPI=OFF to the kernel parameters, it boots from the FOS stick!

    So what’s next now? Do I use this method to capture/deploy this type of device? Or are there other steps we can take to make it network-boot?

    Update: When I choose the “FOG Image Deploy/Capture” GRUB entry (to which I added the ACPI=OFF parameter too), it halts with the error “fatal error: unknown request type :: Null” (with or without a task scheduled for this host). The “Quick Registration” entry does the same. “Client System Information” seems to work fine.


  • Moderator

    @bmaster001 said

    Just for clarity too: I put “has_usb_nic=1 mdraid=true” at the end of the line "linux $myimage loglevel=7 … " near the bottom of the file?

    Now that I look at this a second time, the answer is yes. The last menu entry is debug, so add these in that line. But I would try Sebastian’s suggestion of acpi=off first since that is where the process seems to stop. Once you can get into debug mode then you can add the settings that work to the menu entry 1 for capture / deploy.


  • Developer

    @bmaster001 Thanks for posting the picture. The messages about raid6 might seem strange on first sight but are actually quite normal. See here or various other dmesg outputs on pastebin…

    Hanging right after some ACPI messages is what I find interesting. Maybe try kernel parameters to turn off ACPI altogether to see if it makes any difference. See here for a list of all the different kernel parameters. I’d start by trying acpi=off


  • Moderator

    @bmaster001 said in another init.xz issue:

    @george1421 said in another init.xz issue:

    Just for clarity, if you are booting with the FOS USB stick, you need to update the image args in the /boot/grub/grub.cfg file. The FOG server is not part of the booting process at this level.

    Just for clarity too: I put “has_usb_nic=1 mdraid=true” at the end of the line "linux $myimage loglevel=7 … " near the bottom of the file?

    Sorry I should have been a bit more descriptive, I knew exactly what I was talking about.

    When you boot off the USB stick using the image I sent you, the booting kernel only looks at that usb drive for settings. So adding things into the fog console like mdraid=true will not make it into the booting image. You need to update the /boot/grub/grub.cfg file and add those parameters into the capture deploy line.

    Such as in this example.

    menuentry "1. FOG Image Deploy/Capture" {
     echo loading the kernel
     linux  $myimage loglevel=$myloglevel initrd=init.xz root=/dev/ram0 rw ramdisk_size=127000 keymap= web=$myfogip/fog/ boottype=usb consoleblank=0 rootfstype=ext4 has_usb_nic=1 mdraid=true 
     echo loading the virtual hard drive
     initrd $myinits
     echo booting kernel...
    }
    


  • Two updates:

    • Switching between AHCI and IDE in the bios doesn’t make any difference when I try the capture task.
    • This device is starting to annoy me. The usb connector on top doesn’t seem to work anymore now (neither usb keyboard nor usb stick is recognised). The only other usb connection still works, but then I have to choose between USB stick and USB keyboard :-/ This thing also has a built-in ups, so removing the power to really reset it is not so easy: I have to remove a couple of screws to reach a reset button. Ugh. I think I’m gonna leave it disconnected from the power during the night, and retry tomorrow morning.

    Thanks again for the help guys, I’ll get back tomorrow with the results of the FOS usb stick… I hope.

