ipxe chain boot.php permission denied on pxe but not autoboot
-
@DBCountMan I’m going to guess that you enabled HTTPS on your server without using the FOG installer. The permission denied message usually comes from the iPXE client not having a certificate that matches the one the Apache server has, so it fails to boot. When you use the FOG installer to create the HTTPS configuration, it should recompile the iPXE binaries with the certificate.
-
@george1421 I did run the FOG installer using HTTPS and it had been working up until yesterday. If iPXE doesn’t have the correct cert, then why would iPXE still load properly after running the autoboot command at the iPXE shell? Should I re-run the installer anyway?
I used wireshark and this is the result right after the ipxe.efi file is downloaded. .138 is the client device, .59 is the fog server.
-
@DBCountMan I’m going to repeat what I’ve previously said a bit differently.
This error is typically because the certificate in iPXE (if it exists) is different than the certificate on the server. This has to do with the https protocol.
The boot process goes like this:
PXE ROM: DHCP to collect pxe boot info over udp port 67
PXE ROM: TFTP download of iPXE boot loader udp port 69
iPXE: DHCP to collect pxe boot info so iPXE knows where to find the FOG server udp port 67
iPXE: TFTP Download of default.ipxe udp port 69
iPXE: default.ipxe script chain loads https://...boot.php over port 443. This is the first interaction of iPXE and the Apache web server.
So the question is, did the certificate in Apache change the day before yesterday for some reason, or did possibly ipxe.efi/snp.efi change two days ago? Something has changed in your environment.
-
@george1421 As far as the part of the environment I have control over, no. Not sure if there was a network change. However that being said nothing changed with apache. I ran the installer again using the -S option and it completed but the issue persists. Just seems odd that it won’t chain during the first attempt but works after running the autoboot command. Can I PM you a webm video recording of the vm I’m testing with? It is about 400KB.
EDIT: I added the fog server ip to the apache2.conf file at the bottom of the file as “ServerName <fogip>” and got rid of that AH00558 error message, but it didn’t fix the issue. Also autoboot doesn’t always work the first try, eventually it does though.
apache2ctl -S
AH00558: apache2: Could not reliably determine the server's fully qualified domain name, using <fogserverhostname>. Set the 'ServerName' directive globally to suppress this message
VirtualHost configuration:
*:80 <fogserverip> (/etc/apache2/sites-enabled/001-fog.conf:1)
*:443 <fogserverip> (/etc/apache2/sites-enabled/001-fog.conf:16)
ServerRoot: "/etc/apache2"
Main DocumentRoot: "/var/www/html"
Main ErrorLog: "/var/log/apache2/error.log"
Mutex watchdog-callback: using_defaults
Mutex rewrite-map: using_defaults
Mutex ssl-stapling-refresh: using_defaults
Mutex ssl-stapling: using_defaults
Mutex proxy: using_defaults
Mutex ssl-cache: using_defaults
Mutex default: dir="/var/run/apache2/" mechanism=default
Mutex mpm-accept: using_defaults
PidFile: "/var/run/apache2/apache2.pid"
Define: DUMP_VHOSTS
Define: DUMP_RUN_CFG
User: name="www-data" id=33
Group: name="www-data" id=33
/etc/apache2/sites-enabled/001-fog.conf
<VirtualHost *:80>
    <FilesMatch "\.php$">
        SetHandler "proxy:fcgi://127.0.0.1:9000/"
    </FilesMatch>
    KeepAlive Off
    ServerName <fogserverip>
    ServerAlias <fogserverhostname>
    DocumentRoot /var/www/
    RewriteEngine On
    RewriteCond %{REQUEST_METHOD} ^(TRACE|TRACK)
    RewriteRule .* - [F]
    RewriteRule /management/other/ca.cert.der$ - [L]
    RewriteCond %{HTTPS} off
    RewriteRule (.*) https://%{HTTP_HOST}/$1 [R,L]
</VirtualHost>
<VirtualHost *:443>
    KeepAlive Off
    <FilesMatch "\.php$">
        SetHandler "proxy:fcgi://127.0.0.1:9000/"
    </FilesMatch>
    ServerName <fogserverip>
    ServerAlias <fogserverhostname>
    DocumentRoot /var/www/
    SSLEngine On
    SSLProtocol all -SSLv3 -SSLv2
    SSLCipherSuite ECDHE-RSA-AES128-GCM-SHA256:ECDHE-ECDSA-AES128-GCM-SHA256:ECDHE-RSA-AES256-GCM-SHA384:ECDHE-ECDSA-AES256-GCM-SHA384:DHE-RSA-AES128-GCM-SHA256:DHE-DSS-AES128-GC>
    SSLHonorCipherOrder On
    SSLCertificateFile /var/www/fog//management/other/ssl/srvpublic.crt
    SSLCertificateKeyFile /opt/fog/snapins/ssl//.srvprivate.key
    SSLCACertificateFile /var/www/fog//management/other/ca.cert.pem
    <Directory /var/www/fog/>
        DirectoryIndex index.php index.html index.htm
    </Directory>
    RewriteEngine On
    RewriteCond %{REQUEST_METHOD} ^(TRACE|TRACK)
    RewriteRule .* - [F]
    RewriteCond %{DOCUMENT_ROOT}/%{REQUEST_FILENAME} !-f
    RewriteCond %{DOCUMENT_ROOT}/%{REQUEST_FILENAME} !-d
    RewriteRule ^/fog/(.*)$ /fog/api/index.php [QSA,L]
</VirtualHost>
-
@DBCountMan said in ipxe chain boot.php permission denied on pxe but not autoboot:
SSLCertificateFile /var/www/fog//management/other/ssl/srvpublic.crt
SSLCertificateKeyFile /opt/fog/snapins/ssl//.srvprivate.key
SSLCACertificateFile /var/www/fog//management/other/ca.cert.pem
Let’s start by inspecting these files. Has the file date changed?
If you use SSL and these are self-signed certificates, the web browser should show a red mark in the address bar to indicate that there is something wrong with the SSL certificate. You should be able to inspect the certificate from the browser; let’s make sure the expiry date has not been reached. An expired certificate would also cause this issue.
EDIT: This site shows how to check a certificate expiry date from the FOG server Linux console: https://computingforgeeks.com/how-to-check-ssl-certificate-expiration-with-openssl/
If everything looks good on the certificate side, then let’s go and rebuild iPXE. That should recreate iPXE with the properly installed certificate.
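For the expiry check described above, a minimal sketch using openssl's `-enddate` and `-checkend` options might look like the following. The function names `cert_end_date` and `cert_valid_for_days` and the 30-day window are illustrative assumptions, not something FOG ships; the certificate path in the usage comment is the one from the Apache config quoted in this thread.

```shell
#!/bin/sh
# Hypothetical helper (not part of FOG): print the certificate's expiry
# date in the form "notAfter=<date>".
cert_end_date() {
    openssl x509 -in "$1" -noout -enddate
}

# Hypothetical helper (not part of FOG): succeed if the certificate in $1
# is still valid $2 days from now; fail if it expires sooner or cannot be read.
cert_valid_for_days() {
    cert=$1
    days=$2
    # -checkend takes seconds and makes openssl exit non-zero when the
    # certificate expires within that window.
    openssl x509 -in "$cert" -noout -checkend "$((days * 86400))" >/dev/null 2>&1
}

# Example usage (path assumed from the Apache config in this thread):
# cert_end_date /var/www/fog/management/other/ssl/srvpublic.crt
# cert_valid_for_days /var/www/fog/management/other/ssl/srvpublic.crt 30 \
#     || echo "certificate expires within 30 days (or could not be read)"
```

Running the same check against the CA file as well would probably be worthwhile, since the server certificate and the CA certificate carry different validity periods.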
-
@george1421 Since I re-ran the fog installer using the -S switch, it rebuilt ipxe and the corresponding certs were regenerated. The file dates themselves are marked with today’s date when they were recreated.
openssl x509 -in /var/www/fog//management/other/ssl/srvpublic.crt -text -noout
Certificate:
    Data:
        Version: 3 (0x2)
        Serial Number: *redacted*
        Signature Algorithm: sha256WithRSAEncryption
        Issuer: CN = FOG Server CA
        Validity
            Not Before: Aug 23 12:18:57 2023 GMT
            Not After : Aug 20 12:18:57 2033 GMT
        Subject: CN = <fogserverip>
openssl x509 -in /var/www/fog//management/other/ca.cert.pem -text -noout
Certificate:
    Data:
        Version: 3 (0x2)
        Serial Number: *redacted*
        Signature Algorithm: sha512WithRSAEncryption
        Issuer: CN = FOG Server CA
        Validity
            Not Before: Apr 27 20:15:47 2022 GMT
            Not After : Apr 24 20:15:47 2032 GMT
        Subject: CN = FOG Server CA
-
@DBCountMan And do the files in /tftpboot have today’s date too? I was kind of hoping to catch things in a broken state to understand the symptom vs. the cure.
-
@george1421 I probably shouldn’t have re-run the installer but I wanted to see if that would fix it.
ls /tftpboot -lah
total 6.5M
drwxr-xr-x 5 fogproject root 4.0K Aug 23 09:13 .
drwxr-xr-x 29 root root 4.0K Aug 23 08:24 ..
drwxr-xr-x 4 fogproject root 4.0K Mar 16 2021 10secdelay
drwxr-xr-x 2 fogproject root 4.0K Mar 16 2021 arm64-efi
-rw-r-xr-x 1 fogproject root 39 Aug 23 09:13 autoexec.ipxe
-rw-r-xr-x 1 fogproject root 868 Apr 27 2022 boot.txt
-rw-r-xr-x 1 fogproject root 459 Aug 23 08:36 default.ipxe
-rw-r-xr-x 1 fogproject root 454 Mar 16 2021 default.ipxe.bak
-rw-r-xr-x 1 fogproject root 458 Jun 10 2021 default_usb.ipxe
drwxr-xr-x 2 fogproject root 4.0K Mar 16 2021 i386-efi
-rw-r-xr-x 1 fogproject root 270K Aug 23 08:36 intel.efi
-rw-r-xr-x 1 fogproject root 103K Aug 23 08:36 intel.kkpxe
-rw-r-xr-x 1 fogproject root 103K Aug 23 08:36 intel.kpxe
-rw-r-xr-x 1 fogproject root 103K Aug 23 08:36 intel.pxe
-rw-r-xr-x 1 fogproject root 1.1M Aug 23 08:36 ipxe.efi
-rw-r-xr-x 1 fogproject root 888K Aug 23 08:36 ipxe.iso
-rw-r-xr-x 1 fogproject root 363K Aug 23 08:36 ipxe.kkpxe
-rw-r-xr-x 1 fogproject root 363K Aug 23 08:36 ipxe.kpxe
-rw-r-xr-x 1 fogproject root 362K Aug 23 08:36 ipxe.krn
-rw-r-xr-x 1 fogproject root 362K Aug 23 08:36 ipxe.lkrn
-rw-r-xr-x 1 fogproject root 363K Aug 23 08:36 ipxe.pxe
-rw-r-xr-x 1 fogproject root 400K Aug 23 08:36 ipxe.usb
-rw-r-xr-x 1 fogproject root 26K Aug 23 08:36 memdisk
-rw-r-xr-x 1 fogproject root 301K Aug 23 08:36 ncm--ecm--axge.efi
-rw-r-xr-x 1 fogproject root 268K Aug 23 08:36 realtek.efi
-rw-r-xr-x 1 fogproject root 104K Aug 23 08:36 realtek.kkpxe
-rw-r-xr-x 1 fogproject root 104K Aug 23 08:36 realtek.kpxe
-rw-r-xr-x 1 fogproject root 104K Aug 23 08:36 realtek.pxe
-rw-r-xr-x 1 fogproject root 268K Aug 23 08:36 snp.efi
-rw-r-xr-x 1 fogproject root 268K Aug 23 08:36 snponly.efi
-rw-r-xr-x 1 fogproject root 102K Aug 23 08:36 undionly.kkpxe
-rw-r-xr-x 1 fogproject root 102K Aug 23 08:36 undionly.kpxe
-rw-r-xr-x 1 fogproject root 102K Aug 23 08:36 undionly.pxe
-rw-r-xr-x 1 fogproject root 51K Mar 17 2021 wimboot
-
Found something interesting. When I first boot the vm and get the initial error, then drop to prompt, I run certstat. The first certstat shows this:
The address ending in .59 is the FOG server. Now look at certstat after a successful chain of boot.php:
So I don’t know where that “5e…c9” CA certificate is from or why it appears initially. Could it explain the initial permission denied, because the certificate doesn’t match? BTW the “81…0c” is the SHA fingerprint of the cert that is actually being used; I matched it with the cert issued to my browser.
-
@DBCountMan First let me say this is a new one, that I’ve never seen before. So the rest of this is a lot of pure guessing.
If we reference the ipxe documentation https://ipxe.org/cmd/certstat for certstat something jumps out at me. The definition of permanent:
[PERMANENT] The certificate was embedded into iPXE at build time.
This is a certificate that was added when iPXE was compiled. In the case that does not work, the permanent CA certificate has an ID of 5e…c9. In the case that works, the permanent one is 81…0c (which is also what your browser is reporting).
So if we build a truth table on this, it suggests that you might have two iPXE boot loaders in play here (because we are seeing two different certificates). So the question is, how can we tell?
Ideas to try from the iPXE console:
- Check whether you have multiple DHCP servers responding here. There should be a way to see DHCP options 66 and 67.
- See if there is a way to find the boot loader name, version number or build number, to tell whether a second iPXE boot loader is in play.
- Between the working and non-working cases, is the platform different (UEFI vs. BIOS)?
-
@DBCountMan When you re-run the installer, the iPXE buildscript is called and should use
/opt/fog/snapins/ssl/CA/.fogCA.pem
to embed into the compiled binary (code ref). That file is copied to
/var/www/fog/management/other/ca.cert.pem
by the installer, but only on the very first install run or if you force it to. Don’t do that without knowing exactly what you are doing and having backup copies of all the files!! (code ref).
I suggest you compare the fingerprints of those two files. They should be identical:
openssl x509 -in /opt/fog/snapins/ssl/CA/.fogCA.pem -noout -fingerprint
openssl x509 -in /var/www/fog/management/other/ca.cert.pem -noout -fingerprint
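The comparison of the two fingerprints can also be scripted so the result does not depend on eyeballing two long hex strings. The sketch below is only an illustration: `cert_fingerprint` and `same_cert` are hypothetical helper names, and it uses the default digest of `openssl x509 -fingerprint`, the same as the commands above.

```shell
#!/bin/sh
# Hypothetical helper: print only the hex fingerprint of the certificate in $1.
cert_fingerprint() {
    # openssl prints something like "<digest> Fingerprint=AA:BB:..."; keep the part after "=".
    openssl x509 -in "$1" -noout -fingerprint 2>/dev/null | cut -d= -f2
}

# Hypothetical helper: succeed only if both files are readable certificates
# with identical fingerprints. An unreadable file counts as a mismatch.
same_cert() {
    fp1=$(cert_fingerprint "$1")
    fp2=$(cert_fingerprint "$2")
    [ -n "$fp1" ] && [ "$fp1" = "$fp2" ]
}

# Example usage with the two paths named above:
# if same_cert /opt/fog/snapins/ssl/CA/.fogCA.pem /var/www/fog/management/other/ca.cert.pem; then
#     echo "CA certificates match"
# else
#     echo "CA certificates differ (or could not be read)"
# fi
```

If more than one FOG server is involved, the same helper could be pointed at a copy of the other server's CA file to see whether the two installations share a CA.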
-
@Sebastian-Roth @george1421
Guess what? The key ID 5e…c9 belongs to my secondary FOG server. Somehow it was getting mixed into the PXE request chain. My DHCP server is configured to send PXE requests to my primary FOG server only, and I have no reference to my secondary server anywhere in the TFTP configuration. Very strange. I use dnsmasq because for some reason tftpd didn’t play nice with our DHCP server. I had dnsmasq running on the secondary server as well, and so it was somehow receiving PXE requests. I disabled the service on the secondary FOG server and now PXE works flawlessly.
The only reason I left dnsmasq running on the secondary was for redundancy. I replicate my primary to my secondary. Any time I use the USB boot method outside the subnet of any FOG server, the iPXE boot process asks for an IP, so I thought I could have the option to boot from the secondary in case the primary was out of service. I also incorrectly thought that dnsmasq was required, but since the USB boot method takes care of iPXE, it’s just an HTTPS connection from there.
So again I thank you guys for helping me figure this out.
-
@DBCountMan Now that you know the root of the problem, you could bring everything back together by syncing the certificates and iPXE boot files from your primary FOG server to your secondary FOG server. The issue, as you found, is that there are two different certificates on your campus.
-
@george1421 That would be the next step. When I said “replicate” I meant that I have a cron sync from primary to secondary every night, not instantly. So let’s say I set up both servers to accept iPXE requests, and a field tech loads a new image on the primary at 10am and wants to deploy it to 20 PCs in the field. Then the primary FOG server goes down. The secondary FOG server won’t have the image until midnight. So there’s one issue, unless I set cron to sync every hour or every 30 minutes. Another issue I found with rsync, as much as I love it, is that if the primary server goes offline, the /images NFS share from FOG1 that is mounted on the secondary will appear empty, and rsync will sync an empty directory to the secondary’s own /images, thus wiping out that folder. I read that rsync can be tuned to not sync if a directory is empty, but I have to research and test.
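One way to guard against the empty-source case is to wrap rsync in a check that refuses to run when the source directory has nothing in it. This is a rough sketch only: `dir_has_content` and `safe_sync` are hypothetical names, the paths in the usage comment are placeholders, and it assumes the nightly job uses something like `rsync -a --delete` (without `--delete`, rsync would not remove files on the destination).

```shell
#!/bin/sh
# Hypothetical helper: succeed only if $1 is a directory containing at least one entry.
dir_has_content() {
    [ -d "$1" ] && [ -n "$(ls -A "$1" 2>/dev/null)" ]
}

# Hypothetical wrapper: skip the sync when the source looks empty or missing,
# for example because the NFS share from the primary is not available.
safe_sync() {
    src=$1
    dst=$2
    if ! dir_has_content "$src"; then
        echo "source $src is empty or missing, skipping sync" >&2
        return 1
    fi
    # Assumed rsync options; adjust to match the real cron job.
    rsync -a --delete "$src"/ "$dst"/
}

# Example usage (both paths are placeholders, not taken from this thread):
# safe_sync /mnt/fog1-images /images
```

A stricter variant could also test the mount itself with `mountpoint -q` on the source path before syncing, so that an unmounted share is caught even if stray files exist in the underlying directory.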