[Proof of Concept] Combining {FoG & XCP-NG} for a Zero-Touch Bulletproof Classroom Deployment!
-
Hi All,
Are you interested in solving a common worldwide “lazy IT” problem (zero-touch machine control), or in following how it can be solved? Then continue reading! Note: to my knowledge, no such software is available. If there is, please comment with details about the software and I will check it out.
Intro:
I’m a system administrator for a company that teaches instructor-led IT courses for all kinds of vendors, such as Cisco, Citrix, VMware, Red Hat, Microsoft and many more. We try to avoid the cost of “virtual labs” by building our own images to deploy to classrooms (following the vendor’s instructions, we rebuild the cloud OS machines and deploy them on a powerful physical machine). Our classrooms have high-end IT hardware. As you can imagine, we have a huge library of images (currently Altiris .gho files) that we can deploy.
But every Friday, when imaging to be ready for the new course starting the following Monday, we encounter “lazy IT problems” that increase the time we need to spend restoring a lot of classrooms (multiple external locations, each with multiple classrooms).
A default course lasts 5 days, so we deploy the corresponding images to the systems every Friday (after 17:00). The images vary from Linux to Windows Server editions running Hyper-V with multiple images inside (for example, for a Microsoft Domain Controller course).
Our “lazy IT problems”:
We want a bulletproof classroom: we want to fully control the classroom systems, including booting them up and shutting them down. Yes, we have working WoL! But that didn’t solve our problem… <-- Soooo many “we want”s, I know. Lazy…
Because a lot of courses use Hyper-V. Hyper-V breaks WoL: it hijacks ownership of the NIC, so WoL stops working. We need to unplug the machine from power and plug it back in; then WoL works until the machine loads an OS with Hyper-V again. That means we need to walk to the classrooms and/or ask the receptionist to boot up the machines.
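For readers less familiar with WoL, here is a minimal Python sketch of what a wake-up actually sends: a "magic packet" made of six 0xFF bytes followed by the target MAC address repeated 16 times, usually broadcast over UDP. The function names, the example MAC and the choice of UDP port 9 are assumptions for illustration only; this is not the tooling used in our classrooms.

```python
import socket


def build_magic_packet(mac: str) -> bytes:
    """Build the 102-byte WoL magic packet for a MAC like 'aa:bb:cc:dd:ee:ff'."""
    hex_digits = mac.replace(":", "").replace("-", "")
    if len(hex_digits) != 12:
        raise ValueError(f"not a 48-bit MAC address: {mac!r}")
    mac_bytes = bytes.fromhex(hex_digits)  # raises ValueError on non-hex input
    # Six 0xFF sync bytes, then the MAC address repeated 16 times.
    return b"\xff" * 6 + mac_bytes * 16


def send_wol(mac: str, broadcast: str = "255.255.255.255", port: int = 9) -> None:
    """Broadcast the magic packet on the local network (port 9 is an assumed default)."""
    packet = build_magic_packet(mac)
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_BROADCAST, 1)
        sock.sendto(packet, (broadcast, port))
```

Calling `send_wol("00:11:22:33:44:55")` would broadcast the packet. Whether the machine wakes up still depends on the NIC being left armed for WoL at shutdown, which is exactly what Hyper-V prevents in our case.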
Idea!
Then I had one crazy idea: combine FoG with XCP-NG Server (a.k.a. Citrix Hypervisor, XenServer) ^0^. Wait, what? What is XCP? It’s a free version of XenServer with all features enabled. -> https://xcp-ng.org/
XCP-NG is not intended for classroom deployments… I know! But hear me out on how I want to use this open-source combination to create a zero-touch deployment, even when the hard drive is wiped, including the XCP-NG host installation, which would normally make us lose control of the machine! (I know it sounds insane, but bear with me, a lot of typing is required to explain this!)
[XCP-NG limitations]:
*When XCP-NG is installed, it displays the “server console” on screen.
First dilemma: how do we show the end user the guest VM’s screen instead of the console displayed in the image below?
Solution: the vGPU passthrough function. We send the VM’s output to the graphics card by attaching the graphics card to the VM. Solved, we see the OS screen!
Second dilemma: the USB ports do not work, so mouse and keyboard are not usable.
Solution: almost the same, but we use USB passthrough. There is also a function to hot mount/dismount USB devices to VMs, but for now we pre-configure USB passthrough.
OK, we have now proven that end users can use a guest VM on the same physical machine where the XCP-NG host is installed LOL ;0
How do we recover the XCP-NG host if the disk is formatted for any reason?
Well, my idea is to let the host machine boot from PXE by default and load our “customised” XCP-NG netinstaller, including an answer file that contains the configuration.
I was planning on editing the netinstaller code to:
Check disk -> partition with XCP-NG host found?
  Yes -> boot from disk
  No -> start the XCP-NG host netinstaller, recovering the XCP-NG host.
Thus making it zero-touch bulletproof, correct? We can fully use WoL or the XCP-NG tools*
*- XCP-NG Center (Installer)
- Xen Orchestra (Web GUI)
- API commands (FoG plugin, if possible)
to remote control it.
And what has FoG to do with it?
Well, FoG is insanely faster when PXE-installing a VM than importing and mounting the image file.
So I wanted to create a plugin for FoG that can send API commands to any XCP-NG host to create an empty VM. When creating the VM template, we can generate a MAC for the VM NIC and boot it from PXE.
Because we generate the MAC in the template inside the FoG web server, we can internally schedule a FoG deploy task targeting the guest MAC address (the one we generated when creating the template) and install our classroom image.
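To make the MAC generation step concrete, here is a minimal Python sketch, assuming the plugin generates a random, locally administered unicast MAC so it cannot collide with a vendor-assigned address. The function names and the record layout (`name`, `mac`, `image`) are hypothetical; they do not reflect the real FoG or XCP-NG APIs, which the plugin would still have to call with this data.

```python
import secrets


def generate_vm_mac() -> str:
    """Return a random locally administered, unicast MAC address."""
    octets = bytearray(secrets.token_bytes(6))
    # Set the "locally administered" bit (0x02) and clear the multicast bit (0x01).
    octets[0] = (octets[0] | 0x02) & 0xFE
    return ":".join(f"{octet:02x}" for octet in octets)


def plan_vm_deployment(vm_name: str, image_name: str) -> dict:
    """Hypothetical record tying one generated MAC to both the VM and the deploy task."""
    return {"name": vm_name, "mac": generate_vm_mac(), "image": image_name}
```

The point of generating the MAC up front is that the same value can be handed to the hypervisor when the empty VM is created and to FoG when the deploy task is queued, so the PXE-booting guest is recognised on its first boot.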
What do I think we achieve if this works 100%? The first ever zero-touch classroom deployment.
This workaround exists purely for one thing: Hyper-V breaking WoL when Hyper-V is installed on the host OS.
If someone now comes up with a solution to get WoL working on a Hyper-V host machine, I will fall off my chair.
So what do you think? I’m halfway with the PoC. Next step is to customize the net-installer to decide whether to reinstall or boot from disk.
Cheers!
Mokerhamer -
@Mokerhamer Nice post! I am still not sure I get the full story, but I’ll just engage and put in my thoughts. We heavily use XCP-NG at work and I kind of like seeing you make extended use of this great project in combination with FOG.
Reading a bit about the “Hyper-V kills WOL” topic I found this technically interesting forum topic where it says that a graphics driver is preventing Windows from being able to shut down the server in a way that WOL works. And here is another one that states that you only need to set a registry key to get this to work.
Kind of hope that this is not breaking the whole idea of your intention to use XCP-NG…
-
Thanks for the reply!
I’ve checked out the links you gave. Unfortunately, they only work with older Hyper-V versions and not the latest one that we need to use.
The first link details the problem: “Wake-on-lan is no longer available because the Hyper-V hypervisor is now the owner of the physical NIC, not the OS installed (the management OS).”
So we still don’t have working WoL with Hyper-V. Continuing with XCP-NG.
If this initially works, I think it will help so many system administrators around the world. So I am curious whether I worded it correctly.
I wouldn’t mind if someone helped with editing the XCP-NG netinstaller to check whether XCP is already installed (yes/no).
Hope this thread gets more attention and support, though.
Cheers,
Mokerhamer -
@Mokerhamer said in [Proof of Concept] Combining {FoG & XCP-NG} for a Zero-Touch Bulletproof Classroom Deployment!:
The first link details the problem “Wake-on-lan is no longer available because the Hyper-V hypervisor is now the owner of the physical NIC, not the OS installed (the management OS).”
Obviously you have not read both links from top to bottom. Read the latter one first and try out the registry key to see if it helps.
-
@Sebastian-Roth I’ve tested it before. What I learned: from Windows Server 2012 onwards (our minimum is Windows Server 2016), WoL dies when using Hyper-V, with no known workarounds. I’ve spent months trying to solve it.
And I am 100% sure it’s not “Server” related, since we reproduced it with a clean Windows 10 installation (with the Hyper-V role installed) and WoL didn’t work anymore. We’ve tried the registry keys and other advice found on the internet too. No result :S The best we could achieve is WoL for a guest VM.
After months of breaking my head over this, this idea popped up.
-
@Mokerhamer OK, I see you have dug down this path and it’s a dead end. That is not something I could figure out in just a quick search on the web.
I was planning on editing the netinstaller code to:
Check Disk -> Partition with XCP-NG Host Found?
I can imagine this is possible without modifying the official code. If you use GRUB2 as the PXE bootloader, you should be able to take a look at the local disk (GRUB can do this, while pxelinux and iPXE cannot!) and decide whether to boot locally or start the netinstaller.
Let’s just take it one step at a time…
-
@Mokerhamer I’ve had a play with GRUB to give you a starting point. Run the following commands on your FOG server and hopefully you should be able to get a PXE booting GRUB up in no time:
sudo -i
# GRUB modules for BIOS and UEFI clients
apt install grub-pc-bin grub-efi-amd64-bin
mkdir /tftpboot/grub
# Copy the module directories so GRUB can load them over TFTP
rsync -av /usr/lib/grub/i386-pc /tftpboot/grub
rsync -av /usr/lib/grub/x86_64-efi /tftpboot/grub
# Build the PXE (BIOS) and UEFI network boot images
grub-mkimage -d /usr/lib/grub/i386-pc/ -O i386-pc-pxe -o /tftpboot/grub.pxe -p '(tftp)/grub' pxe tftp
grub-mkimage -d /usr/lib/grub/x86_64-efi/ -O x86_64-efi -o /tftpboot/grubx64.efi -p '(tftp)/grub' efinet tftp
You also need a config in /tftpboot/grub/grub.cfg, for example:
set timeout=5
insmod part_msdos
if [ -f (hd0,msdos1)/boot/xen.gz ]; then
  menuentry 'localboot' {
    search --label --set root root-bdtixc
    multiboot2 /boot/xen.gz com1=115200,8n1 dom0_mem=1024,max:1024 watchdog ucode=scan dom0_max_vcpus=1-4 crashkernel=256M,below=4G console=vga vga=mode-0x0311 nmi=ignore
    module2 /boot/vmlinuz-4.19.58-xen root=LABEL=root-bdtixc ro nolvm hpet=disable xencons=hvc console=hvc0 console=tty0 vga=785 plymouth.ignore-serial-consoles nmi=ignore
    module2 /boot/initrd-4.19.58-xen.img
  }
else
  menuentry 'install' {
    multiboot2 xcp-ng/xen.gz dom0_max_vcpus=1-2 dom0_mem=1024M,max:1024M com1=115200,8n1 console=vga
    module2 xcp-ng/vmlinuz xencons=hvc console=hvc0 console=tty0 answerfile=http://192.168.2.10/fog/xcp-ng.xml install
    module2 xcp-ng/install.img
  }
fi
This is not fully tested but it should work somewhere along these lines. I definitely got as far as PXE booting into the automated installer which took my answer file and started installing. Though this is all running in a VirtualBox test environment it somehow crashed along the way when installing. I guess this is just due to my setup. Give it a try and see if you can get it to run this way.
-
@Sebastian-Roth Thank you so much! We will work on it and I will report back ASAP.