@Tom-Elliott said in unable to deploy RAID 1 disk:
@george1421 Correction, it will only work for 1.4.1
Then corrected, I do stand.
OK, we have a functional fix in place now. This fix will only work for FOG 1.4.0 and 1.4.1. You will need to go to where you installed FOG from. For git installs it may be /root/fogproject, and for svn it may be /root/fog_trunk or wherever. The idea is that there are binariesXXXXXX.zip files there. Remove all of those files and the FOG installer will download what it needs again. There will be one binariesXXXXX.zip for each version of FOG you installed.
Once those files are removed, rerun the installer with its default values (you already configured). This will download the updated kernels and inits from the FOG servers.
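The cleanup step can be sketched like this. I'm demoing it in a scratch directory rather than a real install tree, and the file names are just examples standing in for whatever binariesXXXXXX.zip files you actually have:

```shell
# Scratch directory standing in for your FOG source directory
# (e.g. /root/fogproject or /root/fog_trunk).
src=$(mktemp -d)
touch "$src/binaries1.4.0.zip" "$src/binaries1.4.1.zip" "$src/installfog.sh"

# Delete every cached binaries zip; the installer will re-download
# fresh kernels and inits on the next run.
rm -f "$src"/binaries*.zip

ls "$src"    # only installfog.sh should remain
```

After the real cleanup you would just rerun the installer from its bin directory and accept the defaults you already configured.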
Now PXE boot your target computer with the RAID using a debug deploy [or capture, depending on what you wanted to do] like was done before. Once you are at the FOS command prompt, key in cat /proc/mdstat
If md126 now says (auto-read-only) then you win!! If it still says (read-only) then you might not have the most current inits. We will deal with that once we see the output of cat /proc/mdstat.
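To make the distinction concrete, here is how the two state flags differ. The sample line below is illustrative, not taken from anyone's real /proc/mdstat:

```shell
# (auto-read-only) = healthy; the array flips to read-write on first write.
# (read-only)      = stuck; usually means the inits are outdated.
line='md126 : active (auto-read-only) raid1 sdb[1] sda[0]'   # sample line

case "$line" in
  *"(auto-read-only)"*) echo "auto-read-only: you win" ;;
  *"(read-only)"*)      echo "read-only: inits likely outdated" ;;
esac
```

Note the order of the patterns matters, since "auto-read-only" would also loosely match a sloppy "read-only" check.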
I was able to deploy an image to a test system in the lab using the intel raid so I know it does work with the new inits.
I agree, upgrade off of 1.3.5.x. There was a nasty resize bug that was addressed in 1.4.0rc1.
@aparker said in Replicating images to other FOG servers:
1.) All three existing FOG servers are “normal” installs as opposed to “storage” installs. Can I make this work as-is? Or do they need to be converted in some way to a storage install?
We have our environment set up similar to this. We have a dev environment with a full FOG server, and a production environment with a full FOG server plus storage nodes. The dev FOG server replicates to the production FOG server similar to how a full FOG server replicates to a storage node. The difference in this dev -> production setup is that each full FOG server has its own SQL database, whereas a full FOG server and storage node setup has only one database, on the full FOG server.
The manual bit that Wayne talked about is that you need to manually export the image definitions from your source FOG server and import them into your destination FOG server. As long as you don’t add new images to your root FOG server, you only need to do this once. You can update the image files and they will replicate; as long as they use the same image definitions, you don’t need to touch anything. A future release of FOG may automate this process, but for today it’s manual.
2.) What’s the storage node setup look like? I haven’t been able to find much documentation on what exactly needs to be done here. It appears that each FOG server has a “default” storage group and a “DefaultMember” storage node. Would I just add the two downstream FOG servers as storage nodes?
Short answer, yes. In this storage group (collection of fog servers) you will have one master node (root) and all other traditional FOG servers as slaves.
3.) We want to make sure that no capture or image traffic goes across the WAN. This is indeed possible with the location plug-in, correct?
Correct. You will create locations, assign FOG servers to the location, and assign the target computers to that location. During registration you will select the location that the target will talk to for deployment. Captures always go to the full FOG server, never to storage nodes. This is normal. BUT in your situation you have 3 standalone FOG servers. The slaves and the root FOG server really don’t know about each other; the replicator does. So in your setup nothing special needs to be done. Each site will have its own DHCP server pointing to its local FOG server, so no change here.
We’re currently running FOG 1.3.4. on all FOG servers (and I’m not opposed to upgrading if there is a compelling reason to do so).
Upgrading is always advised. There were a few annoying bugs in 1.3.4 (more so in 1.3.5) that were addressed in 1.4.0. In regards to your setup, 1.3.4 will work fine for this image replication part.
@george1421 Just a note to myself
[Wed May 31 root@fogclient ~]# mdadm --create --verbose /dev/md/imsm /dev/sd[a-b] --raid-devices 2 --metadata=imsm
[Wed May 31 root@fogclient ~]# mdadm -C /dev/md124 /dev/md125 -n 2 -l 1
mdadm: array /dev/md124 started.
mdadm: failed to launch mdmon. Array remains readonly
Ok, after about 5 hours of working on this I have a solution. There is a missing array management utility (mdmon) that needs to be in FOS to get the array to switch from active (read-only) to active (auto-read-only) [a small but important difference]. Once I copied the utility over and recreated the array by hand, it started syncing (rebuilding) the RAID-1 array. I need to talk to the developers to see if we can get this utility built into FOS.
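For the record, the manual workaround looked roughly like this on my test system. These lines are retyped from memory as a sketch, so verify the device names against your own /proc/mdstat before running anything:

[Wed May 31 root@fogclient ~]# mdmon --all
[Wed May 31 root@fogclient ~]# mdadm --readwrite /dev/md126
[Wed May 31 root@fogclient ~]# cat /proc/mdstat

Once mdmon is running for the IMSM container, the array can leave read-only mode and the resync can actually start.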
The document that led to a solution: https://www.spinics.net/lists/raid/msg35592.html
@eistek Sorry, my real job has been very busy today so I have little time at the moment. Jonathan has you on track for a solution. I can say that if you are starting out with FOG, you started with a very hard target computer. These Intel RAID controllers are a problem to work with in Linux.
I can see from your last image that the RAID array is /dev/md126 and it’s currently read-only (same problem as Jonathan). In your case resync=PENDING means that you just created the array but it hasn’t completed the mirroring yet. I have a system in my test lab at the same point as you. I hope to spend some time after work hours to see if I can get my test system to activate the array and sync the sectors. If I can, then I can give you guidance.
I can say the info you have provided will get us to a solution for you. So please wait until I can get into the lab.
@eistek Sorry, I don’t mean “fake-raid” as a negative. It is a bit negative if you are a server guy, but it is perfectly fine.
Yes, from your screenshot you have the ICH10R controller. That is the Intel hardware-assisted software RAID. To use that RAID there is the hardware component that you set up, and then within the operating system there is the other part of the driver. FOG can see it if you tell it to load the software RAID drivers.
Actually there is another fog admin who has the same issue as you at the moment @Jonathan-Cool
In your case I did create a tutorial a while ago for managing the intel raid controllers with FOG.
https://forums.fogproject.org/topic/7882/capture-deploy-to-target-computers-using-intel-rapid-storage-onboard-raid
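From that tutorial, the key bit is telling FOG to load the software RAID drivers via a host kernel argument. I’m quoting this from memory, so verify it against the tutorial itself:

Host Kernel Arguments: mdraid=true

That setting goes on the target host’s general settings in the FOG web UI.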
The first question is: did you use the Intel “fake-raid” to build this disk array, or do you have a dedicated array controller for it?
When I say “fake-raid”, that is a negative term for a hardware-assisted software RAID, which is often found only in the MS Windows world. ref: https://en.wikipedia.org/wiki/Intel_Matrix_RAID
FOG can work with these MS Windows software RAIDs, but you must do some setup first.
This is an interesting issue.
The FOG replicator compares the md5sum of a certain number of bytes of each file to decide if they are different. If the md5sums do not match, it decides that the storage node is out of sync and starts replicating the image again.
The FOG replicator logs are in /opt/fog/log. Does the master replication log or the replication log for the norway server give you an idea why it just keeps replicating?
Make a change like you posted about, then tail the apache error log to see what error it’s throwing. For Debian I think the error log is called error.log and it lives under /var/log/apache2 (sorry, I’m a RHEL guy). You can also view the apache error log in the FOG configuration -> log viewer area.
@Tom-Elliott The error continues to persist even after the update
The conditions are: the master node is set to disabled in a storage group. When you attempt to deploy an image, the deploy task gets created, the above error is thrown, and the page never refreshes, only times out. If you pick another menu item, then that page is painted.
Should the master node ever be disabled in a storage group? (probably not)
Would this ever happen in the wild? (probably not)
Does it need to be fixed? (only the developers will know that)
@maciej12203 Your picture is very interesting to me, because if you have fog running and you did not change fog’s IP address, it should be working.
I want to see what FOG is doing when it creates the iPXE boot menu. To view this iPXE boot menu, paste this into your web browser.
http://192.168.4.70/fog/service/ipxe/boot.php?mac=00:00:00:00:00:00
This will show you the contents of the iPXE boot menu.
What will be interesting are the very first lines:
#!ipxe
set fog-ip 192.168.4.70
set fog-webroot fog
set boot-url http://${fog-ip}/${fog-webroot}
But also look down the entire page to make sure only the fog server IP address is being displayed.
You have all of the right things enabled in your ltsp.conf. I would like to offer this new configuration to you, but only if you are running dnsmasq 2.76. We will need to adjust a few things if you run this configuration file on a version less than 2.76. The following configuration will work for both BIOS and UEFI systems. You may not need it today, but sometime in the future, maybe.
# Don't function as a DNS server:
port=0
# Log lots of extra information about DHCP transactions.
log-dhcp
# Enable the built-in TFTP server and set the root directory for files
# available via TFTP.
enable-tftp
tftp-root=/tftpboot
# The boot filename, Server name, Server Ip Address
dhcp-boot=undionly.kpxe,,<fog_server_IP>
# Disable re-use of the DHCP servername and filename fields as extra
# option space. That's to avoid confusing some old or broken DHCP clients.
dhcp-no-override
# inspect the vendor class string and match the text to set the tag
dhcp-vendorclass=BIOS,PXEClient:Arch:00000
dhcp-vendorclass=UEFI32,PXEClient:Arch:00006
dhcp-vendorclass=UEFI,PXEClient:Arch:00007
dhcp-vendorclass=UEFI64,PXEClient:Arch:00009
# Set the boot file name based on the matching tag from the vendor class (above)
dhcp-boot=net:UEFI32,i386-efi/ipxe.efi,,<fog_server_IP>
dhcp-boot=net:UEFI,ipxe.efi,,<fog_server_IP>
dhcp-boot=net:UEFI64,ipxe.efi,,<fog_server_IP>
# PXE menu. The first part is the text displayed to the user. The second is the timeout, in seconds.
pxe-prompt="Booting FOG Client", 1
# The known types are x86PC, PC98, IA64_EFI, Alpha, Arc_x86,
# Intel_Lean_Client, IA32_EFI, BC_EFI, Xscale_EFI and X86-64_EFI
# This option is first and will be the default if there is no input from the user.
pxe-service=X86PC, "Boot to FOG", undionly.kpxe
pxe-service=X86-64_EFI, "Boot to FOG UEFI", ipxe.efi
pxe-service=BC_EFI, "Boot to FOG UEFI PXE-BC", ipxe.efi
dhcp-range=<fog_server_ip>,proxy
You MUST replace <fog_server_ip> with the actual IP address of your FOG server in all places in the configuration file.
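If you want to do that substitution in one pass, something like this works. The demo below runs on a scratch copy; point the sed at your real ltsp.conf and use your real server IP (192.168.1.10 here is just an example):

```shell
# Replace every <fog_server_IP> / <fog_server_ip> placeholder with the
# real server address, tolerating either case in the placeholder.
conf=$(mktemp)
echo 'dhcp-boot=undionly.kpxe,,<fog_server_IP>' >  "$conf"
echo 'dhcp-range=<fog_server_ip>,proxy'         >> "$conf"

sed -i 's/<fog_server_[Ii][Pp]>/192.168.1.10/g' "$conf"
cat "$conf"
```

Afterward, grep the file for "<fog_server" to confirm no placeholder was missed.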
Also this tutorial shows you how to compile dnsmasq 2.76 which is required if you want to dynamically switch the boot file between bios and uefi systems.
ref: https://forums.fogproject.org/topic/8725/compiling-dnsmasq-2-76-if-you-need-uefi-support/6
@Joe-Gill You have to know what hardware you are using, but the default setting for PXE booting (at least for BIOS mode) is undionly.kpxe. That is the most generic and universal driver. There are other boot files on your FOG server in the /tftpboot directory; there is one specific to Realtek chips, as well as Intel network chips. You would just update/replace undionly.kpxe in your DHCP option 67 with one of the other iPXE kernels.
The error in the picture says that bzImage could not be downloaded. If you got that far, then you must have seen the iPXE menu, so that tells me dnsmasq should be working. But we must check first.
I have to ask you this question, because of the error. Does your FOG server have a static IP address -OR- did you change the IP address of your FOG server after FOG was installed?
Hello and welcome to the FOG forums.
Lets see if we can get your fog server working.
I see you don’t have direct access to the DHCP server. dnsmasq will work if the PXE-booting computers and the FOG server are on the same subnet. If they are on different subnets we can also make it work, but you will need access to your subnet router to make an adjustment.
For starters, please post your ltsp.conf file here for review.
@Joe-Gill PXE booting is not supported over WiFi AFAIK. BUT you could either PXE boot off a supported USB 2.0 to Ethernet adapter, or USB boot either iPXE or FOS to get you going.
@Mastriani Once you get your servers back online, look at Veeam Endpoint Backup (free) to make a DR image of your physical servers, both Windows and Linux. If you have it installed and you have your DR backup, you can bring the server back by booting off the Veeam DR disk and then connecting to your backup repository (files).
ref: https://www.veeam.com/windows-endpoint-server-backup-free.html
@zingaro Look at the FOG talk bubble at the top of the browser window. It’s on the FOG tool tray.
Just be aware that the computer you use to retrieve the files must be a Linux computer. Windows doesn’t understand the ext4 disk format.