Solved GIT 5676 Multicast results in bad images on machines.
Good afternoon. As the title states, I am using Trunk 5676 on a clean install of Ubuntu 14.04 server. The master image consists of multiple partitions spread across 2 primary partitions and an extended partition. MBR on Part 1 Primary, Win 7 32 on Part 2 Primary, Extended Partition has 7 64, 8.1 32, 8.1 64, 10 32, 10 64, 7 Enterprise 64.
I am able to pull an image and deploy it back to multiple computers successfully. However, when deploying via multicast I end up with a bad restoration on both computers. For starters my OS selection screen appears to be fine, but selecting Win 10 boots up Win 7 32. Once in the OS it hangs with explorer never launching.
Can anyone give any insight? Is it due to our weird partitions? Is anyone successful in deploying via multicast lately? I appreciate the replies!
How are you initiating the multicast on the computers? Through the boot menu?
Have you tried with groups at all?
@Wayne-Workman I’m using the web UI to set up a multicast to my group of computers (2). Everything appears to work correctly until the machine reboots and an OS is selected. Another interesting thing I noticed is that when the images are sent back to the machine, they are sent in alphabetical instead of numeric order. In my case I have partitions up to sda10. The restore order goes sda1, sda10, sda2, sda3… Not sure if that matters, just thought I’d mention it.
Everything appears to work correctly until the machine reboots and an OS is selected.
Can you explain that line more?
The multicast does not begin till both machines are ready to receive. The machines progress at the same rate, switching to restoring a new partition until finished. Each machine renames the computer at the end of the restore process and reboots. (appears to fail for Windows 10 32/64 upon further inspection)
The computers show the normal pre-OS boot screens and then the multiboot Windows selection screen is shown. Selecting Windows 10, for instance, which is second on the list but the default choice, actually loads Windows 7, which appears first on the list. Windows 7 attempts to load the desktop but hangs. Task Manager shows a bunch of instances of winmail.exe. I was concerned it was a virus somehow stuck in there.
Restoring both machines at the same time with the deploy option completes successfully and is bootable.
@mrdally204 oh I see now why you were bringing up the order in which the partitions are restored. The disordered restore is most likely swapping the 2nd option on the menu somehow.
So now we need to ask the @Developers if there is any particular reason why the partitions are restored in the order that they are.
For resizing reasons, it’s obvious that a certain restore order is needed in order to expand partitions into empty contiguous drive space. But for non-Resizable image types, I think it’d be best to restore the partitions back in the original order.
We natsort the files to prevent this type of problem. Can I see output from the log file directly?
Can you please update?
I hope to have a solution for you now. The files from multicast command generation were being sorted properly, but the partitions listed to be used were not being sorted in the same fashion. I’ve added a sort command for this in hopes that you should be good, but I need a test with a user who has the high number of partitions you currently have to ensure things are good.
@Tom-Elliott I can certainly update. I am new to this and plan on using GIT. Is there anything special I need to do to update? The last commit looks to be from yesterday, which tells me your code is not there quite yet.
Edit: never mind, I was looking at the wrong location. Updating in a few.
@Tom-Elliott The 2 machines restored using multicast without issue using GIT Trunk 5698. I did notice the partitions restored in order, putting the 10th partition last. Is this how a normal deploy functions currently? I could have sworn the deploy option would send the partitions sda1, sda10, sda2…
Either way, I really appreciate the fast turnaround on the issue I was experiencing. Expect more questions and bug reports as we continue to use this great imaging solution. We are a QA team working for a software company, using it to image our test machines.
I did notice the partitions restored in order, putting the 10th partition last. Is this how a normal deploy functions currently? I could have sworn the deploy option would send the partitions sda1, sda10, sda2…
That’s what Tom fixed, and that’s why your issue is fixed.
@Wayne-Workman I’m almost positive that when I used the deploy option, BEFORE he fixed the order issue, it would restore SDA1, SDA10, SDA2… and the machines would boot up correctly and function fine. The only time that I saw the issue was when the Multicast was used, again restoring them out of order. That was the puzzling part, why would they both restore out of order but it was only an issue with the Multicast restore process and not the deploy. Either way it looks like it’s running swell now
@mrdally204 The two different types of tasks (multicast and unicast) probably use different code bases.
Or, Tom might have just re-written all of the base code for that to make it cleaner and it was just an oversight.
Glad it’s fixed though.
@mrdally204 the order doesn’t matter on unicast imaging (non multicast) because unicast pulls the partition from the partition label it’s iterating on. In multicast the order is determined by the order the udp-sender commands are sent in. So unicast would always work but multicast would give the exact issue you had. Either way I’m glad to have the partitions iterating properly anyway. It helps people truly know how far in the process they are. Imagine if you had 19 partitions. It would have ordered them imaging in 1,10,11,12,13,14,15,16,17,18,19,2,3,4,5,6,7,8,9, which would’ve been not so fun to troubleshoot where an issue occurred.
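For anyone following along, here is a small Python sketch of the two orderings described above. It is only an illustration and not FOG's actual code; `natural_key` is a hypothetical helper written for this example, and the `sda1`..`sda19` names are the imagined 19-partition case.

```python
import re

def natural_key(name):
    # re.split with a capturing group alternates non-digit and digit runs,
    # so odd indexes are always the numeric pieces; compare those as ints.
    tokens = re.split(r"(\d+)", name)
    return [int(tok) if i % 2 else tok for i, tok in enumerate(tokens)]

# Hypothetical disk with 19 partitions.
parts = [f"sda{n}" for n in range(1, 20)]

plain_order = sorted(parts)                     # plain string comparison
natural_order = sorted(parts, key=natural_key)  # numeric-aware comparison

print([p[3:] for p in plain_order])
print([p[3:] for p in natural_order])
```

The first list comes out in the 1,10,11,… order from the post above; the second counts 1 through 19.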
@Tom-Elliott to further iterate and give some more specific information.
Unicast worked because while the order was not “proper” it pulled the partition number from the iterated item. Example /dev/sda1 would look for file d1p1.img. /dev/sda10 would actually look for and use d1p10.img.
In multicast this iteration happens but the data is sent by the server.
It did not scan for a particular file.
So in your case the server sent its commands in the expected numeric order.
udp-sender would send the files in this order:
d1p1.img, d1p2.img, d1p10.img
It sent the commands in that specific order, while the client was iterating its partitions as sda1, sda10, sda2. The partition receiving a file did not match the file it was receiving. /dev/sda1 would get d1p1.img properly, but /dev/sda10 was getting d1p2.img. Hopefully that helps make sense of the problem.
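To make the mismatch concrete, here is a rough Python model of the two task types. This is a toy simulation under assumptions, not FOG's real code: `image_for`, `unicast_pairs`, and `multicast_pairs` are made-up names, and the four partitions are just a short stand-in for the real layout.

```python
import re

def natural_key(name):
    # Odd indexes from the capturing split are digit runs; compare as ints.
    tokens = re.split(r"(\d+)", name)
    return [int(tok) if i % 2 else tok for i, tok in enumerate(tokens)]

def image_for(partition):
    # Unicast-style lookup: derive the file name from the partition number.
    number = re.search(r"(\d+)$", partition).group(1)
    return f"d1p{number}.img"

partitions = [f"/dev/sda{n}" for n in (1, 2, 3, 10)]
images = [f"d1p{n}.img" for n in (1, 2, 3, 10)]

client_order = sorted(partitions)               # sda1, sda10, sda2, sda3
server_order = sorted(images, key=natural_key)  # d1p1, d1p2, d1p3, d1p10

# Unicast: each partition asks for its own file, so iteration order is harmless.
unicast_pairs = {p: image_for(p) for p in client_order}

# Multicast: the client takes whatever the server streams next.
multicast_pairs = dict(zip(client_order, server_order))

# With the client list sorted the same way as the server's, the pairs line up.
fixed_pairs = dict(zip(sorted(partitions, key=natural_key), server_order))

for p in client_order:
    print(p, "unicast:", unicast_pairs[p], "multicast:", multicast_pairs[p])
```

In this model `/dev/sda10` is paired with `d1p2.img` under multicast but with `d1p10.img` under unicast, which is the behaviour described above; sorting both sides the same way removes the mismatch.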