"Saving Partitions" - GPT to MBR issue
-
This has been an issue for a while.
When a computer ships with Windows on a GPT-formatted disk, and the technician or owner switches the firmware mode from UEFI to Legacy so that reinstalling Windows produces an MBR disk layout, they must run a FOG debug task and apply a fixparts correction before they can capture via FOG. Of course, they only find this out after the “Saving Partitions” portion of the FOG capture process hangs overnight, they realize that isn’t right or acceptable, and they seek further help.
I want to completely eliminate this issue.
I think that if FOS takes longer than 60 seconds to save the original partitions, AND the disk is in MBR layout, it’s then appropriate to run fixparts against that disk automatically to correct the issue, and afterwards to try to restart the “Saving Partitions” phase of capture.
I will be testing code tweaks for this issue, and I’ll post progress here.
-
I’ve emailed the creator of fixparts for advice about scripting this process. I’m waiting for a reply. If I don’t get a reply, I’ll make an attempt on my own.
-
I’ve already added code that will automatically run fixparts; it just doesn’t work. You lite should be a little easier.
As for a “timeout” it won’t work because it’s “stuck”
-
@Tom-Elliott said in "Saving Partitions" - GPT to MBR issue:
As for a “timeout” it won’t work because it’s “stuck”
There are ways to do it. I don’t know the function names, but say the function to save the partitions is called save_partitions; it would look like this in its flow:

```shell
# Start the saving partitions process in the background. When it is done,
# create a file indicating it's done.
(save_partitions; touch /savingPartitionsDone) &

# Start a loop that looks for this file. If it finds it, exit the loop.
# If it does not find it, increment the counter and sleep for 1 second.
# If the counter reaches 60, kill the command that is hanging, start
# fixparts, and then exit the loop.
counter=0
while true
do
    if [[ -f /savingPartitionsDone ]]; then
        break
    else
        sleep 1
        counter=$((counter + 1))
        if [[ $counter -gt 60 ]]; then
            # Kill hanging command here.
            # Start fixparts next here.
            # Exit the loop.
            break
        fi
    fi
done
```
And looking at this now, it’s probably not necessary to wait 60 seconds. Probably 15 would do fine.
-
Be aware I’m just throwing out random stuff here.
-
What is the impact of running fixparts for every deploy process? Is it safe to run on a healthy disk structure?
-
Can you detect whether the disk is GPT before attempting to save anything to the disk? If so, what negative effect would there be in running fixparts (gratuitously) on a GPT-formatted disk, even if the final disk image will be GPT anyway? I assume the deploy scripts know if the target is supposed to be MBR or GPT before writing to the disk?
-
Is it safe to assume bash doesn’t have a try catch function like try this command for 20 seconds, if the timeout is reached then abort the try?
-
-
@george1421 said in "Saving Partitions" - GPT to MBR issue:
Is it safe to assume bash doesn’t have a try catch function like try this command for 20 seconds, if the timeout is reached then abort the try?
It doesn’t, the closest thing is what I wrote below. I can make it more advanced with exit code checking but I’ve not dug that deep into it yet.
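For reference, bash itself has no such construct, but GNU coreutils ships a separate `timeout` utility that does roughly what is being asked. Whether the FOS image includes it is an assumption here, and `run_with_limit` is a hypothetical helper name. A minimal sketch:

```shell
#!/bin/sh
# run_with_limit SECONDS COMMAND [ARGS...]
# Hypothetical helper: runs an external command under the `timeout` utility.
# Assumes GNU coreutils `timeout`, which exits with status 124 when the
# limit is reached and otherwise passes through the command's own status.
run_with_limit() {
    limit="$1"
    shift
    timeout "$limit" "$@"
    status=$?
    if [ "$status" -eq 124 ]; then
        echo "gave up after ${limit}s: $*" >&2
    fi
    return "$status"
}
```

Something like `run_with_limit 20 sfdisk -d /dev/sda` (device name illustrative) would then return 124 if the dump never finishes. Note that `timeout` executes a program, so it cannot wrap a shell function such as saveOriginalPartitions directly.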
-
@george1421 said in "Saving Partitions" - GPT to MBR issue:
What is the impact of running fixparts for every deploy process? Is it safe to run on a healthy disk structure?
That could be the answer. And, we only ever need fixparts if the drive is in MBR/Legacy currently. If it is of no consequence to do it for every legacy / mbr drive, then yes do that.
-
@george1421 said in "Saving Partitions" - GPT to MBR issue:
I assume the deploy scripts know if the target is supposed to be MBR or GPT before writing to the disk?
This is only for the capture process, not deploy.
-
This is the fixparts function I already created.
-
This is the function that calls the runFixparts function.
-
Here’s where saveOriginalPartitions is called.
-
That should cover the “testable” elements. I’m telling you though, this isn’t an easy thing to figure out.
The GPT or MBR parts work perfectly, but it gets stuck if it’s not that.
-
@Wayne-Workman Stuck commands are literally that. The only way to “test” and kill would be to put the task into background, get the started pid. Then do the loop.
The problem with that approach is that you need to know which elements are causing it, and why.
I think it’s simply waiting for some input on “what to do” and we just need to let it through by saying no when it’s stuck.
I haven’t got a good mechanism, though, to reliably detect the “bad gpt-mbr” state.
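One candidate signal, offered only as a sketch: a disk that was converted from GPT to MBR may still carry the old GPT header at LBA 1 while sector 0 holds an ordinary (non-protective) MBR. The thread does not confirm that this is exactly the “bad gpt-mbr” state, so treat that as an assumption. The helper name `has_stale_gpt` is hypothetical, the code assumes 512-byte logical sectors, and it does not look at the backup GPT header at the end of the disk.

```shell
#!/bin/sh
# has_stale_gpt DEVICE_OR_IMAGE
# Returns 0 when LBA 1 starts with the GPT signature "EFI PART" but none of
# the four MBR partition entries has the protective type 0xEE.
# Assumption: 512-byte logical sectors.
has_stale_gpt() {
    target="$1"
    # The GPT header signature is the first 8 bytes of LBA 1 (byte offset 512).
    sig=$(dd if="$target" bs=1 skip=512 count=8 2>/dev/null | tr -d '\000')
    [ "$sig" = "EFI PART" ] || return 1
    # MBR entries start at offset 446 and are 16 bytes each; the type byte
    # is the fifth byte of each entry: offsets 450, 466, 482, 498.
    for offset in 450 466 482 498; do
        ptype=$(dd if="$target" bs=1 skip="$offset" count=1 2>/dev/null \
            | od -An -tx1 | tr -d ' \n')
        # A protective entry means this is a regular GPT disk, not a leftover.
        [ "$ptype" = "ee" ] && return 1
    done
    return 0
}
```

A check like `has_stale_gpt /dev/sda` (device name illustrative) only reads a few bytes with dd, so it should not be exposed to whatever is making the partition dump hang.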
-
@Wayne-Workman said in "Saving Partitions" - GPT to MBR issue:
@Tom-Elliott said in "Saving Partitions" - GPT to MBR issue:
As for a “timeout” it won’t work because it’s “stuck”
There are ways to do it. I don’t know the function names, but say the function to save the partitions is called save_partitions; it would look like this in its flow:

```shell
# Start the saving partitions process in the background. When it is done,
# create a file indicating it's done.
(save_partitions; touch /savingPartitionsDone) &

# Start a loop that looks for this file. If it finds it, exit the loop.
# If it does not find it, increment the counter and sleep for 1 second.
# If the counter reaches 60, kill the command that is hanging, start
# fixparts, and then exit the loop.
counter=0
while true
do
    if [[ -f /savingPartitionsDone ]]; then
        break
    else
        sleep 1
        counter=$((counter + 1))
        if [[ $counter -gt 60 ]]; then
            # Kill hanging command here.
            # Start fixparts next here.
            # Exit the loop.
            break
        fi
    fi
done
```
And looking at this now, it’s probably not necessary to wait 60 seconds. Probably 15 would do fine.
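Your code should look more like this, within the “checking part” (inside the function saveSfdiskPartitions from partition-funcs.sh):

```shell
local sfdiskcount=0
# Run sfdisk in the background and remember its PID.
sfdisk -d "$disk" 1>"$file" &
local sfpid=$!
sleep 15
# Still running after 15 seconds: treat it as hung, kill it, and report failure.
if ps -p "$sfpid" >/dev/null; then
    kill -9 "$sfpid" >/dev/null
    false
else
    true
fi
```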
This would replace the sfdisk call itself and happen before the status check.
Of course this doesn’t make it truly dependent upon if sfdisk errors out or not. We kind of lose the ability to detect what the exit status was because we background it.
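Backgrounding does not have to cost the exit status, though: the shell’s `wait PID` returns the status of a background child once it has finished. A minimal sketch of the sleep-then-check idea with that added; `check_after` is a hypothetical helper name, and 124 is just an arbitrary marker chosen here for “had to be killed”:

```shell
#!/bin/sh
# check_after SECONDS COMMAND [ARGS...]
# Starts COMMAND in the background and sleeps for SECONDS. If the command is
# still running it is killed and 124 is returned (arbitrary marker value).
# Otherwise `wait` collects and returns the command's real exit status.
check_after() {
    delay="$1"
    shift
    "$@" &
    pid=$!
    sleep "$delay"
    if kill -0 "$pid" 2>/dev/null; then
        kill -9 "$pid" 2>/dev/null
        wait "$pid" 2>/dev/null
        return 124
    fi
    wait "$pid"
}
```

The trade-off is the one already noted: the full sleep is always paid, even when the command finishes instantly.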
-
Just for fun and good to know information:
I frequently reference this site when dealing with the dd command; the relevant part is near the bottom, or more directly:
DD - Destroyer of disks | Show progress status statistics of dd
This is one way of “looping” to get the PID and information if you really want to loop and test.
I’d say testing it after a sleep is the most elegant way. Looping makes more sense if you want to return output for status/progress information.
-
@Tom-Elliott said in "Saving Partitions" - GPT to MBR issue:
We kind of lose the ability to detect what the exit status was because we background it.
You can have the backgrounded process echo its exit code, like this:
(save_partitions;echo $? > /exitCode) &
Again I’ve not dug in yet. But I will.
-
@Wayne-Workman The problem with it is the program gets stuck. So you’d never get a return code.
I still think finding out the “why” is the best approach. If we can find out why it’s stuck, then we can figure out a way to get out of it.
-
@Tom-Elliott said in "Saving Partitions" - GPT to MBR issue:
The problem with it is the program gets stuck. So you’d never get a return code.
That’s where the timer comes into play. No exit code after a specified amount of time means you need to kill the hanging process and do fixparts, then retry.
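Put together, the flow described here (time limit, kill, fix, retry once) might be sketched as follows. All names are hypothetical: `wait_or_kill` and `save_with_recovery` are illustrative helpers, and the save and fix steps are passed in as command or function names rather than being the real FOS functions. The value 124 is an arbitrary marker for “killed at the limit”.

```shell
#!/bin/sh
# wait_or_kill PID LIMIT
# Polls once per second. Returns the process's own exit status if it ends in
# time, or kills it and returns 124 once LIMIT seconds have passed.
wait_or_kill() {
    pid="$1"
    limit="$2"
    waited=0
    while kill -0 "$pid" 2>/dev/null; do
        if [ "$waited" -ge "$limit" ]; then
            kill -9 "$pid" 2>/dev/null
            wait "$pid" 2>/dev/null
            return 124
        fi
        sleep 1
        waited=$((waited + 1))
    done
    wait "$pid"
}

# save_with_recovery LIMIT SAVE_CMD FIX_CMD
# Runs SAVE_CMD under the limit. On a timeout, runs FIX_CMD and retries once.
save_with_recovery() {
    limit="$1"
    save_cmd="$2"
    fix_cmd="$3"
    "$save_cmd" &
    wait_or_kill "$!" "$limit"
    status=$?
    # Anything other than a timeout (success or a real error) is passed through.
    [ "$status" -eq 124 ] || return "$status"
    "$fix_cmd" || return 1
    "$save_cmd" &
    wait_or_kill "$!" "$limit"
}
```

One caveat with this design: when SAVE_CMD is a shell function, the background job is a subshell, and killing that subshell does not necessarily kill a program it started. A real implementation would need to target the hanging program itself.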
-
@Wayne-Workman I know where you’re headed, and I do understand. But I really think we’re “stuck” because it’s waiting for confirmation or input.
Programs typically don’t get stuck in that sense.
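If the hang really is a prompt, one cheap way to test the theory is to make sure the program has nothing to wait on: either give it an immediate end-of-file on standard input, or feed it a canned “n”. Whether sfdisk is actually prompting here is not established in this thread, and a program that reads its answers from /dev/tty directly would not be affected by either approach. Both helper names below are hypothetical:

```shell
#!/bin/sh
# run_noninteractive COMMAND [ARGS...]
# Runs the command with stdin attached to /dev/null, so any attempt to read
# an answer hits end-of-file immediately instead of blocking.
run_noninteractive() {
    "$@" </dev/null
}

# answer_no COMMAND [ARGS...]
# Runs the command with a single "n" line on stdin, for a program that asks
# one yes/no question on standard input.
answer_no() {
    printf 'n\n' | "$@"
}
```

If a capture that normally hangs completes (or fails quickly) when the dump is run through `run_noninteractive`, that would support the waiting-for-input explanation.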