Imaging stops after client boots up
-
Tom thinks the drive HAD a GPT, but the update made it think it has MBR and that broke it. And Windows doesn’t properly remove GPT fragments from the disk.
You can use fixparts to repair it.
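If you want to check for this before running fixparts, here is a rough Python sketch (not part of FOG or fixparts) that looks at the first two sectors of a disk and reports whether an MBR with real partition entries and a primary GPT header are both present. It assumes 512-byte logical sectors and only inspects the primary GPT header at LBA 1; a backup header at the end of the disk is not checked. The function name and the return labels are made up for illustration.

```python
GPT_SIGNATURE = b"EFI PART"      # first 8 bytes of a GPT header
MBR_BOOT_SIGNATURE = b"\x55\xaa"  # bytes 510-511 of sector 0
SECTOR = 512                      # assumption: 512-byte logical sectors


def classify_partition_table(first_two_sectors: bytes) -> str:
    """Return 'mixed', 'gpt', 'mbr' or 'none' for the start of a disk."""
    if len(first_two_sectors) < 2 * SECTOR:
        raise ValueError("need at least the first two sectors (1024 bytes)")

    mbr = first_two_sectors[:SECTOR]
    lba1 = first_two_sectors[SECTOR:2 * SECTOR]

    has_mbr_signature = mbr[510:512] == MBR_BOOT_SIGNATURE
    # The four MBR entries start at offset 446, 16 bytes each;
    # the partition type is byte 4 of every entry.
    types = [mbr[446 + 16 * i + 4] for i in range(4)]
    # 0x00 = unused entry, 0xEE = GPT protective entry.
    has_real_mbr = has_mbr_signature and any(t not in (0x00, 0xEE) for t in types)
    has_gpt = lba1[:8] == GPT_SIGNATURE

    if has_gpt and has_real_mbr:
        return "mixed"  # real MBR entries plus a GPT header on the same disk
    if has_gpt:
        return "gpt"
    if has_real_mbr:
        return "mbr"
    return "none"
```

To try it on a real disk you would read the first 1024 bytes of the device as root, e.g. `open("/dev/sda", "rb").read(1024)`, and pass them in. A result of `mixed` is the kind of leftover GPT data on an MBR disk that fixparts is meant to clean up (a deliberately built hybrid MBR would show up as `mixed` too).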
-
Trunk: at first I was not sure what you were talking about; for me, trunk meant a different thing. So you suggest moving on to the latest version?
GPT: hm… so you mean that if we run fixparts we can try a deploy and see if it helps? Well, a try can't hurt. My problem is that not ALL deploys fail! I hate random errors: there are too many factors to pin down, and users don't like it when I say “wait a while, we may solve it. Or not.”
In the past we had an MBR issue with a “never-used disk we got with the dealer's recovery on it”. We killed the MBR and the deploy was OK. But that case was different: it was version 0.32, and the actual deployment died with an error. This time it is 1.2, and after init.xz and the other file are loaded, it drops back to a prompt, from which point we can kick it on with ./linuxrc. (If the GPT issue is in place, do you think it would get stuck like this?)
We will try fixparts tomorrow somehow (as far as I remember I have never used it; I hope nothing extra is needed to fix it).
-
You’d use a debug deployment to try it.
-
Well, it won't let me rest. Can you enlighten me a bit? Surely you know it better. The boot process gets stuck, and maybe the key is somewhere here. The machine boots up, iPXE configures the environment, then gets boot.php from the URL where it should. As far as my poor knowledge goes, it gets the task list and then generates the boot menu. If I understand correctly, when the machine has a task to do, that menu is an instant deployment. Since no deployment starts and I only get a prompt, can it be that the task list or the “boot menu” generation somehow fails?
Of course I see active tasks, so the problem is not on this end (the server). When boot.php generates the “menu”, what is the exact next step of booting? (Is there a step-by-step list to help me understand?)
-
The easiest way to understand what the menu is telling your client is to open the boot menu in a browser. It's generated as a plain text file accessible via URL.
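A minimal sketch of doing the same from Python instead of a browser. The `/fog/service/ipxe/boot.php` path and the `mac` query parameter are assumptions based on a default install; use whatever URL your iPXE config actually chains to. The helper names are made up, and `kernel_args` simply splits the `kernel` line of the returned iPXE script so you can read the arguments the client would boot with.

```python
from urllib.parse import urlencode
from urllib.request import urlopen


def build_boot_url(server: str, mac: str, web_root: str = "/fog") -> str:
    """Build the boot.php URL for one client (path and parameter are assumed)."""
    query = urlencode({"mac": mac.lower()}, safe=":")
    return f"http://{server}{web_root}/service/ipxe/boot.php?{query}"


def fetch_boot_script(url: str, timeout: float = 10.0) -> str:
    """Download the plain-text iPXE script the server would hand to the client."""
    with urlopen(url, timeout=timeout) as response:
        return response.read().decode("utf-8", errors="replace")


def kernel_args(script: str) -> dict:
    """Return the arguments of the first 'kernel' line as a key/value dict.

    Arguments without '=' get an empty string as their value.
    Returns an empty dict if the script has no 'kernel' line.
    """
    for line in script.splitlines():
        parts = line.split()
        if parts and parts[0] == "kernel":
            args = {}
            for token in parts[2:]:  # parts[1] is the kernel image itself
                key, _, value = token.partition("=")
                args[key] = value
            return args
    return {}
```

For example, `print(fetch_boot_script(build_boot_url("192.0.2.10", "00:11:22:aa:bb:cc")))` with your own server address and the client's MAC. If a task is queued for that host, I would expect to see a `kernel` line in the output rather than only menu items, and `kernel_args` lets you check what the client is being told to do.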
-
I forgot to post a screenshot of the “stop point” where I am left hanging after imaging stops. As I said, there is no error at all; it just stops here. I also looked a bit into the possible MBR/GPT issue: I put some old HDDs into the failing machine to see how it behaves (no GPT has ever touched those disks, and this is a lot faster than removing something I have never seen).
What do you think?
-
Would you be willing to upgrade to the Trunk version?
Read more here:
https://wiki.fogproject.org/wiki/index.php/Upgrade_to_trunk
This method (IMHO) is the simplest.
https://wiki.fogproject.org/wiki/index.php/SVN
-
It sort of looks like your tasks start in “Debug” mode every time…
-
Yes. I hadn't looked at it this way, but… well, yes. Everything can be done, but it has to be done manually. Well, partially: I only have to kickstart it with the linuxrc command rather than do every step one by one, but yes.
OK, SVN trunk is updated to revision 3482 (I mean the number shown in the FOG logo). We will do some tests and see what happens.
UPDATE: the SVN version is up, and we have a slightly worse scenario… Up to now, kicking it made it work. Now: no upload, no imaging. And I am going crazy…
Stage 1, imaging: after the iPXE boot we see some checks before imaging. Then at one point (it says erasing the MBR/GPT info was done correctly, and then comes ->) initializing /dev/sda with NTFS partition… error: /dev/sda1: no such file or directory. Some more errors follow about the same thing and udev. Then comes: using hard disk /dev/sda, and the write cache check is OK. And here an error again (I guess it is because of the first HDD error): Preparing hard disks (stage1): error, no such file or dir (sda1 missing again).
Fun fact: preparing hard disk (stage3) is done properly (status written as “done”). Here comes the blue screen of partclone. It says: starting to restore /dev/sda1. Calculating bitmap… please wait. Then comes the final error: Target partition size (105 MB) is smaller than 35717 MB. Use option -C to disable size checking (dangerous).
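To sanity-check those numbers outside of partclone, here is a small Python sketch of the comparison. It is not partclone's code; it assumes sizes are reported in decimal megabytes and reads the partition size from `/sys/class/block/<name>/size`, which Linux reports in 512-byte units. One observation, which is only a guess: 105 MB is what a 100 MiB partition comes to in decimal megabytes.

```python
from pathlib import Path

SYSFS_SECTOR_BYTES = 512  # the sysfs "size" file counts 512-byte units


def read_partition_sectors(name: str) -> int:
    """Read a partition's size in sectors, e.g. name='sda1' (Linux only)."""
    return int(Path(f"/sys/class/block/{name}/size").read_text().strip())


def sectors_to_mb(sectors: int) -> float:
    """Convert 512-byte sectors to decimal megabytes (assumed reporting unit)."""
    return sectors * SYSFS_SECTOR_BYTES / 1_000_000


def image_fits(target_sectors: int, image_mb: float) -> bool:
    """True if the target partition is at least as large as the image needs."""
    return sectors_to_mb(target_sectors) >= image_mb


# A 100 MiB partition is 204800 sectors of 512 bytes.
small = 100 * 1024 * 1024 // SYSFS_SECTOR_BYTES
print(round(sectors_to_mb(small)))  # 105
print(image_fits(small, 35717))     # False
```

On the client itself, `image_fits(read_partition_sectors("sda1"), 35717)` would show whether /dev/sda1 as currently partitioned is big enough for the image.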
Here we stand now. Any guess?
At first we thought it was about some new settings on the FOG image settings panel, so we went there. There are some new options (like selecting the wanted partition, or MBR backup, which is really nice to see coming!), but nothing obvious. So we tried creating an image and then restoring it onto another disk to see what happens. Fail. At this point I was, you may understand :D, on the verge of kicking something, so I went to eat for the safety of my office. (I will go through the video we made of this failure to see if there is any clue I can give about imaging, but it was some /dev/ thing again, as if some hardware driver went wrong or so; we tried 2 disks.)
As I do many tests to solve this problem, I keep backups, so I went back to the last official release to let my colleague do some imaging the “kicking it manually” way, so he can let off some steam and do his work until we find a solution.
Ah yes, by the way: this time we get past that “debug-like” prompt and it wants to start imaging. It just can't because of the device error. Am I right?
-
Final case: the debugging of this state is closed. No solution was found with the machines in testing. I ended up with completely new hardware, a much newer machine, and a full reinstall, now on Ubuntu 14.04.2 LTS, 64-bit. Fun fact: the issue never happened again. Maybe the candid camera was removed from the room… Damn, it was a bit frustrating to have such an insane malfunction with normally working software. A week has passed with no issue anymore. Strange.
Thanks for the replies and the attempts to help!