Question About Snapin Installation Time Reporting
- FOG Version: 1.5.0 RC-6 Working V16
- OS: CEntOS 7
- Service Version: 0.11.12
- OS: Windows 10 & Window 7
Can someone tell me what’s normal with regard to snapin execution reporting. I’m on the Working Branch. I’m hoping folks can tell me what they see in v1.4.4. I don’t want to post this in the developer area unless it’s unique to v1.5.0.
Here’s the issue:
When deploying an image with snapins, the FOGSERVICE runs all snapins until it hits a snapin that requires a reboot. It reboots and then runs a bunch again without stopping. This is all good.
The problem is that the Snapin History for the host and the Snapin Log both show the start times for each snapin in an unbroken series to be the same time when in fact, each starts after the previous Snapin finishes (snapins do not install in parallel, but in series).
The effect is the reported/recorded duration of any snapin changes depending on the duration of all the snapins that ran before it.
Here’s a sample:
Note that the first 4 snapins show the same start time. The 4th reports taking 2 minutes 37 seconds to complete, while it actually took 2 seconds for the snapin to run (the difference between the its end-time and the end-time of the previous snapin).
Since the 4th snapin forces a reboot, 5th snapin starts another series of 7 snapins, all showing the same start time.
The impact is that we can’t determine anything from the history of each snapin over time, nor can we use this info to help set valid execution duration for each snapin.
So the questions are :
- Does v1.4.4 work the same way?
- Would changing this be a feature request, bug fix, what?
We’re good on the cancellation issue, but I’m still seeing a series of packets with the same start time and ever increasing durations times for each subsequent snapin.
I didn’t get that you made a change. I doubt I can be of any help with the code, but I’ll look.
The screenshot above is a single deployment of an image with several snapins. There’s a reboot after 0-W7PostImageCleanup and another after 1-UACDisable. So there are 3 “series” of snapins in this deployment.
2017-08-10 12:24:36 (2 snapins)
2017-08-10 12:25:28 (1 snapin)
2017-08-10 12:26:14 (6 snapins)
All scheduled as part of the image deployment and all separated only by reboots implemented by FOG as part of a snapin.
@jim-graczyk But I already made the change for the snapin start times to be more accurate. It does not set all snapins to the same time, it set’s each snapin it’s working on to whatever time it starts now. Are you saying this is not working? See here:
I don’t think setting the time for the cancellation is necessary at all for the reasons I listed earlier. I understand what you’re saying about cancelled snapins, but I still think it’s better to know how long before each snapin task was cancelled.
That depends on whether you (all) have any intention of changing the reported start time and duration for snapins when they run in a series (a single pass of the Snapin component in FOGService.
I still believe it’s unclear to have all spanins show the start time of the first snapin in a series. I’ve done some extending of FOG on the Snapin deployment side. My snapins are network installs that reside on DFS on windows servers and Samba on storage nodes. This allows dynamic editing of all snapins w/o the need to reupload anything to the FOG server and prevents snapins from requiring 2x the freespace (1 to download from FOG, 1 for the installation). The FOG Snapin is a small exe that passes needed info the the CMD (regarding where to connect to the SMB share at each site) and starts the CMD. I log the output of each CMD (>>some_log_file.log) so as to make debugging easier for creating and running snapins.
The logging process I have clearly shows each subsequent snapin starting after the previous one completes. I don’t see why this wouldn’t be how FOG records snapin execution.
So, leave this one as UNSOLVED only if you will consider making the reported times accurate based on start time, end time and duration the snapin actually has.
Mark it as SOLVED if you don’t intend to change anything or feel there’s nothing to change.
@Jim-Graczyk Shall we mark this solved then?
Thanks for the explanation. I hadn’t thought of it that way. Makes sense.
@jim-graczyk I’m not feeling the change over for that. Playing with the cancellation ultimately isn’t hard, but it could be useful to help describe to managers why you cancelled the task. “Well it was running for 5 weeks 2 days 8 hours 45 minutes and 16 seconds without running. This is why I cancelled it.” Or (I cancelled it because it was unintended to be pushed to that machine.)
This is a personal preference. I think having the time it actually completed makes the most sense. Besides, to get it to properly display 0, I have to do a check on all the current “start times” and set each task. That may not seem like much, but means additional queries to get the information. (Not too much, but can impact performance too).
One more similar observations - when you cancel snapins, they’re each assigned a duration equal to the time between the snapin request being submitted and the snapin being canceled. Again, since none of the snapins actually ran, one would expect to see a duration of 0 time for each.
I’m hoping that the start time, end time and duration for the execution of each snapin would be just that. I’m not concerned with seconds. If one is lucky enough to have PCs so fast 5 snapins start and end on the same second, whoohoo.
I’m tracking what PCs are taking what time to do what snapin. Sometimes we’ll see something odd and we’ll need to check out why this happened. The value is also the long-term record plus we have timeout limits to set for each snapin. So we’re looking for long time intervals for snapin installation.
I’ll check out the change you’ve made later tonight or tomorrow.
The start time was not being set “dynamically” (totally).
I think the intent was to lessen the amount of calls, just without much thought. Sorry I’m dumb like that sometimes.
I’ve updated this so hopefully future snapin task starts will represent the accurate time.
Ultimately, though, i don’t know what the times might actually represent so you might have a 1 second timeout followed by a 32 second gap that adminset occurs, followed by 2 seconds for the post image cleanup etc…
Hopefully this helps describe what you’re seeing.