[quote=“Matt Harding, post: 11231, member: 1207”]So the disk is the common component here? You used it on the Dell and moved it to the new system when you tried that out?[/quote]
That is correct. I did a fresh install of the OS and FOG just in case. Thinking that it might be the drive, even though it never produced any errors, I went through the whole install process again with a different drive. It made no difference. I tried again, this time without LVM (I used primary partitions) and with Ext3 instead of Ext4. Again, no difference. At this point, every single piece of hardware and software has been replaced without any change in the result. The only constant left is the client machines, so I pulled two identical XP machines out of storage to image and restore. That actually worked fine, but those images were just a fraction of the size of the HP images I've been handling, so I can't honestly say it was a fair test.
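Since the drive keeps coming up as a suspect, on the next run I'll also sanity-check sustained read speed straight off the image store, something along these lines on the FOG server (assuming hdparm is installed; the device name and image path are just placeholders for my setup):

[code]
# Raw sequential read speed of the image drive, cached and uncached
sudo hdparm -tT /dev/sda

# Time a plain sequential read of one of the image files themselves
# (substitute a real file from under /images)
IMG=/images/hp-image/sys.img.000
dd if="$IMG" of=/dev/null bs=1M
[/code]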
[quote=“Matt Harding, post: 11231, member: 1207”]To be honest, if you're using only one disk to push images out and it's a standard SATA3 drive, I'd be very surprised if you didn't see the data dip after a while, but it's curious that when you stagger machines it starts off well and then dips on each. Is it dropping off at the same point no matter what machine or how many at once you're trying to do? If that's the case I suspect the image is being cached as it would be normally in disk reads to RAM, and you're hitting a limit somewhere… either with the drive's ability to sustain pushing the data out, or the interface used etc…[/quote]
I suspected it might be something like you described, but I did swap my gigabit switch for a 10/100 megabit switch early on in my testing. Obviously, the data rates were a lot slower, but it still dropped at the same point for the image I was testing. I don't think I re-imaged with the 10/100 switch, though.
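If it would help to rule the network path in or out completely, I can also run a raw throughput test between the server and one of the clients, separate from NFS and imaging altogether (assuming iperf is available on both ends; the IP below is a placeholder):

[code]
# On the FOG server: listen for a throughput test
iperf -s

# On a client booted into a live Linux environment:
# run a 60-second test against the server (replace the IP)
iperf -c 192.168.1.10 -t 60
[/code]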
During this last reconfiguration I mentioned above, I pulled a new image while I had top running. To be honest, I had never paid that much attention while uploading an image, since I wasn't experiencing any slowdowns during that operation. I paid attention this time and noticed something.

When I started the upload to the server, the load average was around 0.75, bouncing between 0.6 and 1.2, with the top processes being 3-5 different nfsd processes. The bandwidth graph (receive) was pretty jagged, and then it suddenly flatlined at 0 MB/s; at the same time I noticed all the nfsd processes had disappeared from top. The load average also began to drop to idle (0.06 - 0.13). I did a ps -A and saw the nfsd processes were still running…they just weren't doing anything. About 20 seconds later, the bandwidth graph jumped up to its previous levels and the nfsd processes were active again. Then 10 seconds later it flatlined again and stayed flat at zero, with nfsd inactive as well. The graph was showing a brief 5-6 MB/s spike every 40 seconds or so before returning to zero.

Interestingly, the client was still showing data being copied (without any slowdown in data rate…it was around 3 GB/min) and continued to do so until it got to the end of the partition, 45 minutes later. The total size of that partition was 126 GB. I did note that these two slowdowns happened at 19.6 and 21.9 GB copied. I then deployed that image to a new machine and it followed the exact same pattern as above, with the slowdowns occurring at the exact same points in the image (19.6 and 21.9 GB). The only difference is that the data rate does drop (as reported by the client), so what took 45 minutes to image now takes several hours to complete.
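Next time I kick off a deploy I'll try to capture more than just top at the moment it flatlines, with something like the following running in separate terminals on the server (nfsstat comes from nfs-utils and iostat from the sysstat package, so this assumes both are installed):

[code]
# NFS server thread usage: the "th" line shows the thread count
# and how often every thread was busy at the same time
watch -n 1 'grep ^th /proc/net/rpc/nfsd'

# Per-call NFS server counters; the read count should keep climbing,
# so if it stops while the client still reports progress, the stall
# is on the server side
watch -n 1 'nfsstat -s'

# Extended disk stats (utilization and wait times) for the drive
# holding /images, sampled every second
iostat -x 1
[/code]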
Any ideas what might be going on?