First, check your disk usage on the remote node as Tom said. Run df -h and look for partitions at 99% or 100% usage.
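If you'd rather not eyeball the output, here is a rough Python sketch that flags nearly full partitions for you. It assumes the node has Python 3 and that df accepts -P (the POSIX output format, which keeps each filesystem on one line). The function name full_partitions and the 99% default threshold are just my choices for illustration, not anything from FOG.

```python
import subprocess


def full_partitions(df_output, threshold=99):
    """Return (mount_point, percent_used) pairs at or above threshold.

    Expects the text produced by `df -P`: one header row, then one row
    per filesystem with the use percentage in the fifth column.
    """
    hits = []
    for line in df_output.splitlines()[1:]:  # skip the header row
        fields = line.split(None, 5)  # mount point is everything after field 5
        if len(fields) < 6:
            continue
        percent = fields[4].rstrip("%")
        if not percent.isdigit():  # some pseudo filesystems report "-"
            continue
        if int(percent) >= threshold:
            hits.append((fields[5], int(percent)))
    return hits


if __name__ == "__main__":
    # Run df on this machine and report anything that is nearly full.
    result = subprocess.run(["df", "-P"], capture_output=True, text=True)
    for mount, used in full_partitions(result.stdout):
        print(f"{mount} is {used}% full")
```

Run it on the storage node itself. If it prints nothing, no partition is at or above the threshold.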
@FallingWax I remember having this problem although I can’t remember what I called the thread title… it’s here in the forums somewhere.
Basically, I figured out that very large images were not completing replication within the grace window, so the FOG image replicator would kill the old replication task and start it again.
I brought this issue up to @Tom-Elliott at the time and he coded a fix - the fix made the image replicator aware of previously spawned lftp instances, so it would wait for those instances to complete before trying to restart them.
Maybe something in the code base is goofed, I’m not sure. But you need to look at this setting and write down what it is:
Web Interface -> FOG Configuration -> FOG Settings -> FOG Linux Service Sleep Times -> IMAGEREPSLEEPTIME
So write that down, it’s in seconds.
Next you need to go through your replication logs. Tom pointed out the places in the filesystem, but they are also available via the web interface here: Web Interface -> FOG Configuration -> Log Viewer -> Image Replicator.
You need to figure out if the image replication sleep time is close to when the image replicator just restarts the transfer - or not. If it’s close to when it restarts, this could mean that there’s an issue with the image replicator keeping track of lftp instances that it created.
There could of course be other issues that we don’t know about, so you should be extra observant when looking through all of this stuff.
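To make that comparison less of a guessing game, here is a small Python sketch. It does not parse the FOG log for you. It assumes you have copied the times at which the transfer restarted out of the log by hand and typed them in as "YYYY-MM-DD HH:MM:SS" strings, which is a format I picked for the example and may not match what the log shows. The function name restart_gaps and the 10% tolerance are hypothetical too.

```python
from datetime import datetime


def restart_gaps(restart_times, sleep_seconds, tolerance=0.10):
    """Compare the gaps between consecutive restarts with the sleep time.

    restart_times: timestamps noted by hand from the replicator log,
                   oldest first, as "YYYY-MM-DD HH:MM:SS" strings.
    sleep_seconds: the IMAGEREPSLEEPTIME value you wrote down.
    Returns one (gap_in_seconds, is_close) pair per consecutive pair.
    """
    fmt = "%Y-%m-%d %H:%M:%S"
    stamps = [datetime.strptime(t, fmt) for t in restart_times]
    results = []
    for earlier, later in zip(stamps, stamps[1:]):
        gap = (later - earlier).total_seconds()
        # "close" means within tolerance (10% by default) of the sleep time
        is_close = abs(gap - sleep_seconds) <= tolerance * sleep_seconds
        results.append((gap, is_close))
    return results


if __name__ == "__main__":
    # Made-up timestamps and sleep time, only to show the call.
    times = ["2017-01-01 10:00:00", "2017-01-01 10:10:05", "2017-01-01 11:10:05"]
    for gap, is_close in restart_gaps(times, sleep_seconds=600):
        print(f"gap {gap:.0f}s, close to sleep time: {is_close}")
```

If most of the gaps come back as close to the sleep time, that points toward the replicator restarting the transfer on every cycle instead of waiting for the running lftp instance. It's only a hint, not proof.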