@Londonium I finally found the time to look into this myself and do some testing. Sorry for the delay. Your debugging and posts helped a lot to nail this issue down, so thanks again!
Your VPN tunnel down scenario is kind of special. It triggers when port 21 (FTP) can be reached (the socket connection succeeds, so the online status is true) but the FTP connection itself fails. What I mean is that the TCP connection opens but never receives the FTP banner that an FTP server normally sends on connect. I was able to replicate this behavior using netcat (it listens on port 21 but does not respond to a full FTP connection attempt).
It’s interesting that you can trigger this issue by closing a VPN tunnel, because to me it seems a TCP connection to port 21 is still being answered despite the VPN tunnel being down - maybe by the VPN router?
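To make the difference between "port reachable" and "FTP actually answering" concrete, here is a minimal Python sketch. It is not FOG code; the function name `probe_ftp` and the three result strings are made up for illustration. It only assumes that a real FTP server greets a new connection with a reply starting with 220.

```python
import socket


def probe_ftp(host, port=21, timeout=3.0):
    """Return 'unreachable', 'no_banner' or 'ok' (hypothetical labels)."""
    try:
        # Step 1: plain TCP connect, comparable to a simple socket check.
        sock = socket.create_connection((host, port), timeout=timeout)
    except OSError:
        return "unreachable"  # nothing answers on that port at all

    with sock:
        sock.settimeout(timeout)
        try:
            # Step 2: wait for the greeting an FTP server sends on connect.
            data = sock.recv(256)
        except OSError:
            # Includes timeouts: the port is open but the peer stays silent.
            return "no_banner"

    # An FTP greeting starts with reply code 220; anything else
    # (including an empty read when the peer closes) is not a banner.
    return "ok" if data.startswith(b"220") else "no_banner"
```

Against a silent listener like the netcat one described above, this should return `no_banner`, which is the state where the node looks online although FTP is not usable.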
[11-09-22 8:36:47 pm] * Type: 2, File: /var/www/fog/lib/fog/fogurlrequests.class.php, Line: 611, Message: fsockopen(): unable to connect to 192.168.1.134:21 (Connection timed out), Host: 10.81.2.117, Username: fogproject
I was also able to replicate this scenario and found why it shows two different IPs (when “trigger_error” is commented out). On FTP errors the PHP function error_get_last() is used to retrieve information about the most recent error and combine it with the current host/username information. But a failed FTP connect is not a general PHP error, so “trigger_error” is used to raise one and make it the most recent error. If “trigger_error” is not in place, the last PHP error can be anything really, and in your case it was the failed fsockopen connect to the prior storage node.
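The mechanism can be modelled in a few lines. This is a simplified Python simulation, not the FOG code and not PHP: `_last_error`, `record_error`, `get_last_error` and `ftp_error_report` are hypothetical stand-ins for PHP's internal last-error slot, trigger_error, error_get_last() and the log formatting, and the "FTP connection failed" text is invented for the example.

```python
_last_error = None  # stands in for PHP's single "last error" slot


def record_error(message):
    """Overwrite the last-error slot (the role trigger_error plays here)."""
    global _last_error
    _last_error = message


def get_last_error():
    """Return whatever error was recorded most recently."""
    return _last_error


def ftp_error_report(host, username, raise_own_error):
    """Build a log line from the last error plus the CURRENT host/user."""
    if raise_own_error:
        # With the explicit error in place, the slot describes this failure.
        record_error(f"FTP connection failed to {host}")
    return f"Message: {get_last_error()}, Host: {host}, Username: {username}"


# An earlier, unrelated failure against the previous storage node.
record_error(
    "fsockopen(): unable to connect to 192.168.1.134:21 (Connection timed out)"
)

# Without the explicit error, the stale message is paired with the new host.
stale = ftp_error_report("10.81.2.117", "fogproject", raise_own_error=False)

# With it, message and host refer to the same node.
fresh = ftp_error_report("10.81.2.117", "fogproject", raise_own_error=True)
```

Here `stale` mixes the old 192.168.1.134 message with host 10.81.2.117, like the log line above, while `fresh` mentions only 10.81.2.117.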
Anyhow, thanks to your topic I was able to improve the replication code (dev-branch 1.5.9.222) and make it skip to the next task/node in case of FTP connect errors.