[dev-branch] multicast: for some hosts DB not updated after restore
-
See this post.
Apache logs don’t contain anything of note.
PHP-FPM log during (or shortly after) multicast restore sessions sometimes contains these warnings:
[04-Jan-2020 16:38:01] NOTICE: [pool www] child 29241 started
[05-Jan-2020 02:54:37] WARNING: [pool www] seems busy (you may need to increase pm.start_servers, or pm.min/max_spare_servers), spawning 8 children, there are 0 idle, and 17 total children
[05-Jan-2020 02:54:38] WARNING: [pool www] seems busy (you may need to increase pm.start_servers, or pm.min/max_spare_servers), spawning 16 children, there are 0 idle, and 22 total children
...
[25-Jan-2020 18:00:58] NOTICE: [pool www] child 9916 started
[25-Jan-2020 18:54:59] WARNING: [pool www] seems busy (you may need to increase pm.start_servers, or pm.min/max_spare_servers), spawning 8 children, there are 0 idle, and 17 total children
[25-Jan-2020 18:55:00] WARNING: [pool www] seems busy (you may need to increase pm.start_servers, or pm.min/max_spare_servers), spawning 16 children, there are 0 idle, and 22 total children
[25-Jan-2020 18:55:01] WARNING: [pool www] seems busy (you may need to increase pm.start_servers, or pm.min/max_spare_servers), spawning 32 children, there are 0 idle, and 27 total children
Output of
egrep '^pm\.(start|min|max)' /etc/php-fpm.d/www.conf
pm.max_children = 50
pm.start_servers = 5
pm.min_spare_servers = 5
pm.max_spare_servers = 10
pm.max_requests = 2000
Output of
free -h
on FOG server
              total        used        free      shared  buff/cache   available
Mem:           3,9G        362M        372M        8,6M        3,2G        3,5G
Swap:          4,0G          0B        4,0G
Output of
lscpu | egrep '^Core|^Socket'
Core(s) per socket: 2
Socket(s): 1
Output of
ps --no-headers -o rss,cmd -C php-fpm|awk '{sum+=$1}END{print sum/NR/1024"M"}'
21.2727M
Output of
mysql -u root fog -e 'select l.taskID,s.tsName from taskLog as l,taskStates as s where l.taskStateID=s.tsID and l.id between 3282 and 3348 order by l.taskID'
+--------+-------------+
| taskID | tsName      |
+--------+-------------+
|   1701 | In-Progress |
|   1701 | Complete    |
|   1702 | In-Progress |
|   1702 | Complete    |
|   1703 | In-Progress |
|   1703 | Complete    |
|   1704 | In-Progress |
|   1704 | Complete    |
|   1705 | In-Progress |
|   1706 | In-Progress |
|   1706 | Complete    |
|   1707 | In-Progress |
|   1707 | Complete    |
|   1708 | In-Progress |
|   1708 | Complete    |
|   1709 | In-Progress |
|   1709 | Complete    |
|   1710 | In-Progress |
|   1710 | Complete    |
|   1711 | In-Progress |
|   1711 | Complete    |
|   1712 | In-Progress |
|   1712 | Complete    |
|   1713 | In-Progress |
|   1713 | Complete    |
|   1714 | In-Progress |
|   1714 | Complete    |
|   1715 | In-Progress |
|   1715 | Complete    |
|   1716 | In-Progress |
|   1716 | Complete    |
|   1717 | In-Progress |
|   1717 | Complete    |
|   1718 | In-Progress |
|   1718 | Complete    |
|   1719 | In-Progress |
|   1719 | Complete    |
|   1720 | In-Progress |
|   1721 | In-Progress |
|   1721 | Complete    |
|   1722 | In-Progress |
|   1722 | Complete    |
|   1723 | In-Progress |
|   1723 | Complete    |
|   1724 | In-Progress |
|   1724 | Complete    |
|   1725 | In-Progress |
|   1725 | Complete    |
|   1726 | In-Progress |
|   1726 | Complete    |
|   1727 | In-Progress |
|   1727 | Complete    |
|   1728 | In-Progress |
|   1728 | Complete    |
|   1729 | In-Progress |
|   1730 | In-Progress |
|   1730 | Complete    |
|   1731 | In-Progress |
|   1732 | In-Progress |
|   1732 | Complete    |
|   1733 | In-Progress |
|   1733 | Complete    |
|   1734 | In-Progress |
|   1735 | In-Progress |
|   1736 | In-Progress |
|   1736 | Complete    |
+--------+-------------+
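The affected hosts show up in the table above as task IDs that have an In-Progress row but no Complete row. A minimal sketch of picking those out automatically, assuming the query output is fed in as plain `taskID tsName` pairs (for instance by adding the mysql client's `-N -B` options to the command above). The function name `unfinished_tasks` is made up for illustration.

```shell
# Print task IDs that have an In-Progress row but no Complete row.
# Reads whitespace-separated "taskID tsName" pairs from stdin.
unfinished_tasks() {
    awk '
        $2 == "In-Progress" { started[$1] = 1 }
        $2 == "Complete"    { finished[$1] = 1 }
        END {
            for (id in started)
                if (!(id in finished)) print id
        }
    ' | sort -n
}

# Example with a few rows taken from the table above.
printf '%s\n' '1704 In-Progress' '1704 Complete' '1705 In-Progress' \
    '1706 In-Progress' '1706 Complete' | unfinished_tasks
```

Run against the full table above, this would list 1705, 1720, 1729, 1731, 1734 and 1735.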
-
@shruggy said in 1.5.7.89: partclone doesn't capture an image in dd mode: wrong options in fog.upload:
After the coming Microsoft Patch Day (probably over the next weekend) I am planning to capture another disk image with this and deploy it to my pool in multi-cast mode.
I did it last weekend and the results are mixed. Yes, the image was successfully captured and then restored to 36 PCs in multi-cast. But: On five hosts I got this error message after restoring the image:
Reattempting to update database: Failed
The image was restored successfully on those hosts nevertheless. Only the FOG database wasn’t updated. All 36 PCs are identical hardware.
In the Imaging Log the End column for those five hosts says:
-0001-11-30 00:00:00
while the Duration column says:
2020 years 1 month 18 days 15 hours 35 minutes 43 seconds
It looks like somehow the data for Start timestamp got written into Duration?
-
@shruggy said in [dev-branch] multicast: for some hosts DB not updated after restore:
WARNING: [pool www] seems busy (you may need to increase pm.start_servers
How many hosts do you have with fog-client installed? From those logs I would assume you have a lot.
I would try adjusting
/etc/php-fpm.d/www.conf
to:
pm.max_children = 100
pm.start_servers = 10
pm.min_spare_servers = 10
pm.max_spare_servers = 20
pm.max_requests = 2000
Don’t forget to restart php-fpm after adjustment.
You might also want to increase the fog-client check-in time (FOG web UI -> FOG Configuration -> FOG Settings -> …)
-
@Sebastian-Roth You can mark this as solved now. I didn’t go with the adjustments you suggested, though: I wanted to first try the configuration suggested at https://www.sitepoint.com/php-fpm-tuning-using-pm-static-max-performance and it worked.
Here is an excerpt from my current
/etc/php-fpm.d/www.conf
(the changed lines are the first two and the last):
pm = static
pm.max_children = 40
pm.start_servers = 5
pm.min_spare_servers = 5
pm.max_spare_servers = 35
pm.max_requests = 500
I have a pool of 38 identical hosts.
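A note for anyone checking a change like this: the `egrep` pattern from the first post only matches lines beginning with `pm.start`, `pm.min` or `pm.max`, so it will not show the `pm = static` line itself. Below is a small sketch of a broader check. The helper name `show_pm_settings` is made up, and the pool file path is assumed to be passed as the first argument. With `pm = static`, php-fpm keeps a fixed number of `pm.max_children` workers; `pm.start_servers` and the two spare-server settings only apply in dynamic mode.

```shell
# List every active (uncommented) pm setting in a php-fpm pool file,
# including the bare "pm = ..." line.
show_pm_settings() {
    grep -E '^[[:space:]]*pm(\.[a-z_]+)?[[:space:]]*=' "$1"
}

# Demo on a throwaway file modelled on the excerpt above.
tmp=$(mktemp)
printf '%s\n' ';pm = dynamic' 'pm = static' 'pm.max_children = 40' \
    'pm.max_requests = 500' 'user = apache' > "$tmp"
show_pm_settings "$tmp"
rm -f "$tmp"
```

The demo prints only the three active `pm` lines; the commented-out line and the unrelated `user` line are skipped.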
-
Hi.
In our case the same thing is happening (30 PCs with identical hardware, and the FOG server installed on Ubuntu 18.04). When multicasting to 12 PCs, on some hosts I received this error message after restoring the image: “Trying to update the database: Failed”, and the database (imagingLog table) does not record the end time of the deployment.
The Apache logs contain nothing of note and the PHP-FPM log contains no warnings. What could be causing this in our case?
Thanks in advance
-
@shruggy I’m interested in this issue. How many systems do you typically image at the same time with multicast? How much memory do you have on the fog server?
-
@tec618 Can you follow Shruggy’s guidance? Update the www.conf file; it will be somewhere under /etc (hint:
find /etc -name www.conf
). Change pm to static (pm = static) and set pm.max_children = 50. Save the file and then issue
sudo systemctl restart php-fpm
to restart the php-fpm service.
We will need to watch the available RAM on your system, since each pm child process will consume a bit of memory.
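The www.conf change described above can be previewed before touching the real file. This is only a sketch: `preview_static_pool` is a made-up helper name, it assumes the file already has active `pm = ...` and `pm.max_children = ...` lines, and it writes the result to stdout rather than editing in place.

```shell
# Print a pool file with pm switched to static and pm.max_children set to
# the given value. Writes to stdout only; the original file is not touched.
preview_static_pool() {
    conf=$1
    children=$2
    sed -E \
        -e 's/^[[:space:]]*pm[[:space:]]*=.*/pm = static/' \
        -e "s/^[[:space:]]*pm\.max_children[[:space:]]*=.*/pm.max_children = ${children}/" \
        "$conf"
}

# Typical use (the path is whatever `find /etc -name www.conf` reports):
#   preview_static_pool /path/to/www.conf 50 | diff /path/to/www.conf -
```

Lines commented out with `;` are left alone, so the diff should show only the two settings that actually change.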
-
OK, I will follow @shruggy’s guidance and report the results tomorrow.
For the record, the FOG server is a virtual machine running Ubuntu 18.04 with 4 GB RAM. The host server runs the latest version of CentOS 7 and virtualizes with KVM.
-
@george1421 said in [dev-branch] multicast: for some hosts DB not updated after restore:
@shruggy How many systems do you typically image at the same time with multicast? How much memory do you have on the fog server?
Usually, it’s 36 systems at once. The setup is similar to @tec618’s: FOG on a VM with 4GB RAM, but both the VM and the hosting server run CentOS 7, and it’s Xen, not KVM. PHP 7.3 from Remi’s repo.
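Since both setups are 4 GB VMs, a back-of-the-envelope check of how many php-fpm children fit in RAM may help when picking `pm.max_children`. This is a sketch only: the 2048 MiB budget is an assumed figure, the roughly 21.3 MiB per child is the average RSS measured in the first post, and `estimate_max_children` is a made-up helper name. RSS counts shared pages once per process, so the real footprint is probably somewhat lower.

```shell
# Rough upper bound on pm.max_children:
# memory budget divided by average child size (both in MiB).
estimate_max_children() {
    budget_mib=$1
    avg_child_mib=$2
    LC_ALL=C awk -v b="$budget_mib" -v a="$avg_child_mib" \
        'BEGIN { if (a <= 0) exit 1; print int(b / a) }'
}

# Assumed budget of 2048 MiB, average child size taken from the first post.
estimate_max_children 2048 21.3
```

With these inputs the estimate comes out at 96 children, so the 40 to 50 static children discussed in this thread (about 850 to 1065 MiB at that size) should fit on a 4 GB VM with room to spare.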