Multicast image issue with FOG 1.5.2
-
With 1.4.2 we never had any problems, but yesterday we tried to multicast multiple sets of 25 PCs and they all hung at the end of the multicast task. Even 12 PCs hang. However, 4 PCs do not. The FOG VM still had 1 CPU and 1 GB memory configured, so we increased it to 2 CPUs and 16 GB memory. It’s using 3.5 GB with 12 PCs. Storage is SAN, and network speed is 40 Gbit at the FOG server.
After 100% the web interface of the FOG server becomes unresponsive. I can only reboot the server through SSH, and then the PCs reboot automatically too. However, the PCs will now not join the domain.
Does anyone know a solution for this problem?
-
FOG 1.5.2 uses a new php engine instead of the default apache php engine. I want to see if this new engine is causing you some pain.
So for this first test let’s adjust a setting in a file and then restart the fog services.
- Edit this file /etc/php-fpm.d/www.conf
- Look for a line that reads:
;pm.max_requests = 500
- Uncomment that line and change the parameter to 2000 to make it look like this:
pm.max_requests = 2000
- Save and exit the editor.
- Restart php-fpm and apache.
- systemctl restart php-fpm
- systemctl restart httpd
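The edit in the steps above could also be scripted. This is only a rough sketch, assuming GNU sed and a pool file that contains the commented-out line shown above; set_max_requests is a made-up helper name, not something that ships with FOG or php-fpm.

```shell
# Hypothetical helper: uncomment pm.max_requests in a php-fpm pool file
# and set it to the given value. Assumes GNU sed (for -i and -E).
set_max_requests() {
    conf="$1"    # path to the pool file, e.g. www.conf
    value="$2"   # new request limit, e.g. 2000
    # Keep a backup copy next to the original, then rewrite the line,
    # whether or not it is currently commented out with ';'.
    cp "$conf" "$conf.bak" &&
    sed -i -E "s/^[;[:space:]]*pm\.max_requests[[:space:]]*=.*/pm.max_requests = $value/" "$conf"
}
```

You would call it as set_max_requests /etc/php-fpm.d/www.conf 2000 (adjust the path for your distribution) and then restart php-fpm and apache as in the steps above.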
If that doesn’t resolve your issue then let’s switch back to the apache php engine.
Depending on your distribution you will need to look in /etc/httpd or /etc/apache2. Once you are in that directory we need to search for a file that contains 127.0.0.1:9000
- Execute this command
grep -R -e "127.0.0.1:9000" *
- That should find the config file for apache we need.
- You should see a section that has
#SetHandler application/x-httpd-php
SetHandler "proxy:fcgi://127.0.0.1:9000"
- Move the pound ( # ) from the first handler to the second so it looks like this.
SetHandler application/x-httpd-php
#SetHandler "proxy:fcgi://127.0.0.1:9000"
- Save the config file
- Restart apache
systemctl restart httpd
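If you would rather not hand-edit the file, the handler swap could be sketched like this. It assumes GNU sed, assumes the two SetHandler lines look like the ones shown above, and assumes the apache php module is still installed and loaded. switch_to_apache_php is a hypothetical helper name.

```shell
# Hypothetical helper: enable the apache php handler and comment out
# the php-fpm proxy handler in the given apache config file.
switch_to_apache_php() {
    conf="$1"   # the file that the grep for 127.0.0.1:9000 found
    # Back up first. The first expression removes the '#' in front of the
    # mod_php handler; the second puts a '#' in front of the fcgi proxy
    # handler. Lines already in the wanted state are left alone.
    cp "$conf" "$conf.bak" &&
    sed -i -E \
        -e 's@^([[:space:]]*)#[[:space:]]*(SetHandler[[:space:]]+application/x-httpd-php)@\1\2@' \
        -e 's@^([[:space:]]*)(SetHandler[[:space:]]+"proxy:fcgi://127\.0\.0\.1:9000")@\1#\2@' \
        "$conf"
}
```

Run it against the config file you found, check the result against the .bak copy, and then restart apache as above.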
I’m pretty sure that the first fix will address your issue though.
-
Hello george… thx for the very fast response.
My www.conf location is strangely enough: /etc/php/7.1/fpm/pool.d/www.conf
Changed the pm.max_requests and rebooted the server and tried again with 12 pc’s.
PCs waited at the “Erasing current MBR/GPT Tables…” screen for almost 10 minutes…
Why, I do not know… Is that a result of the change?
11 of the 12 PCs started multicasting, and after they finished the web interface was still responsive and the 11 PCs disappeared from the task window and finished. The change worked…! thx again
We will give it another try with more next week.
-
@utilman I can’t explain the wait, other than that there is an issue with the 2 of the 12 that didn’t complete.
Since you have php 7.1 installed, the path to the file is a bit crazy. I should have had you search for it instead of using my memory for the path.
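A search along those lines might look like the sketch below. It only assumes the pool file is named www.conf somewhere under /etc, which matches both paths mentioned in this thread. find_fpm_pool is a made-up helper name.

```shell
# Hypothetical helper: list every www.conf under a directory tree
# (default /etc), so the php-fpm pool file can be found on any distro.
find_fpm_pool() {
    root="${1:-/etc}"
    # Hide "permission denied" noise when not running as root.
    find "$root" -type f -name www.conf 2>/dev/null
}
```

Running find_fpm_pool with no argument searches /etc.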
Well, what is at issue is a memory leak (FOG asking for a block of memory from the OS and not returning it after the task is over) in the FOG code. The developers are having a hard time finding this one line of code in the 1000s of lines that make up the FOG server. What this “fix” does is restart each php-fpm worker process after it has responded to 2000 requests. This keeps the memory usage from ballooning out of control and keeps the UI responsive. It’s good practice to have this option on anyway. The next release of FOG (1.5.3) will have this feature enabled by default, so I’m told.
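If you want to watch whether the memory usage really stays flat during a multicast, something like this sketch could help. It assumes the worker process names start with php-fpm (for example php-fpm7.1 on Ubuntu) and that ps comes from procps. fpm_rss_kb is a hypothetical helper name.

```shell
# Hypothetical helper: read "rss comm" pairs on stdin and print the
# total resident memory (in KiB) of all processes named php-fpm*.
fpm_rss_kb() {
    awk '$2 ~ /^php-fpm/ { sum += $1 } END { print sum + 0 }'
}
```

Feed it from ps, for example ps -eo rss=,comm= | fpm_rss_kb, and repeat that every few seconds while the task runs. If the number keeps climbing without ever dropping back, the workers are probably not being recycled.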
-
This is worth reporting as a bug! Since you already know a solution, you should report it.
(For me only switching back to the apache php engine helped. Otherwise the web interface did not respond, not even with normal imaging.)
-
@trialanderror said in Multicast image issue with FOG 1.5.2:
“This is worth reporting as a bug! Since you already know a solution, you should report it.”
The pm.max_requests = 2000 setting is known and has already been addressed in the 1.5.3 working release.
“(For me only switching back to the apache php engine helped. Otherwise the web interface did not respond, not even with normal imaging.)”
Can you tell me more about what OS you have on your fog server and what you tried before switching back to the apache php handler? I would think that you would have more noticeable slowness with the apache-php engine over a dedicated php engine.
-
@george1421
It is Ubuntu 16.04. (But I cannot answer more questions because I downgraded to 1.4.3.)
I would like to suggest publishing “known issues” like that obviously known multicast problem. It would be VERY helpful.
-
@trialanderror that’s not an obviously known issue, and it has been addressed. The known issues come and go as versions increment and they aren’t all “common”. 1.5.2 brought php-fpm usage, but the need to set the requests to 2000 wasn’t known until well into the release. Even that isn’t strong enough to count as a known issue, as it only seemed to have impacted a few people, not everybody.
-
For those reading this… I had this same issue during a Multicast in 1.5.2. In my case, restarting the server did not bring things back to life. I upgraded to 1.5.3 and things came back online. I have not tried Multicasting since. I am Unicasting 8 right now and they are about done. I will try Multicasting a large group here next week and see.
Thanks for posting this!
-
@Tom-Elliott
To update this… In my case, updating to 1.5.3 did not resolve this issue. I was able to Unicast one batch of 8 PCs… That worked. Then when I went to do a wipe task on 26, the interface locked up and the task did not complete.
Any suggestions?
-
@Tom-Elliott
So my interface completely locked up. I restarted the server and had no luck at all getting to my interface. I ended up having to remove PHP completely and then let the FOG installer re-install PHP. My interface is back up. I have not run any tasks yet. I’m hoping it stays up long enough to get something done.