When you say 2500 hosts, do all of them have the FOG client installed?
Yes, all of them (7000) have the FOG client installed. But 2500 could be polling the server in a ordinary class day. At this time that it is not a problem because all the schools are in summer break. In order to use the task reboot manager to unnatended image deploys, we set the check in time in 180 seconds. (So if a multicast deploy task is sended, the computers would have time to reboot and get suscribed before the 300 seconds limit.)
my intuition is telling me that 8 vCPU is much and you might see better performance with 4 or 6 vCPUs. But at the moment only change one thing at a time.
Ok, 6 vCPUs was the setting until one month. We were running very slow tasks and I ask for more power on our server (6-8CPU - 12-16GB RAM). We were aware that it could not be the best but we were “forced” to test it. We didn’t notice a much better performance and the plan is to have 6 vCPUs after the classrooms are ready for the new academic year.
While we are getting off point of your initial post.
Ok, coming back to the point ;P. I have been talking with one of our network team and he have give me some general information about our network. Our FOG server is atacched to a “CORE” router (10GbE). From this central point there are connections to the four named campus. I have done a sketch map.
Looking the map, and remembering the benchmark tests, the first unicast tests hosts are in Campus 1. And the last unicast test hosts (at “the opposite end” :) ) are in Campus 4.
So, (if my memory serves me correctly) my network workmate has tell me that IGMP does not use the port number parameter (only IP). And, today, we are not sure if the router has the capability of “discard” or “route efficiently” the muliticast data only to the subscribers on IP+portnumber multicast session. (Our high experienced guy on tunning multicast in our network is on holidays.)
I am looking for the cause that doubles the time needed (with the v1.5.2 server not heavily used) for a multicast tasks. (If we compare it with the v0.30 server.)
We could take a multicast deploy to a group in “campus 3” and on april with v0.30 took less than 4h (about 58GB). But yesterday with v1.5.2 more than 8h (about 67GB).
(Don’t get me wrong, I know it is very difficult to tune up all the settings. And in addition, I think our FOG implementation is not an easy one :). So step by step.)
On another level, to add some little test results to the multicast performance problem we try with the Bitrate option (yes, it seems that setting it up on the “Storage” options it get added to the udpsender command):
- In “campus 1” deploy 2.43 GB/min Vs 4.24GB/min (the second test 5 minutes later without --max-bitrate 200m to the same two hosts)
root 23705 2180 0 19:44 ? 00:00:00 /usr/local/sbin/udp-sender --max-bitrate 200m --interface ens192 --min-receivers 2 --max-wait 300 --mcast-data-address 18.104.22.168 --portbase 51604 --full-duplex --ttl 32 --nokbd --nopointopoint --file /images/aula-upv-ehu-enajenacion/d1p1.img
root 31218 2180 8 19:51 ? 00:00:12 /usr/local/sbin/udp-sender --interface ens192 --min-receivers 2 --max-wait 300 --mcast-data-address 22.214.171.124 --portbase 52262 --full-duplex --ttl 32 --nokbd --nopointopoint --file /images/aula-upv-ehu-enajenacion/d1p1.img
- I have not had chance to get tested on other “campus”.