Another slow Deployment question
-
@Wayne-Workman It’s not all of them that are experiencing the slowness, but it’s most. We could have 18 computers running at one time and they all could be running at 60MB-PM, then we fire up number 19 and it cranks at 6GB-PM! Random… if we restarted one somewhere in the middle it could get 60 or it could get 6 again… hit or miss…
FOG comes with the maximum client limit set to 10, so I figured that was a pretty good benchmark of where it should be. I could try turning it down. I don’t want to go as low as 3 or 2 though, otherwise it would be a ton of work to add to our summer plans.
Would it be better to multicast the image instead of unicast it?
We have set up one computer at a time and it will image at 60MB-PM… then number two we set up could crank out at 6GB-PM, or it could get 60MB-PM… See what I mean? It almost seems like a negotiating problem and not an overhead problem.
-
@Arsenal101 How does the FOS engine (that runs on the target computer) know which storage node to connect to?
As for multicasting, there are pitfalls there too. If all of your clients are on the same subnet as your fog servers then it works pretty smoothly. If you have to cross vlans/routers then you add complexity to your setup.
-
@george1421 Don’t forget about the weakest link in the case of multicast.
If one client is pulling the data in at 50MB/s due to a cabling issue, or is on a different speed of switch (think all systems on a gig network, but this one has a 10/100MB switch connected to it), the whole multicast session slows down to match that slowest client.
-
@george1421 I am not really sure? I thought all that was handled automatically and it just filled the first 10 slots on the Master Node and then started filling slots on the storage node.
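If that’s right, the first-fill behavior I’m picturing would look something like this toy sketch (node names and slot counts are hypothetical; the real FOG scheduler runs server-side and considers more than free slot counts):

```python
# Toy model of "fill the master's slots first, then the storage node".
# Purely illustrative; this is not FOG's actual scheduling code.
nodes = [
    {"name": "MasterNode", "max_clients": 10, "active": 0},
    {"name": "StorageNode1", "max_clients": 10, "active": 0},
]

def pick_node(nodes):
    """Return the first node with a free unicast slot, or None (task queues)."""
    for node in nodes:
        if node["active"] < node["max_clients"]:
            node["active"] += 1
            return node["name"]
    return None

for host in range(1, 23):
    print(f"host {host:2d} -> {pick_node(nodes) or 'queued'}")
```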
We will probably continue to stay away from multicast then, since we would have to route to get to the subnet the current devices are imaging on.
Should I set up the Location plugin so that each location knows which server to pull from?
We have 5 locations, 4 schools and one SAU. All of the schools are connected with 10Gb multimode fiber, converted to copper 1Gb switches from there. We image in the computer labs/libraries that are hardwired to HP ProCurve 2910al switches, all 1Gb. So I am confident that there is no 10 or 100Mb switch in the way.
At our high school it is a possibility that it is cabling; it is old and I think Cat 3? It could be Cat 5 though, I am not sure. I was ruling out cabling though because of the “one station could be imaging at 60 and as soon as you reboot it, it cranks at 6GB” behavior.
We can try it at our Elementary school, which is wired with relatively new Cat 6 cabling. That should weed out any cabling issues.
-
On a 100% side note: if I don’t define a kernel on the “Hosts” page, the machine defaults to bzImage 4.1.2. Is there anywhere I can change which bzImage it chooses if nothing is defined?
-
@Arsenal101 Ignore me, sorry! Nothing a simple search wouldn’t have solved…
-
Do you have storage nodes set up at each location? If so, it is probably best to use the Location plugin, yes.
-
This is going to sound abstract.
You have 400 computers to image over the summer.
Why not take 2 models that are the same and of moderate performance and give them a “field promotion” to FOG server? (Hint: “field promotion” comes from the military, where, when an officer dies in battle, a private takes charge in the field.) Set these two moderate performance systems up as a pair of FOG 1.2.0 trunk version servers. The trunk version of FOG/FOS will give you better performance than your current setup. It will also tell you whether your slowdown is in FOG or somewhere on your network. At the end of your imaging task, just reimage these two servers as desktops using your old fog server and build a plan for what to do next.
-
@Quazz Right now we have a master and a storage node at the same location. Same IP subnet, same switch.
-
@Arsenal101 said in Another slow Deployment question:
FOG comes with the maximum client limit set to 10 so I figured that was a pretty good benchmark of where it should be.
It’s not, I want the default set to 3, actually.
-
@Arsenal101 Why can’t you set both nodes to 2 and try it? I am not understanding this. You’re not ruling out possibilities. This problem is so simple, and turning down the maximum connections will almost surely solve this.
-
@Arsenal101 said:
Is there any advantage to multicast over unicast?
YES! If the network is set up properly you can send the image to two, ten or 50 machines without a major speed drop using multicast, because the data is sent only once over the network and all the clients “hear” it. Think of it like a telephone conference: if you want to tell the exact same thing to ten different people, you’d better get together or meet for a conference call, rather than calling them one by one and telling the same story over and over again…
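To put rough numbers on the analogy (illustrative only; a 25GB image assumed, protocol overhead ignored):

```python
# Data the server must push for N clients: unicast sends one full copy
# per client, multicast sends one copy that every client "hears".
image_gb = 25
for clients in (2, 10, 50):
    unicast_gb = image_gb * clients
    multicast_gb = image_gb
    print(f"{clients:>2} clients: unicast {unicast_gb:>4} GB, multicast {multicast_gb} GB")
```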
-
@Sebastian-Roth I couldn’t have explained it any better.
-
@Sebastian-Roth I will have to look into that once I can get a storage server built for every subnet. But for right now that’s more work on our switches/routers than I want to do just yet.
-
@Arsenal101 You’ve still not tried my suggestion?
-
@Arsenal101 You don’t need to set up a storage server for each subnet, but your network has to be set up to allow multicasting.
I would go with my suggestion (of course) with 2 borrowed systems for fog. Get that working and then test multicast deployment to a remote subnet. If that works then you are golden; if not, you still have the two newer fog servers running the latest trunk build.
For multicasting, your router needs to allow directed broadcasts between the subnets and you should have IGMP snooping on for all vlans where you would have multicast clients. This is typically set on each switch that would be part of the multicast conversation.
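If you want to sanity-check that multicast actually crosses the router before involving FOG, a small script like this works as a smoke test (the group and port are hypothetical test values, not anything FOG itself uses; run the receiver on the remote subnet first, then the sender near the FOG server):

```python
# Multicast smoke test: `python3 mtest.py` to listen, `python3 mtest.py send`
# to transmit. If the receiver prints the payload, multicast crosses the router.
import socket, struct, sys

GROUP, PORT = "239.255.1.1", 5007  # hypothetical test group/port

if sys.argv[1:] == ["send"]:
    s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    # TTL > 1 so the datagram may be routed beyond the local subnet
    s.setsockopt(socket.IPPROTO_IP, socket.IP_MULTICAST_TTL, 8)
    s.sendto(b"fog-mcast-test", (GROUP, PORT))
else:
    s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    s.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    s.bind(("", PORT))
    mreq = struct.pack("4sl", socket.inet_aton(GROUP), socket.INADDR_ANY)
    s.setsockopt(socket.IPPROTO_IP, socket.IP_ADD_MEMBERSHIP, mreq)
    print(s.recv(1024))  # blocks until a packet for the group arrives
```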
From an analytical side, you’ve tested what I would have tested to identify the performance issue. Unless the performance issue is area specific, I would focus on the areas in common, like the datacenter network and the fog servers. That is the only thing in common at this point (in my mind).
-
I think I may have figured it out. After I was able to upgrade the master server to the latest trunk we were still having troubles. One of my coworkers noticed that the ones that were going slow were loading from and connecting to the storage node (version .32). So I unplugged it and removed it from the master server, and that seemed to speed everything up. I can now image 10 machines at 800MB-PM, which for me is more than acceptable.
My best guess is that it was a combination of the master server offloading the work to the storage server, which is not very good hardware, and the fact that it was deploying a Windows 10 image with FOG version .32, which has no idea Windows 10 existed… So I am planning on building a few more storage nodes, since they are rather simple to build, and we should be able to image 20 machines at 800MB-PM (in theory).
Thanks for all the help and suggestions, guys!
-
@Arsenal101 Even 800MB/min is slow. If you limit each storage node to 2, you’ll get 7GB/min two at a time, and get more done in a day.
-
Just thinking about the numbers, here.
Let’s say you have a single unicast image being sent, and that transfer goes at 6GB/m; that translates to about 100MB/s (near the theoretical limit of a GbE network). So we know that 6GB/m is near the fastest we can go on a GbE network. (I know there are other factors here, like compression ratio, target system performance, and so on. I’m just talking in general terms.)
So for a standard 25GB fat client at 6GB/m it should take just a tad over 4 minutes to image that system. (25/6 = 4.1m)
Now the OP can image 10 machines at 800MB/m or 13MB/s. To deploy a 25GB image it should take about 31 minutes to net 10 systems.
If we serialized the deployment and only deployed 1 system at a time, with a 4 minute deployment we should be able to deploy 6 systems per 30 minutes, which does not beat the 10 machines at 800MB/m. If we allowed dual unicast deployments per imaging cycle we should still be able to achieve 10 systems per 30 minutes.
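The arithmetic above, worked in one place (back-of-envelope; ignores compression, disk speed, and reboot time):

```python
image_gb = 25                  # typical fat-client image, per the post
fast, slow = 6, 0.8            # GB/min: single unicast vs 10-wide (800MB-PM)

print(fast * 1000 / 60)        # ~100 MB/s, near GbE line rate
print(image_gb / fast)         # ~4.2 min for one machine alone
print(image_gb / slow)         # ~31 min for all 10 concurrent machines
print(30 / (image_gb / fast))  # ~7 serialized deploys per 30 min on paper;
                               # calling it 6 leaves room for reboot overhead
```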
So how could we go faster?
I might start with creating a bonded network connection with maybe 3 or 4 links. Adding network bonding to the equation adds some processing overhead; to offset this, just add more links to the LAG (more than just one additional). This will spread the network load over multiple links. (Actually, since 10G is available, I would just jump to a 10G adapter; then a LAG is not needed.)
Once the network bottleneck was eliminated, I would probably install an SSD in the server to host the images. If you think about it, with 10 systems all at different parts of the download, those drive heads are bouncing all around the platter to service the data requests. Moving the images to an SSD on the FOG server will eliminate the drive thrashing.
The FOG server CPU really doesn’t come into play here; the only thing the FOG server is doing is moving data from the hard drive to the network adapter. There is not a lot of computational power required. It’s all the network and disk subsystems that are under load.
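Rough ceiling math for the LAG idea (idealized: real bonding hash policies only approximate an even spread, and any single flow still tops out at one link’s speed):

```python
links, usable_mb_s, clients = 4, 118, 10  # ~118 MB/s usable per GbE link
aggregate = links * usable_mb_s           # ~472 MB/s server-side ceiling
per_client = aggregate / clients          # ~47 MB/s for each of 10 clients
print(per_client * 60 / 1000)             # ~2.8 GB/min each, provided the
                                          # disk can keep all 10 streams fed
```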
-
@george1421 I didn’t suggest one at a time, I suggested two at a time.
I can test this in the lab and give hard numbers. I know this because I’ve had max clients set to 10 before and it was terrible; I changed it to 2 and now it’s performing great.
Also, pretty sure the figures Partclone displays are read/write speeds, not network transfer speed.