Unable to get Gigabit Speeds
-
I will try to be as detailed as possible about my issue, as I’ve been pulling my hair out for nearly a week now and I’d [I]really[/I] like to figure this out.
I have FOG 0.32 running on Ubuntu 10.04 with Capone. I can make and deploy images, and Capone is working great. However, I cannot get FOG to push more than 100Mbit across the network. If I start one machine it works fine (at roughly 100Mbit - 600-800MiB/min). If I try to start another machine, it takes 15-20 minutes to load the bzimage and init image. If I start two computers at roughly the same time, they will load the init images fairly fast, but then share roughly 100Mbit of traffic. The meter in the FOG portal and the Ubuntu system monitor will both show ~100Mbit of traffic going out. If I try to add a third machine at this point, it loads the init images very slowly; as soon as the first two machines are removed, the third will kick up to 100Mbit/s.
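As a quick sanity check on those numbers, the unit conversion can be sketched in Python. The function name is made up for illustration, and the calculation ignores protocol overhead and image compression, so treat the results as rough upper bounds for the wire rate.

```python
def mbit_per_s_to_mib_per_min(mbit_per_s: float) -> float:
    """Convert a link rate in Mbit/s (decimal megabits) to MiB/min (binary mebibytes)."""
    bytes_per_s = mbit_per_s * 1_000_000 / 8
    return bytes_per_s * 60 / (1024 * 1024)


if __name__ == "__main__":
    # Compare a Fast Ethernet link with a gigabit link.
    for rate in (100, 1000):
        print(f"{rate} Mbit/s ~ {mbit_per_s_to_mib_per_min(rate):.0f} MiB/min")
```

A 100 Mbit/s link works out to roughly 715 MiB/min, which falls inside the 600-800MiB/min range reported above, while a gigabit link would allow about ten times that.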
[U]Hardware - FOG Machine[/U]
[LIST]
[*]PowerSpec S200
[*]Intel Dual-core (C2D) @ 2.33GHz
[*]2GB of ECC RAM
[*]2 - 500GB HDD in RAID 1
[*]Dual Intel NICs (currently not used)
[*]Intel Pro 1000GT Quad port (PCI-X)
[/LIST]
[U]Test clients[/U]
[LIST]
[*]Dell N5110
[*]Intel i5 2450
[*]4GB RAM
[*]1TB HDD
[*]Gigabit NIC
[/LIST]
[U]Network[/U]
[LIST]
[*]Gigabit switches, mostly crappy, but they have proven to do 2GB/s+ on their backplanes through file transfers
[*]Cat6 only, new for this project
[/LIST]
I have run point to point from the server to the client, through switches, etc. It is always the same speed and never goes over 100Mbit/s. I believe I have the current Intel driver for this NIC (e1000), as seen in the screenshot.
Please help before I lose all my hair!
Edit: Image is what I see while imaging one computer, this was taken about five minutes after it started. The CPU jumps around a bit, but does not go over ~40% even with three computers trying to image.
[ATTACH=full]165[/ATTACH]
[url=“/_imported_xf_attachments/0/165_Screenshot.png?:”]Screenshot.png[/url]
-
I had something just like this not even two weeks ago. I ended up running tests from various switches, moving further out from the FOG server until I found the offender. It ended up being an access layer switch that someone (cough, my boss) went cheap ass on and didn’t buy a GBIC gig adapter for the uplink. So everywhere else had gig uplinks, but the access switch for our floor was being jammed into a 10/100 port. Long story short, I jumped a gig uplink port from our distro to the setup room to a cheap gig D-Link switch for imaging and bam, 2.5 gigs on average. It takes about 5 min to drop a 14 gig image to a computer, and as soon as I work out snapins I’ll shave another 5-8 gig off my images and it should be really fast then.
-
[quote=“djm79, post: 8327, member: 1568”]I had something just like this not even two weeks ago. I ended up running tests from various switches, moving further out from the FOG server until I found the offender. It ended up being an access layer switch that someone (cough, my boss) went cheap ass on and didn’t buy a GBIC gig adapter for the uplink. So everywhere else had gig uplinks, but the access switch for our floor was being jammed into a 10/100 port. Long story short, I jumped a gig uplink port from our distro to the setup room to a cheap gig D-Link switch for imaging and bam, 2.5 gigs on average. It takes about 5 min to drop a 14 gig image to a computer, and as soon as I work out snapins I’ll shave another 5-8 gig off my images and it should be really fast then.[/quote]
That makes sense, I’ve heard horror stories like that.
This is happening in a very closed loop though. FOG server -> Gigabit Switch -> Client. Even FOG -> Client directly has these speeds.
-
Bring up a Linux box; a live CD will be fine. Mount the NFS share from the FOG server onto the client and pull an image file over. If you get 100Mbit speeds then it’s not FOG.
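A rough way to put a number on that test is to time a sequential read of one image file from the mounted share. This is only a sketch: it assumes Python 3 is available on the live CD, the file path is whatever you pass on the command line (nothing here is specific to FOG), and the function name is made up for illustration.

```python
import sys
import time


def measure_read_mbit_per_s(path: str, chunk_size: int = 4 * 1024 * 1024) -> float:
    """Read a file sequentially and return the average throughput in Mbit/s."""
    total_bytes = 0
    start = time.perf_counter()
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            total_bytes += len(chunk)
    # Guard against a zero interval on very small files.
    elapsed = max(time.perf_counter() - start, 1e-9)
    return total_bytes * 8 / elapsed / 1_000_000


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python3 script.py <path to a file on the mounted NFS share>
    print(f"{measure_read_mbit_per_s(sys.argv[1]):.0f} Mbit/s")
```

Only the first read of a file is meaningful, since a repeat run may be served from the client's page cache and report an inflated figure. A result well above 100 Mbit/s would suggest the server, NIC, and cabling are fine.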
-
[quote=“chad-bisd, post: 8330, member: 18”]Bring up a Linux box; a live CD will be fine. Mount the NFS share from the FOG server onto the client and pull an image file over. If you get 100Mbit speeds then it’s not FOG.[/quote]
Good idea, I’ll try that tomorrow when I have access to the machines.
-
Yeah, chad-bisd’s idea sounds like an easy plan that will tell you if it’s FOG or not, but I would almost be willing to bet that it’s a switch or NIC causing all these crazy speed issues. Are your switches managed or unmanaged? If you don’t have the means to set up a live CD test, you could try another kind of switch and see whether the speed works out, which would tell you if it’s that model.
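One way to check the switch-or-NIC theory from the server side is to look at what speed each interface actually negotiated. The sketch below assumes a Linux kernel that exposes a `speed` attribute under `/sys/class/net/<interface>/`; the function name is made up for illustration, and interfaces that are down or virtual usually report nothing useful.

```python
from pathlib import Path


def negotiated_speeds(sysfs_net: str = "/sys/class/net") -> dict:
    """Return {interface name: negotiated speed in Mbit/s, or None if not reported}."""
    root = Path(sysfs_net)
    if not root.is_dir():
        return {}  # not a Linux system, or sysfs is not mounted
    speeds = {}
    for iface in sorted(root.iterdir()):
        try:
            value = int((iface / "speed").read_text().strip())
        except (OSError, ValueError):
            value = None  # loopback, link down, or no speed attribute
        # Some drivers report -1 when the speed is unknown.
        speeds[iface.name] = value if value is not None and value > 0 else None
    return speeds


if __name__ == "__main__":
    for name, speed in negotiated_speeds().items():
        print(f"{name}: {speed if speed is not None else 'unknown'} Mbit/s")
```

If a gigabit NIC shows 100 here, the link itself negotiated down, which points at the cable, the switch port, or autonegotiation instead of FOG.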
-
[quote=“chad-bisd, post: 8330, member: 18”]Bring up a Linux box; a live CD will be fine. Mount the NFS share from the FOG server onto the client and pull an image file over. If you get 100Mbit speeds then it’s not FOG.[/quote]
Tried this with an HP Touchsmart tm2. It jumped around a lot, but pulling the N5110 image over the network went well over 100Mbit at times. So now the question is, what in FOG is artificially limiting me to 100Mbit?
[ATTACH=full]167[/ATTACH]
[url=“/_imported_xf_attachments/0/167_Screenshot-2.png?:”]Screenshot-2.png[/url]
-
You may be getting 100Mbit speeds due to the network driver in the FOG kernel (/tftpboot/fog/kernel/bzImage). Have you tried a newer kernel? Sometimes older kernels work also. You might try debug mode imaging and see if you can mount the NFS share and transfer a file. If it’s limiting to 100Mbit in debug mode, it’s probably the kernel module driver for the network interface.
In my FOG setup, I’m using 6 disks in a raid 5 array and gigabit switches and I get almost 4GB per minute on 4 computers imaging at the same time. With 10 going, I get about 2GB/min. I start to get limited by the disk throughput before the network throughput.
-
[quote=“chad-bisd, post: 8373, member: 18”]You may be getting 100Mbit speeds due to the network driver in the FOG kernel (/tftpboot/fog/kernel/bzImage). Have you tried a newer kernel? Sometimes older kernels work also. You might try debug mode imaging and see if you can mount the NFS share and transfer a file. If it’s limiting to 100Mbit in debug mode, it’s probably the kernel module driver for the network interface.
In my FOG setup, I’m using 6 disks in a raid 5 array and gigabit switches and I get almost 4GB per minute on 4 computers imaging at the same time. With 10 going, I get about 2GB/min. I start to get limited by the disk throughput before the network throughput.[/quote]
I will try that after lunch, thanks.
In the past, using Ghost, we saw about 400MiB/min, so anything over that is an improvement. 4GB/min would be amazing. We reimage around 50 units a day, but not at the same time and a queue is not a problem, so those are hopeful numbers.
I started this project without a full understanding of how FOG is intended to be deployed. Our old network used LaCie NAS boxes that eventually crapped out under constant and heavy load (and thunderstorms). To start the replacement, I built a FreeNAS system that started with a RAID-Z2 array (8 * 2TB). It was supposed to end with three such arrays in the same box, and function with our Ghost system in place. We decided that there should be a replacement for Ghost as well, so then I got involved with setting up the FOG system. After recently reading about nodes, it seems like this might be a better way for us to go about things. Multicasting is not an option, and with Capone working so nicely, I don’t want to bother with it. That means unicast performance is paramount, so being throttled at 100Mbit is too slow to really be of use. The future may also hold some form of link bonding, but that’s for another thread.
-
Link bonding in Ubuntu 10.04 is quite easy if your switches support it. I bonded a built-in NIC and an add-on NIC in about 20 minutes of research and configuration.
-
It’s been a busy few weeks, and I haven’t gotten a chance to play with this in a while. But yesterday, I did.
I loaded up an Alienware M17xR4 to make an image and happened to glance at it while it was pushing along… at 4.5GiB/min… That made me pretty happy, so I grabbed another one once it was done and loaded up 2 of them, pulling an image, they [I]both[/I] were reporting ~4.2GiB/min pulling from the FOG machine. The FOG homepage showed about 80MB/s of traffic, but it was all over the graph, where with the previous Dell units it would hit 700MiB/s and stay pretty flat.
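The two figures in that post can be compared with a little arithmetic. One possible reading, which is an assumption and not something confirmed in this thread, is that the client-side rate counts uncompressed image data while the server graph counts compressed bytes on the wire. The variable names below are made up for illustration.

```python
GIB = 1024 ** 3


def gib_per_min_to_mb_per_s(gib_per_min: float) -> float:
    """Convert GiB/min (binary) to MB/s (decimal megabytes per second)."""
    return gib_per_min * GIB / 60 / 1_000_000


clients = 2
per_client_gib_min = 4.2   # rate reported by each Alienware
observed_mb_s = 80.0       # approximate traffic shown on the FOG homepage
gigabit_limit_mb_s = 125.0 # 1000 Mbit/s expressed in MB/s, before overhead

client_side_mb_s = clients * gib_per_min_to_mb_per_s(per_client_gib_min)
ratio = client_side_mb_s / observed_mb_s

print(f"Combined client-side rate: {client_side_mb_s:.1f} MB/s")
print(f"Exceeds a single gigabit link: {client_side_mb_s > gigabit_limit_mb_s}")
print(f"Client-side rate / server-side rate: {ratio:.2f}")
```

The combined client-side figure comes to about 150 MB/s, which is more than one gigabit link can carry and nearly twice what the server graph showed, so the two meters are probably not measuring the same thing.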
So, I think that proves that my issue is related to the Dell units (N5110) or perhaps the stock kernel. I haven’t touched any of the kernel related stuff yet, but I think that is my next move.
Oh, also as a note: while I was messing around with the Alienwares, I tried starting up one of the N5110s that I test with, and as soon as the Dell connects, it slows the bzimage loading. When the Dell is pulling an image (at 700MiB/min) and I try to start an Alienware, it loads the bzimage slowly. If I pause the Dell imaging, the Alienware loading the bzimage takes off and goes its normal speed. Then if I unpause the Dell, both will go their respective speeds (Dell 700MiB/min, Alienware 4.2GiB/min). Thoughts?
-
I would say it’s a kernel driver issue. I would try different kernels. It’s easy to change through the GUI: go to the other information tab at the top, and then kernel updates gives you a long list to pick from. If one of those doesn’t work, I might be able to link a download to an all-in-one kernel that works great for me.
-
Hello,
I was reading your thread, and from your initial post and the post where you state FOG “does not multicast”, I believe I might have an idea of what’s happening to you.
First, are you starting multiple individual images? If so, this means each PC is going to access the image on the fog server independently, hitting your disks harder and slowing the transfer with each additional PC.
Secondly, you CAN multicast. Here’s how:
[LIST=1]
[*]Add the machines you want to image into inventory.
[*]Create a group and place the machines you want to image into that group.
[*]Go to TASK MANAGEMENT.
[*]Choose LIST ALL GROUPS.
[*]You’ll see the multicast option next to the group you want to image.
[*]Set the CRON settings, shutdown, etc., and submit the job.
[/LIST]
I hope I’m not totally off base, but I did this in the past and had the same reaction: “WOW this is too slow!” “I used ghost and it was soo much faster!” etc.
Let me know if it helps.