Download/Upload Performance Issues Since 1.1.2
-
[quote=“braindead, post: 31694, member: 24282”]
Before the update, I was getting 5-7 GB/min for my downloads, and anywhere from 2-4 GB/min uploads. Now, I get ~1.39 GB/min downloads and ~750 MB/min to 1.3 GB/min uploads.
[/quote]Is 5-7 GB/min aggregate for deploying multiple images simultaneously? And are these the numbers reported by fog/partclone or have you also noticed that an actual image is taking longer?
I’m on a single [S]gigabit[/S] fast ethernet (100Mb/s) link and partclone consistently tells me that it is running at about 1-2GB/min for a download at the moment. I haven’t noticed a significant change from other versions. The numbers that partclone reports are for the compressed image transfer, so they will be quite variable.
If you wanted to dig a bit deeper you could take a look at the link negotiation on the switch and make sure it isn’t falling back to 100Mb/s somewhere. The debug task in fog might also help you check the link speed and do some copies to check the numbers. For gigabit ethernet the max is somewhere around 117MB/s without jumbo frames; in the real world for NFS I’d expect somewhere around 90MB/s.
EDIT: ethtool is available in the fog debug task; “ethtool eth0” should tell you what link speed was negotiated without messing with your switch. Mine is reporting 100Mb/s for some depressing reason.
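Since this thread mixes GB/min (what partclone shows) with MB/s (what dd and NFS copies show), here is a rough conversion sketch. The function name is made up for illustration, and it assumes 1 GB = 1000 MB; partclone may count differently, so treat the results as approximate.

```shell
# Convert a GB/min figure to MB/s.
# Assumption: 1 GB = 1000 MB (the tool reporting the number may use 1024).
gb_per_min_to_mb_per_s() {
    awk -v g="$1" 'BEGIN { printf "%.1f\n", g * 1000 / 60 }'
}

gb_per_min_to_mb_per_s 7      # the "before" download figure, in MB/s
gb_per_min_to_mb_per_s 1.39   # the "after" download figure, in MB/s
```

Under that assumption, 7 GB/min works out to roughly 117 MB/s and 1.39 GB/min to roughly 23 MB/s.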
-
OK, so I’ve put everything back on gigabit and my numbers are consistent with your “Before” values (7GB/min or so). I’m on 1.1.2, by the way.
I strongly suspect 100BaseT is being negotiated somewhere between your client and fog server. ethtool on the fog server and in the debug task on your client should help you rule them out, then it is just the switch(es) that would need to be checked.
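A minimal sketch of that check, assuming the interface is called eth0 (yours may differ). The helper name `link_speed` is made up; it only pulls the "Speed:" line out of ethtool's normal output, and the sample input below is illustrative, not taken from anyone's machine in this thread.

```shell
# Print the negotiated speed from `ethtool <iface>` output read on stdin.
link_speed() {
    awk -F': ' '/Speed:/ { print $2 }'
}

# On a live system (eth0 is an assumption, substitute your interface):
#   ethtool eth0 | link_speed
# Illustrative sample input in ethtool's output format:
printf '\tSpeed: 1000Mb/s\n\tDuplex: Full\n' | link_speed
```

Run it on the fog server and again from the debug task on the client; if either end prints 100Mb/s, that is the link to chase.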
-
I’ll check this stuff out when I get back from lunch and get back to you.
ethtool on the servers was reporting gigabit earlier today (in the terminals), but I’m going to check it out to be sure.
-
Well, bad news: ethtool is showing all interfaces to be at gigabit speeds, and the switch is showing gigabit speeds being negotiated.
My copies are going as they have before: ~75 MB/sec.
Also, my numbers were based on one machine imaging. With about 7 machines, it got down to ~1.2 GB/min.
-
Are you running multicast or unicast?
-
I know you edited a field to “answer” this, but are these speeds from multicast or unicast or both?
-
[quote=“Tom Elliott, post: 31724, member: 7271”]Are you running multicast or unicast?[/quote]
Unicast, and those speeds are based on one machine at a time.
-
Do you have any Apple devices on your network? Specifically, anything running Bonjour?
If you cut out the middle man as a test, do speeds improve? Here I mean take one of the “slow” clients and place them on the same switch as the FOG Server.
-
Unfortunately, this is all on our in-shop network, so no Apple devices are connected, and it’s all connected to one switch.
I’m imaging Lenovo E540 laptops, and we had the former speeds with these same machines just a couple of days ago.
[quote=“Tom Elliott, post: 31727, member: 7271”]Do you have any Apple devices on your network? Specifically, anything running Bonjour?
If you cut out the middle man as a test, do speeds improve? Here I mean take one of the “slow” clients and place them on the same switch as the FOG Server.[/quote]
-
Former speeds on 1.0.1*
-
I swear, nothing major changed in the “functionality” of the inits, and I haven’t seen a speed increase or decrease on my side. It sounds to me like either the switch is failing or cables are faulty then. While I realize this probably isn’t the case, have you simply tried restarting the switch to see if it helps?
-
I have restarted the switch already.
The only change that I made recently was that I changed the compression of the image from 9 to 3, then I changed it to 1.
I’m going to change that back and re-upload the image and see what that does.
-
*The only change besides upgrading to 1.1.2.
-
While the PIGZ_COMP may play into the speed, I doubt it’ll increase it that much. I did, however, change methods of networking within the kernel to hopefully fix some of the DHCP/BOOTP problems people were having. Maybe try a kernel from the 1.0.1 tag?
-
[quote=“braindead, post: 31723, member: 24282”]
My copies are going as they have before: ~75 MB/sec.
[/quote]Is this a raw copy over NFS? If not, could you try a copy and report the speed? 75MB/s is in the right ballpark for gigabit over NFS.
-
[quote=“ianabc, post: 31746, member: 24548”]Is this a raw copy over NFS? If not, could you try a copy and report the speed? 75MB/s is in the right ballpark for gigabit over NFS.[/quote]
Yeah, 75MB a second is pretty fast. Sorry, I misinterpreted it as 75MB/min.
-
Changed the compression and re-uploaded: no change.
Finally was able to check the NFS transfer speed: I got 26 MB/sec in both directions between one VM and the FOG server. Apparently my 75 MB/sec was the transfer speed between the FOG server and the iSCSI disk.
So I’m left puzzled about what happened between now and then. I did some Ubuntu updates too.
Guess I have something to be puzzled about over the weekend.
-
The iSCSI speed sounds plausible for a single GigE link, but the NFS does indeed seem slow.
If there was a 100Mb link somewhere you would expect to see 7MB/s or so and as you said everything is claiming to be running GigE. 26MB/s is an odd result, especially if that is bi-directional. Can I ask how you are testing? dd over NFS can help but it can be a bit tricky because of the buffering and syncing.
Did you try Tom’s suggestion of switching out the kernel as a test?
Have a fun weekend!
-
Here is a sample NFS mount from a linux client. In principle it should all be GigE between the fog server and this machine, but they are quite a few hops apart and those switches and routers might be busy. Could you try something similar for comparison?
[CODE]
$ mkdir /fogtest
$ mount -o vers=3,nolock IP.OF.YOUR.FOG:/images/dev /fogtest
$ cd /fogtest
$ dd bs=1M count=1024 if=/dev/zero of=zeros.img conv=fdatasync
1+0 records in
1+0 records out
1073741824 bytes (1.1 GB) copied, 30.5422 s, 35.2 MB/s
$ dd bs=1024M if=./zeros.img of=/dev/null iflag=direct
1+0 records in
1+0 records out
1073741824 bytes (1.1 GB) copied, 21.0712 s, 51.0 MB/s
$ rm zeros.img
$ cd /
$ umount /fogtest
$ rmdir /fogtest
[/CODE]
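On the buffering and syncing point: the flags in the dd test above are what keep the numbers honest. A rough sketch of the same test wrapped in two helpers follows. The function names are made up for illustration, and it assumes GNU dd (for conv=fdatasync) and a directory that sits on the NFS mount.

```shell
# Write test: conv=fdatasync makes dd flush the data out before it
# reports a rate, so the figure is not just the speed of filling RAM.
# $1 = directory on the NFS mount, $2 = size in MiB
nfs_write_test() {
    dd if=/dev/zero of="$1/zeros.img" bs=1M count="$2" conv=fdatasync 2>&1 | tail -n 1
}

# Read test: a file that was just written may be served from the
# client's page cache. Either add iflag=direct to the dd line, or drop
# caches first as root:  sync; echo 3 > /proc/sys/vm/drop_caches
# $1 = directory on the NFS mount
nfs_read_test() {
    dd if="$1/zeros.img" of=/dev/null bs=1M 2>&1 | tail -n 1
}

# Example (the path is an assumption):
#   nfs_write_test /fogtest 1024
#   nfs_read_test /fogtest
```

Each helper prints only dd's summary line, which makes repeated runs easier to compare.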
I’m not too worried (or impressed :)) by these numbers. If you get similar results you know that the problem lies further up the stack.
-
Finally got around to testing this using your test; here are my results:
[CODE]$/media/test$ sudo dd bs=1M count=1024 if=/dev/zero of=zeros conv=fdatasync
1024+0 records in
1024+0 records out
1073741824 bytes (1.1 GB) copied, 26.2786 s, 40.9 MB/s
$/media/test$ sudo dd bs=1M count=1024 if=/dev/zero of=zeros conv=fdatasync
1024+0 records in
1024+0 records out
1073741824 bytes (1.1 GB) copied, 24.3489 s, 44.1 MB/s
$/media/test$ sudo dd bs=1M count=1024 if=/dev/zero of=zeros conv=fdatasync
1024+0 records in
1024+0 records out
1073741824 bytes (1.1 GB) copied, 23.3077 s, 46.1 MB/s
$/media/test$ dd bs=1024M if=zeros of=/dev/null
1+0 records in
1+0 records out
1073741824 bytes (1.1 GB) copied, 12.7035 s, 84.5 MB/s
$/media/test$ dd bs=1024M if=zeros of=/dev/null
1+0 records in
1+0 records out
1073741824 bytes (1.1 GB) copied, 12.5741 s, 85.4 MB/s
$/media/test$
$/media/test$ dd bs=1024M if=zeros of=/dev/null
1+0 records in
1+0 records out
1073741824 bytes (1.1 GB) copied, 12.9285 s, 83.1 MB/s
$/media/test$ dd bs=1024M if=zeros of=/dev/null
1+0 records in
1+0 records out
1073741824 bytes (1.1 GB) copied, 13.0684 s, 82.2 MB/s
[/CODE]I was completely perplexed with this issue. It almost seems like an update to 12.04 is causing this, because I updated Ubuntu 12.04 around the same time as upgrading FOG.
Other things I’ve tried that made no difference:
[LIST]
[*]Changing the kernel
[*]Making sure /etc/exports has async
[*]Changed the switch
[*]Loaded FOG 1.0.1 on the iSCSI target and ran FOG from that
[/LIST]
Then, it suddenly occurred to me: I hadn’t tested the server with a different machine. (Facepalm)
I ran FOG on a completely different system/image, but one with comparable image size, and voila – speeds are back.
So here’s what I think the problem is: the laptops I was imaging, [U][URL=‘http://support.lenovo.com/en_US/product-and-parts/detail.page?DocID=PD030723’]Thinkpad Edge E540[/URL][/U], use one of these ethernet controllers: Realtek RTL8111/8168/8411 PCIe GBE Ethernet Controller, and [U][URL=‘http://forums.linuxmint.com/viewtopic.php?f=49&t=152180’]they seem to have an issue with running at 100 Mbit even though it’s a gigabit interface[/URL][/U]. This correlates with my experience and the speeds that I was getting on those machines.
Perhaps, then, the kernels need an update to include those drivers? I’m not sure, but at the very least I wanted to report my findings.
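For anyone wanting to see which driver the kernel actually bound to the NIC, `ethtool -i` reports it. A minimal sketch, assuming the interface is eth0 and that ethtool is available in the debug task as mentioned earlier in the thread. The helper name is made up, and the sample input is illustrative only (r8169 is the in-kernel driver commonly used for this Realtek family), not output from the E540s.

```shell
# Print the driver name from `ethtool -i <iface>` output read on stdin.
nic_driver() {
    awk -F': ' '$1 == "driver" { print $2 }'
}

# On a live system (eth0 is an assumption, substitute your interface):
#   ethtool -i eth0 | nic_driver
# Illustrative sample input in ethtool -i's output format:
printf 'driver: r8169\nbus-info: 0000:02:00.0\n' | nic_driver
```

Comparing that against the negotiated speed from plain “ethtool eth0” on the same client would show whether the driver and the 100 Mbit fallback go together.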
PS- I think the hyperlink color should change, or at least add an underline like below. One-word links seem to blend in too well.
Also, thanks again for all the help.