[Seeking Volunteers] Bench Testing! Our trip to the best results!
-
@Junkhacker are you using any 10Gb network appliances, or 1Gb?
-
@Junkhacker said in [Seeking Volunteers] Bench Testing! Our trip to the best results!:
oh, also, i disagree with george about how fast an “ideal setup” can be with a single GbE network and one unicast:
Boy, who made Mr. Hacker grumpy today?
We all know this already, but the number shown in Partclone (GB/m) is actually a composite score of network transfer rate, decompression rate, and the speed of writing the image onto the storage media.
In a typical 1GbE network if someone said they were getting between 6.0 and 6.7GB/m deploy rate, based on my experience I would say that’s normal.
If we just look at the numbers, a 1GbE network has a theoretical maximum throughput of 125MB/s. In practice I see 110 to 120MB/s. So let’s use 120MB/s x 60 seconds = 7.2GB/m. So a 1GbE link can only transmit a maximum of 7.2GB/m, and in theory it’s not possible to move data over the wire any faster than that. Your video showed 10GB/m, how is that possible? That is because the partclone number is a composite number which also includes image expansion and writing to the storage media. If you have a fast target computer with a fast disk, the target computer can take that 7.2GB/m data stream, expand it and write it to disk just as fast as the data can get to the target computer.
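For anyone who wants to redo that arithmetic with their own numbers, here is a minimal Python sketch. The function name is made up for illustration, and it assumes decimal units (1 GB = 1000 MB), which is what the 7.2GB/m figure above implies.

```python
def link_ceiling_gb_per_min(mb_per_s: float) -> float:
    """Convert a sustained transfer rate in MB/s to GB/min (decimal: 1 GB = 1000 MB)."""
    return mb_per_s * 60 / 1000


theoretical = link_ceiling_gb_per_min(125)     # 1 Gbit/s divided by 8 bits per byte
practical_low = link_ceiling_gb_per_min(110)   # low end of the observed range
practical_high = link_ceiling_gb_per_min(120)  # high end of the observed range

print(f"theoretical ceiling: {theoretical:.1f} GB/min")
print(f"practical range: {practical_low:.1f} to {practical_high:.1f} GB/min")
```

The practical range works out to roughly 6.6 to 7.2 GB/min of data on the wire, which lines up with the 6.0 to 6.7GB/m deploy rates described as normal above.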
-
@george1421 right, don’t forget that the amount of data written to disk is much larger because of the compression ratio, which is why the compression method and the speed of decompression are so important.
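A rough way to see the effect of the compression ratio, sketched in Python. This assumes the Partclone figure counts uncompressed bytes written to disk and that the link is the only limit; the function names and the 2.0 ratio are hypothetical, chosen only for illustration.

```python
def implied_compression_ratio(partclone_gb_per_min: float, wire_gb_per_min: float) -> float:
    """Smallest compression ratio that explains a Partclone rate above the wire rate."""
    return partclone_gb_per_min / wire_gb_per_min


def expanded_rate(wire_gb_per_min: float, compression_ratio: float) -> float:
    """Uncompressed data rate produced from a compressed stream, if CPU and disk keep up."""
    return wire_gb_per_min * compression_ratio


WIRE_LIMIT = 7.2  # GB/min, practical 1GbE figure quoted in the thread

ratio = implied_compression_ratio(10.0, WIRE_LIMIT)  # 10 GB/min was the rate in the video
print(f"10 GB/min over a {WIRE_LIMIT} GB/min link needs at least {ratio:.2f}:1 compression")

# Hypothetical 2:1 image: what the same link could feed to the disk
print(f"2:1 image on the same link: {expanded_rate(WIRE_LIMIT, 2.0):.1f} GB/min")
```

So a 10GB/m result on 1GbE does not break the link limit; it only requires the image to compress by a bit under 1.4:1 and the target to decompress and write fast enough.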
-
@george1421 all of what you said is true, but it just emphasizes the importance of benchmarking compression. with all of the variables that can come into play, the one thing you can usually rely on being consistent among people’s setups is 1GbE to the end client.
the maximum transfer rate on gigabit is well established, but what’s important is the end result speed of writing to disk, and that’s what we can affect with compression methods.
btw, i wasn’t being grumpy. i just like to highlight how fast Fog can be. it’s one of Fog’s killer features that other methods can’t beat. (if anyone has seen a faster deployment method than Fog, please let me know.)
-
We’re currently around 15 GB/min with unicast and 8 GB/min with multicast. Why almost a 50% difference? We’re continuing to tweak and figure some things out.
Would there be a difference between a small 24GB image and one of our large 300GB+ ones?
-
Imaging speed is determined by Stream->Decompress->Write to disk. (Technically the speed you see in partclone is purely the speed at which it’s writing to the disk.) However, this is limited by the amount of bandwidth available to send the data.
As @george1421 showed, from a purely networking standpoint, the maximum would be about 7.2 GB/min.
This is likely where you’re seeing a variance in the Unicast vs. Multicast.
I’m assuming in the Unicast instance, you’re sending the image to a single machine, so it’s got the full gigabit link available for network streaming. In the case of multicast, it floods your entire network (subnet basically) with the packets. It also has to keep a “steady” stream, so the slowest machine in the group will be the limiting factor for all machines.
I’ll bet if you Unicast to 5 machines and Multicast to the same 5 machines, you’ll see a semi-evening out of the speed rates.
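A toy model of that comparison, as a hedged Python sketch. It assumes unicast streams split the server link evenly and that a multicast group runs at the pace of its slowest member; it does not model switch behaviour or protocol overhead. The function names and the per-client caps are hypothetical.

```python
def unicast_rates(link_mb_s, client_caps_mb_s):
    """Per-client rate when each client gets its own stream over one shared server link.

    Assumes the link is split evenly and no client goes faster than its own cap.
    """
    share = link_mb_s / len(client_caps_mb_s)
    return [min(share, cap) for cap in client_caps_mb_s]


def multicast_rate(link_mb_s, client_caps_mb_s):
    """Rate of the single shared stream: paced by the slowest member of the group."""
    return min(link_mb_s, min(client_caps_mb_s))


LINK = 120.0                               # practical 1GbE rate from the thread, MB/s
caps = [120.0, 120.0, 120.0, 120.0, 60.0]  # hypothetical: one slow client in the group

print("unicast, 1 client:   ", unicast_rates(LINK, caps[:1]))
print("unicast, 5 clients:  ", unicast_rates(LINK, caps))
print("multicast, 5 clients:", multicast_rate(LINK, caps))
```

With these made-up caps, a single unicast client gets the whole link, five unicast clients each drop to a fifth of it, and the multicast group settles at the slow client’s pace, so the gap between the two methods changes a lot once several machines are involved.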
-
@Tom-Elliott said in [Seeking Volunteers] Bench Testing! Our trip to the best results!:
I’ll bet if you Unicast to 5 machines and Multicast to the same 5 machines, you’ll see a semi-evening out of the speed rates.
Only citing for reference: During my testing 3 unicast images would saturate a 1GbE link. So 5 would surely show the difference between multicasting and unicasting.
@Mokerhamer You also need to remember two things about multicasting.
- The speed of the multicast is controlled by the speed of the slowest computer in the multicast group. If you have a computer that is slow to check in for the “next block”, that will impact the whole group.
- Multicasting is a different technology than unicasting. Multicasting relies on how efficiently your network switches handle multicast packets, whether you have IGMP snooping enabled, and whether the switches run PIM in sparse or dense mode.
Please also understand that no one here is discouraging your testing. It is and will be helpful to future FOG admins. It’s an interesting topic, which is why you have so much focus on your thread. Well done!!
-
You hit the nail right on the head with the multicast details; it took a few tries to get all the details configured. We’re now thinking about setting up a 10GbE network and doing the exact same tests. Just curious… what speed would we reach? Especially with all the variables in play.
This is pure trial and error to find the limits: fail countless times and still keep seeking answers. We’re using something new with a very high compression ratio, and I find there is a limited pool of information about it. So I am extra curious about pushing the limits with this.
In my eyes this trial and error can make or break a future plan for our classroom hardware architecture.
-
@Mokerhamer said in [Seeking Volunteers] Bench Testing! Our trip to the best results!:
We’re now thinking about setting up a 10GbE network and doing the exact same tests.
What we have here is a 3 legged stool. On the one leg we have CPU+Memory, on the next leg we have the disk subsystem and on the final leg we have networking. It’s always a challenge to see which way the stool will tip.
If you look a bit back in time at the target computers, the disk subsystems were the bottleneck. They were in the range of 40-90MB/s. The CPU+memory has been fine from the speed side for many years, and the networking had plenty of bandwidth.
Now look at today: we still have primarily a 1 GbE networking infrastructure to the desk, NVMe disks that can write upwards of 700MB/s, and fast and WIDE CPUs (multiple cores). Now the bottleneck is the network. It just can’t pump the bits down the wire fast enough to keep both the CPU and disk busy.
Moving to 10GbE, it will be interesting to see which leg will fail next. With 10GbE you will have a maximum throughput of 1,250 MB/s. On a clear network you “should” be able to saturate that disk subsystem again, assuming the CPU+memory can keep up with the network card and expand the data stream fast enough.
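The three-legged stool can be sketched as a simple "slowest leg wins" calculation. This is only an illustration in Python: the function name is made up, and the 2:1 compression ratio and 1500 MB/s decompression rate are hypothetical placeholders. The 120, 1250 and 700 MB/s figures are the ones quoted in this thread.

```python
def bottleneck(network_mb_s, decompress_mb_s, disk_mb_s, compression_ratio):
    """Return (limiting leg, resulting write rate in MB/s of uncompressed data).

    All three legs are expressed as uncompressed MB/s so they can be compared:
    the compressed network stream expands by the compression ratio on arrival.
    """
    legs = {
        "network": network_mb_s * compression_ratio,
        "cpu": decompress_mb_s,  # rate at which the CPU can emit decompressed data
        "disk": disk_mb_s,       # sustained write rate of the target disk
    }
    name = min(legs, key=legs.get)
    return name, legs[name]


RATIO = 2.0        # hypothetical compression ratio
CPU_RATE = 1500.0  # hypothetical decompression output rate, MB/s
NVME_RATE = 700.0  # NVMe write rate mentioned in the thread, MB/s

print("1GbE: ", bottleneck(120.0, CPU_RATE, NVME_RATE, RATIO))
print("10GbE:", bottleneck(1250.0, CPU_RATE, NVME_RATE, RATIO))
```

Under these assumptions the network is the short leg on 1GbE, and on 10GbE the limit moves to the disk. A slower CPU or a different compression ratio would tip the stool another way, which is why real measurements matter.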
Make sure when you get it sorted out you share your changes on what you found so others can see the improvements you’ve made.
-
We’re having difficulties with the 10GbE network card on the client.
We’ve fully disabled the onboard NIC on the system (BIOS).
System boots PXE (TFTP/HTTP)… but when it wants to mount FOG it suddenly says no DHCP on the ENP12S0 NIC. Like it’s expecting to receive DHCP on the onboard NIC. Didn’t expect that…
-
@Mokerhamer said in [Seeking Volunteers] Bench Testing! Our trip to the best results!:
but when it wants to mount FOG
Let’s just be sure I understand correctly.
You can PXE boot into the FOG iPXE menu. When you select something like full registration or pick imaging, both bzImage and init.xz are transferred to the target computer. The target computer then starts FOS Linux, but during the boot of FOS, you get to a point where it can’t get an IP address or contact the FOG server; it tries 3 times and then gives up? Is that where it’s failing?
-
Yes!
-
@Mokerhamer Something happened with the picture upload. You need to wait until the image appears in the right edit panel before submitting your post.
OK it sounds like FOS Linux doesn’t have the driver for your network adapter.
Let’s start out by having you schedule a debug deploy/capture to this target computer. When you schedule the task tick the debug checkbox before you press the schedule task button.
PXE boot the target computer. After several screens of text that you clear by pressing the Enter key, you should be dropped to the FOS Linux command prompt.
At the FOS Linux command prompt key in the following and post the screenshots here.
ip link show
lspci -nn|grep -i net
Also what model of 10G adapter are you using?
-
@george1421 said in [Seeking Volunteers] Bench Testing! Our trip to the best results!:
When you schedule the task tick the debug checkbox before you press the schedule task button.
*Doing it now (debug).
*Nic X550T1BLK
https://www.kommago.nl/intel-x550-t1-10-gigabit-netwerk-adapter/pid=51799
-
@Mokerhamer If you have this NIC in a running Windows box, will you get the hardware ID of it? Or, from FOS Linux, run the lspci command as I outlined. I’ll look it up to see if Linux supports that card.
The 10G stuff is new and may not be enabled in FOS Linux. Having the hardware ID will help (e.g. 8086:1AF2, a made-up number, but that is the format I’m looking for).
-
@Mokerhamer That card’s driver should be included with FOS Linux; it has been in the Linux kernel since 4.7. I checked and it’s enabled in the FOS Linux build config: https://github.com/FOGProject/fos/blob/master/configs/kernelx64.config#L1447
From the FOS Linux command prompt key in
ip addr show
uname -a
and post the results
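For anyone who wants to repeat the build-config check mentioned above on their own copy of a kernel config, here is a small Python sketch. It assumes the Intel X550 is handled by the ixgbe driver, whose Kconfig symbol is CONFIG_IXGBE. The helper name is made up, and the sample text is an illustrative fragment, not a copy of the FOS config.

```python
def kernel_option_state(config_text: str, option: str) -> str:
    """Return 'y' (built in), 'm' (module) or 'n' (disabled/absent) for a Kconfig option."""
    for line in config_text.splitlines():
        line = line.strip()
        if line.startswith(option + "="):
            return line.split("=", 1)[1]
        if line == f"# {option} is not set":
            return "n"
    return "n"  # options missing from the file are treated as disabled


# Illustrative fragment only; read the real file from disk in practice.
sample = """\
CONFIG_IXGBE=y
# CONFIG_IXGBEVF is not set
"""

print("CONFIG_IXGBE:", kernel_option_state(sample, "CONFIG_IXGBE"))
print("CONFIG_IXGBEVF:", kernel_option_state(sample, "CONFIG_IXGBEVF"))
```

A driver being enabled in the config only shows that the kernel can support the card; it does not prove the interface came up, which is why the `ip addr show` output is still needed.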
-
Okay.
-
@Mokerhamer Well, this is a good one. It should be working.
At the FOS Linux command prompt key in
/sbin/udhcpc -i enp11s0 --now
then do an
ip addr show
-
@george1421
Done