Slowdown Unicast and Multicast after upgrading FOG Server



  • @george1421

    Server

    -----------------------------------------------------------
    Server listening on 5201
    -----------------------------------------------------------
    Accepted connection from x.x.x.x, port 50672
    [  5] local x.x.x.x port 5201 connected to x.x.x.x port 50674
    [ ID] Interval           Transfer     Bandwidth
    [  5]   0.00-1.00   sec   108 MBytes   903 Mbits/sec
    [  5]   1.00-2.00   sec   112 MBytes   942 Mbits/sec
    [  5]   2.00-3.00   sec   112 MBytes   942 Mbits/sec
    [  5]   3.00-4.00   sec   112 MBytes   942 Mbits/sec
    [  5]   4.00-5.00   sec   112 MBytes   942 Mbits/sec
    [  5]   5.00-6.00   sec   112 MBytes   942 Mbits/sec
    [  5]   6.00-7.00   sec   112 MBytes   942 Mbits/sec
    [  5]   7.00-8.00   sec   112 MBytes   942 Mbits/sec
    [  5]   8.00-9.00   sec   112 MBytes   942 Mbits/sec
    [  5]   9.00-10.00  sec   112 MBytes   942 Mbits/sec
    [  5]  10.00-10.04  sec  4.35 MBytes   937 Mbits/sec
    - - - - - - - - - - - - - - - - - - - - - - - - -
    [ ID] Interval           Transfer     Bandwidth       Retr
    [  5]   0.00-10.04  sec  1.10 GBytes   939 Mbits/sec   11             sender
    [  5]   0.00-10.04  sec  1.10 GBytes   938 Mbits/sec                  receiver
    -----------------------------------------------------------
    Server listening on 5201
    -----------------------------------------------------------
    

    Client

    Connecting to host x.x.x.x, port 5201
    [  5] local x.x.x.x port 50674 connected to x.x.x.x port 5201
    [ ID] Interval           Transfer     Bitrate         Retr  Cwnd
    [  5]   0.00-1.00   sec   113 MBytes   947 Mbits/sec    3    258 KBytes       
    [  5]   1.00-2.00   sec   112 MBytes   943 Mbits/sec    0    364 KBytes       
    [  5]   2.00-3.00   sec   112 MBytes   939 Mbits/sec    2    232 KBytes       
    [  5]   3.00-4.00   sec   112 MBytes   943 Mbits/sec    1    318 KBytes       
    [  5]   4.00-5.00   sec   112 MBytes   943 Mbits/sec    2    211 KBytes       
    [  5]   5.00-6.00   sec   112 MBytes   943 Mbits/sec    0    364 KBytes       
    [  5]   6.00-7.00   sec   112 MBytes   943 Mbits/sec    1    267 KBytes       
    [  5]   7.00-8.00   sec   112 MBytes   943 Mbits/sec    1    364 KBytes       
    [  5]   8.00-9.00   sec   112 MBytes   943 Mbits/sec    0    366 KBytes       
    [  5]   9.00-10.00  sec   112 MBytes   943 Mbits/sec    1    282 KBytes       
    - - - - - - - - - - - - - - - - - - - - - - - - -
    [ ID] Interval           Transfer     Bitrate         Retr
    [  5]   0.00-10.00  sec  1.10 GBytes   943 Mbits/sec   11             sender
    [  5]   0.00-10.00  sec  1.10 GBytes   942 Mbits/sec                  receiver
    
    iperf Done.
    

  • Moderator

    @Quazz Interesting, regarding the previous dd test. I would like to see this dd test, and then the next step is an iperf test. That will exercise the local disk and then the network without involving the NFS stack or partclone. At least in my mind, that is how I would break it down. Something had to have changed besides FOG.


  • Moderator

    @george1421 He did some write tests earlier using dd, getting around 100 MB/s from RAM to disk.


  • Moderator

    @mp12 Read performance is what I would expect from an SSD drive.

    Next we will test write performance. For this we need to collect the structure of the existing SSD drive. What we need to find is a partition that has at least 1 GB of disk space.

    Show me the output from this command: lsblk (executed on the target computer in a debug console)
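    A minimal way to run it (the -o column list is my addition for readability; plain lsblk as asked works just as well):

```shell
# List block devices with size, type, filesystem and mount point, so a
# partition with at least 1 GB of space can be picked out.
# (Run in the FOG debug console on the target computer.)
lsblk -o NAME,SIZE,TYPE,FSTYPE,MOUNTPOINT
```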

    NOTE: The document I’m working from is referenced here: https://forums.fogproject.org/topic/10459/can-you-make-fog-imaging-go-fast



  • @george1421

    /dev/sda:
     Timing cached reads:   29692 MB in  1.99 seconds = 14911.18 MB/sec
     Timing buffered disk reads: 1614 MB in  3.00 seconds = 537.87 MB/sec
    

  • Moderator

    @mp12 I assume you ran the last hdparm from a debug console on the target computer. If so, let's run this one too: hdparm -Tt /dev/sda (-T times cached reads from RAM, -t times buffered reads from the disk itself). That should give us the disk performance test. I'm not totally convinced it's a target computer issue, but we need to start collecting data where we can.



  • @Quazz

    Here is the output:

    /dev/sda:
    
    ATA device, with non-removable media
    	Model Number:       Samsung SSD 860 EVO 500GB               
    	Serial Number:      S3Z2NB1KA50028H     
    	Firmware Revision:  RVT01B6Q
    	Transport:          Serial, ATA8-AST, SATA 1.0a, SATA II Extensions, SATA Rev 2.5, SATA Rev 2.6, SATA Rev 3.0
    Standards:
    	Used: unknown (minor revision code 0x005e) 
    	Supported: 11 8 7 6 5 
    	Likely used: 11
    Configuration:
    	Logical		max	current
    	cylinders	16383	16383
    	heads		16	16
    	sectors/track	63	63
    	--
    	CHS current addressable sectors:    16514064
    	LBA    user addressable sectors:   268435455
    	LBA48  user addressable sectors:   976773168
    	Logical  Sector size:                   512 bytes
    	Physical Sector size:                   512 bytes
    	Logical Sector-0 offset:                  0 bytes
    	device size with M = 1024*1024:      476940 MBytes
    	device size with M = 1000*1000:      500107 MBytes (500 GB)
    	cache/buffer size  = unknown
    	Form Factor: 2.5 inch
    	Nominal Media Rotation Rate: Solid State Device
    Capabilities:
    	LBA, IORDY(can be disabled)
    	Queue depth: 32
    	Standby timer values: spec'd by Standard, no device specific minimum
    	R/W multiple sector transfer: Max = 1	Current = 1
    	DMA: mdma0 mdma1 mdma2 udma0 udma1 udma2 udma3 udma4 udma5 *udma6 
    	     Cycle time: min=120ns recommended=120ns
    	PIO: pio0 pio1 pio2 pio3 pio4 
    	     Cycle time: no flow control=120ns  IORDY flow control=120ns
    Commands/features:
    	Enabled	Supported:
    	   *	SMART feature set
    	    	Security Mode feature set
    	   *	Power Management feature set
    	   *	Write cache
    	   *	Look-ahead
    	   *	Host Protected Area feature set
    	   *	WRITE_BUFFER command
    	   *	READ_BUFFER command
    	   *	NOP cmd
    	   *	DOWNLOAD_MICROCODE
    	    	SET_MAX security extension
    	   *	48-bit Address feature set
    	   *	Device Configuration Overlay feature set
    	   *	Mandatory FLUSH_CACHE
    	   *	FLUSH_CACHE_EXT
    	   *	SMART error logging
    	   *	SMART self-test
    	   *	General Purpose Logging feature set
    	   *	WRITE_{DMA|MULTIPLE}_FUA_EXT
    	   *	64-bit World wide name
    	    	Write-Read-Verify feature set
    	   *	WRITE_UNCORRECTABLE_EXT command
    	   *	{READ,WRITE}_DMA_EXT_GPL commands
    	   *	Segmented DOWNLOAD_MICROCODE
    	   *	Gen1 signaling speed (1.5Gb/s)
    	   *	Gen2 signaling speed (3.0Gb/s)
    	   *	Gen3 signaling speed (6.0Gb/s)
    	   *	Native Command Queueing (NCQ)
    	   *	Phy event counters
    	   *	READ_LOG_DMA_EXT equivalent to READ_LOG_EXT
    	   *	DMA Setup Auto-Activate optimization
    	    	Device-initiated interface power management
    	   *	Asynchronous notification (eg. media change)
    	   *	Software settings preservation
    	    	Device Sleep (DEVSLP)
    	   *	SMART Command Transport (SCT) feature set
    	   *	SCT Write Same (AC2)
    	   *	SCT Error Recovery Control (AC3)
    	   *	SCT Features Control (AC4)
    	   *	SCT Data Tables (AC5)
    	   *	reserved 69[4]
    	   *	DOWNLOAD MICROCODE DMA command
    	   *	SET MAX SETPASSWORD/UNLOCK DMA commands
    	   *	WRITE BUFFER DMA command
    	   *	READ BUFFER DMA command
    	   *	Data Set Management TRIM supported (limit 8 blocks)
    	   *	Deterministic read ZEROs after TRIM
    Security: 
    	Master password revision code = 65534
    		supported
    	not	enabled
    	not	locked
    		frozen
    	not	expired: security count
    		supported: enhanced erase
    	4min for SECURITY ERASE UNIT. 8min for ENHANCED SECURITY ERASE UNIT.
    Logical Unit WWN Device Identifier: 5002538e408a5e55
    	NAA		: 5
    	IEEE OUI	: 002538
    	Unique ID	: e408a5e55
    Device Sleep:
    	DEVSLP Exit Timeout (DETO): 50 ms (drive)
    	Minimum DEVSLP Assertion Time (MDAT): 30 ms (drive)
    Checksum: correct
    

  • Moderator

    @mp12 Those speeds are about 4 times slower than we'd expect for a SATA SSD.

    Can you also try the command hdparm -I /dev/sda in debug?



  • @Sebastian-Roth

    Here are the results:

    DSC_0580.JPG

    I also copied some rows out of the MySQL table fog.tasks. Here you can see the difference in speed and duration.

    First a normal multicast with FOG 1.5.3.* (dev-branch). Sorry I don’t know the exact version anymore.

    | 17919 | Multi-Cast Task | 2020-01-24 08:11:41 | 2020-01-24 08:12:33 | 143 | 389 | 4 | 0 | fog | 0 | 0000-00-00 00:00:00 | 8 | 0000000100 | 12.42GB | 00:21:00 | 00:00:00 | 249.006 GiB | 100 | 249.124 GiB | 1 |  1 |   | 0 | 1  |   |   |
    

    This is the last single deploy I made with FOG 1.5.3.* (dev-branch):

    | 18027 | Deploy Task | 2020-02-10 15:16:44 | 2020-02-10 15:17:34 | 214 | 391 | 4 | 0 | fog | 0 | 0000-00-00 00:00:00 | 1 | 0000000100 | 12.72GB | 00:20:38 | 00:00:00 | 250.320 GiB | 100 | 250.650 GiB | 1 | 9 |   | 0 | 1 |   |   |
    

    Here a multicast after upgrading to FOG 1.5.7.109 (dev-branch):

    | 18032 | Multi-Cast Task - 04-634-Reihe-1 | 2020-02-10 16:40:43 | 2020-02-10 16:42:14 | 214 | 391 | 4 | 0 | fog | 0 | 0000-00-00 00:00:00 | 8 | 0000000100 | 1.13GB | 03:52:47 | 00:00:00 | 250.422 GiB | 100 | 250.650 GiB | 1 | 1 |   |  0 | 1   |    |    |
    

    And this single deploy was created with FOG 1.5.7.109 (dev-branch) the day after:

    | 18042 | Deploy Task - 04-628-36 | 2020-02-11 12:53:24 | 2020-02-11 12:54:13 | 347 | 392 | 4 | 0 | fog | 0  | 0000-00-00 00:00:00 | 1 | 0000000100 | 4.61GB | 00:56:53 | 00:00:00 | 250.371 GiB | 100 | 250.620 GiB | 1 | 9 |  |  0 | 1 |   |    |
    

  • Developer

    @mp12 said in Slowdown Unicast and Multicast after upgrading FOG Server:

    It is a normal 2.5 in. SATA Samsung 860 EVO with 500GB

    Ok, I was totally on the wrong track with it being a NVMe issue.

    Then we need to start looking at the components one by one to figure out what's causing the slowdown. I would start with a single host, as multicast testing is way more complex. Schedule a debug deploy task (just as if you created a normal task, but make sure to check the box for debug before clicking the create task button) for one host. Let it boot up to the terminal. To get the NFS share mounted, run the command fog and hit ENTER a few times till you see the message about the share being mounted and checked. Then just hit Ctrl+C to stop the deploy and get back to the terminal.

    Now we will run a few different tests:

    mkdir /mnt/ramdisk
    mount -t tmpfs -o size=1024m tmpfs /mnt/ramdisk
    dd if=/images/#IMAGENAME#/d1p2.img of=/mnt/ramdisk/test.img
    

    This will roughly test the network speed. It will copy about 1 GB of data from your FOG server to the client without writing to the local disk. Make sure you put in the correct #IMAGENAME#, and depending on your disk layout you might need to choose a different partition file (d1p1.img or d1p3.img) that is larger than 1 GB to copy from.

    Now we’ll dump that file from local RAM to disk. No network involved. Be aware this will wipe the data off your drive! You’d need to properly re-deploy this client after the test.

    dd if=/mnt/ramdisk/test.img of=/dev/sda
    

    Now take a picture of the screen and post that here in the forums!
    For a clean shutdown you can run:

    umount /mnt/ramdisk
    umount /images
    halt
    

    This is mostly from the top of my head so don’t hesitate to ask if this doesn’t work as described.
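    As an editor's aside (not part of the steps above): if you want a raw write-throughput number without wiping the drive, dd can target a scratch file on a mounted filesystem instead of /dev/sda. conv=fdatasync forces the data to stable storage before dd reports a rate, so the figure isn't just the page cache. In the FOG debug environment a path under /tmp may sit on a RAM-backed filesystem, so pick a path on the SSD for a meaningful number; the path and 64 MiB size below are arbitrary.

```shell
# Sketch: measure write throughput into a scratch file rather than the raw
# disk. conv=fdatasync flushes data to the device before dd prints its rate;
# the last stderr line from dd carries the MB/s figure.
dd if=/dev/zero of=/tmp/ddtest.bin bs=1M count=64 conv=fdatasync 2>&1 | tail -n 1
rm -f /tmp/ddtest.bin
```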



  • @Sebastian-Roth

    It is a normal 2.5 in. SATA Samsung 860 EVO with 500GB (Model: MZ-76E500B/AM). Any other suggestions?


  • Developer

    @mp12 said in Slowdown Unicast and Multicast after upgrading FOG Server:

    Hardware Client: Dell Optiplex 9010, i7, 16GB RAM, SATA Samsung EVO 860 500GB

    Is this a SAMSUNG 860 EVO M.2, 500 GB SSD? If yes, then create a post-init script and add this command: nvme set-feature -f 0x0c -v=0 /dev/nvme0

    If not, we need to start looking at other things that could be causing this.



  • Will quit work for now and restart at 8 a.m. CET

    @Sebastian-Roth
    No performance tweaks. Deployment around 1 hour.
    Hardware Client: Dell Optiplex 9010, i7, 16GB RAM, SATA Samsung EVO 860 500GB

    So we are ready to try the other things ;-)

    @george1421
    We tried two different older kernels but the results were even worse.


  • Developer

    @mp12 I would imagine this to be caused by the NVMe issues we have seen in the past weeks and months. Please edit one of your slow host’s settings and add nvme_core.default_ps_max_latency_us=0 as Host Kernel Arguments. Then try deploying again. If that doesn’t make a difference I have more things for you to try. Just give us a shout.
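    A quick way to confirm (after the reboot) that the argument was actually picked up; a sketch assuming the standard procfs/sysfs layout:

```shell
# Show any nvme_core.* arguments the kernel booted with; prints a fallback
# message if none is present.
grep -o 'nvme_core[^ ]*' /proc/cmdline || echo "argument not on cmdline"
# The live value of the setting; this file only exists when the nvme_core
# module is loaded, so fall back gracefully on non-NVMe systems.
cat /sys/module/nvme_core/parameters/default_ps_max_latency_us 2>/dev/null \
  || echo "nvme_core module not loaded"
```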


  • Moderator

    @mp12 said in Slowdown Unicast and Multicast after upgrading FOG Server:

    Unicast FOG 1.5.3: 21 minutes 20 seconds
    Unicast FOG 1.5.7: 1 hour 7 minutes 40 seconds

    That translates to roughly 11 GB/min for 1.5.3 and 3 GB/min for 1.5.7 (just to be clear, we are talking about deployment speeds, right?)
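    The arithmetic behind those figures, taking the roughly 250 GiB image size from the fog.tasks rows above and the two quoted runtimes:

```shell
# 1.5.3 unicast took 21m20s, 1.5.7 unicast took 1h07m40s, image ~250 GiB.
awk 'BEGIN {
  size = 250                  # GiB, from the fog.tasks rows
  t153 = 21 + 20/60           # minutes, FOG 1.5.3 unicast
  t157 = 67 + 40/60           # minutes, FOG 1.5.7 unicast
  printf "1.5.3: %.1f GiB/min\n", size / t153   # ~11.7
  printf "1.5.7: %.1f GiB/min\n", size / t157   # ~3.7
}'
```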

    I have to say I don’t believe you would get that change just by upgrading from 1.5.3 to 1.5.7. It’s not possible. The heavy workload is done by the target computer, not the FOG server. Something else in your environment had to change. That is what my intuition is telling me.

    With that said, it would be interesting to see what would happen if we used the older inits and kernel (from 1.5.3) in your current 1.5.7 environment. Those execute directly on the target computer, possibly impacting imaging speeds.

