@mfinn999 Deduplication really hasn’t been studied on FOG-captured images. In the two links you provided there was discussion about adding certain options to the utilities that capture the image, and those options were added to the FOG code base.
here: https://github.com/FOGProject/fos/blob/master/Buildroot/board/FOG/FOS/rootfs_overlay/usr/share/fog/lib/funcs.sh#L2089
and here: https://github.com/FOGProject/fos/blob/master/Buildroot/board/FOG/FOS/rootfs_overlay/usr/share/fog/lib/funcs.sh#L1594
Beyond that, no other testing has been done. Also realize that FOG itself does nothing in regard to dedup; that role has to be handled by the host OS or the storage hardware of the FOG server.
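For example (just a sketch, assuming the FOG server’s /images directory lives on a ZFS dataset; the pool and dataset names here are made up), host-level dedup would be enabled and checked something like this:

```bash
# Hypothetical example: enable block-level dedup on the dataset that holds /images
# (assumes a ZFS-backed FOG server with a dataset named tank/images)
zfs set dedup=on tank/images

# ZFS reports the dedup ratio pool-wide, in the DEDUP column
zpool list tank
```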
Also understand that the options that were added to the image capture do not specifically address dedup. Those new settings only affect newly captured images in zstd format, not gzip.
The way FOG captures images is that it uses a utility called partclone to read the disk blocks on the target computer, then pipes those blocks through a compressor (zstd or gzip) before they are sent to the FOG server. The FOG server takes that compressed stream from the target computer and writes it, unaltered, to the FOG server’s disk. So what’s written to the FOG server’s disk is a packed (compressed) binary file. I can’t see how two such images would share enough duplicate blocks to make dedup effective here.
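Very roughly, the capture side looks like the sketch below (a simplified illustration, not the actual funcs.sh code; the device, compression level, and output path are made up):

```bash
# Simplified sketch of a capture: partclone reads only the used blocks of the
# partition, zstd compresses the stream on the client, and the result lands on
# the server's (NFS-mounted) /images share still compressed.
partclone.ntfs -c -s /dev/sda1 -o - | zstdmt -6 -o /images/dev/0123456789ab/d1p1.img
```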
@Junkhacker @Quazz Do you know of anything I’m missing here?
TBH I wonder if the -B (block size) option for zstd would have an impact on how well the images dedup. But testing that would require the FOG developers to have access to dedup-capable storage (and the desire) to see if there were any improvements that could be made in this area. The relevant zstd options are in the help output below, and there’s a rough test sketch after it.
```
[donald@duckserver html]# zstdmt --help
*** zstd command line interface 64-bits v1.4.4, by Yann Collet ***
Usage :
zstdmt [args] [FILE(s)] [-o file]
FILE : a filename
with no FILE, or when FILE is - , read standard input
Arguments :
-# : # compression level (1-19, default: 3)
-d : decompression
-D file: use `file` as Dictionary
-o file: result stored into `file` (only if 1 input file)
-f : overwrite output without prompting and (de)compress links
--rm : remove source file(s) after successful de/compression
-k : preserve source file(s) (default)
-h/-H : display help/long help and exit
Advanced arguments :
-V : display Version number and exit
-v : verbose mode; specify multiple times to increase verbosity
-q : suppress warnings; specify twice to suppress errors too
-c : force write to standard output, even if it is the console
-l : print information about zstd compressed files
--exclude-compressed: only compress files that are not previously compressed
--ultra : enable levels beyond 19, up to 22 (requires more memory)
--long[=#]: enable long distance matching with given window log (default: 27)
--fast[=#]: switch to very fast compression levels (default: 1)
--adapt : dynamically adapt compression level to I/O conditions
--stream-size=# : optimize compression parameters for streaming input of given number of bytes
--size-hint=# optimize compression parameters for streaming input of approximately this size
--target-compressed-block-size=# : make compressed block near targeted size
-T# : spawns # compression threads (default: 1, 0==# cores)
-B# : select size of each job (default: 0==automatic)
--rsyncable : compress using a rsync-friendly method (-B sets block size)
--no-dictID : don't write dictID into header (dictionary compression)
--[no-]check : integrity check (default: enabled)
--[no-]compress-literals : force (un)compressed literals
-r : operate recursively on directories
--output-dir-flat[=directory]: all resulting files stored into `directory`.
--format=zstd : compress files to the .zst format (default)
--test : test compressed file integrity
--[no-]sparse : sparse mode (default: enabled on file, disabled on stdout)
-M# : Set a memory usage limit for decompression
--no-progress : do not display the progress bar
-- : All arguments after "--" are treated as files
Dictionary builder :
--train ## : create a dictionary from a training set of files
--train-cover[=k=#,d=#,steps=#,split=#,shrink[=#]] : use the cover algorithm with optional args
--train-fastcover[=k=#,d=#,f=#,steps=#,split=#,accel=#,shrink[=#]] : use the fast cover algorithm with optional args
--train-legacy[=s=#] : use the legacy algorithm with selectivity (default: 9)
-o file : `file` is dictionary name (default: dictionary)
--maxdict=# : limit dictionary to specified size (default: 112640)
--dictID=# : force dictionary ID to specified value (default: random)
Benchmark arguments :
-b# : benchmark file(s), using # compression level (default: 3)
-e# : test all compression levels from -bX to # (default: 1)
-i# : minimum evaluation time in seconds (default: 3s)
-B# : cut file into independent blocks of size # (default: no block)
--priority=rt : set process priority to real-time
```
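If someone with dedup-capable storage wanted to experiment, a rough test might look like the sketch below. This is purely illustrative: the paths, compression level, and job size are made up, and --rsyncable is the option the -B block-size note in the help above refers to. The idea is to take two existing zstd captures of very similar machines, re-compress them in a dedup-friendlier way, and see whether the storage layer finds shared blocks between the two files.

```bash
# Illustrative only: re-compress two existing zstd-captured images with
# --rsyncable and a fixed job size, then see what the dedup layer reports.
zstdmt -d -c /images/hostA/d1p2.img | zstdmt -3 --rsyncable -B1M -o /deduptest/hostA.img
zstdmt -d -c /images/hostB/d1p2.img | zstdmt -3 --rsyncable -B1M -o /deduptest/hostB.img

# Then check whatever dedup ratio the storage reports (on ZFS, for example,
# the DEDUP column of `zpool list`).
```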