Document parts of the code, especially the apparently
nonsensical parts.
Other:
- change pointer increment constants to sizeof() values
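For illustration only, a minimal userspace sketch of the sizeof() change.
put_u16() is a hypothetical helper, not a function from this patch; it
only shows the increment being derived from the stored type instead of a
literal constant:

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: store a 16-bit value, then advance past it. */
static uint8_t *put_u16(uint8_t *dst, uint16_t v)
{
	memcpy(dst, &v, sizeof(v));

	/* Was "dst + 2"; sizeof() keeps the step tied to the type. */
	return dst + sizeof(v);
}
```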
Signed-off-by: Enzo Matsumiya <ematsumiya@suse.de>
Signed-off-by: Steve French <stfrench@microsoft.com>
This patch implements several micro-optimizations on lz77_compress()
with the goal of reducing the number of instructions per [input]
byte (a.k.a. IPB).
Changes:
- change hashtable to be u32 (instead of u64) -- change the hash
function to reflect that (adds lz77_hash() and lz77_read32() helpers)
- batch-write literals instead of 1 by 1 -- now that we have a well
defined hot path (match finding) and a cold path (encode literals +
match), batch writing makes a significant difference
- implement adaptive skipping of input bytes -- skip input bytes more
aggressively if too few matches are being found
- name some constants for more meaningful context
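A rough userspace sketch of the hashing and adaptive skipping ideas
above. All names (read32(), hash32(), next_step()), the multiplier, the
table size and the skip shift are assumptions made for illustration;
they are not taken from this patch:

```c
#include <stdint.h>
#include <string.h>

#define SKETCH_HASH_LOG		15	/* assumed: 32k-entry u32 table */
#define SKETCH_SKIP_SHIFT	4	/* assumed: widen step every 16 misses */

/* Unaligned-safe 32-bit load. */
static uint32_t read32(const uint8_t *p)
{
	uint32_t v;

	memcpy(&v, p, sizeof(v));
	return v;
}

/* Multiplicative hash of 4 input bytes down to a table index. */
static uint32_t hash32(uint32_t v)
{
	return (v * 2654435761u) >> (32 - SKETCH_HASH_LOG);
}

/* The more consecutive misses, the more input bytes are skipped. */
static unsigned int next_step(unsigned int misses)
{
	return 1 + (misses >> SKETCH_SKIP_SHIFT);
}
```

With these assumed values the step stays at 1 while matches are being
found and grows by 1 for every 16 consecutive misses.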
Signed-off-by: Enzo Matsumiya <ematsumiya@suse.de>
Signed-off-by: Steve French <stfrench@microsoft.com>
Increase max distance (i.e. window size) from 1k to 8k.
This allows better compression and is just as fast.
Other:
- drop LZ77_MATCH_MIN_DIST as it's unused -- main loop
already checks if dist > 0
Signed-off-by: Enzo Matsumiya <ematsumiya@suse.de>
Signed-off-by: Steve French <stfrench@microsoft.com>
- lz77_match_len() increments @cur before checking for equality,
leading to off-by-one match len in some cases.
Fix by moving the pointer increments inside the loop.
Also rename @wnd arg to @match (more accurate name).
- both lz77_match_len() and lz77_compress() checked for
"buf + step < end" when the correct comparison is "<=".
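A minimal userspace sketch of both fixes, assuming a 4-byte comparison
step. match_len() is a hypothetical stand-in, not the kernel's
lz77_match_len(): pointers advance only after equality is confirmed, and
the bound uses "<=" so a chunk ending exactly at @end is still compared:

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

static size_t match_len(const uint8_t *match, const uint8_t *cur,
			const uint8_t *end)
{
	const uint8_t *start = cur;
	const size_t step = sizeof(uint32_t);

	/* "<=": the last full chunk may end exactly at @end. */
	while (cur + step <= end) {
		uint32_t a, b;

		memcpy(&a, cur, step);
		memcpy(&b, match, step);
		if (a != b)
			break;

		/* Increment only once the chunk is known to match. */
		cur += step;
		match += step;
	}

	/* Finish byte by byte (mismatching chunk or short tail). */
	while (cur < end && *cur == *match) {
		cur++;
		match++;
	}

	return (size_t)(cur - start);
}
```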
Signed-off-by: Enzo Matsumiya <ematsumiya@suse.de>
Signed-off-by: Steve French <stfrench@microsoft.com>
@dst buffer is allocated with same size as @src, which, for good
compression cases, works fine.
However, when compression goes bad (e.g. random bytes payloads), the
compressed size can increase significantly. Even when the main loop
stops at 7/8 of @slen, writing the leftover literals could write past
the end of @dst because of LZ77 metadata.
To fix this, add lz77_compressed_alloc_size() helper to compute the
correct allocation size for @dst, accounting for metadata and the worst
case scenario (all literals).
While this is overprovisioning memory, it's not only correct, but also
allows lz77_compress() main loop to run without ever checking @dst
limits (i.e. a perf improvement).
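A rough userspace sketch of the sizing idea, not the actual helper. It
assumes one 32-bit flag word per 32 literals (our reading of the plain
LZ77 format) plus a small slack; compressed_alloc_size() and
SKETCH_SLACK are hypothetical names and values:

```c
#include <stddef.h>
#include <stdint.h>

#define SKETCH_SLACK	8	/* assumed headroom for trailing writes */

static size_t compressed_alloc_size(size_t slen)
{
	/* +1 covers a partially filled final flag word. */
	size_t flag_words = slen / 32 + 1;

	/* Worst case: every input byte is emitted as a literal. */
	return slen + flag_words * sizeof(uint32_t) + SKETCH_SLACK;
}
```

Under these assumptions a 4096-byte input needs 4620 bytes, i.e. about
13% more than @slen.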
Signed-off-by: Enzo Matsumiya <ematsumiya@suse.de>
Signed-off-by: Steve French <stfrench@microsoft.com>
The end-of-stream flag could lead to UB because of int promotion
(overwriting the sign bit).
Fix it by changing the operand from '1' to '1UL'.
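A minimal userspace sketch of the issue. set_flag_bit() is a
hypothetical helper, not code from this patch; it only shows why the
shifted operand must be unsigned when the bit position can reach 31:

```c
#include <stdint.h>

static uint32_t set_flag_bit(uint32_t flags, unsigned int pos)
{
	/*
	 * "1 << 31" shifts a signed int into its sign bit, which is
	 * undefined behavior. "1UL" keeps the shift unsigned.
	 */
	return flags | (uint32_t)(1UL << pos);
}
```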
Signed-off-by: Enzo Matsumiya <ematsumiya@suse.de>
Signed-off-by: Steve French <stfrench@microsoft.com>
asm/unaligned.h is always an include of asm-generic/unaligned.h;
might as well move that thing to linux/unaligned.h and include
that - there's nothing arch-specific in that header.
auto-generated by the following:
for i in `git grep -l -w asm/unaligned.h`; do
sed -i -e "s/asm\/unaligned.h/linux\/unaligned.h/" $i
done
for i in `git grep -l -w asm-generic/unaligned.h`; do
sed -i -e "s/asm-generic\/unaligned.h/linux\/unaligned.h/" $i
done
git mv include/asm-generic/unaligned.h include/linux/unaligned.h
git mv tools/include/asm-generic/unaligned.h tools/include/linux/unaligned.h
sed -i -e "/unaligned.h/d" include/asm-generic/Kbuild
sed -i -e "s/__ASM_GENERIC/__LINUX/" include/linux/unaligned.h tools/include/linux/unaligned.h
- Check data compressibility with some heuristics (copied from
btrfs):
- should_compress() final decision is is_compressible(data)
- Cleanup compress/lz77.h leaving only lz77_compress() exposed:
- Move parts to compress/lz77.c, while removing the rest of it
because they were either unused, used only once, or
implemented wrong (thanks to David Howells for the help)
- Updated the compression parameters (still compatible with
Windows implementation) trading off ~20% compression ratio
for ~40% performance:
- min match len: 3 -> 4
- max distance: 8KiB -> 1KiB
- hash table type: u32 * -> u64 *
Known bugs:
This implementation currently works fine in general, but breaks with
some payloads used during testing. Investigation is ongoing; to be
fixed in a follow-up commit.
Signed-off-by: Enzo Matsumiya <ematsumiya@suse.de>
Co-developed-by: David Howells <dhowells@redhat.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
Move SMB3.1.1 compression code into experimental config option,
and fix the compress mount option. Implement unchained LZ77
"plain" compression algorithm as per MS-XCA specification
section "2.3 Plain LZ77 Compression Algorithm Details".
Signed-off-by: Enzo Matsumiya <ematsumiya@suse.de>
Signed-off-by: Steve French <stfrench@microsoft.com>