linux/drivers/block
gao xu f0f6f78714 zram: optimize LZ4 dictionary compression performance
Calling `LZ4_loadDict()` repeatedly in zram causes significant overhead
due to its internal dictionary pre-processing.  This commit introduces a
template stream mechanism to pre-process the dictionary only once when the
dictionary is initially set or modified.  It then efficiently copies this
state for subsequent compressions.
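
The "pre-process once, then copy" idea described above can be sketched in
plain C. This is only an illustration under stated assumptions: `toy_stream`,
`toy_load_dict()`, `toy_stream_from_template()` and `toy_find()` are
hypothetical names invented here, and the hash table is a toy stand-in rather
than LZ4's real stream layout or the code in this commit.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define TOY_HASH_BITS 8
#define TOY_HASH_SIZE (1u << TOY_HASH_BITS)

/* Hypothetical stand-in for a compressor stream state (not LZ4's layout). */
struct toy_stream {
	const uint8_t *dict;           /* dictionary bytes, borrowed not copied */
	size_t dict_len;
	uint32_t table[TOY_HASH_SIZE]; /* position + 1 per bucket, 0 means empty */
};

/* Counts how often the expensive pre-processing step has run. */
static unsigned long toy_load_dict_calls;

static uint32_t toy_hash(const uint8_t *p)
{
	uint32_t v;

	memcpy(&v, p, sizeof(v));
	return (v * 2654435761u) >> (32 - TOY_HASH_BITS);
}

/* Expensive step: walks the whole dictionary, so cost grows with len. */
static void toy_load_dict(struct toy_stream *s, const uint8_t *dict, size_t len)
{
	memset(s, 0, sizeof(*s));
	s->dict = dict;
	s->dict_len = len;
	for (size_t i = 0; i + 4 <= len; i++)
		s->table[toy_hash(dict + i)] = (uint32_t)(i + 1);
	toy_load_dict_calls++;
}

/* Cheap step: one fixed-size copy, independent of the dictionary length. */
static void toy_stream_from_template(struct toy_stream *dst,
				     const struct toy_stream *tmpl)
{
	memcpy(dst, tmpl, sizeof(*dst));
}

/* Returns the dictionary offset of a 4-byte sequence, or -1 if not indexed. */
static long toy_find(const struct toy_stream *s, const uint8_t *seq)
{
	uint32_t slot = s->table[toy_hash(seq)];

	if (slot == 0 || memcmp(s->dict + slot - 1, seq, 4) != 0)
		return -1;
	return (long)(slot - 1);
}
```

In this sketch, `toy_load_dict()` would run only when the dictionary is set or
changed, while each compression would start from
`toy_stream_from_template()`, so the per-compression setup cost no longer
depends on the dictionary size.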

Verification Test Items:
Test Platform: android16-6.12
1. Collect Anonymous Page Dataset
1) Apply the following patch:
static bool zram_meta_alloc(struct zram *zram, u64 disksize)
	if (!huge_class_size)
-		huge_class_size = zs_huge_class_size(zram->mem_pool);
+		huge_class_size = 0;

2) Install multiple apps and run monkey testing until SwapFree is close to 0.

3) Execute the following command to export data:
dd if=/dev/block/zram0 of=/data/samples/zram_dump.img bs=4K

2. Train Dictionary
Since LZ4 does not have a dedicated dictionary training tool, the zstd
tool can be used for training [1]. The command is as follows:
zstd --train /data/samples/* --split=4096 --maxdict=64KB -o /vendor/etc/dict_data

3. Test Code
adb shell "dd if=/data/samples/zram_dump.img of=/dev/test_pattern bs=4096 count=131072 conv=fsync"
adb shell "swapoff /dev/block/zram0"
adb shell "echo 1 > /sys/block/zram0/reset"
adb shell "echo lz4 > /sys/block/zram0/comp_algorithm"
adb shell "echo dict=/vendor/etc/dict_data   >  /sys/block/zram0/algorithm_params"
adb shell "echo 6G > /sys/block/zram0/disksize"
echo "Start Compression"
adb shell "taskset 80 dd if=/dev/test_pattern of=/dev/block/zram0 bs=4096 count=131072 conv=fsync"
echo.
echo "Start Decompression"
adb shell "taskset 80 dd if=/dev/block/zram0 of=/dev/output_result bs=4096 count=131072 conv=fsync"
echo "mm_stat:"
adb shell "cat /sys/block/zram0/mm_stat"
echo.
Note: To ensure stable test results, it is best to lock the CPU frequency
before executing the test.

LZ4 supports dictionaries up to 64KB. Below are the test results for
compression speed at various dictionary sizes:
dict_size       base        patch
  4 KB          156M/s      219M/s
  8 KB          136M/s      217M/s
 16 KB           98M/s      214M/s
 32 KB           66M/s      225M/s
 64 KB           38M/s      224M/s

When an LZ4 compression dictionary is enabled, compression speed is
negatively impacted by the dictionary's size; larger dictionaries result
in slower compression.  This patch eliminates the influence of dictionary
size on compression speed, ensuring consistent performance regardless of
dictionary scale.
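
As a worked check of the table above, the per-size speedup can be computed
directly from the reported figures. The sketch below only restates the
numbers from this commit message; the `result` struct and the `speedup()` and
`print_speedups()` helpers are hypothetical names introduced for illustration.

```c
#include <stddef.h>
#include <stdio.h>

/* One row of the results table; speeds use the table's own M/s units. */
struct result {
	int dict_kb;
	double base;
	double patch;
};

static const struct result results[] = {
	{  4, 156.0, 219.0 },
	{  8, 136.0, 217.0 },
	{ 16,  98.0, 214.0 },
	{ 32,  66.0, 225.0 },
	{ 64,  38.0, 224.0 },
};

/* Ratio of patched speed to baseline speed for one dictionary size. */
static double speedup(const struct result *r)
{
	return r->patch / r->base;
}

/* Prints one "size: ratio" line per table row. */
static void print_speedups(void)
{
	for (size_t i = 0; i < sizeof(results) / sizeof(results[0]); i++)
		printf("%2d KB: %.2fx\n", results[i].dict_kb,
		       speedup(&results[i]));
}
```

Calling `print_speedups()` gives ratios from about 1.40x at 4 KB to about
5.89x at 64 KB, while the patched speeds themselves stay within the
214M/s to 225M/s range.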

Link: https://lkml.kernel.org/r/698181478c9c4b10aa21b4a847bdc706@honor.com
Link: https://github.com/lz4/lz4?tab=readme-ov-file [1]
Signed-off-by: gao xu <gaoxu2@honor.com>
Acked-by: Sergey Senozhatsky <senozhatsky@chromium.org>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Suren Baghdasaryan <surenb@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2026-04-05 13:53:30 -07:00
aoe Convert 'alloc_obj' family to use the new default GFP_KERNEL argument 2026-02-21 17:09:51 -08:00
drbd block-7.0-20260227 2026-02-27 10:42:02 -08:00
mtip32xx block: switch ->getgeo() to struct gendisk 2025-08-13 02:59:29 -04:00
null_blk Convert 'alloc_obj' family to use the new default GFP_KERNEL argument 2026-02-21 17:09:51 -08:00
rnbd Convert remaining multi-line kmalloc_obj/flex GFP_KERNEL uses 2026-02-22 08:26:33 -08:00
rnull configfs changes for v7.0 2026-02-12 14:01:38 -08:00
xen-blkback Convert more 'alloc_obj' cases to default GFP_KERNEL arguments 2026-02-21 20:03:00 -08:00
zram zram: optimize LZ4 dictionary compression performance 2026-04-05 13:53:30 -07:00
amiflop.c block: switch ->getgeo() to struct gendisk 2025-08-13 02:59:29 -04:00
ataflop.c treewide: Switch/rename to timer_delete[_sync]() 2025-04-05 10:30:12 +02:00
brd.c Convert 'alloc_obj' family to use the new default GFP_KERNEL argument 2026-02-21 17:09:51 -08:00
floppy.c array_size.h: add ARRAY_END() 2026-01-20 19:44:19 -08:00
Kconfig rbd: stop selecting CRC32, CRYPTO, and CRYPTO_AES 2025-12-10 11:50:54 +01:00
loop.c Convert 'alloc_obj' family to use the new default GFP_KERNEL argument 2026-02-21 17:09:51 -08:00
Makefile rnull: move driver to separate directory 2025-09-02 05:23:56 -06:00
n64cart.c block: move the nonrot flag to queue_limits 2024-06-19 07:58:28 -06:00
nbd.c Convert 'alloc_obj' family to use the new default GFP_KERNEL argument 2026-02-21 17:09:51 -08:00
ps3disk.c Convert 'alloc_obj' family to use the new default GFP_KERNEL argument 2026-02-21 17:09:51 -08:00
ps3vram.c Convert more 'alloc_obj' cases to default GFP_KERNEL arguments 2026-02-21 20:03:00 -08:00
rbd_types.h
rbd.c Convert 'alloc_obj' family to use the new default GFP_KERNEL argument 2026-02-21 17:09:51 -08:00
sunvdc.c Convert 'alloc_obj' family to use the new default GFP_KERNEL argument 2026-02-21 17:09:51 -08:00
swim_asm.S
swim.c Convert 'alloc_obj' family to use the new default GFP_KERNEL argument 2026-02-21 17:09:51 -08:00
swim3.c treewide, timers: Rename from_timer() to timer_container_of() 2025-06-08 09:07:37 +02:00
ublk_drv.c block-7.0-20260312 2026-03-13 10:13:06 -07:00
virtio_blk.c Convert 'alloc_obj' family to use the new default GFP_KERNEL argument 2026-02-21 17:09:51 -08:00
xen-blkfront.c Convert remaining multi-line kmalloc_obj/flex GFP_KERNEL uses 2026-02-22 08:26:33 -08:00
z2ram.c Convert more 'alloc_obj' cases to default GFP_KERNEL arguments 2026-02-21 20:03:00 -08:00
zloop.c block-7.0-20260227 2026-02-27 10:42:02 -08:00