sst-linux/lib
Kairui Song fa47f96ee7 mm/filemap: optimize filemap folio adding
commit 6758c1128ceb45d1a35298912b974eb4895b7dd9 upstream.

Instead of doing multiple tree walks, do one optimism range check with
lock hold, and exit if raced with another insertion.  If a shadow exists,
check it with a new xas_get_order helper before releasing the lock to
avoid redundant tree walks for getting its order.

Drop the lock and do the allocation only if a split is needed.

In the best case, it only need to walk the tree once.  If it needs to
alloc and split, 3 walks are issued (One for first ranged conflict check
and order retrieving, one for the second check after allocation, one for
the insert after split).

Testing with 4K pages, in an 8G cgroup, with 16G brd as block device:

  echo 3 > /proc/sys/vm/drop_caches

  fio -name=cached --numjobs=16 --filename=/mnt/test.img \
    --buffered=1 --ioengine=mmap --rw=randread --time_based \
    --ramp_time=30s --runtime=5m --group_reporting

Before:
bw (  MiB/s): min= 1027, max= 3520, per=100.00%, avg=2445.02, stdev=18.90, samples=8691
iops        : min=263001, max=901288, avg=625924.36, stdev=4837.28, samples=8691

After (+7.3%):
bw (  MiB/s): min=  493, max= 3947, per=100.00%, avg=2625.56, stdev=25.74, samples=8651
iops        : min=126454, max=1010681, avg=672142.61, stdev=6590.48, samples=8651

Test result with THP (do a THP randread then switch to 4K page in hope it
issues a lot of splitting):

  echo 3 > /proc/sys/vm/drop_caches

  fio -name=cached --numjobs=16 --filename=/mnt/test.img \
      --buffered=1 --ioengine=mmap -thp=1 --readonly \
      --rw=randread --time_based --ramp_time=30s --runtime=10m \
      --group_reporting

  fio -name=cached --numjobs=16 --filename=/mnt/test.img \
      --buffered=1 --ioengine=mmap \
      --rw=randread --time_based --runtime=5s --group_reporting

Before:
bw (  KiB/s): min= 4141, max=14202, per=100.00%, avg=7935.51, stdev=96.85, samples=18976
iops        : min= 1029, max= 3548, avg=1979.52, stdev=24.23, samples=18976·

READ: bw=4545B/s (4545B/s), 4545B/s-4545B/s (4545B/s-4545B/s), io=64.0KiB (65.5kB), run=14419-14419msec

After (+12.5%):
bw (  KiB/s): min= 4611, max=15370, per=100.00%, avg=8928.74, stdev=105.17, samples=19146
iops        : min= 1151, max= 3842, avg=2231.27, stdev=26.29, samples=19146

READ: bw=4635B/s (4635B/s), 4635B/s-4635B/s (4635B/s-4635B/s), io=64.0KiB (65.5kB), run=14137-14137msec

The performance is better for both 4K (+7.5%) and THP (+12.5%) cached read.

Link: https://lkml.kernel.org/r/20240415171857.19244-5-ryncsn@gmail.com
Signed-off-by: Kairui Song <kasong@tencent.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Closes: https://lore.kernel.org/linux-mm/A5A976CB-DB57-4513-A700-656580488AB6@flyingcircus.io/
[ kasong@tencent.com: minor adjustment of variable declarations ]
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-10-17 15:21:26 +02:00
..
842
crypto
dim linux/dim: Do nothing if no time delta between samples 2023-05-24 17:32:31 +01:00
fonts lib/fonts: fix undefined behavior in bit shift for get_default_font 2022-12-31 13:31:56 +01:00
kunit kunit: Fix timeout message 2024-07-11 12:47:08 +02:00
livepatch
lz4
lzo
math bitmap: introduce generic optimized bitmap_size() 2024-08-29 17:30:14 +02:00
mpi crypto: lib/mpi - Fix unexpected pointer access in mpi_ec_init 2024-02-23 09:12:49 +01:00
pldmfw
raid6
reed_solomon treewide: use get_random_u32() when possible 2022-10-11 17:42:58 -06:00
test_fortify
vdso lib/vdso: use "grep -E" instead of "egrep" 2022-11-23 19:50:15 +01:00
xz xz: cleanup CRC32 edits from 2018 2024-10-17 15:20:58 +02:00
zlib_deflate
zlib_dfltcc
zlib_inflate
zstd zstd: Fix array-index-out-of-bounds UBSAN warning 2023-12-13 18:39:04 +01:00
.gitignore
argv_split.c
ashldi3.c
ashrdi3.c
asn1_decoder.c
asn1_encoder.c
assoc_array.c
atomic64_test.c
atomic64.c
audit.c
base64.c
bcd.c
bch.c
bitfield_kunit.c
bitmap.c lib/bitmap: drop optimization of bitmap_{from,to}_arr64 2023-07-19 16:21:58 +02:00
bitrev.c
bootconfig-data.S
bootconfig.c bootconfig: use memblock_free_late to free xbc memory to buddy 2024-04-27 17:07:17 +02:00
bsearch.c
btree.c
bucket_locks.c
bug.c cpuidle: lib/bug: Disable rcu_is_watching() during WARN/BUG 2023-03-10 09:33:47 +01:00
build_OID_registry
buildid.c
bust_spinlocks.c
check_signature.c
checksum.c
clz_ctz.c lib/clz_ctz.c: Fix __clzdi2() and __ctzdi2() for 32-bit kernels 2023-08-30 16:11:08 +02:00
clz_tab.c
cmdline_kunit.c lib/cmdline: Fix an invalid format specifier in an assertion msg 2024-03-26 18:20:28 -04:00
cmdline.c
cmpdi2.c
compat_audit.c
cpu_rmap.c lib: cpu_rmap: Fix potential use-after-free in irq_cpu_rmap_release() 2023-06-14 11:15:22 +02:00
cpumask_kunit.c lib/test_cpumask: Add for_each_cpu_and(not) tests 2022-10-06 05:57:36 -07:00
cpumask.c lib/find_bit: add find_next{,_and}_bit_wrap 2022-10-01 10:22:57 -07:00
crc4.c
crc7.c
crc8.c
crc16.c
crc32.c
crc32defs.h
crc32test.c
crc64-rocksoft.c
crc64.c
crc-ccitt.c
crc-itu-t.c
crc-t10dif.c
ctype.c
debug_info.c
debug_locks.c
debugobjects.c debugobjects: Fix conditions in fill_pool() 2024-10-17 15:21:22 +02:00
dec_and_lock.c
decompress_bunzip2.c decompress_bunzip2: fix rare decompression failure 2024-08-03 08:49:38 +02:00
decompress_inflate.c
decompress_unlz4.c
decompress_unlzma.c
decompress_unlzo.c
decompress_unxz.c
decompress_unzstd.c
decompress.c
devmem_is_allowed.c
devres.c
digsig.c
dump_stack.c
dynamic_debug.c dyndbg: fix old BUG_ON in >control parser 2024-05-17 11:56:20 +02:00
dynamic_queue_limits.c
earlycpio.c
errname.c parisc: Drop the HP-UX ENOSYM and EREMOTERELEASE error codes 2023-12-08 08:51:15 +01:00
error-inject.c
errseq.c
extable.c
fault-inject-usercopy.c
fault-inject.c mm: fix unexpected changes to {failslab|fail_page_alloc}.attr 2022-11-22 18:50:44 -08:00
fdt_addresses.c
fdt_empty_tree.c
fdt_ro.c
fdt_rw.c
fdt_strerror.c
fdt_sw.c
fdt_wip.c
fdt.c
find_bit_benchmark.c treewide: use prandom_u32_max() when possible, part 1 2022-10-11 17:42:55 -06:00
find_bit.c lib/find_bit: Introduce find_next_andnot_bit() 2022-10-06 05:57:36 -07:00
flex_proportions.c
fortify_kunit.c
gen_crc32table.c
gen_crc64table.c
genalloc.c
generic-radix-tree.c lib/generic-radix-tree.c: Fix rare race in __genradix_ptr_alloc() 2024-09-12 11:10:25 +02:00
glob.c
globtest.c
group_cpus.c genirq/affinity: Only build SMP-only helper functions on SMP kernels 2024-01-10 17:10:36 +01:00
hexdump.c
hweight.c
idr.c ida: Fix crash in ida_free when the bitmap is empty 2024-01-20 11:50:09 +01:00
inflate.c
interval_tree_test.c
interval_tree.c
iomap_copy.c
iomap.c kmsan: add iomap support 2022-10-03 14:03:21 -07:00
iommu-helper.c
iov_iter.c instrumented.h: allow instrumenting both sides of copy_from_user() 2022-10-03 14:03:18 -07:00
irq_poll.c
irq_regs.c
is_signed_type_kunit.c
is_single_threaded.c
kasprintf.c
Kconfig This update includes the following changes: 2022-10-10 13:04:25 -07:00
Kconfig.debug bpf, kconfig: Fix DEBUG_INFO_BTF_MODULES Kconfig definition 2024-05-17 11:55:56 +02:00
Kconfig.kasan kasan: drop CONFIG_KASAN_TAGS_IDENTIFY 2022-10-03 14:02:57 -07:00
Kconfig.kcsan
Kconfig.kfence
Kconfig.kgdb parisc: Convert PDC console to an early console 2022-10-11 12:01:24 +02:00
Kconfig.kmsan kmsan: make sure PREEMPT_RT is off 2022-11-08 15:57:24 -08:00
Kconfig.ubsan
kfifo.c
klist.c
kobject_uevent.c kobject_uevent: Fix OOB access within zap_modalias_env() 2024-08-03 08:49:39 +02:00
kobject.c kobject: Add sanity check for kset->kobj.ktype in kset_register() 2023-09-23 11:11:07 +02:00
kstrtox.c
kstrtox.h
libcrc32c.c
linear_ranges.c
list_debug.c
list_sort.c
list-test.c
llist.c
locking-selftest-hardirq.h
locking-selftest-mutex.h
locking-selftest-rlock-hardirq.h
locking-selftest-rlock-softirq.h
locking-selftest-rlock.h
locking-selftest-rsem.h
locking-selftest-rtmutex.h
locking-selftest-softirq.h
locking-selftest-spin-hardirq.h
locking-selftest-spin-softirq.h
locking-selftest-spin.h
locking-selftest-wlock-hardirq.h
locking-selftest-wlock-softirq.h
locking-selftest-wlock.h
locking-selftest-wsem.h
locking-selftest.c
lockref.c lockref: stop doing cpu_relax in the cmpxchg loop 2023-02-01 08:34:34 +01:00
logic_iomem.c
logic_pio.c
lru_cache.c
lshrdi3.c
Makefile genirq/affinity: Move group_cpus_evenly() into lib/ 2024-01-10 17:10:33 +01:00
maple_tree.c maple_tree: fix mas_empty_area_rev() null pointer dereference 2024-06-16 13:41:31 +02:00
memcat_p.c
memcpy_kunit.c lib: memcpy_kunit: Fix an invalid format specifier in an assertion msg 2024-03-26 18:20:28 -04:00
memory-notifier-error-inject.c
memregion.c
memweight.c
muldi3.c
net_utils.c
netdev-notifier-error-inject.c
nlattr.c netlink: add nla be16/32 types to minlen array 2024-03-06 14:45:06 +00:00
nmi_backtrace.c
notifier-error-inject.c lib/notifier-error-inject: fix error when writing -errno to debugfs file 2022-12-31 13:31:58 +01:00
notifier-error-inject.h
objagg.c mlxsw: spectrum_acl_erp: Fix object nesting warning 2024-08-03 08:49:05 +02:00
of-reconfig-notifier-error-inject.c
oid_registry.c
once.c once: rename _SLOW to _SLEEPABLE 2022-10-03 17:34:32 -07:00
overflow_kunit.c overflow: Refactor test skips for Clang-specific issues 2022-10-25 14:57:42 -07:00
packing.c
parman.c
parser.c
pci_iomap.c pci_iounmap(): Fix MMIO mapping leak 2024-04-03 15:19:25 +02:00
percpu_counter.c
percpu_test.c
percpu-refcount.c
plist.c
pm-notifier-error-inject.c
polynomial.c
radix-tree.c radix tree: remove unused variable 2023-08-30 16:11:08 +02:00
random32.c treewide: use get_random_bytes() when possible 2022-10-11 17:42:58 -06:00
ratelimit.c
rbtree_test.c
rbtree.c
ref_tracker.c
refcount.c
rhashtable.c
sbitmap.c lib/sbitmap: define swap_lock as raw_spinlock_t 2024-10-17 15:21:10 +02:00
scatterlist.c
seq_buf.c
sg_pool.c
sg_split.c
show_mem.c mm: reduce noise in show_mem for lowmem allocations 2022-09-26 19:46:29 -07:00
siphash.c
slub_kunit.c mm/slub, kunit: Use inverted data to corrupt kmem cache 2024-06-12 11:03:04 +02:00
smp_processor_id.c
sort.c
stackdepot.c stackdepot: respect __GFP_NOLOCKDEP allocation flag 2024-05-02 16:29:29 +02:00
stackinit_kunit.c lib: stackinit: update reference to kunit-tool 2022-09-30 13:21:22 -06:00
stmp_device.c
string_helpers.c
string.c kmsan: disable strscpy() optimization under KMSAN 2022-10-03 14:03:22 -07:00
strncpy_from_user.c
strnlen_user.c
syscall.c
test_bitmap.c lib/bitmap: workaround const_eval test build failure 2023-08-11 12:08:10 +02:00
test_bitops.c
test_bits.c
test_blackhole_dev.c net: blackhole_dev: fix build warning for ethh set but not used 2024-03-26 18:20:33 -04:00
test_bpf.c
test_debug_virtual.c
test_dynamic_debug.c
test_firmware.c test_firmware: return ENOMEM instead of ENOSPC on failed memory allocation 2023-08-03 10:24:19 +02:00
test_fprobe.c fprobe: Pass entry_data to handlers 2023-10-25 12:03:12 +02:00
test_fpu.c
test_free_pages.c
test_hash.c
test_hexdump.c treewide: use prandom_u32_max() when possible, part 1 2022-10-11 17:42:55 -06:00
test_hmm_uapi.h hmm-tests: add test for migrate_device_range() 2022-10-12 18:51:50 -07:00
test_hmm.c lib/test_hmm.c: handle src_pfns and dst_pfns allocation failure 2024-06-12 11:03:29 +02:00
test_ida.c ida: Fix crash in ida_free when the bitmap is empty 2024-01-20 11:50:09 +01:00
test_kmod.c
test_kprobes.c treewide: use get_random_u32() when possible 2022-10-11 17:42:58 -06:00
test_linear_ranges.c
test_list_sort.c treewide: use prandom_u32_max() when possible, part 1 2022-10-11 17:42:55 -06:00
test_lockup.c
test_maple_tree.c maple_tree: add GFP_KERNEL to allocations in mas_expected_entries() 2023-11-02 09:35:24 +01:00
test_memcat_p.c
test_meminit.c lib/test_meminit: fix off-by-one error in test_pages() 2023-10-15 18:32:41 +02:00
test_min_heap.c treewide: use get_random_u32() when possible 2022-10-11 17:42:58 -06:00
test_module.c
test_objagg.c treewide: use get_random_bytes() when possible 2022-10-11 17:42:58 -06:00
test_parman.c
test_printf.c
test_ref_tracker.c
test_rhashtable.c rhashtable: make test actually random 2022-10-26 13:39:09 +01:00
test_scanf.c lib: test_scanf: Add explicit type cast to result initialization in test_number_prefix() 2023-09-19 12:28:05 +02:00
test_siphash.c
test_sort.c
test_static_key_base.c
test_static_keys.c
test_string.c
test_strscpy.c
test_sysctl.c
test_ubsan.c
test_user_copy.c
test_uuid.c
test_vmalloc.c treewide: use get_random_{u8,u16}() when possible, part 2 2022-10-11 17:42:58 -06:00
test_xarray.c mm/filemap: optimize filemap folio adding 2024-10-17 15:21:26 +02:00
test-kstrtox.c
test-string_helpers.c treewide: use prandom_u32_max() when possible, part 1 2022-10-11 17:42:55 -06:00
textsearch.c
timerqueue.c
trace_readwrite.c
ts_bm.c lib/ts_bm: reset initial match offset for every block of text 2023-07-19 16:21:13 +02:00
ts_fsm.c
ts_kmp.c
ubsan.c panic: Consolidate open-coded panic_on_warn checks 2023-01-24 07:24:41 +01:00
ubsan.h
ucmpdi2.c
ucs2_string.c
usercopy.c uaccess: Add speculation barrier to copy_from_user() 2023-02-25 11:25:41 +01:00
uuid.c treewide: use get_random_bytes() when possible 2022-10-11 17:42:58 -06:00
vsprintf.c lib/vsprintf: Fix %pfwf when current node refcount == 0 2024-01-01 12:39:07 +00:00
win_minmax.c
xarray.c lib/xarray: introduce a new helper xas_get_order 2024-10-17 15:21:26 +02:00
xxhash.c