linux/drivers
David Hildenbrand 53cdc1cb29 drivers/base/memory.c: indicate all memory blocks as removable
We see multiple issues with the implementation/interface to compute
whether a memory block can be offlined (exposed via
/sys/devices/system/memory/memoryX/removable) and would like to simplify
it (remove the implementation).

1. It runs basically lockless. While this might be good for performance,
   we see possible races with memory offlining that will require at
   least some sort of locking to fix.

2. Nowadays, more false positives are possible. No arch-specific checks
   are performed to validate that memory offlining will not be denied
   right away (and such a check would require locking). For example,
   arm64 won't allow offlining any memory block that was added during
   boot, which implies a very high error rate. Other archs have other
   constraints.

3. The interface is inherently racy. E.g., if a memory block is detected
   to be removable (and was not a false positive at that time), there is
   still no guarantee that offlining will actually succeed. So any
   caller already has to deal with false positives.

4. It is unclear what performance benefit this interface actually
   provides. The introducing commit 5c755e9fd8 ("memory-hotplug: add
   sysfs removable attribute for hotplug memory remove") mentioned

	"A user-level agent must be able to identify which sections
	 of memory are likely to be removable before attempting the
	 potentially expensive operation."

   However, no actual performance comparison was included.
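
Point 3 above means a userspace agent has to treat "removable" as a hint at
best and cope with a failed offline request anyway. A minimal sketch of that
pattern follows, assuming the usual sysfs interface of writing "offline" to
/sys/devices/system/memory/memoryX/state. The function names and the `root`
parameter are hypothetical and exist only for illustration; this is not code
from lsmem, chmem or powerpc-utils.

```c
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/*
 * Try to offline one memory block by writing "offline" to its sysfs
 * state file. Returns 0 on success or a negative errno value.
 * `root` would normally be "/sys/devices/system/memory"; it is a
 * parameter (an assumption of this sketch) so the code can be pointed
 * at a scratch directory.
 */
static int try_offline_block(const char *root, unsigned int block)
{
    char path[256];
    int fd, err = 0;
    int n = snprintf(path, sizeof(path), "%s/memory%u/state", root, block);

    if (n < 0 || (size_t)n >= sizeof(path))
        return -ENAMETOOLONG;

    fd = open(path, O_WRONLY);
    if (fd < 0)
        return -errno;

    /* The kernel may still refuse the request; keep the error code. */
    if (write(fd, "offline", strlen("offline")) < 0)
        err = -errno;

    close(fd);
    return err;
}

/*
 * Walk the candidate blocks and stop at the first one that could be
 * offlined. Returns that block number, or -1 if every attempt failed.
 * No "removable" pre-check is done: a failed attempt is simply skipped.
 */
static int offline_first_possible(const char *root,
                                  const unsigned int *blocks, size_t count)
{
    for (size_t i = 0; i < count; i++)
        if (try_offline_block(root, blocks[i]) == 0)
            return (int)blocks[i];
    return -1;
}
```

A caller would pass "/sys/devices/system/memory" as `root` and needs
sufficient privileges to write the state files. Because every failure is
handled at the point of the offline attempt, a false positive from
"removable" costs one failed write rather than breaking the tool.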

Known users:

 - lsmem: Will group memory blocks based on the "removable" property. [1]

 - chmem: Indirect user. It has a RANGE mode where one can specify
          removable ranges identified via lsmem to be offlined. However,
          it also has a "SIZE" mode, which allows a sysadmin to skip the
          manual "identify removable blocks" step. [2]

 - powerpc-utils: Uses the "removable" attribute to skip some memory
          blocks right away when trying to find some to offline+remove.
          However, with ballooning enabled, it already skips this
          information completely (because it once resulted in many false
          negatives). Therefore, the implementation can deal with false
          positives properly already. [3]

According to Nathan Fontenot, DLPAR on powerpc is nowadays no longer
driven from userspace via the drmgr command (powerpc-utils).  Nowadays
it's managed in the kernel - including onlining/offlining of memory
blocks - triggered by drmgr writing to /sys/kernel/dlpar.  So the
affected legacy userspace handling is only active on old kernels.  Only
very old versions of drmgr on a new kernel (unlikely) might execute
slower - totally acceptable.

With CONFIG_MEMORY_HOTREMOVE, always indicating "removable" should not
break any user space tool.  We implement a very bad heuristic now.
Without CONFIG_MEMORY_HOTREMOVE we cannot offline anything, so report
"not removable" as before.

Original discussion can be found in [4] ("[PATCH RFC v1] mm:
is_mem_section_removable() overhaul").

Other users of is_mem_section_removable() will be removed next, so that
we can remove is_mem_section_removable() completely.
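
The reporting rule described above (always "removable" with
CONFIG_MEMORY_HOTREMOVE, "not removable" without it) can be modelled in
userspace roughly as follows. This is a sketch, not the kernel code: the
macro MEMORY_HOTREMOVE_ENABLED and the function removable_show_model() are
hypothetical stand-ins for the kernel config option and the sysfs show
callback.

```c
#include <stdio.h>

/*
 * Hypothetical stand-in for CONFIG_MEMORY_HOTREMOVE. Build with
 * -DMEMORY_HOTREMOVE_ENABLED=0 to model a kernel without hot-remove.
 */
#ifndef MEMORY_HOTREMOVE_ENABLED
#define MEMORY_HOTREMOVE_ENABLED 1
#endif

/*
 * Model of the "removable" attribute: no per-block inspection at all,
 * the answer depends only on the build-time configuration.
 * Returns the number of characters written (excluding the NUL).
 */
static int removable_show_model(char *buf, size_t len)
{
    return snprintf(buf, len, "%d\n", MEMORY_HOTREMOVE_ENABLED ? 1 : 0);
}
```

Since the result no longer depends on the state of the memory block, there
is nothing left to race against and no locking is needed on this path.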

[1] http://man7.org/linux/man-pages/man1/lsmem.1.html
[2] http://man7.org/linux/man-pages/man8/chmem.8.html
[3] https://github.com/ibm-power-utilities/powerpc-utils
[4] https://lkml.kernel.org/r/20200117105759.27905-1-david@redhat.com

Also, this patch probably fixes a crash reported by Steve.
http://lkml.kernel.org/r/CAPcyv4jpdaNvJ67SkjyUJLBnBnXXQv686BiVW042g03FUmWLXw@mail.gmail.com

Reported-by: "Scargall, Steve" <steve.scargall@intel.com>
Suggested-by: Michal Hocko <mhocko@kernel.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Nathan Fontenot <ndfont@gmail.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: "Rafael J. Wysocki" <rafael@kernel.org>
Cc: Badari Pulavarty <pbadari@us.ibm.com>
Cc: Robert Jennings <rcj@linux.vnet.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Karel Zak <kzak@redhat.com>
Cc: <stable@vger.kernel.org>
Link: http://lkml.kernel.org/r/20200128093542.6908-1-david@redhat.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-03-29 09:47:05 -07:00
accessibility
acpi x86/mm: split vmalloc_sync_all() 2020-03-21 18:56:06 -07:00
amba
android binderfs: use refcount for binder control devices too 2020-03-11 19:33:52 +01:00
ata libata-5.6-2020-02-05 2020-02-06 06:11:50 +00:00
atm atm: nicstar: fix if-statement empty body warning 2020-02-29 21:28:30 -08:00
auxdisplay auxdisplay: charlcd: replace zero-length array with flexible-array member 2020-03-06 22:18:07 +01:00
base drivers/base/memory.c: indicate all memory blocks as removable 2020-03-29 09:47:05 -07:00
bcma Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next 2020-01-28 16:02:33 -08:00
block virtio: fixes 2020-03-09 16:02:32 -07:00
bluetooth Bluetooth: btrtl: Use kvmalloc for FW allocations 2020-01-24 19:57:53 +01:00
bus Merge tag 'omap-for-v5.6/fixes-rc6-signed' of git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap into arm/fixes 2020-03-25 14:27:22 +01:00
cdrom scsi: compat_ioctl: cdrom: Replace .ioctl with .compat_ioctl in four appropriate places 2020-02-24 15:06:07 -05:00
char ipmi_si: Avoid spurious errors for optional IRQs 2020-03-11 21:15:19 -05:00
clk clk: imx: Align imx sc clock parent msg structs to 4 2020-03-25 18:46:05 -07:00
clocksource ARM: SoC: late updates 2020-02-08 14:17:27 -08:00
connector
counter
cpufreq cpufreq: Fix policy initialization for internal governor drivers 2020-02-27 08:57:48 +01:00
cpuidle ARM: SoC-related driver updates 2020-02-08 14:04:19 -08:00
crypto Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next 2020-01-28 16:02:33 -08:00
dax dax: Get rid of fs_dax_get_by_host() helper 2020-01-16 09:52:27 -08:00
dca
devfreq Revert "PM / devfreq: Modify the device name as devfreq(X) for sysfs" 2020-02-24 11:14:29 +09:00
dio
dma dmaengine: ti: k3-udma-glue: Fix an error handling path in 'k3_udma_glue_cfg_rx_flow()' 2020-03-23 11:48:34 +05:30
dma-buf dma-buf: free dmabuf->name in dma_buf_release() 2020-02-27 18:01:58 +05:30
edac EDAC/synopsys: Do not print an error with back-to-back snprintf() calls 2020-02-27 16:44:25 +01:00
eisa
extcon
firewire
firmware Two EFI fixes: 2020-03-15 12:42:03 -07:00
fpga fpga: xilinx-pr-decoupler: Remove clk_get error message for probe defer 2020-01-10 12:51:56 -08:00
fsi fsi: aspeed: add unspecified HAS_IOMEM dependency 2020-02-10 13:45:49 -08:00
gnss
gpio gpiolib: acpi: Add quirk to ignore EC wakeups on HP x2 10 CHT + AXP288 model 2020-03-24 10:06:54 +01:00
gpu Merge tag 'amd-drm-fixes-5.6-2020-03-26' of git://people.freedesktop.org/~agd5f/linux into drm-fixes 2020-03-27 13:03:17 +10:00
greybus
hid Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/hid/hid 2020-03-17 09:38:03 -07:00
hsi
hv - Most of the commits here are work to enable host-initiated hibernation 2020-02-03 14:42:03 +00:00
hwmon hwmon: (adt7462) Fix an error return in ADT7462_REG_VOLT() 2020-03-03 12:42:55 -08:00
hwspinlock hwspinlock: sirf: Use devm_hwspin_lock_register() to register hwlock controller 2020-01-21 16:16:36 -08:00
hwtracing intel_th: pci: Add Elkhart Lake CPU support 2020-03-18 11:32:56 +01:00
i2c i2c: acpi: put device when verifying client fails 2020-03-13 15:15:30 +01:00
i3c i3c: master: dw: reattach device on first available location of address table 2020-01-13 10:00:05 +01:00
ide scsi: compat_ioctl: cdrom: Replace .ioctl with .compat_ioctl in four appropriate places 2020-02-24 15:06:07 -05:00
idle intel_idle: Introduce 'states_off' module parameter 2020-02-03 11:57:18 +01:00
iio First set of IIO fixes in the 5.6 cycle. 2020-03-18 11:20:42 +01:00
infiniband RDMA/mlx5: Block delay drop to unprivileged users 2020-03-25 09:56:30 -03:00
input Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input 2020-03-26 20:49:44 -07:00
interconnect interconnect: Handle memory allocation errors 2020-03-03 08:02:57 +01:00
iommu iommu/vt-d: Populate debugfs if IOMMUs are detected 2020-03-14 20:02:43 +01:00
ipack
irqchip irqchip fixes for 5.6, take #2 2020-03-15 10:53:11 +01:00
isdn proc: convert everything to "struct proc_ops" 2020-02-04 03:05:26 +00:00
leds leds: lm3532: add pointer to documentation and fix typo 2020-01-22 21:08:24 +01:00
lightnvm
macintosh macintosh: windfarm: fix MODINFO regression 2020-03-10 12:30:59 +01:00
mailbox
mcb
md block-5.6-2020-03-07 2020-03-07 14:14:38 -06:00
media media: mc-entity.c: use & to check pad flags, not == 2020-02-24 15:10:04 +01:00
memory mvebu drivers for 5.6 (part 1) 2020-01-16 10:45:44 -08:00
memstick
message Merge ra.kernel.org:/pub/scm/linux/kernel/git/netdev/net 2020-01-19 22:10:04 +01:00
mfd chrome platform changes for 5.6 2020-02-04 07:17:41 +00:00
misc mmc: rtsx_pci: Fix support for speed-modes that relies on tuning 2020-03-18 11:55:02 +01:00
mmc mmc: rtsx_pci: Fix support for speed-modes that relies on tuning 2020-03-18 11:55:02 +01:00
mtd treewide: remove redundant IS_ERR() before error code check 2020-02-04 03:05:27 +00:00
mux
net wireless-drivers fixes for v5.6 2020-03-25 13:12:26 -07:00
nfc NFC: fdp: Fix a signedness bug in fdp_nci_send_patch() 2020-03-23 21:05:13 -07:00
ntb
nubus
nvdimm mm: Cleanup __put_devmap_managed_page() vs ->page_free() 2020-01-31 10:30:37 -08:00
nvme nvmet-tcp: set MSG_MORE only if we actually have more to send 2020-03-21 04:37:53 +09:00
nvmem Merge branch 'i2c/for-5.6' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux 2020-02-07 12:54:13 -08:00
of drivers/of/of_mdio.c:fix of_mdiobus_register() 2020-03-03 19:01:51 -08:00
opp ioremap changes for 5.6 2020-01-27 13:03:00 -08:00
oprofile tracing: Make struct ring_buffer less ambiguous 2020-01-13 13:19:38 -05:00
parisc proc: convert everything to "struct proc_ops" 2020-02-04 03:05:26 +00:00
parport
pci PCI: brcmstb: Fix build on 32bit ARM platforms with older compilers 2020-02-27 08:06:20 -06:00
pcmcia
perf drivers/perf: arm_pmu_acpi: Fix incorrect checking of gicc pointer 2020-03-02 12:07:35 +00:00
phy phy: for 5.6-rc 2020-03-04 13:28:52 +01:00
pinctrl pinctrl: qcom: Assign irq_eoi conditionally 2020-03-09 16:31:34 +01:00
platform platform/chrome: wilco_ec: Include asm/unaligned instead of linux/ path 2020-02-11 09:10:36 +01:00
pnp proc: convert everything to "struct proc_ops" 2020-02-04 03:05:26 +00:00
power ARM: SoC platform updates 2020-02-08 13:55:25 -08:00
powercap Merge back power capping changes for v5.6. 2020-01-13 10:32:19 +01:00
pps
ps3
ptp Merge ra.kernel.org:/pub/scm/linux/kernel/git/netdev/net 2020-01-19 22:10:04 +01:00
pwm pwm: Remove set but not set variable 'pwm' 2020-01-20 15:40:49 +01:00
rapidio
ras
regulator regulator: Fixes for v5.6 2020-03-06 14:48:30 -06:00
remoteproc remoteproc: qcom: q6v5-mss: Improve readability of reset_assert 2020-01-24 09:34:07 -08:00
reset reset: intel: add unspecified HAS_IOMEM dependency 2020-02-10 11:11:55 +01:00
rpmsg rpmsg: add rpmsg support for mt8183 SCP. 2020-01-20 10:29:56 -08:00
rtc rtc: max8907: add missing select REGMAP_IRQ 2020-03-19 09:55:25 -07:00
s390 block-5.6-2020-03-13 2020-03-13 12:45:23 -07:00
sbus
scsi scsi: sd: Fix optimal I/O size for devices that change reported values 2020-03-24 22:53:04 -04:00
sfi
sh
siox siox: Use the correct style for SPDX License Identifier 2020-01-14 21:46:53 +01:00
slimbus slimbus: ngd: add v2.1.0 compatible 2020-03-12 16:51:15 +01:00
soc soc: samsung: chipid: Fix return value on non-Exynos platforms 2020-03-25 14:27:27 +01:00
soundwire soundwire: cadence: fix kernel-doc parameter descriptions 2020-01-16 17:34:38 +05:30
spi spi: Fixes for v5.6 2020-03-06 14:50:16 -06:00
spmi spmi: pmic-arb: Set lockdep class for hierarchical irq domains 2020-02-10 13:16:04 +01:00
ssb
staging Staging/IIO fixes for 5.6-rc7 2020-03-20 09:20:38 -07:00
target scsi: Revert "target: iscsi: Wait for all commands to finish before freeing a session" 2020-02-14 17:13:54 -05:00
tc The main MIPS changes for 5.6: 2020-01-31 11:28:31 -08:00
tee Merge tag 'tee-amdtee-fix2-for-5.6' of https://git.linaro.org/people/jens.wiklander/linux-tee into arm/fixes 2020-03-25 14:27:27 +01:00
thermal - Fix a SEVERE docs build failure for cpu idle cooling device (Randy Dunlap) 2020-01-31 14:39:21 -08:00
thunderbolt thunderbolt: Fix error code in tb_port_is_width_supported() 2020-03-04 12:34:17 +03:00
tty tty: fix compat TIOCGSERIAL checking wrong function ptr 2020-03-18 13:15:13 +01:00
uio uio: uio_pdrv_genirq: Do not log an error when deferring probe routine. 2020-01-14 15:27:51 +01:00
usb USB-serial fixes for 5.6-rc7 2020-03-18 10:42:57 +01:00
vfio VFIO updates for v5.6-rc1 2020-02-03 22:22:05 +00:00
vhost vhost: Check docket sk_family instead of call getname 2020-02-22 21:41:42 -08:00
video ARM: SoC fixes 2020-03-08 17:36:22 -07:00
virt
virtio virtio_balloon: Adjust label in virtballoon_probe 2020-03-08 05:35:24 -04:00
visorbus visorbus: fix uninitialized variable access 2020-01-14 15:30:35 +01:00
vlynq
vme Char/Misc driver changes for 5.6-rc1 2020-01-29 10:35:54 -08:00
w1 Char/Misc driver changes for 5.6-rc1 2020-01-29 10:35:54 -08:00
watchdog watchdog: iTCO_wdt: Make ICH_RES_IO_SMI optional 2020-03-10 10:20:37 +01:00
xen xen/xenbus: fix locking 2020-03-05 09:42:23 -06:00
zorro Kbuild updates for v5.6 (2nd) 2020-02-09 16:05:50 -08:00
Kconfig
Makefile