linux

mirror of https://github.com/torvalds/linux.git synced 2026-06-11 08:03:05 +02:00

Author	SHA1	Message	Date
Greg Kroah-Hartman	2300418cc6	This is the 5.10.65 stable release -----BEGIN PGP SIGNATURE----- iQIzBAABCAAdFiEEZH8oZUiU471FcZm+ONu9yGCSaT4FAmFBpvcACgkQONu9yGCS aT7JVxAAof4h5rPObKwhFBu4qOHXEtlHrFAF1xTEaQZnIbv9CkEF0LPufWXP+nKS mQDOdDmX3rZhWXZbnNK3ZxBADJXyHS6M0jHByuGrzQ8dmMONJtpjYUjxou6k/xg2 4ECHqzeVbwbWuKrJrAfC1xuZofIHXZBHrkAQmLoMw8ERp309lgPS2cXDOXRzn/n/ ri+5AhTaw1ZG1JXrXvyfoxjbdE/eEeJXx8N/zJf0sas5lYpsqeAWTgXBkNpPeJm7 G66ISwEVp6TPxihpRSKwUhADjuM2+EAok2WXwwTvO0s00vE7LL5ijK27hhP5ual1 +xtxBHag95oIZ+sq1t3z4BgmE1n3z/lHkQki98JQaWShABLGhMdKYPF75hMzR6Pw j0TvLHdkPRSrtUelc7rGtqaT9tF9+RU59I5fPGlBpGckOJ5u2IHCKdjk1WadRgrj JF7R8ApfP18y1X46tDfr/CIPIZfTVNJyd7hZ1zt11wdBYFmaw/oNyg81OalqzaWf ckUIt6AucRQ04uuFfhSaTuxLSEl5Uuh6W30HuO/0N3CoDsfD1RMc+76sXORt/JdK MPxTy124KM6VZADVW4tQXHMoGkLftqTAIgRKt4iRPz80rdhACJFoZJlmVON0MmKV tSODsqGBxIxhkLj197vQzT152G4wBkmzPtqJfJH7lkGKmBpoKZE= =lJCV -----END PGP SIGNATURE----- Merge 5.10.65 into android12-5.10-lts Changes in 5.10.65 locking/mutex: Fix HANDOFF condition regmap: fix the offset of register error log regulator: tps65910: Silence deferred probe error crypto: mxs-dcp - Check for DMA mapping errors sched/deadline: Fix reset_on_fork reporting of DL tasks power: supply: axp288_fuel_gauge: Report register-address on readb / writeb errors crypto: omap-sham - clear dma flags only after omap_sham_update_dma_stop() sched/deadline: Fix missing clock update in migrate_task_rq_dl() rcu/tree: Handle VM stoppage in stall detection EDAC/mce_amd: Do not load edac_mce_amd module on guests posix-cpu-timers: Force next expiration recalc after itimer reset hrtimer: Avoid double reprogramming in __hrtimer_start_range_ns() hrtimer: Ensure timerfd notification for HIGHRES=n udf: Check LVID earlier udf: Fix iocharset=utf8 mount option isofs: joliet: Fix iocharset=utf8 mount option bcache: add proper error unwinding in bcache_device_init blk-throtl: optimize IOPS throttle for large IO scenarios nvme-tcp: don't update queue count when failing to set io queues nvme-rdma: don't update queue count when failing to set io queues nvmet: pass back cntlid on successful completion power: supply: smb347-charger: Add missing pin control activation power: supply: max17042_battery: fix typo in MAx17042_TOFF s390/cio: add dev_busid sysfs entry for each subchannel s390/zcrypt: fix wrong offset index for APKA master key valid state libata: fix ata_host_start() crypto: omap - Fix inconsistent locking of device lists crypto: qat - do not ignore errors from enable_vf2pf_comms() crypto: qat - handle both source of interrupt in VF ISR crypto: qat - fix reuse of completion variable crypto: qat - fix naming for init/shutdown VF to PF notifications crypto: qat - do not export adf_iov_putmsg() fcntl: fix potential deadlock for &fasync_struct.fa_lock udf_get_extendedattr() had no boundary checks. s390/kasan: fix large PMD pages address alignment check s390/pci: fix misleading rc in clp_set_pci_fn() s390/debug: keep debug data on resize s390/debug: fix debug area life cycle s390/ap: fix state machine hang after failure to enable irq power: supply: cw2015: use dev_err_probe to allow deferred probe m68k: emu: Fix invalid free in nfeth_cleanup() sched/numa: Fix is_core_idle() sched: Fix UCLAMP_FLAG_IDLE setting rcu: Fix to include first blocked task in stall warning rcu: Add lockdep_assert_irqs_disabled() to rcu_sched_clock_irq() and callees rcu: Fix stall-warning deadlock due to non-release of rcu_node ->lock m68k: Fix invalid RMW_INSNS on CPUs that lack CAS block: return ELEVATOR_DISCARD_MERGE if possible spi: spi-fsl-dspi: Fix issue with uninitialized dma_slave_config spi: spi-pic32: Fix issue with uninitialized dma_slave_config genirq/timings: Fix error return code in irq_timings_test_irqs() irqchip/loongson-pch-pic: Improve edge triggered interrupt support lib/mpi: use kcalloc in mpi_resize clocksource/drivers/sh_cmt: Fix wrong setting if don't request IRQ for clock source channel block: nbd: add sanity check for first_minor spi: coldfire-qspi: Use clk_disable_unprepare in the remove function irqchip/gic-v3: Fix priority comparison when non-secure priorities are used crypto: qat - use proper type for vf_mask certs: Trigger creation of RSA module signing key if it's not an RSA key tpm: ibmvtpm: Avoid error message when process gets signal while waiting x86/mce: Defer processing of early errors spi: davinci: invoke chipselect callback blk-crypto: fix check for too-large dun_bytes regulator: vctrl: Use locked regulator_get_voltage in probe path regulator: vctrl: Avoid lockdep warning in enable/disable ops spi: sprd: Fix the wrong WDG_LOAD_VAL spi: spi-zynq-qspi: use wait_for_completion_timeout to make zynq_qspi_exec_mem_op not interruptible EDAC/i10nm: Fix NVDIMM detection drm/panfrost: Fix missing clk_disable_unprepare() on error in panfrost_clk_init() drm/gma500: Fix end of loop tests for list_for_each_entry ASoC: mediatek: mt8183: Fix Unbalanced pm_runtime_enable in mt8183_afe_pcm_dev_probe media: TDA1997x: enable EDID support leds: is31fl32xx: Fix missing error code in is31fl32xx_parse_dt() soc: rockchip: ROCKCHIP_GRF should not default to y, unconditionally media: cxd2880-spi: Fix an error handling path drm/of: free the right object bpf: Fix a typo of reuseport map in bpf.h. bpf: Fix potential memleak and UAF in the verifier. drm/of: free the iterator object on failure gve: fix the wrong AdminQ buffer overflow check libbpf: Fix the possible memory leak on error ARM: dts: aspeed-g6: Fix HVI3C function-group in pinctrl dtsi arm64: dts: renesas: r8a77995: draak: Remove bogus adv7511w properties i40e: improve locking of mac_filter_hash soc: qcom: rpmhpd: Use corner in power_off libbpf: Fix removal of inner map in bpf_object__create_map gfs2: Fix memory leak of object lsi on error return path firmware: fix theoretical UAF race with firmware cache and resume driver core: Fix error return code in really_probe() ionic: cleanly release devlink instance media: dvb-usb: fix uninit-value in dvb_usb_adapter_dvb_init media: dvb-usb: fix uninit-value in vp702x_read_mac_addr media: dvb-usb: Fix error handling in dvb_usb_i2c_init media: go7007: fix memory leak in go7007_usb_probe media: go7007: remove redundant initialization media: rockchip/rga: use pm_runtime_resume_and_get() media: rockchip/rga: fix error handling in probe media: coda: fix frame_mem_ctrl for YUV420 and YVU420 formats media: atomisp: fix the uninitialized use and rename "retvalue" Bluetooth: sco: prevent information leak in sco_conn_defer_accept() 6lowpan: iphc: Fix an off-by-one check of array index drm/amdgpu/acp: Make PM domain really work tcp: seq_file: Avoid skipping sk during tcp_seek_last_pos ARM: dts: meson8: Use a higher default GPU clock frequency ARM: dts: meson8b: odroidc1: Fix the pwm regulator supply properties ARM: dts: meson8b: mxq: Fix the pwm regulator supply properties ARM: dts: meson8b: ec100: Fix the pwm regulator supply properties net/mlx5e: Prohibit inner indir TIRs in IPoIB net/mlx5e: Block LRO if firmware asks for tunneled LRO cgroup/cpuset: Fix a partition bug with hotplug drm: mxsfb: Enable recovery on underflow drm: mxsfb: Increase number of outstanding requests on V4 and newer HW drm: mxsfb: Clear FIFO_CLEAR bit net: cipso: fix warnings in netlbl_cipsov4_add_std Bluetooth: mgmt: Fix wrong opcode in the response for add_adv cmd arm64: dts: renesas: rzg2: Convert EtherAVB to explicit delay handling arm64: dts: renesas: hihope-rzg2-ex: Add EtherAVB internal rx delay devlink: Break parameter notification sequence to be before/after unload/load driver net/mlx5: Fix missing return value in mlx5_devlink_eswitch_inline_mode_set() i2c: highlander: add IRQ check leds: lt3593: Put fwnode in any case during ->probe() leds: trigger: audio: Add an activate callback to ensure the initial brightness is set media: em28xx-input: fix refcount bug in em28xx_usb_disconnect media: venus: venc: Fix potential null pointer dereference on pointer fmt PCI: PM: Avoid forcing PCI_D0 for wakeup reasons inconsistently PCI: PM: Enable PME if it can be signaled from D3cold bpf, samples: Add missing mprog-disable to xdp_redirect_cpu's optstring soc: qcom: smsm: Fix missed interrupts if state changes while masked debugfs: Return error during {full/open}_proxy_open() on rmmod Bluetooth: increase BTNAMSIZ to 21 chars to fix potential buffer overflow PM: EM: Increase energy calculation precision selftests/bpf: Fix bpf-iter-tcp4 test to print correctly the dest IP drm/msm/mdp4: refactor HW revision detection into read_mdp_hw_revision drm/msm/mdp4: move HW revision detection to earlier phase drm/msm/dpu: make dpu_hw_ctl_clear_all_blendstages clear necessary LMs arm64: dts: exynos: correct GIC CPU interfaces address range on Exynos7 counter: 104-quad-8: Return error when invalid mode during ceiling_write cgroup/cpuset: Miscellaneous code cleanup cgroup/cpuset: Fix violation of cpuset locking rule ASoC: Intel: Fix platform ID matching Bluetooth: fix repeated calls to sco_sock_kill drm/msm/dsi: Fix some reference counted resource leaks net/mlx5: Register to devlink ingress VLAN filter trap net/mlx5: Fix unpublish devlink parameters ASoC: rt5682: Implement remove callback ASoC: rt5682: Properly turn off regulators if wrong device ID usb: dwc3: meson-g12a: add IRQ check usb: dwc3: qcom: add IRQ check usb: gadget: udc: at91: add IRQ check usb: gadget: udc: s3c2410: add IRQ check usb: phy: fsl-usb: add IRQ check usb: phy: twl6030: add IRQ checks usb: gadget: udc: renesas_usb3: Fix soc_device_match() abuse selftests/bpf: Fix test_core_autosize on big-endian machines devlink: Clear whole devlink_flash_notify struct samples: pktgen: add missing IPv6 option to pktgen scripts Bluetooth: Move shutdown callback before flushing tx and rx queue PM: cpu: Make notifier chain use a raw_spinlock_t usb: host: ohci-tmio: add IRQ check usb: phy: tahvo: add IRQ check libbpf: Re-build libbpf.so when libbpf.map changes mac80211: Fix insufficient headroom issue for AMSDU locking/lockdep: Mark local_lock_t locking/local_lock: Add missing owner initialization lockd: Fix invalid lockowner cast after vfs_test_lock nfsd4: Fix forced-expiry locking arm64: dts: marvell: armada-37xx: Extend PCIe MEM space clk: staging: correct reference to config IOMEM to config HAS_IOMEM i2c: synquacer: fix deferred probing firmware: raspberrypi: Keep count of all consumers firmware: raspberrypi: Fix a leak in 'rpi_firmware_get()' usb: gadget: mv_u3d: request_irq() after initializing UDC mm/swap: consider max pages in iomap_swapfile_add_extent lkdtm: replace SCSI_DISPATCH_CMD with SCSI_QUEUE_RQ Bluetooth: add timeout sanity check to hci_inquiry i2c: iop3xx: fix deferred probing i2c: s3c2410: fix IRQ check i2c: fix platform_get_irq.cocci warnings i2c: hix5hd2: fix IRQ check gfs2: init system threads before freeze lock rsi: fix error code in rsi_load_9116_firmware() rsi: fix an error code in rsi_probe() ASoC: Intel: kbl_da7219_max98927: Fix format selection for max98373 ASoC: Intel: Skylake: Leave data as is when invoking TLV IPCs ASoC: Intel: Skylake: Fix module resource and format selection mmc: sdhci: Fix issue with uninitialized dma_slave_config mmc: dw_mmc: Fix issue with uninitialized dma_slave_config mmc: moxart: Fix issue with uninitialized dma_slave_config bpf: Fix possible out of bound write in narrow load handling CIFS: Fix a potencially linear read overflow i2c: mt65xx: fix IRQ check i2c: xlp9xx: fix main IRQ check usb: ehci-orion: Handle errors of clk_prepare_enable() in probe usb: bdc: Fix an error handling path in 'bdc_probe()' when no suitable DMA config is available usb: bdc: Fix a resource leak in the error handling path of 'bdc_probe()' tty: serial: fsl_lpuart: fix the wrong mapbase value ASoC: wcd9335: Fix a double irq free in the remove function ASoC: wcd9335: Fix a memory leak in the error handling path of the probe function ASoC: wcd9335: Disable irq on slave ports in the remove function iwlwifi: follow the new inclusive terminology iwlwifi: skip first element in the WTAS ACPI table ice: Only lock to update netdev dev_addr ath6kl: wmi: fix an error code in ath6kl_wmi_sync_point() atlantic: Fix driver resume flow. bcma: Fix memory leak for internally-handled cores brcmfmac: pcie: fix oops on failure to resume and reprobe ipv6: make exception cache less predictible ipv4: make exception cache less predictible net: sched: Fix qdisc_rate_table refcount leak when get tcf_block failed net: qualcomm: fix QCA7000 checksum handling octeontx2-af: Fix loop in free and unmap counter octeontx2-af: Fix static code analyzer reported issues octeontx2-af: Set proper errorcode for IPv4 checksum errors ipv4: fix endianness issue in inet_rtm_getroute_build_skb() ASoC: rt5682: Remove unused variable in rt5682_i2c_remove() iwlwifi Add support for ax201 in Samsung Galaxy Book Flex2 Alpha f2fs: guarantee to write dirty data when enabling checkpoint back time: Handle negative seconds correctly in timespec64_to_ns() io_uring: IORING_OP_WRITE needs hash_reg_file set bio: fix page leak bio_add_hw_page failure tty: Fix data race between tiocsti() and flush_to_ldisc() perf/x86/amd/ibs: Extend PERF_PMU_CAP_NO_EXCLUDE to IBS Op x86/resctrl: Fix a maybe-uninitialized build warning treated as error Revert "KVM: x86: mmu: Add guest physical address check in translate_gpa()" KVM: s390: index kvm->arch.idle_mask by vcpu_idx KVM: x86: Update vCPU's hv_clock before back to guest when tsc_offset is adjusted KVM: VMX: avoid running vmx_handle_exit_irqoff in case of emulation KVM: nVMX: Unconditionally clear nested.pi_pending on nested VM-Enter ARM: dts: at91: add pinctrl-{names, 0} for all gpios fuse: truncate pagecache on atomic_o_trunc fuse: flush extending writes IMA: remove -Wmissing-prototypes warning IMA: remove the dependency on CRYPTO_MD5 fbmem: don't allow too huge resolutions backlight: pwm_bl: Improve bootloader/kernel device handover clk: kirkwood: Fix a clocking boot regression Linux 5.10.65 Signed-off-by: Greg Kroah-Hartman <gregkh@google.com> Change-Id: Ie0b9306ba6ee4193de3200df7cdacaeba152b83e	2021-09-15 14:16:47 +02:00
Peter Zijlstra	97bc540bfb	locking/mutex: Fix HANDOFF condition [ Upstream commit `048661a1f9` ] Yanfei reported that setting HANDOFF should not depend on recomputing @first, only on @first state. Which would then give: if (ww_ctx \|\| !first) first = __mutex_waiter_is_first(lock, &waiter); if (first) __mutex_set_flag(lock, MUTEX_FLAG_HANDOFF); But because 'ww_ctx \|\| !first' is basically 'always' and the test for first is relatively cheap, omit that first branch entirely. Reported-by: Yanfei Xu <yanfei.xu@windriver.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Waiman Long <longman@redhat.com> Reviewed-by: Yanfei Xu <yanfei.xu@windriver.com> Link: https://lore.kernel.org/r/20210630154114.896786297@infradead.org Signed-off-by: Sasha Levin <sashal@kernel.org>	2021-09-15 09:50:23 +02:00
Greg Kroah-Hartman	4968ab31d1	Merge 5.10.40 into android12-5.10 Changes in 5.10.40 firmware: arm_scpi: Prevent the ternary sign expansion bug openrisc: Fix a memory leak tee: amdtee: unload TA only when its refcount becomes 0 RDMA/siw: Properly check send and receive CQ pointers RDMA/siw: Release xarray entry RDMA/core: Prevent divide-by-zero error triggered by the user RDMA/rxe: Clear all QP fields if creation failed scsi: ufs: core: Increase the usable queue depth scsi: qedf: Add pointer checks in qedf_update_link_speed() scsi: qla2xxx: Fix error return code in qla82xx_write_flash_dword() RDMA/mlx5: Recover from fatal event in dual port mode RDMA/core: Don't access cm_id after its destruction nvmet: remove unused ctrl->cqs nvmet: fix memory leak in nvmet_alloc_ctrl() nvme-loop: fix memory leak in nvme_loop_create_ctrl() nvme-tcp: rerun io_work if req_list is not empty nvme-fc: clear q_live at beginning of association teardown platform/mellanox: mlxbf-tmfifo: Fix a memory barrier issue platform/x86: intel_int0002_vgpio: Only call enable_irq_wake() when using s2idle platform/x86: dell-smbios-wmi: Fix oops on rmmod dell_smbios RDMA/mlx5: Fix query DCT via DEVX RDMA/uverbs: Fix a NULL vs IS_ERR() bug tools/testing/selftests/exec: fix link error powerpc/pseries: Fix hcall tracing recursion in pv queued spinlocks ptrace: make ptrace() fail if the tracee changed its pid unexpectedly nvmet: seset ns->file when open fails perf/x86: Avoid touching LBR_TOS MSR for Arch LBR locking/lockdep: Correct calling tracepoints locking/mutex: clear MUTEX_FLAGS if wait_list is empty due to signal powerpc: Fix early setup to make early_ioremap() work btrfs: avoid RCU stalls while running delayed iputs cifs: fix memory leak in smb2_copychunk_range misc: eeprom: at24: check suspend status before disable regulator ALSA: dice: fix stream format for TC Electronic Konnekt Live at high sampling transfer frequency ALSA: intel8x0: Don't update period unless prepared ALSA: firewire-lib: fix amdtp_packet tracepoints event for packet_index field ALSA: line6: Fix racy initialization of LINE6 MIDI ALSA: dice: fix stream format at middle sampling rate for Alesis iO 26 ALSA: firewire-lib: fix calculation for size of IR context payload ALSA: usb-audio: Validate MS endpoint descriptors ALSA: bebob/oxfw: fix Kconfig entry for Mackie d.2 Pro ALSA: hda: fixup headset for ASUS GU502 laptop Revert "ALSA: sb8: add a check for request_region" ALSA: firewire-lib: fix check for the size of isochronous packet payload ALSA: hda/realtek: reset eapd coeff to default value for alc287 ALSA: hda/realtek: Add some CLOVE SSIDs of ALC293 ALSA: hda/realtek: Fix silent headphone output on ASUS UX430UA ALSA: hda/realtek: Add fixup for HP OMEN laptop ALSA: hda/realtek: Add fixup for HP Spectre x360 15-df0xxx uio_hv_generic: Fix a memory leak in error handling paths Revert "rapidio: fix a NULL pointer dereference when create_workqueue() fails" rapidio: handle create_workqueue() failure Revert "serial: mvebu-uart: Fix to avoid a potential NULL pointer dereference" nvme-tcp: fix possible use-after-completion x86/sev-es: Move sev_es_put_ghcb() in prep for follow on patch x86/sev-es: Invalidate the GHCB after completing VMGEXIT x86/sev-es: Don't return NULL from sev_es_get_ghcb() x86/sev-es: Use __put_user()/__get_user() for data accesses x86/sev-es: Forward page-faults which happen during emulation drm/amdgpu: Fix GPU TLB update error when PAGE_SIZE > AMDGPU_PAGE_SIZE drm/amdgpu: disable 3DCGCG on picasso/raven1 to avoid compute hang drm/amdgpu: update gc golden setting for Navi12 drm/amdgpu: update sdma golden setting for Navi12 powerpc/64s/syscall: Use pt_regs.trap to distinguish syscall ABI difference between sc and scv syscalls powerpc/64s/syscall: Fix ptrace syscall info with scv syscalls mmc: sdhci-pci-gli: increase 1.8V regulator wait xen-pciback: redo VF placement in the virtual topology xen-pciback: reconfigure also from backend watch handler ipc/mqueue, msg, sem: avoid relying on a stack reference past its expiry dm snapshot: fix crash with transient storage and zero chunk size kcsan: Fix debugfs initcall return type Revert "video: hgafb: fix potential NULL pointer dereference" Revert "net: stmicro: fix a missing check of clk_prepare" Revert "leds: lp5523: fix a missing check of return value of lp55xx_read" Revert "hwmon: (lm80) fix a missing check of bus read in lm80 probe" Revert "video: imsttfb: fix potential NULL pointer dereferences" Revert "ecryptfs: replace BUG_ON with error handling code" Revert "scsi: ufs: fix a missing check of devm_reset_control_get" Revert "gdrom: fix a memory leak bug" cdrom: gdrom: deallocate struct gdrom_unit fields in remove_gdrom cdrom: gdrom: initialize global variable at init time Revert "media: rcar_drif: fix a memory disclosure" Revert "rtlwifi: fix a potential NULL pointer dereference" Revert "qlcnic: Avoid potential NULL pointer dereference" Revert "niu: fix missing checks of niu_pci_eeprom_read" ethernet: sun: niu: fix missing checks of niu_pci_eeprom_read() net: stmicro: handle clk_prepare() failure during init scsi: ufs: handle cleanup correctly on devm_reset_control_get error net: rtlwifi: properly check for alloc_workqueue() failure ics932s401: fix broken handling of errors when word reading fails leds: lp5523: check return value of lp5xx_read and jump to cleanup code qlcnic: Add null check after calling netdev_alloc_skb video: hgafb: fix potential NULL pointer dereference vgacon: Record video mode changes with VT_RESIZEX vt_ioctl: Revert VT_RESIZEX parameter handling removal vt: Fix character height handling with VT_RESIZEX tty: vt: always invoke vc->vc_sw->con_resize callback drm/i915/gt: Disable HiZ Raw Stall Optimization on broken gen7 openrisc: mm/init.c: remove unused memblock_region variable in map_ram() x86/Xen: swap NX determination and GDT setup on BSP nvme-multipath: fix double initialization of ANA state rtc: pcf85063: fallback to parent of_node x86/boot/compressed/64: Check SEV encryption in the 32-bit boot-path nvmet: use new ana_log_size instead the old one video: hgafb: correctly handle card detect failure during probe Bluetooth: SMP: Fail if remote and local public keys are identical Linux 5.10.40 Signed-off-by: Greg Kroah-Hartman <gregkh@google.com> Change-Id: I4523cf43d1da6bea507e4027bd83bc491a574f41	2021-05-27 08:36:46 +02:00
Zqiang	e354e3744b	locking/mutex: clear MUTEX_FLAGS if wait_list is empty due to signal [ Upstream commit `3a010c4932` ] When a interruptible mutex locker is interrupted by a signal without acquiring this lock and removed from the wait queue. if the mutex isn't contended enough to have a waiter put into the wait queue again, the setting of the WAITER bit will force mutex locker to go into the slowpath to acquire the lock every time, so if the wait queue is empty, the WAITER bit need to be clear. Fixes: `040a0a3710` ("mutex: Add support for wound/wait style locks") Suggested-by: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Zqiang <qiang.zhang@windriver.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20210517034005.30828-1-qiang.zhang@windriver.com Signed-off-by: Sasha Levin <sashal@kernel.org>	2021-05-26 12:06:50 +02:00
Greg Kroah-Hartman	e92949726c	Merge 5.10.28 into android12-5.10 Changes in 5.10.28 arm64: mm: correct the inside linear map range during hotplug check bpf: Fix fexit trampoline. virtiofs: Fail dax mount if device does not support it ext4: shrink race window in ext4_should_retry_alloc() ext4: fix bh ref count on error paths fs: nfsd: fix kconfig dependency warning for NFSD_V4 rpc: fix NULL dereference on kmalloc failure iomap: Fix negative assignment to unsigned sis->pages in iomap_swapfile_activate ASoC: rt1015: fix i2c communication error ASoC: rt5640: Fix dac- and adc- vol-tlv values being off by a factor of 10 ASoC: rt5651: Fix dac- and adc- vol-tlv values being off by a factor of 10 ASoC: sgtl5000: set DAP_AVC_CTRL register to correct default value on probe ASoC: es8316: Simplify adc_pga_gain_tlv table ASoC: soc-core: Prevent warning if no DMI table is present ASoC: cs42l42: Fix Bitclock polarity inversion ASoC: cs42l42: Fix channel width support ASoC: cs42l42: Fix mixer volume control ASoC: cs42l42: Always wait at least 3ms after reset NFSD: fix error handling in NFSv4.0 callbacks kernel: freezer should treat PF_IO_WORKER like PF_KTHREAD for freezing vhost: Fix vhost_vq_reset() io_uring: fix ->flags races by linked timeouts scsi: st: Fix a use after free in st_open() scsi: qla2xxx: Fix broken #endif placement staging: comedi: cb_pcidas: fix request_irq() warn staging: comedi: cb_pcidas64: fix request_irq() warn ASoC: rt5659: Update MCLK rate in set_sysclk() ASoC: rt711: add snd_soc_component remove callback thermal/core: Add NULL pointer check before using cooling device stats locking/ww_mutex: Simplify use_ww_ctx & ww_ctx handling locking/ww_mutex: Fix acquire/release imbalance in ww_acquire_init()/ww_acquire_fini() nvmet-tcp: fix kmap leak when data digest in use io_uring: imply MSG_NOSIGNAL for send[msg]()/recv[msg]() calls static_call: Align static_call_is_init() patching condition ext4: do not iput inode under running transaction in ext4_rename() io_uring: call req_set_fail_links() on short send[msg]()/recv[msg]() with MSG_WAITALL net: mvpp2: fix interrupt mask/unmask skip condition flow_dissector: fix TTL and TOS dissection on IPv4 fragments can: dev: move driver related infrastructure into separate subdir net: introduce CAN specific pointer in the struct net_device can: tcan4x5x: fix max register value brcmfmac: clear EAP/association status bits on linkdown events ath11k: add ieee80211_unregister_hw to avoid kernel crash caused by NULL pointer rtw88: coex: 8821c: correct antenna switch function netdevsim: dev: Initialize FIB module after debugfs iwlwifi: pcie: don't disable interrupts for reg_lock ath10k: hold RCU lock when calling ieee80211_find_sta_by_ifaddr() net: ethernet: aquantia: Handle error cleanup of start on open appletalk: Fix skb allocation size in loopback case net: ipa: remove two unused register definitions net: ipa: fix register write command validation net: wan/lmc: unregister device when no matching device is found net: 9p: advance iov on empty read bpf: Remove MTU check in __bpf_skb_max_len ACPI: tables: x86: Reserve memory occupied by ACPI tables ACPI: processor: Fix CPU0 wakeup in acpi_idle_play_dead() ALSA: usb-audio: Apply sample rate quirk to Logitech Connect ALSA: hda: Re-add dropped snd_poewr_change_state() calls ALSA: hda: Add missing sanity checks in PM prepare/complete callbacks ALSA: hda/realtek: fix a determine_headset_type issue for a Dell AIO ALSA: hda/realtek: call alc_update_headset_mode() in hp_automute_hook ALSA: hda/realtek: fix mute/micmute LEDs for HP 640 G8 xtensa: fix uaccess-related livelock in do_page_fault xtensa: move coprocessor_flush to the .text section KVM: SVM: load control fields from VMCB12 before checking them KVM: SVM: ensure that EFER.SVME is set when running nested guest or on nested vmexit PM: runtime: Fix race getting/putting suppliers at probe PM: runtime: Fix ordering in pm_runtime_get_suppliers() tracing: Fix stack trace event size s390/vdso: copy tod_steering_delta value to vdso_data page s390/vdso: fix tod_steering_delta type mm: fix race by making init_zero_pfn() early_initcall drm/amdkfd: dqm fence memory corruption drm/amdgpu: fix offset calculation in amdgpu_vm_bo_clear_mappings() drm/amdgpu: check alignment on CPU page for bo map reiserfs: update reiserfs_xattrs_initialized() condition drm/imx: fix memory leak when fails to init drm/tegra: dc: Restore coupling of display controllers drm/tegra: sor: Grab runtime PM reference across reset vfio/nvlink: Add missing SPAPR_TCE_IOMMU depends pinctrl: rockchip: fix restore error in resume extcon: Add stubs for extcon_register_notifier_all() functions extcon: Fix error handling in extcon_dev_register firmware: stratix10-svc: reset COMMAND_RECONFIG_FLAG_PARTIAL to 0 usb: dwc3: pci: Enable dis_uX_susphy_quirk for Intel Merrifield video: hyperv_fb: Fix a double free in hvfb_probe firewire: nosy: Fix a use-after-free bug in nosy_ioctl() usbip: vhci_hcd fix shift out-of-bounds in vhci_hub_control() USB: quirks: ignore remote wake-up on Fibocom L850-GL LTE modem usb: musb: Fix suspend with devices connected for a64 usb: xhci-mtk: fix broken streams issue on 0.96 xHCI cdc-acm: fix BREAK rx code path adding necessary calls USB: cdc-acm: untangle a circular dependency between callback and softint USB: cdc-acm: downgrade message to debug USB: cdc-acm: fix double free on probe failure USB: cdc-acm: fix use-after-free after probe failure usb: gadget: udc: amd5536udc_pci fix null-ptr-dereference usb: dwc2: Fix HPRT0.PrtSusp bit setting for HiKey 960 board. usb: dwc2: Prevent core suspend when port connection flag is 0 usb: dwc3: qcom: skip interconnect init for ACPI probe usb: dwc3: gadget: Clear DEP flags after stop transfers in ep disable soc: qcom-geni-se: Cleanup the code to remove proxy votes staging: rtl8192e: Fix incorrect source in memcpy() staging: rtl8192e: Change state information from u16 to u8 driver core: clear deferred probe reason on probe retry drivers: video: fbcon: fix NULL dereference in fbcon_cursor() riscv: evaluate put_user() arg before enabling user access Revert "kernel: freezer should treat PF_IO_WORKER like PF_KTHREAD for freezing" bpf: Use NOP_ATOMIC5 instead of emit_nops(&prog, 5) for BPF_TRAMP_F_CALL_ORIG Linux 5.10.28 Signed-off-by: Greg Kroah-Hartman <gregkh@google.com> Change-Id: Ifdbbeda8de3ee22a7aa3f5d3b10becf0aba1a124	2021-04-09 09:29:17 +02:00
Waiman Long	905ef030bd	locking/ww_mutex: Simplify use_ww_ctx & ww_ctx handling [ Upstream commit `5de2055d31` ] The use_ww_ctx flag is passed to mutex_optimistic_spin(), but the function doesn't use it. The frequent use of the (use_ww_ctx && ww_ctx) combination is repetitive. In fact, ww_ctx should not be used at all if !use_ww_ctx. Simplify ww_mutex code by dropping use_ww_ctx from mutex_optimistic_spin() an clear ww_ctx if !use_ww_ctx. In this way, we can replace (use_ww_ctx && ww_ctx) by just (ww_ctx). Signed-off-by: Waiman Long <longman@redhat.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Acked-by: Davidlohr Bueso <dbueso@suse.de> Link: https://lore.kernel.org/r/20210316153119.13802-2-longman@redhat.com Signed-off-by: Sasha Levin <sashal@kernel.org>	2021-04-07 15:00:06 +02:00
xieliujie	80b4341d05	ANDROID: vendor_hooks: Add hooks for rwsem and mutex Add hooks to apply oem's optimization of rwsem and mutex Bug: 182237112 Signed-off-by: xieliujie <xieliujie@oppo.com> Change-Id: I6332623732e2d6826b8b61087ca74e55393e0c3d	2021-03-10 21:18:26 +00:00
Sangmoon Kim	9ad8ff902e	ANDROID: vendor_hooks: add waiting information for blocked tasks - Add the hook to get mutex/rwsem information that the tasks are waiting for. - Add the hook to print messages for sched_show_task. - ANDROID_VENDOR_DATA_ARRAY added to task_struct Bug: 162776704 Signed-off-by: Sangmoon Kim <sangmoon.kim@samsung.com> Change-Id: Ib436fbd8d0ad509c3b5a73ea8f5170e0761a13fd (cherry picked from commit b519ac423787d38f467ca479d2126b7204d6f498)	2020-08-19 14:52:44 +00:00
Davidlohr Bueso	c571b72e2b	Revert "locking/mutex: Complain upon mutex API misuse in IRQ contexts" This ended up causing some noise in places such as rxrpc running in softirq. The warning is misleading in this case as the mutex trylock and unlock operations are done within the same context; and therefore we need not worry about the PI-boosting issues that comes along with no single-owner lock guarantees. While we don't want to support this in mutexes, there is no way out of this yet; so lets get rid of the WARNs for now, as it is only fair to code that has historically relied on non-preemptible softirq guarantees. In addition, changing the lock type is also unviable: exclusive rwsems have the same issue (just not the WARN_ON) and counting semaphores would introduce a performance hit as mutexes are a lot more optimized. This reverts: a0855d24fc22: ("locking/mutex: Complain upon mutex API misuse in IRQ contexts") Fixes: a0855d24fc22: ("locking/mutex: Complain upon mutex API misuse in IRQ contexts") Signed-off-by: Davidlohr Bueso <dbueso@suse.de> Tested-by: David Howells <dhowells@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-afs@lists.infradead.org Cc: linux-fsdevel@vger.kernel.org Cc: will@kernel.org Link: https://lkml.kernel.org/r/20191210220523.28540-1-dave@stgolabs.net Signed-off-by: Ingo Molnar <mingo@kernel.org>	2019-12-11 00:27:43 +01:00
Davidlohr Bueso	a0855d24fc	locking/mutex: Complain upon mutex API misuse in IRQ contexts Add warning checks if mutex_trylock() or mutex_unlock() are used in IRQ contexts, under CONFIG_DEBUG_MUTEXES=y. While the mutex rules and semantics are explicitly documented, this allows to expose any abusers and robustifies the whole thing. While trylock and unlock are non-blocking, calling from IRQ context is still forbidden (lock must be within the same context as unlock). Signed-off-by: Davidlohr Bueso <dbueso@suse.de> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: dave@stgolabs.net Link: https://lkml.kernel.org/r/20191025033634.3330-1-dave@stgolabs.net Signed-off-by: Ingo Molnar <mingo@kernel.org>	2019-10-29 12:22:52 +01:00
Qian Cai	5facae4f35	locking/lockdep: Remove unused @nested argument from lock_release() Since the following commit: `b4adfe8e05` ("locking/lockdep: Remove unused argument in __lock_release") @nested is no longer used in lock_release(), so remove it from all lock_release() calls and friends. Signed-off-by: Qian Cai <cai@lca.pw> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Will Deacon <will@kernel.org> Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: airlied@linux.ie Cc: akpm@linux-foundation.org Cc: alexander.levin@microsoft.com Cc: daniel@iogearbox.net Cc: davem@davemloft.net Cc: dri-devel@lists.freedesktop.org Cc: duyuyang@gmail.com Cc: gregkh@linuxfoundation.org Cc: hannes@cmpxchg.org Cc: intel-gfx@lists.freedesktop.org Cc: jack@suse.com Cc: jlbec@evilplan.or Cc: joonas.lahtinen@linux.intel.com Cc: joseph.qi@linux.alibaba.com Cc: jslaby@suse.com Cc: juri.lelli@redhat.com Cc: maarten.lankhorst@linux.intel.com Cc: mark@fasheh.com Cc: mhocko@kernel.org Cc: mripard@kernel.org Cc: ocfs2-devel@oss.oracle.com Cc: rodrigo.vivi@intel.com Cc: sean@poorly.run Cc: st@kernel.org Cc: tj@kernel.org Cc: tytso@mit.edu Cc: vdavydov.dev@gmail.com Cc: vincent.guittot@linaro.org Cc: viro@zeniv.linux.org.uk Link: https://lkml.kernel.org/r/1568909380-32199-1-git-send-email-cai@lca.pw Signed-off-by: Ingo Molnar <mingo@kernel.org>	2019-10-09 12:46:10 +02:00
Peter Zijlstra	e57d143091	mutex: Fix up mutex_waiter usage The patch moving bits into mutex.c was a little too much; by also moving struct mutex_waiter a few less common CONFIGs would no longer build. Fixes: `5f35d5a66b` ("locking/mutex: Make __mutex_owner static to mutex.c") Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>	2019-08-08 09:09:25 +02:00
Mukesh Ojha	a037d26922	locking/mutex: Use mutex flags macro instead of hard code Use the mutex flag macro instead of hard code value inside __mutex_owner(). Signed-off-by: Mukesh Ojha <mojha@codeaurora.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: mingo@redhat.com Cc: will@kernel.org Link: https://lkml.kernel.org/r/1564585504-3543-2-git-send-email-mojha@codeaurora.org	2019-08-06 12:49:16 +02:00
Mukesh Ojha	5f35d5a66b	locking/mutex: Make __mutex_owner static to mutex.c __mutex_owner() should only be used by the mutex api's. So, to put this restiction let's move the __mutex_owner() function definition from linux/mutex.h to mutex.c file. There exist functions that uses __mutex_owner() like mutex_is_locked() and mutex_trylock_recursive(), So to keep legacy thing intact move them as well and export them. Move mutex_waiter structure also to keep it private to the file. Signed-off-by: Mukesh Ojha <mojha@codeaurora.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: mingo@redhat.com Cc: will@kernel.org Link: https://lkml.kernel.org/r/1564585504-3543-1-git-send-email-mojha@codeaurora.org	2019-08-06 12:49:16 +02:00
Sebastian Andrzej Siewior	6c11c6e3d5	locking/mutex: Test for initialized mutex An uninitialized/ zeroed mutex will go unnoticed because there is no check for it. There is a magic check in the unlock's slowpath path which might go unnoticed if the unlock happens in the fastpath. Add a ->magic check early in the mutex_lock() and mutex_trylock() path. Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Will Deacon <will@kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: https://lkml.kernel.org/r/20190703092125.lsdf4gpsh2plhavb@linutronix.de Signed-off-by: Ingo Molnar <mingo@kernel.org>	2019-07-25 15:39:27 +02:00
Mauro Carvalho Chehab	387b14684f	docs: locking: convert docs to ReST and rename to *.rst Convert the locking documents to ReST and add them to the kernel development book where it belongs. Most of the stuff here is just to make Sphinx to properly parse the text file, as they're already in good shape, not requiring massive changes in order to be parsed. The conversion is actually: - add blank lines and identation in order to identify paragraphs; - fix tables markups; - add some lists markups; - mark literal blocks; - adjust title markups. At its new index.rst, let's add a :orphan: while this is not linked to the main index.rst file, in order to avoid build warnings. Signed-off-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org> Acked-by: Federico Vaga <federico.vaga@vaga.pv.it>	2019-07-15 08:53:27 -03:00
Thomas Gleixner	457c899653	treewide: Add SPDX license identifier for missed files Add SPDX license identifiers to all files which: - Have no license information of any form - Have EXPORT_.*_SYMBOL_GPL inside which was used in the initial scan/conversion to ignore the file These files fall under the project license, GPL v2 only. The resulting SPDX license identifier is: GPL-2.0-only Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-05-21 10:50:45 +02:00
Davidlohr Bueso	3bb5f4ac55	kernel/locking/mutex.c: remove caller signal_pending branch predictions This is already done for us internally by the signal machinery. Link: http://lkml.kernel.org/r/20181116002713.8474-2-dave@stgolabs.net Signed-off-by: Davidlohr Bueso <dave@stgolabs.net> Reviewed-by: Andrew Morton <akpm@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2019-01-04 13:13:48 -08:00
Thomas Hellstrom	e13e2366d8	locking/mutex: Fix mutex debug call and ww_mutex documentation The following commit: `08295b3b5b` ("Implement an algorithm choice for Wound-Wait mutexes") introduced a reference in the documentation to a function that was removed in an earlier commit. It also forgot to remove a call to debug_mutex_add_waiter() which is now unconditionally called by __mutex_add_waiter(). Fix those bugs. Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: dri-devel@lists.freedesktop.org Fixes: `08295b3b5b` ("Implement an algorithm choice for Wound-Wait mutexes") Link: http://lkml.kernel.org/r/20180903140708.2401-1-thellstrom@vmware.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2018-09-10 10:05:10 +02:00
Thomas Hellstrom	08295b3b5b	locking: Implement an algorithm choice for Wound-Wait mutexes The current Wound-Wait mutex algorithm is actually not Wound-Wait but Wait-Die. Implement also Wound-Wait as a per-ww-class choice. Wound-Wait is, contrary to Wait-Die a preemptive algorithm and is known to generate fewer backoffs. Testing reveals that this is true if the number of simultaneous contending transactions is small. As the number of simultaneous contending threads increases, Wait-Wound becomes inferior to Wait-Die in terms of elapsed time. Possibly due to the larger number of held locks of sleeping transactions. Update documentation and callers. Timings using git://people.freedesktop.org/~thomash/ww_mutex_test tag patch-18-06-15 Each thread runs 100000 batches of lock / unlock 800 ww mutexes randomly chosen out of 100000. Four core Intel x86_64: Algorithm #threads Rollbacks time Wound-Wait 4 ~100 ~17s. Wait-Die 4 ~150000 ~19s. Wound-Wait 16 ~360000 ~109s. Wait-Die 16 ~450000 ~82s. Cc: Ingo Molnar <mingo@redhat.com> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Gustavo Padovan <gustavo@padovan.org> Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Cc: Sean Paul <seanpaul@chromium.org> Cc: David Airlie <airlied@linux.ie> Cc: Davidlohr Bueso <dave@stgolabs.net> Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> Cc: Josh Triplett <josh@joshtriplett.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Kate Stewart <kstewart@linuxfoundation.org> Cc: Philippe Ombredanne <pombredanne@nexb.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: linux-doc@vger.kernel.org Cc: linux-media@vger.kernel.org Cc: linaro-mm-sig@lists.linaro.org Co-authored-by: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Ingo Molnar <mingo@kernel.org>	2018-07-03 09:44:36 +02:00
Peter Ziljstra	55f036ca7e	locking: WW mutex cleanup Make the WW mutex code more readable by adding comments, splitting up functions and pointing out that we're actually using the Wait-Die algorithm. Cc: Ingo Molnar <mingo@redhat.com> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Gustavo Padovan <gustavo@padovan.org> Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Cc: Sean Paul <seanpaul@chromium.org> Cc: David Airlie <airlied@linux.ie> Cc: Davidlohr Bueso <dave@stgolabs.net> Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> Cc: Josh Triplett <josh@joshtriplett.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Kate Stewart <kstewart@linuxfoundation.org> Cc: Philippe Ombredanne <pombredanne@nexb.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: linux-doc@vger.kernel.org Cc: linux-media@vger.kernel.org Cc: linaro-mm-sig@lists.linaro.org Co-authored-by: Thomas Hellstrom <thellstrom@vmware.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com> Acked-by: Ingo Molnar <mingo@kernel.org>	2018-07-03 09:42:40 +02:00
Peter Zijlstra	c427f69564	locking/mutex: Optimize __mutex_trylock_fast() Use try_cmpxchg to avoid the pointless TEST instruction.. And add the (missing) atomic_long_try_cmpxchg*() wrappery. On x86_64 this gives: 0000000000000710 <mutex_lock>: 0000000000000710 <mutex_lock>: 710: 65 48 8b 14 25 00 00 mov %gs:0x0,%rdx 710: 65 48 8b 14 25 00 00 mov %gs:0x0,%rdx 717: 00 00 717: 00 00 715: R_X86_64_32S current_task 715: R_X86_64_32S current_task 719: 31 c0 xor %eax,%eax 719: 31 c0 xor %eax,%eax 71b: f0 48 0f b1 17 lock cmpxchg %rdx,(%rdi) 71b: f0 48 0f b1 17 lock cmpxchg %rdx,(%rdi) 720: 48 85 c0 test %rax,%rax 720: 75 02 jne 724 <mutex_lock+0x14> 723: 75 02 jne 727 <mutex_lock+0x17> 722: f3 c3 repz retq 725: f3 c3 repz retq 724: eb da jmp 700 <__mutex_lock_slowpath> 727: eb d7 jmp 700 <__mutex_lock_slowpath> 726: 66 2e 0f 1f 84 00 00 nopw %cs:0x0(%rax,%rax,1) 729: 0f 1f 80 00 00 00 00 nopl 0x0(%rax) 72d: 00 00 00 On ARM64 this gives: 000000000000638 <mutex_lock>: 0000000000000638 <mutex_lock>: 638: d5384101 mrs x1, sp_el0 638: d5384101 mrs x1, sp_el0 63c: d2800002 mov x2, #0x0 63c: d2800002 mov x2, #0x0 640: f9800011 prfm pstl1strm, [x0] 640: f9800011 prfm pstl1strm, [x0] 644: c85ffc03 ldaxr x3, [x0] 644: c85ffc03 ldaxr x3, [x0] 648: ca020064 eor x4, x3, x2 648: ca020064 eor x4, x3, x2 64c: b5000064 cbnz x4, 658 <mutex_lock+0x20> 64c: b5000064 cbnz x4, 658 <mutex_lock+0x20> 650: c8047c01 stxr w4, x1, [x0] 650: c8047c01 stxr w4, x1, [x0] 654: 35ffff84 cbnz w4, 644 <mutex_lock+0xc> 654: 35ffff84 cbnz w4, 644 <mutex_lock+0xc> 658: b40000c3 cbz x3, 670 <mutex_lock+0x38> 658: b5000043 cbnz x3, 660 <mutex_lock+0x28> 65c: a9bf7bfd stp x29, x30, [sp,#-16]! 65c: d65f03c0 ret 660: 910003fd mov x29, sp 660: a9bf7bfd stp x29, x30, [sp,#-16]! 664: 97ffffef bl 620 <__mutex_lock_slowpath> 664: 910003fd mov x29, sp 668: a8c17bfd ldp x29, x30, [sp],#16 668: 97ffffee bl 620 <__mutex_lock_slowpath> 66c: d65f03c0 ret 66c: a8c17bfd ldp x29, x30, [sp],#16 670: d65f03c0 ret 670: d65f03c0 ret Reported-by: Matthew Wilcox <mawilcox@microsoft.com> Acked-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@kernel.org>	2018-05-04 10:02:39 +02:00
Matthew Wilcox	45dbac0e28	locking/mutex: Improve documentation On Wed, Mar 14, 2018 at 01:56:31PM -0700, Andrew Morton wrote: > My memory is weak and our documentation is awful. What does > mutex_lock_killable() actually do and how does it differ from > mutex_lock_interruptible()? Add kernel-doc for mutex_lock_killable() and mutex_lock_io(). Reword the kernel-doc for mutex_lock_interruptible(). Signed-off-by: Matthew Wilcox <mawilcox@microsoft.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Kirill Tkhai <ktkhai@virtuozzo.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mauro Carvalho Chehab <mchehab@kernel.org> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: cl@linux.com Cc: tj@kernel.org Link: http://lkml.kernel.org/r/20180315115812.GA9949@bombadil.infradead.org Signed-off-by: Ingo Molnar <mingo@kernel.org>	2018-03-20 08:07:41 +01:00
Mauro Carvalho Chehab	7b4ff1adb5	mutex, futex: adjust kernel-doc markups to generate ReST There are a few issues on some kernel-doc markups that was causing troubles with kernel-doc output on ReST format: ./kernel/futex.c:492: WARNING: Inline emphasis start-string without end-string. ./kernel/futex.c:1264: WARNING: Block quote ends without a blank line; unexpected unindent. ./kernel/futex.c:1721: WARNING: Block quote ends without a blank line; unexpected unindent. ./kernel/futex.c:2338: WARNING: Block quote ends without a blank line; unexpected unindent. ./kernel/futex.c:2426: WARNING: Block quote ends without a blank line; unexpected unindent. ./kernel/futex.c:2899: WARNING: Block quote ends without a blank line; unexpected unindent. ./kernel/futex.c:2972: WARNING: Block quote ends without a blank line; unexpected unindent. Fix them. No functional changes. Acked-by: Darren Hart (VMware) <dvhart@infradead.org> Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>	2017-05-16 08:43:25 -03:00
Ingo Molnar	b17b01533b	sched/headers: Prepare for new header dependencies before moving code to <linux/sched/debug.h> We are going to split <linux/sched/debug.h> out of <linux/sched.h>, which will have to be picked up from other headers and a couple of .c files. Create a trivial placeholder <linux/sched/debug.h> file that just maps to <linux/sched.h> to make this patch obviously correct and bisectable. Include the new header in the files that are going to need it. Acked-by: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-kernel@vger.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>	2017-03-02 08:42:34 +01:00
Ingo Molnar	174cd4b1e5	sched/headers: Prepare to move signal wakeup & sigpending methods from <linux/sched.h> into <linux/sched/signal.h> Fix up affected files that include this signal functionality via sched.h. Acked-by: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-kernel@vger.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>	2017-03-02 08:42:32 +01:00
Ingo Molnar	84f001e157	sched/headers: Prepare for new header dependencies before moving code to <linux/sched/wake_q.h> We are going to split <linux/sched/wake_q.h> out of <linux/sched.h>, which will have to be picked up from other headers and a couple of .c files. Create a trivial placeholder <linux/sched/wake_q.h> file that just maps to <linux/sched.h> to make this patch obviously correct and bisectable. Include the new header in the files that are going to need it. Acked-by: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@kernel.org>	2017-03-02 08:42:26 +01:00
Linus Torvalds	42e1b14b6e	Merge branch 'locking-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull locking updates from Ingo Molnar: "The main changes in this cycle were: - Implement wraparound-safe refcount_t and kref_t types based on generic atomic primitives (Peter Zijlstra) - Improve and fix the ww_mutex code (Nicolai Hähnle) - Add self-tests to the ww_mutex code (Chris Wilson) - Optimize percpu-rwsems with the 'rcuwait' mechanism (Davidlohr Bueso) - Micro-optimize the current-task logic all around the core kernel (Davidlohr Bueso) - Tidy up after recent optimizations: remove stale code and APIs, clean up the code (Waiman Long) - ... plus misc fixes, updates and cleanups" * 'locking-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (50 commits) fork: Fix task_struct alignment locking/spinlock/debug: Remove spinlock lockup detection code lockdep: Fix incorrect condition to print bug msgs for MAX_LOCKDEP_CHAIN_HLOCKS lkdtm: Convert to refcount_t testing kref: Implement 'struct kref' using refcount_t refcount_t: Introduce a special purpose refcount type sched/wake_q: Clarify queue reinit comment sched/wait, rcuwait: Fix typo in comment locking/mutex: Fix lockdep_assert_held() fail locking/rtmutex: Flip unlikely() branch to likely() in __rt_mutex_slowlock() locking/rwsem: Reinit wake_q after use locking/rwsem: Remove unnecessary atomic_long_t casts jump_labels: Move header guard #endif down where it belongs locking/atomic, kref: Implement kref_put_lock() locking/ww_mutex: Turn off __must_check for now locking/atomic, kref: Avoid more abuse locking/atomic, kref: Use kref_get_unless_zero() more locking/atomic, kref: Kill kref_sub() locking/atomic, kref: Add kref_read() locking/atomic, kref: Add KREF_INIT() ...	2017-02-20 13:23:30 -08:00
Peter Zijlstra	b9c16a0e1f	locking/mutex: Fix lockdep_assert_held() fail In commit: `659cf9f582` ("locking/ww_mutex: Optimize ww-mutexes by waking at most one waiter for backoff when acquiring the lock") I replaced a comment with a lockdep_assert_held(). However it turns out we hide that lock from lockdep for hysterical raisins, which results in the assertion always firing. Remove the old debug code as lockdep will easily spot the abuse it was meant to catch, which will make the lock visible to lockdep and make the assertion work as intended. Reported-by: Mike Galbraith <efault@gmx.de> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Nicolai Haehnle <Nicolai.Haehnle@amd.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Fixes: `659cf9f582` ("locking/ww_mutex: Optimize ww-mutexes by waking at most one waiter for backoff when acquiring the lock") Link: http://lkml.kernel.org/r/20170117150609.GB32474@worktop Signed-off-by: Ingo Molnar <mingo@kernel.org>	2017-01-30 11:42:59 +01:00
Tejun Heo	1460cb65a1	locking/mutex, sched/wait: Add mutex_lock_io() We sometimes end up propagating IO blocking through mutexes; however, because there currently is no way of annotating mutex sleeps as iowait, there are cases where iowait and /proc/stat:procs_blocked report misleading numbers obscuring the actual state of the system. This patch adds mutex_lock_io() so that mutex sleeps can be marked as iowait in those cases. Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Jens Axboe <axboe@kernel.dk> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: adilger.kernel@dilger.ca Cc: jack@suse.com Cc: kernel-team@fb.com Cc: mingbo@fb.com Cc: tytso@mit.edu Link: http://lkml.kernel.org/r/1477673892-28940-4-git-send-email-tj@kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>	2017-01-14 11:30:05 +01:00
Nicolai Hähnle	977625a693	locking/mutex: Initialize mutex_waiter::ww_ctx with poison when debugging Help catch cases where mutex_lock is used directly on w/w mutexes, which otherwise result in the w/w tasks reading uninitialized data. Signed-off-by: Nicolai Hähnle <Nicolai.Haehnle@amd.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Daniel Vetter <daniel@ffwll.ch> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Maarten Lankhorst <dev@mblankhorst.nl> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: dri-devel@lists.freedesktop.org Link: http://lkml.kernel.org/r/1482346000-9927-12-git-send-email-nhaehnle@gmail.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2017-01-14 11:14:53 +01:00
Nicolai Hähnle	c516df978d	locking/ww_mutex: Optimize ww-mutexes by yielding to other waiters from optimistic spin Lock stealing is less beneficial for w/w mutexes since we may just end up backing off if we stole from a thread with an earlier acquire stamp that already holds another w/w mutex that we also need. So don't spin optimistically unless we are sure that there is no other waiter that might cause us to back off. Median timings taken of a contention-heavy GPU workload: Before: real 0m52.946s user 0m7.272s sys 1m55.964s After: real 0m53.086s user 0m7.360s sys 1m46.204s This particular workload still spends 20%-25% of CPU in mutex_spin_on_owner according to perf, but my attempts to further reduce this spinning based on various heuristics all lead to an increase in measured wall time despite the decrease in sys time. Signed-off-by: Nicolai Hähnle <Nicolai.Haehnle@amd.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Daniel Vetter <daniel@ffwll.ch> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Maarten Lankhorst <dev@mblankhorst.nl> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: dri-devel@lists.freedesktop.org Link: http://lkml.kernel.org/r/1482346000-9927-11-git-send-email-nhaehnle@gmail.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2017-01-14 11:14:52 +01:00
Nicolai Hähnle	25f13b4040	locking/ww_mutex: Re-check ww->ctx in the inner optimistic spin loop In the following scenario, thread #1 should back off its attempt to lock ww1 and unlock ww2 (assuming the acquire context stamps are ordered accordingly). Thread #0 Thread #1 --------- --------- successfully lock ww2 set ww1->base.owner attempt to lock ww1 confirm ww1->ctx == NULL enter mutex_spin_on_owner set ww1->ctx What was likely to happen previously is: attempt to lock ww2 refuse to spin because ww2->ctx != NULL schedule() detect thread #0 is off CPU stop optimistic spin return -EDEADLK unlock ww2 wakeup thread #0 lock ww2 Now, we are more likely to see: detect ww1->ctx != NULL stop optimistic spin return -EDEADLK unlock ww2 successfully lock ww2 ... because thread #1 will stop its optimistic spin as soon as possible. The whole scenario is quite unlikely, since it requires thread #1 to get between thread #0 setting the owner and setting the ctx. But since we're idling here anyway, the additional check is basically free. Found by inspection. Signed-off-by: Nicolai Hähnle <Nicolai.Haehnle@amd.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Daniel Vetter <daniel@ffwll.ch> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Maarten Lankhorst <dev@mblankhorst.nl> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: dri-devel@lists.freedesktop.org Link: http://lkml.kernel.org/r/1482346000-9927-10-git-send-email-nhaehnle@gmail.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2017-01-14 11:14:51 +01:00
Peter Zijlstra	427b18207a	locking/mutex: Improve inlining Instead of inlining __mutex_lock_common() 5 times, once for each {state,ww} variant. Reduce this to two, ww and !ww. Then add __always_inline to mutex_optimistic_spin(), so that that will get inlined all 4 remaining times, for all {waiter,ww} variants. text data bss dec hex filename 6301 0 0 6301 189d defconfig-build/kernel/locking/mutex.o 4053 0 0 4053 fd5 defconfig-build/kernel/locking/mutex.o 4257 0 0 4257 10a1 defconfig-build/kernel/locking/mutex.o This reduces total text size and better separates the ww and !ww mutex code generation. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-kernel@vger.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>	2017-01-14 11:14:48 +01:00
Nicolai Hähnle	659cf9f582	locking/ww_mutex: Optimize ww-mutexes by waking at most one waiter for backoff when acquiring the lock The wait list is sorted by stamp order, and the only waiting task that may have to back off is the first waiter with a context. The regular slow path does not have to wake any other tasks at all, since all other waiters that would have to back off were either woken up when the waiter was added to the list, or detected the condition before they added themselves. Median timings taken of a contention-heavy GPU workload: Without this series: real 0m59.900s user 0m7.516s sys 2m16.076s With changes up to and including this patch: real 0m52.946s user 0m7.272s sys 1m55.964s Signed-off-by: Nicolai Hähnle <Nicolai.Haehnle@amd.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Daniel Vetter <daniel@ffwll.ch> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Maarten Lankhorst <dev@mblankhorst.nl> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: dri-devel@lists.freedesktop.org Link: http://lkml.kernel.org/r/1482346000-9927-9-git-send-email-nhaehnle@gmail.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2017-01-14 11:14:44 +01:00
Nicolai Hähnle	200b187440	locking/ww_mutex: Notify waiters that have to back off while adding tasks to wait list While adding our task as a waiter, detect if another task should back off because of us. With this patch, we establish the invariant that the wait list contains at most one (sleeping) waiter with ww_ctx->acquired > 0, and this waiter will be the first waiter with a context. Since only waiters with ww_ctx->acquired > 0 have to back off, this allows us to be much more economical with wakeups. Signed-off-by: Nicolai Hähnle <Nicolai.Haehnle@amd.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Daniel Vetter <daniel@ffwll.ch> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Maarten Lankhorst <dev@mblankhorst.nl> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: dri-devel@lists.freedesktop.org Link: http://lkml.kernel.org/r/1482346000-9927-8-git-send-email-nhaehnle@gmail.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2017-01-14 11:14:43 +01:00
Nicolai Hähnle	6baa5c60a9	locking/ww_mutex: Add waiters in stamp order Add regular waiters in stamp order. Keep adding waiters that have no context in FIFO order and take care not to starve them. While adding our task as a waiter, back off if we detect that there is a waiter with a lower stamp in front of us. Make sure to call lock_contended even when we back off early. For w/w mutexes, being first in the wait list is only stable when taking the lock without a context. Therefore, the purpose of the first flag is split into two: 'first' remains to indicate whether we want to spin optimistically, while 'handoff' indicates that we should be prepared to accept a handoff. For w/w locking with a context, we always accept handoffs after the first schedule(), to handle the following sequence of events: 1. Task #0 unlocks and hands off to Task #2 which is first in line 2. Task #1 adds itself in front of Task #2 3. Task #2 wakes up and must accept the handoff even though it is no longer first in line Signed-off-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: =?UTF-8?q?Nicolai=20H=C3=A4hnle?= <Nicolai.Haehnle@amd.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Daniel Vetter <daniel@ffwll.ch> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Maarten Lankhorst <dev@mblankhorst.nl> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: dri-devel@lists.freedesktop.org Link: http://lkml.kernel.org/r/1482346000-9927-7-git-send-email-nhaehnle@gmail.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2017-01-14 11:14:42 +01:00
Nicolai Hähnle	c5470b22d1	locking/ww_mutex: Remove the __ww_mutex_lock*() inline wrappers Keep the documentation in the header file since there is no good place for it in mutex.c: there are two rather different implementations with different EXPORT_SYMBOLs for each function. Signed-off-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: =?UTF-8?q?Nicolai=20H=C3=A4hnle?= <Nicolai.Haehnle@amd.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Daniel Vetter <daniel@ffwll.ch> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Maarten Lankhorst <dev@mblankhorst.nl> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: dri-devel@lists.freedesktop.org Link: http://lkml.kernel.org/r/1482346000-9927-6-git-send-email-nhaehnle@gmail.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2017-01-14 11:14:41 +01:00
Nicolai Hähnle	ea9e0fb8fe	locking/ww_mutex: Set use_ww_ctx even when locking without a context We will add a new field to struct mutex_waiter. This field must be initialized for all waiters if any waiter uses the ww_use_ctx path. So there is a trade-off: Keep ww_mutex locking without a context on the faster non-use_ww_ctx path, at the cost of adding the initialization to all mutex locks (including non-ww_mutexes), or avoid the additional cost for non-ww_mutex locks, at the cost of adding additional checks to the use_ww_ctx path. We take the latter choice. It may be worth eliminating the users of ww_mutex_lock(lock, NULL), but there are a lot of them. Signed-off-by: Nicolai Hähnle <Nicolai.Haehnle@amd.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Daniel Vetter <daniel@ffwll.ch> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Maarten Lankhorst <dev@mblankhorst.nl> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: dri-devel@lists.freedesktop.org Link: http://lkml.kernel.org/r/1482346000-9927-5-git-send-email-nhaehnle@gmail.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2017-01-14 11:14:40 +01:00
Nicolai Hähnle	3822da3ed0	locking/ww_mutex: Extract stamp comparison to __ww_mutex_stamp_after() The function will be re-used in subsequent patches. Signed-off-by: Nicolai Hähnle <Nicolai.Haehnle@amd.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Daniel Vetter <daniel@ffwll.ch> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Maarten Lankhorst <dev@mblankhorst.nl> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: dri-devel@lists.freedesktop.org Link: http://lkml.kernel.org/r/1482346000-9927-4-git-send-email-nhaehnle@gmail.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2017-01-14 11:14:39 +01:00
Peter Zijlstra	e274795ea7	locking/mutex: Fix mutex handoff While reviewing the ww_mutex patches, I noticed that it was still possible to (incorrectly) succeed for (incorrect) code like: mutex_lock(&a); mutex_lock(&a); This was possible if the second mutex_lock() would block (as expected) but then receive a spurious wakeup. At that point it would find itself at the front of the queue, request a handoff and instantly claim ownership and continue, since owner would point to itself. Avoid this scenario and simplify the code by introducing a third low bit to signal handoff pickup. So once we request handoff, unlock clears the handoff bit and sets the pickup bit along with the new owner. This also removes the need for the .handoff argument to __mutex_trylock(), since that becomes superfluous with PICKUP. In order to guarantee enough low bits, ensure task_struct alignment is at least L1_CACHE_BYTES (which seems a good ideal regardless). Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Fixes: `9d659ae14b` ("locking/mutex: Add lock handoff to avoid starvation") Signed-off-by: Ingo Molnar <mingo@kernel.org>	2017-01-14 11:14:38 +01:00
Davidlohr Bueso	642fa448ae	sched/core: Remove set_task_state() This is a nasty interface and setting the state of a foreign task must not be done. As of the following commit: `be628be095` ("bcache: Make gc wakeup sane, remove set_task_state()") ... everyone in the kernel calls set_task_state() with current, allowing the helper to be removed. However, as the comment indicates, it is still around for those archs where computing current is more expensive than using a pointer, at least in theory. An important arch that is affected is arm64, however this has been addressed now [1] and performance is up to par making no difference with either calls. Of all the callers, if any, it's the locking bits that would care most about this -- ie: we end up passing a tsk pointer to a lot of the lock slowpath, and setting ->state on that. The following numbers are based on two tests: a custom ad-hoc microbenchmark that just measures latencies (for ~65 million calls) between get_task_state() vs get_current_state(). Secondly for a higher overview, an unlink microbenchmark was used, which pounds on a single file with open, close,unlink combos with increasing thread counts (up to 4x ncpus). While the workload is quite unrealistic, it does contend a lot on the inode mutex or now rwsem. [1] https://lkml.kernel.org/r/1483468021-8237-1-git-send-email-mark.rutland@arm.com == 1. x86-64 == Avg runtime set_task_state(): 601 msecs Avg runtime set_current_state(): 552 msecs vanilla dirty Hmean unlink1-processes-2 36089.26 ( 0.00%) 38977.33 ( 8.00%) Hmean unlink1-processes-5 28555.01 ( 0.00%) 29832.55 ( 4.28%) Hmean unlink1-processes-8 37323.75 ( 0.00%) 44974.57 ( 20.50%) Hmean unlink1-processes-12 43571.88 ( 0.00%) 44283.01 ( 1.63%) Hmean unlink1-processes-21 34431.52 ( 0.00%) 38284.45 ( 11.19%) Hmean unlink1-processes-30 34813.26 ( 0.00%) 37975.17 ( 9.08%) Hmean unlink1-processes-48 37048.90 ( 0.00%) 39862.78 ( 7.59%) Hmean unlink1-processes-79 35630.01 ( 0.00%) 36855.30 ( 3.44%) Hmean unlink1-processes-110 36115.85 ( 0.00%) 39843.91 ( 10.32%) Hmean unlink1-processes-141 32546.96 ( 0.00%) 35418.52 ( 8.82%) Hmean unlink1-processes-172 34674.79 ( 0.00%) 36899.21 ( 6.42%) Hmean unlink1-processes-203 37303.11 ( 0.00%) 36393.04 ( -2.44%) Hmean unlink1-processes-224 35712.13 ( 0.00%) 36685.96 ( 2.73%) == 2. ppc64le == Avg runtime set_task_state(): 938 msecs Avg runtime set_current_state: 940 msecs vanilla dirty Hmean unlink1-processes-2 19269.19 ( 0.00%) 30704.50 ( 59.35%) Hmean unlink1-processes-5 20106.15 ( 0.00%) 21804.15 ( 8.45%) Hmean unlink1-processes-8 17496.97 ( 0.00%) 17243.28 ( -1.45%) Hmean unlink1-processes-12 14224.15 ( 0.00%) 17240.21 ( 21.20%) Hmean unlink1-processes-21 14155.66 ( 0.00%) 15681.23 ( 10.78%) Hmean unlink1-processes-30 14450.70 ( 0.00%) 15995.83 ( 10.69%) Hmean unlink1-processes-48 16945.57 ( 0.00%) 16370.42 ( -3.39%) Hmean unlink1-processes-79 15788.39 ( 0.00%) 14639.27 ( -7.28%) Hmean unlink1-processes-110 14268.48 ( 0.00%) 14377.40 ( 0.76%) Hmean unlink1-processes-141 14023.65 ( 0.00%) 16271.69 ( 16.03%) Hmean unlink1-processes-172 13417.62 ( 0.00%) 16067.55 ( 19.75%) Hmean unlink1-processes-203 15293.08 ( 0.00%) 15440.40 ( 0.96%) Hmean unlink1-processes-234 13719.32 ( 0.00%) 16190.74 ( 18.01%) Hmean unlink1-processes-265 16400.97 ( 0.00%) 16115.22 ( -1.74%) Hmean unlink1-processes-296 14388.60 ( 0.00%) 16216.13 ( 12.70%) Hmean unlink1-processes-320 15771.85 ( 0.00%) 15905.96 ( 0.85%) x86-64 (known to be fast for get_current()/this_cpu_read_stable() caching) and ppc64 (with paca) show similar improvements in the unlink microbenches. The small delta for ppc64 (2ms), does not represent the gains on the unlink runs. In the case of x86, there was a decent amount of variation in the latency runs, but always within a 20 to 50ms increase), ppc was more constant. Signed-off-by: Davidlohr Bueso <dbueso@suse.de> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: dave@stgolabs.net Cc: mark.rutland@arm.com Link: http://lkml.kernel.org/r/1483479794-14013-5-git-send-email-dave@stgolabs.net Signed-off-by: Ingo Molnar <mingo@kernel.org>	2017-01-14 11:14:16 +01:00
Davidlohr Bueso	d269a8b8c5	kernel/locking: Compute 'current' directly This patch effectively replaces the tsk pointer dereference (which is obviously == current), to directly use get_current() macro. This is to make the removal of setting foreign task states smoother and painfully obvious. Performance win on some archs such as x86-64 and ppc64. On a microbenchmark that calls set_task_state() vs set_current_state() and an inode rwsem pounding benchmark doing unlink: == 1. x86-64 == Avg runtime set_task_state(): 601 msecs Avg runtime set_current_state(): 552 msecs vanilla dirty Hmean unlink1-processes-2 36089.26 ( 0.00%) 38977.33 ( 8.00%) Hmean unlink1-processes-5 28555.01 ( 0.00%) 29832.55 ( 4.28%) Hmean unlink1-processes-8 37323.75 ( 0.00%) 44974.57 ( 20.50%) Hmean unlink1-processes-12 43571.88 ( 0.00%) 44283.01 ( 1.63%) Hmean unlink1-processes-21 34431.52 ( 0.00%) 38284.45 ( 11.19%) Hmean unlink1-processes-30 34813.26 ( 0.00%) 37975.17 ( 9.08%) Hmean unlink1-processes-48 37048.90 ( 0.00%) 39862.78 ( 7.59%) Hmean unlink1-processes-79 35630.01 ( 0.00%) 36855.30 ( 3.44%) Hmean unlink1-processes-110 36115.85 ( 0.00%) 39843.91 ( 10.32%) Hmean unlink1-processes-141 32546.96 ( 0.00%) 35418.52 ( 8.82%) Hmean unlink1-processes-172 34674.79 ( 0.00%) 36899.21 ( 6.42%) Hmean unlink1-processes-203 37303.11 ( 0.00%) 36393.04 ( -2.44%) Hmean unlink1-processes-224 35712.13 ( 0.00%) 36685.96 ( 2.73%) == 2. ppc64le == Avg runtime set_task_state(): 938 msecs Avg runtime set_current_state: 940 msecs vanilla dirty Hmean unlink1-processes-2 19269.19 ( 0.00%) 30704.50 ( 59.35%) Hmean unlink1-processes-5 20106.15 ( 0.00%) 21804.15 ( 8.45%) Hmean unlink1-processes-8 17496.97 ( 0.00%) 17243.28 ( -1.45%) Hmean unlink1-processes-12 14224.15 ( 0.00%) 17240.21 ( 21.20%) Hmean unlink1-processes-21 14155.66 ( 0.00%) 15681.23 ( 10.78%) Hmean unlink1-processes-30 14450.70 ( 0.00%) 15995.83 ( 10.69%) Hmean unlink1-processes-48 16945.57 ( 0.00%) 16370.42 ( -3.39%) Hmean unlink1-processes-79 15788.39 ( 0.00%) 14639.27 ( -7.28%) Hmean unlink1-processes-110 14268.48 ( 0.00%) 14377.40 ( 0.76%) Hmean unlink1-processes-141 14023.65 ( 0.00%) 16271.69 ( 16.03%) Hmean unlink1-processes-172 13417.62 ( 0.00%) 16067.55 ( 19.75%) Hmean unlink1-processes-203 15293.08 ( 0.00%) 15440.40 ( 0.96%) Hmean unlink1-processes-234 13719.32 ( 0.00%) 16190.74 ( 18.01%) Hmean unlink1-processes-265 16400.97 ( 0.00%) 16115.22 ( -1.74%) Hmean unlink1-processes-296 14388.60 ( 0.00%) 16216.13 ( 12.70%) Hmean unlink1-processes-320 15771.85 ( 0.00%) 15905.96 ( 0.85%) Signed-off-by: Davidlohr Bueso <dbueso@suse.de> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: dave@stgolabs.net Cc: mark.rutland@arm.com Link: http://lkml.kernel.org/r/1483479794-14013-4-git-send-email-dave@stgolabs.net Signed-off-by: Ingo Molnar <mingo@kernel.org>	2017-01-14 11:14:14 +01:00
Pan Xinhui	05ffc95139	locking/mutex: Break out of expensive busy-loop on {mutex,rwsem}_spin_on_owner() when owner vCPU is preempted An over-committed guest with more vCPUs than pCPUs has a heavy overload in the two spin_on_owner. This blames on the lock holder preemption issue. Break out of the loop if the vCPU is preempted: if vcpu_is_preempted(cpu) is true. test-case: perf record -a perf bench sched messaging -g 400 -p && perf report before patch: 20.68% sched-messaging [kernel.vmlinux] [k] mutex_spin_on_owner 8.45% sched-messaging [kernel.vmlinux] [k] mutex_unlock 4.12% sched-messaging [kernel.vmlinux] [k] system_call 3.01% sched-messaging [kernel.vmlinux] [k] system_call_common 2.83% sched-messaging [kernel.vmlinux] [k] copypage_power7 2.64% sched-messaging [kernel.vmlinux] [k] rwsem_spin_on_owner 2.00% sched-messaging [kernel.vmlinux] [k] osq_lock after patch: 9.99% sched-messaging [kernel.vmlinux] [k] mutex_unlock 5.28% sched-messaging [unknown] [H] 0xc0000000000768e0 4.27% sched-messaging [kernel.vmlinux] [k] __copy_tofrom_user_power7 3.77% sched-messaging [kernel.vmlinux] [k] copypage_power7 3.24% sched-messaging [kernel.vmlinux] [k] _raw_write_lock_irq 3.02% sched-messaging [kernel.vmlinux] [k] system_call 2.69% sched-messaging [kernel.vmlinux] [k] wait_consider_task Tested-by: Juergen Gross <jgross@suse.com> Signed-off-by: Pan Xinhui <xinhui.pan@linux.vnet.ibm.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Christian Borntraeger <borntraeger@de.ibm.com> Acked-by: Paolo Bonzini <pbonzini@redhat.com> Cc: David.Laight@ACULAB.COM Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: benh@kernel.crashing.org Cc: boqun.feng@gmail.com Cc: bsingharora@gmail.com Cc: dave@stgolabs.net Cc: kernellwp@gmail.com Cc: konrad.wilk@oracle.com Cc: linuxppc-dev@lists.ozlabs.org Cc: mpe@ellerman.id.au Cc: paulmck@linux.vnet.ibm.com Cc: paulus@samba.org Cc: rkrcmar@redhat.com Cc: virtualization@lists.linux-foundation.org Cc: will.deacon@arm.com Cc: xen-devel-request@lists.xenproject.org Cc: xen-devel@lists.xenproject.org Link: http://lkml.kernel.org/r/1478077718-37424-4-git-send-email-xinhui.pan@linux.vnet.ibm.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2016-11-22 12:48:10 +01:00
Waiman Long	194a6b5b9c	sched/wake_q: Rename WAKE_Q to DEFINE_WAKE_Q Currently the wake_q data structure is defined by the WAKE_Q() macro. This macro, however, looks like a function doing something as "wake" is a verb. Even checkpatch.pl was confused as it reported warnings like WARNING: Missing a blank line after declarations #548: FILE: kernel/futex.c:3665: + int ret; + WAKE_Q(wake_q); This patch renames the WAKE_Q() macro to DEFINE_WAKE_Q() which clarifies what the macro is doing and eliminates the checkpatch.pl warnings. Signed-off-by: Waiman Long <longman@redhat.com> Acked-by: Davidlohr Bueso <dave@stgolabs.net> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/1479401198-1765-1-git-send-email-longman@redhat.com [ Resolved conflict and added missing rename. ] Signed-off-by: Ingo Molnar <mingo@kernel.org>	2016-11-21 10:29:01 +01:00
Christian Borntraeger	f2f09a4cee	locking/core: Remove cpu_relax_lowlatency() users With the s390 special case of a yielding cpu_relax() implementation gone, we can now remove all users of cpu_relax_lowlatency() and replace them with cpu_relax(). Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Nicholas Piggin <npiggin@gmail.com> Cc: Noam Camus <noamc@ezchip.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Russell King <linux@armlinux.org.uk> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Will Deacon <will.deacon@arm.com> Cc: linuxppc-dev@lists.ozlabs.org Cc: virtualization@lists.linux-foundation.org Cc: xen-devel@lists.xenproject.org Link: http://lkml.kernel.org/r/1477386195-32736-5-git-send-email-borntraeger@de.ibm.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2016-11-16 10:15:10 +01:00
Waiman Long	b341afb325	locking/mutex: Enable optimistic spinning of woken waiter This patch makes the waiter that sets the HANDOFF flag start spinning instead of sleeping until the handoff is complete or the owner sleeps. Otherwise, the handoff will cause the optimistic spinners to abort spinning as the handed-off owner may not be running. Tested-by: Jason Low <jason.low2@hpe.com> Signed-off-by: Waiman Long <Waiman.Long@hpe.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Davidlohr Bueso <dave@stgolabs.net> Cc: Ding Tianhong <dingtianhong@huawei.com> Cc: Imre Deak <imre.deak@intel.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Paul E. McKenney <paulmck@us.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Tim Chen <tim.c.chen@linux.intel.com> Cc: Will Deacon <Will.Deacon@arm.com> Link: http://lkml.kernel.org/r/1472254509-27508-2-git-send-email-Waiman.Long@hpe.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2016-10-25 11:31:54 +02:00
Waiman Long	a40ca56577	locking/mutex: Simplify some ww_mutex code in __mutex_lock_common() This patch removes some of the redundant ww_mutex code in __mutex_lock_common(). Tested-by: Jason Low <jason.low2@hpe.com> Signed-off-by: Waiman Long <Waiman.Long@hpe.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Davidlohr Bueso <dave@stgolabs.net> Cc: Ding Tianhong <dingtianhong@huawei.com> Cc: Imre Deak <imre.deak@intel.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Paul E. McKenney <paulmck@us.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Tim Chen <tim.c.chen@linux.intel.com> Cc: Will Deacon <Will.Deacon@arm.com> Link: http://lkml.kernel.org/r/1472254509-27508-1-git-send-email-Waiman.Long@hpe.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2016-10-25 11:31:53 +02:00
Peter Zijlstra	5bbd7e6443	locking/mutex: Restructure wait loop Doesn't really matter yet, but pull the HANDOFF and trylock out from under the wait_lock. The intention is to add an optimistic spin loop here, which requires we do not hold the wait_lock, so shuffle code around in preparation. Also clarify the purpose of taking the wait_lock in the wait loop, its tempting to want to avoid it altogether, but the cancellation cases need to to avoid losing wakeups. Suggested-by: Waiman Long <waiman.long@hpe.com> Tested-by: Jason Low <jason.low2@hpe.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-kernel@vger.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>	2016-10-25 11:31:53 +02:00
Peter Zijlstra	9d659ae14b	locking/mutex: Add lock handoff to avoid starvation Implement lock handoff to avoid lock starvation. Lock starvation is possible because mutex_lock() allows lock stealing, where a running (or optimistic spinning) task beats the woken waiter to the acquire. Lock stealing is an important performance optimization because waiting for a waiter to wake up and get runtime can take a significant time, during which everyboy would stall on the lock. The down-side is of course that it allows for starvation. This patch has the waiter requesting a handoff if it fails to acquire the lock upon waking. This re-introduces some of the wait time, because once we do a handoff we have to wait for the waiter to wake up again. A future patch will add a round of optimistic spinning to attempt to alleviate this penalty, but if that turns out to not be enough, we can add a counter and only request handoff after multiple failed wakeups. There are a few tricky implementation details: - accepting a handoff must only be done in the wait-loop. Since the handoff condition is owner == current, it can easily cause recursive locking trouble. - accepting the handoff must be careful to provide the ACQUIRE semantics. - having the HANDOFF bit set on unlock requires care, we must not clear the owner. - we must be careful to not leave HANDOFF set after we've acquired the lock. The tricky scenario is setting the HANDOFF bit on an unlocked mutex. Tested-by: Jason Low <jason.low2@hpe.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Waiman Long <Waiman.Long@hpe.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-kernel@vger.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>	2016-10-25 11:31:52 +02:00

1 2

92 Commits