Changes in 5.10.37
Bluetooth: verify AMP hci_chan before amp_destroy
bluetooth: eliminate the potential race condition when removing the HCI controller
net/nfc: fix use-after-free llcp_sock_bind/connect
io_uring: truncate lengths larger than MAX_RW_COUNT on provide buffers
Revert "USB: cdc-acm: fix rounding error in TIOCSSERIAL"
usb: roles: Call try_module_get() from usb_role_switch_find_by_fwnode()
tty: moxa: fix TIOCSSERIAL jiffies conversions
tty: amiserial: fix TIOCSSERIAL permission check
USB: serial: usb_wwan: fix TIOCSSERIAL jiffies conversions
staging: greybus: uart: fix TIOCSSERIAL jiffies conversions
USB: serial: ti_usb_3410_5052: fix TIOCSSERIAL permission check
staging: fwserial: fix TIOCSSERIAL jiffies conversions
tty: moxa: fix TIOCSSERIAL permission check
staging: fwserial: fix TIOCSSERIAL permission check
drm: bridge: fix LONTIUM use of mipi_dsi_() functions
usb: typec: tcpm: Address incorrect values of tcpm psy for fixed supply
usb: typec: tcpm: Address incorrect values of tcpm psy for pps supply
usb: typec: tcpm: update power supply once partner accepts
usb: xhci-mtk: remove or operator for setting schedule parameters
usb: xhci-mtk: improve bandwidth scheduling with TT
ASoC: samsung: tm2_wm5110: check of of_parse return value
ASoC: Intel: kbl_da7219_max98927: Fix kabylake_ssp_fixup function
ASoC: tlv320aic32x4: Register clocks before registering component
ASoC: tlv320aic32x4: Increase maximum register in regmap
MIPS: pci-mt7620: fix PLL lock check
MIPS: pci-rt2880: fix slot 0 configuration
FDDI: defxx: Bail out gracefully with unassigned PCI resource for CSR
PCI: Allow VPD access for QLogic ISP2722
KVM: x86: Defer the MMU unload to the normal path on an global INVPCID
PCI: xgene: Fix cfg resource mapping
PCI: keystone: Let AM65 use the pci_ops defined in pcie-designware-host.c
PM / devfreq: Unlock mutex and free devfreq struct in error path
soc/tegra: regulators: Fix locking up when voltage-spread is out of range
iio: inv_mpu6050: Fully validate gyro and accel scale writes
iio:accel:adis16201: Fix wrong axis assignment that prevents loading
iio:adc:ad7476: Fix remove handling
sc16is7xx: Defer probe if device read fails
phy: cadence: Sierra: Fix PHY power_on sequence
misc: lis3lv02d: Fix false-positive WARN on various HP models
phy: ti: j721e-wiz: Invoke wiz_init() before of_platform_device_create()
misc: vmw_vmci: explicitly initialize vmci_notify_bm_set_msg struct
misc: vmw_vmci: explicitly initialize vmci_datagram payload
selinux: add proper NULL termination to the secclass_map permissions
x86, sched: Treat Intel SNC topology as default, COD as exception
async_xor: increase src_offs when dropping destination page
md/bitmap: wait for external bitmap writes to complete during tear down
md-cluster: fix use-after-free issue when removing rdev
md: split mddev_find
md: factor out a mddev_find_locked helper from mddev_find
md: md_open returns -EBUSY when entering racing area
md: Fix missing unused status line of /proc/mdstat
mt76: mt7615: use ieee80211_free_txskb() in mt7615_tx_token_put()
ipw2x00: potential buffer overflow in libipw_wx_set_encodeext()
cfg80211: scan: drop entry from hidden_list on overflow
rtw88: Fix array overrun in rtw_get_tx_power_params()
mt76: fix potential DMA mapping leak
FDDI: defxx: Make MMIO the configuration default except for EISA
drm/i915/gvt: Fix virtual display setup for BXT/APL
drm/i915/gvt: Fix vfio_edid issue for BXT/APL
drm/qxl: use ttm bo priorities
drm/panfrost: Clear MMU irqs before handling the fault
drm/panfrost: Don't try to map pages that are already mapped
drm/radeon: fix copy of uninitialized variable back to userspace
drm/dp_mst: Revise broadcast msg lct & lcr
drm/dp_mst: Set CLEAR_PAYLOAD_ID_TABLE as broadcast
drm: bridge/panel: Cleanup connector on bridge detach
drm/amd/display: Reject non-zero src_y and src_x for video planes
drm/amdgpu: fix concurrent VM flushes on Vega/Navi v2
ALSA: hda/realtek: Re-order ALC882 Acer quirk table entries
ALSA: hda/realtek: Re-order ALC882 Sony quirk table entries
ALSA: hda/realtek: Re-order ALC882 Clevo quirk table entries
ALSA: hda/realtek: Re-order ALC269 HP quirk table entries
ALSA: hda/realtek: Re-order ALC269 Acer quirk table entries
ALSA: hda/realtek: Re-order ALC269 Dell quirk table entries
ALSA: hda/realtek: Re-order ALC269 ASUS quirk table entries
ALSA: hda/realtek: Re-order ALC269 Sony quirk table entries
ALSA: hda/realtek: Re-order ALC269 Lenovo quirk table entries
ALSA: hda/realtek: Re-order remaining ALC269 quirk table entries
ALSA: hda/realtek: Re-order ALC662 quirk table entries
ALSA: hda/realtek: Remove redundant entry for ALC861 Haier/Uniwill devices
ALSA: hda/realtek: ALC285 Thinkpad jack pin quirk is unreachable
ALSA: hda/realtek: Fix speaker amp on HP Envy AiO 32
KVM: s390: VSIE: correctly handle MVPG when in VSIE
KVM: s390: split kvm_s390_logical_to_effective
KVM: s390: fix guarded storage control register handling
s390: fix detection of vector enhancements facility 1 vs. vector packed decimal facility
KVM: s390: VSIE: fix MVPG handling for prefixing and MSO
KVM: s390: split kvm_s390_real_to_abs
KVM: s390: extend kvm_s390_shadow_fault to return entry pointer
KVM: x86/mmu: Alloc page for PDPTEs when shadowing 32-bit NPT with 64-bit
KVM: x86: Remove emulator's broken checks on CR0/CR3/CR4 loads
KVM: nSVM: Set the shadow root level to the TDP level for nested NPT
KVM: SVM: Don't strip the C-bit from CR2 on #PF interception
KVM: SVM: Do not allow SEV/SEV-ES initialization after vCPUs are created
KVM: SVM: Inject #GP on guest MSR_TSC_AUX accesses if RDTSCP unsupported
KVM: nVMX: Defer the MMU reload to the normal path on an EPTP switch
KVM: nVMX: Truncate bits 63:32 of VMCS field on nested check in !64-bit
KVM: nVMX: Truncate base/index GPR value on address calc in !64-bit
KVM: arm/arm64: Fix KVM_VGIC_V3_ADDR_TYPE_REDIST read
KVM: Destroy I/O bus devices on unregister failure _after_ sync'ing SRCU
KVM: Stop looking for coalesced MMIO zones if the bus is destroyed
KVM: arm64: Fully zero the vcpu state on reset
KVM: arm64: Fix KVM_VGIC_V3_ADDR_TYPE_REDIST_REGION read
Revert "drivers/net/wan/hdlc_fr: Fix a double free in pvc_xmit"
Revert "i3c master: fix missing destroy_workqueue() on error in i3c_master_register"
ovl: fix missing revert_creds() on error path
Revert "drm/qxl: do not run release if qxl failed to init"
usb: gadget: pch_udc: Revert d3cb25a121 completely
Revert "tools/power turbostat: adjust for temperature offset"
firmware: xilinx: Fix dereferencing freed memory
firmware: xilinx: Add a blank line after function declaration
firmware: xilinx: Remove zynqmp_pm_get_eemi_ops() in IS_REACHABLE(CONFIG_ZYNQMP_FIRMWARE)
fpga: fpga-mgr: xilinx-spi: fix error messages on -EPROBE_DEFER
crypto: sun8i-ss - fix result memory leak on error path
memory: gpmc: fix out of bounds read and dereference on gpmc_cs[]
ARM: dts: exynos: correct fuel gauge interrupt trigger level on GT-I9100
ARM: dts: exynos: correct fuel gauge interrupt trigger level on Midas family
ARM: dts: exynos: correct MUIC interrupt trigger level on Midas family
ARM: dts: exynos: correct PMIC interrupt trigger level on Midas family
ARM: dts: exynos: correct PMIC interrupt trigger level on Odroid X/U3 family
ARM: dts: exynos: correct PMIC interrupt trigger level on SMDK5250
ARM: dts: exynos: correct PMIC interrupt trigger level on Snow
ARM: dts: s5pv210: correct fuel gauge interrupt trigger level on Fascinate family
ARM: dts: renesas: Add mmc aliases into R-Car Gen2 board dts files
arm64: dts: renesas: Add mmc aliases into board dts files
x86/platform/uv: Set section block size for hubless architectures
serial: stm32: fix code cleaning warnings and checks
serial: stm32: add "_usart" prefix in functions name
serial: stm32: fix probe and remove order for dma
serial: stm32: Use of_device_get_match_data()
serial: stm32: fix startup by enabling usart for reception
serial: stm32: fix incorrect characters on console
serial: stm32: fix TX and RX FIFO thresholds
serial: stm32: fix a deadlock condition with wakeup event
serial: stm32: fix wake-up flag handling
serial: stm32: fix a deadlock in set_termios
serial: stm32: fix tx dma completion, release channel
serial: stm32: call stm32_transmit_chars locked
serial: stm32: fix FIFO flush in startup and set_termios
serial: stm32: add FIFO flush when port is closed
serial: stm32: fix tx_empty condition
usb: typec: tcpci: Check ROLE_CONTROL while interpreting CC_STATUS
usb: typec: tps6598x: Fix return value check in tps6598x_probe()
usb: typec: stusb160x: fix return value check in stusb160x_probe()
regmap: set debugfs_name to NULL after it is freed
spi: rockchip: avoid objtool warning
mtd: rawnand: fsmc: Fix error code in fsmc_nand_probe()
mtd: rawnand: brcmnand: fix OOB R/W with Hamming ECC
mtd: Handle possible -EPROBE_DEFER from parse_mtd_partitions()
mtd: rawnand: qcom: Return actual error code instead of -ENODEV
mtd: don't lock when recursively deleting partitions
mtd: maps: fix error return code of physmap_flash_remove()
ARM: dts: stm32: fix usart 2 & 3 pinconf to wake up with flow control
arm64: dts: qcom: sm8250: Fix level triggered PMU interrupt polarity
arm64: dts: qcom: sm8250: Fix timer interrupt to specify EL2 physical timer
arm64: dts: qcom: sdm845: fix number of pins in 'gpio-ranges'
arm64: dts: qcom: sm8150: fix number of pins in 'gpio-ranges'
arm64: dts: qcom: sm8250: fix number of pins in 'gpio-ranges'
arm64: dts: qcom: db845c: fix correct powerdown pin for WSA881x
crypto: sun8i-ss - Fix memory leak of object d when dma_iv fails to map
spi: stm32: drop devres version of spi_register_master
regulator: bd9576: Fix return from bd957x_probe()
arm64: dts: renesas: r8a77980: Fix vin4-7 endpoint binding
spi: stm32: Fix use-after-free on unbind
x86/microcode: Check for offline CPUs before requesting new microcode
devtmpfs: fix placement of complete() call
usb: gadget: pch_udc: Replace cpu_to_le32() by lower_32_bits()
usb: gadget: pch_udc: Check if driver is present before calling ->setup()
usb: gadget: pch_udc: Check for DMA mapping error
usb: gadget: pch_udc: Initialize device pointer before use
usb: gadget: pch_udc: Provide a GPIO line used on Intel Minnowboard (v1)
crypto: ccp - fix command queuing to TEE ring buffer
crypto: qat - don't release uninitialized resources
crypto: qat - ADF_STATUS_PF_RUNNING should be set after adf_dev_init
fotg210-udc: Fix DMA on EP0 for length > max packet size
fotg210-udc: Fix EP0 IN requests bigger than two packets
fotg210-udc: Remove a dubious condition leading to fotg210_done
fotg210-udc: Mask GRP2 interrupts we don't handle
fotg210-udc: Don't DMA more than the buffer can take
fotg210-udc: Complete OUT requests on short packets
usb: gadget: s3c: Fix incorrect resources releasing
usb: gadget: s3c: Fix the error handling path in 's3c2410_udc_probe()'
dt-bindings: serial: stm32: Use 'type: object' instead of false for 'additionalProperties'
mtd: require write permissions for locking and badblock ioctls
arm64: dts: renesas: r8a779a0: Fix PMU interrupt
bus: qcom: Put child node before return
soundwire: bus: Fix device found flag correctly
phy: ti: j721e-wiz: Delete "clk_div_sel" clk provider during cleanup
phy: marvell: ARMADA375_USBCLUSTER_PHY should not default to y, unconditionally
arm64: dts: mediatek: fix reset GPIO level on pumpkin
NFSD: Fix sparse warning in nfs4proc.c
NFSv4.2: fix copy stateid copying for the async copy
crypto: poly1305 - fix poly1305_core_setkey() declaration
crypto: qat - fix error path in adf_isr_resource_alloc()
usb: gadget: aspeed: fix dma map failure
USB: gadget: udc: fix wrong pointer passed to IS_ERR() and PTR_ERR()
drivers: nvmem: Fix voltage settings for QTI qfprom-efuse
driver core: platform: Declare early_platform_cleanup() prototype
memory: pl353: fix mask of ECC page_size config register
soundwire: stream: fix memory leak in stream config error path
m68k: mvme147,mvme16x: Don't wipe PCC timer config bits
firmware: qcom_scm: Make __qcom_scm_is_call_available() return bool
firmware: qcom_scm: Reduce locking section for __get_convention()
firmware: qcom_scm: Workaround lack of "is available" call on SC7180
iio: adc: Kconfig: make AD9467 depend on ADI_AXI_ADC symbol
mtd: rawnand: gpmi: Fix a double free in gpmi_nand_init
irqchip/gic-v3: Fix OF_BAD_ADDR error handling
staging: comedi: tests: ni_routes_test: Fix compilation error
staging: rtl8192u: Fix potential infinite loop
staging: fwserial: fix TIOCSSERIAL implementation
staging: fwserial: fix TIOCGSERIAL implementation
staging: greybus: uart: fix unprivileged TIOCCSERIAL
soc: qcom: pdr: Fix error return code in pdr_register_listener
PM / devfreq: Use more accurate returned new_freq as resume_freq
clocksource/drivers/timer-ti-dm: Fix posted mode status check order
clocksource/drivers/timer-ti-dm: Add missing set_state_oneshot_stopped
clocksource/drivers/ingenic_ost: Fix return value check in ingenic_ost_probe()
spi: Fix use-after-free with devm_spi_alloc_*
spi: fsl: add missing iounmap() on error in of_fsl_spi_probe()
soc: qcom: mdt_loader: Validate that p_filesz < p_memsz
soc: qcom: mdt_loader: Detect truncated read of segments
PM: runtime: Replace inline function pm_runtime_callbacks_present()
cpuidle: Fix ARM_QCOM_SPM_CPUIDLE configuration
ACPI: CPPC: Replace cppc_attr with kobj_attribute
crypto: allwinner - add missing CRYPTO_ prefix
crypto: sun8i-ss - Fix memory leak of pad
crypto: sa2ul - Fix memory leak of rxd
crypto: qat - Fix a double free in adf_create_ring
cpufreq: armada-37xx: Fix setting TBG parent for load levels
clk: mvebu: armada-37xx-periph: remove .set_parent method for CPU PM clock
cpufreq: armada-37xx: Fix the AVS value for load L1
clk: mvebu: armada-37xx-periph: Fix switching CPU freq from 250 Mhz to 1 GHz
clk: mvebu: armada-37xx-periph: Fix workaround for switching from L1 to L0
cpufreq: armada-37xx: Fix driver cleanup when registration failed
cpufreq: armada-37xx: Fix determining base CPU frequency
spi: spi-zynqmp-gqspi: use wait_for_completion_timeout to make zynqmp_qspi_exec_op not interruptible
spi: spi-zynqmp-gqspi: add mutex locking for exec_op
spi: spi-zynqmp-gqspi: transmit dummy circles by using the controller's internal functionality
spi: spi-zynqmp-gqspi: fix incorrect operating mode in zynqmp_qspi_read_op
spi: fsl-lpspi: Fix PM reference leak in lpspi_prepare_xfer_hardware()
usb: gadget: r8a66597: Add missing null check on return from platform_get_resource
USB: cdc-acm: fix unprivileged TIOCCSERIAL
USB: cdc-acm: fix TIOCGSERIAL implementation
tty: actually undefine superseded ASYNC flags
tty: fix return value for unsupported ioctls
tty: Remove dead termiox code
tty: fix return value for unsupported termiox ioctls
serial: core: return early on unsupported ioctls
firmware: qcom-scm: Fix QCOM_SCM configuration
node: fix device cleanups in error handling code
crypto: chelsio - Read rxchannel-id from firmware
usbip: vudc: fix missing unlock on error in usbip_sockfd_store()
m68k: Add missing mmap_read_lock() to sys_cacheflush()
spi: spi-zynqmp-gqspi: Fix missing unlock on error in zynqmp_qspi_exec_op()
memory: renesas-rpc-if: fix possible NULL pointer dereference of resource
memory: samsung: exynos5422-dmc: handle clk_set_parent() failure
security: keys: trusted: fix TPM2 authorizations
platform/x86: pmc_atom: Match all Beckhoff Automation baytrail boards with critclk_systems DMI table
ARM: dts: aspeed: Rainier: Fix humidity sensor bus address
Drivers: hv: vmbus: Use after free in __vmbus_open()
spi: spi-zynqmp-gqspi: fix clk_enable/disable imbalance issue
spi: spi-zynqmp-gqspi: fix hang issue when suspend/resume
spi: spi-zynqmp-gqspi: fix use-after-free in zynqmp_qspi_exec_op
spi: spi-zynqmp-gqspi: return -ENOMEM if dma_map_single fails
x86/platform/uv: Fix !KEXEC build failure
hwmon: (pmbus/pxe1610) don't bail out when not all pages are active
Drivers: hv: vmbus: Increase wait time for VMbus unload
PM: hibernate: x86: Use crc32 instead of md5 for hibernation e820 integrity check
usb: dwc2: Fix host mode hibernation exit with remote wakeup flow.
usb: dwc2: Fix hibernation between host and device modes.
ttyprintk: Add TTY hangup callback.
serial: omap: don't disable rs485 if rts gpio is missing
serial: omap: fix rs485 half-duplex filtering
xen-blkback: fix compatibility bug with single page rings
soc: aspeed: fix a ternary sign expansion bug
drm/tilcdc: send vblank event when disabling crtc
drm/stm: Fix bus_flags handling
drm/amd/display: Fix off by one in hdmi_14_process_transaction()
drm/mcde/panel: Inverse misunderstood flag
sched/fair: Fix shift-out-of-bounds in load_balance()
afs: Fix updating of i_mode due to 3rd party change
rcu: Remove spurious instrumentation_end() in rcu_nmi_enter()
media: vivid: fix assignment of dev->fbuf_out_flags
media: saa7134: use sg_dma_len when building pgtable
media: saa7146: use sg_dma_len when building pgtable
media: omap4iss: return error code when omap4iss_get() failed
media: rkisp1: rsz: crash fix when setting src format
media: aspeed: fix clock handling logic
drm/probe-helper: Check epoch counter in output_poll_execute()
media: venus: core: Fix some resource leaks in the error path of 'venus_probe()'
media: platform: sunxi: sun6i-csi: fix error return code of sun6i_video_start_streaming()
media: m88ds3103: fix return value check in m88ds3103_probe()
media: docs: Fix data organization of MEDIA_BUS_FMT_RGB101010_1X30
media: [next] staging: media: atomisp: fix memory leak of object flash
media: atomisp: Fixed error handling path
media: m88rs6000t: avoid potential out-of-bounds reads on arrays
media: atomisp: Fix use after free in atomisp_alloc_css_stat_bufs()
drm/amdkfd: fix build error with AMD_IOMMU_V2=m
of: overlay: fix for_each_child.cocci warnings
x86/kprobes: Fix to check non boostable prefixes correctly
selftests: fix prepending $(OUTPUT) to $(TEST_PROGS)
pata_arasan_cf: fix IRQ check
pata_ipx4xx_cf: fix IRQ check
sata_mv: add IRQ checks
ata: libahci_platform: fix IRQ check
seccomp: Fix CONFIG tests for Seccomp_filters
nvme-tcp: block BH in sk state_change sk callback
nvmet-tcp: fix incorrect locking in state_change sk callback
clk: imx: Fix reparenting of UARTs not associated with stdout
power: supply: bq25980: Move props from battery node
nvme: retrigger ANA log update if group descriptor isn't found
media: i2c: imx219: Move out locking/unlocking of vflip and hflip controls from imx219_set_stream
media: i2c: imx219: Balance runtime PM use-count
media: v4l2-ctrls.c: fix race condition in hdl->requests list
vfio/fsl-mc: Re-order vfio_fsl_mc_probe()
vfio/pci: Move VGA and VF initialization to functions
vfio/pci: Re-order vfio_pci_probe()
vfio/mdev: Do not allow a mdev_type to have a NULL parent pointer
clk: zynqmp: move zynqmp_pll_set_mode out of round_rate callback
clk: zynqmp: pll: add set_pll_mode to check condition in zynqmp_pll_enable
drm: xlnx: zynqmp: fix a memset in zynqmp_dp_train()
clk: qcom: a53-pll: Add missing MODULE_DEVICE_TABLE
clk: qcom: apss-ipq-pll: Add missing MODULE_DEVICE_TABLE
drm/amd/display: use GFP_ATOMIC in dcn20_resource_construct
drm/radeon: Fix a missing check bug in radeon_dp_mst_detect()
clk: uniphier: Fix potential infinite loop
scsi: pm80xx: Increase timeout for pm80xx mpi_uninit_check()
scsi: pm80xx: Fix potential infinite loop
scsi: ufs: ufshcd-pltfrm: Fix deferred probing
scsi: hisi_sas: Fix IRQ checks
scsi: jazz_esp: Add IRQ check
scsi: sun3x_esp: Add IRQ check
scsi: sni_53c710: Add IRQ check
scsi: ibmvfc: Fix invalid state machine BUG_ON()
mailbox: sprd: Introduce refcnt when clients requests/free channels
mfd: stm32-timers: Avoid clearing auto reload register
nvmet-tcp: fix a segmentation fault during io parsing error
nvme-pci: don't simple map sgl when sgls are disabled
media: cedrus: Fix H265 status definitions
HSI: core: fix resource leaks in hsi_add_client_from_dt()
x86/events/amd/iommu: Fix sysfs type mismatch
perf/amd/uncore: Fix sysfs type mismatch
io_uring: fix overflows checks in provide buffers
sched/debug: Fix cgroup_path[] serialization
drivers/block/null_blk/main: Fix a double free in null_init.
xsk: Respect device's headroom and tailroom on generic xmit path
HID: plantronics: Workaround for double volume key presses
perf symbols: Fix dso__fprintf_symbols_by_name() to return the number of printed chars
ASoC: Intel: boards: sof-wm8804: add check for PLL setting
ASoC: Intel: Skylake: Compile when any configuration is selected
RDMA/mlx5: Fix mlx5 rates to IB rates map
wilc1000: write value to WILC_INTR2_ENABLE register
KVM: x86/mmu: Retry page faults that hit an invalid memslot
Bluetooth: avoid deadlock between hci_dev->lock and socket lock
net: lapbether: Prevent racing when checking whether the netif is running
libbpf: Add explicit padding to bpf_xdp_set_link_opts
bpftool: Fix maybe-uninitialized warnings
iommu: Check dev->iommu in iommu_dev_xxx functions
iommu/vt-d: Reject unsupported page request modes
selftests/bpf: Re-generate vmlinux.h and BPF skeletons if bpftool changed
libbpf: Add explicit padding to btf_dump_emit_type_decl_opts
powerpc/fadump: Mark fadump_calculate_reserve_size as __init
powerpc/prom: Mark identical_pvr_fixup as __init
MIPS: fix local_irq_{disable,enable} in asmmacro.h
ima: Fix the error code for restoring the PCR value
inet: use bigger hash table for IP ID generation
pinctrl: pinctrl-single: remove unused parameter
pinctrl: pinctrl-single: fix pcs_pin_dbg_show() when bits_per_mux is not zero
MIPS: loongson64: fix bug when PAGE_SIZE > 16KB
ASoC: wm8960: Remove bitclk relax condition in wm8960_configure_sysclk
iommu/arm-smmu-v3: add bit field SFM into GERROR_ERR_MASK
RDMA/mlx5: Fix drop packet rule in egress table
IB/isert: Fix a use after free in isert_connect_request
powerpc: Fix HAVE_HARDLOCKUP_DETECTOR_ARCH build configuration
MIPS/bpf: Enable bpf_probe_read{, str}() on MIPS again
gpio: guard gpiochip_irqchip_add_domain() with GPIOLIB_IRQCHIP
ALSA: core: remove redundant spin_lock pair in snd_card_disconnect
net: phy: lan87xx: fix access to wrong register of LAN87xx
udp: never accept GSO_FRAGLIST packets
powerpc/pseries: Only register vio drivers if vio bus exists
net/tipc: fix missing destroy_workqueue() on error in tipc_crypto_start()
bug: Remove redundant condition check in report_bug
RDMA/core: Fix corrupted SL on passive side
nfc: pn533: prevent potential memory corruption
net: hns3: Limiting the scope of vector_ring_chain variable
mips: bmips: fix syscon-reboot nodes
iommu/vt-d: Don't set then clear private data in prq_event_thread()
iommu: Fix a boundary issue to avoid performance drop
iommu/vt-d: Report right snoop capability when using FL for IOVA
iommu/vt-d: Report the right page fault address
iommu/vt-d: Preset Access/Dirty bits for IOVA over FL
iommu/vt-d: Remove WO permissions on second-level paging entries
iommu/vt-d: Invalidate PASID cache when root/context entry changed
ALSA: usb-audio: Add error checks for usb_driver_claim_interface() calls
HID: lenovo: Use brightness_set_blocking callback for setting LEDs brightness
HID: lenovo: Fix lenovo_led_set_tp10ubkbd() error handling
HID: lenovo: Check hid_get_drvdata() returns non NULL in lenovo_event()
HID: lenovo: Map mic-mute button to KEY_F20 instead of KEY_MICMUTE
KVM: arm64: Initialize VCPU mdcr_el2 before loading it
ASoC: simple-card: fix possible uninitialized single_cpu local variable
liquidio: Fix unintented sign extension of a left shift of a u16
IB/hfi1: Use kzalloc() for mmu_rb_handler allocation
powerpc/64s: Fix pte update for kernel memory on radix
powerpc/perf: Fix PMU constraint check for EBB events
powerpc: iommu: fix build when neither PCI or IBMVIO is set
mac80211: bail out if cipher schemes are invalid
perf vendor events amd: Fix broken L2 Cache Hits from L2 HWPF metric
xfs: fix return of uninitialized value in variable error
rtw88: Fix an error code in rtw_debugfs_set_rsvd_page()
mt7601u: fix always true expression
mt76: mt7615: fix tx skb dma unmap
mt76: mt7915: fix tx skb dma unmap
mt76: mt7915: fix aggr len debugfs node
mt76: mt7615: fix mib stats counter reporting to mac80211
mt76: mt7915: fix mib stats counter reporting to mac80211
mt76: mt7663s: make all of packets 4-bytes aligned in sdio tx aggregation
mt76: mt7663s: fix the possible device hang in high traffic
KVM: PPC: Book3S HV P9: Restore host CTRL SPR after guest exit
ovl: invalidate readdir cache on changes to dir with origin
RDMA/qedr: Fix error return code in qedr_iw_connect()
IB/hfi1: Fix error return code in parse_platform_config()
RDMA/bnxt_re: Fix error return code in bnxt_qplib_cq_process_terminal()
cxgb4: Fix unintentional sign extension issues
net: thunderx: Fix unintentional sign extension issue
RDMA/srpt: Fix error return code in srpt_cm_req_recv()
RDMA/rtrs-clt: destroy sysfs after removing session from active list
i2c: cadence: fix reference leak when pm_runtime_get_sync fails
i2c: img-scb: fix reference leak when pm_runtime_get_sync fails
i2c: imx-lpi2c: fix reference leak when pm_runtime_get_sync fails
i2c: imx: fix reference leak when pm_runtime_get_sync fails
i2c: omap: fix reference leak when pm_runtime_get_sync fails
i2c: sprd: fix reference leak when pm_runtime_get_sync fails
i2c: stm32f7: fix reference leak when pm_runtime_get_sync fails
i2c: xiic: fix reference leak when pm_runtime_get_sync fails
i2c: cadence: add IRQ check
i2c: emev2: add IRQ check
i2c: jz4780: add IRQ check
i2c: mlxbf: add IRQ check
i2c: rcar: make sure irq is not threaded on Gen2 and earlier
i2c: rcar: protect against supurious interrupts on V3U
i2c: rcar: add IRQ check
i2c: sh7760: add IRQ check
powerpc/xive: Drop check on irq_data in xive_core_debug_show()
powerpc/xive: Fix xmon command "dxi"
ASoC: ak5558: correct reset polarity
net/mlx5: Fix bit-wise and with zero
net/packet: make packet_fanout.arr size configurable up to 64K
net/packet: remove data races in fanout operations
drm/i915/gvt: Fix error code in intel_gvt_init_device()
iommu/amd: Put newline after closing bracket in warning
perf beauty: Fix fsconfig generator
drm/amd/pm: fix error code in smu_set_power_limit()
MIPS: pci-legacy: stop using of_pci_range_to_resource
powerpc/pseries: extract host bridge from pci_bus prior to bus removal
powerpc/smp: Reintroduce cpu_core_mask
KVM: x86: dump_vmcs should not assume GUEST_IA32_EFER is valid
rtlwifi: 8821ae: upgrade PHY and RF parameters
wlcore: fix overlapping snprintf arguments in debugfs
i2c: sh7760: fix IRQ error path
i2c: mediatek: Fix wrong dma sync flag
mwl8k: Fix a double Free in mwl8k_probe_hw
netfilter: nft_payload: fix C-VLAN offload support
netfilter: nftables_offload: VLAN id needs host byteorder in flow dissector
netfilter: nftables_offload: special ethertype handling for VLAN
vsock/vmci: log once the failed queue pair allocation
libbpf: Initialize the bpf_seq_printf parameters array field by field
net: ethernet: ixp4xx: Set the DMA masks explicitly
gro: fix napi_gro_frags() Fast GRO breakage due to IP alignment check
RDMA/cxgb4: add missing qpid increment
RDMA/i40iw: Fix error unwinding when i40iw_hmc_sd_one fails
ALSA: usb: midi: don't return -ENOMEM when usb_urb_ep_type_check fails
sfc: ef10: fix TX queue lookup in TX event handling
vsock/virtio: free queued packets when closing socket
net: marvell: prestera: fix port event handling on init
net: davinci_emac: Fix incorrect masking of tx and rx error channel
mt76: mt7615: fix memleak when mt7615_unregister_device()
crypto: ccp: Detect and reject "invalid" addresses destined for PSP
nfp: devlink: initialize the devlink port attribute "lanes"
net: stmmac: fix TSO and TBS feature enabling during driver open
net: renesas: ravb: Fix a stuck issue when a lot of frames are received
net: phy: intel-xway: enable integrated led functions
RDMA/rxe: Fix a bug in rxe_fill_ip_info()
RDMA/core: Add CM to restrack after successful attachment to a device
powerpc/64: Fix the definition of the fixmap area
ath9k: Fix error check in ath9k_hw_read_revisions() for PCI devices
ath10k: Fix a use after free in ath10k_htc_send_bundle
ath10k: Fix ath10k_wmi_tlv_op_pull_peer_stats_info() unlock without lock
wlcore: Fix buffer overrun by snprintf due to incorrect buffer size
powerpc/perf: Fix the threshold event selection for memory events in power10
powerpc/52xx: Fix an invalid ASM expression ('addi' used instead of 'add')
net: phy: marvell: fix m88e1011_set_downshift
net: phy: marvell: fix m88e1111_set_downshift
net: enetc: fix link error again
bnxt_en: fix ternary sign extension bug in bnxt_show_temp()
ARM: dts: uniphier: Change phy-mode to RGMII-ID to enable delay pins for RTL8211E
arm64: dts: uniphier: Change phy-mode to RGMII-ID to enable delay pins for RTL8211E
net: geneve: modify IP header check in geneve6_xmit_skb and geneve_xmit_skb
selftests: net: mirror_gre_vlan_bridge_1q: Make an FDB entry static
selftests: mlxsw: Remove a redundant if statement in tc_flower_scale test
bnxt_en: Fix RX consumer index logic in the error path.
KVM: VMX: Intercept FS/GS_BASE MSR accesses for 32-bit KVM
net:emac/emac-mac: Fix a use after free in emac_mac_tx_buf_send
selftests/bpf: Fix BPF_CORE_READ_BITFIELD() macro
selftests/bpf: Fix field existence CO-RE reloc tests
selftests/bpf: Fix core_reloc test runner
bpf: Fix propagation of 32 bit unsigned bounds from 64 bit bounds
RDMA/siw: Fix a use after free in siw_alloc_mr
RDMA/bnxt_re: Fix a double free in bnxt_qplib_alloc_res
net: bridge: mcast: fix broken length + header check for MRDv6 Adv.
net:nfc:digital: Fix a double free in digital_tg_recv_dep_req
perf tools: Change fields type in perf_record_time_conv
perf jit: Let convert_timestamp() to be backwards-compatible
perf session: Add swap operation for event TIME_CONV
ia64: fix EFI_DEBUG build
kfifo: fix ternary sign extension bugs
mm/sl?b.c: remove ctor argument from kmem_cache_flags
mm: memcontrol: slab: fix obtain a reference to a freeing memcg
mm/sparse: add the missing sparse_buffer_fini() in error branch
mm/memory-failure: unnecessary amount of unmapping
afs: Fix speculative status fetches
bpf: Fix alu32 const subreg bound tracking on bitwise operations
bpf, ringbuf: Deny reserve of buffers larger than ringbuf
bpf: Prevent writable memory-mapping of read-only ringbuf pages
arm64: Remove arm64_dma32_phys_limit and its uses
net: Only allow init netns to set default tcp cong to a restricted algo
smp: Fix smp_call_function_single_async prototype
Revert "net/sctp: fix race condition in sctp_destroy_sock"
sctp: delay auto_asconf init until binding the first addr
Linux 5.10.37
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
Change-Id: I5bee89c285d9dd72de967b0e70d96951ae4e06ae
commit 1139aeb1c5 upstream.
As of commit 966a967116 ("smp: Avoid using two cache lines for struct
call_single_data"), the smp code prefers 32-byte aligned call_single_data
objects for performance reasons, but the block layer includes an instance
of this structure in the main 'struct request' that is more senstive
to size than to performance here, see 4ccafe0320 ("block: unalign
call_single_data in struct request").
The result is a violation of the calling conventions that clang correctly
points out:
block/blk-mq.c:630:39: warning: passing 8-byte aligned argument to 32-byte aligned parameter 2 of 'smp_call_function_single_async' may result in an unaligned pointer access [-Walign-mismatch]
smp_call_function_single_async(cpu, &rq->csd);
It does seem that the usage of the call_single_data without cache line
alignment should still be allowed by the smp code, so just change the
function prototype so it accepts both, but leave the default alignment
unchanged for the other users. This seems better to me than adding
a local hack to shut up an otherwise correct warning in the caller.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Jens Axboe <axboe@kernel.dk>
Link: https://lkml.kernel.org/r/20210505211300.3174456-1-arnd@kernel.org
[nc: Fix conflicts, modify rq_csd_init]
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 04ea3086c4 upstream.
Only the very first page of BPF ringbuf that contains consumer position
counter is supposed to be mapped as writeable by user-space. Producer
position is read-only and can be modified only by the kernel code. BPF ringbuf
data pages are read-only as well and are not meant to be modified by
user-code to maintain integrity of per-record headers.
This patch allows to map only consumer position page as writeable and
everything else is restricted to be read-only. remap_vmalloc_range()
internally adds VM_DONTEXPAND, so all the established memory mappings can't be
extended, which prevents any future violations through mremap()'ing.
Fixes: 457f44363a ("bpf: Implement BPF ring buffer and verifier support for it")
Reported-by: Ryota Shiga (Flatt Security)
Reported-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 4b81ccebae upstream.
A BPF program might try to reserve a buffer larger than the ringbuf size.
If the consumer pointer is way ahead of the producer, that would be
successfully reserved, allowing the BPF program to read or write out of
the ringbuf allocated area.
Reported-by: Ryota Shiga (Flatt Security)
Fixes: 457f44363a ("bpf: Implement BPF ring buffer and verifier support for it")
Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 049c4e1371 upstream.
Fix a bug in the verifier's scalar32_min_max_*() functions which leads to
incorrect tracking of 32 bit bounds for the simulation of and/or/xor bitops.
When both the src & dst subreg is a known constant, then the assumption is
that scalar_min_max_*() will take care to update bounds correctly. However,
this is not the case, for example, consider a register R2 which has a tnum
of 0xffffffff00000000, meaning, lower 32 bits are known constant and in this
case of value 0x00000001. R2 is then and'ed with a register R3 which is a
64 bit known constant, here, 0x100000002.
What can be seen in line '10:' is that 32 bit bounds reach an invalid state
where {u,s}32_min_value > {u,s}32_max_value. The reason is scalar32_min_max_*()
delegates 32 bit bounds updates to scalar_min_max_*(), however, that really
only takes place when both the 64 bit src & dst register is a known constant.
Given scalar32_min_max_*() is intended to be designed as closely as possible
to scalar_min_max_*(), update the 32 bit bounds in this situation through
__mark_reg32_known() which will set all {u,s}32_{min,max}_value to the correct
constant, which is 0x00000000 after the fix (given 0x00000001 & 0x00000002 in
32 bit space). This is possible given var32_off already holds the final value
as dst_reg->var_off is updated before calling scalar32_min_max_*().
Before fix, invalid tracking of R2:
[...]
9: R0_w=inv1337 R1=ctx(id=0,off=0,imm=0) R2_w=inv(id=0,smin_value=-9223372036854775807 (0x8000000000000001),smax_value=9223372032559808513 (0x7fffffff00000001),umin_value=1,umax_value=0xffffffff00000001,var_off=(0x1; 0xffffffff00000000),s32_min_value=1,s32_max_value=1,u32_min_value=1,u32_max_value=1) R3_w=inv4294967298 R10=fp0
9: (5f) r2 &= r3
10: R0_w=inv1337 R1=ctx(id=0,off=0,imm=0) R2_w=inv(id=0,smin_value=0,smax_value=4294967296 (0x100000000),umin_value=0,umax_value=0x100000000,var_off=(0x0; 0x100000000),s32_min_value=1,s32_max_value=0,u32_min_value=1,u32_max_value=0) R3_w=inv4294967298 R10=fp0
[...]
After fix, correct tracking of R2:
[...]
9: R0_w=inv1337 R1=ctx(id=0,off=0,imm=0) R2_w=inv(id=0,smin_value=-9223372036854775807 (0x8000000000000001),smax_value=9223372032559808513 (0x7fffffff00000001),umin_value=1,umax_value=0xffffffff00000001,var_off=(0x1; 0xffffffff00000000),s32_min_value=1,s32_max_value=1,u32_min_value=1,u32_max_value=1) R3_w=inv4294967298 R10=fp0
9: (5f) r2 &= r3
10: R0_w=inv1337 R1=ctx(id=0,off=0,imm=0) R2_w=inv(id=0,smin_value=0,smax_value=4294967296 (0x100000000),umin_value=0,umax_value=0x100000000,var_off=(0x0; 0x100000000),s32_min_value=0,s32_max_value=0,u32_min_value=0,u32_max_value=0) R3_w=inv4294967298 R10=fp0
[...]
Fixes: 3f50f132d8 ("bpf: Verifier, do explicit ALU32 bounds tracking")
Fixes: 2921c90d47 ("bpf: Fix a verifier failure with xor")
Reported-by: Manfred Paul (@_manfp)
Reported-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Reviewed-by: John Fastabend <john.fastabend@gmail.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 10bf4e8316 ]
Similarly as b02709587e ("bpf: Fix propagation of 32-bit signed bounds
from 64-bit bounds."), we also need to fix the propagation of 32 bit
unsigned bounds from 64 bit counterparts. That is, really only set the
u32_{min,max}_value when /both/ {umin,umax}_value safely fit in 32 bit
space. For example, the register with a umin_value == 1 does /not/ imply
that u32_min_value is also equal to 1, since umax_value could be much
larger than 32 bit subregister can hold, and thus u32_min_value is in
the interval [0,1] instead.
Before fix, invalid tracking result of R2_w=inv1:
[...]
5: R0_w=inv1337 R1=ctx(id=0,off=0,imm=0) R2_w=inv(id=0) R10=fp0
5: (35) if r2 >= 0x1 goto pc+1
[...] // goto path
7: R0=inv1337 R1=ctx(id=0,off=0,imm=0) R2=inv(id=0,umin_value=1) R10=fp0
7: (b6) if w2 <= 0x1 goto pc+1
[...] // goto path
9: R0=inv1337 R1=ctx(id=0,off=0,imm=0) R2=inv(id=0,smin_value=-9223372036854775807,smax_value=9223372032559808513,umin_value=1,umax_value=18446744069414584321,var_off=(0x1; 0xffffffff00000000),s32_min_value=1,s32_max_value=1,u32_max_value=1) R10=fp0
9: (bc) w2 = w2
10: R0=inv1337 R1=ctx(id=0,off=0,imm=0) R2_w=inv1 R10=fp0
[...]
After fix, correct tracking result of R2_w=inv(id=0,umax_value=1,var_off=(0x0; 0x1)):
[...]
5: R0_w=inv1337 R1=ctx(id=0,off=0,imm=0) R2_w=inv(id=0) R10=fp0
5: (35) if r2 >= 0x1 goto pc+1
[...] // goto path
7: R0=inv1337 R1=ctx(id=0,off=0,imm=0) R2=inv(id=0,umin_value=1) R10=fp0
7: (b6) if w2 <= 0x1 goto pc+1
[...] // goto path
9: R0=inv1337 R1=ctx(id=0,off=0,imm=0) R2=inv(id=0,smax_value=9223372032559808513,umax_value=18446744069414584321,var_off=(0x0; 0xffffffff00000001),s32_min_value=0,s32_max_value=1,u32_max_value=1) R10=fp0
9: (bc) w2 = w2
10: R0=inv1337 R1=ctx(id=0,off=0,imm=0) R2_w=inv(id=0,umax_value=1,var_off=(0x0; 0x1)) R10=fp0
[...]
Thus, same issue as in b02709587e holds for unsigned subregister tracking.
Also, align __reg64_bound_u32() similarly to __reg64_bound_s32() as done in
b02709587e to make them uniform again.
Fixes: 3f50f132d8 ("bpf: Verifier, do explicit ALU32 bounds tracking")
Reported-by: Manfred Paul (@_manfp)
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Reviewed-by: John Fastabend <john.fastabend@gmail.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit ad789f84c9 ]
The handling of sysrq key can be activated by echoing the key to
/proc/sysrq-trigger or via the magic key sequence typed into a terminal
that is connected to the system in some way (serial, USB or other mean).
In the former case, the handling is done in a user context. In the
latter case, it is likely to be in an interrupt context.
Currently in print_cpu() of kernel/sched/debug.c, sched_debug_lock is
taken with interrupt disabled for the whole duration of the calls to
print_*_stats() and print_rq() which could last for the quite some time
if the information dump happens on the serial console.
If the system has many cpus and the sched_debug_lock is somehow busy
(e.g. parallel sysrq-t), the system may hit a hard lockup panic
depending on the actually serial console implementation of the
system.
The purpose of sched_debug_lock is to serialize the use of the global
cgroup_path[] buffer in print_cpu(). The rests of the printk calls don't
need serialization from sched_debug_lock.
Calling printk() with interrupt disabled can still be problematic if
multiple instances are running. Allocating a stack buffer of PATH_MAX
bytes is not feasible because of the limited size of the kernel stack.
The solution implemented in this patch is to allow only one caller at a
time to use the full size group_path[], while other simultaneous callers
will have to use shorter stack buffers with the possibility of path
name truncation. A "..." suffix will be printed if truncation may have
happened. The cgroup path name is provided for informational purpose
only, so occasional path name truncation should not be a big problem.
Fixes: efe25c2c7b ("sched: Reinstate group names in /proc/sched_debug")
Suggested-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Waiman Long <longman@redhat.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20210415195426.6677-1-longman@redhat.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 6494ccb932 ]
In rcu_nmi_enter(), there is an erroneous instrumentation_end() in the
second branch of the "if" statement. Oddly enough, "objtool check -f
vmlinux.o" fails to complain because it is unable to correctly cover
all cases. Instead, objtool visits the third branch first, which marks
following trace_rcu_dyntick() as visited. This commit therefore removes
the spurious instrumentation_end().
Fixes: 04b25a495b ("rcu: Mark rcu_nmi_enter() call to rcu_cleanup_after_idle() noinstr")
Reported-by Neeraj Upadhyay <neeraju@codeaurora.org>
Signed-off-by: Zhouyi Zhou <zhouzhouyi@gmail.com>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 39a2a6eb5c ]
Syzbot reported a handful of occurrences where an sd->nr_balance_failed can
grow to much higher values than one would expect.
A successful load_balance() resets it to 0; a failed one increments
it. Once it gets to sd->cache_nice_tries + 3, this *should* trigger an
active balance, which will either set it to sd->cache_nice_tries+1 or reset
it to 0. However, in case the to-be-active-balanced task is not allowed to
run on env->dst_cpu, then the increment is done without any further
modification.
This could then be repeated ad nauseam, and would explain the absurdly high
values reported by syzbot (86, 149). VincentG noted there is value in
letting sd->cache_nice_tries grow, so the shift itself should be
fixed. That means preventing:
"""
If the value of the right operand is negative or is greater than or equal
to the width of the promoted left operand, the behavior is undefined.
"""
Thus we need to cap the shift exponent to
BITS_PER_TYPE(typeof(lefthand)) - 1.
I had a look around for other similar cases via coccinelle:
@expr@
position pos;
expression E1;
expression E2;
@@
(
E1 >> E2@pos
|
E1 >> E2@pos
)
@cst depends on expr@
position pos;
expression expr.E1;
constant cst;
@@
(
E1 >> cst@pos
|
E1 << cst@pos
)
@script:python depends on !cst@
pos << expr.pos;
exp << expr.E2;
@@
# Dirty hack to ignore constexpr
if exp.upper() != exp:
coccilib.report.print_report(pos[0], "Possible UB shift here")
The only other match in kernel/sched is rq_clock_thermal() which employs
sched_thermal_decay_shift, and that exponent is already capped to 10, so
that one is fine.
Fixes: 5a7f555904 ("sched/fair: Relax constraint on task's load during load balance")
Reported-by: syzbot+d7581744d5fd27c9fbe1@syzkaller.appspotmail.com
Signed-off-by: Valentin Schneider <valentin.schneider@arm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: http://lore.kernel.org/r/000000000000ffac1205b9a2112f@google.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
Add hook to dup_task_struct for vendor data fields initialisation.
Bug: 188004638
Change-Id: I4b58604ee822fb8d1e0cc37bec72e820e7318427
Signed-off-by: Liangliang Li <liliangliang@vivo.com>
This reverts commit ae7fe4794d which
showed up in 5.10.36 and broke the abi. It should not be needed in
Android systems at this time so it is safe to revert.
Bug: 161946584
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
Change-Id: I69f1885bb0d25ebbaa52422d6c238f13fbe0a0f3
Changes in 5.10.36
bus: mhi: core: Fix check for syserr at power_up
bus: mhi: core: Clear configuration from channel context during reset
bus: mhi: core: Sanity check values from remote device before use
nitro_enclaves: Fix stale file descriptors on failed usercopy
dyndbg: fix parsing file query without a line-range suffix
s390/disassembler: increase ebpf disasm buffer size
s390/zcrypt: fix zcard and zqueue hot-unplug memleak
vhost-vdpa: fix vm_flags for virtqueue doorbell mapping
tpm: acpi: Check eventlog signature before using it
ACPI: custom_method: fix potential use-after-free issue
ACPI: custom_method: fix a possible memory leak
ftrace: Handle commands when closing set_ftrace_filter file
ARM: 9056/1: decompressor: fix BSS size calculation for LLVM ld.lld
arm64: dts: marvell: armada-37xx: add syscon compatible to NB clk node
arm64: dts: mt8173: fix property typo of 'phys' in dsi node
ecryptfs: fix kernel panic with null dev_name
fs/epoll: restore waking from ep_done_scan()
mtd: spi-nor: core: Fix an issue of releasing resources during read/write
Revert "mtd: spi-nor: macronix: Add support for mx25l51245g"
mtd: spinand: core: add missing MODULE_DEVICE_TABLE()
mtd: rawnand: atmel: Update ecc_stats.corrected counter
mtd: physmap: physmap-bt1-rom: Fix unintentional stack access
erofs: add unsupported inode i_format check
spi: stm32-qspi: fix pm_runtime usage_count counter
spi: spi-ti-qspi: Free DMA resources
scsi: qla2xxx: Fix crash in qla2xxx_mqueuecommand()
scsi: mpt3sas: Block PCI config access from userspace during reset
mmc: uniphier-sd: Fix an error handling path in uniphier_sd_probe()
mmc: uniphier-sd: Fix a resource leak in the remove function
mmc: sdhci: Check for reset prior to DMA address unmap
mmc: sdhci-pci: Fix initialization of some SD cards for Intel BYT-based controllers
mmc: sdhci-tegra: Add required callbacks to set/clear CQE_EN bit
mmc: block: Update ext_csd.cache_ctrl if it was written
mmc: block: Issue a cache flush only when it's enabled
mmc: core: Do a power cycle when the CMD11 fails
mmc: core: Set read only for SD cards with permanent write protect bit
mmc: core: Fix hanging on I/O during system suspend for removable cards
irqchip/gic-v3: Do not enable irqs when handling spurious interrups
cifs: Return correct error code from smb2_get_enc_key
cifs: fix out-of-bound memory access when calling smb3_notify() at mount point
cifs: detect dead connections only when echoes are enabled.
smb2: fix use-after-free in smb2_ioctl_query_info()
btrfs: handle remount to no compress during compression
x86/build: Disable HIGHMEM64G selection for M486SX
btrfs: fix metadata extent leak after failure to create subvolume
intel_th: pci: Add Rocket Lake CPU support
btrfs: fix race between transaction aborts and fsyncs leading to use-after-free
posix-timers: Preserve return value in clock_adjtime32()
fbdev: zero-fill colormap in fbcmap.c
cpuidle: tegra: Fix C7 idling state on Tegra114
bus: ti-sysc: Probe for l4_wkup and l4_cfg interconnect devices first
staging: wimax/i2400m: fix byte-order issue
spi: ath79: always call chipselect function
spi: ath79: remove spi-master setup and cleanup assignment
bus: mhi: core: Destroy SBL devices when moving to mission mode
crypto: api - check for ERR pointers in crypto_destroy_tfm()
crypto: qat - fix unmap invalid dma address
usb: gadget: uvc: add bInterval checking for HS mode
usb: webcam: Invalid size of Processing Unit Descriptor
x86/sev: Do not require Hypervisor CPUID bit for SEV guests
crypto: hisilicon/sec - fixes a printing error
genirq/matrix: Prevent allocation counter corruption
usb: gadget: f_uac2: validate input parameters
usb: gadget: f_uac1: validate input parameters
usb: dwc3: gadget: Ignore EP queue requests during bus reset
usb: xhci: Fix port minor revision
kselftest/arm64: mte: Fix compilation with native compiler
ARM: tegra: acer-a500: Rename avdd to vdda of touchscreen node
PCI: PM: Do not read power state in pci_enable_device_flags()
kselftest/arm64: mte: Fix MTE feature detection
ARM: dts: BCM5301X: fix "reg" formatting in /memory node
ARM: dts: ux500: Fix up TVK R3 sensors
x86/build: Propagate $(CLANG_FLAGS) to $(REALMODE_FLAGS)
x86/boot: Add $(CLANG_FLAGS) to compressed KBUILD_CFLAGS
efi/libstub: Add $(CLANG_FLAGS) to x86 flags
soc/tegra: pmc: Fix completion of power-gate toggling
arm64: dts: imx8mq-librem5-r3: Mark buck3 as always on
tee: optee: do not check memref size on return from Secure World
soundwire: cadence: only prepare attached devices on clock stop
perf/arm_pmu_platform: Use dev_err_probe() for IRQ errors
perf/arm_pmu_platform: Fix error handling
random: initialize ChaCha20 constants with correct endianness
usb: xhci-mtk: support quirk to disable usb2 lpm
fpga: dfl: pci: add DID for D5005 PAC cards
xhci: check port array allocation was successful before dereferencing it
xhci: check control context is valid before dereferencing it.
xhci: fix potential array out of bounds with several interrupters
bus: mhi: core: Clear context for stopped channels from remove()
ARM: dts: at91: change the key code of the gpio key
tools/power/x86/intel-speed-select: Increase string size
platform/x86: ISST: Account for increased timeout in some cases
spi: dln2: Fix reference leak to master
spi: omap-100k: Fix reference leak to master
spi: qup: fix PM reference leak in spi_qup_remove()
usb: gadget: tegra-xudc: Fix possible use-after-free in tegra_xudc_remove()
usb: musb: fix PM reference leak in musb_irq_work()
usb: core: hub: Fix PM reference leak in usb_port_resume()
usb: dwc3: gadget: Check for disabled LPM quirk
tty: n_gsm: check error while registering tty devices
intel_th: Consistency and off-by-one fix
phy: phy-twl4030-usb: Fix possible use-after-free in twl4030_usb_remove()
crypto: sun8i-ss - Fix PM reference leak when pm_runtime_get_sync() fails
crypto: sun8i-ce - Fix PM reference leak in sun8i_ce_probe()
crypto: stm32/hash - Fix PM reference leak on stm32-hash.c
crypto: stm32/cryp - Fix PM reference leak on stm32-cryp.c
crypto: sa2ul - Fix PM reference leak in sa_ul_probe()
crypto: omap-aes - Fix PM reference leak on omap-aes.c
platform/x86: intel_pmc_core: Don't use global pmcdev in quirks
spi: sync up initial chipselect state
btrfs: do proper error handling in create_reloc_root
btrfs: do proper error handling in btrfs_update_reloc_root
btrfs: convert logic BUG_ON()'s in replace_path to ASSERT()'s
drm: Added orientation quirk for OneGX1 Pro
drm/qxl: do not run release if qxl failed to init
drm/qxl: release shadow on shutdown
drm/ast: Fix invalid usage of AST_MAX_HWC_WIDTH in cursor atomic_check
drm/amd/display: changing sr exit latency
drm/ast: fix memory leak when unload the driver
drm/amd/display: Check for DSC support instead of ASIC revision
drm/amd/display: Don't optimize bandwidth before disabling planes
drm/amdgpu/display: buffer INTERRUPT_LOW_IRQ_CONTEXT interrupt work
drm/amd/display/dc/dce/dce_aux: Remove duplicate line causing 'field overwritten' issue
scsi: lpfc: Fix incorrect dbde assignment when building target abts wqe
scsi: lpfc: Fix pt2pt connection does not recover after LOGO
drm/amdgpu: Fix some unload driver issues
sched/pelt: Fix task util_est update filtering
kvfree_rcu: Use same set of GFP flags as does single-argument
scsi: target: pscsi: Fix warning in pscsi_complete_cmd()
media: ite-cir: check for receive overflow
media: drivers: media: pci: sta2x11: fix Kconfig dependency on GPIOLIB
media: imx: capture: Return -EPIPE from __capture_legacy_try_fmt()
atomisp: don't let it go past pipes array
power: supply: bq27xxx: fix power_avg for newer ICs
extcon: arizona: Fix some issues when HPDET IRQ fires after the jack has been unplugged
extcon: arizona: Fix various races on driver unbind
media: media/saa7164: fix saa7164_encoder_register() memory leak bugs
media: gspca/sq905.c: fix uninitialized variable
power: supply: Use IRQF_ONESHOT
backlight: qcom-wled: Use sink_addr for sync toggle
backlight: qcom-wled: Fix FSC update issue for WLED5
drm/amdgpu: mask the xgmi number of hops reported from psp to kfd
drm/amdkfd: Fix UBSAN shift-out-of-bounds warning
drm/amdgpu : Fix asic reset regression issue introduce by 8f211fe8ac
drm/amd/pm: fix workload mismatch on vega10
drm/amd/display: Fix UBSAN warning for not a valid value for type '_Bool'
drm/amd/display: DCHUB underflow counter increasing in some scenarios
drm/amd/display: fix dml prefetch validation
scsi: qla2xxx: Always check the return value of qla24xx_get_isp_stats()
drm/vkms: fix misuse of WARN_ON
scsi: qla2xxx: Fix use after free in bsg
mmc: sdhci-esdhc-imx: validate pinctrl before use it
mmc: sdhci-pci: Add PCI IDs for Intel LKF
mmc: sdhci-brcmstb: Remove CQE quirk
ata: ahci: Disable SXS for Hisilicon Kunpeng920
drm/komeda: Fix bit check to import to value of proper type
nvmet: return proper error code from discovery ctrl
selftests/resctrl: Enable gcc checks to detect buffer overflows
selftests/resctrl: Fix compilation issues for global variables
selftests/resctrl: Fix compilation issues for other global variables
selftests/resctrl: Clean up resctrl features check
selftests/resctrl: Fix missing options "-n" and "-p"
selftests/resctrl: Use resctrl/info for feature detection
selftests/resctrl: Fix incorrect parsing of iMC counters
selftests/resctrl: Fix checking for < 0 for unsigned values
power: supply: cpcap-charger: Add usleep to cpcap charger to avoid usb plug bounce
scsi: smartpqi: Use host-wide tag space
scsi: smartpqi: Correct request leakage during reset operations
scsi: smartpqi: Add new PCI IDs
scsi: scsi_dh_alua: Remove check for ASC 24h in alua_rtpg()
media: em28xx: fix memory leak
media: vivid: update EDID
drm/msm/dp: Fix incorrect NULL check kbot warnings in DP driver
clk: socfpga: arria10: Fix memory leak of socfpga_clk on error return
power: supply: generic-adc-battery: fix possible use-after-free in gab_remove()
power: supply: s3c_adc_battery: fix possible use-after-free in s3c_adc_bat_remove()
media: tc358743: fix possible use-after-free in tc358743_remove()
media: adv7604: fix possible use-after-free in adv76xx_remove()
media: i2c: adv7511-v4l2: fix possible use-after-free in adv7511_remove()
media: i2c: tda1997: Fix possible use-after-free in tda1997x_remove()
media: i2c: adv7842: fix possible use-after-free in adv7842_remove()
media: platform: sti: Fix runtime PM imbalance in regs_show
media: sun8i-di: Fix runtime PM imbalance in deinterlace_start_streaming
media: dvb-usb: fix memory leak in dvb_usb_adapter_init
media: gscpa/stv06xx: fix memory leak
sched/fair: Ignore percpu threads for imbalance pulls
drm/msm/mdp5: Configure PP_SYNC_HEIGHT to double the vtotal
drm/msm/mdp5: Do not multiply vclk line count by 100
drm/amdgpu/ttm: Fix memory leak userptr pages
drm/radeon/ttm: Fix memory leak userptr pages
drm/amd/display: Fix debugfs link_settings entry
drm/amd/display: Fix UBSAN: shift-out-of-bounds warning
drm/amdkfd: Fix cat debugfs hang_hws file causes system crash bug
amdgpu: avoid incorrect %hu format string
drm/amd/display: Try YCbCr420 color when YCbCr444 fails
drm/amdgpu: fix NULL pointer dereference
scsi: lpfc: Fix crash when a REG_RPI mailbox fails triggering a LOGO response
scsi: lpfc: Fix error handling for mailboxes completed in MBX_POLL mode
scsi: lpfc: Remove unsupported mbox PORT_CAPABILITIES logic
mfd: intel-m10-bmc: Fix the register access range
mfd: da9063: Support SMBus and I2C mode
mfd: arizona: Fix rumtime PM imbalance on error
scsi: libfc: Fix a format specifier
perf: Rework perf_event_exit_event()
sched,fair: Alternative sched_slice()
block/rnbd-clt: Fix missing a memory free when unloading the module
s390/archrandom: add parameter check for s390_arch_random_generate
sched,psi: Handle potential task count underflow bugs more gracefully
power: supply: cpcap-battery: fix invalid usage of list cursor
ALSA: emu8000: Fix a use after free in snd_emu8000_create_mixer
ALSA: hda/conexant: Re-order CX5066 quirk table entries
ALSA: sb: Fix two use after free in snd_sb_qsound_build
ALSA: usb-audio: Explicitly set up the clock selector
ALSA: usb-audio: Add dB range mapping for Sennheiser Communications Headset PC 8
ALSA: hda/realtek: fix mute/micmute LEDs for HP ProBook 445 G7
ALSA: hda/realtek: GA503 use same quirks as GA401
ALSA: hda/realtek: fix mic boost on Intel NUC 8
ALSA: hda/realtek - Headset Mic issue on HP platform
ALSA: hda/realtek: fix static noise on ALC285 Lenovo laptops
ALSA: hda/realtek: Add quirk for Intel Clevo PCx0Dx
tools/power/turbostat: Fix turbostat for AMD Zen CPUs
btrfs: fix race when picking most recent mod log operation for an old root
arm64/vdso: Discard .note.gnu.property sections in vDSO
Makefile: Move -Wno-unused-but-set-variable out of GCC only block
fs: fix reporting supported extra file attributes for statx()
virtiofs: fix memory leak in virtio_fs_probe()
kcsan, debugfs: Move debugfs file creation out of early init
ubifs: Only check replay with inode type to judge if inode linked
f2fs: fix error handling in f2fs_end_enable_verity()
f2fs: fix to avoid out-of-bounds memory access
mlxsw: spectrum_mr: Update egress RIF list before route's action
openvswitch: fix stack OOB read while fragmenting IPv4 packets
ACPI: GTDT: Don't corrupt interrupt mappings on watchdow probe failure
NFS: fs_context: validate UDP retrans to prevent shift out-of-bounds
NFS: Don't discard pNFS layout segments that are marked for return
NFSv4: Don't discard segments marked for return in _pnfs_return_layout()
Input: ili210x - add missing negation for touch indication on ili210x
jffs2: Fix kasan slab-out-of-bounds problem
jffs2: Hook up splice_write callback
powerpc/powernv: Enable HAIL (HV AIL) for ISA v3.1 processors
powerpc/eeh: Fix EEH handling for hugepages in ioremap space.
powerpc/kexec_file: Use current CPU info while setting up FDT
powerpc/32: Fix boot failure with CONFIG_STACKPROTECTOR
powerpc: fix EDEADLOCK redefinition error in uapi/asm/errno.h
intel_th: pci: Add Alder Lake-M support
tpm: efi: Use local variable for calculating final log size
tpm: vtpm_proxy: Avoid reading host log when using a virtual device
crypto: arm/curve25519 - Move '.fpu' after '.arch'
crypto: rng - fix crypto_rng_reset() refcounting when !CRYPTO_STATS
md/raid1: properly indicate failure when ending a failed write request
dm raid: fix inconclusive reshape layout on fast raid4/5/6 table reload sequences
fuse: fix write deadlock
exfat: fix erroneous discard when clear cluster bit
sfc: farch: fix TX queue lookup in TX flush done handling
sfc: farch: fix TX queue lookup in TX event handling
security: commoncap: fix -Wstringop-overread warning
Fix misc new gcc warnings
jffs2: check the validity of dstlen in jffs2_zlib_compress()
smb3: when mounting with multichannel include it in requested capabilities
smb3: do not attempt multichannel to server which does not support it
Revert 337f13046f ("futex: Allow FUTEX_CLOCK_REALTIME with FUTEX_WAIT op")
futex: Do not apply time namespace adjustment on FUTEX_LOCK_PI
x86/cpu: Initialize MSR_TSC_AUX if RDTSCP *or* RDPID is supported
kbuild: update config_data.gz only when the content of .config is changed
ext4: annotate data race in start_this_handle()
ext4: annotate data race in jbd2_journal_dirty_metadata()
ext4: fix check to prevent false positive report of incorrect used inodes
ext4: do not set SB_ACTIVE in ext4_orphan_cleanup()
ext4: fix error code in ext4_commit_super
ext4: fix ext4_error_err save negative errno into superblock
ext4: fix error return code in ext4_fc_perform_commit()
ext4: allow the dax flag to be set and cleared on inline directories
ext4: Fix occasional generic/418 failure
media: dvbdev: Fix memory leak in dvb_media_device_free()
media: dvb-usb: Fix use-after-free access
media: dvb-usb: Fix memory leak at error in dvb_usb_device_init()
media: staging/intel-ipu3: Fix memory leak in imu_fmt
media: staging/intel-ipu3: Fix set_fmt error handling
media: staging/intel-ipu3: Fix race condition during set_fmt
media: v4l2-ctrls: fix reference to freed memory
media: venus: hfi_parser: Don't initialize parser on v1
usb: gadget: dummy_hcd: fix gpf in gadget_setup
usb: gadget: Fix double free of device descriptor pointers
usb: gadget/function/f_fs string table fix for multiple languages
usb: dwc3: gadget: Remove FS bInterval_m1 limitation
usb: dwc3: gadget: Fix START_TRANSFER link state check
usb: dwc3: core: Do core softreset when switch mode
usb: dwc2: Fix session request interrupt handler
tty: fix memory leak in vc_deallocate
rsi: Use resume_noirq for SDIO
tools/power turbostat: Fix offset overflow issue in index converting
tracing: Map all PIDs to command lines
tracing: Restructure trace_clock_global() to never block
dm persistent data: packed struct should have an aligned() attribute too
dm space map common: fix division bug in sm_ll_find_free_block()
dm integrity: fix missing goto in bitmap_flush_interval error handling
dm rq: fix double free of blk_mq_tag_set in dev remove after table load fails
lib/vsprintf.c: remove leftover 'f' and 'F' cases from bstr_printf()
thermal/drivers/cpufreq_cooling: Fix slab OOB issue
thermal/core/fair share: Lock the thermal zone while looping over instances
Linux 5.10.36
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
Change-Id: I7b8075de5edd8de69205205cddb9a3273d7d0810
Add a hook in irqtime_account_process_tick, which helps to get
information about the high load task.
Bug: 187904818
Change-Id: I644f7d66b09d047ca6b0a0fbd2915a6387c8c007
Signed-off-by: Liangliang Li <liliangliang@vivo.com>
With CONFIG_LTO_CLANG_FULL, LLVM drops all CFI jump table symbols
from vmlinux, which doesn't affect kernel functionality, but can
make stack traces and other kernel output that prints out jump
table addresses harder to read.
This change works around the issue for now by adding a script that
tells kallsyms about the missing jump table symbols, even though
they don't actually exist in the symbol table, and generates a
linker script to add the missing symbols to kernel modules.
Bug: 186152035
Bug: 187415564
Change-Id: Ic3c51751c756f2f5fb2a31229e16c3397eb6e666
Signed-off-by: Sami Tolvanen <samitolvanen@google.com>
commit aafe104aa9 upstream.
It was reported that a fix to the ring buffer recursion detection would
cause a hung machine when performing suspend / resume testing. The
following backtrace was extracted from debugging that case:
Call Trace:
trace_clock_global+0x91/0xa0
__rb_reserve_next+0x237/0x460
ring_buffer_lock_reserve+0x12a/0x3f0
trace_buffer_lock_reserve+0x10/0x50
__trace_graph_return+0x1f/0x80
trace_graph_return+0xb7/0xf0
? trace_clock_global+0x91/0xa0
ftrace_return_to_handler+0x8b/0xf0
? pv_hash+0xa0/0xa0
return_to_handler+0x15/0x30
? ftrace_graph_caller+0xa0/0xa0
? trace_clock_global+0x91/0xa0
? __rb_reserve_next+0x237/0x460
? ring_buffer_lock_reserve+0x12a/0x3f0
? trace_event_buffer_lock_reserve+0x3c/0x120
? trace_event_buffer_reserve+0x6b/0xc0
? trace_event_raw_event_device_pm_callback_start+0x125/0x2d0
? dpm_run_callback+0x3b/0xc0
? pm_ops_is_empty+0x50/0x50
? platform_get_irq_byname_optional+0x90/0x90
? trace_device_pm_callback_start+0x82/0xd0
? dpm_run_callback+0x49/0xc0
With the following RIP:
RIP: 0010:native_queued_spin_lock_slowpath+0x69/0x200
Since the fix to the recursion detection would allow a single recursion to
happen while tracing, this lead to the trace_clock_global() taking a spin
lock and then trying to take it again:
ring_buffer_lock_reserve() {
trace_clock_global() {
arch_spin_lock() {
queued_spin_lock_slowpath() {
/* lock taken */
(something else gets traced by function graph tracer)
ring_buffer_lock_reserve() {
trace_clock_global() {
arch_spin_lock() {
queued_spin_lock_slowpath() {
/* DEAD LOCK! */
Tracing should *never* block, as it can lead to strange lockups like the
above.
Restructure the trace_clock_global() code to instead of simply taking a
lock to update the recorded "prev_time" simply use it, as two events
happening on two different CPUs that calls this at the same time, really
doesn't matter which one goes first. Use a trylock to grab the lock for
updating the prev_time, and if it fails, simply try again the next time.
If it failed to be taken, that means something else is already updating
it.
Link: https://lkml.kernel.org/r/20210430121758.650b6e8a@gandalf.local.home
Cc: stable@vger.kernel.org
Tested-by: Konstantin Kharlamov <hi-angel@yandex.ru>
Tested-by: Todd Brandt <todd.e.brandt@linux.intel.com>
Fixes: b02414c8f0 ("ring-buffer: Fix recursion protection transitions between interrupt context") # started showing the problem
Fixes: 14131f2f98 ("tracing: implement trace_clock_*() APIs") # where the bug happened
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=212761
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 785e3c0a3a upstream.
The default max PID is set by PID_MAX_DEFAULT, and the tracing
infrastructure uses this number to map PIDs to the comm names of the
tasks, such output of the trace can show names from the recorded PIDs in
the ring buffer. This mapping is also exported to user space via the
"saved_cmdlines" file in the tracefs directory.
But currently the mapping expects the PIDs to be less than
PID_MAX_DEFAULT, which is the default maximum and not the real maximum.
Recently, systemd will increases the maximum value of a PID on the system,
and when tasks are traced that have a PID higher than PID_MAX_DEFAULT, its
comm is not recorded. This leads to the entire trace to have "<...>" as
the comm name, which is pretty useless.
Instead, keep the array mapping the size of PID_MAX_DEFAULT, but instead
of just mapping the index to the comm, map a mask of the PID
(PID_MAX_DEFAULT - 1) to the comm, and find the full PID from the
map_cmdline_to_pid array (that already exists).
This bug goes back to the beginning of ftrace, but hasn't been an issue
until user space started increasing the maximum value of PIDs.
Link: https://lkml.kernel.org/r/20210427113207.3c601884@gandalf.local.home
Cc: stable@vger.kernel.org
Fixes: bc0c38d139 ("ftrace: latency tracer infrastructure")
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 46b41d5dd8 upstream.
If the timestamp of the .config file is updated, config_data.gz is
regenerated, then vmlinux is re-linked. This occurs even if the content
of the .config has not changed at all.
This issue was mitigated by commit 67424f61f8 ("kconfig: do not write
.config if the content is the same"); Kconfig does not update the
.config when it ends up with the identical configuration.
The issue is remaining when the .config is created by *_defconfig with
some config fragment(s) applied on top.
This is typical for powerpc and mips, where several *_defconfig targets
are constructed by using merge_config.sh.
One workaround is to have the copy of the .config. The filechk rule
updates the copy, kernel/config_data, by checking the content instead
of the timestamp.
With this commit, the second run with the same configuration avoids
the needless rebuilds.
$ make ARCH=mips defconfig all
[ snip ]
$ make ARCH=mips defconfig all
*** Default configuration is based on target '32r2el_defconfig'
Using ./arch/mips/configs/generic_defconfig as base
Merging arch/mips/configs/generic/32r2.config
Merging arch/mips/configs/generic/el.config
Merging ./arch/mips/configs/generic/board-boston.config
Merging ./arch/mips/configs/generic/board-ni169445.config
Merging ./arch/mips/configs/generic/board-ocelot.config
Merging ./arch/mips/configs/generic/board-ranchu.config
Merging ./arch/mips/configs/generic/board-sead-3.config
Merging ./arch/mips/configs/generic/board-xilfpga.config
#
# configuration written to .config
#
SYNC include/config/auto.conf
CALL scripts/checksyscalls.sh
CALL scripts/atomic/check-atomics.sh
CHK include/generated/compile.h
CHK include/generated/autoksyms.h
Reported-by: Elliot Berman <eberman@codeaurora.org>
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit cdf78db407 upstream.
FUTEX_LOCK_PI does not require to have the FUTEX_CLOCK_REALTIME bit set
because it has been using CLOCK_REALTIME based absolute timeouts
forever. Due to that, the time namespace adjustment which is applied when
FUTEX_CLOCK_REALTIME is not set, will wrongly take place for FUTEX_LOCK_PI
and wreckage the timeout.
Exclude it from that procedure.
Fixes: c2f7d08ccc ("futex: Adjust absolute futex timeouts with per time namespace offset")
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20210422194704.984540159@linutronix.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 4fbf5d6837 upstream.
The FUTEX_WAIT operand has historically a relative timeout which means that
the clock id is irrelevant as relative timeouts on CLOCK_REALTIME are not
subject to wall clock changes and therefore are mapped by the kernel to
CLOCK_MONOTONIC for simplicity.
If a caller would set FUTEX_CLOCK_REALTIME for FUTEX_WAIT the timeout is
still treated relative vs. CLOCK_MONOTONIC and then the wait arms that
timeout based on CLOCK_REALTIME which is broken and obviously has never
been used or even tested.
Reject any attempt to use FUTEX_CLOCK_REALTIME with FUTEX_WAIT again.
The desired functionality can be achieved with FUTEX_WAIT_BITSET and a
FUTEX_BITSET_MATCH_ANY argument.
Fixes: 337f13046f ("futex: Allow FUTEX_CLOCK_REALTIME with FUTEX_WAIT op")
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20210422194704.834797921@linutronix.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit e36299efe7 upstream.
Commit 56348560d4 ("debugfs: do not attempt to create a new file
before the filesystem is initalized") forbids creating new debugfs files
until debugfs is fully initialized. This means that KCSAN's debugfs
file creation, which happened at the end of __init(), no longer works.
And was apparently never supposed to work!
However, there is no reason to create KCSAN's debugfs file so early.
This commit therefore moves its creation to a late_initcall() callback.
Cc: "Rafael J. Wysocki" <rafael@kernel.org>
Cc: stable <stable@vger.kernel.org>
Fixes: 56348560d4 ("debugfs: do not attempt to create a new file before the filesystem is initalized")
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Marco Elver <elver@google.com>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 9d10a13d1e ]
psi_group_cpu->tasks, represented by the unsigned int, stores the
number of tasks that could be stalled on a psi resource(io/mem/cpu).
Decrementing these counters at zero leads to wrapping which further
leads to the psi_group_cpu->state_mask is being set with the
respective pressure state. This could result into the unnecessary time
sampling for the pressure state thus cause the spurious psi events.
This can further lead to wrong actions being taken at the user land
based on these psi events.
Though psi_bug is set under these conditions but that just for debug
purpose. Fix it by decrementing the ->tasks count only when it is
non-zero.
Signed-off-by: Charan Teja Reddy <charante@codeaurora.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Link: https://lkml.kernel.org/r/1618585336-37219-1-git-send-email-charante@codeaurora.org
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 0c2de3f054 ]
The current sched_slice() seems to have issues; there's two possible
things that could be improved:
- the 'nr_running' used for __sched_period() is daft when cgroups are
considered. Using the RQ wide h_nr_running seems like a much more
consistent number.
- (esp) cgroups can slice it real fine, which makes for easy
over-scheduling, ensure min_gran is what the name says.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Tested-by: Valentin Schneider <valentin.schneider@arm.com>
Link: https://lkml.kernel.org/r/20210412102001.611897312@infradead.org
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit ef54c1a476 ]
Make perf_event_exit_event() more robust, such that we can use it from
other contexts. Specifically the up and coming remove_on_exec.
For this to work we need to address a few issues. Remove_on_exec will
not destroy the entire context, so we cannot rely on TASK_TOMBSTONE to
disable event_function_call() and we thus have to use
perf_remove_from_context().
When using perf_remove_from_context(), there's two races to consider.
The first is against close(), where we can have concurrent tear-down
of the event. The second is against child_list iteration, which should
not find a half baked event.
To address this, teach perf_remove_from_context() to special case
!ctx->is_active and about DETACH_CHILD.
[ elver@google.com: fix racing parent/child exit in sync_child_event(). ]
Signed-off-by: Marco Elver <elver@google.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20210408103605.1676875-2-elver@google.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 9bcb959d05 ]
During load balance, LBF_SOME_PINNED will be set if any candidate task
cannot be detached due to CPU affinity constraints. This can result in
setting env->sd->parent->sgc->group_imbalance, which can lead to a group
being classified as group_imbalanced (rather than any of the other, lower
group_type) when balancing at a higher level.
In workloads involving a single task per CPU, LBF_SOME_PINNED can often be
set due to per-CPU kthreads being the only other runnable tasks on any
given rq. This results in changing the group classification during
load-balance at higher levels when in reality there is nothing that can be
done for this affinity constraint: per-CPU kthreads, as the name implies,
don't get to move around (modulo hotplug shenanigans).
It's not as clear for userspace tasks - a task could be in an N-CPU cpuset
with N-1 offline CPUs, making it an "accidental" per-CPU task rather than
an intended one. KTHREAD_IS_PER_CPU gives us an indisputable signal which
we can leverage here to not set LBF_SOME_PINNED.
Note that the aforementioned classification to group_imbalance (when
nothing can be done) is especially problematic on big.LITTLE systems, which
have a topology the likes of:
DIE [ ]
MC [ ][ ]
0 1 2 3
L L B B
arch_scale_cpu_capacity(L) < arch_scale_cpu_capacity(B)
Here, setting LBF_SOME_PINNED due to a per-CPU kthread when balancing at MC
level on CPUs [0-1] will subsequently prevent CPUs [2-3] from classifying
the [0-1] group as group_misfit_task when balancing at DIE level. Thus, if
CPUs [0-1] are running CPU-bound (misfit) tasks, ill-timed per-CPU kthreads
can significantly delay the upgmigration of said misfit tasks. Systems
relying on ASYM_PACKING are likely to face similar issues.
Signed-off-by: Lingutla Chandrasekhar <clingutla@codeaurora.org>
[Use kthread_is_per_cpu() rather than p->nr_cpus_allowed]
[Reword changelog]
Signed-off-by: Valentin Schneider <valentin.schneider@arm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
Reviewed-by: Vincent Guittot <vincent.guittot@linaro.org>
Link: https://lkml.kernel.org/r/20210407220628.3798191-2-valentin.schneider@arm.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit ee6ddf5847 ]
Running an rcuscale stress-suite can lead to "Out of memory" of a
system. This can happen under high memory pressure with a small amount
of physical memory.
For example, a KVM test configuration with 64 CPUs and 512 megabytes
can result in OOM when running rcuscale with below parameters:
../kvm.sh --torture rcuscale --allcpus --duration 10 --kconfig CONFIG_NR_CPUS=64 \
--bootargs "rcuscale.kfree_rcu_test=1 rcuscale.kfree_nthreads=16 rcuscale.holdoff=20 \
rcuscale.kfree_loops=10000 torture.disable_onoff_at_boot" --trust-make
<snip>
[ 12.054448] kworker/1:1H invoked oom-killer: gfp_mask=0x2cc0(GFP_KERNEL|__GFP_NOWARN), order=0, oom_score_adj=0
[ 12.055303] CPU: 1 PID: 377 Comm: kworker/1:1H Not tainted 5.11.0-rc3+ #510
[ 12.055416] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.12.0-1 04/01/2014
[ 12.056485] Workqueue: events_highpri fill_page_cache_func
[ 12.056485] Call Trace:
[ 12.056485] dump_stack+0x57/0x6a
[ 12.056485] dump_header+0x4c/0x30a
[ 12.056485] ? del_timer_sync+0x20/0x30
[ 12.056485] out_of_memory.cold.47+0xa/0x7e
[ 12.056485] __alloc_pages_slowpath.constprop.123+0x82f/0xc00
[ 12.056485] __alloc_pages_nodemask+0x289/0x2c0
[ 12.056485] __get_free_pages+0x8/0x30
[ 12.056485] fill_page_cache_func+0x39/0xb0
[ 12.056485] process_one_work+0x1ed/0x3b0
[ 12.056485] ? process_one_work+0x3b0/0x3b0
[ 12.060485] worker_thread+0x28/0x3c0
[ 12.060485] ? process_one_work+0x3b0/0x3b0
[ 12.060485] kthread+0x138/0x160
[ 12.060485] ? kthread_park+0x80/0x80
[ 12.060485] ret_from_fork+0x22/0x30
[ 12.062156] Mem-Info:
[ 12.062350] active_anon:0 inactive_anon:0 isolated_anon:0
[ 12.062350] active_file:0 inactive_file:0 isolated_file:0
[ 12.062350] unevictable:0 dirty:0 writeback:0
[ 12.062350] slab_reclaimable:2797 slab_unreclaimable:80920
[ 12.062350] mapped:1 shmem:2 pagetables:8 bounce:0
[ 12.062350] free:10488 free_pcp:1227 free_cma:0
...
[ 12.101610] Out of memory and no killable processes...
[ 12.102042] Kernel panic - not syncing: System is deadlocked on memory
[ 12.102583] CPU: 1 PID: 377 Comm: kworker/1:1H Not tainted 5.11.0-rc3+ #510
[ 12.102600] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.12.0-1 04/01/2014
<snip>
Because kvfree_rcu() has a fallback path, memory allocation failure is
not the end of the world. Furthermore, the added overhead of aggressive
GFP settings must be balanced against the overhead of the fallback path,
which is a cache miss for double-argument kvfree_rcu() and a call to
synchronize_rcu() for single-argument kvfree_rcu(). The current choice
of GFP_KERNEL|__GFP_NOWARN can result in longer latencies than a call
to synchronize_rcu(), so less-tenacious GFP flags would be helpful.
Here is the tradeoff that must be balanced:
a) Minimize use of the fallback path,
b) Avoid pushing the system into OOM,
c) Bound allocation latency to that of synchronize_rcu(), and
d) Leave the emergency reserves to use cases lacking fallbacks.
This commit therefore changes GFP flags from GFP_KERNEL|__GFP_NOWARN to
GFP_KERNEL|__GFP_NORETRY|__GFP_NOMEMALLOC|__GFP_NOWARN. This combination
leaves the emergency reserves alone and can initiate reclaim, but will
not invoke the OOM killer.
Signed-off-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit b89997aa88 ]
Being called for each dequeue, util_est reduces the number of its updates
by filtering out when the EWMA signal is different from the task util_avg
by less than 1%. It is a problem for a sudden util_avg ramp-up. Due to the
decay from a previous high util_avg, EWMA might now be close enough to
the new util_avg. No update would then happen while it would leave
ue.enqueued with an out-of-date value.
Taking into consideration the two util_est members, EWMA and enqueued for
the filtering, ensures, for both, an up-to-date value.
This is for now an issue only for the trace probe that might return the
stale value. Functional-wise, it isn't a problem, as the value is always
accessed through max(enqueued, ewma).
This problem has been observed using LISA's UtilConvergence:test_means on
the sd845c board.
No regression observed with Hackbench on sd845c and Perf-bench sched pipe
on hikey/hikey960.
Signed-off-by: Vincent Donnefort <vincent.donnefort@arm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Reviewed-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
Reviewed-by: Vincent Guittot <vincent.guittot@linaro.org>
Link: https://lkml.kernel.org/r/20210225165820.1377125-1-vincent.donnefort@arm.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit c93a5e20c3 ]
When irq_matrix_free() is called for an unallocated vector the
managed_allocated and total_allocated counters get out of sync with the
real state of the matrix. Later, when the last interrupt is freed, these
counters will underflow resulting in UINTMAX because the counters are
unsigned.
While this is certainly a problem of the calling code, this can be catched
in the allocator by checking the allocation bit for the to be freed vector
which simplifies debugging.
An example of the problem described above:
https://lore.kernel.org/lkml/20210318192819.636943062@linutronix.de/
Add the missing sanity check and emit a warning when it triggers.
Suggested-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/r/20210319111823.1105248-1-vkuznets@redhat.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
commit 2d036dfa5f upstream.
The return value on success (>= 0) is overwritten by the return value of
put_old_timex32(). That works correct in the fault case, but is wrong for
the success case where put_old_timex32() returns 0.
Just check the return value of put_old_timex32() and return -EFAULT in case
it is not zero.
[ tglx: Massage changelog ]
Fixes: 3a4d44b616 ("ntp: Move adjtimex related compat syscalls to native counterparts")
Signed-off-by: Chen Jun <chenjun102@huawei.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Richard Cochran <richardcochran@gmail.com>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20210414030449.90692-1-chenjun102@huawei.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 8c9af478c0 upstream.
# echo switch_mm:traceoff > /sys/kernel/tracing/set_ftrace_filter
will cause switch_mm to stop tracing by the traceoff command.
# echo -n switch_mm:traceoff > /sys/kernel/tracing/set_ftrace_filter
does nothing.
The reason is that the parsing in the write function only processes
commands if it finished parsing (there is white space written after the
command). That's to handle:
write(fd, "switch_mm:", 10);
write(fd, "traceoff", 8);
cases, where the command is broken over multiple writes.
The problem is if the file descriptor is closed, then the write call is
not processed, and the command needs to be processed in the release code.
The release code can handle matching of functions, but does not handle
commands.
Cc: stable@vger.kernel.org
Fixes: eda1e32855 ("tracing: handle broken names in ftrace filter")
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
This reverts commit fb4c1c2e9f.
Fixes the ABI issues in 5.10.35 that at the moment, we can't handle due
to the KABI freeze. These are not patches that mean much for android
systems, and will be reverted the next KABI "reset" point.
Bug: 161946584
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
Change-Id: I7b12040440bc2bedfbabc71d60339cb843e59570
This reverts commit 22163a8ec8.
Fixes the ABI issues in 5.10.35 that at the moment, we can't handle due
to the KABI freeze. These are not patches that mean much for android
systems, and will be reverted the next KABI "reset" point.
Bug: 161946584
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
Change-Id: Ie256b7f1ed5b19f4600ff4c3681efaea5046adda
This reverts commit 1bbcc985d1.
Fixes the ABI issues in 5.10.35 that at the moment, we can't handle due
to the KABI freeze. These are not patches that mean much for android
systems, and will be reverted the next KABI "reset" point.
Bug: 161946584
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
Change-Id: I88427586de4df897c39bb3d48e8cba86ebf5c4bb
This reverts commit 1f2ef5a0f7.
Fixes the ABI issues in 5.10.35 that at the moment, we can't handle due
to the KABI freeze. These are not patches that mean much for android
systems, and will be reverted the next KABI "reset" point.
Bug: 161946584
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
Change-Id: I461f44ec54102dcf06ba73959a79172615cdb8fa
This reverts commit 9efd5df078.
Fixes the ABI issues in 5.10.35 that at the moment, we can't handle due
to the KABI freeze. These are not patches that mean much for android
systems, and will be reverted the next KABI "reset" point.
Bug: 161946584
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
Change-Id: Ia22b608e4d017c6056e481d2ec819074b35810cf
This reverts commit 25ed8827cf.
Fixes the ABI issues in 5.10.35 that at the moment, we can't handle due
to the KABI freeze. These are not patches that mean much for android
systems, and will be reverted the next KABI "reset" point.
Bug: 161946584
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
Change-Id: I9928cdc06a26b9db3052b93067c557011e53f735
This reverts commit 85a5a6875c.
Fixes the ABI issues in 5.10.35 that at the moment, we can't handle due
to the KABI freeze. These are not patches that mean much for android
systems, and will be reverted the next KABI "reset" point.
Bug: 161946584
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
Change-Id: Ibe7639f73acc11b356f6b2f8f76011844c7f451b
This reverts commit f8e71c667e.
Fixes the ABI issues in 5.10.35 that at the moment, we can't handle due
to the KABI freeze. These are not patches that mean much for android
systems, and will be reverted the next KABI "reset" point.
Bug: 161946584
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
Change-Id: I6684b40d913a86b79899220cba823a74aea5923f
Changes in 5.10.35
mips: Do not include hi and lo in clobber list for R6
netfilter: conntrack: Make global sysctls readonly in non-init netns
net: usb: ax88179_178a: initialize local variables before use
igb: Enable RSS for Intel I211 Ethernet Controller
bpf: Fix masking negation logic upon negative dst register
bpf: Fix leakage of uninitialized bpf stack under speculation
net: qrtr: Avoid potential use after free in MHI send
perf data: Fix error return code in perf_data__create_dir()
capabilities: require CAP_SETFCAP to map uid 0
perf ftrace: Fix access to pid in array when setting a pid filter
tools/cgroup/slabinfo.py: updated to work on current kernel
driver core: add a min_align_mask field to struct device_dma_parameters
swiotlb: add a IO_TLB_SIZE define
swiotlb: factor out an io_tlb_offset helper
swiotlb: factor out a nr_slots helper
swiotlb: clean up swiotlb_tbl_unmap_single
swiotlb: refactor swiotlb_tbl_map_single
swiotlb: don't modify orig_addr in swiotlb_tbl_sync_single
swiotlb: respect min_align_mask
nvme-pci: set min_align_mask
ovl: fix leaked dentry
ovl: allow upperdir inside lowerdir
ALSA: usb-audio: Add MIDI quirk for Vox ToneLab EX
USB: Add LPM quirk for Lenovo ThinkPad USB-C Dock Gen2 Ethernet
USB: Add reset-resume quirk for WD19's Realtek Hub
platform/x86: thinkpad_acpi: Correct thermal sensor allocation
perf/core: Fix unconditional security_locked_down() call
vfio: Depend on MMU
Linux 5.10.35
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
Change-Id: Iff7d5abe7b821f453bbe4d9dad94dfd35fe0a082
commit 08ef1af4de upstream.
Currently, the lockdown state is queried unconditionally, even though
its result is used only if the PERF_SAMPLE_REGS_INTR bit is set in
attr.sample_type. While that doesn't matter in case of the Lockdown LSM,
it causes trouble with the SELinux's lockdown hook implementation.
SELinux implements the locked_down hook with a check whether the current
task's type has the corresponding "lockdown" class permission
("integrity" or "confidentiality") allowed in the policy. This means
that calling the hook when the access control decision would be ignored
generates a bogus permission check and audit record.
Fix this by checking sample_type first and only calling the hook when
its result would be honored.
Fixes: b0c8fdc7fd ("lockdown: Lock down perf when in confidentiality mode")
Signed-off-by: Ondrej Mosnacek <omosnace@redhat.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Paul Moore <paul@paul-moore.com>
Link: https://lkml.kernel.org/r/20210224215628.192519-1-omosnace@redhat.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit: 1f221a0d0d
swiotlb: respect min_align_mask
Respect the min_align_mask in struct device_dma_parameters in swiotlb.
There are two parts to it:
1) for the lower bits of the alignment inside the io tlb slot, just
extent the size of the allocation and leave the start of the slot
empty
2) for the high bits ensure we find a slot that matches the high bits
of the alignment to avoid wasting too much memory
Based on an earlier patch from Jianxiong Gao <jxgao@google.com>.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Jianxiong Gao <jxgao@google.com>
Tested-by: Jianxiong Gao <jxgao@google.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: Jianxiong Gao <jxgao@google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit: 16fc3cef33
swiotlb_tbl_map_single currently nevers sets a tlb_addr that is not
aligned to the tlb bucket size. But we're going to add such a case
soon, for which this adjustment would be bogus.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Jianxiong Gao <jxgao@google.com>
Tested-by: Jianxiong Gao <jxgao@google.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: Jianxiong Gao <jxgao@google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit: 26a7e09478
Split out a bunch of a self-contained helpers to make the function easier
to follow.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Jianxiong Gao <jxgao@google.com>
Tested-by: Jianxiong Gao <jxgao@google.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: Jianxiong Gao <jxgao@google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit: ca10d0f8e5
swiotlb: clean up swiotlb_tbl_unmap_single
Remove a layer of pointless indentation, replace a hard to follow
ternary expression with a plain if/else.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Jianxiong Gao <jxgao@google.com>
Tested-by: Jianxiong Gao <jxgao@google.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: Jianxiong Gao <jxgao@google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit: c32a77fd18
Factor out a helper to find the number of slots for a given size.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Jianxiong Gao <jxgao@google.com>
Tested-by: Jianxiong Gao <jxgao@google.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: Jianxiong Gao <jxgao@google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit: c7fbeca757
Replace the very genericly named OFFSET macro with a little inline
helper that hardcodes the alignment to the only value ever passed.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Jianxiong Gao <jxgao@google.com>
Tested-by: Jianxiong Gao <jxgao@google.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: Jianxiong Gao <jxgao@google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit: b5d7ccb7aa
Add a new IO_TLB_SIZE define instead open coding it using
IO_TLB_SHIFT all over.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Jianxiong Gao <jxgao@google.com>
Tested-by: Jianxiong Gao <jxgao@google.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: Jianxiong Gao <jxgao@google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit db2e718a47 ]
cap_setfcap is required to create file capabilities.
Since commit 8db6c34f1d ("Introduce v3 namespaced file capabilities"),
a process running as uid 0 but without cap_setfcap is able to work
around this as follows: unshare a new user namespace which maps parent
uid 0 into the child namespace.
While this task will not have new capabilities against the parent
namespace, there is a loophole due to the way namespaced file
capabilities are represented as xattrs. File capabilities valid in
userns 1 are distinguished from file capabilities valid in userns 2 by
the kuid which underlies uid 0. Therefore the restricted root process
can unshare a new self-mapping namespace, add a namespaced file
capability onto a file, then use that file capability in the parent
namespace.
To prevent that, do not allow mapping parent uid 0 if the process which
opened the uid_map file does not have CAP_SETFCAP, which is the
capability for setting file capabilities.
As a further wrinkle: a task can unshare its user namespace, then open
its uid_map file itself, and map (only) its own uid. In this case we do
not have the credential from before unshare, which was potentially more
restricted. So, when creating a user namespace, we record whether the
creator had CAP_SETFCAP. Then we can use that during map_write().
With this patch:
1. Unprivileged user can still unshare -Ur
ubuntu@caps:~$ unshare -Ur
root@caps:~# logout
2. Root user can still unshare -Ur
ubuntu@caps:~$ sudo bash
root@caps:/home/ubuntu# unshare -Ur
root@caps:/home/ubuntu# logout
3. Root user without CAP_SETFCAP cannot unshare -Ur:
root@caps:/home/ubuntu# /sbin/capsh --drop=cap_setfcap --
root@caps:/home/ubuntu# /sbin/setcap cap_setfcap=p /sbin/setcap
unable to set CAP_SETFCAP effective capability: Operation not permitted
root@caps:/home/ubuntu# unshare -Ur
unshare: write failed /proc/self/uid_map: Operation not permitted
Note: an alternative solution would be to allow uid 0 mappings by
processes without CAP_SETFCAP, but to prevent such a namespace from
writing any file capabilities. This approach can be seen at [1].
Background history: commit 95ebabde38 ("capabilities: Don't allow
writing ambiguous v3 file capabilities") tried to fix the issue by
preventing v3 fscaps to be written to disk when the root uid would map
to the same uid in nested user namespaces. This led to regressions for
various workloads. For example, see [2]. Ultimately this is a valid
use-case we have to support meaning we had to revert this change in
3b0c2d3eaa ("Revert 95ebabde38 ("capabilities: Don't allow writing
ambiguous v3 file capabilities")").
Link: https://git.kernel.org/pub/scm/linux/kernel/git/sergeh/linux.git/log/?h=2021-04-15/setfcap-nsfscaps-v4 [1]
Link: https://github.com/containers/buildah/issues/3071 [2]
Signed-off-by: Serge Hallyn <serge@hallyn.com>
Reviewed-by: Andrew G. Morgan <morgan@kernel.org>
Tested-by: Christian Brauner <christian.brauner@ubuntu.com>
Reviewed-by: Christian Brauner <christian.brauner@ubuntu.com>
Tested-by: Giuseppe Scrivano <gscrivan@redhat.com>
Cc: Eric Biederman <ebiederm@xmission.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
commit 801c6058d1 upstream.
The current implemented mechanisms to mitigate data disclosure under
speculation mainly address stack and map value oob access from the
speculative domain. However, Piotr discovered that uninitialized BPF
stack is not protected yet, and thus old data from the kernel stack,
potentially including addresses of kernel structures, could still be
extracted from that 512 bytes large window. The BPF stack is special
compared to map values since it's not zero initialized for every
program invocation, whereas map values /are/ zero initialized upon
their initial allocation and thus cannot leak any prior data in either
domain. In the non-speculative domain, the verifier ensures that every
stack slot read must have a prior stack slot write by the BPF program
to avoid such data leaking issue.
However, this is not enough: for example, when the pointer arithmetic
operation moves the stack pointer from the last valid stack offset to
the first valid offset, the sanitation logic allows for any intermediate
offsets during speculative execution, which could then be used to
extract any restricted stack content via side-channel.
Given for unprivileged stack pointer arithmetic the use of unknown
but bounded scalars is generally forbidden, we can simply turn the
register-based arithmetic operation into an immediate-based arithmetic
operation without the need for masking. This also gives the benefit
of reducing the needed instructions for the operation. Given after
the work in 7fedb63a83 ("bpf: Tighten speculative pointer arithmetic
mask"), the aux->alu_limit already holds the final immediate value for
the offset register with the known scalar. Thus, a simple mov of the
immediate to AX register with using AX as the source for the original
instruction is sufficient and possible now in this case.
Reported-by: Piotr Krysiuk <piotras@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Tested-by: Piotr Krysiuk <piotras@gmail.com>
Reviewed-by: Piotr Krysiuk <piotras@gmail.com>
Reviewed-by: John Fastabend <john.fastabend@gmail.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit b9b34ddbe2 upstream.
The negation logic for the case where the off_reg is sitting in the
dst register is not correct given then we cannot just invert the add
to a sub or vice versa. As a fix, perform the final bitwise and-op
unconditionally into AX from the off_reg, then move the pointer from
the src to dst and finally use AX as the source for the original
pointer arithmetic operation such that the inversion yields a correct
result. The single non-AX mov in between is possible given constant
blinding is retaining it as it's not an immediate based operation.
Fixes: 979d63d50c ("bpf: prevent out of bounds speculation on pointer arithmetic")
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Tested-by: Piotr Krysiuk <piotras@gmail.com>
Reviewed-by: Piotr Krysiuk <piotras@gmail.com>
Reviewed-by: John Fastabend <john.fastabend@gmail.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
We already applied the 'vendor hook' for Dtask Debugging Information
in below issue.
(https://issuetracker.google.com/issues/162776704)
There are vendor hook call in mutex and rw_semaphore, but not for rt_mutex
Please refer to description in details as below,
1. Description
This feature writes rt mutex lock waiting information
on the task_struct structure. We can check mutex information and
mutex owner through the kernel log and custom analysis tools.
Like the previous feature in mutex and rw semaphore,
added data can be checked by ramdump analysis.
2. Vendor Hook Position
1) VENDOR_DATA
- struct task_struct in sched.h
VENDOR_DATA_ARRAY(2)
[0] : type // RTmutex (Mutex, Rwsem, ...)
[1] : pointer // address of lock
2) VENDOR_HOOKs
- __rt_mutex_slowlock() in kernel/locking/rtmutex.c
3. Example
- SysRq-w in kernel log
...
[ 54.164463] [3: kworker/u16:3: 253] kworker/3:2 D12736 418 2 0x00000228
[ 54.164497] [3: kworker/u16:3: 253] RTmutex: 0xffffffc051fa3ae8: owner[sh :9003]
[ 54.167812] [3: kworker/u16:3: 253] sh D12848 9003 6900 0x04000200
[ 54.167824] [3: kworker/u16:3: 253] RTmutex: 0xffffffc051fa3b08: owner[kworker/3:2 :418]
...
Bug: 186567468
Signed-off-by: JINHO LIM <jordan.lim@samsung.com>
Change-Id: I93f9753be0b2c1fa1a6eaea09379d54c31d1ebcf
(cherry picked from commit e289faa9f1)