Commit Graph

1201956 Commits

Author SHA1 Message Date
Jiri Pirko
0d0946d648 net/mlx5: Remove redundant is_mdev_switchdev_mode() check from is_ib_rep_supported()
is_mdev_switchdev_mode() check is done in is_eth_rep_supported().
Function is_ib_rep_supported() calls is_eth_rep_supported().
Remove the redundant check from it.

Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Reviewed-by: Shay Drory <shayd@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2023-06-23 12:27:34 -07:00
Jiri Pirko
61955da523 net/mlx5: Remove redundant MLX5_ESWITCH_MANAGER() check from is_ib_rep_supported()
MLX5_ESWITCH_MANAGER() check is done in is_eth_rep_supported().
Function is_ib_rep_supported() calls is_eth_rep_supported().
Remove the redundant check from it.

Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Reviewed-by: Shay Drory <shayd@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2023-06-23 12:27:34 -07:00
Roi Dayan
15ddd72ee3 net/mlx5e: E-Switch, Fix shared fdb error flow
On the error flow, resources are freed in esw_master_egress_destroy_resources(),
but the pointers are not set to null if the error comes from creating a
bounce rule. Then esw_acl_egress_ofld_cleanup() tries to access the already
freed pointers. Fix it by resetting the pointers to null.
Also, if the error comes from creating a second or later bounce rule, the
flow group and table are still in use and must not be freed.
Add a check to destroy the flow group and table only if there are no bounce
rules.
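
The pattern behind this fix can be sketched in plain userspace C. This is only an illustration under stated assumptions: `struct egress_res`, `create_resources()`, `destroy_resources()` and `cleanup()` are hypothetical names, not the mlx5 driver's code. It shows the two ideas from the message: shared resources are freed only when no bounce rule uses them, and the pointers are reset so a later cleanup pass cannot touch freed memory.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for the shared egress resources (not the mlx5 structs). */
struct egress_res {
	int *group;          /* stands in for the flow group */
	int *table;          /* stands in for the flow table */
	size_t bounce_rules; /* number of bounce rules still using them */
};

/* Allocate both resources; returns false if either allocation fails. */
static bool create_resources(struct egress_res *res)
{
	res->group = malloc(sizeof(*res->group));
	res->table = malloc(sizeof(*res->table));
	res->bounce_rules = 0;
	if (!res->group || !res->table) {
		free(res->group);
		free(res->table);
		res->group = NULL;
		res->table = NULL;
		return false;
	}
	return true;
}

/*
 * Free the resources only when no bounce rule uses them, and reset the
 * pointers so later code never sees a dangling value.
 * Returns true if the resources were released.
 */
static bool destroy_resources(struct egress_res *res)
{
	if (res->bounce_rules > 0)
		return false;
	free(res->group);
	res->group = NULL;
	free(res->table);
	res->table = NULL;
	return true;
}

/* Final cleanup: safe after destroy_resources() because free(NULL) is a no-op. */
static void cleanup(struct egress_res *res)
{
	res->bounce_rules = 0;
	destroy_resources(res);
}
```

Calling `cleanup()` after an earlier `destroy_resources()` is harmless in this sketch, which is the property the reset-to-null part of the fix is after.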

mlx5_core.sf mlx5_core.sf.2: mlx5_destroy_flow_group:2306:(pid 2235): Flow group 4 wasn't destroyed, refcount > 1
mlx5_core.sf mlx5_core.sf.2: mlx5_destroy_flow_table:2295:(pid 2235): Flow table 3 wasn't destroyed, refcount > 1

Fixes: 5e0202eb49 ("net/mlx5: E-switch, Handle multiple master egress rules")
Signed-off-by: Roi Dayan <roid@nvidia.com>
Reviewed-by: Shay Drory <shayd@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2023-06-23 12:27:34 -07:00
Roi Dayan
ae4de89493 net/mlx5e: Remove redundant comment
The function comment says what it is and the comment
is redundant.

Signed-off-by: Roi Dayan <roid@nvidia.com>
Reviewed-by: Shay Drory <shayd@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2023-06-23 12:27:34 -07:00
Roi Dayan
4575ab3b7d net/mlx5e: E-Switch, Pass other_vport flag if vport is not 0
When creating a flow table for shared fdb resources, the other_vport
flag only needs to be passed if the vport is not 0 or
if the port is ECPF in BlueField.

Signed-off-by: Roi Dayan <roid@nvidia.com>
Reviewed-by: Shay Drory <shayd@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2023-06-23 12:27:34 -07:00
Roi Dayan
70c3643839 net/mlx5e: E-Switch, Use xarray for devcom paired device index
To allow devcom events on an E-Switch that is not a vport group manager,
use the vhca id as an index instead of the device index, which might be shared
between several E-Switches, for example an SF and its PF.

Signed-off-by: Roi Dayan <roid@nvidia.com>
Reviewed-by: Shay Drory <shayd@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2023-06-23 12:27:33 -07:00
Roi Dayan
1552e9b518 net/mlx5e: E-Switch, Add peer fdb miss rules for vport manager or ecpf
Add peer fdb rules for E-Switches that are vport managers or ecpf devices.
It is not needed for other devices.

Signed-off-by: Roi Dayan <roid@nvidia.com>
Reviewed-by: Shay Drory <shayd@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2023-06-23 12:27:33 -07:00
Roi Dayan
1da9f36252 net/mlx5e: Use vhca_id for device index in vport rx rules
The device index is like the PF index and is limited to the max physical ports.
For example, for SFs created under a PF, the device index is the PF device index.
Use vhca_id, which gets the FW index per vport, for vport rx rules
and vport pair events.

Signed-off-by: Roi Dayan <roid@nvidia.com>
Reviewed-by: Shay Drory <shayd@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2023-06-23 12:27:33 -07:00
Roi Dayan
8ec91f5d07 net/mlx5: Lag, Remove duplicate code checking lag is supported
Remove duplicate function for checking if device has lag support.

Signed-off-by: Roi Dayan <roid@nvidia.com>
Reviewed-by: Shay Drory <shayd@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2023-06-23 12:27:33 -07:00
Dan Carpenter
690ad62fc6 net/mlx5: Fix error code in mlx5_is_reset_now_capable()
The mlx5_is_reset_now_capable() function returns bool, not negative
error codes.  So if fast teardown is not supported it should return
false instead of -EOPNOTSUPP.
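
Why this matters can be shown with a small standalone C sketch. The function names below are hypothetical and only mimic the shape of the bug, they are not the driver's code: when a function declared `bool` returns a negative errno, the value is converted to `bool`, and every nonzero value becomes `true`.

```c
#include <errno.h>
#include <stdbool.h>

/*
 * Buggy shape: the negative errno is converted to bool on return,
 * so the "not supported" case is reported as true.
 */
static bool reset_capable_buggy(bool fast_teardown_supported)
{
	if (!fast_teardown_supported)
		return -EOPNOTSUPP; /* nonzero, so this is true */
	return true;
}

/* Fixed shape: a bool function reports "not capable" as false. */
static bool reset_capable_fixed(bool fast_teardown_supported)
{
	if (!fast_teardown_supported)
		return false;
	return true;
}
```

With the buggy variant, a caller checking the result would treat an unsupported device as reset-capable.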

Fixes: 92501fa6e4 ("net/mlx5: Ack on sync_reset_request only if PF can do reset_now")
Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org>
Reviewed-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2023-06-23 12:27:33 -07:00
Lama Kayal
9ee473c259 net/mlx5: Fix reserved at offset in hca_cap register
A member of struct mlx5_ifc_cmd_hca_cap_bits has been mistakenly
assigned the wrong reserved_at offset value. Correct it to align to the
right value, thus avoiding future miscalculations.

Signed-off-by: Lama Kayal <lkayal@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Rahul Rameshbabu <rrameshbabu@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2023-06-23 12:27:33 -07:00
Shay Drory
25c24801d7 net/mlx5: Fix SFs kernel documentation error
Indent SFs probe code example in order to fix the below error:

Documentation/networking/device_drivers/ethernet/mellanox/mlx5/switchdev.rst:57: ERROR: Unexpected indentation.
Documentation/networking/device_drivers/ethernet/mellanox/mlx5/switchdev.rst:61: ERROR: Unexpected indentation.

Fixes: e71383fb9c ("net/mlx5: Light probe local SFs")
Signed-off-by: Shay Drory <shayd@nvidia.com>
Reviewed-by: Rahul Rameshbabu <rrameshbabu@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Reviewed-by: Moshe Shemesh <moshe@nvidia.com>
Reviewed-by: Automatic Verification <verifier@nvidia.com>
Reviewed-by: Gal Pressman <gal@nvidia.com>
2023-06-23 12:27:32 -07:00
Shay Drory
da744fd136 net/mlx5: Fix UAF in mlx5_eswitch_cleanup()
mlx5_eswitch_cleanup() is using esw right after freeing it for
releasing devlink_param.
Fix it by releasing the devlink_param before freeing the esw, and
adjust the create function accordingly.

Fixes: 3f90840305 ("net/mlx5: Move esw multiport devlink param to eswitch code")
Reported-by: Dan Carpenter <dan.carpenter@linaro.org>
Signed-off-by: Shay Drory <shayd@nvidia.com>
Reviewed-by: Automatic Verification <verifier@nvidia.com>
Reviewed-by: Gal Pressman <gal@nvidia.com>
Reviewed-by: Moshe Shemesh <moshe@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2023-06-23 12:27:32 -07:00
Linus Torvalds
afa4bb778e workqueue: clean up WORK_* constant types, clarify masking
Dave Airlie reports that gcc-13.1.1 has started complaining about some
of the workqueue code in 32-bit arm builds:

  kernel/workqueue.c: In function ‘get_work_pwq’:
  kernel/workqueue.c:713:24: error: cast to pointer from integer of different size [-Werror=int-to-pointer-cast]
    713 |                 return (void *)(data & WORK_STRUCT_WQ_DATA_MASK);
        |                        ^
  [ ... a couple of other cases ... ]

and while it's not immediately clear exactly why gcc started complaining
about it now, I suspect some C23-induced enum type handling fixup in
gcc-13 is the cause.

Whatever the reason for starting to complain, the code and data types
are indeed disgusting enough that the complaint is warranted.

The wq code ends up creating various "helper constants" (like that
WORK_STRUCT_WQ_DATA_MASK) using an enum type, which is all kinds of
confused.  The mask needs to be 'unsigned long', not some unspecified
enum type.

To make matters worse, the actual "mask and cast to a pointer" is
repeated a couple of times, and the cast isn't even always done to the
right pointer, but - as the error case above - to a 'void *' with then
the compiler finishing the job.

That's not how we roll in the kernel.

So create the masks using the proper types rather than some ambiguous
enumeration, and use a nice helper that actually does the type
conversion in one well-defined place.
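
A minimal sketch of that approach, in standalone C with hypothetical names (`FLAG_BITS`, `DATA_MASK`, `pwq_to_data()`, `data_to_pwq()`), not the actual workqueue code. It assumes a pointer-sized unsigned integer plays the role of the kernel's 'unsigned long' (here `uintptr_t`, for portability outside the kernel), builds the mask from that type instead of an enum constant, and keeps the mask-and-cast in a single typed helper.

```c
#include <stdint.h>

#define FLAG_BITS 8 /* hypothetical number of low bits reserved for flags */

/* The mask is built from a pointer-sized unsigned type, not an enum constant. */
#define DATA_MASK (~(((uintptr_t)1 << FLAG_BITS) - 1))

/* Hypothetical object whose alignment keeps the low FLAG_BITS of its address zero. */
struct pool_workqueue {
	_Alignas(1 << FLAG_BITS) int id;
};

/* Pack a pointer and some flag bits into one data word. */
static inline uintptr_t pwq_to_data(struct pool_workqueue *pwq, uintptr_t flags)
{
	return (uintptr_t)pwq | (flags & ~DATA_MASK);
}

/* The one place where the word is masked and converted to the right pointer type. */
static inline struct pool_workqueue *data_to_pwq(uintptr_t data)
{
	return (struct pool_workqueue *)(data & DATA_MASK);
}
```

Because every caller goes through `data_to_pwq()`, there is no intermediate 'void *' cast and no place where the integer width and the pointer width can silently disagree.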

Incidentally, this magically makes clang generate better code.  That,
admittedly, is really just a sign of clang having been seriously
confused before, and cleaning up the typing unconfuses the compiler too.

Reported-by: Dave Airlie <airlied@gmail.com>
Link: https://lore.kernel.org/lkml/CAPM=9twNnV4zMCvrPkw3H-ajZOH-01JVh_kDrxdPYQErz8ZTdA@mail.gmail.com/
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Tejun Heo <tj@kernel.org>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: Nathan Chancellor <nathan@kernel.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2023-06-23 12:08:14 -07:00
Jens Axboe
c36591f682 Merge tag 'md-next-20230623' of https://git.kernel.org/pub/scm/linux/kernel/git/song/md into for-6.5/block-late
Pull MD fixes from Song.

* tag 'md-next-20230623' of https://git.kernel.org/pub/scm/linux/kernel/git/song/md:
  raid10: avoid spin_lock from fastpath from raid10_unplug()
  md: fix 'delete_mutex' deadlock
  md: use mddev->external to select holder in export_rdev()
  md/raid1-10: fix casting from randomized structure in raid1_submit_write()
  md/raid10: fix the condition to call bio_end_io_acct()
2023-06-23 11:59:05 -06:00
Catalin Marinas
abc17128c8 Merge branch 'for-next/feat_s1pie' into for-next/core
* for-next/feat_s1pie:
  : Support for the Armv8.9 Permission Indirection Extensions (stage 1 only)
  KVM: selftests: get-reg-list: add Permission Indirection registers
  KVM: selftests: get-reg-list: support ID register features
  arm64: Document boot requirements for PIE
  arm64: transfer permission indirection settings to EL2
  arm64: enable Permission Indirection Extension (PIE)
  arm64: add encodings of PIRx_ELx registers
  arm64: disable EL2 traps for PIE
  arm64: reorganise PAGE_/PROT_ macros
  arm64: add PTE_WRITE to PROT_SECT_NORMAL
  arm64: add PTE_UXN/PTE_WRITE to SWAPPER_*_FLAGS
  KVM: arm64: expose ID_AA64MMFR3_EL1 to guests
  KVM: arm64: Save/restore PIE registers
  KVM: arm64: Save/restore TCR2_EL1
  arm64: cpufeature: add Permission Indirection Extension cpucap
  arm64: cpufeature: add TCR2 cpucap
  arm64: cpufeature: add system register ID_AA64MMFR3
  arm64/sysreg: add PIR*_ELx registers
  arm64/sysreg: update HCRX_EL2 register
  arm64/sysreg: add system registers TCR2_ELx
  arm64/sysreg: Add ID register ID_AA64MMFR3
2023-06-23 18:34:16 +01:00
Catalin Marinas
f42039d10b Merge branches 'for-next/kpti', 'for-next/missing-proto-warn', 'for-next/iss2-decode', 'for-next/kselftest', 'for-next/misc', 'for-next/feat_mops', 'for-next/module-alloc', 'for-next/sysreg', 'for-next/cpucap', 'for-next/acpi', 'for-next/kdump', 'for-next/acpi-doc', 'for-next/doc' and 'for-next/tpidr2-fix', remote-tracking branch 'arm64/for-next/perf' into for-next/core
* arm64/for-next/perf:
  docs: perf: Fix warning from 'make htmldocs' in hisi-pmu.rst
  docs: perf: Add new description for HiSilicon UC PMU
  drivers/perf: hisi: Add support for HiSilicon UC PMU driver
  drivers/perf: hisi: Add support for HiSilicon H60PA and PAv3 PMU driver
  perf: arm_cspmu: Add missing MODULE_DEVICE_TABLE
  perf/arm-cmn: Add sysfs identifier
  perf/arm-cmn: Revamp model detection
  perf/arm_dmc620: Add cpumask
  dt-bindings: perf: fsl-imx-ddr: Add i.MX93 compatible
  drivers/perf: imx_ddr: Add support for NXP i.MX9 SoC DDRC PMU driver
  perf/arm_cspmu: Decouple APMT dependency
  perf/arm_cspmu: Clean up ACPI dependency
  ACPI/APMT: Don't register invalid resource
  perf/arm_cspmu: Fix event attribute type
  perf: arm_cspmu: Set irq affinitiy only if overflow interrupt is used
  drivers/perf: hisi: Don't migrate perf to the CPU going to teardown
  drivers/perf: apple_m1: Force 63bit counters for M2 CPUs
  perf/arm-cmn: Fix DTC reset
  perf: qcom_l2_pmu: Make l2_cache_pmu_probe_cluster() more robust
  perf/arm-cci: Slightly optimize cci_pmu_sync_counters()

* for-next/kpti:
  : Simplify KPTI trampoline exit code
  arm64: entry: Simplify tramp_alias macro and tramp_exit routine
  arm64: entry: Preserve/restore X29 even for compat tasks

* for-next/missing-proto-warn:
  : Address -Wmissing-prototype warnings
  arm64: add alt_cb_patch_nops prototype
  arm64: move early_brk64 prototype to header
  arm64: signal: include asm/exception.h
  arm64: kaslr: add kaslr_early_init() declaration
  arm64: flush: include linux/libnvdimm.h
  arm64: module-plts: inline linux/moduleloader.h
  arm64: hide unused is_valid_bugaddr()
  arm64: efi: add efi_handle_corrupted_x18 prototype
  arm64: cpuidle: fix #ifdef for acpi functions
  arm64: kvm: add prototypes for functions called in asm
  arm64: spectre: provide prototypes for internal functions
  arm64: move cpu_suspend_set_dbg_restorer() prototype to header
  arm64: avoid prototype warnings for syscalls
  arm64: add scs_patch_vmlinux prototype
  arm64: xor-neon: mark xor_arm64_neon_*() static

* for-next/iss2-decode:
  : Add decode of ISS2 to data abort reports
  arm64/esr: Add decode of ISS2 to data abort reporting
  arm64/esr: Use GENMASK() for the ISS mask

* for-next/kselftest:
  : Various arm64 kselftest improvements
  kselftest/arm64: Log signal code and address for unexpected signals
  kselftest/arm64: Add a smoke test for ptracing hardware break/watch points

* for-next/misc:
  : Miscellaneous patches
  arm64: alternatives: make clean_dcache_range_nopatch() noinstr-safe
  arm64: hibernate: remove WARN_ON in save_processor_state
  arm64/fpsimd: Exit streaming mode when flushing tasks
  arm64: mm: fix VA-range sanity check
  arm64/mm: remove now-superfluous ISBs from TTBR writes
  arm64: consolidate rox page protection logic
  arm64: set __exception_irq_entry with __irq_entry as a default
  arm64: syscall: unmask DAIF for tracing status
  arm64: lockdep: enable checks for held locks when returning to userspace
  arm64/cpucaps: increase string width to properly format cpucaps.h
  arm64/cpufeature: Use helper for ECV CNTPOFF cpufeature

* for-next/feat_mops:
  : Support for ARMv8.8 memcpy instructions in userspace
  kselftest/arm64: add MOPS to hwcap test
  arm64: mops: allow disabling MOPS from the kernel command line
  arm64: mops: detect and enable FEAT_MOPS
  arm64: mops: handle single stepping after MOPS exception
  arm64: mops: handle MOPS exceptions
  KVM: arm64: hide MOPS from guests
  arm64: mops: don't disable host MOPS instructions from EL2
  arm64: mops: document boot requirements for MOPS
  KVM: arm64: switch HCRX_EL2 between host and guest
  arm64: cpufeature: detect FEAT_HCX
  KVM: arm64: initialize HCRX_EL2

* for-next/module-alloc:
  : Make the arm64 module allocation code more robust (clean-up, VA range expansion)
  arm64: module: rework module VA range selection
  arm64: module: mandate MODULE_PLTS
  arm64: module: move module randomization to module.c
  arm64: kaslr: split kaslr/module initialization
  arm64: kasan: remove !KASAN_VMALLOC remnants
  arm64: module: remove old !KASAN_VMALLOC logic

* for-next/sysreg: (21 commits)
  : More sysreg conversions to automatic generation
  arm64/sysreg: Convert TRBIDR_EL1 register to automatic generation
  arm64/sysreg: Convert TRBTRG_EL1 register to automatic generation
  arm64/sysreg: Convert TRBMAR_EL1 register to automatic generation
  arm64/sysreg: Convert TRBSR_EL1 register to automatic generation
  arm64/sysreg: Convert TRBBASER_EL1 register to automatic generation
  arm64/sysreg: Convert TRBPTR_EL1 register to automatic generation
  arm64/sysreg: Convert TRBLIMITR_EL1 register to automatic generation
  arm64/sysreg: Rename TRBIDR_EL1 fields per auto-gen tools format
  arm64/sysreg: Rename TRBTRG_EL1 fields per auto-gen tools format
  arm64/sysreg: Rename TRBMAR_EL1 fields per auto-gen tools format
  arm64/sysreg: Rename TRBSR_EL1 fields per auto-gen tools format
  arm64/sysreg: Rename TRBBASER_EL1 fields per auto-gen tools format
  arm64/sysreg: Rename TRBPTR_EL1 fields per auto-gen tools format
  arm64/sysreg: Rename TRBLIMITR_EL1 fields per auto-gen tools format
  arm64/sysreg: Convert OSECCR_EL1 to automatic generation
  arm64/sysreg: Convert OSDTRTX_EL1 to automatic generation
  arm64/sysreg: Convert OSDTRRX_EL1 to automatic generation
  arm64/sysreg: Convert OSLAR_EL1 to automatic generation
  arm64/sysreg: Standardise naming of bitfield constants in OSL[AS]R_EL1
  arm64/sysreg: Convert MDSCR_EL1 to automatic register generation
  ...

* for-next/cpucap:
  : arm64 cpucap clean-up
  arm64: cpufeature: fold cpus_set_cap() into update_cpu_capabilities()
  arm64: cpufeature: use cpucap naming
  arm64: alternatives: use cpucap naming
  arm64: standardise cpucap bitmap names

* for-next/acpi:
  : Various arm64-related ACPI patches
  ACPI: bus: Consolidate all arm specific initialisation into acpi_arm_init()

* for-next/kdump:
  : Simplify the crashkernel reservation behaviour of crashkernel=X,high on arm64
  arm64: add kdump.rst into index.rst
  Documentation: add kdump.rst to present crashkernel reservation on arm64
  arm64: kdump: simplify the reservation behaviour of crashkernel=,high

* for-next/acpi-doc:
  : Update ACPI documentation for Arm systems
  Documentation/arm64: Update ACPI tables from BBR
  Documentation/arm64: Update references in arm-acpi
  Documentation/arm64: Update ARM and arch reference

* for-next/doc:
  : arm64 documentation updates
  Documentation/arm64: Add ptdump documentation

* for-next/tpidr2-fix:
  : Fix the TPIDR2_EL0 register restoring on sigreturn
  kselftest/arm64: Add a test case for TPIDR2 restore
  arm64/signal: Restore TPIDR2 register rather than memory state
2023-06-23 18:32:20 +01:00
Mark Brown
f7a5d72edc kselftest/arm64: Add a test case for TPIDR2 restore
Due to the fact that TPIDR2 is intended to be managed by libc we don't
currently test modifying it via the signal context since that might
disrupt libc's usage of it and cause instability. We can however test the
opposite case with less risk, modifying TPIDR2 in a signal handler and
making sure that the original value is restored after returning from the
signal handler. Add a test which does this.

Signed-off-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20230621-arm64-fix-tpidr2-signal-restore-v2-2-c8e8fcc10302@kernel.org
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2023-06-23 18:32:10 +01:00
Mark Brown
616cb2f4b1 arm64/signal: Restore TPIDR2 register rather than memory state
Currently when restoring the TPIDR2 signal context we set the new value
from the signal frame in the thread data structure but not the register,
following the pattern for the rest of the data we are restoring. This does
not work in the case of TPIDR2, the register always has the value for the
current task. This means that either we return to userspace and ignore the
new value or we context switch and save the register value on top of the
newly restored value.

Load the value from the signal context into the register instead.

Fixes: 39e5449928 ("arm64/signal: Include TPIDR2 in the signal context")
Signed-off-by: Mark Brown <broonie@kernel.org>
Cc: <stable@vger.kernel.org> # 6.3.x
Link: https://lore.kernel.org/r/20230621-arm64-fix-tpidr2-signal-restore-v2-1-c8e8fcc10302@kernel.org
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2023-06-23 18:31:50 +01:00
Mario Limonciello
112a7f9c8e PCI/ACPI: Call _REG when transitioning D-states
ACPI r6.5, sec 6.5.4, describes how AML is unable to access an
OperationRegion unless _REG has been called to connect a handler:

  The OS runs _REG control methods to inform AML code of a change in the
  availability of an operation region. When an operation region handler is
  unavailable, AML cannot access data fields in that region.  (Operation
  region writes will be ignored and reads will return indeterminate data.)

The PCI core does not call _REG at any time, leading to the undefined
behavior mentioned in the spec.

The spec explains that _REG should be executed to indicate whether a
given region can be accessed:

  Once _REG has been executed for a particular operation region, indicating
  that the operation region handler is ready, a control method can access
  fields in the operation region. Conversely, control methods must not
  access fields in operation regions when _REG method execution has not
  indicated that the operation region handler is ready.

An example included in the spec demonstrates calling _REG when devices are
turned off: "when the host controller or bridge controller is turned off
or disabled, PCI Config Space Operation Regions for child devices are
no longer available. As such, ETH0’s _REG method will be run when it
is turned off and will again be run when PCI1 is turned off."

It is reported that ASMedia PCIe GPIO controllers fail functional tests
after the system has returned from suspend (S3 or s2idle). This is because
the BIOS checks whether the OSPM has called the _REG method to determine
whether it can interact with the OperationRegion assigned to the device as
part of the other AML called for the device.

To fix this issue, call acpi_evaluate_reg() when devices are transitioning
to D3cold or D0.

[bhelgaas: split pci_power_t checking to preliminary patch]
Link: https://uefi.org/specs/ACPI/6.5/06_Device_Configuration.html#reg-region
Link: https://lore.kernel.org/r/20230620140451.21007-1-mario.limonciello@amd.com
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Rafael J. Wysocki <rafael@kernel.org>
2023-06-23 12:28:08 -05:00
Bjorn Helgaas
5557b62634 PCI/ACPI: Validate acpi_pci_set_power_state() parameter
Previously acpi_pci_set_power_state() assumed the requested power state was
valid (PCI_D0 ... PCI_D3cold).  If a caller supplied something else, we
could index outside the state_conv[] array and pass junk to
acpi_device_set_power().

Validate the pci_power_t parameter and return -EINVAL if it's invalid.
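
The check can be illustrated with a short standalone C sketch. The enum, the table contents and `set_power_state()` are hypothetical stand-ins, not the kernel's definitions: the requested state is range-checked before it is used as an index into the conversion table, and anything outside the range yields -EINVAL.

```c
#include <errno.h>

/* Hypothetical stand-ins for the PCI power states. */
enum power_state { D0, D1, D2, D3HOT, D3COLD };

/* Illustrative conversion table indexed by power state. */
static const int state_conv[] = {
	[D0] = 0, [D1] = 1, [D2] = 2, [D3HOT] = 3, [D3COLD] = 4,
};

/*
 * Validate the requested state before indexing state_conv[].
 * On success, store the converted value in *converted and return 0.
 * On an invalid state, return -EINVAL and leave *converted untouched.
 */
static int set_power_state(int state, int *converted)
{
	if (state < D0 || state > D3COLD)
		return -EINVAL;
	*converted = state_conv[state];
	return 0;
}
```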

Link: https://lore.kernel.org/r/20230621222857.GA122930@bhelgaas
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
2023-06-23 12:28:01 -05:00
Ondrej Zary
9e30fd26f4 PCI/PM: Avoid putting EloPOS E2/S2/H2 PCIe Ports in D3cold
The quirk for Elo i2 introduced in commit 92597f97a4 ("PCI/PM: Avoid
putting Elo i2 PCIe Ports in D3cold") is also needed by EloPOS E2/S2/H2
which uses the same Continental Z2 board.

Change the quirk to match the board instead of system.

Link: https://bugzilla.kernel.org/show_bug.cgi?id=215715
Link: https://lore.kernel.org/r/20230614074253.22318-1-linux@zary.sk
Signed-off-by: Ondrej Zary <linux@zary.sk>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Cc: stable@vger.kernel.org
2023-06-23 12:27:58 -05:00
Palmer Dabbelt
488833ccdc Merge patch series "dt-bindings: riscv: cpus: switch to unevaluatedProperties: false"
Conor Dooley <conor@kernel.org> says:

From: Conor Dooley <conor.dooley@microchip.com>

Do the various bits needed to drop the additionalProperties: true that
we currently have in riscv/cpu.yaml, to permit actually enforcing what
people put in cpus nodes.

* b4-shazam-merge:
  dt-bindings: riscv: cpus: switch to unevaluatedProperties: false
  dt-bindings: riscv: cpus: add a ref the common cpu schema

Link: https://lore.kernel.org/r/20230615-creamer-emu-ade0fa0bdb68@spud
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2023-06-23 10:06:23 -07:00
Song Shuai
91afbaafd6 riscv: hibernate: remove WARN_ON in save_processor_state
During hibernation or restoration, freeze_secondary_cpus
checks num_online_cpus via BUG_ON, and the subsequent
save_processor_state also does the checking with WARN_ON.

In the case of CONFIG_PM_SLEEP_SMP=n, freeze_secondary_cpus
is not defined, but the sole possible condition to disable
CONFIG_PM_SLEEP_SMP is !SMP where num_online_cpus is always 1.
We also don't have to check it in save_processor_state.

So remove the unnecessary checking in save_processor_state.

Fixes: c031721001 ("RISC-V: Add arch functions to support hibernation/suspend-to-disk")
Signed-off-by: Song Shuai <songshuaishuai@tinylab.org>
Reviewed-by: Conor Dooley <conor.dooley@microchip.com>
Link: https://lore.kernel.org/r/20230609075049.2651723-4-songshuaishuai@tinylab.org
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2023-06-23 10:06:22 -07:00
Palmer Dabbelt
b5e13f3ace Merge patch series "riscv: Add independent irq/softirq stacks support"
guoren@kernel.org <guoren@kernel.org> says:

From: Guo Ren <guoren@linux.alibaba.com>

This patch series adds independent irq/softirq stacks to decrease the
pressure on the thread stack. It also adds a thread STACK_SIZE config so
users can adjust the size at compile time.

* b4-shazam-merge:
  riscv: stack: Add config of thread stack size
  riscv: stack: Support HAVE_SOFTIRQ_ON_OWN_STACK
  riscv: stack: Support HAVE_IRQ_EXIT_ON_IRQ_STACK

Link: https://lore.kernel.org/r/20230614013018.2168426-1-guoren@kernel.org
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2023-06-23 10:06:21 -07:00
Palmer Dabbelt
42b89447b6 Merge patch series "ISA string parser cleanups"
Conor Dooley <conor@kernel.org> says:

From: Conor Dooley <conor.dooley@microchip.com>

Here are some bits that were discussed with Drew on the "should we
allow caps" threads that I have now created patches for:
- splitting of riscv_of_processor_hartid() into two distinct functions,
  one for use purely during early boot, prior to the establishment of
  the possible-cpus mask & another to fit the other current use-cases
- that then allows us to then completely skip some validation of the
  hartid in the parser
- the biggest diff in the series is a rework of the comments in the
  parser, as I have mostly found the existing (sparse) ones to not be
  all that helpful whenever I have to go back and look at it
- from writing the comments, I found a conditional doing a bit of a
  dance that I found counter-intuitive, so I've had a go at making that
  match what I would expect a little better
- `i` implies 4 other extensions, so add them as extensions and set
  them for the craic. Sure why not like...

* b4-shazam-merge:
  RISC-V: always report presence of extensions formerly part of the base ISA
  dt-bindings: riscv: explicitly mention assumption of Zicntr & Zihpm support
  RISC-V: remove decrement/increment dance in ISA string parser
  RISC-V: rework comments in ISA string parser
  RISC-V: validate riscv,isa at boot, not during ISA string parsing
  RISC-V: split early & late of_node to hartid mapping
  RISC-V: simplify register width check in ISA string parsing

Link: https://lore.kernel.org/r/20230607-audacity-overhaul-82bb867a825f@spud
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2023-06-23 10:06:20 -07:00
Miquel Raynal
cf431a5998 Merge branch 'nand/next' into mtd/next
2023-06-23 19:02:09 +02:00
Yu Kuai
a8d5fdd4d2 raid10: avoid spin_lock from fastpath from raid10_unplug()
Commit 0c0be98bbe ("md/raid10: prevent unnecessary calls to wake_up()
in fast path") missed one place, for example, with:

	fio -direct=1 -rw=write/randwrite -iodepth=1 ...

Plug and unplug are called for each io, then wake_up() from raid10_unplug()
will cause lock contention as well.

Avoid this contention by using wake_up_barrier() instead of wake_up(),
where spin_lock is not held if waitqueue is empty.

Fio test script:

[global]
name=random reads and writes
ioengine=libaio
direct=1
readwrite=randrw
rwmixread=70
iodepth=64
buffered=0
filename=/dev/md0
size=1G
runtime=30
time_based
randrepeat=0
norandommap
refill_buffers
ramp_time=10
bs=4k
numjobs=400
group_reporting=1
[job1]

Test result with ramdisk raid10 (by Ali):

	Before this patch	With this patch
READ	IOPS=2033k		IOPS=3642k
WRITE	IOPS=871k		IOPS=1561K

By the way, in this scenario, blk_plug_cb() will be allocated and freed
for each io; this seems to need optimizing as well.

Reported-and-tested-by: Ali Gholami Rudi <aligrudi@gmail.com>
Closes: https://lore.kernel.org/all/20231606122233@laper.mirepesht/
Signed-off-by: Yu Kuai <yukuai3@huawei.com>
Signed-off-by: Song Liu <song@kernel.org>
Link: https://lore.kernel.org/r/20230621105728.1268542-1-yukuai1@huaweicloud.com
2023-06-23 09:41:50 -07:00
Yu Kuai
4934b6401a md: fix 'delete_mutex' deadlock
Commit 3ce94ce5d0 ("md: fix duplicate filename for rdev") introduced a
new lock 'delete_mutex', and triggered a new deadlock:

t1: remove rdev			t2: sysfs writer

rdev_attr_store			rdev_attr_store
 mddev_lock
 state_store
 md_kick_rdev_from_array
  lock delete_mutex
  list_add mddev->deleting
  unlock delete_mutex
 mddev_unlock
				 mddev_lock
				 ...
  lock delete_mutex
  kobject_del
  // wait for sysfs writers to be done
				 mddev_unlock
				 lock delete_mutex
				 // wait for delete_mutex, deadlock

'delete_mutex' is used to protect the list 'mddev->deleting'. It turns out
that this list can be protected by 'reconfig_mutex' directly, and this
lock can be removed.

Fix this problem by removing the lock, and use 'reconfig_mutex' to
protect the list. mddev_unlock() will move this list to a local list to
be handled after 'reconfig_mutex' is dropped.
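
The "move to a local list, handle after unlocking" idea can be sketched in standalone C. Everything here is a simplified model with hypothetical definitions, not the md code: an integer flag stands in for 'reconfig_mutex', a singly linked list stands in for 'mddev->deleting', and setting `deleted` stands in for the work (such as kobject_del()) that must not run with the lock held.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-in for a member device queued for deletion. */
struct rdev {
	struct rdev *next;
	int deleted;
};

/* Simplified stand-in for the array: one lock protects the deleting list. */
struct mddev {
	int locked;            /* models reconfig_mutex: 1 while held */
	struct rdev *deleting; /* protected by the lock */
};

static void mddev_lock(struct mddev *m)
{
	assert(!m->locked);
	m->locked = 1;
}

/* Queue a device for deletion; the caller must hold the lock. */
static void kick_rdev(struct mddev *m, struct rdev *r)
{
	assert(m->locked);
	r->next = m->deleting;
	m->deleting = r;
}

/*
 * Detach the whole list while the lock is still held, drop the lock,
 * then handle the detached entries. Returns the number handled.
 */
static int mddev_unlock(struct mddev *m)
{
	struct rdev *local, *next;
	int handled = 0;

	assert(m->locked);
	local = m->deleting; /* move to a local list under the lock */
	m->deleting = NULL;
	m->locked = 0;       /* lock dropped before any handling */

	for (; local; local = next) {
		next = local->next;
		local->next = NULL;
		local->deleted = 1; /* work done without holding the lock */
		handled++;
	}
	return handled;
}
```

Since the slow work happens only after the lock is released, a writer waiting for the lock is never blocked behind it, which is the situation the deadlock diagram above describes.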

Fixes: 3ce94ce5d0 ("md: fix duplicate filename for rdev")
Signed-off-by: Yu Kuai <yukuai3@huawei.com>
Signed-off-by: Song Liu <song@kernel.org>
Link: https://lore.kernel.org/r/20230621142933.1395629-1-yukuai1@huaweicloud.com
2023-06-23 09:41:47 -07:00
Song Liu
a1d7671910 md: use mddev->external to select holder in export_rdev()
mdadm test "10ddf-create-fail-rebuild" triggers warnings like the following:

[  215.526357] ------------[ cut here ]------------
[  215.527243] WARNING: CPU: 18 PID: 1264 at block/bdev.c:617 blkdev_put+0x269/0x350
[  215.528334] Modules linked in:
[  215.528806] CPU: 18 PID: 1264 Comm: mdmon Not tainted 6.4.0-rc2+ #768
[  215.529863] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS
[  215.531464] RIP: 0010:blkdev_put+0x269/0x350
[  215.532167] Code: ff ff 49 8d 7d 10 e8 56 bf b8 ff 4d 8b 65 10 49 8d bc
24 58 05 00 00 e8 05 be b8 ff 41 83 ac 24 58 05 00 00 01 e9 44 ff ff ff
<0f> 0b e9 52 fe ff ff 0f 0b e9 6b fe ff ff1
[  215.534780] RSP: 0018:ffffc900040bfbf0 EFLAGS: 00010283
[  215.535635] RAX: ffff888174001000 RBX: ffff88810b1c3b00 RCX: ffffffff819a4061
[  215.536645] RDX: dffffc0000000000 RSI: dffffc0000000000 RDI: ffff88810b1c3ba0
[  215.537657] RBP: ffff88810dbde800 R08: fffffbfff0fca983 R09: fffffbfff0fca983
[  215.538674] R10: ffffc900040bfbf0 R11: fffffbfff0fca982 R12: ffff88810b1c3b38
[  215.539687] R13: ffff88810b1c3b10 R14: ffff88810dbdecb8 R15: ffff88810b1c3b00
[  215.540833] FS:  00007f2aabdff700(0000) GS:ffff888dfb400000(0000) knlGS:0000000000000000
[  215.541961] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  215.542775] CR2: 00007fa19a85d934 CR3: 000000010c076006 CR4: 0000000000370ee0
[  215.543814] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[  215.544840] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[  215.545885] Call Trace:
[  215.546257]  <TASK>
[  215.546608]  export_rdev.isra.63+0x71/0xe0
[  215.547338]  mddev_unlock+0x1b1/0x2d0
[  215.547898]  array_state_store+0x28d/0x450
[  215.548519]  md_attr_store+0xd7/0x150
[  215.549059]  ? __pfx_sysfs_kf_write+0x10/0x10
[  215.549702]  kernfs_fop_write_iter+0x1b9/0x260
[  215.550351]  vfs_write+0x491/0x760
[  215.550863]  ? __pfx_vfs_write+0x10/0x10
[  215.551445]  ? __fget_files+0x156/0x230
[  215.552053]  ksys_write+0xc0/0x160
[  215.552570]  ? __pfx_ksys_write+0x10/0x10
[  215.553141]  ? ktime_get_coarse_real_ts64+0xec/0x100
[  215.553878]  do_syscall_64+0x3a/0x90
[  215.554403]  entry_SYSCALL_64_after_hwframe+0x72/0xdc
[  215.555125] RIP: 0033:0x7f2aade11847
[  215.555696] Code: c3 66 90 41 54 49 89 d4 55 48 89 f5 53 89 fb 48 83 ec
10 e8 1b fd ff ff 4c 89 e2 48 89 ee 89 df 41 89 c0 b8 01 00 00 00 0f 05
<48> 3d 00 f0 ff ff 77 35 44 89 c7 48 89 448
[  215.558398] RSP: 002b:00007f2aabdfeba0 EFLAGS: 00000293 ORIG_RAX: 0000000000000001
[  215.559516] RAX: ffffffffffffffda RBX: 0000000000000010 RCX: 00007f2aade11847
[  215.560515] RDX: 0000000000000005 RSI: 0000000000438b8b RDI: 0000000000000010
[  215.561512] RBP: 0000000000438b8b R08: 0000000000000000 R09: 00007f2aaecf0060
[  215.562511] R10: 000000000e3ba40b R11: 0000000000000293 R12: 0000000000000005
[  215.563647] R13: 0000000000000000 R14: 0000000000000001 R15: 0000000000c70750
[  215.564693]  </TASK>
[  215.565029] irq event stamp: 15979
[  215.565584] hardirqs last  enabled at (15991): [<ffffffff811a7432>] __up_console_sem+0x52/0x60
[  215.566806] hardirqs last disabled at (16000): [<ffffffff811a7417>] __up_console_sem+0x37/0x60
[  215.568022] softirqs last  enabled at (15716): [<ffffffff8277a2db>] __do_softirq+0x3eb/0x531
[  215.569239] softirqs last disabled at (15711): [<ffffffff810d8f45>] irq_exit_rcu+0x115/0x160
[  215.570434] ---[ end trace 0000000000000000 ]---

This means export_rdev() calls blkdev_put with a different holder than the
one used by blkdev_get_by_dev(). This is because mddev->major_version == -2
is not a good check for external metadata. Fix this by using
mddev->external instead.

Also, do not clear mddev->external in md_clean(), as the flag might be used
later in export_rdev().

Fixes: 2736e8eeb0 ("block: use the holder as indication for exclusive opens")
Cc: Christoph Hellwig <hch@lst.de>
Cc: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Song Liu <song@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Link: https://lore.kernel.org/r/20230617052405.305871-1-song@kernel.org
2023-06-23 09:39:00 -07:00
Baruch Siach
aa88054b70 binfmt_elf: fix comment typo s/reset/regset/
Signed-off-by: Baruch Siach <baruch@tkos.co.il>
Signed-off-by: Kees Cook <keescook@chromium.org>
Link: https://lore.kernel.org/r/0b2967c4a4141875c493e835d5a6f8f2d19ae2d6.1687499804.git.baruch@tkos.co.il
2023-06-23 09:36:30 -07:00
Baruch Siach
0b3d412798 elf: correct note name comment
NT_PRFPREG note is named "CORE". Correct the comment accordingly.

Fixes: 00e19ceec8 ("ELF: Add ELF program property parsing support")
Signed-off-by: Baruch Siach <baruch@tkos.co.il>
Signed-off-by: Kees Cook <keescook@chromium.org>
Link: https://lore.kernel.org/r/455b22b986de4d3bc6d9bfd522378e442943de5f.1687499411.git.baruch@tkos.co.il
2023-06-23 09:34:55 -07:00
Yu Kuai
b5a99602b7 md/raid1-10: fix casting from randomized structure in raid1_submit_write()
The following build error is triggered when building with clang version
17.0.0 and W=1 (this can't be reproduced with gcc 13.1.0):

drivers/md/raid1-10.c:117:25: error: casting from randomized structure
pointer type 'struct block_device *' to 'struct md_rdev *'
     117 |         struct md_rdev *rdev = (struct md_rdev *)bio->bi_bdev;
         |                                ^

Fix this by casting 'bio->bi_bdev' to 'void *', as it used to be.

Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202306142042.fmjfmTF8-lkp@intel.com/
Fixes: 8295efbe68 ("md/raid1-10: factor out a helper to submit normal write")
Signed-off-by: Yu Kuai <yukuai3@huawei.com>
Signed-off-by: Song Liu <song@kernel.org>
Link: https://lore.kernel.org/r/20230616012136.3047071-1-yukuai1@huaweicloud.com
2023-06-23 09:33:16 -07:00
Li Nan
125bfc7cd7 md/raid10: fix the condition to call bio_end_io_acct()
/sys/block/[device]/queue/iostats controls whether io stats are counted.
Writing 0 to it clears the queue flag QUEUE_FLAG_IO_STAT, which disables
iostats. If we disable iostats and later enable it, the io issued during
this period will be counted incorrectly: inflight will be decreased to -1.

  //T1 set iostats
  echo 0 > /sys/block/md0/queue/iostats
   clear QUEUE_FLAG_IO_STAT

			//T2 issue io
			if (QUEUE_FLAG_IO_STAT) -> false
			 bio_start_io_acct
			  inflight++

  echo 1 > /sys/block/md0/queue/iostats
   set QUEUE_FLAG_IO_STAT

					//T3 io end
					if (QUEUE_FLAG_IO_STAT) -> true
					 bio_end_io_acct
					  inflight--	-> -1

Also, if iostats is enabled when the io is issued but disabled when the io
ends, inflight will never be decreased.

Fix it by checking start_time when the io ends. If start_time is not 0,
call bio_end_io_acct().
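
The idea can be illustrated with a small userspace model. This is only a
sketch under stated assumptions: model_queue, model_io, model_io_start() and
model_io_end() are hypothetical names, not kernel APIs, and the real raid10
code is structured differently. The end path decides based on whether
accounting was actually started for this io, not on the current flag value.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace model; none of these names are kernel APIs. */
struct model_queue {
	bool io_stat;	/* stands in for QUEUE_FLAG_IO_STAT */
	int inflight;	/* number of ios currently accounted */
};

struct model_io {
	unsigned long start_time;	/* 0 means accounting never started */
};

/* Start accounting only if stats are enabled at issue time. */
static void model_io_start(struct model_queue *q, struct model_io *io,
			   unsigned long now)
{
	assert(now != 0);	/* 0 is reserved for "not accounted" */
	io->start_time = 0;
	if (q->io_stat) {
		io->start_time = now;
		q->inflight++;
	}
}

/* End accounting based on start_time, ignoring the current flag. */
static void model_io_end(struct model_queue *q, struct model_io *io)
{
	if (io->start_time != 0)
		q->inflight--;
}
```

In this model, toggling io_stat between model_io_start() and model_io_end()
leaves inflight balanced in both directions.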

Fixes: 528bc2cf2f ("md/raid10: enable io accounting")
Signed-off-by: Li Nan <linan122@huawei.com>
Signed-off-by: Song Liu <song@kernel.org>
Link: https://lore.kernel.org/r/20230609094320.2397604-1-linan666@huaweicloud.com
2023-06-23 09:33:16 -07:00
Biju Das
7bce166308
regulator: Add Renesas PMIC RAA215300 driver
The RAA215300 is a 9-channel PMIC that consists of
 * Internally compensated regulators
 * built-in Real Time Clock (RTC)
 * 32kHz crystal oscillator
 * coin cell battery charger

The RTC on RAA215300 is similar to the IP found in the ISL1208.
The existing driver for the ISL1208 works for this PMIC too,
however the RAA215300 exposes two devices via I2C, one for the RTC
IP, and one for everything else. The RTC IP has to be enabled
by the other I2C device, therefore this driver is necessary to get
the RTC to work.

The external oscillator bit is inverted on PMIC version 0x11.

Add PMIC RAA215300 driver for enabling RTC block and instantiating
RTC device based on PMIC version.

Signed-off-by: Biju Das <biju.das.jz@bp.renesas.com>
Link: https://lore.kernel.org/r/Message-Id: <20230623140948.384762-3-biju.das.jz@bp.renesas.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
2023-06-23 16:29:00 +01:00
Biju Das
fff8f6b072
regulator: dt-bindings: Add Renesas RAA215300 PMIC bindings
Document Renesas RAA215300 PMIC bindings.

The RAA215300 is a high Performance 9-Channel PMIC supporting DDR
Memory, with Built-In Charger and RTC.

It supports DDR3, DDR3L, DDR4, and LPDDR4 memory power requirements.
The internally compensated regulators, built-in Real-Time Clock (RTC),
32kHz crystal oscillator, and coin cell battery charger provide a
highly integrated, small footprint power solution ideal for
System-On-Module (SOM) applications. A spread spectrum feature
provides an ease-of-use solution for noise-sensitive audio or RF
applications.

Signed-off-by: Biju Das <biju.das.jz@bp.renesas.com>
Reviewed-by: Conor Dooley <conor.dooley@microchip.com>
Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be>
Link: https://lore.kernel.org/r/Message-Id: <20230623140948.384762-2-biju.das.jz@bp.renesas.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
2023-06-23 16:29:00 +01:00
Dan Carpenter
ed959833db
ASoC: tas2781: Fix error code in tas2781_load_calibration()
Return -EINVAL instead of success on this error path.

Fixes: 915f5eadeb ("ASoC: tas2781: firmware lib")
Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org>
Link: https://lore.kernel.org/r/Message-Id: <729bb6b3-bc1d-4b3d-8b65-077a492c753c@moroto.mountain>
Signed-off-by: Mark Brown <broonie@kernel.org>
2023-06-23 16:28:58 +01:00
Christian König
0c3855ba8d drm/ttm: fix warning that we shouldn't mix && and ||
Trivial warning fix.

Signed-off-by: Christian König <christian.koenig@amd.com>
Fixes: 4481913607 ("drm/ttm: fix bulk_move corruption when adding a entry")
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230623070935.65102-1-christian.koenig@amd.com
2023-06-23 17:21:02 +02:00
Wilken Gottwalt
b54c4b02ab hwmon: (corsair-psu) various cleanups
Fix some typos, adjust documentation and comments to current state of
knowledge and update coding style to be more uniform.

Signed-off-by: Wilken Gottwalt <wilken.gottwalt@posteo.net>
Link: https://lore.kernel.org/r/ZJWf3H972hGgLK-8@monster.localdomain
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
2023-06-23 07:47:37 -07:00
Donglin Peng
fc30ace06f tracing: Fix warnings when building htmldocs for function graph retval
When building htmldocs, the following warnings appear:

Documentation/trace/ftrace.rst:2797: WARNING: Literal block expected; none found.
Documentation/trace/ftrace.rst:2816: WARNING: Literal block expected; none found.

So fix it.

Link: https://lore.kernel.org/all/20230623143517.19ffc6c0@canb.auug.org.au/
Link: https://lkml.kernel.org/r/20230623071728.25688-1-pengdonglin@sangfor.com.cn

Fixes: 21c094d3f8 ("tracing: Add documentation for funcgraph-retval and funcgraph-retval-hex")
Signed-off-by: Donglin Peng <pengdonglin@sangfor.com.cn>
Acked-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2023-06-23 10:42:12 -04:00
Demi Marie Obenour
81ca2dbefa dm ioctl: Refuse to create device named "." or ".."
Using either of these is going to greatly confuse userspace, as they are
not valid symlink names and so creating the usual /dev/mapper/NAME
symlink will not be possible.  As creating a device with either of these
names is almost certainly a userspace bug, just error out.
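
A minimal sketch of such a check, written as standalone C. The helper name
name_is_dot_entry() is hypothetical and the actual patch may implement the
test differently; the point is only that both names are rejected before any
device is created.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: returns true if 'name' is "." or "..", which
 * cannot be used as a symlink name under /dev/mapper.
 */
static bool name_is_dot_entry(const char *name)
{
	assert(name != NULL);
	return strcmp(name, ".") == 0 || strcmp(name, "..") == 0;
}
```

A caller would return an error such as -EINVAL when the helper returns true.
Names that merely begin with a dot, such as ".hidden", are left alone.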

Signed-off-by: Demi Marie Obenour <demi@invisiblethingslab.com>
Signed-off-by: Mike Snitzer <snitzer@kernel.org>
2023-06-23 10:31:52 -04:00
Demi Marie Obenour
a85f1a9de9 dm ioctl: Refuse to create device named "control"
Typical userspace setups create a symlink under /dev/mapper with the
name of the device, but /dev/mapper/control is reserved for DM's control
device.  Therefore, trying to create such a device is almost certain to
be a userspace bug.

Signed-off-by: Demi Marie Obenour <demi@invisiblethingslab.com>
Signed-off-by: Mike Snitzer <snitzer@kernel.org>
2023-06-23 10:31:51 -04:00
Demi Marie Obenour
249bed821b dm ioctl: Avoid double-fetch of version
The version is fetched once in check_version(), which then does some
validation and then overwrites the version in userspace with the API
version supported by the kernel.  copy_params() then fetches the version
from userspace *again*, and this time no validation is done.  The result
is that the kernel's version number is completely controllable by
userspace, provided that userspace can win a race condition.

Fix this flaw by not copying the version back to the kernel the second
time.  This is not exploitable as the version is not further used in the
kernel.  However, it could become a problem if future patches start
relying on the version field.
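
The single-fetch pattern can be sketched in userspace as follows. This is an
assumption-laden illustration: struct hdr_model, MODEL_MAJOR and
fetch_header_once() are hypothetical, and the struct is not the real
'struct dm_ioctl'. The version is read once into a local copy, validated, and
that validated copy is what ends up in the kernel-side struct; the version
field is never read from the shared memory a second time.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical header layout; not the real struct dm_ioctl. */
struct hdr_model {
	uint32_t version[3];
	uint32_t data_size;
};

#define MODEL_MAJOR 4u	/* hypothetical supported major version */

/*
 * 'user' stands in for memory another thread may rewrite at any time.
 * Returns 0 on success, -1 if the version is rejected.
 */
static int fetch_header_once(const volatile struct hdr_model *user,
			     struct hdr_model *out)
{
	uint32_t ver[3];
	int i;

	/* The only read of the version field. */
	for (i = 0; i < 3; i++)
		ver[i] = user->version[i];

	if (ver[0] != MODEL_MAJOR)
		return -1;

	/* Copy the remaining fields, skipping the version. */
	out->data_size = user->data_size;

	/* Use the validated local copy, not a second fetch. */
	memcpy(out->version, ver, sizeof(ver));
	return 0;
}
```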

Cc: stable@vger.kernel.org
Signed-off-by: Demi Marie Obenour <demi@invisiblethingslab.com>
Signed-off-by: Mike Snitzer <snitzer@kernel.org>
2023-06-23 10:31:51 -04:00
Demi Marie Obenour
10655c7a48 dm ioctl: structs and parameter strings must not overlap
The NUL terminator for each target parameter string must precede the
following 'struct dm_target_spec'.  Otherwise, dm_split_args() might
corrupt this struct.  Furthermore, the first 'struct dm_target_spec'
must come after the 'struct dm_ioctl', as if it overlaps too much
dm_split_args() could corrupt the 'struct dm_ioctl'.

Signed-off-by: Demi Marie Obenour <demi@invisiblethingslab.com>
Reviewed-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@kernel.org>
2023-06-23 10:31:51 -04:00
Demi Marie Obenour
13f4a697f8 dm ioctl: Avoid pointer arithmetic overflow
Especially on 32-bit systems, it is possible for the pointer
arithmetic to overflow and cause a userspace pointer to be
dereferenced in the kernel.
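
One common way to avoid this class of bug is to compare sizes instead of
computed pointers. The sketch below is illustrative only: range_fits() is a
hypothetical helper, not the code from this patch. It checks that
[offset, offset + len) lies inside a buffer of 'total' bytes without ever
forming offset + len, so nothing can wrap around.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: true if 'len' bytes starting at 'offset' fit in
 * a buffer of 'total' bytes. The subtraction cannot underflow because
 * offset <= total has already been established.
 */
static bool range_fits(size_t total, size_t offset, size_t len)
{
	if (offset > total)
		return false;
	return len <= total - offset;
}
```

The naive form, offset + len <= total, would wrongly accept
offset = SIZE_MAX, len = 2, because the sum wraps to 1.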

Signed-off-by: Demi Marie Obenour <demi@invisiblethingslab.com>
Reviewed-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@kernel.org>
2023-06-23 10:31:51 -04:00
Demi Marie Obenour
b60528d9e6 dm ioctl: Check dm_target_spec is sufficiently aligned
Otherwise subsequent code, if given malformed input, could dereference
a misaligned 'struct dm_target_spec *'.
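
A minimal sketch of an alignment check of this kind, in standalone C11. The
struct spec_model layout and the helper offset_is_aligned() are assumptions
made for illustration; they are not the real 'struct dm_target_spec' or the
code added by this patch.

```c
#include <stdalign.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a target spec record. */
struct spec_model {
	uint64_t sector_start;
	uint64_t length;
	int32_t status;
	uint32_t next;
	char target_type[16];
};

/*
 * True if a record placed 'offset' bytes into a suitably aligned buffer
 * would itself be aligned for struct spec_model.
 */
static bool offset_is_aligned(size_t offset)
{
	return offset % alignof(struct spec_model) == 0;
}
```

Input whose next-record offset fails this check would be rejected before the
offset is turned into a pointer and dereferenced.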

Signed-off-by: Demi Marie Obenour <demi@invisiblethingslab.com>
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au> # use %zu
Reviewed-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@kernel.org>
2023-06-23 10:31:49 -04:00
Yu Kuai
fcaa174a9c scsi/sg: don't grab scsi host module reference
In order to prevent request_queue from being freed before the blktrace
debugfs entries are cleaned up, commit db59133e92 ("scsi: sg: fix blktrace
debugfs entries leakage") uses scsi_device_get(). However,
scsi_device_get() also grabs a reference to the scsi host module, so the
module can't be removed.

It's reported that blktests can't unload scsi_debug after block/001:

blktests (master) # ./check block
block/001 (stress device hotplugging) [failed]
     +++ /root/blktests/results/nodev/block/001.out.bad 2023-06-19
      Running block/001
      Stressing sd
     +modprobe: FATAL: Module scsi_debug is in use.

Fix this problem by grabbing a request_queue reference directly, so that
the scsi host module can still be unloaded while request_queue is
pinned by the sg device.

Reported-by: Chaitanya Kulkarni <chaitanyak@nvidia.com>
Link: https://lore.kernel.org/all/1760da91-876d-fc9c-ab51-999a6f66ad50@nvidia.com/
Fixes: db59133e92 ("scsi: sg: fix blktrace debugfs entries leakage")
Signed-off-by: Yu Kuai <yukuai3@huawei.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Link: https://lore.kernel.org/r/20230621160111.1433521-1-yukuai1@huaweicloud.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2023-06-23 08:28:18 -06:00
Pavel Begunkov
c98c81a4ac io_uring: merge conditional unlock flush helpers
There is no reason not to use __io_cq_unlock_post_flush for intermediate
aux CQE flushing, all ->task_complete should apply there, i.e. if set it
should be the submitter task. Combine them, get rid of
__io_cq_unlock_post() and rename the remaining function.

This place was also taking a couple of percent of CPU according to
profiles for max throughput net benchmarks due to multishot recv
flooding it with completions.

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/bbed60734cbec2e833d9c7bdcf9741aada5d8aab.1687518903.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2023-06-23 08:19:40 -06:00
Pavel Begunkov
0fdb9a196c io_uring: make io_cq_unlock_post static
io_cq_unlock_post() is exclusively used in io_uring/io_uring.c, mark it
static and don't expose it to other files.

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/3dc8127dda4514e1dd24bb32035faac887c5fa37.1687518903.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2023-06-23 08:19:40 -06:00
Pavel Begunkov
ff12617728 io_uring: inline __io_cq_unlock
__io_cq_unlock is not very helpful, and users should be calling flush
variants anyway. Open code the function.

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/d875c4cfb69f38ccecb58a57111446c77a614caa.1687518903.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2023-06-23 08:19:40 -06:00