- Prevent a device from being probed before device_add() has finished
initializing it; gate probe with a "ready_to_probe" device flag to
avoid races with concurrent driver_register() calls
- Fix a kernel-doc warning for DEV_FLAG_COUNT introduced by the above
- Return -ENOTCONN from software_node_get_reference_args() when a
referenced software node is known but not yet registered, allowing
callers to defer probe
- In sysfs_group_attrs_change_owner(), also check is_visible_const();
missed when the const variant was introduced
-----BEGIN PGP SIGNATURE-----
iHUEABYKAB0WIQS2q/xV6QjXAdC7k+1FlHeO1qrKLgUCaePuOwAKCRBFlHeO1qrK
LrBdAP9hcc5TBpPgbW4vdpsGgPhspz2x3Ijq0m91354RT2KISwEAx0ZHu6pvtLb/
hW0hHvk0onL6uBbCR6629U+A/xQ79go=
=Hdaa
-----END PGP SIGNATURE-----
Merge tag 'driver-core-7.1-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/driver-core/driver-core
Pull driver core fixes from Danilo Krummrich:
- Prevent a device from being probed before device_add() has finished
initializing it; gate probe with a "ready_to_probe" device flag to
avoid races with concurrent driver_register() calls
- Fix a kernel-doc warning for DEV_FLAG_COUNT introduced by the above
- Return -ENOTCONN from software_node_get_reference_args() when a
referenced software node is known but not yet registered, allowing
callers to defer probe
- In sysfs_group_attrs_change_owner(), also check is_visible_const();
missed when the const variant was introduced
* tag 'driver-core-7.1-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/driver-core/driver-core:
driver core: Add kernel-doc for DEV_FLAG_COUNT enum value
sysfs: attribute_group: Respect is_visible_const() when changing owner
software node: return -ENOTCONN when referenced swnode is not registered yet
driver core: Don't let a device probe until it's ready
The moment we link a "struct device" into the list of devices for the
bus, it's possible probe can happen. This is because another thread
can load the driver at any time and that can cause the device to
probe. This has been seen in practice with a stack crawl that looks
like this [1]:
really_probe()
__driver_probe_device()
driver_probe_device()
__driver_attach()
bus_for_each_dev()
driver_attach()
bus_add_driver()
driver_register()
__platform_driver_register()
init_module() [some module]
do_one_initcall()
do_init_module()
load_module()
__arm64_sys_finit_module()
invoke_syscall()
As a result of the above, it was seen that device_links_driver_bound()
could be called for the device before "dev->fwnode->dev" was
assigned. This prevented __fw_devlink_pickup_dangling_consumers() from
being called which meant that other devices waiting on our driver's
sub-nodes were stuck deferring forever.
It's believed that this problem is showing up suddenly for two
reasons:
1. Android has recently (last ~1 year) implemented an optimization to
the order it loads modules [2]. When devices opt-in to this faster
loading, modules are loaded one-after-the-other very quickly. This
is unlike how other distributions do it. The reproduction of this
problem has only been seen on devices that opt-in to Android's
"parallel module loading".
2. Android devices typically opt-in to fw_devlink, and the most
noticeable issue is the NULL "dev->fwnode->dev" in
device_links_driver_bound(). fw_devlink is somewhat new code and
also not in use by all Linux devices.
Even though the specific symptom where "dev->fwnode->dev" wasn't
assigned could be fixed by moving that assignment higher in
device_add(), other parts of device_add() (like the call to
device_pm_add()) are also important to run before probe. Only moving
the "dev->fwnode->dev" assignment would likely fix the current
symptoms but lead to difficult-to-debug problems in the future.
Fix the problem by preventing probe until device_add() has run far
enough that the device is ready to probe. If somehow we end up trying
to probe before we're allowed, __driver_probe_device() will return
-EPROBE_DEFER which will make certain the device is noticed.
In the race condition that was seen with Android's faster module
loading, we will temporarily add the device to the deferred list and
then take it off immediately when device_add() probes the device.
Instead of adding another flag to the bitfields already in "struct
device", instead add a new "flags" field and use that. This allows us
to freely change the bit from different thread without worrying about
corrupting nearby bits (and means threads changing other bit won't
corrupt us).
[1] Captured on a machine running a downstream 6.6 kernel
[2] https://cs.android.com/android/platform/superproject/main/+/main:system/core/libmodprobe/libmodprobe.cpp?q=LoadModulesParallel
Cc: stable@vger.kernel.org
Fixes: 2023c610dc ("Driver core: add new device to bus's list before probing")
Reviewed-by: Alan Stern <stern@rowland.harvard.edu>
Reviewed-by: Rafael J. Wysocki (Intel) <rafael@kernel.org>
Reviewed-by: Danilo Krummrich <dakr@kernel.org>
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Acked-by: Marek Szyprowski <m.szyprowski@samsung.com>
Signed-off-by: Douglas Anderson <dianders@chromium.org>
Link: https://patch.msgid.link/20260406162231.v5.1.Id750b0fbcc94f23ed04b7aecabcead688d0d8c17@changeid
Signed-off-by: Danilo Krummrich <dakr@kernel.org>
Code using driver_deferred_probe_check_state() differs from most
EPROBE_DEFER handling in the kernel. Where other EPROBE_DEFER handling
(e.g. clks, gpios and regulators) waits indefinitely for suppliers to
show up, code using driver_deferred_probe_check_state() will fail
after the deferred_probe_timeout.
This is a problem for generic distro kernels which want to support many
boards using a single kernel build. These kernels want as much drivers to
be modular as possible. The initrd also should be as small as possible,
so the initrd will *not* have drivers not needing to get the rootfs.
Combine this with waiting for a full-disk encryption password in
the initrd and it is pretty much guaranteed that the default 10s timeout
will be hit, causing probe() failures when drivers on the rootfs happen
to get modprobe-d before other rootfs modules providing their suppliers.
Make the default timeout configurable from Kconfig to allow distro kernel
configs where many of the supplier drivers are modules to set the default
through Kconfig.
Reviewed-by: Saravana Kannan <saravanak@kernel.org>
Signed-off-by: Hans de Goede <johannes.goede@oss.qualcomm.com>
Link: https://patch.msgid.link/20260314084916.10868-1-johannes.goede@oss.qualcomm.com
[ Drop deferred_probe_timeout documentation change in
kernel-parameters.txt. - Danilo ]
Signed-off-by: Danilo Krummrich <dakr@kernel.org>
Currently, __device_set_driver_override() handles clearing the override
via empty string ("") and newline ("\n") in two separate paths. The "\n"
case also performs an unnecessary memory allocation and immediate free.
Simplify the logic by initializing 'new' to NULL and only allocating
memory if the string length remains non-zero after stripping the
trailing newline.
Reduce code size, improve readability, and avoid unnecessary memory
operations.
No functional change intended.
Suggested-by: Geert Uytterhoeven <geert@linux-m68k.org>
Link: https://lore.kernel.org/driver-core/DGS82WWLXPJ0.2EH4VJSF30UR5@kernel.org/
Signed-off-by: Gui-Dong Han <hanguidong02@gmail.com>
Link: https://patch.msgid.link/20260325090905.169000-1-hanguidong02@gmail.com
[ Narrow cp's scope to the newline handling block; use scoped_guard().
- Danilo ]
Signed-off-by: Danilo Krummrich <dakr@kernel.org>
Currently, there are 12 busses (including platform and PCI) that
duplicate the driver_override logic for their individual devices.
All of them seem to be prone to the bug described in [1].
While this could be solved for every bus individually using a separate
lock, solving this in the driver-core generically results in less (and
cleaner) changes overall.
Thus, move driver_override to struct device, provide corresponding
accessors for busses and handle locking with a separate lock internally.
In particular, add device_set_driver_override(),
device_has_driver_override(), device_match_driver_override() and
generalize the sysfs store() and show() callbacks via a driver_override
feature flag in struct bus_type.
Until all busses have migrated, keep driver_set_override() in place.
Note that we can't use the device lock for the reasons described in [2].
Link: https://bugzilla.kernel.org/show_bug.cgi?id=220789 [1]
Link: https://lore.kernel.org/driver-core/DGRGTIRHA62X.3RY09D9SOK77P@kernel.org/ [2]
Tested-by: Gui-Dong Han <hanguidong02@gmail.com>
Co-developed-by: Gui-Dong Han <hanguidong02@gmail.com>
Signed-off-by: Gui-Dong Han <hanguidong02@gmail.com>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Link: https://patch.msgid.link/20260303115720.48783-2-dakr@kernel.org
[ Use dev->bus instead of sp->bus for consistency; fix commit message to
refer to the struct bus_type's driver_override feature flag. - Danilo ]
Signed-off-by: Danilo Krummrich <dakr@kernel.org>
This reverts commit dc23806a7c ("driver core: enforce device_lock for
driver_match_device()") and commit 289b14592c ("driver core: fix
inverted "locked" suffix of driver_match_device()").
While technically correct, there is a major downside to this approach:
When a device is already present in the system and a driver is
registered on the same bus, we iterate over all devices registered on
this bus to see if one of them matches. If we come across an already
bound one where the corresponding driver crashed while holding the
device lock (e.g. in probe()) we can't make any progress anymore.
However, drivers are typically the least tested code in the kernel and
hence it is a case that is likely to happen regularly. Besides hurting
developer ergonomics, it potentially decreases chances of shutting
things down cleanly and obtaining logs in production environments as
well [1].
This came up in the context of a firewire bug, which only in combination
with the reverted commit, caused the machine to hang [2]. Additionally,
it was observed in [3].
Thus, revert commit dc23806a7c ("driver core: enforce device_lock for
driver_match_device()") and add a brief note clarifying that an
implementer of struct bus_type must not expect match() to be called with
the device lock held.
Link: https://lore.kernel.org/driver-core/DGRGTIRHA62X.3RY09D9SOK77P@kernel.org/ [1]
Link: https://lore.kernel.org/all/67f655bb-4d81-4609-b008-68d200255dd2@davidgow.net/ [2]
Link: https://lore.kernel.org/lkml/CALbr=LZ4v7N=tO1vgOsyj9AS+XuNbn6kG-QcF+PacdMjSo0iyw@mail.gmail.com/ [3]
Reported-by: Linus Torvalds <torvalds@linux-foundation.org>
Closes: https://lore.kernel.org/driver-core/CAHk-=wgJ_L1C=HjcYJotg_zrZEmiLFJaoic+PWthjuQrutrfJw@mail.gmail.com/
Reviewed-by: Gui-Dong Han <hanguidong02@gmail.com>
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Link: https://patch.msgid.link/20260302002545.19389-1-dakr@kernel.org
[ Add additional Link: reference. - Danilo ]
Signed-off-by: Danilo Krummrich <dakr@kernel.org>
In the current implementation driver_match_device() expects the device
lock to be held, while driver_match_device_locked() acquires the device
lock.
By convention it should be the other way around, hence swap the name of
both functions.
Fixes: dc23806a7c ("driver core: enforce device_lock for driver_match_device()")
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Reviewed-by: Gui-Dong Han <hanguidong02@gmail.com>
Link: https://patch.msgid.link/20260131014211.12841-1-dakr@kernel.org
Signed-off-by: Danilo Krummrich <dakr@kernel.org>
Currently, driver_match_device() is called from three sites. One site
(__device_attach_driver) holds device_lock(dev), but the other two
(bind_store and __driver_attach) do not. This inconsistency means that
bus match() callbacks are not guaranteed to be called with the lock
held.
Fix this by introducing driver_match_device_locked(), which guarantees
holding the device lock using a scoped guard. Replace the unlocked calls
in bind_store() and __driver_attach() with this new helper. Also add a
lock assertion to driver_match_device() to enforce this guarantee.
This consistency also fixes a known race condition. The driver_override
implementation relies on the device_lock, so the missing lock led to the
use-after-free (UAF) reported in Bugzilla for buses using this field.
Stress testing the two newly locked paths for 24 hours with
CONFIG_PROVE_LOCKING and CONFIG_LOCKDEP enabled showed no UAF recurrence
and no lockdep warnings.
Cc: stable@vger.kernel.org
Closes: https://bugzilla.kernel.org/show_bug.cgi?id=220789
Suggested-by: Qiu-ji Chen <chenqiuji666@gmail.com>
Signed-off-by: Gui-Dong Han <hanguidong02@gmail.com>
Fixes: 49b420a13f ("driver core: check bus->match without holding device lock")
Reviewed-by: Danilo Krummrich <dakr@kernel.org>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Reviewed-by: Rafael J. Wysocki (Intel) <rafael@kernel.org>
Link: https://patch.msgid.link/20260113162843.12712-1-hanguidong02@gmail.com
Signed-off-by: Danilo Krummrich <dakr@kernel.org>
Currently, the driver's device private data is allocated and initialized
from driver core code called from bus abstractions after the driver's
probe() callback returned the corresponding initializer.
Similarly, the driver's device private data is dropped within the
remove() callback of bus abstractions after calling the remove()
callback of the corresponding driver.
However, commit 6f61a2637a ("rust: device: introduce
Device::drvdata()") introduced an accessor for the driver's device
private data for a Device<Bound>, i.e. a device that is currently bound
to a driver.
Obviously, this is in conflict with dropping the driver's device private
data in remove(), since a device can not be considered to be fully
unbound after remove() has finished:
We also have to consider registrations guarded by devres - such as IRQ
or class device registrations - which are torn down after remove() in
devres_release_all().
Thus, it can happen that, for instance, a class device or IRQ callback
still calls Device::drvdata(), which then runs concurrently to remove()
(which sets dev->driver_data to NULL and drops the driver's device
private data), before devres_release_all() started to tear down the
corresponding registration. This is because devres guarded registrations
can, as expected, access the corresponding Device<Bound> that defines
their scope.
In C it simply is the driver's responsibility to ensure that its device
private data is freed after e.g. an IRQ registration is unregistered.
Typically, C drivers achieve this by allocating their device private data
with e.g. devm_kzalloc() before doing anything else, i.e. before e.g.
registering an IRQ with devm_request_threaded_irq(), relying on the
reverse order cleanup of devres.
Technically, we could do something similar in Rust. However, the
resulting code would be pretty messy:
In Rust we have to differentiate between allocated but uninitialized
memory and initialized memory in the type system. Thus, we would need to
somehow keep track of whether the driver's device private data object
has been initialized (i.e. probe() was successful and returned a valid
initializer for this memory) and conditionally call the destructor of
the corresponding object when it is freed.
This is because we'd need to allocate and register the memory of the
driver's device private data *before* it is initialized by the
initializer returned by the driver's probe() callback, because the
driver could already register devres guarded registrations within
probe() outside of the driver's device private data initializer.
Luckily there is a much simpler solution: Instead of dropping the
driver's device private data at the end of remove(), we just drop it
after the device has been fully unbound, i.e. after all devres callbacks
have been processed.
For this, we introduce a new post_unbind() callback private to the
driver-core, i.e. the callback is neither exposed to drivers, nor to bus
abstractions.
This way, the driver-core code can simply continue to conditionally
allocate the memory for the driver's device private data when the
driver's initializer is returned from probe() - no change needed - and
drop it when the driver-core code receives the post_unbind() callback.
Closes: https://lore.kernel.org/all/DEZMS6Y4A7XE.XE7EUBT5SJFJ@kernel.org/
Fixes: 6f61a2637a ("rust: device: introduce Device::drvdata()")
Acked-by: Alice Ryhl <aliceryhl@google.com>
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Acked-by: Igor Korotin <igor.korotin.linux@gmail.com>
Link: https://patch.msgid.link/20260107103511.570525-7-dakr@kernel.org
[ Remove #ifdef CONFIG_RUST, rename post_unbind() to post_unbind_rust().
- Danilo]
Signed-off-by: Danilo Krummrich <dakr@kernel.org>
When a device is hot-plugged, the drivers_autoprobe sysfs attribute is
not checked (at least for PCI devices). This means that
drivers_autoprobe is not working as intended, e.g. hot-plugged PCI
devices will still be autoprobed and bound to drivers even with
drivers_autoprobe disabled.
The problem likely started when device_add() was removed from
pci_bus_add_device() in commit 4f535093cf ("PCI: Put pci_dev in device
tree as early as possible") which means that the check for
drivers_autoprobe which used to happen in bus_probe_device() is no
longer present (previously bus_add_device() calls bus_probe_device()).
Conveniently, in commit 9170304169 ("PCI: Allow built-in drivers to
use async initial probing") device_attach() was replaced with
device_initial_probe() which faciliates this change to push the check
for drivers_autoprobe into device_initial_probe().
Make sure all devices check drivers_autoprobe by pushing the
drivers_autoprobe check into device_initial_probe(). This will only
affect devices on the PCI bus for now as device_initial_probe() is only
called by pci_bus_add_device() and bus_probe_device(), but
bus_probe_device() already checks for autoprobe, so callers of
bus_probe_device() should not observe changes on autoprobing.
Note also that pushing this check into device_initial_probe() rather
than device_attach() makes it only affect automatic probing of
drivers (e.g. when a device is hot-plugged), userspace can still choose
to manually bind a driver by writing to drivers_probe sysfs attribute,
even with autoprobe disabled.
Any future callers of device_initial_probe() will respect the
drivers_autoprobe sysfs attribute, which is the intended purpose of
drivers_autoprobe.
Signed-off-by: Vincent Liu <vincent.liu@nutanix.com>
Link: https://patch.msgid.link/20251022120740.2476482-1-vincent.liu@nutanix.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Currently if a user enqueue a work item using schedule_delayed_work() the
used wq is "system_wq" (per-cpu wq) while queue_delayed_work() use
WORK_CPU_UNBOUND (used when a cpu is not specified). The same applies to
schedule_work() that is using system_wq and queue_work(), that makes use
again of WORK_CPU_UNBOUND.
This lack of consistentcy cannot be addressed without refactoring the API.
This continues the effort to refactor workqueue APIs, which began with
the introduction of new workqueues and a new alloc_workqueue flag in:
commit 128ea9f6cc ("workqueue: Add system_percpu_wq and system_dfl_wq")
commit 930c2ea566 ("workqueue: Add new WQ_PERCPU flag")
Switch to using system_dfl_wq because system_unbound_wq is going away as part of
a workqueue restructuring.
Suggested-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Marco Crivellari <marco.crivellari@suse.com>
Link: https://patch.msgid.link/20251114141618.172154-2-marco.crivellari@suse.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The dev_pm_domain_attach() function is typically used in bus code
alongside dev_pm_domain_detach(), often following patterns like:
static int bus_probe(struct device *_dev)
{
struct bus_driver *drv = to_bus_driver(dev->driver);
struct bus_device *dev = to_bus_device(_dev);
int ret;
// ...
ret = dev_pm_domain_attach(_dev, true);
if (ret)
return ret;
if (drv->probe)
ret = drv->probe(dev);
// ...
}
static void bus_remove(struct device *_dev)
{
struct bus_driver *drv = to_bus_driver(dev->driver);
struct bus_device *dev = to_bus_device(_dev);
if (drv->remove)
drv->remove(dev);
dev_pm_domain_detach(_dev);
}
When the driver's probe function uses devres-managed resources that
depend on the power domain state, those resources are released later
during device_unbind_cleanup().
Releasing devres-managed resources that depend on the power domain state
after detaching the device from its PM domain can cause failures.
For example, if the driver uses devm_pm_runtime_enable() in its probe
function, and the device's clocks are managed by the PM domain, then
during removal the runtime PM is disabled in device_unbind_cleanup()
after the clocks have been removed from the PM domain. It may happen
that the devm_pm_runtime_enable() action causes the device to be runtime-
resumed. If the driver specific runtime PM APIs access registers directly,
this will lead to accessing device registers without clocks being enabled.
Similar issues may occur with other devres actions that access device
registers.
Add detach_power_off member to struct dev_pm_info, to be used
later in device_unbind_cleanup() as the power_off argument for
dev_pm_domain_detach(). This is a preparatory step toward removing
dev_pm_domain_detach() calls from bus remove functions. Since the current
PM domain detach functions (genpd_dev_pm_detach() and acpi_dev_pm_detach())
already set dev->pm_domain = NULL, there should be no issues with bus
drivers that still call dev_pm_domain_detach() in their remove functions.
Signed-off-by: Claudiu Beznea <claudiu.beznea.uj@bp.renesas.com>
Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org>
Link: https://patch.msgid.link/20250703112708.1621607-3-claudiu.beznea.uj@bp.renesas.com
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Here is a small set of patches for the driver core code for 6.12-rc1.
This set is the one that caused the most delay on my side, due to lots
of last-minute reports of problems in the async shutdown feature that
was added. In the end, I've reverted all of the patches in that series
so we are back to "normal" and the patch set is being reworked for the
next merge window.
Other than the async shutdown patches that were reverted, included in
here are:
- minor driver core cleanups
- minor driver core bus and class api cleanups and simplifications for
some callbacks
- some const markings of structures
- other even more minor cleanups
All of these, including the last minute reverts, have been in
linux-next, but all of the reports of problems in linux-next were before
the reverts happened. After the reverts, all is good.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-----BEGIN PGP SIGNATURE-----
iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCZvZefg8cZ3JlZ0Brcm9h
aC5jb20ACgkQMUfUDdst+yldYACgzXnaSyQhjeac6posBqHmBIyNrmYAoM5PA+mn
1XZYdGpbOyecxsXP4wRn
=cKIA
-----END PGP SIGNATURE-----
Merge tag 'driver-core-6.12-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core
Pull driver core updates from Greg KH:
"Here is a small set of patches for the driver core code for 6.12-rc1.
This set is the one that caused the most delay on my side, due to lots
of last-minute reports of problems in the async shutdown feature that
was added. In the end, I've reverted all of the patches in that series
so we are back to "normal" and the patch set is being reworked for the
next merge window.
Other than the async shutdown patches that were reverted, included in
here are:
- minor driver core cleanups
- minor driver core bus and class api cleanups and simplifications
for some callbacks
- some const markings of structures
- other even more minor cleanups
All of these, including the last minute reverts, have been in
linux-next, but all of the reports of problems in linux-next were
before the reverts happened. After the reverts, all is good"
* tag 'driver-core-6.12-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: (32 commits)
Revert "driver core: don't always lock parent in shutdown"
Revert "driver core: separate function to shutdown one device"
Revert "driver core: shut down devices asynchronously"
Revert "nvme-pci: Make driver prefer asynchronous shutdown"
Revert "driver core: fix async device shutdown hang"
driver core: fix async device shutdown hang
driver core: attribute_container: Remove unused functions
driver core: Trivially simplify ((struct device_private *)curr)->device->p to @curr
devres: Correclty strip percpu address space of devm_free_percpu() argument
driver core: Make parameter check consistent for API cluster device_(for_each|find)_child()
bus: fsl-mc: make fsl_mc_bus_type const
nvme-pci: Make driver prefer asynchronous shutdown
driver core: shut down devices asynchronously
driver core: separate function to shutdown one device
driver core: don't always lock parent in shutdown
platform: Make platform_bus_type constant
driver core: class: Check namespace relevant parameters in class_register()
driver:base:core: Adding a "Return:" line in comment for device_link_add()
drivers/base: Introduce device_match_t for device finding APIs
firmware_loader: Block path traversal
...
Introduce KUnit resource wrappers around platform_driver_register(),
platform_device_alloc(), and platform_device_add() so that test authors
can register platform drivers/devices from their tests and have the
drivers/devices automatically be unregistered when the test is done.
This makes test setup code simpler when a platform driver or platform
device is needed. Add a few test cases at the same time to make sure the
APIs work as intended.
Cc: Brendan Higgins <brendan.higgins@linux.dev>
Reviewed-by: David Gow <davidgow@google.com>
Cc: Rae Moar <rmoar@google.com>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: "Rafael J. Wysocki" <rafael@kernel.org>
Signed-off-by: Stephen Boyd <sboyd@kernel.org>
Link: https://lore.kernel.org/r/20240718210513.3801024-6-sboyd@kernel.org
Change device_driver_attach() and driver_attach() to take a const * to
struct device driver as neither of them modify the structure at all.
Also, for some odd reason, drivers/dma/idxd/compat.c had a duplicate
external reference to device_driver_attach(), so remove that to fix up
the build, it should never have had that there in the first place.
Cc: Rafael J. Wysocki <rafael@kernel.org>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: Dave Jiang <dave.jiang@intel.com>
Cc: Vinod Koul <vkoul@kernel.org>
Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Cc: Petr Tesarik <petr.tesarik.ext@huawei.com>
Cc: Alexander Lobakin <aleksander.lobakin@intel.com>
Cc: dmaengine@vger.kernel.org
Link: https://lore.kernel.org/r/2024061401-rasping-manger-c385@gregkh
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Within struct device_private, mark the async_driver * as const as it is
never modified. This requires some internal-to-the-driver-core
functions to also have their parameters marked as constant, and there is
one place where we cast _back_ from the const pointer to a real one, as
the driver core still wants to modify the structure in a number of
remaining places.
Cc: Rafael J. Wysocki <rafael@kernel.org>
Link: https://lore.kernel.org/r/20240611130103.3262749-12-gregkh@linuxfoundation.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
driver_detach() does not modify the driver itself, so make the pointer
constant. In doing so, the function driver_allows_async_probing() also
needs to be changed so that the pointer type passes through to that
function properly.
Cc: Rafael J. Wysocki <rafael@kernel.org>
Link: https://lore.kernel.org/r/20240611130103.3262749-11-gregkh@linuxfoundation.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Once the deferred probe timeout has elapsed it is very likely that the
devices that are still deferring probe won't ever be probed. Therefore
log the defer probe pending reason at the warning level instead to bring
attention to the issue.
Signed-off-by: "Nícolas F. R. A. Prado" <nfraprado@collabora.com>
Link: https://lore.kernel.org/r/20240305-device-probe-error-v1-3-a06d8722bf19@collabora.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Use the dev_* instead of the pr_* functions to log the status of device
probe so that the log message gets the device metadata attached to it.
Signed-off-by: "Nícolas F. R. A. Prado" <nfraprado@collabora.com>
Link: https://lore.kernel.org/r/20240305-device-probe-error-v1-2-a06d8722bf19@collabora.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Drivers can return -ENODEV or -ENXIO from their probe to reject a device
match, and return -EPROBE_DEFER if probe should be retried. Any other
error code is not expected during normal behavior and indicates an
issue occurred, so it should be logged at the error level.
Also make use of the device variant, dev_err(), so that the device
metadata is attached to the log message.
Signed-off-by: "Nícolas F. R. A. Prado" <nfraprado@collabora.com>
Link: https://lore.kernel.org/r/20240305-device-probe-error-v1-1-a06d8722bf19@collabora.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Ending a boot log with
platform 3f202000.mmc: deferred probe pending
is already a nice hint about the problem. Sometimes there is a more
detailed error indicator available, add that to the output.
Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Link: https://lore.kernel.org/r/20231122093332.274145-2-u.kleine-koenig@pengutronix.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
This commit fixes a bug in commit 9ed9895370 ("driver core: Functional
dependencies tracking support") where the device link status was
incorrectly updated in the driver unbind path before all the device's
resources were released.
Fixes: 9ed9895370 ("driver core: Functional dependencies tracking support")
Cc: stable <stable@kernel.org>
Reported-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Closes: https://lore.kernel.org/all/20231014161721.f4iqyroddkcyoefo@pengutronix.de/
Signed-off-by: Saravana Kannan <saravanak@google.com>
Cc: Thierry Reding <thierry.reding@gmail.com>
Cc: Yang Yingliang <yangyingliang@huawei.com>
Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Cc: Mark Brown <broonie@kernel.org>
Cc: Matti Vaittinen <mazziesaccount@gmail.com>
Cc: James Clark <james.clark@arm.com>
Acked-by: "Rafael J. Wysocki" <rafael@kernel.org>
Tested-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Acked-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Link: https://lore.kernel.org/r/20231018013851.3303928-1-saravanak@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
When test_remove is enabled really_probe() does not properly pair
dma_configure() with dma_remove(), it will end up calling dma_configure()
twice. This corrupts the owner_cnt and renders the group unusable with
VFIO/etc.
Add the missing cleanup before going back to re_probe.
Fixes: 25f3bcfc54 ("driver core: Add dma_cleanup callback in bus_type")
Reported-by: Zenghui Yu <yuzenghui@huawei.com>
Tested-by: Zenghui Yu <yuzenghui@huawei.com>
Closes: https://lore.kernel.org/all/6472f254-c3c4-8610-4a37-8d9dfdd54ce8@huawei.com/
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Link: https://lore.kernel.org/r/0-v2-4deed94e283e+40948-really_probe_dma_cleanup_jgg@nvidia.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
bool is the most sensible return value for a yes/no return. Also
add __init as this funtion is only called from the early boot code.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Link: https://lore.kernel.org/r/20230531125535.676098-2-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Don't require the use of dynamic debug (or modification of the kernel to
add a #define DEBUG to the top of this file) to get the printk message
about driver probe timing. This printk is only emitted when
initcall_debug is enabled on the kernel commandline, and it isn't
immediately obvious that you have to do something else to debug boot
timing issues related to driver probe. Add a comment too so it doesn't
get converted back to pr_debug().
Fixes: eb7fbc9fb1 ("driver core: Add missing '\n' in log messages")
Cc: stable <stable@kernel.org>
Cc: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Cc: Brian Norris <briannorris@chromium.org>
Reviewed-by: Brian Norris <briannorris@chromium.org>
Acked-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Stephen Boyd <swboyd@chromium.org>
Link: https://lore.kernel.org/r/20230412225842.3196599-1-swboyd@chromium.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
If the file is written to and sync_state() hasn't been called for the
device yet, then call sync_state() for the device independent of the
state of its consumers.
This is useful for supplier devices that have one or more consumers that
don't have a driver but the consumers are in a state that don't use the
resources supplied by the supplier device.
This gives finer grained control than using the
fw_devlink.sync_state=timeout kernel commandline parameter.
Signed-off-by: Saravana Kannan <saravanak@google.com>
Link: https://lore.kernel.org/r/20230304005355.746421-3-saravanak@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
When all devices that could probe have finished probing (based on
deferred_probe_timeout configuration or late_initcall() when
!CONFIG_MODULES), this parameter controls what to do with devices that
haven't yet received their sync_state() calls.
fw_devlink.sync_state=strict is the default and the driver core will
continue waiting on all consumers of a device to probe successfully
before sync_state() is called for the device. This is the default
behavior since calling sync_state() on a device when all its consumers
haven't probed could make some systems unusable/unstable. When this
option is selected, we also print the list of devices that haven't had
sync_state() called on them by the time all devices the could probe have
finished probing.
fw_devlink.sync_state=timeout will cause the driver core to give up
waiting on consumers and call sync_state() on any devices that haven't
yet received their sync_state() calls. This option is provided for
systems that won't become unusable/unstable as they might be able to
save power (depends on state of hardware before kernel starts) if all
devices get their sync_state().
Signed-off-by: Saravana Kannan <saravanak@google.com>
Link: https://lore.kernel.org/r/20230304005355.746421-2-saravanak@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
When calling debugfs_lookup() the result must have dput() called on it,
otherwise the memory will leak over time. To make things simpler, just
call debugfs_lookup_and_remove() instead which handles all of the logic
at once.
Cc: "Rafael J. Wysocki" <rafael@kernel.org>
Link: https://lore.kernel.org/r/20230202141621.2296458-2-gregkh@linuxfoundation.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The logic to touch the bus notifier was open-coded in numberous places
in the driver core. Clean that up by creating a local bus_notify()
function and have everyone call this function instead, making the
reading of the caller code simpler and easier to maintain over time.
Reviewed-by: Rafael J. Wysocki <rafael@kernel.org>
Link: https://lore.kernel.org/r/20230111092331.3946745-2-gregkh@linuxfoundation.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
It is not used outside of its compilation unit, so there's no need to
export this variable.
Signed-off-by: Javier Martinez Canillas <javierm@redhat.com>
Reviewed-by: Andrew Halaney <ahalaney@redhat.com>
Acked-by: John Stultz <jstultz@google.com>
Link: https://lore.kernel.org/r/20221227232152.3094584-1-javierm@redhat.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
When a driver registers with a bus, it will attempt to match with every
device on the bus through the __driver_attach() function. Currently, if
the bus_type.match() function encounters an error that is not
-EPROBE_DEFER, __driver_attach() will return a negative error code, which
causes the driver registration logic to stop trying to match with the
remaining devices on the bus.
This behavior is not correct; a failure while matching a driver to a
device does not mean that the driver won't be able to match and bind
with other devices on the bus. Update the logic in __driver_attach()
to reflect this.
Fixes: 656b8035b0 ("ARM: 8524/1: driver cohandle -EPROBE_DEFER from bus_type.match()")
Cc: stable@vger.kernel.org
Cc: Saravana Kannan <saravanak@google.com>
Signed-off-by: Isaac J. Manjarres <isaacmanjarres@google.com>
Link: https://lore.kernel.org/r/20220921001414.4046492-1-isaacmanjarres@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
driver_allows_async_probing is only used in drivers/base/dd.c, so mark
it static and remove the declaration in drivers/base/base.h.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Link: https://lore.kernel.org/r/20221030092255.872280-1-hch@lst.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-----BEGIN PGP SIGNATURE-----
iQFSBAABCAA8FiEEq68RxlopcLEwq+PEeb4+QwBBGIYFAmMeQ2keHHRvcnZhbGRz
QGxpbnV4LWZvdW5kYXRpb24ub3JnAAoJEHm+PkMAQRiGYRMH+gLNHiGirGZlm2GQ
tKaZQUy7MiXuIP0hGDonDIIIAmIVhnjm9MDG8KT4W8AvEd7ukncyYqJfwWeWQPhP
4mZcf6l3Z8Ke+qiaFpXpMPCxTyWcln1ox0EoNx2g9gdPxZntaRuuaTQVljUfTiey
aVPHxve8ip3G7jDoJnuLSxESOqWxkb8v/SshBP1E5bF5BZ+cgZRqq7FNigFqxjbk
wF29K09BVOPjdgkSvY/b0/SnL5KlSdMAv+FrPcJNGivcdIPgf/qJks5cI2HRUo7o
CpKgbcLorCVyD+d+zLonJBwIy3arbmKD8JqYnfdTSIqVOUqHXWUDfeydsH32u1Gu
lPSI2Hw=
=7LTL
-----END PGP SIGNATURE-----
Merge 6.0-rc5 into driver-core-next
We need the driver core and debugfs changes in this branch.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Both __device_attach_driver() and __driver_attach() check the return
code of the bus_type.match() function to see if the device needs to be
added to the deferred probe list. After adding the device to the list,
the logic attempts to bind the device to the driver anyway, as if the
device had matched with the driver, which is not correct.
If __device_attach_driver() detects that the device in question is not
ready to match with a driver on the bus, then it doesn't make sense for
the device to attempt to bind with the current driver or continue
attempting to match with any of the other drivers on the bus. So, update
the logic in __device_attach_driver() to reflect this.
If __driver_attach() detects that a driver tried to match with a device
that is not ready to match yet, then the driver should not attempt to bind
with the device. However, the driver can still attempt to match and bind
with other devices on the bus, as drivers can be bound to multiple
devices. So, update the logic in __driver_attach() to reflect this.
Fixes: 656b8035b0 ("ARM: 8524/1: driver cohandle -EPROBE_DEFER from bus_type.match()")
Cc: stable@vger.kernel.org
Cc: Saravana Kannan <saravanak@google.com>
Reported-by: Guenter Roeck <linux@roeck-us.net>
Tested-by: Guenter Roeck <linux@roeck-us.net>
Tested-by: Linus Walleij <linus.walleij@linaro.org>
Reviewed-by: Saravana Kannan <saravanak@google.com>
Signed-off-by: Isaac J. Manjarres <isaacmanjarres@google.com>
Link: https://lore.kernel.org/r/20220817184026.3468620-1-isaacmanjarres@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
This reverts commit 9cbffc7a59.
There are a few more issues to fix that have been reported in the thread
for the original series [1]. We'll need to fix those before this will work.
So, revert it for now.
[1] - https://lore.kernel.org/lkml/20220601070707.3946847-1-saravanak@google.com/
Fixes: 9cbffc7a59 ("driver core: Delete driver_deferred_probe_check_state()")
Tested-by: Tony Lindgren <tony@atomide.com>
Tested-by: Peng Fan <peng.fan@nxp.com>
Tested-by: Douglas Anderson <dianders@chromium.org>
Tested-by: Alexander Stein <alexander.stein@ew.tq-group.com>
Reviewed-by: Tony Lindgren <tony@atomide.com>
Signed-off-by: Saravana Kannan <saravanak@google.com>
Link: https://lore.kernel.org/r/20220819221616.2107893-2-saravanak@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
In __driver_attach function, There are also AA deadlock problem,
like the commit b232b02bf3 ("driver core: fix deadlock in
__device_attach").
stack like commit b232b02bf3 ("driver core: fix deadlock in
__device_attach").
list below:
In __driver_attach function, The lock holding logic is as follows:
...
__driver_attach
if (driver_allows_async_probing(drv))
device_lock(dev) // get lock dev
async_schedule_dev(__driver_attach_async_helper, dev); // func
async_schedule_node
async_schedule_node_domain(func)
entry = kzalloc(sizeof(struct async_entry), GFP_ATOMIC);
/* when fail or work limit, sync to execute func, but
__driver_attach_async_helper will get lock dev as
will, which will lead to A-A deadlock. */
if (!entry || atomic_read(&entry_count) > MAX_WORK) {
func;
else
queue_work_node(node, system_unbound_wq, &entry->work)
device_unlock(dev)
As above show, when it is allowed to do async probes, because of
out of memory or work limit, async work is not be allowed, to do
sync execute instead. it will lead to A-A deadlock because of
__driver_attach_async_helper getting lock dev.
Reproduce:
and it can be reproduce by make the condition
(if (!entry || atomic_read(&entry_count) > MAX_WORK)) untenable, like
below:
[ 370.785650] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables
this message.
[ 370.787154] task:swapper/0 state:D stack: 0 pid: 1 ppid:
0 flags:0x00004000
[ 370.788865] Call Trace:
[ 370.789374] <TASK>
[ 370.789841] __schedule+0x482/0x1050
[ 370.790613] schedule+0x92/0x1a0
[ 370.791290] schedule_preempt_disabled+0x2c/0x50
[ 370.792256] __mutex_lock.isra.0+0x757/0xec0
[ 370.793158] __mutex_lock_slowpath+0x1f/0x30
[ 370.794079] mutex_lock+0x50/0x60
[ 370.794795] __device_driver_lock+0x2f/0x70
[ 370.795677] ? driver_probe_device+0xd0/0xd0
[ 370.796576] __driver_attach_async_helper+0x1d/0xd0
[ 370.797318] ? driver_probe_device+0xd0/0xd0
[ 370.797957] async_schedule_node_domain+0xa5/0xc0
[ 370.798652] async_schedule_node+0x19/0x30
[ 370.799243] __driver_attach+0x246/0x290
[ 370.799828] ? driver_allows_async_probing+0xa0/0xa0
[ 370.800548] bus_for_each_dev+0x9d/0x130
[ 370.801132] driver_attach+0x22/0x30
[ 370.801666] bus_add_driver+0x290/0x340
[ 370.802246] driver_register+0x88/0x140
[ 370.802817] ? virtio_scsi_init+0x116/0x116
[ 370.803425] scsi_register_driver+0x1a/0x30
[ 370.804057] init_sd+0x184/0x226
[ 370.804533] do_one_initcall+0x71/0x3a0
[ 370.805107] kernel_init_freeable+0x39a/0x43a
[ 370.805759] ? rest_init+0x150/0x150
[ 370.806283] kernel_init+0x26/0x230
[ 370.806799] ret_from_fork+0x1f/0x30
To fix the deadlock, move the async_schedule_dev outside device_lock,
as we can see, in async_schedule_node_domain, the parameter of
queue_work_node is system_unbound_wq, so it can accept concurrent
operations. which will also not change the code logic, and will
not lead to deadlock.
Fixes: ef0ff68351 ("driver core: Probe devices asynchronously instead of the driver")
Signed-off-by: Zhang Wensheng <zhangwensheng5@huawei.com>
Link: https://lore.kernel.org/r/20220622074327.497102-1-zhangwensheng5@huawei.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The function is no longer used. So delete it.
Tested-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: Saravana Kannan <saravanak@google.com>
Link: https://lore.kernel.org/r/20220601070707.3946847-10-saravanak@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Some devices might need to be probed and bound successfully before the
kernel boot sequence can finish and move on to init/userspace. For
example, a network interface might need to be bound to be able to mount
a NFS rootfs.
With fw_devlink=on by default, some of these devices might be blocked
from probing because they are waiting on a optional supplier that
doesn't have a driver. While fw_devlink will eventually identify such
devices and unblock the probing automatically, it might be too late by
the time it unblocks the probing of devices. For example, the IP4
autoconfig might timeout before fw_devlink unblocks probing of the
network interface.
This function is available to temporarily try and probe all devices that
have a driver even if some of their suppliers haven't been added or
don't have drivers.
The drivers can then decide which of the suppliers are optional vs
mandatory and probe the device if possible. By the time this function
returns, all such "best effort" probes are guaranteed to be completed.
If a device successfully probes in this mode, we delete all fw_devlink
discovered dependencies of that device where the supplier hasn't yet
probed successfully because they have to be optional dependencies.
This also means that some devices that aren't needed for init and could
have waited for their optional supplier to probe (when the supplier's
module is loaded later on) would end up probing prematurely with limited
functionality. So call this function only when boot would fail without
it.
Tested-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: Saravana Kannan <saravanak@google.com>
Link: https://lore.kernel.org/r/20220601070707.3946847-5-saravanak@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Since we had to effectively reverted
commit 35a672363a ("driver core: Ensure wait_for_device_probe() waits
until the deferred_probe_timeout fires") in an earlier patch, a non-zero
deferred_probe_timeout will break NFS rootfs mounting [1] again. So, set
the default back to zero until we can fix that.
[1] - https://lore.kernel.org/lkml/TYAPR01MB45443DF63B9EF29054F7C41FD8C60@TYAPR01MB4544.jpnprd01.prod.outlook.com/
Fixes: 2b28a1a84a ("driver core: Extend deferred probe timeout on driver registration")
Cc: Mark Brown <broonie@kernel.org>
Cc: Rob Herring <robh@kernel.org>
Reported-by: Nathan Chancellor <nathan@kernel.org>
Reported-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Tested-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: Saravana Kannan <saravanak@google.com>
Link: https://lore.kernel.org/r/20220526034609.480766-3-saravanak@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Mounting NFS rootfs was timing out when deferred_probe_timeout was
non-zero [1]. This was because ip_auto_config() initcall times out
waiting for the network interfaces to show up when
deferred_probe_timeout was non-zero. While ip_auto_config() calls
wait_for_device_probe() to make sure any currently running deferred
probe work or asynchronous probe finishes, that wasn't sufficient to
account for devices being deferred until deferred_probe_timeout.
Commit 35a672363a ("driver core: Ensure wait_for_device_probe() waits
until the deferred_probe_timeout fires") tried to fix that by making
sure wait_for_device_probe() waits for deferred_probe_timeout to expire
before returning.
However, if wait_for_device_probe() is called from the kernel_init()
context:
- Before deferred_probe_initcall() [2], it causes the boot process to
hang due to a deadlock.
- After deferred_probe_initcall() [3], it blocks kernel_init() from
continuing till deferred_probe_timeout expires and beats the point of
deferred_probe_timeout that's trying to wait for userspace to load
modules.
Neither of this is good. So revert the changes to
wait_for_device_probe().
[1] - https://lore.kernel.org/lkml/TYAPR01MB45443DF63B9EF29054F7C41FD8C60@TYAPR01MB4544.jpnprd01.prod.outlook.com/
[2] - https://lore.kernel.org/lkml/YowHNo4sBjr9ijZr@dev-arch.thelio-3990X/
[3] - https://lore.kernel.org/lkml/Yo3WvGnNk3LvLb7R@linutronix.de/
Fixes: 35a672363a ("driver core: Ensure wait_for_device_probe() waits until the deferred_probe_timeout fires")
Cc: John Stultz <jstultz@google.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Alexey Kuznetsov <kuznet@ms2.inr.ac.ru>
Cc: Hideaki YOSHIFUJI <yoshfuji@linux-ipv6.org>
Cc: Jakub Kicinski <kuba@kernel.org>
Cc: Rob Herring <robh@kernel.org>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com>
Cc: Robin Murphy <robin.murphy@arm.com>
Cc: Andy Shevchenko <andy.shevchenko@gmail.com>
Cc: Sudeep Holla <sudeep.holla@arm.com>
Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Cc: Naresh Kamboju <naresh.kamboju@linaro.org>
Cc: Basil Eljuse <Basil.Eljuse@arm.com>
Cc: Ferry Toth <fntoth@gmail.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Anders Roxell <anders.roxell@linaro.org>
Cc: linux-pm@vger.kernel.org
Reported-by: Nathan Chancellor <nathan@kernel.org>
Reported-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Tested-by: Geert Uytterhoeven <geert+renesas@glider.be>
Acked-by: John Stultz <jstultz@google.com>
Signed-off-by: Saravana Kannan <saravanak@google.com>
Link: https://lore.kernel.org/r/20220526034609.480766-2-saravanak@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Reviewed-by: Rafael J. Wysocki <rafael@kernel.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Here is the set of driver core changes for 5.19-rc1.
Note, I'm not really happy with this pull request as-is, see below for
details, but overall this is all good for everything but a small set of
systems, which we have a fix for already.
Lots of tiny driver core changes and cleanups happened this cycle,
but the two major things were:
- firmware_loader reorganization and additions including the
ability to have XZ compressed firmware images and the ability
for userspace to initiate the firmware load when it needs to,
instead of being always initiated by the kernel. FPGA devices
specifically want this ability to have their firmware changed
over the lifetime of the system boot, and this allows them to
work without having to come up with yet-another-custom-uapi
interface for loading firmware for them.
- physical location support added to sysfs so that devices that
know this information, can tell userspace where they are
located in a common way. Some ACPI devices already support
this today, and more bus types should support this in the
future.
Smaller changes included:
- driver_override api cleanups and fixes
- error path cleanups and fixes
- get_abi script fixes
- deferred probe timeout changes.
It's that last change that I'm the most worried about. It has been
reported to cause boot problems for a number of systems, and I have a
tested patch series that resolves this issue. But I didn't get it
merged into my tree before 5.18-final came out, so it has not gotten any
linux-next testing.
I'll send the fixup patches (there are 2) as a follow-on series to this
pull request if you want to take them directly, _OR_ I can just revert
the probe timeout changes and they can wait for the next -rc1 merge
cycle. Given that the fixes are tested, and pretty simple, I'm leaning
toward that choice. Sorry this all came at the end of the merge window,
I should have resolved this all 2 weeks ago, that's my fault as it was
in the middle of some travel for me.
All have been tested in linux-next for weeks, with no reported issues
other than the above-mentioned boot time outs.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-----BEGIN PGP SIGNATURE-----
iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCYpnv/A8cZ3JlZ0Brcm9h
aC5jb20ACgkQMUfUDdst+yk/fACgvmenbo5HipqyHnOmTQlT50xQ9EYAn2eTq6ai
GkjLXBGNWOPBa5cU52qf
=yEi/
-----END PGP SIGNATURE-----
Merge tag 'driver-core-5.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core
Pull driver core updates from Greg KH:
"Here is the set of driver core changes for 5.19-rc1.
Lots of tiny driver core changes and cleanups happened this cycle, but
the two major things are:
- firmware_loader reorganization and additions including the ability
to have XZ compressed firmware images and the ability for userspace
to initiate the firmware load when it needs to, instead of being
always initiated by the kernel. FPGA devices specifically want this
ability to have their firmware changed over the lifetime of the
system boot, and this allows them to work without having to come up
with yet-another-custom-uapi interface for loading firmware for
them.
- physical location support added to sysfs so that devices that know
this information, can tell userspace where they are located in a
common way. Some ACPI devices already support this today, and more
bus types should support this in the future.
Smaller changes include:
- driver_override api cleanups and fixes
- error path cleanups and fixes
- get_abi script fixes
- deferred probe timeout changes.
It's that last change that I'm the most worried about. It has been
reported to cause boot problems for a number of systems, and I have a
tested patch series that resolves this issue. But I didn't get it
merged into my tree before 5.18-final came out, so it has not gotten
any linux-next testing.
I'll send the fixup patches (there are 2) as a follow-on series to this
pull request.
All have been tested in linux-next for weeks, with no reported issues
other than the above-mentioned boot time-outs"
* tag 'driver-core-5.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: (55 commits)
driver core: fix deadlock in __device_attach
kernfs: Separate kernfs_pr_cont_buf and rename_lock.
topology: Remove unused cpu_cluster_mask()
driver core: Extend deferred probe timeout on driver registration
MAINTAINERS: add Russ Weight as a firmware loader maintainer
driver: base: fix UAF when driver_attach failed
test_firmware: fix end of loop test in upload_read_show()
driver core: location: Add "back" as a possible output for panel
driver core: location: Free struct acpi_pld_info *pld
driver core: Add "*" wildcard support to driver_async_probe cmdline param
driver core: location: Check for allocations failure
arch_topology: Trace the update thermal pressure
kernfs: Rename kernfs_put_open_node to kernfs_unlink_open_file.
export: fix string handling of namespace in EXPORT_SYMBOL_NS
rpmsg: use local 'dev' variable
rpmsg: Fix calling device_lock() on non-initialized device
firmware_loader: describe 'module' parameter of firmware_upload_register()
firmware_loader: Move definitions from sysfs_upload.h to sysfs.h
firmware_loader: Fix configs for sysfs split
selftests: firmware: Add firmware upload selftests
...
In __device_attach function, The lock holding logic is as follows:
...
__device_attach
device_lock(dev) // get lock dev
async_schedule_dev(__device_attach_async_helper, dev); // func
async_schedule_node
async_schedule_node_domain(func)
entry = kzalloc(sizeof(struct async_entry), GFP_ATOMIC);
/* when fail or work limit, sync to execute func, but
__device_attach_async_helper will get lock dev as
well, which will lead to A-A deadlock. */
if (!entry || atomic_read(&entry_count) > MAX_WORK) {
func;
else
queue_work_node(node, system_unbound_wq, &entry->work)
device_unlock(dev)
As shown above, when it is allowed to do async probes, because of
out of memory or work limit, async work is not allowed, to do
sync execute instead. it will lead to A-A deadlock because of
__device_attach_async_helper getting lock dev.
To fix the deadlock, move the async_schedule_dev outside device_lock,
as we can see, in async_schedule_node_domain, the parameter of
queue_work_node is system_unbound_wq, so it can accept concurrent
operations. which will also not change the code logic, and will
not lead to deadlock.
Fixes: 765230b5f0 ("driver-core: add asynchronous probing support for drivers")
Signed-off-by: Zhang Wensheng <zhangwensheng5@huawei.com>
Link: https://lore.kernel.org/r/20220518074516.1225580-1-zhangwensheng5@huawei.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The deferred probe timer that's used for this currently starts at
late_initcall and runs for driver_deferred_probe_timeout seconds. The
assumption being that all available drivers would be loaded and
registered before the timer expires. This means, the
driver_deferred_probe_timeout has to be pretty large for it to cover the
worst case. But if we set the default value for it to cover the worst
case, it would significantly slow down the average case. For this
reason, the default value is set to 0.
Also, with CONFIG_MODULES=y and the current default values of
driver_deferred_probe_timeout=0 and fw_devlink=on, devices with missing
drivers will cause their consumer devices to always defer their probes.
This is because device links created by fw_devlink defer the probe even
before the consumer driver's probe() is called.
Instead of a fixed timeout, if we extend an unexpired deferred probe
timer on every successful driver registration, with the expectation more
modules would be loaded in the near future, then the default value of
driver_deferred_probe_timeout only needs to be as long as the worst case
time difference between two consecutive module loads.
So let's implement that and set the default value to 10 seconds when
CONFIG_MODULES=y.
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
Cc: Rob Herring <robh@kernel.org>
Cc: Linus Walleij <linus.walleij@linaro.org>
Cc: Will Deacon <will@kernel.org>
Cc: Ulf Hansson <ulf.hansson@linaro.org>
Cc: Kevin Hilman <khilman@kernel.org>
Cc: Thierry Reding <treding@nvidia.com>
Cc: Mark Brown <broonie@kernel.org>
Cc: Pavel Machek <pavel@ucw.cz>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com>
Cc: Paul Kocialkowski <paul.kocialkowski@bootlin.com>
Cc: linux-gpio@vger.kernel.org
Cc: linux-pm@vger.kernel.org
Cc: iommu@lists.linux-foundation.org
Reviewed-by: Mark Brown <broonie@kernel.org>
Acked-by: Rob Herring <robh@kernel.org>
Signed-off-by: Saravana Kannan <saravanak@google.com>
Link: https://lore.kernel.org/r/20220429220933.1350374-1-saravanak@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
There's currently no way to use driver_async_probe kernel cmdline param
to enable default async probe for all drivers. So, add support for "*"
to match with all driver names. When "*" is used, all other drivers
listed in driver_async_probe are drivers that will NOT match the "*".
For example:
* driver_async_probe=drvA,drvB,drvC
drvA, drvB and drvC do asynchronous probing.
* driver_async_probe=*
All drivers do asynchronous probing except those that have set
PROBE_FORCE_SYNCHRONOUS flag.
* driver_async_probe=*,drvA,drvB,drvC
All drivers do asynchronous probing except drvA, drvB, drvC and those
that have set PROBE_FORCE_SYNCHRONOUS flag.
Cc: Alexander Duyck <alexander.h.duyck@linux.intel.com>
Cc: Randy Dunlap <rdunlap@infradead.org>
Cc: Feng Tang <feng.tang@intel.com>
Signed-off-by: Saravana Kannan <saravanak@google.com>
Link: https://lore.kernel.org/r/20220504005344.117803-1-saravanak@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>