linux

mirror of https://github.com/torvalds/linux.git synced 2026-05-13 00:28:54 +02:00

Author	SHA1	Message	Date
Linus Torvalds	8c2bf4a2e5	Driver core fixes for 7.1-rc1 - Prevent a device from being probed before device_add() has finished initializing it; gate probe with a "ready_to_probe" device flag to avoid races with concurrent driver_register() calls - Fix a kernel-doc warning for DEV_FLAG_COUNT introduced by the above - Return -ENOTCONN from software_node_get_reference_args() when a referenced software node is known but not yet registered, allowing callers to defer probe - In sysfs_group_attrs_change_owner(), also check is_visible_const(); missed when the const variant was introduced -----BEGIN PGP SIGNATURE----- iHUEABYKAB0WIQS2q/xV6QjXAdC7k+1FlHeO1qrKLgUCaePuOwAKCRBFlHeO1qrK LrBdAP9hcc5TBpPgbW4vdpsGgPhspz2x3Ijq0m91354RT2KISwEAx0ZHu6pvtLb/ hW0hHvk0onL6uBbCR6629U+A/xQ79go= =Hdaa -----END PGP SIGNATURE----- Merge tag 'driver-core-7.1-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/driver-core/driver-core Pull driver core fixes from Danilo Krummrich: - Prevent a device from being probed before device_add() has finished initializing it; gate probe with a "ready_to_probe" device flag to avoid races with concurrent driver_register() calls - Fix a kernel-doc warning for DEV_FLAG_COUNT introduced by the above - Return -ENOTCONN from software_node_get_reference_args() when a referenced software node is known but not yet registered, allowing callers to defer probe - In sysfs_group_attrs_change_owner(), also check is_visible_const(); missed when the const variant was introduced * tag 'driver-core-7.1-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/driver-core/driver-core: driver core: Add kernel-doc for DEV_FLAG_COUNT enum value sysfs: attribute_group: Respect is_visible_const() when changing owner software node: return -ENOTCONN when referenced swnode is not registered yet driver core: Don't let a device probe until it's ready	2026-04-19 12:58:08 -07:00
Douglas Anderson	a2225b6e83	driver core: Don't let a device probe until it's ready The moment we link a "struct device" into the list of devices for the bus, it's possible probe can happen. This is because another thread can load the driver at any time and that can cause the device to probe. This has been seen in practice with a stack crawl that looks like this [1]: really_probe() __driver_probe_device() driver_probe_device() __driver_attach() bus_for_each_dev() driver_attach() bus_add_driver() driver_register() __platform_driver_register() init_module() [some module] do_one_initcall() do_init_module() load_module() __arm64_sys_finit_module() invoke_syscall() As a result of the above, it was seen that device_links_driver_bound() could be called for the device before "dev->fwnode->dev" was assigned. This prevented __fw_devlink_pickup_dangling_consumers() from being called which meant that other devices waiting on our driver's sub-nodes were stuck deferring forever. It's believed that this problem is showing up suddenly for two reasons: 1. Android has recently (last ~1 year) implemented an optimization to the order it loads modules [2]. When devices opt-in to this faster loading, modules are loaded one-after-the-other very quickly. This is unlike how other distributions do it. The reproduction of this problem has only been seen on devices that opt-in to Android's "parallel module loading". 2. Android devices typically opt-in to fw_devlink, and the most noticeable issue is the NULL "dev->fwnode->dev" in device_links_driver_bound(). fw_devlink is somewhat new code and also not in use by all Linux devices. Even though the specific symptom where "dev->fwnode->dev" wasn't assigned could be fixed by moving that assignment higher in device_add(), other parts of device_add() (like the call to device_pm_add()) are also important to run before probe. Only moving the "dev->fwnode->dev" assignment would likely fix the current symptoms but lead to difficult-to-debug problems in the future. Fix the problem by preventing probe until device_add() has run far enough that the device is ready to probe. If somehow we end up trying to probe before we're allowed, __driver_probe_device() will return -EPROBE_DEFER which will make certain the device is noticed. In the race condition that was seen with Android's faster module loading, we will temporarily add the device to the deferred list and then take it off immediately when device_add() probes the device. Instead of adding another flag to the bitfields already in "struct device", instead add a new "flags" field and use that. This allows us to freely change the bit from different thread without worrying about corrupting nearby bits (and means threads changing other bit won't corrupt us). [1] Captured on a machine running a downstream 6.6 kernel [2] https://cs.android.com/android/platform/superproject/main/+/main:system/core/libmodprobe/libmodprobe.cpp?q=LoadModulesParallel Cc: stable@vger.kernel.org Fixes: `2023c610dc` ("Driver core: add new device to bus's list before probing") Reviewed-by: Alan Stern <stern@rowland.harvard.edu> Reviewed-by: Rafael J. Wysocki (Intel) <rafael@kernel.org> Reviewed-by: Danilo Krummrich <dakr@kernel.org> Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Acked-by: Marek Szyprowski <m.szyprowski@samsung.com> Signed-off-by: Douglas Anderson <dianders@chromium.org> Link: https://patch.msgid.link/20260406162231.v5.1.Id750b0fbcc94f23ed04b7aecabcead688d0d8c17@changeid Signed-off-by: Danilo Krummrich <dakr@kernel.org>	2026-04-11 14:32:07 +02:00
Hans de Goede	56e3ee721b	driver core: Make deferred_probe_timeout default a Kconfig option Code using driver_deferred_probe_check_state() differs from most EPROBE_DEFER handling in the kernel. Where other EPROBE_DEFER handling (e.g. clks, gpios and regulators) waits indefinitely for suppliers to show up, code using driver_deferred_probe_check_state() will fail after the deferred_probe_timeout. This is a problem for generic distro kernels which want to support many boards using a single kernel build. These kernels want as much drivers to be modular as possible. The initrd also should be as small as possible, so the initrd will not have drivers not needing to get the rootfs. Combine this with waiting for a full-disk encryption password in the initrd and it is pretty much guaranteed that the default 10s timeout will be hit, causing probe() failures when drivers on the rootfs happen to get modprobe-d before other rootfs modules providing their suppliers. Make the default timeout configurable from Kconfig to allow distro kernel configs where many of the supplier drivers are modules to set the default through Kconfig. Reviewed-by: Saravana Kannan <saravanak@kernel.org> Signed-off-by: Hans de Goede <johannes.goede@oss.qualcomm.com> Link: https://patch.msgid.link/20260314084916.10868-1-johannes.goede@oss.qualcomm.com [ Drop deferred_probe_timeout documentation change in kernel-parameters.txt. - Danilo ] Signed-off-by: Danilo Krummrich <dakr@kernel.org>	2026-03-30 20:45:27 +02:00
Gui-Dong Han	3210dabba4	driver core: simplify __device_set_driver_override() clearing logic Currently, __device_set_driver_override() handles clearing the override via empty string ("") and newline ("\n") in two separate paths. The "\n" case also performs an unnecessary memory allocation and immediate free. Simplify the logic by initializing 'new' to NULL and only allocating memory if the string length remains non-zero after stripping the trailing newline. Reduce code size, improve readability, and avoid unnecessary memory operations. No functional change intended. Suggested-by: Geert Uytterhoeven <geert@linux-m68k.org> Link: https://lore.kernel.org/driver-core/DGS82WWLXPJ0.2EH4VJSF30UR5@kernel.org/ Signed-off-by: Gui-Dong Han <hanguidong02@gmail.com> Link: https://patch.msgid.link/20260325090905.169000-1-hanguidong02@gmail.com [ Narrow cp's scope to the newline handling block; use scoped_guard(). - Danilo ] Signed-off-by: Danilo Krummrich <dakr@kernel.org>	2026-03-30 14:45:51 +02:00
Danilo Krummrich	cb3d1049f4	driver core: generalize driver_override in struct device Currently, there are 12 busses (including platform and PCI) that duplicate the driver_override logic for their individual devices. All of them seem to be prone to the bug described in [1]. While this could be solved for every bus individually using a separate lock, solving this in the driver-core generically results in less (and cleaner) changes overall. Thus, move driver_override to struct device, provide corresponding accessors for busses and handle locking with a separate lock internally. In particular, add device_set_driver_override(), device_has_driver_override(), device_match_driver_override() and generalize the sysfs store() and show() callbacks via a driver_override feature flag in struct bus_type. Until all busses have migrated, keep driver_set_override() in place. Note that we can't use the device lock for the reasons described in [2]. Link: https://bugzilla.kernel.org/show_bug.cgi?id=220789 [1] Link: https://lore.kernel.org/driver-core/DGRGTIRHA62X.3RY09D9SOK77P@kernel.org/ [2] Tested-by: Gui-Dong Han <hanguidong02@gmail.com> Co-developed-by: Gui-Dong Han <hanguidong02@gmail.com> Signed-off-by: Gui-Dong Han <hanguidong02@gmail.com> Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Link: https://patch.msgid.link/20260303115720.48783-2-dakr@kernel.org [ Use dev->bus instead of sp->bus for consistency; fix commit message to refer to the struct bus_type's driver_override feature flag. - Danilo ] Signed-off-by: Danilo Krummrich <dakr@kernel.org>	2026-03-17 20:30:23 +01:00
Danilo Krummrich	9de68394a6	Revert "driver core: enforce device_lock for driver_match_device()" This reverts commit `dc23806a7c` ("driver core: enforce device_lock for driver_match_device()") and commit `289b14592c` ("driver core: fix inverted "locked" suffix of driver_match_device()"). While technically correct, there is a major downside to this approach: When a device is already present in the system and a driver is registered on the same bus, we iterate over all devices registered on this bus to see if one of them matches. If we come across an already bound one where the corresponding driver crashed while holding the device lock (e.g. in probe()) we can't make any progress anymore. However, drivers are typically the least tested code in the kernel and hence it is a case that is likely to happen regularly. Besides hurting developer ergonomics, it potentially decreases chances of shutting things down cleanly and obtaining logs in production environments as well [1]. This came up in the context of a firewire bug, which only in combination with the reverted commit, caused the machine to hang [2]. Additionally, it was observed in [3]. Thus, revert commit `dc23806a7c` ("driver core: enforce device_lock for driver_match_device()") and add a brief note clarifying that an implementer of struct bus_type must not expect match() to be called with the device lock held. Link: https://lore.kernel.org/driver-core/DGRGTIRHA62X.3RY09D9SOK77P@kernel.org/ [1] Link: https://lore.kernel.org/all/67f655bb-4d81-4609-b008-68d200255dd2@davidgow.net/ [2] Link: https://lore.kernel.org/lkml/CALbr=LZ4v7N=tO1vgOsyj9AS+XuNbn6kG-QcF+PacdMjSo0iyw@mail.gmail.com/ [3] Reported-by: Linus Torvalds <torvalds@linux-foundation.org> Closes: https://lore.kernel.org/driver-core/CAHk-=wgJ_L1C=HjcYJotg_zrZEmiLFJaoic+PWthjuQrutrfJw@mail.gmail.com/ Reviewed-by: Gui-Dong Han <hanguidong02@gmail.com> Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Link: https://patch.msgid.link/20260302002545.19389-1-dakr@kernel.org [ Add additional Link: reference. - Danilo ] Signed-off-by: Danilo Krummrich <dakr@kernel.org>	2026-03-03 13:12:42 +01:00
Danilo Krummrich	289b14592c	driver core: fix inverted "locked" suffix of driver_match_device() In the current implementation driver_match_device() expects the device lock to be held, while driver_match_device_locked() acquires the device lock. By convention it should be the other way around, hence swap the name of both functions. Fixes: `dc23806a7c` ("driver core: enforce device_lock for driver_match_device()") Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Reviewed-by: Gui-Dong Han <hanguidong02@gmail.com> Link: https://patch.msgid.link/20260131014211.12841-1-dakr@kernel.org Signed-off-by: Danilo Krummrich <dakr@kernel.org>	2026-02-01 22:24:25 +01:00
Gui-Dong Han	dc23806a7c	driver core: enforce device_lock for driver_match_device() Currently, driver_match_device() is called from three sites. One site (__device_attach_driver) holds device_lock(dev), but the other two (bind_store and __driver_attach) do not. This inconsistency means that bus match() callbacks are not guaranteed to be called with the lock held. Fix this by introducing driver_match_device_locked(), which guarantees holding the device lock using a scoped guard. Replace the unlocked calls in bind_store() and __driver_attach() with this new helper. Also add a lock assertion to driver_match_device() to enforce this guarantee. This consistency also fixes a known race condition. The driver_override implementation relies on the device_lock, so the missing lock led to the use-after-free (UAF) reported in Bugzilla for buses using this field. Stress testing the two newly locked paths for 24 hours with CONFIG_PROVE_LOCKING and CONFIG_LOCKDEP enabled showed no UAF recurrence and no lockdep warnings. Cc: stable@vger.kernel.org Closes: https://bugzilla.kernel.org/show_bug.cgi?id=220789 Suggested-by: Qiu-ji Chen <chenqiuji666@gmail.com> Signed-off-by: Gui-Dong Han <hanguidong02@gmail.com> Fixes: `49b420a13f` ("driver core: check bus->match without holding device lock") Reviewed-by: Danilo Krummrich <dakr@kernel.org> Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Reviewed-by: Rafael J. Wysocki (Intel) <rafael@kernel.org> Link: https://patch.msgid.link/20260113162843.12712-1-hanguidong02@gmail.com Signed-off-by: Danilo Krummrich <dakr@kernel.org>	2026-01-16 12:40:58 +01:00
Danilo Krummrich	a995fe1a3a	rust: driver: drop device private data post unbind Currently, the driver's device private data is allocated and initialized from driver core code called from bus abstractions after the driver's probe() callback returned the corresponding initializer. Similarly, the driver's device private data is dropped within the remove() callback of bus abstractions after calling the remove() callback of the corresponding driver. However, commit `6f61a2637a` ("rust: device: introduce Device::drvdata()") introduced an accessor for the driver's device private data for a Device<Bound>, i.e. a device that is currently bound to a driver. Obviously, this is in conflict with dropping the driver's device private data in remove(), since a device can not be considered to be fully unbound after remove() has finished: We also have to consider registrations guarded by devres - such as IRQ or class device registrations - which are torn down after remove() in devres_release_all(). Thus, it can happen that, for instance, a class device or IRQ callback still calls Device::drvdata(), which then runs concurrently to remove() (which sets dev->driver_data to NULL and drops the driver's device private data), before devres_release_all() started to tear down the corresponding registration. This is because devres guarded registrations can, as expected, access the corresponding Device<Bound> that defines their scope. In C it simply is the driver's responsibility to ensure that its device private data is freed after e.g. an IRQ registration is unregistered. Typically, C drivers achieve this by allocating their device private data with e.g. devm_kzalloc() before doing anything else, i.e. before e.g. registering an IRQ with devm_request_threaded_irq(), relying on the reverse order cleanup of devres. Technically, we could do something similar in Rust. However, the resulting code would be pretty messy: In Rust we have to differentiate between allocated but uninitialized memory and initialized memory in the type system. Thus, we would need to somehow keep track of whether the driver's device private data object has been initialized (i.e. probe() was successful and returned a valid initializer for this memory) and conditionally call the destructor of the corresponding object when it is freed. This is because we'd need to allocate and register the memory of the driver's device private data before it is initialized by the initializer returned by the driver's probe() callback, because the driver could already register devres guarded registrations within probe() outside of the driver's device private data initializer. Luckily there is a much simpler solution: Instead of dropping the driver's device private data at the end of remove(), we just drop it after the device has been fully unbound, i.e. after all devres callbacks have been processed. For this, we introduce a new post_unbind() callback private to the driver-core, i.e. the callback is neither exposed to drivers, nor to bus abstractions. This way, the driver-core code can simply continue to conditionally allocate the memory for the driver's device private data when the driver's initializer is returned from probe() - no change needed - and drop it when the driver-core code receives the post_unbind() callback. Closes: https://lore.kernel.org/all/DEZMS6Y4A7XE.XE7EUBT5SJFJ@kernel.org/ Fixes: `6f61a2637a` ("rust: device: introduce Device::drvdata()") Acked-by: Alice Ryhl <aliceryhl@google.com> Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Acked-by: Igor Korotin <igor.korotin.linux@gmail.com> Link: https://patch.msgid.link/20260107103511.570525-7-dakr@kernel.org [ Remove #ifdef CONFIG_RUST, rename post_unbind() to post_unbind_rust(). - Danilo] Signed-off-by: Danilo Krummrich <dakr@kernel.org>	2026-01-16 01:17:29 +01:00
Vincent Liu	ea34511aaf	driver core: Check drivers_autoprobe for all added devices When a device is hot-plugged, the drivers_autoprobe sysfs attribute is not checked (at least for PCI devices). This means that drivers_autoprobe is not working as intended, e.g. hot-plugged PCI devices will still be autoprobed and bound to drivers even with drivers_autoprobe disabled. The problem likely started when device_add() was removed from pci_bus_add_device() in commit `4f535093cf` ("PCI: Put pci_dev in device tree as early as possible") which means that the check for drivers_autoprobe which used to happen in bus_probe_device() is no longer present (previously bus_add_device() calls bus_probe_device()). Conveniently, in commit `9170304169` ("PCI: Allow built-in drivers to use async initial probing") device_attach() was replaced with device_initial_probe() which faciliates this change to push the check for drivers_autoprobe into device_initial_probe(). Make sure all devices check drivers_autoprobe by pushing the drivers_autoprobe check into device_initial_probe(). This will only affect devices on the PCI bus for now as device_initial_probe() is only called by pci_bus_add_device() and bus_probe_device(), but bus_probe_device() already checks for autoprobe, so callers of bus_probe_device() should not observe changes on autoprobing. Note also that pushing this check into device_initial_probe() rather than device_attach() makes it only affect automatic probing of drivers (e.g. when a device is hot-plugged), userspace can still choose to manually bind a driver by writing to drivers_probe sysfs attribute, even with autoprobe disabled. Any future callers of device_initial_probe() will respect the drivers_autoprobe sysfs attribute, which is the intended purpose of drivers_autoprobe. Signed-off-by: Vincent Liu <vincent.liu@nutanix.com> Link: https://patch.msgid.link/20251022120740.2476482-1-vincent.liu@nutanix.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2025-11-26 15:22:19 +01:00
Marco Crivellari	e40ad215ce	driver core: replace use of system_unbound_wq with system_dfl_wq Currently if a user enqueue a work item using schedule_delayed_work() the used wq is "system_wq" (per-cpu wq) while queue_delayed_work() use WORK_CPU_UNBOUND (used when a cpu is not specified). The same applies to schedule_work() that is using system_wq and queue_work(), that makes use again of WORK_CPU_UNBOUND. This lack of consistentcy cannot be addressed without refactoring the API. This continues the effort to refactor workqueue APIs, which began with the introduction of new workqueues and a new alloc_workqueue flag in: commit `128ea9f6cc` ("workqueue: Add system_percpu_wq and system_dfl_wq") commit `930c2ea566` ("workqueue: Add new WQ_PERCPU flag") Switch to using system_dfl_wq because system_unbound_wq is going away as part of a workqueue restructuring. Suggested-by: Tejun Heo <tj@kernel.org> Signed-off-by: Marco Crivellari <marco.crivellari@suse.com> Link: https://patch.msgid.link/20251114141618.172154-2-marco.crivellari@suse.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2025-11-26 15:21:29 +01:00
Claudiu Beznea	f99508074e	PM: domains: Detach on device_unbind_cleanup() The dev_pm_domain_attach() function is typically used in bus code alongside dev_pm_domain_detach(), often following patterns like: static int bus_probe(struct device _dev) { struct bus_driver drv = to_bus_driver(dev->driver); struct bus_device dev = to_bus_device(_dev); int ret; // ... ret = dev_pm_domain_attach(_dev, true); if (ret) return ret; if (drv->probe) ret = drv->probe(dev); // ... } static void bus_remove(struct device _dev) { struct bus_driver drv = to_bus_driver(dev->driver); struct bus_device dev = to_bus_device(_dev); if (drv->remove) drv->remove(dev); dev_pm_domain_detach(_dev); } When the driver's probe function uses devres-managed resources that depend on the power domain state, those resources are released later during device_unbind_cleanup(). Releasing devres-managed resources that depend on the power domain state after detaching the device from its PM domain can cause failures. For example, if the driver uses devm_pm_runtime_enable() in its probe function, and the device's clocks are managed by the PM domain, then during removal the runtime PM is disabled in device_unbind_cleanup() after the clocks have been removed from the PM domain. It may happen that the devm_pm_runtime_enable() action causes the device to be runtime- resumed. If the driver specific runtime PM APIs access registers directly, this will lead to accessing device registers without clocks being enabled. Similar issues may occur with other devres actions that access device registers. Add detach_power_off member to struct dev_pm_info, to be used later in device_unbind_cleanup() as the power_off argument for dev_pm_domain_detach(). This is a preparatory step toward removing dev_pm_domain_detach() calls from bus remove functions. Since the current PM domain detach functions (genpd_dev_pm_detach() and acpi_dev_pm_detach()) already set dev->pm_domain = NULL, there should be no issues with bus drivers that still call dev_pm_domain_detach() in their remove functions. Signed-off-by: Claudiu Beznea <claudiu.beznea.uj@bp.renesas.com> Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org> Link: https://patch.msgid.link/20250703112708.1621607-3-claudiu.beznea.uj@bp.renesas.com Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2025-07-07 20:41:21 +02:00
Dmitry Torokhov	04d3e5461c	driver core: introduce device_set_driver() helper In preparation to closing a race when reading driver pointer in dev_uevent() code, instead of setting device->driver pointer directly introduce device_set_driver() helper. Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com> Reviewed-by: Masami Hiramatsu (Google) <mhiramat@kernel.org> Link: https://lore.kernel.org/r/20250311052417.1846985-2-dmitry.torokhov@gmail.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2025-04-15 17:04:35 +02:00
Linus Torvalds	e5f0e38e7e	Driver core update for 6.12-rc1 Here is a small set of patches for the driver core code for 6.12-rc1. This set is the one that caused the most delay on my side, due to lots of last-minute reports of problems in the async shutdown feature that was added. In the end, I've reverted all of the patches in that series so we are back to "normal" and the patch set is being reworked for the next merge window. Other than the async shutdown patches that were reverted, included in here are: - minor driver core cleanups - minor driver core bus and class api cleanups and simplifications for some callbacks - some const markings of structures - other even more minor cleanups All of these, including the last minute reverts, have been in linux-next, but all of the reports of problems in linux-next were before the reverts happened. After the reverts, all is good. Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> -----BEGIN PGP SIGNATURE----- iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCZvZefg8cZ3JlZ0Brcm9h aC5jb20ACgkQMUfUDdst+yldYACgzXnaSyQhjeac6posBqHmBIyNrmYAoM5PA+mn 1XZYdGpbOyecxsXP4wRn =cKIA -----END PGP SIGNATURE----- Merge tag 'driver-core-6.12-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core Pull driver core updates from Greg KH: "Here is a small set of patches for the driver core code for 6.12-rc1. This set is the one that caused the most delay on my side, due to lots of last-minute reports of problems in the async shutdown feature that was added. In the end, I've reverted all of the patches in that series so we are back to "normal" and the patch set is being reworked for the next merge window. Other than the async shutdown patches that were reverted, included in here are: - minor driver core cleanups - minor driver core bus and class api cleanups and simplifications for some callbacks - some const markings of structures - other even more minor cleanups All of these, including the last minute reverts, have been in linux-next, but all of the reports of problems in linux-next were before the reverts happened. After the reverts, all is good" * tag 'driver-core-6.12-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: (32 commits) Revert "driver core: don't always lock parent in shutdown" Revert "driver core: separate function to shutdown one device" Revert "driver core: shut down devices asynchronously" Revert "nvme-pci: Make driver prefer asynchronous shutdown" Revert "driver core: fix async device shutdown hang" driver core: fix async device shutdown hang driver core: attribute_container: Remove unused functions driver core: Trivially simplify ((struct device_private *)curr)->device->p to @curr devres: Correclty strip percpu address space of devm_free_percpu() argument driver core: Make parameter check consistent for API cluster device_(for_each\|find)_child() bus: fsl-mc: make fsl_mc_bus_type const nvme-pci: Make driver prefer asynchronous shutdown driver core: shut down devices asynchronously driver core: separate function to shutdown one device driver core: don't always lock parent in shutdown platform: Make platform_bus_type constant driver core: class: Check namespace relevant parameters in class_register() driver:base:core: Adding a "Return:" line in comment for device_link_add() drivers/base: Introduce device_match_t for device finding APIs firmware_loader: Block path traversal ...	2024-09-27 08:48:37 -07:00
Zijun Hu	efb0b309fa	driver core: Trivially simplify ((struct device_private )curr)->device->p to @curr Trivially simplify ((struct device_private )curr)->device->p to @curr in deferred_devs_show() since both are same. Signed-off-by: Zijun Hu <quic_zijuhu@quicinc.com> Link: https://lore.kernel.org/r/20240908-trivial_simpli-v1-1-53e0f1363299@quicinc.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2024-09-11 16:31:02 +02:00
Stephen Boyd	5ac7973032	platform: Add test managed platform_device/driver APIs Introduce KUnit resource wrappers around platform_driver_register(), platform_device_alloc(), and platform_device_add() so that test authors can register platform drivers/devices from their tests and have the drivers/devices automatically be unregistered when the test is done. This makes test setup code simpler when a platform driver or platform device is needed. Add a few test cases at the same time to make sure the APIs work as intended. Cc: Brendan Higgins <brendan.higgins@linux.dev> Reviewed-by: David Gow <davidgow@google.com> Cc: Rae Moar <rmoar@google.com> Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: "Rafael J. Wysocki" <rafael@kernel.org> Signed-off-by: Stephen Boyd <sboyd@kernel.org> Link: https://lore.kernel.org/r/20240718210513.3801024-6-sboyd@kernel.org	2024-07-29 15:33:12 -07:00
Greg Kroah-Hartman	269e974e66	driver core: make [device_]driver_attach take a const * Change device_driver_attach() and driver_attach() to take a const * to struct device driver as neither of them modify the structure at all. Also, for some odd reason, drivers/dma/idxd/compat.c had a duplicate external reference to device_driver_attach(), so remove that to fix up the build, it should never have had that there in the first place. Cc: Rafael J. Wysocki <rafael@kernel.org> Cc: Fenghua Yu <fenghua.yu@intel.com> Cc: Dave Jiang <dave.jiang@intel.com> Cc: Vinod Koul <vkoul@kernel.org> Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Cc: Petr Tesarik <petr.tesarik.ext@huawei.com> Cc: Alexander Lobakin <aleksander.lobakin@intel.com> Cc: dmaengine@vger.kernel.org Link: https://lore.kernel.org/r/2024061401-rasping-manger-c385@gregkh Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2024-06-20 12:51:42 +02:00
Greg Kroah-Hartman	c6c631d2b7	driver core: mark async_driver as a const * Within struct device_private, mark the async_driver * as const as it is never modified. This requires some internal-to-the-driver-core functions to also have their parameters marked as constant, and there is one place where we cast _back_ from the const pointer to a real one, as the driver core still wants to modify the structure in a number of remaining places. Cc: Rafael J. Wysocki <rafael@kernel.org> Link: https://lore.kernel.org/r/20240611130103.3262749-12-gregkh@linuxfoundation.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2024-06-13 16:44:00 +02:00
Greg Kroah-Hartman	f6e98ef5f7	driver core: make driver_detach() take a const * driver_detach() does not modify the driver itself, so make the pointer constant. In doing so, the function driver_allows_async_probing() also needs to be changed so that the pointer type passes through to that function properly. Cc: Rafael J. Wysocki <rafael@kernel.org> Link: https://lore.kernel.org/r/20240611130103.3262749-11-gregkh@linuxfoundation.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2024-06-13 16:43:54 +02:00
Greg Kroah-Hartman	33ebea9bc0	driver core: make device_release_driver_internal() take a const * Change device_release_driver_internal() to take a const struct device_driver * as it is not modifying it at all. Cc: Rafael J. Wysocki <rafael@kernel.org> Link: https://lore.kernel.org/r/20240611130103.3262749-10-gregkh@linuxfoundation.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2024-06-13 16:43:53 +02:00
Nícolas F. R. A. Prado	6aeb8850e0	device: core: Log warning for devices pending deferred probe on timeout Once the deferred probe timeout has elapsed it is very likely that the devices that are still deferring probe won't ever be probed. Therefore log the defer probe pending reason at the warning level instead to bring attention to the issue. Signed-off-by: "Nícolas F. R. A. Prado" <nfraprado@collabora.com> Link: https://lore.kernel.org/r/20240305-device-probe-error-v1-3-a06d8722bf19@collabora.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2024-03-07 22:10:31 +00:00
Nícolas F. R. A. Prado	448af2d288	driver: core: Use dev_* instead of pr_* so device metadata is added Use the dev_* instead of the pr_* functions to log the status of device probe so that the log message gets the device metadata attached to it. Signed-off-by: "Nícolas F. R. A. Prado" <nfraprado@collabora.com> Link: https://lore.kernel.org/r/20240305-device-probe-error-v1-2-a06d8722bf19@collabora.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2024-03-07 22:10:31 +00:00
Nícolas F. R. A. Prado	32de4b4f9d	driver: core: Log probe failure as error and with device metadata Drivers can return -ENODEV or -ENXIO from their probe to reject a device match, and return -EPROBE_DEFER if probe should be retried. Any other error code is not expected during normal behavior and indicates an issue occurred, so it should be logged at the error level. Also make use of the device variant, dev_err(), so that the device metadata is attached to the log message. Signed-off-by: "Nícolas F. R. A. Prado" <nfraprado@collabora.com> Link: https://lore.kernel.org/r/20240305-device-probe-error-v1-1-a06d8722bf19@collabora.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2024-03-07 22:10:31 +00:00
Uwe Kleine-König	7c41da586e	driver core: Emit reason for pending deferred probe Ending a boot log with platform 3f202000.mmc: deferred probe pending is already a nice hint about the problem. Sometimes there is a more detailed error indicator available, add that to the output. Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de> Link: https://lore.kernel.org/r/20231122093332.274145-2-u.kleine-koenig@pengutronix.de Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2023-12-07 11:35:26 +09:00
Saravana Kannan	2e84dc3792	driver core: Release all resources during unbind before updating device links This commit fixes a bug in commit `9ed9895370` ("driver core: Functional dependencies tracking support") where the device link status was incorrectly updated in the driver unbind path before all the device's resources were released. Fixes: `9ed9895370` ("driver core: Functional dependencies tracking support") Cc: stable <stable@kernel.org> Reported-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de> Closes: https://lore.kernel.org/all/20231014161721.f4iqyroddkcyoefo@pengutronix.de/ Signed-off-by: Saravana Kannan <saravanak@google.com> Cc: Thierry Reding <thierry.reding@gmail.com> Cc: Yang Yingliang <yangyingliang@huawei.com> Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Cc: Mark Brown <broonie@kernel.org> Cc: Matti Vaittinen <mazziesaccount@gmail.com> Cc: James Clark <james.clark@arm.com> Acked-by: "Rafael J. Wysocki" <rafael@kernel.org> Tested-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de> Acked-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de> Link: https://lore.kernel.org/r/20231018013851.3303928-1-saravanak@google.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2023-10-21 23:17:52 +02:00
Jason Gunthorpe	f429378a9b	driver core: Call dma_cleanup() on the test_remove path When test_remove is enabled really_probe() does not properly pair dma_configure() with dma_remove(), it will end up calling dma_configure() twice. This corrupts the owner_cnt and renders the group unusable with VFIO/etc. Add the missing cleanup before going back to re_probe. Fixes: `25f3bcfc54` ("driver core: Add dma_cleanup callback in bus_type") Reported-by: Zenghui Yu <yuzenghui@huawei.com> Tested-by: Zenghui Yu <yuzenghui@huawei.com> Closes: https://lore.kernel.org/all/6472f254-c3c4-8610-4a37-8d9dfdd54ce8@huawei.com/ Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Link: https://lore.kernel.org/r/0-v2-4deed94e283e+40948-really_probe_dma_cleanup_jgg@nvidia.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2023-08-05 08:31:41 +02:00
Christoph Hellwig	aa5f6ed8c2	driver core: return bool from driver_probe_done bool is the most sensible return value for a yes/no return. Also add __init as this funtion is only called from the early boot code. Signed-off-by: Christoph Hellwig <hch@lst.de> Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Link: https://lore.kernel.org/r/20230531125535.676098-2-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>	2023-06-05 10:55:20 -06:00
Stephen Boyd	e2f06aa885	driver core: Don't require dynamic_debug for initcall_debug probe timing Don't require the use of dynamic debug (or modification of the kernel to add a #define DEBUG to the top of this file) to get the printk message about driver probe timing. This printk is only emitted when initcall_debug is enabled on the kernel commandline, and it isn't immediately obvious that you have to do something else to debug boot timing issues related to driver probe. Add a comment too so it doesn't get converted back to pr_debug(). Fixes: `eb7fbc9fb1` ("driver core: Add missing '\n' in log messages") Cc: stable <stable@kernel.org> Cc: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Cc: Brian Norris <briannorris@chromium.org> Reviewed-by: Brian Norris <briannorris@chromium.org> Acked-by: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: Stephen Boyd <swboyd@chromium.org> Link: https://lore.kernel.org/r/20230412225842.3196599-1-swboyd@chromium.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2023-04-20 14:17:47 +02:00
Saravana Kannan	f8fb576658	driver core: Make state_synced device attribute writeable If the file is written to and sync_state() hasn't been called for the device yet, then call sync_state() for the device independent of the state of its consumers. This is useful for supplier devices that have one or more consumers that don't have a driver but the consumers are in a state that don't use the resources supplied by the supplier device. This gives finer grained control than using the fw_devlink.sync_state=timeout kernel commandline parameter. Signed-off-by: Saravana Kannan <saravanak@google.com> Link: https://lore.kernel.org/r/20230304005355.746421-3-saravanak@google.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2023-03-10 09:06:22 +01:00
Saravana Kannan	ffbe08a8e8	driver core: Add fw_devlink.sync_state command line param When all devices that could probe have finished probing (based on deferred_probe_timeout configuration or late_initcall() when !CONFIG_MODULES), this parameter controls what to do with devices that haven't yet received their sync_state() calls. fw_devlink.sync_state=strict is the default and the driver core will continue waiting on all consumers of a device to probe successfully before sync_state() is called for the device. This is the default behavior since calling sync_state() on a device when all its consumers haven't probed could make some systems unusable/unstable. When this option is selected, we also print the list of devices that haven't had sync_state() called on them by the time all devices the could probe have finished probing. fw_devlink.sync_state=timeout will cause the driver core to give up waiting on consumers and call sync_state() on any devices that haven't yet received their sync_state() calls. This option is provided for systems that won't become unusable/unstable as they might be able to save power (depends on state of hardware before kernel starts) if all devices get their sync_state(). Signed-off-by: Saravana Kannan <saravanak@google.com> Link: https://lore.kernel.org/r/20230304005355.746421-2-saravanak@google.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2023-03-10 09:06:21 +01:00
Greg Kroah-Hartman	36c893d3a7	drivers: base: dd: fix memory leak with using debugfs_lookup() When calling debugfs_lookup() the result must have dput() called on it, otherwise the memory will leak over time. To make things simpler, just call debugfs_lookup_and_remove() instead which handles all of the logic at once. Cc: "Rafael J. Wysocki" <rafael@kernel.org> Link: https://lore.kernel.org/r/20230202141621.2296458-2-gregkh@linuxfoundation.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2023-02-08 13:33:13 +01:00
Greg Kroah-Hartman	ed9f918174	driver core: bus: move bus notifier logic into bus.c The logic to touch the bus notifier was open-coded in numberous places in the driver core. Clean that up by creating a local bus_notify() function and have everyone call this function instead, making the reading of the caller code simpler and easier to maintain over time. Reviewed-by: Rafael J. Wysocki <rafael@kernel.org> Link: https://lore.kernel.org/r/20230111092331.3946745-2-gregkh@linuxfoundation.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2023-01-18 09:00:48 +01:00
Javier Martinez Canillas	504fa212d7	driver core: Make driver_deferred_probe_timeout a static variable It is not used outside of its compilation unit, so there's no need to export this variable. Signed-off-by: Javier Martinez Canillas <javierm@redhat.com> Reviewed-by: Andrew Halaney <ahalaney@redhat.com> Acked-by: John Stultz <jstultz@google.com> Link: https://lore.kernel.org/r/20221227232152.3094584-1-javierm@redhat.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2023-01-11 16:06:40 +01:00
Isaac J. Manjarres	27c0d21734	driver core: Fix bus_type.match() error handling in __driver_attach() When a driver registers with a bus, it will attempt to match with every device on the bus through the __driver_attach() function. Currently, if the bus_type.match() function encounters an error that is not -EPROBE_DEFER, __driver_attach() will return a negative error code, which causes the driver registration logic to stop trying to match with the remaining devices on the bus. This behavior is not correct; a failure while matching a driver to a device does not mean that the driver won't be able to match and bind with other devices on the bus. Update the logic in __driver_attach() to reflect this. Fixes: `656b8035b0` ("ARM: 8524/1: driver cohandle -EPROBE_DEFER from bus_type.match()") Cc: stable@vger.kernel.org Cc: Saravana Kannan <saravanak@google.com> Signed-off-by: Isaac J. Manjarres <isaacmanjarres@google.com> Link: https://lore.kernel.org/r/20220921001414.4046492-1-isaacmanjarres@google.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2022-11-10 18:36:04 +01:00
Christoph Hellwig	189a87f8ef	driver core: mark driver_allows_async_probing static driver_allows_async_probing is only used in drivers/base/dd.c, so mark it static and remove the declaration in drivers/base/base.h. Signed-off-by: Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20221030092255.872280-1-hch@lst.de Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2022-11-10 18:31:04 +01:00
Greg Kroah-Hartman	a791dc1353	Linux 6.0-rc5 -----BEGIN PGP SIGNATURE----- iQFSBAABCAA8FiEEq68RxlopcLEwq+PEeb4+QwBBGIYFAmMeQ2keHHRvcnZhbGRz QGxpbnV4LWZvdW5kYXRpb24ub3JnAAoJEHm+PkMAQRiGYRMH+gLNHiGirGZlm2GQ tKaZQUy7MiXuIP0hGDonDIIIAmIVhnjm9MDG8KT4W8AvEd7ukncyYqJfwWeWQPhP 4mZcf6l3Z8Ke+qiaFpXpMPCxTyWcln1ox0EoNx2g9gdPxZntaRuuaTQVljUfTiey aVPHxve8ip3G7jDoJnuLSxESOqWxkb8v/SshBP1E5bF5BZ+cgZRqq7FNigFqxjbk wF29K09BVOPjdgkSvY/b0/SnL5KlSdMAv+FrPcJNGivcdIPgf/qJks5cI2HRUo7o CpKgbcLorCVyD+d+zLonJBwIy3arbmKD8JqYnfdTSIqVOUqHXWUDfeydsH32u1Gu lPSI2Hw= =7LTL -----END PGP SIGNATURE----- Merge 6.0-rc5 into driver-core-next We need the driver core and debugfs changes in this branch. Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2022-09-12 16:51:22 +02:00
Wolfram Sang	07b7b883be	driver_core: move from strlcpy with unused retval to strscpy Follow the advice of the below link and prefer 'strscpy' in this subsystem. Conversion is 1:1 because the return value is not used. Generated by a coccinelle script. Link: https://lore.kernel.org/r/CAHk-=wgfRnXz0W3D37d01q3JFkr_i_uTL=V6A6G1oUZcprmknw@mail.gmail.com/ Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com> Link: https://lore.kernel.org/r/20220818205956.6528-1-wsa+renesas@sang-engineering.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2022-09-01 18:15:32 +02:00
Isaac J. Manjarres	25e9fbf0fd	driver core: Don't probe devices after bus_type.match() probe deferral Both __device_attach_driver() and __driver_attach() check the return code of the bus_type.match() function to see if the device needs to be added to the deferred probe list. After adding the device to the list, the logic attempts to bind the device to the driver anyway, as if the device had matched with the driver, which is not correct. If __device_attach_driver() detects that the device in question is not ready to match with a driver on the bus, then it doesn't make sense for the device to attempt to bind with the current driver or continue attempting to match with any of the other drivers on the bus. So, update the logic in __device_attach_driver() to reflect this. If __driver_attach() detects that a driver tried to match with a device that is not ready to match yet, then the driver should not attempt to bind with the device. However, the driver can still attempt to match and bind with other devices on the bus, as drivers can be bound to multiple devices. So, update the logic in __driver_attach() to reflect this. Fixes: `656b8035b0` ("ARM: 8524/1: driver cohandle -EPROBE_DEFER from bus_type.match()") Cc: stable@vger.kernel.org Cc: Saravana Kannan <saravanak@google.com> Reported-by: Guenter Roeck <linux@roeck-us.net> Tested-by: Guenter Roeck <linux@roeck-us.net> Tested-by: Linus Walleij <linus.walleij@linaro.org> Reviewed-by: Saravana Kannan <saravanak@google.com> Signed-off-by: Isaac J. Manjarres <isaacmanjarres@google.com> Link: https://lore.kernel.org/r/20220817184026.3468620-1-isaacmanjarres@google.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2022-09-01 15:57:13 +02:00
Saravana Kannan	13a8e0f6b0	Revert "driver core: Delete driver_deferred_probe_check_state()" This reverts commit `9cbffc7a59`. There are a few more issues to fix that have been reported in the thread for the original series [1]. We'll need to fix those before this will work. So, revert it for now. [1] - https://lore.kernel.org/lkml/20220601070707.3946847-1-saravanak@google.com/ Fixes: `9cbffc7a59` ("driver core: Delete driver_deferred_probe_check_state()") Tested-by: Tony Lindgren <tony@atomide.com> Tested-by: Peng Fan <peng.fan@nxp.com> Tested-by: Douglas Anderson <dianders@chromium.org> Tested-by: Alexander Stein <alexander.stein@ew.tq-group.com> Reviewed-by: Tony Lindgren <tony@atomide.com> Signed-off-by: Saravana Kannan <saravanak@google.com> Link: https://lore.kernel.org/r/20220819221616.2107893-2-saravanak@google.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2022-08-23 13:14:02 +02:00
Zhang Wensheng	70fe758352	driver core: fix potential deadlock in __driver_attach In __driver_attach function, There are also AA deadlock problem, like the commit `b232b02bf3` ("driver core: fix deadlock in __device_attach"). stack like commit `b232b02bf3` ("driver core: fix deadlock in __device_attach"). list below: In __driver_attach function, The lock holding logic is as follows: ... __driver_attach if (driver_allows_async_probing(drv)) device_lock(dev) // get lock dev async_schedule_dev(__driver_attach_async_helper, dev); // func async_schedule_node async_schedule_node_domain(func) entry = kzalloc(sizeof(struct async_entry), GFP_ATOMIC); /* when fail or work limit, sync to execute func, but __driver_attach_async_helper will get lock dev as will, which will lead to A-A deadlock. */ if (!entry \|\| atomic_read(&entry_count) > MAX_WORK) { func; else queue_work_node(node, system_unbound_wq, &entry->work) device_unlock(dev) As above show, when it is allowed to do async probes, because of out of memory or work limit, async work is not be allowed, to do sync execute instead. it will lead to A-A deadlock because of __driver_attach_async_helper getting lock dev. Reproduce: and it can be reproduce by make the condition (if (!entry \|\| atomic_read(&entry_count) > MAX_WORK)) untenable, like below: [ 370.785650] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [ 370.787154] task:swapper/0 state:D stack: 0 pid: 1 ppid: 0 flags:0x00004000 [ 370.788865] Call Trace: [ 370.789374] <TASK> [ 370.789841] __schedule+0x482/0x1050 [ 370.790613] schedule+0x92/0x1a0 [ 370.791290] schedule_preempt_disabled+0x2c/0x50 [ 370.792256] __mutex_lock.isra.0+0x757/0xec0 [ 370.793158] __mutex_lock_slowpath+0x1f/0x30 [ 370.794079] mutex_lock+0x50/0x60 [ 370.794795] __device_driver_lock+0x2f/0x70 [ 370.795677] ? driver_probe_device+0xd0/0xd0 [ 370.796576] __driver_attach_async_helper+0x1d/0xd0 [ 370.797318] ? driver_probe_device+0xd0/0xd0 [ 370.797957] async_schedule_node_domain+0xa5/0xc0 [ 370.798652] async_schedule_node+0x19/0x30 [ 370.799243] __driver_attach+0x246/0x290 [ 370.799828] ? driver_allows_async_probing+0xa0/0xa0 [ 370.800548] bus_for_each_dev+0x9d/0x130 [ 370.801132] driver_attach+0x22/0x30 [ 370.801666] bus_add_driver+0x290/0x340 [ 370.802246] driver_register+0x88/0x140 [ 370.802817] ? virtio_scsi_init+0x116/0x116 [ 370.803425] scsi_register_driver+0x1a/0x30 [ 370.804057] init_sd+0x184/0x226 [ 370.804533] do_one_initcall+0x71/0x3a0 [ 370.805107] kernel_init_freeable+0x39a/0x43a [ 370.805759] ? rest_init+0x150/0x150 [ 370.806283] kernel_init+0x26/0x230 [ 370.806799] ret_from_fork+0x1f/0x30 To fix the deadlock, move the async_schedule_dev outside device_lock, as we can see, in async_schedule_node_domain, the parameter of queue_work_node is system_unbound_wq, so it can accept concurrent operations. which will also not change the code logic, and will not lead to deadlock. Fixes: `ef0ff68351` ("driver core: Probe devices asynchronously instead of the driver") Signed-off-by: Zhang Wensheng <zhangwensheng5@huawei.com> Link: https://lore.kernel.org/r/20220622074327.497102-1-zhangwensheng5@huawei.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2022-06-27 16:43:51 +02:00
Saravana Kannan	9cbffc7a59	driver core: Delete driver_deferred_probe_check_state() The function is no longer used. So delete it. Tested-by: Geert Uytterhoeven <geert+renesas@glider.be> Signed-off-by: Saravana Kannan <saravanak@google.com> Link: https://lore.kernel.org/r/20220601070707.3946847-10-saravanak@google.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2022-06-10 15:57:54 +02:00
Saravana Kannan	f516d01b9d	Revert "driver core: Set default deferred_probe_timeout back to 0." This reverts commit 11f7e7ef553b6b93ac1aa74a3c2011b9cc8aeb61. Let's take another shot at getting deferred_probe_timeout=10 to work. Tested-by: Geert Uytterhoeven <geert+renesas@glider.be> Signed-off-by: Saravana Kannan <saravanak@google.com> Link: https://lore.kernel.org/r/20220601070707.3946847-7-saravanak@google.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2022-06-10 15:57:54 +02:00
Saravana Kannan	2f8c3ae828	driver core: Add wait_for_init_devices_probe helper function Some devices might need to be probed and bound successfully before the kernel boot sequence can finish and move on to init/userspace. For example, a network interface might need to be bound to be able to mount a NFS rootfs. With fw_devlink=on by default, some of these devices might be blocked from probing because they are waiting on a optional supplier that doesn't have a driver. While fw_devlink will eventually identify such devices and unblock the probing automatically, it might be too late by the time it unblocks the probing of devices. For example, the IP4 autoconfig might timeout before fw_devlink unblocks probing of the network interface. This function is available to temporarily try and probe all devices that have a driver even if some of their suppliers haven't been added or don't have drivers. The drivers can then decide which of the suppliers are optional vs mandatory and probe the device if possible. By the time this function returns, all such "best effort" probes are guaranteed to be completed. If a device successfully probes in this mode, we delete all fw_devlink discovered dependencies of that device where the supplier hasn't yet probed successfully because they have to be optional dependencies. This also means that some devices that aren't needed for init and could have waited for their optional supplier to probe (when the supplier's module is loaded later on) would end up probing prematurely with limited functionality. So call this function only when boot would fail without it. Tested-by: Geert Uytterhoeven <geert+renesas@glider.be> Signed-off-by: Saravana Kannan <saravanak@google.com> Link: https://lore.kernel.org/r/20220601070707.3946847-5-saravanak@google.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2022-06-10 15:57:54 +02:00
Saravana Kannan	9be4cbd09d	driver core: Set default deferred_probe_timeout back to 0. Since we had to effectively reverted commit `35a672363a` ("driver core: Ensure wait_for_device_probe() waits until the deferred_probe_timeout fires") in an earlier patch, a non-zero deferred_probe_timeout will break NFS rootfs mounting [1] again. So, set the default back to zero until we can fix that. [1] - https://lore.kernel.org/lkml/TYAPR01MB45443DF63B9EF29054F7C41FD8C60@TYAPR01MB4544.jpnprd01.prod.outlook.com/ Fixes: `2b28a1a84a` ("driver core: Extend deferred probe timeout on driver registration") Cc: Mark Brown <broonie@kernel.org> Cc: Rob Herring <robh@kernel.org> Reported-by: Nathan Chancellor <nathan@kernel.org> Reported-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Tested-by: Geert Uytterhoeven <geert+renesas@glider.be> Signed-off-by: Saravana Kannan <saravanak@google.com> Link: https://lore.kernel.org/r/20220526034609.480766-3-saravanak@google.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2022-06-03 11:58:54 -07:00
Saravana Kannan	5ee76c256e	driver core: Fix wait_for_device_probe() & deferred_probe_timeout interaction Mounting NFS rootfs was timing out when deferred_probe_timeout was non-zero [1]. This was because ip_auto_config() initcall times out waiting for the network interfaces to show up when deferred_probe_timeout was non-zero. While ip_auto_config() calls wait_for_device_probe() to make sure any currently running deferred probe work or asynchronous probe finishes, that wasn't sufficient to account for devices being deferred until deferred_probe_timeout. Commit `35a672363a` ("driver core: Ensure wait_for_device_probe() waits until the deferred_probe_timeout fires") tried to fix that by making sure wait_for_device_probe() waits for deferred_probe_timeout to expire before returning. However, if wait_for_device_probe() is called from the kernel_init() context: - Before deferred_probe_initcall() [2], it causes the boot process to hang due to a deadlock. - After deferred_probe_initcall() [3], it blocks kernel_init() from continuing till deferred_probe_timeout expires and beats the point of deferred_probe_timeout that's trying to wait for userspace to load modules. Neither of this is good. So revert the changes to wait_for_device_probe(). [1] - https://lore.kernel.org/lkml/TYAPR01MB45443DF63B9EF29054F7C41FD8C60@TYAPR01MB4544.jpnprd01.prod.outlook.com/ [2] - https://lore.kernel.org/lkml/YowHNo4sBjr9ijZr@dev-arch.thelio-3990X/ [3] - https://lore.kernel.org/lkml/Yo3WvGnNk3LvLb7R@linutronix.de/ Fixes: `35a672363a` ("driver core: Ensure wait_for_device_probe() waits until the deferred_probe_timeout fires") Cc: John Stultz <jstultz@google.com> Cc: "David S. Miller" <davem@davemloft.net> Cc: Alexey Kuznetsov <kuznet@ms2.inr.ac.ru> Cc: Hideaki YOSHIFUJI <yoshfuji@linux-ipv6.org> Cc: Jakub Kicinski <kuba@kernel.org> Cc: Rob Herring <robh@kernel.org> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com> Cc: Robin Murphy <robin.murphy@arm.com> Cc: Andy Shevchenko <andy.shevchenko@gmail.com> Cc: Sudeep Holla <sudeep.holla@arm.com> Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Cc: Naresh Kamboju <naresh.kamboju@linaro.org> Cc: Basil Eljuse <Basil.Eljuse@arm.com> Cc: Ferry Toth <fntoth@gmail.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Anders Roxell <anders.roxell@linaro.org> Cc: linux-pm@vger.kernel.org Reported-by: Nathan Chancellor <nathan@kernel.org> Reported-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Tested-by: Geert Uytterhoeven <geert+renesas@glider.be> Acked-by: John Stultz <jstultz@google.com> Signed-off-by: Saravana Kannan <saravanak@google.com> Link: https://lore.kernel.org/r/20220526034609.480766-2-saravanak@google.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Reviewed-by: Rafael J. Wysocki <rafael@kernel.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2022-06-03 11:58:54 -07:00
Linus Torvalds	500a434fc5	Driver core changes for 5.19-rc1 Here is the set of driver core changes for 5.19-rc1. Note, I'm not really happy with this pull request as-is, see below for details, but overall this is all good for everything but a small set of systems, which we have a fix for already. Lots of tiny driver core changes and cleanups happened this cycle, but the two major things were: - firmware_loader reorganization and additions including the ability to have XZ compressed firmware images and the ability for userspace to initiate the firmware load when it needs to, instead of being always initiated by the kernel. FPGA devices specifically want this ability to have their firmware changed over the lifetime of the system boot, and this allows them to work without having to come up with yet-another-custom-uapi interface for loading firmware for them. - physical location support added to sysfs so that devices that know this information, can tell userspace where they are located in a common way. Some ACPI devices already support this today, and more bus types should support this in the future. Smaller changes included: - driver_override api cleanups and fixes - error path cleanups and fixes - get_abi script fixes - deferred probe timeout changes. It's that last change that I'm the most worried about. It has been reported to cause boot problems for a number of systems, and I have a tested patch series that resolves this issue. But I didn't get it merged into my tree before 5.18-final came out, so it has not gotten any linux-next testing. I'll send the fixup patches (there are 2) as a follow-on series to this pull request if you want to take them directly, _OR_ I can just revert the probe timeout changes and they can wait for the next -rc1 merge cycle. Given that the fixes are tested, and pretty simple, I'm leaning toward that choice. Sorry this all came at the end of the merge window, I should have resolved this all 2 weeks ago, that's my fault as it was in the middle of some travel for me. All have been tested in linux-next for weeks, with no reported issues other than the above-mentioned boot time outs. Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> -----BEGIN PGP SIGNATURE----- iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCYpnv/A8cZ3JlZ0Brcm9h aC5jb20ACgkQMUfUDdst+yk/fACgvmenbo5HipqyHnOmTQlT50xQ9EYAn2eTq6ai GkjLXBGNWOPBa5cU52qf =yEi/ -----END PGP SIGNATURE----- Merge tag 'driver-core-5.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core Pull driver core updates from Greg KH: "Here is the set of driver core changes for 5.19-rc1. Lots of tiny driver core changes and cleanups happened this cycle, but the two major things are: - firmware_loader reorganization and additions including the ability to have XZ compressed firmware images and the ability for userspace to initiate the firmware load when it needs to, instead of being always initiated by the kernel. FPGA devices specifically want this ability to have their firmware changed over the lifetime of the system boot, and this allows them to work without having to come up with yet-another-custom-uapi interface for loading firmware for them. - physical location support added to sysfs so that devices that know this information, can tell userspace where they are located in a common way. Some ACPI devices already support this today, and more bus types should support this in the future. Smaller changes include: - driver_override api cleanups and fixes - error path cleanups and fixes - get_abi script fixes - deferred probe timeout changes. It's that last change that I'm the most worried about. It has been reported to cause boot problems for a number of systems, and I have a tested patch series that resolves this issue. But I didn't get it merged into my tree before 5.18-final came out, so it has not gotten any linux-next testing. I'll send the fixup patches (there are 2) as a follow-on series to this pull request. All have been tested in linux-next for weeks, with no reported issues other than the above-mentioned boot time-outs" * tag 'driver-core-5.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: (55 commits) driver core: fix deadlock in __device_attach kernfs: Separate kernfs_pr_cont_buf and rename_lock. topology: Remove unused cpu_cluster_mask() driver core: Extend deferred probe timeout on driver registration MAINTAINERS: add Russ Weight as a firmware loader maintainer driver: base: fix UAF when driver_attach failed test_firmware: fix end of loop test in upload_read_show() driver core: location: Add "back" as a possible output for panel driver core: location: Free struct acpi_pld_info pld driver core: Add "" wildcard support to driver_async_probe cmdline param driver core: location: Check for allocations failure arch_topology: Trace the update thermal pressure kernfs: Rename kernfs_put_open_node to kernfs_unlink_open_file. export: fix string handling of namespace in EXPORT_SYMBOL_NS rpmsg: use local 'dev' variable rpmsg: Fix calling device_lock() on non-initialized device firmware_loader: describe 'module' parameter of firmware_upload_register() firmware_loader: Move definitions from sysfs_upload.h to sysfs.h firmware_loader: Fix configs for sysfs split selftests: firmware: Add firmware upload selftests ...	2022-06-03 11:48:47 -07:00
Zhang Wensheng	b232b02bf3	driver core: fix deadlock in __device_attach In __device_attach function, The lock holding logic is as follows: ... __device_attach device_lock(dev) // get lock dev async_schedule_dev(__device_attach_async_helper, dev); // func async_schedule_node async_schedule_node_domain(func) entry = kzalloc(sizeof(struct async_entry), GFP_ATOMIC); /* when fail or work limit, sync to execute func, but __device_attach_async_helper will get lock dev as well, which will lead to A-A deadlock. */ if (!entry \|\| atomic_read(&entry_count) > MAX_WORK) { func; else queue_work_node(node, system_unbound_wq, &entry->work) device_unlock(dev) As shown above, when it is allowed to do async probes, because of out of memory or work limit, async work is not allowed, to do sync execute instead. it will lead to A-A deadlock because of __device_attach_async_helper getting lock dev. To fix the deadlock, move the async_schedule_dev outside device_lock, as we can see, in async_schedule_node_domain, the parameter of queue_work_node is system_unbound_wq, so it can accept concurrent operations. which will also not change the code logic, and will not lead to deadlock. Fixes: `765230b5f0` ("driver-core: add asynchronous probing support for drivers") Signed-off-by: Zhang Wensheng <zhangwensheng5@huawei.com> Link: https://lore.kernel.org/r/20220518074516.1225580-1-zhangwensheng5@huawei.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2022-05-19 19:37:08 +02:00
Saravana Kannan	2b28a1a84a	driver core: Extend deferred probe timeout on driver registration The deferred probe timer that's used for this currently starts at late_initcall and runs for driver_deferred_probe_timeout seconds. The assumption being that all available drivers would be loaded and registered before the timer expires. This means, the driver_deferred_probe_timeout has to be pretty large for it to cover the worst case. But if we set the default value for it to cover the worst case, it would significantly slow down the average case. For this reason, the default value is set to 0. Also, with CONFIG_MODULES=y and the current default values of driver_deferred_probe_timeout=0 and fw_devlink=on, devices with missing drivers will cause their consumer devices to always defer their probes. This is because device links created by fw_devlink defer the probe even before the consumer driver's probe() is called. Instead of a fixed timeout, if we extend an unexpired deferred probe timer on every successful driver registration, with the expectation more modules would be loaded in the near future, then the default value of driver_deferred_probe_timeout only needs to be as long as the worst case time difference between two consecutive module loads. So let's implement that and set the default value to 10 seconds when CONFIG_MODULES=y. Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net> Cc: Rob Herring <robh@kernel.org> Cc: Linus Walleij <linus.walleij@linaro.org> Cc: Will Deacon <will@kernel.org> Cc: Ulf Hansson <ulf.hansson@linaro.org> Cc: Kevin Hilman <khilman@kernel.org> Cc: Thierry Reding <treding@nvidia.com> Cc: Mark Brown <broonie@kernel.org> Cc: Pavel Machek <pavel@ucw.cz> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com> Cc: Paul Kocialkowski <paul.kocialkowski@bootlin.com> Cc: linux-gpio@vger.kernel.org Cc: linux-pm@vger.kernel.org Cc: iommu@lists.linux-foundation.org Reviewed-by: Mark Brown <broonie@kernel.org> Acked-by: Rob Herring <robh@kernel.org> Signed-off-by: Saravana Kannan <saravanak@google.com> Link: https://lore.kernel.org/r/20220429220933.1350374-1-saravanak@google.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2022-05-19 19:32:33 +02:00
Saravana Kannan	f79f662e4c	driver core: Add "" wildcard support to driver_async_probe cmdline param There's currently no way to use driver_async_probe kernel cmdline param to enable default async probe for all drivers. So, add support for "" to match with all driver names. When "" is used, all other drivers listed in driver_async_probe are drivers that will NOT match the "". For example: * driver_async_probe=drvA,drvB,drvC drvA, drvB and drvC do asynchronous probing. * driver_async_probe=* All drivers do asynchronous probing except those that have set PROBE_FORCE_SYNCHRONOUS flag. * driver_async_probe=*,drvA,drvB,drvC All drivers do asynchronous probing except drvA, drvB, drvC and those that have set PROBE_FORCE_SYNCHRONOUS flag. Cc: Alexander Duyck <alexander.h.duyck@linux.intel.com> Cc: Randy Dunlap <rdunlap@infradead.org> Cc: Feng Tang <feng.tang@intel.com> Signed-off-by: Saravana Kannan <saravanak@google.com> Link: https://lore.kernel.org/r/20220504005344.117803-1-saravanak@google.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2022-05-19 19:28:24 +02:00
Greg Kroah-Hartman	0e509f537f	Merge 5.18-rc5 into driver-core-next We need the kernfs/driver core fixes in here as well. Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2022-05-02 13:56:48 +02:00

1 2 3 4 5 ...

271 Commits