mirror of
https://github.com/torvalds/linux.git
synced 2026-05-18 20:03:14 +02:00
While running reboot tests w/ a specific set of USB devices (and slub_debug enabled), I found that once every few hours my device would be crashed with a stack that looked like this: [ 14.012445] BUG: spinlock bad magic on CPU#0, modprobe/2091 [ 14.012460] lock: 0xffffffc0cb055978, .magic: ffffffc0, .owner: cryption contexts: %lu/%lu [ 14.012460] /1025536097, .owner_cpu: 0 [ 14.012466] CPU: 0 PID: 2091 Comm: modprobe Not tainted 4.4.79 #352 [ 14.012468] Hardware name: Google Kevin (DT) [ 14.012471] Call trace: [ 14.012483] [<....>] dump_backtrace+0x0/0x160 [ 14.012487] [<....>] show_stack+0x20/0x28 [ 14.012494] [<....>] dump_stack+0xb4/0xf0 [ 14.012500] [<....>] spin_dump+0x8c/0x98 [ 14.012504] [<....>] spin_bug+0x30/0x3c [ 14.012508] [<....>] do_raw_spin_lock+0x40/0x164 [ 14.012515] [<....>] _raw_spin_lock_irqsave+0x64/0x74 [ 14.012521] [<....>] __wake_up+0x2c/0x60 [ 14.012528] [<....>] async_completed+0x2d0/0x300 [ 14.012534] [<....>] __usb_hcd_giveback_urb+0xc4/0x138 [ 14.012538] [<....>] usb_hcd_giveback_urb+0x54/0xf0 [ 14.012544] [<....>] xhci_irq+0x1314/0x1348 [ 14.012548] [<....>] usb_hcd_irq+0x40/0x50 [ 14.012553] [<....>] handle_irq_event_percpu+0x1b4/0x3f0 [ 14.012556] [<....>] handle_irq_event+0x4c/0x7c [ 14.012561] [<....>] handle_fasteoi_irq+0x158/0x1c8 [ 14.012564] [<....>] generic_handle_irq+0x30/0x44 [ 14.012568] [<....>] __handle_domain_irq+0x90/0xbc [ 14.012572] [<....>] gic_handle_irq+0xcc/0x18c Investigation using kgdb() found that the wait queue that was passed into wake_up() had been freed (it was filled with slub_debug poison). I analyzed and instrumented the code and reproduced. My current belief is that this is happening: 1. async_completed() is called (from IRQ). Moves "as" onto the completed list. 2. On another CPU, proc_reapurbnonblock_compat() calls async_getcompleted(). Blocks on spinlock. 3. async_completed() releases the lock; keeps running; gets blocked midway through wake_up(). 4. proc_reapurbnonblock_compat() => async_getcompleted() gets the lock; removes "as" from completed list and frees it. 5. usbdev_release() is called. Frees "ps". 6. async_completed() finally continues running wake_up(). ...but wake_up() has a pointer to the freed "ps". The instrumentation that led me to believe this was based on adding some trace_printk() calls in a select few functions and then using kdb's "ftdump" at crash time. The trace follows (NOTE: in the trace below I cheated a little bit and added a udelay(1000) in async_completed() after releasing the spinlock because I wanted it to trigger quicker): <...>-2104 0d.h2 13759034us!: async_completed at start: as=ffffffc0cc638200 mtpd-2055 3.... 13759356us : async_getcompleted before spin_lock_irqsave mtpd-2055 3d..1 13759362us : async_getcompleted after list_del_init: as=ffffffc0cc638200 mtpd-2055 3.... 13759371us+: proc_reapurbnonblock_compat: free_async(ffffffc0cc638200) mtpd-2055 3.... 13759422us+: async_getcompleted before spin_lock_irqsave mtpd-2055 3.... 13759479us : usbdev_release at start: ps=ffffffc0cc042080 mtpd-2055 3.... 13759487us : async_getcompleted before spin_lock_irqsave mtpd-2055 3.... 13759497us!: usbdev_release after kfree(ps): ps=ffffffc0cc042080 <...>-2104 0d.h2 13760294us : async_completed before wake_up(): as=ffffffc0cc638200 To fix this problem we can just move the wake_up() under the ps->lock. There should be no issues there that I'm aware of. Signed-off-by: Douglas Anderson <dianders@chromium.org> Acked-by: Alan Stern <stern@rowland.harvard.edu> Cc: stable <stable@vger.kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> |
||
|---|---|---|
| .. | ||
| accessibility | ||
| acpi | ||
| amba | ||
| android | ||
| ata | ||
| atm | ||
| auxdisplay | ||
| base | ||
| bcma | ||
| block | ||
| bluetooth | ||
| bus | ||
| cdrom | ||
| char | ||
| clk | ||
| clocksource | ||
| connector | ||
| cpufreq | ||
| cpuidle | ||
| crypto | ||
| dax | ||
| dca | ||
| devfreq | ||
| dio | ||
| dma | ||
| dma-buf | ||
| edac | ||
| eisa | ||
| extcon | ||
| firewire | ||
| firmware | ||
| fmc | ||
| fpga | ||
| fsi | ||
| gpio | ||
| gpu | ||
| hid | ||
| hsi | ||
| hv | ||
| hwmon | ||
| hwspinlock | ||
| hwtracing | ||
| i2c | ||
| ide | ||
| idle | ||
| iio | ||
| infiniband | ||
| input | ||
| iommu | ||
| ipack | ||
| irqchip | ||
| isdn | ||
| leds | ||
| lguest | ||
| lightnvm | ||
| macintosh | ||
| mailbox | ||
| mcb | ||
| md | ||
| media | ||
| memory | ||
| memstick | ||
| message | ||
| mfd | ||
| misc | ||
| mmc | ||
| mtd | ||
| mux | ||
| net | ||
| nfc | ||
| ntb | ||
| nubus | ||
| nvdimm | ||
| nvme | ||
| nvmem | ||
| of | ||
| oprofile | ||
| parisc | ||
| parport | ||
| pci | ||
| pcmcia | ||
| perf | ||
| phy | ||
| pinctrl | ||
| platform | ||
| pnp | ||
| power | ||
| powercap | ||
| pps | ||
| ps3 | ||
| ptp | ||
| pwm | ||
| rapidio | ||
| ras | ||
| regulator | ||
| remoteproc | ||
| reset | ||
| rpmsg | ||
| rtc | ||
| s390 | ||
| sbus | ||
| scsi | ||
| sfi | ||
| sh | ||
| sn | ||
| soc | ||
| spi | ||
| spmi | ||
| ssb | ||
| staging | ||
| target | ||
| tc | ||
| tee | ||
| thermal | ||
| thunderbolt | ||
| tty | ||
| uio | ||
| usb | ||
| uwb | ||
| vfio | ||
| vhost | ||
| video | ||
| virt | ||
| virtio | ||
| vlynq | ||
| vme | ||
| w1 | ||
| watchdog | ||
| xen | ||
| zorro | ||
| Kconfig | ||
| Makefile | ||