linux

mirror of https://github.com/torvalds/linux.git synced 2026-06-08 14:42:37 +02:00

Author	SHA1	Message	Date
Suganath Prabu S	6eddf5c993	scsi: mpt3sas: Unblock device after controller reset commit `7ff723ad0f` upstream. While issuing any ATA passthrough command to firmware the driver will block the device. But it will unblock the device only if the I/O completes through the ISR path. If a controller reset occurs before command completion the device will remain in blocked state. Make sure we unblock the device following a controller reset if an ATA passthrough command was queued. [mkp: clarified patch description] Fixes: ac6c2a93bd07 ("mpt3sas: Fix for SATA drive in blocked state, after diag reset") Signed-off-by: Suganath Prabu S <suganath-prabu.subramani@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2016-12-02 09:09:02 +01:00
Andrey Grodzovsky	ffffc1ed47	scsi: mpt3sas: Fix secure erase premature termination commit `18f6084a98` upstream. This is a work around for a bug with LSI Fusion MPT SAS2 when perfoming secure erase. Due to the very long time the operation takes, commands issued during the erase will time out and will trigger execution of the abort hook. Even though the abort hook is called for the specific command which timed out, this leads to entire device halt (scsi_state terminated) and premature termination of the secure erase. Set device state to busy while ATA passthrough commands are in progress. [mkp: hand applied to 4.9/scsi-fixes, tweaked patch description] Signed-off-by: Andrey Grodzovsky <andrey2805@gmail.com> Acked-by: Sreekanth Reddy <Sreekanth.Reddy@broadcom.com> Cc: <linux-scsi@vger.kernel.org> Cc: Sathya Prakash <sathya.prakash@broadcom.com> Cc: Chaitra P B <chaitra.basappa@broadcom.com> Cc: Suganath Prabu Subramani <suganath-prabu.subramani@broadcom.com> Cc: Sreekanth Reddy <Sreekanth.Reddy@broadcom.com> Cc: Hannes Reinecke <hare@suse.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2016-12-02 09:09:01 +01:00
Sreekanth Reddy	d245874049	scsi: mpt3sas: Fix for block device of raid exists even after deleting raid disk commit `6d3a56ed09` upstream. While merging mpt3sas & mpt2sas code, we added the is_warpdrive check condition on the wrong line --------------------------------------------------------------------------- scsih_target_alloc(struct scsi_target *starget) sas_target_priv_data->handle = raid_device->handle; sas_target_priv_data->sas_address = raid_device->wwid; sas_target_priv_data->flags \|= MPT_TARGET_FLAGS_VOLUME; - raid_device->starget = starget; + sas_target_priv_data->raid_device = raid_device; + if (ioc->is_warpdrive) + raid_device->starget = starget; } spin_unlock_irqrestore(&ioc->raid_device_lock, flags); return 0; ------------------------------------------------------------------------------ That check should be for the line sas_target_priv_data->raid_device = raid_device; Due to above hunk, we are not initializing raid_device's starget for raid volumes, and so during raid disk deletion driver is not calling scsi_remove_target() API as driver observes starget field of raid_device's structure as NULL. Signed-off-by: Sreekanth Reddy <Sreekanth.Reddy@broadcom.com> Fixes: `7786ab6aff` ("mpt3sas: Ported WarpDrive product SSS6200 support") Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2016-11-18 10:48:36 +01:00
Bill Kuzeja	6e897d034d	scsi: qla2xxx: Fix scsi scan hang triggered if adapter fails during init commit `a5dd506e15` upstream. A system can get hung task timeouts if a qlogic board fails during initialization (if the board breaks again or fails the init). The hang involves the scsi scan. In a nutshell, since commit `beb9e315e6` ("qla2xxx: Prevent removal and board_disable race"): ...it is possible to have freed ha (base_vha->hw) early by a call to qla2x00_remove_one when pdev->enable_cnt equals zero: if (!atomic_read(&pdev->enable_cnt)) { scsi_host_put(base_vha->host); kfree(ha); pci_set_drvdata(pdev, NULL); return; Almost always, the scsi_host_put above frees the vha structure (attached to the end of the Scsi_Host we're putting) since it's the last put, and life is good. However, if we are entering this routine because the adapter has broken sometime during initialization AND a scsi scan is already in progress (and has done its own scsi_host_get), vha will not be freed. What's worse, the scsi scan will access the freed ha structure through qla2xxx_scan_finished: if (time > vha->hw->loop_reset_delay * HZ) return 1; The scsi scan keeps checking to see if a scan is complete by calling qla2xxx_scan_finished. There is a timeout value that limits the length of time a scan can take (hw->loop_reset_delay, usually set to 5 seconds), but this definition is in the data structure (hw) that can get freed early. This can yield unpredictable results, the worst of which is that the scsi scan can hang indefinitely. This happens when the freed structure gets reused and loop_reset_delay gets overwritten with garbage, which the scan obliviously uses as its timeout value. The fix for this is simple: at the top of qla2xxx_scan_finished, check for the UNLOADING bit in the vha structure (_vha is not freed at this point). If UNLOADING is set, we exit the scan for this adapter immediately. After this last reference to the ha structure, we'll exit the scan for this adapter, and continue on. This problem is hard to hit, but I have run into it doing negative testing many times now (with a test specifically designed to bring it out), so I can verify that this fix works. My testing has been against a RHEL7 driver variant, but the bug and patch are equally relevant to to the upstream driver. Fixes: `beb9e315e6` ("qla2xxx: Prevent removal and board_disable race") Signed-off-by: Bill Kuzeja <william.kuzeja@stratus.com> Acked-by: Himanshu Madhani <himanshu.madhani@cavium.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2016-11-18 10:48:35 +01:00
Sumit Saxena	ae94da4c53	scsi: megaraid_sas: fix macro MEGASAS_IS_LOGICAL to avoid regression commit `5e5ec1759d` upstream. This patch will fix regression caused by commit `1e793f6fc0` ("scsi: megaraid_sas: Fix data integrity failure for JBOD (passthrough) devices"). The problem was that the MEGASAS_IS_LOGICAL macro did not have braces and as a result the driver ended up exposing a lot of non-existing SCSI devices (all SCSI commands to channels 1,2,3 were returned as SUCCESS-DID_OK by driver). [mkp: clarified patch description] Fixes: `1e793f6fc0` Reported-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Kashyap Desai <kashyap.desai@broadcom.com> Signed-off-by: Sumit Saxena <sumit.saxena@broadcom.com> Tested-by: Sumit Saxena <sumit.saxena@broadcom.com> Reviewed-by: Tomas Henzl <thenzl@redhat.com> Tested-by: Jens Axboe <axboe@fb.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2016-11-15 07:46:40 +01:00
Ching Huang	c77a234622	scsi: arcmsr: Send SYNCHRONIZE_CACHE command to firmware commit `2bf7dc8443` upstream. The arcmsr driver failed to pass SYNCHRONIZE CACHE to controller firmware. Depending on how drive caches are handled internally by controller firmware this could potentially lead to data integrity problems. Ensure that cache flushes are passed to the controller. [mkp: applied by hand and removed unused vars] Signed-off-by: Ching Huang <ching2048@areca.com.tw> Reported-by: Tomas Henzl <thenzl@redhat.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2016-11-10 16:36:35 +01:00
Ewan D. Milne	69ee0ed0c6	scsi: scsi_debug: Fix memory leak if LBP enabled and module is unloaded commit `4d2b496f19` upstream. map_storep was not being vfree()'d in the module_exit call. Signed-off-by: Ewan D. Milne <emilne@redhat.com> Reviewed-by: Laurence Oberman <loberman@redhat.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2016-11-10 16:36:35 +01:00
Kashyap Desai	9075faf140	scsi: megaraid_sas: Fix data integrity failure for JBOD (passthrough) devices commit `1e793f6fc0` upstream. Commit `02b01e010a` ("megaraid_sas: return sync cache call with success") modified the driver to successfully complete SYNCHRONIZE_CACHE commands without passing them to the controller. Disk drive caches are only explicitly managed by controller firmware when operating in RAID mode. So this commit effectively disabled writeback cache flushing for any drives used in JBOD mode, leading to data integrity failures. [mkp: clarified patch description] Fixes: `02b01e010a` Signed-off-by: Kashyap Desai <kashyap.desai@broadcom.com> Signed-off-by: Sumit Saxena <sumit.saxena@broadcom.com> Reviewed-by: Tomas Henzl <thenzl@redhat.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Reviewed-by: Ewan D. Milne <emilne@redhat.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2016-11-10 16:36:35 +01:00
Johannes Thumshirn	2577121578	mpt3sas: Don't spam logs if logging level is 0 commit `0d667f72b2` upstream. In _scsih_io_done() we test if the ioc->logging_level does _not_ have the MPT_DEBUG_REPLY bit set and if it hasn't we print the debug messages. This unfortunately is the wrong way around. Note, the actual bug is older than `af0094115` but this commit removed the CONFIG_SCSI_MPT3SAS_LOGGING Kconfig option which hid the bug. Fixes: `af0094115` 'mpt2sas, mpt3sas: Remove SCSI_MPTXSAS_LOGGING entry from Kconfig' Signed-off-by: Johannes Thumshirn <jthumshirn@suse.de> Acked-by: Chaitra P B <chaitra.basappa@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2016-10-31 04:14:01 -06:00
Don Brace	652a174a7d	hpsa: correct skipping masked peripherals commit `64ce60cab2` upstream. The SA controller spins down RAID drive spares. A REGNEWD event causes an inquiry to be sent to all physical drives. This causes the SA controller to spin up the spare. The controller suspends all I/O to a logical volume until the spare is spun up. The spin-up can take over 50 seconds. This can result in one or both of the following: - SML sends down aborts and resets to the logical volume and can cause the logical volume to be off-lined. - a negative impact on the logical volume's I/O performance each time a REGNEWD is triggered. Reviewed-by: Scott Teel <scott.teel@microsemi.com> Reviewed-by: Kevin Barnett <kevin.barnett@microsemi.com> Signed-off-by: Don Brace <don.brace@microsemi.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Juerg Haefliger <juerg.haefliger@hpe.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2016-10-28 03:01:33 -04:00
Martin K. Petersen	9814eb7549	sd: Fix rw_max for devices that report an optimal xfer size commit `6b7e9cde49` upstream. For historic reasons, io_opt is in bytes and max_sectors in block layer sectors. This interface inconsistency is error prone and should be fixed. But for 4.4--4.7 let's make the unit difference explicit via a wrapper function. Fixes: `d0eb20a863` ("sd: Optimal I/O size is in bytes, not sectors") Reported-by: Fam Zheng <famz@redhat.com> Reviewed-by: Bart Van Assche <bart.vanassche@sandisk.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Tested-by: Andrew Patterson <andrew.patterson@hpe.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Juerg Haefliger <juerg.haefliger@hpe.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2016-10-28 03:01:33 -04:00
Ming Lei	bffff9301e	scsi: Fix use-after-free commit `bcd8f2e948` upstream. This patch fixes one use-after-free report[1] by KASAN. In __scsi_scan_target(), when a type 31 device is probed, SCSI_SCAN_TARGET_PRESENT is returned and the target will be scanned again. Inside the following scsi_report_lun_scan(), one new scsi_device instance is allocated, and scsi_probe_and_add_lun() is called again to probe the target and still see type 31 device, finally __scsi_remove_device() is called to remove & free the device at the end of scsi_probe_and_add_lun(), so cause use-after-free in scsi_report_lun_scan(). And the following SCSI log can be observed: scsi 0:0:2:0: scsi scan: INQUIRY pass 1 length 36 scsi 0:0:2:0: scsi scan: INQUIRY successful with code 0x0 scsi 0:0:2:0: scsi scan: peripheral device type of 31, no device added scsi 0:0:2:0: scsi scan: Sending REPORT LUNS to (try 0) scsi 0:0:2:0: scsi scan: REPORT LUNS successful (try 0) result 0x0 scsi 0:0:2:0: scsi scan: REPORT LUN scan scsi 0:0:2:0: scsi scan: INQUIRY pass 1 length 36 scsi 0:0:2:0: scsi scan: INQUIRY successful with code 0x0 scsi 0:0:2:0: scsi scan: peripheral device type of 31, no device added BUG: KASAN: use-after-free in __scsi_scan_target+0xbf8/0xe40 at addr ffff88007b44a104 This patch fixes the issue by moving the putting reference at the end of scsi_report_lun_scan(). [1] KASAN report ================================================================== [ 3.274597] PM: Adding info for serio:serio1 [ 3.275127] BUG: KASAN: use-after-free in __scsi_scan_target+0xd87/0xdf0 at addr ffff880254d8c304 [ 3.275653] Read of size 4 by task kworker/u10:0/27 [ 3.275903] CPU: 3 PID: 27 Comm: kworker/u10:0 Not tainted 4.8.0 #2121 [ 3.276258] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS Ubuntu-1.8.2-1ubuntu1 04/01/2014 [ 3.276797] Workqueue: events_unbound async_run_entry_fn [ 3.277083] ffff880254d8c380 ffff880259a37870 ffffffff94bbc6c1 ffff880078402d80 [ 3.277532] ffff880254d8bb80 ffff880259a37898 ffffffff9459fec1 ffff880259a37930 [ 3.277989] ffff880254d8bb80 ffff880078402d80 ffff880259a37920 ffffffff945a0165 [ 3.278436] Call Trace: [ 3.278528] [<ffffffff94bbc6c1>] dump_stack+0x65/0x84 [ 3.278797] [<ffffffff9459fec1>] kasan_object_err+0x21/0x70 [ 3.279063] device: 'psaux': device_add [ 3.279616] [<ffffffff945a0165>] kasan_report_error+0x205/0x500 [ 3.279651] PM: Adding info for No Bus:psaux [ 3.280202] [<ffffffff944ecd22>] ? kfree_const+0x22/0x30 [ 3.280486] [<ffffffff94bc2dc9>] ? kobject_release+0x119/0x370 [ 3.280805] [<ffffffff945a0543>] __asan_report_load4_noabort+0x43/0x50 [ 3.281170] [<ffffffff9507e1f7>] ? __scsi_scan_target+0xd87/0xdf0 [ 3.281506] [<ffffffff9507e1f7>] __scsi_scan_target+0xd87/0xdf0 [ 3.281848] [<ffffffff9507d470>] ? scsi_add_device+0x30/0x30 [ 3.282156] [<ffffffff94f7f660>] ? pm_runtime_autosuspend_expiration+0x60/0x60 [ 3.282570] [<ffffffff956ddb07>] ? _raw_spin_lock+0x17/0x40 [ 3.282880] [<ffffffff9507e505>] scsi_scan_channel+0x105/0x160 [ 3.283200] [<ffffffff9507e8a2>] scsi_scan_host_selected+0x212/0x2f0 [ 3.283563] [<ffffffff9507eb3c>] do_scsi_scan_host+0x1bc/0x250 [ 3.283882] [<ffffffff9507efc1>] do_scan_async+0x41/0x450 [ 3.284173] [<ffffffff941c1fee>] async_run_entry_fn+0xfe/0x610 [ 3.284492] [<ffffffff941a8954>] ? pwq_dec_nr_in_flight+0x124/0x2a0 [ 3.284876] [<ffffffff941d1770>] ? preempt_count_add+0x130/0x160 [ 3.285207] [<ffffffff941a9a84>] process_one_work+0x544/0x12d0 [ 3.285526] [<ffffffff941aa8e9>] worker_thread+0xd9/0x12f0 [ 3.285844] [<ffffffff941aa810>] ? process_one_work+0x12d0/0x12d0 [ 3.286182] [<ffffffff941bb365>] kthread+0x1c5/0x260 [ 3.286443] [<ffffffff940855cd>] ? __switch_to+0x88d/0x1430 [ 3.286745] [<ffffffff941bb1a0>] ? kthread_worker_fn+0x5a0/0x5a0 [ 3.287085] [<ffffffff956dde9f>] ret_from_fork+0x1f/0x40 [ 3.287368] [<ffffffff941bb1a0>] ? kthread_worker_fn+0x5a0/0x5a0 [ 3.287697] Object at ffff880254d8bb80, in cache kmalloc-2048 size: 2048 [ 3.288064] Allocated: [ 3.288147] PID = 27 [ 3.288218] [<ffffffff940b27ab>] save_stack_trace+0x2b/0x50 [ 3.288531] [<ffffffff9459f246>] save_stack+0x46/0xd0 [ 3.288806] [<ffffffff9459f4bd>] kasan_kmalloc+0xad/0xe0 [ 3.289098] [<ffffffff9459c07e>] __kmalloc+0x13e/0x250 [ 3.289378] [<ffffffff95078e5a>] scsi_alloc_sdev+0xea/0xcf0 [ 3.289701] [<ffffffff9507de76>] __scsi_scan_target+0xa06/0xdf0 [ 3.290034] [<ffffffff9507e505>] scsi_scan_channel+0x105/0x160 [ 3.290362] [<ffffffff9507e8a2>] scsi_scan_host_selected+0x212/0x2f0 [ 3.290724] [<ffffffff9507eb3c>] do_scsi_scan_host+0x1bc/0x250 [ 3.291055] [<ffffffff9507efc1>] do_scan_async+0x41/0x450 [ 3.291354] [<ffffffff941c1fee>] async_run_entry_fn+0xfe/0x610 [ 3.291695] [<ffffffff941a9a84>] process_one_work+0x544/0x12d0 [ 3.292022] [<ffffffff941aa8e9>] worker_thread+0xd9/0x12f0 [ 3.292325] [<ffffffff941bb365>] kthread+0x1c5/0x260 [ 3.292594] [<ffffffff956dde9f>] ret_from_fork+0x1f/0x40 [ 3.292886] Freed: [ 3.292945] PID = 27 [ 3.293016] [<ffffffff940b27ab>] save_stack_trace+0x2b/0x50 [ 3.293327] [<ffffffff9459f246>] save_stack+0x46/0xd0 [ 3.293600] [<ffffffff9459fa61>] kasan_slab_free+0x71/0xb0 [ 3.293916] [<ffffffff9459bac2>] kfree+0xa2/0x1f0 [ 3.294168] [<ffffffff9508158a>] scsi_device_dev_release_usercontext+0x50a/0x730 [ 3.294598] [<ffffffff941ace9a>] execute_in_process_context+0xda/0x130 [ 3.294974] [<ffffffff9508107c>] scsi_device_dev_release+0x1c/0x20 [ 3.295322] [<ffffffff94f566f6>] device_release+0x76/0x1e0 [ 3.295626] [<ffffffff94bc2db7>] kobject_release+0x107/0x370 [ 3.295942] [<ffffffff94bc29ce>] kobject_put+0x4e/0xa0 [ 3.296222] [<ffffffff94f56e17>] put_device+0x17/0x20 [ 3.296497] [<ffffffff9505201c>] scsi_device_put+0x7c/0xa0 [ 3.296801] [<ffffffff9507e1bc>] __scsi_scan_target+0xd4c/0xdf0 [ 3.297132] [<ffffffff9507e505>] scsi_scan_channel+0x105/0x160 [ 3.297458] [<ffffffff9507e8a2>] scsi_scan_host_selected+0x212/0x2f0 [ 3.297829] [<ffffffff9507eb3c>] do_scsi_scan_host+0x1bc/0x250 [ 3.298156] [<ffffffff9507efc1>] do_scan_async+0x41/0x450 [ 3.298453] [<ffffffff941c1fee>] async_run_entry_fn+0xfe/0x610 [ 3.298777] [<ffffffff941a9a84>] process_one_work+0x544/0x12d0 [ 3.299105] [<ffffffff941aa8e9>] worker_thread+0xd9/0x12f0 [ 3.299408] [<ffffffff941bb365>] kthread+0x1c5/0x260 [ 3.299676] [<ffffffff956dde9f>] ret_from_fork+0x1f/0x40 [ 3.299967] Memory state around the buggy address: [ 3.300209] ffff880254d8c200: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb [ 3.300608] ffff880254d8c280: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb [ 3.300986] >ffff880254d8c300: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb [ 3.301408] ^ [ 3.301550] ffff880254d8c380: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc [ 3.301987] ffff880254d8c400: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 [ 3.302396] ================================================================== Cc: Christoph Hellwig <hch@lst.de> Signed-off-by: Ming Lei <tom.leiming@gmail.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2016-10-28 03:01:31 -04:00
Brian King	2ed1b50a40	scsi: ibmvfc: Fix I/O hang when port is not mapped commit `07d0e9a847` upstream. If a VFC port gets unmapped in the VIOS, it may not respond with a CRQ init complete following H_REG_CRQ. If this occurs, we can end up having called scsi_block_requests and not a resulting unblock until the init complete happens, which may never occur, and we end up hanging I/O requests. This patch ensures the host action stay set to IBMVFC_HOST_ACTION_TGT_DEL so we move all rports into devloss state and unblock unless we receive an init complete. Signed-off-by: Brian King <brking@linux.vnet.ibm.com> Acked-by: Tyrel Datwyler <tyreld@linux.vnet.ibm.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2016-10-22 12:26:56 +02:00
Borislav Petkov	161cbfec10	scsi: arcmsr: Simplify user_len checking commit `4bd173c307` upstream. Do the user_len check first and then the ver_addr allocation so that we can save us the kfree() on the error path when user_len is > ARCMSR_API_DATA_BUFLEN. Signed-off-by: Borislav Petkov <bp@suse.de> Cc: Marco Grassi <marco.gra@gmail.com> Cc: Dan Carpenter <dan.carpenter@oracle.com> Cc: Tomas Henzl <thenzl@redhat.com> Cc: Martin K. Petersen <martin.petersen@oracle.com> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Reviewed-by: Tomas Henzl <thenzl@redhat.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2016-10-22 12:26:56 +02:00
Dan Carpenter	2404092282	scsi: arcmsr: Buffer overflow in arcmsr_iop_message_xfer() commit `7bc2b55a5c` upstream. We need to put an upper bound on "user_len" so the memcpy() doesn't overflow. Reported-by: Marco Grassi <marco.gra@gmail.com> Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Reviewed-by: Tomas Henzl <thenzl@redhat.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2016-10-22 12:26:55 +02:00
Dan Carpenter	07dc725268	fnic: pci_dma_mapping_error() doesn't return an error code commit `dd7328e4c5` upstream. pci_dma_mapping_error() returns true on error and false on success. Fixes: `fd6ddfa4c1` ('fnic: check pci_map_single() return value') Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2016-10-07 15:23:45 +02:00
Maurizio Lombardi	56e5ad1e1d	megaraid: fix null pointer check in megasas_detach_one(). commit `546e559c79` upstream. The pd_seq_sync pointer can't be NULL, we have to check its entries instead. Signed-off-by: Maurizio Lombardi <mlombard@redhat.com> Acked-by: Sumit Saxena <sumit.saxena@broadcom.com> Reviewed-by: Tomas Henzl <thenzl@redhat.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2016-09-30 10:18:36 +02:00
Tyrel Datwyler	6f0caecda5	scsi: fix upper bounds check of sense key in scsi_sense_key_string() commit `a87eeb900d` upstream. Commit `655ee63cf3` ("scsi constants: command, sense key + additional sense string") added a "Completed" sense string with key 0xF to snstext[], but failed to updated the upper bounds check of the sense key in scsi_sense_key_string(). Fixes: `655ee63cf3` ("[SCSI] scsi constants: command, sense key + additional sense strings") Signed-off-by: Tyrel Datwyler <tyreld@linux.vnet.ibm.com> Reviewed-by: Bart Van Assche <bart.vanassche@sandisk.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2016-09-15 08:27:54 +02:00
Manoj N. Kumar	d4009e4b6e	cxlflash: Move to exponential back-off when cmd_room is not available [ Upstream commit `ea76543127` ] While profiling the cxlflash_queuecommand() path under a heavy load it was found that number of retries to find cmd_room was fairly high. There are two problems with the current back-off: a) It starts with a udelay of 0 b) It backs-off linearly Tried several approaches (a higher multiple 10n, 100n, as well as n^2, 2^n) and found that the exponential back-off(2^n) approach had the least overall cost. Cost as being defined as overall time spent waiting. The fix is to change the linear back-off to an exponential back-off. This solution also takes care of the problem with the initial delay (starts with 1 usec). Signed-off-by: Manoj N. Kumar <manoj@linux.vnet.ibm.com> Acked-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Signed-off-by: Uma Krishnan <ukrishn@linux.vnet.ibm.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2016-09-15 08:27:50 +02:00
Matthew R. Ochs	f88503578d	cxlflash: Fix to avoid virtual LUN failover failure [ Upstream commit `d5e26bb1d8` ] Applications which use virtual LUN's that are backed by a physical LUN over both adapter ports may experience an I/O failure in the event of a link loss (e.g. cable pull). Virtual LUNs may be accessed through one or both ports of the adapter. This access is encoded in the translation entries that comprise the virtual LUN and used by the AFU for load-balancing I/O and handling failover scenarios. In a link loss scenario, even though the AFU is able to maintain connectivity to the LUN, it is up to the application to retry the failed I/O. When applications are unaware of the virtual LUN's underlying topology, they are unable to make a sound decision of when to retry an I/O and therefore are forced to make their reaction to a failed I/O absolute. The result is either a failure to retry I/O or increased latency for scenarios where a retry is pointless. To remedy this scenario, provide feedback back to the application on virtual LUN creation as to which ports the LUN may be accessed. LUN's spanning both ports are candidates for a retry in a presence of an I/O failure. Signed-off-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com> Acked-by: Manoj Kumar <manoj@linux.vnet.ibm.com> Reviewed-by: Uma Krishnan <ukrishn@linux.vnet.ibm.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2016-09-15 08:27:49 +02:00
Manoj Kumar	610f1a8d37	cxlflash: Fix to escalate LINK_RESET also on port 1 [ Upstream commit `a9be294ecb` ] The original fix to escalate a 'login timed out' error to a LINK_RESET was only made for one of the two ports on the card. This fix resolves the same issue for the second port (port 1). Signed-off-by: Manoj N. Kumar <manoj@linux.vnet.ibm.com> Acked-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com> Reviewed-by: Uma Krishnan <ukrishn@linux.vnet.ibm.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2016-09-15 08:27:49 +02:00
James Smart	9f4a5a1c0c	lpfc: Fix DMA faults observed upon plugging loopback connector [ Upstream commit `ae09c76510` ] Driver didn't program the REG_VFI mailbox correctly, giving the adapter bad addresses. Signed-off-by: Dick Kennedy <dick.kennedy@avagotech.com> Signed-off-by: James Smart <james.smart@avagotech.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2016-09-15 08:27:48 +02:00
Manoj N. Kumar	d774bfcec0	cxlflash: Fix to resolve dead-lock during EEH recovery [ Upstream commit `635f6b0893` ] When a cxlflash adapter goes into EEH recovery and multiple processes (each having established its own context) are active, the EEH recovery can hang if the processes attempt to recover in parallel. The symptom logged after a couple of minutes is: INFO: task eehd:48 blocked for more than 120 seconds. Not tainted 4.5.0-491-26f710d+ #1 "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. eehd 0 48 2 Call Trace: __switch_to+0x2f0/0x410 __schedule+0x300/0x980 schedule+0x48/0xc0 rwsem_down_write_failed+0x294/0x410 down_write+0x88/0xb0 cxlflash_pci_error_detected+0x100/0x1c0 [cxlflash] cxl_vphb_error_detected+0x88/0x110 [cxl] cxl_pci_error_detected+0xb0/0x1d0 [cxl] eeh_report_error+0xbc/0x130 eeh_pe_dev_traverse+0x94/0x160 eeh_handle_normal_event+0x17c/0x450 eeh_handle_event+0x184/0x370 eeh_event_handler+0x1c8/0x1d0 kthread+0x110/0x130 ret_from_kernel_thread+0x5c/0xa4 INFO: task blockio:33215 blocked for more than 120 seconds. Not tainted 4.5.0-491-26f710d+ #1 "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. blockio 0 33215 33213 Call Trace: 0x1 (unreliable) __switch_to+0x2f0/0x410 __schedule+0x300/0x980 schedule+0x48/0xc0 rwsem_down_read_failed+0x124/0x1d0 down_read+0x68/0x80 cxlflash_ioctl+0x70/0x6f0 [cxlflash] scsi_ioctl+0x3b0/0x4c0 sg_ioctl+0x960/0x1010 do_vfs_ioctl+0xd8/0x8c0 SyS_ioctl+0xd4/0xf0 system_call+0x38/0xb4 INFO: task eehd:48 blocked for more than 120 seconds. The hang is because of a 3 way dead-lock: Process A holds the recovery mutex, and waits for eehd to complete. Process B holds the semaphore and waits for the recovery mutex. eehd waits for semaphore. The fix is to have Process B above release the semaphore before attempting to acquire the recovery mutex. This will allow eehd to proceed to completion. Signed-off-by: Manoj N. Kumar <manoj@linux.vnet.ibm.com> Reviewed-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2016-09-15 08:27:47 +02:00
Manoj N. Kumar	321aeea695	cxlflash: Fix to avoid unnecessary scan with internal LUNs [ Upstream commit `603ecce95f` ] When switching to the internal LUN defined on the IBM CXL flash adapter, there is an unnecessary scan occurring on the second port. This scan leads to the following extra lines in the log: Dec 17 10:09:00 tul83p1 kernel: [ 3708.561134] cxlflash 0008:00:00.0: cxlflash_queuecommand: (scp=c0000000fc1f0f00) 11/1/0/0 cdb=(A0000000-00000000-10000000-00000000) Dec 17 10:09:00 tul83p1 kernel: [ 3708.561147] process_cmd_err: cmd failed afu_rc=32 scsi_rc=0 fc_rc=0 afu_extra=0xE, scsi_extra=0x0, fc_extra=0x0 By definition, both of the internal LUNs are on the first port/channel. When the lun_mode is switched to internal LUN the same value for host->max_channel is retained. This causes an unnecessary scan over the second port/channel. This fix alters the host->max_channel to 0 (1 port), if internal LUNs are configured and switches it back to 1 (2 ports) while going back to external LUNs. Signed-off-by: Manoj N. Kumar <manoj@linux.vnet.ibm.com> Acked-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com> Reviewed-by: Uma Krishnan <ukrishn@linux.vnet.ibm.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2016-09-15 08:27:46 +02:00
Ching Huang	def391c897	arcmsr: fixes not release allocated resource [ Upstream commit `98f90debc2` ] Releasing allocated resource if get configuration data failed. Signed-off-by: Ching Huang <ching2048@areca.com.tw> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Reviewed-by: Hannes Reinicke <hare@suse.de> Reviewed-by: Tomas Henzl <thenzl@redhat.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2016-09-15 08:27:45 +02:00
Ching Huang	c77d6c3c88	arcmsr: fixed getting wrong configuration data [ Upstream commit `251e2d25bf` ] Fixed getting wrong configuration data of adapter type B and type D. Signed-off-by: Ching Huang <ching2048@areca.com.tw> Reviewed-by: Hannes Reinicke <hare@suse.de> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2016-09-15 08:27:45 +02:00
Swapnil Nagle	07235140a4	qla2xxx: Use ATIO type to send correct tmr response [ Upstream commit `d7236ac368` ] The function value inside se_cmd can change if the TMR is cancelled. Use original ATIO Type to correctly determine CTIO response. Signed-off-by: Swapnil Nagle <swapnil.nagle@purestroage.com> Signed-off-by: Himanshu Madhani <himanshu.madhani@qlogic.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2016-09-15 08:27:44 +02:00
Suganath prabu Subramani	116a7584cb	mpt3sas: Fix for Asynchronous completion of timedout IO and task abort of timedout IO. [ Upstream commit `03d1fb3a65` ] Track msix of each IO and use the same msix for issuing abort to timed out IO. With this driver will process IO's reply first followed by TM. Signed-off-by: Suganath prabu Subramani <suganath-prabu.subramani@avagotech.com> Signed-off-by: Chaitra P B <chaitra.basappa@avagotech.com> Reviewed-by: Tomas Henzl <thenzl@redhat.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2016-09-15 08:27:43 +02:00
Tomas Henzl	8b208a5d0a	mpt3sas: A correction in unmap_resources [ Upstream commit `5f985d88ba` ] It might happen that we try to free an already freed pointer. Reported-by: Maurizio Lombardi <mlombard@redhat.com> Signed-off-by: Tomas Henzl <thenzl@redhat.com> Acked-by: Chaitra P B <chaitra.basappa@avagotech.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2016-09-15 08:27:43 +02:00
Tomas Henzl	e77b14a6f3	megaraid_sas: Add an i/o barrier [ Upstream commit `b99dbe56d5` ] A barrier should be added to ensure proper ordering of memory mapped writes. Signed-off-by: Tomas Henzl <thenzl@redhat.com> Reviewed-by: Kashyap Desai <kashyap.desai@broadcom.com> Acked-by: Kashyap Desai <kashyap.desai@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2016-09-15 08:27:43 +02:00
Sumit Saxena	948cb1ca1f	megaraid_sas: Fix SMAP issue [ Upstream commit `ea1c928bb6` ] Inside compat IOCTL hook of driver, driver was using wrong address of ioc->frame.raw which leads sense_ioc_ptr to be calculated wrongly and failing IOCTL. Signed-off-by: Sumit Saxena <sumit.saxena@avagotech.com> Reviewed-by: Tomas Henzl <thenzl@redhat.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2016-09-15 08:27:43 +02:00
Sumit Saxena	f7aeba6bd2	megaraid_sas: Do not allow PCI access during OCR [ Upstream commit `11c71cb4ab` ] This patch will do synhronization between OCR function and AEN function using "reset_mutex" lock. reset_mutex will be acquired only in the first half of the AEN function which issues a DCMD. Second half of the function which calls SCSI API (scsi_add_device/scsi_remove_device) should be out of reset_mutex to avoid deadlock between scsi_eh thread and driver. During chip reset (inside OCR function), there should not be any PCI access and AEN function (which is called in delayed context) may be firing DCMDs (doing PCI writes) when chip reset is happening in parallel which will cause FW fault. This patch will solve the problem by making AEN thread and OCR thread mutually exclusive. Signed-off-by: Sumit Saxena <sumit.saxena@avagotech.com> Signed-off-by: Kashyap Desai <kashyap.desai@avagotech.com> Reviewed-by: Tomas Henzl <thenzl@redhat.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2016-09-15 08:27:42 +02:00
James Smart	2dfdfa3439	lpfc: Fix external loopback failure. [ Upstream commit `4360ca9c24` ] Fix external loopback failure. Rx sequence reassembly was incorrect. Signed-off-by: Dick Kennedy <dick.kennedy@avagotech.com> Signed-off-by: James Smart <james.smart@avagotech.com> Reviewed-by: Hannes Reinicke <hare@suse.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2016-09-15 08:27:42 +02:00
James Smart	10103c5228	lpfc: Fix mbox reuse in PLOGI completion [ Upstream commit `01c73bbcd7` ] Fix mbox reuse in PLOGI completion. Moved allocations so that buffer properly init'd. Signed-off-by: Dick Kennedy <dick.kennedy@avagotech.com> Signed-off-by: James Smart <james.smart@avagotech.com> Reviewed-by: Hannes Reinicke <hare@suse.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2016-09-15 08:27:42 +02:00
James Smart	13c61f0468	lpfc: Fix RDP Speed reporting. [ Upstream commit `81e7517723` ] Fix RDP Speed reporting. Signed-off-by: Dick Kennedy <dick.kennedy@avagotech.com> Signed-off-by: James Smart <james.smart@avagotech.com> Reviewed-by: Hannes Reinicke <hare@suse.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2016-09-15 08:27:42 +02:00
James Smart	6b9bd78515	lpfc: Fix crash in fcp command completion path. [ Upstream commit `c90261dcd8` ] Fix crash in fcp command completion path. Missed null check. Signed-off-by: Dick Kennedy <dick.kennedy@avagotech.com> Signed-off-by: James Smart <james.smart@avagotech.com> Reviewed-by: Hannes Reinicke <hare@suse.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2016-09-15 08:27:42 +02:00
James Smart	7cf5c223cc	lpfc: Fix driver crash when module parameter lpfc_fcp_io_channel set to 16 [ Upstream commit `6690e0d4fc` ] Fix driver crash when module parameter lpfc_fcp_io_channel set to 16 Signed-off-by: Dick Kennedy <dick.kennedy@avagotech.com> Signed-off-by: James Smart <james.smart@avagotech.com> Reviewed-by: Hannes Reinicke <hare@suse.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2016-09-15 08:27:42 +02:00
James Smart	0217efc5b8	lpfc: Fix RegLogin failed error seen on Lancer FC during port bounce [ Upstream commit `4b7789b71c` ] Fix RegLogin failed error seen on Lancer FC during port bounce Fix the statemachine and ref counting. Signed-off-by: Dick Kennedy <dick.kennedy@avagotech.com> Signed-off-by: James Smart <james.smart@avagotech.com> Reviewed-by: Hannes Reinicke <hare@suse.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2016-09-15 08:27:41 +02:00
James Smart	1a1455583d	lpfc: Fix the FLOGI discovery logic to comply with T11 standards [ Upstream commit `d6de08cc46` ] Fix the FLOGI discovery logic to comply with T11 standards We weren't properly setting fabric parameters, such as R_A_TOV and E_D_TOV, when we registered the vfi object in default configs and pt2pt configs. Revise to now pass service params with the values to the firmware and ensure they are reset on link bounce. Required reworking the call sequence in the discovery threads. Signed-off-by: Dick Kennedy <dick.kennedy@avagotech.com> Signed-off-by: James Smart <james.smart@avagotech.com> Reviewed-by: Hannes Reinicke <hare@suse.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2016-09-15 08:27:41 +02:00
James Smart	3f499a4264	lpfc: Fix FCF Infinite loop in lpfc_sli4_fcf_rr_next_index_get. [ Upstream commit `f5cb5304eb` ] Fix FCF Infinite loop in lpfc_sli4_fcf_rr_next_index_get. Signed-off-by: Dick Kennedy <dick.kennedy@avagotech.com> Signed-off-by: James Smart <james.smart@avagotech.com> Reviewed-by: Hannes Reinicke <hare@suse.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2016-09-15 08:27:41 +02:00
Manoj Kumar	8c67424e2a	cxlflash: Enable device id for future IBM CXL adapter [ Upstream commit `a2746fb16e` ] This drop enables a future card with a device id of 0x0600 to be recognized by the cxlflash driver. As per the design, the Accelerator Function Unit (AFU) for this new IBM CXL Flash Adapter retains the same host interface as the previous generation. For the early prototypes of the new card, the driver with this change behaves exactly as the driver prior to this behaved with the earlier generation card. Therefore, no card specific programming has been added. These card specific changes can be staged in later if needed. Signed-off-by: Manoj N. Kumar <manoj@linux.vnet.ibm.com> Acked-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com> Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2016-09-15 08:27:41 +02:00
Manoj Kumar	20ab6948f1	cxlflash: Resolve oops in wait_port_offline [ Upstream commit `b45cdbaf9f` ] If an async error interrupt is generated, and the error requires the FC link to be reset, it cannot be performed in the interrupt context. So a work element is scheduled to complete the link reset in a process context. If either an EEH event or an escalation occurs in between when the interrupt is generated and the scheduled work is started, the MMIO space may no longer be available. This will cause an oops in the worker thread. [ 606.806583] NIP kthread_data+0x28/0x40 [ 606.806633] LR wq_worker_sleeping+0x30/0x100 [ 606.806694] Call Trace: [ 606.806721] 0x50 (unreliable) [ 606.806796] wq_worker_sleeping+0x30/0x100 [ 606.806884] __schedule+0x69c/0x8a0 [ 606.806959] schedule+0x44/0xc0 [ 606.807034] do_exit+0x770/0xb90 [ 606.807109] die+0x300/0x460 [ 606.807185] bad_page_fault+0xd8/0x150 [ 606.807259] handle_page_fault+0x2c/0x30 [ 606.807338] wait_port_offline.constprop.12+0x60/0x130 [cxlflash] To prevent the problem space area from being unmapped, when there is pending work, a mapcount (using the kref mechanism) is held. The mapcount is released only when the work is completed. The last reference release is tied to the unmapping service. Signed-off-by: Manoj N. Kumar <manoj@linux.vnet.ibm.com> Acked-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com> Reviewed-by: Uma Krishnan <ukrishn@linux.vnet.ibm.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2016-09-15 08:27:41 +02:00
Manoj Kumar	284555fc7f	cxlflash: Fix to resolve cmd leak after host reset [ Upstream commit `ee91e332a6` ] After a few iterations of resetting the card, either during EEH recovery, or a host_reset the following is seen in the logs. cxlflash 0008:00: cxlflash_queuecommand: could not get a free command At every reset of the card, the commands that are outstanding are being leaked. No effort is being made to reap these commands. A few more resets later, the above error message floods the logs and the card is rendered totally unusable as no free commands are available. Iterated through the 'cmd' queue and printed out the 'free' counter and found that on each reset certain commands were in-use and stayed in-use through subsequent resets. To resolve this issue, when the card is reset, reap all the commands that are active/outstanding. Signed-off-by: Manoj N. Kumar <manoj@linux.vnet.ibm.com> Acked-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com> Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2016-09-15 08:27:41 +02:00
Dan Carpenter	edf9459008	cxlflash: a couple off by one bugs [ Upstream commit `e37390bee6` ] The "> MAX_CONTEXT" should be ">= MAX_CONTEXT". Otherwise we go one step beyond the end of the cfg->ctx_tbl[] array. Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Reviewed-by: Manoj Kumar <manoj@linux.vnet.ibm.com> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Acked-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2016-09-15 08:27:39 +02:00
Yinghai Lu	bbaf719376	megaraid_sas: Fix probing cards without io port commit `e7f851684e` upstream. Found one megaraid_sas HBA probe fails, [ 187.235190] scsi host2: Avago SAS based MegaRAID driver [ 191.112365] megaraid_sas 0000:89:00.0: BAR 0: can't reserve [io 0x0000-0x00ff] [ 191.120548] megaraid_sas 0000:89:00.0: IO memory region busy! and the card has resource like, [ 125.097714] pci 0000:89:00.0: [1000:005d] type 00 class 0x010400 [ 125.104446] pci 0000:89:00.0: reg 0x10: [io 0x0000-0x00ff] [ 125.110686] pci 0000:89:00.0: reg 0x14: [mem 0xce400000-0xce40ffff 64bit] [ 125.118286] pci 0000:89:00.0: reg 0x1c: [mem 0xce300000-0xce3fffff 64bit] [ 125.125891] pci 0000:89:00.0: reg 0x30: [mem 0xce200000-0xce2fffff pref] that does not io port resource allocated from BIOS, and kernel can not assign one as io port shortage. The driver is only looking for MEM, and should not fail. It turns out megasas_init_fw() etc are using bar index as mask. index 1 is used as mask 1, so that pci_request_selected_regions() is trying to request BAR0 instead of BAR1. Fix all related reference. Fixes: `b6d5d8808b` ("megaraid_sas: Use lowest memory bar for SR-IOV VF support") Signed-off-by: Yinghai Lu <yinghai@kernel.org> Acked-by: Kashyap Desai <kashyap.desai@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2016-09-07 08:32:43 +02:00
Greg Edwards	7386f927cf	mpt3sas: Fix resume on WarpDrive flash cards commit `ce7c6c9e1d` upstream. mpt3sas crashes on resume after suspend with WarpDrive flash cards. The reply_post_host_index array is not set back up after the resume, and we deference a stale pointer in _base_interrupt(). [ 47.309711] BUG: unable to handle kernel paging request at ffffc90001f8006c [ 47.318289] IP: [<ffffffffc00863ef>] _base_interrupt+0x49f/0xa30 [mpt3sas] [ 47.326749] PGD 41ccaa067 PUD 41ccab067 PMD 3466c067 PTE 0 [ 47.333848] Oops: 0002 [#1] SMP ... [ 47.452708] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.7.0 #6 [ 47.460506] Hardware name: Dell Inc. OptiPlex 990/06D7TR, BIOS A18 09/24/2013 [ 47.469629] task: ffffffff81c0d500 ti: ffffffff81c00000 task.ti: ffffffff81c00000 [ 47.479112] RIP: 0010:[<ffffffffc00863ef>] [<ffffffffc00863ef>] _base_interrupt+0x49f/0xa30 [mpt3sas] [ 47.490466] RSP: 0018:ffff88041d203e30 EFLAGS: 00010002 [ 47.497801] RAX: 0000000000000001 RBX: ffff880033f4c000 RCX: 0000000000000001 [ 47.506973] RDX: ffffc90001f8006c RSI: 0000000000000082 RDI: 0000000000000082 [ 47.516141] RBP: ffff88041d203eb0 R08: ffff8804118e2820 R09: 0000000000000001 [ 47.525300] R10: 0000000000000001 R11: 00000000100c0000 R12: 0000000000000000 [ 47.534457] R13: ffff880412c487e0 R14: ffff88041a8987d8 R15: 0000000000000001 [ 47.543632] FS: 0000000000000000(0000) GS:ffff88041d200000(0000) knlGS:0000000000000000 [ 47.553796] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 47.561632] CR2: ffffc90001f8006c CR3: 0000000001c06000 CR4: 00000000000406f0 [ 47.570883] Stack: [ 47.575015] 000000001d211228 ffff88041d2100c0 ffff8800c47d8130 0000000000000100 [ 47.584625] ffff8804100c0000 100c000000000000 ffff88041a8992a0 ffff88041a8987f8 [ 47.594230] ffff88041d203e00 ffffffff81111e55 000000000000038c ffff880414ad4280 [ 47.603862] Call Trace: [ 47.608474] <IRQ> [ 47.610413] [<ffffffff81111e55>] ? call_timer_fn+0x35/0x120 [ 47.620539] [<ffffffff81100a1f>] handle_irq_event_percpu+0x7f/0x1c0 [ 47.629061] [<ffffffff81100b8c>] handle_irq_event+0x2c/0x50 [ 47.636859] [<ffffffff81103fff>] handle_edge_irq+0x6f/0x130 [ 47.644654] [<ffffffff8102fbf3>] handle_irq+0x73/0x120 [ 47.652011] [<ffffffff810c6ada>] ? atomic_notifier_call_chain+0x1a/0x20 [ 47.660854] [<ffffffff817e374b>] do_IRQ+0x4b/0xd0 [ 47.667777] [<ffffffff817e160c>] common_interrupt+0x8c/0x8c [ 47.675635] <EOI> Move the reply_post_host_index array setup into mpt3sas_base_map_resources(), which is also in the resume path. Signed-off-by: Greg Edwards <gedwards@fireweed.org> Acked-by: Chaitra P B <chaitra.basappa@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2016-09-07 08:32:43 +02:00
Dave Carroll	e4878ef66e	aacraid: Check size values after double-fetch from user commit `fa00c437ee` upstream. In aacraid's ioctl_send_fib() we do two fetches from userspace, one the get the fib header's size and one for the fib itself. Later we use the size field from the second fetch to further process the fib. If for some reason the size from the second fetch is different than from the first fix, we may encounter an out-of- bounds access in aac_fib_send(). We also check the sender size to insure it is not out of bounds. This was reported in https://bugzilla.kernel.org/show_bug.cgi?id=116751 and was assigned CVE-2016-6480. Reported-by: Pengfei Wang <wpengfeinudt@gmail.com> Fixes: `7c00ffa31` '[SCSI] 2.6 aacraid: Variable FIB size (updated patch)' Signed-off-by: Dave Carroll <david.carroll@microsemi.com> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2016-09-07 08:32:43 +02:00
Mauricio Faria de Oliveira	74d55e5d96	lpfc: fix oops in lpfc_sli4_scmd_to_wqidx_distr() from lpfc_send_taskmgmt() commit `05a05872c8` upstream. The lpfc_sli4_scmd_to_wqidx_distr() function expects the scsi_cmnd 'lpfc_cmd->pCmd' not to be null, and point to the midlayer command. That's not true in the .eh_(device\|target\|bus)_reset_handler path, because lpfc_send_taskmgmt() sends commands not from the midlayer, so does not set 'lpfc_cmd->pCmd'. That is true in the .queuecommand path because lpfc_queuecommand() stores the scsi_cmnd from midlayer in lpfc_cmd->pCmd; and lpfc_cmd is stored by lpfc_scsi_prep_cmnd() in piocbq->context1 -- which is passed to lpfc_sli4_scmd_to_wqidx_distr() as lpfc_cmd parameter. This problem can be hit on SCSI EH, and immediately with sg_reset. These 2 test-cases demonstrate the problem/fix with next-20160601. Test-case 1) sg_reset # strace sg_reset --device /dev/sdm <...> open("/dev/sdm", O_RDWR\|O_NONBLOCK) = 3 ioctl(3, SG_SCSI_RESET, 0x3fffde6d0994 <unfinished ...> +++ killed by SIGSEGV +++ Segmentation fault # dmesg Unable to handle kernel paging request for data at address 0x00000000 Faulting instruction address: 0xd00000001c88442c Oops: Kernel access of bad area, sig: 11 [#1] <...> CPU: 104 PID: 16333 Comm: sg_reset Tainted: G W 4.7.0-rc1-next-20160601-00004-g95b89dc #6 <...> NIP [d00000001c88442c] lpfc_sli4_scmd_to_wqidx_distr+0xc/0xd0 [lpfc] LR [d00000001c826fe8] lpfc_sli_calc_ring.part.27+0x98/0xd0 [lpfc] Call Trace: [c000003c9ec876f0] [c000003c9ec87770] 0xc000003c9ec87770 (unreliable) [c000003c9ec87720] [d00000001c82e004] lpfc_sli_issue_iocb+0xd4/0x260 [lpfc] [c000003c9ec87780] [d00000001c831a3c] lpfc_sli_issue_iocb_wait+0x15c/0x5b0 [lpfc] [c000003c9ec87880] [d00000001c87f27c] lpfc_send_taskmgmt+0x24c/0x650 [lpfc] [c000003c9ec87950] [d00000001c87fd7c] lpfc_device_reset_handler+0x10c/0x200 [lpfc] [c000003c9ec87a10] [c000000000610694] scsi_try_bus_device_reset+0x44/0xc0 [c000003c9ec87a40] [c0000000006113e8] scsi_ioctl_reset+0x198/0x2c0 [c000003c9ec87bf0] [c00000000060fe5c] scsi_ioctl+0x13c/0x4b0 [c000003c9ec87c80] [c0000000006629b0] sd_ioctl+0xf0/0x120 [c000003c9ec87cd0] [c00000000046e4f8] blkdev_ioctl+0x248/0xb70 [c000003c9ec87d30] [c0000000002a1f60] block_ioctl+0x70/0x90 [c000003c9ec87d50] [c00000000026d334] do_vfs_ioctl+0xc4/0x890 [c000003c9ec87de0] [c00000000026db60] SyS_ioctl+0x60/0xc0 [c000003c9ec87e30] [c000000000009120] system_call+0x38/0x108 Instruction dump: <...> With fix: # strace sg_reset --device /dev/sdm <...> open("/dev/sdm", O_RDWR\|O_NONBLOCK) = 3 ioctl(3, SG_SCSI_RESET, 0x3fffe103c554) = 0 close(3) = 0 exit_group(0) = ? +++ exited with 0 +++ # dmesg [ 424.658649] lpfc 0006:01:00.4: 4:(0):0713 SCSI layer issued Device Reset (1, 0) return x2002 Test-case 2) SCSI EH Using this debug patch to wire an SCSI EH trigger, for lpfc_scsi_cmd_iocb_cmpl(): - cmd->scsi_done(cmd); + if ((phba->pport ? phba->pport->cfg_log_verbose : phba->cfg_log_verbose) == 0x32100000) + printk(KERN_ALERT "lpfc: skip scsi_done()\n"); + else + cmd->scsi_done(cmd); # echo 0x32100000 > /sys/class/scsi_host/host11/lpfc_log_verbose # dd if=/dev/sdm of=/dev/null iflag=direct & <...> After a while: # dmesg lpfc 0006:01:00.4: 4:(0):3053 lpfc_log_verbose changed from 0 (x0) to 839909376 (x32100000) lpfc: skip scsi_done() <...> Unable to handle kernel paging request for data at address 0x00000000 Faulting instruction address: 0xd0000000199e448c Oops: Kernel access of bad area, sig: 11 [#1] <...> CPU: 96 PID: 28556 Comm: scsi_eh_11 Tainted: G W 4.7.0-rc1-next-20160601-00004-g95b89dc #6 <...> NIP [d0000000199e448c] lpfc_sli4_scmd_to_wqidx_distr+0xc/0xd0 [lpfc] LR [d000000019986fe8] lpfc_sli_calc_ring.part.27+0x98/0xd0 [lpfc] Call Trace: [c000000ff0d0b890] [c000000ff0d0b900] 0xc000000ff0d0b900 (unreliable) [c000000ff0d0b8c0] [d00000001998e004] lpfc_sli_issue_iocb+0xd4/0x260 [lpfc] [c000000ff0d0b920] [d000000019991a3c] lpfc_sli_issue_iocb_wait+0x15c/0x5b0 [lpfc] [c000000ff0d0ba20] [d0000000199df27c] lpfc_send_taskmgmt+0x24c/0x650 [lpfc] [c000000ff0d0baf0] [d0000000199dfd7c] lpfc_device_reset_handler+0x10c/0x200 [lpfc] [c000000ff0d0bbb0] [c000000000610694] scsi_try_bus_device_reset+0x44/0xc0 [c000000ff0d0bbe0] [c0000000006126cc] scsi_eh_ready_devs+0x49c/0x9c0 [c000000ff0d0bcb0] [c000000000614160] scsi_error_handler+0x580/0x680 [c000000ff0d0bd80] [c0000000000ae848] kthread+0x108/0x130 [c000000ff0d0be30] [c0000000000094a8] ret_from_kernel_thread+0x5c/0xb4 Instruction dump: <...> With fix: # dmesg lpfc 0006:01:00.4: 4:(0):3053 lpfc_log_verbose changed from 0 (x0) to 839909376 (x32100000) lpfc: skip scsi_done() <...> lpfc 0006:01:00.4: 4:(0):0713 SCSI layer issued Device Reset (0, 0) return x2002 <...> lpfc 0006:01:00.4: 4:(0):0723 SCSI layer issued Target Reset (1, 0) return x2002 <...> lpfc 0006:01:00.4: 4:(0):0714 SCSI layer issued Bus Reset Data: x2002 <...> lpfc 0006:01:00.4: 4:(0):3172 SCSI layer issued Host Reset Data: <...> Fixes: `8b0dff1416` ("lpfc: Add support for using block multi-queue") Signed-off-by: Mauricio Faria de Oliveira <mauricfo@linux.vnet.ibm.com> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Acked-by: James Smart <james.smart@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2016-08-20 18:09:27 +02:00
Hannes Reinecke	5a6f9d06d8	scsi: ignore errors from scsi_dh_add_device() commit `221255aee6` upstream. device handler initialisation might fail due to a number of reasons. But as device_handlers are optional this shouldn't cause us to disable the device entirely. So just ignore errors from scsi_dh_add_device(). Reviewed-by: Johannes Thumshirn <jthumshirn@suse.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Hannes Reinecke <hare@suse.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Cc: Laura Abbott <labbott@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2016-08-16 09:30:48 +02:00
Brian King	8727178338	ipr: Clear interrupt on croc/crocodile when running with LSI commit `54e430bbd4` upstream. If we fall back to using LSI on the Croc or Crocodile chip we need to clear the interrupt so we don't hang the system. Tested-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Brian King <brking@linux.vnet.ibm.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2016-08-10 11:49:29 +02:00

1 2 3 4 5 ...

12672 Commits