linux

mirror of https://github.com/torvalds/linux.git synced 2026-06-25 23:53:30 +02:00

Author	SHA1	Message	Date
Omer Efrat	397c657a06	cfg80211: use BIT_ULL for NL80211_STA_INFO_* attribute types The BIT macro uses unsigned long, which some architectures handle as 32 bit, and therefore might cause the macro's shift to overflow when used on a value equal to or larger than 32 (NL80211_STA_INFO_RX_DURATION and afterwards). Since the 'filled' member in station_info changed to u64, the BIT_ULL macro should be used with all NL80211_STA_INFO_* attribute types instead of BIT, to prevent possible future bugs when someone uses the BIT macro for higher attributes by mistake. This commit cleans up all usages of the BIT macro with the above field in cfg80211 by changing it to BIT_ULL instead. In addition, there are some places which use neither the BIT nor the BIT_ULL macro, so align those as well. Signed-off-by: Omer Efrat <omer.efrat@tandemg.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2018-06-29 09:52:23 +02:00
Johannes Berg	f0c0407d2a	mac80211: remove unnecessary NULL check We don't need to check if he_oper is NULL before calling ieee80211_verify_sta_he_mcs_support() as it - now - will correctly check this itself. Remove the redundant check. Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2018-06-29 09:51:39 +02:00
Gustavo A. R. Silva	47aa7861b9	mac80211: fix potential null pointer dereference he_op is being dereferenced before it is null checked, hence there is a potential null pointer dereference. Fix this by moving the pointer dereference after he_op has been properly null checked. Notice that, currently, he_op is already being null checked before calling this function at 4593: 4593 if (!he_oper || 4594 !ieee80211_verify_sta_he_mcs_support(sband, he_oper)) 4595 ifmgd->flags |= IEEE80211_STA_DISABLE_HE; but in case ieee80211_verify_sta_he_mcs_support is ever called without verifying he_oper is not null, we will end up having a null pointer dereference. So, we had better not take any chances. Addresses-Coverity-ID: 1470068 ("Dereference before null check") Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2018-06-29 09:50:43 +02:00
Arnd Bergmann	fe0984d389	cfg80211: track time using boottime The cfg80211 layer uses get_seconds() to read the current time in its suspend handling. This function is deprecated because of the 32-bit time_t overflow, and it can cause unexpected behavior when the time changes due to settimeofday() calls or leap second updates. In many cases, we want to use monotonic time instead, however cfg80211 explicitly tracks the time spent in suspend, so this changes the driver over to use ktime_get_boottime_seconds(), which is slightly slower, but not used in a fastpath here. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2018-06-29 09:49:28 +02:00
Johannes Berg	95bca62fb7	nl80211: check nla_parse_nested() return values At the very least we should check the return value if nla_parse_nested() is called with a non-NULL policy. Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2018-06-29 09:44:51 +02:00
Bob Copeland	188f60ab8e	nl80211: relax ht operation checks for mesh Commit `9757235f45` ("nl80211: correct checks for NL80211_MESHCONF_HT_OPMODE value") relaxed the range for the HT operation field in meshconf, while also adding checks requiring the non-greenfield and non-ht-sta bits to be set in certain circumstances. The latter bit is actually reserved for mesh BSSes according to Table 9-168 in 802.11-2016, so in fact it should not be set. wpa_supplicant sets these bits because the mesh and AP code share the same implementation, but authsae does not. As a result, some meshconf updates from authsae which set only the NONHT_MIXED protection bits were being rejected. In order to avoid breaking userspace by changing the rules again, simply accept the values with or without the bits set, and mask off the reserved bit to match the spec. While in here, update the 802.11-2012 reference to 802.11-2016. Fixes: `9757235f45` ("nl80211: correct checks for NL80211_MESHCONF_HT_OPMODE value") Cc: Masashi Honma <masashi.honma@gmail.com> Signed-off-by: Bob Copeland <bobcopeland@fb.com> Reviewed-by: Masashi Honma <masashi.honma@gmail.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2018-06-29 09:39:30 +02:00
Denis Kenzior	e7441c9274	mac80211: disable BHs/preemption in ieee80211_tx_control_port() On pre-emption enabled kernels the following print was being seen due to missing local_bh_disable/local_bh_enable calls. mac80211 assumes that pre-emption is disabled in the data path. BUG: using smp_processor_id() in preemptible [00000000] code: iwd/517 caller is __ieee80211_subif_start_xmit+0x144/0x210 [mac80211] [...] Call Trace: dump_stack+0x5c/0x80 check_preemption_disabled.cold.0+0x46/0x51 __ieee80211_subif_start_xmit+0x144/0x210 [mac80211] Fixes: `9118064914` ("mac80211: Add support for tx_control_port") Signed-off-by: Denis Kenzior <denkenz@gmail.com> [commit message rewrite, fixes tag] Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2018-06-29 09:39:08 +02:00
Ping-Ke Shih	9a98302de1	rtlwifi: rtl8821ae: fix firmware is not ready to run Without this patch, firmware will not run properly on rtl8821ae, and it causes bad user experience. For example, bad connection performance with low rate, higher power consumption, and so on. rtl8821ae uses two kinds of firmwares for normal and WoWlan cases, and each firmware has firmware data buffer and size individually. Original code always overwrite size of normal firmware rtlpriv->rtlhal.fwsize, and this mismatch causes firmware checksum error, then firmware can't start. In this situation, driver gives message "Firmware is not ready to run!". Fixes: `fe89707f0a` ("rtlwifi: rtl8821ae: Simplify loading of WOWLAN firmware") Signed-off-by: Ping-Ke Shih <pkshih@realtek.com> Cc: Stable <stable@vger.kernel.org> # 4.0+ Reviewed-by: Larry Finger <Larry.Finger@lwfinger.net> Signed-off-by: Kalle Valo <kvalo@codeaurora.org>	2018-06-29 10:08:47 +03:00
Tony Lindgren	ad5003300b	phy: mapphone-mdm6600: Fix wrong enum used for status lines Kbuild test robot reported: drivers/phy/motorola/phy-mapphone-mdm6600.c:188:16: warning: is used uninitialized in this function [-Wuninitialized] val |= values[i] << i; ~~~~~~^~~ Looking at phy_mdm6600_status(), values does get initialized by gpiod_get_array_value_cansleep(), but we are using the wrong enum in that function. Let's fix the use; both end up being three though, so no urgent rush on this one AFAIK. Fixes: `5d1ebbda03` ("phy: mapphone-mdm6600: Add USB PHY driver for MDM6600 on Droid 4") Reported-by: kbuild test robot <lkp@intel.com> Signed-off-by: Tony Lindgren <tony@atomide.com> Reviewed-by: Pavel Machek <pavel@ucw.cz> Signed-off-by: Kishon Vijay Abraham I <kishon@ti.com>	2018-06-29 12:00:22 +05:30
Jaedon Shin	d70262ea0e	phy: phy-brcm-usb-init: Fix power down USB 3.0 PHY when XHCI reenabled Unset is required to enable USB 3.0 PHY when XHCI reenabled in response to setting PHY3_IDDQ_OVERRIDE in uninit(). Fixes: `cd6f769fde` ("phy: phy-brcm-usb-init: Power down USB 3.0 PHY when XHCI disabled") Signed-off-by: Jaedon Shin <jaedon.shin@gmail.com> Signed-off-by: Kishon Vijay Abraham I <kishon@ti.com>	2018-06-29 12:00:22 +05:30
Wolfram Sang	12b731dd46	i2c: gpio: initialize SCL to HIGH again It seems that during the conversion from gpio* to gpiod*, the initial state of SCL was wrongly switched to LOW. Fix it to be HIGH again. Fixes: `7bb75029ef` ("i2c: gpio: Enforce open drain through gpiolib") Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com> Tested-by: Geert Uytterhoeven <geert+renesas@glider.be> Reviewed-by: Linus Walleij <linus.walleij@linaro.org> Signed-off-by: Wolfram Sang <wsa@the-dreams.de> Cc: stable@kernel.org	2018-06-29 08:23:12 +02:00
Peter Rosin	9aa613674f	i2c: smbus: kill memory leak on emulated and failed DMA SMBus xfers If DMA safe memory was allocated, but the subsequent I2C transfer fails the memory is leaked. Plug this leak. Fixes: `8a77821e74` ("i2c: smbus: use DMA safe buffers for emulated SMBus transactions") Signed-off-by: Peter Rosin <peda@axentia.se> Signed-off-by: Wolfram Sang <wsa@the-dreams.de> Cc: stable@kernel.org	2018-06-29 08:19:52 +02:00
Wolfram Sang	2173ed0adc	i2c: algos: bit: mention our experience about initial states So, if somebody wants to re-implement this in the future, we pinpoint to a problem case. Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Wolfram Sang <wsa@the-dreams.de>	2018-06-29 08:19:51 +02:00
Wolfram Sang	2a2c8ee2d7	Revert "i2c: algo-bit: init the bus to a known state" This reverts commit `3e5f06bed7`. As per bugzilla #200045, this caused a regression. I don't really see a way to fix it without having the hardware. So, revert the patch and I will fix the issue I was seeing originally in the i2c-gpio driver itself. I couldn't find new users of this algorithm since, so there should be no one depending on the new behaviour. Reported-by: Sergey Larin <cerg2010cerg2010@mail.ru> Fixes: `3e5f06bed7` ("i2c: algo-bit: init the bus to a known state") Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Tested-by: Sergey Larin <cerg2010cerg2010@mail.ru> Signed-off-by: Wolfram Sang <wsa@the-dreams.de> Cc: stable@kernel.org	2018-06-29 08:19:41 +02:00
David S. Miller	58c77bf1d4	Merge branch 'ila-Cleanup' Tom Herbert says: ==================== ila: Cleanup Perform some cleanup in ILA code. This includes: - Fix rhashtable walk for cases where nl dumps are done with multiple function calls. Add a skip index to skip over entries in a node that have been previously visited. Call rhashtable_walk_peek to avoid dropping items between calls to ila_nl_dump. - Call alloc_bucket_spinlocks to create bucket locks. - Split out module initialization and netlink definitions into separate files. - Add ILA_CMD_FLUSH netlink command to clear the ILA translation table. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2018-06-29 11:32:55 +09:00
Tom Herbert	b6e71bdebb	ila: Flush netlink command to clear xlat table Add ILA_CMD_FLUSH netlink command to clear the ILA translation table. Signed-off-by: Tom Herbert <tom@quantonium.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-06-29 11:32:55 +09:00
Tom Herbert	ad68147ef2	ila: Create main ila source file Create a main ila file that contains the module initialization functions as well as netlink definitions. Previously these were defined in ila_xlat and ila_common. This approach allows better extensibility. Signed-off-by: Tom Herbert <tom@quantonium.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-06-29 11:32:55 +09:00
Tom Herbert	b893281715	ila: Call library function alloc_bucket_locks To allocate the array of bucket locks for the hash table we now call library function alloc_bucket_spinlocks. Signed-off-by: Tom Herbert <tom@quantonium.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-06-29 11:32:55 +09:00
Tom Herbert	f7a2ba5ab9	ila: Fix use of rhashtable walk in ila_xlat.c Perform better EAGAIN handling, handle the case where ila_dump_info fails and we missed objects in the dump, and add a skip index to skip over ila entries in a list on a rhashtable node that have already been visited (by a previous call to ila_nl_dump). Signed-off-by: Tom Herbert <tom@quantonium.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-06-29 11:32:55 +09:00
David S. Miller	6d268910a4	Merge branch 'hns3-a-few-code-improvements' Peng Li says: ==================== net: hns3: a few code improvements This patchset fixes a few code stylistic issues from concentrated review, no functional changes introduced. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2018-06-29 11:06:35 +09:00
Huazhong Tan	ab68059e15	net: hns3: use lower_32_bits and upper_32_bits MACRO lower_32_bits and upper_32_bits can help to get bits 0-31 and bits 32-63 of a number, so just use it. Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com> Signed-off-by: Peng Li <lipeng321@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-06-29 11:06:34 +09:00
Huazhong Tan	541a7bd6bf	net: hns3: remove back in struct hclge_hw hclge_hw is embedded in hclge_dev, so use container_of instead of back to get hclge_dev. Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com> Signed-off-by: Peng Li <lipeng321@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-06-29 11:06:34 +09:00
Peng Li	43e2b1c7f4	net: hns3: remove the Redundant put_vector in hns3_client_uninit The interface h->ae_algo->ops->put_vector is called in both hns3_nic_dealloc_vector_data and hns3_nic_uninit_vector_data in hns3_client_uninit, which causes the vector to be freed twice. This patch removes the redundant put_vector so that the vector is freed only once. Signed-off-by: Peng Li <lipeng321@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-06-29 11:06:34 +09:00
Peng Li	ccc2bef829	net: hns3: print the ret value in error information Printing the ret value in the error information can help find the reason. Signed-off-by: Peng Li <lipeng321@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-06-29 11:06:34 +09:00
Peng Li	48569cdaaf	net: hns3: extraction an interface for state init|uninit Extract an interface for state init|uninit to make the code easier to read. Signed-off-by: Peng Li <lipeng321@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-06-29 11:06:34 +09:00
Peng Li	fe589e0454	net: hns3: remove unused head file in hnae3.c linux/slab.h is not used in hnae3.h, this patch removes it. Signed-off-by: Peng Li <lipeng321@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-06-29 11:06:34 +09:00
Peng Li	0e6084aa1c	net: hns3: add unlikely for error check An invalid first bd of a packet and an invalid ring head for the tx IRQ do not occur often; they may occur when there is an error. Adding unlikely to the error check branches is better for performance. Signed-off-by: Peng Li <lipeng321@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-06-29 11:06:34 +09:00
Peng Li	94c5e53213	net: hns3: add l4_type check for both ipv4 and ipv6 HW supports checksums for UDP, TCP and SCTP packets for both ipv4 and ipv6, but does not support checksums for other packet types for ipv4 or ipv6. Signed-off-by: Peng Li <lipeng321@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-06-29 11:06:34 +09:00
Peng Li	36cbbdf643	net: hns3: add vector status check before free vector If the hdev->vector_status[vector_id] is already HCLGE_INVALID_VPORT, should log the error and return. Signed-off-by: Peng Li <lipeng321@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-06-29 11:06:34 +09:00
Peng Li	e718a93fee	net: hns3: rename the interface for init_client_instance and uninit_client_instance The interfaces init_client_instance and uninit_client_instance do not register anything; they only initialize the client instance. This patch renames the related interfaces so that the function names indicate their purpose. Signed-off-by: Peng Li <lipeng321@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-06-29 11:06:34 +09:00
Peng Li	b204bc7484	net: hns3: remove hclge_get_vector_index from hclge_bind_ring_with_vector In hclge_unmap_ring_frm_vector, there are 2 steps: step 1: get vector index. step 2: unbind ring with vector. But it gets the vector id again in the step 2 interface. This patch removes hclge_get_vector_index from hclge_bind_ring_with_vector, and makes the steps the same as in the hns3 PF driver. Signed-off-by: Peng Li <lipeng321@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-06-29 11:06:34 +09:00
Jeff Moyer	5a14e91d55	dev-dax: check_vma: ratelimit dev_info-s This is easily triggered from userspace, so let's ratelimit the messages. Signed-off-by: Jeff Moyer <jmoyer@redhat.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2018-06-28 18:23:36 -07:00
Dan Williams	b62cc6fdd7	libnvdimm, pmem: Fix memcpy_mcsafe() return code handling in nsio_rw_bytes() Commit `60622d6822` "x86/asm/memcpy_mcsafe: Return bytes remaining" converted callers of memcpy_mcsafe() to expect a positive 'bytes remaining' value rather than a negative error code. The nsio_rw_bytes() conversion failed to return success. The failure is benign in that nsio_rw_bytes() will end up writing back what it just read. Fixes: `60622d6822` ("x86/asm/memcpy_mcsafe: Return bytes remaining") Cc: Dan Williams <dan.j.williams@intel.com> Reviewed-by: Vishal Verma <vishal.l.verma@intel.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2018-06-28 18:21:30 -07:00
Jann Horn	0da74120c5	selinux: move user accesses in selinuxfs out of locked regions If a user is accessing a file in selinuxfs with a pointer to a userspace buffer that is backed by e.g. a userfaultfd, the userspace access can stall indefinitely, which can block fsi->mutex if it is held. For sel_read_policy(), remove the locking, since this method doesn't seem to access anything that requires locking. For sel_read_bool(), move the user access below the locked region. For sel_write_bool() and sel_commit_bools_write(), move the user access up above the locked region. Cc: stable@vger.kernel.org Fixes: `1da177e4c3` ("Linux-2.6.12-rc2") Signed-off-by: Jann Horn <jannh@google.com> Acked-by: Stephen Smalley <sds@tycho.nsa.gov> [PM: removed an unused variable in sel_read_policy()] Signed-off-by: Paul Moore <paul@paul-moore.com>	2018-06-28 20:39:54 -04:00
David Ahern	4c79579b44	bpf: Change bpf_fib_lookup to return lookup status For ACLs implemented using either FIB rules or FIB entries, the BPF program needs the FIB lookup status to be able to drop the packet. Since the bpf_fib_lookup API has not reached a released kernel yet, change the return code to contain an encoding of the FIB lookup result and return the nexthop device index in the params struct. In addition, inform the BPF program of any post FIB lookup reason as to why the packet needs to go up the stack. The fib result for unicast routes must have an egress device, so remove the check that it is non-NULL. Signed-off-by: David Ahern <dsahern@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2018-06-29 00:02:02 +02:00
Kleber Sacilotto de Souza	3203c90100	test_bpf: flag tests that cannot be jited on s390 Flag with FLAG_EXPECTED_FAIL the BPF_MAXINSNS tests that cannot be jited on s390 because they exceed BPF_SIZE_MAX and fail when CONFIG_BPF_JIT_ALWAYS_ON is set. Also set .expected_errcode to -ENOTSUPP so the tests pass in that case. Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com> Acked-by: Song Liu <songliubraving@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2018-06-28 23:58:39 +02:00
Gustavo Padovan	c981c01164	Merge tag 'ib-fbdev-drm-v4.19-deferred-console-takeover' of https://github.com/bzolnier/linux into drm-misc-next Immutable branch between fbdev and drm for the v4.19 merge window (contains the deferred console takeover feature) Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.com> # gpg: Signature made Thu 28 Jun 2018 10:24:50 AM -03 # gpg: using RSA key 7E33B63FA047C20B # gpg: Can't check signature: public key not found # Conflicts: # drivers/gpu/drm/i915/i915_gem.c # drivers/gpu/drm/i915/intel_crt.c # drivers/gpu/drm/i915/intel_display.c # drivers/gpu/drm/i915/intel_lrc.c Link: https://patchwork.freedesktop.org/patch/msgid/2462549.rLSfW9kX99@amdc3058	2018-06-28 18:56:03 -03:00
Chris Wilson	9512f985c3	drm/i915/execlists: Direct submission of new requests (avoid tasklet/ksoftirqd) Back in commit `27af5eea54` ("drm/i915: Move execlists irq handler to a bottom half"), we came to the conclusion that running our CSB processing and ELSP submission from inside the irq handler was a bad idea. A really bad idea as we could impose nearly 1s latency on other users of the system, on average! Deferring our work to a tasklet allowed us to do the processing with irqs enabled, reducing the impact to an average of about 50us. We have since eradicated the use of forcewaked mmio from inside the CSB processing and ELSP submission, bringing the impact down to around 5us (on Kabylake); an order of magnitude better than our measurements 2 years ago on Broadwell and only about 2x worse on average than the gem_syslatency on an unladen system. In this iteration of the tasklet-vs-direct submission debate, we seek a compromise whereby we submit new requests immediately to the HW but defer processing the CS interrupt onto a tasklet. We gain the advantage of low-latency and ksoftirqd avoidance when waking up the HW, while avoiding the system-wide starvation of our CS irq-storms. Comparing the impact on the maximum latency observed (that is the time stolen from an RT process) over a 120s interval, repeated several times (using gem_syslatency, similar to RT's cyclictest) while the system is fully laden with i915 nops, we see that direct submission can actually improve the worst case. 
Maximum latency in microseconds of a third party RT thread (gem_syslatency -t 120 -f 2)
x Always using tasklets (a couple of >1000us outliers removed)
+ Only using tasklets from CS irq, direct submission of requests
[ASCII histogram omitted]
    N    Min    Max    Median    Avg          Stddev
x   118  91     186    124       125.28814    16.279137
+   120  92     187    109       112.00833    13.458617
Difference at 95.0% confidence: -13.2798 +/- 3.79219, -10.5994% +/- 3.02677% (Student's t, pooled s = 14.9237)
However the mean latency is adversely affected:
Mean latency in microseconds of a third party RT thread (gem_syslatency -t 120 -f 1)
x Always using tasklets
+ Only using tasklets from CS irq, direct submission of requests
[ASCII histogram omitted]
    N    Min      Max      Median    Avg          Stddev
x   120  3.506    3.727    3.631     3.6321417    0.02773109
+   120  3.834    4.149    4.039     4.0375167    0.041221676
Difference at 95.0% confidence: 0.405375 +/- 0.00888913, 11.1608% +/- 0.244735% (Student's t, pooled s = 0.03513)
However, since the mean latency corresponds to the amount of irqsoff processing we 
have to do for a CS interrupt, we only need to speed that up to benefit not just system latency but our own throughput. v2: Remember to defer submissions when under reset. v4: Only use direct submission for new requests v5: Be aware that with mixing direct tasklet evaluation and deferred tasklets, we may end up idling before running the deferred tasklet. v6: Remove the redundant likely() from tasklet_is_enabled(), restrict the annotation to reset_in_progress(). v7: Take the full timeline.lock when enabling perf_pmu stats as the tasklet is no longer a valid guard. A consequence is that the stats are now only valid for engines also using the timeline.lock to process state. Testcase: igt/gem_exec_latency/rthog* References: `27af5eea54` ("drm/i915: Move execlists irq handler to a bottom half") Suggested-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20180628201211.13837-9-chris@chris-wilson.co.uk	2018-06-28 22:55:10 +01:00
Chris Wilson	fd8526e509	drm/i915/execlists: Trust the CSB Now that we use the CSB stored in the CPU friendly HWSP, we do not need to track interrupts for when the mmio CSB registers are valid and can just check where we read up to last from the cached HWSP. This means we can forgo the atomic bit tracking from interrupt, and in the next patch it means we can check the CSB at any time. v2: Change the splitting inside reset_prepare, we only want to lose testing the interrupt in this patch, the next patch requires the change in locking Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20180628201211.13837-8-chris@chris-wilson.co.uk	2018-06-28 22:55:09 +01:00
Chris Wilson	3800cd1953	drm/i915/execlists: Stop storing the CSB read pointer in the mmio register As we now never read back our current head position from the CSB pointers register, and the HW itself doesn't use it to prevent overwriting unread CSB entries, we do not need to keep updating the register. As it turns out this register is not listed as being shadowed, and so requires forcewake -- but we haven't been taking forcewake around it so the writes have probably been regularly dropped. Fortuitously, we only read the value after a reset where it did not matter, and zero was the right answer (well, close enough). Mika pointed out that this was how we used to do it (accidentally!) before he fixed it in commit `cc53699b25` ("drm/i915: Use masked write for Context Status Buffer Pointer"). References: `cc53699b25` ("drm/i915: Use masked write for Context Status Buffer Pointer") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20180628201211.13837-7-chris@chris-wilson.co.uk	2018-06-28 22:55:08 +01:00
Chris Wilson	f4b58f0438	drm/i915/execlists: Reset CSB write pointer after reset On HW reset, the HW clears the write pointer (to 0). But since it also writes its first CSB entry to slot 0, we need to reset the write pointer back to the element before (so the first entry we read is 0). This is required for the next patch, where we trust the CSB completely! v2: Use _MASKED_FIELD v3: Store the reset value, so that we differentiate between mmio/hwsp transparently and without pretense. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20180628201211.13837-6-chris@chris-wilson.co.uk	2018-06-28 22:55:07 +01:00
Chris Wilson	bc4237ec8d	drm/i915/execlists: Unify CSB access pointers Following the removal of the last workarounds, the only CSB mmio access is for the old vGPU interface. The mmio registers presented by vGPU do not require forcewake and can be treated as ordinary volatile memory, i.e. they behave just like the HWSP access just at a different location. We can reduce the CSB access to a set of read/write/buffer pointers and treat the various paths identically and not worry about forcewake. (Forcewake is a nightmare for worst-case latency, and we want to process this all with irqsoff -- no latency allowed!) v2: Comments, comments, comments. Well, 2 bonus comments. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20180628201211.13837-5-chris@chris-wilson.co.uk	2018-06-28 22:55:06 +01:00
Chris Wilson	8ea397fa70	drm/i915/execlists: Process one CSB update at a time In the next patch, we will process the CSB events directly from the submission path, rather than only after a CS interrupt. Hence, we will no longer have the need for a loop until the has-interrupt bit is clear, and in the meantime can remove that small optimisation. v2: Tvrtko pointed out it was safer to unconditionally kick the tasklet after each irq, when assuming that the tasklet is called for each irq. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20180628201211.13837-4-chris@chris-wilson.co.uk	2018-06-28 22:55:04 +01:00
Chris Wilson	d8857d541c	drm/i915/execlists: Pull CSB reset under the timeline.lock In the following patch, we will process the CSB events under the timeline.lock and not serialised by the tasklet. This also means that we will need to protect access to common variables such as execlists->csb_head with the timeline.lock during reset. v2: Move sync_irq to avoid deadlocks between taking timeline.lock from our interrupt handler. v3: Kill off the synchronize_hardirq as it raises more questions than it answers; now we use the timeline.lock entirely for CSB serialisation between the irq and elsewhere, we don't need to be so heavy handed with flushing v4: Treat request cancellation (wedging after failed reset) similarly Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20180628201211.13837-3-chris@chris-wilson.co.uk	2018-06-28 22:55:04 +01:00
Chris Wilson	0b02befa82	drm/i915/execlists: Pull submit after dequeue under timeline lock In the next patch, we will begin processing the CSB from inside the submission path (underneath an irqsoff section, and even from inside interrupt handlers). This means that updating the execlists->port[] will no longer be serialised by the tasklet but needs to be locked by the engine->timeline.lock instead. Pull dequeue and submit under the same lock for protection. (An alternate future plan is to keep the in/out arrays separate for concurrent processing and reduced lock coverage.) Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20180628201211.13837-2-chris@chris-wilson.co.uk	2018-06-28 22:55:03 +01:00
Chris Wilson	74093f3ecc	drm/i915: Drop posting reads to flush master interrupts We do not need to do a posting read of our uncached mmio write to re-enable the master interrupt lines after handling an interrupt, so don't. This saves us a slow UC read before we can process the interrupt, most noticeable in execlists where any stall imposes extra latency on GPU command execution. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Ville Syrjala <ville.syrjala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20180628201211.13837-1-chris@chris-wilson.co.uk	2018-06-28 22:55:02 +01:00
Michal Wajdeczko	f7dc0157e4	drm/i915/uc: Fetch GuC/HuC firmwares from guc/huc specific init We're fetching GuC/HuC firmwares directly from uc level during init_early stage but this breaks guc/huc struct isolation and also strict SW-only initialization rule for init_early. Move fw fetching to init phase and do it separately per guc/huc struct. v2: don't forget to move wopcm_init - Michele v3: fetch in init_misc phase - Michal Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Cc: Michel Thierry <michel.thierry@intel.com> Reviewed-by: Michel Thierry <michel.thierry@intel.com> #2 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20180628141522.62788-2-michal.wajdeczko@intel.com	2018-06-28 22:51:33 +01:00
Michal Wajdeczko	c39d2e7e35	drm/i915/guc: Use intel_guc_init_misc to hide GuC internals We will add more init steps to misc phase and there is no need to expose them separately for use in uc_init_misc function. Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Cc: Michel Thierry <michel.thierry@intel.com> Reviewed-by: Michel Thierry <michel.thierry@intel.com> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20180628141522.62788-1-michal.wajdeczko@intel.com	2018-06-28 22:51:32 +01:00
Jesper Dangaard Brouer	509fda105b	samples/bpf: xdp_rxq_info action XDP_TX must adjust MAC-addrs XDP_TX requires also changing the MAC-addrs, else some hardware may drop the TX packet before reaching the wire. This was observed with driver mlx5. If xdp_rxq_info select --action XDP_TX the swapmac functionality is activated. It is also possible to manually enable via cmdline option --swapmac. This is practical if wanting to measure the overhead of writing/updating payload for other action types. Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: Toke Høiland-Jørgensen <toke@toke.dk> Acked-by: Song Liu <songliubraving@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2018-06-28 23:50:20 +02:00
Jesper Dangaard Brouer	0d25c43ab9	samples/bpf: extend xdp_rxq_info to read packet payload There is a cost associated with reading the packet data payload that this test ignored. Add option --read to allow enabling reading part of the payload. This sample/tool helps us analyse an issue observed with a NIC mlx5 (ConnectX-5 Ex) and an Intel(R) Xeon(R) CPU E5-1650 v4. With no_touch of data: Running XDP on dev:mlx5p1 (ifindex:8) action:XDP_DROP options:no_touch XDP stats CPU pps issue-pps XDP-RX CPU 0 14,465,157 0 XDP-RX CPU 1 14,464,728 0 XDP-RX CPU 2 14,465,283 0 XDP-RX CPU 3 14,465,282 0 XDP-RX CPU 4 14,464,159 0 XDP-RX CPU 5 14,465,379 0 XDP-RX CPU total 86,789,992 When not touching data, we observe that the CPUs have idle cycles. When reading data the CPUs are 100% busy in softirq. With reading data: Running XDP on dev:mlx5p1 (ifindex:8) action:XDP_DROP options:read XDP stats CPU pps issue-pps XDP-RX CPU 0 9,620,639 0 XDP-RX CPU 1 9,489,843 0 XDP-RX CPU 2 9,407,854 0 XDP-RX CPU 3 9,422,289 0 XDP-RX CPU 4 9,321,959 0 XDP-RX CPU 5 9,395,242 0 XDP-RX CPU total 56,657,828 The effect seen above is a result of cache misses occurring when more RXQs are being used. Based on perf-event observations, our conclusion is that the CPU's DDIO (Direct Data I/O) chooses to deliver packets into main memory instead of the L3 cache. We also found that this can be mitigated either by using fewer RXQs or by reducing the NIC's RX-ring size. Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: Toke Høiland-Jørgensen <toke@toke.dk> Acked-by: Song Liu <songliubraving@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2018-06-28 23:50:20 +02:00
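
The xdp_rxq_info commit 509fda105b listed above notes that XDP_TX must also rewrite the MAC addresses, otherwise some hardware may drop the frame before it reaches the wire. A minimal userspace sketch of that swap follows. It is not the code from the commit: the names `MAC_LEN` and `swap_mac_checked` are hypothetical, and the `data`/`data_end` pair only imitates the bounds-check style that XDP programs use on packet buffers.

```c
#include <stdint.h>
#include <string.h>

/* Length of an Ethernet MAC address in bytes (hypothetical local name). */
#define MAC_LEN 6

/*
 * Swap the destination and source MAC addresses at the start of an
 * Ethernet frame. `data` points at the first byte of the frame and
 * `data_end` points one past its last byte.
 *
 * Returns 0 on success, or -1 if the frame is too short to hold both
 * addresses, in which case the frame is left untouched.
 */
static int swap_mac_checked(uint8_t *data, uint8_t *data_end)
{
    uint8_t tmp[MAC_LEN];

    /* Both addresses must lie fully inside the buffer. */
    if (data + 2 * MAC_LEN > data_end)
        return -1;

    memcpy(tmp, data, MAC_LEN);                /* save destination   */
    memcpy(data, data + MAC_LEN, MAC_LEN);     /* source -> dest     */
    memcpy(data + MAC_LEN, tmp, MAC_LEN);      /* saved dest -> src  */
    return 0;
}
```

In a real XDP program the equivalent swap would run before returning XDP_TX; the sketch only shows the byte manipulation and the length check, not the BPF program structure or the sample's option handling.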


781771 Commits