The default value of 5ms will use GPIO hardware based debounce clocks
that will keep L4PER from idling consuming about extra 30mW.
Use a value of 10ms that is above the hardware debounce maximum of
7.95ms forcing software based debouncing.
This allows droid4 to enter PER retention during idle as long as UARTs
are idled and USB modules unloaded or unbound.
Note that there seems to be a pending issue with having droid 4 enter core
retention during idle where GPIO bank 1 needs to be reset late after init
for some reason to not block core retention. In addition to that, we are
also missing GPIO related PM runtime calls for omap4 that will be posted
separately.
Cc: Marcel Partap <mpartap@gmx.net>
Cc: Merlijn Wajer <merlijn@wizzup.org>
Cc: Michael Scott <hashcode0f@gmail.com>
Cc: NeKit <nekit1000@gmail.com>
Cc: Pavel Machek <pavel@ucw.cz>
Cc: Sebastian Reichel <sre@kernel.org>
Signed-off-by: Tony Lindgren <tony@atomide.com>
By reconfiguring few GPIOs in the dts file we can make duovero parlor
hit retention during idle:
1. Let's a larger debounce value for gpio-keys
This will then make gpio-keys use software debounce and the GPIO
debounce clock is not enabled.
2. Let's allow WLAN suspend for mwifiex
This can be done just by adding keep-power-in-suspend.
3. Let's reconfigure smsc911x driver to use GPIO edge interrupt
This will allow using NFSroot while the system idles, and the kernel
has quite a few dts files with "smsc,lan9115" compatible using edge
interrupts.
Then to have the system hit core retention during idle, the UARTs
needs to be idled and USB modules need to be unloaded or unbound.
Cc: Ash Charles <ash@gumstix.com>
Signed-off-by: Tony Lindgren <tony@atomide.com>
The cooling device properties, like "#cooling-cells" and
"dynamic-power-coefficient", should either be present for all the CPUs
of a cluster or none. If these are present only for a subset of CPUs of
a cluster then things will start falling apart as soon as the CPUs are
brought online in a different order. For example, this will happen
because the operating system looks for such properties in the CPU node
it is trying to bring up, so that it can register a cooling device.
Add such missing properties.
Do minor rearrangement as well to keep ordering consistent.
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Shawn Guo <shawnguo@kernel.org>
When there is no need to send an IPI to a CPU with VP number > 64
we can do the job with fast HVCALL_SEND_IPI hypercall.
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Michael Kelley <mikelley@microsoft.com>
Cc: devel@linuxdriverproject.org
Cc: "K. Y. Srinivasan" <kys@microsoft.com>
Cc: Haiyang Zhang <haiyangz@microsoft.com>
Cc: Stephen Hemminger <sthemmin@microsoft.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Tianyu Lan <Tianyu.Lan@microsoft.com>
Cc: "Michael Kelley (EOSG)" <Michael.H.Kelley@microsoft.com>
Link: https://lkml.kernel.org/r/20180622170625.30688-4-vkuznets@redhat.com
Current Hyper-V TLFS (v5.0b) claims that HvCallSendSyntheticClusterIpi
hypercall can't be 'fast' (passing parameters through registers) but
apparently this is not true, Windows always uses 'fast' version. We can
do the same in Linux too.
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Michael Kelley <mikelley@microsoft.com>
Cc: devel@linuxdriverproject.org
Cc: "K. Y. Srinivasan" <kys@microsoft.com>
Cc: Haiyang Zhang <haiyangz@microsoft.com>
Cc: Stephen Hemminger <sthemmin@microsoft.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Tianyu Lan <Tianyu.Lan@microsoft.com>
Cc: "Michael Kelley (EOSG)" <Michael.H.Kelley@microsoft.com>
Link: https://lkml.kernel.org/r/20180622170625.30688-3-vkuznets@redhat.com
Implement 'Fast' hypercall with two 64-bit input parameter. This is
going to be used for HvCallSendSyntheticClusterIpi hypercall.
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: devel@linuxdriverproject.org
Cc: "K. Y. Srinivasan" <kys@microsoft.com>
Cc: Haiyang Zhang <haiyangz@microsoft.com>
Cc: Stephen Hemminger <sthemmin@microsoft.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Tianyu Lan <Tianyu.Lan@microsoft.com>
Cc: "Michael Kelley (EOSG)" <Michael.H.Kelley@microsoft.com>
Link: https://lkml.kernel.org/r/20180622170625.30688-2-vkuznets@redhat.com
Apparently didn't get carefully checked.
Fixes: 50525c332b ("drm: content-type property for HDMI connector")
Cc: Hans Verkuil <hans.verkuil@cisco.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Stanislav Lisovskiy <stanislav.lisovskiy@intel.com>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Stanislav Lisovskiy <stanislav.lisovskiy@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20180702091023.695-1-daniel.vetter@ffwll.ch
The wl1835mod.pdf data sheet says this pretty clearly for WL_IRQ line:
"WLAN SDIO out-of-band interrupt line. Set to rising edge (active high)
by default."
And it seems this interrupt can be optionally configured to use falling
edge too since commit bd763482c8 ("wl18xx: wlan_irq: support platform
dependent interrupt types").
On omap4, if the wlcore interrupt is configured as level instead of edge,
L4PER will stop doing hardware based idling after ifconfig wlan0 down is
done and the WL_EN line is pulled down.
The symptoms show up with L4PER status registers no longer showing the
IDLEST bits as 2 but as 0 for all the active GPIO banks and for
L4PER_CLKCTRL. Also the l4per_pwrdm RET count stops increasing in
the /sys/kernel/debug/pm_debug/count.
While there is also probably a GPIO related issue that needs to be
still fixed, this change gets us to the point where we can have L4PER
idling.
I'm guessing wlcore was at some point configured to use level interrupts
because of edge handling issues in gpio-omap. However, with the recent
fixes to gpio-omap the edge interrupts seem to be working just fine.
Let's change it for all omap boards with wlcore interrupt set as level.
Cc: Dave Gerlach <d-gerlach@ti.com>
Cc: Eyal Reizer <eyalr@ti.com>
Cc: Grygorii Strashko <grygorii.strashko@ti.com>
Cc: Kalle Valo <kvalo@codeaurora.org>
Cc: Nishanth Menon <nm@ti.com>
Cc: Tero Kristo <t-kristo@ti.com>
[tony@atomide.com updated comments a bit for gpio issue]
Signed-off-by: Tony Lindgren <tony@atomide.com>
The board has USB VBUS detection available over GPIO. Plug it to
extcon node of USB1 and USB2 ports.
Signed-off-by: Roger Quadros <rogerq@ti.com>
Signed-off-by: Tony Lindgren <tony@atomide.com>
The board has USB VBUS detection available over GPIO. Plug it to
extcon node of USB1 and USB2 ports.
Signed-off-by: Roger Quadros <rogerq@ti.com>
Signed-off-by: Tony Lindgren <tony@atomide.com>
Both ports on the dra7-evm and related boards can be used
as dual-role ports. Although we don't enable dual-role mode
for USB2 port let's add the necessary extcon bits to it.
Move the common portion of extcon_usb2 into dra7-evm-common.dtsi
Signed-off-by: Roger Quadros <rogerq@ti.com>
Signed-off-by: Tony Lindgren <tony@atomide.com>
Dual-role support was added in v4.12. We should be using
it for USB2 port on the am57xx-idk.
Cc: <stable@vger.kernel.org> [4.16+]
Reported-by: Bin Liu <b-liu@ti.com>
Signed-off-by: Roger Quadros <rogerq@ti.com>
Signed-off-by: Tony Lindgren <tony@atomide.com>
The cooling device properties, like "#cooling-cells" and
"dynamic-power-coefficient", should either be present for all the CPUs
of a cluster or none. If these are present only for a subset of CPUs of
a cluster then things will start falling apart as soon as the CPUs are
brought online in a different order. For example, this will happen
because the operating system looks for such properties in the CPU node
it is trying to bring up, so that it can register a cooling device.
Add such missing properties.
Fix other missing properties (clocks, supply, clock latency) as well to
make it all work.
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Tony Lindgren <tony@atomide.com>
The cooling device properties, like "#cooling-cells" and
"dynamic-power-coefficient", should either be present for all the CPUs
of a cluster or none. If these are present only for a subset of CPUs of
a cluster then things will start falling apart as soon as the CPUs are
brought online in a different order. For example, this will happen
because the operating system looks for such properties in the CPU node
it is trying to bring up, so that it can register a cooling device.
Add such missing properties.
Fix other missing properties (clocks, supply, clock latency) as well to
make it all work.
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Tony Lindgren <tony@atomide.com>
According to i.MX6 datasheet, the LDO_1P1's typical
programming operating range is 1.0V to 1.2V, and
the LDO_2P5's typical programming operating range
is 2.25V to 2.75V, correct LDO_1P1 and LDO_2P5's
regulator range settings for i.MX6 SoCs.
Signed-off-by: Anson Huang <Anson.Huang@nxp.com>
Signed-off-by: Shawn Guo <shawnguo@kernel.org>
When a resource group enters pseudo-locksetup mode it reflects that the
platform supports cache pseudo-locking and the resource group is unused,
ready to be used for a pseudo-locked region. Until it is set up as a
pseudo-locked region the resource group is "locked down" such that no new
tasks or cpus can be assigned to it. This is accomplished in a user visible
way by making the cpus, cpus_list, and tasks resctrl files inaccassible
(user cannot read from or write to these files).
When the resource group changes to pseudo-locked mode it represents a cache
pseudo-locked region. While not appropriate to make any changes to the cpus
assigned to this region it is useful to make it easy for the user to see
which cpus are associated with the pseudo-locked region.
Modify the permissions of the cpus/cpus_list file when the resource group
changes to pseudo-locked mode to support reading (not writing). The
information presented to the user when reading the file are the cpus
associated with the pseudo-locked region.
Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: fenghua.yu@intel.com
Cc: tony.luck@intel.com
Cc: vikas.shivappa@linux.intel.com
Cc: gavin.hindman@intel.com
Cc: jithu.joseph@intel.com
Cc: dave.hansen@intel.com
Cc: hpa@zytor.com
Link: https://lkml.kernel.org/r/12756b7963b6abc1bffe8fb560b87b75da827bd1.1530421961.git.reinette.chatre@intel.com
As the mode of a resource group changes, the operations it can support may
also change. One way in which the supported operations are managed is to
modify the permissions of the files within the resource group's resctrl
directory.
At the moment only two possible permissions are supported: the default
permissions or no permissions in support for when the operation is "locked
down". It is possible where an operation on a resource group may have more
possibilities. For example, if by default changes can be made to the
resource group by writing to a resctrl file while the current settings can
be obtained by reading from the file, then it may be possible that in
another mode it is only possible to read the current settings, and not
change them.
Make it possible to modify some of the permissions of a resctrl file in
support of a more flexible way to manage the operations on a resource
group. In this preparation work the original behavior is maintained where
all permissions are restored.
Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: fenghua.yu@intel.com
Cc: tony.luck@intel.com
Cc: vikas.shivappa@linux.intel.com
Cc: gavin.hindman@intel.com
Cc: jithu.joseph@intel.com
Cc: dave.hansen@intel.com
Cc: hpa@zytor.com
Link: https://lkml.kernel.org/r/8773aadfade7bcb2c48a45fa294a04d2c03bb0a1.1530421961.git.reinette.chatre@intel.com
When a resource group enters pseudo-locksetup mode a pseudo_lock_region is
associated with it. When the user writes to the resource group's schemata
file the CBM of the requested pseudo-locked region is entered into the
pseudo_lock_region struct. If any part of pseudo-lock region creation fails
the resource group will remain in pseudo-locksetup mode with the
pseudo_lock_region associated with it.
In case of failure during pseudo-lock region creation care needs to be
taken to ensure that the pseudo_lock_region struct associated with the
resource group is cleared from any pseudo-locking data - especially the
CBM. This is because the existence of a pseudo_lock_region struct with a
CBM is significant in other areas of the code, for example, the display of
bit_usage and initialization of a new resource group.
Fix the error path of pseudo-lock region creation to ensure that the
pseudo_lock_region struct is cleared at each error exit.
Fixes: 018961ae55 ("x86/intel_rdt: Pseudo-lock region creation/removal core")
Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: fenghua.yu@intel.com
Cc: tony.luck@intel.com
Cc: vikas.shivappa@linux.intel.com
Cc: gavin.hindman@intel.com
Cc: jithu.joseph@intel.com
Cc: dave.hansen@intel.com
Cc: hpa@zytor.com
Link: https://lkml.kernel.org/r/49b4782f6d204d122cee3499e642b2772a98d2b4.1530421026.git.reinette.chatre@intel.com
Per Documentation/process/license-rules.rst the SPDX notation for
header file should be /* */ style, so fix it accordingly.
Fixes: 9f30b6b1a9 ("ARM: dts: imx: Add basic dtsi file for imx6sll")
Signed-off-by: Fabio Estevam <fabio.estevam@nxp.com>
Signed-off-by: Shawn Guo <shawnguo@kernel.org>
The AM3517-EVM uses a TPS65023 PMIC. This is already defined
by: compatible = "ti,tps65023"
There doesn't seem to be a need to have each regulator in the
PMIC with the 'compatible = "regulator-fixed"' since each
regulator has a min and max setting.
Signed-off-by: Adam Ford <aford173@gmail.com>
Signed-off-by: Tony Lindgren <tony@atomide.com>
i.MX6ULL has different operating ranges than i.MX6UL so add the
operating points for the i.MX6ULL and remove them from board device
trees. A 25mV offset is added to the minimum allowed values like for the
i.MX6UL.
The valid frequencies are now selected by the cpufreq driver according
to ratings stored in fuses since commit 0aa9abd4c2 ("cpufreq: imx6q:
check speed grades for i.MX6ULL")
Signed-off-by: Sébastien Szymanski <sebastien.szymanski@armadeus.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Shawn Guo <shawnguo@kernel.org>
Mainline Commit b74c2b21e1 added the pinmux
settings for mmc1, however this pin (0x9a0) is routed to P9_42 on the cape
header. Thus any BeagleBone cape that utilizes P9_42 triggers mmc0's Write
Protect.
Fixes: b74c2b21e1 ("ARM: dts: am33xx: Add pinmux data for mmc1 in
am335x-evm, evmsk and beaglebone")
Signed-off-by: Robert Nelson <robertcnelson@gmail.com>
CC: Faiz Abbas <faiz_abbas@ti.com>
CC: Tony Lindgren <tony@atomide.com>
CC: Jason Kridner <jkridner@beagleboard.org>
CC: Drew Fustini <drew@beagleboard.org>
Signed-off-by: Tony Lindgren <tony@atomide.com>
The checking code is done in kmap_atomic() so that we don't need to
check it in update_mmu_cache() again. There is no need to implement
it for cache aliasing or cache non-aliasing versions. We can just
implement one version for both.
Signed-off-by: Greentime Hu <greentime@andestech.com>
The RMI4 touchscreen driver applied inversion and axis swap in the
wrong order, violating the DT binding for those properties. This was fixed in
645a397, so correct the RDU1 DT to apply the inversion to the
correct axis.
Tested on Zii RDU1 00-5105-30 rev B
Signed-off-by: Nick Dyer <nick@shmanahar.org>
Reviewed-by: Lucas Stach <l.stach@pengutronix.de>
Signed-off-by: Shawn Guo <shawnguo@kernel.org>
Initialization of several Amstrad Delta devices was once moved to
late_initcall by commit f7519d8c82 ("ARM: OMAP1: ams-delta: register
latch dependent devices later"). The purpose of that move was to allow
smooth conversion of Amstrad Delta latches to GPIO.
After successful conversion only ams_delta_serio driver was moved back
to device_initcall by commit 8d09a1bb31 ("input: serio: ams-delta:
toggle keyboard power over GPIO"). Registration of ams-delta-nand and
lcd_ams_delta devices was kept in late_initcall in order to avoid
corrupt data reported by the serio driver on boot. Registration of
cx20442-codec device was kept there for it to be probed after a
regulator on which it depended was ready.
The issue of "keybrd_dataout" GPIO pin not initilized to GPIO_OUT_LOW
before other latch2 pins causing the corruption have been apparently
fixed by commit 5322c19b117a ("ARM: OMAP1: ams-delta: Hog
"keybrd_dataout" GPIO pin"). In turn, the issue of missing regulator
has been fixed by commit 50c678772a ("ASoC: cx20442: Don't ignore
regulator_get() errors.").
Simplify the board init code by moving registration of those devices
back to init_machine.
Signed-off-by: Janusz Krzysztofik <jmkrzyszt@gmail.com>
Signed-off-by: Tony Lindgren <tony@atomide.com>
Instead of exporting the FIQ buffer symbol to be used in
ams-delta-serio driver, pass it to the driver as platform_data.
Signed-off-by: Janusz Krzysztofik <jmkrzyszt@gmail.com>
Acked-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Signed-off-by: Tony Lindgren <tony@atomide.com>
The driver still obtains IRQ number from a hardcoded GPIO. Use IRQ
resource instead.
For this to work on Amstrad Delta, add the IRQ resource to
ams-delta-serio platform device structure. Obtain the IRQ number
assigned to "keybrd_clk" GPIO pin from FIQ initialization routine.
As a benefit, the driver no longer needs to include
<mach/board-ams-delta.h>.
Signed-off-by: Janusz Krzysztofik <jmkrzyszt@gmail.com>
Acked-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Signed-off-by: Tony Lindgren <tony@atomide.com>
Split the header file into two parts and move them to directories where
they belong.
Information on internal structure of FIQ buffer is moved to
<linux/platform_data/ams-delta-fiq.h> for ams-delta-serio driver use.
Other information used by ams-delta board init file and FIQ code is
made local to mach-omap1 root directory.
Signed-off-by: Janusz Krzysztofik <jmkrzyszt@gmail.com>
Acked-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Signed-off-by: Tony Lindgren <tony@atomide.com>
From the very beginning, input GPIO pins of ams-delta serio port have
been used by FIQ handler, not serio driver.
Don't request those pins from the ams-delta-serio driver any longer,
instead keep them requested and initialized by the FIQ initialization
routine which already requests them and releases while identifying GPIO
IRQs.
Signed-off-by: Janusz Krzysztofik <jmkrzyszt@gmail.com>
Acked-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Signed-off-by: Tony Lindgren <tony@atomide.com>
With introduction of GPIO lookup tables to Amstrad Delta board init
file, semantics of symbols representing OMAP GPIO pins defined in
<mach/board-ams-delta.h> changed from statically assigned global GPIO
numbers to hardware pin numbers local to OMAP "gpio-0-15" chip.
This patch modifies deferred FIQ interrupt handler so it no longer uses
static GPIO numbers in favour of IRQ data descriptors obtained at FIQ
initialization time from descriptor of the GPIO chip with use of its
hardware pin numbers. The chip descriptor is passed from the board
init file.
As a benefit, the deferred FIQ handler should work faster.
Signed-off-by: Janusz Krzysztofik <jmkrzyszt@gmail.com>
Acked-by: Linus Walleij <linus.walleij@linaro.org>
[tony@atomide.com: removed duplicate gpiochip_match_by_label]
Signed-off-by: Tony Lindgren <tony@atomide.com>
This renames usbphy nodes 2 & 3, so that they follow the same
format as usbphy node 0 & 1 from imx53.dtsi.
Signed-off-by: Sebastian Reichel <sebastian.reichel@collabora.co.uk>
Signed-off-by: Shawn Guo <shawnguo@kernel.org>
Add information about 3V3 power rail to avoid kernel warnings,
that dummy regulators have been added.
Signed-off-by: Sebastian Reichel <sebastian.reichel@collabora.co.uk>
Signed-off-by: Shawn Guo <shawnguo@kernel.org>
Pinctrl_usbh1reg defines pinmux setting for reset GPIO used by
usbh1phy, but is not referenced by that node. Fix that.
Signed-off-by: Andrey Smirnov <andrew.smirnov@gmail.com>
Reviewed-by: Fabio Estevam <fabio.estevam@nxp.com>
Signed-off-by: Shawn Guo <shawnguo@kernel.org>
commit bfd694d5e2 ("mmc: core: Add tunable delay
before detecting card after card is inserted") adds
"u32 cd_debounce_delay_ms" to the last of mmc_gpio
struct and cause "char cd_label[0]" NOT work as string
pointer of card detect label, when "cat /proc/interrupts",
the devname for card detect gpio is incorrect as below:
144: 0 gpio-mxc 22 Edge ▒
161: 0 gpio-mxc 7 Edge ▒
Move the cd_label field down to fix this, and drop the
zero from the array size to prevent future similar bugs,
the result is correct as below:
144: 0 gpio-mxc 22 Edge 2198000.mmc cd
161: 0 gpio-mxc 7 Edge 2190000.mmc cd
Fixes: bfd694d5e2 ("mmc: core: Add tunable delay before detecting card after card is inserted")
Signed-off-by: Anson Huang <Anson.Huang@nxp.com>
Tested-by: Fabio Estevam <fabio.estevam@nxp.com>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Pull in branch for switching pxa to DMA slave maps, from Robert Jarzmik.
This is needed due to dependencies of changes for the pxamci mmc driver.
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
We found that the original implementation will only use the built-in dtb
pointer instead of the pointer pass from bootloader. This bug is fixed
by this patch.
Signed-off-by: Greentime Hu <greentime@andestech.com>
data cache.
This issue is found by Guo Ren. Based on the Documentation/core-api/cachetlb.rst
and it says:
"Any necessary cache flushing or other coherency operations
that need to occur should happen here. If the processor's
instruction cache does not snoop cpu stores, it is very
likely that you will need to flush the instruction cache
for copy_to_user_page()."
"If the icache does not snoop stores then this
routine(flush_icache_range) will need to flush it."
Signed-off-by: Guo Ren <ren_guo@c-sky.com>
Signed-off-by: Greentime Hu <greentime@andestech.com>
If SGMII was selected in the DT then the device should
write the SGMII enable bit.
If SGMII is not selected in the DT then the SGMII bit
should be disabled.
Signed-off-by: Dan Murphy <dmurphy@ti.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
GCC_PLUGIN_SUBDIR has never been used. If you really need this in
the future, please re-add it then.
For now, the code is unused. Remove.
'export HOSTLIBS' is not necessary either.
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
In the quest to remove all stack VLA usage from the kernel[1], this
switches to using a stack size large enough for the saved routine and
adds a sanity check making sure the routine doesn't overflow into the
0x600 exception handler.
[1] https://lkml.kernel.org/r/CA+55aFzCG-zNmZwX4A2FQpadafLfEzK6CC=qPXydAacU1RqZWA@mail.gmail.com
Signed-off-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Magnus Karlsson says:
====================
This patch set fixes three bugs in the SKB TX path of AF_XDP.
Details in the individual commits.
The structure of the patch set is as follows:
Patch 1: Fix for lost completion message
Patch 2-3: Fix for possible multiple completions of single packet
Patch 4: Fix potential race during error
Changes from v1:
* Added explanation of race in commit message of patch 4.
====================
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
There is a potential race in the TX completion code for the SKB
case. One process enters the sendmsg code of an AF_XDP socket in order
to send a frame. The execution eventually trickles down to the driver
that is told to send the packet. However, it decides to drop the
packet due to some error condition (e.g., rings full) and frees the
SKB. This will trigger the SKB destructor and a completion will be
sent to the AF_XDP user space through its
single-producer/single-consumer queues.
At the same time a TX interrupt has fired on another core and it
dispatches the TX completion code in the driver. It does its HW
specific things and ends up freeing the SKB associated with the
transmitted packet. This will trigger the SKB destructor and a
completion will be sent to the AF_XDP user space through its
single-producer/single-consumer queues. With a pseudo call stack, it
would look like this:
Core 1:
sendmsg() being called in the application
netdev_start_xmit()
Driver entered through ndo_start_xmit
Driver decides to free the SKB for some reason (e.g., rings full)
Destructor of SKB called
xskq_produce_addr() is called to signal completion to user space
Core 2:
TX completion irq
NAPI loop
Driver irq handler for TX completions
Frees the SKB
Destructor of SKB called
xskq_produce_addr() is called to signal completion to user space
We now have a violation of the single-producer/single-consumer
principle for our queues as there are two threads trying to produce at
the same time on the same queue.
Fixed by introducing a spin_lock in the destructor. In regards to the
performance, I get around 1.74 Mpps for txonly before and after the
introduction of the spinlock. There is of course some impact due to
the spin lock but it is in the less significant digits that are too
noisy for me to measure. But let us say that the version without the
spin lock got 1.745 Mpps in the best case and the version with 1.735
Mpps in the worst case, then that would mean a maximum drop in
performance of 0.5%.
Fixes: 35fcde7f8d ("xsk: support for Tx")
Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Sendmsg in the SKB path of AF_XDP can now return EBUSY when a packet
was discarded and completed by the driver. Just ignore this message
in the sample application.
Fixes: b4b8faa1de ("samples/bpf: sample application and documentation for AF_XDP sockets")
Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>
Reported-by: Pavel Odintsov <pavel@fastnetmon.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Fixed a bug in which a frame could be completed more than once
when an error was returned from dev_direct_xmit(). The code
erroneously retried sending the message leading to multiple
calls to the SKB destructor and therefore multiple completions
of the same buffer to user space.
The error code in this case has been changed from EAGAIN to EBUSY
in order to tell user space that the sending of the packet failed
and the buffer has been return to user space through the completion
queue.
Fixes: 35fcde7f8d ("xsk: support for Tx")
Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>
Reported-by: Pavel Odintsov <pavel@fastnetmon.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
The code in xskq_produce_addr erroneously checked if there
was up to LAZY_UPDATE_THRESHOLD amount of space in the completion
queue. It only needs to check if there is one slot left in the
queue. This bug could under some circumstances lead to a WARN_ON_ONCE
being triggered and the completion message to user space being lost.
Fixes: 35fcde7f8d ("xsk: support for Tx")
Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>
Reported-by: Pavel Odintsov <pavel@fastnetmon.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
The cooling device properties, like "#cooling-cells" and
"dynamic-power-coefficient", should either be present for all the CPUs
of a cluster or none. If these are present only for a subset of CPUs of
a cluster then things will start falling apart as soon as the CPUs are
brought online in a different order. For example, this will happen
because the operating system looks for such properties in the CPU node
it is trying to bring up, so that it can register a cooling device.
Add such missing properties.
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Shawn Guo <shawnguo@kernel.org>