28933 Commits

Author SHA1 Message Date
Tao Huang
70f1413e90 Merge remote branch 'android-4.19' of https://android.googlesource.com/kernel/common
* android-4.19: (2854 commits)
  ANDROID: move up spin_unlock_bh() ahead of remove_proc_entry()
  BACKPORT: arm64: tags: Preserve tags for addresses translated via TTBR1
  UPSTREAM: arm64: memory: Implement __tag_set() as common function
  UPSTREAM: arm64/mm: fix variable 'tag' set but not used
  UPSTREAM: arm64: avoid clang warning about self-assignment
  ANDROID: sdcardfs: evict dentries on fscrypt key removal
  ANDROID: fscrypt: add key removal notifier chain
  ANDROID: refactor build.config files to remove duplication
  ANDROID: Move from clang r353983c to r365631c
  ANDROID: gki_defconfig: remove PWRSEQ_EMMC and PWRSEQ_SIMPLE
  ANDROID: unconditionally compile sig_ok in struct module
  Linux 4.19.80
  perf/hw_breakpoint: Fix arch_hw_breakpoint use-before-initialization
  PCI: vmd: Fix config addressing when using bus offsets
  x86/asm: Fix MWAITX C-state hint value
  hwmon: Fix HWMON_P_MIN_ALARM mask
  tracing: Get trace_array reference for available_tracers files
  ftrace: Get a reference counter for the trace_array on filter files
  tracing/hwlat: Don't ignore outer-loop duration when calculating max_latency
  tracing/hwlat: Report total time spent in all NMIs during the sample
  ...

Conflicts:
	drivers/clk/rockchip/clk-mmc-phase.c
	drivers/gpu/drm/rockchip/rockchip_drm_vop.c
	drivers/regulator/core.c
	drivers/tty/serial/8250/8250_port.c
	drivers/usb/dwc3/core.h
	drivers/usb/dwc3/gadget.c
	drivers/usb/dwc3/gadget.h

Change-Id: I65599d770d6613caba14251b890fcfd1cfa0f100
2019-10-28 20:26:28 +08:00
Nick Desaulniers
eba171dc20 tracing: do not leak kernel addresses
CVE-2017-0630

This likely breaks tracing tools like trace-cmd.  It logs in the same
format but now addresses are all 0x0.

Bug: 34277115
Change-Id: Ifb0d4d2a184bf0d95726de05b1acee0287a375d9
Signed-off-by: Jian Qiu <qiujian@rock-chips.com>
2019-10-25 10:51:54 +08:00
Andy Yan
8415000c8f power: wakeup_reason: show total wfi time in suspend get via smcc
Show an accumulation of WFI time across every suspend state
since system bootup:

$ cat /sys/kernel/wakeup_reasons/total_suspend_wfi_time

Change-Id: I2856faabe2e883a7120931ed49bc0c4f0776600d
Signed-off-by: Andy Yan <andy.yan@rock-chips.com>
2019-10-23 16:01:29 +08:00
Greg Kroah-Hartman
d8a623cfbb This is the 4.19.80 stable release
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEZH8oZUiU471FcZm+ONu9yGCSaT4FAl2o0vkACgkQONu9yGCS
 aT50rQ//So/ShGy5+D6cmsPVbd8p/9X0fM5D6VjgJ2pj+HbalxW7JhJ0tyVdfkO1
 AS6p/24bmUMWI3pxJBbJjfggbyleV2UFqlHIei+SjfHd0sghZ+E/XY7mLdcKvx0a
 dGdXuxF+enb18pNiUKUrJUgqksgfO40yMXyZhjGf5A+/JmYsNaJhkyCWgPfGIB5i
 YuowOWC4Rkds/CQxyRSRJl+YSQWh8j7Ke0tdjhMQxnmyzFD0wcE5vUtq2vIP7yxO
 ypPrz9zWTase3y1yCcPsXd0m4Ji1VpF99hlgAr0HLFNcZn2kNYFpyblfOrI/zfoK
 RqUJgVVlZaWvhMOrueQO+XbebXdCWTJHiNTUDxIFakEAPNz+XkTQQf6LvzOjcFxB
 oLHnp1qBCn72y6o6Q3XmauFcmYBokyfBvXDAJsXYr+X5B4akn/fpyzzBCInX0h+r
 og5zpLbXDLszG3j76TPSpcjd+t4xqMy+MG+jkH/mLtMAFyaVi2sfMzsZJAQCACr2
 wNHRFQQGjDmQjovkwSRYENQBUuITSj+VrBUD0M7lnfA/Fni3BAJaxY5I6WeEvbeW
 ejzPVFZB8dfEy7hAKKiVEuI8/akcp8MUXfLJOfTxytLIpYllOBoVC4LtjgO5FYu3
 grSbRuowjrlUCtnCU9H18avLE1ScFFEGTmPOqI6xDqKiZf59QsQ=
 =sKVA
 -----END PGP SIGNATURE-----

Merge 4.19.80 into android-4.19

Changes in 4.19.80
	panic: ensure preemption is disabled during panic()
	f2fs: use EINVAL for superblock with invalid magic
	USB: rio500: Remove Rio 500 kernel driver
	USB: yurex: Don't retry on unexpected errors
	USB: yurex: fix NULL-derefs on disconnect
	USB: usb-skeleton: fix runtime PM after driver unbind
	USB: usb-skeleton: fix NULL-deref on disconnect
	xhci: Fix false warning message about wrong bounce buffer write length
	xhci: Prevent device initiated U1/U2 link pm if exit latency is too long
	xhci: Check all endpoints for LPM timeout
	xhci: Fix USB 3.1 capability detection on early xHCI 1.1 spec based hosts
	usb: xhci: wait for CNR controller not ready bit in xhci resume
	xhci: Prevent deadlock when xhci adapter breaks during init
	xhci: Increase STS_SAVE timeout in xhci_suspend()
	USB: adutux: fix use-after-free on disconnect
	USB: adutux: fix NULL-derefs on disconnect
	USB: adutux: fix use-after-free on release
	USB: iowarrior: fix use-after-free on disconnect
	USB: iowarrior: fix use-after-free on release
	USB: iowarrior: fix use-after-free after driver unbind
	USB: usblp: fix runtime PM after driver unbind
	USB: chaoskey: fix use-after-free on release
	USB: ldusb: fix NULL-derefs on driver unbind
	serial: uartlite: fix exit path null pointer
	USB: serial: keyspan: fix NULL-derefs on open() and write()
	USB: serial: ftdi_sio: add device IDs for Sienna and Echelon PL-20
	USB: serial: option: add Telit FN980 compositions
	USB: serial: option: add support for Cinterion CLS8 devices
	USB: serial: fix runtime PM after driver unbind
	USB: usblcd: fix I/O after disconnect
	USB: microtek: fix info-leak at probe
	USB: dummy-hcd: fix power budget for SuperSpeed mode
	usb: renesas_usbhs: gadget: Do not discard queues in usb_ep_set_{halt,wedge}()
	usb: renesas_usbhs: gadget: Fix usb_ep_set_{halt,wedge}() behavior
	USB: legousbtower: fix slab info leak at probe
	USB: legousbtower: fix deadlock on disconnect
	USB: legousbtower: fix potential NULL-deref on disconnect
	USB: legousbtower: fix open after failed reset request
	USB: legousbtower: fix use-after-free on release
	mei: me: add comet point (lake) LP device ids
	mei: avoid FW version request on Ibex Peak and earlier
	gpio: eic: sprd: Fix the incorrect EIC offset when toggling
	Staging: fbtft: fix memory leak in fbtft_framebuffer_alloc
	staging: vt6655: Fix memory leak in vt6655_probe
	iio: adc: hx711: fix bug in sampling of data
	iio: adc: ad799x: fix probe error handling
	iio: adc: axp288: Override TS pin bias current for some models
	iio: light: opt3001: fix mutex unlock race
	efivar/ssdt: Don't iterate over EFI vars if no SSDT override was specified
	perf llvm: Don't access out-of-scope array
	perf inject jit: Fix JIT_CODE_MOVE filename
	blk-wbt: fix performance regression in wbt scale_up/scale_down
	CIFS: Gracefully handle QueryInfo errors during open
	CIFS: Force revalidate inode when dentry is stale
	CIFS: Force reval dentry if LOOKUP_REVAL flag is set
	kernel/sysctl.c: do not override max_threads provided by userspace
	mm/vmpressure.c: fix a signedness bug in vmpressure_register_event()
	firmware: google: increment VPD key_len properly
	gpiolib: don't clear FLAG_IS_OUT when emulating open-drain/open-source
	iio: adc: stm32-adc: move registers definitions
	iio: adc: stm32-adc: fix a race when using several adcs with dma and irq
	cifs: use cifsInodeInfo->open_file_lock while iterating to avoid a panic
	btrfs: fix incorrect updating of log root tree
	btrfs: fix uninitialized ret in ref-verify
	NFS: Fix O_DIRECT accounting of number of bytes read/written
	MIPS: Disable Loongson MMI instructions for kernel build
	MIPS: elf_hwcap: Export userspace ASEs
	ACPICA: ACPI 6.3: PPTT add additional fields in Processor Structure Flags
	ACPI/PPTT: Add support for ACPI 6.3 thread flag
	arm64: topology: Use PPTT to determine if PE is a thread
	Fix the locking in dcache_readdir() and friends
	media: stkwebcam: fix runtime PM after driver unbind
	arm64/sve: Fix wrong free for task->thread.sve_state
	tracing/hwlat: Report total time spent in all NMIs during the sample
	tracing/hwlat: Don't ignore outer-loop duration when calculating max_latency
	ftrace: Get a reference counter for the trace_array on filter files
	tracing: Get trace_array reference for available_tracers files
	hwmon: Fix HWMON_P_MIN_ALARM mask
	x86/asm: Fix MWAITX C-state hint value
	PCI: vmd: Fix config addressing when using bus offsets
	perf/hw_breakpoint: Fix arch_hw_breakpoint use-before-initialization
	Linux 4.19.80

Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
Change-Id: I8ef1835585f79b752712a835852a4bc960ae3b97
2019-10-17 15:33:07 -07:00
Mark-PK Tsai
0603d82bca perf/hw_breakpoint: Fix arch_hw_breakpoint use-before-initialization
commit 310aa0a25b upstream.

If the compiler's auto-initialization feature is disabled, i.e. if
-fplugin-arg-structleak_plugin-byref or -ftrivial-auto-var-init=pattern
are disabled, arch_hw_breakpoint may be used before initialization after:

  9a4903dde2 ("perf/hw_breakpoint: Split attribute parse and commit")

On our ARM platform, the struct step_ctrl in arch_hw_breakpoint, which
used to be zero-initialized by kzalloc(), may be used in
arch_install_hw_breakpoint() without initialization.

Signed-off-by: Mark-PK Tsai <mark-pk.tsai@mediatek.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alix Wu <alix.wu@mediatek.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: YJ Chiang <yj.chiang@mediatek.com>
Link: https://lkml.kernel.org/r/20190906060115.9460-1-mark-pk.tsai@mediatek.com
[ Minor edits. ]
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: Doug Anderson <dianders@chromium.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-10-17 13:45:44 -07:00
Steven Rostedt (VMware)
b9040fab5f tracing: Get trace_array reference for available_tracers files
commit 194c2c74f5 upstream.

As instances may have different tracers available, we need to look at the
trace_array descriptor that shows the list of the available tracers for the
instance. But there's a race between opening the file and an admin
deleting the instance. The trace_array_get() needs to be called before
accessing the trace_array.
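
The ordering described above (take a reference first, then touch the instance) can be sketched in plain userspace C. This is only an illustration under stated assumptions: `struct instance`, `instance_get()`, `instance_put()` and `instance_remove()` are hypothetical names, not the kernel's trace_array API, and the locking that the kernel needs around these steps is left out.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for an instance descriptor (not struct trace_array). */
struct instance {
    bool registered;  /* still on the list of live instances? */
    int ref;          /* number of open files pinning the instance */
};

/* Take a reference; fail if the instance has already been removed. */
int instance_get(struct instance *inst)
{
    assert(inst != NULL);
    if (!inst->registered)
        return -ENODEV;
    inst->ref++;
    return 0;
}

/* Drop a reference taken by instance_get(). */
void instance_put(struct instance *inst)
{
    assert(inst != NULL && inst->ref > 0);
    inst->ref--;
}

/* Removal is refused while any open file still holds a reference. */
int instance_remove(struct instance *inst)
{
    assert(inst != NULL);
    if (inst->ref > 0)
        return -EBUSY;
    inst->registered = false;
    return 0;
}
```

In this model an open handler would call `instance_get()` before reading anything from the instance, and the release handler would call `instance_put()`.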

Cc: stable@vger.kernel.org
Fixes: 607e2ea167 ("tracing: Set up infrastructure to allow tracers for instances")
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-10-17 13:45:41 -07:00
Steven Rostedt (VMware)
a6c9fb2c2c ftrace: Get a reference counter for the trace_array on filter files
commit 9ef16693af upstream.

The ftrace set_ftrace_filter and set_ftrace_notrace files are specific for
an instance now. They need to take a reference to the instance otherwise
there could be a race between accessing the files and deleting the instance.

It wasn't until the :mod: caching that these file operations started
referencing the trace_array directly.

Cc: stable@vger.kernel.org
Fixes: 673feb9d76 ("ftrace: Add :mod: caching infrastructure to trace_array")
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-10-17 13:45:40 -07:00
Srivatsa S. Bhat (VMware)
b7f758631d tracing/hwlat: Don't ignore outer-loop duration when calculating max_latency
commit fc64e4ad80 upstream.

max_latency is intended to record the maximum ever observed hardware
latency, which may occur in either part of the loop (inner/outer). So
we need to also consider the outer-loop sample when updating
max_latency.
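
As a rough userspace illustration of the point above, the running maximum has to consider both the inner-loop and the outer-loop measurement of each sample. The function name and the array-based interface are assumptions made for this sketch; the tracer itself keeps this state differently.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: return the largest latency seen across n samples,
 * where each sample has an inner-loop and an outer-loop measurement.
 */
uint64_t max_observed_latency(const uint64_t *inner, const uint64_t *outer,
                              size_t n)
{
    uint64_t max_latency = 0;

    assert(inner != NULL && outer != NULL);
    for (size_t i = 0; i < n; i++) {
        /* Either part of the loop may hold the larger value. */
        uint64_t latency = inner[i] > outer[i] ? inner[i] : outer[i];

        if (latency > max_latency)
            max_latency = latency;
    }
    return max_latency;
}
```

Looking only at `inner[i]` would miss a spike that showed up in the outer loop alone.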

Link: http://lkml.kernel.org/r/157073345463.17189.18124025522664682811.stgit@srivatsa-ubuntu

Fixes: e7c15cd8a1 ("tracing: Added hardware latency tracer")
Cc: stable@vger.kernel.org
Signed-off-by: Srivatsa S. Bhat (VMware) <srivatsa@csail.mit.edu>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-10-17 13:45:39 -07:00
Srivatsa S. Bhat (VMware)
6271cbff93 tracing/hwlat: Report total time spent in all NMIs during the sample
commit 98dc19c114 upstream.

nmi_total_ts is supposed to record the total time spent in *all* NMIs
that occur on the given CPU during the (active portion of the)
sampling window. However, the code seems to be overwriting this
variable for each NMI, thereby only recording the time spent in the
most recent NMI. Fix it by accumulating the duration instead.
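
The difference between overwriting and accumulating can be shown with a small userspace sketch. `struct sample_state` and both function names are hypothetical; only the field name nmi_total_ts is taken from the text above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-sample state; nmi_total_ts mirrors the name in the text. */
struct sample_state {
    uint64_t nmi_total_ts;   /* time attributed to NMIs in this sample */
    unsigned int nmi_count;  /* number of NMIs seen in this sample */
};

/* Behaviour described as the bug: each NMI overwrites the previous value. */
void record_nmi_overwrite(struct sample_state *s, uint64_t duration)
{
    assert(s != NULL);
    s->nmi_total_ts = duration;
    s->nmi_count++;
}

/* Behaviour described as the fix: durations are summed over the sample. */
void record_nmi_accumulate(struct sample_state *s, uint64_t duration)
{
    assert(s != NULL);
    s->nmi_total_ts += duration;
    s->nmi_count++;
}
```

With three NMIs of 10, 20 and 30 time units, the first variant ends up reporting 30 and the second 60.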

Link: http://lkml.kernel.org/r/157073343544.17189.13911783866738671133.stgit@srivatsa-ubuntu

Fixes: 7b2c862501 ("tracing: Add NMI tracing in hwlat detector")
Cc: stable@vger.kernel.org
Signed-off-by: Srivatsa S. Bhat (VMware) <srivatsa@csail.mit.edu>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-10-17 13:45:38 -07:00
Michal Hocko
7bbe6eefdb kernel/sysctl.c: do not override max_threads provided by userspace
commit b0f53dbc4b upstream.

Partially revert 16db3d3f11 ("kernel/sysctl.c: threads-max observe
limits") because the patch is causing a regression to any workload which
needs to override the auto-tuning of the limit provided by kernel.

set_max_threads is implementing a boot time guesstimate to provide a
sensible limit of the concurrently running threads so that runaways will
not deplete all the memory.  This is a good thing in general but there
are workloads which might need to increase this limit for an application
to run (reportedly WebSphere MQ is affected) and that is simply not
possible after the mentioned change.  It is also very dubious to
override an admin decision by an estimation that doesn't have any direct
relation to correctness of the kernel operation.

Fix this by dropping set_max_threads from sysctl_max_threads so any
value is accepted as long as it fits into MAX_THREADS which is important
to check because allowing more threads could break internal robust futex
restriction.  While at it, do not use MIN_THREADS as the lower boundary
because it is also only a heuristic for automatic estimation and admin
might have a good reason to stop new threads from being created even when
below this limit.
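
A minimal sketch of the resulting policy, assuming a lower bound of 1 (the text only says that MIN_THREADS is no longer used) and an illustrative upper bound; `SKETCH_MAX_THREADS` and `set_threads_max()` are hypothetical names, not the kernel's sysctl handler.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Illustrative upper bound only; the kernel defines MAX_THREADS itself. */
#define SKETCH_MAX_THREADS 0x3fffffff

/*
 * Accept any administrator-supplied value within [1, SKETCH_MAX_THREADS].
 * No boot-time estimate is applied on top of the requested value.
 */
int set_threads_max(int *threads_max, int requested)
{
    assert(threads_max != NULL);
    if (requested < 1 || requested > SKETCH_MAX_THREADS)
        return -EINVAL;
    *threads_max = requested;
    return 0;
}
```

The point of the design is that the only hard limit left is the one tied to correctness (the upper bound); the heuristic no longer overrides what the administrator asked for.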

This became more severe when we switched x86 from 4k to 8k kernel
stacks.  Starting with 6538b8ea88 ("x86_64: expand kernel stack to
16K") (3.16) we use THREAD_SIZE_ORDER = 2 and that halved the auto-tuned
value.

In the particular case

  3.12
  kernel.threads-max = 515561

  4.4
  kernel.threads-max = 200000

Neither of the two values is really insane on a 32GB machine.

I am not sure we want/need to tune the max_thread value further.  If
anything the tuning should be removed altogether if proven not useful in
general.  But we definitely need a way to override this auto-tuning.

Link: http://lkml.kernel.org/r/20190922065801.GB18814@dhcp22.suse.cz
Fixes: 16db3d3f11 ("kernel/sysctl.c: threads-max observe limits")
Signed-off-by: Michal Hocko <mhocko@suse.com>
Reviewed-by: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Heinrich Schuchardt <xypron.glpk@gmx.de>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-10-17 13:45:19 -07:00
Will Deacon
7d1688c673 panic: ensure preemption is disabled during panic()
commit 20bb759a66 upstream.

Calling 'panic()' on a kernel with CONFIG_PREEMPT=y can leave the
calling CPU in an infinite loop, but with interrupts and preemption
enabled.  From this state, userspace can continue to be scheduled,
despite the system being "dead" as far as the kernel is concerned.

This is easily reproducible on arm64 when booting with "nosmp" on the
command line; a couple of shell scripts print out a periodic "Ping"
message whilst another triggers a crash by writing to
/proc/sysrq-trigger:

  | sysrq: Trigger a crash
  | Kernel panic - not syncing: sysrq triggered crash
  | CPU: 0 PID: 1 Comm: init Not tainted 5.2.15 #1
  | Hardware name: linux,dummy-virt (DT)
  | Call trace:
  |  dump_backtrace+0x0/0x148
  |  show_stack+0x14/0x20
  |  dump_stack+0xa0/0xc4
  |  panic+0x140/0x32c
  |  sysrq_handle_reboot+0x0/0x20
  |  __handle_sysrq+0x124/0x190
  |  write_sysrq_trigger+0x64/0x88
  |  proc_reg_write+0x60/0xa8
  |  __vfs_write+0x18/0x40
  |  vfs_write+0xa4/0x1b8
  |  ksys_write+0x64/0xf0
  |  __arm64_sys_write+0x14/0x20
  |  el0_svc_common.constprop.0+0xb0/0x168
  |  el0_svc_handler+0x28/0x78
  |  el0_svc+0x8/0xc
  | Kernel Offset: disabled
  | CPU features: 0x0002,24002004
  | Memory Limit: none
  | ---[ end Kernel panic - not syncing: sysrq triggered crash ]---
  |  Ping 2!
  |  Ping 1!
  |  Ping 1!
  |  Ping 2!

The issue can also be triggered on x86 kernels if CONFIG_SMP=n,
otherwise local interrupts are disabled in 'smp_send_stop()'.

Disable preemption in 'panic()' before re-enabling interrupts.

Link: http://lkml.kernel.org/r/20191002123538.22609-1-will@kernel.org
Link: https://lore.kernel.org/r/BX1W47JXPMR8.58IYW53H6M5N@dragonstone
Signed-off-by: Will Deacon <will@kernel.org>
Reported-by: Xogium <contact@xogium.me>
Reviewed-by: Kees Cook <keescook@chromium.org>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Petr Mladek <pmladek@suse.com>
Cc: Feng Tang <feng.tang@intel.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-10-17 13:44:46 -07:00
Liang Chen
70537fcbc1 Revert "FROMLIST: sched/topology: Make Energy Aware Scheduling depend on schedutil"
We use the cpufreq_interactive governor with Energy Aware Scheduling.

This reverts commit 1e6b1214f1.

Change-Id: I782bed880c858537a89e8928921cbecaeb31d593
Signed-off-by: Liang Chen <cl@rock-chips.com>
2019-10-15 11:14:27 +08:00
Kalesh Singh
d112bff3b7 BACKPORT: PM/sleep: Expose suspend stats in sysfs
Userspace can get suspend stats from the suspend stats debugfs node.
Since debugfs doesn't have stable ABI, expose suspend stats in
sysfs under /sys/power/suspend_stats.

Signed-off-by: Kalesh Singh <kaleshsingh@google.com>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
(cherry picked from commit 2c8db5bef9)
[ fixed conflicts in Documentation ]
Bug: 129087298
Change-Id: Ia8f0d8551ae693cb5f3e1abc609fb9bbfeb695f5
Signed-off-by: Tri Vo <trong@google.com>
2019-10-11 14:04:42 -07:00
Tri Vo
2c9f5fa9c3 UPSTREAM: PM / wakeup: Show wakeup sources stats in sysfs
Add an ID and a device pointer to 'struct wakeup_source'. Use them to
expose wakeup sources statistics in sysfs under
/sys/class/wakeup/wakeup<ID>/*.

Co-developed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Co-developed-by: Stephen Boyd <swboyd@chromium.org>
Signed-off-by: Stephen Boyd <swboyd@chromium.org>
Signed-off-by: Tri Vo <trong@android.com>
Tested-by: Kalesh Singh <kaleshsingh@google.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
(cherry picked from commit c8377adfa7)
Bug: 129087298
Signed-off-by: Tri Vo <trong@google.com>
Change-Id: Iecd3412423f9d499981f44d3b69507eaa62a2cd9
2019-10-11 14:04:42 -07:00
Tri Vo
5bc2bdfb22 UPSTREAM: PM / wakeup: Use wakeup_source_register() in wakelock.c
kernel/power/wakelock.c duplicates wakeup source creation and
registration code from drivers/base/power/wakeup.c.

Change struct wakelock's wakeup source to a pointer and use
wakeup_source_register() function to create and register said wakeup
source. Use wakeup_source_unregister() on cleanup path.

Signed-off-by: Tri Vo <trong@android.com>
Reviewed-by: Stephen Boyd <swboyd@chromium.org>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
(cherry picked from commit 2434aea58e)
Bug: 129087298
Signed-off-by: Tri Vo <trong@google.com>
Change-Id: I4e6b3c613c561fb382f17c3c31b6584aebabfb5d
2019-10-11 14:04:42 -07:00
Greg Kroah-Hartman
ef55d5261c This is the 4.19.79 stable release
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEZH8oZUiU471FcZm+ONu9yGCSaT4FAl2grBgACgkQONu9yGCS
 aT6xRBAA0pTW2W/VvzBHBLeVlmNtwQZb8x7civVb72iZkltKR9tTPim90PULpz/P
 iO7kh8KqkgVUqdgBE0VzkHGWUSThggfSTQiqzCqOgTwV8WQWqSF8ET0HU8zbglYB
 5pXSojoRYmurGVznd4Ll6aWa5brXIKwf1mDSrFHagOyOLxQmyggHaTRSLx36BSfj
 gunE2ideB1oTaPmd/2aTI03CU3jRwXmowe8rZIDa8pJEpplZPFdk0YOPXg2t6uRI
 bjJGO8bhfR/14r/3h76IwsEiVVXIcCeEVm0fos/H6NUypedfi7jlT0Ldzg1/zZti
 mUMkbPGHcJbOWfBYPQq8xQzviCa+MFraA4Tek5h/Lf7kf3NpjE20AnH3pb9TaqQf
 mJYUGziCoOOOz8k+0eNtIjIZiCysOnf9sI5rGhMYb9qfZoZGG6RiitqyVYNa+rzJ
 wvIUQZ4vSnYmQMAXqxyayfSZvFbMxv6pAdeH0NrXVRgFF6dnKG9TSsCnIuQaJxAE
 OQRaYEJktMUBs81hS0IjnJNDFLW3r++s87xEYvCt4L7XGSrxMJ3jW6xLZlmET68G
 4UIddJ81zIuqpGY1qoWdWZAp3nfRfSX4ehOnoNmIDyC9pRhiCKc+N6j5rX8gBNO/
 SO8YOaNf9RTphhEG6Op7u4ZbU+UR4pYP+rjKveyT2HKPH6D/Tv0=
 =wt6H
 -----END PGP SIGNATURE-----

Merge 4.19.79 into android-4.19

Changes in 4.19.79
	s390/process: avoid potential reading of freed stack
	KVM: s390: Test for bad access register and size at the start of S390_MEM_OP
	s390/topology: avoid firing events before kobjs are created
	s390/cio: exclude subchannels with no parent from pseudo check
	KVM: PPC: Book3S HV: Fix race in re-enabling XIVE escalation interrupts
	KVM: PPC: Book3S HV: Check for MMU ready on piggybacked virtual cores
	KVM: PPC: Book3S HV: Don't lose pending doorbell request on migration on P9
	KVM: X86: Fix userspace set invalid CR4
	KVM: nVMX: handle page fault in vmread fix
	nbd: fix max number of supported devs
	PM / devfreq: tegra: Fix kHz to Hz conversion
	ASoC: Define a set of DAPM pre/post-up events
	ASoC: sgtl5000: Improve VAG power and mute control
	powerpc/mce: Fix MCE handling for huge pages
	powerpc/mce: Schedule work from irq_work
	powerpc/powernv: Restrict OPAL symbol map to only be readable by root
	powerpc/powernv/ioda: Fix race in TCE level allocation
	powerpc/book3s64/mm: Don't do tlbie fixup for some hardware revisions
	can: mcp251x: mcp251x_hw_reset(): allow more time after a reset
	tools lib traceevent: Fix "robust" test of do_generate_dynamic_list_file
	crypto: qat - Silence smp_processor_id() warning
	crypto: skcipher - Unmap pages after an external error
	crypto: cavium/zip - Add missing single_release()
	crypto: caam - fix concurrency issue in givencrypt descriptor
	crypto: ccree - account for TEE not ready to report
	crypto: ccree - use the full crypt length value
	MIPS: Treat Loongson Extensions as ASEs
	power: supply: sbs-battery: use correct flags field
	power: supply: sbs-battery: only return health when battery present
	tracing: Make sure variable reference alias has correct var_ref_idx
	usercopy: Avoid HIGHMEM pfn warning
	timer: Read jiffies once when forwarding base clk
	PCI: vmd: Fix shadow offsets to reflect spec changes
	PCI: Restore Resizable BAR size bits correctly for 1MB BARs
	watchdog: imx2_wdt: fix min() calculation in imx2_wdt_set_timeout
	perf stat: Fix a segmentation fault when using repeat forever
	drm/omap: fix max fclk divider for omap36xx
	drm/msm/dsi: Fix return value check for clk_get_parent
	drm/nouveau/kms/nv50-: Don't create MSTMs for eDP connectors
	drm/i915/gvt: update vgpu workload head pointer correctly
	mmc: sdhci: improve ADMA error reporting
	mmc: sdhci-of-esdhc: set DMA snooping based on DMA coherence
	Revert "locking/pvqspinlock: Don't wait if vCPU is preempted"
	xen/xenbus: fix self-deadlock after killing user process
	ieee802154: atusb: fix use-after-free at disconnect
	s390/cio: avoid calling strlen on null pointer
	cfg80211: initialize on-stack chandefs
	arm64: cpufeature: Detect SSBS and advertise to userspace
	ima: always return negative code for error
	ima: fix freeing ongoing ahash_request
	fs: nfs: Fix possible null-pointer dereferences in encode_attrs()
	9p: Transport error uninitialized
	9p: avoid attaching writeback_fid on mmap with type PRIVATE
	xen/pci: reserve MCFG areas earlier
	ceph: fix directories inode i_blkbits initialization
	ceph: reconnect connection if session hang in opening state
	watchdog: aspeed: Add support for AST2600
	netfilter: nf_tables: allow lookups in dynamic sets
	drm/amdgpu: Fix KFD-related kernel oops on Hawaii
	drm/amdgpu: Check for valid number of registers to read
	pNFS: Ensure we do clear the return-on-close layout stateid on fatal errors
	pwm: stm32-lp: Add check in case requested period cannot be achieved
	x86/purgatory: Disable the stackleak GCC plugin for the purgatory
	ntb: point to right memory window index
	thermal: Fix use-after-free when unregistering thermal zone device
	thermal_hwmon: Sanitize thermal_zone type
	libnvdimm/region: Initialize bad block for volatile namespaces
	fuse: fix memleak in cuse_channel_open
	libnvdimm/nfit_test: Fix acpi_handle redefinition
	sched/membarrier: Call sync_core only before usermode for same mm
	sched/membarrier: Fix private expedited registration check
	sched/core: Fix migration to invalid CPU in __set_cpus_allowed_ptr()
	perf build: Add detection of java-11-openjdk-devel package
	kernel/elfcore.c: include proper prototypes
	perf unwind: Fix libunwind build failure on i386 systems
	nfp: flower: fix memory leak in nfp_flower_spawn_vnic_reprs
	drm/radeon: Bail earlier when radeon.cik_/si_support=0 is passed
	KVM: PPC: Book3S HV: XIVE: Free escalation interrupts before disabling the VP
	KVM: nVMX: Fix consistency check on injected exception error code
	nbd: fix crash when the blksize is zero
	powerpc/pseries: Fix cpu_hotplug_lock acquisition in resize_hpt()
	powerpc/book3s64/radix: Rename CPU_FTR_P9_TLBIE_BUG feature flag
	tools lib traceevent: Do not free tep->cmdlines in add_new_comm() on failure
	tick: broadcast-hrtimer: Fix a race in bc_set_next
	perf tools: Fix segfault in cpu_cache_level__read()
	perf stat: Reset previous counts on repeat with interval
	riscv: Avoid interrupts being erroneously enabled in handle_exception()
	arm64: ssbd: Add support for PSTATE.SSBS rather than trapping to EL3
	KVM: arm64: Set SCTLR_EL2.DSSBS if SSBD is forcefully disabled and !vhe
	arm64: docs: Document SSBS HWCAP
	arm64: fix SSBS sanitization
	arm64: Add sysfs vulnerability show for spectre-v1
	arm64: add sysfs vulnerability show for meltdown
	arm64: enable generic CPU vulnerabilites support
	arm64: Always enable ssb vulnerability detection
	arm64: Provide a command line to disable spectre_v2 mitigation
	arm64: Advertise mitigation of Spectre-v2, or lack thereof
	arm64: Always enable spectre-v2 vulnerability detection
	arm64: add sysfs vulnerability show for spectre-v2
	arm64: add sysfs vulnerability show for speculative store bypass
	arm64: ssbs: Don't treat CPUs with SSBS as unaffected by SSB
	arm64: Force SSBS on context switch
	arm64: Use firmware to detect CPUs that are not affected by Spectre-v2
	arm64/speculation: Support 'mitigations=' cmdline option
	vfs: Fix EOVERFLOW testing in put_compat_statfs64
	coresight: etm4x: Use explicit barriers on enable/disable
	staging: erofs: fix an error handling in erofs_readdir()
	staging: erofs: some compressed cluster should be submitted for corrupted images
	staging: erofs: add two missing erofs_workgroup_put for corrupted images
	staging: erofs: detect potential multiref due to corrupted images
	cfg80211: add and use strongly typed element iteration macros
	cfg80211: Use const more consistently in for_each_element macros
	nl80211: validate beacon head
	Linux 4.19.79

Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
Change-Id: Ie4f85994b5f3e53658c42833d0dc712575d0902e
2019-10-11 19:13:57 +02:00
Balasubramani Vivekanandan
e5331c37c0 tick: broadcast-hrtimer: Fix a race in bc_set_next
[ Upstream commit b9023b91dd ]

When a cpu requests broadcasting, before starting the tick broadcast
hrtimer, bc_set_next() checks if the timer callback (bc_handler) is active
using hrtimer_try_to_cancel(). But hrtimer_try_to_cancel() does not provide
the required synchronization when the callback is active on another core.

The callback could have already executed tick_handle_oneshot_broadcast()
and could have also returned. But still there is a small time window where
the hrtimer_try_to_cancel() returns -1. In that case bc_set_next() returns
without doing anything, but the next_event of the tick broadcast clock
device is already set to a timeout value.

In the race condition diagram below, CPU #1 is running the timer callback
and CPU #2 is entering idle state and so calls bc_set_next().

In the worst case, the next_event will contain an expiry time, but the
hrtimer will not be started, which happens when the racing callback returns
HRTIMER_NORESTART. The hrtimer might never recover if all further requests
from the CPUs to subscribe to tick broadcast have a timeout greater than the
next_event of the tick broadcast clock device. This leads to cascading
failures, finally noticed as RCU stall warnings.

Here is a depiction of the race condition:

CPU #1 (Running timer callback)                   CPU #2 (Enter idle
                                                  and subscribe to
                                                  tick broadcast)
---------------------                             ---------------------

__run_hrtimer()                                   tick_broadcast_enter()

  bc_handler()                                      __tick_broadcast_oneshot_control()

    tick_handle_oneshot_broadcast()

      raw_spin_lock(&tick_broadcast_lock);

      dev->next_event = KTIME_MAX;                  //wait for tick_broadcast_lock
      //next_event for tick broadcast clock
      set to KTIME_MAX since no other cores
      subscribed to tick broadcasting

      raw_spin_unlock(&tick_broadcast_lock);

    if (dev->next_event == KTIME_MAX)
      return HRTIMER_NORESTART
    // callback function exits without
       restarting the hrtimer                      //tick_broadcast_lock acquired
                                                   raw_spin_lock(&tick_broadcast_lock);

                                                   tick_broadcast_set_event()

                                                     clockevents_program_event()

                                                       dev->next_event = expires;

                                                       bc_set_next()

                                                         hrtimer_try_to_cancel()
                                                         //returns -1 since the timer
                                                         callback is active. Exits without
                                                         restarting the timer
  cpu_base->running = NULL;

The comment that hrtimer cannot be armed from within the callback is
wrong. It is fine to start the hrtimer from within the callback. Also it is
safe to start the hrtimer from the enter/exit idle code while the broadcast
handler is active. The enter/exit idle code and the broadcast handler are
synchronized using tick_broadcast_lock. So there is no need for the
existing try to cancel logic. All this can be removed which will eliminate
the race condition as well.

Fixes: 5d1638acb9 ("tick: Introduce hrtimer based broadcast")
Originally-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Balasubramani Vivekanandan <balasubramani_vivekanandan@mentor.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/20190926135101.12102-2-balasubramani_vivekanandan@mentor.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-10-11 18:21:28 +02:00
Valdis Kletnieks
b0aaf65bb1 kernel/elfcore.c: include proper prototypes
[ Upstream commit 0f74914071 ]

When building with W=1, gcc properly complains that there are no prototypes:

  CC      kernel/elfcore.o
kernel/elfcore.c:7:17: warning: no previous prototype for 'elf_core_extra_phdrs' [-Wmissing-prototypes]
    7 | Elf_Half __weak elf_core_extra_phdrs(void)
      |                 ^~~~~~~~~~~~~~~~~~~~
kernel/elfcore.c:12:12: warning: no previous prototype for 'elf_core_write_extra_phdrs' [-Wmissing-prototypes]
   12 | int __weak elf_core_write_extra_phdrs(struct coredump_params *cprm, loff_t offset)
      |            ^~~~~~~~~~~~~~~~~~~~~~~~~~
kernel/elfcore.c:17:12: warning: no previous prototype for 'elf_core_write_extra_data' [-Wmissing-prototypes]
   17 | int __weak elf_core_write_extra_data(struct coredump_params *cprm)
      |            ^~~~~~~~~~~~~~~~~~~~~~~~~
kernel/elfcore.c:22:15: warning: no previous prototype for 'elf_core_extra_data_size' [-Wmissing-prototypes]
   22 | size_t __weak elf_core_extra_data_size(void)
      |               ^~~~~~~~~~~~~~~~~~~~~~~~

Provide the include file so that gcc is happy and we avoid potential code drift.

Link: http://lkml.kernel.org/r/29875.1565224705@turing-police
Signed-off-by: Valdis Kletnieks <valdis.kletnieks@vt.edu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-10-11 18:21:23 +02:00
KeMeng Shi
46ff0e2f86 sched/core: Fix migration to invalid CPU in __set_cpus_allowed_ptr()
[ Upstream commit 714e501e16 ]

An oops can be triggered in the scheduler when running qemu on arm64:

 Unable to handle kernel paging request at virtual address ffff000008effe40
 Internal error: Oops: 96000007 [#1] SMP
 Process migration/0 (pid: 12, stack limit = 0x00000000084e3736)
 pstate: 20000085 (nzCv daIf -PAN -UAO)
 pc : __ll_sc___cmpxchg_case_acq_4+0x4/0x20
 lr : move_queued_task.isra.21+0x124/0x298
 ...
 Call trace:
  __ll_sc___cmpxchg_case_acq_4+0x4/0x20
  __migrate_task+0xc8/0xe0
  migration_cpu_stop+0x170/0x180
  cpu_stopper_thread+0xec/0x178
  smpboot_thread_fn+0x1ac/0x1e8
  kthread+0x134/0x138
  ret_from_fork+0x10/0x18

__set_cpus_allowed_ptr() chooses an active dest_cpu in the affinity mask to
migrate the process to if the process is not currently running on any of the
CPUs specified in the affinity mask. It will choose an invalid dest_cpu
(dest_cpu >= nr_cpu_ids, 1024 in my virtual machine) if the CPUs in the
affinity mask are deactivated by cpu_down after the cpumask_intersects
check. The subsequent cpumask_test_cpu() of dest_cpu then reads out of bounds
and may pass if the corresponding bit is coincidentally set. As a consequence,
the kernel will access an invalid rq address associated with the invalid CPU in
migration_cpu_stop->__migrate_task->move_queued_task and the Oops occurs.

To reproduce the crash:

  1) A process repeatedly binds itself to cpu0 and cpu1 in turn by calling
  sched_setaffinity.

  2) A shell script repeatedly does "echo 0 > /sys/devices/system/cpu/cpu1/online"
  and "echo 1 > /sys/devices/system/cpu/cpu1/online" in turn.

  3) The Oops appears if the bit for the invalid CPU happens to be set in the
  memory that follows the tested cpumask.
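
The failure mode described above can be modeled in user space. The sketch
below is not the kernel patch: NR_CPU_IDS, any_cpu_in_both() and
pick_dest_cpu() are hypothetical stand-ins, and the masks are plain 64-bit
integers. It only illustrates that a search over an empty intersection yields
a value at or past the CPU count, and that this value has to be range-checked
before it is used as an index.

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical stand-in for nr_cpu_ids; the real value is set at boot. */
#define NR_CPU_IDS 4u

/*
 * Model of an "any CPU present in both masks" search: returns the lowest
 * CPU set in both masks, or NR_CPU_IDS when the intersection is empty.
 */
static unsigned int any_cpu_in_both(uint64_t active, uint64_t allowed)
{
	uint64_t both = active & allowed;
	unsigned int cpu;

	for (cpu = 0; cpu < NR_CPU_IDS; cpu++)
		if (both & (UINT64_C(1) << cpu))
			return cpu;
	return NR_CPU_IDS;
}

/*
 * Pick a migration target. Returns 0 and stores the CPU in *dest, or
 * -EINVAL when every allowed CPU went offline after an earlier check.
 */
static int pick_dest_cpu(uint64_t active, uint64_t allowed, unsigned int *dest)
{
	unsigned int cpu = any_cpu_in_both(active, allowed);

	if (cpu >= NR_CPU_IDS)	/* reject before cpu is used as an index */
		return -EINVAL;
	*dest = cpu;
	return 0;
}
```

In this model, a caller that treats -EINVAL as "retry or fail the affinity
change" never dereferences per-CPU state for an out-of-range CPU number.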

Signed-off-by: KeMeng Shi <shikemeng@huawei.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Valentin Schneider <valentin.schneider@arm.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: https://lkml.kernel.org/r/1568616808-16808-1-git-send-email-shikemeng@huawei.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-10-11 18:21:22 +02:00
Mathieu Desnoyers
6cb7aa1b4f sched/membarrier: Fix private expedited registration check
[ Upstream commit fc0d77387c ]

Fix a logic flaw in the way membarrier_register_private_expedited()
handles ready state checks for private expedited sync core and private
expedited registrations.

If a private expedited membarrier registration is first performed, and
then a private expedited sync_core registration is performed, the ready
state check will skip the second registration when it really should not.

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Chris Metcalf <cmetcalf@ezchip.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Eric W. Biederman <ebiederm@xmission.com>
Cc: Kirill Tkhai <tkhai@yandex.ru>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Paul E. McKenney <paulmck@linux.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Russell King - ARM Linux admin <linux@armlinux.org.uk>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: https://lkml.kernel.org/r/20190919173705.2181-2-mathieu.desnoyers@efficios.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-10-11 18:21:22 +02:00
Wanpeng Li
e409b81d9d Revert "locking/pvqspinlock: Don't wait if vCPU is preempted"
commit 89340d0935 upstream.

This patch reverts commit 75437bb304 (locking/pvqspinlock: Don't
wait if vCPU is preempted).  That commit caused a large performance
regression in over-subscription scenarios.

The test was run on a Xeon Skylake box, 2 sockets, 40 cores, 80 threads,
with three VMs of 80 vCPUs each.  The score of ebizzy -M is reduced from
13000-14000 records/s to 1700-1800 records/s:

Host                              Guest       Score

vanilla w/o kvm optimizations     upstream    1700-1800 records/s
vanilla w/o kvm optimizations     revert      13000-14000 records/s
vanilla w/ kvm optimizations      upstream    4500-5000 records/s
vanilla w/ kvm optimizations      revert      14000-15500 records/s

Exit from aggressive wait-early mechanism can result in premature yield
and extra scheduling latency.

Actually, only 6% of wait_early events are caused by vcpu_is_preempted()
being true.  However, when one vCPU voluntarily releases its vCPU, all
the subsequent waiters in the queue will do the same, and the cascading
effect leads to bad performance.

kvm optimizations:
[1] commit d73eb57b80 (KVM: Boost vCPUs that are delivering interrupts)
[2] commit 266e85a5ec (KVM: X86: Boost queue head vCPU to mitigate lock waiter preemption)

Tested-by: loobinliu@tencent.com
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Waiman Long <longman@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Cc: loobinliu@tencent.com
Cc: stable@vger.kernel.org
Fixes: 75437bb304 (locking/pvqspinlock: Don't wait if vCPU is preempted)
Signed-off-by: Wanpeng Li <wanpengli@tencent.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-10-11 18:21:06 +02:00
Li RongQing
06f250215b timer: Read jiffies once when forwarding base clk
commit e430d802d6 upstream.

The timer delayed for more than 3 seconds warning was triggered during
testing.

  Workqueue: events_unbound sched_tick_remote
  RIP: 0010:sched_tick_remote+0xee/0x100
  ...
  Call Trace:
   process_one_work+0x18c/0x3a0
   worker_thread+0x30/0x380
   kthread+0x113/0x130
   ret_from_fork+0x22/0x40

The reason is that the code in collect_expired_timers() uses jiffies
unprotected:

    if (next_event > jiffies)
        base->clk = jiffies;

As the compiler is allowed to reload jiffies, base->clk can advance
between the check and the store and, in the worst case, advance farther than
the next event. That causes the timer expiry to be delayed until the wheel
pointer wraps around.

Convert the code to use READ_ONCE().
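
A minimal user-space sketch of the single-read pattern follows. It is not
the upstream diff: the READ_ONCE() macro here is a volatile-cast stand-in,
struct timer_base_model and forward_clk() are hypothetical names, and the
else branch is an assumption added to make the function complete. It uses a
plain comparison as in the snippet above and therefore ignores counter
wraparound.

```c
/* User-space stand-in for the kernel's READ_ONCE(): force one load. */
#define READ_ONCE(x) (*(const volatile __typeof__(x) *)&(x))

/* Stand-in for the kernel's jiffies counter. */
static unsigned long jiffies;

/* Hypothetical model of the one field of the timer base that matters here. */
struct timer_base_model {
	unsigned long clk;
};

/*
 * Forward base->clk using one snapshot of jiffies, so the value that is
 * compared and the value that is stored are guaranteed to be identical.
 */
static void forward_clk(struct timer_base_model *base, unsigned long next_event)
{
	unsigned long now = READ_ONCE(jiffies);

	if (next_event > now)
		base->clk = now;
	else
		base->clk = next_event;	/* assumed: never pass the next event */
}
```

With the snapshot in a local variable, base->clk can never end up beyond
next_event, regardless of how often the counter changes during the call.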

Fixes: 236968383c ("timers: Optimize collect_expired_timers() for NOHZ")
Signed-off-by: Li RongQing <lirongqing@baidu.com>
Signed-off-by: Liang ZhiCheng <liangzhicheng@baidu.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/1568894687-14499-1-git-send-email-lirongqing@baidu.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-10-11 18:20:59 +02:00
Tom Zanussi
e010c98351 tracing: Make sure variable reference alias has correct var_ref_idx
commit 17f8607a16 upstream.

Original changelog from Steve Rostedt (except last sentence which
explains the problem, and the Fixes: tag):

I performed a three way histogram with the following commands:

echo 'irq_lat u64 lat pid_t pid' > synthetic_events
echo 'wake_lat u64 lat u64 irqlat pid_t pid' >> synthetic_events
echo 'hist:keys=common_pid:irqts=common_timestamp.usecs if function == 0xffffffff81200580' > events/timer/hrtimer_start/trigger
echo 'hist:keys=common_pid:lat=common_timestamp.usecs-$irqts:onmatch(timer.hrtimer_start).irq_lat($lat,pid) if common_flags & 1' > events/sched/sched_waking/trigger
echo 'hist:keys=pid:wakets=common_timestamp.usecs,irqlat=lat' > events/synthetic/irq_lat/trigger
echo 'hist:keys=next_pid:lat=common_timestamp.usecs-$wakets,irqlat=$irqlat:onmatch(synthetic.irq_lat).wake_lat($lat,$irqlat,next_pid)' > events/sched/sched_switch/trigger
echo 1 > events/synthetic/wake_lat/enable

Basically I wanted to see:

 hrtimer_start (calling function tick_sched_timer)

Note:

  # grep tick_sched_timer /proc/kallsyms
ffffffff81200580 t tick_sched_timer

I save the time of that, and then, if sched_waking is called in
interrupt context and with the same pid as the hrtimer_start, record
the latency between that and the waking event.

I then look at when the task that is woken is scheduled in, and record
the latency between the wakeup and the task running.

At the end, the wake_lat synthetic event will show the wakeup to
scheduled latency, as well as the irq latency from hrtimer_start to
the wakeup. The problem is that I found this:

          <idle>-0     [007] d...   190.485261: wake_lat: lat=27 irqlat=190485230 pid=698
          <idle>-0     [005] d...   190.485283: wake_lat: lat=40 irqlat=190485239 pid=10
          <idle>-0     [002] d...   190.488327: wake_lat: lat=56 irqlat=190488266 pid=335
          <idle>-0     [005] d...   190.489330: wake_lat: lat=64 irqlat=190489262 pid=10
          <idle>-0     [003] d...   190.490312: wake_lat: lat=43 irqlat=190490265 pid=77
          <idle>-0     [005] d...   190.493322: wake_lat: lat=54 irqlat=190493262 pid=10
          <idle>-0     [005] d...   190.497305: wake_lat: lat=35 irqlat=190497267 pid=10
          <idle>-0     [005] d...   190.501319: wake_lat: lat=50 irqlat=190501264 pid=10

The irqlat seemed quite large! Investigating this further, when I
enabled the irq_lat synthetic event, I noticed this:

          <idle>-0     [002] d.s.   249.429308: irq_lat: lat=164968 pid=335
          <idle>-0     [002] d...   249.429369: wake_lat: lat=55 irqlat=249429308 pid=335

Notice that the timestamp of the irq_lat "249.429308" is awfully
similar to the reported irqlat variable. In fact, all instances were
like this. It appeared that:

  irqlat=$irqlat

Wasn't assigning the old $irqlat to the new irqlat variable, but
instead was assigning the $irqts to it.

The issue is that assigning the old $irqlat to the new irqlat variable
creates a variable reference alias, but the alias creation code
forgets to make sure the alias uses the same var_ref_idx to access the
reference.

Link: http://lkml.kernel.org/r/1567375321.5282.12.camel@kernel.org

Cc: Linux Trace Devel <linux-trace-devel@vger.kernel.org>
Cc: linux-rt-users <linux-rt-users@vger.kernel.org>
Cc: stable@vger.kernel.org
Fixes: 7e8b88a30b ("tracing: Add hist trigger support for variable reference aliases")
Reported-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Signed-off-by: Tom Zanussi <zanussi@kernel.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-10-11 18:20:57 +02:00
Catalin Marinas
75443b7002 UPSTREAM: arm64: Tighten the PR_{SET, GET}_TAGGED_ADDR_CTRL prctl() unused arguments
(Upstream commit 3e91ec89f5).

Require that arg{3,4,5} of the PR_{SET,GET}_TAGGED_ADDR_CTRL prctl and
arg2 of the PR_GET_TAGGED_ADDR_CTRL prctl() are zero rather than ignored
for future extensions.
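
From user space, the rule above means every unused argument is passed as an
explicit zero. The sketch below assumes a Linux libc that provides
<sys/prctl.h>; the fallback constants mirror the kernel UAPI values, and the
wrapper names tagged_addr_enable() and tagged_addr_status() are hypothetical.
On kernels or architectures without this ABI the calls are expected to fail
with -1.

```c
#include <sys/prctl.h>

/* UAPI values, defined here in case the libc headers predate them. */
#ifndef PR_SET_TAGGED_ADDR_CTRL
#define PR_SET_TAGGED_ADDR_CTRL 55
#endif
#ifndef PR_GET_TAGGED_ADDR_CTRL
#define PR_GET_TAGGED_ADDR_CTRL 56
#endif
#ifndef PR_TAGGED_ADDR_ENABLE
#define PR_TAGGED_ADDR_ENABLE (1UL << 0)
#endif

/* Request the tagged address ABI; arg3..arg5 are unused and must be zero. */
static int tagged_addr_enable(void)
{
	return prctl(PR_SET_TAGGED_ADDR_CTRL, PR_TAGGED_ADDR_ENABLE,
		     0UL, 0UL, 0UL);
}

/* Query the current setting; arg2..arg5 are unused and must be zero. */
static int tagged_addr_status(void)
{
	return prctl(PR_GET_TAGGED_ADDR_CTRL, 0UL, 0UL, 0UL, 0UL);
}
```

Passing explicit zeros keeps a caller working if a later kernel assigns
meaning to one of the currently unused arguments.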

Acked-by: Andrey Konovalov <andreyknvl@google.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Will Deacon <will@kernel.org>
Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
Bug: 135692346
Change-Id: I8bb5c3eb4728440880c971d77904f7e45b571ddc
2019-10-07 15:27:39 -04:00
Catalin Marinas
f077ee2609 BACKPORT: arm64: Introduce prctl() options to control the tagged user addresses ABI
(Upstream commit 63f0c60379).

It is not desirable to relax the ABI to allow tagged user addresses into
the kernel indiscriminately. This patch introduces a prctl() interface
for enabling or disabling the tagged ABI with a global sysctl control
for preventing applications from enabling the relaxed ABI (meant for
testing user-space prctl() return error checking without reconfiguring
the kernel). The ABI properties are inherited by threads of the same
application and fork()'ed children but cleared on execve(). A Kconfig
option allows the overall disabling of the relaxed ABI.

The PR_SET_TAGGED_ADDR_CTRL will be expanded in the future to handle
MTE-specific settings like imprecise vs precise exceptions.

Reviewed-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
Signed-off-by: Will Deacon <will@kernel.org>
Change-Id: I2d52c5589b05415faab315c116245f1058d64750
Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
Bug: 135692346
2019-10-07 15:27:39 -04:00
Greg Kroah-Hartman
75337a6f96 This is the 4.19.78 stable release
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEZH8oZUiU471FcZm+ONu9yGCSaT4FAl2bbnkACgkQONu9yGCS
 aT6bng/+Jvj4gXLq2w+KmeN1SRbNu2ee+GjQsgQR6JZ3/dY5+rzPhuL37Op0fQd6
 UwnLhY4TL3PUiRCE8pNaVYI8nDfpxRkohYP+SMtGyoQKmoiy3W/SWe3CgEwniLwg
 k9TsuqxUsIUeEdSr6Bjbry0IU4VoZ3MP0cmMc1SrFUqJzFoGoUMHsHmQJvDPiy1f
 l7oZUgYrXArcnPhCda6peD9AJUfuIRKAM4BW47WN6Z9moqAAAa60eXN/u/hHp+Qc
 w55AZTSxel7CMbLDMnZ6/xWDgY/FHTLjkmhdIl9H6Qi8SbrJwq23zXnBO5xVameQ
 4MaLIgrp7M5sohAVFdqAVZYyZrkX91ssVujYwc+I6O1TYdja1Usj2TzdyL/MPqzY
 FLM9s3P0C3xKLdlg5gq9BxxnohIIhNmBy069NGsmfFd9jP2o6vFUEjiVxSY7Hp3r
 GpZP1ETAfMNHCG3jN3o5EkwyLoQHegFWfLpCIEF++k9gjo2CP00+Lf16RVrSsG47
 ILobFW5Wy4RXFyd7M8PrjSyuAtuzkUTzxg6P+6zuEtPwwvZPtadd97dGbA90pKvv
 eB1UPHu8/emMhW/8fwDBbpeQbIh0pHtX2yQq/LTItEHGr+YjRTIyH3Z/fST5itre
 ZDofsls4A+70TQ5/XOgjlDjco93iUs8KULDzQqvTFIuUIlvk3hQ=
 =7fm2
 -----END PGP SIGNATURE-----

Merge 4.19.78 into android-4.19

Changes in 4.19.78
	tpm: use tpm_try_get_ops() in tpm-sysfs.c.
	tpm: Fix TPM 1.2 Shutdown sequence to prevent future TPM operations
	drm/bridge: tc358767: Increase AUX transfer length limit
	drm/panel: simple: fix AUO g185han01 horizontal blanking
	video: ssd1307fb: Start page range at page_offset
	drm/stm: attach gem fence to atomic state
	drm/panel: check failure cases in the probe func
	drm/rockchip: Check for fast link training before enabling psr
	drm/radeon: Fix EEH during kexec
	gpu: drm: radeon: Fix a possible null-pointer dereference in radeon_connector_set_property()
	PCI: rpaphp: Avoid a sometimes-uninitialized warning
	ipmi_si: Only schedule continuously in the thread in maintenance mode
	clk: qoriq: Fix -Wunused-const-variable
	clk: sunxi-ng: v3s: add missing clock slices for MMC2 module clocks
	drm/amd/display: fix issue where 252-255 values are clipped
	drm/amd/display: reprogram VM config when system resume
	powerpc/powernv/ioda2: Allocate TCE table levels on demand for default DMA window
	clk: actions: Don't reference clk_init_data after registration
	clk: sirf: Don't reference clk_init_data after registration
	clk: sprd: Don't reference clk_init_data after registration
	clk: zx296718: Don't reference clk_init_data after registration
	powerpc/xmon: Check for HV mode when dumping XIVE info from OPAL
	powerpc/rtas: use device model APIs and serialization during LPM
	powerpc/futex: Fix warning: 'oldval' may be used uninitialized in this function
	powerpc/pseries/mobility: use cond_resched when updating device tree
	pinctrl: tegra: Fix write barrier placement in pmx_writel
	powerpc/eeh: Clear stale EEH_DEV_NO_HANDLER flag
	vfio_pci: Restore original state on release
	drm/nouveau/volt: Fix for some cards having 0 maximum voltage
	pinctrl: amd: disable spurious-firing GPIO IRQs
	clk: renesas: mstp: Set GENPD_FLAG_ALWAYS_ON for clock domain
	clk: renesas: cpg-mssr: Set GENPD_FLAG_ALWAYS_ON for clock domain
	drm/amd/display: support spdif
	drm/amdgpu/si: fix ASIC tests
	powerpc/64s/exception: machine check use correct cfar for late handler
	pstore: fs superblock limits
	clk: qcom: gcc-sdm845: Use floor ops for sdcc clks
	powerpc/pseries: correctly track irq state in default idle
	pinctrl: meson-gxbb: Fix wrong pinning definition for uart_c
	arm64: fix unreachable code issue with cmpxchg
	clk: at91: select parent if main oscillator or bypass is enabled
	powerpc: dump kernel log before carrying out fadump or kdump
	mbox: qcom: add APCS child device for QCS404
	clk: sprd: add missing kfree
	scsi: core: Reduce memory required for SCSI logging
	dma-buf/sw_sync: Synchronize signal vs syncpt free
	ext4: fix potential use after free after remounting with noblock_validity
	MIPS: Ingenic: Disable broken BTB lookup optimization.
	MIPS: tlbex: Explicitly cast _PAGE_NO_EXEC to a boolean
	i2c-cht-wc: Fix lockdep warning
	mfd: intel-lpss: Remove D3cold delay
	PCI: tegra: Fix OF node reference leak
	HID: wacom: Fix several minor compiler warnings
	livepatch: Nullify obj->mod in klp_module_coming()'s error path
	ARM: 8898/1: mm: Don't treat faults reported from cache maintenance as writes
	soundwire: intel: fix channel number reported by hardware
	ARM: 8875/1: Kconfig: default to AEABI w/ Clang
	rtc: snvs: fix possible race condition
	rtc: pcf85363/pcf85263: fix regmap error in set_time
	HID: apple: Fix stuck function keys when using FN
	PCI: rockchip: Propagate errors for optional regulators
	PCI: histb: Propagate errors for optional regulators
	PCI: imx6: Propagate errors for optional regulators
	PCI: exynos: Propagate errors for optional PHYs
	security: smack: Fix possible null-pointer dereferences in smack_socket_sock_rcv_skb()
	ARM: 8903/1: ensure that usable memory in bank 0 starts from a PMD-aligned address
	fat: work around race with userspace's read via blockdev while mounting
	pktcdvd: remove warning on attempting to register non-passthrough dev
	hypfs: Fix error number left in struct pointer member
	crypto: hisilicon - Fix double free in sec_free_hw_sgl()
	kbuild: clean compressed initramfs image
	ocfs2: wait for recovering done after direct unlock request
	kmemleak: increase DEBUG_KMEMLEAK_EARLY_LOG_SIZE default to 16K
	arm64: consider stack randomization for mmap base only when necessary
	mips: properly account for stack randomization and stack guard gap
	arm: properly account for stack randomization and stack guard gap
	arm: use STACK_TOP when computing mmap base address
	block: mq-deadline: Fix queue restart handling
	bpf: fix use after free in prog symbol exposure
	cxgb4:Fix out-of-bounds MSI-X info array access
	erspan: remove the incorrect mtu limit for erspan
	hso: fix NULL-deref on tty open
	ipv6: drop incoming packets having a v4mapped source address
	ipv6: Handle missing host route in __ipv6_ifa_notify
	net: ipv4: avoid mixed n_redirects and rate_tokens usage
	net: qlogic: Fix memory leak in ql_alloc_large_buffers
	net: Unpublish sk from sk_reuseport_cb before call_rcu
	nfc: fix memory leak in llcp_sock_bind()
	qmi_wwan: add support for Cinterion CLS8 devices
	rxrpc: Fix rxrpc_recvmsg tracepoint
	sch_dsmark: fix potential NULL deref in dsmark_init()
	udp: fix gso_segs calculations
	vsock: Fix a lockdep warning in __vsock_release()
	net: dsa: rtl8366: Check VLAN ID and not ports
	udp: only do GSO if # of segs > 1
	net/rds: Fix error handling in rds_ib_add_one()
	xen-netfront: do not use ~0U as error return value for xennet_fill_frags()
	tipc: fix unlimited bundling of small messages
	sch_cbq: validate TCA_CBQ_WRROPT to avoid crash
	soundwire: Kconfig: fix help format
	soundwire: fix regmap dependencies and align with other serial links
	Smack: Don't ignore other bprm->unsafe flags if LSM_UNSAFE_PTRACE is set
	smack: use GFP_NOFS while holding inode_smack::smk_lock
	NFC: fix attrs checks in netlink interface
	kexec: bail out upon SIGKILL when allocating memory.
	9p/cache.c: Fix memory leak in v9fs_cache_session_get_cookie
	Linux 4.19.78

Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
Change-Id: I02db9c4a0cc1f784e6ac1523599fcbed747a6375
2019-10-07 19:17:35 +02:00
Tetsuo Handa
d85bc11a68 kexec: bail out upon SIGKILL when allocating memory.
commit 7c3a6aedcd upstream.

syzbot found that a thread can stall for minutes inside kexec_load() after
that thread was killed by SIGKILL [1].  It turned out that the reproducer
was trying to allocate 2408MB of memory using kimage_alloc_page() from
kimage_load_normal_segment().  Let's check for SIGKILL before doing memory
allocation.

[1] https://syzkaller.appspot.com/bug?id=a0e3436829698d5824231251fad9d8e998f94f5e
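
The idea of checking for a fatal signal before each allocation can be
modeled in user space as follows. This is only a sketch under stated
assumptions: fatal_signal_pending_model stands in for the kernel's
fatal_signal_pending(current), and alloc_one_page() and load_segment() are
hypothetical helpers that count pages instead of allocating memory.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the kernel's fatal_signal_pending(current). */
static bool fatal_signal_pending_model;

/* Hypothetical allocator stub: counts calls instead of allocating. */
static size_t pages_allocated;

static bool alloc_one_page(void)
{
	pages_allocated++;
	return true;
}

/*
 * Load nr_pages pages, checking for a pending fatal signal before each
 * allocation. Returns the number of pages that were loaded.
 */
static size_t load_segment(size_t nr_pages)
{
	size_t done;

	for (done = 0; done < nr_pages; done++) {
		if (fatal_signal_pending_model)
			break;	/* bail out instead of allocating further */
		if (!alloc_one_page())
			break;
	}
	return done;
}
```

Because the check sits inside the loop, a killed task stops after at most
one further allocation rather than working through the whole request.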

Link: http://lkml.kernel.org/r/993c9185-d324-2640-d061-bed2dd18b1f7@I-love.SAKURA.ne.jp
Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Reported-by: syzbot <syzbot+8ab2d0f39fb79fe6ca40@syzkaller.appspotmail.com>
Cc: Eric Biederman <ebiederm@xmission.com>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-10-07 18:57:28 +02:00
Daniel Borkmann
ed568ca736 bpf: fix use after free in prog symbol exposure
commit c751798aa2 upstream.

syzkaller managed to trigger the warning in bpf_jit_free() which checks via
bpf_prog_kallsyms_verify_off() for potentially unlinked JITed BPF progs
in kallsyms, and subsequently trips over GPF when walking kallsyms entries:

  [...]
  8021q: adding VLAN 0 to HW filter on device batadv0
  8021q: adding VLAN 0 to HW filter on device batadv0
  WARNING: CPU: 0 PID: 9869 at kernel/bpf/core.c:810 bpf_jit_free+0x1e8/0x2a0
  Kernel panic - not syncing: panic_on_warn set ...
  CPU: 0 PID: 9869 Comm: kworker/0:7 Not tainted 5.0.0-rc8+ #1
  Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
  Workqueue: events bpf_prog_free_deferred
  Call Trace:
   __dump_stack lib/dump_stack.c:77 [inline]
   dump_stack+0x113/0x167 lib/dump_stack.c:113
   panic+0x212/0x40b kernel/panic.c:214
   __warn.cold.8+0x1b/0x38 kernel/panic.c:571
   report_bug+0x1a4/0x200 lib/bug.c:186
   fixup_bug arch/x86/kernel/traps.c:178 [inline]
   do_error_trap+0x11b/0x200 arch/x86/kernel/traps.c:271
   do_invalid_op+0x36/0x40 arch/x86/kernel/traps.c:290
   invalid_op+0x14/0x20 arch/x86/entry/entry_64.S:973
  RIP: 0010:bpf_jit_free+0x1e8/0x2a0
  Code: 02 4c 89 e2 83 e2 07 38 d0 7f 08 84 c0 0f 85 86 00 00 00 48 ba 00 02 00 00 00 00 ad de 0f b6 43 02 49 39 d6 0f 84 5f fe ff ff <0f> 0b e9 58 fe ff ff 48 b8 00 00 00 00 00 fc ff df 4c 89 e2 48 c1
  RSP: 0018:ffff888092f67cd8 EFLAGS: 00010202
  RAX: 0000000000000007 RBX: ffffc90001947000 RCX: ffffffff816e9d88
  RDX: dead000000000200 RSI: 0000000000000008 RDI: ffff88808769f7f0
  RBP: ffff888092f67d00 R08: fffffbfff1394059 R09: fffffbfff1394058
  R10: fffffbfff1394058 R11: ffffffff89ca02c7 R12: ffffc90001947002
  R13: ffffc90001947020 R14: ffffffff881eca80 R15: ffff88808769f7e8
  BUG: unable to handle kernel paging request at fffffbfff400d000
  #PF error: [normal kernel read fault]
  PGD 21ffee067 P4D 21ffee067 PUD 21ffed067 PMD 9f942067 PTE 0
  Oops: 0000 [#1] PREEMPT SMP KASAN
  CPU: 0 PID: 9869 Comm: kworker/0:7 Not tainted 5.0.0-rc8+ #1
  Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
  Workqueue: events bpf_prog_free_deferred
  RIP: 0010:bpf_get_prog_addr_region kernel/bpf/core.c:495 [inline]
  RIP: 0010:bpf_tree_comp kernel/bpf/core.c:558 [inline]
  RIP: 0010:__lt_find include/linux/rbtree_latch.h:115 [inline]
  RIP: 0010:latch_tree_find include/linux/rbtree_latch.h:208 [inline]
  RIP: 0010:bpf_prog_kallsyms_find+0x107/0x2e0 kernel/bpf/core.c:632
  Code: 00 f0 ff ff 44 38 c8 7f 08 84 c0 0f 85 fa 00 00 00 41 f6 45 02 01 75 02 0f 0b 48 39 da 0f 82 92 00 00 00 48 89 d8 48 c1 e8 03 <42> 0f b6 04 30 84 c0 74 08 3c 03 0f 8e 45 01 00 00 8b 03 48 c1 e0
  [...]

Upon further debugging, it turns out that whenever we trigger this
issue, the kallsyms removal in bpf_prog_ksym_node_del() was /skipped/
but yet bpf_jit_free() reported that the entry is /in use/.

The problem is that symbol exposure via bpf_prog_kallsyms_add(), and also
perf_event_bpf_event(), was done /after/ bpf_prog_new_fd(). Once the
fd was exposed to the public, a parallel close request came in right
before we attempted to do the bpf_prog_kallsyms_add().

Given that at this time the prog reference count is one, we start to rip
everything underneath us via bpf_prog_release() -> bpf_prog_put().
The memory is eventually released via deferred free, so we're seeing
that bpf_jit_free() has a kallsym entry because we added it from
bpf_prog_load() but /after/ bpf_prog_put() from the remote CPU.

Therefore, move both notifications /before/ we install the fd. The
issue was never seen between bpf_prog_alloc_id() and bpf_prog_new_fd()
because upon bpf_prog_get_fd_by_id() we'll take another reference to
the BPF prog, so we're still holding the original reference from the
bpf_prog_load().

Fixes: 6ee52e2a3f ("perf, bpf: Introduce PERF_RECORD_BPF_EVENT")
Fixes: 74451e66d5 ("bpf: make jited programs visible in traces")
Reported-by: syzbot+bd3bba6ff3fcea7a6ec6@syzkaller.appspotmail.com
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Cc: Song Liu <songliubraving@fb.com>
Signed-off-by: Zubin Mithra <zsm@chromium.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-10-07 18:57:19 +02:00
Miroslav Benes
0f0ced702d livepatch: Nullify obj->mod in klp_module_coming()'s error path
[ Upstream commit 4ff96fb52c ]

klp_module_coming() is called for every module appearing in the system.
It sets obj->mod to a patched module for klp_object obj. Unfortunately
it leaves it set even if an error happens later in the function and the
patched module is not allowed to be loaded.

klp_is_object_loaded() uses obj->mod variable and could currently give a
wrong return value. The bug is probably harmless as of now.
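
The error-path cleanup described above can be illustrated with a small
user-space model. The struct and function names below (module_model,
klp_object_model, module_coming(), is_object_loaded()) are hypothetical and
only mirror the shape of the problem; this is not the livepatch code.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for struct module and struct klp_object. */
struct module_model {
	const char *name;
};

struct klp_object_model {
	struct module_model *mod;
};

/*
 * Model of the "module coming" hook: record the module, then undo that
 * if a later step fails. init_err simulates the result of the later step.
 */
static int module_coming(struct klp_object_model *obj,
			 struct module_model *mod, int init_err)
{
	obj->mod = mod;
	if (init_err) {
		obj->mod = NULL;	/* do not leave a stale pointer behind */
		return init_err;
	}
	return 0;
}

/* Model of a "loaded" query that relies solely on obj->mod. */
static bool is_object_loaded(const struct klp_object_model *obj)
{
	return obj->mod != NULL;
}
```

Without the reset in the error branch, is_object_loaded() in this model
would report true for a module that was refused.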

Signed-off-by: Miroslav Benes <mbenes@suse.cz>
Reviewed-by: Petr Mladek <pmladek@suse.com>
Acked-by: Josh Poimboeuf <jpoimboe@redhat.com>
Signed-off-by: Petr Mladek <pmladek@suse.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-10-07 18:57:10 +02:00
Greg Kroah-Hartman
7f1f24fed2 This is the 4.19.77 stable release
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEZH8oZUiU471FcZm+ONu9yGCSaT4FAl2YehoACgkQONu9yGCS
 aT536hAA0w6XF1ogPi23xS5BQ386c3BjWaOJddu0PFkR9utguuZaDG5AKjnkHtm2
 foxKeCGyChCc1h6qFD2wLavsQMephzLWXkMw1/uXfwAPXGWY1hquD46zrFO0IYOc
 dsmKh5V8ZT70lmPTBHCDVowp1xUczNFhuNRHz1KzVQ3lsKMAhsRiy6vpsXWzIeOM
 VCrayI4jxTVZvoA9Odm9fTXpau0Qb9k4pqGGU84dz/xWfGzrz/AOmp+kWuZRGepS
 fQi+46HK3idmGGNBVGkJYkDdpkefdrYdjlVHkkoAZHlaB+QaOrre0trNArK+knRu
 jJIl8R/M50yYbkn97vVIpwoh19rV0BI7KUFKO4C4NCByOOV+9vjJ5g1FuZ7c3pmj
 XKZiBmFBJVqbI1uwgqEn+/Tv6DyEz4gUzo5GHOe2i/Bh2nSguHZzO+Yhat1U+J+Y
 3QVfrIS10jICWBAqm147FepGtNxbCPP3plyqbJdrMwgM6GOMd83v+rY/5okB2YK4
 SHhzuevQoxujjeoOgHKjIqTiCIaJMt5ZvlCFqODg8NgpjTNDAbCA/UUQWs7MlrqX
 6uwH2QLwk3h+bEXUfGzsbsk5isTDpHr1ch6SI9T675JHRZCE461RoR82XpjLOyHD
 90tBJk9pfKlLqSEyiaWPPpXErhoS5WCVKM5yuT/rSS/i2Ve4VH4=
 =Q+c9
 -----END PGP SIGNATURE-----

Merge 4.19.77 into android-4.19

Changes in 4.19.77
	arcnet: provide a buffer big enough to actually receive packets
	cdc_ncm: fix divide-by-zero caused by invalid wMaxPacketSize
	macsec: drop skb sk before calling gro_cells_receive
	net/phy: fix DP83865 10 Mbps HDX loopback disable function
	net: qrtr: Stop rx_worker before freeing node
	net/sched: act_sample: don't push mac header on ip6gre ingress
	net_sched: add max len check for TCA_KIND
	nfp: flower: fix memory leak in nfp_flower_spawn_vnic_reprs
	openvswitch: change type of UPCALL_PID attribute to NLA_UNSPEC
	ppp: Fix memory leak in ppp_write
	sch_netem: fix a divide by zero in tabledist()
	skge: fix checksum byte order
	usbnet: ignore endpoints with invalid wMaxPacketSize
	usbnet: sanity checking of packet sizes and device mtu
	net: sched: fix possible crash in tcf_action_destroy()
	tcp: better handle TCP_USER_TIMEOUT in SYN_SENT state
	net/mlx5: Add device ID of upcoming BlueField-2
	mISDN: enforce CAP_NET_RAW for raw sockets
	appletalk: enforce CAP_NET_RAW for raw sockets
	ax25: enforce CAP_NET_RAW for raw sockets
	ieee802154: enforce CAP_NET_RAW for raw sockets
	nfc: enforce CAP_NET_RAW for raw sockets
	nfp: flower: prevent memory leak in nfp_flower_spawn_phy_reprs
	ALSA: hda: Flush interrupts on disabling
	regulator: lm363x: Fix off-by-one n_voltages for lm3632 ldo_vpos/ldo_vneg
	ASoC: tlv320aic31xx: suppress error message for EPROBE_DEFER
	ASoC: sgtl5000: Fix of unmute outputs on probe
	ASoC: sgtl5000: Fix charge pump source assignment
	firmware: qcom_scm: Use proper types for dma mappings
	dmaengine: bcm2835: Print error in case setting DMA mask fails
	leds: leds-lp5562 allow firmware files up to the maximum length
	media: dib0700: fix link error for dibx000_i2c_set_speed
	media: mtk-cir: lower de-glitch counter for rc-mm protocol
	media: exynos4-is: fix leaked of_node references
	media: hdpvr: Add device num check and handling
	media: i2c: ov5640: Check for devm_gpiod_get_optional() error
	time/tick-broadcast: Fix tick_broadcast_offline() lockdep complaint
	sched/fair: Fix imbalance due to CPU affinity
	sched/core: Fix CPU controller for !RT_GROUP_SCHED
	x86/apic: Make apic_pending_intr_clear() more robust
	sched/deadline: Fix bandwidth accounting at all levels after offline migration
	x86/reboot: Always use NMI fallback when shutdown via reboot vector IPI fails
	x86/apic: Soft disable APIC before initializing it
	ALSA: hda - Show the fatal CORB/RIRB error more clearly
	ALSA: i2c: ak4xxx-adda: Fix a possible null pointer dereference in build_adc_controls()
	EDAC/mc: Fix grain_bits calculation
	media: iguanair: add sanity checks
	base: soc: Export soc_device_register/unregister APIs
	ALSA: usb-audio: Skip bSynchAddress endpoint check if it is invalid
	ia64:unwind: fix double free for mod->arch.init_unw_table
	EDAC/altera: Use the proper type for the IRQ status bits
	ASoC: rsnd: don't call clk_get_rate() under atomic context
	arm64/prefetch: fix a -Wtype-limits warning
	md/raid1: end bio when the device faulty
	md: don't call spare_active in md_reap_sync_thread if all member devices can't work
	md: don't set In_sync if array is frozen
	media: media/platform: fsl-viu.c: fix build for MICROBLAZE
	ACPI / processor: don't print errors for processorIDs == 0xff
	loop: Add LOOP_SET_DIRECT_IO to compat ioctl
	EDAC, pnd2: Fix ioremap() size in dnv_rd_reg()
	efi: cper: print AER info of PCIe fatal error
	firmware: arm_scmi: Check if platform has released shmem before using
	sched/fair: Use rq_lock/unlock in online_fair_sched_group
	idle: Prevent late-arriving interrupts from disrupting offline
	media: gspca: zero usb_buf on error
	perf config: Honour $PERF_CONFIG env var to specify alternate .perfconfig
	perf test vfs_getname: Disable ~/.perfconfig to get default output
	media: mtk-mdp: fix reference count on old device tree
	media: fdp1: Reduce FCP not found message level to debug
	media: em28xx: modules workqueue not inited for 2nd device
	media: rc: imon: Allow iMON RC protocol for ffdc 7e device
	dmaengine: iop-adma: use correct printk format strings
	perf record: Support aarch64 random socket_id assignment
	media: vsp1: fix memory leak of dl on error return path
	media: i2c: ov5645: Fix power sequence
	media: omap3isp: Don't set streaming state on random subdevs
	media: imx: mipi csi-2: Don't fail if initial state times-out
	net: lpc-enet: fix printk format strings
	m68k: Prevent some compiler warnings in Coldfire builds
	ARM: dts: imx7d: cl-som-imx7: make ethernet work again
	ARM: dts: imx7-colibri: disable HS400
	media: radio/si470x: kill urb on error
	media: hdpvr: add terminating 0 at end of string
	ASoC: uniphier: Fix double reset assersion when transitioning to suspend state
	tools headers: Fixup bitsperlong per arch includes
	ASoC: sun4i-i2s: Don't use the oversample to calculate BCLK
	led: triggers: Fix a memory leak bug
	nbd: add missing config put
	media: mceusb: fix (eliminate) TX IR signal length limit
	media: dvb-frontends: use ida for pll number
	posix-cpu-timers: Sanitize bogus WARNONS
	media: dvb-core: fix a memory leak bug
	libperf: Fix alignment trap with xyarray contents in 'perf stat'
	EDAC/amd64: Recognize DRAM device type ECC capability
	EDAC/amd64: Decode syndrome before translating address
	PM / devfreq: passive: Use non-devm notifiers
	PM / devfreq: exynos-bus: Correct clock enable sequence
	media: cec-notifier: clear cec_adap in cec_notifier_unregister
	media: saa7146: add cleanup in hexium_attach()
	media: cpia2_usb: fix memory leaks
	media: saa7134: fix terminology around saa7134_i2c_eeprom_md7134_gate()
	perf trace beauty ioctl: Fix off-by-one error in cmd->string table
	media: ov9650: add a sanity check
	ASoC: es8316: fix headphone mixer volume table
	ACPI / CPPC: do not require the _PSD method
	sched/cpufreq: Align trace event behavior of fast switching
	x86/apic/vector: Warn when vector space exhaustion breaks affinity
	arm64: kpti: ensure patched kernel text is fetched from PoU
	x86/mm/pti: Do not invoke PTI functions when PTI is disabled
	ASoC: fsl_ssi: Fix clock control issue in master mode
	x86/mm/pti: Handle unaligned address gracefully in pti_clone_pagetable()
	nvmet: fix data units read and written counters in SMART log
	nvme-multipath: fix ana log nsid lookup when nsid is not found
	ALSA: firewire-motu: add support for MOTU 4pre
	iommu/amd: Silence warnings under memory pressure
	libata/ahci: Drop PCS quirk for Denverton and beyond
	iommu/iova: Avoid false sharing on fq_timer_on
	libtraceevent: Change users plugin directory
	ARM: dts: exynos: Mark LDO10 as always-on on Peach Pit/Pi Chromebooks
	ACPI: custom_method: fix memory leaks
	ACPI / PCI: fix acpi_pci_irq_enable() memory leak
	closures: fix a race on wakeup from closure_sync
	hwmon: (acpi_power_meter) Change log level for 'unsafe software power cap'
	md/raid1: fail run raid1 array when active disk less than one
	dmaengine: ti: edma: Do not reset reserved paRAM slots
	kprobes: Prohibit probing on BUG() and WARN() address
	s390/crypto: xts-aes-s390 fix extra run-time crypto self tests finding
	x86/cpu: Add Tiger Lake to Intel family
	platform/x86: intel_pmc_core: Do not ioremap RAM
	ASoC: dmaengine: Make the pcm->name equal to pcm->id if the name is not set
	raid5: don't set STRIPE_HANDLE to stripe which is in batch list
	mmc: core: Clarify sdio_irq_pending flag for MMC_CAP2_SDIO_IRQ_NOTHREAD
	mmc: sdhci: Fix incorrect switch to HS mode
	mmc: core: Add helper function to indicate if SDIO IRQs is enabled
	mmc: dw_mmc: Re-store SDIO IRQs mask at system resume
	raid5: don't increment read_errors on EILSEQ return
	libertas: Add missing sentinel at end of if_usb.c fw_table
	e1000e: add workaround for possible stalled packet
	ALSA: hda - Drop unsol event handler for Intel HDMI codecs
	drm/amd/powerplay/smu7: enforce minimal VBITimeout (v2)
	media: ttusb-dec: Fix info-leak in ttusb_dec_send_command()
	ALSA: hda/realtek - Blacklist PC beep for Lenovo ThinkCentre M73/93
	iommu/amd: Override wrong IVRS IOAPIC on Raven Ridge systems
	btrfs: extent-tree: Make sure we only allocate extents from block groups with the same type
	media: omap3isp: Set device on omap3isp subdevs
	PM / devfreq: passive: fix compiler warning
	iwlwifi: fw: don't send GEO_TX_POWER_LIMIT command to FW version 36
	ALSA: firewire-tascam: handle error code when getting current source of clock
	ALSA: firewire-tascam: check intermediate state of clock status and retry
	scsi: scsi_dh_rdac: zero cdb in send_mode_select()
	scsi: qla2xxx: Fix Relogin to prevent modifying scan_state flag
	printk: Do not lose last line in kmsg buffer dump
	IB/mlx5: Free mpi in mp_slave mode
	IB/hfi1: Define variables as unsigned long to fix KASAN warning
	randstruct: Check member structs in is_pure_ops_struct()
	Revert "ceph: use ceph_evict_inode to cleanup inode's resource"
	ceph: use ceph_evict_inode to cleanup inode's resource
	ALSA: hda/realtek - PCI quirk for Medion E4254
	blk-mq: add callback of .cleanup_rq
	scsi: implement .cleanup_rq callback
	powerpc/imc: Dont create debugfs files for cpu-less nodes
	fuse: fix missing unlock_page in fuse_writepage()
	parisc: Disable HP HSC-PCI Cards to prevent kernel crash
	KVM: x86: always stop emulation on page fault
	KVM: x86: set ctxt->have_exception in x86_decode_insn()
	KVM: x86: Manually calculate reserved bits when loading PDPTRS
	media: sn9c20x: Add MSI MS-1039 laptop to flip_dmi_table
	media: don't drop front-end reference count for ->detach
	binfmt_elf: Do not move brk for INTERP-less ET_EXEC
	ASoC: Intel: NHLT: Fix debug print format
	ASoC: Intel: Skylake: Use correct function to access iomem space
	ASoC: Intel: Fix use of potentially uninitialized variable
	ARM: samsung: Fix system restart on S3C6410
	ARM: zynq: Use memcpy_toio instead of memcpy on smp bring-up
	Revert "arm64: Remove unnecessary ISBs from set_{pte,pmd,pud}"
	arm64: tlb: Ensure we execute an ISB following walk cache invalidation
	arm64: dts: rockchip: limit clock rate of MMC controllers for RK3328
	alarmtimer: Use EOPNOTSUPP instead of ENOTSUPP
	regulator: Defer init completion for a while after late_initcall
	efifb: BGRT: Improve efifb_bgrt_sanity_check
	gfs2: clear buf_in_tr when ending a transaction in sweep_bh_for_rgrps
	memcg, oom: don't require __GFP_FS when invoking memcg OOM killer
	memcg, kmem: do not fail __GFP_NOFAIL charges
	i40e: check __I40E_VF_DISABLE bit in i40e_sync_filters_subtask
	block: fix null pointer dereference in blk_mq_rq_timed_out()
	smb3: allow disabling requesting leases
	ovl: Fix dereferencing possible ERR_PTR()
	ovl: filter of trusted xattr results in audit
	btrfs: fix allocation of free space cache v1 bitmap pages
	Btrfs: fix use-after-free when using the tree modification log
	btrfs: Relinquish CPUs in btrfs_compare_trees
	btrfs: qgroup: Fix the wrong target io_tree when freeing reserved data space
	btrfs: qgroup: Fix reserved data space leak if we have multiple reserve calls
	Btrfs: fix race setting up and completing qgroup rescan workers
	md/raid6: Set R5_ReadError when there is read failure on parity disk
	md: don't report active array_state until after revalidate_disk() completes.
	md: only call set_in_sync() when it is expected to succeed.
	cfg80211: Purge frame registrations on iftype change
	/dev/mem: Bail out upon SIGKILL.
	ext4: fix warning inside ext4_convert_unwritten_extents_endio
	ext4: fix punch hole for inline_data file systems
	quota: fix wrong condition in is_quota_modification()
	hwrng: core - don't wait on add_early_randomness()
	i2c: riic: Clear NACK in tend isr
	CIFS: fix max ea value size
	CIFS: Fix oplock handling for SMB 2.1+ protocols
	md/raid0: avoid RAID0 data corruption due to layout confusion.
	fuse: fix deadlock with aio poll and fuse_iqueue::waitq.lock
	mm/compaction.c: clear total_{migrate,free}_scanned before scanning a new zone
	drm/amd/display: Restore backlight brightness after system resume
	Linux 4.19.77

Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
Change-Id: I2c74f09f497a4b45b244a7dd263c5116533dfccb
2019-10-06 11:27:45 +02:00
Thadeu Lima de Souza Cascardo
3784576fc6 alarmtimer: Use EOPNOTSUPP instead of ENOTSUPP
commit f18ddc13af upstream.

ENOTSUPP is not supposed to be returned to userspace. This was found on an
OpenPower machine, where the RTC does not support set_alarm.

On that system, a clock_nanosleep(CLOCK_REALTIME_ALARM, ...) results in
"524 Unknown error 524"

Replace it with EOPNOTSUPP which results in the expected "95 Operation not
supported" error.
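
A minimal user-space sketch of the point above, not part of the patch: error number 524 is a kernel-internal value with no entry in the userspace errno tables, which is why it surfaces as an unknown error. The constant name `ENOTSUPP_KERNEL` and the helper `describe()` are invented for this illustration.

```python
import errno
import os

# Kernel-internal ENOTSUPP; assumed here to be 524, as quoted in the message above.
ENOTSUPP_KERNEL = 524


def describe(code: int) -> str:
    """Format an errno value the way a userspace program would see it."""
    name = errno.errorcode.get(code, "<no userspace name>")
    return f"{code} {name}: {os.strerror(code)}"


# The kernel-internal value has no symbolic name in userspace.
print(describe(ENOTSUPP_KERNEL))
# EOPNOTSUPP is a regular errno that the C library knows how to describe.
print(describe(errno.EOPNOTSUPP))
```

The exact message strings come from the C library, so they can differ between systems; on Linux the second line should correspond to the "Operation not supported" text quoted above.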

Fixes: 1c6b39ad3f (alarmtimers: Return -ENOTSUPP if no RTC device is present)
Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/20190903171802.28314-1-cascardo@canonical.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-10-05 13:10:07 +02:00
Vincent Whitchurch
40b071992c printk: Do not lose last line in kmsg buffer dump
commit c9dccacfcc upstream.

kmsg_dump_get_buffer() is supposed to select all the youngest log
messages which fit into the provided buffer.  It determines the correct
start index by using msg_print_text() with a NULL buffer to calculate
the size of each entry.  However, when performing the actual writes,
msg_print_text() only writes the entry to the buffer if the written len
is less than the size of the buffer.  So if the lengths of the
selected youngest log messages happen to precisely fill up the provided
buffer, the last log message is not included.

We don't want to modify msg_print_text() to fill up the buffer and start
returning a length which is equal to the size of the buffer, since
callers of its other users, such as kmsg_dump_get_line(), depend upon
the current behaviour.

Instead, fix kmsg_dump_get_buffer() to compensate for this.

For example, with the following two final prints:

[    6.427502] AAAAAAAAAAAAA
[    6.427769] BBBBBBBB12345

A dump of a 64-byte buffer filled by kmsg_dump_get_buffer(), before this
patch:

 00000000: 3c 30 3e 5b 20 20 20 20 36 2e 35 32 32 31 39 37  <0>[    6.522197
 00000010: 5d 20 41 41 41 41 41 41 41 41 41 41 41 41 41 0a  ] AAAAAAAAAAAAA.
 00000020: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
 00000030: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................

After this patch:

 00000000: 3c 30 3e 5b 20 20 20 20 36 2e 34 35 36 36 37 38  <0>[    6.456678
 00000010: 5d 20 42 42 42 42 42 42 42 42 31 32 33 34 35 0a  ] BBBBBBBB12345.
 00000020: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
 00000030: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
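
The mismatch between the sizing pass and the copy pass can be modelled in a few lines. This is a simplified sketch, not the kernel code: `select_and_copy()` and its `fixed` flag are hypothetical, and it assumes the compensation amounts to making the selection pass use `>=` where it previously used `>`.

```python
def make_line(timestamp: str, text: str) -> bytes:
    """Build one log line in the '<0>[ timestamp] text' layout of the dumps above."""
    return ("<0>[%12s] %s\n" % (timestamp, text)).encode()


LINES = [
    make_line("6.427502", "A" * 13),
    make_line("6.427769", "BBBBBBBB12345"),
]


def select_and_copy(lines, size, fixed):
    """Pick the youngest lines that 'fit' in size bytes, then copy them."""
    start = 0
    total = sum(len(line) for line in lines)
    # Pass 1: drop the oldest lines until the rest is considered to fit.
    while start < len(lines) and (total >= size if fixed else total > size):
        total -= len(lines[start])
        start += 1
    # Pass 2: the copy accepts a line only while the running length stays
    # strictly below the buffer size (the msg_print_text() behaviour).
    out = b""
    for line in lines[start:]:
        if len(out) + len(line) >= size:
            break
        out += line
    return out


before = select_and_copy(LINES, 64, fixed=False)
after = select_and_copy(LINES, 64, fixed=True)
print(before)
print(after)
```

Each modelled line is 32 bytes, so the two together exactly fill the 64-byte buffer. Without the compensation the copy pass stops after the older line; with it, the youngest line is the one that survives.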

Link: http://lkml.kernel.org/r/20190711142937.4083-1-vincent.whitchurch@axis.com
Fixes: e2ae715d66 ("kmsg - kmsg_dump() use iterator to receive log buffer content")
To: rostedt@goodmis.org
Cc: linux-kernel@vger.kernel.org
Cc: <stable@vger.kernel.org> # v3.5+
Signed-off-by: Vincent Whitchurch <vincent.whitchurch@axis.com>
Reviewed-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Signed-off-by: Petr Mladek <pmladek@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-10-05 13:10:01 +02:00
Masami Hiramatsu
fad90d4bfa kprobes: Prohibit probing on BUG() and WARN() address
[ Upstream commit e336b40277 ]

Since BUG() and WARN() may use a trap (e.g. UD2 on x86) to
get the address where the BUG() has occurred, kprobes cannot
single-step that instruction out of line. So prohibit
probing on such addresses.

Without this fix, if someone puts a kprobe on WARN(), the
kernel will crash with an invalid opcode error instead of
outputting a warning message, because the kernel cannot find
the correct bug address.

Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Acked-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Acked-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Cc: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com>
Cc: David S . Miller <davem@davemloft.net>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Naveen N . Rao <naveen.n.rao@linux.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: https://lkml.kernel.org/r/156750890133.19112.3393666300746167111.stgit@devnote2
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-10-05 13:09:54 +02:00
Douglas RAILLARD
01e8f487ce sched/cpufreq: Align trace event behavior of fast switching
[ Upstream commit 77c84dd188 ]

Fast switching path only emits an event for the CPU of interest, whereas the
regular path emits an event for all the CPUs that had their frequency changed,
i.e. all the CPUs sharing the same policy.

With the current behavior, looking at cpu_frequency event for a given CPU that
is using the fast switching path will not give the correct frequency signal.
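
A toy model of the difference, assuming the alignment means the fast path emits one event per CPU in the policy, as the regular path does. The function `cpu_frequency_events()` and its `per_policy` flag are hypothetical names for this sketch.

```python
def cpu_frequency_events(policy_cpus, cpu_of_interest, freq_khz, per_policy):
    """Return the (event, freq, cpu) tuples one frequency change would emit."""
    cpus = policy_cpus if per_policy else [cpu_of_interest]
    return [("cpu_frequency", freq_khz, cpu) for cpu in cpus]


policy = [0, 1, 2, 3]  # CPUs sharing one frequency policy

# Fast path as described above: only the CPU of interest gets an event.
old_fast_path = cpu_frequency_events(policy, 2, 1_800_000, per_policy=False)
# Regular path (and the aligned fast path): one event per CPU in the policy.
regular_path = cpu_frequency_events(policy, 2, 1_800_000, per_policy=True)

# CPUs whose trace would miss this frequency change on the old fast path.
missing = sorted({e[2] for e in regular_path} - {e[2] for e in old_fast_path})
print(missing)
```

A consumer that filters the trace for CPU 0, 1 or 3 would never see the change made on behalf of CPU 2, even though all four CPUs now run at the new frequency.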

Signed-off-by: Douglas RAILLARD <douglas.raillard@arm.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-10-05 13:09:51 +02:00
Thomas Gleixner
8d5fccff7b posix-cpu-timers: Sanitize bogus WARNONS
[ Upstream commit 692117c1f7 ]

Warning when p == NULL and then proceeding and dereferencing p does not
make any sense as the kernel will crash with a NULL pointer dereference
right away.

Bailing out when p == NULL and returning an error code does not cure the
underlying problem which caused p to be NULL, though it might allow
proper debugging.

Same applies to the clock id check in set_process_cpu_timer().

Clean them up and make them return without trying to do further damage.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Frederic Weisbecker <frederic@kernel.org>
Link: https://lkml.kernel.org/r/20190819143801.846497772@linutronix.de
Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-10-05 13:09:47 +02:00
Peter Zijlstra
5111102360 idle: Prevent late-arriving interrupts from disrupting offline
[ Upstream commit e78a7614f3 ]

Scheduling-clock interrupts can arrive late in the CPU-offline process,
after idle entry and the subsequent call to cpuhp_report_idle_dead().
Once execution passes the call to rcu_report_dead(), RCU is ignoring
the CPU, which results in lockdep complaints when the interrupt handler
uses RCU:

------------------------------------------------------------------------

=============================
WARNING: suspicious RCU usage
5.2.0-rc1+ #681 Not tainted
-----------------------------
kernel/sched/fair.c:9542 suspicious rcu_dereference_check() usage!

other info that might help us debug this:

RCU used illegally from offline CPU!
rcu_scheduler_active = 2, debug_locks = 1
no locks held by swapper/5/0.

stack backtrace:
CPU: 5 PID: 0 Comm: swapper/5 Not tainted 5.2.0-rc1+ #681
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS Bochs 01/01/2011
Call Trace:
 <IRQ>
 dump_stack+0x5e/0x8b
 trigger_load_balance+0xa8/0x390
 ? tick_sched_do_timer+0x60/0x60
 update_process_times+0x3b/0x50
 tick_sched_handle+0x2f/0x40
 tick_sched_timer+0x32/0x70
 __hrtimer_run_queues+0xd3/0x3b0
 hrtimer_interrupt+0x11d/0x270
 ? sched_clock_local+0xc/0x74
 smp_apic_timer_interrupt+0x79/0x200
 apic_timer_interrupt+0xf/0x20
 </IRQ>
RIP: 0010:delay_tsc+0x22/0x50
Code: ff 0f 1f 80 00 00 00 00 65 44 8b 05 18 a7 11 48 0f ae e8 0f 31 48 89 d6 48 c1 e6 20 48 09 c6 eb 0e f3 90 65 8b 05 fe a6 11 48 <41> 39 c0 75 18 0f ae e8 0f 31 48 c1 e2 20 48 09 c2 48 89 d0 48 29
RSP: 0000:ffff8f92c0157ed0 EFLAGS: 00000212 ORIG_RAX: ffffffffffffff13
RAX: 0000000000000005 RBX: ffff8c861f356400 RCX: ffff8f92c0157e64
RDX: 000000321214c8cc RSI: 00000032120daa7f RDI: 0000000000260f15
RBP: 0000000000000005 R08: 0000000000000005 R09: 0000000000000000
R10: 0000000000000001 R11: 0000000000000001 R12: 0000000000000000
R13: 0000000000000000 R14: ffff8c861ee18000 R15: ffff8c861ee18000
 cpuhp_report_idle_dead+0x31/0x60
 do_idle+0x1d5/0x200
 ? _raw_spin_unlock_irqrestore+0x2d/0x40
 cpu_startup_entry+0x14/0x20
 start_secondary+0x151/0x170
 secondary_startup_64+0xa4/0xb0

------------------------------------------------------------------------

This happens rarely, but can be forced to happen more often by
placing delays in cpuhp_report_idle_dead() following the call to
rcu_report_dead().  With this in place, the following rcutorture
scenario reproduces the problem within a few minutes:

tools/testing/selftests/rcutorture/bin/kvm.sh --cpus 8 --duration 5 --kconfig "CONFIG_DEBUG_LOCK_ALLOC=y CONFIG_PROVE_LOCKING=y" --configs "TREE04"

This commit uses the crude but effective expedient of moving the disabling
of interrupts within the idle loop to precede the cpu_is_offline()
check.  It also invokes tick_nohz_idle_stop_tick() instead of
tick_nohz_idle_stop_tick_protected() to shut off the scheduling-clock
interrupt.

Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@kernel.org>
[ paulmck: Revert tick_nohz_idle_stop_tick_protected() removal, new callers. ]
Signed-off-by: Paul E. McKenney <paulmck@linux.ibm.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-10-05 13:09:40 +02:00
Phil Auld
9addfbd409 sched/fair: Use rq_lock/unlock in online_fair_sched_group
[ Upstream commit a46d14eca7 ]

Enabling WARN_DOUBLE_CLOCK in /sys/kernel/debug/sched_features causes
a warning to fire in update_rq_clock. This seems to be caused by onlining
a new fair sched group without using the rq lock wrappers.

  [] rq->clock_update_flags & RQCF_UPDATED
  [] WARNING: CPU: 5 PID: 54385 at kernel/sched/core.c:210 update_rq_clock+0xec/0x150

  [] Call Trace:
  []  online_fair_sched_group+0x53/0x100
  []  cpu_cgroup_css_online+0x16/0x20
  []  online_css+0x1c/0x60
  []  cgroup_apply_control_enable+0x231/0x3b0
  []  cgroup_mkdir+0x41b/0x530
  []  kernfs_iop_mkdir+0x61/0xa0
  []  vfs_mkdir+0x108/0x1a0
  []  do_mkdirat+0x77/0xe0
  []  do_syscall_64+0x55/0x1d0
  []  entry_SYSCALL_64_after_hwframe+0x44/0xa9

Using the wrappers in online_fair_sched_group instead of the raw locking
removes this warning.

[ tglx: Use rq_*lock_irq() ]

Signed-off-by: Phil Auld <pauld@redhat.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Vincent Guittot <vincent.guittot@linaro.org>
Cc: Ingo Molnar <mingo@kernel.org>
Link: https://lkml.kernel.org/r/20190801133749.11033-1-pauld@redhat.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-10-05 13:09:40 +02:00
Juri Lelli
0f30856944 sched/deadline: Fix bandwidth accounting at all levels after offline migration
[ Upstream commit 59d06cea11 ]

If a task happens to be throttled while the CPU it was running on gets
hotplugged off, the bandwidth associated with the task is not correctly
migrated with it when the replenishment timer fires (offline_migration).

Fix things up, for this_bw, running_bw and total_bw, when replenishment
timer fires and task is migrated (dl_task_offline_migration()).

Tested-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
Signed-off-by: Juri Lelli <juri.lelli@redhat.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: bristot@redhat.com
Cc: claudio@evidence.eu.com
Cc: lizefan@huawei.com
Cc: longman@redhat.com
Cc: luca.abeni@santannapisa.it
Cc: mathieu.poirier@linaro.org
Cc: rostedt@goodmis.org
Cc: tj@kernel.org
Cc: tommaso.cucinotta@santannapisa.it
Link: https://lkml.kernel.org/r/20190719140000.31694-5-juri.lelli@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-10-05 13:09:36 +02:00
Juri Lelli
f381d3d2c3 sched/core: Fix CPU controller for !RT_GROUP_SCHED
[ Upstream commit a07db5c086 ]

On !CONFIG_RT_GROUP_SCHED configurations it is currently not possible to
move RT tasks between cgroups to which CPU controller has been attached;
but it is oddly possible to first move tasks around and then make them
RT (setschedule to FIFO/RR).

E.g.:

  # mkdir /sys/fs/cgroup/cpu,cpuacct/group1
  # chrt -fp 10 $$
  # echo $$ > /sys/fs/cgroup/cpu,cpuacct/group1/tasks
  bash: echo: write error: Invalid argument
  # chrt -op 0 $$
  # echo $$ > /sys/fs/cgroup/cpu,cpuacct/group1/tasks
  # chrt -fp 10 $$
  # cat /sys/fs/cgroup/cpu,cpuacct/group1/tasks
  2345
  2598
  # chrt -p 2345
  pid 2345's current scheduling policy: SCHED_FIFO
  pid 2345's current scheduling priority: 10

Also, as Michal noted, it is currently not possible to enable CPU
controller on unified hierarchy with !CONFIG_RT_GROUP_SCHED (if there
are any kernel RT threads in root cgroup, they can't be migrated to the
newly created CPU controller's root in cgroup_update_dfl_csses()).

Existing code comes with a comment saying that "we don't support RT-tasks
being in separate groups". That comment is however stale and belongs to
pre-RT_GROUP_SCHED times. Also, it doesn't make much sense for
!RT_GROUP_SCHED configurations, since checks related to RT bandwidth
are not performed at all in these cases.

Make moving RT tasks between CPU controller groups viable by removing
the special-case check for RT (and DEADLINE) tasks.

Signed-off-by: Juri Lelli <juri.lelli@redhat.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Michal Koutný <mkoutny@suse.com>
Reviewed-by: Daniel Bristot de Oliveira <bristot@redhat.com>
Acked-by: Tejun Heo <tj@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: lizefan@huawei.com
Cc: longman@redhat.com
Cc: luca.abeni@santannapisa.it
Cc: rostedt@goodmis.org
Link: https://lkml.kernel.org/r/20190719063455.27328-1-juri.lelli@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-10-05 13:09:36 +02:00
Vincent Guittot
417cf53b4b sched/fair: Fix imbalance due to CPU affinity
[ Upstream commit f6cad8df6b ]

load_balance() has a dedicated mechanism to detect when an imbalance
is due to CPU affinity and must be handled at parent level. In this case,
the imbalance field of the parent's sched_group is set.

The description of sg_imbalanced() gives a typical example of two groups
of 4 CPUs each and 4 tasks each with a cpumask covering 1 CPU of the first
group and 3 CPUs of the second group. Something like:

	{ 0 1 2 3 } { 4 5 6 7 }
	        *     * * *

But load_balance() fails to fix this use case on my octa-core system
made of 2 clusters of quad cores.

Although load_balance() is able to detect that the imbalance is due to
CPU affinity, it fails to fix it because the imbalance field is cleared
before the parent level gets a chance to run. In fact, when the imbalance
is detected, load_balance() reruns without the CPU with pinned tasks. But
there are no other running tasks in the situation described above and
everything looks balanced this time, so the imbalance field is immediately
cleared.

The imbalance field should not be cleared if there is no other task to move
when the imbalance is detected.

Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: https://lkml.kernel.org/r/1561996022-28829-1-git-send-email-vincent.guittot@linaro.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-10-05 13:09:36 +02:00
Paul E. McKenney
7cebdfa62f time/tick-broadcast: Fix tick_broadcast_offline() lockdep complaint
[ Upstream commit 84ec3a0787 ]

The TASKS03 and TREE04 rcutorture scenarios produce the following
lockdep complaint:

	WARNING: inconsistent lock state
	5.2.0-rc1+ #513 Not tainted
	--------------------------------
	inconsistent {IN-HARDIRQ-W} -> {HARDIRQ-ON-W} usage.
	migration/1/14 [HC0[0]:SC0[0]:HE1:SE1] takes:
	(____ptrval____) (tick_broadcast_lock){?...}, at: tick_broadcast_offline+0xf/0x70
	{IN-HARDIRQ-W} state was registered at:
	  lock_acquire+0xb0/0x1c0
	  _raw_spin_lock_irqsave+0x3c/0x50
	  tick_broadcast_switch_to_oneshot+0xd/0x40
	  tick_switch_to_oneshot+0x4f/0xd0
	  hrtimer_run_queues+0xf3/0x130
	  run_local_timers+0x1c/0x50
	  update_process_times+0x1c/0x50
	  tick_periodic+0x26/0xc0
	  tick_handle_periodic+0x1a/0x60
	  smp_apic_timer_interrupt+0x80/0x2a0
	  apic_timer_interrupt+0xf/0x20
	  _raw_spin_unlock_irqrestore+0x4e/0x60
	  rcu_nocb_gp_kthread+0x15d/0x590
	  kthread+0xf3/0x130
	  ret_from_fork+0x3a/0x50
	irq event stamp: 171
	hardirqs last  enabled at (171): [<ffffffff8a201a37>] trace_hardirqs_on_thunk+0x1a/0x1c
	hardirqs last disabled at (170): [<ffffffff8a201a53>] trace_hardirqs_off_thunk+0x1a/0x1c
	softirqs last  enabled at (0): [<ffffffff8a264ee0>] copy_process.part.56+0x650/0x1cb0
	softirqs last disabled at (0): [<0000000000000000>] 0x0

        [...]

To reproduce, run the following rcutorture test:

 $ tools/testing/selftests/rcutorture/bin/kvm.sh --duration 5 --kconfig "CONFIG_DEBUG_LOCK_ALLOC=y CONFIG_PROVE_LOCKING=y" --configs "TASKS03 TREE04"

It turns out that tick_broadcast_offline() was an innocent bystander.
After all, interrupts are supposed to be disabled throughout
take_cpu_down(), and therefore should have been disabled upon entry to
tick_offline_cpu() and thus to tick_broadcast_offline().  This suggests
that one of the CPU-hotplug notifiers was incorrectly enabling interrupts,
and leaving them enabled on return.

Some debugging code showed that the culprit was sched_cpu_dying().
It had irqs enabled after return from sched_tick_stop().  Which in turn
had irqs enabled after return from cancel_delayed_work_sync().  Which is a
wrapper around __cancel_work_timer().  Which can sleep in the case where
something else is concurrently trying to cancel the same delayed work,
and as Thomas Gleixner pointed out on IRC, sleeping is a decidedly bad
idea when you are invoked from take_cpu_down(), regardless of the state
you leave interrupts in upon return.

Code inspection located no reason why the delayed work absolutely
needed to be canceled from sched_tick_stop():  The work is not
bound to the outgoing CPU by design, given that the whole point is
to collect statistics without disturbing the outgoing CPU.

This commit therefore simply drops the cancel_delayed_work_sync() from
sched_tick_stop().  Instead, a new ->state field is added to the tick_work
structure so that the delayed-work handler function sched_tick_remote()
can avoid reposting itself.  A cpu_is_offline() check is also added to
sched_tick_remote() to avoid mucking with the state of an offlined CPU
(though it does appear safe to do so).  The sched_tick_start() and
sched_tick_stop() functions also update ->state, and sched_tick_start()
also schedules the delayed work if ->state indicates that it is not
already in flight.
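
The ->state handshake described above can be pictured as a small state machine. This is only a single-threaded sketch: the three state names, the `in_flight` flag, and the use of plain assignments where the kernel would need atomic operations are all assumptions made for illustration.

```python
from enum import IntEnum


class TickState(IntEnum):
    OFFLINE = 0    # no delayed work in flight
    OFFLINING = 1  # stop requested; the handler has not noticed yet
    RUNNING = 2    # the handler keeps reposting itself


class TickWork:
    """Toy model of the tick_work structure and its ->state field."""

    def __init__(self):
        self.state = TickState.OFFLINE
        self.in_flight = False  # stands in for "delayed work is queued"

    def sched_tick_start(self):
        old, self.state = self.state, TickState.RUNNING
        if old == TickState.OFFLINE:
            self.in_flight = True  # queue only if nothing is already queued

    def sched_tick_stop(self):
        # Nothing is cancelled here, so nothing can sleep.
        self.state = TickState.OFFLINING

    def sched_tick_remote(self, cpu_is_offline):
        """Delayed-work handler; returns True if it reposted itself."""
        self.in_flight = False
        if not cpu_is_offline:
            pass  # the real handler would update tick statistics here
        if self.state == TickState.RUNNING:
            self.in_flight = True
            return True
        self.state = TickState.OFFLINE  # acknowledge the stop request
        return False


work = TickWork()
work.sched_tick_start()                      # CPU comes online
work.sched_tick_stop()                       # CPU goes down; no cancel
reposted = work.sched_tick_remote(cpu_is_offline=True)
print(reposted, work.state.name, work.in_flight)
```

The design point is that the stop path only flips a flag; the handler itself notices the flag on its next run and declines to repost, so the outgoing CPU never has to wait for the work to finish.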

Signed-off-by: Paul E. McKenney <paulmck@linux.ibm.com>
[ paulmck: Apply Peter Zijlstra and Frederic Weisbecker atomics feedback. ]
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Frederic Weisbecker <frederic@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: https://lkml.kernel.org/r/20190625165238.GJ26519@linux.ibm.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-10-05 13:09:35 +02:00
Greg Kroah-Hartman
abfd9e9bf7 This is the 4.19.76 stable release
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEZH8oZUiU471FcZm+ONu9yGCSaT4FAl2S8YUACgkQONu9yGCS
 aT73mA/+OFA6raC1+AwctfsJTqwGjcQ45VFNdTl2odA+/o9MQwfKFkphJaYppxoz
 vm542K1/JBko4gZ5Kb9rz5kqbFvrMdlDfRBHNlhzakhaU0u3fLkKt2LeoTtSIPEA
 kkog51CZScZGTAxUW72t069k1szdP6sj0NzLHNGtaqOLIKdYaTegm5RuiQTk6L/5
 VIMc2ASMAnzPWCO4NPjmrDN7wHPkMQusyG+ja+C2VDlJTZl288eP+eWTWixZXsnh
 nepyuUTUa5pbutpKmeUbNsTek+UYKfk/rcTzaSj7OY2t34jbWBP5ezq/WVlLW88G
 kb3HliXydUljvrAOhAZTKmunGFb12g2TER/h7MUDvs9P6YC38KoCCqk5lMlix+et
 oS9mYLPwbPhOLB42whc0PH3dPLuOFCQLxSKeTBNVRumm4ddpwVHGCJWpCkAhR07S
 h/uLK8aMbrgPifWL3eohe+XsgTdpTXXFApcV8C2wdK9E/dw4BSrbYRG1oJtMSGjC
 KLYYA5CnXWD6oBX/at+aJAx2DZKDESexB8i0jx8ot1Wc0Ac4TFPtET88WGmNyCnm
 ssW37j5Z07SFCEUEW+yujMtiYNbqlVy1nRCRjXcoMqWA2kOQZdo7hr+ihcd2aGgD
 GxzNw8yZaH/CwaKU9kziQsWE0GFEz/supWGu5BgBTO0PZDyUmwg=
 =eUlg
 -----END PGP SIGNATURE-----

Merge 4.19.76 into android-4.19

Changes in 4.19.76
	Revert "Bluetooth: validate BLE connection interval updates"
	net/ibmvnic: free reset work of removed device from queue
	RDMA/restrack: Protect from reentry to resource return path
	powerpc/xive: Fix bogus error code returned by OPAL
	drm/amd/display: readd -msse2 to prevent Clang from emitting libcalls to undefined SW FP routines
	IB/core: Add an unbound WQ type to the new CQ API
	HID: prodikeys: Fix general protection fault during probe
	HID: sony: Fix memory corruption issue on cleanup.
	HID: logitech: Fix general protection fault caused by Logitech driver
	HID: hidraw: Fix invalid read in hidraw_ioctl
	HID: Add quirk for HP X500 PIXART OEM mouse
	mtd: cfi_cmdset_0002: Use chip_good() to retry in do_write_oneword()
	crypto: talitos - fix missing break in switch statement
	CIFS: fix deadlock in cached root handling
	net/mlx5e: Set ECN for received packets using CQE indication
	net/mlx5e: don't set CHECKSUM_COMPLETE on SCTP packets
	mlx5: fix get_ip_proto()
	net/mlx5e: Allow reporting of checksum unnecessary
	net/mlx5e: XDP, Avoid checksum complete when XDP prog is loaded
	net/mlx5e: Rx, Fixup skb checksum for packets with tail padding
	net/mlx5e: Rx, Check ip headers sanity
	iwlwifi: mvm: send BCAST management frames to the right station
	iwlwifi: mvm: always init rs_fw with 20MHz bandwidth rates
	media: tvp5150: fix switch exit in set control handler
	ASoC: Intel: cht_bsw_max98090_ti: Enable codec clock once and keep it enabled
	ASoC: fsl: Fix of-node refcount unbalance in fsl_ssi_probe_from_dt()
	ALSA: usb-audio: Add Hiby device family to quirks for native DSD support
	ALSA: usb-audio: Add DSD support for EVGA NU Audio
	ALSA: dice: fix wrong packet parameter for Alesis iO26
	ALSA: hda - Add laptop imic fixup for ASUS M9V laptop
	ALSA: hda - Apply AMD controller workaround for Raven platform
	objtool: Clobber user CFLAGS variable
	pinctrl: sprd: Use define directive for sprd_pinconf_params values
	power: supply: sysfs: ratelimit property read error message
	locking/lockdep: Add debug_locks check in __lock_downgrade()
	scsi: qla2xxx: Turn off IOCB timeout timer on IOCB completion
	scsi: qla2xxx: Remove all rports if fabric scan retry fails
	scsi: qla2xxx: Return switch command on a timeout
	Revert "drm/amd/powerplay: Enable/Disable NBPSTATE on On/OFF of UVD"
	bpf: libbpf: retry loading program on EAGAIN
	irqchip/gic-v3-its: Fix LPI release for Multi-MSI devices
	f2fs: check all the data segments against all node ones
	PCI: hv: Avoid use of hv_pci_dev->pci_slot after freeing it
	bcache: remove redundant LIST_HEAD(journal) from run_cache_set()
	initramfs: don't free a non-existent initrd
	blk-mq: change gfp flags to GFP_NOIO in blk_mq_realloc_hw_ctxs
	blk-mq: move cancel of requeue_work to the front of blk_exit_queue
	Revert "f2fs: avoid out-of-range memory access"
	dm zoned: fix invalid memory access
	net/ibmvnic: Fix missing { in __ibmvnic_reset
	f2fs: fix to do sanity check on segment bitmap of LFS curseg
	drm: Flush output polling on shutdown
	net: don't warn in inet diag when IPV6 is disabled
	Bluetooth: btrtl: HCI reset on close for Realtek BT chip
	ACPI: video: Add new hw_changes_brightness quirk, set it on PB Easynote MZ35
	drm/nouveau/disp/nv50-: fix center/aspect-corrected scaling
	xfs: don't crash on null attr fork xfs_bmapi_read
	netfilter: nft_socket: fix erroneous socket assignment
	Bluetooth: btrtl: Additional Realtek 8822CE Bluetooth devices
	net_sched: check cops->tcf_block in tc_bind_tclass()
	net/rds: An rds_sock is added too early to the hash table
	net/rds: Check laddr_check before calling it
	f2fs: use generic EFSBADCRC/EFSCORRUPTED
	Linux 4.19.76

Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
Change-Id: Iee7c7dfb1bfcf16ad7eaffd9974d056622479030
2019-10-01 08:51:37 +02:00
Waiman Long
9423770eb3 locking/lockdep: Add debug_locks check in __lock_downgrade()
[ Upstream commit 7149258057 ]

Tetsuo Handa reported that he saw an incorrect "downgrading a read lock"
warning right after a previous lockdep warning. It is likely that the
previous warning turned off lock debugging, leaving lockdep in an
inconsistent state that led to the lock downgrade warning.

Fix that by adding a check for debug_locks at the beginning of
__lock_downgrade().
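
One way to picture the suspected failure mode is the toy model below. It is a sketch built on assumptions, not the lockdep implementation: the class `LockdepModel`, its bookkeeping dictionary and the `check_debug_locks` flag are invented here, and the model simply assumes that lock tracking stops once the first warning has switched debugging off.

```python
class LockdepModel:
    """Toy bookkeeping that stops tracking locks after the first warning."""

    def __init__(self):
        self.debug_locks = True
        self.recorded_read = {}  # lock name -> recorded as read-held?
        self.warnings = []

    def warn(self, message):
        self.warnings.append(message)
        self.debug_locks = False  # the first warning turns lock debugging off

    def acquire(self, name, read):
        if self.debug_locks:
            self.recorded_read[name] = read

    def release(self, name):
        if self.debug_locks:
            self.recorded_read.pop(name, None)

    def lock_downgrade(self, name, check_debug_locks):
        if check_debug_locks and not self.debug_locks:
            return  # bookkeeping is no longer trustworthy; do nothing
        if self.recorded_read.get(name):
            self.warn("downgrading a read lock")


def scenario(check_debug_locks):
    model = LockdepModel()
    model.acquire("sem", read=True)
    model.warn("earlier, unrelated lockdep warning")
    model.release("sem")              # not tracked any more: record goes stale
    model.acquire("sem", read=False)  # the write acquisition is not recorded
    model.lock_downgrade("sem", check_debug_locks)
    return model.warnings


print(scenario(check_debug_locks=False))
print(scenario(check_debug_locks=True))
```

Without the guard, the stale record makes a legitimate write-to-read downgrade look like a downgrade of a read lock; with the guard, the function returns early once debugging has been disabled.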

Debugged-by: Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp>
Reported-by: Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp>
Reported-by: syzbot+53383ae265fb161ef488@syzkaller.appspotmail.com
Signed-off-by: Waiman Long <longman@redhat.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Will Deacon <will.deacon@arm.com>
Link: https://lkml.kernel.org/r/1547093005-26085-1-git-send-email-longman@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-10-01 08:26:07 +02:00
Biswajit Paul
9c23eefd23 ANDROID: kernel: Restrict permissions of /proc/iomem.
The permissions of /proc/iomem currently are -r--r--r--. Everyone can
see its content. As iomem contains information about the physical memory
content of the device, restrict the information only to root.

Change-Id: If0be35c3fac5274151bea87b738a48e6ec0ae891
CRs-Fixed: 786116
Signed-off-by: Biswajit Paul <biswajitpaul@codeaurora.org>
Signed-off-by: Avijit Kanti Das <avijitnsec@codeaurora.org>
(cherry picked from https://android.googlesource.com/kernel/msm
 commit 3b1ac3a37ce5e6c31c82ca85604705575cb570d6)
Signed-off-by: Tao Huang <huangtao@rock-chips.com>
2019-09-26 20:00:46 +08:00
Greg Kroah-Hartman
de5730eaef This is the 4.19.75 stable release
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEZH8oZUiU471FcZm+ONu9yGCSaT4FAl2FslsACgkQONu9yGCS
 aT4kYBAAkOZ1wVwFD4mFkUmKvLmsGlwwkY/5/kQneBDUj4VQG9/1PFSN7Cfb9DdJ
 zdIcsdsfx/J+41FKJe9rxgJL6ttB1L8ob6GYdCI/8uA23TUGCQB5RSF/cwGeZUSz
 RRvqm1gstRimh4c+kibgkr3yxwBIUTzBMlBz1OMTsbP9YVzheGPahml2/mJAyb6C
 z6ETlLmrw0VixyyyvAF6r210K9qftjK4nMMDeFvftgU/eJUr59jBhSkEirS3jo5G
 KKP0kD3wDiOzqhZ83qU0bEG9EIiayap6k9H3r1u4Qu0xjyc095Jta+3JFpOqd66u
 CLfAKO0wf/jVx3/3EzWLtnxfXIpcfWi7Vj6rcTjASOsH8PrCLageHbyoA5JmKGsW
 gp4HUgwdgQPtMU7rFXCVEcoLqu0uU3PUGkOQlcx9AYLoaE2LsTijcLLJqb0tZztr
 IetrhXFVmeMnz2/ejqvORZw3mLNYMTD6OfNATMEgh1LkXqaCCWXdTVj2Bsp4IcD8
 d63E8ftILxxanfNjRS0T5+kc+yCkQs8oNRqZGXQQ9zjVzXiu0kyKDIh93lC7V+yF
 EM4pO/+kEljtc6vP+2hdpCG7buwvhSklOs2TvWJpU7umwEfHfxeetvnQajDzk5n0
 XLPDc+B/ZThND8+DrlhHvkx4dMU7xtR6IDvix9XpME65pWiB7nk=
 =ebAT
 -----END PGP SIGNATURE-----

Merge 4.19.75 into android-4.19

Changes in 4.19.75
	netfilter: nf_flow_table: set default timeout after successful insertion
	HID: wacom: generic: read HID_DG_CONTACTMAX from any feature report
	RDMA/restrack: Release task struct which was hold by CM_ID object
	Input: elan_i2c - remove Lenovo Legion Y7000 PnpID
	powerpc/mm/radix: Use the right page size for vmemmap mapping
	USB: usbcore: Fix slab-out-of-bounds bug during device reset
	media: tm6000: double free if usb disconnect while streaming
	phy: renesas: rcar-gen3-usb2: Disable clearing VBUS in over-current
	ip6_gre: fix a dst leak in ip6erspan_tunnel_xmit
	udp: correct reuseport selection with connected sockets
	xen-netfront: do not assume sk_buff_head list is empty in error handling
	net_sched: let qdisc_put() accept NULL pointer
	KVM: coalesced_mmio: add bounds checking
	firmware: google: check if size is valid when decoding VPD data
	serial: sprd: correct the wrong sequence of arguments
	tty/serial: atmel: reschedule TX after RX was started
	mwifiex: Fix three heap overflow at parsing element in cfg80211_ap_settings
	nl80211: Fix possible Spectre-v1 for CQM RSSI thresholds
	ieee802154: hwsim: Fix error handle path in hwsim_init_module
	ieee802154: hwsim: unregister hw while hwsim_subscribe_all_others fails
	ARM: dts: am57xx: Disable voltage switching for SD card
	ARM: OMAP2+: Fix missing SYSC_HAS_RESET_STATUS for dra7 epwmss
	bus: ti-sysc: Fix using configured sysc mask value
	s390/bpf: fix lcgr instruction encoding
	ARM: OMAP2+: Fix omap4 errata warning on other SoCs
	ARM: dts: dra74x: Fix iodelay configuration for mmc3
	ARM: OMAP1: ams-delta-fiq: Fix missing irq_ack
	bus: ti-sysc: Simplify cleanup upon failures in sysc_probe()
	s390/bpf: use 32-bit index for tail calls
	selftests/bpf: fix "bind{4, 6} deny specific IP & port" on s390
	tools: bpftool: close prog FD before exit on showing a single program
	fpga: altera-ps-spi: Fix getting of optional confd gpio
	netfilter: ebtables: Fix argument order to ADD_COUNTER
	netfilter: nft_flow_offload: missing netlink attribute policy
	netfilter: xt_nfacct: Fix alignment mismatch in xt_nfacct_match_info
	NFSv4: Fix return values for nfs4_file_open()
	NFSv4: Fix return value in nfs_finish_open()
	NFS: Fix initialisation of I/O result struct in nfs_pgio_rpcsetup
	Kconfig: Fix the reference to the IDT77105 Phy driver in the description of ATM_NICSTAR_USE_IDT77105
	xdp: unpin xdp umem pages in error path
	qed: Add cleanup in qed_slowpath_start()
	ARM: 8874/1: mm: only adjust sections of valid mm structures
	batman-adv: Only read OGM2 tvlv_len after buffer len check
	bpf: allow narrow loads of some sk_reuseport_md fields with offset > 0
	r8152: Set memory to all 0xFFs on failed reg reads
	x86/apic: Fix arch_dynirq_lower_bound() bug for DT enabled machines
	netfilter: xt_physdev: Fix spurious error message in physdev_mt_check
	netfilter: nf_conntrack_ftp: Fix debug output
	NFSv2: Fix eof handling
	NFSv2: Fix write regression
	kallsyms: Don't let kallsyms_lookup_size_offset() fail on retrieving the first symbol
	cifs: set domainName when a domain-key is used in multiuser
	cifs: Use kzfree() to zero out the password
	usb: host: xhci-tegra: Set DMA mask correctly
	ARM: 8901/1: add a criteria for pfn_valid of arm
	ibmvnic: Do not process reset during or after device removal
	sky2: Disable MSI on yet another ASUS boards (P6Xxxx)
	i2c: designware: Synchronize IRQs when unregistering slave client
	perf/x86/intel: Restrict period on Nehalem
	perf/x86/amd/ibs: Fix sample bias for dispatched micro-ops
	amd-xgbe: Fix error path in xgbe_mod_init()
	tools/power x86_energy_perf_policy: Fix "uninitialized variable" warnings at -O2
	tools/power x86_energy_perf_policy: Fix argument parsing
	tools/power turbostat: fix buffer overrun
	net: aquantia: fix out of memory condition on rx side
	net: seeq: Fix the function used to release some memory in an error handling path
	dmaengine: ti: dma-crossbar: Fix a memory leak bug
	dmaengine: ti: omap-dma: Add cleanup in omap_dma_probe()
	x86/uaccess: Don't leak the AC flags into __get_user() argument evaluation
	x86/hyper-v: Fix overflow bug in fill_gva_list()
	keys: Fix missing null pointer check in request_key_auth_describe()
	iommu/amd: Flush old domains in kdump kernel
	iommu/amd: Fix race in increase_address_space()
	PCI: kirin: Fix section mismatch warning
	ovl: fix regression caused by overlapping layers detection
	floppy: fix usercopy direction
	binfmt_elf: move brk out of mmap when doing direct loader exec
	arm64: kpti: Whitelist Cortex-A CPUs that don't implement the CSV3 field
	media: technisat-usb2: break out of loop at end of buffer
	Linux 4.19.75

Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
Change-Id: I1dd841f112ee81497cd085b102979f45ee5e6b9d
2019-09-21 07:55:26 +02:00
Marc Zyngier
9a74f799b9 kallsyms: Don't let kallsyms_lookup_size_offset() fail on retrieving the first symbol
[ Upstream commit 2a1a3fa0f2 ]

An arm64 kernel configured with

  CONFIG_KPROBES=y
  CONFIG_KALLSYMS=y
  # CONFIG_KALLSYMS_ALL is not set
  CONFIG_KALLSYMS_BASE_RELATIVE=y

reports the following kprobe failure:

  [    0.032677] kprobes: failed to populate blacklist: -22
  [    0.033376] Please take care of using kprobes.

It appears that kprobe fails to retrieve the symbol at address
0xffff000010081000, despite this symbol being in System.map:

  ffff000010081000 T __exception_text_start

This symbol is part of the first group of aliases in the
kallsyms_offsets array (symbol names generated using ugly hacks in
scripts/kallsyms.c):

  kallsyms_offsets:
          .long   0x1000 // do_undefinstr
          .long   0x1000 // efi_header_end
          .long   0x1000 // _stext
          .long   0x1000 // __exception_text_start
          .long   0x12b0 // do_cp15instr

Looking at the implementation of get_symbol_pos(), it returns the
lowest index for aliasing symbols. In this case, it returns 0.

But kallsyms_lookup_size_offset() considers 0 as a failure, which
is obviously wrong (there is definitely a valid symbol living there).
In turn, the kprobe blacklisting stops abruptly, hence the original
error.

A CONFIG_KALLSYMS_ALL kernel wouldn't fail as there are always
some random symbols at the beginning of this array, which are never
looked up via kallsyms_lookup_size_offset.

Fix it by considering that get_symbol_pos() is always successful
(which is consistent with the other uses of this function).
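
The failure mode can be sketched in user space as follows. This is a
simplified model, not the kernel code: the offsets are the ones listed
above, while get_symbol_pos_model(), lookup_buggy(), and lookup_fixed()
are hypothetical names, and a linear scan stands in for the real search.

```c
#include <stdbool.h>
#include <stddef.h>

/* Offsets from the listing above; the first four symbols alias 0x1000. */
static const unsigned long offsets[] = { 0x1000, 0x1000, 0x1000, 0x1000, 0x12b0 };
#define NUM_SYMS (sizeof(offsets) / sizeof(offsets[0]))

/* Find the last symbol at or below addr, then walk back to its lowest alias. */
static size_t get_symbol_pos_model(unsigned long addr)
{
	size_t pos = 0;

	for (size_t i = 0; i < NUM_SYMS; i++)
		if (offsets[i] <= addr)
			pos = i;
	while (pos > 0 && offsets[pos - 1] == offsets[pos])
		pos--;
	return pos;
}

/* Old behaviour: index 0 is misread as "not found". */
static bool lookup_buggy(unsigned long addr)
{
	return get_symbol_pos_model(addr) != 0;
}

/* Fixed behaviour: the position lookup is treated as always successful. */
static bool lookup_fixed(unsigned long addr)
{
	(void)get_symbol_pos_model(addr);
	return true;
}
```

In this model, a lookup at 0x1000 resolves to index 0, so lookup_buggy()
reports a failure for a valid symbol while lookup_fixed() does not.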

Fixes: ffc5089196 ("[PATCH] Create kallsyms_lookup_size_offset()")
Reviewed-by: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Will Deacon <will@kernel.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Signed-off-by: Will Deacon <will@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-09-21 07:17:02 +02:00
Sami Tolvanen
9a11e8da57 ANDROID: bpf: validate bpf_func when BPF_JIT is enabled with CFI
With CONFIG_BPF_JIT, the kernel makes indirect calls to dynamically
generated code, which the compile-time Control-Flow Integrity (CFI)
checking cannot validate. This change adds basic sanity checking to
ensure we are jumping to a valid location, which narrows down the
attack surface on the stored pointer.

In addition, this change adds a weak arch_bpf_jit_check_func function,
which architectures that implement BPF JIT can override to perform
additional validation, such as verifying that the pointer points to
the correct memory region.
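
A rough user-space sketch of the idea, assuming a marker value stored in
front of the generated code. Only the hook name arch_bpf_jit_check_func
comes from this message; its signature here, the jit_header and
jit_image types, JIT_HEADER_MAGIC, and jit_target_is_valid() are
hypothetical. The weak attribute is the GCC/Clang extension.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define JIT_HEADER_MAGIC 0x4a495431u  /* hypothetical marker value */

/* Hypothetical header placed in front of each generated program. */
struct jit_header {
	uint32_t magic;
};

struct jit_image {
	struct jit_header hdr;
	unsigned char code[16];  /* stands in for generated instructions */
};

/*
 * Weak default for the architecture hook. An architecture could supply
 * a strong definition that also verifies the memory region.
 */
__attribute__((weak)) bool arch_bpf_jit_check_func(const struct jit_image *img)
{
	(void)img;
	return true;
}

/* Basic sanity check to run before an indirect call into generated code. */
static bool jit_target_is_valid(const struct jit_image *img)
{
	if (img == NULL || img->hdr.magic != JIT_HEADER_MAGIC)
		return false;
	return arch_bpf_jit_check_func(img);
}
```

Keeping the hook weak lets the generic check stay in common code while
each architecture opts in to stricter validation by defining the same
symbol.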

Bug: 140377409
Change-Id: I8ebac6637ab6bd9db44716b1c742add267298669
Signed-off-by: Sami Tolvanen <samitolvanen@google.com>
2019-09-20 10:58:37 -07:00
Jason Xing
e71f9c35ee UPSTREAM: psi: get poll_work to run when calling poll syscall next time
Only the first call to the poll syscall lets the user receive POLLPRI
correctly.  After that, the user always fails to receive the event
signal.

Reproduce case:
 1. Get the monitor code in Documentation/accounting/psi.txt
 2. Run it, and wait for the event to trigger.
 3. Kill and restart the process.

The question is why we can end up with poll_scheduled = 1 but the work
not running (which would reset it to 0).  And the answer is because the
scheduling side sees group->poll_kworker under RCU protection and then
schedules it, but here we cancel the work and destroy the worker.  The
cancel needs to pair with resetting the poll_scheduled flag.
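
The pairing can be modelled with C11 atomics as below. This is a sketch
under simplifying assumptions: psi_group_model, schedule_poll_work(),
and destroy_poll_worker() are hypothetical stand-ins, and the reset_flag
parameter exists only to contrast the old and new teardown behaviour.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical, trimmed-down stand-in for the psi group state. */
struct psi_group_model {
	atomic_int poll_scheduled;  /* 1 while poll work is marked as queued */
};

/* Scheduling side: claim the flag; fail if work is already marked queued. */
static bool schedule_poll_work(struct psi_group_model *g)
{
	int expected = 0;

	return atomic_compare_exchange_strong(&g->poll_scheduled, &expected, 1);
}

/*
 * Teardown: the pending work is cancelled and never runs, so nothing
 * else clears the flag. Resetting it here is the pairing described above.
 */
static void destroy_poll_worker(struct psi_group_model *g, bool reset_flag)
{
	if (reset_flag)
		atomic_store(&g->poll_scheduled, 0);
}
```

Without the reset, the flag stays at 1 after teardown and every later
schedule_poll_work() call in this model fails, which mirrors the stuck
state seen after restarting the monitor.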

Link: http://lkml.kernel.org/r/1566357985-97781-1-git-send-email-joseph.qi@linux.alibaba.com
Signed-off-by: Jason Xing <kerneljasonxing@linux.alibaba.com>
Signed-off-by: Joseph Qi <joseph.qi@linux.alibaba.com>
Reviewed-by: Caspar Zhang <caspar@linux.alibaba.com>
Reviewed-by: Suren Baghdasaryan <surenb@google.com>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

(cherry picked from commit 7b2b55da1d)

Bug: 141131229
Test: lmkd_unit_test and ACT mempressure tests
Change-Id: Ieaa8284ef632ef06318a92d792b239d344bb29d1
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
2019-09-19 23:59:24 +00:00
Suren Baghdasaryan
849205758a UPSTREAM: sched/psi: Do not require setsched permission from the trigger creator
When a process creates a new trigger by writing into /proc/pressure/*
files, permissions to write such a file should be used to determine whether
the process is allowed to do so or not. The current implementation also
requires such a process to have setsched capability. Setting the psi trigger
thread's scheduling policy is an implementation detail and should not be
exposed to the user level. Remove the permission check by using _nocheck
version of the function.

Suggested-by: Nick Kralevich <nnk@google.com>
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: lizefan@huawei.com
Cc: mingo@redhat.com
Cc: akpm@linux-foundation.org
Cc: kernel-team@android.com
Cc: dennisszhou@gmail.com
Cc: dennis@kernel.org
Cc: hannes@cmpxchg.org
Cc: axboe@kernel.dk
Link: https://lkml.kernel.org/r/20190730013310.162367-1-surenb@google.com

(cherry picked from commit 04e048cf09)

Bug: 131761776
Test: lmkd_unit_test and ACT mempressure tests
Change-Id: I37737e85611bb742d61a6988132856726bef9a1f
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
2019-09-19 23:59:08 +00:00
Peter Zijlstra
2a220bc9f2 UPSTREAM: sched/psi: Reduce psimon FIFO priority
PSI defaults to a FIFO-99 thread, reduce this to FIFO-1.

FIFO-99 is the very highest priority available to SCHED_FIFO and
it is not a suitable default; it would indicate the psi work is the
most important work on the machine.

Since Real-Time tasks will have pre-allocated memory and locked it in
place, Real-Time tasks do not care about PSI. All the psimon thread
needs is to be above OTHER.
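
For reference, the same priority level expressed through the user-space
POSIX scheduling interface. This only illustrates what FIFO-1 means; the
kernel change itself does not go through this interface, and
psimon_sched_param() is a hypothetical helper name.

```c
#include <sched.h>

/* Build scheduling parameters for SCHED_FIFO priority 1. */
static struct sched_param psimon_sched_param(void)
{
	struct sched_param param = { .sched_priority = 1 };

	return param;
}
```

A caller would pass the result to sched_setscheduler() together with
SCHED_FIFO; any SCHED_FIFO priority already runs ahead of SCHED_OTHER
tasks.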

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Tested-by: Suren Baghdasaryan <surenb@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>

(cherry picked from commit 14f5c7b46a)

Bug: 141131229
Test: lmkd_unit_test and ACT mempressure tests
Change-Id: I52964915467577bfc3543700aec9b463f6f0ffe1
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
2019-09-19 23:58:37 +00:00