alarmtimer suspend return -EBUSY if the next alarm will fire in less
than 2 seconds. This allows one RTC seconds tick to occur subsequent
to this check before the alarm wakeup time is set, ensuring the wakeup
time is still in the future (assuming the RTC does not tick one more
second prior to setting the alarm). If suspend is rejected, hold a
wakeup source for 2 seconds to process the alarm prior to reattempting
suspend.
Change-Id: If38e2568da0ea01dfee6e00323ce7e2c00f2f110
Signed-off-by: Todd Poynor <toddpoynor@google.com>
commit 047fe36052 upstream.
Dave Jones reported a kernel BUG at mm/slub.c:3474! triggered
by splice_shrink_spd() called from vmsplice_to_pipe()
commit 35f3d14dbb (pipe: add support for shrinking and growing pipes)
added capability to adjust pipe->buffers.
Problem is some paths don't hold pipe mutex and assume pipe->buffers
doesn't change for their duration.
Fix this by adding nr_pages_max field in struct splice_pipe_desc, and
use it in place of pipe->buffers where appropriate.
splice_shrink_spd() loses its struct pipe_inode_info argument.
Reported-by: Dave Jones <davej@redhat.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Tom Herbert <therbert@google.com>
Tested-by: Dave Jones <davej@redhat.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
[bwh: Backported to 3.2:
- Adjust context in vmsplice_to_pipe()
- Update one more call to splice_shrink_spd(), from skb_splice_bits()]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 71babb2705 upstream.
According to Documentation/trace/ftrace.txt:
tracing_cpumask:
This is a mask that lets the user only trace
on specified CPUS. The format is a hex string
representing the CPUS.
The tracing_cpumask currently doesn't affect the tracing state of
per-CPU ring buffers.
This patch enables/disables CPU recording as its corresponding bit in
tracing_cpumask is set/unset.
Link: http://lkml.kernel.org/r/1336096792-25373-3-git-send-email-vnagarnaik@google.com
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Laurent Chavey <chavey@google.com>
Cc: Justin Teravest <teravest@google.com>
Cc: David Sharp <dhsharp@google.com>
Signed-off-by: Vaibhav Nagarnaik <vnagarnaik@google.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 4fe7efdbdf upstream.
do_exit() and exec_mmap() call sync_mm_rss() before mm_release() does
put_user(clear_child_tid) which can update task->rss_stat and thus make
mm->rss_stat inconsistent. This triggers the "BUG:" printk in check_mm().
Let's fix this bug in the safest way, and optimize/cleanup this later.
Reported-by: Markus Trippelsdorf <markus@trippelsdorf.de>
Signed-off-by: Konstantin Khlebnikov <khlebnikov@openvz.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Hugh Dickins <hughd@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit dd48d708ff upstream.
When repeating a UTC time value during a leap second (when the UTC
time should be 23:59:60), the TAI timescale should not stop. The kernel
NTP code increments the TAI offset one second too late. This patch fixes
the issue by incrementing the offset during the leap second itself.
Signed-off-by: Richard Cochran <richardcochran@gmail.com>
Signed-off-by: John Stultz <john.stultz@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 62be73eafa upstream.
This patch moves kmsg_dump(KMSG_DUMP_PANIC) below smp_send_stop(),
to serialize the crash-logging process via smp_send_stop() and to
thus retrieve a more stable crash image of all CPUs stopped.
Signed-off-by: Seiji Aguchi <seiji.aguchi@hds.com>
Acked-by: Don Zickus <dzickus@redhat.com>
Cc: dle-develop@lists.sourceforge.net <dle-develop@lists.sourceforge.net>
Cc: Satoru Moriya <satoru.moriya@hds.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: a.p.zijlstra@chello.nl <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/5C4C569E8A4B9B42A84A977CF070A35B2E4D7A5CE2@USINDEVS01.corp.hds.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit f2bf1f6f5f upstream.
A recent update to have tracing_on/off() only affect the ftrace ring
buffers instead of all ring buffers had a cut and paste error.
The tracing_off() did the exact same thing as tracing_on() and
would not actually turn off tracing. Unfortunately, tracing_off()
is more important to be working than tracing_on() as this is a key
development tool, as it lets the developer turn off tracing as soon
as a problem is discovered. It is also used by panic and oops code.
This bug also breaks the 'echo func:traceoff > set_ftrace_filter'
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit a841f8cef4 upstream.
It does not get processed because sched_domain_level_max is 0 at the
time that setup_relax_domain_level() is run.
Simply accept the value as it is, as we don't know the value of
sched_domain_level_max until sched domain construction is completed.
Fix sched_relax_domain_level in cpuset. The build_sched_domain() routine calls
the set_domain_attribute() routine prior to setting the sd->level, however,
the set_domain_attribute() routine relies on the sd->level to decide whether
idle load balancing will be off/on.
Signed-off-by: Dimitri Sivanich <sivanich@sgi.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/20120605184436.GA15668@sgi.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit fad0c66c4b upstream.
Commit 6b43ae8a61 (ntp: Fix leap-second hrtimer livelock) broke the
leapsecond update of CLOCK_MONOTONIC. The missing leapsecond update to
wall_to_monotonic causes discontinuities in CLOCK_MONOTONIC.
Adjust wall_to_monotonic when NTP inserted a leapsecond.
Reported-by: Richard Cochran <richardcochran@gmail.com>
Signed-off-by: John Stultz <john.stultz@linaro.org>
Tested-by: Richard Cochran <richardcochran@gmail.com>
Link: http://lkml.kernel.org/r/1338400497-12420-1-git-send-email-john.stultz@linaro.org
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 7edc8b0ac1 upstream.
The vma length in dup_mmap is calculated and stored in a unsigned int,
which is insufficient and hence overflows for very large maps (beyond
16TB). The following program demonstrates this:
#include <stdio.h>
#include <unistd.h>
#include <sys/mman.h>
#define GIG 1024 * 1024 * 1024L
#define EXTENT 16393
int main(void)
{
int i, r;
void *m;
char buf[1024];
for (i = 0; i < EXTENT; i++) {
m = mmap(NULL, (size_t) 1 * 1024 * 1024 * 1024L,
PROT_READ | PROT_WRITE, MAP_PRIVATE | MAP_ANONYMOUS, 0, 0);
if (m == (void *)-1)
printf("MMAP Failed: %d\n", m);
else
printf("%d : MMAP returned %p\n", i, m);
r = fork();
if (r == 0) {
printf("%d: successed\n", i);
return 0;
} else if (r < 0)
printf("FORK Failed: %d\n", r);
else if (r > 0)
wait(NULL);
}
return 0;
}
Increase the storage size of the result to unsigned long, which is
sufficient for storing the difference between addresses.
Signed-off-by: Siddhesh Poyarekar <siddhesh.poyarekar@gmail.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Acked-by: Hugh Dickins <hughd@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Prints the time spent in suspend in the kernel log, and
keeps statistics on the time spent in suspend in
/sys/kernel/debug/suspend_time
Change-Id: Ia6b9ebe4baa0f7f5cd211c6a4f7e813aefd3fa1d
Signed-off-by: Colin Cross <ccross@android.com>
Signed-off-by: Todd Poynor <toddpoynor@google.com>
commit 544ecf310f upstream.
worker_enter_idle() has WARN_ON_ONCE() which triggers if nr_running
isn't zero when every worker is idle. This can trigger spuriously
while a cpu is going down due to the way trustee sets %WORKER_ROGUE
and zaps nr_running.
It first sets %WORKER_ROGUE on all workers without updating
nr_running, releases gcwq->lock, schedules, regrabs gcwq->lock and
then zaps nr_running. If the last running worker enters idle
inbetween, it would see stale nr_running which hasn't been zapped yet
and trigger the WARN_ON_ONCE().
Fix it by performing the sanity check iff the trustee is idle.
Signed-off-by: Tejun Heo <tj@kernel.org>
Reported-by: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
On non-developer devices kgdb prevents CONFIG_PANIC_TIMEOUT from
rebooting the device after a panic. Add module parameters
debug_core.break_on_exception and debug_core.break_on_panic to
allow skipping debug on panics and exceptions respectively. Both
default to true to preserve existing behavior.
Change-Id: I75dce7263e96cee069a9750920cce83dc6f98e8c
Signed-off-by: Colin Cross <ccross@android.com>
task_tick_rt has an optimization to only reschedule SCHED_RR tasks
if they were the only element on their rq. However, with cgroups
a SCHED_RR task could be the only element on its per-cgroup rq but
still be competing with other SCHED_RR tasks in its parent's
cgroup. In this case, the SCHED_RR task in the child cgroup would
never yield at the end of its timeslice. If the child cgroup
rt_runtime_us was the same as the parent cgroup rt_runtime_us,
the task in the parent cgroup would starve completely.
Modify task_tick_rt to check that the task is the only task on its
rq, and that the each of the scheduling entities of its ancestors
is also the only entity on its rq.
Change-Id: I4f5b118517f85db3570923eb2f5e4c933ece9247
Signed-off-by: Colin Cross <ccross@android.com>
Sometimes resume= parameter comes in integer style (e.g. major:minor)
and then name_to_dev_t can not detect partition properly. (especially
async device like usb, mmc).
This patch calls get_gendisk() if resumewait is true and resume_file
is in integer format to work around this problem.
Signed-off-by: Minho Ban <mhban@samsung.com>
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Pull perf, x86 and scheduler updates from Ingo Molnar.
* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
tracing: Do not enable function event with enable
perf stat: handle ENXIO error for perf_event_open
perf: Turn off compiler warnings for flex and bison generated files
perf stat: Fix case where guest/host monitoring is not supported by kernel
perf build-id: Fix filename size calculation
* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86, kvm: KVM paravirt kernels don't check for CPUID being unavailable
x86: Fix section annotation of acpi_map_cpu2node()
x86/microcode: Ensure that module is only loaded on supported Intel CPUs
* 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
sched: Fix KVM and ia64 boot crash due to sched_groups circular linked list assumption
Export handle_edge_irq() and irq_to_desc() to modules to allow them to
do things such as
__irq_set_handler_locked(...., handle_edge_irq);
This fixes
ERROR: "handle_edge_irq" [drivers/gpio/gpio-pch.ko] undefined!
ERROR: "irq_to_desc" [drivers/gpio/gpio-pch.ko] undefined!
when gpio-pch is being built as a module.
This was introduced by commit df9541a60a ("gpio: pch9: Use proper flow
type handlers") that added
__irq_set_handler_locked(d->irq, handle_edge_irq);
but handle_edge_irq() was not exported for modules (and inlined
__irq_set_handler_locked() requires irq_to_desc() exported as well)
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Make it possible to configure out the user space wakeup sources
garbage collector for debugging and default Android builds.
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Acked-by: Arve Hjønnevåg <arve@android.com>
Make it possible to configure out the check against the limit of
user space wakeup sources for debugging and default Android builds.
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Acked-by: Arve Hjønnevåg <arve@android.com>
Fork() failure post namespace creation for a child cloned with
CLONE_NEWPID leaks pid_namespace/mnt_cache due to proc being mounted
during creation, but not unmounted during cleanup. Call
pid_ns_release_proc() during cleanup.
Signed-off-by: Mike Galbraith <efault@gmx.de>
Acked-by: Oleg Nesterov <oleg@redhat.com>
Reviewed-by: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Pavel Emelyanov <xemul@parallels.com>
Cc: Cyrill Gorcunov <gorcunov@openvz.org>
Cc: Louis Rilling <louis.rilling@kerlabs.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
With the adding of function tracing event to perf, it caused a
side effect that produces the following warning when enabling all
events in ftrace:
# echo 1 > /sys/kernel/debug/tracing/events/enable
[console]
event trace: Could not enable event function
This is because when enabling all events via the debugfs system
it ignores events that do not have a ->reg() function assigned.
This was to skip over the ftrace internal events (as they are
not TRACE_EVENTs). But as the ftrace function event now has
a ->reg() function attached to it for use with perf, it is no
longer ignored.
Worse yet, this ->reg() function is being called when it should
not be. It returns an error and causes the above warning to
be printed.
By adding a new event_call flag (TRACE_EVENT_FL_IGNORE_ENABLE)
and have all ftrace internel event structures have it set,
setting the events/enable will no longe try to incorrectly enable
the function event and does not warn.
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
compat_sys_sigprocmask reads a smaller signal mask from userspace than
sigprogmask accepts for setting. So the high word of blocked.sig[0]
will be cleared, releasing any potentially blocked RT signal.
This was discovered via userspace code that relies on get/setcontext.
glibc's i386 versions of those functions use sigprogmask instead of
rt_sigprogmask to save/restore signal mask and caused RT signal
unblocking this way.
As suggested by Linus, this replaces the sys_sigprocmask based compat
version with one that open-codes the required logic, including the merge
of the existing blocked set with the new one provided on SIG_SETMASK.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
If we have one cpu that failed to boot and boot cpu gave up on
waiting for it and then another cpu is being booted, kernel
might crash with following OOPS:
BUG: unable to handle kernel NULL pointer dereference at 0000000000000018
IP: [<ffffffff812c3630>] __bitmap_weight+0x30/0x80
Call Trace:
[<ffffffff8108b9b6>] build_sched_domains+0x7b6/0xa50
The crash happens in init_sched_groups_power() that expects
sched_groups to be circular linked list. However it is not
always true, since sched_groups preallocated in __sdt_alloc are
initialized in build_sched_groups and it may exit early
if (cpu != cpumask_first(sched_domain_span(sd)))
return 0;
without initializing sd->groups->next field.
Fix bug by initializing next field right after sched_group was
allocated.
Also-Reported-by: Jiang Liu <liuj97@gmail.com>
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Cc: a.p.zijlstra@chello.nl
Cc: pjt@google.com
Cc: seto.hidetoshi@jp.fujitsu.com
Link: http://lkml.kernel.org/r/1336559908-32533-1-git-send-email-imammedo@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
The condition check in autosleep_store() is incorrect and prevents
/sys/power/autosleep from working as advertised. Fix that.
[rjw: Added the changelog.]
Signed-off-by: Arve Hjønnevåg <arve@android.com>
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Android allows user space to manipulate wakelocks using two
sysfs file located in /sys/power/, wake_lock and wake_unlock.
Writing a wakelock name and optionally a timeout to the wake_lock
file causes the wakelock whose name was written to be acquired (it
is created before is necessary), optionally with the given timeout.
Writing the name of a wakelock to wake_unlock causes that wakelock
to be released.
Implement an analogous interface for user space using wakeup sources.
Add the /sys/power/wake_lock and /sys/power/wake_unlock files
allowing user space to create, activate and deactivate wakeup
sources, such that writing a name and optionally a timeout to
wake_lock causes the wakeup source of that name to be activated,
optionally with the given timeout. If that wakeup source doesn't
exist, it will be created and then activated. Writing a name to
wake_unlock causes the wakeup source of that name, if there is one,
to be deactivated. Wakeup sources created with the help of
wake_lock that haven't been used for more than 5 minutes are garbage
collected and destroyed. Moreover, there can be only WL_NUMBER_LIMIT
wakeup sources created with the help of wake_lock present at a time.
The data type used to track wakeup sources created by user space is
called "struct wakelock" to indicate the origins of this feature.
This version of the patch includes an rbtree manipulation fix from John Stultz.
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Reviewed-by: NeilBrown <neilb@suse.de>
Android uses one wakelock statistics that is only necessary for
opportunistic sleep. Namely, the prevent_suspend_time field
accumulates the total time the given wakelock has been locked
while "automatic suspend" was enabled. Add an analogous field,
prevent_sleep_time, to wakeup sources and make it behave in a similar
way.
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Introduce a mechanism by which the kernel can trigger global
transitions to a sleep state chosen by user space if there are no
active wakeup sources.
It consists of a new sysfs attribute, /sys/power/autosleep, that
can be written one of the strings returned by reads from
/sys/power/state, an ordered workqueue and a work item carrying out
the "suspend" operations. If a string representing the system's
sleep state is written to /sys/power/autosleep, the work item
triggering transitions to that state is queued up and it requeues
itself after every execution until user space writes "off" to
/sys/power/autosleep.
That work item enables the detection of wakeup events using the
functions already defined in drivers/base/power/wakeup.c (with one
small modification) and calls either pm_suspend(), or hibernate() to
put the system into a sleep state. If a wakeup event is reported
while the transition is in progress, it will abort the transition and
the "system suspend" work item will be queued up again.
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Reviewed-by: NeilBrown <neilb@suse.de>
1. Do not allocate memory for buffers from emergency pools, unless
absolutely required. Do not warn about and do not retry non-essential
failed allocations.
2. Do not check the amount of free pages left on every single page
write, but wait until one map is completely populated and then check.
3. Set maximum number of pages for read buffering consistently, instead
of inadvertently depending on the size of the sector type.
4. Fix copyright line, which I missed when I submitted the hibernation
threading patch.
5. Dispense with bit shifting arithmetic to improve readability.
6. Really recalculate the number of pages required to be free after all
allocations have been done.
7. Fix calculation of pages required for read buffering. Only count in
pages that do not belong to high memory.
Signed-off-by: Bojan Smojver <bojan@rexursive.com>
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Fix for an issue causing hibernation to hang on systems with highmem (that
practically means i386) due to broken memory management (bug introduced in 3.2,
so -stable material) and PM documentation update making the freezer
documentation follow the code again after some recent updates.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.18 (GNU/Linux)
iQIcBAABAgAGBQJPnbeLAAoJEKhOf7ml8uNs4nkQAKwhfWWfbM7ZepPfT56A5NW1
9vlfgO+1ibUgdjkO0hi1biCAbARTNVS5eCLyRJW0W/msGgL51nYleBeFmwwx5E6h
m5Vwr/53cGeAeF0AOkrQkD45YKJaAlTmWF4/T2YWKLWNMgaVuLmGf7eyYZ6rP1NO
rJxzUOMC6UrRjIA+S2anDU0CdMyqDHvV3OmY+InZBikFCk0YAtDYUYfNDNqQpEBG
bzkG3SyaJeqnbQDkhme7U/uAPJCThSz2Z4gvvOxiXdB+I+yp6FhluhLSGxqMh/kj
OUAJe9s6AAdKz+K62/OgowwucxvmeJRCyYWkN2ZEpsZLoqTEOqLNS4+eaUO6xS/2
tq89LnfSIVFwRx23XeVr/oMfxUJZ8VKZENo5Pm6NjTAYykTeyD4ug/GAHqgXR0TT
B+fvx8QmQ68R843aJsjR9h0AKsSeXfgCAROJt+x0ONYAvmJNV62nzs81broDEl4I
BmWHpOWI7wlzMPt7bNWn4ev4K+WhbVsioFDS61he0Y47Rqt3yUJ8G2OfBq6JYndw
As4ImoPOVGl0+TKcHJ9Y3bVPnsY7fJyF0GeG50NHxsVFsnTv+rYZ9K3GM9KExhO/
5mCfoHNgkOJnhGHfZppbnQBHbmjH8EA3QUx57Abo+Q4wiPNNAVG9P1JpZeyGj8KF
3YML5FjjGQHtYBWeH5WR
=HaVL
-----END PGP SIGNATURE-----
Merge tag 'pm-for-3.4-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull power management fixes from Rafael J. Wysocki:
"Fix for an issue causing hibernation to hang on systems with highmem
(that practically means i386) due to broken memory management (bug
introduced in 3.2, so -stable material) and PM documentation update
making the freezer documentation follow the code again after some
recent updates."
* tag 'pm-for-3.4-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
PM / Freezer / Docs: Update documentation about freezing of tasks
PM / Hibernate: fix the number of pages used for hibernate/thaw buffering
Pull perf fixes from Ingo Molnar.
* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
perf: Fix perf_event_for_each() to use sibling
perf symbols: Read plt symbols from proper symtab_type binary
tracing: Fix stacktrace of latency tracers (irqsoff and friends)
perf tools: Add 'G' and 'H' modifiers to event parsing
tracing: Fix regression with tracing_on
perf tools: Drop CROSS_COMPILE from flex and bison calls
perf report: Fix crash showing warning related to kernel maps
tracing: Fix build breakage without CONFIG_PERF_EVENTS (again)
Pull build fixes for less mainstream architectures from Paul Gortmaker:
"These are fixes for frv(1), blackfin(2), powerpc(1) and xtensa(4).
Fortunately the touches are nearly all specific to files just used by
the arch in question. The two touches to shared/common files
[kernel/irq/debug.h and drivers/pci/Makefile] are trivial to assess as
no risk to anyone.
Half of them relate to xtensa directly. It was only when I fixed the
last xtensa issue that I realized that the arch has been broken for a
significant time, and isn't a specific v3.4 regression. So if you
wanted, we could leave xtensa lying bleeding in the street for a
couple more weeks and queue those for 3.5. But given they are no risk
to anyone outside of xtensa, I figured to just leave them in.
If you are OK with taking the xtensa fixes, then please pull to get:
- one last implicit include uncovered by system.h that is in a file
specific to just one powerpc defconfig. (I'd sync'd with BenH).
- fix an oversight in the PCI makefile where shared code wasn't being
compiled for ARCH=frv
- fix a missing include for GPIO in blackfin framebuffer.
- audit and tag endif in blackfin ezkit board file, in order to find
and fix the misplaced endif masking a block of code.
- fix irq/debug.h choice of temporary macro names to be more internal
so they don't conflict with names used by xtensa.
- fix a reference to an undeclared local var in xtensa's signal.c
- fix an implicit bug.h usage in xtensa's asm/io.h uncovered by my
removing bug.h from kernel.h
- fix xtensa to properly indicate it is using asm-generic/hardirq.h
in order to resolve the link error - undefined ack_bad_irq
The xtensa still fails final link as my latest binutils does something
evil when ld forward-relocates unlikely() blocks, but in theory people
who have older/valid toolchains could now use the thing."
* 'for-v3.4-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/paulg/linux:
xtensa: fix build fail on undefined ack_bad_irq
blackfin: fix ifdef fustercluck in mach-bf538/boards/ezkit.c
blackfin: fix compile error in bfin-lq035q1-fb.c
pci: frv architecture needs generic setup-bus infrastructure
irq: hide debug macros so they don't collide with others.
xtensa: fix build error in xtensa/include/asm/io.h
xtensa: fix build failure in xtensa/kernel/signal.c
powerpc: fix system.h fallout in sysdev/scom.c [chroma_defconfig]
In perf_event_for_each() we call a function on an event, and then
iterate over the siblings of the event.
However we don't call the function on the siblings, we call it
repeatedly on the original event - it seems "obvious" that we should
be calling it with sibling as the argument.
It looks like this broke in commit 75f937f24b ("Fix ctx->mutex
vs counter->mutex inversion").
The only effect of the bug is that the PERF_IOC_FLAG_GROUP parameter
to the ioctls doesn't work.
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1334109253-31329-1-git-send-email-michael@ellerman.id.au
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Under extreme memory used up situations, percpu allocation
might fail. We hit it when system goes to suspend-to-ram,
causing a kworker panic:
EIP: [<c124411a>] build_sched_domains+0x23a/0xad0
Kernel panic - not syncing: Fatal exception
Pid: 3026, comm: kworker/u:3
3.0.8-137473-gf42fbef #1
Call Trace:
[<c18cc4f2>] panic+0x66/0x16c
[...]
[<c1244c37>] partition_sched_domains+0x287/0x4b0
[<c12a77be>] cpuset_update_active_cpus+0x1fe/0x210
[<c123712d>] cpuset_cpu_inactive+0x1d/0x30
[...]
With this fix applied build_sched_domains() will return -ENOMEM and
the suspend attempt fails.
Signed-off-by: he, bo <bo.he@intel.com>
Reviewed-by: Zhang, Yanmin <yanmin.zhang@intel.com>
Reviewed-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: <stable@kernel.org>
Link: http://lkml.kernel.org/r/1335355161.5892.17.camel@hebo
[ So, we fail to deallocate a CPU because we cannot allocate RAM :-/
I don't like that kind of sad behavior but nevertheless it should
not crash under high memory load. ]
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Commits 367456c756 ("sched: Ditch per cgroup task lists for
load-balancing") and 5d6523ebd ("sched: Fix load-balance wreckage")
left some more wreckage.
By setting loop_max unconditionally to ->nr_running load-balancing
could take a lot of time on very long runqueues (hackbench!). So keep
the sysctl as max limit of the amount of tasks we'll iterate.
Furthermore, the min load filter for migration completely fails with
cgroups since inequality in per-cpu state can easily lead to such
small loads :/
Furthermore the change to add new tasks to the tail of the queue
instead of the head seems to have some effect.. not quite sure I
understand why.
Combined these fixes solve the huge hackbench regression reported by
Tim when hackbench is ran in a cgroup.
Reported-by: Tim Chen <tim.c.chen@linux.intel.com>
Acked-by: Tim Chen <tim.c.chen@linux.intel.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Link: http://lkml.kernel.org/r/1335365763.28150.267.camel@twins
[ got rid of the CONFIG_PREEMPT tuning and made small readability edits ]
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Hibernation regression fix, since 3.2.
Calculate the number of required free pages based on non-high memory
pages only, because that is where the buffers will come from.
Commit 081a9d043c introduced a new buffer
page allocation logic during hibernation, in order to improve the
performance. The amount of pages allocated was calculated based on total
amount of pages available, although only non-high memory pages are
usable for this purpose. This caused hibernation code to attempt to over
allocate pages on platforms that have high memory, which led to hangs.
Signed-off-by: Bojan Smojver <bojan@rexursive.com>
Signed-off-by: Rafael J. Wysocki <rjw@suse.de>
The file kernel/irq/debug.h temporarily defines P, PS, PD
and then undefines them. However these names aren't really
"internal" enough, and collide with other more legit users
such as the ones in the xtensa arch, causing:
In file included from kernel/irq/internals.h:58:0,
from kernel/irq/irqdesc.c:18:
kernel/irq/debug.h:8:0: warning: "PS" redefined [enabled by default]
arch/xtensa/include/asm/regs.h:59:0: note: this is the location of the previous definition
Add a handful of underscores to do a better job of hiding these
temporary macros.
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
While debugging a latency with someone on IRC (mirage335) on #linux-rt (OFTC),
we discovered that the stacktrace output of the latency tracers
(preemptirqsoff) was empty.
This bug was caused by the creation of the dynamic length stack trace
again (like commit 12b5da3 "tracing: Fix ent_size in trace output" was).
This bug is caused by the latency tracers requiring the next event
to determine the time between the current event and the next. But by
grabbing the next event, the iter->ent_size is set to the next event
instead of the current one. As the stacktrace event is the last event,
this makes the ent_size zero and causes nothing to be printed for
the stack trace. The dynamic stacktrace uses the ent_size to determine
how much of the stack can be printed. The ent_size of zero means
no stack.
The simple fix is to save the iter->ent_size before finding the next event.
Note, mirage335 asked to remain anonymous from LKML and git, so I will
not add the Reported-by and Tested-by tags, even though he did report
the issue and tested the fix.
Cc: stable@vger.kernel.org # 3.1+
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>