Some compilers (especially older gcc releases) may skip inlining
sometimes which will lead to link failures. Force the inlining of
keyfunctions in slub_def.h to avoid these issues.
Signed-off-by: Christoph Lameter <clameter@sgi.com>
Acked-by: Jan Dittmer <jdi@l4x.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/sparc-2.6:
[SPARC64]: Fix several bugs in MSI handling.
[SPARC64]: Fix type and constant sizes wrt. sun4u IMAP/ICLR handling.
* 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6:
[PKTGEN]: Remove write-only variable.
[NETFILTER]: xt_tcpudp: fix wrong struct in udp_checkentry
[NET_SCHED] sch_prio.c: remove duplicate call of tc_classify()
[BRIDGE]: Fix OOPS when bridging device without ethtool.
[BRIDGE]: Packets leaking out of disabled/blocked ports.
[TCP]: Allow minimum RTO to be configurable via routing metrics.
SCTP: Fix to handle invalid parameter length correctly
SCTP: Abort on COOKIE-ECHO if backlog is exceeded.
SCTP: Correctly disable listening when backlog is 0.
SCTP: Do not retransmit chunks that are newer then rtt.
SCTP: Uncomfirmed transports can't become Inactive
SCTP: Pick the correct port when binding to 0.
SCTP: Use net_ratelimit to suppress error messages print too fast
SCTP: Fix to encode PROTOCOL VIOLATION error cause correctly
SCTP: Fix sctp_addto_chunk() to add pad with correct length
SCTP: Assign stream sequence numbers to the entire message
SCTP: properly clean up fragment and ordering queues during FWD-TSN.
[PKTGEN]: Fix multiqueue oops.
[BNX2]: Add write posting comment.
[BNX2]: Use msleep().
1) sun4{u,v}_build_msi() have improper return value handling.
We should always return negative error codes, instead of
using the magic value "0" which could in fact be a valid
MSI number.
2) sun4{u,v}_build_msi() should return -ENOMEM instead of
calling prom_prom() halt with kzalloc() of the interrupt
data fails.
3) We 'remembered' the MSI number using a singleton in the
struct device archdata area, this doesn't work for MSI-X
which can cause multiple MSIs assosciated with one device.
Delete that archdata member, and instead store the MSI
number in the IRQ chip data area.
Signed-off-by: David S. Miller <davem@davemloft.net>
Sometimes we were using 32-bit values and the top bits were
getting inadvertantly chopped off. This will matter for the
forthcoming Fire controller MSI support.
Signed-off-by: David S. Miller <davem@davemloft.net>
Cell phone networks do link layer retransmissions and other
things that cause unnecessary timeout retransmits. So allow
the minimum RTO to be inflated per-route to deal with this.
Signed-off-by: David S. Miller <davem@davemloft.net>
PROTOCOL VIOLATION error cause in ABORT is bad encode when make abort
chunk. When SCTP encode ABORT chunk with PROTOCOL VIOLATION error cause,
it just add the error messages to PROTOCOL VIOLATION error cause, the
rest four bytes(struct sctp_paramhdr) is just add to the chunk, not
change the length of error cause. This cause the ABORT chunk to be a bad
format. The chunk is like this:
ABORT chunk
Chunk type: ABORT (6)
Chunk flags: 0x00
Chunk length: 72 (*1)
Protocol violation cause
Cause code: Protocol violation (0x000d)
Cause length: 62 (*2)
Cause information: 5468652063756D756C61746976652074736E2061636B2062...
Cause padding: 0000
[Needless] 00030010
Chunk Length(*1) = 72 but Cause length(*2) only 62, not include the
extend 4 bytes.
((72 - sizeof(chunk_hdr)) = 68) != (62 +3) / 4 * 4
Signed-off-by: Wei Yongjun <yjwei@cn.fujitsu.com>
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
When we recieve a FWD-TSN (meaning the peer has abandoned the data),
we need to clean up any partially received messages that may be
hanging out on the re-assembly or re-ordering queues. This is
a MUST requirement that was not properly done before.
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com.>
de-HZ-ification of the granularity defaults unearthed a pre-existing
property of CFS: while it correctly converges to the granularity goal,
it does not prevent run-time fluctuations in the range of
[-gran ... 0 ... +gran].
With the increase of the granularity due to the removal of HZ
dependencies, this becomes visible in chew-max output (with 5 tasks
running):
out: 28 . 27. 32 | flu: 0 . 0 | ran: 9 . 13 | per: 37 . 40
out: 27 . 27. 32 | flu: 0 . 0 | ran: 17 . 13 | per: 44 . 40
out: 27 . 27. 32 | flu: 0 . 0 | ran: 9 . 13 | per: 36 . 40
out: 29 . 27. 32 | flu: 2 . 0 | ran: 17 . 13 | per: 46 . 40
out: 28 . 27. 32 | flu: 0 . 0 | ran: 9 . 13 | per: 37 . 40
out: 29 . 27. 32 | flu: 0 . 0 | ran: 18 . 13 | per: 47 . 40
out: 28 . 27. 32 | flu: 0 . 0 | ran: 9 . 13 | per: 37 . 40
average slice is the ideal 13 msecs and the period is picture-perfect 40
msecs. But the 'ran' field fluctuates around 13.33 msecs and there's no
mechanism in CFS to keep that from happening: it's a perfectly valid
solution that CFS finds.
to fix this we add a granularity/preemption rule that knows about
the "target latency", which makes tasks that run longer than the ideal
latency run a bit less. The simplest approach is to simply decrease the
preemption granularity when a task overruns its ideal latency. For this
we have to track how much the task executed since its last preemption.
( this adds a new field to task_struct, but we can eliminate that
overhead in 2.6.24 by putting all the scheduler timestamps into an
anonymous union. )
with this change in place, chew-max output is fluctuation-less all
around:
out: 28 . 27. 39 | flu: 0 . 2 | ran: 13 . 13 | per: 41 . 40
out: 28 . 27. 39 | flu: 0 . 2 | ran: 13 . 13 | per: 41 . 40
out: 28 . 27. 39 | flu: 0 . 2 | ran: 13 . 13 | per: 41 . 40
out: 28 . 27. 39 | flu: 0 . 2 | ran: 13 . 13 | per: 41 . 40
out: 28 . 27. 39 | flu: 0 . 1 | ran: 13 . 13 | per: 41 . 40
out: 28 . 27. 39 | flu: 0 . 1 | ran: 13 . 13 | per: 41 . 40
this patch has no impact on any fastpath or on any globally observable
scheduling property. (unless you have sharp enough eyes to see
millisecond-level ruckles in glxgears smoothness :-)
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Mike Galbraith <efault@gmx.de>
* 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/sparc-2.6:
[SPARC32]: Make flush_tlb_kernel_range() an inline function.
[SERIAL]: Fix 32-bit warnings in sunzilog.c and sunsu.c
[SPARC32]: Kill unused vars and macros from prom/console.c
[SPARC32]: Add __cmpdi2() libcall implementation ala. MIPS.
[VIDEO]: Do not prom_halt() in cg3 and bw2 device probe.
[SUNVDC]: Use slice 0xff on VD_DISK_TYPE_DISK.
* 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6:
[NET]: Mark Paul Moore as maintainer of labelled networking.
[VLAN/BRIDGE]: Fix "skb_pull_rcsum - Fatal exception in interrupt"
[ISDN]: Get rid of some pointless allocation casts in common and bsd comp.
[NET]: Avoid pointless allocation casts in BSD compression module
[IRDA]: Do not do pointless kmalloc return value cast in KingSun driver
[NET]: Fix crash in dev_mc_sync()/dev_mc_unsync()
[PPPOL2TP]: Fix endianness annotations.
[IOAT]: ioatdma needs to to play nice in a multi-dma-client world
[SLIP]: trivial sparse warning fix
[EQL]: sparse warning fix
[NET]: is_power_of_2 in net/core/neighbour.c
[TCP]: Describe tcp_init_cwnd() thoroughly in a comment.
[NET]: Fix IP_ADD/DROP_MEMBERSHIP to handle only connectionless
[KBUILD]: Sanitize tc_ematch headers.
[IPSEC] AH4: Update IPv4 options handling to conform to RFC 4302.
Adding the defines/constants activates the existing code in the tty layer
and allows arbitary tty speeds to be requested on this platform
Signed-off-by: Alan Cox <alan@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Chris Zankel <chris@zankel.net>
Add support for processors that have cache-aliasing issues, such as
the Stretch S5000 processor. Cache-aliasing means that the size of
the cache (for one way) is larger than the page size, thus, a page
can end up in several places in cache depending on the virtual to
physical translation. The method used here is to map a user page
temporarily through the auto-refill way 0 and of of the DTLB.
We probably will want to revisit this issue and use a better
approach with kmap/kunmap.
Signed-off-by: Chris Zankel <chris@zankel.net>
Newer processor versions starting with Xtensa6/LX2 support an 'executable'
bit for memory pages. This bit replaces the 'valid' bit, so it must be
always set to one for older processor versions. To mark a page invalid, we now
set the cache-attributes to b11, which is backward compatible.
Signed-off-by: Chris Zankel <chris@zankel.net>
Use the generic version of get_order for processor configurations that
don't have the 'nsa/nsau' instructions.
Signed-off-by: Chris Zankel <chris@zankel.net>
Xtensa passes long long arguments in a even/odd register pair,
so we also need to shuffle the arguments when passed through the
system call to avoid an empty argument register.
Signed-off-by: Chris Zankel <chris@zankel.net>
Although __ARCH_WANT_SYS_GETPGRP was defined, the actualy entry for
the getpgrp system-call was missing.
Signed-off-by: Chris Zankel <chris@zankel.net>
Add missing system calls that have been recently added to the kernel
for the Xtensa architecture and define __IGNORE macros for system calls
that we don't need for Xtensa.
Signed-off-by: Chris Zankel <chris@zankel.net>
* 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc:
[POWERPC] Fix SLB initialization at boot time
[POWERPC] Fix undefined reference to device_power_up/resume
[POWERPC] cell: Update cell_defconfig for 2.6.23
[POWERPC] axonram: Do not delete gendisks queue in error path
[POWERPC] axonram: Module modification for latest firmware API changes
[POWERPC] cell: Support pinhole-reset on IBM cell blades
[POWERPC] spu_manage: Use newer physical-id attribute
[POWERPC] pasemi: Another IOMMU bugfix for 64K PAGE_SIZE
* 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/kyle/parisc-2.6:
[PARISC] Add NOTES section
[PARISC] Use compat_sys_getdents
[PARISC] Do not allow STI_CONSOLE to be modular
[PARISC] Clean up sti_flush
[PARISC] Add dummy isa_(bus|virt)_to_(virt|bus) inlines
[PARISC] Add empty <asm-parisc/vga.h>
Less painful than fixing up the Kconfig for a pile of drivers to only build
on X86 && ARM && MIPS...
Just make them BUG(), as defining them to be 1:1 with physical memory will
likely HPMC the box anyways.
Signed-off-by: Kyle McMartin <kyle@parisc-linux.org>
This avoids unused variable warnings in places like mm/vmalloc.c:
mm/vmalloc.c: In function ‘unmap_kernel_range’:
mm/vmalloc.c:75: warning: unused variable ‘start’
caused by it previously being a macro.
Signed-off-by: David S. Miller <davem@davemloft.net>
{s,d}_{session,tunnel} in pppol2tp_addr are actually host-endian
everywhere. We might switch them to net-endian, of course, but
that structure is exposed to userland via getname...
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
The headers in tc_ematch are used by iproute2, so these headers should
be processed.
Signed-off-by: Stephen Hemminger <shemminger@linux-foundation.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
CC arch/mips/kernel/traps.o
arch/mips/kernel/traps.c: In function 'show_backtrace':
arch/mips/kernel/traps.c:110: warning: unused variable 'ra'
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Add generic irq_chip for TX39/TX49 SoCs. This can be replace
jmr3927_irq_irc, tx4927_irq_pic_type and tx4938_irq_pic_type.
Signed-off-by: Atsushi Nemoto <anemo@mba.ocn.ne.jp>
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
For the generation of asm-offset.h to work these need to be evaulatable
by gcc as a constant expression. This issue did exist for a while but
didn't bite because they're only in asm-offset.h for debugging purposes.
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
due to adaptive granularity scheduling the role of sched_granularity
has changed to "minimum granularity", so rename the variable (and the
tunable) accordingly.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Instead of specifying the preemption granularity, specify the wanted
latency. By fixing the granlarity to a constany the wakeup latency
it a function of the number of running tasks on the rq.
Invert this relation.
sysctl_sched_granularity becomes a minimum for the dynamic granularity
computed from the new sysctl_sched_latency.
Then use this latency to do more intelligent granularity decisions: if
there are fewer tasks running then we can schedule coarser. This helps
performance while still always keeping the latency target.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
* 'agp-patches' of git://git.kernel.org/pub/scm/linux/kernel/git/airlied/agp-2.6:
agp: balance ioremap checks
agp: Add device id for P4M900 to via-agp module
efficeon-agp leaks 'struct agp_bridge_data' in error paths of agp_efficeon_probe()
Current Linus tree fails to link on pmac32:
drivers/built-in.o: In function `pmac_wakeup_devices':
via-pmu.c:(.text+0x5bab4): undefined reference to `device_power_up'
via-pmu.c:(.text+0x5bb08): undefined reference to `device_resume'
drivers/built-in.o: In function `pmac_suspend_devices':
via-pmu.c:(.text+0x5c260): undefined reference to `device_power_down'
via-pmu.c:(.text+0x5c27c): undefined reference to `device_resume'
make[1]: *** [.tmp_vmlinux1] Error 1
changing CONFIG_PM > CONFIG_PM_SLEEP leads to:
drivers/built-in.o: In function `pmu_led_set':
via-pmu-led.c:(.text+0x5cdca): undefined reference to `pmu_sys_suspended'
via-pmu-led.c:(.text+0x5cdce): undefined reference to `pmu_sys_suspended'
drivers/built-in.o: In function `pmu_req_done':
via-pmu-led.c:(.text+0x5ce3e): undefined reference to `pmu_sys_suspended'
via-pmu-led.c:(.text+0x5ce42): undefined reference to `pmu_sys_suspended'
drivers/built-in.o: In function `adb_init':
(.init.text+0x4c5c): undefined reference to `pmu_register_sleep_notifier'
make[1]: *** [.tmp_vmlinux1] Error 1
So change even more places from PM to PM_SLEEP to allow linking.
Signed-off-by: Olaf Hering <olaf@aepfle.de>
Signed-off-by: Paul Mackerras <paulus@samba.org>
drivers/acpi/ec.c: In function `acpi_ec_ecdt_probe':
drivers/acpi/ec.c:873: warning: passing arg 1 of `acpi_get_devices' discards qualifiers from pointer target type
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Len Brown <len.brown@intel.com>
Renumber AUDIT_TTY_[GS]ET to avoid a conflict with netlink message types
already used in the wild.
Signed-off-by: Miloslav Trmac <mitr@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* master.kernel.org:/pub/scm/linux/kernel/git/gregkh/pci-2.6:
PCI: Run k8t_sound_hostbridge quirk only when needed
PCI: disable MSI on RX790
PCI: disable MSI on RD580
PCI: disable MSI on RS690
PCI: make pcie_get_readrq visible in pci.h
PCI: lets kill the 'PCI hidden behind bridge' message
pci/hotplug/cpqphp_ctrl.c: remove stale BKL use
PCI: Document pci_iomap()
PCI: quirk_e100_interrupt() called too early
PCI: Move prototypes for pci_bus_find_capability to include/linux/pci.h
Include asm-generic/pgtable.h to pick up the lazy_mmu_mode and
lazy_cpu_mode macros. Won't build without them.
Signed-off-by: Greg Ungerer <gerg@uclinux.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Schedule /proc/acpi/event for removal in 6 months.
Re-name acpi_bus_generate_event() to acpi_bus_generate_proc_event()
to make sure there is no confusion that it is for /proc/acpi/event only.
Add CONFIG_ACPI_PROC_EVENT to allow removal of /proc/acpi/event.
There is no functional change if CONFIG_ACPI_PROC_EVENT=y
Signed-off-by: Len Brown <len.brown@intel.com>
The previous events patch added a netlink event for every
user of the legacy /proc/acpi/event interface.
However, some users of /proc/acpi/event are really input events,
and they already report their events via the input layer.
Introduce a new interface, acpi_bus_generate_netlink_event(),
which is explicitly called by devices that want to repoprt
events via netlink. This allows the input-like events
to opt-out of generating netlink events. In summary:
events that are sent via netlink:
ac/battery/sbs
thermal
processor
thinkpad_acpi dock/bay
events that are sent via input layer:
button
video hotkey
thinkpad_acpi hotkey
asus_acpi/asus-laptop hotkey
sonypi/sonylaptop
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
On a four package system with HT - HT load balancing optimizations were
broken. For example, if two tasks end up running on two logical threads
of one of the packages, scheduler is not able to pull one of the tasks
to a completely idle package.
In this scenario, for nice-0 tasks, imbalance calculated by scheduler
will be 512 and find_busiest_queue() will return 0 (as each cpu's load
is 1024 > imbalance and has only one task running).
Similarly MC scheduler optimizations also get fixed with this patch.
[ mingo@elte.hu: restored fair balancing by increasing the fuzz and
adding it back to the power decision, without the /2
factor. ]
Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
construct a more or less wall-clock time out of sched_clock(), by
using ACPI-idle's existing knowledge about how much time we spent
idling. This allows the rq clock to work around TSC-stops-in-C2,
TSC-gets-corrupted-in-C3 type of problems.
( Besides the scheduler's statistics this also benefits blktrace and
printk-timestamps as well. )
Furthermore, the precise before-C2/C3-sleep and after-C2/C3-wakeup
callbacks allow the scheduler to get out the most of the period where
the CPU has a reliable TSC. This results in slightly more precise
task statistics.
the ACPI bits were acked by Len.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Acked-by: Len Brown <len.brown@intel.com>
This patch reduces 36-bit offset to 32-bit offsets. The 36-bit
offsets makes virtual addresses wraps when added to 32-bit base.
Signed-off-by: Krzysztof Helt <krzysztof.h1@wp.pl>
Acked-by: Ben Dooks <ben-linux@fluff.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
For ARM at91, the FIQ_START #define is required if you use a driver
that enables FIQ support.
Signed-off-by: Karl Olsen <karl@micro-technic.com>
Acked-by: Andrew Victor <andrew at sanpeople.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
This fixes a regression from around 2.6.18, consistent_sync() will now BUG()
under these circumstances. The use of consistent_sync() was a hack, replacing
it's usage here with a new function, flush_ioremap_region().
Signed-off-by: Jared Hulbert <jaredeh@gmail.com>
Acked-by: Pavel Pisa <pisa@cmp.felk.cvut.cz>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
To build arch/powerpc without including asm-ppc/ we need these files
in asm-powerpc/
Moved some headers under arch/powerpc/platforms if they were only used by
platform or driver files and fixed up the source file includes to match
the new locations
Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
The NUMA layer only supports NUMA policies for the highest zone. When
ZONE_MOVABLE is configured with kernelcore=, the the highest zone becomes
ZONE_MOVABLE. The result is that policies are only applied to allocations
like anonymous pages and page cache allocated from ZONE_MOVABLE when the
zone is used.
This patch applies policies to the two highest zones when the highest zone
is ZONE_MOVABLE. As ZONE_MOVABLE consists of pages from the highest "real"
zone, it's always functionally equivalent.
The patch has been tested on a variety of machines both NUMA and non-NUMA
covering x86, x86_64 and ppc64. No abnormal results were seen in
kernbench, tbench, dbench or hackbench. It passes regression tests from
the numactl package with and without kernelcore= once numactl tests are
patched to wait for vmstat counters to update.
akpm: this is the nasty hack to fix NUMA mempolicies in the presence of
ZONE_MOVABLE and kernelcore= in 2.6.23. Christoph says "For .24 either merge
the mobility or get the other solution that Mel is working on. That solution
would only use a single zonelist per node and filter on the fly. That may
help performance and also help to make memory policies work better."
Signed-off-by: Mel Gorman <mel@csn.ul.ie>
Acked-by: Lee Schermerhorn <lee.schermerhorn@hp.com>
Tested-by: Lee Schermerhorn <lee.schermerhorn@hp.com>
Acked-by: Christoph Lameter <clameter@sgi.com>
Cc: Andi Kleen <ak@suse.de>
Cc: Paul Mundt <lethal@linux-sh.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
In file included from drivers/video/console/newport_con.c:16:
include/linux/selection.h:16: warning: "struct tty_struct" declared inside parameter list
include/linux/selection.h:16: warning: its scope is only this definition or declaration, which is probably not what you want
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
m68k/mac: Make mac_hid_mouse_emulate_buttons() declaration visible
drivers/char/keyboard.c: In function 'kbd_keycode':
drivers/char/keyboard.c:1142: error: implicit declaration of function 'mac_hid_mouse_emulate_buttons'
The forward declaration of mac_hid_mouse_emulate_buttons() is not visible on
m68k because it's hidden in the middle of a big #ifdef block.
Move it to <linux/kbd_kern.h>, correct the type of the second parameter, and
include <linux/kbd_kern.h> where needed.
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Add the needed constants and defines to activate the existing code.
Signed-off-by: Alan Cox <alan@redhat.com>
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The new exec code inserts an accounted vma into an mm struct which is not
current->mm. The existing memory check code has a hard coded assumption
that this does not happen as does the security code.
As the correct mm is known we pass the mm to the security method and the
helper function. A new security test is added for the case where we need
to pass the mm and the existing one is modified to pass current->mm to
avoid the need to change large amounts of code.
(Thanks to Tobias for fixing rejects and testing)
Signed-off-by: Alan Cox <alan@redhat.com>
Cc: WU Fengguang <wfg@mail.ustc.edu.cn>
Cc: James Morris <jmorris@redhat.com>
Cc: Tobias Diedrich <ranma+kernel@tdiedrich.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Reading the LSR clears the break, parity, frame error, and overrun bits in
the 8250 chip, but these are not being saved in all places that read the
LSR. Same goes for the MSR delta bits. Save the LSR bits off whenever the
lsr is read so they can be handled later in the receive routine. Save the
MSR bits to be handled in the modem status routine.
Also, clear the stored bits and clear the interrupt registers before
enabling interrupts, to avoid handling old values of the stored bits in the
interrupt routines.
[akpm@linux-foundation.org: clean up pre-existing code]
Signed-off-by: Corey Minyard <minyard@acm.org>
Cc: Russell King <rmk+lkml@arm.linux.org.uk>
Cc: Yinghai Lu <yinghai.lu@sun.com>
Cc: Bjorn Helgaas <bjorn.helgaas@hp.com>
Acked-by: Alan Cox <alan@lxorguk.ukuu.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
RS690 can't do MSI like its predecessors. Disable MSI on RS690.
Signed-off-by: Tejun Heo <htejun@gmail.com>
Cc: Henry Su <henry.su@amd.com>
Cc: stable <stable@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
[PATCH] PCI: make pcie_get_readrq visible in pci.h
pcie_get_readrq() is EXPORT_SYMBOL'ed, but its prototype is not
visible in pci.h, add it there.
This is needed by some network drivers.
Signed-off-by: Brice Goglin <brice@myri.com>
Acked-by: Peter Oruba <peter.oruba@amd.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
We need pci_bus_find_capability() in some arch/powerpc code so move
the prototype into a header accessible to it.
Also kill the duplicate prototype for pci_bus_alloc_resource().
Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
* 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/sparc-2.6:
[SPARC32]: Revert f642b26380.
[SPARC64]: Need to clobber global reg vars in switch_to().
After doing some tests this seems to be the best variant for s390 and
should be correct as well. With gcc 4.2.1 we get the following kernel
image sizes using the default configuration:
atomic_t type volatile, atomic_read/set defines 5311824 bytes
atomic_t type int, atomic_read/set defines 5270864 bytes
atomic_t type int, atomic_read/set inline asm 5279056 bytes
atomic_t type int, atomic_read/set inline barrier 5270864 bytes
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
There are several s390 diagnose calls, which must be executed below the
2GB memory boundary. In order to enforce this, those diagnoses must be
compiled into the kernel. Currently diag 14 can be called within the
vmur kernel module from addresses above 2GB. This leads to specification
exceptions. This patch moves diag10, diag14 and diag210 into the new
diag.c file.
Signed-off-by: Michael Holzheu <holzheu@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
It makes head_64.S a bit more readable and will allow us to move the
iSeries exceptions elsewhere.
This also removes the last line of the comment:
* The following macros define the code that appears as
* the prologue to each of the exception handlers. They
* are split into two parts to allow a single kernel binary
* to be used for pSeries and iSeries.
* LOL. One day... - paulus
Anything is possible. :-)
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Paul Mackerras <paulus@samba.org>
The current code assumes "foo-bar" must always be compatible with a node
compatible with "foo", which breaks device trees where this is not so.
The "case" part is also wrong according to Open Firmware, but it's more
likely to have drivers and/or device trees depending on it, and thus
needs to be handled more carefully.
Signed-off-by: Scott Wood <scottwood@freescale.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
We need to have xLparMap in head_64.S so that it is at a fixed address
(because the linker will not resolve (address & 0xffffffff) for us).
But the assembler miscalculates the KERNEL_VSID() expressions. So put
the confusing expressions into asm-offsets.c.
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Paul Mackerras <paulus@samba.org>
The initial user manuals for MPC8544/8533 had some issues with properly
documenting the device IDs for MPC8544/8533. These processors are almost
identical and both show up on the reference boards.
Fix up the quirks for PCIe support to handle MPC8533/E.
Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
* 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/v4l-dvb:
V4L/DVB (6028): Turn an unnecessary mdelay() into msleep().
V4L/DVB (6027): Get rid of an ill-behaved msleep in i2c write
V4L/DVB (6026): Avoid powering up the camera on resume
V4L/DVB (6016): get_dvb_firmware: update script for new location of tda10046 firmware
V4L/DVB (5991): dvb-pll: Set minimum and maximum frequency properly
V4L/DVB (5969): ivtv: report ivtv version in status log
V4L/DVB (5967): ivtv: fix VIDIOC_S_FBUF:new OSD values where never set
V4L/DVB (5968): videodev2.h: remove superfluous FBUF GLOBAL_INV_ALPHA support
In MPS mode, "nosmp" and "maxcpus=0" boot a UP kernel with IOAPIC disabled.
However, in ACPI mode, these parameters didn't completely disable
the IO APIC initialization code and boot failed.
init/main.c:
Disable the IO_APIC if "nosmp" or "maxcpus=0"
undefine disable_ioapic_setup() when it doesn't apply.
i386:
delete ioapic_setup(), it was a duplicate of parse_noapic()
delete undefinition of disable_ioapic_setup()
x86_64:
rename disable_ioapic_setup() to parse_noapic() to match i386
define disable_ioapic_setup() in header to match i386
http://bugzilla.kernel.org/show_bug.cgi?id=1641
Acked-by: Andi Kleen <ak@suse.de>
Signed-off-by: Len Brown <len.brown@intel.com>
* Move ide_in_drive_list() from ide-dma.c to ide-iops.c.
* Add ivb_list[] table for listening early UDMA66 devices which don't conform
to ATA4 standard wrt cable detection (bit14 is zero, only bit13 is valid)
and use only device side cable detection for them since host side cable
detection may be unreliable.
* Add model "QUANTUM FIREBALLlct10 05" with firwmare "A03.0900" to the list
(from Craig's bugreport).
v2:
* Improve kernel message basing on suggestion from Sergei.
v3:
* Don't print kernel message when no device side cable detection is done,
plus some minor fixes. (Noticed by Sergei)
Thanks to Craig for testing this patch.
Cc: Craig Block <chblock3@yahoo.com>
Acked-by: Sergei Shtylyov <sshtylyov@ru.mvista.com>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
* Add DMA blacklist checking (->ide_dma_on check probably can go now).
* Add ->atapi_dma flag checking and remove no longer needed
ns87415_ide_dma_check() from ns87415 host driver.
* Remove now needless __ide_dma_check() wrapper and symbol export.
* Check drive->autodma instead of hwif->autodma (there should be no changes in
behavior as all users of config_drive_for_dma() set both ->autodma flags).
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
There is no need for a global inverted alpha capability since all the
application has to do is to pass '255-alpha' as the global alpha value.
Signed-off-by: Hans Verkuil <hverkuil@xs4all.nl>
Signed-off-by: Mauro Carvalho Chehab <mchehab@infradead.org>
Allow generic_calibrate_decr to work for 40x platforms. Given that the hardware
behavior is identical, this also changes the set_dec function to reload the PIT
on 40x to match the behavior 44x currently has.
Signed-off-by: Josh Boyer <jwboyer@linux.vnet.ibm.com>
Add MMU definitions for 40x platforms. Also fixes two warnings in 40x_mmu.c.
Signed-off-by: Josh Boyer <jwboyer@linux.vnet.ibm.com>
Acked-by: David Gibson <david@gibson.dropbear.id.au>
Fixed wrong expression which enabled watchdogs even if nmi_watchdog kernel
parameter wasn't set. This regression got slightly introduced with commit
b7471c6da9.
Introduced NMI_DISABLED (-1) which allows to switch the value of NMI_DEFAULT
without breaking the APIC NMI watchdog code (again).
Fixes:
https://bugzilla.novell.com/show_bug.cgi?id=298084http://bugzilla.kernel.org/show_bug.cgi?id=7839
And likely some more nmi_watchdog=0 related issues.
Signed-off-by: Daniel Gollub <dgollub@suse.de>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Use cpu_relax() in the busy loops, as atomic_read() doesn't automatically
imply volatility for i386 and x86_64. x86_64 doesn't have this issue because
it open-codes the while loop in smpboot.c:smp_callin() itself that already
uses cpu_relax().
For i386, however, smpboot.c:smp_callin() calls wait_for_init_deassert()
which is buggy for mach-default and mach-es7000 cases.
[ I test-built a kernel -- smp_callin() itself got inlined in its only
callsite, smpboot.c:start_secondary() -- and the relevant piece of
code disassembles to the following:
0xc1019704 <start_secondary+12>: mov 0xc144c4c8,%eax
0xc1019709 <start_secondary+17>: test %eax,%eax
0xc101970b <start_secondary+19>: je 0xc1019709 <start_secondary+17>
init_deasserted (at 0xc144c4c8) gets fetched into %eax only once and
then we loop over the test of the stale value in the register only,
so these look like real bugs to me. With the fix below, this becomes:
0xc1019706 <start_secondary+14>: pause
0xc1019708 <start_secondary+16>: cmpl $0x0,0xc144c4c8
0xc101970f <start_secondary+23>: je 0xc1019706 <start_secondary+14>
which looks nice and healthy. ]
Thanks to Heiko Carstens for noticing this.
Signed-off-by: Satyam Sharma <satyam@infradead.org>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux-2.6:
Cross-compilation between e.g. i386 -> 64bit could break -> work around it
[IA64] Enable early console for Ski simulator
[IA64] forbid ptrace changes psr.ri to 3
[IA64] Failure to grow RBS
[IA64] Fix processor_get_freq
[IA64] SGI Altix : fix a force_interrupt bug on altix
[IA64] Update arch/ia64/configs/* s/SLAB/SLUB/
[IA64] get back PT_IA_64_UNWIND program header
[IA64] need NOTES in vmlinux.lds.S
[IA64] make unwinder stop at last frame of the bootloader
[IA64] Clean up CPE handler registration
[IA64] Include Kconfig.preempt
[IA64] SN2 needs platform specific irq_to_vector() function.
[IA64] Use atomic64_read to read an atomic64_t.
[IA64] disable irq's and check need_resched before safe_halt
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband:
IB/mlx4: Incorrect semicolon after if statement
mlx4_core: Wait 1 second after reset before accessing device
IPoIB: Fix leak in ipoib_transport_dev_init() error path
IB/mlx4: Fix opcode returned in RDMA read completion
IB/srp: Add OUI for new Cisco targets
IB/srp: Wrap OUI checking for workarounds in helper functions
RDMA/cxgb3: Always call low level send function via cxgb3_ofld_send()
IB: Move the macro IB_UMEM_MAX_PAGE_CHUNK() to umem.c
IB: Include <linux/list.h> and <linux/rwsem.h> from <rdma/ib_verbs.h>
IB: Include <linux/list.h> from <rdma/ib_mad.h>
IB/mad: Fix address handle leak in mad_rmpp
IB/mad: agent_send_response() should be void
IB/mad: Fix memory leak in switch handling in ib_mad_recv_done_handler()
IB/mad: Fix error path if response alloc fails in ib_mad_recv_done_handler()
IB/sa: Don't need to check for default P_Key twice
IB/core: Ignore membership bit in ib_find_pkey()
* 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/sparc-2.6:
[MATH-EMU]: Fix underflow exception reporting.
[SPARC64]: Create a HWCAP_SPARC_N2 and report it to userspace on Niagara-2.
[SPARC64]: SMP trampoline needs to avoid %tick_cmpr on sun4v too.
[SPARC64]: Do not touch %tick_cmpr on sun4v cpus.
[SPARC64]: Niagara-2 optimized copies.
[SPARC64]: Allow userspace to get at the machine description.
[SPARC32]: Remove superfluous 'kernel_end' alignment on sun4c.
[SPARC32]: Fix bogus ramdisk image location check.
[SPARC32]: Remove iommu from struct sbus_bus and use archdata like sparc64.
Adrian Bunk: scripts/mod/file2alias.c is compiled with HOSTCC and ensures that
kernel_ulong_t is correct, but it can't cope with different padding on
different architectures.
Signed-off-by: Thomas Renninger <trenn@suse.de>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Reserved MCSR bits on FSL BookE parts may have spurious values
when mcheck occurs. Mask these off when printing the MCSR to
avoid confusion. Also, get rid of the MCSR_GL_CI bit defined
for e500 - this bit doesn't actually have any meaning.
Signed-off-by: Becky Bruce <becky.bruce@freescale.com>
Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
The underflow exception cases were wrong.
This is one weird area of ieee1754 handling in that the underflow
behavior changes based upon whether underflow is enabled in the trap
enable mask of the FPU control register. As a specific case the Sparc
V9 manual gives us the following description:
--------------------
If UFM = 0: Underflow occurs if a nonzero result is tiny and a
loss of accuracy occurs. Tininess may be detected
before or after rounding. Loss of accuracy may be
either a denormalization loss or an inexact result.
If UFM = 1: Underflow occurs if a nonzero result is tiny.
Tininess may be detected before or after rounding.
--------------------
What this amounts to in the packing case is if we go subnormal,
we set underflow if any of the following are true:
1) rounding sets inexact
2) we ended up rounding back up to normal (this is the case where
we set the exponent to 1 and set the fraction to zero), this
should set inexact too
3) underflow is set in FPU control register trap-enable mask
The initially discovered example was "DBL_MIN / 16.0" which
incorrectly generated an underflow. It should not, unless underflow
is set in the trap-enable mask of the FPU csr.
Another example, "0x0.0000000000001p-1022 / 16.0", should signal both
inexact and underflow. The cpu implementations and ieee1754
literature is very clear about this. This is case #2 above.
However, if underflow is set in the trap enable mask, only underflow
should be set and reported as a trap. That is handled properly by the
prioritization logic in
arch/sparc{,64}/math-emu/math.c:record_exception().
Based upon a report and test case from Jakub Jelinek.
Signed-off-by: David S. Miller <davem@davemloft.net>
This removes some of the #ifdefs from .c files.
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Paul Mackerras <paulus@samba.org>
They were only needed for backwards compatibility and all in tree uses
have now been changed.
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Paul Mackerras <paulus@samba.org>
This file was protected by _PPC64_LMB_H, which is confusing, as the
32-bit code also uses the lmb these days. Changed to
_ASM_POWERPC_LMB_H.
Signed-off-by: Becky Bruce <becky.bruce@freescale.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Instead, use asm() like all other atomic operations already do.
Also use inline functions instead of macros; this actually
improves code generation (some code becomes a little smaller,
probably because of improved alias information -- just a few
hundred bytes total on a default kernel build, nothing shocking).
Signed-off-by: Segher Boessenkool <segher@kernel.crashing.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Eliminate the use of error_log_cnt as a global var shared across
different directories. Pass it as a parameter instead.
Signed-off-by: Linas Vepstas <linas@austin.ibm.com>
----
Respin of earlier patch, with the CONFIG_PSERIES junk removed from the
header file.
arch/powerpc/kernel/nvram_64.c | 10 +++++-----
arch/powerpc/platforms/pseries/rtasd.c | 7 ++++---
include/asm-powerpc/nvram.h | 6 ++++--
3 files changed, 13 insertions(+), 10 deletions(-)
Signed-off-by: Paul Mackerras <paulus@samba.org>
...so that GCC doesn't complain about unused variables in the
callers of these.
Signed-off-by: Segher Boessenkool <segher@kernel.crashing.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
The iscsi eh could be tearing down the session/connection while
the scsi eh is still sending task management functions. If when
we drop the session lock to grab the recv lock, the iscsi eh
tears down the connection we will oops.
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Add empty definition of mmiowb() since some drivers need it. Uncached
writes are strongly ordered on AVR32. They may be delayed if the
dcache is busy doing a writeback, but AFAICT that's not what this
macro is supposed to deal with, at least on UP systems.
We might have to revisit this definition when a SMP-capable AVR32 CPU
comes along, depending on how the busses and cache coherency stuff
end up being implemented.
Signed-off-by: Haavard Skinnemoen <hskinnemoen@atmel.com>
The current definition of pte_page() masks out valid bits from the
physical address, causing vmalloc_to_page() to misbehave. This may
lead to everything from mmap() silently accessing the wrong data to
"invalid pte" errors dumped by the kernel.
Also remove the now-unused definition of PTE_PHYS_MASK.
Thanks to Matteo Vit for discovering this bug.
Signed-off-by: Haavard Skinnemoen <hskinnemoen@atmel.com>
There's really no need to retry an allocation with __GFP_REPEAT set.
Also, use get_zeroed_page() and __GFP_ZERO to eliminate the extra call
to clear_page() afterwards.
Signed-off-by: Haavard Skinnemoen <hskinnemoen@atmel.com>
http://bugzilla.kernel.org/show_bug.cgi?id=8797 shows that the
bonding driver may produce bogus combinations of the checksum
flags and SG/TSO.
For example, if you bond devices with NETIF_F_HW_CSUM and
NETIF_F_IP_CSUM you'll end up with a bonding device that
has neither flag set. If both have TSO then this produces
an illegal combination.
The bridge device on the other hand has the correct code to
deal with this.
In fact, the same code can be used for both. So this patch
moves that logic into net/core/dev.c and uses it for both
bonding and bridging.
In the process I've made small adjustments such as only
setting GSO_ROBUST if at least one constituent device
supports it.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add base support for implementing platform_irq_to_vector(), and
then use it on SN2.
Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Acked-by: John Keller <jpk@sgi.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
The routines ia64_atomic64_{add,sub} mistakenly use
atomic_read() to grab the old value instead of using
atomic64_read().
Signed-off-by: Andreas Schwab <schwab@suse.de>
Signed-off-by: Tony Luck <tony.luck@intel.com>
This patch makes the following needlessly global code static:
- arch_reinit_sched_domains()
- struct attr_sched_mc_power_savings
- struct attr_sched_smt_power_savings
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
* 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc:
[POWERPC] Fix size check for hugetlbfs
[POWERPC] Fix initialization and usage of dma_mask
[POWERPC] Fix more section mismatches in head_64.S
[POWERPC] Revert "[POWERPC] Add 'mdio' to bus scan id list for platforms with QE UEC"
[POWERPC] PS3: Update ps3_defconfig
[POWERPC] PS3: Remove text saying PS3 support is incomplete
[POWERPC] PS3: Fix storage probe logic
[POWERPC] cell: Move SPU affinity init to spu_management_of_ops
[POWERPC] Fix potential duplicate entry in SLB shadow buffer
* 'for-linus' of git://git.kernel.dk/linux-2.6-block:
BLOCK: Hide the contents of linux/bio.h if CONFIG_BLOCK=n
sysace: HDIO_GETGEO has it's own method for ages
drivers/block/cpqarray.c: better error handling and kmalloc + memset conversion to k[cz]alloc
drivers/block/cciss.c: kmalloc + memset conversion to kzalloc
Clean up duplicate includes in drivers/block/
Fix remap handling by blktrace
[PATCH] remove mm/filemap.c:file_send_actor()
The Averatec 2370 and some other Turion laptop BIOS seems to program the
ENABLE_C1E MSR inconsistently between cores. This confuses the lapic
use heuristics because when C1E is enabled anywhere it seems to affect
the complete chip.
Use a global flag instead of a per cpu flag to handle this.
If any CPU has C1E enabled disabled lapic use.
Thanks to Cal Peake for debugging.
Cc: tglx@linutronix.de
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Commit 19d36ccdc3 "x86: Fix alternatives
and kprobes to remap write-protected kernel text" uses code which is
being patched for patching.
In particular, paravirt_ops does patching in two stages: first it
calls paravirt_ops.patch, then it fills any remaining instructions
with nop_out(). nop_out calls text_poke() which calls
lookup_address() which calls pgd_val() (aka paravirt_ops.pgd_val):
that call site is one of the places we patch.
If we always do patching as one single call to text_poke(), we only
need make sure we're not patching the memcpy in text_poke itself.
This means the prototype to paravirt_ops.patch needs to change, to
marshal the new code into a buffer rather than patching in place as it
does now. It also means all patching goes through text_poke(), which
is known to be safe (apply_alternatives is also changed to make a
single patch).
AK: fix compilation on x86-64 (bad rusty!)
AK: fix boot on x86-64 (sigh)
AK: merged with other patches
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
gcc currently doesn't support attributes on types, so we can't use it
function pointers. This avoids some warnings on a gcc 4.3 build.
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
There are some parts of include/asm-generic/pgtable.h that are relevant to
the non-mmu architectures. To make it easier to include this from them I
would like to ifdef the relevant parts.
Without this there is a handful of functions that are referenced in here
that are not defined on many non-mmu architectures. They could be defined
out of course, as an alternative approach.
Cc: David Howells <dhowells@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This patch finishes the i386 and x86-64 ->sysdata conversion and hopefully
also fixes Riku's and Andy's observed bugs. It is based on Yinghai Lu's
and Andy Whitcroft's patches (thanks!) with some changes:
- introduce pci_scan_bus_with_sysdata() and use it instead of
pci_scan_bus() where appropriate. pci_scan_bus_with_sysdata() will
allocate the sysdata structure and then call pci_scan_bus().
- always allocate pci_sysdata dynamically. The whole point of this
sysdata work is to make it easy to do root-bus specific things
(e.g., support PCI domains and IOMMU's). I dislike using a default
struct pci_sysdata in some places and a dynamically allocated
pci_sysdata elsewhere - the potential for someone indavertantly
changing the default structure is too high.
- this patch only makes the minimal changes necessary, i.e., the NUMA node is
always initialized to -1. Patches to do the right thing with regards
to the NUMA node can build on top of this (either add a 'node'
parameter to pci_scan_bus_with_sysdata() or just update the node
when it becomes known).
The patch was compile tested with various configurations (e.g., NUMAQ,
VISWS) and run-time tested on i386 and x86-64. Unfortunately none of my
machines exhibited the bugs so caveat emptor.
Andy, could you please see if this fixes the NUMA issues you've seen?
Riku, does this fix "pci=noacpi" on your laptop?
Signed-off-by: Muli Ben-Yehuda <muli@il.ibm.com>
Cc: Yinghai Lu <yhlu.kernel@gmail.com>
Cc: Andi Kleen <ak@suse.de>
Cc: Chuck Ebbert <cebbert@redhat.com>
Cc: <riku.seppala@kymp.net>
Cc: Andy Whitcroft <apw@shadowen.org>
Cc: Jeff Garzik <jeff@garzik.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
I find a function(clockevents_unregister_notifier) which is not called by
anything in tree.
Signed-off-by: Miao Xie <miaox@cn.fujitsu.com>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
synchronize_idle() sounds like an interesting function, but we don't
actually have it, so don't prototype it. Introduced in commit
9b06e81898, in 2005.
Signed-off-by: Josh Triplett <josh@kernel.org>
Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Patch c5c34d4862 (tty: flush flip buffer on
ldisc input queue flush) introduces a race condition which can lead to memory
leaks.
The problem can be triggered when tcflush() is called when data are being
pushed to the line discipline driver by flush_to_ldisc().
flush_to_ldisc() releases tty->buf.lock when calling the line discipline
receive_buf function. At that poing tty_buffer_flush() kicks in and sets both
tty->buf.head and tty->buf.tail to NULL. When flush_to_ldisc() finishes, it
restores tty->buf.head but doesn't touch tty->buf.tail. This corrups the
buffer queue, and the next call to tty_buffer_request_room() will allocate a
new buffer and overwrite tty->buf.head. The previous buffer is then lost
forever without being released.
(Thanks to Laurent for the above text, for finding, disgnosing and reporting
the bug)
- Use tty->flags bits for the flush status.
- Wait for the flag to clear again before returning
- Fix the doc error noted
- Fix flush of empty queue leaving stale flushpending
[akpm@linux-foundation.org: cleanup]
Signed-off-by: Alan Cox <alan@redhat.com>
Acked-by: Paul Fulghum <paulkf@microgate.com>
Cc: Laurent Pinchart <laurentp@cse-semaphore.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
After /proc/sys rewrite it was left unused.
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Connect up the fallocate() system call.
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Hide the contents of linux/bio.h if CONFIG_BLOCK=n as there shouldn't be
compiled code that uses it.
Signed-off-by: David Howells <dhowells@redhat.com>
Cc: James Bottomley <James.Bottomley@steeleye.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
This patch provides more information concerning REMAP operations on block
IOs. The additional information provides clearer details at the user level,
and supports post-processing analysis in btt.
o Adds in partition remaps on the same device.
o Fixed up the remap information in DM to be in the right order
o Sent up mapped-from and mapped-to device information
Signed-off-by: Alan D. Brunelle <alan.brunelle@hp.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
powerpc has a couple of bugs in the usage of dma_masks that tend to
break when drivers explicitly try to set a 32-bit mask for example.
First, the code that generates the pci devices from the OF device-tree
doesn't initialize the mask properly, then our implementation of
set_dma_mask() was trying to validate the -previous- mask value, not the
one passed in as an argument.
This fixes these problems.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
This patch moves affinity initialization code from spu_base.c to a
new spu_management_of_ops function (init_affinity), which is empty
in the case of PS3. This fixes a linking problem that was happening
when compiling for PS3.
Also, some small code style changes were made.
Signed-off-by: Andre Detsch <adetsch@br.ibm.com>
Signed-off-by: Geoff Levand <geoffrey.levand@am.sony.com>
Acked-by: Arnd Bergmann <arnd.bergmann@de.ibm.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
The b44 build uses these, caught by allmodconfig:
drivers/net/b44.c: In function `b44_sync_dma_desc_for_cpu':
drivers/net/b44.c:159: error: implicit declaration of function `dma_sync_single_range_for_cpu'
Follow the sparc64 change and stub them in.
Reported-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
The default definition in asm-generic conflicts with Alpha's O_DIRECT,
so, like several other arches, it needs to be redefined.
Signed-off-by: Richard Hendersion <rth@twiddle.net>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* git://git.linux-nfs.org/pub/linux/nfs-2.6:
SUNRPC: Replace flush_workqueue() with cancel_work_sync() and friends
NFS: Replace flush_scheduled_work with cancel_work_sync() and friends
SUNRPC: Don't call gss_delete_sec_context() from an rcu context
NFSv4: Don't call put_rpccred() from an rcu callback
NFS: Fix NFSv4 open stateid regressions
NFSv4: Fix a locking regression in nfs4_set_mode_locked()
NFS: Fix put_nfs_open_context
SUNRPC: Fix a race in rpciod_down()
Trivial fix: mark the buffer to hexdump as const so callers could avoid
casting their const buffers when calling print_hex_dump().
The patch is really trivial and I suggest to consider it as a fix
(it fixes GCC warnings) and push it to current tree.
Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/sparc-2.6:
[SPARC64]: Fix memory leak when cpu hotplugging.
[SPARC64]: Do not assume sun4v chips have load-twin/store-init support.
[SPARC64]: Fix hard-coding of cpu type output in /proc/cpuinfo on sun4v.
[SPARC]: Centralize find_in_proplist() instead of duplicating N times.
remove the 'u64 now' parameter from ->task_new().
( identity transformation that causes no change in functionality. )
Signed-off-by: Ingo Molnar <mingo@elte.hu>
remove the 'u64 now' parameter from ->put_prev_task().
( identity transformation that causes no change in functionality. )
Signed-off-by: Ingo Molnar <mingo@elte.hu>
remove the 'u64 now' parameter from ->pick_next_task().
( identity transformation that causes no change in functionality. )
Signed-off-by: Ingo Molnar <mingo@elte.hu>
remove the 'u64 now' parameter from ->dequeue_task().
( identity transformation that causes no change in functionality. )
Signed-off-by: Ingo Molnar <mingo@elte.hu>
remove the 'u64 now' parameter from ->enqueue_task().
( identity transformation that causes no change in functionality. )
Signed-off-by: Ingo Molnar <mingo@elte.hu>
remove the 'u64 now' parameter from print_cfs_rq().
( identity transformation that causes no change in functionality. )
Signed-off-by: Ingo Molnar <mingo@elte.hu>
There are two problems with balance_tasks() and how it used:
1. The variables best_prio and best_prio_seen (inherited from the old
move_tasks()) were only required to handle problems caused by the
active/expired arrays, the order in which they were processed and the
possibility that the task with the highest priority could be on either.
These issues are no longer present and the extra overhead associated
with their use is unnecessary (and possibly wrong).
2. In the absence of CONFIG_FAIR_GROUP_SCHED being set, the same
this_best_prio variable needs to be used by all scheduling classes or
there is a risk of moving too much load. E.g. if the highest priority
task on this at the beginning is a fairly low priority task and the rt
class migrates a task (during its turn) then that moved task becomes the
new highest priority task on this_rq but when the sched_fair class
initializes its copy of this_best_prio it will get the priority of the
original highest priority task as, due to the run queue locks being
held, the reschedule triggered by pull_task() will not have taken place.
This could result in inappropriate overriding of skip_for_load and
excessive load being moved.
The attached patch addresses these problems by deleting all reference to
best_prio and best_prio_seen and making this_best_prio a reference
parameter to the various functions involved.
load_balance_fair() has also been modified so that this_best_prio is
only reset (in the loop) if CONFIG_FAIR_GROUP_SCHED is set. This should
preserve the effect of helping spread groups' higher priority tasks
around the available CPUs while improving system performance when
CONFIG_FAIR_GROUP_SCHED isn't set.
Signed-off-by: Peter Williams <pwil3058@bigpond.net.au>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
The move_tasks() function is currently multiplexed with two distinct
capabilities:
1. attempt to move a specified amount of weighted load from one run
queue to another; and
2. attempt to move a specified number of tasks from one run queue to
another.
The first of these capabilities is used in two places, load_balance()
and load_balance_idle(), and in both of these cases the return value of
move_tasks() is used purely to decide if tasks/load were moved and no
notice of the actual number of tasks moved is taken.
The second capability is used in exactly one place,
active_load_balance(), to attempt to move exactly one task and, as
before, the return value is only used as an indicator of success or failure.
This multiplexing of sched_task() was introduced, by me, as part of the
smpnice patches and was motivated by the fact that the alternative, one
function to move specified load and one to move a single task, would
have led to two functions of roughly the same complexity as the old
move_tasks() (or the new balance_tasks()). However, the new modular
design of the new CFS scheduler allows a simpler solution to be adopted
and this patch addresses that solution by:
1. adding a new function, move_one_task(), to be used by
active_load_balance(); and
2. making move_tasks() a single purpose function that tries to move a
specified weighted load and returns 1 for success and 0 for failure.
One of the consequences of these changes is that neither move_one_task()
or the new move_tasks() care how many tasks sched_class.load_balance()
moves and this enables its interface to be simplified by returning the
amount of load moved as its result and removing the load_moved pointer
from the argument list. This helps simplify the new move_tasks() and
slightly reduces the amount of work done in each of
sched_class.load_balance()'s implementations.
Further simplification, e.g. changes to balance_tasks(), are possible
but (slightly) complicated by the special needs of load_balance_fair()
so I've left them to a later patch (if this one gets accepted).
NB Since move_tasks() gets called with two run queue locks held even
small reductions in overhead are worthwhile.
[ mingo@elte.hu ]
this change also reduces code size nicely:
text data bss dec hex filename
39216 3618 24 42858 a76a sched.o.before
39173 3618 24 42815 a73f sched.o.after
Signed-off-by: Peter Williams <pwil3058@bigpond.net.au>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Check the cpu type in the OBP device tree before committing to
using the optimized Niagara memcpy and memset implementation.
If we don't recognize the cpu type, use a completely generic
version.
Signed-off-by: David S. Miller <davem@davemloft.net>
Loading nf_nat causes the conntrack core to be loaded, but we need IPv4 as
well.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch addresses some issues in x86/x86-64 acpi-cpufreq driver:
1. Current memory allocation for acpi_perf_data is actually open-coded
alloc_percpu(). The patch defines and handles acpi_perf_data as percpu
data. The code will be cleaner and easier to be maintained with this
change.
2. Won't load driver in acpi_cpufreq_early_init() failure case.
3. Add __init for acpi_cpufreq_early_init().
Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
Acked-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Cc: Dave Jones <davej@codemonkey.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Len Brown <len.brown@intel.com>
We need to grab the inode->i_lock atomically with the last reference put in
order to remove the open context that is being freed from the
nfsi->open_files list.
Fix by converting the kref to a standard atomic counter and then using
atomic_dec_and_lock()...
Thanks to Arnd Bergmann for pointing out the problem.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6:
[NETFILTER]: Add xt_statistic.h to the header list for usermode programs
[BNX2]: Fix suspend/resume problem.
[TG3]: Fix suspend/resume problem.
Add the needed definitions to activate arbitary speed support on the blackfin platform.
Signed-off-by: Alan Cox <alan@redhat.com>
Acked-by: Aubrey Li <aubrey.li@analog.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Bryan Wu <bryan.wu@analog.com>
Comply with revised Anomaly Workarounds for BF533 05000311 and BF561 05000323
accoring to BF533 anomaly sheet Rev. A 09/04/07
Signed-off-by: Michael Hennerich <michael.hennerich@analog.com>
Signed-off-by: Bryan Wu <bryan.wu@analog.com>
Print out debug info, as early as possible - even before the
kernel initializes the interrupt vectors. Now we can print out debug
messages almost anytime during the boot process.
Signed-off-by: Robin Getz <robin.getz@analog.com>
Signed-off-by: Bryan Wu <bryan.wu@analog.com>
This allows debugging of problems which happen eary in the kernel
boot process (after bootargs are parsed, but before serial subsystem
is fully initialized)
Signed-off-by: Robin Getz <robin.getz@analog.com>
Signed-off-by: Bryan Wu <bryan.wu@analog.com>
Now that there is a generic GPIO driver framework
remove GPIO register unified name space legacy support.
Signed-off-by: Michael Hennerich <michael.hennerich@analog.com>
Signed-off-by: Bryan Wu <bryan.wu@analog.com>
add an exception request/free api similar to the interrupt request/fre
api so people can utilize the free software based exceptions for their
own purposes
Signed-off-by: Mike Frysinger <michael.frysinger@analog.com>
Signed-off-by: Bryan Wu <bryan.wu@analog.com>
- allow people to select the feature that is unavailable to the kernel: NMI, JTAG, or CYCLES.
- change default NMI handler to simply dump hardware trace buffer.
- remove default NMI handler completely as calling into kernel code is not safe
move example handler to wiki so people dont haphazardly copy and paste this stuff thinking its safe
Signed-off-by: Mike Frysinger <michael.frysinger@analog.com>
Signed-off-by: Bryan Wu <bryan.wu@analog.com>
Bug: When SMC921X driver is enabled, kernel boot crash on EZKIT548
http://blackfin.uclinux.org/gf/project/uclinux-dist/tracker/?action=TrackerItemEdit&tracker_item_id=3460
Fixed by restoring mach dependent ASYNC memory size CPLB coverage.
Once we have a more dynamic memory layout we should come up with a better
solution for these hard-coded values.
Signed-off-by: Michael Hennerich <michael.hennerich@analog.com>
Signed-off-by: Bryan Wu <bryan.wu@analog.com>
now all BLKFIN should be BFIN, should be no functional changes.
Signed-off-by: Robin Getz <robin.getz@analog.com>
Signed-off-by: Bryan Wu <bryan.wu@analog.com>
Add xt_statistic.h to the list of headers to install.
Apparently needed to build newer versions of iptables.
Signed-off-by: Chuck Ebbert <cebbert@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Correct incorrect removal of asm-generic/fcntl.h from asm-sparc/fcntl.h by
commit 6ba60d2195.
Signed-off-by: Mark Fortescue <mark@mtfhpc.demon.co.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
Our current implementation has a generic set of barrier functions that
go through the SCSI driver model. Realistically, this is unnecessary,
because the only device that can use barriers (sd) can set the flush
functions up at probe or revalidate time. This patch pulls the barrier
functions out of the mid layer and scsi driver model and relocates them
directly in sd.
Acked-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
* 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc:
[POWERPC] Fixes for the SLB shadow buffer code
[POWERPC] Fix a compile warning in powermac/feature.c
[POWERPC] Fix a compile warning in pci_32.c
[POWERPC] Fix parse_drconf_memory() for 64-bit start addresses
[POWERPC] Fix num_cpus calculation in smp_call_function_map()
[POWERPC] ps3: Fix section mismatch in ps3/setup.c
[POWERPC] spufs: Fix affinity after introduction of node_allowed() calls
[POWERPC] Fix special PTE code for secondary hash bucket
[POWERPC] Expand RPN field to 34 bits when using 64k pages
The one choosen by asm-generic/fcntl.h is not appropriate
for this platform.
Noticed by Ulrich Drepper.
Signed-off-by: David S. Miller <davem@davemloft.net>
After moving the definition of struct ib_umem_chunk from ib_verbs.h to
ib_umem.h there isn't any reason for the macro IB_UMEM_MAX_PAGE_CHUNK
to stay in ib_verbs.h. Move the macro to umem.c, the only place where
it is used.
Signed-off-by: Dotan Barak <dotanb@dev.mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
ib_verbs.h uses struct list_head and rw_semaphore, so while the files
<linux/list.h> and <linux/rwsem.h> seem to be pulled in indirectly by
the other header files it includes, the right thing is to include
those files directly.
Signed-off-by: Dotan Barak <dotanb@dev.mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
ib_mad.h uses struct list_head, so while linux/list.h seems to be
pulled in indirectly by one of the headers it includes, the right
thing is to include linux/list.h directly.
Signed-off-by: Dotan Barak <dotanb@dev.mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
On a machine with hardware 64kB pages and a kernel configured for a
64kB base page size, we need to change the vmalloc segment from 64kB
pages to 4kB pages if some driver creates a non-cacheable mapping in
the vmalloc area. However, we never updated with SLB shadow buffer.
This fixes it. Thanks to paulus for finding this.
Also added some write barriers to ensure the shadow buffer contents
are always consistent.
Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
The real page number field in our PTEs when configured for 64kB pages
is currently 32 bits, which turns out to be not quite enough for the
resources that the eHCA driver wants to map. This expands the RPN
field to include 2 adjacent, previously-unused bits.
Signed-off-by: Paul Mackerras <paulus@samba.org>
Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
As discovered by Evegniy Polyakov, if we try to sendmsg after
a connection reset, we can do incredibly stupid things.
The core issue is that inet_sendmsg() tries to autobind the
socket, but we should never do that for TCP. Instead we should
just go straight into TCP's sendmsg() code which will do all
of the necessary state and pending socket error checks.
TCP's sendpage already directly vectors to tcp_sendpage(), so this
merely brings sendmsg() in line with that.
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch makes sure cf support is enabled on R2D-PLUS but disabled
on R2D-1. Without this fix R2D-1 boards hang on bootup.
Signed-off-by: Magnus Damm <damm@igel.co.jp>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
A small fix to the SELinux/NetLabel glue code to ensure that the NetLabel
cache is utilized when possible. This was broken when the SELinux/NetLabel
glue code was reorganized in the last kernel release.
Signed-off-by: Paul Moore <paul.moore@hp.com>
Acked-by: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: James Morris <jmorris@namei.org>
Enable the MB93090 motherboard's MB86943 PCI arbiter correctly by assigning to
the register rather than comparing against it. This is required to support
bus mastering.
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Fixes WARN_ON() on bitfiels ops for all architectures that have
been left out in 8d4fbcfbe0.
Cc: Alexey Dobriyan <adobriyan@sw.ru>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Cc: Haavard Skinnemoen <hskinnemoen@atmel.com>
Cc: Matthew Wilcox <matthew@wil.cx>
Cc: Kyle McMartin <kyle@parisc-linux.org>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Acked-by: Paul Mundt <lethal@linux-sh.org>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
sctp_chunk_cachep & sctp_bucket_cachep is used module global, so move it
to a header file.
Signed-off-by: Sebastian Siewior <sebastian@breakpoint.cc>
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Alexey Dobriyan noticed that the new WARN_ON() semantics that were
introduced by commit 684f978347 (to also
return the value to be warned on) didn't compile when given a bitfield,
because the typeof doesn't work for bitfields.
So instead of the typeof trick, use an "int" variable together with a
"!!(x)" expression, as suggested by Al Viro.
To make matters more interesting, Paul Mackerras points out that that is
sub-optimal on Power, but the old asm-coded comparison seems to be buggy
anyway on 32-bit Power if the conditional was 64-bit, so I think there
are more problems there.
Regardless, the new WARN_ON() semantics may have been a bad idea. But
this at least avoids the more serious complications.
Cc: Alexey Dobriyan <adobriyan@sw.ru>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Al Viro <viro@ftp.linux.org.uk>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* git://git.kernel.org/pub/scm/linux/kernel/git/wim/linux-2.6-watchdog: (28 commits)
[WATCHDOG] Fix pcwd_init_module crash
[WATCHDOG] ICH9 support for iTCO_wdt
[WATCHDOG] 631xESB/632xESB support for iTCO_wdt - add all LPC bridges
[WATCHDOG] 631xESB/632xESB support for iTCO_wdt
[WATCHDOG] omap_wdt.c - default error for IOCTL is -ENOTTY
[WATCHDOG] Return value of nonseekable_open
[WATCHDOG] mv64x60_wdt: Rework the timeout register manipulation
[WATCHDOG] mv64x60_wdt: disable watchdog timer when driver is probed
[WATCHDOG] mv64x60_wdt: Support the WDIOF_MAGICCLOSE feature
[WATCHDOG] mv64x60_wdt: Add a module parameter to change nowayout setting
[WATCHDOG] mv64x60_wdt: Add WDIOC_SETOPTIONS ioctl support
[WATCHDOG] mv64x60_wdt: Support for WDIOC_SETTIMEOUT ioctl
[WATCHDOG] mv64x60_wdt: Fix WDIOC_GETTIMEOUT return value
[WATCHDOG] mv64x60_wdt: Check return value of nonseekable_open
[WATCHDOG] mv64x60_wdt: Add arch/powerpc platform support
[WATCHDOG] mv64x60_wdt: Get register address from platform data
[WATCHDOG] mv64x60_wdt: set up platform_device in platform code
[WATCHDOG] ensure mouse and keyboard ignored in w83627hf_wdt
[WATCHDOG] s3c2410_wdt: fixup after arch include moves
[WATCHDOG] git-watchdog-typo
...
* 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6: (41 commits)
[RTNETLINK]: Fix warning for !CONFIG_KMOD
[IPV4] ip_options.c: kmalloc + memset conversion to kzalloc
[DECNET]: kmalloc + memset conversion to kzalloc
[NET]: ethtool_perm_addr only has one implementation
[NET]: ethtool ops are the only way
[PPPOE]: Improve hashing function in hash_item().
[XFRM]: State selection update to use inner addresses.
[IPSEC]: Ensure that state inner family is set
[TCP]: Bidir flow must not disregard SACK blocks for lost marking
[TCP]: Fix ratehalving with bidirectional flows
[PPPOL2TP]: Add CONFIG_INET Kconfig dependency.
[NET]: Page offsets and lengths need to be __u32.
[AF_UNIX]: Make code static.
[NETFILTER]: Make nf_ct_ipv6_skip_exthdr() static.
[PKTGEN]: make get_ipsec_sa() static and non-inline
[PPPoE]: move lock_sock() in pppoe_sendmsg() to the right location
[PPPoX/E]: return ENOTTY on unknown ioctl requests
[IPV6]: ipv6_addr_type() doesn't know about RFC4193 addresses.
[NET]: Fix prio_tune() handling of root qdisc.
[NET]: Fix sch_api to properly set sch->parent on the root.
...
This adds kerneldoc to the SPI framework. The "spi_driver" and
"spi_board_info" structs were previously not described.
Signed-off-by: David Brownell <dbrownell@users.sourceforge.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Make it a little more clear that this is the default implementation for
the setleast operation.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Cc: Steven Whitehouse <swhiteho@redhat.com>
Acked-by: "J. Bruce Fields" <bfields@fieldses.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Use "__val" rather than "val" in the __get_unaligned macro in
asm-generic/unaligned.h. This way gcc wont warn if you happen to also name
something in the same scope "val".
Signed-off-by: Mike Frysinger <vapier@gentoo.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Add kernel-doc entry in <linux/irq.h> for:
Warning(linux-2.6.22-git12//include/linux/irq.h:177): No description found for parameter 'last_unhandled'
Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Add kernel-doc notation in <linux/i2c.h> for:
Warning(linux-2.6.22-git12//include/linux/i2c.h:183): No description found for parameter 'driver'
Warning(linux-2.6.22-git12//include/linux/i2c.h:183): No description found for parameter 'usage_count'
Warning(linux-2.6.22-git12//include/linux/i2c.h:183): No description found for parameter 'list'
Warning(linux-2.6.22-git12//include/linux/i2c.h:183): No description found for parameter 'released'
Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Cc: Jean Delvare <khali@linux-fr.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
pure_initcall uses the same ID as core_initcall. I guess that's a typo and
it should use its own ID.
Signed-off-by: Michael Buesch <mb@bu3sch.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
As reported by Gustavo de Nardin <gustavodn@mandriva.com.br>, while trying to
compile xosview (http://xosview.sourceforge.net/) with upstream kernel
headers being used you get the following errors:
serialmeter.cc:48:30: error: linux/serial_reg.h: No such file or directory
serialmeter.cc: In member function 'virtual void
SerialMeter::checkResources()':
serialmeter.cc:71: error: 'UART_LSR' was not declared in this scope
serialmeter.cc:71: error: 'UART_MSR' was not declared in this scope
...
Signed-off-by: Herton Ronaldo Krzesinski <herton@mandriva.com.br>
Cc: Gustavo de Nardin <gustavodn@mandriva.com.br>
Cc: David Woodhouse <dwmw2@infradead.org>
Cc: Russell King <rmk@arm.linux.org.uk>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
alpha:
In file included from kernel/notifier.c:1:
include/linux/kdebug.h:14: warning: 'struct notifier_block' declared inside parameter list
include/linux/kdebug.h:14: warning: its scope is only this definition or declaration, which is probably not what you want
include/linux/kdebug.h:15: warning: 'struct notifier_block' declared inside parameter list
kernel/notifier.c:529: error: conflicting types for 'register_die_notifier'
include/linux/kdebug.h:14: error: previous declaration of 'register_die_notifier' was here
kernel/notifier.c:533: error: conflicting types for 'register_die_notifier'
include/linux/kdebug.h:14: error: previous declaration of 'register_die_notifier' was here
kernel/notifier.c:536: error: conflicting types for 'unregister_die_notifier'
include/linux/kdebug.h:15: error: previous declaration of 'unregister_die_notifier' was here
kernel/notifier.c:539: error: conflicting types for 'unregister_die_notifier'
include/linux/kdebug.h:15: error: previous declaration of 'unregister_die_notifier' was here
Cc: Richard Henderson <rth@twiddle.net>
Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The spidev driver doesn't currently expose all SPI communications modes to
userspace. This passes them all through to the driver.
Two of them are potentially troublesome, in the sense that they could cause
hardware conflicts on shared busses. It might be appropriate to add some
privilege checks for for those modes.
Signed-off-by: Anton Vorontsov <avorontsov@ru.mvista.com>
Signed-off-by: David Brownell <dbrownell@users.sourceforge.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Loopback mode is supported by various controllers. This mode can be
useful for testing, especially in conjunction with spidev driver.
Signed-off-by: Anton Vorontsov <avorontsov@ru.mvista.com>
Signed-off-by: David Brownell <dbrownell@users.sourceforge.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The arm26 port has been in a state where it was far from even compiling
for quite some time.
Ian Molton agreed with the removal.
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Cc: Ian Molton <spyro@f2s.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
commit eab03ac7bd aka
"[PATCH] Get rid of /proc/sys/proc" was good commit except strace(1) compile
breakage it introduced:
system.c:1581: error: 'CTL_PROC' undeclared here (not in a function)
So, add dummy enum back.
Signed-off-by: Alexey Dobriyan <adobriyan@sw.ru>
Cc: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Revert 7e92b4fc34. It broke Sébastien Dugué's
machine and Jeff said (persuasively)
This seems like it will break decades-long-working stuff, in favor of
breaking new ground in our favorite area, "trusting the BIOS."
It's just not worth it for serial ports, IMO. Serial ports are something
that just shouldn't break at this late stage in the game. My new Intel
platform boxes don't even have serial ports, so I question the value of
messing with serial port probing even more... because... just wait a year,
and your box won't have a serial port either! :)
I certainly don't object to the use of platform devices (or isa_driver),
but the probe change seems questionable. That's sorta analagous to
rewriting the floppy driver probe routine. Sure you could do it... but why
risk all that damage and go through debugging all over again?
It seems clear from this report that we cannot, should not, trust BIOS for
something (a) so simple and (b) that has been working for over a decade.
Much discussion ensued and we've decided to have another go at all of this.
Cc: Sébastien Dugué <sebastien.dugue@bull.net>
Cc: Bjorn Helgaas <bjorn.helgaas@hp.com>
Cc: Len Brown <lenb@kernel.org>
Cc: Adam Belay <ambx1@neo.rr.com>
Cc: Matthew Garrett <mjg59@srcf.ucam.org>
Cc: Russell King <rmk@arm.linux.org.uk>
Cc: Jeff Garzik <jeff@garzik.org>
Acked-by: Alan Cox <alan@lxorguk.ukuu.org.uk>
Cc: Michal Piotrowski <michal.k.k.piotrowski@gmail.com>
Cc: Sascha Sommer <saschasommer@freenet.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Remove unused TIF_NOTIFY_RESUME flag for all processor architectures. The
flag was not used excecpt on IA-64 where the patch replaces it with
TIF_PERFMON_WORK.
Signed-off-by: stephane eranian <eranian@hpl.hp.com>
Cc: <linux-arch@vger.kernel.org>
Cc: "Luck, Tony" <tony.luck@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Fix section mismatch warnings:
these functions are called only from __init functions.
WARNING: vmlinux.o(.text+0x1861c): Section mismatch: reference to .init.text:free_bootmem (between 'free_tce_table' and 'build_tce_table')
WARNING: vmlinux.o(.text+0x187e5): Section mismatch: reference to .init.text:__alloc_bootmem_low (between 'alloc_tce_table' and 'kretprobe_trampoline_holder')
Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Muli Ben-Yehuda <muli@il.ibm.com>
Cc: Andi Kleen <ak@suse.de>
Cc: Sam Ravnborg <sam@ravnborg.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
All drivers implement ethtool get_perm_addr the same way -- by calling
the generic function. So we can inline the generic function into the
caller and avoid going through the drivers.
Signed-off-by: Matthew Wilcox <matthew@wil.cx>
Signed-off-by: David S. Miller <davem@davemloft.net>
This updates sg_io_v4 structure (based on Doug's RFC, release 1.3).
The major changes are:
- add dout_resid field
- increase tag size to 64 bits to comply with SAM-4 and SRP
- add dout_iovec_count and din_iovec_count
dout_iovec_count and din_iovec_count aren't supported now. I'm not
sure whether they will be supported or not but they were added for the
possible future changes.
Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
The following code can now become static:
- struct unix_socket_table
- unix_table_lock
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
net/if_inet6.h includes linux/ipv6.h which also tries to include
net/if_inet6.h. Since the latter only needs it for forward
declarations, we can fix this by adding the declarations.
A number of files are implicitly including net/if_inet6.h through
linux/ipv6.h. They also use net/ipv6.h so this patch includes
net/if_inet6.h there.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch adds code to allow errors to be passed up from event
handlers of NETDEV_REGISTER and NETDEV_CHANGENAME. It also adds
the notifier_from_errno/notifier_to_errnor helpers to pass the
errno value up to the notifier caller.
If an error is detected when a device is registered, it causes
that operation to fail. A NETDEV_UNREGISTER will be sent to
all event handlers.
Similarly if NETDEV_CHANGENAME fails the original name is restored
and a new NETDEV_CHANGENAME event is sent.
As such all event handlers must be idempotent with respect to
these events.
When an event handler is registered NETDEV_REGISTER events are
sent for all devices currently registered. Should any of them
fail, we will send NETDEV_GOING_DOWN/NETDEV_DOWN/NETDEV_UNREGISTER
events to that handler for the devices which have already been
registered with it. The handler registration itself will fail.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
no code changes, just documenting existing types
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch changes the API for the callback that is done after an ACK is
received. It solves a couple of issues:
* Some congestion controls want higher resolution value of RTT
(controlled by TCP_CONG_RTT_SAMPLE flag). These don't really want a ktime, but
all compute a RTT in microseconds.
* Other congestion control could use RTT at jiffies resolution.
To keep API consistent the units should be the same for both cases, just the
resolution should change.
Signed-off-by: Stephen Hemminger <shemminger@linux-foundation.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
drivers/infiniband/hw/mthca/mthca_main.c: In function `mthca_init_icm':
drivers/infiniband/hw/mthca/mthca_main.c:468: error: implicit declaration of function `dma_get_cache_alignment'
Pinch the one from asm-generic/dma-mapping.h
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
* master.kernel.org:/pub/scm/linux/kernel/git/lethal/sh-2.6.23:
sh: Fix fs.h removal from mm.h regressions.
sh: fix get_wchan() for SH kernels without framepointers
sh: arch/sh/boot - fix shell usage
rtc: rtc-sh: Correct sh_rtc_set_time() for some SH-3 parts.
sh: remove support for sh7300 and solution engine 7300
sh: Add sh to the CC_OPTIMIZE_FOR_SIZE dependencies.
sh: Kill off virt_to_bus()/bus_to_virt().
sh: sh-sci - fix SH7708 support
sh: Restrict DSP support to specific CPUs.
sh: Silence sq compile warning on sh4 nommu.
sh: Kill the rest of the SE73180 cruft.
sh: remove support for sh73180 and solution engine 73180
sh: remove old broken pint code
sh: Reclaim beginning of P3 space for vmalloc area.
sh: Fix Dreamcast DMA issues.
sh: Add kmap_coherent()/kunmap_coherent() interface for SH-4.
* master.kernel.org:/pub/scm/linux/kernel/git/gregkh/driver-2.6:
modules: better error messages when modules fail to load due to a sysfs problem.
kobject: update documentation
kset: kernel-doc cleanups
driver core: revert "device" link creation check
stable_api_nonsense.txt: Disambiguate the use of "this" by using "that" to refer to the syscall interface
Fix Doc/sysfs-rules typos
kernel-doc fixes for PCI and drivers/base/
kobject: put kobject_actions in kobject.h
kobject: fix link error when CONFIG_HOTPLUG is disabled
HOWTO: sync Japanese HOWTO
HOWTO: adjust translation header of Japanese stable_api_nonsense.txt
* master.kernel.org:/pub/scm/linux/kernel/git/gregkh/usb-2.6:
USB: "sparse" cleanups for usb gadgets
usb-serial: Fix edgeport regression on non-EPiC devices
USB: more pxa2xx_udc dead code removal
USB: NIKON D50 is an unusual device
USB: drivers/usb/serial/sierra.c: make 3 functions static
USB: fix BUG: sleeping function called from invalid context at /home/jeremy/hg/xen/paravirt/linux/drivers/usb/core/urb.c:524, in_atomic():1, irqs_disabled():0
USB: mct_u232: Convert to proper speed handling API
digi_acceleport: Drag the driver kicking and screaming into coding style
cp2101: Remove broken termios optimisation, use proper speed API
USB: Fix a bug in usb_start_wait_urb
USB: fix scatterlist PIO case (IOMMU)
USB: fix usb_serial_suspend(): buggy code
USB: yet another quirky device
USB: Add CanonScan LiDE30 to the quirk list
USB: even more quirks
USB: usb.h kernel-doc additions
USB: more quirky devices
USB: Don't let usb-storage steal Blackberry Pearl
USB: devices misc: Trivial patch to build the IOWARRIOR when it is selected in Kconfig
Removed kernel-doc marker (/**) from struct kset -- each struct member
still needs annotation.
Corrected one parameter name so that kernel-doc matches the macro.
Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
This prevents the extern declaration in the driver core.
Cc: Kay Sievers <kay.sievers@vrfy.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Add kernel-doc entries in <linux/usb.h> for:
Warning(linux-2.6.22-git12//include/linux/usb.h:162): No description found for parameter 'intf_assoc'
Warning(linux-2.6.22-git12//include/linux/usb.h:268): No description found for parameter 'intf_assoc[USB_MAXIADS]'
Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
linux/dvb/video.h uses types __u32, __s32, etc., but does not include
any header defining those when __KERNEL__ is not defined.
Fix this by including asm/types.h when __KERNEL__ is not defined.
Signed-off-by: Anssi Hannula <anssi.hannula@gmail.com>
Signed-off-by: Michael Krufky <mkrufky@linuxtv.org>
Signed-off-by: Mauro Carvalho Chehab <mchehab@infradead.org>
While I was busy compile-testing my patch, ENOSYS sneaked into pm.h
leading to some compile-breakages mostly on ia64 and some mips configs.
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
They are handled in a ->compat_ioctl() handler, so it's just noise
when compat_ioctl.c warns which occurs when they are used on non-SBUS
framebuffer devices.
Signed-off-by: David S. Miller <davem@davemloft.net>
Fully unify all of the DMA ops so that subordinate bus types to
the DMA operation providers (such as ebus, isa, of_device) can
work transparently.
Basically, we just make sure that for every system device we
create, the dev->archdata 'iommu' and 'stc' fields are filled
in.
Then we have two platform variants of the DMA ops, one for SUN4U which
actually programs the real hardware, and one for SUN4V which makes
hypervisor calls.
This also fixes the crashes in parport_pc on sparc64, reported by
Meelis Roos.
Signed-off-by: David S. Miller <davem@davemloft.net>
Add in code to support an 82077 FDC on sun4c systems. There is a
problem with spurious interrupts but it does apear to work.
Testing on my SS2 (82072A FDC) shows that the floppy driver is not
100% with sun4c any way (any spurious interrupt kills it, requiring a
reboot to recover).
Signed-off-by: Mark Fortescue <mark@mtfhpc.demon.co.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
Correct the Solaris x86 number of partitions (slices) is a way that is
backward compatible with the earlier size.
This works without a new VTOC structure definition as the timestamp
and v_asciilabel fields in the VTOC are not used by the kernel yet.
Signed-off-by: Mark Fortescue <mark@mtfhpc.demon.co.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
* master.kernel.org:/pub/scm/linux/kernel/git/jejb/scsi-rc-fixes-2.6: (28 commits)
[SCSI] mpt fusion: Changes in mptctl.c for logging support
[SCSI] mpt fusion: Changes in mptfc.c mptlan.c mptsas.c and mptspi.c for logging support
[SCSI] mpt fusion: Changes in mptscsih.c for logging support
[SCSI] mpt fusion: Changes in mptbase.c for logging support
[SCSI] mpt fusion: logging support in Kconfig, Makefile, mptbase.h and addition of mptdebug.h
[SCSI] libsas: Fix potential NULL dereference in sas_smp_get_phy_events()
[SCSI] bsg: Fix build for CONFIG_BLOCK=n
[SCSI] aacraid: fix Sunrise Lake reset handling
[SCSI] aacraid: add SCSI SYNCHONIZE_CACHE range checking
[SCSI] add easyRAID to the no report luns blacklist
[SCSI] advansys: lindent and other large, uninteresting changes
[SCSI] aic79xx, aic7xxx: Fix incorrect width setting
[SCSI] qla2xxx: fix to honor ignored parameters in sysfs attributes
[SCSI] aacraid: draw line in sand, sundry cleanup and version update
[SCSI] iscsi_tcp: Turn off bounce buffers
[SCSI] libiscsi: fix cmd seqeunce number checking
[SCSI] iscsi_tcp, ib_iser Enable module refcounting for iscsi host template
[SCSI] libiscsi: make sure session is not blocked when removing host
[SCSI] libsas: Remove PCI dependencies
[SCSI] simscsi: convert to use the data buffer accessors
...
Restore the 2.6.22 CONFIG_ACPI_SLEEP build option, but now shadowing the
new CONFIG_PM_SLEEP option.
Signed-off-by: Len Brown <len.brown@intel.com>
[ Modified to work with the PM config setup changes. ]
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Introduce CONFIG_SUSPEND representing the ability to enter system sleep
states, such as the ACPI S3 state, and allow the user to choose SUSPEND
and HIBERNATION independently of each other.
Make HOTPLUG_CPU be selected automatically if SUSPEND or HIBERNATION has
been chosen and the kernel is intended for SMP systems.
Also, introduce CONFIG_PM_SLEEP which is automatically selected if
CONFIG_SUSPEND or CONFIG_HIBERNATION is set and use it to select the
code needed for both suspend and hibernation.
The top-level power management headers and the ACPI code related to
suspend and hibernation are modified to use the new definitions (the
changes in drivers/acpi/sleep/main.c are, mostly, moving code to reduce
the number of ifdefs).
There are many other files in which CONFIG_PM can be replaced with
CONFIG_PM_SLEEP or even with CONFIG_SUSPEND, but they can be updated in
the future.
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Replace CONFIG_SOFTWARE_SUSPEND with CONFIG_HIBERNATION to avoid
confusion (among other things, with CONFIG_SUSPEND introduced in the
next patch).
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
A non-periodic clock_event_device and the "jiffies" clock don't mix well:
tick_handle_periodic() can go into an infinite loop.
Currently lguest guests use the jiffies clock when the TSC is
unusable. Instead, make the Host write the current time into the lguest
page on every interrupt. This doesn't cost much but is more precise
and at least as accurate as the jiffies clock. It also gets rid of
the GET_WALLCLOCK hypercall.
Also, delay setting sched_clock until our clock is set up, otherwise
the early printk timestamps can go backwards (not harmful, just ugly).
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Make declaration of mach_sched_init match definition
(which is in arch/m68knommu/kernel/setup.c).
Signed-off-by: Greg Ungerer <gerg@uclinux.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
#x blocks expansion of macro argument, but it won't do you any
good if it's already been expanded... As it is, RFALSE(cond, ....)
ended up with stringified _expanded_ cond. Real fun when cond contains
something like le32_to_cpu() and you are on a big-endian box...
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
... because somebody had added preempt.h -> list.h, resulting in
asm/system.h -> hardirq.h -> preempt.h -> list.h -> asm/system.h on m68k,
with smp_wmb() used in list.h and defined in asm/system.h below the include
of hardirq.h.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* 'for-linus' of git://git390.osdl.marist.edu/pub/scm/linux-2.6:
[S390] Fix sclp_vt220 error handling.
[S390] cio: Reorganize initialization.
[S390] cio: Make CIO_* macros safe if dbfs are not available.
[S390] cio: Clean up messages.
[S390] Fix IRQ tracing.
[S390] vmur: fix diag14_read.
[S390] Wire up sys_fallocate.
[S390] add types.h include to s390_ext.h
[S390] cio: Remove deprecated rdc/rcd.
[S390] Get rid of new section mismatch warnings.
[S390] sclp: kill unused SCLP config option.
[S390] cio: Remove remains of _ccw_device_get_device_number().
[S390] cio: css_sch_device_register() can be made static.
[S390] Improve __smp_call_function_map.
[S390] Convert to smp_call_function_single.
* 'for-linus' of master.kernel.org:/pub/scm/linux/kernel/git/cooloney/blackfin-2.6:
Input Serio: Blackfin doesnt support I8042 - make sure it doesnt get selected
Blackfin arch: add BF54x I2C/TWI TWI0 driver support
Blackfin On-Chip RTC driver update for supporting BF54x
Blackfin Ethernet MAC driver: fix bug Report returned -ENOMEM upwards (in case L1/uncached memory alloc fails)
Blackfin arch: add error message when IRQ no available
Blackfin arch: Initialize the exception vectors early in the boot process
Blackfin arch: fix a compiling warning about dma-mapping
Blackfin arch: switch to using proper defines this time THREAD_SIZE and PAGE_SIZE instead of just PAGE_SIZE everywhere
Blackfin arch: fix bug which unaligns the init thread's stack and causes the current macro to fail.
Blackfin arch: Load P0 before storing through it
Blackfin arch: fix KGDB bug, dont forget last parameter.
Blackfin arch: add selections for BF544 and BF542
Blackfin arch: use bfin_read_SWRST() now that BF561 provides it
Blackfin arch: setup aliases for some core Core A MMRs
* 'for-linus' of git://git.kernel.dk/linux-2.6-block:
docbook: add pipes, other fixes
blktrace: use cpu_clock() instead of sched_clock()
bsg: Fix build for CONFIG_BLOCK=n
[patch] QUEUE_FLAG_READFULL QUEUE_FLAG_WRITEFULL comment fix
MXC needs the same change as IOP. See [ARM] 4494/1
or commit 7dea1b2006
An undefined reference to elf_hwcap prevents linkage, due
to changes made by f884b1cf57
and d1cbbd6b41
Removing processor.h removes the extern definition of
elf_hwcap, which fixes the link issue, but forgets cpu_relax().
So, instead, we'll call barrier() directly.
Cc: Lennert Buytenhek <kernel@wantstofly.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Acked-by: Ross Wille <wille@freescale.com>
Signed-off-by: Quinn Jensen <quinn.jensen@freescale.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
We should not be checking the cmd windown for just handling r2t responses.
And if the window closes in on us, always have scsi-ml requeue the command
from our queuecommand function.
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
This patch implements support of fallocate system call on s390(x)
platform. A wrapper is added to address the issue which s390 ABI has with
the arguments of this system call.
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
The header file for external interrupts uses the _u16 type. Make sure
that _u16 is defined by including linux/types.h. This prevents compile
failures, if asm/s390_ext.h is the first include file.
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
http://marc.info/?l=linux-kernel&m=118481061928246&w=2 seems to
indicate disfavour of "deprecated", so let's just kill it now.
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
smp_call_function_single now has the same semantics as s390's
smp_call_function_on. Therefore convert to the *single variant
and get rid of some architecture specific code.
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/drzeus/mmc:
drivers/mmc/core/: make 3 functions static
mmc: add missing printk levels
mmc: remove redundant debug information from sdhci and wbsd
mmc: proper debugging output in core
mmc: be more verbose about card insertions/removal
mmc: Don't hold lock when releasing an added card
mmc: add a might_sleep() to mmc_claim_host()
mmc: update kerneldoc
mmc: update header file paths
sdhci: add support to ENE-CB714
mmc: check error bits before command completion
* 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc: (21 commits)
[POWERPC] spusched: Fix initial timeslice calculation
[POWERPC] spufs: Fix incorrect initialization of cbe_spu_info.spus
[POWERPC] Fix Maple platform ISA bus
[POWERPC] Make pci_iounmap actually unmap things
[POWERPC] Add function to check if address is an IO port
[POWERPC] Fix Pegasos keyboard detection
[POWERPC] iSeries: Fix section mismatch warning in lpevents
[POWERPC] iSeries: Fix section mismatch warnings
[POWERPC] iSeries: We need vio_enable_interrupts
[POWERPC] Fix RTC and device tree on linkstation machines
[POWERPC] Add of_register_i2c_devices()
[POWERPC] Fix loop with unsigned long counter variable
[POWERPC] Fix register labels on show_regs() message for 4xx/Book-E
[POWERPC] Only allow building of BootX text support on PPC_MULTIPLATFORM
[POWERPC] Fix the ability to reset on MPC8544 DS and MPC8568 MDS boards
[POWERPC] Fix mpc7448hpc2 tsi108 device_type bug
[POWREPC] Fixup a number of modpost warnings on ppc32
[POWERPC] Fix ethernet PHY support on MPC8544 DS
[POWERPC] Don't try to allocate resources for a Freescale POWERPC PHB
Revert "[POWERPC] Don't complain if size-cells == 0 in prom_parse()"
...
Commit bd804eba1c ("PM: Introduce
pm_power_off_prepare") caused problems in the poweroff path, as reported by
YOSHIFUJI Hideaki / 吉藤英明.
Generally, sysdev_shutdown() should be called after the ACPI preparation for
powering the system off. To make it happen, we can separate sysdev_shutdown()
from device_shutdown() and call it directly wherever necessary.
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Tested-by: YOSHIFUJI Hideaki / 吉藤英明 <yoshfuji@linux-ipv6.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
These are manual fixups after running Lindent. No functional change.
Signed-off-by: Bjorn Helgaas <bjorn.helgaas@hp.com>
Cc: Len Brown <lenb@kernel.org>
Cc: Adam Belay <ambx1@neo.rr.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
EDAC has a foundation to perform software memory scrubbing, but it requires a
per architecture (atomic_scrub) function for performing an atomic update
operation. Under X86, this is done with a
lock: add [addr],0
in the file asm-x86/edac.h
This patch provides the MIPS arch with that atomic function, atomic_scrub() in
asm-mips/edac.h
Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
Cc: Ralf Baechle <ralf@linux-mips.org>
Signed-off-by: Doug Thompson <dougthompson@xmission.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Nathan Lynch <ntl@pobox.com> reported:
2.6.23-rc1 breaks the build for 64-bit powerpc for me (using
maple_defconfig):
LD vmlinux.o
powerpc64-unknown-linux-gnu-ld: dynreloc miscount for
kernel/built-in.o, section .opd
powerpc64-unknown-linux-gnu-ld: can not edit opd Bad value
make: *** [vmlinux.o] Error 1
However, I see a possibly related binutils patch:
http://article.gmane.org/gmane.comp.gnu.binutils/33650
It was tracked down to be caused by the weak prototype
declaration in mm.h:
__attribute__((weak)) const char *arch_vma_name(struct vm_area_struct *vma);
But there is no need to make the declaration weak - only the definition
needs to be marked weak. So drop the weak declaration. And in the process
drop the duplicate definition in page.h for powerpc.
Note: the arch_vma_name fix for x86_64 needs to be applied first to avoid
breaking x86_64
Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
Cc: Nathan Lynch <ntl@pobox.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Documentation: The Drivers
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Documentation: The Guest
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Fix:
linux/include/xen/page.h: In function mfn_pte:
linux/include/xen/page.h:149: error: __supported_pte_mask undeclared (first use in this function)
linux/include/xen/page.h:149: error: (Each undeclared identifier is reported only once
linux/include/xen/page.h:149: error: for each function it appears in.)
Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
A dummy inline function of register_nosave_region_late was accidentally
removed by the recent PM patch that introduced suspend notifiers.
This elimination causes the following compiler error on PPC machines.
CC arch/powerpc/sysdev/dart_iommu.o
arch/powerpc/sysdev/dart_iommu.c: In function 'iommu_init_late_dart':
arch/powerpc/sysdev/dart_iommu.c:376: error: implicit declaration of function
'register_nosave_region_late'
make[1]: *** [arch/powerpc/sysdev/dart_iommu.o] Error 1
make: *** [arch/powerpc/sysdev] Error 2
This patch fixes the problem.
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
Acked-by: Rafael J. Wysocki <rjw@sisk.pl>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Have put_unaligned() warn if types would be wrong
for assignment, slap force-casts where needed. Cast the
result of get_unaligned to typeof(*ptr). With that in
place we get proper typechecking, both from gcc and from sparse,
including that for bitwise types.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Since powerpc insists on printing the _value_ of condition
and on casting it to long... At least let's make it a force-cast.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
We really need force-cast when converting to final result type;
unsigned long can be silently converted to integer types and
to pointers, but not to bitwise.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
no real bugs, just misannotations cropping up
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Remove the unused mach_trap_init function pointer. All use of it
removed with change to using generic irq framework.
Signed-off-by: Greg Ungerer <gerg@uclinux.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Create prototype for ack_bad_irq() for m68knommu.
Compilation of kernel/irq/handle.c fails without it.
Signed-off-by: Greg Ungerer <gerg@uclinux.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Eliminate unnecessary PCI dependencies in libsas. It should use generic
DMA and struct device like other subsystems.
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Add an above_background_load() function which can be used by other
subsystems to detect if there is anything besides niced tasks running.
Place it in sched.h to allow it to be compiled out if not used.
Unused for now, but it is a useful hint to the IO scheduler and to
swap-prefetch.
Signed-off-by: Con Kolivas <kernel@kolivas.org>
Cc: Peter Williams <pwil3058@bigpond.net.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
This adds a general mechanism whereby a task can request the scheduler to
notify it whenever it is preempted or scheduled back in. This allows the
task to swap any special-purpose registers like the fpu or Intel's VT
registers.
Signed-off-by: Avi Kivity <avi@qumranet.com>
[ mingo@elte.hu: fixes, cleanups ]
Signed-off-by: Ingo Molnar <mingo@elte.hu>
This patch removes old dead code:
- kill off sh7300 cpu support
- get rid of broken solution engine 7300 board support
Signed-off-by: Magnus Damm <damm@igel.co.jp>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
Wire up ARCH_NO_VIRT_TO_BUS, and kill off the remaining users. The
dma-mapping code really wanted virt_to_phys()/phys_to_virt() anyways,
there are no inherently special bus addresses.
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
This adds a function that tells you if a given kernel virtual address
is hitting a PCI or ISA IO port permanent mapping or not. This is to
be used in the next patch to fix iomap APIs to properly unmap things.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
WARNING: vmlinux.o(.text+0x8124): Section mismatch: reference to .init.text:.iSeries_early_setup (between '.__start_initialization_iSeries' and '.__mmu_off')
WARNING: vmlinux.o(.text+0x8128): Section mismatch: reference to .init.text:.early_setup (between '.__start_initialization_iSeries' and '.__mmu_off')
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Commit 3d0e91f7ac introduced a requirement
for vio_enable_interrupts which iSeires has never needed. So create a
dummy one.
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Added its pci_id and implemented a quirk for it because this
controller needs to reset cmd and data when setting ios.
Signed-off-by: Leandro Dorileo <dorileo@ossystems.com.br>
Signed-off-by: Otavio Salvador <otavio@ossystems.com.br>
Signed-off-by: Pierre Ossman <drzeus@drzeus.cx>
* 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux-2.6:
[IA64] Nail two more simple section mismatch errors
[IA64] fix section mismatch warnings
[IA64] rename partial_page
[IA64] Ensure that machvec is set up takes place before serial console
[IA64] vector-domain - fix vector_table
[IA64] vector-domain - handle assign_irq_vector(AUTO_ASSIGN)
In 741f98fe29 Sam added full
checking across the entire vmlinux image. This flushed out
a dozen new section mismatch warnings. Start the whack-a-mole
game again to stomp them out.
Signed-off-by: Tony Luck <tony.luck@intel.com>
Jens has added a partial_page thing in splice whcih conflicts with the ia64
one. Rename ia64 out of the way. (ia64 chose poorly).
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Tony Luck <tony.luck@intel.com>
struct apm_bios_info uses "unsigned short" and "unsigned long"
to mean u16 and u32 respectively. Correct.
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Make "struct ist_info" valid on both i386 and x86-64, and use the
structure by name in the setup code. Additionally, "Intel SpeedStep
IST" is redundant, refer to it as IST consistently.
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
* 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux-acpi-2.6:
ACPI: Kconfig: remove CONFIG_ACPI_SLEEP from source
ACPI: quiet ACPI Exceptions due to no _PTC or _TSS
ACPI: Remove references to ACPI_STATE_S2 from acpi_pm_enter
ACPI: Kconfig: always enable CONFIG_ACPI_SLEEP on X86
ACPI: Kconfig: fold /proc/acpi/sleep under CONFIG_ACPI_PROCFS
ACPI: Kconfig: CONFIG_ACPI_PROCFS now defaults to N
ACPI: autoload modules - Create __mod_acpi_device_table symbol for all ACPI drivers
ACPI: autoload modules - Create ACPI alias interface
ACPI: autoload modules - ACPICA modifications
ACPI: asus-laptop: Fix failure exits
ACPI: fix oops due to typo in new throttling code
ACPI: ignore _PSx method for hotplugable PCI devices
ACPI: Use ACPI methods to select PCI device suspend state
ACPI, PNP: hook ACPI D-state to PNP suspend/resume
ACPI: Add acpi_pm_device_sleep_state helper routine
ACPI: Implement the set_target() callback from pm_ops
Parse the machvec command line option outside of the early_param()
so that ia64_mv is set before any console intialisation that
may result from early_param parsing.
Signed-off-by: Simon Horman <horms@verge.net.au>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Need an include/asm-m68knommu/hw_irq.h for kernel/hrtimer.c
Signed-off-by: Greg Ungerer <gerg@uclinux.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Fix a small typo in the definition of MCFDMA_DIR_INV (MCF5272 specific).
Signed-off-by: Greg Ungerer <gerg@uclinux.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
CLOCK_TICK_RATE should give the underlying frequency of the tick timer,
to make ntp happy. For Coldfires, that's the main clock.
Signed-off-by: Philippe De Muyter <phdm@macqel.be>
Signed-off-by: Greg Ungerer <gerg@uclinux.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This avoids xtime lag seen with dynticks, because while 'xtime' itself
is still not updated often, we keep a 'xtime_cache' variable around that
contains the approximate real-time that _is_ updated each time we do a
'update_wall_time()', and is thus never off by more than one tick.
IOW, this restores the original semantics for 'xtime' users, as long as
you use the proper abstraction functions (ie 'current_kernel_time()' or
'get_seconds()' depending on whether you want a timespec or just the
seconds field).
[ Updated Patch. As penance for my sins I've also yanked another #ifdef
that was added to avoid the xtime lag w/ hrtimers. ]
Signed-off-by: John Stultz <johnstul@us.ibm.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This avoids use of the kernel-internal "xtime" variable directly outside
of the actual time-related functions. Instead, use the helper functions
that we already have available to us.
This doesn't actually change any behaviour, but this will allow us to
fix the fact that "xtime" isn't updated very often with CONFIG_NO_HZ
(because much of the realtime information is maintained as separate
offsets to 'xtime'), which has caused interfaces that use xtime directly
to get a time that is out of sync with the real-time clock by up to a
third of a second or so.
Signed-off-by: John Stultz <johnstul@us.ibm.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
As it was a synonym for (CONFIG_ACPI && CONFIG_X86),
the ifdefs for it were more clutter than they were worth.
For ia64, just add a few stubs in anticipation of future
S3 or S4 support.
Signed-off-by: Len Brown <len.brown@intel.com>
Add ability to expend the hardware trace buffer via a configurable
software buffer - so you can have lots of history when a crash occurs.
The interesting way we do printk in the traps.c confusese the checking
script
Signed-off-by: Robin Getz <robin.getz@analog.com>
Signed-off-by: Bryan Wu <bryan.wu@analog.com>
Fix CCLK and SCLK checks, combine all arch checks into one file
for maintance. Checkins that remove more lines than they add are always
good.
Signed-off-by: Robin Getz <robin.getz@analog.com>
Signed-off-by: Bryan Wu <bryan.wu@analog.com>
This patch removes old dead code:
- kill off sh73180 cpu support
- get rid of broken solution engine 73180 board support
Signed-off-by: Magnus Damm <damm@igel.co.jp>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
The first 1MB of P3 space was reserved and used for page colouring,
as we've reworked that to use fixmaps, we can reclaim the space and
hand it back to VMALLOC_START.
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
* 'upstream-linus' of master.kernel.org:/pub/scm/linux/kernel/git/jgarzik/netdev-2.6: (26 commits)
netdev: i82596 Ethernet needs <asm/cacheflush.h>
forcedeth: mcp73 device addition
forcedeth: new device ids in pci_ids.h
atl1: make atl1_init_ring_ptrs static
eHEA: net_poll support
drivers/net/acenic.c: fix check-after-use
defxx: Use __maybe_unused rather than a local hack
Fix error checking in Vitesse IRQ config
ps3: reduce allocation size of rx skb buffers
atl1: use kernel provided ethernet length constants
atl1: fix typo in dma_req_block
atl1: change cmb write threshold
atl1: fix typo in DMA engine setup
atl1: change tpd_avail function name
ps3: fix rare issue that reenabling rx DMA fails
ps3: removed calling netif_poll_enable() in open()
ps3: use ethX as the name of irq
ps3: use net_device_stats of net_device structure
ps3: removed conditional ethtool support
ps3: removed defines no longer used
...
Add support for arch/powerpc, specifically for the prpmc2800 platform.
Signed-off-by: Dale Farnsworth <dale@farnsworth.org>
Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
ACPI implementations in several TOSHIBA laptops are weird and burn cpu
cycles for tens of seconds while trying to suspend if the PCI device
for the ATA controller is disabled when the ACPI suspend is called.
This patch uses DMI to match those machines and bypass device disable
on those machines during suspend. As the device needs to be put into
enabled state on resume without affecting PCI enable count, matching
resume callback uses __pci_reenable_device().
This bug is reported in bugzilla bug 7780.
http://bugzilla.kernel.org/show_bug.cgi?id=7780
Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Some odd ACPI implementations choke if certain controller is disabled
when ACPI suspend is invoked but we still need to make sure the PCI
device is enabled during resume. Simply using pci_enable_device()
unbalances device enable count. Export __pci_reenable_device().
Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
* 'request-queue-t' of git://git.kernel.dk/linux-2.6-block:
[BLOCK] Add request_queue_t and mark it deprecated
[BLOCK] Get rid of request_queue_t typedef
CC kernel/time/clocksource.o
In file included from
/home/bunk/linux/kernel-2.6/linux-2.6.22-rc6-mm1/include/linux/clocksource.h:18,
from
/home/bunk/linux/kernel-2.6/linux-2.6.22-rc6-mm1/kernel/time/clocksource.c:27:
include2/asm/io.h: In function 'virt_to_phys':
include2/asm/io.h:46: error: implicit declaration of function '__pa'
include2/asm/io.h: In function 'phys_to_virt':
include2/asm/io.h:51: error: implicit declaration of function '__va'
include2/asm/io.h:51: warning: return makes pointer from integer without a cast
make[3]: *** [kernel/time/clocksource.o] Error 1
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Cc: Christian Zankel <chris@zankel.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
At present, various parts of the serial code use unsigned long to define
resource addresses. This is a problem, because some 32-bit platforms have
physical addresses larger than 32-bits, and have mmio serial uarts located
above the 4GB point.
This patch changes the type of mapbase in both struct uart_port and struct
plat_serial8250_port to resource_size_t, which can be configured to be 64
bits on such platforms. The mapbase in serial_struct can't safely be
changed, because that structure is user visible.
Signed-off-by: David Gibson <dwg@au1.ibm.com>
Signed-off-by: Josh Boyer <jwboyer@linux.vnet.ibm.com>
Cc: Russell King <rmk@arm.linux.org.uk>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Jason Wessel <jason.wessel@windriver.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The driver previously registered its platform device data in its own
init function--that's bogus. Move that code to platform-specific
code in arch/ppc. This is being done so that the platform code can
decide at runtime whether to initialize this driver or not.
Signed-off-by: Dale Farnsworth <dale@farnsworth.org>
Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
Just using #defines for the
bsg_register_queue()/bsg_unregister_queue() can cause undefined
variables when they're defined to nothing. Use dummy inline functions
instead.
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Added the MPC85xx PCI device IDs that we need for the quirks we have.
Also, fixed the MPC8567E, MPC8567 device IDs which had the wrong value.
Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
Also add 8641/8641D device IDs as well.
All of which already exist or have been submitted to
The Linux PCI ID Repository at:
http://pci-ids.ucw.cz/
CC-to: pci-ids@ucw.cz
Signed-off-by: Jon Loeliger <jdl@freescale.com>
Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
Andrew thinks I should be nice and allow outside code to at least just
compile, so add the request_queue_t typedef back and mark it deprecated.
It'll warn people that this type is going away soonish.
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
Some of the code has been gradually transitioned to using the proper
struct request_queue, but there's lots left. So do a full sweet of
the kernel and get rid of this typedef and replace its uses with
the proper type.
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
revise anomaly handling by basing things on the compiler not the kconfig defines,
so the header is stable and usable outside of the kernel. This also allows us to
move some code from preprocessing to compiling (gcc culls dead code)
which should help with code quality (readability, catch minor bugs, etc...).
Signed-off-by: Mike Frysinger <michael.frysinger@analog.com>
Signed-off-by: Bryan Wu <bryan.wu@analog.com>
The current SH DMA API is somewhat broken, not correctly matching
virtual channel to the correct SH DMAC. This wasn't noticeable when
using g2 DMA for the sound driver - one channel 0 is as good as any
other! - but caused the pvr2 driver to fail.
This patch fixes the pvr2 problem and consequently fixes the sound
driver to ensure it continues to function.
Signed-off by: Adrian McMenamin <adrian@mcmen.demon.co.uk>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
This wires up kmap_coherent() and kunmap_coherent() on SH-4, and
moves away from the p3map_mutex and reserved P3 space, opting to
use fixmaps for colouring instead.
The copy_user_page()/clear_user_page() implementations are moved
to this, which fixes the nasty blowups with spinlock debugging
as a result of having some of these calls nested under the page
table lock.
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
We need the ability to set P2P bridge registers to properly setup the virtual
P2P bridges that exist in PCIe controllers for some of the embedded setups.
Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
Make it so we do a runtime check to know if we need to write cfg_addr
as big or little endian. This is needed if we want to allow 86xx support
to co-exist in the same kernel as other 6xx PPCs.
Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
This replaces the current linear search for a unique minor number with
lib/idr.c.
Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Acked-by: Jens Axboe <jens.axboe@oracle.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Modify modpost (file2alias.c) to add acpi*:XYZ0001: alias in modules.alias
like:
grep acpi /lib/modules/2.6.22-rc4-default/modules.alias
alias acpi*:SNY5001:* sony_laptop
alias acpi*:SNY6001:* sony_laptop
for e.g. the sony_laptop module.
This module matches against all ACPI devices with a HID or CID of SNY5001
or SNY6001
Export an uevent and modalias sysfs file containing the string:
[MODALIAS=]acpi:PNP0C0C:
additional CIDs are concatenated at the end.
Signed-off-by: Thomas Renninger <trenn@suse.de>
Signed-off-by: Kay Sievers <kay.sievers@vrfy.org>
Signed-off-by: Len Brown <len.brown@intel.com>
Define standardized HIDs - Rename current acpi_device_id to acpica_device_id
Signed-off-by: Thomas Renninger <trenn@suse.de>
Signed-off-by: Len Brown <len.brown@intel.com>
We don't use setup_indirect_pci_nomap in arch/powerpc and it appears
the users that needed it from arch/ppc are now using setup_indirect_pci.
Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
Added PPC_INDIRECT_TYPE_NO_PCIE_LINK flag to the indirect pci handling
code to ensure that we don't talk to any device other than the PHB
if we don't have PCIe link. Some controllers will lockup if they try
to do a config cycle to any device on the bus except the PHB.
Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
Added early_find_capability that wraps pci_bus_find_capability and uses
fake_pci_bus() to allow us to call it before we've fully setup the
pci_controller.
Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
This patch adds the manufacturer ID for AMD flash.
Signed-off-by: Steven J. Hill <sjhill1@rockwellcollins.com>
Signed-off-by: David Woodhouse <dwmw2@infradead.org>
Now that the last inlined instances are gone, all that is left to do
is turning disable_irq_nosync on arm26 and m68k from defines to aliases
and we are all set - we can make these externs in linux/interrupt.h
uncoditional and kill remaining instances in asm/irq.h
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Not everyone wants libsas automatically to pull in libata. This patch
makes the behaviour configurable, so you can build libsas with or
without ATA support.
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
* 'for-linus' of git://git.o-hand.com/linux-rpurdie-leds:
leds: Convert from struct class_device to struct device
leds: leds-gpio for ngw100
leds: Add warning printks in error paths
leds: Fix trigger unregister_simple if register_simple fails
leds: Use menuconfig objects II - LED
leds: Teach leds-gpio to handle timer-unsafe GPIOs
leds: Add generic GPIO LED driver
* 'for-linus' of git://git.o-hand.com/linux-rpurdie-backlight:
leds: cr_bllcd.c: build fix
backlight: Convert from struct class_device to struct device
backlight: Fix order of Kconfig entries
* 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc:
[POWERPC] Clean up duplicate includes in drivers/macintosh/
[POWERPC] Quiet section mismatch warning on pcibios_setup
[POWERPC] init and exit markings for hvc_iseries
[POWERPC] Quiet section mismatch in hvc_rtas.c
[POWERPC] Constify of_platform_driver match_table
[POWERPC] hvcs: Make some things static and const
[POWERPC] Constify of_platform_driver name
[POWERPC] MPIC protected sources
[POWERPC] of_detach_node()'s device node argument cannot be const
[POWERPC] Fix ARCH=ppc builds
[POWERPC] mv64x60: Use mutex instead of semaphore
[POWERPC] Allow smp_call_function_single() to current cpu
[POWERPC] Allow exec faults on readable areas on classic 32-bit PowerPC
[POWERPC] Fix future firmware feature fixups function failure
[POWERPC] fix showing xmon help
[POWERPC] Make xmon_write accept a const buffer
[POWERPC] Fix misspelled "CONFIG_CHECK_CACHE_COHERENCY" Kconfig option.
[POWERPC] cell: CONFIG_SPE_BASE is a typo
* 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux-acpi-2.6: (77 commits)
ACPI: Populate /sys/firmware/acpi/tables/
ACPI: create CONFIG_ACPI_DEBUG_FUNC_TRACE
ACPI: update ACPI proc I/F removal schedule
ACPI: update feature-removal-schedule.txt, /sys/firmware/acpi/namespace is gone
ACPI: export ACPI events via acpi_mc_group multicast group
ACPI: fix empty macros found by -Wextra
ACPI: drivers/acpi/pci_link.c: lower printk severity
sony-laptop: Fix event reading in sony-laptop
sony-laptop: Add Vaio FE to the special init sequence
sony-laptop: Make the driver use MSC_SCAN and a setkeycode and getkeycode key table.
sony-laptop: Invoke _INI for SNC devices that provide it
sony-laptop: Add support for recent Vaios Fn keys (C series for now)
sony-laptop: map wireless switch events to KEY_WLAN
sony-laptop: add new SNC handlers
ACPI: thinkpad-acpi: add locking to brightness subdriver
ACPI: thinkpad-acpi: bump up version to 0.15
ACPI: thinkpad-acpi: make EC-based thermal readings non-experimental
ACPI: thinkpad-acpi: make sure DSDT TMPx readings don't return +128
ACPI: thinkpad-acpi: react to Lenovo ThinkPad differences in hot key
ACPI: thinkpad-acpi: allow use of CMOS NVRAM for brightness control
...
They are identical
Indirectly pointed out by Thomas Gleixner
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Previously lock was unconditionally used, but shouldn't be needed on
UP systems.
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Due to index register access ordering problems, when using macros a line
like this fails (and does nothing):
setCx86(CX86_CCR2, getCx86(CX86_CCR2) | 0x88);
With inlined functions this line will work as expected.
Note about a side effect: Seems on Geode GX1 based systems the
"suspend on halt power saving feature" was never enabled due to this
wrong macro expansion. With inlined functions it will be enabled, but
this will stop the TSC when the CPU runs into a HLT instruction.
Kernel output something like this:
Clocksource tsc unstable (delta = -472746897 ns)
This is the 3rd version of this patch.
- Adding missed arch/i386/kernel/cpu/mtrr/state.c
Thanks to Andres Salomon
- Adding some big fat comments into the new header file
Suggested by Andi Kleen
AK: fixed x86-64 compilation
Signed-off-by: Juergen Beisert <juergen@kreuzholzen.de>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
local_cmpxchg() should not use any LOCK prefix. This change probably
got lost in the move to cmpxchg.h.
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
Acked-by: Christoph Lameter <clameter@sgi.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
When a machine check or NMI occurs while multiple byte code is patched
the CPU could theoretically see an inconsistent instruction and crash.
Prevent this by temporarily disabling MCEs and returning early in the
NMI handler.
Based on discussion with Mathieu Desnoyers.
Cc: Mathieu Desnoyers <compudj@krystal.dyndns.org>
Cc: Jeremy Fitzhardinge <jeremy@goop.org>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Reenable kprobes and alternative patching when the kernel text is write
protected by DEBUG_RODATA
Add a general utility function to change write protected text. The new
function remaps the code using vmap to write it and takes care of CPU
synchronization. It also does CLFLUSH to make icache recovery faster.
There are some limitations on when the function can be used, see the
comment.
This is a newer version that also changes the paravirt_ops code.
text_poke also supports multi byte patching now.
Contains bug fixes from Zach Amsden and suggestions from Mathieu
Desnoyers.
Cc: Jan Beulich <jbeulich@novell.com>
Cc: Jeremy Fitzhardinge <jeremy@goop.org>
Cc: Mathieu Desnoyers <compudj@krystal.dyndns.org>
Cc: Zach Amsden <zach@vmware.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This patch uses the read and write functions provided at system.h
for control registers instead of writting raw assembly over and
over again in .c files. Functions to manipulate cr2 and cr8 were
provided, as they were lacking.
Also, removed some extra space after closing brackets
Signed-off-by: Glauber de Oliveira Costa <gcosta@redhat.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This patch makes the i386 behave the same way that x86_64 does when a
segfault happens. A line gets printed to the kernel log so that tools
that need to check for failures can behave more uniformly between
debug.show_unhandled_signals sysctl variable to 0 (or by doing echo 0 >
/proc/sys/debug/exception-trace)
Also, all of the lines being printed are now using printk_ratelimit() to
deny the ability of DoS from a local user with a program like the
following:
main()
{
while (1)
if (!fork()) *(int *)0 = 0;
}
This new revision also includes the fix that Andrew did which got rid of
new sysctl that was added to the system in earlier versions of this.
Also, 'show-unhandled-signals' sysctl has been renamed back to the old
'exception-trace' to avoid breakage of people's scripts.
AK: Enabling by default for i386 will be likely controversal, but let's see what happens
AK: Really folks, before complaining just fix your segfaults
AK: I bet this will find a lot of silent issues
Signed-off-by: Masoud Sharbiani <masouds@google.com>
Signed-off-by: Andi Kleen <ak@suse.de>
[ Personally, I've found the complaints useful on x86-64, so I'm all for
this. That said, I wonder if we could do it more prettily.. -Linus ]
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Move register and other definitions out of the
include/asm-arm/arch-s3c2410 into the the arch
directories of include/asm-arm/plat-s3c24xx and
include/asm-arm/plat-s3c.
This move is in preperation of the merging of
s3c2400 and s3c6400.
The following git mv commands are needed before
this patch can be applied:
git mv include/asm-arm/arch-s3c2410/regs-ac97.h include/asm-arm/plat-s3c/regs-ac97.h
git mv include/asm-arm/arch-s3c2410/regs-adc.h include/asm-arm/plat-s3c/regs-adc.h
git mv include/asm-arm/arch-s3c2410/regs-iis.h include/asm-arm/plat-s3c24xx/regs-iis.h
git mv include/asm-arm/arch-s3c2410/regs-spi.h include/asm-arm/plat-s3c24xx/regs-spi.h
git mv include/asm-arm/arch-s3c2410/regs-udc.h include/asm-arm/plat-s3c24xx/regs-udc.h
git mv include/asm-arm/arch-s3c2410/udc.h include/asm-arm/plat-s3c24xx/udc.h
Signed-off-by: Ben Dooks <ben-linux@fluff.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
We've fixed up a number of faults with the uncompressors
so remove the now unused FIFO_MAX as it is not needed.
Signed-off-by: Ben Dooks <ben-linux@fluff.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Split the S3C2400 out of S3C2410 memory.h files
ready for S3C2400 support to be added.
Signed-off-by: Ben Dooks <ben-linux@fluff.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Reorganise the definition of the virtual addresses
used into a common header and update the users to
rename S3C2410 items into a more common S3C defined
macros.
Signed-off-by: Ben Dooks <ben-linux@fluff.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Remove the static maps for the LCD and USB devices.
Signed-off-by: Ben Dooks <ben-linux@fluff.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Move the S3C2400 values to their own include directory
series in include/asm-arm/arch-s3c2400 as the support
for the S3C2400 is best placed in its own arch directory.
Signed-off-by: Ben Dooks <ben-linux@fluff.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Rename the S3C24XX configuration options for the watchdog
boot controls for moving to the arch/arm/plat-s3c moves.
Signed-off-by: Ben Dooks <ben-linux@fluff.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Check for ARM926 based S3C24XX based devices as these
only have 64 byte FIFOs, and do not have the model
detection refisters in the same place as the ARM920
based CPUs
Signed-off-by: Ben Dooks <ben-linux@fluff.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Ensure we check for ARM926 in the uncompressor, as all current
ARM926s do not have an ID register and all have S3C2440 style
UARTs.
Signed-off-by: Ben Dooks <ben-linux@fluff.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Rename DEBUG_S3C2410_PORT to DEBUG_S3C_PORT as well as
DEBUG_S3C2410_UART to DEBUG_S3C_UART as part of the updates
to moving to plat-s3c for S3C base support.
Signed-off-by: Ben Dooks <ben-linux@fluff.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Rename CONFIG_S3C2410_LOWLEVEL_UART_PORT to be
CONFIG_S3C_LOWLEVEL_UART_PORT as we move to using
plat-s3c for base of S3C operations.
Signed-off-by: Ben Dooks <ben-linux@fluff.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Update the debug macros for use with the new per-cpu
configuration and usage.
Signed-off-by: Ben Dooks <ben-linux@fluff.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Move the common parts of the debug macros into include/asm-arm/plat-s3c
ready to be used for the common S3C support.
Signed-off-by: Ben Dooks <ben-linux@fluff.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
This patch moves items of the s3c24xx support into
a new plat-s3c directory for items that use the
s3c24xx support but are not directly s3c24xx
compatible, such as the s3c2400 and s3c6400.
git mv commands:
git mv include/asm-arm/arch-s3c2410/iic.h include/asm-arm/plat-s3c/iic.h
git mv include/asm-arm/arch-s3c2410/nand.h include/asm-arm/plat-s3c/nand.h
git mv include/asm-arm/arch-s3c2410/regs-iic.h include/asm-arm/plat-s3c/regs-iic.h
git mv include/asm-arm/arch-s3c2410/regs-nand.h include/asm-arm/plat-s3c/regs-nand.h
git mv include/asm-arm/arch-s3c2410/regs-rtc.h include/asm-arm/plat-s3c/regs-rtc.h
git mv include/asm-arm/arch-s3c2410/regs-serial.h include/asm-arm/plat-s3c/regs-serial.h
git mv include/asm-arm/arch-s3c2410/regs-timer.h include/asm-arm/plat-s3c/regs-timer.h
git mv include/asm-arm/arch-s3c2410/regs-watchdog.h include/asm-arm/plat-s3c/regs-watchdog.h
Signed-off-by: Ben Dooks <ben-linux@fluff.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
This patch adds the foundation pieces for
the Freescale MXC platforms, including
i.MX2 and i.MX3 based systems.
The bare-bones MX31 support in this patch
boots to the rootdev panic with 8250 serial
console configured "console=ttyS0,115200".
It assumes that Redboot is the boot loader.
Signed-off-by: Quinn Jensen <quinn.jensen@freescale.com>
Acked-by: Lennert Buytenhek <buytenh@wantstofly.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Selinux folks had been complaining about the lack of AVC_PATH
records when audit is disabled. I must admit my stupidity - I assumed
that avc_audit() really couldn't use audit_log_d_path() because of
deadlocks (== could be called with dcache_lock or vfsmount_lock held).
Shouldn't have made that assumption - it never gets called that way.
It _is_ called under spinlocks, but not those.
Since audit_log_d_path() uses ab->gfp_mask for allocations,
kmalloc() in there is not a problem. IOW, the simple fix is sufficient:
let's rip AUDIT_AVC_PATH out and simply generate pathname as part of main
record. It's trivial to do.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Acked-by: James Morris <jmorris@namei.org>
Right now the audit filter can match on = != > < >= blah blah blah.
This allow the filter to also look at bitwise AND operations, &
Signed-off-by: Eric Paris <eparis@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
blk_fill_sghdr_rq, blk_unmap_sghdr_rq, and blk_complete_sghdr_rq were
exported for bsg, however bsg was changed to support only sg v4.
Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Some HW platforms, such as the new cell blades, requires some MPIC sources
to be left alone by the operating system. This implements support for
a "protected-sources" property in the mpic controller node containing a list
of source numbers to be protected against operating system interference.
For those interested in the gory details, the MPIC on the southbridge of
those blades has some of the processor outputs routed to the cell, and
at least one routed as a GPIO to the service processor. It will be used
in the GA product for routing some of the southbridge error interrupts
to the service processor which implements some of the RAS stuff, such
as checkstopping when fatal errors occurs before they can propagate.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Acked-by: Arnd Bergmann <arnd.bergmann@de.ibm.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
...since it modifies it (when it sets the OF_DETACHED flag).
Signed-off-by: Segher Boessenkool <segher@kernel.crashing.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
The recent signal rework broke ARCH=ppc builds with the following
error:
CC arch/powerpc/kernel/signal.o
arch/powerpc/kernel/signal.c: In function Âdo_signalÂ:
arch/powerpc/kernel/signal.c:142: error: implicit declaration of
function Âset_dabrÂ
make[1]: *** [arch/powerpc/kernel/signal.o] Error 1
This fixes it by including a function prototype in asm-ppc/system.h.
Acked-by: Kumar Gala <galak@kernel.crashing.org>
Signed-off-by: Josh Boyer <jwboyer@linux.vnet.ibm.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
applied after Rafel's 'PM: Update global suspend and hibernation operations framework' patch set
Signed-off-by: Shaohua Li <shaohua.li@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
Based on the David Brownell's patch at
http://marc.info/?l=linux-acpi&m=117873972806360&w=2
updated by: Rafael J. Wysocki <rjw@sisk.pl>
Add a helper routine returning the lowest power (highest number) ACPI device
power state that given device can be in while the system is in the sleep state
indicated by acpi_target_sleep_state .
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Acked-by: Pavel Machek <pavel@ucw.cz>
Signed-off-by: Len Brown <len.brown@intel.com>
Split ACPI_DEBUG into function trace enabled and not enabled.
Function trace is most of the ACPI_DEBUG costs, but is
not much of use for kernel ACPI debugging.
Size of kernel image increased on test compile:
+ 48k (Full ACPI_DEBUG)
+ 35k (ACPI_DEBUG with function trace compiled out)
Performance without function trace is also much better.
Also remove ACPI_LV_DEBUG_OBJECT from default debug level as
a lot vendors let Store (value, debug) in their code and this
might confuse users when it pops up in syslog.
Signed-off-by: Thomas Renninger <trenn@suse.de>
Signed-off-by: Len Brown <len.brown@intel.com>
ACPI has a ton of macros which make a bunch of empty if's when configured
in non-debug mode.
[lenb: The code it complaines about is functionally correct,
so this patch is just to make -Wextra happier]
#define DBG()
if(...)
DBG();
next_c_statement
which turns into
if(...) ;
next_c_statement
Signed-off-by: Dave Jones <davej@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Len Brown <len.brown@intel.com>
Keep note of ThinkPad model, BIOS and EC firmware information, and log it
on startup. Makes for far more readable code in places, too.
This patch also adds Lenovo's PCI ID to the pci ids table.
Signed-off-by: Henrique de Moraes Holschuh <hmh@hmh.eng.br>
Signed-off-by: Len Brown <len.brown@intel.com>
* 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6:
[NET]: Add missing entries to family name tables
[NET]: Make NETDEVICES depend on NET.
[IPV6]: endianness bug in ip6_tunnel
[IrDA]: TOSHIBA_FIR depends on virt_to_bus
[IrDA]: EP7211 IR driver port to the latest SIR API
[IrDA] Typo fix in irnetlink.c copyright
[NET]: Fix loopback crashes when multiqueue is enabled.
[IPV4]: Fix inetpeer gcc-4.2 warnings
* 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/sparc-2.6:
[SPARC64]: ERROR: "sys_ioctl" [arch/sparc64/solaris/solaris.ko] undefined!
[SPARC32]: Make PAGE_SHARED a read-mostly variable.
[SPARC32]: Take enable_irq/disable_irq out of line.
[SPARC32]: clean include/asm-sparc/irq.h
[SPARC32]: Fix rounding errors in ndelay/udelay implementation.
Move stuff used only by arch/sparc/kernel/* into arch/sparc/kernel/irq.h
and into individual files in there (e.g. macros internal to sun4m_irq.c,
etc.)
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
The EP7211 SIR driver was the only one left without a new SIR API port.
Signed-off-by: Samuel Ortiz <samuel@sortiz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch introduces struct pci_sysdata to x86 and x86-64, and
converts the existing two users (NUMA, Calgary) to use it.
This lays the groundwork for having other users of sysdata, such as
the PCI domains work.
The Calgary bits are tested, the NUMA bits just look ok.
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Signed-off-by: Muli Ben-Yehuda <muli@il.ibm.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This builds upon the existing geode infrastructure, but adds southbridge
support, some GPIO functions, and a header file (asm-i386/geode.h) with some
useful GX/LX detection tests.
The majority of this code was written by Jordan Crouse.
Signed-off-by: Jordan Crouse <jordan.crouse@amd.com>
Signed-off-by: Andres Salomon <dilinger@debian.org>
Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
Cc: David Brownell <david-b@pacbell.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
get_vm_area always returns an area with an adjacent guard page. That guard
page is included in vm_struct.size. iounmap uses vm_struct.size to
determine how much address space needs to have change_page_attr applied to
it, which will BUG if applied to the guard page.
This patch adds a helper function - get_vm_area_size() in linux/vmalloc.h -
to return the actual size of a vm area, and uses it to make iounmap do the
right thing. There are probably other places which should be using
get_vm_area_size().
Thanks to Dave Young <hidave.darkstar@gmail.com> for debugging the
problem.
[ Andi, it wasn't clear to me whether x86_64 needs the same fix. ]
Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
Cc: Dave Young <hidave.darkstar@gmail.com>
Cc: Chuck Ebbert <cebbert@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
setup_pit_timer is declared in asm-i386/timer.h. Move it to the pit header
file, so it can be used by x86_64 as well.
Move also the PIT constants.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: john stultz <johnstul@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
For K8 system: 4G RAM with memory hole remapping enabled, or more than 4G
RAM installed. when using kexec to load second kernel. In the second
kernel, when mem is allocated for GART, it will do the memset for clear, it
will cause restart, because some device still used that for dma. solution
will be:
in second kernel: disable that at first before we try to allocate mem for
it. or in the first kernel: do disable that before shutdown.
Andi/Eric/Alan prefer to second one for clean shutdown in first kernel.
Andi also point out need to consider to AGP enable but mem less 4G case
too.
Signed-off-by: Yinghai Lu <yinghai.lu@sun.com>
Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Muli Ben-Yehuda <muli@il.ibm.com>
Cc: Vivek Goyal <vgoyal@in.ibm.com>
Cc: Dave Jones <davej@codemonkey.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Robert P. J. Day <rpjday@mindspring.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Add cpu_relax() to cmos_lock() inline function for faster operation on SMT
CPUs and less power consumption on others in case of lock contention (which
probably doesn't happen too often, so admittedly this patch is not too
exciting).
[akpm@linux-foundation.org: Include the header file for cpu_relax()]
Signed-off-by: Andreas Mohr <andi@lisas.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
mm/vmalloc.c: In function 'unmap_kernel_range':
mm/vmalloc.c:75: warning: unused variable 'start'
make it a C function so that the compiler thinks it used its arguments.
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The function name is set_fixmap(), not fixmap_set() as stated in the comment.
Also fix a typo, punctuation and lower/uppercase a bit.
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Replace the pcspkr private PIT lock by the global PIT lock to serialize the
PIT access all over the place.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Dmitry Torokhov <dtor@mail.ru>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Some interrupt entry points are currently defined in i8259.c They probably
belong in a header. Right now, their only user is init_IRQ, justifying
their declaration in-file. But when virtualization comes in, we may be
interested in using that functions in late initializations.
Signed-off-by: Glauber de Oliveira Costa <gcosta@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
On some systems the ACPI NVS area is located in the first 1 MB of RAM and
it is overwritten by the i386 code during the restore after hibernation.
This confuses the ACPI platform firmware that doesn't update the AC adapter
status appropriately as a result
(http://bugzilla.kernel.org/show_bug.cgi?id=7995).
The solution is to register the reserved memory in the first 1 MB as
'nosave', so that swsusp doesn't touch it during the restore. Also, this
has been done on x86_64 for a long time now, so this patch makes the i386
restore code behave like the x86_64 one.
[akpm@linux-foundation.org: build fix]
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Acked-by: Pavel Machek <pavel@ucw.cz>
Cc: David Rientjes <rientjes@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Provide seperate versions for Calgary and CalIOC2
Also print out the PCIe Root Complex Status on CalIOC2 errors
Signed-off-by: Muli Ben-Yehuda <muli@il.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Calgary and CalIOC2 share most of the same logic. Introduce struct
cal_chipset_ops for quirks and tce flush logic which are
[akpm@linux-foundation.org: make calgary_chip_ops static]
Signed-off-by: Muli Ben-Yehuda <muli@il.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The Rise CPUs were only very short-lived, and there are no reports of
anyone both owning one and running Linux on it.
Googling for the printk string "CPU: Rise iDragon" didn't find any dmesg
available online.
If it turns out that against all expectations there are actually users
reverting this patch would be easy.
This patch will make the kernel images smaller by a few bytes for all
i386 users.
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Acked-by: Dave Jones <davej@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Commit 59f4e7d572 fixed machine rebooting
on Truxton's machine (when no keyboard was present). But it broke it on
Lee's machine.
The patch reinstates the old (pre-59f4e7d572980a521b7bdba74ab71b21f5995538)
code and if that doesn't work out, try the new,
post-59f4e7d572980a521b7bdba74ab71b21f5995538 code instead.
Cc: Lee Garrett <lee-in-berlin@web.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Background:
/dev/mcelog is typically polled manually. This is less than optimal for
situations where accurate accounting of MCEs is important. Calling
poll() on /dev/mcelog does not work.
Description:
This patch adds support for poll() to /dev/mcelog. This results in
immediate wakeup of user apps whenever the poller finds MCEs. Because
the exception handler can not take any locks, it can not call the wakeup
itself. Instead, it uses a thread_info flag (TIF_MCE_NOTIFY) which is
caught at the next return from interrupt or exit from idle, calling the
mce_user_notify() routine. This patch also disables the "fake panic"
path of the mce_panic(), because it results in printk()s in the exception
handler and crashy systems.
This patch also does some small cleanup for essentially unused variables,
and moves the user notification into the body of the poller, so it is
only called once per poll, rather than once per CPU.
Result:
Applications can now poll() on /dev/mcelog. When an error is logged
(whether through the poller or through an exception) the applications are
woken up promptly. This should not affect any previous behaviors. If no
MCEs are being logged, there is no overhead.
Alternatives:
I considered simply supporting poll() through the poller and not using
TIF_MCE_NOTIFY at all. However, the time between an uncorrectable error
happening and the user application being notified is *the*most* critical
window for us. Many uncorrectable errors can be logged to the network if
given a chance.
I also considered doing the MCE poll directly from the idle notifier, but
decided that was overkill.
Testing:
I used an error-injecting DIMM to create lots of correctable DRAM errors
and verified that my user app is woken up in sync with the polling interval.
I also used the northbridge to inject uncorrectable ECC errors, and
verified (printk() to the rescue) that the notify routine is called and the
user app does wake up. I built with PREEMPT on and off, and verified
that my machine survives MCEs.
[wli@holomorphy.com: build fix]
Signed-off-by: Tim Hockin <thockin@google.com>
Signed-off-by: William Irwin <bill.irwin@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
For NUMA emulation, our SLIT should represent the true NUMA topology of the
system but our proximity domain to node ID mapping needs to reflect the
emulated state.
When NUMA emulation has successfully setup fake nodes on the system, a new
function, acpi_fake_nodes() is called. This function determines the proximity
domain (_PXM) for each true node found on the system. It then finds which
emulated nodes have been allocated on this true node as determined by its
starting address. The node ID to PXM mapping is changed so that each fake
node ID points to the PXM of the true node that it is located on.
If the machine failed to register a SLIT, then we assume there is no special
requirement for emulated node affinity so we use the default LOCAL_DISTANCE,
which is newly exported to this code, as our measurement if the emulated nodes
appear in the same PXM. Otherwise, we use REMOTE_DISTANCE.
PXM_INVAL and NID_INVAL are also exported to the ACPI header file so that we
can compare node_to_pxm() results in generic code (in this case, the SRAT
code).
Cc: Len Brown <lenb@kernel.org>
Signed-off-by: David Rientjes <rientjes@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This adds caching of pgds and puds, pmds, pte. That way we can avoid costly
zeroing and initialization of special mappings in the pgd.
A second quicklist is useful to separate out PGD handling. We can carry the
initialized pgds over to the next process needing them.
Also clean up the pgd_list handling to use regular list macros. There is no
need anymore to avoid the lru field.
Move the add/removal of the pgds to the pgdlist into the constructor /
destructor. That way the implementation is congruent with i386.
Signed-off-by: Christoph Lameter <clameter@sgi.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: "Luck, Tony" <tony.luck@intel.com>
Acked-by: William Lee Irwin III <wli@holomorphy.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Constrain __supported_pte_mask and NX handling to just the PAE kernel.
Signed-off-by: Jan Beulich <jbeulich@novell.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
When making changes to x86_64 timers, I noticed that touching hpet.h triggered
an unreasonably large rebuild. Untangling it from timex.h quiets the extra
rebuild quite a bit.
Cc: john stultz <johnstul@us.ibm.com>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Remove pit_interrupt_hook as it adds just an extra layer.
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This implements new vDSO for x86-64. The concept is similar
to the existing vDSOs on i386 and PPC. x86-64 has had static
vsyscalls before, but these are not flexible enough anymore.
A vDSO is a ELF shared library supplied by the kernel that is mapped into
user address space. The vDSO mapping is randomized for each process
for security reasons.
Doing this was needed for clock_gettime, because clock_gettime
always needs a syscall fallback and having one at a fixed
address would have made buffer overflow exploits too easy to write.
The vdso can be disabled with vdso=0
It currently includes a new gettimeofday implemention and optimized
clock_gettime(). The gettimeofday implementation is slightly faster
than the one in the old vsyscall. clock_gettime is significantly faster
than the syscall for CLOCK_MONOTONIC and CLOCK_REALTIME.
The new calls are generally faster than the old vsyscall.
Advantages over the old x86-64 vsyscalls:
- Extensible
- Randomized
- Cleaner
- Easier to virtualize (the old static address range previously causes
overhead e.g. for Xen because it has to create special page tables for it)
Weak points:
- glibc support still to be written
The VM interface is partly based on Ingo Molnar's i386 version.
Includes compile fix from Joachim Deguara
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
gcc 4.3 supports a new __attribute__((__cold__)) to mark functions cold. Any
path directly leading to a call of this function will be unlikely. And gcc
will try to generate smaller code for the function itself.
Please use with care. The code generation advantage isn't large and in most
cases it is not worth uglifying code with this.
This patch marks some common error functions like panic(), printk()
as cold. This will longer term make many unlikely()s unnecessary, although
we can keep them for now for older compilers.
BUG is not marked cold because there is currently no way to tell
gcc to mark a inline function told.
Also all __init and __exit functions are marked cold. With a non -Os
build this will tell the compiler to generate slightly smaller code
for them. I think it currently only uses less alignments for labels,
but that might change in the future.
One disadvantage over *likely() is that they cannot be easily instrumented
to verify them.
Another drawback is that only the latest gcc 4.3 snapshots support this.
Unfortunately we cannot detect this using the preprocessor. This means older
snapshots will fail now. I don't think that's a problem because they are
unreleased compilers that nobody should be using.
gcc also has a __hot__ attribute, but I don't see any sense in using
this in the kernel right now. But someday I hope gcc will be able
to use more aggressive optimizing for hot functions even in -Os,
if that happens it should be added.
Includes compile fix from Thomas Gleixner.
Cc: Jan Hubicka <jh@suse.cz>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The compiler generally generates reasonable inline code for the simple
cases and for the rest it's better for code size for them to be out of line.
Also there they can be potentially optimized more in the future.
In fact they probably should be in a .S file because they're all pure
assembly, but that's for another day.
Also some code style cleanup on them while I was on it (this seems
to be the last untouched really early Linux code)
This saves ~12k text for a defconfig kernel with gcc 4.1.
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Jan asked to always use the builtin memcpy on gcc 4.3 mainline because
it should generate better code than the old macro. Let's try it.
Cc: Jan Hubicka <jh@suse.cz>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
In acpi_scan_nodes(), we immediately return -1 if acpi_numa <= 0, meaning
we haven't detected any underlying ACPI topology or we have explicitly
disabled its use from the command-line with numa=noacpi.
acpi_table_print_srat_entry() and acpi_table_parse_srat() are only
referenced within drivers/acpi/numa.c, so we can mark them as static and
remove their prototypes from the header file.
Likewise, pxm_to_node_map[] and node_to_pxm_map[] are only used within
drivers/acpi/numa.c, so we mark them as static and remove their externs
from the header file.
The automatic 'result' variable is unused in acpi_numa_init(), so it's
removed.
Signed-off-by: David Rientjes <rientjes@google.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
On x86_64, <asm/ptrace.h> uses __user but doesn't include
<linux/compiler.h>. This could lead to build failures.
Signed-off-by: Jean Delvare <khali@linux-fr.org>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
i386 and sparc64 have the identical code to update the cmos clock. Move it
into kernel/time/ntp.c as there are other architectures coming along with the
same requirements.
[akpm@linux-foundation.org: build fixes]
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Chris Wright <chrisw@sous-sol.org>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: john stultz <johnstul@us.ibm.com>
Cc: David Miller <davem@davemloft.net>
Cc: Roman Zippel <zippel@linux-m68k.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
We need to make sure, that the clockevent devices are resumed, before
the tick is resumed. The current resume logic does not guarantee this.
Add CLOCK_EVT_MODE_RESUME and call the set mode functions of the clock
event devices before resuming the tick / oneshot functionality.
Fixup the existing users.
Thanks to Nigel Cunningham for tracking down a long standing thinko,
which affected the jinxed VAIO.
[akpm@linux-foundation.org: xen build fix]
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: john stultz <johnstul@us.ibm.com>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This is an variation on the patch sent by Christoph Hellwig which kills
file_count abuse by the Coda kernel module by moving the coda_flush
functionality into coda_release. However part of reason we were using the
coda_flush callback was to allow Coda to pass errors that occur during
writeback from the userspace cache manager back to close().
As Al Viro explained on linux-fsdevel, it is impossible to guarantee that
such errors can in fact be returned back to the caller. There are many
cases where the last reference to a file is not released by the close
system call and it is also impossible to pick some close as a 'last-close'
and delay it until all other references have been destroyed.
The CODA_STORE/CODA_RELEASE upcall combination is clearly a broken design,
and it is better to remove it completely.
Signed-off-by: Jan Harkes <jaharkes@cs.cmu.edu>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Al Viro <viro@ftp.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Too many semicolons in this macro.
Signed-off-by: Nick Piggin <npiggin@suse.de>
Cc: Andi Kleen <ak@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Currently, bsg doesn't make class backlinks (a process whereby you'd get
a link to bsg in the device directory in the same way you get one for
sg). This is because the bsg device is uninitialised, so the class
device has nothing it can attach to. The fix is to make the bsg device
point to the cdevice of the entity creating the bsg, necessitating
changing the bsg_register_queue() prototype into a form that takes the
generic device.
Acked-by: FUJITA Tomonori <tomof@acm.org>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
* 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/sparc-2.6:
[SPARC64]: fix section mismatch warning in mdesc.c
[SPARC64]: fix section mismatch warning in pci_sunv4
[SPARC64]: Stop using drivers/char/rtc.c
[SPARC64]: Convert parport to of_platform_driver.
[SPARC]: Implement fb_is_primary_device().
[SPARC64]: Fix virq decomposition.
[SPARC64]: Use KERN_ERR in IRQ manipulation error printks.
[SPARC64]: Do not flood log with failed DS messages.
[SPARC64]: Add proper multicast support to VNET driver.
[SPARC64]: Handle multiple domain-services-port nodes properly.
[SPARC64]: Improve VIO device naming further.
[SPARC]: Make sure dev_archdata is filled in for all devices.
[SPARC]: Define minimal struct dev_archdata, similarly to sparc64.
[SPARC]: Fix serial console device detection.
The current scheme works on static interpretation of text names, which
is wrong.
The output-device setting, for example, must be resolved via an alias
or similar to a full path name to the console device.
Paths also contain an optional set of 'options', which starts with a
colon at the end of the path. The option area is used to specify
which of two serial ports ('a' or 'b') the path refers to when a
device node drives multiple ports. 'a' is assumed if the option
specification is missing.
This was caught by the UltraSPARC-T1 simulator. The 'output-device'
property was set to 'ttya' and we didn't pick upon the fact that this
is an OBP alias set to '/virtual-devices/console'. Instead we saw it
as the first serial console device, instead of the hypervisor console.
The infrastructure is now there to take advantage of this to resolve
the console correctly even in multi-head situations in fbcon too.
Thanks to Greg Onufer for the bug report.
Signed-off-by: David S. Miller <davem@davemloft.net>
* 'master' of ssh://master.kernel.org/pub/scm/linux/kernel/git/mchehab/v4l-dvb:
V4L/DVB (5880): wm8775/wm8739: Fix memory leak when unloading module
V4L/DVB (5877): radio-gemtek-pci: remove unused structure member
V4L/DVB (5871): Conexant 2388x: check for kthread_run
V4L/DVB (5869): Add check for valid control ID to v4l2_ctrl_next.
V4L/DVB (5867): videodev2.h: add missing <sys/time.h> for userspace
V4L/DVB (5866): ivtv: fix DMA timeout when capturing VBI + another stream
V4L/DVB (5865): Remove usage of HZ on ivtv driver, replacing by msecs_to_jiffies
V4L/DVB (5861): Use msecs_to_jiffies instead of HZ on bttv, cx88 and saa7134
V4L/DVB (5860): Use msecs_to_jiffies instead of HZ on some webcam drivers
V4L/DVB (5859): use msecs_to_jiffies on InfraRed RC5 timeout
V4L/DVB (5858): Use msecs_to_jiffies instead of HZ on media/video I2C drivers
V4L/DVB (5857): Use msecs_to_jiffies instead of HZ on radio drivers
V4L/DVB (5855): ivtv: fix Kconfig typo and refer to the driver homepage.
V4L/DVB (5854): ivtv: cleanup of driver messages
V4L/DVB (5853): ivtv: add support to suppress high volume i2c debug messages.
V4L/DVB (5852): ivtv: don't recompile needlessly
V4L/DVB (5851): ivtv: fix missing I2C_ALGOBIT config option
V4L/DVB (5850): ivtv: improve API command debugging
V4L/DVB (5848): Av7110: fix typo
* 'for-2.6.23' of master.kernel.org:/pub/scm/linux/kernel/git/arnd/cell-2.6: (37 commits)
[CELL] spufs: rework list management and associated locking
[CELL] oprofile: add support to OProfile for profiling CELL BE SPUs
[CELL] oprofile: enable SPU switch notification to detect currently active SPU tasks
[CELL] spu_base: locking cleanup
[CELL] cell: indexing of SPUs based on firmware vicinity properties
[CELL] spufs: integration of SPE affinity with the scheduller
[CELL] cell: add placement computation for scheduling of affinity contexts
[CELL] spufs: extension of spu_create to support affinity definition
[CELL] cell: add hardcoded spu vicinity information for QS20
[CELL] cell: add vicinity information on spus
[CELL] cell: add per BE structure with info about its SPUs
[CELL] spufs: use find_first_bit() instead of sched_find_first_bit()
[CELL] spufs: remove unused file argument from spufs_run_spu()
[CELL] spufs: change decrementer restore timing
[CELL] spufs: dont halt decrementer at restore step 47
[CELL] spufs: limit saving MFC_CNTL bits
[CELL] spufs: fix read and write for decr_status file
[CELL] spufs: fix decr_status meanings
[CELL] spufs: remove needless context save/restore code
[CELL] spufs: fix array size of channel index
...
This patch adds the necessary ifdef's to the proc-v7.S code and
defines the v7wbi_tlb_fns macro in pgtable-nommu.h
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
When videodev2.h is included by an application, it needs to include
<sys/time.h> for the timeval struct.
Signed-off-by: Hans Verkuil <hverkuil@xs4all.nl>
Signed-off-by: Mauro Carvalho Chehab <mchehab@infradead.org>
These patches add full SSP/MCU support for the HP Jornada 720
machine. Its needed to handle keyboard, touchscreen, battery
and backlight/lcd.
The main driver exports functions and the header file exports
the command values. When talking to the MCU the general procedure
is to start MCU, send command (using ssp_inout(command)), the
proper reply is always TXDUMMY. After receiving TXDUMMY you can
send the value you wish pushed (for example brightness level).
End with ssp_end() so the spinlock gets unlocked.
Drivers using this havent been implemented yet, but will shortly.
Signed-off-by: Kristoffer Ericson <Kristoffer.ericson@gmail.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
With this patch, Kconfig only selects CPU_HAS_ASID for the MMU
case. It also corrects the typo in the v6wbi_tlb_fns definition in
pgtable-nommu.h.
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
While bisecting an iop13xx compile failure I noticed that
include/asm-arm/hwcap.h should be included from include/asm-arm/elf.h
outside #ifndef __ASSEMBLY__ since hwcap.h has its own __ASSEMBLY__
protections.
Cc: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
This sorts out the various lists and related locks in the spu code.
In detail:
- the per-node free_spus and active_list are gone. Instead struct spu
gained an alloc_state member telling whether the spu is free or not
- the per-node spus array is now locked by a per-node mutex, which
takes over from the global spu_lock and the per-node active_mutex
- the spu_alloc* and spu_free function are gone as the state change is
now done inline in the spufs code. This allows some more sharing of
code for the affinity vs normal case and more efficient locking
- some little refactoring in the affinity code for this locking scheme
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Arnd Bergmann <arnd.bergmann@de.ibm.com>
From: Maynard Johnson <mpjohn@us.ibm.com>
This patch updates the existing arch/powerpc/oprofile/op_model_cell.c
to add in the SPU profiling capabilities. In addition, a 'cell' subdirectory
was added to arch/powerpc/oprofile to hold Cell-specific SPU profiling code.
Exports spu_set_profile_private_kref and spu_get_profile_private_kref which
are used by OProfile to store private profile information in spufs data
structures.
Also incorporated several fixes from other patches (rrn). Check pointer
returned from kzalloc. Eliminated unnecessary cast. Better error
handling and cleanup in the related area. 64-bit unsigned long parameter
was being demoted to 32-bit unsigned int and eventually promoted back to
unsigned long.
Signed-off-by: Carl Love <carll@us.ibm.com>
Signed-off-by: Maynard Johnson <mpjohn@us.ibm.com>
Signed-off-by: Bob Nelson <rrnelson@us.ibm.com>
Signed-off-by: Arnd Bergmann <arnd.bergmann@de.ibm.com>
Acked-by: Paul Mackerras <paulus@samba.org>
This patch adds support for additional flags at spu_create, which relate
to the establishment of affinity between contexts and contexts to memory.
A fourth, optional, parameter is supported. This parameter represent
a affinity neighbor of the context being created, and is used when defining
SPU-SPU affinity.
Affinity is represented as a doubly linked list of spu_contexts.
Signed-off-by: Andre Detsch <adetsch@br.ibm.com>
Signed-off-by: Arnd Bergmann <arnd.bergmann@de.ibm.com>
This patch adds affinity data to each spu instance.
A doubly linked list is created, meant to connect the spus
in the physical order they are placed in the BE. SPUs
near to memory should be marked as having memory affinity.
Adjustments of the fields acording to FW properties is done
in separate patches, one for CPBW, one for Malta (patch for
Malta under testing).
Signed-off-by: Andre Detsch <adetsch@br.ibm.com>
Signed-off-by: Arnd Bergmann <arnd.bergmann@de.ibm.com>
Addition of a spufs-global "cbe_info" array. Each entry contains information
about one Cell/B.E. node, namelly:
* list of spus (both free and busy spus are in this list);
* list of free spus (replacing the static spu_list from spu_base.c)
* number of spus;
* number of reserved (non scheduleable) spus.
SPE affinity implementation actually requires only access to one spu per
BE node (since it implements its own pointer to walk through the other spus
of the ring) and the number of scheduleable spus (n_spus - non_sched_spus)
However having this more general structure can be useful for other
functionalities, concentrating per-cbe statistics / data.
Signed-off-by: Andre Detsch <adetsch@br.ibm.com>
Signed-off-by: Arnd Bergmann <arnd.bergmann@de.ibm.com>
The decr_status in the LSCSA is confusedly used as two meanings:
* SPU decrementer was running
* SPU decrementer was wrapped as a result of adjust
and the code to set decr_status is missing.
This patch fixes these problems by using the decr_status argument as a
set of flags. This requires a rebuild of the shipped spu_restore code.
Signed-off-by: Masato Noguchi <Masato.Noguchi@jp.sony.com>
Signed-off-by: Jeremy Kerr <jk@ozlabs.org>
Signed-off-by: Arnd Bergmann <arnd.bergmann@de.ibm.com>
This patch exports per-context statistics in spufs as long as spu
statistics in sysfs.
It was formed by merging:
"spufs: add spu stats in sysfs" From: Christoph Hellwig
"spufs: add stat file to spufs" From: Christoph Hellwig
"spufs: fix libassist accounting" From: Jeremy Kerr
"spusched: fix spu utilization statistics" From: Luke Browning
And some adjustments by myself, after suggestions on cbe-oss-dev.
Having separate patches was making the review process harder
than it should, as we end up integrating spus and ctx statistics
accounting much more than it was on the first implementation.
Signed-off-by: Andre Detsch <adetsch@br.ibm.com>
Signed-off-by: Jeremy Kerr <jk@ozlabs.org>
Signed-off-by: Arnd Bergmann <arnd.bergmann@de.ibm.com>
The current SPU context saving procedure in SPUFS unexpectedly
restarts MFC when halting decrementer, because MFC_CNTL[Dh] is set
without MFC_CNTL[Sm]. This bug causes, for example, saving broken DMA
queues. Here is a patch to fix the problem.
Signed-off-by: Kazunori Asayama <asayama@sm.sony.co.jp>
Signed-off-by: Jeremy Kerr <jk@ozlabs.org>
Signed-off-by: Arnd Bergmann <arnd.bergmann@de.ibm.com>
This patch adds support for investigating spus information after a
kernel crash event, through kdump vmcore file.
Implementation is based on xmon code, but the new functionality was
kept independent from xmon.
Signed-off-by: Lucio Jose Herculano Correia <luciojhc@br.ibm.com>
Signed-off-by: Andre Detsch <adetsch@br.ibm.com>
Signed-off-by: Arnd Bergmann <arnd.bergmann@de.ibm.com>
The pmi driver got simplified by removing support for multiple devices.
As there is no more than one pmi device per maschine, there is no need to
specify the device for listening and sending messages.
This way the caller (cbe_cpufreq) doesn't need to scan the device tree.
When registering the handler on a board without a pmi
interface, pmi.c will just return -ENODEV.
The patch that fixed the breakage of cell_defconfig has been
broken out of the earlier version of this patch. So this is
the version that applies cleanly on top of it.
Signed-off-by: Christian Krafft <krafft@de.ibm.com>
Signed-off-by: Arnd Bergmann <arnd.bergmann@de.ibm.com>
The comparison with ZERO_SIZE_PTR in ZERO_OR_NULL_PTR() needs to be <=
(not just <) so that ZERO_OR_NULL_PTR(ZERO_SIZE_PTR) is 1.
Signed-off-by: Roland Dreier <rolandd@cisco.com>
[ Duh! - Linus ]
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux-2.6:
[IA64] Prevent people from directly including <asm/rwsem.h>.
[IA64] remove time interpolator
[IA64] Convert to generic timekeeping/clocksource
[IA64] refresh some config files for 64K pagesize
[IA64] Delete iosapic_free_rte()
[IA64] fallocate system call
[IA64] Enable percpu vector domain for IA64_DIG
[IA64] Enable percpu vector domain for IA64_GENERIC
[IA64] Support irq migration across domain
[IA64] Add support for vector domain
[IA64] Add mapping table between irq and vector
[IA64] Check if irq is sharable
[IA64] Fix invalid irq vector assumption for iosapic
[IA64] Use dynamic irq for iosapic interrupts
[IA64] Use per iosapic lock for indirect iosapic register access
[IA64] Cleanup lock order in iosapic_register_intr
[IA64] Remove duplicated members in iosapic_rte_info
[IA64] Remove block structure for locking in iosapic.c
Remove time_interpolator code (This is generic code, but
only user was ia64. It has been superseded by the
CONFIG_GENERIC_TIME code).
Signed-off-by: Bob Picco <bob.picco@hp.com>
Signed-off-by: John Stultz <johnstul@us.ibm.com>
Signed-off-by: Peter Keilty <peter.keilty@hp.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
This is a merge of Peter Keilty's initial patch (which was
revived by Bob Picco) for this with Hidetoshi Seto's fixes
and scaling improvements.
Acked-by: Bob Picco <bob.picco@hp.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
There's currently no destructor for the bsg components. If you insert
and remove the module, you see the bsg devices building up and up. This
patch adds the destructor in the correct place in the transport class so
that the bsg and request queue are removed just before the device
destruction.
Acked-by: FUJITA Tomonori <tomof@acm.org>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
1. split pxa_cpu_suspend to pxa25x_cpu_suspend and pxa27x_cpu_suspend
and make pxa25x_cpu_pm_enter() and pxa27x_cpu_pm_enter() to invoke
the corresponding _suspend functions, thus remove all those ugly
#ifdef .. #endif out of sleep.S
2. move the declarations of those suspend functions to pm.h
note: this is not a clean enough solution until all the pxa25x and
pxa27x specific part is further removed out of sleep.S, sleep.S is
supposed to contain generic code only
Signed-off-by: eric miao <eric.y.miao@gmail.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
1. introduce a structure pxa_cpu_pm_fns for pxa25x/pxa27x specific
operations as follows:
struct pxa_cpu_pm_fns {
int save_size;
void (*save)(unsigned long *);
void (*restore)(unsigned long *);
int (*valid)(suspend_state_t state);
void (*enter)(suspend_state_t state);
}
2. processor specific registers saving and restoring are performed
by calling the corresponding (*save) and (*restore)
3. pxa_cpu_pm_fns->save_size should be initialized to the required
size for processor specific registers saving, the allocated
memory address will be passed to (*save) and (*restore)
memory allocation happens early in pxa_pm_init(), and save_size
should be assigned prior to this (which is usually true, since
pxa_pm_init() happens in device_initcall()
4. there're some redundancies for those SLEEP_SAVE_XXX and related
macros, will be fixed later, one way possible is for the system
devices to handle the specific registers saving and restoring
Signed-off-by: eric miao <eric.y.miao@gmail.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
* 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/sfr/ofcons:
Create drivers/of/platform.c
Create linux/of_platorm.h
[SPARC/64] Rename some functions like PowerPC
Begin consolidation of of_device.h
Begin to consolidate of_device.c
Consolidate of_find_node_by routines
Consolidate of_get_next_child
Consolidate of_get_parent
Consolidate of_find_property
Consolidate of_device_is_compatible
Start split out of common open firmware code
Split out common parts of prom.h
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input:
Input: appletouch - improve powersaving for Geyser3 devices
Input: lifebook - fix an oops on Panasonic CF-18
Input: document intended meaning of KEY_SWITCHVIDEOMODE
Input: switch to using seq_list_xxx helpers
Input: i8042 - give more trust to PNP data on i386
Input: add driver for Fujitsu serial touchscreens
Input: ads7846 - re-check pendown status before reporting events
Input: ads7846 - introduce sample settling delay
Input: xpad - add support for leds on xbox 360 pad
* master.kernel.org:/pub/scm/linux/kernel/git/lethal/sh-2.6: (26 commits)
sh: intc - add support for SH7750 and its variants
sh: Move entry point code to .text.head.
sh: heartbeat: Shut up resource size warning.
sh: update r2d defconfig and fix SH7751R pci compliation
sh: Many symbol exports for nommu allmodconfig.
sh: zero terminate 8250 platform data for r2d board
sh: cpufreq: Fix up the build for SH-2.
sh: Make on-chip DMA channel selection explicit.
sh: Fix up CPU dependencies for on-chip DMAC.
sh: cpufreq: clock framework support.
sh: Support rate rounding for SH7722 FRQCR clocks.
sh: Implement clk_round_rate() in the clock framework.
sh: Fix up PCI section mismatch warnings.
sh: Wire up fallocate() syscall.
sh: intc - add support for 7780
sh: intc - improve group support
sh: Fix up SH-3 and SH-4 driver dependencies.
sh: push-switch: Correct license string.
sh: cpufreq: Fix driver dependencies and flag as broken.
sh: IPR/INTC2 IRQ setup consolidation.
...
* 'linus' of master.kernel.org:/pub/scm/linux/kernel/git/perex/alsa: (102 commits)
[ALSA] version 1.0.14
[ALSA] remove duplicate Logitech Quickcam USB ID in usbquirks.h
[ALSA] hda-codec - Fix input with STAC92xx
[ALSA] hda-intel: support for iMac 24'' released on 09/2006
[ALSA] hda-codec - Add quirk for Asus P5LD2
[ALSA] snd-ca0106: Add support for X-Fi Extreme Audio.
[ALSA] snd-emu10k1:Enable E-Mu 1616m notebook firmware loading.
[ALSA] snd-emu10k1: Initial support for E-Mu 1616 and 1616m.
[ALSA] cs46xx - Fix PM resume
[ALSA] hda: Enable SPDIF in/out on some stac9205 boards
[ALSA] timer: check for incorrect device state in non-debug compiles, too
[ALSA] snd-aoa-codec-onyx: fix typo
[ALSA] hda-codec - Add quirks for HP dx2200/dx2250
[ALSA] hda-codec - Rename HP model-specific quirks
[ALSA] hda-codec - Add quirk for HP Samba
[ALSA] hda-codec - Add LG LW20 line-in capture source
[ALSA] usb-audio - Fix AC3 with M-Audio Audiophile USB
[ALSA] hda: stac9202 mixer fix
[ALSA] Make s3c24xx_i2s_set_clkdiv() change the correct bits
[ALSA] hda-codec - Add LG LW20 si3054 modem id
...
* master.kernel.org:/pub/scm/linux/kernel/git/lethal/sh64-2.6:
sh64: Flag sh64_get_page() as __init_refok.
sh64: Move entry point code to .text.head.
sh64: Fix up PCI section mismatch warnings.
sh64: Update cayman defconfig.
sh64: Wire up fallocate() syscall.
* 'upstream-linus' of master.kernel.org:/pub/scm/linux/kernel/git/jgarzik/libata-dev: (29 commits)
libata: implement EH fast drain
libata: schedule probing after SError access failure during autopsy
libata: clear HOTPLUG flag after a reset
libata: reorganize ata_ehi_hotplugged()
libata: improve SCSI scan failure handling
libata: quickly trigger SATA SPD down after debouncing failed
libata: improve SATA PHY speed down logic
The SATA controller device ID is different according to
ahci: implement SCR_NOTIFICATION r/w
ahci: make NO_NCQ handling more consistent
libata: make ->scr_read/write callbacks return error code
libata: implement AC_ERR_NCQ
libata: improve EH report formatting
sata_sil24: separate out sil24_do_softreset()
sata_sil24: separate out sil24_exec_polled_cmd()
sata_sil24: replace sil24_update_tf() with sil24_read_tf()
ahci: separate out ahci_do_softreset()
ahci: separate out ahci_exec_polled_cmd()
ahci: separate out ahci_kick_engine()
ahci: use deadline instead of fixed timeout for 1st FIS for SRST
...
Andrew Morton:
[async_memcpy] is very wrong if both ASYNC_TX_KMAP_DST and
ASYNC_TX_KMAP_SRC can ever be set. We'll end up using the same kmap
slot for both src add dest and we get either corrupted data or a BUG.
Evgeniy Polyakov:
Btw, shouldn't it always be kmap_atomic() even if flag is not set.
That pages are usual one returned by alloc_page().
So fix the usage of kmap_atomic and kill the ASYNC_TX_KMAP_DST and
ASYNC_TX_KMAP_SRC flags.
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Evgeniy Polyakov <johnpol@2ka.mipt.ru>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/sparc-2.6:
[SPARC64]: Fix two year old bug in early bootup asm.
[SPARC64]: Update defconfig.
[SPARC64]: Fix log message type in vio_create_one().
[SPARC64]: Tweak assertions in sun4v_build_virq().
[SPARC64]: Tweak kernel log messages in power_probe().
[SPARC64]: Fix handling of multiple vdc-port nodes.
[SPARC64]: Fix device type matching in VIO's devspec_show().
[SPARC64]: Fix MODULE_DEVICE_TABLE() specification in VDC and VNET.
[SPARC]: Add sys_fallocate() entries.
[SPARC64]: Use orderly_poweroff().
Since we have use like ~SLUB_DMA, we ought to have the type
set right in both cases.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
In most cases, when EH is scheduled, all in-flight commands are
aborted causing EH to kick in immediately. However, in some cases
(especially with PMP), it's unclear which commands are affected by the
error condition and although aborting all in-flight commands work, it
isn't optimal and may cause unnecessary disruption. On the other
hand, waiting for in-flight commands to drain themselves can take up
to 30seconds.
This patch implements EH fast drain to handle such situations. It
gives in-flight commands some time to finish up but doesn't wait for
too long. After EH is scheduled, fast drain timer is started and if
no other completion occurs in ATA_EH_FASTDRAIN_INTERVAL all in-flight
commands are aborted. If any completion occurred in the interval, the
port is given another interval to finish up itself.
Currently ATA_EH_FASTDRAIN_INTERVAL is 3 secs which should be enough
for finishing up most commands.
Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
__ata_ehi_hotplugged() now has no users. Regorganize
ata_ehi_hotplugged() such that a new function ata_ehi_schedule_probe()
deals with scheduling probing. ata_ehi_hotplugged() calls it and
additionally marks hotplug specific flags. ata_ehi_schedule_probe()
will be used laster.
Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
sata_down_spd_limit() first reads the current SPD from SStatus and
limit the speed to the lower one of one below the current limit or one
below the current SPD in SStatus. SPD may not be accessible or valid
when SPD down is requested making sata_down_spd_limit() fail when it's
most needed.
This patch makes the current SPD cached after each successful reset
and forces GEN I speed (1.5Gbps) if neither of SStatus or the cached
value is valid, so sata_down_spd_limit() is now guaranteed to lower
the speed limit if lower speed is available.
Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Convert ->scr_read/write callbacks to return error code to better
indicate failure. This will help handling of SCR_NOTIFICATION.
Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
When an NCQ command fails, all commands in flight are aborted and the
offending one is reported using log page 10h. Depending on controller
characteristics and LLD implementation, all commands may appear as
having a device error due to shared TF status making it hard to
determine what's actually going on.
This patch adds AC_ERR_NCQ, marks the command reported by log page 10h
with it and print extra "<F>" after the error report for the command
to help distinguishing the offending command.
Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Requiring LLDs to format multiple error description messages properly
doesn't work too well. Help LLDs a bit by making ata_ehi_push_desc()
insert ", " on each invocation. __ata_ehi_push_desc() is the raw
version without the automatic separator.
While at it, make ehi_desc interface proper functions instead of
macros.
Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Add @is_cmd to ata_tf_to_fis(). This controls bit 7 of the second
byte which tells the device whether this H2D FIS is for a command or
not. This cleans up ahci a bit and will be used by PMP.
Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
This patch converts the cpu specific 7750 setup code to use the
new intc controller. Many new vectors are added and multiple
processor variants including 7091, 7750, 7750s, 7750r, 7751 and
7751r should all have the correct vectors hooked up.
IRLM interrupts can be enabled using ipr_irq_enable_irlm() which
now is marked as __init.
Signed-off-by: Magnus Damm <damm@igel.co.jp>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
Fixed PM resume of cs46xx devices. It now restores properly the DSP
image and kick-off the DSP.
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Jaroslav Kysela <perex@suse.cz>
* adding 8 more 32-bit capture channels (total of 16) for emu1010 cards
* adding some code comments and card details description
Signed-off-by: Pavel Hofman <dustin@seznam.cz>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Jaroslav Kysela <perex@suse.cz>
Add support for Cyrix/NatSemi Geode SC5530 (VSA1).
The driver is snd-cs5530.
Signed-off-by Ash Willis <ashwillis@programmer.net>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Jaroslav Kysela <perex@suse.cz>
This patch adds the support of mute for front channels of M-Audio
Revolution 7.1 (the DAC AK4381 features a mute bit).
Signed-off-by: Pavel Hofman <dustin@seznam.cz>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Jaroslav Kysela <perex@suse.cz>
This patch adds I2C ID for the LM4857 audio amp and corrects the spacing
of the WM8731, WM8750 and WM8753 ID's.
Signed-off-by: Liam Girdwood <lg@opensource.wolfsonmicro.com>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Jaroslav Kysela <perex@suse.cz>
I changed the naming to be more obvious---unfortunately the HRM
doesn't specify these.
Moreover the numbering is changed to be zero indexed as this is more
natural.
Adjust all callers.
Signed-off-by: Uwe Kleine-König <ukleinek@informatik.uni-freiburg.de>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Add definitions for RDPROOF, WRPROOF and PDCFBYTE bits of the Mode
Register in the updated MMC controller found on the AT91SAM9260 and
AT91SAM9263 processors.
Signed-off-by: Andrew Victor <andrew@sanpeople.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Avoid the virt_to_bus()/bus_to_virt() warnings in floppy.c caused
by the (useless) double conversion to/from bus addresses.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
In order for this driver to be shared across the iop architectures the
iop3xx and iop13xx header files are modified to present a common interface
for the iop_wdt driver.
Details:
* iop13xx supports disabling the timer while iop3xx does not. This requires
a few 'compatibility' definitions in include/asm-arm/hardware/iop3xx.h to
preclude adding #ifdef CONFIG_ARCH_IOP13XX blocks to the driver code.
* The heartbeat interval is derived from the internal bus clock rate, so this
this patch also exports the tick rate to the iop_wdt driver.
Cc: Curt Bruns <curt.e.bruns@intel.com>
Cc: Peter Milne <peter.milne@d-tacq.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Acked-by: Wim Van Sebroeck <wim@iguana.be>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
arch/arm/boot/compressed/misc.o: In function `valid_user_regs':
misc.c:(.text+0x74): undefined reference to `elf_hwcap'
This triggers after the various elf_hwcap cleanups in:
f884b1cf57d1cbbd6b41
include/asm-arm/arch-iop13xx/uncompress.h calls cpu_relax while spinning on
a register value. cpu_relax requires processor.h->ptrace.h->hwcap.h
'elf_hwcap' is defined as an extern, but since the uncompressor does not
link against arch/arm/kernel/setup.c 'elf_hwcap' remains undefined.
Fix is to open code the cpu_relax() call as barrier().
Cc: Lennert Buytenhek <kernel@wantstofly.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
This patch adds the basic support for the em7210 board. It is similar to
the iq31244 board and can be found on Intel "Baxter Creek" ss4000e nas.
Signed-off-by: Arnaud Patard <arnaud.patard@rtp-net.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
If we have two processes with different ioprio_class, but the same
ioprio_data, their async requests will fall into the same queue. I guess
such behavior is not expected, because it's not right to put real-time
requests and best-effort requests in the same queue.
The attached patch fixes the problem by introducing additional *cfqq
fields on cfqd, pointing to per-(class,priority) async queues.
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
This is an optional component of the clock framework. However,
as we're going to be using this in the cpufreq drivers, add
support for it to the framework.
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
The "id" property in vdc-port nodes are not unique, they
are all zero. Therefore assign ID's using the parent's
"cfg-handle" property which will be unique.
Signed-off-by: David S. Miller <davem@davemloft.net>
and populate it with the common parts from PowerPC and Sparc[64].
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Acked-by: Paul Mackerras <paulus@samba.org>
Acked-by: David S. Miller <davem@davemloft.net>
Move common stuff from asm-powerpc/of_platform.h to here and
move the common bits from asm-sparc*/of_device.h here as well.
Create asm-sparc*/of_platform.h and move appropriate parts of
of_device.h to them.
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Acked-by: Paul Mackerras <paulus@samba.org>
Acked-by: David S. Miller <davem@davemloft.net>
This is to make the of merge easier. Also rename of_bus_type.
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Acked-by: David S. Miller <davem@davemloft.net>
This just moves the common stuff from the arch of_device.h files to
linux/of_device.h.
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Acked-by: Paul Mackerras <paulus@samba.org>
Acked-by: David S. Miller <davem@davemloft.net>
This consolidates the routines of_find_node_by_path, of_find_node_by_name,
of_find_node_by_type and of_find_compatible_device. Again, the comparison
of strings are done differently by Sparc and PowerPC and also these add
read_locks around the iterations.
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Acked-by: Paul Mackerras <paulus@samba.org>
Acked-by: David S. Miller <davem@davemloft.net>
This requires creating dummy of_node_{get,put} routines for sparc and
sparc64. It also adds a read_lock around the parent accesses.
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Acked-by: Paul Mackerras <paulus@samba.org>
Acked-by: David S. Miller <davem@davemloft.net>
The only change here is that a readlock is taken while the property list
is being traversed on Sparc where it was not taken previously.
Also, Sparc uses strcasecmp to compare property names while PowerPC
uses strcmp.
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Acked-by: Paul Mackerras <paulus@samba.org>
Acked-by: David S. Miller <davem@davemloft.net>
The only difference here is that Sparc uses strncmp to match compatibility
names while PowerPC uses strncasecmp.
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Acked-by: Paul Mackerras <paulus@samba.org>
Acked-by: David S. Miller <davem@davemloft.net>
This creates drivers/of/base.c (depending on CONFIG_OF) and puts
the first trivially common bits from the prom.c files into it.
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Acked-by: Paul Mackerras <paulus@samba.org>
Acked-by: David S. Miller <davem@davemloft.net>
This patch converts the cpu specific 7780 setup code to use the
new intc controller. Many new vectors are added and also support for
external interrupt sense configuration. So with this patch it is now
possible to configure external interrupt pins as edge or level
triggered using set_irq_type().
No external interrupts are registered by default.
Use plat_irq_setup_pins() to select between IRQ or IRL mode.
This patch also fixes the Alarm IRQ for the RTC.
Signed-off-by: Magnus Damm <damm@igel.co.jp>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
This patch unifies the cpu specific interrupt setup functions for
interrupt controller blocks such as ipr, intc2 and intc. There is no
point in having separate functions for each interrupt controller, so
let's clean this up.
Signed-off-by: Magnus Damm <damm@igel.co.jp>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
This patch cleans up solution engine 7722 specific interrupt code.
The main purpose is to replace the mux function with use of
set_irq_chained_handler() and replace hard coded register poking
code with set_irq_type(). The board specific interrupts are also
moved to start from SE7722_FPGA_IRQ_BASE.
Signed-off-by: Magnus Damm <damm@igel.co.jp>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
This patch converts the cpu specific 7722 setup code to use the
new intc controller. Many new vectors are added and also support
for external interrupt sense configuration. So with this patch
it is now possible to configure external interrupt pins as edge
or level triggered using set_irq_type().
Signed-off-by: Magnus Damm <damm@igel.co.jp>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
This is the second version of the shared interrupt controller patch
for the sh architecture, fixing up handling of intc_reg_fns[].
The three main advantages with this controller over the existing
ones are:
- Both priority (ipr) and bitmap (intc2) registers are
supported
- External pin sense configuration is supported, ie edge
vs level triggered
- CPU/Board specific code maps 1:1 with datasheet for
easy verification
This controller can easily coexist with the current IPR and INTC2
controllers, but the idea is that CPUs/Boards should be moved over
to this controller over time so we have a single code base to
maintain.
Signed-off-by: Magnus Damm <damm@igel.co.jp>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
This creates linux/of.h and includes asm/prom.h from it.
We also include linux/of.h from asm/prom.h while we transition.
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Acked-by: Paul Mackerras <paulus@samba.org>
Acked-by: David S. Miller <davem@davemloft.net>
Slab destructors were no longer supported after Christoph's
c59def9f22 change. They've been
BUGs for both slab and slub, and slob never supported them
either.
This rips out support for the dtor pointer from kmem_cache_create()
completely and fixes up every single callsite in the kernel (there were
about 224, not including the slab allocator definitions themselves,
or the documentation references).
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
* Add ATA_PIO[0-6] defines to <linux/ata.h>.
* Add ->pio_mask field to ide_pci_device_t and ide_hwif_t.
* Add PIO masks to host drivers.
<linux/ata.h> change ACK-ed by Jeff Garzik <jeff@garzik.org>.
Acked-by: Sergei Shtylyov <sshtylyov@ru.mvista.com>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
* Add ->host_flags to ide_hwif_t to store ide_pci_device_t.host_flags,
assign it in setup-pci.c:ide_pci_setup_ports().
* Add IDE_HFLAG_PIO_NO_{BLACKLIST,DOWNGRADE} to ide_pci_device_t.host_flags
and teach ide_get_best_pio_mode() about them. Also remove needless
!drive->id check while at it (drive->id is always present).
* Convert amd74xx, via82cxxx and ide-timing.h to use ide_get_best_pio_mode()
and then remove no longer needed ide_find_best_pio_mode().
There should be no functionality changes caused by this patch.
Acked-by: Shtylyov <sshtylyov@ru.mvista.com>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
* Drop no longer needed "PIO data" argument from ide_get_best_pio_mode()
and convert all users accordingly.
* Remove no longer needed ide_pio_data_t.
Acked-by: Sergei Shtylyov <sshtylyov@ru.mvista.com>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
* Add ide_pio_cycle_time() helper.
* Use it in ali14xx/ht6560b/qd65xx/cmd64{0,x}/sl82c105 and pmac host drivers
(previously cycle time given by the device was only used for "pio" == 255).
* Remove no longer needed ide_pio_data_t.cycle_time field.
v2:
* Fix "ata_" prefix (Noticed by Jeff).
Acked-by: Sergei Shtylyov <sshtylyov@ru.mvista.com>
Cc: Jeff Garzik <jeff@garzik.org>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
* Rename ide_pci_device_t.flags to ide_pci_device_t.host_flags
and IDEPCI_FLAG_ISA_PORTS flag to IDE_HFLAG_ISA_PORTS.
* Add IDE_HFLAG_SINGLE flag for single channel devices.
* Convert core code and all IDE PCI drivers to use IDE_HFLAG_SINGLE
and remove no longer needed ide_pci_device_t.channels field.
v2:
* Fix issues noticed by Sergei:
- correct code alignment in scc_pata.c
- s/IDE_HFLAG_SINGLE/~IDE_HFLAG_SINGLE/ in serverworks.c
Acked-by: Sergei Shtylyov <sshtylyov@ru.mvista.com>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
* Print info about overriding PIO mode in ide_get_best_pio_mode().
* Remove info about overriding PIO mode from cmd64{0,x} host drivers.
* Remove no longer needed ide_pio_data_t.overridden field.
Acked-by: Sergei Shtylyov <sshtylyov@ru.mvista.com>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/selinux-2.6:
SELinux: use SECINITSID_NETMSG instead of SECINITSID_UNLABELED for NetLabel
SELinux: enable dynamic activation/deactivation of NetLabel/SELinux enforcement
* 'for-linus' of git://oss.sgi.com:8090/xfs/xfs-2.6:
[XFS] Fix inode size update before data write in xfs_setattr
[XFS] Allow punching holes to free space when at ENOSPC
[XFS] Implement ->page_mkwrite in XFS.
[FS] Implement block_page_mkwrite.
Manually fix up conflict with Nick's VM fault handling patches in
fs/xfs/linux-2.6/xfs_file.c
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Currently, CONFIG_X86_CMPXCHG64 both enables boot-time checking of
the cmpxchg64b feature and enables compilation of the set_64bit() family.
Since the option is dependent on PAE, and since KVM depends on set_64bit(),
this effectively disables KVM on i386 nopae.
Simplify by removing the config option altogether: the boot check is made
dependent on CONFIG_X86_PAE directly, and the set_64bit() family is exposed
without constraints. It is up to users to check for the feature flag (KVM
does not as virtualiation extensions imply its existence).
Signed-off-by: Avi Kivity <avi@qumranet.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* git://git.linux-nfs.org/pub/linux/nfs-2.6:
NFSv4: handle lack of clientaddr in option string
NFSv4: debug print ntohl(status) in nfs client callback xdr code
SUNRPC: Clean up the sillyrename code
NFS: Introduce struct nfs_removeargs+nfs_removeres
NFS: Use dentry->d_time to store the parent directory verifier.
SUNRPC: move bkl locking and xdr proc invocation into a common helper
NFSv4: Fix the nfsv4 readlink reply buffer alignment
NFSv4: Fix the readdir reply buffer alignment
NFSv4: More NFSv4 xdr cleanups
NFSv4: Try to recover from getfh failures in nfs4_xdr_dec_open
NFSv4: 'constify' lookup arguments.
NFSv4: Don't fail nfs4_xdr_dec_open if decode_restorefh() failed
NFSv4: Fix open state recovery
NFSD/SUNRPC: Fix the automatic selection of RPCSEC_GSS
* 'release' of git://lm-sensors.org/kernel/mhoffman/hwmon-2.6: (44 commits)
i2c: Delete the i2c-isa pseudo bus driver
hwmon: refuse to load abituguru driver on non-Abit boards
hwmon: fix Abit Uguru3 driver detection on some motherboards
hwmon/w83627ehf: Be quiet when no chip is found
hwmon/w83627ehf: No need to initialize fan_min
hwmon/w83627ehf: Export the thermal sensor types
hwmon/w83627ehf: Enable VBAT monitoring
hwmon/w83627ehf: Add support for the VID inputs
hwmon/w83627ehf: Fix timing issues
hwmon/w83627ehf: Add error messages for two error cases
hwmon/w83627ehf: Convert to a platform driver
hwmon/w83627ehf: Update the Kconfig entry
make coretemp_device_remove() static
hwmon: Add LM93 support
hwmon: Improve the pwmN_enable documentation
hwmon/smsc47b397: Don't report missing fans as spinning at 82 RPM
hwmon: Add support for newer uGuru's
hwmon/f71805f: Add temperature-tracking fan control mode
hwmon/w83627ehf: Preserve speed reading when changing fan min
hwmon: fix detection of abituguru volt inputs
...
Manual fixup of trivial conflict in MAINTAINERS file
* git://git.kernel.org/pub/scm/linux/kernel/git/mingo/linux-2.6-sched:
[PATCH] sched: implement cpu_clock(cpu) high-speed time source
[PATCH] sched: fix the all pinned logic in load_balance_newidle()
[PATCH] sched: fix newly idle load balance in case of SMT
[PATCH] sched: sched_cacheflush is now unused
When a CONFIG_USER_NS=n and a user tries to unshare some namespace other
than the user namespace, the dummy copy_user_ns returns NULL rather than
the old_ns.
This value then gets assigned to task->nsproxy->user_ns, so that a
subsequent setuid, which uses task->nsproxy->user_ns, causes a NULL
pointer deref.
Fix this by returning old_ns.
Signed-off-by: Serge E. Hallyn <serue@us.ibm.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
sys_fallocate for ia64. This uses an empty slot #1303 erroneously
marked as reserved for move_pages (which had already been allocated
as syscall #1276)
Signed-Off-By: Dave Chinner <dgc@sgi.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Implement the cpu_clock(cpu) interface for kernel-internal use:
high-speed (but slightly incorrect) per-cpu clock constructed from
sched_clock().
This API, unused at the moment, will be used in the future by blktrace,
by the softlockup-watchdog, by printk and by lockstat.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Since Ingo's recent scheduler rewrite which was merged as commit
0437e109e1 sched_cacheflush is unused.
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Fix a couple of bugs:
- Don't rely on the parent dentry still being valid when the call completes.
Fixes a race with shrink_dcache_for_umount_subtree()
- Don't remove the file if the filehandle has been labelled as stale.
Fix a couple of inefficiencies
- Remove the global list of sillyrenamed files. Instead we can cache the
sillyrename information in the dentry->d_fsdata
- Move common code from unlink_setup/unlink_done into fs/nfs/unlink.c
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
We need a common structure for setting up an unlink() rpc call in order to
fix the asynchronous unlink code.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Since every invocation of xdr encode or decode functions takes the BKL now,
there's a lot of redundant lock_kernel/unlock_kernel pairs that we can pull
out into a common function.
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Signed-off-by: Kristian Høgsberg <krh@redhat.com>
collapsed with fw-sbp2 patch "Drop cast to non-const char * in host
template initialization." from Kristian Høgsberg
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
pci_ids.h needs two of the AMD NB device-ids namely, Addressmap and the Memory
Controller devices
This patch adds those to the pci_id.h include file
Signed-off-by: Douglas Thompson <dougthompson@xmission.com>
Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Change error check and clear variable from an atomic to an int
Signed-off-by: Dave Jiang <djiang@mvista.com>
Signed-off-by: Douglas Thompson <dougthompson@xmission.com
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Here's a driver for the Intel 3000 and 3010 memory controllers,
relative to today's Sourceforge code drop. This has only had light
testing (I've yet to actually see it handle a memory error) but it
detects my hardware correctly.
Signed-off-by: Jason Uhlenkott <juhlenko@akamai.com>
Signed-off-by: Douglas Thompson <dougthompson@xmission.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Provides a way for NMI reported errors on x86 to notify the EDAC
subsystem pending ECC errors by writing to a software state variable.
Here's the reworked patch. I added an EDAC stub to the kernel so we can
have variables that are in the kernel even if EDAC is a module. I also
implemented the idea of using the chip driver to select error detection
mode via module parameter and eliminate the kernel compile option.
Please review/test. Thx!
Also, I only made changes to some of the chipset drivers since I am
unfamiliar with the other ones. We can add similar changes as we go.
Signed-off-by: Dave Jiang <djiang@mvista.com>
Signed-off-by: Douglas Thompson <dougthompson@xmission.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This is the code for the "lg.ko" module, which allows lguest guests to
be launched.
[akpm@linux-foundation.org: update for futex-new-private-futexes]
[akpm@linux-foundation.org: build fix]
[jmorris@namei.org: lguest: use hrtimers]
[akpm@linux-foundation.org: x86_64 build fix]
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Cc: Andi Kleen <ak@suse.de>
Cc: Eric Dumazet <dada1@cosmosbay.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
lguest is a simple hypervisor for Linux on Linux. Unlike kvm it doesn't need
VT/SVM hardware. Unlike Xen it's simply "modprobe and go". Unlike both, it's
5000 lines and self-contained.
Performance is ok, but not great (-30% on kernel compile). But given its
hackability, I expect this to improve, along with the paravirt_ops code which
it supplies a complete example for. There's also a 64-bit version being
worked on and other craziness.
But most of all, lguest is awesome fun! Too much of the kernel is a big ball
of hair. lguest is simple enough to dive into and hack, plus has some warts
which scream "fork me!".
This patch:
This is the code and headers required to make an i386 kernel an lguest guest.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Cc: Andi Kleen <ak@suse.de>
Cc: Jeremy Fitzhardinge <jeremy@goop.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Share a little common code, reverse the arguments for consistency, drop the
unnecessary "inline", and lowercase the name.
Signed-off-by: "J. Bruce Fields" <bfields@citi.umich.edu>
Acked-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
EX_RDONLY is only called in one place; just put it there.
Signed-off-by: "J. Bruce Fields" <bfields@citi.umich.edu>
Acked-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
page-writeback accounting is presently performed in the page-flags macros.
This is inconsistent and a bit ugly and makes it awkward to implement
per-backing_dev under-writeback page accounting.
So move this accounting down to the callsite(s).
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Remove is_in_rom() function. It doesn't actually serve the purpose it was
intended to. If you look at the use of it _access_ok() (which is the only use
of it) then it is obvious that most of memory is marked as access_ok. No
point having is_in_rom() then, so remove it.
Signed-off-by: Greg Ungerer <gerg@uclinux.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Change the m68knommu irq handling to use the generic irq framework.
Signed-off-by: Greg Ungerer <gerg@uclinux.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The print_stack_trace macro in stacktrace.h has a wrong number of
arguments, fix it.
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Arjan van de Ven <arjan@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
__acquire
|
lock _____
| \
| __contended
| |
| wait
| _______/
|/
|
__acquired
|
__release
|
unlock
We measure acquisition and contention bouncing.
This is done by recording a cpu stamp in each lock instance.
Contention bouncing requires the cpu stamp to be set on acquisition. Hence we
move __acquired into the generic path.
__acquired is then used to measure acquisition bouncing by comparing the
current cpu with the old stamp before replacing it.
__contended is used to measure contention bouncing (only useful for preemptable
locks)
[akpm@linux-foundation.org: cleanups]
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Acked-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
- update the copyright notices
- use the default hash function
- fix a thinko in a BUILD_BUG_ON
- add a WARN_ON to spot inconsitent naming
- fix a termination issue in /proc/lock_stat
[akpm@linux-foundation.org: cleanups]
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Acked-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Introduce the core lock statistics code.
Lock statistics provides lock wait-time and hold-time (as well as the count
of corresponding contention and acquisitions events). Also, the first few
call-sites that encounter contention are tracked.
Lock wait-time is the time spent waiting on the lock. This provides insight
into the locking scheme, that is, a heavily contended lock is indicative of
a too coarse locking scheme.
Lock hold-time is the duration the lock was held, this provides a reference for
the wait-time numbers, so they can be put into perspective.
1)
lock
2)
... do stuff ..
unlock
3)
The time between 1 and 2 is the wait-time. The time between 2 and 3 is the
hold-time.
The lockdep held-lock tracking code is reused, because it already collects locks
into meaningful groups (classes), and because it is an existing infrastructure
for lock instrumentation.
Currently lockdep tracks lock acquisition with two hooks:
lock()
lock_acquire()
_lock()
... code protected by lock ...
unlock()
lock_release()
_unlock()
We need to extend this with two more hooks, in order to measure contention.
lock_contended() - used to measure contention events
lock_acquired() - completion of the contention
These are then placed the following way:
lock()
lock_acquire()
if (!_try_lock())
lock_contended()
_lock()
lock_acquired()
... do locked stuff ...
unlock()
lock_release()
_unlock()
(Note: the try_lock() 'trick' is used to avoid instrumenting all platform
dependent lock primitive implementations.)
It is also possible to toggle the two lockdep features at runtime using:
/proc/sys/kernel/prove_locking
/proc/sys/kernel/lock_stat
(esp. turning off the O(n^2) prove_locking functionaliy can help)
[akpm@linux-foundation.org: build fixes]
[akpm@linux-foundation.org: nuke unneeded ifdefs]
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Acked-by: Ingo Molnar <mingo@elte.hu>
Acked-by: Jason Baron <jbaron@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Use the lockdep infrastructure to track lock contention and other lock
statistics.
It tracks lock contention events, and the first four unique call-sites that
encountered contention.
It also measures lock wait-time and hold-time in nanoseconds. The minimum and
maximum times are tracked, as well as a total (which together with the number
of event can give the avg).
All statistics are done per lock class, per write (exclusive state) and per read
(shared state).
The statistics are collected per-cpu, so that the collection overhead is
minimized via having no global cachemisses.
This new lock statistics feature is independent of the lock dependency checking
traditionally done by lockdep; it just shares the lock tracking code. It is
also possible to enable both and runtime disabled either component - thereby
avoiding the O(n^2) lock chain walks for instance.
This patch:
raw_spinlock_t should not use lockdep (and doesn't) since lockdep itself
relies on it.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Acked-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Similar information can easily be obtained with strace -c.
Signed-off-by: Jan Harkes <jaharkes@cs.cmu.edu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The sb_info structure only contains a single pointer to the character device,
there is no need for the added indirection.
Signed-off-by: Jan Harkes <jaharkes@cs.cmu.edu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
We ignore signals for about 30 seconds to give userspace a chance to see the
upcall. As we did not block signals we ended up in a busy loop for the
remainder of the period when a signal is received.
Signed-off-by: Jan Harkes <jaharkes@cs.cmu.edu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This changes the i386 linker script and the asm-generic macro it uses so that
ELF note sections with SHF_ALLOC set are linked into the kernel image along
with other read-only data. The PT_NOTE also points to their location.
This paves the way for putting useful build-time information into ELF notes
that can be found easily later in a kernel memory dump.
Signed-off-by: Roland McGrath <roland@redhat.com>
Cc: Andi Kleen <ak@suse.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This patch adds an interface to set/reset flags which determines each memory
segment should be dumped or not when a core file is generated.
/proc/<pid>/coredump_filter file is provided to access the flags. You can
change the flag status for a particular process by writing to or reading from
the file.
The flag status is inherited to the child process when it is created.
Signed-off-by: Hidehiro Kawai <hidehiro.kawai.ez@hitachi.com>
Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
Cc: David Howells <dhowells@redhat.com>
Cc: Hugh Dickins <hugh@veritas.com>
Cc: Nick Piggin <nickpiggin@yahoo.com.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This patch changes mm_struct.dumpable to a pair of bit flags.
set_dumpable() converts three-value dumpable to two flags and stores it into
lower two bits of mm_struct.flags instead of mm_struct.dumpable.
get_dumpable() behaves in the opposite way.
[akpm@linux-foundation.org: export set_dumpable]
Signed-off-by: Hidehiro Kawai <hidehiro.kawai.ez@hitachi.com>
Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
Cc: David Howells <dhowells@redhat.com>
Cc: Hugh Dickins <hugh@veritas.com>
Cc: Nick Piggin <nickpiggin@yahoo.com.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Josef 'Jeff' Sipek <jsipek@cs.sunysb.edu>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Acked-by: Christoph Hellwig <hch@lst.de>
Cc: Trond Myklebust <trond.myklebust@fys.uio.no>
Cc: Neil Brown <neilb@suse.de>
Cc: Michael Halcrow <mhalcrow@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Josef 'Jeff' Sipek <jsipek@cs.sunysb.edu>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Acked-by: Christoph Hellwig <hch@lst.de>
Cc: Trond Myklebust <trond.myklebust@fys.uio.no>
Cc: Neil Brown <neilb@suse.de>
Cc: Michael Halcrow <mhalcrow@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Stackable file systems, among others, frequently need to lookup paths or
path components starting from an arbitrary point in the namespace
(identified by a dentry and a vfsmount). Currently, such file systems use
lookup_one_len, which is frowned upon [1] as it does not pass the lookup
intent along; not passing a lookup intent, for example, can trigger BUG_ON's
when stacking on top of NFSv4.
The first patch introduces a new lookup function to allow lookup starting
from an arbitrary point in the namespace. This approach has been suggested
by Christoph Hellwig [2].
The second patch changes sunrpc to use vfs_path_lookup.
The third patch changes nfsctl.c to use vfs_path_lookup.
The fourth patch marks link_path_walk static.
The fifth, and last patch, unexports path_walk because it is no longer
unnecessary to call it directly, and using the new vfs_path_lookup is
cleaner.
For example, the following snippet of code, looks up "some/path/component"
in a directory pointed to by parent_{dentry,vfsmnt}:
err = vfs_path_lookup(parent_dentry, parent_vfsmnt,
"some/path/component", 0, &nd);
if (!err) {
/* exits */
...
/* once done, release the references */
path_release(&nd);
} else if (err == -ENOENT) {
/* doesn't exist */
} else {
/* other error */
}
VFS functions such as lookup_create can be used on the nameidata structure
to pass the create intent to the file system.
Signed-off-by: Josef 'Jeff' Sipek <jsipek@cs.sunysb.edu>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Acked-by: Christoph Hellwig <hch@lst.de>
Cc: Trond Myklebust <trond.myklebust@fys.uio.no>
Cc: Neil Brown <neilb@suse.de>
Cc: Michael Halcrow <mhalcrow@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Remove the arg+env limit of MAX_ARG_PAGES by copying the strings directly from
the old mm into the new mm.
We create the new mm before the binfmt code runs, and place the new stack at
the very top of the address space. Once the binfmt code runs and figures out
where the stack should be, we move it downwards.
It is a bit peculiar in that we have one task with two mm's, one of which is
inactive.
[a.p.zijlstra@chello.nl: limit stack size]
Signed-off-by: Ollie Wild <aaw@google.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: <linux-arch@vger.kernel.org>
Cc: Hugh Dickins <hugh@veritas.com>
[bunk@stusta.de: unexport bprm_mm_init]
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The purpose of audit_bprm() is to log the argv array to a userspace daemon at
the end of the execve system call. Since user-space hasn't had time to run,
this array is still in pristine state on the process' stack; so no need to
copy it, we can just grab it from there.
In order to minimize the damage to audit_log_*() copy each string into a
temporary kernel buffer first.
Currently the audit code requires that the full argument vector fits in a
single packet. So currently it does clip the argv size to a (sysctl) limit,
but only when execve auditing is enabled.
If the audit protocol gets extended to allow for multiple packets this check
can be removed.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Ollie Wild <aaw@google.com>
Cc: <linux-audit@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
New arch macro STACK_TOP_MAX it gives the larges valid stack address for the
architecture in question.
It differs from STACK_TOP in that it will not distinguish between
personalities but will always return the largest possible address.
This is used to create the initial stack on execve, which we will move down to
the proper location once the binfmt code has figured out where that is.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Ollie Wild <aaw@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
per cpu data section contains two types of data. One set which is
exclusively accessed by the local cpu and the other set which is per cpu,
but also shared by remote cpus. In the current kernel, these two sets are
not clearely separated out. This can potentially cause the same data
cacheline shared between the two sets of data, which will result in
unnecessary bouncing of the cacheline between cpus.
One way to fix the problem is to cacheline align the remotely accessed per
cpu data, both at the beginning and at the end. Because of the padding at
both ends, this will likely cause some memory wastage and also the
interface to achieve this is not clean.
This patch:
Moves the remotely accessed per cpu data (which is currently marked
as ____cacheline_aligned_in_smp) into a different section, where all the data
elements are cacheline aligned. And as such, this differentiates the local
only data and remotely accessed data cleanly.
Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
Acked-by: Suresh Siddha <suresh.b.siddha@intel.com>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Christoph Lameter <clameter@sgi.com>
Cc: <linux-arch@vger.kernel.org>
Cc: "Luck, Tony" <tony.luck@intel.com>
Cc: Andi Kleen <ak@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
I realise jprobes are a razor-blades-included type of interface, but that
doesn't mean we can't try and make them safer to use. This guy I know once
wrote code like this:
struct jprobe jp = { .kp.symbol_name = "foo", .entry = "jprobe_foo" };
And then his kernel exploded. Oops.
This patch adds an arch hook, arch_deref_entry_point() (I don't like it
either) which takes the void * in a struct jprobe, and gives back the text
address that it represents.
We can then use that in register_jprobe() to check that the entry point we're
passed is actually in the kernel text, rather than just some random value.
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Cc: Prasanna S Panchamukhi <prasanna@in.ibm.com>
Acked-by: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
Cc: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com>
Cc: David S. Miller <davem@davemloft.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
AFAICT now that jprobe.entry is a void *, JPROBE_ENTRY doesn't do anything
useful - so remove it ..
I've left a do-nothing version so that out-of-tree jprobes code will still
compile without modifications.
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Cc: Prasanna S Panchamukhi <prasanna@in.ibm.com>
Acked-by: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
Cc: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com>
Cc: David S. Miller <davem@davemloft.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Currently jprobe.entry is a kprobe_opcode_t *, but that's a lie. On some
platforms it doesn't point to an opcode at all, it points to a function
descriptor.
It's really a pointer to something that the arch code can turn into a function
entry point. And that's what actually happens, none of the generic code ever
looks at jprobe.entry, it's only ever dereferenced by arch code.
So just make it a void *.
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Cc: Prasanna S Panchamukhi <prasanna@in.ibm.com>
Acked-by: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
Cc: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com>
Cc: David S. Miller <davem@davemloft.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Rename some file_ra_state variables and remove some accessors.
It results in much simpler code.
Kudos to Rusty!
Signed-off-by: Fengguang Wu <wfg@mail.ustc.edu.cn>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Split ondemand readahead interface into two functions. I think this makes it
a little clearer for non-readahead experts (like Rusty).
Internally they both call ondemand_readahead(), but the page argument is
changed to an obvious boolean flag.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Fengguang Wu <wfg@mail.ustc.edu.cn>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Share the same page flag bit for PG_readahead and PG_reclaim.
One is used only on file reads, another is only for emergency writes. One
is used mostly for fresh/young pages, another is for old pages.
Combinations of possible interactions are:
a) clear PG_reclaim => implicit clear of PG_readahead
it will delay an asynchronous readahead into a synchronous one
it actually does _good_ for readahead:
the pages will be reclaimed soon, it's readahead thrashing!
in this case, synchronous readahead makes more sense.
b) clear PG_readahead => implicit clear of PG_reclaim
one(and only one) page will not be reclaimed in time
it can be avoided by checking PageWriteback(page) in readahead first
c) set PG_reclaim => implicit set of PG_readahead
will confuse readahead and make it restart the size rampup process
it's a trivial problem, and can mostly be avoided by checking
PageWriteback(page) first in readahead
d) set PG_readahead => implicit set of PG_reclaim
PG_readahead will never be set on already cached pages.
PG_reclaim will always be cleared on dirtying a page.
so not a problem.
In summary,
a) we get better behavior
b,d) possible interactions can be avoided
c) racy condition exists that might affect readahead, but the chance
is _really_ low, and the hurt on readahead is trivial.
Compound pages also use PG_reclaim, but for now they do not interact with
reclaim/readahead code.
Signed-off-by: Fengguang Wu <wfg@mail.ustc.edu.cn>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Remove the old readahead algorithm.
Signed-off-by: Fengguang Wu <wfg@mail.ustc.edu.cn>
Cc: Steven Pratt <slpratt@austin.ibm.com>
Cc: Ram Pai <linuxram@us.ibm.com>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This is a minimal readahead algorithm that aims to replace the current one.
It is more flexible and reliable, while maintaining almost the same behavior
and performance. Also it is full integrated with adaptive readahead.
It is designed to be called on demand:
- on a missing page, to do synchronous readahead
- on a lookahead page, to do asynchronous readahead
In this way it eliminated the awkward workarounds for cache hit/miss,
readahead thrashing, retried read, and unaligned read. It also adopts the
data structure introduced by adaptive readahead, parameterizes readahead
pipelining with `lookahead_index', and reduces the current/ahead windows to
one single window.
HEURISTICS
The logic deals with four cases:
- sequential-next
found a consistent readahead window, so push it forward
- random
standalone small read, so read as is
- sequential-first
create a new readahead window for a sequential/oversize request
- lookahead-clueless
hit a lookahead page not associated with the readahead window,
so create a new readahead window and ramp it up
In each case, three parameters are determined:
- readahead index: where the next readahead begins
- readahead size: how much to readahead
- lookahead size: when to do the next readahead (for pipelining)
BEHAVIORS
The old behaviors are maximally preserved for trivial sequential/random reads.
Notable changes are:
- It no longer imposes strict sequential checks.
It might help some interleaved cases, and clustered random reads.
It does introduce risks of a random lookahead hit triggering an
unexpected readahead. But in general it is more likely to do good
than to do evil.
- Interleaved reads are supported in a minimal way.
Their chances of being detected and proper handled are still low.
- Readahead thrashings are better handled.
The current readahead leads to tiny average I/O sizes, because it
never turn back for the thrashed pages. They have to be fault in
by do_generic_mapping_read() one by one. Whereas the on-demand
readahead will redo readahead for them.
OVERHEADS
The new code reduced the overheads of
- excessively calling the readahead routine on small sized reads
(the current readahead code insists on seeing all requests)
- doing a lot of pointless page-cache lookups for small cached files
(the current readahead only turns itself off after 256 cache hits,
unfortunately most files are < 1MB, so never see that chance)
That accounts for speedup of
- 0.3% on 1-page sequential reads on sparse file
- 1.2% on 1-page cache hot sequential reads
- 3.2% on 256-page cache hot sequential reads
- 1.3% on cache hot `tar /lib`
However, it does introduce one extra page-cache lookup per cache miss, which
impacts random reads slightly. That's 1% overheads for 1-page random reads on
sparse file.
PERFORMANCE
The basic benchmark setup is
- 2.6.20 kernel with on-demand readahead
- 1MB max readahead size
- 2.9GHz Intel Core 2 CPU
- 2GB memory
- 160G/8M Hitachi SATA II 7200 RPM disk
The benchmarks show that
- it maintains the same performance for trivial sequential/random reads
- sysbench/OLTP performance on MySQL gains up to 8%
- performance on readahead thrashing gains up to 3 times
iozone throughput (KB/s): roughly the same
==========================================
iozone -c -t1 -s 4096m -r 64k
2.6.20 on-demand gain
first run
" Initial write " 61437.27 64521.53 +5.0%
" Rewrite " 47893.02 48335.20 +0.9%
" Read " 62111.84 62141.49 +0.0%
" Re-read " 62242.66 62193.17 -0.1%
" Reverse Read " 50031.46 49989.79 -0.1%
" Stride read " 8657.61 8652.81 -0.1%
" Random read " 13914.28 13898.23 -0.1%
" Mixed workload " 19069.27 19033.32 -0.2%
" Random write " 14849.80 14104.38 -5.0%
" Pwrite " 62955.30 65701.57 +4.4%
" Pread " 62209.99 62256.26 +0.1%
second run
" Initial write " 60810.31 66258.69 +9.0%
" Rewrite " 49373.89 57833.66 +17.1%
" Read " 62059.39 62251.28 +0.3%
" Re-read " 62264.32 62256.82 -0.0%
" Reverse Read " 49970.96 50565.72 +1.2%
" Stride read " 8654.81 8638.45 -0.2%
" Random read " 13901.44 13949.91 +0.3%
" Mixed workload " 19041.32 19092.04 +0.3%
" Random write " 14019.99 14161.72 +1.0%
" Pwrite " 64121.67 68224.17 +6.4%
" Pread " 62225.08 62274.28 +0.1%
In summary, writes are unstable, reads are pretty close on average:
access pattern 2.6.20 on-demand gain
Read 62085.61 62196.38 +0.2%
Re-read 62253.49 62224.99 -0.0%
Reverse Read 50001.21 50277.75 +0.6%
Stride read 8656.21 8645.63 -0.1%
Random read 13907.86 13924.07 +0.1%
Mixed workload 19055.29 19062.68 +0.0%
Pread 62217.53 62265.27 +0.1%
aio-stress: roughly the same
============================
aio-stress -l -s4096 -r128 -t1 -o1 knoppix511-dvd-cn.iso
aio-stress -l -s4096 -r128 -t1 -o3 knoppix511-dvd-cn.iso
2.6.20 on-demand delta
sequential 92.57s 92.54s -0.0%
random 311.87s 312.15s +0.1%
sysbench fileio: roughly the same
=================================
sysbench --test=fileio --file-io-mode=async --file-test-mode=rndrw \
--file-total-size=4G --file-block-size=64K \
--num-threads=001 --max-requests=10000 --max-time=900 run
threads 2.6.20 on-demand delta
first run
1 59.1974s 59.2262s +0.0%
2 58.0575s 58.2269s +0.3%
4 48.0545s 47.1164s -2.0%
8 41.0684s 41.2229s +0.4%
16 35.8817s 36.4448s +1.6%
32 32.6614s 32.8240s +0.5%
64 23.7601s 24.1481s +1.6%
128 24.3719s 23.8225s -2.3%
256 23.2366s 22.0488s -5.1%
second run
1 59.6720s 59.5671s -0.2%
8 41.5158s 41.9541s +1.1%
64 25.0200s 23.9634s -4.2%
256 22.5491s 20.9486s -7.1%
Note that the numbers are not very stable because of the writes.
The overall performance is close when we sum all seconds up:
sum all up 495.046s 491.514s -0.7%
sysbench oltp (trans/sec): up to 8% gain
========================================
sysbench --test=oltp --oltp-table-size=10000000 --oltp-read-only \
--mysql-socket=/var/run/mysqld/mysqld.sock \
--mysql-user=root --mysql-password=readahead \
--num-threads=064 --max-requests=10000 --max-time=900 run
10000-transactions run
threads 2.6.20 on-demand gain
1 62.81 64.56 +2.8%
2 67.97 70.93 +4.4%
4 81.81 85.87 +5.0%
8 94.60 97.89 +3.5%
16 99.07 104.68 +5.7%
32 95.93 104.28 +8.7%
64 96.48 103.68 +7.5%
5000-transactions run
1 48.21 48.65 +0.9%
8 68.60 70.19 +2.3%
64 70.57 74.72 +5.9%
2000-transactions run
1 37.57 38.04 +1.3%
2 38.43 38.99 +1.5%
4 45.39 46.45 +2.3%
8 51.64 52.36 +1.4%
16 54.39 55.18 +1.5%
32 52.13 54.49 +4.5%
64 54.13 54.61 +0.9%
That's interesting results. Some investigations show that
- MySQL is accessing the db file non-uniformly: some parts are
more hot than others
- It is mostly doing 4-page random reads, and sometimes doing two
reads in a row, the latter one triggers a 16-page readahead.
- The on-demand readahead leaves many lookahead pages (flagged
PG_readahead) there. Many of them will be hit, and trigger
more readahead pages. Which might save more seeks.
- Naturally, the readahead windows tend to lie in hot areas,
and the lookahead pages in hot areas is more likely to be hit.
- The more overall read density, the more possible gain.
That also explains the adaptive readahead tricks for clustered random reads.
readahead thrashing: 3 times better
===================================
We boot kernel with "mem=128m single", and start a 100KB/s stream on every
second, until reaching 200 streams.
max throughput min avg I/O size
2.6.20: 5MB/s 16KB
on-demand: 15MB/s 140KB
Signed-off-by: Fengguang Wu <wfg@mail.ustc.edu.cn>
Cc: Steven Pratt <slpratt@austin.ibm.com>
Cc: Ram Pai <linuxram@us.ibm.com>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Extend struct file_ra_state to support the on-demand readahead logic. Also
define some helpers for it.
Signed-off-by: Fengguang Wu <wfg@mail.ustc.edu.cn>
Cc: Steven Pratt <slpratt@austin.ibm.com>
Cc: Ram Pai <linuxram@us.ibm.com>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Introduce a new page flag: PG_readahead.
It acts as a look-ahead mark, which tells the page reader: Hey, it's time to
invoke the read-ahead logic. For the sake of I/O pipelining, don't wait until
it runs out of cached pages!
Signed-off-by: Fengguang Wu <wfg@mail.ustc.edu.cn>
Cc: Steven Pratt <slpratt@austin.ibm.com>
Cc: Ram Pai <linuxram@us.ibm.com>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Fix type issue reported by latest 'sparse': kiocb.ki_flags should be
"unsigned long" (not "long"), to match bitop type signature.
Signed-off-by: David Brownell <dbrownell@users.sourceforge.net>
Signed-off-by: Benjamin LaHaise <bcrl@kvack.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
unregister_chrdev() does not return meaningful value. This patch makes it
return void like most unregister_* functions.
Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Move "debug during resume from s2ram" into the variable we already use
for real-mode flags to simplify code. It also closes nasty trap for
the user in acpi_sleep_setup; order of parameters actually mattered there,
acpi_sleep=s3_bios,s3_mode doing something different from
acpi_sleep=s3_mode,s3_bios.
Signed-off-by: Pavel Machek <pavel@suse.cz>
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Add a feature allowing the user to make the system beep during a resume from
suspend to RAM, on x86_64 and i386.
This is useful for the users with broken resume from RAM, so that they can
verify if the control reaches the kernel after a wake-up event.
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Introduce the pm_power_off_prepare() callback that can be registered by the
interested platforms in analogy with pm_idle() and pm_power_off(), used for
preparing the system to power off (needed by ACPI).
This allows us to drop acpi_sysclass and device_acpi that are only defined in
order to register the ACPI power off preparation callback, which is needed by
pm_power_off() registered in a much different way.
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Acked-by: Pavel Machek <pavel@ucw.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Make it possible to register hibernation and suspend notifiers, so that
subsystems can perform hibernation-related or suspend-related operations that
should not be carried out by device drivers' .suspend() and .resume()
routines.
[akpm@linux-foundation.org: build fixes]
[akpm@linux-foundation.org: cleanups]
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Acked-by: Pavel Machek <pavel@ucw.cz>
Cc: Nigel Cunningham <nigel@nigel.suspend2.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Kernel threads should not have TIF_FREEZE set when user space processes are
being frozen, since otherwise some of them might be frozen prematurely.
To prevent this from happening we can (1) make exit_mm() unset TIF_FREEZE
unconditionally just after clearing tsk->mm and (2) make try_to_freeze_tasks()
check if p->mm is different from zero and PF_BORROWED_MM is unset in p->flags
when user space processes are to be frozen.
Namely, when user space processes are being frozen, we only should set
TIF_FREEZE for tasks that have p->mm different from NULL and don't have
PF_BORROWED_MM set in p->flags. For this reason task_lock() must be used to
prevent try_to_freeze_tasks() from racing with use_mm()/unuse_mm(), in which
p->mm and p->flags.PF_BORROWED_MM are changed under task_lock(p). Also, we
need to prevent the following scenario from happening:
* daemonize() is called by a task spawned from a user space code path
* freezer checks if the task has p->mm set and the result is positive
* task enters exit_mm() and clears its TIF_FREEZE
* freezer sets TIF_FREEZE for the task
* task calls try_to_freeze() and goes to the refrigerator, which is wrong at
that point
This requires us to acquire task_lock(p) before p->flags.PF_BORROWED_MM and
p->mm are examined and release it after TIF_FREEZE is set for p (or it turns
out that TIF_FREEZE should not be set).
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Cc: Gautham R Shenoy <ego@in.ibm.com>
Cc: Pavel Machek <pavel@ucw.cz>
Cc: Nigel Cunningham <nigel@nigel.suspend2.net>
Cc: Oleg Nesterov <oleg@tv-sign.ru>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
At least on some machines it is necessary to prepare the ACPI firmware for the
restoration of the system memory state from the hibernation image if the
"platform" mode of hibernation has been used. Namely, in that cases we need
to disable the GPEs before replacing the "boot" kernel with the "frozen"
kernel (cf. http://bugzilla.kernel.org/show_bug.cgi?id=7887). After the
restore they will be re-enabled by hibernation_ops->finish(), but if the
restore fails, they have to be re-enabled by the restore code explicitly.
For this purpose we can introduce two additional hibernation operations,
called pre_restore() and restore_cleanup() and call them from the restore code
path. Still, they should be called if the "platform" mode of hibernation has
been used, so we need to pass the information about the hibernation mode from
the "frozen" kernel to the "boot" kernel in the image header.
Apparently, we can't drop the disabling of GPEs before the restore because of
Bug #7887 . We also can't do it unconditionally, because the GPEs wouldn't
have been enabled after a successful restore if the suspend had been done in
the 'shutdown' or 'reboot' mode.
In principle we could (and probably should) unconditionally disable the GPEs
before each snapshot creation *and* before the restore, but then we'd have to
unconditionally enable them after the snapshot creation as well as after the
restore (or restore failure) Still, for this purpose we'd need to modify
acpi_enter_sleep_state_prep() and acpi_leave_sleep_state() and we'd have to
introduce some mechanism synchronizing the disablind/enabling of the GPEs with
the device drivers' .suspend()/.resume() routines and with
disable_/enable_nonboot_cpus(). However, this would have affected the
suspend (ie. s2ram) code as well as the hibernation, which I'd like to avoid
in this patch series.
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Cc: Nigel Cunningham <nigel@nigel.suspend2.net>
Cc: Pavel Machek <pavel@ucw.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
alloc_zeroed_user_highpage() has no in-tree users and it is not exported.
As it is not exported, it can simply be removed.
Signed-off-by: Mel Gorman <mel@csn.ul.ie>
Acked-by: Andy Whitcroft <apw@shadowen.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This patch completes Linus's wish that the fault return codes be made into
bit flags, which I agree makes everything nicer. This requires requires
all handle_mm_fault callers to be modified (possibly the modifications
should go further and do things like fault accounting in handle_mm_fault --
however that would be for another patch).
[akpm@linux-foundation.org: fix alpha build]
[akpm@linux-foundation.org: fix s390 build]
[akpm@linux-foundation.org: fix sparc build]
[akpm@linux-foundation.org: fix sparc64 build]
[akpm@linux-foundation.org: fix ia64 build]
Signed-off-by: Nick Piggin <npiggin@suse.de>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Cc: Russell King <rmk@arm.linux.org.uk>
Cc: Ian Molton <spyro@f2s.com>
Cc: Bryan Wu <bryan.wu@analog.com>
Cc: Mikael Starvik <starvik@axis.com>
Cc: David Howells <dhowells@redhat.com>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Cc: "Luck, Tony" <tony.luck@intel.com>
Cc: Hirokazu Takata <takata@linux-m32r.org>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Roman Zippel <zippel@linux-m68k.org>
Cc: Greg Ungerer <gerg@uclinux.org>
Cc: Matthew Wilcox <willy@debian.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Paul Mundt <lethal@linux-sh.org>
Cc: Kazumoto Kojima <kkojima@rr.iij4u.or.jp>
Cc: Richard Curnow <rc@rc0.org.uk>
Cc: William Lee Irwin III <wli@holomorphy.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Jeff Dike <jdike@addtoit.com>
Cc: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it>
Cc: Miles Bader <uclinux-v850@lsi.nec.co.jp>
Cc: Chris Zankel <chris@zankel.net>
Acked-by: Kyle McMartin <kyle@mcmartin.ca>
Acked-by: Haavard Skinnemoen <hskinnemoen@atmel.com>
Acked-by: Ralf Baechle <ralf@linux-mips.org>
Acked-by: Andi Kleen <ak@muc.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
[ Still apparently needs some ARM and PPC loving - Linus ]
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Change ->fault prototype. We now return an int, which contains
VM_FAULT_xxx code in the low byte, and FAULT_RET_xxx code in the next byte.
FAULT_RET_ code tells the VM whether a page was found, whether it has been
locked, and potentially other things. This is not quite the way he wanted
it yet, but that's changed in the next patch (which requires changes to
arch code).
This means we no longer set VM_CAN_INVALIDATE in the vma in order to say
that a page is locked which requires filemap_nopage to go away (because we
can no longer remain backward compatible without that flag), but we were
going to do that anyway.
struct fault_data is renamed to struct vm_fault as Linus asked. address
is now a void __user * that we should firmly encourage drivers not to use
without really good reason.
The page is now returned via a page pointer in the vm_fault struct.
Signed-off-by: Nick Piggin <npiggin@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Nonlinear mappings are (AFAIKS) simply a virtual memory concept that encodes
the virtual address -> file offset differently from linear mappings.
->populate is a layering violation because the filesystem/pagecache code
should need to know anything about the virtual memory mapping. The hitch here
is that the ->nopage handler didn't pass down enough information (ie. pgoff).
But it is more logical to pass pgoff rather than have the ->nopage function
calculate it itself anyway (because that's a similar layering violation).
Having the populate handler install the pte itself is likewise a nasty thing
to be doing.
This patch introduces a new fault handler that replaces ->nopage and
->populate and (later) ->nopfn. Most of the old mechanism is still in place
so there is a lot of duplication and nice cleanups that can be removed if
everyone switches over.
The rationale for doing this in the first place is that nonlinear mappings are
subject to the pagefault vs invalidate/truncate race too, and it seemed stupid
to duplicate the synchronisation logic rather than just consolidate the two.
After this patch, MAP_NONBLOCK no longer sets up ptes for pages present in
pagecache. Seems like a fringe functionality anyway.
NOPAGE_REFAULT is removed. This should be implemented with ->fault, and no
users have hit mainline yet.
[akpm@linux-foundation.org: cleanup]
[randy.dunlap@oracle.com: doc. fixes for readahead]
[akpm@linux-foundation.org: build fix]
Signed-off-by: Nick Piggin <npiggin@suse.de>
Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Cc: Mark Fasheh <mark.fasheh@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Fix the race between invalidate_inode_pages and do_no_page.
Andrea Arcangeli identified a subtle race between invalidation of pages from
pagecache with userspace mappings, and do_no_page.
The issue is that invalidation has to shoot down all mappings to the page,
before it can be discarded from the pagecache. Between shooting down ptes to
a particular page, and actually dropping the struct page from the pagecache,
do_no_page from any process might fault on that page and establish a new
mapping to the page just before it gets discarded from the pagecache.
The most common case where such invalidation is used is in file truncation.
This case was catered for by doing a sort of open-coded seqlock between the
file's i_size, and its truncate_count.
Truncation will decrease i_size, then increment truncate_count before
unmapping userspace pages; do_no_page will read truncate_count, then find the
page if it is within i_size, and then check truncate_count under the page
table lock and back out and retry if it had subsequently been changed (ptl
will serialise against unmapping, and ensure a potentially updated
truncate_count is actually visible).
Complexity and documentation issues aside, the locking protocol fails in the
case where we would like to invalidate pagecache inside i_size. do_no_page
can come in anytime and filemap_nopage is not aware of the invalidation in
progress (as it is when it is outside i_size). The end result is that
dangling (->mapping == NULL) pages that appear to be from a particular file
may be mapped into userspace with nonsense data. Valid mappings to the same
place will see a different page.
Andrea implemented two working fixes, one using a real seqlock, another using
a page->flags bit. He also proposed using the page lock in do_no_page, but
that was initially considered too heavyweight. However, it is not a global or
per-file lock, and the page cacheline is modified in do_no_page to increment
_count and _mapcount anyway, so a further modification should not be a large
performance hit. Scalability is not an issue.
This patch implements this latter approach. ->nopage implementations return
with the page locked if it is possible for their underlying file to be
invalidated (in that case, they must set a special vm_flags bit to indicate
so). do_no_page only unlocks the page after setting up the mapping
completely. invalidation is excluded because it holds the page lock during
invalidation of each page (and ensures that the page is not mapped while
holding the lock).
This also allows significant simplifications in do_no_page, because we have
the page locked in the right place in the pagecache from the start.
Signed-off-by: Nick Piggin <npiggin@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Create a new NetLabel KAPI interface, netlbl_enabled(), which reports on the
current runtime status of NetLabel based on the existing configuration. LSMs
that make use of NetLabel, i.e. SELinux, can use this new function to determine
if they should perform NetLabel access checks. This patch changes the
NetLabel/SELinux glue code such that SELinux only enforces NetLabel related
access checks when netlbl_enabled() returns true.
At present NetLabel is considered to be enabled when there is at least one
labeled protocol configuration present. The result is that by default NetLabel
is considered to be disabled, however, as soon as an administrator configured
a CIPSO DOI definition NetLabel is enabled and SELinux starts enforcing
NetLabel related access controls - including unlabeled packet controls.
This patch also tries to consolidate the multiple "#ifdef CONFIG_NETLABEL"
blocks into a single block to ease future review as recommended by Linus.
Signed-off-by: Paul Moore <paul.moore@hp.com>
Signed-off-by: James Morris <jmorris@namei.org>
Many filesystems need a ->page-mkwrite callout to correctly
set up pages that have been written to by mmap. This is especially
important when mmap is writing into holes as it allows filesystems
to correctly account for and allocate space before the mmap
write is allowed to proceed.
Protection against truncate races is provided by locking the page
and checking to see whether the page mapping is correct and whether
it is beyond EOF so we don't end up allowing allocations beyond
the current EOF or changing EOF as a result of a mmap write.
SGI-PV: 940392
SGI-Modid: 2.6.x-xfs-melb:linux:29146a
Signed-off-by: David Chinner <dgc@sgi.com>
Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Tim Shimmin <tes@sgi.com>
* 'for-linus' of git://linux-nfs.org/~bfields/linux:
locks: fix vfs_test_lock() comment
locks: make posix_test_lock() interface more consistent
nfs: disable leases over NFS
gfs2: stop giving out non-cluster-coherent leases
locks: export setlease to filesystems
locks: provide a file lease method enabling cluster-coherent leases
locks: rename lease functions to reflect locks.c conventions
locks: share more common lease code
locks: clean up lease_alloc()
locks: convert an -EINVAL return to a BUG
leases: minor break_lease() comment clarification
Since posix_test_lock(), like fcntl() and ->lock(), indicates absence or
presence of a conflict lock by setting fl_type to, respectively, F_UNLCK
or something other than F_UNLCK, the return value is no longer needed.
Signed-off-by: "J. Bruce Fields" <bfields@citi.umich.edu>
Currently leases are only kept locally, so there's no way for a distributed
filesystem to enforce them against multiple clients. We're particularly
interested in the case of nfsd exporting a cluster filesystem, in which
case nfsd needs cluster-coherent leases in order to implement delegations
correctly.
Also add some documentation.
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
We've been using the convention that vfs_foo is the function that calls
a filesystem-specific foo method if it exists, or falls back on a
generic method if it doesn't; thus vfs_foo is what is called when some
other part of the kernel (normally lockd or nfsd) wants to get a lock,
whereas foo is what filesystems call to use the underlying local
functionality as part of their lock implementation.
So rename setlease to vfs_setlease (which will call a
filesystem-specific setlease after a later patch) and __setlease to
setlease.
Also, vfs_setlease need only be GPL-exported as long as it's only needed
by lockd and nfsd.
Signed-off-by: "J. Bruce Fields" <bfields@citi.umich.edu>
This interface allows the ability to write the majority of a driver in
userspace with only a very small shell of a driver in the kernel itself.
It uses a char device and sysfs to interact with a userspace process to
process interrupts and control memory accesses.
See the docbook documentation for more details on how to use this
interface.
From: Hans J. Koch <hjk@linutronix.de>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Benedikt Spranger <b.spranger@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
This defines a dev_vdbg() call, which is enabled with -DVERBOSE_DEBUG.
When enabled, dev_vdbg() acts just like dev_dbg(). When disabled, it is a
NOP ... just like dev_dbg() without -DDEBUG. The specific code was moved
out of a USB patch, but lots of drivers have similar support.
That is, code can now be written to use an additional level of debug
output, selected at compile time. Many driver authors have found this
idiom to be very useful. A typical usage model is for "normal" debug
messages to focus on fault paths and not be very "chatty", so that those
messages can be left on during normal operation without much of a
performance or syslog load. On the other hand "verbose" messages would be
noisy enough that they wouldn't normally be enabled; they might even affect
timings enough to change system or driver behavior.
Signed-off-by: David Brownell <dbrownell@users.sourceforge.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
This patch (as933) removes the deprecated dpm_runtime_suspend() and
dpm_runtime_resume() routines from the PM core. The only user of
those routines is the PCMCIA ds driver; local replacements are added.
Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
CC: Dominik Brodowski <linux@dominikbrodowski.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
This allows the uevent file to handle any type of uevent action to be
triggered by userspace instead of just the "add" uevent.
Signed-off-by: Kay Sievers <kay.sievers@vrfy.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Introduce API to dynamically register and unregister multicast groups.
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Acked-by: Patrick McHardy <kaber@trash.net>
Acked-by: Jamal Hadi Salim <hadi@cyberus.ca>
Signed-off-by: David S. Miller <davem@davemloft.net>
Allow kicking listeners out of a multicast group when necessary
(for example if that group is going to be removed.)
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Acked-by: Patrick McHardy <kaber@trash.net>
Acked-by: Jamal Hadi Salim <hadi@cyberus.ca>
Signed-off-by: David S. Miller <davem@davemloft.net>
Allow changing the number of groups for a netlink family
after it has been created, use RCU to protect the listeners
bitmap keeping netlink_has_listeners() lock-free.
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Acked-by: Patrick McHardy <kaber@trash.net>
Acked-by: Jamal Hadi Salim <hadi@cyberus.ca>
Signed-off-by: David S. Miller <davem@davemloft.net>
The TSEC/eTSEC can detect the interface to the PHY automatically,
but it isn't able to detect whether the RGMII connection needs internal
delay. So we need to detect that change in the device tree, propagate
it to the platform data, and then check it if we're in RGMII. This fixes
a bug on the 8641D HPCN board where the Vitesse PHY doesn't use the delay
for RGMII.
Signed-off-by: Andy Fleming <afleming@freescale.com>
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/hskinnemoen/avr32-2.6:
[AVR32] Initialize phy_mask for both macb devices
[AVR32] Fix atomic_add_unless() and atomic_sub_unless()
[AVR32] Correct misspelled CONFIG_BLK_DEV_INITRD variable.
[AVR32] Fix build error in parse_tag_rdimg()
[AVR32] Don't wire up macb0 unless SW6 is in default position
[AVR32] Wire up SSC platform device 0 as TX on ATSTK1000 board
[AVR32] Add Atmel SSC driver platform device to AT32AP architecture
[AVR32] Remove optimization of unaligned word loads
[AVR32] Make STK1000 mux settings configurable
[AVR32] CPU frequency scaling for AT32AP
[AVR32] Split SM device into PM, RTC, WDT and EIC
[AVR32] faster avr32 unaligned access
These functions depend on "result" being initalized to 0, but "result"
is not included as an input constraint to the inline assembly block
following its initialization, only as an output constraint. Thus gcc
thinks it doesn't need to initialize it, so result ends up undefined
if the "unless" condition is true.
This fixes an oops in sunrpc where the faulty atomics caused
rpciod_up() to not start the workqueue as it should.
Signed-off-by: Haavard Skinnemoen <hskinnemoen@atmel.com>
This patch adds register definitions, clocks and IRQs to the platform devices.
Signed-off-by: Hans-Christian Egtvedt <hcegtvedt@atmel.com>
Signed-off-by: Haavard Skinnemoen <hskinnemoen@atmel.com>
If we let unaligned word loads bypass the generic unaligned handling,
gcc may combine it with a swap.b instruction and turn it into a ldwsp
instruction, which does not work with unaligned addresses.
Revert the optimization to prevent the RNDIS driver from crashing.
Hopefully we'll figure something out later (it may be better to do the
optimization in gcc.)
Signed-off-by: Haavard Skinnemoen <hskinnemoen@atmel.com>
Split the SM platform device into separate platform devices for PM,
RTC, WDT and EIC. This is more correct according to the documentation
and allows us to simplify the code a little.
Also turn the EIC driver into a real platform driver.
Signed-off-by: Haavard Skinnemoen <hskinnemoen@atmel.com>
Acked-by: Hans-Christian Egtvedt <hcegtvedt@atmel.com>
Use a more conventional implementation for unaligned access, and include
an AT32AP-specific optimization: the CPU will handle unaligned words.
The result is always faster and smaller for 8, 16, and 32 bit values.
For 64 bit quantities, it's presumably larger.
Signed-off-by: David Brownell <dbrownell@users.sourceforge.net>
Signed-off-by: Haavard Skinnemoen <hskinnemoen@atmel.com>
* 'master' of ssh://master.kernel.org/pub/scm/linux/kernel/git/mchehab/v4l-dvb: (126 commits)
V4L/DVB (5847): Clean up schedule_timeout calls in cpia2 and ivtv code
V4L/DVB (5846): Clean up setting state and scheduling timeouts
V4L/DVB (5844): ivtv: add high volume debugging flag
V4L/DVB (5843): ivtv: fix missing signal_pending check.
V4L/DVB (5842): ivtv: Add locking to ensure stream setup is atomic.
V4L/DVB (5841): tveeprom: add support for Philips FQ1216LME MK3 tuner.
V4L/DVB (5840): fix dst and cx24123: tune() callback changed signess for delay
V4L/DVB (5838): dvb-core: Fix signedness warnings (gcc 4.1.1, kernel 2.6.22)
V4L/DVB (5837): stv0299: Fix signedness warning (gcc 4.1.1, kernel 2.6.22)
V4L/DVB (5836): dvb-ttpci: re-initialize aspect ratio and pan scan after arm crash
V4L/DVB (5835): saa7146/dvb-ttpci: Fix signedness warnings (gcc 4.1.1, kernel 2.6.22)
V4L/DVB (5834): dvb-core: fix signedness warnings and const stripping
V4L/DVB (5832): ir-common: optimize bit extract function
V4L/DVB (5831): stradis: use ARRAY_SIZE
V4L/DVB (5829): Firmware extract and loading for opera dvb-usb update
V4L/DVB (5828): Kconfig: Added GemTek USB radio and removed experimental dependency.
V4L/DVB (5826): Usbvision: video mux cleanup
V4L/DVB (5825): Alter the tuner type for the WinTV USB UK PAL model.
V4L/DVB (5824): Usbvision: Hauppauge WinTV USB SECAM_L fix
V4L/DVB (5821): Saa7134: add remote control support for LifeView FlyDVB-S LR300
...
* 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4:
ext4: extent macros cleanup
Fix compilation with EXT_DEBUG, also fix leXX_to_cpu conversions.
ext4: remove extra IS_RDONLY() check
ext4: Use is_power_of_2()
Use zero_user_page() in ext4 where possible
ext4: Remove 65000 subdirectory limit
ext4: Expand extra_inodes space per the s_{want,min}_extra_isize fields
ext4: Add nanosecond timestamps
jbd2: Move jbd2-debug file to debugfs
jbd2: Fix CONFIG_JBD_DEBUG ifdef to be CONFIG_JBD2_DEBUG
ext4: Set the journal JBD2_FEATURE_INCOMPAT_64BIT on large devices
ext4: Make extents code sanely handle on-disk corruption
ext4: copy i_flags to inode flags on write
ext4: Enable extents by default
Change on-disk format to support 2^15 uninitialized extents
write support for preallocated blocks
fallocate support in ext4
sys_fallocate() implementation on i386, x86_64 and powerpc
* 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6: (24 commits)
[NETFILTER]: xt_connlimit needs to depend on nf_conntrack
[NETFILTER]: ipt_iprange.h must #include <linux/types.h>
[IrDA]: Fix IrDA build failure
[ATM]: nicstar needs virt_to_bus
[NET]: move __dev_addr_discard adjacent to dev_addr_discard for readability
[NET]: merge dev_unicast_discard and dev_mc_discard into one
[NET]: move dev_mc_discard from dev_mcast.c to dev.c
[NETLINK]: negative groups in netlink_setsockopt
[PPPOL2TP]: Reset meta-data in xmit function
[PPPOL2TP]: Fix use-after-free
[PKT_SCHED]: Some typo fixes in net/sched/Kconfig
[XFRM]: Fix crash introduced by struct dst_entry reordering
[TCP]: remove unused argument to cong_avoid op
[ATM]: [idt77252] Rename CONFIG_ATM_IDT77252_SEND_IDLE to not resemble a Kconfig variable
[ATM]: [drivers] ioremap balanced with iounmap
[ATM]: [lanai] sram_test_word() must be __devinit
[ATM]: [nicstar] Replace C code with call to ARRAY_SIZE() macro.
[ATM]: Eliminate dead config variable CONFIG_BR2684_FAST_TRANS.
[ATM]: Replacing kmalloc/memset combination with kzalloc.
[NET]: gen_estimator deadlock fix
...
Move tuner callback function pointers out of struct tuner, into
struct tuner_operations.
Signed-off-by: Michael Krufky <mkrufky@linuxtv.org>
Signed-off-by: Mauro Carvalho Chehab <mchehab@infradead.org>
Individual tuner drivers are now allocating memory themselves for
their own private data structures. This changeset adds a release
callback to the tuner operations, so that newer drivers that may
require more complex data structures may release this private data
themselves.
Signed-off-by: Michael Krufky <mkrufky@linuxtv.org>
Signed-off-by: Mauro Carvalho Chehab <mchehab@infradead.org>
Create private data struct for device specific private data.
Signed-off-by: Michael Krufky <mkrufky@linuxtv.org>
Signed-off-by: Mauro Carvalho Chehab <mchehab@infradead.org>
* 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/sparc-2.6:
[SPARC64]: Set vio->desc_buf to NULL after freeing.
[SPARC]: Mark sparc and sparc64 as not having virt_to_bus
[SPARC64]: Fix reset handling in VNET driver.
[SPARC64]: Handle reset events in vio_link_state_change().
[SPARC64]: Handle LDC resets properly in domain-services driver.
[SPARC64]: Massively simplify VIO device layer and support hot add/remove.
[SPARC64]: Simplify VNET probing.
[SPARC64]: Simplify VDC device probing.
[SPARC64]: Add basic infrastructure for MD add/remove notification.
This patch adds support for SAS Management Protocol (SMP) passthrough
support via bsg. aic94xx can use this.
Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
The sas transport class attaches one bsg device to every SAS object
(host, device, expander, etc). LLDs can define a function to handle
SMP requests via sas_function_template::smp_handler.
Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Adds PCI_VENDOR_ID_BROCADE macro in include/linux/pci_ids.h file. This macro
is used in MPT Fusion FC drivers to support Brocade branded FC controllers
signed-off-by: Sathya Prakash <sathya.prakash@lsi.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
This one was noticed by Gilbert Wu of Adaptec:
The libata core actually does the DMA mapping for you, so there has to
be an exception in the device drivers that *don't* do dma mapping for
ATA commands. However, since we've already done this, libsas must now
dma map any ATA commands that it wishes to issue ... and yes, this is a
horrible mess.
Additionally, the test in aic94xx for ATA protocols isn't quite right.
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
ATA devices need special handling for sas_task_abort. If the ATA command
came from SCSI, then we merely need to tell SCSI to abort the scsi_cmnd.
However, internal commands require a bit more work--we need to fill the qc
with the appropriate error status and complete the command, and eventually
post_internal will issue the actual ABORT TASK.
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
This patch adds a new field, lldd_task, to ata_queued_cmd so that libata
users such as libsas can associate some data with a qc. The particular
ambition with this patch is to associate a sas_task with a qc; that way,
if libata decides to timeout a command, we can come back (in
sas_ata_post_internal) and abort the sas task.
One question remains: Is it necessary to reset the phy on error, or will
the libata error handler take care of it? (Assuming that one is written,
of course.) This patch, as it is today, works well enough to clean
things up when an ATA device probe attempt fails halfway through the probe,
though I'm not sure this is always the right thing to do.
Signed-off-by: Darrick J. Wong <djwong@us.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
This is a respin of my earlier patch that migrates the ATA support code
into a separate file. For now, the controversial linking bits have
been removed per James Bottomley's request for a patch that contains
only the migration diffs, which means that libsas continues to require
libata. I intend to address that problem in a separate patch.
This patch is against the aic94xx-sas-2.6 git tree, and it has been
sanity tested on my x206m with Seagate SATA and SAS disks without
uncovering any new problems.
Signed-off-by: Darrick J. Wong <djwong@us.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Hook the scsi_host_template functions in libsas to delegate
functionality to libata when appropriate.
Signed-off-by: Darrick J. Wong <djwong@us.ibm.com>
Misc code changes and merge fixes and update for libata->drivers/ata
move
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
An experimental patch for Xen allows guests to place their vcpu_info
structs anywhere. We try to use this to place the vcpu_info into the
PDA, which allows direct access.
If this works, then switch to using direct access operations for
irq_enable, disable, save_fl and restore_fl.
Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
Cc: Chris Wright <chrisw@sous-sol.org>
Cc: Keir Fraser <keir@xensource.com>
The block device frontend driver allows the kernel to access block
devices exported exported by a virtual machine containing a physical
block device driver.
Signed-off-by: Ian Pratt <ian.pratt@xensource.com>
Signed-off-by: Christian Limpach <Christian.Limpach@cl.cam.ac.uk>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Cc: Arjan van de Ven <arjan@infradead.org>
Cc: Greg KH <greg@kroah.com>
Cc: Jens Axboe <axboe@kernel.dk>
This communicates with the machine control software via a registry
residing in a controlling virtual machine. This allows dynamic
creation, destruction and modification of virtual device
configurations (network devices, block devices and CPUS, to name some
examples).
[ Greg, would you mind giving this a review? Thanks -J ]
Signed-off-by: Ian Pratt <ian.pratt@xensource.com>
Signed-off-by: Christian Limpach <Christian.Limpach@cl.cam.ac.uk>
Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Cc: Greg KH <greg@kroah.com>
Add Xen 'grant table' driver which allows granting of access to
selected local memory pages by other virtual machines and,
symmetrically, the mapping of remote memory pages which other virtual
machines have granted access to.
This driver is a prerequisite for many of the Xen virtual device
drivers, which grant the 'device driver domain' restricted and
temporary access to only those memory pages that are currently
involved in I/O operations.
Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
Signed-off-by: Ian Pratt <ian.pratt@xensource.com>
Signed-off-by: Christian Limpach <Christian.Limpach@cl.cam.ac.uk>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Implement a Xen back-end for hvc console.
* * *
Add early printk support via hvc console, enable using
"earlyprintk=xen" on the kernel command line.
From: Gerd Hoffmann <kraxel@suse.de>
Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Acked-by: Ingo Molnar <mingo@elte.hu>
Acked-by: Olof Johansson <olof@lixom.net>
This is a fairly straightforward Xen implementation of smp_ops.
Xen has its own IPI mechanisms, and has no dependency on any
APIC-based IPI. The smp_ops hooks and the flush_tlb_others pv_op
allow a Xen guest to avoid all APIC code in arch/i386 (the only apic
operation is a single apic_read for the apic version number).
One subtle point which needs to be addressed is unpinning pagetables
when another cpu may have a lazy tlb reference to the pagetable. Xen
will not allow an in-use pagetable to be unpinned, so we must find any
other cpus with a reference to the pagetable and get them to shoot
down their references.
Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Cc: Benjamin LaHaise <bcrl@kvack.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Andi Kleen <ak@suse.de>
Add a new definition for PG_owner_priv_1 to define PG_pinned on Xen
pagetable pages.
Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Xen implements interrupts in terms of event channels. Each guest
domain gets 1024 event channels which can be used for a variety of
purposes, such as Xen timer events, inter-domain events,
inter-processor events (IPI) or for real hardware IRQs.
Within the kernel, we map the event channels to IRQs, and implement
the whole interrupt handling using a Xen irq_chip.
Rather than setting NR_IRQ to 1024 under PARAVIRT in order to
accomodate Xen, we create a dynamic mapping between event channels and
IRQs. Ideally, Linux will eventually move towards dynamically
allocating per-irq structures, and we can use a 1:1 mapping between
event channels and irqs.
Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Eric W. Biederman <ebiederm@xmission.com>
This patch is a rollup of all the core pieces of the Xen
implementation, including:
- booting and setup
- pagetable setup
- privileged instructions
- segmentation
- interrupt flags
- upcalls
- multicall batching
BOOTING AND SETUP
The vmlinux image is decorated with ELF notes which tell the Xen
domain builder what the kernel's requirements are; the domain builder
then constructs the address space accordingly and starts the kernel.
Xen has its own entrypoint for the kernel (contained in an ELF note).
The ELF notes are set up by xen-head.S, which is included into head.S.
In principle it could be linked separately, but it seems to provoke
lots of binutils bugs.
Because the domain builder starts the kernel in a fairly sane state
(32-bit protected mode, paging enabled, flat segments set up), there's
not a lot of setup needed before starting the kernel proper. The main
steps are:
1. Install the Xen paravirt_ops, which is simply a matter of a
structure assignment.
2. Set init_mm to use the Xen-supplied pagetables (analogous to the
head.S generated pagetables in a native boot).
3. Reserve address space for Xen, since it takes a chunk at the top
of the address space for its own use.
4. Call start_kernel()
PAGETABLE SETUP
Once we hit the main kernel boot sequence, it will end up calling back
via paravirt_ops to set up various pieces of Xen specific state. One
of the critical things which requires a bit of extra care is the
construction of the initial init_mm pagetable. Because Xen places
tight constraints on pagetables (an active pagetable must always be
valid, and must always be mapped read-only to the guest domain), we
need to be careful when constructing the new pagetable to keep these
constraints in mind. It turns out that the easiest way to do this is
use the initial Xen-provided pagetable as a template, and then just
insert new mappings for memory where a mapping doesn't already exist.
This means that during pagetable setup, it uses a special version of
xen_set_pte which ignores any attempt to remap a read-only page as
read-write (since Xen will map its own initial pagetable as RO), but
lets other changes to the ptes happen, so that things like NX are set
properly.
PRIVILEGED INSTRUCTIONS AND SEGMENTATION
When the kernel runs under Xen, it runs in ring 1 rather than ring 0.
This means that it is more privileged than user-mode in ring 3, but it
still can't run privileged instructions directly. Non-performance
critical instructions are dealt with by taking a privilege exception
and trapping into the hypervisor and emulating the instruction, but
more performance-critical instructions have their own specific
paravirt_ops. In many cases we can avoid having to do any hypercalls
for these instructions, or the Xen implementation is quite different
from the normal native version.
The privileged instructions fall into the broad classes of:
Segmentation: setting up the GDT and the GDT entries, LDT,
TLS and so on. Xen doesn't allow the GDT to be directly
modified; all GDT updates are done via hypercalls where the new
entries can be validated. This is important because Xen uses
segment limits to prevent the guest kernel from damaging the
hypervisor itself.
Traps and exceptions: Xen uses a special format for trap entrypoints,
so when the kernel wants to set an IDT entry, it needs to be
converted to the form Xen expects. Xen sets int 0x80 up specially
so that the trap goes straight from userspace into the guest kernel
without going via the hypervisor. sysenter isn't supported.
Kernel stack: The esp0 entry is extracted from the tss and provided to
Xen.
TLB operations: the various TLB calls are mapped into corresponding
Xen hypercalls.
Control registers: all the control registers are privileged. The most
important is cr3, which points to the base of the current pagetable,
and we handle it specially.
Another instruction we treat specially is CPUID, even though its not
privileged. We want to control what CPU features are visible to the
rest of the kernel, and so CPUID ends up going into a paravirt_op.
Xen implements this mainly to disable the ACPI and APIC subsystems.
INTERRUPT FLAGS
Xen maintains its own separate flag for masking events, which is
contained within the per-cpu vcpu_info structure. Because the guest
kernel runs in ring 1 and not 0, the IF flag in EFLAGS is completely
ignored (and must be, because even if a guest domain disables
interrupts for itself, it can't disable them overall).
(A note on terminology: "events" and interrupts are effectively
synonymous. However, rather than using an "enable flag", Xen uses a
"mask flag", which blocks event delivery when it is non-zero.)
There are paravirt_ops for each of cli/sti/save_fl/restore_fl, which
are implemented to manage the Xen event mask state. The only thing
worth noting is that when events are unmasked, we need to explicitly
see if there's a pending event and call into the hypervisor to make
sure it gets delivered.
UPCALLS
Xen needs a couple of upcall (or callback) functions to be implemented
by each guest. One is the event upcalls, which is how events
(interrupts, effectively) are delivered to the guests. The other is
the failsafe callback, which is used to report errors in either
reloading a segment register, or caused by iret. These are
implemented in i386/kernel/entry.S so they can jump into the normal
iret_exc path when necessary.
MULTICALL BATCHING
Xen provides a multicall mechanism, which allows multiple hypercalls
to be issued at once in order to mitigate the cost of trapping into
the hypervisor. This is particularly useful for context switches,
since the 4-5 hypercalls they would normally need (reload cr3, update
TLS, maybe update LDT) can be reduced to one. This patch implements a
generic batching mechanism for hypercalls, which gets used in many
places in the Xen code.
Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Cc: Ian Pratt <ian.pratt@xensource.com>
Cc: Christian Limpach <Christian.Limpach@cl.cam.ac.uk>
Cc: Adrian Bunk <bunk@stusta.de>
Add Xen interface header files. These are taken fairly directly from
the Xen tree, but somewhat rearranged to suit the kernel's conventions.
Define macros and inline functions for doing hypercalls into the
hypervisor.
Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
Signed-off-by: Ian Pratt <ian.pratt@xensource.com>
Signed-off-by: Christian Limpach <Christian.Limpach@cl.cam.ac.uk>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
The tsc-based get_scheduled_cycles interface is not a good match for
Xen's runstate accounting, which reports everything in nanoseconds.
This patch replaces this interface with a sched_clock interface, which
matches both Xen and VMI's requirements.
In order to do this, we:
1. replace get_scheduled_cycles with sched_clock
2. hoist cycles_2_ns into a common header
3. update vmi accordingly
One thing to note: because sched_clock is implemented as a weak
function in kernel/sched.c, we must define a real function in order to
override this weak binding. This means the usual paravirt_ops
technique of using an inline function won't work in this case.
Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
Cc: Zachary Amsden <zach@vmware.com>
Cc: Dan Hecht <dhecht@vmware.com>
Cc: john stultz <johnstul@us.ibm.com>
In a virtual environment, device drivers such as legacy IDE will waste
quite a lot of time probing for their devices which will never appear.
This helper function allows a paravirt implementation to lay claim to
the whole iomem and ioport space, thereby disabling all device drivers
trying to claim IO resources.
Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Allocate/release a chunk of vmalloc address space:
alloc_vm_area reserves a chunk of address space, and makes sure all
the pagetables are constructed for that address range - but no pages.
free_vm_area releases the address space range.
Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
Signed-off-by: Ian Pratt <ian.pratt@xensource.com>
Signed-off-by: Christian Limpach <Christian.Limpach@cl.cam.ac.uk>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Cc: "Jan Beulich" <JBeulich@novell.com>
Cc: "Andi Kleen" <ak@muc.de>
Make globally leave_mm visible, specifically so that Xen can use it to
shoot-down lazy uses of cr3.
Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
When running with CONFIG_PARAVIRT, we may want lots of IRQs even if
there's no IO APIC.
Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Add a hook so that the paravirt backend knows when the allocator is
ready. This is useful for the obvious reason that the allocator is
available, but the other side-effect of having the bootmem allocator
available is that each page now has an associated "struct page".
Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
It's useful to know which mm is allocating a pagetable. Xen uses this
to determine whether the pagetable being added to is pinned or not.
Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
Use existing elfnote.h to generate vsyscall notes, rather than doing
it locally. Changes elfnote.h a bit to suit, since this is the first
asm user, and it wasn't quite right.
Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Roland McGrath <roland@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.com>
Rather than using a tri-state integer for the wait flag in
call_usermodehelper_exec, define a proper enum, and use that. I've
preserved the integer values so that any callers I've missed should
still work OK.
Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
Cc: James Bottomley <James.Bottomley@HansenPartnership.com>
Cc: Randy Dunlap <randy.dunlap@oracle.com>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Andi Kleen <ak@suse.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Johannes Berg <johannes@sipsolutions.net>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Bjorn Helgaas <bjorn.helgaas@hp.com>
Cc: Joel Becker <joel.becker@oracle.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Kay Sievers <kay.sievers@vrfy.org>
Cc: Srivatsa Vaddagiri <vatsa@in.ibm.com>
Cc: Oleg Nesterov <oleg@tv-sign.ru>
Cc: David Howells <dhowells@redhat.com>
Various pieces of code around the kernel want to be able to trigger an
orderly poweroff. This pulls them together into a single
implementation.
By default the poweroff command is /sbin/poweroff, but it can be set
via sysctl: kernel/poweroff_cmd. This is split at whitespace, so it
can include command-line arguments.
This patch replaces four other instances of invoking either "poweroff"
or "shutdown -h now": two sbus drivers, and acpi thermal
management.
sparc64 has its own "powerd"; still need to determine whether it should
be replaced by orderly_poweroff().
Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
Acked-by: Len Brown <lenb@kernel.org>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Randy Dunlap <randy.dunlap@oracle.com>
Cc: Andi Kleen <ak@suse.de>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: David S. Miller <davem@davemloft.net>
Rather than having hundreds of variations of call_usermodehelper for
various pieces of usermode state which could be set up, split the
info allocation and initialization from the actual process execution.
This means the general pattern becomes:
info = call_usermodehelper_setup(path, argv, envp); /* basic state */
call_usermodehelper_<SET EXTRA STATE>(info, stuff...); /* extra state */
call_usermodehelper_exec(info, wait); /* run process and free info */
This patch introduces wrappers for all the existing calling styles for
call_usermodehelper_*, but folds their implementations into one.
Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
Cc: Andi Kleen <ak@suse.de>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: David Howells <dhowells@redhat.com>
Cc: Bj?rn Steinbrink <B.Steinbrink@gmx.de>
Cc: Randy Dunlap <randy.dunlap@oracle.com>
argv_split() is a helper function which takes a string, splits it at
whitespace, and returns a NULL-terminated argv vector. This is
deliberately simple - it does no quote processing of any kind.
[ Seems to me that this is something which is already being done in
the kernel, but I couldn't find any other implementations, either to
steal or replace. Keep an eye out. ]
Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Randy Dunlap <randy.dunlap@oracle.com>
Add a kstrndup function, modelled on strndup. Like strndup this
returns a string copied into its own allocated memory, but it copies
no more than the specified number of bytes from the source.
Remove private strndup() from irda code.
Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Randy Dunlap <randy.dunlap@oracle.com>
Cc: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Cc: Akinobu Mita <akinobu.mita@gmail.com>
Cc: Arnaldo Carvalho de Melo <acme@mandriva.com>
Cc: Al Viro <viro@ftp.linux.org.uk>
Cc: Panagiotis Issaris <takis@issaris.org>
Cc: Rene Scharfe <rene.scharfe@lsrfire.ath.cx>
This is a reimplementation of the zs driver for the serial subsystem. Any
resemblance to the old driver is purely coincidential. ;-) I do hope I got
the handling of modem lines right -- better do not tackle me about the
issue unless you feel too good...
Any users of the old driver: please note the numbers of the serial lines
have now been swapped, i.e. ttyS0 <-> ttyS1 and ttyS2 <-> ttyS3. It has
to do with the modem lines mentioned above; basically the port A in a given
chip has to be initialised before the port B if you want to use the latter
as the serial console (which is usually the case), as operations on modem
lines of the serial line associated with the port B access both ports (see
the comment at the top of the driver for the details of wiring used).
Please update your scripts.
This is also the reason each SCC now requests an IRQ once only (as seen in
"/proc/interrupts") -- the handler takes care of both ports at once as the
line associated with the port B has to take status update interrupts from
both ports (and yet the line of the port A takes its own for itself too).
The old driver never got it right...
Signed-off-by: Maciej W. Rozycki <macro@linux-mips.org>
Cc: Ralf Baechle <ralf@linux-mips.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
early_serial_setup was removed from serial.h, but forgot to put in
serial_8250.h
Signed-off-by: Yinghai Lu <yinghai.lu@sun.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Kill UBI's homegrown endianess handling and replace it with
the standard kernel endianess handling.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
This patch adds support to ext4 for allowing more than 65000
subdirectories. Currently the maximum number of subdirectories is capped
at 32000.
If we exceed 65000 subdirectories in an htree directory it sets the
inode link count to 1 and no longer counts subdirectories. The
directory link count is not actually used when determining if a
directory is empty, as that only counts subdirectories and not regular
files that might be in there.
A EXT4_FEATURE_RO_COMPAT_DIR_NLINK flag has been added and it is set if
the subdir count for any directory crosses 65000. A later fsck will clear
EXT4_FEATURE_RO_COMPAT_DIR_NLINK if there are no longer any directory
with >65000 subdirs.
Signed-off-by: Andreas Dilger <adilger@clusterfs.com>
Signed-off-by: Kalpak Shah <kalpak@clusterfs.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
We need to make sure that existing ext3 filesystems can also avail the
new fields that have been added to the ext4 inode. We use
s_want_extra_isize and s_min_extra_isize to decide by how much we should
expand the inode. If EXT4_FEATURE_RO_COMPAT_EXTRA_ISIZE feature is set
then we expand the inode by max(s_want_extra_isize, s_min_extra_isize ,
sizeof(ext4_inode) - EXT4_GOOD_OLD_INODE_SIZE) bytes. Actually it is
still an open question about whether users should be able to set
s_*_extra_isize smaller than the known fields or not.
This patch also adds the functionality to expand inodes to include the
newly added fields. We start by trying to expand by s_want_extra_isize
bytes and if its fails we try to expand by s_min_extra_isize bytes. This
is done by changing the i_extra_isize if enough space is available in
the inode and no EAs are present. If EAs are present and there is enough
space in the inode then the EAs in the inode are shifted to make space.
If enough space is not available in the inode due to the EAs then 1 or
more EAs are shifted to the external EA block. In the worst case when
even the external EA block does not have enough space we inform the user
that some EA would need to be deleted or s_min_extra_isize would have to
be reduced.
Signed-off-by: Andreas Dilger <adilger@clusterfs.com>
Signed-off-by: Kalpak Shah <kalpak@clusterfs.com>
Signed-off-by: Mingming Cao <cmm@us.ibm.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
This patch adds nanosecond timestamps for ext4. This involves adding
*time_extra fields to the ext4_inode to extend the timestamps to
64-bits. Creation time is also added by this patch.
These extended fields will fit into an inode if the filesystem was
formatted with large inodes (-I 256 or larger) and there are currently
no EAs consuming all of the available space. For new inodes we always
reserve enough space for the kernel's known extended fields, but for
inodes created with an old kernel this might not have been the case. So
this patch also adds the EXT4_FEATURE_RO_COMPAT_EXTRA_ISIZE feature
flag(ro-compat so that older kernels can't create inodes with a smaller
extra_isize). which indicates if the fields fitting inside
s_min_extra_isize are available or not. If the expansion of inodes if
unsuccessful then this feature will be disabled. This feature is only
enabled if requested by the sysadmin.
None of the extended inode fields is critical for correct filesystem
operation.
Signed-off-by: Andreas Dilger <adilger@clusterfs.com>
Signed-off-by: Kalpak Shah <kalpak@clusterfs.com>
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Dave Kleikamp <shaggy@linux.vnet.ibm.com>
Signed-off-by: Mingming Cao <cmm@us.ibm.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
The jbd2-debug file used to be located in /proc/sys/fs/jbd2-debug, but it
incorrectly used create_proc_entry() instead of the sysctl routines, and
no proc entry was ever created.
Instead of fixing this we might as well move the jbd2-debug file to
debugfs which would be the preferred location for this kind of tunable.
The new location is now /sys/kernel/debug/jbd2/jbd2-debug.
Signed-off-by: Jose R. Santos <jrs@us.ibm.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
When the JBD code was forked to create the new JBD2 code base, the
references to CONFIG_JBD_DEBUG where never changed to
CONFIG_JBD2_DEBUG. This patch fixes that.
Signed-off-by: Jose R. Santos <jrs@us.ibm.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Propagate flags such as S_APPEND, S_IMMUTABLE, etc. from i_flags into
ext4-specific i_flags. Quota code changes these flags on quota files
(to make it harder for sysadmin to screw himself) and these changes were
not correctly propagated into the filesystem.
(This is a forward port patch from ext3)
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Mingming Cao <cmm@us.ibm.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
This change was suggested by Andreas Dilger.
This patch changes the EXT_MAX_LEN value and extent code which marks/checks
uninitialized extents. With this change it will be possible to have
initialized extents with 2^15 blocks (earlier the max blocks we could have
was 2^15 - 1). This way we can have better extent-to-block alignment.
Now, maximum number of blocks we can have in an initialized extent is 2^15
and in an uninitialized extent is 2^15 - 1.
Signed-off-by: Amit Arora <aarora@in.ibm.com>
ipt_iprange.h must #include <linux/types.h> since it uses __be32.
This patch fixes kernel Bugzilla #7604.
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Because this function is only called by unregister_netdevice,
this moving could make this non-global function static,
and also remove its declaration in netdevice.h;
Any further, function __dev_addr_discard is also just called by
dev_mc_discard and dev_unicast_discard, keeping this two functions
both in one c file could make __dev_addr_discard also static
and remove its declaration in netdevice.h;
Futhermore, the sequential call to dev_unicast_discard and then
dev_mc_discard in unregister_netdevice have a similar mechanism that:
(netif_tx_lock_bh / __dev_addr_discard / netif_tx_unlock_bh),
they should merged into one to eliminate duplicates in acquiring and
releasing the dev->_xmit_lock, this would be done in my following patch.
Signed-off-by: Denis Cheng <crquan@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
XFRM expects xfrm_dst->u.next to be same pointer as dst->next, which
was broken by the dst_entry reordering in commit 1e19e02c~, causing
an oops in xfrm_bundle_ok when walking the bundle upwards.
Kill xfrm_dst->u.next and change the only user to use dst->next instead.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
None of the existing TCP congestion controls use the rtt value pased
in the ca_ops->cong_avoid interface. Which is lucky because seq_rtt
could have been -1 when handling a duplicate ack.
Signed-off-by: Stephen Hemminger <shemminger@linux-foundation.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Create and destroy VIO devices in response to MD update events. These
run synchronously inside of the MD update mutex so the VIO layer
doesn't need to do internal locking of any sort.
Signed-off-by: David S. Miller <davem@davemloft.net>
And add dummy handlers for the VIO device layer. These will be filled
in with real code after the vdc, vnet, and ds drivers are reworked to
have simpler dependencies on the VIO device tree.
Signed-off-by: David S. Miller <davem@davemloft.net>
Mark input_register_device() and input_register_handler() functions
as __must_check so authors of new drivers add error handling right
away.
Signed-off-by: Dmitry Torokhov <dtor@mail.ru>
These serial touchscreens are found on some Fujitsu lifebook
P-series laptops, and the B6210. Using this requires a new
version of inputattach and doing:
inputattach -fjt /dev/ttyS0
Big thanks to Stephen Hemminger for testing it and making it
work on his B6210 laptop.
Signed-off-by: Stephen Hemminger <shemminger@linux-foundation.org>
Signed-off-by: Dmitry Torokhov <dtor@mail.ru>
Pendown status from the PENIRQ pin is currently read only at the beginning
of a sample set. If the pen is lifted just after sampling has began then
sampled values become wrong.
This patch adds an optional platform penirq_recheck_delay attribute. If
non-zero, samples are only reported to the input subsystem if PENIRQ is
still active that long after the samples taken.
Signed-off-by: Semih Hazar <semih.hazar@indefia.com>
Signed-off-by: David Brownell <dbrownell@users.sourceforge.net>
Signed-off-by: Dmitry Torokhov <dtor@mail.ru>
The ads7846 driver has support for filtering, but when the chip gets
deselected between samples this causes noise. This patch adds support
for an optional settling delay time, so that two consecutive samples
will be taken with the specified delay time apart. This ensures that
the chip won't be deselected, so the noise won't appear.
Filtering can still be done, but will have less work to do since each
time a new sample is taken the same delay applies.
Signed-off-by: Semih Hazar <semih.hazar@indefia.com>
Signed-off-by: David Brownell <dbrownell@users.sourceforge.net>
Signed-off-by: Dmitry Torokhov <dtor@mail.ru>
This patch adds write support to the uninitialized extents that get
created when a preallocation is done using fallocate(). It takes care of
splitting the extents into multiple (upto three) extents and merging the
new split extents with neighbouring ones, if possible.
Signed-off-by: Amit Arora <aarora@in.ibm.com>
This patch implements ->fallocate() inode operation in ext4. With this
patch users of ext4 file systems will be able to use fallocate() system
call for persistent preallocation. Current implementation only supports
preallocation for regular files (directories not supported as of date)
with extent maps. This patch does not support block-mapped files currently.
Only FALLOC_ALLOCATE and FALLOC_RESV_SPACE modes are being supported as of
now.
Signed-off-by: Amit Arora <aarora@in.ibm.com>
fallocate() is a new system call being proposed here which will allow
applications to preallocate space to any file(s) in a file system.
Each file system implementation that wants to use this feature will need
to support an inode operation called ->fallocate().
Applications can use this feature to avoid fragmentation to certain
level and thus get faster access speed. With preallocation, applications
also get a guarantee of space for particular file(s) - even if later the
the system becomes full.
Currently, glibc provides an interface called posix_fallocate() which
can be used for similar cause. Though this has the advantage of working
on all file systems, but it is quite slow (since it writes zeroes to
each block that has to be preallocated). Without a doubt, file systems
can do this more efficiently within the kernel, by implementing
the proposed fallocate() system call. It is expected that
posix_fallocate() will be modified to call this new system call first
and incase the kernel/filesystem does not implement it, it should fall
back to the current implementation of writing zeroes to the new blocks.
ToDos:
1. Implementation on other architectures (other than i386, x86_64,
and ppc). Patches for s390(x) and ia64 are already available from
previous posts, but it was decided that they should be added later
once fallocate is in the mainline. Hence not including those patches
in this take.
2. Changes to glibc,
a) to support fallocate() system call
b) to make posix_fallocate() and posix_fallocate64() call fallocate()
Signed-off-by: Amit Arora <aarora@in.ibm.com>
With the slab zeroing allocations cleanups Christoph stubbed in a generic
kzalloc(), which was missed on SLOB. Follow the SLAB/SLUB changes and
kill off the __kzalloc() wrapper that SLOB was using.
Reported-by: Jan Engelhardt <jengelh@computergmbh.de>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* 'bsg' of git://git.kernel.dk/data/git/linux-2.6-block:
bsg: fix missing space in version print
Don't define empty struct bsg_class_device if !CONFIG_BLK_DEV_BSG
bsg: Kconfig updates
bsg: minor cleanup
bsg: device hash table cleanup
bsg: fix initialization error handling bugs
bsg: mark FUJITA Tomonori as bsg maintainer
bsg: convert to dynamic major
bsg: address various review comments
... or we end up with header include order problems from hell.
E.g. on m68k this is 100% fatal - local_irq_enable() there
wants preempt_count(), which wants task_struct fields, which
we won't have when we are in smp.h pulled from sched.h.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Introduce is_owner_or_cap() macro in fs.h, and convert over relevant
users to it. This is done because we want to avoid bugs in the future
where we check for only effective fsuid of the current task against a
file's owning uid, without simultaneously checking for CAP_FOWNER as
well, thus violating its semantics.
[ XFS uses special macros and structures, and in general looked ...
untouchable, so we leave it alone -- but it has been looked over. ]
The (current->fsuid != inode->i_uid) check in generic_permission() and
exec_permission_lite() is left alone, because those operations are
covered by CAP_DAC_OVERRIDE and CAP_DAC_READ_SEARCH. Similarly operations
falling under the purview of CAP_CHOWN and CAP_LEASE are also left alone.
Signed-off-by: Satyam Sharma <ssatyam@cse.iitk.ac.in>
Cc: Al Viro <viro@ftp.linux.org.uk>
Acked-by: Serge E. Hallyn <serge@hallyn.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/avi/kvm: (80 commits)
KVM: Use CPU_DYING for disabling virtualization
KVM: Tune hotplug/suspend IPIs
KVM: Keep track of which cpus have virtualization enabled
SMP: Allow smp_call_function_single() to current cpu
i386: Allow smp_call_function_single() to current cpu
x86_64: Allow smp_call_function_single() to current cpu
HOTPLUG: Adapt thermal throttle to CPU_DYING
HOTPLUG: Adapt cpuset hotplug callback to CPU_DYING
HOTPLUG: Add CPU_DYING notifier
KVM: Clean up #includes
KVM: Remove kvmfs in favor of the anonymous inodes source
KVM: SVM: Reliably detect if SVM was disabled by BIOS
KVM: VMX: Remove unnecessary code in vmx_tlb_flush()
KVM: MMU: Fix Wrong tlb flush order
KVM: VMX: Reinitialize the real-mode tss when entering real mode
KVM: Avoid useless memory write when possible
KVM: Fix x86 emulator writeback
KVM: Add support for in-kernel pio handlers
KVM: VMX: Fix interrupt checking on lightweight exit
KVM: Adds support for in-kernel mmio handlers
...
* 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux-2.6:
[IA64] Clean away some code inside some non-existent CONFIG ifdefs
[IA64] ar.itc access must really be after xtime_lock.sequence has been read
[IA64] correctly count CPU objects in the ia64/sn hwperf interface
[IA64] arbitary speed tty ioctl support
[IA64] use machvec=dig on hpzx1 platforms
Verify that types would match for assignment (under sizeof, so we are safe from
side effects or any code actually getting generated), then explicitly cast
everywhere to the fixed-sized types. Kills a bunch of bogus warnings about
constants being truncated (gcc, sparse), finds a pile of endianness problems
hidden by old noise (sparse).
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
... fortunately, termios and ktermios there are identical, so no
run-time breakage happened.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
bitmap_unplug only ever returns 0, so it may as well be void. Two callers try
to print a message if it returns non-zero, but that message is already printed
by bitmap_file_kick.
write_page returns an error which is not consistently checked. It always
causes BITMAP_WRITE_ERROR to be set on an error, and that can more
conveniently be checked.
When the return of write_page is checked, an error causes bitmap_file_kick to
be called - so move that call into write_page - and protect against recursive
calls into bitmap_file_kick.
bitmap_update_sb returns an error that is never checked.
So make these 'void' and be consistent about checking the bit.
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Don't use 'unsigned' variable to track sync vs non-sync IO, as the only thing
we want to do with them is a signed comparison, and fix up the comment which
had become quite wrong.
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Add fb_append_extra_logo(), to append extra lines of logos below the standard
Linux logo.
Signed-off-by: Geert Uytterhoeven <Geert.Uytterhoeven@sonycom.com>
Signed-off-by: Geoff Levand <geoffrey.levand@am.sony.com>
Acked-By: James Simmons <jsimmons@infradead.org>
Cc: "Antonino A. Daplas" <adaplas@pol.net>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
No memory allocation was done for the pseudo_palette. Allocate one for it.
Signed-off-by: Antonino Daplas <adaplas@gmail.com>
Acked-by: "Maciej W. Rozycki" <macro@linux-mips.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This allows for proper console unregistration via the VT layer, and updates
the FB layer to use it. This makes debugging new console drivers much easier,
since you can properly clean them up before unloading.
[adaplas]
unregister_framebuffer() is typically called as part of the driver's
module_exit(). Doing so otherwise will freeze the machine as the VT layer is
holding reference counts on fbcon, and fbcon on the driver. With this change,
it allows unregister_framebuffer() to be called safely anywhere as needed.
Additions from the original: If multiple drivers are used by fbcon, and if
one of them unregisters, a driver will take over the consoles vacated by the
outgoing one (via set_con2fb_map). Once only the outgoing driver remains,
then fbcon will unbind from the VT layer (if CONFIG_HW_CONSOLE_UNBINDING is
set to y).
It is important that these drivers implement fb_open() and fb_release()
just to ensure that no other process is using the driver. Likewise, these
drivers _must_ check the return value of unregister_framebuffer().
[akpm@linux-foundation.org: make fbcon_unbind() stub inline]
Signed-off-by: Jesse Barnes <jesse.barnes@intel.com>
Signed-off-by: Antonino Daplas <adaplas@gmail.com>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Allow fbcon to select the primary display adapter using the
fb_is_primary_device() arch-specific helper. If a a primary adapter is
detected, fbcon will unbind the old adapter from the VT layer, then rebind
using the new adapter. This requires that bind_/unbind_con_driver() be made
public.
Because this feature may produce unexpected behavior (from the user's POV),
this must be explicitly enabled in Kconfig.
[akpm@linux-foundation.org: export unbind_con_driver]
Signed-off-by: Antonino Daplas <adaplas@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Add function helper, fb_is_primary_device(). Given struct fb_info, it will
return a nonzero value if the device is the primary display.
Currently, only the i386 is supported where the function checks for the
IORESOURCE_ROM_SHADOW flag.
Signed-off-by: Antonino Daplas <adaplas@gmail.com>
Cc: David Miller <davem@davemloft.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Move arch-specific bits of fb_mmap() to their respective subdirectories
[bob.picco@hp.com: efi_range_is_wc is referenced but not declared]
[bunk@stusta.de: fix include/asm-m68k/fb.h]
Signed-off-by: Antonino Daplas <adaplas@gmail.com>
Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Allow root squashing to vary per-pseudoflavor, so that you can (for example)
allow root access only when sufficiently strong security is in use.
Signed-off-by: "J. Bruce Fields" <bfields@citi.umich.edu>
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
We could return some sort of error in the case where someone asks for secinfo
on an export without the secinfo= option set--that'd be no worse than what
we've been doing. But it's not really correct. So, hack up an approximate
secinfo response in that case--it may not be complete, but it'll tell the
client at least one acceptable security flavor.
Signed-off-by: "J. Bruce Fields" <bfields@citi.umich.edu>
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Implement the secinfo operation.
(Thanks to Usha Ketineni wrote an earlier version of this support.)
Cc: Usha Ketineni <uketinen@us.ibm.com>
Signed-off-by: Andy Adamson <andros@citi.umich.edu>
Signed-off-by: "J. Bruce Fields" <bfields@citi.umich.edu>
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Allow readonly access to vary depending on the pseudoflavor, using the flag
passed with each pseudoflavor in the export downcall. The rest of the flags
are ignored for now, though some day we might also allow id squashing to vary
based on the flavor.
Signed-off-by: "J. Bruce Fields" <bfields@citi.umich.edu>
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Make the first actual use of the secinfo information by using it to return
nfserr_wrongsec when an export is found that doesn't allow the flavor used on
this request.
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Signed-off-by: Andy Adamson <andros@citi.umich.edu>
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
We want it to be possible for users to restrict exports both by IP address and
by pseudoflavor. The pseudoflavor information has previously been passed
using special auth_domains stored in the rq_client field. After the preceding
patch that stored the pseudoflavor in rq_pflavor, that's now superfluous; so
now we use rq_client for the ip information, as auth_null and auth_unix do.
However, we keep around the special auth_domain in the rq_gssclient field for
backwards compatibility purposes, so we can still do upcalls using the old
"gss/pseudoflavor" auth_domain if upcalls using the unix domain to give us an
appropriate export. This allows us to continue supporting old mountd.
In fact, for this first patch, we always use the "gss/pseudoflavor"
auth_domain (and only it) if it is available; thus rq_client is ignored in the
auth_gss case, and this patch on its own makes no change in behavior; that
will be left to later patches.
Note on idmap: I'm almost tempted to just replace the auth_domain in the idmap
upcall by a dummy value--no version of idmapd has ever used it, and it's
unlikely anyone really wants to perform idmapping differently depending on the
where the client is (they may want to perform *credential* mapping
differently, but that's a different matter--the idmapper just handles id's
used in getattr and setattr). But I'm updating the idmapd code anyway, just
out of general backwards-compatibility paranoia.
Signed-off-by: "J. Bruce Fields" <bfields@citi.umich.edu>
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Split the callers of exp_get_by_name(), exp_find(), and exp_parent() into
those that are processing requests and those that are doing other stuff (like
looking up filehandles for mountd).
No change in behavior, just a (fairly pointless, on its own) cleanup.
(Note this has the effect of making nfsd_cross_mnt() pass rqstp->rq_client
instead of exp->ex_client into exp_find_by_name(). However, the two should
have the same value at this point.)
Signed-off-by: "J. Bruce Fields" <bfields@citi.umich.edu>
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
We're passing three arguments to exp_pseudoroot, two of which are just fields
of the svc_rqst. Soon we'll want to pass in a third field as well. So let's
just give up and pass in the whole struct svc_rqst.
Also sneak in some minor style cleanups while we're at it.
Signed-off-by: "J. Bruce Fields" <bfields@citi.umich.edu>
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
We add a list of pseudoflavors to each export downcall, which will be used
both as a list of security flavors allowed on that export, and (in the order
given) as the list of pseudoflavors to return on secinfo calls.
This patch parses the new downcall information and adds it to the export
structure, but doesn't use it for anything yet.
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Signed-off-by: Andy Adamson <andros@citi.umich.edu>
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Add a new field to the svc_rqst structure to record the pseudoflavor that the
request was made with. For now we record the pseudoflavor but don't use it
for anything.
Signed-off-by: Andy Adamson <andros@citi.umich.edu>
Signed-off-by: "J. Bruce Fields" <bfields@citi.umich.edu>
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
One more incremental delegation policy improvement: don't give out a
delegation on a file if conflicting access has previously required that a
delegation be revoked on that file. (In practice we'll forget about the
conflict when the struct nfs4_file is removed on close, so this is of limited
use for now, though it should at least solve a temporary problem with
self-conflicts on write opens from the same client.)
Signed-off-by: "J. Bruce Fields" <bfields@citi.umich.edu>
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Our original NFSv4 delegation policy was to give out a read delegation on any
open when it was possible to.
Since the lifetime of a delegation isn't limited to that of an open, a client
may quite reasonably hang on to a delegation as long as it has the inode
cached. This becomes an obvious problem the first time a client's inode cache
approaches the size of the server's total memory.
Our first quick solution was to add a hard-coded limit. This patch makes a
mild incremental improvement by varying that limit according to the server's
total memory size, allowing at most 4 delegations per megabyte of RAM.
My quick back-of-the-envelope calculation finds that in the worst case (where
every delegation is for a different inode), a delegation could take about
1.5K, which would make the worst case usage about 6% of memory. The new limit
works out to be about the same as the old on a 1-gig server.
[akpm@linux-foundation.org: Don't needlessly bloat vmlinux]
[akpm@linux-foundation.org: Make it right for highmem machines]
Signed-off-by: "J. Bruce Fields" <bfields@citi.umich.edu>
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
It looks like Al Viro gutted this header file five years ago and it hasn't
been touched since.
Signed-off-by: "J. Bruce Fields" <bfields@citi.umich.edu>
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
NFS4_FHSIZE is measured in bytes, not 4-byte words, so much more space than
necessary is being allocated for struct nfs4_cb_recall.
I should have wondered why this structure was so much larger than it needed to
be!
Signed-off-by: "J. Bruce Fields" <bfields@citi.umich.edu>
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Both lockd and (in the nfsv4 case) nfsd enforce a "grace period" after reboot,
during which clients may reclaim locks from the previous server instance, but
may not acquire new locks.
Currently the lockd and nfsd enforce grace periods of different lengths. This
may cause problems when we reboot a server with both v2/v3 and v4 clients.
For example, if the lockd grace period is shorter (as is likely the case),
then a v3 client might acquire a new lock that conflicts with a lock already
held (but not yet reclaimed) by a v4 client.
This patch calculates a lease time that lockd and nfsd can both use.
Signed-off-by: Marc Eshel <eshel@almaden.ibm.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Currently NFSD calls directly into filesystems through the export_operations
structure. I plan to change this interface in various ways in later patches,
and want to avoid the export of the default operations to NFSD, so this patch
adds two simple exportfs_encode_fh/exportfs_decode_fh helpers for NFSD to call
instead of poking into exportfs guts.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
When the exportfs interface was added the expectation was that filesystems
provide an operation to convert from a file handle to an inode/dentry, but it
kept a backwards compat option that still calls into iget.
Calling into iget from non-filesystem code is very bad, because it gives too
little information to filesystem, and simply crashes if the filesystem doesn't
implement the ->read_inode routine.
Fortunately there are only two filesystems left using this fallback: efs and
jfs. This patch moves a copy of export_iget to each of those to implement the
get_dentry method.
While this is a temporary increase of lines of code in the kernel it allows
for a much cleaner interface and important code restructuring in later
patches.
[akpm@linux-foundation.org: add jfs_get_inode_flags() declaration]
Signed-off-by: Dave Kleikamp <shaggy@linux.vnet.ibm.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
currently the export_operation structure and helpers related to it are in
fs.h. fs.h is already far too large and there are very few places needing the
export bits, so split them off into a separate header.
[akpm@linux-foundation.org: fix cifs build]
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Neil Brown <neilb@suse.de>
Cc: Steven French <sfrench@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The CAPI 2.0 driver uses a semaphore as mutex. Use the mutex API instead of
the (binary) semaphore.
Signed-off-by: Matthias Kaehlcke <matthias.kaehlcke@gmail.com>
Acked-by: Karsten Keil <kkeil@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Quicc Engine enabled mpc83xx CPU's has a somewhat different HW interface to
the SPI controller. This patch adds a qe_mode knob that sees to that
needed adaptions are performed.
Signed-off-by: Joakim Tjernlund <Joakim.Tjernlund@transmode.se>
Signed-off-by: David Brownell <dbrownell@users.sourceforge.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Add support for the Infineon TLE62x0 series of low-side driver chips, such
as the TLE6220 or TLE6230. These can be viewed as output GPIOs specialized
for power switching applications. The driver provides a userspace
interface to those GPIOs, and to the switch status they provide.
Signed-off-by: Ben Dooks <ben-linux@fluff.org>
Signed-off-by: David Brownell <dbrownell@users.sourceforge.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Add CRC7 routines, used for example in MMC over SPI communication.
Kerneldoc updates
[akpm@linux-foundation.org: fix funny mix of const and non-const]
Signed-off-by: Jan Nikitenko <jan.nikitenko@gmail.com>
Signed-off-by: David Brownell <dbrownell@users.sourceforge.net>
Cc: "Randy.Dunlap" <rdunlap@xenotime.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Add a new spi->mode bit: SPI_3WIRE, for chips where the SI and SO signals
are shared (and which are thus only half duplex). Update the LM70 driver
to require support for that hardware mode from the controller.
Signed-off-by: David Brownell <dbrownell@users.sourceforge.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Minor SPI controller driver updates: make the setup() methods reject
spi->mode bits they don't support, by masking aginst the inverse of bits
they *do* support. This insures against misbehavior later when new mode
bits get added.
Most controllers can't support SPI_LSB_FIRST; more handle SPI_CS_HIGH.
Support for all four SPI clock/transfer modes is routine.
Signed-off-by: David Brownell <dbrownell@users.sourceforge.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
sparse now warns if one compares pointers with integers. However, there are
false positives, like:
fs/filesystems.c:72:2: warning: Using plain integer as NULL pointer
Every time BUG_ON(ptr) is used, ptr is checked against integer zero. Avoid
that and save ~70 false positives from allyesconfig run.
mentioned by Al.
Signed-off-by: Alexey Dobriyan <adobriyan@sw.ru>
Acked-by: Al Viro <viro@zeniv.linux.org.uk>
Acked-by: Josh Triplett <josh@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Make arguments of timespec_equal() const struct timespec.
Signed-off-by: Jan Engelhardt <jengelh@gmx.de>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: john stultz <johnstul@us.ibm.com>
Cc: Roman Zippel <zippel@linux-m68k.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
KSYM_NAME_LEN is peculiar in that it does not include the space for the
trailing '\0', forcing all users to use KSYM_NAME_LEN + 1 when allocating
buffer. This is nonsense and error-prone. Moreover, when the caller
forgets that it's very likely to subtly bite back by corrupting the stack
because the last position of the buffer is always cleared to zero.
This patch increments KSYM_NAME_LEN by one and updates code accordingly.
* off-by-one bug in asm-powerpc/kprobes.h::kprobe_lookup_name() macro
is fixed.
* Where MODULE_NAME_LEN and KSYM_NAME_LEN were used together,
MODULE_NAME_LEN was treated as if it didn't include space for the
trailing '\0'. Fix it.
Signed-off-by: Tejun Heo <htejun@gmail.com>
Acked-by: Paulo Marques <pmarques@grupopie.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This is a driver for the SB1250 DUART, a dual serial port implementation
included in the Broadcom family of SOCs descending from the SiByte SB1250
MIPS64 chip multiprocessor. It is a new implementation replacing the
old-fashioned driver currently present in the linux-mips.org tree. It
supports all the usual features one would expect from a(n asynchronous)
serial driver, including modem line control (as far as hardware supports it
-- there is edge detection logic missing from the DCD and RI lines and the
driver does not implement polling of these lines at the moment), the serial
console, BREAK transmission and reception, including the magic SysRq. The
receive FIFO threshold is not maintained though.
The driver was tested with a SWARM board which uses a BCM1250 SOC (which is
dual MIPS64 CMP) and has both ports of the single DUART implemented wired
externally. Both were tested. Testing included using the ports as
terminal lines at 1200bps (which is the ports minimum), 115200bps and a
couple of random speeds inbetween. The modem lines were verified to
operate correctly. No testing was performed with a use as a network
interface, like with SLIP or PPP.
Signed-off-by: Maciej W. Rozycki <macro@linux-mips.org>
Acked-by: Ralf Baechle <ralf@linux-mips.org>
Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The CHILD_MAX macro in limits.h should not be there. It claims to be the
limit on processes a user can own, but its value is wrong for that.
There is no constant value, but a variable resource limit (RLIMIT_NPROC).
Nothing in the kernel uses CHILD_MAX.
The proper thing to do according to POSIX is not to define CHILD_MAX at all.
The sysconf (_SC_CHILD_MAX) implementation works by calling getrlimit.
Signed-off-by: Roland McGrath <roland@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The OPEN_MAX macro in limits.h should not be there. It claims to be the
limit on file descriptors in a process, but its value is wrong for that.
There is no constant value, but a variable resource limit (RLIMIT_NOFILE).
Nothing in the kernel uses OPEN_MAX except things that are wrong to do so.
I've submitted other patches to remove those uses.
The proper thing to do according to POSIX is not to define OPEN_MAX at all.
The sysconf (_SC_OPEN_MAX) implementation works by calling getrlimit.
Signed-off-by: Roland McGrath <roland@redhat.com>
Cc: "David S. Miller" <davem@davemloft.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The OPEN_MAX constant is an arbitrary number with no useful relation to
anything. Nothing should be using it. SCM_MAX_FD is just an arbitrary
constant and it should be clear that its value is chosen in net/scm.h
and not actually derived from anything else meaningful in the system.
Signed-off-by: Roland McGrath <roland@redhat.com>
Cc: "David S. Miller" <davem@davemloft.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Put WARN_ON and fixed all callers of unregister_blkdev(). Now we can make
unregister_blkdev return void.
Cc: Jens Axboe <jens.axboe@oracle.com>
Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Add a proper prototype for proc_nr_files() in include/linux/fs.h
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Identical implementations of PTRACE_POKEDATA go into generic_ptrace_pokedata()
function.
AFAICS, fix bug on xtensa where successful PTRACE_POKEDATA will nevertheless
return EPERM.
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: <linux-arch@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
If the kernel OOPSed or BUGed then it probably should be considered as
tainted. Thus, all subsequent OOPSes and SysRq dumps will report the
tainted kernel. This saves a lot of time explaining oddities in the
calltraces.
Signed-off-by: Pavel Emelianov <xemul@openvz.org>
Acked-by: Randy Dunlap <randy.dunlap@oracle.com>
Cc: <linux-arch@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
[ Added parisc patch from Matthew Wilson -Linus ]
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The bounce buffer logic is included on systems that do not need it. If a
system does not have zones like ZONE_DMA and ZONE_HIGHMEM that can lead to
the use of bounce buffers then there is no need to reserve memory pools etc
etc. This is true f.e. for SGI Altix.
Also nicifies the Makefile and gets rid of the tricky "and" there.
Signed-off-by: Christoph Lameter <clameter@sgi.com>
Acked-by: Jens Axboe <jens.axboe@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Currently, the freezer treats all tasks as freezable, except for the kernel
threads that explicitly set the PF_NOFREEZE flag for themselves. This
approach is problematic, since it requires every kernel thread to either
set PF_NOFREEZE explicitly, or call try_to_freeze(), even if it doesn't
care for the freezing of tasks at all.
It seems better to only require the kernel threads that want to or need to
be frozen to use some freezer-related code and to remove any
freezer-related code from the other (nonfreezable) kernel threads, which is
done in this patch.
The patch causes all kernel threads to be nonfreezable by default (ie. to
have PF_NOFREEZE set by default) and introduces the set_freezable()
function that should be called by the freezable kernel threads in order to
unset PF_NOFREEZE. It also makes all of the currently freezable kernel
threads call set_freezable(), so it shouldn't cause any (intentional)
change of behaviour to appear. Additionally, it updates documentation to
describe the freezing of tasks more accurately.
[akpm@linux-foundation.org: build fixes]
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Acked-by: Nigel Cunningham <nigel@nigel.suspend2.net>
Cc: Pavel Machek <pavel@ucw.cz>
Cc: Oleg Nesterov <oleg@tv-sign.ru>
Cc: Gautham R Shenoy <ego@in.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Detect slab objects being passed to the page oriented functions of the VM.
It is not sufficient to simply return NULL because the functions calling
page_mapping may depend on other items of the page_struct also to be setup
properly. Moreover slab object may not be properly aligned. The page
oriented functions of the VM expect to operate on page aligned, page sized
objects. Operations on object straddling page boundaries may only affect the
objects partially which may lead to surprising results.
It is better to detect eventually remaining uses and eliminate them.
Signed-off-by: Christoph Lameter <clameter@sgi.com>
Cc: Hugh Dickins <hugh@veritas.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
It becomes now easy to support the zeroing allocs with generic inline
functions in slab.h. Provide inline definitions to allow the continued use of
kzalloc, kmem_cache_zalloc etc but remove other definitions of zeroing
functions from the slab allocators and util.c.
Signed-off-by: Christoph Lameter <clameter@sgi.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Add #ifdefs around data structures only needed if debugging is compiled into
SLUB.
Add inlines to small functions to reduce code size.
Signed-off-by: Christoph Lameter <clameter@sgi.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Define ZERO_OR_NULL_PTR macro to be able to remove the checks from the
allocators. Move ZERO_SIZE_PTR related stuff into slab.h.
Make ZERO_SIZE_PTR work for all slab allocators and get rid of the
WARN_ON_ONCE(size == 0) that is still remaining in SLAB.
Make slub return NULL like the other allocators if a too large memory segment
is requested via __kmalloc.
Signed-off-by: Christoph Lameter <clameter@sgi.com>
Acked-by: Pekka Enberg <penberg@cs.helsinki.fi>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
I can never remember what the function to register to receive VM pressure
is called. I have to trace down from __alloc_pages() to find it.
It's called "set_shrinker()", and it needs Your Help.
1) Don't hide struct shrinker. It contains no magic.
2) Don't allocate "struct shrinker". It's not helpful.
3) Call them "register_shrinker" and "unregister_shrinker".
4) Call the function "shrink" not "shrinker".
5) Reduce the 17 lines of waffly comments to 13, but document it properly.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Cc: David Chinner <dgc@sgi.com>
Cc: Trond Myklebust <trond.myklebust@fys.uio.no>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
When we are out of memory of a suitable size we enter reclaim. The current
reclaim algorithm targets pages in LRU order, which is great for fairness at
order-0 but highly unsuitable if you desire pages at higher orders. To get
pages of higher order we must shoot down a very high proportion of memory;
>95% in a lot of cases.
This patch set adds a lumpy reclaim algorithm to the allocator. It targets
groups of pages at the specified order anchored at the end of the active and
inactive lists. This encourages groups of pages at the requested orders to
move from active to inactive, and active to free lists. This behaviour is
only triggered out of direct reclaim when higher order pages have been
requested.
This patch set is particularly effective when utilised with an
anti-fragmentation scheme which groups pages of similar reclaimability
together.
This patch set is based on Peter Zijlstra's lumpy reclaim V2 patch which forms
the foundation. Credit to Mel Gorman for sanitity checking.
Mel said:
The patches have an application with hugepage pool resizing.
When lumpy-reclaim is used used with ZONE_MOVABLE, the hugepages pool can
be resized with greater reliability. Testing on a desktop machine with 2GB
of RAM showed that growing the hugepage pool with ZONE_MOVABLE on it's own
was very slow as the success rate was quite low. Without lumpy-reclaim,
each attempt to grow the pool by 100 pages would yield 1 or 2 hugepages.
With lumpy-reclaim, getting 40 to 70 hugepages on each attempt was typical.
[akpm@osdl.org: ia64 pfn_to_nid fixes and loop cleanup]
[bunk@stusta.de: static declarations for internal functions]
[a.p.zijlstra@chello.nl: initial lumpy V2 implementation]
Signed-off-by: Andy Whitcroft <apw@shadowen.org>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Acked-by: Mel Gorman <mel@csn.ul.ie>
Acked-by: Mel Gorman <mel@csn.ul.ie>
Cc: Bob Picco <bob.picco@hp.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This patch adds the kernelcore= parameter for x86.
Once all patches are applied, a new command-line parameter exist and a new
sysctl. This patch adds the necessary documentation.
From: Yasunori Goto <y-goto@jp.fujitsu.com>
When "kernelcore" boot option is specified, kernel can't boot up on ia64
because of an infinite loop. In addition, the parsing code can be handled
in an architecture-independent manner.
This patch uses common code to handle the kernelcore= parameter. It is
only available to architectures that support arch-independent zone-sizing
(i.e. define CONFIG_ARCH_POPULATES_NODE_MAP). Other architectures will
ignore the boot parameter.
[bunk@stusta.de: make cmdline_parse_kernelcore() static]
Signed-off-by: Mel Gorman <mel@csn.ul.ie>
Signed-off-by: Yasunori Goto <y-goto@jp.fujitsu.com>
Acked-by: Andy Whitcroft <apw@shadowen.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Huge pages are not movable so are not allocated from ZONE_MOVABLE. However,
as ZONE_MOVABLE will always have pages that can be migrated or reclaimed, it
can be used to satisfy hugepage allocations even when the system has been
running a long time. This allows an administrator to resize the hugepage pool
at runtime depending on the size of ZONE_MOVABLE.
This patch adds a new sysctl called hugepages_treat_as_movable. When a
non-zero value is written to it, future allocations for the huge page pool
will use ZONE_MOVABLE. Despite huge pages being non-movable, we do not
introduce additional external fragmentation of note as huge pages are always
the largest contiguous block we care about.
[akpm@linux-foundation.org: various fixes]
Signed-off-by: Mel Gorman <mel@csn.ul.ie>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The following 8 patches against 2.6.20-mm2 create a zone called ZONE_MOVABLE
that is only usable by allocations that specify both __GFP_HIGHMEM and
__GFP_MOVABLE. This has the effect of keeping all non-movable pages within a
single memory partition while allowing movable allocations to be satisfied
from either partition. The patches may be applied with the list-based
anti-fragmentation patches that groups pages together based on mobility.
The size of the zone is determined by a kernelcore= parameter specified at
boot-time. This specifies how much memory is usable by non-movable
allocations and the remainder is used for ZONE_MOVABLE. Any range of pages
within ZONE_MOVABLE can be released by migrating the pages or by reclaiming.
When selecting a zone to take pages from for ZONE_MOVABLE, there are two
things to consider. First, only memory from the highest populated zone is
used for ZONE_MOVABLE. On the x86, this is probably going to be ZONE_HIGHMEM
but it would be ZONE_DMA on ppc64 or possibly ZONE_DMA32 on x86_64. Second,
the amount of memory usable by the kernel will be spread evenly throughout
NUMA nodes where possible. If the nodes are not of equal size, the amount of
memory usable by the kernel on some nodes may be greater than others.
By default, the zone is not as useful for hugetlb allocations because they are
pinned and non-migratable (currently at least). A sysctl is provided that
allows huge pages to be allocated from that zone. This means that the huge
page pool can be resized to the size of ZONE_MOVABLE during the lifetime of
the system assuming that pages are not mlocked. Despite huge pages being
non-movable, we do not introduce additional external fragmentation of note as
huge pages are always the largest contiguous block we care about.
Credit goes to Andy Whitcroft for catching a large variety of problems during
review of the patches.
This patch creates an additional zone, ZONE_MOVABLE. This zone is only usable
by allocations which specify both __GFP_HIGHMEM and __GFP_MOVABLE. Hot-added
memory continues to be placed in their existing destination as there is no
mechanism to redirect them to a specific zone.
[y-goto@jp.fujitsu.com: Fix section mismatch of memory hotplug related code]
[akpm@linux-foundation.org: various fixes]
Signed-off-by: Mel Gorman <mel@csn.ul.ie>
Cc: Andy Whitcroft <apw@shadowen.org>
Signed-off-by: Yasunori Goto <y-goto@jp.fujitsu.com>
Cc: William Lee Irwin III <wli@holomorphy.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
It is often known at allocation time whether a page may be migrated or not.
This patch adds a flag called __GFP_MOVABLE and a new mask called
GFP_HIGH_MOVABLE. Allocations using the __GFP_MOVABLE can be either migrated
using the page migration mechanism or reclaimed by syncing with backing
storage and discarding.
An API function very similar to alloc_zeroed_user_highpage() is added for
__GFP_MOVABLE allocations called alloc_zeroed_user_highpage_movable(). The
flags used by alloc_zeroed_user_highpage() are not changed because it would
change the semantics of an existing API. After this patch is applied there
are no in-kernel users of alloc_zeroed_user_highpage() so it probably should
be marked deprecated if this patch is merged.
Note that this patch includes a minor cleanup to the use of __GFP_ZERO in
shmem.c to keep all flag modifications to inode->mapping in the
shmem_dir_alloc() helper function. This clean-up suggestion is courtesy of
Hugh Dickens.
Additional credit goes to Christoph Lameter and Linus Torvalds for shaping the
concept. Credit to Hugh Dickens for catching issues with shmem swap vector
and ramfs allocations.
[akpm@linux-foundation.org: build fix]
[hugh@veritas.com: __GFP_ZERO cleanup]
Signed-off-by: Mel Gorman <mel@csn.ul.ie>
Cc: Andy Whitcroft <apw@shadowen.org>
Cc: Christoph Lameter <clameter@sgi.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Nobody is using ptep_test_and_clear_dirty and ptep_clear_flush_dirty. Remove
the functions from all architectures.
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Hugh Dickins <hugh@veritas.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The last user of ptep_establish in mm/ is long gone. Remove the architecture
primitive as well.
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Hugh Dickins <hugh@veritas.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Add support for IRQ migration across vector domain.
Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Add fundamental support for multiple vector domain. There still exists
only one vector domain even with this patch. IRQ migration across
domain is not supported yet by this patch.
Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Add mapping tables between irqs and vectors, and its management code.
This is necessary for supporting multiple vector domain because 1:1
mapping between irq and vector will be changed to n:1.
The irq == vector relationship between irqs and vectors is explicitly
remained for percpu interrupts, platform interrupts, isa IRQs and
vectors assigned using assign_irq_vector() because some programs might
depend on it.
And I should consider the following problem.
When pci drivers enabled/disabled devices dynamically, its irq number
is changed to the different one. Therefore, suspend/resume code may
happen problem.
To fix this problem, I bound gsi to irq.
Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Don't define an empty struct bsg_class_device if !CONFIG_BLK_DEV_BSG.
It's embedded in struct request_queue, but there we have
#if defined(CONFIG_BLK_DEV_BSG)
struct bsg_class_device bsg_dev;
#endif
anyway.
Signed-off-by: Geert Uytterhoeven <Geert.Uytterhoeven@sonycom.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
* 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc: (209 commits)
[POWERPC] Create add_rtc() function to enable the RTC CMOS driver
[POWERPC] Add H_ILLAN_ATTRIBUTES hcall number
[POWERPC] xilinxfb: Parameterize xilinxfb platform device registration
[POWERPC] Oprofile support for Power 5++
[POWERPC] Enable arbitary speed tty ioctls and split input/output speed
[POWERPC] Make drivers/char/hvc_console.c:khvcd() static
[POWERPC] Remove dead code for preventing pread() and pwrite() calls
[POWERPC] Remove unnecessary #undef printk from prom.c
[POWERPC] Fix typo in Ebony default DTS
[POWERPC] Check for NULL ppc_md.init_IRQ() before calling
[POWERPC] Remove extra return statement
[POWERPC] pasemi: Don't auto-select CONFIG_EMBEDDED
[POWERPC] pasemi: Rename platform
[POWERPC] arch/powerpc/kernel/sysfs.c: Move NUMA exports
[POWERPC] Add __read_mostly support for powerpc
[POWERPC] Modify sched_clock() to make CONFIG_PRINTK_TIME more sane
[POWERPC] Create a dummy zImage if no valid platform has been selected
[POWERPC] PS3: Bootwrapper support.
[POWERPC] powermac i2c: Use mutex
[POWERPC] Schedule removal of arch/ppc
...
Fixed up conflicts manually in:
Documentation/feature-removal-schedule.txt
arch/powerpc/kernel/pci_32.c
arch/powerpc/kernel/pci_64.c
include/asm-powerpc/pci.h
and asked the powerpc people to double-check the result..
Convert the macb driver to use the generic PHY layer in
drivers/net/phy.
Signed-off-by: Frederic RODO <f.rodo@til-technologies.fr>
Signed-off-by: Haavard Skinnemoen <hskinnemoen@atmel.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
This reverts commit 29578624e3.
Ingo Molnar reports complete breakage with his e1000 card (no
networking, card reports transmit timeouts), and bisected it down to
this commit. Let's figure out what went wrong, but not keep breaking
machines until we do.
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Olaf Kirch <olaf.kirch@oracle.com>
Cc: David Miller <davem@davemloft.net>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mfasheh/ocfs2: (32 commits)
[PATCH] ocfs2: zero_user_page conversion
ocfs2: Support xfs style space reservation ioctls
ocfs2: support for removing file regions
ocfs2: update truncate handling of partial clusters
ocfs2: btree support for removal of arbirtrary extents
ocfs2: Support creation of unwritten extents
ocfs2: support writing of unwritten extents
ocfs2: small cleanup of ocfs2_write_begin_nolock()
ocfs2: btree changes for unwritten extents
ocfs2: abstract btree growing calls
ocfs2: use all extent block suballocators
ocfs2: plug truncate into cached dealloc routines
ocfs2: simplify deallocation locking
ocfs2: harden buffer check during mapping of page blocks
ocfs2: shared writeable mmap
ocfs2: factor out write aops into nolock variants
ocfs2: rework ocfs2_buffered_write_cluster()
ocfs2: take ip_alloc_sem during entire truncate
ocfs2: Add "preferred slot" mount option
[KJ PATCH] Replacing memset(<addr>,0,PAGE_SIZE) with clear_page() in fs/ocfs2/dlm/dlmrecovery.c
...
* 'bsg' of git://git.kernel.dk/data/git/linux-2.6-block: (25 commits)
bsg: Kconfig updates
bsg: add SCSI transport-level request support
bsg: add bidi support
add a struct request pointer to the request structure
bsg: fix the deadlock on discarding done commands
bsg: fix a blocking read bug
bsg: minor bug fixes
improve bsg device allocation
bind bsg to all SCSI devices
bsg: bind bsg to request_queue instead of gendisk
bsg: add a request_queue argument to scsi_cmd_ioctl()
bsg: simplify __bsg_alloc_command failpath
bsg: add cheasy error checks for sysfs stuff
Add queue resizing support
Replace s32, u32 and u64 with __s32, __u32 and __u64 in bsg.h for userspace
bsg: silence a bogus gcc warning
bsg: style cleanup
bsg: use u32 etc instead of uint32_t
bsg: add SG_IO to SG v4
bsg: replace SG v3 with SG v4
...
* 'for-linus' of git://git.kernel.dk/data/git/linux-2.6-block:
splice: direct splicing updates ppos twice
more ACSI removal
umem: Fix match of pci_ids in umem driver
umem: Remove references to dead CONFIG_MM_MAP_MEMORY variable
remove the documentation for the legacy CDROM drivers
* 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/sparc-2.6: (26 commits)
[SPARC64]: Fix UP build.
[SPARC64]: dr-cpu unconfigure support.
[SERIAL]: Fix console write locking in sparc drivers.
[SPARC64]: Give more accurate errors in dr_cpu_configure().
[SPARC64]: Clear cpu_{core,sibling}_map[] in smp_fill_in_sib_core_maps()
[SPARC64]: Fix leak when DR added cpu does not bootup.
[SPARC64]: Add ->set_affinity IRQ handlers.
[SPARC64]: Process dr-cpu events in a kthread instead of workqueue.
[SPARC64]: More sensible udelay implementation.
[SPARC64]: SMP build fixes.
[SPARC64]: mdesc.c needs linux/mm.h
[SPARC64]: Fix build regressions added by dr-cpu changes.
[SPARC64]: Unconditionally register vio_bus_type.
[SPARC64]: Initial LDOM cpu hotplug support.
[SPARC64]: Fix setting of variables in LDOM guest.
[SPARC64]: Fix MD property lifetime bugs.
[SPARC64]: Abstract out mdesc accesses for better MD update handling.
[SPARC64]: Use more mearningful names for IRQ registry.
[SPARC64]: Initial domain-services driver.
[SPARC64]: Export powerd facilities for external entities.
...
* master.kernel.org:/pub/scm/linux/kernel/git/lethal/sh-2.6: (68 commits)
sh: sh-rtc support for SH7709.
sh: Revert __xdiv64_32 size change.
sh: Update r7785rp defconfig.
sh: Export div symbols for GCC 4.2 and ST GCC.
sh: fix race in parallel out-of-tree build
sh: Kill off dead mach.c for hp6xx.
sh: hd64461.h cleanup and added comments.
sh: Update the alignment when 4K stacks are used.
sh: Add a .bss.page_aligned section for 4K stacks.
sh: Don't let SH-4A clobber SH-4 CFLAGS.
sh: Add parport stub for SuperIO ports.
sh: Drop -Wa,-dsp for DSP tuning.
sh: Update dreamcast defconfig.
fb: pvr2fb: A few more __devinit annotations for PCI.
fb: pvr2fb: Fix up section mismatch warnings.
sh: Select IPR-IRQ for SH7091.
sh: Correct __xdiv64_32/div64_32 return value size.
sh: Fix timer-tmu build for SH-3.
sh: Add cpu and mach links to CLEAN_FILES.
sh: Preliminary support for the SH-X3 CPU.
...
This is a patch that speeds up statfs. It is very simple - the "overhead"
calculation, which takes a huge amount of time for large filesystems, never
changes unless the size of the filesystem itself changes. That means we can
store it in memory and only recalculate if the filesystem has been resized
(almost never).
It also fixes a minor problem that we never update the on-disk superblock free
blocks/inodes counts until the filesystem is unmounted. While not fatal, we
may as well update that on disk when we have the information, and it makes
things like debugfs and dumpe2fs report a bit more accurate info.
Signed-off-by: Badari Pulavarty <pbadari@us.ibm.com>
Signed-off-by: Andreas Dilger <adilger@clusterfs.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This is a patch that speeds up statfs. It is very simple - the "overhead"
calculation, which takes a huge amount of time for large filesystems, never
changes unless the size of the filesystem itself changes. That means we can
store it in memory and only recalculate if the filesystem has been resized
(almost never).
It also fixes a minor problem that we never update the on-disk superblock free
blocks/inodes counts until the filesystem is unmounted. While not fatal, we
may as well update that on disk when we have the information, and it makes
things like debugfs and dumpe2fs report a bit more accurate info.
Signed-off-by: Badari Pulavarty <pbadari@us.ibm.com>
Signed-off-by: Andreas Dilger <adilger@clusterfs.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This is a patch that speeds up statfs. It is very simple - the "overhead"
calculation, which takes a huge amount of time for large filesystems, never
changes unless the size of the filesystem itself changes. That means we can
store it in memory and only recalculate if the filesystem has been resized
(almost never).
It also fixes a minor problem that we never update the on-disk superblock free
blocks/inodes counts until the filesystem is unmounted. While not fatal, we
may as well update that on disk when we have the information, and it makes
things like debugfs and dumpe2fs report a bit more accurate info.
Signed-off-by: Badari Pulavarty <pbadari@us.ibm.com>
Signed-off-by: Andreas Dilger <adilger@clusterfs.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Change cancel_work_sync() and cancel_delayed_work_sync() to return a boolean
indicating whether the work was actually cancelled. A zero return value means
that the work was not pending/queued.
Without that kind of change it is not possible to avoid flush_workqueue()
sometimes, see the next patch as an example.
Also, this patch unifies both functions and kills the (unlikely) busy-wait
loop.
Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru>
Acked-by: Jarek Poplawski <jarkao2@o2.pl>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Imho, the current naming of cancel_xxx workqueue functions is very confusing.
cancel_delayed_work()
cancel_rearming_delayed_work()
cancel_rearming_delayed_workqueue() // obsolete
cancel_work_sync()
This looks as if the first 2 functions differ in "type" of their argument
which is not true any longer, nowadays the difference is the behaviour.
The semantics of cancel_rearming_delayed_work(dwork) was changed
significantly, it doesn't require that dwork rearms itself, and cancels dwork
synchronously.
Rename it to cancel_delayed_work_sync(). This matches cancel_delayed_work()
and cancel_work_sync(). Re-create cancel_rearming_delayed_work() as a simple
inline obsolete wrapper, like cancel_rearming_delayed_workqueue().
Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru>
Acked-by: Jarek Poplawski <jarkao2@o2.pl>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The UMSDOS filesystem was removed back in 2.6.11, but some tiny bits stuck
around. This patch removes the few remaining leftovers. The only things
left behind after this are the entries in the CREDITS file and the ioctl
number in Documentation/ioctl-number.txt as documentation.
This third (hopefully final) version of the patch doesn't edit the
arch/um/config.release file, since Jeff Dike pointed out to me that it
should die completely, and asked me to remove it from my patch as he'll
send in a seperate patch removing the file completely.
Signed-off-by: Jesper Juhl <jesper.juhl@gmail.com>
Acked-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The current generic bug implementation has a call to dump_stack() in case a
WARN_ON(whatever) gets hit. Since report_bug(), which calls dump_stack(),
gets called from an exception handler we can do better: just pass the
pt_regs structure to report_bug() and pass it to show_regs() in case of a
warning. This will give more debug informations like register contents,
etc... In addition this avoids some pointless lines that dump_stack()
emits, since it includes a stack backtrace of the exception handler which
is of no interest in case of a warning. E.g. on s390 the following lines
are currently always present in a stack backtrace if dump_stack() gets
called from report_bug():
[<000000000001517a>] show_trace+0x92/0xe8)
[<0000000000015270>] show_stack+0xa0/0xd0
[<00000000000152ce>] dump_stack+0x2e/0x3c
[<0000000000195450>] report_bug+0x98/0xf8
[<0000000000016cc8>] illegal_op+0x1fc/0x21c
[<00000000000227d6>] sysc_return+0x0/0x10
Acked-by: Jeremy Fitzhardinge <jeremy@goop.org>
Acked-by: Haavard Skinnemoen <hskinnemoen@atmel.com>
Cc: Andi Kleen <ak@suse.de>
Cc: Kyle McMartin <kyle@parisc-linux.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Paul Mundt <lethal@linux-sh.org>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This is a rather bizarre thing to have inlined in io.h. Stick it in lib/
instead.
While we're there, despaghetti it a bit, and fix its off-by-one behaviour when
passed a zero length.
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This follows a suggestion from Chuck Ebbert on how to make seccomp
absolutely zerocost in schedule too. The only remaining footprint of
seccomp is in terms of the bzImage size that becomes a few bytes (perhaps
even a few kbytes) larger, measure it if you care in the embedded.
Signed-off-by: Andrea Arcangeli <andrea@cpushare.com>
Cc: Andi Kleen <ak@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This reduces the memory footprint and it enforces that only the current
task can enable seccomp on itself (this is a requirement for a
strightforward [modulo preempt ;) ] TIF_NOTSC implementation).
Signed-off-by: Andrea Arcangeli <andrea@cpushare.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Robert P. J. Day <rpjday@mindspring.com>
Cc: Jens Axboe <jens.axboe@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
While working on unshare support for the network namespace I noticed we
were putting clone flags in an int. Which is weird because the syscall
uses unsigned long and we at least need an unsigned to properly hold all of
the unshare flags.
So to make the code consistent, this patch updates the code to use
unsigned long instead of int for the clone flags in those places
where we get it wrong today.
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Acked-by: Cedric Le Goater <clg@fr.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Robert P. J. Day <rpjday@mindspring.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
One common problem with 32 bit system call and ioctl emulation is the
different alignment rules between i386 and 64 bit machines. A number of
drivers work around this by marking the compat structures as
'attribute((packed))', which is not the right solution because it breaks
all the non-x86 architectures that want to use the same compat code.
Hopefully, this patch improves the situation, it introduces two new types,
compat_u64 and compat_s64. These are defined on all architectures to have
the same size and alignment as the 32 bit version of u64 and s64.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: David S. Miller <davem@davemloft.net>
Cc: David Woodhouse <dwmw2@infradead.org>
Cc: Andi Kleen <ak@suse.de>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Vasily Tarasov <vtaras@openvz.org>
Cc: <linux-arch@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
aa95387774 removed the implementation of
lock_cpu_hotplug_interruptible and all users of it. This stub definition
for !CONFIG_HOTPLUG_CPU was left over -- kill it now.
Signed-off-by: Nathan Lynch <ntl@pobox.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Remove not only the references to Cobalt NVRAM, but the header file as
well.
Signed-off-by: Robert P. J. Day <rpjday@mindspring.com>
Acked-by: Tim Hockin <thockin@hockin.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This patch enables the unshare of user namespaces.
It adds a new clone flag CLONE_NEWUSER and implements copy_user_ns() which
resets the current user_struct and adds a new root user (uid == 0)
For now, unsharing the user namespace allows a process to reset its
user_struct accounting and uid 0 in the new user namespace should be contained
using appropriate means, for instance selinux
The plan, when the full support is complete (all uid checks covered), is to
keep the original user's rights in the original namespace, and let a process
become uid 0 in the new namespace, with full capabilities to the new
namespace.
Signed-off-by: Serge E. Hallyn <serue@us.ibm.com>
Signed-off-by: Cedric Le Goater <clg@fr.ibm.com>
Acked-by: Pavel Emelianov <xemul@openvz.org>
Cc: Herbert Poetzl <herbert@13thfloor.at>
Cc: Kirill Korotaev <dev@sw.ru>
Cc: Eric W. Biederman <ebiederm@xmission.com>
Cc: Chris Wright <chrisw@sous-sol.org>
Cc: Stephen Smalley <sds@tycho.nsa.gov>
Cc: James Morris <jmorris@namei.org>
Cc: Andrew Morgan <agm@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Basically, it will allow a process to unshare its user_struct table,
resetting at the same time its own user_struct and all the associated
accounting.
A new root user (uid == 0) is added to the user namespace upon creation.
Such root users have full privileges and it seems that theses privileges
should be controlled through some means (process capabilities ?)
The unshare is not included in this patch.
Changes since [try #4]:
- Updated get_user_ns and put_user_ns to accept NULL, and
get_user_ns to return the namespace.
Changes since [try #3]:
- moved struct user_namespace to files user_namespace.{c,h}
Changes since [try #2]:
- removed struct user_namespace* argument from find_user()
Changes since [try #1]:
- removed struct user_namespace* argument from find_user()
- added a root_user per user namespace
Signed-off-by: Cedric Le Goater <clg@fr.ibm.com>
Signed-off-by: Serge E. Hallyn <serue@us.ibm.com>
Acked-by: Pavel Emelianov <xemul@openvz.org>
Cc: Herbert Poetzl <herbert@13thfloor.at>
Cc: Kirill Korotaev <dev@sw.ru>
Cc: Eric W. Biederman <ebiederm@xmission.com>
Cc: Chris Wright <chrisw@sous-sol.org>
Cc: Stephen Smalley <sds@tycho.nsa.gov>
Cc: James Morris <jmorris@namei.org>
Cc: Andrew Morgan <agm@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
CONFIG_UTS_NS and CONFIG_IPC_NS have very little value as they only
deactivate the unshare of the uts and ipc namespaces and do not improve
performance.
Signed-off-by: Cedric Le Goater <clg@fr.ibm.com>
Acked-by: "Serge E. Hallyn" <serue@us.ibm.com>
Cc: Eric W. Biederman <ebiederm@xmission.com>
Cc: Herbert Poetzl <herbert@13thfloor.at>
Cc: Pavel Emelianov <xemul@openvz.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Add TTY input auditing, used to audit system administrator's actions. This is
required by various security standards such as DCID 6/3 and PCI to provide
non-repudiation of administrator's actions and to allow a review of past
actions if the administrator seems to overstep their duties or if the system
becomes misconfigured for unknown reasons. These requirements do not make it
necessary to audit TTY output as well.
Compared to an user-space keylogger, this approach records TTY input using the
audit subsystem, correlated with other audit events, and it is completely
transparent to the user-space application (e.g. the console ioctls still
work).
TTY input auditing works on a higher level than auditing all system calls
within the session, which would produce an overwhelming amount of mostly
useless audit events.
Add an "audit_tty" attribute, inherited across fork (). Data read from TTYs
by process with the attribute is sent to the audit subsystem by the kernel.
The audit netlink interface is extended to allow modifying the audit_tty
attribute, and to allow sending explanatory audit events from user-space (for
example, a shell might send an event containing the final command, after the
interactive command-line editing and history expansion is performed, which
might be difficult to decipher from the TTY input alone).
Because the "audit_tty" attribute is inherited across fork (), it would be set
e.g. for sshd restarted within an audited session. To prevent this, the
audit_tty attribute is cleared when a process with no open TTY file
descriptors (e.g. after daemon startup) opens a TTY.
See https://www.redhat.com/archives/linux-audit/2007-June/msg00000.html for a
more detailed rationale document for an older version of this patch.
[akpm@linux-foundation.org: build fix]
Signed-off-by: Miloslav Trmac <mitr@redhat.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
Cc: Paul Fulghum <paulkf@microgate.com>
Cc: Casey Schaufler <casey@schaufler-ca.com>
Cc: Steve Grubb <sgrubb@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Currently we handle spurious IRQ activity based upon seeing a lot of
invalid interrupts, and we clear things back on the base of lots of valid
interrupts.
Unfortunately in some cases you get legitimate invalid interrupts caused by
timing asynchronicity between the PCI bus and the APIC bus when disabling
interrupts and pulling other tricks. In this case although the spurious
IRQs are not a problem our unhandled counters didn't clear and they act as
a slow running timebomb. (This is effectively what the serial port/tty
problem that was fixed by clearing counters when registering a handler
showed up)
It's easy enough to add a second parameter - time. This means that if we
see a regular stream of harmless spurious interrupts which are not harming
processing we don't go off and do something stupid like disable the IRQ
after a month of running. OTOH lockups and performance killers show up a
lot more than 10/second
[akpm@linux-foundation.org: cleanup]
Signed-off-by: Alan Cox <alan@redhat.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Dave Jones <davej@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Make available to the user the following task and process performance
statistics:
* Involuntary Context Switches (task_struct->nivcsw)
* Voluntary Context Switches (task_struct->nvcsw)
Statistics information is available from:
1. taskstats interface (Documentation/accounting/)
2. /proc/PID/status (task only).
This data is useful for detecting hyperactivity patterns between processes.
[akpm@linux-foundation.org: cleanup]
Signed-off-by: Maxim Uvarov <muvarov@ru.mvista.com>
Cc: Shailabh Nagar <nagar@watson.ibm.com>
Cc: Balbir Singh <balbir@in.ibm.com>
Cc: Jay Lan <jlan@engr.sgi.com>
Cc: Jonathan Lim <jlim@sgi.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Drop <linux/isicom.h> from being exported to user space since it would
be only an empty file.
Signed-off-by: Robert P. J. Day <rpjday@mindspring.com>
Acked-by: Jiri Slaby <jirislaby@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Remove the no longer used sonypi_camera_command().
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Acked-by: Mattia Dongili <malattia@linux.it>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This patch fixes dead keys and copy/paste of non-ASCII characters in UTF-8
mode on Linux console. See more details about the original patch at:
http://chris.heathens.co.nz/linux/utf8.html
Already posted on
(Oldest) http://lkml.org/lkml/2003/5/31/148http://lkml.org/lkml/2005/12/24/69
(Recent) http://lkml.org/lkml/2006/8/7/75
[bunk@stusta.de: make drivers/char/selection.c:store_utf8() static]
Signed-off-by: Jan Engelhardt <jengelh@gmx.de>
Cc: Alexander E. Patrakov <patrakov@ums.usu.ru>
Cc: Dmitry Torokhov <dtor@mail.ru>
Cc: "Antonino A. Daplas" <adaplas@pol.net>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Cc: David Woodhouse <dwmw2@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
I forgot to remove capability.h from mm.h while removing sched.h! This
patch remedies that, because the only inline function which was using
CAP_something was made out of line.
Cross-compile tested without regressions on:
all powerpc defconfigs
all mips defconfigs
all m68k defconfigs
all arm defconfigs
all ia64 defconfigs
alpha alpha-allnoconfig alpha-defconfig alpha-up
arm
i386 i386-allnoconfig i386-defconfig i386-up
ia64 ia64-allnoconfig ia64-defconfig ia64-up
m68k
mips
parisc parisc-allnoconfig parisc-defconfig parisc-up
powerpc powerpc-up
s390 s390-allnoconfig s390-defconfig s390-up
sparc sparc-allnoconfig sparc-defconfig sparc-up
sparc64 sparc64-allnoconfig sparc64-defconfig sparc64-up
um-x86_64
x86_64 x86_64-allnoconfig x86_64-defconfig x86_64-up
as well as my two usual configs.
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Part two in the O_CLOEXEC saga: adding support for file descriptors received
through Unix domain sockets.
The patch is once again pretty minimal, it introduces a new flag for recvmsg
and passes it just like the existing MSG_CMSG_COMPAT flag. I think this bit
is not used otherwise but the networking people will know better.
This new flag is not recognized by recvfrom and recv. These functions cannot
be used for that purpose and the asymmetry this introduces is not worse than
the already existing MSG_CMSG_COMPAT situations.
The patch must be applied on the patch which introduced O_CLOEXEC. It has to
remove static from the new get_unused_fd_flags function but since scm.c cannot
live in a module the function still hasn't to be exported.
Here's a test program to make sure the code works. It's so much longer than
the actual patch...
#include <errno.h>
#include <error.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <sys/socket.h>
#include <sys/un.h>
#ifndef O_CLOEXEC
# define O_CLOEXEC 02000000
#endif
#ifndef MSG_CMSG_CLOEXEC
# define MSG_CMSG_CLOEXEC 0x40000000
#endif
int
main (int argc, char *argv[])
{
if (argc > 1)
{
int fd = atol (argv[1]);
printf ("child: fd = %d\n", fd);
if (fcntl (fd, F_GETFD) == 0 || errno != EBADF)
{
puts ("file descriptor valid in child");
return 1;
}
return 0;
}
struct sockaddr_un sun;
strcpy (sun.sun_path, "./testsocket");
sun.sun_family = AF_UNIX;
char databuf[] = "hello";
struct iovec iov[1];
iov[0].iov_base = databuf;
iov[0].iov_len = sizeof (databuf);
union
{
struct cmsghdr hdr;
char bytes[CMSG_SPACE (sizeof (int))];
} buf;
struct msghdr msg = { .msg_iov = iov, .msg_iovlen = 1,
.msg_control = buf.bytes,
.msg_controllen = sizeof (buf) };
struct cmsghdr *cmsg = CMSG_FIRSTHDR (&msg);
cmsg->cmsg_level = SOL_SOCKET;
cmsg->cmsg_type = SCM_RIGHTS;
cmsg->cmsg_len = CMSG_LEN (sizeof (int));
msg.msg_controllen = cmsg->cmsg_len;
pid_t child = fork ();
if (child == -1)
error (1, errno, "fork");
if (child == 0)
{
int sock = socket (PF_UNIX, SOCK_STREAM, 0);
if (sock < 0)
error (1, errno, "socket");
if (bind (sock, (struct sockaddr *) &sun, sizeof (sun)) < 0)
error (1, errno, "bind");
if (listen (sock, SOMAXCONN) < 0)
error (1, errno, "listen");
int conn = accept (sock, NULL, NULL);
if (conn == -1)
error (1, errno, "accept");
*(int *) CMSG_DATA (cmsg) = sock;
if (sendmsg (conn, &msg, MSG_NOSIGNAL) < 0)
error (1, errno, "sendmsg");
return 0;
}
/* For a test suite this should be more robust like a
barrier in shared memory. */
sleep (1);
int sock = socket (PF_UNIX, SOCK_STREAM, 0);
if (sock < 0)
error (1, errno, "socket");
if (connect (sock, (struct sockaddr *) &sun, sizeof (sun)) < 0)
error (1, errno, "connect");
unlink (sun.sun_path);
*(int *) CMSG_DATA (cmsg) = -1;
if (recvmsg (sock, &msg, MSG_CMSG_CLOEXEC) < 0)
error (1, errno, "recvmsg");
int fd = *(int *) CMSG_DATA (cmsg);
if (fd == -1)
error (1, 0, "no descriptor received");
char fdname[20];
snprintf (fdname, sizeof (fdname), "%d", fd);
execl ("/proc/self/exe", argv[0], fdname, NULL);
puts ("execl failed");
return 1;
}
[akpm@linux-foundation.org: Fix fastcall inconsistency noted by Michael Buesch]
[akpm@linux-foundation.org: build fix]
Signed-off-by: Ulrich Drepper <drepper@redhat.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Michael Buesch <mb@bu3sch.de>
Cc: Michael Kerrisk <mtk-manpages@gmx.net>
Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The problem is as follows: in multi-threaded code (or more correctly: all
code using clone() with CLONE_FILES) we have a race when exec'ing.
thread #1 thread #2
fd=open()
fork + exec
fcntl(fd,F_SETFD,FD_CLOEXEC)
In some applications this can happen frequently. Take a web browser. One
thread opens a file and another thread starts, say, an external PDF viewer.
The result can even be a security issue if that open file descriptor
refers to a sensitive file and the external program can somehow be tricked
into using that descriptor.
Just adding O_CLOEXEC support to open() doesn't solve the whole set of
problems. There are other ways to create file descriptors (socket,
epoll_create, Unix domain socket transfer, etc). These can and should be
addressed separately though. open() is such an easy case that it makes not
much sense putting the fix off.
The test program:
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>
#ifndef O_CLOEXEC
# define O_CLOEXEC 02000000
#endif
int
main (int argc, char *argv[])
{
int fd;
if (argc > 1)
{
fd = atol (argv[1]);
printf ("child: fd = %d\n", fd);
if (fcntl (fd, F_GETFD) == 0 || errno != EBADF)
{
puts ("file descriptor valid in child");
return 1;
}
return 0;
}
fd = open ("/proc/self/exe", O_RDONLY | O_CLOEXEC);
printf ("in parent: new fd = %d\n", fd);
char buf[20];
snprintf (buf, sizeof (buf), "%d", fd);
execl ("/proc/self/exe", argv[0], buf, NULL);
puts ("execl failed");
return 1;
}
[kyle@parisc-linux.org: parisc fix]
Signed-off-by: Ulrich Drepper <drepper@redhat.com>
Acked-by: Ingo Molnar <mingo@elte.hu>
Cc: Davide Libenzi <davidel@xmailserver.org>
Cc: Michael Kerrisk <mtk-manpages@gmx.net>
Cc: Chris Zankel <chris@zankel.net>
Signed-off-by: Kyle McMartin <kyle@parisc-linux.org>
Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Add a flag in /proc/timer_stats to indicate deferrable timers. This will
let developers/users to differentiate between types of tiemrs in
/proc/timer_stats.
Deferrable timer and normal timer will appear in /proc/timer_stats as below.
10D, 1 swapper queue_delayed_work_on (delayed_work_timer_fn)
10, 1 swapper queue_delayed_work_on (delayed_work_timer_fn)
Also version of timer_stats changes from v0.1 to v0.2
Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Acked-by: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: john stultz <johnstul@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Continuing the work started in 411f0f3edc ...
This enables code with a dma path, that compiles away, to build without
requiring additional code factoring. It also prevents code that calls
dma_alloc_coherent and dma_free_coherent from linking whereas previously
the code would hit a BUG() at run time. Finally, it allows archs that set
!HAS_DMA to delete their asm/dma-mapping.h file.
Cc: Cornelia Huck <cornelia.huck@de.ibm.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: John W. Linville <linville@tuxdriver.com>
Cc: Kyle McMartin <kyle@parisc-linux.org>
Cc: James Bottomley <James.Bottomley@SteelEye.com>
Cc: Tejun Heo <htejun@gmail.com>
Cc: Jeff Garzik <jeff@garzik.org>
Cc: <geert@linux-m68k.org>
Cc: <zippel@linux-m68k.org>
Cc: <spyro@f2s.com>
Cc: <ysato@users.sourceforge.jp>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Remove the obviously unnecessary includes of <linux/spinlock.h> under the
include/linux/ directory, and fix the couple errors that are introduced as
a result of that.
Signed-off-by: Robert P. J. Day <rpjday@mindspring.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This patch fixes the following warnings.
fs/fat/dir.c: In function 'fat_parse_long':
include/linux/msdos_fs.h:294: warning: array subscript is above array bounds
include/linux/msdos_fs.h:295: warning: array subscript is above array bounds
include/linux/msdos_fs.h:295: warning: array subscript is above array bounds
The ->name is defined as "name[8], ext[3]", but fat_checksum() uses
those as name[11]. There is no actual problem, but it's not a good manner.
Signed-off-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
per-cpu counters presently must iterate over all possible CPUs in the
exhaustive percpu_counter_sum().
But it can be much better to only iterate over the presently-online CPUs. To
do this, we must arrange for an offlined CPU's count to be spilled into the
counter's central count.
We can do this for all percpu_counters in the machine by linking them into a
single global list and walking that list at CPU_DEAD time.
(I hope. Might have race windows in which the percpu_counter_sum() count is
inaccurate?)
Cc: Gautham R Shenoy <ego@in.ibm.com>
Cc: Oleg Nesterov <oleg@tv-sign.ru>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
gcc-4.3:
fs/fuse/dir.c: In function 'parse_dirfile':
fs/fuse/dir.c:833: warning: cast from pointer to integer of different size
fs/fuse/dir.c:835: warning: cast from pointer to integer of different size
[miklos@szeredi.hu: use offsetof]
Acked-by: Miklos Szeredi <miklos@szeredi.hu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The I2O driver uses two semaphores as mutexes. Use the mutex API instead of
the (binary) semaphores.
Signed-off-by: Matthias Kaehlcke <matthias.kaehlcke@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Without this a tty write could block if a previous blocking tty write was
in progress on the same tty and blocked by a line discipline or hardware
event. Originally found and reported by Dave Johnson.
Signed-off-by: Alan Cox <alan@redhat.com>
Acked-by: Dave Johnson <djohnson+linux-kernel@sw.starentnetworks.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Commit 411187fb05 caused boot time to move and
process start times to become invalid after suspend. Using boot based time
for those restores the old behaviour and fixes the issue.
[akpm@linux-foundation.org: little cleanup]
Signed-off-by: Tomas Janousek <tjanouse@redhat.com>
Cc: Tomas Smetana <tsmetana@redhat.com>
Acked-by: John Stultz <johnstul@us.ibm.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The commits
411187fb05 (GTOD: persistent clock support)
c1d370e167 (i386: use GTOD persistent clock
support)
changed the monotonic time so that it no longer jumps after resume, but it's
not possible to use it for boot time and process start time calculations then.
Also, the uptime no longer increases during suspend.
I add a variable to track the wall_to_monotonic changes, a function to get the
real boot time and a function to get the boot based time from the monotonic
one.
[akpm@linux-foundation.org: remove exports, add comment]
Signed-off-by: Tomas Janousek <tjanouse@redhat.com>
Cc: Tomas Smetana <tsmetana@redhat.com>
Cc: John Stultz <johnstul@us.ibm.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Before calling init_hwif_default, ide_unregister gets lock ide_lock and
disables irq. init_hwif_default calls ide_default_io_base which calls
pci_get_device and later pci_get_subsys tries to apply for semaphore
pci_bus_sem and goes to sleep.
Mostly, pci_get_device should be called when irq is turned on.
ide_default_io_base just needs find if list pci_devices is empty.
Signed-off-by: Zhang Yanmin <yanmin.zhang@intel.com>
Cc: Greg KH <greg@kroah.com>
Cc: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Introduce a write_trylock_irqsave() implementation. Similar in style to
the implementation of spin_trylock_irqsave() in mainline.
Signed-off-by: Satyam Sharma <ssatyam@cse.iitk.ac.in>
Cc: Sripathi Kodi <sripathik@in.ibm.com>
Cc: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Fix following races:
===========================================
1. Write via ->write_proc sleeps in copy_from_user(). Module disappears
meanwhile. Or, more generically, system call done on /proc file, method
supplied by module is called, module dissapeares meanwhile.
pde = create_proc_entry()
if (!pde)
return -ENOMEM;
pde->write_proc = ...
open
write
copy_from_user
pde = create_proc_entry();
if (!pde) {
remove_proc_entry();
return -ENOMEM;
/* module unloaded */
}
*boom*
==========================================
2. bogo-revoke aka proc_kill_inodes()
remove_proc_entry vfs_read
proc_kill_inodes [check ->f_op validness]
[check ->f_op->read validness]
[verify_area, security permissions checks]
->f_op = NULL;
if (file->f_op->read)
/* ->f_op dereference, boom */
NOTE, NOTE, NOTE: file_operations are proxied for regular files only. Let's
see how this scheme behaves, then extend if needed for directories.
Directories creators in /proc only set ->owner for them, so proxying for
directories may be unneeded.
NOTE, NOTE, NOTE: methods being proxied are ->llseek, ->read, ->write,
->poll, ->unlocked_ioctl, ->ioctl, ->compat_ioctl, ->open, ->release.
If your in-tree module uses something else, yell on me. Full audit pending.
[akpm@linux-foundation.org: build fix]
Signed-off-by: Alexey Dobriyan <adobriyan@sw.ru>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Adding the defines/macros activates the existing code in the tty layer
and allows this platform to use the arbitary speed ioctl setting layer
Signed-off-by: Alan Cox <alan@redhat.com>
Cc: Miles Bader <uclinux-v850@lsi.nec.co.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
For some reason, I was using kmalloc instead of get_free_pages for kernel
stacks.
Signed-off-by: Jeff Dike <jdike@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Add the needed constants and bits. The actual code is already in the tty
layer and turned on by the definitions
Signed-off-by: Alan Cox <alan@redhat.com>
Cc: Mikael Starvik <starvik@axis.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Add the defines and constants needed for the M32R platform to support the
arbitary speed tty ioctls.
Signed-off-by: Alan Cox <alan@redhat.com>
Cc: Hirokazu Takata <takata@linux-m32r.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Add the ioctls and values needed for this to the ARM26/ARM32 ports. The
actual code has been in the base kernel for a while and automatically turns
on when a port sets the required defines.
Signed-off-by: Alan Cox <alan@redhat.com>
Cc: Russell King <rmk+kernel@arm.linux.org.uk>
Cc: Ian Molton <spyro@f2s.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
isa_bus_to_virt() is still needed in a few places (lance.c, at least). When
we switch the kernel to using -Werror-implicit-function-declaration, the lack
of isa_bus_to_virt() breaks alpha allmodconfig builds.
Add isa_bus_to_virt() and deprecate the ezisting ISA APIs, though it might be
better to define these functions as BUG(), since virt_to_bus/bus_to_virt just
do wrong things on a number of machines.
[akpm@linux-foundation.org: build fix]
Signed-off-by: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Cc: Richard Henderson <rth@twiddle.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Add the needed constants and defines to activate the new tty code on this
platform
Signed-off-by: Alan Cox <alan@redhat.com>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Connect up new system calls.
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This is a straightforward split of do_mmap_pgoff() into two functions:
- do_mmap_pgoff() checks the parameters, and calculates the vma
flags. Then it calls
- mmap_region(), which does the actual mapping
Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This adds preliminary NUMA support to SLOB, primarily aimed at systems with
small nodes (tested all the way down to a 128kB SRAM block), whether
asymmetric or otherwise.
We follow the same conventions as SLAB/SLUB, preferring current node
placement for new pages, or with explicit placement, if a node has been
specified. Presently on UP NUMA this has the side-effect of preferring
node#0 allocations (since numa_node_id() == 0, though this could be
reworked if we could hand off a pfn to determine node placement), so
single-CPU NUMA systems will want to place smaller nodes further out in
terms of node id. Once a page has been bound to a node (via explicit node
id typing), we only do block allocations from partial free pages that have
a matching node id in the page flags.
The current implementation does have some scalability problems, in that all
partial free pages are tracked in the global freelist (with contention due
to the single spinlock). However, these are things that are being reworked
for SMP scalability first, while things like per-node freelists can easily
be built on top of this sort of functionality once it's been added.
More background can be found in:
http://marc.info/?l=linux-mm&m=118117916022379&w=2http://marc.info/?l=linux-mm&m=118170446306199&w=2http://marc.info/?l=linux-mm&m=118187859420048&w=2
and subsequent threads.
Acked-by: Christoph Lameter <clameter@sgi.com>
Acked-by: Matt Mackall <mpm@selenic.com>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
Acked-by: Nick Piggin <nickpiggin@yahoo.com.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This symbol got orphaned quite a while ago.
Signed-off-by: Jan Beulich <jbeulich@novell.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Kill pte_rdprotect(), pte_exprotect(), pte_mkread(), pte_mkexec(), pte_read(),
pte_exec(), and pte_user() except where arch-specific code is making use of
them.
Signed-off-by: Jan Beulich <jbeulich@novell.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
invalidate_mapping_pages() can sometimes take a long time (millions of pages
to free). Long enough for the softlockup detector to trigger.
We used to have a cond_resched() in there but I took it out because the
drop_caches code calls invalidate_mapping_pages() under inode_lock.
The patch adds a nasty flag and puts the cond_resched() back.
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Given that there is no remaining usage of the deprecated kmem_cache_t
typedef anywhere in the tree, remove that typedef.
Signed-off-by: Robert P. J. Day <rpjday@mindspring.com>
Acked-by: Pekka Enberg <penberg@cs.helsinki.fi>
Acked-by: Christoph Lameter <clameter@sgi.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Make zonelist creation policy selectable from sysctl/boot option v6.
This patch makes NUMA's zonelist (of pgdat) order selectable.
Available order are Default(automatic)/ Node-based / Zone-based.
[Default Order]
The kernel selects Node-based or Zone-based order automatically.
[Node-based Order]
This policy treats the locality of memory as the most important parameter.
Zonelist order is created by each zone's locality. This means lower zones
(ex. ZONE_DMA) can be used before higher zone (ex. ZONE_NORMAL) exhausion.
IOW. ZONE_DMA will be in the middle of zonelist.
current 2.6.21 kernel uses this.
Pros.
* A user can expect local memory as much as possible.
Cons.
* lower zone will be exhansted before higher zone. This may cause OOM_KILL.
Maybe suitable if ZONE_DMA is relatively big and you never see OOM_KILL
because of ZONE_DMA exhaution and you need the best locality.
(example)
assume 2 node NUMA. node(0) has ZONE_DMA/ZONE_NORMAL, node(1) has ZONE_NORMAL.
*node(0)'s memory allocation order:
node(0)'s NORMAL -> node(0)'s DMA -> node(1)'s NORMAL.
*node(1)'s memory allocation order:
node(1)'s NORMAL -> node(0)'s NORMAL -> node(0)'s DMA.
[Zone-based order]
This policy treats the zone type as the most important parameter.
Zonelist order is created by zone-type order. This means lower zone
never be used bofere higher zone exhaustion.
IOW. ZONE_DMA will be always at the tail of zonelist.
Pros.
* OOM_KILL(bacause of lower zone) occurs only if the whole zones are exhausted.
Cons.
* memory locality may not be best.
(example)
assume 2 node NUMA. node(0) has ZONE_DMA/ZONE_NORMAL, node(1) has ZONE_NORMAL.
*node(0)'s memory allocation order:
node(0)'s NORMAL -> node(1)'s NORMAL -> node(0)'s DMA.
*node(1)'s memory allocation order:
node(1)'s NORMAL -> node(0)'s NORMAL -> node(0)'s DMA.
bootoption "numa_zonelist_order=" and proc/sysctl is supporetd.
command:
%echo N > /proc/sys/vm/numa_zonelist_order
Will rebuild zonelist in Node-based order.
command:
%echo Z > /proc/sys/vm/numa_zonelist_order
Will rebuild zonelist in Zone-based order.
Thanks to Lee Schermerhorn, he gives me much help and codes.
[Lee.Schermerhorn@hp.com: add check_highest_zone to build_zonelists_in_zone_order]
[akpm@linux-foundation.org: build fix]
Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Lee Schermerhorn <lee.schermerhorn@hp.com>
Cc: Christoph Lameter <clameter@sgi.com>
Cc: Andi Kleen <ak@suse.de>
Cc: "jesse.barnes@intel.com" <jesse.barnes@intel.com>
Signed-off-by: Lee Schermerhorn <lee.schermerhorn@hp.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Beacuse SERIAL_PORT_DFNS is removed from include/asm-i386/serial.h and
include/asm-x86_64/serial.h. the serial8250_ports need to be probed late in
serial initializing stage. the console_init=>serial8250_console_init=>
register_console=>serial8250_console_setup will return -ENDEV, and console
ttyS0 can not be enabled at that time. need to wait till uart_add_one_port in
drivers/serial/serial_core.c to call register_console to get console ttyS0.
that is too late.
Make early_uart to use early_param, so uart console can be used earlier. Make
it to be bootconsole with CON_BOOT flag, so can use console handover feature.
and it will switch to corresponding normal serial console automatically.
new command line will be:
console=uart8250,io,0x3f8,9600n8
console=uart8250,mmio,0xff5e0000,115200n8
or
earlycon=uart8250,io,0x3f8,9600n8
earlycon=uart8250,mmio,0xff5e0000,115200n8
it will print in very early stage:
Early serial console at I/O port 0x3f8 (options '9600n8')
console [uart0] enabled
later for console it will print:
console handover: boot [uart0] -> real [ttyS0]
Signed-off-by: <yinghai.lu@sun.com>
Cc: Andi Kleen <ak@suse.de>
Cc: Bjorn Helgaas <bjorn.helgaas@hp.com>
Cc: Russell King <rmk@arm.linux.org.uk>
Cc: Gerd Hoffmann <kraxel@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Remove all ids from the given idr tree. idr_destroy() only frees up
unused, cached idp_layers, but this function will remove all id mappings
and leave all idp_layers unused.
A typical clean-up sequence for objects stored in an idr tree, will use
idr_for_each() to free all objects, if necessay, then idr_remove_all() to
remove all ids, and idr_destroy() to free up the cached idr_layers.
Signed-off-by: Kristian Hoegsberg <krh@redhat.com>
Cc: Tejun Heo <htejun@gmail.com>
Cc: Dave Airlie <airlied@linux.ie>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This patch adds an iterator function for the idr data structure. Compared
to just iterating through the idr with an integer and idr_find, this
iterator is (almost, but not quite) linear in the number of elements, as
opposed to the number of integers in the range covered by the idr. This
makes a difference for sparse idrs, but more importantly, it's a nicer way
to iterate through the elements.
The drm subsystem is moving to idr for tracking contexts and drawables, and
with this change, we can use the idr exclusively for tracking these
resources.
[akpm@linux-foundation.org: fix comment]
Signed-off-by: Kristian Hoegsberg <krh@redhat.com>
Cc: Tejun Heo <htejun@gmail.com>
Cc: Dave Airlie <airlied@linux.ie>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This is a correction for a macro which gives worst case compressed data
size by LZO1X.
This patch was provided by the LZO author (Markus Oberhumer).
Signed-off-by: Nitin Gupta <nitingupta910@gmail.com>
Cc: "Markus F.X.J. Oberhumer" <markus@oberhumer.com>
Cc: "Richard Purdie" <rpurdie@openedhand.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This patch removes some code that became dead code after the ATARI_ACSI
removal.
It also indirectly fixes the following bug introduced by
commit c2bcf3b8978c291e1b7f6499475c8403a259d4d6:
config ATARI_SLM
tristate "Atari SLM laser printer support"
- depends on ATARI && ATARI_ACSI!=n
+ depends on ATARI
Acked-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
Take a page from the powerpc folks and just calculate the
delay factor directly.
Since frequency scaling chips use a system-tick register,
the value is going to be the same system-wide.
Signed-off-by: David S. Miller <davem@davemloft.net>
Do not select HOTPLUG_CPU from SUN_LDOMS, that causes
HOTPLUG_CPU to be selected even on non-SMP which is
illegal.
Only build hvtramp.o when SMP, just like trampoline.o
Protect dr-cpu code in ds.c with HOTPLUG_CPU.
Likewise move ldom_startcpu_cpuid() to smp.c and protect
it and the call site with SUN_LDOMS && HOTPLUG_CPU.
Signed-off-by: David S. Miller <davem@davemloft.net>
Only adding cpus is supports at the moment, removal
will come next.
When new cpus are configured, the machine description is
updated. When we get the configure request we pass in a
cpu mask of to-be-added cpus to the mdesc CPU node parser
so it only fetches information for those cpus. That code
also proceeds to update the SMT/multi-core scheduling bitmaps.
cpu_up() does all the work and we return the status back
over the DS channel.
CPUs via dr-cpu need to be booted straight out of the
hypervisor, and this requires:
1) A new trampoline mechanism. CPUs are booted straight
out of the hypervisor with MMU disabled and running in
physical addresses with no mappings installed in the TLB.
The new hvtramp.S code sets up the critical cpu state,
installs the locked TLB mappings for the kernel, and
turns the MMU on. It then proceeds to follow the logic
of the existing trampoline.S SMP cpu bringup code.
2) All calls into OBP have to be disallowed when domaining
is enabled. Since cpus boot straight into the kernel from
the hypervisor, OBP has no state about that cpu and therefore
cannot handle being invoked on that cpu.
Luckily it's only a handful of interfaces which can be called
after the OBP device tree is obtained. For example, rebooting,
halting, powering-off, and setting options node variables.
CPU removal support will require some infrastructure changes
here. Namely we'll have to process the requests via a true
kernel thread instead of in a workqueue. workqueues run on
a per-cpu thread, but when unconfiguring we might need to
force the thread to execute on another cpu if the current cpu
is the one being removed. Removal of a cpu also causes the kernel
to destroy that cpu's workqueue running thread.
Another issue on removal is that we may have interrupts still
pointing to the cpu-to-be-removed. So new code will be needed
to walk the active INO list and retarget those cpus as-needed.
Signed-off-by: David S. Miller <davem@davemloft.net>
There is a special domain services capability for setting
variables in the OBP options node. Guests don't have permanent
store for the OBP variables like a normal system, so they are
instead maintained in the LDOM control node or in the SC.
Signed-off-by: David S. Miller <davem@davemloft.net>
Property values cannot be referenced outside of
mdesc_grab()/mdesc_release() pairs. The only major
offender was the VIO bus layer, easily fixed.
Add some commentary to mdesc.h describing these rules.
Signed-off-by: David S. Miller <davem@davemloft.net>
Since we have to be able to handle MD updates, having an in-tree
set of data structures representing the MD objects actually makes
things more painful.
The MD itself is easy to parse, and we can implement the existing
interfaces using direct parsing of the MD binary image.
The MD is now reference counted, so accesses have to now take the
form:
handle = mdesc_grab();
... operations on MD ...
mdesc_release(handle);
The only remaining issue are cases where code holds on to references
to MD property values. mdesc_get_property() returns a direct pointer
to the property value, most cases just pull in the information they
need and discard the pointer, but there are few that use the pointer
directly over a long lifetime. Those will be fixed up in a subsequent
changeset.
A preliminary handler for MD update events from domain services is
there, it is rudimentry but it works and handles all of the reference
counting. It does not check the generation number of the MDs,
and it does not generate a "add/delete" list for notification to
interesting parties about MD changes but that will be forthcoming.
Signed-off-by: David S. Miller <davem@davemloft.net>
All of the interrupts say "LDX RX" and "LDX TX" currently
which is next to useless. Put a device specific prefix
before "RX" and "TX" instead which makes it much more
useful.
Signed-off-by: David S. Miller <davem@davemloft.net>
Besides the existing usage for power-button interrupts, we'll
want to make use of this code for domain-services where the
LDOM manager can send reboot requests to the guest node.
Signed-off-by: David S. Miller <davem@davemloft.net>
1) LDC_MODE_RELIABLE is deprecated an unused by anything, plus
it and LDC_MODE_STREAM were mis-numbered.
2) read_stream() should try to read as much as possible into
the per-LDC stream buffer area, so do not trim the read_nonraw()
length by the caller's size parameter.
3) Send data ACKs when necessary in read_nonraw().
4) In read_nonraw() when we get a pure ACK, advance the RX head
unconditionally past it.
5) Provide the ACKID field in the ldcdgb() packet dump in read_nonraw().
This helps debugging stream mode LDC channel problems.
6) Decrease verbosity of rx_data_wait() so that it is more useful.
A debugging message each loop iteration is too much.
7) In process_data_ack() stop the loop checking when we hit lp->tx_tail
not lp->tx_head.
8) Set the seqid field properly in send_data_nack().
Signed-off-by: David S. Miller <davem@davemloft.net>
Virtual devices on Sun Logical Domains are built on top
of a virtual channel framework. This, with help of hypervisor
interfaces, provides a link layer protocol with basic
handshaking over which virtual device clients and servers
communicate.
Built on top of this is a VIO device protocol which has it's
own handshaking and message types. At this layer attributes
are exchanged (disk size, network device addresses, etc.)
descriptor rings are registered, and data transfers are
triggers and replied to.
Signed-off-by: David S. Miller <davem@davemloft.net>
This removes the requirement for callers to get_cpu() to check in simple
cases. This patch is for !CONFIG_SMP.
Cc: Andi Kleen <ak@suse.de>
Signed-off-by: Avi Kivity <avi@qumranet.com>
KVM wants a notification when a cpu is about to die, so it can disable
hardware extensions, but at a time when user processes cannot be scheduled
on the cpu, so it doesn't try to use virtualization extensions after they
have been disabled.
This adds a CPU_DYING notification. The notification is called in atomic
context on the doomed cpu.
Signed-off-by: Avi Kivity <avi@qumranet.com>