mirror of
https://github.com/torvalds/linux.git
synced 2026-06-29 17:41:17 +02:00
I noticed expensive divides done in try_to_wake_up() and
find_busiest_group() on a dual-socket, dual-core Opteron machine (4 cores in
total), moderately loaded (15,000 context switches per second).
oprofile numbers:
CPU: AMD64 processors, speed 2600.05 MHz (estimated)
Counted CPU_CLK_UNHALTED events (Cycles outside of halt state) with a unit
mask of 0x00 (No unit mask) count 50000
samples % symbol name
...
613914 1.0498 try_to_wake_up
834 0.0013 :ffffffff80227ae1: div %rcx
77513 0.1191 :ffffffff80227ae4: mov %rax,%r11
608893 1.0413 find_busiest_group
1841 0.0031 :ffffffff802260bf: div %rdi
140109 0.2394 :ffffffff802260c2: test %sil,%sil
Some of these divides can use the reciprocal divides we introduced some
time ago (currently used in slab, AFAIK).
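As a rough illustration of the reciprocal divide trick mentioned above, here is a minimal userspace sketch. The names sketch_reciprocal_value() and sketch_reciprocal_divide() are hypothetical and are not the kernel's API; the sketch assumes a divisor greater than 1 and relies on the approximation floor(a / k) ≈ (a * ceil(2^32 / k)) >> 32, which can be off for large dividends.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: precompute ceil(2^32 / k) once, when the divisor
 * changes. k must be > 1 here: for k == 1 the value 2^32 would not fit
 * in 32 bits, and k == 0 is a division by zero.
 */
static uint32_t sketch_reciprocal_value(uint32_t k)
{
	assert(k > 1);
	return (uint32_t)(((1ULL << 32) + (k - 1)) / k);
}

/*
 * Hypothetical helper: approximate a / k with one 64-bit multiply and a
 * shift instead of a hardware divide. r is sketch_reciprocal_value(k).
 */
static uint32_t sketch_reciprocal_divide(uint32_t a, uint32_t r)
{
	return (uint32_t)(((uint64_t)a * r) >> 32);
}
```

The divide is paid once when the reciprocal is computed; every later division by the same value costs a multiply and a shift, which is the trade-off that makes sense when the divisor changes rarely but is used on a hot path.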
We can assume a load will fit in a 32-bit number, because with a
SCHED_LOAD_SCALE=128 value, it's still a theoretical limit of 33554432.
When/if we reach this limit one day, CPUs will probably have a fast
hardware divide and we can zap the reciprocal divide trick.
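The quoted limit can be checked by hand, assuming it is read as the number of SCHED_LOAD_SCALE units that fit in an unsigned 32-bit value (this reading is an interpretation, not something the message states explicitly):

```latex
\[
  \frac{2^{32}}{\mathrm{SCHED\_LOAD\_SCALE}}
  = \frac{2^{32}}{128}
  = \frac{2^{32}}{2^{7}}
  = 2^{25}
  = 33554432
\]
```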
Ingo suggested renaming cpu_power to __cpu_power to make it clear that it
should not be modified without changing its reciprocal value too.
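One way to picture the rule that the power value and its reciprocal must change together is to route every write through a helper. The following is a sketch under assumptions: struct group_sketch and the group_*() helpers are hypothetical names for illustration, not the scheduler's actual structures, and the divisor is assumed to stay greater than 1.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for a scheduler group; field names are illustrative. */
struct group_sketch {
	uint32_t __cpu_power;          /* do not write directly, use the helpers */
	uint32_t reciprocal_cpu_power; /* cached ceil(2^32 / __cpu_power) */
};

/* Set the power and refresh the cached reciprocal in the same step. */
static void group_set_cpu_power(struct group_sketch *g, uint32_t val)
{
	assert(val > 1); /* the 32-bit reciprocal is not representable for 1 */
	g->__cpu_power = val;
	g->reciprocal_cpu_power = (uint32_t)(((1ULL << 32) + (val - 1)) / val);
}

/* Increase the power; the reciprocal is recomputed by the setter. */
static void group_inc_cpu_power(struct group_sketch *g, uint32_t val)
{
	group_set_cpu_power(g, g->__cpu_power + val);
}

/* Approximate load / __cpu_power using only a multiply and a shift. */
static uint32_t group_div_cpu_power(const struct group_sketch *g, uint32_t load)
{
	return (uint32_t)(((uint64_t)load * g->reciprocal_cpu_power) >> 32);
}
```

With this layout, a direct assignment to the underscore-prefixed field stands out in review, which is the point of the rename.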
I did not convert the divide in cpu_avg_load_per_task(), because tracking
nr_running changes may not be worth it. We could use a static table of 32
reciprocal values, but it would add a conditional branch and a table lookup.
[akpm@linux-foundation.org: !SMP build fix]
Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
Acked-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
| | | |
|---|---|---|
| acpi | ||
| asm-alpha | ||
| asm-arm | ||
| asm-arm26 | ||
| asm-avr32 | ||
| asm-blackfin | ||
| asm-cris | ||
| asm-frv | ||
| asm-generic | ||
| asm-h8300 | ||
| asm-i386 | ||
| asm-ia64 | ||
| asm-m32r | ||
| asm-m68k | ||
| asm-m68knommu | ||
| asm-mips | ||
| asm-parisc | ||
| asm-powerpc | ||
| asm-ppc | ||
| asm-s390 | ||
| asm-sh | ||
| asm-sh64 | ||
| asm-sparc | ||
| asm-sparc64 | ||
| asm-um | ||
| asm-v850 | ||
| asm-x86_64 | ||
| asm-xtensa | ||
| crypto | ||
| keys | ||
| linux | ||
| math-emu | ||
| media | ||
| mtd | ||
| net | ||
| pcmcia | ||
| rdma | ||
| rxrpc | ||
| scsi | ||
| sound | ||
| video | ||
| Kbuild | ||