mirror of
https://github.com/torvalds/linux.git
synced 2026-07-04 12:07:14 +02:00
The update of the share of a cfs_rq is done when its load_avg is updated
but before the group_entity's load_avg has been updated for the past time
slot. This generates wrong load_avg accounting which can be significant
when small tasks are involved in the scheduling.
Let take the example of a task a that is dequeued of its task group A:
root
(cfs_rq)
\
(se)
A
(cfs_rq)
\
(se)
a
Task "a" was the only task in task group A which becomes idle when a is
dequeued.
We have the sequence:
- dequeue_entity a->se
- update_load_avg(a->se)
- dequeue_entity_load_avg(A->cfs_rq, a->se)
- update_cfs_shares(A->cfs_rq)
A->cfs_rq->load.weight == 0
A->se->load.weight is updated with the new share (0 in this case)
- dequeue_entity A->se
- update_load_avg(A->se) but its weight is now null so the last time
slot (up to a tick) will be accounted with a weight of 0 instead of
its real weight during the time slot. The last time slot will be
accounted as an idle one whereas it was a running one.
If the running time of task a is short enough that no tick happens when it
runs, all running time of group entity A->se will be accounted as idle
time.
Instead, we should update the share of a cfs_rq (in fact the weight of its
group entity) only after having updated the load_avg of the group_entity.
update_cfs_shares() now takes the sched_entity as a parameter instead of the
cfs_rq, and the weight of the group_entity is updated only once its load_avg
has been synced with current time.
Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: pjt@google.com
Link: http://lkml.kernel.org/r/1482335426-7664-1-git-send-email-vincent.guittot@linaro.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
||
|---|---|---|
| .. | ||
| auto_group.c | ||
| auto_group.h | ||
| clock.c | ||
| completion.c | ||
| core.c | ||
| cpuacct.c | ||
| cpuacct.h | ||
| cpudeadline.c | ||
| cpudeadline.h | ||
| cpufreq_schedutil.c | ||
| cpufreq.c | ||
| cpupri.c | ||
| cpupri.h | ||
| cputime.c | ||
| deadline.c | ||
| debug.c | ||
| fair.c | ||
| features.h | ||
| idle_task.c | ||
| idle.c | ||
| loadavg.c | ||
| Makefile | ||
| rt.c | ||
| sched.h | ||
| stats.c | ||
| stats.h | ||
| stop_task.c | ||
| swait.c | ||
| wait.c | ||