linux/block
Toshiaki Makita fe63ce5175 cfq-iosched: Fix wrong children_weight calculation
commit e15693ef18 upstream.

cfq_group_service_tree_add() is applying new_weight at the beginning of
the function via cfq_update_group_weight().
This actually allows weight to change between adding it to and subtracting
it from children_weight, and triggers WARN_ON_ONCE() in
cfq_group_service_tree_del(), or even causes oops by divide error during
vfr calculation in cfq_group_service_tree_add().

The detailed scenario is as follows:
1. Create blkio cgroup X, and cgroup Y as a child of X.
   Set X's weight to 500 and perform some I/O to apply new_weight.
   X's I/O completes before Y's I/O starts.
2. Y starts I/O and cfq_group_service_tree_add() is called with Y.
3. cfq_group_service_tree_add() walks up the tree during children_weight
   calculation and adds parent X's weight (500) to children_weight of root.
   children_weight becomes 500.
4. Set X's weight to 1000.
5. X starts I/O and cfq_group_service_tree_add() is called with X.
6. cfq_group_service_tree_add() applies its new_weight (1000).
7. I/O of Y completes and cfq_group_service_tree_del() is called with Y.
8. I/O of X completes and cfq_group_service_tree_del() is called with X.
9. cfq_group_service_tree_del() subtracts X's weight (1000) from
   children_weight of root. children_weight becomes -500.
   This triggers WARN_ON_ONCE().
10. Set X's weight to 500.
11. X starts I/O and cfq_group_service_tree_add() is called with X.
12. cfq_group_service_tree_add() applies its new_weight (500) and adds it
    to children_weight of root. children_weight becomes 0. Calculation of
    vfr triggers oops by divide error.

weight should be updated right before adding it to children_weight.
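
The arithmetic in the scenario can be replayed with a small userspace
sketch. This is not the kernel code: struct toy_group and the toy_*
helpers are hypothetical names, leaf_weight, the rb-tree and the vfr
division are left out, Y's weight of 500 is an arbitrary placeholder, and
children_weight is a signed int here so that the underflow shows up as
-500. The "fixed" flag only selects where the pending weight is applied:
on entry (old behaviour) or right before it is added to the parent's
children_weight.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for struct cfq_group; only the fields the scenario touches. */
struct toy_group {
	struct toy_group *parent;
	int weight;		/* weight currently in effect */
	int new_weight;		/* pending weight written by the user */
	bool needs_update;
	int nr_active;
	int children_weight;	/* signed here so the underflow is visible */
};

/* Record a new weight; it takes effect only when toy_update_weight() runs. */
static void toy_set_weight(struct toy_group *g, int w)
{
	g->new_weight = w;
	g->needs_update = true;
}

static void toy_update_weight(struct toy_group *g)
{
	if (g->needs_update) {
		g->weight = g->new_weight;
		g->needs_update = false;
	}
}

/* Activate @g and propagate the activation towards the root. */
static void toy_add(struct toy_group *g, bool fixed)
{
	struct toy_group *pos = g, *parent;
	bool propagate;

	if (!fixed)
		toy_update_weight(g);	/* old behaviour: apply on entry */

	propagate = !pos->nr_active++;
	while ((parent = pos->parent) != NULL) {
		if (propagate) {
			if (fixed)
				toy_update_weight(pos);	/* right before the add */
			propagate = !parent->nr_active++;
			parent->children_weight += pos->weight;
		}
		pos = parent;
	}
}

/* Deactivate @g and undo the accounting done by toy_add(). */
static void toy_del(struct toy_group *g)
{
	struct toy_group *pos = g;
	bool propagate = !--pos->nr_active;

	while (propagate && pos->parent != NULL) {
		struct toy_group *parent = pos->parent;

		propagate = !--parent->nr_active;
		parent->children_weight -= pos->weight;
		pos = parent;
	}
}

/* Replay steps 1-12; report root's children_weight after steps 9 and 12. */
static void toy_replay(bool fixed, int *after_step9, int *after_step12)
{
	struct toy_group root = { 0 };
	struct toy_group x = { .parent = &root, .weight = 500 };
	struct toy_group y = { .parent = &x, .weight = 500 };

	toy_add(&y, fixed);		/* steps 2-3 */
	toy_set_weight(&x, 1000);	/* step 4 */
	toy_add(&x, fixed);		/* steps 5-6 */
	toy_del(&y);			/* step 7 */
	toy_del(&x);			/* steps 8-9 */
	*after_step9 = root.children_weight;
	toy_set_weight(&x, 500);	/* step 10 */
	toy_add(&x, fixed);		/* steps 11-12 */
	*after_step12 = root.children_weight;
}
```

Calling toy_replay(false, &a, &b) gives a == -500 and b == 0, matching
steps 9 and 12 above. With toy_replay(true, &a, &b) the same weight is
added and subtracted, so a == 0 and b == 500.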

Reported-by: Ruki Sekiya <sekiya.ruki@lab.ntt.co.jp>
Signed-off-by: Toshiaki Makita <makita.toshiaki@lab.ntt.co.jp>
Acked-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Jens Axboe <axboe@fb.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-10-05 14:54:08 -07:00
partitions partitions/efi.c: replace useless kzalloc's by kmalloc's 2013-04-30 08:34:25 +02:00
blk-cgroup.c blkcg: don't call into policy draining if root_blkg is already gone 2014-09-17 09:04:02 -07:00
blk-cgroup.h Update of blkg_stat and blkg_rwstat may happen in bh context. While u64_stats_fetch_retry is only preempt_disable on 32bit UP system. This is not enough to avoid preemption by bh and may read strange 64 bit value. 2013-12-11 22:36:27 -08:00
blk-core.c blktrace: fix accounting of partially completed requests 2014-05-30 21:52:11 -07:00
blk-exec.c Merge branch 'for-3.9/core' of git://git.kernel.dk/linux-block 2013-02-28 12:52:24 -08:00
blk-flush.c Block: blk-flush: Fixed indent code style 2013-03-22 12:22:51 -06:00
blk-integrity.c scatterlist: introduce sg_unmark_end 2013-03-20 15:43:04 +10:30
blk-ioc.c hlist: drop the node parameter from iterators 2013-02-27 19:10:24 -08:00
blk-iopoll.c
blk-lib.c block: add cond_resched() to potentially long running ioctl discard loop 2014-02-22 12:41:28 -08:00
blk-map.c
blk-merge.c scatterlist: introduce sg_unmark_end 2013-03-20 15:43:04 +10:30
blk-settings.c block: properly stack underlying max_segment_size to DM device 2013-11-29 11:11:51 -08:00
blk-softirq.c sched, block: Unify cache detection 2012-01-27 13:28:48 +01:00
blk-sysfs.c block: avoid using uninitialized value in from queue_var_store 2013-04-03 21:53:57 +02:00
blk-tag.c block: don't assume last put of shared tags is for the host 2014-07-31 12:53:48 -07:00
blk-throttle.c block: Rename queue dead flag 2012-12-06 14:30:58 +01:00
blk-timeout.c block: fix race between request completion and timeout handling 2013-11-29 11:11:50 -08:00
blk.h block: __elv_next_request() shouldn't call into the elevator if bypassing 2014-02-22 12:41:28 -08:00
bsg-lib.c bsg: Remove unused function bsg_goose_queue() 2012-12-06 14:33:02 +01:00
bsg.c hlist: drop the node parameter from iterators 2013-02-27 19:10:24 -08:00
cfq-iosched.c cfq-iosched: Fix wrong children_weight calculation 2014-10-05 14:54:08 -07:00
compat_ioctl.c block: provide compat ioctl for BLKZEROOUT 2014-07-31 12:53:48 -07:00
deadline-iosched.c elevator: Fix a race in elevator switching 2013-08-20 08:43:03 -07:00
elevator.c elevator: acquire q->sysfs_lock in elevator_change() 2013-12-08 07:29:27 -08:00
genhd.c block: do not pass disk names as format strings 2013-07-13 11:42:26 -07:00
ioctl.c Merge branch 'for-3.7/core' of git://git.kernel.dk/linux-block 2012-10-11 09:04:23 +09:00
Kconfig block: don't select PERCPU_RWSEM 2013-02-22 10:42:45 +01:00
Kconfig.iosched blkcg: make CONFIG_BLK_CGROUP bool 2012-03-06 21:27:21 +01:00
Makefile
noop-iosched.c elevator: Fix a race in elevator switching 2013-08-20 08:43:03 -07:00
partition-generic.c Revert "loop: cleanup partitions when detaching loop device" 2013-04-08 10:12:11 +02:00
scsi_ioctl.c aio: don't include aio.h in sched.h 2013-05-07 20:16:25 -07:00