linux/fs/ext4
Jens Axboe 7479f23c2a direct-io: only inc/dec inode->i_dio_count for file systems
do_blockdev_direct_IO() increments and decrements the inode
->i_dio_count for each IO operation. It does this to protect against
truncate of a file. Block devices don't need this sort of protection.

For a capable multiqueue setup, this atomic int is the only shared
state between applications accessing the device for O_DIRECT, and it
presents a scaling wall for that. In my testing, as much as 30% of
system time is spent incrementing and decrementing this value. A mixed
read/write workload improved from ~2.5M IOPS to ~9.6M IOPS, with
better latencies too. Before:

clat percentiles (usec):
|  1.00th=[   33],  5.00th=[   34], 10.00th=[   34], 20.00th=[   34],
| 30.00th=[   34], 40.00th=[   34], 50.00th=[   35], 60.00th=[   35],
| 70.00th=[   35], 80.00th=[   35], 90.00th=[   37], 95.00th=[   80],
| 99.00th=[   98], 99.50th=[  151], 99.90th=[  155], 99.95th=[  155],
| 99.99th=[  165]

After:

clat percentiles (usec):
|  1.00th=[   95],  5.00th=[  108], 10.00th=[  129], 20.00th=[  149],
| 30.00th=[  155], 40.00th=[  161], 50.00th=[  167], 60.00th=[  171],
| 70.00th=[  177], 80.00th=[  185], 90.00th=[  201], 95.00th=[  270],
| 99.00th=[  390], 99.50th=[  398], 99.90th=[  418], 99.95th=[  422],
| 99.99th=[  438]

In other setups, Robert Elliott reported seeing good performance
improvements:

https://lkml.org/lkml/2015/4/3/557

The more applications accessing the device, the worse it gets.

Add a new direct-io flag, DIO_SKIP_DIO_COUNT, which tells
do_blockdev_direct_IO() that it need not worry about incrementing
or decrementing the inode i_dio_count for this caller.
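
As a rough user-space illustration of the idea (not the kernel code itself), the flag check can be modeled as below. The names model_inode, model_dio_begin, model_dio_end and model_dio_in_flight are hypothetical, the numeric value given to DIO_SKIP_DIO_COUNT is an assumption made for this sketch, and C11 atomics stand in for the kernel's atomic_t.

```c
#include <assert.h>
#include <stdatomic.h>

/* Flag name comes from the commit message; the value is assumed here. */
#define DIO_SKIP_DIO_COUNT 0x08

/* Hypothetical stand-in for struct inode: only the in-flight DIO counter. */
struct model_inode {
	atomic_int i_dio_count;
};

/* Start of a direct IO: take a reference unless the caller opted out. */
void model_dio_begin(struct model_inode *inode, int flags)
{
	if (!(flags & DIO_SKIP_DIO_COUNT))
		atomic_fetch_add(&inode->i_dio_count, 1);
}

/* Completion of a direct IO: drop the reference taken in model_dio_begin(). */
void model_dio_end(struct model_inode *inode, int flags)
{
	if (!(flags & DIO_SKIP_DIO_COUNT)) {
		int prev = atomic_fetch_sub(&inode->i_dio_count, 1);

		/* A decrement without a matching increment is a bug. */
		assert(prev > 0);
	}
}

/* A truncate path would have to wait until this returns zero. */
int model_dio_in_flight(struct model_inode *inode)
{
	return atomic_load(&inode->i_dio_count);
}
```

In this model, a caller that passes DIO_SKIP_DIO_COUNT (a block device, which cannot be truncated) never touches the shared counter, so concurrent submitters no longer contend on it. A caller that passes no flag keeps the truncate protection unchanged.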

Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Theodore Ts'o <tytso@mit.edu>
Cc: Elliott, Robert (Server Storage) <elliott@hp.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Jens Axboe <axboe@fb.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Shawn Lin <shawn.lin@rock-chips.com>
Tested-and-Reviewed-by: Shawn Lin <shawn.lin@rock-chips.com>
2015-07-10 13:17:35 +08:00
acl.c ext4: fix the number of credits needed for acl ops with inline data 2013-02-09 15:23:03 -05:00
acl.h
balloc.c ext4: fix ext4_get_group_number() 2013-07-21 18:21:33 -07:00
bitmap.c ext4: Checksum the block bitmap properly with bigalloc enabled 2012-10-22 00:34:32 -04:00
block_validity.c ext2/3/4: delete unneeded includes of module.h 2012-01-09 13:52:10 +01:00
dir.c ext4: fix readdir error in the case of inline_data+dir_index 2013-04-19 17:53:09 -04:00
ext4_extents.h ext4: mext_insert_extents should update extent block checksum 2013-04-19 14:04:12 -04:00
ext4_jbd2.c ext4: call ext4_error_inode() if jbd2_journal_dirty_metadata() fails 2014-01-09 12:24:21 -08:00
ext4_jbd2.h ext4: improve credit estimate for EXT4_SINGLEDATA_TRANS_BLOCKS 2013-04-09 12:39:26 -04:00
ext4.h Merge branch 'linux-linaro-lsk' into linux-linaro-lsk-android 2015-02-02 11:29:26 +00:00
extents_status.c ext4: fix data corruption caused by unwritten and delayed extents 2015-05-13 05:15:42 -07:00
extents_status.h ext4: fix fio regression 2013-05-03 02:15:52 -04:00
extents.c ext4: check for zero length extent explicitly 2015-06-17 11:48:50 +08:00
file.c ext4: prevent bugon on race between write/fcntl 2015-02-11 14:48:17 +08:00
fsync.c ext4/jbd2: don't wait (forever) for stale tid caused by wraparound 2013-04-03 22:02:52 -04:00
hash.c ext4: reduce one "if" comparison in ext4_dirhash() 2013-02-01 22:33:21 -05:00
ialloc.c ext4: fix oops when loading block bitmap failed 2014-11-14 08:47:58 -08:00
indirect.c direct-io: only inc/dec inode->i_dio_count for file systems 2015-07-10 13:17:35 +08:00
inline.c ext4: avoid clearing beyond i_blocks when truncating an inline data file 2014-02-06 11:08:16 -08:00
inode.c direct-io: only inc/dec inode->i_dio_count for file systems 2015-07-10 13:17:35 +08:00
ioctl.c Merge branch 'linux-linaro-lsk' into linux-linaro-lsk-android 2014-11-14 18:07:40 +00:00
Kconfig ext4: fix Kconfig documentation for CONFIG_EXT4_DEBUG 2013-04-21 20:32:03 -04:00
Makefile ext4: Remove CONFIG_EXT4_FS_XATTR 2012-12-10 16:30:43 -05:00
mballoc.c Merge branch 'linux-linaro-lsk' into linux-linaro-lsk-android 2015-02-02 11:29:26 +00:00
mballoc.h ext4: use module parameters instead of debugfs for mballoc_debug 2013-02-09 16:28:20 -05:00
migrate.c ext4: do not convert to indirect with bigalloc enabled 2013-04-11 10:54:46 -04:00
mmp.c ext4: mark all metadata I/O with REQ_META 2013-04-20 15:46:17 -04:00
move_extent.c ext4: mext_insert_extents should update extent block checksum 2013-04-19 14:04:12 -04:00
namei.c ext4: make fsync to sync parent dir in no-journal for real this time 2015-05-06 21:56:25 +02:00
page-io.c direct-io: only inc/dec inode->i_dio_count for file systems 2015-07-10 13:17:35 +08:00
resize.c ext4: fix overflow when updating superblock backups after resize 2014-11-14 08:47:58 -08:00
super.c Merge branch android-common-3.10 2015-06-02 11:25:51 +08:00
symlink.c ext4: Remove CONFIG_EXT4_FS_XATTR 2012-12-10 16:30:43 -05:00
truncate.h
xattr_security.c Merge branch 'for_linus' into for_linus_merged 2012-01-10 11:54:07 -05:00
xattr_trusted.c ext2/3/4: delete unneeded includes of module.h 2012-01-09 13:52:10 +01:00
xattr_user.c ext2/3/4: delete unneeded includes of module.h 2012-01-09 13:52:10 +01:00
xattr.c ext4: check EA value offset when loading 2014-11-14 08:47:57 -08:00
xattr.h ext4: reserve xattr index for Rich ACL support 2013-04-18 14:53:15 -04:00