Similar to controllers found Voxel, Delbin, Magpie and Bobba, the one found
in Whitebox does not need to be reset after issuing power-on command, and
skipping reset saves resume time.
Signed-off-by: Jingle Wu <jingle.wu@emc.com.tw>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20210907012924.11391-1-jingle.wu@emc.com.tw
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
a recent refactor created a null pointer vx in snd_vx222_probe().
The vx pointer should have been populated in snd_vx222_create() as
suggested in earlier version, otherwise vx->core.ibl.size will throw an
error.
[ 1.298398] BUG: kernel NULL pointer dereference, address: 00000000000001d8
[ 1.316799] RIP: 0010:snd_vx222_probe+0x155/0x290 [snd_vx222]
Fixes: 3bde3359aa ("ALSA: vx222: Allocate resources with device-managed APIs")
Signed-off-by: Tong Zhang <ztong0001@gmail.com>
Link: https://lore.kernel.org/r/20210907014746.1445278-1-ztong0001@gmail.com
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Damien has agreed to take over maintainership of libata, update the
MAINTAINERS file to reflect that.
Acked-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
There were typos in the fallback definitions of dummy LaTeX macros
for systems without CJK fonts.
They cause build errors in "make pdfdocs" on such systems.
Fix them.
Fixes: e291ff6f5a ("docs: pdfdocs: Add CJK-language-specific font settings")
Signed-off-by: Akira Yokosawa <akiyks@gmail.com>
Link: https://lore.kernel.org/r/ad7615a5-f8fa-2bc3-de6b-7ed49d458964@gmail.com
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
The tb_test_credit_alloc_all() function had a huge number of
KUNIT_ASSERT() statements, all of which (though the magic of many many
layers of inscrutable macros) ended up allocating and initializing
various test assertion structures on the stack.
Don't do that. The kernel stack isn't infinite, and we have compiler
warnings (now errors) for the case where a stack frame grows too large.
Like it did here, by not an inconsiderable margin:
drivers/thunderbolt/test.c: In function ‘tb_test_credit_alloc_all’:
drivers/thunderbolt/test.c:2367:1: error: the frame size of 4500 bytes is larger than 2048 bytes [-Werror=frame-larger-than=]
2367 | }
| ^
Solve this similarly to the lib/test_scanf case: split out the tests
into several smaller functions, each just testing one particular tunnel
credit allocation.
This makes the i386 allyesconfig build work for me again.
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
It turns out that gcc has real trouble merging all the temporary
on-stack buffer allocation. So despite the fact that their lifetimes do
not overlap, gcc will allocate stack for all of them when they have
different types. Which they do in the number scanning test routines.
This is unfortunate in general, but with lots of test-cases in one
function, it becomes a real problem. gcc will allocate a huge stack
frame for no actual good reason.
We have tried to counteract this tendency of gcc not merging stack slots
(see "-fconserve-stack"), but that has limited effect (and should be on
by default these days, iirc).
So with all the debug options enabled on an i386 allmodconfig build, we
end up with overly big stack frames, and the resulting stack frame size
warnings (now errors):
lib/test_scanf.c: In function ‘numbers_list_field_width_val_width’:
lib/test_scanf.c:530:1: error: the frame size of 2088 bytes is larger than 2048 bytes [-Werror=frame-larger-than=]
530 | }
| ^
lib/test_scanf.c: In function ‘numbers_list_field_width_typemax’:
lib/test_scanf.c:488:1: error: the frame size of 2568 bytes is larger than 2048 bytes [-Werror=frame-larger-than=]
488 | }
| ^
lib/test_scanf.c: In function ‘numbers_list’:
lib/test_scanf.c:437:1: error: the frame size of 2088 bytes is larger than 2048 bytes [-Werror=frame-larger-than=]
437 | }
| ^
In this particular case, the reasonably straightforward solution is to
just split out the test routines into multiple more targeted versions.
That way we don't have one huge stack, but several smaller ones, and
they aren't active all at the same time.
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The variable 'package_size' is an unsigned long, and should be printed
out using '%lu', not '%zd' (that would be for a size_t).
Yes, on many architectures (including x86-64), 'size_t' is in fact the
same type as 'long', but that's a fairly random architecture definition,
and on some platforms 'size_t' is in fact 'int' rather than 'long'.
That is the case on traditional 32-bit x86. Yes, both types are the
exact same 32-bit size, and it would all print out perfectly correctly,
but '%zd' ends up still being wrong.
And we can't make 'package_size' be a 'size_t', because we get the
actual value using efivar_entry_get() that takes a pointer to an
'unsigned long'. So '%lu' it is.
This fixes two of the i386 allmodconfig build warnings (that is now an
error due to -Werror).
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-----BEGIN PGP SIGNATURE-----
iQJEBAABCAAuFiEEwPw5LcreJtl1+l5K99NY+ylx4KYFAmE1hQcQHGF4Ym9lQGtl
cm5lbC5kawAKCRD301j7KXHgprjfEACdG+medwQOPpKNSoAvQmYyQnZRMbPjiruv
A4nW2L6MKaExO59qQLVbBYHaH2+ng2UR/p5jNi2AKm+hrQEYllxlNvuCkRBIn97J
r45R48mzBbHjR4kE3Fdu1mOFpBWOuU9JrtzHI+JF/Sl/qPIxKYNHf5E66T6l90Fz
0hJkorAoVB7+hQYixdmkM9quZy11D5SY3aM+bG8r2uNjZTBEHMfmOen8o1giR0vC
EOHzObuC6WLjLGQInNW+Cq2//vVVybQa79mhOUMp93z5nhDMtwUu7MH4B4kmGpix
GLjDa1DukUZe7nGcnsRKmjjXQ+BpG6YF52Z2RfVZpWZn83t5c4YQsq++TPZ8KfpK
4NAFFuSbGM/+QWwEiiyWu00syvpzrEJ4ZIJyZX3FYEeKyKWVRGHqlMDcS9LstYOk
4OfgQUcJ7f/fXeedwi0OGJS1BLr6fi8RnazIafCNIIJLe1XIwTsNufPCNxWYqDAi
0XhH+uYGD38VoUiR5JymZku6frwY4kxssA1khPPE5jWbzCZXiHprwwzaP4hBNNeZ
c5cn9/1ZQSoTE3ebrX9pzTn5wRZwAL+iDhZ2SpLlN2Ji1BJ4EM9H8qFGj3U/CSM4
OWKY2c0VwJYQUhjO4QDBx0MblJgNy8HsvmqGETuxUlk56j3Q1Mx3ViPV43amP9eM
OM4mGige3Q==
=4SCA
-----END PGP SIGNATURE-----
Merge tag 'block-5.15-2021-09-05' of git://git.kernel.dk/linux-block
Pull block fixes from Jens Axboe:
"Was going to send this one in later this week, but given that -Werror
is now enabled (or at least available), the mq-deadline fix really
should go in for the folks hitting that.
- Ensure dd_queued() is only there if needed (Geert)
- Fix a kerneldoc warning for bio_alloc_kiocb()
- BFQ fix for queue merging
- loop locking fix (Tetsuo)"
* tag 'block-5.15-2021-09-05' of git://git.kernel.dk/linux-block:
loop: reduce the loop_ctl_mutex scope
bio: fix kerneldoc documentation for bio_alloc_kiocb()
block, bfq: honor already-setup queue merges
block/mq-deadline: Move dd_queued() to fix defined but not used warning
-----BEGIN PGP SIGNATURE-----
iQJEBAABCAAuFiEEwPw5LcreJtl1+l5K99NY+ylx4KYFAmE1EyYQHGF4Ym9lQGtl
cm5lbC5kawAKCRD301j7KXHgpgn3D/9xlSwyXp4UJXtqPATMe/YzhjRFyAjQR2ot
qXHDzpph7C5SdXIwtvTRKmrTF7YZSyoNf5OXFg4gb7K6ih9/JBmJcGFpjW9wFYpH
Dy2Bss4Grj1RSnQPhIUzZeBkF6WyHKm2MohfFHxWeJ5SKt6YSjorpL8ewVsLEY6F
7juItyoBr5ua8Rvf8XxkjvsHaILa51LMZsYQv5knfxm53cGFVt8bn53LSVzFS1uH
ZavKVcCbOpLzRr7gM+6ra+ZvR+Zg30MMDWBYH5di91uUlJg4lu5CC0Az+atWJfUL
UdsYsQRAiWIoev2VnbZ0O4BYdZ/yoWiQTpYLYKYg61Odj7XsqSDM6zXC1Qkhx+CD
1ZEJNDuJBv502QtF4xjKEEM+qaYtWTC/LjP7v2ww5a/EkJYO1idH93gnD7Af8m6z
wkjhuhBCpJ/fcHoB/2XYF+0Cbo0/87xcsyx5dEPxYGMmh8oscgmM4C6fLkQsxHGS
ndRtJSfB80tjEQSyg+MzWtIHbrNlfcWG/tS+VbEePGsVbwxwMIwaEpHlMzQmavUQ
sKLJEzg7m8ODwT35Fwq35waV2rRhcmNyk4/RtFY7zymzjc/bXZsCLD1AejObmWR8
lH9rAwM0DlMmF64ssJtlmyCU6FLEHz9+69etaHUlrMNFYnYku/cb2HY3eR7jG56E
BEn3RyXbwA==
=jgN5
-----END PGP SIGNATURE-----
Merge tag 'misc-5.15-2021-09-05' of git://git.kernel.dk/linux-block
Pull CDROM maintainer update from Jens Axboe:
"It's been about 22 years since I originally started maintaining the
CDROM code, and I just haven't been able to even get reviews done in a
timely fashion the last handful of years.
Time to pass it on, and Phillip has volunteered take over these
duties. I'll be helping as needed for the foreseeable future"
* tag 'misc-5.15-2021-09-05' of git://git.kernel.dk/linux-block:
cdrom: update uniform CD-ROM maintainership in MAINTAINERS file
-----BEGIN PGP SIGNATURE-----
iQJEBAABCAAuFiEEwPw5LcreJtl1+l5K99NY+ylx4KYFAmE1EsEQHGF4Ym9lQGtl
cm5lbC5kawAKCRD301j7KXHgppSqEADAaYVwbrNvdeCVNLiCiQb8u6mOP/y+wgIu
90aPaNEqpLKEx3hRkN0eOlLymkST2Z9iVXyOdwP0PHXnKygCd/mx9ZocRkO9uGs4
8vNUOK5eMWKtDg4BbWQDR8UxdL+la9udCZgLXWyoMQVij3apSIQlIWxenTZ0OVuJ
BOuxZ+T06NADkxMBOpB25TiOJkHSDvLOSfQySpJqJQKdpcOTFOqSEMfM9xjRUrju
hAxVgjwiovO0AacQ9yie+BgvHySPNggu6Ko63lrujA1JOceWQZXjAP1a3/V6Y0ke
kJ6cRZbhQLMCVDoEG8H+KxFwxJsfnryL3lEDCzs1wkaEksQiHB6SEetqjVz07M+C
S7FaENUwFnAcVTVOEK/geqotGOInUnIUcVCBdT8nLpeuco3cr81TYu1FF9Vv1cB9
O2N1o5dNhl91zbY5SlqjVKpBvJ2N62d4GPF6GcuVxHyqwd9ZtonQ3lB9eqP17Sqb
a6njGDmgUaPvlAF3xt6nPAm0a3AMgQivltvYyTwpS3AYfI6uRINI7K5kH0g4nL3F
CnX33dTVS25y4UBejxvzg2G6LDyat7DiMZadV3c6gN4rNatifAo8ySO6RKmbS8M2
IREF8Hxk81mPqGd1Svmv3E3MgQUngbPrx1zCe7Lh9Hygc65I06hW4gj760po2yVc
Q/dRtXfMcQ==
=U7lH
-----END PGP SIGNATURE-----
Merge tag 'libata-5.15-2021-09-05' of git://git.kernel.dk/linux-block
Pull libata fixes from Jens Axboe:
"Fixes for queued trim on certain Samsung SSDs, in conjunction with
certain ATI controllers"
* tag 'libata-5.15-2021-09-05' of git://git.kernel.dk/linux-block:
libata: Add ATA_HORKAGE_NO_NCQ_ON_ATI for Samsung 860 and 870 SSD.
libata: add ATA_HORKAGE_NO_NCQ_TRIM for Samsung 860 and 870 SSDs
-----BEGIN PGP SIGNATURE-----
iQJEBAABCAAuFiEEwPw5LcreJtl1+l5K99NY+ylx4KYFAmEz5eEQHGF4Ym9lQGtl
cm5lbC5kawAKCRD301j7KXHgpmk1D/wML8Im2erR5s0PaWZgYxXlgEKrJDwJm/p+
2Uixrn/9kQAhwH+0kJnCiI+HwlL3LU+5/iAdeGtdYMcVaotPPmm5V3jfud8+RuAi
E+uIOdULXgQKj8pkiQ2h5mvYd0BxGkGH38gUqilSwFrY2HTpbfxreCHhYoQaE/7o
DiGNgbhJglSFIBuIgS4cfpLkI3FdaAmrCydZ9zaqEv/G/bx9aA9lwSbAJadhTbmt
Qc1vvbh2FB9YvgZX8qfaneyDKzQbwqTvKxCe2SOVMOp/X0feJym7WZUvrPr04EoZ
zBaLDkmn44re4iWPbide7+KQJ8NMQQDBiuxwF5WxdF3hrcsiwqmKgDtBEGWXFMeV
CUZ9Osrfb480UKsDExtxLhQqGz1JZqIPZdtDvSJb8MunPZtvTz27NNFyyb9aBrlX
WiwEHqAOE1W33buPCNyuYLGDVYis4/TkwF0NZpMwsyPdN0Iz/M8Z5F5BHhC7BYoP
U8KMsX3XvddxB113U+IMVqI/SuvT125U65brklQlQeLEHnH57ceII9mNGfNic6LR
bcIu7Fb5J1U5nAMeeLCSXsEYXs+peYgI1UOWXaWgSVixUAyU8H+OqsBVIl8eiMjr
TTbdIMmfWqENE3wBM709FQQLoMmGl1YjBkGmBXKZjNHcDrf9X56rimSxRD2i2okg
r2JczxQ5uQ==
=QoQg
-----END PGP SIGNATURE-----
Merge tag 'for-5.15/io_uring-2021-09-04' of git://git.kernel.dk/linux-block
Pull io_uring fixes from Jens Axboe:
"As sometimes happens, two reports came in around the merge window open
that led to some fixes. Hence this one is a bit bigger than usual
followup fixes, but most of it will be going towards stable, outside
of the fixes that are addressing regressions from this merge window.
In detail:
- postgres is a heavy user of signals between tasks, and if we're
unlucky this can interfere with io-wq worker creation. Make sure
we're resilient against unrelated signal handling. This set of
changes also includes hardening against allocation failures, which
could previously had led to stalls.
- Some use cases that end up having a mix of bounded and unbounded
work would have starvation issues related to that. Split the
pending work lists to handle that better.
- Completion trace int -> unsigned -> long fix
- Fix issue with REGISTER_IOWQ_MAX_WORKERS and SQPOLL
- Fix regression with hash wait lock in this merge window
- Fix retry issued on block devices (Ming)
- Fix regression with links in this merge window (Pavel)
- Fix race with multi-shot poll and completions (Xiaoguang)
- Ensure regular file IO doesn't inadvertently skip completion
batching (Pavel)
- Ensure submissions are flushed after running task_work (Pavel)"
* tag 'for-5.15/io_uring-2021-09-04' of git://git.kernel.dk/linux-block:
io_uring: io_uring_complete() trace should take an integer
io_uring: fix possible poll event lost in multi shot mode
io_uring: prolong tctx_task_work() with flushing
io_uring: don't disable kiocb_done() CQE batching
io_uring: ensure IORING_REGISTER_IOWQ_MAX_WORKERS works with SQPOLL
io-wq: make worker creation resilient against signals
io-wq: get rid of FIXED worker flag
io-wq: only exit on fatal signals
io-wq: split bounded and unbounded work into separate lists
io-wq: fix queue stalling race
io_uring: don't submit half-prepared drain request
io_uring: fix queueing half-created requests
io-wq: ensure that hash wait lock is IRQ disabling
io_uring: retry in case of short read on block device
io_uring: IORING_OP_WRITE needs hash_reg_file set
io-wq: fix race between adding work and activating a free worker
Old SDHI instances have a default value for the reset register which
keeps it in reset state by default. So, when applying a hard reset we
need to manually leave the soft reset state as well. Later SDHI
instances have a different default value, the one we write manually now.
Fixes: b4d86f37ea ("mmc: renesas_sdhi: do hard reset if possible")
Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Tested-by: Geert Uytterhoeven <geert+renesas@glider.be>
Reported-by: Geert Uytterhoeven <geert+renesas@glider.be>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20210826082107.47299-1-wsa+renesas@sang-engineering.com
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
The fault injection function can set EVENT_DATA_ERROR but skip the
setting of ->data_status to an error status if it hits just after a data
over interrupt. This confuses the tasklet which can later end up
triggering the WARN_ON(host->cmd || ..) in dw_mci_request_end() since
dw_mci_data_complete() would return success.
Prevent the fault injection function from doing this since this is not a
real case, and ensure that the fault injection doesn't race with a real
error either.
Signed-off-by: Vincent Whitchurch <vincent.whitchurch@axis.com>
Reviewed-by: Jaehoon Chung <jh80.chung@samsung.com>
Fixes: 2b8ac062f3 ("mmc: dw_mmc: Add data CRC error injection")
Link: https://lore.kernel.org/r/20210825114213.7429-1-vincent.whitchurch@axis.com
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Currently we have readl()/writel()/ioread*()/iowrite*() APIs in use.
Let's unify to use only ioread*()/iowrite*() variants.
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The io.*_lo_hi() variants are not strictly needed on the x86 hardware
and especially the PCI bus. Replace them with regular accessors, but
leave headers in place in case of 32-bit build.
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The strlcpy should not be used because it doesn't limit the source
length. As linus says, it's a completely useless function if you
can't implicitly trust the source string - but that is almost always
why people think they should use it! All in all the BSD function
will lead some potential bugs.
But the strscpy doesn't require reading memory from the src string
beyond the specified "count" bytes, and since the return value is
easier to error-check than strlcpy()'s. In addition, the implementation
is robust to the string changing out from underneath it, unlike the
current strlcpy() implementation.
Thus, We prefer using strscpy instead of strlcpy.
Signed-off-by: Jason Wang <wangborong@cdjrlc.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This reverts commit 9cf448c200.
This commit was added for equivalence with a similar fix to ip_gre.
That fix proved to have a bug. Upon closer inspection, ip6_gre is not
susceptible to the original bug.
So revert the unnecessary extra check.
In short, ipgre_xmit calls skb_pull to remove ipv4 headers previously
inserted by dev_hard_header. ip6gre_tunnel_xmit does not.
Link: https://lore.kernel.org/netdev/CA+FuTSe+vJgTVLc9SojGuN-f9YQ+xWLPKE_S4f=f+w+_P2hgUg@mail.gmail.com/#t
Fixes: 9cf448c200 ("ip6_gre: add validation for csum_start")
Signed-off-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add the missing description for the nents parameter, and fix a trivial
misalignment.
Fixes: fffe3cc8c2 ("dma-mapping: allow map_sg() ops to return negative error codes")
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Drop the unused function as reported by test bot.
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <20210901230904.15164-1-peterx@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
This symbols is not used outside of hclge_cmd.c and hclgevf_cmd.c, so marks
it static.
Fix the following sparse warning:
drivers/net/ethernet/hisilicon/hns3/hns3vf/hclgevf_cmd.c:345:35:
warning: symbol 'hclgevf_cmd_caps_bit_map0' was not declared. Should it
be static?
drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_cmd.c:365:33: warning:
symbol 'hclge_cmd_caps_bit_map0' was not declared. Should it be static?
Reported-by: Abaci Robot <abaci@linux.alibaba.com>
Signed-off-by: chongjiapeng <jiapeng.chong@linux.alibaba.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jussi Maki says:
====================
bonding: Fix negative jump count reported by syzbot
This patch set fixes a negative jump count warning encountered by
syzbot [1] and extends the tests to cover nested bonding devices.
[1]: https://lore.kernel.org/lkml/0000000000000a9f3605cb1d2455@google.com/
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Modify the test to check that enslaving a bond slave with a XDP program
is now allowed.
Extend attach test to exercise the program unwinding in bond_xdp_set and
add a new test for loading XDP program on doubly nested bond device to
verify that static key incr/decr is correct.
Signed-off-by: Jussi Maki <joamaki@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
With nested bonding devices the nested bond device's ndo_bpf was
called without a program causing it to decrement the static key
without a prior increment leading to negative count.
Fix the issue by 1) only calling slave's ndo_bpf when there's a
program to be loaded and 2) only decrement the count when a program
is unloaded.
Fixes: 9e2ee5c7e7 ("net, bonding: Add XDP support to the bonding driver")
Reported-by: syzbot+30622fb04ddd72a4d167@syzkaller.appspotmail.com
Signed-off-by: Jussi Maki <joamaki@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add a new entry for VM Sockets (AF_VSOCK) that covers vsock core,
tests, and headers. Move some general vsock stuff from virtio-vsock
entry into this new more general vsock entry.
I've been reviewing and contributing for the last few years,
so I'm available to help maintain this code.
Cc: Dexuan Cui <decui@microsoft.com>
Cc: Jorgen Hansen <jhansen@vmware.com>
Cc: Stefan Hajnoczi <stefanha@redhat.com>
Suggested-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In case of fuse the MM subsystem doesn't guarantee that page writeback
completes by the time ->sync_fs() is called. This is because fuse
completes page writeback immediately to prevent DoS of memory reclaim by
the userspace file server.
This means that fuse itself must ensure that writes are synced before
sending the SYNCFS request to the server.
Introduce sync buckets, that hold a counter for the number of outstanding
write requests. On syncfs replace the current bucket with a new one and
wait until the old bucket's counter goes down to zero.
It is possible to have multiple syncfs calls in parallel, in which case
there could be more than one waited-on buckets. Descendant buckets must
not complete until the parent completes. Add a count to the child (new)
bucket until the (parent) old bucket completes.
Use RCU protection to dereference the current bucket and to wake up an
emptied bucket. Use fc->lock to protect against parallel assignments to
the current bucket.
This leaves just the counter to be a possible scalability issue. The
fc->num_waiting counter has a similar issue, so both should be addressed at
the same time.
Reported-by: Amir Goldstein <amir73il@gmail.com>
Fixes: 2d82ab251e ("virtiofs: propagate sync() to file server")
Cc: <stable@vger.kernel.org> # v5.14
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
VDUSE (vDPA Device in Userspace) is a framework to support
implementing software-emulated vDPA devices in userspace. This
document is intended to clarify the VDUSE design and usage.
Signed-off-by: Xie Yongji <xieyongji@bytedance.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Link: https://lore.kernel.org/r/20210831103634.33-14-xieyongji@bytedance.com
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
This VDUSE driver enables implementing software-emulated vDPA
devices in userspace. The vDPA device is created by
ioctl(VDUSE_CREATE_DEV) on /dev/vduse/control. Then a char device
interface (/dev/vduse/$NAME) is exported to userspace for device
emulation.
In order to make the device emulation more secure, the device's
control path is handled in kernel. A message mechnism is introduced
to forward some dataplane related control messages to userspace.
And in the data path, the DMA buffer will be mapped into userspace
address space through different ways depending on the vDPA bus to
which the vDPA device is attached. In virtio-vdpa case, the MMU-based
software IOTLB is used to achieve that. And in vhost-vdpa case, the
DMA buffer is reside in a userspace memory region which can be shared
to the VDUSE userspace processs via transferring the shmfd.
For more details on VDUSE design and usage, please see the follow-on
Documentation commit.
NB(mst): when merging this with
b542e383d8 ("eventfd: Make signal recursion protection a task bit")
replace eventfd_signal_count with eventfd_signal_allowed,
and drop the previous
("eventfd: Export eventfd_wake_count to modules").
Signed-off-by: Xie Yongji <xieyongji@bytedance.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Link: https://lore.kernel.org/r/20210831103634.33-13-xieyongji@bytedance.com
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
This implements an MMU-based software IOTLB to support mapping
kernel dma buffer into userspace dynamically. The basic idea
behind it is treating MMU (VA->PA) as IOMMU (IOVA->PA). The
software IOTLB will set up MMU mapping instead of IOMMU mapping
for the DMA transfer so that the userspace process is able to
use its virtual address to access the dma buffer in kernel.
To avoid security issue, a bounce-buffering mechanism is
introduced to prevent userspace accessing the original buffer
directly which may contain other kernel data. During the mapping,
unmapping, the software IOTLB will copy the data from the original
buffer to the bounce buffer and back, depending on the direction
of the transfer. And the bounce-buffer addresses will be mapped
into the user address space instead of the original one.
Signed-off-by: Xie Yongji <xieyongji@bytedance.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Link: https://lore.kernel.org/r/20210831103634.33-12-xieyongji@bytedance.com
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
This patch introduces an attribute for vDPA device to indicate
whether virtual address can be used. If vDPA device driver set
it, vhost-vdpa bus driver will not pin user page and transfer
userspace virtual address instead of physical address during
DMA mapping. And corresponding vma->vm_file and offset will be
also passed as an opaque pointer.
Suggested-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Xie Yongji <xieyongji@bytedance.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Link: https://lore.kernel.org/r/20210831103634.33-11-xieyongji@bytedance.com
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
The upcoming patch is going to support VA mapping/unmapping.
So let's factor out the logic of PA mapping/unmapping firstly
to make the code more readable.
Suggested-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Xie Yongji <xieyongji@bytedance.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Link: https://lore.kernel.org/r/20210831103634.33-10-xieyongji@bytedance.com
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Add an opaque pointer for DMA mapping.
Suggested-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Xie Yongji <xieyongji@bytedance.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Link: https://lore.kernel.org/r/20210831103634.33-9-xieyongji@bytedance.com
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Add an opaque pointer for vhost IOTLB. And introduce
vhost_iotlb_add_range_ctx() to accept it.
Suggested-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Xie Yongji <xieyongji@bytedance.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Link: https://lore.kernel.org/r/20210831103634.33-8-xieyongji@bytedance.com
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
The vdpa_reset() may fail now. This adds check to its return
value and fail the vhost_vdpa_open().
Signed-off-by: Xie Yongji <xieyongji@bytedance.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
Link: https://lore.kernel.org/r/20210831103634.33-7-xieyongji@bytedance.com
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
This adds a new callback to support device specific reset
behavior. The vdpa bus driver will call the reset function
instead of setting status to zero during resetting.
Signed-off-by: Xie Yongji <xieyongji@bytedance.com>
Link: https://lore.kernel.org/r/20210831103634.33-6-xieyongji@bytedance.com
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Fix some code indent issues and following checkpatch warning:
WARNING: Prefer 'unsigned int' to bare use of 'unsigned'
371: FILE: include/linux/vdpa.h:371:
+static inline void vdpa_get_config(struct vdpa_device *vdev, unsigned offset,
Signed-off-by: Xie Yongji <xieyongji@bytedance.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
Link: https://lore.kernel.org/r/20210831103634.33-5-xieyongji@bytedance.com
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Export receive_fd() so that some modules can use
it to pass file descriptor between processes without
missing any security stuffs.
Signed-off-by: Xie Yongji <xieyongji@bytedance.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Link: https://lore.kernel.org/r/20210831103634.33-4-xieyongji@bytedance.com
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Export eventfd_wake_count so that some modules can use
the eventfd_signal_count() to check whether the
eventfd_signal() call should be deferred to a safe context.
NB(mst): this patch is not needed in Linus tree since there
eventfd_signal_count() has been superseded by an already exported
eventfd_signal_allowed().
Signed-off-by: Xie Yongji <xieyongji@bytedance.com>
Link: https://lore.kernel.org/r/20210831103634.33-3-xieyongji@bytedance.com
Acked-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Export alloc_iova_fast() and free_iova_fast() so that
some modules can make use of the per-CPU cache to get
rid of rbtree spinlock in alloc_iova() and free_iova()
during IOVA allocation.
Signed-off-by: Xie Yongji <xieyongji@bytedance.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Acked-by: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20210831103634.33-2-xieyongji@bytedance.com
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Usually we use "likely/unlikely" to optimize the fast path. Remove
redundant "likely/unlikely" statements in the control path to simplify
the code and make it easier to read.
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Max Gurtovoy <mgurtovoy@nvidia.com>
Link: https://lore.kernel.org/r/20210905085717.7427-1-mgurtovoy@nvidia.com
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Chaitanya Kulkarni <ckulkarnilinux@gmail.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Use the helper virtio_find_vqs().
Signed-off-by: Xianting Tian <xianting.tian@linux.alibaba.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Link: https://lore.kernel.org/r/20210723054259.2779-1-xianting.tian@linux.alibaba.com
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
When MSR_IA32_TSC_ADJUST is written by guest due to TSC ADJUST feature
especially there's a big tsc warp (like a new vCPU is hot-added into VM
which has been up for a long time), tsc_offset is added by a large value
then go back to guest. This causes system time jump as tsc_timestamp is
not adjusted in the meantime and pvclock monotonic character.
To fix this, just notify kvm to update vCPU's guest time before back to
guest.
Cc: stable@vger.kernel.org
Signed-off-by: Zelin Deng <zelin.deng@linux.alibaba.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <1619576521-81399-2-git-send-email-zelin.deng@linux.alibaba.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
It is reasonable for these functions to be used only in some configurations,
for example only if the host is 64-bits (and therefore supports 64-bit
guests). It is also reasonable to keep the role_regs and role accessors
in sync even though some of the accessors may be used only for one of the
two sets (as is the case currently for CR4.LA57)..
Because clang reports warnings for unused inlines declared in a .c file,
mark both sets of accessors as __maybe_unused.
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
This fix a build warning:
arch/mips/kvm/vz.c: In function '_kvm_vz_restore_htimer':
>> arch/mips/kvm/vz.c:392:10: warning: variable 'freeze_time' set but not used [-Wunused-but-set-variable]
392 | ktime_t freeze_time;
| ^~~~~~~~~~~
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
Message-Id: <20210406024911.2008046-1-chenhuacai@loongson.cn>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
- Page ownership tracking between host EL1 and EL2
- Rely on userspace page tables to create large stage-2 mappings
- Fix incompatibility between pKVM and kmemleak
- Fix the PMU reset state, and improve the performance of the virtual PMU
- Move over to the generic KVM entry code
- Address PSCI reset issues w.r.t. save/restore
- Preliminary rework for the upcoming pKVM fixed feature
- A bunch of MM cleanups
- a vGIC fix for timer spurious interrupts
- Various cleanups
-----BEGIN PGP SIGNATURE-----
iQJDBAABCgAtFiEEn9UcU+C1Yxj9lZw9I9DQutE9ekMFAmEnfogPHG1hekBrZXJu
ZWwub3JnAAoJECPQ0LrRPXpDF9oQAINWHN1n30gsxcErMV8gH+XAyhDq2vTjkExQ
Qz5ddo4R5zeVkj0nkunFSK+W3xYz+W97X3I+IaiiHvk5D6dUatj37IyYlazX5iFT
7mbjTAqY7GRxfd6um7uK+CTRCApXY49GGkCVLGA5f+6mQ0JMVXaK9AKlsXKWUQLZ
JvLasUgKkseN6IEJWmPDNBdIeiKBTZloeZMdlM2vSm34HsuirSS5LmshdzJQzSk8
QSEqwXZX50afzJLNlB9Qa6V1tokjZVoYIBk0vAPO83tTh9HIyGL/PFAqBeq2rnWT
M19fFFbx5vizap4ICbpviLmZ5AOywCoBmbPBT79eMAJ53rOqHUJhU1y/3DoiVzxu
LJZI4wmGBQZVivOWOqyEZcNtTAagPLhyrLhMzYulBLwAjfFJmUHdSOxYtx+2Ysvr
SDIPN31FKWrvifTXTqJHDmaaXusi2CNZUOPzVSe2I14SbX+ZX2ny9DltlbRgPNuc
hGJagI5cZc0ngd4mAIzjjNmgBS2B+dSc8dOo71dRNJRLtQLiNHcAyQNJyFme+4xI
NpvpkvzxBAs8rG2X0YIR/Cz3W3yZoCYuQNcoPk7+F/bUTK47VocQCS+gLucHVLbT
H4286EV5n4nZ7E01oJ6uWnDnslPvrx9Sz2fxsrWYkBDR+xrz0EprrGsftFaILprz
Ic43uXfd
=LuHM
-----END PGP SIGNATURE-----
Merge tag 'kvmarm-5.15' of git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD
KVM/arm64 updates for 5.15
- Page ownership tracking between host EL1 and EL2
- Rely on userspace page tables to create large stage-2 mappings
- Fix incompatibility between pKVM and kmemleak
- Fix the PMU reset state, and improve the performance of the virtual PMU
- Move over to the generic KVM entry code
- Address PSCI reset issues w.r.t. save/restore
- Preliminary rework for the upcoming pKVM fixed feature
- A bunch of MM cleanups
- a vGIC fix for timer spurious interrupts
- Various cleanups
Commit f4e61f0c9a ("x86/kvm: Fix broken irq restoration in kvm_wait")
replaced "local_irq_restore() when IRQ enabled" with "local_irq_enable()
when IRQ enabled" to suppress a warnning.
Although there is no similar debugging warnning for doing local_irq_enable()
when IRQ enabled as doing local_irq_restore() in the same IRQ situation. But
doing local_irq_enable() when IRQ enabled is no less broken as doing
local_irq_restore() and we'd better avoid it.
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Lai Jiangshan <laijs@linux.alibaba.com>
Message-Id: <20210814035129.154242-1-jiangshanlai@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>