mirror of
https://github.com/torvalds/linux.git
synced 2026-05-24 07:03:03 +02:00
v6.7
45221 Commits
| Author | SHA1 | Message | Date | |
|---|---|---|---|---|
|
|
95c8a35f1c |
12 hotfixes. 2 are cc:stable and the remainder either address post-6.7
issues or aren't considered necessary for earlier kernel versions. -----BEGIN PGP SIGNATURE----- iHUEABYIAB0WIQTTMBEPP41GrTpTJgfdBJ7gKXxAjgUCZZhbAwAKCRDdBJ7gKXxA jiFeAQD20H8wGT9hbMUYr/PxE4rOxyDXlhnv/mZ0Av3neve4SQD/YlgCFWYpEu8G F5rEAtq89UYE13qlgS3o9KjYOPzhtgQ= =vZ0P -----END PGP SIGNATURE----- Merge tag 'mm-hotfixes-stable-2024-01-05-11-35' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Pull misc mm fixes from Andrew Morton: "12 hotfixes. Two are cc:stable and the remainder either address post-6.7 issues or aren't considered necessary for earlier kernel versions" * tag 'mm-hotfixes-stable-2024-01-05-11-35' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: mm: shrinker: use kvzalloc_node() from expand_one_shrinker_info() mailmap: add entries for Mathieu Othacehe MAINTAINERS: change vmware.com addresses to broadcom.com arch/mm/fault: fix major fault accounting when retrying under per-VMA lock mm/mglru: skip special VMAs in lru_gen_look_around() MAINTAINERS: hand over hwpoison maintainership to Miaohe Lin MAINTAINERS: remove hugetlb maintainer Mike Kravetz mm: fix unmap_mapping_range high bits shift bug mm: memcg: fix split queue list crash when large folio migration mm: fix arithmetic for max_prop_frac when setting max_ratio mm: fix arithmetic for bdi min_ratio mm: align larger anonymous mappings on THP boundaries |
||
|
|
7987b8b75f |
* Fix Boolean logic in intel_guest_get_msrs
-----BEGIN PGP SIGNATURE----- iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAmWWz4QUHHBib256aW5p QHJlZGhhdC5jb20ACgkQv/vSX3jHroNYSQgAjHhbUUUHqpkl4vO+zcJiPlQN3i14 M/71Hgdvh+936UlyqBoEx9wrHcbzw0+vaxKsri/yIGLkieaU7ZtVtyeWfoc4DkDd 0Al4OTIkyydPmaKjC3terRwlCxRWbut30yRlkz4LzC/oWb2NN19j3Yq6omwBBAOu vCqK2chNCIEetYBmX3G2CDG7DJ6mVvSLZx/vQXP3dj3jQWxT8eFxTH52eNtdIJXN 0U4EVgp21hah+ur8U/VGNlqkc5govovSKtV3eAeE1WF0coBTLZcFadnHVjcG65lz QOKVCU9Aw31M7qddTAqztC9OpyaD8oDEhyRKZoVSkGhiP/2crK83AAf/7w== =J0fU -----END PGP SIGNATURE----- Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm Pull kvm fix from Paolo Bonzini: - Fix boolean logic in intel_guest_get_msrs * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: KVM: x86/pmu: fix masking logic for MSR_CORE_PERF_GLOBAL_CTRL |
||
|
|
7131c2e9bb |
Probes fixes for 6.7-rc8:
- Kprobes/x86: Fix to emulate indirect call which size is not 5 byte.
Current code expects the indirect call instructions are 5 bytes, but
that is incorrect. Usually indirect call based on register is shorter
than that, thus the emulation causes a kernel crash by accessing
wrong instruction boundary. This uses the instruction size to
calculate the return address correctly.
-----BEGIN PGP SIGNATURE-----
iQFPBAABCgA5FiEEh7BulGwFlgAOi5DV2/sHvwUrPxsFAmWWxCIbHG1hc2FtaS5o
aXJhbWF0c3VAZ21haWwuY29tAAoJENv7B78FKz8brzYH+wZnk8UnJk8VmCp+BFwf
kHfzDruVLnsjiehMPIniu/DUoDpwZnUw3/uFWzMEnH0y14GfTgS5D0m1ctybtpnR
PgUeWSRI0XAxOXhExJLGd+/29V2E1FAjpR2kQx/U5hObzUtDNR2n0zLGO4qcJq0a
laNXnkc5OgkZ9KAcJp2dT4WNsUFalRUbr4PmeVUSxYxmI1L1/+Q74vyGywRBgqkB
lENLCbzubhp9T4pLBmCrDoRRPshjr/TYPRYvEJ5gEH9c+KuVOdZXa6Drr1Y5sADe
zsycxjPi2ETxJtMCf67IzBjNbYn/wGLg2u00FtRdt+JK6p0I+Wgm+cKb6ifPDLPN
PXU=
=CmiU
-----END PGP SIGNATURE-----
Merge tag 'probes-fixes-v6.7-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace
Pull kprobes/x86 fix from Masami Hiramatsu:
- Fix to emulate indirect call which size is not 5 byte.
Current code expects the indirect call instructions are 5 bytes, but
that is incorrect. Usually indirect call based on register is shorter
than that, thus the emulation causes a kernel crash by accessing
wrong instruction boundary. This uses the instruction size to
calculate the return address correctly.
* tag 'probes-fixes-v6.7-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace:
x86/kprobes: fix incorrect return address calculation in kprobe_emulate_call_indirect
|
||
|
|
a476aae3f1 |
x86/csum: clean up `csum_partial' further
Commit
|
||
|
|
5d4acb6285 |
x86/csum: Remove unnecessary odd handling
The special case for odd aligned buffers is unnecessary and mostly just adds overhead. Aligned buffers is the expectations, and even for unaligned buffer, the only case that was helped is if the buffer was 1-byte from word aligned which is ~1/7 of the cases. Overall it seems highly unlikely to be worth to extra branch. It was left in the previous perf improvement patch because I was erroneously comparing the exact output of `csum_partial(...)`, but really we only need `csum_fold(csum_partial(...))` to match so its safe to remove. All csum kunit tests pass. Signed-off-by: Noah Goldstein <goldstein.w.n@gmail.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Reviewed-by: David Laight <david.laight@aculab.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> |
||
|
|
9710794640 |
KVM: x86/pmu: fix masking logic for MSR_CORE_PERF_GLOBAL_CTRL
When commit |
||
|
|
f5d03da48d |
x86/kprobes: fix incorrect return address calculation in kprobe_emulate_call_indirect
kprobe_emulate_call_indirect currently uses int3_emulate_call to emulate
indirect calls. However, int3_emulate_call always assumes the size of
the call to be 5 bytes when calculating the return address. This is
incorrect for register-based indirect calls in x86, which can be either
2 or 3 bytes depending on whether REX prefix is used. At kprobe runtime,
the incorrect return address causes control flow to land onto the wrong
place after return -- possibly not a valid instruction boundary. This
can lead to a panic like the following:
[ 7.308204][ C1] BUG: unable to handle page fault for address: 000000000002b4d8
[ 7.308883][ C1] #PF: supervisor read access in kernel mode
[ 7.309168][ C1] #PF: error_code(0x0000) - not-present page
[ 7.309461][ C1] PGD 0 P4D 0
[ 7.309652][ C1] Oops: 0000 [#1] SMP
[ 7.309929][ C1] CPU: 1 PID: 0 Comm: swapper/1 Not tainted 6.7.0-rc5-trace-for-next #6
[ 7.310397][ C1] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.0-20220807_005459-localhost 04/01/2014
[ 7.311068][ C1] RIP: 0010:__common_interrupt+0x52/0xc0
[ 7.311349][ C1] Code: 01 00 4d 85 f6 74 39 49 81 fe 00 f0 ff ff 77 30 4c 89 f7 4d 8b 5e 68 41 ba 91 76 d8 42 45 03 53 fc 74 02 0f 0b cc ff d3 65 48 <8b> 05 30 c7 ff 7e 65 4c 89 3d 28 c7 ff 7e 5b 41 5c 41 5e 41 5f c3
[ 7.312512][ C1] RSP: 0018:ffffc900000e0fd0 EFLAGS: 00010046
[ 7.312899][ C1] RAX: 0000000000000001 RBX: 0000000000000023 RCX: 0000000000000001
[ 7.313334][ C1] RDX: 00000000000003cd RSI: 0000000000000001 RDI: ffff888100d302a4
[ 7.313702][ C1] RBP: 0000000000000001 R08: 0ef439818636191f R09: b1621ff338a3b482
[ 7.314146][ C1] R10: ffffffff81e5127b R11: ffffffff81059810 R12: 0000000000000023
[ 7.314509][ C1] R13: 0000000000000000 R14: ffff888100d30200 R15: 0000000000000000
[ 7.314951][ C1] FS: 0000000000000000(0000) GS:ffff88813bc80000(0000) knlGS:0000000000000000
[ 7.315396][ C1] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 7.315691][ C1] CR2: 000000000002b4d8 CR3: 0000000003028003 CR4: 0000000000370ef0
[ 7.316153][ C1] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 7.316508][ C1] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 7.316948][ C1] Call Trace:
[ 7.317123][ C1] <IRQ>
[ 7.317279][ C1] ? __die_body+0x64/0xb0
[ 7.317482][ C1] ? page_fault_oops+0x248/0x370
[ 7.317712][ C1] ? __wake_up+0x96/0xb0
[ 7.317964][ C1] ? exc_page_fault+0x62/0x130
[ 7.318211][ C1] ? asm_exc_page_fault+0x22/0x30
[ 7.318444][ C1] ? __cfi_native_send_call_func_single_ipi+0x10/0x10
[ 7.318860][ C1] ? default_idle+0xb/0x10
[ 7.319063][ C1] ? __common_interrupt+0x52/0xc0
[ 7.319330][ C1] common_interrupt+0x78/0x90
[ 7.319546][ C1] </IRQ>
[ 7.319679][ C1] <TASK>
[ 7.319854][ C1] asm_common_interrupt+0x22/0x40
[ 7.320082][ C1] RIP: 0010:default_idle+0xb/0x10
[ 7.320309][ C1] Code: 4c 01 c7 4c 29 c2 e9 72 ff ff ff cc cc cc cc 90 90 90 90 90 90 90 90 90 90 90 b8 0c 67 40 a5 66 90 0f 00 2d 09 b9 3b 00 fb f4 <fa> c3 0f 1f 00 90 90 90 90 90 90 90 90 90 90 90 b8 0c 67 40 a5 e9
[ 7.321449][ C1] RSP: 0018:ffffc9000009bee8 EFLAGS: 00000256
[ 7.321808][ C1] RAX: ffff88813bca8b68 RBX: 0000000000000001 RCX: 000000000001ef0c
[ 7.322227][ C1] RDX: 0000000000000000 RSI: 0000000000000001 RDI: 000000000001ef0c
[ 7.322656][ C1] RBP: ffffc9000009bef8 R08: 8000000000000000 R09: 00000000000008c2
[ 7.323083][ C1] R10: 0000000000000000 R11: ffffffff81058e70 R12: 0000000000000000
[ 7.323530][ C1] R13: ffff8881002b30c0 R14: 0000000000000000 R15: 0000000000000000
[ 7.323948][ C1] ? __cfi_lapic_next_deadline+0x10/0x10
[ 7.324239][ C1] default_idle_call+0x31/0x50
[ 7.324464][ C1] do_idle+0xd3/0x240
[ 7.324690][ C1] cpu_startup_entry+0x25/0x30
[ 7.324983][ C1] start_secondary+0xb4/0xc0
[ 7.325217][ C1] secondary_startup_64_no_verify+0x179/0x17b
[ 7.325498][ C1] </TASK>
[ 7.325641][ C1] Modules linked in:
[ 7.325906][ C1] CR2: 000000000002b4d8
[ 7.326104][ C1] ---[ end trace 0000000000000000 ]---
[ 7.326354][ C1] RIP: 0010:__common_interrupt+0x52/0xc0
[ 7.326614][ C1] Code: 01 00 4d 85 f6 74 39 49 81 fe 00 f0 ff ff 77 30 4c 89 f7 4d 8b 5e 68 41 ba 91 76 d8 42 45 03 53 fc 74 02 0f 0b cc ff d3 65 48 <8b> 05 30 c7 ff 7e 65 4c 89 3d 28 c7 ff 7e 5b 41 5c 41 5e 41 5f c3
[ 7.327570][ C1] RSP: 0018:ffffc900000e0fd0 EFLAGS: 00010046
[ 7.327910][ C1] RAX: 0000000000000001 RBX: 0000000000000023 RCX: 0000000000000001
[ 7.328273][ C1] RDX: 00000000000003cd RSI: 0000000000000001 RDI: ffff888100d302a4
[ 7.328632][ C1] RBP: 0000000000000001 R08: 0ef439818636191f R09: b1621ff338a3b482
[ 7.329223][ C1] R10: ffffffff81e5127b R11: ffffffff81059810 R12: 0000000000000023
[ 7.329780][ C1] R13: 0000000000000000 R14: ffff888100d30200 R15: 0000000000000000
[ 7.330193][ C1] FS: 0000000000000000(0000) GS:ffff88813bc80000(0000) knlGS:0000000000000000
[ 7.330632][ C1] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 7.331050][ C1] CR2: 000000000002b4d8 CR3: 0000000003028003 CR4: 0000000000370ef0
[ 7.331454][ C1] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 7.331854][ C1] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 7.332236][ C1] Kernel panic - not syncing: Fatal exception in interrupt
[ 7.332730][ C1] Kernel Offset: disabled
[ 7.333044][ C1] ---[ end Kernel panic - not syncing: Fatal exception in interrupt ]---
The relevant assembly code is (from objdump, faulting address
highlighted):
ffffffff8102ed9d: 41 ff d3 call *%r11
ffffffff8102eda0: 65 48 <8b> 05 30 c7 ff mov %gs:0x7effc730(%rip),%rax
The emulation incorrectly sets the return address to be ffffffff8102ed9d
+ 0x5 = ffffffff8102eda2, which is the 8b byte in the middle of the next
mov. This in turn causes incorrect subsequent instruction decoding and
eventually triggers the page fault above.
Instead of invoking int3_emulate_call, perform push and jmp emulation
directly in kprobe_emulate_call_indirect. At this point we can obtain
the instruction size from p->ainsn.size so that we can calculate the
correct return address.
Link: https://lore.kernel.org/all/20240102233345.385475-1-jinghao7@illinois.edu/
Fixes:
|
||
|
|
46e714c729 |
arch/mm/fault: fix major fault accounting when retrying under per-VMA lock
A test [1] in Android test suite started failing after [2] was merged. It
turns out that after handling a major fault under per-VMA lock, the
process major fault counter does not register that fault as major. Before
[2] read faults would be done under mmap_lock, in which case
FAULT_FLAG_TRIED flag is set before retrying. That in turn causes
mm_account_fault() to account the fault as major once retry completes.
With per-VMA locks we often retry because a fault can't be handled without
locking the whole mm using mmap_lock. Therefore such retries do not set
FAULT_FLAG_TRIED flag. This logic does not work after [2] because we can
now handle read major faults under per-VMA lock and upon retry the fact
there was a major fault gets lost. Fix this by setting FAULT_FLAG_TRIED
after retrying under per-VMA lock if VM_FAULT_MAJOR was returned. Ideally
we would use an additional VM_FAULT bit to indicate the reason for the
retry (could not handle under per-VMA lock vs other reason) but this
simpler solution seems to work, so keeping it simple.
[1] https://cs.android.com/android/platform/superproject/+/master:test/vts-testcase/kernel/api/drop_caches_prop/drop_caches_test.cpp
[2] https://lore.kernel.org/all/20231006195318.4087158-6-willy@infradead.org/
Link: https://lkml.kernel.org/r/20231226214610.109282-1-surenb@google.com
Fixes:
|
||
|
|
f5837722ff |
11 hotfixes. 7 are cc:stable and the other 4 address post-6.6 issues or
are not considered backporting material. -----BEGIN PGP SIGNATURE----- iHUEABYIAB0WIQTTMBEPP41GrTpTJgfdBJ7gKXxAjgUCZYys4AAKCRDdBJ7gKXxA jtmaAQC+o04Ia7IfB8MIqp1p7dNZQo64x/EnGA8YjUnQ8N6IwQD+ImU7dHl9g9Oo ROiiAbtMRBUfeJRsExX/Yzc1DV9E9QM= =ZGcs -----END PGP SIGNATURE----- Merge tag 'mm-hotfixes-stable-2023-12-27-15-00' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Pull misc fixes from Andrew Morton: "11 hotfixes. 7 are cc:stable and the other 4 address post-6.6 issues or are not considered backporting material" * tag 'mm-hotfixes-stable-2023-12-27-15-00' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: mailmap: add an old address for Naoya Horiguchi mm/memory-failure: cast index to loff_t before shifting it mm/memory-failure: check the mapcount of the precise page mm/memory-failure: pass the folio and the page to collect_procs() selftests: secretmem: floor the memory size to the multiple of page_size mm: migrate high-order folios in swap cache correctly maple_tree: do not preallocate nodes for slot stores mm/filemap: avoid buffered read/write race to read inconsistent data kunit: kasan_test: disable fortify string checker on kmalloc_oob_memset kexec: select CRYPTO from KEXEC_FILE instead of depending on it kexec: fix KEXEC_FILE dependencies |
||
|
|
3f82f1c3a0 |
Misc fixes:
- Fix a secondary CPUs enumeration regression caused by creative
MADT APIC table entries on certain systems.
- Fix a race in the NOP-patcher that can spuriously trigger crashes
on bootup.
- Fix a bootup failure regression caused by the parallel bringup
code, caused by firmware inconsistency between the APIC
initialization states of the boot and secondary CPUs, on certain
systems.
Signed-off-by: Ingo Molnar <mingo@kernel.org>
-----BEGIN PGP SIGNATURE-----
iQJFBAABCgAvFiEEBpT5eoXrXCwVQwEKEnMQ0APhK1gFAmWG704RHG1pbmdvQGtl
cm5lbC5vcmcACgkQEnMQ0APhK1j52w/+IfvoA9zX6+qsl0lIU76FsD50szR7C8IR
4DsN68BbjI6sUNN/RWrjTY249UCbl8ykL8ozaek+eBaCXcTXIoGh96oPnojY/8vA
HTYCgilmCvwNozawkjAeWedwbxdcafTr7l1bad5Y6LyYJv7BNUnsELq67bIQoOwK
QGtnjDnTojVBoGy920KgZZ0zkqM05zWQ24MvM0Pn8T9lC2TokqCrbY4T8ags1AW0
j+swsd34ON+jKM0tXqQdBgxbb4TRW05Yg2PC9hLMxtXRXsYFfXW8udS0E83JpWD8
XL4z34IQqRuuiCqVboeBsP2gta4/oEM+ctbTXbmYs/ca7euZ7mnrUDcDVG2qyKyP
F4hGgySyRv5wHx1IdTDIjozhsK+oYPLpqj87bBNr5N/aZImlKYTkICFt6EFhxrZn
aoQCdE3ba4EoQqes05aVAvNABAt2ml1UNMyz4uu83diJG7aszVY0WOz4laX4edof
L4EBpl2RrWqWwr7CnkrMhHfAcYVKv6k50/MK4ZrQrm1Xqgw6fbtvyxp72iw+ktJk
XXnlDU6Ngc+F4dBXNY1Wr1q5kB4eYWy9DgmpHC56H6QPWTCV/Vk9wayeSXBrpoXa
w5E91MKj9Oo68QbtmczbvlTmjKgWsWKGNvHHFF2ItmF2Tc/8BKFazNNZ+wzjsRl+
oJnjZILBHfI=
=V+LX
-----END PGP SIGNATURE-----
Merge tag 'x86-urgent-2023-12-23' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 fixes from Ingo Molnar:
- Fix a secondary CPUs enumeration regression caused by creative MADT
APIC table entries on certain systems.
- Fix a race in the NOP-patcher that can spuriously trigger crashes on
bootup.
- Fix a bootup failure regression caused by the parallel bringup code,
caused by firmware inconsistency between the APIC initialization
states of the boot and secondary CPUs, on certain systems.
* tag 'x86-urgent-2023-12-23' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/acpi: Handle bogus MADT APIC tables gracefully
x86/alternatives: Disable interrupts and sync when optimizing NOPs in place
x86/alternatives: Sync core before enabling interrupts
x86/smpboot/64: Handle X2APIC BIOS inconsistency gracefully
|
||
|
|
867583b399 |
RISC-V
- Fix a race condition in updating external interrupt for
trap-n-emulated IMSIC swfile
- Fix print_reg defaults in get-reg-list selftest
ARM:
- Ensure a vCPU's redistributor is unregistered from the MMIO bus
if vCPU creation fails
- Fix building KVM selftests for arm64 from the top-level Makefile
x86:
- Fix breakage for SEV-ES guests that use XSAVES.
Selftests:
- Fix bad use of strcat(), by not using strcat() at all
-----BEGIN PGP SIGNATURE-----
iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAmWGFv0UHHBib256aW5p
QHJlZGhhdC5jb20ACgkQv/vSX3jHroPczAf/e6AgAnyPG1UItZqpLD+JDURcVaV1
QyP3kc240e9dEjEkGidQ8vyekgAU9nGt2rFNPaU+5Y1E5Ky+SpZbbIzgS1cZypxT
J1lsrVhZgNdCKEVRdrUMIzhkUEk0Kjd7OsFMQ9F6OuITSv/HCgZ1g6KobgBzUGCR
0vcYqM74VnZiGGd5A4w8qP2F0FmF/7tf9k6iKWoYu6UpFe9z50jpIRq6dynrOHOc
fmwsptmGzjgzuLK9sZTXYETOQvcpmXLqSZ65k1LQG224J5AYjS08Y5XLo1QS4rpV
/g8QAgi+9ChGSzC47fqr/solAsoz/NzALPqydy+FH4u+O/O4SG5I4V8OmA==
=4/NU
-----END PGP SIGNATURE-----
Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Pull kvm fixes from Paolo Bonzini:
"RISC-V:
- Fix a race condition in updating external interrupt for
trap-n-emulated IMSIC swfile
- Fix print_reg defaults in get-reg-list selftest
ARM:
- Ensure a vCPU's redistributor is unregistered from the MMIO bus if
vCPU creation fails
- Fix building KVM selftests for arm64 from the top-level Makefile
x86:
- Fix breakage for SEV-ES guests that use XSAVES
Selftests:
- Fix bad use of strcat(), by not using strcat() at all"
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
KVM: SEV: Do not intercept accesses to MSR_IA32_XSS for SEV-ES guests
KVM: selftests: Fix dynamic generation of configuration names
RISCV: KVM: update external interrupt atomically for IMSIC swfile
KVM: riscv: selftests: Fix get-reg-list print_reg defaults
KVM: selftests: Ensure sysreg-defs.h is generated at the expected path
KVM: Convert comment into an assertion in kvm_io_bus_register_dev()
KVM: arm64: vgic: Ensure that slots_lock is held in vgic_register_all_redist_iodevs()
KVM: arm64: vgic: Force vcpu vgic teardown on vcpu destroy
KVM: arm64: vgic: Add a non-locking primitive for kvm_vgic_vcpu_destroy()
KVM: arm64: vgic: Simplify kvm_vgic_destroy()
|
||
|
|
ef5b28372c |
KVM/riscv fixes for 6.7, take #1
- Fix a race condition in updating external interrupt for trap-n-emulated IMSIC swfile - Fix print_reg defaults in get-reg-list selftest -----BEGIN PGP SIGNATURE----- iQIzBAABCgAdFiEEZdn75s5e6LHDQ+f/rUjsVaLHLAcFAmV8h2AACgkQrUjsVaLH LAeVKQ//TXVUf7I8y71MO7dtNu31oU1ud9HzD02iRecsOK/207OpGBrzPrOLO3sP Ttk0TJCk1o+BU8kS5FQGV2tiIAb3SAF4utVDNPUOzfAd4Leg3CBvkH6aQSDCmNyk 91ZdJVmlt7LHKb9UlMIwldKDb2z51HtPotZOVVbhqtmoiWKCR1BBlgtTjQ8+P1ME PQorbjnRYVmjqpK/IbhXboQBiFZssnFdVuV0zftncBZECyFAI6TfZR/FTGFcRN6u VnW5cvyWET8DGhcpqt1BuQl8dPomDV32+HtKszyxkvUTrc1RqfFmoP02adppSU6r /8h5WcylzF66swDpm59ibH9c2k1f4T6fW1YQrWL4JwmL0IX5/u3ZrdbNcBtoD2fd 81VDiJe+jjklH5EsTm1gwgNaw+qS2SNtXuQ+Phe9pSsuwtD4LL94ZcAY29dCEuPk GR1DKxgVR+nfwtUC3OaShjDLkto90kcpjCaiHVIXL8ZU4pMEvEbcBc7B6rxwDT98 udziEoH42UPvf/0B2AcDJZHwPEvvY9uI/dsrEN0PZa5rhRL9Z7nBE673iRDgcp7e lad25lss+BbpOsGzmztEQG83rtwYTCLN5ah1eWGOzSq5xV5j+XrnK1qvWsF+OrGl q8CEzc7sUZ/9EVxWCAhCPV8WqMSVIsfuSmS+pt7Mo9G+oVF58uM= =0+bX -----END PGP SIGNATURE----- Merge tag 'kvm-riscv-fixes-6.7-1' of https://github.com/kvm-riscv/linux into kvm-master KVM/riscv fixes for 6.7, take #1 - Fix a race condition in updating external interrupt for trap-n-emulated IMSIC swfile - Fix print_reg defaults in get-reg-list selftest |
||
|
|
b7bc7bce88 |
xen: branch for v6.7-rc7
-----BEGIN PGP SIGNATURE----- iHUEABYIAB0WIQRTLbB6QfY48x44uB6AXGG7T9hjvgUCZYUtLwAKCRCAXGG7T9hj vvAkAQDJED+V3e3adOpkt1xjLyftY2kzHbX9i0koIg7ivbBYEwD/c3yfPz2L8SXK IgU0LgxIMgGnjMRrBOuW9kY5FtDGPQQ= =EUSt -----END PGP SIGNATURE----- Merge tag 'for-linus-6.7a-rc7-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip Pull xen fix from Juergen Gross: "A single patch fixing a build issue for x86 32-bit configurations with CONFIG_XEN, which was introduced in the 6.7 development cycle" * tag 'for-linus-6.7a-rc7-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip: x86/xen: add CPU dependencies for 32-bit build |
||
|
|
93cd059764 |
x86/xen: add CPU dependencies for 32-bit build
Xen only supports modern CPUs even when running a 32-bit kernel, and it now
requires a kernel built for a 64 byte (or larger) cache line:
In file included from <command-line>:
In function 'xen_vcpu_setup',
inlined from 'xen_vcpu_setup_restore' at arch/x86/xen/enlighten.c:111:3,
inlined from 'xen_vcpu_restore' at arch/x86/xen/enlighten.c:141:3:
include/linux/compiler_types.h:435:45: error: call to '__compiletime_assert_287' declared with attribute error: BUILD_BUG_ON failed: sizeof(*vcpup) > SMP_CACHE_BYTES
arch/x86/xen/enlighten.c:166:9: note: in expansion of macro 'BUILD_BUG_ON'
166 | BUILD_BUG_ON(sizeof(*vcpup) > SMP_CACHE_BYTES);
| ^~~~~~~~~~~~
Enforce the dependency with a whitelist of CPU configurations. In normal
distro kernels, CONFIG_X86_GENERIC is enabled, and this works fine. When this
is not set, still allow Xen to be built on kernels that target a 64-bit
capable CPU.
Fixes:
|
||
|
|
a4aebe9365 |
posix-timers: Get rid of [COMPAT_]SYS_NI() uses
Only the posix timer system calls use this (when the posix timer support is disabled, which does not actually happen in any normal case), because they had debug code to print out a warning about missing system calls. Get rid of that special case, and just use the standard COND_SYSCALL interface that creates weak system call stubs that return -ENOSYS for when the system call does not exist. This fixes a kCFI issue with the SYS_NI() hackery: CFI failure at int80_emulation+0x67/0xb0 (target: sys_ni_posix_timers+0x0/0x70; expected type: 0xb02b34d9) WARNING: CPU: 0 PID: 48 at int80_emulation+0x67/0xb0 Reported-by: kernel test robot <oliver.sang@intel.com> Reviewed-by: Sami Tolvanen <samitolvanen@google.com> Tested-by: Sami Tolvanen <samitolvanen@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Borislav Petkov <bp@alien8.de> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> |
||
|
|
c1ad12ee0e |
kexec: fix KEXEC_FILE dependencies
The cleanup for the CONFIG_KEXEC Kconfig logic accidentally changed the 'depends on CRYPTO=y' dependency to a plain 'depends on CRYPTO', which causes a link failure when all the crypto support is in a loadable module and kexec_file support is built-in: x86_64-linux-ld: vmlinux.o: in function `__x64_sys_kexec_file_load': (.text+0x32e30a): undefined reference to `crypto_alloc_shash' x86_64-linux-ld: (.text+0x32e58e): undefined reference to `crypto_shash_update' x86_64-linux-ld: (.text+0x32e6ee): undefined reference to `crypto_shash_final' Both s390 and x86 have this problem, while ppc64 and riscv have the correct dependency already. On riscv, the dependency is only used for the purgatory, not for the kexec_file code itself, which may be a bit surprising as it means that with CONFIG_CRYPTO=m, it is possible to enable KEXEC_FILE but then the purgatory code is silently left out. Move this into the common Kconfig.kexec file in a way that is correct everywhere, using the dependency on CRYPTO_SHA256=y only when the purgatory code is available. This requires reversing the dependency between ARCH_SUPPORTS_KEXEC_PURGATORY and KEXEC_FILE, but the effect remains the same, other than making riscv behave like the other ones. On s390, there is an additional dependency on CRYPTO_SHA256_S390, which should technically not be required but gives better performance. Remove this dependency here, noting that it was not present in the initial Kconfig code but was brought in without an explanation in commit |
||
|
|
d5a10b976e |
x86/acpi: Handle bogus MADT APIC tables gracefully
The recent fix to ignore invalid x2APIC entries inadvertently broke
systems with creative MADT APIC tables. The affected systems have APIC
MADT tables where all entries have invalid APIC IDs (0xFF), which means
they register exactly zero CPUs.
But the condition to ignore the entries of APIC IDs < 255 in the X2APIC
MADT table is solely based on the count of MADT APIC table entries.
As a consequence, the affected machines enumerate no secondary CPUs at
all because the APIC table has entries and therefore the X2APIC table
entries with APIC IDs < 255 are ignored.
Change the condition so that the APIC table preference for APIC IDs <
255 only becomes effective when the APIC table has valid APIC ID
entries.
IOW, an APIC table full of invalid APIC IDs is considered to be empty
which in consequence enables the X2APIC table entries with a APIC ID
< 255 and restores the expected behaviour.
Fixes:
|
||
|
|
a62aa88ba1 |
17 hotfixes. 8 are cc:stable and the other 9 pertain to post-6.6 issues.
-----BEGIN PGP SIGNATURE----- iHUEABYIAB0WIQTTMBEPP41GrTpTJgfdBJ7gKXxAjgUCZXxs8wAKCRDdBJ7gKXxA junbAQCdItfHHinkWziciOrb0387wW+5WZ1ohqRFW8pGYLuasQEArpKmw13bvX7z e+ec9K1Ek9MlIsO2RwORR4KHH4MAbwA= =YpZh -----END PGP SIGNATURE----- Merge tag 'mm-hotfixes-stable-2023-12-15-07-11' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Pull misc fixes from Andrew Morton: "17 hotfixes. 8 are cc:stable and the other 9 pertain to post-6.6 issues" * tag 'mm-hotfixes-stable-2023-12-15-07-11' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: mm/mglru: reclaim offlined memcgs harder mm/mglru: respect min_ttl_ms with memcgs mm/mglru: try to stop at high watermarks mm/mglru: fix underprotected page cache mm/shmem: fix race in shmem_undo_range w/THP Revert "selftests: error out if kernel header files are not yet built" crash_core: fix the check for whether crashkernel is from high memory x86, kexec: fix the wrong ifdeffery CONFIG_KEXEC sh, kexec: fix the incorrect ifdeffery and dependency of CONFIG_KEXEC mips, kexec: fix the incorrect ifdeffery and dependency of CONFIG_KEXEC m68k, kexec: fix the incorrect ifdeffery and build dependency of CONFIG_KEXEC loongarch, kexec: change dependency of object files mm/damon/core: make damon_start() waits until kdamond_fn() starts selftests/mm: cow: print ksft header before printing anything else mm: fix VMA heap bounds checking riscv: fix VMALLOC_START definition kexec: drop dependency on ARCH_SUPPORTS_KEXEC from CRASH_DUMP |
||
|
|
2dc4196138 |
x86/alternatives: Disable interrupts and sync when optimizing NOPs in place
apply_alternatives() treats alternatives with the ALT_FLAG_NOT flag set
special as it optimizes the existing NOPs in place.
Unfortunately, this happens with interrupts enabled and does not provide any
form of core synchronization.
So an interrupt hitting in the middle of the update and using the affected code
path will observe a half updated NOP and crash and burn. The following
3 NOP sequence was observed to expose this crash halfway reliably under QEMU
32bit:
0x90 0x90 0x90
which is replaced by the optimized 3 byte NOP:
0x8d 0x76 0x00
So an interrupt can observe:
1) 0x90 0x90 0x90 nop nop nop
2) 0x8d 0x90 0x90 undefined
3) 0x8d 0x76 0x90 lea -0x70(%esi),%esi
4) 0x8d 0x76 0x00 lea 0x0(%esi),%esi
Where only #1 and #4 are true NOPs. The same problem exists for 64bit obviously.
Disable interrupts around this NOP optimization and invoke sync_core()
before re-enabling them.
Fixes:
|
||
|
|
3ea1704a92 |
x86/alternatives: Sync core before enabling interrupts
text_poke_early() does:
local_irq_save(flags);
memcpy(addr, opcode, len);
local_irq_restore(flags);
sync_core();
That's not really correct because the synchronization should happen before
interrupts are re-enabled to ensure that a pending interrupt observes the
complete update of the opcodes.
It's not entirely clear whether the interrupt entry provides enough
serialization already, but moving the sync_core() invocation into interrupt
disabled region does no harm and is obviously correct.
Fixes:
|
||
|
|
69a7386c1e |
x86/smpboot/64: Handle X2APIC BIOS inconsistency gracefully
Chris reported that a Dell PowerEdge T340 system stopped to boot when upgrading
to a kernel which contains the parallel hotplug changes. Disabling parallel
hotplug on the kernel command line makes it boot again.
It turns out that the Dell BIOS has x2APIC enabled and the boot CPU comes up in
X2APIC mode, but the APs come up inconsistently in xAPIC mode.
Parallel hotplug requires that the upcoming CPU reads out its APIC ID from the
local APIC in order to map it to the Linux CPU number.
In this particular case the readout on the APs uses the MMIO mapped registers
because the BIOS failed to enable x2APIC mode. That readout results in a page
fault because the kernel does not have the APIC MMIO space mapped when X2APIC
mode was enabled by the BIOS on the boot CPU and the kernel switched to X2APIC
mode early. That page fault can't be handled on the upcoming CPU that early and
results in a silent boot failure.
If parallel hotplug is disabled the system boots because in that case the APIC
ID read is not required as the Linux CPU number is provided to the AP in the
smpboot control word. When the kernel uses x2APIC mode then the APs are
switched to x2APIC mode too slightly later in the bringup process, but there is
no reason to do it that late.
Cure the BIOS bogosity by checking in the parallel bootup path whether the
kernel uses x2APIC mode and if so switching over the APs to x2APIC mode before
the APIC ID readout.
Fixes:
|
||
|
|
a26b7cd225 |
KVM: SEV: Do not intercept accesses to MSR_IA32_XSS for SEV-ES guests
When intercepts are enabled for MSR_IA32_XSS, the host will swap in/out
the guest-defined values while context-switching to/from guest mode.
However, in the case of SEV-ES, vcpu->arch.guest_state_protected is set,
so the guest-defined value is effectively ignored when switching to
guest mode with the understanding that the VMSA will handle swapping
in/out this register state.
However, SVM is still configured to intercept these accesses for SEV-ES
guests, so the values in the initial MSR_IA32_XSS are effectively
read-only, and a guest will experience undefined behavior if it actually
tries to write to this MSR. Fortunately, only CET/shadowstack makes use
of this register on SEV-ES-capable systems currently, which isn't yet
widely used, but this may become more of an issue in the future.
Additionally, enabling intercepts of MSR_IA32_XSS results in #VC
exceptions in the guest in certain paths that can lead to unexpected #VC
nesting levels. One example is SEV-SNP guests when handling #VC
exceptions for CPUID instructions involving leaf 0xD, subleaf 0x1, since
they will access MSR_IA32_XSS as part of servicing the CPUID #VC, then
generate another #VC when accessing MSR_IA32_XSS, which can lead to
guest crashes if an NMI occurs at that point in time. Running perf on a
guest while it is issuing such a sequence is one example where these can
be problematic.
Address this by disabling intercepts of MSR_IA32_XSS for SEV-ES guests
if the host/guest configuration allows it. If the host/guest
configuration doesn't allow for MSR_IA32_XSS, leave it intercepted so
that it can be caught by the existing checks in
kvm_{set,get}_msr_common() if the guest still attempts to access it.
Fixes:
|
||
|
|
69f8ca8d36 |
x86, kexec: fix the wrong ifdeffery CONFIG_KEXEC
With the current ifdeffery CONFIG_KEXEC, get_cmdline_acpi_rsdp() is only available when kexec_load interface is taken, while kexec_file_load interface can't make use of it. Now change it to CONFIG_KEXEC_CORE. Link: https://lkml.kernel.org/r/20231208073036.7884-6-bhe@redhat.com Signed-off-by: Baoquan He <bhe@redhat.com> Cc: Eric DeVolder <eric_devolder@yahoo.com> Cc: Ignat Korchagin <ignat@cloudflare.com> Cc: kernel test robot <lkp@intel.com> Cc: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> |
||
|
|
5412fed784 |
- Add a forgotten CPU vendor check in the AMD microcode post-loading
callback so that the callback runs only on AMD - Make sure SEV-ES protocol negotiation happens only once and on the BSP -----BEGIN PGP SIGNATURE----- iQIzBAABCgAdFiEEzv7L6UO9uDPlPSfHEsHwGGHeVUoFAmV1l4MACgkQEsHwGGHe VUoESQ//WxIdMb/ATNtBRXEM3GRzd/X2he7kXRGVSUeJZMxEOnpVEmo4NtAzXwI7 E2yOk4EJ6QlN+p6/eea5X2FJ5IkpiDKj1jVvU6NxTJiF2I5t1XUjdcXiv05Is+bP 5Cx1ysaznMQYvZUler15NkbLCZvVX5/1V9FthnLpS1d1cvem0NKoH+hVUQHkdHbo /QNzEy+dyii8mp+t6D4QEfjq2Ab7bRbx1FufJcvHy9tFt3u8JsuWI/kHLm5Jk3VN n7Efynluw1wJC3l/18QzrxUCL5LVwrs2LxC0GH42dq9vBp0EJclgsBOh+43cm+Ey ffowP3uJxayngdigRA4ANpfbJL7GHjEtoYlxk/ooI4l2ccNtSdCXn86UVI9q3BiT 60j/qV9r/o0Wl46/BVrT6SXhHlcKm4YchUa53NbOHcU3hBpUuM78WAcqs+63voMq kFqXQVihlJWYYytyFKw6Ng7O6SHR/SMC3uSygG0q2KebiUlVMZSheJin2Od+shhC skD0z2EUzUGuNICFHuxwA2uu6jr/KvkEwDfjczwdE2haUfZLCnOfTjxZmqBbuzJ6 l7lfGjBCpGGtGettjrM8WAuHnL6MzNDI2jVDV3K6bzqQVzk9wjFQjIG9X7c8XjN/ E2IyQxNuiIKVj3GfFOVM9lgJ4C8qwNDa2A46Kad2dKHioiFfMuM= =uIo2 -----END PGP SIGNATURE----- Merge tag 'x86_urgent_for_v6.7_rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 fixes from Borislav Petkov: - Add a forgotten CPU vendor check in the AMD microcode post-loading callback so that the callback runs only on AMD - Make sure SEV-ES protocol negotiation happens only once and on the BSP * tag 'x86_urgent_for_v6.7_rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/CPU/AMD: Check vendor in the AMD microcode callback x86/sev: Fix kernel crash due to late update to read-only ghcb_version |
||
|
|
0aea22c7ab |
Generic:
* Set .owner for various KVM file_operations so that files refcount the
KVM module until KVM is done executing _all_ code, including the last
few instructions of kvm_put_kvm(). And then revert the misguided
attempt to rely on "struct kvm" refcounts to pin KVM-the-module.
ARM:
* Do not redo the mapping of vLPIs, if they have already been mapped
s390:
* Do not leave bits behind in PTEs
* Properly catch page invalidations that affect the prefix of a nested
guest
x86:
* When checking if a _running_ vCPU is "in-kernel", i.e. running at CPL0,
get the CPL directly instead of relying on preempted_in_kernel (which
is valid if and only if the vCPU was preempted, i.e. NOT running).
* Fix a benign "return void" that was recently introduced.
Selftests:
* Makefile tweak for dependency generation
* -Wformat fix
-----BEGIN PGP SIGNATURE-----
iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAmVzhWMUHHBib256aW5p
QHJlZGhhdC5jb20ACgkQv/vSX3jHroMd3QgArPrc1tHKb5RAv3/OGxUhgmmKcYFz
AERbxrzcEv25MjCy6wfb+WzOTud/7KM113oVwHpkWbPaOdR/NteCeNrypCOiFnxB
BzKOdm23A8UmjgFlwYpIlukmn51QT2iMWfIQDetDboRpRPlIZ6AOQpIHph99K0J1
xMlXKbd4eaeIS+FoqJGD2wlKDkMZmrQD1SjndLp7xtPjRBZQzVTcMg/fKJNcDLAg
xIfecCeeU4/lJ0fQlnnO5a4L3gCzJnSGLKYvLgBkX4sC1yFr959WAnqSn8mb+1wT
jQtsmCg7+GCuNF5mn4nF/YnnuD7+xgbpGKVGlHhKCmmvmbi74NVm5SBNgA==
=aZLl
-----END PGP SIGNATURE-----
Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Pull kvm fixes from Paolo Bonzini:
"Generic:
- Set .owner for various KVM file_operations so that files refcount
the KVM module until KVM is done executing _all_ code, including
the last few instructions of kvm_put_kvm(). And then revert the
misguided attempt to rely on "struct kvm" refcounts to pin
KVM-the-module.
ARM:
- Do not redo the mapping of vLPIs, if they have already been mapped
s390:
- Do not leave bits behind in PTEs
- Properly catch page invalidations that affect the prefix of a
nested guest
x86:
- When checking if a _running_ vCPU is "in-kernel", i.e. running at
CPL0, get the CPL directly instead of relying on
preempted_in_kernel (which is valid if and only if the vCPU was
preempted, i.e. NOT running).
- Fix a benign "return void" that was recently introduced.
Selftests:
- Makefile tweak for dependency generation
- '-Wformat' fix"
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
KVM: SVM: Update EFER software model on CR0 trap for SEV-ES
KVM: selftests: add -MP to CFLAGS
KVM: selftests: Actually print out magic token in NX hugepages skip message
KVM: x86: Remove 'return void' expression for 'void function'
Revert "KVM: Prevent module exit until all VMs are freed"
KVM: Set file_operations.owner appropriately for all such structures
KVM: x86: Get CPL directly when checking if loaded vCPU is in kernel mode
KVM: arm64: GICv4: Do not perform a map to a mapped vLPI
KVM: s390/mm: Properly reset no-dat
KVM: s390: vsie: fix wrong VIR 37 when MSO is used
|
||
|
|
4cdf351d36 |
KVM: SVM: Update EFER software model on CR0 trap for SEV-ES
In general, activating long mode involves setting the EFER_LME bit in the EFER register and then enabling the X86_CR0_PG bit in the CR0 register. At this point, the EFER_LMA bit will be set automatically by hardware. In the case of SVM/SEV guests where writes to CR0 are intercepted, it's necessary for the host to set EFER_LMA on behalf of the guest since hardware does not see the actual CR0 write. In the case of SEV-ES guests where writes to CR0 are trapped instead of intercepted, the hardware *does* see/record the write to CR0 before exiting and passing the value on to the host, so as part of enabling SEV-ES support commit |
||
|
|
6254eebad4 |
KVM fixes for 6.7-rcN:
- When checking if a _running_ vCPU is "in-kernel", i.e. running at CPL0,
get the CPL directly instead of relying on preempted_in_kernel, which
is valid if and only if the vCPU was preempted, i.e. NOT running.
- Set .owner for various KVM file_operations so that files refcount the
KVM module until KVM is done executing _all_ code, including the last
few instructions of kvm_put_kvm(). And then revert the misguided
attempt to rely on "struct kvm" refcounts to pin KVM-the-module.
- Fix a benign "return void" that was recently introduced.
-----BEGIN PGP SIGNATURE-----
iQJGBAABCgAwFiEEMHr+pfEFOIzK+KY1YJEiAU0MEvkFAmVyeV0SHHNlYW5qY0Bn
b29nbGUuY29tAAoJEGCRIgFNDBL5aJIP/izKZivi/kZjuuKp2c1W2XM+mBZlM+Yj
qYcdV0rZygQJOZXTpMaEVg7iUtvbwAT495nm8sXr/IxXw+omMcP+qyLRCZ6JafYy
B19buCnAt2DymlJOurFzlIEeWtunkxk/gFLMB/BnSrok88cKz5PMxAVFPPBXsTms
ZqSFlDhzG0G4Mxhr8t0elyjd4HrCbNjCn1MhJg+uzFHKakfOvbST5jO02LkTeIM2
VFrqWZo1C6uPDrA8TzWzik54qOrDFrodNv/XvIJ0szgVOc+7Iwxy80A/v7o7jBET
igH+6F3cbST5uoKFrFn7pPJdTOfX2u18DXcpxiYIu+24ToKyqdE1Np9M5W4ZMX/9
Im5ilykfylHpRYAL4tECD6Jzd/Q/xIvpe8Uk6HTfFAtb/UdMY35/1keBnnkI2oj8
/4USM7AHNiqoAs4+OE4kZslrFG8ttv3vIOr7Mtk2UjGyGp8TH8sRFYPJKToXsQIJ
Gs96rsbiU+oo/IDp3UiRhWtwpwfKGbkDLp4r/3X6UOx6Re5u1ITVIoM14qFQaw3W
CKdHKN/MoreYLS5gasjaGRSyQNJPonaS10l8SqzWflrUBZYfyjCNKliihjKood2g
JykH4p69IFTWADT2VbrGCQVKfY1GJCxGwpGePFmChTsPUiQ2P+AbHCmaefnIRRbK
8UR/OmsDtFRZ
=gWp0
-----END PGP SIGNATURE-----
Merge tag 'kvm-x86-fixes-6.7-rcN' of https://github.com/kvm-x86/linux into kvm-master
KVM fixes for 6.7-rcN:
- When checking if a _running_ vCPU is "in-kernel", i.e. running at CPL0,
get the CPL directly instead of relying on preempted_in_kernel, which
is valid if and only if the vCPU was preempted, i.e. NOT running.
- Set .owner for various KVM file_operations so that files refcount the
KVM module until KVM is done executing _all_ code, including the last
few instructions of kvm_put_kvm(). And then revert the misguided
attempt to rely on "struct kvm" refcounts to pin KVM-the-module.
- Fix a benign "return void" that was recently introduced.
|
||
|
|
5e3f5b81de |
Including fixes from bpf and netfilter.
Current release - regressions:
- veth: fix packet segmentation in veth_convert_skb_to_xdp_buff
Current release - new code bugs:
- tcp: assorted fixes to the new Auth Option support
Older releases - regressions:
- tcp: fix mid stream window clamp
- tls: fix incorrect splice handling
- ipv4: ip_gre: handle skb_pull() failure in ipgre_xmit()
- dsa: mv88e6xxx: restore USXGMII support for 6393X
- arcnet: restore support for multiple Sohard Arcnet cards
Older releases - always broken:
- tcp: do not accept ACK of bytes we never sent
- require admin privileges to receive packet traces via netlink
- packet: move reference count in packet_sock to atomic_long_t
- bpf:
- fix incorrect branch offset comparison with cpu=v4
- fix prog_array_map_poke_run map poke update
- netfilter:
- 3 fixes for crashes on bad admin commands
- xt_owner: fix race accessing sk->sk_socket, TOCTOU null-deref
- nf_tables: fix 'exist' matching on bigendian arches
- leds: netdev: fix RTNL handling to prevent potential deadlock
- eth: tg3: prevent races in error/reset handling
- eth: r8169: fix rtl8125b PAUSE storm when suspended
- eth: r8152: improve reset and surprise removal handling
- eth: hns: fix race between changing features and sending
- eth: nfp: fix sleep in atomic for bonding offload
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-----BEGIN PGP SIGNATURE-----
iQIzBAABCAAdFiEE6jPA+I1ugmIBA4hXMUZtbf5SIrsFAmVyGxsACgkQMUZtbf5S
IrvziA//XZQLEQ3OsZnnYuuGkH0lPnY6ABaK/hcjCHnk9xs8SfIKPVYpq1LaShEp
TY6mBhLMIANbdNO+yPzaszWVTkBPyb0w8JNy43bhLhOL3m/6FS6qwsgN8SAL2qVv
8rnDF9Gsb4yU27aMZ6+2m92WiuyPptf4HrWU2ISSv/oCYH9TWsPUrTwt+QuVUboN
eSbvMzgIAkFIQVSbhMuinR9bOzAypSJPi18m1kkID5NsNUP/OToxPE7IFDEVS/oo
f4P7Ru6g1Gw9pAJmVXy5c0528Hy2P4Pyyw3LD5i2FWZ7rhYJRADOC4EMs9lINzrn
uscNUyztldaMHkKcZRqKbaXsnA3MPvuf3qycRH0wyHa1+OjL9N4A9P077FugtBln
UlmgVokfONVlxRgwy7AqapQbZ30QmnUEOvWjFWV3dsCBS3ziq1h7ujCTaQkl6R/6
i96xuiUPMrAnxAlbFOjoF8NeGvcvwujYCqs/q5JC43f+xZRGf52Pwf5U/AliOFym
aBX1mF/mdMLjYIBlGwFABiybACRPMceT2RuCfvhfIdQiM01OHlydO933jS+R3I4O
cB03ppK0QiNo5W4RlMqDGuXfVnBJ36pv/2tY8IUOZGXSR+jSQOxZHrhYrtzMM5F8
sWjpEIrfzdtuz0ssEg9wwGBTffEf07uZyPttov3Pm+VnDrsmCMU=
=bkyC
-----END PGP SIGNATURE-----
Merge tag 'net-6.7-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Pull networking fixes from Jakub Kicinski:
"Including fixes from bpf and netfilter.
Current release - regressions:
- veth: fix packet segmentation in veth_convert_skb_to_xdp_buff
Current release - new code bugs:
- tcp: assorted fixes to the new Auth Option support
Older releases - regressions:
- tcp: fix mid stream window clamp
- tls: fix incorrect splice handling
- ipv4: ip_gre: handle skb_pull() failure in ipgre_xmit()
- dsa: mv88e6xxx: restore USXGMII support for 6393X
- arcnet: restore support for multiple Sohard Arcnet cards
Older releases - always broken:
- tcp: do not accept ACK of bytes we never sent
- require admin privileges to receive packet traces via netlink
- packet: move reference count in packet_sock to atomic_long_t
- bpf:
- fix incorrect branch offset comparison with cpu=v4
- fix prog_array_map_poke_run map poke update
- netfilter:
- three fixes for crashes on bad admin commands
- xt_owner: fix race accessing sk->sk_socket, TOCTOU null-deref
- nf_tables: fix 'exist' matching on bigendian arches
- leds: netdev: fix RTNL handling to prevent potential deadlock
- eth: tg3: prevent races in error/reset handling
- eth: r8169: fix rtl8125b PAUSE storm when suspended
- eth: r8152: improve reset and surprise removal handling
- eth: hns: fix race between changing features and sending
- eth: nfp: fix sleep in atomic for bonding offload"
* tag 'net-6.7-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (62 commits)
vsock/virtio: fix "comparison of distinct pointer types lacks a cast" warning
net/smc: fix missing byte order conversion in CLC handshake
net: dsa: microchip: provide a list of valid protocols for xmit handler
drop_monitor: Require 'CAP_SYS_ADMIN' when joining "events" group
psample: Require 'CAP_NET_ADMIN' when joining "packets" group
bpf: sockmap, updating the sg structure should also update curr
net: tls, update curr on splice as well
nfp: flower: fix for take a mutex lock in soft irq context and rcu lock
net: dsa: mv88e6xxx: Restore USXGMII support for 6393X
tcp: do not accept ACK of bytes we never sent
selftests/bpf: Add test for early update in prog_array_map_poke_run
bpf: Fix prog_array_map_poke_run map poke update
netfilter: xt_owner: Fix for unsafe access of sk->sk_socket
netfilter: nf_tables: validate family when identifying table via handle
netfilter: nf_tables: bail out on mismatching dynset and set expressions
netfilter: nf_tables: fix 'exist' matching on bigendian arches
netfilter: nft_set_pipapo: skip inactive elements during set walk
netfilter: bpf: fix bad registration on nf_defrag
leds: trigger: netdev: fix RTNL handling to prevent potential deadlock
octeontx2-af: Update Tx link register range
...
|
||
|
|
f4116bfc44 |
x86/tdx: Allow 32-bit emulation by default
32-bit emulation was disabled on TDX to prevent a possible attack by a VMM injecting an interrupt on vector 0x80. Now that int80_emulation() has a check for external interrupts the limitation can be lifted. To distinguish software interrupts from external ones, int80_emulation() checks the APIC ISR bit relevant to the 0x80 vector. For software interrupts, this bit will be 0. On TDX, the VAPIC state (including ISR) is protected and cannot be manipulated by the VMM. The ISR bit is set by the microcode flow during the handling of posted interrupts. [ dhansen: more changelog tweaks ] Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Borislav Petkov (AMD) <bp@alien8.de> Cc: <stable@vger.kernel.org> # v6.0+ |
||
|
|
55617fb991 |
x86/entry: Do not allow external 0x80 interrupts
The INT 0x80 instruction is used for 32-bit x86 Linux syscalls. The kernel expects to receive a software interrupt as a result of the INT 0x80 instruction. However, an external interrupt on the same vector also triggers the same codepath. An external interrupt on vector 0x80 will currently be interpreted as a 32-bit system call, and assuming that it was a user context. Panic on external interrupts on the vector. To distinguish software interrupts from external ones, the kernel checks the APIC ISR bit relevant to the 0x80 vector. For software interrupts, this bit will be 0. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Reviewed-by: Borislav Petkov (AMD) <bp@alien8.de> Cc: <stable@vger.kernel.org> # v6.0+ |
||
|
|
be5341eb0d |
x86/entry: Convert INT 0x80 emulation to IDTENTRY
There is no real reason to have a separate ASM entry point implementation for the legacy INT 0x80 syscall emulation on 64-bit. IDTENTRY provides all the functionality needed with the only difference that it does not: - save the syscall number (AX) into pt_regs::orig_ax - set pt_regs::ax to -ENOSYS Both can be done safely in the C code of an IDTENTRY before invoking any of the syscall related functions which depend on this convention. Aside of ASM code reduction this prepares for detecting and handling a local APIC injected vector 0x80. [ kirill.shutemov: More verbose comments ] Suggested-by: Linus Torvalds <torvalds@linuxfoundation.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Reviewed-by: Borislav Petkov (AMD) <bp@alien8.de> Cc: <stable@vger.kernel.org> # v6.0+ |
||
|
|
b82a8dbd3d |
x86/coco: Disable 32-bit emulation by default on TDX and SEV
The INT 0x80 instruction is used for 32-bit x86 Linux syscalls. The
kernel expects to receive a software interrupt as a result of the INT
0x80 instruction. However, an external interrupt on the same vector
triggers the same handler.
The kernel interprets an external interrupt on vector 0x80 as a 32-bit
system call that came from userspace.
A VMM can inject external interrupts on any arbitrary vector at any
time. This remains true even for TDX and SEV guests where the VMM is
untrusted.
Put together, this allows an untrusted VMM to trigger int80 syscall
handling at any given point. The content of the guest register file at
that moment defines what syscall is triggered and its arguments. It
opens the guest OS to manipulation from the VMM side.
Disable 32-bit emulation by default for TDX and SEV. User can override
it with the ia32_emulation=y command line option.
[ dhansen: reword the changelog ]
Reported-by: Supraja Sridhara <supraja.sridhara@inf.ethz.ch>
Reported-by: Benedict Schlüter <benedict.schlueter@inf.ethz.ch>
Reported-by: Mark Kuhne <mark.kuhne@inf.ethz.ch>
Reported-by: Andrin Bertschi <andrin.bertschi@inf.ethz.ch>
Reported-by: Shweta Shinde <shweta.shinde@inf.ethz.ch>
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Borislav Petkov (AMD) <bp@alien8.de>
Cc: <stable@vger.kernel.org> # v6.0+:
|
||
|
|
4b7de80160 |
bpf: Fix prog_array_map_poke_run map poke update
Lee pointed out issue found by syscaller [0] hitting BUG in prog array
map poke update in prog_array_map_poke_run function due to error value
returned from bpf_arch_text_poke function.
There's race window where bpf_arch_text_poke can fail due to missing
bpf program kallsym symbols, which is accounted for with check for
-EINVAL in that BUG_ON call.
The problem is that in such case we won't update the tail call jump
and cause imbalance for the next tail call update check which will
fail with -EBUSY in bpf_arch_text_poke.
I'm hitting following race during the program load:
CPU 0 CPU 1
bpf_prog_load
bpf_check
do_misc_fixups
prog_array_map_poke_track
map_update_elem
bpf_fd_array_map_update_elem
prog_array_map_poke_run
bpf_arch_text_poke returns -EINVAL
bpf_prog_kallsyms_add
After bpf_arch_text_poke (CPU 1) fails to update the tail call jump, the next
poke update fails on expected jump instruction check in bpf_arch_text_poke
with -EBUSY and triggers the BUG_ON in prog_array_map_poke_run.
Similar race exists on the program unload.
Fixing this by moving the update to bpf_arch_poke_desc_update function which
makes sure we call __bpf_arch_text_poke that skips the bpf address check.
Each architecture has slightly different approach wrt looking up bpf address
in bpf_arch_text_poke, so instead of splitting the function or adding new
'checkip' argument in previous version, it seems best to move the whole
map_poke_run update as arch specific code.
[0] https://syzkaller.appspot.com/bug?extid=97a4fe20470e9bc30810
Fixes:
|
||
|
|
deb4b9dd3b |
xen: branch for v6.7-rc4
-----BEGIN PGP SIGNATURE----- iHUEABYIAB0WIQRTLbB6QfY48x44uB6AXGG7T9hjvgUCZWraswAKCRCAXGG7T9hj vlV0AP9241p7vHlIW6PIdfNZt9/dZZpuFnKHz+cE99pTZDl5nwEAoYqXUm/kEb14 VAy7x0XIVdEt+9l4bgO2Qggx+Y184Qs= =APgm -----END PGP SIGNATURE----- Merge tag 'for-linus-6.7a-rc4-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip Pull xen fixes from Juergen Gross: - A fix for the Xen event driver setting the correct return value when experiencing an allocation failure - A fix for allocating space for a struct in the percpu area to not cross page boundaries (this one is for x86, a similar one for Arm was already in the pull request for rc3) * tag 'for-linus-6.7a-rc4-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip: xen/events: fix error code in xen_bind_pirq_msi_to_irq() x86/xen: fix percpu vcpu_info allocation |
||
|
|
9b8493dc43 |
x86/CPU/AMD: Check vendor in the AMD microcode callback
Commit in Fixes added an AMD-specific microcode callback. However, it didn't check the CPU vendor the kernel runs on explicitly. The only reason the Zenbleed check in it didn't run on other x86 vendors hardware was pure coincidental luck: if (!cpu_has_amd_erratum(c, amd_zenbleed)) return; gives true on other vendors because they don't have those families and models. However, with the removal of the cpu_has_amd_erratum() in |
||
|
|
ef8d89033c |
KVM: x86: Remove 'return void' expression for 'void function'
The requested info will be stored in 'guest_xsave->region' referenced by
the incoming pointer "struct kvm_xsave *guest_xsave", thus there is no need
to explicitly use return void expression for a void function "static void
kvm_vcpu_ioctl_x86_get_xsave(...)". The issue is caught with [-Wpedantic].
Fixes: 2d287ec65e79 ("x86/fpu: Allow caller to constrain xfeatures when copying to uabi buffer")
Signed-off-by: Like Xu <likexu@tencent.com>
Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
Link: https://lore.kernel.org/r/20231007064019.17472-1-likexu@tencent.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
|
||
|
|
087e15206d |
KVM: Set file_operations.owner appropriately for all such structures
Set .owner for all KVM-owned filed types so that the KVM module is pinned until any files with callbacks back into KVM are completely freed. Using "struct kvm" as a proxy for the module, i.e. keeping KVM-the-module alive while there are active VMs, doesn't provide full protection. Userspace can invoke delete_module() the instant the last reference to KVM is put. If KVM itself puts the last reference, e.g. via kvm_destroy_vm(), then it's possible for KVM to be preempted and deleted/unloaded before KVM fully exits, e.g. when the task running kvm_destroy_vm() is scheduled back in, it will jump to a code page that is no longer mapped. Note, file types that can call into sub-module code, e.g. kvm-intel.ko or kvm-amd.ko on x86, must use the module pointer passed to kvm_init(), not THIS_MODULE (which points at kvm.ko). KVM assumes that if /dev/kvm is reachable, e.g. VMs are active, then the vendor module is loaded. To reduce the probability of forgetting to set .owner entirely, use THIS_MODULE for stats files where KVM does not call back into vendor code. This reverts commit |
||
|
|
27d25348d4 |
x86/sev: Fix kernel crash due to late update to read-only ghcb_version
A write-access violation page fault kernel crash was observed while running
cpuhotplug LTP testcases on SEV-ES enabled systems. The crash was
observed during hotplug, after the CPU was offlined and the process
was migrated to different CPU. setup_ghcb() is called again which
tries to update ghcb_version in sev_es_negotiate_protocol(). Ideally this
is a read_only variable which is initialised during booting.
Trying to write it results in a pagefault:
BUG: unable to handle page fault for address: ffffffffba556e70
#PF: supervisor write access in kernel mode
#PF: error_code(0x0003) - permissions violation
[ ...]
Call Trace:
<TASK>
? __die_body.cold+0x1a/0x1f
? __die+0x2a/0x35
? page_fault_oops+0x10c/0x270
? setup_ghcb+0x71/0x100
? __x86_return_thunk+0x5/0x6
? search_exception_tables+0x60/0x70
? __x86_return_thunk+0x5/0x6
? fixup_exception+0x27/0x320
? kernelmode_fixup_or_oops+0xa2/0x120
? __bad_area_nosemaphore+0x16a/0x1b0
? kernel_exc_vmm_communication+0x60/0xb0
? bad_area_nosemaphore+0x16/0x20
? do_kern_addr_fault+0x7a/0x90
? exc_page_fault+0xbd/0x160
? asm_exc_page_fault+0x27/0x30
? setup_ghcb+0x71/0x100
? setup_ghcb+0xe/0x100
cpu_init_exception_handling+0x1b9/0x1f0
The fix is to call sev_es_negotiate_protocol() only in the BSP boot phase,
and it only needs to be done once in any case.
[ mingo: Refined the changelog. ]
Fixes:
|
||
|
|
547c91929f |
KVM: x86: Get CPL directly when checking if loaded vCPU is in kernel mode
When querying whether or not a vCPU "is" running in kernel mode, directly
get the CPL if the vCPU is the currently loaded vCPU. In scenarios where
a guest is profiled via perf-kvm, querying vcpu->arch.preempted_in_kernel
from kvm_guest_state() is wrong if vCPU is actively running, i.e. isn't
scheduled out due to being preempted and so preempted_in_kernel is stale.
This affects perf/core's ability to accurately tag guest RIP with
PERF_RECORD_MISC_GUEST_{KERNEL|USER} and record it in the sample. This
causes perf/tool to fail to connect the vCPU RIPs to the guest kernel
space symbols when parsing these samples due to incorrect PERF_RECORD_MISC
flags:
Before (perf-report of a cpu-cycles sample):
1.23% :58945 [unknown] [u] 0xffffffff818012e0
After:
1.35% :60703 [kernel.vmlinux] [g] asm_exc_page_fault
Note, checking preempted_in_kernel in kvm_arch_vcpu_in_kernel() is awful
as nothing in the API's suggests that it's safe to use if and only if the
vCPU was preempted. That can be cleaned up in the future, for now just
fix the glaring correctness bug.
Note #2, checking vcpu->preempted is NOT safe, as getting the CPL on VMX
requires VMREAD, i.e. is correct if and only if the vCPU is loaded. If
the target vCPU *was* preempted, then it can be scheduled back in after
the check on vcpu->preempted in kvm_vcpu_on_spin(), i.e. KVM could end up
trying to do VMREAD on a VMCS that isn't loaded on the current pCPU.
Signed-off-by: Like Xu <likexu@tencent.com>
Fixes:
|
||
|
|
db2832309a |
x86/xen: fix percpu vcpu_info allocation
Today the percpu struct vcpu_info is allocated via DEFINE_PER_CPU(),
meaning that it could cross a page boundary. In this case registering
it with the hypervisor will fail, resulting in a panic().
This can easily be fixed by using DEFINE_PER_CPU_ALIGNED() instead,
as struct vcpu_info is guaranteed to have a size of 64 bytes, matching
the cache line size of x86 64-bit processors (Xen doesn't support
32-bit processors).
Fixes:
|
||
|
|
4892711ace |
Fix/enhance x86 microcode version reporting: fix the bootup log spam,
and remove the driver version announcement to avoid version confusion when distros backport fixes. Signed-off-by: Ingo Molnar <mingo@kernel.org> -----BEGIN PGP SIGNATURE----- iQJFBAABCgAvFiEEBpT5eoXrXCwVQwEKEnMQ0APhK1gFAmVjFKgRHG1pbmdvQGtl cm5lbC5vcmcACgkQEnMQ0APhK1h/7hAAs5hL1KvrfdW0VpAW91MbX6mtIe7Emc8T LCiBJtl9UngRdASUC9CGrcIZ5JIps0702gAq0qPVzk5zKxC22ySWsqMZybask+eF d6E7amMtF+KX0wiCZSuC66StCKA08JfrUXgxvYHnxDjNqERYFmVr1QabGL1IN5lZ KUrVUyvN8VOnzypOiQ98lXGWDJYwaV7t+IzMMh7mT5OUkoo09e6tFm7IF+NWg5xe NYCcvZqyo0Ipld7HOjlGHYG+blFkDxJpfTby5UevZybXsPd00cxBzDSR9zs1sYeG Kt6cgwDhfewfcM1QFVvNV/SDVsPp1BlVvMUa6Xa3vtsnWCit8zQqMbkYYWUaTcIh yUJvtzh/xZQtxaQ8Z8SbI7EhUBOFJXoHWV9JoEe3gWsWA5thu+4iOCh/P8C2ON3n 6kLSgNQ4GAylH34MWoS84t2Jxv7XmNZljR/78ucRQrJ1JJIEA+r4sJ9hK9btxqf2 0n86StHuwtNXSQwEhDcacqUpFPLZ65Za1Y9AXc69CDuiwj3DvTVBMwjEOBbnGTrZ dL9QOYG5gkklOx4o5ePj7RoLrzz/j6dj6idmu8FxZZ4q+QB9vvL2lRHusJnUEloE yxR7WWOB/kyUZT7FriLHRuEP7yQNRSvLs6U7b8uXCHiGcAb2mN0fFi2m/BcJGS4A hHA0t9WyNBM= =eYQ4 -----END PGP SIGNATURE----- Merge tag 'x86-urgent-2023-11-26' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 microcode fixes from Ingo Molnar: "Fix/enhance x86 microcode version reporting: fix the bootup log spam, and remove the driver version announcement to avoid version confusion when distros backport fixes" * tag 'x86-urgent-2023-11-26' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/microcode: Rework early revisions reporting x86/microcode: Remove the driver announcement and version |
||
|
|
e81fe50520 |
Fix a bug in the Intel hybrid CPUs hardware-capabilities enumeration
code resulting in non-working events on those platforms. Signed-off-by: Ingo Molnar <mingo@kernel.org> -----BEGIN PGP SIGNATURE----- iQJFBAABCgAvFiEEBpT5eoXrXCwVQwEKEnMQ0APhK1gFAmVjEt8RHG1pbmdvQGtl cm5lbC5vcmcACgkQEnMQ0APhK1g54g/9EKugplQhVSfEXhzfBx38ICyodLxJuomL x96auVes+H3CEC6CYADzAY14GPDxymzOVSK/Sy8fWkzyGVWG23SZgB0i/hOTIsFL CVZLrVztidvkWWzJj5RnzNstZsUXp4QfNtcIw3VvCcsgDhk8WVen1xe5XbIuzamJ HbTrlrL2S3feAWaWZUh6c/3unHsLhCupLg8u0DTe97OFlRRlSj+wtmJCMCQoRRpL /LVItmmxfRbuMeyY8GGR6kb2qcb2+Zr6AUOXUPIoYOqARLuaV81F0kD2zwpPwB68 rFPYBG2ld2IHkbNFgxfCfNqP2DuY5TnSpS8SJzcfTAp6saIA5Ua3QLwg8IumV/2X G3ZI/G54er2+hhddX7tUjJIiutwNHxGkiOKy3Bnxm8K8c2H0I8XZ97fB1OWQUnVy pnwMzz3tlUOfF9SV8K0zttu3Dar9dcoBybT8HL/Wxr4RWF+FuZBiQNjjd92k3L4u Ua8Q6nLD/IRXwQfiEWzDSsWIQqxuSDGxvGHBgYJm4d5DxblFAeKmKMkFnO6Z17Kh fm1M4nW/YawSQYlG7Jo5XG+dWESJPhM5IAVBw19o0oYkqHV2gfmUerG8hvJYR7DY CpLrw4S9UDOLzsYAtNoozB/GMQnXQvjRz8mILWiwNJxpbo1saeZUj43zHvzqr7ta Vj96qrQLuNU= =jOSq -----END PGP SIGNATURE----- Merge tag 'perf-urgent-2023-11-26' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 perf event fix from Ingo Molnar: "Fix a bug in the Intel hybrid CPUs hardware-capabilities enumeration code resulting in non-working events on those platforms" * tag 'perf-urgent-2023-11-26' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: perf/x86/intel: Correct incorrect 'or' operation for PMU capabilities |
||
|
|
05c8c94ed4 |
hyperv-fixes for 6.7-rc3
-----BEGIN PGP SIGNATURE----- iQFHBAABCgAxFiEEIbPD0id6easf0xsudhRwX5BBoF4FAmVdgqYTHHdlaS5saXVA a2VybmVsLm9yZwAKCRB2FHBfkEGgXhBsCACzUGLF3vOQdrmgTMymzaaOzfLJtvNW oQ34FwMJMOAyJ6FxM12IJPHA2j+azl9CPjQc5O6F2CBcF8hVj2mDIINQIi+4wpV5 FQv445g2KFml/+AJr/1waz1GmhHtr1rfu7B7NX6tPUtOpxKN7AHAQXWYmHnwK8BJ 5Mh2a/7Lphjin4M1FWCeBTj0JtqF1oVAW2L9jsjqogq1JV0a51DIFutROtaPSC/4 ssTLM5Rqpnw8Z1GWVYD2PObIW4A+h1LV1tNGOIoGW6mX56mPU+KmVA7tTKr8Je/i z3Jk8bZXFyLvPW2+KNJacbldKNcfwAFpReffNz/FO3R16Stq9Ta1mcE2 =wXju -----END PGP SIGNATURE----- Merge tag 'hyperv-fixes-signed-20231121' of git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux Pull hyperv fixes from Wei Liu: - One fix for the KVP daemon (Ani Sinha) - Fix for the detection of E820_TYPE_PRAM in a Gen2 VM (Saurabh Sengar) - Micro-optimization for hv_nmi_unknown() (Uros Bizjak) * tag 'hyperv-fixes-signed-20231121' of git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux: x86/hyperv: Use atomic_try_cmpxchg() to micro-optimize hv_nmi_unknown() x86/hyperv: Fix the detection of E820_TYPE_PRAM in a Gen2 VM hv/hv_kvp_daemon: Some small fixes for handling NM keyfiles |
||
|
|
18286883e7 |
x86/hyperv: Use atomic_try_cmpxchg() to micro-optimize hv_nmi_unknown()
Use atomic_try_cmpxchg() instead of atomic_cmpxchg(*ptr, old, new) == old in hv_nmi_unknown(). On x86 the CMPXCHG instruction returns success in the ZF flag, so this change saves a compare after CMPXCHG. The generated asm code improves from: 3e: 65 8b 15 00 00 00 00 mov %gs:0x0(%rip),%edx 45: b8 ff ff ff ff mov $0xffffffff,%eax 4a: f0 0f b1 15 00 00 00 lock cmpxchg %edx,0x0(%rip) 51: 00 52: 83 f8 ff cmp $0xffffffff,%eax 55: 0f 95 c0 setne %al to: 3e: 65 8b 15 00 00 00 00 mov %gs:0x0(%rip),%edx 45: b8 ff ff ff ff mov $0xffffffff,%eax 4a: f0 0f b1 15 00 00 00 lock cmpxchg %edx,0x0(%rip) 51: 00 52: 0f 95 c0 setne %al No functional change intended. Cc: "K. Y. Srinivasan" <kys@microsoft.com> Cc: Haiyang Zhang <haiyangz@microsoft.com> Cc: Wei Liu <wei.liu@kernel.org> Cc: Dexuan Cui <decui@microsoft.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Signed-off-by: Uros Bizjak <ubizjak@gmail.com> Reviewed-by: Michael Kelley <mhklinux@outlook.com> Link: https://lore.kernel.org/r/20231114170038.381634-1-ubizjak@gmail.com Signed-off-by: Wei Liu <wei.liu@kernel.org> Message-ID: <20231114170038.381634-1-ubizjak@gmail.com> |
||
|
|
080990aa33 |
x86/microcode: Rework early revisions reporting
The AMD side of the loader issues the microcode revision for each logical thread on the system, which can become really noisy on huge machines. And doing that doesn't make a whole lot of sense - the microcode revision is already in /proc/cpuinfo. So in case one is interested in the theoretical support of mixed silicon steppings on AMD, one can check there. What is also missing on the AMD side - something which people have requested before - is showing the microcode revision the CPU had *before* the early update. So abstract that up in the main code and have the BSP on each vendor provide those revision numbers. Then, dump them only once on driver init. On Intel, do not dump the patch date - it is not needed. Reported-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/r/CAHk-=wg=%2B8rceshMkB4VnKxmRccVLtBLPBawnewZuuqyx5U=3A@mail.gmail.com |
||
|
|
2e569ada42 |
x86/microcode: Remove the driver announcement and version
First of all, the print is useless. The driver will either load and say which microcode revision the machine has or issue an error. Then, the version number is meaningless and actively confusing, as Yazen mentioned recently: when a subset of patches are backported to a distro kernel, one can't assume the driver version is the same as the upstream one. And besides, the version number of the loader hasn't been used and incremented for a long time. So drop it. Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/r/20231115210212.9981-2-bp@alien8.de |
||
|
|
e8df9d9f42 |
perf/x86/intel: Correct incorrect 'or' operation for PMU capabilities
When running perf-stat command on Intel hybrid platform, perf-stat
reports the following errors:
sudo taskset -c 7 ./perf stat -vvvv -e cpu_atom/instructions/ sleep 1
Opening: cpu/cycles/:HG
------------------------------------------------------------
perf_event_attr:
type 0 (PERF_TYPE_HARDWARE)
config 0xa00000000
disabled 1
------------------------------------------------------------
sys_perf_event_open: pid 0 cpu -1 group_fd -1 flags 0x8
sys_perf_event_open failed, error -16
Performance counter stats for 'sleep 1':
<not counted> cpu_atom/instructions/
It looks the cpu_atom/instructions/ event can't be enabled on atom PMU
even when the process is pinned on atom core. Investigation shows that
exclusive_event_init() helper always returns -EBUSY error in the perf
event creation. That's strange since the atom PMU should not be an
exclusive PMU.
Further investigation shows the issue was introduced by commit:
|
||
|
|
cd557bc0a2 |
- Ignore invalid x2APIC entries in order to not waste per-CPU data
- Fix a back-to-back signals handling scenario when shadow stack is in use - A documentation fix - Add Kirill as TDX maintainer -----BEGIN PGP SIGNATURE----- iQIzBAABCgAdFiEEzv7L6UO9uDPlPSfHEsHwGGHeVUoFAmVaChkACgkQEsHwGGHe VUraNQ/+KyCyJgG6bdIB3tS9qKr0Z4REaXQ+UQ7DfAjlhrzw7C6f4VReNLp3ohEv RdxNjKLEueYFQAo+v8uKGkqYIT6H1ob9uW+RjtjN+OJqWN/3AfK7CTx8HI1bJsW5 wKM+Ey81cID0iQDiNPAdzRnu7suKKjF5jLwztAw6EYOsTRfUnLZ8Ct84uHBWd58v kZ+WkEyeOyeJo+Vdx07d/LEcCJ+S9G6WfA0AnhHPOZxRZTn2RhqNsnJvqTeOvWUM PSN9NjxFk0ymidwnhR1urw1wHGgTT990vNsPIHLE72TwXrWEOM14Xkq1XNI4PfD1 Bp74ySpF0YUQrvgBW4V3qXgBFls4DkKys1amd2kK5KQGEpcXZm7ZPnI5w2NKMsY4 1Tk379W/1jPY8cyZjIqn92eFEkAjfID4eHICLj5IJhVMUusNEPmxgoycvKDqI8sK NihF1wUjyfRibh4ujYaurqKUBgxVHo2dyXPPo7UNzeaMfvqkFaxgwNJVF0gQ+MyI 5BzeY71RCFb8ZKtCT6SVN6oUeWLg+QAZApoJVDDnhF9InG+wJj+D400T7pZnNHbo ag6L2gJFJ2+XsV8DJhiaII0gfbf9cUppn4G7RcvQfL2HivYnZV3q1dBKf6C35H44 Kpz5w/eoJPOIcuZ48a6ph80zuRpuN6MSBigZ0G2Q7IwrmFx1Vcg= =PGYO -----END PGP SIGNATURE----- Merge tag 'x86_urgent_for_v6.7_rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 fixes from Borislav Petkov: - Ignore invalid x2APIC entries in order to not waste per-CPU data - Fix a back-to-back signals handling scenario when shadow stack is in use - A documentation fix - Add Kirill as TDX maintainer * tag 'x86_urgent_for_v6.7_rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/acpi: Ignore invalid x2APIC entries x86/shstk: Delay signal entry SSP write until after user accesses x86/Documentation: Indent 'note::' directive for protocol version number note MAINTAINERS: Add Intel TDX entry |
||
|
|
bfa993b355 |
acpi/processor: sanitize _OSC/_PDC capabilities for Xen dom0
The Processor capability bits notify ACPI of the OS capabilities, and
so ACPI can adjust the return of other Processor methods taking the OS
capabilities into account.
When Linux is running as a Xen dom0, the hypervisor is the entity
in charge of processor power management, and hence Xen needs to make
sure the capabilities reported by _OSC/_PDC match the capabilities of
the driver in Xen.
Introduce a small helper to sanitize the buffer when running as Xen
dom0.
When Xen supports HWP, this serves as the equivalent of commit
|
||
|
|
7e8037b099 |
x86/hyperv: Fix the detection of E820_TYPE_PRAM in a Gen2 VM
A Gen2 VM doesn't support legacy PCI/PCIe, so both raw_pci_ops and raw_pci_ext_ops are NULL, and pci_subsys_init() -> pcibios_init() doesn't call pcibios_resource_survey() -> e820__reserve_resources_late(); as a result, any emulated persistent memory of E820_TYPE_PRAM (12) via the kernel parameter memmap=nn[KMG]!ss is not added into iomem_resource and hence can't be detected by register_e820_pmem(). Fix this by directly calling e820__reserve_resources_late() in hv_pci_init(), which is called from arch_initcall(pci_arch_init). It's ok to move a Gen2 VM's e820__reserve_resources_late() from subsys_initcall(pci_subsys_init) to arch_initcall(pci_arch_init) because the code in-between doesn't depend on the E820 resources. e820__reserve_resources_late() depends on e820__reserve_resources(), which has been called earlier from setup_arch(). For a Gen-2 VM, the new hv_pci_init() also adds any memory of E820_TYPE_PMEM (7) into iomem_resource, and acpi_nfit_register_region() -> acpi_nfit_insert_resource() -> region_intersects() returns REGION_INTERSECTS, so the memory of E820_TYPE_PMEM won't get added twice. Changed the local variable "int gen2vm" to "bool gen2vm". Signed-off-by: Saurabh Sengar <ssengar@linux.microsoft.com> Signed-off-by: Dexuan Cui <decui@microsoft.com> Signed-off-by: Wei Liu <wei.liu@kernel.org> Message-ID: <1699691867-9827-1-git-send-email-ssengar@linux.microsoft.com> |