Commit Graph

969 Commits

Author SHA1 Message Date
Breno Leitao
aefb2f2e61 x86/bugs: Rename CONFIG_RETPOLINE => CONFIG_MITIGATION_RETPOLINE
Step 5/10 of the namespace unification of CPU mitigations related Kconfig options.

[ mingo: Converted a few more uses in comments/messages as well. ]

Suggested-by: Josh Poimboeuf <jpoimboe@kernel.org>
Signed-off-by: Breno Leitao <leitao@debian.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Reviewed-by: Ariel Miculas <amiculas@cisco.com>
Acked-by: Josh Poimboeuf <jpoimboe@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: https://lore.kernel.org/r/20231121160740.1249350-6-leitao@debian.org
2024-01-10 10:52:28 +01:00
Linus Torvalds
669d089a7f Address a GCC-14 warning: there's no real bug, but indeed the calloc order doesn't match
the prototype.
 
 (Side note: we should really add zalloc() for such cases.)
 
 Signed-off-by: Ingo Molnar <mingo@kernel.org>
 -----BEGIN PGP SIGNATURE-----
 
 iQJFBAABCgAvFiEEBpT5eoXrXCwVQwEKEnMQ0APhK1gFAmWbvjgRHG1pbmdvQGtl
 cm5lbC5vcmcACgkQEnMQ0APhK1gxJg//QBDT7Y3p91uRM/XknJ5wTZIuVg48gRoL
 t0fvCtgcE2SPzpJCy45OzJsWlp9ehJEJtJZHNgcoKwbEEbupbo+N9F3XIkuA55GT
 j9esp4ZeXqv4L6HO/dTIyQIrF69vzRq8Z0LoYi7O+7RZy7ZWivO+yJpGb8xN+SW2
 XkFlXTB7CTJXMGZ++CO0aeIHJxi3hANSv19dpaMYRrmueDM/6BBY8RiJwWUqt7Cw
 ZK0IuZBRhB0XR4tU3eevJQhxza6+VlPUxPXY6jVC5pg9yPd5ccigYPmDm7Jez3oB
 nTdY6mHrGMakK8y4VJDDoU/2cY83A3rS8RxLPCO2Z0IkTmI0IY4lJ5/IlGQO0+r/
 RWrnoD7BXCSN7/bws+FcDREl1tClIQBiAreeNBi3GpAsxNC+FnmaA4bGHpEHxZJR
 PrbTHotXk04HsJULJgt1kA81di7/WJBszQqVKeKRDv5Q9o6t4vpXaVxLUDFb3W7+
 /hpyjozfJUKFOcbLQa7eQlyrkzsPdtOp4Ga4MW9bsuaNPYlz9GswS3ubFo9eQcKm
 /CiwLUwqiQ4E7PNlfjESnhH4q557Gim9C2JDXi9hbv3PWlqhM5DWebo+qNsEx3SC
 i8IeClzDk9PDLxngJIhMBjyAmlBESZ1CzpzGo7yckLEMbRfAWLvsA80N5Ii8i0AO
 R4VIVtxQyAs=
 =3zAc
 -----END PGP SIGNATURE-----

Merge tag 'objtool-core-2024-01-08' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull objtool fixlet from Ingo Molnar:
 "Address a GCC-14 warning: there's no real bug, but indeed the calloc
  order doesn't match the prototype.

  (Side note: we should really add zalloc() for such cases)"

* tag 'objtool-core-2024-01-08' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  objtool: Fix calloc call for new -Walloc-size
2024-01-08 18:31:27 -08:00
Jens Axboe
ae1914174a cred: get rid of CONFIG_DEBUG_CREDENTIALS
This code is rarely (never?) enabled by distros, and it hasn't caught
anything in decades. Let's kill off this legacy debug code.

Suggested-by: Linus Torvalds <torvalds@linuxfoundation.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2023-12-15 14:19:48 -08:00
Sam James
e2e13630f9 objtool: Fix calloc call for new -Walloc-size
GCC 14 introduces a new -Walloc-size included in -Wextra which errors out
like:
```
check.c: In function ‘cfi_alloc’:
check.c:294:33: error: allocation of insufficient size ‘1’ for type ‘struct cfi_state’ with size ‘320’ [-Werror=alloc-size]
  294 |         struct cfi_state *cfi = calloc(sizeof(struct cfi_state), 1);
      |                                 ^~~~~~
```

The calloc prototype is:
```
void *calloc(size_t nmemb, size_t size);
```

So, just swap the number of members and size arguments to match the prototype, as
we're initialising 1 struct of size `sizeof(struct ...)`. GCC then sees we're not
doing anything wrong.

Signed-off-by: Sam James <sam@gentoo.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Josh Poimboeuf <jpoimboe@kernel.org>
Link: https://lore.kernel.org/r/20231107205504.1470006-1-sam@gentoo.org
2023-11-17 10:54:50 +01:00
Linus Torvalds
8999ad99f4 * Refactor and clean up TDX hypercall/module call infrastructure
* Handle retrying/resuming page conversion hypercalls
  * Make sure to use the (shockingly) reliable TSC in TDX guests
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEV76QKkVc4xCGURexaDWVMHDJkrAFAmVBlqMACgkQaDWVMHDJ
 krBrhBAArKay0MvzmdzS4IQs8JqkmuMEHI6WabYv2POPjJNXrn5MelLH972pLuX9
 NJ3+yeOLmNMYwqu5qwLCxyeO5CtqEyT2lNumUrxAtHQG4+oS2RYJYUalxMuoGxt8
 fAHxbItFg0TobBSUtwcnN2R2WdXwPuUW0Co+pJfLlZV4umVM7QANO1nf1g8YmlDD
 sVtpDaeKJRdylmwgWgAyGow0tDKd6oZB9j/vOHvZRrEQ+DMjEtG75fjwbjbu43Cl
 tI/fbxKjzAkOFcZ7PEPsQ8jE1h9DXU+JzTML9Nu/cPMalxMeBg3Dol/JOEbqgreI
 4W8Lg7g071EkUcQDxpwfe4aS6rsfsbwUIV4gJVkg9ZhlT7RayWsFik2CfBpJ4IMQ
 TM8BxtCEGCz3cxVvg3mstX9rRA7eNlXOzcKE/8Y7cpSsp94bA9jtf2GgUSUoi9St
 y+fIEei8mgeHutdiFh8psrmR7hp6iX/ldMFqHtjNo6xatf2KjdVHhVSU13Jz544z
 43ATNi1gZeHOgfwlAlIxLPDVDJidHuux3f6g2vfMkAqItyEqFauC1HA1pIDgckoY
 9FpBPp9vNUToSPp6reB6z/PkEBIrG2XtQh82JLt2CnCb6aTUtnPds+psjtT4sSE/
 a9SQvZLWWmpj+BlI2yrtfJzhy7SwhltgdjItQHidmCNEn0PYfTc=
 =FJ1Y
 -----END PGP SIGNATURE-----

Merge tag 'x86_tdx_for_6.7' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull x86 TDX updates from Dave Hansen:
 "The majority of this is a rework of the assembly and C wrappers that
  are used to talk to the TDX module and VMM. This is a nice cleanup in
  general but is also clearing the way for using this code when Linux is
  the TDX VMM.

  There are also some tidbits to make TDX guests play nicer with Hyper-V
  and to take advantage the hardware TSC.

  Summary:

   - Refactor and clean up TDX hypercall/module call infrastructure

   - Handle retrying/resuming page conversion hypercalls

   - Make sure to use the (shockingly) reliable TSC in TDX guests"

[ TLA reminder: TDX is "Trust Domain Extensions", Intel's guest VM
  confidentiality technology ]

* tag 'x86_tdx_for_6.7' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/tdx: Mark TSC reliable
  x86/tdx: Fix __noreturn build warning around __tdx_hypercall_failed()
  x86/virt/tdx: Make TDX_MODULE_CALL handle SEAMCALL #UD and #GP
  x86/virt/tdx: Wire up basic SEAMCALL functions
  x86/tdx: Remove 'struct tdx_hypercall_args'
  x86/tdx: Reimplement __tdx_hypercall() using TDX_MODULE_CALL asm
  x86/tdx: Make TDX_HYPERCALL asm similar to TDX_MODULE_CALL
  x86/tdx: Extend TDX_MODULE_CALL to support more TDCALL/SEAMCALL leafs
  x86/tdx: Pass TDCALL/SEAMCALL input/output registers via a structure
  x86/tdx: Rename __tdx_module_call() to __tdcall()
  x86/tdx: Make macros of TDCALLs consistent with the spec
  x86/tdx: Skip saving output regs when SEAMCALL fails with VMFailInvalid
  x86/tdx: Zero out the missing RSI in TDX_HYPERCALL macro
  x86/tdx: Retry partially-completed page conversion hypercalls
2023-11-01 10:28:32 -10:00
Linus Torvalds
cd063c8b9e Misc fixes and cleanups:
- Fix potential MAX_NAME_LEN limit related build failures
  - Fix scripts/faddr2line symbol filtering bug
  - Fix scripts/faddr2line on LLVM=1
  - Fix scripts/faddr2line to accept readelf output with mapping symbols
  - Minor cleanups
 
 Signed-off-by: Ingo Molnar <mingo@kernel.org>
 -----BEGIN PGP SIGNATURE-----
 
 iQJFBAABCgAvFiEEBpT5eoXrXCwVQwEKEnMQ0APhK1gFAmU88VYRHG1pbmdvQGtl
 cm5lbC5vcmcACgkQEnMQ0APhK1g2rQ//dvzezrAs+ZEhKLbRLSabbAlCeJ+J9zuP
 c0xBmaLwUh47sSDKfBLLEFN3IMDfgMdKjfb3E32vT/WQ+ASdfEMs6FfwRtaErypG
 XfZFpfC2WE1+Gq0MAgrXYuQgDv1Lygdimoy0aCwMlrgb7ZgWL1xorG0VSEemyKhd
 CoRFURKjeJIKJN1oOvTXKhp/SZyk39KHXeF4qSAjIGkrzsfDtEUSNR6NjBmeGUS4
 zNVWus/CucHK/6MMpHtdWw1/Ygemc1CBzYC3ZSMGimqy4Rqe2RsiGa0Y3XhlMCyn
 ekNFuUm9bxStaTknM3ZXga0xHPdKnTPkihxykLDzo0Nh9eysuFlmFrFJ2xL/B87k
 IxlpXvwxjxTSmGDhGQFVnXma6M2le3YFWGClS8UyhSPG08qg09ClwZ8OtVDi8ITI
 rj0VoFbFLuc8aeHF/tyF2t323JmcMHq0aHi+kMUElszm6+B+fPnD54gHU+REXVxO
 YIRkK9RY52mfU4KFf8xlO/UhFF6nP8pgE8pVnNF4lC034M0t4z+i/TLjOsspjVt3
 yMoZakD7sfUkAaCBq4mVfdWwo5UzTVse0BarbEcKxoME6wLEfN+efE850zGdy7n1
 iRC9AddddEyo4BnSHbWdWu/PDYJKPiH7dAtHBcfnEMJjLQewnRHlsHHbCA55jtrX
 363jNE3x6K4=
 =9U5x
 -----END PGP SIGNATURE-----

Merge tag 'objtool-core-2023-10-28' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull objtool updates from Ingo Molnar:
 "Misc fixes and cleanups:

   - Fix potential MAX_NAME_LEN limit related build failures

   - Fix scripts/faddr2line symbol filtering bug

   - Fix scripts/faddr2line on LLVM=1

   - Fix scripts/faddr2line to accept readelf output with mapping
     symbols

   - Minor cleanups"

* tag 'objtool-core-2023-10-28' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  scripts/faddr2line: Skip over mapping symbols in output from readelf
  scripts/faddr2line: Use LLVM addr2line and readelf if LLVM=1
  scripts/faddr2line: Don't filter out non-function symbols from readelf
  objtool: Remove max symbol name length limitation
  objtool: Propagate early errors
  objtool: Use 'the fallthrough' pseudo-keyword
  x86/speculation, objtool: Use absolute relocations for annotations
  x86/unwind/orc: Remove redundant initialization of 'mid' pointer in __orc_find()
2023-10-30 13:20:02 -10:00
Linus Torvalds
f84a52eef5 - A bunch of improvements, cleanups and fixlets to the SRSO mitigation
machinery and other, general cleanups to the hw mitigations code,
   by Josh Poimboeuf
 
 - Improve the return thunk detection by objtool as it is absolutely
   important that the default return thunk is not used after returns
   have been patched. Future work to detect and report this better is
   pending
 
 - Other misc cleanups and fixes
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEzv7L6UO9uDPlPSfHEsHwGGHeVUoFAmU7mFEACgkQEsHwGGHe
 VUpbBxAAtS4X5LCntPWUsDEBU80SBYAunEp0Wd0ttYEj+UrEk4tvnWVGFiIEr47A
 PrRKK9JCJtC6ko0+dwPtMi66L/T7mCpoNPI1kzfRG1IHJBfvCTGJhzZsesogvkA2
 1X9Je+RCVW4xVybIryxhjMGdB6jUoGEU1a4DmQXq481qiLB3ilvA1bIAaNo9BBYP
 rxKPrPcdOxn2NjxuOWg+FXjSc8LuAVSu3HqsgCW2AHJ6XIKEYWEq9FkXhwj9OJOr
 ax1F4qD1IY++jYZO9DJiltjeJyj0wC+yp8kDDURoLbcTk85WHlpD5vK0g64mELOA
 y0375thHep+vsrtQ/qZAmi/eVTaTekgbi7McahjoZebK7FbKOYRk6GZ+5+m29AVr
 DfQSJ7xQQqbCbpimeFmZ+gQf7mFexyDWvjUPyBl+OelOY1umdPM9IZVTnqib5LPr
 D2M+uqWfJhSwACi2o05LRv0gyhkAz0bGHrwZPmCVuxE5kBbhOpj4aT87fetUp/MW
 8lEFa3PHx/gkh2VOJ7ZgKzpeD75Vjo8TRAXOe4O2jn/L54gNEJ+1mukvrjW3+lp1
 ShmcZokl3ldPq6F5ioE+u45hVAfHkaruWM+5Rj3hsA/fdFN3isTVLhIRIsypPTKc
 p1ITT8Yhek8vkm9PcRBE5xWRmEZ2XE5ooDld930nJxra8QNVVQw=
 =E7c4
 -----END PGP SIGNATURE-----

Merge tag 'x86_bugs_for_6.7_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull x86 hw mitigation updates from Borislav Petkov:

 - A bunch of improvements, cleanups and fixlets to the SRSO mitigation
   machinery and other, general cleanups to the hw mitigations code, by
   Josh Poimboeuf

 - Improve the return thunk detection by objtool as it is absolutely
   important that the default return thunk is not used after returns
   have been patched. Future work to detect and report this better is
   pending

 - Other misc cleanups and fixes

* tag 'x86_bugs_for_6.7_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (21 commits)
  x86/retpoline: Document some thunk handling aspects
  x86/retpoline: Make sure there are no unconverted return thunks due to KCSAN
  x86/callthunks: Delete unused "struct thunk_desc"
  x86/vdso: Run objtool on vdso32-setup.o
  objtool: Fix return thunk patching in retpolines
  x86/srso: Remove unnecessary semicolon
  x86/pti: Fix kernel warnings for pti= and nopti cmdline options
  x86/calldepth: Rename __x86_return_skl() to call_depth_return_thunk()
  x86/nospec: Refactor UNTRAIN_RET[_*]
  x86/rethunk: Use SYM_CODE_START[_LOCAL]_NOALIGN macros
  x86/srso: Disentangle rethunk-dependent options
  x86/srso: Move retbleed IBPB check into existing 'has_microcode' code block
  x86/bugs: Remove default case for fully switched enums
  x86/srso: Remove 'pred_cmd' label
  x86/srso: Unexport untraining functions
  x86/srso: Improve i-cache locality for alias mitigation
  x86/srso: Fix unret validation dependencies
  x86/srso: Fix vulnerability reporting for missing microcode
  x86/srso: Print mitigation for retbleed IBPB case
  x86/srso: Print actual mitigation if requested mitigation isn't possible
  ...
2023-10-30 11:48:49 -10:00
Linus Torvalds
9e87705289 Initial bcachefs pull request for 6.7-rc1
Here's the bcachefs filesystem pull request.
 
 One new patch since last week: the exportfs constants ended up
 conflicting with other filesystems that are also getting added to the
 global enum, so switched to new constants picked by Amir.
 
 I'll also be sending another pull request later on in the cycle bringing
 things up to date my master branch that people are currently running;
 that will be restricted to fs/bcachefs/, naturally.
 
 Testing - fstests as well as the bcachefs specific tests in ktest:
   https://evilpiepirate.org/~testdashboard/ci?branch=bcachefs-for-upstream
 
 It's also been soaking in linux-next, which resulted in a whole bunch of
 smatch complaints and fixes and a patch or two from Kees.
 
 The only new non fs/bcachefs/ patch is the objtool patch that adds
 bcachefs functions to the list of noreturns. The patch that exports
 osq_lock() has been dropped for now, per Ingo.
 
 Prereq patch list:
 
 faf1dce852 objtool: Add bcachefs noreturns
 73badee428 lib/generic-radix-tree.c: Add peek_prev()
 9492261ff2 lib/generic-radix-tree.c: Don't overflow in peek()
 0fb5d567f5 MAINTAINERS: Add entry for generic-radix-tree
 b414e8ecd4 closures: Add a missing include
 48b7935722 closures: closure_nr_remaining()
 ced58fc7ab closures: closure_wait_event()
 bd0d22e41e MAINTAINERS: Add entry for closures
 8c8d2d9670 bcache: move closures to lib/
 957e48087d locking: export contention tracepoints for bcachefs six locks
 21db931445 lib: Export errname
 83feeb1955 lib/string_helpers: string_get_size() now returns characters wrote
 7d672f4094 stacktrace: Export stack_trace_save_tsk
 771eb4fe8b fs: factor out d_mark_tmpfile()
 2b69987be5 sched: Add task_struct->faults_disabled_mapping
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEKnAFLkS8Qha+jvQrE6szbY3KbnYFAmU/wyIACgkQE6szbY3K
 bnZc1xAAqjQBGXdtgtKQvk0/ru0WaMZguMsOHd3BUXIbm30F6eJqnoXQ/ahALofc
 Ju6NrOgcy9wmdPKWpbeF+aK3WnkAW9jShDd0QieVH6PkhyYyh5r11iR/EVtjjLu5
 6Teodn8fyTqn9WSDtKG15QreTCJrEasAoGFQKQDA8oiXC7zc+RSpLUkkTWD/pxyW
 zVqkGGiAUG4x6FON+X2a3QBa9WCahIgV6XzHstGLsmOECxKO/LopGR5jThuIhv9t
 Yo0wodQTKAgb9QviG6V3f2dJLQKKUVDmVEGTXv+8Hl3d8CiYBJeIh+icp+VESBo1
 m8ev0y2xbTPLwgm5v0Uj4o/G8ISZ+qmcexV2zQ9xUWUAd2AjEBzhCh9BrNXM5qSg
 o7mphH+Pt6bJXgzxb2RkYJixU11yG3yuHPOCrRGGFpVHiNYhdHuJeDZOqChWZB8x
 6kY0uvU0X0tqVfWKxMwTwuqG8mJ5BkJNvnEvYi05QEZG0dDcUhgOqYlNNaL8vGkl
 qVixOwE4aH4kscdmW2gXY1c76VSebheyN8n6Wj1zrmTw4hTJH7ZWXPtmbRqQzpB6
 U6w3NjVyopbIjuF+syWeGqitTT/8fpvgZU4E9MpKGmHX4ADgecp6YSZQzzxTJn7D
 cbVX7YQxhmsM50C1PW7A8yLCspD/uRNiKLvzb/g9gFSInk4rV+U=
 =g+ia
 -----END PGP SIGNATURE-----

Merge tag 'bcachefs-2023-10-30' of https://evilpiepirate.org/git/bcachefs

Pull initial bcachefs updates from Kent Overstreet:
 "Here's the bcachefs filesystem pull request.

  One new patch since last week: the exportfs constants ended up
  conflicting with other filesystems that are also getting added to the
  global enum, so switched to new constants picked by Amir.

  The only new non fs/bcachefs/ patch is the objtool patch that adds
  bcachefs functions to the list of noreturns. The patch that exports
  osq_lock() has been dropped for now, per Ingo"

* tag 'bcachefs-2023-10-30' of https://evilpiepirate.org/git/bcachefs: (2781 commits)
  exportfs: Change bcachefs fid_type enum to avoid conflicts
  bcachefs: Refactor memcpy into direct assignment
  bcachefs: Fix drop_alloc_keys()
  bcachefs: snapshot_create_lock
  bcachefs: Fix snapshot skiplists during snapshot deletion
  bcachefs: bch2_sb_field_get() refactoring
  bcachefs: KEY_TYPE_error now counts towards i_sectors
  bcachefs: Fix handling of unknown bkey types
  bcachefs: Switch to unsafe_memcpy() in a few places
  bcachefs: Use struct_size()
  bcachefs: Correctly initialize new buckets on device resize
  bcachefs: Fix another smatch complaint
  bcachefs: Use strsep() in split_devs()
  bcachefs: Add iops fields to bch_member
  bcachefs: Rename bch_sb_field_members -> bch_sb_field_members_v1
  bcachefs: New superblock section members_v2
  bcachefs: Add new helper to retrieve bch_member from sb
  bcachefs: bucket_lock() is now a sleepable lock
  bcachefs: fix crc32c checksum merge byte order problem
  bcachefs: Fix bch2_inode_delete_keys()
  ...
2023-10-30 11:09:38 -10:00
Josh Poimboeuf
34de4fe7d1 objtool: Fix return thunk patching in retpolines
With CONFIG_RETHUNK enabled, the compiler replaces every RET with a tail
call to a return thunk ('JMP __x86_return_thunk').  Objtool annotates
all such return sites so they can be patched during boot by
apply_returns().

The implementation of __x86_return_thunk() is just a bare RET.  It's
only meant to be used temporarily until apply_returns() patches all
return sites with either a JMP to another return thunk or an actual RET.

Removing the .text..__x86.return_thunk section would break objtool's
detection of return sites in retpolines.  Since retpolines and return
thunks would land in the same section, the compiler no longer uses
relocations for the intra-section jumps between the retpolines and the
return thunk, causing objtool to overlook them.

As a result, none of the retpolines' return sites would get patched.
Each one stays at 'JMP __x86_return_thunk', effectively a bare RET.

Fix it by teaching objtool to detect when a non-relocated jump target is
a return thunk (or retpoline).

  [ bp: Massage the commit message now that the offending commit
    removing the .text..__x86.return_thunk section has been zapped.
    Still keep the objtool change here as it makes objtool more robust
    wrt handling such intra-TU jumps without relocations, should some
    toolchain and/or config generate them in the future. ]

Reported-by: David Kaplan <david.kaplan@amd.com>
Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lore.kernel.org/r/20231012024737.eg5phclogp67ik6x@treble
2023-10-20 12:51:41 +02:00
Kent Overstreet
faf1dce852 objtool: Add bcachefs noreturns
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-19 14:58:29 -04:00
Aaron Plattner
f404a58dcf objtool: Remove max symbol name length limitation
If one of the symbols processed by read_symbols() happens to have a
.cold variant with a name longer than objtool's MAX_NAME_LEN limit, the
build fails.

Avoid this problem by just using strndup() to copy the parent function's
name, rather than strncpy()ing it onto the stack.

Signed-off-by: Aaron Plattner <aplattner@nvidia.com>
Link: https://lore.kernel.org/r/41e94cfea1d9131b758dd637fecdeacd459d4584.1696355111.git.aplattner@nvidia.com
Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
2023-10-05 17:01:28 -07:00
Aaron Plattner
e959c279d3 objtool: Propagate early errors
If objtool runs into a problem that causes it to exit early, the overall
tool still returns a status code of 0, which causes the build to
continue as if nothing went wrong.

Note this only affects early errors, as later errors are still ignored
by check().

Fixes: b51277eb97 ("objtool: Ditch subcommands")
Signed-off-by: Aaron Plattner <aplattner@nvidia.com>
Link: https://lore.kernel.org/r/cb6a28832d24b2ebfafd26da9abb95f874c83045.1696355111.git.aplattner@nvidia.com
Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
2023-10-05 17:01:11 -07:00
Ruan Jinjie
758a74306f objtool: Use 'the fallthrough' pseudo-keyword
Replace the existing /* fallthrough */ comments with the
new 'fallthrough' pseudo-keyword macro:

  https://www.kernel.org/doc/html/v5.7/process/deprecated.html?highlight=fallthrough#implicit-switch-case-fall-through

Signed-off-by: Ruan Jinjie <ruanjinjie@huawei.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: linux-kernel@vger.kernel.org
2023-10-03 21:37:35 +02:00
Kai Huang
518755a7ee x86/tdx: Fix __noreturn build warning around __tdx_hypercall_failed()
LKP reported below build warning:

  vmlinux.o: warning: objtool: __tdx_hypercall+0x128: __tdx_hypercall_failed() is missing a __noreturn annotation

The __tdx_hypercall_failed() function definition already has __noreturn
annotation, but it turns out the __noreturn must be annotated to the
function declaration.

PeterZ explains:

  "FWIW, the reason being that...

   The point of noreturn is that the caller should know to stop generating
   code. For that the declaration needs the attribute, because call sites
   typically do not have access to the function definition in C."

Add __noreturn annotation to the declaration of __tdx_hypercall_failed()
to fix.  It's not a bad idea to document the __noreturn nature at the
definition site either, so keep the annotation at the definition.

Note <asm/shared/tdx.h> is also included by TDX related assembly files.
Include <linux/compiler_attributes.h> only in case of !__ASSEMBLY__
otherwise compiling assembly file would trigger build error.

Also, following the objtool documentation, add __tdx_hypercall_failed()
to "tools/objtool/noreturns.h".

Fixes: c641cfb5c1 ("x86/tdx: Make TDX_HYPERCALL asm similar to TDX_MODULE_CALL")
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Kai Huang <kai.huang@intel.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lore.kernel.org/r/20230918041858.331234-1-kai.huang@intel.com
Closes: https://lore.kernel.org/oe-kbuild-all/202309140828.9RdmlH2Z-lkp@intel.com/
2023-09-18 09:11:39 +02:00
Josh Poimboeuf
72178d5d1a objtool: Fix _THIS_IP_ detection for cold functions
Cold functions and their non-cold counterparts can use _THIS_IP_ to
reference each other.  Don't warn about !ENDBR in that case.

Note that for GCC this is currently irrelevant in light of the following
commit

  c27cd083cf ("Compiler attributes: GCC cold function alignment workarounds")

which disabled cold functions in the kernel.  However this may still be
possible with Clang.

Fixes several warnings like the following:

  drivers/scsi/bnx2i/bnx2i.prelink.o: warning: objtool: bnx2i_hw_ep_disconnect+0x19d: relocation to !ENDBR: bnx2i_hw_ep_disconnect.cold+0x0
  drivers/net/ipvlan/ipvlan.prelink.o: warning: objtool: ipvlan_addr4_event.cold+0x28: relocation to !ENDBR: ipvlan_addr4_event+0xda
  drivers/net/ipvlan/ipvlan.prelink.o: warning: objtool: ipvlan_addr6_event.cold+0x26: relocation to !ENDBR: ipvlan_addr6_event+0xb7
  drivers/net/ethernet/broadcom/tg3.prelink.o: warning: objtool: tg3_set_ringparam.cold+0x17: relocation to !ENDBR: tg3_set_ringparam+0x115
  drivers/net/ethernet/broadcom/tg3.prelink.o: warning: objtool: tg3_self_test.cold+0x17: relocation to !ENDBR: tg3_self_test+0x2e1
  drivers/target/iscsi/cxgbit/cxgbit.prelink.o: warning: objtool: __cxgbit_free_conn.cold+0x24: relocation to !ENDBR: __cxgbit_free_conn+0xfb
  net/can/can.prelink.o: warning: objtool: can_rx_unregister.cold+0x2c: relocation to !ENDBR: can_rx_unregister+0x11b
  drivers/net/ethernet/qlogic/qed/qed.prelink.o: warning: objtool: qed_spq_post+0xc0: relocation to !ENDBR: qed_spq_post.cold+0x9a
  drivers/net/ethernet/qlogic/qed/qed.prelink.o: warning: objtool: qed_iwarp_ll2_comp_syn_pkt.cold+0x12f: relocation to !ENDBR: qed_iwarp_ll2_comp_syn_pkt+0x34b
  net/tipc/tipc.prelink.o: warning: objtool: tipc_nametbl_publish.cold+0x21: relocation to !ENDBR: tipc_nametbl_publish+0xa6

Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lore.kernel.org/r/d8f1ab6a23a6105bc023c132b105f245c7976be6.1694476559.git.jpoimboe@kernel.org
2023-09-12 08:16:54 +02:00
Peter Zijlstra
dbf4600877 objtool/x86: Fixup frame-pointer vs rethunk
For stack-validation of a frame-pointer build, objtool validates that
every CALL instruction is preceded by a frame-setup. The new SRSO
return thunks violate this with their RSB stuffing trickery.

Extend the __fentry__ exception to also cover the embedded_insn case
used for this. This cures:

  vmlinux.o: warning: objtool: srso_untrain_ret+0xd: call without frame pointer save/setup

Fixes: 4ae68b26c3 ("objtool/x86: Fix SRSO mess")
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Acked-by: Josh Poimboeuf <jpoimboe@kernel.org>
Link: https://lore.kernel.org/r/20230816115921.GH980931@hirez.programming.kicks-ass.net
2023-08-17 00:44:35 +02:00
Peter Zijlstra
d025b7bac0 x86/cpu: Rename original retbleed methods
Rename the original retbleed return thunk and untrain_ret to
retbleed_return_thunk() and retbleed_untrain_ret().

No functional changes.

Suggested-by: Josh Poimboeuf <jpoimboe@kernel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Link: https://lore.kernel.org/r/20230814121148.909378169@infradead.org
2023-08-16 21:47:53 +02:00
Peter Zijlstra
d43490d0ab x86/cpu: Clean up SRSO return thunk mess
Use the existing configurable return thunk. There is absolute no
justification for having created this __x86_return_thunk alternative.

To clarify, the whole thing looks like:

Zen3/4 does:

  srso_alias_untrain_ret:
	  nop2
	  lfence
	  jmp srso_alias_return_thunk
	  int3

  srso_alias_safe_ret: // aliasses srso_alias_untrain_ret just so
	  add $8, %rsp
	  ret
	  int3

  srso_alias_return_thunk:
	  call srso_alias_safe_ret
	  ud2

While Zen1/2 does:

  srso_untrain_ret:
	  movabs $foo, %rax
	  lfence
	  call srso_safe_ret           (jmp srso_return_thunk ?)
	  int3

  srso_safe_ret: // embedded in movabs instruction
	  add $8,%rsp
          ret
          int3

  srso_return_thunk:
	  call srso_safe_ret
	  ud2

While retbleed does:

  zen_untrain_ret:
	  test $0xcc, %bl
	  lfence
	  jmp zen_return_thunk
          int3

  zen_return_thunk: // embedded in the test instruction
	  ret
          int3

Where Zen1/2 flush the BTB entry using the instruction decoder trick
(test,movabs) Zen3/4 use BTB aliasing. SRSO adds a return sequence
(srso_safe_ret()) which forces the function return instruction to
speculate into a trap (UD2).  This RET will then mispredict and
execution will continue at the return site read from the top of the
stack.

Pick one of three options at boot (evey function can only ever return
once).

  [ bp: Fixup commit message uarch details and add them in a comment in
    the code too. Add a comment about the srso_select_mitigation()
    dependency on retbleed_select_mitigation(). Add moar ifdeffery for
    32-bit builds. Add a dummy srso_untrain_ret_alias() definition for
    32-bit alternatives needing the symbol. ]

Fixes: fb3bd914b3 ("x86/srso: Add a Speculative RAS Overflow mitigation")
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Link: https://lore.kernel.org/r/20230814121148.842775684@infradead.org
2023-08-16 21:47:24 +02:00
Peter Zijlstra
4ae68b26c3 objtool/x86: Fix SRSO mess
Objtool --rethunk does two things:

 - it collects all (tail) call's of __x86_return_thunk and places them
   into .return_sites. These are typically compiler generated, but
   RET also emits this same.

 - it fudges the validation of the __x86_return_thunk symbol; because
   this symbol is inside another instruction, it can't actually find
   the instruction pointed to by the symbol offset and gets upset.

Because these two things pertained to the same symbol, there was no
pressing need to separate these two separate things.

However, alas, along comes SRSO and more crazy things to deal with
appeared.

The SRSO patch itself added the following symbol names to identify as
rethunk:

  'srso_untrain_ret', 'srso_safe_ret' and '__ret'

Where '__ret' is the old retbleed return thunk, 'srso_safe_ret' is a
new similarly embedded return thunk, and 'srso_untrain_ret' is
completely unrelated to anything the above does (and was only included
because of that INT3 vs UD2 issue fixed previous).

Clear things up by adding a second category for the embedded instruction
thing.

Fixes: fb3bd914b3 ("x86/srso: Add a Speculative RAS Overflow mitigation")
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Link: https://lore.kernel.org/r/20230814121148.704502245@infradead.org
2023-08-16 09:39:16 +02:00
Petr Pavlu
79cd2a1122 x86/retpoline,kprobes: Fix position of thunk sections with CONFIG_LTO_CLANG
The linker script arch/x86/kernel/vmlinux.lds.S matches the thunk
sections ".text.__x86.*" from arch/x86/lib/retpoline.S as follows:

  .text {
    [...]
    TEXT_TEXT
    [...]
    __indirect_thunk_start = .;
    *(.text.__x86.*)
    __indirect_thunk_end = .;
    [...]
  }

Macro TEXT_TEXT references TEXT_MAIN which normally expands to only
".text". However, with CONFIG_LTO_CLANG, TEXT_MAIN becomes
".text .text.[0-9a-zA-Z_]*" which wrongly matches also the thunk
sections. The output layout is then different than expected. For
instance, the currently defined range [__indirect_thunk_start,
__indirect_thunk_end] becomes empty.

Prevent the problem by using ".." as the first separator, for example,
".text..__x86.indirect_thunk". This pattern is utilized by other
explicit section names which start with one of the standard prefixes,
such as ".text" or ".data", and that need to be individually selected in
the linker script.

  [ nathan: Fix conflicts with SRSO and fold in fix issue brought up by
    Andrew Cooper in post-review:
    https://lore.kernel.org/20230803230323.1478869-1-andrew.cooper3@citrix.com ]

Fixes: dc5723b02e ("kbuild: add support for Clang LTO")
Signed-off-by: Petr Pavlu <petr.pavlu@suse.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Link: https://lore.kernel.org/r/20230711091952.27944-2-petr.pavlu@suse.com
2023-08-14 11:44:19 +02:00
Borislav Petkov (AMD)
fb3bd914b3 x86/srso: Add a Speculative RAS Overflow mitigation
Add a mitigation for the speculative return address stack overflow
vulnerability found on AMD processors.

The mitigation works by ensuring all RET instructions speculate to
a controlled location, similar to how speculation is controlled in the
retpoline sequence.  To accomplish this, the __x86_return_thunk forces
the CPU to mispredict every function return using a 'safe return'
sequence.

To ensure the safety of this mitigation, the kernel must ensure that the
safe return sequence is itself free from attacker interference.  In Zen3
and Zen4, this is accomplished by creating a BTB alias between the
untraining function srso_untrain_ret_alias() and the safe return
function srso_safe_ret_alias() which results in evicting a potentially
poisoned BTB entry and using that safe one for all function returns.

In older Zen1 and Zen2, this is accomplished using a reinterpretation
technique similar to Retbleed one: srso_untrain_ret() and
srso_safe_ret().

Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
2023-07-27 11:07:14 +02:00
Michal Kubecek
9f71fbcde2 objtool: initialize all of struct elf
Function elf_open_read() only zero initializes the initial part of
allocated struct elf; num_relocs member was recently added outside the
zeroed part so that it was left uninitialized, resulting in build failures
on some systems.

The partial initialization is a relic of times when struct elf had large
hash tables embedded. This is no longer the case so remove the trap and
initialize the whole structure instead.

Fixes: eb0481bbc4 ("objtool: Fix reloc_hash size")
Signed-off-by: Michal Kubecek <mkubecek@suse.cz>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Josh Poimboeuf <jpoimboe@kernel.org>
Link: https://lore.kernel.org/r/20230629102051.42E8360467@lion.mk-sys.cz
2023-07-10 09:52:28 +02:00
Ingo Molnar
06697ca69b objtool: Remove btrfs_assertfail() from the noreturn exceptions list
The objtool merge in commit 6f612579be ("Merge tag 'objtool-core ...")
generated a semantic conflict that was not resolved.

The btrfs_assertfail() entry was removed from the noreturn list in
commit b831306b3b ("btrfs: print assertion failure report and stack
trace from the same line") because btrfs_assertfail() was changed from a
noreturn function into a macro.

The noreturn list was then moved from check.c to noreturns.h in commit
6245ce4ab6 ("objtool: Move noreturn function list to separate file"),
and should be removed from that post-merge as well.

Do it explicitly.

Cc: David Sterba <dsterba@suse.com>
Cc: Josh Poimboeuf <jpoimboe@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2023-06-29 17:27:47 -07:00
Linus Torvalds
6f612579be objtool changes for v6.5:
- Build footprint & performance improvements:
 
     - Reduce memory usage with CONFIG_DEBUG_INFO=y
 
       In the worst case of an allyesconfig+CONFIG_DEBUG_INFO=y kernel, DWARF
       creates almost 200 million relocations, ballooning objtool's peak heap
       usage to 53GB.  These patches reduce that to 25GB.
 
       On a distro-type kernel with kernel IBT enabled, they reduce objtool's
       peak heap usage from 4.2GB to 2.8GB.
 
       These changes also improve the runtime significantly.
 
 - Debuggability improvements:
 
     - Add the unwind_debug command-line option, for more extend unwinding
       debugging output.
     - Limit unreachable warnings to once per function
     - Add verbose option for disassembling affected functions
     - Include backtrace in verbose mode
     - Detect missing __noreturn annotations
     - Ignore exc_double_fault() __noreturn warnings
     - Remove superfluous global_noreturns entries
     - Move noreturn function list to separate file
     - Add __kunit_abort() to noreturns
 
 - Unwinder improvements:
 
     - Allow stack operations in UNWIND_HINT_UNDEFINED regions
     - drm/vmwgfx: Add unwind hints around RBP clobber
 
 - Cleanups:
 
     - Move the x86 entry thunk restore code into thunk functions
     - x86/unwind/orc: Use swap() instead of open coding it
     - Remove unnecessary/unused variables
 
 - Fixes for modern stack canary handling
 
 Signed-off-by: Ingo Molnar <mingo@kernel.org>
 -----BEGIN PGP SIGNATURE-----
 
 iQJFBAABCgAvFiEEBpT5eoXrXCwVQwEKEnMQ0APhK1gFAmSaxcoRHG1pbmdvQGtl
 cm5lbC5vcmcACgkQEnMQ0APhK1ht5w//f8mBoABct29pS4ib6pDwRZQDoG8fCA7M
 +KWjFD1AhX7RsJVEbM4uBUXdSWZD61xxIa8p8LO2jjzE5RyhM+EuNaisKujKqmfj
 uQTSnRhIRHMPqqVGK/gQxy1v4+3+12O32XFIJhAPYCp/dpbZJ2yKDsiHjapzZTDy
 BM+86hbIyHFmSl5uJcBFHEv6EGhoxwdrrrOxhpao1CqfAUi+uVgamHGwVqx+NtTY
 MvOmcy3/0ukHwDLON0MIMu9MSwvnXorD7+RSkYstwAM/k6ao/k78iJ31sOcynpRn
 ri0gmfygJsh2bxL4JUlY4ZeTs7PLWkj3i60deePc5u6EyV4JDJ2borUibs5oGoF6
 pN0AwbtubLHHhUI/v74B3E6K6ZGvLiEn9dsNTuXsJffD+qU2REb+WLhr4ut+E1Wi
 IKWrYh811yBLyOqFEW3XudZTiXSJlgi3eYiCxspEsKw2RIFFt2g6vYcwrIb0Hatw
 8R4/jCWk1nc6Wa3RQYsVnhkglAECSKQdDfS7p2e1hNUTjZuess4EEJjSLs8upIQ9
 D1bmuUxEzRxVwAZtXYNh0NKe7OtyOrqgsVTQuqxvWXq2CpC7Hqj8piVJWHdBWgHO
 0o2OQqjwSrzAtevpAIaYQv9zhPs1hV7CpBgzzqWGXrwJ3vM6YoSRLf0bg+5OkN8I
 O4U2xq2OVa8=
 =uNnc
 -----END PGP SIGNATURE-----

Merge tag 'objtool-core-2023-06-27' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull objtool updates from Ingo Molar:
 "Build footprint & performance improvements:

   - Reduce memory usage with CONFIG_DEBUG_INFO=y

     In the worst case of an allyesconfig+CONFIG_DEBUG_INFO=y kernel,
     DWARF creates almost 200 million relocations, ballooning objtool's
     peak heap usage to 53GB. These patches reduce that to 25GB.

     On a distro-type kernel with kernel IBT enabled, they reduce
     objtool's peak heap usage from 4.2GB to 2.8GB.

     These changes also improve the runtime significantly.

  Debuggability improvements:

   - Add the unwind_debug command-line option, for more extend unwinding
     debugging output
   - Limit unreachable warnings to once per function
   - Add verbose option for disassembling affected functions
   - Include backtrace in verbose mode
   - Detect missing __noreturn annotations
   - Ignore exc_double_fault() __noreturn warnings
   - Remove superfluous global_noreturns entries
   - Move noreturn function list to separate file
   - Add __kunit_abort() to noreturns

  Unwinder improvements:

   - Allow stack operations in UNWIND_HINT_UNDEFINED regions
   - drm/vmwgfx: Add unwind hints around RBP clobber

  Cleanups:

   - Move the x86 entry thunk restore code into thunk functions
   - x86/unwind/orc: Use swap() instead of open coding it
   - Remove unnecessary/unused variables

  Fixes for modern stack canary handling"

* tag 'objtool-core-2023-06-27' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (42 commits)
  x86/orc: Make the is_callthunk() definition depend on CONFIG_BPF_JIT=y
  objtool: Skip reading DWARF section data
  objtool: Free insns when done
  objtool: Get rid of reloc->rel[a]
  objtool: Shrink elf hash nodes
  objtool: Shrink reloc->sym_reloc_entry
  objtool: Get rid of reloc->jump_table_start
  objtool: Get rid of reloc->addend
  objtool: Get rid of reloc->type
  objtool: Get rid of reloc->offset
  objtool: Get rid of reloc->idx
  objtool: Get rid of reloc->list
  objtool: Allocate relocs in advance for new rela sections
  objtool: Add for_each_reloc()
  objtool: Don't free memory in elf_close()
  objtool: Keep GElf_Rel[a] structs synced
  objtool: Add elf_create_section_pair()
  objtool: Add mark_sec_changed()
  objtool: Fix reloc_hash size
  objtool: Consolidate rel/rela handling
  ...
2023-06-27 15:05:41 -07:00
Linus Torvalds
9ba92dc1de linux-kselftest-kunit-6.5-rc1
This KUnit update for Linux 6.5-rc1 consists of:
 
 - kunit_add_action() API to defer a call until test exit.
 - Update document to add kunit_add_action() usage notes.
 - Changes to always run cleanup from a test kthread.
 - Documentation updates to clarify cleanup usage
   - assertions should not be used in cleanup
 - Documentation update to clearly indicate that exit
   functions should run even if init fails
 - Several fixes and enhancements to existing tests.
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEPZKym/RZuOCGeA/kCwJExA0NQxwFAmSYWVcACgkQCwJExA0N
 QxwbxA//eGx3xkFN9CWb8ryBTZhs8DZrzc+JlqWEDpk7GQTSlErd3DtInzY0jM2a
 GWKV4BJCX6uI2JiyG+cof7nWtnv//L4LxRCpYlY/n7sJeYwZyd1s745nM8lfYTh9
 UtAHPmZplAqMCOHgfeUQ6wMxiUc7VGC8Spu82nFzRuSLzf+q5BpK7LPHSJiJ4ea+
 kkM+5ygHzBW2cfvULIglb8jQPgPRoVR4RhmmHMF7CYTZQkrU/z7ZZlFTx7LowrxC
 p2zWVuH0KJONn4L8rB4QI8oqCZejU2qV2bealCnKY3/atSLUvrnYxyPQbbxCNqmi
 EY1XyQFbGsvmgy77IeEXKWhiUmAfD7/Hcvh8M/vLk2wHzQG8+428DAQ7sGRHHqZX
 6DvDUo8Z2TE7585glxkbiXhuGsY0y8dkeNURw4URys+TvucNHGrmDfKp0UIEAJW1
 iqopMGmM/MDfV5gPUlUEg6jKhTkZOn6OlVwZ8moUaAeAKV7qGGuMrNSZJ6Jw1Gc9
 LjI2ma3uZ3hOahyqwU+zwO4CeTJHOq6JjXJZt9aiGwqJPrbjvVCUtikz4QSptU2z
 vCjVEV/e7tTGXl+suDb48cu/pyh+z3t5/Gz7eOHMId7S3MENTauxyBXDm1WzoV0c
 HuBEsmWXetYuXXkh66LJ/8fzUeWvaGrQPM9hXi2fn1hmPLxOnxw=
 =rYT5
 -----END PGP SIGNATURE-----

Merge tag 'linux-kselftest-kunit-6.5-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest

Pull KUnit updates from Shuah Khan:

 - kunit_add_action() API to defer a call until test exit

 - Update document to add kunit_add_action() usage notes

 - Changes to always run cleanup from a test kthread

 - Documentation updates to clarify cleanup usage (assertions should not
   be used in cleanup)

 - Documentation update to clearly indicate that exit functions should
   run even if init fails

 - Several fixes and enhancements to existing tests

* tag 'linux-kselftest-kunit-6.5-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest:
  MAINTAINERS: Add source tree entry for kunit
  Documentation: kunit: Rename references to kunit_abort()
  kunit: Move kunit_abort() call out of kunit_do_failed_assertion()
  kunit: Fix obsolete name in documentation headers (func->action)
  Documentation: Kunit: add MODULE_LICENSE to sample code
  kunit: Update kunit_print_ok_not_ok function
  kunit: Fix reporting of the skipped parameterized tests
  kunit/test: Add example test showing parameterized testing
  Documentation: kunit: Add usage notes for kunit_add_action()
  kunit: kmalloc_array: Use kunit_add_action()
  kunit: executor_test: Use kunit_add_action()
  kunit: Add kunit_add_action() to defer a call until test exit
  kunit: example: Provide example exit functions
  Documentation: kunit: Warn that exit functions run even if init fails
  Documentation: kunit: Note that assertions should not be used in cleanup
  kunit: Always run cleanup from a test kthread
  Documentation: kunit: Modular tests should not depend on KUNIT=y
  kunit: tool: undo type subscripts for subprocess.Popen
2023-06-27 11:12:55 -07:00
Linus Torvalds
8c69e7afe9 - Up until now the Fast Short Rep Mov optimizations implied the presence
of the ERMS CPUID flag. AMD decoupled them with a BIOS setting so decouple
   that dependency in the kernel code too
 
 - Teach the alternatives machinery to handle relocations
 
 - Make debug_alternative accept flags in order to see only that set of
   patching done one is interested in
 
 - Other fixes, cleanups and optimizations to the patching code
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEzv7L6UO9uDPlPSfHEsHwGGHeVUoFAmSZi2AACgkQEsHwGGHe
 VUqhGw/9EC/m5HTFBlCy9PS5Qy6pPLzmHR5Tuy4meqlnB1gN+5wzfxdYEwHm46hH
 SR6WqR12yVaCMIzh66y8nTJyMbIykaBbfFJb3WesdDrBIYUZ9f+7O+Xd0JS6Jykd
 2HBHOyaVS1/W75+y6w9JhTExBH5xieCpJVIYyAvifbn/pB8XmuTTwJ1Z3EJ8DzkK
 AN16i46bUiKNBdTYZUMhtKL4vHVfqLYMskgWe6IG7DmRLOwikR0uRVhuVqP/bmUj
 U128cUacGJT2AYbZarTAKmOa42nDj3TpJqRp1qit3y6Cun4vxKH+1A91UPd7IHTa
 M5H1bNSgfXMm8rU+JgfvXKqrCTckGn2OqlCkJfPV3RBeP9IcQBBF0vE3dnM/X2We
 dwbXeDfJvc+1s4/M41MOhyahTUbW+4iRK5UCZEt1mprTbtzHTlN7RROo7QLpFsWx
 T0Jqvsd1raAutPTgTjU7ToQwDpSQNnn4Y/KoEdpvOCXR8wU7Wo5/+Qa4tEkIY3W6
 mUFpJcgFC9QEKLuaNAofPIhMuZ/vzRVtpK7wbLn4KR5JZA8AxznenMFVg8YPWRFI
 4oga0kMFJ7t6z/CXHtrxFaLQ9e7WAUSRU6gPiz8As1F/K9N0JWMUfjuTJcgjUsF8
 bwdCNinwG8y3rrPUCrqbO5N766ZkLYd6NksKlmIyUvtCcS0ksbg=
 =mH38
 -----END PGP SIGNATURE-----

Merge tag 'x86_alternatives_for_v6.5' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull x86 instruction alternatives updates from Borislav Petkov:

 - Up until now the Fast Short Rep Mov optimizations implied the
   presence of the ERMS CPUID flag. AMD decoupled them with a BIOS
   setting so decouple that dependency in the kernel code too

 - Teach the alternatives machinery to handle relocations

 - Make debug_alternative accept flags in order to see only that set of
   patching done one is interested in

 - Other fixes, cleanups and optimizations to the patching code

* tag 'x86_alternatives_for_v6.5' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/alternative: PAUSE is not a NOP
  x86/alternatives: Add cond_resched() to text_poke_bp_batch()
  x86/nospec: Shorten RESET_CALL_DEPTH
  x86/alternatives: Add longer 64-bit NOPs
  x86/alternatives: Fix section mismatch warnings
  x86/alternative: Optimize returns patching
  x86/alternative: Complicate optimize_nops() some more
  x86/alternative: Rewrite optimize_nops() some
  x86/lib/memmove: Decouple ERMS from FSRM
  x86/alternative: Support relocations in alternatives
  x86/alternative: Make debug-alternative selective
2023-06-26 15:14:55 -07:00
David Sterba
b831306b3b btrfs: print assertion failure report and stack trace from the same line
Assertions reports are split into two parts, the exact file and location
of the condition and then the stack trace printed from
btrfs_assertfail(). This means all the stack traces report the same line
and this is what's typically reported by various tools, making it harder
to distinguish the reports.

  [403.2467] assertion failed: refcount_read(&block_group->refs) == 1, in fs/btrfs/block-group.c:4259
  [403.2479] ------------[ cut here ]------------
  [403.2484] kernel BUG at fs/btrfs/messages.c:259!
  [403.2488] invalid opcode: 0000 [#1] PREEMPT SMP KASAN
  [403.2493] CPU: 2 PID: 23202 Comm: umount Not tainted 6.2.0-rc4-default+ #67
  [403.2499] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.0-0-gd239552-rebuilt.opensuse.org 04/01/2014
  [403.2509] RIP: 0010:btrfs_assertfail+0x19/0x1b [btrfs]
  ...
  [403.2595] Call Trace:
  [403.2598]  <TASK>
  [403.2601]  btrfs_free_block_groups.cold+0x52/0xae [btrfs]
  [403.2608]  close_ctree+0x6c2/0x761 [btrfs]
  [403.2613]  ? __wait_for_common+0x2b8/0x360
  [403.2618]  ? btrfs_cleanup_one_transaction.cold+0x7a/0x7a [btrfs]
  [403.2626]  ? mark_held_locks+0x6b/0x90
  [403.2630]  ? lockdep_hardirqs_on_prepare+0x13d/0x200
  [403.2636]  ? __call_rcu_common.constprop.0+0x1ea/0x3d0
  [403.2642]  ? trace_hardirqs_on+0x2d/0x110
  [403.2646]  ? __call_rcu_common.constprop.0+0x1ea/0x3d0
  [403.2652]  generic_shutdown_super+0xb0/0x1c0
  [403.2657]  kill_anon_super+0x1e/0x40
  [403.2662]  btrfs_kill_super+0x25/0x30 [btrfs]
  [403.2668]  deactivate_locked_super+0x4c/0xc0

By making btrfs_assertfail a macro we'll get the same line number for
the BUG output:

  [63.5736] assertion failed: 0, in fs/btrfs/super.c:1572
  [63.5758] ------------[ cut here ]------------
  [63.5782] kernel BUG at fs/btrfs/super.c:1572!
  [63.5807] invalid opcode: 0000 [#2] PREEMPT SMP KASAN
  [63.5831] CPU: 0 PID: 859 Comm: mount Tainted: G      D            6.3.0-rc7-default+ #2062
  [63.5868] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.14.0-0-g155821a-rebuilt.opensuse.org 04/01/2014
  [63.5905] RIP: 0010:btrfs_mount+0x24/0x30 [btrfs]
  [63.5964] RSP: 0018:ffff88800e69fcd8 EFLAGS: 00010246
  [63.5982] RAX: 000000000000002d RBX: ffff888008fc1400 RCX: 0000000000000000
  [63.6004] RDX: 0000000000000000 RSI: ffffffffb90fd868 RDI: ffffffffbcc3ff20
  [63.6026] RBP: ffffffffc081b200 R08: 0000000000000001 R09: ffff88800e69fa27
  [63.6046] R10: ffffed1001cd3f44 R11: 0000000000000001 R12: ffff888005a3c370
  [63.6062] R13: ffffffffc058e830 R14: 0000000000000000 R15: 00000000ffffffff
  [63.6081] FS:  00007f7b3561f800(0000) GS:ffff88806c600000(0000) knlGS:0000000000000000
  [63.6105] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
  [63.6120] CR2: 00007fff83726e10 CR3: 0000000002a9e000 CR4: 00000000000006b0
  [63.6137] Call Trace:
  [63.6143]  <TASK>
  [63.6148]  legacy_get_tree+0x80/0xd0
  [63.6158]  vfs_get_tree+0x43/0x120
  [63.6166]  do_new_mount+0x1f3/0x3d0
  [63.6176]  ? do_add_mount+0x140/0x140
  [63.6187]  ? cap_capable+0xa4/0xe0
  [63.6197]  path_mount+0x223/0xc10

This comes at a cost of bloating the final btrfs.ko module due all the
inlining, as long as assertions are compiled in. This is a must for
debugging builds but this is often enabled on release builds too.

Release build:

   text    data     bss     dec     hex filename
1251676   20317   16088 1288081  13a791 pre/btrfs.ko
1260612   29473   16088 1306173  13ee3d post/btrfs.ko

DELTA: +8936

CC: Josh Poimboeuf <jpoimboe@kernel.org>
Signed-off-by: David Sterba <dsterba@suse.com>
2023-06-19 13:59:31 +02:00
Josh Poimboeuf
b4c96ef0ad objtool: Skip reading DWARF section data
Objtool doesn't use DWARF at all, and the DWARF sections' data take up a
lot of memory.  Skip reading them.

Note this only skips the DWARF base sections, not the rela sections.
The relas are needed because their symbol references may need to be
reindexed if any local symbols get added by elf_create_symbol().

Also note the DWARF data will eventually be read by libelf anyway, when
writing the object file.  But that's fine, the goal here is to reduce
*peak* memory usage, and the previous patch (which freed insn memory)
gave some breathing room.  So the allocation gets shifted to a later
time, resulting in lower peak memory usage.

With allyesconfig + CONFIG_DEBUG_INFO:

- Before: peak heap memory consumption: 29.93G
- After:  peak heap memory consumption: 25.47G

Link: https://lore.kernel.org/r/52a9698835861dd35f2ec35c49f96d0bb39fb177.1685464332.git.jpoimboe@kernel.org
Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
2023-06-07 10:03:27 -07:00
Josh Poimboeuf
d93b5935fd objtool: Free insns when done
Free the decoded instructions as they're no longer needed after this
point.  This frees up a big chunk of heap, which will come handy when
skipping the reading of DWARF section data.

Link: https://lore.kernel.org/r/4d4bca1a0f869de020dac80d91f9acbf6df77eab.1685464332.git.jpoimboe@kernel.org
Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
2023-06-07 10:03:27 -07:00
Josh Poimboeuf
ec24b927c1 objtool: Get rid of reloc->rel[a]
Get the relocation entry info from the underlying rsec->data.

With allyesconfig + CONFIG_DEBUG_INFO:

- Before: peak heap memory consumption: 35.12G
- After:  peak heap memory consumption: 29.93G

Link: https://lore.kernel.org/r/2be32323de6d8cc73179ee0ff14b71f4e7cefaa0.1685464332.git.jpoimboe@kernel.org
Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
2023-06-07 10:03:26 -07:00
Josh Poimboeuf
02b5400106 objtool: Shrink elf hash nodes
Instead of using hlist for the 'struct elf' hashes, use a custom
single-linked list scheme.

With allyesconfig + CONFIG_DEBUG_INFO:

- Before: peak heap memory consumption: 36.89G
- After:  peak heap memory consumption: 35.12G

Link: https://lore.kernel.org/r/6e8cd305ed22e743c30d6e72cfdc1be20fb94cd4.1685464332.git.jpoimboe@kernel.org
Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
2023-06-07 10:03:25 -07:00
Josh Poimboeuf
890f10a433 objtool: Shrink reloc->sym_reloc_entry
Convert it to a singly-linked list.

With allyesconfig + CONFIG_DEBUG_INFO:

- Before: peak heap memory consumption: 38.64G
- After:  peak heap memory consumption: 36.89G

Link: https://lore.kernel.org/r/a51f0a6f9bbf2494d5a3a449807307e78a940988.1685464332.git.jpoimboe@kernel.org
Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
2023-06-07 10:03:25 -07:00
Josh Poimboeuf
be2f0b1e12 objtool: Get rid of reloc->jump_table_start
Rework the jump table logic slightly so 'jump_table_start' is no longer
needed.

With allyesconfig + CONFIG_DEBUG_INFO:

- Before: peak heap memory consumption: 40.37G
- After:  peak heap memory consumption: 38.64G

Link: https://lore.kernel.org/r/e1602ed8a6171ada3cfac0bd8449892ec82bd188.1685464332.git.jpoimboe@kernel.org
Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
2023-06-07 10:03:24 -07:00
Josh Poimboeuf
0696b6e314 objtool: Get rid of reloc->addend
Get the addend from the embedded GElf_Rel[a] struct.

With allyesconfig + CONFIG_DEBUG_INFO:

- Before: peak heap memory consumption: 42.10G
- After:  peak heap memory consumption: 40.37G

Link: https://lore.kernel.org/r/ad2354f95d9ddd86094e3f7687acfa0750657784.1685464332.git.jpoimboe@kernel.org
Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
2023-06-07 10:03:23 -07:00
Josh Poimboeuf
fcee899d27 objtool: Get rid of reloc->type
Get the type from the embedded GElf_Rel[a] struct.

Link: https://lore.kernel.org/r/d1c1f8da31e4f052a2478aea585fcf355cacc53a.1685464332.git.jpoimboe@kernel.org
Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
2023-06-07 10:03:22 -07:00
Josh Poimboeuf
e4cbb9b81f objtool: Get rid of reloc->offset
Get the offset from the embedded GElf_Rel[a] struct.

With allyesconfig + CONFIG_DEBUG_INFO:

- Before: peak heap memory consumption: 43.83G
- After:  peak heap memory consumption: 42.10G

Link: https://lore.kernel.org/r/2b9ec01178baa346a99522710bf2e82159412e3a.1685464332.git.jpoimboe@kernel.org
Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
2023-06-07 10:03:21 -07:00
Josh Poimboeuf
be9a4c1168 objtool: Get rid of reloc->idx
Use the array offset to calculate the reloc index.

With allyesconfig + CONFIG_DEBUG_INFO:

- Before: peak heap memory consumption: 45.56G
- After:  peak heap memory consumption: 43.83G

Link: https://lore.kernel.org/r/7351d2ebad0519027db14a32f6204af84952574a.1685464332.git.jpoimboe@kernel.org
Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
2023-06-07 10:03:21 -07:00
Josh Poimboeuf
ebcef730a1 objtool: Get rid of reloc->list
Now that all relocs are allocated in an array, the linked list is no
longer needed.

With allyesconfig + CONFIG_DEBUG_INFO:

- Before: peak heap memory consumption: 49.02G
- After:  peak heap memory consumption: 45.56G

Link: https://lore.kernel.org/r/71e7a2c017dbc46bb497857ec97d67214f832d10.1685464332.git.jpoimboe@kernel.org
Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
2023-06-07 10:03:20 -07:00
Josh Poimboeuf
e0a9349b4c objtool: Allocate relocs in advance for new rela sections
Similar to read_relocs(), allocate the reloc structs all together in an
array rather than allocating them one at a time.

Link: https://lore.kernel.org/r/5332d845c5a2d6c2d052075b381bfba8bcb67ed5.1685464332.git.jpoimboe@kernel.org
Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
2023-06-07 10:03:20 -07:00
Josh Poimboeuf
caa4a6b74b objtool: Add for_each_reloc()
Link: https://lore.kernel.org/r/dbfcb1037d8b958e52d097b67829c4c6811c24bb.1685464332.git.jpoimboe@kernel.org
Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
2023-06-07 10:03:19 -07:00
Josh Poimboeuf
5201a9bcb7 objtool: Don't free memory in elf_close()
It's not necessary, objtool's about to exit anyway.

Link: https://lore.kernel.org/r/74bdb3058b8f029db8d5b3b5175f2a200804196d.1685464332.git.jpoimboe@kernel.org
Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
2023-06-07 10:03:19 -07:00
Josh Poimboeuf
fcf933552b objtool: Keep GElf_Rel[a] structs synced
Keep the GElf_Rela structs synced with their 'struct reloc' counterparts
instead of having to go back and "rebuild" them later.

Link: https://lore.kernel.org/r/156d8a3e528a11e5c8577cf552890ed1f2b9567b.1685464332.git.jpoimboe@kernel.org
Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
2023-06-07 10:03:18 -07:00
Josh Poimboeuf
6342a20efb objtool: Add elf_create_section_pair()
When creating an annotation section, allocate the reloc section data at
the beginning.  This simplifies the data model a bit and also saves
memory due to the removal of malloc() in elf_rebuild_reloc_section().

With allyesconfig + CONFIG_DEBUG_INFO:

- Before: peak heap memory consumption: 53.49G
- After:  peak heap memory consumption: 49.02G

Link: https://lore.kernel.org/r/048e908f3ede9b66c15e44672b6dda992b1dae3e.1685464332.git.jpoimboe@kernel.org
Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
2023-06-07 10:03:17 -07:00
Josh Poimboeuf
ff4082730c objtool: Add mark_sec_changed()
Ensure elf->changed always gets set when sec->changed gets set.

Link: https://lore.kernel.org/r/9a810a8d2e28af6ba07325362d0eb4703bb09d3a.1685464332.git.jpoimboe@kernel.org
Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
2023-06-07 10:03:16 -07:00
Josh Poimboeuf
eb0481bbc4 objtool: Fix reloc_hash size
With CONFIG_DEBUG_INFO, DWARF creates a lot of relocations and
reloc_hash is woefully undersized, which can affect performance
significantly.  Fix that.

Link: https://lore.kernel.org/r/38ef60dc8043270bf3b9dfd139ae2a30ca3f75cc.1685464332.git.jpoimboe@kernel.org
Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
2023-06-07 10:03:16 -07:00
Josh Poimboeuf
53257a977a objtool: Consolidate rel/rela handling
The GElf_Rel[a] structs have more similarities than differences.  It's
safe to hard-code the assumptions about their shared fields as they will
never change.  Consolidate their handling where possible, getting rid of
duplicated code.

Also, at least for now we only ever create rela sections, so simplify
the relocation creation code to be rela-only.

Link: https://lore.kernel.org/r/dcabf6df400ca500ea929f1e4284f5e5ec0b27c8.1685464332.git.jpoimboe@kernel.org
Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
2023-06-07 10:03:15 -07:00
Josh Poimboeuf
a5bd623653 objtool: Improve reloc naming
- The term "reloc" is overloaded to mean both "an instance of struct
  reloc" and "a reloc section".  Change the latter to "rsec".

- For variable names, use "sec" for regular sections and "rsec" for rela
  sections to prevent them getting mixed up.

- For struct reloc variables, use "reloc" instead of "rel" everywhere
  for consistency.

Link: https://lore.kernel.org/r/8b790e403df46f445c21003e7893b8f53b99a6f3.1685464332.git.jpoimboe@kernel.org
Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
2023-06-07 10:03:14 -07:00
Josh Poimboeuf
2707579dfa objtool: Remove flags argument from elf_create_section()
Simplify the elf_create_section() interface a bit by removing the flags
argument.  Most callers don't care about changing the section header
flags.  If needed, they can be modified afterwards, just like any other
section header field.

Link: https://lore.kernel.org/r/515235d9cf62637a14bee37bfa9169ef20065471.1685464332.git.jpoimboe@kernel.org
Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
2023-06-07 10:03:13 -07:00
Josh Poimboeuf
809373e17b objtool: Tidy elf.h
Reorganize elf.h a bit:

- Move the prototypes higher up so they can be used by the inline
  functions.

- Move hash-related code to the bottom.

- Remove the unused ELF_HASH_BITS macro.

No functional changes.

Link: https://lore.kernel.org/r/b1490ed85951868219a6ece177a7cd30a6454d66.1685464332.git.jpoimboe@kernel.org
Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
2023-06-07 10:03:13 -07:00
Josh Poimboeuf
1e4b619185 objtool: Allow stack operations in UNWIND_HINT_UNDEFINED regions
If the code specified UNWIND_HINT_UNDEFINED, skip the "undefined stack
state" warning due to a stack operation.  Just ignore the stack op and
continue to propagate the undefined state.

Link: https://lore.kernel.org/r/820c5b433f17c84e8761fb7465a8d319d706b1cf.1685981486.git.jpoimboe@kernel.org
Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
2023-06-07 10:03:11 -07:00
Lu Hongfei
d49d1666aa tools: Remove unnecessary variables
There are several places where warnings variables are not needed,
remove them and directly return 0.

Signed-off-by: Lu Hongfei <luhongfei@vivo.com>
Link: https://lore.kernel.org/r/20230530075649.21661-1-luhongfei@vivo.com
Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
2023-06-07 09:27:11 -07:00
Josh Poimboeuf
ff9a6459bb objtool: Add __kunit_abort() to noreturns
Fixes a bunch of warnings like:

  drivers/input/tests/input_test.o: warning: objtool: input_test_init+0x1cb: stack state mismatch: cfa1=4+64 cfa2=4+56
  lib/kunit/kunit-test.o: warning: objtool: kunit_log_newline_test+0xfb: return with modified stack frame
  ...

Fixes: 260755184c ("kunit: Move kunit_abort() call out of kunit_do_failed_assertion()")
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20230602175453.swsn3ehyochtwkhy@treble
2023-06-02 21:12:24 +02:00
Josh Poimboeuf
6245ce4ab6 objtool: Move noreturn function list to separate file
This makes it a little cleaner and easier to maintain.

Suggested-by: Peter Zijlstra <peterz@infradead.org>
Reviewed-by: Miroslav Benes <mbenes@suse.cz>
Link: https://lore.kernel.org/r/cecacf07a69a244c74474c18b7652627de67a528.1681853186.git.jpoimboe@kernel.org
Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
2023-05-16 06:31:55 -07:00
Josh Poimboeuf
34245659de objtool: Remove superfluous global_noreturns entries
lbug_with_loc() no longer exists, and resume_play_dead() is static
(objtool only checks globals and weaks).

Reviewed-by: Miroslav Benes <mbenes@suse.cz>
Link: https://lore.kernel.org/r/2725d7f2ccc2361c6903de9ebaa2b5bb304f7ac2.1681853186.git.jpoimboe@kernel.org
Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
2023-05-16 06:31:54 -07:00
Josh Poimboeuf
55eeab2a8a objtool: Ignore exc_double_fault() __noreturn warnings
This is a hack, but it works for now.

Problem is, exc_double_fault() may or may not return, depending on
whether CONFIG_X86_ESPFIX64 is set.  But objtool has no visibility to
the kernel config.

"Fix" it by silencing the exc_double_fault() __noreturn warning.

This removes the following warning:

  vmlinux.o: warning: objtool: xenpv_exc_double_fault+0xd: exc_double_fault() is missing a __noreturn annotation

Reviewed-by: Miroslav Benes <mbenes@suse.cz>
Link: https://lore.kernel.org/r/a45b085071d3a7d049a20f9e78754452336ecbe8.1681853186.git.jpoimboe@kernel.org
Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
2023-05-16 06:31:53 -07:00
Josh Poimboeuf
fedb724c3d objtool: Detect missing __noreturn annotations
Most "unreachable instruction" warnings these days seem to actually be
the result of a missing __noreturn annotation.  Add an explicit check
for that.

Reviewed-by: Miroslav Benes <mbenes@suse.cz>
Link: https://lore.kernel.org/r/6e2b93d8c65eaed6c4166a358269dc0ef01f890c.1681853186.git.jpoimboe@kernel.org
Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
2023-05-16 06:31:53 -07:00
Josh Poimboeuf
ced23d2e37 objtool: Include backtrace in verbose mode
Include backtrace in verbose mode.  This makes it easy to gather all the
information needed for diagnosing objtool warnings.

Reviewed-by: Miroslav Benes <mbenes@suse.cz>
Link: https://lore.kernel.org/r/c255224fabcf7e64bac232fec1c77c9fc2d7d7ab.1681853186.git.jpoimboe@kernel.org
Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
2023-05-16 06:31:52 -07:00
Josh Poimboeuf
ca653464dd objtool: Add verbose option for disassembling affected functions
When a warning is associated with a function, add an option to
disassemble that function.

This makes it easier for reporters to submit the information needed to
diagnose objtool warnings.

Reviewed-by: Miroslav Benes <mbenes@suse.cz>
Link: https://lore.kernel.org/r/dd0fe13428ede186f09c74059a8001f4adcea5fc.1681853186.git.jpoimboe@kernel.org
Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
2023-05-16 06:31:51 -07:00
Josh Poimboeuf
5e3992fe72 objtool: Limit unreachable warnings to once per function
Unreachable instruction warnings are limited to once per object file.
That no longer makes sense for vmlinux validation, which might have
more unreachable instructions lurking in other places.  Change it to
once per function.

Note this affects some other (much rarer) non-fatal warnings as well.
In general I think one-warning-per-function makes sense, as related
warnings can accumulate quickly and we want to eventually get back to
failing the build with -Werror anyway.

Reviewed-by: Miroslav Benes <mbenes@suse.cz>
Link: https://lore.kernel.org/r/9d38f881bfc34e031c74e4e90064ccb3e49f599a.1681853186.git.jpoimboe@kernel.org
Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
2023-05-16 06:31:51 -07:00
Peter Zijlstra
270a69c448 x86/alternative: Support relocations in alternatives
A little while ago someone (Kirill) ran into the whole 'alternatives don't
do relocations nonsense' again and I got annoyed enough to actually look
at the code.

Since the whole alternative machinery already fully decodes the
instructions it is simple enough to adjust immediates and displacement
when needed. Specifically, the immediates for IP modifying instructions
(JMP, CALL, Jcc) and the displacement for RIP-relative instructions.

  [ bp: Massage comment some more and get rid of third loop in
    apply_relocation(). ]

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Link: https://lore.kernel.org/r/20230208171431.313857925@infradead.org
2023-05-10 14:47:08 +02:00
Linus Torvalds
2aff7c706c Objtool changes for v6.4:
- Mark arch_cpu_idle_dead() __noreturn, make all architectures & drivers that did
    this inconsistently follow this new, common convention, and fix all the fallout
    that objtool can now detect statically.
 
  - Fix/improve the ORC unwinder becoming unreliable due to UNWIND_HINT_EMPTY ambiguity,
    split it into UNWIND_HINT_END_OF_STACK and UNWIND_HINT_UNDEFINED to resolve it.
 
  - Fix noinstr violations in the KCSAN code and the lkdtm/stackleak code.
 
  - Generate ORC data for __pfx code
 
  - Add more __noreturn annotations to various kernel startup/shutdown/panic functions.
 
  - Misc improvements & fixes.
 
 Signed-off-by: Ingo Molnar <mingo@kernel.org>
 -----BEGIN PGP SIGNATURE-----
 
 iQJFBAABCgAvFiEEBpT5eoXrXCwVQwEKEnMQ0APhK1gFAmRK1x0RHG1pbmdvQGtl
 cm5lbC5vcmcACgkQEnMQ0APhK1ghxQ/+IkCynMYtdF5OG9YwbcGJqsPSfOPMEcEM
 pUSFYg+gGPBDT/fJfcVSqvUtdnWbLC2kXt9yiswXz3X3J2nmNkBk5YKQftsNDcul
 TmKeqIIAK51XTncpegKH0EGnOX63oZ9Vxa8CTPdDlb+YF23Km2FoudGRI9F5qbUd
 LoraXqGYeiaeySkGyWmZVl6Uc8dIxnMkTN3H/oI9aB6TOrsi059hAtFcSaFfyemP
 c4LqXXCH7k2baiQt+qaLZ8cuZVG/+K5r2N2cmjO5kmJc6ynIaFnfMe4XxZLjp5LT
 /PulYI15bXkvSARKx5CRh/CDHMOx5Blw+ASO0RhWbdy0WH4ZhhcaVF5AeIpPW86a
 1LBcz97rMp72WmvKgrJeVO1r9+ll4SI6/YKGJRsxsCMdP3hgFpqntXyVjTFNdTM1
 0gH6H5v55x06vJHvhtTk8SR3PfMTEM2fRU5jXEOrGowoGifx+wNUwORiwj6LE3KQ
 SKUdT19RNzoW3VkFxhgk65ThK1S7YsJUKRoac3YdhttpqqqtFV//erenrZoR4k/p
 vzvKy68EQ7RCNyD5wNWNFe0YjeJl5G8gQ8bUm4Xmab7djjgz+pn4WpQB8yYKJLAo
 x9dqQ+6eUbw3Hcgk6qQ9E+r/svbulnAL0AeALAWK/91DwnZ2mCzKroFkLN7napKi
 fRho4CqzrtM=
 =NwEV
 -----END PGP SIGNATURE-----

Merge tag 'objtool-core-2023-04-27' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull objtool updates from Ingo Molnar:

 - Mark arch_cpu_idle_dead() __noreturn, make all architectures &
   drivers that did this inconsistently follow this new, common
   convention, and fix all the fallout that objtool can now detect
   statically

 - Fix/improve the ORC unwinder becoming unreliable due to
   UNWIND_HINT_EMPTY ambiguity, split it into UNWIND_HINT_END_OF_STACK
   and UNWIND_HINT_UNDEFINED to resolve it

 - Fix noinstr violations in the KCSAN code and the lkdtm/stackleak code

 - Generate ORC data for __pfx code

 - Add more __noreturn annotations to various kernel startup/shutdown
   and panic functions

 - Misc improvements & fixes

* tag 'objtool-core-2023-04-27' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (52 commits)
  x86/hyperv: Mark hv_ghcb_terminate() as noreturn
  scsi: message: fusion: Mark mpt_halt_firmware() __noreturn
  x86/cpu: Mark {hlt,resume}_play_dead() __noreturn
  btrfs: Mark btrfs_assertfail() __noreturn
  objtool: Include weak functions in global_noreturns check
  cpu: Mark nmi_panic_self_stop() __noreturn
  cpu: Mark panic_smp_self_stop() __noreturn
  arm64/cpu: Mark cpu_park_loop() and friends __noreturn
  x86/head: Mark *_start_kernel() __noreturn
  init: Mark start_kernel() __noreturn
  init: Mark [arch_call_]rest_init() __noreturn
  objtool: Generate ORC data for __pfx code
  x86/linkage: Fix padding for typed functions
  objtool: Separate prefix code from stack validation code
  objtool: Remove superfluous dead_end_function() check
  objtool: Add symbol iteration helpers
  objtool: Add WARN_INSN()
  scripts/objdump-func: Support multiple functions
  context_tracking: Fix KCSAN noinstr violation
  objtool: Add stackleak instrumentation to uaccess safe list
  ...
2023-04-28 14:02:54 -07:00
Linus Torvalds
85d7ab2463 for-6.4-tag
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEE8rQSAMVO+zA4DBdWxWXV+ddtWDsFAmRHC3gACgkQxWXV+ddt
 WDvI/A//ZzREEE0wNexbuidoTacDVXVJ6LBb2K1eP+HUKfsmd6GYWQDJ9x/ExpKb
 T1ehLibCYWLeYxEREFbjXI3x9G8mrvLzvzsqXs/MzJPkmEF1igPddFztidBwvLQH
 ey/Bh+cra2bpVhRhkX0Cf09/q/YWp17/d14ZxxW60PMfyhx8RWXejXhHkulOPVv8
 +3FL8E0kc2Zjx9ioUwOy/i18LR6YzsCNVXoHzUZuWyWM4A7NG2TZR6FhuLSjlWSZ
 3RAnROwr+8i5nR0xchcyYaVMO2LMbqH6mBtHnXCtxCr+4pFrfrvKym+CQco/Xriz
 v1y/xDc23XeYXLCVhb0beJ6uRcjaM9+gvDF1oVBSJEv6V7sQr/tEGo/8QRehfEfT
 FTro7Lf89R1GOa1IBSkv/T5S25d9LlIID3/g7PbcUBtXNKvLAjDAGTH9bzL4HS5x
 /MKwN80GvaGs1KyEfUndbVPIpAwNFDYZPHM7nw1x+JTkIBcHgfjRyAMAC9jrJd0D
 730W04c+0nXZtQGtKKsxc3U8y4ewzSJAKx9t7Vgo7+1P6dSRnzvJee3x/5kXV9Yn
 MhxxzYDfIN9EcWbASdSm11gY5WZdG3an609pO7nc1T2K4Tuo0SPs4xOR7c3xuZrY
 MN5z3QFWyI2ustUuTG+nsd5J81j76DEmj5ymWQfG3SBplTneDM0=
 =Jt7p
 -----END PGP SIGNATURE-----

Merge tag 'for-6.4-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux

Pull btrfs updates from David Sterba:
 "Mostly core changes and cleanups, some notable fixes and two
  performance improvements in directory logging.

  The IO path cleanups are removing or refactoring old code, scrub main
  loop has been completely rewritten also refactoring old code.

  There are some changes to non-btrfs code, mostly trivial, the cgroup
  punt bio logic is only moved from generic code.

  Performance improvements:

   - improve logging changes in a directory during one transaction,
     avoid iterating over items and reduce lock contention (fsync time
     4x lower)

   - when logging directory entries during one transaction, reduce
     locking of subvolume trees by checking tree-log instead
     (improvement in throughput and latency for concurrent access to a
     subvolume)

  Notable fixes:

   - dev-replace:
      - properly honor read mode when requested to avoid reading from
        source device
      - target device won't be used for eventual read repair, this is
        unreliable for NODATASUM files
      - when there are unpaired (and unrepairable) metadata during
        replace, exit early with error and don't try to finish whole
        operation

   - scrub ioctl properly rejects unknown flags

   - fix global block reserve calculations

   - fix partial direct io write when there's a page fault in the
     middle, iomap will try to continue with partial request but the
     btrfs part did not match that, this can lead to zeros written
     instead of data

  Core changes:

   - io path:
      - continued cleanups and refactoring around bio handling
      - extent io submit path simplifications and cleanups
      - flush write path simplifications and cleanups
      - rework logic of passing sync mode of bio, with further cleanups

   - rewrite scrub code flow, restructure how the stripes are enumerated
     and verified in a more unified way

   - allow to set lower threshold for block group reclaim in debug mode
     to aid zoned mode testing

   - remove obsolete time-based delayed ref throttling logic when
     truncating items

   - DREW locks are not using percpu variables anymore

   - more warning fixes (-Wmaybe-uninitialized)

   - u64 division simplifications

   - error handling improvements

  Non-btrfs code changes:

   - push cgroup punt bio logic to btrfs code (there was no other user
     of that), the functionality can be now selected separately by
     BLK_CGROUP_PUNT_BIO

   - crc32c_impl removed after removing last uses in btrfs code

   - add btrfs_assertfail() to objtool table"

* tag 'for-6.4-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux: (147 commits)
  btrfs: mark btrfs_assertfail() __noreturn
  btrfs: fix uninitialized variable warnings
  btrfs: use log root when iterating over index keys when logging directory
  btrfs: avoid iterating over all indexes when logging directory
  btrfs: dev-replace: error out if we have unrepaired metadata error during
  btrfs: remove pointless loop at btrfs_get_next_valid_item()
  btrfs: scrub: reject unsupported scrub flags
  btrfs: reinterpret async discard iops_limit=0 as no delay
  btrfs: set default discard iops_limit to 1000
  btrfs: remove unused raid56 functions which were dedicated for scrub
  btrfs: scrub: remove scrub_bio structure
  btrfs: scrub: remove scrub_block and scrub_sector structures
  btrfs: scrub: remove the old scrub recheck code
  btrfs: scrub: remove the old writeback infrastructure
  btrfs: scrub: remove scrub_parity structure
  btrfs: scrub: use scrub_stripe to implement RAID56 P/Q scrub
  btrfs: scrub: switch scrub_simple_mirror() to scrub_stripe infrastructure
  btrfs: scrub: introduce helper to queue a stripe for scrub
  btrfs: scrub: introduce error reporting functionality for scrub_stripe
  btrfs: scrub: introduce a writeback helper for scrub_stripe
  ...
2023-04-26 09:13:44 -07:00
Linus Torvalds
c23f28975a Commit volume in documentation is relatively low this time, but there is
still a fair amount going on, including:
 
 - Reorganizing the architecture-specific documentation under
   Documentation/arch.  This makes the structure match the source directory
   and helps to clean up the mess that is the top-level Documentation
   directory a bit.  This work creates the new directory and moves x86 and
   most of the less-active architectures there.  The current plan is to move
   the rest of the architectures in 6.5, with the patches going through the
   appropriate subsystem trees.
 
 - Some more Spanish translations and maintenance of the Italian
   translation.
 
 - A new "Kernel contribution maturity model" document from Ted.
 
 - A new tutorial on quickly building a trimmed kernel from Thorsten.
 
 Plus the usual set of updates and fixes.
 -----BEGIN PGP SIGNATURE-----
 
 iQFDBAABCAAtFiEEIw+MvkEiF49krdp9F0NaE2wMflgFAmRGze0PHGNvcmJldEBs
 d24ubmV0AAoJEBdDWhNsDH5Y/VsH/RyWqinorRVFZmHqRJMRhR0j7hE2pAgK5prE
 dGXYVtHHNQ+25thNaqhZTOLYFbSX6ii2NG7sLRXmyOTGIZrhUCFFXCHkuq4ZUypR
 gJpMUiKQVT4dhln3gIZ0k09NSr60gz8UTcq895N9UFpUdY1SCDhbCcLc4uXTRajq
 NrdgFaHWRkPb+gBRbXOExYm75DmCC6Ny5AyGo2rXfItV//ETjWIJVQpJhlxKrpMZ
 3LgpdYSLhEFFnFGnXJ+EAPJ7gXDi2Tg5DuPbkvJyFOTouF3j4h8lSS9l+refMljN
 xNRessv+boge/JAQidS6u8F2m2ESSqSxisv/0irgtKIMJwXaoX4=
 =1//8
 -----END PGP SIGNATURE-----

Merge tag 'docs-6.4' of git://git.lwn.net/linux

Pull documentation updates from Jonathan Corbet:
 "Commit volume in documentation is relatively low this time, but there
  is still a fair amount going on, including:

   - Reorganize the architecture-specific documentation under
     Documentation/arch

     This makes the structure match the source directory and helps to
     clean up the mess that is the top-level Documentation directory a
     bit. This work creates the new directory and moves x86 and most of
     the less-active architectures there.

     The current plan is to move the rest of the architectures in 6.5,
     with the patches going through the appropriate subsystem trees.

   - Some more Spanish translations and maintenance of the Italian
     translation

   - A new "Kernel contribution maturity model" document from Ted

   - A new tutorial on quickly building a trimmed kernel from Thorsten

  Plus the usual set of updates and fixes"

* tag 'docs-6.4' of git://git.lwn.net/linux: (47 commits)
  media: Adjust column width for pdfdocs
  media: Fix building pdfdocs
  docs: clk: add documentation to log which clocks have been disabled
  docs: trace: Fix typo in ftrace.rst
  Documentation/process: always CC responsible lists
  docs: kmemleak: adjust to config renaming
  ELF: document some de-facto PT_* ABI quirks
  Documentation: arm: remove stih415/stih416 related entries
  docs: turn off "smart quotes" in the HTML build
  Documentation: firmware: Clarify firmware path usage
  docs/mm: Physical Memory: Fix grammar
  Documentation: Add document for false sharing
  dma-api-howto: typo fix
  docs: move m68k architecture documentation under Documentation/arch/
  docs: move parisc documentation under Documentation/arch/
  docs: move ia64 architecture docs under Documentation/arch/
  docs: Move arc architecture docs under Documentation/arch/
  docs: move nios2 documentation under Documentation/arch/
  docs: move openrisc documentation under Documentation/arch/
  docs: move superh documentation under Documentation/arch/
  ...
2023-04-24 12:35:49 -07:00
Linus Torvalds
427fda2c8a x86: improve on the non-rep 'copy_user' function
The old 'copy_user_generic_unrolled' function was oddly implemented for
largely historical reasons: it had been largely based on the uncached
copy case, which has some other concerns.

For example, the __copy_user_nocache() function uses 'movnti' for the
destination stores, and those want the destination to be aligned.  In
contrast, the regular copy function doesn't really care, and trying to
align things only complicates matters.

Also, like the clear_user function, the copy function had some odd
handling of the repeat counts, complicating the exception handling for
no really good reason.  So as with clear_user, just write it to keep all
the byte counts in the %rcx register, exactly like the 'rep movs'
functionality that this replaces.

Unlike a real 'rep movs', we do allow for this to trash a few temporary
registers to not have to unnecessarily save/restore registers on the
stack.

And like the clearing case, rename this to what it now clearly is:
'rep_movs_alternative', and make it one coherent function, so that it
shows up as such in profiles (instead of the odd split between
"copy_user_generic_unrolled" and "copy_user_short_string", the latter of
which was not about strings at all, and which was shared with the
uncached case).

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2023-04-18 17:05:28 -07:00
Linus Torvalds
8c9b6a88b7 x86: improve on the non-rep 'clear_user' function
The old version was oddly written to have the repeat count in multiple
registers.  So instead of taking advantage of %rax being zero, it had
some sub-counts in it.  All just for a "single word clearing" loop,
which isn't even efficient to begin with.

So get rid of those games, and just keep all the state in the same
registers we got it in (and that we should return things in).  That not
only makes this act much more like 'rep stos' (which this function is
replacing), but makes it much easier to actually do the obvious loop
unrolling.

Also rename the function from the now nonsensical 'clear_user_original'
to what it now clearly is: 'rep_stos_alternative'.

End result: if we don't have a fast 'rep stosb', at least we can have a
fast fallback for it.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2023-04-18 17:05:28 -07:00
Linus Torvalds
577e6a7fd5 x86: inline the 'rep movs' in user copies for the FSRM case
This does the same thing for the user copies as commit 0db7058e8e
("x86/clear_user: Make it faster") did for clear_user().  In other
words, it inlines the "rep movs" case when X86_FEATURE_FSRM is set,
avoiding the function call entirely.

In order to do that, it makes the calling convention for the out-of-line
case ("copy_user_generic_unrolled") match the 'rep movs' calling
convention, although it does also end up clobbering a number of
additional registers.

Also, to simplify code sharing in the low-level assembly with the
__copy_user_nocache() function (that uses the normal C calling
convention), we end up with a kind of mixed return value for the
low-level asm code: it will return the result in both %rcx (to work as
an alternative for the 'rep movs' case), _and_ in %rax (for the nocache
case).

We could avoid this by wrapping __copy_user_nocache() callers in an
inline asm, but since the cost is just an extra register copy, it's
probably not worth it.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2023-04-18 17:05:28 -07:00
Linus Torvalds
3639a53558 x86: move stac/clac from user copy routines into callers
This is preparatory work for inlining the 'rep movs' case, but also a
cleanup.  The __copy_user_nocache() function was mis-used by the rdma
code to do uncached kernel copies that don't actually want user copies
at all, and as a result doesn't want the stac/clac either.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2023-04-18 17:05:28 -07:00
Linus Torvalds
d2c95f9d68 x86: don't use REP_GOOD or ERMS for user memory clearing
The modern target to use is FSRS (Fast Short REP STOS), and the other
cases should only be used for bigger areas (ie mainly things like page
clearing).

Note! This changes the conditional for the inlining from FSRM ("fast
short rep movs") to FSRS ("fast short rep stos").

We'll have a separate fixup for AMD microarchitectures that have a good
'rep stosb' yet do not set the new Intel-specific FSRS bit (because FSRM
was there first).

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2023-04-18 17:05:28 -07:00
Josh Poimboeuf
f372463124 btrfs: mark btrfs_assertfail() __noreturn
Fixes a bunch of warnings including:

  vmlinux.o: warning: objtool: select_reloc_root+0x314: unreachable instruction
  vmlinux.o: warning: objtool: finish_inode_if_needed+0x15b1: unreachable instruction
  vmlinux.o: warning: objtool: get_bio_sector_nr+0x259: unreachable instruction
  vmlinux.o: warning: objtool: raid_wait_read_end_io+0xc26: unreachable instruction
  vmlinux.o: warning: objtool: raid56_parity_alloc_scrub_rbio+0x37b: unreachable instruction
  ...

Reported-by: kernel test robot <lkp@intel.com>
Link: https://lore.kernel.org/oe-kbuild-all/202302210709.IlXfgMpX-lkp@intel.com/
Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2023-04-17 19:52:19 +02:00
Guilherme G. Piccoli
611d4c716d x86/hyperv: Mark hv_ghcb_terminate() as noreturn
Annotate the function prototype and definition as noreturn to prevent
objtool warnings like:

vmlinux.o: warning: objtool: hyperv_init+0x55c: unreachable instruction

Also, as per Josh's suggestion, add it to the global_noreturns list.
As a comparison, an objdump output without the annotation:

[...]
1b63:  mov    $0x1,%esi
1b68:  xor    %edi,%edi
1b6a:  callq  ffffffff8102f680 <hv_ghcb_terminate>
1b6f:  jmpq   ffffffff82f217ec <hyperv_init+0x9c> # unreachable
1b74:  cmpq   $0xffffffffffffffff,-0x702a24(%rip)
[...]

Now, after adding the __noreturn to the function prototype:

[...]
17df:  callq  ffffffff8102f6d0 <hv_ghcb_negotiate_protocol>
17e4:  test   %al,%al
17e6:  je     ffffffff82f21bb9 <hyperv_init+0x469>
[...]  <many insns>
1bb9:  mov    $0x1,%esi
1bbe:  xor    %edi,%edi
1bc0:  callq  ffffffff8102f680 <hv_ghcb_terminate>
1bc5:  nopw   %cs:0x0(%rax,%rax,1) # end of function

Reported-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Guilherme G. Piccoli <gpiccoli@igalia.com>
Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Michael Kelley <mikelley@microsoft.com>
Link: https://lore.kernel.org/r/32453a703dfcf0d007b473c9acbf70718222b74b.1681342859.git.jpoimboe@kernel.org
2023-04-14 17:31:28 +02:00
Josh Poimboeuf
6e36a56a5f scsi: message: fusion: Mark mpt_halt_firmware() __noreturn
mpt_halt_firmware() doesn't return.  Mark it as such.

Fixes the following warnings:

  vmlinux.o: warning: objtool: mptscsih_abort+0x7f4: unreachable instruction
  vmlinux.o: warning: objtool: mptctl_timeout_expired+0x310: unreachable instruction

Reported-by: kernel test robot <lkp@intel.com>
Reported-by: Mark Rutland <mark.rutland@arm.com>
Debugged-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lore.kernel.org/r/d8129817423422355bf30e90dadc6764261b53e0.1681342859.git.jpoimboe@kernel.org
2023-04-14 17:31:27 +02:00
Josh Poimboeuf
52668badd3 x86/cpu: Mark {hlt,resume}_play_dead() __noreturn
Fixes the following warning:

  vmlinux.o: warning: objtool: resume_play_dead+0x21: unreachable instruction

Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lore.kernel.org/r/ce1407c4bf88b1334fe40413126343792a77ca50.1681342859.git.jpoimboe@kernel.org
2023-04-14 17:31:27 +02:00
Josh Poimboeuf
09c5ae30d0 btrfs: Mark btrfs_assertfail() __noreturn
Fixes a bunch of warnings including:

  vmlinux.o: warning: objtool: select_reloc_root+0x314: unreachable instruction
  vmlinux.o: warning: objtool: finish_inode_if_needed+0x15b1: unreachable instruction
  vmlinux.o: warning: objtool: get_bio_sector_nr+0x259: unreachable instruction
  vmlinux.o: warning: objtool: raid_wait_read_end_io+0xc26: unreachable instruction
  vmlinux.o: warning: objtool: raid56_parity_alloc_scrub_rbio+0x37b: unreachable instruction
  ...

Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lore.kernel.org/r/960bd9c0c9e3cfc409ba9c35a17644b11b832956.1681342859.git.jpoimboe@kernel.org
2023-04-14 17:31:26 +02:00
Josh Poimboeuf
1c47c8758a objtool: Include weak functions in global_noreturns check
If a global function doesn't return, and its prototype has the
__noreturn attribute, its weak counterpart must also not return so that
it matches the prototype and meets call site expectations.

To properly follow the compiled control flow at the call sites, change
the global_noreturns check to include both global and weak functions.

On the other hand, if a weak function isn't in global_noreturns, assume
the prototype doesn't have __noreturn.  Even if the weak function
doesn't return, call sites treat it like a returnable function.

Fixes the following warning:

  kernel/sched/build_policy.o: warning: objtool: do_idle() falls through to next function play_idle_precise()

Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Miroslav Benes <mbenes@suse.cz>
Link: https://lore.kernel.org/r/ede3460d63f4a65d282c86f1175bd2662c2286ba.1681342859.git.jpoimboe@kernel.org
2023-04-14 17:31:26 +02:00
Josh Poimboeuf
27dea14c7f cpu: Mark nmi_panic_self_stop() __noreturn
In preparation for improving objtool's handling of weak noreturn
functions, mark nmi_panic_self_stop() __noreturn.

Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lore.kernel.org/r/316fc6dfab5a8c4e024c7185484a1ee5fb0afb79.1681342859.git.jpoimboe@kernel.org
2023-04-14 17:31:26 +02:00
Josh Poimboeuf
7412a60dec cpu: Mark panic_smp_self_stop() __noreturn
In preparation for improving objtool's handling of weak noreturn
functions, mark panic_smp_self_stop() __noreturn.

Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lore.kernel.org/r/92d76ab5c8bf660f04fdcd3da1084519212de248.1681342859.git.jpoimboe@kernel.org
2023-04-14 17:31:25 +02:00
Josh Poimboeuf
4208d2d798 x86/head: Mark *_start_kernel() __noreturn
Now that start_kernel() is __noreturn, mark its chain of callers
__noreturn.

Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lore.kernel.org/r/c2525f96b88be98ee027ee0291d58003036d4120.1681342859.git.jpoimboe@kernel.org
2023-04-14 17:31:24 +02:00
Josh Poimboeuf
25a6917ca6 init: Mark start_kernel() __noreturn
Now that arch_call_rest_init() is __noreturn, mark its caller
start_kernel() __noreturn.

Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lore.kernel.org/r/7069acf026a195f26a88061227fba5a3b0337b9a.1681342859.git.jpoimboe@kernel.org
2023-04-14 17:31:23 +02:00
Josh Poimboeuf
9ea7e6b62c init: Mark [arch_call_]rest_init() __noreturn
In preparation for improving objtool's handling of weak noreturn
functions, mark start_kernel(), arch_call_rest_init(), and rest_init()
__noreturn.

Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
Link: https://lore.kernel.org/r/7194ed8a989a85b98d92e62df660f4a90435a723.1681342859.git.jpoimboe@kernel.org
2023-04-14 17:31:23 +02:00
Josh Poimboeuf
5743654f5e objtool: Generate ORC data for __pfx code
Allow unwinding from prefix code by copying the CFI from the starting
instruction of the corresponding function.  Even when the NOPs are
replaced, they're still stack-invariant instructions so the same ORC
entry can be reused everywhere.

Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/bc3344e51f3e87102f1301a0be0f72a7689ea4a4.1681331135.git.jpoimboe@kernel.org
2023-04-14 16:08:30 +02:00
Josh Poimboeuf
bd456a1bed objtool: Separate prefix code from stack validation code
Simplify the prefix code by moving it after
validate_reachable_instructions().

Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/d7f31ac2de462d0cd7b1db01b7ecb525c057c8f6.1681331135.git.jpoimboe@kernel.org
2023-04-14 16:08:29 +02:00
Josh Poimboeuf
6126ed5dfb objtool: Remove superfluous dead_end_function() check
annotate_call_site() already sets 'insn->dead_end' for calls to dead end
functions.

Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/5d603a301e9a8b1036b61503385907e154867ace.1681325924.git.jpoimboe@kernel.org
2023-04-14 16:08:29 +02:00
Josh Poimboeuf
9290e772ba objtool: Add symbol iteration helpers
Add [sec_]for_each_sym() and use them.

Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/59023e5886ab125aa30702e633be7732b1acaa7e.1681325924.git.jpoimboe@kernel.org
2023-04-14 16:08:29 +02:00
Josh Poimboeuf
246b2c8548 objtool: Add WARN_INSN()
It's easier to use and also gives easy access to the instruction's
containing function, which is useful for printing that function's
symbol.  It will also be useful in the future for rate-limiting and
disassembly of warned functions.

Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/2eaa3155c90fba683d8723599f279c46025b75f3.1681325924.git.jpoimboe@kernel.org
2023-04-14 16:08:28 +02:00
Josh Poimboeuf
7f530fba11 objtool: Add stackleak instrumentation to uaccess safe list
If a function has a large stack frame, the stackleak plugin adds a call
to stackleak_track_stack() after the prologue.

This function may be called in uaccess-enabled code.  Add it to the
uaccess safe list.

Fixes the following warning:

  vmlinux.o: warning: objtool: kasan_report+0x12: call to stackleak_track_stack() with UACCESS enabled

Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/42e9b487ef89e9b237fd5220ad1c7cf1a2ad7eb8.1681320562.git.jpoimboe@kernel.org
2023-04-14 16:08:27 +02:00
Josh Poimboeuf
e18398e80c Revert "objtool: Support addition to set CFA base"
Commit 468af56a7b ("objtool: Support addition to set CFA base") was
added as a preparatory patch for arm64 support, but that support never
came.  It triggers a false positive warning on x86, so just revert it
for now.

Fixes the following warning:

  vmlinux.o: warning: objtool: cdce925_regmap_i2c_write+0xdb: stack state mismatch: cfa1=4+120 cfa2=5+40

Fixes: 468af56a7b ("objtool: Support addition to set CFA base")
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lore.kernel.org/oe-kbuild-all/202304080538.j5G6h1AB-lkp@intel.com/
2023-04-14 16:08:27 +02:00
Jonathan Corbet
ff61f0791c docs: move x86 documentation into Documentation/arch/
Move the x86 documentation under Documentation/arch/ as a way of cleaning
up the top-level directory and making the structure of our docs more
closely match the structure of the source directories it describes.

All in-kernel references to the old paths have been updated.

Acked-by: Dave Hansen <dave.hansen@linux.intel.com>
Cc: linux-arch@vger.kernel.org
Cc: x86@kernel.org
Cc: Borislav Petkov <bp@alien8.de>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/lkml/20230315211523.108836-1-corbet@lwn.net/
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
2023-03-30 12:58:51 -06:00
Josh Poimboeuf
fb799447ae x86,objtool: Split UNWIND_HINT_EMPTY in two
Mark reported that the ORC unwinder incorrectly marks an unwind as
reliable when the unwind terminates prematurely in the dark corners of
return_to_handler() due to lack of information about the next frame.

The problem is UNWIND_HINT_EMPTY is used in two different situations:

  1) The end of the kernel stack unwind before hitting user entry, boot
     code, or fork entry

  2) A blind spot in ORC coverage where the unwinder has to bail due to
     lack of information about the next frame

The ORC unwinder has no way to tell the difference between the two.
When it encounters an undefined stack state with 'end=1', it blindly
marks the stack reliable, which can break the livepatch consistency
model.

Fix it by splitting UNWIND_HINT_EMPTY into UNWIND_HINT_UNDEFINED and
UNWIND_HINT_END_OF_STACK.

Reported-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lore.kernel.org/r/fd6212c8b450d3564b855e1cb48404d6277b4d9f.1677683419.git.jpoimboe@kernel.org
2023-03-23 23:18:58 +01:00
Josh Poimboeuf
4708ea14be x86,objtool: Separate unret validation from unwind hints
The ENTRY unwind hint type is serving double duty as both an empty
unwind hint and an unret validation annotation.

Unret validation is unrelated to unwinding. Separate it out into its own
annotation.

Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lore.kernel.org/r/ff7448d492ea21b86d8a90264b105fbd0d751077.1677683419.git.jpoimboe@kernel.org
2023-03-23 23:18:58 +01:00
Josh Poimboeuf
f902cfdd46 x86,objtool: Introduce ORC_TYPE_*
Unwind hints and ORC entry types are two distinct things.  Separate them
out more explicitly.

Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lore.kernel.org/r/cc879d38fff8a43f8f7beb2fd56e35a5a384d7cd.1677683419.git.jpoimboe@kernel.org
2023-03-23 23:18:57 +01:00
Josh Poimboeuf
f7515d9fe8 objtool: Add objtool_types.h
Reduce the amount of header sync churn by splitting the shared objtool.h
types into a new file.

Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lore.kernel.org/r/dec622720851210ceafa12d4f4c5f9e73c832152.1677683419.git.jpoimboe@kernel.org
2023-03-23 23:18:56 +01:00
Josh Poimboeuf
071c44e427 sched/idle: Mark arch_cpu_idle_dead() __noreturn
Before commit 076cbf5d2163 ("x86/xen: don't let xen_pv_play_dead()
return"), in Xen, when a previously offlined CPU was brought back
online, it unexpectedly resumed execution where it left off in the
middle of the idle loop.

There were some hacks to make that work, but the behavior was surprising
as do_idle() doesn't expect an offlined CPU to return from the dead (in
arch_cpu_idle_dead()).

Now that Xen has been fixed, and the arch-specific implementations of
arch_cpu_idle_dead() also don't return, give it a __noreturn attribute.

This will cause the compiler to complain if an arch-specific
implementation might return.  It also improves code generation for both
caller and callee.

Also fixes the following warning:

  vmlinux.o: warning: objtool: do_idle+0x25f: unreachable instruction

Reported-by: Paul E. McKenney <paulmck@kernel.org>
Tested-by: Paul E. McKenney <paulmck@kernel.org>
Link: https://lore.kernel.org/r/60d527353da8c99d4cf13b6473131d46719ed16d.1676358308.git.jpoimboe@kernel.org
Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
2023-03-08 08:44:28 -08:00
Linus Torvalds
857f1268a5 Changes in this cycle were:
- Shrink 'struct instruction', to improve objtool performance & memory
    footprint.
 
  - Other maximum memory usage reductions - this makes the build both faster,
    and fixes kernel build OOM failures on allyesconfig and similar configs
    when they try to build the final (large) vmlinux.o.
 
  - Fix ORC unwinding when a kprobe (INT3) is set on a stack-modifying
    single-byte instruction (PUSH/POP or LEAVE). This requires the
    extension of the ORC metadata structure with a 'signal' field.
 
  - Misc fixes & cleanups.
 
 Signed-off-by: Ingo Molnar <mingo@kernel.org>
 -----BEGIN PGP SIGNATURE-----
 
 iQJFBAABCgAvFiEEBpT5eoXrXCwVQwEKEnMQ0APhK1gFAmQAVp8RHG1pbmdvQGtl
 cm5lbC5vcmcACgkQEnMQ0APhK1gV6A//YbWb4nNxYbRFBd1O3FnFfy4efrDQ4btI
 hwkL6f7jka9RnIpIEatJvaLdNvyN5tuPCC/+B5eVnvFdd1JcBUmj5D+zYFt6H6qt
 BG4M6TNHFkP1kOJVfFGn8UPRfoMz2oMiEqilpsc1Yuf7b3ldMJtGUoHaeZC9pyqe
 RUisKNw4WHZp2G/gTBUWxW17xpWY3Awgch/w4HCu8wMnR+uEC44i0UCBfnAadl36
 ar66PfhMJcQIv0XkK9wu43g7+HFnjpxHOx35JW3lRot0xRnwl/JcsmaX5iPkh0gt
 HV8eLH80J0homeMZDY7vWIKJxGeLkIdfjO5gxwTdnFc9rQw3GwHp1B7WTS6J3Vwe
 gM00kyaGly3CvkKMiz5QQBfViWCjE25nYS8X0i9Oz6Gk58IkRPGByaDTKRjNrDJB
 BwH9DE9xb3dPVZRv/PejkTdggQWo+FDTrL8ulHIjUFK11M7VubwkskecNHkfpAOE
 TRy5iLjMocF8u7hdyec6Mma2K6qEndC2Rw9ZMPQ7TeieMsBcl63cSRgSJLFfdRhr
 /5c6Hr2SNQKU8xu+3j49GyBwFvp4CwCa+GPs9/o+l0uCvuKNIn9B788cm4TjxLJ9
 C3PRzE6B/CaLhYvlC5k5cNM+I4YpoMU/mvSvY6HcC0Duj2nSAWS2VV60MVMDpqVX
 8nK4xnla2tM=
 =bpPY
 -----END PGP SIGNATURE-----

Merge tag 'objtool-core-2023-03-02' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull objtool updates from Ingo Molnar:

 - Shrink 'struct instruction', to improve objtool performance & memory
   footprint

 - Other maximum memory usage reductions - this makes the build both
   faster, and fixes kernel build OOM failures on allyesconfig and
   similar configs when they try to build the final (large) vmlinux.o

 - Fix ORC unwinding when a kprobe (INT3) is set on a stack-modifying
   single-byte instruction (PUSH/POP or LEAVE). This requires the
   extension of the ORC metadata structure with a 'signal' field

 - Misc fixes & cleanups

* tag 'objtool-core-2023-03-02' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (22 commits)
  objtool: Fix ORC 'signal' propagation
  objtool: Remove instruction::list
  x86: Fix FILL_RETURN_BUFFER
  objtool: Fix overlapping alternatives
  objtool: Union instruction::{call_dest,jump_table}
  objtool: Remove instruction::reloc
  objtool: Shrink instruction::{type,visited}
  objtool: Make instruction::alts a single-linked list
  objtool: Make instruction::stack_ops a single-linked list
  objtool: Change arch_decode_instruction() signature
  x86/entry: Fix unwinding from kprobe on PUSH/POP instruction
  x86/unwind/orc: Add 'signal' field to ORC metadata
  objtool: Optimize layout of struct special_alt
  objtool: Optimize layout of struct symbol
  objtool: Allocate multiple structures with calloc()
  objtool: Make struct check_options static
  objtool: Make struct entries[] static and const
  objtool: Fix HOSTCC flag usage
  objtool: Properly support make V=1
  objtool: Install libsubcmd in build
  ...
2023-03-02 09:45:34 -08:00
Linus Torvalds
3822a7c409 - Daniel Verkamp has contributed a memfd series ("mm/memfd: add
F_SEAL_EXEC") which permits the setting of the memfd execute bit at
   memfd creation time, with the option of sealing the state of the X bit.
 
 - Peter Xu adds a patch series ("mm/hugetlb: Make huge_pte_offset()
   thread-safe for pmd unshare") which addresses a rare race condition
   related to PMD unsharing.
 
 - Several folioification patch serieses from Matthew Wilcox, Vishal
   Moola, Sidhartha Kumar and Lorenzo Stoakes
 
 - Johannes Weiner has a series ("mm: push down lock_page_memcg()") which
   does perform some memcg maintenance and cleanup work.
 
 - SeongJae Park has added DAMOS filtering to DAMON, with the series
   "mm/damon/core: implement damos filter".  These filters provide users
   with finer-grained control over DAMOS's actions.  SeongJae has also done
   some DAMON cleanup work.
 
 - Kairui Song adds a series ("Clean up and fixes for swap").
 
 - Vernon Yang contributed the series "Clean up and refinement for maple
   tree".
 
 - Yu Zhao has contributed the "mm: multi-gen LRU: memcg LRU" series.  It
   adds to MGLRU an LRU of memcgs, to improve the scalability of global
   reclaim.
 
 - David Hildenbrand has added some userfaultfd cleanup work in the
   series "mm: uffd-wp + change_protection() cleanups".
 
 - Christoph Hellwig has removed the generic_writepages() library
   function in the series "remove generic_writepages".
 
 - Baolin Wang has performed some maintenance on the compaction code in
   his series "Some small improvements for compaction".
 
 - Sidhartha Kumar is doing some maintenance work on struct page in his
   series "Get rid of tail page fields".
 
 - David Hildenbrand contributed some cleanup, bugfixing and
   generalization of pte management and of pte debugging in his series "mm:
   support __HAVE_ARCH_PTE_SWP_EXCLUSIVE on all architectures with swap
   PTEs".
 
 - Mel Gorman and Neil Brown have removed the __GFP_ATOMIC allocation
   flag in the series "Discard __GFP_ATOMIC".
 
 - Sergey Senozhatsky has improved zsmalloc's memory utilization with his
   series "zsmalloc: make zspage chain size configurable".
 
 - Joey Gouly has added prctl() support for prohibiting the creation of
   writeable+executable mappings.  The previous BPF-based approach had
   shortcomings.  See "mm: In-kernel support for memory-deny-write-execute
   (MDWE)".
 
 - Waiman Long did some kmemleak cleanup and bugfixing in the series
   "mm/kmemleak: Simplify kmemleak_cond_resched() & fix UAF".
 
 - T.J.  Alumbaugh has contributed some MGLRU cleanup work in his series
   "mm: multi-gen LRU: improve".
 
 - Jiaqi Yan has provided some enhancements to our memory error
   statistics reporting, mainly by presenting the statistics on a per-node
   basis.  See the series "Introduce per NUMA node memory error
   statistics".
 
 - Mel Gorman has a second and hopefully final shot at fixing a CPU-hog
   regression in compaction via his series "Fix excessive CPU usage during
   compaction".
 
 - Christoph Hellwig does some vmalloc maintenance work in the series
   "cleanup vfree and vunmap".
 
 - Christoph Hellwig has removed block_device_operations.rw_page() in ths
   series "remove ->rw_page".
 
 - We get some maple_tree improvements and cleanups in Liam Howlett's
   series "VMA tree type safety and remove __vma_adjust()".
 
 - Suren Baghdasaryan has done some work on the maintainability of our
   vm_flags handling in the series "introduce vm_flags modifier functions".
 
 - Some pagemap cleanup and generalization work in Mike Rapoport's series
   "mm, arch: add generic implementation of pfn_valid() for FLATMEM" and
   "fixups for generic implementation of pfn_valid()"
 
 - Baoquan He has done some work to make /proc/vmallocinfo and
   /proc/kcore better represent the real state of things in his series
   "mm/vmalloc.c: allow vread() to read out vm_map_ram areas".
 
 - Jason Gunthorpe rationalized the GUP system's interface to the rest of
   the kernel in the series "Simplify the external interface for GUP".
 
 - SeongJae Park wishes to migrate people from DAMON's debugfs interface
   over to its sysfs interface.  To support this, we'll temporarily be
   printing warnings when people use the debugfs interface.  See the series
   "mm/damon: deprecate DAMON debugfs interface".
 
 - Andrey Konovalov provided the accurately named "lib/stackdepot: fixes
   and clean-ups" series.
 
 - Huang Ying has provided a dramatic reduction in migration's TLB flush
   IPI rates with the series "migrate_pages(): batch TLB flushing".
 
 - Arnd Bergmann has some objtool fixups in "objtool warning fixes".
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYIAB0WIQTTMBEPP41GrTpTJgfdBJ7gKXxAjgUCY/PoPQAKCRDdBJ7gKXxA
 jlvpAPsFECUBBl20qSue2zCYWnHC7Yk4q9ytTkPB/MMDrFEN9wD/SNKEm2UoK6/K
 DmxHkn0LAitGgJRS/W9w81yrgig9tAQ=
 =MlGs
 -----END PGP SIGNATURE-----

Merge tag 'mm-stable-2023-02-20-13-37' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm

Pull MM updates from Andrew Morton:

 - Daniel Verkamp has contributed a memfd series ("mm/memfd: add
   F_SEAL_EXEC") which permits the setting of the memfd execute bit at
   memfd creation time, with the option of sealing the state of the X
   bit.

 - Peter Xu adds a patch series ("mm/hugetlb: Make huge_pte_offset()
   thread-safe for pmd unshare") which addresses a rare race condition
   related to PMD unsharing.

 - Several folioification patch serieses from Matthew Wilcox, Vishal
   Moola, Sidhartha Kumar and Lorenzo Stoakes

 - Johannes Weiner has a series ("mm: push down lock_page_memcg()")
   which does perform some memcg maintenance and cleanup work.

 - SeongJae Park has added DAMOS filtering to DAMON, with the series
   "mm/damon/core: implement damos filter".

   These filters provide users with finer-grained control over DAMOS's
   actions. SeongJae has also done some DAMON cleanup work.

 - Kairui Song adds a series ("Clean up and fixes for swap").

 - Vernon Yang contributed the series "Clean up and refinement for maple
   tree".

 - Yu Zhao has contributed the "mm: multi-gen LRU: memcg LRU" series. It
   adds to MGLRU an LRU of memcgs, to improve the scalability of global
   reclaim.

 - David Hildenbrand has added some userfaultfd cleanup work in the
   series "mm: uffd-wp + change_protection() cleanups".

 - Christoph Hellwig has removed the generic_writepages() library
   function in the series "remove generic_writepages".

 - Baolin Wang has performed some maintenance on the compaction code in
   his series "Some small improvements for compaction".

 - Sidhartha Kumar is doing some maintenance work on struct page in his
   series "Get rid of tail page fields".

 - David Hildenbrand contributed some cleanup, bugfixing and
   generalization of pte management and of pte debugging in his series
   "mm: support __HAVE_ARCH_PTE_SWP_EXCLUSIVE on all architectures with
   swap PTEs".

 - Mel Gorman and Neil Brown have removed the __GFP_ATOMIC allocation
   flag in the series "Discard __GFP_ATOMIC".

 - Sergey Senozhatsky has improved zsmalloc's memory utilization with
   his series "zsmalloc: make zspage chain size configurable".

 - Joey Gouly has added prctl() support for prohibiting the creation of
   writeable+executable mappings.

   The previous BPF-based approach had shortcomings. See "mm: In-kernel
   support for memory-deny-write-execute (MDWE)".

 - Waiman Long did some kmemleak cleanup and bugfixing in the series
   "mm/kmemleak: Simplify kmemleak_cond_resched() & fix UAF".

 - T.J. Alumbaugh has contributed some MGLRU cleanup work in his series
   "mm: multi-gen LRU: improve".

 - Jiaqi Yan has provided some enhancements to our memory error
   statistics reporting, mainly by presenting the statistics on a
   per-node basis. See the series "Introduce per NUMA node memory error
   statistics".

 - Mel Gorman has a second and hopefully final shot at fixing a CPU-hog
   regression in compaction via his series "Fix excessive CPU usage
   during compaction".

 - Christoph Hellwig does some vmalloc maintenance work in the series
   "cleanup vfree and vunmap".

 - Christoph Hellwig has removed block_device_operations.rw_page() in
   ths series "remove ->rw_page".

 - We get some maple_tree improvements and cleanups in Liam Howlett's
   series "VMA tree type safety and remove __vma_adjust()".

 - Suren Baghdasaryan has done some work on the maintainability of our
   vm_flags handling in the series "introduce vm_flags modifier
   functions".

 - Some pagemap cleanup and generalization work in Mike Rapoport's
   series "mm, arch: add generic implementation of pfn_valid() for
   FLATMEM" and "fixups for generic implementation of pfn_valid()"

 - Baoquan He has done some work to make /proc/vmallocinfo and
   /proc/kcore better represent the real state of things in his series
   "mm/vmalloc.c: allow vread() to read out vm_map_ram areas".

 - Jason Gunthorpe rationalized the GUP system's interface to the rest
   of the kernel in the series "Simplify the external interface for
   GUP".

 - SeongJae Park wishes to migrate people from DAMON's debugfs interface
   over to its sysfs interface. To support this, we'll temporarily be
   printing warnings when people use the debugfs interface. See the
   series "mm/damon: deprecate DAMON debugfs interface".

 - Andrey Konovalov provided the accurately named "lib/stackdepot: fixes
   and clean-ups" series.

 - Huang Ying has provided a dramatic reduction in migration's TLB flush
   IPI rates with the series "migrate_pages(): batch TLB flushing".

 - Arnd Bergmann has some objtool fixups in "objtool warning fixes".

* tag 'mm-stable-2023-02-20-13-37' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (505 commits)
  include/linux/migrate.h: remove unneeded externs
  mm/memory_hotplug: cleanup return value handing in do_migrate_range()
  mm/uffd: fix comment in handling pte markers
  mm: change to return bool for isolate_movable_page()
  mm: hugetlb: change to return bool for isolate_hugetlb()
  mm: change to return bool for isolate_lru_page()
  mm: change to return bool for folio_isolate_lru()
  objtool: add UACCESS exceptions for __tsan_volatile_read/write
  kmsan: disable ftrace in kmsan core code
  kasan: mark addr_has_metadata __always_inline
  mm: memcontrol: rename memcg_kmem_enabled()
  sh: initialize max_mapnr
  m68k/nommu: add missing definition of ARCH_PFN_OFFSET
  mm: percpu: fix incorrect size in pcpu_obj_full_size()
  maple_tree: reduce stack usage with gcc-9 and earlier
  mm: page_alloc: call panic() when memoryless node allocation fails
  mm: multi-gen LRU: avoid futile retries
  migrate_pages: move THP/hugetlb migration support check to simplify code
  migrate_pages: batch flushing TLB
  migrate_pages: share more code between _unmap and _move
  ...
2023-02-23 17:09:35 -08:00
Josh Poimboeuf
00c8f01c4e objtool: Fix ORC 'signal' propagation
There have been some recently reported ORC unwinder warnings like:

  WARNING: can't access registers at entry_SYSCALL_64_after_hwframe+0x63/0xcd
  WARNING: stack going in the wrong direction? at __sys_setsockopt+0x2c6/0x5b0 net/socket.c:2271

And a KASAN warning:

  BUG: KASAN: stack-out-of-bounds in unwind_next_frame (arch/x86/include/asm/ptrace.h:136 arch/x86/kernel/unwind_orc.c:455)

It turns out the 'signal' bit isn't getting propagated from the unwind
hints to the ORC entries, making the unwinder confused at times.

Fixes: ffb1b4a410 ("x86/unwind/orc: Add 'signal' field to ORC metadata")
Reported-by: kernel test robot <oliver.sang@intel.com>
Reported-by: Dmitry Vyukov <dvyukov@google.com>
Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lore.kernel.org/r/97eef9db60cd86d376a9a40d49d77bb67a8f6526.1676579666.git.jpoimboe@kernel.org
2023-02-23 09:21:50 +01:00
Peter Zijlstra
1c34496e58 objtool: Remove instruction::list
Replace the instruction::list by allocating instructions in arrays of
256 entries and stringing them together by (amortized) find_insn().
This shrinks instruction by 16 bytes and brings it down to 128.

 struct instruction {
-	struct list_head           list;                 /*     0    16 */
-	struct hlist_node          hash;                 /*    16    16 */
-	struct list_head           call_node;            /*    32    16 */
-	struct section *           sec;                  /*    48     8 */
-	long unsigned int          offset;               /*    56     8 */
-	/* --- cacheline 1 boundary (64 bytes) --- */
-	long unsigned int          immediate;            /*    64     8 */
-	unsigned int               len;                  /*    72     4 */
-	u8                         type;                 /*    76     1 */
-
-	/* Bitfield combined with previous fields */
+	struct hlist_node          hash;                 /*     0    16 */
+	struct list_head           call_node;            /*    16    16 */
+	struct section *           sec;                  /*    32     8 */
+	long unsigned int          offset;               /*    40     8 */
+	long unsigned int          immediate;            /*    48     8 */
+	u8                         len;                  /*    56     1 */
+	u8                         prev_len;             /*    57     1 */
+	u8                         type;                 /*    58     1 */
+	s8                         instr;                /*    59     1 */
+	u32                        idx:8;                /*    60: 0  4 */
+	u32                        dead_end:1;           /*    60: 8  4 */
+	u32                        ignore:1;             /*    60: 9  4 */
+	u32                        ignore_alts:1;        /*    60:10  4 */
+	u32                        hint:1;               /*    60:11  4 */
+	u32                        save:1;               /*    60:12  4 */
+	u32                        restore:1;            /*    60:13  4 */
+	u32                        retpoline_safe:1;     /*    60:14  4 */
+	u32                        noendbr:1;            /*    60:15  4 */
+	u32                        entry:1;              /*    60:16  4 */
+	u32                        visited:4;            /*    60:17  4 */
+	u32                        no_reloc:1;           /*    60:21  4 */

-	u16                        dead_end:1;           /*    76: 8  2 */
-	u16                        ignore:1;             /*    76: 9  2 */
-	u16                        ignore_alts:1;        /*    76:10  2 */
-	u16                        hint:1;               /*    76:11  2 */
-	u16                        save:1;               /*    76:12  2 */
-	u16                        restore:1;            /*    76:13  2 */
-	u16                        retpoline_safe:1;     /*    76:14  2 */
-	u16                        noendbr:1;            /*    76:15  2 */
-	u16                        entry:1;              /*    78: 0  2 */
-	u16                        visited:4;            /*    78: 1  2 */
-	u16                        no_reloc:1;           /*    78: 5  2 */
+	/* XXX 10 bits hole, try to pack */

-	/* XXX 2 bits hole, try to pack */
-	/* Bitfield combined with next fields */
-
-	s8                         instr;                /*    79     1 */
-	struct alt_group *         alt_group;            /*    80     8 */
-	struct instruction *       jump_dest;            /*    88     8 */
-	struct instruction *       first_jump_src;       /*    96     8 */
+	/* --- cacheline 1 boundary (64 bytes) --- */
+	struct alt_group *         alt_group;            /*    64     8 */
+	struct instruction *       jump_dest;            /*    72     8 */
+	struct instruction *       first_jump_src;       /*    80     8 */
 	union {
-		struct symbol *    _call_dest;           /*   104     8 */
-		struct reloc *     _jump_table;          /*   104     8 */
-	};                                               /*   104     8 */
-	struct alternative *       alts;                 /*   112     8 */
-	struct symbol *            sym;                  /*   120     8 */
-	/* --- cacheline 2 boundary (128 bytes) --- */
-	struct stack_op *          stack_ops;            /*   128     8 */
-	struct cfi_state *         cfi;                  /*   136     8 */
+		struct symbol *    _call_dest;           /*    88     8 */
+		struct reloc *     _jump_table;          /*    88     8 */
+	};                                               /*    88     8 */
+	struct alternative *       alts;                 /*    96     8 */
+	struct symbol *            sym;                  /*   104     8 */
+	struct stack_op *          stack_ops;            /*   112     8 */
+	struct cfi_state *         cfi;                  /*   120     8 */

-	/* size: 144, cachelines: 3, members: 28 */
-	/* sum members: 142 */
-	/* sum bitfield members: 14 bits, bit holes: 1, sum bit holes: 2 bits */
-	/* last cacheline: 16 bytes */
+	/* size: 128, cachelines: 2, members: 29 */
+	/* sum members: 124 */
+	/* sum bitfield members: 22 bits, bit holes: 1, sum bit holes: 10 bits */
 };

pre:	5:38.18 real,   213.25 user,    124.90 sys,     23449040 mem
post:	5:03.34 real,   210.75 user,    88.80 sys,      20241232 mem

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Acked-by: Josh Poimboeuf <jpoimboe@kernel.org>
Tested-by: Nathan Chancellor <nathan@kernel.org> # build only
Tested-by: Thomas Weißschuh <linux@weissschuh.net> # compile and run
Link: https://lore.kernel.org/r/20230208172245.851307606@infradead.org
2023-02-23 09:21:44 +01:00
Peter Zijlstra
a706bb08c8 objtool: Fix overlapping alternatives
Things like ALTERNATIVE_{2,3}() generate multiple alternatives on the
same place, objtool would override the first orig_alt_group with the
second (or third), failing to check the CFI among all the different
variants.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Acked-by: Josh Poimboeuf <jpoimboe@kernel.org>
Tested-by: Nathan Chancellor <nathan@kernel.org> # build only
Tested-by: Thomas Weißschuh <linux@weissschuh.net> # compile and run
Link: https://lore.kernel.org/r/20230208172245.711471461@infradead.org
2023-02-23 09:21:33 +01:00
Peter Zijlstra
c6f5dc28fb objtool: Union instruction::{call_dest,jump_table}
The instruction call_dest and jump_table members can never be used at
the same time, their usage depends on type.

 struct instruction {
 	struct list_head           list;                 /*     0    16 */
 	struct hlist_node          hash;                 /*    16    16 */
 	struct list_head           call_node;            /*    32    16 */
 	struct section *           sec;                  /*    48     8 */
 	long unsigned int          offset;               /*    56     8 */
 	/* --- cacheline 1 boundary (64 bytes) --- */
 	long unsigned int          immediate;            /*    64     8 */
 	unsigned int               len;                  /*    72     4 */
 	u8                         type;                 /*    76     1 */

 	/* Bitfield combined with previous fields */

 	u16                        dead_end:1;           /*    76: 8  2 */
 	u16                        ignore:1;             /*    76: 9  2 */
 	u16                        ignore_alts:1;        /*    76:10  2 */
 	u16                        hint:1;               /*    76:11  2 */
 	u16                        save:1;               /*    76:12  2 */
 	u16                        restore:1;            /*    76:13  2 */
 	u16                        retpoline_safe:1;     /*    76:14  2 */
 	u16                        noendbr:1;            /*    76:15  2 */
 	u16                        entry:1;              /*    78: 0  2 */
 	u16                        visited:4;            /*    78: 1  2 */
 	u16                        no_reloc:1;           /*    78: 5  2 */

 	/* XXX 2 bits hole, try to pack */
 	/* Bitfield combined with next fields */

 	s8                         instr;                /*    79     1 */
 	struct alt_group *         alt_group;            /*    80     8 */
-	struct symbol *            call_dest;            /*    88     8 */
-	struct instruction *       jump_dest;            /*    96     8 */
-	struct instruction *       first_jump_src;       /*   104     8 */
-	struct reloc *             jump_table;           /*   112     8 */
-	struct alternative *       alts;                 /*   120     8 */
+	struct instruction *       jump_dest;            /*    88     8 */
+	struct instruction *       first_jump_src;       /*    96     8 */
+	union {
+		struct symbol *    _call_dest;           /*   104     8 */
+		struct reloc *     _jump_table;          /*   104     8 */
+	};                                               /*   104     8 */
+	struct alternative *       alts;                 /*   112     8 */
+	struct symbol *            sym;                  /*   120     8 */
 	/* --- cacheline 2 boundary (128 bytes) --- */
-	struct symbol *            sym;                  /*   128     8 */
-	struct stack_op *          stack_ops;            /*   136     8 */
-	struct cfi_state *         cfi;                  /*   144     8 */
+	struct stack_op *          stack_ops;            /*   128     8 */
+	struct cfi_state *         cfi;                  /*   136     8 */

-	/* size: 152, cachelines: 3, members: 29 */
-	/* sum members: 150 */
+	/* size: 144, cachelines: 3, members: 28 */
+	/* sum members: 142 */
 	/* sum bitfield members: 14 bits, bit holes: 1, sum bit holes: 2 bits */
-	/* last cacheline: 24 bytes */
+	/* last cacheline: 16 bytes */
 };

pre:	5:39.35 real,   215.58 user,    123.69 sys,     23448736 mem
post:	5:38.18 real,   213.25 user,    124.90 sys,     23449040 mem

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Acked-by: Josh Poimboeuf <jpoimboe@kernel.org>
Tested-by: Nathan Chancellor <nathan@kernel.org> # build only
Tested-by: Thomas Weißschuh <linux@weissschuh.net> # compile and run
Link: https://lore.kernel.org/r/20230208172245.640914454@infradead.org
2023-02-23 09:21:27 +01:00
Peter Zijlstra
0932dbe1f5 objtool: Remove instruction::reloc
Instead of caching the reloc for each instruction, only keep a
negative cache of not having a reloc (by far the most common case).

 struct instruction {
 	struct list_head           list;                 /*     0    16 */
 	struct hlist_node          hash;                 /*    16    16 */
 	struct list_head           call_node;            /*    32    16 */
 	struct section *           sec;                  /*    48     8 */
 	long unsigned int          offset;               /*    56     8 */
 	/* --- cacheline 1 boundary (64 bytes) --- */
 	long unsigned int          immediate;            /*    64     8 */
 	unsigned int               len;                  /*    72     4 */
 	u8                         type;                 /*    76     1 */

 	/* Bitfield combined with previous fields */

 	u16                        dead_end:1;           /*    76: 8  2 */
 	u16                        ignore:1;             /*    76: 9  2 */
 	u16                        ignore_alts:1;        /*    76:10  2 */
 	u16                        hint:1;               /*    76:11  2 */
 	u16                        save:1;               /*    76:12  2 */
 	u16                        restore:1;            /*    76:13  2 */
 	u16                        retpoline_safe:1;     /*    76:14  2 */
 	u16                        noendbr:1;            /*    76:15  2 */
 	u16                        entry:1;              /*    78: 0  2 */
 	u16                        visited:4;            /*    78: 1  2 */
+	u16                        no_reloc:1;           /*    78: 5  2 */

-	/* XXX 3 bits hole, try to pack */
+	/* XXX 2 bits hole, try to pack */
 	/* Bitfield combined with next fields */

 	s8                         instr;                /*    79     1 */
 	struct alt_group *         alt_group;            /*    80     8 */
 	struct symbol *            call_dest;            /*    88     8 */
 	struct instruction *       jump_dest;            /*    96     8 */
 	struct instruction *       first_jump_src;       /*   104     8 */
 	struct reloc *             jump_table;           /*   112     8 */
-	struct reloc *             reloc;                /*   120     8 */
+	struct alternative *       alts;                 /*   120     8 */
 	/* --- cacheline 2 boundary (128 bytes) --- */
-	struct alternative *       alts;                 /*   128     8 */
-	struct symbol *            sym;                  /*   136     8 */
-	struct stack_op *          stack_ops;            /*   144     8 */
-	struct cfi_state *         cfi;                  /*   152     8 */
+	struct symbol *            sym;                  /*   128     8 */
+	struct stack_op *          stack_ops;            /*   136     8 */
+	struct cfi_state *         cfi;                  /*   144     8 */

-	/* size: 160, cachelines: 3, members: 29 */
-	/* sum members: 158 */
-	/* sum bitfield members: 13 bits, bit holes: 1, sum bit holes: 3 bits */
-	/* last cacheline: 32 bytes */
+	/* size: 152, cachelines: 3, members: 29 */
+	/* sum members: 150 */
+	/* sum bitfield members: 14 bits, bit holes: 1, sum bit holes: 2 bits */
+	/* last cacheline: 24 bytes */
 };

pre:	5:48.89 real,   220.96 user,    127.55 sys,     24834672 mem
post:	5:39.35 real,   215.58 user,    123.69 sys,     23448736 mem

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Acked-by: Josh Poimboeuf <jpoimboe@kernel.org>
Tested-by: Nathan Chancellor <nathan@kernel.org> # build only
Tested-by: Thomas Weißschuh <linux@weissschuh.net> # compile and run
Link: https://lore.kernel.org/r/20230208172245.572145269@infradead.org
2023-02-23 09:21:17 +01:00
Peter Zijlstra
8b2de41215 objtool: Shrink instruction::{type,visited}
Since we don't have that many types in enum insn_type, force it into a
u8 and re-arrange member to get rid of the holes, saves another 8
bytes.

 struct instruction {
 	struct list_head           list;                 /*     0    16 */
 	struct hlist_node          hash;                 /*    16    16 */
 	struct list_head           call_node;            /*    32    16 */
 	struct section *           sec;                  /*    48     8 */
 	long unsigned int          offset;               /*    56     8 */
 	/* --- cacheline 1 boundary (64 bytes) --- */
-	unsigned int               len;                  /*    64     4 */
-	enum insn_type             type;                 /*    68     4 */
-	long unsigned int          immediate;            /*    72     8 */
-	u16                        dead_end:1;           /*    80: 0  2 */
-	u16                        ignore:1;             /*    80: 1  2 */
-	u16                        ignore_alts:1;        /*    80: 2  2 */
-	u16                        hint:1;               /*    80: 3  2 */
-	u16                        save:1;               /*    80: 4  2 */
-	u16                        restore:1;            /*    80: 5  2 */
-	u16                        retpoline_safe:1;     /*    80: 6  2 */
-	u16                        noendbr:1;            /*    80: 7  2 */
-	u16                        entry:1;              /*    80: 8  2 */
+	long unsigned int          immediate;            /*    64     8 */
+	unsigned int               len;                  /*    72     4 */
+	u8                         type;                 /*    76     1 */

-	/* XXX 7 bits hole, try to pack */
+	/* Bitfield combined with previous fields */

-	s8                         instr;                /*    82     1 */
-	u8                         visited;              /*    83     1 */
+	u16                        dead_end:1;           /*    76: 8  2 */
+	u16                        ignore:1;             /*    76: 9  2 */
+	u16                        ignore_alts:1;        /*    76:10  2 */
+	u16                        hint:1;               /*    76:11  2 */
+	u16                        save:1;               /*    76:12  2 */
+	u16                        restore:1;            /*    76:13  2 */
+	u16                        retpoline_safe:1;     /*    76:14  2 */
+	u16                        noendbr:1;            /*    76:15  2 */
+	u16                        entry:1;              /*    78: 0  2 */
+	u16                        visited:4;            /*    78: 1  2 */

-	/* XXX 4 bytes hole, try to pack */
+	/* XXX 3 bits hole, try to pack */
+	/* Bitfield combined with next fields */

-	struct alt_group *         alt_group;            /*    88     8 */
-	struct symbol *            call_dest;            /*    96     8 */
-	struct instruction *       jump_dest;            /*   104     8 */
-	struct instruction *       first_jump_src;       /*   112     8 */
-	struct reloc *             jump_table;           /*   120     8 */
+	s8                         instr;                /*    79     1 */
+	struct alt_group *         alt_group;            /*    80     8 */
+	struct symbol *            call_dest;            /*    88     8 */
+	struct instruction *       jump_dest;            /*    96     8 */
+	struct instruction *       first_jump_src;       /*   104     8 */
+	struct reloc *             jump_table;           /*   112     8 */
+	struct reloc *             reloc;                /*   120     8 */
 	/* --- cacheline 2 boundary (128 bytes) --- */
-	struct reloc *             reloc;                /*   128     8 */
-	struct alternative *       alts;                 /*   136     8 */
-	struct symbol *            sym;                  /*   144     8 */
-	struct stack_op *          stack_ops;            /*   152     8 */
-	struct cfi_state *         cfi;                  /*   160     8 */
+	struct alternative *       alts;                 /*   128     8 */
+	struct symbol *            sym;                  /*   136     8 */
+	struct stack_op *          stack_ops;            /*   144     8 */
+	struct cfi_state *         cfi;                  /*   152     8 */

-	/* size: 168, cachelines: 3, members: 29 */
-	/* sum members: 162, holes: 1, sum holes: 4 */
-	/* sum bitfield members: 9 bits, bit holes: 1, sum bit holes: 7 bits */
-	/* last cacheline: 40 bytes */
+	/* size: 160, cachelines: 3, members: 29 */
+	/* sum members: 158 */
+	/* sum bitfield members: 13 bits, bit holes: 1, sum bit holes: 3 bits */
+	/* last cacheline: 32 bytes */
 };

pre:	5:48.86 real,   220.30 user,    128.34 sys,     24834672 mem
post:	5:48.89 real,   220.96 user,    127.55 sys,     24834672 mem

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Acked-by: Josh Poimboeuf <jpoimboe@kernel.org>
Tested-by: Nathan Chancellor <nathan@kernel.org> # build only
Tested-by: Thomas Weißschuh <linux@weissschuh.net> # compile and run
Link: https://lore.kernel.org/r/20230208172245.501847188@infradead.org
2023-02-23 09:21:12 +01:00
Peter Zijlstra
d540665461 objtool: Make instruction::alts a single-linked list
struct instruction {
 	struct list_head           list;                 /*     0    16 */
 	struct hlist_node          hash;                 /*    16    16 */
 	struct list_head           call_node;            /*    32    16 */
 	struct section *           sec;                  /*    48     8 */
 	long unsigned int          offset;               /*    56     8 */
 	/* --- cacheline 1 boundary (64 bytes) --- */
 	unsigned int               len;                  /*    64     4 */
 	enum insn_type             type;                 /*    68     4 */
 	long unsigned int          immediate;            /*    72     8 */
 	u16                        dead_end:1;           /*    80: 0  2 */
 	u16                        ignore:1;             /*    80: 1  2 */
 	u16                        ignore_alts:1;        /*    80: 2  2 */
 	u16                        hint:1;               /*    80: 3  2 */
 	u16                        save:1;               /*    80: 4  2 */
 	u16                        restore:1;            /*    80: 5  2 */
 	u16                        retpoline_safe:1;     /*    80: 6  2 */
 	u16                        noendbr:1;            /*    80: 7  2 */
 	u16                        entry:1;              /*    80: 8  2 */

 	/* XXX 7 bits hole, try to pack */

 	s8                         instr;                /*    82     1 */
 	u8                         visited;              /*    83     1 */

 	/* XXX 4 bytes hole, try to pack */

 	struct alt_group *         alt_group;            /*    88     8 */
 	struct symbol *            call_dest;            /*    96     8 */
 	struct instruction *       jump_dest;            /*   104     8 */
 	struct instruction *       first_jump_src;       /*   112     8 */
 	struct reloc *             jump_table;           /*   120     8 */
 	/* --- cacheline 2 boundary (128 bytes) --- */
 	struct reloc *             reloc;                /*   128     8 */
-	struct list_head           alts;                 /*   136    16 */
-	struct symbol *            sym;                  /*   152     8 */
-	struct stack_op *          stack_ops;            /*   160     8 */
-	struct cfi_state *         cfi;                  /*   168     8 */
+	struct alternative *       alts;                 /*   136     8 */
+	struct symbol *            sym;                  /*   144     8 */
+	struct stack_op *          stack_ops;            /*   152     8 */
+	struct cfi_state *         cfi;                  /*   160     8 */

-	/* size: 176, cachelines: 3, members: 29 */
-	/* sum members: 170, holes: 1, sum holes: 4 */
+	/* size: 168, cachelines: 3, members: 29 */
+	/* sum members: 162, holes: 1, sum holes: 4 */
 	/* sum bitfield members: 9 bits, bit holes: 1, sum bit holes: 7 bits */
-	/* last cacheline: 48 bytes */
+	/* last cacheline: 40 bytes */
 };

pre:	5:58.50 real,   229.64 user,    128.65 sys,     26221520 mem
post:	5:48.86 real,   220.30 user,    128.34 sys,     24834672 mem

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Acked-by: Josh Poimboeuf <jpoimboe@kernel.org>
Tested-by: Nathan Chancellor <nathan@kernel.org> # build only
Tested-by: Thomas Weißschuh <linux@weissschuh.net> # compile and run
Link: https://lore.kernel.org/r/20230208172245.430556498@infradead.org
2023-02-23 09:21:06 +01:00
Peter Zijlstra
3ee88df1b0 objtool: Make instruction::stack_ops a single-linked list
struct instruction {
 	struct list_head           list;                 /*     0    16 */
 	struct hlist_node          hash;                 /*    16    16 */
 	struct list_head           call_node;            /*    32    16 */
 	struct section *           sec;                  /*    48     8 */
 	long unsigned int          offset;               /*    56     8 */
 	/* --- cacheline 1 boundary (64 bytes) --- */
 	unsigned int               len;                  /*    64     4 */
 	enum insn_type             type;                 /*    68     4 */
 	long unsigned int          immediate;            /*    72     8 */
 	u16                        dead_end:1;           /*    80: 0  2 */
 	u16                        ignore:1;             /*    80: 1  2 */
 	u16                        ignore_alts:1;        /*    80: 2  2 */
 	u16                        hint:1;               /*    80: 3  2 */
 	u16                        save:1;               /*    80: 4  2 */
 	u16                        restore:1;            /*    80: 5  2 */
 	u16                        retpoline_safe:1;     /*    80: 6  2 */
 	u16                        noendbr:1;            /*    80: 7  2 */
 	u16                        entry:1;              /*    80: 8  2 */

 	/* XXX 7 bits hole, try to pack */

 	s8                         instr;                /*    82     1 */
 	u8                         visited;              /*    83     1 */

 	/* XXX 4 bytes hole, try to pack */

 	struct alt_group *         alt_group;            /*    88     8 */
 	struct symbol *            call_dest;            /*    96     8 */
 	struct instruction *       jump_dest;            /*   104     8 */
 	struct instruction *       first_jump_src;       /*   112     8 */
 	struct reloc *             jump_table;           /*   120     8 */
 	/* --- cacheline 2 boundary (128 bytes) --- */
 	struct reloc *             reloc;                /*   128     8 */
 	struct list_head           alts;                 /*   136    16 */
 	struct symbol *            sym;                  /*   152     8 */
-	struct list_head           stack_ops;            /*   160    16 */
-	struct cfi_state *         cfi;                  /*   176     8 */
+	struct stack_op *          stack_ops;            /*   160     8 */
+	struct cfi_state *         cfi;                  /*   168     8 */

-	/* size: 184, cachelines: 3, members: 29 */
-	/* sum members: 178, holes: 1, sum holes: 4 */
+	/* size: 176, cachelines: 3, members: 29 */
+	/* sum members: 170, holes: 1, sum holes: 4 */
 	/* sum bitfield members: 9 bits, bit holes: 1, sum bit holes: 7 bits */
-	/* last cacheline: 56 bytes */
+	/* last cacheline: 48 bytes */
 };

pre:	5:58.22 real,   226.69 user,    131.22 sys,     26221520 mem
post:	5:58.50 real,   229.64 user,    128.65 sys,     26221520 mem

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Acked-by: Josh Poimboeuf <jpoimboe@kernel.org>
Tested-by: Nathan Chancellor <nathan@kernel.org> # build only
Tested-by: Thomas Weißschuh <linux@weissschuh.net> # compile and run
Link: https://lore.kernel.org/r/20230208172245.362196959@infradead.org
2023-02-23 09:20:59 +01:00
Peter Zijlstra
20a554638d objtool: Change arch_decode_instruction() signature
In preparation to changing struct instruction around a bit, avoid
passing it's members by pointer and instead pass the whole thing.

A cleanup in it's own right too.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Acked-by: Josh Poimboeuf <jpoimboe@kernel.org>
Tested-by: Nathan Chancellor <nathan@kernel.org> # build only
Tested-by: Thomas Weißschuh <linux@weissschuh.net> # compile and run
Link: https://lore.kernel.org/r/20230208172245.291087549@infradead.org
2023-02-23 09:20:50 +01:00
Ingo Molnar
585a78c1f7 Merge branch 'linus' into objtool/core, to pick up Xen dependencies
Pick up dependencies - freshly merged upstream via xen-next - before applying
dependent objtool changes.

Signed-off-by: Ingo Molnar <mingo@kernel.org>
2023-02-23 09:16:39 +01:00
Linus Torvalds
239451e903 xen: branch for v6.3-rc1
-----BEGIN PGP SIGNATURE-----
 
 iHUEABYIAB0WIQRTLbB6QfY48x44uB6AXGG7T9hjvgUCY/GzaAAKCRCAXGG7T9hj
 vhgtAP96ax9EV49/kCST52z9yGfGUA+giq/9Jm6bwHlP3PZXVAD/Wfhfp1HbxzFp
 CqXG7veXU+uGVP3lbpbYKNPV9DIOdgQ=
 =K+0Q
 -----END PGP SIGNATURE-----

Merge tag 'for-linus-6.3-rc1-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip

Pull xen updates from Juergen Gross:

 - help deprecate the /proc/xen files by making the related information
   available via sysfs

 - mark the Xen variants of play_dead "noreturn"

 - support a shared Xen platform interrupt

 - several small cleanups and fixes

* tag 'for-linus-6.3-rc1-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
  xen: sysfs: make kobj_type structure constant
  x86/Xen: drop leftover VM-assist uses
  xen: Replace one-element array with flexible-array member
  xen/grant-dma-iommu: Implement a dummy probe_device() callback
  xen/pvcalls-back: fix permanently masked event channel
  xen: Allow platform PCI interrupt to be shared
  x86/xen/time: prefer tsc as clocksource when it is invariant
  x86/xen: mark xen_pv_play_dead() as __noreturn
  x86/xen: don't let xen_pv_play_dead() return
  drivers/xen/hypervisor: Expose Xen SIF flags to userspace
2023-02-21 17:07:39 -08:00
Linus Torvalds
1adce1b944 - Teach the static_call patching infrastructure to handle conditional
tall calls properly which can be static calls too
 
 - Add proper struct alt_instr.flags which controls different aspects of
   insn patching behavior
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEzv7L6UO9uDPlPSfHEsHwGGHeVUoFAmPzkAcACgkQEsHwGGHe
 VUpA3A//bALDnLosUQe/m8CTcj1AU12Y59fGInoLl5xArM3liOhRYWj9yu8+2r5N
 j+89yjWoiaogu/9B18pV0+VnBrFUbALZmHxec0+4VAWyMqYuTbqN28Nj/2cZiHdP
 I/9mwGu40I/Ira021D132EcdoZI7O/6bFlh+kEoAqxc7rsqhD5KKRMlrTTdEPVjH
 aRbWIuzqDWNhbi7IwfgEBIPLiQZQKmIdH5hsFMD6yOMIdMRL6CwKmXVg2M1Zp8ta
 5v2Aqgvu2nZYCIteP4GQck2AlUBlGR4ClGQQRII+U1o8c9dM0hfcIDgsbSYKvgrY
 ANm9MQJaF7MRomk9y4E0EHPZAJEMLKUgiQXMxWpER3O1GOKgZPlyzNSe0gRCiL6O
 NZWZ2cGtdhQMrko4EapE3GNryM1HoCY/QCuD1fCYwoc/pRBDhCxsSqjWUd8G/6wn
 s3S/mD0v3nmTrxHg8sWvqhKshsd7B9V0LSkTpHktz3soFIJGXTxbrtty0CIS61pM
 4iUMYB9SjunoEmdwC7+gCN3sCiRpRqfmIybqXdsW3d37QI+FM5aSBPw51xULubfY
 Wsxo8SkH+IMYxXmfbQuUppsGZ+1QHzU08+MrlvNxGHUjS1aMnsrFF/fbfbbCnWvX
 7hcyBPT0jxc9RPMNeKDm4ItapMMGxGdv6XiRmM8LiUtVG2fMaW4=
 =XUqC
 -----END PGP SIGNATURE-----

Merge tag 'x86_alternatives_for_v6.3_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull x86 asm alternatives updates from Borislav Petkov:

 - Teach the static_call patching infrastructure to handle conditional
   tall calls properly which can be static calls too

 - Add proper struct alt_instr.flags which controls different aspects of
   insn patching behavior

* tag 'x86_alternatives_for_v6.3_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/static_call: Add support for Jcc tail-calls
  x86/alternatives: Teach text_poke_bp() to patch Jcc.d32 instructions
  x86/alternatives: Introduce int3_emulate_jcc()
  x86/alternatives: Add alt_instr.flags
2023-02-21 08:27:47 -08:00
Arnd Bergmann
d5d4692472 objtool: add UACCESS exceptions for __tsan_volatile_read/write
A lot of the tsan helpers are already excempt from the UACCESS warnings,
but some more functions were added that need the same thing:

kernel/kcsan/core.o: warning: objtool: __tsan_volatile_read16+0x0: call to __tsan_unaligned_read16() with UACCESS enabled
kernel/kcsan/core.o: warning: objtool: __tsan_volatile_write16+0x0: call to __tsan_unaligned_write16() with UACCESS enabled
vmlinux.o: warning: objtool: __tsan_unaligned_volatile_read16+0x4: call to __tsan_unaligned_read16() with UACCESS enabled
vmlinux.o: warning: objtool: __tsan_unaligned_volatile_write16+0x4: call to __tsan_unaligned_write16() with UACCESS enabled

As Marco points out, these functions don't even call each other
explicitly but instead gcc (but not clang) notices the functions
being identical and turns one symbol into a direct branch to the
other.

Link: https://lkml.kernel.org/r/20230215130058.3836177-4-arnd@kernel.org
Fixes: 75d75b7a4d ("kcsan: Support distinguishing volatile accesses")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Marco Elver <elver@google.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Andrey Konovalov <andreyknvl@gmail.com>
Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Josh Poimboeuf <jpoimboe@kernel.org>
Cc: Kuan-Ying Lee <Kuan-Ying.Lee@mediatek.com>
Cc: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Vincenzo Frascino <vincenzo.frascino@arm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-02-20 12:46:16 -08:00
Juergen Gross
f697cb00af x86/xen: mark xen_pv_play_dead() as __noreturn
Mark xen_pv_play_dead() and related to that xen_cpu_bringup_again()
as "__noreturn".

Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lore.kernel.org/r/20221125063248.30256-3-jgross@suse.com
Signed-off-by: Juergen Gross <jgross@suse.com>
2023-02-13 06:53:19 +01:00
Josh Poimboeuf
ffb1b4a410 x86/unwind/orc: Add 'signal' field to ORC metadata
Add a 'signal' field which allows unwind hints to specify whether the
instruction pointer should be taken literally (like for most interrupts
and exceptions) rather than decremented (like for call stack return
addresses) when used to find the next ORC entry.

Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lore.kernel.org/r/d2c5ec4d83a45b513d8fd72fab59f1a8cfa46871.1676068346.git.jpoimboe@kernel.org
2023-02-11 12:37:51 +01:00
Peter Zijlstra
443ed4c302 objtool: mem*() are not uaccess safe
For mysterious raisins I listed the new __asan_mem*() functions as
being uaccess safe, this is giving objtool fails on KASAN builds
because these functions call out to the actual __mem*() functions
which are not marked uaccess safe.

Removing it doesn't make the robots unhappy.

Fixes: 69d4c0d321 ("entry, kasan, x86: Disallow overriding mem*() functions")
Reported-by: "Paul E. McKenney" <paulmck@kernel.org>
Bisected-by: Josh Poimboeuf <jpoimboe@kernel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20230126182302.GA687063@paulmck-ThinkPad-P17-Gen-1
2023-02-11 11:18:08 +01:00
Thomas Weißschuh
a20717aca3 objtool: Optimize layout of struct special_alt
Reduce the size of struct special_alt from 72 to 64 bytes.

Signed-off-by: Thomas Weißschuh <linux@weissschuh.net>
Link: https://lore.kernel.org/r/20221216-objtool-memory-v2-7-17968f85a464@weissschuh.net
Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
2023-02-01 09:15:24 -08:00
Thomas Weißschuh
21a899f9fc objtool: Optimize layout of struct symbol
Reduce the size of struct symbol on x86_64 from 208 to 200 bytes.
This structure is allocated a lot and never freed.

This reduces maximum memory usage while processing vmlinux.o from
2919716 KB to 2917988 KB (-0.5%) on my notebooks "localmodconfig".

Signed-off-by: Thomas Weißschuh <linux@weissschuh.net>
Link: https://lore.kernel.org/r/20221216-objtool-memory-v2-6-17968f85a464@weissschuh.net
Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
2023-02-01 09:15:24 -08:00
Thomas Weißschuh
8045b8f0b1 objtool: Allocate multiple structures with calloc()
By using calloc() instead of malloc() in a loop, libc does not have to
keep around bookkeeping information for each single structure.

This reduces maximum memory usage while processing vmlinux.o from
3153325 KB to 3035668 KB (-3.7%) on my notebooks "localmodconfig".

Note this introduces memory leaks, because some additional structs get
added to the lists later after reading the symbols and sections from the
original object.  Luckily we don't really care about memory leaks in
objtool.

Signed-off-by: Thomas Weißschuh <linux@weissschuh.net>
Link: https://lore.kernel.org/r/20221216-objtool-memory-v2-3-17968f85a464@weissschuh.net
Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
2023-02-01 09:15:23 -08:00
Thomas Weißschuh
cfd66e8179 objtool: Make struct check_options static
It is not used outside of builtin-check.c.

Also remove the unused declaration from builtin.h .

Signed-off-by: Thomas Weißschuh <linux@weissschuh.net>
Link: https://lore.kernel.org/r/20221216-objtool-memory-v2-2-17968f85a464@weissschuh.net
Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
2023-02-01 09:15:23 -08:00
Thomas Weißschuh
d93ee0553c objtool: Make struct entries[] static and const
This data is not modified and not used outside of special.c.

Also adapt its users to the constness.

Signed-off-by: Thomas Weißschuh <linux@weissschuh.net>
Link: https://lore.kernel.org/r/20221216-objtool-memory-v2-1-17968f85a464@weissschuh.net
Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
2023-02-01 09:15:22 -08:00
Ian Rogers
cd955bdd6a objtool: Fix HOSTCC flag usage
HOSTCC is always wanted when building objtool. Setting CC to HOSTCC
happens after tools/scripts/Makefile.include is included, meaning
flags (like CFLAGS) are set assuming say CC is gcc, but then it can be
later set to HOSTCC which may be clang. tools/scripts/Makefile.include
is needed for host set up and common macros in objtool's
Makefile. Rather than override the CC variable to HOSTCC, just pass CC
as HOSTCC to the sub-makes of Makefile.build, the libsubcmd builds and
also to the linkage step.

Signed-off-by: Ian Rogers <irogers@google.com>
Link: https://lore.kernel.org/r/20230126190606.40739-4-irogers@google.com
Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
2023-02-01 09:15:18 -08:00
Ingo Molnar
57a30218fa Linux 6.2-rc6
-----BEGIN PGP SIGNATURE-----
 
 iQFSBAABCAA8FiEEq68RxlopcLEwq+PEeb4+QwBBGIYFAmPW7E8eHHRvcnZhbGRz
 QGxpbnV4LWZvdW5kYXRpb24ub3JnAAoJEHm+PkMAQRiGf7MIAI0JnHN9WvtEukSZ
 E6j6+cEGWxsvD6q0g3GPolaKOCw7hlv0pWcFJFcUAt0jebspMdxV2oUGJ8RYW7Lg
 nCcHvEVswGKLAQtQSWw52qotW6fUfMPsNYYB5l31sm1sKH4Cgss0W7l2HxO/1LvG
 TSeNHX53vNAZ8pVnFYEWCSXC9bzrmU/VALF2EV00cdICmfvjlgkELGXoLKJJWzUp
 s63fBHYGGURSgwIWOKStoO6HNo0j/F/wcSMx8leY8qDUtVKHj4v24EvSgxUSDBER
 ch3LiSQ6qf4sw/z7pqruKFthKOrlNmcc0phjiES0xwwGiNhLv0z3rAhc4OM2cgYh
 SDc/Y/c=
 =zpaD
 -----END PGP SIGNATURE-----

Merge tag 'v6.2-rc6' into sched/core, to pick up fixes

Pick up fixes before merging another batch of cpuidle updates.

Signed-off-by: Ingo Molnar <mingo@kernel.org>
2023-01-31 15:01:20 +01:00
Ian Rogers
8c4526ca6a objtool: Properly support make V=1
The Q variable was being used but never correctly set up. Add the
setting up and use in place of @.

Signed-off-by: Ian Rogers <irogers@google.com>
Link: https://lore.kernel.org/r/20230126190606.40739-3-irogers@google.com
Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
2023-01-30 16:28:16 -08:00
Ian Rogers
bdb8bf7d56 objtool: Install libsubcmd in build
Including from tools/lib can create inadvertent dependencies. Install
libsubcmd in the objtool build and then include the headers from
there.

Signed-off-by: Ian Rogers <irogers@google.com>
Link: https://lore.kernel.org/r/20230126190606.40739-2-irogers@google.com
Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
2023-01-30 16:27:46 -08:00
Michal Kubecek
03d7a1053c objtool: Check that module init/exit function is an indirect call target
Some out-of-tree modules still do not use module_init() / module_exit()
macros and simply create functions with magic names init_module() and
cleanup_module() instead. As a result, these functions are not recognized
as indirect call targets by objtool and such module fails to load into an
IBT enabled kernel.

This old way is not even documented any more but it is cleaner to issue
a warning than to let the module fail on load without obvious reason.

Signed-off-by: Michal Kubecek <mkubecek@suse.cz>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20230118105215.B9DA960514@lion.mk-sys.cz
2023-01-21 10:50:18 +01:00
Peter Zijlstra
69d4c0d321 entry, kasan, x86: Disallow overriding mem*() functions
KASAN cannot just hijack the mem*() functions, it needs to emit
__asan_mem*() variants if it wants instrumentation (other sanitizers
already do this).

  vmlinux.o: warning: objtool: sync_regs+0x24: call to memcpy() leaves .noinstr.text section
  vmlinux.o: warning: objtool: vc_switch_off_ist+0xbe: call to memcpy() leaves .noinstr.text section
  vmlinux.o: warning: objtool: fixup_bad_iret+0x36: call to memset() leaves .noinstr.text section
  vmlinux.o: warning: objtool: __sev_get_ghcb+0xa0: call to memcpy() leaves .noinstr.text section
  vmlinux.o: warning: objtool: __sev_put_ghcb+0x35: call to memcpy() leaves .noinstr.text section

Remove the weak aliases to ensure nobody hijacks these functions and
add them to the noinstr section.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Tested-by: Tony Lindgren <tony@atomide.com>
Tested-by: Ulf Hansson <ulf.hansson@linaro.org>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Frederic Weisbecker <frederic@kernel.org>
Link: https://lore.kernel.org/r/20230112195542.028523143@infradead.org
2023-01-13 11:48:17 +01:00
Peter Zijlstra
f18b0d7ee8 ubsan: Fix objtool UACCESS warns
clang-14 allyesconfig gives:

  vmlinux.o: warning: objtool: emulator_cmpxchg_emulated+0x705: call to __ubsan_handle_load_invalid_value() with UACCESS enabled
  vmlinux.o: warning: objtool: paging64_update_accessed_dirty_bits+0x39e: call to __ubsan_handle_load_invalid_value() with UACCESS enabled
  vmlinux.o: warning: objtool: paging32_update_accessed_dirty_bits+0x390: call to __ubsan_handle_load_invalid_value() with UACCESS enabled
  vmlinux.o: warning: objtool: ept_update_accessed_dirty_bits+0x43f: call to __ubsan_handle_load_invalid_value() with UACCESS enabled

Add the required eflags save/restore and whitelist the thing.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Tested-by: Tony Lindgren <tony@atomide.com>
Tested-by: Ulf Hansson <ulf.hansson@linaro.org>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Frederic Weisbecker <frederic@kernel.org>
Link: https://lore.kernel.org/r/20230112195541.906007455@infradead.org
2023-01-13 11:48:17 +01:00
Peter Zijlstra
2b5a0e425e objtool/idle: Validate __cpuidle code as noinstr
Idle code is very like entry code in that RCU isn't available. As
such, add a little validation.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Tested-by: Tony Lindgren <tony@atomide.com>
Tested-by: Ulf Hansson <ulf.hansson@linaro.org>
Acked-by: Geert Uytterhoeven <geert@linux-m68k.org>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Frederic Weisbecker <frederic@kernel.org>
Link: https://lore.kernel.org/r/20230112195540.373461409@infradead.org
2023-01-13 11:48:15 +01:00
Nicholas Piggin
cad90e5381 objtool: Tolerate STT_NOTYPE symbols at end of section
Hand-written asm often contains non-function symbols in executable
sections. _end symbols for finding the size of instruction blocks
for runtime processing is one such usage.

optprobe_template_end is one example that causes the warning:

  objtool: optprobe_template_end(): can't find starting instruction

This is because the symbol happens to be at the end of the file (and
therefore end of a section in the object file).

So ignore end-of-section STT_NOTYPE symbols instead of bailing out
because an instruction can't be found. While we're here, add a more
descriptive warning for STT_FUNC symbols found at the end of a
section.

[ This also solves a PowerPC regression reported by Sathvika Vasireddy. ]

Reported-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Reported-by: Sathvika Vasireddy <sv@linux.ibm.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Acked-by: Sathvika Vasireddy <sv@linux.ibm.com>
Link: https://lore.kernel.org/r/20221220101323.3119939-1-npiggin@gmail.com
2023-01-09 17:53:46 +01:00
Miaoqian Lin
3da73f1023 objtool: Fix memory leak in create_static_call_sections()
strdup() allocates memory for key_name. We need to release the memory in
the following error paths. Add free() to avoid memory leak.

Fixes: 1e7e478838 ("x86/static_call: Add inline static call implementation for x86-64")
Signed-off-by: Miaoqian Lin <linmq006@gmail.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lore.kernel.org/r/20221205080642.558583-1-linmq006@gmail.com
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
2023-01-07 11:45:24 +01:00
Borislav Petkov (AMD)
5d1dd961e7 x86/alternatives: Add alt_instr.flags
Add a struct alt_instr.flags field which will contain different flags
controlling alternatives patching behavior.

The initial idea was to be able to specify it as a separate macro
parameter but that would mean touching all possible invocations of the
alternatives macros and thus a lot of churn.

What is more, as PeterZ suggested, being able to say ALT_NOT(feature) is
very readable and explains exactly what is meant.

So make the feature field a u32 where the patching flags are the upper
u16 part of the dword quantity while the lower u16 word is the feature.

The highest feature number currently is 0x26a (i.e., word 19) so there
is plenty of space. If that becomes insufficient, the field can be
extended to u64 which will then make struct alt_instr of the nice size
of 16 bytes (14 bytes currently).

There should be no functional changes resulting from this.

Suggested-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Reviewed-by: Ingo Molnar <mingo@kernel.org>
Link: https://lore.kernel.org/r/Y6RCoJEtxxZWwotd@zn.tnic
2023-01-05 12:46:47 +01:00
Linus Torvalds
5f6e430f93 powerpc updates for 6.2
- Add powerpc qspinlock implementation optimised for large system scalability and
    paravirt. See the merge message for more details.
 
  - Enable objtool to be built on powerpc to generate mcount locations.
 
  - Use a temporary mm for code patching with the Radix MMU, so the writable mapping is
    restricted to the patching CPU.
 
  - Add an option to build the 64-bit big-endian kernel with the ELFv2 ABI.
 
  - Sanitise user registers on interrupt entry on 64-bit Book3S.
 
  - Many other small features and fixes.
 
 Thanks to: Aboorva Devarajan, Angel Iglesias, Benjamin Gray, Bjorn Helgaas, Bo Liu, Chen
 Lifu, Christoph Hellwig, Christophe JAILLET, Christophe Leroy, Christopher M. Riedl, Colin
 Ian King, Deming Wang, Disha Goel, Dmitry Torokhov, Finn Thain, Geert Uytterhoeven,
 Gustavo A. R. Silva, Haowen Bai, Joel Stanley, Jordan Niethe, Julia Lawall, Kajol Jain,
 Laurent Dufour, Li zeming, Miaoqian Lin, Michael Jeanson, Nathan Lynch, Naveen N. Rao,
 Nayna Jain, Nicholas Miehlbradt, Nicholas Piggin, Pali Rohár, Randy Dunlap, Rohan McLure,
 Russell Currey, Sathvika Vasireddy, Shaomin Deng, Stephen Kitt, Stephen Rothwell, Thomas
 Weißschuh, Tiezhu Yang, Uwe Kleine-König, Xie Shaowen, Xiu Jianfeng, XueBing Chen, Yang
 Yingliang, Zhang Jiaming, ruanjinjie, Jessica Yu, Wolfram Sang.
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCAAxFiEEJFGtCPCthwEv2Y/bUevqPMjhpYAFAmOfrj8THG1wZUBlbGxl
 cm1hbi5pZC5hdQAKCRBR6+o8yOGlgIWtD/9mGF/ze2k+qFTo+30fb7bO8WJIDgsR
 dIASnZjXV7q/45elvymhUdkQv4R7xL3pzC40P1+ZKtWzGTNe+zWUQLoALNwRK85j
 8CsxZbqefGNKE5Z6ZHo9s37wsu3+jJu9yEQpGFo1LINyzeclCn5St5oqfRam+Hd/
 cPF+VfvREwZ0+YOKGBhJ2EgC+Gc9xsFY7DLQsoYlu71iZZr6Z6rgZW/EY5h3RMGS
 YKBoVwDsWaU0FpFWrr/rYTI6DqSr3AHr1+ftDg7ncCZMD6vQva6aMCCt94aLB1aE
 vC+DNdhZlA558bXGa5yA7Wr//7aUBUIwyC60DogOeZ6vw3kD9tdEd1fbH5hmqNKY
 K5bfqm28XU2959CTE8RDgsYYZvwDcfrjBIML14WZGdCQOTcGKpgOGp22o6yNb1Pq
 JKpHHnVpvu2PZ/p2XdKSm9+etr2yI6lXZAEVTS7ehdtMukButjSHEVbSCEZ8tlWz
 KokQt2J23BMHuSrXK6+67wWQBtdsLEk+LBOQmweiwarMocqvL/Zjz/5J7DR2DtH8
 wlY3wOtB1+E5j7xZ+RgK3c3jNg5dH39ZwvFsSATWTI3P+iq6OK/bbk4q4LmZt2l9
 ZIfH/CXPf9BvGCHzHa3AAd3UBbJLFwj17btMEv1wFVPS0T4LPUzkgTNTNUYeP6zL
 h1e5QfgUxvKPuQ==
 =7k3p
 -----END PGP SIGNATURE-----

Merge tag 'powerpc-6.2-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux

Pull powerpc updates from Michael Ellerman:

 - Add powerpc qspinlock implementation optimised for large system
   scalability and paravirt. See the merge message for more details

 - Enable objtool to be built on powerpc to generate mcount locations

 - Use a temporary mm for code patching with the Radix MMU, so the
   writable mapping is restricted to the patching CPU

 - Add an option to build the 64-bit big-endian kernel with the ELFv2
   ABI

 - Sanitise user registers on interrupt entry on 64-bit Book3S

 - Many other small features and fixes

Thanks to Aboorva Devarajan, Angel Iglesias, Benjamin Gray, Bjorn
Helgaas, Bo Liu, Chen Lifu, Christoph Hellwig, Christophe JAILLET,
Christophe Leroy, Christopher M. Riedl, Colin Ian King, Deming Wang,
Disha Goel, Dmitry Torokhov, Finn Thain, Geert Uytterhoeven, Gustavo A.
R. Silva, Haowen Bai, Joel Stanley, Jordan Niethe, Julia Lawall, Kajol
Jain, Laurent Dufour, Li zeming, Miaoqian Lin, Michael Jeanson, Nathan
Lynch, Naveen N. Rao, Nayna Jain, Nicholas Miehlbradt, Nicholas Piggin,
Pali Rohár, Randy Dunlap, Rohan McLure, Russell Currey, Sathvika
Vasireddy, Shaomin Deng, Stephen Kitt, Stephen Rothwell, Thomas
Weißschuh, Tiezhu Yang, Uwe Kleine-König, Xie Shaowen, Xiu Jianfeng,
XueBing Chen, Yang Yingliang, Zhang Jiaming, ruanjinjie, Jessica Yu,
and Wolfram Sang.

* tag 'powerpc-6.2-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: (181 commits)
  powerpc/code-patching: Fix oops with DEBUG_VM enabled
  powerpc/qspinlock: Fix 32-bit build
  powerpc/prom: Fix 32-bit build
  powerpc/rtas: mandate RTAS syscall filtering
  powerpc/rtas: define pr_fmt and convert printk call sites
  powerpc/rtas: clean up includes
  powerpc/rtas: clean up rtas_error_log_max initialization
  powerpc/pseries/eeh: use correct API for error log size
  powerpc/rtas: avoid scheduling in rtas_os_term()
  powerpc/rtas: avoid device tree lookups in rtas_os_term()
  powerpc/rtasd: use correct OF API for event scan rate
  powerpc/rtas: document rtas_call()
  powerpc/pseries: unregister VPA when hot unplugging a CPU
  powerpc/pseries: reset the RCU watchdogs after a LPM
  powerpc: Take in account addition CPU node when building kexec FDT
  powerpc: export the CPU node count
  powerpc/cpuidle: Set CPUIDLE_FLAG_POLLING for snooze state
  powerpc/dts/fsl: Fix pca954x i2c-mux node names
  cxl: Remove unnecessary cxl_pci_window_alignment()
  selftests/powerpc: Fix resource leaks
  ...
2022-12-19 07:13:33 -06:00
Linus Torvalds
94a855111e - Add the call depth tracking mitigation for Retbleed which has
been long in the making. It is a lighterweight software-only fix for
 Skylake-based cores where enabling IBRS is a big hammer and causes a
 significant performance impact.
 
 What it basically does is, it aligns all kernel functions to 16 bytes
 boundary and adds a 16-byte padding before the function, objtool
 collects all functions' locations and when the mitigation gets applied,
 it patches a call accounting thunk which is used to track the call depth
 of the stack at any time.
 
 When that call depth reaches a magical, microarchitecture-specific value
 for the Return Stack Buffer, the code stuffs that RSB and avoids its
 underflow which could otherwise lead to the Intel variant of Retbleed.
 
 This software-only solution brings a lot of the lost performance back,
 as benchmarks suggest:
 
   https://lore.kernel.org/all/20220915111039.092790446@infradead.org/
 
 That page above also contains a lot more detailed explanation of the
 whole mechanism
 
 - Implement a new control flow integrity scheme called FineIBT which is
 based on the software kCFI implementation and uses hardware IBT support
 where present to annotate and track indirect branches using a hash to
 validate them
 
 - Other misc fixes and cleanups
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEzv7L6UO9uDPlPSfHEsHwGGHeVUoFAmOZp5EACgkQEsHwGGHe
 VUrZFxAAvi/+8L0IYSK4mKJvixGbTFjxN/Swo2JVOfs34LqGUT6JaBc+VUMwZxdb
 VMTFIZ3ttkKEodjhxGI7oGev6V8UfhI37SmO2lYKXpQVjXXnMlv/M+Vw3teE38CN
 gopi+xtGnT1IeWQ3tc/Tv18pleJ0mh5HKWiW+9KoqgXj0wgF9x4eRYDz1TDCDA/A
 iaBzs56j8m/FSykZHnrWZ/MvjKNPdGlfJASUCPeTM2dcrXQGJ93+X2hJctzDte0y
 Nuiw6Y0htfFBE7xoJn+sqm5Okr+McoUM18/CCprbgSKYk18iMYm3ZtAi6FUQZS1A
 ua4wQCf49loGp15PO61AS5d3OBf5D3q/WihQRbCaJvTVgPp9sWYnWwtcVUuhMllh
 ZQtBU9REcVJ/22bH09Q9CjBW0VpKpXHveqQdqRDViLJ6v/iI6EFGmD24SW/VxyRd
 73k9MBGrL/dOf1SbEzdsnvcSB3LGzp0Om8o/KzJWOomrVKjBCJy16bwTEsCZEJmP
 i406m92GPXeaN1GhTko7vmF0GnkEdJs1GVCZPluCAxxbhHukyxHnrjlQjI4vC80n
 Ylc0B3Kvitw7LGJsPqu+/jfNHADC/zhx1qz/30wb5cFmFbN1aRdp3pm8JYUkn+l/
 zri2Y6+O89gvE/9/xUhMohzHsWUO7xITiBavewKeTP9GSWybWUs=
 =cRy1
 -----END PGP SIGNATURE-----

Merge tag 'x86_core_for_v6.2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull x86 core updates from Borislav Petkov:

 - Add the call depth tracking mitigation for Retbleed which has been
   long in the making. It is a lighterweight software-only fix for
   Skylake-based cores where enabling IBRS is a big hammer and causes a
   significant performance impact.

   What it basically does is, it aligns all kernel functions to 16 bytes
   boundary and adds a 16-byte padding before the function, objtool
   collects all functions' locations and when the mitigation gets
   applied, it patches a call accounting thunk which is used to track
   the call depth of the stack at any time.

   When that call depth reaches a magical, microarchitecture-specific
   value for the Return Stack Buffer, the code stuffs that RSB and
   avoids its underflow which could otherwise lead to the Intel variant
   of Retbleed.

   This software-only solution brings a lot of the lost performance
   back, as benchmarks suggest:

       https://lore.kernel.org/all/20220915111039.092790446@infradead.org/

   That page above also contains a lot more detailed explanation of the
   whole mechanism

 - Implement a new control flow integrity scheme called FineIBT which is
   based on the software kCFI implementation and uses hardware IBT
   support where present to annotate and track indirect branches using a
   hash to validate them

 - Other misc fixes and cleanups

* tag 'x86_core_for_v6.2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (80 commits)
  x86/paravirt: Use common macro for creating simple asm paravirt functions
  x86/paravirt: Remove clobber bitmask from .parainstructions
  x86/debug: Include percpu.h in debugreg.h to get DECLARE_PER_CPU() et al
  x86/cpufeatures: Move X86_FEATURE_CALL_DEPTH from bit 18 to bit 19 of word 11, to leave space for WIP X86_FEATURE_SGX_EDECCSSA bit
  x86/Kconfig: Enable kernel IBT by default
  x86,pm: Force out-of-line memcpy()
  objtool: Fix weak hole vs prefix symbol
  objtool: Optimize elf_dirty_reloc_sym()
  x86/cfi: Add boot time hash randomization
  x86/cfi: Boot time selection of CFI scheme
  x86/ibt: Implement FineIBT
  objtool: Add --cfi to generate the .cfi_sites section
  x86: Add prefix symbols for function padding
  objtool: Add option to generate prefix symbols
  objtool: Avoid O(bloody terrible) behaviour -- an ode to libelf
  objtool: Slice up elf_create_section_symbol()
  kallsyms: Revert "Take callthunks into account"
  x86: Unconfuse CONFIG_ and X86_FEATURE_ namespaces
  x86/retpoline: Fix crash printing warning
  x86/paravirt: Fix a !PARAVIRT build warning
  ...
2022-12-14 15:03:00 -08:00
Michael Ellerman
a39818a3fb objtool/powerpc: Implement arch_pc_relative_reloc()
Provide an implementation for arch_pc_relative_reloc(). It is needed to
pass the build once 61c6065ef7 ("objtool: Allow !PC relative
relocations") is merged.

Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2022-11-23 21:26:10 +11:00
Sathvika Vasireddy
c984aef8c8 objtool/powerpc: Add --mcount specific implementation
This patch enables objtool --mcount on powerpc, and adds implementation
specific to powerpc.

Tested-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Reviewed-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Reviewed-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Acked-by: Josh Poimboeuf <jpoimboe@kernel.org>
Signed-off-by: Sathvika Vasireddy <sv@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20221114175754.1131267-17-sv@linux.ibm.com
2022-11-18 19:00:16 +11:00
Sathvika Vasireddy
e52ec98c5a objtool/powerpc: Enable objtool to be built on ppc
This patch adds [stub] implementations for required functions, inorder
to enable objtool build on powerpc.

[Christophe Leroy: powerpc: Add missing asm/asm.h for objtool,
Use local variables for type and imm in arch_decode_instruction(),
Adapt len for prefixed instructions.]

Tested-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Reviewed-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Acked-by: Josh Poimboeuf <jpoimboe@kernel.org>
Signed-off-by: Sathvika Vasireddy <sv@linux.ibm.com>
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20221114175754.1131267-16-sv@linux.ibm.com
2022-11-18 19:00:16 +11:00
Sathvika Vasireddy
4ca993d498 objtool: Add arch specific function arch_ftrace_match()
Add architecture specific function to look for relocation records
pointing to architecture specific symbols.

Suggested-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Tested-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Reviewed-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Reviewed-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Acked-by: Josh Poimboeuf <jpoimboe@kernel.org>
Signed-off-by: Sathvika Vasireddy <sv@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20221114175754.1131267-15-sv@linux.ibm.com
2022-11-18 19:00:16 +11:00
Sathvika Vasireddy
c144973521 objtool: Use macros to define arch specific reloc types
Make relocation types architecture specific.

Tested-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Reviewed-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Reviewed-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Josh Poimboeuf <jpoimboe@kernel.org>
Signed-off-by: Sathvika Vasireddy <sv@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20221114175754.1131267-14-sv@linux.ibm.com
2022-11-18 19:00:16 +11:00
Sathvika Vasireddy
de6fbcedf5 objtool: Read special sections with alts only when specific options are selected
Call add_special_section_alts() only when stackval or orc or uaccess or
noinstr options are passed to objtool.

Tested-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Reviewed-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Reviewed-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Acked-by: Josh Poimboeuf <jpoimboe@kernel.org>
Signed-off-by: Sathvika Vasireddy <sv@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20221114175754.1131267-13-sv@linux.ibm.com
2022-11-18 19:00:16 +11:00
Sathvika Vasireddy
280981d699 objtool: Add --mnop as an option to --mcount
Some architectures (powerpc) may not support ftrace locations being nop'ed
out at build time. Introduce CONFIG_HAVE_OBJTOOL_NOP_MCOUNT for objtool, as
a means for architectures to enable nop'ing of ftrace locations. Add --mnop
as an option to objtool --mcount, to indicate support for the same.

Also, make sure that --mnop can be passed as an option to objtool only when
--mcount is passed.

Tested-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Reviewed-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Acked-by: Josh Poimboeuf <jpoimboe@kernel.org>
Reviewed-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Sathvika Vasireddy <sv@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20221114175754.1131267-12-sv@linux.ibm.com
2022-11-18 19:00:16 +11:00
Christophe Leroy
86ea7f3615 objtool: Use target file class size instead of a compiled constant
In order to allow using objtool on cross-built kernels,
determine size of long from elf data instead of using
sizeof(long) at build time.

For the time being this covers only mcount.

[Sathvika Vasireddy: Rename variable "size" to "addrsize" and function
"elf_class_size()" to "elf_class_addrsize()", and modify
create_mcount_loc_sections() function to follow reverse christmas tree
format to order local variable declarations.]

Tested-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Reviewed-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Acked-by: Josh Poimboeuf <jpoimboe@kernel.org>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Sathvika Vasireddy <sv@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20221114175754.1131267-11-sv@linux.ibm.com
2022-11-18 19:00:16 +11:00
Christophe Leroy
0646c28b41 objtool: Use target file endianness instead of a compiled constant
Some architectures like powerpc support both endianness, it's
therefore not possible to fix the endianness via arch/endianness.h
because there is no easy way to get the target endianness at
build time.

Use the endianness recorded in the file objtool is working on.

Tested-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Reviewed-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Acked-by: Josh Poimboeuf <jpoimboe@kernel.org>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20221114175754.1131267-10-sv@linux.ibm.com
2022-11-18 19:00:15 +11:00
Christophe Leroy
efb11fdb3e objtool: Fix SEGFAULT
find_insn() will return NULL in case of failure. Check insn in order
to avoid a kernel Oops for NULL pointer dereference.

Tested-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Reviewed-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Acked-by: Josh Poimboeuf <jpoimboe@kernel.org>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20221114175754.1131267-9-sv@linux.ibm.com
2022-11-18 19:00:15 +11:00
Peter Zijlstra
023f2340f0 objtool: Fix weak hole vs prefix symbol
Boris (and the robot) reported that objtool grew a new complaint about
unreachable instructions. Upon inspection it was immediately clear
the __weak zombie instructions struck again.

For the unweary, the linker will simply remove the symbol for
overriden __weak symbols but leave the instructions in place, creating
unreachable instructions -- and objtool likes to report these.

Commit 4adb236867 ("objtool: Ignore extra-symbol code") was supposed
to have dealt with that, but the new commit 9f2899fe36 ("objtool:
Add option to generate prefix symbols") subtly broke that logic by
created unvisited symbols.

Fixes: 9f2899fe36 ("objtool: Add option to generate prefix symbols")
Reported-by: Borislav Petkov <bp@alien8.de>
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
2022-11-05 11:28:02 +01:00
Peter Zijlstra
19526717f7 objtool: Optimize elf_dirty_reloc_sym()
When moving a symbol in the symtab its index changes and any reloc
referring that symtol-table-index will need to be rewritten too.

In order to facilitate this, objtool simply marks the whole reloc
section 'changed' which will cause the whole section to be
re-generated.

However, finding the relocs that use any given symbol is implemented
rather crudely -- a fully iteration of all sections and their relocs.
Given that some builds have over 20k sections (kallsyms etc..)
iterating all that for *each* symbol moved takes a bit of time.

Instead have each symbol keep a list of relocs that reference it.

This *vastly* improves build times for certain configs.

Reported-by: Borislav Petkov <bp@alien8.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/Y2LlRA7x+8UsE1xf@hirez.programming.kicks-ass.net
2022-11-05 11:28:02 +01:00
Peter Zijlstra
9a479f766b objtool: Add --cfi to generate the .cfi_sites section
Add the location of all __cfi_##name symbols (as generated by kCFI) to
a section such that we might re-write things at kernel boot.

Notably; boot time re-hashing and FineIBT are the intended use of
this.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lore.kernel.org/r/20221027092842.568039454@infradead.org
2022-11-01 13:44:10 +01:00
Peter Zijlstra
9f2899fe36 objtool: Add option to generate prefix symbols
When code is compiled with:

  -fpatchable-function-entry=${PADDING_BYTES},${PADDING_BYTES}

functions will have PADDING_BYTES of NOP in front of them. Unwinders
and other things that symbolize code locations will typically
attribute these bytes to the preceding function.

Given that these bytes nominally belong to the following symbol this
mis-attribution is confusing.

Inspired by the fact that CFI_CLANG emits __cfi_##name symbols to
claim these bytes, allow objtool to emit __pfx_##name symbols to do
the same.

Therefore add the objtool --prefix=N argument, to conditionally place
a __pfx_##name symbol at N bytes ahead of symbol 'name' when: all
these preceding bytes are NOP and name-N is an instruction boundary.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Tested-by: Yujie Liu <yujie.liu@intel.com>
Link: https://lkml.kernel.org/r/20221028194453.526899822@infradead.org
2022-11-01 13:44:09 +01:00
Peter Zijlstra
13f60e80e1 objtool: Avoid O(bloody terrible) behaviour -- an ode to libelf
Due to how gelf_update_sym*() requires an Elf_Data pointer, and how
libelf keeps Elf_Data in a linked list per section,
elf_update_symbol() ends up having to iterate this list on each
update to find the correct Elf_Data for the index'ed symbol.

By allocating one Elf_Data per new symbol, the list grows per new
symbol, giving an effective O(n^2) insertion time. This is obviously
bloody terrible.

Therefore over-allocate the Elf_Data when an extention is needed.
Except it turns out libelf disregards Elf_Scn::sh_size in favour of
the sum of Elf_Data::d_size. IOW it will happily write out all the
unused space and fill it with:

  0000000000000000     0 NOTYPE  LOCAL  DEFAULT  UND

entries (aka zeros). Which obviously violates the STB_LOCAL placement
rule, and is a general pain in the backside for not being the desired
behaviour.

Manually fix-up the Elf_Data size to avoid this problem before calling
elf_update().

This significantly improves performance when adding a significant
number of symbols.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Tested-by: Yujie Liu <yujie.liu@intel.com>
Link: https://lkml.kernel.org/r/20221028194453.461658986@infradead.org
2022-11-01 13:44:08 +01:00
Peter Zijlstra
4c91be8e92 objtool: Slice up elf_create_section_symbol()
In order to facilitate creation of more symbol types, slice up
elf_create_section_symbol() to extract a generic helper that deals
with adding ELF symbols.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Tested-by: Yujie Liu <yujie.liu@intel.com>
Link: https://lkml.kernel.org/r/20221028194453.396634875@infradead.org
2022-11-01 13:44:08 +01:00
Marco Elver
63646fcba5 objtool, kcsan: Add volatile read/write instrumentation to whitelist
Adds KCSAN's volatile instrumentation to objtool's uaccess whitelist.

Recent kernel change have shown that this was missing from the uaccess
whitelist (since the first upstreamed version of KCSAN):

  mm/gup.o: warning: objtool: fault_in_readable+0x101: call to __tsan_volatile_write1() with UACCESS enabled

Fixes: 75d75b7a4d ("kcsan: Support distinguishing volatile accesses")
Signed-off-by: Marco Elver <elver@google.com>
Reviewed-by: Dmitry Vyukov <dvyukov@google.com>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2022-10-18 15:07:33 -07:00
Peter Zijlstra
5a9c361a41 objtool: Allow STT_NOTYPE -> STT_FUNC+0 sibling-calls
Teach objtool about STT_NOTYPE -> STT_FUNC+0 sibling calls. Doing do
allows slightly simpler .S files.

There is a slight complication in that we specifically do not want to
allow sibling calls from symbol holes (previously covered by STT_WEAK
symbols) -- such things exist where a weak function has a .cold
subfunction for example.

Additionally, STT_NOTYPE tail-calls are allowed to happen with a
modified stack frame, they don't need to obey the normal rules after
all.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
2022-10-17 16:41:09 +02:00
Peter Zijlstra
dbcdbdfdf1 objtool: Rework instruction -> symbol mapping
Currently insn->func contains a instruction -> symbol link for
STT_FUNC symbols. A NULL value is assumed to mean STT_NOTYPE.
However, there are also instructions not covered by any symbol at all.
This can happen due to __weak symbols for example.

Since the current scheme cannot differentiate between no symbol and
STT_NOTYPE symbol, change things around. Make insn->sym point to any
symbol type such that !insn->sym means no symbol and add a helper
insn_func() that check the sym->type to retain the old functionality.

This then prepares the way to add code that depends on the distinction
between STT_NOTYPE and no symbol at all.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
2022-10-17 16:41:08 +02:00
Peter Zijlstra
08ef8c4011 objtool: Allow symbol range comparisons for IBT/ENDBR
A semi common pattern is where code checks if a code address is
within a specific range. All text addresses require either ENDBR or
ANNOTATE_ENDBR, however the ANNOTATE_NOENDBR past the range is
unnatural.

Instead, suppress this warning when this is exactly at the end of a
symbol that itself starts with either ENDBR/ANNOTATE_ENDBR.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lore.kernel.org/r/20220915111146.434642471@infradead.org
2022-10-17 16:41:08 +02:00
Peter Zijlstra
5da6aea375 objtool: Fix find_{symbol,func}_containing()
The current find_{symbol,func}_containing() functions are broken in
the face of overlapping symbols, exactly the case that is needed for a
new ibt/endbr supression.

Import interval_tree_generic.h into the tools tree and convert the
symbol tree to an interval tree to support proper range stabs.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lore.kernel.org/r/20220915111146.330203761@infradead.org
2022-10-17 16:41:08 +02:00
Peter Zijlstra
0c0a6d8934 objtool: Add --hacks=skylake
Make the call/func sections selectable via the --hacks option.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lore.kernel.org/r/20220915111146.120821440@infradead.org
2022-10-17 16:41:07 +02:00
Peter Zijlstra
00abd38408 objtool: Add .call_sites section
In preparation for call depth tracking provide a section which collects all
direct calls.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lore.kernel.org/r/20220915111146.016511961@infradead.org
2022-10-17 16:41:07 +02:00
Peter Zijlstra
6644ee846c objtool: Track init section
For future usage of .init.text exclusion track the init section in the
instruction decoder and use the result in retpoline validation.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lore.kernel.org/r/20220915111145.910334431@infradead.org
2022-10-17 16:41:07 +02:00
Peter Zijlstra
61c6065ef7 objtool: Allow !PC relative relocations
Objtool doesn't currently much like per-cpu usage in alternatives:

arch/x86/entry/entry_64.o: warning: objtool: .altinstr_replacement+0xf: unsupported relocation in alternatives section
  f:   65 c7 04 25 00 00 00 00 00 00 00 80     movl   $0x80000000,%gs:0x0      13: R_X86_64_32S        __x86_call_depth

Since the R_X86_64_32S relocation is location invariant (it's
computation doesn't include P - the address of the location itself),
it can be trivially allowed.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lore.kernel.org/r/20220915111145.806607235@infradead.org
2022-10-17 16:41:06 +02:00
Linus Torvalds
27bc50fc90 - Yu Zhao's Multi-Gen LRU patches are here. They've been under test in
linux-next for a couple of months without, to my knowledge, any negative
   reports (or any positive ones, come to that).
 
 - Also the Maple Tree from Liam R.  Howlett.  An overlapping range-based
   tree for vmas.  It it apparently slight more efficient in its own right,
   but is mainly targeted at enabling work to reduce mmap_lock contention.
 
   Liam has identified a number of other tree users in the kernel which
   could be beneficially onverted to mapletrees.
 
   Yu Zhao has identified a hard-to-hit but "easy to fix" lockdep splat
   (https://lkml.kernel.org/r/CAOUHufZabH85CeUN-MEMgL8gJGzJEWUrkiM58JkTbBhh-jew0Q@mail.gmail.com).
   This has yet to be addressed due to Liam's unfortunately timed
   vacation.  He is now back and we'll get this fixed up.
 
 - Dmitry Vyukov introduces KMSAN: the Kernel Memory Sanitizer.  It uses
   clang-generated instrumentation to detect used-unintialized bugs down to
   the single bit level.
 
   KMSAN keeps finding bugs.  New ones, as well as the legacy ones.
 
 - Yang Shi adds a userspace mechanism (madvise) to induce a collapse of
   memory into THPs.
 
 - Zach O'Keefe has expanded Yang Shi's madvise(MADV_COLLAPSE) to support
   file/shmem-backed pages.
 
 - userfaultfd updates from Axel Rasmussen
 
 - zsmalloc cleanups from Alexey Romanov
 
 - cleanups from Miaohe Lin: vmscan, hugetlb_cgroup, hugetlb and memory-failure
 
 - Huang Ying adds enhancements to NUMA balancing memory tiering mode's
   page promotion, with a new way of detecting hot pages.
 
 - memcg updates from Shakeel Butt: charging optimizations and reduced
   memory consumption.
 
 - memcg cleanups from Kairui Song.
 
 - memcg fixes and cleanups from Johannes Weiner.
 
 - Vishal Moola provides more folio conversions
 
 - Zhang Yi removed ll_rw_block() :(
 
 - migration enhancements from Peter Xu
 
 - migration error-path bugfixes from Huang Ying
 
 - Aneesh Kumar added ability for a device driver to alter the memory
   tiering promotion paths.  For optimizations by PMEM drivers, DRM
   drivers, etc.
 
 - vma merging improvements from Jakub Matěn.
 
 - NUMA hinting cleanups from David Hildenbrand.
 
 - xu xin added aditional userspace visibility into KSM merging activity.
 
 - THP & KSM code consolidation from Qi Zheng.
 
 - more folio work from Matthew Wilcox.
 
 - KASAN updates from Andrey Konovalov.
 
 - DAMON cleanups from Kaixu Xia.
 
 - DAMON work from SeongJae Park: fixes, cleanups.
 
 - hugetlb sysfs cleanups from Muchun Song.
 
 - Mike Kravetz fixes locking issues in hugetlbfs and in hugetlb core.
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQTTMBEPP41GrTpTJgfdBJ7gKXxAjgUCY0HaPgAKCRDdBJ7gKXxA
 joPjAQDZ5LlRCMWZ1oxLP2NOTp6nm63q9PWcGnmY50FjD/dNlwEAnx7OejCLWGWf
 bbTuk6U2+TKgJa4X7+pbbejeoqnt5QU=
 =xfWx
 -----END PGP SIGNATURE-----

Merge tag 'mm-stable-2022-10-08' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm

Pull MM updates from Andrew Morton:

 - Yu Zhao's Multi-Gen LRU patches are here. They've been under test in
   linux-next for a couple of months without, to my knowledge, any
   negative reports (or any positive ones, come to that).

 - Also the Maple Tree from Liam Howlett. An overlapping range-based
   tree for vmas. It it apparently slightly more efficient in its own
   right, but is mainly targeted at enabling work to reduce mmap_lock
   contention.

   Liam has identified a number of other tree users in the kernel which
   could be beneficially onverted to mapletrees.

   Yu Zhao has identified a hard-to-hit but "easy to fix" lockdep splat
   at [1]. This has yet to be addressed due to Liam's unfortunately
   timed vacation. He is now back and we'll get this fixed up.

 - Dmitry Vyukov introduces KMSAN: the Kernel Memory Sanitizer. It uses
   clang-generated instrumentation to detect used-unintialized bugs down
   to the single bit level.

   KMSAN keeps finding bugs. New ones, as well as the legacy ones.

 - Yang Shi adds a userspace mechanism (madvise) to induce a collapse of
   memory into THPs.

 - Zach O'Keefe has expanded Yang Shi's madvise(MADV_COLLAPSE) to
   support file/shmem-backed pages.

 - userfaultfd updates from Axel Rasmussen

 - zsmalloc cleanups from Alexey Romanov

 - cleanups from Miaohe Lin: vmscan, hugetlb_cgroup, hugetlb and
   memory-failure

 - Huang Ying adds enhancements to NUMA balancing memory tiering mode's
   page promotion, with a new way of detecting hot pages.

 - memcg updates from Shakeel Butt: charging optimizations and reduced
   memory consumption.

 - memcg cleanups from Kairui Song.

 - memcg fixes and cleanups from Johannes Weiner.

 - Vishal Moola provides more folio conversions

 - Zhang Yi removed ll_rw_block() :(

 - migration enhancements from Peter Xu

 - migration error-path bugfixes from Huang Ying

 - Aneesh Kumar added ability for a device driver to alter the memory
   tiering promotion paths. For optimizations by PMEM drivers, DRM
   drivers, etc.

 - vma merging improvements from Jakub Matěn.

 - NUMA hinting cleanups from David Hildenbrand.

 - xu xin added aditional userspace visibility into KSM merging
   activity.

 - THP & KSM code consolidation from Qi Zheng.

 - more folio work from Matthew Wilcox.

 - KASAN updates from Andrey Konovalov.

 - DAMON cleanups from Kaixu Xia.

 - DAMON work from SeongJae Park: fixes, cleanups.

 - hugetlb sysfs cleanups from Muchun Song.

 - Mike Kravetz fixes locking issues in hugetlbfs and in hugetlb core.

Link: https://lkml.kernel.org/r/CAOUHufZabH85CeUN-MEMgL8gJGzJEWUrkiM58JkTbBhh-jew0Q@mail.gmail.com [1]

* tag 'mm-stable-2022-10-08' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (555 commits)
  hugetlb: allocate vma lock for all sharable vmas
  hugetlb: take hugetlb vma_lock when clearing vma_lock->vma pointer
  hugetlb: fix vma lock handling during split vma and range unmapping
  mglru: mm/vmscan.c: fix imprecise comments
  mm/mglru: don't sync disk for each aging cycle
  mm: memcontrol: drop dead CONFIG_MEMCG_SWAP config symbol
  mm: memcontrol: use do_memsw_account() in a few more places
  mm: memcontrol: deprecate swapaccounting=0 mode
  mm: memcontrol: don't allocate cgroup swap arrays when memcg is disabled
  mm/secretmem: remove reduntant return value
  mm/hugetlb: add available_huge_pages() func
  mm: remove unused inline functions from include/linux/mm_inline.h
  selftests/vm: add selftest for MADV_COLLAPSE of uffd-minor memory
  selftests/vm: add file/shmem MADV_COLLAPSE selftest for cleared pmd
  selftests/vm: add thp collapse shmem testing
  selftests/vm: add thp collapse file and tmpfs testing
  selftests/vm: modularize thp collapse memory operations
  selftests/vm: dedup THP helpers
  mm/khugepaged: add tracepoint to hpage_collapse_scan_file()
  mm/madvise: add file and shmem support to MADV_COLLAPSE
  ...
2022-10-10 17:53:04 -07:00
Linus Torvalds
65f109e199 Objtool changes for v6.1:
- Remove the "ANNOTATE_NOENDBR on ENDBR" warning: it's not
    really useful and only found a non-bug false positive so far.
 
  - Properly decode LOOP/LOOPE/LOOPNE, which were missing from
    the x86 decoder. Because these instructions are rather
    ineffective, they never showed up in compiler output,
    but they are simple enough to support, so add them for
    completeness.
 
  - A bit more cross-arch preparatory work.
 
 Signed-off-by: Ingo Molnar <mingo@kernel.org>
 -----BEGIN PGP SIGNATURE-----
 
 iQJFBAABCgAvFiEEBpT5eoXrXCwVQwEKEnMQ0APhK1gFAmM/4C4RHG1pbmdvQGtl
 cm5lbC5vcmcACgkQEnMQ0APhK1golg//cWCQqnPy1JDwQIF5zm1oNXAZlcyDDfgp
 066NP8IlxajWgkPLunRSzId1ZQW4dppk7ccqI/J6mvNgKJe+nKGO1GXJ94aElxmr
 XJdJm8P3R+sYggfVtxMWqZr5cy9JTa7P/4bKDEC7hyP63FXZkOFA7Fk7PZTU/lKm
 gQ7IW8CKt+kSqygk8xBTS1gW7EfQWo0hA1mZN75ewwlJP0MJVlECKCfFEjQEVTJb
 z4M/iCybJolhiAGktGBM+bj8YijN2/ZvZ9zIC90neMaTLNFY6zMlUbV9WmVT5X+F
 1UFlKczNpOxwk3/IGVlwn3IzDToxYk1LMkWc7W7bfcu15aPvJUSR/QJjsRoGsLtX
 R4Cpa8dKkvDkbnMEUV9Y93AUehF6WO5ZEzPieCxEQLLieZD+XWQdSIiCugoB3+Kr
 BD1n9iE/Gm7IMHAp1yVkNGcMclu4Co9Cc36v71SzfuJznqf+Nv6gm/qy4KCofrZR
 C1CCnP/y1Yez4Z0avsbsTqQtCr/TCE7Nf8GmV3fL4rMtxG5eTap7X1GswiGRuL/G
 uTe9oKBHuSYBwrVK14/cqtAaPkSy+tIXb2GZUMfdYwx6baLTnz9MGhhXaOyDEOMw
 uwt0qd4cXmUqmAhyi+SGRsP4v9HVdmyQFonm8v2gymUltBrb5hWteFOKVPXgt9E+
 4wo/pldrr+A=
 =D5iK
 -----END PGP SIGNATURE-----

Merge tag 'objtool-core-2022-10-07' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull objtool updates from Ingo Molnar:

 - Remove the "ANNOTATE_NOENDBR on ENDBR" warning: it's not really
   useful and only found a non-bug false positive so far.

 - Properly decode LOOP/LOOPE/LOOPNE, which were missing from the x86
   decoder. Because these instructions are rather ineffective, they
   never showed up in compiler output, but they are simple enough to
   support, so add them for completeness.

 - A bit more cross-arch preparatory work.

* tag 'objtool-core-2022-10-07' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  objtool,x86: Teach decode about LOOP* instructions
  objtool: Remove "ANNOTATE_NOENDBR on ENDBR" warning
  objtool: Use arch_jump_destination() in read_intra_function_calls()
2022-10-10 09:56:19 -07:00
Linus Torvalds
0326074ff4 Networking changes for 6.1.
Core
 ----
 
  - Introduce and use a single page frag cache for allocating small skb
    heads, clawing back the 10-20% performance regression in UDP flood
    test from previous fixes.
 
  - Run packets which already went thru HW coalescing thru SW GRO.
    This significantly improves TCP segment coalescing and simplifies
    deployments as different workloads benefit from HW or SW GRO.
 
  - Shrink the size of the base zero-copy send structure.
 
  - Move TCP init under a new slow / sleepable version of DO_ONCE().
 
 BPF
 ---
 
  - Add BPF-specific, any-context-safe memory allocator.
 
  - Add helpers/kfuncs for PKCS#7 signature verification from BPF
    programs.
 
  - Define a new map type and related helpers for user space -> kernel
    communication over a ring buffer (BPF_MAP_TYPE_USER_RINGBUF).
 
  - Allow targeting BPF iterators to loop through resources of one
    task/thread.
 
  - Add ability to call selected destructive functions.
    Expose crash_kexec() to allow BPF to trigger a kernel dump.
    Use CAP_SYS_BOOT check on the loading process to judge permissions.
 
  - Enable BPF to collect custom hierarchical cgroup stats efficiently
    by integrating with the rstat framework.
 
  - Support struct arguments for trampoline based programs.
    Only structs with size <= 16B and x86 are supported.
 
  - Invoke cgroup/connect{4,6} programs for unprivileged ICMP ping
    sockets (instead of just TCP and UDP sockets).
 
  - Add a helper for accessing CLOCK_TAI for time sensitive network
    related programs.
 
  - Support accessing network tunnel metadata's flags.
 
  - Make TCP SYN ACK RTO tunable by BPF programs with TCP Fast Open.
 
  - Add support for writing to Netfilter's nf_conn:mark.
 
 Protocols
 ---------
 
  - WiFi: more Extremely High Throughput (EHT) and Multi-Link
    Operation (MLO) work (802.11be, WiFi 7).
 
  - vsock: improve support for SO_RCVLOWAT.
 
  - SMC: support SO_REUSEPORT.
 
  - Netlink: define and document how to use netlink in a "modern" way.
    Support reporting missing attributes via extended ACK.
 
  - IPSec: support collect metadata mode for xfrm interfaces.
 
  - TCPv6: send consistent autoflowlabel in SYN_RECV state
    and RST packets.
 
  - TCP: introduce optional per-netns connection hash table to allow
    better isolation between namespaces (opt-in, at the cost of memory
    and cache pressure).
 
  - MPTCP: support TCP_FASTOPEN_CONNECT.
 
  - Add NEXT-C-SID support in Segment Routing (SRv6) End behavior.
 
  - Adjust IP_UNICAST_IF sockopt behavior for connected UDP sockets.
 
  - Open vSwitch:
    - Allow specifying ifindex of new interfaces.
    - Allow conntrack and metering in non-initial user namespace.
 
  - TLS: support the Korean ARIA-GCM crypto algorithm.
 
  - Remove DECnet support.
 
 Driver API
 ----------
 
  - Allow selecting the conduit interface used by each port
    in DSA switches, at runtime.
 
  - Ethernet Power Sourcing Equipment and Power Device support.
 
  - Add tc-taprio support for queueMaxSDU parameter, i.e. setting
    per traffic class max frame size for time-based packet schedules.
 
  - Support PHY rate matching - adapting between differing host-side
    and link-side speeds.
 
  - Introduce QUSGMII PHY mode and 1000BASE-KX interface mode.
 
  - Validate OF (device tree) nodes for DSA shared ports; make
    phylink-related properties mandatory on DSA and CPU ports.
    Enforcing more uniformity should allow transitioning to phylink.
 
  - Require that flash component name used during update matches one
    of the components for which version is reported by info_get().
 
  - Remove "weight" argument from driver-facing NAPI API as much
    as possible. It's one of those magic knobs which seemed like
    a good idea at the time but is too indirect to use in practice.
 
  - Support offload of TLS connections with 256 bit keys.
 
 New hardware / drivers
 ----------------------
 
  - Ethernet:
    - Microchip KSZ9896 6-port Gigabit Ethernet Switch
    - Renesas Ethernet AVB (EtherAVB-IF) Gen4 SoCs
    - Analog Devices ADIN1110 and ADIN2111 industrial single pair
      Ethernet (10BASE-T1L) MAC+PHY.
    - Rockchip RV1126 Gigabit Ethernet (a version of stmmac IP).
 
  - Ethernet SFPs / modules:
    - RollBall / Hilink / Turris 10G copper SFPs
    - HALNy GPON module
 
  - WiFi:
    - CYW43439 SDIO chipset (brcmfmac)
    - CYW89459 PCIe chipset (brcmfmac)
    - BCM4378 on Apple platforms (brcmfmac)
 
 Drivers
 -------
 
  - CAN:
    - gs_usb: HW timestamp support
 
  - Ethernet PHYs:
    - lan8814: cable diagnostics
 
  - Ethernet NICs:
    - Intel (100G):
      - implement control of FCS/CRC stripping
      - port splitting via devlink
      - L2TPv3 filtering offload
    - nVidia/Mellanox:
      - tunnel offload for sub-functions
      - MACSec offload, w/ Extended packet number and replay
        window offload
      - significantly restructure, and optimize the AF_XDP support,
        align the behavior with other vendors
    - Huawei:
      - configuring DSCP map for traffic class selection
      - querying standard FEC statistics
      - querying SerDes lane number via ethtool
    - Marvell/Cavium:
      - egress priority flow control
      - MACSec offload
    - AMD/SolarFlare:
      - PTP over IPv6 and raw Ethernet
    - small / embedded:
      - ax88772: convert to phylink (to support SFP cages)
      - altera: tse: convert to phylink
      - ftgmac100: support fixed link
      - enetc: standard Ethtool counters
      - macb: ZynqMP SGMII dynamic configuration support
      - tsnep: support multi-queue and use page pool
      - lan743x: Rx IP & TCP checksum offload
      - igc: add xdp frags support to ndo_xdp_xmit
 
  - Ethernet high-speed switches:
    - Marvell (prestera):
      - support SPAN port features (traffic mirroring)
      - nexthop object offloading
    - Microchip (sparx5):
      - multicast forwarding offload
      - QoS queuing offload (tc-mqprio, tc-tbf, tc-ets)
 
  - Ethernet embedded switches:
    - Marvell (mv88e6xxx):
      - support RGMII cmode
    - NXP (felix):
      - standardized ethtool counters
    - Microchip (lan966x):
      - QoS queuing offload (tc-mqprio, tc-tbf, tc-cbs, tc-ets)
      - traffic policing and mirroring
      - link aggregation / bonding offload
      - QUSGMII PHY mode support
 
  - Qualcomm 802.11ax WiFi (ath11k):
    - cold boot calibration support on WCN6750
    - support to connect to a non-transmit MBSSID AP profile
    - enable remain-on-channel support on WCN6750
    - Wake-on-WLAN support for WCN6750
    - support to provide transmit power from firmware via nl80211
    - support to get power save duration for each client
    - spectral scan support for 160 MHz
 
  - MediaTek WiFi (mt76):
    - WiFi-to-Ethernet bridging offload for MT7986 chips
 
  - RealTek WiFi (rtw89):
    - P2P support
 
 Signed-off-by: Jakub Kicinski <kuba@kernel.org>
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEE6jPA+I1ugmIBA4hXMUZtbf5SIrsFAmM7vtkACgkQMUZtbf5S
 Irvotg//dmh53rC+UMKO3OgOqPlSMnaqzbUdDEfN6mj4Mpox7Csb8zERVURHhBHY
 fvlXWsDgxmvgTebI5fvNC5+f1iW5xcqgJV2TWnNmDOKWwvQwb6qQfgixVmunvkpe
 IIukMXYt0dAf9bXeeEfbNXcCb85cPwB76stX0tMV6BX7osp3T0TL1fvFk0NJkL0j
 TeydLad/yAQtPb4TbeWYjNDoxPVDf0cVpUrevLGmWE88UMYmgTqPze+h1W5Wri52
 bzjdLklY/4cgcIZClHQ6F9CeRWqEBxvujA5Hj/cwOcn/ptVVJWUGi7sQo3sYkoSs
 HFu+F8XsTec14kGNC0Ab40eVdqs5l/w8+E+4jvgXeKGOtVns8DwoiUIzqXpyty89
 Ib04mffrwWNjFtHvo/kIsNwP05X2PGE9HUHfwsTUfisl/ASvMmQp7D7vUoqQC/4B
 AMVzT5qpjkmfBHYQQGuw8FxJhMeAOjC6aAo6censhXJyiUhIfleQsN0syHdaNb8q
 9RZlhAgQoVb6ZgvBV8r8unQh/WtNZ3AopwifwVJld2unsE/UNfQy2KyqOWBES/zf
 LP9sfuX0JnmHn8s1BQEUMPU1jF9ZVZCft7nufJDL6JhlAL+bwZeEN4yCiAHOPZqE
 ymSLHI9s8yWZoNpuMWKrI9kFexVnQFKmA3+quAJUcYHNMSsLkL8=
 =Gsio
 -----END PGP SIGNATURE-----

Merge tag 'net-next-6.1' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next

Pull networking updates from Jakub Kicinski:
 "Core:

   - Introduce and use a single page frag cache for allocating small skb
     heads, clawing back the 10-20% performance regression in UDP flood
     test from previous fixes.

   - Run packets which already went thru HW coalescing thru SW GRO. This
     significantly improves TCP segment coalescing and simplifies
     deployments as different workloads benefit from HW or SW GRO.

   - Shrink the size of the base zero-copy send structure.

   - Move TCP init under a new slow / sleepable version of DO_ONCE().

  BPF:

   - Add BPF-specific, any-context-safe memory allocator.

   - Add helpers/kfuncs for PKCS#7 signature verification from BPF
     programs.

   - Define a new map type and related helpers for user space -> kernel
     communication over a ring buffer (BPF_MAP_TYPE_USER_RINGBUF).

   - Allow targeting BPF iterators to loop through resources of one
     task/thread.

   - Add ability to call selected destructive functions. Expose
     crash_kexec() to allow BPF to trigger a kernel dump. Use
     CAP_SYS_BOOT check on the loading process to judge permissions.

   - Enable BPF to collect custom hierarchical cgroup stats efficiently
     by integrating with the rstat framework.

   - Support struct arguments for trampoline based programs. Only
     structs with size <= 16B and x86 are supported.

   - Invoke cgroup/connect{4,6} programs for unprivileged ICMP ping
     sockets (instead of just TCP and UDP sockets).

   - Add a helper for accessing CLOCK_TAI for time sensitive network
     related programs.

   - Support accessing network tunnel metadata's flags.

   - Make TCP SYN ACK RTO tunable by BPF programs with TCP Fast Open.

   - Add support for writing to Netfilter's nf_conn:mark.

  Protocols:

   - WiFi: more Extremely High Throughput (EHT) and Multi-Link Operation
     (MLO) work (802.11be, WiFi 7).

   - vsock: improve support for SO_RCVLOWAT.

   - SMC: support SO_REUSEPORT.

   - Netlink: define and document how to use netlink in a "modern" way.
     Support reporting missing attributes via extended ACK.

   - IPSec: support collect metadata mode for xfrm interfaces.

   - TCPv6: send consistent autoflowlabel in SYN_RECV state and RST
     packets.

   - TCP: introduce optional per-netns connection hash table to allow
     better isolation between namespaces (opt-in, at the cost of memory
     and cache pressure).

   - MPTCP: support TCP_FASTOPEN_CONNECT.

   - Add NEXT-C-SID support in Segment Routing (SRv6) End behavior.

   - Adjust IP_UNICAST_IF sockopt behavior for connected UDP sockets.

   - Open vSwitch:
      - Allow specifying ifindex of new interfaces.
      - Allow conntrack and metering in non-initial user namespace.

   - TLS: support the Korean ARIA-GCM crypto algorithm.

   - Remove DECnet support.

  Driver API:

   - Allow selecting the conduit interface used by each port in DSA
     switches, at runtime.

   - Ethernet Power Sourcing Equipment and Power Device support.

   - Add tc-taprio support for queueMaxSDU parameter, i.e. setting per
     traffic class max frame size for time-based packet schedules.

   - Support PHY rate matching - adapting between differing host-side
     and link-side speeds.

   - Introduce QUSGMII PHY mode and 1000BASE-KX interface mode.

   - Validate OF (device tree) nodes for DSA shared ports; make
     phylink-related properties mandatory on DSA and CPU ports.
     Enforcing more uniformity should allow transitioning to phylink.

   - Require that flash component name used during update matches one of
     the components for which version is reported by info_get().

   - Remove "weight" argument from driver-facing NAPI API as much as
     possible. It's one of those magic knobs which seemed like a good
     idea at the time but is too indirect to use in practice.

   - Support offload of TLS connections with 256 bit keys.

  New hardware / drivers:

   - Ethernet:
      - Microchip KSZ9896 6-port Gigabit Ethernet Switch
      - Renesas Ethernet AVB (EtherAVB-IF) Gen4 SoCs
      - Analog Devices ADIN1110 and ADIN2111 industrial single pair
        Ethernet (10BASE-T1L) MAC+PHY.
      - Rockchip RV1126 Gigabit Ethernet (a version of stmmac IP).

   - Ethernet SFPs / modules:
      - RollBall / Hilink / Turris 10G copper SFPs
      - HALNy GPON module

   - WiFi:
      - CYW43439 SDIO chipset (brcmfmac)
      - CYW89459 PCIe chipset (brcmfmac)
      - BCM4378 on Apple platforms (brcmfmac)

  Drivers:

   - CAN:
      - gs_usb: HW timestamp support

   - Ethernet PHYs:
      - lan8814: cable diagnostics

   - Ethernet NICs:
      - Intel (100G):
         - implement control of FCS/CRC stripping
         - port splitting via devlink
         - L2TPv3 filtering offload
      - nVidia/Mellanox:
         - tunnel offload for sub-functions
         - MACSec offload, w/ Extended packet number and replay window
           offload
         - significantly restructure, and optimize the AF_XDP support,
           align the behavior with other vendors
      - Huawei:
         - configuring DSCP map for traffic class selection
         - querying standard FEC statistics
         - querying SerDes lane number via ethtool
      - Marvell/Cavium:
         - egress priority flow control
         - MACSec offload
      - AMD/SolarFlare:
         - PTP over IPv6 and raw Ethernet
      - small / embedded:
         - ax88772: convert to phylink (to support SFP cages)
         - altera: tse: convert to phylink
         - ftgmac100: support fixed link
         - enetc: standard Ethtool counters
         - macb: ZynqMP SGMII dynamic configuration support
         - tsnep: support multi-queue and use page pool
         - lan743x: Rx IP & TCP checksum offload
         - igc: add xdp frags support to ndo_xdp_xmit

   - Ethernet high-speed switches:
      - Marvell (prestera):
         - support SPAN port features (traffic mirroring)
         - nexthop object offloading
      - Microchip (sparx5):
         - multicast forwarding offload
         - QoS queuing offload (tc-mqprio, tc-tbf, tc-ets)

   - Ethernet embedded switches:
      - Marvell (mv88e6xxx):
         - support RGMII cmode
      - NXP (felix):
         - standardized ethtool counters
      - Microchip (lan966x):
         - QoS queuing offload (tc-mqprio, tc-tbf, tc-cbs, tc-ets)
         - traffic policing and mirroring
         - link aggregation / bonding offload
         - QUSGMII PHY mode support

   - Qualcomm 802.11ax WiFi (ath11k):
      - cold boot calibration support on WCN6750
      - support to connect to a non-transmit MBSSID AP profile
      - enable remain-on-channel support on WCN6750
      - Wake-on-WLAN support for WCN6750
      - support to provide transmit power from firmware via nl80211
      - support to get power save duration for each client
      - spectral scan support for 160 MHz

   - MediaTek WiFi (mt76):
      - WiFi-to-Ethernet bridging offload for MT7986 chips

   - RealTek WiFi (rtw89):
      - P2P support"

* tag 'net-next-6.1' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (1864 commits)
  eth: pse: add missing static inlines
  once: rename _SLOW to _SLEEPABLE
  net: pse-pd: add regulator based PSE driver
  dt-bindings: net: pse-dt: add bindings for regulator based PoDL PSE controller
  ethtool: add interface to interact with Ethernet Power Equipment
  net: mdiobus: search for PSE nodes by parsing PHY nodes.
  net: mdiobus: fwnode_mdiobus_register_phy() rework error handling
  net: add framework to support Ethernet PSE and PDs devices
  dt-bindings: net: phy: add PoDL PSE property
  net: marvell: prestera: Propagate nh state from hw to kernel
  net: marvell: prestera: Add neighbour cache accounting
  net: marvell: prestera: add stub handler neighbour events
  net: marvell: prestera: Add heplers to interact with fib_notifier_info
  net: marvell: prestera: Add length macros for prestera_ip_addr
  net: marvell: prestera: add delayed wq and flush wq on deinit
  net: marvell: prestera: Add strict cleanup of fib arbiter
  net: marvell: prestera: Add cleanup of allocated fib_nodes
  net: marvell: prestera: Add router nexthops ABI
  eth: octeon: fix build after netif_napi_add() changes
  net/mlx5: E-Switch, Return EBUSY if can't get mode lock
  ...
2022-10-04 13:38:03 -07:00
Linus Torvalds
7db99f01d1 - Print the CPU number at segfault time. The number printed is not
always accurate (preemption is enabled at that time) but the print string
   contains "likely" and after a lot of back'n'forth on this, this was the
   consensus that was reached. See thread starting at:
   https://lore.kernel.org/r/5d62c1d0-7425-d5bb-ecb5-1dc3b4d7d245@intel.com
 
 - After a *lot* of testing and polishing, finally the clear_user()
   improvements to inline REP; STOSB by default
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEzv7L6UO9uDPlPSfHEsHwGGHeVUoFAmM780YACgkQEsHwGGHe
 VUoVmRAAhTTUYqe81XRAX1Egge1RVwXgZFZQQD54239IveLt80kpg1AFR697z3/G
 GSaL5dy9/LEwyE0r9u7VN1SuY3Bz8LEZLYfurKlGfMGv3mlcUedLSWHQNaNZ4+cx
 nwP1VjrfH80Qwn7l99hOZ7kwCRlUWdsamcMsv6n7Fq0YnM7vW6MgmQGlqCGADQKI
 GGElgn3VaU5pEXF4YtZE0qfy17dgkW+RJD7RlaAtzmKcdKYiKgfX0Mh9bkQ5VVcb
 973peg9GHKHoP0w54LYKNgF/WkYPpBwcNIkJW//aMLGS5ofalliu0W281okAhZ7w
 Sknnx+umoprCAV9ljn/HZbQnseXPXKlZWUxiwfEZD2fcBI6+60HK86zVSEvrYxNs
 COZRhpj/bwvOt5LtNj7dcV6cbbBvJJIEJSNazXHscds2zUh79fidzkjZ/Z0Hbo7D
 sr7SM80STaURGFPTLoD41z4TY3V2GH8JUbWG9+I4fBjmSDyX+tQKKJhH886MRSSW
 ZQ0pY1uhwF8QBMlp3t87pZ2gbdjMr40PSYJGU+Xqsb/khq7xOWnP1CiIkVUHLF2L
 n/hm6VF6ifwFstFhOBc8nTcz7kGqzKvlQssFpf+DAIl/b7mZ3AKHVIJxTpZ1hcis
 BqQdeNcVEcNCRZWFzrG5hCs0b7xtaJrI30RPSMsQP2IcaX+DNyg=
 =xLSS
 -----END PGP SIGNATURE-----

Merge tag 'x86_cpu_for_v6.1_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull x86 cpu updates from Borislav Petkov:

 - Print the CPU number at segfault time.

   The number printed is not always accurate (preemption is enabled at
   that time) but the print string contains "likely" and after a lot of
   back'n'forth on this, this was the consensus that was reached. See
   thread at [1].

 - After a *lot* of testing and polishing, finally the clear_user()
   improvements to inline REP; STOSB by default

Link: https://lore.kernel.org/r/5d62c1d0-7425-d5bb-ecb5-1dc3b4d7d245@intel.com [1]

* tag 'x86_cpu_for_v6.1_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/mm: Print likely CPU at segfault time
  x86/clear_user: Make it faster
2022-10-04 09:21:30 -07:00
Alexander Potapenko
40b22c9df2 objtool: kmsan: list KMSAN API functions as uaccess-safe
KMSAN inserts API function calls in a lot of places (function entries and
exits, local variables, memory accesses), so they may get called from the
uaccess regions as well.

KMSAN API functions are used to update the metadata (shadow/origin pages)
for kernel memory accesses.  The metadata pages for kernel pointers are
also located in the kernel memory, so touching them is not a problem.  For
userspace pointers, no metadata is allocated.

If an API function is supposed to read or modify the metadata, it does so
for kernel pointers and ignores userspace pointers.  If an API function is
supposed to return a pair of metadata pointers for the instrumentation to
use (like all __msan_metadata_ptr_for_TYPE_SIZE() functions do), it
returns the allocated metadata for kernel pointers and special dummy
buffers residing in the kernel memory for userspace pointers.

As a result, none of KMSAN API functions perform userspace accesses, but
since they might be called from UACCESS regions they use
user_access_save/restore().

Link: https://lkml.kernel.org/r/20220915150417.722975-32-glider@google.com
Signed-off-by: Alexander Potapenko <glider@google.com>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Andrey Konovalov <andreyknvl@gmail.com>
Cc: Andrey Konovalov <andreyknvl@google.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Christoph Lameter <cl@linux.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Eric Biggers <ebiggers@google.com>
Cc: Eric Biggers <ebiggers@kernel.org>
Cc: Eric Dumazet <edumazet@google.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Cc: Ilya Leoshkevich <iii@linux.ibm.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Marco Elver <elver@google.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Petr Mladek <pmladek@suse.com>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Cc: Vegard Nossum <vegard.nossum@oracle.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-10-03 14:03:23 -07:00
Sami Tolvanen
3c68a92d17 objtool: Disable CFI warnings
The __cfi_ preambles contain a mov instruction that embeds the KCFI
type identifier in the following format:

  ; type preamble
  __cfi_function:
    mov <id>, %eax
  function:
    ...

While the preamble symbols are STT_FUNC and contain valid
instructions, they are never executed and always fall through. Skip
the warning for them.

.kcfi_traps sections point to CFI traps in text sections. Also skip
the warning about them referencing !ENDBR instructions.

Signed-off-by: Sami Tolvanen <samitolvanen@google.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
Tested-by: Kees Cook <keescook@chromium.org>
Tested-by: Nathan Chancellor <nathan@kernel.org>
Acked-by: Josh Poimboeuf <jpoimboe@kernel.org>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Tested-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Kees Cook <keescook@chromium.org>
Link: https://lore.kernel.org/r/20220908215504.3686827-18-samitolvanen@google.com
2022-09-26 10:13:15 -07:00
Sami Tolvanen
5141d3a06b objtool: Preserve special st_shndx indexes in elf_update_symbol
elf_update_symbol fails to preserve the special st_shndx values
between [SHN_LORESERVE, SHN_HIRESERVE], which results in it
converting SHN_ABS entries into SHN_UNDEF, for example. Explicitly
check for the special indexes and ensure these symbols are not
marked undefined.

Fixes: ead165fa10 ("objtool: Fix symbol creation")
Signed-off-by: Sami Tolvanen <samitolvanen@google.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Tested-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Kees Cook <keescook@chromium.org>
Link: https://lore.kernel.org/r/20220908215504.3686827-17-samitolvanen@google.com
2022-09-26 10:13:15 -07:00
Peter Zijlstra (Intel)
9440155ccb ftrace: Add HAVE_DYNAMIC_FTRACE_NO_PATCHABLE
x86 will shortly start using -fpatchable-function-entry for purposes
other than ftrace, make sure the __patchable_function_entry section
isn't merged in the mcount_loc section.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20220903131154.420467-2-jolsa@kernel.org
2022-09-16 22:16:48 +02:00
Peter Zijlstra
7a7621dfa4 objtool,x86: Teach decode about LOOP* instructions
When 'discussing' control flow Masami mentioned the LOOP* instructions
and I realized objtool doesn't decode them properly.

As it turns out, these instructions are somewhat inefficient and as
such unlikely to be emitted by the compiler (a few vmlinux.o checks
can't find a single one) so this isn't critical, but still, best to
decode them properly.

Reported-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/Yxhd4EMKyoFoH9y4@hirez.programming.kicks-ass.net
2022-09-15 16:13:55 +02:00
Linus Torvalds
2f23a7c914 Misc fixes:
- Fix PAT on Xen, which caused i915 driver failures
  - Fix compat INT 80 entry crash on Xen PV guests
  - Fix 'MMIO Stale Data' mitigation status reporting on older Intel CPUs
  - Fix RSB stuffing regressions
  - Fix ORC unwinding on ftrace trampolines
  - Add Intel Raptor Lake CPU model number
  - Fix (work around) a SEV-SNP bootloader bug providing bogus values in
    boot_params->cc_blob_address, by ignoring the value on !SEV-SNP bootups.
  - Fix SEV-SNP early boot failure
  - Fix the objtool list of noreturn functions and annotate snp_abort(),
    which bug confused objtool on gcc-12.
  - Fix the documentation for retbleed
 
 Signed-off-by: Ingo Molnar <mingo@kernel.org>
 -----BEGIN PGP SIGNATURE-----
 
 iQJFBAABCgAvFiEEBpT5eoXrXCwVQwEKEnMQ0APhK1gFAmMLgUERHG1pbmdvQGtl
 cm5lbC5vcmcACgkQEnMQ0APhK1hU1hAAj9bKO+D07gROpkRhXpLvbqm+mUqk6It8
 qEuyLkC/xvD9N//Pya0ZPPE+l23EHQxM/L0tXCYwsc8Zlx3fr678FQktHZuxULaP
 qlGmwub8PvsyMesXBGtgHJ4RT8L07FelBR+E/TpqCSbv/geM7mcc0ojyOKo+OmbD
 vbsUTDNqsMYndzePUBrv/vJRqVLGlbd1a22DKE/YiYBbZu5hughNUB0bHYhYdsml
 mutcRsVTKRbZOi4wiuY/pnveMf/z9wAwKOWd9EBaDWILygVSHkBJJQyhHigBh3F8
 H1hFn1q4ybULdrAUT4YND9Tjb6U8laU0fxg8Z43bay7bFXXElqoj3W2qka9NjAim
 QvWSFvTYQRTIWt6sCshVsUBWj6fBZdHxcJ8Fh+ucJJp+JWl9/aD0A41vK2Jx0LIt
 p8YzBRKEqd1Q/7QD855BArN7HNtQGgShNt4oPVZ7nPnjuQt0+lngu8NR+6X3RKpX
 r8rordyPvzgPL6W+1uV+8hnz1w+YU2xplAbK+zTijwgJVgyf8khSlZQNpFKULrou
 zjtzo/2nB+4C4bvfetNnaOGhi1/AdCHZHyZE35rotpd73SLHvdOrH0Ll9oCVfVrC
 UWbC1E67cHQw97Ni/4CrCsJRBULK01uyszVCxlEkSYInf0UsKlnmy+TqxizwsCVy
 reYQ0ePyWg0=
 =ZpCJ
 -----END PGP SIGNATURE-----

Merge tag 'x86-urgent-2022-08-28' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull misc x86 fixes from Ingo Molnar:

 - Fix PAT on Xen, which caused i915 driver failures

 - Fix compat INT 80 entry crash on Xen PV guests

 - Fix 'MMIO Stale Data' mitigation status reporting on older Intel CPUs

 - Fix RSB stuffing regressions

 - Fix ORC unwinding on ftrace trampolines

 - Add Intel Raptor Lake CPU model number

 - Fix (work around) a SEV-SNP bootloader bug providing bogus values in
   boot_params->cc_blob_address, by ignoring the value on !SEV-SNP
   bootups.

 - Fix SEV-SNP early boot failure

 - Fix the objtool list of noreturn functions and annotate snp_abort(),
   which bug confused objtool on gcc-12.

 - Fix the documentation for retbleed

* tag 'x86-urgent-2022-08-28' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  Documentation/ABI: Mention retbleed vulnerability info file for sysfs
  x86/sev: Mark snp_abort() noreturn
  x86/sev: Don't use cc_platform_has() for early SEV-SNP calls
  x86/boot: Don't propagate uninitialized boot_params->cc_blob_address
  x86/cpu: Add new Raptor Lake CPU model number
  x86/unwind/orc: Unwind ftrace trampolines with correct ORC entry
  x86/nospec: Fix i386 RSB stuffing
  x86/nospec: Unwreck the RSB stuffing
  x86/bugs: Add "unknown" reporting for MMIO Stale Data
  x86/entry: Fix entry_INT80_compat for Xen PV guests
  x86/PAT: Have pat_enabled() properly reflect state when running on Xen
2022-08-28 10:10:23 -07:00
Borislav Petkov
c93c296fff x86/sev: Mark snp_abort() noreturn
Mark both the function prototype and definition as noreturn in order to
prevent the compiler from doing transformations which confuse objtool
like so:

  vmlinux.o: warning: objtool: sme_enable+0x71: unreachable instruction

This triggers with gcc-12.

Add it and sev_es_terminate() to the objtool noreturn tracking array
too. Sort it while at it.

Suggested-by: Michael Matz <matz@suse.de>
Signed-off-by: Borislav Petkov <bp@suse.de>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20220824152420.20547-1-bp@alien8.de
2022-08-25 15:54:03 +02:00
Josh Poimboeuf
3c6f390080 objtool: Remove "ANNOTATE_NOENDBR on ENDBR" warning
This warning isn't very useful: why would you put ANNOTATE_NOENDBR on
ENDBR, and if you did, what's the harm?

And thus far it's only found one non-bug, where the
'__end_entry_SYSENTER_compat' label happens to land on the ENDBR from
entry_SYSCALL_compat:

  vmlinux.o: warning: objtool: entry_SYSCALL_compat+0x0: ANNOTATE_NOENDBR on ENDBR

.. which is fine.  Just remove the warning.

Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/142341a5dafdfc788e4c95b9e226a6eefc9b626e.1660839773.git.jpoimboe@kernel.org
2022-08-19 19:47:57 +02:00
Chen Zhongjin
7b3e318690 objtool: Use arch_jump_destination() in read_intra_function_calls()
Use arch_jump_destiation() instead of the open-coded 'offset + len +
immediate' that is x86 specific.

Avoids future trouble with other architectures.

Signed-off-by: Chen Zhongjin <chenzhongjin@huawei.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20220818014553.220261-1-chenzhongjin@huawei.com
2022-08-19 19:47:56 +02:00
Josh Poimboeuf
e27e5bea95 x86/ibt, objtool: Add IBT_NOSEAL()
Add a macro which prevents a function from getting sealed if there are
no compile-time references to it.

Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
Message-Id: <20220818213927.e44fmxkoq4yj6ybn@treble>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-08-19 04:05:42 -04:00
Borislav Petkov
0db7058e8e x86/clear_user: Make it faster
Based on a patch by Mark Hemment <markhemm@googlemail.com> and
incorporating very sane suggestions from Linus.

The point here is to have the default case with FSRM - which is supposed
to be the majority of x86 hw out there - if not now then soon - be
directly inlined into the instruction stream so that no function call
overhead is taking place.

Drop the early clobbers from the @size and @addr operands as those are
not needed anymore since we have single instruction alternatives.

The benchmarks I ran would show very small improvements and a PF
benchmark would even show weird things like slowdowns with higher core
counts.

So for a ~6m running the git test suite, the function gets called under
700K times, all from padzero():

  <...>-2536    [006] .....   261.208801: padzero: to: 0x55b0663ed214, size: 3564, cycles: 21900
  <...>-2536    [006] .....   261.208819: padzero: to: 0x7f061adca078, size: 3976, cycles: 17160
  <...>-2537    [008] .....   261.211027: padzero: to: 0x5572d019e240, size: 3520, cycles: 23850
  <...>-2537    [008] .....   261.211049: padzero: to: 0x7f1288dc9078, size: 3976, cycles: 15900
   ...

which is around 1%-ish of the total time and which is consistent with
the benchmark numbers.

So Mel gave me the idea to simply measure how fast the function becomes.
I.e.:

  start = rdtsc_ordered();
  ret = __clear_user(to, n);
  end = rdtsc_ordered();

Computing the mean average of all the samples collected during the test
suite run then shows some improvement:

  clear_user_original:
  Amean: 9219.71 (Sum: 6340154910, samples: 687674)

  fsrm:
  Amean: 8030.63 (Sum: 5522277720, samples: 687652)

That's on Zen3.

The situation looks a lot more confusing on Intel:

Icelake:

  clear_user_original:
  Amean: 19679.4 (Sum: 13652560764, samples: 693750)
  Amean: 19743.7 (Sum: 13693470604, samples: 693562)

(I ran it twice just to be sure.)

  ERMS:
  Amean: 20374.3 (Sum: 13910601024, samples: 682752)
  Amean: 20453.7 (Sum: 14186223606, samples: 693576)

  FSRM:
  Amean: 20458.2 (Sum: 13918381386, sample s: 680331)

The original microbenchmark which people were complaining about:

  for i in $(seq 1 10); do dd if=/dev/zero of=/dev/null bs=1M status=progress count=65536; done 2>&1 | grep copied
  32207011840 bytes (32 GB, 30 GiB) copied, 1 s, 32.2 GB/s
  68719476736 bytes (69 GB, 64 GiB) copied, 1.93069 s, 35.6 GB/s
  37597741056 bytes (38 GB, 35 GiB) copied, 1 s, 37.6 GB/s
  68719476736 bytes (69 GB, 64 GiB) copied, 1.78017 s, 38.6 GB/s
  62020124672 bytes (62 GB, 58 GiB) copied, 2 s, 31.0 GB/s
  68719476736 bytes (69 GB, 64 GiB) copied, 2.13716 s, 32.2 GB/s
  60010004480 bytes (60 GB, 56 GiB) copied, 1 s, 60.0 GB/s
  68719476736 bytes (69 GB, 64 GiB) copied, 1.14129 s, 60.2 GB/s
  53212086272 bytes (53 GB, 50 GiB) copied, 1 s, 53.2 GB/s
  68719476736 bytes (69 GB, 64 GiB) copied, 1.28398 s, 53.5 GB/s
  55698259968 bytes (56 GB, 52 GiB) copied, 1 s, 55.7 GB/s
  68719476736 bytes (69 GB, 64 GiB) copied, 1.22507 s, 56.1 GB/s
  55306092544 bytes (55 GB, 52 GiB) copied, 1 s, 55.3 GB/s
  68719476736 bytes (69 GB, 64 GiB) copied, 1.23647 s, 55.6 GB/s
  54387539968 bytes (54 GB, 51 GiB) copied, 1 s, 54.4 GB/s
  68719476736 bytes (69 GB, 64 GiB) copied, 1.25693 s, 54.7 GB/s
  50566529024 bytes (51 GB, 47 GiB) copied, 1 s, 50.6 GB/s
  68719476736 bytes (69 GB, 64 GiB) copied, 1.35096 s, 50.9 GB/s
  58308165632 bytes (58 GB, 54 GiB) copied, 1 s, 58.3 GB/s
  68719476736 bytes (69 GB, 64 GiB) copied, 1.17394 s, 58.5 GB/s

Now the same thing with smaller buffers:

  for i in $(seq 1 10); do dd if=/dev/zero of=/dev/null bs=1M status=progress count=8192; done 2>&1 | grep copied
  8589934592 bytes (8.6 GB, 8.0 GiB) copied, 0.28485 s, 30.2 GB/s
  8589934592 bytes (8.6 GB, 8.0 GiB) copied, 0.276112 s, 31.1 GB/s
  8589934592 bytes (8.6 GB, 8.0 GiB) copied, 0.29136 s, 29.5 GB/s
  8589934592 bytes (8.6 GB, 8.0 GiB) copied, 0.283803 s, 30.3 GB/s
  8589934592 bytes (8.6 GB, 8.0 GiB) copied, 0.306503 s, 28.0 GB/s
  8589934592 bytes (8.6 GB, 8.0 GiB) copied, 0.349169 s, 24.6 GB/s
  8589934592 bytes (8.6 GB, 8.0 GiB) copied, 0.276912 s, 31.0 GB/s
  8589934592 bytes (8.6 GB, 8.0 GiB) copied, 0.265356 s, 32.4 GB/s
  8589934592 bytes (8.6 GB, 8.0 GiB) copied, 0.28464 s, 30.2 GB/s
  8589934592 bytes (8.6 GB, 8.0 GiB) copied, 0.242998 s, 35.3 GB/s

is also not conclusive because it all depends on the buffer sizes,
their alignments and when the microcode detects that cachelines can be
aggregated properly and copied in bigger sizes.

Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lore.kernel.org/r/CAHk-=wh=Mu_EYhtOmPn6AxoQZyEh-4fo2Zx3G7rBv1g7vwoKiw@mail.gmail.com
2022-08-18 12:36:42 +02:00
Linus Torvalds
aad26f55f4 This was a moderately busy cycle for documentation, but nothing all that
earth-shaking:
 
 - More Chinese translations, and an update to the Italian translations.
   The Japanese, Korean, and traditional Chinese translations are
   more-or-less unmaintained at this point, instead.
 
 - Some build-system performance improvements.
 
 - The removal of the archaic submitting-drivers.rst document, with the
   movement of what useful material that remained into other docs.
 
 - Improvements to sphinx-pre-install to, hopefully, give more useful
   suggestions.
 
 - A number of build-warning fixes
 
 Plus the usual collection of typo fixes, updates, and more.
 -----BEGIN PGP SIGNATURE-----
 
 iQFDBAABCAAtFiEEIw+MvkEiF49krdp9F0NaE2wMflgFAmLn9OwPHGNvcmJldEBs
 d24ubmV0AAoJEBdDWhNsDH5YtrwIAJNZoDYJJIRuVHnFkAn5EJ4b/chnR1dSTBtn
 WdE/1zdAlMBWVlEGO48VZybph9Sk0v+cUGf+yviDgASQrfOhRRTkg/0u6XaBAYO0
 +C2D1QDd9DggGgajxsfJfTdD3IuB78mGmCQvP17XIJW+NK1CK9rXZBnj6WC5/HJw
 PCHzeeVreBxOS3W9GelMYa6vjVl7dv81x4DPllnsgU2AMk0/Ce0MVjeIZ695sOeP
 Ki6jZgC2GsgFSK5kBC35OiDe5q+fDzlLfek34EUCn4SIbMALSUYWO1db122w5Pme
 Ej0+UTBhD19WH1uB/rcVKnVWugi7UEUJexZsao+nC7UrdIVtYq0=
 =83BG
 -----END PGP SIGNATURE-----

Merge tag 'docs-6.0' of git://git.lwn.net/linux

Pull documentation updates from Jonathan Corbet:
 "This was a moderately busy cycle for documentation, but nothing
  all that earth-shaking:

   - More Chinese translations, and an update to the Italian
     translations.

     The Japanese, Korean, and traditional Chinese translations
     are more-or-less unmaintained at this point, instead.

   - Some build-system performance improvements.

   - The removal of the archaic submitting-drivers.rst document,
     with the movement of what useful material that remained into
     other docs.

   - Improvements to sphinx-pre-install to, hopefully, give more
     useful suggestions.

   - A number of build-warning fixes

  Plus the usual collection of typo fixes, updates, and more"

* tag 'docs-6.0' of git://git.lwn.net/linux: (92 commits)
  docs: efi-stub: Fix paths for x86 / arm stubs
  Docs/zh_CN: Update the translation of sched-stats to 5.19-rc8
  Docs/zh_CN: Update the translation of pci to 5.19-rc8
  Docs/zh_CN: Update the translation of pci-iov-howto to 5.19-rc8
  Docs/zh_CN: Update the translation of usage to 5.19-rc8
  Docs/zh_CN: Update the translation of testing-overview to 5.19-rc8
  Docs/zh_CN: Update the translation of sparse to 5.19-rc8
  Docs/zh_CN: Update the translation of kasan to 5.19-rc8
  Docs/zh_CN: Update the translation of iio_configfs to 5.19-rc8
  doc:it_IT: align Italian documentation
  docs: Remove spurious tag from admin-guide/mm/overcommit-accounting.rst
  Documentation: process: Update email client instructions for Thunderbird
  docs: ABI: correct QEMU fw_cfg spec path
  doc/zh_CN: remove submitting-driver reference from docs
  docs: zh_TW: align to submitting-drivers removal
  docs: zh_CN: align to submitting-drivers removal
  docs: ko_KR: howto: remove reference to removed submitting-drivers
  docs: ja_JP: howto: remove reference to removed submitting-drivers
  docs: it_IT: align to submitting-drivers removal
  docs: process: remove outdated submitting-drivers.rst
  ...
2022-08-02 19:24:24 -07:00
Linus Torvalds
ce114c8668 Just when you thought that all the speculation bugs were addressed and
solved and the nightmare is complete, here's the next one: speculating
 after RET instructions and leaking privileged information using the now
 pretty much classical covert channels.
 
 It is called RETBleed and the mitigation effort and controlling
 functionality has been modelled similar to what already existing
 mitigations provide.
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEzv7L6UO9uDPlPSfHEsHwGGHeVUoFAmLKqAgACgkQEsHwGGHe
 VUoM5w/8CSvwPZ3otkhmu8MrJPtWc7eLDPjYN4qQP+19e+bt094MoozxeeWG2wmp
 hkDJAYHT2Oik/qDuEdhFgNYwS7XGgbV3Py3B8syO4//5SD5dkOSG+QqFXvXMdFri
 YsVqqNkjJOWk/YL9Ql5RS/xQewsrr0OqEyWWocuI6XAvfWV4kKvlRSd+6oPqtZEO
 qYlAHTXElyIrA/gjmxChk1HTt5HZtK3uJLf4twNlUfzw7LYFf3+sw3bdNuiXlyMr
 WcLXMwGpS0idURwP3mJa7JRuiVBzb4+kt8mWwWqA02FkKV45FRRRFhFUsy667r00
 cdZBaWdy+b7dvXeliO3FN/x1bZwIEUxmaNy1iAClph4Ifh0ySPUkxAr8EIER7YBy
 bstDJEaIqgYg8NIaD4oF1UrG0ZbL0ImuxVaFdhG1hopQsh4IwLSTLgmZYDhfn/0i
 oSqU0Le+A7QW9s2A2j6qi7BoAbRW+gmBuCgg8f8ECYRkFX1ZF6mkUtnQxYrU7RTq
 rJWGW9nhwM9nRxwgntZiTjUUJ2HtyXEgYyCNjLFCbEBfeG5QTg7XSGFhqDbgoymH
 85vsmSXYxgTgQ/kTW7Fs26tOqnP2h1OtLJZDL8rg49KijLAnISClEgohYW01CWQf
 ZKMHtz3DM0WBiLvSAmfGifScgSrLB5AjtvFHT0hF+5/okEkinVk=
 =09fW
 -----END PGP SIGNATURE-----

Merge tag 'x86_bugs_retbleed' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull x86 retbleed fixes from Borislav Petkov:
 "Just when you thought that all the speculation bugs were addressed and
  solved and the nightmare is complete, here's the next one: speculating
  after RET instructions and leaking privileged information using the
  now pretty much classical covert channels.

  It is called RETBleed and the mitigation effort and controlling
  functionality has been modelled similar to what already existing
  mitigations provide"

* tag 'x86_bugs_retbleed' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (54 commits)
  x86/speculation: Disable RRSBA behavior
  x86/kexec: Disable RET on kexec
  x86/bugs: Do not enable IBPB-on-entry when IBPB is not supported
  x86/entry: Move PUSH_AND_CLEAR_REGS() back into error_entry
  x86/bugs: Add Cannon lake to RETBleed affected CPU list
  x86/retbleed: Add fine grained Kconfig knobs
  x86/cpu/amd: Enumerate BTC_NO
  x86/common: Stamp out the stepping madness
  KVM: VMX: Prevent RSB underflow before vmenter
  x86/speculation: Fill RSB on vmexit for IBRS
  KVM: VMX: Fix IBRS handling after vmexit
  KVM: VMX: Prevent guest RSB poisoning attacks with eIBRS
  KVM: VMX: Convert launched argument to flags
  KVM: VMX: Flatten __vmx_vcpu_run()
  objtool: Re-add UNWIND_HINT_{SAVE_RESTORE}
  x86/speculation: Remove x86_spec_ctrl_mask
  x86/speculation: Use cached host SPEC_CTRL value for guest entry/exit
  x86/speculation: Fix SPEC_CTRL write on SMT state change
  x86/speculation: Fix firmware entry SPEC_CTRL handling
  x86/speculation: Fix RSB filling with CONFIG_RETPOLINE=n
  ...
2022-07-11 18:15:25 -07:00
Mauro Carvalho Chehab
d6a21f2d73 objtool: update objtool.txt references
Changeset a8e35fece4 ("objtool: Update documentation")
renamed: tools/objtool/Documentation/stack-validation.txt
to: tools/objtool/Documentation/objtool.txt.

Update the cross-references accordingly.

Fixes: a8e35fece4 ("objtool: Update documentation")
Signed-off-by: Mauro Carvalho Chehab <mchehab@kernel.org>
Link: https://lore.kernel.org/r/ec285ece6348a5be191aebe45f78d06b3319056b.1656234456.git.mchehab@kernel.org
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
2022-07-07 13:10:00 -06:00
Peter Zijlstra
38e0e4d04d x86/ibt, objtool: Don't discard text references from tracepoint section
On Tue, Jun 28, 2022 at 04:28:58PM +0800, Pengfei Xu wrote:

> # ./ftracetest
> === Ftrace unit tests ===
> [1] Basic trace file check      [PASS]
> [2] Basic test for tracers      [PASS]
> [3] Basic trace clock test      [PASS]
> [4] Basic event tracing check   [PASS]
> [5] Change the ringbuffer size  [PASS]
> [6] Snapshot and tracing setting        [PASS]
> [7] trace_pipe and trace_marker [PASS]
> [8] Test ftrace direct functions against tracers        [UNRESOLVED]
> [9] Test ftrace direct functions against kprobes        [UNRESOLVED]
> [10] Generic dynamic event - add/remove eprobe events   [FAIL]
> [11] Generic dynamic event - add/remove kprobe events
>
> It 100% reproduced in step 11 and then missing ENDBR BUG generated:
> "
> [ 9332.752836] mmiotrace: enabled CPU7.
> [ 9332.788612] mmiotrace: disabled.
> [ 9337.103426] traps: Missing ENDBR: syscall_regfunc+0x0/0xb0

It turns out that while syscall_regfunc() does have an ENDBR when
generated, it gets sealed by objtool's .ibt_endbr_seal list.

Since the only text references to this function:

  $ git grep syscall_regfunc
  include/linux/tracepoint.h:extern int syscall_regfunc(void);
  include/trace/events/syscalls.h:        syscall_regfunc, syscall_unregfunc
  include/trace/events/syscalls.h:        syscall_regfunc, syscall_unregfunc
  kernel/tracepoint.c:int syscall_regfunc(void)

appear in the __tracepoint section which is excluded by objtool.

Fixes: 3c6f9f77e6 ("objtool: Rework ibt and extricate from stack validation")
Reported-by: Pengfei Xu <pengfei.xu@intel.com
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/Yrrepdaow4F5kqG0@hirez.programming.kicks-ass.net
2022-07-07 13:26:54 +02:00
Peter Zijlstra
f43b9876e8 x86/retbleed: Add fine grained Kconfig knobs
Do fine-grained Kconfig for all the various retbleed parts.

NOTE: if your compiler doesn't support return thunks this will
silently 'upgrade' your mitigation to IBPB, you might not like this.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Borislav Petkov <bp@suse.de>
2022-06-29 17:43:41 +02:00
Josh Poimboeuf
8faea26e61 objtool: Re-add UNWIND_HINT_{SAVE_RESTORE}
Commit

  c536ed2fff ("objtool: Remove SAVE/RESTORE hints")

removed the save/restore unwind hints because they were no longer
needed. Now they're going to be needed again so re-add them.

Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Borislav Petkov <bp@suse.de>
2022-06-27 10:34:00 +02:00
Peter Zijlstra
a09a6e2399 objtool: Add entry UNRET validation
Since entry asm is tricky, add a validation pass that ensures the
retbleed mitigation has been done before the first actual RET
instruction.

Entry points are those that either have UNWIND_HINT_ENTRY, which acts
as UNWIND_HINT_EMPTY but marks the instruction as an entry point, or
those that have UWIND_HINT_IRET_REGS at +0.

This is basically a variant of validate_branch() that is
intra-function and it will simply follow all branches from marked
entry points and ensures that all paths lead to ANNOTATE_UNRET_END.

If a path hits RET or an indirection the path is a fail and will be
reported.

There are 3 ANNOTATE_UNRET_END instances:

 - UNTRAIN_RET itself
 - exception from-kernel; this path doesn't need UNTRAIN_RET
 - all early exceptions; these also don't need UNTRAIN_RET

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Josh Poimboeuf <jpoimboe@kernel.org>
Signed-off-by: Borislav Petkov <bp@suse.de>
2022-06-27 10:34:00 +02:00
Peter Zijlstra
9bb2ec608a objtool: Update Retpoline validation
Update retpoline validation with the new CONFIG_RETPOLINE requirement of
not having bare naked RET instructions.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Josh Poimboeuf <jpoimboe@kernel.org>
Signed-off-by: Borislav Petkov <bp@suse.de>
2022-06-27 10:33:59 +02:00
Peter Zijlstra
a149180fbc x86: Add magic AMD return-thunk
Note: needs to be in a section distinct from Retpolines such that the
Retpoline RET substitution cannot possibly use immediate jumps.

ORC unwinding for zen_untrain_ret() and __x86_return_thunk() is a
little tricky but works due to the fact that zen_untrain_ret() doesn't
have any stack ops and as such will emit a single ORC entry at the
start (+0x3f).

Meanwhile, unwinding an IP, including the __x86_return_thunk() one
(+0x40) will search for the largest ORC entry smaller or equal to the
IP, these will find the one ORC entry (+0x3f) and all works.

  [ Alexandre: SVM part. ]
  [ bp: Build fix, massages. ]

Suggested-by: Andrew Cooper <Andrew.Cooper3@citrix.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Josh Poimboeuf <jpoimboe@kernel.org>
Signed-off-by: Borislav Petkov <bp@suse.de>
2022-06-27 10:33:59 +02:00
Peter Zijlstra
951ddecf43 objtool: Treat .text.__x86.* as noinstr
Needed because zen_untrain_ret() will be called from noinstr code.

Also makes sense since the thunks MUST NOT contain instrumentation nor
be poked with dynamic instrumentation.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Josh Poimboeuf <jpoimboe@kernel.org>
Signed-off-by: Borislav Petkov <bp@suse.de>
2022-06-27 10:33:59 +02:00
Peter Zijlstra
d9e9d23006 x86,objtool: Create .return_sites
Find all the return-thunk sites and record them in a .return_sites
section such that the kernel can undo this.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Josh Poimboeuf <jpoimboe@kernel.org>
Signed-off-by: Borislav Petkov <bp@suse.de>
2022-06-27 10:33:58 +02:00
Borislav Petkov
2028a255f4 x86/extable: Annotate ex_handler_msr_mce() as a dead end
Fix

  vmlinux.o: warning: objtool: fixup_exception+0x2d6: unreachable instruction

Signed-off-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20220520192729.23969-1-bp@alien8.de
2022-05-27 12:34:45 +02:00
Peter Zijlstra
385bd430c0 objtool: Mark __ubsan_handle_builtin_unreachable() as noreturn
fs/ntfs3/ntfs3.prelink.o: warning: objtool: ni_read_frame() falls through to next function ni_readpage_cmpr.cold()

That is in fact:

000000000000124a <ni_read_frame.cold>:
    124a:       44 89 e0                mov    %r12d,%eax
    124d:       0f b6 55 98             movzbl -0x68(%rbp),%edx
    1251:       48 c7 c7 00 00 00 00    mov    $0x0,%rdi        1254: R_X86_64_32S      .data+0x1380
    1258:       48 89 c6                mov    %rax,%rsi
    125b:       e8 00 00 00 00          call   1260 <ni_read_frame.cold+0x16>   125c: R_X86_64_PLT32    __ubsan_handle_shift_out_of_bounds-0x4
    1260:       48 8d 7d cc             lea    -0x34(%rbp),%rdi
    1264:       e8 00 00 00 00          call   1269 <ni_read_frame.cold+0x1f>   1265: R_X86_64_PLT32    __tsan_read4-0x4
    1269:       8b 45 cc                mov    -0x34(%rbp),%eax
    126c:       e9 00 00 00 00          jmp    1271 <ni_read_frame.cold+0x27>   126d: R_X86_64_PC32     .text+0x19109
    1271:       48 8b 75 a0             mov    -0x60(%rbp),%rsi
    1275:       48 63 d0                movslq %eax,%rdx
    1278:       48 c7 c7 00 00 00 00    mov    $0x0,%rdi        127b: R_X86_64_32S      .data+0x13a0
    127f:       89 45 88                mov    %eax,-0x78(%rbp)
    1282:       e8 00 00 00 00          call   1287 <ni_read_frame.cold+0x3d>   1283: R_X86_64_PLT32    __ubsan_handle_shift_out_of_bounds-0x4
    1287:       8b 45 88                mov    -0x78(%rbp),%eax
    128a:       e9 00 00 00 00          jmp    128f <ni_read_frame.cold+0x45>   128b: R_X86_64_PC32     .text+0x19098
    128f:       48 c7 c7 00 00 00 00    mov    $0x0,%rdi        1292: R_X86_64_32S      .data+0x11f0
    1296:       e8 00 00 00 00          call   129b <ni_readpage_cmpr.cold>     1297: R_X86_64_PLT32    __ubsan_handle_builtin_unreachable-0x4

000000000000129b <ni_readpage_cmpr.cold>:

Tell objtool that __ubsan_handle_builtin_unreachable() is a noreturn.

Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20220502091514.GB479834@worktop.programming.kicks-ass.net
2022-05-27 12:34:43 +02:00
Linus Torvalds
df202b452f Kbuild updates for v5.19
- Add HOSTPKG_CONFIG env variable to allow users to override pkg-config
 
  - Support W=e as a shorthand for KCFLAGS=-Werror
 
  - Fix CONFIG_IKHEADERS build to support toybox cpio
 
  - Add scripts/dummy-tools/pahole to ease distro packagers' life
 
  - Suppress false-positive warnings from checksyscalls.sh for W=2 build
 
  - Factor out the common code of arch/*/boot/install.sh into
    scripts/install.sh
 
  - Support 'kernel-install' tool in scripts/prune-kernel
 
  - Refactor module-versioning to link the symbol versions at the final
    link of vmlinux and modules
 
  - Remove CONFIG_MODULE_REL_CRCS because module-versioning now works in
    an arch-agnostic way
 
  - Refactor modpost, Makefiles
 -----BEGIN PGP SIGNATURE-----
 
 iQJJBAABCgAzFiEEbmPs18K1szRHjPqEPYsBB53g2wYFAmKOO2oVHG1hc2FoaXJv
 eUBrZXJuZWwub3JnAAoJED2LAQed4NsGG54P/3/U5FIP5EoPAVu9HqSUKeeUiBYc
 z1B8d7Wt1xU0xHImPWNjoacfye4MrDMUv8mEWKgHCVusJxbUoS+3Z/kd64NU75Fg
 Cpj+9fP1N8m02IJzraxn6fw0bmfx4zp9Zsa9l2fjwL0emq4qhB7BA9/Nl6Png1IW
 p0TPR6gV0Wgp6ikf/eJ3b1decFSqM7QzDlbo860nPMG164gNpDZmFVf2G4HCRQoY
 GtgoQLEy2pBeOdU7+nJTKl2f5JOhDjRKX8equ7BHW9l7nbUvHd6ys3DGqYO3nvwV
 hZZdHwDtxxO6bJtzClKPREyfL2H9R2AGxq94HzSwdvwdLLoFxrTN+mg88xBg17Rm
 tKHy8jpZT36qh218h5lX5n9ZWcovTA38giZ+S/tkwOvvYGivKHDS23QwzB0HrG8/
 VRd+0rhfIvuIpu0OQaTpTkZr2QVci2Zn6PPnxpyPEsGvWVFRjyx0WyZh4fFXnkQT
 n+GS7j5g1LVMra0qu0y+yp4zy/DVFKIcfry0xU8S5SaSEBBcWUxLS2nnoBVB4vb2
 RpiVD2vaOlvu/Zs2SOgtuMOnTw+Qqrvh7OYm/WyxWrB3JQGa/r+vipMKiFEDi2NN
 pwR8wJT/CW1ycte93m3oO83jiitFqzXtAqo24wKlp4SOqnR/TQ/dx743ku2xvONe
 uynJVW/gZVm4KEUl
 =Y2TB
 -----END PGP SIGNATURE-----

Merge tag 'kbuild-v5.19' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild

Pull Kbuild updates from Masahiro Yamada:

 - Add HOSTPKG_CONFIG env variable to allow users to override pkg-config

 - Support W=e as a shorthand for KCFLAGS=-Werror

 - Fix CONFIG_IKHEADERS build to support toybox cpio

 - Add scripts/dummy-tools/pahole to ease distro packagers' life

 - Suppress false-positive warnings from checksyscalls.sh for W=2 build

 - Factor out the common code of arch/*/boot/install.sh into
   scripts/install.sh

 - Support 'kernel-install' tool in scripts/prune-kernel

 - Refactor module-versioning to link the symbol versions at the final
   link of vmlinux and modules

 - Remove CONFIG_MODULE_REL_CRCS because module-versioning now works in
   an arch-agnostic way

 - Refactor modpost, Makefiles

* tag 'kbuild-v5.19' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild: (56 commits)
  genksyms: adjust the output format to modpost
  kbuild: stop merging *.symversions
  kbuild: link symbol CRCs at final link, removing CONFIG_MODULE_REL_CRCS
  modpost: extract symbol versions from *.cmd files
  modpost: add sym_find_with_module() helper
  modpost: change the license of EXPORT_SYMBOL to bool type
  modpost: remove left-over cross_compile declaration
  kbuild: record symbol versions in *.cmd files
  kbuild: generate a list of objects in vmlinux
  modpost: move *.mod.c generation to write_mod_c_files()
  modpost: merge add_{intree_flag,retpoline,staging_flag} to add_header
  scripts/prune-kernel: Use kernel-install if available
  kbuild: factor out the common installation code into scripts/install.sh
  modpost: split new_symbol() to symbol allocation and hash table addition
  modpost: make sym_add_exported() always allocate a new symbol
  modpost: make multiple export error
  modpost: dump Module.symvers in the same order of modules.order
  modpost: traverse the namespace_list in order
  modpost: use doubly linked list for dump_lists
  modpost: traverse unresolved symbols in order
  ...
2022-05-26 12:09:50 -07:00
Mikulas Patocka
22682a07ac objtool: Fix objtool regression on x32 systems
Commit c087c6e7b5 ("objtool: Fix type of reloc::addend") failed to
appreciate cross building from ILP32 hosts, where 'int' == 'long' and
the issue persists.

As such, use s64/int64_t/Elf64_Sxword for this field and suffer the
pain that is ISO C99 printf formats for it.

Fixes: c087c6e7b5 ("objtool: Fix type of reloc::addend")
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
[peterz: reword changelog, s/long long/s64/]
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: <stable@vger.kernel.org>
Link: https://lkml.kernel.org/r/alpine.LRH.2.02.2205161041260.11556@file01.intranet.prod.int.rdu2.redhat.com
2022-05-20 12:45:30 +02:00
Peter Zijlstra
ead165fa10 objtool: Fix symbol creation
Nathan reported objtool failing with the following messages:

  warning: objtool: no non-local symbols !?
  warning: objtool: gelf_update_symshndx: invalid section index

The problem is due to commit 4abff6d48d ("objtool: Fix code relocs
vs weak symbols") failing to consider the case where an object would
have no non-local symbols.

The problem that commit tries to address is adding a STB_LOCAL symbol
to the symbol table in light of the ELF spec's requirement that:

  In each symbol table, all symbols with STB_LOCAL binding preced the
  weak and global symbols.  As ``Sections'' above describes, a symbol
  table section's sh_info section header member holds the symbol table
  index for the first non-local symbol.

The approach taken is to find this first non-local symbol, move that
to the end and then re-use the freed spot to insert a new local symbol
and increment sh_info.

Except it never considered the case of object files without global
symbols and got a whole bunch of details wrong -- so many in fact that
it is a wonder it ever worked :/

Specifically:

 - It failed to re-hash the symbol on the new index, so a subsequent
   find_symbol_by_index() would not find it at the new location and a
   query for the old location would now return a non-deterministic
   choice between the old and new symbol.

 - It failed to appreciate that the GElf wrappers are not a valid disk
   format (it works because GElf is basically Elf64 and we only
   support x86_64 atm.)

 - It failed to fully appreciate how horrible the libelf API really is
   and got the gelf_update_symshndx() call pretty much completely
   wrong; with the direct consequence that if inserting a second
   STB_LOCAL symbol would require moving the same STB_GLOBAL symbol
   again it would completely come unstuck.

Write a new elf_update_symbol() function that wraps all the magic
required to update or create a new symbol at a given index.

Specifically, gelf_update_sym*() require an @ndx argument that is
relative to the @data argument; this means you have to manually
iterate the section data descriptor list and update @ndx.

Fixes: 4abff6d48d ("objtool: Fix code relocs vs weak symbols")
Reported-by: Nathan Chancellor <nathan@kernel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Borislav Petkov <bp@suse.de>
Acked-by: Josh Poimboeuf <jpoimboe@kernel.org>
Tested-by: Nathan Chancellor <nathan@kernel.org>
Cc: <stable@vger.kernel.org>
Link: https://lkml.kernel.org/r/YoPCTEYjoPqE4ZxB@hirez.programming.kicks-ass.net
2022-05-20 12:42:06 +02:00
Tiezhu Yang
4bc7800588 objtool: Remove libsubcmd.a when make clean
The file libsubcmd.a still exists after make clean, remove it.

Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn>
Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
Link: https://lore.kernel.org/r/1652258270-6278-3-git-send-email-yangtiezhu@loongson.cn
2022-05-12 07:28:35 -07:00
Tiezhu Yang
f193c32cad objtool: Remove inat-tables.c when make clean
When build objtool on x86, the generated file inat-tables.c is in
arch/x86/lib instead of arch/x86, use the correct dir to remove it
when make clean.

$ cd tools/objtool
$ make
[...]
  GEN     arch/x86/lib/inat-tables.c
[...]

Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn>
Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
Link: https://lore.kernel.org/r/1652258270-6278-2-git-send-email-yangtiezhu@loongson.cn
2022-05-12 07:28:05 -07:00
Josh Poimboeuf
a8e35fece4 objtool: Update documentation
The objtool documentation is very stack validation centric.  Broaden the
documentation and describe all the features objtool supports.

Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Miroslav Benes <mbenes@suse.cz>
Link: https://lkml.kernel.org/r/b6a84d301d9f73ec6725752654097f4e31fa1b69.1650300597.git.jpoimboe@redhat.com
2022-04-22 12:32:05 +02:00
Josh Poimboeuf
753da4179d objtool: Remove --lto and --vmlinux in favor of --link
The '--lto' option is a confusing way of telling objtool to do stack
validation despite it being a linked object.  It's no longer needed now
that an explicit '--stackval' option exists.  The '--vmlinux' option is
also redundant.

Remove both options in favor of a straightforward '--link' option which
identifies a linked object.

Also, implicitly set '--link' with a warning if the user forgets to do
so and we can tell that it's a linked object.  This makes it easier for
manual vmlinux runs.

Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Miroslav Benes <mbenes@suse.cz>
Link: https://lkml.kernel.org/r/dcd3ceffd15a54822c6183e5766d21ad06082b45.1650300597.git.jpoimboe@redhat.com
2022-04-22 12:32:05 +02:00
Josh Poimboeuf
22102f4559 objtool: Make noinstr hacks optional
Objtool has some hacks in place to workaround toolchain limitations
which otherwise would break no-instrumentation rules.  Make the hacks
explicit (and optional for other arches) by turning it into a cmdline
option and kernel config option.

Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Miroslav Benes <mbenes@suse.cz>
Link: https://lkml.kernel.org/r/b326eeb9c33231b9dfbb925f194ed7ee40edcd7c.1650300597.git.jpoimboe@redhat.com
2022-04-22 12:32:04 +02:00
Josh Poimboeuf
4ab7674f59 objtool: Make jump label hack optional
Objtool secretly does a jump label hack to overcome the limitations of
the toolchain.  Make the hack explicit (and optional for other arches)
by turning it into a cmdline option and kernel config option.

Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Miroslav Benes <mbenes@suse.cz>
Link: https://lkml.kernel.org/r/3bdcbfdd27ecb01ddec13c04bdf756a583b13d24.1650300597.git.jpoimboe@redhat.com
2022-04-22 12:32:04 +02:00
Josh Poimboeuf
26e176896a objtool: Make static call annotation optional
As part of making objtool more modular, put the existing static call
code behind a new '--static-call' option.

Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Miroslav Benes <mbenes@suse.cz>
Link: https://lkml.kernel.org/r/d59ac57ef3d6d8380cdce20322314c9e2e556750.1650300597.git.jpoimboe@redhat.com
2022-04-22 12:32:03 +02:00
Josh Poimboeuf
7206447496 objtool: Make stack validation frame-pointer-specific
Now that CONFIG_STACK_VALIDATION is frame-pointer specific, do the same
for the '--stackval' option.  Now the '--no-fp' option is redundant and
can be removed.

Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Miroslav Benes <mbenes@suse.cz>
Link: https://lkml.kernel.org/r/f563fa064b3b63d528de250c72012d49e14742a3.1650300597.git.jpoimboe@redhat.com
2022-04-22 12:32:03 +02:00
Josh Poimboeuf
c2bdd61c98 objtool: Extricate sls from stack validation
Extricate sls functionality from validate_branch() so they can be
executed (or ported) independently from each other.

Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Miroslav Benes <mbenes@suse.cz>
Link: https://lkml.kernel.org/r/2545c86ffa5f27497f0d0c542540ad4a4be3c5a5.1650300597.git.jpoimboe@redhat.com
2022-04-22 12:32:03 +02:00
Josh Poimboeuf
3c6f9f77e6 objtool: Rework ibt and extricate from stack validation
Extricate ibt from validate_branch() so they can be executed (or ported)
independently from each other.

While shuffling code around, simplify and improve the ibt logic:

- Ignore an explicit list of known sections which reference functions
  for reasons other than indirect branching to them.  This helps prevent
  unnnecesary sealing.

- Warn on missing !ENDBR for all other sections, not just .data and
  .rodata.  This finds additional warnings, because there are sections
  other than .[ro]data which reference function pointers.  For example,
  the ksymtab sections which are used for exporting symbols.

Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Miroslav Benes <mbenes@suse.cz>
Link: https://lkml.kernel.org/r/fd1435e46bb95f81031b8fb1fa360f5f787e4316.1650300597.git.jpoimboe@redhat.com
2022-04-22 12:32:02 +02:00
Josh Poimboeuf
7dce62041a objtool: Make stack validation optional
Make stack validation an explicit cmdline option so that individual
objtool features can be enabled individually by other arches.

Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Miroslav Benes <mbenes@suse.cz>
Link: https://lkml.kernel.org/r/52da143699574d756e65ca4c9d4acaffe9b0fe5f.1650300597.git.jpoimboe@redhat.com
2022-04-22 12:32:02 +02:00
Josh Poimboeuf
99c0beb547 objtool: Add option to print section addresses
To help prevent objtool users from having to do math to convert function
addresses to section addresses, and to help out with finding data
addresses reported by IBT validation, add an option to print the section
address in addition to the function address.

Normal:

  vmlinux.o: warning: objtool: fixup_exception()+0x2d1: unreachable instruction

With '--sec-address':

  vmlinux.o: warning: objtool: fixup_exception()+0x2d1 (.text+0x76c51): unreachable instruction

Suggested-by: Nick Desaulniers <ndesaulniers@google.com>
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Miroslav Benes <mbenes@suse.cz>
Link: https://lkml.kernel.org/r/2cea4d5299d53d1a4c09212a6ad7820aa46fda7a.1650300597.git.jpoimboe@redhat.com
2022-04-22 12:32:02 +02:00
Josh Poimboeuf
2bc3dec705 objtool: Don't print parentheses in function addresses
The parentheses in the "func()+off" address output are inconsistent with
how the kernel prints function addresses, breaking Peter's scripts.
Remove them.

Suggested-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Miroslav Benes <mbenes@suse.cz>
Link: https://lkml.kernel.org/r/f2bec70312f62ef4f1ea21c134d9def627182ad3.1650300597.git.jpoimboe@redhat.com
2022-04-22 12:32:02 +02:00
Josh Poimboeuf
b51277eb97 objtool: Ditch subcommands
Objtool has a fairly singular focus.  It runs on object files and does
validations and transformations which can be combined in various ways.
The subcommand model has never been a good fit, making it awkward to
combine and remove options.

Remove the "check" and "orc" subcommands in favor of a more traditional
cmdline option model.  This makes it much more flexible to use, and
easier to port individual features to other arches.

Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Miroslav Benes <mbenes@suse.cz>
Link: https://lkml.kernel.org/r/5c61ebf805e90aefc5fa62bc63468ffae53b9df6.1650300597.git.jpoimboe@redhat.com
2022-04-22 12:32:01 +02:00
Josh Poimboeuf
2daf7faba7 objtool: Reorganize cmdline options
Split the existing options into two groups: actions, which actually do
something; and options, which modify the actions in some way.

Also there's no need to have short flags for all the non-action options.
Reserve short flags for the more important actions.

While at it:

- change a few of the short flags to be more intuitive

- make option descriptions more consistently descriptive

- sort options in the source like they are when printed

- move options to a global struct

Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Miroslav Benes <mbenes@suse.cz>
Link: https://lkml.kernel.org/r/9dcaa752f83aca24b1b21f0b0eeb28a0c181c0b0.1650300597.git.jpoimboe@redhat.com
2022-04-22 12:32:01 +02:00
Peter Zijlstra
3398b12d10 Merge branch 'tip/x86/urgent'
Merge the x86/urgent objtool/IBT changes as a base

Signed-off-by: Peter Zijlstra <peterz@infradead.org>
2022-04-22 12:32:01 +02:00
Peter Zijlstra
4abff6d48d objtool: Fix code relocs vs weak symbols
Occasionally objtool driven code patching (think .static_call_sites
.retpoline_sites etc..) goes sideways and it tries to patch an
instruction that doesn't match.

Much head-scatching and cursing later the problem is as outlined below
and affects every section that objtool generates for us, very much
including the ORC data. The below uses .static_call_sites because it's
convenient for demonstration purposes, but as mentioned the ORC
sections, .retpoline_sites and __mount_loc are all similarly affected.

Consider:

foo-weak.c:

  extern void __SCT__foo(void);

  __attribute__((weak)) void foo(void)
  {
	  return __SCT__foo();
  }

foo.c:

  extern void __SCT__foo(void);
  extern void my_foo(void);

  void foo(void)
  {
	  my_foo();
	  return __SCT__foo();
  }

These generate the obvious code
(gcc -O2 -fcf-protection=none -fno-asynchronous-unwind-tables -c foo*.c):

foo-weak.o:
0000000000000000 <foo>:
   0:   e9 00 00 00 00          jmpq   5 <foo+0x5>      1: R_X86_64_PLT32       __SCT__foo-0x4

foo.o:
0000000000000000 <foo>:
   0:   48 83 ec 08             sub    $0x8,%rsp
   4:   e8 00 00 00 00          callq  9 <foo+0x9>      5: R_X86_64_PLT32       my_foo-0x4
   9:   48 83 c4 08             add    $0x8,%rsp
   d:   e9 00 00 00 00          jmpq   12 <foo+0x12>    e: R_X86_64_PLT32       __SCT__foo-0x4

Now, when we link these two files together, you get something like
(ld -r -o foos.o foo-weak.o foo.o):

foos.o:
0000000000000000 <foo-0x10>:
   0:   e9 00 00 00 00          jmpq   5 <foo-0xb>      1: R_X86_64_PLT32       __SCT__foo-0x4
   5:   66 2e 0f 1f 84 00 00 00 00 00   nopw   %cs:0x0(%rax,%rax,1)
   f:   90                      nop

0000000000000010 <foo>:
  10:   48 83 ec 08             sub    $0x8,%rsp
  14:   e8 00 00 00 00          callq  19 <foo+0x9>     15: R_X86_64_PLT32      my_foo-0x4
  19:   48 83 c4 08             add    $0x8,%rsp
  1d:   e9 00 00 00 00          jmpq   22 <foo+0x12>    1e: R_X86_64_PLT32      __SCT__foo-0x4

Noting that ld preserves the weak function text, but strips the symbol
off of it (hence objdump doing that funny negative offset thing). This
does lead to 'interesting' unused code issues with objtool when ran on
linked objects, but that seems to be working (fingers crossed).

So far so good.. Now lets consider the objtool static_call output
section (readelf output, old binutils):

foo-weak.o:

Relocation section '.rela.static_call_sites' at offset 0x2c8 contains 1 entry:
    Offset             Info             Type               Symbol's Value  Symbol's Name + Addend
0000000000000000  0000000200000002 R_X86_64_PC32          0000000000000000 .text + 0
0000000000000004  0000000d00000002 R_X86_64_PC32          0000000000000000 __SCT__foo + 1

foo.o:

Relocation section '.rela.static_call_sites' at offset 0x310 contains 2 entries:
    Offset             Info             Type               Symbol's Value  Symbol's Name + Addend
0000000000000000  0000000200000002 R_X86_64_PC32          0000000000000000 .text + d
0000000000000004  0000000d00000002 R_X86_64_PC32          0000000000000000 __SCT__foo + 1

foos.o:

Relocation section '.rela.static_call_sites' at offset 0x430 contains 4 entries:
    Offset             Info             Type               Symbol's Value  Symbol's Name + Addend
0000000000000000  0000000100000002 R_X86_64_PC32          0000000000000000 .text + 0
0000000000000004  0000000d00000002 R_X86_64_PC32          0000000000000000 __SCT__foo + 1
0000000000000008  0000000100000002 R_X86_64_PC32          0000000000000000 .text + 1d
000000000000000c  0000000d00000002 R_X86_64_PC32          0000000000000000 __SCT__foo + 1

So we have two patch sites, one in the dead code of the weak foo and one
in the real foo. All is well.

*HOWEVER*, when the toolchain strips unused section symbols it
generates things like this (using new enough binutils):

foo-weak.o:

Relocation section '.rela.static_call_sites' at offset 0x2c8 contains 1 entry:
    Offset             Info             Type               Symbol's Value  Symbol's Name + Addend
0000000000000000  0000000200000002 R_X86_64_PC32          0000000000000000 foo + 0
0000000000000004  0000000d00000002 R_X86_64_PC32          0000000000000000 __SCT__foo + 1

foo.o:

Relocation section '.rela.static_call_sites' at offset 0x310 contains 2 entries:
    Offset             Info             Type               Symbol's Value  Symbol's Name + Addend
0000000000000000  0000000200000002 R_X86_64_PC32          0000000000000000 foo + d
0000000000000004  0000000d00000002 R_X86_64_PC32          0000000000000000 __SCT__foo + 1

foos.o:

Relocation section '.rela.static_call_sites' at offset 0x430 contains 4 entries:
    Offset             Info             Type               Symbol's Value  Symbol's Name + Addend
0000000000000000  0000000100000002 R_X86_64_PC32          0000000000000000 foo + 0
0000000000000004  0000000d00000002 R_X86_64_PC32          0000000000000000 __SCT__foo + 1
0000000000000008  0000000100000002 R_X86_64_PC32          0000000000000000 foo + d
000000000000000c  0000000d00000002 R_X86_64_PC32          0000000000000000 __SCT__foo + 1

And now we can see how that foos.o .static_call_sites goes side-ways, we
now have _two_ patch sites in foo. One for the weak symbol at foo+0
(which is no longer a static_call site!) and one at foo+d which is in
fact the right location.

This seems to happen when objtool cannot find a section symbol, in which
case it falls back to any other symbol to key off of, however in this
case that goes terribly wrong!

As such, teach objtool to create a section symbol when there isn't
one.

Fixes: 44f6a7c075 ("objtool: Fix seg fault with Clang non-section symbols")
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Josh Poimboeuf <jpoimboe@redhat.com>
Link: https://lkml.kernel.org/r/20220419203807.655552918@infradead.org
2022-04-22 12:13:55 +02:00
Peter Zijlstra
c087c6e7b5 objtool: Fix type of reloc::addend
Elf{32,64}_Rela::r_addend is of type: Elf{32,64}_Sword, that means
that our reloc::addend needs to be long or face tuncation issues when
we do elf_rebuild_reloc_section():

  - 107:  48 b8 00 00 00 00 00 00 00 00   movabs $0x0,%rax        109: R_X86_64_64        level4_kernel_pgt+0x80000067
  + 107:  48 b8 00 00 00 00 00 00 00 00   movabs $0x0,%rax        109: R_X86_64_64        level4_kernel_pgt-0x7fffff99

Fixes: 627fce1480 ("objtool: Add ORC unwind table generation")
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Josh Poimboeuf <jpoimboe@redhat.com>
Link: https://lkml.kernel.org/r/20220419203807.596871927@infradead.org
2022-04-22 12:13:55 +02:00
Josh Poimboeuf
08feafe8d1 objtool: Fix function fallthrough detection for vmlinux
Objtool's function fallthrough detection only works on C objects.
The distinction between C and assembly objects no longer makes sense
with objtool running on vmlinux.o.

Now that copy_user_64.S has been fixed up, and an objtool sibling call
detection bug has been fixed, the asm code is in "compliance" and this
hack is no longer needed.  Remove it.

Fixes: ed53a0d971 ("x86/alternative: Use .ibt_endbr_seal to seal indirect calls")
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/b434cff98eca3a60dcc64c620d7d5d405a0f441c.1649718562.git.jpoimboe@redhat.com
2022-04-19 21:58:53 +02:00
Josh Poimboeuf
34c861e806 objtool: Fix sibling call detection in alternatives
In add_jump_destinations(), sibling call detection requires 'insn->func'
to be valid.  But alternative instructions get their 'func' set in
handle_group_alt(), which runs *after* add_jump_destinations().  So
sibling calls in alternatives code don't get properly detected.

Fix that by changing the initialization order: call
add_special_section_alts() *before* add_jump_destinations().

This also means the special case for a missing 'jump_dest' in
add_jump_destinations() can be removed, as it has already been dealt
with.

Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/c02e0a0a2a4286b5f848d17c77fdcb7e0caf709c.1649718562.git.jpoimboe@redhat.com
2022-04-19 21:58:53 +02:00
Josh Poimboeuf
26ff604102 objtool: Don't set 'jump_dest' for sibling calls
For most sibling calls, 'jump_dest' is NULL because objtool treats the
jump like a call and sets 'call_dest'.  But there are a few edge cases
where that's not true.  Make it consistent to avoid unexpected behavior.

Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/8737d6b9d1691831aed73375f444f0f42da3e2c9.1649718562.git.jpoimboe@redhat.com
2022-04-19 21:58:53 +02:00
Josh Poimboeuf
1d08b92fa2 objtool: Use offstr() to print address of missing ENDBR
Fixes: 89bc853eae ("objtool: Find unused ENDBR instructions")
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/95d12e800c736a3f7d08d61dabb760b2d5251a8e.1650300597.git.jpoimboe@redhat.com
2022-04-19 21:58:50 +02:00
Josh Poimboeuf
4baae989e6 objtool: Print data address for "!ENDBR" data warnings
When a "!ENDBR" warning is reported for a data section, objtool just
prints the text address of the relocation target twice, without giving
any clues about the location of the original data reference:

  vmlinux.o: warning: objtool: dcbnl_netdevice_event()+0x0: .text+0xb64680: data relocation to !ENDBR: dcbnl_netdevice_event+0x0

Instead, print the address of the data reference, in addition to the
address of the relocation target.

  vmlinux.o: warning: objtool: dcbnl_nb+0x0: .data..read_mostly+0xe260: data relocation to !ENDBR: dcbnl_netdevice_event+0x0

Fixes: 89bc853eae ("objtool: Find unused ENDBR instructions")
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/762e88d51300e8eaf0f933a5b0feae20ac033bea.1650300597.git.jpoimboe@redhat.com
2022-04-19 21:58:50 +02:00
Peter Zijlstra
d4e5268a08 x86,objtool: Mark cpu_startup_entry() __noreturn
GCC-8 isn't clever enough to figure out that cpu_start_entry() is a
noreturn while objtool is. This results in code after the call in
start_secondary(). Give GCC a hand so that they all agree on things.

  vmlinux.o: warning: objtool: start_secondary()+0x10e: unreachable

Reported-by: Rick Edgecombe <rick.p.edgecombe@intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Josh Poimboeuf <jpoimboe@redhat.com>
Link: https://lore.kernel.org/r/20220408094718.383658532@infradead.org
2022-04-19 21:58:48 +02:00
Peter Zijlstra
7a53f40890 objtool: Fix SLS validation for kcov tail-call replacement
Since not all compilers have a function attribute to disable KCOV
instrumentation, objtool can rewrite KCOV instrumentation in noinstr
functions as per commit:

  f56dae88a8 ("objtool: Handle __sanitize_cov*() tail calls")

However, this has subtle interaction with the SLS validation from
commit:

  1cc1e4c8aa ("objtool: Add straight-line-speculation validation")

In that when a tail-call instrucion is replaced with a RET an
additional INT3 instruction is also written, but is not represented in
the decoded instruction stream.

This then leads to false positive missing INT3 objtool warnings in
noinstr code.

Instead of adding additional struct instruction objects, mark the RET
instruction with retpoline_safe to suppress the warning (since we know
there really is an INT3).

Fixes: 1cc1e4c8aa ("objtool: Add straight-line-speculation validation")
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20220323230712.GA8939@worktop.programming.kicks-ass.net
2022-04-05 10:24:40 +02:00
Peter Zijlstra
d139bca4b8 objtool: Fix IBT tail-call detection
Objtool reports:

  arch/x86/crypto/poly1305-x86_64.o: warning: objtool: poly1305_blocks_avx() falls through to next function poly1305_blocks_x86_64()
  arch/x86/crypto/poly1305-x86_64.o: warning: objtool: poly1305_emit_avx() falls through to next function poly1305_emit_x86_64()
  arch/x86/crypto/poly1305-x86_64.o: warning: objtool: poly1305_blocks_avx2() falls through to next function poly1305_blocks_x86_64()

Which reads like:

0000000000000040 <poly1305_blocks_x86_64>:
	 40:       f3 0f 1e fa             endbr64
	...

0000000000000400 <poly1305_blocks_avx>:
	400:       f3 0f 1e fa             endbr64
	404:       44 8b 47 14             mov    0x14(%rdi),%r8d
	408:       48 81 fa 80 00 00 00    cmp    $0x80,%rdx
	40f:       73 09                   jae    41a <poly1305_blocks_avx+0x1a>
	411:       45 85 c0                test   %r8d,%r8d
	414:       0f 84 2a fc ff ff       je     44 <poly1305_blocks_x86_64+0x4>
	...

These are simple conditional tail-calls and *should* be recognised as
such by objtool, however due to a mistake in commit 08f87a93c8
("objtool: Validate IBT assumptions") this is failing.

Specifically, the jump_dest is +4, this means the instruction pointed
at will not be ENDBR and as such it will fail the second clause of
is_first_func_insn() that was supposed to capture this exact case.

Instead, have is_first_func_insn() look at the previous instruction.

Fixes: 08f87a93c8 ("objtool: Validate IBT assumptions")
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20220322115125.811582125@infradead.org
2022-04-05 10:24:40 +02:00
Chun-Tse Shao
d5ea4fece4 kbuild: Allow kernel installation packaging to override pkg-config
Add HOSTPKG_CONFIG to allow tooling that builds the kernel to override
what pkg-config and parameters are used.

Signed-off-by: Chun-Tse Shao <ctshao@google.com>
Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
2022-04-05 17:03:31 +09:00
Linus Torvalds
b8321ed4a4 Kbuild updates for v5.18
- Add new environment variables, USERCFLAGS and USERLDFLAGS to allow
    additional flags to be passed to user-space programs.
 
  - Fix missing fflush() bugs in Kconfig and fixdep
 
  - Fix a minor bug in the comment format of the .config file
 
  - Make kallsyms ignore llvm's local labels, .L*
 
  - Fix UAPI compile-test for cross-compiling with Clang
 
  - Extend the LLVM= syntax to support LLVM=<suffix> form for using a
    particular version of LLVm, and LLVM=<prefix> form for using custom
    LLVM in a particular directory path.
 
  - Clean up Makefiles
 -----BEGIN PGP SIGNATURE-----
 
 iQJJBAABCgAzFiEEbmPs18K1szRHjPqEPYsBB53g2wYFAmJFGloVHG1hc2FoaXJv
 eUBrZXJuZWwub3JnAAoJED2LAQed4NsGH0kP/j6Vx5BqEv3tP2Q+UANxLqITleJs
 IFpbSesz/BhlG7I/IapWmCDSqFbYd5uJTO4ko8CsPmZHcxr6Gw3y+DN5yQACKaG/
 p9xiF6GjPyKR8+VdcT2tV50+dVY8ANe/DxCyzKrJd/uyYxgARPKJh0KRMNz+d9lj
 ixUpCXDhx/XlKzPIlcxrvhhjevKz+NnHmN0fe6rzcOw9KzBGBTsf20Q3PqUuBOKa
 rWHsRGcBPA8eKLfWT1Us1jjic6cT2g4aMpWjF20YgUWKHgWVKcNHpxYKGXASVo/z
 ewdDnNfmwo7f7fKMCDDro9iwFWV/BumGtn43U00tnqdBcTpFojPlEOga37UPbZDF
 nmTblGVUhR0vn4PmfBy8WkAkbW+IpVatKwJGV4J3KjSvdWvZOmVj9VUGLVAR0TXW
 /YcgRs6EtG8Hn0IlCj0fvZ5wRWoDLbP2DSZ67R/44EP0GaNQPwUe4FI1izEE4EYX
 oVUAIxcKixWGj4RmdtmtMMdUcZzTpbgS9uloMUmS3u9LK0Ir/8tcWaf2zfMO6Jl2
 p4Q31s1dUUKCnFnj0xDKRyKGUkxYebrHLfuBqi0RIc0xRpSlxoXe3Dynm9aHEQoD
 ZSV0eouQJxnaxM1ck5Bu4AHLgEebHfEGjWVyUHno7jFU5EI9Wpbqpe4pCYEEDTm1
 +LJMEpdZO0dFvpF+
 =84rW
 -----END PGP SIGNATURE-----

Merge tag 'kbuild-v5.18-v2' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild

Pull Kbuild updates from Masahiro Yamada:

 - Add new environment variables, USERCFLAGS and USERLDFLAGS to allow
   additional flags to be passed to user-space programs.

 - Fix missing fflush() bugs in Kconfig and fixdep

 - Fix a minor bug in the comment format of the .config file

 - Make kallsyms ignore llvm's local labels, .L*

 - Fix UAPI compile-test for cross-compiling with Clang

 - Extend the LLVM= syntax to support LLVM=<suffix> form for using a
   particular version of LLVm, and LLVM=<prefix> form for using custom
   LLVM in a particular directory path.

 - Clean up Makefiles

* tag 'kbuild-v5.18-v2' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild:
  kbuild: Make $(LLVM) more flexible
  kbuild: add --target to correctly cross-compile UAPI headers with Clang
  fixdep: use fflush() and ferror() to ensure successful write to files
  arch: syscalls: simplify uapi/kapi directory creation
  usr/include: replace extra-y with always-y
  certs: simplify empty certs creation in certs/Makefile
  certs: include certs/signing_key.x509 unconditionally
  kallsyms: ignore all local labels prefixed by '.L'
  kconfig: fix missing '# end of' for empty menu
  kconfig: add fflush() before ferror() check
  kbuild: replace $(if A,A,B) with $(or A,B)
  kbuild: Add environment variables for userprogs flags
  kbuild: unify cmd_copy and cmd_shipped
2022-03-31 11:59:03 -07:00
Peter Zijlstra
89bc853eae objtool: Find unused ENDBR instructions
Find all ENDBR instructions which are never referenced and stick them
in a section such that the kernel can poison them, sealing the
functions from ever being an indirect call target.

This removes about 1-in-4 ENDBR instructions.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Josh Poimboeuf <jpoimboe@redhat.com>
Link: https://lore.kernel.org/r/20220308154319.763643193@infradead.org
2022-03-15 10:32:47 +01:00
Peter Zijlstra
08f87a93c8 objtool: Validate IBT assumptions
Intel IBT requires that every indirect JMP/CALL targets an ENDBR
instructions, failing this #CP happens and we die. Similarly, all
exception entries should be ENDBR.

Find all code relocations and ensure they're either an ENDBR
instruction or ANNOTATE_NOENDBR. For the exceptions look for
UNWIND_HINT_IRET_REGS at sym+0 not being ENDBR.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Josh Poimboeuf <jpoimboe@redhat.com>
Link: https://lore.kernel.org/r/20220308154319.705110141@infradead.org
2022-03-15 10:32:46 +01:00
Peter Zijlstra
7d209d13e7 objtool: Add IBT/ENDBR decoding
Intel IBT requires the target of any indirect CALL or JMP instruction
to be the ENDBR instruction; optionally it allows those two
instructions to have a NOTRACK prefix in order to avoid this
requirement.

The kernel will not enable the use of NOTRACK, as such any occurence
of it in compiler generated code should be flagged.

Teach objtool to Decode ENDBR instructions and WARN about NOTRACK
prefixes.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Josh Poimboeuf <jpoimboe@redhat.com>
Link: https://lore.kernel.org/r/20220308154319.645963517@infradead.org
2022-03-15 10:32:46 +01:00
Peter Zijlstra
96db4a988d objtool: Read the NOENDBR annotation
Read the new NOENDBR annotation. While there, attempt to not bloat
struct instruction.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Josh Poimboeuf <jpoimboe@redhat.com>
Link: https://lore.kernel.org/r/20220308154319.586815435@infradead.org
2022-03-15 10:32:46 +01:00
Peter Zijlstra
0e5b613b4d objtool: Rework ASM_REACHABLE
Currently ASM_REACHABLE only works for UD2 instructions; reorder
things to also allow over-riding dead_end_function().

To that end:

 - Mark INSN_BUG instructions in decode_instructions(), this saves
   having to iterate all instructions yet again.

 - Have add_call_destinations() set insn->dead_end for
   dead_end_function() calls.

 - Move add_dead_ends() *after* add_call_destinations() such that
   ASM_REACHABLE can clear the ->dead_end mark.

 - have validate_branch() only check ->dead_end.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Josh Poimboeuf <jpoimboe@redhat.com>
Link: https://lore.kernel.org/r/20220308154319.410010807@infradead.org
2022-03-15 10:32:44 +01:00
Peter Zijlstra
105cd68596 x86: Mark __invalid_creds() __noreturn
vmlinux.o: warning: objtool: ksys_unshare()+0x36c: unreachable instruction

0000 0000000000067040 <ksys_unshare>:
...
0364    673a4:	4c 89 ef             	mov    %r13,%rdi
0367    673a7:	e8 00 00 00 00       	call   673ac <ksys_unshare+0x36c>	673a8: R_X86_64_PLT32	__invalid_creds-0x4
036c    673ac:	e9 28 ff ff ff       	jmp    672d9 <ksys_unshare+0x299>
0371    673b1:	41 bc f4 ff ff ff    	mov    $0xfffffff4,%r12d
0377    673b7:	e9 80 fd ff ff       	jmp    6713c <ksys_unshare+0xfc>

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/Yi9gOW9f1GGwwUD6@hirez.programming.kicks-ass.net
2022-03-15 10:32:44 +01:00
Peter Zijlstra
eae654f1c2 exit: Mark do_group_exit() __noreturn
vmlinux.o: warning: objtool: get_signal()+0x108: unreachable instruction

0000 000000000007f930 <get_signal>:
...
0103    7fa33:  e8 00 00 00 00          call   7fa38 <get_signal+0x108> 7fa34: R_X86_64_PLT32   do_group_exit-0x4
0108    7fa38:  41 8b 45 74             mov    0x74(%r13),%eax

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Josh Poimboeuf <jpoimboe@redhat.com>
Link: https://lore.kernel.org/r/20220308154319.351270711@infradead.org
2022-03-15 10:32:43 +01:00
Peter Zijlstra
f9cdf7ca57 x86: Mark stop_this_cpu() __noreturn
vmlinux.o: warning: objtool: smp_stop_nmi_callback()+0x2b: unreachable instruction

0000 0000000000047cf0 <smp_stop_nmi_callback>:
...
0026    47d16:  e8 00 00 00 00          call   47d1b <smp_stop_nmi_callback+0x2b>       47d17: R_X86_64_PLT32   stop_this_cpu-0x4
002b    47d1b:  b8 01 00 00 00          mov    $0x1,%eax

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Josh Poimboeuf <jpoimboe@redhat.com>
Link: https://lore.kernel.org/r/20220308154319.290905453@infradead.org
2022-03-15 10:32:43 +01:00
Peter Zijlstra
4adb236867 objtool: Ignore extra-symbol code
There's a fun implementation detail on linking STB_WEAK symbols. When
the linker combines two translation units, where one contains a weak
function and the other an override for it. It simply strips the
STB_WEAK symbol from the symbol table, but doesn't actually remove the
code.

The result is that when objtool is ran in a whole-archive kind of way,
it will encounter *heaps* of unused (and unreferenced) code. All
rudiments of weak functions.

Additionally, when a weak implementation is split into a .cold
subfunction that .cold symbol is left in place, even though completely
unused.

Teach objtool to ignore such rudiments by searching for symbol holes;
that is, code ranges that fall outside the given symbol bounds.
Specifically, ignore a sequence of unreachable instruction iff they
occupy a single hole, additionally ignore any .cold subfunctions
referenced.

Both ld.bfd and ld.lld behave like this. LTO builds otoh can (and do)
properly DCE weak functions.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Josh Poimboeuf <jpoimboe@redhat.com>
Link: https://lore.kernel.org/r/20220308154319.232019347@infradead.org
2022-03-15 10:32:43 +01:00
Peter Zijlstra
53f7109ef9 objtool: Rename --duplicate to --lto
In order to prepare for LTO like objtool runs for modules, rename the
duplicate argument to lto.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Josh Poimboeuf <jpoimboe@redhat.com>
Link: https://lore.kernel.org/r/20220308154319.172584233@infradead.org
2022-03-15 10:32:42 +01:00
Peter Zijlstra
5cff2086b0 objtool: Have WARN_FUNC fall back to sym+off
Currently WARN_FUNC() either prints func+off and failing that prints
sec+off, add an intermediate sym+off. This is useful when playing
around with entry code.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Josh Poimboeuf <jpoimboe@redhat.com>
Link: https://lore.kernel.org/r/20220308154317.461283840@infradead.org
2022-03-15 10:32:32 +01:00
Peter Zijlstra
1ffbe4e935 objtool: Default ignore INT3 for unreachable
Ignore all INT3 instructions for unreachable code warnings, similar to NOP.
This allows using INT3 for various paddings instead of NOPs.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Josh Poimboeuf <jpoimboe@redhat.com>
Link: https://lore.kernel.org/r/20220308154317.343312938@infradead.org
2022-03-15 10:32:32 +01:00
Peter Zijlstra
f2d3a25089 objtool: Add --dry-run
Add a --dry-run argument to skip writing the modifications. This is
convenient for debugging.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Miroslav Benes <mbenes@suse.cz>
Acked-by: Josh Poimboeuf <jpoimboe@redhat.com>
Link: https://lore.kernel.org/r/20220308154317.282720146@infradead.org
2022-03-15 10:32:32 +01:00
Fenghua Yu
227a06553f tools/objtool: Check for use of the ENQCMD instruction in the kernel
The ENQCMD instruction implicitly accesses the PASID_MSR to fill in the
pasid field of the descriptor being submitted to an accelerator. But
there is no precise (and stable across kernel changes) point at which
the PASID_MSR is updated from the value for one task to the next.

Kernel code that uses accelerators must always use the ENQCMDS instruction
which does not access the PASID_MSR.

Check for use of the ENQCMD instruction in the kernel and warn on its
usage.

Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Tony Luck <tony.luck@intel.com>
Acked-by: Josh Poimboeuf <jpoimboe@redhat.com>
Link: https://lore.kernel.org/r/20220207230254.3342514-11-fenghua.yu@intel.com
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
2022-03-15 10:32:30 +01:00
Masahiro Yamada
5c8166419a kbuild: replace $(if A,A,B) with $(or A,B)
$(or ...) is available since GNU Make 3.81, and useful to shorten the
code in some places.

Covert as follows:

  $(if A,A,B)  -->  $(or A,B)

This patch also converts:

  $(if A, A, B) --> $(or A, B)

Strictly speaking, the latter is not an equivalent conversion because
GNU Make keeps spaces after commas; if A is not empty, $(if A, A, B)
expands to " A", while $(or A, B) expands to "A".

Anyway, preceding spaces are not significant in the code hunks I touched.

Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Reviewed-by: Nicolas Schier <nicolas@fjasle.eu>
2022-02-15 12:25:56 +09:00
Sergei Trofimovich
82880283d7 objtool: Fix truncated string warning
On GCC 12, the build fails due to a possible truncated string:

    check.c: In function 'validate_call':
    check.c:2865:58: error: '%d' directive output may be truncated writing between 1 and 10 bytes into a region of size 9 [-Werror=format-truncation=]
     2865 |                 snprintf(pvname, sizeof(pvname), "pv_ops[%d]", idx);
          |                                                          ^~

In theory it's a valid bug:

    static char pvname[16];
    int idx;
    ...
    idx = (rel->addend / sizeof(void *));
    snprintf(pvname, sizeof(pvname), "pv_ops[%d]", idx);

There are only 7 chars for %d while it could take up to 9, so the
printed "pv_ops[%d]" string could get truncated.

In reality the bug should never happen, because pv_ops only has ~80
entries, so 7 chars for the integer is more than enough.  Still, it's
worth fixing.  Bump the buffer size by 2 bytes to silence the warning.

[ jpoimboe: changed size to 19; massaged changelog ]

Fixes: db2b0c5d7b ("objtool: Support pv_opsindirect calls for noinstr")
Reported-by: Adam Borowski <kilobyte@angband.pl>
Reported-by: Martin Liška <mliska@suse.cz>
Signed-off-by: Sergei Trofimovich <slyich@gmail.com>
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Link: https://lore.kernel.org/r/20220120233748.2062559-1-slyich@gmail.com
2022-01-24 10:09:06 -08:00
Linus Torvalds
35ce8ae9ae Merge branch 'signal-for-v5.17' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace
Pull signal/exit/ptrace updates from Eric Biederman:
 "This set of changes deletes some dead code, makes a lot of cleanups
  which hopefully make the code easier to follow, and fixes bugs found
  along the way.

  The end-game which I have not yet reached yet is for fatal signals
  that generate coredumps to be short-circuit deliverable from
  complete_signal, for force_siginfo_to_task not to require changing
  userspace configured signal delivery state, and for the ptrace stops
  to always happen in locations where we can guarantee on all
  architectures that the all of the registers are saved and available on
  the stack.

  Removal of profile_task_ext, profile_munmap, and profile_handoff_task
  are the big successes for dead code removal this round.

  A bunch of small bug fixes are included, as most of the issues
  reported were small enough that they would not affect bisection so I
  simply added the fixes and did not fold the fixes into the changes
  they were fixing.

  There was a bug that broke coredumps piped to systemd-coredump. I
  dropped the change that caused that bug and replaced it entirely with
  something much more restrained. Unfortunately that required some
  rebasing.

  Some successes after this set of changes: There are few enough calls
  to do_exit to audit in a reasonable amount of time. The lifetime of
  struct kthread now matches the lifetime of struct task, and the
  pointer to struct kthread is no longer stored in set_child_tid. The
  flag SIGNAL_GROUP_COREDUMP is removed. The field group_exit_task is
  removed. Issues where task->exit_code was examined with
  signal->group_exit_code should been examined were fixed.

  There are several loosely related changes included because I am
  cleaning up and if I don't include them they will probably get lost.

  The original postings of these changes can be found at:
     https://lkml.kernel.org/r/87a6ha4zsd.fsf@email.froward.int.ebiederm.org
     https://lkml.kernel.org/r/87bl1kunjj.fsf@email.froward.int.ebiederm.org
     https://lkml.kernel.org/r/87r19opkx1.fsf_-_@email.froward.int.ebiederm.org

  I trimmed back the last set of changes to only the obviously correct
  once. Simply because there was less time for review than I had hoped"

* 'signal-for-v5.17' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace: (44 commits)
  ptrace/m68k: Stop open coding ptrace_report_syscall
  ptrace: Remove unused regs argument from ptrace_report_syscall
  ptrace: Remove second setting of PT_SEIZED in ptrace_attach
  taskstats: Cleanup the use of task->exit_code
  exit: Use the correct exit_code in /proc/<pid>/stat
  exit: Fix the exit_code for wait_task_zombie
  exit: Coredumps reach do_group_exit
  exit: Remove profile_handoff_task
  exit: Remove profile_task_exit & profile_munmap
  signal: clean up kernel-doc comments
  signal: Remove the helper signal_group_exit
  signal: Rename group_exit_task group_exec_task
  coredump: Stop setting signal->group_exit_task
  signal: Remove SIGNAL_GROUP_COREDUMP
  signal: During coredumps set SIGNAL_GROUP_EXIT in zap_process
  signal: Make coredump handling explicit in complete_signal
  signal: Have prepare_signal detect coredumps using signal->core_state
  signal: Have the oom killer detect coredumps using signal->core_state
  exit: Move force_uaccess back into do_exit
  exit: Guarantee make_task_dead leaks the tsk when calling do_task_exit
  ...
2022-01-17 05:49:30 +02:00
Linus Torvalds
64ad946152 - Get rid of all the .fixup sections because this generates
misleading/wrong stacktraces and confuse RELIABLE_STACKTRACE and
 LIVEPATCH as the backtrace misses the function which is being fixed up.
 
 - Add Straight Light Speculation mitigation support which uses a new
 compiler switch -mharden-sls= which sticks an INT3 after a RET or an
 indirect branch in order to block speculation after them. Reportedly,
 CPUs do speculate behind such insns.
 
 - The usual set of cleanups and improvements
 -----BEGIN PGP SIGNATURE-----
 
 iQIyBAABCgAdFiEEzv7L6UO9uDPlPSfHEsHwGGHeVUoFAmHfKA0ACgkQEsHwGGHe
 VUqLJg/2I2X2xXr5filJVaK+sQgmvDzk67DKnbxRBW2xcPF+B5sSW5yhe3G5UPW7
 SJVdhQ3gHcTiliGGlBf/VE7KXbqxFN0vO4/VFHZm78r43g7OrXTxz6WXXQRJ1n67
 U3YwRH3b6cqXZNFMs+X4bJt6qsGJM1kdTTZ2as4aERnaFr5AOAfQvfKbyhxLe/XA
 3SakfYISVKCBQ2RkTfpMpwmqlsatGFhTC5IrvuDQ83dDsM7O+Dx1J6Gu3fwjKmie
 iVzPOjCh+xTpZQp/SIZmt7MzoduZvpSym4YVyHvEnMiexQT4AmyaRthWqrhnEXY/
 qOvj8/XIqxmix8EaooGqRIK0Y2ZegxkPckNFzaeC3lsWohwMIGIhNXwHNEeuhNyH
 yvNGAW9Cq6NeDRgz5MRUXcimYw4P4oQKYLObS1WqFZhNMqm4sNtoEAYpai/lPYfs
 zUDckgXF2AoPOsSqy3hFAVaGovAgzfDaJVzkt0Lk4kzzjX2WQiNLhmiior460w+K
 0l2Iej58IajSp3MkWmFH368Jo8YfUVmkjbbpsmjsBppA08e1xamJB7RmswI/Ezj6
 s5re6UioCD+UYdjWx41kgbvYdvIkkZ2RLrktoZd/hqHrOLWEIiwEbyFO2nRFJIAh
 YjvPkB1p7iNuAeYcP1x9Ft9GNYVIsUlJ+hK86wtFCqy+abV+zQ==
 =R52z
 -----END PGP SIGNATURE-----

Merge tag 'x86_core_for_v5.17_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull x86 core updates from Borislav Petkov:

 - Get rid of all the .fixup sections because this generates
   misleading/wrong stacktraces and confuse RELIABLE_STACKTRACE and
   LIVEPATCH as the backtrace misses the function which is being fixed
   up.

 - Add Straight Line Speculation mitigation support which uses a new
   compiler switch -mharden-sls= which sticks an INT3 after a RET or an
   indirect branch in order to block speculation after them. Reportedly,
   CPUs do speculate behind such insns.

 - The usual set of cleanups and improvements

* tag 'x86_core_for_v5.17_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (32 commits)
  x86/entry_32: Fix segment exceptions
  objtool: Remove .fixup handling
  x86: Remove .fixup section
  x86/word-at-a-time: Remove .fixup usage
  x86/usercopy: Remove .fixup usage
  x86/usercopy_32: Simplify __copy_user_intel_nocache()
  x86/sgx: Remove .fixup usage
  x86/checksum_32: Remove .fixup usage
  x86/vmx: Remove .fixup usage
  x86/kvm: Remove .fixup usage
  x86/segment: Remove .fixup usage
  x86/fpu: Remove .fixup usage
  x86/xen: Remove .fixup usage
  x86/uaccess: Remove .fixup usage
  x86/futex: Remove .fixup usage
  x86/msr: Remove .fixup usage
  x86/extable: Extend extable functionality
  x86/entry_32: Remove .fixup usage
  x86/entry_64: Remove .fixup usage
  x86/copy_mc_64: Remove .fixup usage
  ...
2022-01-12 16:31:19 -08:00
Linus Torvalds
1be5bdf8cd KCSAN updates for v5.17
This series provides KCSAN fixes and also the ability to take memory
 barriers into account for weakly-ordered systems.  This last can increase
 the probability of detecting certain types of data races.
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCgAxFiEEbK7UrM+RBIrCoViJnr8S83LZ+4wFAmHbuRwTHHBhdWxtY2tA
 a2VybmVsLm9yZwAKCRCevxLzctn7jKDPEACWuzYnd/u/02AHyRd3PIF3Px9uFKlK
 TFwaXX95oYSFCXcrmO42YtDUlZm4QcbwNb85KMCu1DvckRtIsNw0rkBU7BGyqv3Z
 ZoJEfMNpmC0x9+IFBOeseBHySPVT0x7GmYus05MSh0OLfkbCfyImmxRzgoKJGL+A
 ADF9EQb4z2feWjmVEoN8uRaarCAD4f77rSXiX6oTCNDuKrHarqMld/TmoXFrJbu2
 QtfwHeyvraKBnZdUoYfVbGVenyKb1vMv4bUlvrOcuJEgIi/J/th4FupR3XCGYulI
 aWJTl2TQTGnMoE8VnFHgI27I841w3k5PVL+Y1hr/S4uN1/rIoQQuBzCtlnFeCksa
 BiBXsHIchN8N0Dwh8zD8NMd2uxV4t3fmpxXTDAwaOm7vs5hA8AJ0XNu6Sz94Lyjv
 wk2CxX41WWUNJVo3gh6SrS4mL6lC8+VvHF1hbIap++jrevj58gj1jAR1fdx4ANlT
 e2qA00EeoMngEogDNZH42/Fxs3H9zxrBta2ZbkPkwzIqTHH+4pIQDCy2xO3T3oxc
 twdGPYpjYdkf79EGsG4I4R/VA/IfcS09VIWTce8xSDeSnqkgFhcG37r1orJe8hTB
 tH+ODkNOsz5HaEoa8OoAL4ko2l0fL99p2AtMPpuQfHjRj7aorF+dJIrqCizASxwx
 37PjQgOmHeDHgQ==
 =Q5fg
 -----END PGP SIGNATURE-----

Merge tag 'kcsan.2022.01.09a' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu

Pull KCSAN updates from Paul McKenney:
 "This provides KCSAN fixes and also the ability to take memory barriers
  into account for weakly-ordered systems. This last can increase the
  probability of detecting certain types of data races"

* tag 'kcsan.2022.01.09a' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu: (29 commits)
  kcsan: Only test clear_bit_unlock_is_negative_byte if arch defines it
  kcsan: Avoid nested contexts reading inconsistent reorder_access
  kcsan: Turn barrier instrumentation into macros
  kcsan: Make barrier tests compatible with lockdep
  kcsan: Support WEAK_MEMORY with Clang where no objtool support exists
  compiler_attributes.h: Add __disable_sanitizer_instrumentation
  objtool, kcsan: Remove memory barrier instrumentation from noinstr
  objtool, kcsan: Add memory barrier instrumentation to whitelist
  sched, kcsan: Enable memory barrier instrumentation
  mm, kcsan: Enable barrier instrumentation
  x86/qspinlock, kcsan: Instrument barrier of pv_queued_spin_unlock()
  x86/barriers, kcsan: Use generic instrumentation for non-smp barriers
  asm-generic/bitops, kcsan: Add instrumentation for barriers
  locking/atomics, kcsan: Add instrumentation for barriers
  locking/barriers, kcsan: Support generic instrumentation
  locking/barriers, kcsan: Add instrumentation for barriers
  kcsan: selftest: Add test case to check memory barrier instrumentation
  kcsan: Ignore GCC 11+ warnings about TSan runtime support
  kcsan: test: Add test cases for memory barrier instrumentation
  kcsan: test: Match reordered or normal accesses
  ...
2022-01-11 09:51:26 -08:00
Eric W. Biederman
1fb466dff9 objtool: Add a missing comma to avoid string concatenation
Recently the kbuild robot reported two new errors:

>> lib/kunit/kunit-example-test.o: warning: objtool: .text.unlikely: unexpected end of section
>> arch/x86/kernel/dumpstack.o: warning: objtool: oops_end() falls through to next function show_opcodes()

I don't know why they did not occur in my test setup but after digging
it I realized I had accidentally dropped a comma in
tools/objtool/check.c when I renamed rewind_stack_do_exit to
rewind_stack_and_make_dead.

Add that comma back to fix objtool errors.

Link: https://lkml.kernel.org/r/202112140949.Uq5sFKR1-lkp@intel.com
Fixes: 0e25498f8c ("exit: Add and use make_task_dead.")
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
2021-12-15 11:33:22 -06:00
Eric W. Biederman
cead185526 exit: Rename complete_and_exit to kthread_complete_and_exit
Update complete_and_exit to call kthread_exit instead of do_exit.

Change the name to reflect this change in functionality.  All of the
users of complete_and_exit are causing the current kthread to exit so
this change makes it clear what is happening.

Move the implementation of kthread_complete_and_exit from
kernel/exit.c to to kernel/kthread.c.  As this function is kthread
specific it makes most sense to live with the kthread functions.

There are no functional change.

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
2021-12-13 12:04:45 -06:00
Eric W. Biederman
ca3574bd65 exit: Rename module_put_and_exit to module_put_and_kthread_exit
Update module_put_and_exit to call kthread_exit instead of do_exit.

Change the name to reflect this change in functionality.  All of the
users of module_put_and_exit are causing the current kthread to exit
so this change makes it clear what is happening.  There is no
functional change.

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
2021-12-13 12:04:45 -06:00
Eric W. Biederman
bbda86e988 exit: Implement kthread_exit
The way the per task_struct exit_code is used by kernel threads is not
quite compatible how it is used by userspace applications.  The low
byte of the userspace exit_code value encodes the exit signal.  While
kthreads just use the value as an int holding ordinary kernel function
exit status like -EPERM.

Add kthread_exit to clearly separate the two kinds of uses.

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
2021-12-13 12:04:45 -06:00
Eric W. Biederman
0e25498f8c exit: Add and use make_task_dead.
There are two big uses of do_exit.  The first is it's design use to be
the guts of the exit(2) system call.  The second use is to terminate
a task after something catastrophic has happened like a NULL pointer
in kernel code.

Add a function make_task_dead that is initialy exactly the same as
do_exit to cover the cases where do_exit is called to handle
catastrophic failure.  In time this can probably be reduced to just a
light wrapper around do_task_dead. For now keep it exactly the same so
that there will be no behavioral differences introducing this new
concept.

Replace all of the uses of do_exit that use it for catastraphic
task cleanup with make_task_dead to make it clear what the code
is doing.

As part of this rename rewind_stack_do_exit
rewind_stack_and_make_dead.

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
2021-12-13 12:04:45 -06:00
Peter Zijlstra
82a8954acd objtool: Remove .fixup handling
The .fixup has gone the way of the Dodo, that test will always be
false.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Josh Poimboeuf <jpoimboe@redhat.com>
Link: https://lore.kernel.org/r/20211110101326.261496792@infradead.org
2021-12-11 09:09:51 +01:00
Marco Elver
0509811952 objtool, kcsan: Remove memory barrier instrumentation from noinstr
Teach objtool to turn instrumentation required for memory barrier
modeling into nops in noinstr text.

The __tsan_func_entry/exit calls are still emitted by compilers even
with the __no_sanitize_thread attribute. The memory barrier
instrumentation will be inserted explicitly (without compiler help), and
thus needs to also explicitly be removed.

Signed-off-by: Marco Elver <elver@google.com>
Acked-by: Josh Poimboeuf <jpoimboe@redhat.com>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2021-12-09 16:42:28 -08:00
Marco Elver
0525bd82f6 objtool, kcsan: Add memory barrier instrumentation to whitelist
Adds KCSAN's memory barrier instrumentation to objtool's uaccess
whitelist.

Signed-off-by: Marco Elver <elver@google.com>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2021-12-09 16:42:28 -08:00
Peter Zijlstra
1cc1e4c8aa objtool: Add straight-line-speculation validation
Teach objtool to validate the straight-line-speculation constraints:

 - speculation trap after indirect calls
 - speculation trap after RET

Notable: when an instruction is annotated RETPOLINE_SAFE, indicating
  speculation isn't a problem, also don't care about sls for that
  instruction.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lore.kernel.org/r/20211204134908.023037659@infradead.org
2021-12-08 19:26:50 +01:00
Peter Zijlstra
988f01683c objtool: Fix pv_ops noinstr validation
Boris reported that in one of his randconfig builds, objtool got
infinitely stuck. Turns out there's trivial list corruption in the
pv_ops tracking when a function is both in a static table and in a code
assignment.

Avoid re-adding function to the pv_ops[] lists when they're already on
it.

Fixes: db2b0c5d7b ("objtool: Support pv_opsindirect calls for noinstr")
Reported-by: Borislav Petkov <bp@alien8.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Borislav Petkov <bp@suse.de>
Tested-by: Borislav Petkov <bp@alien8.de>
Link: https://lkml.kernel.org/r/20211202204534.GA16608@worktop.programming.kicks-ass.net
2021-12-03 09:11:42 +01:00
Peter Zijlstra
2105a92748 static_call,x86: Robustify trampoline patching
Add a few signature bytes after the static call trampoline and verify
those bytes match before patching the trampoline. This avoids patching
random other JMPs (such as CFI jump-table entries) instead.

These bytes decode as:

   d:   53                      push   %rbx
   e:   43 54                   rex.XB push %r12

And happen to spell "SCT".

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20211030074758.GT174703@worktop.programming.kicks-ass.net
2021-11-11 13:09:31 +01:00
Linus Torvalds
79ef0c0014 Tracing updates for 5.16:
- kprobes: Restructured stack unwinder to show properly on x86 when a stack
   dump happens from a kretprobe callback.
 
 - Fix to bootconfig parsing
 
 - Have tracefs allow owner and group permissions by default (only denying
   others). There's been pressure to allow non root to tracefs in a
   controlled fashion, and using groups is probably the safest.
 
 - Bootconfig memory managament updates.
 
 - Bootconfig clean up to have the tools directory be less dependent on
   changes in the kernel tree.
 
 - Allow perf to be traced by function tracer.
 
 - Rewrite of function graph tracer to be a callback from the function tracer
   instead of having its own trampoline (this change will happen on an arch
   by arch basis, and currently only x86_64 implements it).
 
 - Allow multiple direct trampolines (bpf hooks to functions) be batched
   together in one synchronization.
 
 - Allow histogram triggers to add variables that can perform calculations
   against the event's fields.
 
 - Use the linker to determine architecture callbacks from the ftrace
   trampoline to allow for proper parameter prototypes and prevent warnings
   from the compiler.
 
 - Extend histogram triggers to key off of variables.
 
 - Have trace recursion use bit magic to determine preempt context over if
   branches.
 
 - Have trace recursion disable preemption as all use cases do anyway.
 
 - Added testing for verification of tracing utilities.
 
 - Various small clean ups and fixes.
 -----BEGIN PGP SIGNATURE-----
 
 iIoEABYIADIWIQRRSw7ePDh/lE+zeZMp5XQQmuv6qgUCYYBdxhQccm9zdGVkdEBn
 b29kbWlzLm9yZwAKCRAp5XQQmuv6qp1sAQD2oYFwaG3sx872gj/myBcHIBSKdiki
 Hry5csd8zYDBpgD+Poylopt5JIbeDuoYw/BedgEXmscZ8Qr7VzjAXdnv/Q4=
 =Loz8
 -----END PGP SIGNATURE-----

Merge tag 'trace-v5.16' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace

Pull tracing updates from Steven Rostedt:

 - kprobes: Restructured stack unwinder to show properly on x86 when a
   stack dump happens from a kretprobe callback.

 - Fix to bootconfig parsing

 - Have tracefs allow owner and group permissions by default (only
   denying others). There's been pressure to allow non root to tracefs
   in a controlled fashion, and using groups is probably the safest.

 - Bootconfig memory managament updates.

 - Bootconfig clean up to have the tools directory be less dependent on
   changes in the kernel tree.

 - Allow perf to be traced by function tracer.

 - Rewrite of function graph tracer to be a callback from the function
   tracer instead of having its own trampoline (this change will happen
   on an arch by arch basis, and currently only x86_64 implements it).

 - Allow multiple direct trampolines (bpf hooks to functions) be batched
   together in one synchronization.

 - Allow histogram triggers to add variables that can perform
   calculations against the event's fields.

 - Use the linker to determine architecture callbacks from the ftrace
   trampoline to allow for proper parameter prototypes and prevent
   warnings from the compiler.

 - Extend histogram triggers to key off of variables.

 - Have trace recursion use bit magic to determine preempt context over
   if branches.

 - Have trace recursion disable preemption as all use cases do anyway.

 - Added testing for verification of tracing utilities.

 - Various small clean ups and fixes.

* tag 'trace-v5.16' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace: (101 commits)
  tracing/histogram: Fix semicolon.cocci warnings
  tracing/histogram: Fix documentation inline emphasis warning
  tracing: Increase PERF_MAX_TRACE_SIZE to handle Sentinel1 and docker together
  tracing: Show size of requested perf buffer
  bootconfig: Initialize ret in xbc_parse_tree()
  ftrace: do CPU checking after preemption disabled
  ftrace: disable preemption when recursion locked
  tracing/histogram: Document expression arithmetic and constants
  tracing/histogram: Optimize division by a power of 2
  tracing/histogram: Covert expr to const if both operands are constants
  tracing/histogram: Simplify handling of .sym-offset in expressions
  tracing: Fix operator precedence for hist triggers expression
  tracing: Add division and multiplication support for hist triggers
  tracing: Add support for creating hist trigger variables from literal
  selftests/ftrace: Stop tracing while reading the trace file by default
  MAINTAINERS: Update KPROBES and TRACING entries
  test_kprobes: Move it from kernel/ to lib/
  docs, kprobes: Remove invalid URL and add new reference
  samples/kretprobes: Fix return value if register_kretprobe() failed
  lib/bootconfig: Fix the xbc_get_info kerneldoc
  ...
2021-11-01 20:05:19 -07:00
Linus Torvalds
43aa0a195f objtool updates:
- Improve retpoline code patching by separating it from alternatives which
    reduces memory footprint and allows to do better optimizations in the
    actual runtime patching.
 
  - Add proper retpoline support for x86/BPF
 
  - Address noinstr warnings in x86/kvm, lockdep and paravirtualization code
 
  - Add support to handle pv_opsindirect calls in the noinstr analysis
 
  - Classify symbols upfront and cache the result to avoid redundant
    str*cmp() invocations.
 
  - Add a CFI hash to reduce memory consumption which also reduces runtime
    on a allyesconfig by ~50%
 
  - Adjust XEN code to make objtool handling more robust and as a side
    effect to prevent text fragmentation due to placement of the hypercall
    page.
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCgAxFiEEQp8+kY+LLUocC4bMphj1TA10mKEFAmF/GFgTHHRnbHhAbGlu
 dXRyb25peC5kZQAKCRCmGPVMDXSYoc1JD/0Sz6seP2OUMxbMT3gCcFo9sMvYTdsM
 7WuGFbBbnCIo7g8JH7k0zRRBigptMp2eUtQXKkgaaIbWN4JbuVKf8KxN5/qXxLi4
 fJ12QnNTGH9N2jtzl5wKmpjaKJnnJMD9D10XwoR+T6gn6NHd+AgLEs7GxxuQUlgo
 eC9oEXhNHC8uNhiZc38EwfwmItI1bRgaLrnZWIL4rYGSMxfCK1/cEOpWrFfX9wmj
 /diB6oqMyPXZXMCtgpX7TniUr5XOTCcUkeO9mQv5bmyq/YM/8hrTbcVSJlsVYLvP
 EsBnUSHAcfLFiHXwa1RNiIGdbiPjbN+UYeXGAvqF58f3e5dTIHtN/UmWo7OH93If
 9rLMVNcMpsfPx7QRk2IxEPumLCkyfwjzfKrVDM6P6TKEIUzD1og4IK9gTlfykVsh
 56G5XiCOC/X2x8IMxKTLGuBiAVLFHXK/rSwoqhvNEWBFKDbP13QWs0LurBcW09Sa
 /kQI9pIBT1xFA/R+OY5Xy1cqNVVK1Gxmk8/bllCijA9pCFSCFM4hLZE5CevdrBCV
 h5SdqEK5hIlzFyypXfsCik/4p/+rfvlGfUKtFsPctxx29SPe+T0orx+l61jiWQok
 rZOflwMawK5lDuASHrvNHGJcWaTwoo3VcXMQDnQY0Wulc43J5IFBaPxkZzgyd+S1
 4lktHxatrCMUgw==
 =pfZi
 -----END PGP SIGNATURE-----

Merge tag 'objtool-core-2021-10-31' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull objtool updates from Thomas Gleixner:

 - Improve retpoline code patching by separating it from alternatives
   which reduces memory footprint and allows to do better optimizations
   in the actual runtime patching.

 - Add proper retpoline support for x86/BPF

 - Address noinstr warnings in x86/kvm, lockdep and paravirtualization
   code

 - Add support to handle pv_opsindirect calls in the noinstr analysis

 - Classify symbols upfront and cache the result to avoid redundant
   str*cmp() invocations.

 - Add a CFI hash to reduce memory consumption which also reduces
   runtime on a allyesconfig by ~50%

 - Adjust XEN code to make objtool handling more robust and as a side
   effect to prevent text fragmentation due to placement of the
   hypercall page.

* tag 'objtool-core-2021-10-31' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (41 commits)
  bpf,x86: Respect X86_FEATURE_RETPOLINE*
  bpf,x86: Simplify computing label offsets
  x86,bugs: Unconditionally allow spectre_v2=retpoline,amd
  x86/alternative: Add debug prints to apply_retpolines()
  x86/alternative: Try inline spectre_v2=retpoline,amd
  x86/alternative: Handle Jcc __x86_indirect_thunk_\reg
  x86/alternative: Implement .retpoline_sites support
  x86/retpoline: Create a retpoline thunk array
  x86/retpoline: Move the retpoline thunk declarations to nospec-branch.h
  x86/asm: Fixup odd GEN-for-each-reg.h usage
  x86/asm: Fix register order
  x86/retpoline: Remove unused replacement symbols
  objtool,x86: Replace alternatives with .retpoline_sites
  objtool: Shrink struct instruction
  objtool: Explicitly avoid self modifying code in .altinstr_replacement
  objtool: Classify symbols
  objtool: Support pv_opsindirect calls for noinstr
  x86/xen: Rework the xen_{cpu,irq,mmu}_opsarrays
  x86/xen: Mark xen_force_evtchn_callback() noinstr
  x86/xen: Make irq_disable() noinstr
  ...
2021-11-01 13:24:43 -07:00
Peter Zijlstra
134ab5bd18 objtool,x86: Replace alternatives with .retpoline_sites
Instead of writing complete alternatives, simply provide a list of all
the retpoline thunk calls. Then the kernel is free to do with them as
it pleases. Simpler code all-round.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Borislav Petkov <bp@suse.de>
Acked-by: Josh Poimboeuf <jpoimboe@redhat.com>
Tested-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/r/20211026120309.850007165@infradead.org
2021-10-28 23:25:25 +02:00
Peter Zijlstra
c509331b41 objtool: Shrink struct instruction
Any one instruction can only ever call a single function, therefore
insn->mcount_loc_node is superfluous and can use insn->call_node.

This shrinks struct instruction, which is by far the most numerous
structure objtool creates.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Borislav Petkov <bp@suse.de>
Acked-by: Josh Poimboeuf <jpoimboe@redhat.com>
Tested-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/r/20211026120309.785456706@infradead.org
2021-10-28 23:25:25 +02:00
Peter Zijlstra
dd003edeff objtool: Explicitly avoid self modifying code in .altinstr_replacement
Assume ALTERNATIVE()s know what they're doing and do not change, or
cause to change, instructions in .altinstr_replacement sections.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Borislav Petkov <bp@suse.de>
Acked-by: Josh Poimboeuf <jpoimboe@redhat.com>
Tested-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/r/20211026120309.722511775@infradead.org
2021-10-28 23:25:25 +02:00
Peter Zijlstra
1739c66eb7 objtool: Classify symbols
In order to avoid calling str*cmp() on symbol names, over and over, do
them all once upfront and store the result.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Borislav Petkov <bp@suse.de>
Acked-by: Josh Poimboeuf <jpoimboe@redhat.com>
Tested-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/r/20211026120309.658539311@infradead.org
2021-10-28 23:25:24 +02:00
Michael Forney
86e1e054e0 objtool: Update section header before relocations
The libelf implementation from elftoolchain has a safety check in
gelf_update_rel[a] to check that the data corresponds to a section
that has type SHT_REL[A] [0]. If the relocation is updated before
the section header is updated with the proper type, this check
fails.

To fix this, update the section header first, before the relocations.
Previously, the section size was calculated in elf_rebuild_reloc_section
by counting the number of entries in the reloc_list. However, we
now need the size during elf_write so instead keep a running total
and add to it for every new relocation.

[0] https://sourceforge.net/p/elftoolchain/mailman/elftoolchain-developers/thread/CAGw6cBtkZro-8wZMD2ULkwJ39J+tHtTtAWXufMjnd3cQ7XG54g@mail.gmail.com/

Signed-off-by: Michael Forney <mforney@mforney.org>
Reviewed-by: Miroslav Benes <mbenes@suse.cz>
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Link: https://lore.kernel.org/r/20210509000103.11008-2-mforney@mforney.org
2021-10-06 20:11:57 -07:00
Michael Forney
b46179d6bb objtool: Check for gelf_update_rel[a] failures
Otherwise, if these fail we end up with garbage data in the
.rela.orc_unwind_ip section, leading to errors like

  ld: fs/squashfs/namei.o: bad reloc symbol index (0x7f16 >= 0x12) for offset 0x7f16d5c82cc8 in section `.orc_unwind_ip'

Signed-off-by: Michael Forney <mforney@mforney.org>
Reviewed-by: Miroslav Benes <mbenes@suse.cz>
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Link: https://lore.kernel.org/r/20210509000103.11008-1-mforney@mforney.org
2021-10-06 20:11:53 -07:00
Peter Zijlstra
b08cadbd3b Merge branch 'objtool/urgent'
Fixup conflicts.

# Conflicts:
#	tools/objtool/check.c
2021-10-07 00:40:17 +02:00
Joe Lawrence
fe255fe6ad objtool: Remove redundant 'len' field from struct section
The section structure already contains sh_size, so just remove the extra
'len' member that requires extra mirroring and potential confusion.

Suggested-by: Josh Poimboeuf <jpoimboe@redhat.com>
Signed-off-by: Joe Lawrence <joe.lawrence@redhat.com>
Reviewed-by: Miroslav Benes <mbenes@suse.cz>
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Link: https://lore.kernel.org/r/20210822225037.54620-3-joe.lawrence@redhat.com
Cc: Andy Lavr <andy.lavr@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: x86@kernel.org
Cc: linux-kernel@vger.kernel.org
2021-10-05 12:03:21 -07:00
Joe Lawrence
dc02368164 objtool: Make .altinstructions section entry size consistent
Commit e31694e0a7 ("objtool: Don't make .altinstructions writable")
aligned objtool-created and kernel-created .altinstructions section
flags, but there remains a minor discrepency in their use of a section
entry size: objtool sets one while the kernel build does not.

While sh_entsize of sizeof(struct alt_instr) seems intuitive, this small
deviation can cause failures with external tooling (kpatch-build).

Fix this by creating new .altinstructions sections with sh_entsize of 0
and then later updating sec->sh_size as alternatives are added to the
section.  An added benefit is avoiding the data descriptor and buffer
created by elf_create_section(), but previously unused by
elf_add_alternative().

Fixes: 9bc0bb5072 ("objtool/x86: Rewrite retpoline thunk calls")
Signed-off-by: Joe Lawrence <joe.lawrence@redhat.com>
Reviewed-by: Miroslav Benes <mbenes@suse.cz>
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Link: https://lore.kernel.org/r/20210822225037.54620-2-joe.lawrence@redhat.com
Cc: Andy Lavr <andy.lavr@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: x86@kernel.org
Cc: linux-kernel@vger.kernel.org
2021-10-05 12:03:20 -07:00
Josh Poimboeuf
4d8b35968b objtool: Remove reloc symbol type checks in get_alt_entry()
Converting a special section's relocation reference to a symbol is
straightforward.  No need for objtool to complain that it doesn't know
how to handle it.  Just handle it.

This fixes the following warning:

  arch/x86/kvm/emulate.o: warning: objtool: __ex_table+0x4: don't know how to handle reloc symbol type: kvm_fastop_exception

Fixes: 24ff652573 ("objtool: Teach get_alt_entry() about more relocation types")
Reported-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Link: https://lore.kernel.org/r/feadbc3dfb3440d973580fad8d3db873cbfe1694.1633367242.git.jpoimboe@redhat.com
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: x86@kernel.org
Cc: Miroslav Benes <mbenes@suse.cz>
Cc: linux-kernel@vger.kernel.org
2021-10-05 12:03:20 -07:00
Linus Torvalds
7fab1c12bd objtool: print out the symbol type when complaining about it
The objtool warning that the kvm instruction emulation code triggered
wasn't very useful:

    arch/x86/kvm/emulate.o: warning: objtool: __ex_table+0x4: don't know how to handle reloc symbol type: kvm_fastop_exception

in that it helpfully tells you which symbol name it had trouble figuring
out the relocation for, but it doesn't actually say what the unknown
symbol type was that triggered it all.

In this case it was because of missing type information (type 0, aka
STT_NOTYPE), but on the whole it really should just have printed that
out as part of the message.

Because if this warning triggers, that's very much the first thing you
want to know - why did reloc2sec_off() return failure for that symbol?

So rather than just saying you can't handle some type of symbol without
saying what the type _was_, just print out the type number too.

Fixes: 24ff652573 ("objtool: Teach get_alt_entry() about more relocation types")
Link: https://lore.kernel.org/lkml/CAHk-=wiZwq-0LknKhXN4M+T8jbxn_2i9mcKpO+OaBSSq_Eh7tg@mail.gmail.com/
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-10-03 13:45:48 -07:00
Peter Zijlstra
24ff652573 objtool: Teach get_alt_entry() about more relocation types
Occasionally objtool encounters symbol (as opposed to section)
relocations in .altinstructions. Typically they are the alternatives
written by elf_add_alternative() as encountered on a noinstr
validation run on vmlinux after having already ran objtool on the
individual .o files.

Basically this is the counterpart of commit 44f6a7c075 ("objtool:
Fix seg fault with Clang non-section symbols"), because when these new
assemblers (binutils now also does this) strip the section symbols,
elf_add_reloc_to_insn() is forced to emit symbol based relocations.

As such, teach get_alt_entry() about different relocation types.

Fixes: 9bc0bb5072 ("objtool/x86: Rewrite retpoline thunk calls")
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Reported-by: Borislav Petkov <bp@alien8.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Josh Poimboeuf <jpoimboe@redhat.com>
Tested-by: Nathan Chancellor <nathan@kernel.org>
Link: https://lore.kernel.org/r/YVWUvknIEVNkPvnP@hirez.programming.kicks-ass.net
2021-10-01 13:57:47 +02:00
Josh Poimboeuf
5b284b1933 objtool: Ignore unwind hints for ignored functions
If a function is ignored, also ignore its hints.  This is useful for the
case where the function ignore is conditional on frame pointers, e.g.
STACK_FRAME_NON_STANDARD_FP().

Link: https://lkml.kernel.org/r/163163048317.489837.10988954983369863209.stgit@devnote2

Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Reviewed-by: Masami Hiramatsu <mhiramat@kernel.org>
Tested-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2021-09-30 21:24:07 -04:00
Peter Zijlstra
db2b0c5d7b objtool: Support pv_opsindirect calls for noinstr
Normally objtool will now follow indirect calls; there is no need.

However, this becomes a problem with noinstr validation; if there's an
indirect call from noinstr code, we very much need to know it is to
another noinstr function. Luckily there aren't many indirect calls in
entry code with the obvious exception of paravirt. As such, noinstr
validation didn't work with paravirt kernels.

In order to track pv_ops[] call targets, objtool reads the static
pv_ops[] tables as well as direct assignments to the pv_ops[] array,
provided the compiler makes them a single instruction like:

  bf87:       48 c7 05 00 00 00 00 00 00 00 00        movq   $0x0,0x0(%rip)
    bf92 <xen_init_spinlocks+0x5f>
    bf8a: R_X86_64_PC32     pv_ops+0x268

There are, as of yet, no warnings for when this goes wrong :/

Using the functions found with the above means, all pv_ops[] calls are
now subject to noinstr validation.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lore.kernel.org/r/20210624095149.118815755@infradead.org
2021-09-17 13:20:26 +02:00
Peter Zijlstra
f56dae88a8 objtool: Handle __sanitize_cov*() tail calls
Turns out the compilers also generate tail calls to __sanitize_cov*(),
make sure to also patch those out in noinstr code.

Fixes: 0f1441b44e ("objtool: Fix noinstr vs KCOV")
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Marco Elver <elver@google.com>
Link: https://lore.kernel.org/r/20210624095147.818783799@infradead.org
2021-09-15 15:51:45 +02:00
Peter Zijlstra
8b946cc38e objtool: Introduce CFI hash
Andi reported that objtool on vmlinux.o consumes more memory than his
system has, leading to horrific performance.

This is in part because we keep a struct instruction for every
instruction in the file in-memory. Shrink struct instruction by
removing the CFI state (which includes full register state) from it
and demand allocating it.

Given most instructions don't actually change CFI state, there's lots
of repetition there, so add a hash table to find previous CFI
instances.

Reduces memory consumption (and runtime) for processing an
x86_64-allyesconfig:

  pre:  4:40.84 real,   143.99 user,    44.18 sys,      30624988 mem
  post: 2:14.61 real,   108.58 user,    25.04 sys,      16396184 mem

Suggested-by: Andi Kleen <andi@firstfloor.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lore.kernel.org/r/20210624095147.756759107@infradead.org
2021-09-15 15:51:45 +02:00
Peter Zijlstra
9af9dcf11b x86/xen: Mark cpu_bringup_and_idle() as dead_end_function
The asm_cpu_bringup_and_idle() function is required to push the return
value on the stack in order to make ORC happy, but the only reason
objtool doesn't complain is because of a happy accident.

The thing is that asm_cpu_bringup_and_idle() doesn't return, so
validate_branch() never terminates and falls through to the next
function, which in the normal case is the hypercall_page. And that, as
it happens, is 4095 NOPs and a RET.

Make asm_cpu_bringup_and_idle() terminate on it's own, by making the
function it calls as a dead-end. This way we no longer rely on what
code happens to come after.

Fixes: c3881eb58d ("x86/xen: Make the secondary CPU idle tasks reliable")
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Miroslav Benes <mbenes@suse.cz>
Link: https://lore.kernel.org/r/20210624095147.693801717@infradead.org
2021-09-15 15:51:44 +02:00
Linus Torvalds
b89c07dea1 A single ELF format fix for a section flags mismatch bug that breaks
kernel tooling such as kpatch-build.
 
 Signed-off-by: Ingo Molnar <mingo@kernel.org>
 -----BEGIN PGP SIGNATURE-----
 
 iQJFBAABCgAvFiEEBpT5eoXrXCwVQwEKEnMQ0APhK1gFAmDZYv4RHG1pbmdvQGtl
 cm5lbC5vcmcACgkQEnMQ0APhK1ipeBAAhJPS/kCQ17Y5zGyMB0/6yfCWIifODoS7
 9J+6/mqKHPDdV07yzPtOXuTTmpKV4OHPi8Yj8kaXs5L5fOmQ1uAwITwZNF5hU0a5
 CiFIsubUCJmglf9b6L9EH5pBEQ72Cq4u8zIhJ9LmZ4t625AHJAm2ikZgascc4U67
 RvVoGr5sYTo0YEsc1IDM1wUtnUhXBNjS1VwkXNnCFFTXYHju47MeY1sPHq2hvkzO
 iJGC9A+hxfM1eQt9/qC/2L/6F/XECN61gcR9Get8TkWeEGHmPG+FthmPLd4oO9Ho
 03J4JfMbmXumWosAeilYBNUkfii/M5Em78Wpv/cB94iSt67rq7Eb+8gm4D5svmfN
 l+utsPY/HYB+uWV0hy2cV/ORRiwcJnon54dEWL6912YkKz+OIb3DK/7l9ex5lW+D
 r3o8NP0s6S+RgUkOFxz5VaYK1giu6fiaFysWdKeflvwlvY/64owMepQ1QfPBbeB7
 3DTzvuYZ4Cb1x/vR6WBbFqGcuJKZ1CsZIBLCblveUs+G0wlu147K5E1qlXg/Wvq7
 5Vzznc4fmRng8np5hxAw8ieLkatWg7szyryUV/4H2Ubs/jWGcH628ZYbapaCb7EM
 Eson65xzbVfhnz16z8sN13XIF1lGe8sb0+qiFSclEfyDUnZDuhwMn6d9Ubqxrg5J
 uTULEzmY/rI=
 =MvPd
 -----END PGP SIGNATURE-----
mergetag object d33b9035e1
 type commit
 tag objtool-core-2021-06-28
 tagger Ingo Molnar <mingo@kernel.org> 1624859477 +0200
 
 The biggest change in this cycle is the new code to handle
 and rewrite variable sized jump labels - which results in
 slightly tighter code generation in hot paths, through the
 use of short(er) NOPs.
 
 Also a number of cleanups and fixes, and a change to the
 generic include/linux/compiler.h to handle a s390 GCC quirk.
 
 Signed-off-by: Ingo Molnar <mingo@kernel.org>
 -----BEGIN PGP SIGNATURE-----
 
 iQJFBAABCgAvFiEEBpT5eoXrXCwVQwEKEnMQ0APhK1gFAmDZZGcRHG1pbmdvQGtl
 cm5lbC5vcmcACgkQEnMQ0APhK1goYg/7BxUIJXP0F5wbrMbAvJIDRgR/j3TA+ztk
 uNU1yabBGluMxCqJ87HadJ+A5d010G+GRUn/birVr7w1UuwWv8HOda78dnyG7tme
 xm78/1FlOnstuOTQxhK6rjbb2cp+QOmdsAQkq1TF4SOxArBQiwtjiOvytHjb5yNx
 7LrlbtuZ7Dtc0qd2evkG4ma4QkGoDhBS1dRogrItc27ZLuFIQoNnEd2K2QNMgczw
 a/Jx8fgNmdoJSq+vkBn9TnS/cJYUW/PAlPNtO3ac8yE857aDIVnjXFRzveAP/nTh
 rwFD6aCGnJAqyqP7A8ElNjySos5O+ebYApxe7rEx0TNLbrc55qSP9lpdIO+vgytV
 Xzy4O7z6o+lailQ4EoF8Qf+rlPeue0kLF23SsNbZY1uT0vjX1Uv70xgKbkuyPygp
 GNXAy6dOXK0AfaZYL/Wa50yVnJnkYDjes/hHr+HEam5Oad566pqIyQNP8yWSPqaf
 KHkL//1pb5C2RKwot4IYv/ftHfZB5QftoFq6bhGBc1GXUd/FiqivvGHPW/6g7rxi
 ZIrXs+Fqm/5KP9mssNONfyz5XEvbcUTD1CbeqX9eyVbiYZbLp1oWSgtogiRW9ya+
 HR7t0Dt/UFzFWbilb6EZff/Hdr1NZBZLdrfpvVDoMf5tR9J0BIOyjddTu89g/FIO
 KcfJ5yyjJBU=
 =+HAB
 -----END PGP SIGNATURE-----

Merge tags 'objtool-urgent-2021-06-28' and 'objtool-core-2021-06-28' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull objtool fix and updates from Ingo Molnar:
 "An ELF format fix for a section flags mismatch bug that breaks kernel
  tooling such as kpatch-build.

  The biggest change in this cycle is the new code to handle and rewrite
  variable sized jump labels - which results in slightly tighter code
  generation in hot paths, through the use of short(er) NOPs.

  Also a number of cleanups and fixes, and a change to the generic
  include/linux/compiler.h to handle a s390 GCC quirk"

* tag 'objtool-urgent-2021-06-28' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  objtool: Don't make .altinstructions writable

* tag 'objtool-core-2021-06-28' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  objtool: Improve reloc hash size guestimate
  instrumentation.h: Avoid using inline asm operand modifiers
  compiler.h: Avoid using inline asm operand modifiers
  kbuild: Fix objtool dependency for 'OBJECT_FILES_NON_STANDARD_<obj> := n'
  objtool: Reflow handle_jump_alt()
  jump_label/x86: Remove unused JUMP_LABEL_NOP_SIZE
  jump_label, x86: Allow short NOPs
  objtool: Provide stats for jump_labels
  objtool: Rewrite jump_label instructions
  objtool: Decode jump_entry::key addend
  jump_label, x86: Emit short JMP
  jump_label: Free jump_entry::key bit1 for build use
  jump_label, x86: Add variable length patching support
  jump_label, x86: Introduce jump_entry_size()
  jump_label, x86: Improve error when we fail expected text
  jump_label, x86: Factor out the __jump_table generation
  jump_label, x86: Strip ASM jump_label support
  x86, objtool: Dont exclude arch/x86/realmode/
  objtool: Rewrite hashtable sizing
2021-06-28 11:35:55 -07:00
Josh Poimboeuf
e31694e0a7 objtool: Don't make .altinstructions writable
When objtool creates the .altinstructions section, it sets the SHF_WRITE
flag to make the section writable -- unless the section had already been
previously created by the kernel.  The mismatch between kernel-created
and objtool-created section flags can cause failures with external
tooling (kpatch-build).  And the section doesn't need to be writable
anyway.

Make the section flags consistent with the kernel's.

Fixes: 9bc0bb5072 ("objtool/x86: Rewrite retpoline thunk calls")
Reported-by: Joe Lawrence <joe.lawrence@redhat.com>
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lore.kernel.org/r/6c284ae89717889ea136f9f0064d914cd8329d31.1624462939.git.jpoimboe@redhat.com
2021-06-24 08:55:20 +02:00
Peter Zijlstra
d33b9035e1 objtool: Improve reloc hash size guestimate
Nathan reported that LLVM ThinLTO builds have a performance regression
with commit 25cf0d8aa2 ("objtool: Rewrite hashtable sizing"). Sami
was quick to note that this is due to their use of -ffunction-sections.

As a result the .text section is small and basing the number of relocs
off of that no longer works. Instead have read_sections() compute the
sum of all SHF_EXECINSTR sections and use that.

Fixes: 25cf0d8aa2 ("objtool: Rewrite hashtable sizing")
Reported-by: Nathan Chancellor <nathan@kernel.org>
Debugged-by: Sami Tolvanen <samitolvanen@google.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Tested-by: Nathan Chancellor <nathan@kernel.org>
Link: https://lkml.kernel.org/r/YMJpGLuGNsGtA5JJ@hirez.programming.kicks-ass.net
2021-06-14 14:05:36 +02:00
Peter Zijlstra
2d49b721dc objtool: Only rewrite unconditional retpoline thunk calls
It turns out that the compilers generate conditional branches to the
retpoline thunks like:

  5d5:   0f 85 00 00 00 00       jne    5db <cpuidle_reflect+0x22>
	5d7: R_X86_64_PLT32     __x86_indirect_thunk_r11-0x4

while the rewrite can only handle JMP/CALL to the thunks. The result
is the alternative wrecking the code. Make sure to skip writing the
alternatives for conditional branches.

Fixes: 9bc0bb5072 ("objtool/x86: Rewrite retpoline thunk calls")
Reported-by: Lukasz Majczak <lma@semihalf.com>
Reported-by: Nathan Chancellor <nathan@kernel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Tested-by: Nathan Chancellor <nathan@kernel.org>
2021-06-11 08:53:06 +02:00
Peter Zijlstra
584fd3b318 objtool: Fix .symtab_shndx handling for elf_create_undef_symbol()
When an ELF object uses extended symbol section indexes (IOW it has a
.symtab_shndx section), these must be kept in sync with the regular
symbol table (.symtab).

So for every new symbol we emit, make sure to also emit a
.symtab_shndx value to keep the arrays of equal size.

Note: since we're writing an UNDEF symbol, most GElf_Sym fields will
be 0 and we can repurpose one (st_size) to host the 0 for the xshndx
value.

Fixes: 2f2f7e47f0 ("objtool: Add elf_create_undef_symbol()")
Reported-by: Nick Desaulniers <ndesaulniers@google.com>
Suggested-by: Fangrui Song <maskray@google.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Tested-by: Nick Desaulniers <ndesaulniers@google.com>
Link: https://lkml.kernel.org/r/YL3q1qFO9QIRL/BA@hirez.programming.kicks-ass.net
2021-06-10 10:08:24 +02:00
Peter Zijlstra
48001d26c1 objtool: Reflow handle_jump_alt()
Miroslav figured the code flow in handle_jump_alt() was sub-optimal
with that goto. Reflow the code to make it clearer.

Reported-by: Miroslav Benes <mbenes@suse.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lore.kernel.org/r/YJ00lgslY+IpA/rL@hirez.programming.kicks-ass.net
2021-05-14 09:00:10 +02:00
Vasily Gorbik
f66c05d6ba objtool/x86: Fix elf_add_alternative() endianness
Currently x86 kernel cross-compiled on big endian system fails at boot with:

  kernel BUG at arch/x86/kernel/alternative.c:258!

Corresponding bug condition look like the following:

  BUG_ON(feature >= (NCAPINTS + NBUGINTS) * 32);

Fix that by converting alternative feature/cpuid to target endianness.

Fixes: 9bc0bb5072 ("objtool/x86: Rewrite retpoline thunk calls")
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: https://lore.kernel.org/r/patch-2.thread-6c9df9.git-6c9df9a8098d.your-ad-here.call-01620841104-ext-2554@work.hours
2021-05-12 21:17:01 +02:00
Vasily Gorbik
46c7405df7 objtool: Fix elf_create_undef_symbol() endianness
Currently x86 cross-compilation fails on big endian system with:

  x86_64-cross-ld: init/main.o: invalid string offset 488112128 >= 6229 for section `.strtab'

Mark new ELF data in elf_create_undef_symbol() as symbol, so that libelf
does endianness handling correctly.

Fixes: 2f2f7e47f0 ("objtool: Add elf_create_undef_symbol()")
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: https://lore.kernel.org/r/patch-1.thread-6c9df9.git-d39264656387.your-ad-here.call-01620841104-ext-2554@work.hours
2021-05-12 21:16:53 +02:00
Peter Zijlstra
e2d9494bef objtool: Provide stats for jump_labels
Add objtool --stats to count the jump_label sites it encounters.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lore.kernel.org/r/20210506194158.153101906@infradead.org
2021-05-12 14:54:56 +02:00
Peter Zijlstra
6d37b83c5d objtool: Rewrite jump_label instructions
When a jump_entry::key has bit1 set, rewrite the instruction to be a
NOP. This allows the compiler/assembler to emit JMP (and thus decide
on which encoding to use).

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lore.kernel.org/r/20210506194158.091028792@infradead.org
2021-05-12 14:54:56 +02:00
Peter Zijlstra
cbf82a3dc2 objtool: Decode jump_entry::key addend
Teach objtool about the the low bits in the struct static_key pointer.

That is, the low two bits of @key in:

  struct jump_entry {
	s32 code;
	s32 target;
	long key;
  }

as found in the __jump_table section. Since @key has a relocation to
the variable (to be resolved by the linker), the low two bits will be
reflected in the relocation's addend.

As such, find the reloc and store the addend, such that we can access
these bits.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lore.kernel.org/r/20210506194158.028024143@infradead.org
2021-05-12 14:54:55 +02:00
Peter Zijlstra
25cf0d8aa2 objtool: Rewrite hashtable sizing
Currently objtool has 5 hashtables and sizes them 16 or 20 bits
depending on the --vmlinux argument.

However, a single side doesn't really work well for the 5 tables,
which among them, cover 3 different uses. Also, while vmlinux is
larger, there is still a very wide difference between a defconfig and
allyesconfig build, which again isn't optimally covered by a single
size.

Another aspect is the cost of elf_hash_init(), which for large tables
dominates the runtime for small input files. It turns out that all it
does it assign NULL, something that is required when using malloc().
However, when we allocate memory using mmap(), we're guaranteed to get
zero filled pages.

Therefore, rewrite the whole thing to:

 1) use more dynamic sized tables, depending on the input file,
 2) avoid the need for elf_hash_init() entirely by using mmap().

This speeds up a regular kernel build (100s to 98s for
x86_64-defconfig), and potentially dramatically speeds up vmlinux
processing.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lore.kernel.org/r/20210506194157.452881700@infradead.org
2021-05-12 14:54:50 +02:00
Linus Torvalds
03b2cd72aa Objtool updates in this cycle were:
- Standardize the crypto asm code so that it looks like compiler-generated
    code to objtool - so that it can understand it. This enables unwinding
    from crypto asm code - and also fixes the last known remaining objtool
    warnings for LTO and more.
 
  - x86 decoder fixes: clean up and fix the decoder, and also extend it a bit
 
  - Misc fixes and cleanups
 
 Signed-off-by: Ingo Molnar <mingo@kernel.org>
 -----BEGIN PGP SIGNATURE-----
 
 iQJFBAABCgAvFiEEBpT5eoXrXCwVQwEKEnMQ0APhK1gFAmCJEOQRHG1pbmdvQGtl
 cm5lbC5vcmcACgkQEnMQ0APhK1jN0A//dZIR9GPW1cHkPD3Na+lxb+RWgCVnFUgw
 meLEOum389zWr7S8YmcFpKWLy94f3l24i/e7ufKn6/RMdaQuT6pZUa6teiNPqDKN
 Qq1v6EX/LT49Q1zh/zCCnKdlmF1t7wDyA1/+HBLnB4YfYEtteEt+p2Apyv4xIHOl
 xWqaTMFcVR/El9FXSyqRWRR4zuqY0Uatz0fmfo5jmi2xq460k53fQlTLA/0w5Jw0
 V3omyA3AYMUW6YlW5TGUINOhyDeAJm4PWl3siSUnSd6t8A/TVs5zpZX15BtseCle
 0FRp2SbxOoVkiyo3N3XmkfYYns9+4wK7cr9qja9U9MsSBZJZwaBm2LO/t2WFrAhq
 5dkOsoPmpIsjutsQnIhQgtVT9I/A4/u5m5Zi3trlXsBS0XAt/q+2GPfEngFmgb3q
 nae4rhGUsQ3NTGBiqNuMHQF4yeEvQZ8DCf3ytTz7DjBeiQ9nAtwzbUUGQjYl2mj1
 ZPOnl7Xmq/Nyw+AmdpffFPiEUJxqEg9HWjDo7DQATXb3Hw2VJ3cU8jwPRqDDlO10
 OB81vysXNGTmhOngHXexxncpmU9gDOIC1imZZpw5lNx4W9Qn20AlGaGAIbqzlfx0
 p5VuhkIWCySe1bOZx03xuk7Gq7GBIPPy/a2m204Ftipetlo1HBYwT3KB/wVpHmh7
 CSjWgdiW3+k=
 =poAZ
 -----END PGP SIGNATURE-----

Merge tag 'objtool-core-2021-04-28' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull objtool updates from Ingo Molnar:

 - Standardize the crypto asm code so that it looks like compiler-
   generated code to objtool - so that it can understand it. This
   enables unwinding from crypto asm code - and also fixes the last
   known remaining objtool warnings for LTO and more.

 - x86 decoder fixes: clean up and fix the decoder, and also extend it a
   bit

 - Misc fixes and cleanups

* tag 'objtool-core-2021-04-28' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (24 commits)
  x86/crypto: Enable objtool in crypto code
  x86/crypto/sha512-ssse3: Standardize stack alignment prologue
  x86/crypto/sha512-avx2: Standardize stack alignment prologue
  x86/crypto/sha512-avx: Standardize stack alignment prologue
  x86/crypto/sha256-avx2: Standardize stack alignment prologue
  x86/crypto/sha1_avx2: Standardize stack alignment prologue
  x86/crypto/sha_ni: Standardize stack alignment prologue
  x86/crypto/crc32c-pcl-intel: Standardize jump table
  x86/crypto/camellia-aesni-avx2: Unconditionally allocate stack buffer
  x86/crypto/aesni-intel_avx: Standardize stack alignment prologue
  x86/crypto/aesni-intel_avx: Fix register usage comments
  x86/crypto/aesni-intel_avx: Remove unused macros
  objtool: Support asm jump tables
  objtool: Parse options from OBJTOOL_ARGS
  objtool: Collate parse_options() users
  objtool: Add --backup
  objtool,x86: More ModRM sugar
  objtool,x86: Rewrite ADD/SUB/AND
  objtool,x86: Support %riz encodings
  objtool,x86: Simplify register decode
  ...
2021-04-28 12:53:24 -07:00
Josh Poimboeuf
99033461e6 objtool: Support asm jump tables
Objtool detection of asm jump tables would normally just work, except
for the fact that asm retpolines use alternatives.  Objtool thinks the
alternative code path (a jump to the retpoline) is a sibling call.

Don't treat alternative indirect branches as sibling calls when the
original instruction has a jump table.

Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Tested-by: Ard Biesheuvel <ardb@kernel.org>
Acked-by: Ard Biesheuvel <ardb@kernel.org>
Tested-by: Sami Tolvanen <samitolvanen@google.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
Link: https://lore.kernel.org/r/460cf4dc675d64e1124146562cabd2c05aa322e8.1614182415.git.jpoimboe@redhat.com
2021-04-19 12:36:32 -05:00
Peter Zijlstra
9bc0bb5072 objtool/x86: Rewrite retpoline thunk calls
When the compiler emits: "CALL __x86_indirect_thunk_\reg" for an
indirect call, have objtool rewrite it to:

	ALTERNATIVE "call __x86_indirect_thunk_\reg",
		    "call *%reg", ALT_NOT(X86_FEATURE_RETPOLINE)

Additionally, in order to not emit endless identical
.altinst_replacement chunks, use a global symbol for them, see
__x86_indirect_alt_*.

This also avoids objtool from having to do code generation.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Reviewed-by: Miroslav Benes <mbenes@suse.cz>
Link: https://lkml.kernel.org/r/20210326151300.320177914@infradead.org
2021-04-02 12:47:28 +02:00
Peter Zijlstra
50e7b4a1a1 objtool: Skip magical retpoline .altinstr_replacement
When the .altinstr_replacement is a retpoline, skip the alternative.
We already special case retpolines anyway.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Reviewed-by: Miroslav Benes <mbenes@suse.cz>
Link: https://lkml.kernel.org/r/20210326151300.259429287@infradead.org
2021-04-02 12:46:57 +02:00
Peter Zijlstra
7bd2a600f3 objtool: Cache instruction relocs
Track the reloc of instructions in the new instruction->reloc field
to avoid having to look them up again later.

( Technically x86 instructions can have two relocations, but not jumps
  and calls, for which we're using this. )

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Reviewed-by: Miroslav Benes <mbenes@suse.cz>
Link: https://lkml.kernel.org/r/20210326151300.195441549@infradead.org
2021-04-02 12:46:15 +02:00
Peter Zijlstra
43d5430ad7 objtool: Keep track of retpoline call sites
Provide infrastructure for architectures to rewrite/augment compiler
generated retpoline calls. Similar to what we do for static_call()s,
keep track of the instructions that are retpoline calls.

Use the same list_head, since a retpoline call cannot also be a
static_call.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Reviewed-by: Miroslav Benes <mbenes@suse.cz>
Link: https://lkml.kernel.org/r/20210326151300.130805730@infradead.org
2021-04-02 12:45:27 +02:00
Peter Zijlstra
2f2f7e47f0 objtool: Add elf_create_undef_symbol()
Allow objtool to create undefined symbols; this allows creating
relocations to symbols not currently in the symbol table.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Reviewed-by: Miroslav Benes <mbenes@suse.cz>
Link: https://lkml.kernel.org/r/20210326151300.064743095@infradead.org
2021-04-02 12:45:05 +02:00
Peter Zijlstra
9a7827b778 objtool: Extract elf_symbol_add()
Create a common helper to add symbols.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Reviewed-by: Miroslav Benes <mbenes@suse.cz>
Link: https://lkml.kernel.org/r/20210326151300.003468981@infradead.org
2021-04-02 12:45:01 +02:00
Peter Zijlstra
417a4dc91e objtool: Extract elf_strtab_concat()
Create a common helper to append strings to a strtab.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Reviewed-by: Miroslav Benes <mbenes@suse.cz>
Link: https://lkml.kernel.org/r/20210326151259.941474004@infradead.org
2021-04-02 12:44:56 +02:00
Peter Zijlstra
d0c5c4cc73 objtool: Create reloc sections implicitly
Have elf_add_reloc() create the relocation section implicitly.

Suggested-by: Josh Poimboeuf <jpoimboe@redhat.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Reviewed-by: Miroslav Benes <mbenes@suse.cz>
Link: https://lkml.kernel.org/r/20210326151259.880174448@infradead.org
2021-04-02 12:44:37 +02:00
Peter Zijlstra
ef47cc01cb objtool: Add elf_create_reloc() helper
We have 4 instances of adding a relocation. Create a common helper
to avoid growing even more.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Reviewed-by: Miroslav Benes <mbenes@suse.cz>
Link: https://lkml.kernel.org/r/20210326151259.817438847@infradead.org
2021-04-02 12:44:18 +02:00
Peter Zijlstra
3a647607b5 objtool: Rework the elf_rebuild_reloc_section() logic
Instead of manually calling elf_rebuild_reloc_section() on sections
we've called elf_add_reloc() on, have elf_write() DTRT.

This makes it easier to add random relocations in places without
carefully tracking when we're done and need to flush what section.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Reviewed-by: Miroslav Benes <mbenes@suse.cz>
Link: https://lkml.kernel.org/r/20210326151259.754213408@infradead.org
2021-04-02 12:43:32 +02:00
Peter Zijlstra
a958c4fea7 objtool: Fix static_call list generation
Currently, objtool generates tail call entries in add_jump_destination()
but waits until validate_branch() to generate the regular call entries.
Move these to add_call_destination() for consistency.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Reviewed-by: Miroslav Benes <mbenes@suse.cz>
Link: https://lkml.kernel.org/r/20210326151259.691529901@infradead.org
2021-04-02 12:43:19 +02:00
Peter Zijlstra
530b4ddd9d objtool: Handle per arch retpoline naming
The __x86_indirect_ naming is obviously not generic. Shorten to allow
matching some additional magic names later.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Reviewed-by: Miroslav Benes <mbenes@suse.cz>
Link: https://lkml.kernel.org/r/20210326151259.630296706@infradead.org
2021-04-02 12:43:02 +02:00
Peter Zijlstra
bcb1b6ff39 objtool: Correctly handle retpoline thunk calls
Just like JMP handling, convert a direct CALL to a retpoline thunk
into a retpoline safe indirect CALL.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Reviewed-by: Miroslav Benes <mbenes@suse.cz>
Link: https://lkml.kernel.org/r/20210326151259.567568238@infradead.org
2021-04-02 12:42:54 +02:00
Peter Zijlstra
119251855f x86/retpoline: Simplify retpolines
Due to:

  c9c324dc22 ("objtool: Support stack layout changes in alternatives")

it is now possible to simplify the retpolines.

Currently our retpolines consist of 2 symbols:

 - __x86_indirect_thunk_\reg: the compiler target
 - __x86_retpoline_\reg:  the actual retpoline.

Both are consecutive in code and aligned such that for any one register
they both live in the same cacheline:

  0000000000000000 <__x86_indirect_thunk_rax>:
   0:   ff e0                   jmpq   *%rax
   2:   90                      nop
   3:   90                      nop
   4:   90                      nop

  0000000000000005 <__x86_retpoline_rax>:
   5:   e8 07 00 00 00          callq  11 <__x86_retpoline_rax+0xc>
   a:   f3 90                   pause
   c:   0f ae e8                lfence
   f:   eb f9                   jmp    a <__x86_retpoline_rax+0x5>
  11:   48 89 04 24             mov    %rax,(%rsp)
  15:   c3                      retq
  16:   66 2e 0f 1f 84 00 00 00 00 00   nopw   %cs:0x0(%rax,%rax,1)

The thunk is an alternative_2, where one option is a JMP to the
retpoline. This was done so that objtool didn't need to deal with
alternatives with stack ops. But that problem has been solved, so now
it is possible to fold the entire retpoline into the alternative to
simplify and consolidate unused bytes:

  0000000000000000 <__x86_indirect_thunk_rax>:
   0:   ff e0                   jmpq   *%rax
   2:   90                      nop
   3:   90                      nop
   4:   90                      nop
   5:   90                      nop
   6:   90                      nop
   7:   90                      nop
   8:   90                      nop
   9:   90                      nop
   a:   90                      nop
   b:   90                      nop
   c:   90                      nop
   d:   90                      nop
   e:   90                      nop
   f:   90                      nop
  10:   90                      nop
  11:   66 66 2e 0f 1f 84 00 00 00 00 00        data16 nopw %cs:0x0(%rax,%rax,1)
  1c:   0f 1f 40 00             nopl   0x0(%rax)

Notice that since the longest alternative sequence is now:

   0:   e8 07 00 00 00          callq  c <.altinstr_replacement+0xc>
   5:   f3 90                   pause
   7:   0f ae e8                lfence
   a:   eb f9                   jmp    5 <.altinstr_replacement+0x5>
   c:   48 89 04 24             mov    %rax,(%rsp)
  10:   c3                      retq

17 bytes, we have 15 bytes NOP at the end of our 32 byte slot. (IOW, if
we can shrink the retpoline by 1 byte we can pack it more densely).

 [ bp: Massage commit message. ]

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lkml.kernel.org/r/20210326151259.506071949@infradead.org
2021-04-02 12:42:04 +02:00
Peter Zijlstra
23c1ad538f x86/alternatives: Optimize optimize_nops()
Currently, optimize_nops() scans to see if the alternative starts with
NOPs. However, the emit pattern is:

  141:	\oldinstr
  142:	.skip (len-(142b-141b)), 0x90

That is, when 'oldinstr' is short, the tail is padded with NOPs. This case
never gets optimized.

Rewrite optimize_nops() to replace any trailing string of NOPs inside
the alternative to larger NOPs. Also run it irrespective of patching,
replacing NOPs in both the original and replaced code.

A direct consequence is that 'padlen' becomes superfluous, so remove it.

 [ bp:
   - Adjust commit message
   - remove a stale comment about needing to pad
   - add a comment in optimize_nops()
   - exit early if the NOP verif. loop catches a mismatch - function
     should not not add NOPs in that case
   - fix the "optimized NOPs" offsets output ]

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lkml.kernel.org/r/20210326151259.442992235@infradead.org
2021-04-02 12:41:17 +02:00
Ingo Molnar
b1f480bc06 Merge branch 'x86/cpu' into WIP.x86/core, to merge the NOP changes & resolve a semantic conflict
Conflict-merge this main commit in essence:

  a89dfde3dc3c: ("x86: Remove dynamic NOP selection")

With this upstream commit:

  b90829704780: ("bpf: Use NOP_ATOMIC5 instead of emit_nops(&prog, 5) for BPF_TRAMP_F_CALL_ORIG")

Semantic merge conflict:

  arch/x86/net/bpf_jit_comp.c

  - memcpy(prog, ideal_nops[NOP_ATOMIC5], X86_PATCH_SIZE);
  + memcpy(prog, x86_nops[5], X86_PATCH_SIZE);

Signed-off-by: Ingo Molnar <mingo@kernel.org>
2021-04-02 12:36:30 +02:00
Peter Zijlstra
301cddc21a objtool/x86: Use asm/nops.h
Since the kernel will rely on a single canonical set of NOPs, make sure
objtool uses the exact same ones.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lkml.kernel.org/r/20210312115749.136357911@infradead.org
2021-03-15 16:37:37 +01:00
Borislav Petkov
c7e41b099b tools/objtool: Convert to insn_decode()
Simplify code, no functional changes.

Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lkml.kernel.org/r/20210304174237.31945-18-bp@alien8.de
2021-03-15 12:10:41 +01:00
Borislav Petkov
d30c7b820b x86/insn: Add a __ignore_sync_check__ marker
Add an explicit __ignore_sync_check__ marker which will be used to mark
lines which are supposed to be ignored by file synchronization check
scripts, its advantage being that it explicitly denotes such lines in
the code.

Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Masami Hiramatsu <mhiramat@kernel.org>
Link: https://lkml.kernel.org/r/20210304174237.31945-4-bp@alien8.de
2021-03-15 11:00:57 +01:00
Peter Zijlstra
ba08abca66 objtool,x86: Fix uaccess PUSHF/POPF validation
Commit ab234a260b ("x86/pv: Rework arch_local_irq_restore() to not
use popf") replaced "push %reg; popf" with something like: "test
$0x200, %reg; jz 1f; sti; 1:", which breaks the pushf/popf symmetry
that commit ea24213d80 ("objtool: Add UACCESS validation") relies
on.

The result is:

  drivers/gpu/drm/amd/amdgpu/si.o: warning: objtool: si_common_hw_init()+0xf36: PUSHF stack exhausted

Meanwhile, commit c9c324dc22 ("objtool: Support stack layout changes
in alternatives") makes that we can actually use stack-ops in
alternatives, which means we can revert 1ff865e343 ("x86,smap: Fix
smap_{save,restore}() alternatives").

That in turn means we can limit the PUSHF/POPF handling of
ea24213d80 to those instructions that are in alternatives.

Fixes: ab234a260b ("x86/pv: Rework arch_local_irq_restore() to not use popf")
Reported-by: Borislav Petkov <bp@alien8.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Josh Poimboeuf <jpoimboe@redhat.com>
Link: https://lkml.kernel.org/r/YEY4rIbQYa5fnnEp@hirez.programming.kicks-ass.net
2021-03-12 09:15:49 +01:00
Peter Zijlstra
900b4df347 objtool: Parse options from OBJTOOL_ARGS
Teach objtool to parse options from the OBJTOOL_ARGS environment
variable.

This enables things like:

  $ OBJTOOL_ARGS="--backup" make O=defconfig-build/ kernel/ponies.o

to obtain both defconfig-build/kernel/ponies.o{,.orig} and easily
inspect what objtool actually did.

Suggested-by: Borislav Petkov <bp@alien8.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Acked-by: Josh Poimboeuf <jpoimboe@redhat.com>
Link: https://lkml.kernel.org/r/20210226110004.252553847@infradead.org
2021-03-06 12:44:23 +01:00
Peter Zijlstra
a2f605f9ff objtool: Collate parse_options() users
Ensure there's a single place that parses check_options, in
preparation for extending where to get options from.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Acked-by: Josh Poimboeuf <jpoimboe@redhat.com>
Link: https://lkml.kernel.org/r/20210226110004.193108106@infradead.org
2021-03-06 12:44:23 +01:00
Peter Zijlstra
8ad15c6900 objtool: Add --backup
Teach objtool to write backups files, such that it becomes easier to
see what objtool did to the object file.

Backup files will be ${name}.orig.

Suggested-by: Borislav Petkov <bp@alien8.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Acked-by: Borislav Petkov <bp@suse.de>
Acked-by: Josh Poimboeuf <jpoimboe@redhat.com>
Link: https://lkml.kernel.org/r/YD4obT3aoXPWl7Ax@hirez.programming.kicks-ass.net
2021-03-06 12:44:23 +01:00
Peter Zijlstra
36d92e43d0 objtool,x86: More ModRM sugar
Better helpers to decode ModRM.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Acked-by: Josh Poimboeuf <jpoimboe@redhat.com>
Link: https://lkml.kernel.org/r/YCZB/ljatFXqQbm8@hirez.programming.kicks-ass.net
2021-03-06 12:44:23 +01:00
Peter Zijlstra
961d83b907 objtool,x86: Rewrite ADD/SUB/AND
Support sign extending and imm8 forms.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Acked-by: Josh Poimboeuf <jpoimboe@redhat.com>
Tested-by: Nick Desaulniers <ndesaulniers@google.com>
Link: https://lkml.kernel.org/r/20210211173627.588366777@infradead.org
2021-03-06 12:44:23 +01:00
Peter Zijlstra
78df6245c3 objtool,x86: Support %riz encodings
When there's a SIB byte, the register otherwise denoted by r/m will
then be denoted by SIB.base REX.b will now extend this. SIB.index == SP
is magic and notes an index value zero.

This means that there's a bunch of alternative (longer) encodings for
the same thing. Eg. 'ModRM.mod != 3, ModRM.r/m = AX' can be encoded as
'ModRM.mod != 3, ModRM.r/m = SP, SIB.base = AX, SIB.index = SP' which is actually 4
different encodings because the value of SIB.scale is irrelevant,
giving rise to 5 different but equal encodings.

Support these encodings and clean up the SIB handling in general.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Acked-by: Josh Poimboeuf <jpoimboe@redhat.com>
Tested-by: Nick Desaulniers <ndesaulniers@google.com>
Link: https://lkml.kernel.org/r/20210211173627.472967498@infradead.org
2021-03-06 12:44:23 +01:00
Peter Zijlstra
16ef7f159c objtool,x86: Simplify register decode
Since the CFI_reg number now matches the instruction encoding order do
away with the op_to_cfi_reg[] and use direct assignment.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Acked-by: Josh Poimboeuf <jpoimboe@redhat.com>
Tested-by: Nick Desaulniers <ndesaulniers@google.com>
Link: https://lkml.kernel.org/r/20210211173627.362004522@infradead.org
2021-03-06 12:44:23 +01:00
Peter Zijlstra
ffc7e74f36 objtool,x86: Rewrite LEAVE
Since we can now have multiple stack-ops per instruction, we don't
need to special case LEAVE and can simply emit the composite
operations.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Acked-by: Josh Poimboeuf <jpoimboe@redhat.com>
Tested-by: Nick Desaulniers <ndesaulniers@google.com>
Link: https://lkml.kernel.org/r/20210211173627.253273977@infradead.org
2021-03-06 12:44:23 +01:00
Peter Zijlstra
2ee0c36349 objtool,x86: Rewrite LEA decode
Current LEA decoding is a bunch of special cases, properly decode the
instruction, with exception of full SIB and RIP-relative modes.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Acked-by: Josh Poimboeuf <jpoimboe@redhat.com>
Tested-by: Nick Desaulniers <ndesaulniers@google.com>
Link: https://lkml.kernel.org/r/20210211173627.143250641@infradead.org
2021-03-06 12:44:23 +01:00
Peter Zijlstra
d473b18b2e objtool,x86: Renumber CFI_reg
Make them match the instruction encoding numbering.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Acked-by: Josh Poimboeuf <jpoimboe@redhat.com>
Tested-by: Nick Desaulniers <ndesaulniers@google.com>
Link: https://lkml.kernel.org/r/20210211173627.033720313@infradead.org
2021-03-06 12:44:22 +01:00
Peter Zijlstra
d54dba4199 objtool: Allow UNWIND_HINT to suppress dodgy stack modifications
rewind_stack_do_exit()
	UNWIND_HINT_FUNC
	/* Prevent any naive code from trying to unwind to our caller. */

	xorl	%ebp, %ebp
	movq	PER_CPU_VAR(cpu_current_top_of_stack), %rax
	leaq	-PTREGS_SIZE(%rax), %rsp
	UNWIND_HINT_REGS

	call	do_exit

Does unspeakable things to the stack, which objtool currently fails to
detect due to a limitation in instruction decoding. This will be
rectified after which the above will result in:

arch/x86/entry/entry_64.o: warning: objtool: .text+0xab: unsupported stack register modification

Allow the UNWIND_HINT on the next instruction to suppress this, it
will overwrite the state anyway.

Suggested-by: Josh Poimboeuf <jpoimboe@redhat.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Acked-by: Josh Poimboeuf <jpoimboe@redhat.com>
Tested-by: Nick Desaulniers <ndesaulniers@google.com>
Link: https://lkml.kernel.org/r/20210211173626.918498579@infradead.org
2021-03-06 12:44:22 +01:00
Linus Torvalds
29c395c77a Rework of the X86 irq stack handling:
The irq stack switching was moved out of the ASM entry code in course of
   the entry code consolidation. It ended up being suboptimal in various
   ways.
 
   - Make the stack switching inline so the stackpointer manipulation is not
     longer at an easy to find place.
 
   - Get rid of the unnecessary indirect call.
 
   - Avoid the double stack switching in interrupt return and reuse the
     interrupt stack for softirq handling.
 
   - A objtool fix for CONFIG_FRAME_POINTER=y builds where it got confused
     about the stack pointer manipulation.
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCgAxFiEEQp8+kY+LLUocC4bMphj1TA10mKEFAmA21OcTHHRnbHhAbGlu
 dXRyb25peC5kZQAKCRCmGPVMDXSYoaX0D/9S0ud6oqbsIvI8LwhvYub63a2cjKP9
 liHAJ7xwMYYVwzf0skwsPb/QE6+onCzdq0upJkgG/gEYm2KbiaMWZ4GgHdj0O7ER
 qXKJONDd36AGxSEdaVzLY5kPuD/mkomGk5QdaZaTmjruthkNzg4y/N2wXUBIMZR0
 FdpSpp5fGspSZCn/DXDx6FjClwpLI53VclvDs6DcZ2DIBA0K+F/cSLb1UQoDLE1U
 hxGeuNa+GhKeeZ5C+q5giho1+ukbwtjMW9WnKHAVNiStjm0uzdqq7ERGi/REvkcB
 LY62u5uOSW1zIBMmzUjDDQEqvypB0iFxFCpN8g9sieZjA0zkaUioRTQyR+YIQ8Cp
 l8LLir0dVQivR1bHghHDKQJUpdw/4zvDj4mMH10XHqbcOtIxJDOJHC5D00ridsAz
 OK0RlbAJBl9FTdLNfdVReBCoehYAO8oefeyMAG12nZeSh5XVUWl238rvzmzIYNhG
 cEtkSx2wIUNEA+uSuI+xvfmwpxL7voTGvqmiRDCAFxyO7Bl/GBu9OEBFA1eOvHB+
 +wTmPDMswRetQNh4QCRXzk1JzP1Wk5CobUL9iinCWFoTJmnsPPSOWlosN6ewaNXt
 kYFpRLy5xt9EP7dlfgBSjiRlthDhTdMrFjD5bsy1vdm1w7HKUo82lHa4O8Hq3PHS
 tinKICUqRsbjig==
 =Sqr1
 -----END PGP SIGNATURE-----

Merge tag 'x86-entry-2021-02-24' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull x86 irq entry updates from Thomas Gleixner:
 "The irq stack switching was moved out of the ASM entry code in course
  of the entry code consolidation. It ended up being suboptimal in
  various ways.

  This reworks the X86 irq stack handling:

   - Make the stack switching inline so the stackpointer manipulation is
     not longer at an easy to find place.

   - Get rid of the unnecessary indirect call.

   - Avoid the double stack switching in interrupt return and reuse the
     interrupt stack for softirq handling.

   - A objtool fix for CONFIG_FRAME_POINTER=y builds where it got
     confused about the stack pointer manipulation"

* tag 'x86-entry-2021-02-24' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  objtool: Fix stack-swizzle for FRAME_POINTER=y
  um: Enforce the usage of asm-generic/softirq_stack.h
  x86/softirq/64: Inline do_softirq_own_stack()
  softirq: Move do_softirq_own_stack() to generic asm header
  softirq: Move __ARCH_HAS_DO_SOFTIRQ to Kconfig
  x86: Select CONFIG_HAVE_IRQ_EXIT_ON_IRQ_STACK
  x86/softirq: Remove indirection in do_softirq_own_stack()
  x86/entry: Use run_sysvec_on_irqstack_cond() for XEN upcall
  x86/entry: Convert device interrupts to inline stack switching
  x86/entry: Convert system vectors to irq stack macro
  x86/irq: Provide macro for inlining irq stack switching
  x86/apic: Split out spurious handling code
  x86/irq/64: Adjust the per CPU irq stack pointer by 8
  x86/irq: Sanitize irq stack tracking
  x86/entry: Fix instrumentation annotation
2021-02-24 16:32:23 -08:00
Andrey Konovalov
f00748bfa0 kasan: prefix global functions with kasan_
Patch series "kasan: HW_TAGS tests support and fixes", v4.

This patchset adds support for running KASAN-KUnit tests with the
hardware tag-based mode and also contains a few fixes.

This patch (of 15):

There's a number of internal KASAN functions that are used across multiple
source code files and therefore aren't marked as static inline.  To avoid
littering the kernel function names list with generic function names,
prefix all such KASAN functions with kasan_.

As a part of this change:

 - Rename internal (un)poison_range() to kasan_(un)poison() (no _range)
   to avoid name collision with a public kasan_unpoison_range().

 - Rename check_memory_region() to kasan_check_range(), as it's a more
   fitting name.

Link: https://lkml.kernel.org/r/cover.1610733117.git.andreyknvl@google.com
Link: https://linux-review.googlesource.com/id/I719cc93483d4ba288a634dba80ee6b7f2809cd26
Link: https://lkml.kernel.org/r/13777aedf8d3ebbf35891136e1f2287e2f34aaba.1610733117.git.andreyknvl@google.com
Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
Suggested-by: Marco Elver <elver@google.com>
Reviewed-by: Marco Elver <elver@google.com>
Reviewed-by: Alexander Potapenko <glider@google.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Vincenzo Frascino <vincenzo.frascino@arm.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
Cc: Peter Collingbourne <pcc@google.com>
Cc: Evgenii Stepanov <eugenis@google.com>
Cc: Branislav Rankov <Branislav.Rankov@arm.com>
Cc: Kevin Brodsky <kevin.brodsky@arm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-02-24 13:38:30 -08:00
Linus Torvalds
414eece95b clang-lto for v5.12-rc1 (part2)
- Generate __mcount_loc in objtool (Peter Zijlstra)
 - Support running objtool against vmlinux.o (Sami Tolvanen)
 - Clang LTO enablement for x86 (Sami Tolvanen)
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEpcP2jyKd1g9yPm4TiXL039xtwCYFAmA1fn8ACgkQiXL039xt
 wCbswQ//Zmnq912Ubyn5uPe9SOS/kumGDoqtxGzlZwo/pSB3qFArhD6G07sJ49XD
 nu/05ZcOda760wubnhcuK91n2fY5i/eGLXMSjfgtdVcco4Q67nPQydc+LGdhuDco
 FlhL8TAIwqYN1f2nJK1IggZpZFxz5r/r1Pq8q1S0oQRqDenxDBQwNtBba4B1OIxw
 /FE/1Hp3xwRnuJEP2jREBeY1yQ+Y1n859pZcDgSOWlTArcp8EVUi5hIWJ9DwIe73
 mqnx6PcFWEYB0zLNZmZz2gpEac+ncGyme6ChayeuQfInbL5dhx97jFGt3S6/+NSY
 mF2zyaR/+JsGGuM8dVqH3izKCJXCEAGirrdMO1ndb9HdwS3KnYEiag2ciNWL0wm3
 UEM4r0i2B14sU3pkyotKgsJdOSgorMKkQUPb2wW+OUfnkZNEWKLqylMgNXBD80l4
 WG5vYQRwwFN9jRBik6Z5YFGnwGsNIoGg1F1GRNMjh6h51adYQeBN/1QJE1FJ5L4D
 iKzmZYqimKUINXWfI6TNyqiv9TctOt65pxnRyq+MHxfTDzHGyc3MUeCeCiR1a1yI
 S5QhcgfSnC/NjDA0+oYC6yRlcBtfhjtUqFTGoZ4q4q/LF1BVU1bPyIXZrROLc05s
 LNMMBcWbJetJxFtm/gYfiVFuNitYtxbBV1krVtsWznCA2nKGJ9w=
 =htKJ
 -----END PGP SIGNATURE-----

Merge tag 'clang-lto-v5.12-rc1-part2' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux

Pull more clang LTO updates from Kees Cook:
 "Clang LTO x86 enablement.

  Full disclosure: while this has _not_ been in linux-next (since it
  initially looked like the objtool dependencies weren't going to make
  v5.12), it has been under daily build and runtime testing by Sami for
  quite some time. These x86 portions have been discussed on lkml, with
  Peter, Josh, and others helping nail things down.

  The bulk of the changes are to get objtool working happily. The rest
  of the x86 enablement is very small.

  Summary:

   - Generate __mcount_loc in objtool (Peter Zijlstra)

   - Support running objtool against vmlinux.o (Sami Tolvanen)

   - Clang LTO enablement for x86 (Sami Tolvanen)"

Link: https://lore.kernel.org/lkml/20201013003203.4168817-26-samitolvanen@google.com/
Link: https://lore.kernel.org/lkml/cover.1611263461.git.jpoimboe@redhat.com/

* tag 'clang-lto-v5.12-rc1-part2' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux:
  kbuild: lto: force rebuilds when switching CONFIG_LTO
  x86, build: allow LTO to be selected
  x86, cpu: disable LTO for cpu.c
  x86, vdso: disable LTO only for vDSO
  kbuild: lto: postpone objtool
  objtool: Split noinstr validation from --vmlinux
  x86, build: use objtool mcount
  tracing: add support for objtool mcount
  objtool: Don't autodetect vmlinux.o
  objtool: Fix __mcount_loc generation with Clang's assembler
  objtool: Add a pass for generating __mcount_loc
2021-02-23 15:13:45 -08:00
Sami Tolvanen
41425ebe20 objtool: Split noinstr validation from --vmlinux
This change adds a --noinstr flag to objtool to allow us to specify
that we're processing vmlinux.o without also enabling noinstr
validation. This is needed to avoid false positives with LTO when we
run objtool on vmlinux.o without CONFIG_DEBUG_ENTRY.

Signed-off-by: Sami Tolvanen <samitolvanen@google.com>
2021-02-23 12:46:57 -08:00
Sami Tolvanen
0e731dbc18 objtool: Don't autodetect vmlinux.o
With LTO, we run objtool on vmlinux.o, but don't want noinstr
validation. This change requires --vmlinux to be passed to objtool
explicitly.

Suggested-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Sami Tolvanen <samitolvanen@google.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
2021-02-23 12:46:57 -08:00
Sami Tolvanen
18a14575ae objtool: Fix __mcount_loc generation with Clang's assembler
When objtool generates relocations for the __mcount_loc section, it
tries to reference __fentry__ calls by their section symbol offset.
However, this fails with Clang's integrated assembler as it may not
generate section symbols for every section. This patch looks up a
function symbol instead if the section symbol is missing, similarly
to commit e81e072443 ("objtool: Support Clang non-section symbols
in ORC generation").

Signed-off-by: Sami Tolvanen <samitolvanen@google.com>
2021-02-23 12:46:56 -08:00
Peter Zijlstra
99d0021569 objtool: Add a pass for generating __mcount_loc
Add the --mcount option for generating __mcount_loc sections
needed for dynamic ftrace. Using this pass requires the kernel to
be compiled with -mfentry and CC_USING_NOP_MCOUNT to be defined
in Makefile.

Link: https://lore.kernel.org/lkml/20200625200235.GQ4781@hirez.programming.kicks-ass.net/
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
[Sami: rebased, dropped config changes, fixed to actually use --mcount,
       and wrote a commit message.]
Signed-off-by: Sami Tolvanen <samitolvanen@google.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
2021-02-23 12:46:56 -08:00
Linus Torvalds
a56ff24efb objtool updates:
- Make objtool work for big-endian cross compiles
 
  - Make stack tracking via stack pointer memory operations match push/pop
    semantics to prepare for architectures w/o PUSH/POP instructions.
 
  - Add support for analyzing alternatives
 
  - Improve retpoline detection and handling
 
  - Improve assembly code coverage on x86
 
  - Provide support for inlined stack switching
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCgAxFiEEQp8+kY+LLUocC4bMphj1TA10mKEFAmA1FUcTHHRnbHhAbGlu
 dXRyb25peC5kZQAKCRCmGPVMDXSYoe+0D/9ytW3AfQUOGlVHVPTwCAd2LSCL2kQR
 zrUAyUEwEXDuZi2vOcmgndr9AToszdBnAlxSOStJYE1/ia/ptbYjj9eFOWkCwPw2
 R0DSjTHh+Ui2yPjcbYvOcMphc7DTT1ssMvRWzw0I3fjfJaYBJjNx1qdseN2yhFrL
 BNhdh4B4StEfCbNBMhnzKTZNM1yXNN93ojot9suxnqPIAV6ruc5SUrd9Pmii2odX
 gRHQthGSPMR9nJYWrT2QzbDrM2DWkKIGUol0Xr1LTFYWNFsK3sTQkFiMevTP5Msw
 qO01lw4IKCMKMonaE0t/vxFBz5vhIyivxLQMI3LBixmf2dbE9UbZqW0ONPYoZJgf
 MrYyz4Tdv2u/MklTPM263cbTsdtmGEuW2iVRqaDDWP/Py1A187bUaVkw8p/9O/9V
 CBl8dMF3ag1FquxnsyHDowHKu8DaIZyeBHu69aNfAlcOrtn8ZtY4MwQbQkL9cNYe
 ywLEmCm8zdYNrXlVOuMX/0AAWnSpqCgDYUmKhOLW4W1r4ewNpAUCmvIL8cpLtko0
 FDbMTdKU2pd5SQv5YX6Bvvra483DvP9rNAuQGHpxZ7ubSlj8cFOT9UmjuuOb4fxQ
 EFj8JrF9KEN5sxGUu4tjg0D0Ee3wDdSTGs0cUN5FBMXelQOM7U4n4Y7n/Pas/LMa
 B5TVW3JiDcMcPg==
 =0AHf
 -----END PGP SIGNATURE-----

Merge tag 'objtool-core-2021-02-23' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull objtool updates from Thomas Gleixner:

 - Make objtool work for big-endian cross compiles

 - Make stack tracking via stack pointer memory operations match
   push/pop semantics to prepare for architectures w/o PUSH/POP
   instructions.

 - Add support for analyzing alternatives

 - Improve retpoline detection and handling

 - Improve assembly code coverage on x86

 - Provide support for inlined stack switching

* tag 'objtool-core-2021-02-23' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (33 commits)
  objtool: Support stack-swizzle
  objtool,x86: Additionally decode: mov %rsp, (%reg)
  x86/unwind/orc: Change REG_SP_INDIRECT
  x86/power: Support objtool validation in hibernate_asm_64.S
  x86/power: Move restore_registers() to top of the file
  x86/power: Annotate indirect branches as safe
  x86/acpi: Support objtool validation in wakeup_64.S
  x86/acpi: Annotate indirect branch as safe
  x86/ftrace: Support objtool vmlinux.o validation in ftrace_64.S
  x86/xen/pvh: Annotate indirect branch as safe
  x86/xen: Support objtool vmlinux.o validation in xen-head.S
  x86/xen: Support objtool validation in xen-asm.S
  objtool: Add xen_start_kernel() to noreturn list
  objtool: Combine UNWIND_HINT_RET_OFFSET and UNWIND_HINT_FUNC
  objtool: Add asm version of STACK_FRAME_NON_STANDARD
  objtool: Assume only ELF functions do sibling calls
  x86/ftrace: Add UNWIND_HINT_FUNC annotation for ftrace_stub
  objtool: Support retpoline jump detection for vmlinux.o
  objtool: Fix ".cold" section suffix check for newer versions of GCC
  objtool: Fix retpoline detection in asm code
  ...
2021-02-23 09:56:13 -08:00
Peter Zijlstra
724c8a23d5 objtool: Fix stack-swizzle for FRAME_POINTER=y
When objtool encounters the stack-swizzle:

	mov %rsp, (%[tos])
	mov %[tos], %rsp
	...
	pop %rsp

Inside a FRAME_POINTER=y build, things go a little screwy because
clearly we're not adjusting the cfa->base. This then results in the
pop %rsp not being detected as a restore of cfa->base so it will turn
into a regular POP and offset the stack, resulting in:

  kernel/softirq.o: warning: objtool: do_softirq()+0xdb: return with modified stack frame

Therefore, have "mov %[tos], %rsp" act like a PUSH (it sorta is
anyway) to balance the things out. We're not too concerned with the
actual stack_size for frame-pointer builds, since we don't generate
ORC data for them anyway.

Fixes: aafeb14e9d ("objtool: Support stack-swizzle")
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Josh Poimboeuf <jpoimboe@redhat.com>
Link: https://lkml.kernel.org/r/YC6UC+rc9KKmQrkd@hirez.programming.kicks-ass.net
2021-02-22 19:54:09 +01:00
Josh Poimboeuf
73f44fe19d static_call: Allow module use without exposing static_call_key
When exporting static_call_key; with EXPORT_STATIC_CALL*(), the module
can use static_call_update() to change the function called.  This is
not desirable in general.

Not exporting static_call_key however also disallows usage of
static_call(), since objtool needs the key to construct the
static_call_site.

Solve this by allowing objtool to create the static_call_site using
the trampoline address when it builds a module and cannot find the
static_call_key symbol. The module loader will then try and map the
trampole back to a key before it constructs the normal sites list.

Doing this requires a trampoline -> key associsation, so add another
magic section that keeps those.

Originally-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lkml.kernel.org/r/20210127231837.ifddpn7rhwdaepiu@treble
2021-02-17 14:12:42 +01:00
Peter Zijlstra
2d24dd5798 rbtree: Add generic add and find helpers
I've always been bothered by the endless (fragile) boilerplate for
rbtree, and I recently wrote some rbtree helpers for objtool and
figured I should lift them into the kernel and use them more widely.

Provide:

partial-order; less() based:
 - rb_add(): add a new entry to the rbtree
 - rb_add_cached(): like rb_add(), but for a rb_root_cached

total-order; cmp() based:
 - rb_find(): find an entry in an rbtree
 - rb_find_add(): find an entry, and add if not found

 - rb_find_first(): find the first (leftmost) matching entry
 - rb_next_match(): continue from rb_find_first()
 - rb_for_each(): iterate a sub-tree using the previous two

Inlining and constant propagation should see the compiler inline the
whole thing, including the various compare functions.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Reviewed-by: Michel Lespinasse <walken@google.com>
Acked-by: Davidlohr Bueso <dbueso@suse.de>
2021-02-17 14:07:31 +01:00
Ingo Molnar
a3251c1a36 Merge branch 'x86/paravirt' into x86/entry
Merge in the recent paravirt changes to resolve conflicts caused
by objtool annotations.

Conflicts:
	arch/x86/xen/xen-asm.S

Signed-off-by: Ingo Molnar <mingo@kernel.org>
2021-02-12 13:36:43 +01:00
Peter Zijlstra
aafeb14e9d objtool: Support stack-swizzle
Natively support the stack swizzle pattern:

	mov %rsp, (%[tos])
	mov %[tos], %rsp
	...
	pop %rsp

It uses the vals[] array to link the first two stack-ops, and detect
the SP to SP_INDIRECT swizzle.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Miroslav Benes <mbenes@suse.cz>
Acked-by: Josh Poimboeuf <jpoimboe@redhat.com>
2021-02-10 20:53:52 +01:00
Peter Zijlstra
2a51282984 objtool,x86: Additionally decode: mov %rsp, (%reg)
Where we already decode: mov %rsp, %reg, also decode mov %rsp, (%reg).

Nothing should match for this new stack-op.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Miroslav Benes <mbenes@suse.cz>
Acked-by: Josh Poimboeuf <jpoimboe@redhat.com>
2021-02-10 20:53:52 +01:00
Peter Zijlstra
87ccc826bf x86/unwind/orc: Change REG_SP_INDIRECT
Currently REG_SP_INDIRECT is unused but means (%rsp + offset),
change it to mean (%rsp) + offset.

The reason is that we're going to swizzle stack in the middle of a C
function with non-trivial stack footprint. This means that when the
unwinder finds the ToS, it needs to dereference it (%rsp) and then add
the offset to the next frame, resulting in: (%rsp) + offset

This is somewhat unfortunate, since REG_BP_INDIRECT is used (by DRAP)
and thus needs to retain the current (%rbp + offset).

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Miroslav Benes <mbenes@suse.cz>
Acked-by: Josh Poimboeuf <jpoimboe@redhat.com>
2021-02-10 20:53:51 +01:00
Josh Poimboeuf
c26acfbbfb objtool: Add xen_start_kernel() to noreturn list
xen_start_kernel() doesn't return.  Annotate it as such so objtool can
follow the code flow.

Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Link: https://lore.kernel.org/r/930deafa89256c60b180442df59a1bbae48f30ab.1611263462.git.jpoimboe@redhat.com
2021-01-26 11:12:00 -06:00
Josh Poimboeuf
b735bd3e68 objtool: Combine UNWIND_HINT_RET_OFFSET and UNWIND_HINT_FUNC
The ORC metadata generated for UNWIND_HINT_FUNC isn't actually very
func-like.  With certain usages it can cause stack state mismatches
because it doesn't set the return address (CFI_RA).

Also, users of UNWIND_HINT_RET_OFFSET no longer need to set a custom
return stack offset.  Instead they just need to specify a func-like
situation, so the current ret_offset code is hacky for no good reason.

Solve both problems by simplifying the RET_OFFSET handling and
converting it into a more useful UNWIND_HINT_FUNC.

If we end up needing the old 'ret_offset' functionality again in the
future, we should be able to support it pretty easily with the addition
of a custom 'sp_offset' in UNWIND_HINT_FUNC.

Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Link: https://lore.kernel.org/r/db9d1f5d79dddfbb3725ef6d8ec3477ad199948d.1611263462.git.jpoimboe@redhat.com
2021-01-26 11:12:00 -06:00
Josh Poimboeuf
ecf11ba4d0 objtool: Assume only ELF functions do sibling calls
There's an inconsistency in how sibling calls are detected in
non-function asm code, depending on the scope of the object.  If the
target code is external to the object, objtool considers it a sibling
call.  If the target code is internal but not a function, objtool
*doesn't* consider it a sibling call.

This can cause some inconsistencies between per-object and vmlinux.o
validation.

Instead, assume only ELF functions can do sibling calls.  This generally
matches existing reality, and makes sibling call validation consistent
between vmlinux.o and per-object.

Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Link: https://lore.kernel.org/r/0e9ab6f3628cc7bf3bde7aa6762d54d7df19ad78.1611263461.git.jpoimboe@redhat.com
2021-01-26 11:12:00 -06:00
Josh Poimboeuf
31a7424bc5 objtool: Support retpoline jump detection for vmlinux.o
Objtool converts direct retpoline jumps to type INSN_JUMP_DYNAMIC, since
that's what they are semantically.

That conversion doesn't work in vmlinux.o validation because the
indirect thunk function is present in the object, so the intra-object
jump check succeeds before the retpoline jump check gets a chance.

Rearrange the checks: check for a retpoline jump before checking for an
intra-object jump.

Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Link: https://lore.kernel.org/r/4302893513770dde68ddc22a9d6a2a04aca491dd.1611263461.git.jpoimboe@redhat.com
2021-01-26 11:11:59 -06:00
Josh Poimboeuf
34ca59e109 objtool: Fix ".cold" section suffix check for newer versions of GCC
With my version of GCC 9.3.1 the ".cold" subfunctions no longer have a
numbered suffix, so the trailing period is no longer there.

Presumably this doesn't yet trigger a user-visible bug since most of the
subfunction detection logic is duplicated.   I only found it when
testing vmlinux.o validation.

Fixes: 54262aa283 ("objtool: Fix sibling call detection")
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Link: https://lore.kernel.org/r/ca0b5a57f08a2fbb48538dd915cc253b5edabb40.1611263461.git.jpoimboe@redhat.com
2021-01-26 11:11:59 -06:00
Josh Poimboeuf
1f9a1b7494 objtool: Fix retpoline detection in asm code
The JMP_NOSPEC macro branches to __x86_retpoline_*() rather than the
__x86_indirect_thunk_*() wrappers used by C code.  Detect jumps to
__x86_retpoline_*() as retpoline dynamic jumps.

Presumably this doesn't trigger a user-visible bug.  I only found it
when testing vmlinux.o validation.

Fixes: 39b735332c ("objtool: Detect jumps to retpoline thunks")
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Link: https://lore.kernel.org/r/31f5833e2e4f01e3d755889ac77e3661e906c09f.1611263461.git.jpoimboe@redhat.com
2021-01-26 11:11:59 -06:00
Josh Poimboeuf
6f567c9300 objtool: Fix error handling for STD/CLD warnings
Actually return an error (and display a backtrace, if requested) for
directional bit warnings.

Fixes: 2f0f9e9ad7 ("objtool: Add Direction Flag validation")
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Link: https://lore.kernel.org/r/dc70f2adbc72f09526f7cab5b6feb8bf7f6c5ad4.1611263461.git.jpoimboe@redhat.com
2021-01-26 11:11:59 -06:00
Josh Poimboeuf
1d489151e9 objtool: Don't fail on missing symbol table
Thanks to a recent binutils change which doesn't generate unused
symbols, it's now possible for thunk_64.o be completely empty without
CONFIG_PREEMPTION: no text, no data, no symbols.

We could edit the Makefile to only build that file when
CONFIG_PREEMPTION is enabled, but that will likely create confusion
if/when the thunks end up getting used by some other code again.

Just ignore it and move on.

Reported-by: Nathan Chancellor <natechancellor@gmail.com>
Reviewed-by: Nathan Chancellor <natechancellor@gmail.com>
Reviewed-by: Miroslav Benes <mbenes@suse.cz>
Tested-by: Nathan Chancellor <natechancellor@gmail.com>
Link: https://github.com/ClangBuiltLinux/linux/issues/1254
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
2021-01-21 15:49:58 -06:00
Josh Poimboeuf
655cf86548 objtool: Don't fail the kernel build on fatal errors
This is basically a revert of commit 644592d328 ("objtool: Fail the
kernel build on fatal errors").

That change turned out to be more trouble than it's worth.  Failing the
build is an extreme measure which sometimes gets too much attention and
blocks CI build testing.

These fatal-type warnings aren't yet as rare as we'd hope, due to the
ever-increasing matrix of supported toolchains/plugins and their
fast-changing nature as of late.

Also, there are more people (and bots) looking for objtool warnings than
ever before, so even non-fatal warnings aren't likely to be ignored for
long.

Suggested-by: Nick Desaulniers <ndesaulniers@google.com>
Reviewed-by: Miroslav Benes <mbenes@suse.cz>
Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
Reviewed-by: Kamalesh Babulal <kamalesh@linux.vnet.ibm.com>
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
2021-01-21 15:49:39 -06:00
Josh Poimboeuf
c9c324dc22 objtool: Support stack layout changes in alternatives
The ORC unwinder showed a warning [1] which revealed the stack layout
didn't match what was expected.  The problem was that paravirt patching
had replaced "CALL *pv_ops.irq.save_fl" with "PUSHF;POP".  That changed
the stack layout between the PUSHF and the POP, so unwinding from an
interrupt which occurred between those two instructions would fail.

Part of the agreed upon solution was to rework the custom paravirt
patching code to use alternatives instead, since objtool already knows
how to read alternatives (and converging runtime patching infrastructure
is always a good thing anyway).  But the main problem still remains,
which is that runtime patching can change the stack layout.

Making stack layout changes in alternatives was disallowed with commit
7117f16bf4 ("objtool: Fix ORC vs alternatives"), but now that paravirt
is going to be doing it, it needs to be supported.

One way to do so would be to modify the ORC table when the code gets
patched.  But ORC is simple -- a good thing! -- and it's best to leave
it alone.

Instead, support stack layout changes by "flattening" all possible stack
states (CFI) from parallel alternative code streams into a single set of
linear states.  The only necessary limitation is that CFI conflicts are
disallowed at all possible instruction boundaries.

For example, this scenario is allowed:

          Alt1                    Alt2                    Alt3

   0x00   CALL *pv_ops.save_fl    CALL xen_save_fl        PUSHF
   0x01                                                   POP %RAX
   0x02                                                   NOP
   ...
   0x05                           NOP
   ...
   0x07   <insn>

The unwind information for offset-0x00 is identical for all 3
alternatives.  Similarly offset-0x05 and higher also are identical (and
the same as 0x00).  However offset-0x01 has deviating CFI, but that is
only relevant for Alt3, neither of the other alternative instruction
streams will ever hit that offset.

This scenario is NOT allowed:

          Alt1                    Alt2

   0x00   CALL *pv_ops.save_fl    PUSHF
   0x01                           NOP6
   ...
   0x07   NOP                     POP %RAX

The problem here is that offset-0x7, which is an instruction boundary in
both possible instruction patch streams, has two conflicting stack
layouts.

[ The above examples were stolen from Peter Zijlstra. ]

The new flattened CFI array is used both for the detection of conflicts
(like the second example above) and the generation of linear ORC
entries.

BTW, another benefit of these changes is that, thanks to some related
cleanups (new fake nops and alt_group struct) objtool can finally be rid
of fake jumps, which were a constant source of headaches.

[1] https://lkml.kernel.org/r/20201111170536.arx2zbn4ngvjoov7@treble

Cc: Shinichiro Kawasaki <shinichiro.kawasaki@wdc.com>
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
2021-01-14 09:53:54 -06:00
Josh Poimboeuf
b23cc71c62 objtool: Add 'alt_group' struct
Create a new struct associated with each group of alternatives
instructions.  This will help with the removal of fake jumps, and more
importantly with adding support for stack layout changes in
alternatives.

Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
2021-01-14 09:53:48 -06:00
Josh Poimboeuf
ab4e0744e9 objtool: Refactor ORC section generation
Decouple ORC entries from instructions.  This simplifies the
control/data flow, and is going to make it easier to support alternative
instructions which change the stack layout.

Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
2021-01-14 09:53:42 -06:00
Vasily Gorbik
7786032e52 objtool: Rework header include paths
Currently objtool headers are being included either by their base name
or included via ../ from a parent directory. In case of a base name usage:

 #include "warn.h"
 #include "arch_elf.h"

it does not make it apparent from which directory the file comes from.
To make it slightly better, and actually to avoid name clashes some arch
specific files have "arch_" suffix. And files from an arch folder have
to revert to including via ../ e.g:
 #include "../../elf.h"

With additional architectures support and the code base growth there is
a need for clearer headers naming scheme for multiple reasons:
1. to make it instantly obvious where these files come from (objtool
   itself / objtool arch|generic folders / some other external files),
2. to avoid name clashes of objtool arch specific headers, potential
   obtool arch generic headers and the system header files (there is
   /usr/include/elf.h already),
3. to avoid ../ includes and improve code readability.
4. to give a warm fuzzy feeling to developers who are mostly kernel
   developers and are accustomed to linux kernel headers arranging
   scheme.

Doesn't this make it instantly obvious where are these files come from?

 #include <objtool/warn.h>
 #include <arch/elf.h>

And doesn't it look nicer to avoid ugly ../ includes? Which also
guarantees this is elf.h from the objtool and not /usr/include/elf.h.

 #include <objtool/elf.h>

This patch defines and implements new objtool headers arranging
scheme. Which is:
- all generic headers go to include/objtool (similar to include/linux)
- all arch headers go to arch/$(SRCARCH)/include/arch (to get arch
  prefix). This is similar to linux arch specific "asm/*" headers but we
  are not abusing "asm" name and calling it what it is. This also helps
  to prevent name clashes (arch is not used in system headers or kernel
  exports).

To bring objtool to this state the following things are done:
1. current top level tools/objtool/ headers are moved into
   include/objtool/ subdirectory,
2. arch specific headers, currently only arch/x86/include/ are moved into
   arch/x86/include/arch/ and were stripped of "arch_" suffix,
3. new -I$(srctree)/tools/objtool/include include path to make
   includes like <objtool/warn.h> possible,
4. rewriting file includes,
5. make git not to ignore include/objtool/ subdirectory.

Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
2021-01-13 18:13:14 -06:00
Vasily Gorbik
8bfe273238 objtool: Fix x86 orc generation on big endian cross-compiles
Correct objtool orc generation endianness problems to enable fully
functional x86 cross-compiles on big endian hardware.

Introduce bswap_if_needed() macro, which does a byte swap if target
endianness doesn't match the host, i.e. cross-compilation for little
endian on big endian and vice versa.  The macro is used for conversion
of multi-byte values which are read from / about to be written to a
target native endianness ELF file.

Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
2021-01-13 18:13:13 -06:00
Martin Schwidefsky
a1a664ece5 objtool: Fix reloc generation on big endian cross-compiles
Relocations generated in elf_rebuild_rel[a]_reloc_section() are broken
if objtool is built and run on a big endian system.

The following errors pop up during x86 cross-compilation:

  x86_64-9.1.0-ld: fs/efivarfs/inode.o: bad reloc symbol index (0x2000000 >= 0x22) for offset 0 in section `.orc_unwind_ip'
  x86_64-9.1.0-ld: final link failed: bad value

Convert those functions to use gelf_update_rel[a](), similar to what
elf_write_reloc() does.

Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Co-developed-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
2021-01-13 18:13:12 -06:00
Julien Thierry
201ef5a974 objtool: Make SP memory operation match PUSH/POP semantics
Architectures without PUSH/POP instructions will always access the stack
though memory operations (SRC/DEST_INDIRECT). Make those operations have
the same effect on the CFA as PUSH/POP, with no stack pointer
modification.

Signed-off-by: Julien Thierry <jthierry@redhat.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
2021-01-13 18:13:10 -06:00
Julien Thierry
468af56a7b objtool: Support addition to set CFA base
On arm64, the compiler can set the frame pointer either
with a move operation or with and add operation like:

    add (SP + constant), BP

For a simple move operation, the CFA base is changed from SP to BP.
Handle also changing the CFA base when the frame pointer is set with
an addition instruction.

Signed-off-by: Julien Thierry <jthierry@redhat.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
2021-01-13 18:13:10 -06:00
Julien Thierry
fb084fde0c objtool: Fully validate the stack frame
A valid stack frame should contain both the return address and the
previous frame pointer value.

On x86, the return value is placed on the stack by the calling
instructions. On other architectures, the callee needs to explicitly
save the return address on the stack.

Add the necessary checks to verify a function properly sets up all the
elements of the stack frame.

Signed-off-by: Julien Thierry <jthierry@redhat.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
2021-01-13 18:13:09 -06:00
Josh Poimboeuf
a2e38dffcd objtool: Don't add empty symbols to the rbtree
Building with the Clang assembler shows the following warning:

  arch/x86/kernel/ftrace_64.o: warning: objtool: missing symbol for insn at offset 0x16

The Clang assembler strips section symbols.  That ends up giving
objtool's find_func_containing() much more test coverage than normal.
Turns out, find_func_containing() doesn't work so well for overlapping
symbols:

     2: 000000000000000e     0 NOTYPE  LOCAL  DEFAULT    2 fgraph_trace
     3: 000000000000000f     0 NOTYPE  LOCAL  DEFAULT    2 trace
     4: 0000000000000000   165 FUNC    GLOBAL DEFAULT    2 __fentry__
     5: 000000000000000e     0 NOTYPE  GLOBAL DEFAULT    2 ftrace_stub

The zero-length NOTYPE symbols are inside __fentry__(), confusing the
rbtree search for any __fentry__() offset coming after a NOTYPE.

Try to avoid this problem by not adding zero-length symbols to the
rbtree.  They're rare and aren't needed in the rbtree anyway.

One caveat, this actually might not end up being the right fix.
Non-empty overlapping symbols, if they exist, could have the same
problem.  But that would need bigger changes, let's see if we can get
away with the easy fix for now.

Reported-by: Arnd Bergmann <arnd@kernel.org>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
2021-01-13 16:56:37 -06:00
Linus Torvalds
cce622ab92 Fix a segfault that occurs when built with Clang.
Signed-off-by: Ingo Molnar <mingo@kernel.org>
 -----BEGIN PGP SIGNATURE-----
 
 iQJFBAABCgAvFiEEBpT5eoXrXCwVQwEKEnMQ0APhK1gFAl/oU8URHG1pbmdvQGtl
 cm5lbC5vcmcACgkQEnMQ0APhK1iOgw//Rb7VwweTCbPXansvkw3/lCZdQ4EHhfw9
 kWwAd+fXyGgO4SL9Beu0LgH4IADmEmDBpG5tdX6F1fufRoJKOANc1yhKisWGfxhR
 WCx+ve+0gden6Ky2hPqog3hVOcZQnyrZSCHRSSvDqV9zasIDIqSJI9UNMVQ8lb6r
 fRCwbW6++8dwy0vQVN/yU78Gi/YTEPPyP5us3WATJuvyTQtD3P8PrQyJQWucULhB
 49Tup//M/NjZRC8p5Yyhyy7YXOngox+LEjw9S8Eztlu3f0YqGlMwYmKc6FOYyl3z
 R6zoO+vKbkakXZ2qwPBVTNINQOc/5HGKf5OPl2itollMtpQIlOCNwENULmjhisoK
 k/BOtBp699GjqvRMfWKEp3WC/xV3ujQa/RKA6bi7F54G2p5cZIV2qx+/nK+MNqEq
 pWg2yqvKQEWZbA4AAUGj3Ls1lsBgs1m1Uc9gZLVwM22gkwCut5xZczXHmANZGeZY
 AdkX/AAxM4/X+u1E7DbpCUmOeylT3ig8iDJxGr56Gr06kJEoyMocDdpk+T/KdRXL
 2paPZhMS3BJWLF1Z9W0a0fT36F9Q0FU9dvp3UXBv/iuslIHRfBw5PoNqUlDNGb6h
 rHHyAvlt210xsZrVTTcHbJzR+xzR0AkEd1C0/g2yUitp3yORC0BgsjYRmew20iEI
 aJ/vstOYgi4=
 =xfnP
 -----END PGP SIGNATURE-----

Merge tag 'objtool-urgent-2020-12-27' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull objtool fix from Ingo Molnar:
 "Fix a segfault that occurs when built with Clang"

* tag 'objtool-urgent-2020-12-27' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  objtool: Fix seg fault with Clang non-section symbols
2020-12-27 09:08:23 -08:00
Josh Poimboeuf
44f6a7c075 objtool: Fix seg fault with Clang non-section symbols
The Clang assembler likes to strip section symbols, which means objtool
can't reference some text code by its section.  This confuses objtool
greatly, causing it to seg fault.

The fix is similar to what was done before, for ORC reloc generation:

  e81e072443 ("objtool: Support Clang non-section symbols in ORC generation")

Factor out that code into a common helper and use it for static call
reloc generation as well.

Reported-by: Arnd Bergmann <arnd@kernel.org>
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
Reviewed-by: Miroslav Benes <mbenes@suse.cz>
Link: https://github.com/ClangBuiltLinux/linux/issues/1207
Link: https://lkml.kernel.org/r/ba6b6c0f0dd5acbba66e403955a967d9fdd1726a.1607983452.git.jpoimboe@redhat.com
2020-12-16 14:35:46 +01:00
Jean-Philippe Brucker
c8a950d0d3 tools: Factor HOSTCC, HOSTLD, HOSTAR definitions
Several Makefiles in tools/ need to define the host toolchain variables.
Move their definition to tools/scripts/Makefile.include

Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Link: https://lore.kernel.org/bpf/20201110164310.2600671-2-jean-philippe@linaro.org
2020-11-11 12:18:22 -08:00
Linus Torvalds
6873139ed0 objtool changes for v5.10:
- Most of the changes are cleanups and reorganization to make the objtool code
    more arch-agnostic. This is in preparation for non-x86 support.
 
 Fixes:
 
  - KASAN fixes.
  - Handle unreachable trap after call to noreturn functions better.
  - Ignore unreachable fake jumps.
  - Misc smaller fixes & cleanups.
 
 Signed-off-by: Ingo Molnar <mingo@kernel.org>
 -----BEGIN PGP SIGNATURE-----
 
 iQJFBAABCgAvFiEEBpT5eoXrXCwVQwEKEnMQ0APhK1gFAl+FgwIRHG1pbmdvQGtl
 cm5lbC5vcmcACgkQEnMQ0APhK1juGw/6A6goA5/HHapM965yG1eY/rTLp3eIbcma
 1ZbkUsP0YfT6wVUzw/sOeZzKNOwOq1FuMfkjuH2KcnlxlcMekIaKvLk8uauW4igM
 hbFGuuZfZ0An5ka9iQ1W6HGdsuD3vVlN1w/kxdWk0c3lJCVQSTxdCfzF8fuF3gxX
 lF3Bc1D/ZFcHIHT/hu/jeIUCgCYpD3qZDjQJBScSwVthZC+Fw6weLLGp2rKDaCao
 HhSQft6MUfDrUKfH3LBIUNPRPCOrHo5+AX6BXxLXJVxqlwO/YU3e0GMwSLedMtBy
 TASWo7/9GAp+wNNZe8EliyTKrfC3sLxN1QImfjuojxbBVXx/YQ/ToTt9fVGpF4Y+
 XhhRFv9520v1tS2wPHIgQGwbh7EWG6mdrmo10RAs/31ViONPrbEZ4WmcA08b/5FY
 KEkOVb18yfmDVzVZPpSc+HpIFkppEBOf7wPg27Bj3RTZmzIl/y+rKSnxROpsJsWb
 R6iov7SFVET14lHl1G7tPNXfqRaS7HaOQIj3rSUyAP0ZfX+yIupVJp32dc6Ofg8b
 SddUCwdIHoFdUNz4Y9csUCrewtCVJbxhV4MIdv0GpWbrgSw96RFZgetaH+6mGRpj
 0Kh6M1eC3irDbhBuarWUBAr2doPAq4iOUeQU36Q6YSAbCs83Ws2uKOWOHoFBVwCH
 uSKT0wqqG+E=
 =KX5o
 -----END PGP SIGNATURE-----

Merge tag 'objtool-core-2020-10-13' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull objtool updates from Ingo Molnar:
 "Most of the changes are cleanups and reorganization to make the
  objtool code more arch-agnostic. This is in preparation for non-x86
  support.

  Other changes:

   - KASAN fixes

   - Handle unreachable trap after call to noreturn functions better

   - Ignore unreachable fake jumps

   - Misc smaller fixes & cleanups"

* tag 'objtool-core-2020-10-13' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (21 commits)
  perf build: Allow nested externs to enable BUILD_BUG() usage
  objtool: Allow nested externs to enable BUILD_BUG()
  objtool: Permit __kasan_check_{read,write} under UACCESS
  objtool: Ignore unreachable trap after call to noreturn functions
  objtool: Handle calling non-function symbols in other sections
  objtool: Ignore unreachable fake jumps
  objtool: Remove useless tests before save_reg()
  objtool: Decode unwind hint register depending on architecture
  objtool: Make unwind hint definitions available to other architectures
  objtool: Only include valid definitions depending on source file type
  objtool: Rename frame.h -> objtool.h
  objtool: Refactor jump table code to support other architectures
  objtool: Make relocation in alternative handling arch dependent
  objtool: Abstract alternative special case handling
  objtool: Move macros describing structures to arch-dependent code
  objtool: Make sync-check consider the target architecture
  objtool: Group headers to check in a single list
  objtool: Define 'struct orc_entry' only when needed
  objtool: Skip ORC entry creation for non-text sections
  objtool: Move ORC logic out of check()
  ...
2020-10-14 10:13:37 -07:00
Linus Torvalds
dd502a8107 This tree introduces static_call(), which is the idea of static_branch()
applied to indirect function calls. Remove a data load (indirection) by
 modifying the text.
 
 They give the flexibility of function pointers, but with better
 performance. (This is especially important for cases where
 retpolines would otherwise be used, as retpolines can be pretty
 slow.)
 
 API overview:
 
   DECLARE_STATIC_CALL(name, func);
   DEFINE_STATIC_CALL(name, func);
   DEFINE_STATIC_CALL_NULL(name, typename);
 
   static_call(name)(args...);
   static_call_cond(name)(args...);
   static_call_update(name, func);
 
 x86 is supported via text patching, otherwise basic indirect calls are used,
 with function pointers.
 
 There's a second variant using inline code patching, inspired by jump-labels,
 implemented on x86 as well.
 
 The new APIs are utilized in the x86 perf code, a heavy user of function pointers,
 where static calls speed up the PMU handler by 4.2% (!).
 
 The generic implementation is not really excercised on other architectures,
 outside of the trivial test_static_call_init() self-test.
 
 Signed-off-by: Ingo Molnar <mingo@kernel.org>
 -----BEGIN PGP SIGNATURE-----
 
 iQJFBAABCgAvFiEEBpT5eoXrXCwVQwEKEnMQ0APhK1gFAl+EfAQRHG1pbmdvQGtl
 cm5lbC5vcmcACgkQEnMQ0APhK1iEAw//divHeVCJnHhV+YBbuI9ROUsERkzu8VhK
 O1DEmW68Fvj7pszT8NZsMjtkt97ZtxDRK7aCJiiup0eItG9qCJ8lpCLb84ZbizHV
 HhCbhBLrpxSvTrWlQnkgP1OkPAbtoryIjVlZzWhjye2MY8UEbVnZWyviBolbAAxH
 Fk1Yi56fIMu19GO+9Ohzy9E2VDnVEH1iMx5YWoLD2H88Qbq/yEMP+U2tIj8hIVKT
 Y/jdogihNXRIau6QB+YPfDPisdty+RHxfU7zct4Rv8cFF5ylglZB5fD34C3sUQF2
 WqsaYz7zjUj9f02F8pw8hIaAT7InzArPhlNVITxf2oMfmdrNqBptnSCddZqCJLvv
 oDGew21k50Zcbqkv9amclpxXH5tTpRvJeqit2pz/85GMeqBRuhzHUAkCpht5YA73
 qJsHWS3z+qIxKi0tDbhDJswuwa51q5sgdUUwo1uCr3wT3DGDlqNhCAZBzX14dcty
 0shDSbv13TCwqAcb7asPzEoPwE15cwa+x+viGEIL901pyZKyQYjs/abDU26It3BW
 roWRkuVJZ9/QMdZJs1v7kaXw1L8YiKIDkBgke+xbfrDwEvvjudQkl2LUL66DB11j
 RJU3GyxKClvdY06SSRh/H13fqZLNKh1JZ0nPEWSTJECDFN9zcDjrDrod/7PFOcpY
 NAlawLoGG+s=
 =JvpF
 -----END PGP SIGNATURE-----

Merge tag 'core-static_call-2020-10-12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull static call support from Ingo Molnar:
 "This introduces static_call(), which is the idea of static_branch()
  applied to indirect function calls. Remove a data load (indirection)
  by modifying the text.

  They give the flexibility of function pointers, but with better
  performance. (This is especially important for cases where retpolines
  would otherwise be used, as retpolines can be pretty slow.)

  API overview:

      DECLARE_STATIC_CALL(name, func);
      DEFINE_STATIC_CALL(name, func);
      DEFINE_STATIC_CALL_NULL(name, typename);

      static_call(name)(args...);
      static_call_cond(name)(args...);
      static_call_update(name, func);

  x86 is supported via text patching, otherwise basic indirect calls are
  used, with function pointers.

  There's a second variant using inline code patching, inspired by
  jump-labels, implemented on x86 as well.

  The new APIs are utilized in the x86 perf code, a heavy user of
  function pointers, where static calls speed up the PMU handler by
  4.2% (!).

  The generic implementation is not really excercised on other
  architectures, outside of the trivial test_static_call_init()
  self-test"

* tag 'core-static_call-2020-10-12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (21 commits)
  static_call: Fix return type of static_call_init
  tracepoint: Fix out of sync data passing by static caller
  tracepoint: Fix overly long tracepoint names
  x86/perf, static_call: Optimize x86_pmu methods
  tracepoint: Optimize using static_call()
  static_call: Allow early init
  static_call: Add some validation
  static_call: Handle tail-calls
  static_call: Add static_call_cond()
  x86/alternatives: Teach text_poke_bp() to emulate RET
  static_call: Add simple self-test for static calls
  x86/static_call: Add inline static call implementation for x86-64
  x86/static_call: Add out-of-line static call implementation
  static_call: Avoid kprobes on inline static_call()s
  static_call: Add inline static call infrastructure
  static_call: Add basic static call infrastructure
  compiler.h: Make __ADDRESSABLE() symbol truly unique
  jump_label,module: Fix module lifetime for __jump_label_mod_text_reserved()
  module: Properly propagate MODULE_STATE_COMING failure
  module: Fix up module_notifier return values
  ...
2020-10-12 13:58:15 -07:00
Linus Torvalds
ed016af52e These are the locking updates for v5.10:
- Add deadlock detection for recursive read-locks. The rationale is outlined
    in:
 
      224ec489d3cd: ("lockdep/Documention: Recursive read lock detection reasoning")
 
    The main deadlock pattern we want to detect is:
 
            TASK A:                 TASK B:
 
            read_lock(X);
                                    write_lock(X);
            read_lock_2(X);
 
  - Add "latch sequence counters" (seqcount_latch_t):
 
       A sequence counter variant where the counter even/odd value is used to
       switch between two copies of protected data. This allows the read path,
       typically NMIs, to safely interrupt the write side critical section.
 
    We utilize this new variant for sched-clock, and to make x86 TSC handling safer.
 
  - Other seqlock cleanups, fixes and enhancements
 
  - KCSAN updates
 
  - LKMM updates
 
  - Misc updates, cleanups and fixes.
 
 Signed-off-by: Ingo Molnar <mingo@kernel.org>
 -----BEGIN PGP SIGNATURE-----
 
 iQJFBAABCgAvFiEEBpT5eoXrXCwVQwEKEnMQ0APhK1gFAl+EX6QRHG1pbmdvQGtl
 cm5lbC5vcmcACgkQEnMQ0APhK1g3gxAAkg+Jy/tcdRxlxlEDOQPFy1mBqvFmulNA
 pGFPkB6dzqmAWF/NfOZSl4g/h/mqGYsq2V+PfK5E8Sq8DQ/yCmnLhjgVOHNUUliv
 x0WWfOysNgJdtdf69NLYJufIQhxhyI0dwFHHoHIsCdGdGqjh2DVevQFPFTBjdpOc
 BUZYo+u3gCaCdB6A2nmlcWYbEw8eVEHgv3qLG6dq46J0KJOV0HfliqJoU3EZqH+s
 977LvEIo+THfuYWMo/Jepwngbi0y36KeeukOAdwm9fK196htBHIUR+YPPrAe+FWD
 z+UXP5IS5XIw9V1sGLmUaC2m+6gpdW19jKBtlzPkxHXmJmsgiZdLLeytEh3WYey7
 nzfH+9Jd4NyyZKucLssYkOjf6P5BxGKCyJ9LXb7vlSthIhiDdFNx47oKtW4hxjOY
 jubsI3BP5c3G1sIBIjTS53XmOhJg+Z52FxTpQ33JswXn1wGidcHZiuNHZuU5q28p
 +tn8rGb2NGJFb4Sw/Vp0yTcqIpEXf+vweiQoaxm6tc9BWzcVzZntGnh0i3gFotx/
 VgKafN4+pgXgo6bwHbN2WBK2FGyvcXFaptfaOMZL48En82hJ1DI6EnBEYN+vuERQ
 JcCXg+iHeeVbxoou7q8NJxITkBmEL5xNBIugXRRqNSP3fXLxKjFuPYqT84/e7yZi
 elGTReYcq6g=
 =Iq51
 -----END PGP SIGNATURE-----

Merge tag 'locking-core-2020-10-12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull locking updates from Ingo Molnar:
 "These are the locking updates for v5.10:

   - Add deadlock detection for recursive read-locks.

     The rationale is outlined in commit 224ec489d3 ("lockdep/
     Documention: Recursive read lock detection reasoning")

     The main deadlock pattern we want to detect is:

           TASK A:                 TASK B:

           read_lock(X);
                                   write_lock(X);
           read_lock_2(X);

   - Add "latch sequence counters" (seqcount_latch_t):

     A sequence counter variant where the counter even/odd value is used
     to switch between two copies of protected data. This allows the
     read path, typically NMIs, to safely interrupt the write side
     critical section.

     We utilize this new variant for sched-clock, and to make x86 TSC
     handling safer.

   - Other seqlock cleanups, fixes and enhancements

   - KCSAN updates

   - LKMM updates

   - Misc updates, cleanups and fixes"

* tag 'locking-core-2020-10-12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (67 commits)
  lockdep: Revert "lockdep: Use raw_cpu_*() for per-cpu variables"
  lockdep: Fix lockdep recursion
  lockdep: Fix usage_traceoverflow
  locking/atomics: Check atomic-arch-fallback.h too
  locking/seqlock: Tweak DEFINE_SEQLOCK() kernel doc
  lockdep: Optimize the memory usage of circular queue
  seqlock: Unbreak lockdep
  seqlock: PREEMPT_RT: Do not starve seqlock_t writers
  seqlock: seqcount_LOCKNAME_t: Introduce PREEMPT_RT support
  seqlock: seqcount_t: Implement all read APIs as statement expressions
  seqlock: Use unique prefix for seqcount_t property accessors
  seqlock: seqcount_LOCKNAME_t: Standardize naming convention
  seqlock: seqcount latch APIs: Only allow seqcount_latch_t
  rbtree_latch: Use seqcount_latch_t
  x86/tsc: Use seqcount_latch_t
  timekeeping: Use seqcount_latch_t
  time/sched_clock: Use seqcount_latch_t
  seqlock: Introduce seqcount_latch_t
  mm/swap: Do not abuse the seqcount_t latching API
  time/sched_clock: Use raw_read_seqcount_latch() during suspend
  ...
2020-10-12 13:06:20 -07:00
Linus Torvalds
ca1b66922a * Extend the recovery from MCE in kernel space also to processes which
encounter an MCE in kernel space but while copying from user memory by
 sending them a SIGBUS on return to user space and umapping the faulty
 memory, by Tony Luck and Youquan Song.
 
 * memcpy_mcsafe() rework by splitting the functionality into
 copy_mc_to_user() and copy_mc_to_kernel(). This, as a result, enables
 support for new hardware which can recover from a machine check
 encountered during a fast string copy and makes that the default and
 lets the older hardware which does not support that advance recovery,
 opt in to use the old, fragile, slow variant, by Dan Williams.
 
 * New AMD hw enablement, by Yazen Ghannam and Akshay Gupta.
 
 * Do not use MSR-tracing accessors in #MC context and flag any fault
 while accessing MCA architectural MSRs as an architectural violation
 with the hope that such hw/fw misdesigns are caught early during the hw
 eval phase and they don't make it into production.
 
 * Misc fixes, improvements and cleanups, as always.
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEzv7L6UO9uDPlPSfHEsHwGGHeVUoFAl+EIpUACgkQEsHwGGHe
 VUouoBAAgwb+NkWZtIqGImV4f+LOyFjhTR/r/7ZyiijXdbhOIuAdc/jQM31mQxug
 sX2jxaRYnf1n6SLA0ggX99gwr2deRQ/hsNf5Abw55GC+Z1dOxpGL0k59A3ELl1IR
 H9KYmCAFQIHvzfk38qcdND73XHcgthQoXFBOG9wAPAdgDWnaiWt6lcLAq8OiJTmp
 D8pInAYhcnL8YXwMGyQQ1KkFn9HwydoWDsK5Ff2shaw2/+dMQqd1zetenbVtjhLb
 iNYGvV7Bi/RQ8PyMbzmtTWa4kwQJAHC2gptkGxty//2ADGVBbqUQdqF9TjIWCNy5
 V6Ldv5zo0/1s7DOzji3htzqkSs/K1Ea6d2LtZjejkJipHKV5x068UC6Fu+PlfS2D
 VZfcICeapU4G2F3Zvks2DlZ7dVTbHCvoI78Qi7bBgczPUVmk6iqah4xuQaiHyBJc
 kTFDA4Nnf/026GpoWRiFry9vqdnHBZyLet5A6Y+SoWF0FbhYnCVPpq4MnussYoav
 lUIi9ZZav6X2RZp9DDM1f9d5xubtKq0DKt93wvzqAhjK0T2DikckJ+riOYkI6N8t
 fHCBNUkdfgyMzJUTBPAzYQ7RmjbjKWJi7xWP0oz6+GqOJkQfSTVC5/2yEffbb3ya
 whYRS6iklbl7yshzaOeecXsZcAeK2oGPfoHg34WkHFgXdF5mNgA=
 =u1Wg
 -----END PGP SIGNATURE-----

Merge tag 'ras_updates_for_v5.10' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull RAS updates from Borislav Petkov:

 - Extend the recovery from MCE in kernel space also to processes which
   encounter an MCE in kernel space but while copying from user memory
   by sending them a SIGBUS on return to user space and umapping the
   faulty memory, by Tony Luck and Youquan Song.

 - memcpy_mcsafe() rework by splitting the functionality into
   copy_mc_to_user() and copy_mc_to_kernel(). This, as a result, enables
   support for new hardware which can recover from a machine check
   encountered during a fast string copy and makes that the default and
   lets the older hardware which does not support that advance recovery,
   opt in to use the old, fragile, slow variant, by Dan Williams.

 - New AMD hw enablement, by Yazen Ghannam and Akshay Gupta.

 - Do not use MSR-tracing accessors in #MC context and flag any fault
   while accessing MCA architectural MSRs as an architectural violation
   with the hope that such hw/fw misdesigns are caught early during the
   hw eval phase and they don't make it into production.

 - Misc fixes, improvements and cleanups, as always.

* tag 'ras_updates_for_v5.10' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/mce: Allow for copy_mc_fragile symbol checksum to be generated
  x86/mce: Decode a kernel instruction to determine if it is copying from user
  x86/mce: Recover from poison found while copying from user space
  x86/mce: Avoid tail copy when machine check terminated a copy from user
  x86/mce: Add _ASM_EXTABLE_CPY for copy user access
  x86/mce: Provide method to find out the type of an exception handler
  x86/mce: Pass pointer to saved pt_regs to severity calculation routines
  x86/copy_mc: Introduce copy_mc_enhanced_fast_string()
  x86, powerpc: Rename memcpy_mcsafe() to copy_mc_to_{user, kernel}()
  x86/mce: Drop AMD-specific "DEFERRED" case from Intel severity rule list
  x86/mce: Add Skylake quirk for patrol scrub reported errors
  RAS/CEC: Convert to DEFINE_SHOW_ATTRIBUTE()
  x86/mce: Annotate mce_rd/wrmsrl() with noinstr
  x86/mce/dev-mcelog: Do not update kflags on AMD systems
  x86/mce: Stop mce_reign() from re-computing severity for every CPU
  x86/mce: Make mce_rdmsrl() panic on an inaccessible MSR
  x86/mce: Increase maximum number of banks to 64
  x86/mce: Delay clearing IA32_MCG_STATUS to the end of do_machine_check()
  x86/MCE/AMD, EDAC/mce_amd: Remove struct smca_hwid.xec_bitmap
  RAS/CEC: Fix cec_init() prototype
2020-10-12 10:14:38 -07:00
Ingo Molnar
d6c4c11348 Merge branch 'kcsan' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu into locking/core
Pull KCSAN updates for v5.10 from Paul E. McKenney:

 - Improve kernel messages.

 - Be more permissive with bitops races under KCSAN_ASSUME_PLAIN_WRITES_ATOMIC=y.

 - Optimize debugfs stat counters.

 - Introduce the instrument_*read_write() annotations, to provide a
   finer description of certain ops - using KCSAN's compound instrumentation.
   Use them for atomic RNW and bitops, where appropriate.
   Doing this might find new races.
   (Depends on the compiler having tsan-compound-read-before-write=1 support.)

 - Support atomic built-ins, which will help certain architectures, such as s390.

 - Misc enhancements and smaller fixes.

Signed-off-by: Ingo Molnar <mingo@kernel.org>
2020-10-09 08:56:02 +02:00
Vasily Gorbik
2486baae2c objtool: Allow nested externs to enable BUILD_BUG()
Currently BUILD_BUG() macro is expanded to smth like the following:
   do {
           extern void __compiletime_assert_0(void)
                   __attribute__((error("BUILD_BUG failed")));
           if (!(!(1)))
                   __compiletime_assert_0();
   } while (0);

If used in a function body this obviously would produce build errors
with -Wnested-externs and -Werror.

Build objtool with -Wno-nested-externs to enable BUILD_BUG() usage.

Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
2020-10-06 09:32:13 -05:00
Dan Williams
5da8e4a658 x86/copy_mc: Introduce copy_mc_enhanced_fast_string()
The motivations to go rework memcpy_mcsafe() are that the benefit of
doing slow and careful copies is obviated on newer CPUs, and that the
current opt-in list of CPUs to instrument recovery is broken relative to
those CPUs.  There is no need to keep an opt-in list up to date on an
ongoing basis if pmem/dax operations are instrumented for recovery by
default. With recovery enabled by default the old "mcsafe_key" opt-in to
careful copying can be made a "fragile" opt-out. Where the "fragile"
list takes steps to not consume poison across cachelines.

The discussion with Linus made clear that the current "_mcsafe" suffix
was imprecise to a fault. The operations that are needed by pmem/dax are
to copy from a source address that might throw #MC to a destination that
may write-fault, if it is a user page.

So copy_to_user_mcsafe() becomes copy_mc_to_user() to indicate
the separate precautions taken on source and destination.
copy_mc_to_kernel() is introduced as a non-SMAP version that does not
expect write-faults on the destination, but is still prepared to abort
with an error code upon taking #MC.

The original copy_mc_fragile() implementation had negative performance
implications since it did not use the fast-string instruction sequence
to perform copies. For this reason copy_mc_to_kernel() fell back to
plain memcpy() to preserve performance on platforms that did not indicate
the capability to recover from machine check exceptions. However, that
capability detection was not architectural and now that some platforms
can recover from fast-string consumption of memory errors the memcpy()
fallback now causes these more capable platforms to fail.

Introduce copy_mc_enhanced_fast_string() as the fast default
implementation of copy_mc_to_kernel() and finalize the transition of
copy_mc_fragile() to be a platform quirk to indicate 'copy-carefully'.
With this in place, copy_mc_to_kernel() is fast and recovery-ready by
default regardless of hardware capability.

Thanks to Vivek for identifying that copy_user_generic() is not suitable
as the copy_mc_to_user() backend since the #MC handler explicitly checks
ex_has_fault_handler(). Thanks to the 0day robot for catching a
performance bug in the x86/copy_mc_to_user implementation.

 [ bp: Add the "why" for this change from the 0/2th message, massage. ]

Fixes: 92b0729c34 ("x86/mm, x86/mce: Add memcpy_mcsafe()")
Reported-by: Erwin Tsaur <erwin.tsaur@intel.com>
Reported-by: 0day robot <lkp@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Tony Luck <tony.luck@intel.com>
Tested-by: Erwin Tsaur <erwin.tsaur@intel.com>
Cc: <stable@vger.kernel.org>
Link: https://lkml.kernel.org/r/160195562556.2163339.18063423034951948973.stgit@dwillia2-desk3.amr.corp.intel.com
2020-10-06 11:37:36 +02:00