Commit Graph

6863 Commits

Author SHA1 Message Date
Stephen Smalley
59ffc9beeb selinux: fix sel_read_bool() allocation and error handling
Switch sel_read_bool() from using get_zeroed_page() and free_page()
to a stack-allocated buffer. This also fixes a memory leak in the
error path when security_get_bool_value() returns an error.

Reported-by: Matthew Wilcox <willy@infradead.org>
Signed-off-by: Stephen Smalley <stephen.smalley.work@gmail.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2025-09-03 17:34:32 -04:00
Simon Schuster
edd3cb05c0 copy_process: pass clone_flags as u64 across calltree
With the introduction of clone3 in commit 7f192e3cd3 ("fork: add
clone3") the effective bit width of clone_flags on all architectures was
increased from 32-bit to 64-bit, with a new type of u64 for the flags.
However, for most consumers of clone_flags the interface was not
changed from the previous type of unsigned long.

While this works fine as long as none of the new 64-bit flag bits
(CLONE_CLEAR_SIGHAND and CLONE_INTO_CGROUP) are evaluated, this is still
undesirable in terms of the principle of least surprise.

Thus, this commit fixes all relevant interfaces of callees to
sys_clone3/copy_process (excluding the architecture-specific
copy_thread) to consistently pass clone_flags as u64, so that
no truncation to 32-bit integers occurs on 32-bit architectures.

Signed-off-by: Simon Schuster <schuster.simon@siemens-energy.com>
Link: https://lore.kernel.org/20250901-nios2-implement-clone3-v2-2-53fcf5577d57@siemens-energy.com
Acked-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Reviewed-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-09-01 15:31:34 +02:00
Josef Bacik
37b27bd5d6
fs: add an icount_read helper
Instead of doing direct access to ->i_count, add a helper to handle
this. This will make it easier to convert i_count to a refcount later.

Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Link: https://lore.kernel.org/9bc62a84c6b9d6337781203f60837bd98fbc4a96.1756222464.git.josef@toxicpanda.com
Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-09-01 12:41:09 +02:00
Casey Schaufler
0ffbc876d0 audit: add record for multiple object contexts
Create a new audit record AUDIT_MAC_OBJ_CONTEXTS.
An example of the MAC_OBJ_CONTEXTS record is:

    type=MAC_OBJ_CONTEXTS
      msg=audit(1601152467.009:1050):
      obj_selinux=unconfined_u:object_r:user_home_t:s0

When an audit event includes a AUDIT_MAC_OBJ_CONTEXTS record
the "obj=" field in other records in the event will be "obj=?".
An AUDIT_MAC_OBJ_CONTEXTS record is supplied when the system has
multiple security modules that may make access decisions based
on an object security context.

Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
[PM: subj tweak, audit example readability indents]
Signed-off-by: Paul Moore <paul@paul-moore.com>
2025-08-30 10:15:30 -04:00
Casey Schaufler
eb59d494ee audit: add record for multiple task security contexts
Replace the single skb pointer in an audit_buffer with a list of
skb pointers. Add the audit_stamp information to the audit_buffer as
there's no guarantee that there will be an audit_context containing
the stamp associated with the event. At audit_log_end() time create
auxiliary records as have been added to the list. Functions are
created to manage the skb list in the audit_buffer.

Create a new audit record AUDIT_MAC_TASK_CONTEXTS.
An example of the MAC_TASK_CONTEXTS record is:

    type=MAC_TASK_CONTEXTS
      msg=audit(1600880931.832:113)
      subj_apparmor=unconfined
      subj_smack=_

When an audit event includes a AUDIT_MAC_TASK_CONTEXTS record the
"subj=" field in other records in the event will be "subj=?".
An AUDIT_MAC_TASK_CONTEXTS record is supplied when the system has
multiple security modules that may make access decisions based on a
subject security context.

Refactor audit_log_task_context(), creating a new audit_log_subj_ctx().
This is used in netlabel auditing to provide multiple subject security
contexts as necessary.

Suggested-by: Paul Moore <paul@paul-moore.com>
Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
[PM: subj tweak, audit example readability indents]
Signed-off-by: Paul Moore <paul@paul-moore.com>
2025-08-30 10:15:30 -04:00
Casey Schaufler
a59076f266 lsm: security_lsmblob_to_secctx module selection
Add a parameter lsmid to security_lsmblob_to_secctx() to identify which
of the security modules that may be active should provide the security
context. If the value of lsmid is LSM_ID_UNDEF the first LSM providing
a hook is used. security_secid_to_secctx() is unchanged, and will
always report the first LSM providing a hook.

Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
[PM: subj tweak]
Signed-off-by: Paul Moore <paul@paul-moore.com>
2025-08-30 10:15:29 -04:00
Qianfeng Rong
e73f759d2e security: use umax() to improve code
Use umax() to reduce the code in update_mmap_min_addr() and improve its
readability.

Signed-off-by: Qianfeng Rong <rongqianfeng@vivo.com>
[PM: subj line tweak]
Signed-off-by: Paul Moore <paul@paul-moore.com>
2025-08-18 15:41:47 -04:00
Qianfeng Rong
f20e70a341 selinux: Remove redundant __GFP_NOWARN
Commit 16f5dfbc85 ("gfp: include __GFP_NOWARN in GFP_NOWAIT")
made GFP_NOWAIT implicitly include __GFP_NOWARN.

Therefore, explicit __GFP_NOWARN combined with GFP_NOWAIT
(e.g., `GFP_NOWAIT | __GFP_NOWARN`) is now redundant. Let's clean
up these redundant flags across subsystems.

No functional changes.

Signed-off-by: Qianfeng Rong <rongqianfeng@vivo.com>
Acked-by: Stephen Smalley <stephen.smalley.work@gmail.com>
[PM: fixed horizontal spacing / alignment, line wraps]
Signed-off-by: Paul Moore <paul@paul-moore.com>
2025-08-12 13:18:13 -04:00
Blaise Boscaccy
5816bf4273 lsm,selinux: Add LSM blob support for BPF objects
This patch introduces LSM blob support for BPF maps, programs, and
tokens to enable LSM stacking and multiplexing of LSM modules that
govern BPF objects. Additionally, the existing BPF hooks used by
SELinux have been updated to utilize the new blob infrastructure,
removing the assumption of exclusive ownership of the security
pointer.

Signed-off-by: Blaise Boscaccy <bboscaccy@linux.microsoft.com>
[PM: dropped local variable init, style fixes]
Signed-off-by: Paul Moore <paul@paul-moore.com>
2025-08-11 17:56:09 -04:00
Paul Moore
e5bc887413 lsm: use lsm_blob_alloc() in lsm_bdev_alloc()
Convert the lsm_bdev_alloc() function to use the lsm_blob_alloc() helper
like all of the other LSM security blob allocators.

Reviewed-by: Casey Schaufler <casey@schaufler-ca.com>
Acked-by: Serge Hallyn <serge@hallyn.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2025-08-11 17:56:08 -04:00
Tianjia Zhang
d4e8dc8e8b selinux: use a consistent method to get full socket from skb
In order to maintain code consistency and readability,
skb_to_full_sk() is used to get full socket from skb.

Signed-off-by: Tianjia Zhang <tianjia.zhang@linux.alibaba.com>
Acked-by: Stephen Smalley <stephen.smalley.work@gmail.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2025-08-11 11:59:11 -04:00
Yue Haibing
5f9383bd41 selinux: Remove unused function selinux_policycap_netif_wildcard()
This is unused since commit a3d3043ef2 ("selinux: get netif_wildcard
policycap from policy instead of cache").

Signed-off-by: Yue Haibing <yuehaibing@huawei.com>
Acked-by: Stephen Smalley <stephen.smalley.work@gmail.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2025-08-11 11:59:11 -04:00
Linus Torvalds
8b45c6c90a + Features
- improve debug printing
   - carry mediation check on label (optimization)
   - improve ability for compiler to optimize __begin_current_label_crit_section
   - transition for a linked list of rulesets to a vector of rulesets
   - don't hardcode profile signal, allow it to be set by policy
   - ability to mediate caps via the state machine instead of lut
   - Add Ubuntu af_unix mediation, put it behind new v9 abi
 
 + Cleanups
   - fix typos and spelling errors
   - cleanup kernel doc and code inconsistencies
   - remove redundant checks/code
   - remove unused variables
   - Use str_yes_no() helper function
   - mark tables static where appropriate
   - make all generated string array headers const char *const
   - refactor to doc semantics of file_perm checks
   - replace macro calls to network/socket fns with explicit calls
   - refactor/cleanup socket mediation code preparing for finer grained
     mediation of different network families
   - several updates to kernel doc comments
 
 + Bug fixes
   - apparmor: Fix incorrect profile->signal range check
   - idmap mount fixes
   - policy unpack unaligned access fixes
   - kfree_sensitive() where appropriate
   - fix oops when freeing policy
   - fix conflicting attachment resolution
   - fix exec table look-ups when stacking isn't first
   - fix exec auditing
   - mitigate userspace generating overly large xtables
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEE7cSDD705q2rFEEf7BS82cBjVw9gFAmiQmakACgkQBS82cBjV
 w9jvKBAAn/HblSPo112ycW/yjeonkyiCY56LvyeU1YWQ8m370xPqM3yK2SHcj2i1
 we1mx5beDxbH5xn7c6w0EtyoHP7FNhyHp7neG8/WaJ1JG4uxv9HvrDmEJeQEJn/3
 5AP1q2dZF9NwnKBfB5zjwZXBJzncWtYBoLUjYMbehWlQjufT2yElyM8YZZN8ziLE
 M5ILVX6UMGpBH/zuX5kN2idLcubnv5MvLo2IEt+/nGLPbed44w+mZTM5WOTbzPNq
 w8axyNdhGt9kcSGwWuM+48T4oLfwagoxIZ3RXSQ4eExk6I8ZaFXua8nknC9wENN4
 9vkzDSWAupQ+o1bLKVNMVkqvBIIqmvEWvwket/hiyxs3Y5PDckRqOgQ/4Wbmgp9B
 KhLXxzIrF9PXkZ/rpMzloxnvDtMwoSScDShhW4TCRCmpDo/GwPwoPIpgbnc3kTq0
 poomca9KZ7YEnX/90Bh92Duo5OBDOHYlbWVE7EWX01htcxExQJt47JK48C25cY5p
 /cVDVepoz7EnKjB7mm9k6K1gYGvDeu3W1whRZNEK74AQJ7p+CrBoU+WjeMmZqP5V
 s47cLF17hbnw4ZvfsxQDkPgSOP1kuJIVlwFV2lPQk5hDcT6V0kZtqUzczKJSqeJb
 CGOkKvM7ao/7Cn8pmDNG1ZuPl/HuJ6wjlxt7SVt4/3rzLFzwglo=
 =Djjn
 -----END PGP SIGNATURE-----

Merge tag 'apparmor-pr-2025-08-04' of git://git.kernel.org/pub/scm/linux/kernel/git/jj/linux-apparmor

Pull apparmor updates from John Johansen:
 "This has one major feature, it pulls in a cleaned up version of
  af_unix mediation that Ubuntu has been carrying for years. It is
  placed behind a new abi to ensure that it does cause policy
  regressions. With pulling in the af_unix mediation there have been
  cleanups and some refactoring of network socket mediation. This
  accounts for the majority of the changes in the diff.

  In addition there are a few improvements providing minor code
  optimizations. several code cleanups, and bug fixes.

  Features:
   - improve debug printing
   - carry mediation check on label (optimization)
   - improve ability for compiler to optimize
     __begin_current_label_crit_section
   - transition for a linked list of rulesets to a vector of rulesets
   - don't hardcode profile signal, allow it to be set by policy
   - ability to mediate caps via the state machine instead of lut
   - Add Ubuntu af_unix mediation, put it behind new v9 abi

  Cleanups:
   - fix typos and spelling errors
   - cleanup kernel doc and code inconsistencies
   - remove redundant checks/code
   - remove unused variables
   - Use str_yes_no() helper function
   - mark tables static where appropriate
   - make all generated string array headers const char *const
   - refactor to doc semantics of file_perm checks
   - replace macro calls to network/socket fns with explicit calls
   - refactor/cleanup socket mediation code preparing for finer grained
     mediation of different network families
   - several updates to kernel doc comments

  Bug fixes:
   - fix incorrect profile->signal range check
   - idmap mount fixes
   - policy unpack unaligned access fixes
   - kfree_sensitive() where appropriate
   - fix oops when freeing policy
   - fix conflicting attachment resolution
   - fix exec table look-ups when stacking isn't first
   - fix exec auditing
   - mitigate userspace generating overly large xtables"

* tag 'apparmor-pr-2025-08-04' of git://git.kernel.org/pub/scm/linux/kernel/git/jj/linux-apparmor: (60 commits)
  apparmor: fix: oops when trying to free null ruleset
  apparmor: fix Regression on linux-next (next-20250721)
  apparmor: fix test error: WARNING in apparmor_unix_stream_connect
  apparmor: Remove the unused variable rules
  apparmor: fix: accept2 being specifie even when permission table is presnt
  apparmor: transition from a list of rules to a vector of rules
  apparmor: fix documentation mismatches in val_mask_to_str and socket functions
  apparmor: remove redundant perms.allow MAY_EXEC bitflag set
  apparmor: fix kernel doc warnings for kernel test robot
  apparmor: Fix unaligned memory accesses in KUnit test
  apparmor: Fix 8-byte alignment for initial dfa blob streams
  apparmor: shift uid when mediating af_unix in userns
  apparmor: shift ouid when mediating hard links in userns
  apparmor: make sure unix socket labeling is correctly updated.
  apparmor: fix regression in fs based unix sockets when using old abi
  apparmor: fix AA_DEBUG_LABEL()
  apparmor: fix af_unix auditing to include all address information
  apparmor: Remove use of the double lock
  apparmor: update kernel doc comments for xxx_label_crit_section
  apparmor: make __begin_current_label_crit_section() indicate whether put is needed
  ...
2025-08-04 08:17:28 -07:00
John Johansen
5f49c2d1f4 apparmor: fix: oops when trying to free null ruleset
profile allocation is wrongly setting the number of entries on the
rules vector before any ruleset is assigned. If profile allocation
fails between ruleset allocation and assigning the first ruleset,
free_ruleset() will be called with a null pointer resulting in an
oops.

[  107.350226] kernel BUG at mm/slub.c:545!
[  107.350912] Oops: invalid opcode: 0000 [#1] PREEMPT SMP NOPTI
[  107.351447] CPU: 1 UID: 0 PID: 27 Comm: ksoftirqd/1 Not tainted 6.14.6-hwe-rlee287-dev+ #5
[  107.353279] Hardware name:[   107.350218] -QE-----------[ cutMU here ]--------- Ub---
[  107.3502untu26] kernel BUG a 24t mm/slub.c:545.!04 P
[  107.350912]C ( Oops: invalid oi4pcode: 0000 [#1]40 PREEMPT SMP NOPFXTI
 + PIIX, 1996), BIOS 1.16.3-debian-1.16.3-2 04/01/2014
[  107.356054] RIP: 0010:__slab_free+0x152/0x340
[  107.356444] Code: 00 4c 89 ff e8 0f ac df 00 48 8b 14 24 48 8b 4c 24 20 48 89 44 24 08 48 8b 03 48 c1 e8 09 83 e0 01 88 44 24 13 e9 71 ff ff ff <0f> 0b 41 f7 44 24 08 87 04 00 00 75 b2 eb a8 41 f7 44 24 08 87 04
[  107.357856] RSP: 0018:ffffad4a800fbbb0 EFLAGS: 00010246
[  107.358937] RAX: ffff97ebc2a88e70 RBX: ffffd759400aa200 RCX: 0000000000800074
[  107.359976] RDX: ffff97ebc2a88e60 RSI: ffffd759400aa200 RDI: ffffad4a800fbc20
[  107.360600] RBP: ffffad4a800fbc50 R08: 0000000000000001 R09: ffffffff86f02cf2
[  107.361254] R10: 0000000000000000 R11: 0000000000000000 R12: ffff97ecc0049400
[  107.361934] R13: ffff97ebc2a88e60 R14: ffff97ecc0049400 R15: 0000000000000000
[  107.362597] FS:  0000000000000000(0000) GS:ffff97ecfb200000(0000) knlGS:0000000000000000
[  107.363332] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  107.363784] CR2: 000061c9545ac000 CR3: 0000000047aa6000 CR4: 0000000000750ef0
[  107.364331] PKRU: 55555554
[  107.364545] Call Trace:
[  107.364761]  <TASK>
[  107.364931]  ? local_clock+0x15/0x30
[  107.365219]  ? srso_alias_return_thunk+0x5/0xfbef5
[  107.365593]  ? kfree_sensitive+0x32/0x70
[  107.365900]  kfree+0x29d/0x3a0
[  107.366144]  ? srso_alias_return_thunk+0x5/0xfbef5
[  107.366510]  ? local_clock_noinstr+0xe/0xd0
[  107.366841]  ? srso_alias_return_thunk+0x5/0xfbef5
[  107.367209]  kfree_sensitive+0x32/0x70
[  107.367502]  aa_free_profile.part.0+0xa2/0x400
[  107.367850]  ? rcu_do_batch+0x1e6/0x5e0
[  107.368148]  aa_free_profile+0x23/0x60
[  107.368438]  label_free_switch+0x4c/0x80
[  107.368751]  label_free_rcu+0x1c/0x50
[  107.369038]  rcu_do_batch+0x1e8/0x5e0
[  107.369324]  ? rcu_do_batch+0x157/0x5e0
[  107.369626]  rcu_core+0x1b0/0x2f0
[  107.369888]  rcu_core_si+0xe/0x20
[  107.370156]  handle_softirqs+0x9b/0x3d0
[  107.370460]  ? smpboot_thread_fn+0x26/0x210
[  107.370790]  run_ksoftirqd+0x3a/0x70
[  107.371070]  smpboot_thread_fn+0xf9/0x210
[  107.371383]  ? __pfx_smpboot_thread_fn+0x10/0x10
[  107.371746]  kthread+0x10d/0x280
[  107.372010]  ? __pfx_kthread+0x10/0x10
[  107.372310]  ret_from_fork+0x44/0x70
[  107.372655]  ? __pfx_kthread+0x10/0x10
[  107.372974]  ret_from_fork_asm+0x1a/0x30
[  107.373316]  </TASK>
[  107.373505] Modules linked in: af_packet_diag mptcp_diag tcp_diag udp_diag raw_diag inet_diag snd_seq_dummy snd_hrtimer snd_seq_midi snd_seq_midi_event snd_rawmidi snd_seq snd_seq_device snd_timer snd soundcore qrtr binfmt_misc intel_rapl_msr intel_rapl_common kvm_amd ccp kvm irqbypass polyval_clmulni polyval_generic ghash_clmulni_intel sha256_ssse3 sha1_ssse3 aesni_intel crypto_simd cryptd i2c_piix4 i2c_smbus input_leds joydev sch_fq_codel msr parport_pc ppdev lp parport efi_pstore nfnetlink vsock_loopback vmw_vsock_virtio_transport_common vmw_vsock_vmci_transport vsock vmw_vmci dmi_sysfs qemu_fw_cfg ip_tables x_tables autofs4 hid_generic usbhid hid psmouse serio_raw floppy bochs pata_acpi
[  107.379086] ---[ end trace 0000000000000000 ]---

Don't set the count until a ruleset is actually allocated and
guard against free_ruleset() being called with a null pointer.

Reported-by: Ryan Lee <ryan.lee@canonical.com>
Fixes: 217af7e2f4 ("apparmor: refactor profile rules and attachments")
Signed-off-by: John Johansen <john.johansen@canonical.com>
2025-08-04 01:14:56 -07:00
Linus Torvalds
02523d2d93 integrity-v6.17
-----BEGIN PGP SIGNATURE-----
 
 iIoEABYKADIWIQQdXVVFGN5XqKr1Hj7LwZzRsCrn5QUCaItL7xQcem9oYXJAbGlu
 dXguaWJtLmNvbQAKCRDLwZzRsCrn5TduAQDu7W14clgQiJNwYo2hN5cEnfKZVkRI
 6PJGyxV+g+cMOQEA4Aepo2EL86kQJH33iAUmzi0bvyQl4cPTuKqpw5CgjQg=
 =eGra
 -----END PGP SIGNATURE-----

Merge tag 'integrity-v6.17' of git://git.kernel.org/pub/scm/linux/kernel/git/zohar/linux-integrity

Pull integrity update from Mimi Zohar:
 "A single commit to permit disabling IMA from the boot command line for
  just the kdump kernel.

  The exception itself sort of makes sense. My concern is that
  exceptions do not remain as exceptions, but somehow morph to become
  the norm"

* tag 'integrity-v6.17' of git://git.kernel.org/pub/scm/linux/kernel/git/zohar/linux-integrity:
  ima: add a knob ima= to allow disabling IMA in kdump kernel
2025-07-31 11:42:11 -07:00
Linus Torvalds
12ed593ee8 Capabilities update for 6.17
This branch contains two patches:
 
   cdd73b1666 uapi: fix broken link in linux/capability.h
 
 This updates documentation in capability.h.
 
   337490f000 exec: Correct the permission check for unsafe exec
 
 This is not a trivial patch, but fixes a real problem where during
 exec, different effective and real credentials were assumed to mean
 changed credentials, making it impossible in the no-new-privs case
 to keep different uid and euid.
 
 These are available at:
 
    git://git.kernel.org/pub/scm/linux/kernel/git/sergeh/linux.git #caps-pr-20250729
 
 on top of commit 19272b37aa (tag: v6.16-rc1)
 
 -----BEGIN PGP SIGNATURE-----
 
 iQEzBAABCgAdFiEEqb0/8XByttt4D8+UNXDaFycKziQFAmiJHCIACgkQNXDaFycK
 ziRz0QgAqDCdWBrwS/icfXnkV+fcB8J/p/JP+5DO4stn9vV97j+RC3WLc/Wlj5tl
 jLbGGd8eWnCYyTaUHlhE81Qw4b5iQKkd6umGSIBoQVu00YiJ3pLaYsIkEaEMDIkG
 2qS5oNyDEFho8g05++mmTfGMeFQJ4mUEwUmcnsCcTH3bkvcVinlZwP6Lfitmt0lH
 yS0xeMf+gBINou1yTPLdMCCv8LAac2qBcMWhWk4NBEFZJoM0S/Rg5RhdiYYNYYAj
 fi96cTG7qbv80a3M6e/7ta4hU+XdU6JuzQRQJeZstFj2oRfTx2CRR1Wgbus4KdlP
 x/5Sc3bT/ovdPwDc/5HTeFXAlqIqmQ==
 =uKJr
 -----END PGP SIGNATURE-----
 
 commit cdd73b1666
 Author: Ariel Otilibili <ariel.otilibili-anieli@eurecom.fr>
 Date:   Wed Jul 2 12:00:21 2025 +0200
 
     uapi: fix broken link in linux/capability.h
 
     The link to the libcap library is outdated. Instead, use a link to the
     libcap2 library.
 
     As well, give the complete reference of the POSIX compliance.
 
     Signed-off-by: Ariel Otilibili <ariel.otilibili-anieli@eurecom.fr>
     Acked-by: Andrew G. Morgan <morgan@kernel.org>
     Reviewed-by: Paul Moore <paul@paul-moore.com>
     Signed-off-by: Serge Hallyn <sergeh@kernel.org>
 
 diff --git a/include/uapi/linux/capability.h b/include/uapi/linux/capability.h
 index 2e21b5594f81..ea5a0899ecf0 100644
 --- a/include/uapi/linux/capability.h
 +++ b/include/uapi/linux/capability.h
 @@ -6,9 +6,10 @@
   * Alexander Kjeldaas <astor@guardian.no>
   * with help from Aleph1, Roland Buresund and Andrew Main.
   *
 - * See here for the libcap library ("POSIX draft" compliance):
 + * See here for the libcap2 library (compliant with Section 25 of
 + * the withdrawn POSIX 1003.1e Draft 17):
   *
 - * ftp://www.kernel.org/pub/linux/libs/security/linux-privs/kernel-2.6/
 + * https://www.kernel.org/pub/linux/libs/security/linux-privs/libcap2/
   */
 
  #ifndef _UAPI_LINUX_CAPABILITY_H
 -----BEGIN PGP SIGNATURE-----
 
 iQEzBAABCgAdFiEEqb0/8XByttt4D8+UNXDaFycKziQFAmiJHhoACgkQNXDaFycK
 ziQXVQgAn7931CkRAKZXcukfluPWFsanPGoBKqT+UFVjjUenDtd/GXZQwahRwweS
 k7zDDa/JE59yMHKNenslLgV1PYb3/ZtBMDq+8/LXQ/JMkaZYpNhPPo0lUa70s3pM
 EdY4qbB+aNWeAoHTltRkRkbDHWak3sNjlRyavQj2n7uCOXdSHp4yAVOgiMqxAUX8
 o/V6M6dHA2iN0HEKV+Mp5vHY+9TTFv8dTygNxb1995n1QfHT/AFOTOO598FHkYxA
 aSR8kSpdULdGExk+7Q2NsJoX38zsZIaaI75MQw9Tz5iRkYFZG3hcJVYNbZyT10s+
 NwteAaDfTqTz78QB3PcG794CQls8vQ==
 =yLTV
 -----END PGP SIGNATURE-----

Merge tag 'caps-pr-20250729' of git://git.kernel.org/pub/scm/linux/kernel/git/sergeh/linux

Pull capabilities update from Serge Hallyn:

 - Fix broken link in documentation in capability.h

 - Correct the permission check for unsafe exec

   During exec, different effective and real credentials were assumed to
   mean changed credentials, making it impossible in the no-new-privs
   case to keep different uid and euid

* tag 'caps-pr-20250729' of git://git.kernel.org/pub/scm/linux/kernel/git/sergeh/linux:
  uapi: fix broken link in linux/capability.h
  exec: Correct the permission check for unsafe exec
2025-07-31 11:19:54 -07:00
Linus Torvalds
b4efd62564 ipe/stable-6.17 PR 20250728
-----BEGIN PGP SIGNATURE-----
 
 iIcEABYIAC8WIQQzmBmZPBN6m/hUJmnyomI6a/yO7QUCaIgqhBEcd3VmYW5Aa2Vy
 bmVsLm9yZwAKCRDyomI6a/yO7RS6AQDikpH4iYfC5PNOcPRvYrl85SvZdDVdJoyD
 0r+DyNddqQEA5iWbIo18rz7usj62uqZd5yFXLmUNfgX+/SvpLLDeXQ4=
 =B34A
 -----END PGP SIGNATURE-----

Merge tag 'ipe-pr-20250728' of git://git.kernel.org/pub/scm/linux/kernel/git/wufan/ipe

Pull ipe update from Fan Wu:
 "A single commit from Eric Biggers to simplify the IPE (Integrity
  Policy Enforcement) policy audit with the SHA-256 library API"

* tag 'ipe-pr-20250728' of git://git.kernel.org/pub/scm/linux/kernel/git/wufan/ipe:
  ipe: use SHA-256 library API instead of crypto_shash API
2025-07-31 09:42:20 -07:00
John Johansen
43584e9932 apparmor: fix Regression on linux-next (next-20250721)
sk lock initialization was incorrectly removed, from
apparmor_file_alloc_security() while testing changes to changes to
apparmor_sk_alloc_security()

resulting in the following regression.

[   48.056654] INFO: trying to register non-static key.
[   48.057480] The code is fine but needs lockdep annotation, or maybe
[   48.058416] you didn't initialize this object before use?
[   48.059209] turning off the locking correctness validator.
[   48.060040] CPU: 0 UID: 0 PID: 648 Comm: chronyd Not tainted 6.16.0-rc7-test-next-20250721-11410-g1ee809985e11-dirty #577 NONE
[   48.060049] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-debian-1.16.3-2 04/01/2014
[   48.060055] Call Trace:
[   48.060059]  <TASK>
[   48.060063] dump_stack_lvl (lib/dump_stack.c:122)
[   48.060075] register_lock_class (kernel/locking/lockdep.c:988 kernel/locking/lockdep.c:1302)
[   48.060084] ? path_name (security/apparmor/file.c:159)
[   48.060093] __lock_acquire (kernel/locking/lockdep.c:5116)
[   48.060103] lock_acquire (kernel/locking/lockdep.c:473 (discriminator 4) kernel/locking/lockdep.c:5873 (discriminator 4) kernel/locking/lockdep.c:5828 (discriminator 4))
[   48.060109] ? update_file_ctx (security/apparmor/file.c:464)
[   48.060115] ? __pfx_profile_path_perm (security/apparmor/file.c:247)
[   48.060121] _raw_spin_lock (include/linux/spinlock_api_smp.h:134 kernel/locking/spinlock.c:154)
[   48.060130] ? update_file_ctx (security/apparmor/file.c:464)
[   48.060134] update_file_ctx (security/apparmor/file.c:464)
[   48.060140] aa_file_perm (security/apparmor/file.c:532 (discriminator 1) security/apparmor/file.c:642 (discriminator 1))
[   48.060147] ? __pfx_aa_file_perm (security/apparmor/file.c:607)
[   48.060152] ? do_mmap (mm/mmap.c:558)
[   48.060160] ? __pfx_userfaultfd_unmap_complete (fs/userfaultfd.c:841)
[   48.060170] ? __lock_acquire (kernel/locking/lockdep.c:4677 (discriminator 1) kernel/locking/lockdep.c:5194 (discriminator 1))
[   48.060176] ? common_file_perm (security/apparmor/lsm.c:535 (discriminator 1))
[   48.060185] security_mmap_file (security/security.c:3012 (discriminator 2))
[   48.060192] vm_mmap_pgoff (mm/util.c:574 (discriminator 1))
[   48.060200] ? find_held_lock (kernel/locking/lockdep.c:5353 (discriminator 1))
[   48.060206] ? __pfx_vm_mmap_pgoff (mm/util.c:568)
[   48.060212] ? lock_release (kernel/locking/lockdep.c:5539 kernel/locking/lockdep.c:5892 kernel/locking/lockdep.c:5878)
[   48.060219] ? __fget_files (arch/x86/include/asm/preempt.h:85 (discriminator 13) include/linux/rcupdate.h:100 (discriminator 13) include/linux/rcupdate.h:873 (discriminator 13) fs/file.c:1072 (discriminator 13))
[   48.060229] ksys_mmap_pgoff (mm/mmap.c:604)
[   48.060239] do_syscall_64 (arch/x86/entry/syscall_64.c:63 (discriminator 1) arch/x86/entry/syscall_64.c:94 (discriminator 1))
[   48.060248] entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:130)
[   48.060254] RIP: 0033:0x7fb6920e30a2
[ 48.060265] Code: 08 00 04 00 00 eb e2 90 41 f7 c1 ff 0f 00 00 75 27 55 89 cd 53 48 89 fb 48 85 ff 74 33 41 89 ea 48 89 df b8 09 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 5e 5b 5d c3 0f 1f 00 c7 05 e6 41 01 00 16 00
All code
========
   0:	08 00                	or     %al,(%rax)
   2:	04 00                	add    $0x0,%al
   4:	00 eb                	add    %ch,%bl
   6:	e2 90                	loop   0xffffffffffffff98
   8:	41 f7 c1 ff 0f 00 00 	test   $0xfff,%r9d
   f:	75 27                	jne    0x38
  11:	55                   	push   %rbp
  12:	89 cd                	mov    %ecx,%ebp
  14:	53                   	push   %rbx
  15:	48 89 fb             	mov    %rdi,%rbx
  18:	48 85 ff             	test   %rdi,%rdi
  1b:	74 33                	je     0x50
  1d:	41 89 ea             	mov    %ebp,%r10d
  20:	48 89 df             	mov    %rbx,%rdi
  23:	b8 09 00 00 00       	mov    $0x9,%eax
  28:	0f 05                	syscall
  2a:*	48 3d 00 f0 ff ff    	cmp    $0xfffffffffffff000,%rax		<-- trapping instruction
  30:	77 5e                	ja     0x90
  32:	5b                   	pop    %rbx
  33:	5d                   	pop    %rbp
  34:	c3                   	ret
  35:	0f 1f 00             	nopl   (%rax)
  38:	c7                   	.byte 0xc7
  39:	05 e6 41 01 00       	add    $0x141e6,%eax
  3e:	16                   	(bad)
	...

Code starting with the faulting instruction
===========================================
   0:	48 3d 00 f0 ff ff    	cmp    $0xfffffffffffff000,%rax
   6:	77 5e                	ja     0x66
   8:	5b                   	pop    %rbx
   9:	5d                   	pop    %rbp
   a:	c3                   	ret
   b:	0f 1f 00             	nopl   (%rax)
   e:	c7                   	.byte 0xc7
   f:	05 e6 41 01 00       	add    $0x141e6,%eax
  14:	16                   	(bad)
	...
[   48.060270] RSP: 002b:00007ffd2c0d3528 EFLAGS: 00000206 ORIG_RAX: 0000000000000009
[   48.060279] RAX: ffffffffffffffda RBX: 00007fb691fc8000 RCX: 00007fb6920e30a2
[   48.060283] RDX: 0000000000000005 RSI: 000000000007d000 RDI: 00007fb691fc8000
[   48.060287] RBP: 0000000000000812 R08: 0000000000000003 R09: 0000000000011000
[   48.060290] R10: 0000000000000812 R11: 0000000000000206 R12: 00007ffd2c0d3578
[   48.060293] R13: 00007fb6920b6160 R14: 00007ffd2c0d39f0 R15: 00000fffa581a6a8

Fixes: 88fec3526e ("apparmor: make sure unix socket labeling is correctly updated.")
Signed-off-by: John Johansen <john.johansen@canonical.com>
2025-07-30 05:01:38 -07:00
John Johansen
f3c0675bb9 apparmor: fix test error: WARNING in apparmor_unix_stream_connect
commit 88fec3526e ("apparmor: make sure unix socket labeling is correctly updated.")
added the use of security_sk_alloc() which ensures the sk label is
initialized.

This means that the AA_BUG in apparmor_unix_stream_connect() is no
longer correct, because while the sk is still not being initialized
by going through post_create, it is now initialize in sk_alloc().
Remove the now invalid check.

Reported-by: syzbot+cd38ee04bcb3866b0c6d@syzkaller.appspotmail.com
Fixes: 88fec3526e ("apparmor: make sure unix socket labeling is correctly updated.")
Signed-off-by: John Johansen <john.johansen@canonical.com>
2025-07-30 05:00:47 -07:00
Jiapeng Chong
8936125e23 apparmor: Remove the unused variable rules
Variable rules is not effectively used, so delete it.

security/apparmor/lsm.c:182:23: warning: variable ‘rules’ set but not used.

Reported-by: Abaci Robot <abaci@linux.alibaba.com>
Closes: https://bugzilla.openanolis.cn/show_bug.cgi?id=22942
Signed-off-by: Jiapeng Chong <jiapeng.chong@linux.alibaba.com>
Signed-off-by: John Johansen <john.johansen@canonical.com>
2025-07-30 04:57:56 -07:00
Linus Torvalds
5f5c9952b3 powerpc updates for 6.17
- CONFIG_HZ changes to move the base_slice from 10ms to 1ms
 
  - Patchset to move some of the mutex handling to lock guard
 
  - Expose secvars relevant to the key management mode
 
  - Misc cleanups and fixes
 
 Thanks to: Ankit Chauhan, Christophe Leroy, Donet Tom, Gautam Menghani, Haren
 Myneni, Johan Korsnes, Madadi Vineeth Reddy, Paul Mackerras, Shrikanth Hegde,
 Srish Srinivasan, Thomas Fourier, Thomas Huth, Thomas Weißschuh, Souradeep,
 Amit Machhiwal, R Nageswara Sastry, Venkat Rao Bagalkote, Andrew Donnellan,
 Greg Kroah-Hartman, Mimi Zohar, Mukesh Kumar Chaurasiya, Nayna Jain, Ritesh
 Harjani (IBM), Sourabh Jain, Srikar Dronamraju, Stefan Berger, Tyrel Datwyler,
 Kowshik Jois
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEqX2DNAOgU8sBX3pRpnEsdPSHZJQFAmiIMYgACgkQpnEsdPSH
 ZJRwBhAAibjlhQrzxxGsk4AcLHDIlEqI/KRRIiYjM5gr1p9Ms+o3l8epgcwy4aQO
 ZcYhYIyyj4Q6ctR29fPdjb0simjOdoQm1gf5atXv7o6kaY/ooIrc4V8YWTXsxwC8
 KriPJZnrcvae7muYwzP9oVB6S4rets4Ch66FpXgR/pplxa4LwLmefgmnIB4zWXUi
 7u96ZxJ+F/kNGn0X/OtsHpgr9Hmuly6NDk3LH7Smcw0TiL/gTOTZknqm3bNuqw+I
 IJszbYbnaYslnZfNluJW4aluAv7VuUoYc5JVPn94Msw5rrio5UPcRHF5lBdqyKyE
 j++y9uuCi6oFfWw48B3o+7iUzeD6LZUqfWDb85nIc95mCKP0kX66WjuHHQlSmR/h
 WIA0nu9r+FpYOauu2i5fs1A90DnTC+7wdBCFQNYDnCbxxyPuJxBph2YN3ir4heDH
 VAxRfNw8g51HL5Sc3I8J7gP9+pubuJyS4wKCebghKSU4Uhdw7cWPC3dWACDi4b6t
 JWTV5+SrANyCy4B2bRj7p2g36xFFdFRXsoNV1RJtboWkcFRihLFDH9vN7mdB6kYi
 IMm5TQnEUw3UgwddCg2DuSXVg0fMe9R3Ph5VZcAooqYaMzLUWi0lt02Q0QVb4iES
 adAwIPn2mb/v9EeisL7nJmtYI2UZhfwanu19zrI0lrlH/G8jaKM=
 =i2VP
 -----END PGP SIGNATURE-----

Merge tag 'powerpc-6.17-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux

Pull powerpc updates from Madhavan Srinivasan:

 - CONFIG_HZ changes to move the base_slice from 10ms to 1ms

 - Patchset to move some of the mutex handling to lock guard

 - Expose secvars relevant to the key management mode

 - Misc cleanups and fixes

Thanks to Ankit Chauhan, Christophe Leroy, Donet Tom, Gautam Menghani,
Haren Myneni, Johan Korsnes, Madadi Vineeth Reddy, Paul Mackerras,
Shrikanth Hegde, Srish Srinivasan, Thomas Fourier, Thomas Huth, Thomas
Weißschuh, Souradeep, Amit Machhiwal, R Nageswara Sastry, Venkat Rao
Bagalkote, Andrew Donnellan, Greg Kroah-Hartman, Mimi Zohar, Mukesh
Kumar Chaurasiya, Nayna Jain, Ritesh Harjani (IBM), Sourabh Jain, Srikar
Dronamraju, Stefan Berger, Tyrel Datwyler, and Kowshik Jois.

* tag 'powerpc-6.17-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: (23 commits)
  arch/powerpc: Remove .interp section in vmlinux
  powerpc: Drop GPL boilerplate text with obsolete FSF address
  powerpc: Don't use %pK through printk
  arch: powerpc: defconfig: Drop obsolete CONFIG_NET_CLS_TCINDEX
  misc: ocxl: Replace scnprintf() with sysfs_emit() in sysfs show functions
  integrity/platform_certs: Allow loading of keys in the static key management mode
  powerpc/secvar: Expose secvars relevant to the key management mode
  powerpc/pseries: Correct secvar format representation for static key management
  (powerpc/512) Fix possible `dma_unmap_single()` on uninitialized pointer
  powerpc: floppy: Add missing checks after DMA map
  book3s64/radix : Optimize vmemmap start alignment
  book3s64/radix : Handle error conditions properly in radix_vmemmap_populate
  powerpc/pseries/dlpar: Search DRC index from ibm,drc-indexes for IO add
  KVM: PPC: Book3S HV: Add H_VIRT mapping for tracing exits
  powerpc: sysdev: use lock guard for mutex
  powerpc: powernv: ocxl: use lock guard for mutex
  powerpc: book3s: vas: use lock guard for mutex
  powerpc: fadump: use lock guard for mutex
  powerpc: rtas: use lock guard for mutex
  powerpc: eeh: use lock guard for mutex
  ...
2025-07-29 20:28:38 -07:00
Linus Torvalds
ae388edd4a Landlock update for v6.17-rc1
-----BEGIN PGP SIGNATURE-----
 
 iIYEABYKAC4WIQSVyBthFV4iTW/VU1/l49DojIL20gUCaINQ2hAcbWljQGRpZ2lr
 b2QubmV0AAoJEOXj0OiMgvbS8DcA/RvnXD7NcvINU94pkY6w+SxtdhsIe/w7EcGF
 LJdwxrKQAP0WpMNTfKdzfe6ub7qE+AkZYhagKAuMCloS1qE35nZgDg==
 =WKRS
 -----END PGP SIGNATURE-----

Merge tag 'landlock-6.17-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/mic/linux

Pull landlock update from Mickaël Salaün:
 "Fix test issues, improve build compatibility, and add new tests"

* tag 'landlock-6.17-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/mic/linux:
  landlock: Fix cosmetic change
  samples/landlock: Fix building on musl libc
  landlock: Fix warning from KUnit tests
  selftests/landlock: Add test to check rule tied to covered mount point
  selftests/landlock: Fix build of audit_test
  selftests/landlock: Fix readlink check
2025-07-28 19:21:32 -07:00
Eric Biggers
b90bb6dbf1 ipe: use SHA-256 library API instead of crypto_shash API
audit_policy() does not support any other algorithm, so the crypto_shash
abstraction provides no value.  Just use the SHA-256 library API
instead, which is much simpler and easier to use.

Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Fan Wu <wufan@kernel.org>
2025-07-28 18:54:18 -07:00
Linus Torvalds
dffb641bea selinux/stable-6.17 PR 20250725
-----BEGIN PGP SIGNATURE-----
 
 iQJIBAABCgAyFiEES0KozwfymdVUl37v6iDy2pc3iXMFAmiD08EUHHBhdWxAcGF1
 bC1tb29yZS5jb20ACgkQ6iDy2pc3iXOFzRAAwTbLKcFggDs0bG6AfqT5KCQHRtRW
 z/0+kvDZeKzJvmJ4XvNMdW3orC2iIYgGCYJgj8umEXh1BOGIOSd4edpGaUKvJB9Z
 S4py0I3udGjlchInmDBZqW2qGaN0EAl/s0AWBbf6QLYmMV1r9+nb5lH71pVBbEQv
 4Vltg61KkQPc6PuDpk7F+BD7yYKSCqxxWyy7zEJsFMntg4jCyoI2PfRhrHM5xuCE
 RRqctblhCc4nNqzhPvse4Vt9PD+cVQx0Pi7arxvSD8M7mVrFTsUV17OoSKlCcLGh
 1LlFk7heoDNC/UQy7xFg5hsy8GeI6gYYY/Fomu7Ue3SYJL2fqqr7xsbJyCCNg8kC
 S17gRPBxqftzl/2SE76V8y5toIRMtT7b+6aXudtukfFbxtBq09XNflq1G3iR9fTo
 o6f9MGpyWW3t/e8wk2yPYfXNXrLe3yJoxV6lIsDaHejOwGi098ArcahKKuxNzRit
 AikIoqn8GEO+j3/X/4YS1xbdS/HkG7N3t+Rbr2GNO4XsmU6iqZRBkmz5UlCcL8u1
 OTQNjzqqSy/DKAfrPAyqN7wiBVXt5siBPwAXIcPbh9JfAI/kRN7LUxAQeGnUps+o
 kJn2zCV3G4kD2qtCVCQ7VifIYGCMxa1PTYsiwu5S4Wgm1Ducvtsx4r3TVKiCuObC
 bwv/1k9kpkvLy/E=
 =M55j
 -----END PGP SIGNATURE-----

Merge tag 'selinux-pr-20250725' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux

Pull selinux updates from Paul Moore:

 - Introduce the concept of a SELinux "neveraudit" type which prevents
   all auditing of the given type/domain.

   Taken by itself, the benefit of marking a SELinux domain with the
   "neveraudit" tag is likely not very interesting, especially given the
   significant overlap with the "dontaudit" tag.

   However, given that the "neveraudit" tag applies to *all* auditing of
   the tagged domain, we can do some fairly interesting optimizations
   when a SELinux domain is marked as both "permissive" and "dontaudit"
   (think of the unconfined_t domain).

   While this pull request includes optimized inode permission and
   getattr hooks, these optimizations require SELinux policy changes,
   therefore the improvements may not be visible on standard downstream
   Linux distos for a period of time.

 - Continue the deprecation process of /sys/fs/selinux/user.

   After removing the associated userspace code in 2020, we marked the
   /sys/fs/selinux/user interface as deprecated in Linux v6.13 with
   pr_warn() and the usual documention update.

   This adds a five second sleep after the pr_warn(), following a
   previous deprecation process pattern that has worked well for us in
   the past in helping identify any existing users that we haven't yet
   reached.

 - Add a __GFP_NOWARN flag to our initial hash table allocation.

   Fuzzers such a syzbot often attempt abnormally large SELinux policy
   loads, which the SELinux code gracefully handles by checking for
   allocation failures, but not before the allocator emits a warning
   which causes the automated fuzzing to flag this as an error and
   report it to the list. While we want to continue to support the work
   done by the fuzzing teams, we want to focus on proper issues and not
   an error case that is already handled safely. Add a NOWARN flag to
   quiet the allocator and prevent syzbot from tripping on this again.

 - Remove some unnecessary selinuxfs cleanup code, courtesy of Al.

 - Update the SELinux in-kernel documentation with pointers to
   additional information.

* tag 'selinux-pr-20250725' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux:
  selinux: don't bother with selinuxfs_info_free() on failures
  selinux: add __GFP_NOWARN to hashtab_init() allocations
  selinux: optimize selinux_inode_getattr/permission() based on neveraudit|permissive
  selinux: introduce neveraudit types
  documentation: add links to SELinux resources
  selinux: add a 5 second sleep to /sys/fs/selinux/user
2025-07-28 18:25:57 -07:00
Linus Torvalds
30b9dcae98 lsm/stable-6.17 PR 20250725
-----BEGIN PGP SIGNATURE-----
 
 iQJIBAABCgAyFiEES0KozwfymdVUl37v6iDy2pc3iXMFAmiD1VgUHHBhdWxAcGF1
 bC1tb29yZS5jb20ACgkQ6iDy2pc3iXPBEQ/+NOqcHL1p/SRq5Y4gDraovaI3Eyy4
 /gLQbsWn/16B4UGYzOaF3iKRUgs00blCMrIU6hXPkm7kuklrfhuiT+fgWq91JHBB
 ihtrQl77KCCL6pJWBNyrBhvpcOG3WrUvAurOPOz5pWB8dELiloMenbHI70HIy9wb
 uy6N7aOBMfg1NlVWbo4weghhc3Wwtpc0+iDsII8aW4PNmO5ngYJ/IIseh453vzPg
 zFNQFyc10NFVLo6UNao3FRVTvclrdexEvIoNgWvA5WqNoD948qbY2IuQjuRhlqB9
 ZtMeiBTcmJFLO83RqM+d8MbQK5W2dA8NRRJ0H7+DC6lCHGjaqS+Qhy+19DVI5K7j
 LhgmAGWTCT2uW3/iLL4t9srgJ+EFpYi2uU1G9UjtqvIqwiEKTe6iM4vSgupvnw1k
 YGm89ilbshbYUA1+AtaPkfna7ayFU9UY2EJIU+j2xhtsexKuUuGVugadD2G8wW3Q
 jrIwFYK/gVbGK704jIjbJlqALsx+Rd9xSQv0hLHVbL9OJrBSVQs3ZbaezMhm1qiO
 x6xmqLco+6EdOjhpGh7NplJj/jENI6aauOvofiqu/rV5Qz1u+EaDYhwW2HGuL9f9
 9fF0XfkeunNC0GbIYUvq2RVYkHtCqFAdjP8FXkXz30MCSWQJiyJokDCCIIZBqwEI
 qAbxKKJIrj+Q8n0=
 =5isD
 -----END PGP SIGNATURE-----

Merge tag 'lsm-pr-20250725' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/lsm

Pull lsm updates from Paul Moore:

 - Add Nicolas Bouchinet and Xiu Jianfeng as Lockdown maintainers

   The Lockdown LSM has been without a dedicated mantainer since its
   original acceptance upstream, and it has suffered as a result.
   Thankfully we have two new volunteers who together I believe have the
   background and desire to help ensure Lockdown is properly supported.

 - Remove the unused cap_mmap_file() declaration

* tag 'lsm-pr-20250725' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/lsm:
  MAINTAINERS: Add Xiu and myself as Lockdown maintainers
  security: Remove unused declaration cap_mmap_file()
  lsm: trivial comment fix
2025-07-28 18:20:32 -07:00
Linus Torvalds
4b65b859f5 Crypto library conversions for 6.17
Convert fsverity and apparmor to use the SHA-2 library functions
 instead of crypto_shash. This is simpler and also slightly faster.
 -----BEGIN PGP SIGNATURE-----
 
 iIoEABYIADIWIQSacvsUNc7UX4ntmEPzXCl4vpKOKwUCaIZ+MhQcZWJpZ2dlcnNA
 a2VybmVsLm9yZwAKCRDzXCl4vpKOK6LlAP42XrHo9hMqCnVs7uCtDVOk2kBUB7ml
 m7rkVqQLiEbgAAEAiFvQYPjiaUedTi+bH45+qP/O/LxY7y5laIby7QKa1wc=
 =X4Xs
 -----END PGP SIGNATURE-----

Merge tag 'libcrypto-conversions-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiggers/linux

Pull crypto library conversions from Eric Biggers:
 "Convert fsverity and apparmor to use the SHA-2 library functions
  instead of crypto_shash. This is simpler and also slightly faster"

* tag 'libcrypto-conversions-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiggers/linux:
  fsverity: Switch from crypto_shash to SHA-2 library
  fsverity: Explicitly include <linux/export.h>
  apparmor: use SHA-256 library API instead of crypto_shash API
2025-07-28 18:05:46 -07:00
Linus Torvalds
8e736a2eea hardening updates for v6.17-rc1
- Introduce and start using TRAILING_OVERLAP() helper for fixing
   embedded flex array instances (Gustavo A. R. Silva)
 
 - mux: Convert mux_control_ops to a flex array member in mux_chip
   (Thorsten Blum)
 
 - string: Group str_has_prefix() and strstarts() (Andy Shevchenko)
 
 - Remove KCOV instrumentation from __init and __head (Ritesh Harjani,
   Kees Cook)
 
 - Refactor and rename stackleak feature to support Clang
 
 - Add KUnit test for seq_buf API
 
 - Fix KUnit fortify test under LTO
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQRSPkdeREjth1dHnSE2KwveOeQkuwUCaIfUkgAKCRA2KwveOeQk
 uypLAP92r6f47sWcOw/5B9aVffX6Bypsb7dqBJQpCNxI5U1xcAEAiCrZ98UJyOeQ
 JQgnXd4N67K4EsS2JDc+FutRn3Yi+A8=
 =+5Bq
 -----END PGP SIGNATURE-----

Merge tag 'hardening-v6.17-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux

Pull hardening updates from Kees Cook:

 - Introduce and start using TRAILING_OVERLAP() helper for fixing
   embedded flex array instances (Gustavo A. R. Silva)

 - mux: Convert mux_control_ops to a flex array member in mux_chip
   (Thorsten Blum)

 - string: Group str_has_prefix() and strstarts() (Andy Shevchenko)

 - Remove KCOV instrumentation from __init and __head (Ritesh Harjani,
   Kees Cook)

 - Refactor and rename stackleak feature to support Clang

 - Add KUnit test for seq_buf API

 - Fix KUnit fortify test under LTO

* tag 'hardening-v6.17-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux: (22 commits)
  sched/task_stack: Add missing const qualifier to end_of_stack()
  kstack_erase: Support Clang stack depth tracking
  kstack_erase: Add -mgeneral-regs-only to silence Clang warnings
  init.h: Disable sanitizer coverage for __init and __head
  kstack_erase: Disable kstack_erase for all of arm compressed boot code
  x86: Handle KCOV __init vs inline mismatches
  arm64: Handle KCOV __init vs inline mismatches
  s390: Handle KCOV __init vs inline mismatches
  arm: Handle KCOV __init vs inline mismatches
  mips: Handle KCOV __init vs inline mismatch
  powerpc/mm/book3s64: Move kfence and debug_pagealloc related calls to __init section
  configs/hardening: Enable CONFIG_INIT_ON_FREE_DEFAULT_ON
  configs/hardening: Enable CONFIG_KSTACK_ERASE
  stackleak: Split KSTACK_ERASE_CFLAGS from GCC_PLUGINS_CFLAGS
  stackleak: Rename stackleak_track_stack to __sanitizer_cov_stack_depth
  stackleak: Rename STACKLEAK to KSTACK_ERASE
  seq_buf: Introduce KUnit tests
  string: Group str_has_prefix() and strstarts()
  kunit/fortify: Add back "volatile" for sizeof() constants
  acpi: nfit: intel: avoid multiple -Wflex-array-member-not-at-end warnings
  ...
2025-07-28 17:16:12 -07:00
Linus Torvalds
57fcb7d930 vfs-6.17-rc1.fileattr
-----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQRAhzRXHqcMeLMyaSiRxhvAZXjcogUCaINCpgAKCRCRxhvAZXjc
 oqfFAQDcy3rROUF3W34KcSi7rDmaKVSX53d1tUoqH+1zDRpSlwEAriKDNC1ybudp
 YAnxVzkRHjHs1296WIuwKq5lfhJ60Q4=
 =geAl
 -----END PGP SIGNATURE-----

Merge tag 'vfs-6.17-rc1.fileattr' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs

Pull fileattr updates from Christian Brauner:
 "This introduces the new file_getattr() and file_setattr() system calls
  after lengthy discussions.

  Both system calls serve as successors and extensible companions to
  the FS_IOC_FSGETXATTR and FS_IOC_FSSETXATTR system calls which have
  started to show their age in addition to being named in a way that
  makes it easy to conflate them with extended attribute related
  operations.

  These syscalls allow userspace to set filesystem inode attributes on
  special files. One of the usage examples is the XFS quota projects.

  XFS has project quotas which could be attached to a directory. All new
  inodes in these directories inherit project ID set on parent
  directory.

  The project is created from userspace by opening and calling
  FS_IOC_FSSETXATTR on each inode. This is not possible for special
  files such as FIFO, SOCK, BLK etc. Therefore, some inodes are left
  with empty project ID. Those inodes then are not shown in the quota
  accounting but still exist in the directory. This is not critical but
  in the case when special files are created in the directory with
  already existing project quota, these new inodes inherit extended
  attributes. This creates a mix of special files with and without
  attributes. Moreover, special files with attributes don't have a
  possibility to become clear or change the attributes. This, in turn,
  prevents userspace from re-creating quota project on these existing
  files.

  In addition, these new system calls allow the implementation of
  additional attributes that we couldn't or didn't want to fit into the
  legacy ioctls anymore"

* tag 'vfs-6.17-rc1.fileattr' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs:
  fs: tighten a sanity check in file_attr_to_fileattr()
  tree-wide: s/struct fileattr/struct file_kattr/g
  fs: introduce file_getattr and file_setattr syscalls
  fs: prepare for extending file_get/setattr()
  fs: make vfs_fileattr_[get|set] return -EOPNOTSUPP
  selinux: implement inode_file_[g|s]etattr hooks
  lsm: introduce new hooks for setting/getting inode fsxattr
  fs: split fileattr related helpers into separate file
2025-07-28 15:24:14 -07:00
Linus Torvalds
2d9c1336ed VFS-related cleanups in various places (mostly of the "that really can't
happen" or "there's a better way to do it" variety)
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYIAB0WIQQqUNBr3gm4hGXdBJlZ7Krx/gZQ6wUCaIRK0QAKCRBZ7Krx/gZQ
 66/LAPoCvj5nAZH41F1VfyinA6V96kKsAazjrG7ttpWenu+6GAD/e9YQIAtYro0Z
 6f6EWTgrrEZqpOgc9kfHJq60m/TnSg8=
 =Ojq4
 -----END PGP SIGNATURE-----

Merge tag 'pull-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs

Pull misc VFS updates from Al Viro:
 "VFS-related cleanups in various places (mostly of the "that really
  can't happen" or "there's a better way to do it" variety)"

* tag 'pull-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
  gpib: use file_inode()
  binder_ioctl_write_read(): simplify control flow a bit
  secretmem: move setting O_LARGEFILE and bumping users' count to the place where we create the file
  apparmor: file never has NULL f_path.mnt
  landlock: opened file never has a negative dentry
2025-07-28 10:32:20 -07:00
Linus Torvalds
8297b790c6 securityfs cleanups and fixes:
* one extra reference is enough to pin a dentry down; no need
 for two.  Switch to regular scheme, similar to shmem, debugfs,
 etc. - that fixes securityfs_recursive_remove() dentry leak,
 among other things.
 
 * we need to have the filesystem pinned to prevent the contents
 disappearing; what we do not need is pinning it for each file.
 Doing that only for files and directories in the root is enough.
 
 * the previous two changes allow to get rid of the racy kludges
 in efi_secret_unlink(), where we can use simple_unlink() instead
 of securityfs_remove().  Which does not require unlocking and
 relocking the parent, with all deadlocks that invites.
 
 * Make securityfs_remove() take the entire subtree out, turning
 securityfs_recursive_remove() into its alias.  Makes a lot more
 sense for callers and fixes a mount leak, while we are at it.
 
 * Making securityfs_remove() remove the entire subtree allows for
 much simpler life in most of the users - efi_secret, ima_fs,
 evm, ipe, tmp get cleaner.  I hadn't touched apparmor use of
 securityfs, but I suspect that it would be useful there as well.
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYIAB0WIQQqUNBr3gm4hGXdBJlZ7Krx/gZQ6wUCaIRJkgAKCRBZ7Krx/gZQ
 67PmAQCCmJ8Czxb0+4P2J8bJFDELvrT3ff0Ns2d/1m77cATdBAEArOxw5iNXfpfU
 0WhjMvQFsgob6jtijG1MAWV7Npz4MwE=
 =wraS
 -----END PGP SIGNATURE-----

Merge tag 'pull-securityfs' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs

Pull securityfs updates from Al Viro:
 "Securityfs cleanups and fixes:

   - one extra reference is enough to pin a dentry down; no need for
     two. Switch to regular scheme, similar to shmem, debugfs, etc. This
     fixes a securityfs_recursive_remove() dentry leak, among other
     things.

   - we need to have the filesystem pinned to prevent the contents
     disappearing; what we do not need is pinning it for each file.
     Doing that only for files and directories in the root is enough.

   - the previous two changes allow us to get rid of the racy kludges in
     efi_secret_unlink(), where we can use simple_unlink() instead of
     securityfs_remove(). Which does not require unlocking and relocking
     the parent, with all deadlocks that invites.

   - Make securityfs_remove() take the entire subtree out, turning
     securityfs_recursive_remove() into its alias. Makes a lot more
     sense for callers and fixes a mount leak, while we are at it.

   - Making securityfs_remove() remove the entire subtree allows for
     much simpler life in most of the users - efi_secret, ima_fs, evm,
     ipe, tmp get cleaner. I hadn't touched apparmor use of securityfs,
     but I suspect that it would be useful there as well"

* tag 'pull-securityfs' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
  tpm: don't bother with removal of files in directory we'll be removing
  ipe: don't bother with removal of files in directory we'll be removing
  evm_secfs: clear securityfs interactions
  ima_fs: get rid of lookup-by-dentry stuff
  ima_fs: don't bother with removal of files in directory we'll be removing
  efi_secret: clean securityfs use up
  make securityfs_remove() remove the entire subtree
  fix locking in efi_secret_unlink()
  securityfs: pin filesystem only for objects directly in root
  securityfs: don't pin dentries twice, once is enough...
2025-07-28 10:07:54 -07:00
Kees Cook
a8f0b1f8ef kstack_erase: Support Clang stack depth tracking
Wire up CONFIG_KSTACK_ERASE to Clang 21's new stack depth tracking
callback[1] option.

Link: https://clang.llvm.org/docs/SanitizerCoverage.html#tracing-stack-depth [1]
Acked-by: Nicolas Schier <n.schier@avm.de>
Link: https://lore.kernel.org/r/20250724055029.3623499-4-kees@kernel.org
Signed-off-by: Kees Cook <kees@kernel.org>
2025-07-26 14:28:35 -07:00
Kees Cook
9ea1e8d28a stackleak: Rename stackleak_track_stack to __sanitizer_cov_stack_depth
The Clang stack depth tracking implementation has a fixed name for
the stack depth tracking callback, "__sanitizer_cov_stack_depth", so
rename the GCC plugin function to match since the plugin has no external
dependencies on naming.

Link: https://lore.kernel.org/r/20250717232519.2984886-2-kees@kernel.org
Signed-off-by: Kees Cook <kees@kernel.org>
2025-07-21 21:40:39 -07:00
Kees Cook
57fbad15c2 stackleak: Rename STACKLEAK to KSTACK_ERASE
In preparation for adding Clang sanitizer coverage stack depth tracking
that can support stack depth callbacks:

- Add the new top-level CONFIG_KSTACK_ERASE option which will be
  implemented either with the stackleak GCC plugin, or with the Clang
  stack depth callback support.
- Rename CONFIG_GCC_PLUGIN_STACKLEAK as needed to CONFIG_KSTACK_ERASE,
  but keep it for anything specific to the GCC plugin itself.
- Rename all exposed "STACKLEAK" names and files to "KSTACK_ERASE" (named
  for what it does rather than what it protects against), but leave as
  many of the internals alone as possible to avoid even more churn.

While here, also split "prev_lowest_stack" into CONFIG_KSTACK_ERASE_METRICS,
since that's the only place it is referenced from.

Suggested-by: Ingo Molnar <mingo@kernel.org>
Link: https://lore.kernel.org/r/20250717232519.2984886-1-kees@kernel.org
Signed-off-by: Kees Cook <kees@kernel.org>
2025-07-21 21:35:01 -07:00
John Johansen
4d9d1a08b7 apparmor: fix: accept2 being specifie even when permission table is presnt
The transition to the perms32 permission table dropped the need for
the accept2 table as permissions. However accept2 can be used for
flags and may be present even when the perms32 table is present. So
instead of checking on version, check whether the table is present.

Fixes: 2e12c5f060 ("apparmor: add additional flags to extended permission.")
Signed-off-by: John Johansen <john.johansen@canonical.com>
2025-07-20 02:31:13 -07:00
John Johansen
9afdc6abb0 apparmor: transition from a list of rules to a vector of rules
The set of rules on a profile is not dynamically extended, instead
if a new ruleset is needed a new version of the profile is created.
This allows us to use a vector of rules instead of a list, slightly
reducing memory usage and simplifying the code.

Signed-off-by: John Johansen <john.johansen@canonical.com>
2025-07-20 02:31:06 -07:00
Peng Jiang
f9c9dce01e apparmor: fix documentation mismatches in val_mask_to_str and socket functions
This patch fixes kernel-doc warnings:
1. val_mask_to_str:
- Added missing descriptions for `size` and `table` parameters.
- Removed outdated str_size and chrs references.
2. Socket Functions:
- Makes non-null requirements clear for socket/address args.
- Standardizes return values per kernel conventions.
- Adds Unix domain socket protocol details.

These changes silence doc validation warnings and improve accuracy for
AppArmor LSM docs.

Signed-off-by: Peng Jiang <jiang.peng9@zte.com.cn>
Signed-off-by: John Johansen <john.johansen@canonical.com>
2025-07-20 02:19:28 -07:00
Ryan Lee
4ce7d3cf5a apparmor: remove redundant perms.allow MAY_EXEC bitflag set
This section of profile_transition that occurs after x_to_label only
happens if perms.allow already has the MAY_EXEC bit set, so we don't need
to set it again.

Fixes: 16916b17b4 ("apparmor: force auditing of conflicting attachment execs from confined")
Signed-off-by: Ryan Lee <ryan.lee@canonical.com>
Signed-off-by: John Johansen <john.johansen@canonical.com>
2025-07-20 02:19:28 -07:00
John Johansen
da0edababa apparmor: fix kernel doc warnings for kernel test robot
Fix kernel doc warnings for the functions
- apparmor_socket_bind
- apparmor_unix_may_send
- apparmor_unix_stream_connect
- val_mask_to_str

Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202506070127.B1bc3da4-lkp@intel.com/
Signed-off-by: John Johansen <john.johansen@canonical.com>
2025-07-20 02:19:27 -07:00
Helge Deller
c68804199d apparmor: Fix unaligned memory accesses in KUnit test
The testcase triggers some unnecessary unaligned memory accesses on the
parisc architecture:
  Kernel: unaligned access to 0x12f28e27 in policy_unpack_test_init+0x180/0x374 (iir 0x0cdc1280)
  Kernel: unaligned access to 0x12f28e67 in policy_unpack_test_init+0x270/0x374 (iir 0x64dc00ce)

Use the existing helper functions put_unaligned_le32() and
put_unaligned_le16() to avoid such warnings on architectures which
prefer aligned memory accesses.

Signed-off-by: Helge Deller <deller@gmx.de>
Fixes: 98c0cc48e2 ("apparmor: fix policy_unpack_test on big endian systems")
Signed-off-by: John Johansen <john.johansen@canonical.com>
2025-07-20 02:19:27 -07:00
Helge Deller
c567de2c4f apparmor: Fix 8-byte alignment for initial dfa blob streams
The dfa blob stream for the aa_dfa_unpack() function is expected to be aligned
on a 8 byte boundary.

The static nulldfa_src[] and stacksplitdfa_src[] arrays store the initial
apparmor dfa blob streams, but since they are declared as an array-of-chars
the compiler and linker will only ensure a "char" (1-byte) alignment.

Add an __aligned(8) annotation to the arrays to tell the linker to always
align them on a 8-byte boundary. This avoids runtime warnings at startup on
alignment-sensitive platforms like parisc such as:

 Kernel: unaligned access to 0x7f2a584a in aa_dfa_unpack+0x124/0x788 (iir 0xca0109f)
 Kernel: unaligned access to 0x7f2a584e in aa_dfa_unpack+0x210/0x788 (iir 0xca8109c)
 Kernel: unaligned access to 0x7f2a586a in aa_dfa_unpack+0x278/0x788 (iir 0xcb01090)

Signed-off-by: Helge Deller <deller@gmx.de>
Cc: stable@vger.kernel.org
Fixes: 98b824ff89 ("apparmor: refcount the pdb")
Signed-off-by: John Johansen <john.johansen@canonical.com>
2025-07-20 02:19:27 -07:00
Gabriel Totev
3fa0af4cc8 apparmor: shift uid when mediating af_unix in userns
Avoid unshifted ouids for socket file operations as observed when using
AppArmor profiles in unprivileged containers with LXD or Incus.

For example, root inside container and uid 1000000 outside, with
`owner /root/sock rw,` profile entry for nc:

/root$ nc -lkU sock & nc -U sock
==> dmesg
apparmor="DENIED" operation="connect" class="file"
namespace="root//lxd-podia_<var-snap-lxd-common-lxd>" profile="sockit"
name="/root/sock" pid=3924 comm="nc" requested_mask="wr" denied_mask="wr"
fsuid=1000000 ouid=0 [<== should be 1000000]

Fix by performing uid mapping as per common_perm_cond() in lsm.c

Signed-off-by: Gabriel Totev <gabriel.totev@zetier.com>
Fixes: c05e705812 ("apparmor: add fine grained af_unix mediation")
Signed-off-by: John Johansen <john.johansen@canonical.com>
2025-07-20 02:19:27 -07:00
Gabriel Totev
c5bf96d20f apparmor: shift ouid when mediating hard links in userns
When using AppArmor profiles inside an unprivileged container,
the link operation observes an unshifted ouid.
(tested with LXD and Incus)

For example, root inside container and uid 1000000 outside, with
`owner /root/link l,` profile entry for ln:

/root$ touch chain && ln chain link
==> dmesg
apparmor="DENIED" operation="link" class="file"
namespace="root//lxd-feet_<var-snap-lxd-common-lxd>" profile="linkit"
name="/root/link" pid=1655 comm="ln" requested_mask="l" denied_mask="l"
fsuid=1000000 ouid=0 [<== should be 1000000] target="/root/chain"

Fix by mapping inode uid of old_dentry in aa_path_link() rather than
using it directly, similarly to how it's mapped in __file_path_perm()
later in the file.

Signed-off-by: Gabriel Totev <gabriel.totev@zetier.com>
Signed-off-by: John Johansen <john.johansen@canonical.com>
2025-07-20 02:19:27 -07:00
John Johansen
88fec3526e apparmor: make sure unix socket labeling is correctly updated.
When a unix socket is passed into a different confinement domain make
sure its cached mediation labeling is updated to correctly reflect
which domains are using the socket.

Fixes: c05e705812 ("apparmor: add fine grained af_unix mediation")
Signed-off-by: John Johansen <john.johansen@canonical.com>
2025-07-20 02:19:27 -07:00
Mickaël Salaün
6803b6ebb8
landlock: Fix cosmetic change
This line removal should not be there and it makes it more difficult to
backport the following patch.

Cc: Günther Noack <gnoack@google.com>
Cc: Konstantin Meskhidze <konstantin.meskhidze@huawei.com>
Fixes: 7a11275c37 ("landlock: Refactor layer helpers")
Link: https://lore.kernel.org/r/20250719104204.545188-2-mic@digikod.net
Signed-off-by: Mickaël Salaün <mic@digikod.net>
2025-07-19 12:44:16 +02:00
John Johansen
6456ccbd2f apparmor: fix regression in fs based unix sockets when using old abi
Policy loaded using abi 7 socket mediation was not being applied
correctly in all cases. In some cases with fs based unix sockets a
subset of permissions where allowed when they should have been denied.

This was happening because the check for if the socket was an fs based
unix socket came before the abi check. But the abi check is where the
correct path is selected, so having the fs unix socket check occur
early would cause the wrong code path to be used.

Fix this by pushing the fs unix to be done after the abi check.

Fixes: dcd7a55941 ("apparmor: gate make fine grained unix mediation behind v9 abi")
Signed-off-by: John Johansen <john.johansen@canonical.com>
2025-07-15 22:39:43 -07:00
John Johansen
50d56a1a36 apparmor: fix AA_DEBUG_LABEL()
AA_DEBUG_LABEL() was not specifying it vargs, which is needed so it can
output debug parameters.

Fixes: 71e6cff3e0 ("apparmor: Improve debug print infrastructure")
Signed-off-by: John Johansen <john.johansen@canonical.com>
2025-07-15 22:39:43 -07:00
John Johansen
a30a9fdb66 apparmor: fix af_unix auditing to include all address information
The auditing of addresses currently doesn't include the source address
and mixes source and foreign/peer under the same audit name. Fix this
so source is always addr, and the foreign/peer is peer_addr.

Fixes: c05e705812 ("apparmor: add fine grained af_unix mediation")
Signed-off-by: John Johansen <john.johansen@canonical.com>
2025-07-15 22:39:43 -07:00
John Johansen
bc6e5f6933 apparmor: Remove use of the double lock
The use of the double lock is not necessary and problematic. Instead
pull the bits that need locks into their own sections and grab the
needed references.

Fixes: c05e705812 ("apparmor: add fine grained af_unix mediation")
Signed-off-by: John Johansen <john.johansen@canonical.com>
2025-07-15 22:39:43 -07:00
John Johansen
6afb0a7bc9 apparmor: update kernel doc comments for xxx_label_crit_section
Add a kernel doc header for __end_current_label_crit_section(), and
update the header for __begin_current_label_crit_section().

Fixes: b42ecc5f58ef ("apparmor: make __begin_current_label_crit_section() indicate whether put is needed")
Signed-off-by: John Johansen <john.johansen@canonical.com>
2025-07-15 22:39:43 -07:00
Mateusz Guzik
87cc7b0011 apparmor: make __begin_current_label_crit_section() indicate whether put is needed
Same as aa_get_newest_cred_label_condref().

This avoids a bunch of work overall and allows the compiler to note when no
clean up is necessary, allowing for tail calls.

This in particular happens in apparmor_file_permission(), which manages to
tail call aa_file_perm() 105 bytes in (vs a regular call 112 bytes in
followed by branches to figure out if clean up is needed).

Signed-off-by: Mateusz Guzik <mjguzik@gmail.com>
Signed-off-by: John Johansen <john.johansen@canonical.com>
2025-07-15 22:39:43 -07:00
John Johansen
37a3741d27 Revert "apparmor: use SHA-256 library API instead of crypto_shash API"
This reverts commit e9ed1eb8f6.

Eric has requested that this patch be taken through the libcrypto-next
tree, instead.

Signed-off-by: John Johansen <john.johansen@canonical.com>
2025-07-15 22:39:22 -07:00
John Johansen
aff426f359 apparmor: mitigate parser generating large xtables
Some versions of the parser are generating an xtable transition per
state in the state machine, even when the state machine isn't using
the transition table.

The parser bug is triggered by
commit 2e12c5f060 ("apparmor: add additional flags to extended permission.")

In addition to fixing this in userspace, mitigate this in the kernel
as part of the policy verification checks by detecting this situation
and adjusting to what is actually used, or if not used at all freeing
it, so we are not wasting unneeded memory on policy.

Fixes: 2e12c5f060 ("apparmor: add additional flags to extended permission.")
Signed-off-by: John Johansen <john.johansen@canonical.com>
2025-07-15 22:39:07 -07:00
Eric Biggers
f93c27092a apparmor: use SHA-256 library API instead of crypto_shash API
This user of SHA-256 does not support any other algorithm, so the
crypto_shash abstraction provides no value.  Just use the SHA-256
library API instead, which is much simpler and easier to use.

Acked-by: John Johansen <john.johansen@canonical.com>
Link: https://lore.kernel.org/r/20250630174805.59010-1-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2025-07-14 11:29:31 -07:00
Srish Srinivasan
bde5b1a155 integrity/platform_certs: Allow loading of keys in the static key management mode
On PLPKS enabled PowerVM LPAR, there is no provision to load signed
third-party kernel modules when the key management mode is static. This
is because keys from secure boot secvars are only loaded when the key
management mode is dynamic.

Allow loading of the trustedcadb and moduledb keys even in the static
key management mode, where the secvar format string takes the form
"ibm,plpks-sb-v0".

Signed-off-by: Srish Srinivasan <ssrish@linux.ibm.com>
Tested-by: R Nageswara Sastry <rnsastry@linux.ibm.com>
Reviewed-by: Mimi Zohar <zohar@linux.ibm.com>
Reviewed-by: Stefan Berger <stefanb@linux.ibm.com>
Reviewed-by: Nayna Jain <nayna@linux.ibm.com>
Reviewed-by: Andrew Donnellan <ajd@linux.ibm.com>
Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com>
Link: https://patch.msgid.link/20250610211907.101384-4-ssrish@linux.ibm.com
2025-07-09 09:16:18 +05:30
Christian Brauner
ca115d7e75
tree-wide: s/struct fileattr/struct file_kattr/g
Now that we expose struct file_attr as our uapi struct rename all the
internal struct to struct file_kattr to clearly communicate that it is a
kernel internal struct. This is similar to struct mount_{k}attr and
others.

Link: https://lore.kernel.org/20250703-restlaufzeit-baurecht-9ed44552b481@brauner
Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-07-04 16:14:39 +02:00
Andrey Albershteyn
bd14e462bb
selinux: implement inode_file_[g|s]etattr hooks
These hooks are called on inode extended attribute retrieval/change.

Cc: selinux@vger.kernel.org
Cc: Paul Moore <paul@paul-moore.com>

Acked-by: Paul Moore <paul@paul-moore.com>
Signed-off-by: Andrey Albershteyn <aalbersh@kernel.org>
Link: https://lore.kernel.org/20250630-xattrat-syscall-v6-3-c4e3bc35227b@kernel.org
Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-07-01 22:44:29 +02:00
Andrey Albershteyn
defdd02d78
lsm: introduce new hooks for setting/getting inode fsxattr
Introduce new hooks for setting and getting filesystem extended
attributes on inode (FS_IOC_FSGETXATTR).

Cc: selinux@vger.kernel.org
Cc: Paul Moore <paul@paul-moore.com>

Acked-by: Paul Moore <paul@paul-moore.com>
Signed-off-by: Andrey Albershteyn <aalbersh@kernel.org>
Link: https://lore.kernel.org/20250630-xattrat-syscall-v6-2-c4e3bc35227b@kernel.org
Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-07-01 22:44:29 +02:00
Konstantin Andreev
6ddd169d02 smack: fix kernel-doc warnings for smk_import_valid_label()
Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202506251712.x5SJiNlh-lkp@intel.com/
Signed-off-by: Konstantin Andreev <andreev@swemel.ru>
Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
2025-06-30 13:02:49 -07:00
Tingmao Wang
e0a69cf2c0
landlock: Fix warning from KUnit tests
get_id_range() expects a positive value as first argument but
get_random_u8() can return 0.  Fix this by clamping it.

Validated by running the test in a for loop for 1000 times.

Note that MAX() is wrong as it is only supposed to be used for
constants, but max() is good here.

  [..]     ok 9 test_range2_rand1
  [..]     ok 10 test_range2_rand2
  [..]     ok 11 test_range2_rand15
  [..] ------------[ cut here ]------------
  [..] WARNING: CPU: 6 PID: 104 at security/landlock/id.c:99 test_range2_rand16 (security/landlock/id.c:99 (discriminator 1) security/landlock/id.c:234 (discriminator 1))
  [..] Modules linked in:
  [..] CPU: 6 UID: 0 PID: 104 Comm: kunit_try_catch Tainted: G                 N  6.16.0-rc1-dev-00001-g314a2f98b65f #1 PREEMPT(undef)
  [..] Tainted: [N]=TEST
  [..] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-debian-1.16.3-2 04/01/2014
  [..] RIP: 0010:test_range2_rand16 (security/landlock/id.c:99 (discriminator 1) security/landlock/id.c:234 (discriminator 1))
  [..] Code: 49 c7 c0 10 70 30 82 4c 89 ff 48 c7 c6 a0 63 1e 83 49 c7 45 a0 e0 63 1e 83 e8 3f 95 17 00 e9 1f ff ff ff 0f 0b e9 df fd ff ff <0f> 0b ba 01 00 00 00 e9 68 fe ff ff 49 89 45 a8 49 8d 4d a0 45 31

  [..] RSP: 0000:ffff888104eb7c78 EFLAGS: 00010246
  [..] RAX: 0000000000000000 RBX: 000000000870822c RCX: 0000000000000000
            ^^^^^^^^^^^^^^^^
  [..]
  [..] Call Trace:
  [..]
  [..] ---[ end trace 0000000000000000 ]---
  [..]     ok 12 test_range2_rand16
  [..] # landlock_id: pass:12 fail:0 skip:0 total:12
  [..] # Totals: pass:12 fail:0 skip:0 total:12
  [..] ok 1 landlock_id

Fixes: d9d2a68ed4 ("landlock: Add unique ID generator")
Signed-off-by: Tingmao Wang <m@maowtm.org>
Link: https://lore.kernel.org/r/73e28efc5b8cc394608b99d5bc2596ca917d7c4a.1750003733.git.m@maowtm.org
[mic: Minor cosmetic improvements]
Signed-off-by: Mickaël Salaün <mic@digikod.net>
2025-06-27 10:10:37 +02:00
Al Viro
ee79ba39b3 selinux: don't bother with selinuxfs_info_free() on failures
Failures in sel_fill_super() will be followed by sel_kill_sb(), which
will call selinuxfs_info_free() anyway.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Reviewed-by: Christian Brauner <brauner@kernel.org>
[PM: subj and description tweaks]
Signed-off-by: Paul Moore <paul@paul-moore.com>
2025-06-24 19:39:28 -04:00
Konstantin Andreev
674e2b2479 smack: fix bug: setting task label silently ignores input garbage
This command:
    # echo foo/bar >/proc/$$/attr/smack/current

gives the task a label 'foo' w/o indication
that label does not match input.
Setting the label with lsm_set_self_attr() syscall
behaves identically.

This occures because:

1) smk_parse_smack() is used to convert input to a label
2) smk_parse_smack() takes only that part from the
   beginning of the input that looks like a label.
3) `/' is prohibited in labels, so only "foo" is taken.

(2) is by design, because smk_parse_smack() is used
for parsing strings which are more than just a label.

Silent failure is not a good thing, and there are two
indicators that this was not done intentionally:

    (size >= SMK_LONGLABEL) ~> invalid

clause at the beginning of the do_setattr() and the
"Returns the length of the smack label" claim
in the do_setattr() description.

So I fixed this by adding one tiny check:
the taken label length == input length.

Since input length is now strictly controlled,
I changed the two ways of setting label

   smack_setselfattr(): lsm_set_self_attr() syscall
   smack_setprocattr(): > /proc/.../current

to accommodate the divergence in
what they understand by "input length":

  smack_setselfattr counts mandatory \0 into input length,
  smack_setprocattr does not.

  smack_setprocattr allows various trailers after label

Related changes:

* fixed description for smk_parse_smack

* allow unprivileged tasks validate label syntax.

* extract smk_parse_label_len() from smk_parse_smack()
  so parsing may be done w/o string allocation.

* extract smk_import_valid_label() from smk_import_entry()
  to avoid repeated parsing.

* smk_parse_smack(): scan null-terminated strings
  for no more than SMK_LONGLABEL(256) characters

* smack_setselfattr(): require struct lsm_ctx . flags == 0
  to reserve them for future.

Fixes: e114e47377 ("Smack: Simplified Mandatory Access Control Kernel")
Signed-off-by: Konstantin Andreev <andreev@swemel.ru>
Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
2025-06-24 16:30:24 -07:00
Konstantin Andreev
c147e13ea7 smack: fix bug: unprivileged task can create labels
If an unprivileged task is allowed to relabel itself
(/smack/relabel-self is not empty),
it can freely create new labels by writing their
names into own /proc/PID/attr/smack/current

This occurs because do_setattr() imports
the provided label in advance,
before checking "relabel-self" list.

This change ensures that the "relabel-self" list
is checked before importing the label.

Fixes: 38416e5393 ("Smack: limited capability for changing process label")
Signed-off-by: Konstantin Andreev <andreev@swemel.ru>
Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
2025-06-24 16:30:24 -07:00
Eric W. Biederman
337490f000
exec: Correct the permission check for unsafe exec
Max Kellerman recently experienced a problem[1] when calling exec with
differing uid and euid's and he triggered the logic that is supposed
to only handle setuid executables.

When exec isn't changing anything in struct cred it doesn't make sense
to go into the code that is there to handle the case when the
credentials change.

When looking into the history of the code I discovered that this issue
was not present in Linux-2.4.0-test12 and was introduced in
Linux-2.4.0-prerelease when the logic for handling this case was moved
from prepare_binprm to compute_creds in fs/exec.c.

The bug introdused was to comparing euid in the new credentials with
uid instead of euid in the old credentials, when testing if setuid
had changed the euid.

Since triggering the keep ptrace limping along case for setuid
executables makes no sense when it was not a setuid exec revert back
to the logic present in Linux-2.4.0-test12.

This removes the confusingly named and subtlety incorrect helpers
is_setuid and is_setgid, that helped this bug to persist.

The varaiable is_setid is renamed to id_changed (it's Linux-2.4.0-test12)
as the old name describes what matters rather than it's cause.

The code removed in Linux-2.4.0-prerelease was:
-       /* Set-uid? */
-       if (mode & S_ISUID) {
-               bprm->e_uid = inode->i_uid;
-               if (bprm->e_uid != current->euid)
-                       id_change = 1;
-       }
-
-       /* Set-gid? */
-       /*
-        * If setgid is set but no group execute bit then this
-        * is a candidate for mandatory locking, not a setgid
-        * executable.
-        */
-       if ((mode & (S_ISGID | S_IXGRP)) == (S_ISGID | S_IXGRP)) {
-               bprm->e_gid = inode->i_gid;
-               if (!in_group_p(bprm->e_gid))
-                       id_change = 1;

Linux-2.4.0-prerelease added the current logic as:
+       if (bprm->e_uid != current->uid || bprm->e_gid != current->gid ||
+           !cap_issubset(new_permitted, current->cap_permitted)) {
+                current->dumpable = 0;
+
+               lock_kernel();
+               if (must_not_trace_exec(current)
+                   || atomic_read(&current->fs->count) > 1
+                   || atomic_read(&current->files->count) > 1
+                   || atomic_read(&current->sig->count) > 1) {
+                       if(!capable(CAP_SETUID)) {
+                               bprm->e_uid = current->uid;
+                               bprm->e_gid = current->gid;
+                       }
+                       if(!capable(CAP_SETPCAP)) {
+                               new_permitted = cap_intersect(new_permitted,
+                                                       current->cap_permitted);
+                       }
+               }
+               do_unlock = 1;
+       }

I have condenced the logic from Linux-2.4.0-test12 to just:
	id_changed = !uid_eq(new->euid, old->euid) || !in_group_p(new->egid);

This change is userspace visible, but I don't expect anyone to care.

For the bug that is being fixed to trigger bprm->unsafe has to be set.
The variable bprm->unsafe is set when ptracing an executable, when
sharing a working directory, or when no_new_privs is set.  Properly
testing for cases that are safe even in those conditions and doing
nothing special should not affect anyone.  Especially if they were
previously ok with their credentials getting munged

To minimize behavioural changes the code continues to set secureexec
when euid != uid or when egid != gid.

[1] https://lkml.kernel.org/r/20250306082615.174777-1-max.kellermann@ionos.com

Reported-by: Max Kellermann <max.kellermann@ionos.com>
Fixes: 64444d3d0d7f ("Linux version 2.4.0-prerelease")
v1: https://lkml.kernel.org/r/878qmxsuy8.fsf@email.froward.int.ebiederm.org
Reviewed-by: Serge Hallyn <serge@hallyn.com>
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Reviewed-by: Jann Horn <jannh@google.com>
Acked-by: Kees Cook <kees@kernel.org>
2025-06-23 10:38:39 -05:00
Konstantin Andreev
78fc6a94be smack: fix bug: invalid label of unix socket file
According to [1], the label of a UNIX domain socket (UDS)
file (i.e., the filesystem object representing the socket)
is not supposed to participate in Smack security.

To achieve this, [1] labels UDS files with "*"
in smack_d_instantiate().

Before [2], smack_d_instantiate() was responsible
for initializing Smack security for all inodes,
except ones under /proc

[2] imposed the sole responsibility for initializing
inode security for newly created filesystem objects
on smack_inode_init_security().

However, smack_inode_init_security() lacks some logic
present in smack_d_instantiate().
In particular, it does not label UDS files with "*".

This patch adds the missing labeling of UDS files
with "*" to smack_inode_init_security().

Labeling UDS files with "*" in smack_d_instantiate()
still works for stale UDS files that already exist on
disk. Stale UDS files are useless, but I keep labeling
them for consistency and maybe to make easier for user
to delete them.

Compared to [1], this version introduces the following
improvements:

  * UDS file label is held inside inode only
    and not saved to xattrs.

  * relabeling UDS files (setxattr, removexattr, etc.)
    is blocked.

[1] 2010-11-24 Casey Schaufler
commit b4e0d5f079 ("Smack: UDS revision")

[2] 2023-11-16 roberto.sassu
Fixes: e63d86b8b7 ("smack: Initialize the in-memory inode in smack_inode_init_security()")
Link: https://lore.kernel.org/linux-security-module/20231116090125.187209-5-roberto.sassu@huaweicloud.com/

Signed-off-by: Konstantin Andreev <andreev@swemel.ru>
Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
2025-06-22 08:51:32 -07:00
Konstantin Andreev
69204f6cdb smack: always "instantiate" inode in smack_inode_init_security()
If memory allocation for the SMACK64TRANSMUTE
xattr value fails in smack_inode_init_security(),
the SMK_INODE_INSTANT flag is not set in
(struct inode_smack *issp)->smk_flags,
leaving the inode as not "instantiated".

It does not matter if fs frees the inode
after failed smack_inode_init_security() call,
but there is no guarantee for this.

To be safe, mark the inode as "instantiated",
even if allocation of xattr values fails.

Signed-off-by: Konstantin Andreev <andreev@swemel.ru>
Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
2025-06-22 08:51:32 -07:00
Konstantin Andreev
8e5d9f916a smack: deduplicate xattr setting in smack_inode_init_security()
Signed-off-by: Konstantin Andreev <andreev@swemel.ru>
Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
2025-06-22 08:51:32 -07:00
Konstantin Andreev
195da3ff24 smack: fix bug: SMACK64TRANSMUTE set on non-directory
When a new file system object is created
and the conditions for label transmutation are met,
the SMACK64TRANSMUTE extended attribute is set
on the object regardless of its type:
file, pipe, socket, symlink, or directory.

However,
SMACK64TRANSMUTE may only be set on directories.

This bug is a combined effect of the commits [1] and [2]
which both transfer functionality
from smack_d_instantiate() to smack_inode_init_security(),
but only in part.

Commit [1] set blank  SMACK64TRANSMUTE on improper object types.
Commit [2] set "TRUE" SMACK64TRANSMUTE on improper object types.

[1] 2023-06-10,
Fixes: baed456a6a ("smack: Set the SMACK64TRANSMUTE xattr in smack_inode_init_security()")
Link: https://lore.kernel.org/linux-security-module/20230610075738.3273764-3-roberto.sassu@huaweicloud.com/

[2] 2023-11-16,
Fixes: e63d86b8b7 ("smack: Initialize the in-memory inode in smack_inode_init_security()")
Link: https://lore.kernel.org/linux-security-module/20231116090125.187209-5-roberto.sassu@huaweicloud.com/

Signed-off-by: Konstantin Andreev <andreev@swemel.ru>
Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
2025-06-22 08:51:32 -07:00
Konstantin Andreev
635a01da83 smack: deduplicate "does access rule request transmutation"
Signed-off-by: Konstantin Andreev <andreev@swemel.ru>
Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
2025-06-22 08:51:32 -07:00
Paul Moore
9ab71d9204 selinux: add __GFP_NOWARN to hashtab_init() allocations
As reported by syzbot, hashtab_init() can be affected by abnormally
large policy loads which would cause the kernel's allocator to emit
a warning in some configurations.  Since the SELinux hashtab_init()
code handles the case where the allocation fails, due to a large
request or some other reason, we can safely add the __GFP_NOWARN flag
to squelch these abnormally large allocation warnings.

Reported-by: syzbot+bc2c99c2929c3d219fb3@syzkaller.appspotmail.com
Tested-by: syzbot+bc2c99c2929c3d219fb3@syzkaller.appspotmail.com
Signed-off-by: Paul Moore <paul@paul-moore.com>
2025-06-19 17:24:57 -04:00
Stephen Smalley
951b2de06a selinux: optimize selinux_inode_getattr/permission() based on neveraudit|permissive
Extend the task avdcache to also cache whether the task SID is both
permissive and neveraudit, and return immediately if so in both
selinux_inode_getattr() and selinux_inode_permission().

The same approach could be applied to many of the hook functions
although the avdcache would need to be updated for more than directory
search checks in order for this optimization to be beneficial for checks
on objects other than directories.

To test, apply https://github.com/SELinuxProject/selinux/pull/473 to
your selinux userspace, build and install libsepol, and use the following
CIL policy module:
$ cat neverauditpermissive.cil
(typeneveraudit unconfined_t)
(typepermissive unconfined_t)

Without this module inserted, running the following commands:
   perf record make -jN # on an already built allmodconfig tree
   perf report --sort=symbol,dso
yields the following percentages (only showing __d_lookup_rcu for
reference and only showing relevant SELinux functions):
   1.65%  [k] __d_lookup_rcu
   0.53%  [k] selinux_inode_permission
   0.40%  [k] selinux_inode_getattr
   0.15%  [k] avc_lookup
   0.05%  [k] avc_has_perm
   0.05%  [k] avc_has_perm_noaudit
   0.02%  [k] avc_policy_seqno
   0.02%  [k] selinux_file_permission
   0.01%  [k] selinux_inode_alloc_security
   0.01%  [k] selinux_file_alloc_security
for a total of 1.24% for SELinux compared to 1.65% for
__d_lookup_rcu().

After running the following command to insert this module:
   semodule -i neverauditpermissive.cil
and then re-running the same perf commands from above yields
the following non-zero percentages:
   1.74%  [k] __d_lookup_rcu
   0.31%  [k] selinux_inode_permission
   0.03%  [k] selinux_inode_getattr
   0.03%  [k] avc_policy_seqno
   0.01%  [k] avc_lookup
   0.01%  [k] selinux_file_permission
   0.01%  [k] selinux_file_open
for a total of 0.40% for SELinux compared to 1.74% for
__d_lookup_rcu().

Signed-off-by: Stephen Smalley <stephen.smalley.work@gmail.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2025-06-19 17:23:05 -04:00
Stephen Smalley
1106896146 selinux: introduce neveraudit types
Introduce neveraudit types i.e. types that should never trigger
audit messages. This allows the AVC to skip all audit-related
processing for such types. Note that neveraudit differs from
dontaudit not only wrt being applied for all checks with a given
source type but also in that it disables all auditing, not just
permission denials.

When a type is both a permissive type and a neveraudit type,
the security server can short-circuit the security_compute_av()
logic, allowing all permissions and not auditing any permissions.

This change just introduces the basic support but does not yet
further optimize the AVC or hook function logic when a type
is both a permissive type and a dontaudit type.

Suggested-by: Paul Moore <paul@paul-moore.com>
Signed-off-by: Stephen Smalley <stephen.smalley.work@gmail.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2025-06-19 17:23:04 -04:00
Stephen Smalley
fde46f60f6 selinux: change security_compute_sid to return the ssid or tsid on match
If the end result of a security_compute_sid() computation matches the
ssid or tsid, return that SID rather than looking it up again. This
avoids the problem of multiple initial SIDs that map to the same
context.

Cc: stable@vger.kernel.org
Reported-by: Guido Trentalancia <guido@trentalancia.com>
Fixes: ae254858ce ("selinux: introduce an initial SID for early boot processes")
Signed-off-by: Stephen Smalley <stephen.smalley.work@gmail.com>
Tested-by: Guido Trentalancia <guido@trentalancia.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2025-06-19 16:13:16 -04:00
Al Viro
5be998a218 ipe: don't bother with removal of files in directory we'll be removing
... and use securityfs_remove() instead of securityfs_recursive_remove()

Acked-by: Fan Wu <wufan@kernel.org>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2025-06-17 18:10:53 -04:00
Al Viro
e25fc5540c evm_secfs: clear securityfs interactions
1) creation never returns NULL; error is reported as ERR_PTR()
2) no need to remove file before removing its parent

Acked-by: Mimi Zohar <zohar@linux.ibm.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2025-06-17 18:10:30 -04:00
Al Viro
d15ffbbf4d ima_fs: get rid of lookup-by-dentry stuff
lookup_template_data_hash_algo() machinery is used to locate the
matching ima_algo_array[] element at read time; securityfs
allows to stash that into inode->i_private at object creation
time, so there's no need to bother

Acked-by: Mimi Zohar <zohar@linux.ibm.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2025-06-17 18:10:14 -04:00
Al Viro
22260a99d7 ima_fs: don't bother with removal of files in directory we'll be removing
removal of parent takes all children out

Acked-by: Mimi Zohar <zohar@linux.ibm.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2025-06-17 18:09:52 -04:00
Al Viro
273a291dd7 apparmor: file never has NULL f_path.mnt
Reviewed-by: Christian Brauner <brauner@kernel.org>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2025-06-17 18:04:36 -04:00
Al Viro
d1832e648d landlock: opened file never has a negative dentry
Reviewed-by: Christian Brauner <brauner@kernel.org>
Acked-by: Mickaël Salaün <mic@digikod.net>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2025-06-17 18:03:57 -04:00
Stephen Smalley
86c8db86af selinux: fix selinux_xfrm_alloc_user() to set correct ctx_len
We should count the terminating NUL byte as part of the ctx_len.
Otherwise, UBSAN logs a warning:
  UBSAN: array-index-out-of-bounds in security/selinux/xfrm.c:99:14
  index 60 is out of range for type 'char [*]'

The allocation itself is correct so there is no actual out of bounds
indexing, just a warning.

Cc: stable@vger.kernel.org
Suggested-by: Christian Göttsche <cgzones@googlemail.com>
Link: https://lore.kernel.org/selinux/CAEjxPJ6tA5+LxsGfOJokzdPeRomBHjKLBVR6zbrg+_w3ZZbM3A@mail.gmail.com/
Signed-off-by: Stephen Smalley <stephen.smalley.work@gmail.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2025-06-16 19:02:22 -04:00
Paul Moore
8a71d8fa55 selinux: add a 5 second sleep to /sys/fs/selinux/user
Commit d7b6918e22 ("selinux: Deprecate /sys/fs/selinux/user") started
the deprecation process for /sys/fs/selinux/user:

  The selinuxfs "user" node allows userspace to request a list
  of security contexts that can be reached for a given SELinux
  user from a given starting context. This was used by libselinux
  when various login-style programs requested contexts for
  users, but libselinux stopped using it in 2020.
  Kernel support will be removed no sooner than Dec 2025.

A pr_warn() message has been in place since Linux v6.13, this patch
adds a five second sleep to /sys/fs/selinux/user to help make the
deprecation and upcoming removal more noticeable.

Signed-off-by: Paul Moore <paul@paul-moore.com>
2025-06-16 18:44:03 -04:00
Kalevi Kolttonen
9fc86a85f3 lsm: trivial comment fix
Fix a typo in the security_inode_mkdir() comment block.

Signed-off-by: Kalevi Kolttonen <kalevi@kolttonen.fi>
[PM: subject tweak, add description]
Signed-off-by: Paul Moore <paul@paul-moore.com>
2025-06-16 18:43:13 -04:00
Baoquan He
aa9bb1b325 ima: add a knob ima= to allow disabling IMA in kdump kernel
Kdump kernel doesn't need IMA functionality, and enabling IMA will cost
extra memory. It would be very helpful to allow IMA to be disabled for
kdump kernel.

Hence add a knob ima=on|off here to allow turning IMA off in kdump
kernel if needed.

Note that this IMA disabling is limited to kdump kernel, please don't
abuse it in other kernel and thus serious consequences are caused.

Signed-off-by: Baoquan He <bhe@redhat.com>
Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>
2025-06-16 09:15:13 -04:00
Al Viro
29d673b150 make securityfs_remove() remove the entire subtree
... and fix the mount leak when anything's mounted there.
securityfs_recursive_remove becomes an alias for securityfs_remove -
we'll probably need to remove it in a cycle or two.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2025-06-11 18:19:46 -04:00
Al Viro
e4de726502 securityfs: pin filesystem only for objects directly in root
Nothing on securityfs ever changes parents, so we don't need
to pin the internal mount if it's already pinned for parent.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2025-06-11 18:00:00 -04:00
Al Viro
27cd1bf124 securityfs: don't pin dentries twice, once is enough...
incidentally, securityfs_recursive_remove() is broken without that -
it leaks dentries, since simple_recursive_removal() does not expect
anything of that sort.  It could be worked around by dput() in
remove_one() callback, but it's easier to just drop that double-get
stuff.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2025-06-11 15:07:57 -04:00
Herbert Xu
488ef35601 KEYS: Invert FINAL_PUT bit
Invert the FINAL_PUT bit so that test_bit_acquire and clear_bit_unlock
can be used instead of smp_mb.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David Howells <dhowells@redhat.com>
Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2025-06-11 11:57:14 -07:00
Linus Torvalds
dee264c16a require gcc-8 and binutils-2.30
x86 already uses gcc-8 as the minimum version, this changes all other
 architectures to the same version. gcc-8 is used is Debian 10 and Red
 Hat Enterprise Linux 8, both of which are still supported, and binutils
 2.30 is the oldest corresponding version on those. Ubuntu Pro 18.04 and
 SUSE Linux Enterprise Server 15 both use gcc-7 as the system compiler
 but additionally include toolchains that remain supported.
 
 With the new minimum toolchain versions, a number of workarounds for older
 versions can be dropped, in particular on x86_64 and arm64.  Importantly,
 the updated compiler version allows removing two of the five remaining
 gcc plugins, as support for sancov and structeak features is already
 included in modern compiler versions.
 
 I tried collecting the known changes that are possible based on the
 new toolchain version, but expect that more cleanups will be possible.
 Since this touches multiple architectures, I merged the patches through
 the asm-generic tree.
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEo6/YBQwIrVS28WGKmmx57+YAGNkFAmg6vNMACgkQmmx57+YA
 GNkOmg/+LtR9B2P27GPBeG8HnLTZ8hKELiyYeSk6ZFgQv5hevE37HV35Yru7e7gu
 wcF6CgYr8ff4CVcHM7y0790oGew1thkqq5CklFIH0EwCDJx/mWfZR1SS2jfZIEWM
 HSDOlQQd1S8oWia14tSnQos3nW3CB9/ABVTHH+Wvl3xn48WMRvgK2LJgGLuxJrt8
 5DD9auHiLjchWB5tB4DU98IgWWgFUGMTsI6IayZ4dkF4CdWqd89h0Y3pjJYeBgHS
 mPxzR2q8WjEmG9hp7QuZQgn/pAYleJAwHvvkoLrkQ2ieqx3FjWiwFbQp4CG1Sc8L
 eBR1lnkqS2z/e7xJLfe86fOoKWWu4I0tZKhRan/0+UOGm5nXrGpqSxKS8ZDsRuAp
 3fvyhIp1cYSa7Xkok8BFhLEFR0tguXJXnXBc3tWE5VXIfFNd0Ohh1GUYhXDAqWKh
 i0jN9dSNhokM3AqBi6qZl5kmBnRA3UsIaOg3QRrqN8IlBPp+u7i5xsrJIUWvD95o
 TO06admmLcCJT8n6ZfNVfRjBgzu8+t54UVaDx9YYwxoNGOSFwqOb8CSPTWPxLmDr
 RKDUOvO8DBlP7uFz9neP+LxluA3DjurRZvb0z0AmCZ8/RXEmTMCyfP5a6esxquXt
 0Bqo6hM9q+TeXTHNS1CNvqLSWWikw+AzS/ZPPvriYFn5lxtbq6c=
 =pdDC
 -----END PGP SIGNATURE-----

Merge tag 'gcc-minimum-version-6.16' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic

Pull compiler version requirement update from Arnd Bergmann:
 "Require gcc-8 and binutils-2.30

  x86 already uses gcc-8 as the minimum version, this changes all other
  architectures to the same version. gcc-8 is used is Debian 10 and Red
  Hat Enterprise Linux 8, both of which are still supported, and
  binutils 2.30 is the oldest corresponding version on those.

  Ubuntu Pro 18.04 and SUSE Linux Enterprise Server 15 both use gcc-7 as
  the system compiler but additionally include toolchains that remain
  supported.

  With the new minimum toolchain versions, a number of workarounds for
  older versions can be dropped, in particular on x86_64 and arm64.
  Importantly, the updated compiler version allows removing two of the
  five remaining gcc plugins, as support for sancov and structeak
  features is already included in modern compiler versions.

  I tried collecting the known changes that are possible based on the
  new toolchain version, but expect that more cleanups will be possible.

  Since this touches multiple architectures, I merged the patches
  through the asm-generic tree."

* tag 'gcc-minimum-version-6.16' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic:
  Makefile.kcov: apply needed compiler option unconditionally in CFLAGS_KCOV
  Documentation: update binutils-2.30 version reference
  gcc-plugins: remove SANCOV gcc plugin
  Kbuild: remove structleak gcc plugin
  arm64: drop binutils version checks
  raid6: skip avx512 checks
  kbuild: require gcc-8 and binutils-2.30
2025-05-31 08:16:52 -07:00
Linus Torvalds
12e9b9e522 ipe/stable-6.16 PR 20250527
-----BEGIN PGP SIGNATURE-----
 
 iIcEABYIAC8WIQQzmBmZPBN6m/hUJmnyomI6a/yO7QUCaDZldREcd3VmYW5Aa2Vy
 bmVsLm9yZwAKCRDyomI6a/yO7eT7AQCSU1qiLEHpKEbxRtPyD9m6OBVRrS9joKEn
 zABAi00GEgD8C11CKWFemqaL6zexg6c0k51y90K1S1SPZ6pY9HwzZAY=
 =c1Gi
 -----END PGP SIGNATURE-----

Merge tag 'ipe-pr-20250527' of git://git.kernel.org/pub/scm/linux/kernel/git/wufan/ipe

Pull IPE update from Fan Wu:
 "A single commit from Jasjiv Singh, that adds an errno field to IPE
  policy load auditing to log failures with error details, not just
  successes.

  This improves the security audit trail and helps diagnose policy
  deployment issues"

* tag 'ipe-pr-20250527' of git://git.kernel.org/pub/scm/linux/kernel/git/wufan/ipe:
  ipe: add errno field to IPE policy load auditing
2025-05-29 08:01:53 -07:00
Linus Torvalds
1b98f357da Networking changes for 6.16.
Core
 ----
 
  - Implement the Device Memory TCP transmit path, allowing zero-copy
    data transmission on top of TCP from e.g. GPU memory to the wire.
 
  - Move all the IPv6 routing tables management outside the RTNL scope,
    under its own lock and RCU. The route control path is now 3x times
    faster.
 
  - Convert queue related netlink ops to instance lock, reducing
    again the scope of the RTNL lock. This improves the control plane
    scalability.
 
  - Refactor the software crc32c implementation, removing unneeded
    abstraction layers and improving significantly the related
    micro-benchmarks.
 
  - Optimize the GRO engine for UDP-tunneled traffic, for a 10%
    performance improvement in related stream tests.
 
  - Cover more per-CPU storage with local nested BH locking; this is a
    prep work to remove the current per-CPU lock in local_bh_disable()
    on PREMPT_RT.
 
  - Introduce and use nlmsg_payload helper, combining buffer bounds
    verification with accessing payload carried by netlink messages.
 
 Netfilter
 ---------
 
  - Rewrite the procfs conntrack table implementation, improving
    considerably the dump performance. A lot of user-space tools
    still use this interface.
 
  - Implement support for wildcard netdevice in netdev basechain
    and flowtables.
 
  - Integrate conntrack information into nft trace infrastructure.
 
  - Export set count and backend name to userspace, for better
    introspection.
 
 BPF
 ---
 
  - BPF qdisc support: BPF-qdisc can be implemented with BPF struct_ops
    programs and can be controlled in similar way to traditional qdiscs
    using the "tc qdisc" command.
 
  - Refactor the UDP socket iterator, addressing long standing issues
    WRT duplicate hits or missed sockets.
 
 Protocols
 ---------
 
  - Improve TCP receive buffer auto-tuning and increase the default
    upper bound for the receive buffer; overall this improves the single
    flow maximum thoughput on 200Gbs link by over 60%.
 
  - Add AFS GSSAPI security class to AF_RXRPC; it provides transport
    security for connections to the AFS fileserver and VL server.
 
  - Improve TCP multipath routing, so that the sources address always
    matches the nexthop device.
 
  - Introduce SO_PASSRIGHTS for AF_UNIX, to allow disabling SCM_RIGHTS,
    and thus preventing DoS caused by passing around problematic FDs.
 
  - Retire DCCP socket. DCCP only receives updates for bugs, and major
    distros disable it by default. Its removal allows for better
    organisation of TCP fields to reduce the number of cache lines hit
    in the fast path.
 
  - Extend TCP drop-reason support to cover PAWS checks.
 
 Driver API
 ----------
 
  - Reorganize PTP ioctl flag support to require an explicit opt-in for
    the drivers, avoiding the problem of drivers not rejecting new
    unsupported flags.
 
  - Converted several device drivers to timestamping APIs.
 
  - Introduce per-PHY ethtool dump helpers, improving the support for
    dump operations targeting PHYs.
 
 Tests and tooling
 -----------------
 
  - Add support for classic netlink in user space C codegen, so that
    ynl-c can now read, create and modify links, routes addresses and
    qdisc layer configuration.
 
  - Add ynl sub-types for binary attributes, allowing ynl-c to output
    known struct instead of raw binary data, clarifying the classic
    netlink output.
 
  - Extend MPTCP selftests to improve the code-coverage.
 
  - Add tests for XDP tail adjustment in AF_XDP.
 
 New hardware / drivers
 ----------------------
 
  - OpenVPN virtual driver: offload OpenVPN data channels processing
    to the kernel-space, increasing the data transfer throughput WRT
    the user-space implementation.
 
  - Renesas glue driver for the gigabit ethernet RZ/V2H(P) SoC.
 
  - Broadcom asp-v3.0 ethernet driver.
 
  - AMD Renoir ethernet device.
 
  - ReakTek MT9888 2.5G ethernet PHY driver.
 
  - Aeonsemi 10G C45 PHYs driver.
 
 Drivers
 -------
 
  - Ethernet high-speed NICs:
    - nVidia/Mellanox (mlx5):
      - refactor the stearing table handling to reduce significantly
        the amount of memory used
      - add support for complex matches in H/W flow steering
      - improve flow streeing error handling
      - convert to netdev instance locking
    - Intel (100G, ice, igb, ixgbe, idpf):
      - ice: add switchdev support for LLDP traffic over VF
      - ixgbe: add firmware manipulation and regions devlink support
      - igb: introduce support for frame transmission premption
      - igb: adds persistent NAPI configuration
      - idpf: introduce RDMA support
      - idpf: add initial PTP support
    - Meta (fbnic):
      - extend hardware stats coverage
      - add devlink dev flash support
    - Broadcom (bnxt):
      - add support for RX-side device memory TCP
    - Wangxun (txgbe):
      - implement support for udp tunnel offload
      - complete PTP and SRIOV support for AML 25G/10G devices
 
  - Ethernet NICs embedded and virtual:
    - Google (gve):
      - add device memory TCP TX support
    - Amazon (ena):
      - support persistent per-NAPI config
    - Airoha:
      - add H/W support for L2 traffic offload
      - add per flow stats for flow offloading
    - RealTek (rtl8211): add support for WoL magic packet
    - Synopsys (stmmac):
      - dwmac-socfpga 1000BaseX support
      - add Loongson-2K3000 support
      - introduce support for hardware-accelerated VLAN stripping
    - Broadcom (bcmgenet):
      - expose more H/W stats
    - Freescale (enetc, dpaa2-eth):
      - enetc: add MAC filter, VLAN filter RSS and loopback support
      - dpaa2-eth: convert to H/W timestamping APIs
    - vxlan: convert FDB table to rhashtable, for better scalabilty
    - veth: apply qdisc backpressure on full ring to reduce TX drops
 
  - Ethernet switches:
    - Microchip (kzZ88x3): add ETS scheduler support
 
  - Ethernet PHYs:
    - RealTek (rtl8211):
      - add support for WoL magic packet
      - add support for PHY LEDs
 
  - CAN:
    - Adds RZ/G3E CANFD support to the rcar_canfd driver.
    - Preparatory work for CAN-XL support.
    - Add self-tests framework with support for CAN physical interfaces.
 
  - WiFi:
    - mac80211:
      - scan improvements with multi-link operation (MLO)
    - Qualcomm (ath12k):
      - enable AHB support for IPQ5332
      - add monitor interface support to QCN9274
      - add multi-link operation support to WCN7850
      - add 802.11d scan offload support to WCN7850
      - monitor mode for WCN7850, better 6 GHz regulatory
    - Qualcomm (ath11k):
      - restore hibernation support
    - MediaTek (mt76):
      - WiFi-7 improvements
      - implement support for mt7990
    - Intel (iwlwifi):
      - enhanced multi-link single-radio (EMLSR) support on 5 GHz links
      - rework device configuration
    - RealTek (rtw88):
      - improve throughput for RTL8814AU
    - RealTek (rtw89):
      - add multi-link operation support
      - STA/P2P concurrency improvements
      - support different SAR configs by antenna
 
  - Bluetooth:
    - introduce HCI Driver protocol
    - btintel_pcie: do not generate coredump for diagnostic events
    - btusb: add HCI Drv commands for configuring altsetting
    - btusb: add RTL8851BE device 0x0bda:0xb850
    - btusb: add new VID/PID 13d3/3584 for MT7922
    - btusb: add new VID/PID 13d3/3630 and 13d3/3613 for MT7925
    - btnxpuart: implement host-wakeup feature
 
 Signed-off-by: Paolo Abeni <pabeni@redhat.com>
 -----BEGIN PGP SIGNATURE-----
 
 iQJGBAABCAAwFiEEg1AjqC77wbdLX2LbKSR5jcyPE6QFAmg3D64SHHBhYmVuaUBy
 ZWRoYXQuY29tAAoJECkkeY3MjxOkcIsQAK2eEc+BxQer975wzvtMg6gF9eoex4a+
 rZ7jxfDzDtNvTauoQsrpehDZp0FnySaVGCU36lHGB2OvDnhCpPc5hXzKDWQpOuqQ
 SHrGG3/6FTbdTG/HfHUcbNyrUzIf53SADSObiQ3qg4gyEQ3sCpcOKtVtMcU8rvsY
 /HqMnsJWFaROUMjMtCcnUSgjmeY9kBvha3sTXUqgeRugEOCvZD7z4rpqFIcQqHw7
 e2Fi8dwIXEYNxqPp6MRq2qdyUTewCRruE8ZIMAFuhtfYeMElUZMPlqlMENX3AzTQ
 cr0EgwcFOUxRA7oZRxhoBNBsVXavtSpQr4ZDoWplxP4aQ37n5tc1E9Q72axpB/Og
 FbJRl6GvWYnCd8071BczgmfHlKaTAigPvt2Z4r6JjM5I/Bij/IZ3k+On1OTuOAj/
 EqfFkdZ0a5cfKrwUMP+oSGtSAywkMVUtnIKJlZeRbjSj2432sCfe2jVAlS8ELM43
 3LUgXYrAKtA87g171LlsRu5EEpI5QmqPb+i5LpPlEXe2TJEgPisyfecJ3NafF/2+
 j575lm+TFNm9NTNhGGjDPEvw0djI5wSGGMe9J4gC74eWi6s5t6C4cuUf84TKWdwR
 x+9H0IB7rfFncAwXHJuUUtzd+fPHaYzs5dDGbSgMQOXr1cr1wlubCK8mQ1r/Wt/a
 3GjFIOQKW2Q5
 =t/Tz
 -----END PGP SIGNATURE-----

Merge tag 'net-next-6.16' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next

Pull networking updates from Paolo Abeni:
 "Core:

   - Implement the Device Memory TCP transmit path, allowing zero-copy
     data transmission on top of TCP from e.g. GPU memory to the wire.

   - Move all the IPv6 routing tables management outside the RTNL scope,
     under its own lock and RCU. The route control path is now 3x times
     faster.

   - Convert queue related netlink ops to instance lock, reducing again
     the scope of the RTNL lock. This improves the control plane
     scalability.

   - Refactor the software crc32c implementation, removing unneeded
     abstraction layers and improving significantly the related
     micro-benchmarks.

   - Optimize the GRO engine for UDP-tunneled traffic, for a 10%
     performance improvement in related stream tests.

   - Cover more per-CPU storage with local nested BH locking; this is a
     prep work to remove the current per-CPU lock in local_bh_disable()
     on PREMPT_RT.

   - Introduce and use nlmsg_payload helper, combining buffer bounds
     verification with accessing payload carried by netlink messages.

  Netfilter:

   - Rewrite the procfs conntrack table implementation, improving
     considerably the dump performance. A lot of user-space tools still
     use this interface.

   - Implement support for wildcard netdevice in netdev basechain and
     flowtables.

   - Integrate conntrack information into nft trace infrastructure.

   - Export set count and backend name to userspace, for better
     introspection.

  BPF:

   - BPF qdisc support: BPF-qdisc can be implemented with BPF struct_ops
     programs and can be controlled in similar way to traditional qdiscs
     using the "tc qdisc" command.

   - Refactor the UDP socket iterator, addressing long standing issues
     WRT duplicate hits or missed sockets.

  Protocols:

   - Improve TCP receive buffer auto-tuning and increase the default
     upper bound for the receive buffer; overall this improves the
     single flow maximum thoughput on 200Gbs link by over 60%.

   - Add AFS GSSAPI security class to AF_RXRPC; it provides transport
     security for connections to the AFS fileserver and VL server.

   - Improve TCP multipath routing, so that the sources address always
     matches the nexthop device.

   - Introduce SO_PASSRIGHTS for AF_UNIX, to allow disabling SCM_RIGHTS,
     and thus preventing DoS caused by passing around problematic FDs.

   - Retire DCCP socket. DCCP only receives updates for bugs, and major
     distros disable it by default. Its removal allows for better
     organisation of TCP fields to reduce the number of cache lines hit
     in the fast path.

   - Extend TCP drop-reason support to cover PAWS checks.

  Driver API:

   - Reorganize PTP ioctl flag support to require an explicit opt-in for
     the drivers, avoiding the problem of drivers not rejecting new
     unsupported flags.

   - Converted several device drivers to timestamping APIs.

   - Introduce per-PHY ethtool dump helpers, improving the support for
     dump operations targeting PHYs.

  Tests and tooling:

   - Add support for classic netlink in user space C codegen, so that
     ynl-c can now read, create and modify links, routes addresses and
     qdisc layer configuration.

   - Add ynl sub-types for binary attributes, allowing ynl-c to output
     known struct instead of raw binary data, clarifying the classic
     netlink output.

   - Extend MPTCP selftests to improve the code-coverage.

   - Add tests for XDP tail adjustment in AF_XDP.

  New hardware / drivers:

   - OpenVPN virtual driver: offload OpenVPN data channels processing to
     the kernel-space, increasing the data transfer throughput WRT the
     user-space implementation.

   - Renesas glue driver for the gigabit ethernet RZ/V2H(P) SoC.

   - Broadcom asp-v3.0 ethernet driver.

   - AMD Renoir ethernet device.

   - ReakTek MT9888 2.5G ethernet PHY driver.

   - Aeonsemi 10G C45 PHYs driver.

  Drivers:

   - Ethernet high-speed NICs:
       - nVidia/Mellanox (mlx5):
           - refactor the steering table handling to significantly
             reduce the amount of memory used
           - add support for complex matches in H/W flow steering
           - improve flow streeing error handling
           - convert to netdev instance locking
       - Intel (100G, ice, igb, ixgbe, idpf):
           - ice: add switchdev support for LLDP traffic over VF
           - ixgbe: add firmware manipulation and regions devlink support
           - igb: introduce support for frame transmission premption
           - igb: adds persistent NAPI configuration
           - idpf: introduce RDMA support
           - idpf: add initial PTP support
       - Meta (fbnic):
           - extend hardware stats coverage
           - add devlink dev flash support
       - Broadcom (bnxt):
           - add support for RX-side device memory TCP
       - Wangxun (txgbe):
           - implement support for udp tunnel offload
           - complete PTP and SRIOV support for AML 25G/10G devices

   - Ethernet NICs embedded and virtual:
       - Google (gve):
           - add device memory TCP TX support
       - Amazon (ena):
           - support persistent per-NAPI config
       - Airoha:
           - add H/W support for L2 traffic offload
           - add per flow stats for flow offloading
       - RealTek (rtl8211): add support for WoL magic packet
       - Synopsys (stmmac):
           - dwmac-socfpga 1000BaseX support
           - add Loongson-2K3000 support
           - introduce support for hardware-accelerated VLAN stripping
       - Broadcom (bcmgenet):
           - expose more H/W stats
       - Freescale (enetc, dpaa2-eth):
           - enetc: add MAC filter, VLAN filter RSS and loopback support
           - dpaa2-eth: convert to H/W timestamping APIs
       - vxlan: convert FDB table to rhashtable, for better scalabilty
       - veth: apply qdisc backpressure on full ring to reduce TX drops

   - Ethernet switches:
       - Microchip (kzZ88x3): add ETS scheduler support

   - Ethernet PHYs:
       - RealTek (rtl8211):
           - add support for WoL magic packet
           - add support for PHY LEDs

   - CAN:
       - Adds RZ/G3E CANFD support to the rcar_canfd driver.
       - Preparatory work for CAN-XL support.
       - Add self-tests framework with support for CAN physical interfaces.

   - WiFi:
       - mac80211:
           - scan improvements with multi-link operation (MLO)
       - Qualcomm (ath12k):
           - enable AHB support for IPQ5332
           - add monitor interface support to QCN9274
           - add multi-link operation support to WCN7850
           - add 802.11d scan offload support to WCN7850
           - monitor mode for WCN7850, better 6 GHz regulatory
       - Qualcomm (ath11k):
           - restore hibernation support
       - MediaTek (mt76):
           - WiFi-7 improvements
           - implement support for mt7990
       - Intel (iwlwifi):
           - enhanced multi-link single-radio (EMLSR) support on 5 GHz links
           - rework device configuration
       - RealTek (rtw88):
           - improve throughput for RTL8814AU
       - RealTek (rtw89):
           - add multi-link operation support
           - STA/P2P concurrency improvements
           - support different SAR configs by antenna

   - Bluetooth:
       - introduce HCI Driver protocol
       - btintel_pcie: do not generate coredump for diagnostic events
       - btusb: add HCI Drv commands for configuring altsetting
       - btusb: add RTL8851BE device 0x0bda:0xb850
       - btusb: add new VID/PID 13d3/3584 for MT7922
       - btusb: add new VID/PID 13d3/3630 and 13d3/3613 for MT7925
       - btnxpuart: implement host-wakeup feature"

* tag 'net-next-6.16' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (1611 commits)
  selftests/bpf: Fix bpf selftest build warning
  selftests: netfilter: Fix skip of wildcard interface test
  net: phy: mscc: Stop clearing the the UDPv4 checksum for L2 frames
  net: openvswitch: Fix the dead loop of MPLS parse
  calipso: Don't call calipso functions for AF_INET sk.
  selftests/tc-testing: Add a test for HFSC eltree double add with reentrant enqueue behaviour on netem
  net_sched: hfsc: Address reentrant enqueue adding class to eltree twice
  octeontx2-pf: QOS: Refactor TC_HTB_LEAF_DEL_LAST callback
  octeontx2-pf: QOS: Perform cache sync on send queue teardown
  net: mana: Add support for Multi Vports on Bare metal
  net: devmem: ncdevmem: remove unused variable
  net: devmem: ksft: upgrade rx test to send 1K data
  net: devmem: ksft: add 5 tuple FS support
  net: devmem: ksft: add exit_wait to make rx test pass
  net: devmem: ksft: add ipv4 support
  net: devmem: preserve sockc_err
  page_pool: fix ugly page_pool formatting
  net: devmem: move list_add to net_devmem_bind_dmabuf.
  selftests: netfilter: nft_queue.sh: include file transfer duration in log message
  net: phy: mscc: Fix memory leak when using one step timestamping
  ...
2025-05-28 15:24:36 -07:00
Linus Torvalds
b5628b81bd selinux-pr-20250527
-----BEGIN PGP SIGNATURE-----
 
 iQJIBAABCgAyFiEES0KozwfymdVUl37v6iDy2pc3iXMFAmg17c0UHHBhdWxAcGF1
 bC1tb29yZS5jb20ACgkQ6iDy2pc3iXPLZhAAoEOtbcgImq6Qn8vH0rPcA0y5n2as
 Seg7lfz3HFFo1R8aV8d3PIHbmbhpW2EnLJnDQS3lhgzRCzM9/seXOkX47P7hyu7L
 IJvZ9RU1JFhLU/7y4p3KsGJGowNoVVaJLRlD/IpwzX3osBOImZBTCy28lfD608g3
 RSC1CMIsTuKnWPD+7CFy62pZDixRP/MhxCleTnN7qH/65N6UnqT0rolEgqJUlfZ4
 Gji5GVXLIWjEnz2fkCearxqjA6STz8SAYgYoysgWSZoyhbGMetCTaVQj6iXcwqJO
 yX39jxoI4XymcxCwa/aypeCB1PllkO86qPJ7coJA+v5EHnuRXzQ/XANiNtTRZUlg
 itGr8/H4CABVMKsuEqw4lU587b42kVnrS6JIETdqzRievtBoNmp6RTXTQvd8w7GV
 zvAyWgl1gSBWRWCZN9UVA9w76sL6NL4alaFhAUQ352oMiSODwfS7ooiki5V8GAe7
 SvCl1WgrBmhUS87MFnXk9p2YzWcrUdOdsqHCm3JFGqepE5czWtDZ8fAzW5STrkYL
 eJiIbOM6qxDW8uiFXtenp8FXg31iWmR4/N1fjK3vQ5jVIYQDjWAlKG/WYxGK84Nf
 IS/MvKQgJU0+j3LoZX/Fr5hz3uZSFHUV+ZjpniF8y6x/MAbg1Q4IYuDDJpNP96rP
 fG2k78KSx8q3ckk=
 =6ljO
 -----END PGP SIGNATURE-----

Merge tag 'selinux-pr-20250527' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux

Pull selinux updates from Paul Moore:

 - Reduce the SELinux impact on path walks.

   Add a small directory access cache to the per-task SELinux state.
   This cache allows SELinux to cache the most recently used directory
   access decisions in order to avoid repeatedly querying the AVC on
   path walks where the majority of the directories have similar
   security contexts/labels.

   My performance measurements are crude, but prior to this patch the
   time spent in SELinux code on a 'make allmodconfig' run was 103% that
   of __d_lookup_rcu(), and with this patch the time spent in SELinux
   code dropped to 63% of __d_lookup_rcu(), a ~40% improvement.

   Additional improvments can be expected in the future, but those will
   require additional SELinux policy/toolchain support.

 - Add support for wildcards in genfscon policy statements.

   This patch allows for wildcards in the genfscon patch matching logic
   as opposed to the prefix matching that was used prior to this change.
   Adding wilcard support allows for more expressive and efficient path
   matching in the policy which is especially helpful for sysfs, and has
   resulted in a ~15% boot time reduction in Android.

   SELinux policies can opt into wilcard matching by using the
   "genfs_seclabel_wildcard" policy capability.

 - Unify the error/OOM handling of the SELinux network caches.

   A failure to allocate memory for the SELinux network caches isn't
   fatal as the object label can still be safely returned to the caller,
   it simply means that we cannot add the new data to the cache, at
   least temporarily. This patch corrects this behavior for the
   InfiniBand cache and does some minor cleanup.

 - Minor improvements around constification, 'likely' annotations, and
   removal of bogus comments.

* tag 'selinux-pr-20250527' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux:
  selinux: fix the kdoc header for task_avdcache_update
  selinux: remove a duplicated include
  selinux: reduce path walk overhead
  selinux: support wildcard match in genfscon
  selinux: drop copy-paste comment
  selinux: unify OOM handling in network hashtables
  selinux: add likely hints for fast paths
  selinux: contify network namespace pointer
  selinux: constify network address pointer
2025-05-28 08:28:58 -07:00
Linus Torvalds
1bc8c83af9 lsm-pr-20250527
-----BEGIN PGP SIGNATURE-----
 
 iQJIBAABCgAyFiEES0KozwfymdVUl37v6iDy2pc3iXMFAmg17fsUHHBhdWxAcGF1
 bC1tb29yZS5jb20ACgkQ6iDy2pc3iXPPmxAAj7/Ua694XyG8XvU934+XOyKi3ljH
 Z6vK094lMY1xUp8ed0wfAoFUD0adZZ3Zxs/entRTBz1istqE244xauWd6t61Wm/H
 BnT1M5hjz6SZ51ErA337WE94GhnGndNpO3/h95yd8/1FpZrzrUy3W5N6+4flEW4g
 s6uFWYk4VXPKvqmrFnelwgBJ/om+gAAWNdU1z1ujeZsec+BtAkaO8R1PYDGr/zIw
 Oxv7OtWpSxFKuD0W51uf9GNYEBlDke9X/JvvJ77hH3amFuHEWKWuBgIePjDengGv
 dwauRg2Z8ciVmKCkwztIFrCP86TvYd5mvDWPU4vveU3ApSYO+rl22WMuQZ4tTmI/
 X56QZRx2uL6PASVEvreMPgVGW/W0IBzyzY5vzEy99e2wqtlxMY6t2cW27sYZT2wA
 obISzZUg1SYKyQ5bFzn11Xzm9YdqU6mpIQXJiQh6EjDKTzOHPDPijQFMD0kVSy1P
 SKAYmDK7a+Wgfx3ZBdsBkQOM2DCHwG96ExGxG394trxdhnJqfUTM3APhuAIgO+x3
 x9UpPVM+zwnTgYNa7RqnfeXt+0rMk03kG0RLtyI9+m2NtaVp5jbRBsdGxDfbRVHm
 X2Y1wGjWdDwhY2Wr0m2XAafc5JwPUOwfrK6S8J88by7t7BWbcsmJMpriEJ3Z094W
 fYItOQwuLArd/JY=
 =Go29
 -----END PGP SIGNATURE-----

Merge tag 'lsm-pr-20250527' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/lsm

Pull lsm update from Paul Moore:
 "One minor LSM framework patch to move the selinux_netlink_send() hook
  under the CONFIG_SECURITY_NETWORK Kconfig knob"

* tag 'lsm-pr-20250527' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/lsm:
  lsm: Move security_netlink_send to under CONFIG_SECURITY_NETWORK
2025-05-28 08:17:19 -07:00
Linus Torvalds
7af6e3febb integrity-v6.16
-----BEGIN PGP SIGNATURE-----
 
 iIoEABYKADIWIQQdXVVFGN5XqKr1Hj7LwZzRsCrn5QUCaDYRjRQcem9oYXJAbGlu
 dXguaWJtLmNvbQAKCRDLwZzRsCrn5fpRAQDsdIIwCgyQLFQhZq3wW5dhTUBQW8o9
 GjaNHpROKV57cwD9GqT78xi9qsxgaYW0lUUh5+zvlGI5cAtnl8/Fkby7hgY=
 =ohlH
 -----END PGP SIGNATURE-----

Merge tag 'integrity-v6.16' of git://git.kernel.org/pub/scm/linux/kernel/git/zohar/linux-integrity

Pull integrity updates from Mimi Zohar:
 "Carrying the IMA measurement list across kexec is not a new feature,
  but is updated to address a couple of issues:

   - Carrying the IMA measurement list across kexec required knowing
     apriori all the file measurements between the "kexec load" and
     "kexec execute" in order to measure them before the "kexec load".
     Any delay between the "kexec load" and "kexec exec" exacerbated the
     problem.

   - Any file measurements post "kexec load" were not carried across
     kexec, resulting in the measurement list being out of sync with the
     TPM PCR.

  With these changes, the buffer for the IMA measurement list is still
  allocated at "kexec load", but copying the IMA measurement list is
  deferred to after quiescing the TPM.

  Two new kexec critical data records are defined"

* tag 'integrity-v6.16' of git://git.kernel.org/pub/scm/linux/kernel/git/zohar/linux-integrity:
  ima: do not copy measurement list to kdump kernel
  ima: measure kexec load and exec events as critical data
  ima: make the kexec extra memory configurable
  ima: verify if the segment size has changed
  ima: kexec: move IMA log copy from kexec load to execute
  ima: kexec: define functions to copy IMA log at soft boot
  ima: kexec: skip IMA segment validation after kexec soft reboot
  kexec: define functions to map and unmap segments
  ima: define and call ima_alloc_kexec_file_buf()
  ima: rename variable the seq_file "file" to "ima_kexec_file"
2025-05-28 08:12:33 -07:00
Linus Torvalds
cbaed2f58c Kernel doc fix for 6.16
-----BEGIN PGP SIGNATURE-----
 
 iQJLBAABCAA1FiEEC+9tH1YyUwIQzUIeOKUVfIxDyBEFAmg1+04XHGNhc2V5QHNj
 aGF1Zmxlci1jYS5jb20ACgkQOKUVfIxDyBHO8g/+JHGgUH93wUh7Gx6esjWEZLKg
 QUT8XYoQun4RXpCeutV+MlGr6KU1HtTTeNgr1gS2s9Gol/L0C4xiKpAXf5VhWISH
 ZyblAKXS2JjpmOnJSvl8Ej3zwpuhu4UX2Cf5MX6d1HSiHAQqGWsMnRTimG3Cdxm0
 yDBgcELygRteg66pu+XHOnQuOPiw7pOUq8aJnGGB1d7ldG1/Lbz6cqy5jrhmxqFc
 NWDFw9Ca2fzQV9XG/OOHhcZ8ZZJCGg+DPyRdeZjJxgAA3L+u2Isc8slJYQl3Gs6R
 7aKWhDC6QpohqYYSKXRVgN3d1Ro91meOFjc9tE03z/nzAwcOES9kITr4PJt1Cw3R
 uWJScbir7i8iD43uh8A1ovjw9weco4BNOOIsQVYdk/DLEQO2vkXtEQsgz6RNV1tm
 pzRae0SI2b+g2PJUfCgiFylJfQHEhCSK90KX5VtHRnv3ZEStxhESmDLXgIFghjOi
 kgs6aL3emdQSs84aF1bhENBUSLaFtOom2wMFc6LZ6xyTCk9ibIBGrSyun/9A9JIM
 3nhn8yXh5pdA2QEHNV5EUZaiejozWSqTN5gwVJouiBp9j5XpRX7vPKRw2kin0Ws3
 H6gEctDQfrStUJV1N5d+qdWQUe3JHdt9qNuQD5ZOw5geuGlwZe7aJvcBYLEh1H3H
 ZXKNOe9cmo0uJ5SMqcY=
 =S+jO
 -----END PGP SIGNATURE-----

Merge tag 'Smack-for-6.16' of https://github.com/cschaufler/smack-next

Pull smack update from Casey Schaufler:
 "One trivial kernel doc fix"

* tag 'Smack-for-6.16' of https://github.com/cschaufler/smack-next:
  security/smack/smackfs: small kernel-doc fixes
2025-05-28 08:09:37 -07:00
Linus Torvalds
48cfc5791d hardening updates for v6.16-rc1
- Update overflow helpers to ease refactoring of on-stack flex array
   instances (Gustavo A. R. Silva, Kees Cook)
 
 - lkdtm: Use SLAB_NO_MERGE instead of constructors (Harry Yoo)
 
 - Simplify CONFIG_CC_HAS_COUNTED_BY (Jan Hendrik Farr)
 
 - Disable u64 usercopy KUnit test on 32-bit SPARC (Thomas Weißschuh)
 
 - Add missed designated initializers now exposed by fixed randstruct
   (Nathan Chancellor, Kees Cook)
 
 - Document compilers versions for __builtin_dynamic_object_size
 
 - Remove ARM_SSP_PER_TASK GCC plugin
 
 - Fix GCC plugin randstruct, add selftests, and restore COMPILE_TEST
   builds
 
 - Kbuild: induce full rebuilds when dependencies change with GCC plugins,
   the Clang sanitizer .scl file, or the randstruct seed.
 
 - Kbuild: Switch from -Wvla to -Wvla-larger-than=1
 
 - Correct several __nonstring uses for -Wunterminated-string-initialization
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQRSPkdeREjth1dHnSE2KwveOeQkuwUCaDUq9gAKCRA2KwveOeQk
 u+ZCAQDhqpOE/yn5gfjyplIvaTtzj9CaW6g11AmPYrimJCuj3QD9G+0o35kzlXOw
 f0ZIj2U7LFNgbLos+20hQwhMFf1Zhgg=
 =OYzD
 -----END PGP SIGNATURE-----

Merge tag 'hardening-v6.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux

Pull hardening updates from Kees Cook:

 - Update overflow helpers to ease refactoring of on-stack flex array
   instances (Gustavo A. R. Silva, Kees Cook)

 - lkdtm: Use SLAB_NO_MERGE instead of constructors (Harry Yoo)

 - Simplify CONFIG_CC_HAS_COUNTED_BY (Jan Hendrik Farr)

 - Disable u64 usercopy KUnit test on 32-bit SPARC (Thomas Weißschuh)

 - Add missed designated initializers now exposed by fixed randstruct
   (Nathan Chancellor, Kees Cook)

 - Document compilers versions for __builtin_dynamic_object_size

 - Remove ARM_SSP_PER_TASK GCC plugin

 - Fix GCC plugin randstruct, add selftests, and restore COMPILE_TEST
   builds

 - Kbuild: induce full rebuilds when dependencies change with GCC
   plugins, the Clang sanitizer .scl file, or the randstruct seed.

 - Kbuild: Switch from -Wvla to -Wvla-larger-than=1

 - Correct several __nonstring uses for -Wunterminated-string-initialization

* tag 'hardening-v6.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux: (23 commits)
  Revert "hardening: Disable GCC randstruct for COMPILE_TEST"
  lib/tests: randstruct: Add deep function pointer layout test
  lib/tests: Add randstruct KUnit test
  randstruct: gcc-plugin: Remove bogus void member
  net: qede: Initialize qede_ll_ops with designated initializer
  scsi: qedf: Use designated initializer for struct qed_fcoe_cb_ops
  md/bcache: Mark __nonstring look-up table
  integer-wrap: Force full rebuild when .scl file changes
  randstruct: Force full rebuild when seed changes
  gcc-plugins: Force full rebuild when plugins change
  kbuild: Switch from -Wvla to -Wvla-larger-than=1
  hardening: simplify CONFIG_CC_HAS_COUNTED_BY
  overflow: Fix direct struct member initialization in _DEFINE_FLEX()
  kunit/overflow: Add tests for STACK_FLEX_ARRAY_SIZE() helper
  overflow: Add STACK_FLEX_ARRAY_SIZE() helper
  input/joystick: magellan: Mark __nonstring look-up table const
  watchdog: exar: Shorten identity name to fit correctly
  mod_devicetable: Enlarge the maximum platform_device_id name length
  overflow: Clarify expectations for getting DEFINE_FLEX variable sizes
  compiler_types: Identify compiler versions for __builtin_dynamic_object_size
  ...
2025-05-28 07:47:10 -07:00
Jasjiv Singh
1d887d6f81 ipe: add errno field to IPE policy load auditing
Users of IPE require a way to identify when and why an operation fails,
allowing them to both respond to violations of policy and be notified
of potentially malicious actions on their systems with respect to IPE.

This patch introduces a new error field to the AUDIT_IPE_POLICY_LOAD event
to log policy loading failures. Currently, IPE only logs successful policy
loads, but not failures. Tracking failures is crucial to detect malicious
attempts and ensure a complete audit trail for security events.

The new error field will capture the following error codes:

* -ENOKEY: Key used to sign the IPE policy not found in the keyring
* -ESTALE: Attempting to update an IPE policy with an older version
* -EKEYREJECTED: IPE signature verification failed
* -ENOENT: Policy was deleted while updating
* -EEXIST: Same name policy already deployed
* -ERANGE: Policy version number overflow
* -EINVAL: Policy version parsing error
* -EPERM: Insufficient permission
* -ENOMEM: Out of memory (OOM)
* -EBADMSG: Policy is invalid

Here are some examples of the updated audit record types:

AUDIT_IPE_POLICY_LOAD(1422):
audit:  AUDIT1422 policy_name="Test_Policy" policy_version=0.0.1
policy_digest=sha256:84EFBA8FA71E62AE0A537FAB962F8A2BD1053964C4299DCA
92BFFF4DB82E86D3 auid=1000 ses=3 lsm=ipe res=1 errno=0

The above record shows a new policy has been successfully loaded into
the kernel with the policy name, version, and hash with the errno=0.

AUDIT_IPE_POLICY_LOAD(1422) with error:

audit: AUDIT1422 policy_name=? policy_version=? policy_digest=?
auid=1000 ses=3 lsm=ipe res=0 errno=-74

The above record shows a policy load failure due to an invalid policy
(-EBADMSG).

By adding this error field, we ensure that all policy load attempts,
whether successful or failed, are logged, providing a comprehensive
audit trail for IPE policy management.

Signed-off-by: Jasjiv Singh <jasjivsingh@linux.microsoft.com>
Signed-off-by: Fan Wu <wufan@kernel.org>
2025-05-27 18:08:51 -07:00
Linus Torvalds
6d5b940e1e vfs-6.16-rc1.async.dir
-----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQRAhzRXHqcMeLMyaSiRxhvAZXjcogUCaDBN6wAKCRCRxhvAZXjc
 ok32AQD9DTiSCAoVg+7s+gSBuLTi8drPTN++mCaxdTqRh5WpRAD9GVyrGQT0s6LH
 eo9bm8d1TAYjilEWM0c0K0TxyQ7KcAA=
 =IW7H
 -----END PGP SIGNATURE-----

Merge tag 'vfs-6.16-rc1.async.dir' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs

Pull vfs directory lookup updates from Christian Brauner:
 "This contains cleanups for the lookup_one*() family of helpers.

  We expose a set of functions with names containing "lookup_one_len"
  and others without the "_len". This difference has nothing to do with
  "len". It's rater a historical accident that can be confusing.

  The functions without "_len" take a "mnt_idmap" pointer. This is found
  in the "vfsmount" and that is an important question when choosing
  which to use: do you have a vfsmount, or are you "inside" the
  filesystem. A related question is "is permission checking relevant
  here?".

  nfsd and cachefiles *do* have a vfsmount but *don't* use the non-_len
  functions. They pass nop_mnt_idmap and refuse to work on filesystems
  which have any other idmap.

  This work changes nfsd and cachefile to use the lookup_one family of
  functions and to explictily pass &nop_mnt_idmap which is consistent
  with all other vfs interfaces used where &nop_mnt_idmap is explicitly
  passed.

  The remaining uses of the "_one" functions do not require permission
  checks so these are renamed to be "_noperm" and the permission
  checking is removed.

  This series also changes these lookup function to take a qstr instead
  of separate name and len. In many cases this simplifies the call"

* tag 'vfs-6.16-rc1.async.dir' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs:
  VFS: change lookup_one_common and lookup_noperm_common to take a qstr
  Use try_lookup_noperm() instead of d_hash_and_lookup() outside of VFS
  VFS: rename lookup_one_len family to lookup_noperm and remove permission check
  cachefiles: Use lookup_one() rather than lookup_one_len()
  nfsd: Use lookup_one() rather than lookup_one_len()
  VFS: improve interface for lookup_one functions
2025-05-26 08:02:43 -07:00
John Johansen
b1f87be728 apparmor: Document that label must be last member in struct aa_profile
The label struct is variable length. While its use in struct aa_profile
is fixed length at 2 entries the variable length member needs to be
the last member in the structure.

The code already does this but the comment has it in the wrong location.
Also add a comment to ensure it stays at the end of the structure.

While we are at it, update the documentation for other profile members
as well.

Signed-off-by: John Johansen <john.johansen@canonical.com>
2025-05-25 20:15:01 -07:00
John Johansen
4c0dc425fd apparmor: make debug_values_table static
The debug_values_table is only referenced from lib.c so it should
be static.

Signed-off-by: John Johansen <john.johansen@canonical.com>
2025-05-25 20:15:01 -07:00
Ryan Lee
16916b17b4 apparmor: force auditing of conflicting attachment execs from confined
Conflicting attachment paths are an error state that result in the
binary in question executing under an unexpected ix/ux fallback. As such,
it should be audited to record the occurrence of conflicting attachments.

Signed-off-by: Ryan Lee <ryan.lee@canonical.com>
Signed-off-by: John Johansen <john.johansen@canonical.com>
2025-05-25 20:15:01 -07:00
Ryan Lee
b824b5f82b apparmor: include conflicting attachment info for confined ix/ux fallback
Instead of silently overwriting the conflicting profile attachment string,
include that information in the ix/ux fallback string that gets set as info
instead. Also add a warning print if some other info is set that would be
overwritten by the ix/ux fallback string or by the profile not found error.

Signed-off-by: Ryan Lee <ryan.lee@canonical.com>
Signed-off-by: John Johansen <john.johansen@canonical.com>
2025-05-25 20:15:01 -07:00
Ryan Lee
e76d733b1b apparmor: move the "conflicting profile attachments" infostr to a const declaration
Instead of having a literal, making this a constant will allow for (hacky)
detection of conflicting profile attachments from inspection of the info
pointer. This is used in the next patch to augment the information provided
through domain.c:x_to_label for ix/ux fallback.

Signed-off-by: Ryan Lee <ryan.lee@canonical.com>
Signed-off-by: John Johansen <john.johansen@canonical.com>
2025-05-25 20:15:01 -07:00
Ryan Lee
89a3561e69 apparmor: force audit on unconfined exec if info is set by find_attach
find_attach may set info if something unusual happens during that process
(currently only used to signal conflicting attachments, but this could be
expanded in the future). This is information that should be propagated to
userspace via an audit message.

Signed-off-by: Ryan Lee <ryan.lee@canonical.com>
Signed-off-by: John Johansen <john.johansen@canonical.com>
2025-05-25 20:15:01 -07:00
Ryan Lee
95ff118958 apparmor: make all generated string array headers const char *const
address_family_names and sock_type_names were created as const char *a[],
which declares them as (non-const) pointers to const chars. Since the
pointers themselves would not be changed, they should be generated as
const char *const a[].

Signed-off-by: Ryan Lee <ryan.lee@canonical.com>
Signed-off-by: John Johansen <john.johansen@canonical.com>
2025-05-25 20:15:01 -07:00
Ryan Lee
a88db916b8 apparmor: fix loop detection used in conflicting attachment resolution
Conflicting attachment resolution is based on the number of states
traversed to reach an accepting state in the attachment DFA, accounting
for DFA loops traversed during the matching process. However, the loop
counting logic had multiple bugs:

 - The inc_wb_pos macro increments both position and length, but length
   is supposed to saturate upon hitting buffer capacity, instead of
   wrapping around.
 - If no revisited state is found when traversing the history, is_loop
   would still return true, as if there was a loop found the length of
   the history buffer, instead of returning false and signalling that
   no loop was found. As a result, the adjustment step of
   aa_dfa_leftmatch would sometimes produce negative counts with loop-
   free DFAs that traversed enough states.
 - The iteration in the is_loop for loop is supposed to stop before
   i = wb->len, so the conditional should be < instead of <=.

This patch fixes the above bugs as well as the following nits:
 - The count and size fields in struct match_workbuf were not used,
   so they can be removed.
 - The history buffer in match_workbuf semantically stores aa_state_t
   and not unsigned ints, even if aa_state_t is currently unsigned int.
 - The local variables in is_loop are counters, and thus should be
   unsigned ints instead of aa_state_t's.

Fixes: 21f6066105 ("apparmor: improve overlapping domain attachment resolution")

Signed-off-by: Ryan Lee <ryan.lee@canonical.com>
Co-developed-by: John Johansen <john.johansen@canonical.com>
Signed-off-by: John Johansen <john.johansen@canonical.com>
2025-05-25 20:14:53 -07:00
Jakub Kicinski
33e1b1b399 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Cross-merge networking fixes after downstream PR (net-6.15-rc8).

Conflicts:
  80f2ab46c2 ("irdma: free iwdev->rf after removing MSI-X")
  4bcc063939 ("ice, irdma: fix an off by one in error handling code")
  c24a65b6a2 ("iidc/ice/irdma: Update IDC to support multiple consumers")
https://lore.kernel.org/20250513130630.280ee6c5@canb.auug.org.au

No extra adjacent changes.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-05-22 09:42:41 -07:00
Randy Dunlap
4b59f4fd0a security/smack/smackfs: small kernel-doc fixes
Add function short descriptions to the kernel-doc where missing.
Correct a verb and add ending periods to sentences.

smackfs.c:1080: warning: missing initial short description on line:
 * smk_net4addr_insert
smackfs.c:1343: warning: missing initial short description on line:
 * smk_net6addr_insert

Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Cc: Casey Schaufler <casey@schaufler-ca.com>
Cc: linux-security-module@vger.kernel.org
Cc: Paul Moore <paul@paul-moore.com>
Cc: James Morris <jmorris@namei.org>
Cc: "Serge E. Hallyn" <serge@hallyn.com>
2025-05-19 16:28:32 -07:00
Ryan Lee
6c055e6256 apparmor: ensure WB_HISTORY_SIZE value is a power of 2
WB_HISTORY_SIZE was defined to be a value not a power of 2, despite a
comment in the declaration of struct match_workbuf stating it is and a
modular arithmetic usage in the inc_wb_pos macro assuming that it is. Bump
WB_HISTORY_SIZE's value up to 32 and add a BUILD_BUG_ON_NOT_POWER_OF_2
line to ensure that any future changes to the value of WB_HISTORY_SIZE
respect this requirement.

Fixes: 136db99485 ("apparmor: increase left match history buffer size")

Signed-off-by: Ryan Lee <ryan.lee@canonical.com>
Signed-off-by: John Johansen <john.johansen@canonical.com>
2025-05-17 18:20:10 -07:00
Randy Dunlap
a949b46e7d apparmor: fix some kernel-doc issues in header files
Fix kernel-doc warnings in apparmor header files as reported by
scripts/kernel-doc:

cred.h:128: warning: expecting prototype for end_label_crit_section(). Prototype was for end_current_label_crit_section() instead
file.h:108: warning: expecting prototype for aa_map_file_perms(). Prototype was for aa_map_file_to_perms() instead

lib.h:159: warning: Function parameter or struct member 'hname' not described in 'basename'
lib.h:159: warning: Excess function parameter 'name' description in 'basename'

match.h:21: warning: This comment starts with '/**', but isn't a kernel-doc comment. Refer Documentation/doc-guide/kernel-doc.rst
 * The format used for transition tables is based on the GNU flex table

perms.h:109: warning: Function parameter or struct member 'accum' not described in 'aa_perms_accum_raw'
perms.h:109: warning: Function parameter or struct member 'addend' not described in 'aa_perms_accum_raw'
perms.h:136: warning: Function parameter or struct member 'accum' not described in 'aa_perms_accum'
perms.h:136: warning: Function parameter or struct member 'addend' not described in 'aa_perms_accum'

Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Reviewed-by: Ryan Lee <ryan.lee@canonical.com>
Cc: John Johansen <john.johansen@canonical.com>
Cc: John Johansen <john@apparmor.net>
Cc: apparmor@lists.ubuntu.com
Cc: linux-security-module@vger.kernel.org
Cc: Paul Moore <paul@paul-moore.com>
Cc: James Morris <jmorris@namei.org>
Cc: "Serge E. Hallyn" <serge@hallyn.com>
Signed-off-by: John Johansen <john.johansen@canonical.com>
2025-05-17 01:52:25 -07:00
Colin Ian King
44fbeeb308 apparmor: Fix incorrect profile->signal range check
The check on profile->signal is always false, the value can never be
less than 1 *and* greater than MAXMAPPED_SIG. Fix this by replacing
the logical operator && with ||.

Fixes: 84c455decf ("apparmor: add support for profiles to define the kill signal")
Signed-off-by: Colin Ian King <colin.i.king@gmail.com>
Signed-off-by: John Johansen <john.johansen@canonical.com>
2025-05-17 01:49:20 -07:00
Eric Biggers
e9ed1eb8f6 apparmor: use SHA-256 library API instead of crypto_shash API
This user of SHA-256 does not support any other algorithm, so the
crypto_shash abstraction provides no value.  Just use the SHA-256
library API instead, which is much simpler and easier to use.

Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: John Johansen <john.johansen@canonical.com>
2025-05-17 01:42:01 -07:00
Zilin Guan
2b270e2f43 security/apparmor: use kfree_sensitive() in unpack_secmark()
The unpack_secmark() function currently uses kfree() to release memory
allocated for secmark structures and their labels. However, if a failure
occurs after partially parsing secmark, sensitive data may remain in
memory, posing a security risk.

To mitigate this, replace kfree() with kfree_sensitive() for freeing
secmark structures and their labels, aligning with the approach used
in free_ruleset().

I am submitting this as an RFC to seek freedback on whether this change
is appropriate and aligns with the subsystem's expectations. If
confirmed to be helpful, I will send a formal patch.

Signed-off-by: Zilin Guan <zilin@seu.edu.cn>
Signed-off-by: John Johansen <john.johansen@canonical.com>
2025-05-17 01:20:25 -07:00
Steven Chen
fe3aebf27d ima: do not copy measurement list to kdump kernel
Kdump kernel doesn't need IMA to do integrity measurement.
Hence the measurement list in 1st kernel doesn't need to be copied to
kdump kernel.

Here skip allocating buffer for measurement list copying if loading
kdump kernel. Then there won't be the later handling related to
ima_kexec_buffer.

Signed-off-by: Steven Chen <chenste@linux.microsoft.com>
Tested-by: Baoquan He <bhe@redhat.com>
Acked-by: Baoquan He <bhe@redhat.com>
Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>
2025-05-14 06:40:09 -04:00
Mickaël Salaün
3039ed4327
landlock: Improve bit operations in audit code
Use the BIT() and BIT_ULL() macros in the new audit code instead of
explicit shifts to improve readability.  Use bitmask instead of modulo
operation to simplify code.

Add test_range1_rand15() and test_range2_rand15() KUnit tests to improve
get_id_range() coverage.

Signed-off-by: Günther Noack <gnoack3000@gmail.com>
Link: https://lore.kernel.org/r/20250512093732.1408485-1-mic@digikod.net
Signed-off-by: Mickaël Salaün <mic@digikod.net>
2025-05-12 11:38:53 +02:00
Kees Cook
f0cd6012c4 Revert "hardening: Disable GCC randstruct for COMPILE_TEST"
This reverts commit f5c68a4e84.

It is again possible to build "allmodconfig" with the randstruct GCC
plugin, so enable it for COMPILE_TEST to catch future bugs.

Signed-off-by: Kees Cook <kees@kernel.org>
2025-05-08 09:42:40 -07:00
Mickaël Salaün
b1525d0a8d
landlock: Remove KUnit test that triggers a warning
A KUnit test checking boundaries triggers a canary warning, which may be
disturbing.  Let's remove this test for now.  Hopefully, KUnit will soon
get support for suppressing warning backtraces [1].

Cc: Alessandro Carminati <acarmina@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Günther Noack <gnoack@google.com>
Reported-by: Tingmao Wang <m@maowtm.org>
Closes: https://lore.kernel.org/r/20250327213807.12964-1-m@maowtm.org
Link: https://lore.kernel.org/r/20250425193249.78b45d2589575c15f483c3d8@linux-foundation.org [1]
Link: https://lore.kernel.org/r/20250503065359.3625407-1-mic@digikod.net
Signed-off-by: Mickaël Salaün <mic@digikod.net>
2025-05-03 08:55:42 +02:00
Jakub Kicinski
337079d31f Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Cross-merge networking fixes after downstream PR (net-6.15-rc5).

No conflicts or adjacent changes.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-05-01 15:11:38 -07:00
Arnd Bergmann
8530ea3c9b Kbuild: remove structleak gcc plugin
gcc-12 and higher support the -ftrivial-auto-var-init= flag, after
gcc-8 is the minimum version, this is half of the supported ones, and
the vast majority of the versions that users are actually likely to
have, so it seems like a good time to stop having the fallback
plugin implementation

Older toolchains are still able to build kernels normally without
this plugin, but won't be able to use variable initialization..

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2025-04-30 21:57:09 +02:00
Steven Chen
591683d394 ima: measure kexec load and exec events as critical data
The amount of memory allocated at kexec load, even with the extra memory
allocated, might not be large enough for the entire measurement list.  The
indeterminate interval between kexec 'load' and 'execute' could exacerbate
this problem.

Define two new IMA events, 'kexec_load' and 'kexec_execute', to be
measured as critical data at kexec 'load' and 'execute' respectively.
Report the allocated kexec segment size, IMA binary log size and the
runtime measurements count as part of those events.

These events, and the values reported through them, serve as markers in
the IMA log to verify the IMA events are captured during kexec soft
reboot.  The presence of a 'kexec_load' event in between the last two
'boot_aggregate' events in the IMA log implies this is a kexec soft
reboot, and not a cold-boot. And the absence of 'kexec_execute' event
after kexec soft reboot implies missing events in that window which
results in inconsistency with TPM PCR quotes, necessitating a cold boot
for a successful remote attestation.

These critical data events are displayed as hex encoded ascii in the
ascii_runtime_measurement_list.  Verifying the critical data hash requires
calculating the hash of the decoded ascii string.

For example, to verify the 'kexec_load' data hash:

sudo cat /sys/kernel/security/integrity/ima/ascii_runtime_measurements
| grep  kexec_load | cut -d' ' -f 6 | xxd -r -p | sha256sum

To verify the 'kexec_execute' data hash:

sudo cat /sys/kernel/security/integrity/ima/ascii_runtime_measurements
| grep kexec_execute | cut -d' ' -f 6 | xxd -r -p | sha256sum

Co-developed-by: Tushar Sugandhi <tusharsu@linux.microsoft.com>
Signed-off-by: Tushar Sugandhi <tusharsu@linux.microsoft.com>
Signed-off-by: Steven Chen <chenste@linux.microsoft.com>
Reviewed-by: Stefan Berger <stefanb@linux.ibm.com>
Acked-by: Baoquan He <bhe@redhat.com>
Tested-by: Stefan Berger <stefanb@linux.ibm.com> # ppc64/kvm
Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>
2025-04-29 15:54:54 -04:00
Steven Chen
0ad93987c9 ima: make the kexec extra memory configurable
The extra memory allocated for carrying the IMA measurement list across
kexec is hard-coded as half a PAGE.  Make it configurable.

Define a Kconfig option, IMA_KEXEC_EXTRA_MEMORY_KB, to configure the
extra memory (in kb) to be allocated for IMA measurements added during
kexec soft reboot.  Ensure the default value of the option is set such
that extra half a page of memory for additional measurements is allocated
for the additional measurements.

Update ima_add_kexec_buffer() function to allocate memory based on the
Kconfig option value, rather than the currently hard-coded one.

Suggested-by: Stefan Berger <stefanb@linux.ibm.com>
Co-developed-by: Tushar Sugandhi <tusharsu@linux.microsoft.com>
Signed-off-by: Tushar Sugandhi <tusharsu@linux.microsoft.com>
Signed-off-by: Steven Chen <chenste@linux.microsoft.com>
Reviewed-by: Stefan Berger <stefanb@linux.ibm.com>
Acked-by: Baoquan He <bhe@redhat.com>
Tested-by: Stefan Berger <stefanb@linux.ibm.com> # ppc64/kvm
Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>
2025-04-29 15:54:54 -04:00
Steven Chen
d0a00ce470 ima: verify if the segment size has changed
kexec 'load' may be called multiple times. Free and realloc the buffer
only if the segment_size is changed from the previous kexec 'load' call.

Signed-off-by: Steven Chen <chenste@linux.microsoft.com>
Reviewed-by: Stefan Berger <stefanb@linux.ibm.com>
Acked-by: Baoquan He <bhe@redhat.com>
Tested-by: Stefan Berger <stefanb@linux.ibm.com> # ppc64/kvm
Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>
2025-04-29 15:54:54 -04:00
Steven Chen
9f0ec4b16f ima: kexec: move IMA log copy from kexec load to execute
The IMA log is currently copied to the new kernel during kexec 'load' using
ima_dump_measurement_list(). However, the IMA measurement list copied at
kexec 'load' may result in loss of IMA measurements records that only
occurred after the kexec 'load'. Move the IMA measurement list log copy
from kexec 'load' to 'execute'

Make the kexec_segment_size variable a local static variable within the
file, so it can be accessed during both kexec 'load' and 'execute'.

Define kexec_post_load() as a wrapper for calling ima_kexec_post_load() and
machine_kexec_post_load().  Replace the existing direct call to
machine_kexec_post_load() with kexec_post_load().

When there is insufficient memory to copy all the measurement logs, copy as
much of the measurement list as possible.

Co-developed-by: Tushar Sugandhi <tusharsu@linux.microsoft.com>
Signed-off-by: Tushar Sugandhi <tusharsu@linux.microsoft.com>
Cc: Eric Biederman <ebiederm@xmission.com>
Cc: Baoquan He <bhe@redhat.com>
Cc: Vivek Goyal <vgoyal@redhat.com>
Cc: Dave Young <dyoung@redhat.com>
Signed-off-by: Steven Chen <chenste@linux.microsoft.com>
Tested-by: Stefan Berger <stefanb@linux.ibm.com> # ppc64/kvm
Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>
2025-04-29 15:54:54 -04:00
Steven Chen
f18e502db6 ima: kexec: define functions to copy IMA log at soft boot
The IMA log is currently copied to the new kernel during kexec 'load'
using ima_dump_measurement_list(). However, the log copied at kexec
'load' may result in loss of IMA measurements that only occurred after
kexec "load'. Setup the needed infrastructure to move the IMA log copy
from kexec 'load' to 'execute'.

Define a new IMA hook ima_update_kexec_buffer() as a stub function.
It will be used to call ima_dump_measurement_list() during kexec 'execute'.

Implement ima_kexec_post_load() function to be invoked after the new
Kernel image has been loaded for kexec. ima_kexec_post_load() maps the
IMA buffer to a segment in the newly loaded Kernel.  It also registers
the reboot notifier_block to trigger ima_update_kexec_buffer() at
kexec 'execute'.

Set the priority of register_reboot_notifier to INT_MIN to ensure that the
IMA log copy operation will happen at the end of the operation chain, so
that all the IMA measurement records extended into the TPM are copied

Co-developed-by: Tushar Sugandhi <tusharsu@linux.microsoft.com>
Signed-off-by: Tushar Sugandhi <tusharsu@linux.microsoft.com>
Cc: Eric Biederman <ebiederm@xmission.com>
Cc: Baoquan He <bhe@redhat.com>
Cc: Vivek Goyal <vgoyal@redhat.com>
Cc: Dave Young <dyoung@redhat.com>
Signed-off-by: Steven Chen <chenste@linux.microsoft.com>
Reviewed-by: Stefan Berger <stefanb@linux.ibm.com>
Acked-by: Baoquan He <bhe@redhat.com>
Tested-by: Stefan Berger <stefanb@linux.ibm.com> # ppc64/kvm
Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>
2025-04-29 15:54:54 -04:00
Steven Chen
9ee8888a80 ima: kexec: skip IMA segment validation after kexec soft reboot
Currently, the function kexec_calculate_store_digests() calculates and
stores the digest of the segment during the kexec_file_load syscall,
where the  IMA segment is also allocated.

Later, the IMA segment will be updated with the measurement log at the
kexec execute stage when a kexec reboot is initiated. Therefore, the
digests should be updated for the IMA segment in the  normal case. The
problem is that the content of memory segments carried over to the new
kernel during the kexec systemcall can be changed at kexec 'execute'
stage, but the size and the location of the memory segments cannot be
changed at kexec 'execute' stage.

To address this, skip the calculation and storage of the digest for the
IMA segment in kexec_calculate_store_digests() so that it is not added
to the purgatory_sha_regions.

With this change, the IMA segment is not included in the digest
calculation, storage, and verification.

Cc: Eric Biederman <ebiederm@xmission.com>
Cc: Baoquan He <bhe@redhat.com>
Cc: Vivek Goyal <vgoyal@redhat.com>
Cc: Dave Young <dyoung@redhat.com>
Co-developed-by: Tushar Sugandhi <tusharsu@linux.microsoft.com>
Signed-off-by: Tushar Sugandhi <tusharsu@linux.microsoft.com>
Signed-off-by: Steven Chen <chenste@linux.microsoft.com>
Reviewed-by: Stefan Berger <stefanb@linux.ibm.com>
Acked-by: Baoquan He <bhe@redhat.com>
Tested-by: Stefan Berger <stefanb@linux.ibm.com> # ppc64/kvm
[zohar@linux.ibm.com: Fixed Signed-off-by tag to match author's email ]
Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>
2025-04-29 15:54:53 -04:00
Steven Chen
c95e1acb6d ima: define and call ima_alloc_kexec_file_buf()
In the current implementation, the ima_dump_measurement_list() API is
called during the kexec "load" phase, where a buffer is allocated and
the measurement records are copied. Due to this, new events added after
kexec load but before kexec execute are not carried over to the new kernel
during kexec operation

Carrying the IMA measurement list across kexec requires allocating a
buffer and copying the measurement records.  Separate allocating the
buffer and copying the measurement records into separate functions in
order to allocate the buffer at kexec 'load' and copy the measurements
at kexec 'execute'.

After moving the vfree() here at this stage in the patch set, the IMA
measurement list fails to verify when doing two consecutive "kexec -s -l"
with/without a "kexec -s -u" in between.  Only after "ima: kexec: move
IMA log copy from kexec load to execute" the IMA measurement list verifies
properly with the vfree() here.

Co-developed-by: Tushar Sugandhi <tusharsu@linux.microsoft.com>
Signed-off-by: Tushar Sugandhi <tusharsu@linux.microsoft.com>
Signed-off-by: Steven Chen <chenste@linux.microsoft.com>
Reviewed-by: Stefan Berger <stefanb@linux.ibm.com>
Acked-by: Baoquan He <bhe@redhat.com>
Tested-by: Stefan Berger <stefanb@linux.ibm.com> # ppc64/kvm
Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>
2025-04-29 15:54:53 -04:00
Steven Chen
cb5052282c ima: rename variable the seq_file "file" to "ima_kexec_file"
Before making the function local seq_file "file" variable file static
global, rename it to "ima_kexec_file".

Signed-off-by: Steven Chen <chenste@linux.microsoft.com>
Acked-by: Baoquan He <bhe@redhat.com>
Tested-by: Stefan Berger <stefanb@linux.ibm.com> # ppc64/kvm
Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>
2025-04-29 15:54:53 -04:00
Linus Torvalds
30e268185e Landlock fix for v6.15-rc4
-----BEGIN PGP SIGNATURE-----
 
 iIYEABYKAC4WIQSVyBthFV4iTW/VU1/l49DojIL20gUCaAp+shAcbWljQGRpZ2lr
 b2QubmV0AAoJEOXj0OiMgvbSAukA/1FkHkyqXaiMYy2s4tlwvpjWMBP11gEUzyYM
 j1HIn3rOAQDUT1V2q+Ff5ZVk4hsCZ8pJMTmrASSL/CHw1yfQHtzQAw==
 =5RX1
 -----END PGP SIGNATURE-----

Merge tag 'landlock-6.15-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/mic/linux

Pull landlock fixes from Mickaël Salaün:
 "Fix some Landlock audit issues, add related tests, and updates
  documentation"

* tag 'landlock-6.15-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/mic/linux:
  landlock: Update log documentation
  landlock: Fix documentation for landlock_restrict_self(2)
  landlock: Fix documentation for landlock_create_ruleset(2)
  selftests/landlock: Add PID tests for audit records
  selftests/landlock: Factor out audit fixture in audit_test
  landlock: Log the TGID of the domain creator
  landlock: Remove incorrect warning
2025-04-24 12:59:05 -07:00
Jakub Kicinski
5565acd1e6 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Cross-merge networking fixes after downstream PR (net-6.15-rc4).

This pull includes wireless and a fix to vxlan which isn't
in Linus's tree just yet. The latter creates with a silent conflict
/ build breakage, so merging it now to avoid causing problems.

drivers/net/vxlan/vxlan_vnifilter.c
  094adad913 ("vxlan: Use a single lock to protect the FDB table")
  087a9eb9e5 ("vxlan: vnifilter: Fix unlocked deletion of default FDB entry")
https://lore.kernel.org/20250423145131.513029-1-idosch@nvidia.com

No "normal" conflicts, or adjacent changes.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-04-24 11:20:52 -07:00
Song Liu
74e5b13a1b lsm: Move security_netlink_send to under CONFIG_SECURITY_NETWORK
security_netlink_send() is a networking hook, so it fits better under
CONFIG_SECURITY_NETWORK.

Signed-off-by: Song Liu <song@kernel.org>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2025-04-22 15:34:58 -04:00
Frederick Lawler
30d68cb0c3 ima: process_measurement() needlessly takes inode_lock() on MAY_READ
On IMA policy update, if a measure rule exists in the policy,
IMA_MEASURE is set for ima_policy_flags which makes the violation_check
variable always true. Coupled with a no-action on MAY_READ for a
FILE_CHECK call, we're always taking the inode_lock().

This becomes a performance problem for extremely heavy read-only workloads.
Therefore, prevent this only in the case there's no action to be taken.

Signed-off-by: Frederick Lawler <fred@cloudflare.com>
Acked-by: Roberto Sassu <roberto.sassu@huawei.com>
Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>
2025-04-22 16:39:32 +02:00
Mickaël Salaün
25b1fc1cdc
landlock: Fix documentation for landlock_restrict_self(2)
Fix, deduplicate, and improve rendering of landlock_restrict_self(2)'s
flags documentation.

The flags are now rendered like the syscall's parameters and
description.

Cc: Günther Noack <gnoack@google.com>
Cc: Paul Moore <paul@paul-moore.com>
Link: https://lore.kernel.org/r/20250416154716.1799902-2-mic@digikod.net
Signed-off-by: Mickaël Salaün <mic@digikod.net>
2025-04-17 11:09:10 +02:00
Mickaël Salaün
50492f942c
landlock: Fix documentation for landlock_create_ruleset(2)
Move and fix the flags documentation, and improve formatting.

It makes more sense and it eases maintenance to document syscall flags
in landlock.h, where they are defined.  This is already the case for
landlock_restrict_self(2)'s flags.

The flags are now rendered like the syscall's parameters and
description.

Cc: Günther Noack <gnoack@google.com>
Cc: Paul Moore <paul@paul-moore.com>
Link: https://lore.kernel.org/r/20250416154716.1799902-1-mic@digikod.net
Signed-off-by: Mickaël Salaün <mic@digikod.net>
2025-04-17 11:09:07 +02:00
Kees Cook
f5c68a4e84 hardening: Disable GCC randstruct for COMPILE_TEST
There is a GCC crash bug in the randstruct for latest GCC versions that
is being tickled by landlock[1]. Temporarily disable GCC randstruct for
COMPILE_TEST builds to unbreak CI systems for the coming -rc2. This can
be restored once the bug is fixed.

Suggested-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/all/20250407-kbuild-disable-gcc-plugins-v1-1-5d46ae583f5e@kernel.org/ [1]
Acked-by: Mark Brown <broonie@kernel.org>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Link: https://lore.kernel.org/r/20250409151154.work.872-kees@kernel.org
Signed-off-by: Kees Cook <kees@kernel.org>
2025-04-15 13:50:17 -07:00
Paul Moore
05f1a93922 selinux: fix the kdoc header for task_avdcache_update
The kdoc header incorrectly references an older parameter, update it
to reference what is currently used in the function.

Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202504122308.Ch8PzJdD-lkp@intel.com/
Signed-off-by: Paul Moore <paul@paul-moore.com>
2025-04-12 11:37:06 -04:00
Paul Moore
1ec31f14a8 selinux: remove a duplicated include
The "linux/parser.h" header was included twice, we only need it once.

Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202504121945.Q0GDD0sG-lkp@intel.com/
Signed-off-by: Paul Moore <paul@paul-moore.com>
2025-04-12 11:31:46 -04:00
Kuniyuki Iwashima
2a63dd0edf net: Retire DCCP socket.
DCCP was orphaned in 2021 by commit 054c4610bd ("MAINTAINERS: dccp:
move Gerrit Renker to CREDITS"), which noted that the last maintainer
had been inactive for five years.

In recent years, it has become a playground for syzbot, and most changes
to DCCP have been odd bug fixes triggered by syzbot.  Apart from that,
the only changes have been driven by treewide or networking API updates
or adjustments related to TCP.

Thus, in 2023, we announced we would remove DCCP in 2025 via commit
b144fcaf46 ("dccp: Print deprecation notice.").

Since then, only one individual has contacted the netdev mailing list. [0]

There is ongoing research for Multipath DCCP.  The repository is hosted
on GitHub [1], and development is not taking place through the upstream
community.  While the repository is published under the GPLv2 license,
the scheduling part remains proprietary, with a LICENSE file [2] stating:

  "This is not Open Source software."

The researcher mentioned a plan to address the licensing issue, upstream
the patches, and step up as a maintainer, but there has been no further
communication since then.

Maintaining DCCP for a decade without any real users has become a burden.

Therefore, it's time to remove it.

Removing DCCP will also provide significant benefits to TCP.  It allows
us to freely reorganize the layout of struct inet_connection_sock, which
is currently shared with DCCP, and optimize it to reduce the number of
cachelines accessed in the TCP fast path.

Note that we keep DCCP netfilter modules as requested.  [3]

Link: https://lore.kernel.org/netdev/20230710182253.81446-1-kuniyu@amazon.com/T/#u #[0]
Link: https://github.com/telekom/mp-dccp #[1]
Link: https://github.com/telekom/mp-dccp/blob/mpdccp_v03_k5.10/net/dccp/non_gpl_scheduler/LICENSE #[2]
Link: https://lore.kernel.org/netdev/Z_VQ0KlCRkqYWXa-@calendula/ #[3]
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Acked-by: Paul Moore <paul@paul-moore.com> (LSM and SELinux)
Acked-by: Casey Schaufler <casey@schaufler-ca.com>
Link: https://patch.msgid.link/20250410023921.11307-3-kuniyu@amazon.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-04-11 18:58:10 -07:00
Paul Moore
5d7ddc59b3 selinux: reduce path walk overhead
Reduce the SELinux performance overhead during path walks through the
use of a per-task directory access cache and some minor code
optimizations.  The directory access cache is per-task because it allows
for a lockless cache while also fitting well with a common application
pattern of heavily accessing a relatively small number of SELinux
directory labels.  The cache is inherited by child processes when the
child runs with the same SELinux domain as the parent, and invalidated
on changes to the task's SELinux domain or the loaded SELinux policy.
A cache of four entries was chosen based on testing with the Fedora
"targeted" policy, a SELinux Reference Policy variant, and
'make allmodconfig' on Linux v6.14.

Code optimizations include better use of inline functions to reduce
function calls in the common case, especially in the inode revalidation
code paths, and elimination of redundant checks between the LSM and
SELinux layers.

As mentioned briefly above, aside from general use and regression
testing with the selinux-testsuite, performance was measured using
'make allmodconfig' with Linux v6.14 as a base reference.  As expected,
there were variations from one test run to another, but the measurements
below are a good representation of the test results seen on my test
system.

 * Linux v6.14
   REF
     1.26%  [k] __d_lookup_rcu
   SELINUX (1.31%)
     0.58%  [k] selinux_inode_permission
     0.29%  [k] avc_lookup
     0.25%  [k] avc_has_perm_noaudit
     0.19%  [k] __inode_security_revalidate

 * Linux v6.14 + patch
   REF
     1.41%  [k] __d_lookup_rcu
   SELINUX (0.89%)
     0.65%  [k] selinux_inode_permission
     0.15%  [k] avc_lookup
     0.05%  [k] avc_has_perm_noaudit
     0.04%  [k] avc_policy_seqno
     X.XX%  [k] __inode_security_revalidate (now inline)

In both cases the __d_lookup_rcu() function was used as a reference
point to establish a context for the SELinux related functions.  On a
unpatched Linux v6.14 system we see the time spent in the combined
SELinux functions exceeded that of __d_lookup_rcu(), 1.31% compared to
1.26%.  However, with this patch applied the time spent in the combined
SELinux functions dropped to roughly 65% of the time spent in
__d_lookup_rcu(), 0.89% compared to 1.41%.  Aside from the significant
decrease in time spent in the SELinux AVC, it appears that any additional
time spent searching and updating the cache is offset by other code
improvements, e.g. time spent in selinux_inode_permission() +
__inode_security_revalidate() + avc_policy_seqno() is less on the
patched kernel than the unpatched kernel.

It is worth noting that in this patch the use of the per-task cache is
limited to the security_inode_permission() LSM callback,
selinux_inode_permission(), but future work could expand the cache into
inode_has_perm(), likely through consolidation of the two functions.
While this would likely have little to no impact on path walks, it
may benefit other operations.

Reviewed-by: Stephen Smalley <stephen.smalley.work@gmail.com>
Tested-by: Stephen Smalley <stephen.smalley.work@gmail.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2025-04-11 16:41:31 -04:00
Takaya Saeki
8716451a4e selinux: support wildcard match in genfscon
Currently, genfscon only supports string prefix match to label files.
Thus, labeling numerous dynamic sysfs entries requires many specific
path rules. For example, labeling device paths such as
`/sys/devices/pci0000:00/0000:00:03.1/<...>/0000:04:00.1/wakeup`
requires listing all specific PCI paths, which is challenging to
maintain. While user-space restorecon can handle these paths with
regular expression rules, relabeling thousands of paths under sysfs
after it is mounted is inefficient compared to using genfscon.

This commit adds wildcard matching to genfscon to make rules more
efficient and expressive. This new behavior is enabled by
genfs_seclabel_wildcard capability. With this capability, genfscon does
wildcard matching instead of prefix matching. When multiple wildcard
rules match against a path, then the longest rule (determined by the
length of the rule string) will be applied. If multiple rules of the
same length match, the first matching rule encountered in the given
genfscon policy will be applied. Users are encouraged to write longer,
more explicit path rules to avoid relying on this behavior.

This change resulted in nice real-world performance improvements. For
example, boot times on test Android devices were reduced by 15%. This
improvement is due to the elimination of the "restorecon -R /sys" step
during boot, which takes more than two seconds in the worst case.

Signed-off-by: Takaya Saeki <takayas@chromium.org>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2025-04-11 16:36:34 -04:00
Christian Göttsche
4926c3fd83 selinux: drop copy-paste comment
Port labeling is based on port number and protocol (TCP/UDP/...) but not
based on network family (IPv4/IPv6).

Signed-off-by: Christian Göttsche <cgzones@googlemail.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2025-04-11 16:32:07 -04:00
Christian Göttsche
cde3b1b66f selinux: unify OOM handling in network hashtables
For network objects, like interfaces, nodes, port and InfiniBands, the
object to SID lookup is cached in hashtables.  OOM during such hashtable
additions of new objects is considered non-fatal and the computed SID is
simply returned without adding the compute result into the hash table.

Actually ignore OOM in the InfiniBand code, despite the comment already
suggesting to do so.  This reverts commit c350f8bea2 ("selinux: Fix
error return code in sel_ib_pkey_sid_slow()").

Add comments in the other places.

Use kmalloc() instead of kzalloc(), since all members are initialized on
success and the data is only used in internbal hash tables, so no risk
of information leakage to userspace.

Fixes: c350f8bea2 ("selinux: Fix error return code in sel_ib_pkey_sid_slow()")
Signed-off-by: Christian Göttsche <cgzones@googlemail.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2025-04-11 16:29:51 -04:00
Christian Göttsche
e6fb56b225 selinux: add likely hints for fast paths
In the network hashtable lookup code add likely() compiler hints in the
fast path, like already done in sel_netif_sid().

Signed-off-by: Christian Göttsche <cgzones@googlemail.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2025-04-11 16:29:51 -04:00
Christian Göttsche
9cc034be10 selinux: contify network namespace pointer
The network namespace is not modified.

Signed-off-by: Christian Göttsche <cgzones@googlemail.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2025-04-11 16:29:51 -04:00
Christian Göttsche
47a1a15645 selinux: constify network address pointer
The network address, either an IPv4 or IPv6 one, is not modified.

Signed-off-by: Christian Göttsche <cgzones@googlemail.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2025-04-11 16:29:50 -04:00
Mickaël Salaün
4767af82a0
landlock: Log the TGID of the domain creator
As for other Audit's "pid" fields, Landlock should use the task's TGID
instead of its TID.  Fix this issue by keeping a reference to the TGID
of the domain creator.

Existing tests already check for the PID but only with the thread group
leader, so always the TGID.  A following patch adds dedicated tests for
non-leader thread.

Remove the current_real_cred() check which does not make sense because
we only reference a struct pid, whereas a previous version did reference
a struct cred instead.

Cc: Christian Brauner <brauner@kernel.org>
Cc: Paul Moore <paul@paul-moore.com>
Reviewed-by: Günther Noack <gnoack3000@gmail.com>
Link: https://lore.kernel.org/r/20250410171725.1265860-1-mic@digikod.net
Signed-off-by: Mickaël Salaün <mic@digikod.net>
2025-04-11 12:53:17 +02:00
Mickaël Salaün
fe81536af3
landlock: Remove incorrect warning
landlock_put_hierarchy() can be called when an error occurs in
landlock_merge_ruleset() due to insufficient memory.  In this case, the
domain's audit details might not have been allocated yet, which would
cause landlock_free_hierarchy_details() to print a warning (but still
safely handle this case).

We could keep the WARN_ON_ONCE(!hierarchy) but it's not worth it for
this kind of function, so let's remove it entirely.

Cc: Paul Moore <paul@paul-moore.com>
Reported-by: syzbot+8bca99e91de7e060e4ea@syzkaller.appspotmail.com
Link: https://lore.kernel.org/r/20250331104709.897062-1-mic@digikod.net
Reviewed-by: Günther Noack <gnoack@google.com>
Signed-off-by: Mickaël Salaün <mic@digikod.net>
2025-04-08 19:18:20 +02:00
NeilBrown
06c567403a
Use try_lookup_noperm() instead of d_hash_and_lookup() outside of VFS
try_lookup_noperm() and d_hash_and_lookup() are nearly identical.  The
former does some validation of the name where the latter doesn't.
Outside of the VFS that validation is likely valuable, and having only
one exported function for this task is certainly a good idea.

So make d_hash_and_lookup() local to VFS files and change all other
callers to try_lookup_noperm().  Note that the arguments are swapped.

Signed-off-by: NeilBrown <neilb@suse.de>
Link: https://lore.kernel.org/r/20250319031545.2999807-6-neil@brown.name
Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-04-08 11:24:41 +02:00
NeilBrown
fa6fe07d15
VFS: rename lookup_one_len family to lookup_noperm and remove permission check
The lookup_one_len family of functions is (now) only used internally by
a filesystem on itself either
- in a context where permission checking is irrelevant such as by a
  virtual filesystem populating itself, or xfs accessing its ORPHANAGE
  or dquota accessing the quota file; or
- in a context where a permission check (MAY_EXEC on the parent) has just
  been performed such as a network filesystem finding in "silly-rename"
  file in the same directory.  This is also the context after the
  _parentat() functions where currently lookup_one_qstr_excl() is used.

So the permission check is pointless.

The name "one_len" is unhelpful in understanding the purpose of these
functions and should be changed.  Most of the callers pass the len as
"strlen()" so using a qstr and QSTR() can simplify the code.

This patch renames these functions (include lookup_positive_unlocked()
which is part of the family despite the name) to have a name based on
"lookup_noperm".  They are changed to receive a 'struct qstr' instead
of separate name and len.  In a few cases the use of QSTR() results in a
new call to strlen().

try_lookup_noperm() takes a pointer to a qstr instead of the whole
qstr.  This is consistent with d_hash_and_lookup() (which is nearly
identical) and useful for lookup_noperm_unlocked().

The new lookup_noperm_common() doesn't take a qstr yet.  That will be
tidied up in a subsequent patch.

Signed-off-by: NeilBrown <neil@brown.name>
Link: https://lore.kernel.org/r/20250319031545.2999807-5-neil@brown.name
Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-04-08 11:24:36 +02:00
Jeff Xu
5796d3967c mseal sysmap: kernel config and header change
Patch series "mseal system mappings", v9.

As discussed during mseal() upstream process [1], mseal() protects the
VMAs of a given virtual memory range against modifications, such as the
read/write (RW) and no-execute (NX) bits.  For complete descriptions of
memory sealing, please see mseal.rst [2].

The mseal() is useful to mitigate memory corruption issues where a
corrupted pointer is passed to a memory management system.  For example,
such an attacker primitive can break control-flow integrity guarantees
since read-only memory that is supposed to be trusted can become writable
or .text pages can get remapped.

The system mappings are readonly only, memory sealing can protect them
from ever changing to writable or unmmap/remapped as different attributes.

System mappings such as vdso, vvar, vvar_vclock, vectors (arm
compat-mode), sigpage (arm compat-mode), are created by the kernel during
program initialization, and could be sealed after creation.

Unlike the aforementioned mappings, the uprobe mapping is not established
during program startup.  However, its lifetime is the same as the
process's lifetime [3].  It could be sealed from creation.

The vsyscall on x86-64 uses a special address (0xffffffffff600000), which
is outside the mm managed range.  This means mprotect, munmap, and mremap
won't work on the vsyscall.  Since sealing doesn't enhance the vsyscall's
security, it is skipped in this patch.  If we ever seal the vsyscall, it
is probably only for decorative purpose, i.e.  showing the 'sl' flag in
the /proc/pid/smaps.  For this patch, it is ignored.

It is important to note that the CHECKPOINT_RESTORE feature (CRIU) may
alter the system mappings during restore operations.  UML(User Mode Linux)
and gVisor, rr are also known to change the vdso/vvar mappings. 
Consequently, this feature cannot be universally enabled across all
systems.  As such, CONFIG_MSEAL_SYSTEM_MAPPINGS is disabled by default.

To support mseal of system mappings, architectures must define
CONFIG_ARCH_SUPPORTS_MSEAL_SYSTEM_MAPPINGS and update their special
mappings calls to pass mseal flag.  Additionally, architectures must
confirm they do not unmap/remap system mappings during the process
lifetime.  The existence of this flag for an architecture implies that it
does not require the remapping of thest system mappings during process
lifetime, so sealing these mappings is safe from a kernel perspective.

This version covers x86-64 and arm64 archiecture as minimum viable feature.

While no specific CPU hardware features are required for enable this
feature on an archiecture, memory sealing requires a 64-bit kernel.  Other
architectures can choose whether or not to adopt this feature.  Currently,
I'm not aware of any instances in the kernel code that actively
munmap/mremap a system mapping without a request from userspace.  The PPC
does call munmap when _install_special_mapping fails for vdso; however,
it's uncertain if this will ever fail for PPC - this needs to be
investigated by PPC in the future [4].  The UML kernel can add this
support when KUnit tests require it [5].

In this version, we've improved the handling of system mapping sealing
from previous versions, instead of modifying the _install_special_mapping
function itself, which would affect all architectures, we now call
_install_special_mapping with a sealing flag only within the specific
architecture that requires it.  This targeted approach offers two key
advantages: 1) It limits the code change's impact to the necessary
architectures, and 2) It aligns with the software architecture by keeping
the core memory management within the mm layer, while delegating the
decision of sealing system mappings to the individual architecture, which
is particularly relevant since 32-bit architectures never require sealing.

Prior to this patch series, we explored sealing special mappings from
userspace using glibc's dynamic linker.  This approach revealed several
issues:

- The PT_LOAD header may report an incorrect length for vdso, (smaller
  than its actual size).  The dynamic linker, which relies on PT_LOAD
  information to determine mapping size, would then split and partially
  seal the vdso mapping.  Since each architecture has its own vdso/vvar
  code, fixing this in the kernel would require going through each
  archiecture.  Our initial goal was to enable sealing readonly mappings,
  e.g.  .text, across all architectures, sealing vdso from kernel since
  creation appears to be simpler than sealing vdso at glibc.

- The [vvar] mapping header only contains address information, not
  length information.  Similar issues might exist for other special
  mappings.

- Mappings like uprobe are not covered by the dynamic linker, and there
  is no effective solution for them.

This feature's security enhancements will benefit ChromeOS, Android, and
other high security systems.

Testing:
This feature was tested on ChromeOS and Android for both x86-64 and ARM64.
- Enable sealing and verify vdso/vvar, sigpage, vector are sealed properly,
  i.e. "sl" shown in the smaps for those mappings, and mremap is blocked.
- Passing various automation tests (e.g. pre-checkin) on ChromeOS and
  Android to ensure the sealing doesn't affect the functionality of
  Chromebook and Android phone.

I also tested the feature on Ubuntu on x86-64:
- With config disabled, vdso/vvar is not sealed,
- with config enabled, vdso/vvar is sealed, and booting up Ubuntu is OK,
  normal operations such as browsing the web, open/edit doc are OK.

Link: https://lore.kernel.org/all/20240415163527.626541-1-jeffxu@chromium.org/ [1]
Link: Documentation/userspace-api/mseal.rst [2]
Link: https://lore.kernel.org/all/CABi2SkU9BRUnqf70-nksuMCQ+yyiWjo3fM4XkRkL-NrCZxYAyg@mail.gmail.com/ [3]
Link: https://lore.kernel.org/all/CABi2SkV6JJwJeviDLsq9N4ONvQ=EFANsiWkgiEOjyT9TQSt+HA@mail.gmail.com/ [4]
Link: https://lore.kernel.org/all/202502251035.239B85A93@keescook/ [5]


This patch (of 7):

Provide infrastructure to mseal system mappings.  Establish two kernel
configs (CONFIG_MSEAL_SYSTEM_MAPPINGS,
ARCH_SUPPORTS_MSEAL_SYSTEM_MAPPINGS) and VM_SEALED_SYSMAP macro for future
patches.

Link: https://lkml.kernel.org/r/20250305021711.3867874-1-jeffxu@google.com
Link: https://lkml.kernel.org/r/20250305021711.3867874-2-jeffxu@google.com
Signed-off-by: Jeff Xu <jeffxu@chromium.org>
Reviewed-by: Kees Cook <kees@kernel.org>
Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Reviewed-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Cc: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Cc: Alexander Mikhalitsyn <aleksandr.mikhalitsyn@canonical.com>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Cc: Andrei Vagin <avagin@gmail.com>
Cc: Anna-Maria Behnsen <anna-maria@linutronix.de>
Cc: Ard Biesheuvel <ardb@kernel.org>
Cc: Benjamin Berg <benjamin@sipsolutions.net>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: David Rientjes <rientjes@google.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Elliot Hughes <enh@google.com>
Cc: Florian Faineli <f.fainelli@gmail.com>
Cc: Greg Ungerer <gerg@kernel.org>
Cc: Guenter Roeck <groeck@chromium.org>
Cc: Heiko Carstens <hca@linux.ibm.com>
Cc: Helge Deller <deller@gmx.de>
Cc: Hyeonggon Yoo <42.hyeyoo@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jann Horn <jannh@google.com>
Cc: Jason A. Donenfeld <jason@zx2c4.com>
Cc: Johannes Berg <johannes@sipsolutions.net>
Cc: Jorge Lucangeli Obes <jorgelo@chromium.org>
Cc: Linus Waleij <linus.walleij@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Matthew Wilcow (Oracle) <willy@infradead.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Miguel Ojeda <ojeda@kernel.org>
Cc: Mike Rapoport <mike.rapoport@gmail.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Pedro Falcato <pedro.falcato@gmail.com>
Cc: Peter Xu <peterx@redhat.com>
Cc: Randy Dunlap <rdunlap@infradead.org>
Cc: Stephen Röttger <sroettger@google.com>
Cc: Thomas Weißschuh <thomas.weissschuh@linutronix.de>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-04-01 15:17:14 -07:00
Linus Torvalds
2cd5769fb0 Driver core updates for 6.15-rc1
Here is the big set of driver core updates for 6.15-rc1.  Lots of stuff
 happened this development cycle, including:
   - kernfs scaling changes to make it even faster thanks to rcu
   - bin_attribute constify work in many subsystems
   - faux bus minor tweaks for the rust bindings
   - rust binding updates for driver core, pci, and platform busses,
     making more functionaliy available to rust drivers.  These are all
     due to people actually trying to use the bindings that were in 6.14.
   - make Rafael and Danilo full co-maintainers of the driver core
     codebase
   - other minor fixes and updates.
 
 This has been in linux-next for a while now, with the only reported
 issue being some merge conflicts with the rust tree.  Depending on which
 tree you pull first, you will have conflicts in one of them.  The merge
 resolution has been in linux-next as an example of what to do, or can be
 found here:
 	https://lore.kernel.org/r/CANiq72n3Xe8JcnEjirDhCwQgvWoE65dddWecXnfdnbrmuah-RQ@mail.gmail.com
 
 Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
 -----BEGIN PGP SIGNATURE-----
 
 iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCZ+mMrg8cZ3JlZ0Brcm9h
 aC5jb20ACgkQMUfUDdst+ylRgwCdH58OE3BgL0uoFY5vFImStpmPtqUAoL5HpVWI
 jtbJ+UuXGsnmO+JVNBEv
 =gy6W
 -----END PGP SIGNATURE-----

Merge tag 'driver-core-6.15-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core

Pull driver core updatesk from Greg KH:
 "Here is the big set of driver core updates for 6.15-rc1. Lots of stuff
  happened this development cycle, including:

   - kernfs scaling changes to make it even faster thanks to rcu

   - bin_attribute constify work in many subsystems

   - faux bus minor tweaks for the rust bindings

   - rust binding updates for driver core, pci, and platform busses,
     making more functionaliy available to rust drivers. These are all
     due to people actually trying to use the bindings that were in
     6.14.

   - make Rafael and Danilo full co-maintainers of the driver core
     codebase

   - other minor fixes and updates"

* tag 'driver-core-6.15-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: (52 commits)
  rust: platform: require Send for Driver trait implementers
  rust: pci: require Send for Driver trait implementers
  rust: platform: impl Send + Sync for platform::Device
  rust: pci: impl Send + Sync for pci::Device
  rust: platform: fix unrestricted &mut platform::Device
  rust: pci: fix unrestricted &mut pci::Device
  rust: device: implement device context marker
  rust: pci: use to_result() in enable_device_mem()
  MAINTAINERS: driver core: mark Rafael and Danilo as co-maintainers
  rust/kernel/faux: mark Registration methods inline
  driver core: faux: only create the device if probe() succeeds
  rust/faux: Add missing parent argument to Registration::new()
  rust/faux: Drop #[repr(transparent)] from faux::Registration
  rust: io: fix devres test with new io accessor functions
  rust: io: rename `io::Io` accessors
  kernfs: Move dput() outside of the RCU section.
  efi: rci2: mark bin_attribute as __ro_after_init
  rapidio: constify 'struct bin_attribute'
  firmware: qemu_fw_cfg: constify 'struct bin_attribute'
  powerpc/perf/hv-24x7: Constify 'struct bin_attribute'
  ...
2025-04-01 11:02:03 -07:00
Linus Torvalds
fa593d0f96 bpf-next-6.15
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEE+soXsSLHKoYyzcli6rmadz2vbToFAmfi6ZAACgkQ6rmadz2v
 bTpLOg/+J7xUddPMhlpFAUlifQEadE5hmw6v1tXpM3zyKHzUWJiv/qsx3j8/ckgD
 D+d4P8bqIbI9SSuIS4oZ0+D9pr/g7GYztnoYZmPiYJ7v2AijPuof5dsagFQE8E2y
 rhfbt9KHTMzzkdkTvaAZaITS/HWAoJ2YVRB6gfLex2ghcXYHcgmtKRZniQrbBiFZ
 MIXBN8Rg6HP+pUdIVllSXFcQCb3XIgjPONRAos4hr5tIm+3Ku7Jvkgk2H/9vUcoF
 bdXAcg8xygyH7eY+1l3e7nEPQlG0jUZEsL+tq+vpdoLRLqlIpAUYmwUvqcmq4dPS
 QGFjiUcpDbXlxsUFpzjXHIFto7fXCfND7HEICQPwAncdflIIfYaATSQUfkEexn0a
 wBCFlAChrEzAmg2vFl4EeEr0fdSe/3jswrgKx0m6ctKieMjgloBUeeH4fXOpfkhS
 9tvhuduVFuronlebM8ew4w9T/mBgbyxkE5KkvP4hNeB3ni3N0K6Mary5/u2HyN1e
 lqTlnZxRA4p6lrvxce/mDrR4VSwlKLcSeQVjxAL1afD5KRkuZJnUv7bUhS361vkG
 IjNrQX30EisDAz+X7tMn3ndBf9vVatwFT4+c3yaxlQRor1WofhDfT88HPiyB4QqQ
 Kdx2EHgbQxJp4vkzhp4/OXlTfkihsMEn8egzZuphdPEQ9Y+Jdwg=
 =aN/V
 -----END PGP SIGNATURE-----

Merge tag 'bpf-next-6.15' of git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next

Pull bpf updates from Alexei Starovoitov:
 "For this merge window we're splitting BPF pull request into three for
  higher visibility: main changes, res_spin_lock, try_alloc_pages.

  These are the main BPF changes:

   - Add DFA-based live registers analysis to improve verification of
     programs with loops (Eduard Zingerman)

   - Introduce load_acquire and store_release BPF instructions and add
     x86, arm64 JIT support (Peilin Ye)

   - Fix loop detection logic in the verifier (Eduard Zingerman)

   - Drop unnecesary lock in bpf_map_inc_not_zero() (Eric Dumazet)

   - Add kfunc for populating cpumask bits (Emil Tsalapatis)

   - Convert various shell based tests to selftests/bpf/test_progs
     format (Bastien Curutchet)

   - Allow passing referenced kptrs into struct_ops callbacks (Amery
     Hung)

   - Add a flag to LSM bpf hook to facilitate bpf program signing
     (Blaise Boscaccy)

   - Track arena arguments in kfuncs (Ihor Solodrai)

   - Add copy_remote_vm_str() helper for reading strings from remote VM
     and bpf_copy_from_user_task_str() kfunc (Jordan Rome)

   - Add support for timed may_goto instruction (Kumar Kartikeya
     Dwivedi)

   - Allow bpf_get_netns_cookie() int cgroup_skb programs (Mahe Tardy)

   - Reduce bpf_cgrp_storage_busy false positives when accessing cgroup
     local storage (Martin KaFai Lau)

   - Introduce bpf_dynptr_copy() kfunc (Mykyta Yatsenko)

   - Allow retrieving BTF data with BTF token (Mykyta Yatsenko)

   - Add BPF kfuncs to set and get xattrs with 'security.bpf.' prefix
     (Song Liu)

   - Reject attaching programs to noreturn functions (Yafang Shao)

   - Introduce pre-order traversal of cgroup bpf programs (Yonghong
     Song)"

* tag 'bpf-next-6.15' of git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: (186 commits)
  selftests/bpf: Add selftests for load-acquire/store-release when register number is invalid
  bpf: Fix out-of-bounds read in check_atomic_load/store()
  libbpf: Add namespace for errstr making it libbpf_errstr
  bpf: Add struct_ops context information to struct bpf_prog_aux
  selftests/bpf: Sanitize pointer prior fclose()
  selftests/bpf: Migrate test_xdp_vlan.sh into test_progs
  selftests/bpf: test_xdp_vlan: Rename BPF sections
  bpf: clarify a misleading verifier error message
  selftests/bpf: Add selftest for attaching fexit to __noreturn functions
  bpf: Reject attaching fexit/fmod_ret to __noreturn functions
  bpf: Only fails the busy counter check in bpf_cgrp_storage_get if it creates storage
  bpf: Make perf_event_read_output accessible in all program types.
  bpftool: Using the right format specifiers
  bpftool: Add -Wformat-signedness flag to detect format errors
  selftests/bpf: Test freplace from user namespace
  libbpf: Pass BPF token from find_prog_btf_id to BPF_BTF_GET_FD_BY_ID
  bpf: Return prog btf_id without capable check
  bpf: BPF token support for BPF_BTF_GET_FD_BY_ID
  bpf, x86: Fix objtool warning for timed may_goto
  bpf: Check map->record at the beginning of check_and_free_fields()
  ...
2025-03-30 12:43:03 -07:00
Linus Torvalds
e5e0e6bebe This update includes the following changes:
API:
 
 - Remove legacy compression interface.
 - Improve scatterwalk API.
 - Add request chaining to ahash and acomp.
 - Add virtual address support to ahash and acomp.
 - Add folio support to acomp.
 - Remove NULL dst support from acomp.
 
 Algorithms:
 
 - Library options are fuly hidden (selected by kernel users only).
 - Add Kerberos5 algorithms.
 - Add VAES-based ctr(aes) on x86.
 - Ensure LZO respects output buffer length on compression.
 - Remove obsolete SIMD fallback code path from arm/ghash-ce.
 
 Drivers:
 
 - Add support for PCI device 0x1134 in ccp.
 - Add support for rk3588's standalone TRNG in rockchip.
 - Add Inside Secure SafeXcel EIP-93 crypto engine support in eip93.
 - Fix bugs in tegra uncovered by multi-threaded self-test.
 - Fix corner cases in hisilicon/sec2.
 
 Others:
 
 - Add SG_MITER_LOCAL to sg miter.
 - Convert ubifs, hibernate and xfrm_ipcomp from legacy API to acomp.
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEn51F/lCuNhUwmDeSxycdCkmxi6cFAmfiQ9kACgkQxycdCkmx
 i6fFZg/9GWjC1FLEV66vNlYAIzFGwzwWdFGyQzXyP235Cphhm4qt9gx7P91N6Lvc
 pplVjNEeZHoP8lMw+AIeGc2cRhIwsvn8C+HA3tCBOoC1qSe8T9t7KHAgiRGd/0iz
 UrzVBFLYlR9i4tc0T5peyQwSctv8DfjWzduTmI3Ts8i7OQcfeVVgj3sGfWam7kjF
 1GJWIQH7aPzT8cwFtk8gAK1insuPPZelT1Ppl9kUeZe0XUibrP7Gb5G9simxXAyi
 B+nLCaJYS6Hc1f47cfR/qyZSeYQN35KTVrEoKb1pTYXfEtMv6W9fIvQVLJRYsqpH
 RUBdDJUseE+WckR6glX9USrh+Fv9d+HfsTXh1fhpApKU5sQJ7pDbUm4ge8p6htNG
 MIszbJPdqajYveRLuPUjFlUXaqomos8eT6BZA+RLHm1cogzEOm+5bjspbfRNAVPj
 x9KiDu5lXNiFj02v/MkLKUe3bnGIyVQnZNi7Rn0Rpxjv95tIjVpksZWMPJarxUC6
 5zdyM2I5X0Z9+teBpbfWyqfzSbAs/KpzV8S/xNvWDUT6NlpYGBeNXrCDTXcwJLAh
 PRW0w1EJUwsZbPi8GEh5jNzo/YK1cGsUKrihKv7YgqSSopMLI8e/WVr8nKZMVDFA
 O+6F6ec5lR7KsOIMGUqrBGFU1ccAeaLLvLK3H5J8//gMMg82Uik=
 =aQNt
 -----END PGP SIGNATURE-----

Merge tag 'v6.15-p1' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6

Pull crypto updates from Herbert Xu:
 "API:
   - Remove legacy compression interface
   - Improve scatterwalk API
   - Add request chaining to ahash and acomp
   - Add virtual address support to ahash and acomp
   - Add folio support to acomp
   - Remove NULL dst support from acomp

  Algorithms:
   - Library options are fuly hidden (selected by kernel users only)
   - Add Kerberos5 algorithms
   - Add VAES-based ctr(aes) on x86
   - Ensure LZO respects output buffer length on compression
   - Remove obsolete SIMD fallback code path from arm/ghash-ce

  Drivers:
   - Add support for PCI device 0x1134 in ccp
   - Add support for rk3588's standalone TRNG in rockchip
   - Add Inside Secure SafeXcel EIP-93 crypto engine support in eip93
   - Fix bugs in tegra uncovered by multi-threaded self-test
   - Fix corner cases in hisilicon/sec2

  Others:
   - Add SG_MITER_LOCAL to sg miter
   - Convert ubifs, hibernate and xfrm_ipcomp from legacy API to acomp"

* tag 'v6.15-p1' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: (187 commits)
  crypto: testmgr - Add multibuffer acomp testing
  crypto: acomp - Fix synchronous acomp chaining fallback
  crypto: testmgr - Add multibuffer hash testing
  crypto: hash - Fix synchronous ahash chaining fallback
  crypto: arm/ghash-ce - Remove SIMD fallback code path
  crypto: essiv - Replace memcpy() + NUL-termination with strscpy()
  crypto: api - Call crypto_alg_put in crypto_unregister_alg
  crypto: scompress - Fix incorrect stream freeing
  crypto: lib/chacha - remove unused arch-specific init support
  crypto: remove obsolete 'comp' compression API
  crypto: compress_null - drop obsolete 'comp' implementation
  crypto: cavium/zip - drop obsolete 'comp' implementation
  crypto: zstd - drop obsolete 'comp' implementation
  crypto: lzo - drop obsolete 'comp' implementation
  crypto: lzo-rle - drop obsolete 'comp' implementation
  crypto: lz4hc - drop obsolete 'comp' implementation
  crypto: lz4 - drop obsolete 'comp' implementation
  crypto: deflate - drop obsolete 'comp' implementation
  crypto: 842 - drop obsolete 'comp' implementation
  crypto: nx - Migrate to scomp API
  ...
2025-03-29 10:01:55 -07:00
Linus Torvalds
7288511606 Landlock update for v6.15-rc1
-----BEGIN PGP SIGNATURE-----
 
 iIYEABYKAC4WIQSVyBthFV4iTW/VU1/l49DojIL20gUCZ+bGgBAcbWljQGRpZ2lr
 b2QubmV0AAoJEOXj0OiMgvbSKmgBAICZsmQTuKMHIXdB7kwA+BX5k++SZcyA+qHN
 0hrJTSMsAP0Uv6NpiPT4CTduqBMRbuMwNhujBczRiok6yaHDbC8eCw==
 =K8XL
 -----END PGP SIGNATURE-----

Merge tag 'landlock-6.15-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/mic/linux

Pull landlock updates from Mickaël Salaün:
 "This brings two main changes to Landlock:

   - A signal scoping fix with a new interface for user space to know if
     it is compatible with the running kernel.

   - Audit support to give visibility on why access requests are denied,
     including the origin of the security policy, missing access rights,
     and description of object(s). This was designed to limit log spam
     as much as possible while still alerting about unexpected blocked
     access.

  With these changes come new and improved documentation, and a lot of
  new tests"

* tag 'landlock-6.15-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/mic/linux: (36 commits)
  landlock: Add audit documentation
  selftests/landlock: Add audit tests for network
  selftests/landlock: Add audit tests for filesystem
  selftests/landlock: Add audit tests for abstract UNIX socket scoping
  selftests/landlock: Add audit tests for ptrace
  selftests/landlock: Test audit with restrict flags
  selftests/landlock: Add tests for audit flags and domain IDs
  selftests/landlock: Extend tests for landlock_restrict_self(2)'s flags
  selftests/landlock: Add test for invalid ruleset file descriptor
  samples/landlock: Enable users to log sandbox denials
  landlock: Add LANDLOCK_RESTRICT_SELF_LOG_SUBDOMAINS_OFF
  landlock: Add LANDLOCK_RESTRICT_SELF_LOG_*_EXEC_* flags
  landlock: Log scoped denials
  landlock: Log TCP bind and connect denials
  landlock: Log truncate and IOCTL denials
  landlock: Factor out IOCTL hooks
  landlock: Log file-related denials
  landlock: Log mount-related denials
  landlock: Add AUDIT_LANDLOCK_DOMAIN and log domain status
  landlock: Add AUDIT_LANDLOCK_ACCESS and log ptrace denials
  ...
2025-03-28 12:37:13 -07:00
Linus Torvalds
78fb88eca6 Capabilities update for 6.15
This branch contains one patch:
 
     capability: Remove unused has_capability
 
 This removes a helper function whose last user (smack) stopped using
 it in 2018.
 
 This has been in linux-next for most of the the last cycle with no
 apparent issues.  It is available at:
 
 git@git.kernel.org/pub/scm/linux/kernel/git/sergeh/linux.git #caps-20250327
 
 on top of commit 2014c95afe (tag: v6.14-rc1)
 -----BEGIN PGP SIGNATURE-----
 
 iQEzBAABCgAdFiEEqb0/8XByttt4D8+UNXDaFycKziQFAmfmHIcACgkQNXDaFycK
 ziRQFggAl1fQCS/LNhFY7SAmkv8S6yJemYt8hNvRjlOwwMB2aZbFFfZeVGdO73od
 mkQAoZA0ppPT8aXH2pbfcCsKI2UFQQ/cm6Jpl4wCEZmKr2ZTxcH5zCRbUQeiR/7t
 QvcWVXCC47BnI9ab+OiCCaCw5HtAuweax6IGjFTregjUAEcBk8aoF9PJMvlT0e6j
 WQqWegPVS1GSiQqWGxPzdfnUOSkGuak9setg53pCTQE6UjMIK7uVxfSEJpe3pKvH
 ZvRvkif5UJrdUAr9MKMvf/gkKEl5xpAoTRdoaiWdzZ/arZ77HD32qoi/o3s4P+7a
 7ZHvAc+Xjm0AIO/4JrjzFw0zH/svSg==
 =rbE0
 -----END PGP SIGNATURE-----

Merge tag 'caps-pr-20250327' of git://git.kernel.org/pub/scm/linux/kernel/git/sergeh/linux

Pull capabilities update from Serge Hallyn:
 "This contains just one patch that removes a helper function whose last
  user (smack) stopped using it in 2018"

* tag 'caps-pr-20250327' of git://git.kernel.org/pub/scm/linux/kernel/git/sergeh/linux:
  capability: Remove unused has_capability
2025-03-28 12:09:33 -07:00
Linus Torvalds
a2d4f473df integrity-v6.15
-----BEGIN PGP SIGNATURE-----
 
 iIoEABYKADIWIQQdXVVFGN5XqKr1Hj7LwZzRsCrn5QUCZ+WAJBQcem9oYXJAbGlu
 dXguaWJtLmNvbQAKCRDLwZzRsCrn5QxeAP4/XqKc8OX5tKnySy5xe6r5ghYLIQLR
 zcw5lUbKEFrAewEAvvdciA5fLBRHE4xbMRtu30Y1Rxiqq1XY+3ilcAKjpws=
 =B+UO
 -----END PGP SIGNATURE-----

Merge tag 'integrity-v6.15' of git://git.kernel.org/pub/scm/linux/kernel/git/zohar/linux-integrity

Pull ima updates from Mimi Zohar:
 "Two performance improvements, which minimize the number of integrity
  violations"

* tag 'integrity-v6.15' of git://git.kernel.org/pub/scm/linux/kernel/git/zohar/linux-integrity:
  ima: limit the number of ToMToU integrity violations
  ima: limit the number of open-writers integrity violations
2025-03-28 12:06:58 -07:00
Linus Torvalds
f174ac5ba2 ipe/stable-6.15 PR 20250324
-----BEGIN PGP SIGNATURE-----
 
 iIcEABYIAC8WIQQzmBmZPBN6m/hUJmnyomI6a/yO7QUCZ+HCcBEcd3VmYW5Aa2Vy
 bmVsLm9yZwAKCRDyomI6a/yO7SwKAP9AOYDTVMThFW+lWAuEbfATW+Tf+QRxhjfq
 IIJ5F8b2bgD+PrMveqeKRwLnho/rshgw7NQLwVLOFM/JgnDvP0JavQg=
 =2Vtz
 -----END PGP SIGNATURE-----

Merge tag 'ipe-pr-20250324' of git://git.kernel.org/pub/scm/linux/kernel/git/wufan/ipe

Pull ipe update from Fan Wu:
 "This contains just one commit from Randy Dunlap, which fixes
  kernel-doc warnings in the IPE subsystem"

* tag 'ipe-pr-20250324' of git://git.kernel.org/pub/scm/linux/kernel/git/wufan/ipe:
  ipe: policy_fs: fix kernel-doc warnings
2025-03-28 12:00:40 -07:00
Mimi Zohar
a414016218 ima: limit the number of ToMToU integrity violations
Each time a file in policy, that is already opened for read, is opened
for write, a Time-of-Measure-Time-of-Use (ToMToU) integrity violation
audit message is emitted and a violation record is added to the IMA
measurement list.  This occurs even if a ToMToU violation has already
been recorded.

Limit the number of ToMToU integrity violations per file open for read.

Note: The IMA_MAY_EMIT_TOMTOU atomic flag must be set from the reader
side based on policy.  This may result in a per file open for read
ToMToU violation.

Since IMA_MUST_MEASURE is only used for violations, rename the atomic
IMA_MUST_MEASURE flag to IMA_MAY_EMIT_TOMTOU.

Cc: stable@vger.kernel.org # applies cleanly up to linux-6.6
Tested-by: Stefan Berger <stefanb@linux.ibm.com>
Reviewed-by: Petr Vorel <pvorel@suse.cz>
Tested-by: Petr Vorel <pvorel@suse.cz>
Reviewed-by: Roberto Sassu <roberto.sassu@huawei.com>
Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>
2025-03-27 12:40:12 -04:00
Mimi Zohar
5b3cd80115 ima: limit the number of open-writers integrity violations
Each time a file in policy, that is already opened for write, is opened
for read, an open-writers integrity violation audit message is emitted
and a violation record is added to the IMA measurement list. This
occurs even if an open-writers violation has already been recorded.

Limit the number of open-writers integrity violations for an existing
file open for write to one.  After the existing file open for write
closes (__fput), subsequent open-writers integrity violations may be
emitted.

Cc: stable@vger.kernel.org # applies cleanly up to linux-6.6
Tested-by: Stefan Berger <stefanb@linux.ibm.com>
Reviewed-by: Petr Vorel <pvorel@suse.cz>
Tested-by: Petr Vorel <pvorel@suse.cz>
Reviewed-by: Roberto Sassu <roberto.sassu@huawei.com>
Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>
2025-03-27 12:35:51 -04:00
Linus Torvalds
592329e5e9 Summary
* Move vm_table members out of kernel/sysctl.c
 
   All vm_table array members have moved to their respective subsystems leading
   to the removal of vm_table from kernel/sysctl.c. This increases modularity by
   placing the ctl_tables closer to where they are actually used and at the same
   time reducing the chances of merge conflicts in kernel/sysctl.c.
 
 * ctl_table range fixes
 
   Replace the proc_handler function that checks variable ranges in
   coredump_sysctls and vdso_table with the one that actually uses the extra{1,2}
   pointers as min/max values. This tightens the range of the values that users
   can pass into the kernel effectively preventing {under,over}flows.
 
 * Misc fixes
 
   Correct grammar errors and typos in test messages. Update sysctl files in
   MAINTAINERS. Constified and removed array size in declaration for
   alignment_tbl
 
 * Testing
 
   - These have all been in linux-next for at least 1 month
   - They have gone through 0-day
   - Ran all these through sysctl selftests in x86_64
 -----BEGIN PGP SIGNATURE-----
 
 iQGzBAABCgAdFiEErkcJVyXmMSXOyyeQupfNUreWQU8FAmfhV8EACgkQupfNUreW
 QU/udAv/VCXGkndQsJ5biXpXYFnokX0gIEaYzzHiqrFycZqr8ys0/wWzc+ar1LjF
 Jvanl2uKB0mUviLKt7Gk0+Hri+PJlYIrbx+5K5eo2wsKUUxFykqLLm59y/orPODl
 gyPQjKNpHJb7COsnEc3Lrq/fvol4NPHlcBPXG8NwehccTeBHZ1ninfo+pSnxh3o8
 kI3GSLLxD4K9AgBl5QuVWH4gU7o//u7lUkKzy03NW+2jmuRv3dRcYF7IdgMINNee
 AeXnygdSBxLzECBvmkfNdyg+AmL8hdsmzbsIh7UuJDvxLlQOInVLZa+sXBotCOIc
 TImCrr1Ws1OuGrD0kpH+21tJvc8pNFWt61QlulObQdrLndWHdZEGyGOusLpXTwbn
 jIWZmMvzk1foSwdgzwPFzUqPEpW3FrBVDo4Z4kenBDrCp56QTX7hGRvkNYJNKvot
 Ue+i8BeHR/Gm/p+UMqgsSTOaNJXTqZhFqwJQVzxU/9LN/vkS0On6fbjgBd5X6Pn+
 a5dlc9gy
 =0bcX
 -----END PGP SIGNATURE-----

Merge tag 'sysctl-6.15-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/sysctl/sysctl

Pull sysctl updates from Joel Granados:

 - Move vm_table members out of kernel/sysctl.c

   All vm_table array members have moved to their respective subsystems
   leading to the removal of vm_table from kernel/sysctl.c. This
   increases modularity by placing the ctl_tables closer to where they
   are actually used and at the same time reducing the chances of merge
   conflicts in kernel/sysctl.c.

 - ctl_table range fixes

   Replace the proc_handler function that checks variable ranges in
   coredump_sysctls and vdso_table with the one that actually uses the
   extra{1,2} pointers as min/max values. This tightens the range of the
   values that users can pass into the kernel effectively preventing
   {under,over}flows.

 - Misc fixes

   Correct grammar errors and typos in test messages. Update sysctl
   files in MAINTAINERS. Constified and removed array size in
   declaration for alignment_tbl

* tag 'sysctl-6.15-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/sysctl/sysctl: (22 commits)
  selftests/sysctl: fix wording of help messages
  selftests: fix spelling/grammar errors in sysctl/sysctl.sh
  MAINTAINERS: Update sysctl file list in MAINTAINERS
  sysctl: Fix underflow value setting risk in vm_table
  coredump: Fixes core_pipe_limit sysctl proc_handler
  sysctl: remove unneeded include
  sysctl: remove the vm_table
  sh: vdso: move the sysctl to arch/sh/kernel/vsyscall/vsyscall.c
  x86: vdso: move the sysctl to arch/x86/entry/vdso/vdso32-setup.c
  fs: dcache: move the sysctl to fs/dcache.c
  sunrpc: simplify rpcauth_cache_shrink_count()
  fs: drop_caches: move sysctl to fs/drop_caches.c
  fs: fs-writeback: move sysctl to fs/fs-writeback.c
  mm: nommu: move sysctl to mm/nommu.c
  security: min_addr: move sysctl to security/min_addr.c
  mm: mmap: move sysctl to mm/mmap.c
  mm: util: move sysctls to mm/util.c
  mm: vmscan: move vmscan sysctls to mm/vmscan.c
  mm: swap: move sysctl to mm/swap.c
  mm: filemap: move sysctl to mm/filemap.c
  ...
2025-03-26 21:02:05 -07:00
Mickaël Salaün
ead9079f75
landlock: Add LANDLOCK_RESTRICT_SELF_LOG_SUBDOMAINS_OFF
Add LANDLOCK_RESTRICT_SELF_LOG_SUBDOMAINS_OFF for the case of sandboxer
tools, init systems, or runtime containers launching programs sandboxing
themselves in an inconsistent way.  Setting this flag should only
depends on runtime configuration (i.e. not hardcoded).

We don't create a new ruleset's option because this should not be part
of the security policy: only the task that enforces the policy (not the
one that create it) knows if itself or its children may request denied
actions.

This is the first and only flag that can be set without actually
restricting the caller (i.e. without providing a ruleset).

Extend struct landlock_cred_security with a u8 log_subdomains_off.
struct landlock_file_security is still 16 bytes.

Cc: Günther Noack <gnoack@google.com>
Cc: Paul Moore <paul@paul-moore.com>
Closes: https://github.com/landlock-lsm/linux/issues/3
Link: https://lore.kernel.org/r/20250320190717.2287696-19-mic@digikod.net
[mic: Fix comment]
Signed-off-by: Mickaël Salaün <mic@digikod.net>
2025-03-26 13:59:43 +01:00
Mickaël Salaün
12bfcda73a
landlock: Add LANDLOCK_RESTRICT_SELF_LOG_*_EXEC_* flags
Most of the time we want to log denied access because they should not
happen and such information helps diagnose issues.  However, when
sandboxing processes that we know will try to access denied resources
(e.g. unknown, bogus, or malicious binary), we might want to not log
related access requests that might fill up logs.

By default, denied requests are logged until the task call execve(2).

If the LANDLOCK_RESTRICT_SELF_LOG_SAME_EXEC_OFF flag is set, denied
requests will not be logged for the same executed file.

If the LANDLOCK_RESTRICT_SELF_LOG_NEW_EXEC_ON flag is set, denied
requests from after an execve(2) call will be logged.

The rationale is that a program should know its own behavior, but not
necessarily the behavior of other programs.

Because LANDLOCK_RESTRICT_SELF_LOG_SAME_EXEC_OFF is set for a specific
Landlock domain, it makes it possible to selectively mask some access
requests that would be logged by a parent domain, which might be handy
for unprivileged processes to limit logs.  However, system
administrators should still use the audit filtering mechanism.  There is
intentionally no audit nor sysctl configuration to re-enable these logs.
This is delegated to the user space program.

Increment the Landlock ABI version to reflect this interface change.

Cc: Günther Noack <gnoack@google.com>
Cc: Paul Moore <paul@paul-moore.com>
Link: https://lore.kernel.org/r/20250320190717.2287696-18-mic@digikod.net
[mic: Rename variables and fix __maybe_unused]
Signed-off-by: Mickaël Salaün <mic@digikod.net>
2025-03-26 13:59:42 +01:00
Mickaël Salaün
1176a15b5e
landlock: Log scoped denials
Add audit support for unix_stream_connect, unix_may_send, task_kill, and
file_send_sigiotask hooks.

The related blockers are:
- scope.abstract_unix_socket
- scope.signal

Audit event sample for abstract unix socket:

  type=LANDLOCK_DENY msg=audit(1729738800.268:30): domain=195ba459b blockers=scope.abstract_unix_socket path=00666F6F

Audit event sample for signal:

  type=LANDLOCK_DENY msg=audit(1729738800.291:31): domain=195ba459b blockers=scope.signal opid=1 ocomm="systemd"

Refactor and simplify error handling in LSM hooks.

Extend struct landlock_file_security with fown_layer and use it to log
the blocking domain.  The struct aligned size is still 16 bytes.

Cc: Günther Noack <gnoack@google.com>
Cc: Tahera Fahimi <fahimitahera@gmail.com>
Link: https://lore.kernel.org/r/20250320190717.2287696-17-mic@digikod.net
Signed-off-by: Mickaël Salaün <mic@digikod.net>
2025-03-26 13:59:42 +01:00
Mickaël Salaün
9f74411a40
landlock: Log TCP bind and connect denials
Add audit support to socket_bind and socket_connect hooks.

The related blockers are:
- net.bind_tcp
- net.connect_tcp

Audit event sample:

  type=LANDLOCK_DENY msg=audit(1729738800.349:44): domain=195ba459b blockers=net.connect_tcp daddr=127.0.0.1 dest=80

Cc: Günther Noack <gnoack@google.com>
Cc: Konstantin Meskhidze <konstantin.meskhidze@huawei.com>
Cc: Mikhail Ivanov <ivanov.mikhail1@huawei-partners.com>
Link: https://lore.kernel.org/r/20250320190717.2287696-16-mic@digikod.net
Signed-off-by: Mickaël Salaün <mic@digikod.net>
2025-03-26 13:59:41 +01:00
Mickaël Salaün
20fd295494
landlock: Log truncate and IOCTL denials
Add audit support to the file_truncate and file_ioctl hooks.

Add a deny_masks_t type and related helpers to store the domain's layer
level per optional access rights (i.e. LANDLOCK_ACCESS_FS_TRUNCATE and
LANDLOCK_ACCESS_FS_IOCTL_DEV) when opening a file, which cannot be
inferred later.  In practice, the landlock_file_security aligned blob size is
still 16 bytes because this new one-byte deny_masks field follows the
existing two-bytes allowed_access field and precede the packed
fown_subject.

Implementing deny_masks_t with a bitfield instead of a struct enables a
generic implementation to store and extract layer levels.

Add KUnit tests to check the identification of a layer level from a
deny_masks_t, and the computation of a deny_masks_t from an access right
with its layer level or a layer_mask_t array.

Audit event sample:

  type=LANDLOCK_DENY msg=audit(1729738800.349:44): domain=195ba459b blockers=fs.ioctl_dev path="/dev/tty" dev="devtmpfs" ino=9 ioctlcmd=0x5401

Cc: Günther Noack <gnoack@google.com>
Link: https://lore.kernel.org/r/20250320190717.2287696-15-mic@digikod.net
Signed-off-by: Mickaël Salaün <mic@digikod.net>
2025-03-26 13:59:41 +01:00
Mickaël Salaün
e120b3c293
landlock: Factor out IOCTL hooks
Compat and non-compat IOCTL hooks are almost the same, except to compare
the IOCTL command.  Factor out these two IOCTL hooks to highlight the
difference and minimize audit changes (see next commit).

Cc: Günther Noack <gnoack@google.com>
Link: https://lore.kernel.org/r/20250320190717.2287696-14-mic@digikod.net
Signed-off-by: Mickaël Salaün <mic@digikod.net>
2025-03-26 13:59:40 +01:00
Mickaël Salaün
2fc80c69df
landlock: Log file-related denials
Add audit support for path_mkdir, path_mknod, path_symlink, path_unlink,
path_rmdir, path_truncate, path_link, path_rename, and file_open hooks.

The dedicated blockers are:
- fs.execute
- fs.write_file
- fs.read_file
- fs.read_dir
- fs.remove_dir
- fs.remove_file
- fs.make_char
- fs.make_dir
- fs.make_reg
- fs.make_sock
- fs.make_fifo
- fs.make_block
- fs.make_sym
- fs.refer
- fs.truncate
- fs.ioctl_dev

Audit event sample for a denied link action:

  type=LANDLOCK_DENY msg=audit(1729738800.349:44): domain=195ba459b blockers=fs.refer path="/usr/bin" dev="vda2" ino=351
  type=LANDLOCK_DENY msg=audit(1729738800.349:44): domain=195ba459b blockers=fs.make_reg,fs.refer path="/usr/local" dev="vda2" ino=365

We could pack blocker names (e.g. "fs:make_reg,refer") but that would
increase complexity for the kernel and log parsers.  Moreover, this
could not handle blockers of different classes (e.g. fs and net).  Make
it simple and flexible instead.

Add KUnit tests to check the identification from a layer_mask_t array of
the first layer level denying such request.

Cc: Günther Noack <gnoack@google.com>
Depends-on: 058518c209 ("landlock: Align partial refer access checks with final ones")
Depends-on: d617f0d72d ("landlock: Optimize file path walks and prepare for audit support")
Link: https://lore.kernel.org/r/20250320190717.2287696-13-mic@digikod.net
Signed-off-by: Mickaël Salaün <mic@digikod.net>
2025-03-26 13:59:39 +01:00
Mickaël Salaün
c56f649646
landlock: Log mount-related denials
Add audit support for sb_mount, move_mount, sb_umount, sb_remount, and
sb_pivot_root hooks.

The new related blocker is "fs.change_topology".

Audit event sample:

  type=LANDLOCK_DENY msg=audit(1729738800.349:44): domain=195ba459b blockers=fs.change_topology name="/" dev="tmpfs" ino=1

Remove landlock_get_applicable_domain() and get_current_fs_domain()
which are now fully replaced with landlock_get_applicable_subject().

Cc: Günther Noack <gnoack@google.com>
Link: https://lore.kernel.org/r/20250320190717.2287696-12-mic@digikod.net
Signed-off-by: Mickaël Salaün <mic@digikod.net>
2025-03-26 13:59:39 +01:00
Mickaël Salaün
1d636984e0
landlock: Add AUDIT_LANDLOCK_DOMAIN and log domain status
Asynchronously log domain information when it first denies an access.
This minimize the amount of generated logs, which makes it possible to
always log denials for the current execution since they should not
happen.  These records are identified with the new AUDIT_LANDLOCK_DOMAIN
type.

The AUDIT_LANDLOCK_DOMAIN message contains:
- the "domain" ID which is described;
- the "status" which can either be "allocated" or "deallocated";
- the "mode" which is for now only "enforcing";
- for the "allocated" status, a minimal set of properties to easily
  identify the task that loaded the domain's policy with
  landlock_restrict_self(2): "pid", "uid", executable path ("exe"), and
  command line ("comm");
- for the "deallocated" state, the number of "denials" accounted to this
  domain, which is at least 1.

This requires each domain to save these task properties at creation
time in the new struct landlock_details.  A reference to the PID is kept
for the lifetime of the domain to avoid race conditions when
investigating the related task.  The executable path is resolved and
stored to not keep a reference to the filesystem and block related
actions.  All these metadata are stored for the lifetime of the related
domain and should then be minimal.  The required memory is not accounted
to the task calling landlock_restrict_self(2) contrary to most other
Landlock allocations (see related comment).

The AUDIT_LANDLOCK_DOMAIN record follows the first AUDIT_LANDLOCK_ACCESS
record for the same domain, which is always followed by AUDIT_SYSCALL
and AUDIT_PROCTITLE.  This is in line with the audit logic to first
record the cause of an event, and then add context with other types of
record.

Audit event sample for a first denial:

  type=LANDLOCK_ACCESS msg=audit(1732186800.349:44): domain=195ba459b blockers=ptrace opid=1 ocomm="systemd"
  type=LANDLOCK_DOMAIN msg=audit(1732186800.349:44): domain=195ba459b status=allocated mode=enforcing pid=300 uid=0 exe="/root/sandboxer" comm="sandboxer"
  type=SYSCALL msg=audit(1732186800.349:44): arch=c000003e syscall=101 success=no [...] pid=300 auid=0

Audit event sample for a following denial:

  type=LANDLOCK_ACCESS msg=audit(1732186800.372:45): domain=195ba459b blockers=ptrace opid=1 ocomm="systemd"
  type=SYSCALL msg=audit(1732186800.372:45): arch=c000003e syscall=101 success=no [...] pid=300 auid=0

Log domain deletion with the "deallocated" state when a domain was
previously logged.  This makes it possible for log parsers to free
potential resources when a domain ID will never show again.

The number of denied access requests is useful to easily check how many
access requests a domain blocked and potentially if some of them are
missing in logs because of audit rate limiting, audit rules, or Landlock
log configuration flags (see following commit).

Audit event sample for a deletion of a domain that denied something:

  type=LANDLOCK_DOMAIN msg=audit(1732186800.393:46): domain=195ba459b status=deallocated denials=2

Cc: Günther Noack <gnoack@google.com>
Acked-by: Paul Moore <paul@paul-moore.com>
Link: https://lore.kernel.org/r/20250320190717.2287696-11-mic@digikod.net
[mic: Update comment and GFP flag for landlock_log_drop_domain()]
Signed-off-by: Mickaël Salaün <mic@digikod.net>
2025-03-26 13:59:38 +01:00
Mickaël Salaün
33e65b0d3a
landlock: Add AUDIT_LANDLOCK_ACCESS and log ptrace denials
Add a new AUDIT_LANDLOCK_ACCESS record type dedicated to an access
request denied by a Landlock domain.  AUDIT_LANDLOCK_ACCESS indicates
that something unexpected happened.

For now, only denied access are logged, which means that any
AUDIT_LANDLOCK_ACCESS record is always followed by a SYSCALL record with
"success=no".  However, log parsers should check this syscall property
because this is the only sign that a request was denied.  Indeed, we
could have "success=yes" if Landlock would support a "permissive" mode.
We could also add a new field to AUDIT_LANDLOCK_DOMAIN for this mode
(see following commit).

By default, the only logged access requests are those coming from the
same executed program that enforced the Landlock restriction on itself.
In other words, no audit record are created for a task after it called
execve(2).  This is required to avoid log spam because programs may only
be aware of their own restrictions, but not the inherited ones.

Following commits will allow to conditionally generate
AUDIT_LANDLOCK_ACCESS records according to dedicated
landlock_restrict_self(2)'s flags.

The AUDIT_LANDLOCK_ACCESS message contains:
- the "domain" ID restricting the action on an object,
- the "blockers" that are missing to allow the requested access,
- a set of fields identifying the related object (e.g. task identified
  with "opid" and "ocomm").

The blockers are implicit restrictions (e.g. ptrace), or explicit access
rights (e.g. filesystem), or explicit scopes (e.g. signal).  This field
contains a list of at least one element, each separated with a comma.

The initial blocker is "ptrace", which describe all implicit Landlock
restrictions related to ptrace (e.g. deny tracing of tasks outside a
sandbox).

Add audit support to ptrace_access_check and ptrace_traceme hooks.  For
the ptrace_access_check case, we log the current/parent domain and the
child task.  For the ptrace_traceme case, we log the parent domain and
the current/child task.  Indeed, the requester and the target are the
current task, but the action would be performed by the parent task.

Audit event sample:

  type=LANDLOCK_ACCESS msg=audit(1729738800.349:44): domain=195ba459b blockers=ptrace opid=1 ocomm="systemd"
  type=SYSCALL msg=audit(1729738800.349:44): arch=c000003e syscall=101 success=no [...] pid=300 auid=0

A following commit adds user documentation.

Add KUnit tests to check reading of domain ID relative to layer level.

The quick return for non-landlocked tasks is moved from task_ptrace() to
each LSM hooks.

It is not useful to inline the audit_enabled check because other
computation are performed by landlock_log_denial().

Use scoped guards for RCU read-side critical sections.

Cc: Günther Noack <gnoack@google.com>
Acked-by: Paul Moore <paul@paul-moore.com>
Link: https://lore.kernel.org/r/20250320190717.2287696-10-mic@digikod.net
Signed-off-by: Mickaël Salaün <mic@digikod.net>
2025-03-26 13:59:38 +01:00
Mickaël Salaün
14f6c14e9f
landlock: Identify domain execution crossing
Extend struct landlock_cred_security with a domain_exec bitmask to
identify which Landlock domain were created by the current task's bprm.
The whole bitmask is reset on each execve(2) call.

Cc: Günther Noack <gnoack@google.com>
Cc: Paul Moore <paul@paul-moore.com>
Link: https://lore.kernel.org/r/20250320190717.2287696-9-mic@digikod.net
Signed-off-by: Mickaël Salaün <mic@digikod.net>
2025-03-26 13:59:37 +01:00
Mickaël Salaün
79625f1b3a
landlock: Prepare to use credential instead of domain for fowner
This cosmetic change is needed for audit support, specifically to be
able to filter according to cross-execution boundaries.

struct landlock_file_security's size stay the same for now but it will
increase with struct landlock_cred_security's size.

Only save Landlock domain in hook_file_set_fowner() if the current
domain has LANDLOCK_SCOPE_SIGNAL, which was previously done for each
hook_file_send_sigiotask() calls.  This should improve a bit
performance.

Replace hardcoded LANDLOCK_SCOPE_SIGNAL with the signal_scope.scope
variable.

Use scoped guards for RCU read-side critical sections.

Cc: Günther Noack <gnoack@google.com>
Link: https://lore.kernel.org/r/20250320190717.2287696-8-mic@digikod.net
Signed-off-by: Mickaël Salaün <mic@digikod.net>
2025-03-26 13:59:37 +01:00
Mickaël Salaün
8d20efa9dc
landlock: Prepare to use credential instead of domain for scope
This cosmetic change that is needed for audit support, specifically to
be able to filter according to cross-execution boundaries.

Replace hardcoded LANDLOCK_SCOPE_SIGNAL with the signal_scope.scope
variable.

Use scoped guards for RCU read-side critical sections.

Cc: Günther Noack <gnoack@google.com>
Link: https://lore.kernel.org/r/20250320190717.2287696-7-mic@digikod.net
[mic: Update headers]
Signed-off-by: Mickaël Salaün <mic@digikod.net>
2025-03-26 13:59:36 +01:00
Mickaël Salaün
93f33f0cb2
landlock: Prepare to use credential instead of domain for network
This cosmetic change that is needed for audit support, specifically to
be able to filter according to cross-execution boundaries.

Optimize current_check_access_socket() to only handle the access
request.

Remove explicit domain->num_layers check which is now part of the
landlock_get_applicable_subject() call.

Cc: Günther Noack <gnoack@google.com>
Link: https://lore.kernel.org/r/20250320190717.2287696-6-mic@digikod.net
Signed-off-by: Mickaël Salaün <mic@digikod.net>
2025-03-26 13:59:35 +01:00
Mickaël Salaün
ae2483a260
landlock: Prepare to use credential instead of domain for filesystem
This cosmetic change is needed for audit support, specifically to be
able to filter according to cross-execution boundaries.

Add landlock_get_applicable_subject(), mainly a copy of
landlock_get_applicable_domain(), which will fully replace it in a
following commit.

Optimize current_check_access_path() to only handle the access request.

Partially replace get_current_fs_domain() with explicit calls to
landlock_get_applicable_subject().  The remaining ones will follow with
more changes.

Remove explicit domain->num_layers check which is now part of the
landlock_get_applicable_subject() call.

Cc: Günther Noack <gnoack@google.com>
Link: https://lore.kernel.org/r/20250320190717.2287696-5-mic@digikod.net
Signed-off-by: Mickaël Salaün <mic@digikod.net>
2025-03-26 13:59:35 +01:00
Mickaël Salaün
5b95b329be
landlock: Move domain hierarchy management
Create a new domain.h file containing the struct landlock_hierarchy
definition and helpers.  This type will grow with audit support.  This
also prepares for a new domain type.

Cc: Günther Noack <gnoack@google.com>
Link: https://lore.kernel.org/r/20250320190717.2287696-4-mic@digikod.net
Reviewed-by: Günther Noack <gnoack3000@gmail.com>
Signed-off-by: Mickaël Salaün <mic@digikod.net>
2025-03-26 13:59:34 +01:00
Mickaël Salaün
d9d2a68ed4
landlock: Add unique ID generator
Landlock IDs can be generated to uniquely identify Landlock objects.
For now, only Landlock domains get an ID at creation time.  These IDs
map to immutable domain hierarchies.

Landlock IDs have important properties:
- They are unique during the lifetime of the running system thanks to
  the 64-bit values: at worse, 2^60 - 2*2^32 useful IDs.
- They are always greater than 2^32 and must then be stored in 64-bit
  integer types.
- The initial ID (at boot time) is randomly picked between 2^32 and
  2^33, which limits collisions in logs across different boots.
- IDs are sequential, which enables users to order them.
- IDs may not be consecutive but increase with a random 2^4 step, which
  limits side channels.

Such IDs can be exposed to unprivileged processes, even if it is not the
case with this audit patch series.  The domain IDs will be useful for
user space to identify sandboxes and get their properties.

These Landlock IDs are more secure that other absolute kernel IDs such
as pipe's inodes which rely on a shared global counter.

For checkpoint/restore features (i.e. CRIU), we could easily implement a
privileged interface (e.g. sysfs) to set the next ID counter.

IDR/IDA are not used because we only need a bijection from Landlock
objects to Landlock IDs, and we must not recycle IDs.  This enables us
to identify all Landlock objects during the lifetime of the system (e.g.
in logs), but not to access an object from an ID nor know if an ID is
assigned.   Using a counter is simpler, it scales (i.e. avoids growing
memory footprint), and it does not require locking.  We'll use proper
file descriptors (with IDs used as inode numbers) to access Landlock
objects.

Cc: Günther Noack <gnoack@google.com>
Cc: Paul Moore <paul@paul-moore.com>
Link: https://lore.kernel.org/r/20250320190717.2287696-3-mic@digikod.net
Signed-off-by: Mickaël Salaün <mic@digikod.net>
2025-03-26 13:59:34 +01:00
Mickaël Salaün
9b08a16637
lsm: Add audit_log_lsm_data() helper
Extract code from dump_common_audit_data() into the audit_log_lsm_data()
helper. This helps reuse common LSM audit data while not abusing
AUDIT_AVC records because of the common_lsm_audit() helper.

Depends-on: 7ccbe076d9 ("lsm: Only build lsm_audit.c if CONFIG_SECURITY and CONFIG_AUDIT are set")
Cc: Casey Schaufler <casey@schaufler-ca.com>
Cc: James Morris <jmorris@namei.org>
Cc: Serge E. Hallyn <serge@hallyn.com>
Acked-by: Paul Moore <paul@paul-moore.com>
Link: https://lore.kernel.org/r/20250320190717.2287696-2-mic@digikod.net
Reviewed-by: Günther Noack <gnoack3000@gmail.com>
Signed-off-by: Mickaël Salaün <mic@digikod.net>
2025-03-26 13:59:33 +01:00
Mickaël Salaün
18eb75f3af
landlock: Always allow signals between threads of the same process
Because Linux credentials are managed per thread, user space relies on
some hack to synchronize credential update across threads from the same
process.  This is required by the Native POSIX Threads Library and
implemented by set*id(2) wrappers and libcap(3) to use tgkill(2) to
synchronize threads.  See nptl(7) and libpsx(3).  Furthermore, some
runtimes like Go do not enable developers to have control over threads
[1].

To avoid potential issues, and because threads are not security
boundaries, let's relax the Landlock (optional) signal scoping to always
allow signals sent between threads of the same process.  This exception
is similar to the __ptrace_may_access() one.

hook_file_set_fowner() now checks if the target task is part of the same
process as the caller.  If this is the case, then the related signal
triggered by the socket will always be allowed.

Scoping of abstract UNIX sockets is not changed because kernel objects
(e.g. sockets) should be tied to their creator's domain at creation
time.

Note that creating one Landlock domain per thread puts each of these
threads (and their future children) in their own scope, which is
probably not what users expect, especially in Go where we do not control
threads.  However, being able to drop permissions on all threads should
not be restricted by signal scoping.  We are working on a way to make it
possible to atomically restrict all threads of a process with the same
domain [2].

Add erratum for signal scoping.

Closes: https://github.com/landlock-lsm/go-landlock/issues/36
Fixes: 54a6e6bbf3 ("landlock: Add signal scoping")
Fixes: c899496501 ("selftests/landlock: Test signal scoping for threads")
Depends-on: 26f204380a ("fs: Fix file_set_fowner LSM hook inconsistencies")
Link: https://pkg.go.dev/kernel.org/pub/linux/libs/security/libcap/psx [1]
Link: https://github.com/landlock-lsm/linux/issues/2 [2]
Cc: Günther Noack <gnoack@google.com>
Cc: Paul Moore <paul@paul-moore.com>
Cc: Serge Hallyn <serge@hallyn.com>
Cc: Tahera Fahimi <fahimitahera@gmail.com>
Cc: stable@vger.kernel.org
Acked-by: Christian Brauner <brauner@kernel.org>
Link: https://lore.kernel.org/r/20250318161443.279194-6-mic@digikod.net
[mic: Add extra pointer check and RCU guard, and ease backport]
Signed-off-by: Mickaël Salaün <mic@digikod.net>
2025-03-26 13:59:29 +01:00
Linus Torvalds
61af143fbe Smack patches for v6.15
-----BEGIN PGP SIGNATURE-----
 
 iQJLBAABCAA1FiEEC+9tH1YyUwIQzUIeOKUVfIxDyBEFAmfi1Z4XHGNhc2V5QHNj
 aGF1Zmxlci1jYS5jb20ACgkQOKUVfIxDyBGPlRAAua8w+rpw06Njdi8LNDytiMil
 Z8a1dbF2yupLclaydLuwxOvJbdjNcCzEVDDFNYEfh6iBiXsaoqr1FO13mmdcjon7
 NXPscvhBWbb0dXvPEk74upWk2HRUIS0BI3tofY1YaGJLdWZRM2z6qvHcb4jNkM89
 o+wzzxme7HNwez4SpB+vmm8WZ4AH3rX7q0ihf3XOpplvxeKwob3ZCu+nNQe0AXMI
 FjMXPn148O96UKJgYJFMV1rLzWmYXzrrDBvzYS79walyti9ct1SYKRY4Zs80MiJK
 0mQGDpg60PFUldcsnBD9Pp5nifcLKg5nNUWb15AhRw1sR4wnKl4DxCA4032NYB/x
 wuVW2rsRjPDuzD6XGS16FZDv86qSyxPtJmvk3qKJmiIQVRc1xhJWAH57A+UzgnLx
 UGvIcIoJjifTX4EUZXdCdH44RBBs/tjCbrBKROPnKBPbkLHNxAzW/Z65hLvxHgg8
 f0vFuSNAL5dCNRdspWgE5BayM0RoW/HZbY15/O+oc7hDgs2ORgKPtayt4uJZr1km
 PhxG3/9BWRjga44RRwIGCsxzqVdO0l5FYYqh+AYnzTOz8S+kncmcPHbnWOYVU0aH
 9u5jE46TCmRxVnEaco+1dhGBUZaFgjXpPv+PxT+LZgTVHLbhBuONCEIot8ejFOKc
 TomnIzxqMacRAlURNXM=
 =SUpV
 -----END PGP SIGNATURE-----

Merge tag 'Smack-for-6.15' of https://github.com/cschaufler/smack-next

Pull smack updates from Casey Schaufler:
 "This is a larger set of patches than usual, consisting of a set of
  build clean-ups, a rework of error handling in setting up CIPSO label
  specification and a bug fix in network labeling"

* tag 'Smack-for-6.15' of https://github.com/cschaufler/smack-next:
  smack: recognize ipv4 CIPSO w/o categories
  smack: Revert "smackfs: Added check catlen"
  smack: remove /smack/logging if audit is not configured
  smack: ipv4/ipv6: tcp/dccp/sctp: fix incorrect child socket label
  smack: dont compile ipv6 code unless ipv6 is configured
  Smack: fix typos and spelling errors
2025-03-25 16:04:11 -07:00
Linus Torvalds
59c017ce9e selinux/stable-6.15 PR 20250323
-----BEGIN PGP SIGNATURE-----
 
 iQJIBAABCAAyFiEES0KozwfymdVUl37v6iDy2pc3iXMFAmfgWewUHHBhdWxAcGF1
 bC1tb29yZS5jb20ACgkQ6iDy2pc3iXNTXA/9F7Fo5ov6mP15jChSSZWuVPBdi1gD
 y8Q8sCbu/KeCRO1Qb4QTv8ZCVGkP+EDK47IIvLXj27Aa19y1m3E4r1mddCSBQ3eu
 jSqR/kOXf3j8AWPP2m4qYK/EJvNqNd/V67PkktFal+95crcmz3IDV68qWuNafdSc
 r8VuprrEw+NSuKhPh4e2tM0hvOmAzePuvI6gGPb9z7Fj807/qfSOteAkvYpJ1y+d
 vZzHLeu3FRExxu4wKZZymGpT2+5Xl/MrjRJUtKuJdxXW8FphPUr5cfHDIP0Ae97w
 J70RGr0Oy02dQnCtAMkOGi7lpS1S1r0Qnhr+eloQQvG7J2eRRPZqGrmaU69qopAo
 JY/Xc7/r29pGwGnXtiHKZ4ej65mTIN9bmPsHIjjr01hiB/gEUnX2vdVSwVYLxOsF
 dzCnXb1VBc4mSIJ1Sjst0a6CRNPVA3U/bCfCbvfeyhn6A0XHmJI1PDRbxEXavnki
 sQIAtLv5M0Pyzyjij+6qHfd8TsUgiH/rtR6st31SnL5iqIWkE9wPMFldg064vHgS
 8dECnF7G9ZU/OErJjTQVshJE3fDEJvbQj8YIq7u1gQOZV02G7U3q4R3Aoj3GoSKJ
 dMjoeG18+yuIevW/OHWtbjp4QMpp2R4xuXaJJlfsB2OaOX6jSS4S5KpYO3eKQ/Jd
 kNQxuG8VD3tK8jc=
 =QD7q
 -----END PGP SIGNATURE-----

Merge tag 'selinux-pr-20250323' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux

Pull selinux updates from Paul Moore:

 - Add additional SELinux access controls for kernel file reads/loads

   The SELinux kernel file read/load access controls were never updated
   beyond the initial kernel module support, this pull request adds
   support for firmware, kexec, policies, and x.509 certificates.

 - Add support for wildcards in network interface names

   There are a number of userspace tools which auto-generate network
   interface names using some pattern of <XXXX>-<NN> where <XXXX> is a
   fixed string, e.g. "podman", and <NN> is a increasing counter.
   Supporting wildcards in the SELinux policy for network interfaces
   simplifies the policy associted with these interfaces.

 - Fix a potential problem in the kernel read file SELinux code

   SELinux should always check the file label in the
   security_kernel_read_file() LSM hook, regardless of if the file is
   being read in chunks. Unfortunately, the existing code only
   considered the file label on the first chunk; this pull request fixes
   this problem.

   There is more detail in the individual commit, but thankfully the
   existing code didn't expose a bug due to multi-stage reads only
   taking place in one driver, and that driver loading a file type that
   isn't targeted by the SELinux policy.

 - Fix the subshell error handling in the example policy loader

   Minor fix to SELinux example policy loader in scripts/selinux due to
   an undesired interaction with subshells and errexit.

* tag 'selinux-pr-20250323' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux:
  selinux: get netif_wildcard policycap from policy instead of cache
  selinux: support wildcard network interface names
  selinux: Chain up tool resolving errors in install_policy.sh
  selinux: add permission checks for loading other kinds of kernel files
  selinux: always check the file label in selinux_kernel_read_file()
  selinux: fix spelling error
2025-03-25 15:52:32 -07:00
Linus Torvalds
054570267d lsm/stable-6.15 PR 20250323
-----BEGIN PGP SIGNATURE-----
 
 iQJIBAABCAAyFiEES0KozwfymdVUl37v6iDy2pc3iXMFAmfgWgMUHHBhdWxAcGF1
 bC1tb29yZS5jb20ACgkQ6iDy2pc3iXNW5RAAvCDq5gBtY0aTNlULe637EVLSh+t8
 PkSzHzu/NlzU6BfjtwSm2fuML8welTGxSwUPxUzMCI91gPdkGeFktefavT3xa+QI
 BHWROn7fEJ/KmRZvngPeIkgLr5xhF5nBJmc/Jw71qem20zRzNgJnpzMX16d10Phx
 dxd2xOO1qM3bv6Z9RcIssZRGaN+PHngpWWg+0B69XuaBUso87S6NDyKNn1XPmvoz
 as96k+Wk/xAZGVEeCbs/+H5rBx6DLg+FfTRa06Oh4BFsqedpkDPxLrTgCJGJkA0H
 dsK6O/993zvjx0Jn4ZPoJ9n35S82BmkCsz4bGq1xVl6FYUiMcm3/8yO41wllS+w4
 j+RlTU/RIdB7n8EKyMMl1hj1stTvt3Bi9F5Cbf7ZEv0snfR00K4KVpi17jnFjUHv
 kpOiEtXZb/NGQip7UAuUq0PisfqbiO4jJurYHRetDgv1WCy6+C8ufM5t6I+cnvmG
 VG+dlxcW+rDIn6bLRVuGi9TJRsQ6eox9ipa+qEKNNiOXgftELcgT7m74nAS5m0uv
 n5rDa221nPXecEB0X7d6YUFk711lly90dbelNeLrmv1w6jl8L1PpS1oBaW+UzGu9
 46eGBd6pzu9otvK9WVyDEdotDOCrgH0sd7pTetqDhLJZ7KrGwyyqO2gD/JroUKcC
 lnxBQwPnat86iI8=
 =oxfV
 -----END PGP SIGNATURE-----

Merge tag 'lsm-pr-20250323' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/lsm

Pull lsm updates from Paul Moore:

 - Various minor updates to the LSM Rust bindings

   Changes include marking trivial Rust bindings as inlines and comment
   tweaks to better reflect the LSM hooks.

 - Add LSM/SELinux access controls to io_uring_allowed()

   Similar to the io_uring_disabled sysctl, add a LSM hook to
   io_uring_allowed() to enable LSMs a simple way to enforce security
   policy on the use of io_uring. This pull request includes SELinux
   support for this new control using the io_uring/allowed permission.

 - Remove an unused parameter from the security_perf_event_open() hook

   The perf_event_attr struct parameter was not used by any currently
   supported LSMs, remove it from the hook.

 - Add an explicit MAINTAINERS entry for the credentials code

   We've seen problems in the past where patches to the credentials code
   sent by non-maintainers would often languish on the lists for
   multiple months as there was no one explicitly tasked with the
   responsibility of reviewing and/or merging credentials related code.

   Considering that most of the code under security/ has a vested
   interest in ensuring that the credentials code is well maintained,
   I'm volunteering to look after the credentials code and Serge Hallyn
   has also volunteered to step up as an official reviewer. I posted the
   MAINTAINERS update as a RFC to LKML in hopes that someone else would
   jump up with an "I'll do it!", but beyond Serge it was all crickets.

 - Update Stephen Smalley's old email address to prevent confusion

   This includes a corresponding update to the mailmap file.

* tag 'lsm-pr-20250323' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/lsm:
  mailmap: map Stephen Smalley's old email addresses
  lsm: remove old email address for Stephen Smalley
  MAINTAINERS: add Serge Hallyn as a credentials reviewer
  MAINTAINERS: add an explicit credentials entry
  cred,rust: mark Credential methods inline
  lsm,rust: reword "destroy" -> "release" in SecurityCtx
  lsm,rust: mark SecurityCtx methods inline
  perf: Remove unnecessary parameter of security check
  lsm: fix a missing security_uring_allowed() prototype
  io_uring,lsm,selinux: add LSM hooks for io_uring_setup()
  io_uring: refactor io_uring_allowed()
2025-03-25 15:44:19 -07:00
Linus Torvalds
fc13a78e1f hardening updates for v6.15-rc1
- loadpin: remove unsupported MODULE_COMPRESS_NONE (Arulpandiyan Vadivel)
 
 - samples/check-exec: Fix script name (Mickaël Salaün)
 
 - yama: remove needless locking in yama_task_prctl() (Oleg Nesterov)
 
 - lib/string_choices: Sort by function name (R Sundar)
 
 - hardening: Allow default HARDENED_USERCOPY to be set at compile time
   (Mel Gorman)
 
 - uaccess: Split out compile-time checks into ucopysize.h
 
 - kbuild: clang: Support building UM with SUBARCH=i386
 
 - x86: Enable i386 FORTIFY_SOURCE on Clang 16+
 
 - ubsan/overflow: Rework integer overflow sanitizer option
 
 - Add missing __nonstring annotations for callers of memtostr*()/strtomem*()
 
 - Add __must_be_noncstr() and have memtostr*()/strtomem*() check for it
 
 - Introduce __nonstring_array for silencing future GCC 15 warnings
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQRSPkdeREjth1dHnSE2KwveOeQkuwUCZ9hGrgAKCRA2KwveOeQk
 u1WvAQC3ZxFu3b0Omfmht2pPqCltf2UOQNvUx3egjoeXpUaNSgD+Lxr/T4xksy7E
 jHh7rCYDkruOWs3DHA5JjRQcf0BBLQo=
 =FTQp
 -----END PGP SIGNATURE-----

Merge tag 'hardening-v6.15-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux

Pull hardening updates from Kees Cook:
 "As usual, it's scattered changes all over. Patches touching things
  outside of our traditional areas in the tree have been Acked by
  maintainers or were trivial changes:

   - loadpin: remove unsupported MODULE_COMPRESS_NONE (Arulpandiyan
     Vadivel)

   - samples/check-exec: Fix script name (Mickaël Salaün)

   - yama: remove needless locking in yama_task_prctl() (Oleg Nesterov)

   - lib/string_choices: Sort by function name (R Sundar)

   - hardening: Allow default HARDENED_USERCOPY to be set at compile
     time (Mel Gorman)

   - uaccess: Split out compile-time checks into ucopysize.h

   - kbuild: clang: Support building UM with SUBARCH=i386

   - x86: Enable i386 FORTIFY_SOURCE on Clang 16+

   - ubsan/overflow: Rework integer overflow sanitizer option

   - Add missing __nonstring annotations for callers of
     memtostr*()/strtomem*()

   - Add __must_be_noncstr() and have memtostr*()/strtomem*() check for
     it

   - Introduce __nonstring_array for silencing future GCC 15 warnings"

* tag 'hardening-v6.15-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux: (26 commits)
  compiler_types: Introduce __nonstring_array
  hardening: Enable i386 FORTIFY_SOURCE on Clang 16+
  x86/build: Remove -ffreestanding on i386 with GCC
  ubsan/overflow: Enable ignorelist parsing and add type filter
  ubsan/overflow: Enable pattern exclusions
  ubsan/overflow: Rework integer overflow sanitizer option to turn on everything
  samples/check-exec: Fix script name
  yama: don't abuse rcu_read_lock/get_task_struct in yama_task_prctl()
  kbuild: clang: Support building UM with SUBARCH=i386
  loadpin: remove MODULE_COMPRESS_NONE as it is no longer supported
  lib/string_choices: Rearrange functions in sorted order
  string.h: Validate memtostr*()/strtomem*() arguments more carefully
  compiler.h: Introduce __must_be_noncstr()
  nilfs2: Mark on-disk strings as nonstring
  uapi: stddef.h: Introduce __kernel_nonstring
  x86/tdx: Mark message.bytes as nonstring
  string: kunit: Mark nonstring test strings as __nonstring
  scsi: qla2xxx: Mark device strings as nonstring
  scsi: mpt3sas: Mark device strings as nonstring
  scsi: mpi3mr: Mark device strings as nonstring
  ...
2025-03-24 15:18:08 -07:00
Randy Dunlap
6df401a2ee ipe: policy_fs: fix kernel-doc warnings
Use the "struct" keyword in kernel-doc when describing struct
ipefs_file. Add kernel-doc for the struct members also.

Don't use kernel-doc notation for 'policy_subdir'. kernel-doc does
not support documentation comments for data definitions.

This eliminates multiple kernel-doc warnings:

security/ipe/policy_fs.c:21: warning: cannot understand function prototype: 'struct ipefs_file '
security/ipe/policy_fs.c:407: warning: cannot understand function prototype: 'const struct ipefs_file policy_subdir[] = '

Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Cc: Fan Wu <wufan@kernel.org>
Cc: Paul Moore <paul@paul-moore.com>
Cc: James Morris <jmorris@namei.org>
Cc: Serge E. Hallyn <serge@hallyn.com>
Cc: linux-security-module@vger.kernel.org
Signed-off-by: Fan Wu <wufan@kernel.org>
2025-03-24 13:36:00 -07:00
Linus Torvalds
26d8e43079 vfs-6.15-rc1.async.dir
-----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQRAhzRXHqcMeLMyaSiRxhvAZXjcogUCZ90rNwAKCRCRxhvAZXjc
 onBJAP9Z8Ywmlb5KQ1E3HvDmkwyY6yOSyZ9/CmbzrkCJ8ywYkQD/d9/xt0EP/O/q
 N8YtzXArHWt7u0YbcVpy9WK3F72BdwU=
 =VJgY
 -----END PGP SIGNATURE-----

Merge tag 'vfs-6.15-rc1.async.dir' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs

Pull vfs async dir updates from Christian Brauner:
 "This contains cleanups that fell out of the work from async directory
  handling:

   - Change kern_path_locked() and user_path_locked_at() to never return
     a negative dentry. This simplifies the usability of these helpers
     in various places

   - Drop d_exact_alias() from the remaining place in NFS where it is
     still used. This also allows us to drop the d_exact_alias() helper
     completely

   - Drop an unnecessary call to fh_update() from nfsd_create_locked()

   - Change i_op->mkdir() to return a struct dentry

     Change vfs_mkdir() to return a dentry provided by the filesystems
     which is hashed and positive. This allows us to reduce the number
     of cases where the resulting dentry is not positive to very few
     cases. The code in these places becomes simpler and easier to
     understand.

   - Repack DENTRY_* and LOOKUP_* flags"

* tag 'vfs-6.15-rc1.async.dir' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs:
  doc: fix inline emphasis warning
  VFS: Change vfs_mkdir() to return the dentry.
  nfs: change mkdir inode_operation to return alternate dentry if needed.
  fuse: return correct dentry for ->mkdir
  ceph: return the correct dentry on mkdir
  hostfs: store inode in dentry after mkdir if possible.
  Change inode_operations.mkdir to return struct dentry *
  nfsd: drop fh_update() from S_IFDIR branch of nfsd_create_locked()
  nfs/vfs: discard d_exact_alias()
  VFS: add common error checks to lookup_one_qstr_excl()
  VFS: change kern_path_locked() and user_path_locked_at() to never return negative dentry
  VFS: repack LOOKUP_ bit flags.
  VFS: repack DENTRY_ flags.
2025-03-24 10:47:14 -07:00
Linus Torvalds
fd101da676 vfs-6.15-rc1.mount
-----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQRAhzRXHqcMeLMyaSiRxhvAZXjcogUCZ90qAwAKCRCRxhvAZXjc
 on7lAP0akpIsJMWREg9tLwTNTySI1b82uKec0EAgM6T7n/PYhAD/T4zoY8UYU0Pr
 qCxwTXHUVT6bkNhjREBkfqq9OkPP8w8=
 =GxeN
 -----END PGP SIGNATURE-----

Merge tag 'vfs-6.15-rc1.mount' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs

Pull vfs mount updates from Christian Brauner:

 - Mount notifications

   The day has come where we finally provide a new api to listen for
   mount topology changes outside of /proc/<pid>/mountinfo. A mount
   namespace file descriptor can be supplied and registered with
   fanotify to listen for mount topology changes.

   Currently notifications for mount, umount and moving mounts are
   generated. The generated notification record contains the unique
   mount id of the mount.

   The listmount() and statmount() api can be used to query detailed
   information about the mount using the received unique mount id.

   This allows userspace to figure out exactly how the mount topology
   changed without having to generating diffs of /proc/<pid>/mountinfo
   in userspace.

 - Support O_PATH file descriptors with FSCONFIG_SET_FD in the new mount
   api

 - Support detached mounts in overlayfs

   Since last cycle we support specifying overlayfs layers via file
   descriptors. However, we don't allow detached mounts which means
   userspace cannot user file descriptors received via
   open_tree(OPEN_TREE_CLONE) and fsmount() directly. They have to
   attach them to a mount namespace via move_mount() first.

   This is cumbersome and means they have to undo mounts via umount().
   Allow them to directly use detached mounts.

 - Allow to retrieve idmappings with statmount

   Currently it isn't possible to figure out what idmapping has been
   attached to an idmapped mount. Add an extension to statmount() which
   allows to read the idmapping from the mount.

 - Allow creating idmapped mounts from mounts that are already idmapped

   So far it isn't possible to allow the creation of idmapped mounts
   from already idmapped mounts as this has significant lifetime
   implications. Make the creation of idmapped mounts atomic by allow to
   pass struct mount_attr together with the open_tree_attr() system call
   allowing to solve these issues without complicating VFS lookup in any
   way.

   The system call has in general the benefit that creating a detached
   mount and applying mount attributes to it becomes an atomic operation
   for userspace.

 - Add a way to query statmount() for supported options

   Allow userspace to query which mount information can be retrieved
   through statmount().

 - Allow superblock owners to force unmount

* tag 'vfs-6.15-rc1.mount' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs: (21 commits)
  umount: Allow superblock owners to force umount
  selftests: add tests for mount notification
  selinux: add FILE__WATCH_MOUNTNS
  samples/vfs: fix printf format string for size_t
  fs: allow changing idmappings
  fs: add kflags member to struct mount_kattr
  fs: add open_tree_attr()
  fs: add copy_mount_setattr() helper
  fs: add vfs_open_tree() helper
  statmount: add a new supported_mask field
  samples/vfs: add STATMOUNT_MNT_{G,U}IDMAP
  selftests: add tests for using detached mount with overlayfs
  samples/vfs: check whether flag was raised
  statmount: allow to retrieve idmappings
  uidgid: add map_id_range_up()
  fs: allow detached mounts in clone_private_mount()
  selftests/overlayfs: test specifying layers as O_PATH file descriptors
  fs: support O_PATH fds with FSCONFIG_SET_FD
  vfs: add notifications for mount attach and detach
  fanotify: notify on mount attach and detach
  ...
2025-03-24 09:34:10 -07:00
Linus Torvalds
99c21beaab vfs-6.15-rc1.misc
-----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQRAhzRXHqcMeLMyaSiRxhvAZXjcogUCZ90p4AAKCRCRxhvAZXjc
 ojMIAP9atkG3u7+490+NGWLdulQlaHnD51Owa9MiW87UfKpsTQEArwi/NrJqXJNT
 PFQ2xIa5TxG+9haChR89w3kjZ6b/hgs=
 =iDkx
 -----END PGP SIGNATURE-----

Merge tag 'vfs-6.15-rc1.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs

Pull misc vfs updates from Christian Brauner:
 "Features:

   - Add CONFIG_DEBUG_VFS infrastucture:
      - Catch invalid modes in open
      - Use the new debug macros in inode_set_cached_link()
      - Use debug-only asserts around fd allocation and install

   - Place f_ref to 3rd cache line in struct file to resolve false
     sharing

Cleanups:

   - Start using anon_inode_getfile_fmode() helper in various places

   - Don't take f_lock during SEEK_CUR if exclusion is guaranteed by
     f_pos_lock

   - Add unlikely() to kcmp()

   - Remove legacy ->remount_fs method from ecryptfs after port to the
     new mount api

   - Remove invalidate_inodes() in favour of evict_inodes()

   - Simplify ep_busy_loopER by removing unused argument

   - Avoid mmap sem relocks when coredumping with many missing pages

   - Inline getname()

   - Inline new_inode_pseudo() and de-staticize alloc_inode()

   - Dodge an atomic in putname if ref == 1

   - Consistently deref the files table with rcu_dereference_raw()

   - Dedup handling of struct filename init and refcounts bumps

   - Use wq_has_sleeper() in end_dir_add()

   - Drop the lock trip around I_NEW wake up in evict()

   - Load the ->i_sb pointer once in inode_sb_list_{add,del}

   - Predict not reaching the limit in alloc_empty_file()

   - Tidy up do_sys_openat2() with likely/unlikely

   - Call inode_sb_list_add() outside of inode hash lock

   - Sort out fd allocation vs dup2 race commentary

   - Turn page_offset() into a wrapper around folio_pos()

   - Remove locking in exportfs around ->get_parent() call

   - try_lookup_one_len() does not need any locks in autofs

   - Fix return type of several functions from long to int in open

   - Fix return type of several functions from long to int in ioctls

  Fixes:

   - Fix watch queue accounting mismatch"

* tag 'vfs-6.15-rc1.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs: (30 commits)
  fs: sort out fd allocation vs dup2 race commentary, take 2
  fs: call inode_sb_list_add() outside of inode hash lock
  fs: tidy up do_sys_openat2() with likely/unlikely
  fs: predict not reaching the limit in alloc_empty_file()
  fs: load the ->i_sb pointer once in inode_sb_list_{add,del}
  fs: drop the lock trip around I_NEW wake up in evict()
  fs: use wq_has_sleeper() in end_dir_add()
  VFS/autofs: try_lookup_one_len() does not need any locks
  fs: dedup handling of struct filename init and refcounts bumps
  fs: consistently deref the files table with rcu_dereference_raw()
  exportfs: remove locking around ->get_parent() call.
  fs: use debug-only asserts around fd allocation and install
  fs: dodge an atomic in putname if ref == 1
  vfs: Remove invalidate_inodes()
  ecryptfs: remove NULL remount_fs from super_operations
  watch_queue: fix pipe accounting mismatch
  fs: place f_ref to 3rd cache line in struct file to resolve false sharing
  epoll: simplify ep_busy_loop by removing always 0 argument
  fs: Turn page_offset() into a wrapper around folio_pos()
  kcmp: improve performance adding an unlikely hint to task comparisons
  ...
2025-03-24 09:13:50 -07:00
David Howells
75845c6c1a keys: Fix UAF in key_put()
Once a key's reference count has been reduced to 0, the garbage collector
thread may destroy it at any time and so key_put() is not allowed to touch
the key after that point.  The most key_put() is normally allowed to do is
to touch key_gc_work as that's a static global variable.

However, in an effort to speed up the reclamation of quota, this is now
done in key_put() once the key's usage is reduced to 0 - but now the code
is looking at the key after the deadline, which is forbidden.

Fix this by using a flag to indicate that a key can be gc'd now rather than
looking at the key's refcount in the garbage collector.

Fixes: 9578e327b2 ("keys: update key quotas in key_put()")
Reported-by: syzbot+6105ffc1ded71d194d6d@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/all/673b6aec.050a0220.87769.004a.GAE@google.com/
Signed-off-by: David Howells <dhowells@redhat.com>
Tested-by: syzbot+6105ffc1ded71d194d6d@syzkaller.appspotmail.com
Reviewed-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Jarkko Sakkinen <jarkko@kernel.org>
2025-03-22 15:36:49 +02:00
Mickaël Salaün
6d9ac5e4d7
landlock: Prepare to add second errata
Potentially include errata for Landlock ABI v5 (Linux 6.10) and v6
(Linux 6.12).  That will be useful for the following signal scoping
erratum.

As explained in errata.h, this commit should be backportable without
conflict down to ABI v5.  It must then not include the errata/abi-6.h
file.

Fixes: 54a6e6bbf3 ("landlock: Add signal scoping")
Cc: Günther Noack <gnoack@google.com>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20250318161443.279194-5-mic@digikod.net
Signed-off-by: Mickaël Salaün <mic@digikod.net>
2025-03-21 12:12:21 +01:00
Mickaël Salaün
48fce74fe2
landlock: Add erratum for TCP fix
Add erratum for the TCP socket identification fixed with commit
854277e2cc ("landlock: Fix non-TCP sockets restriction").

Fixes: 854277e2cc ("landlock: Fix non-TCP sockets restriction")
Cc: Günther Noack <gnoack@google.com>
Cc: Mikhail Ivanov <ivanov.mikhail1@huawei-partners.com>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20250318161443.279194-4-mic@digikod.net
Signed-off-by: Mickaël Salaün <mic@digikod.net>
2025-03-21 12:12:20 +01:00
Mickaël Salaün
15383a0d63
landlock: Add the errata interface
Some fixes may require user space to check if they are applied on the
running kernel before using a specific feature.  For instance, this
applies when a restriction was previously too restrictive and is now
getting relaxed (e.g. for compatibility reasons).  However, non-visible
changes for legitimate use (e.g. security fixes) do not require an
erratum.

Because fixes are backported down to a specific Landlock ABI, we need a
way to avoid cherry-pick conflicts.  The solution is to only update a
file related to the lower ABI impacted by this issue.  All the ABI files
are then used to create a bitmask of fixes.

The new errata interface is similar to the one used to get the supported
Landlock ABI version, but it returns a bitmask instead because the order
of fixes may not match the order of versions, and not all fixes may
apply to all versions.

The actual errata will come with dedicated commits.  The description is
not actually used in the code but serves as documentation.

Create the landlock_abi_version symbol and use its value to check errata
consistency.

Update test_base's create_ruleset_checks_ordering tests and add errata
tests.

This commit is backportable down to the first version of Landlock.

Fixes: 3532b0b435 ("landlock: Enable user space to infer supported features")
Cc: Günther Noack <gnoack@google.com>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20250318161443.279194-3-mic@digikod.net
Signed-off-by: Mickaël Salaün <mic@digikod.net>
2025-03-21 12:12:19 +01:00
Mickaël Salaün
624f177d8f
landlock: Move code to ease future backports
To ease backports in setup.c, let's group changes from
__lsm_ro_after_init to __ro_after_init with commit f22f9aaf6c
("selinux: remove the runtime disable functionality"), and the
landlock_lsmid addition with commit f3b8788cde ("LSM: Identify modules
by more than name").

That will help to backport the following errata.

Cc: Günther Noack <gnoack@google.com>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20250318161443.279194-2-mic@digikod.net
Fixes: f3b8788cde ("LSM: Identify modules by more than name")
Signed-off-by: Mickaël Salaün <mic@digikod.net>
2025-03-21 12:12:16 +01:00
Arnd Bergmann
edc8e80bf8 crypto: lib/Kconfig - hide library options
Any driver that needs these library functions should already be selecting
the corresponding Kconfig symbols, so there is no real point in making
these visible.

The original patch that made these user selectable described problems
with drivers failing to select the code they use, but for consistency
it's better to always use 'select' on a symbol than to mix it with
'depends on'.

Fixes: e56e189855 ("lib/crypto: add prompts back to crypto libraries")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Jarkko Sakkinen <jarkko@kernel.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2025-03-21 17:33:39 +08:00
Christian Göttsche
a3d3043ef2 selinux: get netif_wildcard policycap from policy instead of cache
Retrieve the netif_wildcard policy capability in security_netif_sid()
from the locked active policy instead of the cached value in
selinux_state.

Fixes: 8af43b61c1 ("selinux: support wildcard network interface names")
Signed-off-by: Christian Göttsche <cgzones@googlemail.com>
Acked-by: Stephen Smalley <stephen.smalley.work@gmail.com>
[PM: /netlabel/netif/ due to a typo in the description]
Signed-off-by: Paul Moore <paul@paul-moore.com>
2025-03-17 16:22:04 -04:00
Blaise Boscaccy
082f1db02c security: Propagate caller information in bpf hooks
Certain bpf syscall subcommands are available for usage from both
userspace and the kernel. LSM modules or eBPF gatekeeper programs may
need to take a different course of action depending on whether or not
a BPF syscall originated from the kernel or userspace.

Additionally, some of the bpf_attr struct fields contain pointers to
arbitrary memory. Currently the functionality to determine whether or
not a pointer refers to kernel memory or userspace memory is exposed
to the bpf verifier, but that information is missing from various LSM
hooks.

Here we augment the LSM hooks to provide this data, by simply passing
a boolean flag indicating whether or not the call originated in the
kernel, in any hook that contains a bpf_attr struct that corresponds
to a subcommand that may be called from the kernel.

Signed-off-by: Blaise Boscaccy <bboscaccy@linux.microsoft.com>
Acked-by: Song Liu <song@kernel.org>
Acked-by: Paul Moore <paul@paul-moore.com>
Link: https://lore.kernel.org/r/20250310221737.821889-2-bboscaccy@linux.microsoft.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2025-03-15 11:48:58 -07:00
Stephen Smalley
9da4f4f987 lsm: remove old email address for Stephen Smalley
Remove my old, no longer functioning, email address from comments.
Could alternatively replace with my current email but seems
redundant with MAINTAINERS and prone to being out of date.

Signed-off-by: Stephen Smalley <stephen.smalley.work@gmail.com>
[PM: subject tweak]
Signed-off-by: Paul Moore <paul@paul-moore.com>
2025-03-10 15:58:43 -04:00
Greg Kroah-Hartman
993a47bd7b Linux 6.14-rc6
-----BEGIN PGP SIGNATURE-----
 
 iQFSBAABCAA8FiEEq68RxlopcLEwq+PEeb4+QwBBGIYFAmfOKBUeHHRvcnZhbGRz
 QGxpbnV4LWZvdW5kYXRpb24ub3JnAAoJEHm+PkMAQRiG1aQH/iC+Oyij4VxAjBek
 BOXIT/p6CwlIXb8ObiWWcRjDPizlcxb3RaV8J2RO+IqaQ2wltxpFANq2G7Re2FPm
 SNcEpIURAOVcxHGedcfFA91srO5F4FzNTO8LVp7MIbcgMYy3pdk+dbZmi6A691R+
 t9pb74m+MAnF1o/MUx7pUlhAT/4ymuuR0F7WCSg4h0Xwe5m0nlJY89kJBC7PCjyd
 n3mdhsz3rDSLmt/z/T7HGD89r8sYSvm9cOKtL3ELgGTrm7boQV8ii9Y9w04DI8PQ
 JmIernugcCxmhH36mVUAHgJf2+/T388xFUh/D5+skeUOUZpaJZG866rnb32WpsHc
 eWLFUeg=
 =Wypt
 -----END PGP SIGNATURE-----

Merge 6.14-rc6 into driver-core-next

We need the driver core fix in here as well.

Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-03-10 17:37:25 +01:00
Kees Cook
d70da12453 hardening: Enable i386 FORTIFY_SOURCE on Clang 16+
The i386 regparm bug exposed with FORTIFY_SOURCE with Clang was fixed
in Clang 16[1].

Link: c167c0a4dc [1]
Reviewed-by: Nathan Chancellor <nathan@kernel.org>
Link: https://lore.kernel.org/r/20250308042929.1753543-2-kees@kernel.org
Signed-off-by: Kees Cook <kees@kernel.org>
2025-03-08 09:16:42 -08:00
Jan Kara
93fd0d46cb
vfs: Remove invalidate_inodes()
The function can be replaced by evict_inodes. The only difference is
that evict_inodes() skips the inodes with positive refcount without
touching ->i_lock, but they are equivalent as evict_inodes() repeats the
refcount check after having grabbed ->i_lock.

Signed-off-by: Jan Kara <jack@suse.cz>
Link: https://lore.kernel.org/r/20250307144318.28120-2-jack@suse.cz
Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-03-08 12:19:22 +01:00
Dr. David Alan Gilbert
4ae89b1fe7
capability: Remove unused has_capability
The vanilla has_capability() function has been unused since 2018's
commit dcb569cf6a ("Smack: ptrace capability use fixes")

Remove it.

Fixup a comment in security/commoncap.c that referenced it.

Signed-off-by: Dr. David Alan Gilbert <linux@treblig.org>
Reviewed-by: Paul Moore <paul@paul-moore.com>
Signed-off-by: Serge Hallyn <sergeh@kernel.org>
2025-03-07 22:03:09 -06:00
Oleg Nesterov
71c769231f yama: don't abuse rcu_read_lock/get_task_struct in yama_task_prctl()
current->group_leader is stable, no need to take rcu_read_lock() and do
get/put_task_struct().

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Link: https://lore.kernel.org/r/20250219161417.GA20851@redhat.com
Signed-off-by: Kees Cook <kees@kernel.org>
2025-03-07 19:58:05 -08:00
Christian Göttsche
8af43b61c1 selinux: support wildcard network interface names
Add support for wildcard matching of network interface names.  This is
useful for auto-generated interfaces, for example podman creates network
interfaces for containers with the naming scheme podman0, podman1,
podman2, ...

To maintain backward compatibility guard this feature with a new policy
capability 'netif_wildcard'.

Netifcon definitions are compared against in the order given by the
policy, so userspace tools should sort them in a reasonable order.

Signed-off-by: Christian Göttsche <cgzones@googlemail.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2025-03-07 15:11:10 -05:00
Arulpandiyan Vadivel
d73ef9ec87 loadpin: remove MODULE_COMPRESS_NONE as it is no longer supported
Updated the MODULE_COMPRESS_NONE with MODULE_COMPRESS as it was no longer
available from kernel modules. As MODULE_COMPRESS and MODULE_DECOMPRESS
depends on MODULES removing MODULES as well.

Fixes: c7ff693fa2 ("module: Split modules_install compression and in-kernel decompression")
Signed-off-by: Arulpandiyan Vadivel <arulpandiyan.vadivel@siemens.com>
Link: https://lore.kernel.org/r/20250302103831.285381-1-arulpandiyan.vadivel@siemens.com
Signed-off-by: Kees Cook <kees@kernel.org>
2025-03-03 09:35:50 -08:00
Mel Gorman
ca758b147e fortify: Move FORTIFY_SOURCE under 'Kernel hardening options'
FORTIFY_SOURCE is a hardening option both at build and runtime. Move
it under 'Kernel hardening options'.

Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
Acked-by: Paul Moore <paul@paul-moore.com>
Link: https://lore.kernel.org/r/20250123221115.19722-5-mgorman@techsingularity.net
Signed-off-by: Kees Cook <kees@kernel.org>
2025-02-28 11:51:31 -08:00
Mel Gorman
d2132f453e mm: security: Allow default HARDENED_USERCOPY to be set at compile time
HARDENED_USERCOPY defaults to on if enabled at compile time. Allow
hardened_usercopy= default to be set at compile time similar to
init_on_alloc= and init_on_free=. The intent is that hardening
options that can be disabled at runtime can set their default at
build time.

Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
Link: https://lore.kernel.org/r/20250123221115.19722-3-mgorman@techsingularity.net
Signed-off-by: Kees Cook <kees@kernel.org>
2025-02-28 11:51:31 -08:00
Mel Gorman
f4d4e8b9d6 mm: security: Move hardened usercopy under 'Kernel hardening options'
There is a submenu for 'Kernel hardening options' under "Security".
Move HARDENED_USERCOPY under the hardening options as it is clearly
related.

Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
Acked-by: Paul Moore <paul@paul-moore.com>
Link: https://lore.kernel.org/r/20250123221115.19722-2-mgorman@techsingularity.net
Signed-off-by: Kees Cook <kees@kernel.org>
2025-02-28 11:51:31 -08:00
NeilBrown
88d5baf690
Change inode_operations.mkdir to return struct dentry *
Some filesystems, such as NFS, cifs, ceph, and fuse, do not have
complete control of sequencing on the actual filesystem (e.g.  on a
different server) and may find that the inode created for a mkdir
request already exists in the icache and dcache by the time the mkdir
request returns.  For example, if the filesystem is mounted twice the
directory could be visible on the other mount before it is on the
original mount, and a pair of name_to_handle_at(), open_by_handle_at()
calls could instantiate the directory inode with an IS_ROOT() dentry
before the first mkdir returns.

This means that the dentry passed to ->mkdir() may not be the one that
is associated with the inode after the ->mkdir() completes.  Some
callers need to interact with the inode after the ->mkdir completes and
they currently need to perform a lookup in the (rare) case that the
dentry is no longer hashed.

This lookup-after-mkdir requires that the directory remains locked to
avoid races.  Planned future patches to lock the dentry rather than the
directory will mean that this lookup cannot be performed atomically with
the mkdir.

To remove this barrier, this patch changes ->mkdir to return the
resulting dentry if it is different from the one passed in.
Possible returns are:
  NULL - the directory was created and no other dentry was used
  ERR_PTR() - an error occurred
  non-NULL - this other dentry was spliced in

This patch only changes file-systems to return "ERR_PTR(err)" instead of
"err" or equivalent transformations.  Subsequent patches will make
further changes to some file-systems to return a correct dentry.

Not all filesystems reliably result in a positive hashed dentry:

- NFS, cifs, hostfs will sometimes need to perform a lookup of
  the name to get inode information.  Races could result in this
  returning something different. Note that this lookup is
  non-atomic which is what we are trying to avoid.  Placing the
  lookup in filesystem code means it only happens when the filesystem
  has no other option.
- kernfs and tracefs leave the dentry negative and the ->revalidate
  operation ensures that lookup will be called to correctly populate
  the dentry.  This could be fixed but I don't think it is important
  to any of the users of vfs_mkdir() which look at the dentry.

The recommendation to use
    d_drop();d_splice_alias()
is ugly but fits with current practice.  A planned future patch will
change this.

Reviewed-by: Jeff Layton <jlayton@kernel.org>
Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: NeilBrown <neilb@suse.de>
Link: https://lore.kernel.org/r/20250227013949.536172-2-neilb@suse.de
Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-02-27 20:00:17 +01:00
Miklos Szeredi
7d90fb5253
selinux: add FILE__WATCH_MOUNTNS
Watching mount namespaces for changes (mount, umount, move mount) was added
by previous patches.

This patch adds the file/watch_mountns permission that can be applied to
nsfs files (/proc/$$/ns/mnt), making it possible to allow or deny watching
a particular namespace for changes.

Suggested-by: Paul Moore <paul@paul-moore.com>
Link: https://lore.kernel.org/all/CAHC9VhTOmCjCSE2H0zwPOmpFopheexVb6jyovz92ZtpKtoVv6A@mail.gmail.com/
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
Link: https://lore.kernel.org/r/20250224154836.958915-1-mszeredi@redhat.com
Acked-by: Paul Moore <paul@paul-moore.com>
Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-02-27 09:16:04 +01:00
"Kipp N. Davis"
2c2b1e0597 selinux: add permission checks for loading other kinds of kernel files
Although the LSM hooks for loading kernel modules were later generalized
to cover loading other kinds of files, SELinux didn't implement
corresponding permission checks, leaving only the module case covered.
Define and add new permission checks for these other cases.

Signed-off-by: Cameron K. Williams <ckwilliams.work@gmail.com>
Signed-off-by: Kipp N. Davis <kippndavis.work@gmx.com>
Acked-by: Stephen Smalley <stephen.smalley.work@gmail.com>
[PM: merge fuzz, line length, and spacing fixes]
Signed-off-by: Paul Moore <paul@paul-moore.com>
2025-02-26 15:14:43 -05:00
Linus Torvalds
c0d35086a2 Landlock fix for v6.14-rc5
-----BEGIN PGP SIGNATURE-----
 
 iIYEABYKAC4WIQSVyBthFV4iTW/VU1/l49DojIL20gUCZ785FhAcbWljQGRpZ2lr
 b2QubmV0AAoJEOXj0OiMgvbSILQBAMFwpFClzjVeWyLFNd/gaTlPWeedvnag+yZu
 CK9q39jOAP9tj1unQBFpsI7jsTk6ZxPVxb4DymPIirZMb/5FuxUnDQ==
 =QrSM
 -----END PGP SIGNATURE-----

Merge tag 'landlock-6.14-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/mic/linux

Pull landlock fixes from Mickaël Salaün:
 "Fixes to TCP socket identification, documentation, and tests"

* tag 'landlock-6.14-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/mic/linux:
  selftests/landlock: Add binaries to .gitignore
  selftests/landlock: Test that MPTCP actions are not restricted
  selftests/landlock: Test TCP accesses with protocol=IPPROTO_TCP
  landlock: Fix non-TCP sockets restriction
  landlock: Minor typo and grammar fixes in IPC scoping documentation
  landlock: Fix grammar error
  selftests/landlock: Enable the new CONFIG_AF_UNIX_OOB
2025-02-26 11:55:44 -08:00
Linus Torvalds
d62fdaf51b integrity-v6.14-fix
-----BEGIN PGP SIGNATURE-----
 
 iIoEABYKADIWIQQdXVVFGN5XqKr1Hj7LwZzRsCrn5QUCZ78LSBQcem9oYXJAbGlu
 dXguaWJtLmNvbQAKCRDLwZzRsCrn5X1kAQCsB9LO0NX+DhZILASeSJfzjAkY0cGY
 jzQ3eq3keLkXrQD/RjeKD9e/qjBrXX0C4qbr9e8+JYuzY22o911bEiK67wg=
 =lOQV
 -----END PGP SIGNATURE-----

Merge tag 'integrity-v6.14-fix' of git://git.kernel.org/pub/scm/linux/kernel/git/zohar/linux-integrity

Pull integrity fixes from Mimi Zohar:
 "One bugfix and one spelling cleanup. The bug fix restores a
  performance improvement"

* tag 'integrity-v6.14-fix' of git://git.kernel.org/pub/scm/linux/kernel/git/zohar/linux-integrity:
  ima: Reset IMA_NONACTION_RULE_FLAGS after post_setattr
  integrity: fix typos and spelling errors
2025-02-26 11:47:19 -08:00
Luo Gengkun
9ec84f79c5 perf: Remove unnecessary parameter of security check
It seems that the attr parameter was never been used in security
checks since it was first introduced by:

commit da97e18458 ("perf_event: Add support for LSM and SELinux checks")

so remove it.

Signed-off-by: Luo Gengkun <luogengkun@huaweicloud.com>
Reviewed-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2025-02-26 14:13:58 -05:00
Greg Kroah-Hartman
2ce177e9b3 Merge 6.14-rc3 into driver-core-next
We need the faux_device changes in here for future work.

Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-02-17 07:24:33 +01:00
Konstantin Andreev
a158a937d8 smack: recognize ipv4 CIPSO w/o categories
If SMACK label has CIPSO representation w/o categories, e.g.:

| # cat /smack/cipso2
| foo  10
| @ 250/2
| ...

then SMACK does not recognize such CIPSO in input ipv4 packets
and substitues '*' label instead. Audit records may look like

| lsm=SMACK fn=smack_socket_sock_rcv_skb action=denied
|   subject="*" object="_" requested=w pid=0 comm="swapper/1" ...

This happens in two steps:

1) security/smack/smackfs.c`smk_set_cipso
   does not clear NETLBL_SECATTR_MLS_CAT
   from (struct smack_known *)skp->smk_netlabel.flags
   on assigning CIPSO w/o categories:

| rcu_assign_pointer(skp->smk_netlabel.attr.mls.cat, ncats.attr.mls.cat);
| skp->smk_netlabel.attr.mls.lvl = ncats.attr.mls.lvl;

2) security/smack/smack_lsm.c`smack_from_secattr
   can not match skp->smk_netlabel with input packet's
   struct netlbl_lsm_secattr *sap
   because sap->flags have not NETLBL_SECATTR_MLS_CAT (what is correct)
   but skp->smk_netlabel.flags have (what is incorrect):

| if ((sap->flags & NETLBL_SECATTR_MLS_CAT) == 0) {
| 	if ((skp->smk_netlabel.flags &
| 		 NETLBL_SECATTR_MLS_CAT) == 0)
| 		found = 1;
| 	break;
| }

This commit sets/clears NETLBL_SECATTR_MLS_CAT in
skp->smk_netlabel.flags according to the presense of CIPSO categories.
The update of smk_netlabel is not atomic, so input packets processing
still may be incorrect during short time while update proceeds.

Signed-off-by: Konstantin Andreev <andreev@swemel.ru>
Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
2025-02-16 14:17:55 -08:00
Konstantin Andreev
c7fb50cecf smack: Revert "smackfs: Added check catlen"
This reverts commit ccfd889acb

The indicated commit
* does not describe the problem that change tries to solve
* has programming issues
* introduces a bug: forever clears NETLBL_SECATTR_MLS_CAT
         in (struct smack_known *)skp->smk_netlabel.flags

Reverting the commit to reapproach original problem

Signed-off-by: Konstantin Andreev <andreev@swemel.ru>
Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
2025-02-16 10:30:23 -08:00
Sebastian Andrzej Siewior
741c10b096 kernfs: Use RCU to access kernfs_node::name.
Using RCU lifetime rules to access kernfs_node::name can avoid the
trouble with kernfs_rename_lock in kernfs_name() and kernfs_path_from_node()
if the fs was created with KERNFS_ROOT_INVARIANT_PARENT. This is usefull
as it allows to implement kernfs_path_from_node() only with RCU
protection and avoiding kernfs_rename_lock. The lock is only required if
the __parent node can be changed and the function requires an unchanged
hierarchy while it iterates from the node to its parent.
The change is needed to allow the lookup of the node's path
(kernfs_path_from_node()) from context which runs always with disabled
preemption and or interrutps even on PREEMPT_RT. The problem is that
kernfs_rename_lock becomes a sleeping lock on PREEMPT_RT.

I went through all ::name users and added the required access for the lookup
with a few extensions:
- rdtgroup_pseudo_lock_create() drops all locks and then uses the name
  later on. resctrl supports rename with different parents. Here I made
  a temporal copy of the name while it is used outside of the lock.

- kernfs_rename_ns() accepts NULL as new_parent. This simplifies
  sysfs_move_dir_ns() where it can set NULL in order to reuse the current
  name.

- kernfs_rename_ns() is only using kernfs_rename_lock if the parents are
  different. All users use either kernfs_rwsem (for stable path view) or
  just RCU for the lookup. The ::name uses always RCU free.

Use RCU lifetime guarantees to access kernfs_node::name.

Suggested-by: Tejun Heo <tj@kernel.org>
Acked-by: Tejun Heo <tj@kernel.org>
Reported-by: syzbot+6ea37e2e6ffccf41a7e6@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/lkml/67251dc6.050a0220.529b6.015e.GAE@google.com/
Reported-by: Hillf Danton <hdanton@sina.com>
Closes: https://lore.kernel.org/20241102001224.2789-1-hdanton@sina.com
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Link: https://lore.kernel.org/r/20250213145023.2820193-7-bigeasy@linutronix.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-02-15 17:46:32 +01:00
Mikhail Ivanov
854277e2cc
landlock: Fix non-TCP sockets restriction
Use sk_is_tcp() to check if socket is TCP in bind(2) and connect(2)
hooks.

SMC, MPTCP, SCTP protocols are currently restricted by TCP access
rights.  The purpose of TCP access rights is to provide control over
ports that can be used by userland to establish a TCP connection.
Therefore, it is incorrect to deny bind(2) and connect(2) requests for a
socket of another protocol.

However, SMC, MPTCP and RDS implementations use TCP internal sockets to
establish communication or even to exchange packets over a TCP
connection [1]. Landlock rules that configure bind(2) and connect(2)
usage for TCP sockets should not cover requests for sockets of such
protocols. These protocols have different set of security issues and
security properties, therefore, it is necessary to provide the userland
with the ability to distinguish between them (eg. [2]).

Control over TCP connection used by other protocols can be achieved with
upcoming support of socket creation control [3].

[1] https://lore.kernel.org/all/62336067-18c2-3493-d0ec-6dd6a6d3a1b5@huawei-partners.com/
[2] https://lore.kernel.org/all/20241204.fahVio7eicim@digikod.net/
[3] https://lore.kernel.org/all/20240904104824.1844082-1-ivanov.mikhail1@huawei-partners.com/

Closes: https://github.com/landlock-lsm/linux/issues/40
Fixes: fff69fb03d ("landlock: Support network rules with TCP bind and connect")
Signed-off-by: Mikhail Ivanov <ivanov.mikhail1@huawei-partners.com>
Link: https://lore.kernel.org/r/20250205093651.1424339-2-ivanov.mikhail1@huawei-partners.com
[mic: Format commit message to 72 columns]
Signed-off-by: Mickaël Salaün <mic@digikod.net>
2025-02-14 09:23:09 +01:00
Tanya Agarwal
143c9aae04
landlock: Fix grammar error
Fix grammar error in comments that were identified using the codespell
tool.

Signed-off-by: Tanya Agarwal <tanyaagarwal25699@gmail.com>
Reviewed-by: Günther Noack <gnoack@google.com>
Link: https://lore.kernel.org/r/20250123194208.2660-1-tanyaagarwal25699@gmail.com
[mic: Simplify commit message]
Signed-off-by: Mickaël Salaün <mic@digikod.net>
2025-02-14 09:23:08 +01:00
Konstantin Andreev
bf9f14c91a smack: remove /smack/logging if audit is not configured
If CONFIG_AUDIT is not set then
SMACK does not generate audit messages,
however, keeps audit control file, /smack/logging,
while there is no entity to control.
This change removes audit control file /smack/logging
when audit is not configured in the kernel

Signed-off-by: Konstantin Andreev <andreev@swemel.ru>
Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
2025-02-13 18:33:55 -08:00
Konstantin Andreev
6cce0cc386 smack: ipv4/ipv6: tcp/dccp/sctp: fix incorrect child socket label
Since inception [1], SMACK initializes ipv* child socket security
for connection-oriented communications (tcp/sctp/dccp)
during accept() syscall, in the security_sock_graft() hook:

| void smack_sock_graft(struct sock *sk, ...)
| {
|     // only ipv4 and ipv6 are eligible here
|     // ...
|     ssp = sk->sk_security; // socket security
|     ssp->smk_in = skp;     // process label: smk_of_current()
|     ssp->smk_out = skp;    // process label: smk_of_current()
| }

This approach is incorrect for two reasons:

A) initialization occurs too late for child socket security:

   The child socket is created by the kernel once the handshake
   completes (e.g., for tcp: after receiving ack for syn+ack).

   Data can legitimately start arriving to the child socket
   immediately, long before the application calls accept()
   on the socket.

   Those data are (currently — were) processed by SMACK using
   incorrect child socket security attributes.

B) Incoming connection requests are handled using the listening
   socket's security, hence, the child socket must inherit the
   listening socket's security attributes.

   smack_sock_graft() initilizes the child socket's security with
   a process label, as is done for a new socket()

   But ... the process label is not necessarily the same as the
   listening socket label. A privileged application may legitimately
   set other in/out labels for a listening socket.

   When this happens, SMACK processes incoming packets using
   incorrect socket security attributes.

In [2] Michael Lontke noticed (A) and fixed it in [3] by adding
socket initialization into security_sk_clone_security() hook like

| void smack_sk_clone_security(struct sock *oldsk, struct sock *newsk)
| {
|    *(struct socket_smack *)newsk->sk_security =
|    *(struct socket_smack *)oldsk->sk_security;
| }

This initializes the child socket security with the parent (listening)
socket security at the appropriate time.

I was forced to revisit this old story because

smack_sock_graft() was left in place by [3] and continues overwriting
the child socket's labels with the process label,
and there might be a reason for this, so I undertook a study.

If the process label differs from the listening socket's labels,
the following occurs for ipv4:

assigning the smk_out is not accompanied by netlbl_sock_setattr,
so the outgoing packet's cipso label does not change.

So, the only effect of this assignment for interhost communications
is a divergence between the program-visible “out” socket label and
the cipso network label. For intrahost communications this label,
however, becomes visible via secmark netfilter marking, and is
checked for access rights by the client, receiving side.

Assigning the smk_in affects both interhost and intrahost
communications: the server begins to check access rights against
an wrong label.

Access check against wrong label (smk_in or smk_out),
unsurprisingly fails, breaking the connection.

The above affects protocols that calls security_sock_graft()
during accept(), namely: {tcp,dccp,sctp}/{ipv4,ipv6}
One extra security_sock_graft() caller, crypto/af_alg.c`af_alg_accept
is not affected, because smack_sock_graft() does nothing for PF_ALG.

To reproduce, assign non-default in/out labels to a listening socket,
setup rules between these labels and client label, attempt to connect
and send some data.

Ipv6 specific: ipv6 packets do not convey SMACK labels. To reproduce
the issue in interhost communications set opposite labels in
/smack/ipv6host on both hosts.
Ipv6 intrahost communications do not require tricking, because SMACK
labels are conveyed via secmark netfilter marking.

So, currently smack_sock_graft() is not useful, but harmful,
therefore, I have removed it.

This fixes the issue for {tcp,dccp}/{ipv4,ipv6},
but not sctp/{ipv4,ipv6}.

Although this change is necessary for sctp+smack to function
correctly, it is not sufficient because:
sctp/ipv4 does not call security_sk_clone() and
sctp/ipv6 ignores SMACK completely.

These are separate issues, belong to other subsystem,
and should be addressed separately.

[1] 2008-02-04,
Fixes: e114e47377 ("Smack: Simplified Mandatory Access Control Kernel")

[2] Michael Lontke, 2022-08-31, SMACK LSM checks wrong object label
                                during ingress network traffic
Link: https://lore.kernel.org/linux-security-module/6324997ce4fc092c5020a4add075257f9c5f6442.camel@elektrobit.com/

[3] 2022-08-31, michael.lontke,
    commit 4ca165fc6c ("SMACK: Add sk_clone_security LSM hook")

Signed-off-by: Konstantin Andreev <andreev@swemel.ru>
Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
2025-02-13 13:25:14 -08:00
Konstantin Andreev
bfcf4004bc smack: dont compile ipv6 code unless ipv6 is configured
I want to be sure that ipv6-specific code
is not compiled in kernel binaries
if ipv6 is not configured.

[1] was getting rid of "unused variable" warning, but,
with that, it also mandated compilation of a handful ipv6-
specific functions in ipv4-only kernel configurations:

smk_ipv6_localhost, smack_ipv6host_label, smk_ipv6_check.

Their compiled bodies are likely to be removed by compiler
from the resulting binary, but, to be on the safe side,
I remove them from the compiler view.

[1]
Fixes: 00720f0e7f ("smack: avoid unused 'sip' variable warning")

Signed-off-by: Konstantin Andreev <andreev@swemel.ru>
Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
2025-02-12 13:05:15 -08:00
Casey Schaufler
2aad5cd1db Smack: fix typos and spelling errors
Fix typos and spelling errors in security/smack module comments that
were identified using the codespell tool.
No functional changes - documentation only.

Signed-off-by: Tanya Agarwal <tanyaagarwal25699@gmail.com>
Reviewed-by: Mimi Zohar <zohar@linux.ibm.com>
Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
2025-02-11 14:34:02 -08:00
Linus Torvalds
09fbf3d502 Redo of pathname patternization and fix spelling errors.
Tetsuo Handa (3):
   tomoyo: use better patterns for procfs in learning mode
   tomoyo: fix spelling errors
 
 Tanya Agarwal (1):
   tomoyo: fix spelling error
 
  security/tomoyo/common.c        |  145 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++------------------
  security/tomoyo/domain.c        |    2 -
  security/tomoyo/securityfs_if.c |    6 +--
  security/tomoyo/tomoyo.c        |    5 --
  4 files changed, 117 insertions(+), 41 deletions(-)
 -----BEGIN PGP SIGNATURE-----
 
 iQJXBAABCABBFiEEQ8gzaWI9etOpbC/HQl8SjQxk9SoFAmerSHojHHBlbmd1aW4t
 a2VybmVsQGktbG92ZS5zYWt1cmEubmUuanAACgkQQl8SjQxk9Sreeg/+KHl8qDxi
 eLbml7ihosJ+qOJhCdHgsboNqoIrPFS7fp89YLwOBoBBsvT1zbqwcKJhAU0bAlFt
 bDHcKnt0anrokzxAhajVlikmnKP4CeW/fFf4m2UXJ7/uYinoYyizutqsduthF6G9
 MwxCpSnReOKsys8tVZyBtCNk53vKk6o0I1GGZ56GG7fNOz6UR81z8mUhJRcqj9Gk
 0gqdfC8RLNbXbygOHbZINM3TUmn/MW9E/ALRU+a1Ay+9ZazJE26R87N7/O7ksDB3
 LGB1OZ6eEJkCMJm583wa2AdmUdaT5DPv7PWVEYkPjFkYjQzPO4GuBONPJHsElM1O
 OybzMuQKlWqMWvFwky7wbUNaKkxqXNm9BVCg1tdbuQubB6moZSUjcZTM4SXnSD5K
 Vn5AGfgLHMJ5nC576ZbyAARYdo3I0ffkv1Ql8sUio3giybLw+QFika1dpQrVupka
 fWZxj3znJXP0sNpunb9ffOMlf92jj/XD9382bBNUbR/aEP50YwvYnAfWek1+H4YI
 kVDFiCb3PVx1pwfJjKezcsN1Rp0JXRFXMDcd0Pm7n0vCs3MdWwJ44/xUICBs/UR2
 B/+ocCQykQDgMYW9LC9fZnZ+A0KfhLJpUNjqX+B2pQR3Cyr51l+BY8QyUlBL3Eci
 XQyX+dqDdh0L7dXyBjowYGvo9cEJ6aNkwQ8=
 =cPFI
 -----END PGP SIGNATURE-----

Merge tag 'tomoyo-pr-20250211' of git://git.code.sf.net/p/tomoyo/tomoyo

Pull tomoyo fixes from Tetsuo Handa:
 "Redo of pathname patternization and fix spelling errors"

* tag 'tomoyo-pr-20250211' of git://git.code.sf.net/p/tomoyo/tomoyo:
  tomoyo: use better patterns for procfs in learning mode
  tomoyo: fix spelling errors
  tomoyo: fix spelling error
2025-02-11 10:19:36 -08:00
Nathan Chancellor
3e45553acb apparmor: Remove unused variable 'sock' in __file_sock_perm()
When CONFIG_SECURITY_APPARMOR_DEBUG_ASSERTS is disabled, there is a
warning that sock is unused:

  security/apparmor/file.c: In function '__file_sock_perm':
  security/apparmor/file.c:544:24: warning: unused variable 'sock' [-Wunused-variable]
    544 |         struct socket *sock = (struct socket *) file->private_data;
        |                        ^~~~

sock was moved into aa_sock_file_perm(), where the same check is
present, so remove sock and the assertion from __file_sock_perm() to fix
the warning.

Fixes: c05e705812 ("apparmor: add fine grained af_unix mediation")
Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202501190757.myuLxLyL-lkp@intel.com/
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
Signed-off-by: John Johansen <john.johansen@canonical.com>
2025-02-10 11:18:45 -08:00
Mateusz Guzik
67e370aa7f apparmor: use the condition in AA_BUG_FMT even with debug disabled
This follows the established practice and fixes a build failure for me:
security/apparmor/file.c: In function ‘__file_sock_perm’:
security/apparmor/file.c:544:24: error: unused variable ‘sock’ [-Werror=unused-variable]
  544 |         struct socket *sock = (struct socket *) file->private_data;
      |                        ^~~~

Signed-off-by: Mateusz Guzik <mjguzik@gmail.com>
Signed-off-by: John Johansen <john.johansen@canonical.com>
2025-02-10 11:17:59 -08:00
Tanya Agarwal
aabbe6f908 apparmor: fix typos and spelling errors
Fix typos and spelling errors in apparmor module comments that were
identified using the codespell tool.
No functional changes - documentation only.

Signed-off-by: Tanya Agarwal <tanyaagarwal25699@gmail.com>
Reviewed-by: Mimi Zohar <zohar@linux.ibm.com>
Ryan Lee <ryan.lee@canonical.com>
Signed-off-by: John Johansen <john.johansen@canonical.com>
2025-02-10 11:17:49 -08:00
Jiapeng Chong
04fe43104e apparmor: Modify mismatched function name
No functional modification involved.

security/apparmor/lib.c:93: warning: expecting prototype for aa_mask_to_str(). Prototype was for val_mask_to_str() instead.

Reported-by: Abaci Robot <abaci@linux.alibaba.com>
Closes: https://bugzilla.openanolis.cn/show_bug.cgi?id=13606
Signed-off-by: Jiapeng Chong <jiapeng.chong@linux.alibaba.com>
Signed-off-by: John Johansen <john.johansen@canonical.com>
2025-02-10 11:17:33 -08:00
Jiapeng Chong
aa904fa118 apparmor: Modify mismatched function name
No functional modification involved.

security/apparmor/file.c:184: warning: expecting prototype for aa_lookup_fperms(). Prototype was for aa_lookup_condperms() instead.

Reported-by: Abaci Robot <abaci@linux.alibaba.com>
Closes: https://bugzilla.openanolis.cn/show_bug.cgi?id=13605
Signed-off-by: Jiapeng Chong <jiapeng.chong@linux.alibaba.com>
Signed-off-by: John Johansen <john.johansen@canonical.com>
2025-02-10 11:16:45 -08:00
Nathan Chancellor
509f8cb2ff apparmor: Fix checking address of an array in accum_label_info()
clang warns:

  security/apparmor/label.c:206:15: error: address of array 'new->vec' will always evaluate to 'true' [-Werror,-Wpointer-bool-conversion]
    206 |         AA_BUG(!new->vec);
        |                ~~~~~~^~~

The address of this array can never be NULL because it is not at the
beginning of a structure. Convert the assertion to check that the new
pointer is not NULL.

Fixes: de4754c801 ("apparmor: carry mediation check on label")
Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202501191802.bDp2voTJ-lkp@intel.com/
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
Signed-off-by: John Johansen <john.johansen@canonical.com>
2025-02-10 11:16:30 -08:00
Hamza Mahfooz
c6ad9fdbd4 io_uring,lsm,selinux: add LSM hooks for io_uring_setup()
It is desirable to allow LSM to configure accessibility to io_uring
because it is a coarse yet very simple way to restrict access to it. So,
add an LSM for io_uring_allowed() to guard access to io_uring.

Cc: Paul Moore <paul@paul-moore.com>
Signed-off-by: Hamza Mahfooz <hamzamahfooz@linux.microsoft.com>
Acked-by: Jens Axboe <axboe@kernel.dk>
[PM: merge fuzz due to changes in preceding patches, subj tweak]
Signed-off-by: Paul Moore <paul@paul-moore.com>
2025-02-07 17:17:49 -05:00
Paul Moore
5fc80fb5b7 selinux: always check the file label in selinux_kernel_read_file()
Commit 2039bda1fa ("LSM: Add "contents" flag to kernel_read_file hook")
added a new flag to the security_kernel_read_file() LSM hook, "contents",
which was set if a file was being read in its entirety or if it was the
first chunk read in a multi-step process.  The SELinux LSM callback was
updated to only check against the file label if this "contents" flag was
set, meaning that in multi-step reads the file label was not considered
in the access control decision after the initial chunk.

Thankfully the only in-tree user that performs a multi-step read is the
"bcm-vk" driver and it is loading firmware, not a kernel module, so there
are no security regressions to worry about.  However, we still want to
ensure that the SELinux code does the right thing, and *always* checks
the file label, especially as there is a chance the file could change
between chunk reads.

Fixes: 2039bda1fa ("LSM: Add "contents" flag to kernel_read_file hook")
Signed-off-by: Paul Moore <paul@paul-moore.com>
2025-02-07 15:30:54 -05:00
Kaixiong Yu
b121dd4d55 security: min_addr: move sysctl to security/min_addr.c
The dac_mmap_min_addr belongs to min_addr.c, move it to
min_addr.c from /kernel/sysctl.c. In the previous Linux kernel
boot process, sysctl_init_bases needs to be executed before
init_mmap_min_addr, So, register_sysctl_init should be executed
before update_mmap_min_addr in init_mmap_min_addr. And according
to the compilation condition in security/Makefile:

      obj-$(CONFIG_MMU)            += min_addr.o

if CONFIG_MMU is not defined, min_addr.c would not be included in the
compilation process. So, drop the CONFIG_MMU check.

Signed-off-by: Kaixiong Yu <yukaixiong@huawei.com>
Reviewed-by: Kees Cook <kees@kernel.org>
Acked-by: Paul Moore <paul@paul-moore.com>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Joel Granados <joel.granados@kernel.org>
2025-02-07 16:53:04 +01:00
Roberto Sassu
57a0ef02fe ima: Reset IMA_NONACTION_RULE_FLAGS after post_setattr
Commit 0d73a55208 ("ima: re-introduce own integrity cache lock")
mistakenly reverted the performance improvement introduced in commit
42a4c60319 ("ima: fix ima_inode_post_setattr"). The unused bit mask was
subsequently removed by commit 11c60f23ed ("integrity: Remove unused
macro IMA_ACTION_RULE_FLAGS").

Restore the performance improvement by introducing the new mask
IMA_NONACTION_RULE_FLAGS, equal to IMA_NONACTION_FLAGS without
IMA_NEW_FILE, which is not a rule-specific flag.

Finally, reset IMA_NONACTION_RULE_FLAGS instead of IMA_NONACTION_FLAGS in
process_measurement(), if the IMA_CHANGE_ATTR atomic flag is set (after
file metadata modification).

With this patch, new files for which metadata were modified while they are
still open, can be reopened before the last file close (when security.ima
is written), since the IMA_NEW_FILE flag is not cleared anymore. Otherwise,
appraisal fails because security.ima is missing (files with IMA_NEW_FILE
set are an exception).

Cc: stable@vger.kernel.org # v4.16.x
Fixes: 0d73a55208 ("ima: re-introduce own integrity cache lock")
Signed-off-by: Roberto Sassu <roberto.sassu@huawei.com>
Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>
2025-02-04 21:36:43 -05:00
Tanya Agarwal
ceb5faef84 integrity: fix typos and spelling errors
Fix typos and spelling errors in integrity module comments that were
identified using the codespell tool.
No functional changes - documentation only.

Signed-off-by: Tanya Agarwal <tanyaagarwal25699@gmail.com>
Reviewed-by: Mimi Zohar <zohar@linux.ibm.com>
Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>
2025-02-04 21:36:43 -05:00
Tanya Agarwal
75eb39f2f5 selinux: fix spelling error
Fix spelling error in selinux module comments that were identified
using the codespell tool.
No functional changes - documentation only.

Signed-off-by: Tanya Agarwal <tanyaagarwal25699@gmail.com>
Reviewed-by: Mimi Zohar <zohar@linux.ibm.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2025-02-03 16:47:20 -05:00
Tetsuo Handa
bdc35f164b tomoyo: use better patterns for procfs in learning mode
Commit 08ae2487b2 ("tomoyo: automatically use patterns for several
situations in learning mode") replaced only $PID part of procfs pathname
with \$ pattern. But it turned out that we need to also replace $TID part
and $FD part to make this functionality useful for e.g. /bin/lsof .

Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
2025-01-31 00:27:44 +09:00
Joel Granados
1751f872cc treewide: const qualify ctl_tables where applicable
Add the const qualifier to all the ctl_tables in the tree except for
watchdog_hardlockup_sysctl, memory_allocation_profiling_sysctls,
loadpin_sysctl_table and the ones calling register_net_sysctl (./net,
drivers/inifiniband dirs). These are special cases as they use a
registration function with a non-const qualified ctl_table argument or
modify the arrays before passing them on to the registration function.

Constifying ctl_table structs will prevent the modification of
proc_handler function pointers as the arrays would reside in .rodata.
This is made possible after commit 78eb4ea25c ("sysctl: treewide:
constify the ctl_table argument of proc_handlers") constified all the
proc_handlers.

Created this by running an spatch followed by a sed command:
Spatch:
    virtual patch

    @
    depends on !(file in "net")
    disable optional_qualifier
    @

    identifier table_name != {
      watchdog_hardlockup_sysctl,
      iwcm_ctl_table,
      ucma_ctl_table,
      memory_allocation_profiling_sysctls,
      loadpin_sysctl_table
    };
    @@

    + const
    struct ctl_table table_name [] = { ... };

sed:
    sed --in-place \
      -e "s/struct ctl_table .table = &uts_kern/const struct ctl_table *table = \&uts_kern/" \
      kernel/utsname_sysctl.c

Reviewed-by: Song Liu <song@kernel.org>
Acked-by: Steven Rostedt (Google) <rostedt@goodmis.org> # for kernel/trace/
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> # SCSI
Reviewed-by: Darrick J. Wong <djwong@kernel.org> # xfs
Acked-by: Jani Nikula <jani.nikula@intel.com>
Acked-by: Corey Minyard <cminyard@mvista.com>
Acked-by: Wei Liu <wei.liu@kernel.org>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Bill O'Donnell <bodonnel@redhat.com>
Acked-by: Baoquan He <bhe@redhat.com>
Acked-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
Acked-by: Anna Schumaker <anna.schumaker@oracle.com>
Signed-off-by: Joel Granados <joel.granados@kernel.org>
2025-01-28 13:48:37 +01:00
Linus Torvalds
c159dfbdd4 Mainly individually changelogged singleton patches. The patch series in
this pull are:
 
 - "lib min_heap: Improve min_heap safety, testing, and documentation"
   from Kuan-Wei Chiu provides various tightenings to the min_heap library
   code.
 
 - "xarray: extract __xa_cmpxchg_raw" from Tamir Duberstein preforms some
   cleanup and Rust preparation in the xarray library code.
 
 - "Update reference to include/asm-<arch>" from Geert Uytterhoeven fixes
   pathnames in some code comments.
 
 - "Converge on using secs_to_jiffies()" from Easwar Hariharan uses the
   new secs_to_jiffies() in various places where that is appropriate.
 
 - "ocfs2, dlmfs: convert to the new mount API" from Eric Sandeen
   switches two filesystems to the new mount API.
 
 - "Convert ocfs2 to use folios" from Matthew Wilcox does that.
 
 - "Remove get_task_comm() and print task comm directly" from Yafang Shao
   removes now-unneeded calls to get_task_comm() in various places.
 
 - "squashfs: reduce memory usage and update docs" from Phillip Lougher
   implements some memory savings in squashfs and performs some
   maintainability work.
 
 - "lib: clarify comparison function requirements" from Kuan-Wei Chiu
   tightens the sort code's behaviour and adds some maintenance work.
 
 - "nilfs2: protect busy buffer heads from being force-cleared" from
   Ryusuke Konishi fixes an issues in nlifs when the fs is presented with a
   corrupted image.
 
 - "nilfs2: fix kernel-doc comments for function return values" from
   Ryusuke Konishi fixes some nilfs kerneldoc.
 
 - "nilfs2: fix issues with rename operations" from Ryusuke Konishi
   addresses some nilfs BUG_ONs which syzbot was able to trigger.
 
 - "minmax.h: Cleanups and minor optimisations" from David Laight
   does some maintenance work on the min/max library code.
 
 - "Fixes and cleanups to xarray" from Kemeng Shi does maintenance work
   on the xarray library code.
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQTTMBEPP41GrTpTJgfdBJ7gKXxAjgUCZ5SP5QAKCRDdBJ7gKXxA
 jqN7AQChvwXGG43n4d5SDiA/rH7ddvowQcDqhC9cAMJ1ReR7qwEA8/LIWDE4PdMX
 mJnaZ1/ibpEpearrChCViApQtcyEGQI=
 =ti4E
 -----END PGP SIGNATURE-----

Merge tag 'mm-nonmm-stable-2025-01-24-23-16' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm

Pull non-MM updates from Andrew Morton:
 "Mainly individually changelogged singleton patches. The patch series
  in this pull are:

   - "lib min_heap: Improve min_heap safety, testing, and documentation"
     from Kuan-Wei Chiu provides various tightenings to the min_heap
     library code

   - "xarray: extract __xa_cmpxchg_raw" from Tamir Duberstein preforms
     some cleanup and Rust preparation in the xarray library code

   - "Update reference to include/asm-<arch>" from Geert Uytterhoeven
     fixes pathnames in some code comments

   - "Converge on using secs_to_jiffies()" from Easwar Hariharan uses
     the new secs_to_jiffies() in various places where that is
     appropriate

   - "ocfs2, dlmfs: convert to the new mount API" from Eric Sandeen
     switches two filesystems to the new mount API

   - "Convert ocfs2 to use folios" from Matthew Wilcox does that

   - "Remove get_task_comm() and print task comm directly" from Yafang
     Shao removes now-unneeded calls to get_task_comm() in various
     places

   - "squashfs: reduce memory usage and update docs" from Phillip
     Lougher implements some memory savings in squashfs and performs
     some maintainability work

   - "lib: clarify comparison function requirements" from Kuan-Wei Chiu
     tightens the sort code's behaviour and adds some maintenance work

   - "nilfs2: protect busy buffer heads from being force-cleared" from
     Ryusuke Konishi fixes an issues in nlifs when the fs is presented
     with a corrupted image

   - "nilfs2: fix kernel-doc comments for function return values" from
     Ryusuke Konishi fixes some nilfs kerneldoc

   - "nilfs2: fix issues with rename operations" from Ryusuke Konishi
     addresses some nilfs BUG_ONs which syzbot was able to trigger

   - "minmax.h: Cleanups and minor optimisations" from David Laight does
     some maintenance work on the min/max library code

   - "Fixes and cleanups to xarray" from Kemeng Shi does maintenance
     work on the xarray library code"

* tag 'mm-nonmm-stable-2025-01-24-23-16' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (131 commits)
  ocfs2: use str_yes_no() and str_no_yes() helper functions
  include/linux/lz4.h: add some missing macros
  Xarray: use xa_mark_t in xas_squash_marks() to keep code consistent
  Xarray: remove repeat check in xas_squash_marks()
  Xarray: distinguish large entries correctly in xas_split_alloc()
  Xarray: move forward index correctly in xas_pause()
  Xarray: do not return sibling entries from xas_find_marked()
  ipc/util.c: complete the kernel-doc function descriptions
  gcov: clang: use correct function param names
  latencytop: use correct kernel-doc format for func params
  minmax.h: remove some #defines that are only expanded once
  minmax.h: simplify the variants of clamp()
  minmax.h: move all the clamp() definitions after the min/max() ones
  minmax.h: use BUILD_BUG_ON_MSG() for the lo < hi test in clamp()
  minmax.h: reduce the #define expansion of min(), max() and clamp()
  minmax.h: update some comments
  minmax.h: add whitespace around operators and after commas
  nilfs2: do not update mtime of renamed directory that is not moved
  nilfs2: handle errors that nilfs_prepare_chunk() may return
  CREDITS: fix spelling mistake
  ...
2025-01-26 17:50:53 -08:00
Tetsuo Handa
691a1f3f18 tomoyo: fix spelling errors
No functional changes.

Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
2025-01-26 19:13:58 +09:00
Tanya Agarwal
41f198d58b tomoyo: fix spelling error
Fix spelling error in security/tomoyo module comments that were
identified using the codespell tool.
No functional changes - documentation only.

Signed-off-by: Tanya Agarwal <tanyaagarwal25699@gmail.com>
Reviewed-by: Mimi Zohar <zohar@linux.ibm.com>
Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
2025-01-26 19:11:56 +09:00
Linus Torvalds
8883957b3c \n
-----BEGIN PGP SIGNATURE-----
 
 iQEzBAABCAAdFiEEq1nRK9aeMoq1VSgcnJ2qBz9kQNkFAmePs7oACgkQnJ2qBz9k
 QNmHuAf9GkLnY5u1/81xP5V9ukZ4N2yeMW0dydLS5cjWj/St5ELeMAza3jeqtJtD
 j36vbnmy2c5pPaGLAK8BJpMXT/R2TkmmKD004zcfqF2S3SgbGzdgO1zMZzq9KJpM
 woRKZtLuglDajedsDEBBcKotBhlN2+C/sQlFuL1mX4zitk9ajr0qYUB1+JqOeg5f
 qwPsDLT077ADpxd7lVIMcm+OqbduP5KWkBKYHpn7lJcLe1eqVMMzceJroW42zhVG
 Dq8Iln26bbU9Wx6FSPFCUcHEzHRHUfXmu07HN9U0X++0QgWjrmBQQLooGFB/bR4a
 edBrPpVas6xE4/brjgFX3gOKtv8xYg==
 =ewDV
 -----END PGP SIGNATURE-----

Merge tag 'fsnotify_hsm_for_v6.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs

Pull fsnotify pre-content notification support from Jan Kara:
 "This introduces a new fsnotify event (FS_PRE_ACCESS) that gets
  generated before a file contents is accessed.

  The event is synchronous so if there is listener for this event, the
  kernel waits for reply. On success the execution continues as usual,
  on failure we propagate the error to userspace. This allows userspace
  to fill in file content on demand from slow storage. The context in
  which the events are generated has been picked so that we don't hold
  any locks and thus there's no risk of a deadlock for the userspace
  handler.

  The new pre-content event is available only for users with global
  CAP_SYS_ADMIN capability (similarly to other parts of fanotify
  functionality) and it is an administrator responsibility to make sure
  the userspace event handler doesn't do stupid stuff that can DoS the
  system.

  Based on your feedback from the last submission, fsnotify code has
  been improved and now file->f_mode encodes whether pre-content event
  needs to be generated for the file so the fast path when nobody wants
  pre-content event for the file just grows the additional file->f_mode
  check. As a bonus this also removes the checks whether the old
  FS_ACCESS event needs to be generated from the fast path. Also the
  place where the event is generated during page fault has been moved so
  now filemap_fault() generates the event if and only if there is no
  uptodate folio in the page cache.

  Also we have dropped FS_PRE_MODIFY event as current real-world users
  of the pre-content functionality don't really use it so let's start
  with the minimal useful feature set"

* tag 'fsnotify_hsm_for_v6.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs: (21 commits)
  fanotify: Fix crash in fanotify_init(2)
  fs: don't block write during exec on pre-content watched files
  fs: enable pre-content events on supported file systems
  ext4: add pre-content fsnotify hook for DAX faults
  btrfs: disable defrag on pre-content watched files
  xfs: add pre-content fsnotify hook for DAX faults
  fsnotify: generate pre-content permission event on page fault
  mm: don't allow huge faults for files with pre content watches
  fanotify: disable readahead if we have pre-content watches
  fanotify: allow to set errno in FAN_DENY permission response
  fanotify: report file range info with pre-content events
  fanotify: introduce FAN_PRE_ACCESS permission event
  fsnotify: generate pre-content permission event on truncate
  fsnotify: pass optional file access range in pre-content event
  fsnotify: introduce pre-content permission events
  fanotify: reserve event bit of deprecated FAN_DIR_MODIFY
  fanotify: rename a misnamed constant
  fanotify: don't skip extra event info if no info_mode is set
  fsnotify: check if file is actually being watched for pre-content events on open
  fsnotify: opt-in for permission events at file open time
  ...
2025-01-23 13:36:06 -08:00
Linus Torvalds
d0d106a2bd bpf-next-6.14
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEE+soXsSLHKoYyzcli6rmadz2vbToFAmeOu1YACgkQ6rmadz2v
 bTrrHxAAn6eqEsluWnDlzhI0OGsPjvgS00sf+MOeqiXYeS2eJ8yJuKifp38+nIQZ
 lIplsWU2ReUY20eizPqLPnQ7TXZGvLgp08E8yHUoZ0siWanqr9iDRfbZCCNrDMNm
 lMqeR1SLapMws2R/UX9JbvPn2ajIJ6Lb4wxenTfdlW6q+0hAGM6Dt0k/jBod+quq
 /oo+xwG3L0q4APBovJfiAFN2z6IYN03b+zLiOrpIJtMACGewEXnl3m4mkL8ZM/FV
 nZGPIxIUPXCpKTGEkNqxfkrnHN2wZQ4ZSKEJ6lhEEp4jrgCVITaGZ/E7jlx6fZoj
 bbd4YMonIPo9Nhim8p1dt8yYBhKKiE5IXIq0GqlMv5+MvAN8ylrlydpsouW1fu66
 hZ1W1BxbxmrgyF0Bwo9JPOMhBHwMrmD6iH9LgiMpZf0ASeF+q9cJpoSOU5j5E9XB
 LpLIRf5jYTd4wZjhDmrQREReLo+Bng9DlCBu+jjh2+YTz6l6Qed+ETpENcd7lL5i
 IHZVbgD2RVPNJoUfdrd763HfYfDTk+50MF5FIMEyfKHz11if0E/LhBMzto22hm6b
 2f8ruj/8yvg8s2dxEP3ySQgcnynlwEnGxLenUVv7uEOYKeWri1rq+fvTK5ne1OLK
 oHnTlkViwQb74c0r8cFW+nkyfUYTfhhBAql14rl/fMjGDO2KZ10=
 =f2CA
 -----END PGP SIGNATURE-----

Merge tag 'bpf-next-6.14' of git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next

Pull bpf updates from Alexei Starovoitov:
 "A smaller than usual release cycle.

  The main changes are:

   - Prepare selftest to run with GCC-BPF backend (Ihor Solodrai)

     In addition to LLVM-BPF runs the BPF CI now runs GCC-BPF in compile
     only mode. Half of the tests are failing, since support for
     btf_decl_tag is still WIP, but this is a great milestone.

   - Convert various samples/bpf to selftests/bpf/test_progs format
     (Alexis Lothoré and Bastien Curutchet)

   - Teach verifier to recognize that array lookup with constant
     in-range index will always succeed (Daniel Xu)

   - Cleanup migrate disable scope in BPF maps (Hou Tao)

   - Fix bpf_timer destroy path in PREEMPT_RT (Hou Tao)

   - Always use bpf_mem_alloc in bpf_local_storage in PREEMPT_RT (Martin
     KaFai Lau)

   - Refactor verifier lock support (Kumar Kartikeya Dwivedi)

     This is a prerequisite for upcoming resilient spin lock.

   - Remove excessive 'may_goto +0' instructions in the verifier that
     LLVM leaves when unrolls the loops (Yonghong Song)

   - Remove unhelpful bpf_probe_write_user() warning message (Marco
     Elver)

   - Add fd_array_cnt attribute for prog_load command (Anton Protopopov)

     This is a prerequisite for upcoming support for static_branch"

* tag 'bpf-next-6.14' of git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: (125 commits)
  selftests/bpf: Add some tests related to 'may_goto 0' insns
  bpf: Remove 'may_goto 0' instruction in opt_remove_nops()
  bpf: Allow 'may_goto 0' instruction in verifier
  selftests/bpf: Add test case for the freeing of bpf_timer
  bpf: Cancel the running bpf_timer through kworker for PREEMPT_RT
  bpf: Free element after unlock in __htab_map_lookup_and_delete_elem()
  bpf: Bail out early in __htab_map_lookup_and_delete_elem()
  bpf: Free special fields after unlock in htab_lru_map_delete_node()
  tools: Sync if_xdp.h uapi tooling header
  libbpf: Work around kernel inconsistently stripping '.llvm.' suffix
  bpf: selftests: verifier: Add nullness elision tests
  bpf: verifier: Support eliding map lookup nullness
  bpf: verifier: Refactor helper access type tracking
  bpf: tcp: Mark bpf_load_hdr_opt() arg2 as read-write
  bpf: verifier: Add missing newline on verbose() call
  selftests/bpf: Add distilled BTF test about marking BTF_IS_EMBEDDED
  libbpf: Fix incorrect traversal end type ID when marking BTF_IS_EMBEDDED
  libbpf: Fix return zero when elf_begin failed
  selftests/bpf: Fix btf leak on new btf alloc failure in btf_distill test
  veristat: Load struct_ops programs only once
  ...
2025-01-23 08:04:07 -08:00
Linus Torvalds
754916d4a2 capabilities patches for 6.14-rc1
This branch contains basically the same two patches as last time:
 
 1. A patch by Paul Moore to remove the cap_mmap_file() hook, as it simply
    returned the default return value and so doesn't need to exist.
 2. A patch by Jordan Rome to add a trace event for cap_capable(), updated
    to address your feedback during the last cycle.
 
 Both patches have been sitting in linux-next since 6.13-rc1 with no
 issues.
 
 Signed-off-by: Serge E. Hallyn <serge@hallyn.com>
 -----BEGIN PGP SIGNATURE-----
 
 iQEzBAABCgAdFiEEqb0/8XByttt4D8+UNXDaFycKziQFAmeOxO0ACgkQNXDaFycK
 ziSbqwf9FmQbCG9zpgHhAaODz8GXPn1EYm0TfabbfuG+hRvTQLt/7eVuLB6Tt69l
 lx7zM8HUjZLQW8qsDc1nmdnrvvLK6z8e97yGBBMG4uzFyzsCgNQowyDRz69IOG+l
 eTCUMXOQXYtO4OYm7pECBeUos8yCOpW7vdZzyyKInw0A8JXy98K880HlYoiYc7wI
 9xXtKWTmqry156llwIYU/opo/Pag480Y2hzP9x5EqvTNqJ/iMEUb2Dswhf+53dOY
 HePwerTu1BYYupSC2gl3ujl/m6R2BroLBmOMApLiAhNtRZCm+J6rkhmMW9cFqyxZ
 Nyw8nAuc08cAKoobAdggD+cgFy9e6g==
 =WKYe
 -----END PGP SIGNATURE-----

Merge tag 'caps-6.13-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/sergeh/linux

Pull capabilities updates from Serge Hallyn:

 - remove the cap_mmap_file() hook, as it simply returned the default
   return value and so doesn't need to exist (Paul Moore)

 - add a trace event for cap_capable() (Jordan Rome)

* tag 'caps-6.13-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/sergeh/linux:
  security: add trace event for cap_capable
  capabilities: remove cap_mmap_file()
2025-01-23 08:00:16 -08:00
Linus Torvalds
21266b8df5 AT_EXECVE_CHECK introduction for v6.14-rc1
- Implement AT_EXECVE_CHECK flag to execveat(2) (Mickaël Salaün)
 
 - Implement EXEC_RESTRICT_FILE and EXEC_DENY_INTERACTIVE securebits
   (Mickaël Salaün)
 
 - Add selftests and samples for AT_EXECVE_CHECK (Mickaël Salaün)
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQRSPkdeREjth1dHnSE2KwveOeQkuwUCZ4hO7wAKCRA2KwveOeQk
 u4l+AP9UHO1KwMn3aOt6uFPj7omaoY0vpcB1rx/x5s4efNFHOAD/QjY0f+ND+HzF
 mKLYOIeacGEQi7TNhpnOkGjz6jzSiwg=
 =sMhZ
 -----END PGP SIGNATURE-----

Merge tag 'AT_EXECVE_CHECK-v6.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux

Pull AT_EXECVE_CHECK from Kees Cook:

 - Implement AT_EXECVE_CHECK flag to execveat(2) (Mickaël Salaün)

 - Implement EXEC_RESTRICT_FILE and EXEC_DENY_INTERACTIVE securebits
   (Mickaël Salaün)

 - Add selftests and samples for AT_EXECVE_CHECK (Mickaël Salaün)

* tag 'AT_EXECVE_CHECK-v6.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux:
  ima: instantiate the bprm_creds_for_exec() hook
  samples/check-exec: Add an enlighten "inc" interpreter and 28 tests
  selftests: ktap_helpers: Fix uninitialized variable
  samples/check-exec: Add set-exec
  selftests/landlock: Add tests for execveat + AT_EXECVE_CHECK
  selftests/exec: Add 32 tests for AT_EXECVE_CHECK and exec securebits
  security: Add EXEC_RESTRICT_FILE and EXEC_DENY_INTERACTIVE securebits
  exec: Add a new AT_EXECVE_CHECK flag to execveat(2)
2025-01-22 20:34:42 -08:00
Linus Torvalds
5ab889facc hardening updates for v6.14-rc1
- stackleak: Use str_enabled_disabled() helper (Thorsten Blum)
 
 - Document GCC INIT_STACK_ALL_PATTERN behavior (Geert Uytterhoeven)
 
 - Add task_prctl_unknown tracepoint (Marco Elver)
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQRSPkdeREjth1dHnSE2KwveOeQkuwUCZ4hR6QAKCRA2KwveOeQk
 uyYrAP90cNcedNxKCIC/XfIEyS5bWqgAcEcOdLwsPQ8X130M7wEAwadkKaO7PwrF
 8T3ynXxUd4z5OyuXjKQvfvPAgaxhbg4=
 =OoiS
 -----END PGP SIGNATURE-----

Merge tag 'hardening-v6.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux

Pull hardening updates from Kees Cook:

 - stackleak: Use str_enabled_disabled() helper (Thorsten Blum)

 - Document GCC INIT_STACK_ALL_PATTERN behavior (Geert Uytterhoeven)

 - Add task_prctl_unknown tracepoint (Marco Elver)

* tag 'hardening-v6.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux:
  hardening: Document INIT_STACK_ALL_PATTERN behavior with GCC
  stackleak: Use str_enabled_disabled() helper in stack_erasing_sysctl()
  tracing: Remove pid in task_rename tracing output
  tracing: Add task_prctl_unknown tracepoint
2025-01-22 20:29:53 -08:00
Linus Torvalds
ad2aec7c96 Small changes for improving usability.
Tetsuo Handa (3):
   tomoyo: automatically use patterns for several situations in learning mode
   tomoyo: use realpath if symlink's pathname refers to procfs
   tomoyo: don't emit warning in tomoyo_write_control()
 
  security/tomoyo/common.c |   32 +++++++++++++++++++++++++++++++-
  security/tomoyo/domain.c |   11 +++++++++--
  2 files changed, 40 insertions(+), 3 deletions(-)
 -----BEGIN PGP SIGNATURE-----
 
 iQJXBAABCABBFiEEQ8gzaWI9etOpbC/HQl8SjQxk9SoFAmeRdEUjHHBlbmd1aW4t
 a2VybmVsQGktbG92ZS5zYWt1cmEubmUuanAACgkQQl8SjQxk9SqbCRAAg0aZ5pu/
 p9uJhsZFzBZzTE8BcFRdMWKS+t5+RGE1KqHoNc/YsP73eb9ui/N1jckJjsYCFh4m
 NsqWvadVMtma4qTcAw5Um48kZ898QVoq9kFBpxCklRnH0eGJsn2kNjNmgywYKK9q
 TOSKTzJM2DfOEjrSdepvthp/FK76nbHdrKREBYsslPcSLFx5c5qBqx60l6p8hS0A
 1+zKQM8m9H2nZ7KnuCr0e4Lhiq1W6C/BaIN/+NzHxbPVXl55nE1i4OXM3WSTC92H
 i6Fe+eKv5B9B5zaQwWbAGhfONcL8pkmgwTvPFtZLj7y5t5HW8TxmPMFZ/jE5E1Kt
 AFFLWvXIlDZ5aYU4pbNUAA4RCNm2hWw1vniPgLPiJte2ft2AdSxBHO9J6w/W4xs9
 W44TvsFXDfJgKsoY4Hy5HR3VxleIFA5tvIyWp1ONp7rZ+4WKolTQYW0hEz26V30K
 +urTZFusv3+229z3ZM8qsPUbL/WLML1jRw8E3AtnBMRJ92c8tpFaBA8b+o829zz8
 QAFWeca3psgPqxHPZsnNeM8lOLOrtBik3+CNpmXJLq6KL2BrI9eRDUntBwHbBoRu
 /hAJX0feI/4yoixzr5sOk5UMNlp+G5CDBQb9NkTynUnr3Q1KRcgKWC4f8cAQlTb2
 6OfT8xw2AID0/XXxJVLOdDRTPnYEVU3VH9c=
 =wawL
 -----END PGP SIGNATURE-----

Merge tag 'tomoyo-pr-20250123' of git://git.code.sf.net/p/tomoyo/tomoyo

Pull tomoyo updates from Tetsuo Handa:
 "Small changes to improve usability"

* tag 'tomoyo-pr-20250123' of git://git.code.sf.net/p/tomoyo/tomoyo:
  tomoyo: automatically use patterns for several situations in learning mode
  tomoyo: use realpath if symlink's pathname refers to procfs
  tomoyo: don't emit warning in tomoyo_write_control()
2025-01-22 20:25:00 -08:00
Linus Torvalds
de5817bbfb Landlock updates for v6.14-rc1
-----BEGIN PGP SIGNATURE-----
 
 iIYEABYKAC4WIQSVyBthFV4iTW/VU1/l49DojIL20gUCZ5EMhBAcbWljQGRpZ2lr
 b2QubmV0AAoJEOXj0OiMgvbSMv0BAMOG2TFwq+UhbtxtL6pM7qzxfdWg6GR/t4t8
 MFasAcCaAQDtTnW0HymHge8k7JFgWHHp0JBu7V7dhFrdJoS+718aDA==
 =1Hfr
 -----END PGP SIGNATURE-----

Merge tag 'landlock-6.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/mic/linux

Pull landlock updates from Mickaël Salaün:
 "This mostly factors out some Landlock code and prepares for upcoming
  audit support.

  Because files with invalid modes might be visible after filesystem
  corruption, Landlock now handles those weird files too.

  A few sample and test issues are also fixed"

* tag 'landlock-6.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/mic/linux:
  selftests/landlock: Add layout1.umount_sandboxer tests
  selftests/landlock: Add wrappers.h
  selftests/landlock: Fix error message
  landlock: Optimize file path walks and prepare for audit support
  selftests/landlock: Add test to check partial access in a mount tree
  landlock: Align partial refer access checks with final ones
  landlock: Simplify initially denied access rights
  landlock: Move access types
  landlock: Factor out check_access_path()
  selftests/landlock: Fix build with non-default pthread linking
  landlock: Use scoped guards for ruleset in landlock_add_rule()
  landlock: Use scoped guards for ruleset
  landlock: Constify get_mode_access()
  landlock: Handle weird files
  samples/landlock: Fix possible NULL dereference in parse_path()
  selftests/landlock: Remove unused macros in ptrace_test.c
2025-01-22 20:20:55 -08:00
Linus Torvalds
9cb2bf599b Hi,
Here's the keys changes for 6.14-rc1.
 
 BR, Jarkko
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQRE6pSOnaBC00OEHEIaerohdGur0gUCZ49r8AAKCRAaerohdGur
 0qiFAP9Av1djr8/HA+VidLPOe5neBkRYSuX54yNz3TGBOpjmvwD+L2YJfJYcBAd+
 sku1MLkpLbMmBCLSTNhqjSJWxx4vngg=
 =iF42
 -----END PGP SIGNATURE-----

Merge tag 'keys-next-6.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jarkko/linux-tpmdd

Pull keys updates from Jarkko Sakkinen.

Avoid using stack addresses for sg lists. And a cleanup.

* tag 'keys-next-6.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jarkko/linux-tpmdd:
  KEYS: trusted: dcp: fix improper sg use with CONFIG_VMAP_STACK=y
  keys: drop shadowing dead prototype
2025-01-22 19:47:17 -08:00
Linus Torvalds
690ffcd817 selinux/stable-6.14 PR 20250121
-----BEGIN PGP SIGNATURE-----
 
 iQJIBAABCAAyFiEES0KozwfymdVUl37v6iDy2pc3iXMFAmeQE9YUHHBhdWxAcGF1
 bC1tb29yZS5jb20ACgkQ6iDy2pc3iXO66BAAmmhqw3Cm94u6QOjYTQCNrHpmZVpv
 atPfW0EIlvYWuLGvyQZVh/2SKrYm0o7iNlTlOyMcBV9BfFUZd6vepX3L+ylhMacG
 L2lg5jgl11zUZdo8m++38kCABbdMexhTzgtfdAm+w2RmLRoOzXjBOCDx18sWVtCy
 aV3DQAvl6qdU/Y5U6PccmOCwgFVQEmWzQ4A1CMq696Fybr4EzTjI8mCCnotHWarz
 cgfMHHf5RYR4M4VdmxWo3MR6y6Qiq19/Vsy43YP/G/A0Ad+mfLqhHmc27+Mx2bDk
 IfdrvOOjaVxiEeIJe6mOePcW9p1D9q4OrPmBZHxN1+R3ck7k0MgVLIDvSzLDMbbj
 3PSZx2UFk1xz+B0x3hvzhAXJ5YfbAjPj1Z65HlLIQBFBo5jvLWMrLxdpcH4eRdhT
 ovTqFuB4wQwYOeeKlXlnCWFitsAynjo9qGxcqjxG63geJfnBlsnLoYIa0g+cN6Uf
 3Ty0+zeHDfCajj40buvtOWv98CyAMF5vBopnr18Kfo4upp6pgVERVBoGy2Yw020I
 yItiRhi1fpV31J8Gxrd7WA2/OmZZLISnAJtKMSsyd+hBihfOjVZ9LhKKYJk4vH+X
 mWVOdplHpCDe6y2EJE4EmaNwQCOVJfg4/Xvh9fghELdBdc91wFGPDO36AitDBNr8
 /o13aUvFarsEmtA=
 =UJsr
 -----END PGP SIGNATURE-----

Merge tag 'selinux-pr-20250121' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux

Pull selinux updates from Paul Moore:

 - Extended permissions supported in conditional policy

   The SELinux extended permissions, aka "xperms", allow security admins
   to target individuals ioctls, and recently netlink messages, with
   their SELinux policy. Adding support for conditional policies allows
   admins to toggle the granular xperms using SELinux booleans, helping
   pave the way for greater use of xperms in general purpose SELinux
   policies. This change bumps the maximum SELinux policy version to 34.

 - Fix a SCTP/SELinux error return code inconsistency

   Depending on the loaded SELinux policy, specifically it's
   EXTSOCKCLASS support, the bind(2) LSM/SELinux hook could return
   different error codes due to the SELinux code checking the socket's
   SELinux object class (which can vary depending on EXTSOCKCLASS) and
   not the socket's sk_protocol field. We fix this by doing the obvious,
   and looking at the sock->sk_protocol field instead of the object
   class.

 - Makefile fixes to properly cleanup av_permissions.h

   Add av_permissions.h to "targets" so that it is properly cleaned up
   using the kbuild infrastructure.

 - A number of smaller improvements by Christian Göttsche

   A variety of straightforward changes to reduce code duplication,
   reduce pointer lookups, migrate void pointers to defined types,
   simplify code, constify function parameters, and correct iterator
   types.

* tag 'selinux-pr-20250121' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux:
  selinux: make more use of str_read() when loading the policy
  selinux: avoid unnecessary indirection in struct level_datum
  selinux: use known type instead of void pointer
  selinux: rename comparison functions for clarity
  selinux: rework match_ipv6_addrmask()
  selinux: constify and reconcile function parameter names
  selinux: avoid using types indicating user space interaction
  selinux: supply missing field initializers
  selinux: add netlink nlmsg_type audit message
  selinux: add support for xperms in conditional policies
  selinux: Fix SCTP error inconsistency in selinux_socket_bind()
  selinux: use native iterator types
  selinux: add generated av_permissions.h to targets
2025-01-21 20:09:14 -08:00
Linus Torvalds
f96a974170 lsm/stable-6.14 PR 20250121
-----BEGIN PGP SIGNATURE-----
 
 iQJIBAABCAAyFiEES0KozwfymdVUl37v6iDy2pc3iXMFAmeQFBoUHHBhdWxAcGF1
 bC1tb29yZS5jb20ACgkQ6iDy2pc3iXPvcA//XCdwMz0bGtWKv58nuyP8vkQx08n6
 //olz/O8te3uWK5O3kRiarzFLwH8qsHQ6A7GYalwwix34hatR4ndJE0Y/guVRWa1
 +aBmJxJ7Jm/q3fvpAEfqiSgreuE6kBoztlDOWEq+hUQGu4qfnQGm2EnvbvfFrAmN
 VheOfIQSU2KCL/Scc3FGnF6uru4WrqN0JJ9RbvrEpfdQgmcyTGLnQsZLljutWSIq
 kDWkteIr7cj3O9J45zpxZsTftvYSgVn/y1iKeXbHI4DBA1eheK12vsHB9AADKI1J
 GwHxOrnLpZtv+ICUKqcfFTmWTl+NmfJJurAT5KXKdBjL3xM5MoJlBvK1A5qE9CMo
 LaHVG/TZR2MmBaoM3EN+gvWhDgWlvT02Q/0cYaafTlVLMez3HtfctxN6OnCvTXTB
 Y8dqYClhhlBm/mHQwYfMoeKw4MftUpzEqBd1Nj7Qe8dbP0f/62Ca3K2B3D6Rf8QV
 pj3ryMlSWYV9mdTerruLNQexTGoN7l66jPwzdWpTbFeL3WmNtfCako8OZGbXgPIu
 Iahm3P+jnSVx8ZQro2c9zwdKXI5xiI335pCBbDZ8aX+JAsfj0OofHsFx5Q5diber
 M7tAEhxDqRisbpz7Ei+/LOAEGg2Z619XKg8ks4z6Y4P5PF7zEgeWTkZJk2iLbxXe
 6LLOjmF7LLw+G4M=
 =fgyr
 -----END PGP SIGNATURE-----

Merge tag 'lsm-pr-20250121' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/lsm

Pull lsm updates from Paul Moore:

 - Improved handling of LSM "secctx" strings through lsm_context struct

   The LSM secctx string interface is from an older time when only one
   LSM was supported, migrate over to the lsm_context struct to better
   support the different LSMs we now have and make it easier to support
   new LSMs in the future.

   These changes explain the Rust, VFS, and networking changes in the
   diffstat.

 - Only build lsm_audit.c if CONFIG_SECURITY and CONFIG_AUDIT are
   enabled

   Small tweak to be a bit smarter about when we build the LSM's common
   audit helpers.

 - Check for absurdly large policies from userspace in SafeSetID

   SafeSetID policies rules are fairly small, basically just "UID:UID",
   it easy to impose a limit of KMALLOC_MAX_SIZE on policy writes which
   helps quiet a number of syzbot related issues. While work is being
   done to address the syzbot issues through other mechanisms, this is a
   trivial and relatively safe fix that we can do now.

 - Various minor improvements and cleanups

   A collection of improvements to the kernel selftests, constification
   of some function parameters, removing redundant assignments, and
   local variable renames to improve readability.

* tag 'lsm-pr-20250121' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/lsm:
  lockdown: initialize local array before use to quiet static analysis
  safesetid: check size of policy writes
  net: corrections for security_secid_to_secctx returns
  lsm: rename variable to avoid shadowing
  lsm: constify function parameters
  security: remove redundant assignment to return variable
  lsm: Only build lsm_audit.c if CONFIG_SECURITY and CONFIG_AUDIT are set
  selftests: refactor the lsm `flags_overset_lsm_set_self_attr` test
  binder: initialize lsm_context structure
  rust: replace lsm context+len with lsm_context
  lsm: secctx provider check on release
  lsm: lsm_context in security_dentry_init_security
  lsm: use lsm_context in security_inode_getsecctx
  lsm: replace context+len with lsm_context
  lsm: ensure the correct LSM context releaser
2025-01-21 20:03:04 -08:00
Linus Torvalds
678ca9f78e One minor code improvement for v6.14
-----BEGIN PGP SIGNATURE-----
 
 iQJLBAABCAA1FiEEC+9tH1YyUwIQzUIeOKUVfIxDyBEFAmeQGOEXHGNhc2V5QHNj
 aGF1Zmxlci1jYS5jb20ACgkQOKUVfIxDyBGbYA//RwccdJJXj5r3gUjfEb1p58hV
 z2D2cbJ6P23+e+mTa+AF/EqDQyRY1SizVnV/yV+ztfkXQ5DM0MLIMziWnenmwCeQ
 FTFp9D+/3q9J26+SdYV4JN9I/dVYeMVjIzEIQr82lEoiAWhPgNeJ8CKlme14wpUb
 jEqtFdAkNVwTxVi3OjMWs6liLBmkaskx2m9AXorsA7MXS6msuWhFmzuElJCt8KID
 DPUgEx9ny90SlCY3+FDz922v6cw4b4AU/z6Bf8ZBbf3cr9VcDOnq/W9D1NeOCfqb
 NXBHDIbjCNmQgR2vcfpDPW1tDk36xJdSOfIzWSlg8ZpMGWBe8Xnla5rpTTH4zaVq
 shMhLxzV1VTQoyrwsyvUbC7g7CssEBh7v+rIhdA0h59cBniMKanGTffAXEBzgXnl
 39kMicu7jX8vLn6HRgmOUrSdr4KgXKzhuW4GfCe/5dBh+2RnnUTn+iUVdj4xts/X
 SwODT0XT5FOofVh2hZYZMZK3MnFa/P401vOENMH3OiY3hYHBJdrO3JtPz9n+vxUF
 bA4rhwvSuBTsLjyJC/8qTLU3sgfCIHl9pBGVlMeRb2yt+7+EAJ6WDEr3LBdIrh9P
 6obhtG3zbQsplpVLwh7MbxuPcZT3u2pBX7ME9zfNWMKltwKKSPho/Bt72G8w2YNi
 VApxTfc6HKbWzJoCk04=
 =tXff
 -----END PGP SIGNATURE-----

Merge tag 'Smack-for-6.14' of https://github.com/cschaufler/smack-next

Pull smack update from Casey Schaufler:
 "One minor code improvement for v6.14"

* tag 'Smack-for-6.14' of https://github.com/cschaufler/smack-next:
  smack: deduplicate access to string conversion
2025-01-21 19:59:46 -08:00
Linus Torvalds
0ca0cf9f8c integrity-v6.14
-----BEGIN PGP SIGNATURE-----
 
 iIoEABYKADIWIQQdXVVFGN5XqKr1Hj7LwZzRsCrn5QUCZ4rmvxQcem9oYXJAbGlu
 dXguaWJtLmNvbQAKCRDLwZzRsCrn5avvAP4tzjNdVp3tFeq9bA8gQZEJ74E6q/6a
 Qb18xTn54hxGXAEAuotzwJiandLBk/3hkHIE0BbyzmULiVMpos4qVuUmTwI=
 =ZvaD
 -----END PGP SIGNATURE-----

Merge tag 'integrity-v6.14' of git://git.kernel.org/pub/scm/linux/kernel/git/zohar/linux-integrity

Pull integrity updates from Mimi Zohar:
 "There's just a couple of changes: two kernel messages addressed, a
  measurement policy collision addressed, and one policy cleanup.

  Please note that the contents of the IMA measurement list is
  potentially affected. The builtin tmpfs IMA policy rule change might
  introduce additional measurements, while detecting a reboot might
  eliminate some measurements"

* tag 'integrity-v6.14' of git://git.kernel.org/pub/scm/linux/kernel/git/zohar/linux-integrity:
  ima: ignore suffixed policy rule comments
  ima: limit the builtin 'tcb' dont_measure tmpfs policy rule
  ima: kexec: silence RCU list traversal warning
  ima: Suspend PCR extends and log appends when rebooting
2025-01-21 19:54:32 -08:00
David Gstir
e8d9fab39d KEYS: trusted: dcp: fix improper sg use with CONFIG_VMAP_STACK=y
With vmalloc stack addresses enabled (CONFIG_VMAP_STACK=y) DCP trusted
keys can crash during en- and decryption of the blob encryption key via
the DCP crypto driver. This is caused by improperly using sg_init_one()
with vmalloc'd stack buffers (plain_key_blob).

Fix this by always using kmalloc() for buffers we give to the DCP crypto
driver.

Cc: stable@vger.kernel.org # v6.10+
Fixes: 0e28bf61a5 ("KEYS: trusted: dcp: fix leak of blob encryption key")
Signed-off-by: David Gstir <david@sigma-star.at>
Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org>
Signed-off-by: Jarkko Sakkinen <jarkko@kernel.org>
2025-01-21 11:25:23 +02:00
Linus Torvalds
4b84a4c8d4 vfs-6.14-rc1.misc
-----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQRAhzRXHqcMeLMyaSiRxhvAZXjcogUCZ4pRjQAKCRCRxhvAZXjc
 omUyAP9k31Qr7RY1zNtmpPfejqc+3Xx+xXD7NwHr+tONWtUQiQEA/F94qU2U3ivS
 AzyDABWrEQ5ZNsm+Rq2Y3zyoH7of3ww=
 =s3Bu
 -----END PGP SIGNATURE-----

Merge tag 'vfs-6.14-rc1.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs

Pull misc vfs updates from Christian Brauner:
 "Features:

   - Support caching symlink lengths in inodes

     The size is stored in a new union utilizing the same space as
     i_devices, thus avoiding growing the struct or taking up any more
     space

     When utilized it dodges strlen() in vfs_readlink(), giving about
     1.5% speed up when issuing readlink on /initrd.img on ext4

   - Add RWF_DONTCACHE iocb and FOP_DONTCACHE file_operations flag

     If a file system supports uncached buffered IO, it may set
     FOP_DONTCACHE and enable support for RWF_DONTCACHE.

     If RWF_DONTCACHE is attempted without the file system supporting
     it, it'll get errored with -EOPNOTSUPP

   - Enable VBOXGUEST and VBOXSF_FS on ARM64

     Now that VirtualBox is able to run as a host on arm64 (e.g. the
     Apple M3 processors) we can enable VBOXSF_FS (and in turn
     VBOXGUEST) for this architecture.

     Tested with various runs of bonnie++ and dbench on an Apple MacBook
     Pro with the latest Virtualbox 7.1.4 r165100 installed

  Cleanups:

   - Delay sysctl_nr_open check in expand_files()

   - Use kernel-doc includes in fiemap docbook

   - Use page->private instead of page->index in watch_queue

   - Use a consume fence in mnt_idmap() as it's heavily used in
     link_path_walk()

   - Replace magic number 7 with ARRAY_SIZE() in fc_log

   - Sort out a stale comment about races between fd alloc and dup2()

   - Fix return type of do_mount() from long to int

   - Various cosmetic cleanups for the lockref code

  Fixes:

   - Annotate spinning as unlikely() in __read_seqcount_begin

     The annotation already used to be there, but got lost in commit
     52ac39e5db ("seqlock: seqcount_t: Implement all read APIs as
     statement expressions")

   - Fix proc_handler for sysctl_nr_open

   - Flush delayed work in delayed fput()

   - Fix grammar and spelling in propagate_umount()

   - Fix ESP not readable during coredump

     In /proc/PID/stat, there is the kstkesp field which is the stack
     pointer of a thread. While the thread is active, this field reads
     zero. But during a coredump, it should have a valid value

     However, at the moment, kstkesp is zero even during coredump

   - Don't wake up the writer if the pipe is still full

   - Fix unbalanced user_access_end() in select code"

* tag 'vfs-6.14-rc1.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs: (28 commits)
  gfs2: use lockref_init for qd_lockref
  erofs: use lockref_init for pcl->lockref
  dcache: use lockref_init for d_lockref
  lockref: add a lockref_init helper
  lockref: drop superfluous externs
  lockref: use bool for false/true returns
  lockref: improve the lockref_get_not_zero description
  lockref: remove lockref_put_not_zero
  fs: Fix return type of do_mount() from long to int
  select: Fix unbalanced user_access_end()
  vbox: Enable VBOXGUEST and VBOXSF_FS on ARM64
  pipe_read: don't wake up the writer if the pipe is still full
  selftests: coredump: Add stackdump test
  fs/proc: do_task_stat: Fix ESP not readable during coredump
  fs: add RWF_DONTCACHE iocb and FOP_DONTCACHE file_operations flag
  fs: sort out a stale comment about races between fd alloc and dup2
  fs: Fix grammar and spelling in propagate_umount()
  fs: fc_log replace magic number 7 with ARRAY_SIZE()
  fs: use a consume fence in mnt_idmap()
  file: flush delayed work in delayed fput()
  ...
2025-01-20 09:40:49 -08:00
John Johansen
e6b0876769 apparmor: fix dbus permission queries to v9 ABI
dbus permission queries need to be synced with fine grained unix
mediation to avoid potential policy regressions. To ensure that
dbus queries don't result in a case where fine grained unix mediation
is not being applied but dbus mediation is check the loaded policy
support ABI and abort the query if policy doesn't support the
v9 ABI.

Signed-off-by: John Johansen <john.johansen@canonical.com>
2025-01-18 06:47:13 -08:00
John Johansen
dcd7a55941 apparmor: gate make fine grained unix mediation behind v9 abi
Fine grained unix mediation in Ubuntu used ABI v7, and policy using
this has propogated onto systems where fine grained unix mediation was
not supported. The userspace policy compiler supports downgrading
policy so the policy could be shared without changes.

Unfortunately this had the side effect that policy was not updated for
the none Ubuntu systems and enabling fine grained unix mediation on
those systems means that a new kernel can break a system with existing
policy that worked with the previous kernel. With fine grained af_unix
mediation this regression can easily break the system causing boot to
fail, as it affect unix socket files, non-file based unix sockets, and
dbus communication.

To aoid this regression move fine grained af_unix mediation behind
a new abi. This means that the system's userspace and policy must
be updated to support the new policy before it takes affect and
dropping a new kernel on existing system will not result in a
regression.

The abi bump is done in such a way as existing policy can be activated
on the system by changing the policy abi declaration and existing unix
policy rules will apply. Policy then only needs to be incrementally
updated, can even be backported to existing Ubuntu policy.

Signed-off-by: John Johansen <john.johansen@canonical.com>
2025-01-18 06:47:13 -08:00
John Johansen
c05e705812 apparmor: add fine grained af_unix mediation
Extend af_unix mediation to support fine grained controls based on
the type (abstract, anonymous, fs), the address, and the labeling
on the socket.

This allows for using socket addresses to label and the socket and
control which subjects can communicate.

The unix rule format follows standard apparmor rules except that fs
based unix sockets can be mediated by existing file rules. None fs
unix sockets can be mediated by a unix socket rule. Where The address
of an abstract unix domain socket begins with the @ character, similar
to how they are reported (as paths) by netstat -x.  The address then
follows and may contain pattern matching and any characters including
the null character. In apparmor null characters must be specified by
using an escape sequence \000 or \x00. The pattern matching is the
same as is used by file path matching so * will not match / even
though it has no special meaning with in an abstract socket name. Eg.

     allow unix addr=@*,

Autobound unix domain sockets have a unix sun_path assigned to them by
the kernel, as such specifying a policy based address is not possible.
The autobinding of sockets can be controlled by specifying the special
auto keyword. Eg.

     allow unix addr=auto,

To indicate that the rule only applies to auto binding of unix domain
sockets.  It is important to note this only applies to the bind
permission as once the socket is bound to an address it is
indistinguishable from a socket that have an addr bound with a
specified name. When the auto keyword is used with other permissions
or as part of a peer addr it will be replaced with a pattern that can
match an autobound socket. Eg. For some kernels

    allow unix rw addr=auto,

It is important to note, this pattern may match abstract sockets that
were not autobound but have an addr that fits what is generated by the
kernel when autobinding a socket.

Anonymous unix domain sockets have no sun_path associated with the
socket address, however it can be specified with the special none
keyword to indicate the rule only applies to anonymous unix domain
sockets. Eg.

    allow unix addr=none,

If the address component of a rule is not specified then the rule
applies to autobind, abstract and anonymous sockets.

The label on the socket can be compared using the standard label=
rule conditional. Eg.

    allow unix addr=@foo peer=(label=bar),

see man apparmor.d for full syntax description.

Signed-off-by: John Johansen <john.johansen@canonical.com>
2025-01-18 06:47:12 -08:00
John Johansen
b4940d913c apparmor: in preparation for finer networking rules rework match_prot
Rework match_prot into a common fn that can be shared by all the
networking rules. This will provide compatibility with current socket
mediation, via the early bailout permission encoding.

Signed-off-by: John Johansen <john.johansen@canonical.com>
2025-01-18 06:47:12 -08:00
John Johansen
6cc6a0523d apparmor: lift kernel socket check out of critical section
There is no need for the kern check to be in the critical section,
it only complicates the code and slows down the case where the
socket is being created by the kernel.

Lifting it out will also allow socket_create to share common template
code, with other socket_permission checks.

Signed-off-by: John Johansen <john.johansen@canonical.com>
2025-01-18 06:47:12 -08:00
John Johansen
9045aa25d1 apparmor: remove af_select macro
The af_select macro just adds a layer of unnecessary abstraction that
makes following what the code is doing harder.

Signed-off-by: John Johansen <john.johansen@canonical.com>
2025-01-18 06:47:12 -08:00
John Johansen
ce9e3b3fa2 apparmor: add ability to mediate caps with policy state machine
Currently the caps encoding is very limited and can't be used with
conditionals. Allow capabilities to be mediated by the state
machine. This will allow us to add conditionals to capabilities that
aren't possible with the current encoding.

This patch only adds support for using the state machine and retains
the old encoding lookup as part of the runtime mediation code to
support older policy abis. A follow on patch will move backwards
compatibility to a mapping function done at policy load time.

Signed-off-by: John Johansen <john.johansen@canonical.com>
2025-01-18 06:47:12 -08:00
John Johansen
a9eb185be8 apparmor: fix x_table_lookup when stacking is not the first entry
x_table_lookup currently does stacking during label_parse() if the
target specifies a stack but its only caller ensures that it will
never be used with stacking.

Refactor to slightly simplify the code in x_to_label(), this
also fixes a long standing problem where x_to_labels check on stacking
is only on the first element to the table option list, instead of
the element that is found and used.

Signed-off-by: John Johansen <john.johansen@canonical.com>
2025-01-18 06:47:12 -08:00
John Johansen
84c455decf apparmor: add support for profiles to define the kill signal
Previously apparmor has only sent SIGKILL but there are cases where
it can be useful to send a different signal. Allow the profile
to optionally specify a different value.

Signed-off-by: John Johansen <john.johansen@canonical.com>
2025-01-18 06:47:12 -08:00
John Johansen
2e12c5f060 apparmor: add additional flags to extended permission.
This is a step towards merging the file and policy state machines.

With the switch to extended permissions the state machine's ACCEPT2
table became unused freeing it up to store state specific flags. The
first flags to be stored are FLAG_OWNER and FLAG other which paves the
way towards merging the file and policydb perms into a single
permission table.

Currently Lookups based on the objects ownership conditional will
still need separate fns, this will be address in a following patch.

Signed-off-by: John Johansen <john.johansen@canonical.com>
2025-01-18 06:47:12 -08:00
John Johansen
de4754c801 apparmor: carry mediation check on label
In order to speed up the mediated check, precompute and store the
result as a bit per class type. This will not only allow us to
speed up the mediation check but is also a step to removing the
unconfined special cases as the unconfined check can be replaced
with the generic label_mediates() check.

Note: label check does not currently work for capabilities and resources
      which need to have their mediation updated first.

Signed-off-by: John Johansen <john.johansen@canonical.com>
2025-01-18 06:47:12 -08:00
John Johansen
34d31f2338 apparmor: cleanup: refactor file_perm() to doc semantics of some checks
Provide semantics, via fn names, for some checks being done in
file_perm(). This is a preparatory patch for improvements to both
permission caching and delegation, where the check will become more
involved.

Signed-off-by: John Johansen <john.johansen@canonical.com>
2025-01-18 06:47:12 -08:00
John Johansen
35fad5b462 apparmor: remove explicit restriction that unconfined cannot use change_hat
There does not need to be an explicit restriction that unconfined
can't use change_hat. Traditionally unconfined doesn't have hats
so change_hat could not be used. But newer unconfined profiles have
the potential of having hats, and even system unconfined will be
able to be replaced with a profile that allows for hats.

To remain backwards compitible with expected return codes, continue
to return -EPERM if the unconfined profile does not have any hats.

Signed-off-by: John Johansen <john.johansen@canonical.com>
2025-01-18 06:47:12 -08:00
John Johansen
cd769b05cc apparmor: ensure labels with more than one entry have correct flags
labels containing more than one entry need to accumulate flag info
from profiles that the label is constructed from. This is done
correctly for labels created by a merge but is not being done for
labels created by an update or directly created via a parse.

This technically is a bug fix, however the effect in current code is
to cause early unconfined bail out to not happen (ie. without the fix
it is slower) on labels that were created via update or a parse.

Signed-off-by: John Johansen <john.johansen@canonical.com>
2025-01-18 06:47:12 -08:00
John Johansen
0bc8c6862f apparmor: switch signal mediation to use RULE_MEDIATES
Currently signal mediation is using a hard coded form of the
RULE_MEDIATES check. This hides the intended semantics, and means this
specific check won't pickup any changes or improvements made in the
RULE_MEDIATES check. Switch to using RULE_MEDIATES().

Signed-off-by: John Johansen <john.johansen@canonical.com>
2025-01-18 06:47:12 -08:00
John Johansen
46b9b994dd apparmor: remove redundant unconfined check.
profile_af_perm and profile_af_sk_perm are only ever called after
checking that the profile is not unconfined. So we can drop these
redundant checks.

Signed-off-by: John Johansen <john.johansen@canonical.com>
2025-01-18 06:47:12 -08:00
John Johansen
280799f724 apparmor: cleanup: attachment perm lookup to use lookup_perms()
Remove another case of code duplications. Switch to using the generic
routine instead of the current custom checks.

Signed-off-by: John Johansen <john.johansen@canonical.com>
2025-01-18 06:47:11 -08:00
John Johansen
71e6cff3e0 apparmor: Improve debug print infrastructure
Make it so apparmor debug output can be controlled by class flags
as well as the debug flag on labels. This provides much finer
control at what is being output so apparmor doesn't flood the
logs with information that is not needed, making it hard to find
what is important.

Signed-off-by: John Johansen <john.johansen@canonical.com>
2025-01-18 06:47:11 -08:00
Thorsten Blum
c602537de3 apparmor: Use str_yes_no() helper function
Remove hard-coded strings by using the str_yes_no() helper function.

Fix a typo in a comment: s/unpritable/unprintable/

Signed-off-by: Thorsten Blum <thorsten.blum@linux.dev>
Signed-off-by: John Johansen <john.johansen@canonical.com>
2025-01-18 06:47:11 -08:00
Mickaël Salaün
d617f0d72d
landlock: Optimize file path walks and prepare for audit support
Always synchronize access_masked_parent* with access_request_parent*
according to allowed_parent*.  This is required for audit support to be
able to get back to the reason of denial.

In a rename/link action, instead of always checking a rule two times for
the same parent directory of the source and the destination files, only
check it when an action on a child was not already allowed.  This also
enables us to keep consistent allowed_parent* status, which is required
to get back to the reason of denial.

For internal mount points, only upgrade allowed_parent* to true but do
not wrongfully set both of them to false otherwise.  This is also
required to get back to the reason of denial.

This does not impact the current behavior but slightly optimize code and
prepare for audit support that needs to know the exact reason why an
access was denied.

Cc: Günther Noack <gnoack@google.com>
Link: https://lore.kernel.org/r/20250108154338.1129069-14-mic@digikod.net
Signed-off-by: Mickaël Salaün <mic@digikod.net>
2025-01-17 19:05:37 +01:00
Mickaël Salaün
058518c209
landlock: Align partial refer access checks with final ones
Fix a logical issue that could have been visible if the source or the
destination of a rename/link action was allowed for either the source or
the destination but not both.  However, this logical bug is unreachable
because either:
- the rename/link action is allowed by the access rights tied to the
  same mount point (without relying on access rights in a parent mount
  point) and the access request is allowed (i.e. allow_parent1 and
  allow_parent2 are true in current_check_refer_path),
- or a common rule in a parent mount point updates the access check for
  the source and the destination (cf. is_access_to_paths_allowed).

See the following layout1.refer_part_mount_tree_is_allowed test that
work with and without this fix.

This fix does not impact current code but it is required for the audit
support.

Cc: Günther Noack <gnoack@google.com>
Link: https://lore.kernel.org/r/20250108154338.1129069-12-mic@digikod.net
Signed-off-by: Mickaël Salaün <mic@digikod.net>
2025-01-17 19:05:35 +01:00
Mickaël Salaün
d6c7cf84a2
landlock: Simplify initially denied access rights
Upgrade domain's handled access masks when creating a domain from a
ruleset, instead of converting them at runtime.  This is more consistent
and helps with audit support.

Cc: Günther Noack <gnoack@google.com>
Link: https://lore.kernel.org/r/20250108154338.1129069-7-mic@digikod.net
Signed-off-by: Mickaël Salaün <mic@digikod.net>
2025-01-17 19:05:35 +01:00
Mickaël Salaün
622e2f5954
landlock: Move access types
Move LANDLOCK_ACCESS_FS_INITIALLY_DENIED, access_mask_t, struct
access_mask, and struct access_masks_all to a dedicated access.h file.

Rename LANDLOCK_ACCESS_FS_INITIALLY_DENIED to
_LANDLOCK_ACCESS_FS_INITIALLY_DENIED to make it clear that it's not part
of UAPI.  Add some newlines when appropriate.

This file will be extended with following commits, and it will help to
avoid dependency loops.

Cc: Günther Noack <gnoack@google.com>
Link: https://lore.kernel.org/r/20250108154338.1129069-6-mic@digikod.net
[mic: Fix rebase conflict because of the new cleanup headers]
Signed-off-by: Mickaël Salaün <mic@digikod.net>
2025-01-17 19:05:34 +01:00
Mickaël Salaün
924f4403d8
landlock: Factor out check_access_path()
Merge check_access_path() into current_check_access_path() and make
hook_path_mknod() use it.

Cc: Günther Noack <gnoack@google.com>
Link: https://lore.kernel.org/r/20250108154338.1129069-4-mic@digikod.net
Signed-off-by: Mickaël Salaün <mic@digikod.net>
2025-01-17 19:05:33 +01:00
Mickaël Salaün
16a6f4d3b5
landlock: Use scoped guards for ruleset in landlock_add_rule()
Simplify error handling by replacing goto statements with automatic
calls to landlock_put_ruleset() when going out of scope.

This change depends on the TCP support.

Cc: Konstantin Meskhidze <konstantin.meskhidze@huawei.com>
Cc: Mikhail Ivanov <ivanov.mikhail1@huawei-partners.com>
Reviewed-by: Günther Noack <gnoack@google.com>
Link: https://lore.kernel.org/r/20250113161112.452505-3-mic@digikod.net
Signed-off-by: Mickaël Salaün <mic@digikod.net>
2025-01-14 11:57:45 +01:00
Mickaël Salaün
d32f79a59a
landlock: Use scoped guards for ruleset
Simplify error handling by replacing goto statements with automatic
calls to landlock_put_ruleset() when going out of scope.

This change will be easy to backport to v6.6 if needed, only the
kernel.h include line conflicts.  As for any other similar changes, we
should be careful when backporting without goto statements.

Add missing include file.

Reviewed-by: Günther Noack <gnoack@google.com>
Link: https://lore.kernel.org/r/20250113161112.452505-2-mic@digikod.net
Signed-off-by: Mickaël Salaün <mic@digikod.net>
2025-01-14 11:57:45 +01:00
Mickaël Salaün
25ccc75f5d
landlock: Constify get_mode_access()
Use __attribute_const__ for get_mode_access().

Reviewed-by: Günther Noack <gnoack3000@gmail.com>
Link: https://lore.kernel.org/r/20250110153918.241810-2-mic@digikod.net
Signed-off-by: Mickaël Salaün <mic@digikod.net>
2025-01-14 11:57:44 +01:00
Mickaël Salaün
49440290a0
landlock: Handle weird files
A corrupted filesystem (e.g. bcachefs) might return weird files.
Instead of throwing a warning and allowing access to such file, treat
them as regular files.

Cc: Dave Chinner <david@fromorbit.com>
Cc: Kent Overstreet <kent.overstreet@linux.dev>
Cc: Paul Moore <paul@paul-moore.com>
Reported-by: syzbot+34b68f850391452207df@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/r/000000000000a65b35061cffca61@google.com
Reported-by: syzbot+360866a59e3c80510a62@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/r/67379b3f.050a0220.85a0.0001.GAE@google.com
Reported-by: Ubisectech Sirius <bugreport@ubisectech.com>
Closes: https://lore.kernel.org/r/c426821d-8380-46c4-a494-7008bbd7dd13.bugreport@ubisectech.com
Fixes: cb2c7d1a17 ("landlock: Support filesystem access-control")
Reviewed-by: Günther Noack <gnoack3000@gmail.com>
Link: https://lore.kernel.org/r/20250110153918.241810-1-mic@digikod.net
Signed-off-by: Mickaël Salaün <mic@digikod.net>
2025-01-14 11:57:39 +01:00
Yafang Shao
7c58ed44bd security: remove get_task_comm() and print task comm directly
Since task->comm is guaranteed to be NUL-terminated, we can print it
directly without the need to copy it into a separate buffer.  This
simplifies the code and avoids unnecessary operations.

Link: https://lkml.kernel.org/r/20241219023452.69907-5-laoar.shao@gmail.com
Signed-off-by: Yafang Shao <laoar.shao@gmail.com>
Reviewed-by: Paul Moore <paul@paul-moore.com>
Acked-by: Kees Cook <kees@kernel.org>
Cc: James Morris <jmorris@namei.org>
Cc: "Serge E. Hallyn" <serge@hallyn.com>
Cc: "André Almeida" <andrealmeid@igalia.com>
Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Cc: Borislav Petkov (AMD) <bp@alien8.de>
Cc: Danilo Krummrich <dakr@redhat.com>
Cc: Darren Hart <dvhart@infradead.org>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: David Airlie <airlied@gmail.com>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jani Nikula <jani.nikula@linux.intel.com>
Cc: Jiri Slaby <jirislaby@kernel.org>
Cc: Johannes Berg <johannes@sipsolutions.net>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Kalle Valo <kvalo@kernel.org>
Cc: Karol Herbst <kherbst@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Lyude Paul <lyude@redhat.com>
Cc: Oded Gabbay <ogabbay@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Petr Mladek <pmladek@suse.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Simona Vetter <simona@ffwll.ch>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Tvrtko Ursulin <tursulin@ursulin.net>
Cc: Vineet Gupta <vgupta@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-01-12 20:21:16 -08:00
Geert Uytterhoeven
a9a5e0bdc5 hardening: Document INIT_STACK_ALL_PATTERN behavior with GCC
The help text for INIT_STACK_ALL_PATTERN documents the patterns used by
Clang, but lacks documentation for GCC.

Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Link: https://lore.kernel.org/r/293d29d6a0d1823165be97285c1bc73e90ee9db8.1736239070.git.geert+renesas@glider.be
Signed-off-by: Kees Cook <kees@kernel.org>
2025-01-08 14:17:33 -08:00
Christian Göttsche
01c2253a0f selinux: make more use of str_read() when loading the policy
Simplify the call sites, and enable future string validation in a single
place.

Signed-off-by: Christian Göttsche <cgzones@googlemail.com>
[PM: subject tweak]
Signed-off-by: Paul Moore <paul@paul-moore.com>
2025-01-07 23:14:40 -05:00
Christian Göttsche
7491536366 selinux: avoid unnecessary indirection in struct level_datum
Store the owned member of type struct mls_level directly in the parent
struct instead of an extra heap allocation.

Signed-off-by: Christian Göttsche <cgzones@googlemail.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2025-01-07 23:14:40 -05:00
Christian Göttsche
f07586160f selinux: use known type instead of void pointer
Improve type safety and readability by using the known type.

Signed-off-by: Christian Göttsche <cgzones@googlemail.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2025-01-07 23:14:39 -05:00
Christian Göttsche
83e7e18eed selinux: rename comparison functions for clarity
The functions context_cmp(), mls_context_cmp() and ebitmap_cmp() are not
traditional C style compare functions returning -1, 0, and 1 for less
than, equal, and greater than; they only return whether their arguments
are equal.

Signed-off-by: Christian Göttsche <cgzones@googlemail.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2025-01-07 23:14:39 -05:00
Christian Göttsche
5e99b81f48 selinux: rework match_ipv6_addrmask()
Constify parameters, add size hints, and simplify control flow.

According to godbolt the same assembly is generated.

Signed-off-by: Christian Göttsche <cgzones@googlemail.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2025-01-07 23:14:38 -05:00
Christian Göttsche
9090308510 selinux: constify and reconcile function parameter names
Align the parameter names between declarations and definitions, and
constify read-only parameters.

Signed-off-by: Christian Göttsche <cgzones@googlemail.com>
[PM: tweak the subject line]
Signed-off-by: Paul Moore <paul@paul-moore.com>
2025-01-07 23:14:38 -05:00
Christian Göttsche
046b85a993 selinux: avoid using types indicating user space interaction
Integer types starting with a double underscore, like __u32, are
intended for usage of variables interacting with user-space.

Just use the plain variant.

Signed-off-by: Christian Göttsche <cgzones@googlemail.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2025-01-07 23:14:38 -05:00
Christian Göttsche
9d8d094fa3 selinux: supply missing field initializers
Please clang by supplying the missing field initializers in the
secclass_map variable and sel_fill_super() function.

Signed-off-by: Christian Göttsche <cgzones@googlemail.com>
[PM: tweak subj and commit description]
Signed-off-by: Paul Moore <paul@paul-moore.com>
2025-01-07 23:14:37 -05:00
Linus Torvalds
09a0fa92e5 selinux/stable-6.13 PR 20250107
-----BEGIN PGP SIGNATURE-----
 
 iQJIBAABCAAyFiEES0KozwfymdVUl37v6iDy2pc3iXMFAmd9cIwUHHBhdWxAcGF1
 bC1tb29yZS5jb20ACgkQ6iDy2pc3iXOUCA//cioM7Z3f2NLzvV0O8zmQcnaEV1U3
 8RumFdUi+n2g7jUTgkOSJoPEsY7loeE/n/7tqD0D7r0nD5sgS4CsFtVdLfz/yVH5
 8tIslrJR18h3H70riL1fev7giNbdrftGJA1W0VSh2cwSqJUUpJ79skrZ9HgGOILW
 vDHoxv3IlzyNrc1XwfjaXXhr7xQMFRghjUGSBpIXk56iVeZebI/Bd1CcHiI1JPBl
 cea73AF8CiGvgjXh2IUPmxjVzJiM2Whjea0QT6RDg8dx2/NJ11jaYHNsKvjT14Z/
 H9s3wx4rl0+qwkmmp0zO3bK4It1X0I1XWrjvhjIxkAq7nZ0befNIJCfK7JXBT2TI
 vGP7kGDbIaN1183CcnXfZ3cpvQTzEanRn+CzLhXVuRgfFoIGIPMxYGRWYunNVjiN
 RW5ptv2pFmNRlCEthTAzwbzxk5ISlTspnsgyo5O5RhB2HGPbCyhEQN8K4rfJxzUy
 nSznOi8+TXdHpOwGFnU0Olhem2Yj0j0wo6c5nvB5+0NDVGFXULget2vHv00FTfOp
 uglGECdKf4Uc0+g6ZRrGjdxJtkEyZr6QAL43ccmQ+2WMb95R2nLq9YjOTpBMg1pD
 zUfcW9qensxAYucloCtHOdOAdd62udEVSERhna7ZPApUjVzaeD50l8SsasaWBxDL
 m6sl3gRDKWW9m2k=
 =ojzK
 -----END PGP SIGNATURE-----

Merge tag 'selinux-pr-20250107' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux

Pull selinux fix from Paul Moore:
 "A single SELinux patch to address a problem with a single domain using
  multiple xperm classes"

* tag 'selinux-pr-20250107' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux:
  selinux: match extended permissions to their base permissions
2025-01-07 14:49:48 -08:00
Tetsuo Handa
08ae2487b2 tomoyo: automatically use patterns for several situations in learning mode
The "file_pattern" keyword was used for automatically recording patternized
pathnames when using the learning mode. This keyword was removed in TOMOYO
2.4 because it is impossible to predefine all possible pathname patterns.

However, since the numeric part of proc:/$PID/ , pipe:[$INO] and
socket:[$INO] has no meaning except $PID == 1, automatically replacing
the numeric part with \$ pattern helps reducing frequency of restarting
the learning mode due to hitting the quota.

Since replacing one digit with \$ pattern requires enlarging string buffer,
and several programs access only $PID == 1, replace only two or more digits
with \$ pattern.

Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
2025-01-06 21:25:00 +09:00
Tanya Agarwal
714d87c90a lockdown: initialize local array before use to quiet static analysis
The static code analysis tool "Coverity Scan" pointed the following
details out for further development considerations:

  CID 1486102: Uninitialized scalar variable (UNINIT)
  uninit_use_in_call: Using uninitialized value *temp when calling
                      strlen.

Signed-off-by: Tanya Agarwal <tanyaagarwal25699@gmail.com>
[PM: edit/reformat the description, subject line]
Signed-off-by: Paul Moore <paul@paul-moore.com>
2025-01-05 12:48:43 -05:00
Leo Stone
f09ff307c7 safesetid: check size of policy writes
syzbot attempts to write a buffer with a large size to a sysfs entry
with writes handled by handle_policy_update(), triggering a warning
in kmalloc.

Check the size specified for write buffers before allocating.

Reported-by: syzbot+4eb7a741b3216020043a@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=4eb7a741b3216020043a
Signed-off-by: Leo Stone <leocstone@gmail.com>
[PM: subject tweak]
Signed-off-by: Paul Moore <paul@paul-moore.com>
2025-01-04 22:46:09 -05:00
Christian Göttsche
b00083aed4 lsm: rename variable to avoid shadowing
The function dump_common_audit_data() contains two variables with the
name comm: one declared at the top and one nested one.  Rename the
nested variable to improve readability and make future refactorings
of the function less error prone.

Signed-off-by: Christian Göttsche <cgzones@googlemail.com>
[PM: description long line removal, line wrap cleanup, merge fuzz]
Signed-off-by: Paul Moore <paul@paul-moore.com>
2025-01-04 22:04:39 -05:00
Christian Göttsche
b0966c7c81 lsm: constify function parameters
The functions print_ipv4_addr() and print_ipv6_addr() are called with
string literals and do not modify these parameters internally.

Signed-off-by: Christian Göttsche <cgzones@googlemail.com>
[PM: cleaned up the description to remove long lines]
Signed-off-by: Paul Moore <paul@paul-moore.com>
2025-01-04 22:04:39 -05:00
Colin Ian King
241d6a6640 security: remove redundant assignment to return variable
In the case where rc is equal to EOPNOTSUPP it is being reassigned a
new value of zero that is never read. The following continue statement
loops back to the next iteration of the lsm_for_each_hook loop and
rc is being re-assigned a new value from the call to getselfattr.
The assignment is redundant and can be removed.

Signed-off-by: Colin Ian King <colin.i.king@gmail.com>
Reviewed-by: Serge Hallyn <serge@hallyn.com>
[PM: subj tweak]
Signed-off-by: Paul Moore <paul@paul-moore.com>
2025-01-04 21:52:13 -05:00
Thiébaud Weksteen
5e7f0efd23 selinux: match extended permissions to their base permissions
In commit d1d991efaf ("selinux: Add netlink xperm support") a new
extended permission was added ("nlmsg"). This was the second extended
permission implemented in selinux ("ioctl" being the first one).

Extended permissions are associated with a base permission. It was found
that, in the access vector cache (avc), the extended permission did not
keep track of its base permission. This is an issue for a domain that is
using both extended permissions (i.e., a domain calling ioctl() on a
netlink socket). In this case, the extended permissions were
overlapping.

Keep track of the base permission in the cache. A new field "base_perm"
is added to struct extended_perms_decision to make sure that the
extended permission refers to the correct policy permission. A new field
"base_perms" is added to struct extended_perms to quickly decide if
extended permissions apply.

While it is in theory possible to retrieve the base permission from the
access vector, the same base permission may not be mapped to the same
bit for each class (e.g., "nlmsg" is mapped to a different bit for
"netlink_route_socket" and "netlink_audit_socket"). Instead, use a
constant (AVC_EXT_IOCTL or AVC_EXT_NLMSG) provided by the caller.

Fixes: d1d991efaf ("selinux: Add netlink xperm support")
Signed-off-by: Thiébaud Weksteen <tweek@google.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2025-01-04 20:58:46 -05:00
Mickaël Salaün
7ccbe076d9 lsm: Only build lsm_audit.c if CONFIG_SECURITY and CONFIG_AUDIT are set
When CONFIG_AUDIT is set, its CONFIG_NET dependency is also set, and the
dev_get_by_index and init_net symbols (used by dump_common_audit_data)
are found by the linker.  dump_common_audit_data() should then failed to
build when CONFIG_NET is not set. However, because the compiler is
smart, it knows that audit_log_start() always return NULL when
!CONFIG_AUDIT, and it doesn't build the body of common_lsm_audit().  As
a side effect, dump_common_audit_data() is not built and the linker
doesn't error out because of missing symbols.

Let's only build lsm_audit.o when CONFIG_SECURITY and CONFIG_AUDIT are
both set, which is checked with the new CONFIG_HAS_SECURITY_AUDIT.

ipv4_skb_to_auditdata() and ipv6_skb_to_auditdata() are only used by
Smack if CONFIG_AUDIT is set, so they don't need fake implementations.

Because common_lsm_audit() is used in multiple places without
CONFIG_AUDIT checks, add a fake implementation.

Link: https://lore.kernel.org/r/20241122143353.59367-2-mic@digikod.net
Cc: Casey Schaufler <casey@schaufler-ca.com>
Cc: James Morris <jmorris@namei.org>
Cc: Paul Moore <paul@paul-moore.com>
Cc: Serge E. Hallyn <serge@hallyn.com>
Signed-off-by: Mickaël Salaün <mic@digikod.net>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2025-01-04 11:50:44 -05:00
Mimi Zohar
4785ed362a ima: ignore suffixed policy rule comments
Lines beginning with '#' in the IMA policy are comments and are ignored.
Instead of placing the rule and comment on separate lines, allow the
comment to be suffixed to the IMA policy rule.

Reviewed-by: Petr Vorel <pvorel@suse.cz>
Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org>
Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>
2025-01-03 10:18:43 -05:00
Mimi Zohar
7eef7c8bac ima: limit the builtin 'tcb' dont_measure tmpfs policy rule
With a custom policy similar to the builtin IMA 'tcb' policy [1], arch
specific policy, and a kexec boot command line measurement policy rule,
the kexec boot command line is not measured due to the dont_measure
tmpfs rule.

Limit the builtin 'tcb' dont_measure tmpfs policy rule to just the
"func=FILE_CHECK" hook.  Depending on the end users security threat
model, a custom policy might not even include this dont_measure tmpfs
rule.

Note: as a result of this policy rule change, other measurements might
also be included in the IMA-measurement list that previously weren't
included.

[1] https://ima-doc.readthedocs.io/en/latest/ima-policy.html#ima-tcb

Reviewed-by: Petr Vorel <pvorel@suse.cz>
Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>
2025-01-03 10:18:24 -05:00
Breno Leitao
68af44a719 ima: kexec: silence RCU list traversal warning
The ima_measurements list is append-only and doesn't require
rcu_read_lock() protection. However, lockdep issues a warning when
traversing RCU lists without the read lock:

  security/integrity/ima/ima_kexec.c:40 RCU-list traversed in non-reader section!!

Fix this by using the variant of list_for_each_entry_rcu() with the last
argument set to true. This tells the RCU subsystem that traversing this
append-only list without the read lock is intentional and safe.

This change silences the lockdep warning while maintaining the correct
semantics for the append-only list traversal.

Signed-off-by: Breno Leitao <leitao@debian.org>
Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>
2024-12-24 13:56:45 -05:00
Mateusz Guzik
ea38219907
vfs: support caching symlink lengths in inodes
When utilized it dodges strlen() in vfs_readlink(), giving about 1.5%
speed up when issuing readlink on /initrd.img on ext4.

Filesystems opt in by calling inode_set_cached_link() when creating an
inode.

The size is stored in a new union utilizing the same space as i_devices,
thus avoiding growing the struct or taking up any more space.

Churn-wise the current readlink_copy() helper is patched to accept the
size instead of calculating it.

Signed-off-by: Mateusz Guzik <mjguzik@gmail.com>
Link: https://lore.kernel.org/r/20241120112037.822078-2-mjguzik@gmail.com
Signed-off-by: Christian Brauner <brauner@kernel.org>
2024-12-22 11:29:50 +01:00
Mimi Zohar
95b3cdafd7 ima: instantiate the bprm_creds_for_exec() hook
Like direct file execution (e.g. ./script.sh), indirect file execution
(e.g. sh script.sh) needs to be measured and appraised.  Instantiate
the new security_bprm_creds_for_exec() hook to measure and verify the
indirect file's integrity.  Unlike direct file execution, indirect file
execution is optionally enforced by the interpreter.

Differentiate kernel and userspace enforced integrity audit messages.

Co-developed-by: Roberto Sassu <roberto.sassu@huawei.com>
Signed-off-by: Roberto Sassu <roberto.sassu@huawei.com>
Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>
Tested-by: Stefan Berger <stefanb@linux.ibm.com>
Reviewed-by: Mickaël Salaün <mic@digikod.net>
Signed-off-by: Mickaël Salaün <mic@digikod.net>
Link: https://lore.kernel.org/r/20241212174223.389435-9-mic@digikod.net
Signed-off-by: Kees Cook <kees@kernel.org>
2024-12-18 17:00:29 -08:00
Mickaël Salaün
a0623b2a1d security: Add EXEC_RESTRICT_FILE and EXEC_DENY_INTERACTIVE securebits
The new SECBIT_EXEC_RESTRICT_FILE, SECBIT_EXEC_DENY_INTERACTIVE, and
their *_LOCKED counterparts are designed to be set by processes setting
up an execution environment, such as a user session, a container, or a
security sandbox.  Unlike other securebits, these ones can be set by
unprivileged processes.  Like seccomp filters or Landlock domains, the
securebits are inherited across processes.

When SECBIT_EXEC_RESTRICT_FILE is set, programs interpreting code should
control executable resources according to execveat(2) + AT_EXECVE_CHECK
(see previous commit).

When SECBIT_EXEC_DENY_INTERACTIVE is set, a process should deny
execution of user interactive commands (which excludes executable
regular files).

Being able to configure each of these securebits enables system
administrators or owner of image containers to gradually validate the
related changes and to identify potential issues (e.g. with interpreter
or audit logs).

It should be noted that unlike other security bits, the
SECBIT_EXEC_RESTRICT_FILE and SECBIT_EXEC_DENY_INTERACTIVE bits are
dedicated to user space willing to restrict itself.  Because of that,
they only make sense in the context of a trusted environment (e.g.
sandbox, container, user session, full system) where the process
changing its behavior (according to these bits) and all its parent
processes are trusted.  Otherwise, any parent process could just execute
its own malicious code (interpreting a script or not), or even enforce a
seccomp filter to mask these bits.

Such a secure environment can be achieved with an appropriate access
control (e.g. mount's noexec option, file access rights, LSM policy) and
an enlighten ld.so checking that libraries are allowed for execution
e.g., to protect against illegitimate use of LD_PRELOAD.

Ptrace restrictions according to these securebits would not make sense
because of the processes' trust assumption.

Scripts may need some changes to deal with untrusted data (e.g. stdin,
environment variables), but that is outside the scope of the kernel.

See chromeOS's documentation about script execution control and the
related threat model:
https://www.chromium.org/chromium-os/developer-library/guides/security/noexec-shell-scripts/

Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Christian Brauner <brauner@kernel.org>
Cc: Kees Cook <keescook@chromium.org>
Cc: Paul Moore <paul@paul-moore.com>
Reviewed-by: Serge Hallyn <serge@hallyn.com>
Reviewed-by: Jeff Xu <jeffxu@chromium.org>
Tested-by: Jeff Xu <jeffxu@chromium.org>
Signed-off-by: Mickaël Salaün <mic@digikod.net>
Link: https://lore.kernel.org/r/20241212174223.389435-3-mic@digikod.net
Signed-off-by: Kees Cook <kees@kernel.org>
2024-12-18 17:00:29 -08:00
Mickaël Salaün
a5874fde3c exec: Add a new AT_EXECVE_CHECK flag to execveat(2)
Add a new AT_EXECVE_CHECK flag to execveat(2) to check if a file would
be allowed for execution.  The main use case is for script interpreters
and dynamic linkers to check execution permission according to the
kernel's security policy. Another use case is to add context to access
logs e.g., which script (instead of interpreter) accessed a file.  As
any executable code, scripts could also use this check [1].

This is different from faccessat(2) + X_OK which only checks a subset of
access rights (i.e. inode permission and mount options for regular
files), but not the full context (e.g. all LSM access checks).  The main
use case for access(2) is for SUID processes to (partially) check access
on behalf of their caller.  The main use case for execveat(2) +
AT_EXECVE_CHECK is to check if a script execution would be allowed,
according to all the different restrictions in place.  Because the use
of AT_EXECVE_CHECK follows the exact kernel semantic as for a real
execution, user space gets the same error codes.

An interesting point of using execveat(2) instead of openat2(2) is that
it decouples the check from the enforcement.  Indeed, the security check
can be logged (e.g. with audit) without blocking an execution
environment not yet ready to enforce a strict security policy.

LSMs can control or log execution requests with
security_bprm_creds_for_exec().  However, to enforce a consistent and
complete access control (e.g. on binary's dependencies) LSMs should
restrict file executability, or measure executed files, with
security_file_open() by checking file->f_flags & __FMODE_EXEC.

Because AT_EXECVE_CHECK is dedicated to user space interpreters, it
doesn't make sense for the kernel to parse the checked files, look for
interpreters known to the kernel (e.g. ELF, shebang), and return ENOEXEC
if the format is unknown.  Because of that, security_bprm_check() is
never called when AT_EXECVE_CHECK is used.

It should be noted that script interpreters cannot directly use
execveat(2) (without this new AT_EXECVE_CHECK flag) because this could
lead to unexpected behaviors e.g., `python script.sh` could lead to Bash
being executed to interpret the script.  Unlike the kernel, script
interpreters may just interpret the shebang as a simple comment, which
should not change for backward compatibility reasons.

Because scripts or libraries files might not currently have the
executable permission set, or because we might want specific users to be
allowed to run arbitrary scripts, the following patch provides a dynamic
configuration mechanism with the SECBIT_EXEC_RESTRICT_FILE and
SECBIT_EXEC_DENY_INTERACTIVE securebits.

This is a redesign of the CLIP OS 4's O_MAYEXEC:
f5cb330d6b/1901_open_mayexec.patch
This patch has been used for more than a decade with customized script
interpreters.  Some examples can be found here:
https://github.com/clipos-archive/clipos4_portage-overlay/search?q=O_MAYEXEC

Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Christian Brauner <brauner@kernel.org>
Cc: Kees Cook <keescook@chromium.org>
Acked-by: Paul Moore <paul@paul-moore.com>
Reviewed-by: Serge Hallyn <serge@hallyn.com>
Reviewed-by: Jeff Xu <jeffxu@chromium.org>
Tested-by: Jeff Xu <jeffxu@chromium.org>
Link: https://docs.python.org/3/library/io.html#io.open_code [1]
Signed-off-by: Mickaël Salaün <mic@digikod.net>
Link: https://lore.kernel.org/r/20241212174223.389435-2-mic@digikod.net
Signed-off-by: Kees Cook <kees@kernel.org>
2024-12-18 17:00:29 -08:00
Linus Torvalds
397d1d88af selinux/stable-6.13 PR 20241217
-----BEGIN PGP SIGNATURE-----
 
 iQJIBAABCAAyFiEES0KozwfymdVUl37v6iDy2pc3iXMFAmdiU+EUHHBhdWxAcGF1
 bC1tb29yZS5jb20ACgkQ6iDy2pc3iXP/Dg//X14XikP3UB0OcVRFkG3etPuUTf0L
 gCTDvPcv+Ck4T1AVhYgyPnZCjkuzvIWeqPMPcSOpUmgeJb9x3pPAB1pJSJnrhAoE
 3VmOmyalxnj/weboKwFLHRgEBN+gYe1J+fchFkQjGJQF+LzZ3I4jk/FARhYzE2UY
 gy/WVKS68MWK/RwED4Hc4c+ZJ/fM27bc3QPLB3C62J9qlQI4p+4XIRNrcfqYYvah
 X+Gd0oKMpRF6evHfx7LujWq+e9fZv5ZaGrRDRUwTTmdyWK2+iFKfQw1x24ijw3Iq
 0xrj8XR1O8nVd+FWo78mSEax+YXa8UY/WbQlTC1IxlN1lETshVGlQPz7QYV0yOpu
 FH47UhXDN2fPHGnMQRbSZf7d8GhOmEBEpms7xll5mDKQnx78Cqxp+xL7BzMCRMyK
 ktO8HPyQcxlKMAIrNStvA9xYWcbXf6PhNfogKln9hAiUyJBeEAMEQWp/tz2r1IHw
 yl78ZsbL3bNOjlk4K7G9w1qqiHjo7DDPgvzE7bTi2yolG/QX4iUIbAeEUAKqxKtl
 qn7R+GGIy/oijSohbkxIPDlf93dzQfMG8QzWN+Z/WZ4NtbdDQglZD6F3ediPNPvP
 RpmabcXBEK4TKnHzwWx1fsxd256OzrWI3QF5bJaEQ2u+R4RIJGmPjz27xiXZiXyb
 oheacqtiYnAyJQU=
 =LS+v
 -----END PGP SIGNATURE-----

Merge tag 'selinux-pr-20241217' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux

Pull selinux fix from Paul Moore:
 "One small SELinux patch to get rid improve our handling of unknown
  extended permissions by safely ignoring them.

  Not only does this make it easier to support newer SELinux policy
  on older kernels in the future, it removes to BUG() calls from the
  SELinux code."

* tag 'selinux-pr-20241217' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux:
  selinux: ignore unknown extended permissions
2024-12-18 12:10:15 -08:00
Tetsuo Handa
0476fd4ff4 tomoyo: use realpath if symlink's pathname refers to procfs
Fedora 41 has reached Linux 6.12 kernel with TOMOYO enabled. I observed
that /usr/lib/systemd/systemd executes /usr/lib/systemd/systemd-executor
by passing dirfd == 9 or dirfd == 16 upon execveat().

Commit ada1986d07 ("tomoyo: fallback to realpath if symlink's pathname
does not exist") used realpath only if symlink's pathname does not exist.
But an out of tree patch suggested that it will be reasonable to always
use realpath if symlink's pathname refers to proc filesystem.

Therefore, this patch changes the pathname used for checking "file execute"
and the domainname used after a successful execve() request.

Before:

  <kernel> /usr/lib/systemd/systemd
  file execute proc:/self/fd/16 exec.realpath="/usr/lib/systemd/systemd-executor" exec.argv[0]="/usr/lib/systemd/systemd-executor"
  file execute proc:/self/fd/9 exec.realpath="/usr/lib/systemd/systemd-executor" exec.argv[0]="/usr/lib/systemd/systemd-executor"

  <kernel> /usr/lib/systemd/systemd proc:/self/fd/16
  file execute /usr/sbin/auditd exec.realpath="/usr/sbin/auditd" exec.argv[0]="/usr/sbin/auditd"

  <kernel> /usr/lib/systemd/systemd proc:/self/fd/16 /usr/sbin/auditd

  <kernel> /usr/lib/systemd/systemd proc:/self/fd/9
  file execute /usr/bin/systemctl exec.realpath="/usr/bin/systemctl" exec.argv[0]="/usr/bin/systemctl"

  <kernel> /usr/lib/systemd/systemd proc:/self/fd/9 /usr/bin/systemctl

After:

  <kernel> /usr/lib/systemd/systemd
  file execute /usr/lib/systemd/systemd-executor exec.realpath="/usr/lib/systemd/systemd-executor" exec.argv[0]="/usr/lib/systemd/systemd-executor"

  <kernel> /usr/lib/systemd/systemd /usr/lib/systemd/systemd-executor
  file execute /usr/bin/systemctl exec.realpath="/usr/bin/systemctl" exec.argv[0]="/usr/bin/systemctl"
  file execute /usr/sbin/auditd exec.realpath="/usr/sbin/auditd" exec.argv[0]="/usr/sbin/auditd"

  <kernel> /usr/lib/systemd/systemd /usr/lib/systemd/systemd-executor /usr/bin/systemctl

  <kernel> /usr/lib/systemd/systemd /usr/lib/systemd/systemd-executor /usr/sbin/auditd

Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
2024-12-17 19:03:39 +09:00
Song Liu
58ecb3a789 bpf: lsm: Remove hook to bpf_task_storage_free
free_task() already calls bpf_task_storage_free(). It is not necessary
to call it again on security_task_free(). Remove the hook.

Signed-off-by: Song Liu <song@kernel.org>
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
Acked-by: Matt Bobrowski <mattbobrowski@google.com>
Link: https://patch.msgid.link/20241212075956.2614894-1-song@kernel.org
2024-12-16 12:32:31 -08:00
Tetsuo Handa
3df7546fc0 tomoyo: don't emit warning in tomoyo_write_control()
syzbot is reporting too large allocation warning at tomoyo_write_control(),
for one can write a very very long line without new line character. To fix
this warning, I use __GFP_NOWARN rather than checking for KMALLOC_MAX_SIZE,
for practically a valid line should be always shorter than 32KB where the
"too small to fail" memory-allocation rule applies.

One might try to write a valid line that is longer than 32KB, but such
request will likely fail with -ENOMEM. Therefore, I feel that separately
returning -EINVAL when a line is longer than KMALLOC_MAX_SIZE is redundant.
There is no need to distinguish over-32KB and over-KMALLOC_MAX_SIZE.

Reported-by: syzbot+7536f77535e5210a5c76@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=7536f77535e5210a5c76
Reported-by: Leo Stone <leocstone@gmail.com>
Closes: https://lkml.kernel.org/r/20241216021459.178759-2-leocstone@gmail.com
Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
2024-12-16 19:41:29 +09:00
Thiébaud Weksteen
900f83cf37 selinux: ignore unknown extended permissions
When evaluating extended permissions, ignore unknown permissions instead
of calling BUG(). This commit ensures that future permissions can be
added without interfering with older kernels.

Cc: stable@vger.kernel.org
Fixes: fa1aa143ac ("selinux: extended permissions for ioctls")
Signed-off-by: Thiébaud Weksteen <tweek@google.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2024-12-15 21:59:03 -05:00
Thiébaud Weksteen
2ef6fc99e0 selinux: add netlink nlmsg_type audit message
Add a new audit message type to capture nlmsg-related information. This
is similar to LSM_AUDIT_DATA_IOCTL_OP which was added for the other
SELinux extended permission (ioctl).

Adding a new type is preferred to adding to the existing
lsm_network_audit structure which contains irrelevant information for
the netlink sockets (i.e., dport, sport).

Signed-off-by: Thiébaud Weksteen <tweek@google.com>
[PM: change "nlnk-msgtype" to "nl-msgtype" as discussed]
Signed-off-by: Paul Moore <paul@paul-moore.com>
2024-12-15 19:33:07 -05:00
Christian Göttsche
4aa1761934 selinux: add support for xperms in conditional policies
Add support for extended permission rules in conditional policies.
Currently the kernel accepts such rules already, but evaluating a
security decision will hit a BUG() in
services_compute_xperms_decision().  Thus reject extended permission
rules in conditional policies for current policy versions.

Add a new policy version for this feature.

Signed-off-by: Christian Göttsche <cgzones@googlemail.com>
Acked-by: Stephen Smalley <stephen.smalley.work@gmail.com>
Tested-by: Stephen Smalley <stephen.smalley.work@gmail.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2024-12-13 16:35:38 -05:00
Mikhail Ivanov
034294fbfd selinux: Fix SCTP error inconsistency in selinux_socket_bind()
Check sk->sk_protocol instead of security class to recognize SCTP socket.
SCTP socket is initialized with SECCLASS_SOCKET class if policy does not
support EXTSOCKCLASS capability. In this case bind(2) hook wrongfully
return EAFNOSUPPORT instead of EINVAL.

The inconsistency was detected with help of Landlock tests:
https://lore.kernel.org/all/b58680ca-81b2-7222-7287-0ac7f4227c3c@huawei-partners.com/

Fixes: 0f8db8cc73 ("selinux: add AF_UNSPEC and INADDR_ANY checks to selinux_socket_bind()")
Signed-off-by: Mikhail Ivanov <ivanov.mikhail1@huawei-partners.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2024-12-11 14:57:47 -05:00
Christian Göttsche
c75c7945cd selinux: use native iterator types
Use types for iterators equal to the type of the to be compared values.

Reported by clang:

    ../ss/sidtab.c:126:2: warning: comparison of integers of different
                          signs: 'int' and 'unsigned long'
      126 |  hash_for_each_rcu(sidtab->context_to_sid, i, entry, list) {
          |  ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    ../hashtable.h:139:51: note: expanded from macro 'hash_for_each_rcu'
      139 |  for (... ; obj == NULL && (bkt) < HASH_SIZE(name);\
          |                             ~~~  ^ ~~~~~~~~~~~~~~~

    ../selinuxfs.c:1520:23: warning: comparison of integers of different
                            signs: 'int' and 'unsigned int'
     1520 |  for (cpu = *idx; cpu < nr_cpu_ids; ++cpu) {
          |                   ~~~ ^ ~~~~~~~~~~

    ../hooks.c:412:16: warning: comparison of integers of different signs:
                       'int' and 'unsigned long'
      412 |  for (i = 0; i < ARRAY_SIZE(tokens); i++) {
          |              ~ ^ ~~~~~~~~~~~~~~~~~~

Signed-off-by: Christian Göttsche <cgzones@googlemail.com>
[PM: munged the clang output due to line length concerns]
Signed-off-by: Paul Moore <paul@paul-moore.com>
2024-12-11 13:54:14 -05:00
Thomas Weißschuh
b01c939d58 selinux: add generated av_permissions.h to targets
av_permissions.h was not declared as a target and therefore not cleaned
up automatically by kbuild.

Suggested-by: Masahiro Yamada <masahiroy@kernel.org>
Link: https://lore.kernel.org/lkml/CAK7LNATUnCPt03BRFSKh1EH=+Sy0Q48wE4ER0BZdJqOb_44L8w@mail.gmail.com/
Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de>
Reviewed-by: Masahiro Yamada <masahiroy@kernel.org>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2024-12-11 13:42:35 -05:00
Stefan Berger
254ef9541d ima: Suspend PCR extends and log appends when rebooting
To avoid the following types of error messages due to a failure by the TPM
driver to use the TPM, suspend TPM PCR extensions and the appending of
entries to the IMA log once IMA's reboot notifier has been called. This
avoids trying to use the TPM after the TPM subsystem has been shut down.

[111707.685315][    T1] ima: Error Communicating to TPM chip, result: -19
[111707.685960][    T1] ima: Error Communicating to TPM chip, result: -19

Synchronization with the ima_extend_list_mutex to set
ima_measurements_suspended ensures that the TPM subsystem is not shut down
when IMA holds the mutex while appending to the log and extending the PCR.
The alternative of reading the system_state variable would not provide this
guarantee.

This error could be observed on a ppc64 machine running SuSE Linux where
processes are still accessing files after devices have been shut down.

Suspending the IMA log and PCR extensions shortly before reboot does not
seem to open a significant measurement gap since neither TPM quoting would
work for attestation nor that new log entries could be written to anywhere
after devices have been shut down. However, there's a time window between
the invocation of the reboot notifier and the shutdown of devices. This
includes all subsequently invoked reboot notifiers as well as
kernel_restart_prepare() where __usermodehelper_disable() waits for all
running_helpers to exit. During this time window IMA could now miss log
entries even though attestation would still work. The reboot of the system
shortly after may make this small gap insignificant.

Signed-off-by: Tushar Sugandhi <tusharsu@linux.microsoft.com>
Signed-off-by: Stefan Berger <stefanb@linux.ibm.com>
Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org>
Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>
2024-12-11 11:55:53 -05:00
Amir Goldstein
f156524e5d fsnotify: introduce pre-content permission events
The new FS_PRE_ACCESS permission event is similar to FS_ACCESS_PERM,
but it meant for a different use case of filling file content before
access to a file range, so it has slightly different semantics.

Generate FS_PRE_ACCESS/FS_ACCESS_PERM as two seperate events, so content
scanners could inspect the content filled by pre-content event handler.

Unlike FS_ACCESS_PERM, FS_PRE_ACCESS is also called before a file is
modified by syscalls as write() and fallocate().

FS_ACCESS_PERM is reported also on blockdev and pipes, but the new
pre-content events are only reported for regular files and dirs.

The pre-content events are meant to be used by hierarchical storage
managers that want to fill the content of files on first access.

There are some specific requirements from filesystems that could
be used with pre-content events, so add a flag for fs to opt-in
for pre-content events explicitly before they can be used.

Signed-off-by: Amir Goldstein <amir73il@gmail.com>
Signed-off-by: Jan Kara <jack@suse.cz>
Link: https://patch.msgid.link/b934c5e3af205abc4e0e4709f6486815937ddfdf.1731684329.git.josef@toxicpanda.com
2024-12-10 12:03:17 +01:00
Konstantin Andreev
6f71ad02aa smack: deduplicate access to string conversion
Signed-off-by: Konstantin Andreev <andreev@swemel.ru>
Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
2024-12-06 13:21:12 -08:00
Linus Torvalds
896d8946da Including fixes from can and netfilter.
Current release - regressions:
 
   - rtnetlink: fix double call of rtnl_link_get_net_ifla()
 
   - tcp: populate XPS related fields of timewait sockets
 
   - ethtool: fix access to uninitialized fields in set RXNFC command
 
   - selinux: use sk_to_full_sk() in selinux_ip_output()
 
 Current release - new code bugs:
 
   - net: make napi_hash_lock irq safe
 
   - eth: bnxt_en: support header page pool in queue API
 
   - eth: ice: fix NULL pointer dereference in switchdev
 
 Previous releases - regressions:
 
   - core: fix icmp host relookup triggering ip_rt_bug
 
   - ipv6:
     - avoid possible NULL deref in modify_prefix_route()
     - release expired exception dst cached in socket
 
   - smc: fix LGR and link use-after-free issue
 
   - hsr: avoid potential out-of-bound access in fill_frame_info()
 
   - can: hi311x: fix potential use-after-free
 
   - eth: ice: fix VLAN pruning in switchdev mode
 
 Previous releases - always broken:
 
   - netfilter:
     - ipset: hold module reference while requesting a module
     - nft_inner: incorrect percpu area handling under softirq
 
   - can: j1939: fix skb reference counting
 
   - eth: mlxsw: use correct key block on Spectrum-4
 
   - eth: mlx5: fix memory leak in mlx5hws_definer_calc_layout
 
 Signed-off-by: Paolo Abeni <pabeni@redhat.com>
 -----BEGIN PGP SIGNATURE-----
 
 iQJGBAABCAAwFiEEg1AjqC77wbdLX2LbKSR5jcyPE6QFAmdRve4SHHBhYmVuaUBy
 ZWRoYXQuY29tAAoJECkkeY3MjxOk/TUQAItDVkTQDiAUUqd0TDH2SSnboxXozcMF
 fKVrdWulFAOe7qcajUqjkzvTDjAOw9Xbfh8rEiFBLaXmUzSUT2eqbm2VahvWIR5/
 k08v7fTTuCNzEwhnbQlsZ47Nd26LJVPwOvbtE/4V8d50pU/serjWuI/74tUCWAjn
 DQOjyyqRjKgKKY+WWCQ6g4tVpD//DB4Sj15GiE3MhlW1f08AAnPJSe2oTaq0RZBG
 nXo7abOGn8x3RYilrlp/ZwWYuNpVI4oF+qmp+t/46NV+7ER1JgrC97k0kFyyCYVD
 g7vBvFjvA7vDmiuzfsOW2n7IRdcfBtkfi8UJYOIvVgJg7KDF0DXoxj3qD4FagI35
 RWRMJw+PoNlFFkPprlp0we/19jPJWOO6rx+AOMEQD78jrH7NoFQ/+eeBf/nppfjy
 wX0LD1aQsgPk2ju0I8GbcM/qaJ81EJUiYYVLJHieH0+vvmqts8cMDRLZf5t08EHa
 myXcRGB9N8gjguGp5mdR5KmtY82zASlNC0PbDp3nlPYc3e/opCmMRYGjBohO+vqn
 n7u250WThPwiBtOwYmSbcK7zpS034/VX0ufnTT2X3MWnFGGNDv6XVmho/OBuCHqJ
 m/EiJo/D9qVGAIql/vAxN9a4lQKrcZgaMzCyEPYCf7rtLmzx7sfyHWbf/GZlUAU/
 9dUUfWqCbZcL
 =nkPk
 -----END PGP SIGNATURE-----

Merge tag 'net-6.13-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net

Pull networking fixes from Paolo Abeni:
 "Including fixes from can and netfilter.

  Current release - regressions:

   - rtnetlink: fix double call of rtnl_link_get_net_ifla()

   - tcp: populate XPS related fields of timewait sockets

   - ethtool: fix access to uninitialized fields in set RXNFC command

   - selinux: use sk_to_full_sk() in selinux_ip_output()

  Current release - new code bugs:

   - net: make napi_hash_lock irq safe

   - eth:
      - bnxt_en: support header page pool in queue API
      - ice: fix NULL pointer dereference in switchdev

  Previous releases - regressions:

   - core: fix icmp host relookup triggering ip_rt_bug

   - ipv6:
      - avoid possible NULL deref in modify_prefix_route()
      - release expired exception dst cached in socket

   - smc: fix LGR and link use-after-free issue

   - hsr: avoid potential out-of-bound access in fill_frame_info()

   - can: hi311x: fix potential use-after-free

   - eth: ice: fix VLAN pruning in switchdev mode

  Previous releases - always broken:

   - netfilter:
      - ipset: hold module reference while requesting a module
      - nft_inner: incorrect percpu area handling under softirq

   - can: j1939: fix skb reference counting

   - eth:
      - mlxsw: use correct key block on Spectrum-4
      - mlx5: fix memory leak in mlx5hws_definer_calc_layout"

* tag 'net-6.13-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (76 commits)
  net :mana :Request a V2 response version for MANA_QUERY_GF_STAT
  net: avoid potential UAF in default_operstate()
  vsock/test: verify socket options after setting them
  vsock/test: fix parameter types in SO_VM_SOCKETS_* calls
  vsock/test: fix failures due to wrong SO_RCVLOWAT parameter
  net/mlx5e: Remove workaround to avoid syndrome for internal port
  net/mlx5e: SD, Use correct mdev to build channel param
  net/mlx5: E-Switch, Fix switching to switchdev mode in MPV
  net/mlx5: E-Switch, Fix switching to switchdev mode with IB device disabled
  net/mlx5: HWS: Properly set bwc queue locks lock classes
  net/mlx5: HWS: Fix memory leak in mlx5hws_definer_calc_layout
  bnxt_en: handle tpa_info in queue API implementation
  bnxt_en: refactor bnxt_alloc_rx_rings() to call bnxt_alloc_rx_agg_bmap()
  bnxt_en: refactor tpa_info alloc/free into helpers
  geneve: do not assume mac header is set in geneve_xmit_skb()
  mlxsw: spectrum_acl_flex_keys: Use correct key block on Spectrum-4
  ethtool: Fix wrong mod state in case of verbose and no_mask bitset
  ipmr: tune the ipmr_can_free_table() checks.
  netfilter: nft_set_hash: skip duplicated elements pending gc run
  netfilter: ipset: Hold module reference while requesting a module
  ...
2024-12-05 10:25:06 -08:00
Jordan Rome
d48da4d5ed
security: add trace event for cap_capable
In cases where we want a stable way to observe/trace
cap_capable (e.g. protection from inlining and API updates)
add a tracepoint that passes:
- The credentials used
- The user namespace of the resource being accessed
- The user namespace in which the credential provides the
capability to access the targeted resource
- The capability to check for
- The return value of the check

Signed-off-by: Jordan Rome <linux@jordanrome.com>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Reviewed-by: Paul Moore <paul@paul-moore.com>
Reviewed-by: Serge Hallyn <serge@hallyn.com>
Link: https://lore.kernel.org/r/20241204155911.1817092-1-linux@jordanrome.com
Signed-off-by: Serge Hallyn <sergeh@kernel.org>
2024-12-04 20:59:21 -06:00
Paul Moore
3f4f1f8a1a
capabilities: remove cap_mmap_file()
The cap_mmap_file() LSM callback returns the default value for the
security_mmap_file() LSM hook and can be safely removed.

Signed-off-by: Paul Moore <paul@paul-moore.com>
Reviewed-by: Casey Schaufler <casey@schaufler-ca.com>
Reviewed-by: Serge Hallyn <serge@hallyn.com>
Signed-off-by: Serge Hallyn <sergeh@kernel.org>
2024-12-04 20:56:28 -06:00
Casey Schaufler
a4626e9786 lsm: secctx provider check on release
Verify that the LSM releasing the secctx is the LSM that
allocated it. This was not necessary when only one LSM could
create a secctx, but once there can be more than one it is.

Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
[PM: subject tweak]
Signed-off-by: Paul Moore <paul@paul-moore.com>
2024-12-04 14:59:57 -05:00
Casey Schaufler
b530104f50 lsm: lsm_context in security_dentry_init_security
Replace the (secctx,seclen) pointer pair with a single lsm_context
pointer to allow return of the LSM identifier along with the context
and context length. This allows security_release_secctx() to know how
to release the context. Callers have been modified to use or save the
returned data from the new structure.

Cc: ceph-devel@vger.kernel.org
Cc: linux-nfs@vger.kernel.org
Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
[PM: subject tweak]
Signed-off-by: Paul Moore <paul@paul-moore.com>
2024-12-04 14:58:51 -05:00
Casey Schaufler
76ecf306ae lsm: use lsm_context in security_inode_getsecctx
Change the security_inode_getsecctx() interface to fill a lsm_context
structure instead of data and length pointers.  This provides
the information about which LSM created the context so that
security_release_secctx() can use the correct hook.

Cc: linux-nfs@vger.kernel.org
Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
[PM: subject tweak]
Signed-off-by: Paul Moore <paul@paul-moore.com>
2024-12-04 14:58:09 -05:00
Casey Schaufler
2d470c7781 lsm: replace context+len with lsm_context
Replace the (secctx,seclen) pointer pair with a single
lsm_context pointer to allow return of the LSM identifier
along with the context and context length. This allows
security_release_secctx() to know how to release the
context. Callers have been modified to use or save the
returned data from the new structure.

security_secid_to_secctx() and security_lsmproc_to_secctx()
will now return the length value on success instead of 0.

Cc: netdev@vger.kernel.org
Cc: audit@vger.kernel.org
Cc: netfilter-devel@vger.kernel.org
Cc: Todd Kjos <tkjos@google.com>
Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
[PM: subject tweak, kdoc fix, signedness fix from Dan Carpenter]
Signed-off-by: Paul Moore <paul@paul-moore.com>
2024-12-04 14:42:31 -05:00
Casey Schaufler
6fba89813c lsm: ensure the correct LSM context releaser
Add a new lsm_context data structure to hold all the information about a
"security context", including the string, its size and which LSM allocated
the string. The allocation information is necessary because LSMs have
different policies regarding the lifecycle of these strings. SELinux
allocates and destroys them on each use, whereas Smack provides a pointer
to an entry in a list that never goes away.

Update security_release_secctx() to use the lsm_context instead of a
(char *, len) pair. Change its callers to do likewise.  The LSMs
supporting this hook have had comments added to remind the developer
that there is more work to be done.

The BPF security module provides all LSM hooks. While there has yet to
be a known instance of a BPF configuration that uses security contexts,
the possibility is real. In the existing implementation there is
potential for multiple frees in that case.

Cc: linux-integrity@vger.kernel.org
Cc: netdev@vger.kernel.org
Cc: audit@vger.kernel.org
Cc: netfilter-devel@vger.kernel.org
To: Pablo Neira Ayuso <pablo@netfilter.org>
Cc: linux-nfs@vger.kernel.org
Cc: Todd Kjos <tkjos@google.com>
Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
[PM: subject tweak]
Signed-off-by: Paul Moore <paul@paul-moore.com>
2024-12-04 10:46:26 -05:00
Peter Zijlstra
cdd30ebb1b module: Convert symbol namespace to string literal
Clean up the existing export namespace code along the same lines of
commit 33def8498f ("treewide: Convert macro and uses of __section(foo)
to __section("foo")") and for the same reason, it is not desired for the
namespace argument to be a macro expansion itself.

Scripted using

  git grep -l -e MODULE_IMPORT_NS -e EXPORT_SYMBOL_NS | while read file;
  do
    awk -i inplace '
      /^#define EXPORT_SYMBOL_NS/ {
        gsub(/__stringify\(ns\)/, "ns");
        print;
        next;
      }
      /^#define MODULE_IMPORT_NS/ {
        gsub(/__stringify\(ns\)/, "ns");
        print;
        next;
      }
      /MODULE_IMPORT_NS/ {
        $0 = gensub(/MODULE_IMPORT_NS\(([^)]*)\)/, "MODULE_IMPORT_NS(\"\\1\")", "g");
      }
      /EXPORT_SYMBOL_NS/ {
        if ($0 ~ /(EXPORT_SYMBOL_NS[^(]*)\(([^,]+),/) {
  	if ($0 !~ /(EXPORT_SYMBOL_NS[^(]*)\(([^,]+), ([^)]+)\)/ &&
  	    $0 !~ /(EXPORT_SYMBOL_NS[^(]*)\(\)/ &&
  	    $0 !~ /^my/) {
  	  getline line;
  	  gsub(/[[:space:]]*\\$/, "");
  	  gsub(/[[:space:]]/, "", line);
  	  $0 = $0 " " line;
  	}

  	$0 = gensub(/(EXPORT_SYMBOL_NS[^(]*)\(([^,]+), ([^)]+)\)/,
  		    "\\1(\\2, \"\\3\")", "g");
        }
      }
      { print }' $file;
  done

Requested-by: Masahiro Yamada <masahiroy@kernel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://mail.google.com/mail/u/2/#inbox/FMfcgzQXKWgMmjdFwwdsfgxzKpVHWPlc
Acked-by: Greg KH <gregkh@linuxfoundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2024-12-02 11:34:44 -08:00
Linus Torvalds
8a6a03ad5b lsm/stable-6.13 PR 20241129
-----BEGIN PGP SIGNATURE-----
 
 iQJIBAABCAAyFiEES0KozwfymdVUl37v6iDy2pc3iXMFAmdKdi4UHHBhdWxAcGF1
 bC1tb29yZS5jb20ACgkQ6iDy2pc3iXP36A//UudCkttNTVCErQM0r8mjcPjtthlo
 fRblzsuXbHB3FTsrDQA4bpPcLTlxMcNZXGgO9F+cwjNRNRbavBZo8+AbtAWOppNx
 3XGr65Z0iIelUgvy/T3jYl8xQM4IePWbBn/wXMtqX3synE/UX4fqfxC/feeaUly8
 qdyoefM0I4JOabjn9i9N6KyeyU/sjI4aTRhd5YwaCdy6wK2u9hQEqvER5XdFZjhN
 HnuFKlBKEH5fo3Wwq5X+PbX8hP4szl3DE7QEplTlMjrh0y7xIc/ndugp/f3IN2Ri
 0/Da6FpqjtBiis+wkyGWwwfmw/erRRaLUhwbuW3MjCQTmtbcjkIjA2xb4nxtcT+n
 e/sCdWkgf1PX15igHpbWAynlv+jF3Ue+Kr/hrWi/H50fWiZh9skUmUxJ92SlI4fK
 DT1N430xvoBpQwynASAM6hcWWgLJlFBo11aZA2fl++ORC3z/dbCZ5iLPdbIvaemF
 KSmQTnrk4sVH8sPl9aIHqCFxj78uTHkSzLGQyq76AWU/s8hGzPszpU6iPwlcx1wk
 WpkMLG5FQr0a4vKnNx1FVLCmwalPlB6/ZclRvKuSiEGZWTrnOo5KhPy1+Yn39AUo
 IDBFnJkJNWxUVyb6zK9OJJ3Tsd6c1qeBEqb0VzUDGmw2umTuEAKqlqNJAohxVvyP
 A7GFrlg9FvsDAwM=
 =ZCsN
 -----END PGP SIGNATURE-----

Merge tag 'lsm-pr-20241129' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/lsm

Pull ima fix from Paul Moore:
 "One small patch to fix a function parameter / local variable naming
  snafu that went up to you in the current merge window"

* tag 'lsm-pr-20241129' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/lsm:
  ima: uncover hidden variable in ima_match_rules()
2024-11-30 18:14:56 -08:00
Eric Dumazet
eedcad2f2a selinux: use sk_to_full_sk() in selinux_ip_output()
In blamed commit, TCP started to attach timewait sockets to
some skbs.

syzbot reported that selinux_ip_output() was not expecting them yet.

Note that using sk_to_full_sk() is still allowing the
following sk_listener() check to work as before.

BUG: KASAN: slab-out-of-bounds in selinux_sock security/selinux/include/objsec.h:207 [inline]
BUG: KASAN: slab-out-of-bounds in selinux_ip_output+0x1e0/0x1f0 security/selinux/hooks.c:5761
Read of size 8 at addr ffff88804e86e758 by task syz-executor347/5894

CPU: 0 UID: 0 PID: 5894 Comm: syz-executor347 Not tainted 6.12.0-syzkaller-05480-gfcc79e1714e8 #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 10/30/2024
Call Trace:
 <IRQ>
  __dump_stack lib/dump_stack.c:94 [inline]
  dump_stack_lvl+0x116/0x1f0 lib/dump_stack.c:120
  print_address_description mm/kasan/report.c:377 [inline]
  print_report+0xc3/0x620 mm/kasan/report.c:488
  kasan_report+0xd9/0x110 mm/kasan/report.c:601
  selinux_sock security/selinux/include/objsec.h:207 [inline]
  selinux_ip_output+0x1e0/0x1f0 security/selinux/hooks.c:5761
  nf_hook_entry_hookfn include/linux/netfilter.h:154 [inline]
  nf_hook_slow+0xbb/0x200 net/netfilter/core.c:626
  nf_hook+0x386/0x6d0 include/linux/netfilter.h:269
  __ip_local_out+0x339/0x640 net/ipv4/ip_output.c:119
  ip_local_out net/ipv4/ip_output.c:128 [inline]
  ip_send_skb net/ipv4/ip_output.c:1505 [inline]
  ip_push_pending_frames+0xa0/0x5b0 net/ipv4/ip_output.c:1525
  ip_send_unicast_reply+0xd0e/0x1650 net/ipv4/ip_output.c:1672
  tcp_v4_send_ack+0x976/0x13f0 net/ipv4/tcp_ipv4.c:1024
  tcp_v4_timewait_ack net/ipv4/tcp_ipv4.c:1077 [inline]
  tcp_v4_rcv+0x2f96/0x4390 net/ipv4/tcp_ipv4.c:2428
  ip_protocol_deliver_rcu+0xba/0x4c0 net/ipv4/ip_input.c:205
  ip_local_deliver_finish+0x316/0x570 net/ipv4/ip_input.c:233
  NF_HOOK include/linux/netfilter.h:314 [inline]
  NF_HOOK include/linux/netfilter.h:308 [inline]
  ip_local_deliver+0x18e/0x1f0 net/ipv4/ip_input.c:254
  dst_input include/net/dst.h:460 [inline]
  ip_rcv_finish net/ipv4/ip_input.c:447 [inline]
  NF_HOOK include/linux/netfilter.h:314 [inline]
  NF_HOOK include/linux/netfilter.h:308 [inline]
  ip_rcv+0x2c3/0x5d0 net/ipv4/ip_input.c:567
  __netif_receive_skb_one_core+0x199/0x1e0 net/core/dev.c:5672
  __netif_receive_skb+0x1d/0x160 net/core/dev.c:5785
  process_backlog+0x443/0x15f0 net/core/dev.c:6117
  __napi_poll.constprop.0+0xb7/0x550 net/core/dev.c:6877
  napi_poll net/core/dev.c:6946 [inline]
  net_rx_action+0xa94/0x1010 net/core/dev.c:7068
  handle_softirqs+0x213/0x8f0 kernel/softirq.c:554
  do_softirq kernel/softirq.c:455 [inline]
  do_softirq+0xb2/0xf0 kernel/softirq.c:442
 </IRQ>
 <TASK>
  __local_bh_enable_ip+0x100/0x120 kernel/softirq.c:382
  local_bh_enable include/linux/bottom_half.h:33 [inline]
  rcu_read_unlock_bh include/linux/rcupdate.h:919 [inline]
  __dev_queue_xmit+0x8af/0x43e0 net/core/dev.c:4461
  dev_queue_xmit include/linux/netdevice.h:3168 [inline]
  neigh_hh_output include/net/neighbour.h:523 [inline]
  neigh_output include/net/neighbour.h:537 [inline]
  ip_finish_output2+0xc6c/0x2150 net/ipv4/ip_output.c:236
  __ip_finish_output net/ipv4/ip_output.c:314 [inline]
  __ip_finish_output+0x49e/0x950 net/ipv4/ip_output.c:296
  ip_finish_output+0x35/0x380 net/ipv4/ip_output.c:324
  NF_HOOK_COND include/linux/netfilter.h:303 [inline]
  ip_output+0x13b/0x2a0 net/ipv4/ip_output.c:434
  dst_output include/net/dst.h:450 [inline]
  ip_local_out+0x33e/0x4a0 net/ipv4/ip_output.c:130
  __ip_queue_xmit+0x777/0x1970 net/ipv4/ip_output.c:536
  __tcp_transmit_skb+0x2b39/0x3df0 net/ipv4/tcp_output.c:1466
  tcp_transmit_skb net/ipv4/tcp_output.c:1484 [inline]
  tcp_write_xmit+0x12b1/0x8560 net/ipv4/tcp_output.c:2827
  __tcp_push_pending_frames+0xaf/0x390 net/ipv4/tcp_output.c:3010
  tcp_send_fin+0x154/0xc70 net/ipv4/tcp_output.c:3616
  __tcp_close+0x96b/0xff0 net/ipv4/tcp.c:3130
  tcp_close+0x28/0x120 net/ipv4/tcp.c:3221
  inet_release+0x13c/0x280 net/ipv4/af_inet.c:435
  __sock_release net/socket.c:640 [inline]
  sock_release+0x8e/0x1d0 net/socket.c:668
  smc_clcsock_release+0xb7/0xe0 net/smc/smc_close.c:34
  __smc_release+0x5c2/0x880 net/smc/af_smc.c:301
  smc_release+0x1fc/0x5f0 net/smc/af_smc.c:344
  __sock_release+0xb0/0x270 net/socket.c:640
  sock_close+0x1c/0x30 net/socket.c:1408
  __fput+0x3f8/0xb60 fs/file_table.c:450
  __fput_sync+0xa1/0xc0 fs/file_table.c:535
  __do_sys_close fs/open.c:1550 [inline]
  __se_sys_close fs/open.c:1535 [inline]
  __x64_sys_close+0x86/0x100 fs/open.c:1535
  do_syscall_x64 arch/x86/entry/common.c:52 [inline]
  do_syscall_64+0xcd/0x250 arch/x86/entry/common.c:83
 entry_SYSCALL_64_after_hwframe+0x77/0x7f
RIP: 0033:0x7f6814c9ae10
Code: ff f7 d8 64 89 02 48 c7 c0 ff ff ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 66 90 80 3d b1 e2 07 00 00 74 17 b8 03 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 48 c3 0f 1f 80 00 00 00 00 48 83 ec 18 89 7c
RSP: 002b:00007fffb2389758 EFLAGS: 00000202 ORIG_RAX: 0000000000000003
RAX: ffffffffffffffda RBX: 0000000000000004 RCX: 00007f6814c9ae10
RDX: 0000000000000010 RSI: 0000000020000000 RDI: 0000000000000003
RBP: 00000000000f4240 R08: 0000000000000001 R09: 0000000000000001
R10: 0000000000000001 R11: 0000000000000202 R12: 00007fffb23897b0
R13: 00000000000141c3 R14: 00007fffb238977c R15: 00007fffb2389790
 </TASK>

Fixes: 79636038d3 ("ipv4: tcp: give socket pointer to control skbs")
Reported-by: syzbot+2d9f5f948c31dcb7745e@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/lkml/6745e1a2.050a0220.1286eb.001c.GAE@google.com/T/#u
Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Paul Moore <paul@paul-moore.com>
Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Link: https://patch.msgid.link/20241126145911.4187198-1-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-11-30 13:24:33 -08:00
Casey Schaufler
a65d9d1d89 ima: uncover hidden variable in ima_match_rules()
The variable name "prop" is inadvertently used twice in
ima_match_rules(), resulting in incorrect use of the local
variable when the function parameter should have been.
Rename the local variable and correct the use of the parameter.

Suggested-by: Roberto Sassu <roberto.sassu@huawei.com>
Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
Acked-by: Roberto Sassu <roberto.sassu@huawei.com>
[PM: subj tweak, Roberto's ACK]
Signed-off-by: Paul Moore <paul@paul-moore.com>
2024-11-26 22:58:03 -05:00
John Johansen
04b5f0a5bf apparmor: lift new_profile declaration to remove C23 extension warning
the kernel test robot reports a C23 extension
warning: label followed by a declaration is a C23 extension
[-Wc23-extensions]
     696 |                 struct aa_profile *new_profile = NULL;

Instead of adding a null statement creating a C99 style inline var
declaration lift the label declaration out of the block so that it no
longer immediatedly follows the label.

Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202411101808.AI8YG6cs-lkp@intel.com/
Fixes: ee650b3820f3 ("apparmor: properly handle cx/px lookup failure for complain")
Signed-off-by: John Johansen <john.johansen@canonical.com>
2024-11-26 19:21:06 -08:00
Ryan Lee
8acf7ad02d apparmor: replace misleading 'scrubbing environment' phrase in debug print
The wording of 'scrubbing environment' implied that all environment
variables would be removed, when instead secure-execution mode only
removes a small number of environment variables. This patch updates the
wording to describe what actually occurs instead: setting AT_SECURE for
ld.so's secure-execution mode.

Link: https://gitlab.com/apparmor/apparmor/-/merge_requests/1315 is a
merge request that does similar updating for apparmor userspace.

Signed-off-by: Ryan Lee <ryan.lee@canonical.com>
Signed-off-by: John Johansen <john.johansen@canonical.com>
2024-11-26 19:21:06 -08:00
John Johansen
9133493a76 parser: drop dead code for XXX_comb macros
The macros for label combination XXX_comb are no longer used and there
are no plans to use them so remove the dead code.

Signed-off-by: John Johansen <john.johansen@canonical.com>
2024-11-26 19:21:06 -08:00
Jinjie Ruan
2115517682 apparmor: Remove unused parameter L1 in macro next_comb
In the macro definition of next_comb(), a parameter L1 is accepted,
but it is not used. Hence, it should be removed.

Signed-off-by: Jinjie Ruan <ruanjinjie@huawei.com>
Signed-off-by: John Johansen <john.johansen@canonical.com>
2024-11-26 19:21:06 -08:00
Ryan Lee
74a96bbe12 apparmor: audit_cap dedup based on subj_cred instead of profile
The previous audit_cap cache deduping was based on the profile that was
being audited. This could cause confusion due to the deduplication then
occurring across multiple processes, which could happen if multiple
instances of binaries matched the same profile attachment (and thus ran
under the same profile) or a profile was attached to a container and its
processes.

Instead, perform audit_cap deduping over ad->subj_cred, which ensures the
deduping only occurs across a single process, instead of across all
processes that match the current one's profile.

Signed-off-by: Ryan Lee <ryan.lee@canonical.com>
Signed-off-by: John Johansen <john.johansen@canonical.com>
2024-11-26 19:21:06 -08:00
Ryan Lee
fee7a2340f apparmor: add a cache entry expiration time aging out capability audit cache
When auditing capabilities, AppArmor uses a per-CPU, per-profile cache
such that the same capability for the same profile doesn't get repeatedly
audited, with the original goal of reducing audit logspam. However, this
cache does not have an expiration time, resulting in confusion when a
profile is shared across binaries (for example) and an expected DENIED
audit entry doesn't appear, despite the cache entry having been populated
much longer ago. This confusion was exacerbated by the per-CPU nature of
the cache resulting in the expected entries sporadically appearing when
the later denial+audit occurred on a different CPU.

To resolve this, record the last time a capability was audited for a
profile and add a timestamp expiration check before doing the audit.

v1 -> v2:
 - Hardcode a longer timeout and drop the patches making it a sysctl,
   after discussion with John Johansen.
 - Cache the expiration time instead of the last-audited time. This value
   can never be zero, which lets us drop the kernel_cap_t caps field from
   the cache struct.

Signed-off-by: Ryan Lee <ryan.lee@canonical.com>
Signed-off-by: John Johansen <john.johansen@canonical.com>
2024-11-26 19:21:06 -08:00
Ryan Lee
8532503eac apparmor: document capability.c:profile_capable ad ptr not being NULL
The profile_capabile function takes a struct apparmor_audit_data *ad,
which is documented as possibly being NULL. However, the single place that
calls this function never passes it a NULL ad. If we were ever to call
profile_capable with a NULL ad elsewhere, we would need to rework the
function, as its very first use of ad is to dereference ad->class without
checking if ad is NULL.

Thus, document profile_capable's ad parameter as not accepting NULL.

Signed-off-by: Ryan Lee <ryan.lee@canonical.com>
Signed-off-by: John Johansen <john.johansen@canonical.com>
2024-11-26 19:21:06 -08:00
chao liu
9b89713242 apparmor: fix 'Do simple duplicate message elimination'
Multiple profiles shared 'ent->caps', so some logs missed.

Fixes: 0ed3b28ab8 ("AppArmor: mediation of non file objects")
Signed-off-by: chao liu <liuzgyid@outlook.com>
Signed-off-by: John Johansen <john.johansen@canonical.com>
2024-11-26 19:21:05 -08:00
John Johansen
a2081b78e2 apparmor: document first entry is in packed perms struct is reserved
Add a comment to unpack_perm to document the first entry in the packed
perms struct is reserved, and make a non-functional change of unpacking
to a temporary stack variable named "reserved" to help suppor the
documentation of which value is reserved.

Suggested-by: Serge E. Hallyn <serge@hallyn.com>
Signed-off-by: John Johansen <john.johansen@canonical.com>
2024-11-26 19:21:05 -08:00
Jinjie Ruan
7290f59231 apparmor: test: Fix memory leak for aa_unpack_strdup()
The string allocated by kmemdup() in aa_unpack_strdup() is not
freed and cause following memory leaks, free them to fix it.

	unreferenced object 0xffffff80c6af8a50 (size 8):
	  comm "kunit_try_catch", pid 225, jiffies 4294894407
	  hex dump (first 8 bytes):
	    74 65 73 74 69 6e 67 00                          testing.
	  backtrace (crc 5eab668b):
	    [<0000000001e3714d>] kmemleak_alloc+0x34/0x40
	    [<000000006e6c7776>] __kmalloc_node_track_caller_noprof+0x300/0x3e0
	    [<000000006870467c>] kmemdup_noprof+0x34/0x60
	    [<000000001176bb03>] aa_unpack_strdup+0xd0/0x18c
	    [<000000008ecde918>] policy_unpack_test_unpack_strdup_with_null_name+0xf8/0x3ec
	    [<0000000032ef8f77>] kunit_try_run_case+0x13c/0x3ac
	    [<00000000f3edea23>] kunit_generic_run_threadfn_adapter+0x80/0xec
	    [<00000000adf936cf>] kthread+0x2e8/0x374
	    [<0000000041bb1628>] ret_from_fork+0x10/0x20
	unreferenced object 0xffffff80c2a29090 (size 8):
	  comm "kunit_try_catch", pid 227, jiffies 4294894409
	  hex dump (first 8 bytes):
	    74 65 73 74 69 6e 67 00                          testing.
	  backtrace (crc 5eab668b):
	    [<0000000001e3714d>] kmemleak_alloc+0x34/0x40
	    [<000000006e6c7776>] __kmalloc_node_track_caller_noprof+0x300/0x3e0
	    [<000000006870467c>] kmemdup_noprof+0x34/0x60
	    [<000000001176bb03>] aa_unpack_strdup+0xd0/0x18c
	    [<0000000046a45c1a>] policy_unpack_test_unpack_strdup_with_name+0xd0/0x3c4
	    [<0000000032ef8f77>] kunit_try_run_case+0x13c/0x3ac
	    [<00000000f3edea23>] kunit_generic_run_threadfn_adapter+0x80/0xec
	    [<00000000adf936cf>] kthread+0x2e8/0x374
	    [<0000000041bb1628>] ret_from_fork+0x10/0x20

Cc: stable@vger.kernel.org
Fixes: 4d944bcd4e ("apparmor: add AppArmor KUnit tests for policy unpack")
Signed-off-by: Jinjie Ruan <ruanjinjie@huawei.com>
Signed-off-by: John Johansen <john.johansen@canonical.com>
2024-11-26 19:21:05 -08:00
Dr. David Alan Gilbert
75535669c9 apparmor: Remove deadcode
aa_label_audit, aa_label_find, aa_label_seq_print and aa_update_label_name
were added by commit
f1bd904175 ("apparmor: add the base fns() for domain labels")
but never used.

aa_profile_label_perm was added by commit
637f688dc3 ("apparmor: switch from profiles to using labels on contexts")
but never used.

aa_secid_update was added by commit
c092921219 ("apparmor: add support for mapping secids and using secctxes")
but never used.

aa_split_fqname has been unused since commit
3664268f19 ("apparmor: add namespace lookup fns()")

aa_lookup_profile has been unused since commit
93c98a484c ("apparmor: move exec domain mediation to using labels")

aa_audit_perms_cb was only used by aa_profile_label_perm (see above).

All of these commits are from around 2017.

Remove them.

Signed-off-by: Dr. David Alan Gilbert <linux@treblig.org>
Signed-off-by: John Johansen <john.johansen@canonical.com>
2024-11-26 19:21:05 -08:00
Thorsten Blum
648e45d724 apparmor: Remove unnecessary NULL check before kvfree()
Since kvfree() already checks if its argument is NULL, an additional
check before calling kvfree() is unnecessary and can be removed.

Remove it and the following Coccinelle/coccicheck warning reported by
ifnullfree.cocci:

  WARNING: NULL check before some freeing functions is not needed

Signed-off-by: Thorsten Blum <thorsten.blum@linux.dev>
Signed-off-by: John Johansen <john.johansen@canonical.com>
2024-11-26 19:21:05 -08:00
Leesoo Ahn
ab6875fbb9 apparmor: domain: clean up duplicated parts of handle_onexec()
Regression test of AppArmor finished without any failures.

PASSED: aa_exec access attach_disconnected at_secure introspect
capabilities changeprofile onexec changehat changehat_fork
changehat_misc chdir clone coredump deleted e2e environ exec exec_qual
fchdir fd_inheritance fork i18n link link_subset mkdir mmap mount
mult_mount named_pipe namespaces net_raw open openat pipe pivot_root
posix_ipc ptrace pwrite query_label regex rename readdir rw socketpair
swap sd_flags setattr symlink syscall sysv_ipc tcp unix_fd_server
unix_socket_pathname unix_socket_abstract unix_socket_unnamed
unix_socket_autobind unlink userns xattrs xattrs_profile longpath nfs
exec_stack aa_policy_cache nnp stackonexec stackprofile
FAILED:
make: Leaving directory '/apparmor/tests/regression/apparmor'

Signed-off-by: Leesoo Ahn <lsahn@ooseel.net>
Signed-off-by: John Johansen <john.johansen@canonical.com>
2024-11-26 19:21:05 -08:00
Hongbo Li
c030937306 apparmor: Use IS_ERR_OR_NULL() helper function
Use the IS_ERR_OR_NULL() helper instead of open-coding a
NULL and an error pointer checks to simplify the code and
improve readability.

Signed-off-by: Hongbo Li <lihongbo22@huawei.com>
Signed-off-by: John Johansen <john.johansen@canonical.com>
2024-11-26 19:21:05 -08:00
John Johansen
9208c05f9f apparmor: add support for 2^24 states to the dfa state machine.
Currently the dfa state machine is limited by its default, next, and
check tables using u16. Allow loading of u32 tables, and if u16 tables
are loaded map them to u32.

The number of states allowed does not increase to 2^32 because the
base table uses the top 8 bits of its u32 for flags. Moving the flags
into a separate table allowing a full 2^32 bit range wil be done in
a separate patch.

Link: https://gitlab.com/apparmor/apparmor/-/issues/419
Signed-off-by: John Johansen <john.johansen@canonical.com>
2024-11-26 19:21:05 -08:00
Ryan Lee
db93ca15e5 apparmor: properly handle cx/px lookup failure for complain
mode profiles

When a cx/px lookup fails, apparmor would deny execution of the binary
even in complain mode (where it would audit as allowing execution while
actually denying it). Instead, in complain mode, create a new learning
profile, just as would have been done if the cx/px line wasn't there.

Signed-off-by: Ryan Lee <ryan.lee@canonical.com>
Signed-off-by: John Johansen <john.johansen@canonical.com>
2024-11-26 19:21:05 -08:00
Ryan Lee
17d0d04f3c apparmor: allocate xmatch for nullpdb inside aa_alloc_null
attach->xmatch was not set when allocating a null profile, which is used in
complain mode to allocate a learning profile. This was causing downstream
failures in find_attach, which expected a valid xmatch but did not find
one under a certain sequence of profile transitions in complain mode.

This patch ensures the xmatch is set up properly for null profiles.

Signed-off-by: Ryan Lee <ryan.lee@canonical.com>
Signed-off-by: John Johansen <john.johansen@canonical.com>
2024-11-26 19:21:04 -08:00
Linus Torvalds
f5f4745a7f - The series "resource: A couple of cleanups" from Andy Shevchenko
performs some cleanups in the resource management code.
 
 - The series "Improve the copy of task comm" from Yafang Shao addresses
   possible race-induced overflows in the management of task_struct.comm[].
 
 - The series "Remove unnecessary header includes from
   {tools/}lib/list_sort.c" from Kuan-Wei Chiu adds some cleanups and a
   small fix to the list_sort library code and to its selftest.
 
 - The series "Enhance min heap API with non-inline functions and
   optimizations" also from Kuan-Wei Chiu optimizes and cleans up the
   min_heap library code.
 
 - The series "nilfs2: Finish folio conversion" from Ryusuke Konishi
   finishes off nilfs2's folioification.
 
 - The series "add detect count for hung tasks" from Lance Yang adds more
   userspace visibility into the hung-task detector's activity.
 
 - Apart from that, singelton patches in many places - please see the
   individual changelogs for details.
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYIAB0WIQTTMBEPP41GrTpTJgfdBJ7gKXxAjgUCZ0L6lQAKCRDdBJ7gKXxA
 jmEIAPwMSglNPKRIOgzOvHh8MUJW1Dy8iKJ2kWCO3f6QTUIM2AEA+PazZbUd/g2m
 Ii8igH0UBibIgva7MrCyJedDI1O23AA=
 =8BIU
 -----END PGP SIGNATURE-----

Merge tag 'mm-nonmm-stable-2024-11-24-02-05' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm

Pull non-MM updates from Andrew Morton:

 - The series "resource: A couple of cleanups" from Andy Shevchenko
   performs some cleanups in the resource management code

 - The series "Improve the copy of task comm" from Yafang Shao addresses
   possible race-induced overflows in the management of
   task_struct.comm[]

 - The series "Remove unnecessary header includes from
   {tools/}lib/list_sort.c" from Kuan-Wei Chiu adds some cleanups and a
   small fix to the list_sort library code and to its selftest

 - The series "Enhance min heap API with non-inline functions and
   optimizations" also from Kuan-Wei Chiu optimizes and cleans up the
   min_heap library code

 - The series "nilfs2: Finish folio conversion" from Ryusuke Konishi
   finishes off nilfs2's folioification

 - The series "add detect count for hung tasks" from Lance Yang adds
   more userspace visibility into the hung-task detector's activity

 - Apart from that, singelton patches in many places - please see the
   individual changelogs for details

* tag 'mm-nonmm-stable-2024-11-24-02-05' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (71 commits)
  gdb: lx-symbols: do not error out on monolithic build
  kernel/reboot: replace sprintf() with sysfs_emit()
  lib: util_macros_kunit: add kunit test for util_macros.h
  util_macros.h: fix/rework find_closest() macros
  Improve consistency of '#error' directive messages
  ocfs2: fix uninitialized value in ocfs2_file_read_iter()
  hung_task: add docs for hung_task_detect_count
  hung_task: add detect count for hung tasks
  dma-buf: use atomic64_inc_return() in dma_buf_getfile()
  fs/proc/kcore.c: fix coccinelle reported ERROR instances
  resource: avoid unnecessary resource tree walking in __region_intersects()
  ocfs2: remove unused errmsg function and table
  ocfs2: cluster: fix a typo
  lib/scatterlist: use sg_phys() helper
  checkpatch: always parse orig_commit in fixes tag
  nilfs2: convert metadata aops from writepage to writepages
  nilfs2: convert nilfs_recovery_copy_block() to take a folio
  nilfs2: convert nilfs_page_count_clean_buffers() to take a folio
  nilfs2: remove nilfs_writepage
  nilfs2: convert checkpoint file to be folio-based
  ...
2024-11-25 16:09:48 -08:00
Linus Torvalds
2dde263d81 \n
-----BEGIN PGP SIGNATURE-----
 
 iQEzBAABCAAdFiEEq1nRK9aeMoq1VSgcnJ2qBz9kQNkFAmc/WikACgkQnJ2qBz9k
 QNnZdwf9FfT95zhnNWk3ohNOh5BO0P/uTY2fNkQBDPLPY3Bi8nywPIjXYCDSOgX1
 SBV0rakkWp+rVO1/qkg5J1mUvBoefzT7O17rG0LfRw3zjHPX+XeO+e3Xf/kPmJHJ
 3fvN//VTZQ6uPcn8PWgLe8VVQqNXD3nlUrwz/JKaxyodsdm0ERej4QZjG6Cikotk
 aKuDPAnOiS37/lIFZGdJRca/rwJPwMekNt1SxVrnmin0/QfB/Uubba2+NNdQ+z3W
 SCA/26PK822T3ELB8BkfwpdINC17WUwDJlkC8qha/JRzDlxJC/ysr43fHn/7Adfb
 CthG8V4JDGm51jcC0qe0Yk2HV75U4A==
 =htHs
 -----END PGP SIGNATURE-----

Merge tag 'fsnotify_for_v6.13-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs

Pull fsnotify updates from Jan Kara:
 "A couple of smaller random fsnotify fixes"

* tag 'fsnotify_for_v6.13-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs:
  fsnotify: Fix ordering of iput() and watched_objects decrement
  fsnotify: fix sending inotify event with unexpected filename
  fanotify: allow reporting errors on failure to open fd
  fsnotify, lsm: Decouple fsnotify from lsm
2024-11-21 09:55:45 -08:00
Linus Torvalds
02b2f1a7b8 This update includes the following changes:
API:
 
 - Add sig driver API.
 - Remove signing/verification from akcipher API.
 - Move crypto_simd_disabled_for_test to lib/crypto.
 - Add WARN_ON for return values from driver that indicates memory corruption.
 
 Algorithms:
 
 - Provide crc32-arch and crc32c-arch through Crypto API.
 - Optimise crc32c code size on x86.
 - Optimise crct10dif on arm/arm64.
 - Optimise p10-aes-gcm on powerpc.
 - Optimise aegis128 on x86.
 - Output full sample from test interface in jitter RNG.
 - Retry without padata when it fails in pcrypt.
 
 Drivers:
 
 - Add support for Airoha EN7581 TRNG.
 - Add support for STM32MP25x platforms in stm32.
 - Enable iproc-r200 RNG driver on BCMBCA.
 - Add Broadcom BCM74110 RNG driver.
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEn51F/lCuNhUwmDeSxycdCkmxi6cFAmc6sQsACgkQxycdCkmx
 i6dfHxAAnkI65TE6agZq9DlkEU4ZqOsxxdk0MsGIhbCUTxW3KENzu9vtKjnvg9T/
 Ou0d2J49ny87Y4zaA59Wf/Q1+gg5YSQR5kelonpfrPLkCkJjr72HZpyCHv8TTzEC
 uHHoVj9cnPIF5/yfiqQsrWT1ACip9vn+slyVPaMJV1qR6gnvnSALtsg4e/vKHkn7
 ZMaf2pZ2ROYXdB02nMK5KQcCrxD64MQle/yQepY44eYjnT+XclkqPdi6o1nUSpj/
 RFAeY0jFSTu0pj3DqT48TnU/LiiNLlFOZrGjCdEySoac63vmTtKqfYDmrRaFz4hB
 sucxbgJ3xnnYseRijtfXnxaD/IkDJln+ipGNQKAZLfOVMDCTxPdYGmOpobMTXMS+
 0sY0eAHgqr23P9pOp+sOzcAEFIqg6llAYQVWx3Zl4vpXBUuxzg6AqmHnPicnck7y
 Lw1cJhQxij2De3dG2ZL/0dgQxMjGN/YfCM8SSg6l+Xn3j4j47rqJNH2ZsmXtbJ2n
 kTkmemmWdgRR1IvgQQGsvyKs9ThkcEDW+IzW26SUv3Clvru2NSkX4ZPHbezZQf+D
 R0wMZsW3Fw7Zymerz1GIBSqdLnsyFWtIAjukDpOR6ordPgOBeDt76v6tw5vL2/II
 KYoeN1pdEEecwuhAsEvCryT5ZG4noBeNirf/ElWAfEybgcXiTks=
 =T8pa
 -----END PGP SIGNATURE-----

Merge tag 'v6.13-p1' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6

Pull crypto updates from Herbert Xu:
 "API:
   - Add sig driver API
   - Remove signing/verification from akcipher API
   - Move crypto_simd_disabled_for_test to lib/crypto
   - Add WARN_ON for return values from driver that indicates memory
     corruption

  Algorithms:
   - Provide crc32-arch and crc32c-arch through Crypto API
   - Optimise crc32c code size on x86
   - Optimise crct10dif on arm/arm64
   - Optimise p10-aes-gcm on powerpc
   - Optimise aegis128 on x86
   - Output full sample from test interface in jitter RNG
   - Retry without padata when it fails in pcrypt

  Drivers:
   - Add support for Airoha EN7581 TRNG
   - Add support for STM32MP25x platforms in stm32
   - Enable iproc-r200 RNG driver on BCMBCA
   - Add Broadcom BCM74110 RNG driver"

* tag 'v6.13-p1' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: (112 commits)
  crypto: marvell/cesa - fix uninit value for struct mv_cesa_op_ctx
  crypto: cavium - Fix an error handling path in cpt_ucode_load_fw()
  crypto: aesni - Move back to module_init
  crypto: lib/mpi - Export mpi_set_bit
  crypto: aes-gcm-p10 - Use the correct bit to test for P10
  hwrng: amd - remove reference to removed PPC_MAPLE config
  crypto: arm/crct10dif - Implement plain NEON variant
  crypto: arm/crct10dif - Macroify PMULL asm code
  crypto: arm/crct10dif - Use existing mov_l macro instead of __adrl
  crypto: arm64/crct10dif - Remove remaining 64x64 PMULL fallback code
  crypto: arm64/crct10dif - Use faster 16x64 bit polynomial multiply
  crypto: arm64/crct10dif - Remove obsolete chunking logic
  crypto: bcm - add error check in the ahash_hmac_init function
  crypto: caam - add error check to caam_rsa_set_priv_key_form
  hwrng: bcm74110 - Add Broadcom BCM74110 RNG driver
  dt-bindings: rng: add binding for BCM74110 RNG
  padata: Clean up in padata_do_multithreaded()
  crypto: inside-secure - Fix the return value of safexcel_xcbcmac_cra_init()
  crypto: qat - Fix missing destroy_workqueue in adf_init_aer()
  crypto: rsassa-pkcs1 - Reinstate support for legacy protocols
  ...
2024-11-19 10:28:41 -08:00