x86/vsyscall: Do not require X86_PF_INSTR to emulate vsyscall

emulate_vsyscall() expects to see X86_PF_INSTR in the page fault error
code (PFEC) on a vsyscall page fault, but the CPU does not report
X86_PF_INSTR if neither X86_FEATURE_NX nor X86_FEATURE_SMEP is enabled.

X86_FEATURE_NX should be enabled on nearly all 64-bit CPUs, except for
early P4 processors that did not support this feature.

Instead of explicitly checking for X86_PF_INSTR, compare the fault
address to RIP.

On machines with X86_FEATURE_NX enabled, issue a warning if RIP is equal
to the fault address but X86_PF_INSTR is absent.
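
The decision described above can be sketched as a standalone userspace
model. This is an illustration only, not the kernel code: the helper
names is_instruction_fetch_fault() and instr_bit_unexpectedly_missing()
are hypothetical, and the nx_enabled flag stands in for
cpu_feature_enabled(X86_FEATURE_NX). The error-code bit positions are
the architectural ones.

```c
#include <stdbool.h>
#include <stdint.h>

/* Architectural x86 page-fault error code bits. */
#define X86_PF_WRITE (1UL << 1)
#define X86_PF_USER  (1UL << 2)
#define X86_PF_INSTR (1UL << 4)

/*
 * Hypothetical helper: treat a fault as an instruction fetch when it is
 * a user-mode, non-write fault whose address equals the faulting RIP.
 * X86_PF_INSTR is deliberately not consulted here.
 */
static bool is_instruction_fetch_fault(unsigned long error_code,
				       uint64_t address, uint64_t ip)
{
	if ((error_code & (X86_PF_WRITE | X86_PF_USER)) != X86_PF_USER)
		return false;

	/* A data access faults at an address other than RIP. */
	return address == ip;
}

/*
 * Hypothetical helper: with NX enabled the CPU is expected to set
 * X86_PF_INSTR on instruction fetches, so a missing bit is suspicious.
 * Without NX the bit carries no information and is ignored.
 */
static bool instr_bit_unexpectedly_missing(unsigned long error_code,
					   bool nx_enabled)
{
	return nx_enabled && !(error_code & X86_PF_INSTR);
}
```

In this model, a fault with error code X86_PF_USER at an address equal
to RIP is classified as an instruction fetch whether or not X86_PF_INSTR
is set. The second helper is true only when NX is enabled and the bit is
missing, which is the condition the patch reports with a warning.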

[ dhansen: flesh out code comments ]

Originally-by: Dave Hansen <dave.hansen@intel.com>
Reported-by: Andrew Cooper <andrew.cooper3@citrix.com>
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Link: https://lore.kernel.org/all/bd81a98b-f8d4-4304-ac55-d4151a1a77ab@intel.com
Link: https://lore.kernel.org/all/20250624145918.2720487-1-kirill.shutemov%40linux.intel.com
Kirill A. Shutemov 2025-06-24 17:59:18 +03:00 committed by Dave Hansen
parent 8f5ae30d69
commit 8ba38a7a9a


@@ -124,7 +124,12 @@ bool emulate_vsyscall(unsigned long error_code,
 	if ((error_code & (X86_PF_WRITE | X86_PF_USER)) != X86_PF_USER)
 		return false;

-	if (!(error_code & X86_PF_INSTR)) {
+	/*
+	 * Assume that faults at regs->ip are because of an
+	 * instruction fetch. Return early and avoid
+	 * emulation for faults during data accesses:
+	 */
+	if (address != regs->ip) {
 		/* Failed vsyscall read */
 		if (vsyscall_mode == EMULATE)
 			return false;
@@ -136,13 +141,19 @@ bool emulate_vsyscall(unsigned long error_code,
 		return false;
 	}

+	/*
+	 * X86_PF_INSTR is only set when NX is supported. When
+	 * available, use it to double-check that the emulation code
+	 * is only being used for instruction fetches:
+	 */
+	if (cpu_feature_enabled(X86_FEATURE_NX))
+		WARN_ON_ONCE(!(error_code & X86_PF_INSTR));
+
 	/*
 	 * No point in checking CS -- the only way to get here is a user mode
 	 * trap to a high address, which means that we're in 64-bit user code.
 	 */

-	WARN_ON_ONCE(address != regs->ip);
-
 	if (vsyscall_mode == NONE) {
 		warn_bad_vsyscall(KERN_INFO, regs,
 				  "vsyscall attempted with vsyscall=none");