linux - Mirror of https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/

Age	Commit message (Collapse)	Author	Lines
2026-04-13	Merge tag 'kvm-s390-next-7.1-1' of ↵	Paolo Bonzini	-4/+0
	https://git.kernel.org/pub/scm/linux/kernel/git/kvms390/linux into HEAD - ESA nesting support - 4k memslots - LPSW/E fix
2026-04-13	Merge tag 'kvm-x86-nested-7.1' of https://github.com/kvm-x86/linux into HEAD	Paolo Bonzini	-7/+12
	KVM nested SVM changes for 7.1 (with one common x86 fix) - To minimize the probability of corrupting guest state, defer KVM's non-architectural delivery of exception payloads (e.g. CR2 and DR6) until consumption of the payload is imminent, and force delivery of the payload in all paths where userspace saves relevant state. - Use vcpu->arch.cr2 when updating vmcb12's CR2 on nested #VMEXIT to fix a bug where L2's CR2 can get corrupted after a save/restore, e.g. if the VM is migrated while L2 is faulting in memory. - Fix a class of nSVM bugs where some fields written by the CPU are not synchronized from vmcb02 to cached vmcb12 after VMRUN, and so are not up-to-date when saved by KVM_GET_NESTED_STATE. - Fix a class of bugs where the ordering between KVM_SET_NESTED_STATE and KVM_SET_{S}REGS could cause vmcb02 to be incorrectly initialized after save+restore. - Add a variety of missing nSVM consistency checks. - Fix several bugs where KVM failed to correctly update VMCB fields on nested #VMEXIT. - Fix several bugs where KVM failed to correctly synthesize #UD or #GP for SVM-related instructions. - Add support for save+restore of virtualized LBRs (on SVM). - Refactor various helpers and macros to improve clarity and (hopefully) make the code easier to maintain. - Aggressively sanitize fields when copying from vmcb12 to guard against unintentionally allowing L1 to utilize yet-to-be-defined features. - Fix several bugs where KVM botched rAX legality checks when emulating SVM instructions. Note, KVM is still flawed in that KVM doesn't address size prefix overrides for 64-bit guests; this should probably be documented as a KVM erratum. - Fail emulation of VMRUN/VMLOAD/VMSAVE if mapping vmcb12 fails instead of somewhat arbitrarily synthesizing #GP (i.e. don't bastardize AMD's already-sketchy behavior of generating #GP for "unsupported" addresses). - Cache all used vmcb12 fields to further harden against TOCTOU bugs.
2026-04-13	Merge tag 'kvm-x86-selftests-7.1' of https://github.com/kvm-x86/linux into HEAD	Paolo Bonzini	-0/+8
	KVM selftests changes for 7.1 - Add support for Hygon CPUs in KVM selftests. - Fix a bug in the MSR test where it would get false failures on AMD/Hygon CPUs with exactly one of RDPID or RDTSCP. - Add an MADV_COLLAPSE testcase for guest_memfd as a regression test for a bug where the kernel would attempt to collapse guest_memfd folios against KVM's will.
2026-04-13	Merge tag 'kvmarm-7.1' of ↵	Paolo Bonzini	-0/+150
	git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD KVM/arm64 updates for 7.1 * New features: - Add support for tracing in the standalone EL2 hypervisor code, which should help both debugging and performance analysis. This comes with a full infrastructure for 'remote' trace buffers that can be exposed by non-kernel entities such as firmware. - Add support for GICv5 Per Processor Interrupts (PPIs), as the starting point for supporting the new GIC architecture in KVM. - Finally add support for pKVM protected guests, with anonymous memory being used as a backing store. About time! * Improvements and bug fixes: - Rework the dreaded user_mem_abort() function to make it more maintainable, reducing the amount of state being exposed to the various helpers and rendering a substantial amount of state immutable. - Expand the Stage-2 page table dumper to support NV shadow page tables on a per-VM basis. - Tidy up the pKVM PSCI proxy code to be slightly less hard to follow. - Fix both SPE and TRBE in non-VHE configurations so that they do not generate spurious, out of context table walks that ultimately lead to very bad HW lockups. - A small set of patches fixing the Stage-2 MMU freeing in error cases. - Tighten-up accepted SMC immediate value to be only #0 for host SMCCC calls. - The usual cleanups and other selftest churn.
2026-04-13	Merge tag 'loongarch-kvm-7.1' of ↵	Paolo Bonzini	-0/+81
	git://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson into HEAD LoongArch KVM changes for v7.1 1. Use CSR_CRMD_PLV in kvm_arch_vcpu_in_kernel(). 2. Make vcpu_is_preempted() a macro, plus some enhancements. 3. Add in-kernel DMSINTC irqchip support. 4. Add KVM PMU test cases for tools/selftests.
2026-04-09	KVM: LoongArch: selftests: Add PMU overflow interrupt test	Song Gao	-0/+24
	Extend the PMU test suite to cover overflow interrupts. The test enables the PMI (Performance Monitor Interrupt), sets counter 0 to one less than the overflow value, and verifies that an interrupt is raised when the counter overflows. A guest interrupt handler checks the interrupt cause and disables further PMU interrupts upon success. Signed-off-by: Song Gao <gaosong@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2026-04-09	KVM: LoongArch: selftests: Add basic PMU event counting test	Song Gao	-0/+46
	Introduce a basic PMU test that verifies hardware event counting for four performance counters. The test enables the events for CPU cycles, instructions retired, branch instructions, and branch misses, runs a fixed number of loops, and checks that the counter values fall within expected ranges. It also validates that the host supports PMU and that the VM feature is enabled. Signed-off-by: Song Gao <gaosong@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2026-04-09	KVM: LoongArch: selftests: Add cpucfg read/write helpers	Song Gao	-0/+11
	Add helper macros and functions to read and write CPU configuration registers (cpucfg) from the guest and from the VMM. This interface is required in upcoming selftests for querying and setting CPU features, such as PMU capabilities. Signed-off-by: Song Gao <gaosong@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2026-04-07	KVM: selftests: Remove 1M alignment requirement for s390	Claudio Imbrenda	-4/+0
	Remove the 1M memslot alignment requirement for s390, since it is not needed anymore. Reviewed-by: Steffen Eiden <seiden@linux.ibm.com> Signed-off-by: Claudio Imbrenda <imbrenda@linux.ibm.com>
2026-03-27	RISC-V: KVM: selftests: Fix firmware counter read in sbi_pmu_test	Jiakai Xu	-0/+37
	The current sbi_pmu_test attempts to read firmware counters without configuring them first with SBI_EXT_PMU_COUNTER_CFG_MATCH. Previously this did not fail because KVM incorrectly allowed the read and accessed fw_event[] with an out-of-bounds index when the counter was unconfigured. After fixing that bug, the read now correctly returns SBI_ERR_INVALID_PARAM, causing the selftest to fail. Update the test to configure a firmware event before reading the counter. Also add a negative test to ensure that attempting to read an unconfigured firmware counter fails gracefully. Signed-off-by: Jiakai Xu <xujiakai2025@iscas.ac.cn> Signed-off-by: Jiakai Xu <jiakaiPeanut@gmail.com> Reviewed-by: Andrew Jones <andrew.jones@oss.qualcomm.com> Reviewed-by: Nutty Liu <nutty.liu@hotmail.com> Link: https://lore.kernel.org/r/20260316014533.2312254-3-xujiakai2025@iscas.ac.cn Signed-off-by: Anup Patel <anup@brainfault.org>
2026-03-26	RISC-V: KVM: selftests: Add RISC-V SBI STA shmem alignment tests	Jiakai Xu	-0/+2
	Add RISC-V KVM selftests to verify the SBI Steal-Time Accounting (STA) shared memory alignment requirements. The SBI specification requires the STA shared memory GPA to be 64-byte aligned, or set to all-ones to explicitly disable steal-time accounting. This test verifies that KVM enforces the expected behavior when configuring the SBI STA shared memory via KVM_SET_ONE_REG. Specifically, the test checks that: - misaligned GPAs are rejected with -EINVAL - 64-byte aligned GPAs are accepted - all-ones GPA is accepted Signed-off-by: Jiakai Xu <xujiakai2025@iscas.ac.cn> Signed-off-by: Jiakai Xu <jiakaiPeanut@gmail.com> Reviewed-by: Andrew Jones <andrew.jones@oss.qualcomm.com> Link: https://lore.kernel.org/r/20260303010859.1763177-4-xujiakai2025@iscas.ac.cn Signed-off-by: Anup Patel <anup@brainfault.org>
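	The acceptance rule described above can be sketched in a few lines of C. This is only an illustration of the rule as stated in the commit message (64-byte aligned, or all-ones to disable), assuming the GPA is handled as one 64-bit value; the names `STA_SHMEM_ALIGN`, `STA_SHMEM_DISABLE`, and `sta_shmem_gpa_is_valid()` are hypothetical and are not taken from KVM or the selftest.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical names for illustration; not KVM's identifiers. */
#define STA_SHMEM_ALIGN   UINT64_C(64)   /* required alignment in bytes */
#define STA_SHMEM_DISABLE UINT64_MAX     /* all-ones: steal-time accounting disabled */

/*
 * Return true if a shared-memory GPA would be acceptable under the rule
 * quoted in the commit message: all-ones, or aligned to 64 bytes.
 */
static bool sta_shmem_gpa_is_valid(uint64_t gpa)
{
	if (gpa == STA_SHMEM_DISABLE)
		return true;

	return (gpa & (STA_SHMEM_ALIGN - 1)) == 0;
}
```

	Under this sketch, a misaligned GPA such as 0x1004 is the case the test expects KVM to reject with -EINVAL, while 0x1000 and the all-ones value are accepted.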
2026-03-19	KVM: arm64: selftests: Introduce a minimal GICv5 PPI selftest	Sascha Bischoff	-0/+150
	This basic selftest creates a vgic_v5 device (if supported), and tests that one of the PPI interrupts works as expected with a basic single-vCPU guest. Upon starting, the guest enables interrupts. That means that it is initialising all PPIs to have reasonable priorities, but marking them as disabled. Then the priority mask in the ICC_PCR_EL1 is set, and interrupts are enabled in ICC_CR0_EL1. At this stage the guest is able to receive interrupts. The architected SW_PPI (64) is enabled and KVM_IRQ_LINE ioctl is used to inject the state into the guest. The guest's interrupt handler has an explicit WFI in order to ensure that the guest skips WFI when there are pending and enabled PPI interrupts. Signed-off-by: Sascha Bischoff <sascha.bischoff@arm.com> Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com> Link: https://patch.msgid.link/20260319154937.3619520-41-sascha.bischoff@arm.com Signed-off-by: Marc Zyngier <maz@kernel.org>
2026-03-12	KVM: selftests: Verify SEV+ guests can read and write EFER, CR0, CR4, and CR8	Sean Christopherson	-0/+23
	Add "do no harm" testing of EFER, CR0, CR4, and CR8 for SEV+ guests to verify that the guest can read and write the registers, without hitting e.g. a #VC on SEV-ES guests due to KVM incorrectly trying to intercept a register. Signed-off-by: Sean Christopherson <seanjc@google.com> Message-ID: <20260310211841.2552361-3-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2026-03-11	selftests: kvm: extract common functionality out of smm_test.c	Paolo Bonzini	-0/+17
	Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2026-03-04	KVM: SVM: Rename vmcb->virt_ext to vmcb->misc_ctl2	Yosry Ahmed	-4/+4
	'virt' is confusing in the VMCB because it is relative and ambiguous. The 'virt_ext' field includes bits for LBR virtualization and VMSAVE/VMLOAD virtualization, so it's just another miscellaneous control field. Name it as such. While at it, move the definitions of the bits below those for 'misc_ctl' and rename them for consistency. Signed-off-by: Yosry Ahmed <yosry@kernel.org> Link: https://patch.msgid.link/20260303003421.2185681-20-yosry@kernel.org Signed-off-by: Sean Christopherson <seanjc@google.com>
2026-03-04	KVM: SVM: Rename vmcb->nested_ctl to vmcb->misc_ctl	Sean Christopherson	-3/+3
	The 'nested_ctl' field is misnamed. Although the first bit is for nested paging, the other defined bits are for SEV/SEV-ES. Other bits in the same field according to the APM (but not defined by KVM) include "Guest Mode Execution Trap", "Enable INVLPGB/TLBSYNC", and other control bits unrelated to 'nested'. There is nothing common among these bits, so just name the field misc_ctl. Also rename the flags accordingly. Signed-off-by: Yosry Ahmed <yosry@kernel.org> Link: https://patch.msgid.link/20260303003421.2185681-19-yosry@kernel.org Signed-off-by: Sean Christopherson <seanjc@google.com>
2026-03-04	KVM: selftests: Add a test for LBR save/restore (ft. nested)	Yosry Ahmed	-0/+5
	Add a selftest exercising save/restore with usage of LBRs in both L1 and L2, and making sure all LBRs remain intact. Signed-off-by: Yosry Ahmed <yosry@kernel.org> Link: https://patch.msgid.link/20260303003421.2185681-5-yosry@kernel.org Signed-off-by: Sean Christopherson <seanjc@google.com>
2026-03-04	KVM: selftests: Wrap madvise() to assert success	Ackerley Tng	-0/+1
	Extend kvm_syscalls.h to wrap madvise() to assert success. This will be used in the next patch. Signed-off-by: Ackerley Tng <ackerleytng@google.com> Reviewed-by: David Hildenbrand (Arm) <david@kernel.org> Link: https://patch.msgid.link/455483ca29a3a3042efee0cf3bbd0e2548cbeb1c.1771630983.git.ackerleytng@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>
2026-03-02	KVM: selftests: Add a flag to identify AMD compatible test cases	Zhiquan Li	-0/+1
	Most, but not all, KVM x86 selftests written for AMD are compatible with the Hygon architecture. Add a flag "host_cpu_is_amd_compatible" to identify these cases. This fixes the following test failures on Hygon platforms: * The hypercall test: Hygon also uses VMMCALL as the guest hypercall instruction. * Failures caused by accessing reserved memory address regions: - access_tracking_perf_test - demand_paging_test - dirty_log_perf_test - dirty_log_test - kvm_page_table_test - memslot_modification_stress_test - pre_fault_memory_test - x86/dirty_log_page_splitting_test Hygon CSV also applies the "physical address space width reduction", and the number of reduced physical address bits is likewise reported in bits 11:6 of CPUID[0x8000001f].EBX, so the existing logic applies to Hygon processors unchanged. Mapping memory into these regions and accessing it results in a #PF. Signed-off-by: Zhiquan Li <zhiquan_li@163.com> Link: https://patch.msgid.link/20260212103841.171459-3-zhiquan_li@163.com Signed-off-by: Sean Christopherson <seanjc@google.com>
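	As a rough illustration of the bit field mentioned above, the following C sketch extracts bits 11:6 from a CPUID[0x8000001f].EBX value and subtracts them from a physical address width. The helper names and the idea of passing the raw EBX value and address width as plain integers are assumptions made for this example; the selftest library's actual helpers may look different.

```c
#include <stdint.h>

/*
 * Hypothetical helper: number of physical address bits lost to memory
 * encryption, taken from bits 11:6 of CPUID[0x8000001f].EBX.
 */
static unsigned int phys_addr_reduction_bits(uint32_t ebx)
{
	return (ebx >> 6) & 0x3f;
}

/*
 * Hypothetical helper: physical address bits still usable for guest
 * mappings, given the advertised width and the raw EBX value.
 */
static unsigned int usable_phys_addr_bits(unsigned int max_phys_bits, uint32_t ebx)
{
	return max_phys_bits - phys_addr_reduction_bits(ebx);
}
```

	For instance, an EBX value of 0x16f encodes a reduction of 5 bits in bits 11:6, so a 48-bit physical address space would leave 43 usable bits under this sketch.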
2026-03-02	KVM: selftests: Add CPU vendor detection for Hygon	Zhiquan Li	-0/+6
	Currently some KVM selftests fail on Hygon CPUs due to missing vendor detection and edge-case handling specific to Hygon's architecture. Add CPU vendor detection for Hygon and add a global variable "host_cpu_is_hygon" as the basis for the following fixes. Signed-off-by: Zhiquan Li <zhiquan_li@163.com> Link: https://patch.msgid.link/20260212103841.171459-2-zhiquan_li@163.com Signed-off-by: Sean Christopherson <seanjc@google.com>
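	Vendor detection of this kind is usually based on the 12-byte vendor string that CPUID leaf 0 returns in EBX, EDX, and ECX (in that order). The sketch below shows the general idea for the "HygonGenuine" string; it takes the register values as arguments instead of executing CPUID, and the function names are hypothetical, not the selftest library's.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/*
 * Assemble the CPUID leaf 0 vendor string. The string is stored in
 * EBX, EDX, ECX order, least significant byte first.
 */
static void cpuid_vendor_string(uint32_t ebx, uint32_t ecx, uint32_t edx,
				char out[13])
{
	const uint32_t regs[3] = { ebx, edx, ecx };

	for (int i = 0; i < 3; i++)
		for (int j = 0; j < 4; j++)
			out[i * 4 + j] = (char)((regs[i] >> (8 * j)) & 0xff);
	out[12] = '\0';
}

/* Hypothetical check; mirrors the intent of a "host_cpu_is_hygon" flag. */
static bool vendor_is_hygon(uint32_t ebx, uint32_t ecx, uint32_t edx)
{
	char vendor[13];

	cpuid_vendor_string(ebx, ecx, edx, vendor);
	return strcmp(vendor, "HygonGenuine") == 0;
}
```

	Taking the registers as parameters keeps the comparison logic testable on any host; a real implementation would feed it the output of the CPUID instruction.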
2026-02-11	Merge tag 'kvm-x86-apic-6.20' of https://github.com/kvm-x86/linux into HEAD	Paolo Bonzini	-0/+4
	KVM x86 APIC-ish changes for 6.20 - Fix a benign bug where KVM could use the wrong memslots (ignored SMM) when creating a vCPU-specific mapping of guest memory. - Clean up KVM's handling of marking mapped vCPU pages dirty. - Drop a pile of ancient sanity checks hidden behind KVM's unused ASSERT() macro, most of which could be trivially triggered by the guest and/or user, and all of which were useless. - Fold "struct dest_map" into its sole user, "struct rtc_status", to make it more obvious what the weird parameter is used for, and to allow burying the RTC shenanigans behind CONFIG_KVM_IOAPIC=y. - Bury all of ioapic.h and KVM_IRQCHIP_KERNEL behind CONFIG_KVM_IOAPIC=y. - Add a regression test for recent APICv update fixes. - Rework KVM's handling of VMCS updates while L2 is active to temporarily switch to vmcs01 instead of deferring the update until the next nested VM-Exit. The deferred updates approach directly contributed to several bugs, was proving to be a maintenance burden due to the difficulty in auditing the correctness of deferred updates, and was polluting "struct nested_vmx" with a growing pile of booleans. - Handle "hardware APIC ISR", a.k.a. SVI, updates in kvm_apic_update_apicv() to consolidate the updates, and to co-locate SVI updates with the updates for KVM's own cache of ISR information. - Drop a dead function declaration.
2026-02-11	Merge tag 'kvm-riscv-6.20-1' of https://github.com/kvm-riscv/linux into HEAD	Paolo Bonzini	-3/+16
	KVM/riscv changes for 6.20 - Fixes for issues discovered by KVM API fuzzing in kvm_riscv_aia_imsic_has_attr(), kvm_riscv_aia_imsic_rw_attr(), and kvm_riscv_vcpu_aia_imsic_update() - Allow Zalasr, Zilsd and Zclsd extensions for Guest/VM - Add riscv vm satp modes in KVM selftests - Transparent huge page support for G-stage - Adjust the number of available guest irq files based on MMIO register sizes in DeviceTree or ACPI
2026-02-09	Merge tag 'kvm-x86-svm-6.20' of https://github.com/kvm-x86/linux into HEAD	Paolo Bonzini	-2/+1
	KVM SVM changes for 6.20 - Drop a user-triggerable WARN on nested_svm_load_cr3() failure. - Add support for virtualizing ERAPS. Note, correct virtualization of ERAPS relies on an upcoming, publicly announced change in the APM to reduce the set of conditions where hardware (i.e. KVM) must flush the RAP. - Ignore nSVM intercepts for instructions that are not supported according to L1's virtual CPU model. - Add support for expedited writes to the fast MMIO bus, a la VMX's fastpath for EPT Misconfig. - Don't set GIF when clearing EFER.SVME, as GIF exists independently of SVM, and allow userspace to restore nested state with GIF=0. - Treat exit_code as an unsigned 64-bit value through all of KVM. - Add support for fetching SNP certificates from userspace. - Fix a bug where KVM would use vmcb02 instead of vmcb01 when emulating VMLOAD or VMSAVE on behalf of L2. - Misc fixes and cleanups.
2026-02-09	Merge tag 'kvm-x86-selftests-6.20' of https://github.com/kvm-x86/linux into HEAD	Paolo Bonzini	-32/+106
	KVM selftests changes for 6.20 - Add a regression test for TPR<=>CR8 synchronization and IRQ masking. - Overhaul selftest's MMU infrastructure to genericize stage-2 MMU support, and extend x86's infrastructure to support EPT and NPT (for L2 guests). - Extend several nested VMX tests to also cover nested SVM. - Add a selftest for nested VMLOAD/VMSAVE. - Rework the nested dirty log test, originally added as a regression test for PML where KVM logged L2 GPAs instead of L1 GPAs, to improve test coverage and to hopefully make the test easier to understand and maintain.
2026-02-06	KVM: riscv: selftests: Add riscv vm satp modes	Wu Fei	-3/+16
	The current VM modes cannot represent riscv guest modes precisely, so add all 9 combinations of P(56,40,41) x V(57,48,39). Also detect the default VM mode at runtime instead of hardcoding one, which might not be supported on a specific machine. Signed-off-by: Wu Fei <wu.fei9@sanechips.com.cn> Reviewed-by: Andrew Jones <ajones@ventanamicro.com> Reviewed-by: Nutty Liu <nutty.liu@hotmail.com> Reviewed-by: Anup Patel <anup@brainfault.org> Link: https://lore.kernel.org/r/20251105151442.28767-1-wu.fei9@sanechips.com.cn Signed-off-by: Anup Patel <anup@brainfault.org>
2026-01-16	KVM: selftests: Test READ=>WRITE dirty logging behavior for shadow MMU	Sean Christopherson	-0/+1
	Update the nested dirty log test to validate KVM's handling of READ faults when dirty logging is enabled. Specifically, set the Dirty bit in the guest PTEs used to map L2 GPAs, so that KVM will create writable SPTEs when handling L2 read faults. When handling read faults in the shadow MMU, KVM opportunistically creates a writable SPTE if the mapping can be writable and the gPTE is dirty (or doesn't support the Dirty bit), i.e. if KVM doesn't need to intercept writes in order to emulate Dirty-bit updates. To actually test the L2 READ=>WRITE sequence, e.g. without masking a false pass by other test activity, route the READ=>WRITE and WRITE=>WRITE sequences to separate L1 pages, and differentiate between "marked dirty due to a WRITE access/fault" and "marked dirty due to creating a writable SPTE for a READ access/fault". The updated sequence exposes the bug fixed by KVM commit 1f4e5fc83a42 ("KVM: x86: fix nested guest live migration with PML") when the guest performs a READ=>WRITE sequence with dirty guest PTEs. Opportunistically tweak and rename the address macros, and add comments, to make it more obvious what the test is doing. E.g. NESTED_TEST_MEM1 vs. GUEST_TEST_MEM doesn't make it all that obvious that the test is creating aliases in both the L2 GPA and GVA address spaces, but only when L1 is using TDP to run L2. Cc: Yosry Ahmed <yosry.ahmed@linux.dev> Reviewed-by: Yosry Ahmed <yosry.ahmed@linux.dev> Link: https://patch.msgid.link/20260115172154.709024-1-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>
2026-01-15	KVM: selftests: Fix typos and stale comments in kvm_util	Fuad Tabba	-2/+2
	Fix minor documentation errors in `kvm_util.h` and `kvm_util.c`. - Correct the argument description for `vcpu_args_set` in `kvm_util.h`, which incorrectly listed `vm` instead of `vcpu`. - Fix a typo in the comment for `kvm_selftest_arch_init` ("exeucting" -> "executing"). - Correct the return value description for `vm_vaddr_unused_gap` in `kvm_util.c` to match the implementation, which returns an address "at or above" `vaddr_min`, not "at or below". No functional change intended. Reviewed-by: Andrew Jones <andrew.jones@linux.dev> Signed-off-by: Fuad Tabba <tabba@google.com> Link: https://patch.msgid.link/20260109082218.3236580-6-tabba@google.com Signed-off-by: Marc Zyngier <maz@kernel.org>
2026-01-15	KVM: selftests: Move page_align() to shared header	Fuad Tabba	-0/+5
	To avoid code duplication, move page_align() to the shared `kvm_util.h` header file. Rename it to vm_page_align(), to make it clear that the alignment is done with respect to the guest's base page size. No functional change intended. Reviewed-by: Andrew Jones <andrew.jones@linux.dev> Signed-off-by: Fuad Tabba <tabba@google.com> Link: https://patch.msgid.link/20260109082218.3236580-5-tabba@google.com Signed-off-by: Marc Zyngier <maz@kernel.org>
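	For readers unfamiliar with this kind of helper, a page-alignment function typically rounds a value to a page boundary with a mask. The sketch below assumes round-up semantics and a power-of-two page size, neither of which is stated in the commit message; `page_align_up()` is a hypothetical stand-in and not the actual `vm_page_align()` implementation, which takes the page size from the VM.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Round 'v' up to the next multiple of 'page_size'.
 * Assumption: 'page_size' is a non-zero power of two (e.g. 4096 or 65536).
 */
static uint64_t page_align_up(uint64_t v, uint64_t page_size)
{
	assert(page_size != 0 && (page_size & (page_size - 1)) == 0);

	return (v + page_size - 1) & ~(page_size - 1);
}
```

	Passing the page size explicitly reflects the point of the rename: the result depends on the guest's base page size, so the same address can align differently for a 4K guest and a 64K guest.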
2026-01-15	KVM: arm64: selftests: Disable unused TTBR1_EL1 translations	Fuad Tabba	-0/+4
	KVM selftests map all guest code and data into the lower virtual address range (0x0000...) managed by TTBR0_EL1. The upper range (0xFFFF...) managed by TTBR1_EL1 is unused and uninitialized. If a guest accesses the upper range, the MMU attempts a translation table walk using uninitialized registers, leading to unpredictable behavior. Set `TCR_EL1.EPD1` to disable translation table walks for TTBR1_EL1, ensuring that any access to the upper range generates an immediate Translation Fault. Additionally, set `TCR_EL1.TBI1` (Top Byte Ignore) to ensure that tagged pointers in the upper range also deterministically trigger a Translation Fault via EPD1. Define `TCR_EPD1_MASK`, `TCR_EPD1_SHIFT`, and `TCR_TBI1` in `processor.h` to support this configuration. These are based on their definitions in `arch/arm64/include/asm/pgtable-hwdef.h`. Suggested-by: Will Deacon <will@kernel.org> Reviewed-by: Itaru Kitayama <itaru.kitayama@fujitsu.com> Signed-off-by: Fuad Tabba <tabba@google.com> Link: https://patch.msgid.link/20260109082218.3236580-2-tabba@google.com Signed-off-by: Marc Zyngier <maz@kernel.org>
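	The register change described above amounts to setting two bits in the TCR_EL1 value. A minimal sketch, using the macro names from the commit message with the architectural bit positions (EPD1 is bit 23, TBI1 is bit 38); the helper `tcr_disable_ttbr1_walks()` is hypothetical and only shows the bit manipulation, not how the selftest library programs the register.

```c
#include <stdint.h>

/* Bit positions per the Arm architecture; names follow the commit message. */
#define TCR_EPD1_SHIFT	23
#define TCR_EPD1_MASK	(UINT64_C(1) << TCR_EPD1_SHIFT)
#define TCR_TBI1	(UINT64_C(1) << 38)

/*
 * Hypothetical helper: return a TCR_EL1 value with TTBR1_EL1 table walks
 * disabled (EPD1) and top-byte-ignore enabled for the upper range (TBI1).
 * All other bits are left untouched.
 */
static uint64_t tcr_disable_ttbr1_walks(uint64_t tcr)
{
	return tcr | TCR_EPD1_MASK | TCR_TBI1;
}
```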
2026-01-14	KVM: selftests: Add a selftests for nested VMLOAD/VMSAVE	Yosry Ahmed	-0/+1
	Add a test for VMLOAD/VMSAVE in an L2 guest. The test verifies that L1 intercepts for VMSAVE/VMLOAD always work regardless of VIRTUAL_VMLOAD_VMSAVE_ENABLE_MASK. Then, more interestingly, it makes sure that when L1 does not intercept VMLOAD/VMSAVE, they work as intended in L2. When VIRTUAL_VMLOAD_VMSAVE_ENABLE_MASK is enabled by L1, VMSAVE/VMLOAD from L2 should interpret the GPA as an L2 GPA and translate it through the NPT. When VIRTUAL_VMLOAD_VMSAVE_ENABLE_MASK is disabled by L1, VMSAVE/VMLOAD from L2 should interpret the GPA as an L1 GPA. To test this, put two VMCBs (0 and 1) in L1's physical address space, and have a single L2 GPA where: - L2 VMCB GPA == L1 VMCB(0) GPA - L2 VMCB GPA maps to L1 VMCB(1) via the NPT in L1. This setup allows detecting how the GPA is interpreted based on which L1 VMCB is actually accessed. In both cases, L2 sets KERNEL_GS_BASE (one of the fields handled by VMSAVE/VMLOAD), and executes VMSAVE to write its value to the VMCB. The test userspace code then checks that the write was made to the correct VMCB (based on whether VIRTUAL_VMLOAD_VMSAVE_ENABLE_MASK is set by L1), and writes a new value to that VMCB. L2 then executes VMLOAD to load the new value and makes sure it's reflected correctly in KERNEL_GS_BASE. Signed-off-by: Yosry Ahmed <yosry.ahmed@linux.dev> Link: https://patch.msgid.link/20260110004821.3411245-4-yosry.ahmed@linux.dev Signed-off-by: Sean Christopherson <seanjc@google.com>
2026-01-13	KVM: SVM: Treat exit_code as an unsigned 64-bit value through all of KVM	Sean Christopherson	-2/+1
	Fix KVM's long-standing buggy handling of SVM's exit_code as a 32-bit value. Per the APM and Xen commit d1bd157fbc ("Big merge the HVM full-virtualisation abstractions.") (which is arguably more trustworthy than KVM), offset 0x70 is a single 64-bit value: 070h 63:0 EXITCODE Track exit_code as a single u64 to prevent reintroducing bugs where KVM neglects to correctly set bits 63:32. Fixes: 6aa8b732ca01 ("[PATCH] kvm: userspace interface") Cc: Jim Mattson <jmattson@google.com> Cc: Yosry Ahmed <yosry.ahmed@linux.dev> Reviewed-by: Yosry Ahmed <yosry.ahmed@linux.dev> Link: https://patch.msgid.link/20251230211347.4099600-6-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>
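	The class of bug described above can be illustrated with plain integer arithmetic. The sketch below assumes the invalid-exit code is -1 (all 64 bits set when viewed as a 64-bit field) and uses hypothetical names; it does not reproduce KVM's structures, only the difference between tracking 32 bits and tracking the full 64-bit value.

```c
#include <stdint.h>

/* Assumption for this sketch: the "invalid" exit code is -1 as a 64-bit value. */
#define EXIT_CODE_INVALID UINT64_MAX

/*
 * Buggy pattern: the exit code is carried as a 32-bit value, so widening
 * it leaves bits 63:32 zero.
 */
static uint64_t exit_code_from_u32(uint32_t lo)
{
	return (uint64_t)lo;
}

/* Split-field pattern: correct only if the caller remembers the high half. */
static uint64_t exit_code_from_halves(uint32_t lo, uint32_t hi)
{
	return ((uint64_t)hi << 32) | lo;
}
```

	With a 32-bit -1, `exit_code_from_u32()` yields 0x00000000ffffffff, which does not compare equal to the 64-bit invalid code; tracking a single u64 removes the chance of forgetting the upper half.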
2026-01-13	KVM: selftests: Add a test to verify APICv updates (while L2 is active)	Sean Christopherson	-0/+4
	Add a test to verify KVM correctly handles a variety of edge cases related to APICv updates, and in particular updates that are triggered while L2 is actively running. Reviewed-by: Chao Gao <chao.gao@intel.com> Link: https://patch.msgid.link/20260109034532.1012993-2-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>
2026-01-08	KVM: selftests: Rename vm_get_page_table_entry() to vm_get_pte()	Sean Christopherson	-1/+1
	Shorten the API to get a PTE as the "PTE" acronym is ubiquitous, and the "page table entry" makes it unnecessarily difficult to quickly understand what callers are doing. No functional change intended. Reviewed-by: Yosry Ahmed <yosry.ahmed@linux.dev> Link: https://patch.msgid.link/20251230230150.4150236-21-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>
2026-01-08	KVM: selftests: Set the user bit on nested NPT PTEs	Yosry Ahmed	-0/+3
	According to the APM, NPT walks are treated as user accesses. In preparation for supporting NPT mappings, set the 'user' bit on NPTs by adding a mask of bits to always be set on PTEs in kvm_mmu. Signed-off-by: Yosry Ahmed <yosry.ahmed@linux.dev> Link: https://patch.msgid.link/20251230230150.4150236-18-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>
2026-01-08	KVM: selftests: Add support for nested NPTs	Yosry Ahmed	-0/+11
	Implement nCR3 and NPT initialization functions, similar to the EPT equivalents, and create common TDP helpers for enablement checking and initialization. Enable NPT for nested guests by default if the TDP MMU was initialized, similar to VMX. Reuse the PTE masks from the main MMU in the NPT MMU, except for the C and S bits related to confidential VMs. Signed-off-by: Yosry Ahmed <yosry.ahmed@linux.dev> Link: https://patch.msgid.link/20251230230150.4150236-17-seanjc@google.com [sean: apply Yosry's fixup for ncr3_gpa] Signed-off-by: Sean Christopherson <seanjc@google.com>
2026-01-08	KVM: selftests: Move TDP mapping functions outside of vmx.c	Sean Christopherson	-3/+4
	Now that the functions are no longer VMX-specific, move them to processor.c. Do a minor comment tweak replacing 'EPT' with 'TDP'. No functional change intended. Signed-off-by: Yosry Ahmed <yosry.ahmed@linux.dev> Link: https://patch.msgid.link/20251230230150.4150236-15-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>
2026-01-08	KVM: selftests: Reuse virt mapping functions for nested EPTs	Yosry Ahmed	-4/+16
	Rework tdp_map() and friends to use __virt_pg_map() and drop the custom EPT code in __tdp_pg_map() and tdp_create_pte(). The EPT code and __virt_pg_map() are practically identical, the main differences are: - EPT uses the EPT struct overlay instead of the PTE masks. - EPT always assumes 4-level EPTs. To reuse __virt_pg_map(), extend the PTE masks to work with EPT's RWX and X-only capabilities, and provide a tdp_mmu_init() API so that EPT can pass in the EPT PTE masks along with the root page level (which is currently hardcoded to '4'). Don't reuse KVM's insane overloading of the USER bit for EPT_R as there's no reason to multiplex bits in the selftests, e.g. selftests aren't trying to shadow guest PTEs and thus don't care about funnelling protections into a common permissions check. Another benefit of reusing the code is having separate handling for upper-level PTEs vs 4K PTEs, which avoids some quirks like setting the large bit on a 4K PTE in the EPTs. For all intents and purposes, no functional change intended. Suggested-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Yosry Ahmed <yosry.ahmed@linux.dev> Co-developed-by: Sean Christopherson <seanjc@google.com> Link: https://patch.msgid.link/20251230230150.4150236-14-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>
2026-01-08	KVM: selftests: Add a stage-2 MMU instance to kvm_vm	Sean Christopherson	-0/+5
	Add a stage-2 MMU instance so that architectures that support nested virtualization (more specifically, nested stage-2 page tables) can create and track stage-2 page tables for running L2 guests. Plumb the structure into common code to avoid cyclical dependencies, and to provide some line of sight to having common APIs for creating stage-2 mappings. As a bonus, putting the member in common code justifies using stage2_mmu instead of tdp_mmu for x86. Reviewed-by: Yosry Ahmed <yosry.ahmed@linux.dev> Link: https://patch.msgid.link/20251230230150.4150236-13-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>
2026-01-08	KVM: selftests: Stop passing VMX metadata to TDP mapping functions	Yosry Ahmed	-8/+3
	The root GPA is now retrieved from the nested MMU, stop passing VMX metadata. This is in preparation for making these functions work for NPTs as well. Opportunistically drop tdp_pg_map() since it's unused. No functional change intended. Signed-off-by: Yosry Ahmed <yosry.ahmed@linux.dev> Link: https://patch.msgid.link/20251230230150.4150236-12-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>
2026-01-08	KVM: selftests: Use a TDP MMU to share EPT page tables between vCPUs	Yosry Ahmed	-5/+10
	prepare_eptp() currently allocates new EPTs for each vCPU. memstress has its own hack to share the EPTs between vCPUs. Currently, there is no reason to have separate EPTs for each vCPU, and the complexity is significant. The only reason it doesn't matter now is because memstress is the only user with multiple vCPUs. Add vm_enable_ept() to allocate EPT page tables for an entire VM, and use it everywhere to replace prepare_eptp(). Drop 'eptp' and 'eptp_hva' from 'struct vmx_pages' as they serve no purpose (e.g. the EPTP can be built from the PGD), but keep 'eptp_gpa' so that the MMU structure doesn't need to be passed in along with vmx_pages. Dynamically allocate the TDP MMU structure to avoid a cyclical dependency between kvm_util_arch.h and kvm_util.h. Remove the workaround in memstress to copy the EPT root between vCPUs since that's now the default behavior. Name the MMU tdp_mmu instead of e.g. nested_mmu or nested.mmu to avoid recreating the same mess that KVM has with respect to "nested" MMUs, e.g. does nested refer to the stage-2 page tables created by L1, or the stage-1 page tables created by L2? Signed-off-by: Yosry Ahmed <yosry.ahmed@linux.dev> Co-developed-by: Sean Christopherson <seanjc@google.com> Link: https://patch.msgid.link/20251230230150.4150236-11-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>
2026-01-08	KVM: selftests: Move PTE bitmasks to kvm_mmu	Yosry Ahmed	-11/+33
	Move the PTE bitmasks into kvm_mmu to parameterize them for virt mapping functions. Introduce helpers to read/write different PTE bits given a kvm_mmu. Drop the 'global' bit definition as it's currently unused, but leave the 'user' bit as it will be used in coming changes. Opportunistically rename 'large' to 'huge' as it's more consistent with the kernel naming. Leave PHYSICAL_PAGE_MASK alone, it's fixed in all page table formats and a lot of other macros depend on it. It's tempting to move all the other macros to be per-struct instead, but it would be too much noise for little benefit. Keep c_bit and s_bit in vm->arch as they are used before the MMU is initialized, through __vmcreate() -> vm_userspace_mem_region_add() -> vm_mem_add() -> vm_arch_has_protected_memory(). No functional change intended. Signed-off-by: Yosry Ahmed <yosry.ahmed@linux.dev> [sean: rename accessors to is_<adjective>_pte()] Link: https://patch.msgid.link/20251230230150.4150236-10-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>
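	To make the idea of per-MMU PTE masks concrete, here is a small sketch of mask-driven accessors in the is_<adjective>_pte() style mentioned above. The struct layout, field names, and the two mask sets are assumptions for illustration only; they follow the architectural bit positions (x86 Present bit 0 and PS bit 7; EPT R/W/X bits 2:0 and bit 7 for large pages) but are not copied from the selftest headers.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-MMU mask set; the real kvm_mmu layout may differ. */
struct pte_masks {
	uint64_t present;	/* any of these bits set => entry is present */
	uint64_t writable;
	uint64_t huge;
};

struct mmu_sketch {
	int root_level;
	struct pte_masks masks;
};

static bool is_present_pte(const struct mmu_sketch *mmu, uint64_t pte)
{
	return (pte & mmu->masks.present) != 0;
}

static bool is_huge_pte(const struct mmu_sketch *mmu, uint64_t pte)
{
	return (pte & mmu->masks.huge) != 0;
}

/* Stage-1 x86 page tables: Present is bit 0, PS (huge) is bit 7. */
static const struct mmu_sketch x86_stage1 = {
	.root_level = 4,
	.masks = { .present = UINT64_C(1) << 0,
		   .writable = UINT64_C(1) << 1,
		   .huge = UINT64_C(1) << 7 },
};

/* EPT: an entry counts as present if any of R/W/X (bits 2:0) is set. */
static const struct mmu_sketch ept_stage2 = {
	.root_level = 4,
	.masks = { .present = UINT64_C(0x7),
		   .writable = UINT64_C(1) << 1,
		   .huge = UINT64_C(1) << 7 },
};
```

	With masks carried by the MMU instance, the same accessor gives format-appropriate answers: an entry with only bit 2 set is an execute-only mapping under the EPT masks but a non-present entry under the stage-1 masks.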
2026-01-08	KVM: selftests: Add a "struct kvm_mmu_arch arch" member to kvm_mmu	Sean Christopherson	-0/+9
	Add an arch structure+field in "struct kvm_mmu" so that architectures can track arch-specific information for a given MMU. No functional change intended. Reviewed-by: Yosry Ahmed <yosry.ahmed@linux.dev> Link: https://patch.msgid.link/20251230230150.4150236-9-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>
2026-01-08	KVM: selftests: Plumb "struct kvm_mmu" into x86's MMU APIs	Sean Christopherson	-1/+2
	In preparation for generalizing the x86 virt mapping APIs to work with TDP (stage-2) page tables, plumb "struct kvm_mmu" into all of the helper functions instead of operating on vm->mmu directly. Opportunistically swap the order of the check in virt_get_pte() to first assert that the parent is the PGD, and then check that the PTE is present, as it makes more sense to check if the parent PTE is the PGD/root (i.e. not a PTE) before checking that the PTE is PRESENT. No functional change intended. Suggested-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Yosry Ahmed <yosry.ahmed@linux.dev> [sean: rebase on common kvm_mmu structure, rewrite changelog] Link: https://patch.msgid.link/20251230230150.4150236-8-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>
2026-01-08	KVM: selftests: Add "struct kvm_mmu" to track a given MMU instance	Sean Christopherson	-3/+8
	Add a "struct kvm_mmu" to track a given MMU instance, e.g. a VM's stage-1 MMU versus a VM's stage-2 MMU, so that x86 can share MMU functionality for both stage-1 and stage-2 MMUs, without creating the potential for subtle bugs, e.g. due to consuming vm->pgtable_levels when operating on a stage-2 MMU. Encapsulate the existing de facto MMU in "struct kvm_vm", e.g. instead of burying the MMU details in "struct kvm_vm_arch", to avoid more #ifdefs in ____vm_create(), and in the hopes that other architectures can utilize the formalized MMU structure if/when they too support stage-2 page tables. No functional change intended. Reviewed-by: Yosry Ahmed <yosry.ahmed@linux.dev> Link: https://patch.msgid.link/20251230230150.4150236-7-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>
2026-01-08	KVM: selftests: Rename nested TDP mapping functions	Yosry Ahmed	-8/+8
	Rename the functions from nested_* to tdp_* to make their purpose clearer. No functional change intended. Suggested-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Yosry Ahmed <yosry.ahmed@linux.dev> Link: https://patch.msgid.link/20251230230150.4150236-4-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>
2026-01-08	KVM: selftests: Stop passing a memslot to nested_map_memslot()	Yosry Ahmed	-2/+2
	On x86, KVM selftests use memslot 0 for all the default regions used by the test infrastructure. This is an implementation detail. nested_map_memslot() is currently used to map the default regions by explicitly passing slot 0, which leaks the library implementation into the caller. Rename the function to a very verbose nested_identity_map_default_memslots() to reflect what it actually does. Add an assertion that only memslot 0 is being used so that the implementation does not change from under us. No functional change intended. Signed-off-by: Yosry Ahmed <yosry.ahmed@linux.dev> Link: https://patch.msgid.link/20251230230150.4150236-3-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>
2026-01-08	KVM: selftests: Make __vm_get_page_table_entry() static	Yosry Ahmed	-2/+0
	The function is only used in processor.c, drop the declaration in processor.h and make it static. No functional change intended. Signed-off-by: Yosry Ahmed <yosry.ahmed@linux.dev> Link: https://patch.msgid.link/20251230230150.4150236-2-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>
2026-01-08	KVM: selftests: Fix sign extension bug in get_desc64_base()	MJ Pooladkhay	-2/+4
	The function get_desc64_base() performs a series of bitwise left shifts on fields of various sizes. More specifically, when performing '<< 24' on 'desc->base2' (which is a u8), 'base2' is promoted to a signed integer before shifting. In a scenario where base2 >= 0x80, the shift places a 1 into bit 31, causing the 32-bit intermediate value to become negative. When this result is cast to uint64_t or ORed into the return value, sign extension occurs, corrupting the upper 32 bits of the address (base3). Example: given base0 = 0x5000, base1 = 0xd6, base2 = 0xf8 and base3 = 0xfffffe7c, the expected return is 0xfffffe7cf8d65000, but the actual return is 0xfffffffff8d65000. Fix this by explicitly casting the fields to 'uint64_t' before shifting to prevent sign extension. Signed-off-by: MJ Pooladkhay <mj@pooladkhay.com> Link: https://patch.msgid.link/20251222174207.107331-1-mj@pooladkhay.com Signed-off-by: Sean Christopherson <seanjc@google.com>
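	The fix described above can be sketched as below. This is an illustration under stated assumptions rather than the actual patch: struct demo_desc64 is a hypothetical, simplified descriptor that models only the base fields, and demo_desc64_base() is a hypothetical name.

```c
#include <stdint.h>

/* Hypothetical, simplified descriptor: only the base fields are modeled. */
struct demo_desc64 {
	uint16_t base0;	/* base bits 15:0  */
	uint8_t  base1;	/* base bits 23:16 */
	uint8_t  base2;	/* base bits 31:24 */
	uint32_t base3;	/* base bits 63:32 */
};

/*
 * An expression like 'desc->base2 << 24' promotes base2 to int before the
 * shift, so base2 >= 0x80 yields a negative 32-bit value that sign-extends
 * when widened to 64 bits. Casting each field to uint64_t first keeps the
 * whole computation unsigned and 64 bits wide.
 */
static uint64_t demo_desc64_base(const struct demo_desc64 *desc)
{
	return ((uint64_t)desc->base3 << 32) |
	       ((uint64_t)desc->base2 << 24) |
	       ((uint64_t)desc->base1 << 16) |
	       (uint64_t)desc->base0;
}
```

	Feeding in the values from the changelog (base0 = 0x5000, base1 = 0xd6, base2 = 0xf8, base3 = 0xfffffe7c) gives 0xfffffe7cf8d65000, i.e. base3 survives intact in the upper 32 bits.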
2026-01-08	KVM: selftests: Test TPR / CR8 sync and interrupt masking	Maciej S. Szmigiero	-0/+3
	Add a few extra TPR / CR8 tests to x86's xapic_state_test to see if: * TPR is 0 on reset, * TPR, PPR and CR8 are equal inside the guest, * TPR and CR8 read equal by the host after a VMExit, * TPR borderline values set by the host correctly mask interrupts in the guest. These hopefully will catch the most obvious cases of improper TPR sync or interrupt masking. Run these tests in both x2APIC and xAPIC modes. The x2APIC mode uses the SELF_IPI register to trigger interrupts to give it a bit of exercise too. Signed-off-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com> Acked-by: Naveen N Rao (AMD) <naveen@kernel.org> [sean: put code in separate test] Link: https://patch.msgid.link/20251205224937.428122-1-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>
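	For readers unfamiliar with the relationship being tested, the architectural link between TPR, CR8 and interrupt masking can be sketched as below. This is not code from the selftest: the demo_* helpers are hypothetical, and the masking helper assumes no interrupt is in service, so that the processor priority equals the TPR.

```c
#include <stdbool.h>
#include <stdint.h>

/* CR8[3:0] mirrors the priority class held in TPR[7:4]. */
static uint8_t demo_tpr_to_cr8(uint8_t tpr)
{
	return tpr >> 4;
}

/* Writing CR8 sets TPR[7:4] and clears the sub-class bits TPR[3:0]. */
static uint8_t demo_cr8_to_tpr(uint8_t cr8)
{
	return (uint8_t)((cr8 & 0xf) << 4);
}

/*
 * With nothing in service, a fixed interrupt is held pending unless its
 * priority class (vector[7:4]) is strictly greater than the TPR's class.
 */
static bool demo_vector_masked_by_tpr(uint8_t tpr, uint8_t vector)
{
	return (vector >> 4) <= (tpr >> 4);
}
```

	The "borderline" cases are the vectors on either side of a class boundary: with TPR = 0x50, vector 0x5f is masked while vector 0x60 is deliverable.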
2025-12-02	Merge tag 'kvmarm-6.19' of https://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD	Paolo Bonzini	-0/+3
	KVM/arm64 updates for 6.19 - Support for userspace handling of synchronous external aborts (SEAs), allowing the VMM to potentially handle the abort in a non-fatal manner. - Large rework of the VGIC's list register handling with the goal of supporting more active/pending IRQs than available list registers in hardware. In addition, the VGIC now supports EOImode==1 style deactivations for IRQs which may occur on a separate vCPU than the one that acked the IRQ. - Support for FEAT_XNX (user / privileged execute permissions) and FEAT_HAF (hardware update to the Access Flag) in the software page table walkers and shadow MMU. - Allow page table destruction to reschedule, fixing long need_resched latencies observed when destroying a large VM. - Minor fixes to KVM and selftests