- Oct 25, 2022
-
-
Vitaly Kuznetsov authored
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2119111 commit 7de8e5b6 Author: Sean Christopherson <seanjc@google.com> Date: Wed Jul 27 23:34:23 2022 +0000 KVM: VMX: Use proper type-safe functions for vCPU => LBRs helpers Turn vcpu_to_lbr_desc() and vcpu_to_lbr_records() into functions in order to provide type safety, to document exactly what they return, and to allow consuming the helpers in vmx.h. Move the definitions as necessary (the macros "reference" to_vmx() before its definition). No functional change intended. Signed-off-by:
Sean Christopherson <seanjc@google.com> Message-Id: <20220727233424.2968356-3-seanjc@google.com> Signed-off-by:
Paolo Bonzini <pbonzini@redhat.com> Signed-off-by:
Vitaly Kuznetsov <vkuznets@redhat.com>
-
Vitaly Kuznetsov authored
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2119111 commit 17a024a8 Author: Sean Christopherson <seanjc@google.com> Date: Wed Jul 27 23:34:22 2022 +0000 KVM: x86: Refresh PMU after writes to MSR_IA32_PERF_CAPABILITIES Refresh the PMU if userspace modifies MSR_IA32_PERF_CAPABILITIES. KVM consumes the vCPU's PERF_CAPABILITIES when enumerating PEBS support, but relies on CPUID updates to refresh the PMU. I.e. KVM will do the wrong thing if userspace stuffs PERF_CAPABILITIES _after_ setting guest CPUID. Opportunistically fix a curly-brace indentation. Fixes: c59a1f10 ("KVM: x86/pmu: Add IA32_PEBS_ENABLE MSR emulation for extended PEBS") Cc: Like Xu <like.xu.linux@gmail.com> Signed-off-by:
Sean Christopherson <seanjc@google.com> Message-Id: <20220727233424.2968356-2-seanjc@google.com> Signed-off-by:
Paolo Bonzini <pbonzini@redhat.com> Signed-off-by:
Vitaly Kuznetsov <vkuznets@redhat.com>
-
Vitaly Kuznetsov authored
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2119111 commit 9d27d461 Author: Sean Christopherson <seanjc@google.com> Date: Thu Aug 4 12:18:15 2022 -0700 KVM: selftests: Test all possible "invalid" PERF_CAPABILITIES.LBR_FMT vals Test all possible input values to verify that KVM rejects all values except the exact host value. Due to the LBR format affecting the core functionality of LBRs, KVM can't emulate "other" formats, so even though there are a variety of legal values, KVM should reject anything but an exact host match. Suggested-by:
Like Xu <like.xu.linux@gmail.com> Signed-off-by:
Sean Christopherson <seanjc@google.com> Signed-off-by:
Paolo Bonzini <pbonzini@redhat.com> Signed-off-by:
Vitaly Kuznetsov <vkuznets@redhat.com>
-
Vitaly Kuznetsov authored
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2119111 commit 0fcc1029 Author: Gavin Shan <gshan@redhat.com> Date: Wed Aug 10 18:41:14 2022 +0800 KVM: selftests: Use getcpu() instead of sched_getcpu() in rseq_test sched_getcpu() is glibc dependent and it can simply return the CPU ID from the registered rseq information, as Florian Weimer pointed. In this case, it's pointless to compare the return value from sched_getcpu() and that fetched from the registered rseq information. Fix the issue by replacing sched_getcpu() with getcpu(), as Florian suggested. The comments are modified accordingly by replacing "sched_getcpu()" with "getcpu()". Reported-by:
Yihuang Yu <yihyu@redhat.com> Suggested-by:
Florian Weimer <fweimer@redhat.com> Suggested-by:
Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Suggested-by:
Sean Christopherson <seanjc@google.com> Signed-off-by:
Gavin Shan <gshan@redhat.com> Message-Id: <20220810104114.6838-3-gshan@redhat.com> Signed-off-by:
Paolo Bonzini <pbonzini@redhat.com> Signed-off-by:
Vitaly Kuznetsov <vkuznets@redhat.com>
-
Vitaly Kuznetsov authored
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2119111 commit 66d42ac7 Author: Gavin Shan <gshan@redhat.com> Date: Wed Aug 10 18:41:13 2022 +0800 KVM: selftests: Make rseq compatible with glibc-2.35 The rseq information is registered by TLS, starting from glibc-2.35. In this case, the test always fails due to syscall(__NR_rseq). For example, on RHEL9.1 where upstream glibc-2.35 features are enabled on downstream glibc-2.34, the test fails like below. # ./rseq_test ==== Test Assertion Failure ==== rseq_test.c:60: !r pid=112043 tid=112043 errno=22 - Invalid argument 1 0x0000000000401973: main at rseq_test.c:226 2 0x0000ffff84b6c79b: ?? ??:0 3 0x0000ffff84b6c86b: ?? ??:0 4 0x0000000000401b6f: _start at ??:? rseq failed, errno = 22 (Invalid argument) # rpm -aq | grep glibc-2 glibc-2.34-39.el9.aarch64 Fix the issue by using "../rseq/rseq.c" to fetch the rseq information, registred by TLS if it exists. Otherwise, we're going to register our own rseq information as before. Reported-by:
Yihuang Yu <yihyu@redhat.com> Suggested-by:
Florian Weimer <fweimer@redhat.com> Suggested-by:
Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Suggested-by:
Paolo Bonzini <pbonzini@redhat.com> Signed-off-by:
Gavin Shan <gshan@redhat.com> Message-Id: <20220810104114.6838-2-gshan@redhat.com> Signed-off-by:
Paolo Bonzini <pbonzini@redhat.com> Signed-off-by:
Vitaly Kuznetsov <vkuznets@redhat.com>
-
Vitaly Kuznetsov authored
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2119111 commit b74ed7a6 Author: Oliver Upton <oupton@google.com> Date: Wed Jul 20 09:22:51 2022 +0000 KVM: Actually create debugfs in kvm_create_vm() Doing debugfs creation after vm creation leaves things in a quasi-initialized state for a while. This is further complicated by the fact that we tear down debugfs from kvm_destroy_vm(). Align debugfs and stats init/destroy with the vm init/destroy pattern to avoid any headaches. Note the fix for a benign mistake in error handling for calls to kvm_arch_create_vm_debugfs() rolled in. Since all implementations of the function return 0 unconditionally it isn't actually a bug at the moment. Lastly, tear down debugfs/stats data in the kvm_create_vm_debugfs() error path. Previously it was safe to assume that kvm_destroy_vm() would take out the garbage, that is no longer the case. Signed-off-by:
Oliver Upton <oupton@google.com> Message-Id: <20220720092259.3491733-6-oliver.upton@linux.dev> Signed-off-by:
Paolo Bonzini <pbonzini@redhat.com> Signed-off-by:
Vitaly Kuznetsov <vkuznets@redhat.com>
-
Vitaly Kuznetsov authored
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2119111 commit 59f82aad Author: Oliver Upton <oupton@google.com> Date: Wed Jul 20 09:22:50 2022 +0000 KVM: Pass the name of the VM fd to kvm_create_vm_debugfs() At the time the VM fd is used in kvm_create_vm_debugfs(), the fd has been allocated but not yet installed. It is only really useful as an identifier in strings for the VM (such as debugfs). Treat it exactly as such by passing the string name of the fd to kvm_create_vm_debugfs(), futureproofing against possible misuse of the VM fd. Suggested-by:
Sean Christopherson <seanjc@google.com> Signed-off-by:
Oliver Upton <oupton@google.com> Message-Id: <20220720092259.3491733-5-oliver.upton@linux.dev> Signed-off-by:
Paolo Bonzini <pbonzini@redhat.com> Signed-off-by:
Vitaly Kuznetsov <vkuznets@redhat.com>
-
Vitaly Kuznetsov authored
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2119111 commit 20020f4c Author: Oliver Upton <oupton@google.com> Date: Wed Jul 20 09:22:49 2022 +0000 KVM: Get an fd before creating the VM Allocate a VM's fd at the very beginning of kvm_dev_ioctl_create_vm() so that KVM can use the fd value to generate strigns, e.g. for debugfs, when creating and initializing the VM. Signed-off-by:
Oliver Upton <oupton@google.com> Message-Id: <20220720092259.3491733-4-oliver.upton@linux.dev> Signed-off-by:
Paolo Bonzini <pbonzini@redhat.com> Signed-off-by:
Vitaly Kuznetsov <vkuznets@redhat.com>
-
Vitaly Kuznetsov authored
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2119111 commit 58fc1166 Author: Oliver Upton <oupton@google.com> Date: Wed Jul 20 09:22:48 2022 +0000 KVM: Shove vcpu stats_id init into kvm_vcpu_init() Initialize stats_id alongside other kvm_vcpu fields to make it more difficult to unintentionally access stats_id before it's set. No functional change intended. Signed-off-by:
Oliver Upton <oupton@google.com> Message-Id: <20220720092259.3491733-3-oliver.upton@linux.dev> Signed-off-by:
Paolo Bonzini <pbonzini@redhat.com> Signed-off-by:
Vitaly Kuznetsov <vkuznets@redhat.com>
-
Vitaly Kuznetsov authored
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2119111 commit f2759c08 Author: Oliver Upton <oupton@google.com> Date: Wed Jul 20 09:22:47 2022 +0000 KVM: Shove vm stats_id init into kvm_create_vm() Initialize stats_id alongside other struct kvm fields to make it more difficult to unintentionally access stats_id before it's set. While at it, move the format string to the first line of the call and fix the indentation of the second line. No functional change intended. Signed-off-by:
Oliver Upton <oupton@google.com> Message-Id: <20220720092259.3491733-2-oliver.upton@linux.dev> Signed-off-by:
Paolo Bonzini <pbonzini@redhat.com> Signed-off-by:
Vitaly Kuznetsov <vkuznets@redhat.com>
-
Vitaly Kuznetsov authored
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2119111 commit 8bad4606 Author: Sean Christopherson <seanjc@google.com> Date: Fri Aug 5 19:41:33 2022 +0000 KVM: x86/mmu: Add sanity check that MMIO SPTE mask doesn't overlap gen Add compile-time and init-time sanity checks to ensure that the MMIO SPTE mask doesn't overlap the MMIO SPTE generation or the MMU-present bit. The generation currently avoids using bit 63, but that's as much coincidence as it is strictly necessarly. That will change in the future, as TDX support will require setting bit 63 (SUPPRESS_VE) in the mask. Explicitly carve out the bits that are allowed in the mask so that any future shuffling of SPTE bits doesn't silently break MMIO caching (KVM has broken MMIO caching more than once due to overlapping the generation with other things). Suggested-by:
Kai Huang <kai.huang@intel.com> Signed-off-by:
Sean Christopherson <seanjc@google.com> Reviewed-by:
Kai Huang <kai.huang@intel.com> Message-Id: <20220805194133.86299-1-seanjc@google.com> Signed-off-by:
Paolo Bonzini <pbonzini@redhat.com> Signed-off-by:
Vitaly Kuznetsov <vkuznets@redhat.com>
-
Vitaly Kuznetsov authored
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2119111 commit 1685c0f3 Author: Mingwei Zhang <mizhang@google.com> Date: Sun Aug 7 05:21:41 2022 +0000 KVM: x86/mmu: rename trace function name for asynchronous page fault Rename the tracepoint function from trace_kvm_async_pf_doublefault() to trace_kvm_async_pf_repeated_fault() to make it clear, since double fault has nothing to do with this trace function. Asynchronous Page Fault (APF) is an artifact generated by KVM when it cannot find a physical page to satisfy an EPT violation. KVM uses APF to tell the guest OS to do something else such as scheduling other guest processes to make forward progress. However, when another guest process also touches a previously APFed page, KVM halts the vCPU instead of generating a repeated APF to avoid wasting cycles. Double fault (#DF) clearly has a different meaning and a different consequence when triggered. #DF requires two nested contributory exceptions instead of two page faults faulting at the same address. A prevous bug on APF indicates that it may trigger a double fault in the guest [1] and clearly this trace function has nothing to do with it. So rename this function should be a valid choice. No functional change intended. [1] https://www.spinics.net/lists/kvm/msg214957.html Signed-off-by:
Mingwei Zhang <mizhang@google.com> Message-Id: <20220807052141.69186-1-mizhang@google.com> Signed-off-by:
Paolo Bonzini <pbonzini@redhat.com> Signed-off-by:
Vitaly Kuznetsov <vkuznets@redhat.com>
-
Vitaly Kuznetsov authored
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2119111 commit c0368991 Author: Coleman Dietsch <dietschc@csp.edu> Date: Mon Aug 8 14:06:07 2022 -0500 KVM: x86/xen: Stop Xen timer before changing IRQ Stop Xen timer (if it's running) prior to changing the IRQ vector and potentially (re)starting the timer. Changing the IRQ vector while the timer is still running can result in KVM injecting a garbage event, e.g. vm_xen_inject_timer_irqs() could see a non-zero xen.timer_pending from a previous timer but inject the new xen.timer_virq. Fixes: 53639526 ("KVM: x86/xen: handle PV timers oneshot mode") Cc: stable@vger.kernel.org Link: https://syzkaller.appspot.com/bug?id=8234a9dfd3aafbf092cc5a7cd9842e3ebc45fc42 Reported-by:
<syzbot+e54f930ed78eb0f85281@syzkaller.appspotmail.com> Signed-off-by:
Coleman Dietsch <dietschc@csp.edu> Reviewed-by:
Sean Christopherson <seanjc@google.com> Acked-by:
David Woodhouse <dwmw@amazon.co.uk> Message-Id: <20220808190607.323899-3-dietschc@csp.edu> Signed-off-by:
Paolo Bonzini <pbonzini@redhat.com> Signed-off-by:
Vitaly Kuznetsov <vkuznets@redhat.com>
-
Vitaly Kuznetsov authored
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2119111 commit af735db3 Author: Coleman Dietsch <dietschc@csp.edu> Date: Mon Aug 8 14:06:06 2022 -0500 KVM: x86/xen: Initialize Xen timer only once Add a check for existing xen timers before initializing a new one. Currently kvm_xen_init_timer() is called on every KVM_XEN_VCPU_ATTR_TYPE_TIMER, which is causing the following ODEBUG crash when vcpu->arch.xen.timer is already set. ODEBUG: init active (active state 0) object type: hrtimer hint: xen_timer_callbac0 RIP: 0010:debug_print_object+0x16e/0x250 lib/debugobjects.c:502 Call Trace: __debug_object_init debug_hrtimer_init debug_init hrtimer_init kvm_xen_init_timer kvm_xen_vcpu_set_attr kvm_arch_vcpu_ioctl kvm_vcpu_ioctl vfs_ioctl Fixes: 53639526 ("KVM: x86/xen: handle PV timers oneshot mode") Cc: stable@vger.kernel.org Link: https://syzkaller.appspot.com/bug?id=8234a9dfd3aafbf092cc5a7cd9842e3ebc45fc42 Reported-by:
<syzbot+e54f930ed78eb0f85281@syzkaller.appspotmail.com> Signed-off-by:
Coleman Dietsch <dietschc@csp.edu> Reviewed-by:
Sean Christopherson <seanjc@google.com> Message-Id: <20220808190607.323899-2-dietschc@csp.edu> Signed-off-by:
Paolo Bonzini <pbonzini@redhat.com> Signed-off-by:
Vitaly Kuznetsov <vkuznets@redhat.com>
-
Vitaly Kuznetsov authored
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2119111 commit 0c29397a Author: Sean Christopherson <seanjc@google.com> Date: Wed Aug 3 22:49:57 2022 +0000 KVM: SVM: Disable SEV-ES support if MMIO caching is disable Disable SEV-ES if MMIO caching is disabled as SEV-ES relies on MMIO SPTEs generating #NPF(RSVD), which are reflected by the CPU into the guest as a #VC. With SEV-ES, the untrusted host, a.k.a. KVM, doesn't have access to the guest instruction stream or register state and so can't directly emulate in response to a #NPF on an emulated MMIO GPA. Disabling MMIO caching means guest accesses to emulated MMIO ranges cause #NPF(!PRESENT), and those flavors of #NPF cause automatic VM-Exits, not #VC. Adjust KVM's MMIO masks to account for the C-bit location prior to doing SEV(-ES) setup, and document that dependency between adjusting the MMIO SPTE mask and SEV(-ES) setup. Fixes: b09763da ("KVM: x86/mmu: Add module param to disable MMIO caching (for testing)") Reported-by:
Michael Roth <michael.roth@amd.com> Tested-by:
Michael Roth <michael.roth@amd.com> Cc: Tom Lendacky <thomas.lendacky@amd.com> Cc: stable@vger.kernel.org Signed-off-by:
Sean Christopherson <seanjc@google.com> Message-Id: <20220803224957.1285926-4-seanjc@google.com> Signed-off-by:
Paolo Bonzini <pbonzini@redhat.com> Signed-off-by:
Vitaly Kuznetsov <vkuznets@redhat.com>
-
Vitaly Kuznetsov authored
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2119111 commit c3e0c8c2 Author: Sean Christopherson <seanjc@google.com> Date: Wed Aug 3 22:49:56 2022 +0000 KVM: x86/mmu: Fully re-evaluate MMIO caching when SPTE masks change Fully re-evaluate whether or not MMIO caching can be enabled when SPTE masks change; simply clearing enable_mmio_caching when a configuration isn't compatible with caching fails to handle the scenario where the masks are updated, e.g. by VMX for EPT or by SVM to account for the C-bit location, and toggle compatibility from false=>true. Snapshot the original module param so that re-evaluating MMIO caching preserves userspace's desire to allow caching. Use a snapshot approach so that enable_mmio_caching still reflects KVM's actual behavior. Fixes: 8b9e74bf ("KVM: x86/mmu: Use enable_mmio_caching to track if MMIO caching is enabled") Reported-by:
Michael Roth <michael.roth@amd.com> Cc: Tom Lendacky <thomas.lendacky@amd.com> Cc: stable@vger.kernel.org Tested-by:
Michael Roth <michael.roth@amd.com> Signed-off-by:
Sean Christopherson <seanjc@google.com> Reviewed-by:
Kai Huang <kai.huang@intel.com> Message-Id: <20220803224957.1285926-3-seanjc@google.com> Signed-off-by:
Paolo Bonzini <pbonzini@redhat.com> Signed-off-by:
Vitaly Kuznetsov <vkuznets@redhat.com>
-
Vitaly Kuznetsov authored
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2119111 commit 982bae43 Author: Sean Christopherson <seanjc@google.com> Date: Wed Aug 3 22:49:55 2022 +0000 KVM: x86: Tag kvm_mmu_x86_module_init() with __init Mark kvm_mmu_x86_module_init() with __init, the entire reason it exists is to initialize variables when kvm.ko is loaded, i.e. it must never be called after module initialization. Fixes: 1d0e8480 ("KVM: x86/mmu: Resolve nx_huge_pages when kvm.ko is loaded") Cc: stable@vger.kernel.org Reviewed-by:
Kai Huang <kai.huang@intel.com> Tested-by:
Michael Roth <michael.roth@amd.com> Signed-off-by:
Sean Christopherson <seanjc@google.com> Message-Id: <20220803224957.1285926-2-seanjc@google.com> Signed-off-by:
Paolo Bonzini <pbonzini@redhat.com> Signed-off-by:
Vitaly Kuznetsov <vkuznets@redhat.com>
-
Vitaly Kuznetsov authored
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2119111 commit 4ac5b423 Author: Michal Luczaj <mhal@rbox.co> Date: Fri Jul 29 15:48:01 2022 +0200 KVM: x86: emulator: Fix illegal LEA handling The emulator mishandles LEA with register source operand. Even though such LEA is illegal, it can be encoded and fed to CPU. In which case real hardware throws #UD. The emulator, instead, returns address of x86_emulate_ctxt._regs. This info leak hurts host's kASLR. Tell the decoder that illegal LEA is not to be emulated. Signed-off-by:
Michal Luczaj <mhal@rbox.co> Message-Id: <20220729134801.1120-1-mhal@rbox.co> Signed-off-by:
Paolo Bonzini <pbonzini@redhat.com> Signed-off-by:
Vitaly Kuznetsov <vkuznets@redhat.com>
-
Vitaly Kuznetsov authored
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2119111 commit 2bc685e6 Author: Yu Zhang <yu.c.zhang@linux.intel.com> Date: Mon Jul 18 15:47:56 2022 +0800 KVM: X86: avoid uninitialized 'fault.async_page_fault' from fixed-up #PF kvm_fixup_and_inject_pf_error() was introduced to fixup the error code( e.g., to add RSVD flag) and inject the #PF to the guest, when guest MAXPHYADDR is smaller than the host one. When it comes to nested, L0 is expected to intercept and fix up the #PF and then inject to L2 directly if - L2.MAXPHYADDR < L0.MAXPHYADDR and - L1 has no intention to intercept L2's #PF (e.g., L2 and L1 have the same MAXPHYADDR value && L1 is using EPT for L2), instead of constructing a #PF VM Exit to L1. Currently, with PFEC_MASK and PFEC_MATCH both set to 0 in vmcs02, the interception and injection may happen on all L2 #PFs. However, failing to initialize 'fault' in kvm_fixup_and_inject_pf_error() may cause the fault.async_page_fault being NOT zeroed, and later the #PF being treated as a nested async page fault, and then being injected to L1. Instead of zeroing 'fault' at the beginning of this function, we mannually set the value of 'fault.async_page_fault', because false is the value we really expect. Fixes: 89786147 ("KVM: x86: Add helper functions for illegal GPA checking and page fault injection") Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=216178 Reported-by:
Yang Lixiao <lixiao.yang@intel.com> Signed-off-by:
Yu Zhang <yu.c.zhang@linux.intel.com> Reviewed-by:
Sean Christopherson <seanjc@google.com> Message-Id: <20220718074756.53788-1-yu.c.zhang@linux.intel.com> Signed-off-by:
Paolo Bonzini <pbonzini@redhat.com> Signed-off-by:
Vitaly Kuznetsov <vkuznets@redhat.com>
-
Vitaly Kuznetsov authored
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2119111 commit 70c8327c Author: Sean Christopherson <seanjc@google.com> Date: Thu Aug 4 23:50:28 2022 +0000 KVM: x86: Bug the VM if an accelerated x2APIC trap occurs on a "bad" reg Bug the VM if retrieving the x2APIC MSR/register while processing an accelerated vAPIC trap VM-Exit fails. In theory it's impossible for the lookup to fail as hardware has already validated the register, but bugs happen, and not checking the result of kvm_lapic_msr_read() would result in consuming the uninitialized "val" if a KVM or hardware bug occurs. Fixes: 1bd9dfec ("KVM: x86: Do not block APIC write for non ICR registers") Reported-by:
Dan Carpenter <dan.carpenter@oracle.com> Cc: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com> Signed-off-by:
Sean Christopherson <seanjc@google.com> Message-Id: <20220804235028.1766253-1-seanjc@google.com> Signed-off-by:
Paolo Bonzini <pbonzini@redhat.com> Signed-off-by:
Vitaly Kuznetsov <vkuznets@redhat.com>
-
Vitaly Kuznetsov authored
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2119111 commit baea2ce5 Author: Paolo Bonzini <pbonzini@redhat.com> Date: Wed Aug 10 14:55:27 2022 -0400 selftests: kvm: fix compilation Commit 49de12ba ("selftests: drop KSFT_KHDR_INSTALL make target") dropped from tools/testing/selftests/lib.mk the code related to KSFT_KHDR_INSTALL, but in doing so it also dropped the definition of the ARCH variable. The ARCH variable is used in several subdirectories, but kvm/ is the only one of these that was using KSFT_KHDR_INSTALL. As a result, kvm selftests cannot be built anymore: In file included from include/x86_64/vmx.h:12, from x86_64/vmx_pmu_caps_test.c:18: include/x86_64/processor.h:15:10: fatal error: asm/msr-index.h: No such file or directory 15 | #include <asm/msr-index.h> | ^~~~~~~~~~~~~~~~~ In file included from ../../../../tools/include/asm/atomic.h:6, from ../../../../tools/include/linux/atomic.h:5, from rseq_test.c:15: ../../../../tools/include/asm/../../arch/x86/include/asm/atomic.h:11:10: fatal error: asm/cmpxchg.h: No such file or directory 11 | #include <asm/cmpxchg.h> | ^~~~~~~~~~~~~~~ Fix it by including the definition that was present in lib.mk. Fixes: 49de12ba ("selftests: drop KSFT_KHDR_INSTALL make target") Cc: Guillaume Tucker <guillaume.tucker@collabora.com> Cc: Anders Roxell <anders.roxell@linaro.org> Cc: Shuah Khan <skhan@linuxfoundation.org> Cc: linux-kselftest@vger.kernel.org Signed-off-by:
Paolo Bonzini <pbonzini@redhat.com> Conflicts: tools/testing/selftests/kvm/Makefile (context, skipping f2745dc0) Signed-off-by:
Vitaly Kuznetsov <vkuznets@redhat.com>
-
Vitaly Kuznetsov authored
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2119111 commit 281106f9 Author: Andrei Vagin <avagin@google.com> Date: Fri Jul 22 16:02:40 2022 -0700 selftests: kvm: set rax before vmcall kvm_hypercall has to place the hypercall number in rax. Trace events show that kvm_pv_test doesn't work properly: kvm_pv_test-53132: kvm_hypercall: nr 0x0 a0 0x0 a1 0x0 a2 0x0 a3 0x0 kvm_pv_test-53132: kvm_hypercall: nr 0x0 a0 0x0 a1 0x0 a2 0x0 a3 0x0 kvm_pv_test-53132: kvm_hypercall: nr 0x0 a0 0x0 a1 0x0 a2 0x0 a3 0x0 With this change, it starts working as expected: kvm_pv_test-54285: kvm_hypercall: nr 0x5 a0 0x0 a1 0x0 a2 0x0 a3 0x0 kvm_pv_test-54285: kvm_hypercall: nr 0xa a0 0x0 a1 0x0 a2 0x0 a3 0x0 kvm_pv_test-54285: kvm_hypercall: nr 0xb a0 0x0 a1 0x0 a2 0x0 a3 0x0 Signed-off-by:
Andrei Vagin <avagin@google.com> Message-Id: <20220722230241.1944655-5-avagin@google.com> Fixes: ac4a4d6d ("selftests: kvm: test enforcement of paravirtual cpuid features") Signed-off-by:
Paolo Bonzini <pbonzini@redhat.com> Signed-off-by:
Vitaly Kuznetsov <vkuznets@redhat.com>
-
Vitaly Kuznetsov authored
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2119111 commit dd4d1c3b Author: Oliver Upton <oupton@google.com> Date: Tue Jul 19 14:31:34 2022 +0000 selftests: KVM: Add exponent check for boolean stats The only sensible exponent for a boolean stat is 0. Add a test assertion requiring all boolean statistics to have an exponent of 0. Signed-off-by:
Oliver Upton <oupton@google.com> Reviewed-by:
Andrew Jones <andrew.jones@linux.dev> Message-Id: <20220719143134.3246798-4-oliver.upton@linux.dev> Signed-off-by:
Paolo Bonzini <pbonzini@redhat.com> Signed-off-by:
Vitaly Kuznetsov <vkuznets@redhat.com>
-
Vitaly Kuznetsov authored
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2119111 commit 7eebae78 Author: Oliver Upton <oupton@google.com> Date: Tue Jul 19 14:31:33 2022 +0000 selftests: KVM: Provide descriptive assertions in kvm_binary_stats_test As it turns out, tests sometimes fail. When that is the case, packing the test assertion with as much relevant information helps track down the problem more quickly. Sharpen up the stat descriptor assertions in kvm_binary_stats_test to more precisely describe the reason for the test assertion and which stat is to blame. Signed-off-by:
Oliver Upton <oupton@google.com> Reviewed-by:
Andrew Jones <andrew.jones@linux.dev> Message-Id: <20220719143134.3246798-3-oliver.upton@linux.dev> Signed-off-by:
Paolo Bonzini <pbonzini@redhat.com> Signed-off-by:
Vitaly Kuznetsov <vkuznets@redhat.com>
-
Vitaly Kuznetsov authored
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2119111 commit ad5b0727 Author: Oliver Upton <oupton@google.com> Date: Tue Jul 19 14:31:32 2022 +0000 selftests: KVM: Check stat name before other fields In order to provide more useful test assertions that describe the broken stats descriptor, perform sanity check on the stat name before any other descriptor field. While at it, avoid dereferencing the name field if the sanity check fails as it is more likely to contain garbage. Signed-off-by:
Oliver Upton <oupton@google.com> Reviewed-by:
Andrew Jones <andrew.jones@linux.dev> Message-Id: <20220719143134.3246798-2-oliver.upton@linux.dev> Signed-off-by:
Paolo Bonzini <pbonzini@redhat.com> Signed-off-by:
Vitaly Kuznetsov <vkuznets@redhat.com>
-
Vitaly Kuznetsov authored
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2119111 commit 31f6e383 Author: Paolo Bonzini <pbonzini@redhat.com> Date: Mon Aug 1 07:27:18 2022 -0400 KVM: x86/mmu: remove unused variable The last use of 'pfn' went away with the same-named argument to host_pfn_mapping_level; now that the hugepage level is obtained exclusively from the host page tables, kvm_mmu_zap_collapsible_spte does not need to know host pfns at all. Fixes: a8ac499b ("KVM: x86/mmu: Don't require refcounted "struct page" to create huge SPTEs") Signed-off-by:
Paolo Bonzini <pbonzini@redhat.com> Signed-off-by:
Vitaly Kuznetsov <vkuznets@redhat.com>
-
Vitaly Kuznetsov authored
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2119111 commit 4ab0e470 Author: Anup Patel <apatel@ventanamicro.com> Date: Fri Jul 29 17:15:00 2022 +0530 KVM: Add gfp_custom flag in struct kvm_mmu_memory_cache The kvm_mmu_topup_memory_cache() always uses GFP_KERNEL_ACCOUNT for memory allocation which prevents it's use in atomic context. To address this limitation of kvm_mmu_topup_memory_cache(), we add gfp_custom flag in struct kvm_mmu_memory_cache. When the gfp_custom flag is set to some GFP_xyz flags, the kvm_mmu_topup_memory_cache() will use that instead of GFP_KERNEL_ACCOUNT. Signed-off-by:
Anup Patel <apatel@ventanamicro.com> Reviewed-by:
Atish Patra <atishp@rivosinc.com> Signed-off-by:
Anup Patel <anup@brainfault.org> commit 63f4b210 Merge: 2e2e9115 7edc3a68 Author: Paolo Bonzini <pbonzini@redhat.com> Date: Fri Jul 29 09:46:01 2022 -0400 Merge remote-tracking branch 'kvm/next' into kvm-next-5.20 KVM/s390, KVM/x86 and common infrastructure changes for 5.20 x86: * Permit guests to ignore single-bit ECC errors * Fix races in gfn->pfn cache refresh; do not pin pages tracked by the cache * Intel IPI virtualization * Allow getting/setting pending triple fault with KVM_GET/SET_VCPU_EVENTS * PEBS virtualization * Simplify PMU emulation by just using PERF_TYPE_RAW events * More accurate event reinjection on SVM (avoid retrying instructions) * Allow getting/setting the state of the speaker port data bit * Refuse starting the kvm-intel module if VM-Entry/VM-Exit controls are inconsistent * "Notify" VM exit (detect microarchitectural hangs) for Intel * Cleanups for MCE MSR emulation s390: * add an interface to provide a hypervisor dump for secure guests * improve selftests to use TAP interface * enable interpretive execution of zPCI instructions (for PCI passthrough) * First part of deferred teardown * CPU Topology * PV attestation * Minor fixes Generic: * new selftests API using struct kvm_vcpu instead of a (vm, id) tuple x86: * Use try_cmpxchg64 instead of cmpxchg64 * Bugfixes * Ignore benign host accesses to PMU MSRs when PMU is disabled * Allow disabling KVM's "MONITOR/MWAIT are NOPs!" behavior * x86/MMU: Allow NX huge pages to be disabled on a per-vm basis * Port eager page splitting to shadow MMU as well * Enable CMCI capability by default and handle injected UCNA errors * Expose pid of vcpu threads in debugfs * x2AVIC support for AMD * cleanup PIO emulation * Fixes for LLDT/LTR emulation * Don't require refcounted "struct page" to create huge SPTEs x86 cleanups: * Use separate namespaces for guest PTEs and shadow PTEs bitmasks * PIO emulation * Reorganize rmap API, mostly around rmap destruction * Do not workaround very old KVM bugs for L0 that runs with nesting enabled * new selftests API for CPUID Conflicts: virt/kvm/kvm_main.c (upstream merge conflict) Signed-off-by:
Vitaly Kuznetsov <vkuznets@redhat.com>
-
Vitaly Kuznetsov authored
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2119111 commit 7edc3a68 Author: Kai Huang <kai.huang@intel.com> Date: Thu Jul 28 15:04:52 2022 +1200 KVM, x86/mmu: Fix the comment around kvm_tdp_mmu_zap_leafs() Now kvm_tdp_mmu_zap_leafs() only zaps leaf SPTEs but not any non-root pages within that GFN range anymore, so the comment around it isn't right. Fix it by shifting the comment from tdp_mmu_zap_leafs() instead of duplicating it, as tdp_mmu_zap_leafs() is static and is only called by kvm_tdp_mmu_zap_leafs(). Opportunistically tweak the blurb about SPTEs being cleared to (a) say "zapped" instead of "cleared" because "cleared" will be wrong if/when KVM allows a non-zero value for non-present SPTE (i.e. for Intel TDX), and (b) to clarify that a flush is needed if and only if a SPTE has been zapped since MMU lock was last acquired. Fixes: f47e5bbb ("KVM: x86/mmu: Zap only TDP MMU leafs in zap range and mmu_notifier unmap") Suggested-by:
Sean Christopherson <seanjc@google.com> Reviewed-by:
Sean Christopherson <seanjc@google.com> Signed-off-by:
Kai Huang <kai.huang@intel.com> Message-Id: <20220728030452.484261-1-kai.huang@intel.com> Signed-off-by:
Paolo Bonzini <pbonzini@redhat.com> Signed-off-by:
Vitaly Kuznetsov <vkuznets@redhat.com>
-
Vitaly Kuznetsov authored
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2119111 commit 6fac42f1 Author: Jarkko Sakkinen <jarkko@profian.com> Date: Thu Jul 28 08:09:19 2022 +0300 KVM: SVM: Dump Virtual Machine Save Area (VMSA) to klog As Virtual Machine Save Area (VMSA) is essential in troubleshooting attestation, dump it to the klog with the KERN_DEBUG level of priority. Cc: Jarkko Sakkinen <jarkko@kernel.org> Suggested-by:
Harald Hoyer <harald@profian.com> Signed-off-by:
Jarkko Sakkinen <jarkko@profian.com> Message-Id: <20220728050919.24113-1-jarkko@profian.com> Signed-off-by:
Paolo Bonzini <pbonzini@redhat.com> Signed-off-by:
Vitaly Kuznetsov <vkuznets@redhat.com>
-
Vitaly Kuznetsov authored
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2119111 commit 6c6ab524 Author: Sean Christopherson <seanjc@google.com> Date: Sat Jul 23 01:30:29 2022 +0000 KVM: x86/mmu: Treat NX as a valid SPTE bit for NPT Treat the NX bit as valid when using NPT, as KVM will set the NX bit when the NX huge page mitigation is enabled (mindblowing) and trigger the WARN that fires on reserved SPTE bits being set. KVM has required NX support for SVM since commit b26a71a1 ("KVM: SVM: Refuse to load kvm_amd if NX support is not available") for exactly this reason, but apparently it never occurred to anyone to actually test NPT with the mitigation enabled. ------------[ cut here ]------------ spte = 0x800000018a600ee7, level = 2, rsvd bits = 0x800f0000001fe000 WARNING: CPU: 152 PID: 15966 at arch/x86/kvm/mmu/spte.c:215 make_spte+0x327/0x340 [kvm] Hardware name: Google, Inc. Arcadia_IT_80/Arcadia_IT_80, BIOS 10.48.0 01/27/2022 RIP: 0010:make_spte+0x327/0x340 [kvm] Call Trace: <TASK> tdp_mmu_map_handle_target_level+0xc3/0x230 [kvm] kvm_tdp_mmu_map+0x343/0x3b0 [kvm] direct_page_fault+0x1ae/0x2a0 [kvm] kvm_tdp_page_fault+0x7d/0x90 [kvm] kvm_mmu_page_fault+0xfb/0x2e0 [kvm] npf_interception+0x55/0x90 [kvm_amd] svm_invoke_exit_handler+0x31/0xf0 [kvm_amd] svm_handle_exit+0xf6/0x1d0 [kvm_amd] vcpu_enter_guest+0xb6d/0xee0 [kvm] ? kvm_pmu_trigger_event+0x6d/0x230 [kvm] vcpu_run+0x65/0x2c0 [kvm] kvm_arch_vcpu_ioctl_run+0x355/0x610 [kvm] kvm_vcpu_ioctl+0x551/0x610 [kvm] __se_sys_ioctl+0x77/0xc0 __x64_sys_ioctl+0x1d/0x20 do_syscall_64+0x44/0xa0 entry_SYSCALL_64_after_hwframe+0x46/0xb0 </TASK> ---[ end trace 0000000000000000 ]--- Cc: stable@vger.kernel.org Signed-off-by:
Sean Christopherson <seanjc@google.com> Message-Id: <20220723013029.1753623-1-seanjc@google.com> Signed-off-by:
Paolo Bonzini <pbonzini@redhat.com> Signed-off-by:
Vitaly Kuznetsov <vkuznets@redhat.com>
-
Vitaly Kuznetsov authored
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2119111 commit 1bd9dfec Author: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com> Date: Mon Jul 25 00:33:56 2022 -0500 KVM: x86: Do not block APIC write for non ICR registers The commit 5413bcba ("KVM: x86: Add support for vICR APIC-write VM-Exits in x2APIC mode") introduces logic to prevent APIC write for offset other than ICR in kvm_apic_write_nodecode() function. This breaks x2AVIC support, which requires KVM to trap and emulate x2APIC MSR writes. Therefore, removes the warning and modify to logic to allow MSR write. Fixes: 5413bcba ("KVM: x86: Add support for vICR APIC-write VM-Exits in x2APIC mode") Cc: Zeng Guang <guang.zeng@intel.com> Suggested-by:
Sean Christopherson <seanjc@google.com> Signed-off-by:
Suravee Suthikulpanit <suravee.suthikulpanit@amd.com> Message-Id: <20220725053356.4275-1-suravee.suthikulpanit@amd.com> Signed-off-by:
Paolo Bonzini <pbonzini@redhat.com> Signed-off-by:
Vitaly Kuznetsov <vkuznets@redhat.com>
-
Vitaly Kuznetsov authored
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2119111 commit 0a8735a6 Author: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com> Date: Sun Jul 24 22:34:28 2022 -0500 KVM: SVM: Do not virtualize MSR accesses for APIC LVTT register AMD does not support APIC TSC-deadline timer mode. AVIC hardware will generate GP fault when guest kernel writes 1 to bits [18] of the APIC LVTT register (offset 0x32) to set the timer mode. (Note: bit 18 is reserved on AMD system). Therefore, always intercept and let KVM emulate the MSR accesses. Fixes: f3d7c8aa6882 ("KVM: SVM: Fix x2APIC MSRs interception") Signed-off-by:
Suravee Suthikulpanit <suravee.suthikulpanit@amd.com> Message-Id: <20220725033428.3699-1-suravee.suthikulpanit@amd.com> Signed-off-by:
Paolo Bonzini <pbonzini@redhat.com> Signed-off-by:
Vitaly Kuznetsov <vkuznets@redhat.com>
-
Vitaly Kuznetsov authored
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2119111 commit ce30d8b9 Author: Sean Christopherson <seanjc@google.com> Date: Tue Jun 7 21:36:04 2022 +0000 KVM: selftests: Verify VMX MSRs can be restored to KVM-supported values Verify that KVM allows toggling VMX MSR bits to be "more" restrictive, and also allows restoring each MSR to KVM's original, less restrictive value. Signed-off-by:
Sean Christopherson <seanjc@google.com> Message-Id: <20220607213604.3346000-16-seanjc@google.com> Signed-off-by:
Paolo Bonzini <pbonzini@redhat.com> Signed-off-by:
Vitaly Kuznetsov <vkuznets@redhat.com>
-
Vitaly Kuznetsov authored
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2119111 commit a910b5ab Author: Sean Christopherson <seanjc@google.com> Date: Tue Jun 7 21:36:00 2022 +0000 KVM: nVMX: Set UMIP bit CR4_FIXED1 MSR when emulating UMIP Make UMIP an "allowed-1" bit CR4_FIXED1 MSR when KVM is emulating UMIP. KVM emulates UMIP for both L1 and L2, and so should enumerate that L2 is allowed to have CR4.UMIP=1. Not setting the bit doesn't immediately break nVMX, as KVM does set/clear the bit in CR4_FIXED1 in response to a guest CPUID update, i.e. KVM will correctly (dis)allow nested VM-Entry based on whether or not UMIP is exposed to L1. That said, KVM should enumerate the bit as being allowed from time zero, e.g. userspace will see the wrong value if the MSR is read before CPUID is written. Fixes: 0367f205 ("KVM: vmx: add support for emulating UMIP") Signed-off-by:
Sean Christopherson <seanjc@google.com> Message-Id: <20220607213604.3346000-12-seanjc@google.com> Signed-off-by:
Paolo Bonzini <pbonzini@redhat.com> Signed-off-by:
Vitaly Kuznetsov <vkuznets@redhat.com>
-
Vitaly Kuznetsov authored
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2119111 commit 9389d577 Author: Paolo Bonzini <pbonzini@redhat.com> Date: Fri Jul 22 22:44:09 2022 +0000 Revert "KVM: nVMX: Expose load IA32_PERF_GLOBAL_CTRL VM-{Entry,Exit} control" This reverts commit 03a8871a. Since commit 03a8871a ("KVM: nVMX: Expose load IA32_PERF_GLOBAL_CTRL VM-{Entry,Exit} control"), KVM has taken ownership of the "load IA32_PERF_GLOBAL_CTRL" VMX entry/exit control bits, trying to set these bits in the IA32_VMX_TRUE_{ENTRY,EXIT}_CTLS MSRs if the guest's CPUID supports the architectural PMU (CPUID[EAX=0Ah].EAX[7:0]=1), and clear otherwise. This was a misguided attempt at mimicking what commit 5f76f6f5 ("KVM: nVMX: Do not expose MPX VMX controls when guest MPX disabled", 2018-10-01) did for MPX. However, that commit was a workaround for another KVM bug and not something that should be imitated. Mucking with the VMX MSRs creates a subtle, difficult to maintain ABI as KVM must ensure that any internal changes, e.g. to how KVM handles _any_ guest CPUID changes, yield the same functional result. Therefore, KVM's policy is to let userspace have full control of the guest vCPU model so long as the host kernel is not at risk. Now that KVM really truly ensures kvm_set_msr() will succeed by loading PERF_GLOBAL_CTRL if and only if it exists, revert KVM's misguided and roundabout behavior. Signed-off-by:
Paolo Bonzini <pbonzini@redhat.com> [sean: make it a pure revert] Signed-off-by:
Sean Christopherson <seanjc@google.com> Message-Id: <20220722224409.1336532-6-seanjc@google.com> Signed-off-by:
Paolo Bonzini <pbonzini@redhat.com> Signed-off-by:
Vitaly Kuznetsov <vkuznets@redhat.com>
-
Vitaly Kuznetsov authored
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2119111 commit 4496a6f9 Author: Sean Christopherson <seanjc@google.com> Date: Fri Jul 22 22:44:08 2022 +0000 KVM: nVMX: Attempt to load PERF_GLOBAL_CTRL on nVMX xfer iff it exists Attempt to load PERF_GLOBAL_CTRL during nested VM-Enter/VM-Exit if and only if the MSR exists (according to the guest vCPU model). KVM has very misguided handling of VM_{ENTRY,EXIT}_LOAD_IA32_PERF_GLOBAL_CTRL and attempts to force the nVMX MSR settings to match the vPMU model, i.e. to hide/expose the control based on whether or not the MSR exists from the guest's perspective. KVM's modifications fail to handle the scenario where the vPMU is hidden from the guest _after_ being exposed to the guest, e.g. by userspace doing multiple KVM_SET_CPUID2 calls, which is allowed if done before any KVM_RUN. nested_vmx_pmu_refresh() is called if and only if there's a recognized vPMU, i.e. KVM will leave the bits in the allow state and then ultimately reject the MSR load and WARN. KVM should not force the VMX MSRs in the first place. KVM taking control of the MSRs was a misguided attempt at mimicking what commit 5f76f6f5 ("KVM: nVMX: Do not expose MPX VMX controls when guest MPX disabled", 2018-10-01) did for MPX. However, the MPX commit was a workaround for another KVM bug and not something that should be imitated (and it should never been done in the first place). In other words, KVM's ABI _should_ be that userspace has full control over the MSRs, at which point triggering the WARN that loading the MSR must not fail is trivial. The intent of the WARN is still valid; KVM has consistency checks to ensure that vmcs12->{guest,host}_ia32_perf_global_ctrl is valid. The problem is that '0' must be considered a valid value at all times, and so the simple/obvious solution is to just not actually load the MSR when it does not exist. It is userspace's responsibility to provide a sane vCPU model, i.e. KVM is well within its ABI and Intel's VMX architecture to skip the loads if the MSR does not exist. Fixes: 03a8871a ("KVM: nVMX: Expose load IA32_PERF_GLOBAL_CTRL VM-{Entry,Exit} control") Cc: stable@vger.kernel.org Signed-off-by:
Sean Christopherson <seanjc@google.com> Message-Id: <20220722224409.1336532-5-seanjc@google.com> Signed-off-by:
Paolo Bonzini <pbonzini@redhat.com> Signed-off-by:
Vitaly Kuznetsov <vkuznets@redhat.com>
-
Vitaly Kuznetsov authored
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2119111 commit b663f0b5 Author: Sean Christopherson <seanjc@google.com> Date: Fri Jul 22 22:44:07 2022 +0000 KVM: VMX: Add helper to check if the guest PMU has PERF_GLOBAL_CTRL Add a helper to check of the guest PMU has PERF_GLOBAL_CTRL, which is unintuitive _and_ diverges from Intel's architecturally defined behavior. Even worse, KVM currently implements the check using two different (but equivalent) checks, _and_ there has been at least one attempt to add a _third_ flavor. Cc: stable@vger.kernel.org Signed-off-by:
Sean Christopherson <seanjc@google.com> Message-Id: <20220722224409.1336532-4-seanjc@google.com> Signed-off-by:
Paolo Bonzini <pbonzini@redhat.com> Signed-off-by:
Vitaly Kuznetsov <vkuznets@redhat.com>
-
Vitaly Kuznetsov authored
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2119111 commit 93255bf9 Author: Sean Christopherson <seanjc@google.com> Date: Fri Jul 22 22:44:06 2022 +0000 KVM: VMX: Mark all PERF_GLOBAL_(OVF)_CTRL bits reserved if there's no vPMU Mark all MSR_CORE_PERF_GLOBAL_CTRL and MSR_CORE_PERF_GLOBAL_OVF_CTRL bits as reserved if there is no guest vPMU. The nVMX VM-Entry consistency checks do not check for a valid vPMU prior to consuming the masks via kvm_valid_perf_global_ctrl(), i.e. may incorrectly allow a non-zero mask to be loaded via VM-Enter or VM-Exit (well, attempted to be loaded, the actual MSR load will be rejected by intel_is_valid_msr()). Fixes: f5132b01 ("KVM: Expose a version 2 architectural PMU to a guests") Cc: stable@vger.kernel.org Signed-off-by:
Sean Christopherson <seanjc@google.com> Message-Id: <20220722224409.1336532-3-seanjc@google.com> Signed-off-by:
Paolo Bonzini <pbonzini@redhat.com> Signed-off-by:
Vitaly Kuznetsov <vkuznets@redhat.com>
-
Vitaly Kuznetsov authored
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2119111 commit 8805875a Author: Paolo Bonzini <pbonzini@redhat.com> Date: Fri Jul 22 05:07:39 2022 -0400 Revert "KVM: nVMX: Do not expose MPX VMX controls when guest MPX disabled" Since commit 5f76f6f5 ("KVM: nVMX: Do not expose MPX VMX controls when guest MPX disabled"), KVM has taken ownership of the "load IA32_BNDCFGS" and "clear IA32_BNDCFGS" VMX entry/exit controls, trying to set these bits in the IA32_VMX_TRUE_{ENTRY,EXIT}_CTLS MSRs if the guest's CPUID supports MPX, and clear otherwise. The intent of the patch was to apply it to L0 in order to work around L1 kernels that lack the fix in commit 691bd434 ("kvm: vmx: allow host to access guest MSR_IA32_BNDCFGS", 2017-07-04): by hiding the control bits from L0, L1 hides BNDCFGS from KVM_GET_MSR_INDEX_LIST, and the L1 bug is neutralized even in the lack of commit 691bd434. This was perhaps a sensible kludge at the time, but a horrible idea in the long term and in fact it has not been extended to other CPUID bits like these: X86_FEATURE_LM => VM_EXIT_HOST_ADDR_SPACE_SIZE, VM_ENTRY_IA32E_MODE, VMX_MISC_SAVE_EFER_LMA X86_FEATURE_TSC => CPU_BASED_RDTSC_EXITING, CPU_BASED_USE_TSC_OFFSETTING, SECONDARY_EXEC_TSC_SCALING X86_FEATURE_INVPCID_SINGLE => SECONDARY_EXEC_ENABLE_INVPCID X86_FEATURE_MWAIT => CPU_BASED_MONITOR_EXITING, CPU_BASED_MWAIT_EXITING X86_FEATURE_INTEL_PT => SECONDARY_EXEC_PT_CONCEAL_VMX, SECONDARY_EXEC_PT_USE_GPA, VM_EXIT_CLEAR_IA32_RTIT_CTL, VM_ENTRY_LOAD_IA32_RTIT_CTL X86_FEATURE_XSAVES => SECONDARY_EXEC_XSAVES These days it's sort of common knowledge that any MSR in KVM_GET_MSR_INDEX_LIST must allow *at least* setting it with KVM_SET_MSR to a default value, so it is unlikely that something like commit 5f76f6f5 will be needed again. So revert it, at the potential cost of breaking L1s with a 6 year old kernel. While in principle the L0 owner doesn't control what runs on L1, such an old hypervisor would probably have many other bugs. Signed-off-by:
Paolo Bonzini <pbonzini@redhat.com> Signed-off-by:
Vitaly Kuznetsov <vkuznets@redhat.com>
-
Vitaly Kuznetsov authored
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2119111 commit f8ae08f9 Author: Sean Christopherson <seanjc@google.com> Date: Tue Jun 7 21:35:54 2022 +0000 KVM: nVMX: Let userspace set nVMX MSR to any _host_ supported value Restrict the nVMX MSRs based on KVM's config, not based on the guest's current config. Using the guest's config to audit the new config prevents userspace from restoring the original config (KVM's config) if at any point in the past the guest's config was restricted in any way. Fixes: 62cc6b9d ("KVM: nVMX: support restore of VMX capability MSRs") Cc: stable@vger.kernel.org Cc: David Matlack <dmatlack@google.com> Signed-off-by:
Sean Christopherson <seanjc@google.com> Message-Id: <20220607213604.3346000-6-seanjc@google.com> Signed-off-by:
Paolo Bonzini <pbonzini@redhat.com> Signed-off-by:
Vitaly Kuznetsov <vkuznets@redhat.com>
-