KVM: VMX: Nop emulation of MSR_IA32_POWER_CTL
author Liran Alon <liran.alon@oracle.com>
Mon, 15 Apr 2019 15:45:26 +0000 (18:45 +0300)
committer Paolo Bonzini <pbonzini@redhat.com>
Tue, 30 Apr 2019 19:32:14 +0000 (21:32 +0200)
Since commits 668fffa3f838 ("kvm: better MWAIT emulation for guests")
and 4d5422cea3b6 ("KVM: X86: Provide a capability to disable MWAIT intercepts"),
KVM was modified to allow an admin to configure certain guests to execute
MONITOR/MWAIT inside guest without being intercepted by host.

This is useful when the admin wishes to allocate a dedicated logical
processor for each vCPU thread, which makes it safe for the guest to
completely control the power-state of the logical processor.

The ability to use this new KVM capability was introduced to QEMU by
commits 6f131f13e68d ("kvm: support -overcommit cpu-pm=on|off") and
2266d4431132 ("i386/cpu: make -cpu host support monitor/mwait").

However, exposing MONITOR/MWAIT to a Linux guest may cause its intel_idle
kernel module to execute c1e_promotion_disable(), which will attempt to
RDMSR/WRMSR from/to MSR_IA32_POWER_CTL to manipulate the "C1E Enable"
bit. This behaviour was introduced by commit
32e9518005c8 ("intel_idle: export both C1 and C1E").
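
The guest-side access pattern described above can be sketched as a
read-modify-write on a single bit. This is a user-space illustration,
not the intel_idle source: the MSR index 0x1fc and bit 1 for "C1E Enable"
are what I understand the kernel headers and intel_idle to use, while
POWER_CTL_C1E_ENABLE, the rdmsr_fn/wrmsr_fn callbacks and the fake_*
helpers are hypothetical names introduced only for this example.

```c
#include <stdint.h>

#define MSR_IA32_POWER_CTL   0x000001fc
/* Hypothetical name for the "C1E Enable" bit (assumed to be bit 1). */
#define POWER_CTL_C1E_ENABLE (1ULL << 1)

/* Hypothetical stand-ins for the privileged RDMSR/WRMSR instructions. */
typedef uint64_t (*rdmsr_fn)(uint32_t msr);
typedef void (*wrmsr_fn)(uint32_t msr, uint64_t val);

/* Read the MSR, clear "C1E Enable", write the result back. */
static void c1e_promotion_disable_sketch(rdmsr_fn rd, wrmsr_fn wr)
{
	uint64_t bits = rd(MSR_IA32_POWER_CTL);

	bits &= ~POWER_CTL_C1E_ENABLE;
	wr(MSR_IA32_POWER_CTL, bits);
}

/* In-memory fake MSR so the sketch can be exercised without hardware. */
static uint64_t fake_power_ctl;

static uint64_t fake_rdmsr(uint32_t msr)
{
	return msr == MSR_IA32_POWER_CTL ? fake_power_ctl : 0;
}

static void fake_wrmsr(uint32_t msr, uint64_t val)
{
	if (msr == MSR_IA32_POWER_CTL)
		fake_power_ctl = val;
}
```

In a real guest, either of the two accesses traps to the hypervisor, so
both the read and the write have to be handled for the sequence to succeed.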

Because KVM doesn't emulate this MSR, running KVM with ignore_msrs=0
will cause the above guest behaviour to raise a #GP, which will cause
the guest kernel to panic.

Therefore, add support for nop emulation of MSR_IA32_POWER_CTL to
avoid #GP in guest in this scenario.
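
As a minimal sketch of what "nop emulation" means here: a write is only
stored per vCPU, a read returns whatever was stored last, and nothing
reaches the host MSR. The model below is user-space C, not KVM code;
struct vcpu_model and the model_* functions are hypothetical names, and
the convention that a non-zero return stands for "inject #GP" is an
assumption made for this illustration.

```c
#include <stdint.h>

#define MSR_IA32_POWER_CTL 0x000001fc

/* Hypothetical per-vCPU state; only the shadow MSR value is modelled. */
struct vcpu_model {
	uint64_t msr_ia32_power_ctl;
};

/* Guest WRMSR: remember the value, do not touch any host state. */
static int model_set_msr(struct vcpu_model *v, uint32_t index, uint64_t data)
{
	switch (index) {
	case MSR_IA32_POWER_CTL:
		v->msr_ia32_power_ctl = data;
		return 0;
	default:
		return 1;	/* unhandled MSR: caller would inject #GP */
	}
}

/* Guest RDMSR: hand back the last value the guest wrote. */
static int model_get_msr(struct vcpu_model *v, uint32_t index, uint64_t *data)
{
	switch (index) {
	case MSR_IA32_POWER_CTL:
		*data = v->msr_ia32_power_ctl;
		return 0;
	default:
		return 1;	/* unhandled MSR: caller would inject #GP */
	}
}
```

With this shape the guest sees a consistent read-back of its own writes,
which is enough for a read-modify-write of one bit to complete without a
fault, even though the host power-state is unaffected.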

Future commits can optimise emulation further by reflecting guest
MSR changes to host MSR to provide guest with the ability to
fine-tune the dedicated logical processor power-state.

Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Signed-off-by: Liran Alon <liran.alon@oracle.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
arch/x86/kvm/vmx/vmx.c
arch/x86/kvm/vmx/vmx.h
arch/x86/kvm/x86.c

index c40fb667002c712aaadd8700c015b67a550ed83f..3fe2020e3bc402e5fff17ef1423fb4da6419776c 100644
@@ -1692,6 +1692,9 @@ static int vmx_get_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
        case MSR_IA32_SYSENTER_ESP:
                msr_info->data = vmcs_readl(GUEST_SYSENTER_ESP);
                break;
+       case MSR_IA32_POWER_CTL:
+               msr_info->data = vmx->msr_ia32_power_ctl;
+               break;
        case MSR_IA32_BNDCFGS:
                if (!kvm_mpx_supported() ||
                    (!msr_info->host_initiated &&
@@ -1822,6 +1825,9 @@ static int vmx_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
        case MSR_IA32_SYSENTER_ESP:
                vmcs_writel(GUEST_SYSENTER_ESP, data);
                break;
+       case MSR_IA32_POWER_CTL:
+               vmx->msr_ia32_power_ctl = data;
+               break;
        case MSR_IA32_BNDCFGS:
                if (!kvm_mpx_supported() ||
                    (!msr_info->host_initiated &&
index f879529906b48cd84e99cc0f672210aaeaffeabd..1e42f983e0f1aca33799765ef5c9ff3cf910053f 100644
@@ -257,6 +257,8 @@ struct vcpu_vmx {
 
        unsigned long host_debugctlmsr;
 
+       u64 msr_ia32_power_ctl;
+
        /*
         * Only bits masked by msr_ia32_feature_control_valid_bits can be set in
         * msr_ia32_feature_control. FEATURE_CONTROL_LOCKED is always included
index cedd396e3003e7a1e35b4565990b96e70b098157..c09507057743c91019341a1adf38d32176483aad 100644
@@ -1170,6 +1170,7 @@ static u32 emulated_msrs[] = {
        MSR_PLATFORM_INFO,
        MSR_MISC_FEATURES_ENABLES,
        MSR_AMD64_VIRT_SPEC_CTRL,
+       MSR_IA32_POWER_CTL,
 };
 
 static unsigned num_emulated_msrs;