KVM: x86: Move "ack" phase of local APIC IRQ delivery to separate API
author Sean Christopherson <seanjc@google.com>
Fri, 6 Sep 2024 04:34:07 +0000 (21:34 -0700)
committer Sean Christopherson <seanjc@google.com>
Tue, 10 Sep 2024 03:14:57 +0000 (20:14 -0700)
commit a194a3a13ce0b4cce4b52f328405891ef3a85cb9
tree 4dfd600c1f2ef8fddb056aef58fde305ae58005c
parent 7efb4d8a392a18e37fcdb5e77c111af6e9a9e2f2

Split the "ack" phase, i.e. the movement of an interrupt from IRR=>ISR,
out of kvm_get_apic_interrupt() and into a separate API so that nested
VMX can acknowledge a specific interrupt _after_ emulating a VM-Exit from
L2 to L1.
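
As a rough illustration of the split (not the kernel code itself), the toy
model below separates a side-effect-free "find" step from an "ack" step that
moves one specific vector from IRR to ISR. All toy_* names are hypothetical,
and the model ignores real APIC details such as PPR-based priority masking:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define TOY_NR_VECTORS 256

/* Toy local APIC: one bit per vector in each 256-bit register. */
struct toy_apic {
	uint32_t irr[TOY_NR_VECTORS / 32];	/* requested (pending) */
	uint32_t isr[TOY_NR_VECTORS / 32];	/* in service */
};

static void toy_set_bit(uint32_t *reg, int vec)
{
	reg[vec / 32] |= UINT32_C(1) << (vec % 32);
}

static void toy_clear_bit(uint32_t *reg, int vec)
{
	reg[vec / 32] &= ~(UINT32_C(1) << (vec % 32));
}

static bool toy_test_bit(const uint32_t *reg, int vec)
{
	return (reg[vec / 32] >> (vec % 32)) & 1;
}

/* "Find" phase: return the highest pending vector, or -1. No side effects. */
static int toy_apic_find_highest_irr(const struct toy_apic *apic)
{
	for (int vec = TOY_NR_VECTORS - 1; vec >= 0; vec--)
		if (toy_test_bit(apic->irr, vec))
			return vec;
	return -1;
}

/* "Ack" phase: move one specific vector IRR=>ISR. */
static void toy_apic_ack(struct toy_apic *apic, int vec)
{
	assert(vec >= 0 && vec < TOY_NR_VECTORS);
	toy_clear_bit(apic->irr, vec);
	toy_set_bit(apic->isr, vec);
}

/* The combined find-and-ack operation, now built from the two phases. */
static int toy_apic_get_interrupt(struct toy_apic *apic)
{
	int vec = toy_apic_find_highest_irr(apic);

	if (vec >= 0)
		toy_apic_ack(apic, vec);
	return vec;
}
```

With the phases separated, a caller can look up the vector first, do other
work (such as emulating a VM-Exit), and only then call the ack step.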

To correctly emulate nested posted interrupts while APICv is active, KVM
must:

  1. Find the highest pending interrupt.
  2. Check if that IRQ is L2's notification vector.
  3. Emulate VM-Exit if the IRQ is NOT the notification vector.
  4. ACK the IRQ in L1 _after_ VM-Exit.

When APICv is active, the process of moving the IRQ from the IRR to the
ISR also requires a VMWRITE to update vmcs01.GUEST_INTERRUPT_STATUS.SVI,
and so acknowledging the interrupt before switching to vmcs01 would result
in marking the IRQ as in-service in the wrong VMCS.
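
The ordering constraint can be sketched with a small self-contained model.
This is only an illustration under stated assumptions: the toy_* names are
hypothetical, SVI is modeled as a plain per-VMCS integer, and the handling of
the notification vector (clearing it and setting a flag) is a placeholder,
not what KVM does:

```c
#include <assert.h>
#include <stdbool.h>

enum toy_vmcs_id { TOY_VMCS01, TOY_VMCS02 };

struct toy_vcpu {
	bool irr[256];			/* L1's pending vectors */
	bool isr[256];			/* L1's in-service vectors */
	enum toy_vmcs_id loaded;	/* VMCS that a "VMWRITE" would hit */
	int svi[2];			/* per-VMCS SVI, -1 if none */
	int l2_notification_vector;
	bool l2_pi_pending;		/* placeholder for posted-IRQ handling */
};

static int toy_highest_pending(const struct toy_vcpu *v)
{
	for (int vec = 255; vec >= 0; vec--)
		if (v->irr[vec])
			return vec;
	return -1;
}

/* Ack: IRR=>ISR, plus a "VMWRITE" of SVI into whichever VMCS is loaded. */
static void toy_ack(struct toy_vcpu *v, int vec)
{
	assert(vec >= 0 && vec < 256);
	v->irr[vec] = false;
	v->isr[vec] = true;
	v->svi[v->loaded] = vec;
}

/* Emulated VM-Exit from L2 to L1: switch from vmcs02 to vmcs01. */
static void toy_emulate_vmexit(struct toy_vcpu *v)
{
	v->loaded = TOY_VMCS01;
}

/* Returns the vector delivered to L1 via VM-Exit, or -1 if none. */
static int toy_handle_irq_in_l2(struct toy_vcpu *v)
{
	int vec = toy_highest_pending(v);		/* step 1 */

	if (vec < 0)
		return -1;

	if (vec == v->l2_notification_vector) {		/* step 2 */
		/* Assumption: consume it for L2 without a VM-Exit. */
		v->irr[vec] = false;
		v->l2_pi_pending = true;
		return -1;
	}

	toy_emulate_vmexit(v);				/* step 3 */
	toy_ack(v, vec);				/* step 4: SVI lands in vmcs01 */
	return vec;
}
```

Swapping steps 3 and 4 in this model would make toy_ack() record the vector
in svi[TOY_VMCS02], which is the "wrong VMCS" problem described above.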

KVM currently fudges around this issue by doing kvm_get_apic_interrupt()
smack dab in the middle of emulating VM-Exit, but that hack doesn't play
nice with nested posted interrupts, as notification vector IRQs don't
trigger a VM-Exit in the first place.

Cc: Nathan Chancellor <nathan@kernel.org>
Link: https://lore.kernel.org/r/20240906043413.1049633-2-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
arch/x86/kvm/lapic.c
arch/x86/kvm/lapic.h