487 lines
18 KiB
ReStructuredText
487 lines
18 KiB
ReStructuredText
.. _interrupt-hld:
|
|
|
|
Interrupt Management high-level design
|
|
######################################
|
|
|
|
|
|
Overview
|
|
********
|
|
|
|
This document describes the interrupt management high-level design for
|
|
the ACRN hypervisor.
|
|
|
|
The ACRN hypervisor implements a simple but fully functional framework
|
|
to manage interrupts and exceptions, as show in
|
|
:numref:`interrupt-modules-overview`. In its native layer, it configures
|
|
the physical PIC, IOAPIC, and LAPIC to support different interrupt
|
|
sources from local timer/IPI to external INTx/MSI. In its virtual guest
|
|
layer, it emulates virtual PIC, virtual IOAPIC and virtual LAPIC, and
|
|
provides full APIs allowing virtual interrupt injection from emulated or
|
|
pass-thru devices.
|
|
|
|
.. figure:: images/interrupt-image3.png
|
|
:align: center
|
|
:width: 600px
|
|
:name: interrupt-modules-overview
|
|
|
|
ACRN Interrupt Modules Overview
|
|
|
|
In the software modules view shown in :numref:`interrupt-sw-modules`,
|
|
the ACRN hypervisor sets up the physical interrupt in its basic
|
|
interrupt modules (e.g., IOAPIC/LAPIC/IDT). It dispatches the interrupt
|
|
in the hypervisor interrupt flow control layer to the corresponding
|
|
handlers, that could be pre-defined IPI notification, timer, or runtime
|
|
registered pass-thru devices. The ACRN hypervisor then uses its VM
|
|
interfaces based on vPIC, vIOAPIC, and vMSI modules, to inject the
|
|
necessary virtual interrupt into the specific VM
|
|
|
|
.. figure:: images/interrupt-image2.png
|
|
:align: center
|
|
:width: 600px
|
|
:name: interrupt-sw-modules
|
|
|
|
ACRN Interrupt SW Modules Overview
|
|
|
|
Hypervisor Physical Interrupt Management
|
|
****************************************
|
|
|
|
The ACRN hypervisor is responsible for all the physical interrupt
|
|
handling. All physical interrupts are first handled in VMX root-mode.
|
|
The "external-interrupt exiting" bit in the VM-Execution controls field
|
|
is set to support this. The ACRN hypervisor also initializes all the
|
|
interrupt related modules such as IDT, PIC, IOAPIC, and LAPIC.
|
|
|
|
Only a few physical interrupts (such as TSC-Deadline timer and IOMMU)
|
|
are fully serviced in the hypervisor. Most interrupts come from pass-thru
|
|
devices whose interrupt are remapped to a virtual INTx/MSI source and
|
|
injected to the SOS or UOS, according to the pass-thru device
|
|
configuration.
|
|
|
|
The ACRN hypervisor does handle exceptions and any exception coming from
|
|
the VMX root-mode will lead to the CPU halting. For guest exception, the
|
|
hypervisor only traps #MC (machine check), prints a warning message, and
|
|
injects the exception back into the guest OS.
|
|
|
|
Physical Interrupt Initialization
|
|
=================================
|
|
|
|
After the ACRN hypervisor get control from the bootloader, it
|
|
initializes all physical interrupt-related modules for all the CPUs. The
|
|
ACRN hypervisor creates a framework to manage the physical interrupt for
|
|
hypervisor-local devices, pass-thru devices, and IPI between CPUs.
|
|
|
|
IDT
|
|
---
|
|
|
|
The ACRN hypervisor builds its native Interrupt Descriptor Table (IDT) during
|
|
interrupt initialization. For exceptions, it links to function
|
|
``dispatch_exception``, and for external interrupts it links to function
|
|
``dispatch_interrupt``. Please refer to ``arch/x86/idt.S`` for more details.
|
|
|
|
LAPIC
|
|
-----
|
|
|
|
The ACRN hypervisor resets LAPIC for each CPU, and provides basic APIs
|
|
used, for example, by the local timer (TSC Deadline)
|
|
program and IPI notification program. These APIs include
|
|
write_laipic_reg32, send_lapic_eoi, send_startup_ipi, and
|
|
send_single_ipi.
|
|
|
|
|
|
.. comment
|
|
|
|
Need reference to API doc generated from doxygen comments
|
|
in hypervisor/include/arch/x86/lapic.h
|
|
|
|
PIC/IOAPIC
|
|
----------
|
|
|
|
The ACRN hypervisor masks all interrupts from PIC, so all the
|
|
legacy interrupts from PIC (<16) are linked to IOAPIC, as shown in
|
|
:numref:`interrupt-pic-pin`.
|
|
|
|
ACRN will pre-allocate vectors and mask them for these legacy interrupts
|
|
in IOAPIC RTE. For others (>= 16) ACRN will mask them with vector 0 in
|
|
RTE, and the vector will be dynamically allocated on demand.
|
|
|
|
.. figure:: images/interrupt-image5.png
|
|
:align: center
|
|
:width: 600px
|
|
:name: interrupt-pic-pin
|
|
|
|
PIC & IOAPIC Pin Connection
|
|
|
|
Irq Desc
|
|
--------
|
|
|
|
The ACRN hypervisor maintains a global ``irq_desc[]`` array shared among the
|
|
CPUs and uses a flat mode to manage the interrupts. The same
|
|
vector is linked to the same IRQ number for all CPUs.
|
|
|
|
.. comment
|
|
|
|
Need reference to API doc generated from doxygen comments
|
|
for ``struct irq_desc`` in hypervisor/include/common/irq.h
|
|
|
|
|
|
The ``irq_desc[]`` array is indexed by the IRQ number. An
|
|
``irq_handler`` field can be set to a common edge, level, or quick
|
|
handler called from ``interrupt_dispatch``. The ``irq_desc`` structure
|
|
also contains the ``dev_list`` field to maintain this IRQ's action
|
|
handler list.
|
|
|
|
The global array ``vector_to_irq[]`` is used to manage the vector
|
|
resource. This array is initialized with value ``IRQ_INVALID`` for all
|
|
vectors, and will be set to a valid IRQ number after the corresponding
|
|
vector is registered.
|
|
|
|
For example, if the local timer registers interrupt with IRQ number 271 and
|
|
vector 0xEF, then the arrays mentioned above will be set to::
|
|
|
|
irq_desc[271].irq = 271;
|
|
irq_desc[271].vector = 0xEF;
|
|
vector_to_irq[0xEF] = 271;
|
|
|
|
Physical Interrupt Flow
|
|
=======================
|
|
|
|
|
|
When an physical interrupt occurs, and the CPU is running under VMX root
|
|
mode, the interrupt is triggered from the standard native irq flow:
|
|
interrupt gate to irq handler. However, if the CPU is running under VMX
|
|
non-root mode, an external interrupt will trigger a VM exit for reason
|
|
"external-interrupt". See :numref:`interrupt-handle-flow`.
|
|
|
|
.. figure:: images/interrupt-image4.png
|
|
:align: center
|
|
:width: 800px
|
|
:name: interrupt-handle-flow
|
|
|
|
ACRN Hypervisor Interrupt Handle Flow
|
|
|
|
After an interrupt happens (in either case noted above), the ACRN
|
|
hypervisor jumps to ``dispatch_interrupt``. This function will check
|
|
which vector caused this interrupt, and the corresponding ``irq_desc``
|
|
structure's ``irq_handler`` will be called for the service.
|
|
|
|
There are several irq_handler's defined in the ACRN hypervisor, as shown
|
|
in :numref:`interrupt-handle-flow`, designed for different uses. For
|
|
example, ``quick_handler_nolock`` is used when no critical data needs
|
|
protection in the action handlers; the VCPU notification IPI and local
|
|
timer are good example of this use case.
|
|
|
|
The more complicated ``common_dev_handler_level`` handler is intended
|
|
for pass-thru devices with level triggered interrupts. To avoid
|
|
continuously triggering the interrupt, it initially masks IOAPIC pin and
|
|
unmasks it only when the corresponding vIOAPIC pin gets an explicit EOI
|
|
ACK from the guest.
|
|
|
|
All the irq handler's finally call their own action handler list, as
|
|
shown here:
|
|
|
|
.. code-block: c
|
|
|
|
struct dev_handler_node \*dev = desc->dev_list;
|
|
while (dev != NULL) {
|
|
if (dev->dev_handler != NULL)
|
|
dev->dev_handler(desc->irq, dev->dev_data);
|
|
dev = dev->next;
|
|
}
|
|
|
|
The common APIs for registering, updating, and unregistering
|
|
interrupt handlers include irq_to_vector, dev_to_irq, dev_to_vector,
|
|
pri_register_handler, normal_register_handler,
|
|
unregister_handler_common, and update_irq_handler.
|
|
|
|
.. comment
|
|
|
|
Need reference to API doc generated from doxygen comments
|
|
in hypervisor/include/common/irq.h
|
|
|
|
.. _physical_interrupt_source:
|
|
|
|
Physical Interrupt Source
|
|
=========================
|
|
|
|
The ACRN hypervisor handles interrupts from many different sources, as
|
|
shown in :numref:`interrupt-source`:
|
|
|
|
|
|
.. list-table:: Physical Interrupt Source
|
|
:widths: 15 10 60
|
|
:header-rows: 1
|
|
:name: interrupt-source
|
|
|
|
* - Interrupt Source
|
|
- Vector
|
|
- Description
|
|
* - TSC Deadline Timer
|
|
- 0xEF
|
|
- The TSC deadline timer implements the timer framework in
|
|
the hypervisor based on the LAPIC TSC deadline. This interrupt's
|
|
target is specific to the CPU to which the LAPIC belongs.
|
|
* - CPU Startup IPI
|
|
- N/A
|
|
- The BSP needs to trigger an INIT-SIPI sequence to wake up the
|
|
APs. This interrupt's target is specified by the BSP calling
|
|
`` start_cpus()``.
|
|
* - VCPU Notify IPI
|
|
- 0xF0
|
|
- When the hypervisor needs to kick the VCPU out of VMX non-root
|
|
mode to do requests such as virtual interrupt injection, EPT
|
|
flush, etc. This interrupt's target is specified by function
|
|
``send_single_ipi()``.
|
|
* - IOMMU MSI
|
|
- dynamic
|
|
- IOMMU device supports an MSI interrupt. The vtd device driver in
|
|
the hypervisor will register an interrupt to handle dmar fault.
|
|
This interrupt's target is specified by vtd device driver.
|
|
* - PTdev INTx
|
|
- dynamic
|
|
- All native devices are owned by the guest (SOS or UOS), taking
|
|
advantage of the pass-thru method. Each pass-thru device connected
|
|
with IOAPIC/PIC (PTdev INTx) will register an interrupt when
|
|
its attached interrupt controller pin first gets unmasked.
|
|
This interrupt's target is defined by and RTE entry in the IOAPIC.
|
|
* - PTdev MSI
|
|
- dynamic
|
|
- All native devices are owned by the guest (SOS or UOS), taking
|
|
advantage of pass-thru method. Each pass-thru device with
|
|
enabled MSI (PTdev MSI) will register an interrupt when the SOS
|
|
does an explicit hypercall. This interrupt's target is defined
|
|
by an MSI address entry.
|
|
|
|
Softirq
|
|
=======
|
|
|
|
ACRN hypervisor implements a simple bottom-half softirq to execute the
|
|
interrupt handler, as showed in :numref:`interrupt-handle-flow`.
|
|
The softirq is executed when an interrupt is enabled. Several APIs for softirq
|
|
are defined including enable_softirq, disable_softirq, raise_softirq,
|
|
and exec_softirq.
|
|
|
|
.. comment
|
|
|
|
Need reference to API doc generated from doxygen comments
|
|
in hypervisor/include/common/softirq.h
|
|
|
|
Physical Exception Handling
|
|
===========================
|
|
|
|
As mentioned earlier, the ACRN hypervisor does not handle any
|
|
physical exceptions. The VMX root mode code path should guarantee no
|
|
exceptions are triggered while the hypervisor is running.
|
|
|
|
Guest Virtual Interrupt Management
|
|
**********************************
|
|
|
|
The previous sections describe physical interrupt management in the ACRN
|
|
hypervisor. After a physical interrupt happens, a registered action
|
|
handler is executed. Usually, the action handler represents a service
|
|
for virtual interrupt injection. For example, if an interrupt is
|
|
triggered from a pass-thru device, the appropriate virtual interrupt
|
|
should be injected into its guest VM.
|
|
|
|
The virtual interrupt injection could also come from an emulated device.
|
|
The I/O mediator in the Service OS (SOS) could trigger an interrupt
|
|
through a hypercall, and then do the virtual interrupt injection in the
|
|
hypervisor.
|
|
|
|
The following sections give an introduction to the ACRN guest virtual
|
|
interrupt management, including VCPU request for virtual interrupt kick
|
|
off, vPIC/vIOAPIC/vLAPIC for virtual interrupt injection interfaces,
|
|
physical-to-virtual interrupt mapping for a pass-thru device, and the
|
|
process of VMX interrupt/exception injection.
|
|
|
|
VCPU Request
|
|
============
|
|
|
|
As mentioned in `physical_interrupt_source`_, physical vector 0xF0 is
|
|
used to kick the VCPU out of its VMX non-root mode, and make a request
|
|
for virtual interrupt injection or other requests such as flush EPT.
|
|
|
|
The request-make API (vcpu_make_request) and eventid supports virtual interrupt
|
|
injection.
|
|
|
|
.. comment
|
|
|
|
Need reference to API doc generated from doxygen comments
|
|
in hypervisor/include/common/irq.h
|
|
|
|
There are requests for exception injection (ACRN_REQUEST_EXCP), vLAPIC
|
|
event (ACRN_REQUEST_EVENT), external interrupt from vPIC
|
|
(ACRN_REQUEST_EXTINT) and non-maskable-interrupt (ACRN_REQUEST_NMI).
|
|
|
|
The ``vcpu_make_request`` is necessary for a virtual interrupt
|
|
injection. If the target VCPU is running under VMX non-root mode, it
|
|
will send an IPI to kick it out and results in an external-interrupt
|
|
VM-Exit. The flow of :numref:`interrupt-handle-flow` could be executed
|
|
to complete the injection of a virtual interrupt.
|
|
|
|
There are some cases that do not need to send an IPI when making a
|
|
request because the CPU making the request is the target VCPU. For
|
|
example, the #GP exception request always happens on the current CPU
|
|
when an invalid emulation happens. An external interrupt for a pass-thru
|
|
device always happens on the VCPUs the device belongs to, so after it
|
|
triggers an external-interrupt VM-Exit, the current CPU is also the
|
|
target VCPU.
|
|
|
|
Virtual PIC
|
|
===========
|
|
|
|
The ACRN hypervisor emulates a vPIC for each VM based on IO ranges
|
|
0x20-0x21, 0xa0-0xa1, or 0x4d0-0x4d1.
|
|
|
|
If an interrupt source from vPIC needs to inject an interrupt,
|
|
the vpic_assert_irq, vpic_deassert_irq, or vpic_pulse_irq functions can
|
|
be called to make a request for ACRN_REQUEST_EXTINT or
|
|
ACRN_REQUEST_EVENT:
|
|
|
|
.. comment
|
|
|
|
Need reference to API doc generated from doxygen comments
|
|
in hypervisor/include/common/vpic.h
|
|
|
|
The vpic_pending_intr and vpic_intr_accepted APIs are used to query the
|
|
vector being injected and ACK the service, by moving the interrupt from
|
|
request service (IRR) to in service (ISR).
|
|
|
|
|
|
Virtual IOAPIC
|
|
==============
|
|
|
|
ACRN hypervisor emulates a vIOAPIC for each VM based on MMIO
|
|
VIOAPIC_BASE.
|
|
|
|
If an interrupt source from vIOAPIC needs to inject an interrupt, the
|
|
vioapic_assert_irq, vioapic_dessert_irq, and vioapic_pulse_irq APIs are
|
|
used to make a request for ACRN_REQUEST_EVENT.
|
|
|
|
As the vIOAPIC is always associated with a vLAPIC, the virtual interrupt
|
|
injection from vIOAPIC will finally trigger a request for an vLAPIC
|
|
event.
|
|
|
|
Virtual LAPIC
|
|
=============
|
|
|
|
The ACRN hypervisor emulates a vLAPIC for each VCPU based on MMIO
|
|
DEFAULT_APIC_BASE.
|
|
|
|
If an interrupt source from vLAPIC needs to inject an interrupt (e.g.,
|
|
from LVT such as an LAPIC timer, from vIOAPIC for a pass-thru device
|
|
interrupt, or from an emulated device for a MSI), vlapic_intr_level,
|
|
vlapic_intr_edge, vlapic_set_local_intr, vlapic_intr_msi,
|
|
vlapic_deliver_intr APIs need to be called, resulting in a request for
|
|
ACRN_REQUEST_EVENT.
|
|
|
|
.. comment
|
|
|
|
Need reference to API doc generated from doxygen comments
|
|
in hypervisor/include/common/vlapic.h
|
|
|
|
|
|
The vlapic_pending_intr and vlapic_intr_accepted APIs are used to query
|
|
the vector that needs to be injected and ACK
|
|
the service that move the interrupt from request service (IRR) to in
|
|
service (ISR).
|
|
|
|
By default, the ACRN hypervisor enables vAPIC to improve the performance of
|
|
a vLAPIC emulation.
|
|
|
|
Virtual Exception
|
|
=================
|
|
|
|
When doing emulation, an exception may be triggered in the hypervisor,
|
|
for example, if guest accesses an invalid vMSR register, or the
|
|
hypervisor needs to inject a #GP, or during instruction emulation, an
|
|
instruction fetch may access a non-exist page from rip_gva, and a #PF
|
|
must be injected.
|
|
|
|
ACRN hypervisor implements virtual exception injection using the
|
|
vcpu_queue_exception, vcpu_inject_gq, and vcpu_inject_pf APIs.
|
|
|
|
.. comment
|
|
|
|
Need reference to API doc generated from doxygen comments
|
|
in hypervisor/include/common/irq.h
|
|
|
|
The ACRN hypervisor uses vcpu_inject_gp/vcpu_inject_pf functions to
|
|
queue exception requests, and follows `Intel Software
|
|
Developer Manual, Vol 3. <SDM vol3>`_ - 6.15, Table 6-5
|
|
listing conditions for generating a double fault.
|
|
|
|
.. _SDM vol3: https://www.intel.com/content/www/us/en/architecture-and-technology/64-ia-32-architectures-software-developer-system-programming-manual-325384.html
|
|
|
|
Interrupt Mapping for a Pass-thru Device
|
|
========================================
|
|
|
|
A VM can control a PCI device directly through pass-thru device
|
|
assignment. The pass-thru entry is the major info object, and it is:
|
|
|
|
- A physical interrupt source, and could be a MSI/MSIX entry, PIC pins, or
|
|
IOAPIC pins
|
|
- Pass-thru remapping information between physical and virtual interrupt
|
|
source, for MSI/MSIX it is identified by a PCI device's BDF. For
|
|
PIC/IOAPIC it is identified by the pin number.
|
|
|
|
.. figure:: images/interrupt-image7.png
|
|
:align: center
|
|
:width: 600px
|
|
:name: interrupt-pass-thru
|
|
|
|
Pass-thru Device Entry Assignment
|
|
|
|
As shown in :numref:`interrupt-pass-thru` above, a UOS will assign its
|
|
pass-thru device entry by the DM, and it will fill its entry info from:
|
|
|
|
- vPIC/vIOAPIC interrupt mask/unmask
|
|
- MSI IOReq from UOS then MSI hypercall from SOS
|
|
|
|
The SOS adds its pass-thru device entry at runtime and fills info for:
|
|
|
|
- vPIC/vIOAPIC interrupt mask/unmask
|
|
- MSI hypercall from SOS
|
|
|
|
During the pass-thru device entry info filling, the hypervisor builds
|
|
native IOAPIC RTE/MSI entry based on vIOAPIC/vPIC/vMSI configuration,
|
|
and register the physical interrupt handler for it. Then with the pass-thru
|
|
device entry as the handler private data, the physical interrupt can
|
|
be linked to a virtual pin of a guest's vPIC/vIOAPIC or virtual vector of
|
|
a guest's vMSI. The handler then injects the corresponding virtual
|
|
interrupt into the guest, based on vPIC/vIOAPIC/vLAPIC APIs described
|
|
earlier.
|
|
|
|
Interrupt Storm Mitigation
|
|
==========================
|
|
|
|
When the Device Model (DM) launches a User OS (UOS), the ACRN hypervisor
|
|
will remap the interrupt for this user OS's pass-through devices. When
|
|
an interrupt occurs for a pass-through device, the CPU core is assigned
|
|
to that User OS gets trapped into the hypervisor. The benefit of such a
|
|
mechanism is that, should an interrupt storm happen in a particular UOS,
|
|
it will have only a minimal effect on the performance of the Service OS.
|
|
|
|
Interrupt/Exception Injection Process
|
|
=====================================
|
|
|
|
As shown in :numref:`interrupt-handle-flow`, the ACRN hypervisor injects
|
|
virtual interrupt/exception to the guest before its VM-Entry.
|
|
|
|
This is done by updating the VMX_ENTRY_INT_INFO_FIELD of the VCPU's
|
|
VMCS. As this field is unique, the interrupt/exception injection must
|
|
follow a priority rule to handle one-by-one.
|
|
|
|
:numref:`interrupt-injection` below shows the rules about how to inject
|
|
virtual interrupt/exception one-by-one. If a high priority
|
|
interrupt/exception was already injected, the next pending
|
|
interrupt/exception will enable an interrupt window where the next
|
|
injection will be done by the following VM-Exit, triggered by the
|
|
interrupt window.
|
|
|
|
.. figure:: images/interrupt-image6.png
|
|
:align: center
|
|
:width: 600px
|
|
:name: interrupt-injection
|
|
|
|
ACRN Hypervisor Interrupt/Exception Injection Process
|