doc: terminology cleanup in hv-dev-passthrough.rst

- Change UOS and SOS to User VM and Service VM, respectively
- Change guest to VM or similar depending on context
- Clean up some of the grammar

Signed-off-by: Amy Reyes <amy.reyes@intel.com>
@@ -21,10 +21,10 @@ discussed here.
 --------
 
 In the ACRN project, device emulation means emulating all existing
-hardware resource through a software component device model running in
-the Service OS (SOS). Device emulation must maintain the same SW
+hardware resources through a software component device model running in
+the Service VM. Device emulation must maintain the same SW
 interface as a native device, providing transparency to the VM software
-stack. Passthrough implemented in hypervisor assigns a physical device
+stack. Passthrough implemented in the hypervisor assigns a physical device
 to a VM so the VM can access the hardware device directly with minimal
 (if any) VMM involvement.
 
@@ -38,23 +38,23 @@ can't support device sharing.
    :align: center
    :name: emu-passthru-diff
 
-   Difference between Emulation and passthrough
+   Difference between emulation and passthrough
 
 Passthrough in the hypervisor provides the following functionalities to
-allow VM to access PCI devices directly:
+allow the VM to access PCI devices directly:
 
-- VT-d DMA Remapping for PCI devices: hypervisor will setup DMA
+- VT-d DMA remapping for PCI devices: hypervisor will set up DMA
   remapping during VM initialization phase.
-- VT-d Interrupt-remapping for PCI devices: hypervisor will enable
+- VT-d interrupt-remapping for PCI devices: hypervisor will enable
   VT-d interrupt-remapping for PCI devices for security considerations.
-- MMIO Remapping between virtual and physical BAR
-- Device configuration Emulation
+- MMIO remapping between virtual and physical BAR
+- Device configuration emulation
 - Remapping interrupts for PCI devices
-- ACPI configuration Virtualization
+- ACPI configuration virtualization
 - GSI sharing violation check
 
-The following diagram details passthrough initialization control flow in ACRN
-for post-launched VM:
+The following diagram details the passthrough initialization control flow in
+ACRN for a post-launched VM:
 
 .. figure:: images/passthru-image22.png
    :align: center
@@ -70,59 +70,61 @@ passthrough, as detailed here:
 .. figure:: images/passthru-image77.png
    :align: center
 
-   Passthrough Device Status
+   Passthrough device status
 
 Owner of Passthrough Devices
 ****************************
 
-ACRN hypervisor will do PCI enumeration to discover the PCI devices on the platform.
-According to the hypervisor/VM configurations, the owner of these PCI devices can be
-one the following 4 cases:
+ACRN hypervisor will do PCI enumeration to discover the PCI devices on the
+platform. According to the hypervisor/VM configurations, the owner of these PCI
+devices can be one of the following 4 cases:
 
-- **Hypervisor**: hypervisor uses UART device as the console in debug version for
-  debug purpose, so the UART device is owned by hypervisor and is not visible
-  to any VM. For now, UART is the only pci device could be owned by hypervisor.
-- **Pre-launched VM**: The passthrough devices will be used in a pre-launched VM is
-  predefined in VM configuration. These passthrough devices are owned by the
-  pre-launched VM after the VM is created. These devices will not be removed
-  from the pre-launched VM. There could be pre-launched VM(s) in partitioned
-  mode and hybrid mode.
-- **Service VM**: All the passthrough devices except these described above (owned by
-  hypervisor or pre-launched VM(s)) are assigned to Service VM. And some of these devices
-  can be assigned to a post-launched VM according to the passthrough device list
-  specified in the parameters of the ACRN DM.
-- **Post-launched VM**: A list of passthrough devices can be specified in the parameters of
-  the ACRN DM. When creating a post-launched VM, these specified devices will be moved
-  from Service VM domain to the post-launched VM domain. After the post-launched VM is
-  powered-off, these devices will be moved back to Service VM domain.
+- **Hypervisor**: Hypervisor uses a UART device as the console in debug version
+  for debugging purposes, so the UART device is owned by the hypervisor and is
+  not visible to any VM. For now, UART is the only PCI device that can be owned
+  by the hypervisor.
+- **Pre-launched VM**: The passthrough devices that will be used in a
+  pre-launched VM are predefined in the VM configuration. These passthrough
+  devices are owned by the pre-launched VM after the VM is created. These
+  devices will not be removed from the pre-launched VM. There can be
+  pre-launched VMs in partitioned mode and hybrid mode.
+- **Service VM**: All the passthrough devices except those described above
+  (owned by hypervisor or pre-launched VMs) are assigned to the Service VM. And
+  some of these devices can be assigned to a post-launched VM according to the
+  passthrough device list specified in the parameters of the ACRN Device Model.
+- **Post-launched VM**: A list of passthrough devices can be specified in the
+  parameters of the ACRN Device Model. When creating a post-launched VM, these
+  specified devices will be moved from the Service VM domain to the
+  post-launched VM domain. After the post-launched VM is powered-off, these
+  devices will be moved back to the Service VM domain.
 
 
 VT-d DMA Remapping
 ******************
 
 To enable passthrough, for VM DMA access the VM can only
-support GPA, while physical DMA requires HPA. One work-around
+support GPA, while a physical DMA requires HPA. One work-around
 is building identity mapping so that GPA is equal to HPA, but this
-is not recommended as some VM don't support relocation well. To
+is not recommended as some VMs don't support relocation well. To
 address this issue, Intel introduces VT-d in the chipset to add one
 remapping engine to translate GPA to HPA for DMA operations.
 
-Each VT-d engine (DMAR Unit), maintains a remapping structure
+Each VT-d engine (DMAR Unit) maintains a remapping structure
 similar to a page table with device BDF (Bus/Dev/Func) as input and final
 page table for GPA/HPA translation as output. The GPA/HPA translation
 page table is similar to a normal multi-level page table.
 
-VM DMA depends on Intel VT-d to do the translation from GPA to HPA, so we
-need to enable VT-d IOMMU engine in ACRN before we can passthrough any device. Service VM
-in ACRN is a VM running in non-root mode which also depends
-on VT-d to access a device. In Service VM DMA remapping
-engine settings, GPA is equal to HPA.
+VM DMA depends on Intel VT-d to do the translation from GPA to HPA, so we need
+to enable VT-d IOMMU engine in ACRN before we can passthrough any device. The
+Service VM in ACRN is a VM running in non-root mode which also depends on VT-d
+to access a device. In Service VM DMA remapping engine settings, GPA is equal to
+HPA.
 
-ACRN hypervisor checks DMA-Remapping Hardware unit Definition (DRHD) in
-host DMAR ACPI table to get basic info, then sets up each DMAR unit. For
-simplicity, ACRN reuses EPT table as the translation table in DMAR
-unit for each passthrough device. The control flow of assigning and de-assigning
-a passthrough device to/from a post-launched VM is shown in the following figures:
+ACRN hypervisor checks DMA-Remapping Hardware unit Definition (DRHD) in the host
+DMAR ACPI table to get basic information, then sets up each DMAR unit. For
+simplicity, ACRN reuses the EPT table as the translation table in the DMAR unit
+for each passthrough device. The control flow of assigning and deassigning a
+passthrough device to/from a post-launched VM is shown in the following figures:
 
 .. figure:: images/passthru-image86.png
    :align: center
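As an editorial aside, the DMAR lookup this hunk describes (device BDF in, GPA-to-HPA translation out, with identity mapping for the Service VM) can be sketched roughly as below. All names (`bdf`, `ctx_table`, `dma_translate`, the fixed offset standing in for a real EPT walk) are illustrative assumptions, not ACRN's actual code:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical sketch: a DMAR unit maps a device BDF (source id) to an
 * owning domain, and the GPA->HPA walk mirrors a multi-level page table.
 * For the Service VM domain the mapping is identity (GPA == HPA). */

#define SERVICE_VM_DOMAIN 0U

/* Compose a 16-bit source id from bus/device/function, as VT-d does. */
static inline uint16_t bdf(uint8_t bus, uint8_t dev, uint8_t func)
{
    return (uint16_t)(((uint16_t)bus << 8) | ((dev & 0x1fU) << 3) | (func & 0x7U));
}

/* Toy context table: source id -> owning domain (0 = Service VM). */
static uint32_t ctx_table[65536];

static void assign_device(uint16_t sid, uint32_t domain)
{
    ctx_table[sid] = domain;
}

/* GPA->HPA translation: identity for the Service VM domain; a fixed
 * offset stands in for a real per-domain EPT walk. */
static uint64_t dma_translate(uint16_t sid, uint64_t gpa)
{
    if (ctx_table[sid] == SERVICE_VM_DOMAIN)
        return gpa;                  /* identity: GPA == HPA */
    return gpa + 0x100000000ULL;     /* stand-in for an EPT lookup */
}
```

The point of the sketch is only the shape of the relation: BDF selects a domain, the domain selects a translation.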
@@ -132,7 +134,7 @@ a passthrough device to/from a post-launched VM is shown in the following figure
 .. figure:: images/passthru-image42.png
    :align: center
 
-   ptdev de-assignment control flow
+   ptdev deassignment control flow
 
 .. _vtd-posted-interrupt:
 
@@ -140,47 +142,44 @@ a passthrough device to/from a post-launched VM is shown in the following figure
 VT-d Interrupt-Remapping
 ************************
 
-The VT-d interrupt-remapping architecture enables system software to
-control and censor external interrupt requests generated by all sources
-including those from interrupt controllers (I/OxAPICs), MSI/MSI-X capable
-devices including endpoints, root-ports and Root-Complex integrated
-end-points.
-ACRN forces to enabled VT-d interrupt-remapping feature for security reasons.
-If the VT-d hardware doesn't support interrupt-remapping, then ACRN will
-refuse to boot VMs.
-
-VT-d Interrupt-remapping is NOT related to the translation from physical
+The VT-d interrupt-remapping architecture enables system software to control and
+censor external interrupt requests generated by all sources including those from
+interrupt controllers (I/OxAPICs), MSI/MSI-X capable devices including
+endpoints, root-ports and Root-Complex integrated end-points. ACRN requires
+enabling the VT-d interrupt-remapping feature for security reasons. If the VT-d
+hardware doesn't support interrupt-remapping, then ACRN will refuse to boot VMs.
+VT-d interrupt-remapping is NOT related to the translation from physical
 interrupt to virtual interrupt or vice versa. The term VT-d interrupt-remapping
 remaps the interrupt index in the VT-d interrupt-remapping table to the physical
-interrupt vector after checking the external interrupt request is valid. Translation
-physical vector to virtual vector is still needed to be done by hypervisor, which is
-also described in the below section :ref:`interrupt-remapping`.
+interrupt vector after checking the external interrupt request is valid. The
+hypervisor still needs to translate the physical vector to the virtual vector,
+which is also described in the below section :ref:`interrupt-remapping`.
 
 VT-d posted interrupt (PI) enables direct delivery of external interrupts from
-passthrough devices to VMs without having to exit to hypervisor, thereby improving
-interrupt performance. ACRN uses VT-d posted interrupts if the platform
-supports them. VT-d distinguishes between remapped
-and posted interrupt modes by bit 15 in the low 64-bit of the IRTE. If cleared the
-entry is remapped, if set it's posted.
-The idea for posted interrupt is to keep a Posted Interrupt Descriptor (PID) in memory.
-The PID is a 64-byte data structure that contains several fields:
+passthrough devices to VMs without having to exit to the hypervisor, thereby
+improving interrupt performance. ACRN uses VT-d posted interrupts if the
+platform supports them. VT-d distinguishes between remapped and posted interrupt
+modes by bit 15 in the low 64-bit of the interrupt-remapping table entry. If
+cleared, the entry is remapped. If set, it's posted. The idea is to keep a
+Posted Interrupt Descriptor (PID) in memory. The PID is a 64-byte data structure
+that contains several fields:
 
 Posted Interrupt Request (PIR):
   a 256-bit field, one bit per request vector;
-  this is where the interrupts are posted;
+  this is where the interrupts are posted.
 
 Suppress Notification (SN):
-  determines whether to notify (``SN=0``) or not notify (``SN=1``)
-  the CPU for non-urgent interrupts. For ACRN,
-  all interrupts are treated as non-urgent. ACRN sets SN=0 during initialization
-  and then never changes it at runtime;
+  determines whether to notify (``SN=0``) or not notify (``SN=1``) the CPU for
+  non-urgent interrupts. For ACRN, all interrupts are treated as non-urgent.
+  ACRN sets SN=0 during initialization and then never changes it at runtime.
 
 Notification Vector (NV):
   the CPU must be notified with an interrupt and this
-  field specifies the vector for notification;
+  field specifies the vector for notification.
 
 Notification Destination (NDST):
   the physical APIC-ID of the destination.
-  ACRN does not support vCPU migration, one vCPU always runs on the same pCPU,
+  ACRN does not support vCPU migration. One vCPU always runs on the same pCPU,
   so for ACRN, NDST is never changed after initialization.
 
 Outstanding Notification (ON):
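As an aside, the remapped-versus-posted mode bit and the PID fields this hunk documents can be sketched as follows. The struct layout is deliberately simplified for illustration (the real PID is a packed 64-byte structure); only the mode bit position and the 256-bit PIR indexing follow the text above:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Bit 15 of the low 64 bits of an interrupt-remapping table entry
 * selects posted (1) versus remapped (0) mode. */
#define IRTE_MODE_POSTED (1ULL << 15)

/* Simplified Posted Interrupt Descriptor: 256-bit PIR plus the
 * SN/NV/NDST control fields described in the text (not the exact
 * hardware layout). */
struct pi_desc {
    uint64_t pir[4];  /* Posted Interrupt Request, one bit per vector */
    uint8_t  sn;      /* Suppress Notification (ACRN keeps SN=0)      */
    uint8_t  nv;      /* Notification Vector                          */
    uint32_t ndst;    /* Notification Destination (physical APIC-ID)  */
};

static int irte_is_posted(uint64_t irte_lo)
{
    return (irte_lo & IRTE_MODE_POSTED) != 0;
}

/* Post a vector: set the corresponding bit in the PIR. */
static void pid_post(struct pi_desc *pid, uint8_t vec)
{
    pid->pir[vec / 64] |= 1ULL << (vec % 64);
}

static int pid_is_pending(const struct pi_desc *pid, uint8_t vec)
{
    return (pid->pir[vec / 64] >> (vec % 64)) & 1U;
}
```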
@@ -188,10 +187,10 @@ Outstanding Notification (ON):
 
 The ACRN scheduler supports vCPU scheduling, where two or more vCPUs can
 share the same pCPU using a time sharing technique. One issue emerges
-here for VT-d posted interrupt handling process, where IRQs could happen
+here for the VT-d posted interrupt handling process, where IRQs could happen
 when the target vCPU is in a halted state. We need to handle the case
 where the running vCPU disrupted by the external interrupt, is not the
-target vCPU that an external interrupt should be delivered.
+target vCPU that should have received the external interrupt.
 
 Consider this scenario:
 
@@ -206,7 +205,7 @@ allocate the same Activation Notification Vector (ANV) to all vCPUs.
 To circumvent this issue, ACRN allocates unique ANVs for each vCPU that
 belongs to the same pCPU. The ANVs need only be unique within each pCPU,
 not across all vCPUs. Since vCPU0's ANV is different from vCPU1's ANV,
-if a vCPU0 is in a halted state, external interrupts from an assigned
+if vCPU0 is in a halted state, external interrupts from an assigned
 device destined to vCPU0 delivered through the PID will not trigger the
 posted interrupt processing. Instead, a VMExit to ACRN happens that can
 then process the event such as waking up the halted vCPU0 and kick it
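A minimal sketch of the per-pCPU uniqueness property described in this hunk: since ANVs only need to differ among vCPUs sharing one pCPU, a base vector plus the vCPU's position on that pCPU suffices. The base vector value and function name here are made-up illustrations, not ACRN's allocation scheme:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical first notification vector; real values are platform/
 * implementation specific. */
#define ANV_BASE 0xe0U

/* Unique ANV per vCPU position on a given pCPU; two vCPUs on the same
 * pCPU always get different vectors. */
static uint8_t alloc_anv(uint8_t pos_on_pcpu)
{
    return (uint8_t)(ANV_BASE + pos_on_pcpu);
}
```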
@@ -233,15 +232,15 @@ related vCPU array.
 An example to illustrate our solution:
 
 .. figure:: images/passthru-image50.png
    :align: center
 
-ACRN sets ``SN=0`` during initialization and then never change it at
+ACRN sets ``SN=0`` during initialization and then never changes it at
 runtime. This means posted interrupt notification is never suppressed.
 After posting the interrupt in Posted Interrupt Request (PIR), VT-d will
 always notify the CPU using the interrupt vector NV, in both root and
 non-root mode. With this scheme, if the target vCPU is running under
-VMX non-root mode, it will receive the interrupts coming from
-passed-through device without a VMExit (and therefore without any
+VMX non-root mode, it will receive the interrupts coming from the
+passthrough device without a VMExit (and therefore without any
 intervention of the ACRN hypervisor).
 
 If the target vCPU is in a halted state (under VMX non-root mode), a
@@ -254,11 +253,11 @@ immediately.
 MMIO Remapping
 **************
 
-For PCI MMIO BAR, hypervisor builds EPT mapping between virtual BAR and
-physical BAR, then VM can access MMIO directly.
-There is one exception, MSI-X table is also in a MMIO BAR. Hypervisor needs to trap the
-accesses to MSI-X table. So the page(s) having MSI-X table should not be accessed by guest
-directly. EPT mapping is not built for these pages having MSI-X table.
+For PCI MMIO BAR, the hypervisor builds EPT mapping between the virtual BAR and
+physical BAR, then the VM can access MMIO directly. There is one exception: an
+MSI-X table is also in a MMIO BAR. The hypervisor needs to trap the accesses to
+the MSI-X table. So the pages that have an MSI-X table should not be accessed by
+the VM directly. EPT mapping is not built for pages that have an MSI-X table.
 
 Device Configuration Emulation
 ******************************
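The "4KB adjusted up and down" rule from this hunk — leave unmapped (and therefore trapped) every page that overlaps the MSI-X table region — can be sketched as a small predicate. Function names and offsets are illustrative, not ACRN's implementation:

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SIZE 0x1000ULL

static uint64_t page_round_down(uint64_t x)
{
    return x & ~(PAGE_SIZE - 1);
}

static uint64_t page_round_up(uint64_t x)
{
    return (x + PAGE_SIZE - 1) & ~(PAGE_SIZE - 1);
}

/* Should the 4KB page at page_off (within the BAR) be left without an
 * EPT mapping?  Yes iff it overlaps the MSI-X table region rounded out
 * to page boundaries ("4KB adjusted up and down"). */
static int page_has_msix(uint64_t page_off, uint64_t tbl_off, uint64_t tbl_size)
{
    uint64_t lo = page_round_down(tbl_off);
    uint64_t hi = page_round_up(tbl_off + tbl_size);

    return (page_off >= lo) && (page_off < hi);
}
```

For example, a table at BAR offset 0x800 of size 0x100 makes only the first page of the BAR trap-only; the page at 0x1000 can still be mapped for direct access.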
@@ -266,25 +265,26 @@ Device Configuration Emulation
 The PCI configuration space can be accessed by a PCI-compatible
 Configuration Mechanism (IO port 0xCF8/CFC) and the PCI Express Enhanced
 Configuration Access Mechanism (PCI MMCONFIG). The ACRN hypervisor traps
-this PCI configuration space access and emulate it. Refer to :ref:`split-device-model` for details.
+this PCI configuration space access and emulates it. Refer to :ref:`split-device-model` for details.
 
 MSI-X Table Emulation
 *********************
 
-VM accesses to MSI-X table should be trapped so that hypervisor has the
+VM accesses to an MSI-X table should be trapped so that the hypervisor has the
 information to map the virtual vector and physical vector. EPT mapping should
-be skipped for the 4KB pages having MSI-X table.
+be skipped for the 4KB pages that have an MSI-X table.
 
-There are three situations for the emulation of MSI-X table:
+There are three situations for the emulation of MSI-X tables:
 
-- **Service VM**: accesses to MSI-X table are handled by HV MMIO handler (4KB adjusted up
-  and down). HV will remap interrupts.
-- **Post-launched VM**: accesses to MSI-X Tables are handled by DM MMIO handler
-  (4KB adjusted up and down) and when DM (Service VM) writes to the table, it will be
-  intercepted by HV MMIO handler and HV will remap interrupts.
-- **Pre-launched VM**: Writes to MMIO region in MSI-X Table BAR handled by HV MMIO
-  handler. If the offset falls within the MSI-X table (offset, offset+tables_size),
-  HV remaps interrupts.
+- **Service VM**: Accesses to an MSI-X table are handled by the hypervisor MMIO
+  handler (4KB adjusted up and down). The hypervisor remaps the interrupts.
+- **Post-launched VM**: Accesses to an MSI-X table are handled by the Device
+  Model MMIO handler (4KB adjusted up and down). When the Device Model (Service
+  VM) writes to the table, it will be intercepted by the hypervisor MMIO
+  handler. The hypervisor remaps the interrupts.
+- **Pre-launched VM**: Writes to the MMIO region in an MSI-X table BAR are
+  handled by the hypervisor MMIO handler. If the offset falls within the MSI-X
+  table (offset, offset+tables_size), the hypervisor remaps the interrupts.
 
 
 .. _interrupt-remapping:
@@ -292,7 +292,7 @@ There are three situations for the emulation of MSI-X table:
 Interrupt Remapping
 *******************
 
-When the physical interrupt of a passthrough device happens, hypervisor has
+When the physical interrupt of a passthrough device happens, the hypervisor has
 to distribute it to the relevant VM according to interrupt remapping
 relationships. The structure ``ptirq_remapping_info`` is used to define
 the subordination relation between physical interrupt and VM, the
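The remapping relation this hunk describes — a physical interrupt resolving to a VM and a virtual vector — can be sketched as a simple lookup. The struct below only mirrors the *role* of ``ptirq_remapping_info``; its fields and the `lookup` helper are illustrative, not the real ACRN layout:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative subordination record: which VM and virtual vector a
 * physical vector maps to. */
struct ptirq_entry {
    uint32_t phys_vec;
    uint32_t vm_id;
    uint32_t virt_vec;
};

/* Find the remapping entry for a physical vector, or NULL if the
 * interrupt is not mapped to any VM. */
static const struct ptirq_entry *ptirq_lookup(const struct ptirq_entry *tbl,
                                              size_t n, uint32_t phys_vec)
{
    for (size_t i = 0; i < n; i++) {
        if (tbl[i].phys_vec == phys_vec)
            return &tbl[i];
    }
    return NULL;
}
```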
@@ -303,10 +303,10 @@ virtual destination, etc. See the following figure for details:
 
    Remapping of physical interrupts
 
-There are two different types of interrupt source: IOAPIC and MSI.
+There are two different types of interrupt sources: IOAPIC and MSI.
 The hypervisor will record different information for interrupt
 distribution: physical and virtual IOAPIC pin for IOAPIC source,
-physical and virtual BDF and other info for MSI source.
+physical and virtual BDF and other information for MSI source.
 
 Service VM passthrough is also in the scope of interrupt remapping which is
 done on-demand rather than on hypervisor initialization.
@@ -318,11 +318,12 @@ done on-demand rather than on hypervisor initialization.
    Initialization of remapping of virtual IOAPIC interrupts for Service VM
 
 :numref:`init-remapping` above illustrates how remapping of (virtual) IOAPIC
-interrupts are remapped for Service VM. VM exit occurs whenever Service VM tries to
-unmask an interrupt in (virtual) IOAPIC by writing to the Redirection
-Table Entry (or RTE). The hypervisor then invokes the IOAPIC emulation
-handler (refer to :ref:`hld-io-emulation` for details on I/O emulation) which
-calls APIs to set up a remapping for the to-be-unmasked interrupt.
+interrupts are remapped for the Service VM. VM exit occurs whenever the Service
+VM tries to unmask an interrupt in (virtual) IOAPIC by writing to the
+Redirection Table Entry (or RTE). The hypervisor then invokes the IOAPIC
+emulation handler (refer to :ref:`hld-io-emulation` for details on I/O
+emulation) which calls APIs to set up a remapping for the to-be-unmasked
+interrupt.
 
 Remapping of (virtual) MSI interrupts are set up in a similar sequence:
 
@ -331,67 +332,65 @@ Remapping of (virtual) MSI interrupts are set up in a similar sequence:
|
||||||
|
|
||||||
Initialization of remapping of virtual MSI for Service VM
|
Initialization of remapping of virtual MSI for Service VM
|
||||||
|
|
||||||
This figure illustrates how mappings of MSI or MSI-X are set up for
|
This figure illustrates how mappings of MSI or MSI-X are set up for the
|
||||||
Service VM. Service VM is responsible for issuing a hypercall to notify the
|
Service VM. The Service VM is responsible for issuing a hypercall to notify the
|
||||||
hypervisor before it configures the PCI configuration space to enable an
|
hypervisor before it configures the PCI configuration space to enable an
|
||||||
MSI. The hypervisor takes this opportunity to set up a remapping for the
|
MSI. The hypervisor takes this opportunity to set up a remapping for the
|
||||||
given MSI or MSI-X before it is actually enabled by Service VM.
|
given MSI or MSI-X before it is actually enabled by the Service VM.
|
||||||
|
|
||||||
When the User VM needs to access the physical device by passthrough, it uses
|
When the User VM needs to access the physical device by passthrough, it uses
|
||||||
the following steps:
|
the following steps:
|
||||||
|
|
||||||
- User VM gets a virtual interrupt
|
- User VM gets a virtual interrupt.
|
||||||
- VM exit happens and the trapped vCPU is the target where the interrupt
|
- VM exit happens and the trapped vCPU is the target where the interrupt
|
||||||
will be injected.
|
will be injected.
|
||||||
- Hypervisor will handle the interrupt and translate the vector
|
- Hypervisor handles the interrupt and translates the vector
|
||||||
according to ptirq_remapping_info.
|
according to ``ptirq_remapping_info``.
|
||||||
- Hypervisor delivers the interrupt to User VM.
|
- Hypervisor delivers the interrupt to the User VM.
|
||||||
|
|
||||||
When the Service VM needs to use the physical device, the passthrough is also
|
When the Service VM needs to use the physical device, the passthrough is also
|
||||||
active because the Service VM is the first VM. The detail steps are:
|
active because the Service VM is the first VM. The detail steps are:
|
||||||
|
|
||||||
- Service VM get all physical interrupts. It assigns different interrupts for
|
- Service VM gets all physical interrupts. It assigns different interrupts for
|
||||||
different VMs during initialization and reassign when a VM is created or
|
different VMs during initialization and reassigns when a VM is created or
|
||||||
deleted.
|
deleted.
|
||||||
- When physical interrupt is trapped, an exception will happen after VMCS
|
- When a physical interrupt is trapped, an exception will happen after VMCS
|
||||||
has been set.
|
has been set.
|
||||||
- Hypervisor will handle the VM exit issue according to
|
- Hypervisor handles the VM exit issue according to
|
||||||
ptirq_remapping_info and translates the vector.
|
``ptirq_remapping_info`` and translates the vector.
|
||||||
- The interrupt will be injected the same as a virtual interrupt.
|
- The interrupt is injected the same as a virtual interrupt.
|
||||||
|
|
||||||
ACPI Virtualization
|
ACPI Virtualization
|
||||||
*******************
|
*******************
|
||||||
|
|
||||||
ACPI virtualization is designed in ACRN with these assumptions:
|
ACPI virtualization is designed in ACRN with these assumptions:
|
||||||
|
|
||||||
- HV has no knowledge of ACPI,
|
- Hypervisor has no knowledge of ACPI,
|
||||||
- Service VM owns all physical ACPI resources,
|
- Service VM owns all physical ACPI resources,
|
||||||
- User VM sees virtual ACPI resources emulated by device model.
|
- User VM sees virtual ACPI resources emulated by the Device Model.
|
||||||
|
|
||||||
Some passthrough devices require physical ACPI table entry for
|
Some passthrough devices require a physical ACPI table entry for initialization.
|
||||||
initialization. The device model will create such device entry based on
|
The Device Model creates such device entry based on the physical one according
|
||||||
the physical one according to vendor ID and device ID. Virtualization is
|
to vendor ID and device ID. Virtualization is implemented in the Service VM
|
||||||
implemented in Service VM device model and not in scope of the hypervisor.
|
Device Model and not in the scope of the hypervisor. For pre-launched VMs, the
|
||||||
For pre-launched VM, ACRN hypervisor doesn't support the ACPI virtualization,
|
ACRN hypervisor doesn't support ACPI virtualization, so devices relying on ACPI
|
||||||
so devices relying on ACPI table are not supported.
|
tables are not supported.
|
||||||
|
|
||||||
GSI Sharing Violation Check
|
GSI Sharing Violation Check
|
||||||
***************************
|
***************************
|
||||||
|
|
||||||
All the PCI devices that share the same GSI should be assigned to the same
VM to avoid physical GSI sharing between multiple VMs. In partitioned mode or
hybrid mode, the PCI devices assigned to a pre-launched VM are statically
predefined. Developers should take care not to violate the rule. For a
post-launched VM, the ACRN Device Model puts the devices sharing the same GSI
pin in a GSI sharing group (devices that don't support MSI). The devices in
the same group should be assigned together to the current VM; otherwise, none
of them should be assigned to the current VM. A device that violates the rule
is rejected for passthrough. The checking logic is implemented in the Device
Model and is not in the scope of the hypervisor. The platform-specific GSI
information shall be filled in ``devicemodel/hw/pci/platform_gsi_info.c`` for
the target platform to activate the checking of GSI sharing violations.

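The grouping rule above reduces to a simple invariant: any two devices that
share a GSI must be assigned to the same VM. The sketch below checks that
invariant over a hypothetical device list (the BDFs, GSI numbers, and VM names
are made up for illustration; the real check lives in the Device Model):

```shell
# Hypothetical data: BDF -> GSI pin, and BDF -> assigned VM.
declare -A dev_gsi=( ["00:14.0"]=16 ["00:15.0"]=16 ["00:19.0"]=20 )
declare -A dev_vm=(  ["00:14.0"]="vm1" ["00:15.0"]="vm1" ["00:19.0"]="vm2" )

check_gsi_sharing() {
    local a b
    for a in "${!dev_gsi[@]}"; do
        for b in "${!dev_gsi[@]}"; do
            # Two devices on the same GSI assigned to different VMs
            # violate the rule.
            if [[ ${dev_gsi[$a]} -eq ${dev_gsi[$b]} && \
                  ${dev_vm[$a]} != "${dev_vm[$b]}" ]]; then
                echo "violation: $a and $b share GSI ${dev_gsi[$a]}"
                return 1
            fi
        done
    done
    echo "no GSI sharing violations"
}
check_gsi_sharing
```

Here the two devices on GSI 16 are both assigned to ``vm1``, so the check
passes; moving one of them to ``vm2`` would report a violation.
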
.. _PCIe PTM implementation:

@@ -408,14 +407,14 @@ further details on PTM, refer to the `PCIe specification
<https://pcisig.com/specifications>`_.

ACRN adds PCIe root port emulation in the hypervisor to support the PTM
feature and emulates a simple PTM hierarchy. ACRN enables PTM in a
post-launched VM if the user sets the ``enable_ptm`` option when passing
through a device to the post-launched VM. When you enable PTM, the passthrough
device is connected to a virtual root port instead of the host bridge.

By default, the :ref:`vm.PTM` option is disabled in ACRN VMs. Use the
:ref:`acrn_configurator_tool` to enable PTM in the scenario XML file that
configures the VM.

Here is an example launch script that configures a supported Ethernet card for
passthrough and enables PTM on it:

@@ -436,7 +435,7 @@ passthrough and enables PTM on it:

   echo ${passthru_bdf["ethptm"]} > /sys/bus/pci/drivers/pci-stub/bind

   acrn-dm -A -m $mem_size -s 0:0,hostbridge \
      -s 3,virtio-blk,user-vm-test.img \
      -s 4,virtio-net,tap0 \
      -s 5,virtio-console,@stdio:stdio_port \
      -s 6,passthru,a9/00/0,enable_ptm \

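The ``a9/00/0`` argument to the ``passthru`` option is the device's
bus/device/function written with slashes. A tiny helper (hypothetical, not
part of ACRN) can derive it from the ``bus:dev.func`` form that ``lspci``
prints:

```shell
# Convert an lspci-style BDF such as "a9:00.0" into the bus/dev/func
# form that the acrn-dm passthru option uses ("a9/00/0").
bdf_to_passthru() {
    local bdf=$1
    bdf=${bdf//:/\/}    # "a9:00.0" -> "a9/00.0"
    echo "${bdf//.//}"  # "a9/00.0" -> "a9/00/0"
}
bdf_to_passthru "a9:00.0"
```
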
@@ -458,8 +457,8 @@ PTM Implementation Notes

To simplify PTM support implementation, the virtual root port only supports
the most basic PCIe configuration and operation, in addition to PTM
capabilities.

For a post-launched VM, you enable PTM by setting the ``enable_ptm`` option
for the passthrough device (as shown above).

.. figure:: images/PTM-hld-PTM-flow.png
   :align: center

@@ -469,49 +468,52 @@ In Guest VM post-launched scenarios, you enable PTM by setting the

   PTM-enabling workflow in post-launched VM

As shown in :numref:`ptm-flow`, PTM is enabled in the root port during
hypervisor startup. The Device Model (DM) then checks whether the passthrough
device supports PTM requestor capabilities and whether the corresponding root
port supports PTM root capabilities, and performs some other sanity checks. If
an error is detected during these checks, the error is reported and ACRN does
not enable PTM in the post-launched VM. This doesn't prevent the user from
launching the post-launched VM and passing through the device to it. If no
error is detected, the Device Model uses the ``add_vdev`` hypercall to add a
virtual root port (VRP), acting as the PTM root, to the post-launched VM
before passing through the device.

.. figure:: images/PTM-hld-PTM-passthru.png
   :align: center
   :width: 700
   :name: ptm-vrp

   PTM-enabled PCI device passthrough to post-launched VM

:numref:`ptm-vrp` shows that, after enabling PTM, the passthrough device
connects to the virtual root port instead of the virtual host bridge.

To use PTM in a virtualized environment, you may want to first verify that PTM
is supported by the device and is enabled on the bare metal machine. If
supported, follow these steps to enable PTM in the post-launched VM:

1. Make sure that PTM is enabled in the guest kernel. In the Linux kernel,
   for example, set ``CONFIG_PCIE_PTM=y``.
2. Not every PCI device supports PTM. One example that does is the Intel
   I225-V Ethernet controller. If you pass through this card to the
   post-launched VM, make sure the post-launched VM uses a version of the IGC
   driver that supports PTM.
3. In the Device Model launch script, add the ``enable_ptm`` option to the
   passthrough device. For example:

   .. code-block:: bash
      :emphasize-lines: 5

      $ acrn-dm -A -m $mem_size -s 0:0,hostbridge \
           -s 3,virtio-blk,user-vm-test.img \
           -s 4,virtio-net,tap0 \
           -s 5,virtio-console,@stdio:stdio_port \
           -s 6,passthru,a9/00/0,enable_ptm \
           --ovmf /usr/share/acrn/bios/OVMF.fd \

4. You can check that PTM is correctly enabled on the post-launched VM by
   displaying the PCI bus hierarchy on the post-launched VM using the
   ``lspci`` command:

   .. code-block:: bash
      :emphasize-lines: 12,20

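Before launching the VM, you can also verify on bare metal that the device
advertises PTM by looking for the Precision Time Measurement capability in
``lspci -vv`` output. The snippet below greps a canned excerpt so it is
self-contained; on a real machine, pipe ``sudo lspci -vv -s <BDF>`` instead.
The capability offset and flag values shown are illustrative, not taken from a
specific device:

```shell
# Illustrative excerpt of `lspci -vv` output for a PTM-capable device.
lspci_out='Capabilities: [1f0 v1] Precision Time Measurement
        PTMCap: Requester+ Responder- Root-
        PTMControl: Enabled+ RootSelected-'

# The capability name is what matters; flag layout varies by lspci version.
if grep -q "Precision Time Measurement" <<<"$lspci_out"; then
    echo "PTM capability present"
else
    echo "PTM capability absent"
fi
```
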
@@ -555,9 +557,10 @@ VMs:

.. doxygenfunction:: ptirq_prepare_msix_remap
   :project: Project ACRN

Post-launched VMs need to pre-allocate interrupt entries during VM
initialization and free them during VM de-initialization. The following APIs
are provided to pre-allocate and free interrupt entries for post-launched VMs:

.. doxygenfunction:: ptirq_add_intx_remapping
   :project: Project ACRN

@@ -568,12 +571,12 @@ The following APIs are provided to pre-allocate/free interrupt entries for post-

.. doxygenfunction:: ptirq_remove_msix_remapping
   :project: Project ACRN

The following APIs are provided to acknowledge a virtual interrupt:

.. doxygenfunction:: ptirq_intx_ack
   :project: Project ACRN

The following APIs are provided to handle a ptdev interrupt:

.. doxygenfunction:: ptdev_init
   :project: Project ACRN