294 lines
10 KiB
ReStructuredText
294 lines
10 KiB
ReStructuredText
.. _sriov_virtualization:
|
|
|
|
Enable SR-IOV Virtualization
|
|
############################
|
|
|
|
SR-IOV (Single Root Input/Output Virtualization) can isolate PCIe devices
|
|
to improve performance that is similar to bare-metal levels. SR-IOV consists
|
|
of two basic units: PF (Physical Function), which supports SR-IOV PCIe
|
|
extended capability and manages entire physical devices; and VF (Virtual
|
|
Function), a "lightweight" PCIe function that is a passthrough device for
|
|
VMs.
|
|
|
|
For details, refer to Chapter 9 of PCI-SIG's
|
|
`PCI Express Base Specification Revision 4.0, Version 1.0
|
|
<https://pcisig.com/pci-express-architecture-configuration-space-test-specification-revision-40-version-10>`_.
|
|
|
|
SR-IOV Architectural Overview
|
|
*****************************
|
|
|
|
.. figure:: images/sriov-image1.png
|
|
:align: center
|
|
:name: SR-IOV-architecture-overview
|
|
|
|
SR-IOV Architectural Overview
|
|
|
|
- **SI** - A System Image known as a VM.
|
|
|
|
- **VI** - A Virtualization Intermediary known as a hypervisor.
|
|
|
|
- **SR-PCIM** - A Single Root PCI Manager; it is a software entity for
|
|
SR-IOV management.
|
|
|
|
- **PF** - A PCIe Function that supports the SR-IOV capability
|
|
and is accessible to an SR-PCIM, a VI, or an SI.
|
|
|
|
- **VF** - A "light-weight" PCIe Function that is directly accessible by an
|
|
SI.
|
|
|
|
SR-IOV Extended Capability
|
|
--------------------------
|
|
|
|
The SR-IOV Extended Capability defined here is a PCIe extended
|
|
capability that must be implemented in each PF device that supports the
|
|
SR-IOV feature. This capability is used to describe and control a PF's
|
|
SR-IOV capabilities.
|
|
|
|
.. figure:: images/sriov-image2.png
|
|
:align: center
|
|
:name: SR-IOV-extended-capability
|
|
|
|
SR-IOV Extended Capability
|
|
|
|
- **PCIe Extended Capability ID** - 0010h.
|
|
|
|
- **SR-IOV Capabilities** - VF Migration-Capable and ARI-Capable.
|
|
|
|
- **SR-IOV Control** - Enable/Disable VFs; VF migration state query.
|
|
|
|
- **SR-IOV Status** - VF Migration Status.
|
|
|
|
- **Initial VFs** - Indicates to the SR-PCIM the number of VFs that are
|
|
initially associated with the PF.
|
|
|
|
- **Total VFs** - Indicates the maximum number of VFs that can be
|
|
associated with the PF.
|
|
|
|
- **Num VFs** - Controls the number of VFs that are visible. *Num VFs* <=
|
|
*Initial VFs* = *Total VFs*.
|
|
|
|
- **Function Link Dependency** - The field used to describe
|
|
dependencies between PFs. VF dependencies are the same as the
|
|
dependencies of their associated PFs.
|
|
|
|
- **First VF Offset** - A constant that defines the Routing ID
|
|
offset of the first VF that is associated with the PF that contains
|
|
this Capability structure.
|
|
|
|
- **VF Stride** - Defines the Routing ID offset from one VF to the
|
|
next one for all VFs associated with the PF that contains this
|
|
Capability structure.
|
|
|
|
- **VF Device ID** - The field that contains the Device ID that should be
|
|
presented for every VF to the SI.
|
|
|
|
- **Supported Page Sizes** - The field that indicates the page sizes
|
|
supported by the PF.
|
|
|
|
- **System Page Size** - The field that defines the page size the system
|
|
will use to map the VFs' memory addresses. Software must set the
|
|
value of the *System Page Size* to one of the page sizes set in the
|
|
*Supported Page Sizes* field.
|
|
|
|
- **VF BARs** - Fields that must define the VF's Base Address
|
|
Registers (BARs). These fields behave as normal PCI BARs.
|
|
|
|
- **VF Migration State Array Offset** - Register that contains a
|
|
PF BAR relative pointer to the VF Migration State Array.
|
|
|
|
- **VF Migration State Array** - Located using the VF Migration
|
|
State Array Offset register of the SR-IOV Capability block.
|
|
|
|
For details, refer to the *PCI Express Base Specification Revision 4.0, Version 1.0 Chapter 9.3.3*.
|
|
|
|
SR-IOV Architecture in ACRN
|
|
---------------------------
|
|
|
|
.. figure:: images/sriov-image3.png
|
|
:align: center
|
|
:name: SR-IOV-architecure-in-acrn
|
|
|
|
SR-IOV Architectural in ACRN
|
|
|
|
1. A hypervisor detects an SR-IOV capable PCIe device in the physical PCI
|
|
device enumeration phase.
|
|
|
|
2. The hypervisor intercepts the PF's SR-IOV capability and accesses whether
|
|
to enable/disable VF devices based on the ``VF_ENABLE`` state. All
|
|
read/write requests for a PF device passthrough to the PF physical
|
|
device.
|
|
|
|
3. The hypervisor waits for 100ms after ``VF_ENABLE`` is set and initializes
|
|
VF devices. The differences between a normal passthrough device and
|
|
SR-IOV VF device are physical device detection, BARs, and MSI-X
|
|
initialization. The hypervisor uses ``Subsystem Vendor ID`` to detect the
|
|
SR-IOV VF physical device instead of ``Vendor ID`` since no valid
|
|
``Vendor ID`` exists for the SR-IOV VF physical device. The VF BARs are
|
|
initialized by its associated PF's SR-IOV capabilities, not PCI
|
|
standard BAR registers. The MSI-X mapping base address is also from the
|
|
PF's SR-IOV capabilities, not PCI standard BAR registers.
|
|
|
|
SR-IOV Passthrough VF Architecture in ACRN
|
|
------------------------------------------
|
|
|
|
.. figure:: images/sriov-image4.png
|
|
:align: center
|
|
:name: SR-IOV-vf-passthrough
|
|
|
|
SR-IOV VF Passthrough Architecture in ACRN
|
|
|
|
1. The SR-IOV VF device needs to bind the PCI-stud driver instead of the
|
|
vendor-specific VF driver before the device passthrough.
|
|
|
|
2. The user configures the ``acrn-dm`` boot parameter with the passthrough
|
|
SR-IOV VF device. When the User VM starts, ``acrn-dm`` invokes a
|
|
hypercall to set the *vdev-VF0* device in the User VM.
|
|
|
|
3. The hypervisor emulates ``Device ID/Vendor ID`` and ``Memory Space Enable
|
|
(MSE)`` in the configuration space for an assigned SR-IOV VF device. The
|
|
assigned VF ``Device ID`` comes from its associated PF's capability. The
|
|
``Vendor ID`` is the same as the PF's ``Vendor ID`` and the ``MSE`` is always
|
|
set when reading the SR-IOV VF device's control register.
|
|
|
|
4. The vendor-specific VF driver in the target VM probes the assigned SR-IOV
|
|
VF device.
|
|
|
|
SR-IOV Initialization Flow
|
|
--------------------------
|
|
|
|
.. figure:: images/sriov-image5.png
|
|
:align: center
|
|
:name: SR-IOV-init-flow
|
|
|
|
SR-IOV Initialization Flow
|
|
|
|
When an SR-IOV capable device is initialized, all access to the
|
|
configuration space will passthrough to the physical device directly.
|
|
The Service VM can identify all capabilities of the device from the SR-IOV
|
|
extended capability and then create a *sysfs* node for SR-IOV management.
|
|
|
|
SR-IOV VF Enable Flow
|
|
---------------------
|
|
|
|
.. figure:: images/sriov-image6.png
|
|
:align: center
|
|
:width: 900px
|
|
:name: SR-IOV-enable-flow
|
|
|
|
SR-IOV VF Enable Flow
|
|
|
|
The application enables ``n`` VF devices via an SR-IOV PF device ``sysfs`` node.
|
|
The hypervisor intercepts all SR-IOV capability access and checks the
|
|
``VF_ENABLE`` state. If ``VF_ENABLE`` is set, the hypervisor creates n
|
|
virtual devices after 100ms so that VF physical devices have enough time to
|
|
be created. The Service VM waits 100ms and then only accesses the first VF
|
|
device's configuration space including Class Code, Reversion ID, Subsystem
|
|
Vendor ID, Subsystem ID. The Service VM uses the first VF device
|
|
information to initialize subsequent VF devices.
|
|
|
|
SR-IOV VF Disable Flow
|
|
----------------------
|
|
|
|
.. figure:: images/sriov-image7.png
|
|
:align: center
|
|
:name: SR-IOV-disable-flow
|
|
|
|
SR-IOV VF Disable Flow
|
|
|
|
The application disables SR-IOV VF devices by writing zero to the SR-IOV PF
|
|
device ``sysfs`` node. The hypervisor intercepts all SR-IOV capability
|
|
accesses and checks the ``VF_ENABLE`` state. If ``VF_ENABLE`` is clear, the
|
|
hypervisor makes VF virtual devices invisible from the Service VM so that all
|
|
access to VF devices will return ``0xFFFFFFFF`` as an error. The VF physical
|
|
devices are removed within 1s of when ``VF_ENABLE`` is clear.
|
|
|
|
SR-IOV VF Assignment Policy
|
|
---------------------------
|
|
|
|
.. figure:: images/sriov-image8.png
|
|
:align: center
|
|
:name: SR-IOV-vf-assignment
|
|
|
|
SR-IOV VF Assignment
|
|
|
|
1. All SR-IOV PF devices are managed by the Service VM.
|
|
|
|
2. The SR-IOV PF cannot passthrough to the User VM.
|
|
|
|
3. All VFs can passthrough to the User VM, but we do not recommend
|
|
a passthrough to high privilege VMs because the PF device may impact
|
|
the assigned VFs' functionality and stability.
|
|
|
|
SR-IOV Usage Guide in ACRN
|
|
--------------------------
|
|
|
|
We use the Intel 82576 NIC as an example in the following instructions. We
|
|
only support LaaG (Linux as a Guest).
|
|
|
|
1. Ensure that the 82576 VF driver is compiled into the User VM Kernel
|
|
(set ``CONFIG_IGBVF=y`` in the Kernel Config).
|
|
|
|
#. When the Service VM boots, the ``lspci -v`` command indicates
|
|
that the Intel 82576 NIC devices have SR-IOV capability and their PF
|
|
drivers are ``igb``.
|
|
|
|
.. figure:: images/sriov-image9.png
|
|
:align: center
|
|
:name: 82576-pf
|
|
|
|
82576 SR-IOV PF Devices
|
|
|
|
#. Input the ``echo n > /sys/class/net/enp109s0f0/device/sriov\_numvfs``
|
|
command in the Service VM to enable n VF devices for the first PF
|
|
device (\ *enp109s0f0)*. The number *n* can't be more than *TotalVFs*
|
|
coming from the return value of command
|
|
``cat /sys/class/net/enp109s0f0/device/sriov\_totalvfs``. Here we
|
|
use *n = 2* as an example.
|
|
|
|
.. figure:: images/sriov-image10.png
|
|
:align: center
|
|
:name: 82576-vf
|
|
|
|
82576 SR-IOV VF Devices
|
|
|
|
.. figure:: images/sriov-image11.png
|
|
:align: center
|
|
:name: 82576-vf-nic
|
|
|
|
82576 SR-IOV VF NIC
|
|
|
|
#. Passthrough an SR-IOV VF device to guest.
|
|
|
|
a. Unbind the igbvf driver in the Service VM.
|
|
|
|
i. ``modprobe pci\_stub``
|
|
|
|
ii. ``echo "8086 10ca" > /sys/bus/pci/drivers/pci-stub/new\_id``
|
|
|
|
iii. ``echo "0000:6d:10.0" > /sys/bus/pci/devices/0000:6d:10.0/driver/unbind``
|
|
|
|
iv. ``echo "0000:6d:10.0" > /sys/bus/pci/drivers/pci-stub/bind``
|
|
|
|
b. Add the SR-IOV VF device parameter (``-s X, passthru,6d/10/0``) in
|
|
the launch User VM script
|
|
|
|
.. figure:: images/sriov-image12.png
|
|
:align: center
|
|
:name: 82576-nic-passthru
|
|
|
|
Configure 82576 NIC as a Passthrough Device
|
|
|
|
c. Boot the User VM
|
|
|
|
SR-IOV Limitations in ACRN
|
|
--------------------------
|
|
|
|
1. The SR-IOV migration feature is not supported.
|
|
|
|
2. If an SR-IOV PF device is detected during the enumeration phase, but
|
|
not enough room exists for its total VF devices, the PF device will be
|
|
dropped. The platform uses the ``MAX_PCI_DEV_NUM`` ACRN configuration to
|
|
support the maximum number of PCI devices. Make sure ``MAX_PCI_DEV_NUM`` is
|
|
more than the number of all PCI devices, including the total SR-IOV VF
|
|
devices.
|