In physical destination mode, the destination processor is specified by its
local APIC ID. When a CPU switches between xAPIC mode and x2APIC mode, its
local APIC ID does not change, so a vCPU in x2APIC mode could use physical
destination mode to send an IPI to a vCPU in xAPIC mode by writing the ICR.
This patch adds support for a vCPU A to write the ICR to send an IPI to
another vCPU B that is in a different APIC mode.
Tracked-On: #5923
Signed-off-by: Li Fei1 <fei1.li@intel.com>
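A minimal sketch of the idea, with hypothetical names (vlapic_find_by_apic_id
is not from the patch): in physical destination mode the target is matched
purely by local APIC ID, so the sender's and receiver's APIC modes never need
to agree.

    #include <stdint.h>
    #include <stddef.h>
    #include <stdbool.h>

    struct acrn_vlapic {
        uint32_t apic_id;     /* local APIC ID, stable across mode switches */
        bool     x2apic_mode; /* current mode; irrelevant for the match below */
    };

    /* Hypothetical lookup: match the destination by APIC ID only. Whether
     * the sender or receiver is in xAPIC or x2APIC mode does not matter,
     * because the local APIC ID is unchanged by the mode switch. */
    static struct acrn_vlapic *vlapic_find_by_apic_id(struct acrn_vlapic **vlapics,
                                                      size_t n, uint32_t dest_id)
    {
        for (size_t i = 0U; i < n; i++) {
            if (vlapics[i]->apic_id == dest_id) {
                return vlapics[i];
            }
        }
        return NULL;
    }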
Retrieve physical APIC IDs from the board XML file and use them to fill in the
ACPI MADT table for pre-launched VMs.
Note that the config tool will throw an error if the processors/die/core/thread
tags are absent. If those tags are missing from the board XML file, users need
to rerun board_inspector.py to regenerate it once this commit is merged.
Tracked-On: #6020
Signed-off-by: dongshen <dongsheng.x.zhang@intel.com>
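For reference, the MADT entry being filled in is the ACPI "Processor Local
APIC" structure, which carries the physical APIC ID directly; the layout below
follows the ACPI spec, with illustrative field names:

    #include <stdint.h>

    /* ACPI MADT "Processor Local APIC" structure (type 0), per the ACPI
     * spec. The config tool emits one entry per pCPU of a pre-launched VM,
     * with apic_id taken from the processors/die/core/thread tags in the
     * board XML. */
    struct acpi_madt_local_apic {
        uint8_t  type;          /* 0 = Processor Local APIC */
        uint8_t  length;        /* 8 */
        uint8_t  processor_uid; /* ACPI processor UID */
        uint8_t  apic_id;       /* physical local APIC ID */
        uint32_t flags;         /* bit 0: enabled */
    } __attribute__((packed));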
Two utility functions are copied and adapted from the hypervisor:
- ffs64
- bitmap_clear_nolock
Two public functions are provided for future use (such as for RTCT v2):
- pcpuid_from_vcpuid
- lapicid_from_pcpuid
Tracked-On: #6020
Reviewed-by: Wang, Yu1 <yu1.wang@intel.com>
Signed-off-by: dongshen <dongsheng.x.zhang@intel.com>
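A sketch of the two adapted utilities under their documented semantics, with a
compiler builtin standing in for the hypervisor's inline asm:

    #include <stdint.h>

    #define INVALID_BIT_INDEX  0xffffU

    /* Index of the lowest set bit in value, or INVALID_BIT_INDEX if
     * value == 0. */
    static uint16_t ffs64(uint64_t value)
    {
        if (value == 0UL) {
            return INVALID_BIT_INDEX;
        }
        return (uint16_t)__builtin_ctzll(value);
    }

    /* Clear bit nr_arg in *addr without the lock prefix: safe only when
     * the bitmap is not accessed concurrently, which holds in the DM
     * context these helpers were adapted for. */
    static void bitmap_clear_nolock(uint16_t nr_arg, volatile uint64_t *addr)
    {
        *addr &= ~(1UL << (nr_arg & 0x3fU));
    }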
This allows users to retrieve and use the requested platform_info from the
hypervisor.
Tracked-On: #6020
Reviewed-by: Wang, Yu1 <yu1.wang@intel.com>
Signed-off-by: dongshen <dongsheng.x.zhang@intel.com>
So that the DM can retrieve physical APIC IDs and use them to fill in the ACPI
MADT table for post-launched VMs.
Note:
1. The DM needs to use the same logic as the hypervisor to calculate vLAPIC
IDs, based on the physical APIC IDs and the CPU affinity setting (see the
sketch below).
2. Using reserved0[] in struct hc_platform_info to pass physical APIC IDs
means we can support at most 116 cores, and it assumes the LAPIC ID is 8 bits
(x2APIC mode supports 32 bits).
The CAT ID shifts will be used by the DM for RTCT v2.
Tracked-On: #6020
Reviewed-by: Wang, Yu1 <yu1.wang@intel.com>
Signed-off-by: dongshen <dongsheng.x.zhang@intel.com>
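A sketch of the mapping from note 1, with hypothetical names: the i-th vCPU of
a VM is assigned the physical LAPIC ID of the i-th pCPU set in the VM's CPU
affinity mask, the IDs having been passed in the (repurposed) reserved0[]
bytes of struct hc_platform_info.

    #include <stdint.h>

    #define MAX_PLATFORM_LAPIC_IDS  116U  /* reserved0[] is 116 bytes wide */

    /* Hypothetical mirror of the hypervisor's logic: walk the 64-bit
     * affinity mask and return the physical LAPIC ID of the vcpu_id-th
     * set bit. */
    static uint8_t vlapicid_from_vcpuid(const uint8_t lapic_ids[MAX_PLATFORM_LAPIC_IDS],
                                        uint64_t cpu_affinity, uint16_t vcpu_id)
    {
        uint16_t seen = 0U;
        for (uint16_t pcpu = 0U; pcpu < 64U; pcpu++) {
            if ((cpu_affinity & (1UL << pcpu)) != 0UL) {
                if (seen == vcpu_id) {
                    return lapic_ids[pcpu];
                }
                seen++;
            }
        }
        return 0U; /* caller validates vcpu_id against the affinity mask */
    }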
Using physical APIC IDs as vLAPIC IDs for pre-launched and post-launched VMs
is not sufficient to replicate the host CPU and cache topologies in guest VMs;
we also need to pass host CPUID leaf 0BH through to guest VMs, otherwise
guest VMs may see a weird CPU topology.
Note that in the current code, ACRN already passes host cache CPUID
leaf 04H through to guest VMs.
Tracked-On: #6020
Reviewed-by: Wang, Yu1 <yu1.wang@intel.com>
Signed-off-by: dongshen <dongsheng.x.zhang@intel.com>
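A sketch of the leaf-0BH passthrough, assuming a hypothetical per-leaf handler:
the host values expose the real SMT/core topology shifts, while EDX reports
the vCPU's x2APIC ID, which equals the physical APIC ID after this series.

    #include <stdint.h>
    #include <cpuid.h>

    /* Hypothetical handler: serve CPUID.0BH from the host so the guest
     * sees the host topology shifts; EDX architecturally holds the
     * logical processor's x2APIC ID, so return the vCPU's vLAPIC ID. */
    static void guest_cpuid_0b(uint32_t subleaf, uint32_t x2apic_id,
                               uint32_t *eax, uint32_t *ebx,
                               uint32_t *ecx, uint32_t *edx)
    {
        __cpuid_count(0x0BU, subleaf, *eax, *ebx, *ecx, *edx);
        *edx = x2apic_id;
    }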
In the current code, ACRN uses physical APIC IDs as vLAPIC IDs for the SOS,
and (contiguous) vCPU IDs as vLAPIC IDs for pre-launched and post-launched VMs.
Using vCPU IDs as vLAPIC IDs for pre-launched and post-launched VMs
results in wrong CPU and cache topologies showing up in the guest VMs,
and can adversely affect performance if the guest VM chooses to detect
CPU and cache topologies and optimize its behavior accordingly.
Using physical APIC IDs as vLAPIC IDs (together with passthrough of the
related CPU/cache topology enumeration CPUID leaves) replicates the host CPU
and cache topologies in pre-launched and post-launched VMs.
Tracked-On: #6020
Reviewed-by: Jason Chen CJ <jason.cj.chen@intel.com>
Signed-off-by: dongshen <dongsheng.x.zhang@intel.com>
Remove the direct calls to exec_vmptrld() and exec_vmclear(), and replace
them with the wrapper APIs load_va_vmcs() and clear_va_vmcs().
Tracked-On: #5923
Signed-off-by: Zide Chen <zide.chen@intel.com>
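A sketch of the wrappers, assuming the raw primitives take a pointer to the
64-bit host physical address of the VMCS region and an hva2hpa() translation
helper exists:

    #include <stdint.h>

    /* Assumed primitives: each takes a pointer to the 64-bit host
     * physical address of the VMCS region. */
    extern void exec_vmptrld(void *addr);
    extern void exec_vmclear(void *addr);
    extern uint64_t hva2hpa(const void *hva);

    /* Callers hold a VMCS virtual address; the wrappers hide the
     * VA -> PA translation that every call site used to repeat. */
    static inline void load_va_vmcs(const uint8_t *vmcs_va)
    {
        uint64_t vmcs_pa = hva2hpa(vmcs_va);
        exec_vmptrld(&vmcs_pa);
    }

    static inline void clear_va_vmcs(const uint8_t *vmcs_va)
    {
        uint64_t vmcs_pa = hva2hpa(vmcs_va);
        exec_vmclear(&vmcs_pa);
    }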
commit 873ed75 ("misc: sanity check VM config for nested virtualization")
requires that the guest_flag tag not be empty, or the build will fail.
This patch changes all instances of "<guest_flag></guest_flag>" to
"<guest_flag>0</guest_flag>".
Tracked-On: #5923
Signed-off-by: Jiang, Yanting <yanting.jiang@intel.com>
Configure PTM in post-launched VMs using the <PTM> element. If //vm/PTM is
set to 'y', pci_dev.c.xsl appends the virtual root port to the
corresponding struct acrn_vm_pci_dev_config of that VM. Currently only
post-launched VMs are supported.
Configure enable_ptm for the dm argument. If a uos/enable_ptm with uos id
= 'vm_id' is set to 'y' and the vm/PTM with the same vm_id is set to 'y',
append an "enable_ptm" flag to the end of the passthrough Ethernet device.
Currently only the Ethernet card supports the "enable_ptm" flag.
For schema validation, <PTM> can only be ['y', 'n'].
For launch-script validation, <enable_ptm> can only be ['y',
'n']. If <enable_ptm> is set to 'y' but the corresponding <PTM> is set
to 'n', the launch script will fail to generate.
Tracked-On: #6054
Signed-off-by: Yang,Yu-chu <yu-chu.yang@intel.com>
Add a number of steps and details that were not called
out in the "Getting Started Guide for ACRN Hybrid Mode". These
are not obvious to a first-time or novice user, so the user
guide was hard to follow and confusing. At a high level:
* How to build Zephyr
* How to install ACRN
* How to install the ACRN kernel
The hybrid scenario overview diagram has been updated too.
Tracked-On: #5992
Signed-off-by: Geoffroy Van Cutsem <geoffroy.vancutsem@intel.com>
Add an xslt file "board_info.h.xsl". This file is used to
generate board_info.h which is used by hypervisor.
Tracked-On: #6024
Signed-off-by: Yang,Yu-chu <yu-chu.yang@intel.com>
Reviewed-by: Junjie Mao <junjie.mao@intel.com>
Add an xslt file "pci_dev.c.xsl". This file is used to
generate pci_dev.c which is used by hypervisor.
Tracked-On: #6024
Signed-off-by: Yang,Yu-chu <yu-chu.yang@intel.com>
Reviewed-by: Junjie Mao <junjie.mao@intel.com>
acrn:get-vbdf: get the virtual BDF from allocation.xml based on vmid and device name
acrn:get-pbdf: get the physical BDF from <pci_dev>
acrn:ptdev-name-suffix: fix up the name used to look up allocation.xml
acrn:get-hidden-device-num: get the number of hidden devices based on the
board name
acrn:is-vmsix-supported-device: check whether a device supports vMSI-X,
based on its vendor and device identifiers
Tracked-On: #6024
Signed-off-by: Yang,Yu-chu <yu-chu.yang@intel.com>
Assign BDFs to PCI emulated and passthrough devices.
For pre-launched VMs, assign a unique BDF to each passthrough device, inter-VM
shared memory, and PCI vuart (console and communication vuarts).
For the SOS VM, assign a unique BDF to each inter-VM shared memory and PCI
vuart (console and communication vuarts).
The BDF follows the rules below (see the sketch after this commit message):
- the BDF 00:00.0 is reserved for the PCI hostbridge
- the assigned BDF range: bus is 0x00, dev is in range [0x1, 0x20),
and the func is 0x00
- the BDF must be unique, which means no VM's emulated device can
share a BDF with an existing device
- some devices' BDFs are hardcoded; modifying them would make the
device undiscoverable by the OS. The HARDCODED_BDF_LIST in bdf.py documents
them
- the passthrough devices' BDFs can be reused in the SOS VM
Tracked-On: #6024
Signed-off-by: Yang,Yu-chu <yu-chu.yang@intel.com>
Reviewed-by: Junjie Mao <junjie.mao@intel.com>
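The allocator itself lives in Python (bdf.py); here is a C sketch of the core
rule it implements, with the in-use check left abstract:

    #include <stdint.h>
    #include <stdbool.h>

    #define BDF(b, d, f)  ((uint16_t)(((b) << 8) | ((d) << 3) | (f)))

    /* Sketch of the assignment rule: scan dev over [0x1, 0x20) on bus 0
     * with func 0, skipping 00:00.0 (hostbridge) and any BDF already
     * taken by an existing or hardcoded device. Returns 0 when the range
     * is exhausted, which callers treat as "no free BDF". */
    static uint16_t alloc_bdf(bool (*bdf_in_use)(uint16_t))
    {
        for (uint8_t dev = 0x1U; dev < 0x20U; dev++) {
            uint16_t bdf = BDF(0x00U, dev, 0x0U);
            if (!bdf_in_use(bdf)) {
                return bdf;
            }
        }
        return 0U;
    }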
Add methods that allocate the MMIO BAR base to the console vuart,
communication vuarts, inter-VM shared memory and passthrough PCI
devices.
For the SOS:
- get low mem by parsing the board XML
- get high mem by parsing the board XML; if high mem is not enabled,
the high mem start address is ~0UL and the end address is 0UL
- get the occupied MMIO windows by parsing the board XML
- for each console vuart, communication vuart and inter-VM shared memory
device, assign an unused MMIO window to it
- all the assigned MMIO windows must be unique and must not overlap
with any device's MMIO window
- the passthrough devices' MMIO windows can be reused in the SOS VM
- each allocated MMIO start address must be 4 KiB aligned if the BAR
length is smaller than 4 KiB
- each allocated MMIO start address must be aligned to the BAR length
if the length is greater than 4 KiB (see the alignment sketch below)
- a 32-bit BAR falls in the low mem range only
- a 64-bit BAR looks for free MMIO in the low mem range first; if high
mem is enabled and there is not enough space in the low mem range, it
looks for free MMIO in the high mem range
- the allocator raises an error if there is not enough MMIO space
For pre-launched VMs:
- the high mem range is [256G, 512G)
- the low mem range is [2G, 3.5G)
- there is no used MMIO window initially
- for each console vuart, communication vuart, inter-VM shared memory
device and passthrough device, assign an unused MMIO window to it
- all the assigned MMIO windows must be unique and must not overlap
with any device's MMIO window
- a 32-bit BAR falls in the low mem range only
- a 64-bit BAR looks for free MMIO in the low mem range first, then
in the high mem range if there is not enough space in the low mem
range
- each allocated MMIO start address must be 4 KiB aligned if the BAR
length is smaller than 4 KiB
- each allocated MMIO start address must be aligned to the BAR length
if the length is greater than 4 KiB
- the allocator raises an error if there is not enough MMIO space
Tracked-On: #6024
Signed-off-by: Yang,Yu-chu <yu-chu.yang@intel.com>
Reviewed-by: Junjie Mao <junjie.mao@intel.com>
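A C sketch of the alignment rule shared by both allocators (the tool itself is
Python): a BAR smaller than 4 KiB is placed on a 4 KiB boundary, and a larger
BAR on a boundary of its own length, which PCI guarantees is a power of two.

    #include <stdint.h>

    #define SIZE_4K  0x1000UL

    /* Round addr up to the alignment the BAR needs: 4 KiB for small
     * BARs, the BAR length itself otherwise. */
    static uint64_t bar_align_up(uint64_t addr, uint64_t bar_len)
    {
        uint64_t align = (bar_len < SIZE_4K) ? SIZE_4K : bar_len;
        return (addr + align - 1UL) & ~(align - 1UL);
    }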
Add a common method "get_shmem_regions":
This method gets <IVSHMEM_REGION> and extracts the region size, the region's
position in the XML, and the IDs of the VMs that share the region. Returns a
dictionary:
{'vm_id':{'region_name':{'id': region position,'size': region size,}}}
Add vm type checking methods:
is_pre_launched_vm, is_post_launched_vm and is_sos_vm.
Tracked-On: #6024
Signed-off-by: Yang,Yu-chu <yu-chu.yang@intel.com>
Reviewed-by: Junjie Mao <junjie.mao@intel.com>
Only free the rb_entry when we remove this entry from the rb tree; otherwise,
a page fault is triggered when the next rb iteration accesses the freed
rb_entry.
Tracked-On: #6056
Signed-off-by: Li Fei1 <fei1.li@intel.com>
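The fixed pattern, sketched with a minimal Linux-style rb-tree interface
(types and rb_erase are stand-ins for the real API): unlink the node from the
tree first, and only then free it, so the next iteration never touches freed
memory.

    #include <stdlib.h>

    struct rb_root;                                   /* opaque here */
    struct rb_node { struct rb_node *rb_left, *rb_right; };

    extern void rb_erase(struct rb_node *node, struct rb_root *root);

    struct my_entry {
        struct rb_node node;
        /* payload ... */
    };

    static void remove_entry(struct my_entry *entry, struct rb_root *root)
    {
        rb_erase(&entry->node, root);  /* unlink from the tree first */
        free(entry);                   /* only now is it safe to free */
    }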
This patch implements the VMREAD and VMWRITE instructions.
When the L1 guest is running with an active VMCS12, the "VMCS shadowing"
VM-execution control is always set to 1 in VMCS01. Thus the possible
behavior of VMREAD or VMWRITE from L1 could be:
- It causes a VM exit to L0 if the bit corresponding to the target VMCS
field in the VMREAD bitmap or VMWRITE bitmap is set to 1.
- It accesses the VMCS referenced by the VMCS01 link pointer (VMCS02 in
our case) if the above-mentioned bit is set to 0.
This patch handles the VMREAD and VMWRITE VM exits in this way:
- on VMWRITE, it writes the desired VMCS value to the respective field
in the cached VMCS12. For VMCS fields that need to be synced to VMCS02,
it sets the corresponding dirty flag.
- on VMREAD, it reads the desired VMCS value from the cached VMCS12.
Tracked-On: #5923
Signed-off-by: Alex Merritt <alex.merritt@intel.com>
Signed-off-by: Sainath Grandhi <sainath.grandhi@intel.com>
Signed-off-by: Zide Chen <zide.chen@intel.com>
Acked-by: Eddie Dong <eddie.dong@Intel.com>
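A sketch of the exit-handler side, assuming hypothetical accessors over the
cached VMCS12 and a dirty-flag setter for fields mirrored into VMCS02:

    #include <stdint.h>
    #include <stdbool.h>

    extern uint64_t vmcs12_read_field(const void *vmcs12, uint32_t encoding);
    extern void vmcs12_write_field(void *vmcs12, uint32_t encoding, uint64_t val);
    extern bool field_needs_vmcs02_sync(uint32_t encoding);
    extern void set_vmcs02_dirty_flag(void *vcpu, uint32_t encoding);

    /* VMWRITE exit: update the cached VMCS12 and mark fields that must
     * be propagated to VMCS02 before the next L2 entry. */
    static void handle_vmwrite(void *vcpu, void *cache_vmcs12,
                               uint32_t encoding, uint64_t value)
    {
        vmcs12_write_field(cache_vmcs12, encoding, value);
        if (field_needs_vmcs02_sync(encoding)) {
            set_vmcs02_dirty_flag(vcpu, encoding);
        }
    }

    /* VMREAD exit: serve the value from the cached VMCS12. */
    static uint64_t handle_vmread(const void *cache_vmcs12, uint32_t encoding)
    {
        return vmcs12_read_field(cache_vmcs12, encoding);
    }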
This patch is to emulate the VMCLEAR instruction.
The L1 hypervisor issues VMCLEAR on a VMCS12 whose state could be any of
these: active and current, active but not current, or not yet VMPTRLDed.
To emulate the VMCLEAR instruction, ACRN sets the VMCS12 launch state to
"clear", and if L0 has already cached this VMCS12, it needs to be synced
back to guest memory:
- sync the shadow fields from the shadow VMCS to the cached VMCS12
- copy the cached VMCS12 to L1 guest memory
Tracked-On: #5923
Signed-off-by: Sainath Grandhi <sainath.grandhi@intel.com>
Signed-off-by: Zide Chen <zide.chen@intel.com>
Enable VMCS shadowing for most of the VMCS fields, so that executing
VMREAD or VMWRITE on these shadowed VMCS fields from the L1 hypervisor
won't cause VM exits, but will read from or write to the shadow VMCS
instead.
Tracked-On: #5923
Signed-off-by: Sainath Grandhi <sainath.grandhi@intel.com>
Signed-off-by: Alexander Merritt <alex.merritt@intel.com>
Signed-off-by: Zide Chen <zide.chen@intel.com>
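A sketch of the enabling step, using the architectural encodings from the SDM
(secondary processor-based control bit 14 is "VMCS shadowing"; 0x2026/0x2028
are the VMREAD/VMWRITE bitmap address fields; 0x2800 is the VMCS link
pointer). A single all-zero bitmap page is assumed for both bitmaps here: a
zero bit lets the access go straight to the shadow VMCS without a VM exit.

    #include <stdint.h>

    #define VMX_PROC_VM_EXEC_CONTROLS2  0x401EU  /* secondary controls */
    #define VMX_VMREAD_BITMAP_FULL      0x2026U
    #define VMX_VMWRITE_BITMAP_FULL     0x2028U
    #define VMX_VMCS_LINK_PTR_FULL      0x2800U
    #define VMCS_SHADOWING              (1U << 14U)

    /* Assumed primitives over the current VMCS (VMCS01 here). */
    extern uint64_t exec_vmread(uint32_t encoding);
    extern void exec_vmwrite(uint32_t encoding, uint64_t value);

    /* Point VMCS01's link pointer at VMCS02 (the shadow VMCS) and turn
     * on VMCS shadowing, so unbitmapped VMREAD/VMWRITE from L1 hit
     * VMCS02 directly. */
    static void enable_vmcs_shadowing(uint64_t bitmap_pa, uint64_t vmcs02_pa)
    {
        exec_vmwrite(VMX_VMREAD_BITMAP_FULL, bitmap_pa);
        exec_vmwrite(VMX_VMWRITE_BITMAP_FULL, bitmap_pa);
        exec_vmwrite(VMX_VMCS_LINK_PTR_FULL, vmcs02_pa);
        exec_vmwrite(VMX_PROC_VM_EXEC_CONTROLS2,
                     exec_vmread(VMX_PROC_VM_EXEC_CONTROLS2) | VMCS_SHADOWING);
    }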
The software layout of the VMCS12 data is a contract between the L1 guest and
the L0 hypervisor to run an L2 guest.
The ACRN hypervisor caches the VMCS12 which is passed down from the L1
hypervisor by the VMPTRLD instruction. At the time of VMCLEAR, ACRN syncs the
cached VMCS12 back to L1 guest memory.
Tracked-On: #5923
Signed-off-by: Sainath Grandhi <sainath.grandhi@intel.com>
Signed-off-by: Zide Chen <zide.chen@intel.com>
Acked-by: Eddie Dong <eddie.dong@Intel.com>
This patch emulates the VMPTRLD instruction. The L0 hypervisor (ACRN) caches
the VMCS12 that is passed down by the VMPTRLD instruction, and merges it
with VMCS01 to create VMCS02 to run the nested VM.
- Currently ACRN can't cache multiple VMCS12s on one vCPU, so it needs to
flush active-but-not-current VMCS12s back to the L1 guest.
- ACRN creates VMCS02 to run the nested VM based on VMCS12:
1) copy VMCS12 from guest memory to the per-vCPU cached VMCS12
2) initialize the VMCS02 revision ID and host-state area
3) load the shadow fields from the cached VMCS12 to VMCS02
4) enable VMCS shadowing before L1 VM entry
Tracked-On: #5923
Signed-off-by: Sainath Grandhi <sainath.grandhi@intel.com>
Signed-off-by: Zide Chen <zide.chen@intel.com>
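A sketch of the VMPTRLD path with a single-slot VMCS12 cache; all helper names
stand in for the steps listed above and are hypothetical:

    #include <stdint.h>
    #include <string.h>

    extern void *gpa2hva(void *vm, uint64_t gpa);
    extern void flush_current_vmcs12(void *vcpu);       /* write cache back to L1 */
    extern void init_vmcs02(void *vcpu);                /* step 2: rev ID + host state */
    extern void load_shadow_fields(void *vcpu);         /* step 3: cache -> VMCS02 */
    extern void enable_vmcs_shadowing_for(void *vcpu);  /* step 4 */

    #define VMCS12_SIZE 4096U

    /* Evict any previously loaded VMCS12 back to L1 memory (single-slot
     * cache), then cache the new one and merge it into VMCS02. */
    static void emulate_vmptrld(void *vcpu, void *vm, void *cache_vmcs12,
                                uint64_t vmcs12_gpa)
    {
        flush_current_vmcs12(vcpu);
        memcpy(cache_vmcs12, gpa2hva(vm, vmcs12_gpa), VMCS12_SIZE); /* step 1 */
        init_vmcs02(vcpu);                                          /* step 2 */
        load_shadow_fields(vcpu);                                   /* step 3 */
        enable_vmcs_shadowing_for(vcpu);                            /* step 4 */
    }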
This patch implements the VMXOFF instruction. By issuing VMXOFF, the
L1 guest leaves VMX operation.
- clean up the vCPU nested virtualization context states in the VMXOFF handler.
- implement check_vmx_permission() to check the permission for VMX operation,
for VMXOFF and other VMX instructions.
Tracked-On: #5923
Signed-off-by: Sainath Grandhi <sainath.grandhi@intel.com>
Signed-off-by: Zide Chen <zide.chen@intel.com>
Acked-by: Eddie Dong <eddie.dong@Intel.com>
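A sketch of check_vmx_permission() following the SDM rules (injection helpers
are assumed names): outside VMX operation a VMX instruction takes #UD; inside
VMX operation it takes #GP(0) unless CPL is 0.

    #include <stdbool.h>
    #include <stdint.h>

    extern void vcpu_inject_ud(void *vcpu);
    extern void vcpu_inject_gp(void *vcpu, uint32_t err);
    extern bool guest_in_vmx_operation(const void *vcpu);
    extern uint32_t guest_cpl(const void *vcpu);

    /* Returns true when the VMX instruction may proceed; otherwise the
     * appropriate exception has been injected into the guest. */
    static bool check_vmx_permission(void *vcpu)
    {
        if (!guest_in_vmx_operation(vcpu)) {
            vcpu_inject_ud(vcpu);
            return false;
        }
        if (guest_cpl(vcpu) != 0U) {
            vcpu_inject_gp(vcpu, 0U);
            return false;
        }
        return true;
    }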
According to the VMXON Instruction Reference, perform the following checks in
the virtual hardware environment: vCPU CPL, guest CR0 and CR4, the revision ID
in the VMXON region, etc.
Currently ACRN doesn't support a 32-bit L1 hypervisor, and injects a #UD
exception if the L1 hypervisor is not running in 64-bit mode.
Tracked-On: #5923
Signed-off-by: Zide Chen <zide.chen@intel.com>
Acked-by: Eddie Dong <eddie.dong@Intel.com>
This patch emulates the VMXON instruction. Basically it checks some
prerequisites to enable VMX operation on the L1 guest (next patch), and
prepares some of the virtual hardware environment in L0.
Tracked-On: #5923
Signed-off-by: Sainath Grandhi <sainath.grandhi@intel.com>
Signed-off-by: Zide Chen <zide.chen@intel.com>
Acked-by: Eddie Dong <eddie.dong@Intel.com>
Commit 2ab70f43e5 ("HV: cache: Fix page fault by flushing cache for VM
trusty RAM in HV") uses stac()/clac() incorrectly.
Tracked-On: #6020
Signed-off-by: Tao Yuhong <yuhong.tao@intel.com>
Currently a guest would use `Destination Shorthand` to broadcast IPIs if there
is more than one destination. However, this is not supported when the guest is
in the LAPIC passthrough situation and all active vCPUs are working in x2APIC
mode. As a result, the guest would not work properly, since this kind of
broadcast IPI was ignored by ACRN. What's worse, the ACRN hypervisor would
inject a #GP into the guest in this case.
This patch extends vlapic_x2apic_pt_icr_access to support more destination
modes (both `Physical` and `Logical`) and all destination shorthands (`No
Shorthand`, `Self`, `All Including Self` and `All Excluding Self`).
Tracked-On: #5923
Signed-off-by: Zide Chen <zide.chen@intel.com>
Signed-off-by: Li Fei1 <fei1.li@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
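A sketch of the extended ICR handling: shorthands are expanded to an explicit
pCPU mask. Per the SDM's x2APIC ICR layout, bits 7:0 hold the vector, bit 11
the destination mode, bits 19:18 the shorthand, and bits 63:32 the
destination; the mask helpers are illustrative names.

    #include <stdint.h>
    #include <stdbool.h>

    #define APIC_DEST_NOSHORT  0U
    #define APIC_DEST_SELF     1U
    #define APIC_DEST_ALL_INC  2U  /* all including self */
    #define APIC_DEST_ALL_EXC  3U  /* all excluding self */

    extern void send_ipi_mask(uint64_t pcpu_mask, uint32_t vector);
    extern uint64_t calc_dest_mask(void *vm, uint32_t dest, bool phys_mode);

    static void pt_icr_access(void *vm, uint64_t self_mask, uint64_t all_mask,
                              uint64_t icr)
    {
        uint32_t vector    = (uint32_t)(icr & 0xffUL);
        bool     phys_mode = ((icr & (1UL << 11UL)) == 0UL);
        uint32_t shorthand = (uint32_t)((icr >> 18UL) & 0x3UL);
        uint32_t dest      = (uint32_t)(icr >> 32UL);

        switch (shorthand) {
        case APIC_DEST_NOSHORT:
            send_ipi_mask(calc_dest_mask(vm, dest, phys_mode), vector);
            break;
        case APIC_DEST_SELF:
            send_ipi_mask(self_mask, vector);
            break;
        case APIC_DEST_ALL_INC:
            send_ipi_mask(all_mask, vector);
            break;
        default: /* APIC_DEST_ALL_EXC */
            send_ipi_mask(all_mask & ~self_mask, vector);
            break;
        }
    }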
- Change the 64-bit MMIO BAR window to 256G-512G
The release is the same as 736a03222f; this is just to update the changelog.
Tracked-On: #5913
Signed-off-by: Peter Fang <peter.fang@intel.com>
A user can use "--pm_notify_channel uart,allow_trigger_s5" to indicate
the User VM is allowed to trigger system S5.
"--pm_notify_channel uart" means a vuart channel will be created in the
User VM to allow communication with the VM's life_mngr. The Service VM
can then initiate S5 in the guest via its dm's monitor interface. The
additional option, "allow_trigger_s5", will create a socket connection
with the Service VM's life_mngr, allowing this VM to initiate system S5.
v1 -> v2:
- rename pm_notify_channel type to PWR_EVENT_NOTIFY_UART_TRIG_PLAT_S5
Tracked-On: #6034
Signed-off-by: Peter Fang <peter.fang@intel.com>
Acked-by: Wang, Yu1 <yu1.wang@intel.com>
The access right of HV RAM can be changed to PAGE_USER (e.g. the trusty RAM
of a post-launched VM). So before using clflush (or clflushopt) to flush
HV RAM cache lines, we must allow explicit supervisor-mode data accesses to
user-mode pages; otherwise, a page fault may be triggered.
Tracked-On: #6020
Signed-off-by: Tao Yuhong <yuhong.tao@intel.com>
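A sketch of the intended pattern: with SMAP enabled, flushing a PAGE_USER
mapping from ring 0 faults unless RFLAGS.AC is set, so STAC/CLAC must bracket
the whole flush loop (the helper name is illustrative).

    #include <stdint.h>

    static inline void stac(void) { asm volatile ("stac" : : : "memory"); }
    static inline void clac(void) { asm volatile ("clac" : : : "memory"); }

    static inline void clflushopt(const volatile void *p)
    {
        asm volatile ("clflushopt (%0)" : : "r"(p) : "memory");
    }

    static void flush_user_mapped_range(const volatile uint8_t *base,
                                        uint64_t size)
    {
        stac();  /* permit supervisor access to user-mode pages */
        for (uint64_t i = 0UL; i < size; i += 64UL) {  /* 64B cache lines */
            clflushopt(base + i);
        }
        clac();  /* restore SMAP protection */
    }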
The hypervisor does not need to care about hugepage settings in the SOS
kernel; users can enable these settings in the scenario config file or the
GRUB menu.
Tracked-On: #5815
Signed-off-by: Victor Sun <victor.sun@intel.com>
changes:
1. The VM load order type condition is not needed, since the function
is called only when creating the SOS VM or a pre-launched VM;
2. Fixed a wrong parameter of fill_seed_arg() which was introduced by commit
80262f0602;
3. More comments on why the multiboot string could override the pre-
configured VM bootargs and why the multiboot cmdline is appended to the
SOS VM bootargs;
Tracked-On: #5815
Signed-off-by: Victor Sun <victor.sun@intel.com>
If the board XML reports hugepage support, the config tool will
add hugepagesz=1G hugepages=[size] to the SOS kernel cmdline, where
size is the memory size in GiB minus 3; the 3 GiB subtracted are reserved
for SOS VM use. For example, a board with 32 GiB of memory gets
hugepagesz=1G hugepages=29.
Tracked-On: #5815
Signed-off-by: Kunhui Li <kunhuix.li@intel.com>
Reviewed-By: Junjie Mao <junjie.mao@intel.com>
* Split linux and windows build of lifemngr to be able to build
them independently.
* Install life_mngr.service
* Avoid the following Makefile error output by explicitly checking
Windows cross compiler availability:
make[4]: x86_64-w64-mingw32-gcc: Command not found
make[4]: [Makefile:47: all-win] Error 127 (ignored)
Tracked-On: #5660
Signed-off-by: Helmut Buchsbaum <helmut.buchsbaum@opensource.tttech-industrial.com>
Use the BUILD_VERSION and BUILD_TAG variables also for the hypervisor,
acrnprobe and crashlog. This eases building from an archive without
git available.
Tracked-On: #6035
Signed-off-by: Helmut Buchsbaum <helmut.buchsbaum@opensource.tttech-industrial.com>
Make builds reproducible by honoring the SOURCE_DATE_EPOCH and USER
environment variables in the respective Makefiles, following the
recommendations at https://reproducible-builds.org/.
Build tools (e.g. Debian packaging, Yocto) use this to ensure the
reproducibility of packages.
Tracked-On: #6035
Signed-off-by: Helmut Buchsbaum <helmut.buchsbaum@opensource.tttech-industrial.com>
Create the virtual root port through the add_vdev hypercall. add_vdev
identifies the virtual device to add by its vendor ID and device ID, then
calls the corresponding function to create the virtual device.
- create_vrp(): find the right virtual root port to create
by its secondary bus number, then initialize the virtual root port,
and finally initialize the PTM-related configuration.
- destroy_vrp(): nothing to destroy
Tracked-On: #5915
Signed-off-by: Rong Liu <rong.l.liu@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
Acked-by: Jason Chen <jason.cj.chen@intel.com>
Acked-by: Yu Wang <yu1.wang@intel.com>
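A sketch of the add_vdev dispatch described above, with hypothetical IDs and
table layout: the hypercall matches the requested vendor/device pair against a
table and invokes the creator, which for the vRP locates the port by secondary
bus number and sets up PTM.

    #include <stdint.h>

    #define VRP_VENDOR  0x8086U  /* illustrative IDs, not the real ones */
    #define VRP_DEVICE  0x0000U

    struct vdev_ops {
        int  (*create)(void *vm, void *dev_info);
        void (*destroy)(void *vdev);
    };

    extern int create_vrp(void *vm, void *dev_info);
    static void destroy_vrp(void *vdev) { (void)vdev; /* nothing to destroy */ }

    /* add_vdev walks this table to find the creator for the requested
     * vendor/device ID pair. */
    static const struct {
        uint16_t vendor, device;
        struct vdev_ops ops;
    } vdev_dispatch[] = {
        { VRP_VENDOR, VRP_DEVICE, { create_vrp, destroy_vrp } },
    };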
Add a virtual root port that supports the most basic PCIe bridge and root
port operations.
- init_vroot_port(): initialize vroot_port's basic registers.
- deinit_vroot_port(): reset vroot_port.
- read_vroot_port_cfg(): read from vroot_port's virtual config space.
- write_vroot_port_cfg(): write to vroot_port's virtual config space.
Tracked-On: #5915
Signed-off-by: Rong Liu <rong.l.liu@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
Acked-by: Jason Chen <jason.cj.chen@intel.com>
Acked-by: Yu Wang <yu1.wang@intel.com>