Initialize efi info of acrn mbi when boot from multiboot2 protocol, with
this patch hypervisor could get host efi info and pass it to Linux zeropage,
then make guest Linux possible to boot with efi environment;
Tracked-On: #4419
Signed-off-by: Victor Sun <victor.sun@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
Initialize module info and ACPI rsdp info of acrn mbi when boot from
multiboot2 protocol, with this patch SOS VM could be loaded sucessfully
with correct ACPI RSDP;
Tracked-On: #4419
Signed-off-by: Victor Sun <victor.sun@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
Initialize mmap info of acrn mbi when boot from multiboot2 protocol,
with this patch acrn hv could boot from multiboot2;
Tracked-On: #4419
Signed-off-by: Victor Sun <victor.sun@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
Add multiboot2 header info in HV image so that bootloader could
recognize it.
Tracked-On: #4419
Signed-off-by: Victor Sun <victor.sun@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
Initialize and sanitize a acrn specific multiboot info struct with current
supported multiboot1 in very early boot stage, which would bring below
benifits:
- don't need to do hpa2hva convention every time when refering boot_regs;
- panic early if failed to sanitize multiboot info, so that don't need to
check multiboot info pointer/flags and panic in later boot process;
- keep most code unchanged when introduce multiboot2 support in future;
Tracked-On: #4419
Signed-off-by: Victor Sun <victor.sun@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
The patch re-arch boot component header files by:
- moving multiboot.h from include/arch/x86/ to boot/include/ and keep
this header for multiboot1 protocol data struct only;
- moving multiboot related MACROs in cpu_primary.S to multiboot.h;
- creating an independent boot.h to store acrn specific boot information
for other files' reference;
Tracked-On: #4419
Signed-off-by: Victor Sun <victor.sun@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
- It is meaningless to enable debug function in parse_hv_cmdline() because
the function run in very eary stage and uart has not been initialized at
that time, so remove this debug level definition;
- Rewrite parse_hv_cmdline() function to make it compliant with MISRA-C;
- Decouple uart16550 stuff from Init.c module and let console.c handle it;
Tracked-On: #4419
Signed-off-by: Victor Sun <victor.sun@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
We hit build issue if the ld version is 2.34:
error: PHDR segment not covered by LOAD segment
One issue was created to binutils bugzilla system:
https://sourceware.org/bugzilla/show_bug.cgi?id=25585
From the ld guys comment, this is not an issue of 2.34. It's an
issue fixing of the old ld. He suggested to add option
--no-dynamic-linker
to ld if we don't depend on dynamically linker to loader our binary.
Tracked-On: #4415
Signed-off-by: Yin Fengwei <fengwei.yin@intel.com>
Count down number will be decreased at each tick, when it comes to zero,
it will trigger reschedule.
Tracked-On: #4410
Signed-off-by: Conghui Chen <conghui.chen@intel.com>
Signed-off-by: Shuo A Liu <shuo.a.liu@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
pick_next function will update the virtual time parameters, and return
the vcpu thread with earlest evt. Calculate the count down number for
the picked vcpu thread, it means how many mcu a thread can run before
the next reschedule occur.
Tracked-On: #4410
Signed-off-by: Conghui Chen <conghui.chen@intel.com>
Signed-off-by: Shuo A Liu <shuo.a.liu@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
In the wakeup handler, the vcpu_thread object will be inserted into the
runqueue, and in the sleep handler, it will be removed from the queue.
vcpu_thread object is ordered by EVT (effective virtual time).
Tracked-On: #4410
Signed-off-by: Conghui Chen <conghui.chen@intel.com>
Signed-off-by: Shuo A Liu <shuo.a.liu@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
Add init function for bvt scheduler, creating a runqueue and a period
timer, the timer interval is default as 1ms. The interval is the minimum
charging unit.
Tracked-On: #4410
Signed-off-by: Conghui Chen <conghui.chen@intel.com>
Signed-off-by: Shuo A Liu <shuo.a.liu@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
BVT (Borrowed virtual time) scheduler is used to schedule vCPUs on pCPU.
It has the concept of virtual time, vCPU with earliset virtual time is
dispatched first.
Main concepts:
tick timer:
a period tick is used to measure the physcial time in units of MCU
(minimum charing unit).
runqueue:
thread in the runqueue is ordered by virtual time.
weight:
each thread receives a share of the pCPU in proportion to its
weight.
context switch allowance:
the physcial time by which the current thread is allowed to advance
beyond the next runnable thread.
warp:
a thread with warp enabled will have a change to minus a value (Wi)
from virtual time to achieve higher priority.
virtual time:
AVT: actual virtual time, advance in proportional to weight.
EVT: effective virtual time.
EVT <- AVT - ( warp ? Wi : 0 )
SVT: scheduler virtual time, the minimum AVT in the runqueue.
Tracked-On: #4410
Signed-off-by: Conghui Chen <conghui.chen@intel.com>
Signed-off-by: Shuo A Liu <shuo.a.liu@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
1. Rename BOOT_CPU_ID to BSP_CPU_ID
2. Repace hardcoded value with BSP_CPU_ID when
ID of BSP is referenced.
Tracked-On: #4420
Signed-off-by: Yonghua Huang <yonghua.huang@intel.com>
Now only PCI MSI-X BAR access need dynamic register/unregister. Others don't need
unregister once it's registered. So we don't need to lock the vm level emul_mmio_lock
when we handle the MMIO access. Instead, we could use finer granularity lock in the
handler to ptotest the shared resource.
This patch fixed the dead lock issue when OVMF try to size the BAR size:
Becasue OVMF use ECAM to access the PCI configuration space, it will first hold vm
emul_mmio_lock, then calls vpci_handle_mmconfig_access. While this tries to size a
BAR which is also a MSI-X Table BAR, it will call register_mmio_emulation_handler to
register the MSI-X Table BAR MMIO access handler. This will causes the emul_mmio_lock
dead lock.
Tracked-On: #3475
Signed-off-by: Li Fei1 <fei1.li@intel.com>
Now we split passthrough PCI device from DM to HV, we could remove all the passthrough
PCI device unused code.
Tracked-On: #4371
Signed-off-by: Li Fei1 <fei1.li@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
In this case, we could handle all the passthrough PCI devices in ACRN hypervisor.
But we still need DM to initialize BAR resources and Intx for passthrough PCI
device for post-launched VM since these informations should been filled into
ACPI tables. So
1. we add a HC vm_assign_pcidev to pass the extra informations to replace the old
vm_assign_ptdev.
2. we saso remove HC vm_set_ptdev_msix_info since it could been setted by the post-launched
VM now same as SOS.
3. remove vm_map_ptdev_mmio call for PTDev in DM since ACRN hypervisor will handle these
BAR access.
4. the most important thing is to trap PCI configure space access for PTDev in HV for
post-launched VM and bypass the virtual PCI device configure space access to DM.
This patch doesn't do the clean work. Will do it in the next patch.
Tracked-On: #4371
Signed-off-by: Li Fei1 <fei1.li@intel.com>
Add assign/deassign PCI device hypercall APIs to assign a PCI device from SOS to
post-launched VM or deassign a PCI device from post-launched VM to SOS. This patch
is prepared for spliting passthrough PCI device from DM to HV.
The old assign/deassign ptdev APIs will be discarded.
Tracked-On: #4371
Signed-off-by: Li Fei1 <fei1.li@intel.com>
The previous fcf-protection fix broke the old gcc (older than
gcc 8 which is common on Ubuntu 18.04 and older distributions).
We only add fcf-protection=none for gcc8 and newer.
Tracked-On: #4358
Signed-off-by: Yin Fengwei <fengwei.yin@intel.com>
apl-mrb need to access P2SB device, so add 00:0d.0 P2SB device to
whitelist for platform pci hidden device.
Tracked-On: #3475
Signed-off-by: Wei Liu <weix.w.liu@intel.com>
Reviewed-by: Binbin Wu <binbin.wu@intel.com>
Acked-by: Victor Sun <victor.sun@intel.com>
To enable gvt-d,need to allow the GPU IOMMU.
While gvt-d hasn't been enabled on APL yet,
so let APL disable GPU IOMMU.
v2 -> v3:
* let APL platforms disable GPU IOMMU.
Tracked-On: #4405
Signed-off-by: Junming Liu <junming.liu@intel.com>
Reviewed-by: Wu Binbin <binbin.wu@intel.com>
If one of the enabled VT-d DMAR units
doesn’t support snoop control,
then bit 11 of leaf PET of EPT is not set,
since the field is treated as reserved(0)
by VT-d hardware implementations
not supporting snoop control.
GUP IOMMU doesn’t support snoop control,
this patch add an option to disable
iommu snoop control for gvt-d.
v2 -> v3:
* refine the MICRO name and description.
Tracked-On: #4405
Signed-off-by: Junming Liu <junming.liu@intel.com>
Reviewed-by: Wu Binbin <binbin.wu@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
On UEFI UP2 board, APs might execute HLT before SOS kernel INIT them.
After SOS kernel take over and will re-init the APs directly. The flows
from HV perspective is like:
HLT trap:
wait_event(VCPU_EVENT_VIRTUAL_INTERRUPT) -> sleep_thread
SOS kernel INIT, SIPI APs:
pause_vcpu(ZOMBIE) -> sleep_thread
-> reset_vcpu
-> launch_vcpu -> wake_vcpu
However, the last wake_vcpu will fail because the cpu event
VCPU_EVENT_VIRTUAL_INTERRUPT had not got signaled.
This patch will reset all vcpu events in reset_vcpu. If the thread was
previously waiting for a event, its waiting status will be cleared and
launch_vcpu will wake it to running.
Tracked-On: #4402
Signed-off-by: Shuo A Liu <shuo.a.liu@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
In platforms that support CAT, when it is enabled by ACRN, i.e.
IA32_resourceType_MASK_n registers are programmed with customized values,
it has impacts to the whole system.
The per guest flag GUEST_FLAG_CLOS_REQUIRED suggests that CAT may be
enabled in some guests, but not in others who don't have this flag,
which is conceptually incorrect.
This patch removes GUEST_FLAG_CLOS_REQUIRED, and adds a new Kconfig
entry CAT_ENABLED for CAT enabling. When it's enabled, platform_clos_array[]
defines a set of system-wide Class of Service (COS, or CLOS), and the
per guest vm_configs[].clos associates the guest with particular CLOS.
Tracked-On: #2462
Signed-off-by: Zide Chen <zide.chen@intel.com>
In some build env (Ubuntu 19.10 as example), gcc enabled the option
-fcf-protection by default. But this option is not compatible with
-mindirect-branch. Which could trigger following build error:
fail to build with gcc-9 [error: ‘-mindirect-branch’ and
‘-fcf-protection’ are not compatible]
-mindirect-branch is mandatory for retpoline mitigation and always
enabled for ACRN build. We disable -fcf-protection here for ACRN
build.
Tracked-On: #4358
Signed-off-by: Yin Fengwei <fengwei.yin@intel.com>
Acked-by: Wu Binbin <binbin.wu@intel.com>
Currently panic() and pr_xxx() statements before init_primary_pcpu_post()
won't be printed, which is inconvenient and misleading for debugging.
This patch makes pr_xxx() APIs working before init_pcpu_pre():
- clear .bss in init.c, which makes sense to clear .bss at the very beginning
of initialization code. Also this makes it possible to call init_logmsg()
before init_pcpu_pre().
- move parse_hv_cmdline() and uart16550_init(true) to init.c.
- refine ticks_to_us() to handle the case that it's called before
calibrate_tsc(). As a side effect, it prints "0us" in early pr_xxx() calls.
- call init_debug_pre() in init_primary_pcpu() and after this point,
both printf() and pr_xxx() APIs are available.
However, this patch doesn't address the issue that pr_xxx() could be called
on PCPUs that set_current_pcpu_id() hasn't been called, which implies that
the PCPU ID shown in early logs may not be accurate.
Tracked-On: #2987
Signed-off-by: Zide Chen <zide.chen@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
it is better to init bdfs_from_drhds.pci_bdf_map_count
before it is passed to other function to do:
bdfs_from_drhds->pci_bdf_map_count++
Tracked-On: #3875
Signed-off-by: Minggui Cao <minggui.cao@intel.com>
Reviewed-by: Fei Li <fei1.li@intel.com>
INVALID_BIT_INDEX has 16 bits only, which removes all pcpu_id that
is >= 16 from the destination mask.
Tracked-On: #4354
Signed-off-by: Zide Chen <zide.chen@intel.com>
1. Align the coding style for these MACROs
2. Align the values of fixed VECTORs
Tracked-On: #4348
Signed-off-by: Yonghua Huang <yonghua.huang@intel.com>
In lapic passthrough mode, it should passthrough HLT/PAUSE execution
too. This patch disable their emulation when switch to lapic passthrough mode.
Tracked-On: #4329
Tested-by: Dongsheng Zhang <dongsheng.x.zhang@intel.com>
Signed-off-by: Shuo A Liu <shuo.a.liu@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
Due to vcpu and its thread are two different perspective modules, each
of them has its own status. Dump both states for better understanding
of system status.
Tracked-On: #4329
Signed-off-by: Shuo A Liu <shuo.a.liu@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
is_polling_ioreq is more straightforward. Rename it.
Tracked-On: #4329
Signed-off-by: Shuo A Liu <shuo.a.liu@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
This patch checks the validity of 'vdev->pdev' to
ensure physical device is linked to 'vdev'.
this check is to avoid some potential hypervisor
crash when destroying VM with crafted input.
Tracked-On: #4336
Signed-off-by: Yonghua Huang <yonghua.huang@intel.com>
Reviewed-by: Fei Li <fei1.li@intel.com>
'param' is BDF value instead of GPA when VHM driver
issues below 2 hypercalls:
- HC_ASSIGN_PTEDEV
- HC_DEASSIGN_PTDEV
This patch is to remove related code in hc_assign/deassign()
functions.
Tracked-On: #4334
Signed-off-by: Yonghua Huang <yonghua.huang@intel.com>
Reviewed-by: Fei Li <fei1.li@intel.com>
SOS will use PCIe ECAM access PCIe external configuration space. HV should trap this
access for security(Now pre-launched VM doesn't want to support PCI ECAM; post-launched
VM trap PCIe ECAM access in DM).
Besides, update PCIe MMCONFIG region to be owned by hypervisor and expose and pass through
platform hide PCI devices by BIOS to SOS.
Tracked-On: #3475
Signed-off-by: Li Fei1 <fei1.li@intel.com>
Use Enhanced Configuration Access Mechanism (MMIO) instead of PCI-compatible
Configuration Mechanism (IO port) to access PCIe Configuration Space
PCI-compatible Configuration Mechanism (IO port) access is used for UART in
debug version.
Tracked-On: #3475
Signed-off-by: Li Fei1 <fei1.li@intel.com>
This patch overwrites the idle driver of service OS for industry, sdc,
sdc2 scenarios. HLT will be used as the default idle action.
Tracked-On: #4329
Signed-off-by: Shuo A Liu <shuo.a.liu@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
HLT emulation is import to CPU resource maximum utilization. vcpu
doing HLT means it is idle and can give up CPU proactively. Thus, we
pause the vcpu thread in HLT emulation and resume it while event happens.
When vcpu enter HLT, its vcpu thread will sleep, but the vcpu state is
still 'Running'.
VM ID PCPU ID VCPU ID VCPU ROLE VCPU STATE
===== ======= ======= ========= ==========
0 0 0 PRIMARY Running
0 1 1 SECONDARY Running
Tracked-On: #4329
Signed-off-by: Shuo A Liu <shuo.a.liu@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
Sometimes HV wants to know if there are pending interrupts of one vcpu.
Add .has_pending_intr interface in acrn_apicv_ops and return the pending
interrupts status by check IRRs of apicv.
Tracked-On: #4329
Signed-off-by: Shuo A Liu <shuo.a.liu@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
Introduce two kinds of events for each vcpu,
VCPU_EVENT_IOREQ: for vcpu waiting for IO request completion
VCPU_EVENT_VIRTUAL_INTERRUPT: for vcpu waiting for virtual interrupts events
vcpu can wait for such events, and resume to run when the
event get signalled.
This patch also change IO request waiting/notifying to this way.
Tracked-On: #4329
Signed-off-by: Shuo A Liu <shuo.a.liu@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
This simple event implemention can only support exclusive waiting
at same time. It mainly used by thread who want to wait for special event
happens.
Thread A who want to wait for some events calls
wait_event(struct sched_event *);
Thread B who can give the event signal calls
signal_event(struct sched_event *);
Tracked-On: #4329
Signed-off-by: Shuo A Liu <shuo.a.liu@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
As we enabled cpu sharing, PAUSE-loop exiting can help vcpu
to release its pcpu proactively. It's good for performance.
VMX_PLE_GAP: upper bound on the amount of time between two successive
executions of PAUSE in a loop.
VMX_PLE_WINDOW: upper bound on the amount of time a guest is allowed to
execute in a PAUSE loop
Tracked-On: #4329
Signed-off-by: Shuo A Liu <shuo.a.liu@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
In current code, wait_pcpus_offline() and make_pcpu_offline() are called by
both shutdown_vm() and reset_vm(), but this is not needed when lapic_pt is
not enabled for the vcpus of the VM.
The patch merged offline pcpus part code into a common
offline_lapic_pt_enabled_pcpus() api for shutdown_vm() and reset_vm() use and
called only when lapic_pt is enabled.
Tracked-On: #4325
Signed-off-by: Victor Sun <victor.sun@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
1. This patch passes-through CR4.PCIDE to guest VM.
2. This patch handles the invlidation of TLB and the paging-structure caches.
According to SDM Vol.3 4.10.4.1, the following instructions invalidate
entries in the TLBs and the paging-structure caches:
- INVLPG: this instruction is passed-through to guest, no extra handling needed.
- INVPCID: this instruction is passed-trhough to guest, no extra handling needed.
- CR0.PG from 1 to 0: already handled by current code, change of CR0.PG will do
EPT flush.
- MOV to CR3: hypervisor doesn't trap this instrcution, no extra handling needed.
- CR4.PGE changed: already handled by current code, change of CR4.PGE will no EPT
flush.
- CR4.PCIDE from 1 to 0: this patch handles this case, will do EPT flush.
- CR4.PAE changed: already handled by current code, change of CR4.PAE will do EPT
flush.
- CR4.SEMP from 1 to 0, already handled by current code, change of CR4.SEMP will
do EPT flush.
- Task switch: Task switch is not supported in VMX non-root mode.
- VMX transitions: already handled by current code with the support of VPID.
3. This patch checks the validatiy of CR0, CR4 related to PCID feature.
According to SDM Vol.3 4.10.1, CR.PCIDE can be 1 only in IA-32e mode.
- MOV to CR4 causes a general-protection exception (#GP) if it would change CR4.PCIDE
from 0 to 1 and either IA32_EFER.LMA = 0 or CR3[11:0] ≠ 000H
- MOV to CR0 causes a general-protection exception if it would clear CR0.PG to 0
while CR4.PCIDE = 1
Tracked-On: #4296
Signed-off-by: Binbin Wu <binbin.wu@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>