So that DM can retrieve physical APIC IDs and use them to fill in the ACPI MADT table for
post-launched VMs.
Note:
1. DM needs to use the same logic as hypervisor to calculate vLAPIC IDs based on physical APIC IDs
and CPU affinity setting
2. Using reserved0[] in struct hc_platform_info to pass physical APIC IDs means we can only support at
most 116 cores. And it assumes LAPIC ID is 8bits (X2APIC mode supports 32 bits).
Cat IDs shift will be used by DM RTCT V2
Tracked-On: #6020
Reviewed-by: Wang, Yu1 <yu1.wang@intel.com>
Signed-off-by: dongshen <dongsheng.x.zhang@intel.com>
Hypervisor does not need to care about hugepage settings in SOS kernel, user
could enable these settings in the scenario config file or GRUB menu.
Tracked-On: #5815
Signed-off-by: Victor Sun <victor.sun@intel.com>
Create virtual root port through add_vdev hypercall. add_vdev
identifies the virtual device to add by its vendor id and device id, then
call the corresponding function to create virtual device.
-create_vrp(): Find the right virtual root port to create
by its secondary bus number, then initialize the virtual root port.
And finally initialize PTM related configurations.
-destroy_vrp(): nothing to destroy
Tracked-On: #5915
Signed-off-by: Rong Liu <rong.l.liu@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
Acked-by: Jason Chen <jason.cj.chen@intel.com>
Acked-by: Yu Wang <yu1.wang@intel.com>
1. do not allow external modules to touch internal field of a timer.
2. make timer mode internal, period_in_ticks will decide the mode.
API wise:
1. the "mode" parameter was taken out of initialize_timer().
2. a new function update_timer() was added to update the timeout and
period fields.
3. the timer_expired() function was extended with an output parameter
to return the remaining cycles before expiration.
Also, the "fire_tsc" field name of hv_timer was renamed to "timeout".
With the new API, however, this change should not concern user code.
Tracked-On: #5920
Signed-off-by: Jason Chen CJ <jason.cj.chen@intel.com>
x86/timer.[ch] was moved to the common directory largely unchanged.
x86 specific code now resides in x86/tsc_deadline_timer.c and its
interface was defined in hw/hw_timer.h. The interface defines two
functions: init_hw_timer() and set_hw_timeout() that provides HW
specific initialization and timer interrupt source.
Other than these two functions, the timer module is largely arch
agnostic.
Tracked-On: #5920
Signed-off-by: Rong Liu <rong2.liu@intel.com>
Reviewed-by: Jason Chen CJ <jason.cj.chen@intel.com>
Modules that use udelay() should include "delay.h" explicitly.
Tracked-On: #5920
Signed-off-by: Rong Liu <rong2.liu@intel.com>
Reviewed-by: Jason Chen CJ <jason.cj.chen@intel.com>
Generalize and split basic cpu cycle/tick routines from x86/timer:
- Instead of rdstc(), use cpu_ticks() in generic code.
- Instead of get_tsc_khz(), use cpu_tickrate() in generic code.
- Include "common/ticks.h" instead of "x86/timer.h" in generic code.
- CYCLES_PER_MS is renamed to TICKS_PER_MS.
The x86 specific API rdstc() and get_tsc_khz(), as well as TSC_PER_MS
are still available in arch/x86/tsc.h but only for x86 specific usage.
Tracked-On: #5920
Signed-off-by: Rong Liu <rong2.liu@intel.com>
Signed-off-by: Yi Liang <yi.liang@intel.com>
The current permission-checking and dispatching mechanism of hypercalls is
not unified because:
1. Some hypercalls require the exact vCPU initiating the call, while the
others only need to know the VM.
2. Different hypercalls have different permission requirements: the
trusty-related ones are enabled by a guest flag, while the others
require the initiating VM to be the Service OS.
Without a unified logic it could be hard to scale when more kinds of
hypercalls are added later.
The objectives of this patch are as follows.
1. All hypercalls have the same prototype and are dispatched by a unified
logic.
2. Permissions are checked by a unified logic without consulting the
hypercall ID.
To achieve the first objective, this patch modifies the type of the first
parameter of hcall_* functions (which are the callbacks implementing the
hypercalls) from `struct acrn_vm *` to `struct acrn_vcpu *`. The
doxygen-style documentations are updated accordingly.
To achieve the second objective, this patch adds to `struct hc_dispatch` a
`permission_flags` field which specifies the guest flags that must ALL be
set for a VM to be able to invoke the hypercall. The default value (which
is 0UL) indicates that this hypercall is for SOS only. Currently only the
`permission_flag` of trusty-related hypercalls have the non-zero value
GUEST_FLAG_SECURE_WORLD_ENABLED.
With `permission_flag`, the permission checking logic of hypercalls is
unified as follows.
1. General checks
i. If the VM is neither SOS nor having any guest flag that allows
certain hypercalls, it gets #UD upon executing the `vmcall`
instruction.
ii. If the VM is allowed to execute the `vmcall` instruction, but
attempts to execute it in ring 1, 2 or 3, the VM gets #GP(0).
2. Hypercall-specific checks
i. If the hypercall is for SOS (i.e. `permission_flag` is 0), the
initiating VM must be SOS and the specified target VM cannot be a
pre-launched VM. Otherwise the hypercall returns -EINVAL without
further actions.
ii. If the hypercall requires certain guest flags, the initiating VM
must have all the required flags. Otherwise the hypercall returns
-EINVAL without further actions.
iii. A hypercall with an unknown hypercall ID makes the hypercall
returns -EINVAL without further actions.
The logic above is different from the current implementation in the
following aspects.
1. A pre-launched VM now gets #UD (rather than #GP(0)) when it attempts
to execute `vmcall` in ring 1, 2 or 3.
2. A pre-launched VM now gets #UD (rather than the return value -EPERM)
when it attempts to execute a trusty hypercall in ring 0.
3. The SOS now gets the return value -EINVAL (rather than -EPERM) when it
attempts to invoke a trusty hypercall.
4. A post-launched VM with trusty support now gets the return value
-EINVAL (rather than #UD) when it attempts to invoke a non-trusty
hypercall or an invalid hypercall.
v1 -> v2:
- Update documentation that describe hypercall behavior.
- Fix Doxygen warnings
Tracked-On: #5924
Signed-off-by: Junjie Mao <junjie.mao@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
Instead of "#include <x86/foo.h>", use "#include <asm/foo.h>".
In other words, we are adopting the same practice in Linux kernel.
Tracked-On: #5920
Signed-off-by: Liang Yi <yi.liang@intel.com>
Reviewed-by: Jason Chen CJ <jason.cj.chen@intel.com>
Requires explicit arch path name in the include directive.
The config scripts was also updated to reflect this change.
Tracked-On: #5825
Signed-off-by: Peter Fang <peter.fang@intel.com>
Reviewed-by: Jason Chen CJ <jason.cj.chen@intel.com>
Each .c file includes the arch specific irq header file (with full
path) by itself if required.
Tracked-On: #5825
Signed-off-by: Jason Chen CJ <jason.cj.chen@intel.com>
A new x86/guest/virq.h head file now contains all guest
related interrupt handling API.
Tracked-On: #5825
Signed-off-by: Jason Chen CJ <jason.cj.chen@intel.com>
The common irq file is responsible for managing the central
irq_desc data structure and provides the following APIs for
host interrupt handling.
- init_interrupt()
- reserve_irq_num()
- request_irq()
- free_irq()
- set_irq_trigger_mode()
- do_irq()
API prototypes, constant and data structures belonging to common
interrupt handling are all moved into include/common/irq.h.
Conversely, the following arch specific APIs are added which are
called from the common code at various points:
- init_irq_descs_arch()
- setup_irqs_arch()
- init_interrupt_arch()
- free_irq_arch()
- request_irq_arch()
- pre_irq_arch()
- post_irq_arch()
Tracked-On: #5825
Signed-off-by: Peter Fang <peter.fang@intel.com>
Reviewed-by: Jason Chen CJ <jason.cj.chen@intel.com>
Below boolean function are defined in this patch:
- is_software_sram_enabled() to check if SW SRAM
feature is enabled or not.
- set global variable 'is_sw_sram_initialized'
to file static.
Tracked-On: #5649
Signed-off-by: Yonghua Huang <yonghua.huang@intel.com>
Reviewed-by: Fei Li <fei1.li@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
For FuSa's case, we remove all dynamic memory allocation use in ACRN HV. Instead,
we use static memory allocation or embedded data structure. For pagetable page,
we prefer to use an index (hva for MMU, gpa for EPT) to get a page from a special
page pool. The special page pool should be big enougn for each possible index.
This is not a big problem when we don't support 64 bits MMIO. Without 64 bits MMIO
support, we could use the index to search addrss not larger than DRAM_SIZE + 4G.
However, if ACRN plan to support 64 bits MMIO in SOS, we could not use the static
memory alocation any more. This is because there's a very huge hole between the
top DRAM address and the bottom 64 bits MMIO address. We could not reserve such
many pages for pagetable mapping as the CPU physical address bits may very large.
This patch will use dynamic page allocation for pagetable mapping. We also need
reserve a big enough page pool at first. For HV MMU, we don't use 4K granularity
page table mapping, we need reserve PML4, PDPT and PD pages according the maximum
physical address space (PPT va and pa are identical mapping); For each VM EPT,
we reserve PML4, PDPT and PD pages according to the maximum physical address space
too, (the EPT address sapce can't beyond the physical address space), and we reserve
PT pages by real use cases of DRAM, low MMIO and high MMIO.
Signed-off-by: Li Fei1 <fei1.li@intel.com>
Tracked-On: #5788
Currently the VM bootargs load address is hard-coded at 8KB right before
kernel load address, this should work for Linux kernel only since Linux
kernel is guaranteed to be loadered high than GPA 8K so its load address
would never be overflowed, other OS like Zephyr has no such assumption.
Tracked-On: #5689
Signed-off-by: Victor Sun <victor.sun@intel.com>
Reviewed-by: Jason Chen CJ <jason.cj.chen@intel.com>
Current implementation, SOS may allocate the memory region belonging to
hypervisor/pre-launched VM to a post-launched VM. Because it only verifies
the start address rather than the entire memory region.
This patch verifies the validity of the entire memory region before
allocating to a post-launched VM so that the specified memory can only
be allocated to a post-launched VM if the entire memory region is mapped
in SOS’s EPT.
Tracked-On: #5555
Signed-off-by: Li Fei1 <fei1.li@intel.com>
Reviewed-by: Yonghua Huang <yonghua.huang@intel.com>
Physical address to SW SRAM region maybe different
on different platforms, this hardcoded address may
result in address mismatch for SW SRAM operations.
This patch removes above hardcoded address and uses
the physical address parsed from native RTCT.
Tracked-On: #5649
Signed-off-by: Yonghua Huang <yonghua.huang@intel.com>
Reviewed-by: Fei Li <fei1.li@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
'ptcm' and 'ptct' are legacy name according
to the latest TCC spec, hence rename below files
to avoid confusing:
ptcm.c -> rtcm.c
ptcm.h -> rtcm.h
ptct.h -> rtct.h
Tracked-On: #5649
Signed-off-by: Yonghua Huang <yonghua.huang@intel.com>
The init_multiboot_info() and sanitize_multiboot_ifno() APIs now
require parameters instead of implicitly relying on global boot
variables.
Tracked-On: #5661
Signed-off-by: Vijay Dhanraj <vijay.dhanraj@intel.com>
Reviewed-by: Jason Chen CJ <jason.cj.chen@intel.com>
- The current code to virtualize CR0/CR4 is not
well designed, and hard to read.
This patch reshuffle the logic to make it clear
and classify those bits into PASSTHRU,
TRAP_AND_PASSTHRU, TRAP_AND_EMULATE & reserved bits.
Tracked-On: #5586
Signed-off-by: Eddie Dong <eddie.dong@intel.com>
Signed-off-by: Yonghua Huang <yonghua.huang@intel.com>
While following two styles are both correct, the 2nd one is simpler.
bool is_level_triggered;
1. if (is_level_triggered == true) {...}
2. if (is_level_triggered) {...}
This patch cleans up the style in hypervisor.
Tracked-On: #861
Signed-off-by: Shiqing Gao <shiqing.gao@intel.com>
- fix bug in 'hcall_destroy_vdev()', the availability of
vpci device shall be checked on 'target_vm".
- refine 'vpci_update_one_vbar()' to avoid potential NULL
pointer access.
Tracked-On: #5490
Signed-off-by: Yonghua Huang <yonghua.huang@intel.com>
Hypercall handlers for post-launched VMs automatically grab the vm_lock
in dispatch_sos_hypercall(). Remove the use of vm_lock inside the
handler.
Tracked-On: #5411
Signed-off-by: Peter Fang <peter.fang@intel.com>
More than one VM may request shutdown on the same pCPU before
shutdown_vm_from_idle() is called in the idle thread when pCPUs are
shared among VMs.
Use a per-pCPU bitmap to store all the VMIDs requesting shutdown.
v1 -> v2:
- use vm_lock to avoid a race on shutdown
Tracked-On: #5411
Signed-off-by: Peter Fang <peter.fang@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
pSRAM memory should be cachable. However, it's not a RAM or a normal MMIO,
so we can't use the an exist API to do the EPT mapping and set the EPT cache
attribute to WB for it. Now we assume that SOS must assign the PSRAM area as
a whole and as a separate memory region whose base address is PSRAM_BASE_HPA.
If the hpa of the EPT mapping region is equal to PSRAM_BASE_HPA, we think this
EPT mapping is for pSRAM, we change the EPT mapping cache attribute to WB.
And fix a minor bug when SOS trap out to emulate wbinvd when pSRAM is enabled.
Tracked-On: #5330
Signed-off-by: Qian Wang <qian1.wang@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
Add cteate method for vmcs9900 vdev in hypercalls.
The destroy method of ivshmem is also suitable for other emulated vdev,
move it into hcall_destroy_vdev() for all emulated vdevs
Tracked-On: #5394
Signed-off-by: Tao Yuhong <yuhong.tao@intel.com>
Reviewed-by: Wang, Yu1 <yu1.wang@intel.com>
This function can be used by other modules instead of hypercall
handling only, hence move it to vlapic.c
Tracked-On: #5407
Signed-off-by: Yonghua Huang <yonghua.huang@intel.com>
Reviewed-by: Li, Fei <fei1.li@intel.com>
Reviewed-by: Wang, Yu1 <yu1.wang@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
Now ACRN supports direct boot mode, which could be SBL/ABL, or GRUB boot.
Thus the vboot wrapper layer can be removed and the direct boot functions
don't need to be wrapped in direct_boot.c:
- remove call to init_vboot(), and call e820_alloc_memory() directly at the
time when the trampoline buffer is actually needed.
- Similarly, call CPU_IRQ_ENABLE() instead of the wrapper init_vboot_irq().
- remove get_ap_trampoline_buf(), since the existing function
get_trampoline_start16_paddr() returns the exact same value.
- merge init_general_vm_boot_info() into init_vm_boot_info().
- remove vm_sw_loader pointer, and call direct_boot_sw_loader() directly.
- move get_rsdp_ptr() from vboot_wrapper.c to multiboot.c, and remove the
wrapper over two boot modes.
Tracked-On: #5197
Signed-off-by: Zide Chen <zide.chen@intel.com>
Previously we use a pre-defined structure as vACPI table for pre-launched
VM, the structure is initialized by HV code. Now change the method to use a
pre-loaded multiboot module instead. The module file will be generated by
acrn-config tool and loaded to GPA 0x7ff00000, a hardcoded RSDP table at
GPA 0x000f2400 will point to the XSDT table which at GPA 0x7ff00080;
Tracked-On: #5266
Signed-off-by: Victor Sun <victor.sun@intel.com>
Signed-off-by: Shuang Zheng <shuang.zheng@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
Some hypercalls to a target VM are only acceptable in some certain
states, else it impacts target VM. Add some restrictive status checks to
avoid that.
Tracked-On: #5208
Signed-off-by: Shuo A Liu <shuo.a.liu@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
Virtual interrupts injection and memory mapping operations can impact
target VM. By design, these type of operations from lower severity VM
to higher severity VM should be blocked by the hypervisor.
While the hypercalls are the interface between SOS VM and the
hypervisor, severity checks can be implemented at the beginning of
hypercalls needed.
Added severity checks in below hypercalls:
* hcall_set_vm_memory_regions()
* hcall_notify_ioreq_finish()
* hcall_set_irqline()
* hcall_inject_msi()
* hcall_write_protect_page()
Tracked-On: #5208
Signed-off-by: Shuo A Liu <shuo.a.liu@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
For ivshmem vdev creation, the vdev vBDF, vBARs, shared memory region
name and size are set by device model. The shared memory name and size
must be same as the corresponding device configuration which is configured
by offline tool.
v3: add a comment to the vbar_base member of the acrn_vm_pci_dev_config
structure that vbar_base is power-on default value
Tracked-On: #4853
Signed-off-by: Yuan Liu <yuan1.liu@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
Add HC_CREATE_VDEV and HC_DESTROY_VDEV two hypercalls that are used to
create and destroy an emulated device(PCI device or legacy device) in hypervisor
v3: 1) change HC_CREATE_DEVICE and HC_DESTROY_DEVICE to HC_CREATE_VDEV
and HC_DESTROY_VDEV
2) refine code style
v4: 1) remove unnecessary parameter
2) add VM state check for HC_CREATE_VDEV and HC_DESTROY hypercalls
Tracked-On: #4853
Reviewed-by: Wang, Yu1 <yu1.wang@intel.com>
Signed-off-by: Yuan Liu <yuan1.liu@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
-- use an array to fast locate the hypercall handler
to replace switch case.
-- uniform hypercall handler as below:
int32_t (*handler)(sos_vm, target_vm, param1, param2)
Tracked-On: #4958
Signed-off-by: Mingqiang Chi <mingqiang.chi@intel.com>
Reviewed-by: Jason Chen CJ <jason.cj.chen@intel.com>
Reviewed-by: Eddie Dong <eddie.dong@intel.com>
2abbb99f6a ("hv: make thread status more accurate") introduced a
transition stage, marked as var be_blocking, between RUNNING->BLOCKED
of thread status. wake_thread() does not work in this transition stage
because it only checks thread->status.
Need to check thread->be_blocking as well in wake_thread(). When
wake_thread() happens in the transition stage, the previous sleep
operation rolled back.
Tracked-On: #5190
Fixes: 2abbb99f6a ("hv: make thread status more accurate")
Signed-off-by: Conghui Chen <conghui.chen@intel.com>
Signed-off-by: Shuo A Liu <shuo.a.liu@intel.com>
As we only set BLOCKED status in context switch_out, which means, only
running thread can be changed to BLOCKED, but runnable thread can not.
This lead to the deadloop in sleep_thread_sync.
To solve the problem, in sleep_thread, we set the status to BLOCKED
directly when the original thread status is RUNNABLE.
Tracked-On: #5115
Signed-off-by: Conghui Chen <conghui.chen@intel.com>
Add two hypercalls to support MMIO device pass through for post-launched VM.
And when we support MMIO pass through for pre-launched VM, we could re-use
the code in mmio_dev.c
Tracked-On: #5053
Signed-off-by: Li Fei1 <fei1.li@intel.com>
kick_thread function is only used by kick_vcpu to kick vcpu out of
non-root mode, the implementation in it is sending IPI to target CPU if
target obj is running and target PCPU is not current one; while for
runnable obj, it will just make reschedule request. So the kick_thread
is not actually belong to scheduler module, we can drop it and just do
the cpu notification in kick_vcpu.
Tracked-On: #5057
Signed-off-by: Conghui Chen <conghui.chen@intel.com>
Reviewed-by: Shuo A Liu <shuo.a.liu@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
vcpu->running is duplicated with THREAD_STS_RUNNING status of thread
object. Introduce an API sleep_thread_sync(), which can utilize the
inner status of thread object, to do the sync sleep for zombie_vcpu().
Tracked-On: #5057
Signed-off-by: Conghui Chen <conghui.chen@intel.com>
Reviewed-by: Shuo A Liu <shuo.a.liu@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
1. Update thread status after switch_in/switch_out.
2. Add 'be_blocking' to represent the intermediate state during
sleep_thread and switch_out. After switch_out, the thread status
update to THREAD_STS_BLOCKED.
Tracked-On: #5057
Signed-off-by: Conghui Chen <conghui.chen@intel.com>
Reviewed-by: Shuo A Liu <shuo.a.liu@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
-- replace global hypercall lock with per-vm lock
-- add spinlock protection for vm & vcpu state change
v1-->v2:
change get_vm_lock/put_vm_lock parameter from vm_id to vm
move lock obtain before vm state check
move all lock from vmcall.c to hypercall.c
Tracked-On: #4958
Signed-off-by: Mingqiang Chi <mingqiang.chi@intel.com
Reviewed-by: Jason Chen CJ <jason.cj.chen@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
will follow this convention for spin lock initialization:
-- for simple global variable locks, use this style:
static spinlock_t xxx_spinlock = {.head = 0U, .tail = 0U,}
-- for the locks inside a data structure, need to call
spinlock_init to initialize.
Tracked-On: #4958
Signed-off-by: Mingqiang Chi <mingqiang.chi@intel.com>
hv: hypercall: restrict the condition to assign/deassign a pci device to
a post-launched VM for safety
For the safety of post-launched VMs, pci devices assignments should
occur only when VM is being created (at VM_CREATED STATUS), and pci
devices de-assignment should occur only when VM is being created or
shutdown/reset (at VM_CREATED or VM_PAUSED status)
Tracked-On: #4995
Acked-by: Eddie Done <eddie.dong@intel.com>
Reviewed-by: Li Fei <Fei1.Li@intel.com>
Signed-off-by: Wang Qian <qian1.wang@intel.com>
For a ptirq_remapping_info entry, when build IRTE:
- If the caller provides a valid IRTE, use the IRET
- If the caller doesn't provide a valid IRTE, allocate a IRET when the
entry doesn't have a valid IRTE, in this case, the IRET will be freed
when free the entry.
Tracked-On:#4831
Signed-off-by: Binbin Wu <binbin.wu@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>