Commit Graph

3487 Commits

Author SHA1 Message Date
Jiang, Yanting 599894e571 Fix: write xmm registers correctly
The movdqu instruction moves unaligned double quadword (128 bit)
contained in XMM registers.

This patch uses pointers as input parameters of the function
write_xmm_0_2() to get 128-bit value from 64-bit array for each XMM
register.

Tracked-On: #7380
Reviewed-by: Fei Li <fei1.li@intel.com>
Signed-off-by: Jiang, Yanting <yanting.jiang@intel.com>
2022-05-06 10:29:33 +08:00
Fei Li 5130dfe08b hv: vSRIOV: add VF BARs mapping for PF
When enabling SRIOV capability for a PF in Service VM, ACRN Hypervisor
should add VF BARs mapping for PF since PF's firmware would access these
BARs to do initialization for VFs when it's first created.

Tracked-On: #4433
Signed-off-by: Fei Li <fei1.li@intel.com>
2022-04-26 15:07:25 +08:00
Fei Li 13e99bc0b9 hv: vPCI: passthrough MSI-X Control Register to guest.
In spite of Table Size in MSI-X Message Control Register [Bits 10:0] masks as
RO (Register bits are read-only and cannot be altered by software), In Spec
PCIe 6.0, Chap 6.1.4.2 MSI-X Configuration "Depending upon system software
policy, system software, device driver software, or each at different times or
environments may configure a Function’s MSI-X Capability and table structures
with suitable vectors."

This patch just pass through MSI-X Control Register field to guest.

Tracked-On: #7275
Signed-off-by: Fei Li <fei1.li@intel.com>
2022-04-26 15:07:25 +08:00
Tw e2f7b1fc51 hv: remove obsolete declarations related to RDT
Since CAT support for hybrid platform is landed, let's remove some old declarations
which are no longer used.

Tracked-On: #6690
Signed-off-by: Tw <wei.tan@intel.com>
2022-04-26 14:27:01 +08:00
Conghui a2e2640507 config-tools: ignore the scenario and board field
Ignore the "scenario" and "board" field in <scenario>.xml:
    <acrn-config board="whl-ipc-i5" scenario="shared">

Tracked-On: #7345
Signed-off-by: Conghui <conghui.chen@intel.com>
Reviewed-by: Junjie Mao <junjie.mao@intel.com>
2022-04-24 13:53:32 +08:00
Conghui 593414e9c2 config-tools: support absolute path
1. Support absolute path for scenario file.
2. Use the scenario xml file name as scenario name, but if it is
'scenario.xml', use the upper level directory name.
e.g.
	SCENARIO=<pathxxx>/shared/scenario.xml
Then scenario name would be 'shared'.

3. Change 'realpath' to 'abspath' as we should keep the original path
for scenario file even it is a link file. This will make sure the
scenario name is always consistent with file set in 'SCENARIO='.

Tracked-On: #7345
Signed-off-by: Conghui <conghui.chen@intel.com>
Reviewed-by: Junjie Mao <junjie.mao@intel.com>
2022-04-24 13:53:32 +08:00
Chenli Wei ed1c638c87 hv: refine for HPAn setting
The current code only supports 2 HPA regions per VM.

This patch extended ACRN to support 2+ HPA regions per VM, to use host
memory better if it is scatted among multiple regions.

This patch uses an array to describe the hpa region for the VM, and
change the logic of ve820 to support multiple regions.

This patch dependent on the config tool and GPA SSRAM change

Tracked-On: #6690
Signed-off-by: Chenli Wei <chenli.wei@intel.com>
Reviewed-by: Fei Li <fei1.li@intel.com>
2022-04-22 14:46:05 +08:00
Tw e246ada6b0 hv: fix core type error when launch RTVM use atom core
When CPUID executes with EAX set to 1AH, the processor returns information about hybrid capabilities.
This information is percpu related, and should be obtained directly from the physical cpu.

Tracked-On: #6899
Signed-off-by: Tw <wei.tan@intel.com>
2022-04-21 09:21:16 +08:00
Yonghua Huang 80292a482d hv: remove pgentry_present field in struct pgtable
Page table entry present check is page table type
  specific and static, e.g. just need to check bit0
  of page entry for entries of MMU page table and
  bit2~bit0 for EPT page table case. hence no need to
  check it by callback function every time.

  This patch remove 'pgentry_present' callback field and
  add a new bitmask field for this page entry present check.
  It can get better performance especially when this
  check is executed frequently.

Tracked-On: #7327
Signed-off-by: Yonghua Huang <yonghua.huang@intel.com>
Reviewed-by: Eddie Dong <eddie.dong@intel.com>
Reviewed-by: Fei Li <fei1.li@intel.com>
Reviewed-by: Yu Wang <yu1.wang@intel.com>
2022-04-20 17:38:02 +08:00
Tw deb35a0de9 hv: fix cpuid 0x2 mismatch when launch RTVM use atom core
When CPUID executes with EAX set to 02H, the processor returns information about cache and TLB information.
This information is percpu related, and should be obtained directly from the physical cpu.

BTW, this patch is backported from v2.7 branch.

Tracked-On: #6931
Signed-off-by: Tw <wei.tan@intel.com>
Reviewed-by: Fei Li <fei1.li@intel.com>
2022-04-20 16:13:10 +08:00
Zhou, Wu 6458a3f474 hv: remove an unnecessary code line
This patch is to eliminate a code scan warning.

p_elf_header32 was given a value when it was declared, but later it was
given the same value again. Just remove the later one.

Tracked-On: #7318

Signed-off-by: Zhou, Wu <wu.zhou@intel.com>
2022-04-20 14:15:25 +08:00
Fei Li 6b32b28e72 hv: ptdev: address vector scalability problem by interrupt posting
Now interrupt vector in ACRN hypervisor is maintained as global variable, not
per-CPU variable. If there're more PCI devices, the physical interrupt vectors
are not enough most likely.

This patch would not allocate physical interrupt vector for MSI/MSI-X vectors
if interrupt posting could been used to inject the MSI/MSI-X interrupt to
a VM directly.

Tracked-On: #7275
Signed-off-by: Fei Li <fei1.li@intel.com>
Reviewed-by: Jason Chen CJ <jason.cj.chen@intel.com>
2022-04-19 14:54:04 +08:00
Zhou, Wu 32cb5954f2 hv: refine the hard-coded GPA SSRAM area size
Using the SSRAM area size extracted by config_tools, the patch changes
the hard-coded GPA SSRAM area size to its actual size, so that
pre-launched VMs can support large(>8MB) SSRAM area.

When booting service VM, the SSRAM area has to be removed from Service
VM's mem space, because they are passed-through to the pre-rt VM. The
code was bugged since it was using the SSRAM area's GPA in the pre-rt
VM. Changed it to GPA in Service VM.

Tracked-On: #7212

Acked-by: Eddie Dong <eddie.dong@intel.com>
Signed-off-by: Zhou, Wu <wu.zhou@intel.com>
2022-04-18 16:47:23 +08:00
Tw 3c384a489c hv: support CAT on hybrid platform
On hybrid platform(e.g. ADL), there may be multiple instances of same level caches for different type of processors,
The current design only supports one global `rdt_info` for each RDT resource type.
In order to support hybrid platform, this patch introduce `rdt_ins` to represents the "instance".
Also, the number of `rdt_info` is dynamically generated by config-tool to match with physical board.

Tracked-On: projectacrn#6690
Signed-off-by: Tw <wei.tan@intel.com>
Acked-by: Eddie Dong <eddie.dong@Intel.com>
2022-04-18 15:33:11 +08:00
Tw 19da21c898 hv: remove RDT information detection
As RDT related information will be offered by config-tool dynamically,
and HV is just a consumer of that. So there's no need to do this detection
at startup anymore.

Tracked-On: projectacrn#6690
Signed-off-by: Tw <wei.tan@intel.com>
Acked-by: Eddie Dong <eddie.dong@Intel.com>
2022-04-18 15:33:11 +08:00
Geoffroy Van Cutsem 8b16be9185 Remove "All rights reserved" string headers
Many of the license and Intel copyright headers include the "All rights
reserved" string. It is not relevant in the context of the BSD-3-Clause
license that the code is released under. This patch removes those strings
throughout the code (hypervisor, devicemodel and misc).

Tracked-On: #7254
Signed-off-by: Geoffroy Van Cutsem <geoffroy.vancutsem@intel.com>
2022-04-06 13:21:02 +08:00
Kunhui-Li 74dc103d9e config_tools: remove board and scenario attributes
remove board and scenario attributes dependency for new configuration.

To do:
will remove board and scenario attributes in all scenario XML files
and update the upgrader.py after the new configuration works.

Tracked-On: #6690
Signed-off-by: Kunhui-Li <kunhuix.li@intel.com>
Reviewed-by: Junjie Mao <junjie.mao@intel.com>
2022-03-31 19:32:52 +08:00
Fei Li 2aaf61bef8 hv: acpi: refine dmar acpi table parse
Now ACRN Hypervisor only support one PCI Segment, this patch add this check.
This patch also fix a small bug: it would trigger false error "DRHD with
INCLUDE_PCI_ALL flag is NOT the last one".

Tracked-On: #5907
Signed-off-by: Fei Li <fei1.li@intel.com>
Reviewed-by: Jason Chen CJ <jason.cj.chen@intel.com>
2022-03-29 16:38:40 +08:00
Fei Li 3f7501db38 hv: mmio: replace hi_mmio with mmio64
Now HI_MMIO_xxx is duplicate with MMIO64_xxx. This patch replace HI_MMIO_xxx
with MMIO64_xxx.

Tracked-On: #6011
Signed-off-by: Fei Li <fei1.li@intel.com>
2022-03-29 15:34:29 +08:00
Minggui Cao 05ca1d7641 hv: fix a bug about host/guest msr store/load
Unify the handling of host/guest MSR area in VMCS. Remove the emum value
as the element index when there are a few of MSRs in host/guest area.
Because the index could be changed if one element not used. So, use a
variable to save the index which will be used.

Tracked-On: #6966
Acked-by: Eddie Dong <eddie.dong@intel.com>
Signed-off-by: Minggui Cao <minggui.cao@intel.com>
2022-03-28 12:00:01 +08:00
Junjie Mao 7ad9596dd6 config_tools: change representation of build types
Instead of using a Boolean variable indicating whether a build is for debug
or release, it is more intuitive to specify the build types as "debug" or
"release".

This patch converts the config item RELEASE to BUILD_TYPE which takes
"debug" or "release" as of now.

The generated header and makefile still uses RELEASE, and the command line
option RELEASE=<y or n> is also preserved.

Tracked-On: #6690
Signed-off-by: Junjie Mao <junjie.mao@intel.com>
2022-03-15 10:22:37 +08:00
Junjie Mao 433b37b1a8 config_tools: unify the CLI of scenario-manipulating scripts
Today the scripts that populate default values and validate scenarios have
different command-line interfaces: the former requires an XML schema as
input (which is cumbersome in most cases), while the latter always infer
where the XML schema is (which is inflexible).

This patch unifies the command line options of those scripts as follows:

  - The scenario XML is always a required positional argument.

  - The output file path (if any) is an optional positional argument.

  - The schema XML file is an optional long option. When not specified, the
    scripts will always use the one under the misc/config_tools/schema
    directory.

Also, this patch makes the validator.py executable, as is done to other
executable scripts in the repo.

Tracked-On: #6690
Signed-off-by: Junjie Mao <junjie.mao@intel.com>
2022-03-15 10:22:37 +08:00
Minggui Cao b3bd153180 hv: expose PEBS capability and MSR as PMU_PT flag
Requirement: in CPU partition VM (RTVM), vtune or perf can be used to
sample hotspot code path to tune the RT performance, It need support
PMU/PEBS (Processor Event Based Sampling). Intel TCC asks for it, too.

It exposes PEBS related capabilities/features and MSRs to CPU
partition VM, like RTVM. PEBS is a part of PMU. Also PEBS needs
DS (Debug Store) feature to support. So DS is exposed too.

Limitation: current it just support PEBS feature in VM level, when CPU
traps to HV, the performance counter will stop. Perf global control
MSR is used to do this work. So, the counters shall be close to native.

Tracked-On: #6966
Acked-by: Anthony Xu <anthony.xu@intel.com>
Signed-off-by: Minggui Cao <minggui.cao@intel.com>
2022-03-10 14:34:33 +08:00
Minggui Cao 299c56bb68 hv: add a flag for PMU passthrough to guest VM
Add a flag: GUEST_FLAG_PMU_PASSTHROUGH to indicate if
PMU (Performance Monitor Unit) is passthrough to guest VM.

Tracked-On: #6966
Acked-by: Anthony Xu <anthony.xu@intel.com>
Signed-off-by: Minggui Cao <minggui.cao@intel.com>
2022-03-10 14:34:33 +08:00
Minggui Cao 3b1deda0eb hv: revert NMI notification by INIT signal
NMI is used to notify LAPIC-PT RTVM, to kick its CPU into hypervisor.
But NMI could be used by system devices, like PMU (Performance Monitor
Unit). So use INIT signal as the partition CPU notification function, to
replace injecting NMI.

Also remove unused NMI as notification related code.

Tracked-On: #6966
Acked-by: Anthony Xu <anthony.xu@intel.com>
Signed-off-by: Minggui Cao <minggui.cao@intel.com>
2022-03-10 14:34:33 +08:00
Chenli Wei c4c7835c12 hv: refine the vept module
Now the vept module uses a mixture of nept and vept, it's better to
refine it.

So this patch rename nept to vept and simplify the interface of vept
init module.

Tracked-On: #6690
Acked-by: Anthony Xu <anthony.xu@intel.com>
Signed-off-by: Chenli Wei <chenli.wei@intel.com>
2022-03-08 16:41:46 +08:00
Geoffroy Van Cutsem af8b51a63f Makefile: remove default BOARD and SCENARIO values
Remove the fact that a default BOARD and SCENARIO are used in case there was
none provided by the user, nor any available from a previous build. Up until
now, if that was the case, a build was triggered using a default set of BOARD
and SCENARIO values. The 'make' command will now error out asking the user to
specify those parameters.

Tracked-On: #7112
Signed-off-by: Geoffroy Van Cutsem <geoffroy.vancutsem@intel.com>
2022-02-28 10:04:15 +08:00
Wen Qian e2f1990548 hv: change error code of undefined hypercall
This patch adds ENOTTY and ENOSYS to indicate undefined and obsoleted
request hyercall respectively, and uses ENOTTY as error code for undefined
hypercall instead of EINVAL to consistent with the ACRN kernel's return
value.

Tracked-On: #7029
Signed-off-by: Wen Qian <qian.wen@intel.com>
Signed-off-by: Li Fei <fei1.li@intel.com>
Acked-by: Wang, Yu1 <yu1.wang@intel.com>
2022-02-21 09:25:50 +08:00
Mingqiang Chi 3d5c3c4754 hv:fix violations of coding guideline C-ST-04
The coding guideline rule C-ST-04 requires that
a 'if' statement followed by one or more 'else if'
statement shall be terminated by an 'else' statement
which contains either appropriate action or a comment.

Tracked-On: #6776
Signed-off-by: Mingqiang Chi <mingqiang.chi@intel.com>
Acked-by: Anthony Xu <anthony.xu@intel.com>
2022-02-18 18:41:07 +08:00
Chenli Wei b7a99f4530 hv: replace the CONFIG_PLATFORM_RAM_SIZE with get_e820_ram_size for vept
Now the vept table was allocate dynamically, but the table size of vept
was calculated by the CONFIG_PLATFORM_RAM_SIZE which was predefined by
config tool.

It's not complete change and can't support single binary for different
boards/platforms.

So this patch will replace the CONFIG_PLATFORM_RAM_SIZE and get the
top ram size from hv_E820 interface for vept.

Tracked-On: #6690
Acked-by: Anthony Xu <anthony.xu@intel.com>
Signed-off-by: Chenli Wei <chenli.wei@linux.intel.com>
2022-02-18 18:39:43 +08:00
Chenli Wei 5432b52b12 hv: replace the CONFIG_PLATFORM_RAM_SIZE with get_e820_ram_size for ept
Now the EPT module use predefined parameter "CONFIG_PLATFORM_RAM_SIZE"
to calculate the ept table size.

After change the EPT table to dynamic allocate to support single binary
for different boards/platforms, the ept table size should dynamic
calculate too.

So this patch replace CONFIG_PLATFORM_RAM_SIZE by the hv_e820_ram_size
to get the RAM info on run time.

Tracked-On: #6690
Acked-by: Anthony Xu <anthony.xu@intel.com>
Signed-off-by: Chenli Wei <chenli.wei@linux.intel.com>
2022-02-18 18:39:43 +08:00
Chenli Wei 148a3d334c hv: replace the CONFIG_PLATFORM_RAM_SIZE with get_e820_ram_size for mmu
CONFIG_PLATFORM_RAM_SIZE is predefined by config tool and mmu use it to
calculate the table size and predefine the ppt table.

This patch will change the ppt to allocate dynamically and get the table
size by the hv_e820_ram_size interface which could get the RAM
info on run time and replace the CONFIG_PLATFORM_RAM_SIZE.

Tracked-On: #6690
Acked-by: Anthony Xu <anthony.xu@intel.com>
Signed-off-by: Chenli Wei <chenli.wei@linux.intel.com>
2022-02-18 18:39:43 +08:00
Chenli Wei 22e08d541f hv: calculate hv_e820_ram_size dynamically
The e820 module could get the RAM info on run time, but the RAM size
and MAX address was limited by CONFIG_PLATFORM_RAM_SIZE which was
predefined by config tool.

Current solution can't support single binary for different boards or
platforms and the CONFIG_PLATFORM_RAM_SIZE can't matching the RAM size
if user have not update config tools setting after the device changed.

So this patch remove the CONFIG_PLATFORM_RAM_SIZE and calculate ram
size on run time.

Tracked-On: #6690
Acked-by: Anthony Xu <anthony.xu@intel.com>
Signed-off-by: Chenli Wei <chenli.wei@linux.intel.com>
2022-02-18 18:39:43 +08:00
Geoffroy Van Cutsem 8a5f25e835 Makefile: put 'serial.conf' in final location
The 'serial.conf' file need to be put in /etc/, but it is currently being
installed in $(libdir)/acrn/. We therefore ask the user to manually copy that
file over to the /etc/ folder. This patch fixes that by installing
'serial.conf' directly there.

Tracked-On: #7107
Signed-off-by: Geoffroy Van Cutsem <geoffroy.vancutsem@intel.com>
2022-02-18 18:38:45 +08:00
Chenli Wei 01fa6de42c hv: refine the e820 add new entry logic
Sometimes the memory to be allocated is not at the end of an entry,
that means we have to break one enty into 2 smaller entries, there
are two ways to add the new entry to hv_e820, adds to the end or
insert it.

The initial e820 table is ordered, that's why the e820_alloc_memory
interface asssum all entries was sorted, but add new entry to the
end will break the orde of hv_e820.

So we use insert_e820_entry to replace the add_e820_entry, the new
interfeac will keep the orde and users do not need sort again after
alloc region

Tracked-On: #6690
Acked-by: Anthony Xu <anthony.xu@intel.com>
Signed-off-by: Chenli Wei<chenli.wei@linux.intel.com>
2022-02-09 10:14:01 +08:00
Yonghua Huang 364b2b1428 hv: remove HC_GET_PLATFORM_INFO hypercall support
HC_GET_PLATFORM_INFO hypercall is not supported anymore,
 hence to remove related function and data structure definition.

Tracked-On: #6690
Signed-off-by: Yonghua Huang <yonghua.huang@intel.com>
2022-02-09 10:11:11 +08:00
Geoffroy Van Cutsem 8db94b1335 Makefile: fix typo in config.mk
Fix typo in hypervisor/scripts/makefile/config.mk file.

Tracked-On: #5647
Signed-off-by: Geoffroy Van Cutsem <geoffroy.vancutsem@intel.com>
2022-02-08 11:10:16 +08:00
Chenli Wei 635a6da1c0 hv:refine the min address logic of high memory
Current mmu assum the high memory start from 4G,it's not true for some
platform.

The map logic use "high64_max_ram - 4G" to calculate the high ram size
without any check,it's an issue when the platform have no high memory.

So this patch add high64_min_ram variable to calculate the min address
of high memory and check the high64_min_ram to fix the previou issue.

Tracked-On: #6690
Acked-by: Anthony Xu <anthony.xu@intel.com>
Signed-off-by: Chenli Wei <chenli.wei@linux.intel.com>
2022-01-27 09:11:40 +08:00
Junjie Mao 1f305beba9 config_tools: add a script that focuses on XML validation
Today the XML validation logic is embedded in scenario_cfg_gen.py which is
highly entangled with the Python-internal representation of
configurations. Such representation is used by the current configurator,
but will soon be obsolete when the new configurator is introduced.

In order to avoid unnecessary work on this internal representation when we
refine the schema of scenario XML files, this patch separates the
validation logic into a new script which can either be used from the
command line or imported in other Python-based applications. At build time
this script will be used instead to validate the XML files given by users.

This change makes it easier to refine the current configuration items for
better developer experience.

Migration of existing checks in scenario_cfg_gen.py to XML schema will be
done by a following series.

v2 -> v3:
 * Keep Invoking asl_gen.py to generate vACPI tables for pre-launched VMs.

v1 -> v2:
 * Remove "all rights reserved" from the license header
 * Upgrade the severity of the message indicating lack of xmlschema as
   error according to our latest definitions of log severities, as
   validation violations could indicate build time or boot time failures.

Tracked-On: #6690
Signed-off-by: Junjie Mao <junjie.mao@intel.com>
2022-01-26 14:19:01 +08:00
Junjie Mao c83c49ae36 Makefile: allow arbitrary character in paths
Today we assume the paths to the build directory do not contain the
character `@` and build the sed commands on top of this
assumption. However, there is no guarantee that this assumption holds.

This patch changes the separating character in sed pattern replacing
commands back to slash ('/') and escape the slashes in the replacements to
make the commands work. That gives more flexibility to the paths where
users can put their configuration files and build the project.

Tracked-On: #6691
Signed-off-by: Junjie Mao <junjie.mao@intel.com>
Acked-by: Anthony Xu <anthony.xu@intel.com>
2022-01-24 15:38:39 +08:00
Wen Qian 5d7465a055 hv: fix bug when set MSR_IA32_COPY_PLATFORM_TO_LOCAL before setting MSR_IA32_COPY_LOCAL_TO_PLATFORM
The current code would inject GP to guest, when there's no IWKeyBackup,
and the guest tried to write MSR MSR_IA32_COPY_PLATFORM_TO_LOCAL(0xd92)
to copy IWKeyBackup for the platform to the IWKey for this logical processor.

This patch fixes it by adjusting the code logic, and it'll do nothing
instead of inject GP if no valid IWKeyBackup.
This patch alse add checking for the value being written to avoid setting
reserved MSR bits.

Tracked-On: #7018
Signed-off-by: Wen Qian <qian.wen@intel.com>
Signed-off-by: Li Fei <fei1.li@intel.com>
Acked-by: Anthony Xu <anthony.xu@intel.com>
2022-01-21 14:35:59 +08:00
Chenli Wei c93c2224e0 hv: alloc multiboot modules memory
Now the multiboot modules memory have not reserve,it's an issue if
these memory alloc and write before VM start.

Incorrect allocation of multiboot modules memory will cause VM lost
data or start faild.

So we find these modules memory range and reserve these memory from
e820 entry.

All these memory will realloc to VM which own them before the vm start.

Tracked-On: #6690
Acked-by: Anthony Xu <anthony.xu@intel.com>
Signed-off-by: Chenli Wei <chenli.wei@intel.com>
2022-01-21 13:38:06 +08:00
Mingqiang Chi b6b69f2178 hv: fix violations of coding guideline C-FN-16
The coding guideline rule C-FN-16 requires that 'Mixed-use of
C code and assembly code in a single function shall not be allowed',
this patch wraps inline assembly to inline functions.

Tracked-On: #6776
Signed-off-by: Mingqiang Chi <mingqiang.chi@intel.com>
Acked-by: Anthony Xu <anthony.xu@intel.com>

v1-->v2:
    use inline functions for read/write XMM registers
2022-01-13 08:29:02 +08:00
Mingqiang Chi 83c0a97fb1 hv: fix clang analyzer deadcode
remove the dead code since the variable is never used.

Tracked-On: #6776
Signed-off-by: Mingqiang Chi <mingqiang.chi@intel.com>
2022-01-07 13:47:32 +08:00
Yuanyuan Zhao 8f114d82af hv: rename CONFIG_IOMMU_BUS_NUM
Rename `CONFIG_IOMMU_BUS_NUM` to `ACFG_MAX_PCI_BUS_NUM`. Configure tool
will calculate `ACFG_MAX_PCI_BUS_NUM` base on the max pci num which is
used by VF. So user needn't care about `ACFG_MAX_PCI_BUS_NUM`, and memory
will be used resonable.

Tracked-On: #6942
Signed-off-by: Yuanyuan Zhao <yuanyuan.zhao@linux.intel.com>
Reviewed-by: Wang, Yu1 <yu1.wang@intel.com>
Acked-by: Victor Sun <victor.sun@intel.com>
Acked-by: Anthony Xu <anthony.xu@intel.com>
2021-12-13 11:06:10 +08:00
Mingqiang Chi 3555aae4ac hv:remove 2 bits vmx capability check
remove is_valid_xsave_combination api,
assume the hardware or QEMU can guarantee that support
XSAVE on CPU side and XSAVE_XRSTR on VMX side or not.
will add offline-tool in QEMU platform to avoid the user
use wrong XSAVE configurations.
remov check VMX_PROCBASED_CTLS2_XSVE_XRSTR based on the above reason.
for VMX_PROCBASED_CTLS2_PAUSE_LOOP, now it will panic
if run ACRN over QEMU, here remove it from essential check,
and it will print error information when set this bit
if there is no the hardware capability.

v1-v2:
  remove is_valid_xsave_combination

Tracked-On: #6584
Signed-off-by: Mingqiang Chi <mingqiang.chi@intel.com>
Acked-by: Anthony Xu <anthony.xu@intel.com>
2021-12-13 08:52:52 +08:00
Yifan Liu 5c9456462b hv && config-tool: Add compilation option to disable all interrupts in HV
This patch adds an option CONFIG_KEEP_IRQ_DISABLED to hv (default n) and
config-tool so that when this option is 'y', all interrupts in hv root
mode will be permanently disabled.

With this option to be 'y', all interrupts received in root mode will be
handled in external interrupt vmexit after next VM entry. The postpone
latency is negligible. This new configuration is a requirement from x86
TEE's secure/non-secure interrupt flow support. Many race conditions can be
avoided when keeping IRQ off.

v5:
Rename CONFIG_ACRN_KEEP_IRQ_DISABLED to CONFIG_KEEP_IRQ_DISABLED

v4:
Change CPU_IRQ_ENABLE/DISABLE to
CPU_IRQ_ENABLE_ON_CONFIG/DISABLE_ON_CONFIG and guard them using
CONFIG_ACRN_KEEP_IRQ_DISABLED

v3:
CONFIG_ACRN_DISABLE_INTERRUPT -> CONFIG_ACRN_KEEP_IRQ_DISABLED
Add more comment in commit message

Tracked-On: #6571
Signed-off-by: Yifan Liu <yifan1.liu@intel.com>
Reviewed-by: Wang, Yu1 <yu1.wang@intel.com>
Acked-by: Anthony Xu <anthony.xu@intel.com>
2021-12-10 09:50:17 +08:00
Jie Deng 2fab18a6d6 hv: tee: avoid halt in REE bootargs
"idle=halt " should be avoided in REE since we have to
keep the interrupt always masked in root mode.

Tracked-On: #6571
Signed-off-by: Jie Deng <jie.deng@intel.com>
Reviewed-by: Wang, Yu1 <yu1.wang@intel.com>
Acked-by: Anthony Xu <anthony.xu@intel.com>
2021-12-09 10:47:16 +08:00
Yifan Liu fa6b55db68 hv: tee: Handling x86_tee secure interrupts corner cases
Previous upstreamed patches handles the secure/non-secure interrupts in
handle_x86_tee_int. However there is a corner case in which there might
be unhandled secure interrupts (in a very short time window) when TEE
yields vCPU. For this case we always make sure that no secure interrupts
are pending in TEE's vlapic before scheduling REE.

Also in previous patches, if non-secure interrupt comes when TEE is
handling its secure interrupts, hypervisor injects a predefined vector
into TEE's vlapic. TEE does not consume this vector in secure interrupt
handling routine so it stays in vIRR, but it should be cleared because the
actual interrupt will be consumed in REE after VM Entry.

v3:
    Fix comments on interrupt priority

v2:
    Add comments explaining the priority of secure/non-secure interrupts

Tracked-On: #6571
Signed-off-by: Yifan Liu <yifan1.liu@intel.com>
Reviewed-by: Wang, Yu1 <yu1.wang@intel.com>
Acked-by: Anthony Xu <anthony.xu@intel.com>
2021-12-09 10:47:16 +08:00
Yifan Liu fd7ab300a8 hv: tee: Rename TEE_NOTIFICATION_VECTOR to TEE_FIXED_NONSECURE_VECTOR
The TEE_NOTIFICATION_VECTOR can sometimes be confused with TEE's PI
notification vector. So rename it to TEE_FIXED_NONSECURE_VECTOR for
better readability.

No logic change.

v3:
Add more comments in commit message.

Tracked-On: #6571
Signed-off-by: Yifan Liu <yifan1.liu@intel.com>
Reviewed-by: Wang, Yu1 <yu1.wang@intel.com>
Acked-by: Anthony Xu <anthony.xu@intel.com>
2021-12-09 10:47:16 +08:00
Yifan Liu 702a71639f hv: Add two vlapic APIs
Sometimes HV would like to know if there are specific interrupt
pending in vIRR, and clears them if necessary (such as in x86_tee case).

This patch adds two APIs: get_next_pending_intr and clear_pending_intr.
This patch also moves the inline api prio() from
vlapic.c to vlapic.h

v3:
    Remove apicv_get_next_pending_intr and apicv_clear_pending_intr
    and use vlapic_get_next_pending_intr and vlapic_clear_pending_intr
    directly.

v2:
    get_pending_intr -> get_next_pending_intr
    apicv_basic/advanced_clear_pending_intr -> apicv_clear_pending_intr
    apicv_basic/advanced_get_pending_intr -> apicv_get_next_pending_intr
    has_pending_intr kept

Tracked-On: #6571
Signed-off-by: Yifan Liu <yifan1.liu@intel.com>
Reviewed-by: Wang, Yu1 <yu1.wang@intel.com>
Acked-by: Anthony Xu <anthony.xu@intel.com>
2021-12-09 10:47:16 +08:00
Yifan Liu 1d831cd3a2 hv: Add checks in pci_enumerate_ext_cap to guard against malformed lists
In pci_enumerate_ext_cap we assume the extended capability linked lists
are always legal and correct, which might not be true when there was a
faulty hardware. This patch adds checks (time to live) to guard against malformed
extended capability linked lists.

v2:
Add error printing when node_limit <= 0.

Tracked-On: #6571
Signed-off-by: Yifan Liu <yifan1.liu@intel.com>
Reviewed-by: Wang, Yu1 <yu1.wang@intel.com>
Acked-by: Anthony Xu <anthony.xu@intel.com>
2021-12-09 10:47:16 +08:00
Yifan Liu b59c489750 hv: Hide Service VM hypercalls from REE
Though REE VM has its load order to be Service_VM, it does not offer
services as Service VM does. The only hypercalls allowed for REE are the
ones with GUEST_FLAG_REE.

Tracked-On: #6571
Signed-off-by: Yifan Liu <yifan1.liu@intel.com>
Reviewed-by: Wang, Yu1 <yu1.wang@intel.com>
Acked-by: Anthony Xu <anthony.xu@intel.com>
2021-12-09 10:47:16 +08:00
Yifan Liu 98bc4cab35 hv: Wrap GUEST_FLAG_TEE/REE checks into function
This patch wraps the check of GUEST_FLAG_TEE/REE into functions
is_tee_vm/is_ree_vm for readability. No logic changes.

Tracked-On: #6571
Signed-off-by: Yifan Liu <yifan1.liu@intel.com>
Reviewed-by: Wang, Yu1 <yu1.wang@intel.com>
Acked-by: Anthony Xu <anthony.xu@intel.com>
2021-12-09 10:47:16 +08:00
Chenli Wei 7390488b8d hv: remove CONFIG_LOG_DESTINATION
The CONFIG_LOG_DESTINATION parameter selects where the logging messages
send to,serial console or memory or npk device MMIO region.

Now we want to remove it and check the loglevel of each channel,close the
output when the loglevel is ZERO.

Tracked-On: #6934
Signed-off-by: Chenli Wei <chenli.wei@intel.com>
Acked-by: Anthony Xu <anthony.xu@intel.com>
2021-12-06 14:24:40 +08:00
Xiangyang Wu d16d0f00bc HV: update destination shorthand during x2apic ICR emulation
Currently, in RTVM with multi vCPUs, lapic pass through is
configured, each vCPU works in x2apic mode. When one vCPU sends
IPI to all other vCPUs through writes ICR register with virtual
value 0x00000000000c00f8, this ICR writting will be intercepted,
the hypervisor passes destination shorthand field 11B (All Excluding
Self) in the virtual ICR value into physical ICR value during IPI
emulation, this IPI will be sent to each physical CPU core
in the platform according to 10.6.1 Interrupt Command Register (ICR),
Vol 3, SDM.
One vCPU in User VM with lapic pass through configuration can
send IPI with destination shorthand (10B or 11B) and any vector
(such as NMI or reboot vector) to other vCPUs, this IPI will sent
other VMs in the platform by hypervisor, this interference may
cause other VMs hang.

In this patch, set "Destination Shorthand" field of the
ICR value as 00B (No Shorthand) since the emulation is done
through sending IPI to each VCPU in dmask one by one.

Tracked-On: #6908

Signed-off-by: Xiangyang Wu <xiangyang.wu@intel.com>
Reviewed-by: Chen, Jason CJ <jason.cj.chen@intel.com>
2021-12-01 09:54:35 +08:00
Yifan Liu 0d59577fe4 hv: Add stateful VM check before system shutdown
This patch introduces stateful VM which represents a VM that has its own
internal state such as a file cache, and adds a check before system
shutdown to make sure that stateless VM does not block system shutdown.

Tracked-On: #6571
Signed-off-by: Wang Yu <yu1.wang@intel.com>
Signed-off-by: Yifan Liu <yifan1.liu@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
2021-12-01 08:47:25 +08:00
Yifan Liu 21615ee2f3 hv: Fix minor coding style warnings
This patch fixes a minor warning introduced by commit 3c9c41b. No logic
changes.

Tracked-On: #6776
Signed-off-by: Yifan Liu <yifan1.liu@intel.com>
2021-11-30 08:40:57 +08:00
Mingqiang Chi 4cb55e58a8 hv: move DM_OWNED_GUEST_FLAG_MASK to vm_config.h
Currently the MACRO(DM_OWNED_GUEST_FLAG_MASK) is generated
by config-tool. It's unnecessary to generate by tool since
it is fixed, the config-tool will remove this MACRO and
move it to vm_config.h

Tracked-On: #6366
Signed-off-by: Mingqiang Chi <mingqiang.chi@intel.com>
Reviewed-by: Eddie Dong <eddie.dong@intel.com>
2021-11-28 14:35:25 +08:00
Yang,Yu-chu 30951f7a83 hv: rename CONFIG_GPU_SBDF to CONFIG_IGD_SBDF
The name CONFIG_IGD_SBDF indicates the bdf of an integrated GPU on platform.

Tracked-On: #6855
Signed-off-by: Yang,Yu-chu <yu-chu.yang@intel.com>
2021-11-28 14:23:29 +08:00
Victor Sun e4a58363e3 HV: fix MISRA violation of _ld_ram_xxx
Variable should not have a prefix of '_' per MISRA C standard. The patch
removes the prefix for _ld_ram_start and _ld_ram_end.

Tracked-On: #6885

Signed-off-by: Victor Sun <victor.sun@intel.com>
Acked-by: Anthony Xu <anthony.xu@intel.com>
2021-11-26 16:45:17 +08:00
Victor Sun e718f48a55 HV: correct hv_ram_size when hv is relocated
In previous commit df7ffab441
the CONFIG_HV_RAM_SIZE was removed and the hv_ram_size was calculated in
link script by following formula:
	ld_ram_size = _ld_ram_end - _ld_ram_start ;
but _ld_ram_start is a relative address in boot section whereas _ld_ram_end
is a absolute address in global, the mix operation cause hv_ram_size is
incorrect when HV binary is relocated.

The patch fix this issue by getting _ld_ram_start and _ld_ram_end respectively
and calculated at runtime.

Tracked-On: #6885

Signed-off-by: Victor Sun <victor.sun@intel.com>
Acked-by: Anthony Xu <anthony.xu@intel.com>
2021-11-26 16:45:17 +08:00
Helmut Buchsbaum 2f54a1862e hv: hypercall: return 0 for empty PX or CX tables
Avoid failing hypercalls by returning 0 for empty PX and CX tables on
HC_PM_GET_CPU_STATE/PMCMD_GET_PX_CNT and
HC_PM_GET_CPU_STATE/PMCMD_GET_CX_CNT.

Tracked-On: #6848

Signed-off-by: Helmut Buchsbaum <helmut.buchsbaum@opensource.tttech-industrial.com>
Acked-by: Anthony Xu <anthony.xu@intel.com>
Acked-by: Eddie Dong <eddie.dong@Intel.com>
2021-11-17 16:33:11 +08:00
Jie Deng e97b171ca2 hv: tee: x86_tee interrupt support
Secure interrupt (interrupt belongs to TEE) comes
when TEE vcpu is running, the interrupt will be
injected to TEE directly. But when REE vcpu is running
at that time, we need to switch to TEE for handling.

Non-Secure interrupt (interrupt belongs to REE) comes
when REE vcpu is running, the interrupt will be injected
to REE directly. But when TEE vcpu is running at that time,
we need to inject a predefined vector to TEE for notification
and continue to switch back to TEE for running.

To sum up, when secure interrupt comes, switch to TEE
immediately regardless of whether REE is running or not;
when non-Secure interrupt comes and TEE is running,
just notify the TEE and keep it running, TEE will switch
to REE on its own initiative after completing its work.

Tracked-On: projectacrn#6571
Signed-off-by: Jie Deng <jie.deng@intel.com>
Reviewed-by: Wang, Yu1 <yu1.wang@intel.com>
Acked-by: Eddie Dong <eddie.dong@Intel.com>
2021-11-17 15:25:14 +08:00
Jie Deng 314d9ca8af hv: tee: implement the x86_tee hypercalls
This patch implements the following x86_tee hypercalls,

- HC_TEE_VCPU_BOOT_DONE
- HC_SWITCH_EE

Tracked-On: #6571
Signed-off-by: Jie Deng <jie.deng@intel.com>
Reviewed-by: Wang, Yu1 <yu1.wang@intel.com>
Acked-by: Eddie Dong <eddie.dong@Intel.com>
2021-11-17 15:25:14 +08:00
Jie Deng 3c9c41b656 hv: tee: add x86_tee hypercall interfaces
This patch adds the x86_tee hypercall interfaces.

- HC_TEE_VCPU_BOOT_DONE

This hypercall is used to notify the hypervisor that the TEE VCPU Boot
is done, so that we can sleep the corresponding TEE VCPU. REE will be
started at the last time this hypercall is called by TEE.

- HC_SWITCH_EE

For REE VM, it uses this hypercall to request TEE service.

For TEE VM, it uses this hypercall to switch back to REE
when it completes the REE service.

Tracked-On: #6571
Signed-off-by: Jie Deng <jie.deng@intel.com>
Reviewed-by: Wang, Yu1 <yu1.wang@intel.com>
Acked-by: Eddie Dong <eddie.dong@Intel.com>
2021-11-17 15:25:14 +08:00
Jie Deng f3792a74a3 hv: tee: add TEE VM memmap support
TEE is a secure VM which has its own partitioned resources while
REE is a normal VM which owns the rest of platform resources.
The TEE, as a secure world, it can see the memory of the REE
VM, also known as normal world, but not the other way around.
But please note, TEE and REE can only see their own devices.

So this patch does the following things:

1. go through physical e820 table, to ept add all system memory entries.
2. remove hv owned memory.

Tracked-On: #6571
Signed-off-by: Jie Deng <jie.deng@intel.com>
Reviewed-by: Wang, Yu1 <yu1.wang@intel.com>
Acked-by: Eddie Dong <eddie.dong@Intel.com>
2021-11-17 15:25:14 +08:00
Jie Deng 0b1418d395 hv: tee: add an API for creating identical memmap according to e820
Given an e820, this API creates an identical memmap for specified
e820 memory type, EPT memory cache type and access right.

Tracked-On: #6571
Signed-off-by: Jie Deng <jie.deng@intel.com>
Reviewed-by: Wang, Yu1 <yu1.wang@intel.com>
Acked-by: Eddie Dong <eddie.dong@Intel.com>
2021-11-17 15:25:14 +08:00
Jie Deng c4d59c8f91 hv: tee: Support the concept of companion VM
Add a configuration to support companion VM.

Tracked-On: #6571
Signed-off-by: Jie Deng <jie.deng@intel.com>
Reviewed-by: Wang, Yu1 <yu1.wang@intel.com>
2021-11-17 15:25:14 +08:00
Jie Deng 71ae0fdabf hv: tee: add VM flags for x86_tee support
Add two VM flags for x86_tee. GUEST_FLAG_TEE for TEE VM,
GUEST_FLAG_REE for normal rich VM.

Tracked-On: #6571
Signed-off-by: Jie Deng <jie.deng@intel.com>
Reviewed-by: Wang, Yu1 <yu1.wang@intel.com>
2021-11-17 15:25:14 +08:00
Victor Sun 960238cdcb HV: fix build issue on RELEASE version
The HV will be built failed with below compiler message:

  common/efi_mmap.c: In function ‘init_efi_mmap_entries’:
  common/efi_mmap.c:41:11: error: unused variable ‘efi_memdesc_nr’
			[-Werror=unused-variable]
  uint32_t efi_memdesc_nr = uefi_info->memmap_size / uefi_info->memdesc_size;
  ^~~~~~~~~~~~~~
  cc1: all warnings being treated as errors

The root cause is ASSERT() api is for DEBUG only so efi_memdesc_nr is not used
in RELEASE code.

The patch fix this issue by removing efi_memdesc_nr declaration;

Tracked-On: #6834

Signed-off-by: Victor Sun <victor.sun@intel.com>
2021-11-16 19:01:44 +08:00
Chenli Wei 1cfaee396a hv: remove MAX_KATA_VM_NUM and CONFIG_KATA_VM
Since the UUID is not a *must* set parameter for the standard post-launched
VM which doesn't depend on any static VM configuration. We can remove
the KATA related code from hypervisor as it belongs to such VM type.

v2-->v3:
    separate the struce acrn_platform_info change of devicemodel

v1-->v2:
    update the subject and commit msg

Tracked-On:#6685
Signed-off-by: Chenli Wei <chenli.wei@intel.com>
Reviewed-by: Wang, Yu1 <yu1.wang@intel.com>
2021-11-16 14:42:59 +08:00
Mingqiang Chi bb0327e700 hv: remove UUID
With current arch design the UUID is used to identify ACRN VMs,
all VM configurations must be deployed with given UUIDs at build time.
For post-launched VMs, end user must use UUID as acrn-dm parameter
to launch specified user VM. This is not friendly for end users
that they have to look up the pre-configured UUID before launching VM,
and then can only launch the VM which its UUID in the pre-configured UUID
list,otherwise the launch will fail.Another side, VM name is much straight
forward for end user to identify VMs, whereas the VM name defined
in launch script has not been passed to hypervisor VM configuration
so it is not consistent with the VM name when user list VM
in hypervisor shell, this would confuse user a lot.

This patch will resolve these issues by removing UUID as VM identifier
and use VM name instead:
1. Hypervisor will check the VM name duplication during VM creation time
   to make sure the VM name is unique.
2. If the VM name passed from acrn-dm matches one of pre-configured
   VM configurations, the corresponding VM will be launched,
   we call it static configured VM.
   If there is no matching found, hypervisor will try to allocate one
   unused VM configuration slot for this VM with given VM name and get it
   run if VM number does not reach CONFIG_MAX_VM_NUM,
   we will call it dynamic configured VM.
3. For dynamic configured VMs, we need a guest flag to identify them
   because the VM configuration need to be destroyed
   when it is shutdown or creation failed.

v7->v8:
    -- rename is_static_vm_configured to is_static_configured_vm
    -- only set DM owned guest_flags in hcall_create_vm
    -- add check dynamic flag in get_unused_vmid

v6->v7:
    -- refine get_vmid_by_name, return the first matching vm_id
    -- the GUEST_FLAG_STATIC_VM is added to identify the static or
       dynamic VM, the offline tool will set this flag for
       all the pre-defined VMs.
    -- only clear name field for dynamic VM instead of clear entire
       vm_config

Tracked-On: #6685
Signed-off-by: Mingqiang Chi <mingqiang.chi@intel.com>
Reviewed-by: Zhao Yakui <yakui.zhao@intel.com>
Reviewed-by: Victor Sun<victor.sun@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
2021-11-16 14:42:59 +08:00
Chenli Wei 022df1fb2e hv: align the MAX_IR_ENTRIES to MAX_PT_IRQ_ENTRIES
The CONFIG_MAX_IR_ENTRIES and CONFIG_MAX_PT_IRQ_ENTRIES are separate
configuration items, and they can be configured through configuration tool

When the number of PT irq entries are more than IR entries, then some
passthrough devices' irqs may failed to be protected by interrupt
remapping or automatically injected by post-interrupt mechanism.
And it waste memory if the CONFIG_MAX_IR_ENTRIES is larger.

This patch replace the CONFIG_MAX_IR_ENTRIES to MAX_IR_ENTRIES and
enforce it align to CONFIG_PT_IRQ_ENTRIES and round up to > 2^n as the
IRTA_REG spec.This way can enforce all PT irqs works with IR or PI
mechanism.

Tracked-On: #6745
Signed-off-by: Chenli Wei <chenli.wei@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
Reviewed-by: Wang, Yu1 <yu1.wang@intel.com>
2021-11-15 09:00:27 +08:00
Yuanyuan Zhao c111dd2e2c hv : encapsulate page align in e820_alloc_memory
e820_alloc_memory requires 4k alignment, so conversion to size is
encapsulated in the function. And then the pre-condition of
`size_arg` is removed.

Tracked-On: #6805
Signed-off-by: Yuanyuan Zhao <yuanyuan.zhao@linux.intel.com>
Reviewed-by: Wang, Yu1 <yu1.wang@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
2021-11-12 11:56:03 +08:00
Yuanyuan Zhao 4f6aa38ea5 hv: remove CONFIG_LOW_RAM_SIZE
The CONFIG_LOW_RAM_SIZE is used to describe the size of trampoline code
that is never changed. And it totally confused user to configure it.

This patch hard code it to 1MB and remove the macro for configuration.
In the trampoline related code, use ld_trampoline_end and
ld_trampoline_start symbol to calculate the real size.

Tracked-On: #6805
Signed-off-by: Yuanyuan Zhao <yuanyuan.zhao@linux.intel.com>
Reviewed-by: Wang, Yu1 <yu1.wang@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
2021-11-12 11:56:03 +08:00
Shiqing Gao 7bbd17ce80 hv: initialize and save/restore IA32_TSC_AUX MSR for guest
Commit cbf3825 "hv: Pass-through IA32_TSC_AUX MSR to L1 guest"
lets guest own the physical MSR IA32_TSC_AUX and does not handle this MSR
in the hypervisor.
If multiple vCPUs share the same pCPU, when one vCPU reads MSR IA32_TSC_AUX,
it may get the value set by other vCPUs.

To fix this issue, this patch does:
 - initialize the MSR content to 0 for the given vCPU, which is consistent with
   the value specified in SDM Vol3 "Table 9-1. IA-32 and Intel 64 Processor
   States Following Power-up, Reset, or INIT"
 - save/restore the MSR content for the given vCPU during context switch

v1 -> v2:
 * According to Table 9-1, the content of IA32_TSC_AUX MSR is unchanged
   following INIT, v2 updates the initialization logic so that the content for
   vCPU is consistent with SDM.

Tracked-On: #6799
Signed-off-by: Shiqing Gao <shiqing.gao@intel.com>
Reviewed-by: Zide Chen <zide.chen@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
2021-11-12 09:30:12 +08:00
Xiangyang Wu 511fd7b469 misc: config_tools: generate serial configuration file
Generate serial configuration file for service VM according
to scenario file and vUART ports base address allocated by
config tool.
Currently, some non-standard serial ports are emulated in
hypervisor and will be used to do communication between service
VM and user VM, so need to generate serial configuration file
to configure these serial ports for service VM.

v1-->v2:
	Fix some type issues
	Refine script code format

Tracked-On: #6652

Signed-off-by: Xiangyang Wu <xiangyang.wu@intel.com>
2021-11-08 13:15:38 +08:00
Zhou, Wu 1bc25ed198 HV: refine the ve820 tab for pre-VMs
This patch moves the ssram area in ve820 tab, and reunites the
hpa1_low_part1/2 areas. The ve820 building code is refined.

before:
|<---low_1M--->|
|<---hpa1_low_part1--->|
|<---SSRAM--->|
|<---hpa1_low_part2--->|
|<---GPU_OpRegion--->|
|<---ACPI DATA--->|
|<---ACPI NVS--->|
---2G---

after:
|<---low_1M--->|
|<---hpa_low--->|
|<---SSRAM--->|
|<---GPU_OpRegion--->|
|<---ACPI DATA--->|
|<---ACPI NVS--->|
---2G---

The SSRAM area's address is described in the ACPI's RTCT/PTCT
table. To simplify the SSRAM implementation, SSRAM area was
identical mapped to GPA, and resulted in the divition of hpa_low.
Then the ve820 building logic became too complicated.

Now we managed to edit the guest's RTCT/PTCT table by offline
tools in the former patch, so we can move the guest's SSRAM
area, and reunite the hpa_low areas again.

After doing this, this patch rewrites the ve820 building code
in a much simpler way.

Tracked-On: #6674

Signed-off-by: Zhou, Wu <wu.zhou@intel.com>
Reviewed-by: Eddie Dong <eddie.dong@intel.com>
Reviewed-by: Wang, Yu1 <yu1.wang@intel.com>
2021-11-08 13:13:14 +08:00
Zhou, Wu f1f6fe11c1 HV: move the ve820 GPU OpRegion address
The ve820 table' hpa1_low area is divided into two parts, which
is making the code too complicated and causing problems. Moving
the entries that divides the hpa1_low could make things easier.

This patch moves the GPU OpRegion to the tail area of 2G,
consecutive to the acpi data/nvs area.

before:
|<---low_1M--->|
|<---hpa1_low_part1--->|
|<---SSRAM--->|
|<---GPU_OpRegion--->|
|<---hpa1_low_part2--->|
|<---ACPI DATA--->|
|<---ACPI NVS--->|
---2G---

after:
|<---low_1M--->|
|<---hpa1_low_part1--->|
|<---SSRAM--->|
|<---hpa1_low_part2--->|
|<---GPU_OpRegion--->|
|<---ACPI DATA--->|
|<---ACPI NVS--->|
---2G---

Tracked-On: #6674

Signed-off-by: Zhou, Wu <wu.zhou@intel.com>
Reviewed-by: Victor Sun <victor.sun@intel.com>
Reviewed-by: Wang, Yu1 <yu1.wang@intel.com>
2021-11-08 13:13:14 +08:00
Zhou, Wu e00421d5be HV: Fix the problems in ve820 acpi area
The length of the ACPI data entry in ve820 tab was 960K, while the
ACPI file is 1MB. It causes the ACPI file copy failed due to reserved
ACPI regions in ve820 table is not enough when loading pre-launched
VMs. This patch changes ACPI data area to 1MB to fix the problem.

And the ACPI data length was missed when calculating
ENTRY_HPA1_LOW_PART2 length. Fixed here too.

Also adds some refinement to the hard-coded ACPI base/addr definations

Tracked-On: #6674

Signed-off-by: Zhou, Wu <wu.zhou@intel.com>
Reviewed-by: Eddie Dong <eddie.dong@intel.com>
Reviewed-by: Wang, Yu1 <yu1.wang@intel.com>
2021-11-08 13:13:14 +08:00
Victor Sun c5107123f6 HV: prepare adaptable guest EFI mmap buffer
Previously we prepared 32KB buffer to store guest VM boot parameters, in which
there is at most 17KB buffer for guest EFI memory map to accomodate ~400 EFI
memory descriptors. But this is uncertain to ensure working on all boards, now
change the algorithm that make the EFI mmap buffer adaptable with configured
MAX_EFI_MMAP_ENTRIES macro.

Tracked-On: #6442

Signed-off-by: Victor Sun <victor.sun@intel.com>
Acked-by: Eddie Dong <eddie.dong@Intel.com>
2021-11-08 09:49:24 +08:00
Victor Sun 914341c9aa hv: set default MAX_EFI_MMAP_ENTRIES to 350
In previous implementation we leave MAX_EFI_MMAP_ENTRIES in config tool and
let end user to configure it. However it is hard for end user to understand
how to configure it, also it is hard for board_inspector to get this value
automatically because this info is only meaningful during the kernel boot
stage and there is no such info available after boot in Linux toolset.

This patch hardcode the value to 350, and ASSERT if the board need more efi
mmap entries to run ACRN. User could modify the MAX_EFI_MMAP_ENTRIES macro
in case ASSERT occurs in DEBUG stage.

The More size of hv_memdesc[] only consume very little memory, the overhead
is (size * sizeof(struct efi_memory_desc)), i.e. (size * 40) in bytes.

Tracked-On: #6442

Signed-off-by: Victor Sun <victor.sun@intel.com>
Acked-by: Eddie Dong <eddie.dong@Intel.com>
2021-11-08 09:49:24 +08:00
Junjie Mao 83a938bae6 HV: treewide: fix violations of coding guideline C-TY-27 & C-TY-28
The coding guideline rules C-TY-27 and C-TY-28, combined, requires that
assignment and arithmetic operations shall be applied only on operands of the
same kind. This patch either adds explicit type casts or adjust types of
variables to align the types of operands.

The only semantic change introduced by this patch is the promotion of the
second argument of set_vmcs_bit() and clear_vmcs_bit() to
uint64_t (formerly uint32_t). This avoids clear_vmcs_bit() to accidentally
clears the upper 32 bits of the requested VMCS field.

Other than that, this patch has no semantic change. Specifically this patch
is not meant to fix buggy narrowing operations, only to make these
operations explicit.

Tracked-On: #6776
Signed-off-by: Junjie Mao <junjie.mao@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
2021-11-04 18:15:47 +08:00
Junjie Mao 2c86795fa8 HV: arch: fix a violation of coding guideline C-TY-24
The coding guideline rule C-TY-24 requires that 'cast shall not be
performed on a function pointer'. This patch removes a duplicated explicit
cast on timer_expired_handler in tsc_deadline_timer.c.

This patch has no semantic impacts.

Tracked-On: #6776
Signed-off-by: Junjie Mao <junjie.mao@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
2021-11-04 18:15:47 +08:00
Junjie Mao 9781873e77 HV: treewide: fix violations of coding guideline C-TY-12
The coding guideline rule C-TY-12 requires that 'all type conversions shall
be explicit'. Especially implicit cases on the signedness of variables
shall be avoided.

This patch either adds explicit type casts or adjust local variable types
to make sure that Booleans, signed and unsigned integers are not used
mixedly.

This patch has no semantic changes.

Tracked-On: #6776
Signed-off-by: Junjie Mao <junjie.mao@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
2021-11-04 18:15:47 +08:00
Junjie Mao db20e277b6 HV: treewide: fix violations of coding guideline C-TY-02
The coding guideline rule C-TY-02 requires that 'the operands of bit
operations shall be unsigned'. This patch adds explicit casts or literal
suffixes to make explicit the type of values involved in bit operations.
Explicit casts to widen integers before shift operations are also
introduced to make explicit that the variables are expanded BEFORE it is
shifted (which is already so in C99 but implicitly).

This patch has no semantic changes.

Tracked-On: #6776
Signed-off-by: Junjie Mao <junjie.mao@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
2021-11-04 18:15:47 +08:00
Junjie Mao b0dbc1cbfe HV: treewide: fix violations of coding guideline C-ST-02
The coding guideline rule C-ST-02 requires that 'the loop body shall be
enclosed with brackets', or more specifically, braces. This patch adds
braces to the single-line loop bodies.

This patch has no semantic change.

Tracked-On: #6776
Signed-off-by: Junjie Mao <junjie.mao@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
2021-11-04 18:15:47 +08:00
Junjie Mao ff891b4f79 HV: treewide: fix violations of coding guideline C-PP-04
The coding guideline rule C-PP-04 requires that 'parentheses shall be used
when referencing a MACRO parameter'. This patch adds parentheses to macro
parameters or expressions that are not yet wrapped properly.

This patch has no sematic impact.

Tracked-On: #6776
Signed-off-by: Junjie Mao <junjie.mao@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
2021-11-04 18:15:47 +08:00
Junjie Mao d5c137eac5 HV: treewide: fix violations of coding guideline C-FN-09
The coding guideline gule C-FN-09 requires that 'the formal parameter name
of a function shall be consistent'. This patch fixes two places where the
formal parameters are named differently in declarations and
definitions. More specifically, the names in declarations are replaced with
those in definitions.

This patch has no semantic impact.

Tracked-On: #6776
Signed-off-by: Junjie Mao <junjie.mao@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
2021-11-04 18:15:47 +08:00
Junjie Mao fba343bd05 HV: treewide: fix violations of coding guideline C-FN-06
The coding guideline rule C-FN-06 requires that 'a parameter passed by
value to a function shall not be modified directly'. This patch rewrites
two functions which does modify its parameters today.

This patch has no semantic impact.

Tracked-On: #6776
Signed-off-by: Junjie Mao <junjie.mao@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
2021-11-04 18:15:47 +08:00
Junjie Mao d4055d6157 HV: acpi_parser: fix violations of coding guideline C-CU-02
The coding guideline rule C-CU-02 requires that 'only one return statement shall
be in a function'. This patch refactors handle_dmar_devscope() which has
multiple return statements today.

This patch has no semantic changes.

Tracked-On: #6776
Signed-off-by: Junjie Mao <junjie.mao@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
2021-11-04 18:15:47 +08:00
Junjie Mao 3f3f4be642 HV: treewide: fix violations of coding guideline C-EP-05
The coding guideline rule C-EP-05 requires that 'parentheses shall be used
to set the operator precedence explicitly'. This patch adds the missing
parentheses detected by the static analyzer.

Tracked-On: #6776
Signed-off-by: Junjie Mao <junjie.mao@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
2021-11-04 18:15:47 +08:00
Junjie Mao 4cf6c288cd HV: treewide: fix warnings raised by Clang
This patch fixes the following warnings detected by the LLVM/Clang
compiler:

  1. Unused static functions in C sources, which are fixed by explicitly
     tagging them with __unused

  2. Duplicated parentheses around branch conditions

  3. Assigning 64-bit constants to 32-bit variables, which is fixed by
     promoting the variables to uint64_t

  4. Using { '\0' } to zero-fill an array, which is fixed by replacing it
     with { 0 }

  5. Taking a bit out of a variable using && (which should be & instead)

Most changes do not have a semantic impact, except item 5 which is probably
a real code issue.

Tracked-On: #6776
Signed-off-by: Junjie Mao <junjie.mao@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
2021-11-04 18:15:47 +08:00
Yifan Liu 10963b04d1 hv: Fix vcpu signaling racing problem in lock instruction emulation
In lock instruction emulation, we use vcpu_make_request and
signal_event pairs to shoot down/release other vcpus.
However, vcpu_make_request is async and does not guarantee an execution
of wait_event on target vcpu, and we want wait_event to be consistent
with signal_event.

Consider following scenarios:

1, When target vcpu's state has not yet turned to VCPU_RUNNING,
vcpu_make_request on ACRN_REQUEST_SPLIT_LOCK does not make sense, and will
not result in wait_event.

2, When target vcpu is already requested on ACRN_REQUEST_SPLIT_LOCK (i.e., the
corresponding bit in pending_req is set) but not yet handled,
the vcpu_make_request call does not result in wait_event as 1 bit is not
enough to cache multiple requests.

This patch tries to add checks in vcpu_kick_lock_instr_emulation and
vcpu_complete_lock_instr_emulation to resolve these issues.

Tracked-On: #6502
Signed-off-by: Yifan Liu <yifan1.liu@intel.com>
Reviewed-by: Jason Chen CJ <jason.cj.chen@intel.com>
2021-11-02 15:01:20 +08:00
Liu Long 3f4ea38158 ACRN: misc: Unify terminology for service vm/user vm
Rename SOS_VM type to SERVICE_VM
rename UOS to User VM in XML description
rename uos_thread_pid to user_vm_thread_pid
rename devname_uos to devname_user_vm
rename uosid to user_vmid
rename UOS_ACK to USER_VM_ACK
rename SOS_VM_CONFIG_CPU_AFFINITY to SERVICE_VM_CONFIG_CPU_AFFINITY
rename SOS_COM to SERVICE_VM_COM
rename SOS_UART1_VALID_NUM" to SERVICE_VM_UART1_VALID_NUM
rename SOS_BOOTARGS_DIFF to SERVICE_VM_BOOTARGS_DIFF
rename uos to user_vm in launch script and xml

Tracked-On: #6744
Signed-off-by: Liu Long <long.liu@linux.intel.com>
Reviewed-by: Geoffroy Van Cutsem <geoffroy.vancutsem@intel.com>
2021-11-02 10:00:55 +08:00
Liu Long 14c6e21efa ACRN: misc: Unify terminology for sos/uos rin macro
Rename SOS_VM_NUM to SERVICE_VM_NUM.
rename SOS_SOCKET_PORT to SERVICE_VM_SOCKET_PORT.
rename PROCESS_RUN_IN_SOS to PROCESS_RUN_IN_SERVICE_VM.
rename PCI_DEV_TYPE_SOSEMUL to PCI_DEV_TYPE_SERVICE_VM_EMUL.
rename SHUTDOWN_REQ_FROM_SOS to SHUTDOWN_REQ_FROM_SERVICE_VM.
rename PROCESS_RUN_IN_SOS to PROCESS_RUN_IN_SERVICE_VM.
rename SHUTDOWN_REQ_FROM_UOS to SHUTDOWN_REQ_FROM_USER_VM.
rename UOS_SOCKET_PORT to USER_VM_SOCKET_PORT.
rename SOS_CONSOLE to SERVICE_VM_OS_CONSOLE.
rename SOS_LCS_SOCK to SERVICE_VM_LCS_SOCK.
rename SOS_VM_BOOTARGS to SERVICE_VM_OS_BOOTARGS.
rename SOS_ROOTFS to SERVICE_VM_ROOTFS.
rename SOS_IDLE to SERVICE_VM_IDLE.
rename SEVERITY_SOS to SEVERITY_SERVICE_VM.
rename SOS_VM_UUID to SERVICE_VM_UUID.
rename SOS_REQ to SERVICE_VM_REQ.
rename RTCT_NATIVE_FILE_PATH_IN_SOS to RTCT_NATIVE_FILE_PATH_IN_SERVICE_VM.
rename CBC_REQ_T_UOS_ACTIVE to CBC_REQ_T_USER_VM_ACTIVE.
rename CBC_REQ_T_UOS_INACTIVE to CBC_REQ_T_USER_VM_INACTIV.
rename uos_active to user_vm_active.

Tracked-On: #6744
Signed-off-by: Liu Long <long.liu@linux.intel.com>
Reviewed-by: Geoffroy Van Cutsem <geoffroy.vancutsem@intel.com>
2021-11-02 10:00:55 +08:00
Liu Long e9c4ced460 ACRN: hv: Unify terminology for user vm
Rename gpa_uos to gpa_user_vm
rename base_gpa_in_uos to base_gpa_in_user_vm
rename UOS_VIRT_PCI_MMCFG_BASE to USER_VM_VIRT_PCI_MMCFG_BASE
rename UOS_VIRT_PCI_MMCFG_START_BUS to USER_VM_VIRT_PCI_MMCFG_START_BUS
rename UOS_VIRT_PCI_MMCFG_END_BUS to USER_VM_VIRT_PCI_MMCFG_END_BUS
rename UOS_VIRT_PCI_MEMBASE32 to USER_VM_VIRT_PCI_MEMBASE32
rename UOS_VIRT_PCI_MEMLIMIT32 to USER_VM_VIRT_PCI_MEMLIMIT32
rename UOS_VIRT_PCI_MEMBASE64 to USER_VM_VIRT_PCI_MEMBASE64
rename UOS_VIRT_PCI_MEMLIMIT64 to USER_VM_VIRT_PCI_MEMLIMIT64
rename UOS in comments message to User VM.

Tracked-On: #6744
Signed-off-by: Liu Long <long.liu@linux.intel.com>
Reviewed-by: Geoffroy Van Cutsem <geoffroy.vancutsem@intel.com>
2021-11-02 10:00:55 +08:00
Liu Long 92b7d6a9a3 ACRN: hv: Terminology modification in hv code
Rename sos_vm to service_vm.
rename sos_vmid to service_vmid.
rename sos_vm_ptr to service_vm_ptr.
rename get_sos_vm to get_service_vm.
rename sos_vm_gpa to service_vm_gpa.
rename sos_vm_e820 to service_vm_e820.
rename sos_efi_info to service_vm_efi_info.
rename sos_vm_config to service_vm_config.
rename sos_vm_hpa2gpa to service_vm_hpa2gpa.
rename vdev_in_sos to vdev_in_service_vm.
rename create_sos_vm_e820 to create_service_vm_e820.
rename sos_high64_max_ram to service_vm_high64_max_ram.
rename prepare_sos_vm_memmap to prepare_service_vm_memmap.
rename post_uos_sworld_memory to post_user_vm_sworld_memory
rename hcall_sos_offline_cpu to hcall_service_vm_offline_cpu.
rename filter_mem_from_sos_e820 to filter_mem_from_service_vm_e820.
rename create_sos_vm_efi_mmap_desc to create_service_vm_efi_mmap_desc.
rename HC_SOS_OFFLINE_CPU to HC_SERVICE_VM_OFFLINE_CPU.
rename SOS to Service VM in comments message.

Tracked-On: #6744
Signed-off-by: Liu Long <long.liu@linux.intel.com>
Reviewed-by: Geoffroy Van Cutsem <geoffroy.vancutsem@intel.com>
2021-11-02 10:00:55 +08:00
Liu Long 26e507a06e ACRN: hv: Unify terminology for service vm
Rename is_sos_vm to is_service_vm

Tracked-On: #6744
Signed-off-by: Liu Long <longliu@intel.com>
2021-11-02 10:00:55 +08:00
dongshen dcafcadaf9 hv: rename some C preprocessor macros
Rename some C preprocessor macros:
  NUM_GUEST_MSRS --> NUM_EMULATED_MSRS
  CAT_MSR_START_INDEX --> FLEXIBLE_MSR_INDEX
  NUM_VCAT_MSRS --> NUM_CAT_MSRS
  NUM_VCAT_L2_MSRS --> NUM_CAT_L2_MSRS
  NUM_VCAT_L3_MSRS --> NUM_CAT_L3_MSRS

Tracked-On: #5917
Signed-off-by: Eddie Dong <eddie.dong@Intel.com>
2021-10-28 19:12:29 +08:00
dongshen c0d95558c1 hv: vCAT: propagate vCBM to other vCPUs that share cache with vcpu
Implement the propagate_vcbm() function:
  Set vCBM to to all the vCPUs that share cache with vcpu
  to mimic hardware CAT behavior

Tracked-On: #5917
Signed-off-by: dongshen <dongsheng.x.zhang@intel.com>
Acked-by: Eddie Dong <eddie.dong@Intel.com>
2021-10-28 19:12:29 +08:00
dongshen a7014f4654 hv: vCAT: implementing the vCAT MSRs write handler
Implement the write_vcbm() function to handle the
MSR_IA32_type_MASK_n vCBM MSRs write request

Call write_vclosid() to handle MSR_IA32_PQR_ASSOC MSR write request

Several vCAT P2V (physical to virtual) and V2P (virtual to physical)
mappings exist:

   struct acrn_vm_config *vm_config = get_vm_config(vm_id)

   max_pcbm = vm_config->max_type_pcbm (type: l2 or l3)
   mask_shift = ffs64(max_pcbm)

   vclosid = vmsr - MSR_IA32_type_MASK_0
   pclosid = vm_config->pclosids[vclosid]

   pmsr = MSR_IA32_type_MASK_0 + pclosid
   pcbm = vcbm << mask_shift
   vcbm = pcbm >> mask_shift

   Where
   MSR_IA32_type_MASK_n: L2 or L3 mask msr address for CLOSIDn, from
   0C90H through 0D8FH (inclusive).

   max_pcbm: a bitmask that selects all the physical cache ways assigned to the VM

   vclosid: virtual CLOSID, always starts from 0

   pclosid: corresponding physical CLOSID for a given vclosid

   vmsr: virtual msr address, passed to vCAT handlers by the
   caller functions rdmsr_vmexit_handler()/wrmsr_vmexit_handler()

   pmsr: physical msr address

   vcbm: virtual CBM, passed to vCAT handlers by the
   caller functions rdmsr_vmexit_handler()/wrmsr_vmexit_handler()

   pcbm: physical CBM

Tracked-On: #5917
Signed-off-by: dongshen <dongsheng.x.zhang@intel.com>
Acked-by: Eddie Dong <eddie.dong@Intel.com>
2021-10-28 19:12:29 +08:00
dongshen 3ab50f2ef5 hv: vCAT: implementing the vCAT MSRs read handlers
Implement the read_vcbm() and read_vclosid() functions to handle the MSR_IA32_PQR_ASSOC
and MSR_IA32_type_MASK_n vCAT MSRs read request.

Tracked-On: #5917
Signed-off-by: dongshen <dongsheng.x.zhang@intel.com>
Acked-by: Eddie Dong <eddie.dong@Intel.com>
2021-10-28 19:12:29 +08:00
dongshen be855d2352 hv: vCAT: expose CAT capabilities to vCAT-enabled VM
Expose CAT feature to vCAT VM by reporting the number of
cache ways/CLOSIDs via the 04H/10H cpuid instructions, so that the
VM can take advantage of CAT to prioritize and partition cache
resource for its own tasks.

Add the vcat_pcbm_to_vcbm() function to map pcbm to vcbm

Tracked-On: #5917
Signed-off-by: dongshen <dongsheng.x.zhang@intel.com>
Acked-by: Eddie Dong <eddie.dong@Intel.com>
2021-10-28 19:12:29 +08:00
dongshen 77ae989379 hv: vCAT: initialize vCAT MSRs during vmcs init
Initialize vCBM MSRs

Initialize vCLOSID MSR

Add some vCAT functions:
 Retrieve max_vcbm and max_pcbm
 Check if vCAT is configured or not for the VM
 Map vclosid to pclosid
 write_vclosid: vCLOSID MSR write handler
 write_vcbm: vCBM MSR write handler

Tracked-On: #5917
Signed-off-by: dongshen <dongsheng.x.zhang@intel.com>
Acked-by: Eddie Dong <eddie.dong@Intel.com>
2021-10-28 19:12:29 +08:00
Fei Li 1905ed6124 hv: vMSR: minor fix about rdmsr_vmexit_handler
Specifying a reserved or unimplemented MSR address in ECX for rdmsr will cause a
general protection exception. In this case, we should not change the contents of
registers EDX:EAX.

Tracked-On: #4550
Signed-off-by: Fei Li <fei1.li@intel.com>
2021-10-27 08:23:43 +08:00
dongshen 39461ef9dd hv: vCAT: initialize the emulated_guest_msrs array for CAT msrs during platform initialization
Initialize the emulated_guest_msrs[] array at runtime for
MSR_IA32_type_MASK_n and MSR_IA32_PQR_ASSOC msrs, there is no good
way to do this initialization statically at build time

Tracked-On: #5917
Signed-off-by: dongshen <dongsheng.x.zhang@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
2021-10-26 11:48:27 +08:00
dongshen cb2bb78b6f hv/config_tools: amend the struct acrn_vm_config to make it compatible with vCAT
For vCAT, it may need to store more than MAX_VCPUS_PER_VM of closids,
change clos in vm_config.h to a pointer to accommodate this situation

Rename clos to pclosids

pclosids now is a pointer to an array of physical CLOSIDs that is defined
in vm_configurations.c by vmconfig. The number of elements in the array
must be equal to the value given by num_pclosids

Add max_type_pcbm (type: l2 or l3) to struct acrn_vm_config, which stores a bitmask
that selects/covers all the physical cache ways assigned to the VM

Change vmsr.c to accommodate this amended data structure

Change the config-tools to generate vm_configurations.c, and fill in the num_closids
and clos pointers based on the information from the scenario file.

Now vm_configurations.c.xsl generates all the clos related code so remove the same
code from misc_cfg.h.xsl.

Examples:

  Scenario file:

  <RDT>
    <RDT_ENABLED>y</RDT_ENABLED>
    <CDP_ENABLED>n</CDP_ENABLED>
    <VCAT_ENABLED>y</VCAT_ENABLED>
    <CLOS_MASK>0x7ff</CLOS_MASK>
    <CLOS_MASK>0x7ff</CLOS_MASK>
    <CLOS_MASK>0x7ff</CLOS_MASK>
    <CLOS_MASK>0xff800</CLOS_MASK>
    <CLOS_MASK>0xff800</CLOS_MASK>
    <CLOS_MASK>0xff800</CLOS_MASK>
    <CLOS_MASK>0xff800</CLOS_MASK>
    <CLOS_MASK>0xff800</CLOS_MASK>
  /RDT>

  <vm id="0">
   <guest_flags>
     <guest_flag>GUEST_FLAG_VCAT_ENABLED</guest_flag>
   </guest_flags>
   <clos>
     <vcpu_clos>3</vcpu_clos>
     <vcpu_clos>4</vcpu_clos>
     <vcpu_clos>5</vcpu_clos>
     <vcpu_clos>6</vcpu_clos>
     <vcpu_clos>7</vcpu_clos>
   </clos>
  </vm>

  <vm id="1">
   <clos>
     <vcpu_clos>1</vcpu_clos>
     <vcpu_clos>2</vcpu_clos>
   </clos>
  </vm>

 vm_configurations.c (generated by config-tools) with the above vCAT config:

  static uint16_t vm0_vcpu_clos[5U] = {3U, 4U, 5U, 6U, 7U};
  static uint16_t vm1_vcpu_clos[2U] = {1U, 2U};

  struct acrn_vm_config vm_configs[CONFIG_MAX_VM_NUM] = {
  {
  .guest_flags = (GUEST_FLAG_VCAT_ENABLED),
  .pclosids = vm0_vcpu_clos,
  .num_pclosids = 5U,
  .max_l3_pcbm = 0xff800U,
  },
  {
  .pclosids = vm1_vcpu_clos,
  .num_pclosids = 2U,
  },
  };

Tracked-On: #5917
Signed-off-by: dongshen <dongsheng.x.zhang@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
2021-10-26 11:48:27 +08:00
dongshen 368f158b46 hv/config-tools: add the support for vCAT
Add the VCAT_ENABLED element to RDTType so that user can enable/disable vCAT globally

Add the GUEST_FLAG_VCAT_ENABLED guest flag to enable/disable vCAT per-VM.

  Currently we have the following per-VM clos element in scenario file for RDT use:
    <clos>
      <vcpu_clos>0</vcpu_clos>
      <vcpu_clos>0</vcpu_clos>
    </clos>

  When the GUEST_FLAG_VCAT_ENABLED guest flag is not specified, clos is for RDT use,
  vcpu_clos is per-CPU and it configures each CPU in VMs to a desired CLOS ID.

  When the GUEST_FLAG_VCAT_ENABLED guest flag is specified, vCAT is enabled for this VM,
  clos is for vCAT use, vcpu_clos is not per-CPU anymore in this case, just a list of
  physical CLOSIDs (minimum 2) that are assigned to VMs for vCAT use. Each vcpu_clos
  will be mapped to a virtual CLOSID, the first vcpu_clos is mapped to virtual CLOSID
  0 and the second is mapped to virtual CLOSID 1, etc

Add xs:assert to prevent any problems with invalid configuration data for vCAT:

  If any GUEST_FLAG_VCAT_ENABLED guest flag is specified, both RDT_ENABLED and VCAT_ENABLED
  must be 'y'

  If VCAT_ENABLED is 'y', RDT_ENABLED must be 'y' and CDP_ENABLED must be 'n'

  For a vCAT VM, vcpu_clos cannot be set to CLOSID 0, CLOSID 0 is reserved to be used by hypervisor

  For a vCAT VM, number of clos/vcpu_clos elements must be greater than 1

  For a vCAT VM, each clos/vcpu_clos must be less than L2/L3 COS_MAX

  For a vCAT VM, its clos/vcpu_clos elements cannot contain duplicate values

  There should not be any CLOS IDs overlap between a vCAT VM and any other VMs

Tracked-On: #5917
Signed-off-by: dongshen <dongsheng.x.zhang@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
2021-10-26 11:48:27 +08:00
Yonghua Huang c8e2060d37 hv: unmap IOMMU register pages from service VM EPT
IOMMU hardware resource is owned by hypervisor, while
 IOMMU capability is reported to service VM in its ACPI
 table. In this case, Service VM may access IOMMU hardware
 resource, which is not expected.

 This patch unmaps all Intel IOMMU register pages for service VM EPT.

Tracked-On: #6677
Signed-off-by: Yonghua Huang <yonghua.huang@intel.com>
Reviewed-by: Jason Chen CJ <jason.cj.chen@intel.com>
Reviewed-by: Victor Sun <victor.sun@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
2021-10-22 09:31:10 +08:00
David B. Kinder 2913395123 doc: update uses of VHM in doxygen comments
PR #6283 updated code and docs to the new kernel HSM driver. Fix
some references to VHM missed in the doxygen comments. Also fixed some
misspellings while in these files.

Tracked-On: #6282

Signed-off-by: David B. Kinder <david.b.kinder@intel.com>
2021-10-18 19:09:07 -07:00
Liu,Junming 79a5d7a787 hv: initialize IGD offset 0xfc of CFG space for Service VM
For the IGD device the opregion addr is returned by reading the 0xFC config of
0:02.0 bdf. And the opregion addr is required by GPU driver.
The opregion_addr should be the GPA addr.

When the IGD is assigned to pre-launched VM, the value in 0xFC of igd_vdev is
programmed into with new GPA addr. In such case the prelaunched VM reads
the value from 0xFC of 0:02.0 vdev.

But for the Service VM, the IGD is initialized by using the same policy as other PCI
devices. We only initialize the vdev_head_conf(0x0-0x3F) by checking the
corresponding pbdf. The remaining pci_config_space will be read by
leveraging the corresponding pdev. But as the above code doesn't handle the
scenario for Service VM, it causes that the Service VM fails to
read the 0xFC config_space for IGD vdev.
Then the i915 GPU driver in SOS has some issues because of incorrect 0xFC
pci_conf_space.

This patch initializes offset 0xfc of CFG space of IGD for Service VM,
it is simple and can cover post-launched VM too.

Tracked-On: #6387

Signed-off-by: Liu,Junming <junming.liu@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
2021-10-18 09:11:16 +08:00
Xiangyang Wu dec8d7e22f hv: support at most MAX_VUART_NUM_PER_VM legacy vuarts
In the current hypervisor, only support at most two legacy vuarts
(COM1 and COM2) for a VM, COM1 is usually configured as VM console,
COM2 is configured as communication channel of S5 feature.
Hypervisor can support MAX_VUART_NUM_PER_VM(8) legacy vuart, but only
register handlers for two legacy vuart since the assumption (legacy
vuart is less than 2) is made.
In the current hypervisor configurtion, io port (2F8H) is always
allocated for virtual COM2, it will be not friendly if user wants to
assign this port to physical COM2.
Legacy vuart is common communication channel between service VM and
user VM, it can work in polling mode and its driver exits in each
guest OS. The channel can be used to send shutdown command to user VM
in S5 featuare, so need to config serval vuarts for service VM and one
vuart for each user VM.

The following changes will be made to support at most
MAX_VUART_NUM_PER_VM legacy vuarts:
   - Refine legacy vuarts initialization to register PIO handler for
related vuart.
   - Update assumption of legacy vuart number.

BTW, config tools updates about legacy vuarts will be made in other
patch.

v1-->v2:
	Update commit message to make this patch's purpose clearer;
	If vuart index is valid, register handler for it.

Tracked-On: #6652
Signed-off-by: Xiangyang Wu <xiangyang.wu@intel.com>
Acked-by: Eddie Dong <eddie.dong@Intel.com>
2021-10-15 10:00:02 +08:00
Fei Li 6c5bf4a642 hv: enhance e820_alloc_memory could allocate memory than 4G
Enhance e820_alloc_memory could allocate memory than 4G.

Tracked-On: #5830
Signed-off-by: Fei Li <fei1.li@intel.com>
2021-10-14 15:04:36 +08:00
Fei Li df7ffab441 hv: remove CONFIG_HV_RAM_SIZE
It's difficult to configure CONFIG_HV_RAM_SIZE properly at once. This patch
not only remove CONFIG_HV_RAM_SIZE, but also we use ld linker script to
dynamically get the size of HV RAM size.

Tracked-On: #6663
Signed-off-by: Fei Li <fei1.li@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
2021-10-14 15:04:36 +08:00
Zide Chen e48962faa6 hv: optimize run_vcpu() for nested
This patch implements a separate path for L2 VMEntry in run_vcpu(),
which has several benefits:

- keep run_vcpu() clean, to reduce the number of is_vcpu_in_l2_guest()
  statements:
  - current code has three is_vcpu_in_l2_guest() already.
  - supposed to have another 2 statement so that nested VMEntry won't
    hit the "Starting vCPU" and "vCPU launched" pr_info() and a few
    other statements in the VM launch path.

- save few other things in run_vcpu() that are not needed for nested.

Tracked-On: #6289
Signed-off-by: Zide Chen <zide.chen@intel.com>
Acked-by: Eddie Dong <eddie.dong@Intel.com>
2021-10-13 15:55:31 +08:00
Zide Chen 89bbc44962 hv: inject external interrupts only if LAPIC is not passthru
Tracked-On: #6289
Signed-off-by: Zide Chen <zide.chen@intel.com>
Acked-by: Eddie Dong <eddie.dong@Intel.com>
2021-10-08 09:18:34 +08:00
Zide Chen 228b052fdb hv: operations on vcpu->reg_cached/reg_updated don't need LOCK prefix
In run time, one vCPU won't read or write a register on other vCPUs,
thus we don't need the LOCK prefixed instructions on reg_cached and
reg_updated.

Tracked-On: #6289
Signed-off-by: Zide Chen <zide.chen@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
2021-10-08 09:11:10 +08:00
Zide Chen 2b683f8f5b hv: call vcpu_inject_exception() only when ACRN_REQUEST_EXCP is set
move the bitmap test call out of vcpu_inject_exception(), then we call
the expensive bitmap_test_and_clear_lock() only pending_req_bits is
non-zero and call vcpu_inject_exception() only if needed.

Tracked-On: #6289
Signed-off-by: Zide Chen <zide.chen@intel.com>
Acked-by: Eddie Dong <eddie.dong@Intel.com>
2021-10-07 20:48:43 +08:00
Zide Chen f801ba4ed7 hv: update guest RIP only if vcpu->arch.inst_len is non zero
In very large number of VM extis, the VM-exit instruction length could be
zero, and it's no need to update VMX_GUEST_RIP.

Some examples:

- all external interrupt VM exits in non LAPIC passthru setup.
- for all the nested VM-exits that are reflecting to L1 hypervisor.

Tracked-On: #6289
Signed-off-by: Zide Chen <zide.chen@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
2021-10-07 20:47:07 +08:00
Zide Chen b7e9a68923 hv: code cleanup in run_vcpu()
- wrap a new function exec_vmentry() to reduce code duplication.
- remove exec_vmread(VMX_GUEST_RSP) since ACRN doesn't need to know
  guest RSP in run time.

Tracked-On: #6289
Signed-off-by: Zide Chen <zide.chen@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
2021-10-07 20:47:07 +08:00
Zide Chen ee12daff84 hv: nested: refine vmcs12_read/write_field APIs
Change "uint64_t vmcs_hva" to "void *vmcs_hva" in the input argument,
list, so that no type casting is needed when calling them from pointers.

Tracked-On: #6289
Signed-off-by: Zide Chen <zide.chen@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
2021-10-07 20:45:34 +08:00
Kunhui-Li 2a8c587824 config_tools: update board name in makefile
update board name from nuc7i7dnb to nuc11tnbi5 in makefile because
we have removed the nuc7i7dnb board folder, and also update the
scenario name from industry to shared to fix "make all" build issue.

Tracked-On: #6315
Signed-off-by: Kunhui-Li <kunhuix.li@intel.com>
2021-09-29 16:53:44 +08:00
Liu,Junming 545c006a33 hv: inject #GP if guest tries to reprogram pass-thru dev PIO bar
In current design, when pass-thru dev,
for the PIO bar, need to ensure the guest PIO start address
equals to host PIO start address.

But malicious guest may reprogram the PIO bar,
then hv will pass-thru the reprogramed PIO address to guest.
This isn't safe behavior.
When guest tries to reprogram pass-thru dev PIO bar,
inject #GP to guest directly.

Tracked-On: #6508

Signed-off-by: Liu,Junming <junming.liu@intel.com>
Reviewed-by: Zhao Yakui <yakui.zhao@intel.com>
Reviewed-by: Fei Li <fei1.li@intel.com>
2021-09-28 08:49:01 +08:00
Liu,Junming 4105ca2cb4 hv: deny the launch of VM if pass-thru PIO bar isn't identical mapping
In current design, when pass-thru dev,
for the PIO bar, need to ensure the guest PIO start address
equals to host PIO start address.
Then set the VMCS io bitmap to pass-thru the corresponding
port io to guest for performance.

ACRN-DM and acrn-config should ensure the identical mapping of PIO bar.
If ACRN-DM or acrn-config failed to achieve this,
we should deny the launch of VM

Tracked-On: #6508

Signed-off-by: Liu,Junming <junming.liu@intel.com>
Reviewed-by: Zhao Yakui <yakui.zhao@intel.com>
Reviewed-by: Fei Li <fei1.li@intel.com>
2021-09-28 08:49:01 +08:00
Victor Sun 28824c1e74 HV: init e820 before init paging
In the commit of 4e1deab3d9, we changed the
init sequence that init paging first and then init e820 because we worried
about the efi memory map could be beyond 4GB space on some platform.

After we double checked multiboot2 spec, when system boot from multiboot2
protocol, the efi memory map info will be embedded in multiboot info so it
is guaranteed that the efi memory map must be under 4GB space. Consider that
the page table will be allocated in free memory space in future, we have
to change the init sequence back that init e820 first and then init paging.

If we need to support other boot protocol in future that the efi memory map
might be put beyond 4GB, we could have below options:
	1. Request bootloader put efi memory map below 4GB;
	2. Call EFI_BOOT_SERVICES.GetMemoryMap() before ExitBootServices();
	3. Enable a early 64bit page table to get the efi memory map only;

Tracked-On: #5626

Signed-off-by: Victor Sun <victor.sun@intel.com>
2021-09-27 09:03:15 +08:00
Zide Chen a62dd6ad8a hv: nested: fixed vmxoff_vmexit_handler() issue
In VMXOFF vmexit handler, it's supposed to remove VMCS shadowing.

Tracked-On: #6289
Signed-off-by: Zide Chen <zide.chen@intel.com>
2021-09-26 08:49:35 +08:00
Zide Chen 45b036e028 hv: nested: enable multiple active VMCS12 support
This patch changes the size of vvmcs[] array from 1 to
PER_VCPU_ACTIVE_VVMCS_NUM, and actually enables multiple active VMCS12
support in ACRN.  The basic operations:

- if L1 VMPTRLDs a VMCS12 without previously VMCLEAR the current
  VMCS12, ACRN no longer unconditionally flushes the current VMCS12
  back to L1.  Instead, it tries to keep both the current and the newly
  loaded VMCS12 in the nested->vvmcs[] array, unless:

- if there is no more available vvmcs[] entry, ACRN flushes one active
  VMCS12 to make room for this new VMCS12.

Tracked-On: #6289
Signed-off-by: Zide Chen <zide.chen@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
2021-09-26 08:49:35 +08:00
Mingqiang Chi f39c882359 hv:change log level for check_vmx_ctrl
Some processors don't support VMX_PROCBASED_CTLS_TERTIARY bit
and VMX_PROCBASED_CTLS2_UWAIT_PAUSE bit in MSRs
(IA32_VMX_PROCBASED_CTLS & IA32_VMX_PROCBASED_CTLS2),
HV will output error log which will cause confusion,
change the log level from pr_err to pr_info.

Tracked-On: #6397

Signed-off-by: Mingqiang Chi <mingqiang.chi@intel.com>
2021-09-24 10:17:19 +08:00
Jie Deng 064fd7647f hv: add priority based scheduler
This patch adds a new priority based scheduler to support
vCPU scheduling based on their pre-configured priorities.
A vCPU can be running only if there is no higher priority
vCPU running on the same pCPU.

Tracked-On: #6571
Signed-off-by: Jie Deng <jie.deng@intel.com>
2021-09-24 09:32:18 +08:00
Junjie Mao efcb9e2fdf Makefile: fix wrong reference to board XML and skip binary in diffconfig
The current config.mk uses the variable BOARD_FILE as the path to the board
XML when generating an unmodified copy of configuration files for
comparison, which is incorrect. The right variable is HV_BOARD_XML which is
the path to the copy of board XML that is actually used for the build.

This patch corrects the bug above.

In addition, this patch also skips binary files (which are not meant to be
edited manually) when calculating the differences.

Tracked-On: #6592
Signed-off-by: Junjie Mao <junjie.mao@intel.com>
2021-09-19 20:23:44 +08:00
Fei Li 53fe6d63be hv: vioapic: update remote IRR for lapic-pt
For local APIC passthrough case, EOI would not trigger VM-exit. So virtual
'Remote IRR' would not be updated. Needs to read physical IOxAPIC RTE to
update virtual 'Remote IRR' field each time when guest wants to read I/O
REDIRECTION TABLE REGISTERS

Tracked-On: #5923
Signed-off-by: Fei Li <fei1.li@intel.com>
2021-09-18 09:42:44 +08:00
Zide Chen 94cbe909ee hv: irq: identical vector mapping if LAPIC passthough
In local APIC passthrough case, when devices triggered a INTx interrupt, this
interrupt would be delivered to vCPU directly. For this case, need to set the
virtual vector in
the 'Interrupt Vector' field of physical IOxAPIC I/O REDIRECTION TABLE REGISTER
(bits 7:0) and 'Vector' field of vt-d Interrupt Remapping Table Entry (IRTE)
for Remapped Interrupts.

Assumption:
(a) IOAPIC pins won't be shared between LAPIC PT guest and other guests;
(b) The guest would not trigger this IRQ before it switched to x2 APIC mode.

Tracked-On: #5923
Signed-off-by: Zide Chen <zide.chen@intel.com>
2021-09-18 09:42:44 +08:00
Mingqiang Chi db98f01b6e add vmx capability check
check some essential vmx capablility,
will panic if processor doesn't support it.

Tracked-On: #6584

Signed-off-by: Mingqiang Chi <mingqiang.chi@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
2021-09-18 08:44:30 +08:00
dongshen 08d4517431 hv: fix bugs in RDT's CDP code
In current RDT code, if CDP is configured, L2/L3 resources' num_closids calculation
is wrong:
res_cap_info[res].num_closids = (uint16_t)((edx & 0xffffU) >> 1U) + 1U;

Should be:
res_cap_info[res].num_closids = (uint16_t)((edx & 0xffffU) >> 1U + 1) >> 1U;

Aslo, in order to enable CDP system-wide, need to enable the CDP bit (bit 0) on all pcpus,
not just on pcpu 0.

Tracked-On: #5917
Signed-off-by: dongshen <dongsheng.x.zhang@intel.com>
2021-09-17 16:29:05 +08:00
dongshen f4cdbba0bd hv: some cosmetic fixes to rdt.c/rdt.h
Rename the clos_max field in struct rdt_info to num_closids

Rename variable valid_clos_num to common_num_closids and make it static

Tracked-On: #5917
Signed-off-by: dongshen <dongsheng.x.zhang@intel.com>
2021-09-17 16:29:05 +08:00
Liu Long 2de395b6f6 HV: Normalize hypervisor help output format
Normalize hypervisor help command output format, remove the 10 lines
limit for one screen, fix the misspelled words.

Tracked-On: #5112
Signed-off-by: Liu Long <long.liu@intel.com>
Reviewed-by: VanCutsem, Geoffroy <geoffroy.vancutsem@intel.com>
2021-09-17 11:06:18 +08:00
Zide Chen 0466d7055f hv: nested: move the VMCS12 dirty flags to struct acrn_vvmcs
These dirty flags are supposed to be per VMCS12, so move them from the
per vCPU acrn_nested struct to the newly added acrn_vvmcs struct.

Tracked-On: #6289
Signed-off-by: Zide Chen <zide.chen@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
2021-09-17 10:58:43 +08:00
Zide Chen 4e54c3880b hv: nested: remove vcpu->arch.nested.current_vmcs12_ptr
This variable represents the L1 GPA of the current VMCS12.  But it's
no longer needed in the multiple active VMCS12 case, which uses the
following variables for this purpose.

- nested->current_vvmcs refers to the vvmcs[] entry which contains the
  cached current VMCS12, its associated VMCS02, and other context info.

- nested->current_vvmcs->vmcs12_gpa refers to the L1 GPA of this
  current VMCS12.

Tracked-On: #6289
Signed-off-by: Zide Chen <zide.chen@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
2021-09-17 10:58:43 +08:00
Zide Chen 799a4d332a hv: nested: initial implementation of struct acrn_vvmcs
Add an array of struct acrn_vvmcs to struct acrn_nested, so it is
possible to cache multiple active VMCS12s.

This patch declares the size of this array to 1, meaning that there is
only one active VMCS12.  This is to minimize the logical code changes.

Add pointer current_vvmcs to struct acrn_nested, which refers to the
current vvmcs[] entry.  In this patch, if any VMCS12 is active, it
always points to vvmcs[0].

Tracked-On: #6289
Signed-off-by: Zide Chen <zide.chen@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
2021-09-17 10:58:43 +08:00
Zide Chen cf697e753d hv: nested: some API signature changes
No any logical changes, this patch is preparing for multiple active
VMCS12 support.

- currently it's easy to get the vmcs12 pointer from the vcpu pointer.
  In multiple active vmcs12 case, we need to explicitly add "struct
  acrn_vmcs12 *vmcs12" to certain APIs' input argument list, in order to
  get the desired vmcs12 pointer.

- merge flush_current_vmcs12() into clear_vmcs02() for multiple reasons:
  a) it's called only once; b) we don't wrap the opposite operation
  (loading vmcs12) in an API; c) this API has simple and clear logic.

Tracked-On: #6289
Signed-off-by: Zide Chen <zide.chen@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
2021-09-17 10:58:43 +08:00
Zide Chen e9eb72d319 hv: nested: flush L2 VPID only when it could conflict with L1 VPIDs
By changing the way to assign L1 VPID from bottom-up to top-down,
the possibilities for VPID conflicts between L1 and L2 guests are
small.

Then we can flush VPID just in case of conflicting.

Tracked-On: #6289
Signed-off-by: Anthony Xu <anthony.xu@intel.com>
Signed-off-by: Zide Chen <zide.chen@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
2021-09-16 09:26:10 +08:00
Fei Li 0a515ab2ea hv: pci: fix a minor bug about is_pci_cfg_multifunction
Before checking whether a PCI device is a Multi-Function Device or not, we need
make sure this PCI device is a valid PCI device. For a valid PCI device, the
'Header Layout' field in Header Type Register must be 000 0000b (Type 0 PCI device)
or 000 0001b (Type 1 PCI device).

So for a valid PCI device, the Header Type can't be 0xff.

Tracked-On: #4134
Signed-off-by: Fei Li <fei1.li@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
2021-09-15 13:24:18 +08:00
Zide Chen 1ab65825ba hv: nested: merge gpa_field_dirty and control_field_dirty flag
In run time, it's rare for L1 to write to the intercepted non host-state
VMCS fields, and using multiple dirty flags is not necessary.

This patch uses one single dirty flag to manage all non host-state VMCS
fields.  This helps to simplify current code and in the future we may
not need to declare new dirty flags when we intercept more VMCS fields.

Tracked-On: #5923
Signed-off-by: Zide Chen <zide.chen@intel.com>
2021-09-13 15:50:01 +08:00
Zide Chen 6376d5a0d3 hv: nested: fix bug in syncing EPTP from VMCS12 to VMCS02
Currently vmptrld_vmexit_handler() doesn't sync VMX_EPT_POINTER_FULL
from vmcs12 to vmcs02, instead it sets gpa_field_dirty and relies on
nested_vmentry() to sync EPTP in next nested VMentry.

This creates readability issue since all other intercepted VMCS fields
are synced in sync_vmcs12_to_vmcs02().  Another issue is that other
VMCS fields managed by gpa_field_dirty are repeatedly synced in both
vmptrld and nested vmentry handler.

This patch moves get_nept_desc() ahead of sync_vmcs12_to_vmcs02(), such
that shadow_eptp is allocated before sync_vmcs12_to_vmcs02() which
can sync EPTP properly.

BTW, in nested_vmexit_handler(), don't need to read from VMCS to get
the exit reason, since vcpu->arch.exit_reason has it already.

Tracked-On: #5923
Signed-off-by: Zide Chen <zide.chen@intel.com>
2021-09-13 15:50:01 +08:00
Geoffroy Van Cutsem 01bf5110c5 Makefile: add missing deps in top-level and hypervisor Makefile
Add a couple of missing dependencies in the ACRN Makefiles:
1. 'acrn.bin' is required before the hypervisor can be installed
2. The 'acrn_mngr.h' needs to be installed ('tools-install') in
the build folder.

Tracked-On: #6360
Signed-off-by: Geoffroy Van Cutsem <geoffroy.vancutsem@intel.com>
2021-09-13 11:28:14 +08:00
Junjie Mao 2bfaa34cf2 config_tools: populate default values in scenario XML
While we have default values of configuration entries stated in the schema
of scenario XMLs, today we still require user-given scenario XMLs to
contain literally ALL XML nodes. Missing of a single node will cause schema
validation errors even though we can use its default value defined in the
schema.

This patch allows user-given scenario XMLs to ignore nodes with default
values. It is done by adding the missing nodes, all containing the defined
default values, to the input scenario XML when copying it to the build
directory. This approach imposes no changes to either the schema or
subsequent scripts in the build system.

Tracked-On: #6292
Signed-off-by: Junjie Mao <junjie.mao@intel.com>
2021-09-13 09:05:52 +08:00
Yifan Liu 0a1ad45b32 hv: Avoid using SMBIOS major version
Previously it is (falsely) assumed that the major_ver of 32-bit SMBIOS
entry point structure (which is called SMBIOS 2.1 in spec, or SMBIOS2 in code)
will have a value of 2 and major_ver of 64-bit SMBIOS (which is called SMBIOS
3.0 in spec, and SMBIOS3 in code) will have a value of 3. This turned out to be
wrong. This major_ver refers to the implemented doc revision, and 32-bit SMBIOS2
can have its major_ver to be 3 (current most recent implementation).

This patch removes the use of major_ver to distinguish between
SMBIOS2/3, and use a doc-defined anchor string instead.

Tracked-On: #6528
Signed-off-by: Yifan Liu <yifan1.liu@intel.com>
2021-09-08 15:22:12 +08:00
Zide Chen 11c2f3eabb hv: check bitmap before calling bitmap_test_and_clear_lock()
The locked btr instruction is expensive.  This patch changes the
logic to ensure that the bitmap is non-zero before executing
bitmap_test_and_clear_lock().

The VMX transition time gets significant improvement.  SOS running
on TGL, the CPUID roundtrip reduces from ~2400 cycles to ~2000 cycles.

Tracked-On: #6289
Signed-off-by: Zide Chen <zide.chen@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
2021-09-02 16:09:33 +08:00
Zide Chen 7cde4a8d40 hv: initialize host IA32_PAT MSR
Currently ACRN assumes firmware setup IA32_PAT correctly.  This patch
explicitly initializes host IA32_PAT MSR according to ISDM Table 11-12.
Memory Type Setting of PAT Entries Following a Power-up or Reset.

ACRN creates host page tables based on PAT0 (WB) and PAT3 (UC).

Tracked-On: #6289
Signed-off-by: Zide Chen <zide.chen@intel.com>
2021-09-02 09:15:39 +08:00
Zide Chen aeb3690b6f hv: simplify is_lapic_pt_enabled()
is_lapic_pt_enabled() is called at least twice in one loop of the vCPU
thread, and it's called in vmexit_handler() frequently if LAPIC is not
pass-through.  Thus the efficiency of this function has direct
impact to the system performance.

Since the LAPIC mode is not changed in run time, we don't have to
calculate it on the fly in is_lapic_pt_enabled().

BTW, removed the unused lapic_mask from struct acrn_vcpu_arch.

Tracked-On: #6289
Signed-off-by: Zide Chen <zide.chen@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
2021-08-26 09:52:10 +08:00
Shiqing Gao d90dbc0d91 hv: check the capability of XSAVES/XRSTORS instructions before execution
For platforms that do not support XSAVES/XRSTORS instructions, like QEMU,
executing these instructions causes #UD.
This patch adds the check before the execution of XSAVES/XRSTORS instructions.

It also refines the logic inside rstore_xsave_area for the following reason:
If XSAVES/XRSTORS instructions are supported, restore XSAVE area if any of the
following conditions is met:
 1. "vcpu->launched" is false (state initialization for guest)
 2. "vcpu->arch.xsave_enabled" is true (state restoring for guest)

 * Before vCPU is launched, condition 1 is satisfied.
 * After vCPU is launched, condition 2 is satisfied because
   is_valid_xsave_combination() guarantees that "vcpu->arch.xsave_enabled"
   is consistent with pcpu_has_cap(X86_FEATURE_XSAVES).
Therefore, the check against "vcpu->launched" and "vcpu->arch.xsave_enabled"
can be eliminated here.

Tracked-On: #6481

Signed-off-by: Shiqing Gao <shiqing.gao@intel.com>
Acked-by: Eddie Dong <eddie.dong@Intel.com>
2021-08-26 09:42:23 +08:00
Zide Chen cbf3825140 hv: Pass-through IA32_TSC_AUX MSR to L1 guest
Use an unused MSR on host to save ACRN pcpu ID and avoid saving and
restoring TSC AUX MSR on VMX transitions.

Tracked-On: #6289
Signed-off-by: Sainath Grandhi <sainath.grandhi@intel.com>
Signed-off-by: Zide Chen <zide.chen@intel.com>
Reviewed-by: Eddie Dong <eddie.dong@intel.com>
2021-08-26 09:25:54 +08:00
Yifan Liu d33c76f701 hv: quirks: SMBIOS passthrough for prelaunched-VM
This feature is guarded under config CONFIG_SECURITY_VM_FIXUP, which
by default should be disabled.

This patch passthrough native SMBIOS information to prelaunched VM.
SMBIOS table contains a small entry point structure and a table, of which
the entry point structure will be put in 0xf0000-0xfffff region in guest
address space, and the table will be put in the ACPI_NVS region in guest
address space.

v2 -> v3:
uuid_is_equal moved to util.h as inline API
result -> pVendortable, in function efi_search_guid
recalc_checksum -> generate_checksum
efi_search_smbios -> efi_search_smbios_eps
scan_smbios_eps -> mem_search_smbios_eps
EFI GUID definition kept

Tracked-On: #6320
Signed-off-by: Yifan Liu <yifan1.liu@intel.com>
2021-08-26 09:24:50 +08:00
Yifan Liu 975ff33e01 hv: Move uuid_is_equal to util.h
This patch moves uuid_is_equal from vm_config.c to util.h as inline API.

Tracked-On: #6320
Signed-off-by: Yifan Liu <yifan1.liu@intel.com>
2021-08-26 09:24:50 +08:00
Yifan Liu 32d6ead8de hv && config-tool: Rename GUEST_FLAG_TPM2_FIXUP
This patch renames the GUEST_FLAG_TPM2_FIXUP to
GUEST_FLAG_SECURITY_VM.

v2 -> v3:
The "FIXUP" suffix is removed.

Tracked-On: #6320
Signed-off-by: Yifan Liu <yifan1.liu@intel.com>
2021-08-26 09:24:50 +08:00
Liu Long 31598ae895 ACRN:hv: Fix vcpu_dumpreg command hang issue
In ACRN RT VM if the lapic is passthrough to the guest, the ipi can't
trigger VM_EXIT and the vNMI is just for notification, it can't handle
the smp_call function. Modify vcpu_dumpreg function prompt user switch
to vLAPIC mode for vCPU register dump.

Tracked-On: #6473
Signed-off-by: Liu Long <long.liu@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
2021-08-25 08:54:27 +08:00
Zide Chen 0980420aea hv: minor cleanup of hv_main.c
- remove vcpu->arch.nrexits which is useless.
- record full 32 bits of exit_reason to TRACE_2L(). Make the code simpler.

Tracked-On: #6289
Signed-off-by: Zide Chen <zide.chen@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
2021-08-25 08:49:54 +08:00
Jian Jun Chen 8de39f7b61 hv: GSI of hcall_set_irqline should be checked against target_vm
GSI of hcall_set_irqline should be checked against target_vm's
total GSI count instead of SOS's total GSI count.

Tracked-On: #6357
Signed-off-by: Jian Jun Chen <jian.jun.chen@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
2021-08-25 08:48:47 +08:00
Zide Chen 6d7eb6d7b6 hv: emulate IA32_EFER and adjust Load EFER VMX controls
This helps to improve performance:

- Don't need to execute VMREAD in vcpu_get_efer(), which is frequently
  called.

- VMX_EXIT_CTLS_SAVE_EFER can be removed from VM-Exit Controls.

- If the value of IA32_EFER MSR is identical between the host and guest
  (highly likely), adjust the VMX controls not to load IA32_EFER on
  VMExit and VMEntry.

It's convenient to continue use the exiting vcpu_s/get_efer() APIs,
other than the common vcpu_s/get_guest_msr().

Tracked-On: #6289
Signed-off-by: Sainath Grandhi <sainath.grandhi@intel.com>
Signed-off-by: Zide Chen <zide.chen@intel.com>
2021-08-24 11:16:53 +08:00
Liang Yi 499f62e8bd hv: use per platform maximum physical address width
MAXIMUM_PA_WIDTH will be calculated from board information.

Tracked-On: #6357
Signed-off-by: Liang Yi <yi.liang@intel.com>
Signed-off-by: Junjie Mao <junjie.mao@intel.com>
2021-08-20 11:02:21 +08:00
Liang Yi 2b3620de7d hv: mask off LA57 in cpuid
Mask off support of 57-bit linear addresses and five-level paging.

ICX-D has LA57 but ACRN doesn't support 5-level paging yet.

Tracked-On: #6357
Signed-off-by: Liang Yi <yi.liang@intel.com>
Signed-off-by: Li, Fei1 <fei1.li@intel.com>
2021-08-20 11:02:21 +08:00
Shiqing Gao 91777a83b5 config_tools: add a new entry MAX_EFI_MMAP_ENTRIES
It is used to specify the maximum number of EFI memmap entries.

On some platforms, like Tiger Lake, the number of EFI memmap entries
becomes 268 when the BIOS settings are changed.
The current value of MAX_EFI_MMAP_ENTRIES (256) defined in hypervisor
is not big enough to cover such cases.

As the number of EFI memmap entries depends on the platforms and the
BIOS settings, this patch introduces a new entry MAX_EFI_MMAP_ENTRIES
in configurations so that it can be adjusted for different cases.

Tracked-On: #6442

Signed-off-by: Shiqing Gao <shiqing.gao@intel.com>
2021-08-20 09:50:39 +08:00
Shiqing Gao 651d44432c hv: initialize the XSAVE related processor state for guest
If SOS is using kernel 5.4, hypervisor got panic with #GP.

Here is an example on KBL showing how the panic occurs when kernel 5.4 is used:
Notes:
 * Physical MSR_IA32_XSS[bit 8] is 1 when physical CPU boots up.
 * vcpu_get_guest_msr(vcpu, MSR_IA32_XSS)[bit 8] is initialized to 0.

Following thread switches would happen at run time:
1. idle thread -> vcpu thread
   context_switch_in happens and rstore_xsave_area is called.
   At this moment, vcpu->arch.xsave_enabled is false as vcpu is not launched yet
   and init_vmcs is not called yet (where xsave_enabled is set to true).
   Thus, physical MSR_IA32_XSS is not updated with the value of guest MSR_IA32_XSS.

   States at this point:
    * Physical MSR_IA32_XSS[bit 8] is 1.
    * vcpu_get_guest_msr(vcpu, MSR_IA32_XSS)[bit 8] is 0.

2. vcpu thread -> idle thread
   context_switch_out happens and save_xsave_area is called.
   At this moment, vcpu->arch.xsave_enabled is true. Processor state is saved
   to memory with XSAVES instruction. As physical MSR_IA32_XSS[bit 8] is 1,
   ectx->xs_area.xsave_hdr.hdr.xcomp_bv[bit 8] is set to 1 after the execution
   of XSAVES instruction.

   States at this point:
    * Physical MSR_IA32_XSS[bit 8] is 1.
    * vcpu_get_guest_msr(vcpu, MSR_IA32_XSS)[bit 8] is 0.
    * ectx->xs_area.xsave_hdr.hdr.xcomp_bv[bit 8] is 1.

3. idle thread -> vcpu thread
   context_switch_in happens and rstore_xsave_area is called.
   At this moment, vcpu->arch.xsave_enabled is true. Physical MSR_IA32_XSS is
   updated with the value of guest MSR_IA32_XSS, which is 0.

   States at this point:
    * Physical MSR_IA32_XSS[bit 8] is 0.
    * vcpu_get_guest_msr(vcpu, MSR_IA32_XSS)[bit 8] is 0.
    * ectx->xs_area.xsave_hdr.hdr.xcomp_bv[bit 8] is 1.

   Processor state is restored from memory with XRSTORS instruction afterwards.
   According to SDM Vol1 13.12 OPERATION OF XRSTORS, a #GP occurs if XCOMP_BV
   sets a bit in the range 62:0 that is not set in XCR0 | IA32_XSS.
   So, #GP occurs once XRSTORS instruction is executed.

Such issue does not happen with kernel 5.10. Because kernel 5.10 writes to
MSR_IA32_XSS during initialization, while kernel 5.4 does not do such write.
Once guest writes to MSR_IA32_XSS, it would be trapped to hypervisor, then,
physical MSR_IA32_XSS and the value of MSR_IA32_XSS in vcpu->arch.guest_msrs
are updated with the value specified by guest. So, in the point 2 above,
correct processor state is saved. And #GP would not happen in the point 3.

This patch initializes the XSAVE related processor state for guest.
If vcpu is not launched yet, the processor state is initialized according to
the initial value of vcpu_get_guest_msr(vcpu, MSR_IA32_XSS), ectx->xcr0,
and ectx->xs_area. With this approach, the physical processor state is
consistent with the one presented to guest.

Tracked-On: #6434

Signed-off-by: Shiqing Gao <shiqing.gao@intel.com>
Reviewed-by: Li Fei1 <fei1.li@intel.com>
2021-08-20 09:46:09 +08:00
Zide Chen 2e6cf2b85b hv: nested: fix bugs in init_vmx_msrs()
Currently init_vmx_msrs() emulates same value for the IA32_VMX_xxx_CTLS
and IA32_VMX_TRUE_xxx_CTLS MSRs.

But the value of physical MSRs could be different between the pair,
and we need to adjust the emulated value accordingly.

Tracked-On: #6289
Signed-off-by: Zide Chen <zide.chen@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
2021-08-20 09:40:50 +08:00
Zide Chen ad37553873 hv: nested: redundant permission check on nested_vmentry()
check_vmx_permission() is called in vmresume_vmexit_handler() and
vmlaunch_vmexit_handler() already.

Tracked-On: #6289
Signed-off-by: Zide Chen <zide.chen@intel.com>
2021-08-20 08:14:40 +08:00
Yifan Liu d575edf79a hv: Change sched_event structure to resolve data race in event handling
Currently the sched event handling may encounter data race problem, and
as a result some vcpus might be stalled forever.

One example can be wbinvd handling where more than 1 vcpus are doing
wbinvd concurrently. The following is a possible execution of 3 vcpus:
-------
0                            1                           2
                             req [Note: 0]
                             req bit0 set [Note: 1]
                             IPI -> 0
                             req bit2 set
                             IPI -> 2
                                                         VMExit
                                                         req bit2 cleared
                                                         wait
                                                         vcpu2 descheduled

VMExit
req bit0 cleared
wait
vcpu0 descheduled
                             signal 0
                             event0->set=true
                             wake 0

                             signal 2
                             event2->set=true [Note: 3]
                             wake 2
                                                         vcpu2 scheduled
                                                         event2->set=false
                                                         resume

                                                         req
                                                         req bit0 set
                                                         IPI -> 0
                                                         req bit1 set
                                                         IPI -> 1
                             (doesn't matter)
vcpu0 scheduled [Note: 4]
                                                         signal 0
                                                         event0->set=true
                                                         (no wake) [Note: 2]
event0->set=false                                        (the rest doesn't matter)
resume

Any VMExit
req bit0 cleared
wait
idle running

(blocked forever)

Notes:
0: req: vcpu_make_request(vcpu, ACRN_REQUEST_WAIT_WBINVD).
1: req bit: Bit in pending_req_bits. Bit0 stands for bit for vcpu0.
2: In function signal_event, At this time the event->waiting_thread
    is not NULL, so wake_thread will not execute
3: eventX: struct sched_event of vcpuX.
4: In function wait_event, the lock does not strictly cover the execution between
    schedule() and event->set=false, so other threads may kick in.
-----

As shown in above example, before the last random VMExit, vcpu0 ended up
with request bit set but event->set==false, so blocked forever.

This patch proposes to change event->set from a boolean variable to an
integer. The semantic is very similar to a semaphore. The wait_event
will add 1 to this value, and block when this value is > 0, whereas signal_event
will decrease this value by 1.

It may happen that this value was decreased to a negative number but that
is OK. As long as the wait_event and signal_event are paired and
program order is observed (that is, wait_event always happens-before signal_event
on a single vcpu), this value will eventually be 0.

Tracked-On: #6405
Signed-off-by: Yifan Liu <yifan1.liu@intel.com>
2021-08-20 08:11:40 +08:00
Zhou, Wu b394777908 HV: Add implements of 32bit and 64bit elf loader
This is a simply implement for the 32bit and 64bit elf loader.

The loading function first reads the image header, and finds the program
entries that are marked as PT_LOAD, then loads segments from elf file to
guest ram. After that, it finds the bss section in the elf section entries, and
clear the ram area it points to.

Limitations:
1. The e_type of the elf image must be ET_EXEC(executable). Relocatable or
   dynamic code is not supported.
2. The loader only copies program segments that has a p_type of
   PT_LOAD(loadable segment). Other segments are ignored.
3. The loader doesn’t support Sections that are relocatable
   (sh_type is SHT_REL or SHT_RELA)
4. The 64bit elf’s entry address must below 4G.
5. The elf is assumed to be able to put segments to valid guest memory.

Tracked-On: #6323

Signed-off-by: Zhou, Wu <wu.zhou@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
2021-08-19 20:00:45 +08:00
Zhou, Wu c2468d2791 HV: Add elf loader sketch
This patch adds a function elf_loader() to load elf image.
It checks the elf header, get its 32/64 bit type, then calls
the corresponding loading routines, which are empty, and
will be realized later.

Tracked-On: #6323

Signed-off-by: Zhou, Wu <wu.zhou@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
2021-08-19 20:00:45 +08:00
Zhou, Wu 537f69dde9 HV: Add elf header file for elf loader
Source: https://github.com/freebsd/freebsd-src/blob/main/sys/sys/elf_common.h
Trimed to meet the minimal requirements for the Zephyr elf file to be loaded
Also added elf file header data struct and program/section entry data structs.

Tracked-On: #6323

Signed-off-by: Zhou, Wu <wu.zhou@intel.com>
Reviewed-by: Victor Sun <victor.sun@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
2021-08-19 20:00:45 +08:00
Zhou, Wu 8100b1dd56 HV: Remove 'vm_' of vm_elf_loader and etc.
In order to make better sense, vm_elf_loader, vm_bzimage_loader and
vm_rawimage_loader are changed to elf_loaer, bzimage_loaer and
rawimage_loader.

Tracked-On: #6323

Signed-off-by: Zhou, Wu <wu.zhou@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
2021-08-19 20:00:45 +08:00
Zhou, Wu 53f6720d13 HV: Combine the acpi loading fucntion to one place
Remove the acpi loading function from elf_loader, rawimage_loaer and
bzimage_loader, and call it together in vm_sw_loader.

Now the vm_sw_loader's job is not just loading sw, so we rename it to
prepare_os_image.

Tracked-On: #6323

Signed-off-by: Zhou, Wu <wu.zhou@intel.com>
Reviewed-by: Victor Sun <victor.sun@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
2021-08-19 20:00:45 +08:00
Zhou, Wu e78aacbe55 HV: Correct some naming issues
For the guest OS loaders, prapare_loading_xxx are not accurate for
what those functions actually do. Now they are changed to load_xxx:
load_rawimage, load_bzimage.

And the 'bsp' expression is confusing in the comments for
init_vcpu_protect_mode_regs, changed to a better way.

Tracked-On: #6323

Signed-off-by: Zhou, Wu <wu.zhou@intel.com>
Reviewed-by: Victor Sun <victor.sun@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
2021-08-19 20:00:45 +08:00
Victor Sun 3124018917 HV: vm_load: rename vboot_info.h to vboot.h
vboot_info.h declares vm loader function also, so rename the file name to
vboot.h;

Tracked-On: #6323

Signed-off-by: Victor Sun <victor.sun@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
2021-08-19 20:00:45 +08:00
Victor Sun 9b632c0e4b HV: vm_load: split vm_load.c to support diff kernel format
The patch splits the vm_load.c to three parts, the loader function of bzImage
kernel is moved to bzimage_loader.c, the loader function of raw image kernel
is moved to rawimage_loader.c, the stub is still stayed in vm_load.c to load
the corresponding kernel loader function. Each loader function could be
isolated by CONFIG_GUEST_KERNEL_XXX macro which generated by config tool.

Tracked-On: #6323

Signed-off-by: Victor Sun <victor.sun@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
2021-08-19 20:00:45 +08:00
Victor Sun 2524572fb2 HV: vm_load: refine vm_sw_loader API
Change if condition to switch in vm_sw_loader() so that the sw loader
could be compiled conditionally.

Tracked-On: #6323

Signed-off-by: Victor Sun <victor.sun@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
2021-08-19 20:00:45 +08:00
Yang,Yu-chu 73dc610d90 config-tool: refine guest kernel types
Rename KERNEL_ZEPHYR to KERNEL_RAWIMAGE. Added new type "KERNEL_ELF".

Add CONFIG_GUEST_KERNEL_RAWIMAGE, CONFIG_GUEST_KERNEL_ELF and/or
CONFIG_GUEST_KERNEL_BZIMAGE to config.h if it's configured.

Tracked-On: #6323

Signed-off-by: Yang,Yu-chu <yu-chu.yang@intel.com>
Reviewed-by: Victor Sun <victor.sun@intel.com>
2021-08-19 20:00:45 +08:00
Victor Sun 178b3e85e3 HV: vm_load: change kernel type for zephyr image
Previously we only support loading raw format of zephyr image as prelaunched
Zephyr VM, this would cause guest F segment overridden issue because the zephyr
raw image covers memory space from 0x1000 to 0x100000 upper. To fix this issue,
we should support ELF format image loading so that parse and load the multiple
segments from ELF image directly.

Tracked-On: #6323

Signed-off-by: Victor Sun <victor.sun@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
2021-08-19 20:00:45 +08:00
Fei Li 2e7491a8ec hv: mmiodev: a minor bug fix about refine acrn_mmiodev data structure
Rename base_hpa to host_pa in acrn_mmiodev data structure.

Tracked-On: #6366
Signed-off-by: Fei Li <fei1.li@intel.com>
2021-08-19 12:01:35 +08:00
Liu,Junming 2c5c8754de hv:enable GVT-d for pre-launched linux guest in logical partion mode
When pass-thru GPU to pre-launched Linux guest,
need to pass GPU OpRegion to the guest.
Here's the detailed steps:
1. reserve a memory region in ve820 table for GPU OpRegion
2. build EPT mapping for GPU OpRegion to pass-thru OpRegion to guest
3. emulate the pci config register for OpRegion
For the third step, here's detailed description:
The address of OpRegion locates on PCI config space offset 0xFC,
Normal Linux guest won't write this register,
so we can regard this register as read-only.
When guest reads this register, return the emulated value.
When guest writes this register, ignore the operation.

Tracked-On: #6387

Signed-off-by: Liu,Junming <junming.liu@intel.com>
2021-08-19 11:56:26 +08:00
Jian Jun Chen dc77ef9e52 hv: ivshmem: map SHM BAR with PAT ignored
ACRN does not support the variable range vMTRR. The default
memory type of vMTRR is UC. With this vMTRR emulation guest VM
such as Linux refuses to map the MMIO address space as WB. In
order to get better performance SHM BAR of ivshmem is mapped
with PAT ignored and memory type of SHM BAR is fixed to WB.

Tracked-On: #6389
Signed-off-by: Jian Jun Chen <jian.jun.chen@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
2021-08-13 11:17:15 +08:00
Yang,Yu-chu d997f4bbc1 config-tools: refine bin_gen.py and create virtual TPM2 acpi table
Create virtual acpi table of tpm2 based on the raw data if the TPM2
device is presented and the passthrough tpm2 is enabled.

Refine the arguments of bin_gen.py. The --board and --scenario take the
path to the XMLs as the argument. The allocation.xml is needed for
bin_gen.py to generate tpm2 acpi table.

Refine the condition of tpm2_acpi_gen. The tpm2 device "MSFT0101" can be
present in device id or compatible_id(CID). Check both attributes and
child node of tpm2 device.

Tracked-On: #6320
Signed-off-by: Yang,Yu-chu <yu-chu.yang@intel.com>
2021-08-11 14:45:55 +08:00
Fei Li a705ff2dac hv: relocate ACPI DATA address to 0x7fe00000
Relocate ACPI address to 0x7fe00000 and ACPI NVS to 0x7ff00000 correspondingly.
In this case, we could include TPM event log region [0x7ffb0000, 0x80000000)
into ACPI NVS.

Tracked-On: #6320
Signed-off-by: Fei Li <fei1.li@intel.com>
2021-08-11 14:45:55 +08:00
Fei Li 74e68e39d1 hv: tpm2: do tpm2 fixup for security vm
ACRN used to prepare the vTPM2 ACPI Table for pre-launched VM at the build stage
using config tools. This is OK if the TPM2 ACPI Table never changes. However,
TPM2 ACPI Table may be changed in some conditions: change BIOS configuration or
update BIOS.

This patch do TPM2 fixup to update the vTPM2 ACPI Table and TPM2 MMIO resource
configuration according to the physical TPM2 ACPI Table.

Tracked-On: #6366
Signed-off-by: Tao Yuhong <yuhong.tao@intel.com>
Signed-off-by: Fei Li <fei1.li@intel.com>
2021-08-11 14:45:55 +08:00
Fei Li f81b39225c HV: refine acrn_mmiodev data structure
1. add a name field to indicate what the MMIO Device is.
2. add two more MMIO resource to the acrn_mmiodev data structure.

Tracked-On: #6366
Signed-off-by: Tao Yuhong <yuhong.tao@intel.com>
Signed-off-by: Fei Li <fei1.li@intel.com>
2021-08-11 14:45:55 +08:00
Fei Li 20061b7c39 hv: remove xsave dependence
ACRN could run without XSAVE Capability. So remove XSAVE dependence to support
more (hardware or virtual) platforms.

Tracked-On: #6287
Signed-off-by: Fei Li <fei1.li@intel.com>
2021-08-10 16:36:15 +08:00
Fei Li 84235bf07c hv: vtd: a minor refine about dmar_wait_completion
Check whether condition is met before check whether time is out after iommu_read32.
This is because iommu_read32 would cause time out on some virtual platform in
spite of the current DMAR status meets the pre_condition.

Tracked-On: #6371
Signed-off-by: Fei Li <fei1.li@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
2021-08-10 16:36:15 +08:00
Tao Yuhong 171856c46b hv: uc-lock: Fix do not trap #GP
If HV enable trigger #GP for uc-lock, and is about to emulate guest uc-lock
instructions, should trap guest #GP. Guest uc-lock instrucction trigger #GP,
cause vmexit for #GP, HV handle this vmexit and emulate uc-lock
instruction.

Tracked-On: #6299
Signed-off-by: Tao Yuhong <yuhong.tao@intel.com>
2021-08-09 15:33:12 +08:00
liu hang1 d07bd78b13 Makefile:add targz-pkg entry in Makefile
User could use make targz-pkg command to generate tar package in
build directory,which could help user simplify the process
of installing acrn hypervisor in target board. user need to copy the
tarball package to target board,and extract it to "/" directory.

Tracked-On: #6355
Signed-off-by: liu hang1 <hang1.liu@intel.com>
Reviewed-by: VanCutsem, Geoffroy <geoffroy.vancutsem@intel.com>
Acked-by: Wang, Yu1 <yu1.wang@intel.com>
2021-08-09 11:52:27 +08:00
Kunhui-Li 578c18b962 config_tools: remove obsolete kconfig files
Remove obsolete Kconfig files;
Update Kconfig related README and error message.

Tracked-On: #6315
Signed-off-by: Kunhui-Li <kunhuix.li@intel.com>
2021-08-09 09:25:02 +08:00
Victor Sun 4a53a23faa HV: debug: support 64bit BAR pci uart with 32bit space
Currently the HV console does not support PCI UART with 64bit BAR, but in the
case that the BAR is in 64bit and the BAR space is below 4GB (i.e. the high
32bit address of the 64bit BAR is zero), HV should be able to support it.

Tracked-On: #6334

Signed-off-by: Victor Sun <victor.sun@intel.com>
2021-08-04 10:10:35 +08:00
Victor Sun 2fbc4c26e6 HV: vm_load: remove kernel_load_addr in sw_kernel_info struct
When guest kernel has multiple loading segments like ELF format image, just
define one load address in sw_kernel_info struct is meaningless.

The patch removes kernel_load_addr member in struct sw_kernel_info, the load
address should be parsed in each specified format image processing.

Tracked-On: #6323

Signed-off-by: Victor Sun <victor.sun@intel.com>
Reviewed-by: Jason Chen CJ <jason.cj.chen@intel.com>
2021-08-03 13:44:51 +08:00
Victor Sun d1d59437ea HV: vm_load: correct needed size of bzImage kernel
The previous code did not load bzImage start from protected mode part, result
in the protected mode part un-align with kernel_alignment field and then cause
kernel decompression start from a later aligned address. In this case we had
to enlarge the needed size of bzImage kernel to kernel_init_size plus double
size of kernel_alignment.

With loading issue of bzImage protected mode part fixed, the kernel needed size
is corrected in this patch.

Tracked-On: #6323

Signed-off-by: Victor Sun <victor.sun@intel.com>
Reviewed-by: Jason Chen CJ <jason.cj.chen@intel.com>
2021-08-03 13:44:51 +08:00
Victor Sun 2b5bd2e87a HV: vm_load: load protected mode code only for bzImage
When LaaG boots with bzImage module file, only protected mode code need
to be loaded to guest space since the VM will boot from protected mode
directly. Futhermore, per Linux boot protocol the protected mode code
better to be aligned with kernel_alignment field in zeropage, otherwise
kernel will take time to do "rep movs" to the aligned address.

In previous code, the bzImage is loaded to the address where aligned with
kernel_alignment, this would make the protected mode code unalign with
kernel_alignment. If the kernel is configured with CONFIG_RELOCATABLE=n,
the guest would not boot. This patch fixed this issue.

Tracked-On: #6323

Signed-off-by: Victor Sun <victor.sun@intel.com>
Reviewed-by: Jason Chen CJ <jason.cj.chen@intel.com>
2021-08-03 13:44:51 +08:00
Victor Sun 9caff7360f HV: vm_load: set kernel load addr in vm_load.c
This patch moves get_bzimage_kernel_load_addr() from init_vm_sw_load() to
vm_sw_loader() stage so will set kernel load address of bzImage type kernel
in vm_bzimage_loader() in vm_load.c.

Tracked-On: #6323

Signed-off-by: Victor Sun <victor.sun@intel.com>
Reviewed-by: Jason Chen CJ <jason.cj.chen@intel.com>
2021-08-03 13:44:51 +08:00
Victor Sun e40a258102 HV: vm_load: set ramdisk load addr in vm_load.c
This patch moves get_initrd_load_addr() API from init_vm_sw_load() to
vm_sw_loader() stage. The patch assumes that the kernel image have been
loaded to guest space already.

Tracked-On: #6323

Signed-off-by: Victor Sun <victor.sun@intel.com>
Reviewed-by: Jason Chen CJ <jason.cj.chen@intel.com>
2021-08-03 13:44:51 +08:00
Victor Sun afe24731a7 HV: vm_load: remove load_sw_modules() api
In load_sw_modules() implementation, we always assuming the guest kernel
module has one load address and then the whole kernel image would be loaded
to guest space from its load address. This is not true when guest kernel
has multiple load addresses like ELF format kernel image.

This patch removes load_sw_modules() API, and the loading method of each
format of kernel image could be specified in prepare_loading_xxximage() API.

Tracked-On: #6323

Signed-off-by: Victor Sun <victor.sun@intel.com>
Reviewed-by: Jason Chen CJ <jason.cj.chen@intel.com>
2021-08-03 13:44:51 +08:00
Victor Sun 2938eca363 HV: vm_load: refine api of get_bzimage_kernel_load_addr()
As the previous commit said the kernel load address should be moved
from init_vm_sw_load() to vm_sw_loader() stage. This patch refines
the API of get_bzimage_kernel_load_addr() in init_vm_kernel_info()
for later use.

Tracked-On: #6323

Signed-off-by: Victor Sun <victor.sun@intel.com>
Reviewed-by: Jason Chen CJ <jason.cj.chen@intel.com>
2021-08-03 13:44:51 +08:00
Victor Sun 33d226bf58 HV: vm_load: refine api of get_initrd_load_addr()
Currently the guest kernel load address and ramdisk load address are
initialized during init_vm_sw_load() stage, this is meaningless when
guest kernel has multiple segments with different loading addresses.
In that case, the kernel load addresses should be parsed and loaded
in vm_sw_loader() stage, the ramdisk load address should be set in
that stage also because it is depended on kernel load address.

This patch refines the API of get_initrd_load_addr() which will set
proper initrd load address of bzImage type kernel for later use.

Tracked-On: #6323

Signed-off-by: Victor Sun <victor.sun@intel.com>
Reviewed-by: Jason Chen CJ <jason.cj.chen@intel.com>
2021-08-03 13:44:51 +08:00