Using the SSRAM area size extracted by config_tools, the patch changes
the hard-coded GPA SSRAM area size to its actual size, so that
pre-launched VMs can support large(>8MB) SSRAM area.
When booting service VM, the SSRAM area has to be removed from Service
VM's mem space, because they are passed-through to the pre-rt VM. The
code was bugged since it was using the SSRAM area's GPA in the pre-rt
VM. Changed it to GPA in Service VM.
Tracked-On: #7212
Acked-by: Eddie Dong <eddie.dong@intel.com>
Signed-off-by: Zhou, Wu <wu.zhou@intel.com>
On hybrid platform(e.g. ADL), there may be multiple instances of same level caches for different type of processors,
The current design only supports one global `rdt_info` for each RDT resource type.
In order to support hybrid platform, this patch introduce `rdt_ins` to represents the "instance".
Also, the number of `rdt_info` is dynamically generated by config-tool to match with physical board.
Tracked-On: projectacrn#6690
Signed-off-by: Tw <wei.tan@intel.com>
Acked-by: Eddie Dong <eddie.dong@Intel.com>
As RDT related information will be offered by config-tool dynamically,
and HV is just a consumer of that. So there's no need to do this detection
at startup anymore.
Tracked-On: projectacrn#6690
Signed-off-by: Tw <wei.tan@intel.com>
Acked-by: Eddie Dong <eddie.dong@Intel.com>
Many of the license and Intel copyright headers include the "All rights
reserved" string. It is not relevant in the context of the BSD-3-Clause
license that the code is released under. This patch removes those strings
throughout the code (hypervisor, devicemodel and misc).
Tracked-On: #7254
Signed-off-by: Geoffroy Van Cutsem <geoffroy.vancutsem@intel.com>
Now HI_MMIO_xxx is duplicate with MMIO64_xxx. This patch replace HI_MMIO_xxx
with MMIO64_xxx.
Tracked-On: #6011
Signed-off-by: Fei Li <fei1.li@intel.com>
Unify the handling of host/guest MSR area in VMCS. Remove the emum value
as the element index when there are a few of MSRs in host/guest area.
Because the index could be changed if one element not used. So, use a
variable to save the index which will be used.
Tracked-On: #6966
Acked-by: Eddie Dong <eddie.dong@intel.com>
Signed-off-by: Minggui Cao <minggui.cao@intel.com>
Requirement: in CPU partition VM (RTVM), vtune or perf can be used to
sample hotspot code path to tune the RT performance, It need support
PMU/PEBS (Processor Event Based Sampling). Intel TCC asks for it, too.
It exposes PEBS related capabilities/features and MSRs to CPU
partition VM, like RTVM. PEBS is a part of PMU. Also PEBS needs
DS (Debug Store) feature to support. So DS is exposed too.
Limitation: current it just support PEBS feature in VM level, when CPU
traps to HV, the performance counter will stop. Perf global control
MSR is used to do this work. So, the counters shall be close to native.
Tracked-On: #6966
Acked-by: Anthony Xu <anthony.xu@intel.com>
Signed-off-by: Minggui Cao <minggui.cao@intel.com>
Add a flag: GUEST_FLAG_PMU_PASSTHROUGH to indicate if
PMU (Performance Monitor Unit) is passthrough to guest VM.
Tracked-On: #6966
Acked-by: Anthony Xu <anthony.xu@intel.com>
Signed-off-by: Minggui Cao <minggui.cao@intel.com>
NMI is used to notify LAPIC-PT RTVM, to kick its CPU into hypervisor.
But NMI could be used by system devices, like PMU (Performance Monitor
Unit). So use INIT signal as the partition CPU notification function, to
replace injecting NMI.
Also remove unused NMI as notification related code.
Tracked-On: #6966
Acked-by: Anthony Xu <anthony.xu@intel.com>
Signed-off-by: Minggui Cao <minggui.cao@intel.com>
Now the vept module uses a mixture of nept and vept, it's better to
refine it.
So this patch rename nept to vept and simplify the interface of vept
init module.
Tracked-On: #6690
Acked-by: Anthony Xu <anthony.xu@intel.com>
Signed-off-by: Chenli Wei <chenli.wei@intel.com>
This patch adds ENOTTY and ENOSYS to indicate undefined and obsoleted
request hyercall respectively, and uses ENOTTY as error code for undefined
hypercall instead of EINVAL to consistent with the ACRN kernel's return
value.
Tracked-On: #7029
Signed-off-by: Wen Qian <qian.wen@intel.com>
Signed-off-by: Li Fei <fei1.li@intel.com>
Acked-by: Wang, Yu1 <yu1.wang@intel.com>
The coding guideline rule C-ST-04 requires that
a 'if' statement followed by one or more 'else if'
statement shall be terminated by an 'else' statement
which contains either appropriate action or a comment.
Tracked-On: #6776
Signed-off-by: Mingqiang Chi <mingqiang.chi@intel.com>
Acked-by: Anthony Xu <anthony.xu@intel.com>
Now the vept table was allocate dynamically, but the table size of vept
was calculated by the CONFIG_PLATFORM_RAM_SIZE which was predefined by
config tool.
It's not complete change and can't support single binary for different
boards/platforms.
So this patch will replace the CONFIG_PLATFORM_RAM_SIZE and get the
top ram size from hv_E820 interface for vept.
Tracked-On: #6690
Acked-by: Anthony Xu <anthony.xu@intel.com>
Signed-off-by: Chenli Wei <chenli.wei@linux.intel.com>
Now the EPT module use predefined parameter "CONFIG_PLATFORM_RAM_SIZE"
to calculate the ept table size.
After change the EPT table to dynamic allocate to support single binary
for different boards/platforms, the ept table size should dynamic
calculate too.
So this patch replace CONFIG_PLATFORM_RAM_SIZE by the hv_e820_ram_size
to get the RAM info on run time.
Tracked-On: #6690
Acked-by: Anthony Xu <anthony.xu@intel.com>
Signed-off-by: Chenli Wei <chenli.wei@linux.intel.com>
CONFIG_PLATFORM_RAM_SIZE is predefined by config tool and mmu use it to
calculate the table size and predefine the ppt table.
This patch will change the ppt to allocate dynamically and get the table
size by the hv_e820_ram_size interface which could get the RAM
info on run time and replace the CONFIG_PLATFORM_RAM_SIZE.
Tracked-On: #6690
Acked-by: Anthony Xu <anthony.xu@intel.com>
Signed-off-by: Chenli Wei <chenli.wei@linux.intel.com>
The e820 module could get the RAM info on run time, but the RAM size
and MAX address was limited by CONFIG_PLATFORM_RAM_SIZE which was
predefined by config tool.
Current solution can't support single binary for different boards or
platforms and the CONFIG_PLATFORM_RAM_SIZE can't matching the RAM size
if user have not update config tools setting after the device changed.
So this patch remove the CONFIG_PLATFORM_RAM_SIZE and calculate ram
size on run time.
Tracked-On: #6690
Acked-by: Anthony Xu <anthony.xu@intel.com>
Signed-off-by: Chenli Wei <chenli.wei@linux.intel.com>
Sometimes the memory to be allocated is not at the end of an entry,
that means we have to break one enty into 2 smaller entries, there
are two ways to add the new entry to hv_e820, adds to the end or
insert it.
The initial e820 table is ordered, that's why the e820_alloc_memory
interface asssum all entries was sorted, but add new entry to the
end will break the orde of hv_e820.
So we use insert_e820_entry to replace the add_e820_entry, the new
interfeac will keep the orde and users do not need sort again after
alloc region
Tracked-On: #6690
Acked-by: Anthony Xu <anthony.xu@intel.com>
Signed-off-by: Chenli Wei<chenli.wei@linux.intel.com>
HC_GET_PLATFORM_INFO hypercall is not supported anymore,
hence to remove related function and data structure definition.
Tracked-On: #6690
Signed-off-by: Yonghua Huang <yonghua.huang@intel.com>
Current mmu assum the high memory start from 4G,it's not true for some
platform.
The map logic use "high64_max_ram - 4G" to calculate the high ram size
without any check,it's an issue when the platform have no high memory.
So this patch add high64_min_ram variable to calculate the min address
of high memory and check the high64_min_ram to fix the previou issue.
Tracked-On: #6690
Acked-by: Anthony Xu <anthony.xu@intel.com>
Signed-off-by: Chenli Wei <chenli.wei@linux.intel.com>
The current code would inject GP to guest, when there's no IWKeyBackup,
and the guest tried to write MSR MSR_IA32_COPY_PLATFORM_TO_LOCAL(0xd92)
to copy IWKeyBackup for the platform to the IWKey for this logical processor.
This patch fixes it by adjusting the code logic, and it'll do nothing
instead of inject GP if no valid IWKeyBackup.
This patch alse add checking for the value being written to avoid setting
reserved MSR bits.
Tracked-On: #7018
Signed-off-by: Wen Qian <qian.wen@intel.com>
Signed-off-by: Li Fei <fei1.li@intel.com>
Acked-by: Anthony Xu <anthony.xu@intel.com>
Now the multiboot modules memory have not reserve,it's an issue if
these memory alloc and write before VM start.
Incorrect allocation of multiboot modules memory will cause VM lost
data or start faild.
So we find these modules memory range and reserve these memory from
e820 entry.
All these memory will realloc to VM which own them before the vm start.
Tracked-On: #6690
Acked-by: Anthony Xu <anthony.xu@intel.com>
Signed-off-by: Chenli Wei <chenli.wei@intel.com>
The coding guideline rule C-FN-16 requires that 'Mixed-use of
C code and assembly code in a single function shall not be allowed',
this patch wraps inline assembly to inline functions.
Tracked-On: #6776
Signed-off-by: Mingqiang Chi <mingqiang.chi@intel.com>
Acked-by: Anthony Xu <anthony.xu@intel.com>
v1-->v2:
use inline functions for read/write XMM registers
Rename `CONFIG_IOMMU_BUS_NUM` to `ACFG_MAX_PCI_BUS_NUM`. Configure tool
will calculate `ACFG_MAX_PCI_BUS_NUM` base on the max pci num which is
used by VF. So user needn't care about `ACFG_MAX_PCI_BUS_NUM`, and memory
will be used resonable.
Tracked-On: #6942
Signed-off-by: Yuanyuan Zhao <yuanyuan.zhao@linux.intel.com>
Reviewed-by: Wang, Yu1 <yu1.wang@intel.com>
Acked-by: Victor Sun <victor.sun@intel.com>
Acked-by: Anthony Xu <anthony.xu@intel.com>
remove is_valid_xsave_combination api,
assume the hardware or QEMU can guarantee that support
XSAVE on CPU side and XSAVE_XRSTR on VMX side or not.
will add offline-tool in QEMU platform to avoid the user
use wrong XSAVE configurations.
remov check VMX_PROCBASED_CTLS2_XSVE_XRSTR based on the above reason.
for VMX_PROCBASED_CTLS2_PAUSE_LOOP, now it will panic
if run ACRN over QEMU, here remove it from essential check,
and it will print error information when set this bit
if there is no the hardware capability.
v1-v2:
remove is_valid_xsave_combination
Tracked-On: #6584
Signed-off-by: Mingqiang Chi <mingqiang.chi@intel.com>
Acked-by: Anthony Xu <anthony.xu@intel.com>
This patch adds an option CONFIG_KEEP_IRQ_DISABLED to hv (default n) and
config-tool so that when this option is 'y', all interrupts in hv root
mode will be permanently disabled.
With this option to be 'y', all interrupts received in root mode will be
handled in external interrupt vmexit after next VM entry. The postpone
latency is negligible. This new configuration is a requirement from x86
TEE's secure/non-secure interrupt flow support. Many race conditions can be
avoided when keeping IRQ off.
v5:
Rename CONFIG_ACRN_KEEP_IRQ_DISABLED to CONFIG_KEEP_IRQ_DISABLED
v4:
Change CPU_IRQ_ENABLE/DISABLE to
CPU_IRQ_ENABLE_ON_CONFIG/DISABLE_ON_CONFIG and guard them using
CONFIG_ACRN_KEEP_IRQ_DISABLED
v3:
CONFIG_ACRN_DISABLE_INTERRUPT -> CONFIG_ACRN_KEEP_IRQ_DISABLED
Add more comment in commit message
Tracked-On: #6571
Signed-off-by: Yifan Liu <yifan1.liu@intel.com>
Reviewed-by: Wang, Yu1 <yu1.wang@intel.com>
Acked-by: Anthony Xu <anthony.xu@intel.com>
Previous upstreamed patches handles the secure/non-secure interrupts in
handle_x86_tee_int. However there is a corner case in which there might
be unhandled secure interrupts (in a very short time window) when TEE
yields vCPU. For this case we always make sure that no secure interrupts
are pending in TEE's vlapic before scheduling REE.
Also in previous patches, if non-secure interrupt comes when TEE is
handling its secure interrupts, hypervisor injects a predefined vector
into TEE's vlapic. TEE does not consume this vector in secure interrupt
handling routine so it stays in vIRR, but it should be cleared because the
actual interrupt will be consumed in REE after VM Entry.
v3:
Fix comments on interrupt priority
v2:
Add comments explaining the priority of secure/non-secure interrupts
Tracked-On: #6571
Signed-off-by: Yifan Liu <yifan1.liu@intel.com>
Reviewed-by: Wang, Yu1 <yu1.wang@intel.com>
Acked-by: Anthony Xu <anthony.xu@intel.com>
The TEE_NOTIFICATION_VECTOR can sometimes be confused with TEE's PI
notification vector. So rename it to TEE_FIXED_NONSECURE_VECTOR for
better readability.
No logic change.
v3:
Add more comments in commit message.
Tracked-On: #6571
Signed-off-by: Yifan Liu <yifan1.liu@intel.com>
Reviewed-by: Wang, Yu1 <yu1.wang@intel.com>
Acked-by: Anthony Xu <anthony.xu@intel.com>
Sometimes HV would like to know if there are specific interrupt
pending in vIRR, and clears them if necessary (such as in x86_tee case).
This patch adds two APIs: get_next_pending_intr and clear_pending_intr.
This patch also moves the inline api prio() from
vlapic.c to vlapic.h
v3:
Remove apicv_get_next_pending_intr and apicv_clear_pending_intr
and use vlapic_get_next_pending_intr and vlapic_clear_pending_intr
directly.
v2:
get_pending_intr -> get_next_pending_intr
apicv_basic/advanced_clear_pending_intr -> apicv_clear_pending_intr
apicv_basic/advanced_get_pending_intr -> apicv_get_next_pending_intr
has_pending_intr kept
Tracked-On: #6571
Signed-off-by: Yifan Liu <yifan1.liu@intel.com>
Reviewed-by: Wang, Yu1 <yu1.wang@intel.com>
Acked-by: Anthony Xu <anthony.xu@intel.com>
Though REE VM has its load order to be Service_VM, it does not offer
services as Service VM does. The only hypercalls allowed for REE are the
ones with GUEST_FLAG_REE.
Tracked-On: #6571
Signed-off-by: Yifan Liu <yifan1.liu@intel.com>
Reviewed-by: Wang, Yu1 <yu1.wang@intel.com>
Acked-by: Anthony Xu <anthony.xu@intel.com>
This patch wraps the check of GUEST_FLAG_TEE/REE into functions
is_tee_vm/is_ree_vm for readability. No logic changes.
Tracked-On: #6571
Signed-off-by: Yifan Liu <yifan1.liu@intel.com>
Reviewed-by: Wang, Yu1 <yu1.wang@intel.com>
Acked-by: Anthony Xu <anthony.xu@intel.com>
The CONFIG_LOG_DESTINATION parameter selects where the logging messages
send to,serial console or memory or npk device MMIO region.
Now we want to remove it and check the loglevel of each channel,close the
output when the loglevel is ZERO.
Tracked-On: #6934
Signed-off-by: Chenli Wei <chenli.wei@intel.com>
Acked-by: Anthony Xu <anthony.xu@intel.com>
Currently, in RTVM with multi vCPUs, lapic pass through is
configured, each vCPU works in x2apic mode. When one vCPU sends
IPI to all other vCPUs through writes ICR register with virtual
value 0x00000000000c00f8, this ICR writting will be intercepted,
the hypervisor passes destination shorthand field 11B (All Excluding
Self) in the virtual ICR value into physical ICR value during IPI
emulation, this IPI will be sent to each physical CPU core
in the platform according to 10.6.1 Interrupt Command Register (ICR),
Vol 3, SDM.
One vCPU in User VM with lapic pass through configuration can
send IPI with destination shorthand (10B or 11B) and any vector
(such as NMI or reboot vector) to other vCPUs, this IPI will sent
other VMs in the platform by hypervisor, this interference may
cause other VMs hang.
In this patch, set "Destination Shorthand" field of the
ICR value as 00B (No Shorthand) since the emulation is done
through sending IPI to each VCPU in dmask one by one.
Tracked-On: #6908
Signed-off-by: Xiangyang Wu <xiangyang.wu@intel.com>
Reviewed-by: Chen, Jason CJ <jason.cj.chen@intel.com>
This patch introduces stateful VM which represents a VM that has its own
internal state such as a file cache, and adds a check before system
shutdown to make sure that stateless VM does not block system shutdown.
Tracked-On: #6571
Signed-off-by: Wang Yu <yu1.wang@intel.com>
Signed-off-by: Yifan Liu <yifan1.liu@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
Variable should not have a prefix of '_' per MISRA C standard. The patch
removes the prefix for _ld_ram_start and _ld_ram_end.
Tracked-On: #6885
Signed-off-by: Victor Sun <victor.sun@intel.com>
Acked-by: Anthony Xu <anthony.xu@intel.com>
In previous commit df7ffab441
the CONFIG_HV_RAM_SIZE was removed and the hv_ram_size was calculated in
link script by following formula:
ld_ram_size = _ld_ram_end - _ld_ram_start ;
but _ld_ram_start is a relative address in boot section whereas _ld_ram_end
is a absolute address in global, the mix operation cause hv_ram_size is
incorrect when HV binary is relocated.
The patch fix this issue by getting _ld_ram_start and _ld_ram_end respectively
and calculated at runtime.
Tracked-On: #6885
Signed-off-by: Victor Sun <victor.sun@intel.com>
Acked-by: Anthony Xu <anthony.xu@intel.com>
Secure interrupt (interrupt belongs to TEE) comes
when TEE vcpu is running, the interrupt will be
injected to TEE directly. But when REE vcpu is running
at that time, we need to switch to TEE for handling.
Non-Secure interrupt (interrupt belongs to REE) comes
when REE vcpu is running, the interrupt will be injected
to REE directly. But when TEE vcpu is running at that time,
we need to inject a predefined vector to TEE for notification
and continue to switch back to TEE for running.
To sum up, when secure interrupt comes, switch to TEE
immediately regardless of whether REE is running or not;
when non-Secure interrupt comes and TEE is running,
just notify the TEE and keep it running, TEE will switch
to REE on its own initiative after completing its work.
Tracked-On: projectacrn#6571
Signed-off-by: Jie Deng <jie.deng@intel.com>
Reviewed-by: Wang, Yu1 <yu1.wang@intel.com>
Acked-by: Eddie Dong <eddie.dong@Intel.com>
This patch implements the following x86_tee hypercalls,
- HC_TEE_VCPU_BOOT_DONE
- HC_SWITCH_EE
Tracked-On: #6571
Signed-off-by: Jie Deng <jie.deng@intel.com>
Reviewed-by: Wang, Yu1 <yu1.wang@intel.com>
Acked-by: Eddie Dong <eddie.dong@Intel.com>
This patch adds the x86_tee hypercall interfaces.
- HC_TEE_VCPU_BOOT_DONE
This hypercall is used to notify the hypervisor that the TEE VCPU Boot
is done, so that we can sleep the corresponding TEE VCPU. REE will be
started at the last time this hypercall is called by TEE.
- HC_SWITCH_EE
For REE VM, it uses this hypercall to request TEE service.
For TEE VM, it uses this hypercall to switch back to REE
when it completes the REE service.
Tracked-On: #6571
Signed-off-by: Jie Deng <jie.deng@intel.com>
Reviewed-by: Wang, Yu1 <yu1.wang@intel.com>
Acked-by: Eddie Dong <eddie.dong@Intel.com>
TEE is a secure VM which has its own partitioned resources while
REE is a normal VM which owns the rest of platform resources.
The TEE, as a secure world, it can see the memory of the REE
VM, also known as normal world, but not the other way around.
But please note, TEE and REE can only see their own devices.
So this patch does the following things:
1. go through physical e820 table, to ept add all system memory entries.
2. remove hv owned memory.
Tracked-On: #6571
Signed-off-by: Jie Deng <jie.deng@intel.com>
Reviewed-by: Wang, Yu1 <yu1.wang@intel.com>
Acked-by: Eddie Dong <eddie.dong@Intel.com>
Given an e820, this API creates an identical memmap for specified
e820 memory type, EPT memory cache type and access right.
Tracked-On: #6571
Signed-off-by: Jie Deng <jie.deng@intel.com>
Reviewed-by: Wang, Yu1 <yu1.wang@intel.com>
Acked-by: Eddie Dong <eddie.dong@Intel.com>
With current arch design the UUID is used to identify ACRN VMs,
all VM configurations must be deployed with given UUIDs at build time.
For post-launched VMs, end user must use UUID as acrn-dm parameter
to launch specified user VM. This is not friendly for end users
that they have to look up the pre-configured UUID before launching VM,
and then can only launch the VM which its UUID in the pre-configured UUID
list,otherwise the launch will fail.Another side, VM name is much straight
forward for end user to identify VMs, whereas the VM name defined
in launch script has not been passed to hypervisor VM configuration
so it is not consistent with the VM name when user list VM
in hypervisor shell, this would confuse user a lot.
This patch will resolve these issues by removing UUID as VM identifier
and use VM name instead:
1. Hypervisor will check the VM name duplication during VM creation time
to make sure the VM name is unique.
2. If the VM name passed from acrn-dm matches one of pre-configured
VM configurations, the corresponding VM will be launched,
we call it static configured VM.
If there is no matching found, hypervisor will try to allocate one
unused VM configuration slot for this VM with given VM name and get it
run if VM number does not reach CONFIG_MAX_VM_NUM,
we will call it dynamic configured VM.
3. For dynamic configured VMs, we need a guest flag to identify them
because the VM configuration need to be destroyed
when it is shutdown or creation failed.
v7->v8:
-- rename is_static_vm_configured to is_static_configured_vm
-- only set DM owned guest_flags in hcall_create_vm
-- add check dynamic flag in get_unused_vmid
v6->v7:
-- refine get_vmid_by_name, return the first matching vm_id
-- the GUEST_FLAG_STATIC_VM is added to identify the static or
dynamic VM, the offline tool will set this flag for
all the pre-defined VMs.
-- only clear name field for dynamic VM instead of clear entire
vm_config
Tracked-On: #6685
Signed-off-by: Mingqiang Chi <mingqiang.chi@intel.com>
Reviewed-by: Zhao Yakui <yakui.zhao@intel.com>
Reviewed-by: Victor Sun<victor.sun@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
The CONFIG_MAX_IR_ENTRIES and CONFIG_MAX_PT_IRQ_ENTRIES are separate
configuration items, and they can be configured through configuration tool
When the number of PT irq entries are more than IR entries, then some
passthrough devices' irqs may failed to be protected by interrupt
remapping or automatically injected by post-interrupt mechanism.
And it waste memory if the CONFIG_MAX_IR_ENTRIES is larger.
This patch replace the CONFIG_MAX_IR_ENTRIES to MAX_IR_ENTRIES and
enforce it align to CONFIG_PT_IRQ_ENTRIES and round up to > 2^n as the
IRTA_REG spec.This way can enforce all PT irqs works with IR or PI
mechanism.
Tracked-On: #6745
Signed-off-by: Chenli Wei <chenli.wei@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
Reviewed-by: Wang, Yu1 <yu1.wang@intel.com>
e820_alloc_memory requires 4k alignment, so conversion to size is
encapsulated in the function. And then the pre-condition of
`size_arg` is removed.
Tracked-On: #6805
Signed-off-by: Yuanyuan Zhao <yuanyuan.zhao@linux.intel.com>
Reviewed-by: Wang, Yu1 <yu1.wang@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
The CONFIG_LOW_RAM_SIZE is used to describe the size of trampoline code
that is never changed. And it totally confused user to configure it.
This patch hard code it to 1MB and remove the macro for configuration.
In the trampoline related code, use ld_trampoline_end and
ld_trampoline_start symbol to calculate the real size.
Tracked-On: #6805
Signed-off-by: Yuanyuan Zhao <yuanyuan.zhao@linux.intel.com>
Reviewed-by: Wang, Yu1 <yu1.wang@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
Commit cbf3825 "hv: Pass-through IA32_TSC_AUX MSR to L1 guest"
lets guest own the physical MSR IA32_TSC_AUX and does not handle this MSR
in the hypervisor.
If multiple vCPUs share the same pCPU, when one vCPU reads MSR IA32_TSC_AUX,
it may get the value set by other vCPUs.
To fix this issue, this patch does:
- initialize the MSR content to 0 for the given vCPU, which is consistent with
the value specified in SDM Vol3 "Table 9-1. IA-32 and Intel 64 Processor
States Following Power-up, Reset, or INIT"
- save/restore the MSR content for the given vCPU during context switch
v1 -> v2:
* According to Table 9-1, the content of IA32_TSC_AUX MSR is unchanged
following INIT, v2 updates the initialization logic so that the content for
vCPU is consistent with SDM.
Tracked-On: #6799
Signed-off-by: Shiqing Gao <shiqing.gao@intel.com>
Reviewed-by: Zide Chen <zide.chen@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
This patch moves the ssram area in ve820 tab, and reunites the
hpa1_low_part1/2 areas. The ve820 building code is refined.
before:
|<---low_1M--->|
|<---hpa1_low_part1--->|
|<---SSRAM--->|
|<---hpa1_low_part2--->|
|<---GPU_OpRegion--->|
|<---ACPI DATA--->|
|<---ACPI NVS--->|
---2G---
after:
|<---low_1M--->|
|<---hpa_low--->|
|<---SSRAM--->|
|<---GPU_OpRegion--->|
|<---ACPI DATA--->|
|<---ACPI NVS--->|
---2G---
The SSRAM area's address is described in the ACPI's RTCT/PTCT
table. To simplify the SSRAM implementation, SSRAM area was
identical mapped to GPA, and resulted in the divition of hpa_low.
Then the ve820 building logic became too complicated.
Now we managed to edit the guest's RTCT/PTCT table by offline
tools in the former patch, so we can move the guest's SSRAM
area, and reunite the hpa_low areas again.
After doing this, this patch rewrites the ve820 building code
in a much simpler way.
Tracked-On: #6674
Signed-off-by: Zhou, Wu <wu.zhou@intel.com>
Reviewed-by: Eddie Dong <eddie.dong@intel.com>
Reviewed-by: Wang, Yu1 <yu1.wang@intel.com>
The ve820 table' hpa1_low area is divided into two parts, which
is making the code too complicated and causing problems. Moving
the entries that divides the hpa1_low could make things easier.
This patch moves the GPU OpRegion to the tail area of 2G,
consecutive to the acpi data/nvs area.
before:
|<---low_1M--->|
|<---hpa1_low_part1--->|
|<---SSRAM--->|
|<---GPU_OpRegion--->|
|<---hpa1_low_part2--->|
|<---ACPI DATA--->|
|<---ACPI NVS--->|
---2G---
after:
|<---low_1M--->|
|<---hpa1_low_part1--->|
|<---SSRAM--->|
|<---hpa1_low_part2--->|
|<---GPU_OpRegion--->|
|<---ACPI DATA--->|
|<---ACPI NVS--->|
---2G---
Tracked-On: #6674
Signed-off-by: Zhou, Wu <wu.zhou@intel.com>
Reviewed-by: Victor Sun <victor.sun@intel.com>
Reviewed-by: Wang, Yu1 <yu1.wang@intel.com>
The length of the ACPI data entry in ve820 tab was 960K, while the
ACPI file is 1MB. It causes the ACPI file copy failed due to reserved
ACPI regions in ve820 table is not enough when loading pre-launched
VMs. This patch changes ACPI data area to 1MB to fix the problem.
And the ACPI data length was missed when calculating
ENTRY_HPA1_LOW_PART2 length. Fixed here too.
Also adds some refinement to the hard-coded ACPI base/addr definations
Tracked-On: #6674
Signed-off-by: Zhou, Wu <wu.zhou@intel.com>
Reviewed-by: Eddie Dong <eddie.dong@intel.com>
Reviewed-by: Wang, Yu1 <yu1.wang@intel.com>