acrn-hypervisor

Commit Graph

Author	SHA1	Message	Date
Zide Chen	4c29a0bb29	hv: nested: support for VMLAUNCH and VMRESUME emulation Implement the VMLAUNCH and VMRESUME instructions, allowing a L1 hypervisor to run nested guests. - merge VMCS control fields and VMCS guest fields to VMCS02 - clear shadow VMCS indicator on VMCS02 and load VMCS02 as current - set VMCS12 launch state to "launched" in VMLAUNCH handler Tracked-On: #5923 Signed-off-by: Sainath Grandhi <sainath.grandhi@intel.com> Signed-off-by: Zide Chen <zide.chen@intel.com> Signed-off-by: Alex Merritt <alex.merritt@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2021-06-03 15:23:25 +08:00
Yonghua Huang	1a6ead9af5	hv: update RTCT ACPI table detecting The signature of the RTCT ACPI table may be "PTCT" (v1) or "RTCT" (v2), and the MAGIC number in the CRL header has also changed from "PTCM" to "RTCM". This patch refines the code to detect the RTCT table for both v1 and v2. Tracked-On: #6020 Signed-off-by: Yonghua Huang <yonghua.huang@intel.com>	2021-06-01 08:22:20 +08:00
Zide Chen	6d69058a9d	hv: nested: support for VMREAD and VMWRITE emulation This patch implements the VMREAD and VMWRITE instructions. When L1 guest is running with an active VMCS12, the “VMCS shadowing” VM-execution control is always set to 1 in VMCS01. Thus the possible behavior of VMREAD or VMWRITE from L1 could be: - It causes a VM exit to L0 if the bit corresponds to the target VMCS field in the VMREAD bitmap or VMWRITE bitmap is set to 1. - It accesses the VMCS referenced by VMCS01 link pointer (VMCS02 in our case) if the above mentioned bit is set to 0. This patch handles the VMREAD and VMWRITE VM exits in this way: - on VMWRITE, it writes the desired VMCS value to the respective field in the cached VMCS12. For VMCS fields that need to be synced to VMCS02, sets the corresponding dirty flag. - on VMREAD, it reads the desired VMCS value from the cached VMCS12. Tracked-On: #5923 Signed-off-by: Alex Merritt <alex.merritt@intel.com> Signed-off-by: Sainath Grandhi <sainath.grandhi@intel.com> Signed-off-by: Zide Chen <zide.chen@intel.com> Acked-by: Eddie Dong <eddie.dong@Intel.com>	2021-05-24 10:34:01 +08:00
Zide Chen	2bd269c11c	hv: nested: support for VMCLEAR emulation This patch emulates the VMCLEAR instruction. L1 hypervisor issues VMCLEAR on a VMCS12 whose state could be any of these: active and current, active but not current, not yet VMPTRLDed. To emulate the VMCLEAR instruction, ACRN sets the VMCS12 launch state to "clear", and if L0 has already cached this VMCS12, it needs to sync it back to guest memory: - sync shadow fields from shadow VMCS to cache VMCS12 - copy cache VMCS12 to L1 guest memory Tracked-On: #5923 Signed-off-by: Sainath Grandhi <sainath.grandhi@intel.com> Signed-off-by: Zide Chen <zide.chen@intel.com>	2021-05-24 10:34:01 +08:00
Zide Chen	5379b14108	hv: nested: define VMCS shadow fields Enable VMCS shadowing for most of the VMCS fields, so that execution of the VMREAD or VMWRITE on these shadow VMCS fields from L1 hypervisor won't cause VM exits, but read from or write to the shadow VMCS. Tracked-On: #5923 Signed-off-by: Sainath Grandhi <sainath.grandhi@intel.com> Signed-off-by: Alexander Merritt <alex.merritt@intel.com> Signed-off-by: Zide Chen <zide.chen@intel.com>	2021-05-24 10:34:01 +08:00
Zide Chen	863e58e539	hv: nested: define software layout for VMCS12 and helper functions Software layout of VMCS12 data is a contract between L1 guest and L0 hypervisor to run a L2 guest. ACRN hypervisor caches the VMCS12 which is passed down from L1 hypervisor by the VMPTRLD instruction. At the time of VMCLEAR, ACRN syncs the cached VMCS12 back to L1 guest memory. Tracked-On: #5923 Signed-off-by: Sainath Grandhi <sainath.grandhi@intel.com> Signed-off-by: Zide Chen <zide.chen@intel.com> Acked-by: Eddie Dong <eddie.dong@Intel.com>	2021-05-24 10:34:01 +08:00
Zide Chen	f5744174b5	hv: nested: support for VMPTRLD emulation This patch emulates the VMPTRLD instruction. L0 hypervisor (ACRN) caches the VMCS12 that is passed down from the VMPTRLD instruction, and merges it with VMCS01 to create VMCS02 to run the nested VM. - Currently ACRN can't cache multiple VMCS12 on one vCPU, so it needs to flush active but not current VMCS12s to L1 guest. - ACRN creates VMCS02 to run nested VM based on VMCS12: 1) copy VMCS12 from guest memory to the per vCPU cache VMCS12 2) initialize VMCS02 revision ID and host-state area 3) load shadow fields from cache VMCS12 to VMCS02 4) enable VMCS shadowing before L1 VM entry Tracked-On: #5923 Signed-off-by: Sainath Grandhi <sainath.grandhi@intel.com> Signed-off-by: Zide Chen <zide.chen@intel.com>	2021-05-24 10:34:01 +08:00
Zide Chen	0a1ac2f4a0	hv: nested: support for VMXOFF emulation This patch implements the VMXOFF instruction. By issuing VMXOFF, L1 guest Leaves VMX Operation. - cleanup VCPU nested virtualization context states in VMXOFF handler. - implement check_vmx_permission() to check permission for VMX operation for VMXOFF and other VMX instructions. Tracked-On: #5923 Signed-off-by: Sainath Grandhi <sainath.grandhi@intel.com> Signed-off-by: Zide Chen <zide.chen@intel.com> Acked-by: Eddie Dong <eddie.dong@Intel.com>	2021-05-24 10:34:01 +08:00
Zide Chen	3fdad3c6d1	hv: nested: check prerequisites to enter VMX operation According to VMXON Instruction Reference, do the following checks in the virtual hardware environment: vCPU CPL, guest CR0, CR4, revision ID in VMXON region, etc. Currently ACRN doesn't support 32-bit L1 hypervisor, and injects an #UD exception if L1 hypervisor is not running in 64-bit mode. Tracked-On: #5923 Signed-off-by: Zide Chen <zide.chen@intel.com> Acked-by: Eddie Dong <eddie.dong@Intel.com>	2021-05-24 10:34:01 +08:00
Zide Chen	fc8f07e740	hv: nested: support for VMXON emulation This patch emulates VMXON instruction. Basically checks some prerequisites to enable VMX operation on L1 guest (next patch), and prepares some virtual hardware environment in L0. Tracked-On: #5923 Signed-off-by: Sainath Grandhi <sainath.grandhi@intel.com> Signed-off-by: Zide Chen <zide.chen@intel.com> Acked-by: Eddie Dong <eddie.dong@Intel.com>	2021-05-24 10:34:01 +08:00
Li Fei1	a69e67b58b	hv: vlapic: wrap a function to calculate destination vcpu mask by shorthand 1. Rename vlapic_calc_dest to vlapic_calc_dest_noshort 2. Remove vlapic_calc_dest_lapic_pt, use vlapic_calc_dest_noshort instead 3. Wrap vlapic_calc_dest to calculate destination vcpu mask according to the shorthand Tracked-On: #5923 Signed-off-by: Zide Chen <zide.chen@intel.com> Signed-off-by: Li Fei1 <fei1.li@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2021-05-24 10:27:32 +08:00
Rong Liu	3db4491e1c	hv: PTM: Create virtual root port Create virtual root port through add_vdev hypercall. add_vdev identifies the virtual device to add by its vendor id and device id, then call the corresponding function to create virtual device. -create_vrp(): Find the right virtual root port to create by its secondary bus number, then initialize the virtual root port. And finally initialize PTM related configurations. -destroy_vrp(): nothing to destroy Tracked-On: #5915 Signed-off-by: Rong Liu <rong.l.liu@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com> Acked-by: Jason Chen <jason.cj.chen@intel.com> Acked-by: Yu Wang <yu1.wang@intel.com>	2021-05-19 13:54:24 +08:00
Rong Liu	d57bf51c89	hv: PTM: Add virtual root port Add virtual root port that supports the most basic pci-e bridge and root port operations. - init_vroot_port(): init vroot_port's basic registers. - deinit_vroot_port(): reset vroot_port - read_vroot_port_cfg(): read from vroot_port's virtual config space. - write_vroot_port_cfg(): write to vroot_port's virtual config space. Tracked-On: #5915 Signed-off-by: Rong Liu <rong.l.liu@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com> Acked-by: Jason Chen <jason.cj.chen@intel.com> Acked-by: Yu Wang <yu1.wang@intel.com>	2021-05-19 13:54:24 +08:00
Liang Yi	3547c9cd23	hv/mod_timer: make timer into an arch-independent module x86/timer.[ch] was moved to the common directory largely unchanged. x86 specific code now resides in x86/tsc_deadline_timer.c and its interface was defined in hw/hw_timer.h. The interface defines two functions: init_hw_timer() and set_hw_timeout() that provides HW specific initialization and timer interrupt source. Other than these two functions, the timer module is largely arch agnostic. Tracked-On: #5920 Signed-off-by: Rong Liu <rong2.liu@intel.com> Reviewed-by: Jason Chen CJ <jason.cj.chen@intel.com>	2021-05-18 16:43:28 +08:00
Liang Yi	51204a8d11	hv/mod_timer: separate delay functions from the timer module Modules that use udelay() should include "delay.h" explicitly. Tracked-On: #5920 Signed-off-by: Rong Liu <rong2.liu@intel.com> Reviewed-by: Jason Chen CJ <jason.cj.chen@intel.com>	2021-05-18 16:43:28 +08:00
Liang Yi	5a2b89b0a4	hv/mod_timer: split tsc handling code from timer. Generalize and split basic cpu cycle/tick routines from x86/timer: - Instead of rdtsc(), use cpu_ticks() in generic code. - Instead of get_tsc_khz(), use cpu_tickrate() in generic code. - Include "common/ticks.h" instead of "x86/timer.h" in generic code. - CYCLES_PER_MS is renamed to TICKS_PER_MS. The x86 specific API rdtsc() and get_tsc_khz(), as well as TSC_PER_MS are still available in arch/x86/tsc.h but only for x86 specific usage. Tracked-On: #5920 Signed-off-by: Rong Liu <rong2.liu@intel.com> Signed-off-by: Yi Liang <yi.liang@intel.com>	2021-05-18 16:43:28 +08:00
Yonghua Huang	00b3a28d5d	hv: update RTCT parser to support RTCT version 2 RTCT has been updated to version 2, this patch updates hypervisor RTCT parser to support both version 1 and version 2 of RTCT. Tracked-On: #6020 Signed-off-by: Yonghua Huang <yonghua.huang@intel.com> Reviewed-by: Jason CJ Chen <jason.cj.chen@intel.com>	2021-05-17 17:19:11 +08:00
Zide Chen	c9982e8c7e	hv: nested: setup emulated VMX MSRs We emulated these MSRs: - MSR_IA32_VMX_PINBASED_CTLS - MSR_IA32_VMX_PROCBASED_CTLS - MSR_IA32_VMX_PROCBASED_CTLS2 - MSR_IA32_VMX_EXIT_CTLS - MSR_IA32_VMX_ENTRY_CTLS - MSR_IA32_VMX_BASIC: emulate VMCS revision ID, etc. - MSR_IA32_VMX_MISC For the following MSRs, we pass through the physical value to L1 guests: - MSR_IA32_VMX_EPT_VPID_CAP - MSR_IA32_VMX_VMCS_ENUM - MSR_IA32_VMX_CR0_FIXED0 - MSR_IA32_VMX_CR0_FIXED1 - MSR_IA32_VMX_CR4_FIXED0 - MSR_IA32_VMX_CR4_FIXED1 Tracked-On: #5923 Signed-off-by: Zide Chen <zide.chen@intel.com> Signed-off-by: Sainath Grandhi <sainath.grandhi@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2021-05-16 19:05:21 +08:00
Zide Chen	4930992118	hv: nested: implement the framework for VMX MSR emulation Define LIST_OF_VMX_MSRS which includes a list of MSRs that are visible to L1 guests if nested virtualization is enabled. - If CONFIG_NVMX_ENABLED is set, these MSRs are included in emulated_guest_msrs[]. - otherwise, they are included in unsupported_msrs[]. In this way we can take advantage of the existing infrastructure to emulate these MSRs. Tracked-On: #5923 Signed-off-by: Zide Chen <zide.chen@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2021-05-16 19:05:21 +08:00
Yonghua Huang	e9870893a3	hv: rename some software SRAM local names For simplification purpose, use 'ssram' instead of 'software sram' for local names inside rtcm module. Tracked-On: #6015 Signed-off-by: Yonghua Huang <yonghua.huang@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2021-05-16 10:08:17 +08:00
Li Fei1	30febed0e1	hv: cache: wrap common APIs Wrap three common Cache APIs: - flush_invalidate_all_cache - flush_cacheline - flush_cache_range Tracked-On: #5830 Signed-off-by: Li Fei1 <fei1.li@intel.com>	2021-05-14 09:18:00 +08:00
Li Fei1	77e64f6092	hv: tlb: wrap common APIs Wrap two common TLB APIs: flush_tlb and flush_tlb_range. Tracked-On: #5830 Signed-off-by: Li Fei1 <fei1.li@intel.com>	2021-05-14 09:18:00 +08:00
Li Fei1	d94582389e	hv: mmu: move arch specific parts into cpu.h Move Cache/TLB arch specific parts into cpu.h After this change, we should not expose arch specific parts out from mmu.h Tracked-On: #5830 Signed-off-by: Li Fei1 <fei1.li@intel.com>	2021-05-14 09:18:00 +08:00
Li Fei1	d6362b6e0a	hv: paging: rename ppt_set/clear_ATTR to set_paging_ATTR Rename ppt_set/clear_(attribute) to set_paging_(attribute) Tracked-On: #5830 Signed-off-by: Li Fei1 <fei1.li@intel.com>	2021-05-14 09:18:00 +08:00
Zide Chen	ccfdf9cdd7	hv: nested: enable nested virtualization Allow guest set CR4_VMXE if CONFIG_NVMX_ENABLED is set: - move CR4_VMXE from CR4_EMULATED_RESERVE_BITS to CR4_TRAP_AND_EMULATE_BITS so that CR4_VMXE is removed from cr4_reserved_bits_mask. - force CR4_VMXE to be removed from cr4_rsv_bits_guest_value so that CR4_VMXE is able to be set. Expose VMX feature (CPUID01.01H:ECX[5]) to L1 guests whose GUEST_FLAG_NVMX_ENABLED is set. Assuming guest hypervisor (L1) is KVM, and KVM uses EPT for L2 guests. Constraints on ACRN VM. - LAPIC passthrough should be enabled. - use SCHED_NOOP scheduler. Tracked-On: #5923 Signed-off-by: Sainath Grandhi <sainath.grandhi@intel.com> Signed-off-by: Zide Chen <zide.chen@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2021-05-13 16:16:30 +08:00
Zide Chen	dd90eccc25	hv: move invvpid and invept helper code from mmu.c to mmu.h moving invvpid and invept helper code from mmu.c to mmu.h, so that they can be accessed by the nested virtualization code. No logical changes. Tracked-On: #5923 Signed-off-by: Zide Chen <zide.chen@intel.com> Signed-off-by: Sainath Grandhi <sainath.grandhi@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2021-05-13 16:16:30 +08:00
Shuo A Liu	3fffa68665	hv: Support WAITPKG instructions in guest VM TPAUSE, UMONITOR or UMWAIT instructions execution in guest VM cause a #UD if "enable user wait and pause" (bit 26) of VMX_PROCBASED_CTLS2 is not set. To fix this issue, set the bit 26 of VMX_PROCBASED_CTLS2. Besides, these WAITPKG instructions uses MSR_IA32_UMWAIT_CONTROL. So load corresponding vMSR value during context switch in of a vCPU. Please note, the TPAUSE or UMWAIT instruction causes a VM exit if the "RDTSC exiting" and "enable user wait and pause" are both 1. In ACRN hypervisor, "RDTSC exiting" is always 0. So TPAUSE or UMWAIT doesn't cause a VM exit. Performance impact: MSR_IA32_UMWAIT_CONTROL read costs ~19 cycles; MSR_IA32_UMWAIT_CONTROL write costs ~63 cycles. Tracked-On: #6006 Signed-off-by: Shuo A Liu <shuo.a.liu@intel.com>	2021-05-13 14:19:50 +08:00
Liang Yi	688a41c290	hv: mod: do not use explicit arch name when including headers Instead of "#include <x86/foo.h>", use "#include <asm/foo.h>". In other words, we are adopting the same practice in Linux kernel. Tracked-On: #5920 Signed-off-by: Liang Yi <yi.liang@intel.com> Reviewed-by: Jason Chen CJ <jason.cj.chen@intel.com>	2021-05-08 11:15:46 +08:00
Shuo A Liu	dc88c2e397	hv: Save/restore MSR_IA32_CSTAR during context switch Both Windows guest and Linux guest use the MSR MSR_IA32_CSTAR, while Linux uses it rarely. Currently the vCPU context switch doesn't save/restore it. Windows detects the change of the MSR and raises an exception. Save/restore MSR_IA32_CSTAR during context switch. Tracked-On: #5899 Signed-off-by: Shuo A Liu <shuo.a.liu@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2021-04-23 11:21:52 +08:00
Jian Jun Chen	31b8b698ce	hv: TLFS: Add tsc_offset support for reference time TLFS spec defines that when a VM is created, the value of HV_X64_MSR_TIME_REF_COUNT is set to zero. Now tsc_offset is not supported properly, so guest get a drifted reference time. This patch implements tsc_offset. tsc_scale and tsc_offset are calculated when a VM is launched and are saved in struct acrn_hyperv of struct acrn_vm. Tracked-On: #5956 Signed-off-by: Jian Jun Chen <jian.jun.chen@intel.com> Signed-off-by: Shuo A Liu <shuo.a.liu@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2021-04-23 10:48:07 +08:00
Li Fei1	628bca5cad	hv: pgtable: use new algo to calculate PPT/EPT_PD_PAGE_NUM In order to support platform (such as Ander Lake) which physical address width bits is 46, the current code need to reserve 2^16 PD page ((2^46) / (2^30)). This is a complete waste of memory. This patch would reserve PD page by three parts: 1. DRAM - may take PD_PAGE_NUM(CONFIG_PLATFORM_RAM_SIZE) PD pages at most; 2. low MMIO - may take PD_PAGE_NUM(MEM_1G << 2U) PD pages at most; 3. high MMIO - may takes (CONFIG_MAX_PCI_DEV_NUM * 6U) PD pages (may plus PDPT entries if its size is larger than 1GB ) at most for: (a) MMIO BAR size must be a power of 2 from 16 bytes; (b) MMIO BAR base address must be power of two in size and are aligned with its size. Tracked-On: #5929 Signed-off-by: Li Fei1 <fei1.li@intel.com>	2021-04-22 14:35:57 +08:00
Li Fei1	41e2d40d1f	hv: e820: remove get_mem_range_info No one uses get_mem_range_info to get the top/bottom/size of the physical memory. We could get these informations by e820 table easily. Tracked-On: #5830 Signed-off-by: Li Fei1 <fei1.li@intel.com> Acked-by: eddie Dong <eddie.dong@intel.com>	2021-04-21 14:00:44 +08:00
Li Fei1	ad15053304	hv: mmu: remove get_mem_range_info in init_paging We used get_mem_range_info to get the top memory address and then use this address as the high 64 bits max memory address. This assumes the platform must have high memory space. This patch calculates the high 64 bits max memory address according the e820 tables and removes the assumption "The platform must have high memory space" by map the low RAM region and high RAM region separately. Tracked-On: #5830 Signed-off-by: Li Fei1 <fei1.li@intel.com> Acked-by: eddie Dong <eddie.dong@intel.com>	2021-04-21 14:00:44 +08:00
Li Fei1	d1ae797742	hv: pgtable: move sanitize_pte into pagetable.c sanitize_pte is used to set a page table entry to map to a sanitized page to mitigate L1TF. It should belong to the pgtable module. So move it to pagetable.c Tracked-On: #5830 Signed-off-by: Li Fei1 <fei1.li@intel.com>	2021-03-29 13:28:55 +08:00
Li Fei1	ef90bb6db3	hv:pgtable: rename lookup_address to pgtable_lookup_entry lookup_address is used to lookup a pagetable entry by an address. So rename it to pgtable_lookup_entry to indicate this clearly. Tracked-On: #5830 Signed-off-by: Li Fei1 <fei1.li@intel.com> Reviewed-by: Jason Chen CJ <jason.cj.chen@intel.com>	2021-03-29 13:28:55 +08:00
Li Fei1	36ddd87a09	hv: pgtable: remove alloc_ept_page alloc_page/free_page should be called in the pagetable module. In order to do this, we add pgtable_create_root and pgtable_create_trusty_root to create the PML4 page table page for normal world and secure world. After this is done, no one uses alloc_ept_page. So remove it. Tracked-On: #5830 Signed-off-by: Li Fei1 <fei1.li@intel.com> Reviewed-by: Jason Chen CJ <jason.cj.chen@intel.com>	2021-03-29 13:28:55 +08:00
Li Fei1	ea701c63c7	hv: pgtable: add pgtable_create_trusty_root Add pgtable_create_trusty_root to allocate a page for trusty PML4 page table page. This function also copy PDPT entries from Normal world to Secure world. Tracked-On: #5830 Signed-off-by: Li Fei1 <fei1.li@intel.com> Reviewed-by: Jason Chen CJ <jason.cj.chen@intel.com>	2021-03-29 13:28:55 +08:00
Li Fei1	596c349600	hv: pgtable: add pgtable_create_root Add pgtable_create_root to allocate a page for the PML4 page table page. Tracked-On: #5830 Signed-off-by: Li Fei1 <fei1.li@intel.com> Reviewed-by: Jason Chen CJ <jason.cj.chen@intel.com>	2021-03-29 13:28:55 +08:00
Li Fei1	eb52e2193a	hv: pgtable: refine name for pgtable add/modify/del Rename mmu_add to pgtable_add_map; Rename mmu_modify_or_del to pgtable_modify_or_del_map. And move these functions declaration into pgtable.h Tracked-On: #5830 Signed-off-by: Li Fei1 <fei1.li@intel.com> Reviewed-by: Jason Chen CJ <jason.cj.chen@intel.com>	2021-03-29 13:28:55 +08:00
Liang Yi	33ef656462	hv/mod-irq: use arch specific header files Requires explicit arch path name in the include directive. The config scripts was also updated to reflect this change. Tracked-On: #5825 Signed-off-by: Peter Fang <peter.fang@intel.com> Reviewed-by: Jason Chen CJ <jason.cj.chen@intel.com>	2021-03-24 11:38:14 +08:00
Liang Yi	df36da1b80	hv/mod_irq: do not include x86/irq.h in common/irq.h Each .c file includes the arch specific irq header file (with full path) by itself if required. Tracked-On: #5825 Signed-off-by: Jason Chen CJ <jason.cj.chen@intel.com>	2021-03-24 11:38:14 +08:00
Liang Yi	741a208a02	hv/mod_irq: cleanup x86 lapic/ioapic header files Declarations referenced nowhere else are moved into the c file. Tracked-On: #5825 Signed-off-by: Jason Chen CJ <jason.cj.chen@intel.com>	2021-03-24 11:38:14 +08:00
Liang Yi	6f0a7016d3	hv/mod_irq: move IPI declarations out of x86/irq.h They are moved into the new header file x86/notify.h. Tracked-On: #5825 Signed-off-by: Jason Chen CJ <jason.cj.chen@intel.com>	2021-03-24 11:38:14 +08:00
Liang Yi	ff732cfb2a	hv/mod_irq: move guest interrupt API out of x86/irq.h A new x86/guest/virq.h head file now contains all guest related interrupt handling API. Tracked-On: #5825 Signed-off-by: Jason Chen CJ <jason.cj.chen@intel.com>	2021-03-24 11:38:14 +08:00
Liang Yi	6098648373	hv/mod_irq: cleanup x86/irq.h Move exception stack layout struct and exception/NMI handling declarations from x86/irq.h into x86/cpu.h. Tracked-On: #5825 Signed-off-by: Peter Fang <peter.fang@intel.com> Reviewed-by: Jason Chen CJ <jason.cj.chen@intel.com>	2021-03-24 11:38:14 +08:00
Liang Yi	3a50f949e1	hv/mod_irq: split irq.c into arch/x86/irq.c and common/irq.c The common irq file is responsible for managing the central irq_desc data structure and provides the following APIs for host interrupt handling. - init_interrupt() - reserve_irq_num() - request_irq() - free_irq() - set_irq_trigger_mode() - do_irq() API prototypes, constant and data structures belonging to common interrupt handling are all moved into include/common/irq.h. Conversely, the following arch specific APIs are added which are called from the common code at various points: - init_irq_descs_arch() - setup_irqs_arch() - init_interrupt_arch() - free_irq_arch() - request_irq_arch() - pre_irq_arch() - post_irq_arch() Tracked-On: #5825 Signed-off-by: Peter Fang <peter.fang@intel.com> Reviewed-by: Jason Chen CJ <jason.cj.chen@intel.com>	2021-03-24 11:38:14 +08:00
Liang Yi	c46e3c71ac	hv/mod_irq: decouple irq number reservation from ioapic This is done be adding irq_rsvd_bitmap as an auxiliary bitmap besides irq_alloc_bitmap. Tracked-On: #5825 Signed-off-by: Peter Fang <peter.fang@intel.com> Reviewed-by: Jason Chen CJ <jason.cj.chen@intel.com>	2021-03-24 11:38:14 +08:00
Liang Yi	f3cae9e258	hv/mod_irq: hide arch specific data in irq_desc Arch specific IRQ data is now an opaque pointer in irq_desc. This is a preparation step for splitting IRQ handling into common and architecture specific parts. Tracked-On: #5825 Signed-off-by: Peter Fang <peter.fang@intel.com> Reviewed-by: Jason Chen CJ <jason.cj.chen@intel.com>	2021-03-24 11:38:14 +08:00
Li Fei1	9000381f34	hv: pgtable: move pgtable definition to pgtable.h This patch moves pgtable definition to pgtable.h and include the proper header file for page module. Tracked-On: #5830 Signed-off-by: Li Fei1 <fei1.li@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2021-03-11 13:48:52 +08:00
Li Fei1	0278a3f46e	hv: pgtable: move the EPT page table related APIs to ept.c Move the EPT page table related APIs to ept.c. The page module only provides APIs to allocate/free pages for page table pages. The pagetable module only provides APIs to add/modify/delete/lookup page table entries. The page pool and the page table related APIs for EPT should be defined in the EPT module. Tracked-On: #5830 Signed-off-by: Li Fei1 <fei1.li@intel.com> Reviewed-by: Jason Chen CJ <jason.cj.chen@intel.com>	2021-03-11 13:48:52 +08:00
Li Fei1	5c71ca456a	hv: pgtable: move the MMU page table related APIs to mmu.c Move the MMU page table related APIs to mmu.c. The page module only provides APIs to allocate/free pages for page table pages. The pagetable module only provides APIs to add/modify/delete/lookup page table entries. The page pool and the page table related APIs for MMU should be defined in the MMU module. Tracked-On: #5830 Signed-off-by: Li Fei1 <fei1.li@intel.com> Reviewed-by: Jason Chen CJ <jason.cj.chen@intel.com>	2021-03-11 13:48:52 +08:00
Li Fei1	15d68675e9	hv: pgtable: separate common APIs for MMU/EPT We would move the MMU page table related APIs to mmu.c and move the EPT related APIs to EPT.c. The page table module only provides APIs to add/modify/delete/lookup page table entry. This patch separates common APIs and adds separate APIs of page table module for MMU/EPT. Tracked-On: #5830 Signed-off-by: Li Fei1 <fei1.li@intel.com>	2021-03-11 13:48:52 +08:00
Li Fei1	80bd3ac02a	hv: trusty: move post_uos_sworld_memory into vm.c post_uos_sworld_memory are used for post-launched VM which support trusty. It's more VM related. So move it definition into vm.c Tracked-On: #5830 Signed-off-by: Li Fei1 <fei1.li@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2021-03-11 13:48:52 +08:00
Yonghua Huang	1a011bd91b	hv: disable guest MONITOR-WAIT support when SW SRAM is configured Per-core software SRAM L2 cache may be flushed by 'mwait' extension instruction, which guest VM may execute to enter core deep sleep. Such kind of flushing is not expected when software SRAM is enabled for RTVM. Hypervisor disables MONITOR-WAIT support on both hypervisor and VMs sides to protect above software SRAM from being flushed. This patch disable ACRN guest MONITOR-WAIT support if software SRAM is configured. Tracked-On: #5649 Signed-off-by: Yonghua Huang <yonghua.huang@intel.com> Reviewed-by: Fei Li <fei1.li@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2021-03-11 09:42:44 +08:00
Yonghua Huang	ae43b2a847	hv: disable host MONITOR-WAIT support when SW SRAM is enabled Per-core software SRAM L2 cache may be flushed by the 'mwait' extension instruction, which a guest VM may execute to enter core deep sleep. This kind of flushing is not expected when software SRAM is enabled for RTVM. Hypervisor disables MONITOR-WAIT support on both the hypervisor and VM sides to protect the above software SRAM from being flushed. This patch disables hypervisor (host) MONITOR-WAIT support and refines the software SRAM initialization flow. Tracked-On: #5649 Signed-off-by: Yonghua Huang <yonghua.huang@intel.com> Reviewed-by: Fei Li <fei1.li@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2021-03-11 09:42:44 +08:00
Yonghua Huang	ea44bb6c4d	hv: wrap function to check software SRAM support Below boolean function are defined in this patch: - is_software_sram_enabled() to check if SW SRAM feature is enabled or not. - set global variable 'is_sw_sram_initialized' to file static. Tracked-On: #5649 Signed-off-by: Yonghua Huang <yonghua.huang@intel.com> Reviewed-by: Fei Li <fei1.li@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2021-03-11 09:42:44 +08:00
Li Fei1	768e483cd2	hv: pgtable: rename 'struct memory_ops' to 'struct pgtable' The fields and APIs in old 'struct memory_ops' are used to add/modify/delete page table (page or entry). So rename 'struct memory_ops' to 'struct pgtable'. Tracked-On: #5830 Signed-off-by: Li Fei1 <fei1.li@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2021-03-10 11:42:13 +08:00
Li Fei1	ef98fa69ce	hv: pgtable: remove get_default_access_right API Use default_access_right field to replace get_default_access_right API. Tracked-On: #5830 Signed-off-by: Li Fei1 <fei1.li@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2021-03-10 11:42:13 +08:00
Li Fei1	1db32f4d03	hv: ept: build 4KB page mapping in EPT for code pages of rtvm RTVM is enforced to use 4KB pages to mitigate CVE-2018-12207 and performance jitter, which may be introduced by splitting large page into 4KB pages on demand. It works fine in previous hardware platform where the size of address space for the RTVM is relatively small. However, this is a problem when the platforms support 64 bits high MMIO space, which could be super large and therefore consumes large # of EPT page table pages. This patch optimize it by using large page for purely data pages, such as MMIO spaces, even for the RTVM. Signed-off-by: Li Fei1 <fei1.li@intel.com> Tracked-On: #5788	2021-03-03 13:46:49 +08:00
Li Fei1	0579e2ee24	hv: page: add free_page Add free_page to free page when unmap pagetable. Signed-off-by: Li Fei1 <fei1.li@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com> Tracked-On: #5788	2021-03-01 13:10:04 +08:00
Li Fei1	8d9f12f3b7	hv: page: use dynamic page allocation for pagetable mapping For FuSa's case, we remove all dynamic memory allocation use in ACRN HV. Instead, we use static memory allocation or embedded data structures. For pagetable pages, we prefer to use an index (hva for MMU, gpa for EPT) to get a page from a special page pool. The special page pool should be big enough for each possible index. This is not a big problem when we don't support 64-bit MMIO. Without 64-bit MMIO support, we could use the index to search addresses not larger than DRAM_SIZE + 4G. However, if ACRN plans to support 64-bit MMIO in SOS, we can no longer use static memory allocation. This is because there's a very large hole between the top DRAM address and the bottom 64-bit MMIO address. We could not reserve that many pages for pagetable mapping, as the CPU physical address width may be very large. This patch uses dynamic page allocation for pagetable mapping. We also need to reserve a big enough page pool at first. For HV MMU, we don't use 4K granularity page table mapping, so we need to reserve PML4, PDPT and PD pages according to the maximum physical address space (PPT va and pa are identity mapped); for each VM EPT, we reserve PML4, PDPT and PD pages according to the maximum physical address space too (the EPT address space can't go beyond the physical address space), and we reserve PT pages by real use cases of DRAM, low MMIO and high MMIO. Signed-off-by: Li Fei1 <fei1.li@intel.com> Tracked-On: #5788	2021-03-01 13:10:04 +08:00
Li Fei1	5621fabbcb	hv: memory: remove get_sworld_memory_base API memory_ops structure will be changed to store page table related fields. However, secure world memory base address is not one of them, it's VM related. So save sworld_memory_base_hva in vm_arch structure directly. Signed-off-by: Li Fei1 <fei1.li@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com> Tracked-On: #5788	2021-03-01 13:10:04 +08:00
Yonghua Huang	fdfd28b140	hv: unmap software region of pre-RTVM from Service VM EPT Accessing to software SRAM region is not allowed when software SRAM is pass-thru to prelaunch RTVM. This patch removes software SRAM region from service VM EPT if it is enabled for prelaunch RTVM. Tracked-On: #5649 Signed-off-by: Yonghua Huang <yonghua.huang@intel.com>	2021-02-25 09:35:31 +08:00
Sainath Grandhi	80a91987f4	hv: Fix incorrect struct definition for ir_bits Fixing an incorrect struct definition for ir_bits in ioapic_rte. Since bits after the delivery status in the lower 32 bits are not touched by code, this has never showed up as an issue. And the higher 32 bits in the RTE are aligned by the compiler. Tracked-On: #5773 Signed-off-by: Sainath Grandhi <sainath.grandhi@intel.com>	2021-02-25 09:34:49 +08:00
Shuo A Liu	d4aaf99d86	hv: keylocker: Support keylocker backup MSRs for Guest VM The logical processor scoped IWKey can be copied to or from a platform-scope storage copy called IWKeyBackup. Copying IWKey to IWKeyBackup is called ‘backing up IWKey’ and copying from IWKeyBackup to IWKey is called ‘restoring IWKey’. IWKeyBackup and the path between it and IWKey are protected against software and simple hardware attacks. This means that IWKeyBackup can be used to distribute an IWKey within the logical processors in a platform in a protected manner. Linux keylocker implementation uses this feature, so they are introduced by this patch. Tracked-On: #5695 Signed-off-by: Shuo A Liu <shuo.a.liu@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2021-02-03 13:54:45 +08:00
Shuo A Liu	38cd5b481d	hv: keylocker: host keylocker iwkey context switch Different vCPU may have different IWKeys. Hypervisor need do the iwkey context switch. This patch introduce a load_iwkey() function to do that. Switches the host iwkey when the switch_in vCPU satisfies: 1) keylocker feature enabled 2) Different from the current loaded one. Two opportunities to do the load_iwkey(): 1) Guest enables CR4.KL bit. 2) vCPU thread context switch. load_iwkey() costs ~600 cycles when do the load IWKey action. Tracked-On: #5695 Signed-off-by: Shuo A Liu <shuo.a.liu@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2021-02-03 13:54:45 +08:00
Shuo A Liu	c11c07e0fe	hv: keylocker: Support Key Locker feature for guest VM KeyLocker is a new security feature available in new Intel CPUs that protects data-encryption keys for the Advanced Encryption Standard (AES) algorithm. These keys are more valuable than what they guard. If stolen once, the key can be repeatedly used even on another system and even after the vulnerability is closed. It also introduces a CPU-internal wrapping key (IWKey), which is a key-encryption key to wrap AES keys into handles. While the IWKey is inaccessible to software, randomizing the value during boot helps keep its value unpredictable. Keylocker usage: - New “ENCODEKEY” instructions take the original key as input and return a HANDLE encrypted by an internal wrap key (IWKey, initialized by the “LOADIWKEY” instruction) - Software can then delete the original key from memory - Early in boot/software, less likely to have vulnerability that allows stealing original key - Later encrypt/decrypt can use the HANDLE through new AES KeyLocker instructions - Note: * Software can use original key without knowing it (use HANDLE) * HANDLE cannot be used on other systems or after warm/cold reset * IWKey cannot be read from CPU after it's loaded (this is the nature of this feature) and there is only one copy of the IWKey inside the CPU. The virtualization implementation of Key Locker on ACRN is: - Each vCPU has a 'struct iwkey' to store its IWKey in struct acrn_vcpu_arch. - At initialization, every vCPU is created with a random IWKey. - Hypervisor traps the execution of LOADIWKEY (by 'LOADIWKEY exiting' VM-execution control) of vCPU to capture and save the IWKey if the guest sets a new IWKey. Randomization of LOADIWKEY is not supported (CPUID is emulated to disable it), as the hypervisor cannot capture and save the random IWKey. From keylocker spec: "Note that a VMM may wish to enumerate no support for HW random IWKeys to the guest (i.e. enumerate CPUID.19H:ECX[1] as 0) as such IWKeys cannot be easily context switched. 
A guest ENCODEKEY will return the type of IWKey used (IWKey.KeySource) and thus will notice if a VMM virtualized a HW random IWKey with a SW specified IWKey." - In context_switch_in() of each vCPU, hypervisor loads that vCPU's IWKey into pCPU by LOADIWKEY instruction. - There is an assumption that ACRN hypervisor will never use the KeyLocker feature itself. This patch implements the vCPU's IWKey management and the next patch implements host context save/restore IWKey logic. Tracked-On: #5695 Signed-off-by: Shuo A Liu <shuo.a.liu@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2021-02-03 13:54:45 +08:00
Shuo A Liu	4483e93bd1	hv: keylocker: Enable the tertiary VM-execution controls In order for a VMM to capture the IWKey values of guests, processors that support Key Locker also support a new "LOADIWKEY exiting" VM-execution control in bit 0 of the tertiary processor-based VM-execution controls. This patch enables the tertiary VM-execution controls. Tracked-On: #5695 Signed-off-by: Shuo A Liu <shuo.a.liu@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2021-02-03 13:54:45 +08:00
Shuo A Liu	e9247dbca0	hv: keylocker: Simulate CPUID of keylocker caps for guest VM KeyLocker is a new security feature available in new Intel CPUs that protects data-encryption keys for the Advanced Encryption Standard (AES) algorithm. This patch emulates Keylocker CPUID leaf 19H to support Keylocker feature for guest VM. To make the hypervisor being able to manage the IWKey correctly, this patch doesn't expose hardware random IWKey capability (CPUID.0x19.ECX[1]) to guest VM. Tracked-On: #5695 Signed-off-by: Shuo A Liu <shuo.a.liu@intel.com> Acked-by: Eddie Dong <eddie.dong@Intel.com>	2021-02-03 13:54:45 +08:00
Shuo A Liu	15c967ad34	hv: keylocker: Add CR4 bit CR4_KL as CR4_TRAP_AND_PASSTHRU_BITS Bit19 (CR4_KL) of CR4 is CPU KeyLocker feature enable bit. Hypervisor traps the bit's writing to track the keylocker feature on/off of guest. While the bit is set by guest, - set cr4_kl_enabled to indicate the vcpu's keylocker feature enabled status - load vcpu's IWKey in host (will add in later patch) While the bit is clear by guest, - clear cr4_kl_enabled This patch trap and passthru the CR4_KL bit to guest for operation. Tracked-On: #5695 Signed-off-by: Shuo A Liu <shuo.a.liu@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2021-02-03 13:54:45 +08:00
Li Fei1	94a980c923	hv: hypercall: prevent sos can touch hv/pre-launched VM resource Current implementation, SOS may allocate the memory region belonging to hypervisor/pre-launched VM to a post-launched VM. Because it only verifies the start address rather than the entire memory region. This patch verifies the validity of the entire memory region before allocating to a post-launched VM so that the specified memory can only be allocated to a post-launched VM if the entire memory region is mapped in SOS’s EPT. Tracked-On: #5555 Signed-off-by: Li Fei1 <fei1.li@intel.com> Reviewed-by: Yonghua Huang <yonghua.huang@intel.com>	2021-02-02 16:55:40 +08:00
Yonghua Huang	8bec63a6ea	hv: remove the hardcoding of Software SRAM GPA base Currently, we hardcode the GPA base of Software SRAM to an address that is derived from the TGL platform, as this GPA is identical to the HPA for the Pre-launch VM. This hardcoded address may not work on other platforms if the HPA bases of Software SRAM are different. Now, the offline tool configures the above GPA based on the detection of Software SRAM on the specific platform. This patch removes the hardcoded GPA of Software SRAM, and also renames MACRO 'SOFTWARE_SRAM_BASE_GPA' to 'PRE_RTVM_SW_SRAM_BASE_GPA' to avoid confusion, as it is for Prelaunch VM only. Tracked-On: #5649 Signed-off-by: Yonghua Huang <yonghua.huang@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2021-01-30 13:41:02 +08:00
Yonghua Huang	a6e666dbe7	hv: remove hardcoding of SW SRAM HPA base The physical address of the SW SRAM region may be different on different platforms, so this hardcoded address may result in an address mismatch for SW SRAM operations. This patch removes the above hardcoded address and uses the physical address parsed from the native RTCT. Tracked-On: #5649 Signed-off-by: Yonghua Huang <yonghua.huang@intel.com> Reviewed-by: Fei Li <fei1.li@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2021-01-28 11:29:25 +08:00
Yonghua Huang	a6420e8cfa	hv: cleanup legacy terminologies in RTCM module This patch updates below terminologies according to the latest TCC Spec: PTCT -> RTCT PTCM -> RTCM pSRAM -> Software SRAM Tracked-On: #5649 Signed-off-by: Yonghua Huang <yonghua.huang@intel.com>	2021-01-28 11:29:25 +08:00
Yonghua Huang	806f479108	hv: rename RTCM source files 'ptcm' and 'ptct' are legacy names according to the latest TCC spec, hence rename the files below to avoid confusion: ptcm.c -> rtcm.c ptcm.h -> rtcm.h ptct.h -> rtct.h Tracked-On: #5649 Signed-off-by: Yonghua Huang <yonghua.huang@intel.com>	2021-01-28 11:29:25 +08:00
Liang Yi	1de396363f	hv: modularization: avoid dependency of multiboot on zeropage.h. Split off definition of "struct efi_info" into a separate header file lib/efi.h. Tracked-On: #5661 Signed-off-by: Jason Chen CJ <jason.cj.chen@intel.com> Reviewed-by: Jason Chen CJ <jason.cj.chen@intel.com>	2021-01-27 15:59:47 +08:00
Liang Yi	8f9ec59a53	hv: modularization: cleanup boot.h Move multiboot specific declarations from boot.h to multiboot.h. Tracked-On: #5661 Signed-off-by: Vijay Dhanraj <vijay.dhanraj@intel.com> Reviewed-by: Jason Chen CJ <jason.cj.chen@intel.com>	2021-01-27 15:59:47 +08:00
Jie Deng	8aebf5526f	hv: move split-lock logic into dedicated file This patch moves the split-lock logic into a dedicated file to reduce LOC. This may make the logic clearer. Tracked-On: #5605 Signed-off-by: Jie Deng <jie.deng@intel.com>	2021-01-08 17:37:20 +08:00
Jie Deng	27d5711b62	hv: add a cache register for VMX_PROC_VM_EXEC_CONTROLS This patch adds a cache register for VMX_PROC_VM_EXEC_CONTROLS to avoid the frequent VMCS access. Tracked-On: #5605 Signed-off-by: Jie Deng <jie.deng@intel.com>	2021-01-08 17:37:20 +08:00
Jie Deng	977e862192	hv: Add split-lock emulation for xchg xchg may also cause the #AC for split-lock check. This patch adds this emulation. 1. Kick other vcpus of the guest to stop execution if the guest has more than one vcpu. 2. Emulate the xchg instruction. 3. Notify other vcpus (if any) to restart execution. Tracked-On: #5605 Signed-off-by: Jie Deng <jie.deng@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-12-31 11:12:33 +08:00
Jie Deng	47e193a7bb	hv: Add split-lock emulation for LOCK prefix instruction This patch adds the split-lock emulation. If a #AC is caused by an instruction with a LOCK prefix, emulate it; otherwise, inject it back as before. 1. Kick other vcpus of the guest to stop execution and set the TF flag to have #DB if the guest has more than one vcpu. 2. Skip over the LOCK prefix and resume the current vcpu back to guest for execution. 3. Notify other vcpus to restart execution at the end of handling the #DB since we have completed the LOCK prefix instruction emulation. Tracked-On: #5605 Signed-off-by: Jie Deng <jie.deng@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-12-31 11:12:33 +08:00
Yonghua Huang	643bbcfe34	hv: check the availability of guest CR4 features Check hardware support for all features in CR4, and hide bits from guest by vcpuid if they're not supported for guests OS. Tracked-On: #5586 Signed-off-by: Yonghua Huang <yonghua.huang@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-12-18 11:21:22 +08:00
Yonghua Huang	442fc30117	hv: refine virtualization flow for cr0 and cr4 - The current code to virtualize CR0/CR4 is not well designed, and hard to read. This patch reshuffle the logic to make it clear and classify those bits into PASSTHRU, TRAP_AND_PASSTHRU, TRAP_AND_EMULATE & reserved bits. Tracked-On: #5586 Signed-off-by: Eddie Dong <eddie.dong@intel.com> Signed-off-by: Yonghua Huang <yonghua.huang@intel.com>	2020-12-18 11:21:22 +08:00
Peter Fang	68dc8d9f8f	hv: pm: avoid duplicate shutdowns on RTVM It is possible for more than one vCPUs to trigger shutdown on an RTVM. We need to avoid entering VM_READY_TO_POWEROFF state again after the RTVM has been paused or shut down. Also, make sure an RTVM enters VM_READY_TO_POWEROFF state before it can be paused. v1 -> v2: - rename to poweroff_if_rt_vm for better clarity Tracked-On: #5411 Signed-off-by: Peter Fang <peter.fang@intel.com>	2020-11-11 14:05:39 +08:00
dongshen	ca5683f78d	hv: add support for shutdown for pre-launched VMs Currently, ACRN only support shutdown when triple fault happens, because ACRN doesn't present/emulate a virtual HW, i.e. port IO, to support shutdown. This patch emulate a virtual shutdown component, and the vACPI method for guest OS to use. Pre-launched VM uses ACPI reduced HW mode, intercept the virtual sleep control/status registers for pre-launched VMs shutdown Tracked-On: #5411 Signed-off-by: dongshen <dongsheng.x.zhang@intel.com>	2020-11-04 10:33:31 +08:00
Peter Fang	70b1218952	hv: pm: support shutting down multiple VMs when pCPUs are shared More than one VM may request shutdown on the same pCPU before shutdown_vm_from_idle() is called in the idle thread when pCPUs are shared among VMs. Use a per-pCPU bitmap to store all the VMIDs requesting shutdown. v1 -> v2: - use vm_lock to avoid a race on shutdown Tracked-On: #5411 Signed-off-by: Peter Fang <peter.fang@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-11-04 10:33:31 +08:00
Li Fei1	f3067f5385	hv: mmu: rename hv_access_memory_region_update to ppt_clear_user_bit Rename hv_access_memory_region_update to ppt_clear_user_bit to verb + object style. Tracked-On: #5330 Signed-off-by: Li Fei1 <fei1.li@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-11-02 10:29:43 +08:00
Li Fei1	35abee60d6	hv: pSRAM: temporarily remove NX bit of PTCM binary Temporarily remove NX bit of PTCM binary in pagetable during pSRAM initialization: 1. Added a function ppt_set_nx_bit to temporarily remove/restore the NX bit of a given area in pagetable. 2. Temporarily remove NX bit of PTCM binary during pSRAM initialization to make PTCM code executable. 3. TODO: We may use SMP call to flush TLB and do pSRAM initialization on APs. Tracked-On: #5330 Signed-off-by: Qian Wang <qian1.wang@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-11-02 10:29:43 +08:00
Li Fei1	5fa816f921	hv: pSRAM: add PTCT parsing code The added parse_ptct function will parse native ACPI PTCT table to acquire information like pSRAM location/size/level and PTCM location, and save them. Tracked-On: #5330 Signed-off-by: Qian Wang <qian1.wang@intel.com>	2020-11-02 10:29:43 +08:00
Li Fei1	80121b8347	hv: pSRAM: add pSRAM initialization codes 1.We added a function init_psram to initialize pSRAM as well as some definitions. Both AP and BSP shall call init_psram to make sure pSRAM is initialized, which is required by PTCM. BSP: To parse PTCT and find the entry of PTCM command function, then call PTCM ABI. AP: Wait until BSP has done the parsing work, then call the PTCM ABI. Synchronization of AP and BSP is ensured, both inside and outside PTCM. 2. Added calls of init_psram in init_pcpu_post to initialize pSRAM in HV booting phase Tracked-On: #5330 Signed-off-by: Qian Wang <qian1.wang@intel.com>	2020-11-02 10:29:43 +08:00
Tao Yuhong	4120bd391a	HV: decouple legacy vuart interface from acrn_vuart layer support pci-vuart type, and refine: 1.Rename init_vuart() to init_legacy_vuarts(), only init PIO type. 2.Rename deinit_vuart() to deinit_legacy_vuarts(), only deinit PIO type. 3.Move io handler code out of setup_vuart(), into init_legacy_vuarts() 4.add init_pci_vuart(), deinit_pci_vuart, for one pci vuart vdev. and some change from requirement: 1.Increase MAX_VUART_NUM_PER_VM to 8. Tracked-On: #5394 Signed-off-by: Tao Yuhong <yuhong.tao@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com> Reviewed-by: Wang, Yu1 <yu1.wang@intel.com>	2020-10-30 20:41:34 +08:00
Yang, Yu-chu	8c78590da7	acrn-config: refactor pci_dev_c.py and insert vuart device information - Refactor pci_dev_c.py to insert device information per VM - Add function to get unused vbdf from bus:dev.func 00:00.0 to 00:1F.7 Add pci devices variables to vm_configurations.c - To pass the pci vuart information from tool, add pci_dev_num and pci_devs initialization by tool - Change CONFIG_SOS_VM in hypervisor/include/arch/x86/vm_config.h to compromise vm_configurations.c Tracked-On: #5426 Signed-off-by: Yang, Yu-chu <yu-chu.yang@intel.com>	2020-10-30 20:24:28 +08:00
David B. Kinder	bb6b226c86	doc: fix doxygen 1.8.17 issues The new (1.8.17) release of doxygen is complaining about errors in the doxygen comments that weren't reported by our current 1.8.13 release. Let's fix these now. In a separate PR we'll also update some configuration settings that will be obsolete, in preparation for moving to this newer version. [External_System_ID]ACRN-6774 Tracked-On: #5385 Signed-off-by: David B. Kinder <david.b.kinder@intel.com>	2020-10-29 08:25:01 -07:00
Zide Chen	a776ccca94	hv: don't need to save boot context - Since de-privilege boot is removed, we no longer need to save boot context in boot time. - cpu_primary_start_64 is not an entry for ACRN hypervisor any more, and can be removed. Tracked-On: #5197 Signed-off-by: Zide Chen <zide.chen@intel.com> Reviewed-by: Jason Chen CJ <jason.cj.chen@intel.com>	2020-10-29 10:05:05 +08:00
Yonghua Huang	3ea1ae1e11	hv: refine msi interrupt injection functions 1. refine the prototype of 'inject_msi_lapic_pt()' 2. rename below function: - rename 'vlapic_intr_msi()' to 'vlapic_inject_msi()' - rename 'inject_msi_lapic_pt()' to 'inject_msi_for_lapic_pt()' - rename 'inject_msi_lapic_virt()' to 'inject_msi_for_non_lapic_pt()' Tracked-On: #5407 Signed-off-by: Yonghua Huang <yonghua.huang@intel.com> Reviewed-by: Li Fei <fei1.li@intel.com> Reviewed-by: Wang, Yu1 <yu1.wang@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-10-26 08:44:13 +08:00
Yonghua Huang	012927d0bd	hv: move function 'inject_msi_lapic_pt()' to vlapic.c This function can be used by other modules instead of hypercall handling only, hence move it to vlapic.c Tracked-On: #5407 Signed-off-by: Yonghua Huang <yonghua.huang@intel.com> Reviewed-by: Li, Fei <fei1.li@intel.com> Reviewed-by: Wang, Yu1 <yu1.wang@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-10-26 08:44:13 +08:00
Zide Chen	bebffb29fc	hv: remove de-privilege boot mode support and remove vboot wrappers Now ACRN supports direct boot mode, which could be SBL/ABL, or GRUB boot. Thus the vboot wrapper layer can be removed and the direct boot functions don't need to be wrapped in direct_boot.c: - remove call to init_vboot(), and call e820_alloc_memory() directly at the time when the trampoline buffer is actually needed. - Similarly, call CPU_IRQ_ENABLE() instead of the wrapper init_vboot_irq(). - remove get_ap_trampoline_buf(), since the existing function get_trampoline_start16_paddr() returns the exact same value. - merge init_general_vm_boot_info() into init_vm_boot_info(). - remove vm_sw_loader pointer, and call direct_boot_sw_loader() directly. - move get_rsdp_ptr() from vboot_wrapper.c to multiboot.c, and remove the wrapper over two boot modes. Tracked-On: #5197 Signed-off-by: Zide Chen <zide.chen@intel.com>	2020-10-21 15:09:26 +08:00
Victor Sun	34547e1e19	HV: add acpi module support for pre-launched VM Previously we use a pre-defined structure as vACPI table for pre-launched VM, the structure is initialized by HV code. Now change the method to use a pre-loaded multiboot module instead. The module file will be generated by acrn-config tool and loaded to GPA 0x7ff00000, a hardcoded RSDP table at GPA 0x000f2400 will point to the XSDT table which at GPA 0x7ff00080; Tracked-On: #5266 Signed-off-by: Victor Sun <victor.sun@intel.com> Signed-off-by: Shuang Zheng <shuang.zheng@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-09-08 19:52:25 +08:00
Nishioka, Toshiki	ba99984f69	hv: add INTx mapping for pre-launched VMs Add the capability of forwarding specified physical IOAPIC interrupt lines to pre-launched VMs as virtual IOAPIC interrupts. This is for the sake of the certain MMIO pass-thru devices on EHL CRB which can support only INTx interrupts. Tracked-On: #5245 Signed-off-by: Toshiki Nishioka <toshiki.nishioka@intel.com> Reviewed-by: Junjie Mao <junjie.mao@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-09-07 14:52:02 +08:00
dongshen	3880e6186e	hv: add pt_intx related members to struct acrn_vm_config On EHL platform, we need to expose GPIO chassis interrupt to pre-launched VM as INTx. Add related data structures so that they can be used in subsequent commits. Tracked-On: #5241 Signed-off-by: dongshen <dongsheng.x.zhang@intel.com>	2020-09-01 09:35:50 +08:00
dongshen	10d4773f1d	hv: add a new field pt_p2sb_bar to struct acrn_vm_config On EHL platform, we need to pass through P2SB bridge to pre-launched VM. Use pt_p2sb_bar to indicate whether to passthru p2sb bridge to pre-launched VM or not. Tracked-On: #5221 Signed-off-by: dongshen <dongsheng.x.zhang@intel.com>	2020-09-01 09:35:50 +08:00
Yonghua Huang	c03623f3fb	hv[v2]: Remove deprecated term in vPIC submodule This patch cleanup below deprecated terms: 'master' -> 'primary' 'slave' -> 'secondary' v2 update: Refine comments. Tracked-On: #5249 Signed-off-by: Yonghua Huang <yonghua.huang@intel.com>	2020-09-01 09:30:08 +08:00
Yuan Liu	6d0f0ebd8a	hv: implement ivshmem device creation and destruction For ivshmem vdev creation, the vdev vBDF, vBARs, shared memory region name and size are set by device model. The shared memory name and size must be same as the corresponding device configuration which is configured by offline tool. v3: add a comment to the vbar_base member of the acrn_vm_pci_dev_config structure that vbar_base is power-on default value Tracked-On: #4853 Signed-off-by: Yuan Liu <yuan1.liu@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-08-28 16:53:12 +08:00
Wei Liu	29ac258134	acrn-config: code refactoring for CAT/MBA 1.Modify clos_mask and mba_delay as a member of the union type. 2.Move HV_SUPPORTED_MAX_CLOS ,MAX_CACHE_CLOS_NUM_ENTRIES and MAX_MBA_CLOS_NUM_ENTRIES to misc_cfg.h file. Tracked-On: #5229 Signed-off-by: Wei Liu <weix.w.liu@intel.com> Signed-off-by: dongshen <dongsheng.x.zhang@intel.com>	2020-08-28 16:44:06 +08:00
dongshen	a425730f64	acrn-config: rename MAX_PLATFORM_CLOS_NUM to HV_SUPPORTED_MAX_CLOS HV_SUPPORTED_MAX_CLOS: This value represents the maximum CLOS that is allowed by ACRN hypervisor. This value is set to be least common Max CLOS (CPUID.(EAX=0x10,ECX=ResID):EDX[15:0]) among all supported RDT resources in the platform. In other words, it is min(maximum CLOS of L2, L3 and MBA). This is done in order to have consistent CLOS allocations between all the RDT resources. Tracked-On: #5229 Signed-off-by: dongshen <dongsheng.x.zhang@intel.com>	2020-08-28 16:44:06 +08:00
Mingqiang Chi	53b11d1048	refine hypercall -- use an array to fast locate the hypercall handler to replace switch case. -- uniform hypercall handler as below: int32_t (*handler)(sos_vm, target_vm, param1, param2) Tracked-On: #4958 Signed-off-by: Mingqiang Chi <mingqiang.chi@intel.com> Reviewed-by: Jason Chen CJ <jason.cj.chen@intel.com> Reviewed-by: Eddie Dong <eddie.dong@intel.com>	2020-08-26 14:55:24 +08:00
Shuang Zheng	c26ae8c420	hv: Inter-VM communication config for hybrid_rt on whl-ipc-i5 Add an IVSHMEM region and the related configuration parameters in hybrid_rt scenario on whl-ipc-i5. The size of the shared memory is 2M, and it is used for the communication between VM0 and VM2. v6: rename shm name; remove unnecessary MACROs. v7: rename MACRO for shm name; add unassigned vbdf for post-launched VMs. Tracked-On: #4853 Signed-off-by: Shuang Zheng <shuang.zheng@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-08-19 15:06:15 +08:00
Wei Liu	088cd62d8b	HV: sync hv reference code that generated by config tool Sync hv reference code that generated by acrn-config tool. Tracked-On: #5092 Signed-off-by: Wei Liu <weix.w.liu@intel.com>	2020-08-17 14:34:30 +08:00
Junming Liu	23d9c13c41	hv:cpuid:refine cpuid_subleaf interface There's a corner case: to get the CPUID.01H:EDX value, a caller may write the following code snippet: uint32_t unused,edx; cpuid_subleaf(0x1U, 0x0U, &unused, &unused, &unused, &edx); while in cpuid_subleaf: eax = leaf; ecx = subleaf; Here eax and ecx point to the same location, so by the time asm_cpuid runs, its input values are 0x0U and 0x0U, but the expected input values are 0x1U and 0x0U. This case will return CPUID.00H:EDX, which is the wrong answer. Tracked-On: #4526 Signed-off-by: Junming Liu <junming.liu@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-08-17 10:14:00 +08:00
Junming Liu	3631a85c3c	hv:cpu-caps:refine is_apl_platform func and clean up duplicated code Fix the bug for "is_apl_platform" func. "monitor_cap_buggy" is identical to "is_apl_platform", so remove it. On apl platform: 1) ACRN doesn't use monitor/mwait instructions 2) ACRN disable GPU IOMMU Tracked-On:#3675 Signed-off-by: Junming Liu <junming.liu@intel.com>	2020-08-14 10:08:50 +08:00
liujunming	538e7cf74d	hv:cpu-caps:refine processor family and model info v3 -> v4: Refine commit message and code style 1. SDM Vol. 2A 3-211 states DisplayFamily = Extended_Family_ID + Family_ID when Family_ID == 0FH. So it should be family += ((eax >> 20U) & 0xffU) when Family_ID == 0FH. 2. IF (Family_ID = 06H or Family_ID = 0FH) THEN DisplayModel = (Extended_Model_ID << 4) + Model_ID; while the previous code used this logic: IF (DisplayFamily = 06H or DisplayFamily = 0FH). Fix the bug in the calculation of display family and display model according to the SDM definition. 3. Use variable names to distinguish Family ID/Display Family/Model ID/Display Model, so the code is clearer and less error-prone. Tracked-On:#3675 Signed-off-by: liujunming <junming.liu@intel.com> Reviewed-by: Wu Xiangyang <xiangyang.wu@linux.intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-08-14 10:08:50 +08:00
Victor Sun	8245145317	HV: remove sanitize_vm_config function Remove function of sanitize_vm_config() since the processing of sanitizing will be moved to pre-build process. When hypervisor has booted, we assume all VM configurations is sanitized; Tracked-On: #5077 Signed-off-by: Victor Sun <victor.sun@intel.com> Reviewed-by: Jason Chen CJ <jason.cj.chen@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-08-12 10:21:17 +08:00
Mingqiang Chi	a67a85c70d	hv:refine vm & vcpu lock -- move vm_state_lock to other place in vm structure to avoid the memory waste because of the page-aligned. -- remove the memset from create_vm -- explicitly set max_emul_mmio_regions and vcpuid_entry_nr to 0 inside create_vm to avoid use without initialization. -- rename max_emul_mmio_regions to nr_emul_mmio_regions v1->v2: add deinit_emul_io in shutdown_vm Tracked-On: #4958 Signed-off-by: Mingqiang Chi <mingqiang.chi@intel.com> Reviewed-by: Grandhi, Sainath <sainath.grandhi@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-08-05 13:39:28 +08:00
Victor Sun	a57a4fd7fb	HV: Make: enable build for new configs layout The make command is same as old configs layout: under acrn-hypervisor folder: make hypervisor BOARD=xxx SCENARIO=xxx [TARGET_DIR]=xxx [RELEASE=x] under hypervisor folder: make BOARD=xxx SCENARIO=xxx [TARGET_DIR]=xxx [RELEASE=x] if BOARD/SCENARIO parameter is not specified, the default will be: BOARD=nuc7i7dnb SCENARIO=industry Tracked-On: #5077 Signed-off-by: Victor Sun <victor.sun@intel.com> Reviewed-by: Jason Chen CJ <jason.cj.chen@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-07-24 16:16:06 +08:00
Victor Sun	e792fa3d3c	HV: nuc7i7dnb example of new VM configurations layout There are 3 kinds of configurations in ACRN hypervisor source code: hypervisor overall setting, per-board setting and scenario specific per-VM setting. Currently Kconfig acts as hypervisor overall setting and its source is located at "hypervisor/arch/x86/configs/$(BOARD).config"; Per-board configs are located at "hypervisor/arch/x86/configs/$(BOARD)" folder; scenario specific per-VM configs are located at "hypervisor/scenarios/$(SCENARIO)" folder. This layout brings issues that board configs and VM configs are coupled tightly. The board specific Kconfig file and misc_cfg.h are shared by all scenarios, and scenario specific pci_dev.c is shared by all boards. So users have no way to build hypervisor binary for different scenario on different board with one source code repo. The patch will setup a new VM configurations layout as below: misc/vm_configs ├── boards --> folder of supported boards │ ├── <board_1> --> scenario-irrelevant board configs │ │ ├── board.c --> C file of board configs │ │ ├── board_info.h --> H file of board info │ │ ├── pci_devices.h --> pBDF of PCI devices │ │ └── platform_acpi_info.h --> native ACPI info │ ├── <board_2> │ ├── <board_3> │ └── <board...> └── scenarios --> folder of supported scenarios ├── <scenario_1> --> scenario specific VM configs │ ├── <board_1> --> board specific VM configs for <scenario_1> │ │ ├── <board_1>.config --> Kconfig for specific scenario on specific board │ │ ├── misc_cfg.h --> H file of board specific VM configs │ │ ├── pci_dev.c --> board specific VM pci devices list │ │ └── vbar_base.h --> vBAR base info of VM PT pci devices │ ├── <board_2> │ ├── <board_3> │ ├── <board...> │ ├── vm_configurations.c --> C file of scenario specific VM configs │ └── vm_configurations.h --> H file of scenario specific VM configs ├── <scenario_2> ├── <scenario_3> └── <scenario...> The new layout would decouple board configs and VM configs completely: The 
boards folder stores info for the supported boards; each board folder stores scenario-irrelevant board configs only, which can be obtained entirely from a physical platform and work for all scenarios. The scenarios folder stores VM configs for the working scenarios. In each scenario folder, besides the generic scenario specific VM configs, the board specific VM configs are put in an embedded board folder. In the new layout, all config files are moved out of the hypervisor folder into a separate folder. This makes the hypervisor LoC calculation more precise, with the formula below: typical LoC = LoC(hypervisor) + LoC(one vm_configs), where LoC(one vm_configs) = LoC(misc/vm_configs/boards/<board>) + LoC(misc/vm_configs/scenarios/<scenario>/<board>) + LoC(misc/vm_configs/scenarios/<scenario>/vm_configurations.c) + LoC(misc/vm_configs/scenarios/<scenario>/vm_configurations.h) Tracked-On: #5077 Signed-off-by: Victor Sun <victor.sun@intel.com> Reviewed-by: Jason Chen CJ <jason.cj.chen@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-07-24 16:16:06 +08:00
Victor Sun	8bcab8e294	HV: add VM uuid and type for pre-launched RTVM add VM UUID and CONFIG_XX_VM() api for pre-launched RTVM; Tracked-On: #5081 Signed-off-by: Victor Sun <victor.sun@intel.com>	2020-07-23 21:58:32 +08:00
Shuo A Liu	112f02851c	hv: Disable XSAVE-managed CET state of guest VM To hide the CET feature from guest VM completely, the MSR IA32_MSR_XSS also needs to be intercepted because it comprises the CET_U and CET_S feature bits of xsaves/xrstors operations. Mask these two bits in IA32_MSR_XSS writing. With IA32_MSR_XSS interception, member 'xss' of 'struct ext_context' can be removed because it duplicates the MSR store array 'vcpu->arch.guest_msrs[]'. Tracked-On: #5074 Signed-off-by: Shuo A Liu <shuo.a.liu@intel.com> Reviewed-by: Jason Chen CJ <jason.cj.chen@intel.com>	2020-07-23 20:15:57 +08:00
Shuo A Liu	ac598b0856	hv: Hide CET feature from guest VM Return-oriented programming (ROP), and similarly CALL/JMP-oriented programming (COP/JOP), have been the prevalent attack methodologies for stealth exploit writers targeting vulnerabilities in programs. CET (Control-flow Enforcement Technology) provides the following capabilities to defend against ROP/COP/JOP style control-flow subversion attacks: * Shadow stack: Return address protection to defend against ROP. * Indirect branch tracking: Free branch protection to defend against COP/JOP The full support of CET for Linux kernel has not been merged yet. As the first stage, hide CET from guest VM. Tracked-On: #5074 Signed-off-by: Shuo A Liu <shuo.a.liu@intel.com> Reviewed-by: Jason Chen CJ <jason.cj.chen@intel.com>	2020-07-23 20:15:57 +08:00
Li Fei1	5e605e0daf	hv: vmcall: check vm id in dispatch_sos_hypercall Check whether vm_id is valid in dispatch_sos_hypercall Tracked-On: #4550 Signed-off-by: Li Fei1 <fei1.li@intel.com>	2020-07-23 20:13:20 +08:00
Li Fei1	1859727abc	hv: vapci: add tpm2 support for pre-launched vm On WHL platform, we need to pass through TPM to Secure pre-launched VM. In order to do this, we need to add TPM2 ACPI Table and add TPM DSDT ACPI table to include the _CRS. Now we only support the TPM 2.0 device (TPM 1.2 device is not supported). Besides, the TPM must use Start Method 7 (Uses the Command Response Buffer Interface) to notify the TPM 2.0 device that a command is available for processing. Tracked-On: #5053 Signed-off-by: Li Fei1 <fei1.li@intel.com>	2020-07-23 20:13:20 +08:00
Li Fei1	acc69007e2	hv: mmio_dev: add mmio device pass through support Add mmio device pass through support for pre-launched VM. When we pass through a MMIO device to pre-launched VM, we would remove its resource from the SOS. Now these resources only include the MMIO regions. Tracked-On: #5053 Acked-by: Eddie Dong <eddie.dong@intel.com> Signed-off-by: Li Fei1 <fei1.li@intel.com>	2020-07-23 20:13:20 +08:00
Conghui Chen	821c65b40c	hv: fix possible SSE region mismatch issue During context switch in hypervisor, xsave/xrstor are used to save/restore the XSAVE area according to the XCR0 and XSS. The legacy region in the XSAVE area includes FPU and SSE, and we should make sure the legacy region is saved during context switch. FPU in XCR0 is always enabled according to SDM. For SSE, we enable it in XCR0 during context switch. Tracked-On: #5062 Signed-off-by: Conghui Chen <conghui.chen@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-07-22 14:19:21 +08:00
Conghui Chen	53d4a7169b	hv: remove kick_thread from scheduler module kick_thread function is only used by kick_vcpu to kick vcpu out of non-root mode, the implementation in it is sending IPI to target CPU if target obj is running and target PCPU is not current one; while for runnable obj, it will just make reschedule request. So the kick_thread is not actually belong to scheduler module, we can drop it and just do the cpu notification in kick_vcpu. Tracked-On: #5057 Signed-off-by: Conghui Chen <conghui.chen@intel.com> Reviewed-by: Shuo A Liu <shuo.a.liu@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-07-22 13:38:41 +08:00
Conghui Chen	b6422f8985	hv: remove 'running' from vcpu structure vcpu->running is duplicated with THREAD_STS_RUNNING status of thread object. Introduce an API sleep_thread_sync(), which can utilize the inner status of thread object, to do the sync sleep for zombie_vcpu(). Tracked-On: #5057 Signed-off-by: Conghui Chen <conghui.chen@intel.com> Reviewed-by: Shuo A Liu <shuo.a.liu@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-07-22 13:38:41 +08:00
Mingqiang Chi	aa89eb3541	hv:add per-vm lock for vm & vcpu state change -- replace global hypercall lock with per-vm lock -- add spinlock protection for vm & vcpu state change v1-->v2: change get_vm_lock/put_vm_lock parameter from vm_id to vm move lock obtain before vm state check move all lock from vmcall.c to hypercall.c Tracked-On: #4958 Signed-off-by: Mingqiang Chi <mingqiang.chi@intel.com> Reviewed-by: Jason Chen CJ <jason.cj.chen@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-07-20 11:22:17 +08:00
Li Fei1	82f9233d4a	hv: vpci: a minor fix about is_zombie_vf Now we check whether a device is zombie by the ->user != NULL. Tracked-On: #4550 Signed-off-by: Li Fei1 <fei1.li@intel.com>	2020-06-21 12:07:15 +08:00
Mingqiang Chi	1b84741a56	rename vm_lock/vlapic_state in VM structure rename: vlapic_state-->vlapic_mode vm_lock --> vlapic_mode_lock check_vm_vlapic_state --> check_vm_vlapic_mode Tracked-On: #4958 Signed-off-by: Mingqiang Chi <mingqiang.chi@intel.com>	2020-06-19 16:13:20 +08:00
Mingqiang Chi	d0a4052518	remove dead code in io.h remove these APIs: set64 set32 set16 set8 Tracked-On: #4958 Signed-off-by: Mingqiang Chi <mingqiang.chi@intel.com>	2020-06-19 16:13:20 +08:00
Conghui Chen	2a4c59db74	hv: add check for BASIC VMX INFORMATION Check bit 48 in IA32_VMX_BASIC MSR, if it is 1, return error, as we only support Intel 64 architecture. SDM: Appendix A.1 BASIC VMX INFORMATION Bit 48 indicates the width of the physical addresses that may be used for the VMXON region, each VMCS, and data structures referenced by pointers in a VMCS (I/O bitmaps, virtual-APIC page, MSR areas for VMX transitions). If the bit is 0, these addresses are limited to the processor’s physical-address width. If the bit is 1, these addresses are limited to 32 bits. This bit is always 0 for processors that support Intel 64 architecture. Tracked-On: #4956 Signed-off-by: Conghui Chen <conghui.chen@intel.com>	2020-06-18 14:05:56 +08:00
Binbin Wu	da1788c9a3	hv: vtd: add an API to reserve continuous irtes dmar_reserve_irte is added to reserve N continuous IRTEs. N could be 1, 2, 4, 8, 16, or 32. The reserved IRTEs will not be freed. Tracked-On: #4831 Signed-off-by: Binbin Wu <binbin.wu@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-06-16 08:52:56 +08:00
Binbin Wu	7bfcc673a6	hv: ptirq: associate an irte with ptirq_remapping_info entry For a ptirq_remapping_info entry, when building the IRTE: - If the caller provides a valid IRTE, use that IRTE - If the caller doesn't provide a valid IRTE, allocate an IRTE when the entry doesn't have a valid IRTE; in this case, the IRTE will be freed when the entry is freed. Tracked-On: #4831 Signed-off-by: Binbin Wu <binbin.wu@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-06-16 08:52:56 +08:00
Binbin Wu	2fe4280cfa	hv: vtd: add two paramters for dmar_assign_irte idx_in: - If the caller of dmar_assign_irte passes a valid IRTE index, it will be reused; - If the caller of dmar_assign_irte passes INVALID_IRTE_ID as IRTE index, the function will allocate a new IRTE. idx_out: This parameter returns the actual index of the IRTE used. The caller needs to check whether the returned value is valid or not. Also this patch adds an internal function alloc_irte. The function takes count as an input parameter to allocate continuous IRTEs. The count can only be 1, 2, 4, 8, 16 or 32. This is prepared for multiple MSI vector support. Tracked-On: #4831 Signed-off-by: Binbin Wu <binbin.wu@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-06-16 08:52:56 +08:00
Li Fei1	65e4a16e6a	hv: mmu: release 1GB cpu side support constrain Some platforms still don't support 1GB large pages on the CPU side, such as the Lakefield, TNT and EHL platforms, which have a silicon bug; in this case the CPU doesn't support 1GB large pages. This patch releases this constraint to support more hardware platforms. Note this patch doesn't release the constraint on the IOMMU side. Tracked-On: #4550 Signed-off-by: Li Fei1 <fei1.li@intel.com>	2020-06-15 15:16:34 +08:00
Binbin Wu	c907a820df	hv: config: add msix emulation support Add the information needed to enable MSI-X emulation. Only enable MSI-X emulation for the devices in the msix_emul_devs array. Currently, only EHL needs to enable MSI-X emulation for TSN devices. Tracked-On: #4831 Signed-off-by: Binbin Wu <binbin.wu@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-06-10 14:32:15 +08:00
Victor Sun	80262f0602	HV: rename append_seed_arg to fill_seed_arg Previously append_seed_arg() just do fill in seed arg to dest cmd buffer, so rename the api name to fill_seed_arg(). Since fill_seed_arg() will be called in SOS VM path only, the param of bool vm_is_sos is not needed and will be replaced by dest buffer size. The seed_args[] which used by fill_seed_arg() is pre-defined as all-zero, so memset() is not needed in fill_seed_arg(), buffer pointer check and strncpy_s() are not needed also. Tracked-On: #4885 Signed-off-by: Victor Sun <victor.sun@intel.com> Reviewed-by: Yin Fengwei <fengwei.yin@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-06-08 13:30:04 +08:00
Victor Sun	47d20f37e1	HV: replace merge_cmdline api with strncat_s Add a standard string api strncat_s() to replace merge_cmdline() to make code more readable. Another change is that the multiboot cmdline will be appended to the end of the configured SOS bootargs instead of the beginning. This enables some kernel cmdline parameter items to be overridden by the multiboot cmdline, since the later one wins if the same parameter is configured more than once in the kernel cmdline. Tracked-On: #4885 Signed-off-by: Victor Sun <victor.sun@intel.com> Reviewed-by: Yonghua Huang <yonghua.huang@intel.com> Reviewed-by: Yin Fengwei <fengwei.yin@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-06-08 13:30:04 +08:00
Li Fei1	ae4fa40adc	hv: vpci: hv: vpci: refine pci device assignment logic Now Host Bridge and PCI Bridge could only be added to SOS's acrn_vm_pci_dev_config. So for UOS, we always emulate Host Bridge and PCI Bridge for it and assign PCI devices to it; for SOS, if it's the highest severity VM, we will assign Host Bridge and PCI Bridge to it directly, otherwise, we will emulate them the same as for UOS. Tracked-On: #4550 Signed-off-by: Li Fei1 <fei1.li@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-06-03 22:00:43 +08:00
Li Fei1	b8f151a55f	hv: pci: check whether a PCI device is host bridge or not by class According PCI Code and ID Assignment Specification Revision 1.11, a PCI device whose Base Class is 06h and Sub-Class is 00h is a Host bridge. Tracked-On: #4550 Signed-off-by: Li Fei1 <fei1.li@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-06-03 22:00:43 +08:00
Vijay Dhanraj	d03df0c7e2	HV: Fix MP Init sequence hang by adding a delay As per the BWG, a delay should be provided between the INIT IPI and the Startup IPI. Without the delay, hangs are observed on certain platforms during the MP Init sequence. So set a delay of 10us between the assert INIT IPI and the Startup IPI. Also, as per SDM section 10.7, the de-assert INIT IPI is only used for Pentium and P6 processors. This is not applicable for Pentium4 and Xeon processors, so remove this sequence. Tracked-On: #4835 Signed-off-by: Vijay Dhanraj <vijay.dhanraj@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-05-27 13:34:59 +08:00
Binbin Wu	3009d9399f	hv: vtd: cleanup snoop control related code Snoop control will not be turned on by hypervisor, delete snoop control related code. Tracked-On: #4831 Signed-off-by: Binbin Wu <binbin.wu@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-05-27 11:27:42 +08:00
Shuo A Liu	9a15ea82ee	hv: pause all other vCPUs in same VM when do wbinvd emulation Invalidating the cache by scanning and flushing the whole guest memory is inefficient and might cause a long execution time for WBINVD emulation. A long execution in the hypervisor might make a vCPU appear stuck, which impacts Windows guest booting. This patch introduces a workaround that pauses all other vCPUs in the same VM during WBINVD emulation. Tracked-On: #4703 Signed-off-by: Shuo A Liu <shuo.a.liu@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-05-21 15:21:29 +08:00
Mingqiang Chi	f994b5ffaf	hv:cleanup vcpu state -- remove VCPU_PAUSED and resume_vcpu -- remove vcpu->prev_state in vcpu structure -- rename pause_vcpu to zombie_vcpu Tracked-On: #4320 Signed-off-by: Mingqiang Chi <mingqiang.chi@intel.com>	2020-05-21 15:08:49 +08:00
Yonghua Huang	3391bffb27	hv:fix rtvm hang with maxcpus=0/1 in bootargs RTVM (with lapic PT) hangs at boot when maxcpus is assigned a value less than the CPU number configured in the hypervisor. In this case, vlapic_state (per VM) is left in TRANSITION state after BSP boot, which blocks interrupts from being injected into this UOS. Tracked-On: #4803 Signed-off-by: Yonghua Huang <yonghua.huang@intel.com> Reviewed-by: Li, Fei <fei1.li@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-05-15 10:09:13 +08:00
Li Fei1	27a66acd0e	hv: ptdev: refine look up MSI ptirq entry There's no need to look up MSI ptirq entry by virtual SID any more since the MSI ptirq entry would be removed before the device is assigned to a VM. Now the logic of MSI interrupt remap could simplify as: 1. Add the MSI interrupt remap first; 2. If step is already done, just do the remap part. Tracked-On: #4550 Signed-off-by: Li Fei1 <fei1.li@intel.com> Acked-by: Eddie Dong<eddie.dong@Intel.com> Reviewed-by: Grandhi, Sainath <sainath.grandhi@intel.com>	2020-05-13 14:31:01 +08:00
Li Fei1	15e3062631	hv: vpci: remove is_own_device() Now we can tell a device's status from the 'user' field (vdev->user): NULL means the device is de-initialized; == vdev means it is used by its own VM; != NULL && != vdev means it is assigned to another VM. So we don't need to modify the 'vpci' field accordingly. Tracked-On: #4550 Signed-off-by: Li Fei1 <fei1.li@intel.com> Acked-by: Eddie Dong <eddie.dong@Intel.com>	2020-05-13 14:31:01 +08:00
Zide Chen	0a956c34c7	hv: add a new field cpu_affinity in struct acrn_vm For post-launched VMs, the configured CPU affinity could be different from the actual running CPU affinity. This new field acrn_vm->cpu_affinity recognizes this difference so that it's possible that CREATE_VM hypercall won't overwrite the configured CPU affinity. Change name cpu_affinity_bitmap in acrn_vm_config to cpu_affinity. This is read-only in run time, never overwritten by acrn-dm. Remove vm_config->vcpu_num, which means the number of vCPUs of the configured CPU affinity. This is not to be confused with the actual running vCPU number: vm->hw.created_vcpus. Changed get_vm_bsp_pcpu_id() to get_configured_bsp_pcpu_id() for less confusion. Tracked-On: #4616 Signed-off-by: Zide Chen <zide.chen@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-05-08 11:04:31 +08:00
Yan, Like	869ccb7ba8	HV: RDT: add CDP support in ACRN CDP is an extension of CAT. It enables isolation and separate prioritization of code and data fetches to the L2 or L3 cache in a software configurable manner, depending on hardware support. This commit adds a Kconfig switch "CDP_ENABLED" which depends on "RDT_ENABLED". CDP will be enabled if the capability available and "CDP_ENABLED" is selected. Tracked-On: #4604 Signed-off-by: Yan, Like <like.yan@intel.com> Reviewed-by: Vijay Dhanraj <vijay.dhanraj@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-05-08 08:50:13 +08:00
Yan, Like	277c668b04	HV: RDT: clean up RDT code This commit makes some RDT code cleanup, mainly including: - remove the clos_mask and mba_delay validation check in setup_res_clos_msr(), the check will be done in pre-build; - rename platform_clos_num to valid_clos_num, which is set as the minimal clos_max of all enabled RDT resources; - init the platform_clos_array in the res_cap_info[] definition; - remove the unnecessary return values and return value check. Tracked-On: #4604 Signed-off-by: Yan, Like <like.yan@intel.com>	2020-05-08 08:50:13 +08:00
Yan, Like	f774ee1fba	HV: RDT: merge struct rdt_cache and rdt_membw in to a union A RDT resource could be CAT or MBA, so only one of struct rdt_cache and struct rdt_membw would be used at a time. They should be a union. This commit merges struct rdt_cache and struct rdt_membw in to a union res. Tracked-On: #4604 Signed-off-by: Yan, Like <like.yan@intel.com> Reviewed-by: Vijay Dhanraj <vijay.dhanraj@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-05-08 08:50:13 +08:00
Li Fei1	0c6b3e57d6	hv: ptdev: minor refine about ptirq_build_physical_msi The virtual MSI information could be included in the ptirq_remapping_info structure, so there's no need to pass another input parameter for this purpose. So we could remove the ptirq_msi_info input. Tracked-On: #4550 Signed-off-by: Li Fei1 <fei1.li@intel.com>	2020-05-06 11:51:11 +08:00
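
Commits 2fe4280cfa and da1788c9a3 above restrict a block of continuous IRTEs to a count of 1, 2, 4, 8, 16 or 32. A minimal sketch of that constraint follows. It is not the ACRN implementation: every name (`irte_count_is_valid`, `toy_alloc_irte`, `TOY_IRTE_NUM`, `TOY_INVALID_IRTE`) is hypothetical, the table is a toy 64-entry array, and aligning each block to its own size is an assumption that the commit messages do not state.

```c
#include <stdbool.h>
#include <stdint.h>

/* All names here are hypothetical; this is a toy model, not ACRN code. */
#define TOY_IRTE_NUM      64U      /* toy table size (assumption) */
#define TOY_MAX_BLOCK     32U      /* largest count named in the commit text */
#define TOY_INVALID_IRTE  0xFFFFU  /* stand-in for an "invalid index" value */

static bool toy_irte_used[TOY_IRTE_NUM];

/* True when count is one of 1, 2, 4, 8, 16 or 32 (a power of two <= 32). */
static bool irte_count_is_valid(uint32_t count)
{
	return (count != 0U) && (count <= TOY_MAX_BLOCK) &&
	       ((count & (count - 1U)) == 0U);
}

/*
 * Reserve `count` continuous entries and return the index of the first one,
 * or TOY_INVALID_IRTE on failure. Candidate bases advance in steps of
 * `count`, so each block is aligned to its size (an assumption made here).
 */
static uint16_t toy_alloc_irte(uint32_t count)
{
	uint32_t base, i;

	if (!irte_count_is_valid(count)) {
		return TOY_INVALID_IRTE;
	}

	for (base = 0U; (base + count) <= TOY_IRTE_NUM; base += count) {
		bool free_run = true;

		for (i = 0U; i < count; i++) {
			if (toy_irte_used[base + i]) {
				free_run = false;
				break;
			}
		}

		if (free_run) {
			for (i = 0U; i < count; i++) {
				toy_irte_used[base + i] = true;
			}
			return (uint16_t)base;
		}
	}

	return TOY_INVALID_IRTE;
}
```

As with the `idx_out` parameter described in 2fe4280cfa, a caller of this sketch would compare the returned index against the invalid value before using it.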

1292 Commits