acrn-hypervisor/hypervisor/arch/x86
Shiqing Gao 651d44432c hv: initialize the XSAVE related processor state for guest
If SOS is using kernel 5.4, hypervisor got panic with #GP.

Here is an example on KBL showing how the panic occurs when kernel 5.4 is used:
Notes:
 * Physical MSR_IA32_XSS[bit 8] is 1 when physical CPU boots up.
 * vcpu_get_guest_msr(vcpu, MSR_IA32_XSS)[bit 8] is initialized to 0.

Following thread switches would happen at run time:
1. idle thread -> vcpu thread
   context_switch_in happens and rstore_xsave_area is called.
   At this moment, vcpu->arch.xsave_enabled is false as vcpu is not launched yet
   and init_vmcs is not called yet (where xsave_enabled is set to true).
   Thus, physical MSR_IA32_XSS is not updated with the value of guest MSR_IA32_XSS.

   States at this point:
    * Physical MSR_IA32_XSS[bit 8] is 1.
    * vcpu_get_guest_msr(vcpu, MSR_IA32_XSS)[bit 8] is 0.

2. vcpu thread -> idle thread
   context_switch_out happens and save_xsave_area is called.
   At this moment, vcpu->arch.xsave_enabled is true. Processor state is saved
   to memory with XSAVES instruction. As physical MSR_IA32_XSS[bit 8] is 1,
   ectx->xs_area.xsave_hdr.hdr.xcomp_bv[bit 8] is set to 1 after the execution
   of XSAVES instruction.

   States at this point:
    * Physical MSR_IA32_XSS[bit 8] is 1.
    * vcpu_get_guest_msr(vcpu, MSR_IA32_XSS)[bit 8] is 0.
    * ectx->xs_area.xsave_hdr.hdr.xcomp_bv[bit 8] is 1.

3. idle thread -> vcpu thread
   context_switch_in happens and rstore_xsave_area is called.
   At this moment, vcpu->arch.xsave_enabled is true. Physical MSR_IA32_XSS is
   updated with the value of guest MSR_IA32_XSS, which is 0.

   States at this point:
    * Physical MSR_IA32_XSS[bit 8] is 0.
    * vcpu_get_guest_msr(vcpu, MSR_IA32_XSS)[bit 8] is 0.
    * ectx->xs_area.xsave_hdr.hdr.xcomp_bv[bit 8] is 1.

   Processor state is restored from memory with XRSTORS instruction afterwards.
   According to SDM Vol1 13.12 OPERATION OF XRSTORS, a #GP occurs if XCOMP_BV
   sets a bit in the range 62:0 that is not set in XCR0 | IA32_XSS.
   So, #GP occurs once XRSTORS instruction is executed.

Such issue does not happen with kernel 5.10. Because kernel 5.10 writes to
MSR_IA32_XSS during initialization, while kernel 5.4 does not do such write.
Once guest writes to MSR_IA32_XSS, it would be trapped to hypervisor, then,
physical MSR_IA32_XSS and the value of MSR_IA32_XSS in vcpu->arch.guest_msrs
are updated with the value specified by guest. So, in the point 2 above,
correct processor state is saved. And #GP would not happen in the point 3.

This patch initializes the XSAVE related processor state for guest.
If vcpu is not launched yet, the processor state is initialized according to
the initial value of vcpu_get_guest_msr(vcpu, MSR_IA32_XSS), ectx->xcr0,
and ectx->xs_area. With this approach, the physical processor state is
consistent with the one presented to guest.

Tracked-On: #6434

Signed-off-by: Shiqing Gao <shiqing.gao@intel.com>
Reviewed-by: Li Fei1 <fei1.li@intel.com>
2021-08-20 09:46:09 +08:00
..
boot HV: modularization: rename multiboot.h to boot.h 2021-06-11 10:06:02 +08:00
configs hv: mod: do not use explicit arch name when including headers 2021-05-08 11:15:46 +08:00
guest hv: initialize the XSAVE related processor state for guest 2021-08-20 09:46:09 +08:00
lib HV: rewrite memcpy_s to be iso c11 compliant 2020-06-08 13:30:04 +08:00
seed HV: modularization: use cmdline char array in acrn boot info 2021-06-11 10:06:02 +08:00
cpu.c hv: remove xsave dependence 2021-08-10 16:36:15 +08:00
cpu_caps.c hv: remove xsave dependence 2021-08-10 16:36:15 +08:00
cpu_state_tbl.c hv: dm: Use new power management data structures 2021-07-15 11:53:54 +08:00
e820.c HV: init hv_e820 from efi mmap if boot from uefi 2021-06-11 10:06:02 +08:00
exception.c hv: mod: do not use explicit arch name when including headers 2021-05-08 11:15:46 +08:00
gdt.c hv: mod: do not use explicit arch name when including headers 2021-05-08 11:15:46 +08:00
idt.S hv: mod: do not use explicit arch name when including headers 2021-05-08 11:15:46 +08:00
init.c HV: modularization: add boot.c to wrap multiboot module 2021-06-11 10:06:02 +08:00
ioapic.c hv: paging: rename ppt_set/clear_ATTR to set_paging_ATTR 2021-05-14 09:18:00 +08:00
irq.c hv: mod: do not use explicit arch name when including headers 2021-05-08 11:15:46 +08:00
lapic.c hv/mod_timer: separate delay functions from the timer module 2021-05-18 16:43:28 +08:00
mmu.c hv: remove the constraint "MMU and EPT must both support large page or not" 2021-08-03 11:01:24 +08:00
nmi.c hv: mod: do not use explicit arch name when including headers 2021-05-08 11:15:46 +08:00
notify.c hv: mod: do not use explicit arch name when including headers 2021-05-08 11:15:46 +08:00
page.c hv: mod: do not use explicit arch name when including headers 2021-05-08 11:15:46 +08:00
pagetable.c hv: mod: do not use explicit arch name when including headers 2021-05-08 11:15:46 +08:00
platform_caps.c hv: mod: do not use explicit arch name when including headers 2021-05-08 11:15:46 +08:00
pm.c hv: dm: Use new power management data structures 2021-07-15 11:53:54 +08:00
rdt.c hv: some coding style fixes 2021-05-12 16:50:34 +08:00
rtcm.c hv: update RTCT ACPI table detecting 2021-06-01 08:22:20 +08:00
sched.S hv: sched: rename schedule related structs and vars 2019-10-16 10:25:53 +08:00
security.c hv: mod: do not use explicit arch name when including headers 2021-05-08 11:15:46 +08:00
sgx.c hv: mod: do not use explicit arch name when including headers 2021-05-08 11:15:46 +08:00
trampoline.c hv: cache: wrap common APIs 2021-05-14 09:18:00 +08:00
tsc.c hv/mod_timer: split tsc handling code from timer. 2021-05-18 16:43:28 +08:00
tsc_deadline_timer.c hv/mod_timer: make timer into an arch-independent module 2021-05-18 16:43:28 +08:00
vmx.c hv: VMPTRLD and VMCLEAR VMCS with the common APIs 2021-05-26 11:22:26 +08:00
vtd.c hv: vtd: a minor refine about dmar_wait_completion 2021-08-10 16:36:15 +08:00
wakeup.S