acrn-hypervisor

Commit Graph

Author	SHA1	Message	Date
Jiayuan Yang	e51527fc2d	doc: add celadon as user vm guide This patch adds a tutorial about using Celadon as a user VM. The tutorial covers: building Celadon from source code with refined configs and kernel; launching a Celadon VM with a passthrough GPU and a passthrough disk. Tracked-On: #8254 Signed-off-by: Jiayuan Yang <jiayuan.yang@intel.com>	2024-07-17 13:22:51 +08:00
Jiayuan Yang	8815a0aa6c	doc: Specify elementpath and xmlschema version In newer versions of elementpath and xmlschema, some camera-related features are missing, so we need to specify their versions. Signed-off-by: Jiayuan Yang <jiayuan.yang@intel.com>	2024-07-17 13:22:07 +08:00
Jiayuan Yang	0c10e8d38e	doc: GSG update for ACRN v3.3 - Update ACRN kernel version to 6.1.80. - Update reference board to ASUS Mini PC PN64. - Update development computer and target system SOS to Ubuntu 24.04 noble. - Change User VM image to Ubuntu 24.04 cloud image. - Add some necessary ACRN build tools. - Modify mem parameter in launch script xml to 4096M. - Modify the GRUB menu reference to suit the above changes. Signed-off-by: Jiayuan Yang <jiayuan.yang@intel.com>	2024-07-17 13:22:07 +08:00
YuanXin-Intel	e4429d632b	vUART: change S5 vUART resource This patch is to change the vUART resource occupied by S5 function between Service VM and guest VM to avoid the standard UART port conflict when legacy UART passthrough to guest VM. Tracked-On: #8622 Signed-off-by: YuanXin-Intel <xin.yuan@intel.com> Reviewed-by: Junjie Mao <junjie.mao@intel.com> Reviewed-by: Jian Jun Chen <jian.jun.chen@intel.com>	2024-07-15 15:27:12 +08:00
Jiaqing Zhao	87dffcbc92	dm: pci: update ADL-N and RPL-P iGPU device ids Add more iGPU pci device ids of ADL-N and RPL-P to make passthrough work properly. Tracked-On: #8640 Signed-off-by: Jiaqing Zhao <jiaqing.zhao@linux.intel.com> Reviewed-by: Jian Jun Chen <jian.jun.chen@intel.com>	2024-07-12 18:27:01 +08:00
Zhang Chen	63efde6bdd	HV: boot/elf: Fix the wrong comments in elf.h The definition of elf32_prog_entry has wrong comments: p_filesz should mean the size of the segment in the file and p_memsz should mean the size of the segment in memory. Tracked-On: #8642 Signed-off-by: Zhang Chen <chen.zhang@intel.com> Reviewed-by: Junjie Mao <junjie.mao@intel.com>	2024-07-10 15:26:02 +08:00
Zhang Chen	4a176212eb	HV: elf_loader: Fix copy gpa bug in load elf32 The elf images can't be loaded correctly because the elf_loader copy_to_gpa with wrong size. The p_filesz and p_memsz both belong to elf32_prog_entry, this data structure describes segments loaded in ram. p_filesz means size of segment in file and p_memsz means size of segment in memory. ELF loader should copy elf_img to gpa with the size of p_prg_tbl_head32->p_filesz. Tracked-On: #8642 Signed-off-by: Zhang Chen <chen.zhang@intel.com> Signed-off-by: Victor Sun <victor.sun@intel.com> Reviewed-by: Junjie Mao <junjie.mao@intel.com>	2024-07-10 15:26:02 +08:00
Zhang Chen	1933ee93cb	HV: elf_loader: enable guest multiboot support This patch enable guest multiboot support. Try to find the multiboot header in normal elf guest image. Introduce the multiboot related basic functions to initialize multiboot structure. Including prepare_multiboot_mmap, prepare_loader_name and find_img_multiboot_header. Tracked-On: #8642 Signed-off-by: Victor Sun <victor.sun@intel.com> Signed-off-by: Zhang Chen <chen.zhang@intel.com> Reviewed-by: Junjie Mao <junjie.mao@intel.com>	2024-07-10 15:26:02 +08:00
Zhang Chen	1d4bdd452c	HV: elf_loader: introduce the multiboot_header data structure Define the multiboot_header data structure and MULTIBOOT_MEMORY related definitions. Tracked-On: #8642 Signed-off-by: Zhang Chen <chen.zhang@intel.com> Signed-off-by: Victor Sun <victor.sun@intel.com> Reviewed-by: Junjie Mao <junjie.mao@intel.com>	2024-07-10 15:26:02 +08:00
Zhang Chen	b808c0ef32	HV: elf_loader: Prepare to extend elf loader for multiboot protocol For the TEE and android kernelflinger boot requirements, elf_loader need to support the multiboot protocol. This patch define a memory block to store ELF format VM load params in guest address space. At the same time, prepare the elf cmdline field and memory map for the guest kernel. Tracked-On: #8642 Signed-off-by: Victor Sun <victor.sun@intel.com> Signed-off-by: Zhang Chen <chen.zhang@intel.com> Reviewed-by: Junjie Mao <junjie.mao@intel.com>	2024-07-10 15:26:02 +08:00
Victor Sun	49a02f599b	HV: elf_loader: Make VM bootargs support elf guest Besides Linux guests, ELF guests also need to support bootargs. With this change, VM bootargs support all types of guests. Tracked-On: #8642 Signed-off-by: Zhang Chen <chen.zhang@intel.com> Signed-off-by: Victor Sun <victor.sun@intel.com> Reviewed-by: Junjie Mao <junjie.mao@intel.com>	2024-07-10 15:26:02 +08:00
Haiwei Li	529ade37a4	config_tools: support vUART Timer pCPU configuration This patch is to allow user to pin vUART timer to specific pCPU via ACRN config tool. User can configure by setting "vUART timer pCPU ID" under Hypervisor->Advanced Parameters. Tracked-On: #8648 Signed-off-by: Haiwei Li <haiwei.li@intel.com>	2024-07-10 10:26:21 +08:00
Haiwei Li	f47b2b6860	config_tools: support vUART Tx/Rx buffer size configuration Introduce an interface to define the Tx/Rx buffer size via the ACRN config tool. Users can configure it under Hypervisor->Advanced Parameters. Tracked-On: #8644 Signed-off-by: Haiwei Li <haiwei.li@intel.com>	2024-07-10 10:26:21 +08:00
Gao, Shiqing	cc9013541f	hv: vrtc: remove the unused function `rtc_halted()` rtc_halted() is not invoked anywhere in the code. This patch removes this unused function to fix below error. error: unused function 'rtc_halted' [-Werror,-Wunused-function] Tracked-On: #861 Signed-off-by: Gao, Shiqing <shiqing.gao@intel.com>	2024-07-03 14:55:43 +08:00
Gao, Shiqing	0bcf469758	hv: vtd: fix use of uninitialized variable in dmar_free_irte This patch fixes the following error: error: variable 'sid' is used uninitialized whenever 'if' condition is true [-Werror,-Wsometimes-uninitialized] Tracked-On: #861 Signed-off-by: Gao, Shiqing <shiqing.gao@intel.com>	2024-07-03 14:55:43 +08:00
Gao, Shiqing	2aed0c7aa1	hv: reloc: enclose `elf64_r_type()` with `#ifdef CONFIG_RELOC` elf64_r_type() is only invoked when CONFIG_RELOC is defined. This patch encloses its definition with `#ifdef CONFIG_RELOC`, otherwise, it is dead code. Tracked-On: #861 Signed-off-by: Gao, Shiqing <shiqing.gao@intel.com>	2024-07-03 14:55:43 +08:00
Gao, Shiqing	f398b9c29e	release: fix the compilation error in release mode Commit `512c98fd7 hv: trace: show cpu usage of vms in pcpu sharing case` causes the compilation error in release mode: hypervisor/common/schedule.c:190: undefined reference to `TRACE_16STR' This patch fixes this issue. Tracked-On: #861 Signed-off-by: Gao, Shiqing <shiqing.gao@intel.com>	2024-07-03 11:26:01 +08:00
YuanXin-Intel	e4b1584577	Change Service VM to supervisor role 1. Enable Service VM to power off or restart the whole platform even when RTVM is running. 2. Allow Service VM stop the RTVM using acrnctl tool with option "stop -f". 3. Add 'Service VM supervisor role enabled' option in ACRN configurator Tracked-On: #8618 Signed-off-by: YuanXin-Intel <xin.yuan@intel.com> Reviewed-by: Junjie Mao <junjie.mao@intel.com> Reviewed-by: Jian Jun Chen <jian.jun.chen@intel.com>	2024-06-28 13:35:07 +08:00
nacui	512c98fd79	hv: trace: show cpu usage of vms in pcpu sharing case To maximize the cpu utilization, core 0 is usually shared by service vm and guest vm. But there are no statistics to show the cpu occupation of each vm. This patch is to provide cpu usage statistic for users. To calculate it, a new trace event is added and marked in scheduling context switch, accompanying with a new python script to analyze the data from acrntrace output. Tracked-On: #8621 Signed-off-by: nacui <na.cui@intel.com> Reviewed-by: Junjie Mao <junjie.mao@intel.com> Reviewed-by: Haiwei Li <haiwei.li@intel.com>	2024-06-28 12:55:23 +08:00
Wu Zhou	926f2346df	config_tools: remove ivshmem size from hv_ram_size As ivshmem has switched from static allocation to E820 allocation, the hv_ram_size no longer needs to include ivshmem size. Tracked-On: #8522 Signed-off-by: Wu Zhou <wu.zhou@intel.com>	2024-06-28 10:00:41 +08:00
Haiwei Li	3d6ca845e2	hv: s3: add timer support When resume from s3, Service VM OS will hang because timer interrupt on BSP is not triggered. Hypervisor won't update physical timer because there are expired timers on pcpu timer list. Add suspend and resume ops for modules that use timers. This patch is just for Service VM OS. Support for User VM will be added in the future. Tracked-On: #8623 Signed-off-by: Haiwei Li <haiwei.li@intel.com>	2024-06-27 11:26:09 +08:00
Haiwei Li	5283c147ef	hv: pci: Add guest cfg header access handling of type 1 device When guests resume from s3, an error occurs in the guest: ``` pcieport 0000:00:1c.0: refused to change power state from D0 to D3hot ``` A PCI bridge (type 1 device) accesses its configuration space header, which ACRN does not currently handle. So add handling support. Tracked-On: #8623 Signed-off-by: Haiwei Li <haiwei.li@intel.com>	2024-06-27 11:26:09 +08:00
Haiwei Li	2cd0edaf9c	hv: pci: restore bus and memory/IO info after reset After some kind of reset, such as s3, pci bridge tries to restore the bus and memory/IO info (from 0x18 to 0x32, except for Secondary Latency Timer 0x1b) to resume device state. This patch is to restore these info by hypervisor. Tracked-On: #8623 Signed-off-by: Haiwei Li <haiwei.li@intel.com>	2024-06-27 11:26:09 +08:00
Haiwei Li	81935737ff	hv: s3: reset vm after resume Now only BSP is reset. After Service VM OS resumes from s3, APs' apic_base_msr are incorrect with x2apic bit en. To avoid incorrect states, do `reset_vm` after resume. Tracked-On: #8623 Signed-off-by: Haiwei Li <haiwei.li@intel.com>	2024-06-27 11:26:09 +08:00
Haiwei Li	9c139681f2	hv: s3: hwp: enable hwp after resume from s3 After Service OS resume from s3, an error occurs: [3649827us][cpu=1][idle1][sev=2][seq=1749]:= Unhandled exception: 13 (General Protection) [3658622us][cpu=1][idle1][sev=2][seq=1750]: Host Registers: [3664881us][cpu=1][idle1][sev=2][seq=1751]:= Vector=0x000000000000000D RIP=0x000000000040F9F0 [3674213us][cpu=1][idle1][sev=2][seq=1752]:= RAX=0x0000000080003801 RBX=0x0000000001800800 RCX=0x0000000000000774 [3685787us][cpu=1][idle1][sev=2][seq=1753]:= RDX=0x0000000000000000 RDI=0x0000000000000080 RSI=0x0000000000000000 [3697371us][cpu=1][idle1][sev=2][seq=1754]:= RSP=0x0000000000616C18 RBP=0x0000000000616C38 RBX=0x0000000001800800 [3708947us][cpu=1][idle1][sev=2][seq=1755]:= R8=0x0000000000000038 R9=0x0000000000000001 R10=0x00000000000003F8 [3720539us][cpu=1][idle1][sev=2][seq=1756]:= R11=0x000000000000000D R12=0x0000000000458245 R13=0x0000000000000000 [3732114us][cpu=1][idle1][sev=2][seq=1757]:= RFLAGS=0x0000000000010202 R14=0x0000000000000000 R15=0x0000000000000000 [3743699us][cpu=1][idle1][sev=2][seq=1758]:= ERRCODE=0x0000000000000000 CS=0x0000000000000008 SS=0x0000000000000010 [3755305us][cpu=1][idle1][sev=2][seq=1759]:= CR2=0x0000000000000000 The error occurs in `msr_write(MSR_IA32_HWP_REQUEST, reg)`, when HWP is not available. This patch is to initialize HWP after resume. Tracked-On: #8623 Signed-off-by: Haiwei Li <haiwei.li@intel.com>	2024-06-27 11:26:09 +08:00
Haiwei Li	cdfd35ed3d	hv: s3: enable lapic earlier After Service VM OS resumes from s3, BSP starts APs asynchronously, followed by IPIs to APs to resume tsc. This process takes place in function `host_enter_s3`. While, APs' lapic are not ready to accept IPI interrupt, so BSP fails to resume tsc. So enable lapic earlier to make sure that APs are ready. Tracked-On: #8623 Signed-off-by: Haiwei Li <haiwei.li@intel.com>	2024-06-27 11:26:09 +08:00
Xin Zhang	d0fed9901d	Fix Debian packaging postinst partition finding If the root partition is bind mounted with / and another, the current postinst script (using command lsblk) will fail to find the partition: $type will be "/" only and cause the following command may find the wrong partition. Ubuntu 22.04 desktop with firefox snap by default: ``` > lsblk nvme0n1 259:19 0 931.5G 0 disk ├─nvme0n1p1 259:20 0 243M 0 part /boot/efi ├─nvme0n1p2 259:21 0 927.5G 0 part /var/snap/firefox/common/host-hunspell │ / ``` And current command forces the root partition to be ext4. This patch fixes the two issues. Tracked-On: #8532 Signed-off-by: Xin Zhang <xin.x.zhang@intel.com>	2024-06-26 13:56:01 +08:00
Haiwei Li	fbe30d4001	dm: pm: add s3 support for a User VM with elf/bzImage For an elf-loaded or bzImage-loaded User VM, acrn-dm is responsible for handling s3 related matters. After resume from S3, acrn-dm should read waking_vector and set the related registers so that the guest resumes. Tracked-On: #8536 Signed-off-by: Haiwei Li <haiwei.li@intel.com>	2024-06-26 12:25:46 +08:00
Haiwei Li	374ad1c9ed	dm: pm: add s3 support for an ovmf-booted User VM For ovmf-booted User VM, we should set CMOS shutdown status register (index 0xF) as S3_resume(0xFE). So ovmf will read it and start S3 resume at POST entry. And ovmf will read waking vector from FACS table and transfer control to guest. Tracked-On: #8624 Signed-off-by: Haiwei Li <haiwei.li@intel.com>	2024-06-26 12:25:46 +08:00
Haiwei Li	c42b45e9a3	OVMF release v3.3 - OvmfPkg: resolve AcrnS3Lib - OvmfPkg: add AcrnS3Lib to support S3 - OvmfPkg: introduce AcrnS3Lib class - OVMF:ACRN:PCI: Try to load ROM image for the PCI device with PCI_ROM - OVMF:ACRN:PCI: Add LoadOpRomImageLight to Load the PCI Rom - OVMF:ACRN:PCI: Write back the original value of PCI ROM The first three above are related to S3. Tracked-On: #8624 Signed-off-by: Haiwei Li <haiwei.li@intel.com>	2024-06-26 12:25:46 +08:00
Qiang Zhang	29137b9e9c	doc: update doc for vUART to hypervisor console switch key Things changed since following commit (`c623e1112` debug: vuart: add guest break key support). Tracked-On: #8583 Signed-off-by: Qiang Zhang <qiang4.zhang@intel.com>	2024-06-25 11:07:21 +08:00
Yi Sun	cc0364f1f7	doc: Correct the registers names in ivshmem hld - In 'MMIO Registers Definition', the names of interrupt status/mask registers are wrong Tracked-On: #8568 Reviewed-by: Junjie Mao <junjie.mao@intel.com> Signed-off-by: Yi Sun <yi.y.sun@linux.intel.com>	2024-06-25 11:03:02 +08:00
Jiaqing Zhao	53825c5cac	e820: properly reserve memory for multiboot modules In the current implementation, if there are multiple contiguous 4k-aligned modules, 0-sized e820 entries will be created between these regions. And for non-4k-aligned modules, when two of them are located in one page, the second memory range will not be reserved because it is no longer within a single e820 entry after the first is reserved, making it vulnerable. This patch fixes it by marking the exact memory range of multiboot modules as unusable first, then shrinking the e820 entries to the page boundary. If a module crosses multiple e820 entries, possibly due to a buggy bootloader, the hypervisor will panic immediately to prevent modules from getting corrupted. Tracked-On: #8617 Signed-off-by: Jiaqing Zhao <jiaqing.zhao@linux.intel.com> Reviewed-by: Junjie Mao <junjie.mao@intel.com>	2024-06-20 09:10:27 +08:00
Haiwei Li	b31fcd3519	hv: cpuid: fix hybrid related cpuid error Some cpuids will return invalid values on hybrid platform because of the error in the pointer arithmetic. Add `(void *)` before `cpu_cpuids.leaves`. Leaf 0x14 is used to report Intel Processor Trace Enumeration and varies between P-cores and E-cores on hybrid platform. So add it to `hybrid_leaves`. Tracked-On: #8608 Fixes: `59a8cc4c2` ("hv: cpuid: make leaf 0x4 per-cpu in hybrid architecture") Signed-off-by: Haiwei Li <haiwei.li@intel.com> Reviewed-by: Junjie Mao <junjie.mao@intel.com>	2024-06-19 17:07:10 +08:00
Jiaqing Zhao	dc8ea42297	dm: deinit iothreads on vm reset iothreads are created by emulated block devices like virtio. These devices are reset on vm reset, but these iothreads are not freed, causing a resource leak. Fix it by deinitializing all iothreads on vm reset. Tracked-On: #8612 Signed-off-by: Jiaqing Zhao <jiaqing.zhao@linux.intel.com> Reviewed-by: Jian Jun Chen <jian.jun.chen@intel.com>	2024-06-19 16:14:33 +08:00
andi6	46a860bf04	hv: fix using cpuid does not clear the upper 32-bit registers. In HV, cpuid uses the lower 32 bits of the rax/rbx/rcx/rdx registers to pass parameters, but the software does not clear the upper 32 bits of these registers. If the guest uses 64-bit variables to pass parameters to cpuid, the guest will use rax/rbx/rcx/rdx, not eax/ebx/ecx/edx, and the previous value of the upper 32 bits will affect the guest. Tracked-On: #8605 Reviewed-by: Junjie Mao <junjie.mao@intel.com> Signed-off-by: andi6 <andi6@xiaomi.com>	2024-06-19 15:35:26 +08:00
nacui	30aeeedcb6	dm: enable lpss device passthrough Passthrough of lpss devices, such as sdio, spi, uart, is not supported for user vm due to irq and acpi info missing. Here provides new pci device passthrough options to pass irq and acpi dsdt info by users. Considering spi dsdt info varies from HW, to add the flexibility of configuration, it is designed to pass dsdt file of spi device by users rather than hard code. Besides, remove the limit of the lpss device passthrough for rtvm. Tracked-On: #8615 Signed-off-by: nacui <na.cui@intel.com> Reviewed-by: Jian Jun Chen <jian.jun.chen@intel.com>	2024-06-19 14:48:34 +08:00
dongpingx	23b1a44d40	misc: fix Vue3 version & update braces's version Although my former patch passes the build procedure, when I launch the configurator and try to load board.xml, the loading procedure won't finish. So we cannot step forward anymore. I cannot find a solution right now, so I have to fix the version to v3.2.33 for several weeks. This patch also fixes a vulnerability reported by Trivy. The vulnerability ID is CVE-2024-4068 and the fixed version of the dependency is 3.0.3. I added one configuration item named override to package.json. I tested and confirmed the fix is ok. Signed-off-by: dongpingx <dongpingx.wu@intel.com> Tracked-On: #8626	2024-06-18 10:26:32 +08:00
dongpingx	7739f0ef2a	misc: add checking while append new vm This patch adds a cpu affinity check when the user clicks to add a new vm. While following up on a client's findings, I found that if I click to add a new post-launched vm in step 3 (Configure settings for scenario and launch scripts), it fails to show error messages. The current version checks cpu affinity and serial port for post-launched vms and hv when creating a new vm, but it won't verify when adding new post-launched & pre-launched vms, so it fails to save the scenario configuration file without any explanation. I've rebuilt and run the configurator and confirmed the checking procedure works. Signed-off-by: dongpingx <dongpingx.wu@intel.com> Tracked-On: #8601 Reviewed-by: Junjie Mao <junjie.mao@intel.com>	2024-06-17 11:40:24 +08:00
Qiang Zhang	5c9e1c0186	board_inspector: fix typo in PCIe PTM Capability name PCIe extended capability with ID 0x1F is Precise Time Measurement. So fix typo "TPM" which may confuse users. Reviewed-by: Junjie Mao <junjie.mao@intel.com> Tracked-On: #5915 Signed-off-by: Qiang Zhang <qiang4.zhang@intel.com>	2024-06-17 10:27:36 +08:00
Shiqing Gao	80b1edabf5	dm: block_if: support bypassing BST_BLOCK logic With current implementation, in blockif_dequeue/blockif_complete, if the current request is consecutive to any request in penq or busyq, current request's status is set to BST_BLOCK. Then, this request is blocked until the prior request, which blocks it, is completed. It indicates that consecutive requests are executed sequentially. This patch adds a flag `no_bst_block` to bypass such logic because: 1. the benefit of this logic is not noticeable; 2. there is a chance that a request is enqueued in block_if_queue but not dequeued when this logic is triggered along with the io_uring mechanism; Example to use this flag: `add_virtual_device 5 virtio-blk /dev/nvme1n1,no_bst_block` Note: When io_uring is enabled, the BST_BLOCK logic would be bypassed. Tracked-On: #8612 Signed-off-by: Shiqing Gao <shiqing.gao@intel.com> Acked-by: Wang, Yu1 <yu1.wang@intel.com>	2024-06-05 15:23:33 +08:00
Shiqing Gao	11c8907464	dm: virtio-blk: fix virtio_blk_ops bug When multiple virtio-blk instances are created for one VM, using the same `static struct virtio_ops virtio_blk_ops` for all instances is buggy. It only works when all instances are created with the same number of the virtqueues. This patch fixes this issue by introducing a member in `struct virtio_blk` to store the ops info for each virtio-blk instance. Tracked-On: #8612 Signed-off-by: Shiqing Gao <shiqing.gao@intel.com> Acked-by: Wang, Yu1 <yu1.wang@intel.com>	2024-06-05 15:23:33 +08:00
Shiqing Gao	f92b0f43e6	dm: block_if: io_uring: flush the modified in-core data on demand When `io_uring` is used, the `blockif_flush_cache` call is missing when a WRITE operation is completed. `blockif_flush_cache` would flush the modified in-core data to the disk device according to the setting of the cache mode. Tracked-On: #8612 Signed-off-by: Shiqing Gao <shiqing.gao@intel.com> Acked-by: Wang, Yu1 <yu1.wang@intel.com>	2024-06-05 15:23:33 +08:00
Shiqing Gao	5306d9e7db	dm: update the `iothread` option to specify the CPU affinity This patch updates the `iothread` option to specify the CPU affinity of the iothread. Setting the iothread's CPU affinity could benefit the Service VM's CPU utilization when Service VM owns limited dedicated CPUs. It could be helpful to ensure the I/O mediator Quality of Service (QoS). Once the performance tuning is done, the specific CPU affinity config could pass to acrn-dm directly, letting the deployment more easily. The format looks like below: iothread=<num_iothread>@<cpu_affinity> "@" is used to separate the following two settings: - the number of iothread instances - the CPU affinity settings for each iothread instance. The format of `cpu_affinity` looks like below: <cpu_affinity_0>/<cpu_affinity_1>/<cpu_affinity_2>/... 1. "/" is used to separate the CPU affinity setting for each iothread instance (sequentially). 2. char '' can be used to skip the setting for the specific iothread instance. 3. the number of cpu_affinity_x vs. the number of iothread instances - If # of cpu_affinity_x is less than # of iothread instances, no CPU affinity settings for the last few iothread instances. - If # of cpu_affinity_x is more than # of iothread instances, the extra cpu_affinity_x are discarded. 4. ":" is used to separate different CPU cores for each CPU affinity setting. Examples to specify the CPU affinity of the iothread: 1. iothread=3@0:1:2/0:1 `add_virtual_device 9 virtio-blk iothread=3@0:1:2/0:1,mq=3,/dev/nvme1n1` a) 3 iothread instances are created. b) CPU affinity of iothread instances for this virtio-blk device: - 1st iothread instance <-> pins to Service VM CPU 0,1,2 - 2nd iothread instance <-> pins to Service VM CPU 0,1 - 3rd iothread instance <-> No CPU affinity settings 2. iothread=3@0//1 `add_virtual_device 9 virtio-blk iothread=3@0//1,mq=3,/dev/nvme1n1` a) 3 iothread instances are created. b) CPU affinity of iothread instances for this virtio-blk device: - 1st iothread instance <-> pins to Service VM CPU 0 - 2nd iothread instance <-> No CPU affinity settings - 3rd iothread instance <-> pins to Service VM CPU 1 v1 -> v2: encapsulate one API in iothread.c to parse the iothread options, so that other BE can also use it. v2 -> v3: * introduce one API iothread_free_options to free the elements that are allocated dynamically in iothread_parse_options(). Tracked-On: #8612 Signed-off-by: Shiqing Gao <shiqing.gao@intel.com> Acked-by: Wang, Yu1 <yu1.wang@intel.com>	2024-06-05 15:23:33 +08:00
Shiqing Gao	a90aa4fd26	dm: iothread: rename the thread for better readability This patch renames the iothread for better readability. For instance, the new name of the iothread for virtio-blk device looks like `iothr-0-blk9:0`. It could be helpful when tuning the performance and the CPU utilization. v1 -> v2: * add `const` qualifier for the input parameter of `iothread_create` Tracked-On: #8612 Signed-off-by: Shiqing Gao <shiqing.gao@intel.com> Acked-by: Wang, Yu1 <yu1.wang@intel.com>	2024-06-05 15:23:33 +08:00
Shiqing Gao	14c20fa31c	dm: block_if: support misaligned request when O_DIRECT is used Use of O_DIRECT flag could be a performance option. But this flag may impose alignment restrictions on the length and address of user-space buffers and the file offset of I/Os. To support the use of O_DIRECT flag in block_if, this patch adds the support to handle the misaligned request. - When O_DIRECT flag is used (`nocache` is specified in acrn-dm parameters), * if the original I/O request is aligned, the original I/O request is submitted directly. * if the original I/O request is not aligned (either due to the buffer address/length misalignment, or the offset misalignment), the misaligned request is converted to an aligned request before submission. - When O_DIRECT flag is not used, the original I/O request is submitted directly. v1 -> v2: * cleanup the free() logic in `blockif_init_bounced_write` Tracked-On: #8612 Signed-off-by: Shiqing Gao <shiqing.gao@intel.com> Acked-by: Wang, Yu1 <yu1.wang@intel.com>	2024-06-05 15:23:33 +08:00
Shiqing Gao	e2da306755	dm: block_if: support bypassing the Service VM's page cache This patch adds an acrn-dm option `nocache` to bypass the Service VM's page cache. - By default, the Service VM's page cache is utilized. - If `nocache` is specified in acrn-dm parameters, the Service VM's page cache would be bypassed (opening the file/block with O_DIRECT flag). Example to bypass the Service VM's page cache: `add_virtual_device 5 virtio-blk iothread,mq=2,/dev/nvme2n1,writeback,nocache` Tracked-On: #8612 Signed-off-by: Shiqing Gao <shiqing.gao@intel.com> Acked-by: Wang, Yu1 <yu1.wang@intel.com>	2024-06-05 15:23:33 +08:00
Jian Jun Chen	63d41a75fa	dm: set iothread nice value to PRIO_MIN To improve the performance of the virtual device who utilizes iothread (such as virtio-blk), this patch sets iothread nice value to PRIO_MIN, so that it could get higher priority on scheduling. This patch does: - introduce `set_thread_priority` to set the priority of the current running thread. The priority could be any value in the range PRIO_MIN to PRIO_MAX. Lower numerical value causes more favorable scheduling. - set iothread nice value to PRIO_MIN. Tracked-On: #8612 Signed-off-by: Jian Jun Chen <jian.jun.chen@intel.com> Signed-off-by: Shiqing Gao <shiqing.gao@intel.com> Acked-by: Wang, Yu1 <yu1.wang@intel.com>	2024-06-05 15:23:33 +08:00
Shiqing Gao	7e6a239646	dm: improve the flexibility of the iothread support Prior to this patch, one single iothread instance is created and initialized in the `main` function. This single iothread monitors all the registered fds and handles all the corresponding requests. It leads to the limited flexibility of the iothread support. To improve the flexibility of the iothread support, this patch does: - add the support of multiple iothread instances. `iothread_create` is introduced to create a certain number of iothread instances. It shall be called at first by each virtual device owner (such as virtio-blk BE) on initialization phase. Then, `iothread_add` can be called to add the to be monitored fd to the specified iothread. - update virtio-blk BE to let the acrn-dm option `iothread` accept a number as the number of iothread instances to be created. If `iothread` is contained in the parameters, but the number is not specified, one iothread instance would be created by default. Examples to specify the number of iothread instances: 1. Create 2 iothread instances `add_virtual_device 9 virtio-blk iothread=2,mq=2,/dev/nvme1n1,writeback,aio=io_uring` 2. Create 1 iothread instances (by default) `add_virtual_device 9 virtio-blk iothread,mq=2,/dev/nvme1n1,writeback,aio=io_uring` - update virtio-blk BE to separate the request handling of different virtqueues to different iothreads. The request from one or more virtqueues can be handled in one iothread. The mapping between virtqueues and iothreads is based on round robin. v1 -> v2: * add a mutex to protect the free ioctx slot allocation Tracked-On: #8612 Signed-off-by: Shiqing Gao <shiqing.gao@intel.com> Acked-by: Wang, Yu1 <yu1.wang@intel.com>	2024-06-05 15:23:33 +08:00
Shiqing Gao	fed8ce513c	dm: block_if: add the io_uring support io_uring is a high-performance asynchronous I/O framework, primarily designed to improve the efficiency of input and output (I/O) operations in user-space applications. This patch enables io_uring in block_if module. It utilizes the interfaces provided by the user-space library `liburing` to interact with io_uring in kernel-space. To build the acrn-dm with io_uring support, `liburing-dev` package needs to be installed. For example, it can be installed like below in Ubuntu 22.04. sudo apt install liburing-dev In order to support both the thread pool mechanism and the io_uring mechanism, an acrn-dm option `aio` is introduced. By default, thread pool mechanism is selected. - Example to use io_uring: `add_virtual_device 9 virtio-blk iothread,mq=2,/dev/nvme1n1,writeback,aio=io_uring` - Example to use thread pool: `add_virtual_device 9 virtio-blk iothread,mq=2,/dev/nvme1n1,writeback,aio=threads` - Example to use thread pool (by default): `add_virtual_device 9 virtio-blk iothread,mq=2,/dev/nvme1n1,writeback` v2 -> v3: * Update iothread_handler - Use the unified eventfd interfaces to read the counter value of the ioeventfd. - Remove the while loop to read the ioeventfd. It is not necessary because one read would reset the counter value to 0. * Update iou_submit_sqe to return an error code The caller of iou_submit_sqe shall check the return value. If there is NO available submission queue entry in the submission queue, need to break the while loop. Request can only be submitted when SQE is available. v1 -> v2: * move the logic of reading out ioeventfd from iothread.c to virtio.c, because it is specific to the virtqueue handling. Tracked-On: #8612 Signed-off-by: Shiqing Gao <shiqing.gao@intel.com> Acked-by: Wang, Yu1 <yu1.wang@intel.com>	2024-06-05 15:23:33 +08:00
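
The `iothread=<num_iothread>@<cpu_affinity>` format described in commit 5306d9e7db can be illustrated with a small parser. This is only a minimal Python sketch of the rules as the commit message states them, not the acrn-dm implementation (which is C code in iothread.c). The function name `parse_iothread_option` is hypothetical. The skip character is not legible in the commit text, so the sketch assumes that an empty field (as in `3@0//1`) means "no CPU affinity for this instance".

```python
def parse_iothread_option(value):
    """Parse '<num_iothread>@<aff_0>/<aff_1>/...' (hypothetical helper).

    Returns one entry per iothread instance: a set of CPU ids,
    or None when no CPU affinity is configured for that instance.
    """
    num_part, _, affinity_part = value.partition("@")
    # A bare `iothread` (no number given) creates one instance by default.
    num = int(num_part) if num_part else 1

    # "/" separates the affinity settings of consecutive iothread instances.
    fields = affinity_part.split("/") if affinity_part else []

    affinities = []
    for i in range(num):
        # Fewer settings than instances: the last instances get no affinity.
        # More settings than instances: the extras are never read (discarded).
        field = fields[i] if i < len(fields) else ""
        # ":" separates the CPU cores inside one affinity setting.
        # Assumption: an empty field means "skip this instance".
        cpus = {int(cpu) for cpu in field.split(":")} if field else None
        affinities.append(cpus)
    return affinities


if __name__ == "__main__":
    print(parse_iothread_option("3@0:1:2/0:1"))
    print(parse_iothread_option("3@0//1"))
```

Run against the two examples from the commit message, the sketch gives three instances each: CPUs 0,1,2 / CPUs 0,1 / no affinity for the first, and CPU 0 / no affinity / CPU 1 for the second.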


8237 Commits