acrn-hypervisor/hypervisor/common
Yifan Liu d575edf79a hv: Change sched_event structure to resolve data race in event handling
Currently, sched event handling may hit a data race, and as a result
some vcpus may be stalled forever.

One example is wbinvd handling, where more than one vcpu executes
wbinvd concurrently. The following is a possible execution with 3 vcpus:
-------
vcpu0                        vcpu1                       vcpu2
                             req [Note: 0]
                             req bit0 set [Note: 1]
                             IPI -> 0
                             req bit2 set
                             IPI -> 2
                                                         VMExit
                                                         req bit2 cleared
                                                         wait
                                                         vcpu2 descheduled

VMExit
req bit0 cleared
wait
vcpu0 descheduled
                             signal 0
                             event0->set=true
                             wake 0

                             signal 2
                             event2->set=true [Note: 3]
                             wake 2
                                                         vcpu2 scheduled
                                                         event2->set=false
                                                         resume

                                                         req
                                                         req bit0 set
                                                         IPI -> 0
                                                         req bit1 set
                                                         IPI -> 1
                             (doesn't matter)
vcpu0 scheduled [Note: 4]
                                                         signal 0
                                                         event0->set=true
                                                         (no wake) [Note: 2]
event0->set=false                                        (the rest doesn't matter)
resume

Any VMExit
req bit0 cleared
wait
idle running

(blocked forever)

Notes:
0: req: vcpu_make_request(vcpu, ACRN_REQUEST_WAIT_WBINVD).
1: req bit: Bit in pending_req_bits. Bit0 is the bit for vcpu0.
2: In function signal_event, at this time the event->waiting_thread
    is not NULL, so wake_thread will not execute.
3: eventX: struct sched_event of vcpuX.
4: In function wait_event, the lock does not strictly cover the execution between
    schedule() and event->set=false, so other threads may kick in.
-----

As the example above shows, before the last VMExit, vcpu0 ended up with
its request bit set but event->set==false, so it blocked forever.
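
The vcpu0 column of the trace can be replayed as a small single-threaded
model. This is only a sketch: struct bool_event_model and the
bool_model_* helpers are hypothetical names, not ACRN code, and they model
nothing but the boolean flag. The replay shows how two signals collapse
into one flag value that a single clear then consumes.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of the boolean flag only; this is not the ACRN code. */
struct bool_event_model {
	bool set;
};

/* signal_event in the model: just raises the flag. */
static void bool_model_signal(struct bool_event_model *ev)
{
	ev->set = true;
}

/* Tail of wait_event in the model: the woken waiter clears the flag. */
static void bool_model_finish_wait(struct bool_event_model *ev)
{
	ev->set = false;
}

/*
 * Replays the vcpu0 column of the trace. Returns true if the final wait
 * on event0 would block even though its signal has already been sent.
 */
static bool bool_model_replay(void)
{
	struct bool_event_model event0 = { .set = false };

	bool_model_signal(&event0);      /* vcpu1: signal 0 (first request) */
	/* vcpu0 is scheduled but has not cleared the flag yet */
	bool_model_signal(&event0);      /* vcpu2: signal 0 (second request) */
	assert(event0.set);              /* both signals share one flag value */
	bool_model_finish_wait(&event0); /* vcpu0: event0->set=false, resume */

	/* vcpu0: next VMExit, wait; nobody will signal event0 again */
	return !event0.set;
}
```

In this model bool_model_replay() reports that the final wait blocks,
which matches the "(blocked forever)" step in the trace.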

This patch changes event->set from a boolean variable to an integer.
The semantics are very similar to a semaphore: wait_event adds 1 to this
value and blocks while the value is > 0, whereas signal_event decreases
the value by 1.
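
The counter semantics described above can be sketched with the same
interleaving. This is a minimal model under assumptions: the struct, the
field name "count" and the counted_model_* helpers are hypothetical and do
not reproduce the patched ACRN code, which also handles locking and
scheduling.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model; the field name "count" is an assumption. */
struct counted_event_model {
	int count;
};

/* Entry of wait_event in the model: register one pending wait. */
static void counted_model_begin_wait(struct counted_event_model *ev)
{
	ev->count++;
}

/* The waiter has to block only while the count is positive. */
static bool counted_model_must_block(const struct counted_event_model *ev)
{
	return ev->count > 0;
}

/* signal_event in the model: consume one pending wait, or pre-pay one. */
static void counted_model_signal(struct counted_event_model *ev)
{
	ev->count--;
}

/* Replays the vcpu0 column; returns true if the final wait would block. */
static bool counted_model_replay(void)
{
	struct counted_event_model event0 = { .count = 0 };

	counted_model_begin_wait(&event0);         /* vcpu0: wait, count = 1 */
	assert(counted_model_must_block(&event0)); /* vcpu0 descheduled */
	counted_model_signal(&event0);             /* vcpu1: signal 0, count = 0 */
	counted_model_signal(&event0);             /* vcpu2: signal 0, count = -1 */
	/* vcpu0 resumes; nothing is reset on the way out of the first wait */
	counted_model_begin_wait(&event0);         /* vcpu0: next wait, count = 0 */
	return counted_model_must_block(&event0);
}
```

Because the early signal from vcpu2 is remembered as -1 instead of being
overwritten, the second wait in this model finds the count at 0 and does
not block.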

This value may be decreased to a negative number, but that is OK. As
long as wait_event and signal_event are paired and program order is
observed (that is, wait_event always happens-before signal_event on a
single vcpu), the value will eventually return to 0.
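
The arithmetic can be checked with a tiny trace runner. It is a sketch
only: counter_trace_run and its result struct are hypothetical helpers,
where 'w' stands for the increment on entering wait_event and 's' for the
decrement in signal_event as observed on one event.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical result type: final counter value and the lowest value seen. */
struct counter_trace_result {
	int final_value;
	int lowest_value;
};

/* Applies each 'w' (+1) and 's' (-1) in order, starting from 0. */
static struct counter_trace_result counter_trace_run(const char *ops)
{
	struct counter_trace_result r = { 0, 0 };

	assert(ops != NULL);
	for (size_t i = 0; ops[i] != '\0'; i++) {
		if (ops[i] == 'w') {
			r.final_value++;
		} else if (ops[i] == 's') {
			r.final_value--;
		}
		if (r.final_value < r.lowest_value) {
			r.lowest_value = r.final_value;
		}
	}
	return r;
}
```

For the order "wssw" (the second signal arrives before the second wait),
the counter dips to -1 and ends at 0; for "wsws" it never goes negative.
In both cases the paired operations bring the value back to 0.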

Tracked-On: #6405
Signed-off-by: Yifan Liu <yifan1.liu@intel.com>
2021-08-20 08:11:40 +08:00
delay.c hv/mod_timer: separate delay functions from the timer module 2021-05-18 16:43:28 +08:00
efi_mmap.c HV: add efi memory map parsing function 2021-06-11 10:06:02 +08:00
event.c hv: Change sched_event structure to resolve data race in event handling 2021-08-20 08:11:40 +08:00
hv_main.c hv: mod: do not use explicit arch name when including headers 2021-05-08 11:15:46 +08:00
hypercall.c HV: refine acrn_mmiodev data structure 2021-08-11 14:45:55 +08:00
irq.c hv: mod: do not use explicit arch name when including headers 2021-05-08 11:15:46 +08:00
ptdev.c hv/mod_timer: refine timer interface 2021-05-18 16:43:28 +08:00
sched_bvt.c hv/mod_timer: refine timer interface 2021-05-18 16:43:28 +08:00
sched_iorr.c hv/mod_timer: refine timer interface 2021-05-18 16:43:28 +08:00
sched_noop.c hv: mod: do not use explicit arch name when including headers 2021-05-08 11:15:46 +08:00
schedule.c hv: mod: do not use explicit arch name when including headers 2021-05-08 11:15:46 +08:00
softirq.c hv: mod: do not use explicit arch name when including headers 2021-05-08 11:15:46 +08:00
ticks.c hv/mod_timer: split tsc handling code from timer. 2021-05-18 16:43:28 +08:00
timer.c hv/mod_timer: refine timer interface 2021-05-18 16:43:28 +08:00
trusty_hypercall.c hv: hypercalls: refactor permission-checking and dispatching logic 2021-05-12 13:43:41 +08:00
vm_load.c HV: Add elf loader sketch 2021-08-19 20:00:45 +08:00