acrn-kernel/include
Frederic Weisbecker 62030a4915 rcu-tasks: Fix synchronize_rcu_tasks() VS zap_pid_ns_processes()
[ Upstream commit 28319d6dc5 ]

RCU Tasks and PID-namespace unshare can interact in do_exit() in a
complicated circular dependency:

1) TASK A calls unshare(CLONE_NEWPID), this creates a new PID namespace
   that every subsequent child of TASK A will belong to. But TASK A
   doesn't itself belong to that new PID namespace.

2) TASK A calls fork() and creates TASK B. TASK A stays attached to its
   PID namespace (let's say PID_NS1) and TASK B is the first task
   belonging to the new PID namespace created by unshare() (let's call
   it PID_NS2).

3) Since TASK B is the first task attached to PID_NS2, it becomes the
   PID_NS2 child reaper.

4) TASK A calls fork() again and creates TASK C, which gets attached to
   PID_NS2. Note how TASK C has TASK A as a parent (belonging to PID_NS1)
   but has TASK B (belonging to PID_NS2) as a pid_namespace child_reaper.

5) TASK B exits and since it is the child reaper for PID_NS2, it has to
   kill all other tasks attached to PID_NS2, and wait for all of them to
   die before getting reaped itself (zap_pid_ns_processes()).

6) TASK A calls synchronize_rcu_tasks() which leads to
   synchronize_srcu(&tasks_rcu_exit_srcu).

7) TASK B is waiting for TASK C to get reaped. But TASK B is under a
   tasks_rcu_exit_srcu SRCU critical section (exit_notify() is between
   exit_tasks_rcu_start() and exit_tasks_rcu_finish()), blocking TASK A.

8) TASK C exits and, since TASK A is its parent, waits for TASK A to
   reap it. But TASK A can't, because TASK A waits for TASK B, which
   waits for TASK C.

Pid_namespace semantics can hardly be changed at this point. But the
coverage of tasks_rcu_exit_srcu can be reduced instead.

The current task is assumed not to be concurrently reapable at this
stage of exit_notify() and therefore tasks_rcu_exit_srcu can be
temporarily relaxed without breaking its constraints, providing a way
out of the deadlock scenario.

[ paulmck: Fix build failure by adding additional declaration. ]

Fixes: 3f95aa81d2 ("rcu: Make TASKS_RCU handle tasks that are almost done exiting")
Reported-by: Pengfei Xu <pengfei.xu@intel.com>
Suggested-by: Boqun Feng <boqun.feng@gmail.com>
Suggested-by: Neeraj Upadhyay <quic_neeraju@quicinc.com>
Suggested-by: Paul E. McKenney <paulmck@kernel.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Lai Jiangshan <jiangshanlai@gmail.com>
Cc: Eric W . Biederman <ebiederm@xmission.com>
Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-03-10 09:32:52 +01:00
acpi ACPI: Fix selecting wrong ACPI fwnode for the iGPU on some Dell laptops 2023-01-18 11:58:11 +01:00
asm-generic arch: fix broken BuildID for arm64 and riscv 2023-02-25 11:25:42 +01:00
clocksource
crypto
drm drm/drm_vma_manager: Add drm_vma_node_allow_once() 2023-02-01 08:34:42 +01:00
dt-bindings dt-bindings: clocks: imx8mp: Add ID for usb suspend clock 2022-12-31 13:33:09 +01:00
keys
kunit kunit: fix kunit_test_init_section_suites(...) 2023-02-09 11:28:08 +01:00
kvm
linux rcu-tasks: Fix synchronize_rcu_tasks() VS zap_pid_ns_processes() 2023-03-10 09:32:52 +01:00
math-emu
media media: dvbdev: fix build warning due to comments 2022-12-31 13:33:12 +01:00
memory
misc
net dccp/tcp: Avoid negative sk_forward_alloc by ipv6_pinfo.pktoptions. 2023-02-22 12:59:52 +01:00
pcmcia
ras
rdma
rv
scsi scsi: libsas: Add smp_ata_check_ready_type() 2023-02-25 11:25:39 +01:00
soc ARM: at91: pm: avoid soft resetting AC DLL 2022-11-01 12:25:19 +02:00
sound ALSA: hda/hdmi: fix stream-id config keep-alive for rt suspend 2022-12-31 13:33:07 +01:00
target
trace tracing: Fix TASK_COMM_LEN in trace event format file 2023-02-14 19:11:54 +01:00
uapi drm/virtio: exbuf->fence_fd unmodified on interrupted wait 2023-02-14 19:11:45 +01:00
ufs scsi: ufs: core: Fix devfreq deadlocks 2023-02-01 08:34:39 +01:00
vdso
video
xen