Go to file
Philip Yang b602f098f7 drm/amdkfd: Fix lock dependency warning with srcu
[ Upstream commit 2a9de42e8d3c82c6990d226198602be44f43f340 ]

======================================================
WARNING: possible circular locking dependency detected
6.5.0-kfd-yangp #2289 Not tainted
------------------------------------------------------
kworker/0:2/996 is trying to acquire lock:
        (srcu){.+.+}-{0:0}, at: __synchronize_srcu+0x5/0x1a0

but task is already holding lock:
        ((work_completion)(&svms->deferred_list_work)){+.+.}-{0:0}, at:
	process_one_work+0x211/0x560

which lock already depends on the new lock.

the existing dependency chain (in reverse order) is:

-> #3 ((work_completion)(&svms->deferred_list_work)){+.+.}-{0:0}:
        __flush_work+0x88/0x4f0
        svm_range_list_lock_and_flush_work+0x3d/0x110 [amdgpu]
        svm_range_set_attr+0xd6/0x14c0 [amdgpu]
        kfd_ioctl+0x1d1/0x630 [amdgpu]
        __x64_sys_ioctl+0x88/0xc0

-> #2 (&info->lock#2){+.+.}-{3:3}:
        __mutex_lock+0x99/0xc70
        amdgpu_amdkfd_gpuvm_restore_process_bos+0x54/0x740 [amdgpu]
        restore_process_helper+0x22/0x80 [amdgpu]
        restore_process_worker+0x2d/0xa0 [amdgpu]
        process_one_work+0x29b/0x560
        worker_thread+0x3d/0x3d0

-> #1 ((work_completion)(&(&process->restore_work)->work)){+.+.}-{0:0}:
        __flush_work+0x88/0x4f0
        __cancel_work_timer+0x12c/0x1c0
        kfd_process_notifier_release_internal+0x37/0x1f0 [amdgpu]
        __mmu_notifier_release+0xad/0x240
        exit_mmap+0x6a/0x3a0
        mmput+0x6a/0x120
        do_exit+0x322/0xb90
        do_group_exit+0x37/0xa0
        __x64_sys_exit_group+0x18/0x20
        do_syscall_64+0x38/0x80

-> #0 (srcu){.+.+}-{0:0}:
        __lock_acquire+0x1521/0x2510
        lock_sync+0x5f/0x90
        __synchronize_srcu+0x4f/0x1a0
        __mmu_notifier_release+0x128/0x240
        exit_mmap+0x6a/0x3a0
        mmput+0x6a/0x120
        svm_range_deferred_list_work+0x19f/0x350 [amdgpu]
        process_one_work+0x29b/0x560
        worker_thread+0x3d/0x3d0

other info that might help us debug this:
Chain exists of:
  srcu --> &info->lock#2 --> (work_completion)(&svms->deferred_list_work)

Possible unsafe locking scenario:

        CPU0                    CPU1
        ----                    ----
        lock((work_completion)(&svms->deferred_list_work));
                        lock(&info->lock#2);
			lock((work_completion)(&svms->deferred_list_work));
        sync(srcu);

Signed-off-by: Philip Yang <Philip.Yang@amd.com>
Reviewed-by: Felix Kuehling <felix.kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-02-05 20:12:59 +00:00
Documentation ASoC: doc: Fix undefined SND_SOC_DAPM_NOPM argument 2024-02-05 20:12:54 +00:00
LICENSES
arch um: time-travel: fix time corruption 2024-02-05 20:12:57 +00:00
block block: prevent an integer overflow in bvec_try_merge_hw_page 2024-02-05 20:12:53 +00:00
certs
crypto crypto: api - Disallow identical driver names 2024-01-31 16:16:58 -08:00
drivers drm/amdkfd: Fix lock dependency warning with srcu 2024-02-05 20:12:59 +00:00
fs 9p: Fix initialisation of netfs_inode for 9p 2024-02-05 20:12:59 +00:00
include PCI: add INTEL_HDA_ARL to pci_ids.h 2024-02-05 20:12:55 +00:00
init rootfs: Fix support for rootfstype= when root= is given 2024-01-25 15:27:43 -08:00
io_uring io_uring/rw: ensure io->bytes_done is always initialized 2024-01-25 15:27:41 -08:00
ipc
kernel bpf: Set uattr->batch.count as zero before batched update or deletion 2024-02-05 20:12:51 +00:00
lib debugobjects: Stop accessing objects after releasing hash bucket lock 2024-02-05 20:12:47 +00:00
mm mm: page_alloc: unreserve highatomic page blocks before oom 2024-01-31 16:17:03 -08:00
net bridge: cfm: fix enum typo in br_cc_ccm_tx_parse 2024-02-05 20:12:54 +00:00
rust
samples
scripts scripts/get_abi: fix source path leak 2024-01-31 16:17:01 -08:00
security lsm: new security_file_ioctl_compat() hook 2024-01-31 16:17:00 -08:00
sound ALSA: hda/conexant: Fix headset auto detect fail in cx8070 and SN6140 2024-02-05 20:12:57 +00:00
tools libsubcmd: Fix memory leak in uniq() 2024-02-05 20:12:59 +00:00
usr
virt
.clang-format
.cocciconfig
.get_maintainer.ignore
.gitattributes
.gitignore
.mailmap
.rustfmt.toml
COPYING
CREDITS
Kbuild
Kconfig
MAINTAINERS genirq/affinity: Move group_cpus_evenly() into lib/ 2024-01-10 17:10:33 +01:00
Makefile Linux 6.1.76 2024-01-31 16:17:12 -08:00
README

README

Linux kernel
============

There are several guides for kernel developers and users. These guides can
be rendered in a number of formats, like HTML and PDF. Please read
Documentation/admin-guide/README.rst first.

In order to build the documentation, use ``make htmldocs`` or
``make pdfdocs``.  The formatted documentation can also be read online at:

    https://www.kernel.org/doc/html/latest/

There are various text files in the Documentation/ subdirectory,
several of them using the Restructured Text markup notation.

Please read the Documentation/process/changes.rst file, as it contains the
requirements for building and running the kernel, and information about
the problems which may result by upgrading your kernel.