Go to file
Masayoshi Mizuma 295cc7eb31 x86/smpboot: Fix uncore_pci_remove() indexing bug when hot-removing a physical CPU
When a physical CPU is hot-removed, the following warning messages
are shown while the uncore device is removed in uncore_pci_remove():

  WARNING: CPU: 120 PID: 5 at arch/x86/events/intel/uncore.c:988
  uncore_pci_remove+0xf1/0x110
  ...
  CPU: 120 PID: 5 Comm: kworker/u1024:0 Not tainted 4.15.0-rc8 #1
  Workqueue: kacpi_hotplug acpi_hotplug_work_fn
  ...
  Call Trace:
  pci_device_remove+0x36/0xb0
  device_release_driver_internal+0x145/0x210
  pci_stop_bus_device+0x76/0xa0
  pci_stop_root_bus+0x44/0x60
  acpi_pci_root_remove+0x1f/0x80
  acpi_bus_trim+0x54/0x90
  acpi_bus_trim+0x2e/0x90
  acpi_device_hotplug+0x2bc/0x4b0
  acpi_hotplug_work_fn+0x1a/0x30
  process_one_work+0x141/0x340
  worker_thread+0x47/0x3e0
  kthread+0xf5/0x130

When uncore_pci_remove() runs, it tries to get the package ID to
clear the value of uncore_extra_pci_dev[].dev[] by using
topology_phys_to_logical_pkg(). The warning messesages are
shown because topology_phys_to_logical_pkg() returns -1.

  arch/x86/events/intel/uncore.c:
  static void uncore_pci_remove(struct pci_dev *pdev)
  {
  ...
          phys_id = uncore_pcibus_to_physid(pdev->bus);
  ...
                  pkg = topology_phys_to_logical_pkg(phys_id); // returns -1
                  for (i = 0; i < UNCORE_EXTRA_PCI_DEV_MAX; i++) {
                          if (uncore_extra_pci_dev[pkg].dev[i] == pdev) {
                                  uncore_extra_pci_dev[pkg].dev[i] = NULL;
                                  break;
                          }
                  }
                  WARN_ON_ONCE(i >= UNCORE_EXTRA_PCI_DEV_MAX); // <=========== HERE!!

topology_phys_to_logical_pkg() tries to find
cpuinfo_x86->phys_proc_id that matches the phys_pkg argument.

  arch/x86/kernel/smpboot.c:
  int topology_phys_to_logical_pkg(unsigned int phys_pkg)
  {
          int cpu;

          for_each_possible_cpu(cpu) {
                  struct cpuinfo_x86 *c = &cpu_data(cpu);

                  if (c->initialized && c->phys_proc_id == phys_pkg)
                          return c->logical_proc_id;
          }
          return -1;
  }

However, the phys_proc_id was already set to 0 by remove_siblinginfo()
when the CPU was offlined.

So, topology_phys_to_logical_pkg() cannot find the correct
logical_proc_id and always returns -1.

As the result, uncore_pci_remove() calls WARN_ON_ONCE() and the warning
messages are shown.

What is worse is that the bogus 'pkg' index results in two bugs:

 - We dereference uncore_extra_pci_dev[] with a negative index
 - We fail to clean up a stale pointer in uncore_extra_pci_dev[][]

To fix these bugs, remove the clearing of ->phys_proc_id from remove_siblinginfo().

This should not cause any problems, because ->phys_proc_id is not
used after it is hot-removed and it is re-set while hot-adding.

Signed-off-by: Masayoshi Mizuma <m.mizuma@jp.fujitsu.com>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: yasu.isimatu@gmail.com
Cc: <stable@vger.kernel.org>
Fixes: 30bb981185 ("x86/topology: Avoid wasting 128k for package id array")
Link: http://lkml.kernel.org/r/ed738d54-0f01-b38b-b794-c31dc118c207@gmail.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-02-13 12:47:28 +01:00
Documentation Merge branch 'idr-2018-02-06' of git://git.infradead.org/users/willy/linux-dax 2018-02-08 14:39:29 -08:00
LICENSES LICENSES: Add MPL-1.1 license 2018-01-06 10:59:44 -07:00
arch x86/smpboot: Fix uncore_pci_remove() indexing bug when hot-removing a physical CPU 2018-02-13 12:47:28 +01:00
block for-linus-20180204 2018-02-04 11:16:35 -08:00
certs
crypto Merge branch 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6 2018-01-31 14:22:45 -08:00
drivers IOMMU Updates for Linux v4.16 2018-02-08 12:03:54 -08:00
firmware
fs vfs/proc/kcore, x86/mm/kcore: Fix SMAP fault when dumping vsyscall user page 2018-02-13 09:15:58 +01:00
include vfs/proc/kcore, x86/mm/kcore: Fix SMAP fault when dumping vsyscall user page 2018-02-13 09:15:58 +01:00
init membarrier: Provide core serializing command, *_SYNC_CORE 2018-02-05 21:35:03 +01:00
ipc ipc/mqueue.c: have RT tasks queue in by priority in wq_add() 2018-02-06 18:32:46 -08:00
kernel Modules updates for v4.16 2018-02-07 14:29:34 -08:00
lib Merge branch 'idr-2018-02-06' of git://git.infradead.org/users/willy/linux-dax 2018-02-08 14:39:29 -08:00
mm Merge branch 'akpm' (patches from Andrew) 2018-02-06 22:15:42 -08:00
net This request is late, apologies. 2018-02-08 15:18:32 -08:00
samples sample/bpf: fix erspan metadata 2018-02-06 11:32:49 -05:00
scripts - update includes for gcc 8 (Valdis Kletnieks) 2018-02-08 14:37:32 -08:00
security iversion.h related cleanup for v4.16 2018-02-07 14:25:22 -08:00
sound ASoC: Updates for v4.16 2018-02-07 12:11:09 -08:00
tools Merge branch 'idr-2018-02-06' of git://git.infradead.org/users/willy/linux-dax 2018-02-08 14:39:29 -08:00
usr
virt 2nd set of arm64 updates for 4.16: 2018-02-08 10:44:25 -08:00
.cocciconfig
.get_maintainer.ignore
.gitattributes
.gitignore scripts/package: snap-pkg target 2017-12-13 00:00:18 +09:00
.mailmap mailmap: update Mark Yao's email address 2018-01-04 16:45:09 -08:00
COPYING
CREDITS
Kbuild
Kconfig
MAINTAINERS Mostly cleanups, but three bug fixes: 2018-02-08 12:20:41 -08:00
Makefile Makefile: introduce CONFIG_CC_STACKPROTECTOR_AUTO 2018-02-06 18:32:44 -08:00
README

README

Linux kernel
============

This file was moved to Documentation/admin-guide/README.rst

Please notice that there are several guides for kernel developers and users.
These guides can be rendered in a number of formats, like HTML and PDF.

In order to build the documentation, use ``make htmldocs`` or
``make pdfdocs``.

There are various text files in the Documentation/ subdirectory,
several of them using the Restructured Text markup notation.
See Documentation/00-INDEX for a list of what is contained in each file.

Please read the Documentation/process/changes.rst file, as it contains the
requirements for building and running the kernel, and information about
the problems which may result by upgrading your kernel.