clear-pkgs-linux-iot-lts2018/0740-igb_avb-fix-kernel-pan...

81 lines
3.4 KiB
Diff

From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: Feng Tang <feng.tang@intel.com>
Date: Thu, 15 Nov 2018 09:45:37 +0800
Subject: [PATCH] igb_avb: fix kernel panic in S3 process
We met igb_avb kernel panic during S3 suspend/resume process, the
panic call stack is:
[61155.772446] BUG: unable to handle kernel paging request at 0000000000003818
[61155.772448] PGD 0 P4D 0
[61155.772453] Oops: 0002 [#1] PREEMPT SMP
[61155.772456] CPU: 2 PID: 20864 Comm: kworker/u8:7 Tainted: G U WC 4.19.0-18.iot-lts2018-sos #1
[61155.772462] Workqueue: events_unbound async_run_entry_fn
[61155.772473] RIP: 0010:igb_configure_tx_ring+0x134/0x220 [igb_avb]
[61155.772475] Code: 10 38 00 00 48 98 48 01 c1 48 63 c2 49 89 4d 30 49 8b b4 24 58 04 00 00 48 85 f6 74 0b 48 01 f0 31 d2 89 10 49 8b 4d 30 31 c0 <89> 01 41 8b 84 24 6c 05 00 00 b9 14 01 04 02 83 f8 05 74 0e 83 f8
[61155.772476] RSP: 0000:ffff88ffb7f2bd38 EFLAGS: 00010246
[61155.772478] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000003818
[61155.772479] RDX: 0000000000003810 RSI: 0000000000000000 RDI: 00000000ffffffff
[61155.772480] RBP: 0000000000000000 R08: 000000000000007f R09: 000000000000007f
[61155.772481] R10: 0000000000005c80 R11: 0000000000000000 R12: ffff88ffe849c900
[61155.772483] R13: ffff88ffe89de240 R14: 000000027c01c000 R15: ffffffff8b564e73
[61155.772485] FS: 0000000000000000(0000) GS:ffff88fff3b00000(0000) knlGS:0000000000000000
[61155.772486] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[61155.772487] CR2: 0000000000003818 CR3: 000000022a813000 CR4: 00000000003406e0
[61155.772488] Call Trace:
[61155.772502] igb_configure+0x1a1/0x450 [igb_avb]
[61155.772510] __igb_open+0x6f/0x540 [igb_avb]
[61155.772525] igb_resume+0xca/0x100 [igb_avb]
[61155.772533] dpm_run_callback+0x59/0x180
[61155.772536] device_resume+0xc0/0x2a0
[61155.772538] async_resume+0x19/0x30
[61155.772540] async_run_entry_fn+0x39/0x160
[61155.772543] process_one_work+0x19e/0x3d0
[61155.772546] worker_thread+0x3d/0x390
[61155.772550] kthread+0x11e/0x140
[61155.772557] ret_from_fork+0x3a/0x50
The root cause is there is register access after power cycle, specifically
from the watchdog timer/task.
Fix it by cancelling the watchdog timer/task in suspend hook, and re-enable
it in resume.
Note: Why that race condition was still happening is something we are not
completely sure yet, and will still monitoring in the future test.
Change-Id: Id3001fb6bef905aaac2024968707f518eb28a74b
Tracked-On: PKT-1578
Signed-off-by: Feng Tang <feng.tang@intel.com>
---
drivers/staging/igb_avb/igb_main.c | 6 ++++++
1 file changed, 6 insertions(+)
diff --git a/drivers/staging/igb_avb/igb_main.c b/drivers/staging/igb_avb/igb_main.c
index 70219b0a2cd9..8d772cfbe397 100644
--- a/drivers/staging/igb_avb/igb_main.c
+++ b/drivers/staging/igb_avb/igb_main.c
@@ -9259,6 +9259,10 @@ static int __igb_shutdown(struct pci_dev *pdev, bool *enable_wake,
int retval = 0;
#endif
+ del_timer_sync(&adapter->watchdog_timer);
+ del_timer_sync(&adapter->phy_info_timer);
+ cancel_work_sync(&adapter->watchdog_task);
+
netif_device_detach(netdev);
status = E1000_READ_REG(hw, E1000_STATUS);
@@ -9394,6 +9398,8 @@ static int igb_resume(struct pci_dev *pdev)
netif_device_attach(netdev);
+ timer_setup(&adapter->watchdog_timer, &igb_watchdog, 0);
+ timer_setup(&adapter->phy_info_timer, &igb_update_phy_info, 0);
return 0;
}
--
https://clearlinux.org