443f1cb58c

A Cortex-M BusFault often arises from executing a function pointer that has been corrupted.

The Zephyr Cortex-M fault handler dereferences the `$pc` in `z_arm_is_synchronous_svc()` to determine whether the fault was due to a kernel oops (ARCH_EXCEPT). This can itself cause a BusFault if the pc is corrupt. A BusFault raised from within a HardFault triggers ARM Cortex-M "Lockup", which prevents the Zephyr fault handler from running to completion. As a result, the Zephyr fault handler dumps no fault information.

To fix the issue, we can set the `CCR.BFHFNMIGN` bit prior to the instruction address dereference, which causes the processor to ignore the BusFault and return a value of 0x0 instead of entering lockup. After the operation is complete, we clear `CCR.BFHFNMIGN`, as it would be unexpected for any other code in the fault handler to trigger a fault.

The issue can be reproduced programmatically with:

```
void (*unaligned_func)(void) = (void (*)(void))0x50000001;
unaligned_func();
```

I bumped into this problem while debugging an issue on the nRF9160DK (`west build --board nrf9160dk_nrf9160ns`) and confirmed that after making this change I now see the full fault handler print:

```
[00:00:45.582,214] <err> os: Exception occurred in Secure State
[00:00:45.582,244] <err> os: ***** HARD FAULT *****
[...]
[00:00:45.583,984] <err> os: Current thread: 0x2000d340 (shell_uart)
[00:00:45.829,498] <err> fatal_error: Resetting system
```

Signed-off-by: Chris Coleman <chris@memfault.com>
||
---|---|---|
.. | ||
arc | ||
arm | ||
arm64 | ||
common | ||
mips | ||
nios2 | ||
posix | ||
riscv | ||
sparc | ||
x86 | ||
xtensa | ||
CMakeLists.txt | ||
Kconfig |
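
The guard described in the commit message above can be illustrated with a small host-side model. This is only a sketch under stated assumptions, not the code from the commit: `mock_ccr`, `mock_lockup`, `mock_flash`, `mock_read16()` and `is_synchronous_svc()` are hypothetical names that stand in for the real `SCB->CCR` register, the lockup state, and the target's memory. The value 2 used for the runtime-exception SVC immediate is likewise an assumption. On real hardware the register writes would typically also be followed by `DSB`/`ISB` barriers, which the model omits.

```c
#include <stdbool.h>
#include <stdint.h>

/* BFHFNMIGN is bit 8 of the Cortex-M Configuration and Control Register. */
#define CCR_BFHFNMIGN            (1u << 8)
/* Thumb SVC encoding: 0xDF00 | imm8. */
#define SVC_OPCODE               0xDF00u
/* Assumed SVC immediate reserved for runtime exceptions (hypothetical here). */
#define SVC_CALL_RUNTIME_EXCEPT  2u

/* Hypothetical stand-ins for hardware state. */
uint32_t mock_ccr;     /* models SCB->CCR */
bool mock_lockup;      /* set when an unguarded bad read would lock up the core */

/* Hypothetical readable memory: "svc #2" followed by "nop". */
#define MOCK_FLASH_BASE 0x1000u
const uint16_t mock_flash[] = { 0xDF02u, 0xBF00u };

/* Models a 16-bit bus read performed from inside the HardFault handler. */
uint16_t mock_read16(uint32_t addr)
{
    uint32_t off = addr - MOCK_FLASH_BASE;

    if (addr >= MOCK_FLASH_BASE && off < sizeof(mock_flash) && (off & 1u) == 0u) {
        return mock_flash[off / 2u];
    }
    /* Invalid address: without BFHFNMIGN the BusFault escalates to lockup. */
    if ((mock_ccr & CCR_BFHFNMIGN) == 0u) {
        mock_lockup = true;
    }
    return 0u; /* with BFHFNMIGN set, the faulting read yields 0x0 */
}

/* pc is the stacked return address; the SVC, if any, sits 2 bytes before it. */
bool is_synchronous_svc(uint32_t pc, bool guard)
{
    if (guard) {
        mock_ccr |= CCR_BFHFNMIGN;   /* ignore BusFaults for this one read */
    }
    uint16_t insn = mock_read16(pc - 2u);
    if (guard) {
        mock_ccr &= ~CCR_BFHFNMIGN;  /* restore normal fault behaviour */
    }
    return ((insn & 0xFF00u) == SVC_OPCODE) &&
           ((insn & 0x00FFu) == SVC_CALL_RUNTIME_EXCEPT);
}
```

With the guard enabled, `is_synchronous_svc(0x50000000u, true)` returns `false` and leaves `mock_lockup` clear, so the rest of the fault handler can run. Without the guard, the same call sets `mock_lockup`, which mirrors the lost fault dump described in the commit message.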