This adds a very primitive coredump mechanism under subsys/debug
where during fatal error, register and memory content can be
dumped to coredump backend. One such backend utilizing log
module for output is included. Once the coredump log is converted
to a binary file, it can be used with the ELF output file as
inputs to an overly simplified implementation of a GDB server.
This GDB server can be attached via the target remote command of
GDB and will be serving register and memory content. This allows
using GDB to examine stack and memory where the fatal error
occurred.
Signed-off-by: Daniel Leung <daniel.leung@intel.com>
We want to be regression-testing the spurious ISR functionality.
Therefore, in z_fatal_error() we need to allow a test to continue
if an error has occured due to a spurious IRQ being triggered.
Only in test mode, wee allow the function to return without an
error. In normal mode the current thread will be aborted.
Signed-off-by: Ioannis Glaropoulos <Ioannis.Glaropoulos@nordicsemi.no>
For architectures that support detection of nested interrupts,
we need to check the validity of the exception stack frame,
before we can supply it as a pointer to the function that
evaluates whether we are in a nested interrupt context. This
commits adds the required esf pointer checks in z_fatal_error().
Signed-off-by: Ioannis Glaropoulos <Ioannis.Glaropoulos@nordicsemi.no>
We need to unlock IRQs in early return points of
z_fatal_error() functions; not only at the normal
return point.
Signed-off-by: Ioannis Glaropoulos <Ioannis.Glaropoulos@nordicsemi.no>
The code underneath z_fatal_error() (which is usually run in an
exception context, but is not required to be) was running with
interrupts enabled, which is a little surprising.
The only bug present currently is that the CPU ID extracted for
logging is subject to a race (i.e. it's possible but very unlikely
that such a handler might migrate to another CPU after the error is
flagged and log the wrong CPU ID), but in general users with custom
error handlers are likely to be surprised when their dying threads
gets preempted by other code before they can abort.
Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
Promote the private z_arch_* namespace, which specifies
the interface between the core kernel and the
architecture code, to a new top-level namespace named
arch_*.
This allows our documentation generation to create
online documentation for this set of interfaces,
and this set of interfaces is worth treating in a
more formal way anyway.
Signed-off-by: Andrew Boie <andrew.p.boie@intel.com>
In z_fatal_error() we invoke the arch-specific API that
evaluates whether we are in a nested exception. We then
use the result to log a message that the error occurred
in ISR. In non-test mode, we unconditionally panic, if
an exception has occurred in an ISR and the fatal error
handler has not returned (apart from the case of an
error in stack sentinel check).
Signed-off-by: Ioannis Glaropoulos <Ioannis.Glaropoulos@nordicsemi.no>
The reason we decide to ignore it in code coverage:
1.No test case can cover the function for code coverage.
2.Even if we added a test for testing, it would be marked as
"never be called by other code" because the function cause
CPU halted and it can't return.
Signed-off-by: Peng Su <peng.su@intel.com>
This needs further design work due to problems with logging
C strings. Just send always to printk() for now until this
is resolved.
Fixes: #18052
Signed-off-by: Andrew Boie <andrew.p.boie@intel.com>
Any fatal error will print "ZEPHYR FATAL ERROR" now, so
we don't have to maintain a set of strings in the
sanitycheck harness.py
Signed-off-by: Andrew Boie <andrew.p.boie@intel.com>
This is now called z_arch_esf_t, conforming to our naming
convention.
This needs to remain a typedef due to how our offset generation
header mechanism works.
Signed-off-by: Andrew Boie <andrew.p.boie@intel.com>
We introduce a new z_fatal_print() API and replace all
occurrences of exception handling code to use it.
This routes messages to the logging subsystem if enabled.
Otherwise, messages are sent to printk().
Signed-off-by: Andrew Boie <andrew.p.boie@intel.com>
* z_NanoFatalErrorHandler() is now moved to common kernel code
and renamed z_fatal_error(). Arches dump arch-specific info
before calling.
* z_SysFatalErrorHandler() is now moved to common kernel code
and renamed k_sys_fatal_error_handler(). It is now much simpler;
the default policy is simply to lock interrupts and halt the system.
If an implementation of this function returns, then the currently
running thread is aborted.
* New arch-specific APIs introduced:
- z_arch_system_halt() simply powers off or halts the system.
* We now have a standard set of fatal exception reason codes,
namespaced under K_ERR_*
* CONFIG_SIMPLE_FATAL_ERROR_HANDLER deleted
* LOG_PANIC() calls moved to k_sys_fatal_error_handler()
Signed-off-by: Andrew Boie <andrew.p.boie@intel.com>