This is a private kernel header with private kernel APIs, it should not
be exposed in the public zephyr include directory.
Once sample remains to be fixed (metairq_dispatch), which currently uses
private APIs from that header, it should not be the case.
Signed-off-by: Anas Nashif <anas.nashif@intel.com>
This is meant as a substitute for sys_clock_timeout_end_calc()
Current sys_clock_timeout_end_calc() usage opens up many bug
possibilities due to the actual timeout evaluation's open-coded nature.
Issue ##50611 is one example.
- Some users store the returned value in a signed variable, others in
an unsigned one, making the comparison with UINT64_MAX (corresponding
to K_FOREVER) wrong in the signed case.
- Some users compute the difference and store that in a signed variable
to compare against 0 which still doesn't work with K_FOREVER. And when
this difference is used as a timeout argument then the K_FOREVER
nature of the timeout is lost.
- Some users complexify their code by special-casing K_NO_WAIT and
K_FOREVER inline which is bad for both code readability and binary
size.
Let's introduce a better abstraction to deal with absolute timepoints
with an opaque type to be used with a well-defined API.
The word "timeout" was avoided in the naming on purpose as the timeout
namespace is quite crowded already and it is preferable to make a
distinction between relative time periods (timeouts) and absolute time
values (timepoints).
A few stacks are also adjusted as they were too tight on X86.
Signed-off-by: Nicolas Pitre <npitre@baylibre.com>
Scheduling relative timeouts from within timer callbacks (=sys clock ISR
context) differs from scheduling relative timeouts from an application
context.
This change documents and explains the rationale of this distinction.
Signed-off-by: Florian Grandel <fgrandel@code-for-humans.de>
Rework the fragile and ad-hoc computation of timeslice expirations
into per-CPU struct _timeout objects with regular callbacks. The
expiration callbacks themselves simply set a per-cpu flag (they might
run on any CPU), which gets checked at the end of the timer ISR on
every CPU.
This simplifies logic and removes a bunch of code. It also fixes at
least three bugs:
1. As @npitre discovered: On SMP, the number of ticks announced on any
given CPU is going to be a subset of all expired ticks. This broke
the accounting of timeslice ticks, and effectively meant that
timeslicing only worked on SMP on systems where one CPU could hog all
the announcements, and only on that CPU.
2. The bootstrap path to arm the timer driver after setting the first
timeout in an empty list couldn't take into account
sys_clock_elapsed() ticks, as it didn't know whether it was being
called underneath an existing announce loop. Now this code is no
longer responsible for knowing anything about time slicing at all.
3. Also on SMP, there was a case where two CPUs timeslicing
simultaneously could stomp on each others' timeouts in
z_set_timeout_expiry(), as neither had a way of knowing what the
other's state was. CPUs could miss their own expiration and have to
wait for the slice expiration on the other CPU. Now, timeouts are
global objects with simple expiration times, and there's no need for
that function at all.
Signed-off-by: Andy Ross <andyross@google.com>
At least one static analysis tool is flagging a potential NULL
derefence in sys_clock_announce()'s tick processing loop where the
routine 'first()' is concerned. In practice, this does not occur as
...
1. The code in question is protected by a spinlock.
2. 'first()' does not change the contents of anything.
The code has consequently been tweaked to prevent similar such false
positives in the future.
Signed-off-by: Peter Mitsis <peter.mitsis@intel.com>
Accurate timekeeping is something that is often taken for granted.
However, reliability of timekeeping code is critical for most core
and subsystem code. Furthermore, Many higher-level timekeeping
utilities in Zephyr work off of ticks but there is no way to modify
ticks directly which would require either unnecessary delays in
test code or non-ideal compromises in test coverage.
Since timekeeping is so critical, there should be as few barriers
to testing timekeeping code as possible, while preserving
integrity of the kernel's public interface.
With this, we expose `sys_clock_tick_set()` as a system call only
when `CONFIG_ZTEST` is set, declared within the ztest framework.
Signed-off-by: Chris Friedt <cfriedt@meta.com>
Fixes an issue in sys_clock_tick_get() that could lead to drift in
a k_timer handler. The handler is invoked in the timer ISR as a
callback in sys_tick_announce().
1. The handler invokes k_uptime_ticks().
2. k_uptime_ticks() invokes sys_clock_tick_get().
3. sys_clock_tick_get() must call elapsed() and not
sys_clock_elapsed() as we do not want to count any
unannounced ticks that may have elapsed while
processing the timer ISR.
Fixes#46378
Signed-off-by: Peter Mitsis <peter.mitsis@intel.com>
Updates sys_clock_announce() such that the <announce_remaining> update
calculation is done after the callback. This prevents another core from
entering the timeout processing loop before the first core leaves it.
Signed-off-by: Peter Mitsis <peter.mitsis@intel.com>
In order to bring consistency in-tree, migrate all kernel code to the
new prefix <zephyr/...>. Note that the conversion has been scripted,
refer to zephyrproject-rtos#45388 for more details.
Signed-off-by: Gerard Marull-Paretas <gerard.marull@nordicsemi.no>
Commit b1182bf83b ("kernel/timeout: Serialize handler callbacks on
SMP") introduced an important fix to timeout handling on
multiprocessor systems, but it did it in a clumsy way by holding a
spinlock across the entire timeout process on all cores (everything
would have to spin until one core finished the list). The lock also
delays any nested interrupts that might otherwise be delivered, which
breaks our nested_irq_offload case on xtensa+SMP (where contra x86,
the "synchronous" interrupt is sensitive to mask state).
Doing this right turns out not to be so hard: take the timeout lock,
check to see if someone is already iterating
(i.e. "announce_remaining" is non-zero), and if so just increment the
ticks to announce and exit. The original cpu will then complete the
full timeout list without blocking any others longer than needed to
check the timeout state.
Fixes#44758
Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
On multiprocessor systems, it's routine to enter sys_clock_announce()
in parallel (the driver will generally announce zero ticks on all but
one cpu).
When that happens, each call will independently enter the loop over
the timeout list. The access is correctly synchronized, so the list
handling is correct. But the lock is RELEASED around the invocation
of the callback, which means that the individual callbacks may
interleave between cpus. That means that individual
application-provided callbacks may be executed in parallel, which to
the app is indistinguishable from "out of order".
That's surprising and error-prone. Don't do it. Place a secondary
outer spinlock around the announce loop (but not the timeslicing
handling) to correctly serialize the timeout handling on a single cpu.
(It should be noted that this was discovered not because of a timeout
callback race, but because the resulting simultaneous calls to
sys_clock_set_timeout from separate cores seems to cause extremely
high latency excursions on intel_adsp hardware using the cavs_timer
driver. That hardware issue is still poorly understood, but this fix
is desirable regardless.)
Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
We can't simply use CLAMP to set the next timeout because
when CONFIG_SYSTEM_CLOCK_SLOPPY_IDLE is set, MAX_WAIT is
a negative number and then CLAMP will be called with
the higher boundary lower the lower boundary.
Fixes#41422
Signed-off-by: Flavio Ceolin <flavio.ceolin@intel.com>
Correct the way the relative ticks value is calculated for an absolute
timeout. Previously, elapsed() was called twice and the returned value
was first subtracted from and then added to the ticks value. It could
happen that the HW counter value read by elapsed() changed between the
two calls to this function. This caused the test_timeout_abs test case
from the timer_api test suite to occasionally fail, e.g. on certain nRF
platforms.
Signed-off-by: Andrzej Głąbek <andrzej.glabek@nordicsemi.no>
K_busy_wait is the only function from thread.c that is used when
CONFIG_MULTITHREADING=n. Moving to timeout since it fits better there
as it requires sys clock to be present.
Signed-off-by: Krzysztof Chruscinski <krzysztof.chruscinski@nordicsemi.no>
z_timeout_end_calc() was missing final else statement
in the if else if construct. This commit pulls the last
condition into a final else {} to comply with guideline
15.7.
Signed-off-by: Jennifer Williams <jennifer.m.williams@intel.com>
The clock/timer APIs are not application facing APIs, however, similar
to arch_ and a few other APIs they are available to implement drivers
and add support for new hardware and are documented and available to be
used outside of the clock/kernel subsystems.
Remove the leading z_ and provide them as clock_* APIs for someone
writing a new timer driver to use.
Signed-off-by: Anas Nashif <anas.nashif@intel.com>
This function would correctly suppress attempts to set timeouts that
were too soon for the driver or farther out than what was already set,
but when it actually set the timeout it would use the requested value
and not clamp it to the minimum of it and the current timeout
expiration, leading to "too-long" timeouts being set at the driver.
In uniprocessor configurations, that turns out to have been benign
because something else would always come back along when timeout state
changed and fix the broken value before the expiration.
But in SMP, this opens up races. For example, the idle thread on one
CPU can see that there are no active threads and schedule a maximum
value timeout at the same time as the other thread adds a new timeout
that expects a near-term expiration. The broken code here would see
that the new timeout exists, decide that yes it needs to override, but
then set the K_TICKS_FOREVER value it got from the idle thread!
Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
Remove duplication in the code by moving macro LOCKED() to the correct
kernel_internal.h header.
Signed-off-by: Andrei Emeltchenko <andrei.emeltchenko@intel.com>
The computation was using the already-adjusted input value that
assumed relative timeouts and not the actual argument the user passed.
Absolute timeouts were consistently waking up one tick early.
Fixes#32499
Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
There was an edge case in the timeout handling (exposed by, but not
strictly related to, the recent timeslice fix): the next_timeout()
computation would include time slice expiration as a clamp on the
result, but this would be invoked also on the z_set_timeout_expiry()
path which gets hooked on entry to a new thread which is needed to set
the timeout in the first place. So if no other timer interrupt was
scheduled, it was possible to miss the first timeslice interrupt after
thread scheduling.
The explanation is much longer than the fix (use <= as the comparator
instead of <).
In practice this was only being hit in the existing test suite on
riscv miv running under renode using non-default clock rates.
Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
Fix an edge case that snuck in with the recent fix: if timeslicing is
enabled, the CPU's slice_ticks will be zero, and thus match a timeout
object's dticks value of zero, and thus get suppressed (because "we
already have a timeout scheduled for that") incorrectly.
Fixes#31789
Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
Time slices don't have a timeout struct associated and stored in
timeout_list. Time slice timeout is direct programmed in the system
clock and tracked in _current_cpu->slice_ticks.
There is one issue where the time slice timeout can be missed because
the system clock is re-programmed to a longer timeout. To this happens,
it is only necessary that the timeout_list is empty (any timeout set)
and a new timeout longer than remaining time slice is set. This is cause
because z_add_timeout does not check for the slice ticks.
The following example spots the issue:
K_THREAD_STACK_DEFINE(tstack, STACK_SIZE);
K_THREAD_STACK_ARRAY_DEFINE(tstacks, NUM_THREAD, STACK_SIZE);
K_SEM_DEFINE(sema, 0, NUM_THREAD);
static inline void spin_for_ms(int ms)
{
uint32_t t32 = k_uptime_get_32();
while (k_uptime_get_32() - t32 < ms) {
}
}
static void thread_time_slice(void *p1, void *p2, void *p3)
{
printk("thread[%d] - Before spin\n", (int)(uintptr_t)p1);
/* Spinning for longer than slice */
spin_for_ms(SLICE_SIZE + 20);
/* The following print should not happen before another
* same priority thread starts.
*/
printk("thread[%d] - After spinning\n", (int)(uintptr_t)p1);
k_sem_give(&sema);
}
void main(void)
{
k_tid_t tid[NUM_THREAD];
struct k_thread t[NUM_THREAD];
uint32_t slice_ticks = k_ms_to_ticks_ceil32(SLICE_SIZE);
int old_prio = k_thread_priority_get(k_current_get());
/* disable timeslice */
k_sched_time_slice_set(0, K_PRIO_PREEMPT(0));
for (int j = 0; j < 2; j++) {
k_sem_reset(&sema);
/* update priority for current thread */
k_thread_priority_set(k_current_get(), K_PRIO_PREEMPT(j));
/* synchronize to tick boundary */
k_usleep(1);
/* create delayed threads with equal preemptive priority */
for (int i = 0; i < NUM_THREAD; i++) {
tid[i] = k_thread_create(&t[i], tstacks[i], STACK_SIZE,
thread_time_slice, (void *)i, NULL,
NULL, K_PRIO_PREEMPT(j), 0,
K_NO_WAIT);
}
/* enable time slice (and reset the counter!) */
k_sched_time_slice_set(SLICE_SIZE, K_PRIO_PREEMPT(0));
/* Spins for while to spend this thread time but not longer */
/* than a slice. This is important */
spin_for_ms(100);
printk("before sleep\n");
/* relinquish CPU and wait for each thread to complete */
k_sleep(K_TICKS(slice_ticks * (NUM_THREAD + 1)));
for (int i = 0; i < NUM_THREAD; i++) {
k_sem_take(&sema, K_FOREVER);
}
/* test case teardown */
for (int i = 0; i < NUM_THREAD; i++) {
k_thread_abort(tid[i]);
}
/* disable time slice */
k_sched_time_slice_set(0, K_PRIO_PREEMPT(0));
}
k_thread_priority_set(k_current_get(), old_prio);
}
Signed-off-by: Flavio Ceolin <flavio.ceolin@intel.com>
Replaces all existing variants of value clamping with the MIN and MAX
macros with the CLAMP macro.
Signed-off-by: Trond Einar Snekvik <Trond.Einar.Snekvik@nordicsemi.no>
Zephyr SMP kernels need to be able to run on architectures with
incoherent caches. Naive implementation of synchronization on such
architectures requires extensive cache flushing (e.g. flush+invalidate
everything on every spin lock operation, flush on every unlock!) and
is a performance problem.
Instead, many of these systems will have access to separate "coherent"
(usually uncached) and "incoherent" regions of memory. Where this is
available, place all writable data sections by default into the
coherent region. An "__incoherent" attribute flag is defined for data
regions that are known to be CPU-local and which should use the cache.
By default, this is used for stack memory.
Stack memory will be incoherent by default, as by definition it is
local to its current thread. This requires special cache management
on context switch, so an arch API has been added for that.
Also, when enabled, add assertions to strategic places to ensure that
shared kernel data is indeed coherent. We check thread objects, the
_kernel struct, waitq's, timeouts and spinlocks. In practice almost
all kernel synchronization is built on top of these structures, and
any shared data structs will contain at least one of them.
Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
Signed-off-by: Anas Nashif <anas.nashif@intel.com>
API that takes _timeout structures but doesn't change data in them is
updated to const-qualify the underlying object, allowing information
to be retrieved from contexts where the containing object is
immutable.
Signed-off-by: Peter A. Bigot <pab@pabigot.com>
Both operands of an operator in the arithmetic conversions
performed shall have the same essential type category.
Changes are related to converting the integer constants to the
unsigned integer constants
Signed-off-by: Aastha Grover <aastha.grover@intel.com>
When to->dticks is an int64_t it may happen that the calculated
remaining time is a value that cannot be exactly represented in the
destination int32_t, producing an implementation-defined result which
can include a signal (interrupt). Cap the maximum delay to the
largest value suported by the int32_t result.
Signed-off-by: Peter Bigot <peter.bigot@nordicsemi.no>
The dticks field was changed from a signed to an unsigned
value so now the assert test to ensure it isn't negative
is no longer needed.
fixes#26355
Signed-off-by: David Leach <david.leach@nxp.com>
MISRA-C Rule 5.3 states that identifiers in inner scope should
not hide identifiers in outer scope.
In the function z_set_timeout_expiry(), the parameter "idle"
and an inner variable "next" collide with function named idle()
and next(). So rename those variables.
Signed-off-by: Daniel Leung <daniel.leung@intel.com>
The "forever" token has always been interpreted above z_add_timeout()
(because it's always taken ticks, but K_FOREVER used to be in ms).
But it was discovered that k_delayed_work_submit_to_queue() was never
testing for this and passing a raw K_FOREVER down, where it got
interpreted as a negative timeout and caused it to fire at the next
tick.
Now that we actually see the original k_timeout_t here, we might as
well check it locally and do the correct thing (that is, nothing) if
asked to schedule a timeout that will never fire.
Fixes#24409
Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
Add a call to get the system tick count as an official API (and
redefine the existing millisecond API in terms of it). Sophisticated
applications need to be able to count ticks directly, and the newer
timeout API supports that. Uptime should too, for symmetry.
Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
Add tick-based (i.e. precision resistant) inspection APIs for kernel
timeouts visible via k_timer, k_delayed work and thread timeouts
(i.e. pended/sleeping threads). These are each available in
"remaining" and "expires" variants returning time values relative to
current time and system start. All have system calls where applicable
(i.e. everywhere but k_delayed_work, which is not a userspace API)
The pre-existing millisecond "remaining_get()" predicates for timer
and delayed work remain, but are expressed in terms of the newer
calls.
Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
Add support for "absolute" timeouts, which are expressed relative to
system uptime instead of deltas from current time. These allow for
more race-resistant code to be written by allowing application code to
do a single timeout computation, once, and then reuse the timeout
value even if the thread wakes up and needs to suspend again later.
Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
Add a CONFIG_TIMEOUT_64BIT kconfig that, when selected, makes the
k_ticks_t used in timeout computations pervasively 64 bit. This will
allow much longer timeouts and much faster (i.e. more precise) tick
rates. It also enables the use of absolute (not delta) timeouts in an
upcoming commit.
Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
Add a k_timeout_t type, and use it everywhere that kernel API
functions were accepting a millisecond timeout argument. Instead of
forcing milliseconds everywhere (which are often not integrally
representable as system ticks), do the conversion to ticks at the
point where the timeout is created. This avoids an extra unit
conversion in some application code, and allows us to express the
timeout in units other than milliseconds to achieve greater precision.
The existing K_MSEC() et. al. macros now return initializers for a
k_timeout_t.
The K_NO_WAIT and K_FOREVER constants have now become k_timeout_t
values, which means they cannot be operated on as integers.
Applications which have their own APIs that need to inspect these
vs. user-provided timeouts can now use a K_TIMEOUT_EQ() predicate to
test for equality.
Timer drivers, which receive an integer tick count in ther
z_clock_set_timeout() functions, now use the integer-valued
K_TICKS_FOREVER constant instead of K_FOREVER.
For the initial release, to preserve source compatibility, a
CONFIG_LEGACY_TIMEOUT_API kconfig is provided. When true, the
k_timeout_t will remain a compatible 32 bit value that will work with
any legacy Zephyr application.
Some subsystems present timeout (or timeout-like) values to their own
users as APIs that would re-use the kernel's own constants and
conventions. These will require some minor design work to adapt to
the new scheme (in most cases just using k_timeout_t directly in their
own API), and they have not been changed in this patch, instead
selecting CONFIG_LEGACY_TIMEOUT_API via kconfig. These subsystems
include: CAN Bus, the Microbit display driver, I2S, LoRa modem
drivers, the UART Async API, Video hardware drivers, the console
subsystem, and the network buffer abstraction.
k_sleep() now takes a k_timeout_t argument, with a k_msleep() variant
provided that works identically to the original API.
Most of the changes here are just type/configuration management and
documentation, but there are logic changes in mempool, where a loop
that used a timeout numerically has been reworked using a new
z_timeout_end_calc() predicate. Also in queue.c, a (when POLL was
enabled) a similar loop was needlessly used to try to retry the
k_poll() call after a spurious failure. But k_poll() does not fail
spuriously, so the loop was removed.
Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
Mark the old time conversion APIs deprecated, leave compatibility
macros in place, and replace all usage with the new API.
Signed-off-by: Andy Ross <andrew.j.ross@intel.com>