There is a struct and a macro called _ready_q, this is error
prone. Just removing it.
MISRA-C rule 5.4
Signed-off-by: Flavio Ceolin <flavio.ceolin@intel.com>
In _pend_current_thread the argument key is always a unsigned
interger type and this function forces it to become a signed
interger. This is a dangerous behavior and cant be trusted to
work as expected.
Signed-off-by: Adithya Baglody <adithya.nagaraj.baglody@intel.com>
This API shouldn't take a int type but instead it should take
u32_t. This argument has to be similar to irq_lock() and
irq_unlock().
Signed-off-by: Adithya Baglody <adithya.nagaraj.baglody@intel.com>
Now that the API has been fixed up, replace the existing timeout queue
with a much smaller version. The basic algorithm is unchanged:
timeouts are stored in a sorted dlist with each node nolding a delta
time from the previous node in the list; the announce call just walks
this list pulling off the heads as needed. Advantages:
* Properly spinlocked and SMP-aware. The earlier timer implementation
relied on only CPU 0 doing timeout work, and on an irq_lock() being
taken before entry (something that was violated in a few spots).
Now any CPU can wake up for an event (or all of them) and everything
works correctly.
* The *_thread_timeout() API is now expressible as a clean wrapping
(just one liners) around the lower-level interface based on function
pointer callbacks. As a result the timeout objects no longer need
to store backpointers to the thread and wait_q and have shrunk by
33%.
* MUCH smaller, to the tune of hundreds of lines of code removed.
* Future proof, in that all operations on the queue are now fronted by
just two entry points (_add_timeout() and z_clock_announce()) which
can easily be augmented with fancier data structures.
Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
Instead of checking every time we hit the low-level context switch
path to see if the new thread has a "partner" with which it needs to
share time, just run the slice timer always and reset it from the
scheduler at the points where it has already decided a switch needs to
happen. In TICKLESS_KERNEL situations, we pay the cost of extra timer
interrupts at ~10Hz or whatever, which is low (note also that this
kind of regular wakeup architecture is required on SMP anyway so the
scheduler can "notice" threads scheduled by other CPUs). Advantages:
1. Much simpler logic. Significantly smaller code. No variance or
dependence on tickless modes or timer driver (beyond setting a
simple timeout).
2. No arch-specific assembly integration with _Swap() needed
3. Better performance on many workloads, as the accounting now happens
at most once per timer interrupt (~5 Hz) and true rescheduling and
not on every unrelated context switch and interrupt return.
4. It's SMP-safe. The previous scheme kept the slice ticks as a
global variable, which was an unnoticed bug.
Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
Change APIs that essentially return a boolean expression - 0 for
false and 1 for true - to return a bool.
MISRA-C rule 14.4
Signed-off-by: Flavio Ceolin <flavio.ceolin@intel.com>
Add ifdef guard to the z_reset_timeslice() to fix compilation
errors when CONFIG_TIMESLICING is disabled.
Signed-off-by: Daniel Leung <daniel.leung@intel.com>
Any word started with underscore followed by and uppercase letter or a
second underscore is a reserved word according with C99.
With have *many* violations on Zephyr's code, this commit is tackling
only the violations caused by headers guards. It also takes the
opportunity to normalize them using the filename in uppercase and
replacing dot with underscore. e.g file.h -> FILE_H
Signed-off-by: Flavio Ceolin <flavio.ceolin@intel.com>
When adding a new runnable thread in tickless mode, we need to detect
whether it will timeslice with the running thread and reset the timer,
otherwise it won't get any CPU time until the next interrupt fires at
some indeterminate time in the future.
This fixes the specific bug discussed in #7193, but the broader
problem of tickless and timeslicing interacting badly remains. The
code as it exists needs some rework to avoid all the #ifdef mess.
Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
When adding a new runnable thread in tickless mode, we need to detect
whether it will timeslice with the runnable thread and reset the
timer, otherwise it won't get any CPU time until the next interrupt
fires at some indeterminate time in the future.
This fixes the specific bug discussed in #7193, but the broader
problem of tickless and timeslicing interacting badly remains. The
code as it exists needs some rework to avoid all the #ifdef mess.
Note that the patch also moves _ready_thread() from a ksched.h inline
to sched.c.
Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
Move to more generic tracing hooks that can be implemented in different
ways and do not interfere with the kernel.
Signed-off-by: Anas Nashif <anas.nashif@intel.com>
Define generic interface and hooks for tracing to replace
kernel_event_logger and existing tracing facilities with something more
common.
Signed-off-by: Anas Nashif <anas.nashif@intel.com>
The _THREAD_POLLING bit in thread_state was never actually a
legitimate thread "state". It is a clever synchronization trick
introduced to allow the thread to release the irq_lock while looping
over the input event array without dropping events.
Instead, make that flag a word in the "poller" struct that lives on
the stack of the thread calling k_poll. The disadvantage is the 4
bytes of thread space needed. Advantages:
+ Cleaner API, it's now internal to poll instead of being globally
visible.
+ The thread_state bit space is just one byte, and was almost full
already.
+ Smaller code to write/test a full word and not a bitfield
+ Words are atomic, so no need for one of irq lock/unlock pairs.
Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
Very simple implementation of deadline scheduling. Works by storing a
single word in each thread containing a deadline, setting it (as a
delta from "now") via a single new API call, and using it as extra
input to the existing thread priority comparison function when
priorities are equal.
Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
This replaces the existing scheduler (but not priority handling)
implementation with a somewhat simpler one. Behavior as to thread
selection does not change. New features:
+ Unifies SMP and uniprocessing selection code (with the sole
exception of the "cache" trick not being possible in SMP).
+ The old static multi-queue implementation is gone and has been
replaced with a build-time choice of either a "dumb" list
implementation (faster and significantly smaller for apps with only
a few threads) or a balanced tree queue which scales well to
arbitrary numbers of threads and priority levels. This is
controlled via the CONFIG_SCHED_DUMB kconfig variable.
+ The balanced tree implementation is usable symmetrically for the
wait_q abstraction, fixing a scalability glitch Zephyr had when many
threads were waiting on a single object. This can be selected via
CONFIG_WAITQ_FAST.
Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
There were multiple spots where code was using the _wait_q_t
abstraction as a synonym for a dlist and doing direct list management
on them with the dlist APIs. Refactor _wait_q_t into a proper opaque
struct (not a typedef for sys_dlist_t) and write a simple wrapper API
for the existing usages. Now replacement of wait_q with a different
data structure is much cleaner.
Note that there were some SYS_DLIST_FOR_EACH_SAFE loops in mailbox.c
that got replaced by the normal/non-safe macro. While these loops do
mutate the list in the code body, they always do an early return in
those circumstances instead of returning into the macro'd for() loop,
so the _SAFE usage was needless.
Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
Refactoring. Mempool wants to unpend all threads at once. It's
cleaner to do this in the scheduler instead of the IPC code.
Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
There was a ton of junk in this header. Pare it down to just the
stuff actually used by code outside of sched.c, move the needed
internal stuff into sched.c itself, and drop everything else.
Note that (other than the tiny inlines that remain here in the header)
the scheduler interface exposed to the rest of the system is now
composed of just 12 functions.
Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
The POSIX layer had a simple ready_one_thread() utility. Move this to
the scheduler API (with a prepended underscore -- it's an internal
API) so that it can be synchronized along with the rest of the
scheduler.
Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
Almost everywhere this was called, it was immediately followed by
_abort_thread_timeout(), for obvious reasons. The only exceptions
were in timeout and k_timer expiration (unifying these two would be
another good cleanup), which are peripheral parts of the scheduler and
can plausibly use a more "internal" API.
So make the common case the default, and expose the old behavior as
_unpend_thread_no_timeout(). (Along with identical changes for
_unpend_first_thread) Saves code bytes and simplifies scheduler
surface area for future synchronization work.
Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
Now that other work has eliminated the two cases where we had to do a
reschedule "but yield even if we are cooperative", we can squash both
down to a single _reschedule() function which does almost exactly what
legacy _Swap() did, but wrapped as a proper scheduler API.
Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
Everywhere the current thread is pended, the code is going to have to
do a _Swap() soon afterward, yet the scheduler API exposed these as
separate steps. Unify this pattern everywhere it appears, which saves
some code bytes and gets _Swap() out of the general scheduler API at
zero cost.
Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
There was a somewhat promiscuous pattern in the kernel where IPC
mechanisms would do something that might effect the current thread
choice, then check _must_switch_threads() (or occasionally
__must_switch_threads -- don't ask, the distinction is being replaced
by real English words), sometimes _is_in_isr() (but not always, even
in contexts where that looks like it would be a mistake), and then
call _Swap() if everything is OK, otherwise releasing the irq_lock().
Sometimes this was done directly, sometimes via the inverted test,
sometimes (poll, heh) by doing the test when the thread state was
modified and then needlessly passing the result up the call stack to
the point of the _Swap().
And some places were just calling _reschedule_threads(), which did all
this already.
Unify all this madness. The old _reschedule_threads() function has
split into two variants: _reschedule_yield() and
_reschedule_noyield(). The latter is the "normal" one that respects
the cooperative priority of the current thread (i.e. it won't switch
out even if there is a higher priority thread ready -- the current
thread has to pend itself first), the former is used in the handful of
places where code was doing a swap unconditionally, just to preserve
precise behavior across the refactor. I'm not at all convinced it
should exist...
Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
A priority value cannot be simultaneously higher than the maximum
possible value and smaller than the minimum value. Rewrite the
_VALID_PRIO() macro as a function so that this if either of these
invariants are invalid, the priority is considered invalid.
Coverity-CID: 182584
Coverity-CID: 182585
Signed-off-by: Leandro Pereira <leandro.pereira@intel.com>
The result of left shifting a bit into the sign-bit is undefined
behavior. This makes the offending shift operation unsigned.
Signed-off-by: Kristian Klomsten Skordal <kristian.skordal@nordicsemi.no>
The scheduler exposed two APIs to do the same thing:
_add_thread_to_ready_q() was a low level primitive that in most cases
was wrapped by _ready_thread(), which also (1) checks that the thread
_is_ready() or exits, (2) flags the thread as "started" to handle the
case of a thread running for the first time out of a waitq timeout,
and (3) signals a logger event.
As it turns out, all existing usage was already checking case #1.
Case #2 can be better handled in the timeout resume path instead of on
every call. And case #3 was probably wrong to have been skipping
anyway (there were paths that could make a thread runnable without
logging).
Now _add_thread_to_ready_q() is an internal scheduler API, as it
probably always should have been.
This also moves some asserts from the inline _ready_thread() wrapper
to the underlying true function for code size reasons, otherwise the
extra use of the inline added by this patch blows past code size
limits on Quark D2000.
Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
The xtensa asm2 layer had a function to select the next switch handle
to return into following an exception. There is no arch-specific code
there, it's just scheduler logic. Move it to the scheduler where it
belongs.
Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
The scheduler needs a few tweaks to work in SMP mode:
1. The "cache" field just doesn't work. With more than one CPU,
caching the highest priority thread isn't useful as you may need N
of them at any given time before another thread is returned to the
scheduler. You could recalculate it at every change, but that
provides no performance benefit. Remove.
2. The "bitmask" designed to prevent the need to individually check
priorities is likewise dropped. This could work, but in fact on
our only current SMP system and with current K_NUM_PRIOPRITIES
values it provides no real benefit.
3. The individual threads now have a "current cpu" and "active" flag
so that the choice of the next thread to run can correctly skip
threads that are active on other CPUs.
The upshot is that a decent amount of code gets #if'd out, and the new
SMP implementations for _get_highest_ready_prio() and
_get_next_ready_thread() are simpler and smaller, at the expense of
having to drop older optimizations.
Note that scheduler synchronization is unchanged: all scheduler APIs
used to require that an irq_lock() be held, which means that they now
require the global spinlock via the same API. This should be a very
early candidate for lock granularity attention!
Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
Having two implementations of the same thing is bad,
especially when one can just call the other inline version.
Signed-off-by: Andrew Boie <andrew.p.boie@intel.com>
This will allow these thread objects to be re-used.
_mark_thread_as_dead() removed, it was only being called in one
place.
Signed-off-by: Andrew Boie <andrew.p.boie@intel.com>
One of the stack sentinel policies was to check the sentinel
any time a cooperative context switch is done (i.e, _Swap is
called).
This was done by adding a hook to _check_stack_sentinel in
every arch's __swap function.
This way is cleaner as we just have the hook in one inline
function rather than implemented in several different assembly
dialects.
The check upon interrupt is now made unconditionally rather
than checking if we are calling __swap, since the check now
is only called on cooperative _Swap(). The interrupt is always
serviced first.
Issue: ZEP-2244
Signed-off-by: Andrew Boie <andrew.p.boie@intel.com>
This places a sentinel value at the lowest 4 bytes of a stack
memory region and checks it at various intervals, including when
servicing interrupts or context switching.
This is implemented on all arches except ARC, which supports stack
bounds checking directly in hardware.
Signed-off-by: Andrew Boie <andrew.p.boie@intel.com>
Adds event based scheduling logic to the kernel. Updates
management of timeouts, timers, idling etc. based on
time tracked at events rather than periodic ticks. Provides
interfaces for timers to announce and get next timer expiry
based on kernel scheduling decisions involving time slicing
of threads, timeouts and idling. Uses wall time units instead
of ticks in all scheduling activities.
The implementation involves changes in the following areas
1. Management of time in wall units like ms/us instead of ticks
The existing implementation already had an option to configure
number of ticks in a second. The new implementation builds on
top of that feature and provides option to set the size of the
scheduling granurality to mili seconds or micro seconds. This
allows most of the current implementation to be reused. Due to
this re-use and co-existence with tick based kernel, the names
of variables may contain the word "tick". However, in the
tickless kernel implementation, it represents the currently
configured time unit, which would be be mili seconds or
micro seconds. The APIs that take time as a parameter are not
impacted and they continue to pass time in mili seconds.
2. Timers would not be programmed in periodic mode
generating ticks. Instead they would be programmed in one
shot mode to generate events at the time the kernel scheduler
needs to gain control for its scheduling activities like
timers, timeouts, time slicing, idling etc.
3. The scheduler provides interfaces that the timer drivers
use to announce elapsed time and get the next time the scheduler
needs a timer event. It is possible that the scheduler may not
need another timer event, in which case the system would wait
for a non-timer event to wake it up if it is idling.
4. New APIs are defined to be implemented by timer drivers. Also
they need to handler timer events differently. These changes
have been done in the HPET timer driver. In future other timers
that support tickles kernel should implement these APIs as well.
These APIs are to re-program the timer, update and announce
elapsed time.
5. Philosopher and timer_api applications have been enabled to
test tickless kernel. Separate configuration files are created
which define the necessary CONFIG flags. Run these apps using
following command
make pristine && make BOARD=qemu_x86 CONF_FILE=prj_tickless.conf qemu
Jira: ZEP-339 ZEP-1946 ZEP-948
Change-Id: I7d950c31bf1ff929a9066fad42c2f0559a2e5983
Signed-off-by: Ramesh Thomas <ramesh.thomas@intel.com>
This adds a new event type to the kernel event logger that tracks
thread-related events: being added to the ready queue, pending a
thread, and exiting a thread.
It's the only event type that contains "subevents" and thus has a
non-void parameter in their respective _sys_k_event_logger_*()
function. Luckily, as isn't the case with other events (such as IRQs
and thread switching), these functions are called from
platform-agnostic places, so there's no need to worry about changing
the assembly guts.
This is the first patch in a series adding support for better real-time
profiling of Zephyr applications.
Jira: ZEP-1463
Change-Id: I6d63607ba347f7a9cac3d016fef8f5a0a830e267
Signed-off-by: Leandro Pereira <leandro.pereira@intel.com>
Convert code to use u{8,16,32,64}_t and s{8,16,32,64}_t instead of C99
integer types. This handles the remaining includes and kernel, plus
touching up various points that we skipped because of include
dependancies. We also convert the PRI printf formatters in the arch
code over to normal formatters.
Jira: ZEP-2051
Change-Id: Iecbb12601a3ee4ea936fd7ddea37788a645b08b0
Signed-off-by: Kumar Gala <kumar.gala@linaro.org>
This reverts commit 7b9dc107a8.
We revert this as we intent to move away from {u}int{8,16,32,64}_t types
to our own internal types for sized variables so we shouldn't need the
PRI macros anymore.
Change-Id: I1d9d797fee47ca266867ae65656c150f8fe2adb2
Signed-off-by: Kumar Gala <kumar.gala@linaro.org>
To allow for various libc implementations (like newlib) in which the way
various {u}int{8,16,32}_t types are defined vary between both libc
implementations and across architectures we need to utilize the PRI
defines.
Change-Id: Ie884fb67015502288152ecbd64c37961a4f538e4
Signed-off-by: Kumar Gala <kumar.gala@linaro.org>
Modify _get_first_thread_to_unpend() so that it does not remove the
thread from the wait queue. Rename it to _find_first_thread_to_unpend()
to match the new behaviour.
This will be needed to fix a semaphore group bug.
Change-Id: I1b7531c3beecf3b6a86ecf88a93a02449edd0767
Signed-off-by: Benjamin Walsh <walsh.benj@gmail.com>
Rather than explicitely checking the thread state bit.
Change-Id: Ic78427d9847e627a0e91d0147d3b6164450597f6
Signed-off-by: Benjamin Walsh <walsh.benj@gmail.com>
This has not bitten us yet, but it was a ticking timebomb.
This is similar to the issue that was found with irq_lock/irq_unlock
implementations on several architectures. Having a volatile variable is
not the way to force the sched_lock variable to be
incremented/decremented around the accesses to data it protects.
Instead, a compiler barrier must prevent the compiler from reordering
the memory accesses around setting of sched_lock. Needed in the inline
implementations _sched_lock()/_sched_unlock_no_reschedule(), which
resolve to simple decrement/increment of the per-thread sched_lock
variable.
Change-Id: I06f5b3524889f193efe69caa947118404b1be0b5
Signed-off-by: Benjamin Walsh <walsh.benj@gmail.com>
They are not part of the API, so rename from K_<state> to
_THREAD_<state>.
Change-Id: Iaebb7d3083b80b9769bee5616e0f96ed2abc5c56
Signed-off-by: Benjamin Walsh <walsh.benj@gmail.com>