Minor style (syntax) fix in the help text of symbol
config EXECUTION_BENCHMARKING.
Signed-off-by: Ioannis Glaropoulos <Ioannis.Glaropoulos@nordicsemi.no>
This commit changes the names of SYS_POWER_DEEP_SLEEP* Kconfig
options in order to match SYS_POWER_LOW_POWER_STATE* naming
scheme.
Signed-off-by: Piotr Zięcik <piotr.ziecik@nordicsemi.no>
The SYS_POWER_LOW_POWER_STATE_SUPPORTED and SYS_POWER_LOW_POWER_STATE
suggests one low power state but these options control multiple
low power state. This commit uses plural in the names to indicate
that.
Signed-off-by: Piotr Zięcik <piotr.ziecik@nordicsemi.no>
This API doesn't use the normal thread priority comparison itself, so
doesn't get the magic that thread_base.prio provides. If called when
another thread should be run, this would preempt the current thread
always, even if the scheduler lock was taken.
That was benign until recent spinlockifiation exposed it: a mutex in
the philosophers test run in preempt_only mode would swap away while
holding a spinlock (which used to work with irq locks) and fail later
with a "recursive" spinlock assert.
Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
This port is a little different. Most subsystem synchronization uses
simple critical sections that can be replaced with global or
per-object spinlocks. But the userspace code was heavily exploiting
the fact that irq_lock was recursive and could be taken at any time.
So outer functions were doing locking and then calling into inner
helpers that would take their own lock (because they were called from
other contexts that did not lock).
Rather than try to rework this right now, this just creates a set of
spinlocks corresponding to the recursive states in which they are
taken, to preserve the existing semantics exactly.
Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
Simple global lock around the timer API. Actually a lot of this usage
was using needless vestigial locking around existing scheduler and
timeout APIs that are now internally synchronized.
Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
One spinlock per pipe object. Also removed some vestigial locking
around _ready_thread(). That call is internally synchronized now.
Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
The k_sleep() locking was actually to protect the _current state from
preemption before the context switch, so document that and replace
with a spinlock. Should probably unify this with the rather cleaner
logic in pend_curr(), but right now "sleeping" and "pended" are
needlessly distinct states.
And we can remove the locking entirely from k_wakeup(). There's no
reason for any of that to need to be synchronized. Even if we're
racing with other thread modifiations, the state on exit will be a
runnable thread without a timeout, or whatever timeout/pend state the
other side was requesting (i.e. it's a bug, but not one solved by
synhronization).
Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
Use a subsystem lock, not a per-object lock. Really we want to lock
at mutex granularity where possible, but (1) that has non-trivial
memory overhead vs. e.g. directly spinning on the mutex state and (2)
the locking in a few places was originally designed to protect access
to the mutex *owner* priority, which is not 1:1 with a single mutex.
Basically the priority-inheriting mutex code will need some rework
before it works as a fine-grained locking abstraction in SMP.
Note that this fixes an invisible bug: with the older code,
k_mutex_unlock() would actually call irq_unlock() twice along the path
where there was a new owner, which is benign on existing architectures
(so long as the key argument is unchanged) but was never guaranteed to
work. With a spinlock, unlocking an unlocked/unowned lock is a
detectable assertion condition.
Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
Straightforward port. Each struct k_queue object gets a spinlock to
control obvious data ownership.
Note that this port actually discovered a preexisting bug: the -ENOMEM
case in queue_insert() was failing to release the lock. But because
the tests that hit that path didn't rely on other threads being
scheduled, they ran to successful completion even with interrupts
disabled. The spinlock API detects that as a recursive lock when
asserts are enabled.
Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
The two APIs protected by this lock are themselves internally
synchronized. Replace the irq_lock with a spinlock anyway, because
what I think it's doing is trying to prevent a race where something
else like an ISR or something it wakes up mucks with the thread before
this completes. Seems fragile on SMP as it stands, but this preserves
behavior on uniprocessor architectures.
Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
Straightforward spinlock around the global thread state. Two changes
to the locking strategy were needed:
1. There was a needless recursive lock taken in schedule_new_thread().
This is only ever invoked in circumstances where the lock was already
held, or where there is no need for internal synchronization.
2. The recursive irq_lock() around the loop that spawns the initial
static threads (which happens at the start of main thread execution)
was removed. Most of the job (i.e. making sure the threads don't run
before the loop is finished) was already duplicated by the sched_lock
it was already taking, and the attempt to promise that all the
timeouts happen on the same tick is already true by construction at
system startup on uniprocessor systems, and not possible to guarantee
at all under SMP (where other CPUs can take that timer interrupt). We
don't document or test for this feature, so don't try to be fancy.
Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
Really the locking in this file is vestigial. It only exists because
the scheduler's _unpend_all() call to wake up everyone waiting on a
wait_q is unsynchronized, because it was written to assume
irq_lock-style-locking. It would be cleaner to put that locking into
the wait_q itself and/or use the scheduler's subsystem lock. But it's
not clear there's any performance benefit, so let's stick with the
more easily verifiable change first.
Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
Poll gets a single subsystem lock for now. The existing locking in
Ben's code is subtle, being used both for latency control and for
critical section protection. So getting each k_poll_event to use a
separate lock will require care and a little logic change. Do the
simple version for now, which still works to decouple it from the
global lock.
Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
These functions, for good design reason, take a locking key to
atomically release along with the context swtich. But there's still a
common pattern in code to do a switch unconditionally by passing
irq_lock() directly. On SMP that's a little hurtful as it spams the
global lock. Provide an _unlocked() variant for
_Swap/_reschedule/_pend_curr for simplicity and efficiency.
Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
Switch semaphores to use a subsystem spinlock instead of the system
irqlock.
Note that this is only "half way there". Semaphores will no longer
contend with other irqlock users on SMP systems, but all semaphores
are still sharing the same lock. Really we want semaphores to be
independently synchronized, but adding 4 bytes to every one (there are
a LOT of these things) for a separate spinlock is too much to pay.
Rather, a proper SMP-aware implementation would spin on the count
variable directly. But let's not rock that boat quite yet.
Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
Just like with _Swap(), we need two variants of these utilities which
can atomically release a lock and context switch. The naming shifts
(for byte count reasons) to _reschedule/_pend_curr, and both have an
_irqlock variant which takes the traditional locking.
Just refactoring. No logic changes.
Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
Mostly useless patch. All architectures have their own code for
atomic operations and don't use this fallback. Still, it's a trivial
locking setup and we might as well.
Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
Each work_q object gets a separate spinlock to synchronize access
instead of the global lock. Note that there was a recursive lock
condition in k_delayed_work_cancel(), so that's been split out into an
internal unlocked version and the API entry point that wraps it with a
lock.
Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
The validation checking recently added to spinlocks is useful, but
requires kernel-internals like _current and _current_cpu in a header
context that tends to be needed before those are declared (or where we
don't want them declared), and is causing big header dependency
headaches.
Move it to C code, it's just a validation tool, not a performance
thing.
Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
We want a _Swap() variant that can atomically release/restore a
spinlock state in addition to the legacy irqlock. The function as it
was is now named "_Swap_irqlock()", while _Swap() now refers to a
spinlock and takes two arguments. The former will be going away once
existing users (not that many! Swap() is an internal API, and the
long port away from legacy irqlocking is going to be happening mostly
in drivers) are ported to spinlocks.
Obviously on uniprocessor setups, these produce identical code. But
SMP requires that the correct API be used to maintain the global lock.
Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
These two spots were duplicating logic that is already done inside
_reschedule(), which is the cleaner, less dangerous API. Use it where
possible when outside the scheduler internals.
Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
Since we know do DTS before Kconfig we should try and remove dts from
creating Kconfig namespaced symbols and leave that to Kconfig. So
rename CONFIG_CCM_<FOO> to DT_CCM_<FOO>.
Signed-off-by: Kumar Gala <kumar.gala@linaro.org>
The power management framework used two different abstractions
to describe power states. The SYS_PM_* given coarse information
what kind of power state (low power or deep sleep) was used,
while the SYS_POWER_STATE_* abstraction provided information
about particular power mode.
This commit removes the SYS_PM_* abstraction as the same
information is already carried in SYS_POWER_STATE_*.
Signed-off-by: Piotr Zięcik <piotr.ziecik@nordicsemi.no>
This was never a long-term solution, more of a gross hack
to get test cases working until we could figure out a good
end-to-end solution for memory domains that generated
appropriate linker sections. Now that we have this with
the app shared memory feature, and have converted all tests
to remove it, delete this feature.
To date all userspace APIs have been tagged as 'experimental'
which sidesteps deprecation policies.
Signed-off-by: Andrew Boie <andrew.p.boie@intel.com>
* Newlib now defines a special z_newlib_partition containing
all globals relevant to newlib. Most of these are in libc.a
with a heap tracking variable in newlib's hooks.
* Both C libraries now expose a k_mem_partition containing the
bounds of the malloc heap arena. Threads that want to use
libc malloc() will need to add this to their memory domain.
* z_newlib_get_heap_bounds has been removed, in favor of the
memory partition for the heap arena
* ztest now includes the C library partitions in its memory
domain.
* The mem_alloc test now runs in user mode to prove that this
all works for both C libraries.
Signed-off-by: Andrew Boie <andrew.p.boie@intel.com>
Dynamic kernel objects enforce that the permission state
of an object is also a reference count; using a kernel
object without permission regardless of caller privilege
level is a programming bug.
However, this is not the case for static objects. In
particular, supervisor threads are allowed to use any
object they like without worrying about permissions, and
the logic here was causing cleanup functions to be called
over and over again on kernel objects that were actually
in use.
The automatic cleanup mechanism was intended for
dynamic objects anyway, so just skip it entirely for
static objects.
Signed-off-by: Andrew Boie <andrew.p.boie@intel.com>
This patch puts checks in place to ensure that callers to the k_mem_slab
APIs provide word aligned block sizes. If this is not done, this can
result in unaligned accesses and subsequent crashes.
Signed-off-by: Andy Gross <andy.gross@linaro.org>
This commit extends the implementation of sane_partition(..) in
kernel/mem_domain.c so that it generates an ASSERT if partitions
inside a mem_domain overlap. This extension is only implemented
for the case when the MPU requires non-overlapping regions.
Signed-off-by: Ioannis Glaropoulos <Ioannis.Glaropoulos@nordicsemi.no>
This adds a simple implementation of SMP CPU affinity to Zephyr. The
API is simple and doesn't try to invent abstractions like "cpu sets".
Each thread has an enable/disable flag associated with each CPU in the
system, and the bits can be turned on and off (for threads that are
not currently runnable, of course) using an easy three-function API.
Because the implementation picked requires enumerating runnable
threads in priority order looking for one that match the current CPU,
this is not a good fit for the SCALABLE or MULTIQ scheduler backends,
so it currently can be enabled only for SCHED_DUMB (which is the
default anyway). Fancier algorithms do exist, but even the best of
them scale as O(N_CPUS), so aren't quite constant time and often
require significant memory overhead to keep separate lists for
different cpus/sets.
The intended use here is for apps that want to "pin" threads to
specific CPUs for latency control, or conversely to prevent certain
threads from taking time on specific CPUs to leave them free for fast
response.
Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
When under SMP, _current is a macro that indirects to a CPU-specific
address, and that trick won't work until kernel_arch_init() has
returned.
Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
Idle threads must (for obvious reasons!) always be preemptible from
the perspective of the scheduler. But when preemptive scheduling is
disabled, they are given a priority of -1, which is the lowest
COOPERATIVE priority. So the scheduler preemption logic needed an
extra test for this case and couldn't just rely on the existing
priority comparison. This was a measurable performance loss, as this
is a hot path on existing benchmarks.
Limit that test to circumstances (!CONFIG_PREEMPT_ENABLED) where it's
actually needed.
Longer term it would be better to just force the existence of one
"preemptible" thread priority always, but right now the number of
priorities and the state of the PREEMPT_ENABLED kconfig flag are
linked, and the existing interrupt return code (with no preemption,
you know with certainty which thread you are returning to and can skip
some work) on some platforms fails when I try this.
Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
For historical reasons, some architectures had a valid _current thread
pointer at initialization time and others didn't. So the scheduler
logic had a test that checks _current vs. NULL every time it needed to
check premption, when this was only a workaround for initialization
state.
Fix things so that there is a dummy thread always (and clean up the
code to do a struct assignment instead of a memset of bare memory),
and we can remove that test from the scheduler hot path.
Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
GCC 6.2.0 is making frustratingly poor inlining decisions with some of
these routines, resulting in an awful lot of runtime calls for code
that is only ever expanded once or twice within the file.
Treat with targetted ALWAYS_INLINE's to force the issue. The
scheduler code is a hot path.
Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
The sys_dlist_insert_*() functions had a behavior where a NULL
argument for the insertion position to sys_dlist_insert_after/before()
was interpreted as "the end of the list". We never used that
convention (except in one spot internal to dlist.h which was not
itself used anywhere), and of course already have an API for appending
and prepending to a list.
In practice this was a performance disaster. The NULL check is
virtually never provable statically by the compiler, so that test and
branch is present always. And worse, the check and call to another
function was pushing this beyond the complexity limit for gcc to
inline a function (at -Os optimization anyway), forcing us to use
function calls for what should be a ~8 instruction sequence. The
upshot is that dlist insertions were 2-3x slower than they needed to
be.
Deprecate these older APIs and introduce a new sys_dlist_insert() call
which can be much better optimized.
Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
The fix in commit e664c78b82 ("kernel/timeout: Fix recursive
spinlock in z_set_timeout_expiry()") missed a spot that had also been
introduced with recent locking work. The new
_get_next_timeout_expiry() implementation takes its own lock, which is
recursive when called from z_clock_announce(). Fix by calling the
wrapped implementation instead.
Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
Whether a timeout is linked into the timeout queue can be determined
from the corresponding sys_dnode_t linked state. This removes the need
to use a special flag value in dticks to determine that the timeout is
inactive.
Update _abort_timeout to return an error code, rather than the flag
value, when the timeout to be aborted was not active.
Remove the _INACTIVE flag value, and replace its external uses with an
internal API function that checks whether a timeout is inactive.
Signed-off-by: Peter A. Bigot <pab@pabigot.com>
k_poll events are registered in a linked list when their signal
condition has been met. The code to clear event registration did not
account for events that were not registered, resulting in double-removes
that produced core dumps on native-posix sanitycheck.
Signed-off-by: Peter A. Bigot <pab@pabigot.com>
CONTAINER_OF() on a NULL pointer returns some offset around NULL and not
another NULL pointer. We have to check for that ourselves.
This only worked because the dnode happened to be at the start of the
struct.
Signed-off-by: Peter A. Bigot <pab@pabigot.com>
The help text has been stating that CONFIG_STACK_CANARIES will
silently be ignored when the compiler does not support them. But this
is not the desired behaviour of CONFIG_STACK_CANARIES[1].
This patch corrects the help text to state that an error will occur if
this feature is enabled, but not supported.
[1] "I would much rather see the build break if someone tries to
enable the stack canaries, and the compiler doesn't support
it. Because what happens now is that if someone enables this option,
and there is no support, the build will succeed but there are no
actual stack canaries in place, and unless the user is paying close
attention to the cmake test output they will have no idea."
--
https://github.com/zephyrproject-rtos/zephyr/issues/5019
Signed-off-by: Sebastian Bøe <sebastian.boe@nordicsemi.no>