There is an effort underway to make most of the Zephyr build script's
reentrant. Meaning, the build scripts can be executed multiple times
during the same CMake invocation.
Reentrancy enables several use-cases, the motivating one is the
ability to build several Zephyr executables, or images, for instance a
bootloader and an application.
For build scripts to be reentrant they cannot be directly referencing
global variables, like target names, but must instead reference
variables, which can vary from entry to entry.
Therefore, in this patch, we replace global targets with variables.
Signed-off-by: Sebastian Bøe <sebastian.boe@nordicsemi.no>
The z_set_timeout_expiry() function was added in part to simply the
locking strategy, but it missed a case where a function it was calling
was re-locking the same spinlock. It "works"[1] in uniprocessor
environments, but can be a deadlock in SMP.
Fix this by moving the meat of the function to an unlocked utility,
use that locally, and turn the entry point into one that does locking.
Actually this only gets called from idle now, which is a use case that
will go away when TICKLESS_IDLE is removed as a separate feature (once
you know all timeouts are set tickless, you don't need to set it from
the idle entry at all).
Discovered via lucky inspection.
[1] It doesn't work. It releases the lock prematurely at the end of
the inner block. But in practice this wasn't discovered.
Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
This patch provides support for generating Code coverage reports.
The prj.conf needs to enable CONFIG_COVERAGE. Once enabled, the
code coverage data dump now comes via UART.
This data dump on the UART is triggered once the main
thread exits.
Next step is to save this data dump on file. Then run
scripts/gen_gcov_files.py with the serial console log as argument.
The last step would be be to run the gcovr. Use the following cmd
gcovr -r . --html -o gcov_report/coverage.html --html-details
Currently supported architectures are ARM and x86.
Signed-off-by: Adithya Baglody <adithya.nagaraj.baglody@intel.com>
Timeslicing works by removing the _current thread from the run queue
and re-adding it at the end of its priority. On systems with a
_Swap() that can be preempted by a timer interrupt, that means it's
possible for the timeslice to try to slice out a thread that had
already pended itself!
This behavior used to be benign (or at least undetectable) as the
duplicated list operations were idempotent. But now the dlist code is
stricter about correctness and has exposed the bug -- it will blow up
if you try to remove an already-removed list node.
Fix (on affected platforms) by stashing the _current pointer in
_pend_current_thread() that is checked and cleared in the timer
interrupt. If we discover we're trying to interrupt a thread that's
already interrupted itself, we can safely exit z_time_slice() as a
noop. The timeslicing bookeeping was already done for us underneath
the pend code.
Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
This is a refactoring of the fix in commit 6c95dafd82 to limit its
application to affected platforms now that the root cause is
understood.
Note that the bug that fix was addressing was rare and seen only on
after multi-hour sessions on Michael Scott's test rig. So if
something regresses, this is where to look!
Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
On ARM, _Swap() isn't atomic and a hardware interrupt can land after
the (irq_locked) caller has entered _Swap() but before the context
switch actually happens. This will require some platform-specific
workarounds in a few places in the scheduler.
This commit is just the Kconfig and selection on ARM.
Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
The call to _arch_switch is a giant screaming sign inviting optimizer
bugs. The code that appears before is what happened long ago when we
were switched out, but the version that EXECUTED just now is actually
in a different thread. So the assignment to _current before the
switch actually assigned OUR thread (the "new_thread" of the old
context!) to _current.
But obviously the optimizer looks at that code and assumes that the
_current which got assigned to the thread we were switching to long
ago is still correct, and used it when retrieving the swap return
value.
Obviously the real bug here is that the _arch_switch() in question
lacked a memory clobber (and it's getting one).
But we can remove two lines, remove code from inside the interrupt
lock and make the implementation more robust by moving the read to
after the irq_unlock() (which generally also has a memory clobber).
Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
These files were using z_thread_malloc() without including
kernel_internal.h. On existing architectures that works due to
transitive includes, but x86_64 has a thinner include layer and
doesn't do it for us. Include the files required for the APIs we use.
Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
Commit 76b3518ce6 ("kernel: Make statements evaluate boolean
expressions") changed the type of is_polling in the struct _poller
from int to bool. In the conversion a "0" has been changed into "true"
instead of "false". Fix that.
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
This API was using variable number of arguments. Which is not
allowed according to misra c guidelines(Rule 17.1). Hence making
this API into a macro and using the util macro FOR_EACH_FIXED_ARG
to get the same functionality.
There is one deviation from the old function. The last argument
shouldn't be NULL.
Signed-off-by: Adithya Baglody <adithya.nagaraj.baglody@intel.com>
The logic in z_set_timeout_expiry() missed the case where the ticks
argument could be zero (or lower), which can happen naturally due to
timing/interrupt slop. In those circumstances, it would still try to
reset a timer that was "about to expire at the next tick", which would
run afoul of the drivers' internal decisions about how soon a timer
interrupt could be set, and then get pushed out to the next tick.
Explicitly detect this as an "imminent" predicate to make the logic
clearer.
Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
The recent change that added a locked z_set_timeout_expiry() API
obsoleted the subtle note about synchronization above
reset_time_slice(). None of that matters any more, the API is
synchronized internally in a conventional way.
Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
The use of dticks == INACTIVE to tell whether or not a timeout was
already in the list was insufficient. There is a time period between
the moment a timeout is removed from the list and the end of its
handler where it is not in the list, yet its list node pointers still
point into it. Doing things like aborting a thread while that is true
(which can be asynchronous too!) would corrupt the list even though
all the operations on it were "atomic".
Set the timeout node pointers to nulls atomically when removed, and
check for double-remove conditions (which, again, might be perfectly
OK).
Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
This loop was structured badly, as a while(true) with multiple "exit
if" cases in the body. It was bad enough that I genuinely fooled
myself into rewriting it, having convinced myself there was a bug in
it when there wasn't.
So keep the rewritten loop which expresses the iteration in a more
invariant way (i.e. "while we have an element to expire" and not "test
if we have to exit the loop"). Shorter and easier. Also makes the
locking clearer as we can simply release the lock around the callback
in a natural/obvious way.
Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
Don't present USE_SWITCH and SMP to user applications that are
configuring for platforms that do not support SMP or USE_SWITCH.
Signed-off-by: Sebastian Bøe <sebastian.boe@nordicsemi.no>
SMP requires the new-style '_arch_switch' to be enabled. To prevent
users from creating invalid configurations where SMP is enabled while
_arch_switch is not, we add a dependency from SMP to USE_SWITCH.
Signed-off-by: Sebastian Bøe <sebastian.boe@nordicsemi.no>
RETPOLINE has been enabled by default on most platforms, but it is
only supported on X86.
Features should only be enabled if they are supported and active on
the given platform. To rectify this we have RETPOLINE depend on X86,
the only platform on which it is implemented.
Signed-off-by: Sebastian Bøe <sebastian.boe@nordicsemi.no>
In general driver system calls are implemented at a subsystem
layer. However, some drivers may have capabilities specific to
the hardware not covered by the subsystem API. Such drivers may
want to define their own system calls.
This macro makes it simple to validate in the driver-specific
system call handlers that not only does the untrusted device
pointer correspond to the expected subsystem, initialization
state, and caller permissions, but also that the device object
is an instance of a specific driver (and not just any driver in
that subsystem).
Signed-off-by: Andrew Boie <andrew.p.boie@intel.com>
The main function is just a weak function that should be override by the
applications if they need. Just adding a nop instructions to explicitly
says that this function does nothing.
MISRA-C rule 2.2
Signed-off-by: Flavio Ceolin <flavio.ceolin@intel.com>
If initialization fails, zero the API struct so that
device_get_binding() can't fetch it, and do not mark
the driver object as initialized to user mode.
Signed-off-by: Andrew Boie <andrew.p.boie@intel.com>
This patch splits the text section into 2 parts. The first section
will have some info regarding vector tables and debug info. The
second section will have the complete text section.
This is needed to force the required functions and data variables
the correct locations.
This is due to the behavior of the linker. The linker will only link
once and hence this text section had to be split to make room
for the generated linker script.
Added a new Kconfig CODE_DATA_RELOCATION which when enabled will
invoke the script, which does the required relocation.
Added hooks inside init.c for bss zeroing and data copy operations.
Needed when we have to copy data from ROM to required memory type.
Signed-off-by: Adithya Baglody <adithya.nagaraj.baglody@intel.com>
MISRA-C says all declarations of an object or function must use the
same name and qualifiers.
MISRA-C rule 8.3
Signed-off-by: Flavio Ceolin <flavio.ceolin@intel.com>
In C90 was introduced function prototype, that allows argument types
to be checked against parameter types, though it is not necessary
specify names for the parameters. MISRA-C requires names for function
prototype parameters, it claims that names can provide useful
information regarding the function interface.
MISRA-C rule 8.2
Signed-off-by: Flavio Ceolin <flavio.ceolin@intel.com>
The order of evaluation of function calls in the arguments of a
function. This is undefined (32)/ unspecified(15-18) in C99.
MISRA-C rule 13.2 does not allow that a value of an expression and its
side effects happens in not deterministic order to avoid these
undefined behaviors.
Signed-off-by: Flavio Ceolin <flavio.ceolin@intel.com>
According with MISRA-C and unconditional break statement must
terminate every switch-clause.
MISRA-C rule 16.1 and 16.3
Signed-off-by: Flavio Ceolin <flavio.ceolin@intel.com>
MISRA-C requires the right-hand operand of && or || operator does not
contain persistent effect.
MISRA-C rule 13.5
Signed-off-by: Flavio Ceolin <flavio.ceolin@intel.com>
When a memory partition is removed, it is not required
to clear the start and attr fields, since a free partition
is only indicated by a zero size field. This commit removes
the un-necessary clearing of start and attr fields.
Signed-off-by: Ioannis Glaropoulos <Ioannis.Glaropoulos@nordicsemi.no>
It is necessary to delay setting lock_count = 0 because an unlocking thread
maybe swapped out when it calls adjust_owner_prio(). If the thread that starts
running sees lock_count = 0 it will successfully acquire the mutex even though
it is not fully unlocked yet.
Fixes#11798.
Signed-off-by: Nicolás Bértolo <nicolasbertolo@gmail.com>
The comment explaining why _IntLibInit was being invoked was left in
place after the invocation itself was removed. Remove it too.
Signed-off-by: Peter A. Bigot <pab@pabigot.com>
The function that checks if an object is valid is essentially a boolean
function. Just changing its return type to reflect it.
MISRA-C rule 14.4
Signed-off-by: Flavio Ceolin <flavio.ceolin@intel.com>
This allows for workqueues to be started in user mode.
No additional kernel objects or system calls are defined
other than starting the workqueue in user mode; for
permission purposes the embedded queue and thread objects
are sufficient.
Signed-off-by: Andrew Boie <andrew.p.boie@intel.com>
k_work and k_work_q are not kernel objects, nor will they
be. k_work_q contains some kernel objects which are tracked
independently.
Signed-off-by: Andrew Boie <andrew.p.boie@intel.com>
There were many platforms where this function was doing nothing. Just
merging its functionality with _PrepC function.
Signed-off-by: Flavio Ceolin <flavio.ceolin@intel.com>
System must not set the clock expiry via backdoor as it may
effect in unbound time drift of all scheduled timeouts.
Fixes: #11502
Signed-off-by: Pawel Dunaj <pawel.dunaj@nordicsemi.no>
System Power Management is only supported in Tickless Idle mode.
This patch modifies Kconfig dependencies to ensure System Power
Management option selects Tickless Idle one.
Fixes: #11046
Signed-off-by: Piotr Mienkowski <piotr.mienkowski@gmail.com>
The call to z_clock_set_timeout() was being made outside the timeout
lock, which can race against other contexts setting sooner-expiring
timeouts.
Also add a long comment to one spot (timeslicing) where this call is
made outside the timeout spinlock (inside the scheduler lock) and why
this is OK.
Signed-off-by: Andy Ross <andrew.j.ross@intel.com>