zephyr/kernel/idle.c

/*
 * Copyright (c) 2016 Wind River Systems, Inc.
 *
 * SPDX-License-Identifier: Apache-2.0
 */
#include <kernel.h>
#include <kernel_structs.h>
#include <toolchain.h>
#include <linker/sections.h>
#include <drivers/system_timer.h>
#include <wait_q.h>
#include <power.h>
#if defined(CONFIG_TICKLESS_IDLE)
/*
 * Idle time must be at least this many ticks for the timer to enter the
 * tickless idle state.
 */
s32_t _sys_idle_threshold_ticks = CONFIG_TICKLESS_IDLE_THRESH;
#if defined(CONFIG_TICKLESS_KERNEL)
#define _must_enter_tickless_idle(ticks) (1)
#else
#define _must_enter_tickless_idle(ticks) \
	(((ticks) == K_FOREVER) || ((ticks) >= _sys_idle_threshold_ticks))
#endif
#else
#define _must_enter_tickless_idle(ticks) ((void)(ticks), (0))
#endif /* CONFIG_TICKLESS_IDLE */
#ifdef CONFIG_SYS_POWER_MANAGEMENT
/*
 * Used to allow _sys_soc_suspend() implementation to control notification
 * of the event that caused exit from kernel idling after pm operations.
 */
unsigned char _sys_pm_idle_exit_notify;
#if defined(CONFIG_SYS_POWER_LOW_POWER_STATE)
void __attribute__((weak)) _sys_soc_resume(void)
{
}
#endif
#if defined(CONFIG_SYS_POWER_DEEP_SLEEP)
void __attribute__((weak)) _sys_soc_resume_from_deep_sleep(void)
{
}
#endif
/**
 *
 * @brief Indicate that kernel is idling in tickless mode
 *
 * Sets the kernel data structure idle field to either a positive value or
 * K_FOREVER.
 *
 * @param ticks the number of ticks to idle
 *
 * @return N/A
 */
static void set_kernel_idle_time_in_ticks(s32_t ticks)
{
	_kernel.idle = ticks;
}
#else
#define set_kernel_idle_time_in_ticks(x) do { } while (0)
#endif
#ifndef CONFIG_SMP
static void sys_power_save_idle(s32_t ticks)
{
#ifdef CONFIG_TICKLESS_KERNEL
	if (ticks != K_FOREVER) {
		ticks -= _get_elapsed_program_time();
		if (!ticks) {
			/*
			 * Timer has expired or is about to expire.
			 * No time for power saving operations.
			 *
			 * Note that it will never be zero unless some time
			 * had elapsed since timer was last programmed.
			 */
			k_cpu_idle();
			return;
		}
	}
#endif
	if (_must_enter_tickless_idle(ticks)) {
		/*
		 * Stop generating system timer interrupts until it's time for
		 * the next scheduled kernel timer to expire.
		 */
		/*
		 * In the case of tickless kernel, timer driver should
		 * reprogram timer only if the currently programmed time
		 * duration is smaller than the idle time.
		 */
		_timer_idle_enter(ticks);
	}

	set_kernel_idle_time_in_ticks(ticks);
#if (defined(CONFIG_SYS_POWER_LOW_POWER_STATE) || \
	defined(CONFIG_SYS_POWER_DEEP_SLEEP))

	_sys_pm_idle_exit_notify = 1;

	/*
	 * Call the suspend hook function of the soc interface to allow
	 * entry into a low power state. The function returns
	 * SYS_PM_NOT_HANDLED if low power state was not entered, in which
	 * case, kernel does normal idle processing.
	 *
	 * This function is entered with interrupts disabled. If a low power
	 * state was entered, then the hook function should enable interrupts
	 * before exiting. This is because the kernel does not do its own idle
	 * processing in those cases i.e. skips k_cpu_idle(). The kernel's
	 * idle processing re-enables interrupts which is essential for
	 * the kernel's scheduling logic.
	 */
	if (_sys_soc_suspend(ticks) == SYS_PM_NOT_HANDLED) {
		_sys_pm_idle_exit_notify = 0;
		k_cpu_idle();
	}
#else
	k_cpu_idle();
#endif
}
#endif
void _sys_power_save_idle_exit(s32_t ticks)
{
#if defined(CONFIG_SYS_POWER_LOW_POWER_STATE)
	/* Some CPU low power states require notification at the ISR
	 * to allow any operations that need to be done before the kernel
	 * switches tasks or processes nested interrupts. This can be
	 * disabled by calling _sys_soc_pm_idle_exit_notification_disable().
	 * Alternatively it can be simply ignored if not required.
	 */
	if (_sys_pm_idle_exit_notify) {
		_sys_soc_resume();
	}
#endif
	if (_must_enter_tickless_idle(ticks)) {
		/* Resume normal periodic system timer interrupts */
		_timer_idle_exit();
	}
}
#if K_IDLE_PRIO < 0
#define IDLE_YIELD_IF_COOP() k_yield()
#else
#define IDLE_YIELD_IF_COOP() do { } while ((0))
#endif
void idle(void *unused1, void *unused2, void *unused3)
{
	ARG_UNUSED(unused1);
	ARG_UNUSED(unused2);
	ARG_UNUSED(unused3);
#ifdef CONFIG_BOOT_TIME_MEASUREMENT
	/* record timestamp when idling begins */
	extern u64_t __idle_time_stamp;

	__idle_time_stamp = (u64_t)k_cycle_get_32();
#endif
#ifdef CONFIG_SMP
	/* Simplified idle for SMP CPUs pending driver support. The
	 * busy waiting is needed to prevent lock contention. Long
	 * term we need to wake up idle CPUs with an IPI.
	 */
	while (1) {
		k_busy_wait(100);
		k_yield();
	}
#else
	for (;;) {
		(void)irq_lock();
		sys_power_save_idle(_get_next_timeout_expiry());

		IDLE_YIELD_IF_COOP();
	}
#endif
}