/*
 * Copyright (c) 2016 Wind River Systems, Inc.
 *
 * SPDX-License-Identifier: Apache-2.0
 */

#include <kernel.h>
#include <kernel_structs.h>
#include <toolchain.h>
#include <linker/sections.h>
#include <drivers/system_timer.h>
#include <wait_q.h>
#include <power.h>

#if defined(CONFIG_TICKLESS_IDLE)
/*
 * Idle time must be this value or higher for the timer to go into tickless
 * idle state.
 */
s32_t _sys_idle_threshold_ticks = CONFIG_TICKLESS_IDLE_THRESH;

#if defined(CONFIG_TICKLESS_KERNEL)
#define _must_enter_tickless_idle(ticks) (1)
#else
#define _must_enter_tickless_idle(ticks) \
		((ticks == K_FOREVER) || (ticks >= _sys_idle_threshold_ticks))
#endif
#else
#define _must_enter_tickless_idle(ticks) ((void)ticks, (0))
#endif /* CONFIG_TICKLESS_IDLE */
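
/*
 * _must_enter_tickless_idle() decides whether the idle path should suppress
 * periodic system timer interrupts. With CONFIG_TICKLESS_KERNEL it is always
 * true; with plain CONFIG_TICKLESS_IDLE it is true only when idling forever
 * or for at least _sys_idle_threshold_ticks; otherwise it evaluates to 0.
 */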

#ifdef CONFIG_SYS_POWER_MANAGEMENT
/*
 * Used to allow the _sys_soc_suspend() implementation to control notification
 * of the event that caused exit from kernel idling after PM operations.
 */
unsigned char _sys_pm_idle_exit_notify;

#if defined(CONFIG_SYS_POWER_LOW_POWER_STATE)
void __attribute__((weak)) _sys_soc_resume(void)
{
}
#endif

#if defined(CONFIG_SYS_POWER_DEEP_SLEEP)
void __attribute__((weak)) _sys_soc_resume_from_deep_sleep(void)
{
}
#endif
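
/*
 * The empty handlers above are weak stubs: they let this file link when the
 * SoC/board power management code does not provide these hooks, and an SoC
 * that needs the resume notifications can override them with its own
 * (non-weak) definitions.
 */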

/**
 *
 * @brief Indicate that the kernel is idling in tickless mode
 *
 * Sets the kernel data structure idle field to either a positive value or
 * K_FOREVER.
 *
 * @param ticks the number of ticks to idle
 *
 * @return N/A
 */
static void set_kernel_idle_time_in_ticks(s32_t ticks)
{
	_kernel.idle = ticks;
}
#else
#define set_kernel_idle_time_in_ticks(x) do { } while (0)
#endif

#ifndef CONFIG_SMP
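/**
 *
 * @brief Enter a power-saving idle state
 *
 * Idles the CPU until the next scheduled kernel timer expires (or forever,
 * for K_FOREVER). Depending on configuration this may suppress periodic
 * system timer interrupts and/or offer the SoC's _sys_soc_suspend() hook a
 * chance to enter a low power state; otherwise it simply calls k_cpu_idle().
 * Entered with interrupts locked.
 *
 * @param ticks the upcoming kernel idle time, in ticks (or K_FOREVER)
 *
 * @return N/A
 */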
static void sys_power_save_idle(s32_t ticks)
{
#ifdef CONFIG_TICKLESS_KERNEL
	if (ticks != K_FOREVER) {
		ticks -= _get_elapsed_program_time();
		if (!ticks) {
			/*
			 * The timer has expired or is about to expire, so
			 * there is no time for power saving operations.
			 *
			 * Note that ticks can only be zero here if some time
			 * has elapsed since the timer was last programmed.
			 */
			k_cpu_idle();
			return;
		}
	}
#endif

	if (_must_enter_tickless_idle(ticks)) {
		/*
		 * Stop generating system timer interrupts until it's time for
		 * the next scheduled kernel timer to expire.
		 */

		/*
		 * In the tickless kernel case, the timer driver should
		 * reprogram the timer only if the currently programmed
		 * duration is smaller than the idle time.
		 */
		_timer_idle_enter(ticks);
	}

	set_kernel_idle_time_in_ticks(ticks);

#if (defined(CONFIG_SYS_POWER_LOW_POWER_STATE) || \
	defined(CONFIG_SYS_POWER_DEEP_SLEEP))

	_sys_pm_idle_exit_notify = 1;

	/*
	 * Call the suspend hook function of the SoC interface to allow
	 * entry into a low power state. The function returns
	 * SYS_PM_NOT_HANDLED if a low power state was not entered, in which
	 * case the kernel does its normal idle processing.
	 *
	 * This function is entered with interrupts disabled. If a low power
	 * state was entered, then the hook function should enable interrupts
	 * before exiting. This is because the kernel does not do its own idle
	 * processing in those cases, i.e. it skips k_cpu_idle(). The kernel's
	 * idle processing re-enables interrupts, which is essential for
	 * the kernel's scheduling logic.
	 */
	if (_sys_soc_suspend(ticks) == SYS_PM_NOT_HANDLED) {
		_sys_pm_idle_exit_notify = 0;
		k_cpu_idle();
	}
#else
	k_cpu_idle();
#endif
}
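
/*
 * For reference, a minimal sketch of the SoC-side contract described above.
 * This is not part of this file; my_soc_enter_lps() and MY_SOC_LPS_MIN_TICKS
 * are hypothetical names, and the exact return codes available depend on the
 * SoC's power.h configuration. It only illustrates that the hook is entered
 * with interrupts locked, must re-enable interrupts itself when it handles
 * the request, and returns SYS_PM_NOT_HANDLED to fall back to k_cpu_idle():
 *
 *	int _sys_soc_suspend(s32_t ticks)
 *	{
 *		if ((ticks == K_FOREVER) || (ticks > MY_SOC_LPS_MIN_TICKS)) {
 *			my_soc_enter_lps();	 // also re-enables interrupts
 *			return SYS_PM_LOW_POWER_STATE;
 *		}
 *		return SYS_PM_NOT_HANDLED;	 // kernel does k_cpu_idle()
 *	}
 */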
#endif
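
/**
 *
 * @brief Exit the power-saving idle state
 *
 * Notifies the SoC resume hook (unless notification was suppressed by the
 * suspend hook) and, if tickless idle had been entered, resumes periodic
 * system timer interrupts.
 *
 * @param ticks the idle duration that was requested when idling was entered
 *
 * @return N/A
 */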
void _sys_power_save_idle_exit(s32_t ticks)
{
#if defined(CONFIG_SYS_POWER_LOW_POWER_STATE)
	/* Some CPU low power states require notification at the ISR level
	 * to allow any operations that need to be done before the kernel
	 * switches threads or processes nested interrupts. This can be
	 * disabled by calling _sys_soc_pm_idle_exit_notification_disable().
	 * Alternatively it can simply be ignored if not required.
	 */
	if (_sys_pm_idle_exit_notify) {
		_sys_soc_resume();
	}
#endif

	if (_must_enter_tickless_idle(ticks)) {
		/* Resume normal periodic system timer interrupts */
		_timer_idle_exit();
	}
}
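
/*
 * If the idle thread was given a cooperative priority (K_IDLE_PRIO < 0) it
 * will never be preempted, so it must yield explicitly after each pass to
 * let other ready threads run; with a preemptible priority the scheduler
 * takes care of that and the macro expands to nothing.
 */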
#if K_IDLE_PRIO < 0
#define IDLE_YIELD_IF_COOP() k_yield()
#else
#define IDLE_YIELD_IF_COOP() do { } while ((0))
#endif
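/**
 *
 * @brief Idle thread entry point
 *
 * Runs when no other thread is ready. On uniprocessor builds it repeatedly
 * locks interrupts, idles until the next timeout expiry and yields if the
 * idle thread is cooperative; on SMP it currently just busy-waits and yields.
 *
 * @param unused1 unused
 * @param unused2 unused
 * @param unused3 unused
 *
 * @return N/A
 */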
void idle(void *unused1, void *unused2, void *unused3)
{
	ARG_UNUSED(unused1);
	ARG_UNUSED(unused2);
	ARG_UNUSED(unused3);

#ifdef CONFIG_BOOT_TIME_MEASUREMENT
	/* record timestamp when idling begins */

	extern u64_t __idle_time_stamp;

	__idle_time_stamp = (u64_t)k_cycle_get_32();
#endif

#ifdef CONFIG_SMP
	/* Simplified idle for SMP CPUs pending driver support. The
	 * busy waiting is needed to prevent lock contention. Long
	 * term we need to wake up idle CPUs with an IPI.
	 */
	while (1) {
		k_busy_wait(100);
		k_yield();
	}
#else
	for (;;) {
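		/*
		 * Interrupts are locked here so that reading the next timeout
		 * and going to sleep happen atomically with respect to timer
		 * interrupts; k_cpu_idle() is expected to re-enable interrupts
		 * atomically as it halts the CPU, so a wakeup that arrives in
		 * between is not lost.
		 */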
		(void)irq_lock();
		sys_power_save_idle(_get_next_timeout_expiry());

		IDLE_YIELD_IF_COOP();
	}
#endif
}