Developers must be careful when holding spinlocks and must ensure that
all of the protected code remains under control, so remove support for
nested spinlocks to improve performance (see the sketch below).
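As a hedged illustration of the expected usage after this change (the
device, lock, and function names are hypothetical, not from the patch):
each spinlock is taken exactly once, with all protected code between
acquire and release:

  #include <nuttx/spinlock.h>

  /* Hypothetical example: with nesting removed, a CPU must never try
   * to re-acquire a spinlock it already holds.
   */

  struct mydev_s
  {
    int counter;
  };

  static spinlock_t g_mydev_lock = SP_UNLOCKED;

  void mydev_update(FAR struct mydev_s *dev)
  {
    irqstate_t flags;

    flags = spin_lock_irqsave(&g_mydev_lock); /* Acquire exactly once */
    dev->counter++;                           /* All protected code here */
    spin_unlock_irqrestore(&g_mydev_lock, flags);
  }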
Signed-off-by: chao an <anchao@lixiang.com>
Some applications running the same code on different cores in AMP mode
need to know the physical core on which the function is called.
Signed-off-by: hujun5 <hujun5@xiaomi.com>
Signed-off-by: fangxinyong <fangxinyong@xiaomi.com>
reason:
Remove the "sync pause" and decouple the critical section from its
dependency on enabling interrupts; after that, we can further implement
"schedlock + spinlock".
changelist:
1 Modify the implementation of critical sections so that they no longer
involve enabling interrupts or handling synchronous pause events.
2 Attach GIC_SMP_CPUCALL to the pause handler, removing the arch
interfaces up_cpu_paused_save and up_cpu_paused_restore.
3 Completely remove up_cpu_pause, up_cpu_resume, up_cpu_paused, and
up_cpu_pausereq.
4 Rename up_cpu_pause_async to up_send_cpu_sgi.
Signed-off-by: hujun5 <hujun5@xiaomi.com>
Enabling SCHED_CRITMONITOR causes a large performance loss.
By isolating the thread running-time related functions, CPU load
measurement can run with less overhead.
Signed-off-by: yinshengkai <yinshengkai@xiaomi.com>
Signed-off-by: buxiasen <buxiasen@xiaomi.com>
Most tools used for compliance and SBOM generation use SPDX
identifiers. This change brings us a step closer to easy SBOM
generation.
Signed-off-by: Alin Jerpelea <alin.jerpelea@sony.com>
reason:
In SMP mode, restore_critical_section is executed whenever a context
switch occurs. To reduce the time taken for context switching, we
inline the restore_critical_section function. Given that
restore_critical_section is small and is called from only one location,
inlining it does not increase the size of the image. (A sketch of the
technique follows.)
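As a minimal, hypothetical sketch of the technique (not the actual
restore_critical_section code): a small function with a single call
site moves from a .c file into the header as an inline, removing the
call overhead without growing the image:

  /* In the header; NuttX spells the qualifier inline_function */

  extern volatile int g_pending_state[CONFIG_SMP_NCPUS];
  extern volatile int g_cpu_state[CONFIG_SMP_NCPUS];

  static inline void restore_small_state(int cpu)
  {
    /* Small body, one caller: inlining does not grow the image */

    g_cpu_state[cpu] = g_pending_state[cpu];
  }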
Signed-off-by: hujun5 <hujun5@xiaomi.com>
reason:
1 We place interrupt handling functions of the same priority into the
work queue corresponding to that priority, allowing high-priority
interrupts to preempt low-priority ones and thus ensuring the real-time
performance of high-priority interrupts.
2 The sole purpose of the interrupt handler is to wake up the work
queue of the corresponding priority, which then executes the interrupt
handling function (see the sketch after this list).
3 Compared to ISR threads, this approach saves more memory,
particularly when the number of interrupts is large.
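A minimal sketch of this pattern using the existing NuttX work queue
API (the device names and the choice of HPWORK are illustrative
assumptions, not from this commit):

  #include <nuttx/wqueue.h>

  static struct work_s g_mydev_work;  /* Hypothetical device work item */

  /* Bottom half: runs on the work queue that matches the IRQ priority */

  static void mydev_worker(FAR void *arg)
  {
    /* The real interrupt handling function executes here */
  }

  /* Top half: its sole purpose is to wake the matching work queue */

  static int mydev_isr(int irq, FAR void *context, FAR void *arg)
  {
    return work_queue(HPWORK, &g_mydev_work, mydev_worker, arg, 0);
  }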
Signed-off-by: hujun5 <hujun5@xiaomi.com>
reason:
Dynamically create g_irqmap to reduce the use of the data segment.
CONFIG_ARCH_NUSER_INTERRUPTS should be one more than the number of IRQs
actually used, as sketched below.
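A hedged sketch of the idea (not the exact patch; the init hook name is
hypothetical): allocate the map once at initialization instead of
reserving a static array in the data segment:

  #include <assert.h>
  #include <nuttx/kmalloc.h>

  static FAR int *g_irqmap;  /* Was: a static array in .bss/.data */

  void irqmap_initialize(void)  /* Hypothetical init hook */
  {
    /* One extra slot is reserved, which is why
     * CONFIG_ARCH_NUSER_INTERRUPTS must be one more than the number
     * of IRQs actually used.
     */

    g_irqmap = kmm_zalloc(sizeof(int) * CONFIG_ARCH_NUSER_INTERRUPTS);
    DEBUGASSERT(g_irqmap != NULL);
  }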
Signed-off-by: hujun5 <hujun5@xiaomi.com>
purpose:
To improve the real-time performance of the system, we prefer to
perform as few operations as possible within the interrupt function.
We have designed an interrupt thread for each interrupt; all operations
that do not have to be handled in the interrupt function are delegated
to the interrupt thread.
up_enable_irq() will be invoked after the isrthread has started, as
sketched below.
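A minimal sketch of the ISR-thread pattern described above (hypothetical
device names; the real commit wires this through the IRQ attach path):

  #include <nuttx/irq.h>
  #include <semaphore.h>

  static sem_t g_mydev_isr_sem;

  /* Interrupt function: as little work as possible */

  static int mydev_isr(int irq, FAR void *context, FAR void *arg)
  {
    sem_post(&g_mydev_isr_sem);   /* Just wake the interrupt thread */
    return OK;
  }

  /* Interrupt thread: everything that need not run in the ISR */

  static int mydev_isrthread(int argc, FAR char *argv[])
  {
    for (; ; )
      {
        sem_wait(&g_mydev_isr_sem);

        /* Deferred interrupt processing runs here, preemptibly */
      }

    return 0;
  }

Only after mydev_isrthread has started is up_enable_irq(irq) invoked,
so no interrupt fires before the thread can service it.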
Configuring NuttX and compile:
$ ./tools/configure.sh -l qemu-armv8a:nsh_smp
$ make
Running with qemu
$ qemu-system-aarch64 -cpu cortex-a53 -smp 4 -nographic \
-machine virt,virtualization=on,gic-version=3 \
-net none -chardev stdio,id=con,mux=on -serial chardev:con \
-mon chardev=con,mode=readline -kernel ./nuttx
Signed-off-by: hujun5 <hujun5@xiaomi.com>
Regression by:
| commit 2ee8aa6f2b
| Author: hujun5 <hujun5@xiaomi.com>
| Date: Thu Jan 11 11:27:31 2024 +0800
|
| sched: we use spin_lock_irqsave replace enter_critical_section to protect g_irqvector
|
| enter_critical_section may be called before os initialized
|
| Signed-off-by: hujun5 <hujun5@xiaomi.com>
Signed-off-by: chao an <anchao@lixiang.com>
because 'g_cpu_nestcount[me] > 0' will never happen in this place
test:
We can use qemu for testing.
compiling:
make distclean -j20; ./tools/configure.sh -l qemu-armv8a:nsh_smp; make -j20
running:
qemu-system-aarch64 -cpu cortex-a53 -smp 4 -nographic \
  -machine virt,virtualization=on,gic-version=3 \
  -net none -chardev stdio,id=con,mux=on -serial chardev:con \
  -mon chardev=con,mode=readline -kernel ./nuttx
Signed-off-by: hujun5 <hujun5@xiaomi.com>
this_task() is a function call that involves disabling interrupts and
calling this_cpu(). Since restore_critical_section is always called in
an interrupt-disabled context, there is no need to disable interrupts
again. Therefore, to save time while achieving the same effect, we
directly call tcb = current_task(me) instead of tcb = this_task(), as
sketched below.
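A sketch of the resulting code shape inside restore_critical_section
(simplified; surrounding logic omitted):

  int me = this_cpu();            /* Interrupts are already disabled */
  FAR struct tcb_s *tcb;

  tcb = current_task(me);         /* Was: tcb = this_task(); */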
Signed-off-by: hujun5 <hujun5@xiaomi.com>
We can use g_cpu_lockset to determine whether we currently hold the
scheduler lock, and all accesses and modifications of g_cpu_lockset,
g_cpu_irqlock, and g_cpu_irqset occur inside the critical section, so
we can operate on it directly (see the sketch below).
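A sketch of the check under that assumption (the per-CPU bitmask layout
is assumed from the description, not taken from the patch):

  /* Already inside the critical section, so g_cpu_lockset can be read
   * and written directly.
   */

  if ((g_cpu_lockset & (1 << this_cpu())) != 0)
    {
      /* This CPU is currently holding the scheduler lock */
    }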
test:
We can use qemu for testing.
compiling:
make distclean -j20; ./tools/configure.sh -l qemu-armv8a:nsh_smp; make -j20
running:
qemu-system-aarch64 -cpu cortex-a53 -smp 4 -nographic \
  -machine virt,virtualization=on,gic-version=3 \
  -net none -chardev stdio,id=con,mux=on -serial chardev:con \
  -mon chardev=con,mode=readline -kernel ./nuttx
Signed-off-by: hujun5 <hujun5@xiaomi.com>
1. The critical section does not prevent task scheduling.
2. If the critical section is entered while sched_lock is held, there
   is no need to check: scheduling is not going to happen.
3. If sched_lock is taken inside the critical section, sched_unlock
   will also trigger scheduling without waiting for the critical
   section to be exited.
4. After the critical section is exited, if there is an interrupt,
   scheduling will be triggered automatically.
Signed-off-by: hujun5 <hujun5@xiaomi.com>
In SMP mode, up_cpu_index() and this_cpu() are the same; both return the index of the physical core.
In AMP mode, up_cpu_index() will return the index of the physical core, while this_cpu() will always return 0:
| #ifdef CONFIG_SMP
| # define this_cpu() up_cpu_index()
| #elif defined(CONFIG_AMP)
| # define this_cpu() (0)
| #else
| # define this_cpu() (0)
| #endif
Signed-off-by: chao an <anchao@lixiang.com>
cpu0 thread0:                          cpu1:
sched_yield()
nxsched_set_priority()
nxsched_running_setpriority()
nxsched_reprioritize_rtr()
nxsched_add_readytorun()
up_cpu_pause()
                                       IRQ enter
                                       arm64_pause_handler()
                                       enter_critical_section() begin
                                       up_cpu_paused() pick thread0
                                       arm64_restorestate() set thread0
                                         tcb->xcp.regs to CURRENT_REGS
up_switch_context()
  thread0 -> thread1
arm64_syscall()
  case SYS_switch_context
    change thread0 tcb->xcp.regs
  restore_critical_section()
                                       enter_critical_section() done
                                       leave_critical_section()
                                       IRQ leave with restore CURRENT_REGS
                                       ERROR !!!
Reason:
As described above, cpu0 switches task thread0 -> thread1, but the
syscall() executes slowly; in that window cpu1 picks thread0 to run in
up_cpu_paused(). Then cpu0's syscall executes and changes thread0's
tcb->xcp.regs, so cpu1's IRQ leave restores the wrong context.
Resolve:
Move arm64_restorestate() to after enter_critical_section() is done.
This is a continuation of the fix in:
https://github.com/apache/nuttx/pull/6833
Signed-off-by: ligd <liguiding1@xiaomi.com>
spinlock.c:
Implement a read-write spinlock.
Readers can take the lock simultaneously, but only one writer can take the lock.
irq_spinlock.c:
Align g_irq_spin_count.
If the lock is NULL, the caller will take the global lock (e.g. g_irq_spin), and spin_lock_irqsave() supports nesting on the same CPU.
If a CPU already holds the write lock, it can call write_lock_irqsave() again (i.e. nesting is supported). A sketch of the read-write locking scheme follows.
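As a hedged sketch of the read-write spinlock idea in generic C11
atomics (not the exact NuttX implementation): positive counts are
concurrent readers and -1 marks the single writer:

  #include <stdatomic.h>

  typedef atomic_int rwlock_t;    /* 0 free, >0 readers, -1 writer */

  static void read_lock(rwlock_t *lock)
  {
    for (; ; )
      {
        int old = atomic_load(lock);
        if (old >= 0 && atomic_compare_exchange_weak(lock, &old, old + 1))
          {
            return;               /* Joined the readers */
          }
      }
  }

  static void read_unlock(rwlock_t *lock)
  {
    atomic_fetch_sub(lock, 1);
  }

  static void write_lock(rwlock_t *lock)
  {
    int expected = 0;
    while (!atomic_compare_exchange_weak(lock, &expected, -1))
      {
        expected = 0;             /* Lock was busy; retry from free */
      }
  }

  static void write_unlock(rwlock_t *lock)
  {
    atomic_store(lock, 0);
  }

The same-CPU nesting behavior of write_lock_irqsave() would sit on top
of this, tracked per CPU as the commit describes.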
Signed-off-by: TaiJu Wu <tjwu1217@gmail.com>
Co-authored-by: David Sidrane <David.Sidrane@Nscdg.com>
test config: ./tools/configure.sh -l qemu-armv8a:nsh_smp
Pass ostest
Whether big-endian or little-endian, the ticket spinlock only checks
whether next and owner are equal or not; that single comparison is what
determines whether a task holds the lock or the lock is free (see the
sketch below).
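A generic sketch of the ticket scheme in C11 atomics (not the NuttX
code): the comparison is between whole field values, so byte order
never matters:

  #include <stdatomic.h>
  #include <stdbool.h>

  struct ticket_lock_s
  {
    atomic_ushort next;     /* Next ticket to hand out */
    atomic_ushort owner;    /* Ticket currently being served */
  };

  static void ticket_lock(struct ticket_lock_s *lock)
  {
    unsigned short ticket = atomic_fetch_add(&lock->next, 1);

    while (atomic_load(&lock->owner) != ticket)
      {
        /* Spin until our ticket is served */
      }
  }

  static void ticket_unlock(struct ticket_lock_s *lock)
  {
    atomic_fetch_add(&lock->owner, 1);
  }

  static bool ticket_is_locked(struct ticket_lock_s *lock)
  {
    /* next == owner means free; next != owner means held */

    return atomic_load(&lock->next) != atomic_load(&lock->owner);
  }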
Signed-off-by: TaiJu Wu <tjwu1217@gmail.com>
Co-authored-by: Xiang Xiao <xiaoxiang781216@gmail.com>
When high-priority interrupts are supported, updating g_running_tasks
from within a high-priority interrupt may cause problems.
g_running_tasks should only be updated when it is determined that a
task context switch has actually occurred, as sketched below.
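A sketch of the guarded update (context abbreviated; only the names
from the description are real):

  FAR struct tcb_s **running = &g_running_tasks[this_cpu()];

  if (*running != tcb)            /* Did a context switch occur? */
    {
      *running = tcb;             /* Update only on a real switch */
    }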
Signed-off-by: zhangyuan21 <zhangyuan21@xiaomi.com>
1. Update all CMakeLists.txt to adapt to the new layout
2. Fix cmake build breaks
3. Update the license of all new files
4. Fully compatible with the current compilation environment (use configure.sh or cmake as you choose)
------------------
How to test
From within nuttx/. Configure:
cmake -B build -DBOARD_CONFIG=sim/nsh -GNinja
cmake -B build -DBOARD_CONFIG=sim:nsh -GNinja
cmake -B build -DBOARD_CONFIG=sabre-6quad/smp -GNinja
cmake -B build -DBOARD_CONFIG=lm3s6965-ek/qemu-flat -GNinja
(or the full path for a custom board):
cmake -B build -DBOARD_CONFIG=$PWD/boards/sim/sim/sim/configs/nsh -GNinja
This uses the Ninja generator (install with sudo apt install ninja-build). To build:
$ cmake --build build
menuconfig:
$ cmake --build build -t menuconfig
--------------------------
2. cmake/build: reformat the cmake files with cmake-format
https://github.com/cheshirekow/cmake_format
$ pip install cmakelang
$ for i in `find -name CMakeLists.txt`;do cmake-format $i -o $i;done
$ for i in `find -name "*.cmake"`;do cmake-format $i -o $i;done
Co-authored-by: Matias N <matias@protobits.dev>
Signed-off-by: chao an <anchao@xiaomi.com>
Summary:
- Support the arm64 PMU API; currently only the cycle counter function
  is supported.
- Use the ARM64 PMU hardware capability to implement the perf
  interface, modifying all perf-interface-related code.
- Support PMU initialization under SMP. (A sketch of the cycle-counter
  accesses follows.)
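A hedged sketch of what the cycle-counter support boils down to on
ARMv8 (generic system-register accesses, not the commit's exact code):

  #include <stdint.h>

  /* Enable the PMU cycle counter */

  static inline void pmu_ccnt_enable(void)
  {
    uint64_t pmcr;

    __asm__ volatile("mrs %0, pmcr_el0" : "=r"(pmcr));
    pmcr |= (1ul << 0) | (1ul << 2);    /* E: enable, C: reset CCNT */
    __asm__ volatile("msr pmcr_el0, %0" : : "r"(pmcr));

    /* PMCNTENSET_EL0 bit 31 enables the cycle counter */

    __asm__ volatile("msr pmcntenset_el0, %0" : : "r"(1ul << 31));
  }

  /* Read the current cycle count */

  static inline uint64_t pmu_ccnt_read(void)
  {
    uint64_t cycles;

    __asm__ volatile("mrs %0, pmccntr_el0" : "=r"(cycles));
    return cycles;
  }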
Signed-off-by: wangming9 <wangming9@xiaomi.com>