reason:
1 To improve efficiency, we mimic Linux's behavior: disabling preemption applies only to the current CPU and does not affect the other CPUs.
2 In the future, we will implement "spinlock+sched_lock" and use it extensively. Under such circumstances, if preemption were still disabled globally, it would seriously impact scheduling efficiency.
3 We have removed g_cpu_lockset and use irqcount instead, in order to eliminate sched_lock's dependency on critical sections in the future, simplify the logic, and further enhance the performance of sched_lock.
4 We set lockcount to 1 in order to lock scheduling on all CPUs during startup, without the need to provide additional functions to disable scheduling on the other CPUs.
5 CPUs 1~n must wait for CPU 0 to enter the idle state before enabling scheduling; this prevents them from competing with CPU 0 for the memory manager mutex, which could cause the CPU 0 idle task to enter a wait state and trigger an assert.
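For reference, a minimal usage sketch (assuming the standard sched_lock()/sched_unlock() pair declared in sched.h); after this change the lock stops preemption only on the calling CPU:

#include <sched.h>

static void update_shared_state(void)
{
  sched_lock();    /* Disable preemption, now only on the current CPU */

  /* Touch state that must not be preempted by another task on this
   * CPU; tasks on the other CPUs keep running and scheduling normally.
   */

  sched_unlock();  /* Re-enable preemption; a pending reschedule may run */
}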
size nuttx
before:
text data bss dec hex filename
265396 51057 63646 380099 5ccc3 nuttx
after:
text data bss dec hex filename
265184 51057 63642 379883 5cbeb nuttx
size -216
Configuring and compiling NuttX:
$ ./tools/configure.sh -l qemu-armv8a:nsh_smp
$ make
Running with qemu
$ qemu-system-aarch64 -cpu cortex-a53 -smp 4 -nographic \
-machine virt,virtualization=on,gic-version=3 \
-net none -chardev stdio,id=con,mux=on -serial chardev:con \
-mon chardev=con,mode=readline -kernel ./nuttx
Signed-off-by: hujun5 <hujun5@xiaomi.com>
reason:
Currently, if we need to schedule a task onto another CPU, we have to completely halt the other CPU,
manipulate the scheduling linked list, and then resume the operation of that CPU. This process is both time-consuming and unnecessary.
During it, both the current CPU and the target CPU inevitably end up busy-looping.
The improved strategy is to simply send a cross-core interrupt to the target CPU:
the current CPU continues to run while the target CPU responds to the interrupt, so a busy loop is no longer guaranteed to occur.
Signed-off-by: hujun5 <hujun5@xiaomi.com>
Most tools used for compliance and SBOM generation use SPDX identifiers
This change brings us a step closer to an easy SBOM generation.
Signed-off-by: Alin Jerpelea <alin.jerpelea@sony.com>
The file descriptors of kernel threads should be clean and should
not be generated based on user threads. Instead, an idle thread
should be chosen.
Signed-off-by: dongjiuzhu1 <dongjiuzhu1@xiaomi.com>
Kthreads can share the group data in order to reduce overheads.
This implements shared kthread group via:
- use `tcb_s` instead of `task_tcb_s` for kthreads
- use `g_kthread_group` when creating kthreads
- use stackargs to start tasks and kthreads
see pull/12320 for test logs.
Signed-off-by: Yanfeng Liu <yfliu2008@qq.com>
The sched implementation does not depend on the macro abstraction, so revert the commits below:
This reverts commit 4e62d0005a
This reverts commit 0f0c370520
This reverts commit ad0efd04ee
Signed-off-by: chao an <anchao@lixiang.com>
1. Add support to join the main task
| #include <pthread.h>
| #include <stdio.h>
| #include <unistd.h>
|
| static pthread_t self;
|
| static void *join_task(void *arg)
| {
|   int ret;
|
|   ret = pthread_join(self, NULL); /* <--- Fix: previously the main task could not be joined */
|   printf("pthread_join returned %d\n", ret);
|   return NULL;
| }
|
| int main(int argc, char *argv[])
| {
|   pthread_t thread;
|
|   self = pthread_self();
|
|   pthread_create(&thread, NULL, join_task, NULL);
|   sleep(1);
|
|   pthread_exit(NULL);
|   return 0;
| }
2. Detaching an active thread no longer allocates an additional join structure; it just updates the task flag.
3. Remove the return value waiting lock logic (data_sem);
the return value is now stored in the waiting TCB.
4. Revise the return values of pthread_join() to be consistent with Linux
(a short sketch follows the NOTE below), e.g.:
Joining a detached and canceled thread should return EINVAL, not ESRCH
https://pubs.opengroup.org/onlinepubs/009695399/functions/pthread_join.html
[EINVAL]
The value specified by thread does not refer to a joinable thread.
NOTE:
This PR does not increase stack usage, but struct tcb_s grows by 32 bytes.
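Below is a minimal sketch of the revised item 4 semantics: joining a thread that has already been detached is expected to fail with EINVAL rather than ESRCH (the thread body is illustrative only).

| #include <errno.h>
| #include <pthread.h>
| #include <stdio.h>
|
| static void *worker(void *arg)
| {
|   return NULL;
| }
|
| int main(int argc, char *argv[])
| {
|   pthread_t thread;
|   int ret;
|
|   pthread_create(&thread, NULL, worker, NULL);
|   pthread_detach(thread);
|
|   ret = pthread_join(thread, NULL);
|   printf("join of detached thread: %d (expect EINVAL)\n", ret);
|   return 0;
| }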
Signed-off-by: chao an <anchao@lixiang.com>
Move the task group into task_tcb_s to avoid accessing the allocator, improving performance.
For task termination, the time consumption is reduced by ~2us (TriCore TC397 @ 300MHz):
15.97(us) -> 13.55(us)
Signed-off-by: chao an <anchao@lixiang.com>
The maximum number of startup parameters is already checked in nxtask_setup_stackargs(),
so let us save the argument counter to avoid the limit check.
Signed-off-by: chao an <anchao@lixiang.com>
Current `CONFIG_PAGING` refers to an experimental implementation
to enable embedded MCUs with some limited RAM space to execute
large programs from some non-random access media.
On-demand paging should be implemented for the kernel mode with
address environment implementation enabled.
Usage:
1. Set CONFIG_FS_PROCFS_MAX_STACK_RECORD > 0, e.g. 32.
2. Add '-finstrument-functions' to the CFLAGS of the code whose stack
you want to check.
3. Mount procfs.
4. 'cat /proc/<pid>/stack' will print the backtrace & its size.
Signed-off-by: anjiahao <anjiahao@xiaomi.com>
The task files should consult the "spawn action" and "O_CLOEXEC" flags
to further determine whether the file should be duplicated.
This PR will further optimize file list duplication to avoid the performance
regression caused by additional file operations.
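To illustrate the rule above with a hedged sketch (the file paths and child binary below are hypothetical): a descriptor opened with O_CLOEXEC should not be duplicated into the spawned task's file list, while one without it and without a spawn file action is inherited.

#include <fcntl.h>
#include <spawn.h>
#include <unistd.h>

int spawn_example(void)
{
  pid_t pid;
  char * const argv[] = { "child", NULL };
  char * const envp[] = { NULL };

  /* Not duplicated into the child: carries O_CLOEXEC */

  int logfd = open("/tmp/log.txt", O_WRONLY | O_CREAT | O_CLOEXEC, 0644);

  /* Duplicated into the child: no O_CLOEXEC and no spawn file action */

  int datafd = open("/tmp/data.bin", O_RDONLY);

  int ret = posix_spawn(&pid, "/bin/child", NULL, NULL, argv, envp);

  close(logfd);
  close(datafd);
  return ret;
}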
Signed-off-by: chao an <anchao@xiaomi.com>
This moves task / thread cancel point logic from the NuttX kernel into
libc, while the data needed by the cancel point logic is moved to TLS.
The change is an enabler to move user-space APIs to libc as well, for
a coherent user/kernel separation.
We can use this driver in NuttX to download files with a debugger.
Signed-off-by: anjiahao <anjiahao@xiaomi.com>
Signed-off-by: chao an <anchao@xiaomi.com>
This adds functionality to map pages dynamically into kernel virtual
memory. This allows implementing I/O remap for example, which is a useful
(future) feature.
Now, the first target is to support mapping user pages for the kernel.
Why? There are some userspace structures that might be needed when the
userspace process is not running. Semaphores are one such example. Signals
and the WDT timeout both need access to the user semaphore to work
properly. Even though only obtaining the kernel-addressable page pool
virtual address is needed for this, for completeness a procedure is
provided to map several pages.
Set the default CPU bits. The way to use an unset CPU is to call the
sched_setaffinity() function to bind a task to that CPU. Bit 0 means CPU 0.
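For example, a task can bind itself to an otherwise unused CPU (CPU 2 here, assuming the system has at least three CPUs) with the standard affinity API:

#include <sched.h>
#include <unistd.h>

static int bind_to_cpu2(void)
{
  cpu_set_t cpuset;

  CPU_ZERO(&cpuset);
  CPU_SET(2, &cpuset);  /* Bit 2 selects CPU 2, matching "bit 0 means CPU 0" */

  return sched_setaffinity(getpid(), sizeof(cpu_set_t), &cpuset);
}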
Signed-off-by: zhangyuan21 <zhangyuan21@xiaomi.com>
It is inappropriate to apply volatile to the task list:
1. The code that accesses the task list is already protected by a critical section.
2. The queue is a complex struct; volatile is not enough to protect it.
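A minimal sketch of point 1, the protection that is already in place (the actual list traversal is elided):

#include <nuttx/irq.h>

static void access_task_list(void)
{
  irqstate_t flags = enter_critical_section();

  /* ... read or modify the task list here; the critical section,
   * not a volatile qualifier, is what makes the access safe ...
   */

  leave_critical_section(flags);
}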
Signed-off-by: Xiang Xiao <xiaoxiang@xiaomi.com>
because not all compilers support the weak attribute, and
many features are either always used or guarded by a config option.
Signed-off-by: Xiang Xiao <xiaoxiang@xiaomi.com>
- The user mode allocator was used for setting up the environment. This
works in flat mode, and probably in protected mode as well, as there
is always a single user allocator present
- This does not work in kernel mode, where each user task has its own
heap allocator. Also, when the idle task's environment is being set,
no allocator is ready and the system crashes at once.
Fix this by using the group allocators instead:
- The idle task is a kernel task, so its group is privileged
- Add group_realloc
- Use the group_malloc/realloc functions instead of kumm_malloc
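A rough sketch of the resulting allocation path, assuming group_realloc() mirrors the existing group_malloc() signature (group pointer, old memory, new size); the exact prototypes may differ:

/* Grow a buffer from the task group's heap.  For the privileged
 * idle/kernel group this resolves to the kernel heap, so it also
 * works before any user heap exists.
 */

static FAR char *group_buf_grow(FAR struct task_group_s *group,
                                FAR char *buf, size_t newsize)
{
  if (buf == NULL)
    {
      return group_malloc(group, newsize);    /* First allocation */
    }

  return group_realloc(group, buf, newsize);  /* Assumed prototype */
}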