diff --git a/TODO b/TODO
index 7a1728743b..029a46fa75 100644
--- a/TODO
+++ b/TODO
@@ -1,4 +1,4 @@
-NuttX TODO List (Last updated November 21, 2019)
+NuttX TODO List (Last updated January 3, 2019)
 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
 
 This file summarizes known NuttX bugs, limitations, inconsistencies with
@@ -589,32 +589,58 @@ o SMP
                can that occur?  I think it can occur in the following
                situation:
 
-                 CPU0 - Task A is running.
-                      - The CPU0 IDLE task is the only other task in the
-                        CPU0 ready-to-run list.
-                 CPU1 - Task B is running.
-                      - Task C is blocked but remains in the g_assignedtasks[]
-                        list because of a CPU affinity selection.  Task C
-                        also holds the critical section which is temporarily
-                        relinquished because Task C is blocked by Task B.
-                      - The CPU1 IDLE task is at the end of the list.
+               The log below was reported on NuttX running on a two-core
+               Cortex-A7 architecture in SMP mode.  You can see that
+               when sched_addreadytorun() was called, g_cpu_irqset was 3.
 
-                 Actions:
-                   1. Task A/CPU 0 takes the critical section.
-                   2. Task B/CPU 1 suspends waiting for an event
-                   3. Task C is restarted.
+                 sched_addreadytorun: irqset cpu 1, me 0 btcbname init, irqset 1 irqcount 2.
+                 sched_addreadytorun: sched_addreadytorun line 338 g_cpu_irqset = 3.
 
-               Now both Task A and Task C hold the critical section.
+               This can happen, but only under a very specific condition.
+               g_cpu_irqset only exists to support this condition:
 
-               This problem has never been observed, but seems to be a
-               possibility.  I believe it could only occur if CPU affinity
-               is used (otherwise, tasks will pend must as when pre-
-               emption is disabled).
+                 a. A task running on CPU 0 takes the critical section.  So
+                    g_cpu_irqset == 0x1.
 
-               A proper solution would probably involve re-designing how
-               CPU affinity is implemented.  The CPU1 IDLE thread should
-               more appropriately run, but cannot because the Task C TCB
-               is in the g_assignedtasks[] list.
+                 b. A task exits on CPU 1 and a waiting, ready-to-run task
+                    is restarted on CPU 1.  This new task also holds the
+                    critical section.  So when the task is restarted on
+                    CPU 1, we then have g_cpu_irqset == 0x3.
+
+               So we are in a very perverse state!  There are two tasks
+               running on two different CPUs and both hold the critical
+               section.  I believe that is a dangerous situation and there
+               could be undiscovered bugs that could happen in that case.
+               However, as of this moment, I have not heard of any specific
+               problems caused by this weird behavior.
+
+               A possible solution would be to add a new task state that
+               would exist only for SMP.
+
+               - Add a new SMP-only task list and state.  Say,
+                 g_csection_wait[].  It should be prioritized.
+               - When a task acquires the critical section, all tasks in
+                 g_readytorun[] that need the critical section would be
+                 moved to g_csection_wait[].
+               - When any task is unblocked for any reason and moved to the
+                 g_readytorun[] list, if that unblocked task needs the
+                 critical section, it would also be moved to the
+                 g_csection_wait[] list.  No task that needs the critical
+                 section can be in the ready-to-run list if the critical
+                 section is not available.
+               - When the task releases the critical section, all tasks in
+                 g_csection_wait[] need to be moved back to
+                 g_readytorun[].
+               - This may result in a context switch.  The tasks should be
+                 moved back to g_readytorun[] highest priority first.  If a
+                 context switch occurs and the critical section is re-taken
+                 by the restarted task, the lower priority tasks in
+                 g_csection_wait[] must stay in that list.
+
+               That is really not as much work as it sounds.  It is
+               something that could be done in 2-3 days of work if you know
+               what you are doing.  Getting the proper test setup and
+               verifying the change would be the more difficult task.
 
   Status:      Open
   Priority:    Unknown.  Might be high, but first we would need to confirm
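
Note on the new TODO text: the 0x1 to 0x3 transition of g_cpu_irqset, and the
rule that "no task that needs the critical section can be in the ready-to-run
list if the critical section is not available", can be sketched in C as shown
below.  This is only an illustrative sketch and is not NuttX code.  The names
cpu_set_sketch_t, irqset_add(), irqset_multiple_holders(), and
sketch_select_list() are hypothetical, and the sketch assumes that bit n of
the set corresponds to CPU n, which is consistent with the 0x1 and 0x3 values
quoted in the TODO entry.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for g_cpu_irqset: bit n is assumed to be set
 * while the task running on CPU n holds the critical section.
 */

typedef uint32_t cpu_set_sketch_t;

/* Record that the task now running on 'cpu' holds the critical section */

static cpu_set_sketch_t irqset_add(cpu_set_sketch_t set, int cpu)
{
  assert(cpu >= 0 && cpu < 32);
  return set | ((cpu_set_sketch_t)1 << cpu);
}

/* True if more than one bit is set, that is, if tasks on two or more
 * CPUs hold the critical section at the same time.
 */

static bool irqset_multiple_holders(cpu_set_sketch_t set)
{
  return (set & (set - 1)) != 0;
}

/* The two lists a runnable task could be placed in under the proposal */

enum sketch_list_e
{
  SKETCH_READYTORUN,     /* Stands in for g_readytorun[] */
  SKETCH_CSECTION_WAIT   /* Stands in for the proposed g_csection_wait[] */
};

/* Proposed placement rule: a task that needs the critical section may
 * only sit in the ready-to-run list while the critical section is
 * available.  Otherwise it waits in the new list.
 */

static enum sketch_list_e sketch_select_list(bool needs_csection,
                                             bool csection_available)
{
  if (needs_csection && !csection_available)
    {
      return SKETCH_CSECTION_WAIT;
    }

  return SKETCH_READYTORUN;
}
```

Steps (a) and (b) of the entry correspond to irqset_add(0, 0) followed by
irqset_add(set, 1), after which irqset_multiple_holders() reports the state
that the entry describes as dangerous.  Under the proposed rule, the task in
step (b) would have been placed in the waiting list rather than restarted.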