Boris reported hung_task splats after commit 5aec788aeb ("sched: Fix
TASK_state comparisons"). Upon closer consideration of that change, the
check should not only exclude TASK_KILLABLE but also TASK_IDLE.
Update the comment to reflect this fact and add an additional
TASK_NOLOAD test to exclude TASK_IDLE tasks as well.
Additionally, remove the TASK_FREEZABLE early exit from
check_hung_task(); a freezable task is not a frozen task.
Fixes: 5aec788aeb ("sched: Fix TASK_state comparisons")
Reported-by: Borislav Petkov <bp@alien8.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Tested-by: Borislav Petkov <bp@alien8.de>
Task state is fundamentally a bitmask; direct comparisons are probably
not working as intended. Specifically, the normal wait states have
a number of possible modifiers:
  TASK_UNINTERRUPTIBLE: TASK_WAKEKILL, TASK_NOLOAD, TASK_FREEZABLE
  TASK_INTERRUPTIBLE:   TASK_FREEZABLE
In particular, the addition of TASK_FREEZABLE wrecked
__wait_is_interruptible(). This in turn led to an audit of direct
comparisons, which yielded the rest of the changes.
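To make that concrete, here is a minimal standalone sketch (the bit
values are illustrative stand-ins, not the kernel's actual definitions)
showing how an equality test stops matching once a modifier bit such as
TASK_FREEZABLE is set, while a mask test keeps working:

  #include <stdio.h>

  #define TASK_INTERRUPTIBLE   0x0001
  #define TASK_UNINTERRUPTIBLE 0x0002
  #define TASK_WAKEKILL        0x0100
  #define TASK_NOLOAD          0x0400
  #define TASK_FREEZABLE       0x2000

  int main(void)
  {
          unsigned int state = TASK_INTERRUPTIBLE | TASK_FREEZABLE;

          /* direct comparison: fails as soon as a modifier is present */
          printf("equality match: %d\n", state == TASK_INTERRUPTIBLE);    /* 0 */
          /* mask test: still recognizes the base wait state */
          printf("mask match:     %d\n", !!(state & TASK_INTERRUPTIBLE)); /* 1 */
          return 0;
  }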
Fixes: f5d39b0208 ("freezer,sched: Rewrite core freezer logic")
Reported-by: Christian Borntraeger <borntraeger@linux.ibm.com>
Debugged-by: Christian Borntraeger <borntraeger@linux.ibm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Tested-by: Christian Borntraeger <borntraeger@linux.ibm.com>
sched_nr_migrate_break is set to a fixed value and never changes, so we
can replace it with a define, SCHED_NR_MIGRATE_BREAK.
Also, adjust SCHED_NR_MIGRATE_BREAK to be aligned with the init value
of sysctl_sched_nr_migrate, which can be initialized to different
values.
Then, use SCHED_NR_MIGRATE_BREAK to init sysctl_sched_nr_migrate.
The behavior stays unchanged unless you modify sysctl_sched_nr_migrate
through debugfs.
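A rough sketch of the resulting arrangement (the CONFIG_PREEMPT_RT
split and the 8/32 values are assumptions based on the existing
sysctl_sched_nr_migrate defaults, not a verified copy of the patch):

  #ifdef CONFIG_PREEMPT_RT
  #define SCHED_NR_MIGRATE_BREAK 8
  #else
  #define SCHED_NR_MIGRATE_BREAK 32
  #endif

  const_debug unsigned int sysctl_sched_nr_migrate = SCHED_NR_MIGRATE_BREAK;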
Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20220825122726.20819-3-vincent.guittot@linaro.org
During load balance, we try at most env->loop_max times to move a task.
But it can happen that the loop_max LRU tasks (i.e. the tail of
the cfs_tasks list) can't be moved to dst_cpu because of affinity.
In this case, loop over the list until we find at least one movable
task. The maximum number of detached tasks remains the same as before.
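A standalone toy model of the idea (structure and names are purely
illustrative, not the actual detach_tasks() implementation): tasks whose
affinity excludes dst_cpu are skipped instead of ending the scan, while
the number of detached tasks stays capped:

  #include <stdio.h>

  struct task { int allowed_on_dst; };

  static int detach_tasks_model(const struct task *tasks, int nr, int loop_max)
  {
          int detached = 0;

          for (int i = 0; i < nr && detached < loop_max; i++) {
                  if (!tasks[i].allowed_on_dst)
                          continue;       /* keep looking instead of giving up */
                  detached++;             /* "detach" this task */
          }
          return detached;
  }

  int main(void)
  {
          /* the loop_max LRU tasks are pinned, movable ones come later */
          const struct task rq_tail[] = { {0}, {0}, {0}, {1}, {1} };

          printf("detached %d task(s)\n", detach_tasks_model(rq_tail, 5, 2));
          return 0;
  }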
Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20220825122726.20819-2-vincent.guittot@linaro.org
Rewrite the core freezer to behave better wrt thawing and be simpler
in general.
By replacing PF_FROZEN with TASK_FROZEN, a special block state, it is
ensured frozen tasks stay frozen until thawed and don't randomly wake
up early, as is currently possible.
As such, it does away with PF_FROZEN and PF_FREEZER_SKIP, freeing up
two PF_flags (yay!).
Specifically; the current scheme works a little like:
freezer_do_not_count();
schedule();
freezer_count();
And either the task is blocked, or it lands in try_to_freeze()
through freezer_count(). Now, when it is blocked, the freezer
considers it frozen and continues.
However, on thawing, once pm_freezing is cleared, freezer_count()
stops working, and any random/spurious wakeup will let a task run
before its time.
That is, thawing tries to thaw things in an explicit order: kernel
threads and workqueues before bringing SMP back, before userspace,
etc. However, due to the above-mentioned races it is entirely possible
for userspace tasks to thaw (by accident) before SMP is back.
This can be a fatal problem in asymmetric ISA architectures (eg ARMv9)
where the userspace task requires a special CPU to run.
As said; replace this with a special task state TASK_FROZEN and add
the following state transitions:
TASK_FREEZABLE -> TASK_FROZEN
__TASK_STOPPED -> TASK_FROZEN
__TASK_TRACED -> TASK_FROZEN
The new TASK_FREEZABLE can be set on any state part of TASK_NORMAL
(IOW. TASK_INTERRUPTIBLE and TASK_UNINTERRUPTIBLE) -- any such state
is already required to deal with spurious wakeups and the freezer
causes one such when thawing the task (since the original state is
lost).
The special __TASK_{STOPPED,TRACED} states *can* be restored since
their canonical state is in ->jobctl.
With this, frozen tasks need an explicit TASK_FROZEN wakeup and are
free of undue (early / spurious) wakeups.
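A standalone toy model of the resulting behaviour (bit values and
helpers are made up for illustration; this is not the kernel's freezer
code): a task sleeping with TASK_FREEZABLE set is moved to TASK_FROZEN,
after which only an explicit TASK_FROZEN wakeup runs it again:

  #include <stdio.h>

  #define TASK_INTERRUPTIBLE 0x0001
  #define TASK_FREEZABLE     0x2000
  #define TASK_FROZEN        0x8000

  static unsigned int task_state = TASK_INTERRUPTIBLE | TASK_FREEZABLE;

  /* ttwu()-style wakeup: only succeeds if the state matches the mask */
  static int try_wake(unsigned int match_state)
  {
          if (task_state & match_state) {
                  task_state = 0;         /* TASK_RUNNING */
                  return 1;
          }
          return 0;
  }

  int main(void)
  {
          /* the freezer turns the freezable sleeper into TASK_FROZEN ... */
          if (task_state & TASK_FREEZABLE)
                  task_state = TASK_FROZEN;

          /* ... so a spurious normal wakeup no longer applies ... */
          printf("spurious wakeup: %d\n", try_wake(TASK_INTERRUPTIBLE)); /* 0 */
          /* ... and only an explicit thaw brings the task back. */
          printf("explicit thaw:   %d\n", try_wake(TASK_FROZEN));        /* 1 */
          return 0;
  }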
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Ingo Molnar <mingo@kernel.org>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Link: https://lore.kernel.org/r/20220822114649.055452969@infradead.org
In preparation for adding more states, add a few 0s to the literals, as
we've just about run out.
Suggested-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Now that wait_task_inactive()'s @match_state argument is a mask (like
ttwu()) it is possible to replace the special !match_state case with
an 'all-states' value such that any blocked state will match.
Suggested-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/YxhkzfuFTvRnpUaH@hirez.programming.kicks-ass.net
Make wait_task_inactive()'s @match_state work like ttwu()'s @state.
That is, instead of an equal comparison, use it as a mask. This allows
matching multiple block conditions.
(removes the unlikely; it doesn't make sense how it's only part of the
condition)
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lore.kernel.org/r/20220822114648.856734578@infradead.org
handle_initrd() marks itself as PF_FREEZER_SKIP in order to ensure
that the UMH, which is going to freeze the system, doesn't
indefinitely wait for its caller.
Rework things by adding UMH_FREEZABLE to indicate the completion is
freezable.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Link: https://lore.kernel.org/r/20220822114648.791019324@infradead.org
Rafael explained that the reason for having both PF_NOFREEZE and
PF_FREEZER_SKIP is that {,un}lock_system_sleep() is callable from
kthread context that has previously called set_freezable().
In preparation for merging the flags, have {,un}lock_system_sleep()
save and restore current->flags.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Link: https://lore.kernel.org/r/20220822114648.725003428@infradead.org
There is some ambiguity about task_running() in that it is unrelated
to TASK_RUNNING but instead tests ->on_cpu. As such, rename it to
task_on_cpu().
Suggested-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/Yxhkhn55uHZx+NGl@hirez.programming.kicks-ass.net
The sched-domain of this cpu is only used for some heuristics when
SIS_PROP is enabled, and it should be irrelevant whether the local
sd_llc is valid or not, since all we care about is the target's sd_llc
if !SIS_PROP.
Access the local domain only when there is a need.
Signed-off-by: Abel Wu <wuyun.abel@bytedance.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Mel Gorman <mgorman@suse.de>
Link: https://lore.kernel.org/r/20220907112000.1854-6-wuyun.abel@bytedance.com
It's uncertain whether idle cores exist or not if the shared
sched-domains are not ready, so returning "no idle cores" usually
makes sense.
__update_idle_core() is an exception: it checks the status of this
core and sets the hint in the shared sched-domain if necessary.
So the whole logic of this function depends on the existence of the
shared sched-domain, and it can certainly bail out early if that is
not available.
It's somewhat tricky, and as Josh suggested, it should be transient
while the domain isn't ready. So remove the self-defined default value
to make things clearer.
Signed-off-by: Abel Wu <wuyun.abel@bytedance.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Josh Don <joshdon@google.com>
Acked-by: Mel Gorman <mgorman@techsingularity.net>
Link: https://lore.kernel.org/r/20220907112000.1854-5-wuyun.abel@bytedance.com
The function select_idle_core() only gets called when has_idle_cores
is true, which is only possible when sched_smt_present is enabled.
This change also aligns select_idle_core() with select_idle_smt() in
that the caller does the check if necessary.
Signed-off-by: Abel Wu <wuyun.abel@bytedance.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Mel Gorman <mgorman@techsingularity.net>
Link: https://lore.kernel.org/r/20220907112000.1854-4-wuyun.abel@bytedance.com
The prev cpu is checked at the beginning of SIS, and it's unlikely
to have become idle by the second check in select_idle_smt(). So we'd
better focus on its SMT siblings.
Signed-off-by: Abel Wu <wuyun.abel@bytedance.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Josh Don <joshdon@google.com>
Acked-by: Mel Gorman <mgorman@techsingularity.net>
Link: https://lore.kernel.org/r/20220907112000.1854-3-wuyun.abel@bytedance.com
If two cpus share LLC cache, then the two cores they belong to
are also in the same LLC domain.
Signed-off-by: Abel Wu <wuyun.abel@bytedance.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Josh Don <joshdon@google.com>
Acked-by: Mel Gorman <mgorman@techsingularity.net>
Link: https://lore.kernel.org/r/20220907112000.1854-2-wuyun.abel@bytedance.com
As the members in sched_dl_entity are independent of dl_bw, move
__dl_clear_params out of the dl_bw lock.
Signed-off-by: Shang XiaoJing <shangxiaojing@huawei.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Daniel Bristot de Oliveira <bristot@kernel.org>
Link: https://lore.kernel.org/r/20220827020911.30641-1-shangxiaojing@huawei.com
Wrap repeated code in a helper function, replenish_dl_new_period, which
sets the deadline and runtime of the input dl_se based on pi_of(dl_se).
Note that setup_new_dl_entity originally set the deadline and runtime
based on dl_se, which should equal pi_of(dl_se) for a non-boosted task.
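A standalone sketch of what such a helper could look like (the struct
layout and pi_of() are stubbed here purely for illustration):

  #include <stdio.h>

  struct sched_dl_entity {
          long long deadline, runtime;            /* current period */
          long long dl_deadline, dl_runtime;      /* configured parameters */
  };

  static struct sched_dl_entity *pi_of(struct sched_dl_entity *dl_se)
  {
          return dl_se;   /* non-boosted case: the entity itself */
  }

  static void replenish_dl_new_period(struct sched_dl_entity *dl_se,
                                      long long rq_clock)
  {
          /* deadline and runtime of the new period come from pi_of(dl_se) */
          dl_se->deadline = rq_clock + pi_of(dl_se)->dl_deadline;
          dl_se->runtime  = pi_of(dl_se)->dl_runtime;
  }

  int main(void)
  {
          struct sched_dl_entity dl = { .dl_deadline = 100, .dl_runtime = 30 };

          replenish_dl_new_period(&dl, 1000);
          printf("deadline=%lld runtime=%lld\n", dl.deadline, dl.runtime);
          return 0;
  }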
Signed-off-by: Shang XiaoJing <shangxiaojing@huawei.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Daniel Bristot de Oliveira <bristot@kernel.org>
Link: https://lore.kernel.org/r/20220826100037.12146-1-shangxiaojing@huawei.com
Wrap repeated code in a helper function, dl_task_is_earliest_deadline,
which returns true if there is no deadline task on the rq at all, or if
the task's deadline is earlier than that of every task on the rq.
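A standalone sketch of the predicate (the dl_rq layout is stubbed for
illustration and is not the kernel's):

  #include <stdbool.h>
  #include <stdio.h>

  struct dl_rq {
          int nr_running;
          unsigned long long earliest_deadline;
  };

  static bool dl_task_is_earliest_deadline_model(unsigned long long deadline,
                                                 const struct dl_rq *dl_rq)
  {
          /* empty rq, or earlier than everything currently queued */
          return !dl_rq->nr_running || deadline < dl_rq->earliest_deadline;
  }

  int main(void)
  {
          const struct dl_rq rq = { .nr_running = 1, .earliest_deadline = 200 };

          printf("%d\n", dl_task_is_earliest_deadline_model(100, &rq)); /* 1 */
          printf("%d\n", dl_task_is_earliest_deadline_model(300, &rq)); /* 0 */
          return 0;
  }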
Signed-off-by: Shang XiaoJing <shangxiaojing@huawei.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Daniel Bristot de Oliveira <bristot@kernel.org>
Link: https://lore.kernel.org/r/20220826083453.698-1-shangxiaojing@huawei.com
Wrap repeated code in a helper function, update_current_exec_runtime,
which updates the exec runtime of the current task.
Signed-off-by: Shang XiaoJing <shangxiaojing@huawei.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20220824082856.15674-1-shangxiaojing@huawei.com
post_init_entity_util_avg() inits the task's util_avg according to the
cpu's util_avg at the time of fork, which will decay when
switched_to_fair() runs some time later; we'd better not set it at all
in the case of a !fair task.
Suggested-by: Vincent Guittot <vincent.guittot@linaro.org>
Signed-off-by: Chengming Zhou <zhouchengming@bytedance.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Vincent Guittot <vincent.guittot@linaro.org>
Link: https://lore.kernel.org/r/20220818124805.601-10-zhouchengming@bytedance.com
In wake_up_new_task(), we use post_init_entity_util_avg() to init
util_avg/runnable_avg based on the cpu's util_avg at that time, and
to attach the task's sched_avg to the cfs_rq.
Since the enqueue_task_fair() -> enqueue_entity() -> update_load_avg()
loop will do the attach, we can move this work to update_load_avg().
wake_up_new_task(p)
post_init_entity_util_avg(p)
attach_entity_cfs_rq() --> (1)
activate_task(rq, p)
enqueue_task() := enqueue_task_fair()
enqueue_entity() loop
update_load_avg(cfs_rq, se, UPDATE_TG | DO_ATTACH)
if (!se->avg.last_update_time && (flags & DO_ATTACH))
attach_entity_load_avg() --> (2)
This patch moves the attach from (1) to (2), and updates the related
comments too.
Signed-off-by: Chengming Zhou <zhouchengming@bytedance.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Vincent Guittot <vincent.guittot@linaro.org>
Link: https://lore.kernel.org/r/20220818124805.601-9-zhouchengming@bytedance.com
commit 7dc603c902 ("sched/fair: Fix PELT integrity for new tasks")
introduce a TASK_NEW state and an unnessary limitation that would fail
when changing cgroup of new forked task.
Because at that time, we can't handle task_change_group_fair() for new
forked fair task which hasn't been woken up by wake_up_new_task(),
which will cause detach on an unattached task sched_avg problem.
This patch delete this unnessary limitation by adding check before do
detach or attach in task_change_group_fair().
So cpu_cgrp_subsys.can_attach() has nothing to do for fair tasks,
only define it in #ifdef CONFIG_RT_GROUP_SCHED.
Signed-off-by: Chengming Zhou <zhouchengming@bytedance.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Vincent Guittot <vincent.guittot@linaro.org>
Link: https://lore.kernel.org/r/20220818124805.601-8-zhouchengming@bytedance.com
commit 7dc603c902 ("sched/fair: Fix PELT integrity for new tasks")
fixed two load tracking problems for new tasks, including the detach on
an unattached new task problem.
There is still another detach-on-an-unattached-task problem left, for a
task which has been woken up by try_to_wake_up() and is waiting to
actually be woken up by sched_ttwu_pending().
try_to_wake_up(p)
cpu = select_task_rq(p)
if (task_cpu(p) != cpu)
set_task_cpu(p, cpu)
migrate_task_rq_fair()
remove_entity_load_avg() --> unattached
se->avg.last_update_time = 0;
__set_task_cpu()
ttwu_queue(p, cpu)
ttwu_queue_wakelist()
__ttwu_queue_wakelist()
task_change_group_fair()
detach_task_cfs_rq()
detach_entity_cfs_rq()
detach_entity_load_avg() --> detach on unattached task
set_task_rq()
attach_task_cfs_rq()
attach_entity_cfs_rq()
attach_entity_load_avg()
The reason for this problem is similar; we should check in
detach_entity_cfs_rq() that se->avg.last_update_time != 0 before doing
detach_entity_load_avg().
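A standalone toy model of that guard (the types and function names are
stand-ins for the ones named above, not the kernel code itself):

  #include <stdio.h>

  struct sched_avg { unsigned long long last_update_time; };
  struct sched_entity { struct sched_avg avg; };

  static void detach_entity_load_avg_model(struct sched_entity *se)
  {
          printf("detaching entity, last_update_time=%llu\n",
                 se->avg.last_update_time);
  }

  static void detach_entity_cfs_rq_model(struct sched_entity *se)
  {
          /* never attached: last_update_time is still 0, nothing to detach */
          if (!se->avg.last_update_time)
                  return;
          detach_entity_load_avg_model(se);
  }

  int main(void)
  {
          struct sched_entity unattached = { .avg = { .last_update_time = 0 } };
          struct sched_entity attached   = { .avg = { .last_update_time = 42 } };

          detach_entity_cfs_rq_model(&unattached);        /* silently skipped */
          detach_entity_cfs_rq_model(&attached);          /* detaches */
          return 0;
  }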
Signed-off-by: Chengming Zhou <zhouchengming@bytedance.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Vincent Guittot <vincent.guittot@linaro.org>
Link: https://lore.kernel.org/r/20220818124805.601-7-zhouchengming@bytedance.com
When we are migrating a task off the CPU, we can combine the detach and
the propagation into dequeue_entity(), saving the detach_entity_cfs_rq()
in migrate_task_rq_fair().
This optimization is like combining DO_ATTACH in enqueue_entity() when
migrating a task to the CPU. So we don't have to traverse the CFS tree
an extra time to do the detach_entity_cfs_rq() ->
propagate_entity_cfs_rq(), which wouldn't be called anymore with this
patch's change.
detach_task()
deactivate_task()
dequeue_task_fair()
for_each_sched_entity(se)
dequeue_entity()
update_load_avg() /* (1) */
detach_entity_load_avg()
set_task_cpu()
migrate_task_rq_fair()
detach_entity_cfs_rq() /* (2) */
update_load_avg();
detach_entity_load_avg();
propagate_entity_cfs_rq();
for_each_sched_entity()
update_load_avg()
This patch saves the detach_entity_cfs_rq() called in (2) by doing
the detach_entity_load_avg() for a CPU-migrating task inside (1)
(the task being the first se in the loop).
Signed-off-by: Chengming Zhou <zhouchengming@bytedance.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Vincent Guittot <vincent.guittot@linaro.org>
Link: https://lore.kernel.org/r/20220818124805.601-6-zhouchengming@bytedance.com
When reading the sched_avg related code, I found the comments in
enqueue/dequeue_entity() are not in sync with the current code.
We don't add/subtract the entity's runnable_avg from
cfs_rq->runnable_avg during enqueue/dequeue_entity(); that is done only
for attach/detach.
This patch updates the comments to reflect how the current code works.
Signed-off-by: Chengming Zhou <zhouchengming@bytedance.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Vincent Guittot <vincent.guittot@linaro.org>
Link: https://lore.kernel.org/r/20220818124805.601-5-zhouchengming@bytedance.com
set_task_rq() -> set_task_rq_fair() will try to synchronize the blocked
task's sched_avg when migrating, which is not needed for an already
detached task.
task_change_group_fair() will detach the task's sched_avg from the prev
cfs_rq first, so reset the sched_avg last_update_time before
set_task_rq() to avoid that.
Signed-off-by: Chengming Zhou <zhouchengming@bytedance.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
Reviewed-by: Vincent Guittot <vincent.guittot@linaro.org>
Link: https://lore.kernel.org/r/20220818124805.601-4-zhouchengming@bytedance.com
We use cpu_cgrp_subsys->fork() to set the task group for the new fair
task in cgroup_post_fork().
Since commit b1e8206582 ("sched: Fix yet more sched_fork() races")
already does set_task_rq() for the new fair task in sched_cgroup_fork(),
cpu_cgrp_subsys->fork() can be removed.
cgroup_can_fork() --> pin parent's sched_task_group
sched_cgroup_fork()
__set_task_cpu()
set_task_rq()
cgroup_post_fork()
ss->fork() := cpu_cgroup_fork()
sched_change_group(..., TASK_SET_GROUP)
task_set_group_fair()
set_task_rq() --> can be removed
After this patch's change, task_change_group_fair() only needs to
care about task cgroup migration, making the code much simpler.
Signed-off-by: Chengming Zhou <zhouchengming@bytedance.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Vincent Guittot <vincent.guittot@linaro.org>
Reviewed-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
Link: https://lore.kernel.org/r/20220818124805.601-3-zhouchengming@bytedance.com
Previously we only maintained the task se depth in
task_move_group_fair(); if a !fair task changed task group, its se
depth would not be updated, so commit eb7a59b2c8 ("sched/fair: Reset
se-depth when task switched to FAIR") fixed the problem by updating se
depth in switched_to_fair() too.
Then commit daa59407b5 ("sched/fair: Unify switched_{from,to}_fair()
and task_move_group_fair()") unified these two functions and moved the
se.depth setting into attach_task_cfs_rq(), and further into
attach_entity_cfs_rq() with commit df217913e7 ("sched/fair: Factorize
attach/detach entity").
This patch moves the task se depth maintenance from
attach_entity_cfs_rq() to set_task_rq(), which is called whenever the
CPU/cgroup changes, so its depth will always be correct.
This patch is preparation for the next patch.
Signed-off-by: Chengming Zhou <zhouchengming@bytedance.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
Reviewed-by: Vincent Guittot <vincent.guittot@linaro.org>
Link: https://lore.kernel.org/r/20220818124805.601-2-zhouchengming@bytedance.com
There's no good reason to crash a user's system with a BUG_ON(),
chances are high that they'll never even see the crash message on
Xorg, and it won't make it into the syslog either.
By using a WARN_ON_ONCE() we at least give the user a chance to report
any bugs triggered here - instead of getting silent hangs.
None of these WARN_ON_ONCE()s are supposed to trigger, ever - so we ignore
cases where a NULL check is done via a BUG_ON() and we let a NULL
pointer through after a WARN_ON_ONCE().
There's one exception: WARN_ON_ONCE() arguments with side-effects,
such as locking - in this case we use the return value of the
WARN_ON_ONCE(), such as in:
- BUG_ON(!lock_task_sighand(p, &flags));
+ if (WARN_ON_ONCE(!lock_task_sighand(p, &flags)))
+ return;
Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lore.kernel.org/r/YvSsKcAXISmshtHo@gmail.com
Users of GNU ld (BFD) from binutils 2.39+ will observe multiple
instances of a new warning when linking kernels in the form:
ld: warning: arch/x86/boot/pmjump.o: missing .note.GNU-stack section implies executable stack
ld: NOTE: This behaviour is deprecated and will be removed in a future version of the linker
ld: warning: arch/x86/boot/compressed/vmlinux has a LOAD segment with RWX permissions
Generally, we would like to avoid the stack being executable. Because
there could be a need for the stack to be executable, assembler sources
have to opt in to this security feature via explicit creation of the
.note.GNU-stack section (which compilers create by default) or the
command line flag --noexecstack. Or we can simply tell the linker that
the production of such sections is irrelevant and to link the stack as
--noexecstack.
LLVM's LLD linker defaults to -z noexecstack, so this flag isn't
strictly necessary when linking with LLD, only BFD, but it doesn't hurt
to be explicit here for all linkers IMO. --no-warn-rwx-segments is
currently BFD specific and only available in the current latest release,
so it's wrapped in an ld-option check.
While the kernel makes extensive usage of ELF sections, it doesn't use
permissions from ELF segments.
Link: https://lore.kernel.org/linux-block/3af4127a-f453-4cf7-f133-a181cce06f73@kernel.dk/
Link: https://sourceware.org/git/?p=binutils-gdb.git;a=commit;h=ba951afb99912da01a6e8434126b8fac7aa75107
Link: https://github.com/llvm/llvm-project/issues/57009
Reported-and-tested-by: Jens Axboe <axboe@kernel.dk>
Suggested-by: Fangrui Song <maskray@google.com>
Signed-off-by: Nick Desaulniers <ndesaulniers@google.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Users of GNU ld (BFD) from binutils 2.39+ will observe multiple
instances of a new warning when linking kernels in the form:
ld: warning: vmlinux: missing .note.GNU-stack section implies executable stack
ld: NOTE: This behaviour is deprecated and will be removed in a future version of the linker
ld: warning: vmlinux has a LOAD segment with RWX permissions
Generally, we would like to avoid the stack being executable. Because
there could be a need for the stack to be executable, assembler sources
have to opt in to this security feature via explicit creation of the
.note.GNU-stack section (which compilers create by default) or the
command line flag --noexecstack. Or we can simply tell the linker that
the production of such sections is irrelevant and to link the stack as
--noexecstack.
LLVM's LLD linker defaults to -z noexecstack, so this flag isn't
strictly necessary when linking with LLD, only BFD, but it doesn't hurt
to be explicit here for all linkers IMO. --no-warn-rwx-segments is
currently BFD specific and only available in the current latest release,
so it's wrapped in an ld-option check.
While the kernel makes extensive usage of ELF sections, it doesn't use
permissions from ELF segments.
Link: https://lore.kernel.org/linux-block/3af4127a-f453-4cf7-f133-a181cce06f73@kernel.dk/
Link: https://sourceware.org/git/?p=binutils-gdb.git;a=commit;h=ba951afb99912da01a6e8434126b8fac7aa75107
Link: https://github.com/llvm/llvm-project/issues/57009
Reported-and-tested-by: Jens Axboe <axboe@kernel.dk>
Suggested-by: Fangrui Song <maskray@google.com>
Signed-off-by: Nick Desaulniers <ndesaulniers@google.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
It turns out that gcc-12.1 has some nasty problems with register
allocation on a 32-bit x86 build for the 64-bit values used in the
generic blake2b implementation, where the pattern of 64-bit rotates and
xor operations ends up making gcc generate horrible code.
As a result it ends up with a ridiculously large stack frame for all the
spills it generates, resulting in the following build problem:
crypto/blake2b_generic.c: In function ‘blake2b_compress_one_generic’:
crypto/blake2b_generic.c:109:1: error: the frame size of 2640 bytes is larger than 2048 bytes [-Werror=frame-larger-than=]
On the same test-case, clang ends up generating a stack frame that is
just 296 bytes (and older gcc versions generate a slightly bigger one
at 428 bytes - still nowhere near that almost 3kB monster stack frame
of gcc-12.1).
The issue is fixed both in mainline and the GCC 12 release branch [1],
but current release compilers end up failing the i386 allmodconfig build
due to this issue.
Disable the warning for now by simply raising the frame size for this
one file, just to keep this issue from having people turn off WERROR.
Link: https://lore.kernel.org/all/CAHk-=wjxqgeG2op+=W9sqgsWqCYnavC+SRfVyopu9-31S6xw+Q@mail.gmail.com/
Link: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105930 [1]
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Merge tag 'nfs-for-5.20-1' of git://git.linux-nfs.org/projects/trondmy/linux-nfs
Pull NFS client updates from Trond Myklebust:
"Highlights include:
Stable fixes:
- pNFS/flexfiles: Fix infinite looping when the RDMA connection
errors out
Bugfixes:
- NFS: fix port value parsing
- SUNRPC: Reinitialise the backchannel request buffers before reuse
- SUNRPC: fix expiry of auth creds
- NFSv4: Fix races in the legacy idmapper upcall
- NFS: O_DIRECT fixes from Jeff Layton
- NFSv4.1: Fix OP_SEQUENCE error handling
- SUNRPC: Fix an RPC/RDMA performance regression
- NFS: Fix case insensitive renames
- NFSv4/pnfs: Fix a use-after-free bug in open
- NFSv4.1: RECLAIM_COMPLETE must handle EACCES
Features:
- NFSv4.1: session trunking enhancements
- NFSv4.2: READ_PLUS performance optimisations
- NFS: relax the rules for rsize/wsize mount options
- NFS: don't unhash dentry during unlink/rename
- SUNRPC: Fail faster on bad verifier
- NFS/SUNRPC: Various tracing improvements"
* tag 'nfs-for-5.20-1' of git://git.linux-nfs.org/projects/trondmy/linux-nfs: (46 commits)
NFS: Improve readpage/writepage tracing
NFS: Improve O_DIRECT tracing
NFS: Improve write error tracing
NFS: don't unhash dentry during unlink/rename
NFSv4/pnfs: Fix a use-after-free bug in open
NFS: nfs_async_write_reschedule_io must not recurse into the writeback code
SUNRPC: Don't reuse bvec on retransmission of the request
SUNRPC: Reinitialise the backchannel request buffers before reuse
NFSv4.1: RECLAIM_COMPLETE must handle EACCES
NFSv4.1 probe offline transports for trunking on session creation
SUNRPC create a function that probes only offline transports
SUNRPC export xprt_iter_rewind function
SUNRPC restructure rpc_clnt_setup_test_and_add_xprt
NFSv4.1 remove xprt from xprt_switch if session trunking test fails
SUNRPC create an rpc function that allows xprt removal from rpc_clnt
SUNRPC enable back offline transports in trunking discovery
SUNRPC create an iterator to list only OFFLINE xprts
NFSv4.1 offline trunkable transports on DESTROY_SESSION
SUNRPC add function to offline remove trunkable transports
SUNRPC expose functions for offline remote xprt functionality
...
Merge tag 'hwmon-fixes-for-v6.0-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging
Pull hwmon fixes from Guenter Roeck:
"Fix two regressions in nct6775 and lm90 drivers"
* tag 'hwmon-fixes-for-v6.0-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging:
hwmon: (nct6775) Fix platform driver suspend regression
hwmon: (lm90) Fix error return value from detect function
Merge tag 'mm-stable-2022-08-09' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Pull remaining MM updates from Andrew Morton:
"Three patch series - two that perform cleanups and one feature:
- hugetlb_vmemmap cleanups from Muchun Song
- hardware poisoning support for 1GB hugepages, from Naoya Horiguchi
- highmem documentation fixups from Fabio De Francesco"
* tag 'mm-stable-2022-08-09' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (23 commits)
Documentation/mm: add details about kmap_local_page() and preemption
highmem: delete a sentence from kmap_local_page() kdocs
Documentation/mm: rrefer kmap_local_page() and avoid kmap()
Documentation/mm: avoid invalid use of addresses from kmap_local_page()
Documentation/mm: don't kmap*() pages which can't come from HIGHMEM
highmem: specify that kmap_local_page() is callable from interrupts
highmem: remove unneeded spaces in kmap_local_page() kdocs
mm, hwpoison: enable memory error handling on 1GB hugepage
mm, hwpoison: skip raw hwpoison page in freeing 1GB hugepage
mm, hwpoison: make __page_handle_poison returns int
mm, hwpoison: set PG_hwpoison for busy hugetlb pages
mm, hwpoison: make unpoison aware of raw error info in hwpoisoned hugepage
mm, hwpoison, hugetlb: support saving mechanism of raw error pages
mm/hugetlb: make pud_huge() and follow_huge_pud() aware of non-present pud entry
mm/hugetlb: check gigantic_page_runtime_supported() in return_unused_surplus_pages()
mm: hugetlb_vmemmap: use PTRS_PER_PTE instead of PMD_SIZE / PAGE_SIZE
mm: hugetlb_vmemmap: move code comments to vmemmap_dedup.rst
mm: hugetlb_vmemmap: improve hugetlb_vmemmap code readability
mm: hugetlb_vmemmap: replace early_param() with core_param()
mm: hugetlb_vmemmap: move vmemmap code related to HugeTLB to hugetlb_vmemmap.c
...
Merge tag 'cxl-for-6.0' of git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl
Pull cxl updates from Dan Williams:
"Compute Express Link (CXL) updates for 6.0:
- Introduce a 'struct cxl_region' object with support for
provisioning and assembling persistent memory regions.
- Introduce alloc_free_mem_region() to accompany the existing
request_free_mem_region() as a method to allocate physical memory
capacity out of an existing resource.
- Export insert_resource_expand_to_fit() for the CXL subsystem to
late-publish CXL platform windows in iomem_resource.
- Add a polled mode PCI DOE (Data Object Exchange) driver service and
use it in cxl_pci to retrieve the CDAT (Coherent Device Attribute
Table)"
* tag 'cxl-for-6.0' of git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl: (74 commits)
cxl/hdm: Fix skip allocations vs multiple pmem allocations
cxl/region: Disallow region granularity != window granularity
cxl/region: Fix x1 interleave to greater than x1 interleave routing
cxl/region: Move HPA setup to cxl_region_attach()
cxl/region: Fix decoder interleave programming
Documentation: cxl: remove dangling kernel-doc reference
cxl/region: describe targets and nr_targets members of cxl_region_params
cxl/regions: add padding for cxl_rr_ep_add nested lists
cxl/region: Fix IS_ERR() vs NULL check
cxl/region: Fix region reference target accounting
cxl/region: Fix region commit uninitialized variable warning
cxl/region: Fix port setup uninitialized variable warnings
cxl/region: Stop initializing interleave granularity
cxl/hdm: Fix DPA reservation vs cxl_endpoint_decoder lifetime
cxl/acpi: Minimize granularity for x1 interleaves
cxl/region: Delete 'region' attribute from root decoders
cxl/acpi: Autoload driver for 'cxl_acpi' test devices
cxl/region: decrement ->nr_targets on error in cxl_region_attach()
cxl/region: prevent underflow in ways_to_cxl()
cxl/region: uninitialized variable in alloc_hpa()
...
Merge tag 'pinctrl-v6.0-1' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl
Pull pin control updates from Linus Walleij:
"Outside the pinctrl driver and DT bindings we hit some Arm DT files,
patched by the maintainers.
Other than that it is business as usual.
Core changes:
- Add PINCTRL_PINGROUP() helper macro (and use it in the AMD driver).
New drivers:
- Intel Meteor Lake support.
- Renesas RZ/V2M and r8a779g0 (R-Car V4H).
- AXP209 variants AXP221, AXP223 and AXP809.
- Qualcomm MSM8909, PM8226, PMP8074 and SM6375.
- Allwinner D1.
Improvements:
- Proper pin multiplexing in the AMD driver.
- Mediatek MT8192 can use generic drive strength and pin bias, then
fixes on top plus some I2C pin group fixes.
- Have the Allwinner Sunplus SP7021 use the generic DT schema and
make interrupts optional.
- Handle Qualcomm SC7280 ADSP.
- Handle Qualcomm MSM8916 CAMSS GP clock muxing.
- High impedance bias on ZynqMP.
- Serialize StarFive access to MMIO.
- Immutable gpiochip for BCM2835, Ingenic, Qualcomm SPMI GPIO"
* tag 'pinctrl-v6.0-1' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl: (117 commits)
dt-bindings: pinctrl: qcom,pmic-gpio: add PM8226 constraints
pinctrl: qcom: Make PINCTRL_SM8450 depend on PINCTRL_MSM
pinctrl: qcom: sm8250: Fix PDC map
pinctrl: amd: Fix an unused variable
dt-bindings: pinctrl: mt8186: Add and use drive-strength-microamp
dt-bindings: pinctrl: mt8186: Add gpio-line-names property
ARM: dts: imxrt1170-pinfunc: Add pinctrl binding header
pinctrl: amd: Use unicode for debugfs output
pinctrl: amd: Fix newline declaration in debugfs output
pinctrl: at91: Fix typo 'the the' in comment
dt-bindings: pinctrl: st,stm32: Correct 'resets' property name
pinctrl: mvebu: Missing a blank line after declarations.
pinctrl: qcom: Add SM6375 TLMM driver
dt-bindings: pinctrl: Add DT schema for SM6375 TLMM
dt-bindings: pinctrl: mt8195: Use drive-strength-microamp in examples
Revert "pinctrl: qcom: spmi-gpio: make the irqchip immutable"
pinctrl: imx93: Add MODULE_DEVICE_TABLE()
pinctrl: sunxi: Add driver for Allwinner D1
pinctrl: sunxi: Make some layout parameters dynamic
pinctrl: sunxi: Refactor register/offset calculation
...
Merge tag 'apparmor-pr-2022-08-08' of git://git.kernel.org/pub/scm/linux/kernel/git/jj/linux-apparmor
Pull AppArmor updates from John Johansen:
"This is mostly cleanups and bug fixes with the one bigger change being
Matthew Wilcox's patch to use XArrays instead of the IDR from the
thread around the locking weirdness.
Features:
- Convert secid mapping to XArrays instead of IDR
- Add a kernel label to use on kernel objects
- Extend policydb permission set by making use of the xbits
- Make export of raw binary profile to userspace optional
- Enable tuning of policy paranoid load for embedded systems
- Don't create raw_sha1 symlink if sha1 hashing is disabled
- Allow labels to carry debug flags
Cleanups:
- Update MAINTAINERS file
- Use struct_size() helper in kmalloc()
- Move ptrace mediation to more logical task.{h,c}
- Resolve uninitialized symbol warnings
- Remove redundant ret variable
- Mark alloc_unconfined() as static
- Update help description of policy hash for introspection
- Remove some casts which are no-longer required
Bug Fixes:
- Fix aa_label_asxprint return check
- Fix reference count leak in aa_pivotroot()
- Fix memleak in aa_simple_write_to_buffer()
- Fix kernel doc comments
- Fix absroot causing audited secids to begin with =
- Fix quiet_denied for file rules
- Fix failed mount permission check error message
- Disable showing the mode as part of a secid to secctx
- Fix setting unconfined mode on a loaded profile
- Fix overlapping attachment computation
- Fix undefined reference to `zlib_deflate_workspacesize'"
* tag 'apparmor-pr-2022-08-08' of git://git.kernel.org/pub/scm/linux/kernel/git/jj/linux-apparmor: (34 commits)
apparmor: Update MAINTAINERS file with new email address
apparmor: correct config reference to intended one
apparmor: move ptrace mediation to more logical task.{h,c}
apparmor: extend policydb permission set by making use of the xbits
apparmor: allow label to carry debug flags
apparmor: fix overlapping attachment computation
apparmor: fix setting unconfined mode on a loaded profile
apparmor: Fix some kernel-doc comments
apparmor: Mark alloc_unconfined() as static
apparmor: disable showing the mode as part of a secid to secctx
apparmor: Convert secid mapping to XArrays instead of IDR
apparmor: add a kernel label to use on kernel objects
apparmor: test: Remove some casts which are no-longer required
apparmor: Fix memleak in aa_simple_write_to_buffer()
apparmor: fix reference count leak in aa_pivotroot()
apparmor: Fix some kernel-doc comments
apparmor: Fix undefined reference to `zlib_deflate_workspacesize'
apparmor: fix aa_label_asxprint return check
apparmor: Fix some kernel-doc comments
apparmor: Fix some kernel-doc comments
...
Merge tag 'kbuild-v5.20' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild
Pull Kbuild updates from Masahiro Yamada:
- Remove the support for -O3 (CONFIG_CC_OPTIMIZE_FOR_PERFORMANCE_O3)
- Fix error of rpm-pkg cross-builds
- Support riscv for checkstack tool
- Re-enable -Wformat warnings for Clang
- Clean up modpost, Makefiles, and misc scripts
* tag 'kbuild-v5.20' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild: (30 commits)
modpost: remove .symbol_white_list field entirely
modpost: remove unneeded .symbol_white_list initializers
modpost: add PATTERNS() helper macro
modpost: shorten warning messages in report_sec_mismatch()
Revert "Kbuild, lto, workaround: Don't warn for initcall_reference in modpost"
modpost: use more reliable way to get fromsec in section_rel(a)()
modpost: add array range check to sec_name()
modpost: refactor get_secindex()
kbuild: set EXIT trap before creating temporary directory
modpost: remove unused Elf_Sword macro
Makefile.extrawarn: re-enable -Wformat for clang
kbuild: add dtbs_prepare target
kconfig: Qt5: tell the user which packages are required
modpost: use sym_get_data() to get module device_table data
modpost: drop executable ELF support
checkstack: add riscv support for scripts/checkstack.pl
kconfig: shorten the temporary directory name for cc-option
scripts: headers_install.sh: Update config leak ignore entries
kbuild: error out if $(INSTALL_MOD_PATH) contains % or :
kbuild: error out if $(KBUILD_EXTMOD) contains % or :
...
Commit c3963bc0a0 ("hwmon: (nct6775) Split core and platform
driver") introduced a slight change in nct6775_suspend() in order to
avoid an otherwise-needless symbol export for nct6775_update_device(),
replacing a call to that function with a simple dev_get_drvdata()
instead.
As it turns out, there is no guarantee that nct6775_update_device()
is ever called prior to suspend. If it has not been, the resume
function ends up writing bad data into the various chip registers,
which results in a crash shortly after resume.
To fix the problem, just add the symbol export and return to using
nct6775_update_device() as was employed previously.
Reported-by: Zoltán Kővágó <dirty.ice.hu@gmail.com>
Tested-by: Zoltán Kővágó <dirty.ice.hu@gmail.com>
Fixes: c3963bc0a0 ("hwmon: (nct6775) Split core and platform driver")
Cc: stable@kernel.org
Signed-off-by: Zev Weiss <zev@bewilderbeest.net>
Link: https://lore.kernel.org/r/20220810052646.13825-1-zev@bewilderbeest.net
[groeck: Updated description]
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
lm90_detect_nuvoton() is supposed to return NULL if it cannot detect
a chip, or a pointer to the chip name if it does. Under some
circumstances it returns an error pointer instead. Some versions of gcc
interpret an ERR_PTR as a region of size 0 and generate an error
message.
In function ‘__fortify_strlen’,
inlined from ‘strlcpy’ at ./include/linux/fortify-string.h:159:10,
inlined from ‘lm90_detect’ at drivers/hwmon/lm90.c:2550:2:
./include/linux/fortify-string.h:50:33: error:
‘__builtin_strlen’ reading 1 or more bytes from a region of size 0
50 | #define __underlying_strlen __builtin_strlen
| ^
./include/linux/fortify-string.h:141:24: note:
in expansion of macro ‘__underlying_strlen’
141 | return __underlying_strlen(p);
| ^~~~~~~~~~~~~~~~~~~
Returning NULL instead of ERR_PTR() fixes the problem.
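A toy illustration of the bug class (names and the error encoding are
illustrative, not the lm90 driver's actual code): a detect callback is
expected to return a chip name or NULL, so an encoded error pointer
slips past the caller's NULL check and is then treated as a string:

  #include <stdio.h>

  static const char *detect_buggy(void)
  {
          return (const char *)-22L;      /* ERR_PTR(-EINVAL)-style value */
  }

  static const char *detect_fixed(void)
  {
          return NULL;                    /* no chip detected */
  }

  int main(void)
  {
          const char *name = detect_fixed();

          if (name)
                  printf("detected chip: %s\n", name);
          else
                  printf("no chip detected\n");

          /* With detect_buggy(), the `if (name)` branch would be taken and
           * the bogus pointer handed to string routines, which is the kind
           * of access the fortified strlen() warns about. */
          return 0;
  }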
Fixes: c7cebce984 ("hwmon: (lm90) Rework detect function")
Reported-by: Ingo Molnar <mingo@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Kees Cook <keescook@chromium.org>
Tested-by: Ingo Molnar <mingo@kernel.org>
Reviewed-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Let's have a look at this piece of code in __bread_slow:
get_bh(bh);
bh->b_end_io = end_buffer_read_sync;
submit_bh(REQ_OP_READ, 0, bh);
wait_on_buffer(bh);
if (buffer_uptodate(bh))
return bh;
Neither wait_on_buffer nor buffer_uptodate contain any memory barrier.
Consequently, if someone calls sb_bread and then reads the buffer data,
the read of buffer data may be executed before wait_on_buffer(bh) on
architectures with weak memory ordering and it may return invalid data.
Fix this bug by adding a memory barrier to set_buffer_uptodate and an
acquire barrier to buffer_uptodate (in a similar way as
folio_test_uptodate and folio_mark_uptodate).
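A conceptual sketch using C11 atomics (not the kernel's buffer_head
implementation): the completion path publishes the buffer contents
before setting the uptodate flag with release semantics, and readers
test the flag with acquire semantics before touching the data:

  #include <stdatomic.h>
  #include <stdio.h>

  static int buffer_data;
  static atomic_int uptodate;

  static void end_buffer_read(void)               /* completion / writer side */
  {
          buffer_data = 42;                       /* fill the buffer */
          atomic_store_explicit(&uptodate, 1, memory_order_release);
  }

  static int buffer_uptodate_acquire(void)        /* reader side */
  {
          return atomic_load_explicit(&uptodate, memory_order_acquire);
  }

  int main(void)
  {
          end_buffer_read();
          if (buffer_uptodate_acquire())
                  printf("data = %d\n", buffer_data);     /* ordered after flag */
          return 0;
  }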
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Reviewed-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: stable@vger.kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Work on "courteous server", which was introduced in 5.19, continues
apace. This release introduces a more flexible limit on the number
of NFSv4 clients that NFSD allows, now that NFSv4 clients can remain
in courtesy state long after the lease expiration timeout. The
client limit is adjusted based on the physical memory size of the
server.
The NFSD filecache is a cache of files held open by NFSv4 clients or
recently touched by NFSv2 or NFSv3 clients. This cache had some
significant scalability constraints that have been relieved in this
release. Thanks to all who contributed to this work.
A data corruption bug found during the most recent NFS bake-a-thon
that involves NFSv3 and NFSv4 clients writing the same file has been
addressed in this release.
This release includes several improvements in CPU scalability for
NFSv4 operations. In addition, Neil Brown provided patches that
simplify locking during file lookup, creation, rename, and removal
that enables subsequent work on making these operations more
scalable. We expect to see that work materialize in the next
release.
There are also numerous single-patch fixes, clean-ups, and the
usual improvements in observability.
Merge tag 'nfsd-6.0' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux
Pull nfsd updates from Chuck Lever:
"Work on 'courteous server', which was introduced in 5.19, continues
apace. This release introduces a more flexible limit on the number of
NFSv4 clients that NFSD allows, now that NFSv4 clients can remain in
courtesy state long after the lease expiration timeout. The client
limit is adjusted based on the physical memory size of the server.
The NFSD filecache is a cache of files held open by NFSv4 clients or
recently touched by NFSv2 or NFSv3 clients. This cache had some
significant scalability constraints that have been relieved in this
release. Thanks to all who contributed to this work.
A data corruption bug found during the most recent NFS bake-a-thon
that involves NFSv3 and NFSv4 clients writing the same file has been
addressed in this release.
This release includes several improvements in CPU scalability for
NFSv4 operations. In addition, Neil Brown provided patches that
simplify locking during file lookup, creation, rename, and removal
that enables subsequent work on making these operations more scalable.
We expect to see that work materialize in the next release.
There are also numerous single-patch fixes, clean-ups, and the usual
improvements in observability"
* tag 'nfsd-6.0' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux: (78 commits)
lockd: detect and reject lock arguments that overflow
NFSD: discard fh_locked flag and fh_lock/fh_unlock
NFSD: use (un)lock_inode instead of fh_(un)lock for file operations
NFSD: use explicit lock/unlock for directory ops
NFSD: reduce locking in nfsd_lookup()
NFSD: only call fh_unlock() once in nfsd_link()
NFSD: always drop directory lock in nfsd_unlink()
NFSD: change nfsd_create()/nfsd_symlink() to unlock directory before returning.
NFSD: add posix ACLs to struct nfsd_attrs
NFSD: add security label to struct nfsd_attrs
NFSD: set attributes when creating symlinks
NFSD: introduce struct nfsd_attrs
NFSD: verify the opened dentry after setting a delegation
NFSD: drop fh argument from alloc_init_deleg
NFSD: Move copy offload callback arguments into a separate structure
NFSD: Add nfsd4_send_cb_offload()
NFSD: Remove kmalloc from nfsd4_do_async_copy()
NFSD: Refactor nfsd4_do_copy()
NFSD: Refactor nfsd4_cleanup_inter_ssc() (2/2)
NFSD: Refactor nfsd4_cleanup_inter_ssc() (1/2)
...