author Joel Fernandes (Google) <joel@joelfernandes.org> 2023-09-23 01:14:08 +0000
committer Ingo Molnar <mingo@kernel.org> 2023-09-28 22:58:13 +0200
commit fc09027786c900368de98d03d40af058bcb01ad9
tree 980329f0a8a328bb49571155e272372f96df1aa5 /net/unix/af_unix.c
parent Linux 6.6-rc3
sched/rt: Fix live lock between select_fallback_rq() and RT push
During RCU-boost testing with the TREE03 rcutorture config, I found that after a few hours, the machine locks up.

On tracing, I found that there is a live lock happening between 2 CPUs. One CPU has an RT task running, while another CPU is being offlined which also has an RT task running. During this offlining, all threads are migrated. The migration thread is repeatedly scheduled to migrate actively running tasks on the CPU being offlined. This results in a live lock because select_fallback_rq() keeps picking the CPU that an RT task is already running on only to get pushed back to the CPU being offlined.

It is anyway pointless to pick CPUs for pushing tasks to if they are being offlined only to get migrated away to somewhere else. This could also add unwanted latency to this task.

Fix these issues by not selecting CPUs in RT if they are not 'active' for scheduling, using the cpu_active_mask. Other parts in core.c already use cpu_active_mask to prevent tasks from being put on CPUs going offline.

With this fix I ran the tests for days and could not reproduce the hang. Without the patch, I hit it in a few hours.

Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Tested-by: Paul E. McKenney <paulmck@kernel.org>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20230923011409.3522762-1-joel@joelfernandes.org
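The patch body is not shown on this page, so the following is only a userspace sketch of the idea described in the message: when choosing a CPU to push an RT task to, intersect the candidate set with the set of CPUs still active for scheduling. The type cpumask_model_t and the functions model_cpu_isset() and pick_push_target() are hypothetical names invented for illustration; they are not kernel APIs, and the real change operates on struct cpumask and cpu_active_mask inside the scheduler.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of a CPU mask: one bit per CPU, at most 64 CPUs. */
typedef uint64_t cpumask_model_t;

/* Return true if bit 'cpu' is set in 'mask'. */
static inline bool model_cpu_isset(cpumask_model_t mask, int cpu)
{
	return cpu >= 0 && cpu < 64 && ((mask >> cpu) & 1u);
}

/*
 * Pick a push target for an RT task (illustrative only).
 *
 * allowed:     CPUs the task may run on (its affinity).
 * lowest_prio: CPUs currently running lower-priority work.
 * active:      CPUs still active for scheduling; a CPU being
 *              offlined has its bit cleared here.
 *
 * Returns the lowest-numbered CPU present in all three masks,
 * or -1 if no such CPU exists.
 */
static int pick_push_target(cpumask_model_t allowed,
			    cpumask_model_t lowest_prio,
			    cpumask_model_t active)
{
	/* Dropping inactive CPUs is the step the commit message describes. */
	cpumask_model_t candidates = allowed & lowest_prio & active;

	for (int cpu = 0; cpu < 64; cpu++) {
		if (model_cpu_isset(candidates, cpu))
			return cpu;
	}
	return -1;
}
```

In this model, passing an all-ones value for active reproduces the unfiltered behaviour: a CPU that is going offline can still be chosen, and the task would then have to be migrated away from it again. With the offlining CPU's bit cleared, that CPU is never returned.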
Diffstat (limited to 'net/unix/af_unix.c')
0 files changed, 0 insertions, 0 deletions