<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux/kernel/workqueue.c, branch v3.19-rc2</title>
<subtitle>Mirror of https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/
</subtitle>
<id>https://www.git.shady.money/linux/atom?h=v3.19-rc2</id>
<link rel='self' href='https://www.git.shady.money/linux/atom?h=v3.19-rc2'/>
<link rel='alternate' type='text/html' href='https://www.git.shady.money/linux/'/>
<updated>2014-12-08T17:39:16Z</updated>
<entry>
<title>workqueue: allow rescuer thread to do more work.</title>
<updated>2014-12-08T17:39:16Z</updated>
<author>
<name>NeilBrown</name>
<email>neilb@suse.de</email>
</author>
<published>2014-12-08T17:39:16Z</published>
<link rel='alternate' type='text/html' href='https://www.git.shady.money/linux/commit/?id=008847f66c38712f2819cd956969519006ebc11d'/>
<id>urn:sha1:008847f66c38712f2819cd956969519006ebc11d</id>
<content type='text'>
When there is serious memory pressure, all workers in a pool could be
blocked, and a new thread cannot be created because it requires memory
allocation.

In this situation a WQ_MEM_RECLAIM workqueue will wake up the
rescuer thread to do some work.

The rescuer will only handle requests that are already on -&gt;worklist.
If max_requests is 1, that means it will handle a single request.

The rescuer will be woken again in 100ms to handle another max_requests
requests.

I've seen a machine (running a 3.0 based "enterprise" kernel) with
thousands of requests queued for xfslogd, which has a max_requests of
1, and is needed for retiring all 'xfs' write requests.  When one of
the worker pools gets into this state, it progresses extremely slowly
and possibly never recovers (I only waited an hour or two).

With this patch we leave a pool_workqueue on the mayday list
until it is clearly no longer in need of assistance.  This allows
all requests to be handled in a timely fashion.

We keep each pool_workqueue on the mayday list until
need_to_create_worker() is false, and no work for this workqueue is
found in the pool.
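
As a rough illustration of the difference (a toy model, not kernel code),
the sketch below counts how a backlog drains when the rescuer handles
max_requests items per 100ms wakeup, versus when it keeps going until no
work remains.  The function names and the backlog size are assumptions
made up for this example; work execution time and regular workers are
ignored.

```python
import math

MAYDAY_INTERVAL_MS = 100  # wakeup interval quoted in the text above


def wakeups_old(pending, max_requests):
    """Timer-paced model: each wakeup handles at most max_requests items."""
    return math.ceil(pending / max_requests)


def drain_new(pending, max_requests):
    """Stay-on-the-list model: keep taking batches until nothing is left."""
    passes = 0
    while pending:
        pending -= min(pending, max_requests)
        passes += 1  # no timer wait between passes in this model
    return passes


pending = 3000  # hypothetical backlog size
# Time spent purely waiting for timer wakeups in the timer-paced model.
old_wait_ms = (wakeups_old(pending, 1) - 1) * MAYDAY_INTERVAL_MS
print(old_wait_ms)
```

With max_requests of 1 and 3000 queued items, the timer-paced model
spends roughly five minutes waiting on wakeups alone, while the second
model needs the same number of passes but no timer waits.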

I have tested this in combination with a (hackish) patch which forces
all work items to be handled by the rescuer thread.  In that context
it significantly improves performance.  A similar patch for a 3.0
kernel significantly improved performance on a heavy work load.

Thanks to Jan Kara for some design ideas, and to Dongsu Park for
some comments and testing.

tj: Inverted the lock order between wq_mayday_lock and pool-&gt;lock with
    a preceding patch and simplified this patch.  Added comment and
    updated changelog accordingly.  Dongsu spotted missing get_pwq()
    in the simplified code.

Cc: Dongsu Park &lt;dongsu.park@profitbricks.com&gt;
Cc: Jan Kara &lt;jack@suse.cz&gt;
Cc: Lai Jiangshan &lt;laijs@cn.fujitsu.com&gt;
Signed-off-by: NeilBrown &lt;neilb@suse.de&gt;
Signed-off-by: Tejun Heo &lt;tj@kernel.org&gt;
</content>
</entry>
<entry>
<title>workqueue: invert the order between pool-&gt;lock and wq_mayday_lock</title>
<updated>2014-12-08T17:39:16Z</updated>
<author>
<name>Tejun Heo</name>
<email>tj@kernel.org</email>
</author>
<published>2014-12-08T17:39:16Z</published>
<link rel='alternate' type='text/html' href='https://www.git.shady.money/linux/commit/?id=b2d829096bee7eaf7be31b6229bf722e503adfd8'/>
<id>urn:sha1:b2d829096bee7eaf7be31b6229bf722e503adfd8</id>
<content type='text'>
Currently, pool-&gt;lock nests inside wq_mayday_lock.  There's no inherent
reason for this order.  The only place where the two locks are held
together is pool_mayday_timeout() and it just got decided that way.

This nesting order turns out to complicate things with the planned
rescuer_thread() update.  Let's invert them.  This doesn't cause any
behavior differences.
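
A minimal sketch of the resulting order, assuming two hypothetical Python
locks standing in for pool-&gt;lock and wq_mayday_lock.  It only illustrates
that every path takes the pool lock first and the mayday lock second; it
is not how the kernel code is written.

```python
import threading

pool_lock = threading.Lock()       # stand-in for the per-pool lock (assumption)
wq_mayday_lock = threading.Lock()  # stand-in for the global mayday lock (assumption)
maydays = []                       # toy mayday list, guarded by wq_mayday_lock


def send_mayday(pwq):
    """Queue pwq for rescue: pool lock outside, mayday lock inside."""
    with pool_lock:
        with wq_mayday_lock:
            if pwq not in maydays:
                maydays.append(pwq)


# Several callers race, but all use the same order, so none can deadlock
# against another on these two locks.
threads = [threading.Thread(target=send_mayday, args=("pwq0",)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(maydays)
```

Any later path that already holds the pool lock can then take the mayday
lock without having to drop and reacquire anything.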

Signed-off-by: Tejun Heo &lt;tj@kernel.org&gt;
Reviewed-by: Lai Jiangshan &lt;laijs@cn.fujitsu.com&gt;
Cc: NeilBrown &lt;neilb@suse.de&gt;
Cc: Dongsu Park &lt;dongsu.park@profitbricks.com&gt;
</content>
</entry>
<entry>
<title>workqueue: cosmetic update in rescuer_thread()</title>
<updated>2014-12-04T15:14:54Z</updated>
<author>
<name>Tejun Heo</name>
<email>tj@kernel.org</email>
</author>
<published>2014-12-04T15:14:13Z</published>
<link rel='alternate' type='text/html' href='https://www.git.shady.money/linux/commit/?id=0479c8c54983765085536c9463591548b45ad0a1'/>
<id>urn:sha1:0479c8c54983765085536c9463591548b45ad0a1</id>
<content type='text'>
rescuer_thread() caches &amp;rescuer-&gt;scheduled in a local variable
scheduled for convenience.  There's one WARN_ON_ONCE() which was using
&amp;rescuer-&gt;scheduled directly.  Replace it with the local variable.

This patch causes no functional difference.

Signed-off-by: Tejun Heo &lt;tj@kernel.org&gt;
</content>
</entry>
<entry>
<title>workqueue: Use cond_resched_rcu_qs macro</title>
<updated>2014-10-06T12:58:26Z</updated>
<author>
<name>Joe Lawrence</name>
<email>joe.lawrence@stratus.com</email>
</author>
<published>2014-10-05T17:24:22Z</published>
<link rel='alternate' type='text/html' href='https://www.git.shady.money/linux/commit/?id=3e28e377204badfc3c4119ff2abda473127ee0ff'/>
<id>urn:sha1:3e28e377204badfc3c4119ff2abda473127ee0ff</id>
<content type='text'>
Tidy up and use cond_resched_rcu_qs when calling cond_resched and
reporting potential quiescent state to RCU.  Splitting this change in
this way allows easy backporting to -stable for kernel versions not
having cond_resched_rcu_qs().

Signed-off-by: Joe Lawrence &lt;joe.lawrence@stratus.com&gt;
Acked-by: Tejun Heo &lt;tj@kernel.org&gt;
Signed-off-by: Paul E. McKenney &lt;paulmck@linux.vnet.ibm.com&gt;
</content>
</entry>
<entry>
<title>workqueue: Add quiescent state between work items</title>
<updated>2014-10-06T12:57:43Z</updated>
<author>
<name>Joe Lawrence</name>
<email>joe.lawrence@stratus.com</email>
</author>
<published>2014-10-05T17:24:21Z</published>
<link rel='alternate' type='text/html' href='https://www.git.shady.money/linux/commit/?id=789cbbeca4eb7141cbd748ee93772471101b507b'/>
<id>urn:sha1:789cbbeca4eb7141cbd748ee93772471101b507b</id>
<content type='text'>
Similar to the stop_machine deadlock scenario on !PREEMPT kernels
addressed in b22ce2785d97 "workqueue: cond_resched() after processing
each work item", kworker threads requeueing back-to-back with zero jiffy
delay can stall RCU. The cond_resched call introduced in that fix will
yield only iff there are other higher priority tasks to run, so force a
quiescent RCU state between work items.

Signed-off-by: Joe Lawrence &lt;joe.lawrence@stratus.com&gt;
Link: https://lkml.kernel.org/r/20140926105227.01325697@jlaw-desktop.mno.stratus.com
Link: https://lkml.kernel.org/r/20140929115445.40221d8e@jlaw-desktop.mno.stratus.com
Fixes: b22ce2785d97 ("workqueue: cond_resched() after processing each work item")
Cc: &lt;stable@vger.kernel.org&gt;
Acked-by: Tejun Heo &lt;tj@kernel.org&gt;
Signed-off-by: Paul E. McKenney &lt;paulmck@linux.vnet.ibm.com&gt;
</content>
</entry>
<entry>
<title>Merge branch 'for-3.17' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu</title>
<updated>2014-08-04T17:09:27Z</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2014-08-04T17:09:27Z</published>
<link rel='alternate' type='text/html' href='https://www.git.shady.money/linux/commit/?id=f2a84170ede80e4b80f636e3700ef4d4d5dc7d33'/>
<id>urn:sha1:f2a84170ede80e4b80f636e3700ef4d4d5dc7d33</id>
<content type='text'>
Pull percpu updates from Tejun Heo:

 - Major reorganization of percpu header files which I think makes
   things a lot more readable and logical than before.

 - percpu-refcount is updated so that it requires explicit destruction
   and can be reinitialized if necessary.  This was pulled into the
   block tree to replace the custom percpu refcnting implemented in
   blk-mq.

 - In the process, percpu and percpu-refcount got cleaned up a bit

* 'for-3.17' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu: (21 commits)
  percpu-refcount: implement percpu_ref_reinit() and percpu_ref_is_zero()
  percpu-refcount: require percpu_ref to be exited explicitly
  percpu-refcount: use unsigned long for pcpu_count pointer
  percpu-refcount: add helpers for -&gt;percpu_count accesses
  percpu-refcount: one bit is enough for REF_STATUS
  percpu-refcount, aio: use percpu_ref_cancel_init() in ioctx_alloc()
  workqueue: stronger test in process_one_work()
  workqueue: clear POOL_DISASSOCIATED in rebind_workers()
  percpu: Use ALIGN macro instead of hand coding alignment calculation
  percpu: invoke __verify_pcpu_ptr() from the generic part of accessors and operations
  percpu: preffity percpu header files
  percpu: use raw_cpu_*() to define __this_cpu_*()
  percpu: reorder macros in percpu header files
  percpu: move {raw|this}_cpu_*() definitions to include/linux/percpu-defs.h
  percpu: move generic {raw|this}_cpu_*_N() definitions to include/asm-generic/percpu.h
  percpu: only allow sized arch overrides for {raw|this}_cpu_*() ops
  percpu: reorganize include/linux/percpu-defs.h
  percpu: move accessors from include/linux/percpu.h to percpu-defs.h
  percpu: include/asm-generic/percpu.h should contain only arch-overridable parts
  percpu: introduce arch_raw_cpu_ptr()
  ...
</content>
</entry>
<entry>
<title>Merge branch 'for-3.17' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq</title>
<updated>2014-08-04T17:04:44Z</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2014-08-04T17:04:44Z</published>
<link rel='alternate' type='text/html' href='https://www.git.shady.money/linux/commit/?id=c4c3f5fba01e189fb3618f09545abdb4cf8ec8ee'/>
<id>urn:sha1:c4c3f5fba01e189fb3618f09545abdb4cf8ec8ee</id>
<content type='text'>
Pull workqueue updates from Tejun Heo:
 "Lai has been doing a lot of cleanups of workqueue and kthread_work.
  No significant behavior change.  Just a lot of cleanups all over the
  place.  Some are a bit invasive but overall nothing too dangerous"

* 'for-3.17' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq:
  kthread_work: remove the unused wait_queue_head
  kthread_work: wake up worker only when the worker is idle
  workqueue: use nr_node_ids instead of wq_numa_tbl_len
  workqueue: remove the misnamed out_unlock label in get_unbound_pool()
  workqueue: remove the stale comment in pwq_unbound_release_workfn()
  workqueue: move rescuer pool detachment to the end
  workqueue: unfold start_worker() into create_worker()
  workqueue: remove @wakeup from worker_set_flags()
  workqueue: remove an unneeded UNBOUND test before waking up the next worker
  workqueue: wake regular worker if need_more_worker() when rescuer leave the pool
  workqueue: alloc struct worker on its local node
  workqueue: reuse the already calculated pwq in try_to_grab_pending()
  workqueue: stronger test in process_one_work()
  workqueue: clear POOL_DISASSOCIATED in rebind_workers()
  workqueue: sanity check pool-&gt;cpu in wq_worker_sleeping()
  workqueue: clear leftover flags when detached
  workqueue: remove useless WARN_ON_ONCE()
  workqueue: use schedule_timeout_interruptible() instead of open code
  workqueue: remove the empty check in too_many_workers()
  workqueue: use "pool-&gt;cpu &lt; 0" to stand for an unbound pool
</content>
</entry>
<entry>
<title>workqueue: use nr_node_ids instead of wq_numa_tbl_len</title>
<updated>2014-07-22T16:10:39Z</updated>
<author>
<name>Lai Jiangshan</name>
<email>laijs@cn.fujitsu.com</email>
</author>
<published>2014-07-22T05:05:40Z</published>
<link rel='alternate' type='text/html' href='https://www.git.shady.money/linux/commit/?id=ddcb57e2ed0a4d0de5aef06735dd9df98894f818'/>
<id>urn:sha1:ddcb57e2ed0a4d0de5aef06735dd9df98894f818</id>
<content type='text'>
They are the same and nr_node_ids is provided by the memory subsystem.

Signed-off-by: Lai Jiangshan &lt;laijs@cn.fujitsu.com&gt;
Signed-off-by: Tejun Heo &lt;tj@kernel.org&gt;
</content>
</entry>
<entry>
<title>workqueue: remove the misnamed out_unlock label in get_unbound_pool()</title>
<updated>2014-07-22T16:10:39Z</updated>
<author>
<name>Lai Jiangshan</name>
<email>laijs@cn.fujitsu.com</email>
</author>
<published>2014-07-22T05:04:49Z</published>
<link rel='alternate' type='text/html' href='https://www.git.shady.money/linux/commit/?id=3fb1823c093ebe1869d34005837f64df64713780'/>
<id>urn:sha1:3fb1823c093ebe1869d34005837f64df64713780</id>
<content type='text'>
After the locking was moved up to the caller of get_unbound_pool(),
the out_unlock label no longer performs any unlock operation, so its
name is misleading.  Remove the label and replace its only use,
"goto out_unlock", with "return pool".

Signed-off-by: Lai Jiangshan &lt;laijs@cn.fujitsu.com&gt;
Signed-off-by: Tejun Heo &lt;tj@kernel.org&gt;
</content>
</entry>
<entry>
<title>workqueue: remove the stale comment in pwq_unbound_release_workfn()</title>
<updated>2014-07-22T16:10:39Z</updated>
<author>
<name>Lai Jiangshan</name>
<email>laijs@cn.fujitsu.com</email>
</author>
<published>2014-07-22T05:04:27Z</published>
<link rel='alternate' type='text/html' href='https://www.git.shady.money/linux/commit/?id=29b1cb416a2920fbc70041e4382920ae2d86f426'/>
<id>urn:sha1:29b1cb416a2920fbc70041e4382920ae2d86f426</id>
<content type='text'>
In 75ccf5950f82 ("workqueue: prepare flush_workqueue() for dynamic
creation and destruction of unbound pool_workqueues"), a comment
about the synchronization for the pwq in pwq_unbound_release_workfn()
was added. The comment claimed that the flush_mutex wasn't strictly
necessary, which was correct at the time because the pwq was protected
by workqueue_lock.

It is incorrect now: wq-&gt;flush_mutex was renamed to wq-&gt;mutex and
workqueue_lock was removed, so wq-&gt;mutex is strictly needed. The
comment was not updated when the synchronization was changed.

This patch removes the incorrect comment and doesn't add a new one to
explain why wq-&gt;mutex is needed here.  The need is obvious, and
wq-&gt;pwqs_node already carries the "WQ" notation in its definition,
which is a better comment.

The old commit mentioned above also introduced a comment in link_pwq()
about the synchronization. That comment is also removed in this patch
since the whole of link_pwq() is protected by wq-&gt;mutex.

Signed-off-by: Lai Jiangshan &lt;laijs@cn.fujitsu.com&gt;
Signed-off-by: Tejun Heo &lt;tj@kernel.org&gt;
</content>
</entry>
</feed>
