diff options
author | Ammar Faizi <ammar.faizi@students.amikom.ac.id> | 2021-11-06 18:49:42 +0700 |
---|---|---|
committer | Jens Axboe <axboe@kernel.dk> | 2021-11-09 07:48:00 -0700 |
commit | 1fcdf578088cf1b5e67273419d44774d3d1f2a03 (patch) | |
tree | 022ce9ea6d946f5038da1a75b131790ed0e978a3 /test/Makefile | |
parent | 7b5681f82bf53b42afc7cb35482036559d5bad42 (diff) | |
download | liburing-1fcdf578088cf1b5e67273419d44774d3d1f2a03.tar.gz liburing-1fcdf578088cf1b5e67273419d44774d3d1f2a03.tar.bz2 |
test: Add kworker-hang testissue448
This is the reproducer for the kworker hang bug.
Reproduction Steps:
1) A user task calls io_uring_queue_exit().
2) Suspend the task with SIGSTOP / SIGTRAP before the ring exit is
finished (do it as soon as step (1) is done).
3) Wait for `/proc/sys/kernel/hung_task_timeout_secs` seconds
elapsed.
4) We get a complaint from the khungtaskd because the kworker is
stuck in an uninterruptible state (D).
The kworkers waiting on ring exit are not progressing as the task
cannot proceed. When the user task is continued (e.g. get SIGCONT
after SIGSTOP, or continue after SIGTRAP breakpoint), the kworkers
then can finish the ring exit.
We need a special handling for this case to avoid khungtaskd
complaint. Currently we don't have the fix for this.
The dmesg says:
[247390.432294] INFO: task kworker/u8:2:358488 blocked for more than 10 seconds.
[247390.432314] Tainted: G OE 5.15.0-stable #5
[247390.432322] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[247390.432329] task:kworker/u8:2 state:D stack: 0 pid:358488 ppid: 2 flags:0x00004000
[247390.432341] Workqueue: events_unbound io_ring_exit_work
[247390.432354] Call Trace:
[247390.432368] __schedule+0x453/0x1850
[247390.432388] ? lock_acquire+0xc8/0x2d0
[247390.432404] ? usleep_range+0x90/0x90
[247390.432412] schedule+0x59/0xc0
[247390.432420] schedule_timeout+0x1aa/0x1f0
[247390.432429] ? mark_held_locks+0x49/0x70
[247390.432439] ? lockdep_hardirqs_on_prepare+0xff/0x180
[247390.432445] ? _raw_spin_unlock_irq+0x24/0x40
[247390.432456] __wait_for_common+0xc2/0x170
[247390.432473] io_ring_exit_work+0x1d9/0x750
[247390.432486] ? io_uring_del_tctx_node+0xe0/0xe0
[247390.432502] ? verify_cpu+0xf0/0x100
[247390.432520] process_one_work+0x23b/0x550
[247390.432540] worker_thread+0x55/0x3c0
[247390.432546] ? process_one_work+0x550/0x550
[247390.432556] kthread+0x140/0x160
[247390.432564] ? set_kthread_struct+0x40/0x40
[247390.432574] ret_from_fork+0x1f/0x30
[247390.432605] INFO: task kworker/u8:0:359615 blocked for more than 10 seconds.
[247390.432613] Tainted: G OE 5.15.0-stable #5
[247390.432620] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[247390.432626] task:kworker/u8:0 state:D stack: 0 pid:359615 ppid: 2 flags:0x00004000
[247390.432635] Workqueue: events_unbound io_ring_exit_work
[247390.432643] Call Trace:
[247390.432653] __schedule+0x453/0x1850
[247390.432676] ? usleep_range+0x90/0x90
[247390.432684] schedule+0x59/0xc0
[247390.432691] schedule_timeout+0x1aa/0x1f0
[247390.432700] ? mark_held_locks+0x49/0x70
[247390.432710] ? lockdep_hardirqs_on_prepare+0xff/0x180
[247390.432717] ? _raw_spin_unlock_irq+0x24/0x40
[247390.432728] __wait_for_common+0xc2/0x170
[247390.432744] io_ring_exit_work+0x1d9/0x750
[247390.432758] ? io_uring_del_tctx_node+0xe0/0xe0
[247390.432772] ? verify_cpu+0xf0/0x100
[247390.432788] process_one_work+0x23b/0x550
[247390.432807] worker_thread+0x55/0x3c0
[247390.432813] ? process_one_work+0x550/0x550
[247390.432824] kthread+0x140/0x160
[247390.432830] ? set_kthread_struct+0x40/0x40
[247390.432839] ret_from_fork+0x1f/0x30
[247390.432870]
Showing all locks held in the system:
[247390.432877] 1 lock held by khungtaskd/40:
[247390.432880] #0: ffffffff82976700 (rcu_read_lock){....}-{1:2}, at: debug_show_all_locks+0x15/0x174
[247390.432911] 1 lock held by in:imklog/922:
[247390.432915] #0: ffff8881041cfcf0 (&f->f_pos_lock){+.+.}-{3:3}, at: __fdget_pos+0x4a/0x60
[247390.432977] 2 locks held by pager/318088:
[247390.432981] #0: ffff8881208d4898 (&tty->ldisc_sem){++++}-{0:0}, at: tty_ldisc_ref_wait+0x24/0x50
[247390.433001] #1: ffffc900010fd2e8 (&ldata->atomic_read_lock){+.+.}-{3:3}, at: n_tty_read+0x49e/0x660
[247390.433024] 1 lock held by htop/341462:
[247390.433032] 2 locks held by kworker/u8:2/358488:
[247390.433035] #0: ffff888100106938 ((wq_completion)events_unbound){+.+.}-{0:0}, at: process_one_work+0x1c1/0x550
[247390.433053] #1: ffffc90003797e70 ((work_completion)(&ctx->exit_work)){+.+.}-{0:0}, at: process_one_work+0x1c1/0x550
[247390.433071] 2 locks held by kworker/u8:0/359615:
[247390.433075] #0: ffff888100106938 ((wq_completion)events_unbound){+.+.}-{0:0}, at: process_one_work+0x1c1/0x550
[247390.433092] #1: ffffc90003597e70 ((work_completion)(&ctx->exit_work)){+.+.}-{0:0}, at: process_one_work+0x1c1/0x550
[247390.433110] 1 lock held by dmesg/361178:
[247390.433113] #0: ffff88810b5300d0 (&user->lock){+.+.}-{3:3}, at: devkmsg_read+0x4b/0x230
[247390.433134] =============================================
Cc: Pavel Begunkov <asml.silence@gmail.com>
Link: https://github.com/axboe/liburing/issues/448
Signed-off-by: Ammar Faizi <ammar.faizi@students.amikom.ac.id>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Diffstat (limited to 'test/Makefile')
-rw-r--r-- | test/Makefile | 1 |
1 files changed, 1 insertions, 0 deletions
diff --git a/test/Makefile b/test/Makefile index f7eafad..2af684a 100644 --- a/test/Makefile +++ b/test/Makefile @@ -83,6 +83,7 @@ test_srcs := \ io_uring_enter.c \ io_uring_register.c \ io_uring_setup.c \ + kworker-hang.c \ lfs-openat.c \ lfs-openat-write.c \ link.c \ |