Pavel Begunkov [Thu, 1 Apr 2021 14:44:04 +0000 (15:44 +0100)]
io_uring: encapsulate fixed files into struct
Add struct io_fixed_file representing a single registered file, first to
hide ugly struct file **, which may be misleading, and secondly to
retype it to unsigned long as conversions to it and back to file * for
handling and masking FFS_* flags are getting nasty.
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/78669731a605a7614c577c3de552631cfaf0869a.1617287883.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Pavel Begunkov [Thu, 1 Apr 2021 14:44:03 +0000 (15:44 +0100)]
io_uring: refactor file tables alloc/free
Introduce a heler io_free_file_tables() doing all the cleaning, there
are several places where it's hand coded. Also move all allocations into
io_sqe_alloc_file_tables() and rename it, so all of it is in one place.
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/502a84ebf41ff119b095e59661e678eacb752bf8.1617287883.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Pavel Begunkov [Thu, 1 Apr 2021 14:44:02 +0000 (15:44 +0100)]
io_uring: don't quiesce intial files register
There is no reason why we would want to fully quiesce ring on
IORING_REGISTER_FILES, if it's already registered we fail.
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/563bb8060bb2d3efbc32fce6101678281c574d2a.1617287883.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Pavel Begunkov [Thu, 1 Apr 2021 14:44:01 +0000 (15:44 +0100)]
io_uring: set proper FFS* flags on reg file update
Set FFS_* flags (e.g. FFS_ASYNC_READ) not only in initial registration
but also on registered files update. Not a bug, but may miss getting
profit out of the feature.
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/df29a841a2d3d3695b509cdffce5070777d9d942.1617287883.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Pavel Begunkov [Thu, 1 Apr 2021 14:44:00 +0000 (15:44 +0100)]
io_uring: deduplicate NOSIGNAL setting
Set MSG_NOSIGNAL and REQ_F_NOWAIT in send/recv prep routines and don't
duplicate it in all four send/recv handlers.
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/e1133a3ed1c0e192975b7341ea4b0bf91f63b132.1617287883.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Pavel Begunkov [Thu, 1 Apr 2021 14:43:59 +0000 (15:43 +0100)]
io_uring: put link timeout req consistently
Don't put linked timeout req in io_async_find_and_cancel() but do it in
io_link_timeout_fn(), so we have only one point for that and won't have
to do it differently as it's now (put vs put_deferred). Btw, improve a
bit io_async_find_and_cancel()'s locking.
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/d75b70957f245275ab7cba83e0ac9c1b86aae78a.1617287883.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Pavel Begunkov [Thu, 1 Apr 2021 14:43:58 +0000 (15:43 +0100)]
io_uring: simplify overflow handling
Overflowed CQEs doesn't lock requests anymore, so we don't care so much
about cancelling them, so kill cq_overflow_flushed and simplify the
code.
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/5799867aeba9e713c32f49aef78e5e1aef9fbc43.1617287883.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Pavel Begunkov [Thu, 1 Apr 2021 14:43:57 +0000 (15:43 +0100)]
io_uring: lock annotate timeouts and poll
Add timeout and poll ->comletion_lock annotations for Sparse, makes life
easier while looking at the functions.
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/2345325643093d41543383ba985a735aeb899eac.1617287883.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Pavel Begunkov [Thu, 1 Apr 2021 14:43:56 +0000 (15:43 +0100)]
io_uring: kill unused forward decls
Kill unused forward declarations for io_ring_file_put() and
io_queue_next(). Also btw rename the first one.
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/64aa27c3f9662e14615cc119189f5eaf12989671.1617287883.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Pavel Begunkov [Thu, 1 Apr 2021 14:43:55 +0000 (15:43 +0100)]
io_uring: store reg buffer end instead of length
It's a bit more convenient for us to store a registered buffer end
address instead of length, see struct io_mapped_ubuf, as it allow to not
recompute it every time.
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/39164403fe92f1dc437af134adeec2423cdf9395.1617287883.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Pavel Begunkov [Thu, 1 Apr 2021 14:43:54 +0000 (15:43 +0100)]
io_uring: improve import_fixed overflow checks
Replace a hand-coded overflow check with a specialised function. Even
though compilers are smart enough to generate identical binary (i.e.
check carry bit), but it's more foolproof and conveys the intention
better.
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/e437dcdc929bacbb6f11a4824ecbbf17225cb82a.1617287883.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Pavel Begunkov [Thu, 1 Apr 2021 14:43:53 +0000 (15:43 +0100)]
io_uring: refactor io_async_cancel()
Remove extra tctx==NULL checks that are already done by
io_async_cancel_one().
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/70c2a8b958d942e86958a28af0452966ce1095b0.1617287883.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Pavel Begunkov [Thu, 1 Apr 2021 14:43:52 +0000 (15:43 +0100)]
io_uring: remove unused hash_wait
No users of io_uring_ctx::hash_wait left, kill it.
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/e25cb83c233a5f75f15275596b49fbafbea606fa.1617287883.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Pavel Begunkov [Thu, 1 Apr 2021 14:43:51 +0000 (15:43 +0100)]
io_uring: better ref handling in poll_remove_one
Instead of io_put_req() to drop not a final ref, use req_ref_put(),
which is slimmer and will also check the invariant.
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/85b5774ce13ae55cc2e705abdc8cbafe1212f1bd.1617287883.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Pavel Begunkov [Thu, 1 Apr 2021 14:43:50 +0000 (15:43 +0100)]
io_uring: combine lock/unlock sections on exit
io_ring_exit_work() already does uring_lock lock/unlock, no need to
repeat it for lock waiting trick in io_ring_ctx_free(). Move the waiting
with comments and spinlocking into io_ring_exit_work.
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/a8ae0589b0ea64ad4791e2c282e4e9b713dd7024.1617287883.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Pavel Begunkov [Thu, 1 Apr 2021 14:43:48 +0000 (15:43 +0100)]
io_uring: remove useless is_dying check on quiesce
rsrc_data refs should always be valid for potential submitters,
io_rsrc_ref_quiesce() restores it before unlocking, so
percpu_ref_is_dying() check in io_sqe_files_unregister() does nothing
and misleading. Concurrent quiesce is prevented with
struct io_rsrc_data::quiesce.
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/bf97055e1748ee3a382e66daf384a469eb90b931.1617287883.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Pavel Begunkov [Thu, 1 Apr 2021 14:43:47 +0000 (15:43 +0100)]
io_uring: reuse io_rsrc_node_destroy()
Reuse io_rsrc_node_destroy() in __io_rsrc_put_work(). Also move it to a
more appropriate place -- to the other node routines, and remove forward
declaration.
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/cccafba41aee1e5bb59988704885b1340aef3a27.1617287883.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Pavel Begunkov [Thu, 1 Apr 2021 14:43:46 +0000 (15:43 +0100)]
io_uring: ctx-wide rsrc nodes
If we're going to ever support multiple types of resources we need
shared rsrc nodes to not bloat requests, that is implemented in this
patch. It also gives a nicer API and saves one pointer dereference
in io_req_set_rsrc_node().
We may say that all requests bound to a resource belong to one and only
one rsrc node, and considering that nodes are removed and recycled
strictly in-order, this separates requests into generations, where
generation are changed on each node switch (i.e. io_rsrc_node_switch()).
The API is simple, io_rsrc_node_switch() switches to a new generation if
needed, and also optionally kills a passed in io_rsrc_data. Each call to
io_rsrc_node_switch() have to be preceded with
io_rsrc_node_switch_start(). The start function is idempotent and should
not necessarily be followed by switch.
One difference is that once a node was set it will always retain a valid
rsrc node, even on unregister. It may be a nuisance at the moment, but
makes much sense for multiple types of resources. Another thing changed
is that nodes are bound to/associated with a io_rsrc_data later just
before killing (i.e. switching).
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/7e9c693b4b9a2f47aa784b616ce29843021bb65a.1617287883.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Pavel Begunkov [Thu, 1 Apr 2021 14:43:45 +0000 (15:43 +0100)]
io_uring: refactor io_queue_rsrc_removal()
Pass rsrc_node into io_queue_rsrc_removal() explicitly. Just a
simple preparation patch, makes following changes nicer.
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/002889ce4de7baf287f2b010eef86ffe889174c6.1617287883.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Pavel Begunkov [Thu, 1 Apr 2021 14:43:44 +0000 (15:43 +0100)]
io_uring: move rsrc_put callback into io_rsrc_data
io_rsrc_node's callback operates only on a single io_rsrc_data and only
with its resources, so rsrc_put() callback is actually a property of
io_rsrc_data. Move it there, it makes code much nicecr.
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/9417c2fba3c09e8668f05747006a603d416d34b4.1617287883.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Pavel Begunkov [Thu, 1 Apr 2021 14:43:43 +0000 (15:43 +0100)]
io_uring: encapsulate rsrc node manipulations
io_rsrc_node_get() and io_rsrc_node_set() are always used together,
merge them into one so most users don't even see io_rsrc_node and don't
need to care about it.
It helped to catch io_sqe_files_register() inferring rsrc data argument
for get and set differently, not a problem but a good sign.
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/0827b080b2e61b3dec795380f7e1a1995595d41f.1617287883.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Pavel Begunkov [Thu, 1 Apr 2021 14:43:42 +0000 (15:43 +0100)]
io_uring: use rsrc prealloc infra for files reg
Keep it consistent with update and use io_rsrc_node_prealloc() +
io_rsrc_node_get() in io_sqe_files_register() as well, that will be used
in future patches, not as error prone and allows to deduplicate
rsrc_node init.
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/cf87321e6be5e38f4dc7fe5079d2aa6945b1ace0.1617287883.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Pavel Begunkov [Thu, 1 Apr 2021 14:43:41 +0000 (15:43 +0100)]
io_uring: simplify io_rsrc_node_ref_zero
Replace queue_delayed_work() with mod_delayed_work() in
io_rsrc_node_ref_zero() as the later one can schedule a new work, and
cleanup it further for better readability.
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/3b2b23e3a1ea4bbf789cd61815d33e05d9ff945e.1617287883.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Pavel Begunkov [Thu, 1 Apr 2021 14:43:40 +0000 (15:43 +0100)]
io_uring: name rsrc bits consistently
Keep resource related structs' and functions' naming consistent, in
particular use "io_rsrc" prefix for everything.
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/962f5acdf810f3a62831e65da3932cde24f6d9df.1617287883.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Jens Axboe [Fri, 2 Apr 2021 01:57:07 +0000 (19:57 -0600)]
io-wq: cancel task_work on exit only targeting the current 'wq'
With using task_work_cancel(), we're potentially canceling task_work
that isn't related to this specific io_wq. Use the newly added
task_work_cancel_match() to ensure that we only remove and cancel work
items that are specific to this io_wq.
Fixes:
6c40d316ff3a ("io-wq: eliminate the need for a manager thread")
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Jens Axboe [Fri, 2 Apr 2021 01:53:29 +0000 (19:53 -0600)]
task_work: add helper for more targeted task_work canceling
The only exported helper we have right now is task_work_cancel(), which
cancels any task_work from a given task where func matches the queued
work item. This is a bit too coarse for some use cases. Add a
task_work_cancel_match() that allows to more specifically target
individual work items outside of purely the callback function used.
task_work_cancel() can be trivially implemented on top of that, hence do
so.
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Jens Axboe [Wed, 31 Mar 2021 15:03:03 +0000 (09:03 -0600)]
io_uring: fix race around poll update and poll triggering
Joakim reports that in some conditions he sees a multishot poll request
being canceled, and that it coincides with getting -EALREADY on
modification. As part of the poll update procedure, there's a small window
where the request is marked as canceled, and if this coincides with the
event actually triggering, then we can get a spurious -ECANCELED and
termination of the multishot request.
Don't mark the poll request as being canceled for update. We also don't
care if we race on removal unless it's a one-shot request, we can safely
updated for either case.
Fixes:
4d636d877e82 ("io_uring: allow events and user_data update of running poll requests")
Reported-by: Joakim Hassila <joj@mac.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Pavel Begunkov [Wed, 24 Mar 2021 22:59:01 +0000 (22:59 +0000)]
io_uring: reg buffer overflow checks hardening
We are safe with overflows in io_sqe_buffer_register() because it will
just yield alloc failure, but it's nicer to check explicitly.
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/2b0625551be3d97b80a5fd21c8cd79dc1c91f0b5.1616624589.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Jens Axboe [Thu, 25 Mar 2021 16:21:35 +0000 (10:21 -0600)]
io_uring: allow SQPOLL without CAP_SYS_ADMIN or CAP_SYS_NICE
Now that we have any worker being attached to the original task as
threads, accounting of CPU time is directly attributed to the original
task as well. This means that we no longer have to restrict SQPOLL to
needing elevated privileges, as it's really no different from just having
the task spawn a busy looping thread in userspace.
Reported-by: Stefano Garzarella <sgarzare@redhat.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Jens Axboe [Mon, 8 Mar 2021 16:37:51 +0000 (09:37 -0700)]
io-wq: eliminate the need for a manager thread
io-wq relies on a manager thread to create/fork new workers, as needed.
But there's really no strong need for it anymore. We have the following
cases that fork a new worker:
1) Work queue. This is done from the task itself always, and it's trivial
to create a worker off that path, if needed.
2) All workers have gone to sleep, and we have more work. This is called
off the sched out path. For this case, use a task_work items to queue
a fork-worker operation.
3) Hashed work completion. Don't think we need to do anything off this
case. If need be, it could just use approach 2 as well.
Part of this change is incrementing the running worker count before the
fork, to avoid cases where we observe we need a worker and then queue
creation of one. Then new work comes in, we fork a new one. That last
queue operation should have waited for the previous worker to come up,
it's quite possible we don't even need it. Hence move the worker running
from before we fork it off to more efficiently handle that case.
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Jens Axboe [Mon, 22 Mar 2021 15:39:12 +0000 (09:39 -0600)]
kernel: allow fork with TIF_NOTIFY_SIGNAL pending
fork() fails if signal_pending() is true, but there are two conditions
that can lead to that:
1) An actual signal is pending. We want fork to fail for that one, like
we always have.
2) TIF_NOTIFY_SIGNAL is pending, because the task has pending task_work.
We don't need to make it fail for that case.
Allow fork() to proceed if just task_work is pending, by changing the
signal_pending() check to task_sigpending().
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Jens Axboe [Wed, 17 Mar 2021 14:37:41 +0000 (08:37 -0600)]
io_uring: allow events and user_data update of running poll requests
This adds two new POLL_ADD flags, IORING_POLL_UPDATE_EVENTS and
IORING_POLL_UPDATE_USER_DATA. As with the other POLL_ADD flag, these are
masked into sqe->len. If set, the POLL_ADD will have the following
behavior:
- sqe->addr must contain the the user_data of the poll request that
needs to be modified. This field is otherwise invalid for a POLL_ADD
command.
- If IORING_POLL_UPDATE_EVENTS is set, sqe->poll_events must contain the
new mask for the existing poll request. There are no checks for whether
these are identical or not, if a matching poll request is found, then it
is re-armed with the new mask.
- If IORING_POLL_UPDATE_USER_DATA is set, sqe->off must contain the new
user_data for the existing poll request.
A POLL_ADD with any of these flags set may complete with any of the
following results:
1) 0, which means that we successfully found the existing poll request
specified, and performed the re-arm procedure. Any error from that
re-arm will be exposed as a completion event for that original poll
request, not for the update request.
2) -ENOENT, if no existing poll request was found with the given
user_data.
3) -EALREADY, if the existing poll request was already in the process of
being removed/canceled/completing.
4) -EACCES, if an attempt was made to modify an internal poll request
(eg not one originally issued ass IORING_OP_POLL_ADD).
The usual -EINVAL cases apply as well, if any invalid fields are set
in the sqe for this command type.
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Jens Axboe [Wed, 17 Mar 2021 14:17:19 +0000 (08:17 -0600)]
io_uring: abstract out a io_poll_find_helper()
We'll need this helper for another purpose, for now just abstract it
out and have io_poll_cancel() use it for lookups.
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Jens Axboe [Tue, 23 Feb 2021 16:02:26 +0000 (09:02 -0700)]
io_uring: terminate multishot poll for CQ ring overflow
If we hit overflow and fail to allocate an overflow entry for the
completion, terminate the multishot poll mode.
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Jens Axboe [Tue, 23 Feb 2021 15:58:04 +0000 (08:58 -0700)]
io_uring: abstract out helper for removing poll waitqs/hashes
No functional changes in this patch, just preparation for kill multishot
poll on CQ overflow.
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Jens Axboe [Tue, 23 Feb 2021 05:08:01 +0000 (22:08 -0700)]
io_uring: add multishot mode for IORING_OP_POLL_ADD
The default io_uring poll mode is one-shot, where once the event triggers,
the poll command is completed and won't trigger any further events. If
we're doing repeated polling on the same file or socket, then it can be
more efficient to do multishot, where we keep triggering whenever the
event becomes true.
This deviates from the usual norm of having one CQE per SQE submitted. Add
a CQE flag, IORING_CQE_F_MORE, which tells the application to expect
further completion events from the submitted SQE. Right now the only user
of this is POLL_ADD in multishot mode.
Since sqe->poll_events is using the space that we normally use for adding
flags to commands, use sqe->len for the flag space for POLL_ADD. Multishot
mode is selected by setting IORING_POLL_ADD_MULTI in sqe->len. An
application should expect more CQEs for the specificed SQE if the CQE is
flagged with IORING_CQE_F_MORE. In multishot mode, only cancelation or an
error will terminate the poll request, in which case the flag will be
cleared.
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Jens Axboe [Tue, 23 Feb 2021 05:05:00 +0000 (22:05 -0700)]
io_uring: include cflags in completion trace event
We should be including the completion flags for better introspection on
exactly what completion event was logged.
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Pavel Begunkov [Tue, 23 Feb 2021 12:40:22 +0000 (12:40 +0000)]
io_uring: allocate memory for overflowed CQEs
Instead of using a request itself for overflowed CQE stashing, allocate a
separate entry. The disadvantage is that the allocation may fail and it
will be accounted as lost (see rings->cq_overflow), so we lose reliability
in case of memory pressure if the application is driving the CQ ring into
overflow. However, it opens a way for for multiple CQEs per an SQE and
even generating SQE-less CQEs.
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
[axboe: use GFP_ATOMIC | __GFP_ACCOUNT]
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Jens Axboe [Fri, 19 Mar 2021 20:06:24 +0000 (14:06 -0600)]
io_uring: mask in error/nval/hangup consistently for poll
Instead of masking these in as part of regular POLL_ADD prep, do it in
io_init_poll_iocb(), and include NVAL as that's generally unmaskable,
and RDHUP alongside the HUP that is already set.
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Pavel Begunkov [Mon, 22 Mar 2021 01:58:34 +0000 (01:58 +0000)]
io_uring: optimise rw complete error handling
Expect read/write to succeed and create a hot path for this case, in
particular hide all error handling with resubmission under a single
check with the desired result.
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Pavel Begunkov [Mon, 22 Mar 2021 01:58:33 +0000 (01:58 +0000)]
io_uring: hide iter revert in resubmit_prep
Move iov_iter_revert() resetting iterator in case of -EIOCBQUEUED into
io_resubmit_prep(), so we don't do heavy revert in hot path, also saves
a couple of checks.
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Pavel Begunkov [Mon, 22 Mar 2021 01:58:32 +0000 (01:58 +0000)]
io_uring: don't alter iopoll reissue fail ret code
When reissue_prep failed in io_complete_rw_iopoll(), we change return
code to -EIO to prevent io_iopoll_complete() from doing resubmission.
Mark requests with a new flag (i.e. REQ_F_DONT_REISSUE) instead and
retain the original return value.
It also removes io_rw_reissue() from io_iopoll_complete() that will be
used later.
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Pavel Begunkov [Mon, 22 Mar 2021 01:58:31 +0000 (01:58 +0000)]
io_uring: optimise kiocb_end_write for !ISREG
file_end_write() is only for regular files, so the function do a couple
of dereferences to get inode and check for it. However, we already have
REQ_F_ISREG at hand, just use it and inline file_end_write().
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Pavel Begunkov [Mon, 22 Mar 2021 01:58:30 +0000 (01:58 +0000)]
io_uring: kill unused REQ_F_NO_FILE_TABLE
current->files are always valid now even for io-wq threads, so kill not
used anymore REQ_F_NO_FILE_TABLE.
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Pavel Begunkov [Mon, 22 Mar 2021 01:58:29 +0000 (01:58 +0000)]
io_uring: don't init req->work fully in advance
req->work is mostly unused unless it's punted, and io_init_req() is too
hot for fully initialising it. Fortunately, we can skip init work.next
as it's controlled by io-wq, and can not touch work.flags by moving
everything related into io_prep_async_work(). The only field left is
req->work.creds, but there is nothing can be done, keep maintaining it.
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Pavel Begunkov [Mon, 22 Mar 2021 01:58:28 +0000 (01:58 +0000)]
io-wq: refactor *_get_acct()
Extract a helper for io_work_get_acct() and io_wqe_get_acct() to avoid
duplication.
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Pavel Begunkov [Mon, 22 Mar 2021 01:58:27 +0000 (01:58 +0000)]
io_uring: remove tctx->sqpoll
struct io_uring_task::sqpoll is not used anymore, kill it
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Pavel Begunkov [Mon, 22 Mar 2021 01:58:25 +0000 (01:58 +0000)]
io_uring: don't do extra EXITING cancellations
io_match_task() matches all requests with PF_EXITING task, even though
those may be valid requests. It was necessary for SQPOLL cancellation,
but now it kills all requests before exiting via
io_uring_cancel_sqpoll(), so it's not needed.
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Pavel Begunkov [Mon, 22 Mar 2021 01:58:24 +0000 (01:58 +0000)]
io_uring: don't clear REQ_F_LINK_TIMEOUT
REQ_F_LINK_TIMEOUT is a hint that to look for linked timeouts to cancel,
we're leaving it even when it's already fired. Hence don't care to clear
it in io_kill_linked_timeout(), it's safe and is called only once.
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Pavel Begunkov [Fri, 19 Mar 2021 17:22:44 +0000 (17:22 +0000)]
io_uring: optimise io_req_task_work_add()
Inline io_task_work_add() into io_req_task_work_add(). They both work
with a request, so keeping them separate doesn't make things much more
clear, but merging allows optimise it. Apart from small wins like not
reading req->ctx or not calculating @notify in the hot path, i.e. with
tctx->task_state set, it avoids doing wake_up_process() for every single
add, but only after actually done task_work_add().
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Pavel Begunkov [Fri, 19 Mar 2021 17:22:43 +0000 (17:22 +0000)]
io_uring: abolish old io_put_file()
io_put_file() doesn't do a good job at generating a good code. Inline
it, so we can check REQ_F_FIXED_FILE first, prioritising FIXED_FILE case
over requests without files, and saving a memory load in that case.
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Pavel Begunkov [Fri, 19 Mar 2021 17:22:42 +0000 (17:22 +0000)]
io_uring: optimise io_dismantle_req() fast path
Reshuffle io_dismantle_req() checks to put most of slow path stuff under
a single if.
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Pavel Begunkov [Fri, 19 Mar 2021 17:22:41 +0000 (17:22 +0000)]
io_uring: inline io_clean_op()'s fast path
Inline io_clean_op(), leaving __io_clean_op() but renaming it. This will
be used in following patches.
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Pavel Begunkov [Fri, 19 Mar 2021 17:22:40 +0000 (17:22 +0000)]
io_uring: remove __io_req_task_cancel()
Both io_req_complete_failed() and __io_req_task_cancel() do the same
thing: set failure flag, put both req refs and emit an CQE. The former
one is a bit more advance as it puts req back into a req cache, so make
it to take over __io_req_task_cancel() and remove the last one.
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Pavel Begunkov [Fri, 19 Mar 2021 17:22:39 +0000 (17:22 +0000)]
io_uring: add helper flushing locked_free_list
Add a new helper io_flush_cached_locked_reqs() that splices
locked_free_list to free_list, and does it right doing all sync and
invariant reinit.
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Pavel Begunkov [Fri, 19 Mar 2021 17:22:38 +0000 (17:22 +0000)]
io_uring: refactor io_free_req_deferred()
We don't care about ret value in io_free_req_deferred(), make the code a
bit more concise.
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Pavel Begunkov [Fri, 19 Mar 2021 17:22:37 +0000 (17:22 +0000)]
io_uring: inline io_put_req and friends
One big omission is that io_put_req() haven't been marked inline, and at
least gcc 9 doesn't inline it, not to mention that it's really hot and
extra function call is intolerable, especially when it doesn't put a
final ref.
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Pavel Begunkov [Fri, 19 Mar 2021 17:22:36 +0000 (17:22 +0000)]
io_uring: refactor rsrc refnode allocation
There are two problems:
1) we always allocate refnodes in advance and free them if those
haven't been used. It's expensive, takes two allocations, where one of
them is percpu. And it may be pretty common not actually using them.
2) Current API with allocating a refnode and setting some of the fields
is error prone, we don't ever want to have a file node runninng fixed
buffer callback...
Solve both with pre-init/get API. Pre-init just leaves the node for
later if not used, and for get (i.e. io_rsrc_refnode_get()), you need to
explicitly pass all arguments setting callbacks/etc., so it's more
resilient.
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Pavel Begunkov [Fri, 19 Mar 2021 17:22:35 +0000 (17:22 +0000)]
io_uring: refactor io_flush_cached_reqs()
Emphasize that return value of io_flush_cached_reqs() depends on number
of requests in the cache. It looks nicer and might help tools from
false-negative analyses.
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Pavel Begunkov [Fri, 19 Mar 2021 17:22:34 +0000 (17:22 +0000)]
io_uring: optimise success case of __io_queue_sqe
Move the case of successfully issued request by doing that check first.
It's not much of a difference, just generates slightly better code for
me.
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Pavel Begunkov [Fri, 19 Mar 2021 17:22:33 +0000 (17:22 +0000)]
io_uring: inline __io_queue_linked_timeout()
Inline __io_queue_linked_timeout(), we don't need it
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Pavel Begunkov [Fri, 19 Mar 2021 17:22:32 +0000 (17:22 +0000)]
io_uring: keep io_req_free_batch() call locality
Don't do a function call (io_dismantle_req()) in the middle and place it
to near other function calls, otherwise may lead to excessive register
spilling.
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Pavel Begunkov [Fri, 19 Mar 2021 17:22:31 +0000 (17:22 +0000)]
io_uring: optimise tctx node checks/alloc
First of all, w need to set tctx->sqpoll only when we add a new entry
into ->xa, so move it from the hot path. Also extract a hot path for
io_uring_add_task_file() as an inline helper.
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Pavel Begunkov [Fri, 19 Mar 2021 17:22:30 +0000 (17:22 +0000)]
io_uring: optimise io_uring_enter()
Add unlikely annotations, because my compiler pretty much mispredicts
every first check, and apart jumping around in the fast path, it also
generates extra instructions, like in advance setting ret value.
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Pavel Begunkov [Fri, 19 Mar 2021 17:22:29 +0000 (17:22 +0000)]
io_uring: don't take ctx refs in task_work handler
__tctx_task_work() guarantees that ctx won't be killed while running
task_works, so we can remove now unnecessary ctx pinning for internally
armed polling.
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Jens Axboe [Tue, 23 Feb 2021 15:19:33 +0000 (08:19 -0700)]
io_uring: transform ret == 0 for poll cancelation completions
We can set canceled == true and complete out-of-line, ensure that we catch
that and correctly return -ECANCELED if the poll operation got canceled.
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Jens Axboe [Tue, 23 Feb 2021 15:18:36 +0000 (08:18 -0700)]
io_uring: correct comment on poll vs iopoll
The correct function is io_iopoll_complete(), which deals with completions
of IOPOLL requests, not io_poll_complete().
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Jens Axboe [Fri, 12 Mar 2021 15:30:14 +0000 (08:30 -0700)]
io_uring: cache async and regular file state for fixed files
We have to dig quite deep to check for particularly whether or not a
file supports a fast-path nonblock attempt. For fixed files, we can do
this lookup once and cache the state instead.
This adds two new bits to track whether we support async read/write
attempt, and lines up the REQ_F_ISREG bit with those two. The file slot
re-uses the last 3 (or 2, for 32-bit) of the file pointer to cache that
state, and then we mask it in when we go and use a fixed file.
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Jens Axboe [Fri, 12 Mar 2021 15:27:05 +0000 (08:27 -0700)]
io_uring: don't check for io_uring_fops for fixed files
We don't allow them at registration time, so limit the check for needing
inflight tracking in io_file_get() to the non-fixed path.
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Pavel Begunkov [Wed, 10 Mar 2021 13:13:55 +0000 (13:13 +0000)]
io_uring: simplify io_sqd_update_thread_idle()
Use a more comprehensible() max instead of hand coding it with ifs in
io_sqd_update_thread_idle().
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Jens Axboe [Wed, 24 Feb 2021 20:32:30 +0000 (13:32 -0700)]
io_uring: switch to atomic_t for io_kiocb reference count
io_uring manipulates references twice for each request, and hence is very
sensitive to performance of the reference count. This commit borrows a
trick from:
commit
f958d7b528b1b40c44cfda5eabe2d82760d868c3
Author: Linus Torvalds <torvalds@linux-foundation.org>
Date: Thu Apr 11 10:06:20 2019 -0700
mm: make page ref count overflow check tighter and more explicit
and switches to atomic_t for references, while still retaining overflow
and underflow checks.
This is good for a 2-3% increase in peak IOPS on a single core. Before:
IOPS=
2970879, IOS/call=31/31, inflight=128 (128)
IOPS=
2952597, IOS/call=31/31, inflight=128 (128)
IOPS=
2943904, IOS/call=31/31, inflight=128 (128)
IOPS=
2930006, IOS/call=31/31, inflight=96 (96)
and after:
IOPS=
3054354, IOS/call=31/31, inflight=128 (128)
IOPS=
3059038, IOS/call=31/31, inflight=128 (128)
IOPS=
3060320, IOS/call=31/31, inflight=128 (128)
IOPS=
3068256, IOS/call=31/31, inflight=96 (96)
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Jens Axboe [Wed, 24 Feb 2021 20:28:27 +0000 (13:28 -0700)]
io_uring: wrap io_kiocb reference count manipulation in helpers
No functional changes in this patch, just in preparation for handling the
references a bit more efficiently.
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Pavel Begunkov [Sun, 28 Feb 2021 22:35:20 +0000 (22:35 +0000)]
io_uring: simplify io_resubmit_prep()
If not for async_data NULL check, io_resubmit_prep() is already an rw
specific version of io_req_prep_async(), but slower because 1) it always
goes through io_import_iovec() even if following io_setup_async_rw() the
result 2) instead of initialising iovec/iter in-place it does it
on-stack and then copies with io_setup_async_rw().
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Pavel Begunkov [Sun, 28 Feb 2021 22:35:19 +0000 (22:35 +0000)]
io_uring: merge defer_prep() and prep_async()
Merge two function and do renaming in favour of the second one, it
relays the meaning better.
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Pavel Begunkov [Sun, 28 Feb 2021 22:35:18 +0000 (22:35 +0000)]
io_uring: rethink def->needs_async_data
needs_async_data controls allocation of async_data, and used in two
cases. 1) when async setup requires it (by io_req_prep_async() or
handler themselves), and 2) when op always needs additional space to
operate, like timeouts do.
Opcode preps already don't bother about the second case and do
allocation unconditionally, restrict needs_async_data to the first case
only and rename it into needs_async_setup.
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
[axboe: update for IOPOLL fix]
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Pavel Begunkov [Sun, 28 Feb 2021 22:35:17 +0000 (22:35 +0000)]
io_uring: untie alloc_async_data and needs_async_data
All opcode handlers pretty well know whether they need async data or
not, and can skip testing for needs_async_data. The exception is rw
the generic path, but those test the flag by hand anyway. So, check the
flag and make io_alloc_async_data() allocating unconditionally.
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Pavel Begunkov [Sun, 28 Feb 2021 22:35:16 +0000 (22:35 +0000)]
io_uring: refactor out send/recv async setup
IORING_OP_[SEND,RECV] don't need async setup neither will get into
io_req_prep_async(). Remove them from io_req_prep_async() and remove
needs_async_data checks from the related setup functions.
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Pavel Begunkov [Sun, 28 Feb 2021 22:35:15 +0000 (22:35 +0000)]
io_uring: use better types for cflags
__io_cqring_fill_event() takes cflags as long to squeeze it into u32 in
an CQE, awhile all users pass int or unsigned. Replace it with unsigned
int and store it as u32 in struct io_completion to match CQE.
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Pavel Begunkov [Sun, 28 Feb 2021 22:35:13 +0000 (22:35 +0000)]
io_uring: refactor provide/remove buffer locking
Always complete request holding the mutex instead of doing that strange
dancing with conditional ordering.
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Pavel Begunkov [Sun, 28 Feb 2021 22:35:12 +0000 (22:35 +0000)]
io_uring: add a helper failing not issued requests
Add a simple helper doing CQE posting, marking request for link-failure,
and putting both submission and completion references.
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Pavel Begunkov [Sun, 28 Feb 2021 22:35:11 +0000 (22:35 +0000)]
io_uring: further deduplicate file slot selection
io_fixed_file_slot() and io_file_from_index() behave pretty similarly,
DRY and call one from another.
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Pavel Begunkov [Sun, 28 Feb 2021 22:35:10 +0000 (22:35 +0000)]
io_uring: reuse io_req_task_queue_fail()
Use io_req_task_queue_fail() on the fail path of io_req_task_queue().
It's unlikely to happen, so don't care about additional overhead, but
allows to keep all the req->result invariant in a single function.
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Pavel Begunkov [Sun, 28 Feb 2021 22:35:09 +0000 (22:35 +0000)]
io_uring: avoid taking ctx refs for task-cancel
Don't bother to take a ctx->refs for io_req_task_cancel() because it
take uring_lock before putting a request, and the context is promised to
stay alive until unlock happens.
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Linus Torvalds [Sun, 4 Apr 2021 21:15:36 +0000 (14:15 -0700)]
Linux 5.12-rc6
Zheyu Ma [Sat, 3 Apr 2021 06:58:36 +0000 (06:58 +0000)]
firewire: nosy: Fix a use-after-free bug in nosy_ioctl()
For each device, the nosy driver allocates a pcilynx structure.
A use-after-free might happen in the following scenario:
1. Open nosy device for the first time and call ioctl with command
NOSY_IOC_START, then a new client A will be malloced and added to
doubly linked list.
2. Open nosy device for the second time and call ioctl with command
NOSY_IOC_START, then a new client B will be malloced and added to
doubly linked list.
3. Call ioctl with command NOSY_IOC_START for client A, then client A
will be readded to the doubly linked list. Now the doubly linked
list is messed up.
4. Close the first nosy device and nosy_release will be called. In
nosy_release, client A will be unlinked and freed.
5. Close the second nosy device, and client A will be referenced,
resulting in UAF.
The root cause of this bug is that the element in the doubly linked list
is reentered into the list.
Fix this bug by adding a check before inserting a client. If a client
is already in the linked list, don't insert it.
The following KASAN report reveals it:
BUG: KASAN: use-after-free in nosy_release+0x1ea/0x210
Write of size 8 at addr
ffff888102ad7360 by task poc
CPU: 3 PID: 337 Comm: poc Not tainted 5.12.0-rc5+ #6
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS
rel-1.12.0-59-gc9ba5276e321-prebuilt.qemu.org 04/01/2014
Call Trace:
nosy_release+0x1ea/0x210
__fput+0x1e2/0x840
task_work_run+0xe8/0x180
exit_to_user_mode_prepare+0x114/0x120
syscall_exit_to_user_mode+0x1d/0x40
entry_SYSCALL_64_after_hwframe+0x44/0xae
Allocated by task 337:
nosy_open+0x154/0x4d0
misc_open+0x2ec/0x410
chrdev_open+0x20d/0x5a0
do_dentry_open+0x40f/0xe80
path_openat+0x1cf9/0x37b0
do_filp_open+0x16d/0x390
do_sys_openat2+0x11d/0x360
__x64_sys_open+0xfd/0x1a0
do_syscall_64+0x33/0x40
entry_SYSCALL_64_after_hwframe+0x44/0xae
Freed by task 337:
kfree+0x8f/0x210
nosy_release+0x158/0x210
__fput+0x1e2/0x840
task_work_run+0xe8/0x180
exit_to_user_mode_prepare+0x114/0x120
syscall_exit_to_user_mode+0x1d/0x40
entry_SYSCALL_64_after_hwframe+0x44/0xae
The buggy address belongs to the object at
ffff888102ad7300 which belongs to the cache kmalloc-128 of size 128
The buggy address is located 96 bytes inside of 128-byte region [
ffff888102ad7300,
ffff888102ad7380)
[ Modified to use 'list_empty()' inside proper lock - Linus ]
Link: https://lore.kernel.org/lkml/1617433116-5930-1-git-send-email-zheyuma97@gmail.com/
Reported-and-tested-by: 马哲宇 (Zheyu Ma) <zheyuma97@gmail.com>
Signed-off-by: Zheyu Ma <zheyuma97@gmail.com>
Cc: Greg Kroah-Hartman <greg@kroah.com>
Cc: Stefan Richter <stefanr@s5r6.in-berlin.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Linus Torvalds [Sat, 3 Apr 2021 22:42:45 +0000 (15:42 -0700)]
Merge tag 'for-linus' of git://github.com/openrisc/linux
Pull OpenRISC fix from Stafford Horne:
"Fix duplicate header include in Litex SOC driver"
* tag 'for-linus' of git://github.com/openrisc/linux:
soc: litex: Remove duplicated header file inclusion
Linus Torvalds [Sat, 3 Apr 2021 21:26:47 +0000 (14:26 -0700)]
Merge tag 'io_uring-5.12-2021-04-03' of git://git.kernel.dk/linux-block
POull io_uring fix from Jens Axboe:
"Just fixing a silly braino in a previous patch, where we'd end up
failing to compile if CONFIG_BLOCK isn't enabled.
Not that a lot of people do that, but kernel bot spotted it and it's
probably prudent to just flush this out now before -rc6.
Sorry about that, none of my test compile configs have !CONFIG_BLOCK"
* tag 'io_uring-5.12-2021-04-03' of git://git.kernel.dk/linux-block:
io_uring: fix !CONFIG_BLOCK compilation failure
Zhen Lei [Wed, 31 Mar 2021 13:06:43 +0000 (15:06 +0200)]
soc: litex: Remove duplicated header file inclusion
The header file <linux/errno.h> is already included above and can be
removed here.
Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
Signed-off-by: Mateusz Holenko <mholenko@antmicro.com>
Signed-off-by: Stafford Horne <shorne@gmail.com>
Linus Torvalds [Sat, 3 Apr 2021 19:15:01 +0000 (12:15 -0700)]
Merge tag 'gfs2-v5.12-rc2-fixes2' of git://git./linux/kernel/git/gfs2/linux-gfs2
Pull gfs2 fixes from Andreas Gruenbacher:
"Two more gfs2 fixes"
* tag 'gfs2-v5.12-rc2-fixes2' of git://git.kernel.org/pub/scm/linux/kernel/git/gfs2/linux-gfs2:
gfs2: report "already frozen/thawed" errors
gfs2: Flag a withdraw if init_threads() fails
Linus Torvalds [Sat, 3 Apr 2021 18:52:18 +0000 (11:52 -0700)]
Merge tag 'riscv-for-linus-5.12-rc6' of git://git./linux/kernel/git/riscv/linux
Pull RISC-V fixes from Palmer Dabbelt:
"A handful of fixes for 5.12:
- fix a stack tracing regression related to "const register asm"
variables, which have unexpected behavior.
- ensure the value to be written by put_user() is evaluated before
enabling access to userspace memory..
- align the exception vector table correctly, so we don't rely on the
firmware's handling of unaligned accesses.
- build fix to make NUMA depend on MMU, which triggered on some
randconfigs"
* tag 'riscv-for-linus-5.12-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux:
riscv: Make NUMA depend on MMU
riscv: remove unneeded semicolon
riscv,entry: fix misaligned base for excp_vect_table
riscv: evaluate put_user() arg before enabling user access
riscv: Drop const annotation for sp
Linus Torvalds [Sat, 3 Apr 2021 17:49:38 +0000 (10:49 -0700)]
Merge tag 'powerpc-5.12-5' of git://git./linux/kernel/git/powerpc/linux
Pull powerpc fixes from Michael Ellerman:
"Fix a bug on pseries where spurious wakeups from H_PROD would prevent
partition migration from succeeding.
Fix oopses seen in pcpu_alloc(), caused by parallel faults of the
percpu mapping causing us to corrupt the protection key used for the
mapping, and cause a fatal key fault.
Thanks to Aneesh Kumar K.V, Murilo Opsfelder Araujo, and Nathan Lynch"
* tag 'powerpc-5.12-5' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
powerpc/mm/book3s64: Use the correct storage key value when calling H_PROTECT
powerpc/pseries/mobility: handle premature return from H_JOIN
powerpc/pseries/mobility: use struct for shared state
Linus Torvalds [Sat, 3 Apr 2021 17:42:20 +0000 (10:42 -0700)]
Merge tag 'hyperv-fixes-signed-
20210402' of git://git./linux/kernel/git/hyperv/linux
Pull Hyper-V fixes from Wei Liu:
"One fix from Lu Yunlong for a double free in hvfb_probe"
* tag 'hyperv-fixes-signed-
20210402' of git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux:
video: hyperv_fb: Fix a double free in hvfb_probe
Linus Torvalds [Sat, 3 Apr 2021 17:14:47 +0000 (10:14 -0700)]
Merge tag 'driver-core-5.12-rc6' of git://git./linux/kernel/git/gregkh/driver-core
Pull driver core fix from Greg KH:
"Here is a single driver core fix for a reported problem with differed
probing. It has been in linux-next for a while with no reported
problems"
* tag 'driver-core-5.12-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core:
driver core: clear deferred probe reason on probe retry
Linus Torvalds [Sat, 3 Apr 2021 17:05:16 +0000 (10:05 -0700)]
Merge tag 'char-misc-5.12-rc6' of git://git./linux/kernel/git/gregkh/char-misc
Pull char/misc driver fixes from Greg KH:
"Here are a few small driver char/misc changes for 5.12-rc6.
Nothing major here, a few fixes for reported issues:
- interconnect fixes for problems found
- fbcon syzbot-found fix
- extcon fixes
- firmware stratix10 bugfix
- MAINTAINERS file update.
All of these have been in linux-next for a while with no reported
issues"
* tag 'char-misc-5.12-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc:
drivers: video: fbcon: fix NULL dereference in fbcon_cursor()
mei: allow map and unmap of client dma buffer only for disconnected client
MAINTAINERS: Add linux-phy list and patchwork
interconnect: Fix kerneldoc warning
firmware: stratix10-svc: reset COMMAND_RECONFIG_FLAG_PARTIAL to 0
extcon: Fix error handling in extcon_dev_register
extcon: Add stubs for extcon_register_notifier_all() functions
interconnect: core: fix error return code of icc_link_destroy()
interconnect: qcom: msm8939: remove rpm-ids from non-RPM nodes
Linus Torvalds [Sat, 3 Apr 2021 17:03:51 +0000 (10:03 -0700)]
Merge tag 'staging-5.12-rc6' of git://git./linux/kernel/git/gregkh/staging
Pull staging driver fixes from Greg KH:
"Here are two rtl8192e staging driver fixes for reported problems.
Both of these have been in linux-next for a while with no reported
issues"
* tag 'staging-5.12-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging:
staging: rtl8192e: Change state information from u16 to u8
staging: rtl8192e: Fix incorrect source in memcpy()
Linus Torvalds [Sat, 3 Apr 2021 17:00:53 +0000 (10:00 -0700)]
Merge tag 'tty-5.12-rc6' of git://git./linux/kernel/git/gregkh/tty
Pull serial driver fix from Greg KH:
"Here is a single serial driver fix for 5.12-rc6. Is is a revert of a
change that showed up in 5.9 that has been reported to cause problems.
It has been in linux-next for a while with no reported issues"
* tag 'tty-5.12-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty:
soc: qcom-geni-se: Cleanup the code to remove proxy votes
Linus Torvalds [Sat, 3 Apr 2021 16:56:22 +0000 (09:56 -0700)]
Merge tag 'usb-5.12-rc6' of git://git./linux/kernel/git/gregkh/usb
Pull USB fixes from Greg KH:
"Here are a few small USB driver fixes for 5.12-rc6 to resolve reported
problems.
They include:
- a number of cdc-acm fixes for reported problems. It seems more
people are using this driver lately...
- dwc3 driver fixes for reported problems, and fixes for the fixes :)
- dwc2 driver fixes for reported issues.
- musb driver fix.
- new USB quirk additions.
All of these have been in linux-next for a while with no reported
issues"
* tag 'usb-5.12-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb: (23 commits)
usb: dwc2: Prevent core suspend when port connection flag is 0
usb: dwc2: Fix HPRT0.PrtSusp bit setting for HiKey 960 board.
usb: musb: Fix suspend with devices connected for a64
usb: xhci-mtk: fix broken streams issue on 0.96 xHCI
usb: dwc3: gadget: Clear DEP flags after stop transfers in ep disable
usbip: vhci_hcd fix shift out-of-bounds in vhci_hub_control()
USB: quirks: ignore remote wake-up on Fibocom L850-GL LTE modem
USB: cdc-acm: do not log successful probe on later errors
USB: cdc-acm: always claim data interface
USB: cdc-acm: use negation for NULL checks
USB: cdc-acm: clean up probe error labels
USB: cdc-acm: drop redundant driver-data reset
USB: cdc-acm: drop redundant driver-data assignment
USB: cdc-acm: fix use-after-free after probe failure
USB: cdc-acm: fix double free on probe failure
USB: cdc-acm: downgrade message to debug
USB: cdc-acm: untangle a circular dependency between callback and softint
cdc-acm: fix BREAK rx code path adding necessary calls
usb: gadget: udc: amd5536udc_pci fix null-ptr-dereference
usb: dwc3: pci: Enable dis_uX_susphy_quirk for Intel Merrifield
...
Linus Torvalds [Sat, 3 Apr 2021 16:07:35 +0000 (09:07 -0700)]
Merge tag 'scsi-fixes' of git://git./linux/kernel/git/jejb/scsi
Pull SCSI fix from James Bottomley:
"A single fix to iscsi for a rare race condition which can cause a
kernel panic"
* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
scsi: iscsi: Fix race condition between login and sync thread
Jens Axboe [Sat, 3 Apr 2021 01:45:34 +0000 (19:45 -0600)]
io_uring: fix !CONFIG_BLOCK compilation failure
kernel test robot correctly pinpoints a compilation failure if
CONFIG_BLOCK isn't set:
fs/io_uring.c: In function '__io_complete_rw':
>> fs/io_uring.c:2509:48: error: implicit declaration of function 'io_rw_should_reissue'; did you mean 'io_rw_reissue'? [-Werror=implicit-function-declaration]
2509 | if ((res == -EAGAIN || res == -EOPNOTSUPP) && io_rw_should_reissue(req)) {
| ^~~~~~~~~~~~~~~~~~~~
| io_rw_reissue
cc1: some warnings being treated as errors
Ensure that we have a stub declaration of io_rw_should_reissue() for
!CONFIG_BLOCK.
Fixes:
230d50d448ac ("io_uring: move reissue into regular IO path")
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Linus Torvalds [Fri, 2 Apr 2021 23:13:13 +0000 (16:13 -0700)]
Merge tag 'block-5.12-2021-04-02' of git://git.kernel.dk/linux-block
Pull block fixes from Jens Axboe:
- Remove comment that never came to fruition in 22 years of development
(Christoph)
- Remove unused request flag (Christoph)
- Fix for null_blk fake timeout handling (Damien)
- Fix for IOCB_NOWAIT being ignored for O_DIRECT on raw bdevs (Pavel)
- Error propagation fix for multiple split bios (Yufen)
* tag 'block-5.12-2021-04-02' of git://git.kernel.dk/linux-block:
block: remove the unused RQF_ALLOCED flag
block: update a few comments in uapi/linux/blkpg.h
block: don't ignore REQ_NOWAIT for direct IO
null_blk: fix command timeout completion handling
block: only update parent bi_status when bio fail