summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2020-01-20io_uring: remove two unnecessary function declarationsJens Axboe
__io_free_req() and io_double_put_req() aren't used before they are defined, so we can kill these two forwards. Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-01-20io_uring: move *queue_link_head() from common pathPavel Begunkov
Move io_queue_link_head() to links handling code in io_submit_sqe(), so it wouldn't need extra checks and would have better data locality. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-01-20io_uring: rename prev to headPavel Begunkov
Calling "prev" a head of a link is a bit misleading. Rename it Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-01-20io_uring: add IOSQE_ASYNCJens Axboe
io_uring defaults to always doing inline submissions, if at all possible. But for larger copies, even if the data is fully cached, that can take a long time. Add an IOSQE_ASYNC flag that the application can set on the SQE - if set, it'll ensure that we always go async for those kinds of requests. Use the io-wq IO_WQ_WORK_CONCURRENT flag to ensure we get the concurrency we desire for this case. Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-01-20io-wq: support concurrent non-blocking workJens Axboe
io-wq assumes that work will complete fast (and not block), so it doesn't create a new worker when work is enqueued, if we already have at least one worker running. This is done on the assumption that if work is running, then it will complete fast. Add an option to force io-wq to fork a new worker for work queued. This is signaled by setting IO_WQ_WORK_CONCURRENT on the work item. For that case, io-wq will create a new worker, even though workers are already running. Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-01-20io_uring: add support for IORING_OP_STATXJens Axboe
This provides support for async statx(2) through io_uring. Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-01-20fs: make two stat prep helpers availableJens Axboe
To implement an async stat, we need to provide the flags mapping and the statx user copy. Make them available internally, through fs/internal.h. Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-01-20io_uring: avoid ring quiesce for fixed file set unregister and updateJens Axboe
We currently fully quiesce the ring before an unregister or update of the fixed fileset. This is very expensive, and we can be a bit smarter about this. Add a percpu refcount for the file tables as a whole. Grab a percpu ref when we use a registered file, and put it on completion. This is cheap to do. Upon removal of a file from a set, switch the ref count to atomic mode. When we hit zero ref on the completion side, then we know we can drop the previously registered files. When the old files have been dropped, switch the ref back to percpu mode for normal operation. Since there's a period between doing the update and the kernel being done with it, add a IORING_OP_FILES_UPDATE opcode that can perform the same action. The application knows the update has completed when it gets the CQE for it. Between doing the update and receiving this completion, the application must continue to use the unregistered fd if submitting IO on this particular file. This takes the runtime of test/file-register from liburing from 14s to about 0.7s. Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-01-20io_uring: add support for IORING_OP_CLOSEJens Axboe
This works just like close(2), unsurprisingly. We remove the file descriptor and post the completion inline, then offload the actual (potential) last file put to async context. Mark the async part of this work as uncancellable, as we really must guarantee that the latter part of the close is run. Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-01-20io-wq: add support for uncancellable workJens Axboe
Not all work can be cancelled, some of it we may need to guarantee that it runs to completion. Allow the caller to set IO_WQ_WORK_NO_CANCEL on work that must not be cancelled. Note that the caller work function must also check for IO_WQ_WORK_NO_CANCEL on work that is marked IO_WQ_WORK_CANCEL. Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-01-20fs: move filp_close() outside of __close_fd_get_file()Jens Axboe
Just one caller of this, and just use filp_close() there manually. This is important to allow async close/removal of the fd. Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-01-20io_uring: add support for IORING_OP_OPENATJens Axboe
This works just like openat(2), except it can be performed async. For the normal case of a non-blocking path lookup this will complete inline. If we have to do IO to perform the open, it'll be done from async context. Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-01-20fs: make build_open_flags() available internallyJens Axboe
This is a prep patch for supporting non-blocking open from io_uring. Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-01-20io_uring: add support for fallocate()Jens Axboe
This exposes fallocate(2) through io_uring. Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-01-20Merge branch 'io_uring-5.5' into for-5.6/io_uring-vfsJens Axboe
Pull in compatability fix for the files_update command. * io_uring-5.5: io_uring: fix compat for IORING_REGISTER_FILES_UPDATE
2020-01-20io_uring: fix compat for IORING_REGISTER_FILES_UPDATEio_uring-5.5-2020-01-22Eugene Syromiatnikov
fds field of struct io_uring_files_update is problematic with regards to compat user space, as pointer size is different in 32-bit, 32-on-64-bit, and 64-bit user space. In order to avoid custom handling of compat in the syscall implementation, make fds __u64 and use u64_to_user_ptr in order to retrieve it. Also, align the field naturally and check that no garbage is passed there. Fixes: c3a31e605620c279 ("io_uring: add support for IORING_REGISTER_FILES_UPDATE") Signed-off-by: Eugene Syromiatnikov <esyr@redhat.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-01-19Merge branch 'work.openat2' of ↵Jens Axboe
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs into for-5.6/io_uring-vfs Pull in Al's openat2 branch, since we'll need that for the openat2 support. * 'work.openat2' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: Documentation: path-lookup: include new LOOKUP flags selftests: add openat2(2) selftests open: introduce openat2(2) syscall namei: LOOKUP_{IN_ROOT,BENEATH}: permit limited ".." resolution namei: LOOKUP_IN_ROOT: chroot-like scoped resolution namei: LOOKUP_BENEATH: O_BENEATH-like scoped resolution namei: LOOKUP_NO_XDEV: block mountpoint crossing namei: LOOKUP_NO_MAGICLINKS: block magic-link resolution namei: LOOKUP_NO_SYMLINKS: block symlink resolution namei: allow set_root() to produce errors namei: allow nd_jump_link() to produce errors nsfs: clean-up ns_get_path() signature to return int namei: only return -ECHILD from follow_dotdot_rcu()
2020-01-19Linux 5.5-rc7Linus Torvalds
2020-01-19Merge tag 'riscv/for-v5.5-rc7' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux Pull RISC-V fixes from Paul Walmsley: "Three fixes for RISC-V: - Don't free and reuse memory containing the code that CPUs parked at boot reside in. - Fix rv64 build problems for ubsan and some modules by adding logical and arithmetic shift helpers for 128-bit values. These are from libgcc and are similar to what's present for ARM64. - Fix vDSO builds to clean up their own temporary files" * tag 'riscv/for-v5.5-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux: riscv: Less inefficient gcc tishift helpers (and export their symbols) riscv: delete temporary files riscv: make sure the cores stay looping in .Lsecondary_park
2020-01-19Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netLinus Torvalds
Pull networking fixes from David Miller: 1) Fix non-blocking connect() in x25, from Martin Schiller. 2) Fix spurious decryption errors in kTLS, from Jakub Kicinski. 3) Netfilter use-after-free in mtype_destroy(), from Cong Wang. 4) Limit size of TSO packets properly in lan78xx driver, from Eric Dumazet. 5) r8152 probe needs an endpoint sanity check, from Johan Hovold. 6) Prevent looping in tcp_bpf_unhash() during sockmap/tls free, from John Fastabend. 7) hns3 needs short frames padded on transmit, from Yunsheng Lin. 8) Fix netfilter ICMP header corruption, from Eyal Birger. 9) Fix soft lockup when low on memory in hns3, from Yonglong Liu. 10) Fix NTUPLE firmware command failures in bnxt_en, from Michael Chan. 11) Fix memory leak in act_ctinfo, from Eric Dumazet. * git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (91 commits) cxgb4: reject overlapped queues in TC-MQPRIO offload cxgb4: fix Tx multi channel port rate limit net: sched: act_ctinfo: fix memory leak bnxt_en: Do not treat DSN (Digital Serial Number) read failure as fatal. bnxt_en: Fix ipv6 RFS filter matching logic. bnxt_en: Fix NTUPLE firmware command failures. net: systemport: Fixed queue mapping in internal ring map net: dsa: bcm_sf2: Configure IMP port for 2Gb/sec net: dsa: sja1105: Don't error out on disabled ports with no phy-mode net: phy: dp83867: Set FORCE_LINK_GOOD to default after reset net: hns: fix soft lockup when there is not enough memory net: avoid updating qdisc_xmit_lock_key in netdev_update_lockdep_key() net/sched: act_ife: initalize ife->metalist earlier netfilter: nat: fix ICMP header corruption on ICMP errors net: wan: lapbether.c: Use built-in RCU list checking netfilter: nf_tables: fix flowtable list del corruption netfilter: nf_tables: fix memory leak in nf_tables_parse_netdev_hooks() netfilter: nf_tables: remove WARN and add NLA_STRING upper limits netfilter: nft_tunnel: ERSPAN_VERSION must not be null netfilter: nft_tunnel: fix null-attribute check ...
2020-01-19Merge branch 'i2c/for-current' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux Pull i2c fixes from Wolfram Sang: "Two runtime PM fixes and one leak fix" * 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux: i2c: iop3xx: Fix memory leak in probe error path i2c: tegra: Properly disable runtime PM on driver's probe error i2c: tegra: Fix suspending in active runtime PM state
2020-01-19cxgb4: reject overlapped queues in TC-MQPRIO offloadRahul Lakkireddy
A queue can't belong to multiple traffic classes. So, reject any such configuration that results in overlapped queues for a traffic class. Fixes: b1396c2bd675 ("cxgb4: parse and configure TC-MQPRIO offload") Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-19cxgb4: fix Tx multi channel port rate limitRahul Lakkireddy
T6 can support 2 egress traffic management channels per port to double the total number of traffic classes that can be configured. In this configuration, if the class belongs to the other channel, then all the queues must be bound again explicitly to the new class, for the rate limit parameters on the other channel to take effect. So, always explicitly bind all queues to the port rate limit traffic class, regardless of the traffic management channel that it belongs to. Also, only bind queues to port rate limit traffic class, if all the queues don't already belong to an existing different traffic class. Fixes: 4ec4762d8ec6 ("cxgb4: add TC-MATCHALL classifier egress offload") Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-19net: sched: act_ctinfo: fix memory leakEric Dumazet
Implement a cleanup method to properly free ci->params BUG: memory leak unreferenced object 0xffff88811746e2c0 (size 64): comm "syz-executor617", pid 7106, jiffies 4294943055 (age 14.250s) hex dump (first 32 bytes): 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ c0 34 60 84 ff ff ff ff 00 00 00 00 00 00 00 00 .4`............. backtrace: [<0000000015aa236f>] kmemleak_alloc_recursive include/linux/kmemleak.h:43 [inline] [<0000000015aa236f>] slab_post_alloc_hook mm/slab.h:586 [inline] [<0000000015aa236f>] slab_alloc mm/slab.c:3320 [inline] [<0000000015aa236f>] kmem_cache_alloc_trace+0x145/0x2c0 mm/slab.c:3549 [<000000002c946bd1>] kmalloc include/linux/slab.h:556 [inline] [<000000002c946bd1>] kzalloc include/linux/slab.h:670 [inline] [<000000002c946bd1>] tcf_ctinfo_init+0x21a/0x530 net/sched/act_ctinfo.c:236 [<0000000086952cca>] tcf_action_init_1+0x400/0x5b0 net/sched/act_api.c:944 [<000000005ab29bf8>] tcf_action_init+0x135/0x1c0 net/sched/act_api.c:1000 [<00000000392f56f9>] tcf_action_add+0x9a/0x200 net/sched/act_api.c:1410 [<0000000088f3c5dd>] tc_ctl_action+0x14d/0x1bb net/sched/act_api.c:1465 [<000000006b39d986>] rtnetlink_rcv_msg+0x178/0x4b0 net/core/rtnetlink.c:5424 [<00000000fd6ecace>] netlink_rcv_skb+0x61/0x170 net/netlink/af_netlink.c:2477 [<0000000047493d02>] rtnetlink_rcv+0x1d/0x30 net/core/rtnetlink.c:5442 [<00000000bdcf8286>] netlink_unicast_kernel net/netlink/af_netlink.c:1302 [inline] [<00000000bdcf8286>] netlink_unicast+0x223/0x310 net/netlink/af_netlink.c:1328 [<00000000fc5b92d9>] netlink_sendmsg+0x2c0/0x570 net/netlink/af_netlink.c:1917 [<00000000da84d076>] sock_sendmsg_nosec net/socket.c:639 [inline] [<00000000da84d076>] sock_sendmsg+0x54/0x70 net/socket.c:659 [<0000000042fb2eee>] ____sys_sendmsg+0x2d0/0x300 net/socket.c:2330 [<000000008f23f67e>] ___sys_sendmsg+0x8a/0xd0 net/socket.c:2384 [<00000000d838e4f6>] __sys_sendmsg+0x80/0xf0 net/socket.c:2417 [<00000000289a9cb1>] __do_sys_sendmsg net/socket.c:2426 [inline] [<00000000289a9cb1>] __se_sys_sendmsg net/socket.c:2424 [inline] [<00000000289a9cb1>] __x64_sys_sendmsg+0x23/0x30 net/socket.c:2424 Fixes: 24ec483cec98 ("net: sched: Introduce act_ctinfo action") Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: syzbot <syzkaller@googlegroups.com> Cc: Kevin 'ldir' Darbyshire-Bryant <ldir@darbyshire-bryant.me.uk> Cc: Cong Wang <xiyou.wangcong@gmail.com> Cc: Toke Høiland-Jørgensen <toke@redhat.com> Acked-by: Kevin 'ldir' Darbyshire-Bryant <ldir@darbyshire-bryant.me.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-18riscv: Less inefficient gcc tishift helpers (and export their symbols)Olof Johansson
The existing __lshrti3 was really inefficient, and the other two helpers are also needed to compile some modules. Add the missing versions, and export all of the symbols like arm64 already does. This code is based on the assembly generated by libgcc builds. This fixes a build break triggered by ubsan: riscv64-unknown-linux-gnu-ld: lib/ubsan.o: in function `.L2': ubsan.c:(.text.unlikely+0x38): undefined reference to `__ashlti3' riscv64-unknown-linux-gnu-ld: ubsan.c:(.text.unlikely+0x42): undefined reference to `__ashrti3' Signed-off-by: Olof Johansson <olof@lixom.net> [paul.walmsley@sifive.com: use SYM_FUNC_{START,END} instead of ENTRY/ENDPROC; note libgcc origin] Signed-off-by: Paul Walmsley <paul.walmsley@sifive.com>
2020-01-18Merge tag 'mtd/fixes-for-5.5-rc7' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/mtd/linux Pull MTD fixes from Miquel Raynal: "Raw NAND: - GPMI: Fix the suspend/resume SPI-NOR: - Fix quad enable on Spansion like flashes - Fix selection of 4-byte addressing opcodes on Spansion" * tag 'mtd/fixes-for-5.5-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/mtd/linux: mtd: rawnand: gpmi: Restore nfc timing setup after suspend/resume mtd: rawnand: gpmi: Fix suspend/resume problem mtd: spi-nor: Fix quad enable for Spansion like flashes mtd: spi-nor: Fix selection of 4-byte addressing opcodes on Spansion
2020-01-18Merge tag 'drm-fixes-2020-01-19' of git://anongit.freedesktop.org/drm/drmLinus Torvalds
Pull drm fixes from Dave Airlie: "Back from LCA2020, fixes wasn't too busy last week, seems to have quieten down appropriately, some amdgpu, i915, then a core mst fix and one fix for virtio-gpu and one for rockchip: core mst: - serialize down messages and clear timeslots are on unplug amdgpu: - Update golden settings for renoir - eDP fix i915: - uAPI fix: Remove dash and colon from PMU names to comply with tools/perf - Fix for include file that was indirectly included - Two fixes to make sure VMA are marked active for error capture virtio: - maintain obj reservation lock when submitting cmds rockchip: - increase link rate var size to accommodate rates" * tag 'drm-fixes-2020-01-19' of git://anongit.freedesktop.org/drm/drm: drm/amd/display: Reorder detect_edp_sink_caps before link settings read. drm/amdgpu: update goldensetting for renoir drm/dp_mst: Have DP_Tx send one msg at a time drm/dp_mst: clear time slots for ports invalid drm/i915/pmu: Do not use colons or dashes in PMU names drm/rockchip: fix integer type used for storing dp data rate drm/i915/gt: Mark ring->vma as active while pinned drm/i915/gt: Mark context->state vma as active while pinned drm/i915/gt: Skip trying to unbind in restore_ggtt_mappings drm/i915: Add missing include file <linux/math64.h> drm/virtio: add missing virtio_gpu_array_lock_resv call
2020-01-18riscv: delete temporary filesIlie Halip
Temporary files used in the VDSO build process linger on even after make mrproper: vdso-dummy.o.tmp, vdso.so.dbg.tmp. Delete them once they're no longer needed. Signed-off-by: Ilie Halip <ilie.halip@gmail.com> Signed-off-by: Paul Walmsley <paul.walmsley@sifive.com>
2020-01-18Merge branch 'x86-urgent-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 fixes from Ingo Molnar: "Misc fixes: - a resctrl fix for uninitialized objects found by debugobjects - a resctrl memory leak fix - fix the unintended re-enabling of the of SME and SEV CPU flags if memory encryption was disabled at bootup via the MSR space" * 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/CPU/AMD: Ensure clearing of SME/SEV features is maintained x86/resctrl: Fix potential memory leak x86/resctrl: Fix an imbalance in domain_remove_cpu()
2020-01-18Merge branch 'timers-urgent-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull timer fixes from Ingo Molnar: "Three fixes: fix link failure on Alpha, fix a Sparse warning and annotate/robustify a lockless access in the NOHZ code" * 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: tick/sched: Annotate lockless access to last_jiffies_update lib/vdso: Make __cvdso_clock_getres() static time/posix-stubs: Provide compat itimer supoprt for alpha
2020-01-18Merge branch 'smp-urgent-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull cpu/SMT fix from Ingo Molnar: "Fix a build bug on CONFIG_HOTPLUG_SMT=y && !CONFIG_SYSFS kernels" * 'smp-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: cpu/SMT: Fix x86 link error without CONFIG_SYSFS
2020-01-18Merge branch 'ras-urgent-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 RAS fix from Ingo Molnar: "Fix a thermal throttling race that can result in easy to trigger boot crashes on certain Ice Lake platforms" * 'ras-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/mce/therm_throt: Do not access uninitialized therm_work
2020-01-18Merge branch 'perf-urgent-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull perf fixes from Ingo Molnar: "Tooling fixes, three Intel uncore driver fixes, plus an AUX events fix uncovered by the perf fuzzer" * 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: perf/x86/intel/uncore: Remove PCIe3 unit for SNR perf/x86/intel/uncore: Fix missing marker for snr_uncore_imc_freerunning_events perf/x86/intel/uncore: Add PCI ID of IMC for Xeon E3 V5 Family perf: Correctly handle failed perf_get_aux_event() perf hists: Fix variable name's inconsistency in hists__for_each() macro perf map: Set kmap->kmaps backpointer for main kernel map chunks perf report: Fix incorrectly added dimensions as switch perf data file tools lib traceevent: Fix memory leakage in filter_event
2020-01-18Merge branch 'locking-urgent-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull locking fixes from Ingo Molnar: "Three fixes: - Fix an rwsem spin-on-owner crash, introduced in v5.4 - Fix a lockdep bug when running out of stack_trace entries, introduced in v5.4 - Docbook fix" * 'locking-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: locking/rwsem: Fix kernel crash when spinning on RWSEM_OWNER_UNKNOWN futex: Fix kernel-doc notation warning locking/lockdep: Fix buffer overrun problem in stack_trace[]
2020-01-18Merge branch 'irq-urgent-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull irq fix from Ingo Molnar: "Fix a recent regression in the Ingenic SoCs irqchip driver that floods the syslog" * 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: irqchip/ingenic: Get rid of the legacy IRQ domain
2020-01-18Merge branch 'efi-urgent-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull EFI fixes from Ingo Molnar: "Three EFI fixes: - Fix a slow-boot-scrolling regression but making sure we use WC for EFI earlycon framebuffer mappings on x86 - Fix a mixed EFI mode boot crash - Disable paging explicitly before entering startup_32() in mixed mode bootup" * 'efi-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/efistub: Disable paging at mixed mode entry efi/libstub/random: Initialize pointer variables to zero for mixed mode efi/earlycon: Fix write-combine mapping on x86
2020-01-18Merge branch 'core-urgent-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull rseq fixes from Ingo Molnar: "Two rseq bugfixes: - CLONE_VM !CLONE_THREAD didn't work properly, the kernel would end up corrupting the TLS of the parent. Technically a change in the ABI but the previous behavior couldn't resonably have been relied on by applications so this looks like a valid exception to the ABI rule. - Make the RSEQ_FLAG_UNREGISTER ABI behavior consistent with the handling of other flags. This is not thought to impact any applications either" * 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: rseq: Unregister rseq for clone CLONE_VM rseq: Reject unknown flags on rseq unregister
2020-01-18Merge tag 'for-linus-2020-01-18' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux Pull thread fixes from Christian Brauner: "Here is an urgent fix for ptrace_may_access() permission checking. Commit 69f594a38967 ("ptrace: do not audit capability check when outputing /proc/pid/stat") introduced the ability to opt out of audit messages for accesses to various proc files since they are not violations of policy. While doing so it switched the check from ns_capable() to has_ns_capability{_noaudit}(). That means it switched from checking the subjective credentials (ktask->cred) of the task to using the objective credentials (ktask->real_cred). This is appears to be wrong. ptrace_has_cap() is currently only used in ptrace_may_access() And is used to check whether the calling task (subject) has the CAP_SYS_PTRACE capability in the provided user namespace to operate on the target task (object). According to the cred.h comments this means the subjective credentials of the calling task need to be used. With this fix we switch ptrace_has_cap() to use security_capable() and thus back to using the subjective credentials. As one example where this might be particularly problematic, Jann pointed out that in combination with the upcoming IORING_OP_OPENAT{2} feature, this bug might allow unprivileged users to bypass the capability checks while asynchronously opening files like /proc/*/mem, because the capability checks for this would be performed against kernel credentials. To illustrate on the former point about this being exploitable: When io_uring creates a new context it records the subjective credentials of the caller. Later on, when it starts to do work it creates a kernel thread and registers a callback. The callback runs with kernel creds for ktask->real_cred and ktask->cred. To prevent this from becoming a full-blown 0-day io_uring will call override_cred() and override ktask->cred with the subjective credentials of the creator of the io_uring instance. With ptrace_has_cap() currently looking at ktask->real_cred this override will be ineffective and the caller will be able to open arbitray proc files as mentioned above. Luckily, this is currently not exploitable but would be so once IORING_OP_OPENAT{2} land in v5.6. Let's fix it now. To minimize potential regressions I successfully ran the criu testsuite. criu makes heavy use of ptrace() and extensively hits ptrace_may_access() codepaths and has a good change of detecting any regressions. Additionally, I succesfully ran the ptrace and seccomp kernel tests" * tag 'for-linus-2020-01-18' of git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux: ptrace: reintroduce usage of subjective credentials in ptrace_has_cap()
2020-01-18Merge tag 's390-5.5-5' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux Pull s390 fixes from Vasily Gorbik: - Fix printing misleading Secure-IPL enabled message when it is not. - Fix a race condition between host ap bus and guest ap bus doing device reset in crypto code. - Fix sanity check in CCA cipher key function (CCA AES cipher key support), which fails otherwise. * tag 's390-5.5-5' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: s390/setup: Fix secure ipl message s390/zcrypt: move ap device reset from bus to driver code s390/zcrypt: Fix CCA cipher key gen with clear key value function
2020-01-18Merge tag 'scsi-fixes' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi Pull SCSI fixes from James Bottomley: "Three fixes in drivers with no impact to core code. The mptfusion fix is enormous because the driver API had to be rethreaded to pass down the necessary iocp pointer, but once that's done a significant chunk of code is deleted. The other two patches are small" * tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: scsi: mptfusion: Fix double fetch bug in ioctl scsi: storvsc: Correctly set number of hardware queues for IDE disk scsi: fnic: fix invalid stack access
2020-01-18Merge tag 'char-misc-5.5-rc7' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc Pull char/misc fixes from Greg KH: "Here are some small fixes for 5.5-rc7 Included here are: - two lkdtm fixes - coresight build fix - Documentation update for the hw process document All of these have been in linux-next with no reported issues" * tag 'char-misc-5.5-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc: Documentation/process: Add Amazon contact for embargoed hardware issues lkdtm/bugs: fix build error in lkdtm_UNSET_SMEP lkdtm/bugs: Make double-fault test always available coresight: etm4x: Fix unused function warning
2020-01-18Merge tag 'staging-5.5-rc7' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging Pull staging and IIO driver fixes from Greg KH: "Here are some small staging and iio driver fixes for 5.5-rc7 All of them are for some small reported issues. Nothing major, full details in the shortlog. All have been in linux-next with no reported issues" * tag 'staging-5.5-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging: staging: comedi: ni_routes: allow partial routing information staging: comedi: ni_routes: fix null dereference in ni_find_route_source() iio: light: vcnl4000: Fix scale for vcnl4040 iio: buffer: align the size of scan bytes to size of the largest element iio: chemical: pms7003: fix unmet triggered buffer dependency iio: imu: st_lsm6dsx: Fix selection of ST_LSM6DS3_ID iio: adc: ad7124: Fix DT channel configuration
2020-01-18Merge tag 'usb-5.5-rc7' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb Pull USB driver fixes from Greg KH: "Here are some small USB driver and core fixes for 5.5-rc7 There's one fix for hub wakeup issues and a number of small usb-serial driver fixes and device id updates. The hub fix has been in linux-next for a while with no reported issues, and the usb-serial ones have all passed 0-day with no problems" * tag 'usb-5.5-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb: USB: serial: quatech2: handle unbound ports USB: serial: keyspan: handle unbound ports USB: serial: io_edgeport: add missing active-port sanity check USB: serial: io_edgeport: handle unbound ports on URB completion USB: serial: ch341: handle unbound port at reset_resume USB: serial: suppress driver bind attributes USB: serial: option: add support for Quectel RM500Q in QDL mode usb: core: hub: Improved device recognition on remote wakeup USB: serial: opticon: fix control-message timeouts USB: serial: option: Add support for Quectel RM500Q USB: serial: simple: Add Motorola Solutions TETRA MTP3xxx and MTP85xx
2020-01-18Documentation: path-lookup: include new LOOKUP flagsAleksa Sarai
Now that we have new LOOKUP flags, we should document them in the relevant path-walking documentation. And now that we've settled on a common name for nd_jump_link() style symlinks ("magic links"), use that term where magic-link semantics are described. Signed-off-by: Aleksa Sarai <cyphar@cyphar.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2020-01-18selftests: add openat2(2) selftestsAleksa Sarai
Test all of the various openat2(2) flags. A small stress-test of a symlink-rename attack is included to show that the protections against ".."-based attacks are sufficient. The main things these self-tests are enforcing are: * The struct+usize ABI for openat2(2) and copy_struct_from_user() to ensure that upgrades will be handled gracefully (in addition, ensuring that misaligned structures are also handled correctly). * The -EINVAL checks for openat2(2) are all correctly handled to avoid userspace passing unknown or conflicting flag sets (most importantly, ensuring that invalid flag combinations are checked). * All of the RESOLVE_* semantics (including errno values) are correctly handled with various combinations of paths and flags. * RESOLVE_IN_ROOT correctly protects against the symlink rename(2) attack that has been responsible for several CVEs (and likely will be responsible for several more). Cc: Shuah Khan <shuah@kernel.org> Signed-off-by: Aleksa Sarai <cyphar@cyphar.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2020-01-18open: introduce openat2(2) syscallAleksa Sarai
/* Background. */ For a very long time, extending openat(2) with new features has been incredibly frustrating. This stems from the fact that openat(2) is possibly the most famous counter-example to the mantra "don't silently accept garbage from userspace" -- it doesn't check whether unknown flags are present[1]. This means that (generally) the addition of new flags to openat(2) has been fraught with backwards-compatibility issues (O_TMPFILE has to be defined as __O_TMPFILE|O_DIRECTORY|[O_RDWR or O_WRONLY] to ensure old kernels gave errors, since it's insecure to silently ignore the flag[2]). All new security-related flags therefore have a tough road to being added to openat(2). Userspace also has a hard time figuring out whether a particular flag is supported on a particular kernel. While it is now possible with contemporary kernels (thanks to [3]), older kernels will expose unknown flag bits through fcntl(F_GETFL). Giving a clear -EINVAL during openat(2) time matches modern syscall designs and is far more fool-proof. In addition, the newly-added path resolution restriction LOOKUP flags (which we would like to expose to user-space) don't feel related to the pre-existing O_* flag set -- they affect all components of path lookup. We'd therefore like to add a new flag argument. Adding a new syscall allows us to finally fix the flag-ignoring problem, and we can make it extensible enough so that we will hopefully never need an openat3(2). /* Syscall Prototype. */ /* * open_how is an extensible structure (similar in interface to * clone3(2) or sched_setattr(2)). The size parameter must be set to * sizeof(struct open_how), to allow for future extensions. All future * extensions will be appended to open_how, with their zero value * acting as a no-op default. */ struct open_how { /* ... */ }; int openat2(int dfd, const char *pathname, struct open_how *how, size_t size); /* Description. */ The initial version of 'struct open_how' contains the following fields: flags Used to specify openat(2)-style flags. However, any unknown flag bits or otherwise incorrect flag combinations (like O_PATH|O_RDWR) will result in -EINVAL. In addition, this field is 64-bits wide to allow for more O_ flags than currently permitted with openat(2). mode The file mode for O_CREAT or O_TMPFILE. Must be set to zero if flags does not contain O_CREAT or O_TMPFILE. resolve Restrict path resolution (in contrast to O_* flags they affect all path components). The current set of flags are as follows (at the moment, all of the RESOLVE_ flags are implemented as just passing the corresponding LOOKUP_ flag). RESOLVE_NO_XDEV => LOOKUP_NO_XDEV RESOLVE_NO_SYMLINKS => LOOKUP_NO_SYMLINKS RESOLVE_NO_MAGICLINKS => LOOKUP_NO_MAGICLINKS RESOLVE_BENEATH => LOOKUP_BENEATH RESOLVE_IN_ROOT => LOOKUP_IN_ROOT open_how does not contain an embedded size field, because it is of little benefit (userspace can figure out the kernel open_how size at runtime fairly easily without it). It also only contains u64s (even though ->mode arguably should be a u16) to avoid having padding fields which are never used in the future. Note that as a result of the new how->flags handling, O_PATH|O_TMPFILE is no longer permitted for openat(2). As far as I can tell, this has always been a bug and appears to not be used by userspace (and I've not seen any problems on my machines by disallowing it). If it turns out this breaks something, we can special-case it and only permit it for openat(2) but not openat2(2). After input from Florian Weimer, the new open_how and flag definitions are inside a separate header from uapi/linux/fcntl.h, to avoid problems that glibc has with importing that header. /* Testing. */ In a follow-up patch there are over 200 selftests which ensure that this syscall has the correct semantics and will correctly handle several attack scenarios. In addition, I've written a userspace library[4] which provides convenient wrappers around openat2(RESOLVE_IN_ROOT) (this is necessary because no other syscalls support RESOLVE_IN_ROOT, and thus lots of care must be taken when using RESOLVE_IN_ROOT'd file descriptors with other syscalls). During the development of this patch, I've run numerous verification tests using libpathrs (showing that the API is reasonably usable by userspace). /* Future Work. */ Additional RESOLVE_ flags have been suggested during the review period. These can be easily implemented separately (such as blocking auto-mount during resolution). Furthermore, there are some other proposed changes to the openat(2) interface (the most obvious example is magic-link hardening[5]) which would be a good opportunity to add a way for userspace to restrict how O_PATH file descriptors can be re-opened. Another possible avenue of future work would be some kind of CHECK_FIELDS[6] flag which causes the kernel to indicate to userspace which openat2(2) flags and fields are supported by the current kernel (to avoid userspace having to go through several guesses to figure it out). [1]: https://lwn.net/Articles/588444/ [2]: https://lore.kernel.org/lkml/CA+55aFyyxJL1LyXZeBsf2ypriraj5ut1XkNDsunRBqgVjZU_6Q@mail.gmail.com [3]: commit 629e014bb834 ("fs: completely ignore unknown open flags") [4]: https://sourceware.org/bugzilla/show_bug.cgi?id=17523 [5]: https://lore.kernel.org/lkml/20190930183316.10190-2-cyphar@cyphar.com/ [6]: https://youtu.be/ggD-eb3yPVs Suggested-by: Christian Brauner <christian.brauner@ubuntu.com> Signed-off-by: Aleksa Sarai <cyphar@cyphar.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2020-01-18Merge branch 'bnxt_en-fixes'David S. Miller
Michael Chan says: ==================== bnxt_en: Bug fixes. 3 small bug fix patches. The 1st two are aRFS fixes and the last one fixes a fatal driver load failure on some kernels without PCIe extended config space support enabled. Please also queue these for -stable. Thanks. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-18bnxt_en: Do not treat DSN (Digital Serial Number) read failure as fatal.Michael Chan
DSN read can fail, for example on a kdump kernel without PCIe extended config space support. If DSN read fails, don't set the BNXT_FLAG_DSN_VALID flag and continue loading. Check the flag to see if the stored DSN is valid before using it. Only VF reps creation should fail without valid DSN. Fixes: 03213a996531 ("bnxt: move bp->switch_id initialization to PF probe") Reported-by: Marc Smith <msmith626@gmail.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-18bnxt_en: Fix ipv6 RFS filter matching logic.Michael Chan
Fix bnxt_fltr_match() to match ipv6 source and destination addresses. The function currently only checks ipv4 addresses and will not work corrently on ipv6 filters. Fixes: c0c050c58d84 ("bnxt_en: New Broadcom ethernet driver.") Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-18bnxt_en: Fix NTUPLE firmware command failures.Michael Chan
The NTUPLE related firmware commands are sent to the wrong firmware channel, causing all these commands to fail on new firmware that supports the new firmware channel. Fix it by excluding the 3 NTUPLE firmware commands from the list for the new firmware channel. Fixes: 760b6d33410c ("bnxt_en: Add support for 2nd firmware message channel.") Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>