summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2017-02-17block: don't defer flushes on blk-mq + schedulingfor-4.11/reviewJens Axboe
For blk-mq with scheduling, we can potentially end up with ALL driver tags assigned and sitting on the flush queues. If we defer because of an inlfight data request, then we can deadlock if that data request doesn't already have a tag assigned. This fixes a deadlock with running the xfs/297 xfstest, where thousands of syncs can cause the drive queue to stall. Signed-off-by: Jens Axboe <axboe@fb.com>
2017-02-17blk-mq-sched: ask scheduler for work, if we failed dispatching leftoversJens Axboe
Usually we don't ask the scheduler for work, if we already have leftovers on the dispatch list. This is done to leave work on the scheduler side for as long as possible, for proper merging. But if we do have work leftover but didn't dispatch anything, then we should ask the scheduler since we could potentially issue requests from that. Signed-off-by: Jens Axboe <axboe@fb.com>
2017-02-17blk-mq: don't special case flush inserts for blk-mq-schedJens Axboe
The current request insertion machinery works just fine for directly inserting flushes, so no need to special case this anymore. Signed-off-by: Jens Axboe <axboe@fb.com>
2017-02-17blk-mq-sched: don't add flushes to the head of requeue queueJens Axboe
If we are currently out of driver tags, we don't want to add a new flush (without a tag) to the head of the requeue list. We want to add it to the back, behind the others that are potentially also waiting for a tag. Signed-off-by: Jens Axboe <axboe@fb.com>
2017-02-17blk-mq: have blk_mq_dispatch_rq_list() return if we queued IO or notJens Axboe
Currently we're almost there, but if we dispatch nothing, then we still return success. Signed-off-by: Jens Axboe <axboe@fb.com>
2017-02-15Merge branch 'for-4.11/next' into for-nextJens Axboe
2017-02-15block: do not allow updates through sysfs until registration completesTahsin Erdogan
When a new disk shows up, sysfs queue directory is created before elevator is registered. This allows a user to attempt a scheduler switch even though the initial registration hasn't completed yet. In one scenario, blk_register_queue() calls elv_register_queue() and right before cfq_registered_queue() is called, another process executes elevator_switch() and replaces q->elevator with deadline scheduler. When cfq_registered_queue() executes it interprets e->elevator_data as struct cfq_data even though it is actually struct deadline_data. Grab q->sysfs_lock in blk_register_queue() to synchronize with sysfs callers. Signed-off-by: Tahsin Erdogan <tahsin@google.com> Signed-off-by: Jens Axboe <axboe@fb.com>
2017-02-15Merge branch 'for-4.11/block' into for-nextJens Axboe
2017-02-15lightnvm: set default lun range when no luns are specifiedMatias Bjørling
The create target ioctl takes a lun begin and lun end parameter, which defines the range of luns to initialize a target with. If the user does not set the parameters, it default to only using lun 0. Instead, defaults to use all luns in the OCSSD, as it is the usual behaviour users want. Signed-off-by: Matias Bjørling <matias@cnexlabs.com> Signed-off-by: Jens Axboe <axboe@fb.com>
2017-02-15lightnvm: fix off-by-one error on target initializationMatias Bjørling
If one specifies the end lun id to be the absolute number of luns, without taking zero indexing into account, the lightnvm core will pass the off-by-one end lun id to target creation, which then panics during nvm_ioctl_dev_create. Signed-off-by: Matias Bjørling <matias@cnexlabs.com> Signed-off-by: Jens Axboe <axboe@fb.com>
2017-02-14Merge branch 'for-4.11/block' into for-nextJens Axboe
2017-02-14Maintainers: Modify SED list from nvme to blockScott Bauer
Signed-off-by: Scott Bauer <scott.bauer@intel.com> Signed-off-by: Jens Axboe <axboe@fb.com>
2017-02-14Move stack parameters for sed_ioctl to prevent oversized stack with CONFIG_KASANScott Bauer
When CONFIG_KASAN is enabled, compilation fails: block/sed-opal.c: In function 'sed_ioctl': block/sed-opal.c:2447:1: error: the frame size of 2256 bytes is larger than 2048 bytes [-Werror=frame-larger-than=] Moved all the ioctl structures off the stack and dynamically allocate using _IOC_SIZE() Fixes: 455a7b238cd6 ("block: Add Sed-opal library") Reported-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Scott Bauer <scott.bauer@intel.com> Signed-off-by: Jens Axboe <axboe@fb.com>
2017-02-14uapi: sed-opal fix IOW for activate lsp to use correct structScott Bauer
The IOC_OPAL_ACTIVATE_LSP took the wrong strcure which would give us the wrong size when using _IOC_SIZE, switch it to the right structure. Fixes: 058f8a2 ("Include: Uapi: Add user ABI for Sed/Opal") Signed-off-by: Scott Bauer <scott.bauer@intel.com> Signed-off-by: Jens Axboe <axboe@fb.com>
2017-02-14Merge branch 'for-4.11/block' into for-nextJens Axboe
2017-02-14cdrom: Make device operations read-onlyKees Cook
Since function tables are a common target for attackers, it's best to keep them in read-only memory. As such, this makes the CDROM device ops tables const. This drops additionally n_minors, since it isn't used meaningfully, and sets the only user of cdrom_dummy_generic_packet explicitly so the variables can all be const. Inspired by similar changes in grsecurity/PaX. Signed-off-by: Kees Cook <keescook@chromium.org> Acked-by: David S. Miller <davem@davemloft.net> Signed-off-by: Jens Axboe <axboe@fb.com>
2017-02-14Merge branch 'for-4.11/block' into for-nextJens Axboe
2017-02-14elevator: fix loading wrong elevator type for blk-mq devicesJens Axboe
The old elevator= boot parameter blindly attempts to load the same scheduler for mq and !mq devices, leading to a crash if we specify the wrong one. Ensure that we only apply this boot parameter to old !mq devices. Signed-off-by: Jens Axboe <axboe@fb.com>
2017-02-13Merge branch 'for-4.11/block' into for-nextJens Axboe
2017-02-13cciss: switch to pci_irq_alloc_vectorsChristoph Hellwig
Simple cleanup to use the new APIs. Signed-off-by: Christoph Hellwig <hch@lst.de> Acked-by: Don Brace <don.brace@microsemi.com> Tested-by: Don Brace <don.brace@microsemi.com> Signed-off-by: Jens Axboe <axboe@fb.com>
2017-02-13Merge branch 'for-4.11/block' into for-nextJens Axboe
2017-02-13block/loop: fix race between I/O and set_statusMing Lei
Inside set_status, transfer need to setup again, so we have to drain IO before the transition, otherwise oops may be triggered like the following: divide error: 0000 [#1] SMP KASAN CPU: 0 PID: 2935 Comm: loop7 Not tainted 4.10.0-rc7+ #213 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011 task: ffff88006ba1e840 task.stack: ffff880067338000 RIP: 0010:transfer_xor+0x1d1/0x440 drivers/block/loop.c:110 RSP: 0018:ffff88006733f108 EFLAGS: 00010246 RAX: 0000000000000000 RBX: ffff8800688d7000 RCX: 0000000000000059 RDX: 0000000000000000 RSI: 1ffff1000d743f43 RDI: ffff880068891c08 RBP: ffff88006733f160 R08: ffff8800688d7001 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000000 R12: ffff8800688d7000 R13: ffff880067b7d000 R14: dffffc0000000000 R15: 0000000000000000 FS: 0000000000000000(0000) GS:ffff88006d000000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00000000006c17e0 CR3: 0000000066e3b000 CR4: 00000000001406f0 Call Trace: lo_do_transfer drivers/block/loop.c:251 [inline] lo_read_transfer drivers/block/loop.c:392 [inline] do_req_filebacked drivers/block/loop.c:541 [inline] loop_handle_cmd drivers/block/loop.c:1677 [inline] loop_queue_work+0xda0/0x49b0 drivers/block/loop.c:1689 kthread_worker_fn+0x4c3/0xa30 kernel/kthread.c:630 kthread+0x326/0x3f0 kernel/kthread.c:227 ret_from_fork+0x31/0x40 arch/x86/entry/entry_64.S:430 Code: 03 83 e2 07 41 29 df 42 0f b6 04 30 4d 8d 44 24 01 38 d0 7f 08 84 c0 0f 85 62 02 00 00 44 89 f8 41 0f b6 48 ff 25 ff 01 00 00 99 <f7> 7d c8 48 63 d2 48 03 55 d0 48 89 d0 48 89 d7 48 c1 e8 03 83 RIP: transfer_xor+0x1d1/0x440 drivers/block/loop.c:110 RSP: ffff88006733f108 ---[ end trace 0166f7bd3b0c0933 ]--- Reported-by: Dmitry Vyukov <dvyukov@google.com> Cc: stable@vger.kernel.org Signed-off-by: Ming Lei <tom.leiming@gmail.com> Tested-by: Dmitry Vyukov <dvyukov@google.com> Signed-off-by: Jens Axboe <axboe@fb.com>
2017-02-10Merge branch 'for-4.11/next' into for-nextJens Axboe
2017-02-10blk-mq-sched: don't hold queue_lock when calling exit_icqOmar Sandoval
None of the other blk-mq elevator hooks are called with this lock held. Additionally, it can lead to circular locking dependencies between queue_lock and the private scheduler lock. Reported-by: Paolo Valente <paolo.valente@linaro.org> Signed-off-by: Omar Sandoval <osandov@fb.com> Signed-off-by: Jens Axboe <axboe@fb.com>
2017-02-10Merge branch 'for-4.11/next' into for-nextJens Axboe
2017-02-10block: set make_request_fn manually in blk_mq_update_nr_hw_queuesJosef Bacik
Calling blk_queue_make_request resets a bunch of settings on the request_queue, but all we really want is to update the make_request_fn, so do this directly so we don't lose things like the logical and physical block sizes. Signed-off-by: Josef Bacik <jbacik@fb.com> Signed-off-by: Jens Axboe <axboe@fb.com>
2017-02-10Merge branch 'for-4.11/next' into for-nextJens Axboe
2017-02-10blk-mq: pass bio to blk_mq_sched_get_rq_privPaolo Valente
bio is used in bfq-mq's get_rq_priv, to get the request group. We could pass directly the group here, but I thought that passing the bio was more general, giving the possibility to get other pieces of information if needed. Signed-off-by: Paolo Valente <paolo.valente@linaro.org> Signed-off-by: Jens Axboe <axboe@fb.com>
2017-02-08Merge branch 'for-4.11/next' into for-nextJens Axboe
2017-02-08block: fix double-free in the failure path of cgwb_bdi_init()Tejun Heo
When !CONFIG_CGROUP_WRITEBACK, bdi has single bdi_writeback_congested at bdi->wb_congested. cgwb_bdi_init() allocates it with kzalloc() and doesn't do further initialization. This usually works fine as the reference count gets bumped to 1 by wb_init() and the put from wb_exit() releases it. However, when wb_init() fails, it puts the wb base ref automatically freeing the wb and the explicit kfree() in cgwb_bdi_init() error path ends up trying to free the same pointer the second time causing a double-free. Fix it by explicitly initilizing the refcnt to 1 and putting the base ref from cgwb_bdi_destroy(). Signed-off-by: Tejun Heo <tj@kernel.org> Reported-by: Dmitry Vyukov <dvyukov@google.com> Fixes: a13f35e87140 ("writeback: don't embed root bdi_writeback_congested in bdi_writeback") Cc: stable@vger.kernel.org # v4.2+ Signed-off-by: Jens Axboe <axboe@fb.com>
2017-02-08Merge branch 'for-4.11/next' into for-nextJens Axboe
2017-02-08nvme: support ranged discard requestsChristoph Hellwig
NVMe supports up to 256 ranges per DSM command, so wire up support for ranged discards up to that limit. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@fb.com>
2017-02-08block: optionally merge discontiguous discard bios into a single requestChristoph Hellwig
Add a new merge strategy that merges discard bios into a request until the maximum number of discard ranges (or the maximum discard size) is reached from the plug merging code. I/O scheduler merging is not wired up yet but might also be useful, although not for fast devices like NVMe which are the only user for now. Note that for now we don't support limiting the size of each discard range, but if needed that can be added later. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@fb.com>
2017-02-08block: enumify ELEVATOR_*_MERGEChristoph Hellwig
Switch these constants to an enum, and make let the compiler ensure that all callers of blk_try_merge and elv_merge handle all potential values. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@fb.com>
2017-02-08block: move req_set_nomerge to blk.hChristoph Hellwig
This makes it available outside of blk-merge.c, and inlining such a trivial helper seems pretty useful to start with. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@fb.com>
2017-02-07Merge branch 'for-4.11/block' into for-nextJens Axboe
2017-02-07gdrom: Add missing error codeChristophe JAILLET
In case of error, 'err' is known to be 0 here, because of the previous test. Set it to a -ENOMEM instead. Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@fb.com>
2017-02-06Merge branch 'master' into for-nextJens Axboe
2017-02-06Merge branch 'for-4.11/block' into for-nextJens Axboe
2017-02-06Fix SED-OPAL UAPI structs to prevent 32/64 bit size differences.Scott Bauer
This patch is a quick fixup of the user structures that will prevent the structures from being different sizes on 32 and 64 bit archs. Taking this fix will allow us to *NOT* have to do compat ioctls for the sed code. Signed-off-by: Scott Bauer <scott.bauer@intel.com> Fixes: 19641f2d7674 ("Include: Uapi: Add user ABI for Sed/Opal") Signed-off-by: Jens Axboe <axboe@fb.com>
2017-02-06Merge tag 'pm-4.10-rc8' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull power management fixes from Rafael Wysocki: "These add a quirk to intel_pstate to work around a firmware setting that leads to frequency scaling issues (discovered recently) on some Intel Kaby Lake processors, fix up the recently added brcmstb-avs cpufreq driver and avoid false-positive warnings from the runtime PM framework triggered by recent changes in i915. Specifics: - Add an intel_pstate driver quirk to work around a firmware setting that leads to frequency scaling issues on desktop Intel Kaby Lake processors in some configurations if the hardware-managed P-states (HWP) feature is in use (Srinivas Pandruvada) - Fix up the recently added brcmstb-avs cpufreq driver: fix a bug related to system suspend and change the sysfs interface to match the user space expectations (Markus Mayer) - Modify the runtime PM framework to avoid false-positive warnings from the might_sleep_if() assertions in it (Rafael Wysocki)" * tag 'pm-4.10-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: PM / runtime: Avoid false-positive warnings from might_sleep_if() cpufreq: intel_pstate: Disable energy efficiency optimization cpufreq: brcmstb-avs-cpufreq: properly retrieve P-state upon suspend cpufreq: brcmstb-avs-cpufreq: extend sysfs entry brcm_avs_pmap
2017-02-06Merge tag 'dm-4.10-fixes' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm Pull device mapper fixes from Mike Snitzer: - a fix for a race in .request_fn request-based DM request handling vs DM device destruction - an RCU fix for dm-crypt's kernel keyring support that was included in 4.10-rc1 - a -Wbool-operation warning fix for DM multipath * tag 'dm-4.10-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm: dm crypt: replace RCU read-side section with rwsem dm rq: cope with DM device destruction while in dm_old_request_fn() dm mpath: cleanup -Wbool-operation warning in choose_pgpath()
2017-02-06Merge tag 'media/v4.10-3' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media Pull media fixes from Mauro Carvalho Chehab: "A few documentation fixes at CEC (with got promoted from staging for 4.10), and one fix on its core." * tag 'media/v4.10-3' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media: [media] cec: fix wrong last_la determination [media] cec-intro.rst: mention the v4l-utils package and CEC utilities [media] cec rst: remove "This API is not yet finalized" notice
2017-02-06Merge branch 'linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6 Pull crypto fixes from Herbert Xu: - use-after-free in algif_aead - modular aesni regression when pcbc is modular but absent - bug causing IO page faults in ccp - double list add in ccp - NULL pointer dereference in qat (two patches) - panic in chcr - NULL pointer dereference in chcr - out-of-bound access in chcr * 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: crypto: chcr - Fix key length for RFC4106 crypto: algif_aead - Fix kernel panic on list_del crypto: aesni - Fix failure when pcbc module is absent crypto: ccp - Fix double add when creating new DMA command crypto: ccp - Fix DMA operations when IOMMU is enabled crypto: chcr - Check device is allocated before use crypto: chcr - Fix panic on dma_unmap_sg crypto: qat - zero esram only for DH85x devices crypto: qat - fix bar discovery for c62x
2017-02-06Merge branch 'for-4.11/next' into for-nextJens Axboe
2017-02-06blk-mq-sched: (un)register elevator when (un)registering queueOmar Sandoval
I noticed that when booting with a default blk-mq I/O scheduler, the /sys/block/*/queue/iosched directory was missing. However, switching after boot did create the directory. This is because we skip the initial elevator register/unregister when we don't have a ->request_fn(), but we should still do it for the ->mq_ops case. Signed-off-by: Omar Sandoval <osandov@fb.com> Signed-off-by: Jens Axboe <axboe@fb.com>
2017-02-06Merge branch 'for-4.11/block' into for-nextJens Axboe
2017-02-06nvme: Add Support for Opal: Unlock from S3 & Opal Allocation/IoctlsScott Bauer
This patch implements the necessary logic to unlock an Opal enabled device coming back from an S3. The patch also implements the SED/Opal allocation necessary to support the opal ioctls. Signed-off-by: Scott Bauer <scott.bauer@intel.com> Signed-off-by: Jens Axboe <axboe@fb.com>
2017-02-06block: Add Sed-opal libraryScott Bauer
This patch implements the necessary logic to bring an Opal enabled drive out of a factory-enabled into a working Opal state. This patch set also enables logic to save a password to be replayed during a resume from suspend. Signed-off-by: Scott Bauer <scott.bauer@intel.com> Signed-off-by: Rafael Antognolli <Rafael.Antognolli@intel.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@fb.com>
2017-02-06Include: Uapi: Add user ABI for Sed/OpalScott Bauer
Signed-off-by: Scott Bauer <scott.bauer@intel.com> Signed-off-by: Rafael Antognolli <Rafael.Antognolli@intel.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@fb.com>