Queue sysfs files
=================

This text file details the queue files that are located in the sysfs tree
for each block device. Note that stacked devices typically do not export
any settings, since their queue merely functions as a remapping target.
These files are the ones found in the /sys/block/xxx/queue/ directory.

Files denoted with a RO postfix are read-only and the RW postfix means
read-write.

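As an illustration of the directory layout described above, the following
is a minimal sketch of reading one queue attribute. The helper name
read_queue_attr and its sysfs_root parameter are hypothetical and exist only
for this example; the device name passed in must be one present on the system.

```python
from pathlib import Path


def read_queue_attr(device, name, sysfs_root="/sys/block"):
    """Return the text of <sysfs_root>/<device>/queue/<name>, stripped.

    sysfs_root is a hypothetical parameter added so the helper can be
    pointed at a test directory instead of the real sysfs tree.
    """
    path = Path(sysfs_root) / device / "queue" / name
    return path.read_text().strip()
```

For example, read_queue_attr("sda", "hw_sector_size") would return the
contents of /sys/block/sda/queue/hw_sector_size, assuming a device named
sda exists.
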
add_random (RW)
---------------
This file allows turning off the disk entropy contribution. The default
value of this file is '1' (on).

chunk_sectors (RO)
------------------
This has a different meaning depending on the type of the block device.
For a RAID device (dm-raid), chunk_sectors indicates the size in 512B sectors
of the RAID volume stripe segment. For a zoned block device, either host-aware
or host-managed, chunk_sectors indicates the size in 512B sectors of the zones
of the device, with the possible exception of the last zone of the device,
which may be smaller.

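Because chunk_sectors is expressed in 512B sectors, converting it to bytes is
a single multiplication. A minimal sketch follows; the function name
chunk_bytes is hypothetical.

```python
SECTOR_SIZE = 512  # chunk_sectors is reported in 512-byte sectors


def chunk_bytes(chunk_sectors):
    """Convert a chunk_sectors value (string or int) to a size in bytes."""
    return int(chunk_sectors) * SECTOR_SIZE
```

For instance, a chunk_sectors value of 524288 corresponds to 268435456 bytes
(256 MiB).
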
dax (RO)
--------
This file indicates whether the device supports Direct Access (DAX),
used by CPU-addressable storage to bypass the pagecache. It shows '1'
if true, '0' if not.

discard_granularity (RO)
------------------------
This shows the size of internal allocation of the device in bytes, if
reported by the device. A value of '0' means the device does not support
the discard functionality.

discard_max_hw_bytes (RO)
-------------------------
Devices that support discard functionality may have internal limits on
the number of bytes that can be trimmed or unmapped in a single operation.
The discard_max_hw_bytes parameter is set by the device driver to the maximum
number of bytes that can be discarded in a single operation. Discard
requests issued to the device must not exceed this limit. A
discard_max_hw_bytes value of 0 means that the device does not support
discard functionality.

discard_max_bytes (RW)
----------------------
While discard_max_hw_bytes is the hardware limit for the device, this
setting is the software limit. Some devices exhibit large latencies when
large discards are issued. Setting this value lower will make Linux issue
smaller discards and potentially help reduce latencies induced by large
discard operations.

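The relationship between the discard attributes can be sketched as follows.
Both function names are hypothetical. The clamping step assumes that a
software limit is only meaningful at or below the hardware limit; that is an
assumption of this example, not a statement about what the kernel does when
a larger value is written.

```python
def discard_supported(discard_granularity, discard_max_hw_bytes):
    """Return False if either attribute reports 0 (discard unsupported)."""
    return int(discard_granularity) != 0 and int(discard_max_hw_bytes) != 0


def clamp_discard_limit(requested_bytes, discard_max_hw_bytes):
    """Pick a software limit no larger than the hardware limit.

    Assumption of this sketch: values above the hardware limit are
    reduced to it before being written to discard_max_bytes.
    """
    return min(int(requested_bytes), int(discard_max_hw_bytes))
```

A caller looking to reduce discard-induced latency would pass a requested
value well below discard_max_hw_bytes and write the result to
discard_max_bytes.
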
hw_sector_size (RO)
-------------------
This is the hardware sector size of the device, in bytes.

io_poll (RW)
------------
When read, this file shows whether polling is enabled (1) or disabled
(0). Writing '0' to this file will disable polling for this device.
Writing any non-zero value will enable this feature.

io_poll_delay (RW)
------------------
If polling is enabled, this controls what kind of polling will be
performed. It defaults to -1, which is classic polling. In this mode,
the CPU will repeatedly ask for completions without giving up any time.
If set to 0, a hybrid polling mode is used, where the kernel will attempt
to make an educated guess at when the IO will complete. Based on this
guess, the kernel will put the process issuing IO to sleep for an amount
of time, before entering a classic poll loop. This mode might be a
little slower than pure classic polling, but it will be more efficient.
If set to a value larger than 0, the kernel will put the process issuing
IO to sleep for this number of microseconds before entering classic
polling.

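The three cases described for io_poll_delay can be summarized in a small
decoding helper. This is a sketch only; the function name is hypothetical,
and rejecting values below -1 is a choice made for this example because the
text does not define them.

```python
def describe_io_poll_delay(value):
    """Map an io_poll_delay value to the polling mode it selects."""
    delay = int(value)
    if delay == -1:
        return "classic polling"
    if delay == 0:
        return "hybrid polling"
    if delay > 0:
        return f"sleep {delay} us, then classic polling"
    # Values below -1 are not described in the text; reject them here.
    raise ValueError(f"unexpected io_poll_delay value: {delay}")
```
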
io_timeout (RW)
---------------
io_timeout is the request timeout in milliseconds. If a request does not
complete in this time then the block driver timeout handler is invoked.
That timeout handler can decide to retry the request, to fail it or to start
a device recovery strategy.

iostats (RW)
------------
This file is used to control (on/off) the iostats accounting of the
disk.

logical_block_size (RO)
-----------------------
This is the logical block size of the device, in bytes.

max_hw_sectors_kb (RO)
----------------------
This is the maximum number of kilobytes supported in a single data transfer.

max_integrity_segments (RO)
---------------------------
Maximum number of elements in a DMA scatter/gather list with integrity
data that will be submitted by the block layer core to the associated
block driver.

max_sectors_kb (RW)
-------------------
This is the maximum number of kilobytes that the block layer will allow
for a filesystem request. Must be smaller than or equal to the maximum
size allowed by the hardware.

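The constraint on max_sectors_kb can be checked before writing a new value.
The sketch below assumes that "the maximum size allowed by the hardware" is
the value reported in max_hw_sectors_kb; the function name is hypothetical.

```python
def fits_hw_limit(requested_kb, max_hw_sectors_kb):
    """Return True if requested_kb does not exceed the hardware maximum."""
    return int(requested_kb) <= int(max_hw_sectors_kb)
```
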
max_segments (RO)
-----------------
Maximum number of elements in a DMA scatter/gather list that is submitted
to the associated block driver.

max_segment_size (RO)
---------------------
Maximum size in bytes of a single element in a DMA scatter/gather list.

minimum_io_size (RO)
--------------------
This is the smallest preferred IO size reported by the device.

nomerges (RW)
-------------
This enables the user to disable the lookup logic involved in merging IO
requests in the block layer. By default (0) all merges are
enabled. When set to 1 only simple one-hit merges will be tried. When
set to 2 no merge algorithms will be tried (including one-hit or more
complex tree/hash lookups).

nr_requests (RW)
----------------
This controls how many requests may be allocated in the block layer for
read or write requests. Note that the total allocated number may be twice
this amount, since it applies only to reads or writes (not the accumulated
sum).

To avoid priority inversion through request starvation, a request
queue maintains a separate request pool per cgroup when
CONFIG_BLK_CGROUP is enabled, and this parameter applies to each such
per-block-cgroup request pool. In other words, if there are N block cgroups,
each request queue may have up to N request pools, each independently
regulated by nr_requests.

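Taken together, the two paragraphs above imply a rough upper bound on the
number of requests a queue may have allocated. The sketch below merely
restates that arithmetic; the function name and the n_cgroups parameter are
hypothetical, and the result is an estimate derived from the text rather
than a figure reported by the kernel.

```python
def max_allocated_requests(nr_requests, n_cgroups=1):
    """Estimate the upper bound on allocated requests for one queue.

    Each request pool may hold up to nr_requests reads plus nr_requests
    writes, and there may be up to one pool per block cgroup.
    """
    return 2 * int(nr_requests) * n_cgroups
```

With nr_requests set to 128 and three block cgroups, the estimate is
2 * 128 * 3 = 768 requests.
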
nr_zones (RO)
-------------
For zoned block devices (zoned attribute indicating "host-managed" or
"host-aware"), this indicates the total number of zones of the device.
This is always 0 for regular block devices.

optimal_io_size (RO)
--------------------
This is the optimal IO size reported by the device.

physical_block_size (RO)
------------------------
This is the physical block size of the device, in bytes.

read_ahead_kb (RW)
------------------
Maximum number of kilobytes to read-ahead for filesystems on this block
device.

rotational (RW)
---------------
This file is used to state whether the device is of rotational type or
non-rotational type.

rq_affinity (RW)
----------------
If this option is '1', the block layer will migrate request completions to the
cpu "group" that originally submitted the request. For some workloads this
provides a significant reduction in CPU cycles due to caching effects.

For storage configurations that need to maximize distribution of completion
processing, setting this option to '2' forces the completion to run on the
requesting cpu (bypassing the "group" aggregation logic).

scheduler (RW)
--------------
When read, this file will display the current and available IO schedulers
for this block device. The currently active IO scheduler will be enclosed
in [] brackets. Writing an IO scheduler name to this file will switch
control of this block device to that new IO scheduler. Note that writing
an IO scheduler name to this file will attempt to load that IO scheduler
module, if it isn't already present in the system.

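The bracket convention used by the scheduler file lends itself to simple
parsing. The following is a sketch under the assumption that scheduler names
are separated by whitespace; the function name is hypothetical and the sample
string in the note below is illustrative only.

```python
def parse_scheduler(text):
    """Split the scheduler file contents into (active, available).

    The active scheduler is the name enclosed in [] brackets; it is
    None if no name is bracketed.
    """
    active = None
    available = []
    for name in text.split():
        if name.startswith("[") and name.endswith("]"):
            name = name[1:-1]
            active = name
        available.append(name)
    return active, available
```

Given the illustrative contents "noop deadline [cfq]", the helper returns
"cfq" as the active scheduler and the list of all three names as available.
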
write_cache (RW)
----------------
When read, this file will display whether the device has write back
caching enabled or not. It will return "write back" for the former
case, and "write through" for the latter. Writing to this file can
change the kernel's view of the device, but it doesn't alter the
device state. This means that it might not be safe to toggle the
setting from "write back" to "write through", since that will also
eliminate cache flushes issued by the kernel.

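A reader of write_cache only has two strings to distinguish. The sketch
below turns them into a boolean; the function name is hypothetical, and
raising an error on any other string is a choice made for this example.

```python
def is_write_back(text):
    """Return True for "write back" and False for "write through"."""
    value = text.strip()
    if value == "write back":
        return True
    if value == "write through":
        return False
    # Any other content is not described in the text; reject it here.
    raise ValueError(f"unexpected write_cache value: {value!r}")
```
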
write_same_max_bytes (RO)
-------------------------
This is the number of bytes the device can write in a single write-same
command. A value of '0' means write-same is not supported by this
device.

wbt_lat_usec (RW)
-----------------
If the device is registered for writeback throttling, then this file shows
the target minimum read latency. If this latency is exceeded in a given
window of time (see wb_window_usec), then the writeback throttling will start
scaling back writes. Writing a value of '0' to this file disables the
feature. Writing a value of '-1' to this file resets the value to the
default setting.

throttle_sample_time (RW)
-------------------------
This is the time window over which blk-throttle samples data, in milliseconds.
blk-throttle makes decisions based on the samples. A lower time means cgroups
have smoother throughput, but higher CPU overhead. This exists only when
CONFIG_BLK_DEV_THROTTLING_LOW is enabled.

zoned (RO)
----------
This indicates if the device is a zoned block device and the zone model of the
device if it is indeed zoned. The possible values indicated by zoned are
"none" for regular block devices and "host-aware" or "host-managed" for zoned
block devices. The characteristics of host-aware and host-managed zoned block
devices are described in the ZBC (Zoned Block Commands) and ZAC
(Zoned Device ATA Command Set) standards. These standards also define the
"drive-managed" zone model. However, since drive-managed zoned block devices
do not support zone commands, they will be treated as regular block devices
and zoned will report "none".

Jens Axboe <jens.axboe@oracle.com>, February 2009