.. SPDX-License-Identifier: GPL-2.0

===========
IPvs-sysctl
===========

/proc/sys/net/ipv4/vs/* Variables:
==================================

am_droprate - INTEGER
        default 10

        It sets the always mode drop rate, which is used in mode 3
        of the drop_packet defense.

amemthresh - INTEGER
        default 1024

        It sets the available memory threshold (in pages), which is
        used in the automatic modes of defense. When there is not
        enough available memory, the respective strategy will be
        enabled and its variable is automatically set to 2, otherwise
        the strategy is disabled and its variable is set to 1.

backup_only - BOOLEAN
        - 0 - disabled (default)
        - not 0 - enabled

        If set, disable the director function while the server is
        in backup mode to avoid packet loops for DR/TUN methods.

conn_reuse_mode - INTEGER
        1 - default

        Controls how ipvs will deal with connections for which port
        reuse is detected. It is a bitmap, with the values being:

        0: disable any special handling on port reuse. The new
        connection will be delivered to the same real server that was
        servicing the previous connection. This will effectively
        disable expire_nodest_conn.

        bit 1: enable rescheduling of new connections when it is safe.
        That is, whenever expire_nodest_conn is enabled and, for TCP
        sockets, when the connection is in TIME_WAIT state (which is
        only possible if you use NAT mode).

        bit 2: it is bit 1 plus, for TCP connections, when connections
        are in FIN_WAIT state, as this is the last state seen by the
        load balancer in Direct Routing mode. This bit helps when adding
        new real servers to a very busy cluster.

conntrack - BOOLEAN
        - 0 - disabled (default)
        - not 0 - enabled

        If set, maintain connection tracking entries for
        connections handled by IPVS.

        This should be enabled if connections handled by IPVS are to be
        also handled by stateful firewall rules. That is, iptables rules
        that make use of connection tracking. It is a performance
        optimisation to disable this setting otherwise.

        Connections handled by the IPVS FTP application module
        will have connection tracking entries regardless of this setting.

        Only available when IPVS is compiled with CONFIG_IP_VS_NFCT enabled.

cache_bypass - BOOLEAN
        - 0 - disabled (default)
        - not 0 - enabled

        If it is enabled, forward packets to the original destination
        directly when no cache server is available and the destination
        address is not local (iph->daddr is RTN_UNICAST). It is mostly
        used in transparent web cache clusters.

debug_level - INTEGER
        - 0 - transmission error messages (default)
        - 1 - non-fatal error messages
        - 2 - configuration
        - 3 - destination trash
        - 4 - drop entry
        - 5 - service lookup
        - 6 - scheduling
        - 7 - connection new/expire, lookup and synchronization
        - 8 - state transition
        - 9 - binding destination, template checks and applications
        - 10 - IPVS packet transmission
        - 11 - IPVS packet handling (ip_vs_in/ip_vs_out)
        - 12 or more - packet traversal

        Only available when IPVS is compiled with CONFIG_IP_VS_DEBUG enabled.

        Higher debugging levels include the messages for lower debugging
        levels, so setting debug level 2 includes level 0, 1 and 2
        messages. Thus, logging becomes more and more verbose the higher
        the level.

drop_entry - INTEGER
        - 0 - disabled (default)

        The drop_entry defense is to randomly drop entries in the
        connection hash table, in order to reclaim some memory for
        new connections. In the current code, the drop_entry procedure
        can be activated every second, then it randomly scans 1/32 of
        the whole table and drops entries that are in the
        SYN-RECV/SYNACK state, which should be effective against
        SYN flooding attacks.

        The valid values of drop_entry are from 0 to 3, where 0 means
        that this strategy is always disabled, 1 and 2 mean automatic
        modes (when there is not enough available memory, the strategy
        is enabled and the variable is automatically set to 2,
        otherwise the strategy is disabled and the variable is set to
        1), and 3 means that the strategy is always enabled.

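The four mode values shared by drop_entry, drop_packet and secure_tcp can be
read as a small state machine. The following Python sketch is illustrative
only and is not the kernel implementation: the function name
``update_defense_mode`` is hypothetical, and it assumes that "not enough
available memory" means the available memory is below amemthresh.

```python
from typing import Tuple


def update_defense_mode(mode: int, available_memory: int,
                        amemthresh: int = 1024) -> Tuple[int, bool]:
    """Return (new_mode, strategy_enabled) for one evaluation step.

    Hypothetical helper: models only the mode semantics described in
    the text, with memory values expressed in pages.
    """
    if mode == 0:
        # Mode 0: the strategy is always disabled.
        return 0, False
    if mode == 3:
        # Mode 3: the strategy is always enabled.
        return 3, True
    if mode in (1, 2):
        # Automatic modes: the variable flips between 1 and 2.
        # Assumption: "not enough memory" means available < amemthresh.
        if available_memory < amemthresh:
            return 2, True
        return 1, False
    raise ValueError("valid modes are 0 to 3")
```

For example, starting in mode 1 with 500 pages available and the default
threshold of 1024, the sketch returns mode 2 with the strategy enabled; once
memory recovers above the threshold it returns to mode 1.
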
drop_packet - INTEGER
        - 0 - disabled (default)

        The drop_packet defense is designed to drop 1/rate packets
        before forwarding them to real servers. If the rate is 1, then
        drop all the incoming packets.

        The value definition is the same as that of drop_entry. In
        the automatic mode, the rate is determined by the following
        formula: rate = amemthresh / (amemthresh - available_memory)
        when available memory is less than the available memory
        threshold. When mode 3 is set, the always mode drop rate
        is controlled by /proc/sys/net/ipv4/vs/am_droprate.

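As a worked example of the automatic-mode formula for drop_packet, the sketch
below computes the rate from the two inputs. The function name
``automatic_drop_rate`` is hypothetical, and the use of integer division is an
assumption of this sketch; the text gives only the formula, not its rounding.

```python
from typing import Optional


def automatic_drop_rate(amemthresh: int,
                        available_memory: int) -> Optional[int]:
    """Return the automatic-mode drop rate, or None when the available
    memory is not below the threshold (the formula does not apply)."""
    if available_memory >= amemthresh:
        return None
    # rate = amemthresh / (amemthresh - available_memory)
    # Integer division is an assumption of this sketch.
    return amemthresh // (amemthresh - available_memory)
```

With amemthresh at 1024 pages and 768 pages available, the rate is 4, so one
packet in four is dropped. With no memory available the rate is 1, which
means all incoming packets are dropped.
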
expire_nodest_conn - BOOLEAN
        - 0 - disabled (default)
        - not 0 - enabled

        With the default value of 0, the load balancer will silently
        drop packets when their destination server is not available.
        This may be useful when a user-space monitoring program deletes
        the destination server (because of server overload or wrong
        detection) and adds the server back later, so that the
        connections to the server can continue.

        If this feature is enabled, the load balancer will expire the
        connection immediately when a packet arrives and its
        destination server is not available, then the client program
        will be notified that the connection is closed. This is
        equivalent to the feature some people require to flush
        connections when their destination is not available.

expire_quiescent_template - BOOLEAN
        - 0 - disabled (default)
        - not 0 - enabled

        When set to a non-zero value, the load balancer will expire
        persistent templates when the destination server is quiescent.
        This may be useful when a user makes a destination server
        quiescent by setting its weight to 0 and it is desired that
        subsequent otherwise persistent connections are sent to a
        different destination server. By default new persistent
        connections are allowed to quiescent destination servers.

        If this feature is enabled, the load balancer will expire the
        persistence template if it is to be used to schedule a new
        connection and the destination server is quiescent.

ignore_tunneled - BOOLEAN
        - 0 - disabled (default)
        - not 0 - enabled

        If set, ipvs will set the ipvs_property on all packets which are of
        unrecognized protocols. This prevents us from routing tunneled
        protocols like ipip, which is useful to prevent rescheduling
        packets that have been tunneled to the ipvs host (i.e. to prevent
        ipvs routing loops when ipvs is also acting as a real server).

nat_icmp_send - BOOLEAN
        - 0 - disabled (default)
        - not 0 - enabled

        It controls sending ICMP error messages (ICMP_DEST_UNREACH)
        for VS/NAT when the load balancer receives packets from real
        servers but the connection entries don't exist.

pmtu_disc - BOOLEAN
        - 0 - disabled
        - not 0 - enabled (default)

        By default, reject with FRAG_NEEDED all DF packets that exceed
        the PMTU, irrespective of the forwarding method. For TUN method
        the flag can be disabled to fragment such packets.

secure_tcp - INTEGER
        - 0 - disabled (default)

        The secure_tcp defense is to use a more complicated TCP state
        transition table. For VS/NAT, it also delays entering the
        TCP ESTABLISHED state until the three-way handshake is completed.

        The value definition is the same as that of drop_entry and
        drop_packet.

sync_threshold - vector of 2 INTEGERs: sync_threshold, sync_period
        default 3 50

        It sets the synchronization threshold, which is the minimum
        number of incoming packets that a connection needs to receive
        before the connection will be synchronized. A connection will
        be synchronized every time the number of its incoming packets
        modulo sync_period equals the threshold. The range of the
        threshold is from 0 to sync_period.

        When sync_period and sync_refresh_period are 0, send sync only
        for state changes or only once when pkts matches sync_threshold.

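The packet-count condition for sync_threshold and sync_period can be sketched
as follows. This is a simplified illustration and not the kernel code: the
function name ``should_sync`` is hypothetical, it models only the packet-count
rule (not state changes or sync_refresh_period), and it treats a sync_period
of 0 as "sync once when the count matches the threshold".

```python
def should_sync(pkts: int, sync_threshold: int = 3,
                sync_period: int = 50) -> bool:
    """Return True when a connection that has received `pkts` incoming
    packets meets the packet-count condition for synchronization."""
    if sync_period == 0:
        # Assumption: with no period, sync only once, when the packet
        # count matches the threshold.
        return pkts == sync_threshold
    # Sync every time the packet count modulo sync_period equals
    # the threshold.
    return pkts % sync_period == sync_threshold
```

With the defaults of 3 and 50, the condition holds for packet counts 3, 53,
103 and so on.
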
sync_refresh_period - UNSIGNED INTEGER
        default 0

        In seconds, the difference in the reported connection timer
        that triggers a new sync message. It can be used to avoid sync
        messages for the specified period (or half of the connection
        timeout if it is lower) if the connection state has not changed
        since the last sync.

        This is useful for normal connections with high traffic to reduce
        the sync rate. Additionally, retry sync_retries times with a
        period of sync_refresh_period/8.

sync_retries - INTEGER
        default 0

        Defines sync retries with a period of sync_refresh_period/8.
        Useful to protect against loss of sync messages. The range of
        sync_retries is from 0 to 3.

sync_qlen_max - UNSIGNED LONG

        Hard limit for queued sync messages that are not sent yet. It
        defaults to 1/32 of the memory pages but actually represents
        a number of messages. It will protect us from allocating large
        parts of memory when the sending rate is lower than the queuing
        rate.

sync_sock_size - INTEGER
        default 0

        Configuration of SNDBUF (master) or RCVBUF (slave) socket limit.
        Default value is 0 (preserve system defaults).

sync_ports - INTEGER
        default 1

        The number of threads that master and backup servers can use for
        sync traffic. Every thread will use a single UDP port: thread 0
        will use the default port 8848, while the last thread will use
        port 8848+sync_ports-1.

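The port assignment for sync_ports can be enumerated with a short sketch.
The function name ``sync_udp_ports`` and its ``base_port`` parameter are
hypothetical; only the default port 8848 and the 8848+sync_ports-1 upper
bound come from the text.

```python
from typing import List


def sync_udp_ports(sync_ports: int = 1, base_port: int = 8848) -> List[int]:
    """Return the UDP port used by each sync thread, indexed by thread.

    Thread 0 uses base_port; the last thread uses
    base_port + sync_ports - 1.
    """
    if sync_ports < 1:
        raise ValueError("sync_ports must be at least 1")
    return [base_port + thread for thread in range(sync_ports)]
```

For instance, a setting of 4 gives ports 8848 through 8851, which is the range
a firewall between master and backup servers would need to allow.
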
snat_reroute - BOOLEAN
        - 0 - disabled
        - not 0 - enabled (default)

        If enabled, recalculate the route of SNATed packets from
        realservers so that they are routed as if they originate from the
        director. Otherwise they are routed as if they are forwarded by the
        director.

        If policy routing is in effect then it is possible that a packet
        originating from a director is routed differently from a packet
        being forwarded by the director.

        If policy routing is not in effect then the recalculated route will
        always be the same as the original route so it is an optimisation
        to disable snat_reroute and avoid the recalculation.

sync_persist_mode - INTEGER
        default 0

        Controls the synchronisation of connections when using persistence.

        0: All types of connections are synchronised.

        1: Attempt to reduce the synchronisation traffic depending on
        the connection type. For persistent services avoid synchronisation
        for normal connections, do it only for persistence templates.
        In such case, for TCP and SCTP it may need enabling sloppy_tcp and
        sloppy_sctp flags on backup servers. For non-persistent services
        such optimization is not applied, mode 0 is assumed.

sync_version - INTEGER
        default 1

        The version of the synchronisation protocol used when sending
        synchronisation messages.

        0 selects the original synchronisation protocol (version 0). This
        should be used when sending synchronisation messages to a legacy
        system that only understands the original synchronisation protocol.

        1 selects the current synchronisation protocol (version 1). This
        should be used where possible.

        Kernels with this sync_version entry are able to receive messages
        of both version 0 and version 1 of the synchronisation protocol.

run_estimation - BOOLEAN
        - 0 - disabled
        - not 0 - enabled (default)

        If disabled, the estimation will be stopped, and you will not
        see any updates to the speed estimation data.

        You can always re-enable estimation by setting this value to 1.
        But be careful, the first estimation after re-enabling is not
        accurate.