Commit | Line | Data |
---|---|---|
f5981a5c MCC |
1 | ===================== |
2 | The Linux IPMI Driver | |
3 | ===================== | |
1da177e4 | 4 | |
f5981a5c | 5 | :Author: Corey Minyard <minyard@mvista.com> / <minyard@acm.org> |
1da177e4 LT |
6 | |
7 | The Intelligent Platform Management Interface, or IPMI, is a | |
8 | standard for controlling intelligent devices that monitor a system. | |
9 | It provides for dynamic discovery of sensors in the system and the | |
10 | ability to monitor the sensors and be informed when the sensor's | |
11 | values change or go outside certain boundaries. It also has a | |
dc474c89 | 12 | standardized database for field-replaceable units (FRUs) and a watchdog |
1da177e4 LT |
13 | timer. |
14 | ||
15 | To use this, you need an interface to an IPMI controller in your | |
16 | system (called a Baseboard Management Controller, or BMC) and | |
17 | management software that can use the IPMI system. | |
18 | ||
19 | This document describes how to use the IPMI driver for Linux. If you | |
20 | are not familiar with IPMI itself, see the web site at | |
8a74c93d | 21 | https://www.intel.com/design/servers/ipmi/index.htm. IPMI is a big |
1da177e4 LT |
22 | subject and I can't cover it all here! |
23 | ||
24 | Configuration | |
25 | ------------- | |
26 | ||
845e78a1 | 27 | The Linux IPMI driver is modular, which means you have to pick several |
1da177e4 | 28 | things to have it work right depending on your hardware. Most of |
845e78a1 CM |
29 | these are available in the 'Character Devices' menu then the IPMI |
30 | menu. | |
1da177e4 LT |
31 | |
32 | No matter what, you must pick 'IPMI top-level message handler' to use | |
33 | IPMI. What you do beyond that depends on your needs and hardware. | |
34 | ||
35 | The message handler does not provide any user-level interfaces. | |
36 | Kernel code (like the watchdog) can still use it. If you need access | |
37 | from userland, you need to select 'Device interface for IPMI' if you | |
845e78a1 CM |
38 | want access through a device driver. |
39 | ||
40 | The driver interface depends on your hardware. If your system | |
41 | properly provides the SMBIOS info for IPMI, the driver will detect it | |
42 | and just work. If you have a board with a standard interface (these | |
43 | will generally be either "KCS", "SMIC", or "BT"; consult your hardware | |
25930707 CM |
44 | manual), choose the 'IPMI SI handler' option. A driver also exists |
45 | for direct I2C access to the IPMI management controller. Some boards | |
46 | support this, but it is unknown if it will work on every board. For | |
47 | this, choose 'IPMI SMBus handler', but be ready to try to do some | |
48 | figuring to see if it will work on your system if the SMBIOS/ACPI | |
49 | information is wrong or not present. It is fairly safe to have both | |
50 | these enabled and let the drivers auto-detect what is present. | |
1da177e4 LT |
51 | |
52 | You should generally enable ACPI on your system, as systems with IPMI | |
845e78a1 | 53 | can have ACPI tables describing them. |
1da177e4 LT |
54 | |
55 | If you have a standard interface and the board manufacturer has done | |
56 | their job correctly, the IPMI controller should be automatically | |
845e78a1 CM |
57 | detected (via ACPI or SMBIOS tables) and should just work. Sadly, |
58 | many boards do not have this information. The driver attempts | |
59 | standard defaults, but they may not work. If you fall into this | |
25930707 CM |
60 | situation, you need to read the section below named 'The SI Driver' or |
61 | 'The SMBus Driver (SSIF)' on how to hand-configure your system. | |
1da177e4 LT |
62 | |
63 | IPMI defines a standard watchdog timer. You can enable this with the | |
64 | 'IPMI Watchdog Timer' config option. If you compile the driver into | |
65 | the kernel, then via a kernel command-line option you can have the | |
dc474c89 | 66 | watchdog timer start as soon as it initializes. It also has a lot |
1da177e4 LT |
67 | of other options, see the 'Watchdog' section below for more details. |
68 | Note that you can also have the watchdog continue to run if it is | |
69 | closed (by default it is disabled on close). Go into the 'Watchdog | |
70 | Cards' menu, enable 'Watchdog Timer Support', and enable the option | |
71 | 'Disable watchdog shutdown on close'. | |
72 | ||
845e78a1 CM |
73 | IPMI systems can often be powered off using IPMI commands. Select |
74 | 'IPMI Poweroff' to do this. The driver will auto-detect if the system | |
75 | can be powered off by IPMI. It is safe to enable this even if your | |
76 | system doesn't support this option. This works on ATCA systems, the | |
77 | Radisys CPI1 card, and any IPMI system that supports standard chassis | |
78 | management commands. | |
79 | ||
80 | If you want the driver to put an event into the event log on a panic, | |
81 | enable the 'Generate a panic event to all BMCs on a panic' option. If | |
82 | you want the whole panic string put into the event log using OEM | |
83 | events, enable the 'Generate OEM events containing the panic string' | |
1c9f98d1 CM |
84 | option. You can also enable these dynamically by setting the module |
85 | parameter named "panic_op" in the ipmi_msghandler module to "event" | |
86 | or "string". Setting that parameter to "none" disables this function. | |
1da177e4 LT |
87 | |
88 | Basic Design | |
89 | ------------ | |
90 | ||
91 | The Linux IPMI driver is designed to be very modular and flexible, you | |
92 | only need to take the pieces you need and you can use it in many | |
93 | different ways. Because of that, it's broken into many chunks of | |
845e78a1 | 94 | code. These chunks (by module name) are: |
1da177e4 LT |
95 | |
96 | ipmi_msghandler - This is the central piece of software for the IPMI | |
97 | system. It handles all messages, message timing, and responses. The | |
98 | IPMI users tie into this, and the IPMI physical interfaces (called | |
99 | System Management Interfaces, or SMIs) also tie in here. This | |
100 | provides the kernelland interface for IPMI, but does not provide an | |
101 | interface for use by application processes. | |
102 | ||
103 | ipmi_devintf - This provides a userland IOCTL interface for the IPMI | |
104 | driver, each open file for this device ties in to the message handler | |
105 | as an IPMI user. | |
106 | ||
845e78a1 | 107 | ipmi_si - A driver for various system interfaces. This supports KCS, |
25930707 CM |
108 | SMIC, and BT interfaces. Unless you have an SMBus interface or your |
109 | own custom interface, you probably need to use this. | |
110 | ||
111 | ipmi_ssif - A driver for accessing BMCs on the SMBus. It uses the | |
112 | I2C kernel driver's SMBus interfaces to send and receive IPMI messages | |
113 | over the SMBus. | |
1da177e4 | 114 | |
c11daf6a CM |
115 | ipmi_powernv - A driver for accessing BMCs on POWERNV systems. |
116 | ||
845e78a1 CM |
117 | ipmi_watchdog - IPMI requires systems to have a very capable watchdog |
118 | timer. This driver implements the standard Linux watchdog timer | |
119 | interface on top of the IPMI message handler. | |
120 | ||
121 | ipmi_poweroff - Some systems support the ability to be turned off via | |
122 | IPMI commands. | |
123 | ||
c11daf6a CM |
124 | bt-bmc - This is not part of the main driver, but instead a driver for |
125 | accessing a BMC-side interface of a BT interface. It is used on BMCs | |
126 | running Linux to provide an interface to the host. | |
1da177e4 | 127 | |
c11daf6a | 128 | These are all individually selectable via configuration options. |
1da177e4 LT |
129 | |
130 | Much documentation for the interface is in the include files. The | |
131 | IPMI include files are: | |
132 | ||
1da177e4 LT |
133 | linux/ipmi.h - Contains the user interface and IOCTL interface for IPMI. |
134 | ||
135 | linux/ipmi_smi.h - Contains the interface for system management interfaces | |
136 | (things that interface to IPMI controllers) to use. | |
137 | ||
138 | linux/ipmi_msgdefs.h - General definitions for base IPMI messaging. | |
139 | ||
140 | ||
141 | Addressing | |
142 | ---------- | |
143 | ||
144 | The IPMI addressing works much like IP addresses, you have an overlay | |
f5981a5c | 145 | to handle the different address types. The overlay is:: |
1da177e4 LT |
146 | |
147 | struct ipmi_addr | |
148 | { | |
149 | int addr_type; | |
150 | short channel; | |
151 | char data[IPMI_MAX_ADDR_SIZE]; | |
152 | }; | |
153 | ||
154 | The addr_type determines what the address really is. The driver | |
155 | currently understands the following types of addresses. | |
156 | ||
f5981a5c | 157 | "System Interface" addresses are defined as:: |
1da177e4 LT |
158 | |
159 | struct ipmi_system_interface_addr | |
160 | { | |
161 | int addr_type; | |
162 | short channel; | |
163 | }; | |
164 | ||
165 | and the type is IPMI_SYSTEM_INTERFACE_ADDR_TYPE. This is used for talking | |
166 | straight to the BMC on the current card. The channel must be | |
167 | IPMI_BMC_CHANNEL. | |
168 | ||
ddf58738 CM |
169 | Messages that are destined to go out on the IPMB bus going through the |
170 | BMC use the IPMI_IPMB_ADDR_TYPE address type. The format is:: | |
1da177e4 LT |
171 | |
172 | struct ipmi_ipmb_addr | |
173 | { | |
174 | int addr_type; | |
175 | short channel; | |
176 | unsigned char slave_addr; | |
177 | unsigned char lun; | |
178 | }; | |
179 | ||
180 | The "channel" here is generally zero, but some devices support more | |
181 | than one channel, it corresponds to the channel as defined in the IPMI | |
182 | spec. | |
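The overlay idea can be illustrated with a small user-space sketch. The two structures are copied from this section rather than taken from linux/ipmi.h (the real header may carry extra fields), and the numeric values of IPMI_MAX_ADDR_SIZE and IPMI_IPMB_ADDR_TYPE, as well as the helper name make_ipmb_addr(), are assumptions made for illustration only.

```c
#include <string.h>

/* Assumed values for illustration; take the real ones from linux/ipmi.h. */
#define IPMI_MAX_ADDR_SIZE  32
#define IPMI_IPMB_ADDR_TYPE 0x01

/* Generic overlay, as shown in this section. */
struct ipmi_addr {
	int   addr_type;
	short channel;
	char  data[IPMI_MAX_ADDR_SIZE];
};

/* IPMB view of the same storage, as shown in this section. */
struct ipmi_ipmb_addr {
	int           addr_type;
	short         channel;
	unsigned char slave_addr;
	unsigned char lun;
};

/* Hypothetical helper: build an IPMB address inside the generic overlay. */
struct ipmi_addr make_ipmb_addr(short channel, unsigned char slave_addr,
				unsigned char lun)
{
	struct ipmi_addr addr;
	struct ipmi_ipmb_addr ipmb;

	memset(&addr, 0, sizeof(addr));
	memset(&ipmb, 0, sizeof(ipmb));
	ipmb.addr_type  = IPMI_IPMB_ADDR_TYPE;
	ipmb.channel    = channel;
	ipmb.slave_addr = slave_addr;
	ipmb.lun        = lun;

	/* The specific address type is no larger than the overlay. */
	memcpy(&addr, &ipmb, sizeof(ipmb));
	return addr;
}
```

Code that receives a struct ipmi_addr would look at addr_type first and only then interpret the remaining bytes as the matching specific structure.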
183 | ||
ddf58738 CM |
184 | There is also an IPMB direct address for a situation where the sender |
185 | is directly on an IPMB bus and doesn't have to go through the BMC. | |
186 | You can send messages to a specific management controller (MC) on the | |
187 | IPMB using the IPMI_IPMB_DIRECT_ADDR_TYPE with the following format:: | |
188 | ||
189 | struct ipmi_ipmb_direct_addr | |
190 | { | |
191 | int addr_type; | |
192 | short channel; | |
193 | unsigned char slave_addr; | |
194 | unsigned char rq_lun; | |
195 | unsigned char rs_lun; | |
196 | }; | |
197 | ||
198 | The channel is always zero. You can also receive commands from other | |
199 | MCs that you have registered to handle and respond to them, so you can | |
200 | use this to implement a management controller on a bus. | |
1da177e4 LT |
201 | |
202 | Messages | |
203 | -------- | |
204 | ||
f5981a5c | 205 | Messages are defined as:: |
1da177e4 | 206 | |
f5981a5c MCC |
207 | struct ipmi_msg |
208 | { | |
1da177e4 LT |
209 | unsigned char netfn; |
210 | unsigned char lun; | |
211 | unsigned char cmd; | |
212 | unsigned char *data; | |
213 | int data_len; | |
f5981a5c | 214 | }; |
1da177e4 LT |
215 | |
216 | The driver takes care of adding/stripping the header information. The | |
217 | data portion is just the data to be sent (do NOT put addressing info | |
218 | here) or the response. Note that the completion code of a response is | |
219 | the first item in "data", it is not stripped out because that is how | |
220 | all the messages are defined in the spec (and thus makes counting the | |
221 | offsets a little easier :-). | |
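As a minimal sketch of what "the completion code is the first item in data" means for a caller, the helper below splits a response into its completion code and the bytes that follow. The structure mirrors the struct ipmi_msg shown in this section; split_response() is a hypothetical name, not a driver function.

```c
#include <stddef.h>

/* Mirrors the struct ipmi_msg layout shown in this section. */
struct ipmi_msg {
	unsigned char  netfn;
	unsigned char  lun;
	unsigned char  cmd;
	unsigned char *data;
	int            data_len;
};

/*
 * Hypothetical helper: split a response into completion code and payload.
 * Returns 0 on success, -1 if the response carries no completion code.
 */
int split_response(const struct ipmi_msg *rsp, unsigned char *cc,
		   const unsigned char **payload, int *payload_len)
{
	if (rsp->data == NULL || rsp->data_len < 1)
		return -1;

	*cc          = rsp->data[0];      /* completion code comes first */
	*payload     = rsp->data + 1;     /* response bytes follow it */
	*payload_len = rsp->data_len - 1;
	return 0;
}
```

Because the completion code is left in place, payload[0] here corresponds to byte 2 of the response data as numbered in the IPMI spec.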
222 | ||
223 | When using the IOCTL interface from userland, you must provide a block | |
224 | of data for "data", fill it, and set data_len to the length of the | |
225 | block of data, even when receiving messages. Otherwise the driver | |
226 | will have no place to put the message. | |
227 | ||
228 | Messages coming up from the message handler in kernelland will come in | |
f5981a5c | 229 | as:: |
1da177e4 LT |
230 | |
231 | struct ipmi_recv_msg | |
232 | { | |
233 | struct list_head link; | |
234 | ||
235 | /* The type of message as defined in the "Receive Types" | |
236 | defines above. */ | |
237 | int recv_type; | |
238 | ||
239 | ipmi_user_t *user; | |
240 | struct ipmi_addr addr; | |
241 | long msgid; | |
242 | struct ipmi_msg msg; | |
243 | ||
244 | /* Call this when done with the message. It will presumably free | |
245 | the message and do any other necessary cleanup. */ | |
246 | void (*done)(struct ipmi_recv_msg *msg); | |
247 | ||
248 | /* Place-holder for the data, don't make any assumptions about | |
249 | the size or existence of this, since it may change. */ | |
250 | unsigned char msg_data[IPMI_MAX_MSG_LENGTH]; | |
251 | }; | |
252 | ||
253 | You should look at the receive type and handle the message | |
254 | appropriately. | |
255 | ||
256 | ||
257 | The Upper Layer Interface (Message Handler) | |
258 | ------------------------------------------- | |
259 | ||
260 | The upper layer of the interface provides the users with a consistent | |
261 | view of the IPMI interfaces. It allows multiple SMI interfaces to be | |
262 | addressed (because some boards actually have multiple BMCs on them) | |
263 | and the user should not have to care what type of SMI is below them. | |
264 | ||
265 | ||
c11daf6a | 266 | Watching For Interfaces |
f5981a5c | 267 | ^^^^^^^^^^^^^^^^^^^^^^^ |
c11daf6a CM |
268 | |
269 | When your code comes up, the IPMI driver may or may not have detected | |
270 | if IPMI devices exist. So you might have to defer your setup until | |
271 | the device is detected, or you might be able to do it immediately. | |
272 | To handle this, and to allow for discovery, you register an SMI | |
273 | watcher with ipmi_smi_watcher_register() to iterate over interfaces | |
274 | and tell you when they come and go. | |
275 | ||
276 | ||
1da177e4 | 277 | Creating the User |
f5981a5c | 278 | ^^^^^^^^^^^^^^^^^ |
1da177e4 | 279 | |
08f6cd01 | 280 | To use the message handler, you must first create a user using |
1da177e4 LT |
281 | ipmi_create_user. The interface number specifies which SMI you want |
282 | to connect to, and you must supply callback functions to be called | |
283 | when data comes in. The callback function can run at interrupt level, | |
284 | so be careful using the callbacks. This also allows you to pass in a | |
285 | piece of data, the handler_data, that will be passed back to you on | |
286 | all calls. | |
287 | ||
288 | Once you are done, call ipmi_destroy_user() to get rid of the user. | |
289 | ||
290 | From userland, opening the device automatically creates a user, and | |
291 | closing the device automatically destroys the user. | |
292 | ||
293 | ||
294 | Messaging | |
f5981a5c | 295 | ^^^^^^^^^ |
1da177e4 | 296 | |
c11daf6a | 297 | To send a message from kernel-land, the ipmi_request_settime() call does |
1da177e4 LT |
298 | pretty much all message handling. Most of the parameters are |
299 | self-explanatory. However, it takes a "msgid" parameter. This is NOT | |
300 | the sequence number of messages. It is simply a long value that is | |
301 | passed back when the response for the message is returned. You may | |
302 | use it for anything you like. | |
303 | ||
304 | Responses come back in the function pointed to by the ipmi_recv_hndl | |
305 | field of the "handler" that you passed in to ipmi_create_user(). | |
306 | Remember again, these may be running at interrupt level. Remember to | |
307 | look at the receive type, too. | |
308 | ||
309 | From userland, you fill out an ipmi_req_t structure and use the | |
310 | IPMICTL_SEND_COMMAND ioctl. For incoming stuff, you can use select() | |
311 | or poll() to wait for messages to come in. However, you cannot use | |
312 | read() to get them, you must call the IPMICTL_RECEIVE_MSG ioctl with the | |
313 | ipmi_recv_t structure to actually get the message. Remember that you | |
314 | must supply a pointer to a block of data in the msg.data field, and | |
315 | you must fill in the msg.data_len field with the size of the data. | |
316 | This gives the receiver a place to actually put the message. | |
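The userland sequence described above might look roughly like the following sketch, which sends a Get Device ID request to the local BMC and waits for one message. It assumes the exported kernel header linux/ipmi.h is installed; the function names fill_get_device_id() and get_device_id() and the msgid value are made up for illustration, and a real program would also check recv_type and msgid on the received message instead of assuming the first message is the reply.

```c
#include <poll.h>
#include <string.h>
#include <sys/ioctl.h>
#include <linux/ipmi.h>

/* Fill a request for "Get Device ID" addressed to the local BMC. */
void fill_get_device_id(struct ipmi_req *req,
			struct ipmi_system_interface_addr *si)
{
	memset(si, 0, sizeof(*si));
	si->addr_type = IPMI_SYSTEM_INTERFACE_ADDR_TYPE;
	si->channel   = IPMI_BMC_CHANNEL;

	memset(req, 0, sizeof(*req));
	req->addr         = (unsigned char *)si;
	req->addr_len     = sizeof(*si);
	req->msgid        = 1;              /* arbitrary, passed back with the reply */
	req->msg.netfn    = IPMI_NETFN_APP_REQUEST;
	req->msg.cmd      = IPMI_GET_DEVICE_ID_CMD;
	req->msg.data     = NULL;           /* this command has no request data */
	req->msg.data_len = 0;
}

/*
 * Send the request on an open IPMI device and wait up to timeout_ms for
 * a message. Returns the number of bytes placed in buf (completion code
 * first) or -1 on any failure.
 */
int get_device_id(int fd, unsigned char *buf, unsigned short buflen,
		  int timeout_ms)
{
	struct ipmi_system_interface_addr si;
	struct ipmi_addr from;
	struct ipmi_req req;
	struct ipmi_recv rsp;
	struct pollfd pfd = { .fd = fd, .events = POLLIN };

	fill_get_device_id(&req, &si);
	if (ioctl(fd, IPMICTL_SEND_COMMAND, &req) < 0)
		return -1;
	if (poll(&pfd, 1, timeout_ms) <= 0)
		return -1;

	memset(&rsp, 0, sizeof(rsp));
	rsp.addr         = (unsigned char *)&from;
	rsp.addr_len     = sizeof(from);
	rsp.msg.data     = buf;             /* caller-supplied receive buffer */
	rsp.msg.data_len = buflen;          /* tells the driver how big it is */
	if (ioctl(fd, IPMICTL_RECEIVE_MSG, &rsp) < 0)
		return -1;
	return rsp.msg.data_len;
}
```

A caller would first open the device node with open(path, O_RDWR), where the path (commonly /dev/ipmi0) depends on the local system, and then pass the descriptor to get_device_id().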
317 | ||
318 | If the message cannot fit into the data you provide, you will get an | |
319 | EMSGSIZE error and the driver will leave the data in the receive | |
320 | queue. If you want to get it and have it truncate the message, use | |
321 | the IPMICTL_RECEIVE_MSG_TRUNC ioctl. | |
322 | ||
323 | When you send a command (which is defined by the lowest-order bit of | |
324 | the netfn per the IPMI spec) on the IPMB bus, the driver will | |
325 | automatically assign the sequence number to the command and save the | |
326 | command. If the response is not received in the IPMI-specified 5 | |
327 | seconds, it will generate a response automatically saying the command | |
328 | timed out. If an unsolicited response comes in (if it was after 5 | |
329 | seconds, for instance), that response will be ignored. | |
330 | ||
331 | In kernelland, after you receive a message and are done with it, you | |
332 | MUST call ipmi_free_recv_msg() on it, or you will leak messages. Note | |
333 | that you should NEVER mess with the "done" field of a message, that is | |
334 | required to properly clean up the message. | |
335 | ||
336 | Note that when sending, there is an ipmi_request_supply_msgs() call | |
337 | that lets you supply the smi and receive message. This is useful for | |
338 | pieces of code that need to work even if the system is out of buffers | |
339 | (the watchdog timer uses this, for instance). You supply your own | |
340 | buffer and own free routines. This is not recommended for normal use, | |
341 | though, since it is tricky to manage your own buffers. | |
342 | ||
343 | ||
344 | Events and Incoming Commands | |
f5981a5c | 345 | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^ |
1da177e4 LT |
346 | |
347 | The driver takes care of polling for IPMI events and receiving | |
348 | commands (commands are messages that are not responses, they are | |
349 | commands that other things on the IPMB bus have sent you). To receive | |
350 | these, you must register for them, they will not automatically be sent | |
351 | to you. | |
352 | ||
353 | To receive events, you must call ipmi_set_gets_events() and set the | |
354 | "val" to non-zero. Any events that have been received by the driver | |
355 | since startup will immediately be delivered to the first user that | |
356 | registers for events. After that, if multiple users are registered | |
357 | for events, they will all receive all events that come in. | |
358 | ||
359 | For receiving commands, you have to individually register commands you | |
360 | want to receive. Call ipmi_register_for_cmd() and supply the netfn | |
c69c3127 CM |
361 | and command name for each command you want to receive. You also |
362 | specify a bitmask of the channels you want to receive the command from | |
363 | (or use IPMI_CHAN_ALL for all channels if you don't care). Only one | |
364 | user may be registered for each netfn/cmd/channel, but different users | |
365 | may register for different commands, or the same command if the | |
366 | channel bitmasks do not overlap. | |
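The overlap rule for channel bitmasks can be sketched in a few lines. This is an illustration of the rule as stated above, not the driver's own code; chan_mask() and registrations_conflict() are hypothetical names, and IPMI_CHAN_ALL is assumed here to be a mask with every bit set.

```c
/* Assumed to mean "every channel"; the real definition is in linux/ipmi.h. */
#define IPMI_CHAN_ALL (~0u)

/* Hypothetical helper: bit n set means "receive the command from channel n". */
unsigned int chan_mask(int channel)
{
	return 1u << channel;
}

/*
 * Two registrations for the same netfn/cmd conflict if they share
 * at least one channel.
 */
int registrations_conflict(unsigned int chans_a, unsigned int chans_b)
{
	return (chans_a & chans_b) != 0;
}
```

With these helpers, one user registering for channel 0 and another for channel 1 do not conflict, while any registration using IPMI_CHAN_ALL conflicts with every other registration for the same netfn and command.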
1da177e4 | 367 | |
ddf58738 CM |
368 | To respond to a received command, set the response bit in the returned |
369 | netfn, use the address from the received message, and use the same | |
370 | msgid that you got in the receive message. | |
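Since commands and responses differ only in the lowest-order bit of the netfn, setting the response bit is a one-line operation. The helper names below are hypothetical and the sketch only covers the netfn handling; the address and msgid are simply copied from the received message as described above.

```c
/* Commands (requests) have the low netfn bit clear; responses have it set. */
int netfn_is_command(unsigned char netfn)
{
	return (netfn & 1) == 0;
}

/* Turn the netfn of a received command into the netfn for its response. */
unsigned char netfn_to_response(unsigned char netfn)
{
	return (unsigned char)(netfn | 1);
}
```

For example, a command received with netfn 0x06 would be answered with netfn 0x07.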
371 | ||
1da177e4 LT |
372 | From userland, equivalent IOCTLs are provided to do these functions. |
373 | ||
374 | ||
375 | The Lower Layer (SMI) Interface | |
376 | ------------------------------- | |
377 | ||
378 | As mentioned before, multiple SMI interfaces may be registered to the | |
379 | message handler, each of these is assigned an interface number when | |
380 | they register with the message handler. They are generally assigned | |
381 | in the order they register, although if an SMI unregisters and then | |
382 | another one registers, all bets are off. | |
383 | ||
384 | The ipmi_smi.h defines the interface for management interfaces, see | |
385 | that for more details. | |
386 | ||
387 | ||
388 | The SI Driver | |
389 | ------------- | |
390 | ||
c11daf6a CM |
391 | The SI driver allows KCS, BT, and SMIC interfaces to be configured |
392 | in the system. It discovers interfaces through a host of different | |
393 | methods, depending on the system. | |
394 | ||
395 | You can specify up to four interfaces on the module load line and | |
f5981a5c | 396 | control some module parameters:: |
1da177e4 LT |
397 | |
398 | modprobe ipmi_si type=<type1>,<type2>... |
399 | ports=<port1>,<port2>... addrs=<addr1>,<addr2>... | |
f2afae46 | 400 | irqs=<irq1>,<irq2>... |
1da177e4 LT |
401 | regspacings=<sp1>,<sp2>,... regsizes=<size1>,<size2>,... |
402 | regshifts=<shift1>,<shift2>,... | |
403 | slave_addrs=<addr1>,<addr2>,... | |
a51f4a81 | 404 | force_kipmid=<enable1>,<enable2>,... |
ae74e823 | 405 | kipmid_max_busy_us=<ustime1>,<ustime2>,... |
b361e27b | 406 | unload_when_empty=[0|1] |
c11daf6a | 407 | trydmi=[0|1] tryacpi=[0|1] |
f2afae46 | 408 | tryplatform=[0|1] trypci=[0|1] |
1da177e4 | 409 | |
f2afae46 | 410 | Each of these except try... items is a list, the first item for the |
1da177e4 LT |
411 | first interface, second item for the second interface, etc. |
412 | ||
413 | The type may be either "kcs", "smic", or "bt". If you leave it blank, it | |
414 | defaults to "kcs". | |
415 | ||
f2afae46 | 416 | If you specify addrs as non-zero for an interface, the driver will |
1da177e4 LT |
417 | use the memory address given as the address of the device. This |
418 | overrides ports. | |
419 | ||
f2afae46 | 420 | If you specify ports as non-zero for an interface, the driver will |
1da177e4 LT |
421 | use the I/O port given as the device address. |
422 | ||
f2afae46 | 423 | If you specify irqs as non-zero for an interface, the driver will |
1da177e4 LT |
424 | attempt to use the given interrupt for the device. |
425 | ||
f2afae46 CM |
426 | The other try... items disable discovery by their corresponding |
427 | names. These are all enabled by default, set them to zero to disable | |
428 | them. The tryplatform disables openfirmware. | |
429 | ||
1da177e4 LT |
430 | The next three parameters have to do with register layout. The |
431 | registers used by the interfaces may not appear at successive | |
432 | locations and they may not be in 8-bit registers. These parameters | |
433 | allow the layout of the data in the registers to be more precisely | |
434 | specified. | |
435 | ||
436 | The regspacings parameter gives the number of bytes between successive | |
437 | register start addresses. For instance, if the regspacing is set to 4 | |
438 | and the start address is 0xca2, then the address for the second | |
439 | register would be 0xca6. This defaults to 1. | |
440 | ||
441 | The regsizes parameter gives the size of a register, in bytes. The | |
442 | data used by IPMI is 8-bits wide, but it may be inside a larger | |
443 | register. This parameter allows the read and write type to be specified. | |
444 | It may be 1, 2, 4, or 8. The default is 1. | |
445 | ||
446 | Since the register size may be larger than 8 bits, the IPMI data may not | |
447 | be in the lower 8 bits. The regshifts parameter gives the amount to shift | |
448 | the data to get to the actual IPMI data. | |
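The effect of the register layout parameters can be shown with two small functions. This is a sketch of the arithmetic described above, not the driver's code: the function names are hypothetical, and regshift is assumed to be a right shift counted in bits.

```c
#include <stdint.h>

/* Address of register number reg, given the spacing between register starts. */
unsigned long si_reg_addr(unsigned long base, unsigned int regspacing,
			  unsigned int reg)
{
	return base + (unsigned long)reg * regspacing;
}

/* Pull the 8-bit IPMI value out of a wider register read. */
uint8_t si_reg_data(uint64_t raw, unsigned int regshift)
{
	return (uint8_t)(raw >> regshift);
}
```

With a start address of 0xca2 and a regspacing of 4, si_reg_addr(0xca2, 4, 1) gives 0xca6, matching the example in the text.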
449 | ||
450 | The slave_addrs specifies the IPMI address of the local BMC. This is | |
451 | usually 0x20 and the driver defaults to that, but in case it's not, it | |
452 | can be specified when the driver starts up. | |
453 | ||
a51f4a81 CM |
454 | The force_kipmid parameter forcefully enables (if set to 1) or disables |
455 | (if set to 0) the kernel IPMI daemon. Normally this is auto-detected | |
456 | by the driver, but systems with broken interrupts might need an enable, | |
457 | or users that don't want the daemon (don't need the performance, don't | |
458 | want the CPU hit) can disable it. | |
459 | ||
b361e27b CM |
460 | If unload_when_empty is set to 1, the driver will be unloaded if it |
461 | doesn't find any interfaces or all the interfaces fail to work. The | |
462 | default is one. Setting to 0 is useful with the hotmod, but is | |
463 | obviously only useful for modules. | |
464 | ||
a51f4a81 | 465 | When compiled into the kernel, the parameters can be specified on the |
f5981a5c | 466 | kernel command line as:: |
1da177e4 LT |
467 | |
468 | ipmi_si.type=<type1>,<type2>... | |
469 | ipmi_si.ports=<port1>,<port2>... ipmi_si.addrs=<addr1>,<addr2>... | |
c11daf6a | 470 | ipmi_si.irqs=<irq1>,<irq2>... |
1da177e4 LT |
471 | ipmi_si.regspacings=<sp1>,<sp2>,... |
472 | ipmi_si.regsizes=<size1>,<size2>,... | |
473 | ipmi_si.regshifts=<shift1>,<shift2>,... | |
474 | ipmi_si.slave_addrs=<addr1>,<addr2>,... | |
a51f4a81 | 475 | ipmi_si.force_kipmid=<enable1>,<enable2>,... |
ae74e823 | 476 | ipmi_si.kipmid_max_busy_us=<ustime1>,<ustime2>,... |
1da177e4 LT |
477 | |
478 | It works the same as the module parameters of the same names. | |
479 | ||
650dd0c7 CM |
480 | If your IPMI interface does not support interrupts and is a KCS or |
481 | SMIC interface, the IPMI driver will start a kernel thread for the | |
482 | interface to help speed things up. This is a low-priority kernel | |
483 | thread that constantly polls the IPMI driver while an IPMI operation | |
484 | is in progress. The force_kipmid module parameter will allow the user to | |
485 | force this thread on or off. If you force it off and don't have | |
486 | interrupts, the driver will run VERY slowly. Don't blame me, | |
1da177e4 LT |
487 | these interfaces suck. |
488 | ||
ae74e823 MW |
489 | Unfortunately, this thread can use a lot of CPU depending on the |
490 | interface's performance. This can waste a lot of CPU and cause | |
491 | various issues with detecting idle CPU and using extra power. To | |
492 | avoid this, the kipmid_max_busy_us sets the maximum amount of time, in | |
493 | microseconds, that kipmid will spin before sleeping for a tick. This | |
494 | value sets a balance between performance and CPU waste and needs to be | |
495 | tuned to your needs. Maybe, someday, auto-tuning will be added, but | |
496 | that's not a simple thing and even the auto-tuning would need to be | |
497 | tuned to the user's desired performance. | |
498 | ||
b361e27b CM |
499 | The driver supports a hot add and remove of interfaces. This way, |
500 | interfaces can be added or removed after the kernel is up and running. | |
650dd0c7 CM |
501 | This is done using /sys/module/ipmi_si/parameters/hotmod, which is a |
502 | write-only parameter. You write a string to this interface. The string | |
f5981a5c MCC |
503 | has the format:: |
504 | ||
b361e27b | 505 | <op1>[:op2[:op3...]] |
f5981a5c MCC |
506 | |
507 | The "op"s are:: | |
508 | ||
b361e27b | 509 | add|remove,kcs|bt|smic,mem|i/o,<address>[,<opt1>[,<opt2>[,...]]] |
f5981a5c MCC |
510 | |
511 | You can specify more than one interface on the line. The "opt"s are:: | |
512 | ||
b361e27b CM |
513 | rsp=<regspacing> |
514 | rsi=<regsize> | |
515 | rsh=<regshift> | |
516 | irq=<irq> | |
517 | ipmb=<ipmb slave addr> | |
f5981a5c | 518 | |
b361e27b CM |
519 | and these have the same meanings as discussed above. Note that you |
520 | can also use this on the kernel command line for a more compact format | |
521 | for specifying an interface. Note that when removing an interface, | |
522 | only the first three parameters (si type, address type, and address) | |
523 | are used for the comparison. Any options are ignored for removing. | |
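A program that drives the hotmod interface needs to assemble one of these operation strings. The following sketch formats a single operation; hotmod_format() is a hypothetical helper, and writing the address in hexadecimal with a 0x prefix is an assumption about what the parser accepts.

```c
#include <stdio.h>

/*
 * Hypothetical helper: format one hotmod operation into buf.
 * op is "add" or "remove", si_type is "kcs", "bt" or "smic",
 * space is "mem" or "i/o", and opts is NULL or a comma-separated
 * option list such as "rsp=4,irq=10".
 * Returns the snprintf() result (length needed, excluding the NUL).
 */
int hotmod_format(char *buf, size_t len, const char *op, const char *si_type,
		  const char *space, unsigned long addr, const char *opts)
{
	if (opts && *opts)
		return snprintf(buf, len, "%s,%s,%s,0x%lx,%s",
				op, si_type, space, addr, opts);
	return snprintf(buf, len, "%s,%s,%s,0x%lx", op, si_type, space, addr);
}
```

Calling hotmod_format(buf, sizeof(buf), "add", "kcs", "i/o", 0xca2, "rsp=4") produces the string add,kcs,i/o,0xca2,rsp=4, which would then be written to the hotmod parameter file. For a remove operation the options can be left out, since only the first three parameters are compared.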
1da177e4 | 524 | |
25930707 CM |
525 | The SMBus Driver (SSIF) |
526 | ----------------------- | |
527 | ||
528 | The SMBus driver allows up to 4 SMBus devices to be configured in the | |
529 | system. By default, the driver will only register with something it | |
530 | finds in DMI or ACPI tables. You can change this | |
f5981a5c | 531 | at module load time (for a module) with:: |
25930707 CM |
532 | |
533 | modprobe ipmi_ssif | |
534 | addr=<i2caddr1>[,<i2caddr2>[,...]] | |
535 | adapter=<adapter1>[,<adapter2>[...]] | |
536 | dbg=<flags1>,<flags2>... | |
c11daf6a CM |
537 | slave_addrs=<addr1>,<addr2>,... |
538 | tryacpi=[0|1] trydmi=[0|1] | |
25930707 | 539 | [dbg_probe=1] |
c8e0506f | 540 | alerts_broken |
25930707 CM |
541 | |
542 | The addresses are normal I2C addresses. The adapter is the string | |
64dce81f | 543 | name of the adapter, as shown in /sys/bus/i2c/devices/i2c-<n>/name. |
b0e9aaa9 CM |
544 | It is *NOT* i2c-<n> itself. Also, the comparison is done ignoring |
545 | spaces, so if the name is "This is an I2C chip" you can say | |
546 | adapter_name=ThisisanI2cchip. This is because it's hard to pass in | |
547 | spaces in kernel parameters. | |
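A comparison that ignores spaces can be sketched as follows. This is an illustration of the idea only, under the assumption that case is significant; names_match_nospace() is a hypothetical name and the driver's own routine may differ in detail.

```c
#include <ctype.h>

/*
 * Compare two names while skipping whitespace in both.
 * Returns 1 if they match, 0 otherwise. Case is significant in this
 * sketch; the text above does not say how the driver treats case.
 */
int names_match_nospace(const char *a, const char *b)
{
	for (;;) {
		while (isspace((unsigned char)*a))
			a++;
		while (isspace((unsigned char)*b))
			b++;
		if (*a != *b)
			return 0;
		if (*a == '\0')
			return 1;
		a++;
		b++;
	}
}
```

With this routine, an adapter whose sysfs name is "SMBus adapter at 0400" would match a parameter value of SMBusadapterat0400.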
25930707 CM |
548 | |
549 | The debug flags are bit flags for each BMC found, they are: | |
550 | IPMI messages: 1, driver state: 2, timing: 4, I2C probe: 8 | |
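Because the debug flags are individual bits, several can be combined in one dbg value by OR-ing them together. The enum names below are hypothetical labels for the documented bit values, not identifiers from the driver.

```c
/* Hypothetical names for the documented dbg bit values. */
enum ssif_dbg_flag {
	SSIF_DBG_MSG    = 1,  /* IPMI messages */
	SSIF_DBG_STATE  = 2,  /* driver state */
	SSIF_DBG_TIMING = 4,  /* timing */
	SSIF_DBG_PROBE  = 8   /* I2C probe */
};

/* Returns non-zero if the given flag is set in a dbg value. */
int ssif_dbg_enabled(unsigned int dbg, enum ssif_dbg_flag flag)
{
	return (dbg & (unsigned int)flag) != 0;
}
```

For instance, a dbg value of 5 enables message and timing debugging but not driver state or I2C probe debugging.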
551 | ||
c11daf6a CM |
552 | The tryxxx parameters can be used to disable detecting interfaces |
553 | from various sources. | |
554 | ||
25930707 CM |
555 | Setting dbg_probe to 1 will enable debugging of the probing and |
556 | detection process for BMCs on the SMBusses. | |
557 | ||
558 | The slave_addrs specifies the IPMI address of the local BMC. This is | |
559 | usually 0x20 and the driver defaults to that, but in case it's not, it | |
560 | can be specified when the driver starts up. | |
561 | ||
c8e0506f MT |
562 | alerts_broken does not enable SMBus alert for SSIF. Otherwise SMBus |
563 | alert will be enabled on supported hardware. | |
564 | ||
25930707 CM |
565 | Discovering the IPMI compliant BMC on the SMBus can cause devices on |
566 | the I2C bus to fail. The SMBus driver writes a "Get Device ID" IPMI | |
567 | message as a block write to the I2C bus and waits for a response. | |
568 | This action can be detrimental to some I2C devices. It is highly | |
569 | recommended that the known I2C address be given to the SMBus driver in | |
570 | the addr parameter unless you have DMI or ACPI data to tell the | |
571 | driver what to use. | |
572 | ||
573 | When compiled into the kernel, the addresses can be specified on the | |
f5981a5c | 574 | kernel command line as:: |
25930707 CM |
575 | |
576 | ipmi_ssif.addr=<i2caddr1>[,<i2caddr2>[...]] | |
577 | ipmi_ssif.adapter=<adapter1>[,<adapter2>[...]] | |
578 | ipmi_ssif.dbg=<flags1>[,<flags2>[...]] | |
579 | ipmi_ssif.dbg_probe=1 | |
c11daf6a CM |
580 | ipmi_ssif.slave_addrs=<addr1>[,<addr2>[...]] |
581 | ipmi_ssif.tryacpi=[0|1] ipmi_ssif.trydmi=[0|1] | |
25930707 CM |
582 | |
583 | These are the same options as on the module command line. | |
584 | ||
585 | The I2C driver does not support non-blocking access or polling, so | |
586 | this driver cannot do IPMI panic events, extend the watchdog at panic | |
587 | time, or other panic-related IPMI functions without special kernel | |
588 | patches and driver modifications. You can get those at the openipmi | |
589 | web page. | |
590 | ||
591 | The driver supports a hot add and remove of interfaces through the I2C | |
592 | sysfs interface. | |
1da177e4 | 593 | |
b81a817a CM |
594 | The IPMI IPMB Driver |
595 | -------------------- | |
596 | ||
597 | This driver is for supporting a system that sits on an IPMB bus; it | |
598 | allows the interface to look like a normal IPMI interface. Sending | |
599 | system interface addressed messages to it will cause the message to go | |
600 | to the registered BMC on the system (default at IPMI address 0x20). | |
601 | ||
602 | It also allows you to directly address other MCs on the bus using the | |
603 | ipmb direct addressing. You can receive commands from other MCs on | |
604 | the bus and they will be handled through the normal received command | |
605 | mechanism described above. | |
606 | ||
607 | Parameters are:: | |
608 | ||
609 | ipmi_ipmb.bmcaddr=<address to use for system interface addresses messages> | |
610 | ipmi_ipmb.retry_time_ms=<Time between retries on IPMB> | |
611 | ipmi_ipmb.max_retries=<Number of times to retry a message> | |
612 | ||
613 | Loading the module will not result in the driver automatically | |
614 | starting unless there is device tree information setting it up. If | |
615 | you want to instantiate one of these by hand, do:: | |
616 | ||
617 | echo ipmi-ipmb <addr> > /sys/class/i2c-dev/i2c-<n>/device/new_device | |
618 | ||
619 | Note that the address you give here is the I2C address, not the IPMI | |
620 | address. So if you want your MC address to be 0x60, you put 0x30 | |
621 | here. See the I2C driver info for more details. | |
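The address conversion described above can be sketched as a small helper. This is only an illustration: the function name ipmb_to_i2c_addr is hypothetical and is not part of the driver. It assumes the usual IPMB convention that the IPMI address is the 8-bit form of the 7-bit I2C address, with bit 0 reserved for the I2C read/write bit.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Convert an 8-bit IPMI (IPMB) address to the 7-bit I2C address that
 * is written to new_device. Hypothetical helper, not a kernel function.
 */
uint8_t ipmb_to_i2c_addr(uint8_t ipmi_addr)
{
	/* IPMB addresses are even; bit 0 is the I2C read/write bit. */
	assert((ipmi_addr & 1) == 0);
	return ipmi_addr >> 1;
}
```

With this helper, an MC address of 0x60 maps to 0x30, which is the value to echo into new_device.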
622 | ||
623 | Command bridging to other IPMB busses through this interface does not | |
624 | work. The receive message queue is not implemented, by design. There | |
625 | is only one receive message queue on a BMC, and that is meant for the | |
626 | host drivers, not something on the IPMB bus. | |
627 | ||
628 | A BMC may have multiple IPMB busses; which bus your device sits on | |
629 | depends on how the system is wired. You can fetch the channels with | |
630 | "ipmitool channel info <n>" where <n> is the channel, with the | |
631 | channels being 0-7 and try the IPMB channels. | |
632 | ||
1da177e4 LT |
633 | Other Pieces |
634 | ------------ | |
635 | ||
37bf501b ZY |
636 | Get the detailed info related with the IPMI device |
637 | -------------------------------------------------- | |
638 | ||
639 | Some users need more detailed information about a device, like where | |
640 | the address came from or the raw base device for the IPMI interface. | |
641 | You can use the IPMI smi_watcher to catch the IPMI interfaces as they | |
642 | come or go, and to grab the information, you can use the function | |
f5981a5c | 643 | ipmi_get_smi_info(), which returns the following structure:: |
37bf501b | 644 | |
f5981a5c | 645 | struct ipmi_smi_info { |
37bf501b ZY |
646 | enum ipmi_addr_src addr_src; |
647 | struct device *dev; | |
648 | union { | |
649 | struct { | |
650 | void *acpi_handle; | |
651 | } acpi_info; | |
652 | } addr_info; | |
f5981a5c | 653 | }; |
37bf501b ZY |
654 | |
655 | Currently, special info is returned only for SI_ACPI address | |
656 | sources. Others may be added as necessary. | |
657 | ||
658 | Note that the dev pointer is included in the above structure, and | |
659 | assuming ipmi_get_smi_info returns success, you must call put_device | |
660 | on the dev pointer. | |
661 | ||
662 | ||
1da177e4 LT |
663 | Watchdog |
664 | -------- | |
665 | ||
666 | A watchdog timer is provided that implements the Linux-standard | |
667 | watchdog timer interface. It has several module parameters that can be | |
f5981a5c | 668 | used to control it:: |
1da177e4 LT |
669 | |
670 | modprobe ipmi_watchdog timeout=<t> pretimeout=<t> action=<action type> | |
671 | preaction=<preaction type> preop=<preop type> start_now=x | |
c7f42c63 | 672 | nowayout=x ifnum_to_use=n panic_wdt_timeout=<t> |
b2c03941 CM |
673 | |
674 | ifnum_to_use specifies which interface the watchdog timer should use. | |
675 | The default is -1, which means to pick the first one registered. | |
1da177e4 LT |
676 | |
677 | The timeout is the number of seconds to the action, and the pretimeout | |
678 | is the number of seconds before the reset that the pre-timeout panic will | |
679 | occur (if pretimeout is zero, then pretimeout will not be enabled). Note | |
680 | that the pretimeout is the time before the final timeout. So if the | |
681 | timeout is 50 seconds and the pretimeout is 10 seconds, then the pretimeout | |
c7f42c63 JYF |
682 | will occur in 40 seconds (10 seconds before the timeout). The panic_wdt_timeout |
683 | is the value of timeout which is set on kernel panic, in order to let actions | |
684 | such as kdump occur during panic. | |
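The relationship between timeout and pretimeout can be sketched as simple arithmetic. The helper below is hypothetical and is not part of ipmi_watchdog. Treating a pretimeout that is not smaller than the timeout as invalid is an assumption made for this sketch, not documented driver behavior.

```c
/*
 * Return the number of seconds after the timer is armed at which the
 * pretimeout fires, or -1 if the pretimeout is disabled.
 * Hypothetical helper for illustration only.
 */
int pretimeout_fires_after(int timeout, int pretimeout)
{
	if (pretimeout <= 0)
		return -1;	/* zero means the pretimeout is not enabled */
	if (pretimeout >= timeout)
		return -1;	/* assumption: treat this as invalid */
	return timeout - pretimeout;
}
```

For timeout=50 and pretimeout=10 this gives 40, matching the example in the text.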
1da177e4 LT |
685 | |
686 | The action may be "reset", "power_cycle", or "power_off", and | |
687 | specifies what to do when the timer times out, and defaults to | |
688 | "reset". | |
689 | ||
690 | The preaction may be "pre_smi" for an indication through the SMI | |
691 | interface, "pre_int" for an indication through the SMI with an | |
692 | interrupt, and "pre_nmi" for an NMI on a preaction. This is how | |
693 | the driver is informed of the pretimeout. | |
694 | ||
695 | The preop may be set to "preop_none" for no operation on a pretimeout, | |
696 | "preop_panic" to set the preoperation to panic, or "preop_give_data" | |
697 | to provide data to read from the watchdog device when the pretimeout | |
698 | occurs. A "pre_nmi" setting CANNOT be used with "preop_give_data" | |
699 | because you can't do data operations from an NMI. | |
700 | ||
701 | When preop is set to "preop_give_data", one byte comes ready to read | |
702 | on the device when the pretimeout occurs. Select and fasync work on | |
703 | the device, as well. | |
704 | ||
705 | If start_now is set to 1, the watchdog timer will start running as | |
706 | soon as the driver is loaded. | |
707 | ||
708 | If nowayout is set to 1, the watchdog timer will not stop when the | |
709 | watchdog device is closed. The default value of nowayout is true | |
710 | if the CONFIG_WATCHDOG_NOWAYOUT option is enabled, or false if not. | |
711 | ||
712 | When compiled into the kernel, the kernel command line is available | |
f5981a5c | 713 | for configuring the watchdog:: |
1da177e4 LT |
714 | |
715 | ipmi_watchdog.timeout=<t> ipmi_watchdog.pretimeout=<t> | |
716 | ipmi_watchdog.action=<action type> | |
717 | ipmi_watchdog.preaction=<preaction type> | |
718 | ipmi_watchdog.preop=<preop type> | |
719 | ipmi_watchdog.start_now=x | |
720 | ipmi_watchdog.nowayout=x | |
c7f42c63 | 721 | ipmi_watchdog.panic_wdt_timeout=<t> |
1da177e4 LT |
722 | |
723 | The options are the same as the module parameter options. | |
724 | ||
725 | The watchdog will panic and start a 120 second reset timeout if it | |
726 | gets a pre-action. During a panic or a reboot, the watchdog will | |
727 | start a 120 second timer if it is running to make sure the reboot occurs. | |
728 | ||
612b5a8d CM |
729 | Note that if you use the NMI preaction for the watchdog, you MUST NOT |
730 | use the NMI watchdog. There is no reasonable way to tell if an NMI | |
731 | comes from the IPMI controller, so it must assume that if it gets an | |
732 | otherwise unhandled NMI, it must be from IPMI and it will panic | |
733 | immediately. | |
1da177e4 LT |
734 | |
735 | Once you open the watchdog timer, you must write a 'V' character to the | |
736 | device to close it, or the timer will not stop. This is a new semantic | |
737 | for the driver, but makes it consistent with the rest of the watchdog | |
738 | drivers in Linux. | |
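A minimal userland sketch of the close sequence described above follows. The helper name watchdog_magic_close is hypothetical. It assumes the caller already holds a writable file descriptor for the watchdog device, for example one opened on /dev/watchdog, which is the conventional device node and is not named in this document.

```c
#include <unistd.h>

/*
 * Write the 'V' character to the watchdog and then close it, so that
 * the timer stops. Returns 0 on success, -1 if the write or the close
 * failed. Hypothetical helper for illustration only.
 */
int watchdog_magic_close(int fd)
{
	int rv = 0;

	if (write(fd, "V", 1) != 1)
		rv = -1;	/* the timer will keep running */
	if (close(fd) != 0)
		rv = -1;
	return rv;
}
```

If nowayout is set to 1, the timer keeps running after the close even when 'V' has been written.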
845e78a1 CM |
739 | |
740 | ||
741 | Panic Timeouts | |
742 | -------------- | |
743 | ||
744 | The OpenIPMI driver supports the ability to put semi-custom and custom | |
745 | events in the system event log if a panic occurs. If you enable the | |
746 | 'Generate a panic event to all BMCs on a panic' option, you will get | |
747 | one event on a panic in a standard IPMI event format. If you enable | |
748 | the 'Generate OEM events containing the panic string' option, you will | |
749 | also get a bunch of OEM events holding the panic string. | |
750 | ||
751 | ||
752 | The field settings of the events are: | |
f5981a5c | 753 | |
845e78a1 CM |
754 | * Generator ID: 0x21 (kernel) |
755 | * EvM Rev: 0x03 (this event is formatted in IPMI 1.0 format) | |
756 | * Sensor Type: 0x20 (OS critical stop sensor) | |
757 | * Sensor #: The first byte of the panic string (0 if no panic string) | |
758 | * Event Dir | Event Type: 0x6f (Assertion, sensor-specific event info) | |
759 | * Event Data 1: 0xa1 (Runtime stop in OEM bytes 2 and 3) | |
760 | * Event data 2: second byte of panic string | |
761 | * Event data 3: third byte of panic string | |
f5981a5c | 762 | |
845e78a1 CM |
763 | See the IPMI spec for the details of the event layout. This event is |
764 | always sent to the local management controller. It will handle routing | |
765 | the message to the right place. | |
766 | ||
767 | Other OEM events have the following format: | |
f5981a5c MCC |
768 | |
769 | * Record ID (bytes 0-1): Set by the SEL. | |
770 | * Record type (byte 2): 0xf0 (OEM non-timestamped) | |
771 | * byte 3: The slave address of the card saving the panic | |
772 | * byte 4: A sequence number (starting at zero) | |
773 | The rest of the bytes (11 bytes) are the panic string. If the panic string | |
774 | is longer than 11 bytes, multiple messages will be sent with increasing | |
775 | sequence numbers. | |
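The OEM record layout above can be sketched as follows. This is not the driver's code: the function name and constants are hypothetical, and zero-padding the final record is an assumption of this sketch. The 16-byte record length follows from the layout above (5 header bytes plus 11 string bytes).

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SEL_RECORD_LEN	16	/* 5 header bytes + 11 string bytes */
#define PANIC_CHUNK_LEN	11

/*
 * Split panic_str into OEM non-timestamped records, 11 bytes of the
 * string per record. Returns the number of records written to out.
 * Hypothetical helper for illustration only.
 */
size_t build_panic_oem_records(const char *panic_str, uint8_t slave_addr,
			       uint8_t out[][SEL_RECORD_LEN],
			       size_t max_records)
{
	size_t remaining = strlen(panic_str);
	size_t count = 0;

	while (remaining > 0 && count < max_records) {
		uint8_t *rec = out[count];
		size_t chunk = remaining < PANIC_CHUNK_LEN ?
			       remaining : PANIC_CHUNK_LEN;

		/* Bytes 0-1 (record ID) stay zero; the SEL sets them. */
		memset(rec, 0, SEL_RECORD_LEN);
		rec[2] = 0xf0;			/* OEM non-timestamped */
		rec[3] = slave_addr;		/* card saving the panic */
		rec[4] = (uint8_t)count;	/* sequence number from zero */
		memcpy(&rec[5], panic_str, chunk);

		panic_str += chunk;
		remaining -= chunk;
		count++;
	}
	return count;
}
```

A 26-character panic string therefore produces three records with sequence numbers 0, 1 and 2, the last one carrying only four string bytes.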
845e78a1 CM |
776 | |
777 | Because you cannot send OEM events using the standard interface, this | |
778 | function will attempt to find an SEL and add the events there. It | |
779 | will first query the capabilities of the local management controller. | |
780 | If it has an SEL, then they will be stored in the SEL of the local | |
781 | management controller. If not, and the local management controller is | |
782 | an event generator, the event receiver from the local management | |
783 | controller will be queried and the events sent to the SEL on that | |
784 | device. Otherwise, the events go nowhere since there is nowhere to | |
785 | send them. | |
3b625943 CM |
786 | |
787 | ||
788 | Poweroff | |
789 | -------- | |
790 | ||
791 | If the poweroff capability is selected, the IPMI driver will install | |
792 | a shutdown function into the standard poweroff function pointer. This | |
793 | is in the ipmi_poweroff module. When the system requests a powerdown, | |
794 | it will send the proper IPMI commands to do this. This is supported on | |
795 | several platforms. | |
796 | ||
8c702e16 CM |
797 | There is a module parameter named "poweroff_powercycle" that may |
798 | either be zero (do a power down) or non-zero (do a power cycle, power | |
799 | the system off, then power it on in a few seconds). Setting | |
800 | ipmi_poweroff.poweroff_powercycle=x will do the same thing on the kernel | |
801 | command line. The parameter is also available via the proc filesystem | |
802 | in /proc/sys/dev/ipmi/poweroff_powercycle. Note that if the system | |
803 | does not support power cycling, it will always do the power off. | |
3b625943 | 804 | |
b2c03941 CM |
805 | The "ifnum_to_use" parameter specifies which interface the poweroff |
806 | code should use. The default is -1, which means to pick the first one | |
807 | registered. | |
808 | ||
3b625943 CM |
809 | Note that if you have ACPI enabled, the system will prefer using ACPI to |
810 | power off. |