Commit | Line | Data |
---|---|---|
1da177e4 LT |
1 | CPU frequency and voltage scaling code in the Linux(TM) kernel |
2 | ||
3 | ||
4 | L i n u x C P U F r e q | |
5 | ||
6 | C P U F r e q G o v e r n o r s | |
7 | ||
8 | - information for users and developers - | |
9 | ||
10 | ||
11 | Dominik Brodowski <linux@brodo.de> | |
12 | ||
13 | ||
14 | ||
15 | Clock scaling allows you to change the clock speed of the CPUs on the | |
16 | fly. This is a nice method to save battery power, because the lower | |
17 | the clock speed, the less power the CPU consumes. | |
18 | ||
19 | ||
20 | Contents: | |
21 | --------- | |
22 | 1. What is a CPUFreq Governor? | |
23 | ||
24 | 2. Governors In the Linux Kernel | |
25 | 2.1 Performance | |
26 | 2.2 Powersave | |
27 | 2.3 Userspace | |
28 | ||
29 | 3. The Governor Interface in the CPUfreq Core | |
30 | ||
31 | ||
32 | ||
33 | 1. What Is A CPUFreq Governor? | |
34 | ============================== | |
35 | ||
36 | Most cpufreq drivers (in fact, all except one, longrun), and indeed most | |
37 | CPU frequency scaling algorithms, only allow the CPU to be set to one | |
38 | frequency at a time. To offer dynamic frequency scaling, the cpufreq | |
39 | core must be able to tell these drivers a "target frequency". So | |
40 | these drivers will be converted to offer a "->target" | |
41 | call instead of the existing "->setpolicy" call. For "longrun", | |
42 | everything stays the same, though. | |
43 | ||
44 | How is the frequency to use within the CPUfreq policy decided? | |
45 | That is done by "cpufreq governors". Two are already in this patch: | |
46 | the existing "powersave" and "performance" governors, which | |
47 | set the frequency statically to the lowest or highest frequency, | |
48 | respectively. At least two more governors will be ready for | |
49 | addition in the near future, and likely many more will follow, as there | |
50 | are various theories and models about dynamic frequency scaling | |
51 | around. With the generic interface that cpufreq offers to scaling | |
52 | governors, these can be tested extensively, and the best one can be | |
53 | selected for each specific use. | |
54 | ||
55 | Basically, it's the following flow graph: | |
56 | ||
57 | CPU can be set to switch independently   |   CPU can only be set | |
58 |        within specific "limits"          |   to specific frequencies | |
59 | ||
60 |                   "CPUfreq policy" | |
61 |    consists of frequency limits (policy->{min,max}) | |
62 |         and CPUfreq governor to be used | |
63 |                 /                   \ | |
64 |                /                     \ | |
65 |               /                       the cpufreq governor decides | |
66 |              /                        (dynamically or statically) | |
67 |             /                         what target_freq to set within | |
68 |            /                          the limits of policy->{min,max} | |
69 |           /                                \ | |
70 |          /                                  \ | |
71 | Using the ->setpolicy call,          Using the ->target call, | |
72 |     the limits and the               the frequency closest | |
73 |     "policy" is set.                 to target_freq is set. | |
74 |                                      It is assured that it | |
75 |                                      is within policy->{min,max} | |
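The right-hand branch of the flow graph can be illustrated with a small userspace sketch. It is not kernel code: struct mock_policy, clamp_to_policy() and pick_closest_freq() are hypothetical names invented for this example, and the "closest frequency" rule is a plain nearest-match search, which may differ from what a real driver does.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the limits part of a cpufreq policy (kHz). */
struct mock_policy {
	unsigned int min;
	unsigned int max;
};

/* Force the governor's wish into policy->{min,max}. */
static unsigned int clamp_to_policy(const struct mock_policy *p,
				    unsigned int target_freq)
{
	assert(p->min <= p->max);
	if (target_freq < p->min)
		return p->min;
	if (target_freq > p->max)
		return p->max;
	return target_freq;
}

/*
 * Pick the table entry closest to target_freq, considering only
 * entries inside the policy limits. Ties go to the earlier entry.
 * Returns 0 if no entry lies within the limits.
 */
static unsigned int pick_closest_freq(const struct mock_policy *p,
				      const unsigned int *table, size_t n,
				      unsigned int target_freq)
{
	unsigned int want = clamp_to_policy(p, target_freq);
	unsigned int best = 0, best_dist = 0;
	size_t i;

	for (i = 0; i < n; i++) {
		unsigned int f = table[i];
		unsigned int dist;

		if (f < p->min || f > p->max)
			continue;	/* outside the policy limits */
		dist = f > want ? f - want : want - f;
		if (best == 0 || dist < best_dist) {
			best = f;
			best_dist = dist;
		}
	}
	return best;
}
```

Whatever target the governor asks for, the value that reaches the mock "hardware" never leaves the policy limits, which is the guarantee stated at the bottom of the graph.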
76 | ||
77 | ||
78 | 2. Governors In the Linux Kernel | |
79 | ================================ | |
80 | ||
81 | 2.1 Performance | |
82 | --------------- | |
83 | ||
84 | The CPUfreq governor "performance" sets the CPU statically to the | |
85 | highest frequency within the borders of scaling_min_freq and | |
86 | scaling_max_freq. | |
87 | ||
88 | ||
89 | 2.2 Powersave | |
90 | ------------- | |
91 | ||
92 | The CPUfreq governor "powersave" sets the CPU statically to the | |
93 | lowest frequency within the borders of scaling_min_freq and | |
94 | scaling_max_freq. | |
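The two static governors reduce to a single decision, sketched here in userspace C. The names struct mock_policy, enum mock_governor and static_target_freq() are hypothetical; min and max stand in for scaling_min_freq and scaling_max_freq.

```c
#include <assert.h>

/* Hypothetical stand-in for the policy limits (kHz). */
struct mock_policy {
	unsigned int min;	/* plays the role of scaling_min_freq */
	unsigned int max;	/* plays the role of scaling_max_freq */
};

enum mock_governor {
	MOCK_GOV_PERFORMANCE,
	MOCK_GOV_POWERSAVE
};

/* "performance" pins the CPU to the upper limit, "powersave" to the lower. */
static unsigned int static_target_freq(const struct mock_policy *p,
				       enum mock_governor gov)
{
	assert(p->min <= p->max);
	return gov == MOCK_GOV_PERFORMANCE ? p->max : p->min;
}
```

Neither governor looks at CPU load; the chosen frequency changes only when the limits themselves change.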
95 | ||
96 | ||
97 | 2.3 Userspace | |
98 | ------------- | |
99 | ||
100 | The CPUfreq governor "userspace" allows the user, or any userspace | |
101 | program running with UID "root", to set the CPU to a specific frequency | |
102 | by making a sysfs file "scaling_setspeed" available in the CPU-device | |
103 | directory. | |
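A userspace program could drive this file roughly as follows. This is a minimal sketch under stated assumptions: write_setspeed() is a hypothetical helper, the frequency is assumed to be given in kHz, and the exact location of "scaling_setspeed" depends on the system (on many systems it is found under /sys/devices/system/cpu/cpu0/cpufreq/).

```c
#include <assert.h>
#include <stdio.h>

/*
 * Write a frequency (assumed to be in kHz) to a scaling_setspeed-style
 * file. Returns 0 on success, -1 if the file cannot be opened or written.
 */
static int write_setspeed(const char *path, unsigned int freq_khz)
{
	FILE *f;
	int ok;

	assert(path != NULL);
	f = fopen(path, "w");
	if (!f)
		return -1;	/* not root, or governor is not "userspace" */
	ok = fprintf(f, "%u\n", freq_khz) > 0;
	if (fclose(f) != 0)
		ok = 0;		/* the write may only fail at flush time */
	return ok ? 0 : -1;
}
```

The call only makes sense while the "userspace" governor is active for that CPU and the caller has permission to write the file; otherwise the helper simply reports failure.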
104 | ||
105 | ||
106 | ||
107 | 3. The Governor Interface in the CPUfreq Core | |
108 | ============================================= | |
109 | ||
110 | A new governor must register itself with the CPUfreq core using | |
111 | "cpufreq_register_governor". The struct cpufreq_governor, which has to | |
112 | be passed to that function, must contain the following values: | |
113 | ||
114 | governor->name - A unique name for this governor | |
115 | governor->governor - The governor callback function | |
116 | governor->owner - THIS_MODULE for the governor module (if | |
117 | appropriate) | |
118 | ||
119 | The governor->governor callback is called with the current (or to-be-set) | |
120 | cpufreq_policy struct for that CPU, and an unsigned int event. The | |
121 | following events are currently defined: | |
122 | ||
123 | CPUFREQ_GOV_START: This governor shall start its duty for the CPU | |
124 | policy->cpu | |
125 | CPUFREQ_GOV_STOP: This governor shall end its duty for the CPU | |
126 | policy->cpu | |
127 | CPUFREQ_GOV_LIMITS: The limits for CPU policy->cpu have changed to | |
128 | policy->min and policy->max. | |
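The shape of such a callback can be sketched in userspace C. Everything here is a mock written for illustration: the MOCK_GOV_* events, struct mock_policy and struct mock_governor only mirror the fields and events described above and are not the kernel's definitions, and no registration with the CPUfreq core takes place.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical mirrors of the three events described in the text. */
enum mock_event { MOCK_GOV_START, MOCK_GOV_STOP, MOCK_GOV_LIMITS };

struct mock_policy {
	unsigned int cpu;
	unsigned int min;
	unsigned int max;
	unsigned int cur;	/* frequency the mock "hardware" runs at */
};

struct mock_governor {
	const char *name;
	int (*governor)(struct mock_policy *policy, unsigned int event);
};

/* A "performance"-like callback: always settle on policy->max. */
static int mock_performance(struct mock_policy *policy, unsigned int event)
{
	assert(policy != NULL);
	switch (event) {
	case MOCK_GOV_START:
	case MOCK_GOV_LIMITS:
		/* start of duty or changed limits: re-apply the maximum */
		policy->cur = policy->max;
		return 0;
	case MOCK_GOV_STOP:
		/* a static governor has nothing to tear down */
		return 0;
	default:
		return -1;	/* unknown event */
	}
}

static struct mock_governor mock_gov_performance = {
	.name = "mock_performance",
	.governor = mock_performance,
};
```

A dynamic governor would use the same three entry points, but would typically start its own periodic work on START and cancel it on STOP.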
129 | ||
130 | If you need to raise other "events" from outside your driver, use _only_ the | |
131 | cpufreq_governor_l(unsigned int cpu, unsigned int event) call into the | |
132 | CPUfreq core to ensure proper locking. | |
133 | ||
134 | ||
135 | The CPUfreq governor may call the CPU processor driver using one of | |
136 | these two functions: | |
137 | ||
138 | int cpufreq_driver_target(struct cpufreq_policy *policy, | |
139 | unsigned int target_freq, | |
140 | unsigned int relation); | |
141 | ||
142 | int __cpufreq_driver_target(struct cpufreq_policy *policy, | |
143 | unsigned int target_freq, | |
144 | unsigned int relation); | |
145 | ||
146 | target_freq must be within policy->min and policy->max, of course. | |
147 | What's the difference between these two functions? While your governor | |
148 | is still in the direct code path of a call to governor->governor, the | |
149 | per-CPU cpufreq lock is already held by the cpufreq core, and there's | |
150 | no need to take it again (in fact, doing so would cause a deadlock). So | |
151 | use __cpufreq_driver_target only in these cases. In all other cases | |
152 | (for example, when there's a "daemonized" function that wakes up | |
153 | every second), use cpufreq_driver_target, which takes the per-CPU cpufreq | |
154 | lock before the command is passed to the cpufreq processor driver. | |
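The locking rule can be modelled in userspace with a toy non-recursive lock. This is only a sketch of the pattern: struct mock_lock, mock_driver_target() and mock_driver_target_unlocked() are hypothetical names, and where a real lock would hang on a second acquisition, the mock returns an error so that the mistake becomes visible.

```c
#include <assert.h>

/* Hypothetical non-recursive lock: taking it twice would deadlock for real. */
struct mock_lock {
	int held;
};

struct mock_policy {
	struct mock_lock lock;
	unsigned int min;
	unsigned int max;
	unsigned int cur;
};

static int mock_lock_acquire(struct mock_lock *l)
{
	if (l->held)
		return -1;	/* a real lock would hang here */
	l->held = 1;
	return 0;
}

static void mock_lock_release(struct mock_lock *l)
{
	assert(l->held);
	l->held = 0;
}

/* Unlocked variant: the caller must already hold policy->lock. */
static int mock_driver_target_unlocked(struct mock_policy *p,
				       unsigned int target_freq)
{
	assert(p->lock.held);
	if (target_freq < p->min || target_freq > p->max)
		return -1;	/* outside policy->{min,max} */
	p->cur = target_freq;
	return 0;
}

/* Locked variant: for callers outside the governor callback path. */
static int mock_driver_target(struct mock_policy *p, unsigned int target_freq)
{
	int ret;

	if (mock_lock_acquire(&p->lock) != 0)
		return -2;	/* lock already held: would deadlock */
	ret = mock_driver_target_unlocked(p, target_freq);
	mock_lock_release(&p->lock);
	return ret;
}
```

Code running inside the callback, where the core already holds the lock, would use the unlocked variant; a periodic worker that runs on its own would use the locked one.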
155 |