Merge branch 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git...

author Linus Torvalds <torvalds@linux-foundation.org>

Thu, 9 Apr 2009 17:38:23 +0000 (10:38 -0700)

committer Linus Torvalds <torvalds@linux-foundation.org>

Thu, 9 Apr 2009 17:38:23 +0000 (10:38 -0700)
diff --git a/Documentation/cgroups/cpuacct.txt b/Documentation/cgroups/cpuacct.txt
index bb775fbe43d78f6f5083f03a7986d9edf9012a05..8b930946c52a7dec05657470016946b1c3492123 100644
--- a/Documentation/cgroups/cpuacct.txt
+++ b/Documentation/cgroups/cpuacct.txt
@@ -30,3 +30,21 @@ The above steps create a new group g1 and move the current shell
  process (bash) into it. CPU time consumed by this bash and its children
  can be obtained from g1/cpuacct.usage and the same is accumulated in
  /cgroups/cpuacct.usage also.
+
+cpuacct.stat file lists a few statistics which further divide the
+CPU time obtained by the cgroup into user and system times. Currently
+the following statistics are supported:
+
+user: Time spent by tasks of the cgroup in user mode.
+system: Time spent by tasks of the cgroup in kernel mode.
+
+user and system are in USER_HZ unit.
+
+cpuacct controller uses percpu_counter interface to collect user and
+system times. This has two side effects:
+
+- It is theoretically possible to see wrong values for user and system times.
+  This is because percpu_counter_read() on 32bit systems isn't safe
+  against concurrent writes.
+- It is possible to see slightly outdated values for user and system times
+  due to the batch processing nature of percpu_counter.
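
Reviewer note (not part of the patch): the added text says the user and system values in cpuacct.stat are reported in USER_HZ units. The sketch below shows one way a user-space script might turn those values into seconds. It assumes the file holds one "name value" pair per line (for example "user 250"), and that the group from the earlier example is reachable at /cgroups/g1/cpuacct.stat; both the layout and the path are assumptions for illustration, and the sample numbers are made up. Because of the percpu_counter side effects described in the patch, the results should be treated as approximate.

```python
import os


def user_hz():
    """Return the USER_HZ value visible to user space (ticks per second)."""
    return os.sysconf("SC_CLK_TCK")


def parse_cpuacct_stat(text, ticks_per_second):
    """Convert cpuacct.stat text into a dict of seconds.

    Assumes one "name value" pair per line, e.g. "user 250".
    Lines that do not match that shape are skipped.
    """
    seconds = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) != 2:
            continue
        name, ticks = parts
        seconds[name] = int(ticks) / ticks_per_second
    return seconds


def read_cpuacct_stat(path="/cgroups/g1/cpuacct.stat"):
    """Read and convert a cpuacct.stat file; the default path is hypothetical."""
    with open(path) as f:
        return parse_cpuacct_stat(f.read(), user_hz())


if __name__ == "__main__":
    sample = "user 250\nsystem 50\n"  # made-up sample values
    # With 100 ticks per second: 250 ticks -> 2.5 s, 50 ticks -> 0.5 s
    print(parse_cpuacct_stat(sample, 100))
```

Passing the tick rate as a parameter keeps the parser testable without a mounted cgroup hierarchy; read_cpuacct_stat() is the only piece that touches the file system.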
diff --git a/Documentation/ftrace.txt b/Documentation/ftrace.txt
deleted file mode 100644
index fd9a3e6..0000000
--- a/Documentation/ftrace.txt
+++ /dev/null
@@ -1,1828 +0,0 @@
-               ftrace - Function Tracer
-               ========================
-
-Copyright 2008 Red Hat Inc.
-   Author:   Steven Rostedt <srostedt@redhat.com>
-  License:   The GNU Free Documentation License, Version 1.2
-               (dual licensed under the GPL v2)
-Reviewers:   Elias Oltmanns, Randy Dunlap, Andrew Morton,
-            John Kacur, and David Teigland.
-
-Written for: 2.6.28-rc2
-
-Introduction
-------------
-
-Ftrace is an internal tracer designed to help out developers and
-designers of systems to find what is going on inside the kernel.
-It can be used for debugging or analyzing latencies and
-performance issues that take place outside of user-space.
-
-Although ftrace is the function tracer, it also includes an
-infrastructure that allows for other types of tracing. Some of
-the tracers that are currently in ftrace include a tracer to
-trace context switches, the time it takes for a high priority
-task to run after it was woken up, the time interrupts are
-disabled, and more (ftrace allows for tracer plugins, which
-means that the list of tracers can always grow).
-
-
-The File System
----------------
-
-Ftrace uses the debugfs file system to hold the control files as
-well as the files to display output.
-
-To mount the debugfs system:
-
-  # mkdir /debug
-  # mount -t debugfs nodev /debug
-
-( Note: it is more common to mount at /sys/kernel/debug, but for
-  simplicity this document will use /debug)
-
-That's it! (assuming that you have ftrace configured into your kernel)
-
-After mounting the debugfs, you can see a directory called
-"tracing".  This directory contains the control and output files
-of ftrace. Here is a list of some of the key files:
-
-
- Note: all time values are in microseconds.
-
-  current_tracer:
-
-       This is used to set or display the current tracer
-       that is configured.
-
-  available_tracers:
-
-       This holds the different types of tracers that
-       have been compiled into the kernel. The
-       tracers listed here can be configured by
-       echoing their name into current_tracer.
-
-  tracing_enabled:
-
-       This sets or displays whether the current_tracer
-       is activated and tracing or not. Echo 0 into this
-       file to disable the tracer or 1 to enable it.
-
-  trace:
-
-       This file holds the output of the trace in a human
-       readable format (described below).
-
-  latency_trace:
-
-       This file shows the same trace but the information
-       is organized more to display possible latencies
-       in the system (described below).
-
-  trace_pipe:
-
-       The output is the same as the "trace" file but this
-       file is meant to be streamed with live tracing.
-       Reads from this file will block until new data
-       is retrieved. Unlike the "trace" and "latency_trace"
-       files, this file is a consumer. This means reading
-       from this file causes sequential reads to display
-       more current data. Once data is read from this
-       file, it is consumed, and will not be read
-       again with a sequential read. The "trace" and
-       "latency_trace" files are static, and if the
-       tracer is not adding more data, they will display
-       the same information every time they are read.
-
-  trace_options:
-
-       This file lets the user control the amount of data
-       that is displayed in one of the above output
-       files.
-
-  tracing_max_latency:
-
-       Some of the tracers record the max latency.
-       For example, the time interrupts are disabled.
-       This time is saved in this file. The max trace
-       will also be stored, and displayed by either
-       "trace" or "latency_trace".  A new max trace will
-       only be recorded if the latency is greater than
-       the value in this file. (in microseconds)
-
-  buffer_size_kb:
-
-       This sets or displays the number of kilobytes each CPU
-       buffer can hold. The tracer buffers are the same size
-       for each CPU. The displayed number is the size of the
-       CPU buffer and not total size of all buffers. The
-       trace buffers are allocated in pages (blocks of memory
-       that the kernel uses for allocation, usually 4 KB in size).
-       If the last page allocated has room for more bytes
-       than requested, the rest of the page will be used,
-       making the actual allocation bigger than requested.
-       ( Note, the size may not be a multiple of the page size
-         due to buffer managment overhead. )
-
-       This can only be updated when the current_tracer
-       is set to "nop".
-
-  tracing_cpumask:
-
-       This is a mask that lets the user only trace
-       on specified CPUS. The format is a hex string
-       representing the CPUS.
-
-  set_ftrace_filter:
-
-       When dynamic ftrace is configured in (see the
-       section below "dynamic ftrace"), the code is dynamically
-       modified (code text rewrite) to disable calling of the
-       function profiler (mcount). This lets tracing be configured
-       in with practically no overhead in performance.  This also
-       has a side effect of enabling or disabling specific functions
-       to be traced. Echoing names of functions into this file
-       will limit the trace to only those functions.
-
-  set_ftrace_notrace:
-
-       This has an effect opposite to that of
-       set_ftrace_filter. Any function that is added here will not
-       be traced. If a function exists in both set_ftrace_filter
-       and set_ftrace_notrace, the function will _not_ be traced.
-
-  set_ftrace_pid:
-
-       Have the function tracer only trace a single thread.
-
-  set_graph_function:
-
-       Set a "trigger" function where tracing should start
-       with the function graph tracer (See the section
-       "dynamic ftrace" for more details).
-
-  available_filter_functions:
-
-       This lists the functions that ftrace
-       has processed and can trace. These are the function
-       names that you can pass to "set_ftrace_filter" or
-       "set_ftrace_notrace". (See the section "dynamic ftrace"
-       below for more details.)
-
-
-The Tracers
------------
-
-Here is the list of current tracers that may be configured.
-
-  "function"
-
-       Function call tracer to trace all kernel functions.
-
-  "function_graph_tracer"
-
-       Similar to the function tracer except that the
-       function tracer probes the functions on their entry
-       whereas the function graph tracer traces on both entry
-       and exit of the functions. It then provides the ability
-       to draw a graph of function calls similar to C code
-       source.
-
-  "sched_switch"
-
-       Traces the context switches and wakeups between tasks.
-
-  "irqsoff"
-
-       Traces the areas that disable interrupts and saves
-       the trace with the longest max latency.
-       See tracing_max_latency. When a new max is recorded,
-       it replaces the old trace. It is best to view this
-       trace via the latency_trace file.
-
-  "preemptoff"
-
-       Similar to irqsoff but traces and records the amount of
-       time for which preemption is disabled.
-
-  "preemptirqsoff"
-
-       Similar to irqsoff and preemptoff, but traces and
-       records the largest time for which irqs and/or preemption
-       is disabled.
-
-  "wakeup"
-
-       Traces and records the max latency that it takes for
-       the highest priority task to get scheduled after
-       it has been woken up.
-
-  "hw-branch-tracer"
-
-       Uses the BTS CPU feature on x86 CPUs to traces all
-       branches executed.
-
-  "nop"
-
-       This is the "trace nothing" tracer. To remove all
-       tracers from tracing simply echo "nop" into
-       current_tracer.
-
-
-Examples of using the tracer
-----------------------------
-
-Here are typical examples of using the tracers when controlling
-them only with the debugfs interface (without using any
-user-land utilities).
-
-Output format:
---------------
-
-Here is an example of the output format of the file "trace"
-
-                             --------
-# tracer: function
-#
-#           TASK-PID   CPU#    TIMESTAMP  FUNCTION
-#              | |      |          |         |
-            bash-4251  [01] 10152.583854: path_put <-path_walk
-            bash-4251  [01] 10152.583855: dput <-path_put
-            bash-4251  [01] 10152.583855: _atomic_dec_and_lock <-dput
-                             --------
-
-A header is printed with the tracer name that is represented by
-the trace. In this case the tracer is "function". Then a header
-showing the format. Task name "bash", the task PID "4251", the
-CPU that it was running on "01", the timestamp in <secs>.<usecs>
-format, the function name that was traced "path_put" and the
-parent function that called this function "path_walk". The
-timestamp is the time at which the function was entered.
-
-The sched_switch tracer also includes tracing of task wakeups
-and context switches.
-
-     ksoftirqd/1-7     [01]  1453.070013:      7:115:R   +  2916:115:S
-     ksoftirqd/1-7     [01]  1453.070013:      7:115:R   +    10:115:S
-     ksoftirqd/1-7     [01]  1453.070013:      7:115:R ==>    10:115:R
-        events/1-10    [01]  1453.070013:     10:115:S ==>  2916:115:R
-     kondemand/1-2916  [01]  1453.070013:   2916:115:S ==>     7:115:R
-     ksoftirqd/1-7     [01]  1453.070013:      7:115:S ==>     0:140:R
-
-Wake ups are represented by a "+" and the context switches are
-shown as "==>".  The format is:
-
- Context switches:
-
-       Previous task              Next Task
-
-  <pid>:<prio>:<state>  ==>  <pid>:<prio>:<state>
-
- Wake ups:
-
-       Current task               Task waking up
-
-  <pid>:<prio>:<state>    +  <pid>:<prio>:<state>
-
-The prio is the internal kernel priority, which is the inverse
-of the priority that is usually displayed by user-space tools.
-Zero represents the highest priority (99). Prio 100 starts the
-"nice" priorities with 100 being equal to nice -20 and 139 being
-nice 19. The prio "140" is reserved for the idle task which is
-the lowest priority thread (pid 0).
-
-
-Latency trace format
---------------------
-
-For traces that display latency times, the latency_trace file
-gives somewhat more information to see why a latency happened.
-Here is a typical trace.
-
-# tracer: irqsoff
-#
-irqsoff latency trace v1.1.5 on 2.6.26-rc8
---------------------------------------------------------------------
- latency: 97 us, #3/3, CPU#0 | (M:preempt VP:0, KP:0, SP:0 HP:0 #P:2)
-    -----------------
-    | task: swapper-0 (uid:0 nice:0 policy:0 rt_prio:0)
-    -----------------
- => started at: apic_timer_interrupt
- => ended at:   do_softirq
-
-#                _------=> CPU#
-#               / _-----=> irqs-off
-#              | / _----=> need-resched
-#              || / _---=> hardirq/softirq
-#              ||| / _--=> preempt-depth
-#              |||| /
-#              |||||     delay
-#  cmd     pid ||||| time  |   caller
-#     \   /    |||||   \   |   /
-  <idle>-0     0d..1    0us+: trace_hardirqs_off_thunk (apic_timer_interrupt)
-  <idle>-0     0d.s.   97us : __do_softirq (do_softirq)
-  <idle>-0     0d.s1   98us : trace_hardirqs_on (do_softirq)
-
-
-This shows that the current tracer is "irqsoff" tracing the time
-for which interrupts were disabled. It gives the trace version
-and the version of the kernel upon which this was executed on
-(2.6.26-rc8). Then it displays the max latency in microsecs (97
-us). The number of trace entries displayed and the total number
-recorded (both are three: #3/3). The type of preemption that was
-used (PREEMPT). VP, KP, SP, and HP are always zero and are
-reserved for later use. #P is the number of online CPUS (#P:2).
-
-The task is the process that was running when the latency
-occurred. (swapper pid: 0).
-
-The start and stop (the functions in which the interrupts were
-disabled and enabled respectively) that caused the latencies:
-
-  apic_timer_interrupt is where the interrupts were disabled.
-  do_softirq is where they were enabled again.
-
-The next lines after the header are the trace itself. The header
-explains which is which.
-
-  cmd: The name of the process in the trace.
-
-  pid: The PID of that process.
-
-  CPU#: The CPU which the process was running on.
-
-  irqs-off: 'd' interrupts are disabled. '.' otherwise.
-           Note: If the architecture does not support a way to
-                 read the irq flags variable, an 'X' will always
-                 be printed here.
-
-  need-resched: 'N' task need_resched is set, '.' otherwise.
-
-  hardirq/softirq:
-       'H' - hard irq occurred inside a softirq.
-       'h' - hard irq is running
-       's' - soft irq is running
-       '.' - normal context.
-
-  preempt-depth: The level of preempt_disabled
-
-The above is mostly meaningful for kernel developers.
-
-  time: This differs from the trace file output. The trace file output
-       includes an absolute timestamp. The timestamp used by the
-       latency_trace file is relative to the start of the trace.
-
-  delay: This is just to help catch your eye a bit better. And
-        needs to be fixed to be only relative to the same CPU.
-        The marks are determined by the difference between this
-        current trace and the next trace.
-         '!' - greater than preempt_mark_thresh (default 100)
-         '+' - greater than 1 microsecond
-         ' ' - less than or equal to 1 microsecond.
-
-  The rest is the same as the 'trace' file.
-
-
-trace_options
--------------
-
-The trace_options file is used to control what gets printed in
-the trace output. To see what is available, simply cat the file:
-
-  cat /debug/tracing/trace_options
-  print-parent nosym-offset nosym-addr noverbose noraw nohex nobin \
-  noblock nostacktrace nosched-tree nouserstacktrace nosym-userobj
-
-To disable one of the options, echo in the option prepended with
-"no".
-
-  echo noprint-parent > /debug/tracing/trace_options
-
-To enable an option, leave off the "no".
-
-  echo sym-offset > /debug/tracing/trace_options
-
-Here are the available options:
-
-  print-parent - On function traces, display the calling (parent)
-                function as well as the function being traced.
-
-  print-parent:
-   bash-4000  [01]  1477.606694: simple_strtoul <-strict_strtoul
-
-  noprint-parent:
-   bash-4000  [01]  1477.606694: simple_strtoul
-
-
-  sym-offset - Display not only the function name, but also the
-              offset in the function. For example, instead of
-              seeing just "ktime_get", you will see
-              "ktime_get+0xb/0x20".
-
-  sym-offset:
-   bash-4000  [01]  1477.606694: simple_strtoul+0x6/0xa0
-
-  sym-addr - this will also display the function address as well
-            as the function name.
-
-  sym-addr:
-   bash-4000  [01]  1477.606694: simple_strtoul <c0339346>
-
-  verbose - This deals with the latency_trace file.
-
-    bash  4000 1 0 00000000 00010a95 [58127d26] 1720.415ms \
-    (+0.000ms): simple_strtoul (strict_strtoul)
-
-  raw - This will display raw numbers. This option is best for
-       use with user applications that can translate the raw
-       numbers better than having it done in the kernel.
-
-  hex - Similar to raw, but the numbers will be in a hexadecimal
-       format.
-
-  bin - This will print out the formats in raw binary.
-
-  block - TBD (needs update)
-
-  stacktrace - This is one of the options that changes the trace
-              itself. When a trace is recorded, so is the stack
-              of functions. This allows for back traces of
-              trace sites.
-
-  userstacktrace - This option changes the trace. It records a
-                  stacktrace of the current userspace thread.
-
-  sym-userobj - when user stacktrace are enabled, look up which
-               object the address belongs to, and print a
-               relative address. This is especially useful when
-               ASLR is on, otherwise you don't get a chance to
-               resolve the address to object/file/line after
-               the app is no longer running
-
-               The lookup is performed when you read
-               trace,trace_pipe,latency_trace. Example:
-
-               a.out-1623  [000] 40874.465068: /root/a.out[+0x480] <-/root/a.out[+0
-x494] <- /root/a.out[+0x4a8] <- /lib/libc-2.7.so[+0x1e1a6]
-
-  sched-tree - trace all tasks that are on the runqueue, at
-              every scheduling event. Will add overhead if
-              there's a lot of tasks running at once.
-
-
-sched_switch
-------------
-
-This tracer simply records schedule switches. Here is an example
-of how to use it.
-
- # echo sched_switch > /debug/tracing/current_tracer
- # echo 1 > /debug/tracing/tracing_enabled
- # sleep 1
- # echo 0 > /debug/tracing/tracing_enabled
- # cat /debug/tracing/trace
-
-# tracer: sched_switch
-#
-#           TASK-PID   CPU#    TIMESTAMP  FUNCTION
-#              | |      |          |         |
-            bash-3997  [01]   240.132281:   3997:120:R   +  4055:120:R
-            bash-3997  [01]   240.132284:   3997:120:R ==>  4055:120:R
-           sleep-4055  [01]   240.132371:   4055:120:S ==>  3997:120:R
-            bash-3997  [01]   240.132454:   3997:120:R   +  4055:120:S
-            bash-3997  [01]   240.132457:   3997:120:R ==>  4055:120:R
-           sleep-4055  [01]   240.132460:   4055:120:D ==>  3997:120:R
-            bash-3997  [01]   240.132463:   3997:120:R   +  4055:120:D
-            bash-3997  [01]   240.132465:   3997:120:R ==>  4055:120:R
-          <idle>-0     [00]   240.132589:      0:140:R   +     4:115:S
-          <idle>-0     [00]   240.132591:      0:140:R ==>     4:115:R
-     ksoftirqd/0-4     [00]   240.132595:      4:115:S ==>     0:140:R
-          <idle>-0     [00]   240.132598:      0:140:R   +     4:115:S
-          <idle>-0     [00]   240.132599:      0:140:R ==>     4:115:R
-     ksoftirqd/0-4     [00]   240.132603:      4:115:S ==>     0:140:R
-           sleep-4055  [01]   240.133058:   4055:120:S ==>  3997:120:R
- [...]
-
-
-As we have discussed previously about this format, the header
-shows the name of the trace and points to the options. The
-"FUNCTION" is a misnomer since here it represents the wake ups
-and context switches.
-
-The sched_switch file only lists the wake ups (represented with
-'+') and context switches ('==>') with the previous task or
-current task first followed by the next task or task waking up.
-The format for both of these is PID:KERNEL-PRIO:TASK-STATE.
-Remember that the KERNEL-PRIO is the inverse of the actual
-priority with zero (0) being the highest priority and the nice
-values starting at 100 (nice -20). Below is a quick chart to map
-the kernel priority to user land priorities.
-
-  Kernel priority: 0 to 99    ==> user RT priority 99 to 0
-  Kernel priority: 100 to 139 ==> user nice -20 to 19
-  Kernel priority: 140        ==> idle task priority
-
-The task states are:
-
- R - running : wants to run, may not actually be running
- S - sleep   : process is waiting to be woken up (handles signals)
- D - disk sleep (uninterruptible sleep) : process must be woken up
-                                       (ignores signals)
- T - stopped : process suspended
- t - traced  : process is being traced (with something like gdb)
- Z - zombie  : process waiting to be cleaned up
- X - unknown
-
-
-ftrace_enabled
---------------
-
-The following tracers (listed below) give different output
-depending on whether or not the sysctl ftrace_enabled is set. To
-set ftrace_enabled, one can either use the sysctl function or
-set it via the proc file system interface.
-
-  sysctl kernel.ftrace_enabled=1
-
- or
-
-  echo 1 > /proc/sys/kernel/ftrace_enabled
-
-To disable ftrace_enabled simply replace the '1' with '0' in the
-above commands.
-
-When ftrace_enabled is set the tracers will also record the
-functions that are within the trace. The descriptions of the
-tracers will also show an example with ftrace enabled.
-
-
-irqsoff
--------
-
-When interrupts are disabled, the CPU can not react to any other
-external event (besides NMIs and SMIs). This prevents the timer
-interrupt from triggering or the mouse interrupt from letting
-the kernel know of a new mouse event. The result is a latency
-with the reaction time.
-
-The irqsoff tracer tracks the time for which interrupts are
-disabled. When a new maximum latency is hit, the tracer saves
-the trace leading up to that latency point so that every time a
-new maximum is reached, the old saved trace is discarded and the
-new trace is saved.
-
-To reset the maximum, echo 0 into tracing_max_latency. Here is
-an example:
-
- # echo irqsoff > /debug/tracing/current_tracer
- # echo 0 > /debug/tracing/tracing_max_latency
- # echo 1 > /debug/tracing/tracing_enabled
- # ls -ltr
- [...]
- # echo 0 > /debug/tracing/tracing_enabled
- # cat /debug/tracing/latency_trace
-# tracer: irqsoff
-#
-irqsoff latency trace v1.1.5 on 2.6.26
---------------------------------------------------------------------
- latency: 12 us, #3/3, CPU#1 | (M:preempt VP:0, KP:0, SP:0 HP:0 #P:2)
-    -----------------
-    | task: bash-3730 (uid:0 nice:0 policy:0 rt_prio:0)
-    -----------------
- => started at: sys_setpgid
- => ended at:   sys_setpgid
-
-#                _------=> CPU#
-#               / _-----=> irqs-off
-#              | / _----=> need-resched
-#              || / _---=> hardirq/softirq
-#              ||| / _--=> preempt-depth
-#              |||| /
-#              |||||     delay
-#  cmd     pid ||||| time  |   caller
-#     \   /    |||||   \   |   /
-    bash-3730  1d...    0us : _write_lock_irq (sys_setpgid)
-    bash-3730  1d..1    1us+: _write_unlock_irq (sys_setpgid)
-    bash-3730  1d..2   14us : trace_hardirqs_on (sys_setpgid)
-
-
-Here we see that that we had a latency of 12 microsecs (which is
-very good). The _write_lock_irq in sys_setpgid disabled
-interrupts. The difference between the 12 and the displayed
-timestamp 14us occurred because the clock was incremented
-between the time of recording the max latency and the time of
-recording the function that had that latency.
-
-Note the above example had ftrace_enabled not set. If we set the
-ftrace_enabled, we get a much larger output:
-
-# tracer: irqsoff
-#
-irqsoff latency trace v1.1.5 on 2.6.26-rc8
---------------------------------------------------------------------
- latency: 50 us, #101/101, CPU#0 | (M:preempt VP:0, KP:0, SP:0 HP:0 #P:2)
-    -----------------
-    | task: ls-4339 (uid:0 nice:0 policy:0 rt_prio:0)
-    -----------------
- => started at: __alloc_pages_internal
- => ended at:   __alloc_pages_internal
-
-#                _------=> CPU#
-#               / _-----=> irqs-off
-#              | / _----=> need-resched
-#              || / _---=> hardirq/softirq
-#              ||| / _--=> preempt-depth
-#              |||| /
-#              |||||     delay
-#  cmd     pid ||||| time  |   caller
-#     \   /    |||||   \   |   /
-      ls-4339  0...1    0us+: get_page_from_freelist (__alloc_pages_internal)
-      ls-4339  0d..1    3us : rmqueue_bulk (get_page_from_freelist)
-      ls-4339  0d..1    3us : _spin_lock (rmqueue_bulk)
-      ls-4339  0d..1    4us : add_preempt_count (_spin_lock)
-      ls-4339  0d..2    4us : __rmqueue (rmqueue_bulk)
-      ls-4339  0d..2    5us : __rmqueue_smallest (__rmqueue)
-      ls-4339  0d..2    5us : __mod_zone_page_state (__rmqueue_smallest)
-      ls-4339  0d..2    6us : __rmqueue (rmqueue_bulk)
-      ls-4339  0d..2    6us : __rmqueue_smallest (__rmqueue)
-      ls-4339  0d..2    7us : __mod_zone_page_state (__rmqueue_smallest)
-      ls-4339  0d..2    7us : __rmqueue (rmqueue_bulk)
-      ls-4339  0d..2    8us : __rmqueue_smallest (__rmqueue)
-[...]
-      ls-4339  0d..2   46us : __rmqueue_smallest (__rmqueue)
-      ls-4339  0d..2   47us : __mod_zone_page_state (__rmqueue_smallest)
-      ls-4339  0d..2   47us : __rmqueue (rmqueue_bulk)
-      ls-4339  0d..2   48us : __rmqueue_smallest (__rmqueue)
-      ls-4339  0d..2   48us : __mod_zone_page_state (__rmqueue_smallest)
-      ls-4339  0d..2   49us : _spin_unlock (rmqueue_bulk)
-      ls-4339  0d..2   49us : sub_preempt_count (_spin_unlock)
-      ls-4339  0d..1   50us : get_page_from_freelist (__alloc_pages_internal)
-      ls-4339  0d..2   51us : trace_hardirqs_on (__alloc_pages_internal)
-
-
-
-Here we traced a 50 microsecond latency. But we also see all the
-functions that were called during that time. Note that by
-enabling function tracing, we incur an added overhead. This
-overhead may extend the latency times. But nevertheless, this
-trace has provided some very helpful debugging information.
-
-
-preemptoff
-----------
-
-When preemption is disabled, we may be able to receive
-interrupts but the task cannot be preempted and a higher
-priority task must wait for preemption to be enabled again
-before it can preempt a lower priority task.
-
-The preemptoff tracer traces the places that disable preemption.
-Like the irqsoff tracer, it records the maximum latency for
-which preemption was disabled. The control of preemptoff tracer
-is much like the irqsoff tracer.
-
- # echo preemptoff > /debug/tracing/current_tracer
- # echo 0 > /debug/tracing/tracing_max_latency
- # echo 1 > /debug/tracing/tracing_enabled
- # ls -ltr
- [...]
- # echo 0 > /debug/tracing/tracing_enabled
- # cat /debug/tracing/latency_trace
-# tracer: preemptoff
-#
-preemptoff latency trace v1.1.5 on 2.6.26-rc8
---------------------------------------------------------------------
- latency: 29 us, #3/3, CPU#0 | (M:preempt VP:0, KP:0, SP:0 HP:0 #P:2)
-    -----------------
-    | task: sshd-4261 (uid:0 nice:0 policy:0 rt_prio:0)
-    -----------------
- => started at: do_IRQ
- => ended at:   __do_softirq
-
-#                _------=> CPU#
-#               / _-----=> irqs-off
-#              | / _----=> need-resched
-#              || / _---=> hardirq/softirq
-#              ||| / _--=> preempt-depth
-#              |||| /
-#              |||||     delay
-#  cmd     pid ||||| time  |   caller
-#     \   /    |||||   \   |   /
-    sshd-4261  0d.h.    0us+: irq_enter (do_IRQ)
-    sshd-4261  0d.s.   29us : _local_bh_enable (__do_softirq)
-    sshd-4261  0d.s1   30us : trace_preempt_on (__do_softirq)
-
-
-This has some more changes. Preemption was disabled when an
-interrupt came in (notice the 'h'), and was enabled while doing
-a softirq. (notice the 's'). But we also see that interrupts
-have been disabled when entering the preempt off section and
-leaving it (the 'd'). We do not know if interrupts were enabled
-in the mean time.
-
-# tracer: preemptoff
-#
-preemptoff latency trace v1.1.5 on 2.6.26-rc8
---------------------------------------------------------------------
- latency: 63 us, #87/87, CPU#0 | (M:preempt VP:0, KP:0, SP:0 HP:0 #P:2)
-    -----------------
-    | task: sshd-4261 (uid:0 nice:0 policy:0 rt_prio:0)
-    -----------------
- => started at: remove_wait_queue
- => ended at:   __do_softirq
-
-#                _------=> CPU#
-#               / _-----=> irqs-off
-#              | / _----=> need-resched
-#              || / _---=> hardirq/softirq
-#              ||| / _--=> preempt-depth
-#              |||| /
-#              |||||     delay
-#  cmd     pid ||||| time  |   caller
-#     \   /    |||||   \   |   /
-    sshd-4261  0d..1    0us : _spin_lock_irqsave (remove_wait_queue)
-    sshd-4261  0d..1    1us : _spin_unlock_irqrestore (remove_wait_queue)
-    sshd-4261  0d..1    2us : do_IRQ (common_interrupt)
-    sshd-4261  0d..1    2us : irq_enter (do_IRQ)
-    sshd-4261  0d..1    2us : idle_cpu (irq_enter)
-    sshd-4261  0d..1    3us : add_preempt_count (irq_enter)
-    sshd-4261  0d.h1    3us : idle_cpu (irq_enter)
-    sshd-4261  0d.h.    4us : handle_fasteoi_irq (do_IRQ)
-[...]
-    sshd-4261  0d.h.   12us : add_preempt_count (_spin_lock)
-    sshd-4261  0d.h1   12us : ack_ioapic_quirk_irq (handle_fasteoi_irq)
-    sshd-4261  0d.h1   13us : move_native_irq (ack_ioapic_quirk_irq)
-    sshd-4261  0d.h1   13us : _spin_unlock (handle_fasteoi_irq)
-    sshd-4261  0d.h1   14us : sub_preempt_count (_spin_unlock)
-    sshd-4261  0d.h1   14us : irq_exit (do_IRQ)
-    sshd-4261  0d.h1   15us : sub_preempt_count (irq_exit)
-    sshd-4261  0d..2   15us : do_softirq (irq_exit)
-    sshd-4261  0d...   15us : __do_softirq (do_softirq)
-    sshd-4261  0d...   16us : __local_bh_disable (__do_softirq)
-    sshd-4261  0d...   16us+: add_preempt_count (__local_bh_disable)
-    sshd-4261  0d.s4   20us : add_preempt_count (__local_bh_disable)
-    sshd-4261  0d.s4   21us : sub_preempt_count (local_bh_enable)
-    sshd-4261  0d.s5   21us : sub_preempt_count (local_bh_enable)
-[...]
-    sshd-4261  0d.s6   41us : add_preempt_count (__local_bh_disable)
-    sshd-4261  0d.s6   42us : sub_preempt_count (local_bh_enable)
-    sshd-4261  0d.s7   42us : sub_preempt_count (local_bh_enable)
-    sshd-4261  0d.s5   43us : add_preempt_count (__local_bh_disable)
-    sshd-4261  0d.s5   43us : sub_preempt_count (local_bh_enable_ip)
-    sshd-4261  0d.s6   44us : sub_preempt_count (local_bh_enable_ip)
-    sshd-4261  0d.s5   44us : add_preempt_count (__local_bh_disable)
-    sshd-4261  0d.s5   45us : sub_preempt_count (local_bh_enable)
-[...]
-    sshd-4261  0d.s.   63us : _local_bh_enable (__do_softirq)
-    sshd-4261  0d.s1   64us : trace_preempt_on (__do_softirq)
-
-
-The above is an example of the preemptoff trace with
-ftrace_enabled set. Here we see that interrupts were disabled
-the entire time. The irq_enter code lets us know that we entered
-an interrupt 'h'. Before that, the functions being traced still
-show that it is not in an interrupt, but we can see from the
-functions themselves that this is not the case.
-
-Notice that __do_softirq when called does not have a
-preempt_count. It may seem that we missed a preempt enabling.
-What really happened is that the preempt count is held on the
-thread's stack and we switched to the softirq stack (4K stacks
-in effect). The code does not copy the preempt count, but
-because interrupts are disabled, we do not need to worry about
-it. Having a tracer like this is good for letting people know
-what really happens inside the kernel.
-
-
-preemptirqsoff
---------------
-
-Knowing the locations that have interrupts disabled or
-preemption disabled for the longest times is helpful. But
-sometimes we would like to know how long either preemption or
-interrupts (or both) are disabled.
-
-Consider the following code:
-
-    local_irq_disable();
-    call_function_with_irqs_off();
-    preempt_disable();
-    call_function_with_irqs_and_preemption_off();
-    local_irq_enable();
-    call_function_with_preemption_off();
-    preempt_enable();
-
-The irqsoff tracer will record the total length of
-call_function_with_irqs_off() and
-call_function_with_irqs_and_preemption_off().
-
-The preemptoff tracer will record the total length of
-call_function_with_irqs_and_preemption_off() and
-call_function_with_preemption_off().
-
-But neither will trace the total time that interrupts,
-preemption, or both are disabled. This total is the time during
-which we cannot schedule. To record this time, use the
-preemptirqsoff tracer.
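-
-As a worked example of the arithmetic, assume (hypothetically) that
-each of the three calls above takes a fixed number of microseconds.
-The shell sketch below prints the window each tracer would measure
-for that single pass. The function name latency_windows and its
-arguments are made up for illustration and are not part of ftrace.
-
```shell
# Arguments are hypothetical durations in microseconds:
#   $1  call_function_with_irqs_off()
#   $2  call_function_with_irqs_and_preemption_off()
#   $3  call_function_with_preemption_off()
latency_windows() {
    irqs_only=$1 both=$2 preempt_only=$3
    # irqsoff sees only the span with interrupts disabled.
    echo "irqsoff=$((irqs_only + both))"
    # preemptoff sees only the span with preemption disabled.
    echo "preemptoff=$((both + preempt_only))"
    # preemptirqsoff sees the whole span in which we cannot schedule.
    echo "preemptirqsoff=$((irqs_only + both + preempt_only))"
}
```
-
-With durations of 10, 20 and 30 microseconds, the combined window
-(60) is longer than what either of the other two tracers reports
-(30 and 50).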
-
-Again, using this trace is much like the irqsoff and preemptoff
-tracers.
-
- # echo preemptirqsoff > /debug/tracing/current_tracer
- # echo 0 > /debug/tracing/tracing_max_latency
- # echo 1 > /debug/tracing/tracing_enabled
- # ls -ltr
- [...]
- # echo 0 > /debug/tracing/tracing_enabled
- # cat /debug/tracing/latency_trace
-# tracer: preemptirqsoff
-#
-preemptirqsoff latency trace v1.1.5 on 2.6.26-rc8
---------------------------------------------------------------------
- latency: 293 us, #3/3, CPU#0 | (M:preempt VP:0, KP:0, SP:0 HP:0 #P:2)
-    -----------------
-    | task: ls-4860 (uid:0 nice:0 policy:0 rt_prio:0)
-    -----------------
- => started at: apic_timer_interrupt
- => ended at:   __do_softirq
-
-#                _------=> CPU#
-#               / _-----=> irqs-off
-#              | / _----=> need-resched
-#              || / _---=> hardirq/softirq
-#              ||| / _--=> preempt-depth
-#              |||| /
-#              |||||     delay
-#  cmd     pid ||||| time  |   caller
-#     \   /    |||||   \   |   /
-      ls-4860  0d...    0us!: trace_hardirqs_off_thunk (apic_timer_interrupt)
-      ls-4860  0d.s.  294us : _local_bh_enable (__do_softirq)
-      ls-4860  0d.s1  294us : trace_preempt_on (__do_softirq)
-
-
-
-The trace_hardirqs_off_thunk is called from assembly on x86 when
-interrupts are disabled in the assembly code. Without the
-function tracing, we do not know if interrupts were enabled
-within the preemption points. We do see that it started with
-preemption enabled.
-
-Here is a trace with ftrace_enabled set:
-
-
-# tracer: preemptirqsoff
-#
-preemptirqsoff latency trace v1.1.5 on 2.6.26-rc8
---------------------------------------------------------------------
- latency: 105 us, #183/183, CPU#0 | (M:preempt VP:0, KP:0, SP:0 HP:0 #P:2)
-    -----------------
-    | task: sshd-4261 (uid:0 nice:0 policy:0 rt_prio:0)
-    -----------------
- => started at: write_chan
- => ended at:   __do_softirq
-
-#                _------=> CPU#
-#               / _-----=> irqs-off
-#              | / _----=> need-resched
-#              || / _---=> hardirq/softirq
-#              ||| / _--=> preempt-depth
-#              |||| /
-#              |||||     delay
-#  cmd     pid ||||| time  |   caller
-#     \   /    |||||   \   |   /
-      ls-4473  0.N..    0us : preempt_schedule (write_chan)
-      ls-4473  0dN.1    1us : _spin_lock (schedule)
-      ls-4473  0dN.1    2us : add_preempt_count (_spin_lock)
-      ls-4473  0d..2    2us : put_prev_task_fair (schedule)
-[...]
-      ls-4473  0d..2   13us : set_normalized_timespec (ktime_get_ts)
-      ls-4473  0d..2   13us : __switch_to (schedule)
-    sshd-4261  0d..2   14us : finish_task_switch (schedule)
-    sshd-4261  0d..2   14us : _spin_unlock_irq (finish_task_switch)
-    sshd-4261  0d..1   15us : add_preempt_count (_spin_lock_irqsave)
-    sshd-4261  0d..2   16us : _spin_unlock_irqrestore (hrtick_set)
-    sshd-4261  0d..2   16us : do_IRQ (common_interrupt)
-    sshd-4261  0d..2   17us : irq_enter (do_IRQ)
-    sshd-4261  0d..2   17us : idle_cpu (irq_enter)
-    sshd-4261  0d..2   18us : add_preempt_count (irq_enter)
-    sshd-4261  0d.h2   18us : idle_cpu (irq_enter)
-    sshd-4261  0d.h.   18us : handle_fasteoi_irq (do_IRQ)
-    sshd-4261  0d.h.   19us : _spin_lock (handle_fasteoi_irq)
-    sshd-4261  0d.h.   19us : add_preempt_count (_spin_lock)
-    sshd-4261  0d.h1   20us : _spin_unlock (handle_fasteoi_irq)
-    sshd-4261  0d.h1   20us : sub_preempt_count (_spin_unlock)
-[...]
-    sshd-4261  0d.h1   28us : _spin_unlock (handle_fasteoi_irq)
-    sshd-4261  0d.h1   29us : sub_preempt_count (_spin_unlock)
-    sshd-4261  0d.h2   29us : irq_exit (do_IRQ)
-    sshd-4261  0d.h2   29us : sub_preempt_count (irq_exit)
-    sshd-4261  0d..3   30us : do_softirq (irq_exit)
-    sshd-4261  0d...   30us : __do_softirq (do_softirq)
-    sshd-4261  0d...   31us : __local_bh_disable (__do_softirq)
-    sshd-4261  0d...   31us+: add_preempt_count (__local_bh_disable)
-    sshd-4261  0d.s4   34us : add_preempt_count (__local_bh_disable)
-[...]
-    sshd-4261  0d.s3   43us : sub_preempt_count (local_bh_enable_ip)
-    sshd-4261  0d.s4   44us : sub_preempt_count (local_bh_enable_ip)
-    sshd-4261  0d.s3   44us : smp_apic_timer_interrupt (apic_timer_interrupt)
-    sshd-4261  0d.s3   45us : irq_enter (smp_apic_timer_interrupt)
-    sshd-4261  0d.s3   45us : idle_cpu (irq_enter)
-    sshd-4261  0d.s3   46us : add_preempt_count (irq_enter)
-    sshd-4261  0d.H3   46us : idle_cpu (irq_enter)
-    sshd-4261  0d.H3   47us : hrtimer_interrupt (smp_apic_timer_interrupt)
-    sshd-4261  0d.H3   47us : ktime_get (hrtimer_interrupt)
-[...]
-    sshd-4261  0d.H3   81us : tick_program_event (hrtimer_interrupt)
-    sshd-4261  0d.H3   82us : ktime_get (tick_program_event)
-    sshd-4261  0d.H3   82us : ktime_get_ts (ktime_get)
-    sshd-4261  0d.H3   83us : getnstimeofday (ktime_get_ts)
-    sshd-4261  0d.H3   83us : set_normalized_timespec (ktime_get_ts)
-    sshd-4261  0d.H3   84us : clockevents_program_event (tick_program_event)
-    sshd-4261  0d.H3   84us : lapic_next_event (clockevents_program_event)
-    sshd-4261  0d.H3   85us : irq_exit (smp_apic_timer_interrupt)
-    sshd-4261  0d.H3   85us : sub_preempt_count (irq_exit)
-    sshd-4261  0d.s4   86us : sub_preempt_count (irq_exit)
-    sshd-4261  0d.s3   86us : add_preempt_count (__local_bh_disable)
-[...]
-    sshd-4261  0d.s1   98us : sub_preempt_count (net_rx_action)
-    sshd-4261  0d.s.   99us : add_preempt_count (_spin_lock_irq)
-    sshd-4261  0d.s1   99us+: _spin_unlock_irq (run_timer_softirq)
-    sshd-4261  0d.s.  104us : _local_bh_enable (__do_softirq)
-    sshd-4261  0d.s.  104us : sub_preempt_count (_local_bh_enable)
-    sshd-4261  0d.s.  105us : _local_bh_enable (__do_softirq)
-    sshd-4261  0d.s1  105us : trace_preempt_on (__do_softirq)
-
-
-This is a very interesting trace. It started with the preemption
-of the ls task. We see that the task had the "need_resched" bit
-set via the 'N' in the trace.  Interrupts were disabled before
-the spin_lock at the beginning of the trace. We see that a
-schedule took place to run sshd.  When the interrupts were
-enabled, we took an interrupt. On return from the interrupt
-handler, the softirq ran. We took another interrupt while
-running the softirq as we see from the capital 'H'.
-
-
-wakeup
-------
-
-In a Real-Time environment it is very important to know the
-time from when the highest priority task is woken up to when
-it actually executes. This is also known as "schedule
-latency". I stress the point that this is about RT tasks. It is
-also important to know the scheduling latency of non-RT tasks,
-but the average schedule latency is better for non-RT tasks.
-Tools like LatencyTop are more appropriate for such
-measurements.
-
-Real-Time environments are interested in the worst case latency.
-That is the longest latency it takes for something to happen,
-and not the average. We can have a very fast scheduler that may
-only have a large latency once in a while, but that would not
-work well with Real-Time tasks.  The wakeup tracer was designed
-to record the worst case wakeups of RT tasks. Non-RT tasks are
-not recorded because the tracer only records one worst case and
-tracing non-RT tasks that are unpredictable will overwrite the
-worst case latency of RT tasks.
-
-Since this tracer only deals with RT tasks, we will run this
-slightly differently than we did with the previous tracers.
-Instead of performing an 'ls', we will run 'sleep 1' under
-'chrt' which changes the priority of the task.
-
- # echo wakeup > /debug/tracing/current_tracer
- # echo 0 > /debug/tracing/tracing_max_latency
- # echo 1 > /debug/tracing/tracing_enabled
- # chrt -f 5 sleep 1
- # echo 0 > /debug/tracing/tracing_enabled
- # cat /debug/tracing/latency_trace
-# tracer: wakeup
-#
-wakeup latency trace v1.1.5 on 2.6.26-rc8
---------------------------------------------------------------------
- latency: 4 us, #2/2, CPU#1 | (M:preempt VP:0, KP:0, SP:0 HP:0 #P:2)
-    -----------------
-    | task: sleep-4901 (uid:0 nice:0 policy:1 rt_prio:5)
-    -----------------
-
-#                _------=> CPU#
-#               / _-----=> irqs-off
-#              | / _----=> need-resched
-#              || / _---=> hardirq/softirq
-#              ||| / _--=> preempt-depth
-#              |||| /
-#              |||||     delay
-#  cmd     pid ||||| time  |   caller
-#     \   /    |||||   \   |   /
-  <idle>-0     1d.h4    0us+: try_to_wake_up (wake_up_process)
-  <idle>-0     1d..4    4us : schedule (cpu_idle)
-
-
-Running this on an idle system, we see that it only took 4
-microseconds to perform the task switch.  Note, since the trace
-marker in the schedule is before the actual "switch", we stop
-the tracing when the recorded task is about to schedule in. This
-may change if we add a new marker at the end of the scheduler.
-
-Notice that the recorded task is 'sleep' with the PID of 4901
-and it has an rt_prio of 5. This priority is user-space priority
-and not the internal kernel priority. The policy is 1 for
-SCHED_FIFO and 2 for SCHED_RR.
-
-Doing the same with chrt -r 5 and ftrace_enabled set produces:
-
-# tracer: wakeup
-#
-wakeup latency trace v1.1.5 on 2.6.26-rc8
---------------------------------------------------------------------
- latency: 50 us, #60/60, CPU#1 | (M:preempt VP:0, KP:0, SP:0 HP:0 #P:2)
-    -----------------
-    | task: sleep-4068 (uid:0 nice:0 policy:2 rt_prio:5)
-    -----------------
-
-#                _------=> CPU#
-#               / _-----=> irqs-off
-#              | / _----=> need-resched
-#              || / _---=> hardirq/softirq
-#              ||| / _--=> preempt-depth
-#              |||| /
-#              |||||     delay
-#  cmd     pid ||||| time  |   caller
-#     \   /    |||||   \   |   /
-ksoftirq-7     1d.H3    0us : try_to_wake_up (wake_up_process)
-ksoftirq-7     1d.H4    1us : sub_preempt_count (marker_probe_cb)
-ksoftirq-7     1d.H3    2us : check_preempt_wakeup (try_to_wake_up)
-ksoftirq-7     1d.H3    3us : update_curr (check_preempt_wakeup)
-ksoftirq-7     1d.H3    4us : calc_delta_mine (update_curr)
-ksoftirq-7     1d.H3    5us : __resched_task (check_preempt_wakeup)
-ksoftirq-7     1d.H3    6us : task_wake_up_rt (try_to_wake_up)
-ksoftirq-7     1d.H3    7us : _spin_unlock_irqrestore (try_to_wake_up)
-[...]
-ksoftirq-7     1d.H2   17us : irq_exit (smp_apic_timer_interrupt)
-ksoftirq-7     1d.H2   18us : sub_preempt_count (irq_exit)
-ksoftirq-7     1d.s3   19us : sub_preempt_count (irq_exit)
-ksoftirq-7     1..s2   20us : rcu_process_callbacks (__do_softirq)
-[...]
-ksoftirq-7     1..s2   26us : __rcu_process_callbacks (rcu_process_callbacks)
-ksoftirq-7     1d.s2   27us : _local_bh_enable (__do_softirq)
-ksoftirq-7     1d.s2   28us : sub_preempt_count (_local_bh_enable)
-ksoftirq-7     1.N.3   29us : sub_preempt_count (ksoftirqd)
-ksoftirq-7     1.N.2   30us : _cond_resched (ksoftirqd)
-ksoftirq-7     1.N.2   31us : __cond_resched (_cond_resched)
-ksoftirq-7     1.N.2   32us : add_preempt_count (__cond_resched)
-ksoftirq-7     1.N.2   33us : schedule (__cond_resched)
-ksoftirq-7     1.N.2   33us : add_preempt_count (schedule)
-ksoftirq-7     1.N.3   34us : hrtick_clear (schedule)
-ksoftirq-7     1dN.3   35us : _spin_lock (schedule)
-ksoftirq-7     1dN.3   36us : add_preempt_count (_spin_lock)
-ksoftirq-7     1d..4   37us : put_prev_task_fair (schedule)
-ksoftirq-7     1d..4   38us : update_curr (put_prev_task_fair)
-[...]
-ksoftirq-7     1d..5   47us : _spin_trylock (tracing_record_cmdline)
-ksoftirq-7     1d..5   48us : add_preempt_count (_spin_trylock)
-ksoftirq-7     1d..6   49us : _spin_unlock (tracing_record_cmdline)
-ksoftirq-7     1d..6   49us : sub_preempt_count (_spin_unlock)
-ksoftirq-7     1d..4   50us : schedule (__cond_resched)
-
-The interrupt went off while running ksoftirqd. This task runs
-at SCHED_OTHER. Why did we not see the 'N' set earlier? This may
-be a harmless bug with x86_32 and 4K stacks. On x86_32 with 4K
-stacks configured, the interrupt and softirq run with their own
-stack. Some information is held on the top of the task's stack
-(need_resched and preempt_count are both stored there). The
-setting of the NEED_RESCHED bit is done directly to the task's
-stack, but the reading of the NEED_RESCHED is done by looking at
-the current stack, which in this case is the stack for the hard
-interrupt. This hides the fact that NEED_RESCHED has been set.
-We do not see the 'N' until we switch back to the task's
-assigned stack.
-
-function
---------
-
-This tracer is the function tracer. Enabling the function tracer
-can be done from the debug file system. Make sure the
-ftrace_enabled is set; otherwise this tracer is a nop.
-
- # sysctl kernel.ftrace_enabled=1
- # echo function > /debug/tracing/current_tracer
- # echo 1 > /debug/tracing/tracing_enabled
- # usleep 1
- # echo 0 > /debug/tracing/tracing_enabled
- # cat /debug/tracing/trace
-# tracer: function
-#
-#           TASK-PID   CPU#    TIMESTAMP  FUNCTION
-#              | |      |          |         |
-            bash-4003  [00]   123.638713: finish_task_switch <-schedule
-            bash-4003  [00]   123.638714: _spin_unlock_irq <-finish_task_switch
-            bash-4003  [00]   123.638714: sub_preempt_count <-_spin_unlock_irq
-            bash-4003  [00]   123.638715: hrtick_set <-schedule
-            bash-4003  [00]   123.638715: _spin_lock_irqsave <-hrtick_set
-            bash-4003  [00]   123.638716: add_preempt_count <-_spin_lock_irqsave
-            bash-4003  [00]   123.638716: _spin_unlock_irqrestore <-hrtick_set
-            bash-4003  [00]   123.638717: sub_preempt_count <-_spin_unlock_irqrestore
-            bash-4003  [00]   123.638717: hrtick_clear <-hrtick_set
-            bash-4003  [00]   123.638718: sub_preempt_count <-schedule
-            bash-4003  [00]   123.638718: sub_preempt_count <-preempt_schedule
-            bash-4003  [00]   123.638719: wait_for_completion <-__stop_machine_run
-            bash-4003  [00]   123.638719: wait_for_common <-wait_for_completion
-            bash-4003  [00]   123.638720: _spin_lock_irq <-wait_for_common
-            bash-4003  [00]   123.638720: add_preempt_count <-_spin_lock_irq
-[...]
-
-
-Note: function tracer uses ring buffers to store the above
-entries. The newest data may overwrite the oldest data.
-Sometimes using echo to stop the trace is not sufficient because
-the tracing could have overwritten the data that you wanted to
-record. For this reason, it is sometimes better to disable
-tracing directly from a program. This allows you to stop the
-tracing at the point that you hit the part that you are
-interested in. To disable the tracing directly from a C program,
-something like the following code snippet can be used:
-
-int trace_fd;
-[...]
-int main(int argc, char *argv[]) {
-       [...]
-       trace_fd = open("/debug/tracing/tracing_enabled", O_WRONLY);
-       [...]
-       if (condition_hit()) {
-               write(trace_fd, "0", 1);
-       }
-       [...]
-}
-
-Note: Here we hard coded the path name. The debugfs mount is not
-guaranteed to be at /debug (and is more commonly at
-/sys/kernel/debug). For simple one time traces, the above is
-sufficient. For anything else, a search through /proc/mounts may
-be needed to find where the debugfs file-system is mounted.
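-
-The search through /proc/mounts mentioned above could look like
-the following shell sketch. The function name find_debugfs and its
-optional file argument are illustrative assumptions, not something
-ftrace provides. It relies only on the standard /proc/mounts layout
-(device, mount point, file system type, ...).
-
```shell
# Print the first debugfs mount point found in a mounts table and
# return non-zero if there is none.
# $1 is the table to read. It defaults to /proc/mounts and is a
# parameter only so the function can be tried on a sample file.
find_debugfs() {
    # The third field is the file system type, the second the mount point.
    awk '$3 == "debugfs" { print $2; found = 1; exit }
         END { exit !found }' "${1:-/proc/mounts}"
}
```
-
-A script could then use something like
-"tracing=$(find_debugfs)/tracing" instead of a hard coded path.
-Note that mount points containing spaces appear escaped (\040) in
-/proc/mounts; this sketch does not decode them.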
-
-
-Single thread tracing
----------------------
-
-By writing into /debug/tracing/set_ftrace_pid you can trace a
-single thread. For example:
-
-# cat /debug/tracing/set_ftrace_pid
-no pid
-# echo 3111 > /debug/tracing/set_ftrace_pid
-# cat /debug/tracing/set_ftrace_pid
-3111
-# echo function > /debug/tracing/current_tracer
-# cat /debug/tracing/trace | head
- # tracer: function
- #
- #           TASK-PID    CPU#    TIMESTAMP  FUNCTION
- #              | |       |          |         |
-     yum-updatesd-3111  [003]  1637.254676: finish_task_switch <-thread_return
-     yum-updatesd-3111  [003]  1637.254681: hrtimer_cancel <-schedule_hrtimeout_range
-     yum-updatesd-3111  [003]  1637.254682: hrtimer_try_to_cancel <-hrtimer_cancel
-     yum-updatesd-3111  [003]  1637.254683: lock_hrtimer_base <-hrtimer_try_to_cancel
-     yum-updatesd-3111  [003]  1637.254685: fget_light <-do_sys_poll
-     yum-updatesd-3111  [003]  1637.254686: pipe_poll <-do_sys_poll
-# echo -1 > /debug/tracing/set_ftrace_pid
-# cat /debug/tracing/trace |head
- # tracer: function
- #
- #           TASK-PID    CPU#    TIMESTAMP  FUNCTION
- #              | |       |          |         |
- ##### CPU 3 buffer started ####
-     yum-updatesd-3111  [003]  1701.957688: free_poll_entry <-poll_freewait
-     yum-updatesd-3111  [003]  1701.957689: remove_wait_queue <-free_poll_entry
-     yum-updatesd-3111  [003]  1701.957691: fput <-free_poll_entry
-     yum-updatesd-3111  [003]  1701.957692: audit_syscall_exit <-sysret_audit
-     yum-updatesd-3111  [003]  1701.957693: path_put <-audit_syscall_exit
-
-If you want to trace a program for the whole time it executes,
-you could use something like this simple program:
-
-#include <stdio.h>
-#include <stdlib.h>
-#include <sys/types.h>
-#include <sys/stat.h>
-#include <fcntl.h>
-#include <unistd.h>
-
-int main (int argc, char **argv)
-{
-        if (argc < 2)
-                exit(-1);
-
-        if (fork() > 0) {
-                int fd, ffd;
-                char line[64];
-                int s;
-
-                ffd = open("/debug/tracing/current_tracer", O_WRONLY);
-                if (ffd < 0)
-                        exit(-1);
-                write(ffd, "nop", 3);
-
-                fd = open("/debug/tracing/set_ftrace_pid", O_WRONLY);
-                s = sprintf(line, "%d\n", getpid());
-                write(fd, line, s);
-
-                write(ffd, "function", 8);
-
-                close(fd);
-                close(ffd);
-
-                execvp(argv[1], argv+1);
-        }
-
-        return 0;
-}
-
-
-hw-branch-tracer (x86 only)
----------------------------
-
-This tracer uses the x86 last branch tracing hardware feature to
-collect a branch trace on all cpus with relatively low overhead.
-
-The tracer uses a fixed-size circular buffer per cpu and only
-traces ring 0 branches. The trace file dumps that buffer in the
-following format:
-
-# tracer: hw-branch-tracer
-#
-# CPU#        TO  <-  FROM
-   0  scheduler_tick+0xb5/0x1bf          <-  task_tick_idle+0x5/0x6
-   2  run_posix_cpu_timers+0x2b/0x72a    <-  run_posix_cpu_timers+0x25/0x72a
-   0  scheduler_tick+0x139/0x1bf         <-  scheduler_tick+0xed/0x1bf
-   0  scheduler_tick+0x17c/0x1bf         <-  scheduler_tick+0x148/0x1bf
-   2  run_posix_cpu_timers+0x9e/0x72a    <-  run_posix_cpu_timers+0x5e/0x72a
-   0  scheduler_tick+0x1b6/0x1bf         <-  scheduler_tick+0x1aa/0x1bf
-
-
-The tracer may be used to dump the trace for the oops'ing cpu on
-a kernel oops into the system log. To enable this,
-ftrace_dump_on_oops must be set. To set ftrace_dump_on_oops, one
-can either use the sysctl function or set it via the proc system
-interface.
-
-  sysctl kernel.ftrace_dump_on_oops=1
-
-or
-
-  echo 1 > /proc/sys/kernel/ftrace_dump_on_oops
-
-
-Here's an example of such a dump after a null pointer
-dereference in a kernel module:
-
-[57848.105921] BUG: unable to handle kernel NULL pointer dereference at 0000000000000000
-[57848.106019] IP: [<ffffffffa0000006>] open+0x6/0x14 [oops]
-[57848.106019] PGD 2354e9067 PUD 2375e7067 PMD 0
-[57848.106019] Oops: 0002 [#1] SMP
-[57848.106019] last sysfs file: /sys/devices/pci0000:00/0000:00:1e.0/0000:20:05.0/local_cpus
-[57848.106019] Dumping ftrace buffer:
-[57848.106019] ---------------------------------
-[...]
-[57848.106019]    0  chrdev_open+0xe6/0x165      <-  cdev_put+0x23/0x24
-[57848.106019]    0  chrdev_open+0x117/0x165     <-  chrdev_open+0xfa/0x165
-[57848.106019]    0  chrdev_open+0x120/0x165     <-  chrdev_open+0x11c/0x165
-[57848.106019]    0  chrdev_open+0x134/0x165     <-  chrdev_open+0x12b/0x165
-[57848.106019]    0  open+0x0/0x14 [oops]        <-  chrdev_open+0x144/0x165
-[57848.106019]    0  page_fault+0x0/0x30         <-  open+0x6/0x14 [oops]
-[57848.106019]    0  error_entry+0x0/0x5b        <-  page_fault+0x4/0x30
-[57848.106019]    0  error_kernelspace+0x0/0x31          <-  error_entry+0x59/0x5b
-[57848.106019]    0  error_sti+0x0/0x1   <-  error_kernelspace+0x2d/0x31
-[57848.106019]    0  page_fault+0x9/0x30         <-  error_sti+0x0/0x1
-[57848.106019]    0  do_page_fault+0x0/0x881     <-  page_fault+0x1a/0x30
-[...]
-[57848.106019]    0  do_page_fault+0x66b/0x881   <-  is_prefetch+0x1ee/0x1f2
-[57848.106019]    0  do_page_fault+0x6e0/0x881   <-  do_page_fault+0x67a/0x881
-[57848.106019]    0  oops_begin+0x0/0x96         <-  do_page_fault+0x6e0/0x881
-[57848.106019]    0  trace_hw_branch_oops+0x0/0x2d       <-  oops_begin+0x9/0x96
-[...]
-[57848.106019]    0  ds_suspend_bts+0x2a/0xe3    <-  ds_suspend_bts+0x1a/0xe3
-[57848.106019] ---------------------------------
-[57848.106019] CPU 0
-[57848.106019] Modules linked in: oops
-[57848.106019] Pid: 5542, comm: cat Tainted: G        W  2.6.28 #23
-[57848.106019] RIP: 0010:[<ffffffffa0000006>]  [<ffffffffa0000006>] open+0x6/0x14 [oops]
-[57848.106019] RSP: 0018:ffff880235457d48  EFLAGS: 00010246
-[...]
-
-
-function graph tracer
----------------------
-
-This tracer is similar to the function tracer except that it
-probes a function on its entry and its exit. This is done by
-using a dynamically allocated stack of return addresses in each
-task_struct. On function entry the tracer overwrites the return
-address of each function traced to set a custom probe. Thus the
-original return address is stored on the stack of return addresses
-in the task_struct.
-
-Probing on both ends of a function leads to special features
-such as:
-
-- measuring a function's execution time
-- having a reliable call stack to draw a graph of function calls
-
-This tracer is useful in several situations:
-
-- you want to find the reason for a strange kernel behavior and
-  need to see what happens in detail in any area (or specific
-  ones).
-
-- you are experiencing weird latencies but it's difficult to
-  find their origin.
-
-- you want to quickly find which path is taken by a specific
-  function.
-
-- you just want to peek inside a working kernel and want to see
-  what happens there.
-
-# tracer: function_graph
-#
-# CPU  DURATION                  FUNCTION CALLS
-# |     |   |                     |   |   |   |
-
- 0)               |  sys_open() {
- 0)               |    do_sys_open() {
- 0)               |      getname() {
- 0)               |        kmem_cache_alloc() {
- 0)   1.382 us    |          __might_sleep();
- 0)   2.478 us    |        }
- 0)               |        strncpy_from_user() {
- 0)               |          might_fault() {
- 0)   1.389 us    |            __might_sleep();
- 0)   2.553 us    |          }
- 0)   3.807 us    |        }
- 0)   7.876 us    |      }
- 0)               |      alloc_fd() {
- 0)   0.668 us    |        _spin_lock();
- 0)   0.570 us    |        expand_files();
- 0)   0.586 us    |        _spin_unlock();
-
-
-There are several columns that can be dynamically
-enabled/disabled. You can use every combination of options you
-want, depending on your needs.
-
-- The cpu number on which the function executed is enabled by
-  default.  It is sometimes better to only trace one cpu (see
-  tracing_cpu_mask file); otherwise you might sometimes see
-  unordered function calls when the trace switches between cpus.
-
-       hide: echo nofuncgraph-cpu > /debug/tracing/trace_options
-       show: echo funcgraph-cpu > /debug/tracing/trace_options
-
-- The duration (function's time of execution) is displayed on
-  the closing bracket line of a function or on the same line
-  as the current function in case of a leaf one. It is enabled
-  by default.
-
-       hide: echo nofuncgraph-duration > /debug/tracing/trace_options
-       show: echo funcgraph-duration > /debug/tracing/trace_options
-
-- The overhead field precedes the duration field when the
-  duration exceeds one of the thresholds listed below.
-
-       hide: echo nofuncgraph-overhead > /debug/tracing/trace_options
-       show: echo funcgraph-overhead > /debug/tracing/trace_options
-       depends on: funcgraph-duration
-
-  ie:
-
-  0)               |    up_write() {
-  0)   0.646 us    |      _spin_lock_irqsave();
-  0)   0.684 us    |      _spin_unlock_irqrestore();
-  0)   3.123 us    |    }
-  0)   0.548 us    |    fput();
-  0) + 58.628 us   |  }
-
-  [...]
-
-  0)               |      putname() {
-  0)               |        kmem_cache_free() {
-  0)   0.518 us    |          __phys_addr();
-  0)   1.757 us    |        }
-  0)   2.861 us    |      }
-  0) ! 115.305 us  |    }
-  0) ! 116.402 us  |  }
-
-  + means that the function exceeded 10 usecs.
-  ! means that the function exceeded 100 usecs.
-
-
-- The task/pid field displays the thread cmdline and pid which
-  executed the function. It is disabled by default.
-
-       hide: echo nofuncgraph-proc > /debug/tracing/trace_options
-       show: echo funcgraph-proc > /debug/tracing/trace_options
-
-  ie:
-
-  # tracer: function_graph
-  #
-  # CPU  TASK/PID        DURATION                  FUNCTION CALLS
-  # |    |    |           |   |                     |   |   |   |
-  0)    sh-4802     |               |                  d_free() {
-  0)    sh-4802     |               |                    call_rcu() {
-  0)    sh-4802     |               |                      __call_rcu() {
-  0)    sh-4802     |   0.616 us    |                        rcu_process_gp_end();
-  0)    sh-4802     |   0.586 us    |                        check_for_new_grace_period();
-  0)    sh-4802     |   2.899 us    |                      }
-  0)    sh-4802     |   4.040 us    |                    }
-  0)    sh-4802     |   5.151 us    |                  }
-  0)    sh-4802     | + 49.370 us   |                }
-
-
-- The absolute time field is an absolute timestamp given by the
-  system clock since it started. A snapshot of this time is
-  given on each entry/exit of functions.
-
-       hide: echo nofuncgraph-abstime > /debug/tracing/trace_options
-       show: echo funcgraph-abstime > /debug/tracing/trace_options
-
-  ie:
-
-  #
-  #      TIME       CPU  DURATION                  FUNCTION CALLS
-  #       |         |     |   |                     |   |   |   |
-  360.774522 |   1)   0.541 us    |                                          }
-  360.774522 |   1)   4.663 us    |                                        }
-  360.774523 |   1)   0.541 us    |                                        __wake_up_bit();
-  360.774524 |   1)   6.796 us    |                                      }
-  360.774524 |   1)   7.952 us    |                                    }
-  360.774525 |   1)   9.063 us    |                                  }
-  360.774525 |   1)   0.615 us    |                                  journal_mark_dirty();
-  360.774527 |   1)   0.578 us    |                                  __brelse();
-  360.774528 |   1)               |                                  reiserfs_prepare_for_journal() {
-  360.774528 |   1)               |                                    unlock_buffer() {
-  360.774529 |   1)               |                                      wake_up_bit() {
-  360.774529 |   1)               |                                        bit_waitqueue() {
-  360.774530 |   1)   0.594 us    |                                          __phys_addr();
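-
-The '+' and '!' overhead markers described above follow a simple
-threshold rule. As a rough illustration, the shell sketch below
-maps a duration in microseconds to the marker that would precede
-it. The function name overhead_marker is hypothetical; the real
-formatting is done inside the kernel, not by a script like this.
-
```shell
# Print the overhead marker for a duration given in microseconds:
#   "!" if it exceeded 100 usecs, "+" if it exceeded 10 usecs,
#   and a single space otherwise.
overhead_marker() {
    # awk is used because the durations are fractional values.
    awk -v d="$1" 'BEGIN {
        if (d + 0 > 100)     print "!"
        else if (d + 0 > 10) print "+"
        else                 print " "
    }'
}
```
-
-Applied to the durations in the overhead example, 58.628 yields '+'
-and 115.305 yields '!', matching the trace output shown there.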
-
-
-You can put some comments on specific functions by using
-trace_printk(). For example, if you want to put a comment inside
-the __might_sleep() function, you just have to include
-<linux/ftrace.h> and call trace_printk() inside __might_sleep():
-
-trace_printk("I'm a comment!\n")
-
-will produce:
-
- 1)               |             __might_sleep() {
- 1)               |                /* I'm a comment! */
- 1)   1.449 us    |             }
-
-
-You might find other useful features for this tracer in the
-following "dynamic ftrace" section such as tracing only specific
-functions or tasks.
-
-dynamic ftrace
---------------
-
-If CONFIG_DYNAMIC_FTRACE is set, the system will run with
-virtually no overhead when function tracing is disabled. The way
-this works is the mcount function call (placed at the start of
-every kernel function, produced by the -pg switch in gcc),
-starts off pointing to a simple return. (Enabling FTRACE will
-include the -pg switch in the compiling of the kernel.)
-
-At compile time every C file object is run through the
-recordmcount.pl script (located in the scripts directory). This
-script will process the C object using objdump to find all the
-locations in the .text section that call mcount. (Note, only the
-.text section is processed, since processing other sections like
-.init.text may cause races due to those sections being freed).
-
-A new section called "__mcount_loc" is created that holds
-references to all the mcount call sites in the .text section.
-This section is compiled back into the original object. The
-final linker will add all these references into a single table.
-
-On boot up, before SMP is initialized, the dynamic ftrace code
-scans this table and updates all the locations into nops. It
-also records the locations, which are added to the
-available_filter_functions list.  Modules are processed as they
-are loaded and before they are executed.  When a module is
-unloaded, it also removes its functions from the ftrace function
-list. This is automatic in the module unload code, and the
-module author does not need to worry about it.
-
-When tracing is enabled, kstop_machine is called to prevent
-races with the CPUs executing code being modified (which can
-cause the CPU to do undesirable things), and the nops are
-patched back to calls. But this time, they do not call mcount
-(which is just a function stub). They now call into the ftrace
-infrastructure.
-
-One special side-effect to the recording of the functions being
-traced is that we can now selectively choose which functions we
-wish to trace and which ones should keep their mcount calls
-as nops.
-
-Two files are used, one for enabling and one for disabling the
-tracing of specified functions. They are:
-
-  set_ftrace_filter
-
-and
-
-  set_ftrace_notrace
-
-A list of available functions that you can add to these files is
-listed in:
-
-   available_filter_functions
-
- # cat /debug/tracing/available_filter_functions
-put_prev_task_idle
-kmem_cache_create
-pick_next_task_rt
-get_online_cpus
-pick_next_task_fair
-mutex_lock
-[...]
-
-If I am only interested in sys_nanosleep and hrtimer_interrupt:
-
- # echo sys_nanosleep hrtimer_interrupt \
-               > /debug/tracing/set_ftrace_filter
- # echo ftrace > /debug/tracing/current_tracer
- # echo 1 > /debug/tracing/tracing_enabled
- # usleep 1
- # echo 0 > /debug/tracing/tracing_enabled
- # cat /debug/tracing/trace
-# tracer: ftrace
-#
-#           TASK-PID   CPU#    TIMESTAMP  FUNCTION
-#              | |      |          |         |
-          usleep-4134  [00]  1317.070017: hrtimer_interrupt <-smp_apic_timer_interrupt
-          usleep-4134  [00]  1317.070111: sys_nanosleep <-syscall_call
-          <idle>-0     [00]  1317.070115: hrtimer_interrupt <-smp_apic_timer_interrupt
-
-To see which functions are being traced, you can cat the file:
-
- # cat /debug/tracing/set_ftrace_filter
-hrtimer_interrupt
-sys_nanosleep
-
-
-Perhaps this is not enough. The filters also allow simple wild
-cards. Only the following are currently available:
-
-  <match>*  - will match functions that begin with <match>
-  *<match>  - will match functions that end with <match>
-  *<match>* - will match functions that contain <match>
-
-These are the only wild cards which are supported.
-
-  <match>*<match> will not work.
-
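The wild card rules above can be sketched as a small bash
function. This is only an illustration of the matching semantics
described in this section, not the kernel's implementation; the
name ftrace_glob_match is hypothetical.

```shell
#!/bin/bash
# Hypothetical helper (not part of ftrace): returns 0 when NAME would
# be selected by PATTERN under the three wild card forms listed above.
ftrace_glob_match() {
    local pattern=$1 name=$2 core
    case $pattern in
        \**\*)  # *<match>* : <match> appears anywhere in the name
            core=${pattern#\*}; core=${core%\*}
            [[ $name == *"$core"* ]] ;;
        \**)    # *<match> : the name ends with <match>
            core=${pattern#\*}
            [[ $name == *"$core" ]] ;;
        *\*)    # <match>* : the name begins with <match>
            core=${pattern%\*}
            [[ $name == "$core"* ]] ;;
        *)      # no leading or trailing wild card: literal comparison,
                # so <match>*<match> never selects a real function name
            [[ $name == "$pattern" ]] ;;
    esac
}
```

For example, "ftrace_glob_match 'hrtimer_*' hrtimer_init" succeeds,
while "ftrace_glob_match 'hr*_init' hrtimer_init" fails because a
wild card in the middle is not supported.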
-Note: It is better to use quotes to enclose the wild cards,
-      otherwise the shell may expand the parameters into names
-      of files in the local directory.
-
- # echo 'hrtimer_*' > /debug/tracing/set_ftrace_filter
-
-Produces:
-
-# tracer: ftrace
-#
-#           TASK-PID   CPU#    TIMESTAMP  FUNCTION
-#              | |      |          |         |
-            bash-4003  [00]  1480.611794: hrtimer_init <-copy_process
-            bash-4003  [00]  1480.611941: hrtimer_start <-hrtick_set
-            bash-4003  [00]  1480.611956: hrtimer_cancel <-hrtick_clear
-            bash-4003  [00]  1480.611956: hrtimer_try_to_cancel <-hrtimer_cancel
-          <idle>-0     [00]  1480.612019: hrtimer_get_next_event <-get_next_timer_interrupt
-          <idle>-0     [00]  1480.612025: hrtimer_get_next_event <-get_next_timer_interrupt
-          <idle>-0     [00]  1480.612032: hrtimer_get_next_event <-get_next_timer_interrupt
-          <idle>-0     [00]  1480.612037: hrtimer_get_next_event <-get_next_timer_interrupt
-          <idle>-0     [00]  1480.612382: hrtimer_get_next_event <-get_next_timer_interrupt
-
-
-Notice that we lost the sys_nanosleep.
-
- # cat /debug/tracing/set_ftrace_filter
-hrtimer_run_queues
-hrtimer_run_pending
-hrtimer_init
-hrtimer_cancel
-hrtimer_try_to_cancel
-hrtimer_forward
-hrtimer_start
-hrtimer_reprogram
-hrtimer_force_reprogram
-hrtimer_get_next_event
-hrtimer_interrupt
-hrtimer_nanosleep
-hrtimer_wakeup
-hrtimer_get_remaining
-hrtimer_get_res
-hrtimer_init_sleeper
-
-
-This is because the '>' and '>>' act just like they do in bash.
-To rewrite the filters, use '>'
-To append to the filters, use '>>'
-
-To clear out a filter so that all functions will be recorded
-again:
-
- # echo > /debug/tracing/set_ftrace_filter
- # cat /debug/tracing/set_ftrace_filter
- #
-
-Again, now we want to append.
-
- # echo sys_nanosleep > /debug/tracing/set_ftrace_filter
- # cat /debug/tracing/set_ftrace_filter
-sys_nanosleep
- # echo 'hrtimer_*' >> /debug/tracing/set_ftrace_filter
- # cat /debug/tracing/set_ftrace_filter
-hrtimer_run_queues
-hrtimer_run_pending
-hrtimer_init
-hrtimer_cancel
-hrtimer_try_to_cancel
-hrtimer_forward
-hrtimer_start
-hrtimer_reprogram
-hrtimer_force_reprogram
-hrtimer_get_next_event
-hrtimer_interrupt
-sys_nanosleep
-hrtimer_nanosleep
-hrtimer_wakeup
-hrtimer_get_remaining
-hrtimer_get_res
-hrtimer_init_sleeper
-
-
-The set_ftrace_notrace file prevents the listed functions from
-being traced.
-
- # echo '*preempt*' '*lock*' > /debug/tracing/set_ftrace_notrace
-
-Produces:
-
-# tracer: ftrace
-#
-#           TASK-PID   CPU#    TIMESTAMP  FUNCTION
-#              | |      |          |         |
-            bash-4043  [01]   115.281644: finish_task_switch <-schedule
-            bash-4043  [01]   115.281645: hrtick_set <-schedule
-            bash-4043  [01]   115.281645: hrtick_clear <-hrtick_set
-            bash-4043  [01]   115.281646: wait_for_completion <-__stop_machine_run
-            bash-4043  [01]   115.281647: wait_for_common <-wait_for_completion
-            bash-4043  [01]   115.281647: kthread_stop <-stop_machine_run
-            bash-4043  [01]   115.281648: init_waitqueue_head <-kthread_stop
-            bash-4043  [01]   115.281648: wake_up_process <-kthread_stop
-            bash-4043  [01]   115.281649: try_to_wake_up <-wake_up_process
-
-We can see that there's no more lock or preempt tracing.
-
-
-Dynamic ftrace with the function graph tracer
----------------------------------------------
-
-Although what has been explained above concerns both the
-function tracer and the function-graph-tracer, there are some
-special features only available in the function-graph tracer.
-
-If you want to trace only one function and all of its children,
-you just have to echo its name into set_graph_function:
-
- echo __do_fault > set_graph_function
-
-will produce the following "expanded" trace of the __do_fault()
-function:
-
- 0)               |  __do_fault() {
- 0)               |    filemap_fault() {
- 0)               |      find_lock_page() {
- 0)   0.804 us    |        find_get_page();
- 0)               |        __might_sleep() {
- 0)   1.329 us    |        }
- 0)   3.904 us    |      }
- 0)   4.979 us    |    }
- 0)   0.653 us    |    _spin_lock();
- 0)   0.578 us    |    page_add_file_rmap();
- 0)   0.525 us    |    native_set_pte_at();
- 0)   0.585 us    |    _spin_unlock();
- 0)               |    unlock_page() {
- 0)   0.541 us    |      page_waitqueue();
- 0)   0.639 us    |      __wake_up_bit();
- 0)   2.786 us    |    }
- 0) + 14.237 us   |  }
- 0)               |  __do_fault() {
- 0)               |    filemap_fault() {
- 0)               |      find_lock_page() {
- 0)   0.698 us    |        find_get_page();
- 0)               |        __might_sleep() {
- 0)   1.412 us    |        }
- 0)   3.950 us    |      }
- 0)   5.098 us    |    }
- 0)   0.631 us    |    _spin_lock();
- 0)   0.571 us    |    page_add_file_rmap();
- 0)   0.526 us    |    native_set_pte_at();
- 0)   0.586 us    |    _spin_unlock();
- 0)               |    unlock_page() {
- 0)   0.533 us    |      page_waitqueue();
- 0)   0.638 us    |      __wake_up_bit();
- 0)   2.793 us    |    }
- 0) + 14.012 us   |  }
-
-You can also expand several functions at once:
-
- echo sys_open > set_graph_function
- echo sys_close >> set_graph_function
-
-Now if you want to go back to trace all functions you can clear
-this special filter via:
-
- echo > set_graph_function
-
-
-trace_pipe
-----------
-
-The trace_pipe outputs the same content as the trace file, but
-the effect on the tracing is different. Every read from
-trace_pipe is consumed. This means that subsequent reads will be
-different. The trace is live.
-
- # echo function > /debug/tracing/current_tracer
- # cat /debug/tracing/trace_pipe > /tmp/trace.out &
-[1] 4153
- # echo 1 > /debug/tracing/tracing_enabled
- # usleep 1
- # echo 0 > /debug/tracing/tracing_enabled
- # cat /debug/tracing/trace
-# tracer: function
-#
-#           TASK-PID   CPU#    TIMESTAMP  FUNCTION
-#              | |      |          |         |
-
- #
- # cat /tmp/trace.out
-            bash-4043  [00] 41.267106: finish_task_switch <-schedule
-            bash-4043  [00] 41.267106: hrtick_set <-schedule
-            bash-4043  [00] 41.267107: hrtick_clear <-hrtick_set
-            bash-4043  [00] 41.267108: wait_for_completion <-__stop_machine_run
-            bash-4043  [00] 41.267108: wait_for_common <-wait_for_completion
-            bash-4043  [00] 41.267109: kthread_stop <-stop_machine_run
-            bash-4043  [00] 41.267109: init_waitqueue_head <-kthread_stop
-            bash-4043  [00] 41.267110: wake_up_process <-kthread_stop
-            bash-4043  [00] 41.267110: try_to_wake_up <-wake_up_process
-            bash-4043  [00] 41.267111: select_task_rq_rt <-try_to_wake_up
-
-
-Note, reading the trace_pipe file will block until more input is
-added. By changing the tracer, trace_pipe will issue an EOF. We
-needed to set the function tracer _before_ we "cat" the
-trace_pipe file.
-
-
-trace entries
--------------
-
-Having too much or not enough data can be troublesome in
-diagnosing an issue in the kernel. The file buffer_size_kb is
-used to modify the size of the internal trace buffers. The
-number listed is the size, in kilobytes, of the buffer that
-each CPU can record into. To know the full size, multiply the
-number of possible CPUs by this per-CPU size.
-
- # cat /debug/tracing/buffer_size_kb
-1408 (units kilobytes)
-
-Note, to modify this, you must have tracing completely disabled.
-To do that, echo "nop" into the current_tracer. If the
-current_tracer is not set to "nop", an EINVAL error will be
-returned.
-
- # echo nop > /debug/tracing/current_tracer
- # echo 10000 > /debug/tracing/buffer_size_kb
- # cat /debug/tracing/buffer_size_kb
-10000 (units kilobytes)
-
-The number of pages which will be allocated is limited to a
-percentage of available memory. Allocating too much will produce
-an error.
-
- # echo 1000000000000 > /debug/tracing/buffer_size_kb
--bash: echo: write error: Cannot allocate memory
- # cat /debug/tracing/buffer_size_kb
-85
-
------------
-
-More details can be found in the source code, in the
-kernel/trace/*.c files.
diff --git a/Documentation/trace/ftrace.txt b/Documentation/trace/ftrace.txt

new file mode 100644
index 0000000..fd9a3e6
--- /dev/null
+++ b/Documentation/trace/ftrace.txt
@@ -0,0 +1,1828 @@
+               ftrace - Function Tracer
+               ========================
+
+Copyright 2008 Red Hat Inc.
+   Author:   Steven Rostedt <srostedt@redhat.com>
+  License:   The GNU Free Documentation License, Version 1.2
+               (dual licensed under the GPL v2)
+Reviewers:   Elias Oltmanns, Randy Dunlap, Andrew Morton,
+            John Kacur, and David Teigland.
+
+Written for: 2.6.28-rc2
+
+Introduction
+------------
+
+Ftrace is an internal tracer designed to help developers and
+system designers find out what is going on inside the kernel.
+It can be used for debugging or analyzing latencies and
+performance issues that take place outside of user-space.
+
+Although ftrace is the function tracer, it also includes an
+infrastructure that allows for other types of tracing. Some of
+the tracers that are currently in ftrace include a tracer to
+trace context switches, the time it takes for a high priority
+task to run after it was woken up, the time interrupts are
+disabled, and more (ftrace allows for tracer plugins, which
+means that the list of tracers can always grow).
+
+
+The File System
+---------------
+
+Ftrace uses the debugfs file system to hold the control files as
+well as the files to display output.
+
+To mount the debugfs system:
+
+  # mkdir /debug
+  # mount -t debugfs nodev /debug
+
+( Note: it is more common to mount at /sys/kernel/debug, but for
+  simplicity this document will use /debug)
+
+That's it! (assuming that you have ftrace configured into your kernel)
+
+After mounting the debugfs, you can see a directory called
+"tracing".  This directory contains the control and output files
+of ftrace. Here is a list of some of the key files:
+
+
+ Note: all time values are in microseconds.
+
+  current_tracer:
+
+       This is used to set or display the current tracer
+       that is configured.
+
+  available_tracers:
+
+       This holds the different types of tracers that
+       have been compiled into the kernel. The
+       tracers listed here can be configured by
+       echoing their name into current_tracer.
+
+  tracing_enabled:
+
+       This sets or displays whether the current_tracer
+       is activated and tracing or not. Echo 0 into this
+       file to disable the tracer or 1 to enable it.
+
+  trace:
+
+       This file holds the output of the trace in a human
+       readable format (described below).
+
+  latency_trace:
+
+       This file shows the same trace but the information
+       is organized more to display possible latencies
+       in the system (described below).
+
+  trace_pipe:
+
+       The output is the same as the "trace" file but this
+       file is meant to be streamed with live tracing.
+       Reads from this file will block until new data
+       is retrieved. Unlike the "trace" and "latency_trace"
+       files, this file is a consumer. This means reading
+       from this file causes sequential reads to display
+       more current data. Once data is read from this
+       file, it is consumed, and will not be read
+       again with a sequential read. The "trace" and
+       "latency_trace" files are static, and if the
+       tracer is not adding more data, they will display
+       the same information every time they are read.
+
+  trace_options:
+
+       This file lets the user control the amount of data
+       that is displayed in one of the above output
+       files.
+
+  tracing_max_latency:
+
+       Some of the tracers record the max latency.
+       For example, the time interrupts are disabled.
+       This time is saved in this file. The max trace
+       will also be stored, and displayed by either
+       "trace" or "latency_trace".  A new max trace will
+       only be recorded if the latency is greater than
+       the value in this file. (in microseconds)
+
+  buffer_size_kb:
+
+       This sets or displays the number of kilobytes each CPU
+       buffer can hold. The tracer buffers are the same size
+       for each CPU. The displayed number is the size of the
+       CPU buffer and not total size of all buffers. The
+       trace buffers are allocated in pages (blocks of memory
+       that the kernel uses for allocation, usually 4 KB in size).
+       If the last page allocated has room for more bytes
+       than requested, the rest of the page will be used,
+       making the actual allocation bigger than requested.
+       ( Note, the size may not be a multiple of the page size
+         due to buffer management overhead. )
+
+       This can only be updated when the current_tracer
+       is set to "nop".
+
+  tracing_cpumask:
+
+       This is a mask that lets the user only trace
+       on specified CPUs. The format is a hex string
+       representing the CPUs.
+
+  set_ftrace_filter:
+
+       When dynamic ftrace is configured in (see the
+       section below "dynamic ftrace"), the code is dynamically
+       modified (code text rewrite) to disable calling of the
+       function profiler (mcount). This lets tracing be configured
+       in with practically no overhead in performance.  This also
+       has a side effect of enabling or disabling specific functions
+       to be traced. Echoing names of functions into this file
+       will limit the trace to only those functions.
+
+  set_ftrace_notrace:
+
+       This has an effect opposite to that of
+       set_ftrace_filter. Any function that is added here will not
+       be traced. If a function exists in both set_ftrace_filter
+       and set_ftrace_notrace, the function will _not_ be traced.
+
+  set_ftrace_pid:
+
+       Have the function tracer only trace a single thread.
+
+  set_graph_function:
+
+       Set a "trigger" function where tracing should start
+       with the function graph tracer (See the section
+       "dynamic ftrace" for more details).
+
+  available_filter_functions:
+
+       This lists the functions that ftrace
+       has processed and can trace. These are the function
+       names that you can pass to "set_ftrace_filter" or
+       "set_ftrace_notrace". (See the section "dynamic ftrace"
+       below for more details.)
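
As an illustration of the tracing_cpumask format described above,
the hex string can be built from a list of CPU numbers. This is a
minimal sketch that assumes bit N of the mask stands for CPU N and
that the CPU numbers are small enough for shell arithmetic; the
helper name cpus_to_mask is hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: print the hex mask for a list of CPU numbers,
# assuming bit N of the mask selects CPU N.
cpus_to_mask() {
    local mask=0 cpu
    for cpu in "$@"; do
        mask=$(( mask | (1 << cpu) ))   # set the bit for this CPU
    done
    printf '%x\n' "$mask"
}

# Example: CPUs 0 and 2 give the mask 5.
# cpus_to_mask 0 2 > /debug/tracing/tracing_cpumask
```

The commented-out redirect shows how the result could be written
to the control file on a system where debugfs is mounted at /debug.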
+
+
+The Tracers
+-----------
+
+Here is the list of current tracers that may be configured.
+
+  "function"
+
+       Function call tracer to trace all kernel functions.
+
+  "function_graph_tracer"
+
+       Similar to the function tracer except that the
+       function tracer probes the functions on their entry
+       whereas the function graph tracer traces on both entry
+       and exit of the functions. It then provides the ability
+       to draw a graph of function calls similar to C code
+       source.
+
+  "sched_switch"
+
+       Traces the context switches and wakeups between tasks.
+
+  "irqsoff"
+
+       Traces the areas that disable interrupts and saves
+       the trace with the longest max latency.
+       See tracing_max_latency. When a new max is recorded,
+       it replaces the old trace. It is best to view this
+       trace via the latency_trace file.
+
+  "preemptoff"
+
+       Similar to irqsoff but traces and records the amount of
+       time for which preemption is disabled.
+
+  "preemptirqsoff"
+
+       Similar to irqsoff and preemptoff, but traces and
+       records the largest time for which irqs and/or preemption
+       is disabled.
+
+  "wakeup"
+
+       Traces and records the max latency that it takes for
+       the highest priority task to get scheduled after
+       it has been woken up.
+
+  "hw-branch-tracer"
+
+       Uses the BTS CPU feature on x86 CPUs to trace all
+       branches executed.
+
+  "nop"
+
+       This is the "trace nothing" tracer. To remove all
+       tracers from tracing simply echo "nop" into
+       current_tracer.
+
+
+Examples of using the tracer
+----------------------------
+
+Here are typical examples of using the tracers when controlling
+them only with the debugfs interface (without using any
+user-land utilities).
+
+Output format:
+--------------
+
+Here is an example of the output format of the file "trace"
+
+                             --------
+# tracer: function
+#
+#           TASK-PID   CPU#    TIMESTAMP  FUNCTION
+#              | |      |          |         |
+            bash-4251  [01] 10152.583854: path_put <-path_walk
+            bash-4251  [01] 10152.583855: dput <-path_put
+            bash-4251  [01] 10152.583855: _atomic_dec_and_lock <-dput
+                             --------
+
+A header is printed with the name of the tracer that produced
+the trace. In this case the tracer is "function". Then comes a
+header showing the format: the task name "bash", the task PID
+"4251", the CPU that it was running on "01", the timestamp in
+<secs>.<usecs> format, the function name that was traced
+"path_put" and the parent function that called this function
+"path_walk". The timestamp is the time at which the function
+was entered.
+
+The sched_switch tracer also includes tracing of task wakeups
+and context switches.
+
+     ksoftirqd/1-7     [01]  1453.070013:      7:115:R   +  2916:115:S
+     ksoftirqd/1-7     [01]  1453.070013:      7:115:R   +    10:115:S
+     ksoftirqd/1-7     [01]  1453.070013:      7:115:R ==>    10:115:R
+        events/1-10    [01]  1453.070013:     10:115:S ==>  2916:115:R
+     kondemand/1-2916  [01]  1453.070013:   2916:115:S ==>     7:115:R
+     ksoftirqd/1-7     [01]  1453.070013:      7:115:S ==>     0:140:R
+
+Wake ups are represented by a "+" and the context switches are
+shown as "==>".  The format is:
+
+ Context switches:
+
+       Previous task              Next Task
+
+  <pid>:<prio>:<state>  ==>  <pid>:<prio>:<state>
+
+ Wake ups:
+
+       Current task               Task waking up
+
+  <pid>:<prio>:<state>    +  <pid>:<prio>:<state>
+
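The record format above is regular enough to split with plain
bash. The following is a sketch only, assuming each record ends
with the three fields shown above; the function name
parse_sched_record and its output layout are hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: split one sched_switch record into
# "<kind> <pid> <prio> <state> <pid> <prio> <state>".
parse_sched_record() {
    local -a f
    read -r -a f <<< "$1"            # split the line on whitespace
    local n=${#f[@]}
    (( n >= 3 )) || return 1

    local kind
    case ${f[n-2]} in
        '==>') kind=switch ;;        # context switch
        '+')   kind=wakeup ;;        # wake up
        *)     return 1 ;;
    esac

    local lpid lprio lstate rpid rprio rstate
    IFS=: read -r lpid lprio lstate <<< "${f[n-3]}"
    IFS=: read -r rpid rprio rstate <<< "${f[n-1]}"
    echo "$kind $lpid $lprio $lstate $rpid $rprio $rstate"
}
```

Feeding it the third line of the example above prints
"switch 7 115 R 10 115 R".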
+The prio is the internal kernel priority, which is the inverse
+of the priority that is usually displayed by user-space tools.
+Zero represents the highest priority (99). Prio 100 starts the
+"nice" priorities with 100 being equal to nice -20 and 139 being
+nice 19. The prio "140" is reserved for the idle task which is
+the lowest priority thread (pid 0).
+
+
+Latency trace format
+--------------------
+
+For traces that display latency times, the latency_trace file
+gives somewhat more information to see why a latency happened.
+Here is a typical trace.
+
+# tracer: irqsoff
+#
+irqsoff latency trace v1.1.5 on 2.6.26-rc8
+--------------------------------------------------------------------
+ latency: 97 us, #3/3, CPU#0 | (M:preempt VP:0, KP:0, SP:0 HP:0 #P:2)
+    -----------------
+    | task: swapper-0 (uid:0 nice:0 policy:0 rt_prio:0)
+    -----------------
+ => started at: apic_timer_interrupt
+ => ended at:   do_softirq
+
+#                _------=> CPU#
+#               / _-----=> irqs-off
+#              | / _----=> need-resched
+#              || / _---=> hardirq/softirq
+#              ||| / _--=> preempt-depth
+#              |||| /
+#              |||||     delay
+#  cmd     pid ||||| time  |   caller
+#     \   /    |||||   \   |   /
+  <idle>-0     0d..1    0us+: trace_hardirqs_off_thunk (apic_timer_interrupt)
+  <idle>-0     0d.s.   97us : __do_softirq (do_softirq)
+  <idle>-0     0d.s1   98us : trace_hardirqs_on (do_softirq)
+
+
+This shows that the current tracer is "irqsoff" tracing the time
+for which interrupts were disabled. It gives the trace version
+and the version of the kernel on which this was executed
+(2.6.26-rc8). Then it displays the max latency in microsecs (97
+us), the number of trace entries displayed and the total number
+recorded (both are three: #3/3), and the type of preemption that
+was used (PREEMPT). VP, KP, SP, and HP are always zero and are
+reserved for later use. #P is the number of online CPUs (#P:2).
+
+The task is the process that was running when the latency
+occurred. (swapper pid: 0).
+
+The start and stop (the functions in which the interrupts were
+disabled and enabled respectively) that caused the latencies:
+
+  apic_timer_interrupt is where the interrupts were disabled.
+  do_softirq is where they were enabled again.
+
+The next lines after the header are the trace itself. The header
+explains which is which.
+
+  cmd: The name of the process in the trace.
+
+  pid: The PID of that process.
+
+  CPU#: The CPU which the process was running on.
+
+  irqs-off: 'd' interrupts are disabled. '.' otherwise.
+           Note: If the architecture does not support a way to
+                 read the irq flags variable, an 'X' will always
+                 be printed here.
+
+  need-resched: 'N' task need_resched is set, '.' otherwise.
+
+  hardirq/softirq:
+       'H' - hard irq occurred inside a softirq.
+       'h' - hard irq is running
+       's' - soft irq is running
+       '.' - normal context.
+
+  preempt-depth: The nesting level of preempt_disable
+
+The above is mostly meaningful for kernel developers.
+
+  time: This differs from the trace file output. The trace file output
+       includes an absolute timestamp. The timestamp used by the
+       latency_trace file is relative to the start of the trace.
+
+  delay: This is just to help catch your eye a bit better. (It
+        still needs to be fixed to be relative only to the same
+        CPU.) The marks are determined by the difference between
+        this current trace and the next trace.
+         '!' - greater than preempt_mark_thresh (default 100)
+         '+' - greater than 1 microsecond
+         ' ' - less than or equal to 1 microsecond.
+
+  The rest is the same as the 'trace' file.
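
The flag column and the delay mark described above can be decoded
with a few lines of bash. This is an illustrative sketch, not code
from ftrace: the function names are hypothetical, and it assumes
that a '.' in the preempt-depth position means a depth of zero.

```shell
#!/bin/bash
# Hypothetical helper: decode a field such as "0d..1" (CPU number
# followed by irqs-off, need-resched, hardirq/softirq, preempt-depth).
decode_latency_flags() {
    local field=$1
    local cpu=${field%????}      # everything before the last four characters
    local flags=${field: -4}     # the four flag characters
    local irq resched ctx depth

    case ${flags:0:1} in
        d) irq=yes ;;
        X) irq=unknown ;;        # architecture cannot read the irq flags
        *) irq=no ;;
    esac
    case ${flags:1:1} in
        N) resched=yes ;;
        *) resched=no ;;
    esac
    case ${flags:2:1} in
        H) ctx=hardirq-in-softirq ;;
        h) ctx=hardirq ;;
        s) ctx=softirq ;;
        *) ctx=normal ;;
    esac
    depth=${flags:3:1}
    [[ $depth == . ]] && depth=0   # assumption: '.' stands for depth 0

    echo "cpu=$cpu irqs-off=$irq need-resched=$resched context=$ctx preempt-depth=$depth"
}

# Hypothetical helper: pick the delay mark for a gap given in
# microseconds; the second argument stands in for preempt_mark_thresh.
delay_mark() {
    local delta_us=$1 thresh=${2:-100}
    if (( delta_us > thresh )); then
        printf '!'
    elif (( delta_us > 1 )); then
        printf '+'
    else
        printf ' '
    fi
}
```

With the example trace above, the 97 us gap between the first two
entries yields '+', which matches the "0us+" shown on the first
line.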
+
+
+trace_options
+-------------
+
+The trace_options file is used to control what gets printed in
+the trace output. To see what is available, simply cat the file:
+
+  cat /debug/tracing/trace_options
+  print-parent nosym-offset nosym-addr noverbose noraw nohex nobin \
+  noblock nostacktrace nosched-tree nouserstacktrace nosym-userobj
+
+To disable one of the options, echo in the option prepended with
+"no".
+
+  echo noprint-parent > /debug/tracing/trace_options
+
+To enable an option, leave off the "no".
+
+  echo sym-offset > /debug/tracing/trace_options
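
The two commands above follow one rule, which a small wrapper can
capture. This is a sketch under stated assumptions: the function
name set_trace_option and the TRACING_DIR variable are hypothetical,
and the default path assumes debugfs is mounted at /debug as in
this document.

```shell
#!/bin/bash
# Assumption: debugfs is mounted at /debug; override TRACING_DIR otherwise.
TRACING_DIR=${TRACING_DIR:-/debug/tracing}

# Hypothetical helper: enable or disable one trace option by writing
# either "<option>" or "no<option>" into trace_options.
set_trace_option() {
    local name=$1 state=$2
    case $state in
        on)  echo "$name"   > "$TRACING_DIR/trace_options" ;;
        off) echo "no$name" > "$TRACING_DIR/trace_options" ;;
        *)   echo "usage: set_trace_option <option> on|off" >&2
             return 1 ;;
    esac
}

# Example: set_trace_option print-parent off
```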
+
+Here are the available options:
+
+  print-parent - On function traces, display the calling (parent)
+                function as well as the function being traced.
+
+  print-parent:
+   bash-4000  [01]  1477.606694: simple_strtoul <-strict_strtoul
+
+  noprint-parent:
+   bash-4000  [01]  1477.606694: simple_strtoul
+
+
+  sym-offset - Display not only the function name, but also the
+              offset in the function. For example, instead of
+              seeing just "ktime_get", you will see
+              "ktime_get+0xb/0x20".
+
+  sym-offset:
+   bash-4000  [01]  1477.606694: simple_strtoul+0x6/0xa0
+
+  sym-addr - Display the function address as well as the
+            function name.
+
+  sym-addr:
+   bash-4000  [01]  1477.606694: simple_strtoul <c0339346>
+
+  verbose - This deals with the latency_trace file.
+
+    bash  4000 1 0 00000000 00010a95 [58127d26] 1720.415ms \
+    (+0.000ms): simple_strtoul (strict_strtoul)
+
+  raw - This will display raw numbers. This option is best for
+       use with user applications that can translate the raw
+       numbers better than having it done in the kernel.
+
+  hex - Similar to raw, but the numbers will be in a hexadecimal
+       format.
+
+  bin - This will print out the formats in raw binary.
+
+  block - TBD (needs update)
+
+  stacktrace - This is one of the options that changes the trace
+              itself. When a trace is recorded, so is the stack
+              of functions. This allows for back traces of
+              trace sites.
+
+  userstacktrace - This option changes the trace. It records a
+                  stacktrace of the current userspace thread.
+
+  sym-userobj - When user stacktraces are enabled, look up which
+               object the address belongs to, and print a
+               relative address. This is especially useful when
+               ASLR is on, otherwise you don't get a chance to
+               resolve the address to object/file/line after
+               the app is no longer running.
+
+               The lookup is performed when you read
+               trace, trace_pipe, or latency_trace. Example:
+
+               a.out-1623  [000] 40874.465068: /root/a.out[+0x480] <-/root/a.out[+0x494]
+               <- /root/a.out[+0x4a8] <- /lib/libc-2.7.so[+0x1e1a6]
+
+  sched-tree - trace all tasks that are on the runqueue, at
+              every scheduling event. Will add overhead if
+              there are a lot of tasks running at once.
+
+
+sched_switch
+------------
+
+This tracer simply records schedule switches. Here is an example
+of how to use it.
+
+ # echo sched_switch > /debug/tracing/current_tracer
+ # echo 1 > /debug/tracing/tracing_enabled
+ # sleep 1
+ # echo 0 > /debug/tracing/tracing_enabled
+ # cat /debug/tracing/trace
+
+# tracer: sched_switch
+#
+#           TASK-PID   CPU#    TIMESTAMP  FUNCTION
+#              | |      |          |         |
+            bash-3997  [01]   240.132281:   3997:120:R   +  4055:120:R
+            bash-3997  [01]   240.132284:   3997:120:R ==>  4055:120:R
+           sleep-4055  [01]   240.132371:   4055:120:S ==>  3997:120:R
+            bash-3997  [01]   240.132454:   3997:120:R   +  4055:120:S
+            bash-3997  [01]   240.132457:   3997:120:R ==>  4055:120:R
+           sleep-4055  [01]   240.132460:   4055:120:D ==>  3997:120:R
+            bash-3997  [01]   240.132463:   3997:120:R   +  4055:120:D
+            bash-3997  [01]   240.132465:   3997:120:R ==>  4055:120:R
+          <idle>-0     [00]   240.132589:      0:140:R   +     4:115:S
+          <idle>-0     [00]   240.132591:      0:140:R ==>     4:115:R
+     ksoftirqd/0-4     [00]   240.132595:      4:115:S ==>     0:140:R
+          <idle>-0     [00]   240.132598:      0:140:R   +     4:115:S
+          <idle>-0     [00]   240.132599:      0:140:R ==>     4:115:R
+     ksoftirqd/0-4     [00]   240.132603:      4:115:S ==>     0:140:R
+           sleep-4055  [01]   240.133058:   4055:120:S ==>  3997:120:R
+ [...]
+
+
+As discussed previously for this format, the header shows the
+name of the tracer and points to the options. The
+"FUNCTION" is a misnomer since here it represents the wake ups
+and context switches.
+
+The sched_switch tracer only lists the wake ups (represented with
+'+') and context switches ('==>') with the previous task or
+current task first followed by the next task or task waking up.
+The format for both of these is PID:KERNEL-PRIO:TASK-STATE.
+Remember that the KERNEL-PRIO is the inverse of the actual
+priority with zero (0) being the highest priority and the nice
+values starting at 100 (nice -20). Below is a quick chart to map
+the kernel priority to user land priorities.
+
+  Kernel priority: 0 to 99    ==> user RT priority 99 to 0
+  Kernel priority: 100 to 139 ==> user nice -20 to 19
+  Kernel priority: 140        ==> idle task priority
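
The chart above translates directly into a small conversion
function. This is a sketch for illustration; the function name
kernel_prio_to_user and its output wording are hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: map a kernel priority, as printed by the
# sched_switch tracer, to the user land value from the chart above.
kernel_prio_to_user() {
    local prio=$1
    if (( prio >= 0 && prio <= 99 )); then
        echo "rt $(( 99 - prio ))"       # RT priorities are inverted
    elif (( prio >= 100 && prio <= 139 )); then
        echo "nice $(( prio - 120 ))"    # 100 is nice -20, 139 is nice 19
    elif (( prio == 140 )); then
        echo "idle"
    else
        echo "invalid"
        return 1
    fi
}
```

In the example trace, bash and sleep run at kernel priority 120,
which this prints as "nice 0", and ksoftirqd at 115 comes out as
"nice -5".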
+
+The task states are:
+
+ R - running : wants to run, may not actually be running
+ S - sleep   : process is waiting to be woken up (handles signals)
+ D - disk sleep (uninterruptible sleep) : process must be woken up
+                                       (ignores signals)
+ T - stopped : process suspended
+ t - traced  : process is being traced (with something like gdb)
+ Z - zombie  : process waiting to be cleaned up
+ X - unknown
+
+
+ftrace_enabled
+--------------
+
+The tracers described below give different output depending on
+whether or not the sysctl ftrace_enabled is set. To set
+ftrace_enabled, one can either use the sysctl command or set it
+via the proc file system interface.
+
+  sysctl kernel.ftrace_enabled=1
+
+ or
+
+  echo 1 > /proc/sys/kernel/ftrace_enabled
+
+To disable ftrace_enabled simply replace the '1' with '0' in the
+above commands.
+
+When ftrace_enabled is set the tracers will also record the
+functions that are within the trace. The descriptions of the
+tracers will also show an example with ftrace enabled.
+
+
+irqsoff
+-------
+
+When interrupts are disabled, the CPU cannot react to any other
+external event (besides NMIs and SMIs). This prevents the timer
+interrupt from triggering or the mouse interrupt from letting
+the kernel know of a new mouse event. The result is added
+latency in the reaction time.
+
+The irqsoff tracer tracks the time for which interrupts are
+disabled. When a new maximum latency is hit, the tracer saves
+the trace leading up to that latency point so that every time a
+new maximum is reached, the old saved trace is discarded and the
+new trace is saved.
+
+To reset the maximum, echo 0 into tracing_max_latency. Here is
+an example:
+
+ # echo irqsoff > /debug/tracing/current_tracer
+ # echo 0 > /debug/tracing/tracing_max_latency
+ # echo 1 > /debug/tracing/tracing_enabled
+ # ls -ltr
+ [...]
+ # echo 0 > /debug/tracing/tracing_enabled
+ # cat /debug/tracing/latency_trace
+# tracer: irqsoff
+#
+irqsoff latency trace v1.1.5 on 2.6.26
+--------------------------------------------------------------------
+ latency: 12 us, #3/3, CPU#1 | (M:preempt VP:0, KP:0, SP:0 HP:0 #P:2)
+    -----------------
+    | task: bash-3730 (uid:0 nice:0 policy:0 rt_prio:0)
+    -----------------
+ => started at: sys_setpgid
+ => ended at:   sys_setpgid
+
+#                _------=> CPU#
+#               / _-----=> irqs-off
+#              | / _----=> need-resched
+#              || / _---=> hardirq/softirq
+#              ||| / _--=> preempt-depth
+#              |||| /
+#              |||||     delay
+#  cmd     pid ||||| time  |   caller
+#     \   /    |||||   \   |   /
+    bash-3730  1d...    0us : _write_lock_irq (sys_setpgid)
+    bash-3730  1d..1    1us+: _write_unlock_irq (sys_setpgid)
+    bash-3730  1d..2   14us : trace_hardirqs_on (sys_setpgid)
+
+
+Here we see that we had a latency of 12 microsecs (which is
+very good). The _write_lock_irq in sys_setpgid disabled
+interrupts. The difference between the 12 and the displayed
+timestamp 14us occurred because the clock was incremented
+between the time of recording the max latency and the time of
+recording the function that had that latency.
+
+Note the above example had ftrace_enabled not set. If we set
+ftrace_enabled, we get a much larger output:
+
+# tracer: irqsoff
+#
+irqsoff latency trace v1.1.5 on 2.6.26-rc8
+--------------------------------------------------------------------
+ latency: 50 us, #101/101, CPU#0 | (M:preempt VP:0, KP:0, SP:0 HP:0 #P:2)
+    -----------------
+    | task: ls-4339 (uid:0 nice:0 policy:0 rt_prio:0)
+    -----------------
+ => started at: __alloc_pages_internal
+ => ended at:   __alloc_pages_internal
+
+#                _------=> CPU#
+#               / _-----=> irqs-off
+#              | / _----=> need-resched
+#              || / _---=> hardirq/softirq
+#              ||| / _--=> preempt-depth
+#              |||| /
+#              |||||     delay
+#  cmd     pid ||||| time  |   caller
+#     \   /    |||||   \   |   /
+      ls-4339  0...1    0us+: get_page_from_freelist (__alloc_pages_internal)
+      ls-4339  0d..1    3us : rmqueue_bulk (get_page_from_freelist)
+      ls-4339  0d..1    3us : _spin_lock (rmqueue_bulk)
+      ls-4339  0d..1    4us : add_preempt_count (_spin_lock)
+      ls-4339  0d..2    4us : __rmqueue (rmqueue_bulk)
+      ls-4339  0d..2    5us : __rmqueue_smallest (__rmqueue)
+      ls-4339  0d..2    5us : __mod_zone_page_state (__rmqueue_smallest)
+      ls-4339  0d..2    6us : __rmqueue (rmqueue_bulk)
+      ls-4339  0d..2    6us : __rmqueue_smallest (__rmqueue)
+      ls-4339  0d..2    7us : __mod_zone_page_state (__rmqueue_smallest)
+      ls-4339  0d..2    7us : __rmqueue (rmqueue_bulk)
+      ls-4339  0d..2    8us : __rmqueue_smallest (__rmqueue)
+[...]
+      ls-4339  0d..2   46us : __rmqueue_smallest (__rmqueue)
+      ls-4339  0d..2   47us : __mod_zone_page_state (__rmqueue_smallest)
+      ls-4339  0d..2   47us : __rmqueue (rmqueue_bulk)
+      ls-4339  0d..2   48us : __rmqueue_smallest (__rmqueue)
+      ls-4339  0d..2   48us : __mod_zone_page_state (__rmqueue_smallest)
+      ls-4339  0d..2   49us : _spin_unlock (rmqueue_bulk)
+      ls-4339  0d..2   49us : sub_preempt_count (_spin_unlock)
+      ls-4339  0d..1   50us : get_page_from_freelist (__alloc_pages_internal)
+      ls-4339  0d..2   51us : trace_hardirqs_on (__alloc_pages_internal)
+
+
+
+Here we traced a 50 microsecond latency. But we also see all the
+functions that were called during that time. Note that by
+enabling function tracing, we incur an added overhead. This
+overhead may extend the latency times. But nevertheless, this
+trace has provided some very helpful debugging information.
+
+
+preemptoff
+----------
+
+When preemption is disabled, we may be able to receive
+interrupts but the task cannot be preempted and a higher
+priority task must wait for preemption to be enabled again
+before it can preempt a lower priority task.
+
+The preemptoff tracer traces the places that disable preemption.
+Like the irqsoff tracer, it records the maximum latency for
+which preemption was disabled. The control of preemptoff tracer
+is much like the irqsoff tracer.
+
+ # echo preemptoff > /debug/tracing/current_tracer
+ # echo 0 > /debug/tracing/tracing_max_latency
+ # echo 1 > /debug/tracing/tracing_enabled
+ # ls -ltr
+ [...]
+ # echo 0 > /debug/tracing/tracing_enabled
+ # cat /debug/tracing/latency_trace
+# tracer: preemptoff
+#
+preemptoff latency trace v1.1.5 on 2.6.26-rc8
+--------------------------------------------------------------------
+ latency: 29 us, #3/3, CPU#0 | (M:preempt VP:0, KP:0, SP:0 HP:0 #P:2)
+    -----------------
+    | task: sshd-4261 (uid:0 nice:0 policy:0 rt_prio:0)
+    -----------------
+ => started at: do_IRQ
+ => ended at:   __do_softirq
+
+#                _------=> CPU#
+#               / _-----=> irqs-off
+#              | / _----=> need-resched
+#              || / _---=> hardirq/softirq
+#              ||| / _--=> preempt-depth
+#              |||| /
+#              |||||     delay
+#  cmd     pid ||||| time  |   caller
+#     \   /    |||||   \   |   /
+    sshd-4261  0d.h.    0us+: irq_enter (do_IRQ)
+    sshd-4261  0d.s.   29us : _local_bh_enable (__do_softirq)
+    sshd-4261  0d.s1   30us : trace_preempt_on (__do_softirq)
+
+
+This has some more changes. Preemption was disabled when an
+interrupt came in (notice the 'h'), and was enabled while doing
+a softirq. (notice the 's'). But we also see that interrupts
+have been disabled when entering the preempt off section and
+leaving it (the 'd'). We do not know if interrupts were enabled
+in the meantime.
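+
+The flags column read above ('d', 'h', 's' and the trailing
+digit) follows the legend in the trace header. As a rough
+sketch, it could be decoded as below. decode_trace_flags() and
+struct trace_flags are hypothetical names, and the code assumes
+the field is the CPU number followed by exactly four flag
+characters, with 'H' meaning a hard interrupt taken inside a
+softirq.
+

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Decoded form of a field such as "0d.h." (names are illustrative). */
struct trace_flags {
    int cpu;
    bool irqs_off;      /* 'd' in the irqs-off column */
    bool need_resched;  /* 'N' in the need-resched column */
    bool hardirq;       /* 'h' or 'H' */
    bool softirq;       /* 's' or 'H' */
    int preempt_depth;  /* digit, or 0 when shown as '.' */
};

/* Returns 0 on success, -1 if the field does not have the expected shape. */
static int decode_trace_flags(const char *field, struct trace_flags *out)
{
    char *end;

    assert(field != NULL && out != NULL);

    if (!isdigit((unsigned char)field[0]))
        return -1;
    out->cpu = (int)strtol(field, &end, 10);
    if (strlen(end) != 4)
        return -1;

    out->irqs_off = (end[0] == 'd');
    out->need_resched = (end[1] == 'N');
    out->hardirq = (end[2] == 'h' || end[2] == 'H');
    out->softirq = (end[2] == 's' || end[2] == 'H');
    out->preempt_depth = isdigit((unsigned char)end[3]) ? end[3] - '0' : 0;
    return 0;
}
```

+
+For "0d.h." this reports CPU 0 with interrupts off, in a hard
+interrupt, and a preempt depth of 0.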
+
+# tracer: preemptoff
+#
+preemptoff latency trace v1.1.5 on 2.6.26-rc8
+--------------------------------------------------------------------
+ latency: 63 us, #87/87, CPU#0 | (M:preempt VP:0, KP:0, SP:0 HP:0 #P:2)
+    -----------------
+    | task: sshd-4261 (uid:0 nice:0 policy:0 rt_prio:0)
+    -----------------
+ => started at: remove_wait_queue
+ => ended at:   __do_softirq
+
+#                _------=> CPU#
+#               / _-----=> irqs-off
+#              | / _----=> need-resched
+#              || / _---=> hardirq/softirq
+#              ||| / _--=> preempt-depth
+#              |||| /
+#              |||||     delay
+#  cmd     pid ||||| time  |   caller
+#     \   /    |||||   \   |   /
+    sshd-4261  0d..1    0us : _spin_lock_irqsave (remove_wait_queue)
+    sshd-4261  0d..1    1us : _spin_unlock_irqrestore (remove_wait_queue)
+    sshd-4261  0d..1    2us : do_IRQ (common_interrupt)
+    sshd-4261  0d..1    2us : irq_enter (do_IRQ)
+    sshd-4261  0d..1    2us : idle_cpu (irq_enter)
+    sshd-4261  0d..1    3us : add_preempt_count (irq_enter)
+    sshd-4261  0d.h1    3us : idle_cpu (irq_enter)
+    sshd-4261  0d.h.    4us : handle_fasteoi_irq (do_IRQ)
+[...]
+    sshd-4261  0d.h.   12us : add_preempt_count (_spin_lock)
+    sshd-4261  0d.h1   12us : ack_ioapic_quirk_irq (handle_fasteoi_irq)
+    sshd-4261  0d.h1   13us : move_native_irq (ack_ioapic_quirk_irq)
+    sshd-4261  0d.h1   13us : _spin_unlock (handle_fasteoi_irq)
+    sshd-4261  0d.h1   14us : sub_preempt_count (_spin_unlock)
+    sshd-4261  0d.h1   14us : irq_exit (do_IRQ)
+    sshd-4261  0d.h1   15us : sub_preempt_count (irq_exit)
+    sshd-4261  0d..2   15us : do_softirq (irq_exit)
+    sshd-4261  0d...   15us : __do_softirq (do_softirq)
+    sshd-4261  0d...   16us : __local_bh_disable (__do_softirq)
+    sshd-4261  0d...   16us+: add_preempt_count (__local_bh_disable)
+    sshd-4261  0d.s4   20us : add_preempt_count (__local_bh_disable)
+    sshd-4261  0d.s4   21us : sub_preempt_count (local_bh_enable)
+    sshd-4261  0d.s5   21us : sub_preempt_count (local_bh_enable)
+[...]
+    sshd-4261  0d.s6   41us : add_preempt_count (__local_bh_disable)
+    sshd-4261  0d.s6   42us : sub_preempt_count (local_bh_enable)
+    sshd-4261  0d.s7   42us : sub_preempt_count (local_bh_enable)
+    sshd-4261  0d.s5   43us : add_preempt_count (__local_bh_disable)
+    sshd-4261  0d.s5   43us : sub_preempt_count (local_bh_enable_ip)
+    sshd-4261  0d.s6   44us : sub_preempt_count (local_bh_enable_ip)
+    sshd-4261  0d.s5   44us : add_preempt_count (__local_bh_disable)
+    sshd-4261  0d.s5   45us : sub_preempt_count (local_bh_enable)
+[...]
+    sshd-4261  0d.s.   63us : _local_bh_enable (__do_softirq)
+    sshd-4261  0d.s1   64us : trace_preempt_on (__do_softirq)
+
+
+The above is an example of the preemptoff trace with
+ftrace_enabled set. Here we see that interrupts were disabled
+the entire time. The irq_enter code lets us know that we entered
+an interrupt 'h'. Before that, the functions being traced still
+show that it is not in an interrupt, but we can see from the
+functions themselves that this is not the case.
+
+Notice that __do_softirq when called does not have a
+preempt_count. It may seem that we missed a preempt enabling.
+What really happened is that the preempt count is held on the
+thread's stack and we switched to the softirq stack (4K stacks
+in effect). The code does not copy the preempt count, but
+because interrupts are disabled, we do not need to worry about
+it. Having a tracer like this is good for letting people know
+what really happens inside the kernel.
+
+
+preemptirqsoff
+--------------
+
+Knowing the locations that have interrupts disabled or
+preemption disabled for the longest times is helpful. But
+sometimes we would like to know when either preemption or
+interrupts (or both) are disabled.
+
+Consider the following code:
+
+    local_irq_disable();
+    call_function_with_irqs_off();
+    preempt_disable();
+    call_function_with_irqs_and_preemption_off();
+    local_irq_enable();
+    call_function_with_preemption_off();
+    preempt_enable();
+
+The irqsoff tracer will record the total length of
+call_function_with_irqs_off() and
+call_function_with_irqs_and_preemption_off().
+
+The preemptoff tracer will record the total length of
+call_function_with_irqs_and_preemption_off() and
+call_function_with_preemption_off().
+
+But neither will trace the total time that interrupts and/or
+preemption are disabled. This total time is the time that we
+cannot schedule. To record this time, use the preemptirqsoff
+tracer.
+
+Again, using this trace is much like the irqsoff and preemptoff
+tracers.
+
+ # echo preemptirqsoff > /debug/tracing/current_tracer
+ # echo 0 > /debug/tracing/tracing_max_latency
+ # echo 1 > /debug/tracing/tracing_enabled
+ # ls -ltr
+ [...]
+ # echo 0 > /debug/tracing/tracing_enabled
+ # cat /debug/tracing/latency_trace
+# tracer: preemptirqsoff
+#
+preemptirqsoff latency trace v1.1.5 on 2.6.26-rc8
+--------------------------------------------------------------------
+ latency: 293 us, #3/3, CPU#0 | (M:preempt VP:0, KP:0, SP:0 HP:0 #P:2)
+    -----------------
+    | task: ls-4860 (uid:0 nice:0 policy:0 rt_prio:0)
+    -----------------
+ => started at: apic_timer_interrupt
+ => ended at:   __do_softirq
+
+#                _------=> CPU#
+#               / _-----=> irqs-off
+#              | / _----=> need-resched
+#              || / _---=> hardirq/softirq
+#              ||| / _--=> preempt-depth
+#              |||| /
+#              |||||     delay
+#  cmd     pid ||||| time  |   caller
+#     \   /    |||||   \   |   /
+      ls-4860  0d...    0us!: trace_hardirqs_off_thunk (apic_timer_interrupt)
+      ls-4860  0d.s.  294us : _local_bh_enable (__do_softirq)
+      ls-4860  0d.s1  294us : trace_preempt_on (__do_softirq)
+
+
+
+The trace_hardirqs_off_thunk is called from assembly on x86 when
+interrupts are disabled in the assembly code. Without the
+function tracing, we do not know if interrupts were enabled
+within the preemption points. We do see that it started with
+preemption enabled.
+
+Here is a trace with ftrace_enabled set:
+
+
+# tracer: preemptirqsoff
+#
+preemptirqsoff latency trace v1.1.5 on 2.6.26-rc8
+--------------------------------------------------------------------
+ latency: 105 us, #183/183, CPU#0 | (M:preempt VP:0, KP:0, SP:0 HP:0 #P:2)
+    -----------------
+    | task: sshd-4261 (uid:0 nice:0 policy:0 rt_prio:0)
+    -----------------
+ => started at: write_chan
+ => ended at:   __do_softirq
+
+#                _------=> CPU#
+#               / _-----=> irqs-off
+#              | / _----=> need-resched
+#              || / _---=> hardirq/softirq
+#              ||| / _--=> preempt-depth
+#              |||| /
+#              |||||     delay
+#  cmd     pid ||||| time  |   caller
+#     \   /    |||||   \   |   /
+      ls-4473  0.N..    0us : preempt_schedule (write_chan)
+      ls-4473  0dN.1    1us : _spin_lock (schedule)
+      ls-4473  0dN.1    2us : add_preempt_count (_spin_lock)
+      ls-4473  0d..2    2us : put_prev_task_fair (schedule)
+[...]
+      ls-4473  0d..2   13us : set_normalized_timespec (ktime_get_ts)
+      ls-4473  0d..2   13us : __switch_to (schedule)
+    sshd-4261  0d..2   14us : finish_task_switch (schedule)
+    sshd-4261  0d..2   14us : _spin_unlock_irq (finish_task_switch)
+    sshd-4261  0d..1   15us : add_preempt_count (_spin_lock_irqsave)
+    sshd-4261  0d..2   16us : _spin_unlock_irqrestore (hrtick_set)
+    sshd-4261  0d..2   16us : do_IRQ (common_interrupt)
+    sshd-4261  0d..2   17us : irq_enter (do_IRQ)
+    sshd-4261  0d..2   17us : idle_cpu (irq_enter)
+    sshd-4261  0d..2   18us : add_preempt_count (irq_enter)
+    sshd-4261  0d.h2   18us : idle_cpu (irq_enter)
+    sshd-4261  0d.h.   18us : handle_fasteoi_irq (do_IRQ)
+    sshd-4261  0d.h.   19us : _spin_lock (handle_fasteoi_irq)
+    sshd-4261  0d.h.   19us : add_preempt_count (_spin_lock)
+    sshd-4261  0d.h1   20us : _spin_unlock (handle_fasteoi_irq)
+    sshd-4261  0d.h1   20us : sub_preempt_count (_spin_unlock)
+[...]
+    sshd-4261  0d.h1   28us : _spin_unlock (handle_fasteoi_irq)
+    sshd-4261  0d.h1   29us : sub_preempt_count (_spin_unlock)
+    sshd-4261  0d.h2   29us : irq_exit (do_IRQ)
+    sshd-4261  0d.h2   29us : sub_preempt_count (irq_exit)
+    sshd-4261  0d..3   30us : do_softirq (irq_exit)
+    sshd-4261  0d...   30us : __do_softirq (do_softirq)
+    sshd-4261  0d...   31us : __local_bh_disable (__do_softirq)
+    sshd-4261  0d...   31us+: add_preempt_count (__local_bh_disable)
+    sshd-4261  0d.s4   34us : add_preempt_count (__local_bh_disable)
+[...]
+    sshd-4261  0d.s3   43us : sub_preempt_count (local_bh_enable_ip)
+    sshd-4261  0d.s4   44us : sub_preempt_count (local_bh_enable_ip)
+    sshd-4261  0d.s3   44us : smp_apic_timer_interrupt (apic_timer_interrupt)
+    sshd-4261  0d.s3   45us : irq_enter (smp_apic_timer_interrupt)
+    sshd-4261  0d.s3   45us : idle_cpu (irq_enter)
+    sshd-4261  0d.s3   46us : add_preempt_count (irq_enter)
+    sshd-4261  0d.H3   46us : idle_cpu (irq_enter)
+    sshd-4261  0d.H3   47us : hrtimer_interrupt (smp_apic_timer_interrupt)
+    sshd-4261  0d.H3   47us : ktime_get (hrtimer_interrupt)
+[...]
+    sshd-4261  0d.H3   81us : tick_program_event (hrtimer_interrupt)
+    sshd-4261  0d.H3   82us : ktime_get (tick_program_event)
+    sshd-4261  0d.H3   82us : ktime_get_ts (ktime_get)
+    sshd-4261  0d.H3   83us : getnstimeofday (ktime_get_ts)
+    sshd-4261  0d.H3   83us : set_normalized_timespec (ktime_get_ts)
+    sshd-4261  0d.H3   84us : clockevents_program_event (tick_program_event)
+    sshd-4261  0d.H3   84us : lapic_next_event (clockevents_program_event)
+    sshd-4261  0d.H3   85us : irq_exit (smp_apic_timer_interrupt)
+    sshd-4261  0d.H3   85us : sub_preempt_count (irq_exit)
+    sshd-4261  0d.s4   86us : sub_preempt_count (irq_exit)
+    sshd-4261  0d.s3   86us : add_preempt_count (__local_bh_disable)
+[...]
+    sshd-4261  0d.s1   98us : sub_preempt_count (net_rx_action)
+    sshd-4261  0d.s.   99us : add_preempt_count (_spin_lock_irq)
+    sshd-4261  0d.s1   99us+: _spin_unlock_irq (run_timer_softirq)
+    sshd-4261  0d.s.  104us : _local_bh_enable (__do_softirq)
+    sshd-4261  0d.s.  104us : sub_preempt_count (_local_bh_enable)
+    sshd-4261  0d.s.  105us : _local_bh_enable (__do_softirq)
+    sshd-4261  0d.s1  105us : trace_preempt_on (__do_softirq)
+
+
+This is a very interesting trace. It started with the preemption
+of the ls task. We see that the task had the "need_resched" bit
+set via the 'N' in the trace.  Interrupts were disabled before
+the spin_lock at the beginning of the trace. We see that a
+schedule took place to run sshd.  When the interrupts were
+enabled, we took an interrupt. On return from the interrupt
+handler, the softirq ran. We took another interrupt while
+running the softirq as we see from the capital 'H'.
+
+
+wakeup
+------
+
+In a Real-Time environment it is very important to know how long
+it takes from the time the highest priority task is woken up to
+the time that it executes. This is also known as "schedule
+latency". I stress the point that this is about RT tasks. It is
+also important to know the scheduling latency of non-RT tasks,
+but the average schedule latency is better for non-RT tasks.
+Tools like LatencyTop are more appropriate for such
+measurements.
+
+Real-Time environments are interested in the worst case latency.
+That is, the longest latency it takes for something to happen,
+and not the average. We can have a very fast scheduler that may
+only have a large latency once in a while, but that would not
+work well with Real-Time tasks.  The wakeup tracer was designed
+to record the worst case wakeups of RT tasks. Non-RT tasks are
+not recorded because the tracer only records one worst case and
+tracing non-RT tasks that are unpredictable will overwrite the
+worst case latency of RT tasks.
+
+Since this tracer only deals with RT tasks, we will run this
+slightly differently than we did with the previous tracers.
+Instead of performing an 'ls', we will run 'sleep 1' under
+'chrt' which changes the priority of the task.
+
+ # echo wakeup > /debug/tracing/current_tracer
+ # echo 0 > /debug/tracing/tracing_max_latency
+ # echo 1 > /debug/tracing/tracing_enabled
+ # chrt -f 5 sleep 1
+ # echo 0 > /debug/tracing/tracing_enabled
+ # cat /debug/tracing/latency_trace
+# tracer: wakeup
+#
+wakeup latency trace v1.1.5 on 2.6.26-rc8
+--------------------------------------------------------------------
+ latency: 4 us, #2/2, CPU#1 | (M:preempt VP:0, KP:0, SP:0 HP:0 #P:2)
+    -----------------
+    | task: sleep-4901 (uid:0 nice:0 policy:1 rt_prio:5)
+    -----------------
+
+#                _------=> CPU#
+#               / _-----=> irqs-off
+#              | / _----=> need-resched
+#              || / _---=> hardirq/softirq
+#              ||| / _--=> preempt-depth
+#              |||| /
+#              |||||     delay
+#  cmd     pid ||||| time  |   caller
+#     \   /    |||||   \   |   /
+  <idle>-0     1d.h4    0us+: try_to_wake_up (wake_up_process)
+  <idle>-0     1d..4    4us : schedule (cpu_idle)
+
+
+Running this on an idle system, we see that it only took 4
+microseconds to perform the task switch.  Note, since the trace
+marker in the schedule is before the actual "switch", we stop
+the tracing when the recorded task is about to schedule in. This
+may change if we add a new marker at the end of the scheduler.
+
+Notice that the recorded task is 'sleep' with the PID of 4901
+and it has an rt_prio of 5. This priority is user-space priority
+and not the internal kernel priority. The policy is 1 for
+SCHED_FIFO and 2 for SCHED_RR.
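+
+A tool that post-processes the trace header may want to turn the
+policy number back into a name. A minimal sketch follows. The
+helper names are hypothetical, and the value 0 for SCHED_OTHER
+is an assumption based on the policy:0 shown for the non-RT
+tasks in the earlier traces.
+

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Policy numbers as printed in the "policy:" field of the trace header. */
static const struct {
    int policy;
    const char *name;
} policies[] = {
    { 0, "SCHED_OTHER" }, /* assumed: non-RT tasks show policy:0 */
    { 1, "SCHED_FIFO" },
    { 2, "SCHED_RR" },
};

#define NR_POLICIES (sizeof(policies) / sizeof(policies[0]))

/* Map a policy number to its name, or "unknown". */
static const char *policy_name(int policy)
{
    size_t i;

    for (i = 0; i < NR_POLICIES; i++)
        if (policies[i].policy == policy)
            return policies[i].name;
    return "unknown";
}

/* Map a policy name back to its number, or -1 if it is not in the table. */
static int policy_number(const char *name)
{
    size_t i;

    assert(name != NULL);
    for (i = 0; i < NR_POLICIES; i++)
        if (strcmp(policies[i].name, name) == 0)
            return policies[i].policy;
    return -1;
}
```

+
+For the trace above, policy_name(1) gives "SCHED_FIFO", which
+matches the 'chrt -f' used to start the task.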
+
+Doing the same with chrt -r 5 and ftrace_enabled set gives:
+
+# tracer: wakeup
+#
+wakeup latency trace v1.1.5 on 2.6.26-rc8
+--------------------------------------------------------------------
+ latency: 50 us, #60/60, CPU#1 | (M:preempt VP:0, KP:0, SP:0 HP:0 #P:2)
+    -----------------
+    | task: sleep-4068 (uid:0 nice:0 policy:2 rt_prio:5)
+    -----------------
+
+#                _------=> CPU#
+#               / _-----=> irqs-off
+#              | / _----=> need-resched
+#              || / _---=> hardirq/softirq
+#              ||| / _--=> preempt-depth
+#              |||| /
+#              |||||     delay
+#  cmd     pid ||||| time  |   caller
+#     \   /    |||||   \   |   /
+ksoftirq-7     1d.H3    0us : try_to_wake_up (wake_up_process)
+ksoftirq-7     1d.H4    1us : sub_preempt_count (marker_probe_cb)
+ksoftirq-7     1d.H3    2us : check_preempt_wakeup (try_to_wake_up)
+ksoftirq-7     1d.H3    3us : update_curr (check_preempt_wakeup)
+ksoftirq-7     1d.H3    4us : calc_delta_mine (update_curr)
+ksoftirq-7     1d.H3    5us : __resched_task (check_preempt_wakeup)
+ksoftirq-7     1d.H3    6us : task_wake_up_rt (try_to_wake_up)
+ksoftirq-7     1d.H3    7us : _spin_unlock_irqrestore (try_to_wake_up)
+[...]
+ksoftirq-7     1d.H2   17us : irq_exit (smp_apic_timer_interrupt)
+ksoftirq-7     1d.H2   18us : sub_preempt_count (irq_exit)
+ksoftirq-7     1d.s3   19us : sub_preempt_count (irq_exit)
+ksoftirq-7     1..s2   20us : rcu_process_callbacks (__do_softirq)
+[...]
+ksoftirq-7     1..s2   26us : __rcu_process_callbacks (rcu_process_callbacks)
+ksoftirq-7     1d.s2   27us : _local_bh_enable (__do_softirq)
+ksoftirq-7     1d.s2   28us : sub_preempt_count (_local_bh_enable)
+ksoftirq-7     1.N.3   29us : sub_preempt_count (ksoftirqd)
+ksoftirq-7     1.N.2   30us : _cond_resched (ksoftirqd)
+ksoftirq-7     1.N.2   31us : __cond_resched (_cond_resched)
+ksoftirq-7     1.N.2   32us : add_preempt_count (__cond_resched)
+ksoftirq-7     1.N.2   33us : schedule (__cond_resched)
+ksoftirq-7     1.N.2   33us : add_preempt_count (schedule)
+ksoftirq-7     1.N.3   34us : hrtick_clear (schedule)
+ksoftirq-7     1dN.3   35us : _spin_lock (schedule)
+ksoftirq-7     1dN.3   36us : add_preempt_count (_spin_lock)
+ksoftirq-7     1d..4   37us : put_prev_task_fair (schedule)
+ksoftirq-7     1d..4   38us : update_curr (put_prev_task_fair)
+[...]
+ksoftirq-7     1d..5   47us : _spin_trylock (tracing_record_cmdline)
+ksoftirq-7     1d..5   48us : add_preempt_count (_spin_trylock)
+ksoftirq-7     1d..6   49us : _spin_unlock (tracing_record_cmdline)
+ksoftirq-7     1d..6   49us : sub_preempt_count (_spin_unlock)
+ksoftirq-7     1d..4   50us : schedule (__cond_resched)
+
+The interrupt went off while running ksoftirqd. This task runs
+at SCHED_OTHER. Why did we not see the 'N' set early? This may
+be a harmless bug with x86_32 and 4K stacks. On x86_32 with 4K
+stacks configured, the interrupt and softirq run with their own
+stack. Some information is held on the top of the task's stack
+(need_resched and preempt_count are both stored there). The
+setting of the NEED_RESCHED bit is done directly to the task's
+stack, but the reading of the NEED_RESCHED is done by looking at
+the current stack, which in this case is the stack for the hard
+interrupt. This hides the fact that NEED_RESCHED has been set.
+We do not see the 'N' until we switch back to the task's
+assigned stack.
+
+function
+--------
+
+This tracer is the function tracer. Enabling the function tracer
+can be done from the debug file system. Make sure the
+ftrace_enabled is set; otherwise this tracer is a nop.
+
+ # sysctl kernel.ftrace_enabled=1
+ # echo function > /debug/tracing/current_tracer
+ # echo 1 > /debug/tracing/tracing_enabled
+ # usleep 1
+ # echo 0 > /debug/tracing/tracing_enabled
+ # cat /debug/tracing/trace
+# tracer: function
+#
+#           TASK-PID   CPU#    TIMESTAMP  FUNCTION
+#              | |      |          |         |
+            bash-4003  [00]   123.638713: finish_task_switch <-schedule
+            bash-4003  [00]   123.638714: _spin_unlock_irq <-finish_task_switch
+            bash-4003  [00]   123.638714: sub_preempt_count <-_spin_unlock_irq
+            bash-4003  [00]   123.638715: hrtick_set <-schedule
+            bash-4003  [00]   123.638715: _spin_lock_irqsave <-hrtick_set
+            bash-4003  [00]   123.638716: add_preempt_count <-_spin_lock_irqsave
+            bash-4003  [00]   123.638716: _spin_unlock_irqrestore <-hrtick_set
+            bash-4003  [00]   123.638717: sub_preempt_count <-_spin_unlock_irqrestore
+            bash-4003  [00]   123.638717: hrtick_clear <-hrtick_set
+            bash-4003  [00]   123.638718: sub_preempt_count <-schedule
+            bash-4003  [00]   123.638718: sub_preempt_count <-preempt_schedule
+            bash-4003  [00]   123.638719: wait_for_completion <-__stop_machine_run
+            bash-4003  [00]   123.638719: wait_for_common <-wait_for_completion
+            bash-4003  [00]   123.638720: _spin_lock_irq <-wait_for_common
+            bash-4003  [00]   123.638720: add_preempt_count <-_spin_lock_irq
+[...]
+
+
+Note: function tracer uses ring buffers to store the above
+entries. The newest data may overwrite the oldest data.
+Sometimes using echo to stop the trace is not sufficient because
+the tracing could have overwritten the data that you wanted to
+record. For this reason, it is sometimes better to disable
+tracing directly from a program. This allows you to stop the
+tracing at the point that you hit the part that you are
+interested in. To disable the tracing directly from a C program,
+something like the following code snippet can be used:
+
+int trace_fd;
+[...]
+int main(int argc, char *argv[]) {
+       [...]
+       trace_fd = open("/debug/tracing/tracing_enabled", O_WRONLY);
+       [...]
+       if (condition_hit()) {
+               write(trace_fd, "0", 1);
+       }
+       [...]
+}
+
+Note: Here we hard coded the path name. The debugfs mount is not
+guaranteed to be at /debug (and is more commonly at
+/sys/kernel/debug). For simple one-time traces, the above is
+sufficient. For anything else, a search through /proc/mounts may
+be needed to find where the debugfs file-system is mounted.
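+
+Such a search could look roughly like the sketch below. It
+assumes the usual /proc/mounts layout of "device mountpoint
+fstype options ..." per line, does not handle mount points with
+escaped characters, and find_debugfs() is a hypothetical helper
+name. The caller passes an already opened stream so that the
+function can be tried on any file in that format.
+

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Scan a mounts table and copy the mount point of the first
 * debugfs entry into buf. Returns 0 on success, -1 if no debugfs
 * entry is found or the mount point does not fit into buf.
 */
static int find_debugfs(FILE *mounts, char *buf, size_t len)
{
    char line[1024], dir[512], type[64];

    assert(mounts != NULL && buf != NULL && len > 0);

    while (fgets(line, sizeof(line), mounts)) {
        /* Skip the device name, keep the mount point and fs type. */
        if (sscanf(line, "%*s %511s %63s", dir, type) != 2)
            continue;
        if (strcmp(type, "debugfs") != 0)
            continue;
        if (strlen(dir) >= len)
            return -1;
        strcpy(buf, dir);
        return 0;
    }
    return -1;
}
```

+
+A caller would open /proc/mounts with fopen(), pass the stream
+to find_debugfs(), and append "/tracing/tracing_enabled" to the
+result before opening the control file.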
+
+
+Single thread tracing
+---------------------
+
+By writing into /debug/tracing/set_ftrace_pid you can trace a
+single thread. For example:
+
+# cat /debug/tracing/set_ftrace_pid
+no pid
+# echo 3111 > /debug/tracing/set_ftrace_pid
+# cat /debug/tracing/set_ftrace_pid
+3111
+# echo function > /debug/tracing/current_tracer
+# cat /debug/tracing/trace | head
+ # tracer: function
+ #
+ #           TASK-PID    CPU#    TIMESTAMP  FUNCTION
+ #              | |       |          |         |
+     yum-updatesd-3111  [003]  1637.254676: finish_task_switch <-thread_return
+     yum-updatesd-3111  [003]  1637.254681: hrtimer_cancel <-schedule_hrtimeout_range
+     yum-updatesd-3111  [003]  1637.254682: hrtimer_try_to_cancel <-hrtimer_cancel
+     yum-updatesd-3111  [003]  1637.254683: lock_hrtimer_base <-hrtimer_try_to_cancel
+     yum-updatesd-3111  [003]  1637.254685: fget_light <-do_sys_poll
+     yum-updatesd-3111  [003]  1637.254686: pipe_poll <-do_sys_poll
+# echo -1 > /debug/tracing/set_ftrace_pid
+# cat /debug/tracing/trace |head
+ # tracer: function
+ #
+ #           TASK-PID    CPU#    TIMESTAMP  FUNCTION
+ #              | |       |          |         |
+ ##### CPU 3 buffer started ####
+     yum-updatesd-3111  [003]  1701.957688: free_poll_entry <-poll_freewait
+     yum-updatesd-3111  [003]  1701.957689: remove_wait_queue <-free_poll_entry
+     yum-updatesd-3111  [003]  1701.957691: fput <-free_poll_entry
+     yum-updatesd-3111  [003]  1701.957692: audit_syscall_exit <-sysret_audit
+     yum-updatesd-3111  [003]  1701.957693: path_put <-audit_syscall_exit
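+
+When post-processing such output, note that the task name may
+itself contain a '-' (as in yum-updatesd-3111), so the PID has
+to be split off at the last '-'. The following sketch shows one
+way to do that. parse_function_line() and struct func_entry are
+hypothetical names, and the line layout is assumed to be exactly
+the one shown above.
+

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* One line of function tracer output (names are illustrative). */
struct func_entry {
    char comm[64];       /* task name, may contain '-' */
    int pid;
    int cpu;
    unsigned long secs;  /* timestamp, seconds part */
    unsigned long usecs; /* timestamp, digits after the dot */
    char func[64];       /* traced function */
    char parent[64];     /* its caller */
};

/* Returns 0 on success, -1 for header lines and anything else unexpected. */
static int parse_function_line(const char *line, struct func_entry *e)
{
    char task[64];
    char *dash;

    assert(line != NULL && e != NULL);

    if (sscanf(line, " %63s [%d] %lu.%lu: %63s <-%63s",
               task, &e->cpu, &e->secs, &e->usecs, e->func, e->parent) != 6)
        return -1;

    /* Split "comm-pid" at the last '-' so names with dashes survive. */
    dash = strrchr(task, '-');
    if (!dash || dash == task || dash[1] == '\0')
        return -1;
    *dash = '\0';
    e->pid = atoi(dash + 1);
    strcpy(e->comm, task); /* fits: task holds at most 63 characters */
    return 0;
}
```

+
+Header lines starting with '#' do not match the pattern and are
+rejected, so the function can be run over the whole trace file.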
+
+If you want to trace a function when executing, you could use
+something like this simple program:
+
+#include <stdio.h>
+#include <stdlib.h>
+#include <sys/types.h>
+#include <sys/stat.h>
+#include <fcntl.h>
+#include <unistd.h>
+
+int main (int argc, char **argv)
+{
+        /* argv[1] is the command to run, so one argument is required */
+        if (argc < 2)
+                exit(-1);
+
+        if (fork() > 0) {
+                int fd, ffd;
+                char line[64];
+                int s;
+
+                ffd = open("/debug/tracing/current_tracer", O_WRONLY);
+                if (ffd < 0)
+                        exit(-1);
+                write(ffd, "nop", 3);
+
+                fd = open("/debug/tracing/set_ftrace_pid", O_WRONLY);
+                if (fd < 0)
+                        exit(-1);
+                s = sprintf(line, "%d\n", getpid());
+                write(fd, line, s);
+
+                write(ffd, "function", 8);
+
+                close(fd);
+                close(ffd);
+
+                execvp(argv[1], argv+1);
+        }
+
+        return 0;
+}
+
+
+hw-branch-tracer (x86 only)
+---------------------------
+
+This tracer uses the x86 last branch tracing hardware feature to
+collect a branch trace on all cpus with relatively low overhead.
+
+The tracer uses a fixed-size circular buffer per cpu and only
+traces ring 0 branches. The trace file dumps that buffer in the
+following format:
+
+# tracer: hw-branch-tracer
+#
+# CPU#        TO  <-  FROM
+   0  scheduler_tick+0xb5/0x1bf          <-  task_tick_idle+0x5/0x6
+   2  run_posix_cpu_timers+0x2b/0x72a    <-  run_posix_cpu_timers+0x25/0x72a
+   0  scheduler_tick+0x139/0x1bf         <-  scheduler_tick+0xed/0x1bf
+   0  scheduler_tick+0x17c/0x1bf         <-  scheduler_tick+0x148/0x1bf
+   2  run_posix_cpu_timers+0x9e/0x72a    <-  run_posix_cpu_timers+0x5e/0x72a
+   0  scheduler_tick+0x1b6/0x1bf         <-  scheduler_tick+0x1aa/0x1bf
+
+
+The tracer may be used to dump the trace for the oops'ing cpu on
+a kernel oops into the system log. To enable this,
+ftrace_dump_on_oops must be set. To set ftrace_dump_on_oops, one
+can either use the sysctl function or set it via the proc system
+interface.
+
+  sysctl kernel.ftrace_dump_on_oops=1
+
+or
+
+  echo 1 > /proc/sys/kernel/ftrace_dump_on_oops
+
+
+Here's an example of such a dump after a null pointer
+dereference in a kernel module:
+
+[57848.105921] BUG: unable to handle kernel NULL pointer dereference at 0000000000000000
+[57848.106019] IP: [<ffffffffa0000006>] open+0x6/0x14 [oops]
+[57848.106019] PGD 2354e9067 PUD 2375e7067 PMD 0
+[57848.106019] Oops: 0002 [#1] SMP
+[57848.106019] last sysfs file: /sys/devices/pci0000:00/0000:00:1e.0/0000:20:05.0/local_cpus
+[57848.106019] Dumping ftrace buffer:
+[57848.106019] ---------------------------------
+[...]
+[57848.106019]    0  chrdev_open+0xe6/0x165      <-  cdev_put+0x23/0x24
+[57848.106019]    0  chrdev_open+0x117/0x165     <-  chrdev_open+0xfa/0x165
+[57848.106019]    0  chrdev_open+0x120/0x165     <-  chrdev_open+0x11c/0x165
+[57848.106019]    0  chrdev_open+0x134/0x165     <-  chrdev_open+0x12b/0x165
+[57848.106019]    0  open+0x0/0x14 [oops]        <-  chrdev_open+0x144/0x165
+[57848.106019]    0  page_fault+0x0/0x30         <-  open+0x6/0x14 [oops]
+[57848.106019]    0  error_entry+0x0/0x5b        <-  page_fault+0x4/0x30
+[57848.106019]    0  error_kernelspace+0x0/0x31          <-  error_entry+0x59/0x5b
+[57848.106019]    0  error_sti+0x0/0x1   <-  error_kernelspace+0x2d/0x31
+[57848.106019]    0  page_fault+0x9/0x30         <-  error_sti+0x0/0x1
+[57848.106019]    0  do_page_fault+0x0/0x881     <-  page_fault+0x1a/0x30
+[...]
+[57848.106019]    0  do_page_fault+0x66b/0x881   <-  is_prefetch+0x1ee/0x1f2
+[57848.106019]    0  do_page_fault+0x6e0/0x881   <-  do_page_fault+0x67a/0x881
+[57848.106019]    0  oops_begin+0x0/0x96         <-  do_page_fault+0x6e0/0x881
+[57848.106019]    0  trace_hw_branch_oops+0x0/0x2d       <-  oops_begin+0x9/0x96
+[...]
+[57848.106019]    0  ds_suspend_bts+0x2a/0xe3    <-  ds_suspend_bts+0x1a/0xe3
+[57848.106019] ---------------------------------
+[57848.106019] CPU 0
+[57848.106019] Modules linked in: oops
+[57848.106019] Pid: 5542, comm: cat Tainted: G        W  2.6.28 #23
+[57848.106019] RIP: 0010:[<ffffffffa0000006>]  [<ffffffffa0000006>] open+0x6/0x14 [oops]
+[57848.106019] RSP: 0018:ffff880235457d48  EFLAGS: 00010246
+[...]
+
+
+function graph tracer
+---------------------
+
+This tracer is similar to the function tracer except that it
+probes a function on its entry and its exit. This is done by
+using a dynamically allocated stack of return addresses in each
+task_struct. On function entry the tracer overwrites the return
+address of each function traced to set a custom probe. The
+original return address is saved on the stack of return
+addresses in the task_struct.
+
+Probing on both ends of a function leads to special features
+such as:
+
+- measuring a function's execution time
+- having a reliable call stack to draw a function call graph
+
+This tracer is useful in several situations:
+
+- you want to find the reason for strange kernel behavior and
+  need to see what happens in detail in any area (or specific
+  ones).
+
+- you are experiencing weird latencies but it is difficult to
+  find their origin.
+
+- you want to quickly find which path is taken by a specific
+  function.
+
+- you just want to peek inside a working kernel and want to see
+  what happens there.
+
+# tracer: function_graph
+#
+# CPU  DURATION                  FUNCTION CALLS
+# |     |   |                     |   |   |   |
+
+ 0)               |  sys_open() {
+ 0)               |    do_sys_open() {
+ 0)               |      getname() {
+ 0)               |        kmem_cache_alloc() {
+ 0)   1.382 us    |          __might_sleep();
+ 0)   2.478 us    |        }
+ 0)               |        strncpy_from_user() {
+ 0)               |          might_fault() {
+ 0)   1.389 us    |            __might_sleep();
+ 0)   2.553 us    |          }
+ 0)   3.807 us    |        }
+ 0)   7.876 us    |      }
+ 0)               |      alloc_fd() {
+ 0)   0.668 us    |        _spin_lock();
+ 0)   0.570 us    |        expand_files();
+ 0)   0.586 us    |        _spin_unlock();
+
+
+There are several columns that can be dynamically
+enabled/disabled. You can use every combination of options you
+want, depending on your needs.
+
+- The cpu number on which the function executed is enabled by
+  default. It is sometimes better to only trace one cpu (see
+  tracing_cpu_mask file); otherwise you might sometimes see
+  unordered function calls when the trace switches between cpus.
+
+       hide: echo nofuncgraph-cpu > /debug/tracing/trace_options
+       show: echo funcgraph-cpu > /debug/tracing/trace_options
+
+- The duration (function's time of execution) is displayed on
+  the closing bracket line of a function, or on the same line
+  as the function itself in the case of a leaf function. It is
+  enabled by default.
+
+       hide: echo nofuncgraph-duration > /debug/tracing/trace_options
+       show: echo funcgraph-duration > /debug/tracing/trace_options
+
+- The overhead field precedes the duration field when a
+  duration threshold has been exceeded.
+
+       hide: echo nofuncgraph-overhead > /debug/tracing/trace_options
+       show: echo funcgraph-overhead > /debug/tracing/trace_options
+       depends on: funcgraph-duration
+
+  e.g.:
+
+  0)               |    up_write() {
+  0)   0.646 us    |      _spin_lock_irqsave();
+  0)   0.684 us    |      _spin_unlock_irqrestore();
+  0)   3.123 us    |    }
+  0)   0.548 us    |    fput();
+  0) + 58.628 us   |  }
+
+  [...]
+
+  0)               |      putname() {
+  0)               |        kmem_cache_free() {
+  0)   0.518 us    |          __phys_addr();
+  0)   1.757 us    |        }
+  0)   2.861 us    |      }
+  0) ! 115.305 us  |    }
+  0) ! 116.402 us  |  }
+
+  + means that the function exceeded 10 usecs.
+  ! means that the function exceeded 100 usecs.
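The two overhead markers above follow a simple threshold rule. A minimal Python sketch of that rule, assuming durations are given in microseconds; the helper name overhead_marker is illustrative and is not part of ftrace:

```python
def overhead_marker(duration_us: float) -> str:
    """Return the overhead marker described above for a given duration.

    '!' if the function exceeded 100 usecs, '+' if it exceeded 10 usecs,
    otherwise a blank. Hypothetical helper for illustration, not kernel code.
    """
    if duration_us > 100.0:
        return "!"
    if duration_us > 10.0:
        return "+"
    return " "


# Durations taken from the sample trace above.
for duration in (3.123, 58.628, 115.305):
    print(f"{overhead_marker(duration)} {duration:.3f} us")
```

The larger threshold is checked first so that a duration above 100 usecs gets only the '!' marker.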
+
+
+- The task/pid field displays the thread cmdline and pid which
+  executed the function. It is disabled by default.
+
+       hide: echo nofuncgraph-proc > /debug/tracing/trace_options
+       show: echo funcgraph-proc > /debug/tracing/trace_options
+
+  e.g.:
+
+  # tracer: function_graph
+  #
+  # CPU  TASK/PID        DURATION                  FUNCTION CALLS
+  # |    |    |           |   |                     |   |   |   |
+  0)    sh-4802     |               |                  d_free() {
+  0)    sh-4802     |               |                    call_rcu() {
+  0)    sh-4802     |               |                      __call_rcu() {
+  0)    sh-4802     |   0.616 us    |                        rcu_process_gp_end();
+  0)    sh-4802     |   0.586 us    |                        check_for_new_grace_period();
+  0)    sh-4802     |   2.899 us    |                      }
+  0)    sh-4802     |   4.040 us    |                    }
+  0)    sh-4802     |   5.151 us    |                  }
+  0)    sh-4802     | + 49.370 us   |                }
+
+
+- The absolute time field is an absolute timestamp given by the
+  system clock since it started. A snapshot of this time is
+  given on each entry/exit of functions.
+
+       hide: echo nofuncgraph-abstime > /debug/tracing/trace_options
+       show: echo funcgraph-abstime > /debug/tracing/trace_options
+
+  e.g.:
+
+  #
+  #      TIME       CPU  DURATION                  FUNCTION CALLS
+  #       |         |     |   |                     |   |   |   |
+  360.774522 |   1)   0.541 us    |                                          }
+  360.774522 |   1)   4.663 us    |                                        }
+  360.774523 |   1)   0.541 us    |                                        __wake_up_bit();
+  360.774524 |   1)   6.796 us    |                                      }
+  360.774524 |   1)   7.952 us    |                                    }
+  360.774525 |   1)   9.063 us    |                                  }
+  360.774525 |   1)   0.615 us    |                                  journal_mark_dirty();
+  360.774527 |   1)   0.578 us    |                                  __brelse();
+  360.774528 |   1)               |                                  reiserfs_prepare_for_journal() {
+  360.774528 |   1)               |                                    unlock_buffer() {
+  360.774529 |   1)               |                                      wake_up_bit() {
+  360.774529 |   1)               |                                        bit_waitqueue() {
+  360.774530 |   1)   0.594 us    |                                          __phys_addr();
+
+
+You can put some comments on specific functions by using
+trace_printk(). For example, if you want to put a comment inside
+the __might_sleep() function, you just have to include
+<linux/ftrace.h> and call trace_printk() inside __might_sleep():
+
+trace_printk("I'm a comment!\n")
+
+will produce:
+
+ 1)               |             __might_sleep() {
+ 1)               |                /* I'm a comment! */
+ 1)   1.449 us    |             }
+
+
+You might find other useful features for this tracer in the
+following "dynamic ftrace" section such as tracing only specific
+functions or tasks.
+
+dynamic ftrace
+--------------
+
+If CONFIG_DYNAMIC_FTRACE is set, the system will run with
+virtually no overhead when function tracing is disabled. The way
+this works is that the mcount function call (placed at the start of
+every kernel function, produced by the -pg switch in gcc)
+starts off pointing to a simple return. (Enabling FTRACE will
+include the -pg switch in the compiling of the kernel.)
+
+At compile time every C file object is run through the
+recordmcount.pl script (located in the scripts directory). This
+script will process the C object using objdump to find all the
+locations in the .text section that call mcount. (Note, only the
+.text section is processed, since processing other sections like
+.init.text may cause races due to those sections being freed).
+
+A new section called "__mcount_loc" is created that holds
+references to all the mcount call sites in the .text section.
+This section is compiled back into the original object. The
+final linker will add all these references into a single table.
+
+On boot up, before SMP is initialized, the dynamic ftrace code
+scans this table and updates all the locations into nops. It
+also records the locations, which are added to the
+available_filter_functions list.  Modules are processed as they
+are loaded and before they are executed.  When a module is
+unloaded, it also removes its functions from the ftrace function
+list. This is automatic in the module unload code, and the
+module author does not need to worry about it.
+
+When tracing is enabled, kstop_machine is called to prevent
+races with the CPUs executing code being modified (which can
+cause the CPU to do undesirable things), and the nops are
+patched back to calls. But this time, they do not call mcount
+(which is just a function stub). They now call into the ftrace
+infrastructure.
+
+One special side-effect of recording the functions being
+traced is that we can now selectively choose which functions we
+wish to trace and for which ones we want the mcount calls to
+remain as nops.
+
+Two files are used, one for enabling and one for disabling the
+tracing of specified functions. They are:
+
+  set_ftrace_filter
+
+and
+
+  set_ftrace_notrace
+
+A list of available functions that you can add to these files is
+listed in:
+
+   available_filter_functions
+
+ # cat /debug/tracing/available_filter_functions
+put_prev_task_idle
+kmem_cache_create
+pick_next_task_rt
+get_online_cpus
+pick_next_task_fair
+mutex_lock
+[...]
+
+If I am only interested in sys_nanosleep and hrtimer_interrupt:
+
+ # echo sys_nanosleep hrtimer_interrupt \
+               > /debug/tracing/set_ftrace_filter
+ # echo ftrace > /debug/tracing/current_tracer
+ # echo 1 > /debug/tracing/tracing_enabled
+ # usleep 1
+ # echo 0 > /debug/tracing/tracing_enabled
+ # cat /debug/tracing/trace
+# tracer: ftrace
+#
+#           TASK-PID   CPU#    TIMESTAMP  FUNCTION
+#              | |      |          |         |
+          usleep-4134  [00]  1317.070017: hrtimer_interrupt <-smp_apic_timer_interrupt
+          usleep-4134  [00]  1317.070111: sys_nanosleep <-syscall_call
+          <idle>-0     [00]  1317.070115: hrtimer_interrupt <-smp_apic_timer_interrupt
+
+To see which functions are being traced, you can cat the file:
+
+ # cat /debug/tracing/set_ftrace_filter
+hrtimer_interrupt
+sys_nanosleep
+
+
+Perhaps this is not enough. The filters also allow simple wild
+cards. Only the following are currently available:
+
+  <match>*  - will match functions that begin with <match>
+  *<match>  - will match functions that end with <match>
+  *<match>* - will match functions that have <match> in them
+
+These are the only wild cards which are supported.
+
+  <match>*<match> will not work.
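The wildcard rules above can be modelled in a few lines. This is a minimal Python sketch of the documented matching behavior only, not the kernel's implementation; the function name filter_matches is hypothetical, and treating a pattern without any wild card as an exact name match is an assumption based on the sys_nanosleep example earlier:

```python
def filter_matches(pattern: str, name: str) -> bool:
    """Model the three documented wildcard forms (illustrative only)."""
    leading = pattern.startswith("*")
    trailing = pattern.endswith("*")
    core = pattern.strip("*")
    if "*" in core:
        # <match>*<match> is documented as unsupported.
        raise ValueError("wild card in the middle of a pattern is not supported")
    if leading and trailing:
        return core in name           # *<match>*
    if leading:
        return name.endswith(core)    # *<match>
    if trailing:
        return name.startswith(core)  # <match>*
    return name == core               # assumed: plain name, exact match


functions = ["hrtimer_init", "hrtimer_start", "sys_nanosleep", "mutex_lock"]
print([f for f in functions if filter_matches("hrtimer_*", f)])
```

A general glob matcher is deliberately not used here, since it would also accept the unsupported middle wild card.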
+
+Note: It is better to use quotes to enclose the wild cards,
+      otherwise the shell may expand the parameters into names
+      of files in the local directory.
+
+ # echo 'hrtimer_*' > /debug/tracing/set_ftrace_filter
+
+Produces:
+
+# tracer: ftrace
+#
+#           TASK-PID   CPU#    TIMESTAMP  FUNCTION
+#              | |      |          |         |
+            bash-4003  [00]  1480.611794: hrtimer_init <-copy_process
+            bash-4003  [00]  1480.611941: hrtimer_start <-hrtick_set
+            bash-4003  [00]  1480.611956: hrtimer_cancel <-hrtick_clear
+            bash-4003  [00]  1480.611956: hrtimer_try_to_cancel <-hrtimer_cancel
+          <idle>-0     [00]  1480.612019: hrtimer_get_next_event <-get_next_timer_interrupt
+          <idle>-0     [00]  1480.612025: hrtimer_get_next_event <-get_next_timer_interrupt
+          <idle>-0     [00]  1480.612032: hrtimer_get_next_event <-get_next_timer_interrupt
+          <idle>-0     [00]  1480.612037: hrtimer_get_next_event <-get_next_timer_interrupt
+          <idle>-0     [00]  1480.612382: hrtimer_get_next_event <-get_next_timer_interrupt
+
+
+Notice that we lost the sys_nanosleep.
+
+ # cat /debug/tracing/set_ftrace_filter
+hrtimer_run_queues
+hrtimer_run_pending
+hrtimer_init
+hrtimer_cancel
+hrtimer_try_to_cancel
+hrtimer_forward
+hrtimer_start
+hrtimer_reprogram
+hrtimer_force_reprogram
+hrtimer_get_next_event
+hrtimer_interrupt
+hrtimer_nanosleep
+hrtimer_wakeup
+hrtimer_get_remaining
+hrtimer_get_res
+hrtimer_init_sleeper
+
+
+This is because the '>' and '>>' act just like they do in bash.
+To rewrite the filters, use '>'.
+To append to the filters, use '>>'.
+
+To clear out a filter so that all functions will be recorded
+again:
+
+ # echo > /debug/tracing/set_ftrace_filter
+ # cat /debug/tracing/set_ftrace_filter
+ #
+
+Again, now we want to append.
+
+ # echo sys_nanosleep > /debug/tracing/set_ftrace_filter
+ # cat /debug/tracing/set_ftrace_filter
+sys_nanosleep
+ # echo 'hrtimer_*' >> /debug/tracing/set_ftrace_filter
+ # cat /debug/tracing/set_ftrace_filter
+hrtimer_run_queues
+hrtimer_run_pending
+hrtimer_init
+hrtimer_cancel
+hrtimer_try_to_cancel
+hrtimer_forward
+hrtimer_start
+hrtimer_reprogram
+hrtimer_force_reprogram
+hrtimer_get_next_event
+hrtimer_interrupt
+sys_nanosleep
+hrtimer_nanosleep
+hrtimer_wakeup
+hrtimer_get_remaining
+hrtimer_get_res
+hrtimer_init_sleeper
+
+
+The set_ftrace_notrace prevents those functions from being
+traced.
+
+ # echo '*preempt*' '*lock*' > /debug/tracing/set_ftrace_notrace
+
+Produces:
+
+# tracer: ftrace
+#
+#           TASK-PID   CPU#    TIMESTAMP  FUNCTION
+#              | |      |          |         |
+            bash-4043  [01]   115.281644: finish_task_switch <-schedule
+            bash-4043  [01]   115.281645: hrtick_set <-schedule
+            bash-4043  [01]   115.281645: hrtick_clear <-hrtick_set
+            bash-4043  [01]   115.281646: wait_for_completion <-__stop_machine_run
+            bash-4043  [01]   115.281647: wait_for_common <-wait_for_completion
+            bash-4043  [01]   115.281647: kthread_stop <-stop_machine_run
+            bash-4043  [01]   115.281648: init_waitqueue_head <-kthread_stop
+            bash-4043  [01]   115.281648: wake_up_process <-kthread_stop
+            bash-4043  [01]   115.281649: try_to_wake_up <-wake_up_process
+
+We can see that there's no more lock or preempt tracing.
+
+
+Dynamic ftrace with the function graph tracer
+---------------------------------------------
+
+Although what has been explained above concerns both the
+function tracer and the function-graph-tracer, there are some
+special features only available in the function-graph tracer.
+
+If you want to trace only one function and all of its children,
+you just have to echo its name into set_graph_function:
+
+ echo __do_fault > set_graph_function
+
+will produce the following "expanded" trace of the __do_fault()
+function:
+
+ 0)               |  __do_fault() {
+ 0)               |    filemap_fault() {
+ 0)               |      find_lock_page() {
+ 0)   0.804 us    |        find_get_page();
+ 0)               |        __might_sleep() {
+ 0)   1.329 us    |        }
+ 0)   3.904 us    |      }
+ 0)   4.979 us    |    }
+ 0)   0.653 us    |    _spin_lock();
+ 0)   0.578 us    |    page_add_file_rmap();
+ 0)   0.525 us    |    native_set_pte_at();
+ 0)   0.585 us    |    _spin_unlock();
+ 0)               |    unlock_page() {
+ 0)   0.541 us    |      page_waitqueue();
+ 0)   0.639 us    |      __wake_up_bit();
+ 0)   2.786 us    |    }
+ 0) + 14.237 us   |  }
+ 0)               |  __do_fault() {
+ 0)               |    filemap_fault() {
+ 0)               |      find_lock_page() {
+ 0)   0.698 us    |        find_get_page();
+ 0)               |        __might_sleep() {
+ 0)   1.412 us    |        }
+ 0)   3.950 us    |      }
+ 0)   5.098 us    |    }
+ 0)   0.631 us    |    _spin_lock();
+ 0)   0.571 us    |    page_add_file_rmap();
+ 0)   0.526 us    |    native_set_pte_at();
+ 0)   0.586 us    |    _spin_unlock();
+ 0)               |    unlock_page() {
+ 0)   0.533 us    |      page_waitqueue();
+ 0)   0.638 us    |      __wake_up_bit();
+ 0)   2.793 us    |    }
+ 0) + 14.012 us   |  }
+
+You can also expand several functions at once:
+
+ echo sys_open > set_graph_function
+ echo sys_close >> set_graph_function
+
+Now if you want to go back to trace all functions you can clear
+this special filter via:
+
+ echo > set_graph_function
+
+
+trace_pipe
+----------
+
+The trace_pipe outputs the same content as the trace file, but
+the effect on the tracing is different. Every read from
+trace_pipe is consumed. This means that subsequent reads will be
+different. The trace is live.
+
+ # echo function > /debug/tracing/current_tracer
+ # cat /debug/tracing/trace_pipe > /tmp/trace.out &
+[1] 4153
+ # echo 1 > /debug/tracing/tracing_enabled
+ # usleep 1
+ # echo 0 > /debug/tracing/tracing_enabled
+ # cat /debug/tracing/trace
+# tracer: function
+#
+#           TASK-PID   CPU#    TIMESTAMP  FUNCTION
+#              | |      |          |         |
+
+ #
+ # cat /tmp/trace.out
+            bash-4043  [00] 41.267106: finish_task_switch <-schedule
+            bash-4043  [00] 41.267106: hrtick_set <-schedule
+            bash-4043  [00] 41.267107: hrtick_clear <-hrtick_set
+            bash-4043  [00] 41.267108: wait_for_completion <-__stop_machine_run
+            bash-4043  [00] 41.267108: wait_for_common <-wait_for_completion
+            bash-4043  [00] 41.267109: kthread_stop <-stop_machine_run
+            bash-4043  [00] 41.267109: init_waitqueue_head <-kthread_stop
+            bash-4043  [00] 41.267110: wake_up_process <-kthread_stop
+            bash-4043  [00] 41.267110: try_to_wake_up <-wake_up_process
+            bash-4043  [00] 41.267111: select_task_rq_rt <-try_to_wake_up
+
+
+Note, reading the trace_pipe file will block until more input is
+added. By changing the tracer, trace_pipe will issue an EOF. We
+needed to set the function tracer _before_ we "cat" the
+trace_pipe file.
+
+
+trace entries
+-------------
+
+Having too much or not enough data can be troublesome in
+diagnosing an issue in the kernel. The file buffer_size_kb is
+used to modify the size of the internal trace buffers. The
+number listed is the size, in kilobytes, of the buffer for each
+CPU. To know the full size, multiply the number of possible CPUs
+by this per-CPU size.
+
+ # cat /debug/tracing/buffer_size_kb
+1408 (units kilobytes)
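As a worked example of the multiplication described above, here is a small Python sketch. It assumes the value read from buffer_size_kb is the per-CPU size in kilobytes, as the "(units kilobytes)" suffix suggests; the CPU count of 4 and the helper name total_buffer_kb are made up for illustration:

```python
def total_buffer_kb(per_cpu_kb: int, possible_cpus: int) -> int:
    """Total trace buffer size: per-CPU size times the number of possible CPUs."""
    return per_cpu_kb * possible_cpus


# 1408 is the value shown above; 4 possible CPUs is an assumed example.
print(total_buffer_kb(1408, 4))
```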
+
+Note, to modify this, you must have tracing completely disabled.
+To do that, echo "nop" into the current_tracer. If the
+current_tracer is not set to "nop", an EINVAL error will be
+returned.
+
+ # echo nop > /debug/tracing/current_tracer
+ # echo 10000 > /debug/tracing/buffer_size_kb
+ # cat /debug/tracing/buffer_size_kb
+10000 (units kilobytes)
+
+The number of pages which will be allocated is limited to a
+percentage of available memory. Allocating too much will produce
+an error.
+
+ # echo 1000000000000 > /debug/tracing/buffer_size_kb
+-bash: echo: write error: Cannot allocate memory
+ # cat /debug/tracing/buffer_size_kb
+85
+
+-----------
+
+More details can be found in the source code, in the
+kernel/trace/*.c files.
diff --git a/Documentation/trace/kmemtrace.txt b/Documentation/trace/kmemtrace.txt

new file mode 100644 (file)

index 0000000..a956d9b
--- /dev/null
+++ b/Documentation/trace/kmemtrace.txt
@@ -0,0 +1,126 @@
+                       kmemtrace - Kernel Memory Tracer
+
+                         by Eduard - Gabriel Munteanu
+                            <eduard.munteanu@linux360.ro>
+
+I. Introduction
+===============
+
+kmemtrace helps kernel developers figure out two things:
+1) how different allocators (SLAB, SLUB etc.) perform
+2) how kernel code allocates memory and how much
+
+To do this, we trace every allocation and export information to the userspace
+through the relay interface. We export things such as the number of requested
+bytes, the number of bytes actually allocated (i.e. including internal
+fragmentation), whether this is a slab allocation or a plain kmalloc() and so
+on.
+
+The actual analysis is performed by a userspace tool (see section III for
+details on where to get it from). It logs the data exported by the kernel,
+processes it and (as of writing this) can provide the following information:
+- the total amount of memory allocated and fragmentation per call-site
+- the amount of memory allocated and fragmentation per allocation
+- total memory allocated and fragmentation in the collected dataset
+- number of cross-CPU allocations and frees (makes sense in NUMA environments)
+
+Moreover, it can potentially find inconsistent and erroneous behavior in
+kernel code, such as using slab free functions on kmalloc'ed memory or
+allocating less memory than requested (but not truly failed allocations).
+
+kmemtrace also makes provisions for tracing on some arch and analysing the
+data on another.
+
+II. Design and goals
+====================
+
+kmemtrace was designed to handle rather large amounts of data. Thus, it uses
+the relay interface to export whatever is logged to userspace, which then
+stores it. Analysis and reporting is done asynchronously, that is, after the
+data is collected and stored. By design, it allows one to log and analyse
+on different machines and different arches.
+
+As of writing this, the ABI is not considered stable, though it might not
+change much. However, no guarantees are made about compatibility yet. When
+deemed stable, the ABI should still allow easy extension while maintaining
+backward compatibility. This is described further in Documentation/ABI.
+
+Summary of design goals:
+       - allow logging and analysis to be done across different machines
+       - be fast and anticipate usage in high-load environments (*)
+       - be reasonably extensible
+       - make it possible for GNU/Linux distributions to have kmemtrace
+       included in their repositories
+
+(*) - one of the reasons Pekka Enberg's original userspace data analysis
+    tool's code was rewritten from Perl to C (although this is more than a
+    simple conversion)
+
+
+III. Quick usage guide
+======================
+
+1) Get a kernel that supports kmemtrace and build it accordingly (i.e. enable
+CONFIG_KMEMTRACE).
+
+2) Get the userspace tool and build it:
+$ git clone git://repo.or.cz/kmemtrace-user.git                # current repository
+$ cd kmemtrace-user/
+$ ./autogen.sh
+$ ./configure
+$ make
+
+3) Boot the kmemtrace-enabled kernel if you haven't, preferably in the
+'single' runlevel (so that relay buffers don't fill up easily), and run
+kmemtrace:
+# '$' does not mean user, but root here.
+$ mount -t debugfs none /sys/kernel/debug
+$ mount -t proc none /proc
+$ cd path/to/kmemtrace-user/
+$ ./kmemtraced
+Wait a bit, then stop it with CTRL+C.
+$ cat /sys/kernel/debug/kmemtrace/total_overruns       # Check if we didn't
+                                                       # overrun, should
+                                                       # be zero.
+$ (Optionally) [Run kmemtrace_check separately on each cpu[0-9]*.out file to
+               check its correctness]
+$ ./kmemtrace-report
+
+Now you should have a nice and short summary of how the allocator performs.
+
+IV. FAQ and known issues
+========================
+
+Q: 'cat /sys/kernel/debug/kmemtrace/total_overruns' is non-zero, how do I fix
+this? Should I worry?
+A: If it's non-zero, this affects kmemtrace's accuracy, depending on how
+large the number is. You can fix it by supplying a higher
+'kmemtrace.subbufs=N' kernel parameter.
+---
+
+Q: kmemtrace_check reports errors, how do I fix this? Should I worry?
+A: This is a bug and should be reported. It can occur for a variety of
+reasons:
+       - possible bugs in relay code
+       - possible misuse of relay by kmemtrace
+       - timestamps being collected out of order
+Or you may fix it yourself and send us a patch.
+---
+
+Q: kmemtrace_report shows many errors, how do I fix this? Should I worry?
+A: This is a known issue and I'm working on it. These might be true errors
+in kernel code, which may have inconsistent behavior (e.g. allocating memory
+with kmem_cache_alloc() and freeing it with kfree()). Pekka Enberg pointed
+out this behavior may work with SLAB, but may fail with other allocators.
+
+It may also be due to lack of tracing in some unusual allocator functions.
+
+We don't want bug reports regarding this issue yet.
+---
+
+V. See also
+===========
+
+Documentation/kernel-parameters.txt
+Documentation/ABI/testing/debugfs-kmemtrace
+
diff --git a/Documentation/trace/mmiotrace.txt b/Documentation/trace/mmiotrace.txt

new file mode 100644 (file)

index 0000000..5731c67
--- /dev/null
+++ b/Documentation/trace/mmiotrace.txt
@@ -0,0 +1,163 @@
+               In-kernel memory-mapped I/O tracing
+
+
+Home page and links to optional user space tools:
+
+       http://nouveau.freedesktop.org/wiki/MmioTrace
+
+MMIO tracing was originally developed by Intel around 2003 for their Fault
+Injection Test Harness. In Dec 2006 - Jan 2007, using the code from Intel,
+Jeff Muizelaar created a tool for tracing MMIO accesses with the Nouveau
+project in mind. Since then many people have contributed.
+
+Mmiotrace was built for reverse engineering any memory-mapped IO device with
+the Nouveau project as the first real user. Only x86 and x86_64 architectures
+are supported.
+
+Out-of-tree mmiotrace was originally modified for mainline inclusion and
+ftrace framework by Pekka Paalanen <pq@iki.fi>.
+
+
+Preparation
+-----------
+
+Mmiotrace feature is compiled in by the CONFIG_MMIOTRACE option. Tracing is
+disabled by default, so it is safe to have this set to yes. SMP systems are
+supported, but tracing is unreliable and may miss events if more than one CPU
+is on-line, therefore mmiotrace takes all but one CPU off-line during run-time
+activation. You can re-enable CPUs by hand, but you have been warned, there
+is no way to automatically detect if you are losing events due to CPUs racing.
+
+
+Usage Quick Reference
+---------------------
+
+$ mount -t debugfs debugfs /debug
+$ echo mmiotrace > /debug/tracing/current_tracer
+$ cat /debug/tracing/trace_pipe > mydump.txt &
+Start X or whatever.
+$ echo "X is up" > /debug/tracing/trace_marker
+$ echo nop > /debug/tracing/current_tracer
+Check for lost events.
+
+
+Usage
+-----
+
+Make sure debugfs is mounted to /debug. If not, mount it (requires root
+privileges):
+$ mount -t debugfs debugfs /debug
+
+Check that the driver you are about to trace is not loaded.
+
+Activate mmiotrace (requires root privileges):
+$ echo mmiotrace > /debug/tracing/current_tracer
+
+Start storing the trace:
+$ cat /debug/tracing/trace_pipe > mydump.txt &
+The 'cat' process should stay running (sleeping) in the background.
+
+Load the driver you want to trace and use it. Mmiotrace will only catch MMIO
+accesses to areas that are ioremapped while mmiotrace is active.
+
+During tracing you can place comments (markers) into the trace by
+$ echo "X is up" > /debug/tracing/trace_marker
+This makes it easier to see which part of the (huge) trace corresponds to
+which action. It is recommended to place descriptive markers about what you
+do.
+
+Shut down mmiotrace (requires root privileges):
+$ echo nop > /debug/tracing/current_tracer
+The 'cat' process exits. If it does not, kill it by issuing 'fg' command and
+pressing ctrl+c.
+
+Check that mmiotrace did not lose events due to a buffer filling up. Either
+$ grep -i lost mydump.txt
+which tells you exactly how many events were lost, or use
+$ dmesg
+to view your kernel log and look for "mmiotrace has lost events" warning. If
+events were lost, the trace is incomplete. You should enlarge the buffers and
+try again. Buffers are enlarged by first seeing how large the current buffers
+are:
+$ cat /debug/tracing/buffer_size_kb
+gives you a number. Approximately double this number and write it back, for
+instance:
+$ echo 128000 > /debug/tracing/buffer_size_kb
+Then start again from the top.
+
+If you are doing a trace for a driver project, e.g. Nouveau, you should also
+do the following before sending your results:
+$ lspci -vvv > lspci.txt
+$ dmesg > dmesg.txt
+$ tar zcf pciid-nick-mmiotrace.tar.gz mydump.txt lspci.txt dmesg.txt
+and then send the .tar.gz file. The trace compresses considerably. Replace
+"pciid" and "nick" with the PCI ID or model name of your piece of hardware
+under investigation and your nick name.
+
+
+How Mmiotrace Works
+-------------------
+
+Access to hardware IO-memory is gained by mapping addresses from PCI bus by
+calling one of the ioremap_*() functions. Mmiotrace is hooked into the
+__ioremap() function and gets called whenever a mapping is created. Mapping is
+an event that is recorded into the trace log. Note that ISA range mappings
+are not caught, since the mapping always exists and is returned directly.
+
+MMIO accesses are recorded via page faults. Just before __ioremap() returns,
+the mapped pages are marked as not present. Any access to the pages causes a
+fault. The page fault handler calls mmiotrace to handle the fault. Mmiotrace
+marks the page present, sets TF flag to achieve single stepping and exits the
+fault handler. The instruction that faulted is executed and debug trap is
+entered. Here mmiotrace again marks the page as not present. The instruction
+is decoded to get the type of operation (read/write), data width and the value
+read or written. These are stored to the trace log.
+
+Setting the page present in the page fault handler has a race condition on SMP
+machines. During the single stepping other CPUs may run freely on that page
+and events can be missed without a notice. Re-enabling other CPUs during
+tracing is discouraged.
+
+
+Trace Log Format
+----------------
+
+The raw log is text and easily filtered with e.g. grep and awk. One record is
+one line in the log. A record starts with a keyword, followed by
+keyword-dependent arguments. Arguments are separated by a space, or continue
+until the end of line. The format for version 20070824 is as follows:
+
+Explanation      Keyword  Space separated arguments
+---------------------------------------------------------------------------
+
+read event       R        width, timestamp, map id, physical, value, PC, PID
+write event      W        width, timestamp, map id, physical, value, PC, PID
+ioremap event    MAP      timestamp, map id, physical, virtual, length, PC, PID
+iounmap event    UNMAP    timestamp, map id, PC, PID
+marker           MARK     timestamp, text
+version          VERSION  the string "20070824"
+info for reader  LSPCI    one line from lspci -v
+PCI address map  PCIDEV   space separated /proc/bus/pci/devices data
+unk. opcode      UNKNOWN  timestamp, map id, physical, data, PC, PID
+
+Timestamp is in seconds with decimals. Physical is a PCI bus address, virtual
+is a kernel virtual address. Width is the data width in bytes and value is the
+data value. Map id is an arbitrary id number identifying the mapping that was
+used in an operation. PC is the program counter and PID is process id. PC is
+zero if it is not recorded. PID is always zero as tracing MMIO accesses
+originating in user space memory is not yet supported.
+
+For instance, the following awk filter will pass all 32-bit writes that target
+physical addresses in the range [0xfb73ce40, 0xfb800000[
+
+$ awk '/W 4 / { adr=strtonum($5); if (adr >= 0xfb73ce40 &&
+adr < 0xfb800000) print; }'
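The same filter can be sketched in Python for readers who want to post-process a dump further. This is only an illustration of the R/W record layout from the table above: the names Access, parse_access and is_32bit_write_in_range are hypothetical, the sample line is made up, and it assumes addresses and values are written with a 0x prefix while width, map id and PID are decimal:

```python
from typing import NamedTuple, Optional


class Access(NamedTuple):
    kind: str         # "R" or "W"
    width: int        # data width in bytes
    timestamp: float  # seconds with decimals
    map_id: int
    physical: int     # PCI bus address
    value: int
    pc: int
    pid: int


def parse_access(line: str) -> Optional[Access]:
    """Parse one R or W record; return None for any other kind of line."""
    fields = line.split()
    if len(fields) != 8 or fields[0] not in ("R", "W"):
        return None
    kind, width, ts, map_id, phys, value, pc, pid = fields
    # int(x, 0) accepts both "0x..." hexadecimal and plain decimal text.
    return Access(kind, int(width), float(ts), int(map_id),
                  int(phys, 0), int(value, 0), int(pc, 0), int(pid))


def is_32bit_write_in_range(line: str, lo: int = 0xfb73ce40,
                            hi: int = 0xfb800000) -> bool:
    """True for 4-byte writes whose physical address is in [lo, hi)."""
    rec = parse_access(line)
    return (rec is not None and rec.kind == "W" and rec.width == 4
            and lo <= rec.physical < hi)


# Made-up record in the documented field order, for illustration only.
sample = "W 4 152.000345 1 0xfb73ce44 0x1 0x0 0"
print(is_32bit_write_in_range(sample))
```

Parsing into named fields makes it easy to add further conditions, for example on the map id or timestamp, without counting column positions.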
+
+
+Tools for Developers
+--------------------
+
+The user space tools include utilities for:
+- replacing numeric addresses and values with hardware register names
+- replaying MMIO logs, i.e., re-executing the recorded writes
+
+
diff --git a/Documentation/trace/tracepoints.txt b/Documentation/trace/tracepoints.txt

new file mode 100644 (file)

index 0000000..c0e1cee
--- /dev/null
+++ b/Documentation/trace/tracepoints.txt
@@ -0,0 +1,116 @@
+                    Using the Linux Kernel Tracepoints
+
+                           Mathieu Desnoyers
+
+
+This document introduces Linux Kernel Tracepoints and their use. It
+provides examples of how to insert tracepoints in the kernel and
+connect probe functions to them and provides some examples of probe
+functions.
+
+
+* Purpose of tracepoints
+
+A tracepoint placed in code provides a hook to call a function (probe)
+that you can provide at runtime. A tracepoint can be "on" (a probe is
+connected to it) or "off" (no probe is attached). When a tracepoint is
+"off" it has no effect, except for adding a tiny time penalty
+(checking a condition for a branch) and space penalty (adding a few
+bytes for the function call at the end of the instrumented function
+and adding a data structure in a separate section).  When a tracepoint
+is "on", the function you provide is called each time the tracepoint
+is executed, in the execution context of the caller. When the function
+provided ends its execution, it returns to the caller (continuing from
+the tracepoint site).
+
+You can put tracepoints at important locations in the code. They are
+lightweight hooks that can pass an arbitrary number of parameters,
+whose prototypes are described in a tracepoint declaration placed in a
+header file.
+
+They can be used for tracing and performance accounting.
+
+
+* Usage
+
+Two elements are required for tracepoints :
+
+- A tracepoint definition, placed in a header file.
+- The tracepoint statement, in C code.
+
+In order to use tracepoints, you should include linux/tracepoint.h.
+
+In include/trace/subsys.h :
+
+#include <linux/tracepoint.h>
+
+DECLARE_TRACE(subsys_eventname,
+       TP_PROTO(int firstarg, struct task_struct *p),
+       TP_ARGS(firstarg, p));
+
+In subsys/file.c (where the tracing statement must be added) :
+
+#include <trace/subsys.h>
+
+DEFINE_TRACE(subsys_eventname);
+
+void somefct(void)
+{
+       ...
+       trace_subsys_eventname(arg, task);
+       ...
+}
+
+Where :
+- subsys_eventname is an identifier unique to your event
+    - subsys is the name of your subsystem.
+    - eventname is the name of the event to trace.
+
+- TP_PROTO(int firstarg, struct task_struct *p) is the prototype of the
+  function called by this tracepoint.
+
+- TP_ARGS(firstarg, p) are the parameter names, same as found in the
+  prototype.
+
+Connecting a function (probe) to a tracepoint is done by providing a
+probe (function to call) for the specific tracepoint through
+register_trace_subsys_eventname().  Removing a probe is done through
+unregister_trace_subsys_eventname(); it will remove the probe.
+
+tracepoint_synchronize_unregister() must be called before the end of
+the module exit function to make sure there is no caller left using
+the probe. This, and the fact that preemption is disabled around the
+probe call, make sure that probe removal and module unload are safe.
+See the "Probe / tracepoint example" section below for a sample probe module.
+
+The tracepoint mechanism supports inserting multiple instances of the
+same tracepoint, but a single definition must be made of a given
+tracepoint name over all the kernel to make sure no type conflict will
+occur. Name mangling of the tracepoints is done using the prototypes
+to make sure typing is correct. Verification of probe type correctness
+is done at the registration site by the compiler. Tracepoints can be
+put in inline functions, inlined static functions, and unrolled loops
+as well as regular functions.
+
+The naming scheme "subsys_event" is suggested here as a convention
+intended to limit collisions. Tracepoint names are global to the
+kernel: they are considered as being the same whether they are in the
+core kernel image or in modules.
+
+If the tracepoint has to be used in kernel modules, an
+EXPORT_TRACEPOINT_SYMBOL_GPL() or EXPORT_TRACEPOINT_SYMBOL() can be
+used to export the defined tracepoints.
+
+* Probe / tracepoint example
+
+See the examples provided in samples/tracepoints.
+
+Compile them with your kernel.  They are built during 'make' (not
+'make modules') when CONFIG_SAMPLE_TRACEPOINTS=m.
+
+Run, as root :
+modprobe tracepoint-sample (insmod order is not important)
+modprobe tracepoint-probe-sample
+cat /proc/tracepoint-sample (returns an expected error)
+rmmod tracepoint-sample tracepoint-probe-sample
+dmesg
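Reviewer note: the "on"/"off" behaviour described in the tracepoints text above can be modelled in plain userspace C. This is only a sketch under stated assumptions, not the kernel implementation: the real tracepoint is generated by DECLARE_TRACE()/DEFINE_TRACE(), and every name below (register_probe, unregister_probe, trace_event, counting_probe) is hypothetical. The model also allows a single probe only, and it ignores the synchronization that tracepoint_synchronize_unregister() provides.

```c
#include <assert.h>
#include <stddef.h>

/* Probe prototype, playing the role of TP_PROTO() in this model. */
typedef void (*probe_fn)(int firstarg, const char *task_name);

/* NULL means the tracepoint is "off" (no probe attached). */
static probe_fn subsys_eventname_probe;

/* Attach a probe; this model accepts one probe at a time. */
static int register_probe(probe_fn probe)
{
	assert(probe != NULL);
	if (subsys_eventname_probe)
		return -1;
	subsys_eventname_probe = probe;
	return 0;
}

/* Detach the probe, turning the tracepoint "off" again. */
static void unregister_probe(void)
{
	subsys_eventname_probe = NULL;
}

/*
 * Tracepoint site: when "off" the only cost is the branch; when "on"
 * the probe runs in the caller's context and then returns here.
 */
static void trace_event(int firstarg, const char *task_name)
{
	if (subsys_eventname_probe)
		subsys_eventname_probe(firstarg, task_name);
}

/* Example probe that counts hits and remembers the last argument. */
static int hits;
static int last_arg;

static void counting_probe(int firstarg, const char *task_name)
{
	(void)task_name;
	hits++;
	last_arg = firstarg;
}
```

Calling trace_event() before register_probe(counting_probe) leaves hits untouched; after registration each call increments it, and after unregister_probe() the site is inert again.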
diff --git a/Documentation/tracepoints.txt b/Documentation/tracepoints.txt
deleted file mode 100644
index c0e1cee..0000000
--- a/Documentation/tracepoints.txt
+++ /dev/null
@@ -1,116 +0,0 @@
-                    Using the Linux Kernel Tracepoints
-
-                           Mathieu Desnoyers
-
-
-This document introduces Linux Kernel Tracepoints and their use. It
-provides examples of how to insert tracepoints in the kernel and
-connect probe functions to them and provides some examples of probe
-functions.
-
-
-* Purpose of tracepoints
-
-A tracepoint placed in code provides a hook to call a function (probe)
-that you can provide at runtime. A tracepoint can be "on" (a probe is
-connected to it) or "off" (no probe is attached). When a tracepoint is
-"off" it has no effect, except for adding a tiny time penalty
-(checking a condition for a branch) and space penalty (adding a few
-bytes for the function call at the end of the instrumented function
-and adds a data structure in a separate section).  When a tracepoint
-is "on", the function you provide is called each time the tracepoint
-is executed, in the execution context of the caller. When the function
-provided ends its execution, it returns to the caller (continuing from
-the tracepoint site).
-
-You can put tracepoints at important locations in the code. They are
-lightweight hooks that can pass an arbitrary number of parameters,
-which prototypes are described in a tracepoint declaration placed in a
-header file.
-
-They can be used for tracing and performance accounting.
-
-
-* Usage
-
-Two elements are required for tracepoints :
-
-- A tracepoint definition, placed in a header file.
-- The tracepoint statement, in C code.
-
-In order to use tracepoints, you should include linux/tracepoint.h.
-
-In include/trace/subsys.h :
-
-#include <linux/tracepoint.h>
-
-DECLARE_TRACE(subsys_eventname,
-       TP_PROTO(int firstarg, struct task_struct *p),
-       TP_ARGS(firstarg, p));
-
-In subsys/file.c (where the tracing statement must be added) :
-
-#include <trace/subsys.h>
-
-DEFINE_TRACE(subsys_eventname);
-
-void somefct(void)
-{
-       ...
-       trace_subsys_eventname(arg, task);
-       ...
-}
-
-Where :
-- subsys_eventname is an identifier unique to your event
-    - subsys is the name of your subsystem.
-    - eventname is the name of the event to trace.
-
-- TP_PROTO(int firstarg, struct task_struct *p) is the prototype of the
-  function called by this tracepoint.
-
-- TP_ARGS(firstarg, p) are the parameters names, same as found in the
-  prototype.
-
-Connecting a function (probe) to a tracepoint is done by providing a
-probe (function to call) for the specific tracepoint through
-register_trace_subsys_eventname().  Removing a probe is done through
-unregister_trace_subsys_eventname(); it will remove the probe.
-
-tracepoint_synchronize_unregister() must be called before the end of
-the module exit function to make sure there is no caller left using
-the probe. This, and the fact that preemption is disabled around the
-probe call, make sure that probe removal and module unload are safe.
-See the "Probe example" section below for a sample probe module.
-
-The tracepoint mechanism supports inserting multiple instances of the
-same tracepoint, but a single definition must be made of a given
-tracepoint name over all the kernel to make sure no type conflict will
-occur. Name mangling of the tracepoints is done using the prototypes
-to make sure typing is correct. Verification of probe type correctness
-is done at the registration site by the compiler. Tracepoints can be
-put in inline functions, inlined static functions, and unrolled loops
-as well as regular functions.
-
-The naming scheme "subsys_event" is suggested here as a convention
-intended to limit collisions. Tracepoint names are global to the
-kernel: they are considered as being the same whether they are in the
-core kernel image or in modules.
-
-If the tracepoint has to be used in kernel modules, an
-EXPORT_TRACEPOINT_SYMBOL_GPL() or EXPORT_TRACEPOINT_SYMBOL() can be
-used to export the defined tracepoints.
-
-* Probe / tracepoint example
-
-See the example provided in samples/tracepoints
-
-Compile them with your kernel.  They are built during 'make' (not
-'make modules') when CONFIG_SAMPLE_TRACEPOINTS=m.
-
-Run, as root :
-modprobe tracepoint-sample (insmod order is not important)
-modprobe tracepoint-probe-sample
-cat /proc/tracepoint-sample (returns an expected error)
-rmmod tracepoint-sample tracepoint-probe-sample
-dmesg
diff --git a/Documentation/tracers/mmiotrace.txt b/Documentation/tracers/mmiotrace.txt
deleted file mode 100644
index 5731c67..0000000
--- a/Documentation/tracers/mmiotrace.txt
+++ /dev/null
@@ -1,163 +0,0 @@
-               In-kernel memory-mapped I/O tracing
-
-
-Home page and links to optional user space tools:
-
-       http://nouveau.freedesktop.org/wiki/MmioTrace
-
-MMIO tracing was originally developed by Intel around 2003 for their Fault
-Injection Test Harness. In Dec 2006 - Jan 2007, using the code from Intel,
-Jeff Muizelaar created a tool for tracing MMIO accesses with the Nouveau
-project in mind. Since then many people have contributed.
-
-Mmiotrace was built for reverse engineering any memory-mapped IO device with
-the Nouveau project as the first real user. Only x86 and x86_64 architectures
-are supported.
-
-Out-of-tree mmiotrace was originally modified for mainline inclusion and
-ftrace framework by Pekka Paalanen <pq@iki.fi>.
-
-
-Preparation
------------
-
-Mmiotrace feature is compiled in by the CONFIG_MMIOTRACE option. Tracing is
-disabled by default, so it is safe to have this set to yes. SMP systems are
-supported, but tracing is unreliable and may miss events if more than one CPU
-is on-line, therefore mmiotrace takes all but one CPU off-line during run-time
-activation. You can re-enable CPUs by hand, but you have been warned, there
-is no way to automatically detect if you are losing events due to CPUs racing.
-
-
-Usage Quick Reference
----------------------
-
-$ mount -t debugfs debugfs /debug
-$ echo mmiotrace > /debug/tracing/current_tracer
-$ cat /debug/tracing/trace_pipe > mydump.txt &
-Start X or whatever.
-$ echo "X is up" > /debug/tracing/trace_marker
-$ echo nop > /debug/tracing/current_tracer
-Check for lost events.
-
-
-Usage
------
-
-Make sure debugfs is mounted to /debug. If not, (requires root privileges)
-$ mount -t debugfs debugfs /debug
-
-Check that the driver you are about to trace is not loaded.
-
-Activate mmiotrace (requires root privileges):
-$ echo mmiotrace > /debug/tracing/current_tracer
-
-Start storing the trace:
-$ cat /debug/tracing/trace_pipe > mydump.txt &
-The 'cat' process should stay running (sleeping) in the background.
-
-Load the driver you want to trace and use it. Mmiotrace will only catch MMIO
-accesses to areas that are ioremapped while mmiotrace is active.
-
-During tracing you can place comments (markers) into the trace by
-$ echo "X is up" > /debug/tracing/trace_marker
-This makes it easier to see which part of the (huge) trace corresponds to
-which action. It is recommended to place descriptive markers about what you
-do.
-
-Shut down mmiotrace (requires root privileges):
-$ echo nop > /debug/tracing/current_tracer
-The 'cat' process exits. If it does not, kill it by issuing 'fg' command and
-pressing ctrl+c.
-
-Check that mmiotrace did not lose events due to a buffer filling up. Either
-$ grep -i lost mydump.txt
-which tells you exactly how many events were lost, or use
-$ dmesg
-to view your kernel log and look for "mmiotrace has lost events" warning. If
-events were lost, the trace is incomplete. You should enlarge the buffers and
-try again. Buffers are enlarged by first seeing how large the current buffers
-are:
-$ cat /debug/tracing/buffer_size_kb
-gives you a number. Approximately double this number and write it back, for
-instance:
-$ echo 128000 > /debug/tracing/buffer_size_kb
-Then start again from the top.
-
-If you are doing a trace for a driver project, e.g. Nouveau, you should also
-do the following before sending your results:
-$ lspci -vvv > lspci.txt
-$ dmesg > dmesg.txt
-$ tar zcf pciid-nick-mmiotrace.tar.gz mydump.txt lspci.txt dmesg.txt
-and then send the .tar.gz file. The trace compresses considerably. Replace
-"pciid" and "nick" with the PCI ID or model name of your piece of hardware
-under investigation and your nick name.
-
-
-How Mmiotrace Works
--------------------
-
-Access to hardware IO-memory is gained by mapping addresses from PCI bus by
-calling one of the ioremap_*() functions. Mmiotrace is hooked into the
-__ioremap() function and gets called whenever a mapping is created. Mapping is
-an event that is recorded into the trace log. Note, that ISA range mappings
-are not caught, since the mapping always exists and is returned directly.
-
-MMIO accesses are recorded via page faults. Just before __ioremap() returns,
-the mapped pages are marked as not present. Any access to the pages causes a
-fault. The page fault handler calls mmiotrace to handle the fault. Mmiotrace
-marks the page present, sets TF flag to achieve single stepping and exits the
-fault handler. The instruction that faulted is executed and debug trap is
-entered. Here mmiotrace again marks the page as not present. The instruction
-is decoded to get the type of operation (read/write), data width and the value
-read or written. These are stored to the trace log.
-
-Setting the page present in the page fault handler has a race condition on SMP
-machines. During the single stepping other CPUs may run freely on that page
-and events can be missed without a notice. Re-enabling other CPUs during
-tracing is discouraged.
-
-
-Trace Log Format
-----------------
-
-The raw log is text and easily filtered with e.g. grep and awk. One record is
-one line in the log. A record starts with a keyword, followed by keyword
-dependant arguments. Arguments are separated by a space, or continue until the
-end of line. The format for version 20070824 is as follows:
-
-Explanation    Keyword Space separated arguments
----------------------------------------------------------------------------
-
-read event     R       width, timestamp, map id, physical, value, PC, PID
-write event    W       width, timestamp, map id, physical, value, PC, PID
-ioremap event  MAP     timestamp, map id, physical, virtual, length, PC, PID
-iounmap event  UNMAP   timestamp, map id, PC, PID
-marker         MARK    timestamp, text
-version                VERSION the string "20070824"
-info for reader        LSPCI   one line from lspci -v
-PCI address map        PCIDEV  space separated /proc/bus/pci/devices data
-unk. opcode    UNKNOWN timestamp, map id, physical, data, PC, PID
-
-Timestamp is in seconds with decimals. Physical is a PCI bus address, virtual
-is a kernel virtual address. Width is the data width in bytes and value is the
-data value. Map id is an arbitrary id number identifying the mapping that was
-used in an operation. PC is the program counter and PID is process id. PC is
-zero if it is not recorded. PID is always zero as tracing MMIO accesses
-originating in user space memory is not yet supported.
-
-For instance, the following awk filter will pass all 32-bit writes that target
-physical addresses in the range [0xfb73ce40, 0xfb800000[
-
-$ awk '/W 4 / { adr=strtonum($5); if (adr >= 0xfb73ce40 &&
-adr < 0xfb800000) print; }'
-
-
-Tools for Developers
---------------------
-
-The user space tools include utilities for:
-- replacing numeric addresses and values with hardware register names
-- replaying MMIO logs, i.e., re-executing the recorded writes
-
-
diff --git a/Documentation/vm/kmemtrace.txt b/Documentation/vm/kmemtrace.txt
deleted file mode 100644
index a956d9b..0000000
--- a/Documentation/vm/kmemtrace.txt
+++ /dev/null
@@ -1,126 +0,0 @@
-                       kmemtrace - Kernel Memory Tracer
-
-                         by Eduard - Gabriel Munteanu
-                            <eduard.munteanu@linux360.ro>
-
-I. Introduction
-===============
-
-kmemtrace helps kernel developers figure out two things:
-1) how different allocators (SLAB, SLUB etc.) perform
-2) how kernel code allocates memory and how much
-
-To do this, we trace every allocation and export information to the userspace
-through the relay interface. We export things such as the number of requested
-bytes, the number of bytes actually allocated (i.e. including internal
-fragmentation), whether this is a slab allocation or a plain kmalloc() and so
-on.
-
-The actual analysis is performed by a userspace tool (see section III for
-details on where to get it from). It logs the data exported by the kernel,
-processes it and (as of writing this) can provide the following information:
-- the total amount of memory allocated and fragmentation per call-site
-- the amount of memory allocated and fragmentation per allocation
-- total memory allocated and fragmentation in the collected dataset
-- number of cross-CPU allocation and frees (makes sense in NUMA environments)
-
-Moreover, it can potentially find inconsistent and erroneous behavior in
-kernel code, such as using slab free functions on kmalloc'ed memory or
-allocating less memory than requested (but not truly failed allocations).
-
-kmemtrace also makes provisions for tracing on some arch and analysing the
-data on another.
-
-II. Design and goals
-====================
-
-kmemtrace was designed to handle rather large amounts of data. Thus, it uses
-the relay interface to export whatever is logged to userspace, which then
-stores it. Analysis and reporting is done asynchronously, that is, after the
-data is collected and stored. By design, it allows one to log and analyse
-on different machines and different arches.
-
-As of writing this, the ABI is not considered stable, though it might not
-change much. However, no guarantees are made about compatibility yet. When
-deemed stable, the ABI should still allow easy extension while maintaining
-backward compatibility. This is described further in Documentation/ABI.
-
-Summary of design goals:
-       - allow logging and analysis to be done across different machines
-       - be fast and anticipate usage in high-load environments (*)
-       - be reasonably extensible
-       - make it possible for GNU/Linux distributions to have kmemtrace
-       included in their repositories
-
-(*) - one of the reasons Pekka Enberg's original userspace data analysis
-    tool's code was rewritten from Perl to C (although this is more than a
-    simple conversion)
-
-
-III. Quick usage guide
-======================
-
-1) Get a kernel that supports kmemtrace and build it accordingly (i.e. enable
-CONFIG_KMEMTRACE).
-
-2) Get the userspace tool and build it:
-$ git-clone git://repo.or.cz/kmemtrace-user.git                # current repository
-$ cd kmemtrace-user/
-$ ./autogen.sh
-$ ./configure
-$ make
-
-3) Boot the kmemtrace-enabled kernel if you haven't, preferably in the
-'single' runlevel (so that relay buffers don't fill up easily), and run
-kmemtrace:
-# '$' does not mean user, but root here.
-$ mount -t debugfs none /sys/kernel/debug
-$ mount -t proc none /proc
-$ cd path/to/kmemtrace-user/
-$ ./kmemtraced
-Wait a bit, then stop it with CTRL+C.
-$ cat /sys/kernel/debug/kmemtrace/total_overruns       # Check if we didn't
-                                                       # overrun, should
-                                                       # be zero.
-$ (Optionally) [Run kmemtrace_check separately on each cpu[0-9]*.out file to
-               check its correctness]
-$ ./kmemtrace-report
-
-Now you should have a nice and short summary of how the allocator performs.
-
-IV. FAQ and known issues
-========================
-
-Q: 'cat /sys/kernel/debug/kmemtrace/total_overruns' is non-zero, how do I fix
-this? Should I worry?
-A: If it's non-zero, this affects kmemtrace's accuracy, depending on how
-large the number is. You can fix it by supplying a higher
-'kmemtrace.subbufs=N' kernel parameter.
----
-
-Q: kmemtrace_check reports errors, how do I fix this? Should I worry?
-A: This is a bug and should be reported. It can occur for a variety of
-reasons:
-       - possible bugs in relay code
-       - possible misuse of relay by kmemtrace
-       - timestamps being collected unorderly
-Or you may fix it yourself and send us a patch.
----
-
-Q: kmemtrace_report shows many errors, how do I fix this? Should I worry?
-A: This is a known issue and I'm working on it. These might be true errors
-in kernel code, which may have inconsistent behavior (e.g. allocating memory
-with kmem_cache_alloc() and freeing it with kfree()). Pekka Enberg pointed
-out this behavior may work with SLAB, but may fail with other allocators.
-
-It may also be due to lack of tracing in some unusual allocator functions.
-
-We don't want bug reports regarding this issue yet.
----
-
-V. See also
-===========
-
-Documentation/kernel-parameters.txt
-Documentation/ABI/testing/debugfs-kmemtrace
-
diff --git a/MAINTAINERS b/MAINTAINERS
index c3b215970f7b439fd83cf1dc2009979c3ce0a84f..5d843588e1de159958693aee543af4af2e82c6dc 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -636,7 +636,7 @@ P:  Dirk Opfer
  M:     dirk@opfer-online.de
  S:     Maintained
  
-ARM/PALMTX,PALMT5,PALMLD SUPPORT
+ARM/PALMTX,PALMT5,PALMLD,PALMTE2 SUPPORT
  P:     Marek Vasut
  M:     marek.vasut@gmail.com
  W:     http://hackndev.com
@@ -3057,7 +3057,7 @@ S:        Supported
  
  MULTIMEDIA CARD (MMC), SECURE DIGITAL (SD) AND SDIO SUBSYSTEM
  P:     Pierre Ossman
-M:     drzeus-mmc@drzeus.cx
+M:     pierre@ossman.eu
  L:     linux-kernel@vger.kernel.org
  S:     Maintained
  
@@ -3873,8 +3873,8 @@ S:        Maintained
  SCHEDULER
  P:     Ingo Molnar
  M:     mingo@elte.hu
-P:     Robert Love    [the preemptible kernel bits]
-M:     rml@tech9.net
+P:     Peter Zijlstra
+M:     peterz@infradead.org
  L:     linux-kernel@vger.kernel.org
  S:     Maintained
  
@@ -3939,7 +3939,7 @@ S:        Maintained
  
  SECURE DIGITAL HOST CONTROLLER INTERFACE (SDHCI) DRIVER
  P:     Pierre Ossman
-M:     drzeus-sdhci@drzeus.cx
+M:     pierre@ossman.eu
  L:     sdhci-devel@lists.ossman.eu
  S:     Maintained
  
@@ -4926,7 +4926,7 @@ S:        Maintained
  
  W83L51xD SD/MMC CARD INTERFACE DRIVER
  P:     Pierre Ossman
-M:     drzeus-wbsd@drzeus.cx
+M:     pierre@ossman.eu
  L:     linux-kernel@vger.kernel.org
  S:     Maintained
  
diff --git a/arch/arm/configs/magician_defconfig b/arch/arm/configs/magician_defconfig
index 82428c2f234cb4c1b6f509aed187d52164211834..f56837f69ca7323130828ab367e736e59ca98ea6 100644
--- a/arch/arm/configs/magician_defconfig
+++ b/arch/arm/configs/magician_defconfig
@@ -1183,7 +1183,11 @@ CONFIG_RTC_INTF_DEV=y
  CONFIG_RTC_DRV_SA1100=y
  # CONFIG_RTC_DRV_PXA is not set
  # CONFIG_DMADEVICES is not set
-# CONFIG_REGULATOR is not set
+CONFIG_REGULATOR=y
+# CONFIG_REGULATOR_DEBUG is not set
+# CONFIG_REGULATOR_FIXED_VOLTAGE is not set
+# CONFIG_REGULATOR_VIRTUAL_CONSUMER is not set
+CONFIG_REGULATOR_BQ24022=y
  # CONFIG_UIO is not set
  # CONFIG_STAGING is not set
  
diff --git a/arch/arm/include/asm/sizes.h b/arch/arm/include/asm/sizes.h
index c10d1aa4b48728aae12cf45764e8bb62a6b1f5e8..ada93a8fc2ef6ed426cf5ac61ba27b3926ee402e 100644
--- a/arch/arm/include/asm/sizes.h
+++ b/arch/arm/include/asm/sizes.h
@@ -32,6 +32,7 @@
  #define SZ_4K                           0x00001000
  #define SZ_8K                           0x00002000
  #define SZ_16K                          0x00004000
+#define SZ_32K                          0x00008000
  #define SZ_64K                          0x00010000
  #define SZ_128K                         0x00020000
  #define SZ_256K                         0x00040000
diff --git a/arch/arm/mach-at91/include/mach/board.h b/arch/arm/mach-at91/include/mach/board.h
index 793fe7b25f367653292110e5936c1c8b4bcdc3d8..e6afff849b85e5475155b344123387276958a694 100644
--- a/arch/arm/mach-at91/include/mach/board.h
+++ b/arch/arm/mach-at91/include/mach/board.h
@@ -87,7 +87,7 @@ extern void __init at91_add_device_eth(struct at91_eth_data *data);
   /* USB Host */
  struct at91_usbh_data {
         u8              ports;          /* number of ports on root hub */
-       u8              vbus_pin[];     /* port power-control pin */
+       u8              vbus_pin[2];    /* port power-control pin */
  };
  extern void __init at91_add_device_usbh(struct at91_usbh_data *data);
  
diff --git a/arch/arm/mach-omap1/clock.c b/arch/arm/mach-omap1/clock.c
index dafe4f71d15f2a4785f076d444a58e529480e176..336e51dc612725588b802ef9b0e4da53f3d70d63 100644
--- a/arch/arm/mach-omap1/clock.c
+++ b/arch/arm/mach-omap1/clock.c
@@ -590,27 +590,28 @@ static void omap1_init_ext_clk(struct clk * clk)
  static int omap1_clk_enable(struct clk *clk)
  {
         int ret = 0;
+
         if (clk->usecount++ == 0) {
-               if (likely(clk->parent)) {
+               if (clk->parent) {
                         ret = omap1_clk_enable(clk->parent);
-
-                       if (unlikely(ret != 0)) {
-                               clk->usecount--;
-                               return ret;
-                       }
+                       if (ret)
+                               goto err;
  
                         if (clk->flags & CLOCK_NO_IDLE_PARENT)
                                 omap1_clk_deny_idle(clk->parent);
                 }
  
                 ret = clk->ops->enable(clk);
-
-               if (unlikely(ret != 0) && clk->parent) {
-                       omap1_clk_disable(clk->parent);
-                       clk->usecount--;
+               if (ret) {
+                       if (clk->parent)
+                               omap1_clk_disable(clk->parent);
+                       goto err;
                 }
         }
+       return ret;
  
+err:
+       clk->usecount--;
         return ret;
  }
  
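Reviewer note: the reworked error path in omap1_clk_enable() above funnels every failure through a single "err" label that undoes the usecount increment. Reading the removed lines, the old code appears to have dropped the count after a failed ops->enable() only when the clock had a parent, so a parentless clock could be left with a stale usecount. The userspace sketch below models just that unwinding pattern. It is an illustration under assumptions, not the kernel code: struct clk is reduced to three fields, the enable callback stands in for clk->ops->enable, the CLOCK_NO_IDLE_PARENT handling is omitted, and enable_ok/enable_fail are hypothetical helpers.

```c
#include <assert.h>
#include <stddef.h>

/* Reduced stand-in for the kernel's struct clk (assumption). */
struct clk {
	struct clk *parent;
	int usecount;
	int (*enable)(struct clk *clk);	/* models clk->ops->enable */
};

/* Drop one reference; release the parent when the count reaches zero. */
static void clk_disable(struct clk *clk)
{
	assert(clk != NULL);
	if (clk->usecount > 0 && --clk->usecount == 0 && clk->parent)
		clk_disable(clk->parent);
}

/* Same control flow as the patched omap1_clk_enable(): one exit on error. */
static int clk_enable(struct clk *clk)
{
	int ret = 0;

	if (clk->usecount++ == 0) {
		if (clk->parent) {
			ret = clk_enable(clk->parent);
			if (ret)
				goto err;
		}

		ret = clk->enable(clk);
		if (ret) {
			if (clk->parent)
				clk_disable(clk->parent);
			goto err;
		}
	}
	return ret;

err:
	clk->usecount--;	/* undo the increment on every failure path */
	return ret;
}

/* Hypothetical callbacks used to exercise both outcomes. */
static int enable_ok(struct clk *clk)   { (void)clk; return 0; }
static int enable_fail(struct clk *clk) { (void)clk; return -5; }
```

With a parentless clock whose callback is enable_fail, clk_enable() returns the error and usecount is back at zero, which is the case the single "err" label is meant to cover.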
diff --git a/arch/arm/mach-pxa/Kconfig b/arch/arm/mach-pxa/Kconfig
index 96a2006cb597919566f0a95150ad196ca2555ace..3e66d9099eab2bd329b9c2a877a741969e32b9ee 100644
--- a/arch/arm/mach-pxa/Kconfig
+++ b/arch/arm/mach-pxa/Kconfig
@@ -343,6 +343,15 @@ config ARCH_PXA_PALM
         bool "PXA based Palm PDAs"
         select HAVE_PWM
  
+config MACH_PALMTE2
+       bool "Palm Tungsten|E2"
+       default y
+       depends on ARCH_PXA_PALM
+       select PXA25x
+       help
+         Say Y here if you intend to run this kernel on a Palm Tungsten|E2
+         handheld computer.
+
  config MACH_PALMT5
         bool "Palm Tungsten|T5"
         default y
diff --git a/arch/arm/mach-pxa/Makefile b/arch/arm/mach-pxa/Makefile
index c80e1bac4945a71a5759dc87ec4fa81b94c15db5..682dbf4e14b06e25bac7ade81d38ac117d9079ef 100644
--- a/arch/arm/mach-pxa/Makefile
+++ b/arch/arm/mach-pxa/Makefile
@@ -57,6 +57,7 @@ obj-$(CONFIG_MACH_E740)               += e740.o
  obj-$(CONFIG_MACH_E750)                += e750.o
  obj-$(CONFIG_MACH_E400)                += e400.o
  obj-$(CONFIG_MACH_E800)                += e800.o
+obj-$(CONFIG_MACH_PALMTE2)     += palmte2.o
  obj-$(CONFIG_MACH_PALMT5)      += palmt5.o
  obj-$(CONFIG_MACH_PALMTX)      += palmtx.o
  obj-$(CONFIG_MACH_PALMLD)      += palmld.o
diff --git a/arch/arm/mach-pxa/cm-x2xx.c b/arch/arm/mach-pxa/cm-x2xx.c
index 117b5435f8d572358e98fc4b63952c79e9c06f84..b50ef39eabfcdadf5751b34d69d79bdeb3fefaf2 100644
--- a/arch/arm/mach-pxa/cm-x2xx.c
+++ b/arch/arm/mach-pxa/cm-x2xx.c
@@ -121,7 +121,7 @@ static inline void cmx2xx_init_dm9000(void) {}
  /* UCB1400 touchscreen controller */
  #if defined(CONFIG_TOUCHSCREEN_UCB1400) || defined(CONFIG_TOUCHSCREEN_UCB1400_MODULE)
  static struct platform_device cmx2xx_ts_device = {
-       .name           = "ucb1400_ts",
+       .name           = "ucb1400_core",
         .id             = -1,
  };
  
diff --git a/arch/arm/mach-pxa/colibri-pxa300.c b/arch/arm/mach-pxa/colibri-pxa300.c
index 10c2eaf932306c5cd0d30ce8402e13114bcf47df..7c9c34c19ae2781cd058a4bdf3516d374dc1c1b2 100644
--- a/arch/arm/mach-pxa/colibri-pxa300.c
+++ b/arch/arm/mach-pxa/colibri-pxa300.c
@@ -15,7 +15,7 @@
  #include <linux/kernel.h>
  #include <linux/platform_device.h>
  #include <linux/gpio.h>
-#include <net/ax88796.h>
+#include <linux/interrupt.h>
  
  #include <asm/mach-types.h>
  #include <asm/sizes.h>
@@ -32,12 +32,13 @@
  
  #if defined(CONFIG_AX88796)
  #define COLIBRI_ETH_IRQ_GPIO   mfp_to_gpio(GPIO26_GPIO)
+
  /*
   * Asix AX88796 Ethernet
   */
  static struct ax_plat_data colibri_asix_platdata = {
-       .flags          = AXFLG_MAC_FROMDEV,
-       .wordlength     = 2
+       .flags          = 0, /* defined later */
+       .wordlength     = 2,
  };
  
  static struct resource colibri_asix_resource[] = {
@@ -49,7 +50,7 @@ static struct resource colibri_asix_resource[] = {
         [1] = {
                 .start = gpio_to_irq(COLIBRI_ETH_IRQ_GPIO),
                 .end   = gpio_to_irq(COLIBRI_ETH_IRQ_GPIO),
-               .flags = IORESOURCE_IRQ
+               .flags = IORESOURCE_IRQ | IRQF_TRIGGER_FALLING,
         }
  };
  
@@ -70,8 +71,8 @@ static mfp_cfg_t colibri_pxa300_eth_pin_config[] __initdata = {
  
  static void __init colibri_pxa300_init_eth(void)
  {
+       colibri_pxa3xx_init_eth(&colibri_asix_platdata);
         pxa3xx_mfp_config(ARRAY_AND_SIZE(colibri_pxa300_eth_pin_config));
-       set_irq_type(gpio_to_irq(COLIBRI_ETH_IRQ_GPIO), IRQ_TYPE_EDGE_FALLING);
         platform_device_register(&asix_device);
  }
  #else
diff --git a/arch/arm/mach-pxa/colibri-pxa320.c b/arch/arm/mach-pxa/colibri-pxa320.c
index 55b74a7a6151b6f3e08e702348f6113b93a2da32..a18d37b3c5e65dbb79a25514ed15f370f3032364 100644
--- a/arch/arm/mach-pxa/colibri-pxa320.c
+++ b/arch/arm/mach-pxa/colibri-pxa320.c
@@ -15,7 +15,7 @@
  #include <linux/kernel.h>
  #include <linux/platform_device.h>
  #include <linux/gpio.h>
-#include <net/ax88796.h>
+#include <linux/interrupt.h>
  
  #include <asm/mach-types.h>
  #include <asm/sizes.h>
@@ -38,8 +38,8 @@
   * Asix AX88796 Ethernet
   */
  static struct ax_plat_data colibri_asix_platdata = {
-       .flags          = AXFLG_MAC_FROMDEV,
-       .wordlength     = 2
+       .flags          = 0, /* defined later */
+       .wordlength     = 2,
  };
  
  static struct resource colibri_asix_resource[] = {
@@ -51,7 +51,7 @@ static struct resource colibri_asix_resource[] = {
         [1] = {
                 .start = gpio_to_irq(COLIBRI_ETH_IRQ_GPIO),
                 .end   = gpio_to_irq(COLIBRI_ETH_IRQ_GPIO),
-               .flags = IORESOURCE_IRQ
+               .flags = IORESOURCE_IRQ | IRQF_TRIGGER_FALLING,
         }
  };
  
@@ -72,8 +72,8 @@ static mfp_cfg_t colibri_pxa320_eth_pin_config[] __initdata = {
  
  static void __init colibri_pxa320_init_eth(void)
  {
+       colibri_pxa3xx_init_eth(&colibri_asix_platdata);
         pxa3xx_mfp_config(ARRAY_AND_SIZE(colibri_pxa320_eth_pin_config));
-       set_irq_type(gpio_to_irq(COLIBRI_ETH_IRQ_GPIO), IRQ_TYPE_EDGE_FALLING);
         platform_device_register(&asix_device);
  }
  #else
diff --git a/arch/arm/mach-pxa/colibri-pxa3xx.c b/arch/arm/mach-pxa/colibri-pxa3xx.c
index 12d0afc54aa5342cd97cda940d3206d98866dab9..ea34e34f8cd82e4a8104d5d127c857cc93426fc7 100644
--- a/arch/arm/mach-pxa/colibri-pxa3xx.c
+++ b/arch/arm/mach-pxa/colibri-pxa3xx.c
@@ -14,6 +14,7 @@
  #include <linux/kernel.h>
  #include <linux/platform_device.h>
  #include <linux/gpio.h>
+#include <linux/etherdevice.h>
  #include <asm/mach-types.h>
  #include <mach/hardware.h>
  #include <asm/sizes.h>
@@ -28,6 +29,40 @@
  #include "generic.h"
  #include "devices.h"
  
+#if defined(CONFIG_AX88796)
+#define ETHER_ADDR_LEN 6
+static u8 ether_mac_addr[ETHER_ADDR_LEN];
+
+void __init colibri_pxa3xx_init_eth(struct ax_plat_data *plat_data)
+{
+       int i;
+       u64 serial = ((u64) system_serial_high << 32) | system_serial_low;
+
+       /*
+        * If the bootloader passed in a serial boot tag, which contains a
+        * valid ethernet MAC, pass it to the interface. Toradex ships the
+        * modules with their own bootloader which provides a valid MAC
+        * this way.
+        */
+
+       for (i = 0; i < ETHER_ADDR_LEN; i++) {
+               ether_mac_addr[i] = serial & 0xff;
+               serial >>= 8;
+       }
+
+       if (is_valid_ether_addr(ether_mac_addr)) {
+               plat_data->flags |= AXFLG_MAC_FROMPLATFORM;
+               plat_data->mac_addr = ether_mac_addr;
+               printk(KERN_INFO "%s(): taking MAC from serial boot tag\n",
+                       __func__);
+       } else {
+               plat_data->flags |= AXFLG_MAC_FROMDEV;
+               printk(KERN_INFO "%s(): no valid serial boot tag found, "
+                       "taking MAC from device\n", __func__);
+       }
+}
+#endif
+
  #if defined(CONFIG_MMC_PXA) || defined(CONFIG_MMC_PXA_MODULE)
  static int mmc_detect_pin;
  
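Reviewer note: colibri_pxa3xx_init_eth() above unpacks the 64-bit board serial into six MAC bytes, lowest byte first, and falls back to the device's own MAC when the result is not a valid unicast address. The userspace sketch below reproduces that byte order and validity check so it can be tried outside the kernel. It is an approximation under assumptions: mac_is_valid() is a hypothetical stand-in for is_valid_ether_addr() (rejecting the all-zero address and any address with the multicast bit set), and mac_from_serial() is a hypothetical name, not a kernel function.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define ETHER_ADDR_LEN 6

/* Stand-in for is_valid_ether_addr(): non-zero and not multicast. */
static bool mac_is_valid(const uint8_t *addr)
{
	uint8_t any = 0;

	for (int i = 0; i < ETHER_ADDR_LEN; i++)
		any |= addr[i];

	return any != 0 && (addr[0] & 0x01) == 0;
}

/*
 * Fill mac[] from the serial number, least significant byte first,
 * mirroring the loop in the patch. Returns true if the result could be
 * used as a platform-provided MAC, false if the caller should fall
 * back to the MAC stored in the device.
 */
static bool mac_from_serial(uint32_t serial_high, uint32_t serial_low,
			    uint8_t *mac)
{
	uint64_t serial = ((uint64_t)serial_high << 32) | serial_low;

	assert(mac != NULL);
	for (int i = 0; i < ETHER_ADDR_LEN; i++) {
		mac[i] = serial & 0xff;
		serial >>= 8;
	}

	return mac_is_valid(mac);
}
```

A serial of zero (no boot tag passed) yields an all-zero address and therefore the fallback path, which matches the AXFLG_MAC_FROMDEV branch in the patch.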
diff --git a/arch/arm/mach-pxa/csb701.c b/arch/arm/mach-pxa/csb701.c
index 4a2a2952c3748e0c50f7147a1a247ef5cf4790a0..5a221a49ea4de38bb37df529e0271baa9355ca31 100644
--- a/arch/arm/mach-pxa/csb701.c
+++ b/arch/arm/mach-pxa/csb701.c
@@ -5,6 +5,8 @@
  #include <linux/input.h>
  #include <linux/leds.h>
  
+#include <asm/mach-types.h>
+
  static struct gpio_keys_button csb701_buttons[] = {
         {
                 .code   = 0x7,
@@ -54,6 +56,9 @@ static struct platform_device *devices[] __initdata = {
  
  static int __init csb701_init(void)
  {
+       if (!machine_is_csb726())
+               return -ENODEV;
+
         return platform_add_devices(devices, ARRAY_SIZE(devices));
  }
  
diff --git a/arch/arm/mach-pxa/e740.c b/arch/arm/mach-pxa/e740.c
index 07500a04fd8c09f91ae6179a0ac4f1b8fdfec96a..a36fc17f671d937f73e8e0fe8626a2e271afa1e2 100644
--- a/arch/arm/mach-pxa/e740.c
+++ b/arch/arm/mach-pxa/e740.c
@@ -29,6 +29,7 @@
  #include <mach/udc.h>
  #include <mach/irda.h>
  #include <mach/irqs.h>
+#include <mach/audio.h>
  
  #include "generic.h"
  #include "eseries.h"
@@ -197,6 +198,7 @@ static void __init e740_init(void)
         eseries_get_tmio_gpios();
         platform_add_devices(devices, ARRAY_SIZE(devices));
         pxa_set_udc_info(&e7xx_udc_mach_info);
+       pxa_set_ac97_info(NULL);
         e7xx_irda_init();
         pxa_set_ficp_info(&e7xx_ficp_platform_data);
  }
diff --git a/arch/arm/mach-pxa/e750.c b/arch/arm/mach-pxa/e750.c
index 6126c04e02bcfd610a7718f5750de41f11f95f66..1d00110590e5c7dee25cca9918e1c6591a798526 100644
--- a/arch/arm/mach-pxa/e750.c
+++ b/arch/arm/mach-pxa/e750.c
@@ -28,6 +28,7 @@
  #include <mach/udc.h>
  #include <mach/irda.h>
  #include <mach/irqs.h>
+#include <mach/audio.h>
  
  #include "generic.h"
  #include "eseries.h"
@@ -198,6 +199,7 @@ static void __init e750_init(void)
         eseries_get_tmio_gpios();
         platform_add_devices(devices, ARRAY_SIZE(devices));
         pxa_set_udc_info(&e7xx_udc_mach_info);
+       pxa_set_ac97_info(NULL);
         e7xx_irda_init();
         pxa_set_ficp_info(&e7xx_ficp_platform_data);
  }
diff --git a/arch/arm/mach-pxa/e800.c b/arch/arm/mach-pxa/e800.c

index 74ab09812a72a1444e3b7af9d59e0999b263b854..9866c7b9e78416efc55a2f810e9ccee8359f5dfd 100644 (file)
--- a/arch/arm/mach-pxa/e800.c
+++ b/arch/arm/mach-pxa/e800.c
@@ -27,6 +27,7 @@
  #include <mach/eseries-gpio.h>
  #include <mach/udc.h>
  #include <mach/irqs.h>
+#include <mach/audio.h>
  
  #include "generic.h"
  #include "eseries.h"
@@ -199,6 +200,7 @@ static void __init e800_init(void)
         eseries_get_tmio_gpios();
         platform_add_devices(devices, ARRAY_SIZE(devices));
         pxa_set_udc_info(&e800_udc_mach_info);
+       pxa_set_ac97_info(NULL);
  }
  
  MACHINE_START(E800, "Toshiba e800")
diff --git a/arch/arm/mach-pxa/em-x270.c b/arch/arm/mach-pxa/em-x270.c

index 920dfb8d36dabcb686159e8d24427c9d5ecd7ff3..67611dadb44ec0f7f811b5859e8d89f52518e093 100644 (file)
--- a/arch/arm/mach-pxa/em-x270.c
+++ b/arch/arm/mach-pxa/em-x270.c
@@ -25,8 +25,10 @@
  #include <linux/regulator/machine.h>
  #include <linux/spi/spi.h>
  #include <linux/spi/tdo24m.h>
+#include <linux/spi/libertas_spi.h>
  #include <linux/power_supply.h>
  #include <linux/apm-emulation.h>
+#include <linux/delay.h>
  
  #include <media/soc_camera.h>
  
@@ -62,6 +64,8 @@
  #define GPIO93_CAM_RESET       (93)
  #define GPIO41_ETHIRQ          (41)
  #define EM_X270_ETHIRQ         IRQ_GPIO(GPIO41_ETHIRQ)
+#define GPIO115_WLAN_PWEN      (115)
+#define GPIO19_WLAN_STRAP      (19)
  
  static int mmc_cd;
  static int nand_rb;
@@ -159,8 +163,8 @@ static unsigned long common_pin_config[] = {
         GPIO57_SSP1_TXD,
  
         /* SSP2 */
-       GPIO19_SSP2_SCLK,
-       GPIO14_SSP2_SFRM,
+       GPIO19_GPIO,    /* SSP2 clock is used as GPIO for Libertas pin-strap */
+       GPIO14_GPIO,
         GPIO89_SSP2_TXD,
         GPIO88_SSP2_RXD,
  
@@ -648,20 +652,86 @@ static struct tdo24m_platform_data em_x270_tdo24m_pdata = {
         .model = TDO35S,
  };
  
+static struct pxa2xx_spi_master em_x270_spi_2_info = {
+       .num_chipselect = 1,
+       .enable_dma     = 1,
+};
+
+static struct pxa2xx_spi_chip em_x270_libertas_chip = {
+       .rx_threshold   = 1,
+       .tx_threshold   = 1,
+       .timeout        = 1000,
+};
+
+static unsigned long em_x270_libertas_pin_config[] = {
+       /* SSP2 */
+       GPIO19_SSP2_SCLK,
+       GPIO14_GPIO,
+       GPIO89_SSP2_TXD,
+       GPIO88_SSP2_RXD,
+};
+
+static int em_x270_libertas_setup(struct spi_device *spi)
+{
+       int err = gpio_request(GPIO115_WLAN_PWEN, "WLAN PWEN");
+       if (err)
+               return err;
+
+       gpio_direction_output(GPIO19_WLAN_STRAP, 1);
+       mdelay(100);
+
+       pxa2xx_mfp_config(ARRAY_AND_SIZE(em_x270_libertas_pin_config));
+
+       gpio_direction_output(GPIO115_WLAN_PWEN, 0);
+       mdelay(100);
+       gpio_set_value(GPIO115_WLAN_PWEN, 1);
+       mdelay(100);
+
+       spi->bits_per_word = 16;
+       spi_setup(spi);
+
+       return 0;
+}
+
+static int em_x270_libertas_teardown(struct spi_device *spi)
+{
+       gpio_set_value(GPIO115_WLAN_PWEN, 0);
+       gpio_free(GPIO115_WLAN_PWEN);
+
+       return 0;
+}
+
+struct libertas_spi_platform_data em_x270_libertas_pdata = {
+       .use_dummy_writes       = 1,
+       .gpio_cs                = 14,
+       .setup                  = em_x270_libertas_setup,
+       .teardown               = em_x270_libertas_teardown,
+};
+
  static struct spi_board_info em_x270_spi_devices[] __initdata = {
         {
-               .modalias = "tdo24m",
-               .max_speed_hz = 1000000,
-               .bus_num = 1,
-               .chip_select = 0,
-               .controller_data = &em_x270_tdo24m_chip,
-               .platform_data = &em_x270_tdo24m_pdata,
+               .modalias               = "tdo24m",
+               .max_speed_hz           = 1000000,
+               .bus_num                = 1,
+               .chip_select            = 0,
+               .controller_data        = &em_x270_tdo24m_chip,
+               .platform_data          = &em_x270_tdo24m_pdata,
+       },
+       {
+               .modalias               = "libertas_spi",
+               .max_speed_hz           = 13000000,
+               .bus_num                = 2,
+               .irq                    = IRQ_GPIO(116),
+               .chip_select            = 0,
+               .controller_data        = &em_x270_libertas_chip,
+               .platform_data          = &em_x270_libertas_pdata,
         },
  };
  
  static void __init em_x270_init_spi(void)
  {
         pxa2xx_set_spi_info(1, &em_x270_spi_info);
+       pxa2xx_set_spi_info(2, &em_x270_spi_2_info);
         spi_register_board_info(ARRAY_AND_SIZE(em_x270_spi_devices));
  }
  #else
diff --git a/arch/arm/mach-pxa/include/mach/colibri.h b/arch/arm/mach-pxa/include/mach/colibri.h

index 3f2a01d6a03c6563c377761a42ba586f2cd9d5bd..90230c6f9925c0d0d25e2ffc92eb9025522850b8 100644 (file)
--- a/arch/arm/mach-pxa/include/mach/colibri.h
+++ b/arch/arm/mach-pxa/include/mach/colibri.h
@@ -1,5 +1,8 @@
  #ifndef _COLIBRI_H_
  #define _COLIBRI_H_
+
+#include <net/ax88796.h>
+
  /*
   * common settings for all modules
   */
@@ -16,6 +19,10 @@ extern void colibri_pxa3xx_init_lcd(int bl_pin);
  static inline void colibri_pxa3xx_init_lcd(int) {}
  #endif
  
+#if defined(CONFIG_AX88796)
+extern void colibri_pxa3xx_init_eth(struct ax_plat_data *plat_data);
+#endif
+
  /* physical memory regions */
  #define COLIBRI_SDRAM_BASE     0xa0000000      /* SDRAM region */
  
diff --git a/arch/arm/mach-pxa/include/mach/magician.h b/arch/arm/mach-pxa/include/mach/magician.h

index 82a399f3f9f2b547c7a1ccec5615045547770423..20ef37d4a9a75b2567e73791d1ba8ee5056ae563 100644 (file)
--- a/arch/arm/mach-pxa/include/mach/magician.h
+++ b/arch/arm/mach-pxa/include/mach/magician.h
@@ -27,7 +27,7 @@
  #define GPIO22_MAGICIAN_VIBRA_EN               22
  #define GPIO26_MAGICIAN_GSM_POWER              26
  #define GPIO27_MAGICIAN_USBC_PUEN              27
-#define GPIO30_MAGICIAN_nCHARGE_EN             30
+#define GPIO30_MAGICIAN_BQ24022_nCHARGE_EN     30
  #define GPIO37_MAGICIAN_KEY_HANGUP             37
  #define GPIO38_MAGICIAN_KEY_CONTACTS           38
  #define GPIO40_MAGICIAN_GSM_OUT2               40
@@ -98,7 +98,7 @@
  #define EGPIO_MAGICIAN_UNKNOWN_WAVEDEV_DLL     MAGICIAN_EGPIO(2, 2)
  #define EGPIO_MAGICIAN_FLASH_VPP               MAGICIAN_EGPIO(2, 3)
  #define EGPIO_MAGICIAN_BL_POWER2               MAGICIAN_EGPIO(2, 4)
-#define EGPIO_MAGICIAN_CHARGE_EN               MAGICIAN_EGPIO(2, 5)
+#define EGPIO_MAGICIAN_BQ24022_ISET2           MAGICIAN_EGPIO(2, 5)
  #define EGPIO_MAGICIAN_GSM_POWER               MAGICIAN_EGPIO(2, 7)
  
  /* input */
diff --git a/arch/arm/mach-pxa/include/mach/palmld.h b/arch/arm/mach-pxa/include/mach/palmld.h

index 7c295a48d78460fea21411ffcb7e75c4120ebaeb..fb13c82ad6dcdbe5f1c148b67e1705e55347d203 100644 (file)
--- a/arch/arm/mach-pxa/include/mach/palmld.h
+++ b/arch/arm/mach-pxa/include/mach/palmld.h
@@ -87,6 +87,7 @@
  #define PALMLD_IDE_SIZE                0x00100000
  
  #define PALMLD_PHYS_IO_START   0x40000000
+#define PALMLD_STR_BASE                0xa0200000
  
  /* BATTERY */
  #define PALMLD_BAT_MAX_VOLTAGE         4000    /* 4.00V maximum voltage */
diff --git a/arch/arm/mach-pxa/include/mach/palmt5.h b/arch/arm/mach-pxa/include/mach/palmt5.h

index 94db2881f04828f3c90ba2112506fc0c8766d8d1..052bfe788adae87ed0061fea2112e5435fc09bf9 100644 (file)
--- a/arch/arm/mach-pxa/include/mach/palmt5.h
+++ b/arch/arm/mach-pxa/include/mach/palmt5.h
@@ -59,6 +59,7 @@
  /* Various addresses  */
  #define PALMT5_PHYS_RAM_START  0xa0000000
  #define PALMT5_PHYS_IO_START   0x40000000
+#define PALMT5_STR_BASE                0xa0200000
  
  /* TOUCHSCREEN */
  #define AC97_LINK_FRAME                21
diff --git a/arch/arm/mach-pxa/include/mach/palmte2.h b/arch/arm/mach-pxa/include/mach/palmte2.h

new file mode 100644 (file)

index 0000000..1236134
--- /dev/null
+++ b/arch/arm/mach-pxa/include/mach/palmte2.h
@@ -0,0 +1,68 @@
+/*
+ * GPIOs and interrupts for Palm Tungsten|E2 Handheld Computer
+ *
+ * Author:
+ *             Carlos Eduardo Medaglia Dyonisio <cadu@nerdfeliz.com>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ */
+
+#ifndef _INCLUDE_PALMTE2_H_
+#define _INCLUDE_PALMTE2_H_
+
+/** HERE ARE GPIOs **/
+
+/* GPIOs */
+#define GPIO_NR_PALMTE2_POWER_DETECT           9
+#define GPIO_NR_PALMTE2_HOTSYNC_BUTTON_N       4
+#define GPIO_NR_PALMTE2_EARPHONE_DETECT                15
+
+/* SD/MMC */
+#define GPIO_NR_PALMTE2_SD_DETECT_N            10
+#define GPIO_NR_PALMTE2_SD_POWER               55
+#define GPIO_NR_PALMTE2_SD_READONLY            51
+
+/* IRDA - disable GPIO connected to SD pin of transceiver (TFBS4710?) ? */
+#define GPIO_NR_PALMTE2_IR_DISABLE             48
+
+/* USB */
+#define GPIO_NR_PALMTE2_USB_DETECT_N           35
+#define GPIO_NR_PALMTE2_USB_PULLUP             53
+
+/* LCD/BACKLIGHT */
+#define GPIO_NR_PALMTE2_BL_POWER               56
+#define GPIO_NR_PALMTE2_LCD_POWER              37
+
+/* KEYS */
+#define GPIO_NR_PALMTE2_KEY_NOTES      5
+#define GPIO_NR_PALMTE2_KEY_TASKS      7
+#define GPIO_NR_PALMTE2_KEY_CALENDAR   11
+#define GPIO_NR_PALMTE2_KEY_CONTACTS   13
+#define GPIO_NR_PALMTE2_KEY_CENTER     14
+#define GPIO_NR_PALMTE2_KEY_LEFT       19
+#define GPIO_NR_PALMTE2_KEY_RIGHT      20
+#define GPIO_NR_PALMTE2_KEY_DOWN       21
+#define GPIO_NR_PALMTE2_KEY_UP         22
+
+/** HERE ARE INIT VALUES **/
+
+/* BACKLIGHT */
+#define PALMTE2_MAX_INTENSITY          0xFE
+#define PALMTE2_DEFAULT_INTENSITY      0x7E
+#define PALMTE2_LIMIT_MASK             0x7F
+#define PALMTE2_PRESCALER              0x3F
+#define PALMTE2_PERIOD_NS              3500
+
+/* BATTERY */
+#define PALMTE2_BAT_MAX_VOLTAGE                4000    /* 4.00V maximum voltage */
+#define PALMTE2_BAT_MIN_VOLTAGE                3550    /* 3.55V critical voltage */
+#define PALMTE2_BAT_MAX_CURRENT                0       /* unknown */
+#define PALMTE2_BAT_MIN_CURRENT                0       /* unknown */
+#define PALMTE2_BAT_MAX_CHARGE         1       /* unknown */
+#define PALMTE2_BAT_MIN_CHARGE         1       /* unknown */
+#define PALMTE2_MAX_LIFE_MINS          360     /* on-life in minutes */
+
+#endif
diff --git a/arch/arm/mach-pxa/include/mach/palmtx.h b/arch/arm/mach-pxa/include/mach/palmtx.h

index 1e8bccbda51027a0fbb686ab8f33e6b321b3ecd0..9f7d62fb4cbb5797227ece0856c8d9b0049c7403 100644 (file)
--- a/arch/arm/mach-pxa/include/mach/palmtx.h
+++ b/arch/arm/mach-pxa/include/mach/palmtx.h
@@ -78,6 +78,8 @@
  #define PALMTX_PHYS_RAM_START  0xa0000000
  #define PALMTX_PHYS_IO_START   0x40000000
  
+#define PALMTX_STR_BASE                0xa0200000
+
  #define PALMTX_PHYS_FLASH_START        PXA_CS0_PHYS    /* ChipSelect 0 */
  #define PALMTX_PHYS_NAND_START PXA_CS1_PHYS    /* ChipSelect 1 */
  
diff --git a/arch/arm/mach-pxa/magician.c b/arch/arm/mach-pxa/magician.c

index deeea1c2782b4a68c1f2876bd1b92dc07f8d1ec5..c899bbd94dc06bad6fbbcb9d57133281667fe509 100644 (file)
--- a/arch/arm/mach-pxa/magician.c
+++ b/arch/arm/mach-pxa/magician.c
@@ -25,6 +25,8 @@
  #include <linux/mtd/physmap.h>
  #include <linux/pda_power.h>
  #include <linux/pwm_backlight.h>
+#include <linux/regulator/bq24022.h>
+#include <linux/regulator/machine.h>
  #include <linux/usb/gpio_vbus.h>
  
  #include <mach/hardware.h>
@@ -552,33 +554,7 @@ static struct platform_device gpio_vbus = {
  
  static int power_supply_init(struct device *dev)
  {
-       int ret;
-
-       ret = gpio_request(EGPIO_MAGICIAN_CABLE_STATE_AC, "CABLE_STATE_AC");
-       if (ret)
-               goto err_cs_ac;
-       ret = gpio_request(EGPIO_MAGICIAN_CABLE_STATE_USB, "CABLE_STATE_USB");
-       if (ret)
-               goto err_cs_usb;
-       ret = gpio_request(EGPIO_MAGICIAN_CHARGE_EN, "CHARGE_EN");
-       if (ret)
-               goto err_chg_en;
-       ret = gpio_request(GPIO30_MAGICIAN_nCHARGE_EN, "nCHARGE_EN");
-       if (!ret)
-               ret = gpio_direction_output(GPIO30_MAGICIAN_nCHARGE_EN, 0);
-       if (ret)
-               goto err_nchg_en;
-
-       return 0;
-
-err_nchg_en:
-       gpio_free(EGPIO_MAGICIAN_CHARGE_EN);
-err_chg_en:
-       gpio_free(EGPIO_MAGICIAN_CABLE_STATE_USB);
-err_cs_usb:
-       gpio_free(EGPIO_MAGICIAN_CABLE_STATE_AC);
-err_cs_ac:
-       return ret;
+       return gpio_request(EGPIO_MAGICIAN_CABLE_STATE_AC, "CABLE_STATE_AC");
  }
  
  static int magician_is_ac_online(void)
@@ -586,22 +562,8 @@ static int magician_is_ac_online(void)
         return gpio_get_value(EGPIO_MAGICIAN_CABLE_STATE_AC);
  }
  
-static int magician_is_usb_online(void)
-{
-       return gpio_get_value(EGPIO_MAGICIAN_CABLE_STATE_USB);
-}
-
-static void magician_set_charge(int flags)
-{
-       gpio_set_value(GPIO30_MAGICIAN_nCHARGE_EN, !flags);
-       gpio_set_value(EGPIO_MAGICIAN_CHARGE_EN, flags);
-}
-
  static void power_supply_exit(struct device *dev)
  {
-       gpio_free(GPIO30_MAGICIAN_nCHARGE_EN);
-       gpio_free(EGPIO_MAGICIAN_CHARGE_EN);
-       gpio_free(EGPIO_MAGICIAN_CABLE_STATE_USB);
         gpio_free(EGPIO_MAGICIAN_CABLE_STATE_AC);
  }
  
@@ -612,8 +574,6 @@ static char *magician_supplicants[] = {
  static struct pda_power_pdata power_supply_info = {
         .init            = power_supply_init,
         .is_ac_online    = magician_is_ac_online,
-       .is_usb_online   = magician_is_usb_online,
-       .set_charge      = magician_set_charge,
         .exit            = power_supply_exit,
         .supplied_to     = magician_supplicants,
         .num_supplicants = ARRAY_SIZE(magician_supplicants),
@@ -646,6 +606,43 @@ static struct platform_device power_supply = {
         .num_resources = ARRAY_SIZE(power_supply_resources),
  };
  
+/*
+ * Battery charger
+ */
+
+static struct regulator_consumer_supply bq24022_consumers[] = {
+       {
+               .dev = &gpio_vbus.dev,
+               .supply = "vbus_draw",
+       },
+       {
+               .dev = &power_supply.dev,
+               .supply = "ac_draw",
+       },
+};
+
+static struct regulator_init_data bq24022_init_data = {
+       .constraints = {
+               .max_uA         = 500000,
+               .valid_ops_mask = REGULATOR_CHANGE_CURRENT,
+       },
+       .num_consumer_supplies  = ARRAY_SIZE(bq24022_consumers),
+       .consumer_supplies      = bq24022_consumers,
+};
+
+static struct bq24022_mach_info bq24022_info = {
+       .gpio_nce   = GPIO30_MAGICIAN_BQ24022_nCHARGE_EN,
+       .gpio_iset2 = EGPIO_MAGICIAN_BQ24022_ISET2,
+       .init_data  = &bq24022_init_data,
+};
+
+static struct platform_device bq24022 = {
+       .name = "bq24022",
+       .id   = -1,
+       .dev  = {
+               .platform_data = &bq24022_info,
+       },
+};
  
  /*
   * MMC/SD
@@ -756,6 +753,7 @@ static struct platform_device *devices[] __initdata = {
         &egpio,
         &backlight,
         &pasic3,
+       &bq24022,
         &gpio_vbus,
         &power_supply,
         &strataflash,
diff --git a/arch/arm/mach-pxa/mioa701.c b/arch/arm/mach-pxa/mioa701.c

index 97c93a7a285c12b70a7249c25cd9f47ee1b2f493..9203b069b35c7483a801cb8e7e9ba97753a2853f 100644 (file)
--- a/arch/arm/mach-pxa/mioa701.c
+++ b/arch/arm/mach-pxa/mioa701.c
@@ -50,6 +50,7 @@
  #include <mach/pxa27x-udc.h>
  #include <mach/i2c.h>
  #include <mach/camera.h>
+#include <mach/audio.h>
  #include <media/soc_camera.h>
  
  #include <mach/mioa701.h>
@@ -763,8 +764,6 @@ MIO_PARENT_DEV(mioa701_backlight, "pwm-backlight",  &pxa27x_device_pwm0.dev,
                 &mioa701_backlight_data);
  MIO_SIMPLE_DEV(mioa701_led,      "leds-gpio",      &gpio_led_info)
  MIO_SIMPLE_DEV(pxa2xx_pcm,       "pxa2xx-pcm",     NULL)
-MIO_SIMPLE_DEV(pxa2xx_ac97,      "pxa2xx-ac97",    NULL)
-MIO_PARENT_DEV(mio_wm9713_codec,  "wm9713-codec",   &pxa2xx_ac97.dev, NULL)
  MIO_SIMPLE_DEV(mioa701_sound,    "mioa701-wm9713", NULL)
  MIO_SIMPLE_DEV(mioa701_board,    "mioa701-board",  NULL)
  MIO_SIMPLE_DEV(gpio_vbus,        "gpio-vbus",      &gpio_vbus_data);
@@ -774,8 +773,6 @@ static struct platform_device *devices[] __initdata = {
         &mioa701_backlight,
         &mioa701_led,
         &pxa2xx_pcm,
-       &pxa2xx_ac97,
-       &mio_wm9713_codec,
         &mioa701_sound,
         &power_dev,
         &strataflash,
@@ -818,6 +815,7 @@ static void __init mioa701_machine_init(void)
         pxa_set_keypad_info(&mioa701_keypad_info);
         wm97xx_bat_set_pdata(&mioa701_battery_data);
         pxa_set_udc_info(&mioa701_udc_info);
+       pxa_set_ac97_info(NULL);
         pm_power_off = mioa701_poweroff;
         arm_pm_restart = mioa701_restart;
         platform_add_devices(devices, ARRAY_SIZE(devices));
diff --git a/arch/arm/mach-pxa/palmld.c b/arch/arm/mach-pxa/palmld.c

index 8587477a9bb7d52ae822551b27f48376d225a6b8..ecf5910e39d770d40a86e5a994f32b8305c2af45 100644 (file)
--- a/arch/arm/mach-pxa/palmld.c
+++ b/arch/arm/mach-pxa/palmld.c
@@ -24,6 +24,7 @@
  #include <linux/gpio.h>
  #include <linux/wm97xx_batt.h>
  #include <linux/power_supply.h>
+#include <linux/sysdev.h>
  
  #include <asm/mach-types.h>
  #include <asm/mach/arch.h>
@@ -68,10 +69,10 @@ static unsigned long palmld_pin_config[] __initdata = {
         GPIO47_FICP_TXD,
  
         /* MATRIX KEYPAD */
-       GPIO100_KP_MKIN_0,
-       GPIO101_KP_MKIN_1,
-       GPIO102_KP_MKIN_2,
-       GPIO97_KP_MKIN_3,
+       GPIO100_KP_MKIN_0 | WAKEUP_ON_LEVEL_HIGH,
+       GPIO101_KP_MKIN_1 | WAKEUP_ON_LEVEL_HIGH,
+       GPIO102_KP_MKIN_2 | WAKEUP_ON_LEVEL_HIGH,
+       GPIO97_KP_MKIN_3 | WAKEUP_ON_LEVEL_HIGH,
         GPIO103_KP_MKOUT_0,
         GPIO104_KP_MKOUT_1,
         GPIO105_KP_MKOUT_2,
@@ -506,6 +507,33 @@ static struct pxafb_mach_info palmld_lcd_screen = {
         .lcd_conn       = LCD_COLOR_TFT_16BPP | LCD_PCLK_EDGE_FALL,
  };
  
+/******************************************************************************
+ * Power management - standby
+ ******************************************************************************/
+#ifdef CONFIG_PM
+static u32 *addr __initdata;
+static u32 resume[3] __initdata = {
+       0xe3a00101,     /* mov  r0,     #0x40000000 */
+       0xe380060f,     /* orr  r0, r0, #0x00f00000 */
+       0xe590f008,     /* ldr  pc, [r0, #0x08] */
+};
+
+static int __init palmld_pm_init(void)
+{
+       int i;
+
+       /* this is where the bootloader jumps */
+       addr = phys_to_virt(PALMLD_STR_BASE);
+
+       for (i = 0; i < 3; i++)
+               addr[i] = resume[i];
+
+       return 0;
+}
+
+device_initcall(palmld_pm_init);
+#endif
+
  /******************************************************************************
   * Machine init
   ******************************************************************************/
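Note on the resume stub added in the palmld.c hunk above (the palmt5.c change uses the same three words): the opcodes can be checked by hand against their comments. The sketch below is not kernel code. It is a standalone user-space C fragment that decodes the ARM rotated-immediate operands and computes the address the final "ldr pc" reads from. The helper names ror32, arm_dp_imm and resume_vector_addr are made up for this illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Rotate a 32-bit value right by n bits (n taken modulo 32). */
static uint32_t ror32(uint32_t v, unsigned n)
{
	n &= 31;
	return n ? (v >> n) | (v << (32 - n)) : v;
}

/*
 * Decode the immediate operand of an ARM data-processing instruction:
 * bits [7:0] hold imm8, bits [11:8] hold a rotation in units of two bits.
 */
static uint32_t arm_dp_imm(uint32_t insn)
{
	assert(insn & (1u << 25));	/* I bit set: operand is an immediate */
	return ror32(insn & 0xff, ((insn >> 8) & 0xf) * 2);
}

/*
 * Address loaded into pc by a three-word stub of the form
 *   mov r0, #imm ; orr r0, r0, #imm ; ldr pc, [r0, #imm12]
 * (hypothetical helper, assumes exactly that instruction sequence).
 */
static uint32_t resume_vector_addr(const uint32_t stub[3])
{
	uint32_t r0 = arm_dp_imm(stub[0]);	/* mov r0, #imm */

	r0 |= arm_dp_imm(stub[1]);		/* orr r0, r0, #imm */
	return r0 + (stub[2] & 0xfff);		/* ldr pc, [r0, #imm12] */
}
```

Fed the three words from resume[], the helpers give 0x40000000 and 0x00f00000 for the two immediates, matching the comments in the patch, and 0x40f00008 as the load address. On PXA2xx that address appears to be the power manager scratch pad register, which would explain why the stub jumps through it, but treat that reading as an assumption and check it against the processor manual.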
diff --git a/arch/arm/mach-pxa/palmt5.c b/arch/arm/mach-pxa/palmt5.c

index 9521c7b334927f28d79eda5cc1b199fe23d3fab9..0680f1a575a33556e41e40b75ff83da0886dd178 100644 (file)
--- a/arch/arm/mach-pxa/palmt5.c
+++ b/arch/arm/mach-pxa/palmt5.c
@@ -75,10 +75,10 @@ static unsigned long palmt5_pin_config[] __initdata = {
         GPIO95_GPIO,    /* usb power */
  
         /* MATRIX KEYPAD */
-       GPIO100_KP_MKIN_0,
-       GPIO101_KP_MKIN_1,
-       GPIO102_KP_MKIN_2,
-       GPIO97_KP_MKIN_3,
+       GPIO100_KP_MKIN_0 | WAKEUP_ON_LEVEL_HIGH,
+       GPIO101_KP_MKIN_1 | WAKEUP_ON_LEVEL_HIGH,
+       GPIO102_KP_MKIN_2 | WAKEUP_ON_LEVEL_HIGH,
+       GPIO97_KP_MKIN_3 | WAKEUP_ON_LEVEL_HIGH,
         GPIO103_KP_MKOUT_0,
         GPIO104_KP_MKOUT_1,
         GPIO105_KP_MKOUT_2,
@@ -449,6 +449,33 @@ static struct pxafb_mach_info palmt5_lcd_screen = {
         .lcd_conn       = LCD_COLOR_TFT_16BPP | LCD_PCLK_EDGE_FALL,
  };
  
+/******************************************************************************
+ * Power management - standby
+ ******************************************************************************/
+#ifdef CONFIG_PM
+static u32 *addr __initdata;
+static u32 resume[3] __initdata = {
+       0xe3a00101,     /* mov  r0,     #0x40000000 */
+       0xe380060f,     /* orr  r0, r0, #0x00f00000 */
+       0xe590f008,     /* ldr  pc, [r0, #0x08] */
+};
+
+static int __init palmt5_pm_init(void)
+{
+       int i;
+
+       /* this is where the bootloader jumps */
+       addr = phys_to_virt(PALMT5_STR_BASE);
+
+       for (i = 0; i < 3; i++)
+               addr[i] = resume[i];
+
+       return 0;
+}
+
+device_initcall(palmt5_pm_init);
+#endif
+
  /******************************************************************************
   * Machine init
   ******************************************************************************/
diff --git a/arch/arm/mach-pxa/palmte2.c b/arch/arm/mach-pxa/palmte2.c

new file mode 100644 (file)

index 0000000..43fcf2e
--- /dev/null
+++ b/arch/arm/mach-pxa/palmte2.c
@@ -0,0 +1,466 @@
+/*
+ * Hardware definitions for Palm Tungsten|E2
+ *
+ * Author:
+ *     Carlos Eduardo Medaglia Dyonisio <cadu@nerdfeliz.com>
+ *
+ * Rewrite for mainline:
+ *     Marek Vasut <marek.vasut@gmail.com>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * (find more info at www.hackndev.com)
+ *
+ */
+
+#include <linux/platform_device.h>
+#include <linux/delay.h>
+#include <linux/irq.h>
+#include <linux/gpio_keys.h>
+#include <linux/input.h>
+#include <linux/pda_power.h>
+#include <linux/pwm_backlight.h>
+#include <linux/gpio.h>
+#include <linux/wm97xx_batt.h>
+#include <linux/power_supply.h>
+
+#include <asm/mach-types.h>
+#include <asm/mach/arch.h>
+#include <asm/mach/map.h>
+
+#include <mach/audio.h>
+#include <mach/palmte2.h>
+#include <mach/mmc.h>
+#include <mach/pxafb.h>
+#include <mach/mfp-pxa25x.h>
+#include <mach/irda.h>
+#include <mach/udc.h>
+
+#include "generic.h"
+#include "devices.h"
+
+/******************************************************************************
+ * Pin configuration
+ ******************************************************************************/
+static unsigned long palmte2_pin_config[] __initdata = {
+       /* MMC */
+       GPIO6_MMC_CLK,
+       GPIO8_MMC_CS0,
+       GPIO10_GPIO,    /* SD detect */
+       GPIO55_GPIO,    /* SD power */
+       GPIO51_GPIO,    /* SD r/o switch */
+
+       /* AC97 */
+       GPIO28_AC97_BITCLK,
+       GPIO29_AC97_SDATA_IN_0,
+       GPIO30_AC97_SDATA_OUT,
+       GPIO31_AC97_SYNC,
+
+       /* PWM */
+       GPIO16_PWM0_OUT,
+
+       /* USB */
+       GPIO15_GPIO,    /* usb detect */
+       GPIO53_GPIO,    /* usb power */
+
+       /* IrDA */
+       GPIO48_GPIO,    /* ir disable */
+       GPIO46_FICP_RXD,
+       GPIO47_FICP_TXD,
+
+       /* LCD */
+       GPIO58_LCD_LDD_0,
+       GPIO59_LCD_LDD_1,
+       GPIO60_LCD_LDD_2,
+       GPIO61_LCD_LDD_3,
+       GPIO62_LCD_LDD_4,
+       GPIO63_LCD_LDD_5,
+       GPIO64_LCD_LDD_6,
+       GPIO65_LCD_LDD_7,
+       GPIO66_LCD_LDD_8,
+       GPIO67_LCD_LDD_9,
+       GPIO68_LCD_LDD_10,
+       GPIO69_LCD_LDD_11,
+       GPIO70_LCD_LDD_12,
+       GPIO71_LCD_LDD_13,
+       GPIO72_LCD_LDD_14,
+       GPIO73_LCD_LDD_15,
+       GPIO74_LCD_FCLK,
+       GPIO75_LCD_LCLK,
+       GPIO76_LCD_PCLK,
+       GPIO77_LCD_BIAS,
+
+       /* GPIO KEYS */
+       GPIO5_GPIO,     /* notes */
+       GPIO7_GPIO,     /* tasks */
+       GPIO11_GPIO,    /* calendar */
+       GPIO13_GPIO,    /* contacts */
+       GPIO14_GPIO,    /* center */
+       GPIO19_GPIO,    /* left */
+       GPIO20_GPIO,    /* right */
+       GPIO21_GPIO,    /* down */
+       GPIO22_GPIO,    /* up */
+
+       /* MISC */
+       GPIO1_RST,      /* reset */
+       GPIO4_GPIO,     /* Hotsync button */
+       GPIO9_GPIO,     /* power detect */
+       GPIO37_GPIO,    /* LCD power */
+       GPIO56_GPIO,    /* Backlight power */
+};
+
+/******************************************************************************
+ * SD/MMC card controller
+ ******************************************************************************/
+static int palmte2_mci_init(struct device *dev,
+                               irq_handler_t palmte2_detect_int, void *data)
+{
+       int err = 0;
+
+       /* Setup an interrupt for detecting card insert/remove events */
+       err = gpio_request(GPIO_NR_PALMTE2_SD_DETECT_N, "SD IRQ");
+       if (err)
+               goto err;
+       err = gpio_direction_input(GPIO_NR_PALMTE2_SD_DETECT_N);
+       if (err)
+               goto err2;
+       err = request_irq(gpio_to_irq(GPIO_NR_PALMTE2_SD_DETECT_N),
+                       palmte2_detect_int, IRQF_DISABLED | IRQF_SAMPLE_RANDOM |
+                       IRQF_TRIGGER_FALLING | IRQF_TRIGGER_RISING,
+                       "SD/MMC card detect", data);
+       if (err) {
+               printk(KERN_ERR "%s: cannot request SD/MMC card detect IRQ\n",
+                               __func__);
+               goto err2;
+       }
+
+       err = gpio_request(GPIO_NR_PALMTE2_SD_POWER, "SD_POWER");
+       if (err)
+               goto err3;
+       err = gpio_direction_output(GPIO_NR_PALMTE2_SD_POWER, 0);
+       if (err)
+               goto err4;
+
+       err = gpio_request(GPIO_NR_PALMTE2_SD_READONLY, "SD_READONLY");
+       if (err)
+               goto err4;
+       err = gpio_direction_input(GPIO_NR_PALMTE2_SD_READONLY);
+       if (err)
+               goto err5;
+
+       printk(KERN_DEBUG "%s: irq registered\n", __func__);
+
+       return 0;
+
+err5:
+       gpio_free(GPIO_NR_PALMTE2_SD_READONLY);
+err4:
+       gpio_free(GPIO_NR_PALMTE2_SD_POWER);
+err3:
+       free_irq(gpio_to_irq(GPIO_NR_PALMTE2_SD_DETECT_N), data);
+err2:
+       gpio_free(GPIO_NR_PALMTE2_SD_DETECT_N);
+err:
+       return err;
+}
+
+static void palmte2_mci_exit(struct device *dev, void *data)
+{
+       gpio_free(GPIO_NR_PALMTE2_SD_READONLY);
+       gpio_free(GPIO_NR_PALMTE2_SD_POWER);
+       free_irq(gpio_to_irq(GPIO_NR_PALMTE2_SD_DETECT_N), data);
+       gpio_free(GPIO_NR_PALMTE2_SD_DETECT_N);
+}
+
+static void palmte2_mci_power(struct device *dev, unsigned int vdd)
+{
+       struct pxamci_platform_data *p_d = dev->platform_data;
+       gpio_set_value(GPIO_NR_PALMTE2_SD_POWER, p_d->ocr_mask & (1 << vdd));
+}
+
+static int palmte2_mci_get_ro(struct device *dev)
+{
+       return gpio_get_value(GPIO_NR_PALMTE2_SD_READONLY);
+}
+
+static struct pxamci_platform_data palmte2_mci_platform_data = {
+       .ocr_mask       = MMC_VDD_32_33 | MMC_VDD_33_34,
+       .setpower       = palmte2_mci_power,
+       .get_ro         = palmte2_mci_get_ro,
+       .init           = palmte2_mci_init,
+       .exit           = palmte2_mci_exit,
+};
+
+/******************************************************************************
+ * GPIO keys
+ ******************************************************************************/
+static struct gpio_keys_button palmte2_pxa_buttons[] = {
+       {KEY_F1,        GPIO_NR_PALMTE2_KEY_CONTACTS,   1, "Contacts" },
+       {KEY_F2,        GPIO_NR_PALMTE2_KEY_CALENDAR,   1, "Calendar" },
+       {KEY_F3,        GPIO_NR_PALMTE2_KEY_TASKS,      1, "Tasks" },
+       {KEY_F4,        GPIO_NR_PALMTE2_KEY_NOTES,      1, "Notes" },
+       {KEY_ENTER,     GPIO_NR_PALMTE2_KEY_CENTER,     1, "Center" },
+       {KEY_LEFT,      GPIO_NR_PALMTE2_KEY_LEFT,       1, "Left" },
+       {KEY_RIGHT,     GPIO_NR_PALMTE2_KEY_RIGHT,      1, "Right" },
+       {KEY_DOWN,      GPIO_NR_PALMTE2_KEY_DOWN,       1, "Down" },
+       {KEY_UP,        GPIO_NR_PALMTE2_KEY_UP,         1, "Up" },
+};
+
+static struct gpio_keys_platform_data palmte2_pxa_keys_data = {
+       .buttons        = palmte2_pxa_buttons,
+       .nbuttons       = ARRAY_SIZE(palmte2_pxa_buttons),
+};
+
+static struct platform_device palmte2_pxa_keys = {
+       .name   = "gpio-keys",
+       .id     = -1,
+       .dev    = {
+               .platform_data = &palmte2_pxa_keys_data,
+       },
+};
+
+/******************************************************************************
+ * Backlight
+ ******************************************************************************/
+static int palmte2_backlight_init(struct device *dev)
+{
+       int ret;
+
+       ret = gpio_request(GPIO_NR_PALMTE2_BL_POWER, "BL POWER");
+       if (ret)
+               goto err;
+       ret = gpio_direction_output(GPIO_NR_PALMTE2_BL_POWER, 0);
+       if (ret)
+               goto err2;
+       ret = gpio_request(GPIO_NR_PALMTE2_LCD_POWER, "LCD POWER");
+       if (ret)
+               goto err2;
+       ret = gpio_direction_output(GPIO_NR_PALMTE2_LCD_POWER, 0);
+       if (ret)
+               goto err3;
+
+       return 0;
+err3:
+       gpio_free(GPIO_NR_PALMTE2_LCD_POWER);
+err2:
+       gpio_free(GPIO_NR_PALMTE2_BL_POWER);
+err:
+       return ret;
+}
+
+static int palmte2_backlight_notify(int brightness)
+{
+       gpio_set_value(GPIO_NR_PALMTE2_BL_POWER, brightness);
+       gpio_set_value(GPIO_NR_PALMTE2_LCD_POWER, brightness);
+       return brightness;
+}
+
+static void palmte2_backlight_exit(struct device *dev)
+{
+       gpio_free(GPIO_NR_PALMTE2_BL_POWER);
+       gpio_free(GPIO_NR_PALMTE2_LCD_POWER);
+}
+
+static struct platform_pwm_backlight_data palmte2_backlight_data = {
+       .pwm_id         = 0,
+       .max_brightness = PALMTE2_MAX_INTENSITY,
+       .dft_brightness = PALMTE2_MAX_INTENSITY,
+       .pwm_period_ns  = PALMTE2_PERIOD_NS,
+       .init           = palmte2_backlight_init,
+       .notify         = palmte2_backlight_notify,
+       .exit           = palmte2_backlight_exit,
+};
+
+static struct platform_device palmte2_backlight = {
+       .name   = "pwm-backlight",
+       .dev    = {
+               .parent         = &pxa25x_device_pwm0.dev,
+               .platform_data  = &palmte2_backlight_data,
+       },
+};
+
+/******************************************************************************
+ * IrDA
+ ******************************************************************************/
+static int palmte2_irda_startup(struct device *dev)
+{
+       int err;
+       err = gpio_request(GPIO_NR_PALMTE2_IR_DISABLE, "IR DISABLE");
+       if (err)
+               goto err;
+       err = gpio_direction_output(GPIO_NR_PALMTE2_IR_DISABLE, 1);
+       if (err)
+               gpio_free(GPIO_NR_PALMTE2_IR_DISABLE);
+err:
+       return err;
+}
+
+static void palmte2_irda_shutdown(struct device *dev)
+{
+       gpio_free(GPIO_NR_PALMTE2_IR_DISABLE);
+}
+
+static void palmte2_irda_transceiver_mode(struct device *dev, int mode)
+{
+       gpio_set_value(GPIO_NR_PALMTE2_IR_DISABLE, mode & IR_OFF);
+       pxa2xx_transceiver_mode(dev, mode);
+}
+
+static struct pxaficp_platform_data palmte2_ficp_platform_data = {
+       .startup                = palmte2_irda_startup,
+       .shutdown               = palmte2_irda_shutdown,
+       .transceiver_cap        = IR_SIRMODE | IR_FIRMODE | IR_OFF,
+       .transceiver_mode       = palmte2_irda_transceiver_mode,
+};
+
+/******************************************************************************
+ * UDC
+ ******************************************************************************/
+static struct pxa2xx_udc_mach_info palmte2_udc_info __initdata = {
+       .gpio_vbus              = GPIO_NR_PALMTE2_USB_DETECT_N,
+       .gpio_vbus_inverted     = 1,
+       .gpio_pullup            = GPIO_NR_PALMTE2_USB_PULLUP,
+       .gpio_pullup_inverted   = 0,
+};
+
+/******************************************************************************
+ * Power supply
+ ******************************************************************************/
+static int power_supply_init(struct device *dev)
+{
+       int ret;
+
+       ret = gpio_request(GPIO_NR_PALMTE2_POWER_DETECT, "CABLE_STATE_AC");
+       if (ret)
+               goto err1;
+       ret = gpio_direction_input(GPIO_NR_PALMTE2_POWER_DETECT);
+       if (ret)
+               goto err2;
+
+       return 0;
+
+err2:
+       gpio_free(GPIO_NR_PALMTE2_POWER_DETECT);
+err1:
+       return ret;
+}
+
+static int palmte2_is_ac_online(void)
+{
+       return gpio_get_value(GPIO_NR_PALMTE2_POWER_DETECT);
+}
+
+static void power_supply_exit(struct device *dev)
+{
+       gpio_free(GPIO_NR_PALMTE2_POWER_DETECT);
+}
+
+static char *palmte2_supplicants[] = {
+       "main-battery",
+};
+
+static struct pda_power_pdata power_supply_info = {
+       .init            = power_supply_init,
+       .is_ac_online    = palmte2_is_ac_online,
+       .exit            = power_supply_exit,
+       .supplied_to     = palmte2_supplicants,
+       .num_supplicants = ARRAY_SIZE(palmte2_supplicants),
+};
+
+static struct platform_device power_supply = {
+       .name = "pda-power",
+       .id   = -1,
+       .dev  = {
+               .platform_data = &power_supply_info,
+       },
+};
+
+/******************************************************************************
+ * WM97xx battery
+ ******************************************************************************/
+static struct wm97xx_batt_info wm97xx_batt_pdata = {
+       .batt_aux       = WM97XX_AUX_ID3,
+       .temp_aux       = WM97XX_AUX_ID2,
+       .charge_gpio    = -1,
+       .max_voltage    = PALMTE2_BAT_MAX_VOLTAGE,
+       .min_voltage    = PALMTE2_BAT_MIN_VOLTAGE,
+       .batt_mult      = 1000,
+       .batt_div       = 414,
+       .temp_mult      = 1,
+       .temp_div       = 1,
+       .batt_tech      = POWER_SUPPLY_TECHNOLOGY_LIPO,
+       .batt_name      = "main-batt",
+};
+
+/******************************************************************************
+ * Framebuffer
+ ******************************************************************************/
+static struct pxafb_mode_info palmte2_lcd_modes[] = {
+{
+       .pixclock       = 77757,
+       .xres           = 320,
+       .yres           = 320,
+       .bpp            = 16,
+
+       .left_margin    = 28,
+       .right_margin   = 7,
+       .upper_margin   = 7,
+       .lower_margin   = 5,
+
+       .hsync_len      = 4,
+       .vsync_len      = 1,
+},
+};
+
+static struct pxafb_mach_info palmte2_lcd_screen = {
+       .modes          = palmte2_lcd_modes,
+       .num_modes      = ARRAY_SIZE(palmte2_lcd_modes),
+       .lcd_conn       = LCD_COLOR_TFT_16BPP | LCD_PCLK_EDGE_FALL,
+};
+
+/******************************************************************************
+ * Machine init
+ ******************************************************************************/
+static struct platform_device *devices[] __initdata = {
+#if defined(CONFIG_KEYBOARD_GPIO) || defined(CONFIG_KEYBOARD_GPIO_MODULE)
+       &palmte2_pxa_keys,
+#endif
+       &palmte2_backlight,
+       &power_supply,
+};
+
+/* setup udc GPIOs initial state */
+static void __init palmte2_udc_init(void)
+{
+       if (!gpio_request(GPIO_NR_PALMTE2_USB_PULLUP, "UDC Vbus")) {
+               gpio_direction_output(GPIO_NR_PALMTE2_USB_PULLUP, 1);
+               gpio_free(GPIO_NR_PALMTE2_USB_PULLUP);
+       }
+}
+
+static void __init palmte2_init(void)
+{
+       pxa2xx_mfp_config(ARRAY_AND_SIZE(palmte2_pin_config));
+
+       set_pxa_fb_info(&palmte2_lcd_screen);
+       pxa_set_mci_info(&palmte2_mci_platform_data);
+       palmte2_udc_init();
+       pxa_set_udc_info(&palmte2_udc_info);
+       pxa_set_ac97_info(NULL);
+       pxa_set_ficp_info(&palmte2_ficp_platform_data);
+       wm97xx_bat_set_pdata(&wm97xx_batt_pdata);
+
+       platform_add_devices(devices, ARRAY_SIZE(devices));
+}
+
+MACHINE_START(PALMTE2, "Palm Tungsten|E2")
+       .phys_io        = 0x40000000,
+       .io_pg_offst    = (io_p2v(0x40000000) >> 18) & 0xfffc,
+       .boot_params    = 0xa0000100,
+       .map_io         = pxa_map_io,
+       .init_irq       = pxa25x_init_irq,
+       .timer          = &pxa_timer,
+       .init_machine   = palmte2_init
+MACHINE_END
diff --git a/arch/arm/mach-pxa/palmtx.c b/arch/arm/mach-pxa/palmtx.c

index b490c0924619bf7e69fce0ca50fbde060532f090..59d0c1cba5563940e8ba993300bb444743acc753 100644 (file)
--- a/arch/arm/mach-pxa/palmtx.c
+++ b/arch/arm/mach-pxa/palmtx.c
@@ -93,10 +93,10 @@ static unsigned long palmtx_pin_config[] __initdata = {
         GPIO116_GPIO,   /* wifi ready */
  
         /* MATRIX KEYPAD */
-       GPIO100_KP_MKIN_0,
-       GPIO101_KP_MKIN_1,
-       GPIO102_KP_MKIN_2,
-       GPIO97_KP_MKIN_3,
+       GPIO100_KP_MKIN_0 | WAKEUP_ON_LEVEL_HIGH,
+       GPIO101_KP_MKIN_1 | WAKEUP_ON_LEVEL_HIGH,
+       GPIO102_KP_MKIN_2 | WAKEUP_ON_LEVEL_HIGH,
+       GPIO97_KP_MKIN_3 | WAKEUP_ON_LEVEL_HIGH,
         GPIO103_KP_MKOUT_0,
         GPIO104_KP_MKOUT_1,
         GPIO105_KP_MKOUT_2,
@@ -458,6 +458,33 @@ static struct pxafb_mach_info palmtx_lcd_screen = {
         .lcd_conn       = LCD_COLOR_TFT_16BPP | LCD_PCLK_EDGE_FALL,
  };
  
+/******************************************************************************
+ * Power management - standby
+ ******************************************************************************/
+#ifdef CONFIG_PM
+static u32 *addr __initdata;
+static u32 resume[3] __initdata = {
+       0xe3a00101,     /* mov  r0,     #0x40000000 */
+       0xe380060f,     /* orr  r0, r0, #0x00f00000 */
+       0xe590f008,     /* ldr  pc, [r0, #0x08] */
+};
+
+static int __init palmtx_pm_init(void)
+{
+       int i;
+
+       /* this is where the bootloader jumps */
+       addr = phys_to_virt(PALMTX_STR_BASE);
+
+       for (i = 0; i < 3; i++)
+               addr[i] = resume[i];
+
+       return 0;
+}
+
+device_initcall(palmtx_pm_init);
+#endif
+
  /******************************************************************************
   * Machine init
   ******************************************************************************/
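The three words in the palmtx.c resume[] array above are raw ARM instructions, with their disassembly given only in the comments. As a rough cross-check of those comments, here is a minimal sketch that decodes the immediate operands by hand. The helper names are hypothetical and not part of the kernel. The sketch assumes the first two words are immediate-form data-processing instructions and the third is an LDR with a positive 12-bit immediate offset; it does not verify the opcode bits.

```c
#include <assert.h>
#include <stdint.h>

/* Rotate a 32-bit value right by shift bits (shift must be below 32). */
static uint32_t ror32(uint32_t value, unsigned int shift)
{
	assert(shift < 32);
	return shift ? (value >> shift) | (value << (32 - shift)) : value;
}

/*
 * Decode the 12-bit immediate operand of an ARM data-processing
 * instruction: an 8-bit constant rotated right by twice the 4-bit
 * rotate field held in bits 11..8.
 */
static uint32_t arm_dp_immediate(uint32_t insn)
{
	uint32_t imm8 = insn & 0xff;
	unsigned int rot = (insn >> 8) & 0xf;

	return ror32(imm8, 2 * rot);
}

/* Low 12 bits of an immediate-offset LDR are the byte offset. */
static uint32_t arm_ldr_offset(uint32_t insn)
{
	return insn & 0xfff;
}

/*
 * Address the stub loads pc from, assuming the sequence
 * mov r0, #a; orr r0, r0, #b; ldr pc, [r0, #off].
 */
static uint32_t resume_vector_address(const uint32_t stub[3])
{
	uint32_t r0 = arm_dp_immediate(stub[0]);	/* mov */

	r0 |= arm_dp_immediate(stub[1]);		/* orr */
	return r0 + arm_ldr_offset(stub[2]);		/* ldr pc, [r0, #off] */
}
```

Feeding it the three words from the hunk gives 0x40000000 and 0x00f00000 for the two immediates, which matches the comments, so the stub jumps through the word stored at physical address 0x40f00008.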
diff --git a/arch/arm/mach-pxa/tosa.c b/arch/arm/mach-pxa/tosa.c

index 6e8ade6ae33903e34a051e94f80b3343f84c27aa..afac5b6d3d78e50212afbca1df72579f57de00e4 100644 (file)
--- a/arch/arm/mach-pxa/tosa.c
+++ b/arch/arm/mach-pxa/tosa.c
@@ -45,6 +45,7 @@
  #include <mach/udc.h>
  #include <mach/tosa_bt.h>
  #include <mach/pxa2xx_spi.h>
+#include <mach/audio.h>
  
  #include <asm/mach/arch.h>
  #include <mach/tosa.h>
@@ -914,6 +915,7 @@ static void __init tosa_init(void)
         pxa_set_udc_info(&udc_info);
         pxa_set_ficp_info(&tosa_ficp_platform_data);
         pxa_set_i2c_info(NULL);
+       pxa_set_ac97_info(NULL);
         platform_scoop_config = &tosa_pcmcia_config;
  
         pxa2xx_set_spi_info(2, &pxa_ssp_master_info);
diff --git a/arch/arm/mm/mmu.c b/arch/arm/mm/mmu.c

index b438fc4fb77b7bc0fcd838697ef5bef8c75661de..e6344ece00cee9c6e026bbe3b443e203745534e9 100644 (file)
--- a/arch/arm/mm/mmu.c
+++ b/arch/arm/mm/mmu.c
@@ -828,6 +828,17 @@ void __init reserve_node_zero(pg_data_t *pgdat)
                                 BOOTMEM_DEFAULT);
         }
  
+       if (machine_is_palmld() || machine_is_palmtx()) {
+               reserve_bootmem_node(pgdat, 0xa0000000, 0x1000,
+                               BOOTMEM_EXCLUSIVE);
+               reserve_bootmem_node(pgdat, 0xa0200000, 0x1000,
+                               BOOTMEM_EXCLUSIVE);
+       }
+
+       if (machine_is_palmt5())
+               reserve_bootmem_node(pgdat, 0xa0200000, 0x1000,
+                               BOOTMEM_EXCLUSIVE);
+
  #ifdef CONFIG_SA1111
         /*
          * Because of the SA1111 DMA bug, we want to preserve our
diff --git a/arch/ia64/include/asm/unistd.h b/arch/ia64/include/asm/unistd.h

index 9015979ebe0f3481a2a7582db001097a608e0155..10a9eb05f74d121777ca763e0887f73c96dca88c 100644 (file)
--- a/arch/ia64/include/asm/unistd.h
+++ b/arch/ia64/include/asm/unistd.h
@@ -308,11 +308,13 @@
  #define __NR_dup3                      1316
  #define __NR_pipe2                     1317
  #define __NR_inotify_init1             1318
+#define __NR_preadv                    1319
+#define __NR_pwritev                   1320
  
  #ifdef __KERNEL__
  
  
-#define NR_syscalls                    295 /* length of syscall table */
+#define NR_syscalls                    297 /* length of syscall table */
  
  /*
   * The following defines stop scripts/checksyscalls.sh from complaining about
diff --git a/arch/ia64/kernel/entry.S b/arch/ia64/kernel/entry.S

index 8dc69669586aeca975802ed634e7138df3b249bd..7bebac0e1d44705d8142860d3ce0093bc27ac60e 100644 (file)
--- a/arch/ia64/kernel/entry.S
+++ b/arch/ia64/kernel/entry.S
@@ -1803,6 +1803,8 @@ sys_call_table:
         data8 sys_dup3
         data8 sys_pipe2
         data8 sys_inotify_init1
+       data8 sys_preadv
+       data8 sys_pwritev                       // 1320
  
         .org sys_call_table + 8*NR_syscalls     // guard against failures to increase NR_syscalls
  #endif /* __IA64_ASM_PARAVIRTUALIZED_NATIVE */
diff --git a/arch/mn10300/kernel/irq.c b/arch/mn10300/kernel/irq.c

index 50fdb5c16e0c34368e4faad479d041b4b4362eb9..4c3c58ef5cda6fe571cbe39a649e8d4ebbfc3ae0 100644 (file)
--- a/arch/mn10300/kernel/irq.c
+++ b/arch/mn10300/kernel/irq.c
@@ -140,7 +140,7 @@ void __init init_IRQ(void)
         int irq;
  
         for (irq = 0; irq < NR_IRQS; irq++)
-               if (irq_desc[irq].chip == &no_irq_type)
+               if (irq_desc[irq].chip == &no_irq_chip)
                         /* due to the PIC latching interrupt requests, even
                          * when the IRQ is disabled, IRQ_PENDING is superfluous
                          * and we can use handle_level_irq() for edge-triggered
diff --git a/arch/sparc/include/asm/unistd.h b/arch/sparc/include/asm/unistd.h

index 031f038b19f719c6a34fdd346efd4c3ebacf7283..b8eb71ef31631794b3e9e2b6e4527d320d44857c 100644 (file)
--- a/arch/sparc/include/asm/unistd.h
+++ b/arch/sparc/include/asm/unistd.h
@@ -392,8 +392,10 @@
  #define __NR_pipe2             321
  #define __NR_inotify_init1     322
  #define __NR_accept4           323
+#define __NR_preadv            324
+#define __NR_pwritev           325
  
-#define NR_SYSCALLS            324
+#define NR_SYSCALLS            326
  
  #ifdef __32bit_syscall_numbers__
  /* Sparc 32-bit only has the "setresuid32", "getresuid32" variants,
diff --git a/arch/sparc/kernel/of_device_64.c b/arch/sparc/kernel/of_device_64.c

index b4a12c9aa5f823d7ed1b35de8ed8404df05bd807..27381f1baffc256b8e321146c44c4efab828824e 100644 (file)
--- a/arch/sparc/kernel/of_device_64.c
+++ b/arch/sparc/kernel/of_device_64.c
@@ -99,8 +99,7 @@ static inline u64 of_read_addr(const u32 *cell, int size)
         return r;
  }
  
-static void __init get_cells(struct device_node *dp,
-                            int *addrc, int *sizec)
+static void get_cells(struct device_node *dp, int *addrc, int *sizec)
  {
         if (addrc)
                 *addrc = of_n_addr_cells(dp);
diff --git a/arch/sparc/kernel/pci_fire.c b/arch/sparc/kernel/pci_fire.c

index 9462b68f489415f631702baaeef8935246def172..d53f45bc7dda269b3b488c7117da530c8b1c701f 100644 (file)
--- a/arch/sparc/kernel/pci_fire.c
+++ b/arch/sparc/kernel/pci_fire.c
@@ -409,8 +409,8 @@ static void pci_fire_hw_init(struct pci_pbm_info *pbm)
         upa_writeq(~(u64)0, pbm->pbm_regs + FIRE_PEC_IENAB);
  }
  
-static int __init pci_fire_pbm_init(struct pci_pbm_info *pbm,
-                                   struct of_device *op, u32 portid)
+static int __devinit pci_fire_pbm_init(struct pci_pbm_info *pbm,
+                                      struct of_device *op, u32 portid)
  {
         const struct linux_prom64_registers *regs;
         struct device_node *dp = op->node;
diff --git a/arch/sparc/kernel/pci_psycho.c b/arch/sparc/kernel/pci_psycho.c

index 3b34344082ef17616687e847b5df0d5209b0bcd6..142b9d6984a8d9886b7c25433e75de59205689f0 100644 (file)
--- a/arch/sparc/kernel/pci_psycho.c
+++ b/arch/sparc/kernel/pci_psycho.c
@@ -365,8 +365,8 @@ static void pbm_config_busmastering(struct pci_pbm_info *pbm)
         pci_config_write8(addr, 64);
  }
  
-static void __init psycho_scan_bus(struct pci_pbm_info *pbm,
-                                  struct device *parent)
+static void __devinit psycho_scan_bus(struct pci_pbm_info *pbm,
+                                     struct device *parent)
  {
         pbm_config_busmastering(pbm);
         pbm->is_66mhz_capable = 0;
@@ -482,8 +482,8 @@ static void psycho_pbm_strbuf_init(struct pci_pbm_info *pbm,
  #define PSYCHO_MEMSPACE_B      0x180000000UL
  #define PSYCHO_MEMSPACE_SIZE   0x07fffffffUL
  
-static void __init psycho_pbm_init(struct pci_pbm_info *pbm,
-                                  struct of_device *op, int is_pbm_a)
+static void __devinit psycho_pbm_init(struct pci_pbm_info *pbm,
+                                     struct of_device *op, int is_pbm_a)
  {
         psycho_pbm_init_common(pbm, op, "PSYCHO", PBM_CHIP_TYPE_PSYCHO);
         psycho_pbm_strbuf_init(pbm, is_pbm_a);
diff --git a/arch/sparc/kernel/pci_sabre.c b/arch/sparc/kernel/pci_sabre.c

index 713257b6963c0c163b8a42340e82fafb1ce5fe6a..ba6fbeba3e2cf7e2e5c5f34afdcbabd6d1c60a36 100644 (file)
--- a/arch/sparc/kernel/pci_sabre.c
+++ b/arch/sparc/kernel/pci_sabre.c
@@ -402,8 +402,8 @@ static void apb_init(struct pci_bus *sabre_bus)
         }
  }
  
-static void __init sabre_scan_bus(struct pci_pbm_info *pbm,
-                                 struct device *parent)
+static void __devinit sabre_scan_bus(struct pci_pbm_info *pbm,
+                                    struct device *parent)
  {
         static int once;
  
@@ -442,8 +442,8 @@ static void __init sabre_scan_bus(struct pci_pbm_info *pbm,
         sabre_register_error_handlers(pbm);
  }
  
-static void __init sabre_pbm_init(struct pci_pbm_info *pbm,
-                                 struct of_device *op)
+static void __devinit sabre_pbm_init(struct pci_pbm_info *pbm,
+                                    struct of_device *op)
  {
         psycho_pbm_init_common(pbm, op, "SABRE", PBM_CHIP_TYPE_SABRE);
         pbm->pci_afsr = pbm->controller_regs + SABRE_PIOAFSR;
diff --git a/arch/sparc/kernel/pci_sun4v.c b/arch/sparc/kernel/pci_sun4v.c

index 0ef0ab3d47635083ed66a7cac72b813a165c6f02..5db5ebed35da0083370e16a932d9478e20f4cfe1 100644 (file)
--- a/arch/sparc/kernel/pci_sun4v.c
+++ b/arch/sparc/kernel/pci_sun4v.c
@@ -545,8 +545,8 @@ static const struct dma_ops sun4v_dma_ops = {
         .sync_sg_for_cpu                = dma_4v_sync_sg_for_cpu,
  };
  
-static void __init pci_sun4v_scan_bus(struct pci_pbm_info *pbm,
-                                     struct device *parent)
+static void __devinit pci_sun4v_scan_bus(struct pci_pbm_info *pbm,
+                                        struct device *parent)
  {
         struct property *prop;
         struct device_node *dp;
@@ -559,8 +559,8 @@ static void __init pci_sun4v_scan_bus(struct pci_pbm_info *pbm,
         /* XXX register error interrupt handlers XXX */
  }
  
-static unsigned long __init probe_existing_entries(struct pci_pbm_info *pbm,
-                                                  struct iommu *iommu)
+static unsigned long __devinit probe_existing_entries(struct pci_pbm_info *pbm,
+                                                     struct iommu *iommu)
  {
         struct iommu_arena *arena = &iommu->arena;
         unsigned long i, cnt = 0;
@@ -587,7 +587,7 @@ static unsigned long __init probe_existing_entries(struct pci_pbm_info *pbm,
         return cnt;
  }
  
-static int __init pci_sun4v_iommu_init(struct pci_pbm_info *pbm)
+static int __devinit pci_sun4v_iommu_init(struct pci_pbm_info *pbm)
  {
         static const u32 vdma_default[] = { 0x80000000, 0x80000000 };
         struct iommu *iommu = pbm->iommu;
@@ -889,8 +889,8 @@ static void pci_sun4v_msi_init(struct pci_pbm_info *pbm)
  }
  #endif /* !(CONFIG_PCI_MSI) */
  
-static int __init pci_sun4v_pbm_init(struct pci_pbm_info *pbm,
-                                    struct of_device *op, u32 devhandle)
+static int __devinit pci_sun4v_pbm_init(struct pci_pbm_info *pbm,
+                                       struct of_device *op, u32 devhandle)
  {
         struct device_node *dp = op->node;
         int err;
diff --git a/arch/sparc/kernel/power.c b/arch/sparc/kernel/power.c

index ae88f06a7ec4c74838f514817d9c15a793eb1e5b..e2a045c235a10b9d3861e48b7af2e2acab7b50e1 100644 (file)
--- a/arch/sparc/kernel/power.c
+++ b/arch/sparc/kernel/power.c
@@ -23,7 +23,7 @@ static irqreturn_t power_handler(int irq, void *dev_id)
         return IRQ_HANDLED;
  }
  
-static int __init has_button_interrupt(unsigned int irq, struct device_node *dp)
+static int __devinit has_button_interrupt(unsigned int irq, struct device_node *dp)
  {
         if (irq == 0xffffffff)
                 return 0;
diff --git a/arch/sparc/kernel/systbls_32.S b/arch/sparc/kernel/systbls_32.S

index dccc95df0c7f6ce4a97222e5db40295c70456aaa..00ec3b15f38ceb4e53332b085bd437cdedab5ba8 100644 (file)
--- a/arch/sparc/kernel/systbls_32.S
+++ b/arch/sparc/kernel/systbls_32.S
@@ -81,4 +81,4 @@ sys_call_table:
  /*305*/        .long sys_set_mempolicy, sys_kexec_load, sys_move_pages, sys_getcpu, sys_epoll_pwait
  /*310*/        .long sys_utimensat, sys_signalfd, sys_timerfd_create, sys_eventfd, sys_fallocate
  /*315*/        .long sys_timerfd_settime, sys_timerfd_gettime, sys_signalfd4, sys_eventfd2, sys_epoll_create1
-/*320*/        .long sys_dup3, sys_pipe2, sys_inotify_init1, sys_accept4
+/*320*/        .long sys_dup3, sys_pipe2, sys_inotify_init1, sys_accept4, sys_preadv, sys_pwritev
diff --git a/arch/sparc/kernel/systbls_64.S b/arch/sparc/kernel/systbls_64.S

index a8000b1cda74d60ee726ff0fc76a7e556093e1f8..82b5bf85b9d2b2137bcaedda89ffe472dd06650e 100644 (file)
--- a/arch/sparc/kernel/systbls_64.S
+++ b/arch/sparc/kernel/systbls_64.S
@@ -82,7 +82,7 @@ sys_call_table32:
         .word compat_sys_set_mempolicy, compat_sys_kexec_load, compat_sys_move_pages, sys_getcpu, compat_sys_epoll_pwait
  /*310*/        .word compat_sys_utimensat, compat_sys_signalfd, sys_timerfd_create, sys_eventfd, compat_sys_fallocate
         .word compat_sys_timerfd_settime, compat_sys_timerfd_gettime, compat_sys_signalfd4, sys_eventfd2, sys_epoll_create1
-/*320*/        .word sys_dup3, sys_pipe2, sys_inotify_init1, sys_accept4
+/*320*/        .word sys_dup3, sys_pipe2, sys_inotify_init1, sys_accept4, compat_sys_preadv, compat_sys_pwritev
  
  #endif /* CONFIG_COMPAT */
  
@@ -156,4 +156,4 @@ sys_call_table:
         .word sys_set_mempolicy, sys_kexec_load, sys_move_pages, sys_getcpu, sys_epoll_pwait
  /*310*/        .word sys_utimensat, sys_signalfd, sys_timerfd_create, sys_eventfd, sys_fallocate
         .word sys_timerfd_settime, sys_timerfd_gettime, sys_signalfd4, sys_eventfd2, sys_epoll_create1
-/*320*/        .word sys_dup3, sys_pipe2, sys_inotify_init1, sys_accept4
+/*320*/        .word sys_dup3, sys_pipe2, sys_inotify_init1, sys_accept4, sys_preadv, sys_pwritev
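The ia64 and sparc hunks above add preadv and pwritev to the syscall tables and raise the table-length constants by two. The relationship between the last syscall number and the table length can be sketched as below. The helper name is hypothetical. The ia64 base of 1024 is an assumption inferred from the numbers in the diff (the "// 1320" comment on sys_pwritev combined with NR_syscalls 297), not something the diff states directly.

```c
#include <assert.h>

/*
 * Number of table slots needed so that last_nr is still inside a
 * table whose first slot corresponds to syscall number first_nr.
 */
static unsigned int syscall_table_len(unsigned int first_nr,
				      unsigned int last_nr)
{
	assert(last_nr >= first_nr);
	return last_nr - first_nr + 1;
}
```

With a base of 1024 and a last entry of 1320 this yields 297 for ia64, and with a base of 0 and a last entry of 325 it yields 326 for sparc. Both agree with the new NR_syscalls and NR_SYSCALLS values in the hunks.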
diff --git a/arch/sparc/mm/init_64.c b/arch/sparc/mm/init_64.c

index 2c8dfeb7ab04b24634bba62532f6ac62a510059b..f26a352c08a068b0b915ddc0d309c394cb2fc966 100644 (file)
--- a/arch/sparc/mm/init_64.c
+++ b/arch/sparc/mm/init_64.c
@@ -70,8 +70,8 @@ extern struct tsb swapper_4m_tsb[KERNEL_TSB4M_NENTRIES];
  
  #define MAX_BANKS      32
  
-static struct linux_prom64_registers pavail[MAX_BANKS] __initdata;
-static int pavail_ents __initdata;
+static struct linux_prom64_registers pavail[MAX_BANKS] __devinitdata;
+static int pavail_ents __devinitdata;
  
  static int cmp_p64(const void *a, const void *b)
  {
@@ -968,7 +968,7 @@ int of_node_to_nid(struct device_node *dp)
         return nid;
  }
  
-static void add_node_ranges(void)
+static void __init add_node_ranges(void)
  {
         int i;
  
@@ -1841,7 +1841,7 @@ void __init paging_init(void)
         printk("Booting Linux...\n");
  }
  
-int __init page_in_phys_avail(unsigned long paddr)
+int __devinit page_in_phys_avail(unsigned long paddr)
  {
         int i;
  
diff --git a/arch/x86/include/asm/cpufeature.h b/arch/x86/include/asm/cpufeature.h

index 0beba0d1468db24bceaa1e81cdf8c652eb5180ee..bb83b1c397aad8d05251db9dcd71880b1967b1f4 100644 (file)
--- a/arch/x86/include/asm/cpufeature.h
+++ b/arch/x86/include/asm/cpufeature.h
@@ -154,6 +154,7 @@
   * CPUID levels like 0x6, 0xA etc
   */
  #define X86_FEATURE_IDA                (7*32+ 0) /* Intel Dynamic Acceleration */
+#define X86_FEATURE_ARAT       (7*32+ 1) /* Always Running APIC Timer */
  
  /* Virtualization flags: Linux defined */
  #define X86_FEATURE_TPR_SHADOW  (8*32+ 0) /* Intel TPR Shadow */
diff --git a/arch/x86/kernel/apic/apic.c b/arch/x86/kernel/apic/apic.c

index 098ec84b8c0054d1f0fcff0e02aa357a02b33dd3..f2870920f246a9f1d7e2075c37b65b05de27dbf1 100644 (file)
--- a/arch/x86/kernel/apic/apic.c
+++ b/arch/x86/kernel/apic/apic.c
@@ -431,6 +431,12 @@ static void __cpuinit setup_APIC_timer(void)
  {
         struct clock_event_device *levt = &__get_cpu_var(lapic_events);
  
+       if (cpu_has(&current_cpu_data, X86_FEATURE_ARAT)) {
+               lapic_clockevent.features &= ~CLOCK_EVT_FEAT_C3STOP;
+               /* Make LAPIC timer preferable over percpu HPET */
+               lapic_clockevent.rating = 150;
+       }
+
         memcpy(levt, &lapic_clockevent, sizeof(*levt));
         levt->cpumask = cpumask_of(smp_processor_id());
  
diff --git a/arch/x86/kernel/cpu/addon_cpuid_features.c b/arch/x86/kernel/cpu/addon_cpuid_features.c

index 8220ae69849d4aa3e5405a412a6bca2b03c4b958..c965e5212714ee66cfe04e847544e213f44a7b2d 100644 (file)
--- a/arch/x86/kernel/cpu/addon_cpuid_features.c
+++ b/arch/x86/kernel/cpu/addon_cpuid_features.c
@@ -31,6 +31,7 @@ void __cpuinit init_scattered_cpuid_features(struct cpuinfo_x86 *c)
  
         static const struct cpuid_bit __cpuinitconst cpuid_bits[] = {
                 { X86_FEATURE_IDA, CR_EAX, 1, 0x00000006 },
+               { X86_FEATURE_ARAT, CR_EAX, 2, 0x00000006 },
                 { 0, 0, 0, 0 }
         };
  
diff --git a/arch/x86/kernel/cpu/cpufreq/acpi-cpufreq.c b/arch/x86/kernel/cpu/cpufreq/acpi-cpufreq.c

index 19f6b9d27e83288fb516e59a490c6e08f55a9a64..9d3af380c6bdfc41a847578ae8eb78a0a151bc38 100644 (file)
--- a/arch/x86/kernel/cpu/cpufreq/acpi-cpufreq.c
+++ b/arch/x86/kernel/cpu/cpufreq/acpi-cpufreq.c
@@ -68,6 +68,7 @@ struct acpi_cpufreq_data {
         unsigned int max_freq;
         unsigned int resume;
         unsigned int cpu_feature;
+       u64 saved_aperf, saved_mperf;
  };
  
  static DEFINE_PER_CPU(struct acpi_cpufreq_data *, drv_data);
@@ -241,26 +242,23 @@ static u32 get_cur_val(const struct cpumask *mask)
         return cmd.val;
  }
  
-struct perf_cur {
+struct perf_pair {
         union {
                 struct {
                         u32 lo;
                         u32 hi;
                 } split;
                 u64 whole;
-       } aperf_cur, mperf_cur;
+       } aperf, mperf;
  };
  
  
  static long read_measured_perf_ctrs(void *_cur)
  {
-       struct perf_cur *cur = _cur;
+       struct perf_pair *cur = _cur;
  
-       rdmsr(MSR_IA32_APERF, cur->aperf_cur.split.lo, cur->aperf_cur.split.hi);
-       rdmsr(MSR_IA32_MPERF, cur->mperf_cur.split.lo, cur->mperf_cur.split.hi);
-
-       wrmsr(MSR_IA32_APERF, 0, 0);
-       wrmsr(MSR_IA32_MPERF, 0, 0);
+       rdmsr(MSR_IA32_APERF, cur->aperf.split.lo, cur->aperf.split.hi);
+       rdmsr(MSR_IA32_MPERF, cur->mperf.split.lo, cur->mperf.split.hi);
  
         return 0;
  }
@@ -281,52 +279,57 @@ static long read_measured_perf_ctrs(void *_cur)
  static unsigned int get_measured_perf(struct cpufreq_policy *policy,
                                       unsigned int cpu)
  {
-       struct perf_cur cur;
+       struct perf_pair readin, cur;
         unsigned int perf_percent;
         unsigned int retval;
  
-       if (!work_on_cpu(cpu, read_measured_perf_ctrs, &cur))
+       if (!work_on_cpu(cpu, read_measured_perf_ctrs, &readin))
                 return 0;
  
+       cur.aperf.whole = readin.aperf.whole -
+                               per_cpu(drv_data, cpu)->saved_aperf;
+       cur.mperf.whole = readin.mperf.whole -
+                               per_cpu(drv_data, cpu)->saved_mperf;
+       per_cpu(drv_data, cpu)->saved_aperf = readin.aperf.whole;
+       per_cpu(drv_data, cpu)->saved_mperf = readin.mperf.whole;
+
  #ifdef __i386__
         /*
          * We don't want to do 64 bit divide with 32 bit kernel
          * Get an approximate value. Return failure in case we cannot get
          * an approximate value.
          */
-       if (unlikely(cur.aperf_cur.split.hi || cur.mperf_cur.split.hi)) {
+       if (unlikely(cur.aperf.split.hi || cur.mperf.split.hi)) {
                 int shift_count;
                 u32 h;
  
-               h = max_t(u32, cur.aperf_cur.split.hi, cur.mperf_cur.split.hi);
+               h = max_t(u32, cur.aperf.split.hi, cur.mperf.split.hi);
                 shift_count = fls(h);
  
-               cur.aperf_cur.whole >>= shift_count;
-               cur.mperf_cur.whole >>= shift_count;
+               cur.aperf.whole >>= shift_count;
+               cur.mperf.whole >>= shift_count;
         }
  
-       if (((unsigned long)(-1) / 100) < cur.aperf_cur.split.lo) {
+       if (((unsigned long)(-1) / 100) < cur.aperf.split.lo) {
                 int shift_count = 7;
-               cur.aperf_cur.split.lo >>= shift_count;
-               cur.mperf_cur.split.lo >>= shift_count;
+               cur.aperf.split.lo >>= shift_count;
+               cur.mperf.split.lo >>= shift_count;
         }
  
-       if (cur.aperf_cur.split.lo && cur.mperf_cur.split.lo)
-               perf_percent = (cur.aperf_cur.split.lo * 100) /
-                               cur.mperf_cur.split.lo;
+       if (cur.aperf.split.lo && cur.mperf.split.lo)
+               perf_percent = (cur.aperf.split.lo * 100) / cur.mperf.split.lo;
         else
                 perf_percent = 0;
  
  #else
-       if (unlikely(((unsigned long)(-1) / 100) < cur.aperf_cur.whole)) {
+       if (unlikely(((unsigned long)(-1) / 100) < cur.aperf.whole)) {
                 int shift_count = 7;
-               cur.aperf_cur.whole >>= shift_count;
-               cur.mperf_cur.whole >>= shift_count;
+               cur.aperf.whole >>= shift_count;
+               cur.mperf.whole >>= shift_count;
         }
  
-       if (cur.aperf_cur.whole && cur.mperf_cur.whole)
-               perf_percent = (cur.aperf_cur.whole * 100) /
-                               cur.mperf_cur.whole;
+       if (cur.aperf.whole && cur.mperf.whole)
+               perf_percent = (cur.aperf.whole * 100) / cur.mperf.whole;
         else
                 perf_percent = 0;
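The acpi-cpufreq change above stops zeroing the APERF and MPERF MSRs after each read. It keeps the previous readings in saved_aperf and saved_mperf and works on the difference instead. A minimal user-space sketch of that delta approach follows, modelled on the 64-bit (#else) branch. The struct and function names are hypothetical, the counter values are passed in rather than read from MSRs, and the overflow guard mirrors the shift by 7 used in the hunk.

```c
#include <assert.h>
#include <stdint.h>

/* One snapshot of the two free-running counters. */
struct perf_sample {
	uint64_t aperf;
	uint64_t mperf;
};

/*
 * Return APERF/MPERF as a percentage over the interval between the
 * previous snapshot in *saved and the new snapshot, then store the
 * new snapshot for the next call.
 */
static unsigned int perf_percent_since(struct perf_sample *saved,
				       struct perf_sample now)
{
	/* Unsigned subtraction stays correct across counter wraparound. */
	uint64_t aperf = now.aperf - saved->aperf;
	uint64_t mperf = now.mperf - saved->mperf;

	*saved = now;

	/* Scale both deltas down together so aperf * 100 cannot overflow. */
	if (aperf > UINT64_MAX / 100) {
		aperf >>= 7;
		mperf >>= 7;
	}

	if (aperf == 0 || mperf == 0)
		return 0;

	return (unsigned int)(aperf * 100 / mperf);
}
```

Because the counters are only read and never written, other readers of the same MSRs are not disturbed. That appears to be the point of dropping the wrmsr() calls, though the diff itself does not state the motivation.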
  
diff --git a/arch/x86/kernel/cpu/cpufreq/longhaul.c b/arch/x86/kernel/cpu/cpufreq/longhaul.c

index 0bd48e65a0caa3a77335bd1e8d5a53bc8764fae3..ce2ed3e4aad96a200fe0280dea4d66a4b2fb9268 100644 (file)
--- a/arch/x86/kernel/cpu/cpufreq/longhaul.c
+++ b/arch/x86/kernel/cpu/cpufreq/longhaul.c
@@ -33,7 +33,6 @@
  #include <linux/timex.h>
  #include <linux/io.h>
  #include <linux/acpi.h>
-#include <linux/kernel.h>
  
  #include <asm/msr.h>
  #include <acpi/processor.h>
diff --git a/arch/x86/kernel/ftrace.c b/arch/x86/kernel/ftrace.c

index 70a10ca100f68273e0371a58f8236ff5d3075acf..18dfa30795c9fe79e617457983acb7e1baa5dbd6 100644 (file)
--- a/arch/x86/kernel/ftrace.c
+++ b/arch/x86/kernel/ftrace.c
@@ -18,6 +18,8 @@
  #include <linux/init.h>
  #include <linux/list.h>
  
+#include <trace/syscall.h>
+
  #include <asm/cacheflush.h>
  #include <asm/ftrace.h>
  #include <asm/nops.h>
diff --git a/arch/x86/kernel/ptrace.c b/arch/x86/kernel/ptrace.c

index fe9345c967de2c62bf6681a0202541738f5d9da2..23b7c8f017e2afa74194c81c7720724a2604751b 100644 (file)
--- a/arch/x86/kernel/ptrace.c
+++ b/arch/x86/kernel/ptrace.c
@@ -21,7 +21,6 @@
  #include <linux/audit.h>
  #include <linux/seccomp.h>
  #include <linux/signal.h>
-#include <linux/ftrace.h>
  
  #include <asm/uaccess.h>
  #include <asm/pgtable.h>
@@ -35,6 +34,8 @@
  #include <asm/proto.h>
  #include <asm/ds.h>
  
+#include <trace/syscall.h>
+
  #include "tls.h"
  
  enum x86_regset {
diff --git a/drivers/acpi/acpica/hwvalid.c b/drivers/acpi/acpica/hwvalid.c

index bd3c937b0ac094aecff5f6fa2bd7a5007805a7d4..7737afb157c35645bb2de73f7f5a4f95e4674ec8 100644 (file)
--- a/drivers/acpi/acpica/hwvalid.c
+++ b/drivers/acpi/acpica/hwvalid.c
@@ -90,7 +90,6 @@ static const struct acpi_port_info acpi_protected_ports[] = {
         {"PIT2", 0x0048, 0x004B, ACPI_OSI_WIN_XP},
         {"RTC", 0x0070, 0x0071, ACPI_OSI_WIN_XP},
         {"CMOS", 0x0074, 0x0076, ACPI_OSI_WIN_XP},
-       {"DMA1", 0x0081, 0x0083, ACPI_OSI_WIN_XP},
         {"DMA1L", 0x0087, 0x0087, ACPI_OSI_WIN_XP},
         {"DMA2", 0x0089, 0x008B, ACPI_OSI_WIN_XP},
         {"DMA2L", 0x008F, 0x008F, ACPI_OSI_WIN_XP},
diff --git a/drivers/acpi/battery.c b/drivers/acpi/battery.c

index b0de6312919a82a455f20f8d06dddd8f1b672d80..3c7d8942f23b4dd126d1a8e2145adfa0a1ad499b 100644 (file)
--- a/drivers/acpi/battery.c
+++ b/drivers/acpi/battery.c
@@ -903,7 +903,7 @@ static struct acpi_driver acpi_battery_driver = {
                 },
  };
  
-static void __init acpi_battery_init_async(void *unused, async_cookie_t cookie)
+static void acpi_battery_init_async(void *unused, async_cookie_t cookie)
  {
         if (acpi_disabled)
                 return;
diff --git a/drivers/acpi/proc.c b/drivers/acpi/proc.c

index 05dfdc96802e2fa0d4b2967868ea0f3ebdaad425..d0d550d22a6d43a14ccecec6658e9fb2773d6b70 100644 (file)
--- a/drivers/acpi/proc.c
+++ b/drivers/acpi/proc.c
@@ -343,9 +343,6 @@ acpi_system_write_alarm(struct file *file,
  }
  #endif                         /* HAVE_ACPI_LEGACY_ALARM */
  
-extern struct list_head acpi_wakeup_device_list;
-extern spinlock_t acpi_device_lock;
-
  static int
  acpi_system_wakeup_device_seq_show(struct seq_file *seq, void *offset)
  {
@@ -353,7 +350,7 @@ acpi_system_wakeup_device_seq_show(struct seq_file *seq, void *offset)
  
         seq_printf(seq, "Device\tS-state\t  Status   Sysfs node\n");
  
-       spin_lock(&acpi_device_lock);
+       mutex_lock(&acpi_device_lock);
         list_for_each_safe(node, next, &acpi_wakeup_device_list) {
                 struct acpi_device *dev =
                     container_of(node, struct acpi_device, wakeup_list);
@@ -361,7 +358,6 @@ acpi_system_wakeup_device_seq_show(struct seq_file *seq, void *offset)
  
                 if (!dev->wakeup.flags.valid)
                         continue;
-               spin_unlock(&acpi_device_lock);
  
                 ldev = acpi_get_physical_device(dev->handle);
                 seq_printf(seq, "%s\t  S%d\t%c%-8s  ",
@@ -376,9 +372,8 @@ acpi_system_wakeup_device_seq_show(struct seq_file *seq, void *offset)
                 seq_printf(seq, "\n");
                 put_device(ldev);
  
-               spin_lock(&acpi_device_lock);
         }
-       spin_unlock(&acpi_device_lock);
+       mutex_unlock(&acpi_device_lock);
         return 0;
  }
  
@@ -409,7 +404,7 @@ acpi_system_write_wakeup_device(struct file *file,
         strbuf[len] = '\0';
         sscanf(strbuf, "%s", str);
  
-       spin_lock(&acpi_device_lock);
+       mutex_lock(&acpi_device_lock);
         list_for_each_safe(node, next, &acpi_wakeup_device_list) {
                 struct acpi_device *dev =
                     container_of(node, struct acpi_device, wakeup_list);
@@ -446,7 +441,7 @@ acpi_system_write_wakeup_device(struct file *file,
                         }
                 }
         }
-       spin_unlock(&acpi_device_lock);
+       mutex_unlock(&acpi_device_lock);
         return count;
  }
  
diff --git a/drivers/acpi/processor_idle.c b/drivers/acpi/processor_idle.c

index 4e6e758bd397f50686b3d951edabb8c93742fd38..6fe121434ffb36ec67a91b45823e3df3d45873c5 100644 (file)
--- a/drivers/acpi/processor_idle.c
+++ b/drivers/acpi/processor_idle.c
@@ -145,6 +145,9 @@ static void acpi_timer_check_state(int state, struct acpi_processor *pr,
         struct acpi_processor_power *pwr = &pr->power;
         u8 type = local_apic_timer_c2_ok ? ACPI_STATE_C3 : ACPI_STATE_C2;
  
+       if (cpu_has(&cpu_data(pr->id), X86_FEATURE_ARAT))
+               return;
+
         /*
          * Check, if one of the previous states already marked the lapic
          * unstable
diff --git a/drivers/acpi/scan.c b/drivers/acpi/scan.c

index 20c23c04920777259fc426acc7a9154d612a8cd0..8ff510b91d88f4f38e473afab76425a9cf392477 100644
--- a/drivers/acpi/scan.c
+++ b/drivers/acpi/scan.c
@@ -24,7 +24,7 @@ extern struct acpi_device *acpi_root;
  
  static LIST_HEAD(acpi_device_list);
  static LIST_HEAD(acpi_bus_id_list);
-DEFINE_SPINLOCK(acpi_device_lock);
+DEFINE_MUTEX(acpi_device_lock);
  LIST_HEAD(acpi_wakeup_device_list);
  
  struct acpi_device_bus_id{
@@ -491,7 +491,6 @@ static int acpi_device_register(struct acpi_device *device,
          */
         INIT_LIST_HEAD(&device->children);
         INIT_LIST_HEAD(&device->node);
-       INIT_LIST_HEAD(&device->g_list);
         INIT_LIST_HEAD(&device->wakeup_list);
  
         new_bus_id = kzalloc(sizeof(struct acpi_device_bus_id), GFP_KERNEL);
@@ -500,7 +499,7 @@ static int acpi_device_register(struct acpi_device *device,
                 return -ENOMEM;
         }
  
-       spin_lock(&acpi_device_lock);
+       mutex_lock(&acpi_device_lock);
         /*
          * Find suitable bus_id and instance number in acpi_bus_id_list
          * If failed, create one and link it into acpi_bus_id_list
@@ -521,14 +520,12 @@ static int acpi_device_register(struct acpi_device *device,
         }
         dev_set_name(&device->dev, "%s:%02x", acpi_device_bus_id->bus_id, acpi_device_bus_id->instance_no);
  
-       if (device->parent) {
+       if (device->parent)
                 list_add_tail(&device->node, &device->parent->children);
-               list_add_tail(&device->g_list, &device->parent->g_list);
-       } else
-               list_add_tail(&device->g_list, &acpi_device_list);
+
         if (device->wakeup.flags.valid)
                 list_add_tail(&device->wakeup_list, &acpi_wakeup_device_list);
-       spin_unlock(&acpi_device_lock);
+       mutex_unlock(&acpi_device_lock);
  
         if (device->parent)
                 device->dev.parent = &parent->dev;
@@ -549,28 +546,22 @@ static int acpi_device_register(struct acpi_device *device,
         device->removal_type = ACPI_BUS_REMOVAL_NORMAL;
         return 0;
    end:
-       spin_lock(&acpi_device_lock);
-       if (device->parent) {
+       mutex_lock(&acpi_device_lock);
+       if (device->parent)
                 list_del(&device->node);
-               list_del(&device->g_list);
-       } else
-               list_del(&device->g_list);
         list_del(&device->wakeup_list);
-       spin_unlock(&acpi_device_lock);
+       mutex_unlock(&acpi_device_lock);
         return result;
  }
  
  static void acpi_device_unregister(struct acpi_device *device, int type)
  {
-       spin_lock(&acpi_device_lock);
-       if (device->parent) {
+       mutex_lock(&acpi_device_lock);
+       if (device->parent)
                 list_del(&device->node);
-               list_del(&device->g_list);
-       } else
-               list_del(&device->g_list);
  
         list_del(&device->wakeup_list);
-       spin_unlock(&acpi_device_lock);
+       mutex_unlock(&acpi_device_lock);
  
         acpi_detach_data(device->handle, acpi_bus_data_handler);
  
diff --git a/drivers/acpi/sleep.h b/drivers/acpi/sleep.h

index cfaf8f5b0a149b3b7bdcf056ebfcd2bfc7fb9870..8a8f3b3382a672483924ea5a3dd1cdd742e9a11d 100644
--- a/drivers/acpi/sleep.h
+++ b/drivers/acpi/sleep.h
@@ -5,3 +5,6 @@ extern int acpi_suspend (u32 state);
  extern void acpi_enable_wakeup_device_prep(u8 sleep_state);
  extern void acpi_enable_wakeup_device(u8 sleep_state);
  extern void acpi_disable_wakeup_device(u8 sleep_state);
+
+extern struct list_head acpi_wakeup_device_list;
+extern struct mutex acpi_device_lock;
diff --git a/drivers/acpi/thermal.c b/drivers/acpi/thermal.c

index e8c143caf0fd6db6786944977c883af949ae84ca..9cd15e8c893226288958d1b1253169d85eb36620 100644
--- a/drivers/acpi/thermal.c
+++ b/drivers/acpi/thermal.c
@@ -98,6 +98,7 @@ MODULE_PARM_DESC(psv, "Disable or override all passive trip points.");
  static int acpi_thermal_add(struct acpi_device *device);
  static int acpi_thermal_remove(struct acpi_device *device, int type);
  static int acpi_thermal_resume(struct acpi_device *device);
+static void acpi_thermal_notify(struct acpi_device *device, u32 event);
  static int acpi_thermal_state_open_fs(struct inode *inode, struct file *file);
  static int acpi_thermal_temp_open_fs(struct inode *inode, struct file *file);
  static int acpi_thermal_trip_open_fs(struct inode *inode, struct file *file);
@@ -123,6 +124,7 @@ static struct acpi_driver acpi_thermal_driver = {
                 .add = acpi_thermal_add,
                 .remove = acpi_thermal_remove,
                 .resume = acpi_thermal_resume,
+               .notify = acpi_thermal_notify,
                 },
  };
  
@@ -192,6 +194,7 @@ struct acpi_thermal {
         struct acpi_handle_list devices;
         struct thermal_zone_device *thermal_zone;
         int tz_enabled;
+       int kelvin_offset;
         struct mutex lock;
  };
  
@@ -581,7 +584,7 @@ static void acpi_thermal_check(void *data)
  }
  
  /* sys I/F for generic thermal sysfs support */
-#define KELVIN_TO_MILLICELSIUS(t) (t * 100 - 273200)
+#define KELVIN_TO_MILLICELSIUS(t, off) (((t) - (off)) * 100)
  
  static int thermal_get_temp(struct thermal_zone_device *thermal,
                             unsigned long *temp)
@@ -596,7 +599,7 @@ static int thermal_get_temp(struct thermal_zone_device *thermal,
         if (result)
                 return result;
  
-       *temp = KELVIN_TO_MILLICELSIUS(tz->temperature);
+       *temp = KELVIN_TO_MILLICELSIUS(tz->temperature, tz->kelvin_offset);
         return 0;
  }
  
@@ -702,7 +705,8 @@ static int thermal_get_trip_temp(struct thermal_zone_device *thermal,
         if (tz->trips.critical.flags.valid) {
                 if (!trip) {
                         *temp = KELVIN_TO_MILLICELSIUS(
-                               tz->trips.critical.temperature);
+                               tz->trips.critical.temperature,
+                               tz->kelvin_offset);
                         return 0;
                 }
                 trip--;
@@ -711,7 +715,8 @@ static int thermal_get_trip_temp(struct thermal_zone_device *thermal,
         if (tz->trips.hot.flags.valid) {
                 if (!trip) {
                         *temp = KELVIN_TO_MILLICELSIUS(
-                               tz->trips.hot.temperature);
+                               tz->trips.hot.temperature,
+                               tz->kelvin_offset);
                         return 0;
                 }
                 trip--;
@@ -720,7 +725,8 @@ static int thermal_get_trip_temp(struct thermal_zone_device *thermal,
         if (tz->trips.passive.flags.valid) {
                 if (!trip) {
                         *temp = KELVIN_TO_MILLICELSIUS(
-                               tz->trips.passive.temperature);
+                               tz->trips.passive.temperature,
+                               tz->kelvin_offset);
                         return 0;
                 }
                 trip--;
@@ -730,7 +736,8 @@ static int thermal_get_trip_temp(struct thermal_zone_device *thermal,
                 tz->trips.active[i].flags.valid; i++) {
                 if (!trip) {
                         *temp = KELVIN_TO_MILLICELSIUS(
-                               tz->trips.active[i].temperature);
+                               tz->trips.active[i].temperature,
+                               tz->kelvin_offset);
                         return 0;
                 }
                 trip--;
@@ -745,7 +752,8 @@ static int thermal_get_crit_temp(struct thermal_zone_device *thermal,
  
         if (tz->trips.critical.flags.valid) {
                 *temperature = KELVIN_TO_MILLICELSIUS(
-                               tz->trips.critical.temperature);
+                               tz->trips.critical.temperature,
+                               tz->kelvin_offset);
                 return 0;
         } else
                 return -EINVAL;
@@ -1264,17 +1272,14 @@ static int acpi_thermal_remove_fs(struct acpi_device *device)
                                   Driver Interface
     -------------------------------------------------------------------------- */
  
-static void acpi_thermal_notify(acpi_handle handle, u32 event, void *data)
+static void acpi_thermal_notify(struct acpi_device *device, u32 event)
  {
-       struct acpi_thermal *tz = data;
-       struct acpi_device *device = NULL;
+       struct acpi_thermal *tz = acpi_driver_data(device);
  
  
         if (!tz)
                 return;
  
-       device = tz->device;
-
         switch (event) {
         case ACPI_THERMAL_NOTIFY_TEMPERATURE:
                 acpi_thermal_check(tz);
@@ -1298,8 +1303,6 @@ static void acpi_thermal_notify(acpi_handle handle, u32 event, void *data)
                                   "Unsupported event [0x%x]\n", event));
                 break;
         }
-
-       return;
  }
  
  static int acpi_thermal_get_info(struct acpi_thermal *tz)
@@ -1334,10 +1337,28 @@ static int acpi_thermal_get_info(struct acpi_thermal *tz)
         return 0;
  }
  
+/*
+ * The exact offset between Kelvin and degree Celsius is 273.15. However ACPI
+ * handles temperature values with a single decimal place. As a consequence,
+ * some implementations use an offset of 273.1 and others use an offset of
+ * 273.2. Try to find out which one is being used, to present the most
+ * accurate and visually appealing number.
+ *
+ * The heuristic below should work for all ACPI thermal zones which have a
+ * critical trip point with a value being a multiple of 0.5 degree Celsius.
+ */
+static void acpi_thermal_guess_offset(struct acpi_thermal *tz)
+{
+       if (tz->trips.critical.flags.valid &&
+           (tz->trips.critical.temperature % 5) == 1)
+               tz->kelvin_offset = 2731;
+       else
+               tz->kelvin_offset = 2732;
+}
+
  static int acpi_thermal_add(struct acpi_device *device)
  {
         int result = 0;
-       acpi_status status = AE_OK;
         struct acpi_thermal *tz = NULL;
  
  
@@ -1360,6 +1381,8 @@ static int acpi_thermal_add(struct acpi_device *device)
         if (result)
                 goto free_memory;
  
+       acpi_thermal_guess_offset(tz);
+
         result = acpi_thermal_register_thermal_zone(tz);
         if (result)
                 goto free_memory;
@@ -1368,21 +1391,11 @@ static int acpi_thermal_add(struct acpi_device *device)
         if (result)
                 goto unregister_thermal_zone;
  
-       status = acpi_install_notify_handler(device->handle,
-                                            ACPI_DEVICE_NOTIFY,
-                                            acpi_thermal_notify, tz);
-       if (ACPI_FAILURE(status)) {
-               result = -ENODEV;
-               goto remove_fs;
-       }
-
         printk(KERN_INFO PREFIX "%s [%s] (%ld C)\n",
                acpi_device_name(device), acpi_device_bid(device),
                KELVIN_TO_CELSIUS(tz->temperature));
         goto end;
  
-remove_fs:
-       acpi_thermal_remove_fs(device);
  unregister_thermal_zone:
         thermal_zone_device_unregister(tz->thermal_zone);
  free_memory:
@@ -1393,7 +1406,6 @@ end:
  
  static int acpi_thermal_remove(struct acpi_device *device, int type)
  {
-       acpi_status status = AE_OK;
         struct acpi_thermal *tz = NULL;
  
         if (!device || !acpi_driver_data(device))
@@ -1401,10 +1413,6 @@ static int acpi_thermal_remove(struct acpi_device *device, int type)
  
         tz = acpi_driver_data(device);
  
-       status = acpi_remove_notify_handler(device->handle,
-                                           ACPI_DEVICE_NOTIFY,
-                                           acpi_thermal_notify);
-
         acpi_thermal_remove_fs(device);
         acpi_thermal_unregister_thermal_zone(tz);
         mutex_destroy(&tz->lock);
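The offset heuristic in the thermal.c hunks above can be checked with a small standalone sketch. It assumes, as the patch comment implies, that ACPI temperatures are in tenths of Kelvin; the function names below are illustrative and are not the kernel's.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Convert tenths of Kelvin to millidegrees Celsius, given the offset
 * (also in tenths of Kelvin) that the firmware used for 0 degrees Celsius.
 * Mirrors the two-argument KELVIN_TO_MILLICELSIUS(t, off) macro.
 */
static inline long deci_kelvin_to_millicelsius(long t, long off)
{
	return (t - off) * 100;
}

/* The pre-patch conversion, which always assumed an offset of 273.2 K. */
static inline long old_deci_kelvin_to_millicelsius(long t)
{
	return t * 100 - 273200;
}

/*
 * Same test as acpi_thermal_guess_offset(): if the critical trip point is
 * valid and leaves a remainder of 1 when divided by 5 (0.5 K steps), the
 * firmware most likely used 273.1; otherwise fall back to 273.2.
 */
static inline long guess_kelvin_offset(bool crit_valid, long crit_deci_kelvin)
{
	if (crit_valid && (crit_deci_kelvin % 5) == 1)
		return 2731;
	return 2732;
}
```

For a critical trip point of 100.0 C encoded with a 273.1 offset (raw value 3731), the guessed offset is 2731 and the conversion yields 100000 millidegrees, where the old macro would have reported 99900.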
diff --git a/drivers/acpi/video.c b/drivers/acpi/video.c

index ab06143672bc456ea9ed8c39349647caa8f8e888..cd4fb7543a902ff206599b9a062762e50f68b526 100644
--- a/drivers/acpi/video.c
+++ b/drivers/acpi/video.c
@@ -79,6 +79,7 @@ module_param(brightness_switch_enabled, bool, 0644);
  static int acpi_video_bus_add(struct acpi_device *device);
  static int acpi_video_bus_remove(struct acpi_device *device, int type);
  static int acpi_video_resume(struct acpi_device *device);
+static void acpi_video_bus_notify(struct acpi_device *device, u32 event);
  
  static const struct acpi_device_id video_device_ids[] = {
         {ACPI_VIDEO_HID, 0},
@@ -94,6 +95,7 @@ static struct acpi_driver acpi_video_bus = {
                 .add = acpi_video_bus_add,
                 .remove = acpi_video_bus_remove,
                 .resume = acpi_video_resume,
+               .notify = acpi_video_bus_notify,
                 },
  };
  
@@ -1986,17 +1988,15 @@ static int acpi_video_bus_stop_devices(struct acpi_video_bus *video)
         return acpi_video_bus_DOS(video, 0, 1);
  }
  
-static void acpi_video_bus_notify(acpi_handle handle, u32 event, void *data)
+static void acpi_video_bus_notify(struct acpi_device *device, u32 event)
  {
-       struct acpi_video_bus *video = data;
-       struct acpi_device *device = NULL;
+       struct acpi_video_bus *video = acpi_driver_data(device);
         struct input_dev *input;
         int keycode;
  
         if (!video)
                 return;
  
-       device = video->device;
         input = video->input;
  
         switch (event) {
@@ -2127,7 +2127,6 @@ static int acpi_video_resume(struct acpi_device *device)
  
  static int acpi_video_bus_add(struct acpi_device *device)
  {
-       acpi_status status;
         struct acpi_video_bus *video;
         struct input_dev *input;
         int error;
@@ -2169,20 +2168,10 @@ static int acpi_video_bus_add(struct acpi_device *device)
         acpi_video_bus_get_devices(video, device);
         acpi_video_bus_start_devices(video);
  
-       status = acpi_install_notify_handler(device->handle,
-                                            ACPI_DEVICE_NOTIFY,
-                                            acpi_video_bus_notify, video);
-       if (ACPI_FAILURE(status)) {
-               printk(KERN_ERR PREFIX
-                                 "Error installing notify handler\n");
-               error = -ENODEV;
-               goto err_stop_video;
-       }
-
         video->input = input = input_allocate_device();
         if (!input) {
                 error = -ENOMEM;
-               goto err_uninstall_notify;
+               goto err_stop_video;
         }
  
         snprintf(video->phys, sizeof(video->phys),
@@ -2218,9 +2207,6 @@ static int acpi_video_bus_add(struct acpi_device *device)
  
   err_free_input_dev:
         input_free_device(input);
- err_uninstall_notify:
-       acpi_remove_notify_handler(device->handle, ACPI_DEVICE_NOTIFY,
-                                  acpi_video_bus_notify);
   err_stop_video:
         acpi_video_bus_stop_devices(video);
         acpi_video_bus_put_devices(video);
@@ -2235,7 +2221,6 @@ static int acpi_video_bus_add(struct acpi_device *device)
  
  static int acpi_video_bus_remove(struct acpi_device *device, int type)
  {
-       acpi_status status = 0;
         struct acpi_video_bus *video = NULL;
  
  
@@ -2245,11 +2230,6 @@ static int acpi_video_bus_remove(struct acpi_device *device, int type)
         video = acpi_driver_data(device);
  
         acpi_video_bus_stop_devices(video);
-
-       status = acpi_remove_notify_handler(video->device->handle,
-                                           ACPI_DEVICE_NOTIFY,
-                                           acpi_video_bus_notify);
-
         acpi_video_bus_put_devices(video);
         acpi_video_bus_remove_fs(device);
  
diff --git a/drivers/acpi/wakeup.c b/drivers/acpi/wakeup.c

index 5aee8c26cc9fb93996569ca464987f4451876092..88725dcdf8bc813e42641b6e81cbdf79c8057159 100644
--- a/drivers/acpi/wakeup.c
+++ b/drivers/acpi/wakeup.c
@@ -12,12 +12,14 @@
  #include "internal.h"
  #include "sleep.h"
  
+/*
+ * The suspend/resume paths in this file do not take acpi_device_lock: doing
+ * so triggers an oops, and it is not required because they run during S-state
+ * transitions, when no device hotplug can occur.
+ */
  #define _COMPONENT             ACPI_SYSTEM_COMPONENT
  ACPI_MODULE_NAME("wakeup_devices")
  
-extern struct list_head acpi_wakeup_device_list;
-extern spinlock_t acpi_device_lock;
-
  /**
   * acpi_enable_wakeup_device_prep - prepare wakeup devices
   *     @sleep_state:   ACPI state
@@ -29,7 +31,6 @@ void acpi_enable_wakeup_device_prep(u8 sleep_state)
  {
         struct list_head *node, *next;
  
-       spin_lock(&acpi_device_lock);
         list_for_each_safe(node, next, &acpi_wakeup_device_list) {
                 struct acpi_device *dev = container_of(node,
                                                        struct acpi_device,
@@ -40,11 +41,8 @@ void acpi_enable_wakeup_device_prep(u8 sleep_state)
                     (sleep_state > (u32) dev->wakeup.sleep_state))
                         continue;
  
-               spin_unlock(&acpi_device_lock);
                 acpi_enable_wakeup_device_power(dev, sleep_state);
-               spin_lock(&acpi_device_lock);
         }
-       spin_unlock(&acpi_device_lock);
  }
  
  /**
@@ -60,7 +58,6 @@ void acpi_enable_wakeup_device(u8 sleep_state)
          * Caution: this routine must be invoked when interrupt is disabled 
          * Refer ACPI2.0: P212
          */
-       spin_lock(&acpi_device_lock);
         list_for_each_safe(node, next, &acpi_wakeup_device_list) {
                 struct acpi_device *dev =
                         container_of(node, struct acpi_device, wakeup_list);
@@ -74,22 +71,17 @@ void acpi_enable_wakeup_device(u8 sleep_state)
                 if ((!dev->wakeup.state.enabled && !dev->wakeup.flags.prepared)
                     || sleep_state > (u32) dev->wakeup.sleep_state) {
                         if (dev->wakeup.flags.run_wake) {
-                               spin_unlock(&acpi_device_lock);
                                 /* set_gpe_type will disable GPE, leave it like that */
                                 acpi_set_gpe_type(dev->wakeup.gpe_device,
                                                   dev->wakeup.gpe_number,
                                                   ACPI_GPE_TYPE_RUNTIME);
-                               spin_lock(&acpi_device_lock);
                         }
                         continue;
                 }
-               spin_unlock(&acpi_device_lock);
                 if (!dev->wakeup.flags.run_wake)
                         acpi_enable_gpe(dev->wakeup.gpe_device,
                                         dev->wakeup.gpe_number);
-               spin_lock(&acpi_device_lock);
         }
-       spin_unlock(&acpi_device_lock);
  }
  
  /**
@@ -101,7 +93,6 @@ void acpi_disable_wakeup_device(u8 sleep_state)
  {
         struct list_head *node, *next;
  
-       spin_lock(&acpi_device_lock);
         list_for_each_safe(node, next, &acpi_wakeup_device_list) {
                 struct acpi_device *dev =
                         container_of(node, struct acpi_device, wakeup_list);
@@ -112,19 +103,16 @@ void acpi_disable_wakeup_device(u8 sleep_state)
                 if ((!dev->wakeup.state.enabled && !dev->wakeup.flags.prepared)
                     || sleep_state > (u32) dev->wakeup.sleep_state) {
                         if (dev->wakeup.flags.run_wake) {
-                               spin_unlock(&acpi_device_lock);
                                 acpi_set_gpe_type(dev->wakeup.gpe_device,
                                                   dev->wakeup.gpe_number,
                                                   ACPI_GPE_TYPE_WAKE_RUN);
                                 /* Re-enable it, since set_gpe_type will disable it */
                                 acpi_enable_gpe(dev->wakeup.gpe_device,
                                                 dev->wakeup.gpe_number);
-                               spin_lock(&acpi_device_lock);
                         }
                         continue;
                 }
  
-               spin_unlock(&acpi_device_lock);
                 acpi_disable_wakeup_device_power(dev);
                 /* Never disable run-wake GPE */
                 if (!dev->wakeup.flags.run_wake) {
@@ -133,16 +121,14 @@ void acpi_disable_wakeup_device(u8 sleep_state)
                         acpi_clear_gpe(dev->wakeup.gpe_device,
                                        dev->wakeup.gpe_number, ACPI_NOT_ISR);
                 }
-               spin_lock(&acpi_device_lock);
         }
-       spin_unlock(&acpi_device_lock);
  }
  
  int __init acpi_wakeup_device_init(void)
  {
         struct list_head *node, *next;
  
-       spin_lock(&acpi_device_lock);
+       mutex_lock(&acpi_device_lock);
         list_for_each_safe(node, next, &acpi_wakeup_device_list) {
                 struct acpi_device *dev = container_of(node,
                                                        struct acpi_device,
@@ -150,15 +136,13 @@ int __init acpi_wakeup_device_init(void)
                 /* In case user doesn't load button driver */
                 if (!dev->wakeup.flags.run_wake || dev->wakeup.state.enabled)
                         continue;
-               spin_unlock(&acpi_device_lock);
                 acpi_set_gpe_type(dev->wakeup.gpe_device,
                                   dev->wakeup.gpe_number,
                                   ACPI_GPE_TYPE_WAKE_RUN);
                 acpi_enable_gpe(dev->wakeup.gpe_device,
                                 dev->wakeup.gpe_number);
                 dev->wakeup.state.enabled = 1;
-               spin_lock(&acpi_device_lock);
         }
-       spin_unlock(&acpi_device_lock);
+       mutex_unlock(&acpi_device_lock);
         return 0;
  }
diff --git a/drivers/md/dm-ioctl.c b/drivers/md/dm-ioctl.c

index f01096549a939b0ad4b1bccbc07546bc3d775d43..823ceba6efa8dcccc95b14cf72986195c52b3ae8 100644
--- a/drivers/md/dm-ioctl.c
+++ b/drivers/md/dm-ioctl.c
@@ -1047,6 +1047,19 @@ static int populate_table(struct dm_table *table,
         return dm_table_complete(table);
  }
  
+static int table_prealloc_integrity(struct dm_table *t,
+                                   struct mapped_device *md)
+{
+       struct list_head *devices = dm_table_get_devices(t);
+       struct dm_dev_internal *dd;
+
+       list_for_each_entry(dd, devices, list)
+               if (bdev_get_integrity(dd->dm_dev.bdev))
+                       return blk_integrity_register(dm_disk(md), NULL);
+
+       return 0;
+}
+
  static int table_load(struct dm_ioctl *param, size_t param_size)
  {
         int r;
@@ -1068,6 +1081,14 @@ static int table_load(struct dm_ioctl *param, size_t param_size)
                 goto out;
         }
  
+       r = table_prealloc_integrity(t, md);
+       if (r) {
+               DMERR("%s: could not register integrity profile.",
+                     dm_device_name(md));
+               dm_table_destroy(t);
+               goto out;
+       }
+
         down_write(&_hash_lock);
         hc = dm_get_mdptr(md);
         if (!hc || hc->md != md) {
diff --git a/drivers/md/dm-kcopyd.c b/drivers/md/dm-kcopyd.c

index 0a225da21272543c33b4992a9bf1c83b923b8bbf..3e3fc06cb861cc456f6b3184e4cc9150d4772198 100644
--- a/drivers/md/dm-kcopyd.c
+++ b/drivers/md/dm-kcopyd.c
@@ -297,7 +297,8 @@ static int run_complete_job(struct kcopyd_job *job)
         dm_kcopyd_notify_fn fn = job->fn;
         struct dm_kcopyd_client *kc = job->kc;
  
-       kcopyd_put_pages(kc, job->pages);
+       if (job->pages)
+               kcopyd_put_pages(kc, job->pages);
         mempool_free(job, kc->job_pool);
         fn(read_err, write_err, context);
  
@@ -461,6 +462,7 @@ static void segment_complete(int read_err, unsigned long write_err,
         sector_t progress = 0;
         sector_t count = 0;
         struct kcopyd_job *job = (struct kcopyd_job *) context;
+       struct dm_kcopyd_client *kc = job->kc;
  
         mutex_lock(&job->lock);
  
@@ -490,7 +492,7 @@ static void segment_complete(int read_err, unsigned long write_err,
  
         if (count) {
                 int i;
-               struct kcopyd_job *sub_job = mempool_alloc(job->kc->job_pool,
+               struct kcopyd_job *sub_job = mempool_alloc(kc->job_pool,
                                                            GFP_NOIO);
  
                 *sub_job = *job;
@@ -509,13 +511,16 @@ static void segment_complete(int read_err, unsigned long write_err,
         } else if (atomic_dec_and_test(&job->sub_jobs)) {
  
                 /*
-                * To avoid a race we must keep the job around
-                * until after the notify function has completed.
-                * Otherwise the client may try and stop the job
-                * after we've completed.
+                * Queue the completion callback to the kcopyd thread.
+                *
+                * Some callers assume that all the completions are called
+                * from a single thread and don't race with each other.
+                *
+                * We must not call the callback directly here because this
+                * code may not be executing in the thread.
                  */
-               job->fn(read_err, write_err, job->context);
-               mempool_free(job, job->kc->job_pool);
+               push(&kc->complete_jobs, job);
+               wake(kc);
         }
  }
  
@@ -528,6 +533,8 @@ static void split_job(struct kcopyd_job *job)
  {
         int i;
  
+       atomic_inc(&job->kc->nr_jobs);
+
         atomic_set(&job->sub_jobs, SPLIT_COUNT);
         for (i = 0; i < SPLIT_COUNT; i++)
                 segment_complete(0, 0u, job);
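The change to segment_complete() above replaces a direct callback with a push onto a completion list that the kcopyd thread drains later. A minimal userspace sketch of that deferral pattern follows. All names here are hypothetical stand-ins, and the locking and thread wake-up that a real implementation needs are deliberately omitted.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical completion callback type (simplified to one error code). */
typedef void (*notify_fn)(int read_err, void *context);

struct job {
	notify_fn fn;
	int read_err;
	void *context;
	struct job *next;
};

/* Hypothetical client holding a FIFO of jobs awaiting their callback. */
struct client {
	struct job *head;
	struct job *tail;
};

/* Queue a finished job instead of invoking its callback right away. */
static inline void push_complete(struct client *kc, struct job *job)
{
	job->next = NULL;
	if (kc->tail)
		kc->tail->next = job;
	else
		kc->head = job;
	kc->tail = job;
}

/*
 * Drain the queue from the single worker context. Callbacks therefore
 * run one at a time, in FIFO order. Returns how many callbacks ran.
 */
static inline int run_complete_jobs(struct client *kc)
{
	int ran = 0;

	while (kc->head) {
		struct job *job = kc->head;

		kc->head = job->next;
		if (!kc->head)
			kc->tail = NULL;
		job->fn(job->read_err, job->context);
		ran++;
	}
	return ran;
}

/* Example callback: record each reported error code in a small log. */
struct completion_log {
	int errs[8];
	int n;
};

static inline void record_completion(int read_err, void *context)
{
	struct completion_log *cl = context;

	if (cl->n < 8)
		cl->errs[cl->n++] = read_err;
}
```

Because callbacks only run inside run_complete_jobs(), a caller that assumes its completions never race with each other keeps that guarantee regardless of which context finished the last sub-job.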
diff --git a/drivers/md/dm-linear.c b/drivers/md/dm-linear.c

index bfa107f59d96fca62f219bff16e3affcf08b5046..79fb53e51c709e2875382c517bc61413e50fd978 100644
--- a/drivers/md/dm-linear.c
+++ b/drivers/md/dm-linear.c
@@ -142,7 +142,6 @@ static struct target_type linear_target = {
         .status = linear_status,
         .ioctl  = linear_ioctl,
         .merge  = linear_merge,
-       .features = DM_TARGET_SUPPORTS_BARRIERS,
  };
  
  int __init dm_linear_init(void)
diff --git a/drivers/md/dm-table.c b/drivers/md/dm-table.c

index e8361b191b9b223baef941bfc14212dc052d127f..429b50b975d5a14af8c8e97412425ae9169784d3 100644
--- a/drivers/md/dm-table.c
+++ b/drivers/md/dm-table.c
@@ -52,8 +52,6 @@ struct dm_table {
         sector_t *highs;
         struct dm_target *targets;
  
-       unsigned barriers_supported:1;
-
         /*
          * Indicates the rw permissions for the new logical
          * device.  This should be a combination of FMODE_READ
@@ -243,7 +241,6 @@ int dm_table_create(struct dm_table **result, fmode_t mode,
  
         INIT_LIST_HEAD(&t->devices);
         atomic_set(&t->holders, 0);
-       t->barriers_supported = 1;
  
         if (!num_targets)
                 num_targets = KEYS_PER_NODE;
@@ -751,10 +748,6 @@ int dm_table_add_target(struct dm_table *t, const char *type,
         /* FIXME: the plan is to combine high here and then have
          * the merge fn apply the target level restrictions. */
         combine_restrictions_low(&t->limits, &tgt->limits);
-
-       if (!(tgt->type->features & DM_TARGET_SUPPORTS_BARRIERS))
-               t->barriers_supported = 0;
-
         return 0;
  
   bad:
@@ -799,12 +792,6 @@ int dm_table_complete(struct dm_table *t)
  
         check_for_valid_limits(&t->limits);
  
-       /*
-        * We only support barriers if there is exactly one underlying device.
-        */
-       if (!list_is_singular(&t->devices))
-               t->barriers_supported = 0;
-
         /* how many indexes will the btree have ? */
         leaf_nodes = dm_div_up(t->num_targets, KEYS_PER_NODE);
         t->depth = 1 + int_log(leaf_nodes, CHILDREN_PER_NODE);
@@ -879,6 +866,45 @@ struct dm_target *dm_table_find_target(struct dm_table *t, sector_t sector)
         return &t->targets[(KEYS_PER_NODE * n) + k];
  }
  
+/*
+ * Set the integrity profile for this device if all devices used have
+ * matching profiles.
+ */
+static void dm_table_set_integrity(struct dm_table *t)
+{
+       struct list_head *devices = dm_table_get_devices(t);
+       struct dm_dev_internal *prev = NULL, *dd = NULL;
+
+       if (!blk_get_integrity(dm_disk(t->md)))
+               return;
+
+       list_for_each_entry(dd, devices, list) {
+               if (prev &&
+                   blk_integrity_compare(prev->dm_dev.bdev->bd_disk,
+                                         dd->dm_dev.bdev->bd_disk) < 0) {
+                       DMWARN("%s: integrity not set: %s and %s mismatch",
+                              dm_device_name(t->md),
+                              prev->dm_dev.bdev->bd_disk->disk_name,
+                              dd->dm_dev.bdev->bd_disk->disk_name);
+                       goto no_integrity;
+               }
+               prev = dd;
+       }
+
+       if (!prev || !bdev_get_integrity(prev->dm_dev.bdev))
+               goto no_integrity;
+
+       blk_integrity_register(dm_disk(t->md),
+                              bdev_get_integrity(prev->dm_dev.bdev));
+
+       return;
+
+no_integrity:
+       blk_integrity_register(dm_disk(t->md), NULL);
+
+       return;
+}
+
  void dm_table_set_restrictions(struct dm_table *t, struct request_queue *q)
  {
         /*
@@ -899,6 +925,7 @@ void dm_table_set_restrictions(struct dm_table *t, struct request_queue *q)
         else
                 queue_flag_set_unlocked(QUEUE_FLAG_CLUSTER, q);
  
+       dm_table_set_integrity(t);
  }
  
  unsigned int dm_table_get_num_targets(struct dm_table *t)
@@ -1019,12 +1046,6 @@ struct mapped_device *dm_table_get_md(struct dm_table *t)
         return t->md;
  }
  
-int dm_table_barrier_ok(struct dm_table *t)
-{
-       return t->barriers_supported;
-}
-EXPORT_SYMBOL(dm_table_barrier_ok);
-
  EXPORT_SYMBOL(dm_vcalloc);
  EXPORT_SYMBOL(dm_get_device);
  EXPORT_SYMBOL(dm_put_device);
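The rule in dm_table_set_integrity() above (adopt a profile only when every underlying device has a matching one) can be sketched outside the kernel as follows. The struct and helper names are hypothetical simplifications and are not the block layer's API. Treating two absent profiles as equal is likewise an assumption of this sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a device's integrity profile. */
struct profile {
	const char *name;
	unsigned int sector_size;
	unsigned int tag_size;
};

/* Return 0 when the two profiles match, -1 otherwise. */
static inline int profile_compare(const struct profile *a,
				  const struct profile *b)
{
	if (!a && !b)
		return 0;
	if (!a || !b)
		return -1;
	if (a->sector_size != b->sector_size || a->tag_size != b->tag_size)
		return -1;
	return strcmp(a->name, b->name) ? -1 : 0;
}

/*
 * Walk the device list comparing each entry with its predecessor.
 * Returns the shared profile, or NULL when the list is empty, any
 * neighbouring pair mismatches, or the devices carry no profile.
 */
static inline const struct profile *
common_profile(const struct profile *const *devs, size_t n)
{
	size_t i;

	if (n == 0)
		return NULL;
	for (i = 1; i < n; i++)
		if (profile_compare(devs[i - 1], devs[i]) < 0)
			return NULL;
	return devs[n - 1];
}
```

A NULL result corresponds to the no_integrity path in the patch, where the mapped device is registered without a profile.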
diff --git a/drivers/md/dm.c b/drivers/md/dm.c

index 788ba96a6256aaed6de8625d306cc4d5e33a82f9..8a994be035ba47a89f677ecf34b473d41684cf52 100644
--- a/drivers/md/dm.c
+++ b/drivers/md/dm.c
@@ -89,12 +89,13 @@ union map_info *dm_get_mapinfo(struct bio *bio)
  /*
   * Bits for the md->flags field.
   */
-#define DMF_BLOCK_IO 0
+#define DMF_BLOCK_IO_FOR_SUSPEND 0
  #define DMF_SUSPENDED 1
  #define DMF_FROZEN 2
  #define DMF_FREEING 3
  #define DMF_DELETING 4
  #define DMF_NOFLUSH_SUSPENDING 5
+#define DMF_QUEUE_IO_TO_THREAD 6
  
  /*
   * Work processed by per-device workqueue.
@@ -123,6 +124,11 @@ struct mapped_device {
         struct bio_list deferred;
         spinlock_t deferred_lock;
  
+       /*
+        * An error from the barrier request currently being processed.
+        */
+       int barrier_error;
+
         /*
          * Processing queue (flush/barriers)
          */
@@ -424,6 +430,10 @@ static void end_io_acct(struct dm_io *io)
         part_stat_add(cpu, &dm_disk(md)->part0, ticks[rw], duration);
         part_stat_unlock();
  
+       /*
+        * After this is decremented the bio must not be touched if it is
+        * a barrier.
+        */
         dm_disk(md)->part0.in_flight = pending =
                 atomic_dec_return(&md->pending);
  
@@ -435,21 +445,18 @@ static void end_io_acct(struct dm_io *io)
  /*
   * Add the bio to the list of deferred io.
   */
-static int queue_io(struct mapped_device *md, struct bio *bio)
+static void queue_io(struct mapped_device *md, struct bio *bio)
  {
         down_write(&md->io_lock);
  
-       if (!test_bit(DMF_BLOCK_IO, &md->flags)) {
-               up_write(&md->io_lock);
-               return 1;
-       }
-
         spin_lock_irq(&md->deferred_lock);
         bio_list_add(&md->deferred, bio);
         spin_unlock_irq(&md->deferred_lock);
  
+       if (!test_and_set_bit(DMF_QUEUE_IO_TO_THREAD, &md->flags))
+               queue_work(md->wq, &md->work);
+
         up_write(&md->io_lock);
-       return 0;               /* deferred successfully */
  }
  
  /*
@@ -533,25 +540,35 @@ static void dec_pending(struct dm_io *io, int error)
                          */
                         spin_lock_irqsave(&md->deferred_lock, flags);
                         if (__noflush_suspending(md))
-                               bio_list_add(&md->deferred, io->bio);
+                               bio_list_add_head(&md->deferred, io->bio);
                         else
                                 /* noflush suspend was interrupted. */
                                 io->error = -EIO;
                         spin_unlock_irqrestore(&md->deferred_lock, flags);
                 }
  
-               end_io_acct(io);
-
                 io_error = io->error;
                 bio = io->bio;
  
-               free_io(md, io);
+               if (bio_barrier(bio)) {
+                       /*
+                        * There can be just one barrier request so we use
+                        * a per-device variable for error reporting.
+                        * Note that you can't touch the bio after end_io_acct
+                        */
+                       md->barrier_error = io_error;
+                       end_io_acct(io);
+               } else {
+                       end_io_acct(io);
  
-               if (io_error != DM_ENDIO_REQUEUE) {
-                       trace_block_bio_complete(md->queue, bio);
+                       if (io_error != DM_ENDIO_REQUEUE) {
+                               trace_block_bio_complete(md->queue, bio);
  
-                       bio_endio(bio, io_error);
+                               bio_endio(bio, io_error);
+                       }
                 }
+
+               free_io(md, io);
         }
  }
  
@@ -693,13 +710,19 @@ static struct bio *split_bvec(struct bio *bio, sector_t sector,
  
         clone->bi_sector = sector;
         clone->bi_bdev = bio->bi_bdev;
-       clone->bi_rw = bio->bi_rw;
+       clone->bi_rw = bio->bi_rw & ~(1 << BIO_RW_BARRIER);
         clone->bi_vcnt = 1;
         clone->bi_size = to_bytes(len);
         clone->bi_io_vec->bv_offset = offset;
         clone->bi_io_vec->bv_len = clone->bi_size;
         clone->bi_flags |= 1 << BIO_CLONED;
  
+       if (bio_integrity(bio)) {
+               bio_integrity_clone(clone, bio, GFP_NOIO);
+               bio_integrity_trim(clone,
+                                  bio_sector_offset(bio, idx, offset), len);
+       }
+
         return clone;
  }
  
@@ -714,6 +737,7 @@ static struct bio *clone_bio(struct bio *bio, sector_t sector,
  
         clone = bio_alloc_bioset(GFP_NOIO, bio->bi_max_vecs, bs);
         __bio_clone(clone, bio);
+       clone->bi_rw &= ~(1 << BIO_RW_BARRIER);
         clone->bi_destructor = dm_bio_destructor;
         clone->bi_sector = sector;
         clone->bi_idx = idx;
@@ -721,6 +745,14 @@ static struct bio *clone_bio(struct bio *bio, sector_t sector,
         clone->bi_size = to_bytes(len);
         clone->bi_flags &= ~(1 << BIO_SEG_VALID);
  
+       if (bio_integrity(bio)) {
+               bio_integrity_clone(clone, bio, GFP_NOIO);
+
+               if (idx != bio->bi_idx || clone->bi_size < bio->bi_size)
+                       bio_integrity_trim(clone,
+                                          bio_sector_offset(bio, idx, 0), len);
+       }
+
         return clone;
  }
  
@@ -834,14 +866,13 @@ static void __split_and_process_bio(struct mapped_device *md, struct bio *bio)
  
         ci.map = dm_get_table(md);
         if (unlikely(!ci.map)) {
-               bio_io_error(bio);
-               return;
-       }
-       if (unlikely(bio_barrier(bio) && !dm_table_barrier_ok(ci.map))) {
-               dm_table_put(ci.map);
-               bio_endio(bio, -EOPNOTSUPP);
+               if (!bio_barrier(bio))
+                       bio_io_error(bio);
+               else
+                       md->barrier_error = -EIO;
                 return;
         }
+
         ci.md = md;
         ci.bio = bio;
         ci.io = alloc_io(md);
@@ -918,7 +949,6 @@ out:
   */
  static int dm_request(struct request_queue *q, struct bio *bio)
  {
-       int r = -EIO;
         int rw = bio_data_dir(bio);
         struct mapped_device *md = q->queuedata;
         int cpu;
@@ -931,34 +961,27 @@ static int dm_request(struct request_queue *q, struct bio *bio)
         part_stat_unlock();
  
         /*
-        * If we're suspended we have to queue
-        * this io for later.
+        * If we're suspended or the thread is processing barriers
+        * we have to queue this io for later.
          */
-       while (test_bit(DMF_BLOCK_IO, &md->flags)) {
+       if (unlikely(test_bit(DMF_QUEUE_IO_TO_THREAD, &md->flags)) ||
+           unlikely(bio_barrier(bio))) {
                 up_read(&md->io_lock);
  
-               if (bio_rw(bio) != READA)
-                       r = queue_io(md, bio);
+               if (unlikely(test_bit(DMF_BLOCK_IO_FOR_SUSPEND, &md->flags)) &&
+                   bio_rw(bio) == READA) {
+                       bio_io_error(bio);
+                       return 0;
+               }
  
-               if (r <= 0)
-                       goto out_req;
+               queue_io(md, bio);
  
-               /*
-                * We're in a while loop, because someone could suspend
-                * before we get to the following read lock.
-                */
-               down_read(&md->io_lock);
+               return 0;
         }
  
         __split_and_process_bio(md, bio);
         up_read(&md->io_lock);
         return 0;
-
-out_req:
-       if (r < 0)
-               bio_io_error(bio);
-
-       return 0;
  }
  
  static void dm_unplug_all(struct request_queue *q)
@@ -978,7 +1001,7 @@ static int dm_any_congested(void *congested_data, int bdi_bits)
         struct mapped_device *md = congested_data;
         struct dm_table *map;
  
-       if (!test_bit(DMF_BLOCK_IO, &md->flags)) {
+       if (!test_bit(DMF_BLOCK_IO_FOR_SUSPEND, &md->flags)) {
                 map = dm_get_table(md);
                 if (map) {
                         r = dm_table_any_congested(map, bdi_bits);
@@ -1193,6 +1216,7 @@ static void free_dev(struct mapped_device *md)
         mempool_destroy(md->tio_pool);
         mempool_destroy(md->io_pool);
         bioset_free(md->bs);
+       blk_integrity_unregister(md->disk);
         del_gendisk(md->disk);
         free_minor(minor);
  
@@ -1406,6 +1430,36 @@ static int dm_wait_for_completion(struct mapped_device *md, int interruptible)
         return r;
  }
  
+static int dm_flush(struct mapped_device *md)
+{
+       dm_wait_for_completion(md, TASK_UNINTERRUPTIBLE);
+       return 0;
+}
+
+static void process_barrier(struct mapped_device *md, struct bio *bio)
+{
+       int error = dm_flush(md);
+
+       if (unlikely(error)) {
+               bio_endio(bio, error);
+               return;
+       }
+       if (bio_empty_barrier(bio)) {
+               bio_endio(bio, 0);
+               return;
+       }
+
+       __split_and_process_bio(md, bio);
+
+       error = dm_flush(md);
+
+       if (!error && md->barrier_error)
+               error = md->barrier_error;
+
+       if (md->barrier_error != DM_ENDIO_REQUEUE)
+               bio_endio(bio, error);
+}
+
  /*
   * Process the deferred bios
   */
@@ -1417,25 +1471,34 @@ static void dm_wq_work(struct work_struct *work)
  
         down_write(&md->io_lock);
  
-next_bio:
-       spin_lock_irq(&md->deferred_lock);
-       c = bio_list_pop(&md->deferred);
-       spin_unlock_irq(&md->deferred_lock);
+       while (!test_bit(DMF_BLOCK_IO_FOR_SUSPEND, &md->flags)) {
+               spin_lock_irq(&md->deferred_lock);
+               c = bio_list_pop(&md->deferred);
+               spin_unlock_irq(&md->deferred_lock);
  
-       if (c) {
-               __split_and_process_bio(md, c);
-               goto next_bio;
-       }
+               if (!c) {
+                       clear_bit(DMF_QUEUE_IO_TO_THREAD, &md->flags);
+                       break;
+               }
  
-       clear_bit(DMF_BLOCK_IO, &md->flags);
+               up_write(&md->io_lock);
+
+               if (bio_barrier(c))
+                       process_barrier(md, c);
+               else
+                       __split_and_process_bio(md, c);
+
+               down_write(&md->io_lock);
+       }
  
         up_write(&md->io_lock);
  }
  
  static void dm_queue_flush(struct mapped_device *md)
  {
+       clear_bit(DMF_BLOCK_IO_FOR_SUSPEND, &md->flags);
+       smp_mb__after_clear_bit();
         queue_work(md->wq, &md->work);
-       flush_workqueue(md->wq);
  }
  
  /*
@@ -1553,20 +1616,36 @@ int dm_suspend(struct mapped_device *md, unsigned suspend_flags)
         }
  
         /*
-        * First we set the BLOCK_IO flag so no more ios will be mapped.
+        * Here we must make sure that no processes are submitting requests
+        * to target drivers i.e. no one may be executing
+        * __split_and_process_bio. This is called from dm_request and
+        * dm_wq_work.
+        *
+        * To get all processes out of __split_and_process_bio in dm_request,
+        * we take the write lock. To prevent any process from reentering
+        * __split_and_process_bio from dm_request, we set
+        * DMF_QUEUE_IO_TO_THREAD.
+        *
+        * To quiesce the thread (dm_wq_work), we set DMF_BLOCK_IO_FOR_SUSPEND
+        * and call flush_workqueue(md->wq). flush_workqueue will wait until
+        * dm_wq_work exits and DMF_BLOCK_IO_FOR_SUSPEND will prevent any
+        * further calls to __split_and_process_bio from dm_wq_work.
          */
         down_write(&md->io_lock);
-       set_bit(DMF_BLOCK_IO, &md->flags);
-
+       set_bit(DMF_BLOCK_IO_FOR_SUSPEND, &md->flags);
+       set_bit(DMF_QUEUE_IO_TO_THREAD, &md->flags);
         up_write(&md->io_lock);
  
+       flush_workqueue(md->wq);
+
         /*
-        * Wait for the already-mapped ios to complete.
+        * At this point no more requests are entering target request routines.
+        * We call dm_wait_for_completion to wait for all existing requests
+        * to finish.
          */
         r = dm_wait_for_completion(md, TASK_INTERRUPTIBLE);
  
         down_write(&md->io_lock);
-
         if (noflush)
                 clear_bit(DMF_NOFLUSH_SUSPENDING, &md->flags);
         up_write(&md->io_lock);
@@ -1579,6 +1658,12 @@ int dm_suspend(struct mapped_device *md, unsigned suspend_flags)
                 goto out; /* pushback list is already flushed, so skip flush */
         }
  
+       /*
+        * If dm_wait_for_completion returned 0, the device is completely
+        * quiescent now. There is no request-processing activity. All new
+        * requests are being added to md->deferred list.
+        */
+
         dm_table_postsuspend_targets(map);
  
         set_bit(DMF_SUSPENDED, &md->flags);
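
Note on the queue_io() hunk in drivers/md/dm.c above: the new code uses
test_and_set_bit(DMF_QUEUE_IO_TO_THREAD, ...) so that only the caller
that flips the flag from clear to set schedules the worker. The sketch
below is a single-threaded userspace model of that idea using C11
atomics. It is only an illustration under stated assumptions: dev_model,
queue_io_model, worker_drained_model and the kicks counter are
hypothetical names, not kernel APIs, and the io_lock and deferred_lock
locking from the real code is left out.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Bit number chosen to match DMF_QUEUE_IO_TO_THREAD in the hunk above. */
#define QUEUE_IO_TO_THREAD_BIT 6u

/* Hypothetical stand-in for the relevant parts of struct mapped_device. */
struct dev_model {
	atomic_uint flags;     /* models md->flags */
	unsigned int deferred; /* models the length of md->deferred */
	unsigned int kicks;    /* counts simulated queue_work() calls */
};

static void dev_model_init(struct dev_model *d)
{
	assert(d != NULL);
	atomic_init(&d->flags, 0u);
	d->deferred = 0;
	d->kicks = 0;
}

/* Userspace analogue of test_and_set_bit(): returns the old bit value. */
static bool test_and_set_flag(atomic_uint *flags, unsigned int bit)
{
	unsigned int mask = 1u << bit;

	return (atomic_fetch_or(flags, mask) & mask) != 0;
}

/* Defer one bio; kick the worker only if the flag was previously clear. */
static void queue_io_model(struct dev_model *d)
{
	d->deferred++;
	if (!test_and_set_flag(&d->flags, QUEUE_IO_TO_THREAD_BIT))
		d->kicks++; /* stands in for queue_work(md->wq, &md->work) */
}

/* Models the worker finding the deferred list empty and clearing the flag. */
static void worker_drained_model(struct dev_model *d)
{
	d->deferred = 0;
	atomic_fetch_and(&d->flags, ~(1u << QUEUE_IO_TO_THREAD_BIT));
}
```

In this model, three back-to-back queue_io_model() calls record one
kick, and a later call after worker_drained_model() records a second.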
diff --git a/drivers/md/dm.h b/drivers/md/dm.h
index b48397c0abbd44b253c37ce56db3647f7513eb9f..a31506d93e9164115a7d1584d33f30714913f521 100644
--- a/drivers/md/dm.h
+++ b/drivers/md/dm.h
@@ -52,7 +52,6 @@ int dm_table_any_congested(struct dm_table *t, int bdi_bits);
   * To check the return value from dm_table_find_target().
   */
  #define dm_target_is_valid(t) ((t)->table)
-int dm_table_barrier_ok(struct dm_table *t);
  
  /*-----------------------------------------------------------------
   * A registry of target types.
diff --git a/drivers/mmc/core/mmc.c b/drivers/mmc/core/mmc.c
index c232d11a7ed4c16adf8de5502902889a1a7489ee..06084dbf12772fcf1c1dcf98fd220d9dd41f7f54 100644
--- a/drivers/mmc/core/mmc.c
+++ b/drivers/mmc/core/mmc.c
@@ -208,7 +208,7 @@ static int mmc_read_ext_csd(struct mmc_card *card)
         }
  
         ext_csd_struct = ext_csd[EXT_CSD_REV];
-       if (ext_csd_struct > 2) {
+       if (ext_csd_struct > 3) {
                 printk(KERN_ERR "%s: unrecognised EXT_CSD structure "
                         "version %d\n", mmc_hostname(card->host),
                         ext_csd_struct);
diff --git a/drivers/mmc/core/sd.c b/drivers/mmc/core/sd.c
index 26fc098d77cd5d9261852f34aa902b11e88eaa79..cd81c395e1646056cd90a82200d09bff9e09a03f 100644
--- a/drivers/mmc/core/sd.c
+++ b/drivers/mmc/core/sd.c
@@ -362,15 +362,6 @@ static int mmc_sd_init_card(struct mmc_host *host, u32 ocr,
         if (err)
                 goto err;
  
-       /*
-        * For SPI, enable CRC as appropriate.
-        */
-       if (mmc_host_is_spi(host)) {
-               err = mmc_spi_set_crc(host, use_spi_crc);
-               if (err)
-                       goto err;
-       }
-
         /*
          * Fetch CID from card.
          */
@@ -457,6 +448,18 @@ static int mmc_sd_init_card(struct mmc_host *host, u32 ocr,
                         goto free_card;
         }
  
+       /*
+        * For SPI, enable CRC as appropriate.
+        * This CRC enable is located AFTER the reading of the
+        * card registers because some SDHC cards are not able
+        * to provide valid CRCs for non-512-byte blocks.
+        */
+       if (mmc_host_is_spi(host)) {
+               err = mmc_spi_set_crc(host, use_spi_crc);
+               if (err)
+                       goto free_card;
+       }
+
         /*
          * Attempt to change to high-speed (if supported)
          */
diff --git a/drivers/mmc/host/imxmmc.c b/drivers/mmc/host/imxmmc.c
index eb29b1d933acbcae19611300805f7151ac4ec859..e0be21a4a696d1daf19019eae1bb80651bf2e554 100644
--- a/drivers/mmc/host/imxmmc.c
+++ b/drivers/mmc/host/imxmmc.c
@@ -307,13 +307,6 @@ static void imxmci_setup_data(struct imxmci_host *host, struct mmc_data *data)
  
         wmb();
  
-       if (host->actual_bus_width == MMC_BUS_WIDTH_4)
-               BLR(host->dma) = 0;     /* burst 64 byte read / 64 bytes write */
-       else
-               BLR(host->dma) = 16;    /* burst 16 byte read / 16 bytes write */
-
-       RSSR(host->dma) = DMA_REQ_SDHC;
-
         set_bit(IMXMCI_PEND_DMA_DATA_b, &host->pending_events);
         clear_bit(IMXMCI_PEND_CPU_DATA_b, &host->pending_events);
  
@@ -818,9 +811,11 @@ static void imxmci_set_ios(struct mmc_host *mmc, struct mmc_ios *ios)
         if (ios->bus_width == MMC_BUS_WIDTH_4) {
                 host->actual_bus_width = MMC_BUS_WIDTH_4;
                 imx_gpio_mode(PB11_PF_SD_DAT3);
+               BLR(host->dma) = 0;     /* burst 64 byte read/write */
         } else {
                 host->actual_bus_width = MMC_BUS_WIDTH_1;
                 imx_gpio_mode(GPIO_PORTB | GPIO_IN | GPIO_PUEN | 11);
+               BLR(host->dma) = 16;    /* burst 16 byte read/write */
         }
  
         if (host->power_mode != ios->power_mode) {
@@ -938,7 +933,7 @@ static void imxmci_check_status(unsigned long data)
         mod_timer(&host->timer, jiffies + (HZ>>1));
  }
  
-static int imxmci_probe(struct platform_device *pdev)
+static int __init imxmci_probe(struct platform_device *pdev)
  {
         struct mmc_host *mmc;
         struct imxmci_host *host = NULL;
@@ -1034,6 +1029,7 @@ static int imxmci_probe(struct platform_device *pdev)
         }
         host->dma_allocated = 1;
         imx_dma_setup_handlers(host->dma, imxmci_dma_irq, NULL, host);
+       RSSR(host->dma) = DMA_REQ_SDHC;
  
         tasklet_init(&host->tasklet, imxmci_tasklet_fnc, (unsigned long)host);
         host->status_reg=0;
@@ -1079,7 +1075,7 @@ out:
         return ret;
  }
  
-static int imxmci_remove(struct platform_device *pdev)
+static int __exit imxmci_remove(struct platform_device *pdev)
  {
         struct mmc_host *mmc = platform_get_drvdata(pdev);
  
@@ -1145,8 +1141,7 @@ static int imxmci_resume(struct platform_device *dev)
  #endif /* CONFIG_PM */
  
  static struct platform_driver imxmci_driver = {
-       .probe          = imxmci_probe,
-       .remove         = imxmci_remove,
+       .remove         = __exit_p(imxmci_remove),
         .suspend        = imxmci_suspend,
         .resume         = imxmci_resume,
         .driver         = {
@@ -1157,7 +1152,7 @@ static struct platform_driver imxmci_driver = {
  
  static int __init imxmci_init(void)
  {
-       return platform_driver_register(&imxmci_driver);
+       return platform_driver_probe(&imxmci_driver, imxmci_probe);
  }
  
  static void __exit imxmci_exit(void)
diff --git a/drivers/mmc/host/mmc_spi.c b/drivers/mmc/host/mmc_spi.c
index 72f8bde4877a790f8d65f95450aae1723e521c84..f48349d18c9264ba988b12d0373c7a4d5d7da8eb 100644
--- a/drivers/mmc/host/mmc_spi.c
+++ b/drivers/mmc/host/mmc_spi.c
@@ -24,7 +24,7 @@
   * along with this program; if not, write to the Free Software
   * Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA.
   */
-#include <linux/hrtimer.h>
+#include <linux/sched.h>
  #include <linux/delay.h>
  #include <linux/bio.h>
  #include <linux/dma-mapping.h>
@@ -95,7 +95,7 @@
   * reads which takes nowhere near that long.  Older cards may be able to use
   * shorter timeouts ... but why bother?
   */
-#define r1b_timeout            ktime_set(3, 0)
+#define r1b_timeout            (HZ * 3)
  
  
  /****************************************************************************/
@@ -183,12 +183,11 @@ mmc_spi_readbytes(struct mmc_spi_host *host, unsigned len)
         return status;
  }
  
-static int
-mmc_spi_skip(struct mmc_spi_host *host, ktime_t timeout, unsigned n, u8 byte)
+static int mmc_spi_skip(struct mmc_spi_host *host, unsigned long timeout,
+                       unsigned n, u8 byte)
  {
         u8              *cp = host->data->status;
-
-       timeout = ktime_add(timeout, ktime_get());
+       unsigned long start = jiffies;
  
         while (1) {
                 int             status;
@@ -203,22 +202,26 @@ mmc_spi_skip(struct mmc_spi_host *host, ktime_t timeout, unsigned n, u8 byte)
                                 return cp[i];
                 }
  
-               /* REVISIT investigate msleep() to avoid busy-wait I/O
-                * in at least some cases.
-                */
-               if (ktime_to_ns(ktime_sub(ktime_get(), timeout)) > 0)
+               if (time_is_before_jiffies(start + timeout))
                         break;
+
+               /* If we need long timeouts, we may release the CPU.
+                * We use jiffies here because we want to have a relation
+                * between elapsed time and the blocking of the scheduler.
+                */
+               if (time_is_before_jiffies(start+1))
+                       schedule();
         }
         return -ETIMEDOUT;
  }
  
  static inline int
-mmc_spi_wait_unbusy(struct mmc_spi_host *host, ktime_t timeout)
+mmc_spi_wait_unbusy(struct mmc_spi_host *host, unsigned long timeout)
  {
         return mmc_spi_skip(host, timeout, sizeof(host->data->status), 0);
  }
  
-static int mmc_spi_readtoken(struct mmc_spi_host *host, ktime_t timeout)
+static int mmc_spi_readtoken(struct mmc_spi_host *host, unsigned long timeout)
  {
         return mmc_spi_skip(host, timeout, 1, 0xff);
  }
@@ -251,6 +254,10 @@ static int mmc_spi_response_get(struct mmc_spi_host *host,
         u8      *cp = host->data->status;
         u8      *end = cp + host->t.len;
         int     value = 0;
+       int     bitshift;
+       u8      leftover = 0;
+       unsigned short rotator;
+       int     i;
         char    tag[32];
  
         snprintf(tag, sizeof(tag), "  ... CMD%d response SPI_%s",
@@ -268,9 +275,8 @@ static int mmc_spi_response_get(struct mmc_spi_host *host,
  
         /* Data block reads (R1 response types) may need more data... */
         if (cp == end) {
-               unsigned        i;
-
                 cp = host->data->status;
+               end = cp+1;
  
                 /* Card sends N(CR) (== 1..8) bytes of all-ones then one
                  * status byte ... and we already scanned 2 bytes.
@@ -295,20 +301,34 @@ static int mmc_spi_response_get(struct mmc_spi_host *host,
         }
  
  checkstatus:
-       if (*cp & 0x80) {
-               dev_dbg(&host->spi->dev, "%s: INVALID RESPONSE, %02x\n",
-                                       tag, *cp);
-               value = -EBADR;
-               goto done;
+       bitshift = 0;
+       if (*cp & 0x80) {
+               /* Houston, we have an ugly card with a bit-shifted response */
+               rotator = *cp++ << 8;
+               /* read the next byte */
+               if (cp == end) {
+                       value = mmc_spi_readbytes(host, 1);
+                       if (value < 0)
+                               goto done;
+                       cp = host->data->status;
+                       end = cp+1;
+               }
+               rotator |= *cp++;
+               while (rotator & 0x8000) {
+                       bitshift++;
+                       rotator <<= 1;
+               }
+               cmd->resp[0] = rotator >> 8;
+               leftover = rotator;
+       } else {
+               cmd->resp[0] = *cp++;
         }
-
-       cmd->resp[0] = *cp++;
         cmd->error = 0;
  
         /* Status byte: the entire seven-bit R1 response.  */
         if (cmd->resp[0] != 0) {
                 if ((R1_SPI_PARAMETER | R1_SPI_ADDRESS
-                                       | R1_SPI_ILLEGAL_COMMAND)
+                                     | R1_SPI_ILLEGAL_COMMAND)
                                 & cmd->resp[0])
                         value = -EINVAL;
                 else if (R1_SPI_COM_CRC & cmd->resp[0])
@@ -336,12 +356,45 @@ checkstatus:
          * SPI R5 == R1 + data byte; IO_RW_DIRECT
          */
         case MMC_RSP_SPI_R2:
-               cmd->resp[0] |= *cp << 8;
+               /* read the next byte */
+               if (cp == end) {
+                       value = mmc_spi_readbytes(host, 1);
+                       if (value < 0)
+                               goto done;
+                       cp = host->data->status;
+                       end = cp+1;
+               }
+               if (bitshift) {
+                       rotator = leftover << 8;
+                       rotator |= *cp << bitshift;
+                       cmd->resp[0] |= (rotator & 0xFF00);
+               } else {
+                       cmd->resp[0] |= *cp << 8;
+               }
                 break;
  
         /* SPI R3, R4, or R7 == R1 + 4 bytes */
         case MMC_RSP_SPI_R3:
-               cmd->resp[1] = get_unaligned_be32(cp);
+               rotator = leftover << 8;
+               cmd->resp[1] = 0;
+               for (i = 0; i < 4; i++) {
+                       cmd->resp[1] <<= 8;
+                       /* read the next byte */
+                       if (cp == end) {
+                               value = mmc_spi_readbytes(host, 1);
+                               if (value < 0)
+                                       goto done;
+                               cp = host->data->status;
+                               end = cp+1;
+                       }
+                       if (bitshift) {
+                               rotator |= *cp++ << bitshift;
+                               cmd->resp[1] |= (rotator >> 8);
+                               rotator <<= 8;
+                       } else {
+                               cmd->resp[1] |= *cp++;
+                       }
+               }
                 break;
  
         /* SPI R1 == just one status byte */
@@ -607,7 +660,7 @@ mmc_spi_setup_data_message(
   */
  static int
  mmc_spi_writeblock(struct mmc_spi_host *host, struct spi_transfer *t,
-       ktime_t timeout)
+       unsigned long timeout)
  {
         struct spi_device       *spi = host->spi;
         int                     status, i;
@@ -717,11 +770,13 @@ mmc_spi_writeblock(struct mmc_spi_host *host, struct spi_transfer *t,
   */
  static int
  mmc_spi_readblock(struct mmc_spi_host *host, struct spi_transfer *t,
-       ktime_t timeout)
+       unsigned long timeout)
  {
         struct spi_device       *spi = host->spi;
         int                     status;
         struct scratch          *scratch = host->data;
+       unsigned int            bitshift;
+       u8                      leftover;
  
         /* At least one SD card sends an all-zeroes byte when N(CX)
          * applies, before the all-ones bytes ... just cope with that.
@@ -733,38 +788,60 @@ mmc_spi_readblock(struct mmc_spi_host *host, struct spi_transfer *t,
         if (status == 0xff || status == 0)
                 status = mmc_spi_readtoken(host, timeout);
  
-       if (status == SPI_TOKEN_SINGLE) {
-               if (host->dma_dev) {
-                       dma_sync_single_for_device(host->dma_dev,
-                                       host->data_dma, sizeof(*scratch),
-                                       DMA_BIDIRECTIONAL);
-                       dma_sync_single_for_device(host->dma_dev,
-                                       t->rx_dma, t->len,
-                                       DMA_FROM_DEVICE);
-               }
+       if (status < 0) {
+               dev_dbg(&spi->dev, "read error %02x (%d)\n", status, status);
+               return status;
+       }
  
-               status = spi_sync(spi, &host->m);
+       /* The token may be bit-shifted...
+        * the first 0-bit precedes the data stream.
+        */
+       bitshift = 7;
+       while (status & 0x80) {
+               status <<= 1;
+               bitshift--;
+       }
+       leftover = status << 1;
  
-               if (host->dma_dev) {
-                       dma_sync_single_for_cpu(host->dma_dev,
-                                       host->data_dma, sizeof(*scratch),
-                                       DMA_BIDIRECTIONAL);
-                       dma_sync_single_for_cpu(host->dma_dev,
-                                       t->rx_dma, t->len,
-                                       DMA_FROM_DEVICE);
-               }
+       if (host->dma_dev) {
+               dma_sync_single_for_device(host->dma_dev,
+                               host->data_dma, sizeof(*scratch),
+                               DMA_BIDIRECTIONAL);
+               dma_sync_single_for_device(host->dma_dev,
+                               t->rx_dma, t->len,
+                               DMA_FROM_DEVICE);
+       }
  
-       } else {
-               dev_dbg(&spi->dev, "read error %02x (%d)\n", status, status);
+       status = spi_sync(spi, &host->m);
  
-               /* we've read extra garbage, timed out, etc */
-               if (status < 0)
-                       return status;
+       if (host->dma_dev) {
+               dma_sync_single_for_cpu(host->dma_dev,
+                               host->data_dma, sizeof(*scratch),
+                               DMA_BIDIRECTIONAL);
+               dma_sync_single_for_cpu(host->dma_dev,
+                               t->rx_dma, t->len,
+                               DMA_FROM_DEVICE);
+       }
  
-               /* low four bits are an R2 subset, fifth seems to be
-                * vendor specific ... map them all to generic error..
+       if (bitshift) {
+               /* Walk through the data and the crc and do
+                * all the magic to get byte-aligned data.
                  */
-               return -EIO;
+               u8 *cp = t->rx_buf;
+               unsigned int len;
+               unsigned int bitright = 8 - bitshift;
+               u8 temp;
+               for (len = t->len; len; len--) {
+                       temp = *cp;
+                       *cp++ = leftover | (temp >> bitshift);
+                       leftover = temp << bitright;
+               }
+               cp = (u8 *) &scratch->crc_val;
+               temp = *cp;
+               *cp++ = leftover | (temp >> bitshift);
+               leftover = temp << bitright;
+               temp = *cp;
+               *cp = leftover | (temp >> bitshift);
         }
  
         if (host->mmc->use_spi_crc) {
@@ -803,7 +880,7 @@ mmc_spi_data_do(struct mmc_spi_host *host, struct mmc_command *cmd,
         unsigned                n_sg;
         int                     multiple = (data->blocks > 1);
         u32                     clock_rate;
-       ktime_t                 timeout;
+       unsigned long           timeout;
  
         if (data->flags & MMC_DATA_READ)
                 direction = DMA_FROM_DEVICE;
@@ -817,8 +894,9 @@ mmc_spi_data_do(struct mmc_spi_host *host, struct mmc_command *cmd,
         else
                 clock_rate = spi->max_speed_hz;
  
-       timeout = ktime_add_ns(ktime_set(0, 0), data->timeout_ns +
-                       data->timeout_clks * 1000000 / clock_rate);
+       timeout = data->timeout_ns +
+                 data->timeout_clks * 1000000 / clock_rate;
+       timeout = usecs_to_jiffies((unsigned int)(timeout / 1000)) + 1;
  
         /* Handle scatterlist segments one at a time, with synch for
          * each 512-byte block
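
Note on the mmc_spi_readblock() hunk in drivers/mmc/host/mmc_spi.c
above: when the start token arrives bit-shifted, the byte holding the
token also carries the first few payload bits, and every following byte
has to be shifted back into place. The sketch below is a userspace
model of that realignment loop. It is only an illustration under stated
assumptions: realign_after_token is a hypothetical name, not a kernel
function, and the DMA syncing and the two CRC bytes handled in the real
code are left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper, not a kernel API. 'token' is the byte in which
 * the start token was seen; 'buf' holds the 'len' bytes read after it.
 * The payload is shifted back onto byte boundaries in place. Returns
 * the number of payload bits found inside the token byte (0 means the
 * stream was already byte-aligned and buf is left untouched).
 */
static unsigned int realign_after_token(uint8_t token, uint8_t *buf,
					size_t len)
{
	unsigned int status = token;
	unsigned int bitshift = 7;
	uint8_t leftover;
	size_t i;

	/* The token byte must contain the terminating 0 bit. */
	assert(token != 0xff);

	/* Skip leading 1 bits; the first 0 bit precedes the data stream. */
	while (status & 0x80) {
		status <<= 1;
		bitshift--;
	}
	/* Payload bits that shared a byte with the token, left-justified. */
	leftover = (uint8_t)(status << 1);

	if (bitshift == 0)
		return 0;

	for (i = 0; i < len; i++) {
		uint8_t temp = buf[i];

		buf[i] = leftover | (uint8_t)(temp >> bitshift);
		leftover = (uint8_t)(temp << (8 - bitshift));
	}
	return bitshift;
}
```

For example, if the payload bytes 0xA5 0x3C arrive two bits late, the
token byte reads 0xA9 and the next two raw bytes read 0x4F 0x3F; the
helper reports a shift of 6 and rewrites the buffer to 0xA5 0x3C.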
diff --git a/drivers/mmc/host/omap_hsmmc.c b/drivers/mmc/host/omap_hsmmc.c
index d183be6f2a5f7ab369ef8b60c5cfd25815db110f..e62a22a7f00cb10089ab36424793aa498a81284f 100644
--- a/drivers/mmc/host/omap_hsmmc.c
+++ b/drivers/mmc/host/omap_hsmmc.c
@@ -298,7 +298,6 @@ mmc_omap_xfer_done(struct mmc_omap_host *host, struct mmc_data *data)
                 struct mmc_request *mrq = host->mrq;
  
                 host->mrq = NULL;
-               mmc_omap_fclk_lazy_disable(host);
                 mmc_request_done(host->mmc, mrq);
                 return;
         }
@@ -434,6 +433,8 @@ static irqreturn_t mmc_omap_irq(int irq, void *dev_id)
         if (host->mrq == NULL) {
                 OMAP_HSMMC_WRITE(host->base, STAT,
                         OMAP_HSMMC_READ(host->base, STAT));
+               /* Flush posted write */
+               OMAP_HSMMC_READ(host->base, STAT);
                 return IRQ_HANDLED;
         }
  
@@ -489,8 +490,10 @@ static irqreturn_t mmc_omap_irq(int irq, void *dev_id)
         }
  
         OMAP_HSMMC_WRITE(host->base, STAT, status);
+       /* Flush posted write */
+       OMAP_HSMMC_READ(host->base, STAT);
  
-       if (end_cmd || (status & CC))
+       if (end_cmd || ((status & CC) && host->cmd))
                 mmc_omap_cmd_done(host, host->cmd);
         if (end_trans || (status & TC))
                 mmc_omap_xfer_done(host, data);
diff --git a/drivers/mmc/host/sdhci-pci.c b/drivers/mmc/host/sdhci-pci.c
index c5b316e22371e687076eee25142bc0ae73de868c..cd37962ec44fd997d4e8ec76820d0dcf607f83e9 100644
--- a/drivers/mmc/host/sdhci-pci.c
+++ b/drivers/mmc/host/sdhci-pci.c
@@ -729,6 +729,6 @@ static void __exit sdhci_drv_exit(void)
  module_init(sdhci_drv_init);
  module_exit(sdhci_drv_exit);
  
-MODULE_AUTHOR("Pierre Ossman <drzeus@drzeus.cx>");
+MODULE_AUTHOR("Pierre Ossman <pierre@ossman.eu>");
  MODULE_DESCRIPTION("Secure Digital Host Controller Interface PCI driver");
  MODULE_LICENSE("GPL");
diff --git a/drivers/mmc/host/sdhci.c b/drivers/mmc/host/sdhci.c
index 30d8e3d4e6fdbc727605aa3f99563980c38940dd..9234be2226e743fc40b1d5b54e53a8500c09dd19 100644
--- a/drivers/mmc/host/sdhci.c
+++ b/drivers/mmc/host/sdhci.c
@@ -1935,7 +1935,7 @@ module_exit(sdhci_drv_exit);
  
  module_param(debug_quirks, uint, 0444);
  
-MODULE_AUTHOR("Pierre Ossman <drzeus@drzeus.cx>");
+MODULE_AUTHOR("Pierre Ossman <pierre@ossman.eu>");
  MODULE_DESCRIPTION("Secure Digital Host Controller Interface core driver");
  MODULE_LICENSE("GPL");
  
diff --git a/drivers/mmc/host/wbsd.c b/drivers/mmc/host/wbsd.c
index adda37952032af52335dcfbb69fb925bf16e5046..89bf8cd25cacec9a6df3d9d231a429c82ce36e40 100644
--- a/drivers/mmc/host/wbsd.c
+++ b/drivers/mmc/host/wbsd.c
@@ -2036,7 +2036,7 @@ module_param_named(irq, param_irq, uint, 0444);
  module_param_named(dma, param_dma, int, 0444);
  
  MODULE_LICENSE("GPL");
-MODULE_AUTHOR("Pierre Ossman <drzeus@drzeus.cx>");
+MODULE_AUTHOR("Pierre Ossman <pierre@ossman.eu>");
  MODULE_DESCRIPTION("Winbond W83L51xD SD/MMC card interface driver");
  
  #ifdef CONFIG_PNP
diff --git a/drivers/net/Kconfig b/drivers/net/Kconfig

index 9e7baec457202c6b3d39865752fb4ab819d4a42f..9e921544ba207d71fa99d16a2c6784079fe39e96 100644
--- a/drivers/net/Kconfig
+++ b/drivers/net/Kconfig
@@ -977,6 +977,8 @@ config ETHOC
         depends on NET_ETHERNET && HAS_IOMEM
         select MII
         select PHYLIB
+       select CRC32
+       select BITREVERSE
         help
           Say Y here if you want to use the OpenCores 10/100 Mbps Ethernet MAC.
  
@@ -2056,6 +2058,27 @@ config IGB_DCA
           driver.  DCA is a method for warming the CPU cache before data
           is used, with the intent of lessening the impact of cache misses.
  
+config IGBVF
+       tristate "Intel(R) 82576 Virtual Function Ethernet support"
+       depends on PCI
+       ---help---
+         This driver supports Intel(R) 82576 virtual functions.  For more
+         information on how to identify your adapter, go to the Adapter &
+         Driver ID Guide at:
+
+         <http://support.intel.com/support/network/adapter/pro100/21397.htm>
+
+         For general information and support, go to the Intel support
+         website at:
+
+         <http://support.intel.com>
+
+         More specific information on configuring the driver is in
+         <file:Documentation/networking/e1000.txt>.
+
+         To compile this driver as a module, choose M here. The module
+         will be called igbvf.
+
  source "drivers/net/ixp2000/Kconfig"
  
  config MYRI_SBUS
diff --git a/drivers/net/Makefile b/drivers/net/Makefile

index edc9a0d6171ddd6d9643f855b61dc5b51724e0ba..1fc4602a6ff2c29912f64759207f0d38a8e17895 100644
--- a/drivers/net/Makefile
+++ b/drivers/net/Makefile
@@ -6,6 +6,7 @@ obj-$(CONFIG_E1000) += e1000/
  obj-$(CONFIG_E1000E) += e1000e/
  obj-$(CONFIG_IBM_NEW_EMAC) += ibm_newemac/
  obj-$(CONFIG_IGB) += igb/
+obj-$(CONFIG_IGBVF) += igbvf/
  obj-$(CONFIG_IXGBE) += ixgbe/
  obj-$(CONFIG_IXGB) += ixgb/
  obj-$(CONFIG_IP1000) += ipg.o
diff --git a/drivers/net/bnx2.c b/drivers/net/bnx2.c

index 9d268be0b670e70a9921cdf66694400b76e7d788..d47839184a0680cbd51440afdc5d85fbd6c60a48 100644
--- a/drivers/net/bnx2.c
+++ b/drivers/net/bnx2.c
@@ -3427,8 +3427,8 @@ static int __devinit
  bnx2_request_firmware(struct bnx2 *bp)
  {
         const char *mips_fw_file, *rv2p_fw_file;
-       const struct bnx2_mips_fw_file *mips;
-       const struct bnx2_rv2p_fw_file *rv2p;
+       const struct bnx2_mips_fw_file *mips_fw;
+       const struct bnx2_rv2p_fw_file *rv2p_fw;
         int rc;
  
         if (CHIP_NUM(bp) == CHIP_NUM_5709) {
@@ -3452,21 +3452,21 @@ bnx2_request_firmware(struct bnx2 *bp)
                        rv2p_fw_file);
                 return rc;
         }
-       mips = (const struct bnx2_mips_fw_file *) bp->mips_firmware->data;
-       rv2p = (const struct bnx2_rv2p_fw_file *) bp->rv2p_firmware->data;
-       if (bp->mips_firmware->size < sizeof(*mips) ||
-           check_mips_fw_entry(bp->mips_firmware, &mips->com) ||
-           check_mips_fw_entry(bp->mips_firmware, &mips->cp) ||
-           check_mips_fw_entry(bp->mips_firmware, &mips->rxp) ||
-           check_mips_fw_entry(bp->mips_firmware, &mips->tpat) ||
-           check_mips_fw_entry(bp->mips_firmware, &mips->txp)) {
+       mips_fw = (const struct bnx2_mips_fw_file *) bp->mips_firmware->data;
+       rv2p_fw = (const struct bnx2_rv2p_fw_file *) bp->rv2p_firmware->data;
+       if (bp->mips_firmware->size < sizeof(*mips_fw) ||
+           check_mips_fw_entry(bp->mips_firmware, &mips_fw->com) ||
+           check_mips_fw_entry(bp->mips_firmware, &mips_fw->cp) ||
+           check_mips_fw_entry(bp->mips_firmware, &mips_fw->rxp) ||
+           check_mips_fw_entry(bp->mips_firmware, &mips_fw->tpat) ||
+           check_mips_fw_entry(bp->mips_firmware, &mips_fw->txp)) {
                 printk(KERN_ERR PFX "Firmware file \"%s\" is invalid\n",
                        mips_fw_file);
                 return -EINVAL;
         }
-       if (bp->rv2p_firmware->size < sizeof(*rv2p) ||
-           check_fw_section(bp->rv2p_firmware, &rv2p->proc1.rv2p, 8, true) ||
-           check_fw_section(bp->rv2p_firmware, &rv2p->proc2.rv2p, 8, true)) {
+       if (bp->rv2p_firmware->size < sizeof(*rv2p_fw) ||
+           check_fw_section(bp->rv2p_firmware, &rv2p_fw->proc1.rv2p, 8, true) ||
+           check_fw_section(bp->rv2p_firmware, &rv2p_fw->proc2.rv2p, 8, true)) {
                 printk(KERN_ERR PFX "Firmware file \"%s\" is invalid\n",
                        rv2p_fw_file);
                 return -EINVAL;
diff --git a/drivers/net/eql.c b/drivers/net/eql.c

index 51ead7941f83ad249f6757eda2b50f6b6521d758..5210bb1027cce43d168d5f09d391064af2fac812 100644
--- a/drivers/net/eql.c
+++ b/drivers/net/eql.c
@@ -542,6 +542,8 @@ static int eql_s_slave_cfg(struct net_device *dev, slave_config_t __user *scp)
         }
         spin_unlock_bh(&eql->queue.lock);
  
+       dev_put(slave_dev);
+
         return ret;
  }
  
diff --git a/drivers/net/fec.c b/drivers/net/fec.c

index a515acccc61f7f64a7b19ecda6773e5af26131c2..682e7f0b558127eb845be5bd60e73e97cafb59cc 100644
--- a/drivers/net/fec.c
+++ b/drivers/net/fec.c
@@ -1240,6 +1240,7 @@ static void __inline__ fec_phy_ack_intr(void)
         icrp = (volatile unsigned long *) (MCF_MBAR + MCFSIM_ICR1);
         *icrp = 0x0d000000;
  }
+#endif
  
  #ifdef CONFIG_M5272
  static void __inline__ fec_get_mac(struct net_device *dev)
diff --git a/drivers/net/igb/igb_main.c b/drivers/net/igb/igb_main.c

index 6b0697c565b95bbd055b35b995eed55eaa9a73c8..db7274e62228c0fe6c1925d210b262bf3752f667 100644
--- a/drivers/net/igb/igb_main.c
+++ b/drivers/net/igb/igb_main.c
@@ -152,14 +152,13 @@ static struct notifier_block dca_notifier = {
  /* for netdump / net console */
  static void igb_netpoll(struct net_device *);
  #endif
-
  #ifdef CONFIG_PCI_IOV
-static ssize_t igb_set_num_vfs(struct device *, struct device_attribute *,
-                               const char *, size_t);
-static ssize_t igb_show_num_vfs(struct device *, struct device_attribute *,
-                               char *);
-DEVICE_ATTR(num_vfs, S_IRUGO | S_IWUSR, igb_show_num_vfs, igb_set_num_vfs);
-#endif
+static unsigned int max_vfs = 0;
+module_param(max_vfs, uint, 0);
+MODULE_PARM_DESC(max_vfs, "Maximum number of virtual functions to allocate "
+                 "per physical function");
+#endif /* CONFIG_PCI_IOV */
+
  static pci_ers_result_t igb_io_error_detected(struct pci_dev *,
                      pci_channel_state_t);
  static pci_ers_result_t igb_io_slot_reset(struct pci_dev *);
@@ -671,6 +670,21 @@ static void igb_set_interrupt_capability(struct igb_adapter *adapter)
  
         /* If we can't do MSI-X, try MSI */
  msi_only:
+#ifdef CONFIG_PCI_IOV
+       /* disable SR-IOV for non MSI-X configurations */
+       if (adapter->vf_data) {
+               struct e1000_hw *hw = &adapter->hw;
+               /* disable iov and allow time for transactions to clear */
+               pci_disable_sriov(adapter->pdev);
+               msleep(500);
+
+               kfree(adapter->vf_data);
+               adapter->vf_data = NULL;
+               wr32(E1000_IOVCTL, E1000_IOVCTL_REUSE_VFQ);
+               msleep(100);
+               dev_info(&adapter->pdev->dev, "IOV Disabled\n");
+       }
+#endif
         adapter->num_rx_queues = 1;
         adapter->num_tx_queues = 1;
         if (!pci_enable_msi(adapter->pdev))
@@ -1238,6 +1252,39 @@ static int __devinit igb_probe(struct pci_dev *pdev,
         if (err)
                 goto err_sw_init;
  
+#ifdef CONFIG_PCI_IOV
+       /* since iov functionality isn't critical to base device function we
+        * can accept failure.  If it fails we don't allow iov to be enabled */
+       if (hw->mac.type == e1000_82576) {
+               /* 82576 supports a maximum of 7 VFs in addition to the PF */
+               unsigned int num_vfs = (max_vfs > 7) ? 7 : max_vfs;
+               int i;
+               unsigned char mac_addr[ETH_ALEN];
+
+               if (num_vfs)
+                       adapter->vf_data = kcalloc(num_vfs,
+                                               sizeof(struct vf_data_storage),
+                                               GFP_KERNEL);
+               if (!adapter->vf_data) {
+                       dev_err(&pdev->dev, "Could not allocate VF private "
+                               "data - IOV enable failed\n");
+               } else {
+                       err = pci_enable_sriov(pdev, num_vfs);
+                       if (!err) {
+                               adapter->vfs_allocated_count = num_vfs;
+                               dev_info(&pdev->dev, "%d vfs allocated\n", num_vfs);
+                               for (i = 0; i < adapter->vfs_allocated_count; i++) {
+                                       random_ether_addr(mac_addr);
+                                       igb_set_vf_mac(adapter, i, mac_addr);
+                               }
+                       } else {
+                               kfree(adapter->vf_data);
+                               adapter->vf_data = NULL;
+                       }
+               }
+       }
+
+#endif
         /* setup the private structure */
         err = igb_sw_init(adapter);
         if (err)
@@ -1397,19 +1444,6 @@ static int __devinit igb_probe(struct pci_dev *pdev,
         if (err)
                 goto err_register;
  
-#ifdef CONFIG_PCI_IOV
-       /* since iov functionality isn't critical to base device function we
-        * can accept failure.  If it fails we don't allow iov to be enabled */
-       if (hw->mac.type == e1000_82576) {
-               err = pci_enable_sriov(pdev, 0);
-               if (!err)
-                       err = device_create_file(&netdev->dev,
-                                                &dev_attr_num_vfs);
-               if (err)
-                       dev_err(&pdev->dev, "Failed to initialize IOV\n");
-       }
-
-#endif
  #ifdef CONFIG_IGB_DCA
         if (dca_add_requester(&pdev->dev) == 0) {
                 adapter->flags |= IGB_FLAG_DCA_ENABLED;
@@ -5422,89 +5456,4 @@ static void igb_vmm_control(struct igb_adapter *adapter)
         igb_vmdq_set_replication_pf(hw, true);
  }
  
-#ifdef CONFIG_PCI_IOV
-static ssize_t igb_show_num_vfs(struct device *dev,
-                                struct device_attribute *attr, char *buf)
-{
-       struct igb_adapter *adapter = netdev_priv(to_net_dev(dev));
-
-       return sprintf(buf, "%d\n", adapter->vfs_allocated_count);
-}
-
-static ssize_t igb_set_num_vfs(struct device *dev,
-                               struct device_attribute *attr,
-                               const char *buf, size_t count)
-{
-       struct net_device *netdev = to_net_dev(dev);
-       struct igb_adapter *adapter = netdev_priv(netdev);
-       struct e1000_hw *hw = &adapter->hw;
-       struct pci_dev *pdev = adapter->pdev;
-       unsigned int num_vfs, i;
-       unsigned char mac_addr[ETH_ALEN];
-       int err;
-
-       sscanf(buf, "%u", &num_vfs);
-
-       if (num_vfs > 7)
-               num_vfs = 7;
-
-       /* value unchanged do nothing */
-       if (num_vfs == adapter->vfs_allocated_count)
-               return count;
-
-       if (netdev->flags & IFF_UP)
-               igb_close(netdev);
-
-       igb_reset_interrupt_capability(adapter);
-       igb_free_queues(adapter);
-       adapter->tx_ring = NULL;
-       adapter->rx_ring = NULL;
-       adapter->vfs_allocated_count = 0;
-
-       /* reclaim resources allocated to VFs since we are changing count */
-       if (adapter->vf_data) {
-               /* disable iov and allow time for transactions to clear */
-               pci_disable_sriov(pdev);
-               msleep(500);
-
-               kfree(adapter->vf_data);
-               adapter->vf_data = NULL;
-               wr32(E1000_IOVCTL, E1000_IOVCTL_REUSE_VFQ);
-               msleep(100);
-               dev_info(&pdev->dev, "IOV Disabled\n");
-       }
-
-       if (num_vfs) {
-               adapter->vf_data = kcalloc(num_vfs,
-                                          sizeof(struct vf_data_storage),
-                                          GFP_KERNEL);
-               if (!adapter->vf_data) {
-                       dev_err(&pdev->dev, "Could not allocate VF private "
-                               "data - IOV enable failed\n");
-               } else {
-                       err = pci_enable_sriov(pdev, num_vfs);
-                       if (!err) {
-                               adapter->vfs_allocated_count = num_vfs;
-                               dev_info(&pdev->dev, "%d vfs allocated\n", num_vfs);
-                               for (i = 0; i < adapter->vfs_allocated_count; i++) {
-                                       random_ether_addr(mac_addr);
-                                       igb_set_vf_mac(adapter, i, mac_addr);
-                               }
-                       } else {
-                               kfree(adapter->vf_data);
-                               adapter->vf_data = NULL;
-                       }
-               }
-       }
-
-       igb_set_interrupt_capability(adapter);
-       igb_alloc_queues(adapter);
-       igb_reset(adapter);
-
-       if (netdev->flags & IFF_UP)
-               igb_open(netdev);
-
-       return count;
-}
-#endif /* CONFIG_PCI_IOV */
  /* igb_main.c */
diff --git a/drivers/net/igbvf/Makefile b/drivers/net/igbvf/Makefile

new file mode 100644

index 0000000..c2f150d
--- /dev/null
+++ b/drivers/net/igbvf/Makefile
@@ -0,0 +1,38 @@
+################################################################################
+#
+# Intel(R) 82576 Virtual Function Linux driver
+# Copyright(c) 2009 Intel Corporation.
+#
+# This program is free software; you can redistribute it and/or modify it
+# under the terms and conditions of the GNU General Public License,
+# version 2, as published by the Free Software Foundation.
+#
+# This program is distributed in the hope it will be useful, but WITHOUT
+# ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+# FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
+# more details.
+#
+# You should have received a copy of the GNU General Public License along with
+# this program; if not, write to the Free Software Foundation, Inc.,
+# 51 Franklin St - Fifth Floor, Boston, MA 02110-1301 USA.
+#
+# The full GNU General Public License is included in this distribution in
+# the file called "COPYING".
+#
+# Contact Information:
+# e1000-devel Mailing List <e1000-devel@lists.sourceforge.net>
+# Intel Corporation, 5200 N.E. Elam Young Parkway, Hillsboro, OR 97124-6497
+#
+################################################################################
+
+#
+# Makefile for the Intel(R) 82576 VF ethernet driver
+#
+
+obj-$(CONFIG_IGBVF) += igbvf.o
+
+igbvf-objs := vf.o \
+              mbx.o \
+              ethtool.o \
+              netdev.o
+
diff --git a/drivers/net/igbvf/defines.h b/drivers/net/igbvf/defines.h

new file mode 100644

index 0000000..88a4753
--- /dev/null
+++ b/drivers/net/igbvf/defines.h
@@ -0,0 +1,125 @@
+/*******************************************************************************
+
+  Intel(R) 82576 Virtual Function Linux driver
+  Copyright(c) 1999 - 2009 Intel Corporation.
+
+  This program is free software; you can redistribute it and/or modify it
+  under the terms and conditions of the GNU General Public License,
+  version 2, as published by the Free Software Foundation.
+
+  This program is distributed in the hope it will be useful, but WITHOUT
+  ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+  FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
+  more details.
+
+  You should have received a copy of the GNU General Public License along with
+  this program; if not, write to the Free Software Foundation, Inc.,
+  51 Franklin St - Fifth Floor, Boston, MA 02110-1301 USA.
+
+  The full GNU General Public License is included in this distribution in
+  the file called "COPYING".
+
+  Contact Information:
+  e1000-devel Mailing List <e1000-devel@lists.sourceforge.net>
+  Intel Corporation, 5200 N.E. Elam Young Parkway, Hillsboro, OR 97124-6497
+
+*******************************************************************************/
+
+#ifndef _E1000_DEFINES_H_
+#define _E1000_DEFINES_H_
+
+/* Number of Transmit and Receive Descriptors must be a multiple of 8 */
+#define REQ_TX_DESCRIPTOR_MULTIPLE  8
+#define REQ_RX_DESCRIPTOR_MULTIPLE  8
+
+/* IVAR valid bit */
+#define E1000_IVAR_VALID        0x80
+
+/* Receive Descriptor bit definitions */
+#define E1000_RXD_STAT_DD       0x01    /* Descriptor Done */
+#define E1000_RXD_STAT_EOP      0x02    /* End of Packet */
+#define E1000_RXD_STAT_IXSM     0x04    /* Ignore checksum */
+#define E1000_RXD_STAT_VP       0x08    /* IEEE VLAN Packet */
+#define E1000_RXD_STAT_UDPCS    0x10    /* UDP xsum calculated */
+#define E1000_RXD_STAT_TCPCS    0x20    /* TCP xsum calculated */
+#define E1000_RXD_STAT_IPCS     0x40    /* IP xsum calculated */
+#define E1000_RXD_ERR_SE        0x02    /* Symbol Error */
+#define E1000_RXD_SPC_VLAN_MASK 0x0FFF  /* VLAN ID is in lower 12 bits */
+
+#define E1000_RXDEXT_STATERR_CE    0x01000000
+#define E1000_RXDEXT_STATERR_SE    0x02000000
+#define E1000_RXDEXT_STATERR_SEQ   0x04000000
+#define E1000_RXDEXT_STATERR_CXE   0x10000000
+#define E1000_RXDEXT_STATERR_TCPE  0x20000000
+#define E1000_RXDEXT_STATERR_IPE   0x40000000
+#define E1000_RXDEXT_STATERR_RXE   0x80000000
+
+
+/* Same mask, but for extended and packet split descriptors */
+#define E1000_RXDEXT_ERR_FRAME_ERR_MASK ( \
+    E1000_RXDEXT_STATERR_CE  |            \
+    E1000_RXDEXT_STATERR_SE  |            \
+    E1000_RXDEXT_STATERR_SEQ |            \
+    E1000_RXDEXT_STATERR_CXE |            \
+    E1000_RXDEXT_STATERR_RXE)
+
+/* Device Control */
+#define E1000_CTRL_RST      0x04000000  /* Global reset */
+
+/* Device Status */
+#define E1000_STATUS_FD         0x00000001      /* Full duplex.0=half,1=full */
+#define E1000_STATUS_LU         0x00000002      /* Link up.0=no,1=link */
+#define E1000_STATUS_TXOFF      0x00000010      /* transmission paused */
+#define E1000_STATUS_SPEED_10   0x00000000      /* Speed 10Mb/s */
+#define E1000_STATUS_SPEED_100  0x00000040      /* Speed 100Mb/s */
+#define E1000_STATUS_SPEED_1000 0x00000080      /* Speed 1000Mb/s */
+
+#define SPEED_10    10
+#define SPEED_100   100
+#define SPEED_1000  1000
+#define HALF_DUPLEX 1
+#define FULL_DUPLEX 2
+
+/* Transmit Descriptor bit definitions */
+#define E1000_TXD_POPTS_IXSM 0x01       /* Insert IP checksum */
+#define E1000_TXD_POPTS_TXSM 0x02       /* Insert TCP/UDP checksum */
+#define E1000_TXD_CMD_DEXT   0x20000000 /* Descriptor extension (0 = legacy) */
+#define E1000_TXD_STAT_DD    0x00000001 /* Descriptor Done */
+
+#define MAX_JUMBO_FRAME_SIZE    0x3F00
+
+/* 802.1q VLAN Packet Size */
+#define VLAN_TAG_SIZE              4    /* 802.3ac tag (not DMA'd) */
+
+/* Error Codes */
+#define E1000_SUCCESS      0
+#define E1000_ERR_CONFIG   3
+#define E1000_ERR_MAC_INIT 5
+#define E1000_ERR_MBX      15
+
+#ifndef ETH_ADDR_LEN
+#define ETH_ADDR_LEN                 6
+#endif
+
+/* SRRCTL bit definitions */
+#define E1000_SRRCTL_BSIZEPKT_SHIFT                     10 /* Shift _right_ */
+#define E1000_SRRCTL_BSIZEHDRSIZE_MASK                  0x00000F00
+#define E1000_SRRCTL_BSIZEHDRSIZE_SHIFT                 2  /* Shift _left_ */
+#define E1000_SRRCTL_DESCTYPE_ADV_ONEBUF                0x02000000
+#define E1000_SRRCTL_DESCTYPE_HDR_SPLIT_ALWAYS          0x0A000000
+#define E1000_SRRCTL_DESCTYPE_MASK                      0x0E000000
+#define E1000_SRRCTL_DROP_EN                            0x80000000
+
+#define E1000_SRRCTL_BSIZEPKT_MASK      0x0000007F
+#define E1000_SRRCTL_BSIZEHDR_MASK      0x00003F00
+
+/* Additional Descriptor Control definitions */
+#define E1000_TXDCTL_QUEUE_ENABLE  0x02000000 /* Enable specific Tx Queue */
+#define E1000_RXDCTL_QUEUE_ENABLE  0x02000000 /* Enable specific Rx Queue */
+
+/* Direct Cache Access (DCA) definitions */
+#define E1000_DCA_TXCTRL_TX_WB_RO_EN (1 << 11) /* Tx Desc writeback RO bit */
+
+#define E1000_VF_INIT_TIMEOUT 200 /* Number of retries to clear RSTI */
+
+#endif /* _E1000_DEFINES_H_ */
diff --git a/drivers/net/igbvf/ethtool.c b/drivers/net/igbvf/ethtool.c

new file mode 100644

index 0000000..1dcaa69
--- /dev/null
+++ b/drivers/net/igbvf/ethtool.c
@@ -0,0 +1,540 @@
+/*******************************************************************************
+
+  Intel(R) 82576 Virtual Function Linux driver
+  Copyright(c) 2009 Intel Corporation.
+
+  This program is free software; you can redistribute it and/or modify it
+  under the terms and conditions of the GNU General Public License,
+  version 2, as published by the Free Software Foundation.
+
+  This program is distributed in the hope it will be useful, but WITHOUT
+  ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+  FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
+  more details.
+
+  You should have received a copy of the GNU General Public License along with
+  this program; if not, write to the Free Software Foundation, Inc.,
+  51 Franklin St - Fifth Floor, Boston, MA 02110-1301 USA.
+
+  The full GNU General Public License is included in this distribution in
+  the file called "COPYING".
+
+  Contact Information:
+  e1000-devel Mailing List <e1000-devel@lists.sourceforge.net>
+  Intel Corporation, 5200 N.E. Elam Young Parkway, Hillsboro, OR 97124-6497
+
+*******************************************************************************/
+
+/* ethtool support for igbvf */
+
+#include <linux/netdevice.h>
+#include <linux/ethtool.h>
+#include <linux/pci.h>
+#include <linux/vmalloc.h>
+#include <linux/delay.h>
+
+#include "igbvf.h"
+#include <linux/if_vlan.h>
+
+
+struct igbvf_stats {
+       char stat_string[ETH_GSTRING_LEN];
+       int sizeof_stat;
+       int stat_offset;
+       int base_stat_offset;
+};
+
+#define IGBVF_STAT(current, base) \
+               sizeof(((struct igbvf_adapter *)0)->current), \
+               offsetof(struct igbvf_adapter, current), \
+               offsetof(struct igbvf_adapter, base)
+
+static const struct igbvf_stats igbvf_gstrings_stats[] = {
+       { "rx_packets", IGBVF_STAT(stats.gprc, stats.base_gprc) },
+       { "tx_packets", IGBVF_STAT(stats.gptc, stats.base_gptc) },
+       { "rx_bytes", IGBVF_STAT(stats.gorc, stats.base_gorc) },
+       { "tx_bytes", IGBVF_STAT(stats.gotc, stats.base_gotc) },
+       { "multicast", IGBVF_STAT(stats.mprc, stats.base_mprc) },
+       { "lbrx_bytes", IGBVF_STAT(stats.gorlbc, stats.base_gorlbc) },
+       { "lbrx_packets", IGBVF_STAT(stats.gprlbc, stats.base_gprlbc) },
+       { "tx_restart_queue", IGBVF_STAT(restart_queue, zero_base) },
+       { "rx_long_byte_count", IGBVF_STAT(stats.gorc, stats.base_gorc) },
+       { "rx_csum_offload_good", IGBVF_STAT(hw_csum_good, zero_base) },
+       { "rx_csum_offload_errors", IGBVF_STAT(hw_csum_err, zero_base) },
+       { "rx_header_split", IGBVF_STAT(rx_hdr_split, zero_base) },
+       { "alloc_rx_buff_failed", IGBVF_STAT(alloc_rx_buff_failed, zero_base) },
+};
+
+#define IGBVF_GLOBAL_STATS_LEN ARRAY_SIZE(igbvf_gstrings_stats)
+
+static const char igbvf_gstrings_test[][ETH_GSTRING_LEN] = {
+       "Link test   (on/offline)"
+};
+
+#define IGBVF_TEST_LEN ARRAY_SIZE(igbvf_gstrings_test)
+
+static int igbvf_get_settings(struct net_device *netdev,
+                              struct ethtool_cmd *ecmd)
+{
+       struct igbvf_adapter *adapter = netdev_priv(netdev);
+       struct e1000_hw *hw = &adapter->hw;
+       u32 status;
+
+       ecmd->supported   = SUPPORTED_1000baseT_Full;
+
+       ecmd->advertising = ADVERTISED_1000baseT_Full;
+
+       ecmd->port = -1;
+       ecmd->transceiver = XCVR_DUMMY1;
+
+       status = er32(STATUS);
+       if (status & E1000_STATUS_LU) {
+               if (status & E1000_STATUS_SPEED_1000)
+                       ecmd->speed = 1000;
+               else if (status & E1000_STATUS_SPEED_100)
+                       ecmd->speed = 100;
+               else
+                       ecmd->speed = 10;
+
+               if (status & E1000_STATUS_FD)
+                       ecmd->duplex = DUPLEX_FULL;
+               else
+                       ecmd->duplex = DUPLEX_HALF;
+       } else {
+               ecmd->speed = -1;
+               ecmd->duplex = -1;
+       }
+
+       ecmd->autoneg = AUTONEG_DISABLE;
+
+       return 0;
+}
+
+static u32 igbvf_get_link(struct net_device *netdev)
+{
+       return netif_carrier_ok(netdev);
+}
+
+static int igbvf_set_settings(struct net_device *netdev,
+                              struct ethtool_cmd *ecmd)
+{
+       return -EOPNOTSUPP;
+}
+
+static void igbvf_get_pauseparam(struct net_device *netdev,
+                                 struct ethtool_pauseparam *pause)
+{
+       return;
+}
+
+static int igbvf_set_pauseparam(struct net_device *netdev,
+                                struct ethtool_pauseparam *pause)
+{
+       return -EOPNOTSUPP;
+}
+
+static u32 igbvf_get_tx_csum(struct net_device *netdev)
+{
+       return ((netdev->features & NETIF_F_IP_CSUM) != 0);
+}
+
+static int igbvf_set_tx_csum(struct net_device *netdev, u32 data)
+{
+       if (data)
+               netdev->features |= (NETIF_F_IP_CSUM | NETIF_F_IPV6_CSUM);
+       else
+               netdev->features &= ~(NETIF_F_IP_CSUM | NETIF_F_IPV6_CSUM);
+       return 0;
+}
+
+static int igbvf_set_tso(struct net_device *netdev, u32 data)
+{
+       struct igbvf_adapter *adapter = netdev_priv(netdev);
+       int i;
+       struct net_device *v_netdev;
+
+       if (data) {
+               netdev->features |= NETIF_F_TSO;
+               netdev->features |= NETIF_F_TSO6;
+       } else {
+               netdev->features &= ~NETIF_F_TSO;
+               netdev->features &= ~NETIF_F_TSO6;
+               /* disable TSO on all VLANs if they're present */
+               if (!adapter->vlgrp)
+                       goto tso_out;
+               for (i = 0; i < VLAN_GROUP_ARRAY_LEN; i++) {
+                       v_netdev = vlan_group_get_device(adapter->vlgrp, i);
+                       if (!v_netdev)
+                               continue;
+
+                       v_netdev->features &= ~NETIF_F_TSO;
+                       v_netdev->features &= ~NETIF_F_TSO6;
+                       vlan_group_set_device(adapter->vlgrp, i, v_netdev);
+               }
+       }
+
+tso_out:
+       dev_info(&adapter->pdev->dev, "TSO is %s\n",
+                data ? "Enabled" : "Disabled");
+       adapter->flags |= FLAG_TSO_FORCE;
+       return 0;
+}
+
+static u32 igbvf_get_msglevel(struct net_device *netdev)
+{
+       struct igbvf_adapter *adapter = netdev_priv(netdev);
+       return adapter->msg_enable;
+}
+
+static void igbvf_set_msglevel(struct net_device *netdev, u32 data)
+{
+       struct igbvf_adapter *adapter = netdev_priv(netdev);
+       adapter->msg_enable = data;
+}
+
+static int igbvf_get_regs_len(struct net_device *netdev)
+{
+#define IGBVF_REGS_LEN 8
+       return IGBVF_REGS_LEN * sizeof(u32);
+}
+
+static void igbvf_get_regs(struct net_device *netdev,
+                           struct ethtool_regs *regs, void *p)
+{
+       struct igbvf_adapter *adapter = netdev_priv(netdev);
+       struct e1000_hw *hw = &adapter->hw;
+       u32 *regs_buff = p;
+       u8 revision_id;
+
+       memset(p, 0, IGBVF_REGS_LEN * sizeof(u32));
+
+       pci_read_config_byte(adapter->pdev, PCI_REVISION_ID, &revision_id);
+
+       regs->version = (1 << 24) | (revision_id << 16) | adapter->pdev->device;
+
+       regs_buff[0] = er32(CTRL);
+       regs_buff[1] = er32(STATUS);
+
+       regs_buff[2] = er32(RDLEN(0));
+       regs_buff[3] = er32(RDH(0));
+       regs_buff[4] = er32(RDT(0));
+
+       regs_buff[5] = er32(TDLEN(0));
+       regs_buff[6] = er32(TDH(0));
+       regs_buff[7] = er32(TDT(0));
+}
+
+static int igbvf_get_eeprom_len(struct net_device *netdev)
+{
+       return 0;
+}
+
+static int igbvf_get_eeprom(struct net_device *netdev,
+                            struct ethtool_eeprom *eeprom, u8 *bytes)
+{
+       return -EOPNOTSUPP;
+}
+
+static int igbvf_set_eeprom(struct net_device *netdev,
+                            struct ethtool_eeprom *eeprom, u8 *bytes)
+{
+       return -EOPNOTSUPP;
+}
+
+static void igbvf_get_drvinfo(struct net_device *netdev,
+                              struct ethtool_drvinfo *drvinfo)
+{
+       struct igbvf_adapter *adapter = netdev_priv(netdev);
+       char firmware_version[32] = "N/A";
+
+       strncpy(drvinfo->driver,  igbvf_driver_name, 32);
+       strncpy(drvinfo->version, igbvf_driver_version, 32);
+       strncpy(drvinfo->fw_version, firmware_version, 32);
+       strncpy(drvinfo->bus_info, pci_name(adapter->pdev), 32);
+       drvinfo->regdump_len = igbvf_get_regs_len(netdev);
+       drvinfo->eedump_len = igbvf_get_eeprom_len(netdev);
+}
+
+static void igbvf_get_ringparam(struct net_device *netdev,
+                                struct ethtool_ringparam *ring)
+{
+       struct igbvf_adapter *adapter = netdev_priv(netdev);
+       struct igbvf_ring *tx_ring = adapter->tx_ring;
+       struct igbvf_ring *rx_ring = adapter->rx_ring;
+
+       ring->rx_max_pending = IGBVF_MAX_RXD;
+       ring->tx_max_pending = IGBVF_MAX_TXD;
+       ring->rx_mini_max_pending = 0;
+       ring->rx_jumbo_max_pending = 0;
+       ring->rx_pending = rx_ring->count;
+       ring->tx_pending = tx_ring->count;
+       ring->rx_mini_pending = 0;
+       ring->rx_jumbo_pending = 0;
+}
+
+static int igbvf_set_ringparam(struct net_device *netdev,
+                               struct ethtool_ringparam *ring)
+{
+       struct igbvf_adapter *adapter = netdev_priv(netdev);
+       struct igbvf_ring *temp_ring;
+       int err;
+       u32 new_rx_count, new_tx_count;
+
+       if ((ring->rx_mini_pending) || (ring->rx_jumbo_pending))
+               return -EINVAL;
+
+       new_rx_count = max(ring->rx_pending, (u32)IGBVF_MIN_RXD);
+       new_rx_count = min(new_rx_count, (u32)IGBVF_MAX_RXD);
+       new_rx_count = ALIGN(new_rx_count, REQ_RX_DESCRIPTOR_MULTIPLE);
+
+       new_tx_count = max(ring->tx_pending, (u32)IGBVF_MIN_TXD);
+       new_tx_count = min(new_tx_count, (u32)IGBVF_MAX_TXD);
+       new_tx_count = ALIGN(new_tx_count, REQ_TX_DESCRIPTOR_MULTIPLE);
+
+       if ((new_tx_count == adapter->tx_ring->count) &&
+           (new_rx_count == adapter->rx_ring->count)) {
+               /* nothing to do */
+               return 0;
+       }
+
+       temp_ring = vmalloc(sizeof(struct igbvf_ring));
+       if (!temp_ring)
+               return -ENOMEM;
+
+       while (test_and_set_bit(__IGBVF_RESETTING, &adapter->state))
+               msleep(1);
+
+       if (netif_running(adapter->netdev))
+               igbvf_down(adapter);
+
+       /*
+        * We can't just free everything and then setup again,
+        * because the ISRs in MSI-X mode get passed pointers
+        * to the tx and rx ring structs.
+        */
+       if (new_tx_count != adapter->tx_ring->count) {
+               memcpy(temp_ring, adapter->tx_ring, sizeof(struct igbvf_ring));
+
+               temp_ring->count = new_tx_count;
+               err = igbvf_setup_tx_resources(adapter, temp_ring);
+               if (err)
+                       goto err_setup;
+
+               igbvf_free_tx_resources(adapter->tx_ring);
+
+               memcpy(adapter->tx_ring, temp_ring, sizeof(struct igbvf_ring));
+       }
+
+       if (new_rx_count != adapter->rx_ring->count) {
+               memcpy(temp_ring, adapter->rx_ring, sizeof(struct igbvf_ring));
+
+               temp_ring->count = new_rx_count;
+               err = igbvf_setup_rx_resources(adapter, temp_ring);
+               if (err)
+                       goto err_setup;
+
+               igbvf_free_rx_resources(adapter->rx_ring);
+
+               memcpy(adapter->rx_ring, temp_ring, sizeof(struct igbvf_ring));
+       }
+
+       err = 0;
+err_setup:
+       if (netif_running(adapter->netdev))
+               igbvf_up(adapter);
+
+       clear_bit(__IGBVF_RESETTING, &adapter->state);
+       vfree(temp_ring);
+       return err;
+}
+
+static int igbvf_link_test(struct igbvf_adapter *adapter, u64 *data)
+{
+       struct e1000_hw *hw = &adapter->hw;
+       *data = 0;
+
+       hw->mac.ops.check_for_link(hw);
+
+       if (!(er32(STATUS) & E1000_STATUS_LU))
+               *data = 1;
+
+       return *data;
+}
+
+static int igbvf_get_self_test_count(struct net_device *netdev)
+{
+       return IGBVF_TEST_LEN;
+}
+
+static int igbvf_get_stats_count(struct net_device *netdev)
+{
+       return IGBVF_GLOBAL_STATS_LEN;
+}
+
+static void igbvf_diag_test(struct net_device *netdev,
+                            struct ethtool_test *eth_test, u64 *data)
+{
+       struct igbvf_adapter *adapter = netdev_priv(netdev);
+
+       set_bit(__IGBVF_TESTING, &adapter->state);
+
+       /*
+        * Link test performed before hardware reset so autoneg doesn't
+        * interfere with test result
+        */
+       if (igbvf_link_test(adapter, &data[0]))
+               eth_test->flags |= ETH_TEST_FL_FAILED;
+
+       clear_bit(__IGBVF_TESTING, &adapter->state);
+       msleep_interruptible(4 * 1000);
+}
+
+static void igbvf_get_wol(struct net_device *netdev,
+                          struct ethtool_wolinfo *wol)
+{
+       wol->supported = 0;
+       wol->wolopts = 0;
+
+       return;
+}
+
+static int igbvf_set_wol(struct net_device *netdev,
+                         struct ethtool_wolinfo *wol)
+{
+       return -EOPNOTSUPP;
+}
+
+static int igbvf_phys_id(struct net_device *netdev, u32 data)
+{
+       return 0;
+}
+
+static int igbvf_get_coalesce(struct net_device *netdev,
+                              struct ethtool_coalesce *ec)
+{
+       struct igbvf_adapter *adapter = netdev_priv(netdev);
+
+       if (adapter->itr_setting <= 3)
+               ec->rx_coalesce_usecs = adapter->itr_setting;
+       else
+               ec->rx_coalesce_usecs = adapter->itr_setting >> 2;
+
+       return 0;
+}
+
+static int igbvf_set_coalesce(struct net_device *netdev,
+                              struct ethtool_coalesce *ec)
+{
+       struct igbvf_adapter *adapter = netdev_priv(netdev);
+       struct e1000_hw *hw = &adapter->hw;
+
+       if ((ec->rx_coalesce_usecs > IGBVF_MAX_ITR_USECS) ||
+           ((ec->rx_coalesce_usecs > 3) &&
+            (ec->rx_coalesce_usecs < IGBVF_MIN_ITR_USECS)) ||
+           (ec->rx_coalesce_usecs == 2))
+               return -EINVAL;
+
+       /* convert to rate of irq's per second */
+       if (ec->rx_coalesce_usecs && ec->rx_coalesce_usecs <= 3) {
+               adapter->itr = IGBVF_START_ITR;
+               adapter->itr_setting = ec->rx_coalesce_usecs;
+       } else {
+               adapter->itr = ec->rx_coalesce_usecs << 2;
+               adapter->itr_setting = adapter->itr;
+       }
+
+       writel(adapter->itr,
+              hw->hw_addr + adapter->rx_ring[0].itr_register);
+
+       return 0;
+}
+
+static int igbvf_nway_reset(struct net_device *netdev)
+{
+       struct igbvf_adapter *adapter = netdev_priv(netdev);
+       if (netif_running(netdev))
+               igbvf_reinit_locked(adapter);
+       return 0;
+}
+
+
+static void igbvf_get_ethtool_stats(struct net_device *netdev,
+                                    struct ethtool_stats *stats,
+                                    u64 *data)
+{
+       struct igbvf_adapter *adapter = netdev_priv(netdev);
+       int i;
+
+       igbvf_update_stats(adapter);
+       for (i = 0; i < IGBVF_GLOBAL_STATS_LEN; i++) {
+               char *p = (char *)adapter +
+                         igbvf_gstrings_stats[i].stat_offset;
+               char *b = (char *)adapter +
+                         igbvf_gstrings_stats[i].base_stat_offset;
+               data[i] = ((igbvf_gstrings_stats[i].sizeof_stat ==
+                           sizeof(u64)) ? (*(u64 *)p - *(u64 *)b) :
+                           (*(u32 *)p - *(u32 *)b));
+       }
+
+}
+
+static void igbvf_get_strings(struct net_device *netdev, u32 stringset,
+                              u8 *data)
+{
+       u8 *p = data;
+       int i;
+
+       switch (stringset) {
+       case ETH_SS_TEST:
+               memcpy(data, *igbvf_gstrings_test, sizeof(igbvf_gstrings_test));
+               break;
+       case ETH_SS_STATS:
+               for (i = 0; i < IGBVF_GLOBAL_STATS_LEN; i++) {
+                       memcpy(p, igbvf_gstrings_stats[i].stat_string,
+                              ETH_GSTRING_LEN);
+                       p += ETH_GSTRING_LEN;
+               }
+               break;
+       }
+}
+
+static const struct ethtool_ops igbvf_ethtool_ops = {
+       .get_settings           = igbvf_get_settings,
+       .set_settings           = igbvf_set_settings,
+       .get_drvinfo            = igbvf_get_drvinfo,
+       .get_regs_len           = igbvf_get_regs_len,
+       .get_regs               = igbvf_get_regs,
+       .get_wol                = igbvf_get_wol,
+       .set_wol                = igbvf_set_wol,
+       .get_msglevel           = igbvf_get_msglevel,
+       .set_msglevel           = igbvf_set_msglevel,
+       .nway_reset             = igbvf_nway_reset,
+       .get_link               = igbvf_get_link,
+       .get_eeprom_len         = igbvf_get_eeprom_len,
+       .get_eeprom             = igbvf_get_eeprom,
+       .set_eeprom             = igbvf_set_eeprom,
+       .get_ringparam          = igbvf_get_ringparam,
+       .set_ringparam          = igbvf_set_ringparam,
+       .get_pauseparam         = igbvf_get_pauseparam,
+       .set_pauseparam         = igbvf_set_pauseparam,
+       .get_tx_csum            = igbvf_get_tx_csum,
+       .set_tx_csum            = igbvf_set_tx_csum,
+       .get_sg                 = ethtool_op_get_sg,
+       .set_sg                 = ethtool_op_set_sg,
+       .get_tso                = ethtool_op_get_tso,
+       .set_tso                = igbvf_set_tso,
+       .self_test              = igbvf_diag_test,
+       .get_strings            = igbvf_get_strings,
+       .phys_id                = igbvf_phys_id,
+       .get_ethtool_stats      = igbvf_get_ethtool_stats,
+       .self_test_count        = igbvf_get_self_test_count,
+       .get_stats_count        = igbvf_get_stats_count,
+       .get_coalesce           = igbvf_get_coalesce,
+       .set_coalesce           = igbvf_set_coalesce,
+};
+
+void igbvf_set_ethtool_ops(struct net_device *netdev)
+{
+       /* have to "undeclare" const on this struct to remove warnings */
+       SET_ETHTOOL_OPS(netdev, (struct ethtool_ops *)&igbvf_ethtool_ops);
+}
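Note on igbvf_set_ringparam() above: the requested descriptor counts are
clamped to the driver's limits and then rounded up to a required multiple
before any ring is rebuilt. The following is a minimal user-space sketch of
that arithmetic, not driver code. The limits are the IGBVF_MIN_RXD and
IGBVF_MAX_RXD values from igbvf.h in this patch; the multiple of 8 is an
assumption, because REQ_RX_DESCRIPTOR_MULTIPLE is defined outside this
section, and align_up() and normalize_rx_count() are hypothetical helper
names.

```c
#include <assert.h>
#include <stdint.h>

/* Limits copied from igbvf.h in this patch. */
#define IGBVF_MIN_RXD 80u
#define IGBVF_MAX_RXD 4096u
/* Assumption: the real REQ_RX_DESCRIPTOR_MULTIPLE is not shown here. */
#define DESC_MULTIPLE 8u

/* Round x up to the next multiple of a; a must be a power of two. */
static uint32_t align_up(uint32_t x, uint32_t a)
{
    assert(a != 0 && (a & (a - 1)) == 0);
    return (x + a - 1) & ~(a - 1);
}

/* Clamp a requested Rx descriptor count, then round it up. */
static uint32_t normalize_rx_count(uint32_t requested)
{
    uint32_t count = requested;

    if (count < IGBVF_MIN_RXD)
        count = IGBVF_MIN_RXD;
    if (count > IGBVF_MAX_RXD)
        count = IGBVF_MAX_RXD;
    return align_up(count, DESC_MULTIPLE);
}
```

Under these assumptions a request for 100 descriptors becomes 104, and any
request above the maximum becomes 4096. Because both limits are already
multiples of 8, rounding up after clamping cannot push the result past the
maximum.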
diff --git a/drivers/net/igbvf/igbvf.h b/drivers/net/igbvf/igbvf.h
new file mode 100644
index 0000000..936ed2a
--- /dev/null
+++ b/drivers/net/igbvf/igbvf.h
@@ -0,0 +1,335 @@
+/*******************************************************************************
+
+  Intel(R) 82576 Virtual Function Linux driver
+  Copyright(c) 2009 Intel Corporation.
+
+  This program is free software; you can redistribute it and/or modify it
+  under the terms and conditions of the GNU General Public License,
+  version 2, as published by the Free Software Foundation.
+
+  This program is distributed in the hope it will be useful, but WITHOUT
+  ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+  FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
+  more details.
+
+  You should have received a copy of the GNU General Public License along with
+  this program; if not, write to the Free Software Foundation, Inc.,
+  51 Franklin St - Fifth Floor, Boston, MA 02110-1301 USA.
+
+  The full GNU General Public License is included in this distribution in
+  the file called "COPYING".
+
+  Contact Information:
+  e1000-devel Mailing List <e1000-devel@lists.sourceforge.net>
+  Intel Corporation, 5200 N.E. Elam Young Parkway, Hillsboro, OR 97124-6497
+
+*******************************************************************************/
+
+/* Intel(R) 82576 Virtual Function driver main header file */
+
+#ifndef _IGBVF_H_
+#define _IGBVF_H_
+
+#include <linux/types.h>
+#include <linux/timer.h>
+#include <linux/io.h>
+#include <linux/netdevice.h>
+
+
+#include "vf.h"
+
+/* Forward declarations */
+struct igbvf_info;
+struct igbvf_adapter;
+
+/* Interrupt defines */
+#define IGBVF_START_ITR                 648 /* ~6000 ints/sec */
+
+/* Interrupt modes, as used by the IntMode parameter */
+#define IGBVF_INT_MODE_LEGACY           0
+#define IGBVF_INT_MODE_MSI              1
+#define IGBVF_INT_MODE_MSIX             2
+
+/* Tx/Rx descriptor defines */
+#define IGBVF_DEFAULT_TXD               256
+#define IGBVF_MAX_TXD                   4096
+#define IGBVF_MIN_TXD                   80
+
+#define IGBVF_DEFAULT_RXD               256
+#define IGBVF_MAX_RXD                   4096
+#define IGBVF_MIN_RXD                   80
+
+#define IGBVF_MIN_ITR_USECS             10 /* 100000 irq/sec */
+#define IGBVF_MAX_ITR_USECS             10000 /* 100    irq/sec */
+
+/* RX descriptor control thresholds.
+ * PTHRESH - MAC will consider prefetch if it has fewer than this number of
+ *           descriptors available in its onboard memory.
+ *           Setting this to 0 disables RX descriptor prefetch.
+ * HTHRESH - MAC will only prefetch if there are at least this many descriptors
+ *           available in host memory.
+ *           If PTHRESH is 0, this should also be 0.
+ * WTHRESH - RX descriptor writeback threshold - MAC will delay writing back
+ *           descriptors until either it has this many to write back, or the
+ *           ITR timer expires.
+ */
+#define IGBVF_RX_PTHRESH                16
+#define IGBVF_RX_HTHRESH                8
+#define IGBVF_RX_WTHRESH                1
+
+/* this is the size past which hardware will drop packets when setting LPE=0 */
+#define MAXIMUM_ETHERNET_VLAN_SIZE      1522
+
+#define IGBVF_FC_PAUSE_TIME             0x0680 /* 858 usec */
+
+/* How many Tx Descriptors do we need to call netif_wake_queue ? */
+#define IGBVF_TX_QUEUE_WAKE             32
+/* How many Rx Buffers do we bundle into one write to the hardware ? */
+#define IGBVF_RX_BUFFER_WRITE           16 /* Must be power of 2 */
+
+#define AUTO_ALL_MODES                  0
+#define IGBVF_EEPROM_APME               0x0400
+
+#define IGBVF_MNG_VLAN_NONE             (-1)
+
+/* Number of packet split data buffers (not including the header buffer) */
+#define PS_PAGE_BUFFERS                 (MAX_PS_BUFFERS - 1)
+
+enum igbvf_boards {
+       board_vf,
+};
+
+struct igbvf_queue_stats {
+       u64 packets;
+       u64 bytes;
+};
+
+/*
+ * wrappers around a pointer to a socket buffer,
+ * so a DMA handle can be stored along with the buffer
+ */
+struct igbvf_buffer {
+       dma_addr_t dma;
+       struct sk_buff *skb;
+       union {
+               /* Tx */
+               struct {
+                       unsigned long time_stamp;
+                       u16 length;
+                       u16 next_to_watch;
+               };
+               /* Rx */
+               struct {
+                       struct page *page;
+                       u64 page_dma;
+                       unsigned int page_offset;
+               };
+       };
+       struct page *page;
+};
+
+union igbvf_desc {
+       union e1000_adv_rx_desc rx_desc;
+       union e1000_adv_tx_desc tx_desc;
+       struct e1000_adv_tx_context_desc tx_context_desc;
+};
+
+struct igbvf_ring {
+       struct igbvf_adapter *adapter;  /* backlink */
+       union igbvf_desc *desc;         /* pointer to ring memory  */
+       dma_addr_t dma;                 /* phys address of ring    */
+       unsigned int size;              /* length of ring in bytes */
+       unsigned int count;             /* number of desc. in ring */
+
+       u16 next_to_use;
+       u16 next_to_clean;
+
+       u16 head;
+       u16 tail;
+
+       /* array of buffer information structs */
+       struct igbvf_buffer *buffer_info;
+       struct napi_struct napi;
+
+       char name[IFNAMSIZ + 5];
+       u32 eims_value;
+       u32 itr_val;
+       u16 itr_register;
+       int set_itr;
+
+       struct sk_buff *rx_skb_top;
+
+       struct igbvf_queue_stats stats;
+};
+
+/* board specific private data structure */
+struct igbvf_adapter {
+       struct timer_list watchdog_timer;
+       struct timer_list blink_timer;
+
+       struct work_struct reset_task;
+       struct work_struct watchdog_task;
+
+       const struct igbvf_info *ei;
+
+       struct vlan_group *vlgrp;
+       u32 bd_number;
+       u32 rx_buffer_len;
+       u32 polling_interval;
+       u16 mng_vlan_id;
+       u16 link_speed;
+       u16 link_duplex;
+
+       spinlock_t tx_queue_lock; /* prevent concurrent tail updates */
+
+       /* track device up/down/testing state */
+       unsigned long state;
+
+       /* Interrupt Throttle Rate */
+       u32 itr;
+       u32 itr_setting;
+       u16 tx_itr;
+       u16 rx_itr;
+
+       /*
+        * Tx
+        */
+       struct igbvf_ring *tx_ring /* One per active queue */
+       ____cacheline_aligned_in_smp;
+
+       unsigned long tx_queue_len;
+       unsigned int restart_queue;
+       u32 txd_cmd;
+
+       bool detect_tx_hung;
+       u8 tx_timeout_factor;
+
+       u32 tx_int_delay;
+       u32 tx_abs_int_delay;
+
+       unsigned int total_tx_bytes;
+       unsigned int total_tx_packets;
+       unsigned int total_rx_bytes;
+       unsigned int total_rx_packets;
+
+       /* Tx stats */
+       u32 tx_timeout_count;
+       u32 tx_fifo_head;
+       u32 tx_head_addr;
+       u32 tx_fifo_size;
+       u32 tx_dma_failed;
+
+       /*
+        * Rx
+        */
+       struct igbvf_ring *rx_ring;
+
+       u32 rx_int_delay;
+       u32 rx_abs_int_delay;
+
+       /* Rx stats */
+       u64 hw_csum_err;
+       u64 hw_csum_good;
+       u64 rx_hdr_split;
+       u32 alloc_rx_buff_failed;
+       u32 rx_dma_failed;
+
+       unsigned int rx_ps_hdr_size;
+       u32 max_frame_size;
+       u32 min_frame_size;
+
+       /* OS defined structs */
+       struct net_device *netdev;
+       struct pci_dev *pdev;
+       struct net_device_stats net_stats;
+       spinlock_t stats_lock;      /* prevent concurrent stats updates */
+
+       /* structs defined in e1000_hw.h */
+       struct e1000_hw hw;
+
+       /* The VF counters don't clear on read so we have to get a base
+        * count on driver start up and always subtract that base
+        * on the first update, thus the flag.
+        */
+       struct e1000_vf_stats stats;
+       u64 zero_base;
+
+       struct igbvf_ring test_tx_ring;
+       struct igbvf_ring test_rx_ring;
+       u32 test_icr;
+
+       u32 msg_enable;
+       struct msix_entry *msix_entries;
+       int int_mode;
+       u32 eims_enable_mask;
+       u32 eims_other;
+       u32 int_counter0;
+       u32 int_counter1;
+
+       u32 eeprom_wol;
+       u32 wol;
+       u32 pba;
+
+       bool fc_autoneg;
+
+       unsigned long led_status;
+
+       unsigned int flags;
+};
+
+struct igbvf_info {
+       enum e1000_mac_type     mac;
+       unsigned int            flags;
+       u32                     pba;
+       void                    (*init_ops)(struct e1000_hw *);
+       s32                     (*get_variants)(struct igbvf_adapter *);
+};
+
+/* hardware capability, feature, and workaround flags */
+#define FLAG_HAS_HW_VLAN_FILTER           (1 << 0)
+#define FLAG_HAS_JUMBO_FRAMES             (1 << 1)
+#define FLAG_MSI_ENABLED                  (1 << 2)
+#define FLAG_RX_CSUM_ENABLED              (1 << 3)
+#define FLAG_TSO_FORCE                    (1 << 4)
+
+#define IGBVF_RX_DESC_ADV(R, i)     \
+       (&((((R).desc))[i].rx_desc))
+#define IGBVF_TX_DESC_ADV(R, i)     \
+       (&((((R).desc))[i].tx_desc))
+#define IGBVF_TX_CTXTDESC_ADV(R, i) \
+       (&((((R).desc))[i].tx_context_desc))
+
+enum igbvf_state_t {
+       __IGBVF_TESTING,
+       __IGBVF_RESETTING,
+       __IGBVF_DOWN
+};
+
+enum latency_range {
+       lowest_latency = 0,
+       low_latency = 1,
+       bulk_latency = 2,
+       latency_invalid = 255
+};
+
+extern char igbvf_driver_name[];
+extern const char igbvf_driver_version[];
+
+extern void igbvf_check_options(struct igbvf_adapter *);
+extern void igbvf_set_ethtool_ops(struct net_device *);
+
+extern int igbvf_up(struct igbvf_adapter *);
+extern void igbvf_down(struct igbvf_adapter *);
+extern void igbvf_reinit_locked(struct igbvf_adapter *);
+extern void igbvf_reset(struct igbvf_adapter *);
+extern int igbvf_setup_rx_resources(struct igbvf_adapter *, struct igbvf_ring *);
+extern int igbvf_setup_tx_resources(struct igbvf_adapter *, struct igbvf_ring *);
+extern void igbvf_free_rx_resources(struct igbvf_ring *);
+extern void igbvf_free_tx_resources(struct igbvf_ring *);
+extern void igbvf_update_stats(struct igbvf_adapter *);
+extern void igbvf_set_interrupt_capability(struct igbvf_adapter *);
+extern void igbvf_reset_interrupt_capability(struct igbvf_adapter *);
+
+extern unsigned int copybreak;
+
+#endif /* _IGBVF_H_ */
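Note on the ITR constants in igbvf.h: igbvf_set_coalesce() and
igbvf_get_coalesce() in ethtool.c use them to validate rx_coalesce_usecs and
to convert it to and from the stored setting. The sketch below restates that
logic as plain user-space C so the accepted values and the shift by 2 are
easy to check. The constants come from this patch; the function names are
hypothetical, and the sketch says nothing about how the hardware interprets
the register value.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Constants copied from igbvf.h in this patch. */
#define IGBVF_START_ITR     648u
#define IGBVF_MIN_ITR_USECS 10u
#define IGBVF_MAX_ITR_USECS 10000u

/* Same acceptance test as igbvf_set_coalesce(): 0, 1, 3 or 10..10000. */
static bool coalesce_usecs_valid(uint32_t usecs)
{
    if (usecs > IGBVF_MAX_ITR_USECS)
        return false;
    if (usecs > 3 && usecs < IGBVF_MIN_ITR_USECS)
        return false;
    if (usecs == 2)
        return false;
    return true;
}

/* Value the driver writes to the ITR register for a valid request. */
static uint32_t usecs_to_itr(uint32_t usecs)
{
    assert(coalesce_usecs_valid(usecs));
    if (usecs != 0 && usecs <= 3)
        return IGBVF_START_ITR;
    return usecs << 2;
}

/* Value the driver remembers in adapter->itr_setting. */
static uint32_t usecs_to_itr_setting(uint32_t usecs)
{
    assert(coalesce_usecs_valid(usecs));
    if (usecs != 0 && usecs <= 3)
        return usecs;
    return usecs << 2;
}

/* Inverse mapping, as done by igbvf_get_coalesce(). */
static uint32_t itr_setting_to_usecs(uint32_t setting)
{
    return setting <= 3 ? setting : setting >> 2;
}
```

The round trip is lossless for every accepted value: the small values 1 and
3 are stored unchanged, and a real interval such as 100 usecs is stored as
400 and shifted back down when read.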
diff --git a/drivers/net/igbvf/mbx.c b/drivers/net/igbvf/mbx.c
new file mode 100644
index 0000000..819a8ec
--- /dev/null
+++ b/drivers/net/igbvf/mbx.c
@@ -0,0 +1,350 @@
+/*******************************************************************************
+
+  Intel(R) 82576 Virtual Function Linux driver
+  Copyright(c) 2009 Intel Corporation.
+
+  This program is free software; you can redistribute it and/or modify it
+  under the terms and conditions of the GNU General Public License,
+  version 2, as published by the Free Software Foundation.
+
+  This program is distributed in the hope it will be useful, but WITHOUT
+  ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+  FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
+  more details.
+
+  You should have received a copy of the GNU General Public License along with
+  this program; if not, write to the Free Software Foundation, Inc.,
+  51 Franklin St - Fifth Floor, Boston, MA 02110-1301 USA.
+
+  The full GNU General Public License is included in this distribution in
+  the file called "COPYING".
+
+  Contact Information:
+  e1000-devel Mailing List <e1000-devel@lists.sourceforge.net>
+  Intel Corporation, 5200 N.E. Elam Young Parkway, Hillsboro, OR 97124-6497
+
+*******************************************************************************/
+
+#include "mbx.h"
+
+/**
+ *  e1000_poll_for_msg - Wait for message notification
+ *  @hw: pointer to the HW structure
+ *
+ *  returns SUCCESS if it successfully received a message notification
+ **/
+static s32 e1000_poll_for_msg(struct e1000_hw *hw)
+{
+       struct e1000_mbx_info *mbx = &hw->mbx;
+       int countdown = mbx->timeout;
+
+       if (!mbx->ops.check_for_msg)
+               goto out;
+
+       while (countdown && mbx->ops.check_for_msg(hw)) {
+               countdown--;
+               udelay(mbx->usec_delay);
+       }
+
+       /* if we failed, all future posted messages fail until reset */
+       if (!countdown)
+               mbx->timeout = 0;
+out:
+       return countdown ? E1000_SUCCESS : -E1000_ERR_MBX;
+}
+
+/**
+ *  e1000_poll_for_ack - Wait for message acknowledgement
+ *  @hw: pointer to the HW structure
+ *
+ *  returns SUCCESS if it successfully received a message acknowledgement
+ **/
+static s32 e1000_poll_for_ack(struct e1000_hw *hw)
+{
+       struct e1000_mbx_info *mbx = &hw->mbx;
+       int countdown = mbx->timeout;
+
+       if (!mbx->ops.check_for_ack)
+               goto out;
+
+       while (countdown && mbx->ops.check_for_ack(hw)) {
+               countdown--;
+               udelay(mbx->usec_delay);
+       }
+
+       /* if we failed, all future posted messages fail until reset */
+       if (!countdown)
+               mbx->timeout = 0;
+out:
+       return countdown ? E1000_SUCCESS : -E1000_ERR_MBX;
+}
+
+/**
+ *  e1000_read_posted_mbx - Wait for message notification and receive message
+ *  @hw: pointer to the HW structure
+ *  @msg: The message buffer
+ *  @size: Length of buffer
+ *
+ *  returns SUCCESS if it successfully received a message notification and
+ *  copied it into the receive buffer.
+ **/
+static s32 e1000_read_posted_mbx(struct e1000_hw *hw, u32 *msg, u16 size)
+{
+       struct e1000_mbx_info *mbx = &hw->mbx;
+       s32 ret_val = -E1000_ERR_MBX;
+
+       if (!mbx->ops.read)
+               goto out;
+
+       ret_val = e1000_poll_for_msg(hw);
+
+       /* if a message notification arrived, read it; otherwise we timed out */
+       if (!ret_val)
+               ret_val = mbx->ops.read(hw, msg, size);
+out:
+       return ret_val;
+}
+
+/**
+ *  e1000_write_posted_mbx - Write a message to the mailbox, wait for ack
+ *  @hw: pointer to the HW structure
+ *  @msg: The message buffer
+ *  @size: Length of buffer
+ *
+ *  returns SUCCESS if it successfully copied message into the buffer and
+ *  received an ack to that message within delay * timeout period
+ **/
+static s32 e1000_write_posted_mbx(struct e1000_hw *hw, u32 *msg, u16 size)
+{
+       struct e1000_mbx_info *mbx = &hw->mbx;
+       s32 ret_val = -E1000_ERR_MBX;
+
+       /* exit if we either can't write or there isn't a defined timeout */
+       if (!mbx->ops.write || !mbx->timeout)
+               goto out;
+
+       /* send msg */
+       ret_val = mbx->ops.write(hw, msg, size);
+
+       /* if msg sent wait until we receive an ack */
+       if (!ret_val)
+               ret_val = e1000_poll_for_ack(hw);
+out:
+       return ret_val;
+}
+
+/**
+ *  e1000_read_v2p_mailbox - read v2p mailbox
+ *  @hw: pointer to the HW structure
+ *
+ *  This function is used to read the v2p mailbox without losing the read to
+ *  clear status bits.
+ **/
+static u32 e1000_read_v2p_mailbox(struct e1000_hw *hw)
+{
+       u32 v2p_mailbox = er32(V2PMAILBOX(0));
+
+       v2p_mailbox |= hw->dev_spec.vf.v2p_mailbox;
+       hw->dev_spec.vf.v2p_mailbox |= v2p_mailbox & E1000_V2PMAILBOX_R2C_BITS;
+
+       return v2p_mailbox;
+}
+
+/**
+ *  e1000_check_for_bit_vf - Determine if a status bit was set
+ *  @hw: pointer to the HW structure
+ *  @mask: bitmask for bits to be tested and cleared
+ *
+ *  This function is used to check for the read to clear bits within
+ *  the V2P mailbox.
+ **/
+static s32 e1000_check_for_bit_vf(struct e1000_hw *hw, u32 mask)
+{
+       u32 v2p_mailbox = e1000_read_v2p_mailbox(hw);
+       s32 ret_val = -E1000_ERR_MBX;
+
+       if (v2p_mailbox & mask)
+               ret_val = E1000_SUCCESS;
+
+       hw->dev_spec.vf.v2p_mailbox &= ~mask;
+
+       return ret_val;
+}
+
+/**
+ *  e1000_check_for_msg_vf - checks to see if the PF has sent mail
+ *  @hw: pointer to the HW structure
+ *
+ *  returns SUCCESS if the PF has set the Status bit or else ERR_MBX
+ **/
+static s32 e1000_check_for_msg_vf(struct e1000_hw *hw)
+{
+       s32 ret_val = -E1000_ERR_MBX;
+
+       if (!e1000_check_for_bit_vf(hw, E1000_V2PMAILBOX_PFSTS)) {
+               ret_val = E1000_SUCCESS;
+               hw->mbx.stats.reqs++;
+       }
+
+       return ret_val;
+}
+
+/**
+ *  e1000_check_for_ack_vf - checks to see if the PF has ACK'd
+ *  @hw: pointer to the HW structure
+ *
+ *  returns SUCCESS if the PF has set the ACK bit or else ERR_MBX
+ **/
+static s32 e1000_check_for_ack_vf(struct e1000_hw *hw)
+{
+       s32 ret_val = -E1000_ERR_MBX;
+
+       if (!e1000_check_for_bit_vf(hw, E1000_V2PMAILBOX_PFACK)) {
+               ret_val = E1000_SUCCESS;
+               hw->mbx.stats.acks++;
+       }
+
+       return ret_val;
+}
+
+/**
+ *  e1000_check_for_rst_vf - checks to see if the PF has reset
+ *  @hw: pointer to the HW structure
+ *
+ *  returns SUCCESS if the PF has set the reset done or reset indication
+ *  bit, or else ERR_MBX
+ **/
+static s32 e1000_check_for_rst_vf(struct e1000_hw *hw)
+{
+       s32 ret_val = -E1000_ERR_MBX;
+
+       if (!e1000_check_for_bit_vf(hw, (E1000_V2PMAILBOX_RSTD |
+                                        E1000_V2PMAILBOX_RSTI))) {
+               ret_val = E1000_SUCCESS;
+               hw->mbx.stats.rsts++;
+       }
+
+       return ret_val;
+}
+
+/**
+ *  e1000_obtain_mbx_lock_vf - obtain mailbox lock
+ *  @hw: pointer to the HW structure
+ *
+ *  return SUCCESS if we obtained the mailbox lock
+ **/
+static s32 e1000_obtain_mbx_lock_vf(struct e1000_hw *hw)
+{
+       s32 ret_val = -E1000_ERR_MBX;
+
+       /* Take ownership of the buffer */
+       ew32(V2PMAILBOX(0), E1000_V2PMAILBOX_VFU);
+
+       /* reserve mailbox for vf use */
+       if (e1000_read_v2p_mailbox(hw) & E1000_V2PMAILBOX_VFU)
+               ret_val = E1000_SUCCESS;
+
+       return ret_val;
+}
+
+/**
+ *  e1000_write_mbx_vf - Write a message to the mailbox
+ *  @hw: pointer to the HW structure
+ *  @msg: The message buffer
+ *  @size: Length of buffer
+ *
+ *  returns SUCCESS if it successfully copied message into the buffer
+ **/
+static s32 e1000_write_mbx_vf(struct e1000_hw *hw, u32 *msg, u16 size)
+{
+       s32 err;
+       u16 i;
+
+       /* lock the mailbox to prevent pf/vf race condition */
+       err = e1000_obtain_mbx_lock_vf(hw);
+       if (err)
+               goto out_no_write;
+
+       /* flush any ack or msg as we are going to overwrite mailbox */
+       e1000_check_for_ack_vf(hw);
+       e1000_check_for_msg_vf(hw);
+
+       /* copy the caller specified message to the mailbox memory buffer */
+       for (i = 0; i < size; i++)
+               array_ew32(VMBMEM(0), i, msg[i]);
+
+       /* update stats */
+       hw->mbx.stats.msgs_tx++;
+
+       /* Drop VFU and interrupt the PF to tell it a message has been sent */
+       ew32(V2PMAILBOX(0), E1000_V2PMAILBOX_REQ);
+
+out_no_write:
+       return err;
+}
+
+/**
+ *  e1000_read_mbx_vf - Reads a message from the inbox intended for vf
+ *  @hw: pointer to the HW structure
+ *  @msg: The message buffer
+ *  @size: Length of buffer
+ *
+ *  returns SUCCESS if it successfully read message from buffer
+ **/
+static s32 e1000_read_mbx_vf(struct e1000_hw *hw, u32 *msg, u16 size)
+{
+       s32 err;
+       u16 i;
+
+       /* lock the mailbox to prevent pf/vf race condition */
+       err = e1000_obtain_mbx_lock_vf(hw);
+       if (err)
+               goto out_no_read;
+
+       /* copy the message from the mailbox memory buffer */
+       for (i = 0; i < size; i++)
+               msg[i] = array_er32(VMBMEM(0), i);
+
+       /* Acknowledge receipt and release mailbox, then we're done */
+       ew32(V2PMAILBOX(0), E1000_V2PMAILBOX_ACK);
+
+       /* update stats */
+       hw->mbx.stats.msgs_rx++;
+
+out_no_read:
+       return err;
+}
+
+/**
+ *  e1000_init_mbx_params_vf - set initial values for vf mailbox
+ *  @hw: pointer to the HW structure
+ *
+ *  Initializes the hw->mbx struct to correct values for vf mailbox
+ */
+s32 e1000_init_mbx_params_vf(struct e1000_hw *hw)
+{
+       struct e1000_mbx_info *mbx = &hw->mbx;
+
+       /* start mailbox as timed out and let the reset_hw call set the timeout
+        * value to begin communications */
+       mbx->timeout = 0;
+       mbx->usec_delay = E1000_VF_MBX_INIT_DELAY;
+
+       mbx->size = E1000_VFMAILBOX_SIZE;
+
+       mbx->ops.read = e1000_read_mbx_vf;
+       mbx->ops.write = e1000_write_mbx_vf;
+       mbx->ops.read_posted = e1000_read_posted_mbx;
+       mbx->ops.write_posted = e1000_write_posted_mbx;
+       mbx->ops.check_for_msg = e1000_check_for_msg_vf;
+       mbx->ops.check_for_ack = e1000_check_for_ack_vf;
+       mbx->ops.check_for_rst = e1000_check_for_rst_vf;
+
+       mbx->stats.msgs_tx = 0;
+       mbx->stats.msgs_rx = 0;
+       mbx->stats.reqs = 0;
+       mbx->stats.acks = 0;
+       mbx->stats.rsts = 0;
+
+       return E1000_SUCCESS;
+}
+
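Note on e1000_read_v2p_mailbox() and e1000_check_for_bit_vf() above: the
comments say the status bits are read-to-clear, so the driver keeps a
software copy of them and only drops a bit from that copy once a caller has
tested for it. The sketch below models this in user space. The fake_mbx
structure and fake_reg_read() are hypothetical stand-ins for the real
register access, and the clear-on-read behaviour of the simulated register
is an assumption based on those comments.

```c
#include <assert.h>
#include <stdint.h>

/* Bit values copied from mbx.h in this patch. */
#define PFSTS    0x00000010u
#define PFACK    0x00000020u
#define R2C_BITS 0x000000B0u

/* Hypothetical model of the mailbox register plus the driver's cache. */
struct fake_mbx {
    uint32_t hw_reg;  /* simulated register; R2C bits clear when read */
    uint32_t cached;  /* software copy of R2C bits not yet consumed */
};

/* Simulated register read: returns the value and clears the R2C bits. */
static uint32_t fake_reg_read(struct fake_mbx *m)
{
    uint32_t value = m->hw_reg;

    m->hw_reg &= ~R2C_BITS;
    return value;
}

/* Read the register without losing R2C bits seen on earlier reads. */
static uint32_t read_mailbox(struct fake_mbx *m)
{
    uint32_t value = fake_reg_read(m);

    value |= m->cached;
    m->cached |= value & R2C_BITS;
    return value;
}

/* Return 0 if any bit in mask is pending, -1 otherwise; consume them. */
static int check_for_bit(struct fake_mbx *m, uint32_t mask)
{
    uint32_t value;

    assert(mask != 0); /* an empty mask can never match */
    value = read_mailbox(m);
    m->cached &= ~mask;
    return (value & mask) ? 0 : -1;
}
```

If PFSTS and PFACK are both pending, testing for PFACK first reads and
clears the register, yet a later test for PFSTS still succeeds because the
bit was kept in the cache. A second test for PFSTS then fails, since the
first successful test consumed it.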
diff --git a/drivers/net/igbvf/mbx.h b/drivers/net/igbvf/mbx.h
new file mode 100644
index 0000000..4938609
--- /dev/null
+++ b/drivers/net/igbvf/mbx.h
@@ -0,0 +1,75 @@
+/*******************************************************************************
+
+  Intel(R) 82576 Virtual Function Linux driver
+  Copyright(c) 1999 - 2009 Intel Corporation.
+
+  This program is free software; you can redistribute it and/or modify it
+  under the terms and conditions of the GNU General Public License,
+  version 2, as published by the Free Software Foundation.
+
+  This program is distributed in the hope it will be useful, but WITHOUT
+  ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+  FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
+  more details.
+
+  You should have received a copy of the GNU General Public License along with
+  this program; if not, write to the Free Software Foundation, Inc.,
+  51 Franklin St - Fifth Floor, Boston, MA 02110-1301 USA.
+
+  The full GNU General Public License is included in this distribution in
+  the file called "COPYING".
+
+  Contact Information:
+  e1000-devel Mailing List <e1000-devel@lists.sourceforge.net>
+  Intel Corporation, 5200 N.E. Elam Young Parkway, Hillsboro, OR 97124-6497
+
+*******************************************************************************/
+
+#ifndef _E1000_MBX_H_
+#define _E1000_MBX_H_
+
+#include "vf.h"
+
+#define E1000_V2PMAILBOX_REQ   0x00000001 /* Request for PF Ready bit */
+#define E1000_V2PMAILBOX_ACK   0x00000002 /* Ack PF message received */
+#define E1000_V2PMAILBOX_VFU   0x00000004 /* VF owns the mailbox buffer */
+#define E1000_V2PMAILBOX_PFU   0x00000008 /* PF owns the mailbox buffer */
+#define E1000_V2PMAILBOX_PFSTS 0x00000010 /* PF wrote a message in the MB */
+#define E1000_V2PMAILBOX_PFACK 0x00000020 /* PF ack the previous VF msg */
+#define E1000_V2PMAILBOX_RSTI  0x00000040 /* PF has reset indication */
+#define E1000_V2PMAILBOX_RSTD  0x00000080 /* PF has indicated reset done */
+#define E1000_V2PMAILBOX_R2C_BITS 0x000000B0 /* All read to clear bits */
+
+#define E1000_VFMAILBOX_SIZE   16 /* 16 32 bit words - 64 bytes */
+
+/* If it's an E1000_VF_* msg then it originates in the VF and is sent to the
+ * PF.  The reverse is true if it is E1000_PF_*.
+ * Message ACKs are the value or'd with E1000_VT_MSGTYPE_ACK (0x80000000)
+ */
+#define E1000_VT_MSGTYPE_ACK      0x80000000  /* Messages below or'd with
+                                               * this are the ACK */
+#define E1000_VT_MSGTYPE_NACK     0x40000000  /* Messages below or'd with
+                                               * this are the NACK */
+#define E1000_VT_MSGTYPE_CTS      0x20000000  /* Indicates that VF is still
+                                                 clear to send requests */
+
+/* We have a total wait time of 1s for vf mailbox posted messages */
+#define E1000_VF_MBX_INIT_TIMEOUT 2000 /* retry count for mailbox timeout */
+#define E1000_VF_MBX_INIT_DELAY   500  /* usec delay between retries */
+
+#define E1000_VT_MSGINFO_SHIFT    16
+/* bits 23:16 are used for extra info for certain messages */
+#define E1000_VT_MSGINFO_MASK     (0xFF << E1000_VT_MSGINFO_SHIFT)
+
+#define E1000_VF_RESET            0x01 /* VF requests reset */
+#define E1000_VF_SET_MAC_ADDR     0x02 /* VF requests PF to set MAC addr */
+#define E1000_VF_SET_MULTICAST    0x03 /* VF requests PF to set MC addr */
+#define E1000_VF_SET_VLAN         0x04 /* VF requests PF to set VLAN */
+#define E1000_VF_SET_LPE          0x05 /* VF requests PF to set VMOLR.LPE */
+
+#define E1000_PF_CONTROL_MSG      0x0100 /* PF control message */
+
+void e1000_init_mbx_ops_generic(struct e1000_hw *hw);
+s32 e1000_init_mbx_params_vf(struct e1000_hw *);
+
+#endif /* _E1000_MBX_H_ */
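Note on the message-word layout and timeout constants in mbx.h: a message
type sits in the low bits, bits 23:16 carry extra info, and the top bits
flag ACK or NACK. The sketch below shows the bit arithmetic and checks the
stated 1s total wait. The constants are copied from this patch; build_msg(),
msg_info(), ack_of() and max_wait_usecs() are hypothetical helpers, and how
the PF actually composes its replies is not shown in this section.

```c
#include <assert.h>
#include <stdint.h>

/* Constants copied from mbx.h in this patch. */
#define E1000_VT_MSGTYPE_ACK      0x80000000u
#define E1000_VT_MSGINFO_SHIFT    16
#define E1000_VT_MSGINFO_MASK     (0xFFu << E1000_VT_MSGINFO_SHIFT)
#define E1000_VF_SET_MULTICAST    0x03u
#define E1000_VF_MBX_INIT_TIMEOUT 2000u /* retry count */
#define E1000_VF_MBX_INIT_DELAY   500u  /* usec between retries */

/* Combine a message type with an 8-bit info value in bits 23:16. */
static uint32_t build_msg(uint32_t type, uint8_t info)
{
    assert((type & E1000_VT_MSGINFO_MASK) == 0);
    return type | ((uint32_t)info << E1000_VT_MSGINFO_SHIFT);
}

/* Extract the 8-bit info value from a message word. */
static uint8_t msg_info(uint32_t word)
{
    return (uint8_t)((word & E1000_VT_MSGINFO_MASK) >>
                     E1000_VT_MSGINFO_SHIFT);
}

/* The ACK form of a message word: the word or'd with the ACK flag. */
static uint32_t ack_of(uint32_t word)
{
    return word | E1000_VT_MSGTYPE_ACK;
}

/* Worst-case polling time: retry count times the per-retry delay. */
static uint32_t max_wait_usecs(void)
{
    return E1000_VF_MBX_INIT_TIMEOUT * E1000_VF_MBX_INIT_DELAY;
}
```

With these values, 2000 retries at 500 usec each give 1,000,000 usec, which
matches the 1s figure in the header comment.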
diff --git a/drivers/net/igbvf/netdev.c b/drivers/net/igbvf/netdev.c
new file mode 100644
index 0000000..c564842
--- /dev/null
+++ b/drivers/net/igbvf/netdev.c
@@ -0,0 +1,2919 @@
+/*******************************************************************************
+
+  Intel(R) 82576 Virtual Function Linux driver
+  Copyright(c) 2009 Intel Corporation.
+
+  This program is free software; you can redistribute it and/or modify it
+  under the terms and conditions of the GNU General Public License,
+  version 2, as published by the Free Software Foundation.
+
+  This program is distributed in the hope it will be useful, but WITHOUT
+  ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+  FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
+  more details.
+
+  You should have received a copy of the GNU General Public License along with
+  this program; if not, write to the Free Software Foundation, Inc.,
+  51 Franklin St - Fifth Floor, Boston, MA 02110-1301 USA.
+
+  The full GNU General Public License is included in this distribution in
+  the file called "COPYING".
+
+  Contact Information:
+  e1000-devel Mailing List <e1000-devel@lists.sourceforge.net>
+  Intel Corporation, 5200 N.E. Elam Young Parkway, Hillsboro, OR 97124-6497
+
+*******************************************************************************/
+
+#include <linux/module.h>
+#include <linux/types.h>
+#include <linux/init.h>
+#include <linux/pci.h>
+#include <linux/vmalloc.h>
+#include <linux/pagemap.h>
+#include <linux/delay.h>
+#include <linux/netdevice.h>
+#include <linux/tcp.h>
+#include <linux/ipv6.h>
+#include <net/checksum.h>
+#include <net/ip6_checksum.h>
+#include <linux/mii.h>
+#include <linux/ethtool.h>
+#include <linux/if_vlan.h>
+#include <linux/pm_qos_params.h>
+
+#include "igbvf.h"
+
+#define DRV_VERSION "1.0.0-k0"
+char igbvf_driver_name[] = "igbvf";
+const char igbvf_driver_version[] = DRV_VERSION;
+static const char igbvf_driver_string[] =
+                               "Intel(R) Virtual Function Network Driver";
+static const char igbvf_copyright[] = "Copyright (c) 2009 Intel Corporation.";
+
+static int igbvf_poll(struct napi_struct *napi, int budget);
+
+static struct igbvf_info igbvf_vf_info = {
+       .mac                    = e1000_vfadapt,
+       .flags                  = FLAG_HAS_JUMBO_FRAMES
+                                 | FLAG_RX_CSUM_ENABLED,
+       .pba                    = 10,
+       .init_ops               = e1000_init_function_pointers_vf,
+};
+
+static const struct igbvf_info *igbvf_info_tbl[] = {
+       [board_vf]              = &igbvf_vf_info,
+};
+
+/**
+ * igbvf_desc_unused - calculate the number of unused descriptors in a ring
+ **/
+static int igbvf_desc_unused(struct igbvf_ring *ring)
+{
+       if (ring->next_to_clean > ring->next_to_use)
+               return ring->next_to_clean - ring->next_to_use - 1;
+
+       return ring->count + ring->next_to_clean - ring->next_to_use - 1;
+}
+
+/**
+ * igbvf_receive_skb - helper function to handle Rx indications
+ * @adapter: board private structure
+ * @status: descriptor status field as written by hardware
+ * @vlan: descriptor vlan field as written by hardware (no le/be conversion)
+ * @skb: pointer to sk_buff to be indicated to stack
+ **/
+static void igbvf_receive_skb(struct igbvf_adapter *adapter,
+                              struct net_device *netdev,
+                              struct sk_buff *skb,
+                              u32 status, u16 vlan)
+{
+       if (adapter->vlgrp && (status & E1000_RXD_STAT_VP))
+               vlan_hwaccel_receive_skb(skb, adapter->vlgrp,
+                                        le16_to_cpu(vlan) &
+                                        E1000_RXD_SPC_VLAN_MASK);
+       else
+               netif_receive_skb(skb);
+
+       netdev->last_rx = jiffies;
+}
+
+static inline void igbvf_rx_checksum_adv(struct igbvf_adapter *adapter,
+                                         u32 status_err, struct sk_buff *skb)
+{
+       skb->ip_summed = CHECKSUM_NONE;
+
+       /* Ignore Checksum (IXSM) bit is set; leave verification to the stack */
+       if ((status_err & E1000_RXD_STAT_IXSM))
+               return;
+       /* TCP/UDP or IP checksum error bit is set */
+       if (status_err &
+           (E1000_RXDEXT_STATERR_TCPE | E1000_RXDEXT_STATERR_IPE)) {
+               /* let the stack verify checksum errors */
+               adapter->hw_csum_err++;
+               return;
+       }
+       /* It must be a TCP or UDP packet with a valid checksum */
+       if (status_err & (E1000_RXD_STAT_TCPCS | E1000_RXD_STAT_UDPCS))
+               skb->ip_summed = CHECKSUM_UNNECESSARY;
+
+       adapter->hw_csum_good++;
+}
+
+/**
+ * igbvf_alloc_rx_buffers - Replace used receive buffers; packet split
+ * @rx_ring: address of ring structure to repopulate
+ * @cleaned_count: number of buffers to repopulate
+ **/
+static void igbvf_alloc_rx_buffers(struct igbvf_ring *rx_ring,
+                                   int cleaned_count)
+{
+       struct igbvf_adapter *adapter = rx_ring->adapter;
+       struct net_device *netdev = adapter->netdev;
+       struct pci_dev *pdev = adapter->pdev;
+       union e1000_adv_rx_desc *rx_desc;
+       struct igbvf_buffer *buffer_info;
+       struct sk_buff *skb;
+       unsigned int i;
+       int bufsz;
+
+       i = rx_ring->next_to_use;
+       buffer_info = &rx_ring->buffer_info[i];
+
+       if (adapter->rx_ps_hdr_size)
+               bufsz = adapter->rx_ps_hdr_size;
+       else
+               bufsz = adapter->rx_buffer_len;
+       bufsz += NET_IP_ALIGN;
+
+       while (cleaned_count--) {
+               rx_desc = IGBVF_RX_DESC_ADV(*rx_ring, i);
+
+               if (adapter->rx_ps_hdr_size && !buffer_info->page_dma) {
+                       if (!buffer_info->page) {
+                               buffer_info->page = alloc_page(GFP_ATOMIC);
+                               if (!buffer_info->page) {
+                                       adapter->alloc_rx_buff_failed++;
+                                       goto no_buffers;
+                               }
+                               buffer_info->page_offset = 0;
+                       } else {
+                               buffer_info->page_offset ^= PAGE_SIZE / 2;
+                       }
+                       buffer_info->page_dma =
+                               pci_map_page(pdev, buffer_info->page,
+                                            buffer_info->page_offset,
+                                            PAGE_SIZE / 2,
+                                            PCI_DMA_FROMDEVICE);
+               }
+
+               if (!buffer_info->skb) {
+                       skb = netdev_alloc_skb(netdev, bufsz);
+                       if (!skb) {
+                               adapter->alloc_rx_buff_failed++;
+                               goto no_buffers;
+                       }
+
+                       /* Make buffer alignment 2 beyond a 16 byte boundary
+                        * this will result in a 16 byte aligned IP header after
+                        * the 14 byte MAC header is removed
+                        */
+                       skb_reserve(skb, NET_IP_ALIGN);
+
+                       buffer_info->skb = skb;
+                       buffer_info->dma = pci_map_single(pdev, skb->data,
+                                                         bufsz,
+                                                         PCI_DMA_FROMDEVICE);
+               }
+               /* Refresh the desc even if buffer_addrs didn't change because
+                * each write-back erases this info. */
+               if (adapter->rx_ps_hdr_size) {
+                       rx_desc->read.pkt_addr =
+                            cpu_to_le64(buffer_info->page_dma);
+                       rx_desc->read.hdr_addr = cpu_to_le64(buffer_info->dma);
+               } else {
+                       rx_desc->read.pkt_addr =
+                            cpu_to_le64(buffer_info->dma);
+                       rx_desc->read.hdr_addr = 0;
+               }
+
+               i++;
+               if (i == rx_ring->count)
+                       i = 0;
+               buffer_info = &rx_ring->buffer_info[i];
+       }
+
+no_buffers:
+       if (rx_ring->next_to_use != i) {
+               rx_ring->next_to_use = i;
+               if (i == 0)
+                       i = (rx_ring->count - 1);
+               else
+                       i--;
+
+               /* Force memory writes to complete before letting h/w
+                * know there are new descriptors to fetch.  (Only
+                * applicable for weak-ordered memory model archs,
+                * such as IA-64). */
+               wmb();
+               writel(i, adapter->hw.hw_addr + rx_ring->tail);
+       }
+}
+
+/**
+ * igbvf_clean_rx_irq - Send received data up the network stack; legacy
+ * @adapter: board private structure
+ *
+ * The return value indicates whether any cleaning was done; there
+ * is no guarantee that everything was cleaned.
+ **/
+static bool igbvf_clean_rx_irq(struct igbvf_adapter *adapter,
+                               int *work_done, int work_to_do)
+{
+       struct igbvf_ring *rx_ring = adapter->rx_ring;
+       struct net_device *netdev = adapter->netdev;
+       struct pci_dev *pdev = adapter->pdev;
+       union e1000_adv_rx_desc *rx_desc, *next_rxd;
+       struct igbvf_buffer *buffer_info, *next_buffer;
+       struct sk_buff *skb;
+       bool cleaned = false;
+       int cleaned_count = 0;
+       unsigned int total_bytes = 0, total_packets = 0;
+       unsigned int i;
+       u32 length, hlen, staterr;
+
+       i = rx_ring->next_to_clean;
+       rx_desc = IGBVF_RX_DESC_ADV(*rx_ring, i);
+       staterr = le32_to_cpu(rx_desc->wb.upper.status_error);
+
+       while (staterr & E1000_RXD_STAT_DD) {
+               if (*work_done >= work_to_do)
+                       break;
+               (*work_done)++;
+
+               buffer_info = &rx_ring->buffer_info[i];
+
+               /* HW will not DMA in data larger than the given buffer, even
+                * if it parses the (NFS, of course) header to be larger.  In
+                * that case, it fills the header buffer and spills the rest
+                * into the page.
+                */
+               hlen = (le16_to_cpu(rx_desc->wb.lower.lo_dword.hs_rss.hdr_info) &
+                 E1000_RXDADV_HDRBUFLEN_MASK) >> E1000_RXDADV_HDRBUFLEN_SHIFT;
+               if (hlen > adapter->rx_ps_hdr_size)
+                       hlen = adapter->rx_ps_hdr_size;
+
+               length = le16_to_cpu(rx_desc->wb.upper.length);
+               cleaned = true;
+               cleaned_count++;
+
+               skb = buffer_info->skb;
+               prefetch(skb->data - NET_IP_ALIGN);
+               buffer_info->skb = NULL;
+               if (!adapter->rx_ps_hdr_size) {
+                       pci_unmap_single(pdev, buffer_info->dma,
+                                        adapter->rx_buffer_len,
+                                        PCI_DMA_FROMDEVICE);
+                       buffer_info->dma = 0;
+                       skb_put(skb, length);
+                       goto send_up;
+               }
+
+               if (!skb_shinfo(skb)->nr_frags) {
+                       pci_unmap_single(pdev, buffer_info->dma,
+                                        adapter->rx_ps_hdr_size + NET_IP_ALIGN,
+                                        PCI_DMA_FROMDEVICE);
+                       skb_put(skb, hlen);
+               }
+
+               if (length) {
+                       pci_unmap_page(pdev, buffer_info->page_dma,
+                                      PAGE_SIZE / 2,
+                                      PCI_DMA_FROMDEVICE);
+                       buffer_info->page_dma = 0;
+
+                       skb_fill_page_desc(skb, skb_shinfo(skb)->nr_frags++,
+                                          buffer_info->page,
+                                          buffer_info->page_offset,
+                                          length);
+
+                       if ((adapter->rx_buffer_len > (PAGE_SIZE / 2)) ||
+                           (page_count(buffer_info->page) != 1))
+                               buffer_info->page = NULL;
+                       else
+                               get_page(buffer_info->page);
+
+                       skb->len += length;
+                       skb->data_len += length;
+                       skb->truesize += length;
+               }
+send_up:
+               i++;
+               if (i == rx_ring->count)
+                       i = 0;
+               next_rxd = IGBVF_RX_DESC_ADV(*rx_ring, i);
+               prefetch(next_rxd);
+               next_buffer = &rx_ring->buffer_info[i];
+
+               if (!(staterr & E1000_RXD_STAT_EOP)) {
+                       buffer_info->skb = next_buffer->skb;
+                       buffer_info->dma = next_buffer->dma;
+                       next_buffer->skb = skb;
+                       next_buffer->dma = 0;
+                       goto next_desc;
+               }
+
+               if (staterr & E1000_RXDEXT_ERR_FRAME_ERR_MASK) {
+                       dev_kfree_skb_irq(skb);
+                       goto next_desc;
+               }
+
+               total_bytes += skb->len;
+               total_packets++;
+
+               igbvf_rx_checksum_adv(adapter, staterr, skb);
+
+               skb->protocol = eth_type_trans(skb, netdev);
+
+               igbvf_receive_skb(adapter, netdev, skb, staterr,
+                                 rx_desc->wb.upper.vlan);
+
+               netdev->last_rx = jiffies;
+
+next_desc:
+               rx_desc->wb.upper.status_error = 0;
+
+               /* return some buffers to hardware, one at a time is too slow */
+               if (cleaned_count >= IGBVF_RX_BUFFER_WRITE) {
+                       igbvf_alloc_rx_buffers(rx_ring, cleaned_count);
+                       cleaned_count = 0;
+               }
+
+               /* use prefetched values */
+               rx_desc = next_rxd;
+               buffer_info = next_buffer;
+
+               staterr = le32_to_cpu(rx_desc->wb.upper.status_error);
+       }
+
+       rx_ring->next_to_clean = i;
+       cleaned_count = igbvf_desc_unused(rx_ring);
+
+       if (cleaned_count)
+               igbvf_alloc_rx_buffers(rx_ring, cleaned_count);
+
+       adapter->total_rx_packets += total_packets;
+       adapter->total_rx_bytes += total_bytes;
+       adapter->net_stats.rx_bytes += total_bytes;
+       adapter->net_stats.rx_packets += total_packets;
+       return cleaned;
+}
+
+static void igbvf_put_txbuf(struct igbvf_adapter *adapter,
+                            struct igbvf_buffer *buffer_info)
+{
+       buffer_info->dma = 0;
+       if (buffer_info->skb) {
+               skb_dma_unmap(&adapter->pdev->dev, buffer_info->skb,
+                             DMA_TO_DEVICE);
+               dev_kfree_skb_any(buffer_info->skb);
+               buffer_info->skb = NULL;
+       }
+       buffer_info->time_stamp = 0;
+}
+
+static void igbvf_print_tx_hang(struct igbvf_adapter *adapter)
+{
+       struct igbvf_ring *tx_ring = adapter->tx_ring;
+       unsigned int i = tx_ring->next_to_clean;
+       unsigned int eop = tx_ring->buffer_info[i].next_to_watch;
+       union e1000_adv_tx_desc *eop_desc = IGBVF_TX_DESC_ADV(*tx_ring, eop);
+
+       /* detected Tx unit hang */
+       dev_err(&adapter->pdev->dev,
+               "Detected Tx Unit Hang:\n"
+               "  TDH                  <%x>\n"
+               "  TDT                  <%x>\n"
+               "  next_to_use          <%x>\n"
+               "  next_to_clean        <%x>\n"
+               "buffer_info[next_to_clean]:\n"
+               "  time_stamp           <%lx>\n"
+               "  next_to_watch        <%x>\n"
+               "  jiffies              <%lx>\n"
+               "  next_to_watch.status <%x>\n",
+               readl(adapter->hw.hw_addr + tx_ring->head),
+               readl(adapter->hw.hw_addr + tx_ring->tail),
+               tx_ring->next_to_use,
+               tx_ring->next_to_clean,
+               tx_ring->buffer_info[eop].time_stamp,
+               eop,
+               jiffies,
+               eop_desc->wb.status);
+}
+
+/**
+ * igbvf_setup_tx_resources - allocate Tx resources (Descriptors)
+ * @adapter: board private structure
+ *
+ * Returns 0 on success, negative on failure
+ **/
+int igbvf_setup_tx_resources(struct igbvf_adapter *adapter,
+                             struct igbvf_ring *tx_ring)
+{
+       struct pci_dev *pdev = adapter->pdev;
+       int size;
+
+       size = sizeof(struct igbvf_buffer) * tx_ring->count;
+       tx_ring->buffer_info = vmalloc(size);
+       if (!tx_ring->buffer_info)
+               goto err;
+       memset(tx_ring->buffer_info, 0, size);
+
+       /* round up to nearest 4K */
+       tx_ring->size = tx_ring->count * sizeof(union e1000_adv_tx_desc);
+       tx_ring->size = ALIGN(tx_ring->size, 4096);
+
+       tx_ring->desc = pci_alloc_consistent(pdev, tx_ring->size,
+                                            &tx_ring->dma);
+
+       if (!tx_ring->desc)
+               goto err;
+
+       tx_ring->adapter = adapter;
+       tx_ring->next_to_use = 0;
+       tx_ring->next_to_clean = 0;
+
+       return 0;
+err:
+       vfree(tx_ring->buffer_info);
+       dev_err(&adapter->pdev->dev,
+               "Unable to allocate memory for the transmit descriptor ring\n");
+       return -ENOMEM;
+}
+
+/**
+ * igbvf_setup_rx_resources - allocate Rx resources (Descriptors)
+ * @adapter: board private structure
+ *
+ * Returns 0 on success, negative on failure
+ **/
+int igbvf_setup_rx_resources(struct igbvf_adapter *adapter,
+                            struct igbvf_ring *rx_ring)
+{
+       struct pci_dev *pdev = adapter->pdev;
+       int size, desc_len;
+
+       size = sizeof(struct igbvf_buffer) * rx_ring->count;
+       rx_ring->buffer_info = vmalloc(size);
+       if (!rx_ring->buffer_info)
+               goto err;
+       memset(rx_ring->buffer_info, 0, size);
+
+       desc_len = sizeof(union e1000_adv_rx_desc);
+
+       /* Round up to nearest 4K */
+       rx_ring->size = rx_ring->count * desc_len;
+       rx_ring->size = ALIGN(rx_ring->size, 4096);
+
+       rx_ring->desc = pci_alloc_consistent(pdev, rx_ring->size,
+                                            &rx_ring->dma);
+
+       if (!rx_ring->desc)
+               goto err;
+
+       rx_ring->next_to_clean = 0;
+       rx_ring->next_to_use = 0;
+
+       rx_ring->adapter = adapter;
+
+       return 0;
+
+err:
+       vfree(rx_ring->buffer_info);
+       rx_ring->buffer_info = NULL;
+       dev_err(&adapter->pdev->dev,
+               "Unable to allocate memory for the receive descriptor ring\n");
+       return -ENOMEM;
+}
+
+/**
+ * igbvf_clean_tx_ring - Free Tx Buffers
+ * @tx_ring: ring to be cleaned
+ **/
+static void igbvf_clean_tx_ring(struct igbvf_ring *tx_ring)
+{
+       struct igbvf_adapter *adapter = tx_ring->adapter;
+       struct igbvf_buffer *buffer_info;
+       unsigned long size;
+       unsigned int i;
+
+       if (!tx_ring->buffer_info)
+               return;
+
+       /* Free all the Tx ring sk_buffs */
+       for (i = 0; i < tx_ring->count; i++) {
+               buffer_info = &tx_ring->buffer_info[i];
+               igbvf_put_txbuf(adapter, buffer_info);
+       }
+
+       size = sizeof(struct igbvf_buffer) * tx_ring->count;
+       memset(tx_ring->buffer_info, 0, size);
+
+       /* Zero out the descriptor ring */
+       memset(tx_ring->desc, 0, tx_ring->size);
+
+       tx_ring->next_to_use = 0;
+       tx_ring->next_to_clean = 0;
+
+       writel(0, adapter->hw.hw_addr + tx_ring->head);
+       writel(0, adapter->hw.hw_addr + tx_ring->tail);
+}
+
+/**
+ * igbvf_free_tx_resources - Free Tx Resources per Queue
+ * @tx_ring: ring to free resources from
+ *
+ * Free all transmit software resources
+ **/
+void igbvf_free_tx_resources(struct igbvf_ring *tx_ring)
+{
+       struct pci_dev *pdev = tx_ring->adapter->pdev;
+
+       igbvf_clean_tx_ring(tx_ring);
+
+       vfree(tx_ring->buffer_info);
+       tx_ring->buffer_info = NULL;
+
+       pci_free_consistent(pdev, tx_ring->size, tx_ring->desc, tx_ring->dma);
+
+       tx_ring->desc = NULL;
+}
+
+/**
+ * igbvf_clean_rx_ring - Free Rx Buffers per Queue
+ * @rx_ring: ring structure pointer to free buffers from
+ **/
+static void igbvf_clean_rx_ring(struct igbvf_ring *rx_ring)
+{
+       struct igbvf_adapter *adapter = rx_ring->adapter;
+       struct igbvf_buffer *buffer_info;
+       struct pci_dev *pdev = adapter->pdev;
+       unsigned long size;
+       unsigned int i;
+
+       if (!rx_ring->buffer_info)
+               return;
+
+       /* Free all the Rx ring sk_buffs */
+       for (i = 0; i < rx_ring->count; i++) {
+               buffer_info = &rx_ring->buffer_info[i];
+               if (buffer_info->dma) {
+                       if (adapter->rx_ps_hdr_size) {
+                               pci_unmap_single(pdev, buffer_info->dma,
+                                                adapter->rx_ps_hdr_size,
+                                                PCI_DMA_FROMDEVICE);
+                       } else {
+                               pci_unmap_single(pdev, buffer_info->dma,
+                                                adapter->rx_buffer_len,
+                                                PCI_DMA_FROMDEVICE);
+                       }
+                       buffer_info->dma = 0;
+               }
+
+               if (buffer_info->skb) {
+                       dev_kfree_skb(buffer_info->skb);
+                       buffer_info->skb = NULL;
+               }
+
+               if (buffer_info->page) {
+                       if (buffer_info->page_dma)
+                               pci_unmap_page(pdev, buffer_info->page_dma,
+                                              PAGE_SIZE / 2,
+                                              PCI_DMA_FROMDEVICE);
+                       put_page(buffer_info->page);
+                       buffer_info->page = NULL;
+                       buffer_info->page_dma = 0;
+                       buffer_info->page_offset = 0;
+               }
+       }
+
+       size = sizeof(struct igbvf_buffer) * rx_ring->count;
+       memset(rx_ring->buffer_info, 0, size);
+
+       /* Zero out the descriptor ring */
+       memset(rx_ring->desc, 0, rx_ring->size);
+
+       rx_ring->next_to_clean = 0;
+       rx_ring->next_to_use = 0;
+
+       writel(0, adapter->hw.hw_addr + rx_ring->head);
+       writel(0, adapter->hw.hw_addr + rx_ring->tail);
+}
+
+/**
+ * igbvf_free_rx_resources - Free Rx Resources
+ * @rx_ring: ring to clean the resources from
+ *
+ * Free all receive software resources
+ **/
+
+void igbvf_free_rx_resources(struct igbvf_ring *rx_ring)
+{
+       struct pci_dev *pdev = rx_ring->adapter->pdev;
+
+       igbvf_clean_rx_ring(rx_ring);
+
+       vfree(rx_ring->buffer_info);
+       rx_ring->buffer_info = NULL;
+
+       dma_free_coherent(&pdev->dev, rx_ring->size, rx_ring->desc,
+                         rx_ring->dma);
+       rx_ring->desc = NULL;
+}
+
+/**
+ * igbvf_update_itr - update the dynamic ITR value based on statistics
+ * @adapter: pointer to adapter
+ * @itr_setting: current adapter->itr
+ * @packets: the number of packets during this measurement interval
+ * @bytes: the number of bytes during this measurement interval
+ *
+ *      Stores a new ITR value based on packets and byte
+ *      counts during the last interrupt.  The advantage of per interrupt
+ *      computation is faster updates and more accurate ITR for the current
+ *      traffic pattern.  Constants in this function were computed
+ *      based on theoretical maximum wire speed and thresholds were set based
+ *      on testing data as well as attempting to minimize response time
+ *      while increasing bulk throughput.  This functionality is controlled
+ *      by the InterruptThrottleRate module parameter.
+ **/
+static unsigned int igbvf_update_itr(struct igbvf_adapter *adapter,
+                                     u16 itr_setting, int packets,
+                                     int bytes)
+{
+       unsigned int retval = itr_setting;
+
+       if (packets == 0)
+               goto update_itr_done;
+
+       switch (itr_setting) {
+       case lowest_latency:
+               /* handle TSO and jumbo frames */
+               if (bytes/packets > 8000)
+                       retval = bulk_latency;
+               else if ((packets < 5) && (bytes > 512))
+                       retval = low_latency;
+               break;
+       case low_latency:  /* 50 usec aka 20000 ints/s */
+               if (bytes > 10000) {
+                       /* this if handles the TSO accounting */
+                       if (bytes/packets > 8000)
+                               retval = bulk_latency;
+                       else if ((packets < 10) || ((bytes/packets) > 1200))
+                               retval = bulk_latency;
+                       else if ((packets > 35))
+                               retval = lowest_latency;
+               } else if (bytes/packets > 2000) {
+                       retval = bulk_latency;
+               } else if (packets <= 2 && bytes < 512) {
+                       retval = lowest_latency;
+               }
+               break;
+       case bulk_latency: /* 250 usec aka 4000 ints/s */
+               if (bytes > 25000) {
+                       if (packets > 35)
+                               retval = low_latency;
+               } else if (bytes < 6000) {
+                       retval = low_latency;
+               }
+               break;
+       }
+
+update_itr_done:
+       return retval;
+}
+
+static void igbvf_set_itr(struct igbvf_adapter *adapter)
+{
+       struct e1000_hw *hw = &adapter->hw;
+       u16 current_itr;
+       u32 new_itr = adapter->itr;
+
+       adapter->tx_itr = igbvf_update_itr(adapter, adapter->tx_itr,
+                                          adapter->total_tx_packets,
+                                          adapter->total_tx_bytes);
+       /* conservative mode (itr 3) eliminates the lowest_latency setting */
+       if (adapter->itr_setting == 3 && adapter->tx_itr == lowest_latency)
+               adapter->tx_itr = low_latency;
+
+       adapter->rx_itr = igbvf_update_itr(adapter, adapter->rx_itr,
+                                          adapter->total_rx_packets,
+                                          adapter->total_rx_bytes);
+       /* conservative mode (itr 3) eliminates the lowest_latency setting */
+       if (adapter->itr_setting == 3 && adapter->rx_itr == lowest_latency)
+               adapter->rx_itr = low_latency;
+
+       current_itr = max(adapter->rx_itr, adapter->tx_itr);
+
+       switch (current_itr) {
+       /* counts and packets in update_itr are dependent on these numbers */
+       case lowest_latency:
+               new_itr = 70000;
+               break;
+       case low_latency:
+               new_itr = 20000; /* aka hwitr = ~200 */
+               break;
+       case bulk_latency:
+               new_itr = 4000;
+               break;
+       default:
+               break;
+       }
+
+       if (new_itr != adapter->itr) {
+               /*
+                * this attempts to bias the interrupt rate towards Bulk
+                * by adding intermediate steps when interrupt rate is
+                * increasing
+                */
+               new_itr = new_itr > adapter->itr ?
+                            min(adapter->itr + (new_itr >> 2), new_itr) :
+                            new_itr;
+               adapter->itr = new_itr;
+               adapter->rx_ring->itr_val = 1952;
+
+               if (adapter->msix_entries)
+                       adapter->rx_ring->set_itr = 1;
+               else
+                       ew32(ITR, 1952);
+       }
+}
+
+/**
+ * igbvf_clean_tx_irq - Reclaim resources after transmit completes
+ * @tx_ring: ring structure to clean descriptors from
+ * returns true if ring is completely cleaned
+ **/
+static bool igbvf_clean_tx_irq(struct igbvf_ring *tx_ring)
+{
+       struct igbvf_adapter *adapter = tx_ring->adapter;
+       struct e1000_hw *hw = &adapter->hw;
+       struct net_device *netdev = adapter->netdev;
+       struct igbvf_buffer *buffer_info;
+       struct sk_buff *skb;
+       union e1000_adv_tx_desc *tx_desc, *eop_desc;
+       unsigned int total_bytes = 0, total_packets = 0;
+       unsigned int i, eop, count = 0;
+       bool cleaned = false;
+
+       i = tx_ring->next_to_clean;
+       eop = tx_ring->buffer_info[i].next_to_watch;
+       eop_desc = IGBVF_TX_DESC_ADV(*tx_ring, eop);
+
+       while ((eop_desc->wb.status & cpu_to_le32(E1000_TXD_STAT_DD)) &&
+              (count < tx_ring->count)) {
+               for (cleaned = false; !cleaned; count++) {
+                       tx_desc = IGBVF_TX_DESC_ADV(*tx_ring, i);
+                       buffer_info = &tx_ring->buffer_info[i];
+                       cleaned = (i == eop);
+                       skb = buffer_info->skb;
+
+                       if (skb) {
+                               unsigned int segs, bytecount;
+
+                               /* gso_segs is currently only valid for tcp */
+                               segs = skb_shinfo(skb)->gso_segs ?: 1;
+                               /* multiply data chunks by size of headers */
+                               bytecount = ((segs - 1) * skb_headlen(skb)) +
+                                           skb->len;
+                               total_packets += segs;
+                               total_bytes += bytecount;
+                       }
+
+                       igbvf_put_txbuf(adapter, buffer_info);
+                       tx_desc->wb.status = 0;
+
+                       i++;
+                       if (i == tx_ring->count)
+                               i = 0;
+               }
+               eop = tx_ring->buffer_info[i].next_to_watch;
+               eop_desc = IGBVF_TX_DESC_ADV(*tx_ring, eop);
+       }
+
+       tx_ring->next_to_clean = i;
+
+       if (unlikely(count &&
+                    netif_carrier_ok(netdev) &&
+                    igbvf_desc_unused(tx_ring) >= IGBVF_TX_QUEUE_WAKE)) {
+               /* Make sure that anybody stopping the queue after this
+                * sees the new next_to_clean.
+                */
+               smp_mb();
+               if (netif_queue_stopped(netdev) &&
+                   !(test_bit(__IGBVF_DOWN, &adapter->state))) {
+                       netif_wake_queue(netdev);
+                       ++adapter->restart_queue;
+               }
+       }
+
+       if (adapter->detect_tx_hung) {
+               /* Detect a transmit hang in hardware, this serializes the
+                * check with the clearing of time_stamp and movement of i */
+               adapter->detect_tx_hung = false;
+               if (tx_ring->buffer_info[i].time_stamp &&
+                   time_after(jiffies, tx_ring->buffer_info[i].time_stamp +
+                              (adapter->tx_timeout_factor * HZ))
+                   && !(er32(STATUS) & E1000_STATUS_TXOFF)) {
+
+                       tx_desc = IGBVF_TX_DESC_ADV(*tx_ring, i);
+                       /* detected Tx unit hang */
+                       igbvf_print_tx_hang(adapter);
+
+                       netif_stop_queue(netdev);
+               }
+       }
+       adapter->net_stats.tx_bytes += total_bytes;
+       adapter->net_stats.tx_packets += total_packets;
+       return (count < tx_ring->count);
+}
+
+static irqreturn_t igbvf_msix_other(int irq, void *data)
+{
+       struct net_device *netdev = data;
+       struct igbvf_adapter *adapter = netdev_priv(netdev);
+       struct e1000_hw *hw = &adapter->hw;
+
+       adapter->int_counter1++;
+
+       netif_carrier_off(netdev);
+       hw->mac.get_link_status = 1;
+       if (!test_bit(__IGBVF_DOWN, &adapter->state))
+               mod_timer(&adapter->watchdog_timer, jiffies + 1);
+
+       ew32(EIMS, adapter->eims_other);
+
+       return IRQ_HANDLED;
+}
+
+static irqreturn_t igbvf_intr_msix_tx(int irq, void *data)
+{
+       struct net_device *netdev = data;
+       struct igbvf_adapter *adapter = netdev_priv(netdev);
+       struct e1000_hw *hw = &adapter->hw;
+       struct igbvf_ring *tx_ring = adapter->tx_ring;
+
+
+       adapter->total_tx_bytes = 0;
+       adapter->total_tx_packets = 0;
+
+       /* auto mask will automatically reenable the interrupt when we write
+        * EICS */
+       if (!igbvf_clean_tx_irq(tx_ring))
+               /* Ring was not completely cleaned, so fire another interrupt */
+               ew32(EICS, tx_ring->eims_value);
+       else
+               ew32(EIMS, tx_ring->eims_value);
+
+       return IRQ_HANDLED;
+}
+
+static irqreturn_t igbvf_intr_msix_rx(int irq, void *data)
+{
+       struct net_device *netdev = data;
+       struct igbvf_adapter *adapter = netdev_priv(netdev);
+
+       adapter->int_counter0++;
+
+       /* Write the ITR value calculated at the end of the
+        * previous interrupt.
+        */
+       if (adapter->rx_ring->set_itr) {
+               writel(adapter->rx_ring->itr_val,
+                      adapter->hw.hw_addr + adapter->rx_ring->itr_register);
+               adapter->rx_ring->set_itr = 0;
+       }
+
+       if (napi_schedule_prep(&adapter->rx_ring->napi)) {
+               adapter->total_rx_bytes = 0;
+               adapter->total_rx_packets = 0;
+               __napi_schedule(&adapter->rx_ring->napi);
+       }
+
+       return IRQ_HANDLED;
+}
+
+#define IGBVF_NO_QUEUE -1
+
+static void igbvf_assign_vector(struct igbvf_adapter *adapter, int rx_queue,
+                                int tx_queue, int msix_vector)
+{
+       struct e1000_hw *hw = &adapter->hw;
+       u32 ivar, index;
+
+       /* 82576 uses a table-based method for assigning vectors.
+          Each queue has a single entry in the table to which we write
+          a vector number along with a "valid" bit.  Sadly, the layout
+          of the table is somewhat counterintuitive. */
+       if (rx_queue > IGBVF_NO_QUEUE) {
+               index = (rx_queue >> 1);
+               ivar = array_er32(IVAR0, index);
+               if (rx_queue & 0x1) {
+                       /* vector goes into third byte of register */
+                       ivar = ivar & 0xFF00FFFF;
+                       ivar |= (msix_vector | E1000_IVAR_VALID) << 16;
+               } else {
+                       /* vector goes into low byte of register */
+                       ivar = ivar & 0xFFFFFF00;
+                       ivar |= msix_vector | E1000_IVAR_VALID;
+               }
+               adapter->rx_ring[rx_queue].eims_value = 1 << msix_vector;
+               array_ew32(IVAR0, index, ivar);
+       }
+       if (tx_queue > IGBVF_NO_QUEUE) {
+               index = (tx_queue >> 1);
+               ivar = array_er32(IVAR0, index);
+               if (tx_queue & 0x1) {
+                       /* vector goes into high byte of register */
+                       ivar = ivar & 0x00FFFFFF;
+                       ivar |= (msix_vector | E1000_IVAR_VALID) << 24;
+               } else {
+                       /* vector goes into second byte of register */
+                       ivar = ivar & 0xFFFF00FF;
+                       ivar |= (msix_vector | E1000_IVAR_VALID) << 8;
+               }
+               adapter->tx_ring[tx_queue].eims_value = 1 << msix_vector;
+               array_ew32(IVAR0, index, ivar);
+       }
+}
+
+/**
+ * igbvf_configure_msix - Configure MSI-X hardware
+ *
+ * igbvf_configure_msix sets up the hardware to properly
+ * generate MSI-X interrupts.
+ **/
+static void igbvf_configure_msix(struct igbvf_adapter *adapter)
+{
+       u32 tmp;
+       struct e1000_hw *hw = &adapter->hw;
+       struct igbvf_ring *tx_ring = adapter->tx_ring;
+       struct igbvf_ring *rx_ring = adapter->rx_ring;
+       int vector = 0;
+
+       adapter->eims_enable_mask = 0;
+
+       igbvf_assign_vector(adapter, IGBVF_NO_QUEUE, 0, vector++);
+       adapter->eims_enable_mask |= tx_ring->eims_value;
+       if (tx_ring->itr_val)
+               writel(tx_ring->itr_val,
+                      hw->hw_addr + tx_ring->itr_register);
+       else
+               writel(1952, hw->hw_addr + tx_ring->itr_register);
+
+       igbvf_assign_vector(adapter, 0, IGBVF_NO_QUEUE, vector++);
+       adapter->eims_enable_mask |= rx_ring->eims_value;
+       if (rx_ring->itr_val)
+               writel(rx_ring->itr_val,
+                      hw->hw_addr + rx_ring->itr_register);
+       else
+               writel(1952, hw->hw_addr + rx_ring->itr_register);
+
+       /* set vector for other causes, i.e. link changes */
+
+       tmp = (vector++ | E1000_IVAR_VALID);
+
+       ew32(IVAR_MISC, tmp);
+
+       adapter->eims_enable_mask = (1 << (vector)) - 1;
+       adapter->eims_other = 1 << (vector - 1);
+       e1e_flush();
+}
+
+void igbvf_reset_interrupt_capability(struct igbvf_adapter *adapter)
+{
+       if (adapter->msix_entries) {
+               pci_disable_msix(adapter->pdev);
+               kfree(adapter->msix_entries);
+               adapter->msix_entries = NULL;
+       }
+}
+
+/**
+ * igbvf_set_interrupt_capability - set MSI-X if supported
+ *
+ * Attempt to configure interrupts using the best available
+ * capabilities of the hardware and kernel.
+ **/
+void igbvf_set_interrupt_capability(struct igbvf_adapter *adapter)
+{
+       int err = -ENOMEM;
+       int i;
+
+       /* we allocate 3 vectors: one for Tx, one for Rx, one for PF messages */
+       adapter->msix_entries = kcalloc(3, sizeof(struct msix_entry),
+                                       GFP_KERNEL);
+       if (adapter->msix_entries) {
+               for (i = 0; i < 3; i++)
+                       adapter->msix_entries[i].entry = i;
+
+               err = pci_enable_msix(adapter->pdev,
+                                     adapter->msix_entries, 3);
+       }
+
+       if (err) {
+               /* MSI-X failed */
+               dev_err(&adapter->pdev->dev,
+                       "Failed to initialize MSI-X interrupts.\n");
+               igbvf_reset_interrupt_capability(adapter);
+       }
+}
+
+/**
+ * igbvf_request_msix - Initialize MSI-X interrupts
+ *
+ * igbvf_request_msix allocates MSI-X vectors and requests interrupts from the
+ * kernel.
+ **/
+static int igbvf_request_msix(struct igbvf_adapter *adapter)
+{
+       struct net_device *netdev = adapter->netdev;
+       int err = 0, vector = 0;
+
+       if (strlen(netdev->name) < (IFNAMSIZ - 5)) {
+               sprintf(adapter->tx_ring->name, "%s-tx-0", netdev->name);
+               sprintf(adapter->rx_ring->name, "%s-rx-0", netdev->name);
+       } else {
+               memcpy(adapter->tx_ring->name, netdev->name, IFNAMSIZ);
+               memcpy(adapter->rx_ring->name, netdev->name, IFNAMSIZ);
+       }
+
+       err = request_irq(adapter->msix_entries[vector].vector,
+                         &igbvf_intr_msix_tx, 0, adapter->tx_ring->name,
+                         netdev);
+       if (err)
+               goto out;
+
+       adapter->tx_ring->itr_register = E1000_EITR(vector);
+       adapter->tx_ring->itr_val = 1952;
+       vector++;
+
+       err = request_irq(adapter->msix_entries[vector].vector,
+                         &igbvf_intr_msix_rx, 0, adapter->rx_ring->name,
+                         netdev);
+       if (err)
+               goto out;
+
+       adapter->rx_ring->itr_register = E1000_EITR(vector);
+       adapter->rx_ring->itr_val = 1952;
+       vector++;
+
+       err = request_irq(adapter->msix_entries[vector].vector,
+                         &igbvf_msix_other, 0, netdev->name, netdev);
+       if (err)
+               goto out;
+
+       igbvf_configure_msix(adapter);
+       return 0;
+out:
+       return err;
+}
+
+/**
+ * igbvf_alloc_queues - Allocate memory for all rings
+ * @adapter: board private structure to initialize
+ **/
+static int __devinit igbvf_alloc_queues(struct igbvf_adapter *adapter)
+{
+       struct net_device *netdev = adapter->netdev;
+
+       adapter->tx_ring = kzalloc(sizeof(struct igbvf_ring), GFP_KERNEL);
+       if (!adapter->tx_ring)
+               return -ENOMEM;
+
+       adapter->rx_ring = kzalloc(sizeof(struct igbvf_ring), GFP_KERNEL);
+       if (!adapter->rx_ring) {
+               kfree(adapter->tx_ring);
+               return -ENOMEM;
+       }
+
+       netif_napi_add(netdev, &adapter->rx_ring->napi, igbvf_poll, 64);
+
+       return 0;
+}
+
+/**
+ * igbvf_request_irq - initialize interrupts
+ *
+ * Attempts to configure interrupts using the best available
+ * capabilities of the hardware and kernel.
+ **/
+static int igbvf_request_irq(struct igbvf_adapter *adapter)
+{
+       int err = -1;
+
+       /* igbvf supports MSI-X only */
+       if (adapter->msix_entries)
+               err = igbvf_request_msix(adapter);
+
+       if (!err)
+               return err;
+
+       dev_err(&adapter->pdev->dev,
+               "Unable to allocate interrupt, Error: %d\n", err);
+
+       return err;
+}
+
+static void igbvf_free_irq(struct igbvf_adapter *adapter)
+{
+       struct net_device *netdev = adapter->netdev;
+       int vector;
+
+       if (adapter->msix_entries) {
+               for (vector = 0; vector < 3; vector++)
+                       free_irq(adapter->msix_entries[vector].vector, netdev);
+       }
+}
+
+/**
+ * igbvf_irq_disable - Mask off interrupt generation on the NIC
+ **/
+static void igbvf_irq_disable(struct igbvf_adapter *adapter)
+{
+       struct e1000_hw *hw = &adapter->hw;
+
+       ew32(EIMC, ~0);
+
+       if (adapter->msix_entries)
+               ew32(EIAC, 0);
+}
+
+/**
+ * igbvf_irq_enable - Enable default interrupt generation settings
+ **/
+static void igbvf_irq_enable(struct igbvf_adapter *adapter)
+{
+       struct e1000_hw *hw = &adapter->hw;
+
+       ew32(EIAC, adapter->eims_enable_mask);
+       ew32(EIAM, adapter->eims_enable_mask);
+       ew32(EIMS, adapter->eims_enable_mask);
+}
+
+/**
+ * igbvf_poll - NAPI Rx polling callback
+ * @napi: struct associated with this polling callback
+ * @budget: number of packets the driver is allowed to process in this poll
+ **/
+static int igbvf_poll(struct napi_struct *napi, int budget)
+{
+       struct igbvf_ring *rx_ring = container_of(napi, struct igbvf_ring, napi);
+       struct igbvf_adapter *adapter = rx_ring->adapter;
+       struct e1000_hw *hw = &adapter->hw;
+       int work_done = 0;
+
+       igbvf_clean_rx_irq(adapter, &work_done, budget);
+
+       /* If not enough Rx work done, exit the polling mode */
+       if (work_done < budget) {
+               napi_complete(napi);
+
+               if (adapter->itr_setting & 3)
+                       igbvf_set_itr(adapter);
+
+               if (!test_bit(__IGBVF_DOWN, &adapter->state))
+                       ew32(EIMS, adapter->rx_ring->eims_value);
+       }
+
+       return work_done;
+}
+
+/**
+ * igbvf_set_rlpml - set receive large packet maximum length
+ * @adapter: board private structure
+ *
+ * Configure the maximum size of packets that will be received
+ */
+static void igbvf_set_rlpml(struct igbvf_adapter *adapter)
+{
+       int max_frame_size = adapter->max_frame_size;
+       struct e1000_hw *hw = &adapter->hw;
+
+       if (adapter->vlgrp)
+               max_frame_size += VLAN_TAG_SIZE;
+
+       e1000_rlpml_set_vf(hw, max_frame_size);
+}
+
+static void igbvf_vlan_rx_add_vid(struct net_device *netdev, u16 vid)
+{
+       struct igbvf_adapter *adapter = netdev_priv(netdev);
+       struct e1000_hw *hw = &adapter->hw;
+
+       if (hw->mac.ops.set_vfta(hw, vid, true))
+               dev_err(&adapter->pdev->dev, "Failed to add vlan id %d\n", vid);
+}
+
+static void igbvf_vlan_rx_kill_vid(struct net_device *netdev, u16 vid)
+{
+       struct igbvf_adapter *adapter = netdev_priv(netdev);
+       struct e1000_hw *hw = &adapter->hw;
+
+       igbvf_irq_disable(adapter);
+       vlan_group_set_device(adapter->vlgrp, vid, NULL);
+
+       if (!test_bit(__IGBVF_DOWN, &adapter->state))
+               igbvf_irq_enable(adapter);
+
+       if (hw->mac.ops.set_vfta(hw, vid, false))
+               dev_err(&adapter->pdev->dev,
+                       "Failed to remove vlan id %d\n", vid);
+}
+
+static void igbvf_vlan_rx_register(struct net_device *netdev,
+                                   struct vlan_group *grp)
+{
+       struct igbvf_adapter *adapter = netdev_priv(netdev);
+
+       adapter->vlgrp = grp;
+}
+
+static void igbvf_restore_vlan(struct igbvf_adapter *adapter)
+{
+       u16 vid;
+
+       if (!adapter->vlgrp)
+               return;
+
+       for (vid = 0; vid < VLAN_GROUP_ARRAY_LEN; vid++) {
+               if (!vlan_group_get_device(adapter->vlgrp, vid))
+                       continue;
+               igbvf_vlan_rx_add_vid(adapter->netdev, vid);
+       }
+
+       igbvf_set_rlpml(adapter);
+}
+
+/**
+ * igbvf_configure_tx - Configure Transmit Unit after Reset
+ * @adapter: board private structure
+ *
+ * Configure the Tx unit of the MAC after a reset.
+ **/
+static void igbvf_configure_tx(struct igbvf_adapter *adapter)
+{
+       struct e1000_hw *hw = &adapter->hw;
+       struct igbvf_ring *tx_ring = adapter->tx_ring;
+       u64 tdba;
+       u32 txdctl, dca_txctrl;
+
+       /* disable transmits */
+       txdctl = er32(TXDCTL(0));
+       ew32(TXDCTL(0), txdctl & ~E1000_TXDCTL_QUEUE_ENABLE);
+       msleep(10);
+
+       /* Setup the HW Tx Head and Tail descriptor pointers */
+       ew32(TDLEN(0), tx_ring->count * sizeof(union e1000_adv_tx_desc));
+       tdba = tx_ring->dma;
+       ew32(TDBAL(0), (tdba & DMA_32BIT_MASK));
+       ew32(TDBAH(0), (tdba >> 32));
+       ew32(TDH(0), 0);
+       ew32(TDT(0), 0);
+       tx_ring->head = E1000_TDH(0);
+       tx_ring->tail = E1000_TDT(0);
+
+       /* Turn off Relaxed Ordering on head write-backs.  The writebacks
+        * MUST be delivered in order or it will completely screw up
+        * our bookkeeping.
+        */
+       dca_txctrl = er32(DCA_TXCTRL(0));
+       dca_txctrl &= ~E1000_DCA_TXCTRL_TX_WB_RO_EN;
+       ew32(DCA_TXCTRL(0), dca_txctrl);
+
+       /* enable transmits */
+       txdctl |= E1000_TXDCTL_QUEUE_ENABLE;
+       ew32(TXDCTL(0), txdctl);
+
+       /* Setup Transmit Descriptor Settings for eop descriptor */
+       adapter->txd_cmd = E1000_ADVTXD_DCMD_EOP | E1000_ADVTXD_DCMD_IFCS;
+
+       /* enable Report Status bit */
+       adapter->txd_cmd |= E1000_ADVTXD_DCMD_RS;
+
+       adapter->tx_queue_len = adapter->netdev->tx_queue_len;
+}
+
+/**
+ * igbvf_setup_srrctl - configure the receive control registers
+ * @adapter: Board private structure
+ **/
+static void igbvf_setup_srrctl(struct igbvf_adapter *adapter)
+{
+       struct e1000_hw *hw = &adapter->hw;
+       u32 srrctl = 0;
+
+       srrctl &= ~(E1000_SRRCTL_DESCTYPE_MASK |
+                   E1000_SRRCTL_BSIZEHDR_MASK |
+                   E1000_SRRCTL_BSIZEPKT_MASK);
+
+       /* Enable queue drop to avoid head of line blocking */
+       srrctl |= E1000_SRRCTL_DROP_EN;
+
+       /* Setup buffer sizes */
+       srrctl |= ALIGN(adapter->rx_buffer_len, 1024) >>
+                 E1000_SRRCTL_BSIZEPKT_SHIFT;
+
+       if (adapter->rx_buffer_len < 2048) {
+               adapter->rx_ps_hdr_size = 0;
+               srrctl |= E1000_SRRCTL_DESCTYPE_ADV_ONEBUF;
+       } else {
+               adapter->rx_ps_hdr_size = 128;
+               srrctl |= adapter->rx_ps_hdr_size <<
+                         E1000_SRRCTL_BSIZEHDRSIZE_SHIFT;
+               srrctl |= E1000_SRRCTL_DESCTYPE_HDR_SPLIT_ALWAYS;
+       }
+
+       ew32(SRRCTL(0), srrctl);
+}
+
+/**
+ * igbvf_configure_rx - Configure Receive Unit after Reset
+ * @adapter: board private structure
+ *
+ * Configure the Rx unit of the MAC after a reset.
+ **/
+static void igbvf_configure_rx(struct igbvf_adapter *adapter)
+{
+       struct e1000_hw *hw = &adapter->hw;
+       struct igbvf_ring *rx_ring = adapter->rx_ring;
+       u64 rdba;
+       u32 rdlen, rxdctl;
+
+       /* disable receives */
+       rxdctl = er32(RXDCTL(0));
+       ew32(RXDCTL(0), rxdctl & ~E1000_RXDCTL_QUEUE_ENABLE);
+       msleep(10);
+
+       rdlen = rx_ring->count * sizeof(union e1000_adv_rx_desc);
+
+       /*
+        * Setup the HW Rx Head and Tail Descriptor Pointers and
+        * the Base and Length of the Rx Descriptor Ring
+        */
+       rdba = rx_ring->dma;
+       ew32(RDBAL(0), (rdba & DMA_32BIT_MASK));
+       ew32(RDBAH(0), (rdba >> 32));
+       ew32(RDLEN(0), rx_ring->count * sizeof(union e1000_adv_rx_desc));
+       rx_ring->head = E1000_RDH(0);
+       rx_ring->tail = E1000_RDT(0);
+       ew32(RDH(0), 0);
+       ew32(RDT(0), 0);
+
+       rxdctl |= E1000_RXDCTL_QUEUE_ENABLE;
+       rxdctl &= 0xFFF00000;
+       rxdctl |= IGBVF_RX_PTHRESH;
+       rxdctl |= IGBVF_RX_HTHRESH << 8;
+       rxdctl |= IGBVF_RX_WTHRESH << 16;
+
+       igbvf_set_rlpml(adapter);
+
+       /* enable receives */
+       ew32(RXDCTL(0), rxdctl);
+}
+
+/**
+ * igbvf_set_multi - Multicast and Promiscuous mode set
+ * @netdev: network interface device structure
+ *
+ * The set_multi entry point is called whenever the multicast address
+ * list or the network interface flags are updated.  This routine is
+ * responsible for configuring the hardware for proper multicast,
+ * promiscuous mode, and all-multi behavior.
+ **/
+static void igbvf_set_multi(struct net_device *netdev)
+{
+       struct igbvf_adapter *adapter = netdev_priv(netdev);
+       struct e1000_hw *hw = &adapter->hw;
+       struct dev_mc_list *mc_ptr;
+       u8  *mta_list = NULL;
+       int i;
+
+       if (netdev->mc_count) {
+               mta_list = kmalloc(netdev->mc_count * ETH_ALEN, GFP_ATOMIC);
+               if (!mta_list) {
+                       dev_err(&adapter->pdev->dev,
+                               "failed to allocate multicast filter list\n");
+                       return;
+               }
+       }
+
+       /* prepare a packed array of only addresses. */
+       mc_ptr = netdev->mc_list;
+
+       for (i = 0; i < netdev->mc_count; i++) {
+               if (!mc_ptr)
+                       break;
+               memcpy(mta_list + (i*ETH_ALEN), mc_ptr->dmi_addr,
+                      ETH_ALEN);
+               mc_ptr = mc_ptr->next;
+       }
+
+       hw->mac.ops.update_mc_addr_list(hw, mta_list, i, 0, 0);
+       kfree(mta_list);
+}
+
+/**
+ * igbvf_configure - configure the hardware for Rx and Tx
+ * @adapter: private board structure
+ **/
+static void igbvf_configure(struct igbvf_adapter *adapter)
+{
+       igbvf_set_multi(adapter->netdev);
+
+       igbvf_restore_vlan(adapter);
+
+       igbvf_configure_tx(adapter);
+       igbvf_setup_srrctl(adapter);
+       igbvf_configure_rx(adapter);
+       igbvf_alloc_rx_buffers(adapter->rx_ring,
+                              igbvf_desc_unused(adapter->rx_ring));
+}
+
+/* igbvf_reset - bring the hardware into a known good state
+ *
+ * This function boots the hardware and enables some settings that
+ * require a configuration cycle of the hardware - those cannot be
+ * set/changed during runtime. After reset the device needs to be
+ * properly configured for Rx, Tx etc.
+ */
+void igbvf_reset(struct igbvf_adapter *adapter)
+{
+       struct e1000_mac_info *mac = &adapter->hw.mac;
+       struct net_device *netdev = adapter->netdev;
+       struct e1000_hw *hw = &adapter->hw;
+
+       /* Allow time for pending master requests to run */
+       if (mac->ops.reset_hw(hw))
+               dev_err(&adapter->pdev->dev, "PF still resetting\n");
+
+       mac->ops.init_hw(hw);
+
+       if (is_valid_ether_addr(adapter->hw.mac.addr)) {
+               memcpy(netdev->dev_addr, adapter->hw.mac.addr,
+                      netdev->addr_len);
+               memcpy(netdev->perm_addr, adapter->hw.mac.addr,
+                      netdev->addr_len);
+       }
+}
+
+int igbvf_up(struct igbvf_adapter *adapter)
+{
+       struct e1000_hw *hw = &adapter->hw;
+
+       /* hardware has been reset, we need to reload some things */
+       igbvf_configure(adapter);
+
+       clear_bit(__IGBVF_DOWN, &adapter->state);
+
+       napi_enable(&adapter->rx_ring->napi);
+       if (adapter->msix_entries)
+               igbvf_configure_msix(adapter);
+
+       /* Clear any pending interrupts. */
+       er32(EICR);
+       igbvf_irq_enable(adapter);
+
+       /* start the watchdog */
+       hw->mac.get_link_status = 1;
+       mod_timer(&adapter->watchdog_timer, jiffies + 1);
+
+
+       return 0;
+}
+
+void igbvf_down(struct igbvf_adapter *adapter)
+{
+       struct net_device *netdev = adapter->netdev;
+       struct e1000_hw *hw = &adapter->hw;
+       u32 rxdctl, txdctl;
+
+       /*
+        * signal that we're down so the interrupt handler does not
+        * reschedule our watchdog timer
+        */
+       set_bit(__IGBVF_DOWN, &adapter->state);
+
+       /* disable receives in the hardware */
+       rxdctl = er32(RXDCTL(0));
+       ew32(RXDCTL(0), rxdctl & ~E1000_RXDCTL_QUEUE_ENABLE);
+
+       netif_stop_queue(netdev);
+
+       /* disable transmits in the hardware */
+       txdctl = er32(TXDCTL(0));
+       ew32(TXDCTL(0), txdctl & ~E1000_TXDCTL_QUEUE_ENABLE);
+
+       /* flush both disables and wait for them to finish */
+       e1e_flush();
+       msleep(10);
+
+       napi_disable(&adapter->rx_ring->napi);
+
+       igbvf_irq_disable(adapter);
+
+       del_timer_sync(&adapter->watchdog_timer);
+
+       netdev->tx_queue_len = adapter->tx_queue_len;
+       netif_carrier_off(netdev);
+
+       /* record the stats before reset */
+       igbvf_update_stats(adapter);
+
+       adapter->link_speed = 0;
+       adapter->link_duplex = 0;
+
+       igbvf_reset(adapter);
+       igbvf_clean_tx_ring(adapter->tx_ring);
+       igbvf_clean_rx_ring(adapter->rx_ring);
+}
+
+void igbvf_reinit_locked(struct igbvf_adapter *adapter)
+{
+       might_sleep();
+       while (test_and_set_bit(__IGBVF_RESETTING, &adapter->state))
+               msleep(1);
+       igbvf_down(adapter);
+       igbvf_up(adapter);
+       clear_bit(__IGBVF_RESETTING, &adapter->state);
+}
+
+/**
+ * igbvf_sw_init - Initialize general software structures (struct igbvf_adapter)
+ * @adapter: board private structure to initialize
+ *
+ * igbvf_sw_init initializes the Adapter private data structure.
+ * Fields are initialized based on PCI device information and
+ * OS network device settings (MTU size).
+ **/
+static int __devinit igbvf_sw_init(struct igbvf_adapter *adapter)
+{
+       struct net_device *netdev = adapter->netdev;
+       s32 rc;
+
+       adapter->rx_buffer_len = ETH_FRAME_LEN + VLAN_HLEN + ETH_FCS_LEN;
+       adapter->rx_ps_hdr_size = 0;
+       adapter->max_frame_size = netdev->mtu + ETH_HLEN + ETH_FCS_LEN;
+       adapter->min_frame_size = ETH_ZLEN + ETH_FCS_LEN;
+
+       adapter->tx_int_delay = 8;
+       adapter->tx_abs_int_delay = 32;
+       adapter->rx_int_delay = 0;
+       adapter->rx_abs_int_delay = 8;
+       adapter->itr_setting = 3;
+       adapter->itr = 20000;
+
+       /* Set various function pointers */
+       adapter->ei->init_ops(&adapter->hw);
+
+       rc = adapter->hw.mac.ops.init_params(&adapter->hw);
+       if (rc)
+               return rc;
+
+       rc = adapter->hw.mbx.ops.init_params(&adapter->hw);
+       if (rc)
+               return rc;
+
+       igbvf_set_interrupt_capability(adapter);
+
+       if (igbvf_alloc_queues(adapter))
+               return -ENOMEM;
+
+       spin_lock_init(&adapter->tx_queue_lock);
+
+       /* Explicitly disable IRQ since the NIC can be in any state. */
+       igbvf_irq_disable(adapter);
+
+       spin_lock_init(&adapter->stats_lock);
+
+       set_bit(__IGBVF_DOWN, &adapter->state);
+       return 0;
+}
+
+static void igbvf_initialize_last_counter_stats(struct igbvf_adapter *adapter)
+{
+       struct e1000_hw *hw = &adapter->hw;
+
+       adapter->stats.last_gprc = er32(VFGPRC);
+       adapter->stats.last_gorc = er32(VFGORC);
+       adapter->stats.last_gptc = er32(VFGPTC);
+       adapter->stats.last_gotc = er32(VFGOTC);
+       adapter->stats.last_mprc = er32(VFMPRC);
+       adapter->stats.last_gotlbc = er32(VFGOTLBC);
+       adapter->stats.last_gptlbc = er32(VFGPTLBC);
+       adapter->stats.last_gorlbc = er32(VFGORLBC);
+       adapter->stats.last_gprlbc = er32(VFGPRLBC);
+
+       adapter->stats.base_gprc = er32(VFGPRC);
+       adapter->stats.base_gorc = er32(VFGORC);
+       adapter->stats.base_gptc = er32(VFGPTC);
+       adapter->stats.base_gotc = er32(VFGOTC);
+       adapter->stats.base_mprc = er32(VFMPRC);
+       adapter->stats.base_gotlbc = er32(VFGOTLBC);
+       adapter->stats.base_gptlbc = er32(VFGPTLBC);
+       adapter->stats.base_gorlbc = er32(VFGORLBC);
+       adapter->stats.base_gprlbc = er32(VFGPRLBC);
+}
+
+/**
+ * igbvf_open - Called when a network interface is made active
+ * @netdev: network interface device structure
+ *
+ * Returns 0 on success, negative value on failure
+ *
+ * The open entry point is called when a network interface is made
+ * active by the system (IFF_UP).  At this point all resources needed
+ * for transmit and receive operations are allocated, the interrupt
+ * handler is registered with the OS, the watchdog timer is started,
+ * and the stack is notified that the interface is ready.
+ **/
+static int igbvf_open(struct net_device *netdev)
+{
+       struct igbvf_adapter *adapter = netdev_priv(netdev);
+       struct e1000_hw *hw = &adapter->hw;
+       int err;
+
+       /* disallow open during test */
+       if (test_bit(__IGBVF_TESTING, &adapter->state))
+               return -EBUSY;
+
+       /* allocate transmit descriptors */
+       err = igbvf_setup_tx_resources(adapter, adapter->tx_ring);
+       if (err)
+               goto err_setup_tx;
+
+       /* allocate receive descriptors */
+       err = igbvf_setup_rx_resources(adapter, adapter->rx_ring);
+       if (err)
+               goto err_setup_rx;
+
+       /*
+        * before we allocate an interrupt, we must be ready to handle it.
+        * Setting DEBUG_SHIRQ in the kernel makes it fire an interrupt
+        * as soon as we call request_irq(), so we have to set up our
+        * clean_rx handler before we do so.
+        */
+       igbvf_configure(adapter);
+
+       err = igbvf_request_irq(adapter);
+       if (err)
+               goto err_req_irq;
+
+       /* From here on the code is the same as igbvf_up() */
+       clear_bit(__IGBVF_DOWN, &adapter->state);
+
+       napi_enable(&adapter->rx_ring->napi);
+
+       /* clear any pending interrupts */
+       er32(EICR);
+
+       igbvf_irq_enable(adapter);
+
+       /* start the watchdog */
+       hw->mac.get_link_status = 1;
+       mod_timer(&adapter->watchdog_timer, jiffies + 1);
+
+       return 0;
+
+err_req_irq:
+       igbvf_free_rx_resources(adapter->rx_ring);
+err_setup_rx:
+       igbvf_free_tx_resources(adapter->tx_ring);
+err_setup_tx:
+       igbvf_reset(adapter);
+
+       return err;
+}
+
+/**
+ * igbvf_close - Disables a network interface
+ * @netdev: network interface device structure
+ *
+ * Returns 0; this is not allowed to fail
+ *
+ * The close entry point is called when an interface is de-activated
+ * by the OS.  The hardware is still under the driver's control, but
+ * needs to be disabled.  A global MAC reset is issued to stop the
+ * hardware, and all transmit and receive resources are freed.
+ **/
+static int igbvf_close(struct net_device *netdev)
+{
+       struct igbvf_adapter *adapter = netdev_priv(netdev);
+
+       WARN_ON(test_bit(__IGBVF_RESETTING, &adapter->state));
+       igbvf_down(adapter);
+
+       igbvf_free_irq(adapter);
+
+       igbvf_free_tx_resources(adapter->tx_ring);
+       igbvf_free_rx_resources(adapter->rx_ring);
+
+       return 0;
+}
+/**
+ * igbvf_set_mac - Change the Ethernet Address of the NIC
+ * @netdev: network interface device structure
+ * @p: pointer to an address structure
+ *
+ * Returns 0 on success, negative on failure
+ **/
+static int igbvf_set_mac(struct net_device *netdev, void *p)
+{
+       struct igbvf_adapter *adapter = netdev_priv(netdev);
+       struct e1000_hw *hw = &adapter->hw;
+       struct sockaddr *addr = p;
+
+       if (!is_valid_ether_addr(addr->sa_data))
+               return -EADDRNOTAVAIL;
+
+       memcpy(hw->mac.addr, addr->sa_data, netdev->addr_len);
+
+       hw->mac.ops.rar_set(hw, hw->mac.addr, 0);
+
+       if (memcmp(addr->sa_data, hw->mac.addr, 6))
+               return -EADDRNOTAVAIL;
+
+       memcpy(netdev->dev_addr, addr->sa_data, netdev->addr_len);
+
+       return 0;
+}
+
+#define UPDATE_VF_COUNTER(reg, name)                                    \
+       {                                                               \
+               u32 current_counter = er32(reg);                        \
+               if (current_counter < adapter->stats.last_##name)       \
+                       adapter->stats.name += 0x100000000LL;           \
+               adapter->stats.last_##name = current_counter;           \
+               adapter->stats.name &= 0xFFFFFFFF00000000LL;            \
+               adapter->stats.name |= current_counter;                 \
+       }
+
+/**
+ * igbvf_update_stats - Update the board statistics counters
+ * @adapter: board private structure
+ **/
+void igbvf_update_stats(struct igbvf_adapter *adapter)
+{
+       struct e1000_hw *hw = &adapter->hw;
+       struct pci_dev *pdev = adapter->pdev;
+
+       /*
+        * Prevent stats update while the adapter is being reset, the link
+        * is down, or the PCI channel is offline.
+        */
+       if (adapter->link_speed == 0)
+               return;
+
+       if (test_bit(__IGBVF_RESETTING, &adapter->state))
+               return;
+
+       if (pci_channel_offline(pdev))
+               return;
+
+       UPDATE_VF_COUNTER(VFGPRC, gprc);
+       UPDATE_VF_COUNTER(VFGORC, gorc);
+       UPDATE_VF_COUNTER(VFGPTC, gptc);
+       UPDATE_VF_COUNTER(VFGOTC, gotc);
+       UPDATE_VF_COUNTER(VFMPRC, mprc);
+       UPDATE_VF_COUNTER(VFGOTLBC, gotlbc);
+       UPDATE_VF_COUNTER(VFGPTLBC, gptlbc);
+       UPDATE_VF_COUNTER(VFGORLBC, gorlbc);
+       UPDATE_VF_COUNTER(VFGPRLBC, gprlbc);
+
+       /* Fill out the OS statistics structure */
+       adapter->net_stats.multicast = adapter->stats.mprc;
+}
+
+static void igbvf_print_link_info(struct igbvf_adapter *adapter)
+{
+       dev_info(&adapter->pdev->dev, "Link is Up %d Mbps %s\n",
+                adapter->link_speed,
+                ((adapter->link_duplex == FULL_DUPLEX) ?
+                 "Full Duplex" : "Half Duplex"));
+}
+
+static bool igbvf_has_link(struct igbvf_adapter *adapter)
+{
+       struct e1000_hw *hw = &adapter->hw;
+       s32 ret_val = E1000_SUCCESS;
+       bool link_active;
+
+       ret_val = hw->mac.ops.check_for_link(hw);
+       link_active = !hw->mac.get_link_status;
+
+       /* if check for link returns error we will need to reset */
+       if (ret_val)
+               schedule_work(&adapter->reset_task);
+
+       return link_active;
+}
+
+/**
+ * igbvf_watchdog - Timer Call-back
+ * @data: pointer to adapter cast into an unsigned long
+ **/
+static void igbvf_watchdog(unsigned long data)
+{
+       struct igbvf_adapter *adapter = (struct igbvf_adapter *) data;
+
+       /* Do the rest outside of interrupt context */
+       schedule_work(&adapter->watchdog_task);
+}
+
+static void igbvf_watchdog_task(struct work_struct *work)
+{
+       struct igbvf_adapter *adapter = container_of(work,
+                                                    struct igbvf_adapter,
+                                                    watchdog_task);
+       struct net_device *netdev = adapter->netdev;
+       struct e1000_mac_info *mac = &adapter->hw.mac;
+       struct igbvf_ring *tx_ring = adapter->tx_ring;
+       struct e1000_hw *hw = &adapter->hw;
+       u32 link;
+       int tx_pending = 0;
+
+       link = igbvf_has_link(adapter);
+
+       if (link) {
+               if (!netif_carrier_ok(netdev)) {
+                       bool txb2b = 1;
+
+                       mac->ops.get_link_up_info(&adapter->hw,
+                                                 &adapter->link_speed,
+                                                 &adapter->link_duplex);
+                       igbvf_print_link_info(adapter);
+
+                       /*
+                        * tweak tx_queue_len according to speed/duplex
+                        * and adjust the timeout factor
+                        */
+                       netdev->tx_queue_len = adapter->tx_queue_len;
+                       adapter->tx_timeout_factor = 1;
+                       switch (adapter->link_speed) {
+                       case SPEED_10:
+                               txb2b = 0;
+                               netdev->tx_queue_len = 10;
+                               adapter->tx_timeout_factor = 16;
+                               break;
+                       case SPEED_100:
+                               txb2b = 0;
+                               netdev->tx_queue_len = 100;
+                               /* maybe add some timeout factor ? */
+                               break;
+                       }
+
+                       netif_carrier_on(netdev);
+                       netif_wake_queue(netdev);
+               }
+       } else {
+               if (netif_carrier_ok(netdev)) {
+                       adapter->link_speed = 0;
+                       adapter->link_duplex = 0;
+                       dev_info(&adapter->pdev->dev, "Link is Down\n");
+                       netif_carrier_off(netdev);
+                       netif_stop_queue(netdev);
+               }
+       }
+
+       if (netif_carrier_ok(netdev)) {
+               igbvf_update_stats(adapter);
+       } else {
+               tx_pending = (igbvf_desc_unused(tx_ring) + 1 <
+                             tx_ring->count);
+               if (tx_pending) {
+                       /*
+                        * We've lost link, so the controller stops DMA,
+                        * but we've got queued Tx work that's never going
+                        * to get done, so reset controller to flush Tx.
+                        * (Do the reset outside of interrupt context).
+                        */
+                       adapter->tx_timeout_count++;
+                       schedule_work(&adapter->reset_task);
+               }
+       }
+
+       /* Cause software interrupt to ensure Rx ring is cleaned */
+       ew32(EICS, adapter->rx_ring->eims_value);
+
+       /* Force detection of hung controller every watchdog period */
+       adapter->detect_tx_hung = 1;
+
+       /* Reset the timer */
+       if (!test_bit(__IGBVF_DOWN, &adapter->state))
+               mod_timer(&adapter->watchdog_timer,
+                         round_jiffies(jiffies + (2 * HZ)));
+}
+
+#define IGBVF_TX_FLAGS_CSUM             0x00000001
+#define IGBVF_TX_FLAGS_VLAN             0x00000002
+#define IGBVF_TX_FLAGS_TSO              0x00000004
+#define IGBVF_TX_FLAGS_IPV4             0x00000008
+#define IGBVF_TX_FLAGS_VLAN_MASK        0xffff0000
+#define IGBVF_TX_FLAGS_VLAN_SHIFT       16
+
+static int igbvf_tso(struct igbvf_adapter *adapter,
+                     struct igbvf_ring *tx_ring,
+                     struct sk_buff *skb, u32 tx_flags, u8 *hdr_len)
+{
+       struct e1000_adv_tx_context_desc *context_desc;
+       unsigned int i;
+       int err;
+       struct igbvf_buffer *buffer_info;
+       u32 info = 0, tu_cmd = 0;
+       u32 mss_l4len_idx, l4len;
+       *hdr_len = 0;
+
+       if (skb_header_cloned(skb)) {
+               err = pskb_expand_head(skb, 0, 0, GFP_ATOMIC);
+               if (err) {
+                       dev_err(&adapter->pdev->dev,
+                               "igbvf_tso returning an error\n");
+                       return err;
+               }
+       }
+
+       l4len = tcp_hdrlen(skb);
+       *hdr_len += l4len;
+
+       if (skb->protocol == htons(ETH_P_IP)) {
+               struct iphdr *iph = ip_hdr(skb);
+               iph->tot_len = 0;
+               iph->check = 0;
+               tcp_hdr(skb)->check = ~csum_tcpudp_magic(iph->saddr,
+                                                        iph->daddr, 0,
+                                                        IPPROTO_TCP,
+                                                        0);
+       } else if (skb_shinfo(skb)->gso_type == SKB_GSO_TCPV6) {
+               ipv6_hdr(skb)->payload_len = 0;
+               tcp_hdr(skb)->check = ~csum_ipv6_magic(&ipv6_hdr(skb)->saddr,
+                                                      &ipv6_hdr(skb)->daddr,
+                                                      0, IPPROTO_TCP, 0);
+       }
+
+       i = tx_ring->next_to_use;
+
+       buffer_info = &tx_ring->buffer_info[i];
+       context_desc = IGBVF_TX_CTXTDESC_ADV(*tx_ring, i);
+       /* VLAN MACLEN IPLEN */
+       if (tx_flags & IGBVF_TX_FLAGS_VLAN)
+               info |= (tx_flags & IGBVF_TX_FLAGS_VLAN_MASK);
+       info |= (skb_network_offset(skb) << E1000_ADVTXD_MACLEN_SHIFT);
+       *hdr_len += skb_network_offset(skb);
+       info |= (skb_transport_header(skb) - skb_network_header(skb));
+       *hdr_len += (skb_transport_header(skb) - skb_network_header(skb));
+       context_desc->vlan_macip_lens = cpu_to_le32(info);
+
+       /* ADV DTYP TUCMD MKRLOC/ISCSIHEDLEN */
+       tu_cmd |= (E1000_TXD_CMD_DEXT | E1000_ADVTXD_DTYP_CTXT);
+
+       if (skb->protocol == htons(ETH_P_IP))
+               tu_cmd |= E1000_ADVTXD_TUCMD_IPV4;
+       tu_cmd |= E1000_ADVTXD_TUCMD_L4T_TCP;
+
+       context_desc->type_tucmd_mlhl = cpu_to_le32(tu_cmd);
+
+       /* MSS L4LEN IDX */
+       mss_l4len_idx = (skb_shinfo(skb)->gso_size << E1000_ADVTXD_MSS_SHIFT);
+       mss_l4len_idx |= (l4len << E1000_ADVTXD_L4LEN_SHIFT);
+
+       context_desc->mss_l4len_idx = cpu_to_le32(mss_l4len_idx);
+       context_desc->seqnum_seed = 0;
+
+       buffer_info->time_stamp = jiffies;
+       buffer_info->next_to_watch = i;
+       buffer_info->dma = 0;
+       i++;
+       if (i == tx_ring->count)
+               i = 0;
+
+       tx_ring->next_to_use = i;
+
+       return true;
+}
+
+static inline bool igbvf_tx_csum(struct igbvf_adapter *adapter,
+                                 struct igbvf_ring *tx_ring,
+                                 struct sk_buff *skb, u32 tx_flags)
+{
+       struct e1000_adv_tx_context_desc *context_desc;
+       unsigned int i;
+       struct igbvf_buffer *buffer_info;
+       u32 info = 0, tu_cmd = 0;
+
+       if ((skb->ip_summed == CHECKSUM_PARTIAL) ||
+           (tx_flags & IGBVF_TX_FLAGS_VLAN)) {
+               i = tx_ring->next_to_use;
+               buffer_info = &tx_ring->buffer_info[i];
+               context_desc = IGBVF_TX_CTXTDESC_ADV(*tx_ring, i);
+
+               if (tx_flags & IGBVF_TX_FLAGS_VLAN)
+                       info |= (tx_flags & IGBVF_TX_FLAGS_VLAN_MASK);
+
+               info |= (skb_network_offset(skb) << E1000_ADVTXD_MACLEN_SHIFT);
+               if (skb->ip_summed == CHECKSUM_PARTIAL)
+                       info |= (skb_transport_header(skb) -
+                                skb_network_header(skb));
+
+
+               context_desc->vlan_macip_lens = cpu_to_le32(info);
+
+               tu_cmd |= (E1000_TXD_CMD_DEXT | E1000_ADVTXD_DTYP_CTXT);
+
+               if (skb->ip_summed == CHECKSUM_PARTIAL) {
+                       switch (skb->protocol) {
+                       case __constant_htons(ETH_P_IP):
+                               tu_cmd |= E1000_ADVTXD_TUCMD_IPV4;
+                               if (ip_hdr(skb)->protocol == IPPROTO_TCP)
+                                       tu_cmd |= E1000_ADVTXD_TUCMD_L4T_TCP;
+                               break;
+                       case __constant_htons(ETH_P_IPV6):
+                               if (ipv6_hdr(skb)->nexthdr == IPPROTO_TCP)
+                                       tu_cmd |= E1000_ADVTXD_TUCMD_L4T_TCP;
+                               break;
+                       default:
+                               break;
+                       }
+               }
+
+               context_desc->type_tucmd_mlhl = cpu_to_le32(tu_cmd);
+               context_desc->seqnum_seed = 0;
+               context_desc->mss_l4len_idx = 0;
+
+               buffer_info->time_stamp = jiffies;
+               buffer_info->next_to_watch = i;
+               buffer_info->dma = 0;
+               i++;
+               if (i == tx_ring->count)
+                       i = 0;
+               tx_ring->next_to_use = i;
+
+               return true;
+       }
+
+       return false;
+}
+
+static int igbvf_maybe_stop_tx(struct net_device *netdev, int size)
+{
+       struct igbvf_adapter *adapter = netdev_priv(netdev);
+
+       /* if there are enough descriptors then we don't need to worry */
+       if (igbvf_desc_unused(adapter->tx_ring) >= size)
+               return 0;
+
+       netif_stop_queue(netdev);
+
+       smp_mb();
+
+       /* We need to check again just in case room has been made available */
+       if (igbvf_desc_unused(adapter->tx_ring) < size)
+               return -EBUSY;
+
+       netif_wake_queue(netdev);
+
+       ++adapter->restart_queue;
+       return 0;
+}
+
+#define IGBVF_MAX_TXD_PWR       16
+#define IGBVF_MAX_DATA_PER_TXD  (1 << IGBVF_MAX_TXD_PWR)
+
+static inline int igbvf_tx_map_adv(struct igbvf_adapter *adapter,
+                                   struct igbvf_ring *tx_ring,
+                                   struct sk_buff *skb,
+                                   unsigned int first)
+{
+       struct igbvf_buffer *buffer_info;
+       unsigned int len = skb_headlen(skb);
+       unsigned int count = 0, i;
+       unsigned int f;
+       dma_addr_t *map;
+
+       i = tx_ring->next_to_use;
+
+       if (skb_dma_map(&adapter->pdev->dev, skb, DMA_TO_DEVICE)) {
+               dev_err(&adapter->pdev->dev, "TX DMA map failed\n");
+               return 0;
+       }
+
+       map = skb_shinfo(skb)->dma_maps;
+
+       buffer_info = &tx_ring->buffer_info[i];
+       BUG_ON(len >= IGBVF_MAX_DATA_PER_TXD);
+       buffer_info->length = len;
+       /* set time_stamp *before* dma to help avoid a possible race */
+       buffer_info->time_stamp = jiffies;
+       buffer_info->next_to_watch = i;
+       buffer_info->dma = map[count];
+       count++;
+
+       for (f = 0; f < skb_shinfo(skb)->nr_frags; f++) {
+               struct skb_frag_struct *frag;
+
+               i++;
+               if (i == tx_ring->count)
+                       i = 0;
+
+               frag = &skb_shinfo(skb)->frags[f];
+               len = frag->size;
+
+               buffer_info = &tx_ring->buffer_info[i];
+               BUG_ON(len >= IGBVF_MAX_DATA_PER_TXD);
+               buffer_info->length = len;
+               buffer_info->time_stamp = jiffies;
+               buffer_info->next_to_watch = i;
+               buffer_info->dma = map[count];
+               count++;
+       }
+
+       tx_ring->buffer_info[i].skb = skb;
+       tx_ring->buffer_info[first].next_to_watch = i;
+
+       return count;
+}
+
+static inline void igbvf_tx_queue_adv(struct igbvf_adapter *adapter,
+                                      struct igbvf_ring *tx_ring,
+                                      int tx_flags, int count, u32 paylen,
+                                      u8 hdr_len)
+{
+       union e1000_adv_tx_desc *tx_desc = NULL;
+       struct igbvf_buffer *buffer_info;
+       u32 olinfo_status = 0, cmd_type_len;
+       unsigned int i;
+
+       cmd_type_len = (E1000_ADVTXD_DTYP_DATA | E1000_ADVTXD_DCMD_IFCS |
+                       E1000_ADVTXD_DCMD_DEXT);
+
+       if (tx_flags & IGBVF_TX_FLAGS_VLAN)
+               cmd_type_len |= E1000_ADVTXD_DCMD_VLE;
+
+       if (tx_flags & IGBVF_TX_FLAGS_TSO) {
+               cmd_type_len |= E1000_ADVTXD_DCMD_TSE;
+
+               /* insert tcp checksum */
+               olinfo_status |= E1000_TXD_POPTS_TXSM << 8;
+
+               /* insert ip checksum */
+               if (tx_flags & IGBVF_TX_FLAGS_IPV4)
+                       olinfo_status |= E1000_TXD_POPTS_IXSM << 8;
+
+       } else if (tx_flags & IGBVF_TX_FLAGS_CSUM) {
+               olinfo_status |= E1000_TXD_POPTS_TXSM << 8;
+       }
+
+       olinfo_status |= ((paylen - hdr_len) << E1000_ADVTXD_PAYLEN_SHIFT);
+
+       i = tx_ring->next_to_use;
+       while (count--) {
+               buffer_info = &tx_ring->buffer_info[i];
+               tx_desc = IGBVF_TX_DESC_ADV(*tx_ring, i);
+               tx_desc->read.buffer_addr = cpu_to_le64(buffer_info->dma);
+               tx_desc->read.cmd_type_len =
+                        cpu_to_le32(cmd_type_len | buffer_info->length);
+               tx_desc->read.olinfo_status = cpu_to_le32(olinfo_status);
+               i++;
+               if (i == tx_ring->count)
+                       i = 0;
+       }
+
+       tx_desc->read.cmd_type_len |= cpu_to_le32(adapter->txd_cmd);
+       /* Force memory writes to complete before letting h/w
+        * know there are new descriptors to fetch.  (Only
+        * applicable for weak-ordered memory model archs,
+        * such as IA-64). */
+       wmb();
+
+       tx_ring->next_to_use = i;
+       writel(i, adapter->hw.hw_addr + tx_ring->tail);
+       /* we need this if more than one processor can write to our tail
+        * at a time; it synchronizes IO on IA64/Altix systems */
+       mmiowb();
+}
+
+static int igbvf_xmit_frame_ring_adv(struct sk_buff *skb,
+                                   struct net_device *netdev,
+                                   struct igbvf_ring *tx_ring)
+{
+       struct igbvf_adapter *adapter = netdev_priv(netdev);
+       unsigned int first, tx_flags = 0;
+       u8 hdr_len = 0;
+       int count = 0;
+       int tso = 0;
+
+       if (test_bit(__IGBVF_DOWN, &adapter->state)) {
+               dev_kfree_skb_any(skb);
+               return NETDEV_TX_OK;
+       }
+
+       if (skb->len <= 0) {
+               dev_kfree_skb_any(skb);
+               return NETDEV_TX_OK;
+       }
+
+       /*
+        * need: nr_frags + 4 descriptors, i.e. one per fragment plus
+        *       + 2 desc gap to keep tail from touching head,
+        *       + 1 desc for skb->data,
+        *       + 1 desc for context descriptor,
+        * otherwise try next time
+        */
+       if (igbvf_maybe_stop_tx(netdev, skb_shinfo(skb)->nr_frags + 4)) {
+               /* this is a hard error */
+               return NETDEV_TX_BUSY;
+       }
+
+       if (adapter->vlgrp && vlan_tx_tag_present(skb)) {
+               tx_flags |= IGBVF_TX_FLAGS_VLAN;
+               tx_flags |= (vlan_tx_tag_get(skb) << IGBVF_TX_FLAGS_VLAN_SHIFT);
+       }
+
+       if (skb->protocol == htons(ETH_P_IP))
+               tx_flags |= IGBVF_TX_FLAGS_IPV4;
+
+       first = tx_ring->next_to_use;
+
+       tso = skb_is_gso(skb) ?
+               igbvf_tso(adapter, tx_ring, skb, tx_flags, &hdr_len) : 0;
+       if (unlikely(tso < 0)) {
+               dev_kfree_skb_any(skb);
+               return NETDEV_TX_OK;
+       }
+
+       if (tso)
+               tx_flags |= IGBVF_TX_FLAGS_TSO;
+       else if (igbvf_tx_csum(adapter, tx_ring, skb, tx_flags) &&
+                (skb->ip_summed == CHECKSUM_PARTIAL))
+               tx_flags |= IGBVF_TX_FLAGS_CSUM;
+
+       /*
+        * count reflects descriptors mapped; if 0 then a mapping error
+        * has occurred and we need to rewind the descriptor queue
+        */
+       count = igbvf_tx_map_adv(adapter, tx_ring, skb, first);
+
+       if (count) {
+               igbvf_tx_queue_adv(adapter, tx_ring, tx_flags, count,
+                                  skb->len, hdr_len);
+               netdev->trans_start = jiffies;
+               /* Make sure there is space in the ring for the next send. */
+               igbvf_maybe_stop_tx(netdev, MAX_SKB_FRAGS + 4);
+       } else {
+               dev_kfree_skb_any(skb);
+               tx_ring->buffer_info[first].time_stamp = 0;
+               tx_ring->next_to_use = first;
+       }
+
+       return NETDEV_TX_OK;
+}
+
+static int igbvf_xmit_frame(struct sk_buff *skb, struct net_device *netdev)
+{
+       struct igbvf_adapter *adapter = netdev_priv(netdev);
+       struct igbvf_ring *tx_ring;
+       int retval;
+
+       if (test_bit(__IGBVF_DOWN, &adapter->state)) {
+               dev_kfree_skb_any(skb);
+               return NETDEV_TX_OK;
+       }
+
+       tx_ring = &adapter->tx_ring[0];
+
+       retval = igbvf_xmit_frame_ring_adv(skb, netdev, tx_ring);
+
+       return retval;
+}
+
+/**
+ * igbvf_tx_timeout - Respond to a Tx Hang
+ * @netdev: network interface device structure
+ **/
+static void igbvf_tx_timeout(struct net_device *netdev)
+{
+       struct igbvf_adapter *adapter = netdev_priv(netdev);
+
+       /* Do the reset outside of interrupt context */
+       adapter->tx_timeout_count++;
+       schedule_work(&adapter->reset_task);
+}
+
+static void igbvf_reset_task(struct work_struct *work)
+{
+       struct igbvf_adapter *adapter;
+       adapter = container_of(work, struct igbvf_adapter, reset_task);
+
+       igbvf_reinit_locked(adapter);
+}
+
+/**
+ * igbvf_get_stats - Get System Network Statistics
+ * @netdev: network interface device structure
+ *
+ * Returns the address of the device statistics structure.
+ * The statistics are actually updated from the timer callback.
+ **/
+static struct net_device_stats *igbvf_get_stats(struct net_device *netdev)
+{
+       struct igbvf_adapter *adapter = netdev_priv(netdev);
+
+       /* only return the current stats */
+       return &adapter->net_stats;
+}
+
+/**
+ * igbvf_change_mtu - Change the Maximum Transfer Unit
+ * @netdev: network interface device structure
+ * @new_mtu: new value for maximum frame size
+ *
+ * Returns 0 on success, negative on failure
+ **/
+static int igbvf_change_mtu(struct net_device *netdev, int new_mtu)
+{
+       struct igbvf_adapter *adapter = netdev_priv(netdev);
+       int max_frame = new_mtu + ETH_HLEN + ETH_FCS_LEN;
+
+       if ((new_mtu < 68) || (max_frame > MAX_JUMBO_FRAME_SIZE)) {
+               dev_err(&adapter->pdev->dev, "Invalid MTU setting\n");
+               return -EINVAL;
+       }
+
+       /* Jumbo frame size limits */
+       if (max_frame > ETH_FRAME_LEN + ETH_FCS_LEN) {
+               if (!(adapter->flags & FLAG_HAS_JUMBO_FRAMES)) {
+                       dev_err(&adapter->pdev->dev,
+                               "Jumbo Frames not supported.\n");
+                       return -EINVAL;
+               }
+       }
+
+#define MAX_STD_JUMBO_FRAME_SIZE 9234
+       if (max_frame > MAX_STD_JUMBO_FRAME_SIZE) {
+               dev_err(&adapter->pdev->dev, "MTU > 9216 not supported.\n");
+               return -EINVAL;
+       }
+
+       while (test_and_set_bit(__IGBVF_RESETTING, &adapter->state))
+               msleep(1);
+       /* igbvf_down has a dependency on max_frame_size */
+       adapter->max_frame_size = max_frame;
+       if (netif_running(netdev))
+               igbvf_down(adapter);
+
+       /*
+        * NOTE: netdev_alloc_skb reserves 16 bytes, and typically NET_IP_ALIGN
+        * means we reserve 2 more, this pushes us to allocate from the next
+        * larger slab size.
+        * i.e. RXBUFFER_2048 --> size-4096 slab
+        * However with the new *_jumbo_rx* routines, jumbo receives will use
+        * fragmented skbs
+        */
+
+       if (max_frame <= 1024)
+               adapter->rx_buffer_len = 1024;
+       else if (max_frame <= 2048)
+               adapter->rx_buffer_len = 2048;
+       else
+#if (PAGE_SIZE / 2) > 16384
+               adapter->rx_buffer_len = 16384;
+#else
+               adapter->rx_buffer_len = PAGE_SIZE / 2;
+#endif
+
+
+       /* adjust allocation if LPE protects us, and we aren't using SBP */
+       if ((max_frame == ETH_FRAME_LEN + ETH_FCS_LEN) ||
+            (max_frame == ETH_FRAME_LEN + VLAN_HLEN + ETH_FCS_LEN))
+               adapter->rx_buffer_len = ETH_FRAME_LEN + VLAN_HLEN +
+                                        ETH_FCS_LEN;
+
+       dev_info(&adapter->pdev->dev, "changing MTU from %d to %d\n",
+                netdev->mtu, new_mtu);
+       netdev->mtu = new_mtu;
+
+       if (netif_running(netdev))
+               igbvf_up(adapter);
+       else
+               igbvf_reset(adapter);
+
+       clear_bit(__IGBVF_RESETTING, &adapter->state);
+
+       return 0;
+}
+
+static int igbvf_ioctl(struct net_device *netdev, struct ifreq *ifr, int cmd)
+{
+       switch (cmd) {
+       default:
+               return -EOPNOTSUPP;
+       }
+}
+
+static int igbvf_suspend(struct pci_dev *pdev, pm_message_t state)
+{
+       struct net_device *netdev = pci_get_drvdata(pdev);
+       struct igbvf_adapter *adapter = netdev_priv(netdev);
+#ifdef CONFIG_PM
+       int retval = 0;
+#endif
+
+       netif_device_detach(netdev);
+
+       if (netif_running(netdev)) {
+               WARN_ON(test_bit(__IGBVF_RESETTING, &adapter->state));
+               igbvf_down(adapter);
+               igbvf_free_irq(adapter);
+       }
+
+#ifdef CONFIG_PM
+       retval = pci_save_state(pdev);
+       if (retval)
+               return retval;
+#endif
+
+       pci_disable_device(pdev);
+
+       return 0;
+}
+
+#ifdef CONFIG_PM
+static int igbvf_resume(struct pci_dev *pdev)
+{
+       struct net_device *netdev = pci_get_drvdata(pdev);
+       struct igbvf_adapter *adapter = netdev_priv(netdev);
+       u32 err;
+
+       pci_restore_state(pdev);
+       err = pci_enable_device_mem(pdev);
+       if (err) {
+               dev_err(&pdev->dev, "Cannot enable PCI device from suspend\n");
+               return err;
+       }
+
+       pci_set_master(pdev);
+
+       if (netif_running(netdev)) {
+               err = igbvf_request_irq(adapter);
+               if (err)
+                       return err;
+       }
+
+       igbvf_reset(adapter);
+
+       if (netif_running(netdev))
+               igbvf_up(adapter);
+
+       netif_device_attach(netdev);
+
+       return 0;
+}
+#endif
+
+static void igbvf_shutdown(struct pci_dev *pdev)
+{
+       igbvf_suspend(pdev, PMSG_SUSPEND);
+}
+
+#ifdef CONFIG_NET_POLL_CONTROLLER
+/*
+ * Polling 'interrupt' - used by things like netconsole to send skbs
+ * without having to re-enable interrupts. It's not called while
+ * the interrupt routine is executing.
+ */
+static void igbvf_netpoll(struct net_device *netdev)
+{
+       struct igbvf_adapter *adapter = netdev_priv(netdev);
+
+       disable_irq(adapter->pdev->irq);
+
+       igbvf_clean_tx_irq(adapter->tx_ring);
+
+       enable_irq(adapter->pdev->irq);
+}
+#endif
+
+/**
+ * igbvf_io_error_detected - called when PCI error is detected
+ * @pdev: Pointer to PCI device
+ * @state: The current pci connection state
+ *
+ * This function is called after a PCI bus error affecting
+ * this device has been detected.
+ */
+static pci_ers_result_t igbvf_io_error_detected(struct pci_dev *pdev,
+                                                pci_channel_state_t state)
+{
+       struct net_device *netdev = pci_get_drvdata(pdev);
+       struct igbvf_adapter *adapter = netdev_priv(netdev);
+
+       netif_device_detach(netdev);
+
+       if (netif_running(netdev))
+               igbvf_down(adapter);
+       pci_disable_device(pdev);
+
+       /* Request a slot reset. */
+       return PCI_ERS_RESULT_NEED_RESET;
+}
+
+/**
+ * igbvf_io_slot_reset - called after the pci bus has been reset.
+ * @pdev: Pointer to PCI device
+ *
+ * Restart the card from scratch, as if from a cold-boot. Implementation
+ * resembles the first-half of the igbvf_resume routine.
+ */
+static pci_ers_result_t igbvf_io_slot_reset(struct pci_dev *pdev)
+{
+       struct net_device *netdev = pci_get_drvdata(pdev);
+       struct igbvf_adapter *adapter = netdev_priv(netdev);
+
+       if (pci_enable_device_mem(pdev)) {
+               dev_err(&pdev->dev,
+                       "Cannot re-enable PCI device after reset.\n");
+               return PCI_ERS_RESULT_DISCONNECT;
+       }
+       pci_set_master(pdev);
+
+       igbvf_reset(adapter);
+
+       return PCI_ERS_RESULT_RECOVERED;
+}
+
+/**
+ * igbvf_io_resume - called when traffic can start flowing again.
+ * @pdev: Pointer to PCI device
+ *
+ * This callback is called when the error recovery driver tells us that
+ * it's OK to resume normal operation. Implementation resembles the
+ * second-half of the igbvf_resume routine.
+ */
+static void igbvf_io_resume(struct pci_dev *pdev)
+{
+       struct net_device *netdev = pci_get_drvdata(pdev);
+       struct igbvf_adapter *adapter = netdev_priv(netdev);
+
+       if (netif_running(netdev)) {
+               if (igbvf_up(adapter)) {
+                       dev_err(&pdev->dev,
+                               "can't bring device back up after reset\n");
+                       return;
+               }
+       }
+
+       netif_device_attach(netdev);
+}
+
+static void igbvf_print_device_info(struct igbvf_adapter *adapter)
+{
+       struct e1000_hw *hw = &adapter->hw;
+       struct net_device *netdev = adapter->netdev;
+       struct pci_dev *pdev = adapter->pdev;
+
+       dev_info(&pdev->dev, "Intel(R) 82576 Virtual Function\n");
+       dev_info(&pdev->dev, "Address: %02x:%02x:%02x:%02x:%02x:%02x\n",
+                /* MAC address */
+                netdev->dev_addr[0], netdev->dev_addr[1],
+                netdev->dev_addr[2], netdev->dev_addr[3],
+                netdev->dev_addr[4], netdev->dev_addr[5]);
+       dev_info(&pdev->dev, "MAC: %d\n", hw->mac.type);
+}
+
+static const struct net_device_ops igbvf_netdev_ops = {
+       .ndo_open                       = igbvf_open,
+       .ndo_stop                       = igbvf_close,
+       .ndo_start_xmit                 = igbvf_xmit_frame,
+       .ndo_get_stats                  = igbvf_get_stats,
+       .ndo_set_multicast_list         = igbvf_set_multi,
+       .ndo_set_mac_address            = igbvf_set_mac,
+       .ndo_change_mtu                 = igbvf_change_mtu,
+       .ndo_do_ioctl                   = igbvf_ioctl,
+       .ndo_tx_timeout                 = igbvf_tx_timeout,
+       .ndo_vlan_rx_register           = igbvf_vlan_rx_register,
+       .ndo_vlan_rx_add_vid            = igbvf_vlan_rx_add_vid,
+       .ndo_vlan_rx_kill_vid           = igbvf_vlan_rx_kill_vid,
+#ifdef CONFIG_NET_POLL_CONTROLLER
+       .ndo_poll_controller            = igbvf_netpoll,
+#endif
+};
+
+/**
+ * igbvf_probe - Device Initialization Routine
+ * @pdev: PCI device information struct
+ * @ent: entry in igbvf_pci_tbl
+ *
+ * Returns 0 on success, negative on failure
+ *
+ * igbvf_probe initializes an adapter identified by a pci_dev structure.
+ * The OS initialization, configuring of the adapter private structure,
+ * and a hardware reset occur.
+ **/
+static int __devinit igbvf_probe(struct pci_dev *pdev,
+                                 const struct pci_device_id *ent)
+{
+       struct net_device *netdev;
+       struct igbvf_adapter *adapter;
+       struct e1000_hw *hw;
+       const struct igbvf_info *ei = igbvf_info_tbl[ent->driver_data];
+
+       static int cards_found;
+       int err, pci_using_dac;
+
+       err = pci_enable_device_mem(pdev);
+       if (err)
+               return err;
+
+       pci_using_dac = 0;
+       err = pci_set_dma_mask(pdev, DMA_64BIT_MASK);
+       if (!err) {
+               err = pci_set_consistent_dma_mask(pdev, DMA_64BIT_MASK);
+               if (!err)
+                       pci_using_dac = 1;
+       } else {
+               err = pci_set_dma_mask(pdev, DMA_32BIT_MASK);
+               if (err) {
+                       err = pci_set_consistent_dma_mask(pdev, DMA_32BIT_MASK);
+                       if (err) {
+                               dev_err(&pdev->dev, "No usable DMA "
+                                       "configuration, aborting\n");
+                               goto err_dma;
+                       }
+               }
+       }
+
+       err = pci_request_regions(pdev, igbvf_driver_name);
+       if (err)
+               goto err_pci_reg;
+
+       pci_set_master(pdev);
+
+       err = -ENOMEM;
+       netdev = alloc_etherdev(sizeof(struct igbvf_adapter));
+       if (!netdev)
+               goto err_alloc_etherdev;
+
+       SET_NETDEV_DEV(netdev, &pdev->dev);
+
+       pci_set_drvdata(pdev, netdev);
+       adapter = netdev_priv(netdev);
+       hw = &adapter->hw;
+       adapter->netdev = netdev;
+       adapter->pdev = pdev;
+       adapter->ei = ei;
+       adapter->pba = ei->pba;
+       adapter->flags = ei->flags;
+       adapter->hw.back = adapter;
+       adapter->hw.mac.type = ei->mac;
+       adapter->msg_enable = (1 << NETIF_MSG_DRV | NETIF_MSG_PROBE) - 1;
+
+       /* PCI config space info */
+
+       hw->vendor_id = pdev->vendor;
+       hw->device_id = pdev->device;
+       hw->subsystem_vendor_id = pdev->subsystem_vendor;
+       hw->subsystem_device_id = pdev->subsystem_device;
+
+       pci_read_config_byte(pdev, PCI_REVISION_ID, &hw->revision_id);
+
+       err = -EIO;
+       adapter->hw.hw_addr = ioremap(pci_resource_start(pdev, 0),
+                                     pci_resource_len(pdev, 0));
+
+       if (!adapter->hw.hw_addr)
+               goto err_ioremap;
+
+       if (ei->get_variants) {
+               err = ei->get_variants(adapter);
+               if (err)
+                       goto err_ioremap;
+       }
+
+       /* setup adapter struct */
+       err = igbvf_sw_init(adapter);
+       if (err)
+               goto err_sw_init;
+
+       /* construct the net_device struct */
+       netdev->netdev_ops = &igbvf_netdev_ops;
+
+       igbvf_set_ethtool_ops(netdev);
+       netdev->watchdog_timeo = 5 * HZ;
+       strncpy(netdev->name, pci_name(pdev), sizeof(netdev->name) - 1);
+
+       adapter->bd_number = cards_found++;
+
+       netdev->features = NETIF_F_SG |
+                          NETIF_F_IP_CSUM |
+                          NETIF_F_HW_VLAN_TX |
+                          NETIF_F_HW_VLAN_RX |
+                          NETIF_F_HW_VLAN_FILTER;
+
+       netdev->features |= NETIF_F_IPV6_CSUM;
+       netdev->features |= NETIF_F_TSO;
+       netdev->features |= NETIF_F_TSO6;
+
+       if (pci_using_dac)
+               netdev->features |= NETIF_F_HIGHDMA;
+
+       netdev->vlan_features |= NETIF_F_TSO;
+       netdev->vlan_features |= NETIF_F_TSO6;
+       netdev->vlan_features |= NETIF_F_IP_CSUM;
+       netdev->vlan_features |= NETIF_F_IPV6_CSUM;
+       netdev->vlan_features |= NETIF_F_SG;
+
+       /* reset the controller to put the device in a known good state */
+       err = hw->mac.ops.reset_hw(hw);
+       if (err) {
+               dev_info(&pdev->dev,
+                        "PF still in reset state, assigning new address\n");
+               random_ether_addr(hw->mac.addr);
+       } else {
+               err = hw->mac.ops.read_mac_addr(hw);
+               if (err) {
+                       dev_err(&pdev->dev, "Error reading MAC address\n");
+                       goto err_hw_init;
+               }
+       }
+
+       memcpy(netdev->dev_addr, adapter->hw.mac.addr, netdev->addr_len);
+       memcpy(netdev->perm_addr, adapter->hw.mac.addr, netdev->addr_len);
+
+       if (!is_valid_ether_addr(netdev->perm_addr)) {
+               dev_err(&pdev->dev, "Invalid MAC Address: "
+                       "%02x:%02x:%02x:%02x:%02x:%02x\n",
+                       netdev->dev_addr[0], netdev->dev_addr[1],
+                       netdev->dev_addr[2], netdev->dev_addr[3],
+                       netdev->dev_addr[4], netdev->dev_addr[5]);
+               err = -EIO;
+               goto err_hw_init;
+       }
+
+       setup_timer(&adapter->watchdog_timer, &igbvf_watchdog,
+                   (unsigned long) adapter);
+
+       INIT_WORK(&adapter->reset_task, igbvf_reset_task);
+       INIT_WORK(&adapter->watchdog_task, igbvf_watchdog_task);
+
+       /* ring size defaults */
+       adapter->rx_ring->count = 1024;
+       adapter->tx_ring->count = 1024;
+
+       /* reset the hardware with the new settings */
+       igbvf_reset(adapter);
+
+       /* tell the stack to leave us alone until igbvf_open() is called */
+       netif_carrier_off(netdev);
+       netif_stop_queue(netdev);
+
+       strcpy(netdev->name, "eth%d");
+       err = register_netdev(netdev);
+       if (err)
+               goto err_hw_init;
+
+       igbvf_print_device_info(adapter);
+
+       igbvf_initialize_last_counter_stats(adapter);
+
+       return 0;
+
+err_hw_init:
+       kfree(adapter->tx_ring);
+       kfree(adapter->rx_ring);
+err_sw_init:
+       igbvf_reset_interrupt_capability(adapter);
+       iounmap(adapter->hw.hw_addr);
+err_ioremap:
+       free_netdev(netdev);
+err_alloc_etherdev:
+       pci_release_regions(pdev);
+err_pci_reg:
+err_dma:
+       pci_disable_device(pdev);
+       return err;
+}
+
+/**
+ * igbvf_remove - Device Removal Routine
+ * @pdev: PCI device information struct
+ *
+ * igbvf_remove is called by the PCI subsystem to alert the driver
+ * that it should release a PCI device.  This could be caused by a
+ * Hot-Plug event, or because the driver is going to be removed from
+ * memory.
+ **/
+static void __devexit igbvf_remove(struct pci_dev *pdev)
+{
+       struct net_device *netdev = pci_get_drvdata(pdev);
+       struct igbvf_adapter *adapter = netdev_priv(netdev);
+       struct e1000_hw *hw = &adapter->hw;
+
+       /*
+        * flush_scheduled_work() may reschedule our watchdog task, so
+        * explicitly disable watchdog tasks from being rescheduled
+        */
+       set_bit(__IGBVF_DOWN, &adapter->state);
+       del_timer_sync(&adapter->watchdog_timer);
+
+       flush_scheduled_work();
+
+       unregister_netdev(netdev);
+
+       igbvf_reset_interrupt_capability(adapter);
+
+       /*
+        * it is important to delete the napi struct prior to freeing the
+        * rx ring so that you do not end up with null pointer refs
+        */
+       netif_napi_del(&adapter->rx_ring->napi);
+       kfree(adapter->tx_ring);
+       kfree(adapter->rx_ring);
+
+       iounmap(hw->hw_addr);
+       if (hw->flash_address)
+               iounmap(hw->flash_address);
+       pci_release_regions(pdev);
+
+       free_netdev(netdev);
+
+       pci_disable_device(pdev);
+}
+
+/* PCI Error Recovery (ERS) */
+static struct pci_error_handlers igbvf_err_handler = {
+       .error_detected = igbvf_io_error_detected,
+       .slot_reset = igbvf_io_slot_reset,
+       .resume = igbvf_io_resume,
+};
+
+static struct pci_device_id igbvf_pci_tbl[] = {
+       { PCI_VDEVICE(INTEL, E1000_DEV_ID_82576_VF), board_vf },
+       { } /* terminate list */
+};
+MODULE_DEVICE_TABLE(pci, igbvf_pci_tbl);
+
+/* PCI Device API Driver */
+static struct pci_driver igbvf_driver = {
+       .name     = igbvf_driver_name,
+       .id_table = igbvf_pci_tbl,
+       .probe    = igbvf_probe,
+       .remove   = __devexit_p(igbvf_remove),
+#ifdef CONFIG_PM
+       /* Power Management Hooks */
+       .suspend  = igbvf_suspend,
+       .resume   = igbvf_resume,
+#endif
+       .shutdown = igbvf_shutdown,
+       .err_handler = &igbvf_err_handler
+};
+
+/**
+ * igbvf_init_module - Driver Registration Routine
+ *
+ * igbvf_init_module is the first routine called when the driver is
+ * loaded. All it does is register with the PCI subsystem.
+ **/
+static int __init igbvf_init_module(void)
+{
+       int ret;
+       printk(KERN_INFO "%s - version %s\n",
+              igbvf_driver_string, igbvf_driver_version);
+       printk(KERN_INFO "%s\n", igbvf_copyright);
+
+       ret = pci_register_driver(&igbvf_driver);
+       pm_qos_add_requirement(PM_QOS_CPU_DMA_LATENCY, igbvf_driver_name,
+                              PM_QOS_DEFAULT_VALUE);
+
+       return ret;
+}
+module_init(igbvf_init_module);
+
+/**
+ * igbvf_exit_module - Driver Exit Cleanup Routine
+ *
+ * igbvf_exit_module is called just before the driver is removed
+ * from memory.
+ **/
+static void __exit igbvf_exit_module(void)
+{
+       pci_unregister_driver(&igbvf_driver);
+       pm_qos_remove_requirement(PM_QOS_CPU_DMA_LATENCY, igbvf_driver_name);
+}
+module_exit(igbvf_exit_module);
+
+
+MODULE_AUTHOR("Intel Corporation, <e1000-devel@lists.sourceforge.net>");
+MODULE_DESCRIPTION("Intel(R) 82576 Virtual Function Network Driver");
+MODULE_LICENSE("GPL");
+MODULE_VERSION(DRV_VERSION);
+
+/* netdev.c */
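Note on the probe error path above: the err_hw_init / err_sw_init / err_ioremap labels release resources in the reverse order of their acquisition, so a failure at any step undoes only what has already been set up. The following is a minimal user-space sketch of that pattern, not the driver's code. The names demo_dev, demo_acquire, demo_probe and demo_remove are hypothetical, and malloc() merely stands in for the PCI regions, MMIO mapping and ring allocations.

```c
#include <errno.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the resources a probe routine acquires. */
struct demo_dev {
	void *regions;
	void *mmio;
	void *rings;
};

/* Acquire one resource, or simulate a failure when asked to. */
static void *demo_acquire(int should_fail)
{
	return should_fail ? NULL : malloc(16);
}

/* fail_at: 0 = succeed, 1..3 = make that acquisition step fail. */
static int demo_probe(struct demo_dev *d, int fail_at)
{
	int err = -ENOMEM;

	d->regions = d->mmio = d->rings = NULL;

	d->regions = demo_acquire(fail_at == 1);
	if (!d->regions)
		goto err_regions;

	d->mmio = demo_acquire(fail_at == 2);
	if (!d->mmio)
		goto err_mmio;

	d->rings = demo_acquire(fail_at == 3);
	if (!d->rings)
		goto err_rings;

	return 0;

	/* Each label undoes the steps that succeeded before the failure. */
err_rings:
	free(d->mmio);
	d->mmio = NULL;
err_mmio:
	free(d->regions);
	d->regions = NULL;
err_regions:
	return err;
}

/* Normal teardown mirrors the unwind ladder: last acquired, first freed. */
static void demo_remove(struct demo_dev *d)
{
	free(d->rings);
	free(d->mmio);
	free(d->regions);
	d->rings = d->mmio = d->regions = NULL;
}
```

Because every label falls through to the ones below it, adding a new acquisition step only requires one new label at the top of the ladder.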
diff --git a/drivers/net/igbvf/regs.h b/drivers/net/igbvf/regs.h
new file mode 100644
index 0000000..b9e24ed
--- /dev/null
+++ b/drivers/net/igbvf/regs.h
@@ -0,0 +1,108 @@
+/*******************************************************************************
+
+  Intel(R) 82576 Virtual Function Linux driver
+  Copyright(c) 2009 Intel Corporation.
+
+  This program is free software; you can redistribute it and/or modify it
+  under the terms and conditions of the GNU General Public License,
+  version 2, as published by the Free Software Foundation.
+
+  This program is distributed in the hope it will be useful, but WITHOUT
+  ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+  FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
+  more details.
+
+  You should have received a copy of the GNU General Public License along with
+  this program; if not, write to the Free Software Foundation, Inc.,
+  51 Franklin St - Fifth Floor, Boston, MA 02110-1301 USA.
+
+  The full GNU General Public License is included in this distribution in
+  the file called "COPYING".
+
+  Contact Information:
+  e1000-devel Mailing List <e1000-devel@lists.sourceforge.net>
+  Intel Corporation, 5200 N.E. Elam Young Parkway, Hillsboro, OR 97124-6497
+
+*******************************************************************************/
+
+#ifndef _E1000_REGS_H_
+#define _E1000_REGS_H_
+
+#define E1000_CTRL      0x00000 /* Device Control - RW */
+#define E1000_STATUS    0x00008 /* Device Status - RO */
+#define E1000_ITR       0x000C4 /* Interrupt Throttling Rate - RW */
+#define E1000_EICR      0x01580 /* Ext. Interrupt Cause Read - R/clr */
+#define E1000_EITR(_n)  (0x01680 + (0x4 * (_n)))
+#define E1000_EICS      0x01520 /* Ext. Interrupt Cause Set - WO */
+#define E1000_EIMS      0x01524 /* Ext. Interrupt Mask Set/Read - RW */
+#define E1000_EIMC      0x01528 /* Ext. Interrupt Mask Clear - WO */
+#define E1000_EIAC      0x0152C /* Ext. Interrupt Auto Clear - RW */
+#define E1000_EIAM      0x01530 /* Ext. Interrupt Ack Auto Clear Mask - RW */
+#define E1000_IVAR0     0x01700 /* Interrupt Vector Allocation (array) - RW */
+#define E1000_IVAR_MISC 0x01740 /* IVAR for "other" causes - RW */
+/*
+ * Convenience macros
+ *
+ * Note: "_n" is the queue number of the register to be written to.
+ *
+ * Example usage:
+ * E1000_RDBAL(current_rx_queue)
+ */
+#define E1000_RDBAL(_n)      ((_n) < 4 ? (0x02800 + ((_n) * 0x100)) : \
+                                         (0x0C000 + ((_n) * 0x40)))
+#define E1000_RDBAH(_n)      ((_n) < 4 ? (0x02804 + ((_n) * 0x100)) : \
+                                         (0x0C004 + ((_n) * 0x40)))
+#define E1000_RDLEN(_n)      ((_n) < 4 ? (0x02808 + ((_n) * 0x100)) : \
+                                         (0x0C008 + ((_n) * 0x40)))
+#define E1000_SRRCTL(_n)     ((_n) < 4 ? (0x0280C + ((_n) * 0x100)) : \
+                                         (0x0C00C + ((_n) * 0x40)))
+#define E1000_RDH(_n)        ((_n) < 4 ? (0x02810 + ((_n) * 0x100)) : \
+                                         (0x0C010 + ((_n) * 0x40)))
+#define E1000_RDT(_n)        ((_n) < 4 ? (0x02818 + ((_n) * 0x100)) : \
+                                         (0x0C018 + ((_n) * 0x40)))
+#define E1000_RXDCTL(_n)     ((_n) < 4 ? (0x02828 + ((_n) * 0x100)) : \
+                                         (0x0C028 + ((_n) * 0x40)))
+#define E1000_TDBAL(_n)      ((_n) < 4 ? (0x03800 + ((_n) * 0x100)) : \
+                                         (0x0E000 + ((_n) * 0x40)))
+#define E1000_TDBAH(_n)      ((_n) < 4 ? (0x03804 + ((_n) * 0x100)) : \
+                                         (0x0E004 + ((_n) * 0x40)))
+#define E1000_TDLEN(_n)      ((_n) < 4 ? (0x03808 + ((_n) * 0x100)) : \
+                                         (0x0E008 + ((_n) * 0x40)))
+#define E1000_TDH(_n)        ((_n) < 4 ? (0x03810 + ((_n) * 0x100)) : \
+                                         (0x0E010 + ((_n) * 0x40)))
+#define E1000_TDT(_n)        ((_n) < 4 ? (0x03818 + ((_n) * 0x100)) : \
+                                         (0x0E018 + ((_n) * 0x40)))
+#define E1000_TXDCTL(_n)     ((_n) < 4 ? (0x03828 + ((_n) * 0x100)) : \
+                                         (0x0E028 + ((_n) * 0x40)))
+#define E1000_DCA_TXCTRL(_n) (0x03814 + ((_n) << 8))
+#define E1000_DCA_RXCTRL(_n) (0x02814 + ((_n) << 8))
+#define E1000_RAL(_i)  (((_i) <= 15) ? (0x05400 + ((_i) * 8)) : \
+                                       (0x054E0 + (((_i) - 16) * 8)))
+#define E1000_RAH(_i)  (((_i) <= 15) ? (0x05404 + ((_i) * 8)) : \
+                                       (0x054E4 + (((_i) - 16) * 8)))
+
+/* Statistics registers */
+#define E1000_VFGPRC    0x00F10
+#define E1000_VFGORC    0x00F18
+#define E1000_VFMPRC    0x00F3C
+#define E1000_VFGPTC    0x00F14
+#define E1000_VFGOTC    0x00F34
+#define E1000_VFGOTLBC  0x00F50
+#define E1000_VFGPTLBC  0x00F44
+#define E1000_VFGORLBC  0x00F48
+#define E1000_VFGPRLBC  0x00F40
+
+/* These act per VF so an array friendly macro is used */
+#define E1000_V2PMAILBOX(_n)   (0x00C40 + (4 * (_n)))
+#define E1000_VMBMEM(_n)       (0x00800 + (64 * (_n)))
+
+/* Define macros for handling registers */
+#define er32(reg) readl(hw->hw_addr + E1000_##reg)
+#define ew32(reg, val) writel((val), hw->hw_addr + E1000_##reg)
+#define array_er32(reg, offset) \
+       readl(hw->hw_addr + E1000_##reg + ((offset) << 2))
+#define array_ew32(reg, offset, val) \
+       writel((val), hw->hw_addr + E1000_##reg + ((offset) << 2))
+#define e1e_flush() er32(STATUS)
+
+#endif
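Note on the per-queue register macros in regs.h above: queues 0 to 3 sit in a legacy block with a 0x100 stride, while queues 4 and up sit in a second block with a 0x40 stride. The sketch below copies three of the offset formulas under hypothetical DEMO_ names, so that it compiles outside the kernel, and wraps them in small helper functions. Only the offsets come from the header; the helper names are assumptions made for illustration.

```c
#include <stdint.h>

/* Offset formulas copied from regs.h, renamed with a DEMO_ prefix. */
#define DEMO_RDBAL(_n)      ((_n) < 4 ? (0x02800 + ((_n) * 0x100)) : \
                                        (0x0C000 + ((_n) * 0x40)))
#define DEMO_EITR(_n)       (0x01680 + (0x4 * (_n)))
/* The argument is parenthesized so expressions such as (q & 1) expand safely. */
#define DEMO_DCA_RXCTRL(_n) (0x02814 + ((_n) << 8))

/* Byte offset of the RX descriptor base (low dword) for a given queue. */
static uint32_t demo_rdbal_offset(unsigned int queue)
{
	return (uint32_t)DEMO_RDBAL(queue);
}

/* Byte offset of the interrupt throttle register for a given vector. */
static uint32_t demo_eitr_offset(unsigned int vector)
{
	return (uint32_t)DEMO_EITR(vector);
}

/* Byte offset of the DCA RX control register for the low bit of a queue id. */
static uint32_t demo_dca_rxctrl_offset(unsigned int queue)
{
	return (uint32_t)DEMO_DCA_RXCTRL(queue & 1);
}
```

With these definitions, queue 3 resolves to 0x02B00 and queue 4 to 0x0C100, which shows the jump between the two register blocks.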
diff --git a/drivers/net/igbvf/vf.c b/drivers/net/igbvf/vf.c
new file mode 100644
index 0000000..aa246c9
--- /dev/null
+++ b/drivers/net/igbvf/vf.c
@@ -0,0 +1,398 @@
+/*******************************************************************************
+
+  Intel(R) 82576 Virtual Function Linux driver
+  Copyright(c) 2009 Intel Corporation.
+
+  This program is free software; you can redistribute it and/or modify it
+  under the terms and conditions of the GNU General Public License,
+  version 2, as published by the Free Software Foundation.
+
+  This program is distributed in the hope it will be useful, but WITHOUT
+  ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+  FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
+  more details.
+
+  You should have received a copy of the GNU General Public License along with
+  this program; if not, write to the Free Software Foundation, Inc.,
+  51 Franklin St - Fifth Floor, Boston, MA 02110-1301 USA.
+
+  The full GNU General Public License is included in this distribution in
+  the file called "COPYING".
+
+  Contact Information:
+  e1000-devel Mailing List <e1000-devel@lists.sourceforge.net>
+  Intel Corporation, 5200 N.E. Elam Young Parkway, Hillsboro, OR 97124-6497
+
+*******************************************************************************/
+
+
+#include "vf.h"
+
+static s32 e1000_check_for_link_vf(struct e1000_hw *hw);
+static s32 e1000_get_link_up_info_vf(struct e1000_hw *hw, u16 *speed,
+                                     u16 *duplex);
+static s32 e1000_init_hw_vf(struct e1000_hw *hw);
+static s32 e1000_reset_hw_vf(struct e1000_hw *hw);
+
+static void e1000_update_mc_addr_list_vf(struct e1000_hw *hw, u8 *,
+                                         u32, u32, u32);
+static void e1000_rar_set_vf(struct e1000_hw *, u8 *, u32);
+static s32 e1000_read_mac_addr_vf(struct e1000_hw *);
+static s32 e1000_set_vfta_vf(struct e1000_hw *, u16, bool);
+
+/**
+ *  e1000_init_mac_params_vf - Inits MAC params
+ *  @hw: pointer to the HW structure
+ **/
+s32 e1000_init_mac_params_vf(struct e1000_hw *hw)
+{
+       struct e1000_mac_info *mac = &hw->mac;
+
+       /* VFs have no MTA registers (PF only); count sizes the MC hash */
+       mac->mta_reg_count = 128;
+       /* VFs have no access to RAR entries */
+       mac->rar_entry_count = 1;
+
+       /* Function pointers */
+       /* reset */
+       mac->ops.reset_hw = e1000_reset_hw_vf;
+       /* hw initialization */
+       mac->ops.init_hw = e1000_init_hw_vf;
+       /* check for link */
+       mac->ops.check_for_link = e1000_check_for_link_vf;
+       /* link info */
+       mac->ops.get_link_up_info = e1000_get_link_up_info_vf;
+       /* multicast address update */
+       mac->ops.update_mc_addr_list = e1000_update_mc_addr_list_vf;
+       /* set mac address */
+       mac->ops.rar_set = e1000_rar_set_vf;
+       /* read mac address */
+       mac->ops.read_mac_addr = e1000_read_mac_addr_vf;
+       /* set vlan filter table array */
+       mac->ops.set_vfta = e1000_set_vfta_vf;
+
+       return E1000_SUCCESS;
+}
+
+/**
+ *  e1000_init_function_pointers_vf - Inits function pointers
+ *  @hw: pointer to the HW structure
+ **/
+void e1000_init_function_pointers_vf(struct e1000_hw *hw)
+{
+       hw->mac.ops.init_params = e1000_init_mac_params_vf;
+       hw->mbx.ops.init_params = e1000_init_mbx_params_vf;
+}
+
+/**
+ *  e1000_get_link_up_info_vf - Gets link info.
+ *  @hw: pointer to the HW structure
+ *  @speed: pointer to 16 bit value to store link speed.
+ *  @duplex: pointer to 16 bit value to store duplex.
+ *
+ *  Since we cannot read the PHY and get accurate link info, we must rely upon
+ *  the status register's data which is often stale and inaccurate.
+ **/
+static s32 e1000_get_link_up_info_vf(struct e1000_hw *hw, u16 *speed,
+                                     u16 *duplex)
+{
+       s32 status;
+
+       status = er32(STATUS);
+       if (status & E1000_STATUS_SPEED_1000)
+               *speed = SPEED_1000;
+       else if (status & E1000_STATUS_SPEED_100)
+               *speed = SPEED_100;
+       else
+               *speed = SPEED_10;
+
+       if (status & E1000_STATUS_FD)
+               *duplex = FULL_DUPLEX;
+       else
+               *duplex = HALF_DUPLEX;
+
+       return E1000_SUCCESS;
+}
+
+/**
+ *  e1000_reset_hw_vf - Resets the HW
+ *  @hw: pointer to the HW structure
+ *
+ *  VFs provide a function level reset. This is done using bit 26 of ctrl_reg.
+ *  This is all the reset we can perform on a VF.
+ **/
+static s32 e1000_reset_hw_vf(struct e1000_hw *hw)
+{
+       struct e1000_mbx_info *mbx = &hw->mbx;
+       u32 timeout = E1000_VF_INIT_TIMEOUT;
+       s32 ret_val = -E1000_ERR_MAC_INIT;
+       u32 msgbuf[3];
+       u8 *addr = (u8 *)(&msgbuf[1]);
+       u32 ctrl;
+
+       /* assert vf queue/interrupt reset */
+       ctrl = er32(CTRL);
+       ew32(CTRL, ctrl | E1000_CTRL_RST);
+
+       /* we cannot initialize while the RSTI / RSTD bits are asserted */
+       while (!mbx->ops.check_for_rst(hw) && timeout) {
+               timeout--;
+               udelay(5);
+       }
+
+       if (timeout) {
+               /* mailbox timeout can now become active */
+               mbx->timeout = E1000_VF_MBX_INIT_TIMEOUT;
+
+               /* notify pf of vf reset completion */
+               msgbuf[0] = E1000_VF_RESET;
+               mbx->ops.write_posted(hw, msgbuf, 1);
+
+               msleep(10);
+
+               /* set our "perm_addr" based on info provided by PF */
+               ret_val = mbx->ops.read_posted(hw, msgbuf, 3);
+               if (!ret_val) {
+                       if (msgbuf[0] == (E1000_VF_RESET | E1000_VT_MSGTYPE_ACK))
+                               memcpy(hw->mac.perm_addr, addr, 6);
+                       else
+                               ret_val = -E1000_ERR_MAC_INIT;
+               }
+       }
+
+       return ret_val;
+}
+
+/**
+ *  e1000_init_hw_vf - Inits the HW
+ *  @hw: pointer to the HW structure
+ *
+ *  Not much to do here except clear the PF Reset indication if there is one.
+ **/
+static s32 e1000_init_hw_vf(struct e1000_hw *hw)
+{
+       /* attempt to set and restore our mac address */
+       e1000_rar_set_vf(hw, hw->mac.addr, 0);
+
+       return E1000_SUCCESS;
+}
+
+/**
+ *  e1000_hash_mc_addr_vf - Generate a multicast hash value
+ *  @hw: pointer to the HW structure
+ *  @mc_addr: pointer to a multicast address
+ *
+ *  Generates a multicast address hash value which is used to determine
+ *  the multicast filter table array address and new table value.  See
+ *  e1000_mta_set_generic()
+ **/
+static u32 e1000_hash_mc_addr_vf(struct e1000_hw *hw, u8 *mc_addr)
+{
+       u32 hash_value, hash_mask;
+       u8 bit_shift = 0;
+
+       /* Register count multiplied by bits per register */
+       hash_mask = (hw->mac.mta_reg_count * 32) - 1;
+
+       /*
+        * The bit_shift is the number of left-shifts
+        * where 0xFF would still fall within the hash mask.
+        */
+       while (hash_mask >> bit_shift != 0xFF)
+               bit_shift++;
+
+       hash_value = hash_mask & (((mc_addr[4] >> (8 - bit_shift)) |
+                                 (((u16) mc_addr[5]) << bit_shift)));
+
+       return hash_value;
+}
+
+/**
+ *  e1000_update_mc_addr_list_vf - Update Multicast addresses
+ *  @hw: pointer to the HW structure
+ *  @mc_addr_list: array of multicast addresses to program
+ *  @mc_addr_count: number of multicast addresses to program
+ *  @rar_used_count: the first RAR register free to program
+ *  @rar_count: total number of supported Receive Address Registers
+ *
+ *  Updates the Receive Address Registers and Multicast Table Array.
+ *  The caller must have a packed mc_addr_list of multicast addresses.
+ *  The parameter rar_count will usually be hw->mac.rar_entry_count
+ *  unless there are workarounds that change this.
+ **/
+void e1000_update_mc_addr_list_vf(struct e1000_hw *hw,
+                                  u8 *mc_addr_list, u32 mc_addr_count,
+                                  u32 rar_used_count, u32 rar_count)
+{
+       struct e1000_mbx_info *mbx = &hw->mbx;
+       u32 msgbuf[E1000_VFMAILBOX_SIZE];
+       u16 *hash_list = (u16 *)&msgbuf[1];
+       u32 hash_value;
+       u32 cnt, i;
+
+       /* Each entry in the list uses one 16-bit word.  We have 30 16-bit
+        * words available in our HW msg buffer (after the 32-bit msg type
+        * word).  That's 30 hash values if we pack them right.  If there
+        * are more than 30 MC addresses to add, the extras are dropped for
+        * now; code to handle more than 30 can be added later.  It would be
+        * unusual for a server to request that many multicast addresses
+        * except in large enterprise network environments.
+        */
+
+       cnt = (mc_addr_count > 30) ? 30 : mc_addr_count;
+       msgbuf[0] = E1000_VF_SET_MULTICAST;
+       msgbuf[0] |= cnt << E1000_VT_MSGINFO_SHIFT;
+
+       for (i = 0; i < cnt; i++) {
+               hash_value = e1000_hash_mc_addr_vf(hw, mc_addr_list);
+               hash_list[i] = hash_value & 0x0FFFF;
+               mc_addr_list += ETH_ADDR_LEN;
+       }
+
+       mbx->ops.write_posted(hw, msgbuf, E1000_VFMAILBOX_SIZE);
+}
+
+/**
+ *  e1000_set_vfta_vf - Set/Unset vlan filter table address
+ *  @hw: pointer to the HW structure
+ *  @vid: determines the vfta register and bit to set/unset
+ *  @set: if true then set bit, else clear bit
+ **/
+static s32 e1000_set_vfta_vf(struct e1000_hw *hw, u16 vid, bool set)
+{
+       struct e1000_mbx_info *mbx = &hw->mbx;
+       u32 msgbuf[2];
+       s32 err;
+
+       msgbuf[0] = E1000_VF_SET_VLAN;
+       msgbuf[1] = vid;
+       /* Setting the 8 bit field MSG INFO to true indicates "add" */
+       if (set)
+               msgbuf[0] |= 1 << E1000_VT_MSGINFO_SHIFT;
+
+       mbx->ops.write_posted(hw, msgbuf, 2);
+
+       err = mbx->ops.read_posted(hw, msgbuf, 2);
+
+       /* if nacked the vlan was rejected */
+       if (!err && (msgbuf[0] == (E1000_VF_SET_VLAN | E1000_VT_MSGTYPE_NACK)))
+               err = -E1000_ERR_MAC_INIT;
+
+       return err;
+}
+
+/** e1000_rlpml_set_vf - Set the maximum receive packet length
+ *  @hw: pointer to the HW structure
+ *  @max_size: value to assign to max frame size
+ **/
+void e1000_rlpml_set_vf(struct e1000_hw *hw, u16 max_size)
+{
+       struct e1000_mbx_info *mbx = &hw->mbx;
+       u32 msgbuf[2];
+
+       msgbuf[0] = E1000_VF_SET_LPE;
+       msgbuf[1] = max_size;
+
+       mbx->ops.write_posted(hw, msgbuf, 2);
+}
+
+/**
+ *  e1000_rar_set_vf - set device MAC address
+ *  @hw: pointer to the HW structure
+ *  @addr: pointer to the receive address
+ *  @index: receive address array register
+ **/
+static void e1000_rar_set_vf(struct e1000_hw *hw, u8 *addr, u32 index)
+{
+       struct e1000_mbx_info *mbx = &hw->mbx;
+       u32 msgbuf[3];
+       u8 *msg_addr = (u8 *)(&msgbuf[1]);
+       s32 ret_val;
+
+       memset(msgbuf, 0, sizeof(msgbuf));
+       msgbuf[0] = E1000_VF_SET_MAC_ADDR;
+       memcpy(msg_addr, addr, 6);
+       ret_val = mbx->ops.write_posted(hw, msgbuf, 3);
+
+       if (!ret_val)
+               ret_val = mbx->ops.read_posted(hw, msgbuf, 3);
+
+       /* if nacked the address was rejected, use "perm_addr" */
+       if (!ret_val &&
+           (msgbuf[0] == (E1000_VF_SET_MAC_ADDR | E1000_VT_MSGTYPE_NACK)))
+               e1000_read_mac_addr_vf(hw);
+}
+
+/**
+ *  e1000_read_mac_addr_vf - Read device MAC address
+ *  @hw: pointer to the HW structure
+ **/
+static s32 e1000_read_mac_addr_vf(struct e1000_hw *hw)
+{
+       int i;
+
+       for (i = 0; i < ETH_ADDR_LEN; i++)
+               hw->mac.addr[i] = hw->mac.perm_addr[i];
+
+       return E1000_SUCCESS;
+}
+
+/**
+ *  e1000_check_for_link_vf - Check for link for a virtual interface
+ *  @hw: pointer to the HW structure
+ *
+ *  Checks to see if the underlying PF is still talking to the VF and
+ *  if it is then it reports the link state to the hardware, otherwise
+ *  it reports link down and returns an error.
+ **/
+static s32 e1000_check_for_link_vf(struct e1000_hw *hw)
+{
+       struct e1000_mbx_info *mbx = &hw->mbx;
+       struct e1000_mac_info *mac = &hw->mac;
+       s32 ret_val = E1000_SUCCESS;
+       u32 in_msg = 0;
+
+       /*
+        * We only want to run this if there has been a rst asserted.
+        * In this case that could mean a link change, device reset,
+        * or a virtual function reset
+        */
+
+       /* If we were hit with a reset drop the link */
+       if (!mbx->ops.check_for_rst(hw))
+               mac->get_link_status = true;
+
+       if (!mac->get_link_status)
+               goto out;
+
+       /* if link status is down no point in checking to see if pf is up */
+       if (!(er32(STATUS) & E1000_STATUS_LU))
+               goto out;
+
+       /* if the read failed it could just be a mailbox collision, best wait
+        * until we are called again and don't report an error */
+       if (mbx->ops.read(hw, &in_msg, 1))
+               goto out;
+
+       /* if incoming message isn't clear to send we are waiting on response */
+       if (!(in_msg & E1000_VT_MSGTYPE_CTS)) {
+               /* message is not CTS and is NACK, so we lost CTS status */
+               if (in_msg & E1000_VT_MSGTYPE_NACK)
+                       ret_val = -E1000_ERR_MAC_INIT;
+               goto out;
+       }
+
+       /* the pf is talking, if we timed out in the past we reinit */
+       if (!mbx->timeout) {
+               ret_val = -E1000_ERR_MAC_INIT;
+               goto out;
+       }
+
+       /* if we passed all the tests above then the link is up and we no
+        * longer need to check for link */
+       mac->get_link_status = false;
+
+out:
+       return ret_val;
+}
+
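Note on the multicast hash in vf.c above: with mta_reg_count set to 128, the hash mask is 128 * 32 - 1 = 0xFFF and the loop settles on a bit shift of 4, so the hash is built from the top four bits of byte 4 and all of byte 5 of the address. The sketch below restates e1000_hash_mc_addr_vf() and the 30-entry packing loop in portable C so the arithmetic can be checked in user space. The demo_ names and the range check on the register count are assumptions added for illustration; they are not part of the driver.

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_ETH_ADDR_LEN 6
#define DEMO_MAX_HASHES   30	/* cap used by the update routine above */

/*
 * Same arithmetic as e1000_hash_mc_addr_vf(). The assert is an added guard:
 * the shift search only terminates safely for a power-of-two register count
 * between 8 and 2048.
 */
static uint32_t demo_hash_mc_addr(uint16_t mta_reg_count, const uint8_t *mc_addr)
{
	uint32_t hash_mask;
	uint8_t bit_shift = 0;

	assert(mta_reg_count >= 8 && mta_reg_count <= 2048 &&
	       (mta_reg_count & (mta_reg_count - 1)) == 0);

	/* Register count multiplied by bits per register */
	hash_mask = ((uint32_t)mta_reg_count * 32) - 1;

	/* Find the shift at which 0xFF still fits inside the mask */
	while ((hash_mask >> bit_shift) != 0xFF)
		bit_shift++;

	return hash_mask & ((uint32_t)(mc_addr[4] >> (8 - bit_shift)) |
			    ((uint32_t)mc_addr[5] << bit_shift));
}

/* Pack up to 30 hashes, one per 16-bit slot; returns how many were stored. */
static unsigned int demo_build_hash_list(const uint8_t *mc_addr_list,
					 unsigned int mc_addr_count,
					 uint16_t *hash_list)
{
	unsigned int cnt = mc_addr_count > DEMO_MAX_HASHES ?
			   DEMO_MAX_HASHES : mc_addr_count;
	unsigned int i;

	for (i = 0; i < cnt; i++) {
		hash_list[i] = demo_hash_mc_addr(128, mc_addr_list) & 0xFFFF;
		mc_addr_list += DEMO_ETH_ADDR_LEN;
	}

	return cnt;
}
```

For the address 01:00:5e:00:00:01 this yields 0x010, since byte 4 contributes nothing and byte 5 is shifted left by four bits.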
diff --git a/drivers/net/igbvf/vf.h b/drivers/net/igbvf/vf.h
new file mode 100644
index 0000000..ec07228
--- /dev/null
+++ b/drivers/net/igbvf/vf.h
@@ -0,0 +1,265 @@
+/*******************************************************************************
+
+  Intel(R) 82576 Virtual Function Linux driver
+  Copyright(c) 2009 Intel Corporation.
+
+  This program is free software; you can redistribute it and/or modify it
+  under the terms and conditions of the GNU General Public License,
+  version 2, as published by the Free Software Foundation.
+
+  This program is distributed in the hope it will be useful, but WITHOUT
+  ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+  FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
+  more details.
+
+  You should have received a copy of the GNU General Public License along with
+  this program; if not, write to the Free Software Foundation, Inc.,
+  51 Franklin St - Fifth Floor, Boston, MA 02110-1301 USA.
+
+  The full GNU General Public License is included in this distribution in
+  the file called "COPYING".
+
+  Contact Information:
+  e1000-devel Mailing List <e1000-devel@lists.sourceforge.net>
+  Intel Corporation, 5200 N.E. Elam Young Parkway, Hillsboro, OR 97124-6497
+
+*******************************************************************************/
+
+#ifndef _E1000_VF_H_
+#define _E1000_VF_H_
+
+#include <linux/pci.h>
+#include <linux/delay.h>
+#include <linux/interrupt.h>
+#include <linux/if_ether.h>
+
+#include "regs.h"
+#include "defines.h"
+
+struct e1000_hw;
+
+#define E1000_DEV_ID_82576_VF                 0x10CA
+#define E1000_REVISION_0 0
+#define E1000_REVISION_1 1
+#define E1000_REVISION_2 2
+#define E1000_REVISION_3 3
+#define E1000_REVISION_4 4
+
+#define E1000_FUNC_0     0
+#define E1000_FUNC_1     1
+
+/*
+ * Receive Address Register Count
+ * Number of high/low register pairs in the RAR.  The RAR (Receive Address
+ * Registers) holds the directed and multicast addresses that we monitor.
+ * These entries are also used for MAC-based filtering.
+ */
+#define E1000_RAR_ENTRIES_VF      1
+
+/* Receive Descriptor - Advanced */
+union e1000_adv_rx_desc {
+       struct {
+               u64 pkt_addr;             /* Packet buffer address */
+               u64 hdr_addr;             /* Header buffer address */
+       } read;
+       struct {
+               struct {
+                       union {
+                               u32 data;
+                               struct {
+                                       u16 pkt_info; /* RSS/Packet type */
+                                       u16 hdr_info; /* Split Header,
+                                                      * hdr buffer length */
+                               } hs_rss;
+                       } lo_dword;
+                       union {
+                               u32 rss;          /* RSS Hash */
+                               struct {
+                                       u16 ip_id;    /* IP id */
+                                       u16 csum;     /* Packet Checksum */
+                               } csum_ip;
+                       } hi_dword;
+               } lower;
+               struct {
+                       u32 status_error;     /* ext status/error */
+                       u16 length;           /* Packet length */
+                       u16 vlan;             /* VLAN tag */
+               } upper;
+       } wb;  /* writeback */
+};
+
+#define E1000_RXDADV_HDRBUFLEN_MASK      0x7FE0
+#define E1000_RXDADV_HDRBUFLEN_SHIFT     5
+
+/* Transmit Descriptor - Advanced */
+union e1000_adv_tx_desc {
+       struct {
+               u64 buffer_addr;    /* Address of descriptor's data buf */
+               u32 cmd_type_len;
+               u32 olinfo_status;
+       } read;
+       struct {
+               u64 rsvd;       /* Reserved */
+               u32 nxtseq_seed;
+               u32 status;
+       } wb;
+};
+
+/* Adv Transmit Descriptor Config Masks */
+#define E1000_ADVTXD_DTYP_CTXT    0x00200000 /* Advanced Context Descriptor */
+#define E1000_ADVTXD_DTYP_DATA    0x00300000 /* Advanced Data Descriptor */
+#define E1000_ADVTXD_DCMD_EOP     0x01000000 /* End of Packet */
+#define E1000_ADVTXD_DCMD_IFCS    0x02000000 /* Insert FCS (Ethernet CRC) */
+#define E1000_ADVTXD_DCMD_RS      0x08000000 /* Report Status */
+#define E1000_ADVTXD_DCMD_DEXT    0x20000000 /* Descriptor extension (1=Adv) */
+#define E1000_ADVTXD_DCMD_VLE     0x40000000 /* VLAN pkt enable */
+#define E1000_ADVTXD_DCMD_TSE     0x80000000 /* TCP Seg enable */
+#define E1000_ADVTXD_PAYLEN_SHIFT    14 /* Adv desc PAYLEN shift */
+
+/* Context descriptors */
+struct e1000_adv_tx_context_desc {
+       u32 vlan_macip_lens;
+       u32 seqnum_seed;
+       u32 type_tucmd_mlhl;
+       u32 mss_l4len_idx;
+};
+
+#define E1000_ADVTXD_MACLEN_SHIFT    9  /* Adv ctxt desc mac len shift */
+#define E1000_ADVTXD_TUCMD_IPV4    0x00000400  /* IP Packet Type: 1=IPv4 */
+#define E1000_ADVTXD_TUCMD_L4T_TCP 0x00000800  /* L4 Packet TYPE of TCP */
+#define E1000_ADVTXD_L4LEN_SHIFT     8  /* Adv ctxt L4LEN shift */
+#define E1000_ADVTXD_MSS_SHIFT      16  /* Adv ctxt MSS shift */
+
+enum e1000_mac_type {
+       e1000_undefined = 0,
+       e1000_vfadapt,
+       e1000_num_macs  /* List is 1-based, so subtract 1 for true count. */
+};
+
+struct e1000_vf_stats {
+       u64 base_gprc;
+       u64 base_gptc;
+       u64 base_gorc;
+       u64 base_gotc;
+       u64 base_mprc;
+       u64 base_gotlbc;
+       u64 base_gptlbc;
+       u64 base_gorlbc;
+       u64 base_gprlbc;
+
+       u32 last_gprc;
+       u32 last_gptc;
+       u32 last_gorc;
+       u32 last_gotc;
+       u32 last_mprc;
+       u32 last_gotlbc;
+       u32 last_gptlbc;
+       u32 last_gorlbc;
+       u32 last_gprlbc;
+
+       u64 gprc;
+       u64 gptc;
+       u64 gorc;
+       u64 gotc;
+       u64 mprc;
+       u64 gotlbc;
+       u64 gptlbc;
+       u64 gorlbc;
+       u64 gprlbc;
+};
+
+#include "mbx.h"
+
+struct e1000_mac_operations {
+       /* Function pointers for the MAC. */
+       s32  (*init_params)(struct e1000_hw *);
+       s32  (*check_for_link)(struct e1000_hw *);
+       void (*clear_vfta)(struct e1000_hw *);
+       s32  (*get_bus_info)(struct e1000_hw *);
+       s32  (*get_link_up_info)(struct e1000_hw *, u16 *, u16 *);
+       void (*update_mc_addr_list)(struct e1000_hw *, u8 *, u32, u32, u32);
+       s32  (*reset_hw)(struct e1000_hw *);
+       s32  (*init_hw)(struct e1000_hw *);
+       s32  (*setup_link)(struct e1000_hw *);
+       void (*write_vfta)(struct e1000_hw *, u32, u32);
+       void (*mta_set)(struct e1000_hw *, u32);
+       void (*rar_set)(struct e1000_hw *, u8*, u32);
+       s32  (*read_mac_addr)(struct e1000_hw *);
+       s32  (*set_vfta)(struct e1000_hw *, u16, bool);
+};
+
+struct e1000_mac_info {
+       struct e1000_mac_operations ops;
+       u8 addr[6];
+       u8 perm_addr[6];
+
+       enum e1000_mac_type type;
+
+       u16 mta_reg_count;
+       u16 rar_entry_count;
+
+       bool get_link_status;
+};
+
+struct e1000_mbx_operations {
+       s32 (*init_params)(struct e1000_hw *hw);
+       s32 (*read)(struct e1000_hw *, u32 *, u16);
+       s32 (*write)(struct e1000_hw *, u32 *, u16);
+       s32 (*read_posted)(struct e1000_hw *, u32 *, u16);
+       s32 (*write_posted)(struct e1000_hw *, u32 *, u16);
+       s32 (*check_for_msg)(struct e1000_hw *);
+       s32 (*check_for_ack)(struct e1000_hw *);
+       s32 (*check_for_rst)(struct e1000_hw *);
+};
+
+struct e1000_mbx_stats {
+       u32 msgs_tx;
+       u32 msgs_rx;
+
+       u32 acks;
+       u32 reqs;
+       u32 rsts;
+};
+
+struct e1000_mbx_info {
+       struct e1000_mbx_operations ops;
+       struct e1000_mbx_stats stats;
+       u32 timeout;
+       u32 usec_delay;
+       u16 size;
+};
+
+struct e1000_dev_spec_vf {
+       u32 vf_number;
+       u32 v2p_mailbox;
+};
+
+struct e1000_hw {
+       void *back;
+
+       u8 __iomem *hw_addr;
+       u8 __iomem *flash_address;
+       unsigned long io_base;
+
+       struct e1000_mac_info  mac;
+       struct e1000_mbx_info mbx;
+
+       union {
+               struct e1000_dev_spec_vf vf;
+       } dev_spec;
+
+       u16 device_id;
+       u16 subsystem_vendor_id;
+       u16 subsystem_device_id;
+       u16 vendor_id;
+
+       u8  revision_id;
+};
+
+/* These functions must be implemented by drivers */
+void e1000_rlpml_set_vf(struct e1000_hw *, u16);
+void e1000_init_function_pointers_vf(struct e1000_hw *hw);
+s32  e1000_init_mac_params_vf(struct e1000_hw *hw);
+
+
+#endif /* _E1000_VF_H_ */
diff --git a/drivers/net/mv643xx_eth.c b/drivers/net/mv643xx_eth.c

index a56d9d2df73f3b108427a8ee828cb1afb0cb782d..b3185bf2c158b41b02ad4efd6b862775c9a37637 100644
--- a/drivers/net/mv643xx_eth.c
+++ b/drivers/net/mv643xx_eth.c
@@ -2274,8 +2274,6 @@ static void port_start(struct mv643xx_eth_private *mp)
                 pscr |= FORCE_LINK_PASS;
         wrlp(mp, PORT_SERIAL_CONTROL, pscr);
  
-       wrlp(mp, SDMA_CONFIG, PORT_SDMA_CONFIG_DEFAULT_VALUE);
-
         /*
          * Configure TX path and queues.
          */
@@ -2957,6 +2955,8 @@ static int mv643xx_eth_probe(struct platform_device *pdev)
  
         netif_carrier_off(dev);
  
+       wrlp(mp, SDMA_CONFIG, PORT_SDMA_CONFIG_DEFAULT_VALUE);
+
         set_rx_coal(mp, 250);
         set_tx_coal(mp, 0);
  
diff --git a/drivers/net/niu.c b/drivers/net/niu.c

index 73cac6c78cb6f97131ec0a52ec56340a93f56594..2b1745328cf78a90aa20cd05d1fb1a74598544ff 100644
--- a/drivers/net/niu.c
+++ b/drivers/net/niu.c
@@ -4834,6 +4834,7 @@ static int niu_compute_rbr_cfig_b(struct rx_ring_info *rp, u64 *ret)
  {
         u64 val = 0;
  
+       *ret = 0;
         switch (rp->rbr_block_size) {
         case 4 * 1024:
                 val |= (RBR_BLKSIZE_4K << RBR_CFIG_B_BLKSIZE_SHIFT);
@@ -9542,7 +9543,7 @@ static struct niu_parent * __devinit niu_new_parent(struct niu *np,
  
         plat_dev = platform_device_register_simple("niu", niu_parent_index,
                                                    NULL, 0);
-       if (!plat_dev)
+       if (IS_ERR(plat_dev))
                 return NULL;
  
         for (i = 0; attr_name(niu_parent_attributes[i]); i++) {
diff --git a/drivers/net/r6040.c b/drivers/net/r6040.c

index 5e8540b6ffa11d7184cbcf1e5ee11f88f1d57043..6f97b47d74a6931c83294644640d724fd21207de 100644
--- a/drivers/net/r6040.c
+++ b/drivers/net/r6040.c
@@ -160,6 +160,7 @@ MODULE_AUTHOR("Sten Wang <sten.wang@rdc.com.tw>,"
         "Florian Fainelli <florian@openwrt.org>");
  MODULE_LICENSE("GPL");
  MODULE_DESCRIPTION("RDC R6040 NAPI PCI FastEthernet driver");
+MODULE_VERSION(DRV_VERSION " " DRV_RELDATE);
  
  /* RX and TX interrupts that we handle */
  #define RX_INTS                        (RX_FIFO_FULL | RX_NO_DESC | RX_FINISH)
diff --git a/drivers/net/smsc911x.c b/drivers/net/smsc911x.c

index 6da678129828c1bd764a6b49a1dc77f038090e13..eb7db032a780eac55990782c69fc25dfac0cf8ee 100644
--- a/drivers/net/smsc911x.c
+++ b/drivers/net/smsc911x.c
@@ -317,7 +317,7 @@ static int smsc911x_mii_read(struct mii_bus *bus, int phyaddr, int regidx)
                         goto out;
                 }
  
-       SMSC_WARNING(HW, "Timed out waiting for MII write to finish");
+       SMSC_WARNING(HW, "Timed out waiting for MII read to finish");
         reg = -EIO;
  
  out:
diff --git a/drivers/platform/x86/fujitsu-laptop.c b/drivers/platform/x86/fujitsu-laptop.c

index 45940f31fe9e57642b08bbdb4aa8dc45d8d5d798..218b9a16ac3f5dc2ab7f6c48481394ce991c82d2 100644
--- a/drivers/platform/x86/fujitsu-laptop.c
+++ b/drivers/platform/x86/fujitsu-laptop.c
@@ -174,8 +174,7 @@ struct fujitsu_hotkey_t {
  
  static struct fujitsu_hotkey_t *fujitsu_hotkey;
  
-static void acpi_fujitsu_hotkey_notify(acpi_handle handle, u32 event,
-                                      void *data);
+static void acpi_fujitsu_hotkey_notify(struct acpi_device *device, u32 event);
  
  #ifdef CONFIG_LEDS_CLASS
  static enum led_brightness logolamp_get(struct led_classdev *cdev);
@@ -203,7 +202,7 @@ struct led_classdev kblamps_led = {
  static u32 dbg_level = 0x03;
  #endif
  
-static void acpi_fujitsu_notify(acpi_handle handle, u32 event, void *data);
+static void acpi_fujitsu_notify(struct acpi_device *device, u32 event);
  
  /* Fujitsu ACPI interface function */
  
@@ -658,7 +657,6 @@ static struct dmi_system_id fujitsu_dmi_table[] = {
  
  static int acpi_fujitsu_add(struct acpi_device *device)
  {
-       acpi_status status;
         acpi_handle handle;
         int result = 0;
         int state = 0;
@@ -673,20 +671,10 @@ static int acpi_fujitsu_add(struct acpi_device *device)
         sprintf(acpi_device_class(device), "%s", ACPI_FUJITSU_CLASS);
         device->driver_data = fujitsu;
  
-       status = acpi_install_notify_handler(device->handle,
-                                            ACPI_DEVICE_NOTIFY,
-                                            acpi_fujitsu_notify, fujitsu);
-
-       if (ACPI_FAILURE(status)) {
-               printk(KERN_ERR "Error installing notify handler\n");
-               error = -ENODEV;
-               goto err_stop;
-       }
-
         fujitsu->input = input = input_allocate_device();
         if (!input) {
                 error = -ENOMEM;
-               goto err_uninstall_notify;
+               goto err_stop;
         }
  
         snprintf(fujitsu->phys, sizeof(fujitsu->phys),
@@ -743,9 +731,6 @@ static int acpi_fujitsu_add(struct acpi_device *device)
  end:
  err_free_input_dev:
         input_free_device(input);
-err_uninstall_notify:
-       acpi_remove_notify_handler(device->handle, ACPI_DEVICE_NOTIFY,
-                                  acpi_fujitsu_notify);
  err_stop:
  
         return result;
@@ -753,7 +738,6 @@ err_stop:
  
  static int acpi_fujitsu_remove(struct acpi_device *device, int type)
  {
-       acpi_status status;
         struct fujitsu_t *fujitsu = NULL;
  
         if (!device || !acpi_driver_data(device))
@@ -761,10 +745,6 @@ static int acpi_fujitsu_remove(struct acpi_device *device, int type)
  
         fujitsu = acpi_driver_data(device);
  
-       status = acpi_remove_notify_handler(fujitsu->acpi_handle,
-                                           ACPI_DEVICE_NOTIFY,
-                                           acpi_fujitsu_notify);
-
         if (!device || !acpi_driver_data(device))
                 return -EINVAL;
  
@@ -775,7 +755,7 @@ static int acpi_fujitsu_remove(struct acpi_device *device, int type)
  
  /* Brightness notify */
  
-static void acpi_fujitsu_notify(acpi_handle handle, u32 event, void *data)
+static void acpi_fujitsu_notify(struct acpi_device *device, u32 event)
  {
         struct input_dev *input;
         int keycode;
@@ -829,15 +809,12 @@ static void acpi_fujitsu_notify(acpi_handle handle, u32 event, void *data)
                 input_report_key(input, keycode, 0);
                 input_sync(input);
         }
-
-       return;
  }
  
  /* ACPI device for hotkey handling */
  
  static int acpi_fujitsu_hotkey_add(struct acpi_device *device)
  {
-       acpi_status status;
         acpi_handle handle;
         int result = 0;
         int state = 0;
@@ -854,17 +831,6 @@ static int acpi_fujitsu_hotkey_add(struct acpi_device *device)
         sprintf(acpi_device_class(device), "%s", ACPI_FUJITSU_CLASS);
         device->driver_data = fujitsu_hotkey;
  
-       status = acpi_install_notify_handler(device->handle,
-                                            ACPI_DEVICE_NOTIFY,
-                                            acpi_fujitsu_hotkey_notify,
-                                            fujitsu_hotkey);
-
-       if (ACPI_FAILURE(status)) {
-               printk(KERN_ERR "Error installing notify handler\n");
-               error = -ENODEV;
-               goto err_stop;
-       }
-
         /* kfifo */
         spin_lock_init(&fujitsu_hotkey->fifo_lock);
         fujitsu_hotkey->fifo =
@@ -879,7 +845,7 @@ static int acpi_fujitsu_hotkey_add(struct acpi_device *device)
         fujitsu_hotkey->input = input = input_allocate_device();
         if (!input) {
                 error = -ENOMEM;
-               goto err_uninstall_notify;
+               goto err_free_fifo;
         }
  
         snprintf(fujitsu_hotkey->phys, sizeof(fujitsu_hotkey->phys),
@@ -975,9 +941,7 @@ static int acpi_fujitsu_hotkey_add(struct acpi_device *device)
  end:
  err_free_input_dev:
         input_free_device(input);
-err_uninstall_notify:
-       acpi_remove_notify_handler(device->handle, ACPI_DEVICE_NOTIFY,
-                                  acpi_fujitsu_hotkey_notify);
+err_free_fifo:
         kfifo_free(fujitsu_hotkey->fifo);
  err_stop:
  
@@ -986,7 +950,6 @@ err_stop:
  
  static int acpi_fujitsu_hotkey_remove(struct acpi_device *device, int type)
  {
-       acpi_status status;
         struct fujitsu_hotkey_t *fujitsu_hotkey = NULL;
  
         if (!device || !acpi_driver_data(device))
@@ -994,10 +957,6 @@ static int acpi_fujitsu_hotkey_remove(struct acpi_device *device, int type)
  
         fujitsu_hotkey = acpi_driver_data(device);
  
-       status = acpi_remove_notify_handler(fujitsu_hotkey->acpi_handle,
-                                           ACPI_DEVICE_NOTIFY,
-                                           acpi_fujitsu_hotkey_notify);
-
         fujitsu_hotkey->acpi_handle = NULL;
  
         kfifo_free(fujitsu_hotkey->fifo);
@@ -1005,8 +964,7 @@ static int acpi_fujitsu_hotkey_remove(struct acpi_device *device, int type)
         return 0;
  }
  
-static void acpi_fujitsu_hotkey_notify(acpi_handle handle, u32 event,
-                                      void *data)
+static void acpi_fujitsu_hotkey_notify(struct acpi_device *device, u32 event)
  {
         struct input_dev *input;
         int keycode, keycode_r;
@@ -1089,8 +1047,6 @@ static void acpi_fujitsu_hotkey_notify(acpi_handle handle, u32 event,
                 input_sync(input);
                 break;
         }
-
-       return;
  }
  
  /* Initialization */
@@ -1107,6 +1063,7 @@ static struct acpi_driver acpi_fujitsu_driver = {
         .ops = {
                 .add = acpi_fujitsu_add,
                 .remove = acpi_fujitsu_remove,
+               .notify = acpi_fujitsu_notify,
                 },
  };
  
@@ -1122,6 +1079,7 @@ static struct acpi_driver acpi_fujitsu_hotkey_driver = {
         .ops = {
                 .add = acpi_fujitsu_hotkey_add,
                 .remove = acpi_fujitsu_hotkey_remove,
+               .notify = acpi_fujitsu_hotkey_notify,
                 },
  };
  
diff --git a/drivers/platform/x86/panasonic-laptop.c b/drivers/platform/x86/panasonic-laptop.c

index a5ce4bc202e33663da13eceb1bc763270fa419d9..fe7cf0188acc12df1794ebf9c8107de5f1bac7f6 100644
--- a/drivers/platform/x86/panasonic-laptop.c
+++ b/drivers/platform/x86/panasonic-laptop.c
@@ -176,6 +176,7 @@ enum SINF_BITS { SINF_NUM_BATTERIES = 0,
  static int acpi_pcc_hotkey_add(struct acpi_device *device);
  static int acpi_pcc_hotkey_remove(struct acpi_device *device, int type);
  static int acpi_pcc_hotkey_resume(struct acpi_device *device);
+static void acpi_pcc_hotkey_notify(struct acpi_device *device, u32 event);
  
  static const struct acpi_device_id pcc_device_ids[] = {
         { "MAT0012", 0},
@@ -194,6 +195,7 @@ static struct acpi_driver acpi_pcc_driver = {
                                 .add =          acpi_pcc_hotkey_add,
                                 .remove =       acpi_pcc_hotkey_remove,
                                 .resume =       acpi_pcc_hotkey_resume,
+                               .notify =       acpi_pcc_hotkey_notify,
                         },
  };
  
@@ -271,7 +273,7 @@ static int acpi_pcc_retrieve_biosdata(struct pcc_acpi *pcc, u32 *sinf)
         union acpi_object *hkey = NULL;
         int i;
  
-       status = acpi_evaluate_object(pcc->handle, METHOD_HKEY_SINF, 0,
+       status = acpi_evaluate_object(pcc->handle, METHOD_HKEY_SINF, NULL,
                                       &buffer);
         if (ACPI_FAILURE(status)) {
                 ACPI_DEBUG_PRINT((ACPI_DB_ERROR,
@@ -527,9 +529,9 @@ static void acpi_pcc_generate_keyinput(struct pcc_acpi *pcc)
         return;
  }
  
-static void acpi_pcc_hotkey_notify(acpi_handle handle, u32 event, void *data)
+static void acpi_pcc_hotkey_notify(struct acpi_device *device, u32 event)
  {
-       struct pcc_acpi *pcc = (struct pcc_acpi *) data;
+       struct pcc_acpi *pcc = acpi_driver_data(device);
  
         switch (event) {
         case HKEY_NOTIFY:
@@ -599,7 +601,6 @@ static int acpi_pcc_hotkey_resume(struct acpi_device *device)
  
  static int acpi_pcc_hotkey_add(struct acpi_device *device)
  {
-       acpi_status status;
         struct pcc_acpi *pcc;
         int num_sifr, result;
  
@@ -640,22 +641,11 @@ static int acpi_pcc_hotkey_add(struct acpi_device *device)
                 goto out_sinf;
         }
  
-       /* initialize hotkey input device */
-       status = acpi_install_notify_handler(pcc->handle, ACPI_DEVICE_NOTIFY,
-                                            acpi_pcc_hotkey_notify, pcc);
-
-       if (ACPI_FAILURE(status)) {
-               ACPI_DEBUG_PRINT((ACPI_DB_ERROR,
-                                 "Error installing notify handler\n"));
-               result = -ENODEV;
-               goto out_input;
-       }
-
         /* initialize backlight */
         pcc->backlight = backlight_device_register("panasonic", NULL, pcc,
                                                    &pcc_backlight_ops);
         if (IS_ERR(pcc->backlight))
-               goto out_notify;
+               goto out_input;
  
         if (!acpi_pcc_retrieve_biosdata(pcc, pcc->sinf)) {
                 ACPI_DEBUG_PRINT((ACPI_DB_ERROR,
@@ -680,9 +670,6 @@ static int acpi_pcc_hotkey_add(struct acpi_device *device)
  
  out_backlight:
         backlight_device_unregister(pcc->backlight);
-out_notify:
-       acpi_remove_notify_handler(pcc->handle, ACPI_DEVICE_NOTIFY,
-                                  acpi_pcc_hotkey_notify);
  out_input:
         input_unregister_device(pcc->input_dev);
         /* no need to input_free_device() since core input API refcount and
@@ -723,9 +710,6 @@ static int acpi_pcc_hotkey_remove(struct acpi_device *device, int type)
  
         backlight_device_unregister(pcc->backlight);
  
-       acpi_remove_notify_handler(pcc->handle, ACPI_DEVICE_NOTIFY,
-                                  acpi_pcc_hotkey_notify);
-
         input_unregister_device(pcc->input_dev);
         /* no need to input_free_device() since core input API refcount and
          * free()s the device */
diff --git a/drivers/platform/x86/sony-laptop.c b/drivers/platform/x86/sony-laptop.c

index a90ec5cb2f20dcb547df43cb7a0f5f45c9abeab9..d3c92d777bde343d95f733ad62f34a20e91da533 100644
--- a/drivers/platform/x86/sony-laptop.c
+++ b/drivers/platform/x86/sony-laptop.c
@@ -914,7 +914,7 @@ static struct sony_nc_event sony_127_events[] = {
  /*
   * ACPI callbacks
   */
-static void sony_acpi_notify(acpi_handle handle, u32 event, void *data)
+static void sony_nc_notify(struct acpi_device *device, u32 event)
  {
         u32 ev = event;
  
@@ -933,7 +933,7 @@ static void sony_acpi_notify(acpi_handle handle, u32 event, void *data)
                         struct sony_nc_event *key_event;
  
                         if (sony_call_snc_handle(key_handle, 0x200, &result)) {
-                               dprintk("sony_acpi_notify, unable to decode"
+                               dprintk("sony_nc_notify, unable to decode"
                                         " event 0x%.2x 0x%.2x\n", key_handle,
                                         ev);
                                 /* restore the original event */
@@ -968,7 +968,7 @@ static void sony_acpi_notify(acpi_handle handle, u32 event, void *data)
         } else
                 sony_laptop_report_input_event(ev);
  
-       dprintk("sony_acpi_notify, event: 0x%.2x\n", ev);
+       dprintk("sony_nc_notify, event: 0x%.2x\n", ev);
         acpi_bus_generate_proc_event(sony_nc_acpi_device, 1, ev);
  }
  
@@ -1276,15 +1276,6 @@ static int sony_nc_add(struct acpi_device *device)
                 goto outwalk;
         }
  
-       status = acpi_install_notify_handler(sony_nc_acpi_handle,
-                                            ACPI_DEVICE_NOTIFY,
-                                            sony_acpi_notify, NULL);
-       if (ACPI_FAILURE(status)) {
-               printk(KERN_WARNING DRV_PFX "unable to install notify handler (%u)\n", status);
-               result = -ENODEV;
-               goto outinput;
-       }
-
         if (acpi_video_backlight_support()) {
                 printk(KERN_INFO DRV_PFX "brightness ignored, must be "
                        "controlled by ACPI video driver\n");
@@ -1362,13 +1353,6 @@ static int sony_nc_add(struct acpi_device *device)
         if (sony_backlight_device)
                 backlight_device_unregister(sony_backlight_device);
  
-       status = acpi_remove_notify_handler(sony_nc_acpi_handle,
-                                           ACPI_DEVICE_NOTIFY,
-                                           sony_acpi_notify);
-       if (ACPI_FAILURE(status))
-               printk(KERN_WARNING DRV_PFX "unable to remove notify handler\n");
-
-      outinput:
         sony_laptop_remove_input();
  
        outwalk:
@@ -1378,7 +1362,6 @@ static int sony_nc_add(struct acpi_device *device)
  
  static int sony_nc_remove(struct acpi_device *device, int type)
  {
-       acpi_status status;
         struct sony_nc_value *item;
  
         if (sony_backlight_device)
@@ -1386,12 +1369,6 @@ static int sony_nc_remove(struct acpi_device *device, int type)
  
         sony_nc_acpi_device = NULL;
  
-       status = acpi_remove_notify_handler(sony_nc_acpi_handle,
-                                           ACPI_DEVICE_NOTIFY,
-                                           sony_acpi_notify);
-       if (ACPI_FAILURE(status))
-               printk(KERN_WARNING DRV_PFX "unable to remove notify handler\n");
-
         for (item = sony_nc_values; item->name; ++item) {
                 device_remove_file(&sony_pf_device->dev, &item->devattr);
         }
@@ -1425,6 +1402,7 @@ static struct acpi_driver sony_nc_driver = {
                 .add = sony_nc_add,
                 .remove = sony_nc_remove,
                 .resume = sony_nc_resume,
+               .notify = sony_nc_notify,
                 },
  };
  
diff --git a/drivers/platform/x86/wmi.c b/drivers/platform/x86/wmi.c

index 2f269e117b8fb51fb9af361f00ba89007133b3a8..043b208d971d58b9841d1eac1592ec0620cbd85b 100644
--- a/drivers/platform/x86/wmi.c
+++ b/drivers/platform/x86/wmi.c
@@ -81,6 +81,7 @@ static struct wmi_block wmi_blocks;
  
  static int acpi_wmi_remove(struct acpi_device *device, int type);
  static int acpi_wmi_add(struct acpi_device *device);
+static void acpi_wmi_notify(struct acpi_device *device, u32 event);
  
  static const struct acpi_device_id wmi_device_ids[] = {
         {"PNP0C14", 0},
@@ -96,6 +97,7 @@ static struct acpi_driver acpi_wmi_driver = {
         .ops = {
                 .add = acpi_wmi_add,
                 .remove = acpi_wmi_remove,
+               .notify = acpi_wmi_notify,
                 },
  };
  
@@ -643,12 +645,11 @@ acpi_wmi_ec_space_handler(u32 function, acpi_physical_address address,
         }
  }
  
-static void acpi_wmi_notify(acpi_handle handle, u32 event, void *data)
+static void acpi_wmi_notify(struct acpi_device *device, u32 event)
  {
         struct guid_block *block;
         struct wmi_block *wblock;
         struct list_head *p;
-       struct acpi_device *device = data;
  
         list_for_each(p, &wmi_blocks.list) {
                 wblock = list_entry(p, struct wmi_block, list);
@@ -669,9 +670,6 @@ static void acpi_wmi_notify(acpi_handle handle, u32 event, void *data)
  
  static int acpi_wmi_remove(struct acpi_device *device, int type)
  {
-       acpi_remove_notify_handler(device->handle, ACPI_DEVICE_NOTIFY,
-               acpi_wmi_notify);
-
         acpi_remove_address_space_handler(device->handle,
                                 ACPI_ADR_SPACE_EC, &acpi_wmi_ec_space_handler);
  
@@ -683,13 +681,6 @@ static int __init acpi_wmi_add(struct acpi_device *device)
         acpi_status status;
         int result = 0;
  
-       status = acpi_install_notify_handler(device->handle, ACPI_DEVICE_NOTIFY,
-               acpi_wmi_notify, device);
-       if (ACPI_FAILURE(status)) {
-               printk(KERN_ERR PREFIX "Error installing notify handler\n");
-               return -ENODEV;
-       }
-
         status = acpi_install_address_space_handler(device->handle,
                                                     ACPI_ADR_SPACE_EC,
                                                     &acpi_wmi_ec_space_handler,
diff --git a/drivers/power/pcf50633-charger.c b/drivers/power/pcf50633-charger.c

index 41aec2acbb916ceb3d5c6ce0072408f91075bec2..e8b278f71781a2c3590fd6029f6a1daa34ccfe0f 100644
--- a/drivers/power/pcf50633-charger.c
+++ b/drivers/power/pcf50633-charger.c
@@ -36,6 +36,8 @@ struct pcf50633_mbc {
  
         struct power_supply usb;
         struct power_supply adapter;
+
+       struct delayed_work charging_restart_work;
  };
  
  int pcf50633_mbc_usb_curlim_set(struct pcf50633 *pcf, int ma)
@@ -43,6 +45,8 @@ int pcf50633_mbc_usb_curlim_set(struct pcf50633 *pcf, int ma)
         struct pcf50633_mbc *mbc = platform_get_drvdata(pcf->mbc_pdev);
         int ret = 0;
         u8 bits;
+       int charging_start = 1;
+       u8 mbcs2, chgmod;
  
         if (ma >= 1000)
                 bits = PCF50633_MBCC7_USB_1000mA;
@@ -50,8 +54,10 @@ int pcf50633_mbc_usb_curlim_set(struct pcf50633 *pcf, int ma)
                 bits = PCF50633_MBCC7_USB_500mA;
         else if (ma >= 100)
                 bits = PCF50633_MBCC7_USB_100mA;
-       else
+       else {
                 bits = PCF50633_MBCC7_USB_SUSPEND;
+               charging_start = 0;
+       }
  
         ret = pcf50633_reg_set_bit_mask(pcf, PCF50633_REG_MBCC7,
                                         PCF50633_MBCC7_USB_MASK, bits);
@@ -60,6 +66,22 @@ int pcf50633_mbc_usb_curlim_set(struct pcf50633 *pcf, int ma)
         else
                 dev_info(pcf->dev, "usb curlim to %d mA\n", ma);
  
+       /* Manual charging start */
+       mbcs2 = pcf50633_reg_read(pcf, PCF50633_REG_MBCS2);
+       chgmod = (mbcs2 & PCF50633_MBCS2_MBC_MASK);
+
+       /* If chgmod == BATFULL, setting chgena has no effect.
+        * We need to set resume instead.
+        */
+       if (chgmod != PCF50633_MBCS2_MBC_BAT_FULL)
+               pcf50633_reg_set_bit_mask(pcf, PCF50633_REG_MBCC1,
+                               PCF50633_MBCC1_CHGENA, PCF50633_MBCC1_CHGENA);
+       else
+               pcf50633_reg_set_bit_mask(pcf, PCF50633_REG_MBCC1,
+                               PCF50633_MBCC1_RESUME, PCF50633_MBCC1_RESUME);
+
+       mbc->usb_active = charging_start;
+
         power_supply_changed(&mbc->usb);
  
         return ret;
@@ -84,21 +106,6 @@ int pcf50633_mbc_get_status(struct pcf50633 *pcf)
  }
  EXPORT_SYMBOL_GPL(pcf50633_mbc_get_status);
  
-void pcf50633_mbc_set_status(struct pcf50633 *pcf, int what, int status)
-{
-       struct pcf50633_mbc *mbc = platform_get_drvdata(pcf->mbc_pdev);
-
-       if (what & PCF50633_MBC_USB_ONLINE)
-               mbc->usb_online = !!status;
-       if (what & PCF50633_MBC_USB_ACTIVE)
-               mbc->usb_active = !!status;
-       if (what & PCF50633_MBC_ADAPTER_ONLINE)
-               mbc->adapter_online = !!status;
-       if (what & PCF50633_MBC_ADAPTER_ACTIVE)
-               mbc->adapter_active = !!status;
-}
-EXPORT_SYMBOL_GPL(pcf50633_mbc_set_status);
-
  static ssize_t
  show_chgmode(struct device *dev, struct device_attribute *attr, char *buf)
  {
@@ -160,10 +167,44 @@ static struct attribute_group mbc_attr_group = {
         .attrs  = pcf50633_mbc_sysfs_entries,
  };
  
+/* MBC state machine switches into charging mode when the battery voltage
+ * falls below 96% of the battery float voltage. But the voltage drop in Li-ion
+ * batteries is marginal (1-2%) until about 80% of capacity, which means that
+ * after a BATFULL, charging won't be restarted until about 80%.
+ *
+ * This delayed work function restarts charging at regular intervals to make
+ * sure we don't discharge too much.
+ */
+
+static void pcf50633_mbc_charging_restart(struct work_struct *work)
+{
+       struct pcf50633_mbc *mbc;
+       u8 mbcs2, chgmod;
+
+       mbc = container_of(work, struct pcf50633_mbc,
+                               charging_restart_work.work);
+
+       mbcs2 = pcf50633_reg_read(mbc->pcf, PCF50633_REG_MBCS2);
+       chgmod = (mbcs2 & PCF50633_MBCS2_MBC_MASK);
+
+       if (chgmod != PCF50633_MBCS2_MBC_BAT_FULL)
+               return;
+
+       /* Restart charging */
+       pcf50633_reg_set_bit_mask(mbc->pcf, PCF50633_REG_MBCC1,
+                               PCF50633_MBCC1_RESUME, PCF50633_MBCC1_RESUME);
+       mbc->usb_active = 1;
+       power_supply_changed(&mbc->usb);
+
+       dev_info(mbc->pcf->dev, "Charging restarted\n");
+}
+
  static void
  pcf50633_mbc_irq_handler(int irq, void *data)
  {
         struct pcf50633_mbc *mbc = data;
+       int chg_restart_interval =
+                       mbc->pcf->pdata->charging_restart_interval;
  
         /* USB */
         if (irq == PCF50633_IRQ_USBINS) {
@@ -172,6 +213,7 @@ pcf50633_mbc_irq_handler(int irq, void *data)
                 mbc->usb_online = 0;
                 mbc->usb_active = 0;
                 pcf50633_mbc_usb_curlim_set(mbc->pcf, 0);
+               cancel_delayed_work_sync(&mbc->charging_restart_work);
         }
  
         /* Adapter */
@@ -186,7 +228,14 @@ pcf50633_mbc_irq_handler(int irq, void *data)
         if (irq == PCF50633_IRQ_BATFULL) {
                 mbc->usb_active = 0;
                 mbc->adapter_active = 0;
-       }
+
+               if (chg_restart_interval > 0)
+                       schedule_delayed_work(&mbc->charging_restart_work,
+                                                       chg_restart_interval);
+       } else if (irq == PCF50633_IRQ_USBLIMON)
+               mbc->usb_active = 0;
+       else if (irq == PCF50633_IRQ_USBLIMOFF)
+               mbc->usb_active = 1;
  
         power_supply_changed(&mbc->usb);
         power_supply_changed(&mbc->adapter);
@@ -303,6 +352,9 @@ static int __devinit pcf50633_mbc_probe(struct platform_device *pdev)
                 return ret;
         }
  
+       INIT_DELAYED_WORK(&mbc->charging_restart_work,
+                               pcf50633_mbc_charging_restart);
+
         ret = sysfs_create_group(&pdev->dev.kobj, &mbc_attr_group);
         if (ret)
                 dev_err(mbc->pcf->dev, "failed to create sysfs entries\n");
@@ -328,6 +380,8 @@ static int __devexit pcf50633_mbc_remove(struct platform_device *pdev)
         power_supply_unregister(&mbc->usb);
         power_supply_unregister(&mbc->adapter);
  
+       cancel_delayed_work_sync(&mbc->charging_restart_work);
+
         kfree(mbc);
  
         return 0;
diff --git a/drivers/power/pda_power.c b/drivers/power/pda_power.c

index b56a704409d26365f88f144034d12e3036f34854..a232de6a57037599f23dd8e99b2a2857f5c443a9 100644
--- a/drivers/power/pda_power.c
+++ b/drivers/power/pda_power.c
@@ -12,11 +12,14 @@
  
  #include <linux/module.h>
  #include <linux/platform_device.h>
+#include <linux/err.h>
  #include <linux/interrupt.h>
  #include <linux/power_supply.h>
  #include <linux/pda_power.h>
+#include <linux/regulator/consumer.h>
  #include <linux/timer.h>
  #include <linux/jiffies.h>
+#include <linux/usb/otg.h>
  
  static inline unsigned int get_irq_flags(struct resource *res)
  {
@@ -35,6 +38,11 @@ static struct timer_list supply_timer;
  static struct timer_list polling_timer;
  static int polling;
  
+#ifdef CONFIG_USB_OTG_UTILS
+static struct otg_transceiver *transceiver;
+#endif
+static struct regulator *ac_draw;
+
  enum {
         PDA_PSY_OFFLINE = 0,
         PDA_PSY_ONLINE = 1,
@@ -104,18 +112,35 @@ static void update_status(void)
  
  static void update_charger(void)
  {
-       if (!pdata->set_charge)
-               return;
-
-       if (new_ac_status > 0) {
-               dev_dbg(dev, "charger on (AC)\n");
-               pdata->set_charge(PDA_POWER_CHARGE_AC);
-       } else if (new_usb_status > 0) {
-               dev_dbg(dev, "charger on (USB)\n");
-               pdata->set_charge(PDA_POWER_CHARGE_USB);
-       } else {
-               dev_dbg(dev, "charger off\n");
-               pdata->set_charge(0);
+       static int regulator_enabled;
+       int max_uA = pdata->ac_max_uA;
+
+       if (pdata->set_charge) {
+               if (new_ac_status > 0) {
+                       dev_dbg(dev, "charger on (AC)\n");
+                       pdata->set_charge(PDA_POWER_CHARGE_AC);
+               } else if (new_usb_status > 0) {
+                       dev_dbg(dev, "charger on (USB)\n");
+                       pdata->set_charge(PDA_POWER_CHARGE_USB);
+               } else {
+                       dev_dbg(dev, "charger off\n");
+                       pdata->set_charge(0);
+               }
+       } else if (ac_draw) {
+               if (new_ac_status > 0) {
+                       regulator_set_current_limit(ac_draw, max_uA, max_uA);
+                       if (!regulator_enabled) {
+                               dev_dbg(dev, "charger on (AC)\n");
+                               regulator_enable(ac_draw);
+                               regulator_enabled = 1;
+                       }
+               } else {
+                       if (regulator_enabled) {
+                               dev_dbg(dev, "charger off\n");
+                               regulator_disable(ac_draw);
+                               regulator_enabled = 0;
+                       }
+               }
         }
  }
  
@@ -194,6 +219,13 @@ static void polling_timer_func(unsigned long unused)
                   jiffies + msecs_to_jiffies(pdata->polling_interval));
  }
  
+#ifdef CONFIG_USB_OTG_UTILS
+static int otg_is_usb_online(void)
+{
+       return (transceiver->state == OTG_STATE_B_PERIPHERAL);
+}
+#endif
+
  static int pda_power_probe(struct platform_device *pdev)
  {
         int ret = 0;
@@ -227,6 +259,9 @@ static int pda_power_probe(struct platform_device *pdev)
         if (!pdata->polling_interval)
                 pdata->polling_interval = 2000;
  
+       if (!pdata->ac_max_uA)
+               pdata->ac_max_uA = 500000;
+
         setup_timer(&charger_timer, charger_timer_func, 0);
         setup_timer(&supply_timer, supply_timer_func, 0);
  
@@ -240,6 +275,13 @@ static int pda_power_probe(struct platform_device *pdev)
                 pda_psy_usb.num_supplicants = pdata->num_supplicants;
         }
  
+       ac_draw = regulator_get(dev, "ac_draw");
+       if (IS_ERR(ac_draw)) {
+               dev_dbg(dev, "couldn't get ac_draw regulator\n");
+               ret = PTR_ERR(ac_draw);
+               ac_draw = NULL;
+       }
+
         if (pdata->is_ac_online) {
                 ret = power_supply_register(&pdev->dev, &pda_psy_ac);
                 if (ret) {
@@ -261,6 +303,13 @@ static int pda_power_probe(struct platform_device *pdev)
                 }
         }
  
+#ifdef CONFIG_USB_OTG_UTILS
+       transceiver = otg_get_transceiver();
+       if (transceiver && !pdata->is_usb_online) {
+               pdata->is_usb_online = otg_is_usb_online;
+       }
+#endif
+
         if (pdata->is_usb_online) {
                 ret = power_supply_register(&pdev->dev, &pda_psy_usb);
                 if (ret) {
@@ -300,10 +349,18 @@ usb_irq_failed:
  usb_supply_failed:
         if (pdata->is_ac_online && ac_irq)
                 free_irq(ac_irq->start, &pda_psy_ac);
+#ifdef CONFIG_USB_OTG_UTILS
+       if (transceiver)
+               otg_put_transceiver(transceiver);
+#endif
  ac_irq_failed:
         if (pdata->is_ac_online)
                 power_supply_unregister(&pda_psy_ac);
  ac_supply_failed:
+       if (ac_draw) {
+               regulator_put(ac_draw);
+               ac_draw = NULL;
+       }
         if (pdata->exit)
                 pdata->exit(dev);
  init_failed:
@@ -327,6 +384,14 @@ static int pda_power_remove(struct platform_device *pdev)
                 power_supply_unregister(&pda_psy_usb);
         if (pdata->is_ac_online)
                 power_supply_unregister(&pda_psy_ac);
+#ifdef CONFIG_USB_OTG_UTILS
+       if (transceiver)
+               otg_put_transceiver(transceiver);
+#endif
+       if (ac_draw) {
+               regulator_put(ac_draw);
+               ac_draw = NULL;
+       }
         if (pdata->exit)
                 pdata->exit(dev);
  
diff --git a/drivers/serial/max3100.c b/drivers/serial/max3100.c
new file mode 100644
index 0000000..9fd33e5
--- /dev/null
+++ b/drivers/serial/max3100.c
@@ -0,0 +1,927 @@
+/*
+ *
+ *  Copyright (C) 2008 Christian Pellegrin <chripell@evolware.org>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ *
+ * Notes: the MAX3100 doesn't provide an interrupt on CTS so we have
+ * to use polling for flow control. TX empty IRQ is unusable, since
+ * writing conf clears FIFO buffer and we cannot have this interrupt
+ * always asking us for attention.
+ *
+ * Example platform data:
+
+ static struct plat_max3100 max3100_plat_data = {
+ .loopback = 0,
+ .crystal = 0,
+ .poll_time = 100,
+ };
+
+ static struct spi_board_info spi_board_info[] = {
+ {
+ .modalias     = "max3100",
+ .platform_data        = &max3100_plat_data,
+ .irq          = IRQ_EINT12,
+ .max_speed_hz = 5*1000*1000,
+ .chip_select  = 0,
+ },
+ };
+
+ * The initial minor number is 209 in the low-density serial port:
+ * mknod /dev/ttyMAX0 c 204 209
+ */
+
+#define MAX3100_MAJOR 204
+#define MAX3100_MINOR 209
+/* 4 MAX3100s should be enough for everyone */
+#define MAX_MAX3100 4
+
+#include <linux/delay.h>
+#include <linux/device.h>
+#include <linux/serial_core.h>
+#include <linux/serial.h>
+#include <linux/spi/spi.h>
+#include <linux/freezer.h>
+
+#include <linux/serial_max3100.h>
+
+#define MAX3100_C    (1<<14)
+#define MAX3100_D    (0<<14)
+#define MAX3100_W    (1<<15)
+#define MAX3100_RX   (0<<15)
+
+#define MAX3100_WC   (MAX3100_W  | MAX3100_C)
+#define MAX3100_RC   (MAX3100_RX | MAX3100_C)
+#define MAX3100_WD   (MAX3100_W  | MAX3100_D)
+#define MAX3100_RD   (MAX3100_RX | MAX3100_D)
+#define MAX3100_CMD  (3 << 14)
+
+#define MAX3100_T    (1<<14)
+#define MAX3100_R    (1<<15)
+
+#define MAX3100_FEN  (1<<13)
+#define MAX3100_SHDN (1<<12)
+#define MAX3100_TM   (1<<11)
+#define MAX3100_RM   (1<<10)
+#define MAX3100_PM   (1<<9)
+#define MAX3100_RAM  (1<<8)
+#define MAX3100_IR   (1<<7)
+#define MAX3100_ST   (1<<6)
+#define MAX3100_PE   (1<<5)
+#define MAX3100_L    (1<<4)
+#define MAX3100_BAUD (0xf)
+
+#define MAX3100_TE   (1<<10)
+#define MAX3100_RAFE (1<<10)
+#define MAX3100_RTS  (1<<9)
+#define MAX3100_CTS  (1<<9)
+#define MAX3100_PT   (1<<8)
+#define MAX3100_DATA (0xff)
+
+#define MAX3100_RT   (MAX3100_R | MAX3100_T)
+#define MAX3100_RTC  (MAX3100_RT | MAX3100_CTS | MAX3100_RAFE)
+
+/* the following simulate a status reg for ignore_status_mask */
+#define MAX3100_STATUS_PE 1
+#define MAX3100_STATUS_FE 2
+#define MAX3100_STATUS_OE 4
+
+struct max3100_port {
+       struct uart_port port;
+       struct spi_device *spi;
+
+       int cts;                /* last CTS received for flow ctrl */
+       int tx_empty;           /* last TX empty bit */
+
+       spinlock_t conf_lock;   /* shared data */
+       int conf_commit;        /* need to make changes */
+       int conf;               /* configuration for the MAX3100
+                                * (bits 0-7, bits 8-11 are irqs) */
+       int rts_commit;         /* need to change rts */
+       int rts;                /* rts status */
+       int baud;               /* current baud rate */
+
+       int parity;             /* keeps track if we should send parity */
+#define MAX3100_PARITY_ON 1
+#define MAX3100_PARITY_ODD 2
+#define MAX3100_7BIT 4
+       int rx_enabled;         /* if we should rx chars */
+
+       int irq;                /* irq assigned to the max3100 */
+
+       int minor;              /* minor number */
+       int crystal;            /* 1 if 3.6864 MHz crystal, 0 for 1.8432 MHz */
+       int loopback;           /* 1 if we are in loopback mode */
+
+       /* for handling irqs: need workqueue since we do spi_sync */
+       struct workqueue_struct *workqueue;
+       struct work_struct work;
+       /* set to 1 to make the workhandler exit as soon as possible */
+       int  force_end_work;
+       /* need to know we are suspending to avoid deadlock on workqueue */
+       int suspending;
+
+       /* hook for suspending MAX3100 via dedicated pin */
+       void (*max3100_hw_suspend) (int suspend);
+
+       /* poll time (in ms) for ctrl lines */
+       int poll_time;
+       /* and its timer */
+       struct timer_list       timer;
+};
+
+static struct max3100_port *max3100s[MAX_MAX3100]; /* the chips */
+static DEFINE_MUTEX(max3100s_lock);               /* race on probe */
+
+static int max3100_do_parity(struct max3100_port *s, u16 c)
+{
+       int parity;
+
+       if (s->parity & MAX3100_PARITY_ODD)
+               parity = 1;
+       else
+               parity = 0;
+
+       if (s->parity & MAX3100_7BIT)
+               c &= 0x7f;
+       else
+               c &= 0xff;
+
+       parity = parity ^ (hweight8(c) & 1);
+       return parity;
+}
+
+static int max3100_check_parity(struct max3100_port *s, u16 c)
+{
+       return max3100_do_parity(s, c) == ((c >> 8) & 1);
+}
+
+static void max3100_calc_parity(struct max3100_port *s, u16 *c)
+{
+       if (s->parity & MAX3100_7BIT)
+               *c &= 0x7f;
+       else
+               *c &= 0xff;
+
+       if (s->parity & MAX3100_PARITY_ON)
+               *c |= max3100_do_parity(s, *c) << 8;
+}
+
+static void max3100_work(struct work_struct *w);
+
+static void max3100_dowork(struct max3100_port *s)
+{
+       if (!s->force_end_work && !work_pending(&s->work) &&
+           !freezing(current) && !s->suspending)
+               queue_work(s->workqueue, &s->work);
+}
+
+static void max3100_timeout(unsigned long data)
+{
+       struct max3100_port *s = (struct max3100_port *)data;
+
+       if (s->port.info) {
+               max3100_dowork(s);
+               mod_timer(&s->timer, jiffies + s->poll_time);
+       }
+}
+
+static int max3100_sr(struct max3100_port *s, u16 tx, u16 *rx)
+{
+       struct spi_message message;
+       u16 etx, erx;
+       int status;
+       struct spi_transfer tran = {
+               .tx_buf = &etx,
+               .rx_buf = &erx,
+               .len = 2,
+       };
+
+       etx = cpu_to_be16(tx);
+       spi_message_init(&message);
+       spi_message_add_tail(&tran, &message);
+       status = spi_sync(s->spi, &message);
+       if (status) {
+               dev_warn(&s->spi->dev, "error while calling spi_sync\n");
+               return -EIO;
+       }
+       *rx = be16_to_cpu(erx);
+       s->tx_empty = (*rx & MAX3100_T) > 0;
+       dev_dbg(&s->spi->dev, "%04x - %04x\n", tx, *rx);
+       return 0;
+}
+
+static int max3100_handlerx(struct max3100_port *s, u16 rx)
+{
+       unsigned int ch, flg, status = 0;
+       int ret = 0, cts;
+
+       if (rx & MAX3100_R && s->rx_enabled) {
+               dev_dbg(&s->spi->dev, "%s\n", __func__);
+               ch = rx & (s->parity & MAX3100_7BIT ? 0x7f : 0xff);
+               if (rx & MAX3100_RAFE) {
+                       s->port.icount.frame++;
+                       flg = TTY_FRAME;
+                       status |= MAX3100_STATUS_FE;
+               } else {
+                       if (s->parity & MAX3100_PARITY_ON) {
+                               if (max3100_check_parity(s, rx)) {
+                                       s->port.icount.rx++;
+                                       flg = TTY_NORMAL;
+                               } else {
+                                       s->port.icount.parity++;
+                                       flg = TTY_PARITY;
+                                       status |= MAX3100_STATUS_PE;
+                               }
+                       } else {
+                               s->port.icount.rx++;
+                               flg = TTY_NORMAL;
+                       }
+               }
+               uart_insert_char(&s->port, status, MAX3100_STATUS_OE, ch, flg);
+               ret = 1;
+       }
+
+       cts = (rx & MAX3100_CTS) > 0;
+       if (s->cts != cts) {
+               s->cts = cts;
+               uart_handle_cts_change(&s->port, cts ? TIOCM_CTS : 0);
+       }
+
+       return ret;
+}
+
+static void max3100_work(struct work_struct *w)
+{
+       struct max3100_port *s = container_of(w, struct max3100_port, work);
+       int rxchars;
+       u16 tx, rx;
+       int conf, cconf, rts, crts;
+       struct circ_buf *xmit = &s->port.info->xmit;
+
+       dev_dbg(&s->spi->dev, "%s\n", __func__);
+
+       rxchars = 0;
+       do {
+               spin_lock(&s->conf_lock);
+               conf = s->conf;
+               cconf = s->conf_commit;
+               s->conf_commit = 0;
+               rts = s->rts;
+               crts = s->rts_commit;
+               s->rts_commit = 0;
+               spin_unlock(&s->conf_lock);
+               if (cconf)
+                       max3100_sr(s, MAX3100_WC | conf, &rx);
+               if (crts) {
+                       max3100_sr(s, MAX3100_WD | MAX3100_TE |
+                                  (s->rts ? MAX3100_RTS : 0), &rx);
+                       rxchars += max3100_handlerx(s, rx);
+               }
+
+               max3100_sr(s, MAX3100_RD, &rx);
+               rxchars += max3100_handlerx(s, rx);
+
+               if (rx & MAX3100_T) {
+                       tx = 0xffff;
+                       if (s->port.x_char) {
+                               tx = s->port.x_char;
+                               s->port.icount.tx++;
+                               s->port.x_char = 0;
+                       } else if (!uart_circ_empty(xmit) &&
+                                  !uart_tx_stopped(&s->port)) {
+                               tx = xmit->buf[xmit->tail];
+                               xmit->tail = (xmit->tail + 1) &
+                                       (UART_XMIT_SIZE - 1);
+                               s->port.icount.tx++;
+                       }
+                       if (tx != 0xffff) {
+                               max3100_calc_parity(s, &tx);
+                               tx |= MAX3100_WD | (s->rts ? MAX3100_RTS : 0);
+                               max3100_sr(s, tx, &rx);
+                               rxchars += max3100_handlerx(s, rx);
+                       }
+               }
+
+               if (rxchars > 16 && s->port.info->port.tty != NULL) {
+                       tty_flip_buffer_push(s->port.info->port.tty);
+                       rxchars = 0;
+               }
+               if (uart_circ_chars_pending(xmit) < WAKEUP_CHARS)
+                       uart_write_wakeup(&s->port);
+
+       } while (!s->force_end_work &&
+                !freezing(current) &&
+                ((rx & MAX3100_R) ||
+                 (!uart_circ_empty(xmit) &&
+                  !uart_tx_stopped(&s->port))));
+
+       if (rxchars > 0 && s->port.info->port.tty != NULL)
+               tty_flip_buffer_push(s->port.info->port.tty);
+}
+
+static irqreturn_t max3100_irq(int irqno, void *dev_id)
+{
+       struct max3100_port *s = dev_id;
+
+       dev_dbg(&s->spi->dev, "%s\n", __func__);
+
+       max3100_dowork(s);
+       return IRQ_HANDLED;
+}
+
+static void max3100_enable_ms(struct uart_port *port)
+{
+       struct max3100_port *s = container_of(port,
+                                             struct max3100_port,
+                                             port);
+
+       if (s->poll_time > 0)
+               mod_timer(&s->timer, jiffies);
+       dev_dbg(&s->spi->dev, "%s\n", __func__);
+}
+
+static void max3100_start_tx(struct uart_port *port)
+{
+       struct max3100_port *s = container_of(port,
+                                             struct max3100_port,
+                                             port);
+
+       dev_dbg(&s->spi->dev, "%s\n", __func__);
+
+       max3100_dowork(s);
+}
+
+static void max3100_stop_rx(struct uart_port *port)
+{
+       struct max3100_port *s = container_of(port,
+                                             struct max3100_port,
+                                             port);
+
+       dev_dbg(&s->spi->dev, "%s\n", __func__);
+
+       s->rx_enabled = 0;
+       spin_lock(&s->conf_lock);
+       s->conf &= ~MAX3100_RM;
+       s->conf_commit = 1;
+       spin_unlock(&s->conf_lock);
+       max3100_dowork(s);
+}
+
+static unsigned int max3100_tx_empty(struct uart_port *port)
+{
+       struct max3100_port *s = container_of(port,
+                                             struct max3100_port,
+                                             port);
+
+       dev_dbg(&s->spi->dev, "%s\n", __func__);
+
+       /* may not be truly up-to-date */
+       max3100_dowork(s);
+       return s->tx_empty;
+}
+
+static unsigned int max3100_get_mctrl(struct uart_port *port)
+{
+       struct max3100_port *s = container_of(port,
+                                             struct max3100_port,
+                                             port);
+
+       dev_dbg(&s->spi->dev, "%s\n", __func__);
+
+       /* may not be truly up-to-date */
+       max3100_dowork(s);
+       /* always assert DCD and DSR since these lines are not wired */
+       return (s->cts ? TIOCM_CTS : 0) | TIOCM_DSR | TIOCM_CAR;
+}
+
+static void max3100_set_mctrl(struct uart_port *port, unsigned int mctrl)
+{
+       struct max3100_port *s = container_of(port,
+                                             struct max3100_port,
+                                             port);
+       int rts;
+
+       dev_dbg(&s->spi->dev, "%s\n", __func__);
+
+       rts = (mctrl & TIOCM_RTS) > 0;
+
+       spin_lock(&s->conf_lock);
+       if (s->rts != rts) {
+               s->rts = rts;
+               s->rts_commit = 1;
+               max3100_dowork(s);
+       }
+       spin_unlock(&s->conf_lock);
+}
+
+static void
+max3100_set_termios(struct uart_port *port, struct ktermios *termios,
+                   struct ktermios *old)
+{
+       struct max3100_port *s = container_of(port,
+                                             struct max3100_port,
+                                             port);
+       int baud = 0;
+       unsigned cflag;
+       u32 param_new, param_mask, parity = 0;
+       struct tty_struct *tty = s->port.info->port.tty;
+
+       dev_dbg(&s->spi->dev, "%s\n", __func__);
+       if (!tty)
+               return;
+
+       cflag = termios->c_cflag;
+       param_new = 0;
+       param_mask = 0;
+
+       baud = tty_get_baud_rate(tty);
+       param_new = s->conf & MAX3100_BAUD;
+       switch (baud) {
+       case 300:
+               if (s->crystal)
+                       baud = s->baud;
+               else
+                       param_new = 15;
+               break;
+       case 600:
+               param_new = 14 + s->crystal;
+               break;
+       case 1200:
+               param_new = 13 + s->crystal;
+               break;
+       case 2400:
+               param_new = 12 + s->crystal;
+               break;
+       case 4800:
+               param_new = 11 + s->crystal;
+               break;
+       case 9600:
+               param_new = 10 + s->crystal;
+               break;
+       case 19200:
+               param_new = 9 + s->crystal;
+               break;
+       case 38400:
+               param_new = 8 + s->crystal;
+               break;
+       case 57600:
+               param_new = 1 + s->crystal;
+               break;
+       case 115200:
+               param_new = 0 + s->crystal;
+               break;
+       case 230400:
+               if (s->crystal)
+                       param_new = 0;
+               else
+                       baud = s->baud;
+               break;
+       default:
+               baud = s->baud;
+       }
+       tty_encode_baud_rate(tty, baud, baud);
+       s->baud = baud;
+       param_mask |= MAX3100_BAUD;
+
+       if ((cflag & CSIZE) == CS8) {
+               param_new &= ~MAX3100_L;
+               parity &= ~MAX3100_7BIT;
+       } else {
+               param_new |= MAX3100_L;
+               parity |= MAX3100_7BIT;
+               cflag = (cflag & ~CSIZE) | CS7;
+       }
+       param_mask |= MAX3100_L;
+
+       if (cflag & CSTOPB)
+               param_new |= MAX3100_ST;
+       else
+               param_new &= ~MAX3100_ST;
+       param_mask |= MAX3100_ST;
+
+       if (cflag & PARENB) {
+               param_new |= MAX3100_PE;
+               parity |= MAX3100_PARITY_ON;
+       } else {
+               param_new &= ~MAX3100_PE;
+               parity &= ~MAX3100_PARITY_ON;
+       }
+       param_mask |= MAX3100_PE;
+
+       if (cflag & PARODD)
+               parity |= MAX3100_PARITY_ODD;
+       else
+               parity &= ~MAX3100_PARITY_ODD;
+
+       /* mask termios capabilities we don't support */
+       cflag &= ~CMSPAR;
+       termios->c_cflag = cflag;
+
+       s->port.ignore_status_mask = 0;
+       if (termios->c_iflag & IGNPAR)
+               s->port.ignore_status_mask |=
+                       MAX3100_STATUS_PE | MAX3100_STATUS_FE |
+                       MAX3100_STATUS_OE;
+
+       /* we are sending chars from a workqueue, so enable low_latency */
+       s->port.info->port.tty->low_latency = 1;
+
+       if (s->poll_time > 0)
+               del_timer_sync(&s->timer);
+
+       uart_update_timeout(port, termios->c_cflag, baud);
+
+       spin_lock(&s->conf_lock);
+       s->conf = (s->conf & ~param_mask) | (param_new & param_mask);
+       s->conf_commit = 1;
+       s->parity = parity;
+       spin_unlock(&s->conf_lock);
+       max3100_dowork(s);
+
+       if (UART_ENABLE_MS(&s->port, termios->c_cflag))
+               max3100_enable_ms(&s->port);
+}
+
+static void max3100_shutdown(struct uart_port *port)
+{
+       struct max3100_port *s = container_of(port,
+                                             struct max3100_port,
+                                             port);
+
+       dev_dbg(&s->spi->dev, "%s\n", __func__);
+
+       if (s->suspending)
+               return;
+
+       s->force_end_work = 1;
+
+       if (s->poll_time > 0)
+               del_timer_sync(&s->timer);
+
+       if (s->workqueue) {
+               flush_workqueue(s->workqueue);
+               destroy_workqueue(s->workqueue);
+               s->workqueue = NULL;
+       }
+       if (s->irq)
+               free_irq(s->irq, s);
+
+       /* set shutdown mode to save power */
+       if (s->max3100_hw_suspend)
+               s->max3100_hw_suspend(1);
+       else  {
+               u16 tx, rx;
+
+               tx = MAX3100_WC | MAX3100_SHDN;
+               max3100_sr(s, tx, &rx);
+       }
+}
+
+static int max3100_startup(struct uart_port *port)
+{
+       struct max3100_port *s = container_of(port,
+                                             struct max3100_port,
+                                             port);
+       char b[12];
+
+       dev_dbg(&s->spi->dev, "%s\n", __func__);
+
+       s->conf = MAX3100_RM;
+       s->baud = s->crystal ? 230400 : 115200;
+       s->rx_enabled = 1;
+
+       if (s->suspending)
+               return 0;
+
+       s->force_end_work = 0;
+       s->parity = 0;
+       s->rts = 0;
+
+       sprintf(b, "max3100-%d", s->minor);
+       s->workqueue = create_freezeable_workqueue(b);
+       if (!s->workqueue) {
+               dev_warn(&s->spi->dev, "cannot create workqueue\n");
+               return -EBUSY;
+       }
+       INIT_WORK(&s->work, max3100_work);
+
+       if (request_irq(s->irq, max3100_irq,
+                       IRQF_TRIGGER_FALLING, "max3100", s) < 0) {
+               dev_warn(&s->spi->dev, "cannot allocate irq %d\n", s->irq);
+               s->irq = 0;
+               destroy_workqueue(s->workqueue);
+               s->workqueue = NULL;
+               return -EBUSY;
+       }
+
+       if (s->loopback) {
+               u16 tx, rx;
+               tx = 0x4001;
+               max3100_sr(s, tx, &rx);
+       }
+
+       if (s->max3100_hw_suspend)
+               s->max3100_hw_suspend(0);
+       s->conf_commit = 1;
+       max3100_dowork(s);
+       /* wait for clock to settle */
+       msleep(50);
+
+       max3100_enable_ms(&s->port);
+
+       return 0;
+}
+
+static const char *max3100_type(struct uart_port *port)
+{
+       struct max3100_port *s = container_of(port,
+                                             struct max3100_port,
+                                             port);
+
+       dev_dbg(&s->spi->dev, "%s\n", __func__);
+
+       return s->port.type == PORT_MAX3100 ? "MAX3100" : NULL;
+}
+
+static void max3100_release_port(struct uart_port *port)
+{
+       struct max3100_port *s = container_of(port,
+                                             struct max3100_port,
+                                             port);
+
+       dev_dbg(&s->spi->dev, "%s\n", __func__);
+}
+
+static void max3100_config_port(struct uart_port *port, int flags)
+{
+       struct max3100_port *s = container_of(port,
+                                             struct max3100_port,
+                                             port);
+
+       dev_dbg(&s->spi->dev, "%s\n", __func__);
+
+       if (flags & UART_CONFIG_TYPE)
+               s->port.type = PORT_MAX3100;
+}
+
+static int max3100_verify_port(struct uart_port *port,
+                              struct serial_struct *ser)
+{
+       struct max3100_port *s = container_of(port,
+                                             struct max3100_port,
+                                             port);
+       int ret = -EINVAL;
+
+       dev_dbg(&s->spi->dev, "%s\n", __func__);
+
+       if (ser->type == PORT_UNKNOWN || ser->type == PORT_MAX3100)
+               ret = 0;
+       return ret;
+}
+
+static void max3100_stop_tx(struct uart_port *port)
+{
+       struct max3100_port *s = container_of(port,
+                                             struct max3100_port,
+                                             port);
+
+       dev_dbg(&s->spi->dev, "%s\n", __func__);
+}
+
+static int max3100_request_port(struct uart_port *port)
+{
+       struct max3100_port *s = container_of(port,
+                                             struct max3100_port,
+                                             port);
+
+       dev_dbg(&s->spi->dev, "%s\n", __func__);
+       return 0;
+}
+
+static void max3100_break_ctl(struct uart_port *port, int break_state)
+{
+       struct max3100_port *s = container_of(port,
+                                             struct max3100_port,
+                                             port);
+
+       dev_dbg(&s->spi->dev, "%s\n", __func__);
+}
+
+static struct uart_ops max3100_ops = {
+       .tx_empty       = max3100_tx_empty,
+       .set_mctrl      = max3100_set_mctrl,
+       .get_mctrl      = max3100_get_mctrl,
+       .stop_tx        = max3100_stop_tx,
+       .start_tx       = max3100_start_tx,
+       .stop_rx        = max3100_stop_rx,
+       .enable_ms      = max3100_enable_ms,
+       .break_ctl      = max3100_break_ctl,
+       .startup        = max3100_startup,
+       .shutdown       = max3100_shutdown,
+       .set_termios    = max3100_set_termios,
+       .type           = max3100_type,
+       .release_port   = max3100_release_port,
+       .request_port   = max3100_request_port,
+       .config_port    = max3100_config_port,
+       .verify_port    = max3100_verify_port,
+};
+
+static struct uart_driver max3100_uart_driver = {
+       .owner          = THIS_MODULE,
+       .driver_name    = "ttyMAX",
+       .dev_name       = "ttyMAX",
+       .major          = MAX3100_MAJOR,
+       .minor          = MAX3100_MINOR,
+       .nr             = MAX_MAX3100,
+};
+static int uart_driver_registered;
+
+static int __devinit max3100_probe(struct spi_device *spi)
+{
+       int i, retval;
+       struct plat_max3100 *pdata;
+       u16 tx, rx;
+
+       mutex_lock(&max3100s_lock);
+
+       if (!uart_driver_registered) {
+               retval = uart_register_driver(&max3100_uart_driver);
+               if (retval) {
+                       printk(KERN_ERR "Couldn't register max3100 uart driver\n");
+                       mutex_unlock(&max3100s_lock);
+                       return retval;
+               }
+               uart_driver_registered = 1;
+       }
+
+       for (i = 0; i < MAX_MAX3100; i++)
+               if (!max3100s[i])
+                       break;
+       if (i == MAX_MAX3100) {
+               dev_warn(&spi->dev, "too many MAX3100 chips\n");
+               mutex_unlock(&max3100s_lock);
+               return -ENOMEM;
+       }
+
+       max3100s[i] = kzalloc(sizeof(struct max3100_port), GFP_KERNEL);
+       if (!max3100s[i]) {
+               dev_warn(&spi->dev,
+                        "kmalloc for max3100 structure %d failed!\n", i);
+               mutex_unlock(&max3100s_lock);
+               return -ENOMEM;
+       }
+       max3100s[i]->spi = spi;
+       max3100s[i]->irq = spi->irq;
+       spin_lock_init(&max3100s[i]->conf_lock);
+       dev_set_drvdata(&spi->dev, max3100s[i]);
+       pdata = spi->dev.platform_data;
+       max3100s[i]->crystal = pdata->crystal;
+       max3100s[i]->loopback = pdata->loopback;
+       max3100s[i]->poll_time = pdata->poll_time * HZ / 1000;
+       if (pdata->poll_time > 0 && max3100s[i]->poll_time == 0)
+               max3100s[i]->poll_time = 1;
+       max3100s[i]->max3100_hw_suspend = pdata->max3100_hw_suspend;
+       max3100s[i]->minor = i;
+       init_timer(&max3100s[i]->timer);
+       max3100s[i]->timer.function = max3100_timeout;
+       max3100s[i]->timer.data = (unsigned long) max3100s[i];
+
+       dev_dbg(&spi->dev, "%s: adding port %d\n", __func__, i);
+       max3100s[i]->port.irq = max3100s[i]->irq;
+       max3100s[i]->port.uartclk = max3100s[i]->crystal ? 3686400 : 1843200;
+       max3100s[i]->port.fifosize = 16;
+       max3100s[i]->port.ops = &max3100_ops;
+       max3100s[i]->port.flags = UPF_SKIP_TEST | UPF_BOOT_AUTOCONF;
+       max3100s[i]->port.line = i;
+       max3100s[i]->port.type = PORT_MAX3100;
+       max3100s[i]->port.dev = &spi->dev;
+       retval = uart_add_one_port(&max3100_uart_driver, &max3100s[i]->port);
+       if (retval < 0)
+               dev_warn(&spi->dev,
+                        "uart_add_one_port failed for line %d with error %d\n",
+                        i, retval);
+
+       /* set shutdown mode to save power. Will be woken up on open */
+       if (max3100s[i]->max3100_hw_suspend)
+               max3100s[i]->max3100_hw_suspend(1);
+       else {
+               tx = MAX3100_WC | MAX3100_SHDN;
+               max3100_sr(max3100s[i], tx, &rx);
+       }
+       mutex_unlock(&max3100s_lock);
+       return 0;
+}
+
+static int __devexit max3100_remove(struct spi_device *spi)
+{
+       struct max3100_port *s = dev_get_drvdata(&spi->dev);
+       int i;
+
+       mutex_lock(&max3100s_lock);
+
+       /* find out the index for the chip we are removing */
+       for (i = 0; i < MAX_MAX3100; i++)
+               if (max3100s[i] == s)
+                       break;
+
+       dev_dbg(&spi->dev, "%s: removing port %d\n", __func__, i);
+       uart_remove_one_port(&max3100_uart_driver, &max3100s[i]->port);
+       kfree(max3100s[i]);
+       max3100s[i] = NULL;
+
+       /* check if this is the last chip we have */
+       for (i = 0; i < MAX_MAX3100; i++)
+               if (max3100s[i]) {
+                       mutex_unlock(&max3100s_lock);
+                       return 0;
+               }
+       pr_debug("removing max3100 driver\n");
+       uart_unregister_driver(&max3100_uart_driver);
+
+       mutex_unlock(&max3100s_lock);
+       return 0;
+}
+
+#ifdef CONFIG_PM
+
+static int max3100_suspend(struct spi_device *spi, pm_message_t state)
+{
+       struct max3100_port *s = dev_get_drvdata(&spi->dev);
+
+       dev_dbg(&s->spi->dev, "%s\n", __func__);
+
+       disable_irq(s->irq);
+
+       s->suspending = 1;
+       uart_suspend_port(&max3100_uart_driver, &s->port);
+
+       if (s->max3100_hw_suspend)
+               s->max3100_hw_suspend(1);
+       else {
+               /* no HW suspend, so do SW one */
+               u16 tx, rx;
+
+               tx = MAX3100_WC | MAX3100_SHDN;
+               max3100_sr(s, tx, &rx);
+       }
+       return 0;
+}
+
+static int max3100_resume(struct spi_device *spi)
+{
+       struct max3100_port *s = dev_get_drvdata(&spi->dev);
+
+       dev_dbg(&s->spi->dev, "%s\n", __func__);
+
+       if (s->max3100_hw_suspend)
+               s->max3100_hw_suspend(0);
+       uart_resume_port(&max3100_uart_driver, &s->port);
+       s->suspending = 0;
+
+       enable_irq(s->irq);
+
+       s->conf_commit = 1;
+       if (s->workqueue)
+               max3100_dowork(s);
+
+       return 0;
+}
+
+#else
+#define max3100_suspend NULL
+#define max3100_resume  NULL
+#endif
+
+static struct spi_driver max3100_driver = {
+       .driver = {
+               .name           = "max3100",
+               .bus            = &spi_bus_type,
+               .owner          = THIS_MODULE,
+       },
+
+       .probe          = max3100_probe,
+       .remove         = __devexit_p(max3100_remove),
+       .suspend        = max3100_suspend,
+       .resume         = max3100_resume,
+};
+
+static int __init max3100_init(void)
+{
+       return spi_register_driver(&max3100_driver);
+}
+module_init(max3100_init);
+
+static void __exit max3100_exit(void)
+{
+       spi_unregister_driver(&max3100_driver);
+}
+module_exit(max3100_exit);
+
+MODULE_DESCRIPTION("MAX3100 driver");
+MODULE_AUTHOR("Christian Pellegrin <chripell@evolware.org>");
+MODULE_LICENSE("GPL");
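The parity handling in max3100_do_parity(), max3100_calc_parity() and max3100_check_parity() above can be tried outside the kernel. The following is a minimal userspace sketch, not the driver's code: the names popcount8, parity_bit, add_parity and parity_ok are hypothetical, popcount8 is a portable stand-in for the kernel's hweight8(), and the flag values are assumed to mirror MAX3100_PARITY_ON, MAX3100_PARITY_ODD and MAX3100_7BIT.

```c
#include <stdint.h>

/* Flag values assumed to mirror MAX3100_PARITY_* and MAX3100_7BIT. */
#define PARITY_ON  1
#define PARITY_ODD 2
#define SEVEN_BIT  4

/* Portable stand-in for the kernel's hweight8(): count set bits in a byte. */
static unsigned popcount8(uint8_t v)
{
	unsigned n = 0;

	while (v) {
		n += v & 1u;
		v >>= 1;
	}
	return n;
}

/*
 * Parity bit for the data bits of c (hypothetical helper that follows the
 * logic of max3100_do_parity()): start from 1 for odd parity, 0 for even,
 * then XOR with the low bit of the population count.
 */
static int parity_bit(int flags, uint16_t c)
{
	int parity = (flags & PARITY_ODD) ? 1 : 0;
	uint8_t data = (uint8_t)(c & ((flags & SEVEN_BIT) ? 0x7f : 0xff));

	return parity ^ (int)(popcount8(data) & 1u);
}

/* Build the word to transmit: masked data bits plus parity in bit 8. */
static uint16_t add_parity(int flags, uint16_t c)
{
	c &= (flags & SEVEN_BIT) ? 0x7f : 0xff;
	if (flags & PARITY_ON)
		c |= (uint16_t)(parity_bit(flags, c) << 8);
	return c;
}

/* Nonzero when bit 8 of a received word matches the expected parity. */
static int parity_ok(int flags, uint16_t c)
{
	return parity_bit(flags, c) == ((c >> 8) & 1);
}
```

With even parity, add_parity(PARITY_ON, 0x07) sets bit 8 because 0x07 has three bits set, so the nine transmitted bits carry an even number of ones. parity_ok() applies the same rule to a received word, which is how the receive path separates TTY_NORMAL from TTY_PARITY.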
diff --git a/drivers/serial/sunsu.c b/drivers/serial/sunsu.c
index a4dc79b1d7ab4b2d5f6df9e1a98a22c4d3d89fd9..47c6837850b1d47d998164207e81525d1b40e9ba 100644
--- a/drivers/serial/sunsu.c
+++ b/drivers/serial/sunsu.c
@@ -1178,7 +1178,7 @@ static struct uart_driver sunsu_reg = {
         .major                  = TTY_MAJOR,
  };
  
-static int __init sunsu_kbd_ms_init(struct uart_sunsu_port *up)
+static int __devinit sunsu_kbd_ms_init(struct uart_sunsu_port *up)
  {
         int quot, baud;
  #ifdef CONFIG_SERIO
diff --git a/drivers/usb/host/ohci-at91.c b/drivers/usb/host/ohci-at91.c
index 4ed228a899435a91bc1428daa37a9eb54f283b14..bb5e6f67157837691f4b22130133c4dc1283469e 100644
--- a/drivers/usb/host/ohci-at91.c
+++ b/drivers/usb/host/ohci-at91.c
@@ -280,7 +280,7 @@ static int ohci_hcd_at91_drv_probe(struct platform_device *pdev)
                  * are always powered while this driver is active, and use
                  * active-low power switches.
                  */
-               for (i = 0; i < pdata->ports; i++) {
+               for (i = 0; i < ARRAY_SIZE(pdata->vbus_pin); i++) {
                         if (pdata->vbus_pin[i] <= 0)
                                 continue;
                         gpio_request(pdata->vbus_pin[i], "ohci_vbus");
@@ -298,7 +298,7 @@ static int ohci_hcd_at91_drv_remove(struct platform_device *pdev)
         int                     i;
  
         if (pdata) {
-               for (i = 0; i < pdata->ports; i++) {
+               for (i = 0; i < ARRAY_SIZE(pdata->vbus_pin); i++) {
                         if (pdata->vbus_pin[i] <= 0)
                                 continue;
                         gpio_direction_output(pdata->vbus_pin[i], 1);
diff --git a/fs/befs/super.c b/fs/befs/super.c
index 41f2b4d0093e66ed67ab747831cea76fedc397cb..ca40f828f64dac9478d667720b4bcdf6f2d9d26e 100644
--- a/fs/befs/super.c
+++ b/fs/befs/super.c
@@ -8,6 +8,7 @@
   */
  
  #include <linux/fs.h>
+#include <asm/page.h> /* for PAGE_SIZE */
  
  #include "befs.h"
  #include "super.h"
diff --git a/fs/buffer.c b/fs/buffer.c
index 6e35762b6169be2a30a69462764b71967369a1d8..13edf7ad3ff1524032d043ec10aa3b13465103ef 100644
--- a/fs/buffer.c
+++ b/fs/buffer.c
@@ -1596,6 +1596,16 @@ EXPORT_SYMBOL(unmap_underlying_metadata);
   * locked buffer.   This only can happen if someone has written the buffer
   * directly, with submit_bh().  At the address_space level PageWriteback
   * prevents this contention from occurring.
+ *
+ * If block_write_full_page() is called with wbc->sync_mode ==
+ * WB_SYNC_ALL, the writes are posted using WRITE_SYNC_PLUG; this
+ * causes the writes to be flagged as synchronous writes, but the
+ * block device queue will NOT be unplugged, since usually many pages
+ * will be pushed out before the higher-level caller actually
+ * waits for the writes to be completed.  The various wait functions,
+ * such as wait_on_writeback_range(), will ultimately call sync_page()
+ * which will ultimately call blk_run_backing_dev(), which will end up
+ * unplugging the device queue.
   */
  static int __block_write_full_page(struct inode *inode, struct page *page,
                         get_block_t *get_block, struct writeback_control *wbc)
@@ -1606,7 +1616,8 @@ static int __block_write_full_page(struct inode *inode, struct page *page,
         struct buffer_head *bh, *head;
         const unsigned blocksize = 1 << inode->i_blkbits;
         int nr_underway = 0;
-       int write_op = (wbc->sync_mode == WB_SYNC_ALL ? WRITE_SYNC : WRITE);
+       int write_op = (wbc->sync_mode == WB_SYNC_ALL ?
+                       WRITE_SYNC_PLUG : WRITE);
  
         BUG_ON(!PageLocked(page));
  
diff --git a/fs/ext3/inode.c b/fs/ext3/inode.c
index 466a332e0bd124c871143cfd682392074252aa76..fcfa243618567c2787d50b2119bad58d4d137cb2 100644
--- a/fs/ext3/inode.c
+++ b/fs/ext3/inode.c
@@ -1521,12 +1521,16 @@ static int ext3_ordered_writepage(struct page *page,
         if (!page_has_buffers(page)) {
                 create_empty_buffers(page, inode->i_sb->s_blocksize,
                                 (1 << BH_Dirty)|(1 << BH_Uptodate));
-       } else if (!walk_page_buffers(NULL, page_buffers(page), 0, PAGE_CACHE_SIZE, NULL, buffer_unmapped)) {
-               /* Provide NULL instead of get_block so that we catch bugs if buffers weren't really mapped */
-               return block_write_full_page(page, NULL, wbc);
+               page_bufs = page_buffers(page);
+       } else {
+               page_bufs = page_buffers(page);
+               if (!walk_page_buffers(NULL, page_bufs, 0, PAGE_CACHE_SIZE,
+                                      NULL, buffer_unmapped)) {
+                       /* Provide NULL get_block() to catch bugs if buffers
+                        * weren't really mapped */
+                       return block_write_full_page(page, NULL, wbc);
+               }
         }
-       page_bufs = page_buffers(page);
-
         handle = ext3_journal_start(inode, ext3_writepage_trans_blocks(inode));
  
         if (IS_ERR(handle)) {
@@ -1581,6 +1585,15 @@ static int ext3_writeback_writepage(struct page *page,
         if (ext3_journal_current_handle())
                 goto out_fail;
  
+       if (page_has_buffers(page)) {
+               if (!walk_page_buffers(NULL, page_buffers(page), 0,
+                                     PAGE_CACHE_SIZE, NULL, buffer_unmapped)) {
+                       /* Provide NULL get_block() to catch bugs if buffers
+                        * weren't really mapped */
+                       return block_write_full_page(page, NULL, wbc);
+               }
+       }
+
         handle = ext3_journal_start(inode, ext3_writepage_trans_blocks(inode));
         if (IS_ERR(handle)) {
                 ret = PTR_ERR(handle);
diff --git a/fs/proc/task_nommu.c b/fs/proc/task_nommu.c
index 12c20377772d86f33111d7ea2797c9b077dcc979..64a72e2e76509545bf6c0d4d4e0788779baaab26 100644
--- a/fs/proc/task_nommu.c
+++ b/fs/proc/task_nommu.c
@@ -135,7 +135,7 @@ static int nommu_vma_show(struct seq_file *m, struct vm_area_struct *vma)
                 struct inode *inode = vma->vm_file->f_path.dentry->d_inode;
                 dev = inode->i_sb->s_dev;
                 ino = inode->i_ino;
-               pgoff = (loff_t)vma->pg_off << PAGE_SHIFT;
+               pgoff = (loff_t)vma->vm_pgoff << PAGE_SHIFT;
         }
  
         seq_printf(m,
diff --git a/include/acpi/acpi_bus.h b/include/acpi/acpi_bus.h
index a2228511d4be9433ca290bae158583148507f04f..c34b11022908a01c521367ddf85249fcacb98ffe 100644
--- a/include/acpi/acpi_bus.h
+++ b/include/acpi/acpi_bus.h
@@ -270,7 +270,6 @@ struct acpi_device {
         struct list_head children;
         struct list_head node;
         struct list_head wakeup_list;
-       struct list_head g_list;
         struct acpi_device_status status;
         struct acpi_device_flags flags;
         struct acpi_device_pnp pnp;
diff --git a/include/linux/device-mapper.h b/include/linux/device-mapper.h
index 66ec05a5795558cef24450388e67ffd6bf0743f3..ded2d7c42668ccc396f193ae3c157cf0ea024133 100644
--- a/include/linux/device-mapper.h
+++ b/include/linux/device-mapper.h
@@ -116,7 +116,6 @@ void dm_put_device(struct dm_target *ti, struct dm_dev *d);
  /*
   * Target features
   */
-#define DM_TARGET_SUPPORTS_BARRIERS 0x00000001
  
  struct target_type {
         uint64_t features;
diff --git a/include/linux/ftrace.h b/include/linux/ftrace.h
index da5405dce34746aa4c7ab757af6a46e430f9b538..8a0c2f221e6b95b448991b1e291610c9e52ea91e 100644
--- a/include/linux/ftrace.h
+++ b/include/linux/ftrace.h
@@ -357,7 +357,7 @@ struct ftrace_graph_ret {
  #ifdef CONFIG_FUNCTION_GRAPH_TRACER
  
  /* for init task */
-#define INIT_FTRACE_GRAPH              .ret_stack = NULL
+#define INIT_FTRACE_GRAPH              .ret_stack = NULL,
  
  /*
   * Stack of return addresses for functions
@@ -511,33 +511,4 @@ static inline void trace_hw_branch_oops(void) {}
  
  #endif /* CONFIG_HW_BRANCH_TRACER */
  
-/*
- * A syscall entry in the ftrace syscalls array.
- *
- * @name: name of the syscall
- * @nb_args: number of parameters it takes
- * @types: list of types as strings
- * @args: list of args as strings (args[i] matches types[i])
- */
-struct syscall_metadata {
-       const char      *name;
-       int             nb_args;
-       const char      **types;
-       const char      **args;
-};
-
-#ifdef CONFIG_FTRACE_SYSCALLS
-extern void arch_init_ftrace_syscalls(void);
-extern struct syscall_metadata *syscall_nr_to_meta(int nr);
-extern void start_ftrace_syscalls(void);
-extern void stop_ftrace_syscalls(void);
-extern void ftrace_syscall_enter(struct pt_regs *regs);
-extern void ftrace_syscall_exit(struct pt_regs *regs);
-#else
-static inline void start_ftrace_syscalls(void) { }
-static inline void stop_ftrace_syscalls(void) { }
-static inline void ftrace_syscall_enter(struct pt_regs *regs) { }
-static inline void ftrace_syscall_exit(struct pt_regs *regs) { }
-#endif
-
  #endif /* _LINUX_FTRACE_H */
diff --git a/include/linux/irq.h b/include/linux/irq.h
index ca507c9426b00972af3254d180837d7bdb05b471..b7cbeed972e425b694f1666343f6fc0558ac439e 100644
--- a/include/linux/irq.h
+++ b/include/linux/irq.h
@@ -487,6 +487,16 @@ static inline void init_copy_desc_masks(struct irq_desc *old_desc,
  #endif
  }
  
+static inline void free_desc_masks(struct irq_desc *old_desc,
+                                  struct irq_desc *new_desc)
+{
+       free_cpumask_var(old_desc->affinity);
+
+#ifdef CONFIG_GENERIC_PENDING_IRQ
+       free_cpumask_var(old_desc->pending_mask);
+#endif
+}
+
  #else /* !CONFIG_SMP */
  
  static inline bool init_alloc_desc_masks(struct irq_desc *desc, int cpu,
@@ -500,6 +510,10 @@ static inline void init_copy_desc_masks(struct irq_desc *old_desc,
  {
  }
  
+static inline void free_desc_masks(struct irq_desc *old_desc,
+                                  struct irq_desc *new_desc)
+{
+}
  #endif /* CONFIG_SMP */
  
  #endif /* _LINUX_IRQ_H */
diff --git a/include/linux/kmod.h b/include/linux/kmod.h
index d5fa565086d16746b2a193d0fa18134825b353bd..384ca8bbf1ac26210ed6d217d8651b86d9b5e4e0 100644
--- a/include/linux/kmod.h
+++ b/include/linux/kmod.h
@@ -34,7 +34,7 @@ extern int __request_module(bool wait, const char *name, ...) \
  #define request_module(mod...) __request_module(true, mod)
  #define request_module_nowait(mod...) __request_module(false, mod)
  #define try_then_request_module(x, mod...) \
-       ((x) ?: (__request_module(false, mod), (x)))
+       ((x) ?: (__request_module(true, mod), (x)))
  #else
  static inline int request_module(const char *name, ...) { return -ENOSYS; }
  static inline int request_module_nowait(const char *name, ...) { return -ENOSYS; }
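Note on the kmod.h hunk above: try_then_request_module() evaluates its lookup
expression, and if that yields nothing it requests the module and evaluates
the lookup a second time. The change passes wait = true, so the request has
finished before the second lookup runs. The sketch below spells the same
control flow out as a plain function in standard C rather than with the GNU
"?:" extension the macro uses. Everything named demo_* is hypothetical, and
treating a non-waiting request as "not loaded yet at the re-check" is a
deliberate simplification for the sketch, not a statement about kernel timing.

```c
#include <stdbool.h>
#include <stddef.h>

static int demo_loaded;		/* hypothetical: set once the "module" is loaded */
static int demo_table = 42;	/* hypothetical object the module provides */

/* Hypothetical lookup: finds nothing until the module is loaded. */
static int *demo_lookup(void)
{
	return demo_loaded ? &demo_table : NULL;
}

/*
 * Hypothetical loader. With wait == true the load has completed when
 * this returns; with wait == false it is modelled as still pending.
 */
static int demo_request_module(bool wait)
{
	if (wait)
		demo_loaded = 1;
	return 0;
}

/* Same shape as try_then_request_module(): look up, request, look up again. */
static int *demo_try_then_request(bool wait)
{
	int *p = demo_lookup();

	if (!p) {
		demo_request_module(wait);
		p = demo_lookup();
	}
	return p;
}
```

Under these assumptions the second lookup is only useful when the request is
synchronous, which matches the direction of the one-word change in the macro.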
diff --git a/include/linux/mfd/pcf50633/core.h b/include/linux/mfd/pcf50633/core.h
index 4455b212d75acec92a21a56cdf5504777165da05..c8f51c3c0a725d21f9906178f91d7fb868ca0299 100644
--- a/include/linux/mfd/pcf50633/core.h
+++ b/include/linux/mfd/pcf50633/core.h
@@ -29,6 +29,8 @@ struct pcf50633_platform_data {
         char **batteries;
         int num_batteries;
  
+       int charging_restart_interval;
+
         /* Callbacks */
         void (*probe_done)(struct pcf50633 *);
         void (*mbc_event_callback)(struct pcf50633 *, int);
diff --git a/include/linux/mfd/pcf50633/mbc.h b/include/linux/mfd/pcf50633/mbc.h
index 6e17619b773a162e6f75c76648a07647b1247473..4119579acf2c447d15472f3e35cdb34f727d31d6 100644
--- a/include/linux/mfd/pcf50633/mbc.h
+++ b/include/linux/mfd/pcf50633/mbc.h
@@ -128,7 +128,6 @@ enum pcf50633_reg_mbcs3 {
  int pcf50633_mbc_usb_curlim_set(struct pcf50633 *pcf, int ma);
  
  int pcf50633_mbc_get_status(struct pcf50633 *);
-void pcf50633_mbc_set_status(struct pcf50633 *, int what, int status);
  
  #endif
  
diff --git a/include/linux/pda_power.h b/include/linux/pda_power.h
index cb7d10f3076343f2e60e5a853411c05aefd6eb0f..d4cf7a2ceb3edfc7779cb4f10c76de5825288317 100644
--- a/include/linux/pda_power.h
+++ b/include/linux/pda_power.h
@@ -31,6 +31,8 @@ struct pda_power_pdata {
         unsigned int wait_for_status; /* msecs, default is 500 */
         unsigned int wait_for_charger; /* msecs, default is 500 */
         unsigned int polling_interval; /* msecs, default is 2000 */
+
+       unsigned long ac_max_uA; /* current to draw when on AC */
  };
  
  #endif /* __PDA_POWER_H__ */
diff --git a/include/linux/sched.h b/include/linux/sched.h
index 98e1fe51601df0066786500ba78e0648ec896287..b4c38bc8049cbbea17e0ca4f929f35df9cddbe1f 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -205,7 +205,8 @@ extern unsigned long long time_sync_thresh;
  #define task_is_stopped_or_traced(task)        \
                         ((task->state & (__TASK_STOPPED | __TASK_TRACED)) != 0)
  #define task_contributes_to_load(task) \
-                               ((task->state & TASK_UNINTERRUPTIBLE) != 0)
+                               ((task->state & TASK_UNINTERRUPTIBLE) != 0 && \
+                                (task->flags & PF_FROZEN) == 0)
  
  #define __set_task_state(tsk, state_value)             \
         do { (tsk)->state = (state_value); } while (0)
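Note on the sched.h hunk above: task_contributes_to_load() now requires two
conditions, namely that the task is in uninterruptible sleep and that it is
not frozen. A minimal userspace sketch of the predicate follows. The struct,
the demo_* names, and the numeric flag values are placeholders chosen for the
sketch and should not be read as the kernel's definitions.

```c
/* Placeholder bit values, chosen only so the sketch compiles and runs. */
#define DEMO_TASK_UNINTERRUPTIBLE	0x0002
#define DEMO_PF_FROZEN			0x00010000

/* Hypothetical cut-down task with just the two fields the macro reads. */
struct demo_task {
	long state;
	unsigned int flags;
};

/* Non-zero when the task should be counted towards the load average. */
static int demo_contributes_to_load(const struct demo_task *task)
{
	return (task->state & DEMO_TASK_UNINTERRUPTIBLE) != 0 &&
	       (task->flags & DEMO_PF_FROZEN) == 0;
}
```

With this predicate, a task that sleeps uninterruptibly while frozen is left
out of the count, whereas the old single-condition form would have included it.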
diff --git a/include/linux/serial_max3100.h b/include/linux/serial_max3100.h
new file mode 100644
index 0000000..4976bef
--- /dev/null
+++ b/include/linux/serial_max3100.h
@@ -0,0 +1,52 @@
+/*
+ *
+ *  Copyright (C) 2007 Christian Pellegrin
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ */
+
+
+#ifndef _LINUX_SERIAL_MAX3100_H
+#define _LINUX_SERIAL_MAX3100_H 1
+
+
+/**
+ * struct plat_max3100 - MAX3100 SPI UART platform data
+ * @loopback:            force MAX3100 in loopback
+ * @crystal:             1 for 3.6864 MHz, 0 for 1.8432 MHz
+ * @max3100_hw_suspend:  MAX3100 has a shutdown pin. This is a hook
+ *                       called on suspend and resume to activate it.
+ * @poll_time:           poll time for CTS signal in ms, 0 disables (so no hw
+ *                       flow ctrl is possible but you have less CPU usage)
+ *
+ * You should use this structure in your machine description to specify
+ * how the MAX3100 is connected. Example:
+ *
+ * static struct plat_max3100 max3100_plat_data = {
+ *  .loopback = 0,
+ *  .crystal = 0,
+ *  .poll_time = 100,
+ * };
+ *
+ * static struct spi_board_info spi_board_info[] = {
+ * {
+ *  .modalias  = "max3100",
+ *  .platform_data     = &max3100_plat_data,
+ *  .irq               = IRQ_EINT12,
+ *  .max_speed_hz      = 5*1000*1000,
+ *  .chip_select       = 0,
+ * },
+ * };
+ *
+ **/
+struct plat_max3100 {
+       int loopback;
+       int crystal;
+       void (*max3100_hw_suspend) (int suspend);
+       int poll_time;
+};
+
+#endif
diff --git a/include/linux/syscalls.h b/include/linux/syscalls.h
index 6470f74074af5e812cfe967d8c9c54810f0ad74a..dabe4ad8914111d4979806ea0c152772b4a3996f 100644
--- a/include/linux/syscalls.h
+++ b/include/linux/syscalls.h
@@ -65,7 +65,7 @@ struct old_linux_dirent;
  #include <asm/signal.h>
  #include <linux/quota.h>
  #include <linux/key.h>
-#include <linux/ftrace.h>
+#include <trace/syscall.h>
  
  #define __SC_DECL1(t1, a1)     t1 a1
  #define __SC_DECL2(t2, a2, ...) t2 a2, __SC_DECL1(__VA_ARGS__)
diff --git a/include/net/netfilter/nf_conntrack_expect.h b/include/net/netfilter/nf_conntrack_expect.h
index ab17a159ac66192bdc4e6a445f3f1325653c571e..a9652806d0df59fbcdbbb09cf8d4cf353c310390 100644
--- a/include/net/netfilter/nf_conntrack_expect.h
+++ b/include/net/netfilter/nf_conntrack_expect.h
@@ -99,9 +99,12 @@ void nf_ct_expect_init(struct nf_conntrack_expect *, unsigned int, u_int8_t,
                        const union nf_inet_addr *,
                        u_int8_t, const __be16 *, const __be16 *);
  void nf_ct_expect_put(struct nf_conntrack_expect *exp);
-int nf_ct_expect_related(struct nf_conntrack_expect *expect);
  int nf_ct_expect_related_report(struct nf_conntrack_expect *expect, 
                                 u32 pid, int report);
+static inline int nf_ct_expect_related(struct nf_conntrack_expect *expect)
+{
+       return nf_ct_expect_related_report(expect, 0, 0);
+}
  
  #endif /*_NF_CONNTRACK_EXPECT_H*/
  
diff --git a/include/trace/syscall.h b/include/trace/syscall.h
new file mode 100644
index 0000000..8cfe515
--- /dev/null
+++ b/include/trace/syscall.h
@@ -0,0 +1,35 @@
+#ifndef _TRACE_SYSCALL_H
+#define _TRACE_SYSCALL_H
+
+#include <asm/ptrace.h>
+
+/*
+ * A syscall entry in the ftrace syscalls array.
+ *
+ * @name: name of the syscall
+ * @nb_args: number of parameters it takes
+ * @types: list of types as strings
+ * @args: list of args as strings (args[i] matches types[i])
+ */
+struct syscall_metadata {
+       const char      *name;
+       int             nb_args;
+       const char      **types;
+       const char      **args;
+};
+
+#ifdef CONFIG_FTRACE_SYSCALLS
+extern void arch_init_ftrace_syscalls(void);
+extern struct syscall_metadata *syscall_nr_to_meta(int nr);
+extern void start_ftrace_syscalls(void);
+extern void stop_ftrace_syscalls(void);
+extern void ftrace_syscall_enter(struct pt_regs *regs);
+extern void ftrace_syscall_exit(struct pt_regs *regs);
+#else
+static inline void start_ftrace_syscalls(void)                 { }
+static inline void stop_ftrace_syscalls(void)                  { }
+static inline void ftrace_syscall_enter(struct pt_regs *regs)  { }
+static inline void ftrace_syscall_exit(struct pt_regs *regs)   { }
+#endif
+
+#endif /* _TRACE_SYSCALL_H */
diff --git a/kernel/fork.c b/kernel/fork.c
index 989c7c202b3d831ef1aa9ad024872edd864c08ca..b9e2edd00726538467877d458d7aeb7771c03015 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -800,6 +800,12 @@ static void posix_cpu_timers_init_group(struct signal_struct *sig)
         sig->cputime_expires.virt_exp = cputime_zero;
         sig->cputime_expires.sched_exp = 0;
  
+       if (sig->rlim[RLIMIT_CPU].rlim_cur != RLIM_INFINITY) {
+               sig->cputime_expires.prof_exp =
+                       secs_to_cputime(sig->rlim[RLIMIT_CPU].rlim_cur);
+               sig->cputimer.running = 1;
+       }
+
         /* The timer lists. */
         INIT_LIST_HEAD(&sig->cpu_timers[0]);
         INIT_LIST_HEAD(&sig->cpu_timers[1]);
@@ -815,11 +821,8 @@ static int copy_signal(unsigned long clone_flags, struct task_struct *tsk)
                 atomic_inc(&current->signal->live);
                 return 0;
         }
-       sig = kmem_cache_alloc(signal_cachep, GFP_KERNEL);
-
-       if (sig)
-               posix_cpu_timers_init_group(sig);
  
+       sig = kmem_cache_alloc(signal_cachep, GFP_KERNEL);
         tsk->signal = sig;
         if (!sig)
                 return -ENOMEM;
@@ -859,6 +862,8 @@ static int copy_signal(unsigned long clone_flags, struct task_struct *tsk)
         memcpy(sig->rlim, current->signal->rlim, sizeof sig->rlim);
         task_unlock(current->group_leader);
  
+       posix_cpu_timers_init_group(sig);
+
         acct_init_pacct(&sig->pacct);
  
         tty_audit_fork(sig);
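Note on the fork.c hunks above: posix_cpu_timers_init_group() now reads
sig->rlim[RLIMIT_CPU], so copy_signal() calls it only after the memcpy() that
fills sig->rlim from the parent. The sketch below illustrates that ordering
dependency in userspace. The struct, the demo_* names, and the use of plain
seconds in place of cputime are assumptions made for the sketch, as is the
reset of prof_exp to zero before the limit is checked.

```c
#define DEMO_RLIM_INFINITY (~0UL)

/* Hypothetical cut-down signal struct: one CPU limit and one timer slot. */
struct demo_signal {
	unsigned long rlim_cpu_cur;	/* soft RLIMIT_CPU, in seconds */
	unsigned long prof_exp;		/* profiling expiry, 0 = none */
	int timer_running;
};

/* Arm the group timer from the limit. Reads rlim_cpu_cur, so it must be set. */
static void demo_timers_init_group(struct demo_signal *sig)
{
	sig->prof_exp = 0;
	sig->timer_running = 0;

	if (sig->rlim_cpu_cur != DEMO_RLIM_INFINITY) {
		sig->prof_exp = sig->rlim_cpu_cur;
		sig->timer_running = 1;
	}
}

/* Copy the parent's limit first, then initialise the timers from it. */
static void demo_copy_signal(struct demo_signal *child,
			     const struct demo_signal *parent)
{
	child->rlim_cpu_cur = parent->rlim_cpu_cur;
	demo_timers_init_group(child);
}
```

If the two statements in demo_copy_signal() were swapped, the timer would be
armed from whatever the freshly allocated child struct happened to contain.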
diff --git a/kernel/futex.c b/kernel/futex.c
index 6b50a024bca22e32b0a606fb4b7ea6daf1525967..eef8cd26b5e5062e37830099128f9844b3323253 100644
--- a/kernel/futex.c
+++ b/kernel/futex.c
@@ -883,7 +883,12 @@ retry_private:
  out_unlock:
         double_unlock_hb(hb1, hb2);
  
-       /* drop_futex_key_refs() must be called outside the spinlocks. */
+       /*
+        * drop_futex_key_refs() must be called outside the spinlocks. During
+        * the requeue we moved futex_q's from the hash bucket at key1 to the
+        * one at key2 and updated their key pointer.  We no longer need to
+        * hold the references to key1.
+        */
         while (--drop_count >= 0)
                 drop_futex_key_refs(&key1);
  
diff --git a/kernel/irq/numa_migrate.c b/kernel/irq/numa_migrate.c
index 243d6121e50e08c1b54972fd3bb61c827a8c3dfe..44bbdcbaf8d2c8193f2d631b54061047546d0065 100644
--- a/kernel/irq/numa_migrate.c
+++ b/kernel/irq/numa_migrate.c
@@ -54,6 +54,7 @@ static bool init_copy_one_irq_desc(int irq, struct irq_desc *old_desc,
  static void free_one_irq_desc(struct irq_desc *old_desc, struct irq_desc *desc)
  {
         free_kstat_irqs(old_desc, desc);
+       free_desc_masks(old_desc, desc);
         arch_free_chip_data(old_desc, desc);
  }
  
diff --git a/kernel/kthread.c b/kernel/kthread.c
index 84bbadd4d0213c1558ade617ab49695df0b1bd9f..4ebaf8519abf64fb0cc4e93ef42d7ad0cf286c9f 100644
--- a/kernel/kthread.c
+++ b/kernel/kthread.c
@@ -76,6 +76,7 @@ static int kthread(void *_create)
  
         /* OK, tell user we're spawned, wait for stop or wakeup */
         __set_current_state(TASK_UNINTERRUPTIBLE);
+       create->result = current;
         complete(&create->started);
         schedule();
  
@@ -96,22 +97,10 @@ static void create_kthread(struct kthread_create_info *create)
  
         /* We want our own signal handler (we take no signals by default). */
         pid = kernel_thread(kthread, create, CLONE_FS | CLONE_FILES | SIGCHLD);
-       if (pid < 0) {
+       if (pid < 0)
                 create->result = ERR_PTR(pid);
-       } else {
-               struct sched_param param = { .sched_priority = 0 };
+       else
                 wait_for_completion(&create->started);
-               read_lock(&tasklist_lock);
-               create->result = find_task_by_pid_ns(pid, &init_pid_ns);
-               read_unlock(&tasklist_lock);
-               /*
-                * root may have changed our (kthreadd's) priority or CPU mask.
-                * The kernel thread should not inherit these properties.
-                */
-               sched_setscheduler(create->result, SCHED_NORMAL, &param);
-               set_user_nice(create->result, KTHREAD_NICE_LEVEL);
-               set_cpus_allowed_ptr(create->result, cpu_all_mask);
-       }
         complete(&create->done);
  }
  
@@ -154,11 +143,20 @@ struct task_struct *kthread_create(int (*threadfn)(void *data),
         wait_for_completion(&create.done);
  
         if (!IS_ERR(create.result)) {
+               struct sched_param param = { .sched_priority = 0 };
                 va_list args;
+
                 va_start(args, namefmt);
                 vsnprintf(create.result->comm, sizeof(create.result->comm),
                           namefmt, args);
                 va_end(args);
+               /*
+                * root may have changed our (kthreadd's) priority or CPU mask.
+                * The kernel thread should not inherit these properties.
+                */
+               sched_setscheduler_nocheck(create.result, SCHED_NORMAL, &param);
+               set_user_nice(create.result, KTHREAD_NICE_LEVEL);
+               set_cpus_allowed_ptr(create.result, cpu_all_mask);
         }
         return create.result;
  }
diff --git a/kernel/posix-cpu-timers.c b/kernel/posix-cpu-timers.c
index 8e5d9a68b0222f0c028c42c059a39a736c636dba..c9dcf98b44633398217e3a7829ace3bf7791403d 100644
--- a/kernel/posix-cpu-timers.c
+++ b/kernel/posix-cpu-timers.c
@@ -18,7 +18,7 @@ void update_rlimit_cpu(unsigned long rlim_new)
  
         cputime = secs_to_cputime(rlim_new);
         if (cputime_eq(current->signal->it_prof_expires, cputime_zero) ||
-           cputime_lt(current->signal->it_prof_expires, cputime)) {
+           cputime_gt(current->signal->it_prof_expires, cputime)) {
                 spin_lock_irq(&current->sighand->siglock);
                 set_process_cpu_timer(current, CPUCLOCK_PROF, &cputime, NULL);
                 spin_unlock_irq(&current->sighand->siglock);
@@ -224,7 +224,7 @@ static int cpu_clock_sample(const clockid_t which_clock, struct task_struct *p,
                 cpu->cpu = virt_ticks(p);
                 break;
         case CPUCLOCK_SCHED:
-               cpu->sched = p->se.sum_exec_runtime + task_delta_exec(p);
+               cpu->sched = task_sched_runtime(p);
                 break;
         }
         return 0;
@@ -305,18 +305,19 @@ static int cpu_clock_sample_group(const clockid_t which_clock,
  {
         struct task_cputime cputime;
  
-       thread_group_cputime(p, &cputime);
         switch (CPUCLOCK_WHICH(which_clock)) {
         default:
                 return -EINVAL;
         case CPUCLOCK_PROF:
+               thread_group_cputime(p, &cputime);
                 cpu->cpu = cputime_add(cputime.utime, cputime.stime);
                 break;
         case CPUCLOCK_VIRT:
+               thread_group_cputime(p, &cputime);
                 cpu->cpu = cputime.utime;
                 break;
         case CPUCLOCK_SCHED:
-               cpu->sched = cputime.sum_exec_runtime + task_delta_exec(p);
+               cpu->sched = thread_group_sched_runtime(p);
                 break;
         }
         return 0;
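Note on the first posix-cpu-timers.c hunk above: update_rlimit_cpu() re-arms
the process CPU timer when no expiry is set, or, after this change, when the
current expiry lies later than the new limit (cputime_gt instead of
cputime_lt). The sketch below restates that condition in userspace with plain
integers. The typedef and the demo_* name are hypothetical, and zero is used
as the "no expiry" value in the same way the hunk compares against
cputime_zero.

```c
/* Hypothetical stand-in for cputime_t; the unit does not matter here. */
typedef unsigned long long demo_cputime_t;

/*
 * Decide whether the timer must be re-armed for a new CPU limit:
 * either nothing is armed yet (0), or the armed expiry is later than
 * the new limit and would therefore fire too late.
 */
static int demo_should_rearm(demo_cputime_t cur_expires,
			     demo_cputime_t new_limit)
{
	return cur_expires == 0 || cur_expires > new_limit;
}
```

An expiry that is already earlier than the new limit is left alone, since it
will fire before the limit is reached in any case.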
diff --git a/kernel/ptrace.c b/kernel/ptrace.c
index aaad0ec341948690d8d8cad7ad22ee0600035fad..64191fa09b7e9eb8c5672caa0b3764dc9a13a354 100644
--- a/kernel/ptrace.c
+++ b/kernel/ptrace.c
@@ -21,9 +21,7 @@
  #include <linux/audit.h>
  #include <linux/pid_namespace.h>
  #include <linux/syscalls.h>
-
-#include <asm/pgtable.h>
-#include <asm/uaccess.h>
+#include <linux/uaccess.h>
  
  
  /*
@@ -48,7 +46,7 @@ void __ptrace_link(struct task_struct *child, struct task_struct *new_parent)
         list_add(&child->ptrace_entry, &new_parent->ptraced);
         child->parent = new_parent;
  }
- 
+
  /*
   * Turn a tracing stop into a normal stop now, since with no tracer there
   * would be no way to wake it up with SIGCONT or SIGKILL.  If there was a
@@ -173,7 +171,7 @@ bool ptrace_may_access(struct task_struct *task, unsigned int mode)
         task_lock(task);
         err = __ptrace_may_access(task, mode);
         task_unlock(task);
-       return (!err ? true : false);
+       return !err;
  }
  
  int ptrace_attach(struct task_struct *task)
@@ -358,7 +356,7 @@ int ptrace_readdata(struct task_struct *tsk, unsigned long src, char __user *dst
                 copied += retval;
                 src += retval;
                 dst += retval;
-               len -= retval;                  
+               len -= retval;
         }
         return copied;
  }
@@ -383,7 +381,7 @@ int ptrace_writedata(struct task_struct *tsk, char __user *src, unsigned long ds
                 copied += retval;
                 src += retval;
                 dst += retval;
-               len -= retval;                  
+               len -= retval;
         }
         return copied;
  }
@@ -496,9 +494,9 @@ static int ptrace_resume(struct task_struct *child, long request, long data)
                 if (unlikely(!arch_has_single_step()))
                         return -EIO;
                 user_enable_single_step(child);
-       }
-       else
+       } else {
                 user_disable_single_step(child);
+       }
  
         child->exit_code = data;
         wake_up_process(child);
diff --git a/kernel/sched.c b/kernel/sched.c
index 6cc1fd5d5072b69638c562d7e01697d4c9870684..5724508c3b66b30d8182f32bc0cde560f790ba01 100644
--- a/kernel/sched.c
+++ b/kernel/sched.c
@@ -1418,10 +1418,22 @@ iter_move_one_task(struct rq *this_rq, int this_cpu, struct rq *busiest,
                    struct rq_iterator *iterator);
  #endif
  
+/* Time spent by the tasks of the cpu accounting group executing in ... */
+enum cpuacct_stat_index {
+       CPUACCT_STAT_USER,      /* ... user mode */
+       CPUACCT_STAT_SYSTEM,    /* ... kernel mode */
+
+       CPUACCT_STAT_NSTATS,
+};
+
  #ifdef CONFIG_CGROUP_CPUACCT
  static void cpuacct_charge(struct task_struct *tsk, u64 cputime);
+static void cpuacct_update_stats(struct task_struct *tsk,
+               enum cpuacct_stat_index idx, cputime_t val);
  #else
  static inline void cpuacct_charge(struct task_struct *tsk, u64 cputime) {}
+static inline void cpuacct_update_stats(struct task_struct *tsk,
+               enum cpuacct_stat_index idx, cputime_t val) {}
  #endif
  
  static inline void inc_cpu_load(struct rq *rq, unsigned long load)
@@ -4511,9 +4523,25 @@ DEFINE_PER_CPU(struct kernel_stat, kstat);
  EXPORT_PER_CPU_SYMBOL(kstat);
  
  /*
- * Return any ns on the sched_clock that have not yet been banked in
+ * Return any ns on the sched_clock that have not yet been accounted in
   * @p in case that task is currently running.
+ *
+ * Called with task_rq_lock() held on @rq.
   */
+static u64 do_task_delta_exec(struct task_struct *p, struct rq *rq)
+{
+       u64 ns = 0;
+
+       if (task_current(rq, p)) {
+               update_rq_clock(rq);
+               ns = rq->clock - p->se.exec_start;
+               if ((s64)ns < 0)
+                       ns = 0;
+       }
+
+       return ns;
+}
+
  unsigned long long task_delta_exec(struct task_struct *p)
  {
         unsigned long flags;
@@ -4521,16 +4549,49 @@ unsigned long long task_delta_exec(struct task_struct *p)
         u64 ns = 0;
  
         rq = task_rq_lock(p, &flags);
+       ns = do_task_delta_exec(p, rq);
+       task_rq_unlock(rq, &flags);
  
-       if (task_current(rq, p)) {
-               u64 delta_exec;
+       return ns;
+}
  
-               update_rq_clock(rq);
-               delta_exec = rq->clock - p->se.exec_start;
-               if ((s64)delta_exec > 0)
-                       ns = delta_exec;
-       }
+/*
+ * Return accounted runtime for the task.
+ * In case the task is currently running, return the runtime plus current's
+ * pending runtime that has not been accounted yet.
+ */
+unsigned long long task_sched_runtime(struct task_struct *p)
+{
+       unsigned long flags;
+       struct rq *rq;
+       u64 ns = 0;
+
+       rq = task_rq_lock(p, &flags);
+       ns = p->se.sum_exec_runtime + do_task_delta_exec(p, rq);
+       task_rq_unlock(rq, &flags);
+
+       return ns;
+}
+
+/*
+ * Return sum_exec_runtime for the thread group.
+ * In case the task is currently running, return the sum plus current's
+ * pending runtime that has not been accounted yet.
+ *
+ * Note that the thread group might have other running tasks as well,
+ * so the return value does not include the pending runtime that other
+ * running tasks might have.
+ */
+unsigned long long thread_group_sched_runtime(struct task_struct *p)
+{
+       struct task_cputime totals;
+       unsigned long flags;
+       struct rq *rq;
+       u64 ns;
  
+       rq = task_rq_lock(p, &flags);
+       thread_group_cputime(p, &totals);
+       ns = totals.sum_exec_runtime + do_task_delta_exec(p, rq);
         task_rq_unlock(rq, &flags);
  
         return ns;
@@ -4559,6 +4620,8 @@ void account_user_time(struct task_struct *p, cputime_t cputime,
                 cpustat->nice = cputime64_add(cpustat->nice, tmp);
         else
                 cpustat->user = cputime64_add(cpustat->user, tmp);
+
+       cpuacct_update_stats(p, CPUACCT_STAT_USER, cputime);
         /* Account for user time used */
         acct_update_integrals(p);
  }
@@ -4620,6 +4683,8 @@ void account_system_time(struct task_struct *p, int hardirq_offset,
         else
                 cpustat->system = cputime64_add(cpustat->system, tmp);
  
+       cpuacct_update_stats(p, CPUACCT_STAT_SYSTEM, cputime);
+
         /* Account for system time used */
         acct_update_integrals(p);
  }
@@ -7302,7 +7367,8 @@ static int sched_domain_debug_one(struct sched_domain *sd, int cpu, int level,
                 cpumask_or(groupmask, groupmask, sched_group_cpus(group));
  
                 cpulist_scnprintf(str, sizeof(str), sched_group_cpus(group));
-               printk(KERN_CONT " %s", str);
+               printk(KERN_CONT " %s (__cpu_power = %d)", str,
+                                               group->__cpu_power);
  
                 group = group->next;
         } while (group != sd->groups);
@@ -9925,6 +9991,7 @@ struct cpuacct {
         struct cgroup_subsys_state css;
         /* cpuusage holds pointer to a u64-type object on every cpu */
         u64 *cpuusage;
+       struct percpu_counter cpustat[CPUACCT_STAT_NSTATS];
         struct cpuacct *parent;
  };
  
@@ -9949,20 +10016,32 @@ static struct cgroup_subsys_state *cpuacct_create(
         struct cgroup_subsys *ss, struct cgroup *cgrp)
  {
         struct cpuacct *ca = kzalloc(sizeof(*ca), GFP_KERNEL);
+       int i;
  
         if (!ca)
-               return ERR_PTR(-ENOMEM);
+               goto out;
  
         ca->cpuusage = alloc_percpu(u64);
-       if (!ca->cpuusage) {
-               kfree(ca);
-               return ERR_PTR(-ENOMEM);
-       }
+       if (!ca->cpuusage)
+               goto out_free_ca;
+
+       for (i = 0; i < CPUACCT_STAT_NSTATS; i++)
+               if (percpu_counter_init(&ca->cpustat[i], 0))
+                       goto out_free_counters;
  
         if (cgrp->parent)
                 ca->parent = cgroup_ca(cgrp->parent);
  
         return &ca->css;
+
+out_free_counters:
+       while (--i >= 0)
+               percpu_counter_destroy(&ca->cpustat[i]);
+       free_percpu(ca->cpuusage);
+out_free_ca:
+       kfree(ca);
+out:
+       return ERR_PTR(-ENOMEM);
  }
  
  /* destroy an existing cpu accounting group */
@@ -9970,7 +10049,10 @@ static void
  cpuacct_destroy(struct cgroup_subsys *ss, struct cgroup *cgrp)
  {
         struct cpuacct *ca = cgroup_ca(cgrp);
+       int i;
  
+       for (i = 0; i < CPUACCT_STAT_NSTATS; i++)
+               percpu_counter_destroy(&ca->cpustat[i]);
         free_percpu(ca->cpuusage);
         kfree(ca);
  }
@@ -10057,6 +10139,25 @@ static int cpuacct_percpu_seq_read(struct cgroup *cgroup, struct cftype *cft,
         return 0;
  }
  
+static const char *cpuacct_stat_desc[] = {
+       [CPUACCT_STAT_USER] = "user",
+       [CPUACCT_STAT_SYSTEM] = "system",
+};
+
+static int cpuacct_stats_show(struct cgroup *cgrp, struct cftype *cft,
+               struct cgroup_map_cb *cb)
+{
+       struct cpuacct *ca = cgroup_ca(cgrp);
+       int i;
+
+       for (i = 0; i < CPUACCT_STAT_NSTATS; i++) {
+               s64 val = percpu_counter_read(&ca->cpustat[i]);
+               val = cputime64_to_clock_t(val);
+               cb->fill(cb, cpuacct_stat_desc[i], val);
+       }
+       return 0;
+}
+
  static struct cftype files[] = {
         {
                 .name = "usage",
@@ -10067,7 +10168,10 @@ static struct cftype files[] = {
                 .name = "usage_percpu",
                 .read_seq_string = cpuacct_percpu_seq_read,
         },
-
+       {
+               .name = "stat",
+               .read_map = cpuacct_stats_show,
+       },
  };
  
  static int cpuacct_populate(struct cgroup_subsys *ss, struct cgroup *cgrp)
@@ -10089,12 +10193,38 @@ static void cpuacct_charge(struct task_struct *tsk, u64 cputime)
                 return;
  
         cpu = task_cpu(tsk);
+
+       rcu_read_lock();
+
         ca = task_ca(tsk);
  
         for (; ca; ca = ca->parent) {
                 u64 *cpuusage = per_cpu_ptr(ca->cpuusage, cpu);
                 *cpuusage += cputime;
         }
+
+       rcu_read_unlock();
+}
+
+/*
+ * Charge the system/user time to the task's accounting group.
+ */
+static void cpuacct_update_stats(struct task_struct *tsk,
+               enum cpuacct_stat_index idx, cputime_t val)
+{
+       struct cpuacct *ca;
+
+       if (unlikely(!cpuacct_subsys.active))
+               return;
+
+       rcu_read_lock();
+       ca = task_ca(tsk);
+
+       do {
+               percpu_counter_add(&ca->cpustat[idx], val);
+               ca = ca->parent;
+       } while (ca);
+       rcu_read_unlock();
  }
  
  struct cgroup_subsys cpuacct_subsys = {
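
Note on the cpuacct_create() hunk above: the new error path unwinds in reverse order of setup, and `while (--i >= 0)` destroys only the counters that were actually initialized. The following is a minimal userspace sketch of that pattern, not the kernel code itself. `struct acct`, `demo_alloc()`, `demo_free()`, `acct_live_allocs()` and the `fail_at` injection parameter are hypothetical names introduced for illustration; plain heap allocations stand in for alloc_percpu() and percpu_counter_init().

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define NSTATS 2	/* stands in for CPUACCT_STAT_NSTATS */

struct acct {
	long *usage;		/* stands in for ca->cpuusage */
	long *stat[NSTATS];	/* stands in for ca->cpustat[] */
};

static int live_allocs;	/* outstanding allocations, tracked for the demo only */

static void *demo_alloc(size_t size)
{
	void *p = calloc(1, size);

	if (p)
		live_allocs++;
	return p;
}

static void demo_free(void *p)
{
	if (p)
		live_allocs--;
	free(p);
}

int acct_live_allocs(void)
{
	return live_allocs;
}

/* fail_at < 0: no injected failure; otherwise setting up stat[fail_at] fails. */
struct acct *acct_create(int fail_at)
{
	struct acct *a = demo_alloc(sizeof(*a));
	int i;

	if (!a)
		goto out;

	a->usage = demo_alloc(sizeof(*a->usage));
	if (!a->usage)
		goto out_free_a;

	for (i = 0; i < NSTATS; i++) {
		a->stat[i] = (i == fail_at) ? NULL : demo_alloc(sizeof(long));
		if (!a->stat[i])
			goto out_free_stats;
	}
	return a;

out_free_stats:
	while (--i >= 0)	/* undo only the entries that were set up */
		demo_free(a->stat[i]);
	demo_free(a->usage);
out_free_a:
	demo_free(a);
out:
	return NULL;
}

void acct_destroy(struct acct *a)
{
	int i;

	assert(a != NULL);
	for (i = 0; i < NSTATS; i++)
		demo_free(a->stat[i]);
	demo_free(a->usage);
	demo_free(a);
}
```

Each label releases exactly what was acquired before the jump to it, so a failure at any step leaves nothing allocated.
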
diff --git a/kernel/sched_cpupri.c b/kernel/sched_cpupri.c

index 1e00bfacf9b851d35cf589fe1baf3fb6773068e8..cdd3c89574cd759ebe3dc6e37499974598d873d5 100644 (file)
--- a/kernel/sched_cpupri.c
+++ b/kernel/sched_cpupri.c
@@ -55,7 +55,7 @@ static int convert_prio(int prio)
   * cpupri_find - find the best (lowest-pri) CPU in the system
   * @cp: The cpupri context
   * @p: The task
- * @lowest_mask: A mask to fill in with selected CPUs
+ * @lowest_mask: A mask to fill in with selected CPUs (or NULL)
   *
   * Note: This function returns the recommended CPUs as calculated during the
   * current invokation.  By the time the call returns, the CPUs may have in
@@ -81,7 +81,8 @@ int cpupri_find(struct cpupri *cp, struct task_struct *p,
                 if (cpumask_any_and(&p->cpus_allowed, vec->mask) >= nr_cpu_ids)
                         continue;
  
-               cpumask_and(lowest_mask, &p->cpus_allowed, vec->mask);
+               if (lowest_mask)
+                       cpumask_and(lowest_mask, &p->cpus_allowed, vec->mask);
                 return 1;
         }
  
diff --git a/kernel/sched_rt.c b/kernel/sched_rt.c

index 299d012b4394e8c62d3a677502e41c41802ab444..f2c66f8f9712d218e4849a77ad147bbd65124959 100644 (file)
--- a/kernel/sched_rt.c
+++ b/kernel/sched_rt.c
@@ -948,20 +948,15 @@ static int select_task_rq_rt(struct task_struct *p, int sync)
  
  static void check_preempt_equal_prio(struct rq *rq, struct task_struct *p)
  {
-       cpumask_var_t mask;
-
         if (rq->curr->rt.nr_cpus_allowed == 1)
                 return;
  
-       if (!alloc_cpumask_var(&mask, GFP_ATOMIC))
-               return;
-
         if (p->rt.nr_cpus_allowed != 1
-           && cpupri_find(&rq->rd->cpupri, p, mask))
-               goto free;
+           && cpupri_find(&rq->rd->cpupri, p, NULL))
+               return;
  
-       if (!cpupri_find(&rq->rd->cpupri, rq->curr, mask))
-               goto free;
+       if (!cpupri_find(&rq->rd->cpupri, rq->curr, NULL))
+               return;
  
         /*
          * There appears to be other cpus that can accept
@@ -970,8 +965,6 @@ static void check_preempt_equal_prio(struct rq *rq, struct task_struct *p)
          */
         requeue_task_rt(rq, p, 1);
         resched_task(rq->curr);
-free:
-       free_cpumask_var(mask);
  }
  
  #endif /* CONFIG_SMP */
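
Note on the cpupri_find() and check_preempt_equal_prio() hunks above: making the output mask optional lets a caller that only needs the yes/no answer pass NULL and skip the GFP_ATOMIC allocation. A minimal sketch of the same optional-output-parameter idea follows, assuming a plain `unsigned long` bitmask in place of a cpumask. `find_common()` is a hypothetical name, not a kernel function.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Returns 1 if 'allowed' and 'candidates' share at least one bit, 0 otherwise.
 * The intersection is stored in *out only when the caller supplies a buffer.
 */
int find_common(unsigned long allowed, unsigned long candidates,
		unsigned long *out)
{
	if ((allowed & candidates) == 0)
		return 0;

	if (out)
		*out = allowed & candidates;
	return 1;
}
```

A caller that only tests for existence calls `find_common(a, c, NULL)` and never needs scratch storage.
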
diff --git a/kernel/timer.c b/kernel/timer.c

index b4555568b4e4ad16f34a887eabed6f21e05abfba..cffffad01c31cf1670490f2860f0e5de0f25d301 100644 (file)
--- a/kernel/timer.c
+++ b/kernel/timer.c
@@ -531,10 +531,13 @@ static void __init_timer(struct timer_list *timer,
  }
  
  /**
- * init_timer - initialize a timer.
+ * init_timer_key - initialize a timer
   * @timer: the timer to be initialized
+ * @name: name of the timer
+ * @key: lockdep class key of the fake lock used for tracking timer
+ *       sync lock dependencies
   *
- * init_timer() must be done to a timer prior calling *any* of the
+ * init_timer_key() must be done to a timer prior calling *any* of the
   * other timer functions.
   */
  void init_timer_key(struct timer_list *timer,
diff --git a/kernel/trace/blktrace.c b/kernel/trace/blktrace.c

index b32ff446c3fb3c1d4e88829ac46a9e0adc705031..921ef5d1f0ba95e7497faa55afb293c50ae7ee47 100644 (file)
--- a/kernel/trace/blktrace.c
+++ b/kernel/trace/blktrace.c
@@ -1377,12 +1377,12 @@ static int blk_trace_str2mask(const char *str)
  {
         int i;
         int mask = 0;
-       char *s, *token;
+       char *buf, *s, *token;
  
-       s = kstrdup(str, GFP_KERNEL);
-       if (s == NULL)
+       buf = kstrdup(str, GFP_KERNEL);
+       if (buf == NULL)
                 return -ENOMEM;
-       s = strstrip(s);
+       s = strstrip(buf);
  
         while (1) {
                 token = strsep(&s, ",");
@@ -1403,7 +1403,7 @@ static int blk_trace_str2mask(const char *str)
                         break;
                 }
         }
-       kfree(s);
+       kfree(buf);
  
         return mask;
  }
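
Note on the blk_trace_str2mask() hunks above: strstrip() may return a pointer past the start of the buffer, and strsep() keeps advancing its argument until it becomes NULL, so the pointer returned by kstrdup() has to be kept separately for kfree(). The userspace sketch below shows the same shape, assuming strdup()/free() in place of kstrdup()/kfree(). `strip()` is a hypothetical stand-in for strstrip(), and `count_tokens()` is an illustrative function, not part of blktrace.

```c
#define _DEFAULT_SOURCE	/* for strsep() and strdup() on glibc */
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical userspace stand-in for the kernel's strstrip(). */
static char *strip(char *s)
{
	size_t len;

	while (isspace((unsigned char)*s))
		s++;
	len = strlen(s);
	while (len > 0 && isspace((unsigned char)s[len - 1]))
		s[--len] = '\0';
	return s;
}

/* Count non-empty comma-separated tokens; returns -1 if allocation fails. */
int count_tokens(const char *str)
{
	char *buf, *s, *token;
	int n = 0;

	buf = strdup(str);
	if (buf == NULL)
		return -1;
	s = strip(buf);		/* s may now point past the start of buf */

	while ((token = strsep(&s, ",")) != NULL) {
		if (*token != '\0')
			n++;
	}

	/* strsep() has set s to NULL; only buf is valid to pass to free(). */
	assert(s == NULL);
	free(buf);
	return n;
}
```
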
diff --git a/kernel/trace/trace_syscalls.c b/kernel/trace/trace_syscalls.c

index a2a3af29c94337bef68f64608a85977a2fe39ce8..5e579645ac86dbbf2c3800fde221b9f60db83d85 100644 (file)
--- a/kernel/trace/trace_syscalls.c
+++ b/kernel/trace/trace_syscalls.c
@@ -1,5 +1,5 @@
+#include <trace/syscall.h>
  #include <linux/kernel.h>
-#include <linux/ftrace.h>
  #include <asm/syscall.h>
  
  #include "trace_output.h"
diff --git a/kernel/workqueue.c b/kernel/workqueue.c

index b6b966ce1451adf7c71fffd38524e68587f7504e..f71fb2a089503534eec8d8790f82d9702a3545cc 100644 (file)
--- a/kernel/workqueue.c
+++ b/kernel/workqueue.c
@@ -966,20 +966,20 @@ undo:
  }
  
  #ifdef CONFIG_SMP
-static struct workqueue_struct *work_on_cpu_wq __read_mostly;
  
  struct work_for_cpu {
-       struct work_struct work;
+       struct completion completion;
         long (*fn)(void *);
         void *arg;
         long ret;
  };
  
-static void do_work_for_cpu(struct work_struct *w)
+static int do_work_for_cpu(void *_wfc)
  {
-       struct work_for_cpu *wfc = container_of(w, struct work_for_cpu, work);
-
+       struct work_for_cpu *wfc = _wfc;
         wfc->ret = wfc->fn(wfc->arg);
+       complete(&wfc->completion);
+       return 0;
  }
  
  /**
@@ -990,17 +990,23 @@ static void do_work_for_cpu(struct work_struct *w)
   *
   * This will return the value @fn returns.
   * It is up to the caller to ensure that the cpu doesn't go offline.
+ * The caller must not hold any locks which would prevent @fn from completing.
   */
  long work_on_cpu(unsigned int cpu, long (*fn)(void *), void *arg)
  {
-       struct work_for_cpu wfc;
-
-       INIT_WORK(&wfc.work, do_work_for_cpu);
-       wfc.fn = fn;
-       wfc.arg = arg;
-       queue_work_on(cpu, work_on_cpu_wq, &wfc.work);
-       flush_work(&wfc.work);
-
+       struct task_struct *sub_thread;
+       struct work_for_cpu wfc = {
+               .completion = COMPLETION_INITIALIZER_ONSTACK(wfc.completion),
+               .fn = fn,
+               .arg = arg,
+       };
+
+       sub_thread = kthread_create(do_work_for_cpu, &wfc, "work_for_cpu");
+       if (IS_ERR(sub_thread))
+               return PTR_ERR(sub_thread);
+       kthread_bind(sub_thread, cpu);
+       wake_up_process(sub_thread);
+       wait_for_completion(&wfc.completion);
         return wfc.ret;
  }
  EXPORT_SYMBOL_GPL(work_on_cpu);
@@ -1016,8 +1022,4 @@ void __init init_workqueues(void)
         hotcpu_notifier(workqueue_cpu_callback, 0);
         keventd_wq = create_workqueue("events");
         BUG_ON(!keventd_wq);
-#ifdef CONFIG_SMP
-       work_on_cpu_wq = create_workqueue("work_on_cpu");
-       BUG_ON(!work_on_cpu_wq);
-#endif
  }
diff --git a/lib/vsprintf.c b/lib/vsprintf.c

index be3001f912e4d906999242971f42146ebc07cc43..7536acea135ba4069c1dc04ad36d5024ac747d8c 100644 (file)
--- a/lib/vsprintf.c
+++ b/lib/vsprintf.c
@@ -1051,13 +1051,6 @@ int vsnprintf(char *buf, size_t size, const char *fmt, va_list args)
                         if (str < end)
                                 *str = '%';
                         ++str;
-                       if (*fmt) {
-                               if (str < end)
-                                       *str = *fmt;
-                               ++str;
-                       } else {
-                               --fmt;
-                       }
                         break;
  
                 case FORMAT_TYPE_NRCHARS: {
@@ -1339,8 +1332,6 @@ do {                                                                      \
                         break;
  
                 case FORMAT_TYPE_INVALID:
-                       if (!*fmt)
-                               --fmt;
                         break;
  
                 case FORMAT_TYPE_NRCHARS: {
@@ -1523,13 +1514,6 @@ int bstr_printf(char *buf, size_t size, const char *fmt, const u32 *bin_buf)
                         if (str < end)
                                 *str = '%';
                         ++str;
-                       if (*fmt) {
-                               if (str < end)
-                                       *str = *fmt;
-                               ++str;
-                       } else {
-                               --fmt;
-                       }
                         break;
  
                 case FORMAT_TYPE_NRCHARS:
diff --git a/net/ipv6/netfilter/ip6_tables.c b/net/ipv6/netfilter/ip6_tables.c

index dfed176aed37a05698c62f887e0227d95c1faff4..800ae854247163f8890ea244060835eccc5f0a99 100644 (file)
--- a/net/ipv6/netfilter/ip6_tables.c
+++ b/net/ipv6/netfilter/ip6_tables.c
@@ -1033,6 +1033,8 @@ static struct xt_counters *alloc_counters(struct xt_table *table)
  
         xt_free_table_info(info);
  
+       return counters;
+
   free_counters:
         vfree(counters);
   nomem:
diff --git a/net/netfilter/Kconfig b/net/netfilter/Kconfig

index bb279bf59a1b9147b5951268ec70d48662eb50f2..2329c5f5055195ddaf81f6286d5510dab7e2ee5f 100644 (file)
--- a/net/netfilter/Kconfig
+++ b/net/netfilter/Kconfig
@@ -374,7 +374,7 @@ config NETFILTER_XT_TARGET_HL
  
  config NETFILTER_XT_TARGET_LED
         tristate '"LED" target support'
-       depends on LEDS_CLASS && LED_TRIGGERS
+       depends on LEDS_CLASS && LEDS_TRIGGERS
         depends on NETFILTER_ADVANCED
         help
           This option adds a `LED' target, which allows you to blink LEDs in
diff --git a/net/netfilter/nf_conntrack_expect.c b/net/netfilter/nf_conntrack_expect.c

index 3940f996a2e4ac180b8ed135e9f7c423f670b66d..afde8f991646eadfa6afc72d6390fb27b5ecaf61 100644 (file)
--- a/net/netfilter/nf_conntrack_expect.c
+++ b/net/netfilter/nf_conntrack_expect.c
@@ -372,7 +372,7 @@ static inline int __nf_ct_expect_check(struct nf_conntrack_expect *expect)
         struct net *net = nf_ct_exp_net(expect);
         struct hlist_node *n;
         unsigned int h;
-       int ret = 0;
+       int ret = 1;
  
         if (!master_help->helper) {
                 ret = -ESHUTDOWN;
@@ -412,41 +412,23 @@ out:
         return ret;
  }
  
-int nf_ct_expect_related(struct nf_conntrack_expect *expect)
+int nf_ct_expect_related_report(struct nf_conntrack_expect *expect, 
+                               u32 pid, int report)
  {
         int ret;
  
         spin_lock_bh(&nf_conntrack_lock);
         ret = __nf_ct_expect_check(expect);
-       if (ret < 0)
+       if (ret <= 0)
                 goto out;
  
+       ret = 0;
         nf_ct_expect_insert(expect);
-       atomic_inc(&expect->use);
-       spin_unlock_bh(&nf_conntrack_lock);
-       nf_ct_expect_event(IPEXP_NEW, expect);
-       nf_ct_expect_put(expect);
-       return ret;
-out:
         spin_unlock_bh(&nf_conntrack_lock);
+       nf_ct_expect_event_report(IPEXP_NEW, expect, pid, report);
         return ret;
-}
-EXPORT_SYMBOL_GPL(nf_ct_expect_related);
-
-int nf_ct_expect_related_report(struct nf_conntrack_expect *expect, 
-                               u32 pid, int report)
-{
-       int ret;
-
-       spin_lock_bh(&nf_conntrack_lock);
-       ret = __nf_ct_expect_check(expect);
-       if (ret < 0)
-               goto out;
-       nf_ct_expect_insert(expect);
  out:
         spin_unlock_bh(&nf_conntrack_lock);
-       if (ret == 0)
-               nf_ct_expect_event_report(IPEXP_NEW, expect, pid, report);
         return ret;
  }
  EXPORT_SYMBOL_GPL(nf_ct_expect_related_report);
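
Note on the nf_ct_expect_related_report() hunk above: the check helper now starts from `ret = 1`, and the caller treats any `ret <= 0` as "do not insert", then resets `ret` to 0 before inserting. The sketch below models that three-way convention in userspace. It rests on an assumption the hunk does not show: that a return of 0 from the check means "nothing to insert, but not an error". `check()`, `expect_related()` and their parameters are hypothetical names.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical check: <0 is an error, 0 means nothing to do, 1 means proceed. */
static int check(int have_helper, int already_present)
{
	int ret = 1;

	if (!have_helper)
		ret = -ESHUTDOWN;
	else if (already_present)
		ret = 0;	/* assumed meaning; not visible in the hunk */
	return ret;
}

/* Mirrors the control flow of the patched function; the counters replace the real side effects. */
int expect_related(int have_helper, int already_present,
		   int *inserted, int *events)
{
	int ret;

	ret = check(have_helper, already_present);
	if (ret <= 0)
		goto out;

	ret = 0;
	(*inserted)++;	/* stands in for nf_ct_expect_insert() */
	(*events)++;	/* stands in for nf_ct_expect_event_report() */
	return ret;
out:
	return ret;
}
```

With this shape the event fires only when an insertion actually happened, while callers still see 0 for both successful outcomes.
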
diff --git a/security/commoncap.c b/security/commoncap.c

index 7cd61a5f520557f1991e6e2d093c3c4133ed4307..beac0258c2a8f3a0cbad52ec9adf12e019264626 100644 (file)
--- a/security/commoncap.c
+++ b/security/commoncap.c
@@ -916,7 +916,6 @@ changed:
         return commit_creds(new);
  
  no_change:
-       error = 0;
  error:
         abort_creds(new);
         return error;
diff --git a/sound/sparc/cs4231.c b/sound/sparc/cs4231.c

index 7d93fa705ccf493135b36cefff7ea242812ca11b..8d13d933087dee455186d568e2ba4364988225ba 100644 (file)
--- a/sound/sparc/cs4231.c
+++ b/sound/sparc/cs4231.c
@@ -703,7 +703,7 @@ static int snd_cs4231_timer_stop(struct snd_timer *timer)
         return 0;
  }
  
-static void __init snd_cs4231_init(struct snd_cs4231 *chip)
+static void __devinit snd_cs4231_init(struct snd_cs4231 *chip)
  {
         unsigned long flags;
  
@@ -1020,7 +1020,7 @@ static snd_pcm_uframes_t snd_cs4231_capture_pointer(
         return bytes_to_frames(substream->runtime, ptr);
  }
  
-static int __init snd_cs4231_probe(struct snd_cs4231 *chip)
+static int __devinit snd_cs4231_probe(struct snd_cs4231 *chip)
  {
         unsigned long flags;
         int i;
@@ -1219,7 +1219,7 @@ static struct snd_pcm_ops snd_cs4231_capture_ops = {
         .pointer        =       snd_cs4231_capture_pointer,
  };
  
-static int __init snd_cs4231_pcm(struct snd_card *card)
+static int __devinit snd_cs4231_pcm(struct snd_card *card)
  {
         struct snd_cs4231 *chip = card->private_data;
         struct snd_pcm *pcm;
@@ -1248,7 +1248,7 @@ static int __init snd_cs4231_pcm(struct snd_card *card)
         return 0;
  }
  
-static int __init snd_cs4231_timer(struct snd_card *card)
+static int __devinit snd_cs4231_timer(struct snd_card *card)
  {
         struct snd_cs4231 *chip = card->private_data;
         struct snd_timer *timer;
@@ -1499,7 +1499,7 @@ static int snd_cs4231_put_double(struct snd_kcontrol *kcontrol,
    .private_value = (left_reg) | ((right_reg) << 8) | ((shift_left) << 16) | \
                    ((shift_right) << 19) | ((mask) << 24) | ((invert) << 22) }
  
-static struct snd_kcontrol_new snd_cs4231_controls[] __initdata = {
+static struct snd_kcontrol_new snd_cs4231_controls[] __devinitdata = {
  CS4231_DOUBLE("PCM Playback Switch", 0, CS4231_LEFT_OUTPUT,
                 CS4231_RIGHT_OUTPUT, 7, 7, 1, 1),
  CS4231_DOUBLE("PCM Playback Volume", 0, CS4231_LEFT_OUTPUT,
@@ -1538,7 +1538,7 @@ CS4231_SINGLE("Line Out Switch", 0, CS4231_PIN_CTRL, 6, 1, 1),
  CS4231_SINGLE("Headphone Out Switch", 0, CS4231_PIN_CTRL, 7, 1, 1)
  };
  
-static int __init snd_cs4231_mixer(struct snd_card *card)
+static int __devinit snd_cs4231_mixer(struct snd_card *card)
  {
         struct snd_cs4231 *chip = card->private_data;
         int err, idx;
@@ -1559,7 +1559,7 @@ static int __init snd_cs4231_mixer(struct snd_card *card)
  
  static int dev;
  
-static int __init cs4231_attach_begin(struct snd_card **rcard)
+static int __devinit cs4231_attach_begin(struct snd_card **rcard)
  {
         struct snd_card *card;
         struct snd_cs4231 *chip;
@@ -1590,7 +1590,7 @@ static int __init cs4231_attach_begin(struct snd_card **rcard)
         return 0;
  }
  
-static int __init cs4231_attach_finish(struct snd_card *card)
+static int __devinit cs4231_attach_finish(struct snd_card *card)
  {
         struct snd_cs4231 *chip = card->private_data;
         int err;
@@ -1794,9 +1794,9 @@ static struct snd_device_ops snd_cs4231_sbus_dev_ops = {
         .dev_free       =       snd_cs4231_sbus_dev_free,
  };
  
-static int __init snd_cs4231_sbus_create(struct snd_card *card,
-                                        struct of_device *op,
-                                        int dev)
+static int __devinit snd_cs4231_sbus_create(struct snd_card *card,
+                                           struct of_device *op,
+                                           int dev)
  {
         struct snd_cs4231 *chip = card->private_data;
         int err;
@@ -1960,9 +1960,9 @@ static struct snd_device_ops snd_cs4231_ebus_dev_ops = {
         .dev_free       =       snd_cs4231_ebus_dev_free,
  };
  
-static int __init snd_cs4231_ebus_create(struct snd_card *card,
-                                        struct of_device *op,
-                                        int dev)
+static int __devinit snd_cs4231_ebus_create(struct snd_card *card,
+                                           struct of_device *op,
+                                           int dev)
  {
         struct snd_cs4231 *chip = card->private_data;
         int err;
Documentation/cgroups/cpuacct.txt
Documentation/ftrace.txt [deleted file]
Documentation/trace/ftrace.txt [new file with mode: 0644]
Documentation/trace/kmemtrace.txt [new file with mode: 0644]
Documentation/trace/mmiotrace.txt [new file with mode: 0644]
Documentation/trace/tracepoints.txt [new file with mode: 0644]
Documentation/tracepoints.txt [deleted file]
Documentation/tracers/mmiotrace.txt [deleted file]
Documentation/vm/kmemtrace.txt [deleted file]
MAINTAINERS
arch/arm/configs/magician_defconfig
arch/arm/include/asm/sizes.h
arch/arm/mach-at91/include/mach/board.h
arch/arm/mach-omap1/clock.c
arch/arm/mach-pxa/Kconfig
arch/arm/mach-pxa/Makefile
arch/arm/mach-pxa/cm-x2xx.c
arch/arm/mach-pxa/colibri-pxa300.c
arch/arm/mach-pxa/colibri-pxa320.c
arch/arm/mach-pxa/colibri-pxa3xx.c
arch/arm/mach-pxa/csb701.c
arch/arm/mach-pxa/e740.c
arch/arm/mach-pxa/e750.c
arch/arm/mach-pxa/e800.c
arch/arm/mach-pxa/em-x270.c
arch/arm/mach-pxa/include/mach/colibri.h
arch/arm/mach-pxa/include/mach/magician.h
arch/arm/mach-pxa/include/mach/palmld.h
arch/arm/mach-pxa/include/mach/palmt5.h
arch/arm/mach-pxa/include/mach/palmte2.h [new file with mode: 0644]
arch/arm/mach-pxa/include/mach/palmtx.h
arch/arm/mach-pxa/magician.c
arch/arm/mach-pxa/mioa701.c
arch/arm/mach-pxa/palmld.c
arch/arm/mach-pxa/palmt5.c
arch/arm/mach-pxa/palmte2.c [new file with mode: 0644]
arch/arm/mach-pxa/palmtx.c
arch/arm/mach-pxa/tosa.c
arch/arm/mm/mmu.c
arch/ia64/include/asm/unistd.h
arch/ia64/kernel/entry.S
arch/mn10300/kernel/irq.c
arch/sparc/include/asm/unistd.h
arch/sparc/kernel/of_device_64.c
arch/sparc/kernel/pci_fire.c
arch/sparc/kernel/pci_psycho.c
arch/sparc/kernel/pci_sabre.c
arch/sparc/kernel/pci_sun4v.c
arch/sparc/kernel/power.c
arch/sparc/kernel/systbls_32.S
arch/sparc/kernel/systbls_64.S
arch/sparc/mm/init_64.c
arch/x86/include/asm/cpufeature.h
arch/x86/kernel/apic/apic.c
arch/x86/kernel/cpu/addon_cpuid_features.c
arch/x86/kernel/cpu/cpufreq/acpi-cpufreq.c
arch/x86/kernel/cpu/cpufreq/longhaul.c
arch/x86/kernel/ftrace.c
arch/x86/kernel/ptrace.c
drivers/acpi/acpica/hwvalid.c
drivers/acpi/battery.c
drivers/acpi/proc.c
drivers/acpi/processor_idle.c
drivers/acpi/scan.c
drivers/acpi/sleep.h
drivers/acpi/thermal.c
drivers/acpi/video.c
drivers/acpi/wakeup.c
drivers/md/dm-ioctl.c
drivers/md/dm-kcopyd.c
drivers/md/dm-linear.c
drivers/md/dm-table.c
drivers/md/dm.c
drivers/md/dm.h
drivers/mmc/core/mmc.c
drivers/mmc/core/sd.c
drivers/mmc/host/imxmmc.c
drivers/mmc/host/mmc_spi.c
drivers/mmc/host/omap_hsmmc.c
drivers/mmc/host/sdhci-pci.c
drivers/mmc/host/sdhci.c
drivers/mmc/host/wbsd.c
drivers/net/Kconfig
drivers/net/Makefile
drivers/net/bnx2.c
drivers/net/eql.c
drivers/net/fec.c
drivers/net/igb/igb_main.c
drivers/net/igbvf/Makefile [new file with mode: 0644]
drivers/net/igbvf/defines.h [new file with mode: 0644]
drivers/net/igbvf/ethtool.c [new file with mode: 0644]
drivers/net/igbvf/igbvf.h [new file with mode: 0644]
drivers/net/igbvf/mbx.c [new file with mode: 0644]
drivers/net/igbvf/mbx.h [new file with mode: 0644]
drivers/net/igbvf/netdev.c [new file with mode: 0644]
drivers/net/igbvf/regs.h [new file with mode: 0644]
drivers/net/igbvf/vf.c [new file with mode: 0644]
drivers/net/igbvf/vf.h [new file with mode: 0644]
drivers/net/mv643xx_eth.c
drivers/net/niu.c
drivers/net/r6040.c
drivers/net/smsc911x.c
drivers/platform/x86/fujitsu-laptop.c
drivers/platform/x86/panasonic-laptop.c
drivers/platform/x86/sony-laptop.c
drivers/platform/x86/wmi.c
drivers/power/pcf50633-charger.c
drivers/power/pda_power.c
drivers/serial/max3100.c [new file with mode: 0644]
drivers/serial/sunsu.c
drivers/usb/host/ohci-at91.c
fs/befs/super.c
fs/buffer.c
fs/ext3/inode.c
fs/proc/task_nommu.c
include/acpi/acpi_bus.h
include/linux/device-mapper.h
include/linux/ftrace.h
include/linux/irq.h
include/linux/kmod.h
include/linux/mfd/pcf50633/core.h
include/linux/mfd/pcf50633/mbc.h
include/linux/pda_power.h
include/linux/sched.h
include/linux/serial_max3100.h [new file with mode: 0644]
include/linux/syscalls.h
include/net/netfilter/nf_conntrack_expect.h
include/trace/syscall.h [new file with mode: 0644]
kernel/fork.c
kernel/futex.c
kernel/irq/numa_migrate.c
kernel/kthread.c
kernel/posix-cpu-timers.c
kernel/ptrace.c
kernel/sched.c
kernel/sched_cpupri.c
kernel/sched_rt.c
kernel/timer.c
kernel/trace/blktrace.c
kernel/trace/trace_syscalls.c
kernel/workqueue.c
lib/vsprintf.c
net/ipv6/netfilter/ip6_tables.c
net/netfilter/Kconfig
net/netfilter/nf_conntrack_expect.c
security/commoncap.c
sound/sparc/cs4231.c