| Ingo Molnar | 133dc4c | 2010-11-16 18:45:39 +0100 | [diff] [blame] | 1 | perf-script-python(1) | 
| Arnaldo Carvalho de Melo | 4778e0e | 2010-05-05 11:23:27 -0300 | [diff] [blame] | 2 | ==================== | 
| Tom Zanussi | cff68e5 | 2010-01-27 02:28:03 -0600 | [diff] [blame] | 3 |  | 
|  | 4 | NAME | 
|  | 5 | ---- | 
| Ingo Molnar | 133dc4c | 2010-11-16 18:45:39 +0100 | [diff] [blame] | 6 | perf-script-python - Process trace data with a Python script | 
| Tom Zanussi | cff68e5 | 2010-01-27 02:28:03 -0600 | [diff] [blame] | 7 |  | 
|  | 8 | SYNOPSIS | 
|  | 9 | -------- | 
|  | 10 | [verse] | 
| Ingo Molnar | 133dc4c | 2010-11-16 18:45:39 +0100 | [diff] [blame] | 11 | 'perf script' [-s [Python]:script[.py] ] | 
| Tom Zanussi | cff68e5 | 2010-01-27 02:28:03 -0600 | [diff] [blame] | 12 |  | 
|  | 13 | DESCRIPTION | 
|  | 14 | ----------- | 
|  | 15 |  | 
| Ingo Molnar | 133dc4c | 2010-11-16 18:45:39 +0100 | [diff] [blame] | 16 | This perf script option is used to process perf script data using perf's | 
| Tom Zanussi | cff68e5 | 2010-01-27 02:28:03 -0600 | [diff] [blame] | 17 | built-in Python interpreter.  It reads and processes the input file and | 
|  | 18 | displays the results of the trace analysis implemented in the given | 
|  | 19 | Python script, if any. | 
|  | 20 |  | 
|  | 21 | A QUICK EXAMPLE | 
|  | 22 | --------------- | 
|  | 23 |  | 
|  | 24 | This section shows the process, start to finish, of creating a working | 
|  | 25 | Python script that aggregates and extracts useful information from a | 
| Ingo Molnar | 133dc4c | 2010-11-16 18:45:39 +0100 | [diff] [blame] | 26 | raw perf script stream.  You can avoid reading the rest of this | 
| Tom Zanussi | cff68e5 | 2010-01-27 02:28:03 -0600 | [diff] [blame] | 27 | document if an example is enough for you; the rest of the document | 
|  | 28 | provides more details on each step and lists the library functions | 
|  | 29 | available to script writers. | 
|  | 30 |  | 
|  | 31 | This example actually details the steps that were used to create the | 
| Ingo Molnar | 133dc4c | 2010-11-16 18:45:39 +0100 | [diff] [blame] | 32 | 'syscall-counts' script you see when you list the available perf script | 
|  | 33 | scripts via 'perf script -l'.  As such, this script also shows how to | 
|  | 34 | integrate your script into the list of general-purpose 'perf script' | 
| Tom Zanussi | cff68e5 | 2010-01-27 02:28:03 -0600 | [diff] [blame] | 35 | scripts listed by that command. | 
|  | 36 |  | 
|  | 37 | The syscall-counts script is a simple script, but demonstrates all the | 
|  | 38 | basic ideas necessary to create a useful script.  Here's an example | 
| Frederic Weisbecker | c2fbaa4 | 2010-02-25 03:03:52 +0100 | [diff] [blame] | 39 | of its output (syscall names are not yet supported, they will appear | 
|  | 40 | as numbers): | 
| Tom Zanussi | cff68e5 | 2010-01-27 02:28:03 -0600 | [diff] [blame] | 41 |  | 
|  | 42 | ---- | 
|  | 43 | syscall events: | 
|  | 44 |  | 
|  | 45 | event                                          count | 
|  | 46 | ----------------------------------------  ----------- | 
|  | 47 | sys_write                                     455067 | 
|  | 48 | sys_getdents                                    4072 | 
|  | 49 | sys_close                                       3037 | 
|  | 50 | sys_swapoff                                     1769 | 
|  | 51 | sys_read                                         923 | 
|  | 52 | sys_sched_setparam                               826 | 
|  | 53 | sys_open                                         331 | 
|  | 54 | sys_newfstat                                     326 | 
|  | 55 | sys_mmap                                         217 | 
|  | 56 | sys_munmap                                       216 | 
|  | 57 | sys_futex                                        141 | 
|  | 58 | sys_select                                       102 | 
|  | 59 | sys_poll                                          84 | 
|  | 60 | sys_setitimer                                     12 | 
|  | 61 | sys_writev                                         8 | 
|  | 62 | 15                                                 8 | 
|  | 63 | sys_lseek                                          7 | 
|  | 64 | sys_rt_sigprocmask                                 6 | 
|  | 65 | sys_wait4                                          3 | 
|  | 66 | sys_ioctl                                          3 | 
|  | 67 | sys_set_robust_list                                1 | 
|  | 68 | sys_exit                                           1 | 
|  | 69 | 56                                                 1 | 
|  | 70 | sys_access                                         1 | 
|  | 71 | ---- | 
|  | 72 |  | 
|  | 73 | Basically our task is to keep a per-syscall tally that gets updated | 
|  | 74 | every time a system call occurs in the system.  Our script will do | 
|  | 75 | that, but first we need to record the data that will be processed by | 
|  | 76 | that script.  Theoretically, there are a couple of ways we could do | 
|  | 77 | that: | 
|  | 78 |  | 
|  | 79 | - we could enable every event under the tracing/events/syscalls | 
|  | 80 | directory, but this is over 600 syscalls, well beyond the number | 
|  | 81 | allowable by perf.  These individual syscall events will however be | 
|  | 82 | useful if we want to later use the guidance we get from the | 
|  | 83 | general-purpose scripts to drill down and get more detail about | 
|  | 84 | individual syscalls of interest. | 
|  | 85 |  | 
|  | 86 | - we can enable the sys_enter and/or sys_exit syscalls found under | 
|  | 87 | tracing/events/raw_syscalls.  These are called for all syscalls; the | 
|  | 88 | 'id' field can be used to distinguish between individual syscall | 
|  | 89 | numbers. | 
|  | 90 |  | 
|  | 91 | For this script, we only need to know that a syscall was entered; we | 
|  | 92 | don't care how it exited, so we'll use 'perf record' to record only | 
|  | 93 | the sys_enter events: | 
|  | 94 |  | 
|  | 95 | ---- | 
| Frederic Weisbecker | e5a5f1f | 2010-04-30 19:55:00 +0200 | [diff] [blame] | 96 | # perf record -a -e raw_syscalls:sys_enter | 
| Tom Zanussi | cff68e5 | 2010-01-27 02:28:03 -0600 | [diff] [blame] | 97 |  | 
|  | 98 | ^C[ perf record: Woken up 1 times to write data ] | 
|  | 99 | [ perf record: Captured and wrote 56.545 MB perf.data (~2470503 samples) ] | 
|  | 100 | ---- | 
|  | 101 |  | 
|  | 102 | The options basically say to collect data for every syscall event | 
|  | 103 | system-wide and multiplex the per-cpu output into a single stream. | 
|  | 104 | That single stream will be recorded in a file in the current directory | 
|  | 105 | called perf.data. | 
|  | 106 |  | 
|  | 107 | Once we have a perf.data file containing our data, we can use the -g | 
| Ingo Molnar | 133dc4c | 2010-11-16 18:45:39 +0100 | [diff] [blame] | 108 | 'perf script' option to generate a Python script that will contain a | 
| Tom Zanussi | cff68e5 | 2010-01-27 02:28:03 -0600 | [diff] [blame] | 109 | callback handler for each event type found in the perf.data trace | 
|  | 110 | stream (for more details, see the STARTER SCRIPTS section). | 
|  | 111 |  | 
|  | 112 | ---- | 
| Ingo Molnar | 133dc4c | 2010-11-16 18:45:39 +0100 | [diff] [blame] | 113 | # perf script -g python | 
|  | 114 | generated Python script: perf-script.py | 
| Tom Zanussi | cff68e5 | 2010-01-27 02:28:03 -0600 | [diff] [blame] | 115 |  | 
|  | 116 | The output file created also in the current directory is named | 
| Ingo Molnar | 133dc4c | 2010-11-16 18:45:39 +0100 | [diff] [blame] | 117 | perf-script.py.  Here's the file in its entirety: | 
| Tom Zanussi | cff68e5 | 2010-01-27 02:28:03 -0600 | [diff] [blame] | 118 |  | 
| Ingo Molnar | 133dc4c | 2010-11-16 18:45:39 +0100 | [diff] [blame] | 119 | # perf script event handlers, generated by perf script -g python | 
| Tom Zanussi | cff68e5 | 2010-01-27 02:28:03 -0600 | [diff] [blame] | 120 | # Licensed under the terms of the GNU GPL License version 2 | 
|  | 121 |  | 
|  | 122 | # The common_* event handler fields are the most useful fields common to | 
|  | 123 | # all events.  They don't necessarily correspond to the 'common_*' fields | 
|  | 124 | # in the format files.  Those fields not available as handler params can | 
|  | 125 | # be retrieved using Python functions of the form common_*(context). | 
| Ingo Molnar | 133dc4c | 2010-11-16 18:45:39 +0100 | [diff] [blame] | 126 | # See the perf-script-python Documentation for the list of available functions. | 
| Tom Zanussi | cff68e5 | 2010-01-27 02:28:03 -0600 | [diff] [blame] | 127 |  | 
|  | 128 | import os | 
|  | 129 | import sys | 
|  | 130 |  | 
|  | 131 | sys.path.append(os.environ['PERF_EXEC_PATH'] + \ | 
| Ingo Molnar | 133dc4c | 2010-11-16 18:45:39 +0100 | [diff] [blame] | 132 | '/scripts/python/perf-script-Util/lib/Perf/Trace') | 
| Tom Zanussi | cff68e5 | 2010-01-27 02:28:03 -0600 | [diff] [blame] | 133 |  | 
|  | 134 | from perf_trace_context import * | 
|  | 135 | from Core import * | 
|  | 136 |  | 
|  | 137 | def trace_begin(): | 
|  | 138 | print "in trace_begin" | 
|  | 139 |  | 
|  | 140 | def trace_end(): | 
|  | 141 | print "in trace_end" | 
|  | 142 |  | 
|  | 143 | def raw_syscalls__sys_enter(event_name, context, common_cpu, | 
|  | 144 | common_secs, common_nsecs, common_pid, common_comm, | 
|  | 145 | id, args): | 
|  | 146 | print_header(event_name, common_cpu, common_secs, common_nsecs, | 
|  | 147 | common_pid, common_comm) | 
|  | 148 |  | 
|  | 149 | print "id=%d, args=%s\n" % \ | 
|  | 150 | (id, args), | 
|  | 151 |  | 
|  | 152 | def trace_unhandled(event_name, context, common_cpu, common_secs, common_nsecs, | 
|  | 153 | common_pid, common_comm): | 
|  | 154 | print_header(event_name, common_cpu, common_secs, common_nsecs, | 
|  | 155 | common_pid, common_comm) | 
|  | 156 |  | 
|  | 157 | def print_header(event_name, cpu, secs, nsecs, pid, comm): | 
|  | 158 | print "%-20s %5u %05u.%09u %8u %-20s " % \ | 
|  | 159 | (event_name, cpu, secs, nsecs, pid, comm), | 
|  | 160 | ---- | 
|  | 161 |  | 
|  | 162 | At the top is a comment block followed by some import statements and a | 
| Ingo Molnar | 133dc4c | 2010-11-16 18:45:39 +0100 | [diff] [blame] | 163 | path append which every perf script script should include. | 
| Tom Zanussi | cff68e5 | 2010-01-27 02:28:03 -0600 | [diff] [blame] | 164 |  | 
|  | 165 | Following that are a couple generated functions, trace_begin() and | 
|  | 166 | trace_end(), which are called at the beginning and the end of the | 
|  | 167 | script respectively (for more details, see the SCRIPT_LAYOUT section | 
|  | 168 | below). | 
|  | 169 |  | 
|  | 170 | Following those are the 'event handler' functions generated one for | 
|  | 171 | every event in the 'perf record' output.  The handler functions take | 
|  | 172 | the form subsystem__event_name, and contain named parameters, one for | 
|  | 173 | each field in the event; in this case, there's only one event, | 
|  | 174 | raw_syscalls__sys_enter().  (see the EVENT HANDLERS section below for | 
|  | 175 | more info on event handlers). | 
|  | 176 |  | 
|  | 177 | The final couple of functions are, like the begin and end functions, | 
|  | 178 | generated for every script.  The first, trace_unhandled(), is called | 
|  | 179 | every time the script finds an event in the perf.data file that | 
|  | 180 | doesn't correspond to any event handler in the script.  This could | 
|  | 181 | mean either that the record step recorded event types that it wasn't | 
|  | 182 | really interested in, or the script was run against a trace file that | 
|  | 183 | doesn't correspond to the script. | 
|  | 184 |  | 
| Kirill Smelkov | 5d2be7c | 2010-05-13 14:39:25 +0400 | [diff] [blame] | 185 | The script generated by -g option simply prints a line for each | 
| Tom Zanussi | cff68e5 | 2010-01-27 02:28:03 -0600 | [diff] [blame] | 186 | event found in the trace stream i.e. it basically just dumps the event | 
|  | 187 | and its parameter values to stdout.  The print_header() function is | 
|  | 188 | simply a utility function used for that purpose.  Let's rename the | 
|  | 189 | script and run it to see the default output: | 
|  | 190 |  | 
|  | 191 | ---- | 
| Ingo Molnar | 133dc4c | 2010-11-16 18:45:39 +0100 | [diff] [blame] | 192 | # mv perf-script.py syscall-counts.py | 
|  | 193 | # perf script -s syscall-counts.py | 
| Tom Zanussi | cff68e5 | 2010-01-27 02:28:03 -0600 | [diff] [blame] | 194 |  | 
|  | 195 | raw_syscalls__sys_enter     1 00840.847582083     7506 perf                  id=1, args= | 
|  | 196 | raw_syscalls__sys_enter     1 00840.847595764     7506 perf                  id=1, args= | 
|  | 197 | raw_syscalls__sys_enter     1 00840.847620860     7506 perf                  id=1, args= | 
|  | 198 | raw_syscalls__sys_enter     1 00840.847710478     6533 npviewer.bin          id=78, args= | 
|  | 199 | raw_syscalls__sys_enter     1 00840.847719204     6533 npviewer.bin          id=142, args= | 
|  | 200 | raw_syscalls__sys_enter     1 00840.847755445     6533 npviewer.bin          id=3, args= | 
|  | 201 | raw_syscalls__sys_enter     1 00840.847775601     6533 npviewer.bin          id=3, args= | 
|  | 202 | raw_syscalls__sys_enter     1 00840.847781820     6533 npviewer.bin          id=3, args= | 
|  | 203 | . | 
|  | 204 | . | 
|  | 205 | . | 
|  | 206 | ---- | 
|  | 207 |  | 
|  | 208 | Of course, for this script, we're not interested in printing every | 
|  | 209 | trace event, but rather aggregating it in a useful way.  So we'll get | 
|  | 210 | rid of everything to do with printing as well as the trace_begin() and | 
|  | 211 | trace_unhandled() functions, which we won't be using.  That leaves us | 
|  | 212 | with this minimalistic skeleton: | 
|  | 213 |  | 
|  | 214 | ---- | 
|  | 215 | import os | 
|  | 216 | import sys | 
|  | 217 |  | 
|  | 218 | sys.path.append(os.environ['PERF_EXEC_PATH'] + \ | 
| Ingo Molnar | 133dc4c | 2010-11-16 18:45:39 +0100 | [diff] [blame] | 219 | '/scripts/python/perf-script-Util/lib/Perf/Trace') | 
| Tom Zanussi | cff68e5 | 2010-01-27 02:28:03 -0600 | [diff] [blame] | 220 |  | 
|  | 221 | from perf_trace_context import * | 
|  | 222 | from Core import * | 
|  | 223 |  | 
|  | 224 | def trace_end(): | 
|  | 225 | print "in trace_end" | 
|  | 226 |  | 
|  | 227 | def raw_syscalls__sys_enter(event_name, context, common_cpu, | 
|  | 228 | common_secs, common_nsecs, common_pid, common_comm, | 
|  | 229 | id, args): | 
|  | 230 | ---- | 
|  | 231 |  | 
|  | 232 | In trace_end(), we'll simply print the results, but first we need to | 
|  | 233 | generate some results to print.  To do that we need to have our | 
|  | 234 | sys_enter() handler do the necessary tallying until all events have | 
|  | 235 | been counted.  A hash table indexed by syscall id is a good way to | 
|  | 236 | store that information; every time the sys_enter() handler is called, | 
|  | 237 | we simply increment a count associated with that hash entry indexed by | 
|  | 238 | that syscall id: | 
|  | 239 |  | 
|  | 240 | ---- | 
|  | 241 | syscalls = autodict() | 
|  | 242 |  | 
|  | 243 | try: | 
|  | 244 | syscalls[id] += 1 | 
|  | 245 | except TypeError: | 
|  | 246 | syscalls[id] = 1 | 
|  | 247 | ---- | 
|  | 248 |  | 
|  | 249 | The syscalls 'autodict' object is a special kind of Python dictionary | 
|  | 250 | (implemented in Core.py) that implements Perl's 'autovivifying' hashes | 
|  | 251 | in Python i.e. with autovivifying hashes, you can assign nested hash | 
|  | 252 | values without having to go to the trouble of creating intermediate | 
|  | 253 | levels if they don't exist e.g syscalls[comm][pid][id] = 1 will create | 
|  | 254 | the intermediate hash levels and finally assign the value 1 to the | 
|  | 255 | hash entry for 'id' (because the value being assigned isn't a hash | 
|  | 256 | object itself, the initial value is assigned in the TypeError | 
|  | 257 | exception.  Well, there may be a better way to do this in Python but | 
|  | 258 | that's what works for now). | 
|  | 259 |  | 
|  | 260 | Putting that code into the raw_syscalls__sys_enter() handler, we | 
|  | 261 | effectively end up with a single-level dictionary keyed on syscall id | 
|  | 262 | and having the counts we've tallied as values. | 
|  | 263 |  | 
|  | 264 | The print_syscall_totals() function iterates over the entries in the | 
|  | 265 | dictionary and displays a line for each entry containing the syscall | 
|  | 266 | name (the dictonary keys contain the syscall ids, which are passed to | 
|  | 267 | the Util function syscall_name(), which translates the raw syscall | 
|  | 268 | numbers to the corresponding syscall name strings).  The output is | 
|  | 269 | displayed after all the events in the trace have been processed, by | 
|  | 270 | calling the print_syscall_totals() function from the trace_end() | 
|  | 271 | handler called at the end of script processing. | 
|  | 272 |  | 
|  | 273 | The final script producing the output shown above is shown in its | 
| Frederic Weisbecker | c2fbaa4 | 2010-02-25 03:03:52 +0100 | [diff] [blame] | 274 | entirety below (syscall_name() helper is not yet available, you can | 
|  | 275 | only deal with id's for now): | 
| Tom Zanussi | cff68e5 | 2010-01-27 02:28:03 -0600 | [diff] [blame] | 276 |  | 
|  | 277 | ---- | 
|  | 278 | import os | 
|  | 279 | import sys | 
|  | 280 |  | 
|  | 281 | sys.path.append(os.environ['PERF_EXEC_PATH'] + \ | 
| Ingo Molnar | 133dc4c | 2010-11-16 18:45:39 +0100 | [diff] [blame] | 282 | '/scripts/python/perf-script-Util/lib/Perf/Trace') | 
| Tom Zanussi | cff68e5 | 2010-01-27 02:28:03 -0600 | [diff] [blame] | 283 |  | 
|  | 284 | from perf_trace_context import * | 
|  | 285 | from Core import * | 
|  | 286 | from Util import * | 
|  | 287 |  | 
|  | 288 | syscalls = autodict() | 
|  | 289 |  | 
|  | 290 | def trace_end(): | 
|  | 291 | print_syscall_totals() | 
|  | 292 |  | 
|  | 293 | def raw_syscalls__sys_enter(event_name, context, common_cpu, | 
|  | 294 | common_secs, common_nsecs, common_pid, common_comm, | 
|  | 295 | id, args): | 
|  | 296 | try: | 
|  | 297 | syscalls[id] += 1 | 
|  | 298 | except TypeError: | 
|  | 299 | syscalls[id] = 1 | 
|  | 300 |  | 
|  | 301 | def print_syscall_totals(): | 
|  | 302 | if for_comm is not None: | 
|  | 303 | print "\nsyscall events for %s:\n\n" % (for_comm), | 
|  | 304 | else: | 
|  | 305 | print "\nsyscall events:\n\n", | 
|  | 306 |  | 
|  | 307 | print "%-40s  %10s\n" % ("event", "count"), | 
|  | 308 | print "%-40s  %10s\n" % ("----------------------------------------", \ | 
|  | 309 | "-----------"), | 
|  | 310 |  | 
|  | 311 | for id, val in sorted(syscalls.iteritems(), key = lambda(k, v): (v, k), \ | 
|  | 312 | reverse = True): | 
|  | 313 | print "%-40s  %10d\n" % (syscall_name(id), val), | 
|  | 314 | ---- | 
|  | 315 |  | 
|  | 316 | The script can be run just as before: | 
|  | 317 |  | 
| Ingo Molnar | 133dc4c | 2010-11-16 18:45:39 +0100 | [diff] [blame] | 318 | # perf script -s syscall-counts.py | 
| Tom Zanussi | cff68e5 | 2010-01-27 02:28:03 -0600 | [diff] [blame] | 319 |  | 
|  | 320 | So those are the essential steps in writing and running a script.  The | 
|  | 321 | process can be generalized to any tracepoint or set of tracepoints | 
|  | 322 | you're interested in - basically find the tracepoint(s) you're | 
|  | 323 | interested in by looking at the list of available events shown by | 
|  | 324 | 'perf list' and/or look in /sys/kernel/debug/tracing events for | 
|  | 325 | detailed event and field info, record the corresponding trace data | 
|  | 326 | using 'perf record', passing it the list of interesting events, | 
| Ingo Molnar | 133dc4c | 2010-11-16 18:45:39 +0100 | [diff] [blame] | 327 | generate a skeleton script using 'perf script -g python' and modify the | 
| Tom Zanussi | cff68e5 | 2010-01-27 02:28:03 -0600 | [diff] [blame] | 328 | code to aggregate and display it for your particular needs. | 
|  | 329 |  | 
|  | 330 | After you've done that you may end up with a general-purpose script | 
|  | 331 | that you want to keep around and have available for future use.  By | 
|  | 332 | writing a couple of very simple shell scripts and putting them in the | 
|  | 333 | right place, you can have your script listed alongside the other | 
| Ingo Molnar | 133dc4c | 2010-11-16 18:45:39 +0100 | [diff] [blame] | 334 | scripts listed by the 'perf script -l' command e.g.: | 
| Tom Zanussi | cff68e5 | 2010-01-27 02:28:03 -0600 | [diff] [blame] | 335 |  | 
|  | 336 | ---- | 
| Ingo Molnar | 133dc4c | 2010-11-16 18:45:39 +0100 | [diff] [blame] | 337 | root@tropicana:~# perf script -l | 
| Tom Zanussi | cff68e5 | 2010-01-27 02:28:03 -0600 | [diff] [blame] | 338 | List of available trace scripts: | 
|  | 339 | workqueue-stats                      workqueue stats (ins/exe/create/destroy) | 
|  | 340 | wakeup-latency                       system-wide min/max/avg wakeup latency | 
|  | 341 | rw-by-file <comm>                    r/w activity for a program, by file | 
|  | 342 | rw-by-pid                            system-wide r/w activity | 
|  | 343 | ---- | 
|  | 344 |  | 
|  | 345 | A nice side effect of doing this is that you also then capture the | 
|  | 346 | probably lengthy 'perf record' command needed to record the events for | 
|  | 347 | the script. | 
|  | 348 |  | 
|  | 349 | To have the script appear as a 'built-in' script, you write two simple | 
|  | 350 | scripts, one for recording and one for 'reporting'. | 
|  | 351 |  | 
|  | 352 | The 'record' script is a shell script with the same base name as your | 
|  | 353 | script, but with -record appended.  The shell script should be put | 
|  | 354 | into the perf/scripts/python/bin directory in the kernel source tree. | 
|  | 355 | In that script, you write the 'perf record' command-line needed for | 
|  | 356 | your script: | 
|  | 357 |  | 
|  | 358 | ---- | 
|  | 359 | # cat kernel-source/tools/perf/scripts/python/bin/syscall-counts-record | 
|  | 360 |  | 
|  | 361 | #!/bin/bash | 
| Frederic Weisbecker | e5a5f1f | 2010-04-30 19:55:00 +0200 | [diff] [blame] | 362 | perf record -a -e raw_syscalls:sys_enter | 
| Tom Zanussi | cff68e5 | 2010-01-27 02:28:03 -0600 | [diff] [blame] | 363 | ---- | 
|  | 364 |  | 
|  | 365 | The 'report' script is also a shell script with the same base name as | 
|  | 366 | your script, but with -report appended.  It should also be located in | 
|  | 367 | the perf/scripts/python/bin directory.  In that script, you write the | 
| Ingo Molnar | 133dc4c | 2010-11-16 18:45:39 +0100 | [diff] [blame] | 368 | 'perf script -s' command-line needed for running your script: | 
| Tom Zanussi | cff68e5 | 2010-01-27 02:28:03 -0600 | [diff] [blame] | 369 |  | 
|  | 370 | ---- | 
|  | 371 | # cat kernel-source/tools/perf/scripts/python/bin/syscall-counts-report | 
|  | 372 |  | 
|  | 373 | #!/bin/bash | 
|  | 374 | # description: system-wide syscall counts | 
| Ingo Molnar | 133dc4c | 2010-11-16 18:45:39 +0100 | [diff] [blame] | 375 | perf script -s ~/libexec/perf-core/scripts/python/syscall-counts.py | 
| Tom Zanussi | cff68e5 | 2010-01-27 02:28:03 -0600 | [diff] [blame] | 376 | ---- | 
|  | 377 |  | 
|  | 378 | Note that the location of the Python script given in the shell script | 
|  | 379 | is in the libexec/perf-core/scripts/python directory - this is where | 
|  | 380 | the script will be copied by 'make install' when you install perf. | 
|  | 381 | For the installation to install your script there, your script needs | 
|  | 382 | to be located in the perf/scripts/python directory in the kernel | 
|  | 383 | source tree: | 
|  | 384 |  | 
|  | 385 | ---- | 
|  | 386 | # ls -al kernel-source/tools/perf/scripts/python | 
|  | 387 |  | 
|  | 388 | root@tropicana:/home/trz/src/tip# ls -al tools/perf/scripts/python | 
|  | 389 | total 32 | 
|  | 390 | drwxr-xr-x 4 trz trz 4096 2010-01-26 22:30 . | 
|  | 391 | drwxr-xr-x 4 trz trz 4096 2010-01-26 22:29 .. | 
|  | 392 | drwxr-xr-x 2 trz trz 4096 2010-01-26 22:29 bin | 
| Ingo Molnar | 133dc4c | 2010-11-16 18:45:39 +0100 | [diff] [blame] | 393 | -rw-r--r-- 1 trz trz 2548 2010-01-26 22:29 check-perf-script.py | 
|  | 394 | drwxr-xr-x 3 trz trz 4096 2010-01-26 22:49 perf-script-Util | 
| Tom Zanussi | cff68e5 | 2010-01-27 02:28:03 -0600 | [diff] [blame] | 395 | -rw-r--r-- 1 trz trz 1462 2010-01-26 22:30 syscall-counts.py | 
|  | 396 | ---- | 
|  | 397 |  | 
|  | 398 | Once you've done that (don't forget to do a new 'make install', | 
| Ingo Molnar | 133dc4c | 2010-11-16 18:45:39 +0100 | [diff] [blame] | 399 | otherwise your script won't show up at run-time), 'perf script -l' | 
| Tom Zanussi | cff68e5 | 2010-01-27 02:28:03 -0600 | [diff] [blame] | 400 | should show a new entry for your script: | 
|  | 401 |  | 
|  | 402 | ---- | 
| Ingo Molnar | 133dc4c | 2010-11-16 18:45:39 +0100 | [diff] [blame] | 403 | root@tropicana:~# perf script -l | 
| Tom Zanussi | cff68e5 | 2010-01-27 02:28:03 -0600 | [diff] [blame] | 404 | List of available trace scripts: | 
|  | 405 | workqueue-stats                      workqueue stats (ins/exe/create/destroy) | 
|  | 406 | wakeup-latency                       system-wide min/max/avg wakeup latency | 
|  | 407 | rw-by-file <comm>                    r/w activity for a program, by file | 
|  | 408 | rw-by-pid                            system-wide r/w activity | 
|  | 409 | syscall-counts                       system-wide syscall counts | 
|  | 410 | ---- | 
|  | 411 |  | 
| Ingo Molnar | 133dc4c | 2010-11-16 18:45:39 +0100 | [diff] [blame] | 412 | You can now perform the record step via 'perf script record': | 
| Tom Zanussi | cff68e5 | 2010-01-27 02:28:03 -0600 | [diff] [blame] | 413 |  | 
| Ingo Molnar | 133dc4c | 2010-11-16 18:45:39 +0100 | [diff] [blame] | 414 | # perf script record syscall-counts | 
| Tom Zanussi | cff68e5 | 2010-01-27 02:28:03 -0600 | [diff] [blame] | 415 |  | 
| Ingo Molnar | 133dc4c | 2010-11-16 18:45:39 +0100 | [diff] [blame] | 416 | and display the output using 'perf script report': | 
| Tom Zanussi | cff68e5 | 2010-01-27 02:28:03 -0600 | [diff] [blame] | 417 |  | 
| Ingo Molnar | 133dc4c | 2010-11-16 18:45:39 +0100 | [diff] [blame] | 418 | # perf script report syscall-counts | 
| Tom Zanussi | cff68e5 | 2010-01-27 02:28:03 -0600 | [diff] [blame] | 419 |  | 
|  | 420 | STARTER SCRIPTS | 
|  | 421 | --------------- | 
|  | 422 |  | 
|  | 423 | You can quickly get started writing a script for a particular set of | 
| Ingo Molnar | 133dc4c | 2010-11-16 18:45:39 +0100 | [diff] [blame] | 424 | trace data by generating a skeleton script using 'perf script -g | 
| Tom Zanussi | cff68e5 | 2010-01-27 02:28:03 -0600 | [diff] [blame] | 425 | python' in the same directory as an existing perf.data trace file. | 
|  | 426 | That will generate a starter script containing a handler for each of | 
|  | 427 | the event types in the trace file; it simply prints every available | 
|  | 428 | field for each event in the trace file. | 
|  | 429 |  | 
|  | 430 | You can also look at the existing scripts in | 
|  | 431 | ~/libexec/perf-core/scripts/python for typical examples showing how to | 
|  | 432 | do basic things like aggregate event data, print results, etc.  Also, | 
| Ingo Molnar | 133dc4c | 2010-11-16 18:45:39 +0100 | [diff] [blame] | 433 | the check-perf-script.py script, while not interesting for its results, | 
| Tom Zanussi | cff68e5 | 2010-01-27 02:28:03 -0600 | [diff] [blame] | 434 | attempts to exercise all of the main scripting features. | 
|  | 435 |  | 
|  | 436 | EVENT HANDLERS | 
|  | 437 | -------------- | 
|  | 438 |  | 
| Ingo Molnar | 133dc4c | 2010-11-16 18:45:39 +0100 | [diff] [blame] | 439 | When perf script is invoked using a trace script, a user-defined | 
| Tom Zanussi | cff68e5 | 2010-01-27 02:28:03 -0600 | [diff] [blame] | 440 | 'handler function' is called for each event in the trace.  If there's | 
|  | 441 | no handler function defined for a given event type, the event is | 
|  | 442 | ignored (or passed to a 'trace_handled' function, see below) and the | 
|  | 443 | next event is processed. | 
|  | 444 |  | 
|  | 445 | Most of the event's field values are passed as arguments to the | 
|  | 446 | handler function; some of the less common ones aren't - those are | 
|  | 447 | available as calls back into the perf executable (see below). | 
|  | 448 |  | 
|  | 449 | As an example, the following perf record command can be used to record | 
|  | 450 | all sched_wakeup events in the system: | 
|  | 451 |  | 
| Frederic Weisbecker | e5a5f1f | 2010-04-30 19:55:00 +0200 | [diff] [blame] | 452 | # perf record -a -e sched:sched_wakeup | 
| Tom Zanussi | cff68e5 | 2010-01-27 02:28:03 -0600 | [diff] [blame] | 453 |  | 
|  | 454 | Traces meant to be processed using a script should be recorded with | 
| Frederic Weisbecker | e5a5f1f | 2010-04-30 19:55:00 +0200 | [diff] [blame] | 455 | the above option: -a to enable system-wide collection. | 
| Tom Zanussi | cff68e5 | 2010-01-27 02:28:03 -0600 | [diff] [blame] | 456 |  | 
|  | 457 | The format file for the sched_wakep event defines the following fields | 
|  | 458 | (see /sys/kernel/debug/tracing/events/sched/sched_wakeup/format): | 
|  | 459 |  | 
|  | 460 | ---- | 
|  | 461 | format: | 
|  | 462 | field:unsigned short common_type; | 
|  | 463 | field:unsigned char common_flags; | 
|  | 464 | field:unsigned char common_preempt_count; | 
|  | 465 | field:int common_pid; | 
|  | 466 | field:int common_lock_depth; | 
|  | 467 |  | 
|  | 468 | field:char comm[TASK_COMM_LEN]; | 
|  | 469 | field:pid_t pid; | 
|  | 470 | field:int prio; | 
|  | 471 | field:int success; | 
|  | 472 | field:int target_cpu; | 
|  | 473 | ---- | 
|  | 474 |  | 
|  | 475 | The handler function for this event would be defined as: | 
|  | 476 |  | 
|  | 477 | ---- | 
|  | 478 | def sched__sched_wakeup(event_name, context, common_cpu, common_secs, | 
|  | 479 | common_nsecs, common_pid, common_comm, | 
|  | 480 | comm, pid, prio, success, target_cpu): | 
|  | 481 | pass | 
|  | 482 | ---- | 
|  | 483 |  | 
|  | 484 | The handler function takes the form subsystem__event_name. | 
|  | 485 |  | 
|  | 486 | The common_* arguments in the handler's argument list are the set of | 
|  | 487 | arguments passed to all event handlers; some of the fields correspond | 
|  | 488 | to the common_* fields in the format file, but some are synthesized, | 
|  | 489 | and some of the common_* fields aren't common enough to to be passed | 
|  | 490 | to every event as arguments but are available as library functions. | 
|  | 491 |  | 
|  | 492 | Here's a brief description of each of the invariant event args: | 
|  | 493 |  | 
|  | 494 | event_name 	  	    the name of the event as text | 
|  | 495 | context		    an opaque 'cookie' used in calls back into perf | 
|  | 496 | common_cpu		    the cpu the event occurred on | 
|  | 497 | common_secs		    the secs portion of the event timestamp | 
|  | 498 | common_nsecs		    the nsecs portion of the event timestamp | 
|  | 499 | common_pid		    the pid of the current task | 
|  | 500 | common_comm		    the name of the current process | 
|  | 501 |  | 
|  | 502 | All of the remaining fields in the event's format file have | 
|  | 503 | counterparts as handler function arguments of the same name, as can be | 
|  | 504 | seen in the example above. | 
|  | 505 |  | 
|  | 506 | The above provides the basics needed to directly access every field of | 
|  | 507 | every event in a trace, which covers 90% of what you need to know to | 
|  | 508 | write a useful trace script.  The sections below cover the rest. | 
|  | 509 |  | 
|  | 510 | SCRIPT LAYOUT | 
|  | 511 | ------------- | 
|  | 512 |  | 
| Ingo Molnar | 133dc4c | 2010-11-16 18:45:39 +0100 | [diff] [blame] | 513 | Every perf script Python script should start by setting up a Python | 
| Tom Zanussi | cff68e5 | 2010-01-27 02:28:03 -0600 | [diff] [blame] | 514 | module search path and 'import'ing a few support modules (see module | 
|  | 515 | descriptions below): | 
|  | 516 |  | 
|  | 517 | ---- | 
|  | 518 | import os | 
|  | 519 | import sys | 
|  | 520 |  | 
|  | 521 | sys.path.append(os.environ['PERF_EXEC_PATH'] + \ | 
| Ingo Molnar | 133dc4c | 2010-11-16 18:45:39 +0100 | [diff] [blame] | 522 | '/scripts/python/perf-script-Util/lib/Perf/Trace') | 
| Tom Zanussi | cff68e5 | 2010-01-27 02:28:03 -0600 | [diff] [blame] | 523 |  | 
|  | 524 | from perf_trace_context import * | 
|  | 525 | from Core import * | 
|  | 526 | ---- | 
|  | 527 |  | 
|  | 528 | The rest of the script can contain handler functions and support | 
|  | 529 | functions in any order. | 
|  | 530 |  | 
|  | 531 | Aside from the event handler functions discussed above, every script | 
|  | 532 | can implement a set of optional functions: | 
|  | 533 |  | 
|  | 534 | *trace_begin*, if defined, is called before any event is processed and | 
|  | 535 | gives scripts a chance to do setup tasks: | 
|  | 536 |  | 
|  | 537 | ---- | 
|  | 538 | def trace_begin: | 
|  | 539 | pass | 
|  | 540 | ---- | 
|  | 541 |  | 
|  | 542 | *trace_end*, if defined, is called after all events have been | 
|  | 543 | processed and gives scripts a chance to do end-of-script tasks, such | 
|  | 544 | as display results: | 
|  | 545 |  | 
|  | 546 | ---- | 
|  | 547 | def trace_end: | 
|  | 548 | pass | 
|  | 549 | ---- | 
|  | 550 |  | 
|  | 551 | *trace_unhandled*, if defined, is called after for any event that | 
|  | 552 | doesn't have a handler explicitly defined for it.  The standard set | 
|  | 553 | of common arguments are passed into it: | 
|  | 554 |  | 
|  | 555 | ---- | 
|  | 556 | def trace_unhandled(event_name, context, common_cpu, common_secs, | 
|  | 557 | common_nsecs, common_pid, common_comm): | 
|  | 558 | pass | 
|  | 559 | ---- | 
|  | 560 |  | 
|  | 561 | The remaining sections provide descriptions of each of the available | 
| Ingo Molnar | 133dc4c | 2010-11-16 18:45:39 +0100 | [diff] [blame] | 562 | built-in perf script Python modules and their associated functions. | 
| Tom Zanussi | cff68e5 | 2010-01-27 02:28:03 -0600 | [diff] [blame] | 563 |  | 
|  | 564 | AVAILABLE MODULES AND FUNCTIONS | 
|  | 565 | ------------------------------- | 
|  | 566 |  | 
|  | 567 | The following sections describe the functions and variables available | 
| Ingo Molnar | 133dc4c | 2010-11-16 18:45:39 +0100 | [diff] [blame] | 568 | via the various perf script Python modules.  To use the functions and | 
| Tom Zanussi | cff68e5 | 2010-01-27 02:28:03 -0600 | [diff] [blame] | 569 | variables from the given module, add the corresponding 'from XXXX | 
| Ingo Molnar | 133dc4c | 2010-11-16 18:45:39 +0100 | [diff] [blame] | 570 | import' line to your perf script script. | 
| Tom Zanussi | cff68e5 | 2010-01-27 02:28:03 -0600 | [diff] [blame] | 571 |  | 
|  | 572 | Core.py Module | 
|  | 573 | ~~~~~~~~~~~~~~ | 
|  | 574 |  | 
|  | 575 | These functions provide some essential functions to user scripts. | 
|  | 576 |  | 
|  | 577 | The *flag_str* and *symbol_str* functions provide human-readable | 
|  | 578 | strings for flag and symbolic fields.  These correspond to the strings | 
|  | 579 | and values parsed from the 'print fmt' fields of the event format | 
|  | 580 | files: | 
|  | 581 |  | 
|  | 582 | flag_str(event_name, field_name, field_value) - returns the string represention corresponding to field_value for the flag field field_name of event event_name | 
|  | 583 | symbol_str(event_name, field_name, field_value) - returns the string represention corresponding to field_value for the symbolic field field_name of event event_name | 
|  | 584 |  | 
| Kirill Smelkov | 5d2be7c | 2010-05-13 14:39:25 +0400 | [diff] [blame] | 585 | The *autodict* function returns a special kind of Python | 
| Tom Zanussi | cff68e5 | 2010-01-27 02:28:03 -0600 | [diff] [blame] | 586 | dictionary that implements Perl's 'autovivifying' hashes in Python | 
|  | 587 | i.e. with autovivifying hashes, you can assign nested hash values | 
|  | 588 | without having to go to the trouble of creating intermediate levels if | 
|  | 589 | they don't exist. | 
|  | 590 |  | 
|  | 591 | autodict() - returns an autovivifying dictionary instance | 
|  | 592 |  | 
|  | 593 |  | 
|  | 594 | perf_trace_context Module | 
|  | 595 | ~~~~~~~~~~~~~~~~~~~~~~~~~ | 
|  | 596 |  | 
|  | 597 | Some of the 'common' fields in the event format file aren't all that | 
|  | 598 | common, but need to be made accessible to user scripts nonetheless. | 
|  | 599 |  | 
|  | 600 | perf_trace_context defines a set of functions that can be used to | 
|  | 601 | access this data in the context of the current event.  Each of these | 
|  | 602 | functions expects a context variable, which is the same as the | 
|  | 603 | context variable passed into every event handler as the second | 
|  | 604 | argument. | 
|  | 605 |  | 
|  | 606 | common_pc(context) - returns common_preempt count for the current event | 
|  | 607 | common_flags(context) - returns common_flags for the current event | 
|  | 608 | common_lock_depth(context) - returns common_lock_depth for the current event | 
|  | 609 |  | 
|  | 610 | Util.py Module | 
|  | 611 | ~~~~~~~~~~~~~~ | 
|  | 612 |  | 
| Ingo Molnar | 133dc4c | 2010-11-16 18:45:39 +0100 | [diff] [blame] | 613 | Various utility functions for use with perf script: | 
| Tom Zanussi | cff68e5 | 2010-01-27 02:28:03 -0600 | [diff] [blame] | 614 |  | 
|  | 615 | nsecs(secs, nsecs) - returns total nsecs given secs/nsecs pair | 
|  | 616 | nsecs_secs(nsecs) - returns whole secs portion given nsecs | 
|  | 617 | nsecs_nsecs(nsecs) - returns nsecs remainder given nsecs | 
|  | 618 | nsecs_str(nsecs) - returns printable string in the form secs.nsecs | 
|  | 619 | avg(total, n) - returns average given a sum and a total number of values | 
| Tom Zanussi | cff68e5 | 2010-01-27 02:28:03 -0600 | [diff] [blame] | 620 |  | 
|  | 621 | SEE ALSO | 
|  | 622 | -------- | 
| Ingo Molnar | 133dc4c | 2010-11-16 18:45:39 +0100 | [diff] [blame] | 623 | linkperf:perf-script[1] |