Fault injection capabilities infrastructure
===========================================

See also drivers/md/faulty.c and "every_nth" module option for scsi_debug.


Available fault injection capabilities
--------------------------------------

o failslab

  injects slab allocation failures. (kmalloc(), kmem_cache_alloc(), ...)

o fail_page_alloc

  injects page allocation failures. (alloc_pages(), get_free_pages(), ...)

o fail_make_request

  injects disk IO errors on devices permitted by setting
  /sys/block/<device>/make-it-fail or
  /sys/block/<device>/<partition>/make-it-fail. (generic_make_request())

o fail_mmc_request

  injects MMC data errors on devices permitted by setting
  debugfs entries under /sys/kernel/debug/mmc0/fail_mmc_request

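To see which of these capabilities a running kernel exposes, one can look
for the matching directories under debugfs. The helper below is a minimal
sketch: the function name and its optional root argument are hypothetical,
and it assumes debugfs is mounted at /sys/kernel/debug. It only matches
top-level fail* directories, so fail_mmc_request (which lives under mmc0/)
is not listed.

```shell
#!/bin/bash
# List fault-injection points found directly under a debugfs mount.
# The optional first argument (a hypothetical convenience) overrides the
# debugfs root so the helper can be tried against a scratch directory.
list_fault_points()
{
	local root="${1:-/sys/kernel/debug}"
	local dir
	for dir in "$root"/fail*; do
		# Treat a directory as a fault point only if it carries a
		# "probability" entry.
		if [ -f "$dir/probability" ]; then
			basename "$dir"
		fi
	done
}
```

Calling list_fault_points with no argument prints one name per line, for
example failslab, when the corresponding option is built into the kernel.
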
Configure fault-injection capabilities behavior
-----------------------------------------------

o debugfs entries

The fault-inject-debugfs kernel module provides debugfs entries for runtime
configuration of fault-injection capabilities.

- /sys/kernel/debug/fail*/probability:

	likelihood of failure injection, in percent.
	Format: <percent>

	Note that one-failure-per-hundred is a very high error rate
	for some testcases.  Consider setting probability=100 and configuring
	/sys/kernel/debug/fail*/interval for such testcases.

- /sys/kernel/debug/fail*/interval:

	specifies the interval between failures, for calls to
	should_fail() that pass all the other tests.

	Note that if you enable this, by setting interval>1, you will
	probably want to set probability=100.

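The combination suggested in the two notes above (probability=100 plus an
interval) can be wrapped in a small helper. This is only a sketch: the
function name and the optional debugfs root argument are hypothetical, and
it assumes the fault point's directory already exists.

```shell
#!/bin/bash
# Make one fault point fail on every Nth eligible call.
# Usage: fail_every_nth <name> <n> [debugfs-root]
fail_every_nth()
{
	local name="$1"
	local n="$2"
	local root="${3:-/sys/kernel/debug}"
	local dir="$root/$name"

	if [ ! -d "$dir" ]; then
		echo "no such fault point: $dir" >&2
		return 1
	fi
	# With probability=100 the random test always passes, so the
	# interval alone decides which calls fail.
	echo 100 > "$dir/probability"
	echo "$n" > "$dir/interval"
}
```

For example, "fail_every_nth failslab 1000" asks for one slab allocation
failure per thousand eligible calls, a rate that the percent-based
probability entry alone cannot express.
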
- /sys/kernel/debug/fail*/times:

	specifies how many times failures may happen at most.
	A value of -1 means "no limit".

- /sys/kernel/debug/fail*/space:

	specifies an initial resource "budget", decremented by "size"
	on each call to should_fail(,size).  Failure injection is
	suppressed until "space" reaches zero.

- /sys/kernel/debug/fail*/verbose

	Format: { 0 | 1 | 2 }
	specifies the verbosity of the messages when failure is
	injected.  '0' means no messages; '1' will print only a single
	log line per failure; '2' will print a call trace too, which is
	useful for debugging the problems revealed by fault injection.

- /sys/kernel/debug/fail*/task-filter:

	Format: { 'Y' | 'N' }
	A value of 'N' disables filtering by process (default).
	A value of 'Y' limits failures to only processes indicated by
	/proc/<pid>/make-it-fail==1.

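With task-filter set to 'Y', a process opts in by writing 1 to its own
make-it-fail entry. The wrapper below is a minimal sketch of one way to do
that for a single command. The function name and the MAKE_IT_FAIL_FILE
override are hypothetical; the override exists only so the wrapper can be
tried without touching /proc.

```shell
#!/bin/bash
# Run a command with the per-process make-it-fail flag set.
# Usage: run_with_fault_flag <command> [args...]
run_with_fault_flag()
{
	# MAKE_IT_FAIL_FILE is a hypothetical override for experiments.
	local flag="${MAKE_IT_FAIL_FILE:-/proc/self/make-it-fail}"
	# The inner shell sets the flag on itself and then execs the
	# command, so the flag stays with the process that runs it.
	# "$@" keeps arguments that contain spaces intact.
	bash -c 'echo 1 > "$0" && exec "$@"' "$flag" "$@"
}
```

For example, "run_with_fault_flag modprobe some_module" limits injected
failures to that modprobe invocation, provided the fault point's
probability is non-zero.
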
- /sys/kernel/debug/fail*/require-start:
- /sys/kernel/debug/fail*/require-end:
- /sys/kernel/debug/fail*/reject-start:
- /sys/kernel/debug/fail*/reject-end:

	specifies the range of virtual addresses tested during
	stacktrace walking.  Failure is injected only if some caller
	in the walked stacktrace lies within the required range, and
	none lies within the rejected range.
	Default required range is [0,ULONG_MAX) (whole of virtual address space).
	Default rejected range is [0,0).

- /sys/kernel/debug/fail*/stacktrace-depth:

	specifies the maximum stacktrace depth walked during search
	for a caller within [require-start,require-end) OR
	[reject-start,reject-end).

- /sys/kernel/debug/fail_page_alloc/ignore-gfp-highmem:

	Format: { 'Y' | 'N' }
	Default is 'N'.  Setting it to 'Y' suppresses failure injection
	for highmem/user allocations.

- /sys/kernel/debug/failslab/ignore-gfp-wait:
- /sys/kernel/debug/fail_page_alloc/ignore-gfp-wait:

	Format: { 'Y' | 'N' }
	Default is 'N'.  Setting it to 'Y' injects failures
	only into non-sleep allocations (GFP_ATOMIC allocations).

- /sys/kernel/debug/fail_page_alloc/min-order:

	specifies the minimum page allocation order for which failures
	are injected.

o Boot option

In order to inject faults while debugfs is not available (early boot time),
use the boot option:

	failslab=
	fail_page_alloc=
	fail_make_request=
	mmc_core.fail_request=<interval>,<probability>,<space>,<times>

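A boot parameter of this form can be assembled with a small helper. This
is a sketch only: the function name is hypothetical, and it assumes that
failslab=, fail_page_alloc= and fail_make_request= take the same four
comma-separated fields that are spelled out for mmc_core.fail_request=.

```shell
#!/bin/bash
# Compose a fault-injection boot parameter string.
# Usage: fault_bootarg <option> <interval> <probability> <space> <times>
fault_bootarg()
{
	if [ $# -ne 5 ]; then
		echo "Usage: fault_bootarg <option> <interval> <probability> <space> <times>" >&2
		return 1
	fi
	# Field order follows <interval>,<probability>,<space>,<times>.
	printf '%s=%s,%s,%s,%s\n' "$1" "$2" "$3" "$4" "$5"
}
```

The resulting string is meant to be appended to the kernel command line in
the boot loader configuration.
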
How to add new fault injection capability
-----------------------------------------

o #include <linux/fault-inject.h>

o define the fault attributes

  DECLARE_FAULT_ATTR(name);

  Please see the definition of struct fault_attr in fault-inject.h
  for details.

o provide a way to configure fault attributes

- boot option

  If you need to enable the fault injection capability from boot time, you can
  provide a boot option to configure it. There is a helper function for it:

	setup_fault_attr(attr, str);

- debugfs entries

  failslab, fail_page_alloc, and fail_make_request are configured this way.
  Helper function:

	fault_create_debugfs_attr(name, parent, attr);

- module parameters

  If the scope of the fault injection capability is limited to a
  single kernel module, it is better to provide module parameters to
  configure the fault attributes.

o add a hook to insert failures

  Upon should_fail() returning true, client code should inject a failure.

	should_fail(attr, size);

Application Examples
--------------------

o Inject slab allocation failures into module init/exit code

#!/bin/bash

FAILTYPE=failslab
echo Y > /sys/kernel/debug/$FAILTYPE/task-filter
echo 10 > /sys/kernel/debug/$FAILTYPE/probability
echo 100 > /sys/kernel/debug/$FAILTYPE/interval
echo -1 > /sys/kernel/debug/$FAILTYPE/times
echo 0 > /sys/kernel/debug/$FAILTYPE/space
echo 2 > /sys/kernel/debug/$FAILTYPE/verbose
echo 1 > /sys/kernel/debug/$FAILTYPE/ignore-gfp-wait

# Run a command with make-it-fail set, so that only it sees failures.
faulty_system()
{
	bash -c "echo 1 > /proc/self/make-it-fail && exec $*"
}

if [ $# -eq 0 ]
then
	echo "Usage: $0 modulename [ modulename ... ]"
	exit 1
fi

for m in "$@"
do
	echo "inserting $m..."
	faulty_system modprobe "$m"

	echo "removing $m..."
	faulty_system modprobe -r "$m"
done

------------------------------------------------------------------------------

o Inject page allocation failures only for a specific module

#!/bin/bash

FAILTYPE=fail_page_alloc
module=$1

if [ -z "$module" ]
then
	echo "Usage: $0 <modulename>"
	exit 1
fi

modprobe "$module"

if [ ! -d "/sys/module/$module/sections" ]
then
	echo "Module $module is not loaded"
	exit 1
fi

# Restrict injection to callers located between the module's .text and .data.
cat "/sys/module/$module/sections/.text" > /sys/kernel/debug/$FAILTYPE/require-start
cat "/sys/module/$module/sections/.data" > /sys/kernel/debug/$FAILTYPE/require-end

echo N > /sys/kernel/debug/$FAILTYPE/task-filter
echo 10 > /sys/kernel/debug/$FAILTYPE/probability
echo 100 > /sys/kernel/debug/$FAILTYPE/interval
echo -1 > /sys/kernel/debug/$FAILTYPE/times
echo 0 > /sys/kernel/debug/$FAILTYPE/space
echo 2 > /sys/kernel/debug/$FAILTYPE/verbose
echo 1 > /sys/kernel/debug/$FAILTYPE/ignore-gfp-wait
echo 1 > /sys/kernel/debug/$FAILTYPE/ignore-gfp-highmem
echo 10 > /sys/kernel/debug/$FAILTYPE/stacktrace-depth

# Stop injecting failures when the script is interrupted or exits.
trap "echo 0 > /sys/kernel/debug/$FAILTYPE/probability" SIGINT SIGTERM EXIT

echo "Injecting errors into the module $module... (interrupt to stop)"
sleep 1000000