 | 1 | ------------------------------------------------------------------------------ | 
 | 2 |                        T H E  /proc   F I L E S Y S T E M | 
 | 3 | ------------------------------------------------------------------------------ | 
 | 4 | /proc/sys         Terrehon Bowden <terrehon@pacbell.net>        October 7 1999 | 
 | 5 |                   Bodo Bauer <bb@ricochet.net> | 
 | 6 |  | 
 | 7 | 2.4.x update	  Jorge Nerin <comandante@zaralinux.com>      November 14 2000 | 
 | 8 | ------------------------------------------------------------------------------ | 
 | 9 | Version 1.3                                              Kernel version 2.2.12 | 
 | 10 | 					      Kernel version 2.4.0-test11-pre4 | 
 | 11 | ------------------------------------------------------------------------------ | 
 | 12 |  | 
 | 13 | Table of Contents | 
 | 14 | ----------------- | 
 | 15 |  | 
 | 16 |   0     Preface | 
 | 17 |   0.1	Introduction/Credits | 
 | 18 |   0.2	Legal Stuff | 
 | 19 |  | 
 | 20 |   1	Collecting System Information | 
 | 21 |   1.1	Process-Specific Subdirectories | 
 | 22 |   1.2	Kernel data | 
 | 23 |   1.3	IDE devices in /proc/ide | 
 | 24 |   1.4	Networking info in /proc/net | 
 | 25 |   1.5	SCSI info | 
 | 26 |   1.6	Parallel port info in /proc/parport | 
 | 27 |   1.7	TTY info in /proc/tty | 
 | 28 |   1.8	Miscellaneous kernel statistics in /proc/stat | 
 | 29 |  | 
 | 30 |   2	Modifying System Parameters | 
 | 31 |   2.1	/proc/sys/fs - File system data | 
 | 32 |   2.2	/proc/sys/fs/binfmt_misc - Miscellaneous binary formats | 
 | 33 |   2.3	/proc/sys/kernel - general kernel parameters | 
 | 34 |   2.4	/proc/sys/vm - The virtual memory subsystem | 
 | 35 |   2.5	/proc/sys/dev - Device specific parameters | 
 | 36 |   2.6	/proc/sys/sunrpc - Remote procedure calls | 
 | 37 |   2.7	/proc/sys/net - Networking stuff | 
 | 38 |   2.8	/proc/sys/net/ipv4 - IPV4 settings | 
 | 39 |   2.9	Appletalk | 
 | 40 |   2.10	IPX | 
 | 41 |   2.11	/proc/sys/fs/mqueue - POSIX message queues filesystem | 
 | 42 |   2.12	/proc/<pid>/oom_adj - Adjust the oom-killer score | 
 | 43 |   2.13	/proc/<pid>/oom_score - Display current oom-killer score | 
 | 44 |   2.14	/proc/<pid>/io - Display the IO accounting fields | 
 | 45 |   2.15	/proc/<pid>/coredump_filter - Core dump filtering settings | 
 | 46 |  | 
 | 47 | ------------------------------------------------------------------------------ | 
 | 48 | Preface | 
 | 49 | ------------------------------------------------------------------------------ | 
 | 50 |  | 
 | 51 | 0.1 Introduction/Credits | 
 | 52 | ------------------------ | 
 | 53 |  | 
 | 54 | This documentation is  part of a soon (or  so we hope) to be  released book on | 
 | 55 | the SuSE  Linux distribution. As  there is  no complete documentation  for the | 
 | 56 | /proc file system and we've used  many freely available sources to write these | 
 | 57 | chapters, it  seems only fair  to give the work  back to the  Linux community. | 
 | 58 | This work is  based on the 2.2.*  kernel version and the  upcoming 2.4.*. I'm | 
 | 59 | afraid it's still far from complete, but we  hope it will be useful. As far as | 
 | 60 | we know, it is the first 'all-in-one' document about the /proc file system. It | 
 | 61 | is focused  on the Intel  x86 hardware,  so if you  are looking for  PPC, ARM, | 
 | 62 | SPARC, AXP, etc., features, you probably  won't find what you are looking for. | 
 | 63 | It also only covers IPv4 networking, not IPv6 nor other protocols - sorry. But | 
 | 64 | additions and patches  are welcome and will  be added to this  document if you | 
 | 65 | mail them to Bodo. | 
 | 66 |  | 
 | 67 | We'd like  to  thank Alan Cox, Rik van Riel, and Alexey Kuznetsov and a lot of | 
 | 68 | other people for help compiling this documentation. We'd also like to extend a | 
 | 69 | special thank  you to Andi Kleen for documentation, which we relied on heavily | 
 | 70 | to create  this  document,  as well as the additional information he provided. | 
 | 71 | Thanks to  everybody  else  who contributed source or docs to the Linux kernel | 
 | 72 | and helped create a great piece of software... :) | 
 | 73 |  | 
 | 74 | If you  have  any comments, corrections or additions, please don't hesitate to | 
 | 75 | contact Bodo  Bauer  at  bb@ricochet.net.  We'll  be happy to add them to this | 
 | 76 | document. | 
 | 77 |  | 
 | 78 | The   latest   version    of   this   document   is    available   online   at | 
 | 79 | http://skaro.nightcrawler.com/~bb/Docs/Proc as an HTML version. | 
 | 80 |  | 
 | 81 | If  the above  address does  not work  for you,  you could  try the  kernel | 
 | 82 | mailing  list  at  linux-kernel@vger.kernel.org  and/or try  to  reach  me  at | 
 | 83 | comandante@zaralinux.com. | 
 | 84 |  | 
 | 85 | 0.2 Legal Stuff | 
 | 86 | --------------- | 
 | 87 |  | 
 | 88 | We don't  guarantee  the  correctness  of this document, and if you come to us | 
 | 89 | complaining about  how  you  screwed  up  your  system  because  of  incorrect | 
 | 90 | documentation, we won't feel responsible... | 
 | 91 |  | 
 | 92 | ------------------------------------------------------------------------------ | 
 | 93 | CHAPTER 1: COLLECTING SYSTEM INFORMATION | 
 | 94 | ------------------------------------------------------------------------------ | 
 | 95 |  | 
 | 96 | ------------------------------------------------------------------------------ | 
 | 97 | In This Chapter | 
 | 98 | ------------------------------------------------------------------------------ | 
 | 99 | * Investigating  the  properties  of  the  pseudo  file  system  /proc and its | 
 | 100 |   ability to provide information on the running Linux system | 
 | 101 | * Examining /proc's structure | 
 | 102 | * Uncovering  various  information  about the kernel and the processes running | 
 | 103 |   on the system | 
 | 104 | ------------------------------------------------------------------------------ | 
 | 105 |  | 
 | 106 |  | 
 | 107 | The proc  file  system acts as an interface to internal data structures in the | 
 | 108 | kernel. It  can  be  used to obtain information about the system and to change | 
 | 109 | certain kernel parameters at runtime (sysctl). | 
 | 110 |  | 
 | 111 | First, we'll  take  a  look  at the read-only parts of /proc. In Chapter 2, we | 
 | 112 | show you how you can use /proc/sys to change settings. | 
 | 113 |  | 
 | 114 | 1.1 Process-Specific Subdirectories | 
 | 115 | ----------------------------------- | 
 | 116 |  | 
 | 117 | The directory  /proc  contains  (among other things) one subdirectory for each | 
 | 118 | process running on the system, which is named after the process ID (PID). | 
 | 119 |  | 
 | 120 | The link  self  points  to  the  process reading the file system. Each process | 
 | 121 | subdirectory has the entries listed in Table 1-1. | 
 | 122 |  | 
 | 123 |  | 
 | 124 | Table 1-1: Process specific entries in /proc  | 
 | 125 | .............................................................................. | 
 | 126 |  File		Content | 
 | 127 |  clear_refs	Clears page referenced bits shown in smaps output | 
 | 128 |  cmdline	Command line arguments | 
 | 129 |  cpu		Current and last cpu in which it was executed	(2.4)(smp) | 
 | 130 |  cwd		Link to the current working directory | 
 | 131 |  environ	Values of environment variables | 
 | 132 |  exe		Link to the executable of this process | 
 | 133 |  fd		Directory, which contains all file descriptors | 
 | 134 |  maps		Memory maps to executables and library files	(2.4) | 
 | 135 |  mem		Memory held by this process | 
 | 136 |  root		Link to the root directory of this process | 
 | 137 |  stat		Process status | 
 | 138 |  statm		Process memory status information | 
 | 139 |  status		Process status in human readable form | 
 | 140 |  wchan		If CONFIG_KALLSYMS is set, a pre-decoded wchan | 
 | 141 |  smaps		Extension based on maps, the rss size for each mapped file | 
 | 142 | .............................................................................. | 
 | 143 |  | 
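As a rough illustration of the layout described above, the following Python
sketch picks the per-process subdirectories out of a /proc listing (they are
the entries whose names consist only of digits) and splits the contents of a
cmdline file, whose arguments are separated by NUL bytes. The function names
are our own and purely illustrative; nothing here is an official interface.

```python
import os


def pids_from_names(names):
    """Return the sorted PIDs among directory entry names (all-digit names)."""
    return sorted(int(n) for n in names if n.isdecimal())


def parse_cmdline(raw):
    """Split the NUL-separated bytes of /proc/PID/cmdline into arguments."""
    if raw.endswith(b"\0"):
        raw = raw[:-1]          # drop the terminator of the last argument
    if not raw:
        return []               # e.g. kernel threads have an empty cmdline
    return [arg.decode(errors="replace") for arg in raw.split(b"\0")]


def list_processes(proc="/proc"):
    """Yield (pid, argv) pairs; only touches /proc when actually called."""
    for pid in pids_from_names(os.listdir(proc)):
        try:
            with open(os.path.join(proc, str(pid), "cmdline"), "rb") as f:
                yield pid, parse_cmdline(f.read())
        except OSError:
            continue            # the process exited in the meantime
```

On a running Linux system, list_processes() can be iterated to print one line
per process. A process may disappear between the directory listing and the
open, which is why the sketch ignores OSError.
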
 | 144 | For example, to get the status information of a process, all you have to do is | 
 | 145 | read the file /proc/PID/status: | 
 | 146 |  | 
 | 147 |   >cat /proc/self/status  | 
 | 148 |   Name:   cat  | 
 | 149 |   State:  R (running)  | 
 | 150 |   Pid:    5452  | 
 | 151 |   PPid:   743  | 
 | 152 |   TracerPid:      0						(2.4) | 
 | 153 |   Uid:    501     501     501     501  | 
 | 154 |   Gid:    100     100     100     100  | 
 | 155 |   Groups: 100 14 16  | 
 | 156 |   VmSize:     1112 kB  | 
 | 157 |   VmLck:         0 kB  | 
 | 158 |   VmRSS:       348 kB  | 
 | 159 |   VmData:       24 kB  | 
 | 160 |   VmStk:        12 kB  | 
 | 161 |   VmExe:         8 kB  | 
 | 162 |   VmLib:      1044 kB  | 
 | 163 |   SigPnd: 0000000000000000  | 
 | 164 |   SigBlk: 0000000000000000  | 
 | 165 |   SigIgn: 0000000000000000  | 
 | 166 |   SigCgt: 0000000000000000  | 
 | 167 |   CapInh: 00000000fffffeff  | 
 | 168 |   CapPrm: 0000000000000000  | 
 | 169 |   CapEff: 0000000000000000  | 
 | 170 |  | 
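Because every line of the status file has the form "Key: value", it is easy to
read from a program. The sketch below is a minimal parser, not a definitive
one; the function name parse_status is our own, and the sample text is a subset
of the output shown above.

```python
def parse_status(text):
    """Turn the 'Key: value' lines of /proc/PID/status into a dictionary."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:                       # skip lines without a colon
            fields[key.strip()] = value.strip()
    return fields


# Subset of the example output shown above.
SAMPLE = """\
Name:   cat
State:  R (running)
Pid:    5452
PPid:   743
Uid:    501     501     501     501
VmSize:     1112 kB
VmRSS:       348 kB
"""

status = parse_status(SAMPLE)
rss_kb = int(status["VmRSS"].split()[0])   # strip the "kB" unit
```

All values are kept as strings, since the file mixes numbers, hexadecimal
bitmaps and free text; callers convert the fields they need.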
 | 171 |  | 
 | 172 | This shows you nearly the same information you would get if you viewed it with | 
 | 173 | the ps  command.  In  fact,  ps  uses  the  proc  file  system  to  obtain its | 
 | 174 | information. The  statm  file  contains  more  detailed  information about the | 
 | 175 | process memory usage. Its seven fields are explained in Table 1-2.  The stat | 
 | 176 | file contains detailed information about the process itself.  Its fields are | 
 | 177 | explained in Table 1-3. | 
 | 178 |  | 
 | 179 |  | 
 | 180 | Table 1-2: Contents of the statm files (as of 2.6.8-rc3) | 
 | 181 | .............................................................................. | 
 | 182 |  Field    Content | 
 | 183 |  size     total program size (pages)		(same as VmSize in status) | 
 | 184 |  resident size of memory portions (pages)	(same as VmRSS in status) | 
 | 185 |  shared   number of pages that are shared	(i.e. backed by a file) | 
 | 186 |  trs      number of pages that are 'code'	(not including libs; broken, | 
 | 187 | 							includes data segment) | 
 | 188 |  lrs      number of pages of library		(always 0 on 2.6) | 
 | 189 |  drs      number of pages of data/stack		(including libs; broken, | 
 | 190 | 							includes library text) | 
 | 191 |  dt       number of dirty pages			(always 0 on 2.6) | 
 | 192 | .............................................................................. | 
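
The seven statm fields of Table 1-2 are plain numbers counted in pages, so they
have to be multiplied by the page size to be compared with the kB values in
status. The following sketch assumes a page size of 4 kB for its example; the
sample line is made up for illustration (it is chosen to match the VmSize and
VmRSS values of the status example) and the function names are our own.

```python
import os

STATM_FIELDS = ("size", "resident", "shared", "trs", "lrs", "drs", "dt")


def parse_statm(line, page_size_kb):
    """Map the fields of /proc/PID/statm to sizes in kB."""
    pages = [int(value) for value in line.split()]
    return {name: n * page_size_kb for name, n in zip(STATM_FIELDS, pages)}


def system_page_size_kb():
    """Page size of the running system in kB (not called in this example)."""
    return os.sysconf("SC_PAGE_SIZE") // 1024


# Hypothetical statm line, assuming 4 kB pages.
statm = parse_statm("278 87 70 2 0 30 0", page_size_kb=4)
```

With these assumed numbers, statm["size"] is 1112 kB and statm["resident"] is
348 kB, the same as VmSize and VmRSS in the status example.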
 | 193 |  | 
 | 194 |  | 
 | 195 | Table 1-3: Contents of the stat files (as of 2.6.22-rc3) | 
 | 196 | .............................................................................. | 
 | 197 |  Field          Content | 
 | 198 |   pid           process id | 
 | 199 |   tcomm         filename of the executable | 
 | 200 |   state         state (R is running, S is sleeping, D is sleeping in an | 
 | 201 |                 uninterruptible wait, Z is zombie, T is traced or stopped) | 
 | 202 |   ppid          process id of the parent process | 
 | 203 |   pgrp          pgrp of the process | 
 | 204 |   sid           session id | 
 | 205 |   tty_nr        tty the process uses | 
 | 206 |   tty_pgrp      pgrp of the tty | 
 | 207 |   flags         task flags | 
 | 208 |   min_flt       number of minor faults | 
 | 209 |   cmin_flt      number of minor faults with child's | 
 | 210 |   maj_flt       number of major faults | 
 | 211 |   cmaj_flt      number of major faults with child's | 
 | 212 |   utime         user mode jiffies | 
 | 213 |   stime         kernel mode jiffies | 
 | 214 |   cutime        user mode jiffies with child's | 
 | 215 |   cstime        kernel mode jiffies with child's | 
 | 216 |   priority      priority level | 
 | 217 |   nice          nice level | 
 | 218 |   num_threads   number of threads | 
 | 219 |   it_real_value	(obsolete, always 0) | 
 | 220 |   start_time    time the process started after system boot | 
 | 221 |   vsize         virtual memory size | 
 | 222 |   rss           resident set memory size | 
 | 223 |   rsslim        current limit in bytes on the rss | 
 | 224 |   start_code    address above which program text can run | 
 | 225 |   end_code      address below which program text can run | 
 | 226 |   start_stack   address of the start of the stack | 
 | 227 |   esp           current value of ESP | 
 | 228 |   eip           current value of EIP | 
 | 229 |   pending       bitmap of pending signals (obsolete) | 
 | 230 |   blocked       bitmap of blocked signals (obsolete) | 
 | 231 |   sigign        bitmap of ignored signals (obsolete) | 
 | 232 |   sigcatch      bitmap of caught signals (obsolete) | 
 | 233 |   wchan         address where process went to sleep | 
 | 234 |   0             (place holder) | 
 | 235 |   0             (place holder) | 
 | 236 |   exit_signal   signal to send to parent thread on exit | 
 | 237 |   task_cpu      which CPU the task is scheduled on | 
 | 238 |   rt_priority   realtime priority | 
 | 239 |   policy        scheduling policy (man sched_setscheduler) | 
 | 240 |   blkio_ticks   time spent waiting for block IO | 
 | 241 | .............................................................................. | 
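
The stat file is a single line of space-separated fields in the order of
Table 1-3. One detail matters when parsing it: the tcomm field is enclosed in
parentheses and may itself contain spaces, so a naive split() can shift all
later fields. The sketch below therefore cuts the line at the last closing
parenthesis. The sample line is invented for illustration, and the names
placeholder1 and placeholder2 are our own labels for the two place holders.

```python
# Field names following tcomm, in the order given in Table 1-3.
STAT_FIELDS = (
    "state", "ppid", "pgrp", "sid", "tty_nr", "tty_pgrp", "flags",
    "min_flt", "cmin_flt", "maj_flt", "cmaj_flt", "utime", "stime",
    "cutime", "cstime", "priority", "nice", "num_threads",
    "it_real_value", "start_time", "vsize", "rss", "rsslim",
    "start_code", "end_code", "start_stack", "esp", "eip", "pending",
    "blocked", "sigign", "sigcatch", "wchan", "placeholder1",
    "placeholder2", "exit_signal", "task_cpu", "rt_priority", "policy",
    "blkio_ticks",
)


def parse_stat(line):
    """Parse one /proc/PID/stat line into a dictionary."""
    lpar = line.index("(")
    rpar = line.rindex(")")           # last ')' ends tcomm, even with spaces
    result = {"pid": int(line[:lpar]), "tcomm": line[lpar + 1:rpar]}
    for name, value in zip(STAT_FIELDS, line[rpar + 1:].split()):
        result[name] = value if name == "state" else int(value)
    return result


# Hypothetical line; note the space inside the executable name.
SAMPLE = ("5452 (my prog) R 743 5452 743 34816 5452 4194304 120 0 0 0 3 1 "
          "0 0 20 0 1 0 123456 1138688 87 4294967295 134512640 134520000 "
          "3221223000 3221222000 1074000000 0 0 0 0 0 0 0 17 0 0 0 0")

stat = parse_stat(SAMPLE)
```

Kernels newer than the one documented here may append further fields; zip()
simply ignores anything beyond the names listed.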
 | 242 |  | 
 | 243 |  | 
 | 244 | 1.2 Kernel data | 
 | 245 | --------------- | 
 | 246 |  | 
 | 247 | Similar to  the  process entries, the kernel data files give information about | 
 | 248 | the running kernel. The files used to obtain this information are contained in | 
 | 249 | /proc and  are  listed  in Table 1-4. Not all of these will be present in your | 
 | 250 | system; which files exist depends  on the kernel configuration and the loaded | 
 | 251 | modules. | 
 | 252 |  | 
 | 253 | Table 1-4: Kernel info in /proc | 
 | 254 | .............................................................................. | 
 | 255 |  File        Content                                            | 
 | 256 |  apm         Advanced power management info                     | 
 | 257 |  buddyinfo   Kernel memory allocator information (see text)	(2.5) | 
 | 258 |  bus         Directory containing bus specific information      | 
 | 259 |  cmdline     Kernel command line                                | 
 | 260 |  cpuinfo     Info about the CPU                                 | 
 | 261 |  devices     Available devices (block and character)            | 
 | 262 |  dma         Used DMA channels                                  | 
 | 263 |  filesystems Supported filesystems                              | 
 | 264 |  driver	     Various drivers grouped here, currently rtc (2.4) | 
 | 265 |  execdomains Execdomains, related to security			(2.4) | 
 | 266 |  fb	     Frame Buffer devices				(2.4) | 
 | 267 |  fs	     File system parameters, currently nfs/exports	(2.4) | 
 | 268 |  ide         Directory containing info about the IDE subsystem  | 
 | 269 |  interrupts  Interrupt usage                                    | 
 | 270 |  iomem	     Memory map						(2.4) | 
 | 271 |  ioports     I/O port usage                                     | 
 | 272 |  irq	     Masks for irq to cpu affinity			(2.4)(smp?) | 
 | 273 |  isapnp	     ISA PnP (Plug&Play) Info				(2.4) | 
 | 274 |  kcore       Kernel core image (can be ELF or A.OUT(deprecated in 2.4))    | 
 | 275 |  kmsg        Kernel messages                                    | 
 | 276 |  ksyms       Kernel symbol table                                | 
 | 277 |  loadavg     Load average of last 1, 5 & 15 minutes                 | 
 | 278 |  locks       Kernel locks                                       | 
 | 279 |  meminfo     Memory info                                        | 
 | 280 |  misc        Miscellaneous                                      | 
 | 281 |  modules     List of loaded modules                             | 
 | 282 |  mounts      Mounted filesystems                                | 
 | 283 |  net         Networking info (see text)                         | 
 | 284 |  partitions  Table of partitions known to the system            | 
 | 285 |  pci	     Deprecated info of PCI bus (new way -> /proc/bus/pci/, | 
 | 286 |              decoupled by lspci)				(2.4) | 
 | 287 |  rtc         Real time clock                                    | 
 | 288 |  scsi        SCSI info (see text)                               | 
 | 289 |  slabinfo    Slab pool info                                     | 
 | 290 |  stat        Overall statistics                                 | 
 | 291 |  swaps       Swap space utilization                             | 
 | 292 |  sys         See chapter 2                                      | 
 | 293 |  sysvipc     Info of SysVIPC Resources (msg, sem, shm)		(2.4) | 
 | 294 |  tty	     Info of tty drivers | 
 | 295 |  uptime      System uptime                                      | 
 | 296 |  version     Kernel version                                     | 
 | 297 |  video	     bttv info of video resources			(2.4) | 
 | 298 | .............................................................................. | 
 | 299 |  | 
 | 300 | You can,  for  example,  check  which interrupts are currently in use and what | 
 | 301 | they are used for by looking in the file /proc/interrupts: | 
 | 302 |  | 
 | 303 |   > cat /proc/interrupts  | 
 | 304 |              CPU0         | 
 | 305 |     0:    8728810          XT-PIC  timer  | 
 | 306 |     1:        895          XT-PIC  keyboard  | 
 | 307 |     2:          0          XT-PIC  cascade  | 
 | 308 |     3:     531695          XT-PIC  aha152x  | 
 | 309 |     4:    2014133          XT-PIC  serial  | 
 | 310 |     5:      44401          XT-PIC  pcnet_cs  | 
 | 311 |     8:          2          XT-PIC  rtc  | 
 | 312 |    11:          8          XT-PIC  i82365  | 
 | 313 |    12:     182918          XT-PIC  PS/2 Mouse  | 
 | 314 |    13:          1          XT-PIC  fpu  | 
 | 315 |    14:    1232265          XT-PIC  ide0  | 
 | 316 |    15:          7          XT-PIC  ide1  | 
 | 317 |   NMI:          0  | 
 | 318 |  | 
 | 319 | In 2.4.* a couple of lines were added to this file: LOC and ERR (this time the | 
 | 320 | output is from an SMP machine): | 
 | 321 |  | 
 | 322 |   > cat /proc/interrupts  | 
 | 323 |  | 
 | 324 |              CPU0       CPU1        | 
 | 325 |     0:    1243498    1214548    IO-APIC-edge  timer | 
 | 326 |     1:       8949       8958    IO-APIC-edge  keyboard | 
 | 327 |     2:          0          0          XT-PIC  cascade | 
 | 328 |     5:      11286      10161    IO-APIC-edge  soundblaster | 
 | 329 |     8:          1          0    IO-APIC-edge  rtc | 
 | 330 |     9:      27422      27407    IO-APIC-edge  3c503 | 
 | 331 |    12:     113645     113873    IO-APIC-edge  PS/2 Mouse | 
 | 332 |    13:          0          0          XT-PIC  fpu | 
 | 333 |    14:      22491      24012    IO-APIC-edge  ide0 | 
 | 334 |    15:       2183       2415    IO-APIC-edge  ide1 | 
 | 335 |    17:      30564      30414   IO-APIC-level  eth0 | 
 | 336 |    18:        177        164   IO-APIC-level  bttv | 
 | 337 |   NMI:    2457961    2457959  | 
 | 338 |   LOC:    2457882    2457881  | 
 | 339 |   ERR:       2155 | 
 | 340 |  | 
 | 341 | NMI is incremented in this case because every timer interrupt generates an NMI | 
 | 342 | (Non Maskable Interrupt) which is used by the NMI Watchdog to detect lockups. | 
 | 343 |  | 
 | 344 | LOC is the local interrupt counter of the internal APIC of every CPU. | 
 | 345 |  | 
 | 346 | ERR is incremented in the case of errors in the IO-APIC bus (the bus that | 
 | 347 | connects the CPUs in an SMP system). This means that an error has been detected; | 
 | 348 | the IO-APIC automatically retries the transmission, so it should not be a big | 
 | 349 | problem, but you should read the SMP-FAQ. | 
 | 350 |  | 
 | 351 | In 2.6.2* /proc/interrupts was expanded again.  This time the goal was for | 
 | 352 | /proc/interrupts to display every IRQ vector in use by the system, not | 
 | 353 | just those considered 'most important'.  The new vectors are: | 
 | 354 |  | 
 | 355 |   THR -- interrupt raised when a machine check threshold counter | 
 | 356 |   (typically counting ECC corrected errors of memory or cache) exceeds | 
 | 357 |   a configurable threshold.  Only available on some systems. | 
 | 358 |  | 
 | 359 |   TRM -- a thermal event interrupt occurs when a temperature threshold | 
 | 360 |   has been exceeded for the CPU.  This interrupt may also be generated | 
 | 361 |   when the temperature drops back to normal. | 
 | 362 |  | 
 | 363 |   SPU -- a spurious interrupt is some interrupt that was raised then lowered | 
 | 364 |   by some IO device before it could be fully processed by the APIC.  Hence | 
 | 365 |   the APIC sees the interrupt but does not know what device it came from. | 
 | 366 |   For this case the APIC will generate the interrupt with an IRQ vector | 
 | 367 |   of 0xff. This might also be generated by chipset bugs. | 
 | 368 |  | 
 | 369 |   RES, CAL, TLB -- rescheduling, call and TLB flush interrupts are | 
 | 370 |   sent from one CPU to another per the needs of the OS.  Typically, | 
 | 371 |   their statistics are used by kernel developers and interested users to | 
 | 372 |   determine the occurrence of interrupts of the given type. | 
 | 373 |  | 
 | 374 | The above IRQ vectors are displayed only when relevant.  For example, | 
 | 375 | the threshold vector does not exist on x86_64 platforms.  Others are | 
 | 376 | suppressed when the system is a uniprocessor.  As of this writing, only | 
 | 377 | i386 and x86_64 platforms support the new IRQ vector displays. | 
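
The layout of /proc/interrupts shown in the examples above (a header naming
the CPUs, then one line per IRQ or counter with one count per CPU and an
optional description) can be read with a few lines of Python. This is only a
sketch under the assumption that the file keeps that layout; the function name
is our own, and the sample is a subset of the SMP output shown earlier.

```python
def parse_interrupts(text):
    """Return {label: (per_cpu_counts, description)} for /proc/interrupts."""
    lines = [line for line in text.splitlines() if line.strip()]
    cpus = lines[0].split()                 # e.g. ["CPU0", "CPU1"]
    table = {}
    for line in lines[1:]:
        label, _, rest = line.partition(":")
        tokens = rest.split()
        counts = []
        # Leading numeric tokens are counters, at most one per CPU.
        while tokens and len(counts) < len(cpus) and tokens[0].isdigit():
            counts.append(int(tokens.pop(0)))
        table[label.strip()] = (counts, " ".join(tokens))
    return table


# Subset of the SMP example output shown above.
SAMPLE = """\
           CPU0       CPU1
  0:    1243498    1214548    IO-APIC-edge  timer
 17:      30564      30414   IO-APIC-level  eth0
NMI:    2457961    2457959
LOC:    2457882    2457881
ERR:       2155
"""

table = parse_interrupts(SAMPLE)
eth0_total = sum(table["17"][0])            # interrupts on both CPUs
```

Lines such as ERR carry a single system-wide counter rather than one per CPU,
which is why the sketch does not insist on a fixed number of columns.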
 | 378 |  | 
 | 379 | Of some interest is the introduction of the /proc/irq directory to 2.4. | 
 | 380 | It can be used to set IRQ to CPU affinity. This means that you can "hook" an | 
 | 381 | IRQ to only one CPU, or exclude a CPU from handling IRQs. The irq subdir | 
 | 382 | contains one subdir for each IRQ, and one file: prof_cpu_mask. | 
 | 383 |  | 
 | 384 | For example: | 
 | 385 |   > ls /proc/irq/ | 
 | 386 |   0  10  12  14  16  18  2  4  6  8  prof_cpu_mask | 
 | 387 |   1  11  13  15  17  19  3  5  7  9 | 
 | 388 |   > ls /proc/irq/0/ | 
 | 389 |   smp_affinity | 
 | 390 |  | 
 | 391 | The contents of the prof_cpu_mask file and each smp_affinity file for each IRQ | 
 | 392 | are the same by default: | 
 | 393 |  | 
 | 394 |   > cat /proc/irq/0/smp_affinity  | 
 | 395 |   ffffffff | 
 | 396 |  | 
 | 397 | It's a bitmask, in which you can specify which CPUs can handle the IRQ. You can | 
 | 398 | set it by doing: | 
 | 399 |  | 
 | 400 |   > echo 1 > /proc/irq/10/smp_affinity | 
 | 401 |  | 
 | 402 | This means that only the first CPU will handle the IRQ, but you can also echo 5 | 
 | 403 | which means that only the first and third CPU can handle the IRQ. | 
 | 404 |  | 
 | 405 | The way IRQs are routed is handled by the IO-APIC, and it's Round Robin | 
 | 406 | between all the CPUs which are allowed to handle it. As usual the kernel has | 
 | 407 | more info than you and does a better job than you, so the defaults are the | 
 | 408 | best choice for almost everyone. | 
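
Since smp_affinity is a hexadecimal bitmask with one bit per CPU (bit 0 being
the first CPU), converting between a mask and a list of CPU numbers is simple
bit arithmetic. The helper names below are our own; the sketch only
illustrates the encoding and does not write to /proc.

```python
def cpus_from_mask(mask_hex):
    """Return the zero-based CPU numbers whose bits are set in the mask."""
    mask = int(mask_hex, 16)
    return [cpu for cpu in range(mask.bit_length()) if (mask >> cpu) & 1]


def mask_from_cpus(cpus):
    """Build the hexadecimal mask string for a collection of CPU numbers."""
    mask = 0
    for cpu in cpus:
        mask |= 1 << cpu
    return format(mask, "x")
```

For example, cpus_from_mask("5") gives [0, 2], because 5 is binary 101: the
first and third CPU. The default value ffffffff has all 32 bits set.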
 | 409 |  | 
 | 410 | There are  three  more  important subdirectories in /proc: net, scsi, and sys. | 
 | 411 | The general  rule  is  that  the  contents,  or  even  the  existence of these | 
 | 412 | directories, depend  on your kernel configuration. If SCSI is not enabled, the | 
 | 413 | directory scsi  may  not  exist. The same is true of the net directory, which is there | 
 | 414 | only when networking support is present in the running kernel. | 
 | 415 |  | 
 | 416 | The slabinfo  file  gives  information  about  memory usage at the slab level. | 
 | 417 | Linux uses  slab  pools for memory management above page level in version 2.2. | 
 | 418 | Commonly used  objects  have  their  own  slab  pool (such as network buffers, | 
 | 419 | directory cache, and so on). | 
 | 420 |  | 
 | 421 | .............................................................................. | 
 | 422 |  | 
 | 423 | > cat /proc/buddyinfo | 
 | 424 |  | 
 | 425 | Node 0, zone      DMA      0      4      5      4      4      3 ... | 
 | 426 | Node 0, zone   Normal      1      0      0      1    101      8 ... | 
 | 427 | Node 0, zone  HighMem      2      0      0      1      1      0 ... | 
 | 428 |  | 
 | 429 | Memory fragmentation is a problem under some workloads, and buddyinfo is a  | 
 | 430 | useful tool for helping diagnose these problems.  Buddyinfo will give you a  | 
 | 431 | clue as to how big an area you can safely allocate, or why a previous | 
 | 432 | allocation failed. | 
 | 433 |  | 
 | 434 | Each column represents the number of pages of a certain order which are  | 
 | 435 | available.  In this case, there are 0 chunks of 2^0*PAGE_SIZE available in  | 
 | 436 | ZONE_DMA, 4 chunks of 2^1*PAGE_SIZE in ZONE_DMA, 101 chunks of 2^4*PAGE_SIZE  | 
 | 437 | available in ZONE_NORMAL, etc...  | 
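
To make the column arithmetic concrete, the sketch below parses one buddyinfo
line and adds up the free pages it represents: the column for order n counts
chunks of 2^n pages. The function names are our own. The sample uses only the
six columns visible in the ZONE_DMA line above; the columns elided there are
left out, so the total covers those six orders only.

```python
def parse_buddyinfo_line(line):
    """Return (node, zone, counts) for one line of /proc/buddyinfo."""
    head, _, tail = line.partition("zone")
    node = int(head.replace("Node", "").replace(",", "").strip())
    parts = tail.split()
    return node, parts[0], [int(count) for count in parts[1:]]


def free_pages(counts):
    """Total free pages: counts[order] chunks of 2**order pages each."""
    return sum(count << order for order, count in enumerate(counts))


def largest_free_order(counts):
    """Highest order with at least one free chunk, or None if all are 0."""
    orders = [order for order, count in enumerate(counts) if count]
    return max(orders) if orders else None


# Visible columns of the ZONE_DMA line from the example above.
node, zone, counts = parse_buddyinfo_line(
    "Node 0, zone      DMA      0      4      5      4      4      3")
```

Here free_pages(counts) is 0*1 + 4*2 + 5*4 + 4*8 + 4*16 + 3*32 = 220 pages, and
the largest order with a free chunk among the visible columns is 5.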
 | 438 |  | 
 | 439 | .............................................................................. | 
 | 440 |  | 
 | 441 | meminfo: | 
 | 442 |  | 
 | 443 | Provides information about distribution and utilization of memory.  This | 
 | 444 | varies by architecture and compile options.  The following is from a | 
 | 445 | 16GB PIII, which has highmem enabled.  You may not have all of these fields. | 
 | 446 |  | 
 | 447 | > cat /proc/meminfo | 
 | 448 |  | 
 | 449 |  | 
 | 450 | MemTotal:     16344972 kB | 
 | 451 | MemFree:      13634064 kB | 
 | 452 | Buffers:          3656 kB | 
 | 453 | Cached:        1195708 kB | 
 | 454 | SwapCached:          0 kB | 
 | 455 | Active:         891636 kB | 
 | 456 | Inactive:      1077224 kB | 
 | 457 | HighTotal:    15597528 kB | 
 | 458 | HighFree:     13629632 kB | 
 | 459 | LowTotal:       747444 kB | 
 | 460 | LowFree:          4432 kB | 
 | 461 | SwapTotal:           0 kB | 
 | 462 | SwapFree:            0 kB | 
 | 463 | Dirty:             968 kB | 
 | 464 | Writeback:           0 kB | 
 | 465 | Mapped:         280372 kB | 
 | 466 | Slab:           684068 kB | 
 | 467 | CommitLimit:   7669796 kB | 
 | 468 | Committed_AS:   100056 kB | 
 | 469 | PageTables:      24448 kB | 
 | 470 | VmallocTotal:   112216 kB | 
 | 471 | VmallocUsed:       428 kB | 
 | 472 | VmallocChunk:   111088 kB | 
 | 473 |  | 
 | 474 |     MemTotal: Total usable ram (i.e. physical ram minus a few reserved | 
 | 475 |               bits and the kernel binary code) | 
 | 476 |      MemFree: The sum of LowFree+HighFree | 
 | 477 |      Buffers: Relatively temporary storage for raw disk blocks; | 
 | 478 |               shouldn't get tremendously large (20MB or so) | 
 | 479 |       Cached: in-memory cache for files read from the disk (the | 
 | 480 |               pagecache).  Doesn't include SwapCached | 
 | 481 |   SwapCached: Memory that once was swapped out, is swapped back in but | 
 | 482 |               still also is in the swapfile (if memory is needed it | 
 | 483 |               doesn't need to be swapped out AGAIN because it is already | 
 | 484 |               in the swapfile. This saves I/O) | 
 | 485 |       Active: Memory that has been used more recently and usually not | 
 | 486 |               reclaimed unless absolutely necessary. | 
 | 487 |     Inactive: Memory which has been less recently used.  It is more | 
 | 488 |               eligible to be reclaimed for other purposes | 
 | 489 |    HighTotal: | 
 | 490 |     HighFree: Highmem is all memory above ~860MB of physical memory | 
 | 491 |               Highmem areas are for use by userspace programs, or | 
 | 492 |               for the pagecache.  The kernel must use tricks to access | 
 | 493 |               this memory, making it slower to access than lowmem. | 
 | 494 |     LowTotal: | 
 | 495 |      LowFree: Lowmem is memory which can be used for everything that | 
 | 496 |               highmem can be used for, but it is also available for the | 
 | 497 |               kernel's use for its own data structures.  Among many | 
 | 498 |               other things, it is where everything from the Slab is | 
 | 499 |               allocated.  Bad things happen when you're out of lowmem. | 
 | 500 |    SwapTotal: total amount of swap space available | 
 | 501 |     SwapFree: amount of swap space that is currently unused (memory | 
 | 502 |               evicted from RAM occupies the rest, SwapTotal - SwapFree) | 
 | 503 |        Dirty: Memory which is waiting to get written back to the disk | 
 | 504 |    Writeback: Memory which is actively being written back to the disk | 
 | 505 |       Mapped: files which have been mmapped, such as libraries | 
 | 506 |         Slab: in-kernel data structures cache | 
 | 507 |  CommitLimit: Based on the overcommit ratio ('vm.overcommit_ratio'), | 
 | 508 |               this is the total amount of  memory currently available to | 
 | 509 |               be allocated on the system. This limit is only adhered to | 
 | 510 |               if strict overcommit accounting is enabled (mode 2 in | 
 | 511 |               'vm.overcommit_memory'). | 
 | 512 |               The CommitLimit is calculated with the following formula: | 
 | 513 |               CommitLimit = ('vm.overcommit_ratio' / 100 * Physical RAM) + Swap | 
 | 514 |               For example, on a system with 1G of physical RAM and 7G | 
 | 515 |               of swap with a 'vm.overcommit_ratio' of 30 it would | 
 | 516 |               yield a CommitLimit of 7.3G. | 
 | 517 |               For more details, see the memory overcommit documentation | 
 | 518 |               in vm/overcommit-accounting. | 
 | 519 | Committed_AS: The amount of memory presently allocated on the system. | 
 | 520 |               The committed memory is a sum of all of the memory which | 
 | 521 |               has been allocated by processes, even if it has not been | 
 | 522 |               "used" by them as of yet. A process which malloc()'s 1G | 
 | 523 |               of memory, but only touches 300M of it will only show up | 
 | 524 |               as using 300M of memory even if it has the address space | 
 | 525 |               allocated for the entire 1G. This 1G is memory which has | 
 | 526 |               been "committed" to by the VM and can be used at any time | 
 | 527 |               by the allocating application. With strict overcommit | 
 | 528 |               enabled on the system (mode 2 in 'vm.overcommit_memory'), | 
 | 529 |               allocations which would exceed the CommitLimit (detailed | 
 | 530 |               above) will not be permitted. This is useful if one needs | 
 | 531 |               to guarantee that processes will not fail due to lack of | 
 | 532 |               memory once that memory has been successfully allocated. | 
 | 533 |   PageTables: amount of memory dedicated to the lowest level of page | 
 | 534 |               tables. | 
 | 535 | VmallocTotal: total size of vmalloc memory area | 
 | 536 |  VmallocUsed: amount of vmalloc area which is used | 
 | 537 | VmallocChunk: largest contiguous block of vmalloc area which is free | 
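
The CommitLimit formula above can be checked with a few lines of Python. This
is only a sketch that follows the simplified formula given in this document;
the helper names parse_meminfo and commit_limit_kb are made up for this
example, and the values are assumed to be in kB as printed by /proc/meminfo.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into {field: first numeric value}."""
    fields = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        tokens = rest.split()
        # Skip lines without a "Name: number" shape (e.g. old-style headers).
        if not sep or not tokens or not tokens[0].isdigit():
            continue
        fields[name.strip()] = int(tokens[0])
    return fields


def commit_limit_kb(ram_kb, swap_kb, overcommit_ratio):
    """CommitLimit = (overcommit_ratio percent of physical RAM) + swap."""
    return ram_kb * overcommit_ratio // 100 + swap_kb


GIB_KB = 1024 * 1024  # one gigabyte expressed in kB

# The example from the text: 1G of RAM, 7G of swap, overcommit ratio of 30.
limit = commit_limit_kb(1 * GIB_KB, 7 * GIB_KB, 30)
print(f"CommitLimit: {limit / GIB_KB:.1f}G")
```

On a live system the inputs would come from parse_meminfo(open("/proc/meminfo").read())
and from /proc/sys/vm/overcommit_ratio; comparing the result with Committed_AS
shows how much headroom is left under strict overcommit.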
 | 538 |  | 
 | 539 |  | 
 | 540 | 1.3 IDE devices in /proc/ide | 
 | 541 | ---------------------------- | 
 | 542 |  | 
 | 543 | The subdirectory /proc/ide contains information about all IDE devices of which | 
 | 544 | the kernel  is  aware.  There is one subdirectory for each IDE controller, the | 
 | 545 | file drivers  and a link for each IDE device, pointing to the device directory | 
 | 546 | in the controller specific subtree. | 
 | 547 |  | 
 | 548 | The file  drivers  contains general information about the drivers used for the | 
 | 549 | IDE devices: | 
 | 550 |  | 
 | 551 |   > cat /proc/ide/drivers | 
 | 552 |   ide-cdrom version 4.53 | 
 | 553 |   ide-disk version 1.08 | 
 | 554 |  | 
 | 555 | More detailed  information  can  be  found  in  the  controller  specific | 
 | 556 | subdirectories. These  are  named  ide0,  ide1  and  so  on.  Each  of  these | 
 | 557 | directories contains the files shown in table 1-5. | 
 | 558 |  | 
 | 559 |  | 
 | 560 | Table 1-5: IDE controller info in  /proc/ide/ide? | 
 | 561 | .............................................................................. | 
 | 562 |  File    Content                                  | 
 | 563 |  channel IDE channel (0 or 1)                     | 
 | 564 |  config  Configuration (only for PCI/IDE bridge)  | 
 | 565 |  mate    Mate name                                | 
 | 566 |  model   Type/Chipset of IDE controller           | 
 | 567 | .............................................................................. | 
 | 568 |  | 
 | 569 | Each device  connected  to  a  controller  has  a separate subdirectory in the | 
 | 570 | controller's directory.  The  files  listed in table 1-6 are contained in these | 
 | 571 | directories. | 
 | 572 |  | 
 | 573 |  | 
 | 574 | Table 1-6: IDE device information | 
 | 575 | .............................................................................. | 
 | 576 |  File             Content                                     | 
 | 577 |  cache            The cache                                   | 
 | 578 |  capacity         Capacity of the medium (in 512-byte blocks) | 
 | 579 |  driver           driver and version                          | 
 | 580 |  geometry         physical and logical geometry               | 
 | 581 |  identify         device identify block                       | 
 | 582 |  media            media type                                  | 
 | 583 |  model            device identifier                           | 
 | 584 |  settings         device setup                                | 
 | 585 |  smart_thresholds IDE disk management thresholds              | 
 | 586 |  smart_values     IDE disk management values                  | 
 | 587 | .............................................................................. | 
 | 588 |  | 
 | 589 | The most  interesting  file is settings. This file contains a nice overview of | 
 | 590 | the drive parameters: | 
 | 591 |  | 
 | 592 |   # cat /proc/ide/ide0/hda/settings  | 
 | 593 |   name                    value           min             max             mode  | 
 | 594 |   ----                    -----           ---             ---             ----  | 
 | 595 |   bios_cyl                526             0               65535           rw  | 
 | 596 |   bios_head               255             0               255             rw  | 
 | 597 |   bios_sect               63              0               63              rw  | 
 | 598 |   breada_readahead        4               0               127             rw  | 
 | 599 |   bswap                   0               0               1               r  | 
 | 600 |   file_readahead          72              0               2097151         rw  | 
 | 601 |   io_32bit                0               0               3               rw  | 
 | 602 |   keepsettings            0               0               1               rw  | 
 | 603 |   max_kb_per_request      122             1               127             rw  | 
 | 604 |   multcount               0               0               8               rw  | 
 | 605 |   nice1                   1               0               1               rw  | 
 | 606 |   nowerr                  0               0               1               rw  | 
 | 607 |   pio_mode                write-only      0               255             w  | 
 | 608 |   slow                    0               0               1               rw  | 
 | 609 |   unmaskirq               0               0               1               rw  | 
 | 610 |   using_dma               0               0               1               rw  | 
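
The settings listing is a plain whitespace-separated table, so it is easy to
load into a dictionary for scripted checks. The following is a minimal sketch
that works on captured text in the format shown above; parse_ide_settings is a
hypothetical helper name, not part of any kernel tool.

```python
def parse_ide_settings(text):
    """Parse the 'name value min max mode' table into a dict keyed by name."""
    settings = {}
    for line in text.splitlines():
        cols = line.split()
        # Skip blank lines, the header row and the row of dashes below it.
        if len(cols) != 5 or cols[0] == "name" or set(cols[0]) == {"-"}:
            continue
        name, value, lowest, highest, mode = cols
        settings[name] = {
            # Write-only entries such as pio_mode have no readable value.
            "value": int(value) if value.isdigit() else None,
            "min": int(lowest),
            "max": int(highest),
            "mode": mode,
        }
    return settings


SAMPLE = """\
name                    value           min             max             mode
----                    -----           ---             ---             ----
bios_cyl                526             0               65535           rw
pio_mode                write-only      0               255             w
using_dma               0               0               1               rw
"""

for name, info in parse_ide_settings(SAMPLE).items():
    print(name, info["value"], info["mode"])
```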
 | 611 |  | 
 | 612 |  | 
 | 613 | 1.4 Networking info in /proc/net | 
 | 614 | -------------------------------- | 
 | 615 |  | 
 | 616 | The subdirectory  /proc/net  follows  the  usual  pattern. Table 1-6 shows the | 
 | 617 | additional values  you  get  for  IP  version 6 if you configure the kernel to | 
 | 618 | support this. Table 1-7 lists the files and their meaning. | 
 | 619 |  | 
 | 620 |  | 
 | 621 | Table 1-6: IPv6 info in /proc/net  | 
 | 622 | .............................................................................. | 
 | 623 |  File       Content                                                | 
 | 624 |  udp6       UDP sockets (IPv6)                                     | 
 | 625 |  tcp6       TCP sockets (IPv6)                                     | 
 | 626 |  raw6       Raw device statistics (IPv6)                           | 
 | 627 |  igmp6      IP multicast addresses, which this host joined (IPv6)  | 
 | 628 |  if_inet6   List of IPv6 interface addresses                       | 
 | 629 |  ipv6_route Kernel routing table for IPv6                          | 
 | 630 |  rt6_stats  Global IPv6 routing tables statistics                  | 
 | 631 |  sockstat6  Socket statistics (IPv6)                               | 
 | 632 |  snmp6      Snmp data (IPv6)                                       | 
 | 633 | .............................................................................. | 
 | 634 |  | 
 | 635 |  | 
 | 636 | Table 1-7: Network info in /proc/net  | 
 | 637 | .............................................................................. | 
 | 638 |  File          Content                                                          | 
 | 639 |  arp           Kernel  ARP table                                                | 
 | 640 |  dev           network devices with statistics                                  | 
 | 641 |  dev_mcast     the Layer2 multicast groups a device is listening to | 
 | 642 |                (interface index, label, number of references, number of bound | 
 | 643 |                addresses).  | 
 | 644 |  dev_stat      network device status                                            | 
 | 645 |  ip_fwchains   Firewall chain linkage                                           | 
 | 646 |  ip_fwnames    Firewall chain names                                             | 
 | 647 |  ip_masq       Directory containing the masquerading tables                     | 
 | 648 |  ip_masquerade Major masquerading table                                         | 
 | 649 |  netstat       Network statistics                                               | 
 | 650 |  raw           raw device statistics                                            | 
 | 651 |  route         Kernel routing table                                             | 
 | 652 |  rpc           Directory containing rpc info                                    | 
 | 653 |  rt_cache      Routing cache                                                    | 
 | 654 |  snmp          SNMP data                                                        | 
 | 655 |  sockstat      Socket statistics                                                | 
 | 656 |  tcp           TCP  sockets                                                     | 
 | 657 |  tr_rif        Token ring RIF routing table                                     | 
 | 658 |  udp           UDP sockets                                                      | 
 | 659 |  unix          UNIX domain sockets                                              | 
 | 660 |  wireless      Wireless interface data (Wavelan etc)                            | 
 | 661 |  igmp          IP multicast addresses, which this host joined                   | 
 | 662 |  psched        Global packet scheduler parameters.                              | 
 | 663 |  netlink       List of PF_NETLINK sockets                                       | 
 | 664 |  ip_mr_vifs    List of multicast virtual interfaces                             | 
 | 665 |  ip_mr_cache   List of multicast routing cache                                  | 
 | 666 | .............................................................................. | 
 | 667 |  | 
 | 668 | You can  use  this  information  to see which network devices are available in | 
 | 669 | your system and how much traffic was routed over those devices: | 
 | 670 |  | 
 | 671 |   > cat /proc/net/dev  | 
 | 672 |   Inter-|Receive                                                   |[...  | 
 | 673 |    face |bytes    packets errs drop fifo frame compressed multicast|[...  | 
 | 674 |       lo:  908188   5596     0    0    0     0          0         0 [...          | 
 | 675 |     ppp0:15475140  20721   410    0    0   410          0         0 [...   | 
 | 676 |     eth0:  614530   7085     0    0    0     0          0         1 [...  | 
 | 677 |     | 
 | 678 |   ...] Transmit  | 
 | 679 |   ...] bytes    packets errs drop fifo colls carrier compressed  | 
 | 680 |   ...]  908188     5596    0    0    0     0       0          0  | 
 | 681 |   ...] 1375103    17405    0    0    0     0       0          0  | 
 | 682 |   ...] 1703981     5535    0    0    0     3       0          0  | 
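
To work with these counters in a script, each interface line can be split at
the colon and the sixteen numbers divided into a receive half and a transmit
half. The sketch below assumes the column layout shown in the example above
(eight receive and eight transmit counters); parse_net_dev and the field name
tuples are names chosen for this illustration.

```python
RX_FIELDS = ("bytes", "packets", "errs", "drop",
             "fifo", "frame", "compressed", "multicast")
TX_FIELDS = ("bytes", "packets", "errs", "drop",
             "fifo", "colls", "carrier", "compressed")


def parse_net_dev(text):
    """Return {interface: {"rx": {...}, "tx": {...}}} from /proc/net/dev text."""
    stats = {}
    for line in text.splitlines():
        # The counters may follow the colon without a space ("ppp0:15475140").
        name, sep, rest = line.partition(":")
        values = rest.split()
        if not sep or len(values) != 16 or not all(v.isdigit() for v in values):
            continue  # header lines or an unexpected layout
        numbers = [int(v) for v in values]
        stats[name.strip()] = {
            "rx": dict(zip(RX_FIELDS, numbers[:8])),
            "tx": dict(zip(TX_FIELDS, numbers[8:])),
        }
    return stats


# The example output from above, with both halves joined on one line.
SAMPLE = """\
Inter-|   Receive                                                |  Transmit
 face |bytes    packets errs drop fifo frame compressed multicast|bytes    packets errs drop fifo colls carrier compressed
    lo:  908188   5596     0    0    0     0          0         0   908188     5596    0    0    0     0       0          0
  ppp0:15475140  20721   410    0    0   410          0         0  1375103    17405    0    0    0     0       0          0
  eth0:  614530   7085     0    0    0     0          0         1  1703981     5535    0    0    0     3       0          0
"""

for iface, counters in parse_net_dev(SAMPLE).items():
    print(iface, counters["rx"]["bytes"], counters["tx"]["bytes"])
```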
 | 683 |  | 
 | 684 | In addition, each Channel Bond interface has its own directory.  For | 
 | 685 | example, the bond0 device will have a directory called /proc/net/bond0/. | 
 | 686 | It will contain information that is specific to that bond, such as the | 
 | 687 | current slaves of the bond, the link status of the slaves, and how | 
 | 688 | many times the slaves' links have failed. | 
 | 689 |  | 
 | 690 | 1.5 SCSI info | 
 | 691 | ------------- | 
 | 692 |  | 
 | 693 | If you  have  a  SCSI  host adapter in your system, you'll find a subdirectory | 
 | 694 | named after  the driver for this adapter in /proc/scsi. You'll also see a list | 
 | 695 | of all recognized SCSI devices in /proc/scsi: | 
 | 696 |  | 
 | 697 |   > cat /proc/scsi/scsi  | 
 | 698 |   Attached devices:  | 
 | 699 |   Host: scsi0 Channel: 00 Id: 00 Lun: 00  | 
 | 700 |     Vendor: IBM      Model: DGHS09U          Rev: 03E0  | 
 | 701 |     Type:   Direct-Access                    ANSI SCSI revision: 03  | 
 | 702 |   Host: scsi0 Channel: 00 Id: 06 Lun: 00  | 
 | 703 |     Vendor: PIONEER  Model: CD-ROM DR-U06S   Rev: 1.04  | 
 | 704 |     Type:   CD-ROM                           ANSI SCSI revision: 02  | 
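
Each device in this listing takes three lines (Host, Vendor/Model/Rev and
Type), which can be folded into one record per device. This is a rough sketch
based only on the layout of the example above; the regular expressions and the
function name parse_scsi_devices are assumptions made for this illustration.

```python
import re

HOST_RE = re.compile(r"Host:\s*(\S+)\s+Channel:\s*(\d+)\s+Id:\s*(\d+)\s+Lun:\s*(\d+)")
VENDOR_RE = re.compile(r"Vendor:\s*(.*?)\s+Model:\s*(.*?)\s+Rev:\s*(.*)")
TYPE_RE = re.compile(r"Type:\s*(.*?)\s+ANSI\s+SCSI revision:\s*(\S+)")


def parse_scsi_devices(text):
    """Return one dict per device listed in /proc/scsi/scsi-style text."""
    devices = []
    for line in text.splitlines():
        host = HOST_RE.search(line)
        if host:
            # A Host line starts a new device record.
            devices.append({"host": host.group(1),
                            "channel": int(host.group(2)),
                            "id": int(host.group(3)),
                            "lun": int(host.group(4))})
            continue
        if not devices:
            continue  # e.g. the "Attached devices:" header
        vendor = VENDOR_RE.search(line)
        if vendor:
            devices[-1].update(vendor=vendor.group(1).strip(),
                               model=vendor.group(2).strip(),
                               rev=vendor.group(3).strip())
            continue
        kind = TYPE_RE.search(line)
        if kind:
            devices[-1].update(type=kind.group(1).strip(),
                               ansi_revision=kind.group(2))
    return devices


SAMPLE = """\
Attached devices:
Host: scsi0 Channel: 00 Id: 00 Lun: 00
  Vendor: IBM      Model: DGHS09U          Rev: 03E0
  Type:   Direct-Access                    ANSI SCSI revision: 03
Host: scsi0 Channel: 00 Id: 06 Lun: 00
  Vendor: PIONEER  Model: CD-ROM DR-U06S   Rev: 1.04
  Type:   CD-ROM                           ANSI SCSI revision: 02
"""

for dev in parse_scsi_devices(SAMPLE):
    print(dev["host"], dev["id"], dev["vendor"], dev["model"], dev["type"])
```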
 | 705 |  | 
 | 706 |  | 
 | 707 | The directory  named  after  the driver has one file for each adapter found in | 
 | 708 | the system.  These  files  contain information about the controller, including | 
 | 709 | the used  IRQ  and  the  IO  address range. The amount of information shown is | 
 | 710 | dependent on  the adapter you use. The example shows the output for an Adaptec | 
 | 711 | AHA-2940 SCSI adapter: | 
 | 712 |  | 
 | 713 |   > cat /proc/scsi/aic7xxx/0  | 
 | 714 |     | 
 | 715 |   Adaptec AIC7xxx driver version: 5.1.19/3.2.4  | 
 | 716 |   Compile Options:  | 
 | 717 |     TCQ Enabled By Default : Disabled  | 
 | 718 |     AIC7XXX_PROC_STATS     : Disabled  | 
 | 719 |     AIC7XXX_RESET_DELAY    : 5  | 
 | 720 |   Adapter Configuration:  | 
 | 721 |              SCSI Adapter: Adaptec AHA-294X Ultra SCSI host adapter  | 
 | 722 |                              Ultra Wide Controller  | 
 | 723 |       PCI MMAPed I/O Base: 0xeb001000  | 
 | 724 |    Adapter SEEPROM Config: SEEPROM found and used.  | 
 | 725 |         Adaptec SCSI BIOS: Enabled  | 
 | 726 |                       IRQ: 10  | 
 | 727 |                      SCBs: Active 0, Max Active 2,  | 
 | 728 |                            Allocated 15, HW 16, Page 255  | 
 | 729 |                Interrupts: 160328  | 
 | 730 |         BIOS Control Word: 0x18b6  | 
 | 731 |      Adapter Control Word: 0x005b  | 
 | 732 |      Extended Translation: Enabled  | 
 | 733 |   Disconnect Enable Flags: 0xffff  | 
 | 734 |        Ultra Enable Flags: 0x0001  | 
 | 735 |    Tag Queue Enable Flags: 0x0000  | 
 | 736 |   Ordered Queue Tag Flags: 0x0000  | 
 | 737 |   Default Tag Queue Depth: 8  | 
 | 738 |       Tagged Queue By Device array for aic7xxx host instance 0:  | 
 | 739 |         {255,255,255,255,255,255,255,255,255,255,255,255,255,255,255,255}  | 
 | 740 |       Actual queue depth per device for aic7xxx host instance 0:  | 
 | 741 |         {1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1}  | 
 | 742 |   Statistics:  | 
 | 743 |   (scsi0:0:0:0)  | 
 | 744 |     Device using Wide/Sync transfers at 40.0 MByte/sec, offset 8  | 
 | 745 |     Transinfo settings: current(12/8/1/0), goal(12/8/1/0), user(12/15/1/0)  | 
 | 746 |     Total transfers 160151 (74577 reads and 85574 writes)  | 
 | 747 |   (scsi0:0:6:0)  | 
 | 748 |     Device using Narrow/Sync transfers at 5.0 MByte/sec, offset 15  | 
 | 749 |     Transinfo settings: current(50/15/0/0), goal(50/15/0/0), user(50/15/0/0)  | 
 | 750 |     Total transfers 0 (0 reads and 0 writes)  | 
 | 751 |  | 
 | 752 |  | 
 | 753 | 1.6 Parallel port info in /proc/parport | 
 | 754 | --------------------------------------- | 
 | 755 |  | 
 | 756 | The directory  /proc/parport  contains information about the parallel ports of | 
 | 757 | your system.  It  has  one  subdirectory  for  each port, named after the port | 
 | 758 | number (0,1,2,...). | 
 | 759 |  | 
 | 760 | These directories contain the four files shown in Table 1-8. | 
 | 761 |  | 
 | 762 |  | 
 | 763 | Table 1-8: Files in /proc/parport  | 
 | 764 | .............................................................................. | 
 | 765 |  File      Content                                                              | 
 | 766 |  autoprobe Any IEEE-1284 device ID information that has been acquired.          | 
 | 767 |  devices   list of the device drivers using that port. A + will appear by the | 
 | 768 |            name of the device currently using the port (it might not appear | 
 | 769 |            against any).  | 
 | 770 |  hardware  Parallel port's base address, IRQ line and DMA channel.              | 
 | 771 |  irq       IRQ that parport is using for that port. This is in a separate | 
 | 772 |            file to allow you to alter it by writing a new value in (IRQ | 
 | 773 |            number or none).  | 
 | 774 | .............................................................................. | 
 | 775 |  | 
 | 776 | 1.7 TTY info in /proc/tty | 
 | 777 | ------------------------- | 
 | 778 |  | 
 | 779 | Information about  the  available  and actually used tty's can be found in the | 
 | 780 | directory /proc/tty. You'll  find  entries  for drivers and line disciplines in | 
 | 781 | this directory, as shown in Table 1-9. | 
 | 782 |  | 
 | 783 |  | 
 | 784 | Table 1-9: Files in /proc/tty  | 
 | 785 | .............................................................................. | 
 | 786 |  File          Content                                         | 
 | 787 |  drivers       list of drivers and their usage                 | 
 | 788 |  ldiscs        registered line disciplines                     | 
 | 789 |  driver/serial usage statistics and status of single tty lines | 
 | 790 | .............................................................................. | 
 | 791 |  | 
 | 792 | To see  which  tty's  are  currently in use, you can simply look into the file | 
 | 793 | /proc/tty/drivers: | 
 | 794 |  | 
 | 795 |   > cat /proc/tty/drivers  | 
 | 796 |   pty_slave            /dev/pts      136   0-255 pty:slave  | 
 | 797 |   pty_master           /dev/ptm      128   0-255 pty:master  | 
 | 798 |   pty_slave            /dev/ttyp       3   0-255 pty:slave  | 
 | 799 |   pty_master           /dev/pty        2   0-255 pty:master  | 
 | 800 |   serial               /dev/cua        5   64-67 serial:callout  | 
 | 801 |   serial               /dev/ttyS       4   64-67 serial  | 
 | 802 |   /dev/tty0            /dev/tty0       4       0 system:vtmaster  | 
 | 803 |   /dev/ptmx            /dev/ptmx       5       2 system  | 
 | 804 |   /dev/console         /dev/console    5       1 system:console  | 
 | 805 |   /dev/tty             /dev/tty        5       0 system:/dev/tty  | 
 | 806 |   unknown              /dev/tty        4    1-63 console  | 
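
The columns of this listing are the driver name, the device node prefix, the
major number, the minor number (or range of minors) and the driver type. A
small sketch for reading them, assuming exactly that column order;
parse_tty_drivers is a hypothetical helper name.

```python
def parse_tty_drivers(text):
    """Parse /proc/tty/drivers-style text into a list of driver records."""
    drivers = []
    for line in text.splitlines():
        cols = line.split()
        if len(cols) < 4:
            continue  # blank or unexpected line
        name, device, major, minors = cols[:4]
        # The minor column is either a single number or a "first-last" range.
        first, _, last = minors.partition("-")
        drivers.append({
            "name": name,
            "device": device,
            "major": int(major),
            "minors": (int(first), int(last or first)),
            "type": cols[4] if len(cols) > 4 else "",
        })
    return drivers


SAMPLE = """\
serial               /dev/cua        5   64-67 serial:callout
serial               /dev/ttyS       4   64-67 serial
/dev/tty0            /dev/tty0       4       0 system:vtmaster
"""

for drv in parse_tty_drivers(SAMPLE):
    print(drv["device"], drv["major"], drv["minors"], drv["type"])
```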
 | 807 |  | 
 | 808 |  | 
 | 809 | 1.8 Miscellaneous kernel statistics in /proc/stat | 
 | 810 | ------------------------------------------------- | 
 | 811 |  | 
 | 812 | Various pieces   of  information about  kernel activity  are  available in the | 
 | 813 | /proc/stat file.  All  of  the numbers reported  in  this file are  aggregates | 
 | 814 | since the system first booted.  For a quick look, simply cat the file: | 
 | 815 |  | 
 | 816 |   > cat /proc/stat | 
 | 817 |   cpu  2255 34 2290 22625563 6290 127 456 0 | 
 | 818 |   cpu0 1132 34 1441 11311718 3675 127 438 0 | 
 | 819 |   cpu1 1123 0 849 11313845 2614 0 18 0 | 
 | 820 |   intr 114930548 113199788 3 0 5 263 0 4 [... lots more numbers ...] | 
 | 821 |   ctxt 1990473 | 
 | 822 |   btime 1062191376 | 
 | 823 |   processes 2915 | 
 | 824 |   procs_running 1 | 
 | 825 |   procs_blocked 0 | 
 | 826 |  | 
 | 827 | The very first  "cpu" line aggregates the  numbers in all  of the other "cpuN" | 
 | 828 | lines.  These numbers identify the amount of time the CPU has spent performing | 
 | 829 | different kinds of work.  Time units are in USER_HZ (typically hundredths of a | 
 | 830 | second).  The meanings of the columns are as follows, from left to right: | 
 | 831 |  | 
 | 832 | - user: normal processes executing in user mode | 
 | 833 | - nice: niced processes executing in user mode | 
 | 834 | - system: processes executing in kernel mode | 
 | 835 | - idle: twiddling thumbs | 
 | 836 | - iowait: waiting for I/O to complete | 
 | 837 | - irq: servicing interrupts | 
 | 838 | - softirq: servicing softirqs | 
 | 839 | - steal: involuntary wait | 
 | 840 |  | 
 | 841 | The "intr" line gives counts of interrupts  serviced since boot time, for each | 
 | 842 | of the  possible system interrupts.   The first  column  is the  total of  all | 
 | 843 | interrupts serviced; each  subsequent column is the  total for that particular | 
 | 844 | interrupt. | 
 | 845 |  | 
 | 846 | The "ctxt" line gives the total number of context switches across all CPUs. | 
 | 847 |  | 
 | 848 | The "btime" line gives  the time at which the  system booted, in seconds since | 
 | 849 | the Unix epoch. | 
 | 850 |  | 
 | 851 | The "processes" line gives the number  of processes and threads created, which | 
 | 852 | includes (but  is not limited  to) those  created by  calls to the  fork() and | 
 | 853 | clone() system calls. | 
 | 854 |  | 
 | 855 | The  "procs_running" line gives the  number of processes  currently running on | 
 | 856 | CPUs. | 
 | 857 |  | 
 | 858 | The   "procs_blocked" line gives  the  number of  processes currently blocked, | 
 | 859 | waiting for I/O to complete. | 
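
As an illustration of the "cpu" lines described above, the following sketch
maps each line to the eight named columns and derives the fraction of time the
CPUs were not idle. It assumes the eight-column layout shown in this section
(older kernels print fewer columns); parse_cpu_times and busy_fraction are
names invented for this example.

```python
CPU_FIELDS = ("user", "nice", "system", "idle",
              "iowait", "irq", "softirq", "steal")


def parse_cpu_times(stat_text):
    """Return {"cpu": {...}, "cpu0": {...}, ...}; times are in USER_HZ ticks."""
    cpus = {}
    for line in stat_text.splitlines():
        parts = line.split()
        if parts and parts[0].startswith("cpu"):
            ticks = [int(p) for p in parts[1:1 + len(CPU_FIELDS)]]
            # zip() stops at the shorter sequence, so missing columns are omitted.
            cpus[parts[0]] = dict(zip(CPU_FIELDS, ticks))
    return cpus


def busy_fraction(times):
    """Share of time since boot that was neither idle nor waiting for I/O."""
    total = sum(times.values())
    not_busy = times.get("idle", 0) + times.get("iowait", 0)
    return (total - not_busy) / total if total else 0.0


SAMPLE = """\
cpu  2255 34 2290 22625563 6290 127 456 0
cpu0 1132 34 1441 11311718 3675 127 438 0
cpu1 1123 0 849 11313845 2614 0 18 0
ctxt 1990473
"""

for name, times in parse_cpu_times(SAMPLE).items():
    print(f"{name}: {busy_fraction(times):.4%} busy")
```

Because the counters are totals since boot, a monitoring tool would normally
take two samples and apply the same calculation to the differences.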
 | 860 |  | 
 | 861 | 1.9 Ext4 file system parameters | 
 | 862 | ------------------------------- | 
 | 863 | Each ext4 file system has one directory per partition under /proc/fs/ext4/: | 
 | 864 |   # ls /proc/fs/ext4/hdc/ | 
 | 865 |   group_prealloc  max_to_scan  mb_groups  mb_history  min_to_scan  order2_req | 
 | 866 |   stats  stream_req | 
 | 867 |  | 
 | 868 | mb_groups: | 
 | 869 | This file gives the details of the multiblock allocator buddy cache of free blocks. | 
 | 870 |  | 
 | 871 | mb_history: | 
 | 872 | Multiblock allocation history. | 
 | 873 |  | 
 | 874 | stats: | 
 | 875 | This file indicates whether the multiblock allocator should start collecting | 
 | 876 | statistics. The statistics are shown during unmount. | 
 | 877 |  | 
 | 878 | group_prealloc: | 
 | 879 | The multiblock allocator normalizes the block allocation request to | 
 | 880 | group_prealloc filesystem blocks if no stripe value is set. | 
 | 881 | The stripe value can be specified at mount time or during mke2fs. | 
 | 882 |  | 
 | 883 | max_to_scan: | 
 | 884 | How long the multiblock allocator can look for a best extent (in found extents). | 
 | 885 |  | 
 | 886 | min_to_scan: | 
 | 887 | How long the multiblock allocator must look for a best extent. | 
 | 888 |  | 
 | 889 | order2_req: | 
 | 890 | The multiblock allocator uses 2^N search using buddies only for requests of at | 
 | 891 | least 2^order2_req blocks. The request size is specified in file system | 
 | 892 | blocks. A value of 2 means that buddy search is used only for requests greater | 
 | 893 | than or equal to 4 blocks. | 
 | 894 |  | 
 | 895 | stream_req: | 
 | 896 | Files smaller than stream_req are served by the stream allocator, whose | 
 | 897 | purpose is to pack requests as close to each other as possible to | 
 | 898 | produce smooth I/O traffic. A value of 16 indicates that files smaller than 16 | 
 | 899 | filesystem blocks will use group based preallocation. | 
 | 900 |  | 
 | 901 | ------------------------------------------------------------------------------ | 
 | 902 | Summary | 
 | 903 | ------------------------------------------------------------------------------ | 
 | 904 | The /proc file system serves information about the running system. It not only | 
 | 905 | allows access to process data but also allows you to request the kernel status | 
 | 906 | by reading files in the hierarchy. | 
 | 907 |  | 
 | 908 | The directory  structure  of /proc reflects the types of information and makes | 
 | 909 | it easy, if not obvious, where to look for specific data. | 
 | 910 | ------------------------------------------------------------------------------ | 
 | 911 |  | 
 | 912 | ------------------------------------------------------------------------------ | 
 | 913 | CHAPTER 2: MODIFYING SYSTEM PARAMETERS | 
 | 914 | ------------------------------------------------------------------------------ | 
 | 915 |  | 
 | 916 | ------------------------------------------------------------------------------ | 
 | 917 | In This Chapter | 
 | 918 | ------------------------------------------------------------------------------ | 
 | 919 | * Modifying kernel parameters by writing into files found in /proc/sys | 
 | 920 | * Exploring the files which modify certain parameters | 
 | 921 | * Review of the /proc/sys file tree | 
 | 922 | ------------------------------------------------------------------------------ | 
 | 923 |  | 
 | 924 |  | 
 | 925 | A very  interesting part of /proc is the directory /proc/sys. This is not only | 
 | 926 | a source  of  information,  it also allows you to change parameters within the | 
 | 927 | kernel. Be  very  careful  when attempting this. You can optimize your system, | 
 | 928 | but you  can  also  cause  it  to  crash.  Never  alter kernel parameters on a | 
 | 929 | production system.  Set  up  a  development machine and test to make sure that | 
 | 930 | everything works  the  way  you want it to. You may have no alternative but to | 
 | 931 | reboot the machine once an error has been made. | 
 | 932 |  | 
 | 933 | To change  a  value,  simply  echo  the new value into the file. An example is | 
 | 934 | given below  in the section on the file system data. You need to be root to do | 
 | 935 | this. You  can  create  your  own  boot script to perform this every time your | 
 | 936 | system boots. | 
 | 937 |  | 
 | 938 | The files  in /proc/sys can be used to fine tune and monitor miscellaneous and | 
 | 939 | general things  in  the operation of the Linux kernel. Since some of the files | 
 | 940 | can inadvertently  disrupt  your  system,  it  is  advisable  to  read  both | 
 | 941 | documentation and  source  before actually making adjustments. In any case, be | 
 | 942 | very careful  when  writing  to  any  of these files. The entries in /proc may | 
 | 943 | change slightly between the 2.1.* and the 2.2 kernel, so if there is any doubt | 
 | 944 | review the kernel documentation in the directory /usr/src/linux/Documentation. | 
 | 945 | This chapter  is  heavily  based  on the documentation included in the pre 2.2 | 
 | 946 | kernels, and became part of it in version 2.2.1 of the Linux kernel. | 
 | 947 |  | 
 | 948 | 2.1 /proc/sys/fs - File system data | 
 | 949 | ----------------------------------- | 
 | 950 |  | 
 | 951 | This subdirectory  contains  specific  file system, file handle, inode, dentry | 
 | 952 | and quota information. | 
 | 953 |  | 
 | 954 | Currently, these files are in /proc/sys/fs: | 
 | 955 |  | 
 | 956 | dentry-state | 
 | 957 | ------------ | 
 | 958 |  | 
 | 959 | Status of  the  directory  cache.  Since  directory  entries  are  dynamically | 
 | 960 | allocated and  deallocated,  this  file indicates the current status. It holds | 
 | 961 | six values, of which the last two are not used and are always zero. The others | 
 | 962 | are listed in table 2-1. | 
 | 963 |  | 
 | 964 |  | 
 | 965 | Table 2-1: Status files of the directory cache  | 
 | 966 | .............................................................................. | 
 | 967 |  File       Content                                                             | 
 | 968 |  nr_dentry  Almost always zero | 
 | 969 |  nr_unused  Number of unused cache entries | 
 | 970 |  age_limit  Age in seconds after which an entry may be reclaimed when memory | 
 | 971 |             is short | 
 | 972 |  want_pages Used internally | 
 | 973 | .............................................................................. | 
 | 974 |  | 
 | 975 | dquot-nr and dquot-max | 
 | 976 | ---------------------- | 
 | 977 |  | 
 | 978 | The file dquot-max shows the maximum number of cached disk quota entries. | 
 | 979 |  | 
 | 980 | The file  dquot-nr  shows  the  number of allocated disk quota entries and the | 
 | 981 | number of free disk quota entries. | 
 | 982 |  | 
 | 983 | If the number of available cached disk quotas is very low and you have a large | 
 | 984 | number of simultaneous system users, you might want to raise the limit. | 
 | 985 |  | 
 | 986 | file-nr and file-max | 
 | 987 | -------------------- | 
 | 988 |  | 
 | 989 | The kernel  allocates file handles dynamically, but doesn't free them again at | 
 | 990 | this time. | 
 | 991 |  | 
 | 992 | The value  in  file-max  denotes  the  maximum number of file handles that the | 
 | 993 | Linux kernel will allocate. When you get a lot of error messages about running | 
 | 994 | out of  file handles, you might want to raise this limit. The default value is | 
 | 995 | 10% of  RAM in kilobytes.  To  change it, just  write the new number  into the | 
 | 996 | file: | 
 | 997 |  | 
 | 998 |   # cat /proc/sys/fs/file-max  | 
 | 999 |   4096  | 
 | 1000 |   # echo 8192 > /proc/sys/fs/file-max  | 
 | 1001 |   # cat /proc/sys/fs/file-max  | 
 | 1002 |   8192  | 
 | 1003 |  | 
 | 1004 |  | 
 | 1005 | This method  of  revision  is  useful  for  all customizable parameters of the | 
 | 1006 | kernel - simply echo the new value to the corresponding file. | 
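
The same read and write steps can be scripted. The sketch below maps a dotted
parameter name such as fs.file-max to its file under /proc/sys, following the
naming convention used by the sysctl utility, and then reads or writes that
file. The helper names sysctl_path, read_sysctl and write_sysctl are invented
for this example, and writing still requires root.

```python
import os


def sysctl_path(name, root="/proc/sys"):
    """Map a dotted name ("fs.file-max") to its path below /proc/sys."""
    return os.path.join(root, *name.split("."))


def read_sysctl(name, root="/proc/sys"):
    """Return the current value of a parameter as a stripped string."""
    with open(sysctl_path(name, root)) as handle:
        return handle.read().strip()


def write_sysctl(name, value, root="/proc/sys"):
    """Write a new value, equivalent to 'echo value > file' (needs root)."""
    with open(sysctl_path(name, root), "w") as handle:
        handle.write(f"{value}\n")


print(sysctl_path("fs.file-max"))
```

A call such as write_sysctl("fs.file-max", 8192) then corresponds to the echo
command shown above. The root argument exists only so that the helpers can be
tried against a scratch directory instead of the live /proc/sys tree.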
 | 1007 |  | 
 | 1008 | Historically, the three values in file-nr denoted the number of allocated file | 
 | 1009 | handles,  the number of  allocated but  unused file  handles, and  the maximum | 
 | 1010 | number of file handles. Linux 2.6 always  reports 0 as the number of free file | 
 | 1011 | handles -- this  is not an error,  it just means that the  number of allocated | 
 | 1012 | file handles exactly matches the number of used file handles. | 
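
A short sketch of how the three values in file-nr can be read and turned into
a usage figure. The field order follows the description above; parse_file_nr
is a hypothetical helper and the sample numbers are made up.

```python
def parse_file_nr(text):
    """Split file-nr into allocated, unused and maximum file handles."""
    allocated, unused, maximum = (int(v) for v in text.split())
    return {
        "allocated": allocated,
        "unused": unused,  # always 0 on Linux 2.6, as noted above
        "max": maximum,
        "in_use": allocated - unused,
    }


# Made-up sample in the format of /proc/sys/fs/file-nr.
info = parse_file_nr("1024\t0\t8192\n")
print(f"{info['in_use']} of {info['max']} file handles in use")
```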
 | 1013 |  | 
 | 1014 | Attempts to  allocate more  file descriptors than  file-max are  reported with | 
 | 1015 | printk; look for "VFS: file-max limit <number> reached". | 
 | 1016 |  | 
 | 1017 | inode-state and inode-nr | 
 | 1018 | ------------------------ | 
 | 1019 |  | 
 | 1020 | The file inode-nr contains the first two items from inode-state, so we'll skip | 
 | 1021 | to that file... | 
 | 1022 |  | 
 | 1023 | inode-state contains  two  actual numbers and five dummy values. The numbers | 
 | 1024 | are nr_inodes and nr_free_inodes (in order of appearance). | 
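
Since only the first two values of inode-state carry information, a reader can
ignore the rest. The sketch below does this and also derives the number of
in-use inodes as nr_inodes minus nr_free_inodes; parse_inode_state is a
hypothetical helper and the sample line is made up.

```python
def parse_inode_state(text):
    """Return nr_inodes, nr_free_inodes and the derived in-use count."""
    values = [int(v) for v in text.split()]
    nr_inodes, nr_free_inodes = values[:2]  # remaining values are dummies
    return {
        "nr_inodes": nr_inodes,
        "nr_free_inodes": nr_free_inodes,
        "in_use": nr_inodes - nr_free_inodes,
    }


# Made-up sample: two real numbers followed by five dummy values.
state = parse_inode_state("5000 1200 0 0 0 0 0\n")
print(state["in_use"], "inodes in use")
```

The same function also accepts the contents of inode-nr, which holds just the
first two numbers.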
 | 1025 |  | 
 | 1026 | nr_inodes | 
 | 1027 | ~~~~~~~~~ | 
 | 1028 |  | 
 | 1029 | Denotes the  number  of  inodes the system has allocated. This number will | 
 | 1030 | grow and shrink dynamically. | 
 | 1031 |  | 
 | 1032 | nr_free_inodes | 
 | 1033 | ~~~~~~~~~~~~~~ | 
 | 1034 |  | 
 | 1035 | Represents the number of free inodes, i.e. the number of in-use inodes is | 
 | 1036 | (nr_inodes - nr_free_inodes). | 
 | 1037 |  | 
 | 1038 | nr_open | 
 | 1039 | ------- | 
 | 1040 |  | 
 | 1041 | Denotes the maximum number of file-handles a process can | 
 | 1042 | allocate. Default value is 1024*1024 (1048576) which should be | 
 | 1043 | enough for most machines. Actual limit depends on RLIMIT_NOFILE | 
 | 1044 | resource limit. | 
 | 1045 |  | 
 | 1046 | aio-nr and aio-max-nr | 
 | 1047 | --------------------- | 
 | 1048 |  | 
 | 1049 | aio-nr is the running total of the number of events specified on the | 
 | 1050 | io_setup system call for all currently active aio contexts.  If aio-nr | 
 | 1051 | reaches aio-max-nr then io_setup will fail with EAGAIN.  Note that | 
 | 1052 | raising aio-max-nr does not result in the pre-allocation or re-sizing | 
 | 1053 | of any kernel data structures. | 
 | 1054 |  | 
 | 1055 | 2.2 /proc/sys/fs/binfmt_misc - Miscellaneous binary formats | 
 | 1056 | ----------------------------------------------------------- | 
 | 1057 |  | 
 | 1058 | Besides these  files, there is the subdirectory /proc/sys/fs/binfmt_misc. This | 
 | 1059 | handles the kernel support for miscellaneous binary formats. | 
 | 1060 |  | 
 | 1061 | Binfmt_misc provides the ability to register additional binary formats with |
 | 1062 | the kernel without compiling an additional module/kernel. To do so, |
 | 1063 | binfmt_misc needs to know the magic number at the beginning of the binary or |
 | 1064 | its filename extension. |
 | 1065 |  | 
 | 1066 | It works by maintaining a linked list of structs that contain a description of | 
 | 1067 | a binary  format,  including  a  magic  with size (or the filename extension), | 
 | 1068 | offset and  mask,  and  the  interpreter name. On request it invokes the given | 
 | 1069 | interpreter with  the  original  program  as  argument,  as  binfmt_java  and | 
 | 1070 | binfmt_em86 and  binfmt_mz  do.  Since binfmt_misc does not define any default | 
 | 1071 | binary-formats, you have to register an additional binary-format. | 
 | 1072 |  | 
 | 1073 | There are two general files in binfmt_misc and one file per registered format. | 
 | 1074 | The two general files are register and status. | 
 | 1075 |  | 
 | 1076 | Registering a new binary format | 
 | 1077 | ------------------------------- | 
 | 1078 |  | 
 | 1079 | To register a new binary format you have to issue the command | 
 | 1080 |  | 
 | 1081 |   echo :name:type:offset:magic:mask:interpreter: > /proc/sys/fs/binfmt_misc/register  | 
 | 1082 |  | 
 | 1083 |  | 
 | 1084 |  | 
 | 1085 | with appropriate  name (the name for the /proc-dir entry), offset (defaults to | 
 | 1086 | 0, if  omitted),  magic, mask (which can be omitted, defaults to all 0xff) and | 
 | 1087 | last but  not  least,  the  interpreter that is to be invoked (for example and | 
 | 1088 | testing /bin/echo).  Type  can be M for usual magic matching or E for filename | 
 | 1089 | extension matching (give extension in place of magic). | 
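
The registration string can be assembled from its fields as sketched below.
binfmt_entry is a hypothetical helper name, and "Test" and "TESTMAGIC" are
invented values. The function only formats the string and does not write to
register.

```shell
# binfmt_entry (hypothetical helper): build a binfmt_misc registration string.
# Arguments: name type offset magic mask interpreter
# Pass '' for offset or mask to take the defaults described above.
binfmt_entry() {
    printf ':%s:%s:%s:%s:%s:%s:\n' "$1" "$2" "$3" "$4" "$5" "$6"
}

# Magic match with default offset and mask, using /bin/echo for testing:
binfmt_entry Test M '' 'TESTMAGIC' '' /bin/echo
# prints :Test:M::TESTMAGIC::/bin/echo:
```

To register the format, the output would be redirected to
/proc/sys/fs/binfmt_misc/register by a suitably privileged user.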
 | 1090 |  | 
 | 1091 | Check or reset the status of the binary format handler | 
 | 1092 | ------------------------------------------------------ | 
 | 1093 |  | 
 | 1094 | If you  do a cat on the file /proc/sys/fs/binfmt_misc/status, you will get the | 
 | 1095 | current status (enabled/disabled) of binfmt_misc. Change the status by echoing | 
 | 1096 | 0 (disables)  or  1  (enables)  or  -1  (caution:  this  clears all previously | 
 | 1097 | registered binary  formats)  to status. For example echo 0 > status to disable | 
 | 1098 | binfmt_misc (temporarily). | 
 | 1099 |  | 
 | 1100 | Status of a single handler | 
 | 1101 | -------------------------- | 
 | 1102 |  | 
 | 1103 | Each registered handler has an entry in /proc/sys/fs/binfmt_misc. These files |
 | 1104 | perform the same function as status, but their scope is limited to the actual |
 | 1105 | binary format. By reading such a file with cat, you also receive all related |
 | 1106 | information about the interpreter/magic of the binfmt. |
 | 1107 |  | 
 | 1108 | Example usage of binfmt_misc (emulate binfmt_java) | 
 | 1109 | -------------------------------------------------- | 
 | 1110 |  | 
 | 1111 |   cd /proc/sys/fs/binfmt_misc   | 
 | 1112 |   echo ':Java:M::\xca\xfe\xba\xbe::/usr/local/java/bin/javawrapper:' > register   | 
 | 1113 |   echo ':HTML:E::html::/usr/local/java/bin/appletviewer:' > register   | 
 | 1114 |   echo ':Applet:M::<!--applet::/usr/local/java/bin/appletviewer:' > register  | 
 | 1115 |   echo ':DEXE:M::\x0eDEX::/usr/bin/dosexec:' > register  | 
 | 1116 |  | 
 | 1117 |  | 
 | 1118 | These four  lines  add  support  for  Java  executables and Java applets (like | 
 | 1119 | binfmt_java, additionally  recognizing the .html extension with no need to put | 
 | 1120 | <!--applet> to  every  applet  file).  You  have  to  install  the JDK and the | 
 | 1121 | shell-script /usr/local/java/bin/javawrapper  too.  It  works  around  the | 
 | 1122 | brokenness of  the Java filename handling. To add a Java binary, just create a | 
 | 1123 | link to the class-file somewhere in the path. | 
 | 1124 |  | 
 | 1125 | 2.3 /proc/sys/kernel - general kernel parameters | 
 | 1126 | ------------------------------------------------ | 
 | 1127 |  | 
 | 1128 | This directory  reflects  general  kernel  behaviors. As I've said before, the | 
 | 1129 | contents depend  on  your  configuration.  Here you'll find the most important | 
 | 1130 | files, along with descriptions of what they mean and how to use them. | 
 | 1131 |  | 
 | 1132 | acct | 
 | 1133 | ---- | 
 | 1134 |  | 
 | 1135 | The file contains three values: highwater, lowwater, and frequency. |
 | 1136 |  | 
 | 1137 | It exists  only  when  BSD-style  process  accounting is enabled. These values | 
 | 1138 | control its behavior. If the free space on the file system where the log lives | 
 | 1139 | goes below  lowwater  percentage,  accounting  suspends.  If  it  goes  above | 
 | 1140 | highwater percentage,  accounting  resumes. Frequency determines how often you | 
 | 1141 | check the amount of free space (value is in seconds). Default settings are: 4, |
 | 1142 | 2, and 30. That is, suspend accounting if there is less than 2 percent free; |
 | 1143 | resume it if there is 4 or more percent free; consider information about |
 | 1144 | the amount of free space valid for 30 seconds. |
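
The suspend/resume rule can be modelled as a small state function. This is a
sketch, not the kernel's code: acct_next_state is a hypothetical name, and the
boundary handling (strictly below lowwater suspends, highwater or more resumes)
is an assumption.

```shell
# acct_next_state (hypothetical helper): model the suspend/resume rule.
# $1 = current state (active|suspended), $2 = free space in percent,
# $3 = highwater (default 4), $4 = lowwater (default 2).
# Assumption: "below" is strict, "above" includes equality.
acct_next_state() {
    state=$1 free=$2 high=${3:-4} low=${4:-2}
    if [ "$state" = active ] && [ "$free" -lt "$low" ]; then
        echo suspended
    elif [ "$state" = suspended ] && [ "$free" -ge "$high" ]; then
        echo active
    else
        echo "$state"         # between the marks: no change
    fi
}

acct_next_state active 1       # prints suspended
acct_next_state suspended 5    # prints active
```

The gap between the two marks gives hysteresis: free space hovering around a
single threshold does not toggle accounting on every check.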
 | 1145 |  | 
 | 1146 | ctrl-alt-del | 
 | 1147 | ------------ | 
 | 1148 |  | 
 | 1149 | When the value in this file is 0, ctrl-alt-del is trapped and sent to the init | 
 | 1150 | program to handle a graceful restart. However, when the value is greater than |
 | 1151 | zero, Linux's  reaction  to  this key combination will be an immediate reboot, | 
 | 1152 | without syncing its dirty buffers. | 
 | 1153 |  | 
 | 1154 | [NOTE] | 
 | 1155 |     When a  program  (like  dosemu)  has  the  keyboard  in  raw  mode,  the | 
 | 1156 |     ctrl-alt-del is  intercepted  by  the  program  before it ever reaches the | 
 | 1157 |     kernel tty  layer,  and  it is up to the program to decide what to do with | 
 | 1158 |     it. | 
 | 1159 |  | 
 | 1160 | domainname and hostname | 
 | 1161 | ----------------------- | 
 | 1162 |  | 
 | 1163 | These files can be written to set the NIS domainname and hostname of your |
 | 1164 | box. For the classic darkstar.frop.org a simple: |
 | 1165 |  | 
 | 1166 |   # echo "darkstar" > /proc/sys/kernel/hostname  | 
 | 1167 |   # echo "frop.org" > /proc/sys/kernel/domainname  | 
 | 1168 |  | 
 | 1169 |  | 
 | 1170 | would suffice to set your hostname and NIS domainname. | 
 | 1171 |  | 
 | 1172 | osrelease, ostype and version | 
 | 1173 | ----------------------------- | 
 | 1174 |  | 
 | 1175 | The names make it pretty obvious what these fields contain: | 
 | 1176 |  | 
 | 1177 |   > cat /proc/sys/kernel/osrelease  | 
 | 1178 |   2.2.12  | 
 | 1179 |     | 
 | 1180 |   > cat /proc/sys/kernel/ostype  | 
 | 1181 |   Linux  | 
 | 1182 |     | 
 | 1183 |   > cat /proc/sys/kernel/version  | 
 | 1184 |   #4 Fri Oct 1 12:41:14 PDT 1999  | 
 | 1185 |  | 
 | 1186 |  | 
 | 1187 | The files  osrelease and ostype should be clear enough. Version needs a little | 
 | 1188 | more clarification.  The  #4 means that this is the 4th kernel built from this | 
 | 1189 | source base and the date after it indicates the time the kernel was built. The | 
 | 1190 | only way to tune these values is to rebuild the kernel. | 
 | 1191 |  | 
 | 1192 | panic | 
 | 1193 | ----- | 
 | 1194 |  | 
 | 1195 | The value  in  this  file  represents  the  number of seconds the kernel waits | 
 | 1196 | before rebooting  on  a  panic.  When  you  use  the  software  watchdog,  the | 
 | 1197 | recommended setting  is  60. If set to 0, the auto reboot after a kernel panic | 
 | 1198 | is disabled, which is the default setting. | 
 | 1199 |  | 
 | 1200 | printk | 
 | 1201 | ------ | 
 | 1202 |  | 
 | 1203 | The four values in printk denote | 
 | 1204 | * console_loglevel, | 
 | 1205 | * default_message_loglevel, | 
 | 1206 | * minimum_console_loglevel and | 
 | 1207 | * default_console_loglevel | 
 | 1208 | respectively. | 
 | 1209 |  | 
 | 1210 | These values  influence  printk()  behavior  when  printing  or  logging error | 
 | 1211 | messages, which  come  from  inside  the  kernel.  See  syslog(2)  for  more | 
 | 1212 | information on the different log levels. | 
 | 1213 |  | 
 | 1214 | console_loglevel | 
 | 1215 | ---------------- | 
 | 1216 |  | 
 | 1217 | Messages with a higher priority than this will be printed to the console. | 
 | 1218 |  | 
 | 1219 | default_message_loglevel |
 | 1220 | ------------------------ |
 | 1221 |  | 
 | 1222 | Messages without an explicit priority will be printed with this priority. | 
 | 1223 |  | 
 | 1224 | minimum_console_loglevel | 
 | 1225 | ------------------------ | 
 | 1226 |  | 
 | 1227 | Minimum (highest) value to which the console_loglevel can be set. | 
 | 1228 |  | 
 | 1229 | default_console_loglevel | 
 | 1230 | ------------------------ | 
 | 1231 |  | 
 | 1232 | Default value for console_loglevel. | 
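
The four fields can be labelled in the order listed above with a few lines of
shell. printk_levels is a hypothetical helper, and the sample values are
invented for illustration.

```shell
# printk_levels (hypothetical helper): label the four values from printk.
# $1 = contents of the printk file (whitespace-separated numbers).
printk_levels() {
    set -- $1                 # split into positional fields
    printf 'console_loglevel=%s\n' "$1"
    printf 'default_message_loglevel=%s\n' "$2"
    printf 'minimum_console_loglevel=%s\n' "$3"
    printf 'default_console_loglevel=%s\n' "$4"
}

# Invented sample values:
printk_levels "6 4 1 7"
# On a live system: printk_levels "$(cat /proc/sys/kernel/printk)"
```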
 | 1233 |  | 
 | 1234 | sg-big-buff | 
 | 1235 | ----------- | 
 | 1236 |  | 
 | 1237 | This file  shows  the size of the generic SCSI (sg) buffer. At this point, you | 
 | 1238 | can't tune  it  yet,  but  you  can  change  it  at  compile  time  by editing | 
 | 1239 | include/scsi/sg.h and changing the value of SG_BIG_BUFF. | 
 | 1240 |  | 
 | 1241 | If you use a scanner with SANE (Scanner Access Now Easy) you might want to set | 
 | 1242 | this to a higher value. Refer to the SANE documentation on this issue. | 
 | 1243 |  | 
 | 1244 | modprobe | 
 | 1245 | -------- | 
 | 1246 |  | 
 | 1247 | The location  where  the  modprobe  binary  is  located.  The kernel uses this | 
 | 1248 | program to load modules on demand. | 
 | 1249 |  | 
 | 1250 | unknown_nmi_panic | 
 | 1251 | ----------------- | 
 | 1252 |  | 
 | 1253 | The value in this file affects how NMIs are handled. When the value is |
 | 1254 | non-zero, an unknown NMI is trapped and a panic follows. At that point, kernel |
 | 1255 | debugging information is displayed on the console. |
 | 1256 |  |
 | 1257 | For example, the NMI switch found on most IA32 servers raises an unknown NMI. |
 | 1258 | If a system hangs up, try pressing the NMI switch. |
 | 1259 |  | 
 | 1260 | nmi_watchdog |
 | 1261 | ------------ | 
 | 1262 |  | 
 | 1263 | Enables/Disables the NMI watchdog on x86 systems.  When the value is non-zero | 
 | 1264 | the NMI watchdog is enabled and will continuously test all online cpus to | 
 | 1265 | determine whether or not they are still functioning properly. | 
 | 1266 |  | 
 | 1267 | Because the NMI watchdog shares registers with oprofile, by disabling the NMI | 
 | 1268 | watchdog, oprofile may have more registers to utilize. | 
 | 1269 |  |
 | 1270 | maps_protect |
 | 1271 | ------------ | 
 | 1272 |  | 
 | 1273 | Enables/Disables the protection of the per-process proc entries "maps" and | 
 | 1274 | "smaps".  When enabled, the contents of these files are visible only to | 
 | 1275 | readers that are allowed to ptrace() the given process. | 
 | 1276 |  | 
 | 1277 |  |
 | 1278 | 2.4 /proc/sys/vm - The virtual memory subsystem | 
 | 1279 | ----------------------------------------------- | 
 | 1280 |  | 
 | 1281 | The files  in  this directory can be used to tune the operation of the virtual | 
 | 1282 | memory (VM)  subsystem  of  the  Linux  kernel. | 
 | 1283 |  | 
 | 1284 | vfs_cache_pressure | 
 | 1285 | ------------------ | 
 | 1286 |  | 
 | 1287 | Controls the tendency of the kernel to reclaim the memory which is used for | 
 | 1288 | caching of directory and inode objects. | 
 | 1289 |  | 
 | 1290 | At the default value of vfs_cache_pressure=100 the kernel will attempt to | 
 | 1291 | reclaim dentries and inodes at a "fair" rate with respect to pagecache and | 
 | 1292 | swapcache reclaim.  Decreasing vfs_cache_pressure causes the kernel to prefer | 
 | 1293 | to retain dentry and inode caches.  Increasing vfs_cache_pressure beyond 100 | 
 | 1294 | causes the kernel to prefer to reclaim dentries and inodes. | 
 | 1295 |  | 
 | 1296 | dirty_background_ratio | 
 | 1297 | ---------------------- | 
 | 1298 |  | 
 | 1299 | Contains, as a percentage of total system memory, the number of pages at which | 
 | 1300 | the pdflush background writeback daemon will start writing out dirty data. | 
 | 1301 |  | 
 | 1302 | dirty_ratio |
 | 1303 | ----------- |
 | 1304 |  | 
 | 1305 | Contains, as a percentage of total system memory, the number of pages at which | 
 | 1306 | a process which is generating disk writes will itself start writing out dirty | 
 | 1307 | data. | 
 | 1308 |  | 
 | 1309 | dirty_writeback_centisecs | 
 | 1310 | ------------------------- | 
 | 1311 |  | 
 | 1312 | The pdflush writeback daemons will periodically wake up and write `old' data |
 | 1313 | out to disk.  This tunable expresses the interval between those wakeups, in |
 | 1314 | hundredths of a second. |
 | 1315 |  | 
 | 1316 | Setting this to zero disables periodic writeback altogether. | 
 | 1317 |  | 
 | 1318 | dirty_expire_centisecs | 
 | 1319 | ---------------------- | 
 | 1320 |  | 
 | 1321 | This tunable is used to define when dirty data is old enough to be eligible |
 | 1322 | for writeout by the pdflush daemons.  It is expressed in hundredths of a |
 | 1323 | second.  Data which has been dirty in-memory for longer than this interval |
 | 1324 | will be written out next time a pdflush daemon wakes up. |
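
Both centisecond tunables use the same unit, so a one-line conversion is
enough to read them in seconds. centisecs_to_secs is a hypothetical helper
and the inputs below are sample values only.

```shell
# centisecs_to_secs (hypothetical helper): hundredths of a second -> seconds.
centisecs_to_secs() {
    awk -v cs="$1" 'BEGIN { printf "%.2f\n", cs / 100 }'
}

centisecs_to_secs 500     # prints 5.00
centisecs_to_secs 3000    # prints 30.00
# On a live system:
# centisecs_to_secs "$(cat /proc/sys/vm/dirty_expire_centisecs)"
```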
 | 1325 |  | 
 | 1326 | highmem_is_dirtyable |
 | 1327 | -------------------- | 
 | 1328 |  | 
 | 1329 | Only present if CONFIG_HIGHMEM is set. | 
 | 1330 |  | 
 | 1331 | This defaults to 0 (false), meaning that the ratios set above are calculated | 
 | 1332 | as a percentage of lowmem only.  This protects against excessive scanning | 
 | 1333 | in page reclaim, swapping and general VM distress. | 
 | 1334 |  | 
 | 1335 | Setting this to 1 can be useful on 32 bit machines where you want to make | 
 | 1336 | random changes within an MMAPed file that is larger than your available | 
 | 1337 | lowmem without causing large quantities of random IO.  It is safe if the |
 | 1338 | behavior of all programs running on the machine is known and memory will | 
 | 1339 | not be otherwise stressed. | 
 | 1340 |  | 
 | 1341 | legacy_va_layout |
 | 1342 | ---------------- |
 | 1343 |  |
 | 1344 | If non-zero, this sysctl disables the new 32-bit mmap layout - the kernel |
 | 1345 | will use the legacy (2.4) layout for all processes. |
 | 1346 |  | 
 | 1347 | lowmem_reserve_ratio |
 | 1348 | -------------------- |
 | 1349 |  | 
 | 1350 | For some specialised workloads on highmem machines it is dangerous for | 
 | 1351 | the kernel to allow process memory to be allocated from the "lowmem" | 
 | 1352 | zone.  This is because that memory could then be pinned via the mlock() | 
 | 1353 | system call, or by unavailability of swapspace. | 
 | 1354 |  | 
 | 1355 | And on large highmem machines this lack of reclaimable lowmem memory | 
 | 1356 | can be fatal. | 
 | 1357 |  | 
 | 1358 | So the Linux page allocator has a mechanism which prevents allocations | 
 | 1359 | which _could_ use highmem from using too much lowmem.  This means that | 
 | 1360 | a certain amount of lowmem is defended from the possibility of being | 
 | 1361 | captured into pinned user memory. | 
 | 1362 |  | 
 | 1363 | (The same argument applies to the old 16 megabyte ISA DMA region.  This | 
 | 1364 | mechanism will also defend that region from allocations which could use | 
 | 1365 | highmem or lowmem). | 
 | 1366 |  | 
 | 1367 | The `lowmem_reserve_ratio' tunable determines how aggressive the kernel is |
 | 1368 | in defending these lower zones. |
 | 1369 |  |
 | 1370 | If you have a machine which uses highmem or ISA DMA and your |
 | 1371 | applications are using mlock(), or if you are running with no swap then |
 | 1372 | you probably should change the lowmem_reserve_ratio setting. |
 | 1373 |  |
 | 1374 | lowmem_reserve_ratio is an array. You can see its values by reading this file. |
 | 1375 | - | 
 | 1376 | % cat /proc/sys/vm/lowmem_reserve_ratio | 
 | 1377 | 256     256     32 | 
 | 1378 | - | 
 | 1379 | Note: the number of elements is one fewer than the number of zones, because |
 | 1380 |       the highest zone's value is not needed for the following calculation. |
 | 1381 |  |
 | 1382 | These values are not used directly, however. The kernel calculates the number |
 | 1383 | of protection pages for each zone from them. These are shown as an array of |
 | 1384 | protection pages in /proc/zoneinfo, as in the following example from an |
 | 1385 | x86-64 box. Each zone has an array of protection pages like this. |
 | 1386 |  |
 | 1387 | - |
 | 1388 | Node 0, zone      DMA | 
 | 1389 |   pages free     1355 | 
 | 1390 |         min      3 | 
 | 1391 |         low      3 | 
 | 1392 |         high     4 | 
 | 1393 | 	: | 
 | 1394 | 	: | 
 | 1395 |     numa_other   0 | 
 | 1396 |         protection: (0, 2004, 2004, 2004) | 
 | 1397 | 	^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ | 
 | 1398 |   pagesets | 
 | 1399 |     cpu: 0 pcp: 0 | 
 | 1400 |         : | 
 | 1401 | - | 
 | 1402 | These protections are added to the score used to judge whether this zone |
 | 1403 | should be used for page allocation or should be reclaimed. |
 | 1404 |  |
 | 1405 | In this example, if normal pages (index=2) are requested from this DMA zone |
 | 1406 | and pages_high is used as the watermark, the kernel judges that this zone |
 | 1407 | should not be used because pages_free (1355) is smaller than watermark + |
 | 1408 | protection[2] (4 + 2004 = 2008). If this protection value were 0, this zone |
 | 1409 | would be used for the normal page request. If the request is for the DMA zone |
 | 1410 | (index=0), protection[0] (=0) is used. |
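
The comparison in this example can be written out as a shell test. This is a
sketch of the arithmetic only, not of the page allocator: zone_usable is a
hypothetical name, and treating "not smaller than" as "usable" is an
assumption.

```shell
# zone_usable (hypothetical helper): apply the comparison described above.
# $1 = pages_free, $2 = watermark, $3 = protection value for the request.
# Succeeds (exit 0) if the zone may be used. Assumption: the zone is
# rejected only when pages_free is strictly smaller than the sum.
zone_usable() {
    [ "$1" -ge $(( $2 + $3 )) ]
}

# Normal-page request against the DMA zone from the example (1355 < 2008):
if zone_usable 1355 4 2004; then echo "use zone"; else echo "skip zone"; fi
# DMA request against the same zone (protection[0] = 0):
if zone_usable 1355 4 0; then echo "use zone"; else echo "skip zone"; fi
```

The first call prints "skip zone" and the second prints "use zone".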
 | 1411 |  | 
 | 1412 | zone[i]'s protection[j] is calculated by the following expression. |
 | 1413 |  |
 | 1414 | (i < j): |
 | 1415 |   zone[i]->protection[j] |
 | 1416 |   = (total sums of present_pages from zone[i+1] to zone[j] on the node) |
 | 1417 |     / lowmem_reserve_ratio[i]; |
 | 1418 | (i = j): |
 | 1419 |    (should not be protected) = 0; |
 | 1420 | (i > j): |
 | 1421 |    (not necessary, but shown as 0) |
 | 1422 |  | 
 | 1423 | The default values of lowmem_reserve_ratio[i] are | 
 | 1424 |     256 (if zone[i] means DMA or DMA32 zone) | 
 | 1425 |     32  (others). | 
 | 1426 | As the expression above shows, these values are the reciprocals of the ratio. |
 | 1427 | 256 means 1/256: the number of protection pages becomes about 0.39% of the |
 | 1428 | total present pages of the higher zones on the node. |
 | 1429 |  |
 | 1430 | If you would like to protect more pages, use smaller values. |
 | 1431 | The minimum value is 1 (1/1 -> 100%). |
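
The (i < j) case of the expression can be tried with shell arithmetic.
zone_protection is a hypothetical helper, the present_pages figures are
invented, and the use of truncating integer division is an assumption.

```shell
# zone_protection (hypothetical helper): pages protected in zone[i] against
# requests that could be satisfied from zone[j], for i < j.
# $1 = lowmem_reserve_ratio[i]; remaining args = present_pages of
# zone[i+1] .. zone[j]. Integer division, as shell arithmetic provides.
zone_protection() {
    ratio=$1; shift
    total=0
    for pages in "$@"; do
        total=$(( total + pages ))
    done
    echo $(( total / ratio ))
}

# Invented page counts for two higher zones, ratio 256:
zone_protection 256 500000 13024   # prints 2004
# Same zones with ratio 32 protect eight times as many pages:
zone_protection 32 500000 13024    # prints 16032
```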
 | 1432 |  |
 | 1433 | page-cluster | 
 | 1434 | ------------ | 
 | 1435 |  | 
 | 1436 | page-cluster controls the number of pages which are written to swap in |
 | 1437 | a single attempt, i.e. the swap I/O size. |
 | 1438 |  | 
 | 1439 | It is a logarithmic value - setting it to zero means "1 page", setting | 
 | 1440 | it to 1 means "2 pages", setting it to 2 means "4 pages", etc. | 
 | 1441 |  | 
 | 1442 | The default value is three (eight pages at a time).  There may be some | 
 | 1443 | small benefits in tuning this to a different value if your workload is | 
 | 1444 | swap-intensive. | 
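
Because the value is a base-2 logarithm, the page count is a left shift of 1.
page_cluster_pages is a hypothetical helper name.

```shell
# page_cluster_pages (hypothetical helper): pages per swap I/O for a given
# page-cluster value, i.e. 2 to the power of that value.
page_cluster_pages() {
    echo $(( 1 << $1 ))
}

page_cluster_pages 0   # prints 1
page_cluster_pages 3   # prints 8
# On a live system: page_cluster_pages "$(cat /proc/sys/vm/page-cluster)"
```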
 | 1445 |  | 
 | 1446 | overcommit_memory | 
 | 1447 | ----------------- | 
 | 1448 |  | 
 | 1449 | Controls overcommit of system memory, possibly allowing processes |
 | 1450 | to allocate (but not use) more memory than is actually available. |
 | 1451 |  |
 | 1452 |  |
 | 1453 | 0	-	Heuristic overcommit handling. Obvious overcommits of |
 | 1454 | 		address space are refused. Used for a typical system. It |
 | 1455 | 		ensures a seriously wild allocation fails while allowing |
 | 1456 | 		overcommit to reduce swap usage.  root is allowed to |
 | 1457 | 		allocate slightly more memory in this mode. This is the |
 | 1458 | 		default. |
 | 1459 |  | 
 | 1460 | 1	-	Always overcommit. Appropriate for some scientific | 
 | 1461 | 		applications. | 
 | 1462 |  | 
 | 1463 | 2	-	Don't overcommit. The total address space commit | 
 | 1464 | 		for the system is not permitted to exceed swap plus a | 
 | 1465 | 		configurable percentage (default is 50) of physical RAM. | 
 | 1466 | 		Depending on the percentage you use, in most situations | 
 | 1467 | 		this means a process will not be killed while attempting | 
 | 1468 | 		to use already-allocated memory but will receive errors | 
 | 1469 | 		on memory allocation as	appropriate. | 
 | 1470 |  | 
 | 1471 | overcommit_ratio | 
 | 1472 | ---------------- | 
 | 1473 |  | 
 | 1474 | Percentage of physical memory size to include in overcommit calculations | 
 | 1475 | (see above.) | 
 | 1476 |  | 
 | 1477 | Memory allocation limit = swapspace + physmem * (overcommit_ratio / 100) | 
 | 1478 |  | 
 | 1479 | 	swapspace = total size of all swap areas | 
 | 1480 | 	physmem = size of physical memory in system | 
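
The formula can be evaluated directly in shell. commit_limit is a
hypothetical helper, and the sizes below are invented (2 GiB of swap and
4 GiB of RAM, both in kB). Multiplying before dividing keeps integer
truncation small.

```shell
# commit_limit (hypothetical helper): evaluate the formula above.
# $1 = swapspace, $2 = physmem (same unit as $1), $3 = overcommit_ratio
commit_limit() {
    echo $(( $1 + $2 * $3 / 100 ))
}

# 2097152 kB swap, 4194304 kB RAM, ratio 50:
commit_limit 2097152 4194304 50    # prints 4194304
```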
 | 1481 |  |
 | 1482 | nr_hugepages and hugetlb_shm_group | 
 | 1483 | ---------------------------------- | 
 | 1484 |  | 
 | 1485 | nr_hugepages configures the number of hugetlb pages reserved for the system. |
 | 1486 |  |
 | 1487 | hugetlb_shm_group contains the group id that is allowed to create SysV shared |
 | 1488 | memory segments using hugetlb pages. |
 | 1489 |  | 
 | 1490 | hugepages_treat_as_movable |
 | 1491 | -------------------------- | 
 | 1492 |  | 
 | 1493 | This parameter is only useful when kernelcore= is specified at boot time to | 
 | 1494 | create ZONE_MOVABLE for pages that may be reclaimed or migrated. Huge pages | 
 | 1495 | are not movable so are not normally allocated from ZONE_MOVABLE. A non-zero | 
 | 1496 | value written to hugepages_treat_as_movable allows huge pages to be allocated | 
 | 1497 | from ZONE_MOVABLE. | 
 | 1498 |  | 
 | 1499 | Once enabled, ZONE_MOVABLE is treated as an area of memory the huge |
 | 1500 | pages pool can easily grow or shrink within. Assuming that no running |
 | 1501 | applications mlock() a lot of memory, it is likely the huge pages pool |
 | 1502 | can grow to the size of ZONE_MOVABLE by repeatedly entering the desired value |
 | 1503 | into nr_hugepages and triggering page reclaim. |
 | 1504 |  | 
 | 1505 | laptop_mode |
 | 1506 | ----------- | 
 | 1507 |  | 
 | 1508 | laptop_mode is a knob that controls "laptop mode". All the things that are | 
 | 1509 | controlled by this knob are discussed in Documentation/laptop-mode.txt. | 
 | 1510 |  | 
 | 1511 | block_dump | 
 | 1512 | ---------- | 
 | 1513 |  | 
 | 1514 | block_dump enables block I/O debugging when set to a nonzero value. More | 
 | 1515 | information on block I/O debugging is in Documentation/laptop-mode.txt. | 
 | 1516 |  | 
 | 1517 | swap_token_timeout | 
 | 1518 | ------------------ | 
 | 1519 |  | 
 | 1520 | This file contains the valid hold time of the swap-out protection token. The |
 | 1521 | Linux VM has a token-based thrashing control mechanism and uses the token to |
 | 1522 | prevent unnecessary page faults in thrashing situations. The value is in |
 | 1523 | seconds and can be used to tune thrashing behavior. |
 | 1524 |  | 
 | 1525 | drop_caches |
 | 1526 | ----------- | 
 | 1527 |  | 
 | 1528 | Writing to this will cause the kernel to drop clean caches, dentries and | 
 | 1529 | inodes from memory, causing that memory to become free. | 
 | 1530 |  | 
 | 1531 | To free pagecache: | 
 | 1532 | 	echo 1 > /proc/sys/vm/drop_caches | 
 | 1533 | To free dentries and inodes: | 
 | 1534 | 	echo 2 > /proc/sys/vm/drop_caches | 
 | 1535 | To free pagecache, dentries and inodes: | 
 | 1536 | 	echo 3 > /proc/sys/vm/drop_caches | 
 | 1537 |  | 
 | 1538 | As this is a non-destructive operation and dirty objects are not freeable, the | 
 | 1539 | user should run `sync' first. | 
 | 1540 |  | 
 | 1541 |  | 
 | 1542 | 2.5 /proc/sys/dev - Device specific parameters |
 | 1543 | ---------------------------------------------- | 
 | 1544 |  | 
 | 1545 | Currently there is only support for CDROM drives, and for those, there is only | 
 | 1546 | one read-only  file containing information about the CD-ROM drives attached to | 
 | 1547 | the system: | 
 | 1548 |  | 
 | 1549 |   >cat /proc/sys/dev/cdrom/info  | 
 | 1550 |   CD-ROM information, Id: cdrom.c 2.55 1999/04/25  | 
 | 1551 |     | 
 | 1552 |   drive name:             sr0     hdb  | 
 | 1553 |   drive speed:            32      40  | 
 | 1554 |   drive # of slots:       1       0  | 
 | 1555 |   Can close tray:         1       1  | 
 | 1556 |   Can open tray:          1       1  | 
 | 1557 |   Can lock tray:          1       1  | 
 | 1558 |   Can change speed:       1       1  | 
 | 1559 |   Can select disk:        0       1  | 
 | 1560 |   Can read multisession:  1       1  | 
 | 1561 |   Can read MCN:           1       1  | 
 | 1562 |   Reports media changed:  1       1  | 
 | 1563 |   Can play audio:         1       1  | 
 | 1564 |  | 
 | 1565 |  | 
 | 1566 | You see two drives, sr0 and hdb, along with a list of their features. | 
 | 1567 |  | 
 | 1568 | 2.6 /proc/sys/sunrpc - Remote procedure calls | 
 | 1569 | --------------------------------------------- | 
 | 1570 |  | 
 | 1571 | This directory contains four files, which enable or disable debugging for the |
 | 1572 | RPC functions NFS, NFS-daemon, RPC and NLM. The default value of each is 0. |
 | 1573 | Set a file to one to turn debugging on for that function. |
 | 1574 |  | 
 | 1575 | 2.7 /proc/sys/net - Networking stuff | 
 | 1576 | ------------------------------------ | 
 | 1577 |  | 
 | 1578 | The interface  to  the  networking  parts  of  the  kernel  is  located  in | 
 | 1579 | /proc/sys/net. Table  2-3  shows all possible subdirectories. You may see only | 
 | 1580 | some of them, depending on your kernel's configuration. | 
 | 1581 |  | 
 | 1582 |  | 
 | 1583 | Table 2-3: Subdirectories in /proc/sys/net  | 
 | 1584 | .............................................................................. | 
 | 1585 |  Directory Content             Directory  Content             | 
 | 1586 |  core      General parameter   appletalk  Appletalk protocol  | 
 | 1587 |  unix      Unix domain sockets netrom     NET/ROM             | 
 | 1588 |  802       E802 protocol       ax25       AX25                | 
 | 1589 |  ethernet  Ethernet protocol   rose       X.25 PLP layer      | 
 | 1590 |  ipv4      IP version 4        x25        X.25 protocol       | 
 | 1591 |  ipx       IPX                 token-ring IBM token ring      | 
 | 1592 |  bridge    Bridging            decnet     DEC net             | 
 | 1593 |  ipv6      IP version 6                    | 
 | 1594 | .............................................................................. | 
 | 1595 |  | 
 | 1596 | We will concentrate on IP networking here. Since AX25, X.25, and DEC Net are |
 | 1597 | only minor players in the Linux world, we'll skip them in this chapter. You'll | 
 | 1598 | find some  short  info on Appletalk and IPX further on in this chapter. Review | 
 | 1599 | the online  documentation  and the kernel source to get a detailed view of the | 
 | 1600 | parameters for  those  protocols.  In  this  section  we'll  discuss  the | 
 | 1601 | subdirectories printed  in  bold letters in the table above. As default values | 
 | 1602 | are suitable for most needs, there is no need to change these values. | 
 | 1603 |  | 
 | 1604 | /proc/sys/net/core - Network core options | 
 | 1605 | ----------------------------------------- | 
 | 1606 |  | 
 | 1607 | rmem_default | 
 | 1608 | ------------ | 
 | 1609 |  | 
 | 1610 | The default setting of the socket receive buffer in bytes. | 
 | 1611 |  | 
 | 1612 | rmem_max | 
 | 1613 | -------- | 
 | 1614 |  | 
 | 1615 | The maximum receive socket buffer size in bytes. | 
 | 1616 |  | 
 | 1617 | wmem_default | 
 | 1618 | ------------ | 
 | 1619 |  | 
 | 1620 | The default setting (in bytes) of the socket send buffer. | 
 | 1621 |  | 
 | 1622 | wmem_max | 
 | 1623 | -------- | 
 | 1624 |  | 
 | 1625 | The maximum send socket buffer size in bytes. | 
 | 1626 |  | 
 | 1627 | message_burst and message_cost | 
 | 1628 | ------------------------------ | 
 | 1629 |  | 
 | 1630 | These parameters are used to limit the warning messages written to the kernel |
 | 1631 | log from the networking code. They enforce a rate limit to guard against |
 | 1632 | denial-of-service attacks. A higher message_cost factor results in fewer |
 | 1633 | messages being written. message_burst controls when messages will be |
 | 1634 | dropped. The default settings limit warning messages to one every five |
 | 1635 | seconds. |
 | 1636 |  | 
 | 1637 | warnings |
 | 1638 | -------- | 
 | 1639 |  | 
 | 1640 | This controls console messages from the networking stack that can occur because | 
 | 1641 | of problems on the network like duplicate address or bad checksums. Normally, | 
 | 1642 | this should be enabled, but if the problem persists the messages can be | 
 | 1643 | disabled. | 
 | 1644 |  | 
 | 1645 |  | 
 | 1646 | netdev_max_backlog |
 | 1647 | ------------------ | 
 | 1648 |  | 
 | 1649 | Maximum number of packets queued on the INPUT side when the interface |
 | 1650 | receives packets faster than the kernel can process them. |
 | 1651 |  | 
 | 1652 | optmem_max | 
 | 1653 | ---------- | 
 | 1654 |  | 
 | 1655 | Maximum ancillary buffer size allowed per socket. Ancillary data is a sequence | 
 | 1656 | of struct cmsghdr structures with appended data. | 
 | 1657 |  | 
 | 1658 | /proc/sys/net/unix - Parameters for Unix domain sockets | 
 | 1659 | ------------------------------------------------------- | 
 | 1660 |  | 
 | 1661 | There are  only  two  files  in this subdirectory. They control the delays for | 
 | 1662 | deleting and destroying socket descriptors. | 

2.8 /proc/sys/net/ipv4 - IPV4 settings
--------------------------------------

IP version 4 is still the most used protocol in Unix networking. It will be
replaced by IP version 6 in the next couple of years, but for the moment it's
the de facto standard for the internet and is used in most networking
environments around the world. Because of the importance of this protocol,
we'll have a deeper look into the subtree controlling the behavior of the IPv4
subsystem of the Linux kernel.

Let's start with the entries in /proc/sys/net/ipv4.

ICMP settings
-------------

icmp_echo_ignore_all and icmp_echo_ignore_broadcasts
----------------------------------------------------

These switches control whether the kernel ignores all ICMP ECHO requests
(icmp_echo_ignore_all) or just those sent to broadcast and multicast addresses
(icmp_echo_ignore_broadcasts). Write 1 to turn a switch on and 0 to turn it
off.

Please note that if you accept ICMP echo requests with a broadcast/multicast
destination address, your network may be used as an amplifier for
denial-of-service packet flooding attacks against other hosts.

icmp_destunreach_rate, icmp_echoreply_rate, icmp_paramprob_rate and icmp_timeexceed_rate
----------------------------------------------------------------------------------------

Sets limits for sending ICMP packets to specific targets. A value of zero
disables all limiting. Any positive value sets the maximum packet rate in
hundredths of a second (on Intel systems).

IP settings
-----------

ip_autoconfig
-------------

This file contains the number one if the host received its IP configuration by
RARP, BOOTP, DHCP or a similar mechanism. Otherwise it is zero.

ip_default_ttl
--------------

TTL (Time To Live) for IPv4 interfaces. This is simply the maximum number of
hops a packet may travel.

ip_dynaddr
----------

Enable dynamic socket address rewriting on interface address change. This is
useful for dialup interfaces with changing IP addresses.

ip_forward
----------

Enable or disable forwarding of IP packets between interfaces. Changing this
value resets all other parameters to their default values. The defaults differ
depending on whether the kernel is configured as a host or as a router.

ip_local_port_range
-------------------

Range of ports used by TCP and UDP to choose the local port. Contains two
numbers: the first is the lowest port, the second the highest local port.
Default is 1024-4999. Should be changed to 32768-61000 for high-usage systems.
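
To see how many local ports a given range provides, the two numbers can be
read and subtracted. A minimal sketch follows; the function name
port_range_size and its optional file argument are hypothetical.

```shell
#!/bin/sh
# Report how many local ports a range file offers.
# The file holds two numbers: the lowest and the highest local port.
# The optional argument is a hypothetical override for the file path.
port_range_size() {
    file=${1:-/proc/sys/net/ipv4/ip_local_port_range}
    # read fails on a missing trailing newline but still fills the variables
    read -r low high < "$file" || [ -n "$high" ] || return 1
    echo $((high - low + 1))
}
```

For the default range 1024-4999 this prints 3976; for 32768-61000 it prints
28233.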

ip_no_pmtu_disc
---------------

Global switch to turn path MTU discovery off. It can also be set on a
per-socket basis by applications or on a per-route basis.

ip_masq_debug
-------------

Enable/disable debugging of IP masquerading.

IP fragmentation settings
-------------------------

ipfrag_high_thresh and ipfrag_low_thresh
----------------------------------------

Maximum memory used to reassemble IP fragments. When ipfrag_high_thresh bytes
of memory are allocated for this purpose, the fragment handler will toss
packets until ipfrag_low_thresh is reached.

ipfrag_time
-----------

Time in seconds to keep an IP fragment in memory.

TCP settings
------------

tcp_ecn
-------

This file controls the use of the ECN bit in the IPv4 headers. Explicit
Congestion Notification is a new feature, but some routers and firewalls block
traffic that has this bit set, so it may be necessary to echo 0 to
/proc/sys/net/ipv4/tcp_ecn if you want to talk to these sites. For more
information, see RFC2481.
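
A minimal sketch of the write described above, with a read-back to confirm
that the kernel accepted the value. The helper name set_sysctl_file is
hypothetical, and writing to /proc/sys normally requires root privileges.

```shell
#!/bin/sh
# Write a value to a /proc/sys file and read it back to confirm.
# The helper name is an assumption made for this example.
set_sysctl_file() {
    file=$1
    value=$2
    [ -w "$file" ] || { echo "cannot write $file" >&2; return 1; }
    echo "$value" > "$file" || return 1
    # Succeed only if the file now reports the value we wrote.
    [ "$(cat "$file")" = "$value" ]
}

# Example (as root): turn ECN off.
# set_sysctl_file /proc/sys/net/ipv4/tcp_ecn 0
```

The read-back is a simple string comparison, so it suits single-number files
such as tcp_ecn; files that hold several values may be echoed back by the
kernel in a different layout.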

tcp_retrans_collapse
--------------------

Bug-to-bug compatibility with some broken printers. On retransmit, try to send
larger packets to work around bugs in certain TCP stacks. Can be turned off by
setting it to zero.

tcp_keepalive_probes
--------------------

Number of keepalive probes TCP sends out before it decides that the connection
is broken.

tcp_keepalive_time
------------------

How often TCP sends out keepalive messages when keepalive is enabled. The
default is 2 hours.

tcp_syn_retries
---------------

Number of times initial SYNs for a TCP connection attempt will be
retransmitted. Should not be higher than 255. This applies only to outgoing
connections; for incoming connections the number of retransmits is defined by
tcp_retries1.

tcp_sack
--------

Enable selective acknowledgments as defined in RFC2018.

tcp_timestamps
--------------

Enable timestamps as defined in RFC1323.

tcp_stdurg
----------

Enable the strict RFC793 interpretation of the TCP urgent pointer field. The
default is to use the BSD-compatible interpretation, in which the urgent
pointer points to the first byte after the urgent data. The RFC793
interpretation is to have it point to the last byte of urgent data. Enabling
this option may lead to interoperability problems. Disabled by default.

tcp_syncookies
--------------

Only valid when the kernel was compiled with CONFIG_SYNCOOKIES. Send out
syncookies when the syn backlog queue of a socket overflows. This is to ward
off the common 'syn flood attack'. Disabled by default.

Note that the concept of a socket backlog is abandoned. This means the peer
may not receive reliable error messages from an overloaded server with
syncookies enabled.

tcp_window_scaling
------------------

Enable window scaling as defined in RFC1323.

tcp_fin_timeout
---------------

The length of time in seconds to wait for a final FIN before the socket is
forcibly closed. This is strictly a violation of the TCP specification, but
required to prevent denial-of-service attacks.

tcp_max_ka_probes
-----------------

Indicates how many keepalive probes are sent per slow timer run. Should not be
set too high to prevent bursts.

tcp_max_syn_backlog
-------------------

Length of the per-socket backlog queue. Since Linux 2.2 the backlog specified
in listen(2) only specifies the length of the backlog queue of already
established sockets. When more connection requests arrive, Linux starts to
drop packets. When syncookies are enabled the packets are still answered and
the maximum queue is effectively ignored.

tcp_retries1
------------

Defines how often an answer to a TCP connection request is retransmitted
before giving up.

tcp_retries2
------------

Defines how often a TCP packet is retransmitted before giving up.

Interface specific settings
---------------------------

In the directory /proc/sys/net/ipv4/conf you'll find one subdirectory for each
interface the system knows about and one directory called all. Changes in the
all subdirectory affect all interfaces, whereas changes in the other
subdirectories affect only one interface. All directories have the same
entries:

accept_redirects
----------------

This switch decides if the kernel accepts ICMP redirect messages or not. The
default is 'yes' if the kernel is configured for a regular host and 'no' for a
router configuration.

accept_source_route
-------------------

Determines whether source-routed packets are accepted or declined. The default
depends on the kernel configuration. It's 'yes' for routers and 'no' for
hosts.

bootp_relay
-----------

Accept packets with source address 0.b.c.d with destinations not to this host
as local ones. It is supposed that a BOOTP relay daemon will catch and forward
such packets.

The default is 0, since this feature is not implemented yet (kernel version
2.2.12).

forwarding
----------

Enable or disable IP forwarding on this interface.

log_martians
------------

Log packets with source addresses with no known route to the kernel log.

mc_forwarding
-------------

Do multicast routing. The kernel needs to be compiled with CONFIG_MROUTE and a
multicast routing daemon is required.

proxy_arp
---------

Enable (1) or disable (0) proxy ARP.

rp_filter
---------

Integer value that determines whether source validation should be performed.
1 means yes, 0 means no. Disabled by default, but protection against
local/broadcast address spoofing is always on.

If you set this to 1 on a router that is the only connection for a network to
the net, it will prevent spoofing attacks against your internal networks
(external addresses can still be spoofed), without the need for additional
firewall rules.

secure_redirects
----------------

Accept ICMP redirect messages only from gateways listed in the default gateway
list. Enabled by default.

shared_media
------------

If it is not set, the kernel does not assume that different subnets on this
device can communicate directly. Default setting is 'yes'.

send_redirects
--------------

Determines whether to send ICMP redirects to other hosts.
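
Because every directory under conf holds the same entries, one entry can be
compared across all interfaces with a short loop. This is a sketch only; the
function name show_conf_entry and its optional directory argument are
assumptions made for illustration.

```shell
#!/bin/sh
# Print one entry (for example rp_filter) for every subdirectory of conf.
# The second argument is a hypothetical override for the conf directory.
show_conf_entry() {
    entry=$1
    conf=${2:-/proc/sys/net/ipv4/conf}
    for dir in "$conf"/*/; do
        # Skip subdirectories that lack the entry or cannot be read.
        [ -r "$dir$entry" ] || continue
        printf '%s: %s\n' "$(basename "$dir")" "$(cat "$dir$entry")"
    done
}
```

For example, show_conf_entry rp_filter prints one "name: value" line for the
all directory and for each interface directory.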

Routing settings
----------------

The directory /proc/sys/net/ipv4/route contains several files to control
routing issues.

error_burst and error_cost
--------------------------

These parameters are used to limit how many ICMP destination unreachable
messages are sent from the host in question. ICMP destination unreachable
messages are sent when we cannot reach the next hop while trying to transmit a
packet. The kernel will also print some error messages to the kernel log if
someone is ignoring our ICMP redirects. The higher the error_cost factor is,
the fewer destination unreachable and error messages will be let through.
error_burst controls when destination unreachable messages and error messages
will be dropped. The default settings limit warning messages to five every
second.

flush
-----

Writing to this file results in a flush of the routing cache.

gc_elasticity, gc_interval, gc_min_interval_ms, gc_timeout, gc_thresh
---------------------------------------------------------------------

Values to control the frequency and behavior of the garbage collection
algorithm for the routing cache. gc_min_interval is deprecated and replaced
by gc_min_interval_ms.

max_size
--------

Maximum size of the routing cache. Old entries will be purged once the cache
has reached this size.

redirect_load, redirect_number
------------------------------

Factors which determine if more ICMP redirects should be sent to a specific
host. No redirects will be sent once the load limit or the maximum number of
redirects has been reached.

redirect_silence
----------------

Timeout for redirects. After this period redirects will be sent again, even if
they had been stopped because the load or number limit was reached.

Network Neighbor handling
-------------------------

Settings about how to handle connections with direct neighbors (nodes attached
to the same link) can be found in the directory /proc/sys/net/ipv4/neigh.

As with the conf directory, there is a default subdirectory which holds the
default values, and one directory for each interface. The contents of the
directories are identical, with the single exception that the default settings
contain additional options to set garbage collection parameters.

In the interface directories you'll find the following entries:

base_reachable_time, base_reachable_time_ms
-------------------------------------------

A base value used for computing the random reachable time value as specified
in RFC2461.

base_reachable_time, which is deprecated, is expressed in seconds.
base_reachable_time_ms is expressed in milliseconds.

retrans_time, retrans_time_ms
-----------------------------

The time between retransmitted Neighbor Solicitation messages. Used for
address resolution and to determine if a neighbor is unreachable.

retrans_time, which is deprecated, is expressed in 1/100 seconds (for IPv4) or
in jiffies (for IPv6). retrans_time_ms is expressed in milliseconds.

unres_qlen
----------

Maximum queue length for a pending ARP request - the number of packets which
are accepted from other layers while the ARP address is still being resolved.

anycast_delay
-------------

Maximum for random delay of answers to neighbor solicitation messages in
jiffies (1/100 sec). Not yet implemented (Linux does not have anycast support
yet).

ucast_solicit
-------------

Maximum number of retries for unicast solicitation.

mcast_solicit
-------------

Maximum number of retries for multicast solicitation.

delay_first_probe_time
----------------------

Delay for the first time probe if the neighbor is reachable (see
gc_stale_time).

locktime
--------

An ARP/neighbor entry is only replaced with a new one if the old is at least
locktime old. This prevents ARP cache thrashing.

proxy_delay
-----------

Maximum time (real time is random [0..proxy_delay]) before answering to an ARP
request for which we have a proxy ARP entry. In some cases, this is used to
prevent network flooding.

proxy_qlen
----------

Maximum queue length of the delayed proxy ARP timer (see proxy_delay).

app_solicit
-----------

Determines the number of requests to send to the user level ARP daemon. Use 0
to turn off.

gc_stale_time
-------------

Determines how often to check for stale ARP entries. After an ARP entry is
stale it will be resolved again (which is useful when an IP address migrates
to another machine). When ucast_solicit is greater than 0 it first tries to
send an ARP packet directly to the known host. When that fails and
mcast_solicit is greater than 0, an ARP request is broadcast.

2.9 Appletalk
-------------

The /proc/sys/net/appletalk directory holds the Appletalk configuration data
when Appletalk is loaded. The configurable parameters are:

aarp-expiry-time
----------------

The amount of time we keep an ARP entry before expiring it. Used to age out
old hosts.

aarp-resolve-time
-----------------

The amount of time we will spend trying to resolve an Appletalk address.

aarp-retransmit-limit
---------------------

The number of times we will retransmit a query before giving up.

aarp-tick-time
--------------

Controls the rate at which expirations are checked.

The directory /proc/net/appletalk holds the list of active Appletalk sockets
on a machine.

The fields indicate the DDP type, the local address (in network:node format),
the remote address, the size of the transmit pending queue, the size of the
received queue (bytes waiting for applications to read), the state, and the
uid owning the socket.

/proc/net/atalk_iface lists all the interfaces configured for Appletalk. It
shows the name of the interface, its Appletalk address, the network range on
that address (or network number for phase 1 networks), and the status of the
interface.

/proc/net/atalk_route lists each known network route. It lists the target
(network) that the route leads to, the router (may be directly connected), the
route flags, and the device the route is using.

2.10 IPX
--------

The IPX protocol has no tunable values in /proc/sys/net.

The IPX protocol does, however, provide /proc/net/ipx. This lists each IPX
socket giving the local and remote addresses in Novell format (that is
network:node:port). In accordance with the strange Novell tradition,
everything but the port is in hex. Not_Connected is displayed for sockets that
are not tied to a specific remote address. The Tx and Rx queue sizes indicate
the number of bytes pending for transmission and reception. The state
indicates the state the socket is in and the uid is the owning uid of the
socket.

The /proc/net/ipx_interface file lists all IPX interfaces. For each interface
it gives the network number, the node number, and indicates if the network is
the primary network. It also indicates which device it is bound to (or
Internal for internal networks) and the Frame Type if appropriate. Linux
supports 802.3, 802.2, 802.2 SNAP and DIX (Blue Book) ethernet framing for
IPX.

The /proc/net/ipx_route table holds a list of IPX routes. For each route it
gives the destination network, the router node (or Directly) and the network
address of the router (or Connected) for internal networks.

2.11 /proc/sys/fs/mqueue - POSIX message queues filesystem
----------------------------------------------------------

The "mqueue" filesystem provides the necessary kernel features to enable the
creation of a user space library that implements the POSIX message queues
API (as noted by the MSG tag in the POSIX 1003.1-2001 version of the System
Interfaces specification).

The "mqueue" filesystem contains values for determining/setting the amount of
resources used by the file system.

/proc/sys/fs/mqueue/queues_max is a read/write file for setting/getting the
maximum number of message queues allowed on the system.

/proc/sys/fs/mqueue/msg_max is a read/write file for setting/getting the
maximum number of messages in a queue. It is in fact the upper bound for
another (user) limit, which is set in the mq_open invocation. That attribute
of a queue must be less than or equal to msg_max.

/proc/sys/fs/mqueue/msgsize_max is a read/write file for setting/getting the
maximum message size (it is an attribute of every message queue, set during
its creation).
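
Since the attributes passed to mq_open are bounded by these files, the
attributes a program intends to request can be checked up front. A minimal
sketch, assuming that the message size requested at creation is bounded by
msgsize_max in the same way the message count is bounded by msg_max; the
function name check_mq_attr and its third argument are hypothetical.

```shell
#!/bin/sh
# Check requested queue attributes against the system-wide limits.
# $1 = requested maximum number of messages, $2 = requested message size.
# $3 is a hypothetical override for the mqueue directory.
# Returns 0 if both fit, 1 if one exceeds its limit, 2 if a limit is unreadable.
check_mq_attr() {
    maxmsg=$1
    msgsize=$2
    dir=${3:-/proc/sys/fs/mqueue}
    limit_msg=$(cat "$dir/msg_max") || return 2
    limit_size=$(cat "$dir/msgsize_max") || return 2
    [ "$maxmsg" -le "$limit_msg" ] && [ "$msgsize" -le "$limit_size" ]
}
```

A caller might use it as: check_mq_attr 10 8192 || echo "limits too low".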

2.12 /proc/<pid>/oom_adj - Adjust the oom-killer score
------------------------------------------------------

This file can be used to adjust the score used to select which processes
should be killed in an out-of-memory situation. Giving it a high score will
increase the likelihood of this process being killed by the oom-killer. Valid
values are in the range -16 to +15, plus the special value -17, which disables
oom-killing altogether for this process.
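
The valid range can be enforced before anything is written. The following is
a minimal sketch under the limits stated above; the function name set_oom_adj
and its optional third argument (an alternative /proc root) are assumptions
made for illustration.

```shell
#!/bin/sh
# Validate an oom_adj value and write it for the given pid.
# Valid values: -16..15, plus -17 to disable oom-killing for the process.
# $3 is a hypothetical override for the /proc mount point.
set_oom_adj() {
    pid=$1
    adj=$2
    root=${3:-/proc}
    # Reject anything that is not an optionally negative integer.
    case $adj in
        ''|-|*[!0-9-]*|?*-*) echo "not an integer: $adj" >&2; return 1 ;;
    esac
    if [ "$adj" -lt -17 ] || [ "$adj" -gt 15 ]; then
        echo "out of range (-17..15): $adj" >&2
        return 1
    fi
    printf '%s\n' "$adj" > "$root/$pid/oom_adj"
}
```

For example, set_oom_adj 1234 -17 exempts pid 1234 from the oom-killer, while
set_oom_adj 1234 16 is rejected without touching the file.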

2.13 /proc/<pid>/oom_score - Display current oom-killer score
-------------------------------------------------------------

This file can be used to check the current score used by the oom-killer for
any given <pid>. Use it together with /proc/<pid>/oom_adj to tune which
process should be killed in an out-of-memory situation.
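
To find the processes most likely to be selected, the scores of all processes
can be collected and sorted. A minimal sketch; the function name
top_oom_scores and its optional arguments are hypothetical.

```shell
#!/bin/sh
# List the processes with the highest oom_score, highest first.
# Output format: "<score> <pid>", one process per line.
# $1 = how many to show (default 5); $2 is a hypothetical override for
# the /proc mount point.
top_oom_scores() {
    count=${1:-5}
    root=${2:-/proc}
    for f in "$root"/[0-9]*/oom_score; do
        [ -r "$f" ] || continue
        pid=${f%/oom_score}
        pid=${pid##*/}
        # A process may exit between the glob and the read; skip it then.
        score=$(cat "$f" 2>/dev/null) || continue
        printf '%s %s\n' "$score" "$pid"
    done | sort -rn | head -n "$count"
}
```

The pids it reports are natural candidates for adjustment through
/proc/<pid>/oom_adj.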

------------------------------------------------------------------------------
Summary
------------------------------------------------------------------------------
Certain aspects of kernel behavior can be modified at runtime, without the
need to recompile the kernel, or even to reboot the system. The files in the
/proc/sys tree can not only be read, but also modified. You can use the echo
command to write values into these files, thereby changing the default
settings of the kernel.
------------------------------------------------------------------------------

2.14 /proc/<pid>/io - Display the IO accounting fields
------------------------------------------------------

This file contains IO statistics for each running process.

Example
-------

test:/tmp # dd if=/dev/zero of=/tmp/test.dat &
[1] 3828

test:/tmp # cat /proc/3828/io
rchar: 323934931
wchar: 323929600
syscr: 632687
syscw: 632675
read_bytes: 0
write_bytes: 323932160
cancelled_write_bytes: 0


Description
-----------

rchar
-----

I/O counter: chars read
The number of bytes which this task has caused to be read from storage. This
is simply the sum of bytes which this process passed to read() and pread().
It includes things like tty IO and it is unaffected by whether or not actual
physical disk IO was required (the read might have been satisfied from
pagecache).


wchar
-----

I/O counter: chars written
The number of bytes which this task has caused, or will cause, to be written
to disk. Similar caveats apply here as with rchar.


syscr
-----

I/O counter: read syscalls
Attempt to count the number of read I/O operations, i.e. syscalls like read()
and pread().


syscw
-----

I/O counter: write syscalls
Attempt to count the number of write I/O operations, i.e. syscalls like
write() and pwrite().


read_bytes
----------

I/O counter: bytes read
Attempt to count the number of bytes which this process really did cause to
be fetched from the storage layer. Done at the submit_bio() level, so it is
accurate for block-backed filesystems. <please add status regarding NFS and
CIFS at a later time>


write_bytes
-----------

I/O counter: bytes written
Attempt to count the number of bytes which this process caused to be sent to
the storage layer. This is done at page-dirtying time.


cancelled_write_bytes
---------------------

The big inaccuracy here is truncate. If a process writes 1MB to a file and
then deletes the file, it will in fact perform no writeout. But it will have
been accounted as having caused 1MB of write.
In other words: The number of bytes which this process caused to not happen,
by truncating pagecache. A task can cause "negative" IO too. If this task
truncates some dirty pagecache, some IO which another task has been accounted
for (in its write_bytes) will not be happening. We _could_ just subtract that
from the truncating task's write_bytes, but there is information loss in doing
that.
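
As a rough illustration of how write_bytes and cancelled_write_bytes relate,
the sketch below subtracts one from the other for a single process. As noted
above, this subtraction loses information when the truncated pagecache was
dirtied by another task, so treat the result as an estimate only. The function
name net_write_bytes and its optional second argument are hypothetical.

```shell
#!/bin/sh
# Estimate the bytes a process really sent to the storage layer:
# write_bytes minus cancelled_write_bytes.
# $2 is a hypothetical override for the /proc mount point.
net_write_bytes() {
    pid=$1
    root=${2:-/proc}
    awk -F': *' '
        $1 == "write_bytes"           { written = $2 }
        $1 == "cancelled_write_bytes" { cancelled = $2 }
        END { printf "%.0f\n", written - cancelled }
    ' "$root/$pid/io"
}
```

For the dd example shown earlier, cancelled_write_bytes is 0, so the estimate
equals write_bytes.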


Note
----

At its current implementation state, this is a bit racy on 32-bit machines: if
process A reads process B's /proc/<pid>/io while process B is updating one of
those 64-bit counters, process A could see an intermediate result.


More information about this can be found within the taskstats documentation in
Documentation/accounting.

2.15 /proc/<pid>/coredump_filter - Core dump filtering settings
---------------------------------------------------------------

When a process is dumped, all anonymous memory is written to a core file as
long as the size of the core file isn't limited. But sometimes we don't want
to dump some memory segments, for example, huge shared memory. Conversely,
sometimes we want to save file-backed memory segments into a core file, not
only the individual files.

/proc/<pid>/coredump_filter allows you to customize which memory segments
will be dumped when the <pid> process is dumped. coredump_filter is a bitmask
of memory types. If a bit of the bitmask is set, memory segments of the
corresponding memory type are dumped, otherwise they are not dumped.

The following 4 memory types are supported:
  - (bit 0) anonymous private memory
  - (bit 1) anonymous shared memory
  - (bit 2) file-backed private memory
  - (bit 3) file-backed shared memory

  Note that MMIO pages such as frame buffer are never dumped and vDSO pages
  are always dumped regardless of the bitmask status.

The default value of coredump_filter is 0x3; this means all anonymous memory
segments are dumped.
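
The bit assignments listed above can be decoded mechanically. The following is
a minimal sketch; the function name decode_coredump_filter is hypothetical,
and the mask is passed as an argument (decimal or 0x-prefixed hexadecimal)
rather than read from the proc file.

```shell
#!/bin/sh
# Decode a coredump_filter bitmask into the memory types it selects.
# Accepts decimal or 0x-prefixed hexadecimal input.
decode_coredump_filter() {
    mask=$(($1))
    bit=0
    for type in "anonymous private" "anonymous shared" \
                "file-backed private" "file-backed shared"; do
        # Test the current bit and report its memory type if it is set.
        if [ $(( (mask >> bit) & 1 )) -eq 1 ]; then
            echo "bit $bit: $type memory"
        fi
        bit=$((bit + 1))
    done
}
```

Calling decode_coredump_filter 0x3 prints the two anonymous memory types,
which matches the default described above.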

If you don't want to dump all shared memory segments attached to pid 1234,
write 1 to the process's proc file.

  $ echo 0x1 > /proc/1234/coredump_filter

When a new process is created, the process inherits the bitmask status from its
parent. It is useful to set up coredump_filter before the program runs.
For example:

  $ echo 0x7 > /proc/self/coredump_filter
  $ ./some_program

------------------------------------------------------------------------------