| Linus Torvalds | 1da177e | 2005-04-16 15:20:36 -0700 | [diff] [blame] | 1 | ------------------------------------------------------------------------------ | 
|  | 2 | T H E  /proc   F I L E S Y S T E M | 
|  | 3 | ------------------------------------------------------------------------------ | 
|  | 4 | /proc/sys         Terrehon Bowden <terrehon@pacbell.net>        October 7 1999 | 
|  | 5 | Bodo Bauer <bb@ricochet.net> | 
|  | 6 |  | 
|  | 7 | 2.4.x update	  Jorge Nerin <comandante@zaralinux.com>      November 14 2000 | 
|  | 8 | ------------------------------------------------------------------------------ | 
|  | 9 | Version 1.3                                              Kernel version 2.2.12 | 
|  | 10 | Kernel version 2.4.0-test11-pre4 | 
|  | 11 | ------------------------------------------------------------------------------ | 
|  | 12 |  | 
|  | 13 | Table of Contents | 
|  | 14 | ----------------- | 
|  | 15 |  | 
|  | 16 | 0     Preface | 
|  | 17 | 0.1	Introduction/Credits | 
|  | 18 | 0.2	Legal Stuff | 
|  | 19 |  | 
|  | 20 | 1	Collecting System Information | 
|  | 21 | 1.1	Process-Specific Subdirectories | 
|  | 22 | 1.2	Kernel data | 
|  | 23 | 1.3	IDE devices in /proc/ide | 
|  | 24 | 1.4	Networking info in /proc/net | 
|  | 25 | 1.5	SCSI info | 
|  | 26 | 1.6	Parallel port info in /proc/parport | 
|  | 27 | 1.7	TTY info in /proc/tty | 
|  | 28 | 1.8	Miscellaneous kernel statistics in /proc/stat | 
|  | 29 |  | 
|  | 30 | 2	Modifying System Parameters | 
|  | 31 | 2.1	/proc/sys/fs - File system data | 
|  | 32 | 2.2	/proc/sys/fs/binfmt_misc - Miscellaneous binary formats | 
|  | 33 | 2.3	/proc/sys/kernel - general kernel parameters | 
|  | 34 | 2.4	/proc/sys/vm - The virtual memory subsystem | 
|  | 35 | 2.5	/proc/sys/dev - Device specific parameters | 
|  | 36 | 2.6	/proc/sys/sunrpc - Remote procedure calls | 
|  | 37 | 2.7	/proc/sys/net - Networking stuff | 
|  | 38 | 2.8	/proc/sys/net/ipv4 - IPV4 settings | 
|  | 39 | 2.9	Appletalk | 
|  | 40 | 2.10	IPX | 
|  | 41 | 2.11	/proc/sys/fs/mqueue - POSIX message queues filesystem | 
|  | 42 |  | 
|  | 43 | ------------------------------------------------------------------------------ | 
|  | 44 | Preface | 
|  | 45 | ------------------------------------------------------------------------------ | 
|  | 46 |  | 
|  | 47 | 0.1 Introduction/Credits | 
|  | 48 | ------------------------ | 
|  | 49 |  | 
|  | 50 | This documentation is  part of a soon (or  so we hope) to be  released book on | 
|  | 51 | the SuSE  Linux distribution. As  there is  no complete documentation  for the | 
|  | 52 | /proc file system and we've used  many freely available sources to write these | 
|  | 53 | chapters, it  seems only fair  to give the work  back to the  Linux community. | 
|  | 54 | This work is  based on the 2.2.*  kernel version and the  upcoming 2.4.*. I'm | 
|  | 55 | afraid it's still far from complete, but we  hope it will be useful. As far as | 
|  | 56 | we know, it is the first 'all-in-one' document about the /proc file system. It | 
|  | 57 | is focused  on the Intel  x86 hardware,  so if you  are looking for  PPC, ARM, | 
|  | 58 | SPARC, AXP, etc., features, you probably  won't find what you are looking for. | 
|  | 59 | It also only covers IPv4 networking, not IPv6 nor other protocols - sorry. But | 
|  | 60 | additions and patches  are welcome and will  be added to this  document if you | 
|  | 61 | mail them to Bodo. | 
|  | 62 |  | 
|  | 63 | We'd like  to  thank Alan Cox, Rik van Riel, and Alexey Kuznetsov and a lot of | 
|  | 64 | other people for help compiling this documentation. We'd also like to extend a | 
|  | 65 | special thank  you to Andi Kleen for documentation, which we relied on heavily | 
|  | 66 | to create  this  document,  as well as the additional information he provided. | 
|  | 67 | Thanks to  everybody  else  who contributed source or docs to the Linux kernel | 
|  | 68 | and helped create a great piece of software... :) | 
|  | 69 |  | 
|  | 70 | If you  have  any comments, corrections or additions, please don't hesitate to | 
|  | 71 | contact Bodo  Bauer  at  bb@ricochet.net.  We'll  be happy to add them to this | 
|  | 72 | document. | 
|  | 73 |  | 
|  | 74 | The   latest   version    of   this   document   is    available   online   at | 
|  | 75 | http://skaro.nightcrawler.com/~bb/Docs/Proc as an HTML version. | 
|  | 76 |  | 
|  | 77 | If  the above  address does  not work  for you,  you could  try the  kernel | 
|  | 78 | mailing  list  at  linux-kernel@vger.kernel.org  and/or try  to  reach  me  at | 
|  | 79 | comandante@zaralinux.com. | 
|  | 80 |  | 
|  | 81 | 0.2 Legal Stuff | 
|  | 82 | --------------- | 
|  | 83 |  | 
|  | 84 | We don't  guarantee  the  correctness  of this document, and if you come to us | 
|  | 85 | complaining about  how  you  screwed  up  your  system  because  of  incorrect | 
|  | 86 | documentation, we won't feel responsible... | 
|  | 87 |  | 
|  | 88 | ------------------------------------------------------------------------------ | 
|  | 89 | CHAPTER 1: COLLECTING SYSTEM INFORMATION | 
|  | 90 | ------------------------------------------------------------------------------ | 
|  | 91 |  | 
|  | 92 | ------------------------------------------------------------------------------ | 
|  | 93 | In This Chapter | 
|  | 94 | ------------------------------------------------------------------------------ | 
|  | 95 | * Investigating  the  properties  of  the  pseudo  file  system  /proc and its | 
|  | 96 | ability to provide information on the running Linux system | 
|  | 97 | * Examining /proc's structure | 
|  | 98 | * Uncovering  various  information  about the kernel and the processes running | 
|  | 99 | on the system | 
|  | 100 | ------------------------------------------------------------------------------ | 
|  | 101 |  | 
|  | 102 |  | 
|  | 103 | The proc  file  system acts as an interface to internal data structures in the | 
|  | 104 | kernel. It  can  be  used to obtain information about the system and to change | 
|  | 105 | certain kernel parameters at runtime (sysctl). | 
|  | 106 |  | 
|  | 107 | First, we'll  take  a  look  at the read-only parts of /proc. In Chapter 2, we | 
|  | 108 | show you how you can use /proc/sys to change settings. | 
|  | 109 |  | 
|  | 110 | 1.1 Process-Specific Subdirectories | 
|  | 111 | ----------------------------------- | 
|  | 112 |  | 
|  | 113 | The directory  /proc  contains  (among other things) one subdirectory for each | 
|  | 114 | process running on the system, which is named after the process ID (PID). | 
|  | 115 |  | 
|  | 116 | The link  self  points  to  the  process reading the file system. Each process | 
|  | 117 | subdirectory has the entries listed in Table 1-1. | 
|  | 118 |  | 
|  | 119 |  | 
|  | 120 | Table 1-1: Process specific entries in /proc | 
|  | 121 | .............................................................................. | 
|  | 122 | File    Content | 
|  | 123 | cmdline Command line arguments | 
|  | 124 | cpu	 Current and last cpu in which it was executed		(2.4)(smp) | 
|  | 125 | cwd	 Link to the current working directory | 
|  | 126 | environ Values of environment variables | 
|  | 127 | exe	 Link to the executable of this process | 
|  | 128 | fd      Directory, which contains all file descriptors | 
|  | 129 | maps	 Memory maps to executables and library files		(2.4) | 
|  | 130 | mem     Memory held by this process | 
|  | 131 | root	 Link to the root directory of this process | 
|  | 132 | stat    Process status | 
|  | 133 | statm   Process memory status information | 
|  | 134 | status  Process status in human readable form | 
|  | 135 | wchan   If CONFIG_KALLSYMS is set, a pre-decoded wchan | 
| Mauricio Lin | e070ad4 | 2005-09-03 15:55:10 -0700 | [diff] [blame] | 136 | smaps	 Extension based on maps, presenting the rss size for each mapped file | 
| Linus Torvalds | 1da177e | 2005-04-16 15:20:36 -0700 | [diff] [blame] | 137 | .............................................................................. | 
|  | 138 |  | 
|  | 139 | For example, to get the status information of a process, all you have to do is | 
|  | 140 | read the file /proc/PID/status: | 
|  | 141 |  | 
|  | 142 | >cat /proc/self/status | 
|  | 143 | Name:   cat | 
|  | 144 | State:  R (running) | 
|  | 145 | Pid:    5452 | 
|  | 146 | PPid:   743 | 
|  | 147 | TracerPid:      0						(2.4) | 
|  | 148 | Uid:    501     501     501     501 | 
|  | 149 | Gid:    100     100     100     100 | 
|  | 150 | Groups: 100 14 16 | 
|  | 151 | VmSize:     1112 kB | 
|  | 152 | VmLck:         0 kB | 
|  | 153 | VmRSS:       348 kB | 
|  | 154 | VmData:       24 kB | 
|  | 155 | VmStk:        12 kB | 
|  | 156 | VmExe:         8 kB | 
|  | 157 | VmLib:      1044 kB | 
|  | 158 | SigPnd: 0000000000000000 | 
|  | 159 | SigBlk: 0000000000000000 | 
|  | 160 | SigIgn: 0000000000000000 | 
|  | 161 | SigCgt: 0000000000000000 | 
|  | 162 | CapInh: 00000000fffffeff | 
|  | 163 | CapPrm: 0000000000000000 | 
|  | 164 | CapEff: 0000000000000000 | 
|  | 165 |  | 
|  | 166 |  | 
|  | 167 | This shows you nearly the same information you would get if you viewed it with | 
|  | 168 | the ps  command.  In  fact,  ps  uses  the  proc  file  system  to  obtain its | 
|  | 169 | information. The  statm  file  contains  more  detailed  information about the | 
|  | 170 | process memory usage. Its seven fields are explained in Table 1-2. | 
|  | 171 |  | 
|  | 172 |  | 
|  | 173 | Table 1-2: Contents of the statm files (as of 2.6.8-rc3) | 
|  | 174 | .............................................................................. | 
|  | 175 | Field    Content | 
|  | 176 | size     total program size (pages)		(same as VmSize in status) | 
|  | 177 | resident size of memory portions (pages)	(same as VmRSS in status) | 
|  | 178 | shared   number of pages that are shared	(i.e. backed by a file) | 
|  | 179 | trs      number of pages that are 'code'	(not including libs; broken, | 
|  | 180 | includes data segment) | 
|  | 181 | lrs      number of pages of library		(always 0 on 2.6) | 
|  | 182 | drs      number of pages of data/stack		(including libs; broken, | 
|  | 183 | includes library text) | 
|  | 184 | dt       number of dirty pages			(always 0 on 2.6) | 
|  | 185 | .............................................................................. | 
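
The page counts in statm can be converted to kilobytes for comparison with the
Vm* lines in status. The following Python sketch assumes a 4 kB page size
(typical for x86, but an assumption here). The function name parse_statm and
the sample line are made up for illustration; the sample was chosen so that
size and resident agree with the VmSize and VmRSS values shown earlier.

```python
from typing import Dict

# Field names in the order listed in Table 1-2.
STATM_FIELDS = ("size", "resident", "shared", "trs", "lrs", "drs", "dt")


def parse_statm(text: str, page_size_kb: int = 4) -> Dict[str, int]:
    """Convert one statm line (counts in pages) into sizes in kB.

    page_size_kb is an assumption (4 kB pages); pass the real value
    for the machine being inspected.
    """
    values = [int(v) for v in text.split()]
    if len(values) != len(STATM_FIELDS):
        raise ValueError("expected %d fields, got %d"
                         % (len(STATM_FIELDS), len(values)))
    return {name: pages * page_size_kb
            for name, pages in zip(STATM_FIELDS, values)}


if __name__ == "__main__":
    # Hypothetical statm contents, not captured from a real system.
    sample = "278 87 70 2 0 26 0"
    for name, kb in parse_statm(sample).items():
        print("%-9s %6d kB" % (name, kb))
```

On a live system the page size could be queried with
os.sysconf("SC_PAGE_SIZE") instead of being assumed, and the text would come
from reading /proc/PID/statm.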
|  | 186 |  | 
|  | 187 | 1.2 Kernel data | 
|  | 188 | --------------- | 
|  | 189 |  | 
|  | 190 | Similar to  the  process entries, the kernel data files give information about | 
|  | 191 | the running kernel. The files used to obtain this information are contained in | 
|  | 192 | /proc and  are  listed  in Table 1-3. Not all of these will be present in your | 
|  | 193 | system. Which files are present depends on the kernel configuration and the | 
|  | 194 | loaded modules. | 
|  | 195 |  | 
|  | 196 | Table 1-3: Kernel info in /proc | 
|  | 197 | .............................................................................. | 
|  | 198 | File        Content | 
|  | 199 | apm         Advanced power management info | 
|  | 200 | buddyinfo   Kernel memory allocator information (see text)	(2.5) | 
|  | 201 | bus         Directory containing bus specific information | 
|  | 202 | cmdline     Kernel command line | 
|  | 203 | cpuinfo     Info about the CPU | 
|  | 204 | devices     Available devices (block and character) | 
|  | 205 | dma         Used DMA channels | 
|  | 206 | filesystems Supported filesystems | 
|  | 207 | driver	     Various drivers grouped here, currently rtc (2.4) | 
|  | 208 | execdomains Execdomains, related to security			(2.4) | 
|  | 209 | fb	     Frame Buffer devices				(2.4) | 
|  | 210 | fs	     File system parameters, currently nfs/exports	(2.4) | 
|  | 211 | ide         Directory containing info about the IDE subsystem | 
|  | 212 | interrupts  Interrupt usage | 
|  | 213 | iomem	     Memory map						(2.4) | 
|  | 214 | ioports     I/O port usage | 
|  | 215 | irq	     Masks for irq to cpu affinity			(2.4)(smp?) | 
|  | 216 | isapnp	     ISA PnP (Plug&Play) Info				(2.4) | 
|  | 217 | kcore       Kernel core image (can be ELF or A.OUT(deprecated in 2.4)) | 
|  | 218 | kmsg        Kernel messages | 
|  | 219 | ksyms       Kernel symbol table | 
|  | 220 | loadavg     Load average of last 1, 5 & 15 minutes | 
|  | 221 | locks       Kernel locks | 
|  | 222 | meminfo     Memory info | 
|  | 223 | misc        Miscellaneous | 
|  | 224 | modules     List of loaded modules | 
|  | 225 | mounts      Mounted filesystems | 
|  | 226 | net         Networking info (see text) | 
|  | 227 | partitions  Table of partitions known to the system | 
|  | 228 | pci	     Deprecated info of PCI bus (new way -> /proc/bus/pci/, | 
|  | 229 | decoupled by lspci)					(2.4) | 
|  | 230 | rtc         Real time clock | 
|  | 231 | scsi        SCSI info (see text) | 
|  | 232 | slabinfo    Slab pool info | 
|  | 233 | stat        Overall statistics | 
|  | 234 | swaps       Swap space utilization | 
|  | 235 | sys         See chapter 2 | 
|  | 236 | sysvipc     Info of SysVIPC Resources (msg, sem, shm)		(2.4) | 
|  | 237 | tty	     Info of tty drivers | 
|  | 238 | uptime      System uptime | 
|  | 239 | version     Kernel version | 
|  | 240 | video	     bttv info of video resources			(2.4) | 
|  | 241 | .............................................................................. | 
|  | 242 |  | 
|  | 243 | You can,  for  example,  check  which interrupts are currently in use and what | 
|  | 244 | they are used for by looking in the file /proc/interrupts: | 
|  | 245 |  | 
|  | 246 | > cat /proc/interrupts | 
|  | 247 | CPU0 | 
|  | 248 | 0:    8728810          XT-PIC  timer | 
|  | 249 | 1:        895          XT-PIC  keyboard | 
|  | 250 | 2:          0          XT-PIC  cascade | 
|  | 251 | 3:     531695          XT-PIC  aha152x | 
|  | 252 | 4:    2014133          XT-PIC  serial | 
|  | 253 | 5:      44401          XT-PIC  pcnet_cs | 
|  | 254 | 8:          2          XT-PIC  rtc | 
|  | 255 | 11:          8          XT-PIC  i82365 | 
|  | 256 | 12:     182918          XT-PIC  PS/2 Mouse | 
|  | 257 | 13:          1          XT-PIC  fpu | 
|  | 258 | 14:    1232265          XT-PIC  ide0 | 
|  | 259 | 15:          7          XT-PIC  ide1 | 
|  | 260 | NMI:          0 | 
|  | 261 |  | 
|  | 262 | In 2.4.* a couple of lines were added to this file, LOC and ERR (this time it | 
|  | 263 | is the output of an SMP machine): | 
|  | 264 |  | 
|  | 265 | > cat /proc/interrupts | 
|  | 266 |  | 
|  | 267 | CPU0       CPU1 | 
|  | 268 | 0:    1243498    1214548    IO-APIC-edge  timer | 
|  | 269 | 1:       8949       8958    IO-APIC-edge  keyboard | 
|  | 270 | 2:          0          0          XT-PIC  cascade | 
|  | 271 | 5:      11286      10161    IO-APIC-edge  soundblaster | 
|  | 272 | 8:          1          0    IO-APIC-edge  rtc | 
|  | 273 | 9:      27422      27407    IO-APIC-edge  3c503 | 
|  | 274 | 12:     113645     113873    IO-APIC-edge  PS/2 Mouse | 
|  | 275 | 13:          0          0          XT-PIC  fpu | 
|  | 276 | 14:      22491      24012    IO-APIC-edge  ide0 | 
|  | 277 | 15:       2183       2415    IO-APIC-edge  ide1 | 
|  | 278 | 17:      30564      30414   IO-APIC-level  eth0 | 
|  | 279 | 18:        177        164   IO-APIC-level  bttv | 
|  | 280 | NMI:    2457961    2457959 | 
|  | 281 | LOC:    2457882    2457881 | 
|  | 282 | ERR:       2155 | 
|  | 283 |  | 
|  | 284 | NMI is incremented in this case because every timer interrupt generates an NMI | 
|  | 285 | (Non Maskable Interrupt) which is used by the NMI Watchdog to detect lockups. | 
|  | 286 |  | 
|  | 287 | LOC is the local interrupt counter of the internal APIC of every CPU. | 
|  | 288 |  | 
|  | 289 | ERR is incremented in the case of errors in the IO-APIC bus (the bus that | 
|  | 290 | connects the CPUs in an SMP system). This means that an error has been detected | 
|  | 291 | and the IO-APIC automatically retries the transmission, so it should not be a | 
|  | 292 | big problem, but you should read the SMP-FAQ. | 
|  | 293 |  | 
|  | 294 | In this context it could be interesting to note the new irq directory in 2.4. | 
|  | 295 | It can be used to set IRQ-to-CPU affinity. This means that you can "hook" an | 
|  | 296 | IRQ to only one CPU, or exclude a CPU from handling IRQs. The irq subdirectory | 
|  | 297 | contains one subdirectory for each IRQ and one file, prof_cpu_mask. | 
|  | 298 |  | 
|  | 299 | For example | 
|  | 300 | > ls /proc/irq/ | 
|  | 301 | 0  10  12  14  16  18  2  4  6  8  prof_cpu_mask | 
|  | 302 | 1  11  13  15  17  19  3  5  7  9 | 
|  | 303 | > ls /proc/irq/0/ | 
|  | 304 | smp_affinity | 
|  | 305 |  | 
|  | 306 | The contents of the prof_cpu_mask file and of each IRQ's smp_affinity file | 
|  | 307 | are the same by default: | 
|  | 308 |  | 
|  | 309 | > cat /proc/irq/0/smp_affinity | 
|  | 310 | ffffffff | 
|  | 311 |  | 
|  | 312 | It's a bitmask in which you can specify which CPUs can handle the IRQ. You can | 
|  | 313 | set it by doing: | 
|  | 314 |  | 
|  | 315 | > echo 1 > /proc/irq/0/smp_affinity | 
|  | 316 |  | 
|  | 317 | This means that only the first CPU will handle the IRQ, but you can also echo 5 | 
|  | 318 | which means that only the first and third CPU can handle the IRQ. | 
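
The bitmask arithmetic can be checked with a short Python sketch. Bit n of
the mask stands for CPU n, counting from 0, so the value 5 (binary 101)
selects CPU 0 and CPU 2. The helper names cpus_in_mask and mask_for_cpus are
made up for illustration; stripping commas is an assumption made for kernels
that print the mask in comma-separated groups.

```python
from typing import Iterable, List


def cpus_in_mask(mask_hex: str) -> List[int]:
    """Return the CPU numbers whose bits are set in a hex affinity mask."""
    # Assumption: some kernels separate 32-bit groups with commas.
    mask = int(mask_hex.strip().replace(",", ""), 16)
    return [cpu for cpu in range(mask.bit_length()) if mask & (1 << cpu)]


def mask_for_cpus(cpus: Iterable[int]) -> str:
    """Build the hex string to write for a given set of CPU numbers."""
    mask = 0
    for cpu in cpus:
        mask |= 1 << cpu
    return format(mask, "x")


if __name__ == "__main__":
    print(cpus_in_mask("1"))       # first CPU only
    print(cpus_in_mask("5"))       # first and third CPU
    print(mask_for_cpus([0, 2]))   # the mask for that same pair
```

The string returned by mask_for_cpus is what would be echoed into an
smp_affinity file.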
|  | 319 |  | 
|  | 320 | The way IRQs are routed is handled by the IO-APIC, and it's Round Robin | 
|  | 321 | between all the CPUs which are allowed to handle it. As usual the kernel has | 
|  | 322 | more info than you and does a better job than you, so the defaults are the | 
|  | 323 | best choice for almost everyone. | 
|  | 324 |  | 
|  | 325 | There are  three  more  important subdirectories in /proc: net, scsi, and sys. | 
|  | 326 | The general  rule  is  that  the  contents,  or  even  the  existence of these | 
|  | 327 | directories, depend  on your kernel configuration. If SCSI is not enabled, the | 
|  | 328 | directory scsi  may  not  exist. The same is true of the net directory, which is there | 
|  | 329 | only when networking support is present in the running kernel. | 
|  | 330 |  | 
|  | 331 | The slabinfo  file  gives  information  about  memory usage at the slab level. | 
|  | 332 | Linux uses  slab  pools for memory management above page level in version 2.2. | 
|  | 333 | Commonly used  objects  have  their  own  slab  pool (such as network buffers, | 
|  | 334 | directory cache, and so on). | 
|  | 335 |  | 
|  | 336 | .............................................................................. | 
|  | 337 |  | 
|  | 338 | > cat /proc/buddyinfo | 
|  | 339 |  | 
|  | 340 | Node 0, zone      DMA      0      4      5      4      4      3 ... | 
|  | 341 | Node 0, zone   Normal      1      0      0      1    101      8 ... | 
|  | 342 | Node 0, zone  HighMem      2      0      0      1      1      0 ... | 
|  | 343 |  | 
|  | 344 | Memory fragmentation is a problem under some workloads, and buddyinfo is a | 
|  | 345 | useful tool for helping diagnose these problems.  Buddyinfo will give you a | 
|  | 346 | clue as to how big an area you can safely allocate, or why a previous | 
|  | 347 | allocation failed. | 
|  | 348 |  | 
|  | 349 | Each column represents the number of free blocks of a certain order (a block | 
|  | 350 | of order n is 2^n contiguous pages).  In this case, there are 0 chunks of | 
|  | 351 | 2^0*PAGE_SIZE available in ZONE_DMA, 4 chunks of 2^1*PAGE_SIZE in ZONE_DMA, | 
|  | 352 | 101 chunks of 2^4*PAGE_SIZE available in ZONE_NORMAL, etc... | 
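
The per-order counts can be summed into a total of free pages per zone. The
Python sketch below does that for a single buddyinfo line. It uses only the
six columns visible in the truncated sample above, so the totals it prints
are lower bounds for that machine. The function name and the returned keys
are made up for illustration.

```python
from typing import Dict, Union


def summarize_buddyinfo_line(line: str) -> Dict[str, Union[int, str]]:
    """Summarize one buddyinfo line such as
    'Node 0, zone   Normal      1      0      0      1    101      8'.
    """
    head, _, rest = line.partition("zone")
    node = int(head.replace("Node", "").replace(",", "").strip())
    fields = rest.split()
    zone = fields[0]
    counts = [int(c) for c in fields[1:]]
    # A free block of order n covers 2^n contiguous pages.
    free_pages = sum(count << order for order, count in enumerate(counts))
    # Highest order that still has at least one free block (-1 if none).
    largest = max((o for o, c in enumerate(counts) if c > 0), default=-1)
    return {"node": node, "zone": zone,
            "free_pages": free_pages, "largest_order": largest}


if __name__ == "__main__":
    sample = [
        "Node 0, zone      DMA      0      4      5      4      4      3",
        "Node 0, zone   Normal      1      0      0      1    101      8",
    ]
    for entry in sample:
        print(summarize_buddyinfo_line(entry))
```

Multiplying free_pages by the page size gives the free memory in bytes; the
largest_order value hints at the biggest contiguous allocation that could
currently succeed in that zone.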
|  | 353 |  | 
|  | 354 | .............................................................................. | 
|  | 355 |  | 
|  | 356 | meminfo: | 
|  | 357 |  | 
|  | 358 | Provides information about distribution and utilization of memory.  This | 
|  | 359 | varies by architecture and compile options.  The following is from a | 
|  | 360 | 16GB PIII, which has highmem enabled.  You may not have all of these fields. | 
|  | 361 |  | 
|  | 362 | > cat /proc/meminfo | 
|  | 363 |  | 
|  | 364 |  | 
|  | 365 | MemTotal:     16344972 kB | 
|  | 366 | MemFree:      13634064 kB | 
|  | 367 | Buffers:          3656 kB | 
|  | 368 | Cached:        1195708 kB | 
|  | 369 | SwapCached:          0 kB | 
|  | 370 | Active:         891636 kB | 
|  | 371 | Inactive:      1077224 kB | 
|  | 372 | HighTotal:    15597528 kB | 
|  | 373 | HighFree:     13629632 kB | 
|  | 374 | LowTotal:       747444 kB | 
|  | 375 | LowFree:          4432 kB | 
|  | 376 | SwapTotal:           0 kB | 
|  | 377 | SwapFree:            0 kB | 
|  | 378 | Dirty:             968 kB | 
|  | 379 | Writeback:           0 kB | 
|  | 380 | Mapped:         280372 kB | 
|  | 381 | Slab:           684068 kB | 
|  | 382 | CommitLimit:   7669796 kB | 
|  | 383 | Committed_AS:   100056 kB | 
|  | 384 | PageTables:      24448 kB | 
|  | 385 | VmallocTotal:   112216 kB | 
|  | 386 | VmallocUsed:       428 kB | 
|  | 387 | VmallocChunk:   111088 kB | 
|  | 388 |  | 
|  | 389 | MemTotal: Total usable ram (i.e. physical ram minus a few reserved | 
|  | 390 | bits and the kernel binary code) | 
|  | 391 | MemFree: The sum of LowFree+HighFree | 
|  | 392 | Buffers: Relatively temporary storage for raw disk blocks; | 
|  | 393 | shouldn't get tremendously large (20MB or so) | 
|  | 394 | Cached: in-memory cache for files read from the disk (the | 
|  | 395 | pagecache).  Doesn't include SwapCached | 
|  | 396 | SwapCached: Memory that once was swapped out, is swapped back in but | 
|  | 397 | still also is in the swapfile (if memory is needed it | 
|  | 398 | doesn't need to be swapped out AGAIN because it is already | 
|  | 399 | in the swapfile. This saves I/O) | 
|  | 400 | Active: Memory that has been used more recently and usually not | 
|  | 401 | reclaimed unless absolutely necessary. | 
|  | 402 | Inactive: Memory which has been less recently used.  It is more | 
|  | 403 | eligible to be reclaimed for other purposes | 
|  | 404 | HighTotal: | 
|  | 405 | HighFree: Highmem is all memory above ~860MB of physical memory. | 
|  | 406 | Highmem areas are for use by userspace programs, or | 
|  | 407 | for the pagecache.  The kernel must use tricks to access | 
|  | 408 | this memory, making it slower to access than lowmem. | 
|  | 409 | LowTotal: | 
|  | 410 | LowFree: Lowmem is memory which can be used for everything that | 
|  | 411 | highmem can be used for, but it is also available for the | 
|  | 412 | kernel's use for its own data structures.  Among many | 
|  | 413 | other things, it is where everything from the Slab is | 
|  | 414 | allocated.  Bad things happen when you're out of lowmem. | 
|  | 415 | SwapTotal: total amount of swap space available | 
|  | 416 | SwapFree: Amount of swap space that is currently unused | 
|  | 417 | (memory evicted from RAM and temporarily on the disk occupies the rest) | 
|  | 418 | Dirty: Memory which is waiting to get written back to the disk | 
|  | 419 | Writeback: Memory which is actively being written back to the disk | 
|  | 420 | Mapped: files which have been mmaped, such as libraries | 
|  | 421 | Slab: in-kernel data structures cache | 
|  | 422 | CommitLimit: Based on the overcommit ratio ('vm.overcommit_ratio'), | 
|  | 423 | this is the total amount of  memory currently available to | 
|  | 424 | be allocated on the system. This limit is only adhered to | 
|  | 425 | if strict overcommit accounting is enabled (mode 2 in | 
|  | 426 | 'vm.overcommit_memory'). | 
|  | 427 | The CommitLimit is calculated with the following formula: | 
|  | 428 | CommitLimit = ('vm.overcommit_ratio' * Physical RAM) + Swap | 
|  | 429 | For example, on a system with 1G of physical RAM and 7G | 
|  | 430 | of swap with a `vm.overcommit_ratio` of 30 it would | 
|  | 431 | yield a CommitLimit of 7.3G. | 
|  | 432 | For more details, see the memory overcommit documentation | 
|  | 433 | in vm/overcommit-accounting. | 
|  | 434 | Committed_AS: The amount of memory presently allocated on the system. | 
|  | 435 | The committed memory is a sum of all of the memory which | 
|  | 436 | has been allocated by processes, even if it has not been | 
|  | 437 | "used" by them as of yet. A process which malloc()'s 1G | 
|  | 438 | of memory, but only touches 300M of it will only show up | 
|  | 439 | as using 300M of memory even if it has the address space | 
|  | 440 | allocated for the entire 1G. This 1G is memory which has | 
|  | 441 | been "committed" to by the VM and can be used at any time | 
|  | 442 | by the allocating application. With strict overcommit | 
|  | 443 | enabled on the system (mode 2 in 'vm.overcommit_memory'), | 
|  | 444 | allocations which would exceed the CommitLimit (detailed | 
|  | 445 | above) will not be permitted. This is useful if one needs | 
|  | 446 | to guarantee that processes will not fail due to lack of | 
|  | 447 | memory once that memory has been successfully allocated. | 
|  | 448 | PageTables: amount of memory dedicated to the lowest level of page | 
|  | 449 | tables. | 
|  | 450 | VmallocTotal: total size of vmalloc memory area | 
|  | 451 | VmallocUsed: amount of vmalloc area which is used | 
|  | 452 | VmallocChunk: largest contiguous block of vmalloc area which is free | 
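
The CommitLimit formula given above can be checked numerically. The Python
sketch below treats vm.overcommit_ratio as a percentage, which is what the
worked example (a ratio of 30 yielding 0.3G out of 1G of RAM) implies. The
function name commit_limit_kb is made up for illustration, and the code
implements only the formula as written in this document.

```python
def commit_limit_kb(ram_kb: int, swap_kb: int, overcommit_ratio: int) -> int:
    """CommitLimit = (overcommit_ratio percent of physical RAM) + swap."""
    return ram_kb * overcommit_ratio // 100 + swap_kb


if __name__ == "__main__":
    GIB_KB = 1024 * 1024            # kB in one gigabyte (binary)
    limit = commit_limit_kb(ram_kb=1 * GIB_KB,
                            swap_kb=7 * GIB_KB,
                            overcommit_ratio=30)
    print("CommitLimit: %d kB (about %.1fG)" % (limit, limit / GIB_KB))
```

With 1G of RAM, 7G of swap and a ratio of 30, this reproduces the 7.3G figure
from the description of CommitLimit.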
|  | 453 |  | 
|  | 454 |  | 
|  | 455 | 1.3 IDE devices in /proc/ide | 
|  | 456 | ---------------------------- | 
|  | 457 |  | 
|  | 458 | The subdirectory /proc/ide contains information about all IDE devices of which | 
|  | 459 | the kernel  is  aware.  There is one subdirectory for each IDE controller, the | 
|  | 460 | file drivers  and a link for each IDE device, pointing to the device directory | 
|  | 461 | in the controller specific subtree. | 
|  | 462 |  | 
|  | 463 | The file  drivers  contains general information about the drivers used for the | 
|  | 464 | IDE devices: | 
|  | 465 |  | 
|  | 466 | > cat /proc/ide/drivers | 
|  | 467 | ide-cdrom version 4.53 | 
|  | 468 | ide-disk version 1.08 | 
|  | 469 |  | 
|  | 470 | More detailed  information  can  be  found  in  the  controller  specific | 
|  | 471 | subdirectories. These  are  named  ide0,  ide1  and  so  on.  Each  of  these | 
|  | 472 | directories contains the files shown in Table 1-4. | 
|  | 473 |  | 
|  | 474 |  | 
|  | 475 | Table 1-4: IDE controller info in  /proc/ide/ide? | 
|  | 476 | .............................................................................. | 
|  | 477 | File    Content | 
|  | 478 | channel IDE channel (0 or 1) | 
|  | 479 | config  Configuration (only for PCI/IDE bridge) | 
|  | 480 | mate    Mate name | 
|  | 481 | model   Type/Chipset of IDE controller | 
|  | 482 | .............................................................................. | 
|  | 483 |  | 
|  | 484 | Each device  connected  to  a  controller  has  a separate subdirectory in the | 
|  | 485 | controller's directory.  The  files  listed in Table 1-5 are contained in these | 
|  | 486 | directories. | 
|  | 487 |  | 
|  | 488 |  | 
|  | 489 | Table 1-5: IDE device information | 
|  | 490 | .............................................................................. | 
|  | 491 | File             Content | 
|  | 492 | cache            The cache | 
|  | 493 | capacity         Capacity of the medium (in 512-byte blocks) | 
|  | 494 | driver           driver and version | 
|  | 495 | geometry         physical and logical geometry | 
|  | 496 | identify         device identify block | 
|  | 497 | media            media type | 
|  | 498 | model            device identifier | 
|  | 499 | settings         device setup | 
|  | 500 | smart_thresholds IDE disk management thresholds | 
|  | 501 | smart_values     IDE disk management values | 
|  | 502 | .............................................................................. | 
|  | 503 |  | 
|  | 504 | The most  interesting  file is settings. This file contains a nice overview of | 
|  | 505 | the drive parameters: | 
|  | 506 |  | 
|  | 507 | # cat /proc/ide/ide0/hda/settings | 
|  | 508 | name                    value           min             max             mode | 
|  | 509 | ----                    -----           ---             ---             ---- | 
|  | 510 | bios_cyl                526             0               65535           rw | 
|  | 511 | bios_head               255             0               255             rw | 
|  | 512 | bios_sect               63              0               63              rw | 
|  | 513 | breada_readahead        4               0               127             rw | 
|  | 514 | bswap                   0               0               1               r | 
|  | 515 | file_readahead          72              0               2097151         rw | 
|  | 516 | io_32bit                0               0               3               rw | 
|  | 517 | keepsettings            0               0               1               rw | 
|  | 518 | max_kb_per_request      122             1               127             rw | 
|  | 519 | multcount               0               0               8               rw | 
|  | 520 | nice1                   1               0               1               rw | 
|  | 521 | nowerr                  0               0               1               rw | 
|  | 522 | pio_mode                write-only      0               255             w | 
|  | 523 | slow                    0               0               1               rw | 
|  | 524 | unmaskirq               0               0               1               rw | 
|  | 525 | using_dma               0               0               1               rw | 
|  | 526 |  | 
|  | 527 |  | 
|  | 528 | 1.4 Networking info in /proc/net | 
|  | 529 | -------------------------------- | 
|  | 530 |  | 
|  | 531 | The subdirectory  /proc/net  follows  the  usual  pattern. Table 1-6 shows the | 
|  | 532 | additional values  you  get  for  IP  version 6 if you configure the kernel to | 
|  | 533 | support this. Table 1-7 lists the files and their meaning. | 
|  | 534 |  | 
|  | 535 |  | 
|  | 536 | Table 1-6: IPv6 info in /proc/net | 
|  | 537 | .............................................................................. | 
|  | 538 | File       Content | 
|  | 539 | udp6       UDP sockets (IPv6) | 
|  | 540 | tcp6       TCP sockets (IPv6) | 
|  | 541 | raw6       Raw device statistics (IPv6) | 
|  | 542 | igmp6      IP multicast addresses, which this host joined (IPv6) | 
|  | 543 | if_inet6   List of IPv6 interface addresses | 
|  | 544 | ipv6_route Kernel routing table for IPv6 | 
|  | 545 | rt6_stats  Global IPv6 routing tables statistics | 
|  | 546 | sockstat6  Socket statistics (IPv6) | 
|  | 547 | snmp6      Snmp data (IPv6) | 
|  | 548 | .............................................................................. | 
|  | 549 |  | 
|  | 550 |  | 
|  | 551 | Table 1-7: Network info in /proc/net | 
|  | 552 | .............................................................................. | 
|  | 553 | File          Content | 
|  | 554 | arp           Kernel  ARP table | 
|  | 555 | dev           network devices with statistics | 
|  | 556 | dev_mcast     the Layer2 multicast groups a device is listening to | 
|  | 557 | (interface index, label, number of references, number of bound | 
|  | 558 | addresses). | 
|  | 559 | dev_stat      network device status | 
|  | 560 | ip_fwchains   Firewall chain linkage | 
|  | 561 | ip_fwnames    Firewall chain names | 
|  | 562 | ip_masq       Directory containing the masquerading tables | 
|  | 563 | ip_masquerade Major masquerading table | 
|  | 564 | netstat       Network statistics | 
|  | 565 | raw           raw device statistics | 
|  | 566 | route         Kernel routing table | 
|  | 567 | rpc           Directory containing rpc info | 
|  | 568 | rt_cache      Routing cache | 
|  | 569 | snmp          SNMP data | 
|  | 570 | sockstat      Socket statistics | 
|  | 571 | tcp           TCP  sockets | 
|  | 572 | tr_rif        Token ring RIF routing table | 
|  | 573 | udp           UDP sockets | 
|  | 574 | unix          UNIX domain sockets | 
|  | 575 | wireless      Wireless interface data (Wavelan etc) | 
|  | 576 | igmp          IP multicast addresses, which this host joined | 
|  | 577 | psched        Global packet scheduler parameters. | 
|  | 578 | netlink       List of PF_NETLINK sockets | 
|  | 579 | ip_mr_vifs    List of multicast virtual interfaces | 
|  | 580 | ip_mr_cache   List of multicast routing cache | 
|  | 581 | .............................................................................. | 
|  | 582 |  | 
|  | 583 | You can  use  this  information  to see which network devices are available in | 
|  | 584 | your system and how much traffic was routed over those devices: | 
|  | 585 |  | 
|  | 586 | > cat /proc/net/dev | 
|  | 587 | Inter-|Receive                                                   |[... | 
|  | 588 | face |bytes    packets errs drop fifo frame compressed multicast|[... | 
|  | 589 | lo:  908188   5596     0    0    0     0          0         0 [... | 
|  | 590 | ppp0:15475140  20721   410    0    0   410          0         0 [... | 
|  | 591 | eth0:  614530   7085     0    0    0     0          0         1 [... | 
|  | 592 |  | 
|  | 593 | ...] Transmit | 
|  | 594 | ...] bytes    packets errs drop fifo colls carrier compressed | 
|  | 595 | ...]  908188     5596    0    0    0     0       0          0 | 
|  | 596 | ...] 1375103    17405    0    0    0     0       0          0 | 
|  | 597 | ...] 1703981     5535    0    0    0     3       0          0 | 
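
On a running system each interface occupies a single line in /proc/net/dev;
the listing above is split in two only to fit the page. The following is a
minimal sketch, assuming that one-line layout with eight receive counters
followed by eight transmit counters. The helper name net_dev_bytes is
hypothetical and not part of any existing tool.

```shell
# Print "<interface> <rx bytes> <tx bytes>" for each device.
# Reads /proc/net/dev formatted text on standard input.
net_dev_bytes() {
    # Lines 1-2 are headers. The colon may touch the first counter
    # (as in "ppp0:15475140"), so turn it into a space before splitting.
    awk 'NR > 2 { sub(/:/, " "); print $1, $2, $10 }'
}

# Usage: net_dev_bytes < /proc/net/dev
```

Reading from standard input keeps the helper usable on saved copies of the
file as well as on the live one.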
|  | 598 |  | 
|  | 599 | In addition, each Channel Bond interface has its own directory.  For | 
|  | 600 | example, the bond0 device will have a directory called /proc/net/bond0/. | 
|  | 601 | It will contain information that is specific to that bond, such as the | 
|  | 602 | current slaves of the bond, the link status of the slaves, and how | 
|  | 603 | many times each slave's link has failed. | 
|  | 604 |  | 
|  | 605 | 1.5 SCSI info | 
|  | 606 | ------------- | 
|  | 607 |  | 
|  | 608 | If you  have  a  SCSI  host adapter in your system, you'll find a subdirectory | 
|  | 609 | named after  the driver for this adapter in /proc/scsi. You'll also see a list | 
|  | 610 | of all recognized SCSI devices in /proc/scsi/scsi: | 
|  | 611 |  | 
|  | 612 | > cat /proc/scsi/scsi | 
|  | 613 | Attached devices: | 
|  | 614 | Host: scsi0 Channel: 00 Id: 00 Lun: 00 | 
|  | 615 | Vendor: IBM      Model: DGHS09U          Rev: 03E0 | 
|  | 616 | Type:   Direct-Access                    ANSI SCSI revision: 03 | 
|  | 617 | Host: scsi0 Channel: 00 Id: 06 Lun: 00 | 
|  | 618 | Vendor: PIONEER  Model: CD-ROM DR-U06S   Rev: 1.04 | 
|  | 619 | Type:   CD-ROM                           ANSI SCSI revision: 02 | 
|  | 620 |  | 
|  | 621 |  | 
|  | 622 | The directory  named  after  the driver has one file for each adapter found in | 
|  | 623 | the system.  These  files  contain information about the controller, including | 
|  | 624 | the used  IRQ  and  the  IO  address range. The amount of information shown is | 
|  | 625 | dependent on  the adapter you use. The example shows the output for an Adaptec | 
|  | 626 | AHA-2940 SCSI adapter: | 
|  | 627 |  | 
|  | 628 | > cat /proc/scsi/aic7xxx/0 | 
|  | 629 |  | 
|  | 630 | Adaptec AIC7xxx driver version: 5.1.19/3.2.4 | 
|  | 631 | Compile Options: | 
|  | 632 | TCQ Enabled By Default : Disabled | 
|  | 633 | AIC7XXX_PROC_STATS     : Disabled | 
|  | 634 | AIC7XXX_RESET_DELAY    : 5 | 
|  | 635 | Adapter Configuration: | 
|  | 636 | SCSI Adapter: Adaptec AHA-294X Ultra SCSI host adapter | 
|  | 637 | Ultra Wide Controller | 
|  | 638 | PCI MMAPed I/O Base: 0xeb001000 | 
|  | 639 | Adapter SEEPROM Config: SEEPROM found and used. | 
|  | 640 | Adaptec SCSI BIOS: Enabled | 
|  | 641 | IRQ: 10 | 
|  | 642 | SCBs: Active 0, Max Active 2, | 
|  | 643 | Allocated 15, HW 16, Page 255 | 
|  | 644 | Interrupts: 160328 | 
|  | 645 | BIOS Control Word: 0x18b6 | 
|  | 646 | Adapter Control Word: 0x005b | 
|  | 647 | Extended Translation: Enabled | 
|  | 648 | Disconnect Enable Flags: 0xffff | 
|  | 649 | Ultra Enable Flags: 0x0001 | 
|  | 650 | Tag Queue Enable Flags: 0x0000 | 
|  | 651 | Ordered Queue Tag Flags: 0x0000 | 
|  | 652 | Default Tag Queue Depth: 8 | 
|  | 653 | Tagged Queue By Device array for aic7xxx host instance 0: | 
|  | 654 | {255,255,255,255,255,255,255,255,255,255,255,255,255,255,255,255} | 
|  | 655 | Actual queue depth per device for aic7xxx host instance 0: | 
|  | 656 | {1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1} | 
|  | 657 | Statistics: | 
|  | 658 | (scsi0:0:0:0) | 
|  | 659 | Device using Wide/Sync transfers at 40.0 MByte/sec, offset 8 | 
|  | 660 | Transinfo settings: current(12/8/1/0), goal(12/8/1/0), user(12/15/1/0) | 
|  | 661 | Total transfers 160151 (74577 reads and 85574 writes) | 
|  | 662 | (scsi0:0:6:0) | 
|  | 663 | Device using Narrow/Sync transfers at 5.0 MByte/sec, offset 15 | 
|  | 664 | Transinfo settings: current(50/15/0/0), goal(50/15/0/0), user(50/15/0/0) | 
|  | 665 | Total transfers 0 (0 reads and 0 writes) | 
|  | 666 |  | 
|  | 667 |  | 
|  | 668 | 1.6 Parallel port info in /proc/parport | 
|  | 669 | --------------------------------------- | 
|  | 670 |  | 
|  | 671 | The directory  /proc/parport  contains information about the parallel ports of | 
|  | 672 | your system.  It  has  one  subdirectory  for  each port, named after the port | 
|  | 673 | number (0,1,2,...). | 
|  | 674 |  | 
|  | 675 | These directories contain the four files shown in Table 1-8. | 
|  | 676 |  | 
|  | 677 |  | 
|  | 678 | Table 1-8: Files in /proc/parport | 
|  | 679 | .............................................................................. | 
|  | 680 | File      Content | 
|  | 681 | autoprobe Any IEEE-1284 device ID information that has been acquired. | 
|  | 682 | devices   list of the device drivers using that port. A + will appear by the | 
|  | 683 | name of the device currently using the port (it might not appear | 
|  | 684 | against any). | 
|  | 685 | hardware  Parallel port's base address, IRQ line and DMA channel. | 
|  | 686 | irq       IRQ that parport is using for that port. This is in a separate | 
|  | 687 | file to allow you to alter it by writing a new value in (IRQ | 
|  | 688 | number or none). | 
|  | 689 | .............................................................................. | 
|  | 690 |  | 
|  | 691 | 1.7 TTY info in /proc/tty | 
|  | 692 | ------------------------- | 
|  | 693 |  | 
|  | 694 | Information about  the  available  and actually used tty's can be found in the | 
|  | 695 | directory /proc/tty. You'll find  entries  for drivers and line disciplines in | 
|  | 696 | this directory, as shown in Table 1-9. | 
|  | 697 |  | 
|  | 698 |  | 
|  | 699 | Table 1-9: Files in /proc/tty | 
|  | 700 | .............................................................................. | 
|  | 701 | File          Content | 
|  | 702 | drivers       list of drivers and their usage | 
|  | 703 | ldiscs        registered line disciplines | 
|  | 704 | driver/serial usage statistics and status of single tty lines | 
|  | 705 | .............................................................................. | 
|  | 706 |  | 
|  | 707 | To see  which  tty's  are  currently in use, you can simply look into the file | 
|  | 708 | /proc/tty/drivers: | 
|  | 709 |  | 
|  | 710 | > cat /proc/tty/drivers | 
|  | 711 | pty_slave            /dev/pts      136   0-255 pty:slave | 
|  | 712 | pty_master           /dev/ptm      128   0-255 pty:master | 
|  | 713 | pty_slave            /dev/ttyp       3   0-255 pty:slave | 
|  | 714 | pty_master           /dev/pty        2   0-255 pty:master | 
|  | 715 | serial               /dev/cua        5   64-67 serial:callout | 
|  | 716 | serial               /dev/ttyS       4   64-67 serial | 
|  | 717 | /dev/tty0            /dev/tty0       4       0 system:vtmaster | 
|  | 718 | /dev/ptmx            /dev/ptmx       5       2 system | 
|  | 719 | /dev/console         /dev/console    5       1 system:console | 
|  | 720 | /dev/tty             /dev/tty        5       0 system:/dev/tty | 
|  | 721 | unknown              /dev/tty        4    1-63 console | 
|  | 722 |  | 
|  | 723 |  | 
|  | 724 | 1.8 Miscellaneous kernel statistics in /proc/stat | 
|  | 725 | ------------------------------------------------- | 
|  | 726 |  | 
|  | 727 | Various pieces   of  information about  kernel activity  are  available in the | 
|  | 728 | /proc/stat file.  All  of  the numbers reported  in  this file are  aggregates | 
|  | 729 | since the system first booted.  For a quick look, simply cat the file: | 
|  | 730 |  | 
|  | 731 | > cat /proc/stat | 
|  | 732 | cpu  2255 34 2290 22625563 6290 127 456 | 
|  | 733 | cpu0 1132 34 1441 11311718 3675 127 438 | 
|  | 734 | cpu1 1123 0 849 11313845 2614 0 18 | 
|  | 735 | intr 114930548 113199788 3 0 5 263 0 4 [... lots more numbers ...] | 
|  | 736 | ctxt 1990473 | 
|  | 737 | btime 1062191376 | 
|  | 738 | processes 2915 | 
|  | 739 | procs_running 1 | 
|  | 740 | procs_blocked 0 | 
|  | 741 |  | 
|  | 742 | The very first  "cpu" line aggregates the  numbers in all  of the other "cpuN" | 
|  | 743 | lines.  These numbers identify the amount of time the CPU has spent performing | 
|  | 744 | different kinds of work.  Time units are in USER_HZ (typically hundredths of a | 
|  | 745 | second).  The meanings of the columns are as follows, from left to right: | 
|  | 746 |  | 
|  | 747 | - user: normal processes executing in user mode | 
|  | 748 | - nice: niced processes executing in user mode | 
|  | 749 | - system: processes executing in kernel mode | 
|  | 750 | - idle: twiddling thumbs | 
|  | 751 | - iowait: waiting for I/O to complete | 
|  | 752 | - irq: servicing interrupts | 
|  | 753 | - softirq: servicing softirqs | 
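
The seven columns listed above can be added up to get the total time
accounted for. The following is a minimal sketch, assuming exactly the column
order given in this section; the helper name cpu_total_idle is hypothetical.

```shell
# Print "<total> <idle>" in USER_HZ ticks from the aggregate "cpu" line.
# Reads /proc/stat formatted text on standard input.
cpu_total_idle() {
    awk '$1 == "cpu" {
        total = 0
        for (i = 2; i <= 8; i++) total += $i   # user through softirq
        printf "%d %d\n", total, $5            # $5 is the idle column
    }'
}

# Usage: cpu_total_idle < /proc/stat
```

Because the counters only ever grow, a utilization figure for an interval
comes from sampling twice and comparing the differences, not from a single
reading.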
|  | 754 |  | 
|  | 755 | The "intr" line gives counts of interrupts  serviced since boot time, for each | 
|  | 756 | of the  possible system interrupts.   The first  column  is the  total of  all | 
|  | 757 | interrupts serviced; each  subsequent column is the  total for that particular | 
|  | 758 | interrupt. | 
|  | 759 |  | 
|  | 760 | The "ctxt" line gives the total number of context switches across all CPUs. | 
|  | 761 |  | 
|  | 762 | The "btime" line gives  the time at which the  system booted, in seconds since | 
|  | 763 | the Unix epoch. | 
|  | 764 |  | 
|  | 765 | The "processes" line gives the number  of processes and threads created, which | 
|  | 766 | includes (but  is not limited  to) those  created by  calls to the  fork() and | 
|  | 767 | clone() system calls. | 
|  | 768 |  | 
|  | 769 | The  "procs_running" line gives the  number of processes  currently running on | 
|  | 770 | CPUs. | 
|  | 771 |  | 
|  | 772 | The   "procs_blocked" line gives  the  number of  processes currently blocked, | 
|  | 773 | waiting for I/O to complete. | 
|  | 774 |  | 
|  | 775 |  | 
|  | 776 | ------------------------------------------------------------------------------ | 
|  | 777 | Summary | 
|  | 778 | ------------------------------------------------------------------------------ | 
|  | 779 | The /proc file system serves information about the running system. It not only | 
|  | 780 | allows access to process data but also allows you to request the kernel status | 
|  | 781 | by reading files in the hierarchy. | 
|  | 782 |  | 
|  | 783 | The directory  structure  of /proc reflects the types of information and makes | 
|  | 784 | it easy, if not obvious, to find specific data. | 
|  | 785 | ------------------------------------------------------------------------------ | 
|  | 786 |  | 
|  | 787 | ------------------------------------------------------------------------------ | 
|  | 788 | CHAPTER 2: MODIFYING SYSTEM PARAMETERS | 
|  | 789 | ------------------------------------------------------------------------------ | 
|  | 790 |  | 
|  | 791 | ------------------------------------------------------------------------------ | 
|  | 792 | In This Chapter | 
|  | 793 | ------------------------------------------------------------------------------ | 
|  | 794 | * Modifying kernel parameters by writing into files found in /proc/sys | 
|  | 795 | * Exploring the files which modify certain parameters | 
|  | 796 | * Review of the /proc/sys file tree | 
|  | 797 | ------------------------------------------------------------------------------ | 
|  | 798 |  | 
|  | 799 |  | 
|  | 800 | A very  interesting part of /proc is the directory /proc/sys. This is not only | 
|  | 801 | a source  of  information,  it also allows you to change parameters within the | 
|  | 802 | kernel. Be  very  careful  when attempting this. You can optimize your system, | 
|  | 803 | but you  can  also  cause  it  to  crash.  Never  alter kernel parameters on a | 
|  | 804 | production system.  Set  up  a  development machine and test to make sure that | 
|  | 805 | everything works  the  way  you want it to. You may have no alternative but to | 
|  | 806 | reboot the machine once an error has been made. | 
|  | 807 |  | 
|  | 808 | To change  a  value,  simply  echo  the new value into the file. An example is | 
|  | 809 | given below  in the section on the file system data. You need to be root to do | 
|  | 810 | this. You  can  create  your  own  boot script to perform this every time your | 
|  | 811 | system boots. | 
|  | 812 |  | 
|  | 813 | The files  in /proc/sys can be used to fine tune and monitor miscellaneous and | 
|  | 814 | general things  in  the operation of the Linux kernel. Since some of the files | 
|  | 815 | can inadvertently  disrupt  your  system,  it  is  advisable  to  read  both | 
|  | 816 | documentation and  source  before actually making adjustments. In any case, be | 
|  | 817 | very careful  when  writing  to  any  of these files. The entries in /proc may | 
|  | 818 | change slightly between the 2.1.* and the 2.2 kernel, so if there is any doubt | 
|  | 819 | review the kernel documentation in the directory /usr/src/linux/Documentation. | 
|  | 820 | This chapter  is  heavily  based  on the documentation included in the pre 2.2 | 
|  | 821 | kernels, and became part of it in version 2.2.1 of the Linux kernel. | 
|  | 822 |  | 
|  | 823 | 2.1 /proc/sys/fs - File system data | 
|  | 824 | ----------------------------------- | 
|  | 825 |  | 
|  | 826 | This subdirectory  contains  specific  file system, file handle, inode, dentry | 
|  | 827 | and quota information. | 
|  | 828 |  | 
|  | 829 | Currently, these files are in /proc/sys/fs: | 
|  | 830 |  | 
|  | 831 | dentry-state | 
|  | 832 | ------------ | 
|  | 833 |  | 
|  | 834 | Status of  the  directory  cache.  Since  directory  entries  are  dynamically | 
|  | 835 | allocated and  deallocated,  this  file indicates the current status. It holds | 
|  | 836 | six values, of which the last two are not used and are always zero. The others | 
|  | 837 | are listed in Table 2-1. | 
|  | 838 |  | 
|  | 839 |  | 
|  | 840 | Table 2-1: Status files of the directory cache | 
|  | 841 | .............................................................................. | 
|  | 842 | File       Content | 
|  | 843 | nr_dentry  Almost always zero | 
|  | 844 | nr_unused  Number of unused cache entries | 
|  | 845 | age_limit  Age in seconds after which the entry may be reclaimed, when memory | 
|  | 846 | is short | 
|  | 847 | want_pages Used internally by the kernel | 
|  | 848 | .............................................................................. | 
|  | 849 |  | 
|  | 850 | dquot-nr and dquot-max | 
|  | 851 | ---------------------- | 
|  | 852 |  | 
|  | 853 | The file dquot-max shows the maximum number of cached disk quota entries. | 
|  | 854 |  | 
|  | 855 | The file  dquot-nr  shows  the  number of allocated disk quota entries and the | 
|  | 856 | number of free disk quota entries. | 
|  | 857 |  | 
|  | 858 | If the number of available cached disk quotas is very low and you have a large | 
|  | 859 | number of simultaneous system users, you might want to raise the limit. | 
|  | 860 |  | 
|  | 861 | file-nr and file-max | 
|  | 862 | -------------------- | 
|  | 863 |  | 
|  | 864 | The kernel  allocates file handles dynamically, but doesn't free them again at | 
|  | 865 | this time. | 
|  | 866 |  | 
|  | 867 | The value  in  file-max  denotes  the  maximum number of file handles that the | 
|  | 868 | Linux kernel will allocate. When you get a lot of error messages about running | 
|  | 869 | out of  file handles, you might want to raise this limit. The default value is | 
|  | 870 | 10% of  RAM in kilobytes.  To  change it, just  write the new number  into the | 
|  | 871 | file: | 
|  | 872 |  | 
|  | 873 | # cat /proc/sys/fs/file-max | 
|  | 874 | 4096 | 
|  | 875 | # echo 8192 > /proc/sys/fs/file-max | 
|  | 876 | # cat /proc/sys/fs/file-max | 
|  | 877 | 8192 | 
|  | 878 |  | 
|  | 879 |  | 
|  | 880 | This method  of  revision  is  useful  for  all customizable parameters of the | 
|  | 881 | kernel - simply echo the new value to the corresponding file. | 
|  | 882 |  | 
|  | 883 | Historically, the three values in file-nr denoted the number of allocated file | 
|  | 884 | handles,  the number of  allocated but  unused file  handles, and  the maximum | 
|  | 885 | number of file handles. Linux 2.6 always  reports 0 as the number of free file | 
|  | 886 | handles -- this  is not an error,  it just means that the  number of allocated | 
|  | 887 | file handles exactly matches the number of used file handles. | 
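
The three values in file-nr can be combined into a usage figure. The
following is a minimal sketch, assuming the field order described above
(allocated, free, maximum); the helper name file_handle_usage is
hypothetical.

```shell
# Report how many file handles are in use relative to the maximum.
# Reads /proc/sys/fs/file-nr formatted text on standard input.
file_handle_usage() {
    awk '$3 > 0 {
        used = $1 - $2                       # allocated minus free
        printf "%d of %d file handles in use (%d%%)\n", used, $3, used * 100 / $3
    }'
}

# Usage: file_handle_usage < /proc/sys/fs/file-nr
```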
|  | 888 |  | 
|  | 889 | Attempts to  allocate more  file descriptors than  file-max are  reported with | 
|  | 890 | printk; look for "VFS: file-max limit <number> reached". | 
|  | 891 |  | 
|  | 892 | inode-state and inode-nr | 
|  | 893 | ------------------------ | 
|  | 894 |  | 
|  | 895 | The file inode-nr contains the first two items from inode-state, so we'll skip | 
|  | 896 | to that file... | 
|  | 897 |  | 
|  | 898 | inode-state contains  two  actual numbers and five dummy values. The numbers | 
|  | 899 | are nr_inodes and nr_free_inodes (in order of appearance). | 
|  | 900 |  | 
|  | 901 | nr_inodes | 
|  | 902 | ~~~~~~~~~ | 
|  | 903 |  | 
|  | 904 | Denotes the  number  of  inodes the system has allocated. This number will | 
|  | 905 | grow and shrink dynamically. | 
|  | 906 |  | 
|  | 907 | nr_free_inodes | 
|  | 908 | ~~~~~~~~~~~~~~ | 
|  | 909 |  | 
|  | 910 | Represents the  number of free inodes. That is, the number of in-use inodes | 
|  | 911 | is (nr_inodes - nr_free_inodes). | 
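
The subtraction can be done directly on the contents of inode-state. The
following is a minimal sketch, assuming nr_inodes and nr_free_inodes are the
first two fields as stated above; the helper name inodes_in_use is
hypothetical.

```shell
# Print the number of in-use inodes (nr_inodes - nr_free_inodes).
# Reads /proc/sys/fs/inode-state formatted text on standard input.
inodes_in_use() {
    awk '{ printf "%d\n", $1 - $2 }'
}

# Usage: inodes_in_use < /proc/sys/fs/inode-state
```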
|  | 912 |  | 
|  | 913 | aio-nr and aio-max-nr | 
|  | 914 | --------------------- | 
|  | 915 |  | 
|  | 916 | aio-nr is the running total of the number of events specified on the | 
|  | 917 | io_setup system call for all currently active aio contexts.  If aio-nr | 
|  | 918 | reaches aio-max-nr then io_setup will fail with EAGAIN.  Note that | 
|  | 919 | raising aio-max-nr does not result in the pre-allocation or re-sizing | 
|  | 920 | of any kernel data structures. | 
|  | 921 |  | 
|  | 922 | 2.2 /proc/sys/fs/binfmt_misc - Miscellaneous binary formats | 
|  | 923 | ----------------------------------------------------------- | 
|  | 924 |  | 
|  | 925 | Besides these  files, there is the subdirectory /proc/sys/fs/binfmt_misc. This | 
|  | 926 | handles the kernel support for miscellaneous binary formats. | 
|  | 927 |  | 
|  | 928 | Binfmt_misc provides  the ability to register additional binary formats to the | 
|  | 929 | Kernel without  compiling  an additional module/kernel. Therefore, binfmt_misc | 
|  | 930 | needs to  know magic numbers at the beginning or the filename extension of the | 
|  | 931 | binary. | 
|  | 932 |  | 
|  | 933 | It works by maintaining a linked list of structs that contain a description of | 
|  | 934 | a binary  format,  including  a  magic  with size (or the filename extension), | 
|  | 935 | offset and  mask,  and  the  interpreter name. On request it invokes the given | 
|  | 936 | interpreter with  the  original  program  as  argument,  as  binfmt_java  and | 
|  | 937 | binfmt_em86 and  binfmt_mz  do.  Since binfmt_misc does not define any default | 
|  | 938 | binary-formats, you have to register an additional binary-format. | 
|  | 939 |  | 
|  | 940 | There are two general files in binfmt_misc and one file per registered format. | 
|  | 941 | The two general files are register and status. | 
|  | 942 |  | 
|  | 943 | Registering a new binary format | 
|  | 944 | ------------------------------- | 
|  | 945 |  | 
|  | 946 | To register a new binary format you have to issue the command | 
|  | 947 |  | 
|  | 948 | echo :name:type:offset:magic:mask:interpreter: > /proc/sys/fs/binfmt_misc/register | 
|  | 949 |  | 
|  | 950 |  | 
|  | 951 |  | 
|  | 952 | with appropriate  name (the name for the /proc-dir entry), type (M for usual | 
|  | 953 | magic matching or E for filename extension matching), offset (defaults to 0 if | 
|  | 954 | omitted), magic (or the extension when the type is E), mask (which can be | 
|  | 955 | omitted, defaults to all 0xff) and, last but not least, the interpreter that | 
|  | 956 | is to be invoked (for testing, /bin/echo for example). | 
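
Since omitted fields still need their colons, it can help to build the
registration string from separate arguments. The following is a minimal
sketch; the helper name binfmt_entry is hypothetical, and the string it
prints still has to be written to the register file by root.

```shell
# Build a registration string from its six fields.
# Arguments: name type offset magic mask interpreter
# Pass '' for an omitted field such as offset or mask.
binfmt_entry() {
    if [ "$#" -ne 6 ]; then
        echo "usage: binfmt_entry name type offset magic mask interpreter" >&2
        return 1
    fi
    # %s copies each argument verbatim, so \x escapes in the magic survive.
    printf ':%s:%s:%s:%s:%s:%s:\n' "$1" "$2" "$3" "$4" "$5" "$6"
}

# Usage (as root):
# binfmt_entry Java M '' '\xca\xfe\xba\xbe' '' /usr/local/java/bin/javawrapper \
#     > /proc/sys/fs/binfmt_misc/register
```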
|  | 957 |  | 
|  | 958 | Check or reset the status of the binary format handler | 
|  | 959 | ------------------------------------------------------ | 
|  | 960 |  | 
|  | 961 | If you  do a cat on the file /proc/sys/fs/binfmt_misc/status, you will get the | 
|  | 962 | current status (enabled/disabled) of binfmt_misc. Change the status by echoing | 
|  | 963 | 0 (disables)  or  1  (enables)  or  -1  (caution:  this  clears all previously | 
|  | 964 | registered binary  formats)  to status. For example echo 0 > status to disable | 
|  | 965 | binfmt_misc (temporarily). | 
|  | 966 |  | 
|  | 967 | Status of a single handler | 
|  | 968 | -------------------------- | 
|  | 969 |  | 
|  | 970 | Each registered  handler has an entry in /proc/sys/fs/binfmt_misc. These files | 
|  | 971 | perform the  same function as status, but their scope is limited to the actual | 
|  | 972 | binary format.  By  cating this file, you also receive all related information | 
|  | 973 | about the interpreter/magic of the binfmt. | 
|  | 974 |  | 
|  | 975 | Example usage of binfmt_misc (emulate binfmt_java) | 
|  | 976 | -------------------------------------------------- | 
|  | 977 |  | 
|  | 978 | cd /proc/sys/fs/binfmt_misc | 
|  | 979 | echo ':Java:M::\xca\xfe\xba\xbe::/usr/local/java/bin/javawrapper:' > register | 
|  | 980 | echo ':HTML:E::html::/usr/local/java/bin/appletviewer:' > register | 
|  | 981 | echo ':Applet:M::<!--applet::/usr/local/java/bin/appletviewer:' > register | 
|  | 982 | echo ':DEXE:M::\x0eDEX::/usr/bin/dosexec:' > register | 
|  | 983 |  | 
|  | 984 |  | 
|  | 985 | These four  lines  add  support  for  Java  executables and Java applets (like | 
|  | 986 | binfmt_java, additionally  recognizing the .html extension with no need to put | 
|  | 987 | <!--applet> to  every  applet  file).  You  have  to  install  the JDK and the | 
|  | 988 | shell-script /usr/local/java/bin/javawrapper  too.  It  works  around  the | 
|  | 989 | brokenness of  the Java filename handling. To add a Java binary, just create a | 
|  | 990 | link to the class-file somewhere in the path. | 
|  | 991 |  | 
|  | 992 | 2.3 /proc/sys/kernel - general kernel parameters | 
|  | 993 | ------------------------------------------------ | 
|  | 994 |  | 
|  | 995 | This directory  reflects  general  kernel  behaviors. As I've said before, the | 
|  | 996 | contents depend  on  your  configuration.  Here you'll find the most important | 
|  | 997 | files, along with descriptions of what they mean and how to use them. | 
|  | 998 |  | 
|  | 999 | acct | 
|  | 1000 | ---- | 
|  | 1001 |  | 
|  | 1002 | The file contains three values: highwater, lowwater, and frequency. | 
|  | 1003 |  | 
|  | 1004 | It exists  only  when  BSD-style  process  accounting is enabled. These values | 
|  | 1005 | control its behavior. If the free space on the file system where the log lives | 
|  | 1006 | goes below  lowwater  percentage,  accounting  suspends.  If  it  goes  above | 
|  | 1007 | highwater percentage,  accounting  resumes. Frequency determines how often you | 
|  | 1008 | check the amount of free space (value is in seconds). Default settings are: 4, | 
|  | 1009 | 2, and  30.  That is, suspend accounting if there is less than 2 percent free; | 
|  | 1010 | resume it  if we have a value of 4 or more percent; consider information about | 
|  | 1011 | the amount of free space valid for 30 seconds. | 
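
The two thresholds form a hysteresis: accounting that has been suspended
stays suspended until free space climbs back to the higher mark. The
following is a minimal sketch of that decision, with the thresholds passed
in as parameters. It is an illustration and not the kernel's code; the
helper name acct_next_state and the exact treatment of the boundary values
are assumptions.

```shell
# Decide the next accounting state from the current one.
# Args: current state (active|suspended), free space in percent,
#       highwater, lowwater
acct_next_state() {
    state=$1 free=$2 high=$3 low=$4
    if [ "$state" = active ] && [ "$free" -lt "$low" ]; then
        echo suspended            # dropped below lowwater
    elif [ "$state" = suspended ] && [ "$free" -ge "$high" ]; then
        echo active               # recovered to highwater
    else
        echo "$state"             # between the marks: no change
    fi
}

# Usage: acct_next_state active 1 4 2
```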
|  | 1012 |  | 
|  | 1013 | ctrl-alt-del | 
|  | 1014 | ------------ | 
|  | 1015 |  | 
|  | 1016 | When the value in this file is 0, ctrl-alt-del is trapped and sent to the init | 
|  | 1017 | program to  handle a graceful restart. However, when the value is greater than | 
|  | 1018 | zero, Linux's  reaction  to  this key combination will be an immediate reboot, | 
|  | 1019 | without syncing its dirty buffers. | 
|  | 1020 |  | 
|  | 1021 | [NOTE] | 
|  | 1022 | When a  program  (like  dosemu)  has  the  keyboard  in  raw  mode,  the | 
|  | 1023 | ctrl-alt-del is  intercepted  by  the  program  before it ever reaches the | 
|  | 1024 | kernel tty  layer,  and  it is up to the program to decide what to do with | 
|  | 1025 | it. | 
|  | 1026 |  | 
|  | 1027 | domainname and hostname | 
|  | 1028 | ----------------------- | 
|  | 1029 |  | 
|  | 1030 | These files  can  be controlled to set the NIS domainname and hostname of your | 
|  | 1031 | box. For the classic darkstar.frop.org a simple: | 
|  | 1032 |  | 
|  | 1033 | # echo "darkstar" > /proc/sys/kernel/hostname | 
|  | 1034 | # echo "frop.org" > /proc/sys/kernel/domainname | 
|  | 1035 |  | 
|  | 1036 |  | 
|  | 1037 | would suffice to set your hostname and NIS domainname. | 
|  | 1038 |  | 
|  | 1039 | osrelease, ostype and version | 
|  | 1040 | ----------------------------- | 
|  | 1041 |  | 
|  | 1042 | The names make it pretty obvious what these fields contain: | 
|  | 1043 |  | 
|  | 1044 | > cat /proc/sys/kernel/osrelease | 
|  | 1045 | 2.2.12 | 
|  | 1046 |  | 
|  | 1047 | > cat /proc/sys/kernel/ostype | 
|  | 1048 | Linux | 
|  | 1049 |  | 
|  | 1050 | > cat /proc/sys/kernel/version | 
|  | 1051 | #4 Fri Oct 1 12:41:14 PDT 1999 | 
|  | 1052 |  | 
|  | 1053 |  | 
|  | 1054 | The files  osrelease and ostype should be clear enough. Version needs a little | 
|  | 1055 | more clarification.  The  #4 means that this is the 4th kernel built from this | 
|  | 1056 | source base and the date after it indicates the time the kernel was built. The | 
|  | 1057 | only way to tune these values is to rebuild the kernel. | 
|  | 1058 |  | 
|  | 1059 | panic | 
|  | 1060 | ----- | 
|  | 1061 |  | 
|  | 1062 | The value  in  this  file  represents  the  number of seconds the kernel waits | 
|  | 1063 | before rebooting  on  a  panic.  When  you  use  the  software  watchdog,  the | 
|  | 1064 | recommended setting  is  60. If set to 0, the auto reboot after a kernel panic | 
|  | 1065 | is disabled, which is the default setting. | 
|  | 1066 |  | 
|  | 1067 | printk | 
|  | 1068 | ------ | 
|  | 1069 |  | 
|  | 1070 | The four values in printk denote | 
|  | 1071 | * console_loglevel, | 
|  | 1072 | * default_message_loglevel, | 
|  | 1073 | * minimum_console_loglevel and | 
|  | 1074 | * default_console_loglevel | 
|  | 1075 | respectively. | 
|  | 1076 |  | 
|  | 1077 | These values  influence  printk()  behavior  when  printing  or  logging error | 
|  | 1078 | messages, which  come  from  inside  the  kernel.  See  syslog(2)  for  more | 
|  | 1079 | information on the different log levels. | 
|  | 1080 |  | 
|  | 1081 | console_loglevel | 
|  | 1082 | ---------------- | 
|  | 1083 |  | 
|  | 1084 | Messages with a higher priority than this will be printed to the console. | 
|  | 1085 |  | 
|  | 1086 | default_message_loglevel | 
|  | 1087 | ------------------------ | 
|  | 1088 |  | 
|  | 1089 | Messages without an explicit priority will be printed with this priority. | 
|  | 1090 |  | 
|  | 1091 | minimum_console_loglevel | 
|  | 1092 | ------------------------ | 
|  | 1093 |  | 
|  | 1094 | Minimum (highest) value to which the console_loglevel can be set. | 
|  | 1095 |  | 
|  | 1096 | default_console_loglevel | 
|  | 1097 | ------------------------ | 
|  | 1098 |  | 
|  | 1099 | Default value for console_loglevel. | 
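
Log levels are numeric, and a lower number means a higher priority (see
syslog(2)). A message therefore reaches the console when its level is
numerically less than console_loglevel. The following is a minimal sketch of
that comparison; the helper name reaches_console is hypothetical.

```shell
# Succeed if a message at the given level would reach the console.
# Args: message level, console_loglevel
reaches_console() {
    [ "$1" -lt "$2" ]
}

# Usage: read the four values, then test a level.
# read -r console default_msg min_console default_console < /proc/sys/kernel/printk
# reaches_console 3 "$console" && echo "level 3 messages are shown"
```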
|  | 1100 |  | 
|  | 1101 | sg-big-buff | 
|  | 1102 | ----------- | 
|  | 1103 |  | 
|  | 1104 | This file  shows  the size of the generic SCSI (sg) buffer. At this point, you | 
|  | 1105 | can't tune  it  yet,  but  you  can  change  it  at  compile  time  by editing | 
|  | 1106 | include/scsi/sg.h and changing the value of SG_BIG_BUFF. | 
|  | 1107 |  | 
|  | 1108 | If you use a scanner with SANE (Scanner Access Now Easy) you might want to set | 
|  | 1109 | this to a higher value. Refer to the SANE documentation on this issue. | 
|  | 1110 |  | 
|  | 1111 | modprobe | 
|  | 1112 | -------- | 
|  | 1113 |  | 
|  | 1114 | The location  where  the  modprobe  binary  is  located.  The kernel uses this | 
|  | 1115 | program to load modules on demand. | 
|  | 1116 |  | 
|  | 1117 | unknown_nmi_panic | 
|  | 1118 | ----------------- | 
|  | 1119 |  | 
|  | 1120 | The value in this file affects how the kernel handles NMIs. When the value is | 
|  | 1121 | non-zero, an unknown NMI is trapped and the kernel then panics. At that time, | 
|  | 1122 | kernel debugging information is displayed on the console. | 
|  | 1123 |  | 
|  | 1124 | For example, the NMI switch found on most IA32 servers raises an unknown NMI. | 
|  | 1125 | If a system hangs, try pressing the NMI switch. | 
|  | 1126 |  | 
|  | 1127 | [NOTE] | 
|  | 1128 | This function and oprofile share an NMI callback. Therefore this function | 
|  | 1129 | cannot be enabled when oprofile is activated. | 
|  | 1130 | The NMI watchdog is also disabled when the value in this file is set to | 
|  | 1131 | non-zero. | 
|  | 1132 |  | 
|  | 1133 |  | 
|  | 1134 | 2.4 /proc/sys/vm - The virtual memory subsystem | 
|  | 1135 | ----------------------------------------------- | 
|  | 1136 |  | 
|  | 1137 | The files  in  this directory can be used to tune the operation of the virtual | 
|  | 1138 | memory (VM)  subsystem  of  the  Linux  kernel. | 
|  | 1139 |  | 
|  | 1140 | vfs_cache_pressure | 
|  | 1141 | ------------------ | 
|  | 1142 |  | 
|  | 1143 | Controls the tendency of the kernel to reclaim the memory which is used for | 
|  | 1144 | caching of directory and inode objects. | 
|  | 1145 |  | 
|  | 1146 | At the default value of vfs_cache_pressure=100 the kernel will attempt to | 
|  | 1147 | reclaim dentries and inodes at a "fair" rate with respect to pagecache and | 
|  | 1148 | swapcache reclaim.  Decreasing vfs_cache_pressure causes the kernel to prefer | 
|  | 1149 | to retain dentry and inode caches.  Increasing vfs_cache_pressure beyond 100 | 
|  | 1150 | causes the kernel to prefer to reclaim dentries and inodes. | 
|  | 1151 |  | 
|  | 1152 | dirty_background_ratio | 
|  | 1153 | ---------------------- | 
|  | 1154 |  | 
|  | 1155 | Contains, as a percentage of total system memory, the number of pages at which | 
|  | 1156 | the pdflush background writeback daemon will start writing out dirty data. | 
|  | 1157 |  | 
|  | 1158 | dirty_ratio | 
|  | 1159 | ----------- | 
|  | 1160 |  | 
|  | 1161 | Contains, as a percentage of total system memory, the number of pages at which | 
|  | 1162 | a process which is generating disk writes will itself start writing out dirty | 
|  | 1163 | data. | 
|  | 1164 |  | 
|  | 1165 | dirty_writeback_centisecs | 
|  | 1166 | ------------------------- | 
|  | 1167 |  | 
|  | 1168 | The pdflush writeback daemons will periodically wake up and write `old' data | 
|  | 1169 | out to disk.  This tunable expresses the interval between those wakeups, in | 
|  | 1170 | hundredths of a second. | 
|  | 1171 |  | 
|  | 1172 | Setting this to zero disables periodic writeback altogether. | 
|  | 1173 |  | 
|  | 1174 | dirty_expire_centisecs | 
|  | 1175 | ---------------------- | 
|  | 1176 |  | 
|  | 1177 | This tunable is used to define when dirty data is old enough to be eligible | 
|  | 1178 | for writeout by the pdflush daemons.  It is expressed in hundredths of a | 
|  | 1179 | second.  Data which has been dirty in-memory for longer than this interval | 
|  | 1180 | will be written out next time a pdflush daemon wakes up. | 
|  | 1181 |  | 
|  | 1182 | legacy_va_layout | 
|  | 1183 | ---------------- | 
|  | 1184 |  | 
|  | 1185 | If non-zero, this sysctl disables the new 32-bit mmap layout - the kernel | 
|  | 1186 | will use the legacy (2.4) layout for all processes. | 
|  | 1187 |  | 
|  | 1188 | lower_zone_protection | 
|  | 1189 | --------------------- | 
|  | 1190 |  | 
|  | 1191 | For some specialised workloads on highmem machines it is dangerous for | 
|  | 1192 | the kernel to allow process memory to be allocated from the "lowmem" | 
|  | 1193 | zone.  This is because that memory could then be pinned via the mlock() | 
|  | 1194 | system call, or by unavailability of swapspace. | 
|  | 1195 |  | 
|  | 1196 | And on large highmem machines this lack of reclaimable lowmem memory | 
|  | 1197 | can be fatal. | 
|  | 1198 |  | 
|  | 1199 | So the Linux page allocator has a mechanism which prevents allocations | 
|  | 1200 | which _could_ use highmem from using too much lowmem.  This means that | 
|  | 1201 | a certain amount of lowmem is defended from the possibility of being | 
|  | 1202 | captured into pinned user memory. | 
|  | 1203 |  | 
|  | 1204 | (The same argument applies to the old 16 megabyte ISA DMA region.  This | 
|  | 1205 | mechanism will also defend that region from allocations which could use | 
|  | 1206 | highmem or lowmem). | 
|  | 1207 |  | 
|  | 1208 | The `lower_zone_protection' tunable determines how aggressive the kernel is | 
|  | 1209 | in defending these lower zones.  The default value is zero - no | 
|  | 1210 | protection at all. | 
|  | 1211 |  | 
|  | 1212 | If you have a machine which uses highmem or ISA DMA and your | 
|  | 1213 | applications are using mlock(), or if you are running with no swap then | 
|  | 1214 | you probably should increase the lower_zone_protection setting. | 
|  | 1215 |  | 
|  | 1216 | The units of this tunable are fairly vague.  It is approximately equal | 
|  | 1217 | to "megabytes".  So setting lower_zone_protection=100 will protect around 100 | 
|  | 1218 | megabytes of the lowmem zone from user allocations.  It will also make | 
|  | 1219 | those 100 megabytes unavailable for use by applications and by | 
|  | 1220 | pagecache, so there is a cost. | 
|  | 1221 |  | 
|  | 1222 | The effects of this tunable may be observed by monitoring | 
|  | 1223 | /proc/meminfo:LowFree.  Write a single huge file and observe the point | 
|  | 1224 | at which LowFree ceases to fall. | 
|  | 1225 |  | 
|  | 1226 | A reasonable value for lower_zone_protection is 100. | 
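
The monitoring step described above can be sketched as a small parser for the
LowFree field. The function name and the sample text are made up for
illustration; the sketch assumes the usual "Name:  value kB" layout of
/proc/meminfo, and LowFree is only present on highmem kernels.

```python
def low_free_kb(meminfo_text):
    """Return the LowFree value in kB, or None if the field is absent."""
    for line in meminfo_text.splitlines():
        if line.startswith("LowFree:"):
            # Assumed field layout: "LowFree:   <number> kB"
            return int(line.split()[1])
    return None


# Illustrative sample only; on a live system pass the contents of
# /proc/meminfo instead.
sample = "MemTotal:  1035756 kB\nLowFree:     102400 kB\n"
print(low_free_kb(sample))
```

Calling this repeatedly while writing a large file shows the point at which
LowFree stops falling.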
|  | 1227 |  | 
|  | 1228 | page-cluster | 
|  | 1229 | ------------ | 
|  | 1230 |  | 
|  | 1231 | page-cluster controls the number of pages which are written to swap in | 
|  | 1232 | a single attempt, i.e. the swap I/O size. | 
|  | 1233 |  | 
|  | 1234 | It is a logarithmic value - setting it to zero means "1 page", setting | 
|  | 1235 | it to 1 means "2 pages", setting it to 2 means "4 pages", etc. | 
|  | 1236 |  | 
|  | 1237 | The default value is three (eight pages at a time).  There may be some | 
|  | 1238 | small benefits in tuning this to a different value if your workload is | 
|  | 1239 | swap-intensive. | 
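
Since the value is a base-2 logarithm, the number of pages per swap I/O is a
simple shift. A minimal sketch (the helper name is hypothetical):

```python
def swap_io_pages(page_cluster):
    """Number of pages written to swap in one attempt for a page-cluster value."""
    if page_cluster < 0:
        raise ValueError("page-cluster must not be negative")
    return 1 << page_cluster


for value in range(4):
    print(value, swap_io_pages(value))
```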
|  | 1240 |  | 
|  | 1241 | overcommit_memory | 
|  | 1242 | ----------------- | 
|  | 1243 |  | 
|  | 1244 | Controls overcommit of system memory, possibly allowing processes | 
|  | 1245 | to allocate (but not use) more memory than is actually available. | 
|  | 1246 |  | 
|  | 1247 |  | 
|  | 1248 | 0	-	Heuristic overcommit handling. Obvious overcommits of | 
|  | 1249 | address space are refused. Used for a typical system. It | 
|  | 1250 | ensures a seriously wild allocation fails while allowing | 
|  | 1251 | overcommit to reduce swap usage.  root is allowed to | 
|  | 1252 | allocate slightly more memory in this mode. This is the | 
|  | 1253 | default. | 
|  | 1254 |  | 
|  | 1255 | 1	-	Always overcommit. Appropriate for some scientific | 
|  | 1256 | applications. | 
|  | 1257 |  | 
|  | 1258 | 2	-	Don't overcommit. The total address space commit | 
|  | 1259 | for the system is not permitted to exceed swap plus a | 
|  | 1260 | configurable percentage (default is 50) of physical RAM. | 
|  | 1261 | Depending on the percentage you use, in most situations | 
|  | 1262 | this means a process will not be killed while attempting | 
|  | 1263 | to use already-allocated memory but will receive errors | 
|  | 1264 | on memory allocation as appropriate. | 
|  | 1265 |  | 
|  | 1266 | overcommit_ratio | 
|  | 1267 | ---------------- | 
|  | 1268 |  | 
|  | 1269 | Percentage of physical memory size to include in overcommit calculations | 
|  | 1270 | (see above.) | 
|  | 1271 |  | 
|  | 1272 | Memory allocation limit = swapspace + physmem * (overcommit_ratio / 100) | 
|  | 1273 |  | 
|  | 1274 | swapspace = total size of all swap areas | 
|  | 1275 | physmem = size of physical memory in system | 
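
The formula above can be written down directly. This is only a sketch of the
arithmetic, with a hypothetical function name; both sizes must be given in the
same unit (kB in the example), and integer division is used for simplicity.

```python
def commit_limit(swapspace, physmem, overcommit_ratio=50):
    """Memory allocation limit = swapspace + physmem * (overcommit_ratio / 100)."""
    return swapspace + physmem * overcommit_ratio // 100


# Example values only: 2 GiB of swap and 4 GiB of RAM, expressed in kB.
print(commit_limit(2 * 1024 * 1024, 4 * 1024 * 1024))
```

With the default ratio of 50, half of physical memory counts towards the
limit in addition to all of swap.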
|  | 1276 |  | 
|  | 1277 | nr_hugepages and hugetlb_shm_group | 
|  | 1278 | ---------------------------------- | 
|  | 1279 |  | 
|  | 1280 | nr_hugepages configures the number of hugetlb pages reserved for the system. | 
|  | 1281 |  | 
|  | 1282 | hugetlb_shm_group contains the group id that is allowed to create SysV shared | 
|  | 1283 | memory segments using hugetlb pages. | 
|  | 1284 |  | 
|  | 1285 | laptop_mode | 
|  | 1286 | ----------- | 
|  | 1287 |  | 
|  | 1288 | laptop_mode is a knob that controls "laptop mode". All the things that are | 
|  | 1289 | controlled by this knob are discussed in Documentation/laptop-mode.txt. | 
|  | 1290 |  | 
|  | 1291 | block_dump | 
|  | 1292 | ---------- | 
|  | 1293 |  | 
|  | 1294 | block_dump enables block I/O debugging when set to a nonzero value. More | 
|  | 1295 | information on block I/O debugging is in Documentation/laptop-mode.txt. | 
|  | 1296 |  | 
|  | 1297 | swap_token_timeout | 
|  | 1298 | ------------------ | 
|  | 1299 |  | 
|  | 1300 | This file contains the valid hold time of the swap-out protection token. The | 
|  | 1301 | Linux VM has a token-based thrashing control mechanism and uses the token to | 
|  | 1302 | prevent unnecessary page faults in thrashing situations. The value is given in | 
|  | 1303 | seconds and can be used to tune thrashing behavior. | 
|  | 1304 |  | 
|  | 1305 | 2.5 /proc/sys/dev - Device specific parameters | 
|  | 1306 | ---------------------------------------------- | 
|  | 1307 |  | 
|  | 1308 | Currently there is only support for CD-ROM drives, and for those, there is only | 
|  | 1309 | one read-only  file containing information about the CD-ROM drives attached to | 
|  | 1310 | the system: | 
|  | 1311 |  | 
|  | 1312 | >cat /proc/sys/dev/cdrom/info | 
|  | 1313 | CD-ROM information, Id: cdrom.c 2.55 1999/04/25 | 
|  | 1314 |  | 
|  | 1315 | drive name:             sr0     hdb | 
|  | 1316 | drive speed:            32      40 | 
|  | 1317 | drive # of slots:       1       0 | 
|  | 1318 | Can close tray:         1       1 | 
|  | 1319 | Can open tray:          1       1 | 
|  | 1320 | Can lock tray:          1       1 | 
|  | 1321 | Can change speed:       1       1 | 
|  | 1322 | Can select disk:        0       1 | 
|  | 1323 | Can read multisession:  1       1 | 
|  | 1324 | Can read MCN:           1       1 | 
|  | 1325 | Reports media changed:  1       1 | 
|  | 1326 | Can play audio:         1       1 | 
|  | 1327 |  | 
|  | 1328 |  | 
|  | 1329 | You see two drives, sr0 and hdb, along with a list of their features. | 
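
Because the file lists one column per drive, it is easy to turn into a
per-drive dictionary. The sketch below is one possible parser, not an official
interface; the function name is hypothetical and it assumes every feature value
is an integer, as in the listing above.

```python
def parse_cdrom_info(text):
    """Return {drive_name: {feature: int_value}} from the info file's text."""
    drives = []
    info = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        if not sep:
            continue  # blank line or line without a "name: values" pair
        values = rest.split()
        if key == "drive name":
            drives = values
            info = {name: {} for name in drives}
        elif drives and len(values) == len(drives):
            # One value per drive, in the same column order as "drive name".
            for name, value in zip(drives, values):
                info[name][key.strip()] = int(value)
    return info


# Shortened copy of the listing above, used as sample input.
sample = """CD-ROM information, Id: cdrom.c 2.55 1999/04/25

drive name:             sr0     hdb
drive speed:            32      40
Can select disk:        0       1
"""
drives_info = parse_cdrom_info(sample)
print(drives_info["hdb"]["drive speed"])
```

The header line is skipped automatically because it appears before the
"drive name" row.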
|  | 1330 |  | 
|  | 1331 | 2.6 /proc/sys/sunrpc - Remote procedure calls | 
|  | 1332 | --------------------------------------------- | 
|  | 1333 |  | 
|  | 1334 | This directory  contains four files, which enable or disable debugging for the | 
|  | 1335 | RPC functions NFS, NFS-daemon, RPC and NLM. The default value of each is 0. Set | 
|  | 1336 | a file to one to turn debugging on for that function. | 
|  | 1337 |  | 
|  | 1338 | 2.7 /proc/sys/net - Networking stuff | 
|  | 1339 | ------------------------------------ | 
|  | 1340 |  | 
|  | 1341 | The interface  to  the  networking  parts  of  the  kernel  is  located  in | 
|  | 1342 | /proc/sys/net. Table  2-3  shows all possible subdirectories. You may see only | 
|  | 1343 | some of them, depending on your kernel's configuration. | 
|  | 1344 |  | 
|  | 1345 |  | 
|  | 1346 | Table 2-3: Subdirectories in /proc/sys/net | 
|  | 1347 | .............................................................................. | 
|  | 1348 | Directory Content             Directory  Content | 
|  | 1349 | core      General parameter   appletalk  Appletalk protocol | 
|  | 1350 | unix      Unix domain sockets netrom     NET/ROM | 
|  | 1351 | 802       E802 protocol       ax25       AX25 | 
|  | 1352 | ethernet  Ethernet protocol   rose       X.25 PLP layer | 
|  | 1353 | ipv4      IP version 4        x25        X.25 protocol | 
|  | 1354 | ipx       IPX                 token-ring IBM token ring | 
|  | 1355 | bridge    Bridging            decnet     DEC net | 
|  | 1356 | ipv6      IP version 6 | 
|  | 1357 | .............................................................................. | 
|  | 1358 |  | 
|  | 1359 | We will  concentrate  on IP networking here. Since AX25, X.25, and DEC Net are | 
|  | 1360 | only minor players in the Linux world, we'll skip them in this chapter. You'll | 
|  | 1361 | find some  short  info on Appletalk and IPX further on in this chapter. Review | 
|  | 1362 | the online  documentation  and the kernel source to get a detailed view of the | 
|  | 1363 | parameters for  those  protocols.  In  this  section  we'll  discuss  the | 
|  | 1364 | core, unix and ipv4 subdirectories listed in the table above. As default values | 
|  | 1365 | are suitable for most needs, there is usually no need to change these values. | 
|  | 1366 |  | 
|  | 1367 | /proc/sys/net/core - Network core options | 
|  | 1368 | ----------------------------------------- | 
|  | 1369 |  | 
|  | 1370 | rmem_default | 
|  | 1371 | ------------ | 
|  | 1372 |  | 
|  | 1373 | The default setting of the socket receive buffer in bytes. | 
|  | 1374 |  | 
|  | 1375 | rmem_max | 
|  | 1376 | -------- | 
|  | 1377 |  | 
|  | 1378 | The maximum receive socket buffer size in bytes. | 
|  | 1379 |  | 
|  | 1380 | wmem_default | 
|  | 1381 | ------------ | 
|  | 1382 |  | 
|  | 1383 | The default setting (in bytes) of the socket send buffer. | 
|  | 1384 |  | 
|  | 1385 | wmem_max | 
|  | 1386 | -------- | 
|  | 1387 |  | 
|  | 1388 | The maximum send socket buffer size in bytes. | 
|  | 1389 |  | 
|  | 1390 | message_burst and message_cost | 
|  | 1391 | ------------------------------ | 
|  | 1392 |  | 
|  | 1393 | These parameters  are used to limit the warning messages written to the kernel | 
|  | 1394 | log from  the  networking  code.  They  enforce  a  rate  limit  to  make  a | 
|  | 1395 | denial-of-service attack  impossible. A higher message_cost factor results in | 
|  | 1396 | fewer messages being written. message_burst controls when messages will | 
|  | 1397 | be dropped.  The  default  settings  limit  warning messages to one every five | 
|  | 1398 | seconds. | 
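
One common way to implement a cost/burst rate limit is a token bucket. The
sketch below assumes that scheme to illustrate how the two parameters interact;
it is not taken from the kernel source, and the class name, the tick unit and
the example numbers are all hypothetical.

```python
class RateLimiter:
    """Token bucket: one token is earned per tick, each message spends `cost`."""

    def __init__(self, cost, burst):
        self.cost = cost      # tokens consumed per message
        self.burst = burst    # maximum number of tokens that may accumulate
        self.tokens = burst   # start with a full bucket
        self.last = 0         # tick of the previous call

    def allow(self, now):
        # Earn tokens for the elapsed ticks, capped at the burst size.
        self.tokens = min(self.burst, self.tokens + (now - self.last))
        self.last = now
        if self.tokens >= self.cost:
            self.tokens -= self.cost
            return True
        return False


limiter = RateLimiter(cost=5, burst=10)
print([limiter.allow(tick) for tick in (0, 0, 0, 4, 5)])
```

In this model a larger cost lets fewer messages through over time, while the
burst size bounds how many messages can pass back to back after a quiet period.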
|  | 1399 |  | 
|  | 1400 | netdev_max_backlog | 
|  | 1401 | ------------------ | 
|  | 1402 |  | 
|  | 1403 | Maximum number  of  packets  queued  on  the  INPUT  side  when the interface | 
|  | 1404 | receives packets faster than the kernel can process them. | 
|  | 1405 |  | 
|  | 1406 | optmem_max | 
|  | 1407 | ---------- | 
|  | 1408 |  | 
|  | 1409 | Maximum ancillary buffer size allowed per socket. Ancillary data is a sequence | 
|  | 1410 | of struct cmsghdr structures with appended data. | 
|  | 1411 |  | 
|  | 1412 | /proc/sys/net/unix - Parameters for Unix domain sockets | 
|  | 1413 | ------------------------------------------------------- | 
|  | 1414 |  | 
|  | 1415 | There are  only  two  files  in this subdirectory. They control the delays for | 
|  | 1416 | deleting and destroying socket descriptors. | 
|  | 1417 |  | 
|  | 1418 | 2.8 /proc/sys/net/ipv4 - IPV4 settings | 
|  | 1419 | -------------------------------------- | 
|  | 1420 |  | 
|  | 1421 | IP version  4  is  still the most used protocol in Unix networking. It will be | 
|  | 1422 | replaced by  IP version 6 in the next couple of years, but for the moment it's | 
|  | 1423 | the de  facto  standard  for  the  internet  and  is  used  in most networking | 
|  | 1424 | environments around  the  world.  Because  of the importance of this protocol, | 
|  | 1425 | we'll have a deeper look into the subtree controlling the behavior of the IPv4 | 
|  | 1426 | subsystem of the Linux kernel. | 
|  | 1427 |  | 
|  | 1428 | Let's start with the entries in /proc/sys/net/ipv4. | 
|  | 1429 |  | 
|  | 1430 | ICMP settings | 
|  | 1431 | ------------- | 
|  | 1432 |  | 
|  | 1433 | icmp_echo_ignore_all and icmp_echo_ignore_broadcasts | 
|  | 1434 | ---------------------------------------------------- | 
|  | 1435 |  | 
|  | 1436 | Turn on (1) or off (0), if the kernel should ignore all ICMP ECHO requests, or | 
|  | 1437 | just those to broadcast and multicast addresses. | 
|  | 1438 |  | 
|  | 1439 | Please note that if you accept ICMP echo requests with a broadcast/multicast | 
|  | 1440 | destination address  your  network  may  be  used as an exploder for denial of | 
|  | 1441 | service packet flooding attacks to other hosts. | 
|  | 1442 |  | 
|  | 1443 | icmp_destunreach_rate, icmp_echoreply_rate, icmp_paramprob_rate and icmp_timeexceed_rate | 
|  | 1444 | ---------------------------------------------------------------------------------------- | 
|  | 1445 |  | 
|  | 1446 | Sets limits  for  sending  ICMP  packets  to specific targets. A value of zero | 
|  | 1447 | disables all  limiting.  Any  positive  value sets the maximum packet rate in | 
|  | 1448 | hundredths of a second (on Intel systems). | 
|  | 1449 |  | 
|  | 1450 | IP settings | 
|  | 1451 | ----------- | 
|  | 1452 |  | 
|  | 1453 | ip_autoconfig | 
|  | 1454 | ------------- | 
|  | 1455 |  | 
|  | 1456 | This file contains the number one if the host received its IP configuration by | 
|  | 1457 | RARP, BOOTP, DHCP or a similar mechanism. Otherwise it is zero. | 
|  | 1458 |  | 
|  | 1459 | ip_default_ttl | 
|  | 1460 | -------------- | 
|  | 1461 |  | 
|  | 1462 | TTL (Time  To  Live) for IPv4 interfaces. This is simply the maximum number of | 
|  | 1463 | hops a packet may travel. | 
|  | 1464 |  | 
|  | 1465 | ip_dynaddr | 
|  | 1466 | ---------- | 
|  | 1467 |  | 
|  | 1468 | Enable dynamic  socket  address rewriting on interface address change. This is | 
|  | 1469 | useful for dialup interface with changing IP addresses. | 
|  | 1470 |  | 
|  | 1471 | ip_forward | 
|  | 1472 | ---------- | 
|  | 1473 |  | 
|  | 1474 | Enable or  disable forwarding of IP packets between interfaces. Changing this | 
|  | 1475 | value resets  all other parameters to their default values. The defaults differ | 
|  | 1476 | depending on whether the kernel is configured as host or router. | 
|  | 1477 |  | 
|  | 1478 | ip_local_port_range | 
|  | 1479 | ------------------- | 
|  | 1480 |  | 
|  | 1481 | Range of  ports  used  by  TCP  and UDP to choose the local port. Contains two | 
|  | 1482 | numbers, the  first  number  is the lowest port, the second number the highest | 
|  | 1483 | local port.  Default  is  1024-4999.  Should  be  changed  to  32768-61000 for | 
|  | 1484 | high-usage systems. | 
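
A short sketch of how the two numbers in this file might be read and checked.
The function names are hypothetical, and the input is assumed to be two
whitespace-separated integers, lowest port first.

```python
def parse_port_range(text):
    """Parse 'low high' as found in ip_local_port_range into a tuple of ints."""
    low, high = (int(field) for field in text.split())
    if not 0 < low <= high <= 65535:
        raise ValueError("invalid port range: %d-%d" % (low, high))
    return low, high


def port_count(text):
    """Number of local ports available in the given range (inclusive)."""
    low, high = parse_port_range(text)
    return high - low + 1


print(port_count("1024\t4999"))    # default range
print(port_count("32768 61000"))   # range suggested for high-usage systems
```

Comparing the two counts shows why the larger range helps systems that open
many outgoing connections.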
|  | 1485 |  | 
|  | 1486 | ip_no_pmtu_disc | 
|  | 1487 | --------------- | 
|  | 1488 |  | 
|  | 1489 | Global switch  to  turn  path  MTU  discovery off. It can also be set on a per | 
|  | 1490 | socket basis by the applications or on a per route basis. | 
|  | 1491 |  | 
|  | 1492 | ip_masq_debug | 
|  | 1493 | ------------- | 
|  | 1494 |  | 
|  | 1495 | Enable/disable debugging of IP masquerading. | 
|  | 1496 |  | 
|  | 1497 | IP fragmentation settings | 
|  | 1498 | ------------------------- | 
|  | 1499 |  | 
|  | 1500 | ipfrag_high_thresh and ipfrag_low_thresh | 
|  | 1501 | ---------------------------------------- | 
|  | 1502 |  | 
|  | 1503 | Maximum memory  used to reassemble IP fragments. When ipfrag_high_thresh bytes | 
|  | 1504 | of memory  is  allocated  for  this  purpose,  the  fragment handler will toss | 
|  | 1505 | packets until ipfrag_low_thresh is reached. | 
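
The high/low pair forms a hysteresis: nothing is dropped until the high mark is
hit, and then memory is freed down to the low mark. The following sketch models
that behaviour with a simple queue of fragment sizes. Dropping the oldest
entries first is an assumption made for the example, and the function name and
numbers are hypothetical.

```python
from collections import deque


def add_fragment(queue, size, high_thresh, low_thresh):
    """Queue a fragment of `size` bytes and return the memory in use afterwards."""
    queue.append(size)
    if sum(queue) >= high_thresh:
        # High mark reached: toss queued fragments until usage is at or
        # below the low mark (oldest first, by assumption).
        while queue and sum(queue) > low_thresh:
            queue.popleft()
    return sum(queue)


fragments = deque()
for _ in range(3):
    print(add_fragment(fragments, 100, high_thresh=300, low_thresh=150))
```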
|  | 1506 |  | 
|  | 1507 | ipfrag_time | 
|  | 1508 | ----------- | 
|  | 1509 |  | 
|  | 1510 | Time in seconds to keep an IP fragment in memory. | 
|  | 1511 |  | 
|  | 1512 | TCP settings | 
|  | 1513 | ------------ | 
|  | 1514 |  | 
|  | 1515 | tcp_ecn | 
|  | 1516 | ------- | 
|  | 1517 |  | 
|  | 1518 | This file controls the use of the ECN bit in the IPv4 headers. ECN (Explicit | 
|  | 1519 | Congestion Notification) is a new feature, but some routers and firewalls | 
|  | 1520 | block traffic that has this bit set, so it may be necessary to echo 0 to | 
|  | 1521 | /proc/sys/net/ipv4/tcp_ecn if you want to talk to such sites. For more info | 
|  | 1522 | you could read RFC2481. | 
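
Writing a value to a file under /proc/sys can also be done from a program. The
helpers below are a minimal sketch with hypothetical names; the root parameter
exists only so that the code can be tried against a scratch directory, and
writing below the real /proc/sys normally requires root privileges.

```python
from pathlib import Path


def sysctl_path(name, root="/proc/sys"):
    """Map a dotted name such as 'net.ipv4.tcp_ecn' to its file under root."""
    return Path(root).joinpath(*name.split("."))


def read_sysctl(name, root="/proc/sys"):
    """Return the current value as a string without the trailing newline."""
    return sysctl_path(name, root).read_text().strip()


def write_sysctl(name, value, root="/proc/sys"):
    """Write a new value, like 'echo value > file' would."""
    sysctl_path(name, root).write_text("%s\n" % value)


print(sysctl_path("net.ipv4.tcp_ecn"))
```

With these helpers, write_sysctl("net.ipv4.tcp_ecn", 0) corresponds to echoing
0 into /proc/sys/net/ipv4/tcp_ecn.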
|  | 1523 |  | 
|  | 1524 | tcp_retrans_collapse | 
|  | 1525 | -------------------- | 
|  | 1526 |  | 
|  | 1527 | Bug-to-bug compatibility with some broken printers. On retransmit, try to send | 
|  | 1528 | larger packets to work around bugs in certain TCP stacks. Can be turned off by | 
|  | 1529 | setting it to zero. | 
|  | 1530 |  | 
|  | 1531 | tcp_keepalive_probes | 
|  | 1532 | -------------------- | 
|  | 1533 |  | 
|  | 1534 | Number of  keep  alive  probes  TCP  sends  out,  until  it  decides  that the | 
|  | 1535 | connection is broken. | 
|  | 1536 |  | 
|  | 1537 | tcp_keepalive_time | 
|  | 1538 | ------------------ | 
|  | 1539 |  | 
|  | 1540 | How often  TCP  sends out keep alive messages, when keep alive is enabled. The | 
|  | 1541 | default is 2 hours. | 
|  | 1542 |  | 
|  | 1543 | tcp_syn_retries | 
|  | 1544 | --------------- | 
|  | 1545 |  | 
|  | 1546 | Number of  times  initial  SYNs  for  a  TCP  connection  attempt  will  be | 
|  | 1547 | retransmitted. Should  not  be  higher  than 255. This applies only to | 
|  | 1548 | outgoing connections;  for  incoming  connections the number of retransmits is | 
|  | 1549 | defined by tcp_retries1. | 
|  | 1550 |  | 
|  | 1551 | tcp_sack | 
|  | 1552 | -------- | 
|  | 1553 |  | 
|  | 1554 | Enable selective acknowledgments as defined in RFC2018. | 
|  | 1555 |  | 
|  | 1556 | tcp_timestamps | 
|  | 1557 | -------------- | 
|  | 1558 |  | 
|  | 1559 | Enable timestamps as defined in RFC1323. | 
|  | 1560 |  | 
|  | 1561 | tcp_stdurg | 
|  | 1562 | ---------- | 
|  | 1563 |  | 
|  | 1564 | Enable the  strict  RFC793 interpretation of the TCP urgent pointer field. The | 
|  | 1565 | default is  to  use  the  BSD  compatible interpretation of the urgent pointer | 
|  | 1566 | pointing to the first byte after the urgent data. The RFC793 interpretation is | 
|  | 1567 | to have  it  point  to  the last byte of urgent data. Enabling this option may | 
|  | 1568 | lead to interoperability problems. Disabled by default. | 
|  | 1569 |  | 
|  | 1570 | tcp_syncookies | 
|  | 1571 | -------------- | 
|  | 1572 |  | 
|  | 1573 | Only valid  when  the  kernel  was  compiled  with CONFIG_SYNCOOKIES. Send out | 
|  | 1574 | syncookies when  the  syn backlog queue of a socket overflows. This is to ward | 
|  | 1575 | off the common 'syn flood attack'. Disabled by default. | 
|  | 1576 |  | 
|  | 1577 | Note that  the  concept  of a socket backlog is abandoned. This means the peer | 
|  | 1578 | may not  receive  reliable  error  messages  from  an  overloaded server with | 
|  | 1579 | syncookies enabled. | 
|  | 1580 |  | 
|  | 1581 | tcp_window_scaling | 
|  | 1582 | ------------------ | 
|  | 1583 |  | 
|  | 1584 | Enable window scaling as defined in RFC1323. | 
|  | 1585 |  | 
|  | 1586 | tcp_fin_timeout | 
|  | 1587 | --------------- | 
|  | 1588 |  | 
|  | 1589 | The length  of  time  in  seconds  it  takes to receive a final FIN before the | 
|  | 1590 | socket is  always  closed.  This  is  strictly  a  violation  of  the  TCP | 
|  | 1591 | specification, but required to prevent denial-of-service attacks. | 
|  | 1592 |  | 
|  | 1593 | tcp_max_ka_probes | 
|  | 1594 | ----------------- | 
|  | 1595 |  | 
|  | 1596 | Indicates how  many  keep alive probes are sent per slow timer run. Should not | 
|  | 1597 | be set too high to prevent bursts. | 
|  | 1598 |  | 
|  | 1599 | tcp_max_syn_backlog | 
|  | 1600 | ------------------- | 
|  | 1601 |  | 
|  | 1602 | Length of  the per socket backlog queue. Since Linux 2.2 the backlog specified | 
|  | 1603 | in listen(2)  only  specifies  the  length  of  the  backlog  queue of already | 
|  | 1604 | established sockets. When more connection requests arrive Linux starts to drop | 
|  | 1605 | packets. When  syncookies  are  enabled the packets are still answered and the | 
|  | 1606 | maximum queue is effectively ignored. | 
|  | 1607 |  | 
|  | 1608 | tcp_retries1 | 
|  | 1609 | ------------ | 
|  | 1610 |  | 
|  | 1611 | Defines how  often  an  answer  to  a  TCP connection request is retransmitted | 
|  | 1612 | before giving up. | 
|  | 1613 |  | 
|  | 1614 | tcp_retries2 | 
|  | 1615 | ------------ | 
|  | 1616 |  | 
|  | 1617 | Defines how often a TCP packet is retransmitted before giving up. | 
|  | 1618 |  | 
|  | 1619 | Interface specific settings | 
|  | 1620 | --------------------------- | 
|  | 1621 |  | 
|  | 1622 | In the directory /proc/sys/net/ipv4/conf you'll find one subdirectory for each | 
|  | 1623 | interface the  system  knows about and one directory called all. Changes in the | 
|  | 1624 | all subdirectory  affect  all  interfaces,  whereas  changes  in  the  other | 
|  | 1625 | subdirectories affect  only  one  interface.  All  directories  have  the same | 
|  | 1626 | entries: | 
|  | 1627 |  | 
|  | 1628 | accept_redirects | 
|  | 1629 | ---------------- | 
|  | 1630 |  | 
|  | 1631 | This switch  decides  if the kernel accepts ICMP redirect messages or not. The | 
|  | 1632 | default is 'yes' if the kernel is configured for a regular host and 'no' for a | 
|  | 1633 | router configuration. | 
|  | 1634 |  | 
|  | 1635 | accept_source_route | 
|  | 1636 | ------------------- | 
|  | 1637 |  | 
|  | 1638 | Determines whether source routed packets are accepted or declined. The default | 
|  | 1639 | is dependent on  the  kernel  configuration.  It's 'yes' for routers and 'no' | 
|  | 1640 | for hosts. | 
|  | 1641 |  | 
|  | 1642 | bootp_relay | 
|  | 1643 | ----------- | 
|  | 1644 |  | 
|  | 1645 | Accept packets  with source address 0.b.c.d with destinations not to this host | 
|  | 1646 | as local ones. It is supposed that a BOOTP relay daemon will catch and forward | 
|  | 1647 | such packets. | 
|  | 1648 |  | 
|  | 1649 | The default  is  0,  since this feature is not implemented yet (kernel version | 
|  | 1650 | 2.2.12). | 
|  | 1651 |  | 
|  | 1652 | forwarding | 
|  | 1653 | ---------- | 
|  | 1654 |  | 
|  | 1655 | Enable or disable IP forwarding on this interface. | 
|  | 1656 |  | 
|  | 1657 | log_martians | 
|  | 1658 | ------------ | 
|  | 1659 |  | 
|  | 1660 | Log packets whose source addresses have no known route to the kernel log. | 
|  | 1661 |  | 
|  | 1662 | mc_forwarding | 
|  | 1663 | ------------- | 
|  | 1664 |  | 
|  | 1665 | Do multicast routing. The kernel needs to be compiled with CONFIG_MROUTE and a | 
|  | 1666 | multicast routing daemon is required. | 
|  | 1667 |  | 
|  | 1668 | proxy_arp | 
|  | 1669 | --------- | 
|  | 1670 |  | 
|  | 1671 | Enable (1) or disable (0) proxy ARP. | 
|  | 1672 |  | 
|  | 1673 | rp_filter | 
|  | 1674 | --------- | 
|  | 1675 |  | 
|  | 1676 | Integer value determines if a source validation should be made. 1 means yes, 0 | 
|  | 1677 | means no.  Disabled by default, but local/broadcast address spoofing protection | 
|  | 1678 | is always on. | 
|  | 1679 |  | 
|  | 1680 | If you  set this to 1 on a router that is the only connection for a network to | 
|  | 1681 | the net,  it  will  prevent  spoofing  attacks  against your internal networks | 
|  | 1682 | (external addresses  can  still  be  spoofed), without the need for additional | 
|  | 1683 | firewall rules. | 
|  | 1684 |  | 
|  | 1685 | secure_redirects | 
|  | 1686 | ---------------- | 
|  | 1687 |  | 
|  | 1688 | Accept ICMP  redirect  messages  only  for gateways listed in the default | 
|  | 1689 | gateway list. Enabled by default. | 
|  | 1690 |  | 
|  | 1691 | shared_media | 
|  | 1692 | ------------ | 
|  | 1693 |  | 
|  | 1694 | If it  is  not  set  the kernel does not assume that different subnets on this | 
|  | 1695 | device can communicate directly. Default setting is 'yes'. | 
|  | 1696 |  | 
|  | 1697 | send_redirects | 
|  | 1698 | -------------- | 
|  | 1699 |  | 
|  | 1700 | Determines whether to send ICMP redirects to other hosts. | 
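
Since every interface directory holds the same entries, one entry can be
compared across all interfaces with a short loop. This is a sketch with a
hypothetical function name; the conf_root parameter defaults to the directory
described in this section and can be pointed at a scratch directory for
experiments.

```python
from pathlib import Path


def read_conf_entry(entry, conf_root="/proc/sys/net/ipv4/conf"):
    """Return {directory_name: value} for one entry, e.g. 'rp_filter'."""
    values = {}
    for iface_dir in sorted(Path(conf_root).iterdir()):
        entry_file = iface_dir / entry
        if entry_file.is_file():
            values[iface_dir.name] = entry_file.read_text().strip()
    return values
```

On a live system, read_conf_entry("rp_filter") would return one value per
interface directory, including the all directory.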
|  | 1701 |  | 
|  | 1702 | Routing settings | 
|  | 1703 | ---------------- | 
|  | 1704 |  | 
|  | 1705 | The directory  /proc/sys/net/ipv4/route  contains  several  files  to  control | 
|  | 1706 | routing issues. | 
|  | 1707 |  | 
|  | 1708 | error_burst and error_cost | 
|  | 1709 | -------------------------- | 
|  | 1710 |  | 
|  | 1711 | These parameters are used to limit how many ICMP destination unreachable | 
|  | 1712 | messages are sent from the host in question. ICMP destination unreachable | 
|  | 1713 | messages are sent when we cannot reach the next hop while trying to transmit | 
|  | 1714 | a packet. The kernel will also print some error messages to the kernel log if | 
|  | 1715 | someone is ignoring our ICMP redirects. The higher the error_cost factor is, | 
|  | 1716 | the fewer destination unreachable and error messages will be let through. | 
|  | 1717 | error_burst controls when destination unreachable messages and error messages | 
|  | 1718 | will be dropped. The default settings limit warning messages to five per second. | 
|  | 1719 |  | 
|  | 1720 | flush | 
|  | 1721 | ----- | 
|  | 1722 |  | 
|  | 1723 | Writing to this file results in a flush of the routing cache. | 
|  | 1724 |  | 
|  | 1725 | gc_elasticity, gc_interval, gc_min_interval_ms, gc_timeout, gc_thresh | 
|  | 1726 | --------------------------------------------------------------------- | 
|  | 1727 |  | 
|  | 1728 | Values to  control  the  frequency  and  behavior  of  the  garbage collection | 
|  | 1729 | algorithm for the routing cache. gc_min_interval is deprecated and replaced | 
|  | 1730 | by gc_min_interval_ms. | 
|  | 1731 |  | 
|  | 1732 |  | 
|  | 1733 | max_size | 
|  | 1734 | -------- | 
|  | 1735 |  | 
|  | 1736 | Maximum size  of  the routing cache. Old entries will be purged once the cache | 
|  | 1737 | has reached this size. | 
|  | 1738 |  | 
|  | 1739 | max_delay, min_delay | 
|  | 1740 | -------------------- | 
|  | 1741 |  | 
|  | 1742 | Delays for flushing the routing cache. | 
|  | 1743 |  | 
|  | 1744 | redirect_load, redirect_number | 
|  | 1745 | ------------------------------ | 
|  | 1746 |  | 
|  | 1747 | Factors which  determine  if  more ICMP redirects should be sent to a specific | 
|  | 1748 | host. No  redirects  will be sent once the load limit or the maximum number of | 
|  | 1749 | redirects has been reached. | 
|  | 1750 |  | 
|  | 1751 | redirect_silence | 
|  | 1752 | ---------------- | 
|  | 1753 |  | 
|  | 1754 | Timeout for redirects. After this period redirects will be sent again, even if | 
|  | 1755 | this has been stopped, because the load or number limit has been reached. | 
|  | 1756 |  | 
|  | 1757 | Network Neighbor handling | 
|  | 1758 | ------------------------- | 
|  | 1759 |  | 
|  | 1760 | Settings about how to handle connections with direct neighbors (nodes attached | 
|  | 1761 | to the same link) can be found in the directory /proc/sys/net/ipv4/neigh. | 
|  | 1762 |  | 
|  | 1763 | As we  saw  in  the  conf directory, there is a default subdirectory which | 
|  | 1764 | holds the  default  values, and one directory for each interface. The contents | 
|  | 1765 | of the  directories  are identical, with the single exception that the default | 
|  | 1766 | settings contain additional options to set garbage collection parameters. | 
|  | 1767 |  | 
|  | 1768 | In the interface directories you'll find the following entries: | 
|  | 1769 |  | 
|  | 1770 | base_reachable_time, base_reachable_time_ms | 
|  | 1771 | ------------------------------------------- | 
|  | 1772 |  | 
|  | 1773 | A base  value  used for computing the random reachable time value as specified | 
|  | 1774 | in RFC2461. | 
|  | 1775 |  | 
|  | 1776 | Expression of base_reachable_time, which is deprecated, is in seconds. | 
|  | 1777 | Expression of base_reachable_time_ms is in milliseconds. | 
|  | 1778 |  | 
|  | 1779 | retrans_time, retrans_time_ms | 
|  | 1780 | ----------------------------- | 
|  | 1781 |  | 
|  | 1782 | The time between retransmitted Neighbor Solicitation messages. | 
|  | 1783 | Used for address resolution and to determine if a neighbor is | 
|  | 1784 | unreachable. | 
|  | 1785 |  | 
|  | 1786 | Expression of retrans_time, which is deprecated, is in 1/100 seconds (for | 
|  | 1787 | IPv4) or in jiffies (for IPv6). | 
|  | 1788 | Expression of retrans_time_ms is in milliseconds. | 
|  | 1789 |  | 
|  | 1790 | unres_qlen | 
|  | 1791 | ---------- | 
|  | 1792 |  | 
|  | 1793 | Maximum queue  length  for a pending ARP request - the number of packets which | 
|  | 1794 | are accepted from other layers while the ARP address is still being resolved. | 
|  | 1795 |  | 
|  | 1796 | anycast_delay | 
|  | 1797 | ------------- | 
|  | 1798 |  | 
|  | 1799 | Maximum for  random  delay  of  answers  to  neighbor solicitation messages in | 
|  | 1800 | jiffies (1/100  sec). Not yet implemented (Linux does not have anycast support | 
|  | 1801 | yet). | 
|  | 1802 |  | 
|  | 1803 | ucast_solicit | 
|  | 1804 | ------------- | 
|  | 1805 |  | 
|  | 1806 | Maximum number of retries for unicast solicitation. | 
|  | 1807 |  | 
|  | 1808 | mcast_solicit | 
|  | 1809 | ------------- | 
|  | 1810 |  | 
|  | 1811 | Maximum number of retries for multicast solicitation. | 
|  | 1812 |  | 
|  | 1813 | delay_first_probe_time | 
|  | 1814 | ---------------------- | 
|  | 1815 |  | 
|  | 1816 | Delay for  the  first  time  probe  if  the  neighbor  is  reachable.  (see | 
|  | 1817 | gc_stale_time) | 
|  | 1818 |  | 
|  | 1819 | locktime | 
|  | 1820 | -------- | 
|  | 1821 |  | 
|  | 1822 | An ARP/neighbor  entry  is only replaced with a new one if the old is at least | 
|  | 1823 | locktime old. This prevents ARP cache thrashing. | 
|  | 1824 |  | 
|  | 1825 | proxy_delay | 
|  | 1826 | ----------- | 
|  | 1827 |  | 
|  | 1828 | Maximum time  (real  time is random [0..proxytime]) before answering to an ARP | 
|  | 1829 | request for  which  we have a proxy ARP entry. In some cases, this is used to | 
|  | 1830 | prevent network flooding. | 
|  | 1831 |  | 
|  | 1832 | proxy_qlen | 
|  | 1833 | ---------- | 
|  | 1834 |  | 
|  | 1835 | Maximum queue length of the delayed proxy arp timer. (see proxy_delay). | 
|  | 1836 |  | 
|  | 1837 | app_solicit | 
|  | 1838 | ----------- | 
|  | 1839 |  | 
|  | 1840 | Determines the  number of requests to send to the user level ARP daemon. Use 0 | 
|  | 1841 | to turn off. | 
|  | 1842 |  | 
|  | 1843 | gc_stale_time | 
|  | 1844 | ------------- | 
|  | 1845 |  | 
|  | 1846 | Determines how often to check for stale ARP entries. After an ARP entry is | 
|  | 1847 | stale it will be resolved again (which is useful when an IP address migrates | 
|  | 1848 | to another machine). When ucast_solicit is greater than 0, it first tries to | 
|  | 1849 | send an ARP packet directly to the known host. When that fails and | 
|  | 1850 | mcast_solicit is greater than 0, an ARP request is broadcast. | 
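
The neighbor parameters described above are ordinary files, so they can be
inspected with a short loop. The sketch below assumes they live in a directory
such as /proc/sys/net/ipv4/neigh/default (a path not spelled out in this
section). The function name show_neigh_params is hypothetical. Parameters that
a given kernel does not provide are skipped silently.

```shell
# Hypothetical helper: print "name = value" for each neighbor parameter
# found in the given directory (assumed default: the ipv4 "default" settings).
show_neigh_params() {
    dir=${1:-/proc/sys/net/ipv4/neigh/default}
    for name in unres_qlen ucast_solicit mcast_solicit \
                delay_first_probe_time locktime proxy_delay \
                proxy_qlen app_solicit gc_stale_time; do
        # Skip parameters this kernel does not provide.
        if [ -r "$dir/$name" ]; then
            echo "$name = $(cat "$dir/$name")"
        fi
    done
    return 0
}

show_neigh_params
```

Passing a different directory as the first argument, for example the one
belonging to a specific interface, shows that interface's settings instead.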
|  | 1851 |  | 
|  | 1852 | 2.9 Appletalk | 
|  | 1853 | ------------- | 
|  | 1854 |  | 
|  | 1855 | The /proc/sys/net/appletalk  directory  holds the Appletalk configuration data | 
|  | 1856 | when Appletalk is loaded. The configurable parameters are: | 
|  | 1857 |  | 
|  | 1858 | aarp-expiry-time | 
|  | 1859 | ---------------- | 
|  | 1860 |  | 
|  | 1861 | The amount  of  time  we keep an ARP entry before expiring it. Used to age out | 
|  | 1862 | old hosts. | 
|  | 1863 |  | 
|  | 1864 | aarp-resolve-time | 
|  | 1865 | ----------------- | 
|  | 1866 |  | 
|  | 1867 | The amount of time we will spend trying to resolve an Appletalk address. | 
|  | 1868 |  | 
|  | 1869 | aarp-retransmit-limit | 
|  | 1870 | --------------------- | 
|  | 1871 |  | 
|  | 1872 | The number of times we will retransmit a query before giving up. | 
|  | 1873 |  | 
|  | 1874 | aarp-tick-time | 
|  | 1875 | -------------- | 
|  | 1876 |  | 
|  | 1877 | Controls the rate at which expirations are checked. | 
|  | 1878 |  | 
|  | 1879 | The directory  /proc/net/appletalk  holds the list of active Appletalk sockets | 
|  | 1880 | on a machine. | 
|  | 1881 |  | 
|  | 1882 | The fields indicate the DDP type, the local address (in network:node format), | 
|  | 1883 | the remote address, the size of the transmit pending queue, the size of the | 
|  | 1884 | receive queue (bytes waiting for applications to read), the state, and the uid | 
|  | 1885 | owning the socket. | 
|  | 1886 |  | 
|  | 1887 | /proc/net/atalk_iface lists all the interfaces configured for Appletalk. It | 
|  | 1888 | shows the name of the interface, its Appletalk address, the network range on | 
|  | 1889 | that address (or network number for phase 1 networks), and the status of the | 
|  | 1890 | interface. | 
|  | 1891 |  | 
|  | 1892 | /proc/net/atalk_route lists  each  known  network  route.  It lists the target | 
|  | 1893 | (network) that the route leads to, the router (may be directly connected), the | 
|  | 1894 | route flags, and the device the route is using. | 
|  | 1895 |  | 
|  | 1896 | 2.10 IPX | 
|  | 1897 | -------- | 
|  | 1898 |  | 
|  | 1899 | The IPX protocol has no tunable values in /proc/sys/net. | 
|  | 1900 |  | 
|  | 1901 | The IPX  protocol  does,  however,  provide /proc/net/ipx. This lists each IPX | 
|  | 1902 | socket giving  the  local  and  remote  addresses  in  Novell  format (that is | 
|  | 1903 | network:node:port). In  accordance  with  the  strange  Novell  tradition, | 
|  | 1904 | everything but the port is in hex. Not_Connected is displayed for sockets that | 
|  | 1905 | are not  tied to a specific remote address. The Tx and Rx queue sizes indicate | 
|  | 1906 | the number  of  bytes  pending  for  transmission  and  reception.  The  state | 
|  | 1907 | indicates the  state  the  socket  is  in and the uid is the owning uid of the | 
|  | 1908 | socket. | 
|  | 1909 |  | 
|  | 1910 | The /proc/net/ipx_interface  file lists all IPX interfaces. For each interface | 
|  | 1911 | it gives  the network number, the node number, and indicates if the network is | 
|  | 1912 | the primary  network.  It  also  indicates  which  device  it  is bound to (or | 
|  | 1913 | Internal for  internal  networks)  and  the  Frame  Type if appropriate. Linux | 
|  | 1914 | supports 802.3,  802.2,  802.2  SNAP  and DIX (Blue Book) ethernet framing for | 
|  | 1915 | IPX. | 
|  | 1916 |  | 
|  | 1917 | The /proc/net/ipx_route  table  holds  a list of IPX routes. For each route it | 
|  | 1918 | gives the  destination  network, the router node (or Directly) and the network | 
|  | 1919 | address of the router (or Connected) for internal networks. | 
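
To make the network:node:port address format more concrete, the sketch below
splits such an address into its three parts using plain shell parameter
expansion. The function name split_ipx_addr is hypothetical and the sample
address is made up for illustration, not taken from a real system. The port is
printed exactly as found, without assuming a number base.

```shell
# Hypothetical helper: split an address in network:node:port form.
# The network and node parts are shown with a 0x prefix since they are hex.
split_ipx_addr() {
    addr=$1
    network=${addr%%:*}        # text before the first colon
    rest=${addr#*:}            # text after the first colon
    node=${rest%%:*}           # text between the first and second colon
    port=${rest#*:}            # text after the second colon
    echo "network=0x$network node=0x$node port=$port"
}

# Made-up sample address, for illustration only.
split_ipx_addr 0000000A:00A0C9123456:4004
```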
|  | 1920 |  | 
|  | 1921 | 2.11 /proc/sys/fs/mqueue - POSIX message queues filesystem | 
|  | 1922 | ---------------------------------------------------------- | 
|  | 1923 |  | 
|  | 1924 | The "mqueue"  filesystem provides  the necessary kernel features to enable the | 
|  | 1925 | creation of a  user space  library that  implements  the  POSIX message queues | 
|  | 1926 | API (as noted by the  MSG tag in the  POSIX 1003.1-2001 version  of the System | 
|  | 1927 | Interfaces specification). | 
|  | 1928 |  | 
|  | 1929 | The "mqueue" filesystem contains values for determining/setting  the amount of | 
|  | 1930 | resources used by the file system. | 
|  | 1931 |  | 
|  | 1932 | /proc/sys/fs/mqueue/queues_max is a read/write  file for  setting/getting  the | 
|  | 1933 | maximum number of message queues allowed on the system. | 
|  | 1934 |  | 
|  | 1935 | /proc/sys/fs/mqueue/msg_max is a read/write file for setting/getting the | 
|  | 1936 | maximum number of messages in a queue. It is the upper bound for the per-queue | 
|  | 1937 | (user) limit that is set in the mq_open invocation: this attribute of a queue | 
|  | 1938 | must be less than or equal to msg_max. | 
|  | 1939 |  | 
|  | 1940 | /proc/sys/fs/mqueue/msgsize_max is a read/write file for setting/getting the | 
|  | 1941 | maximum message size (an attribute of every message queue, set during its | 
|  | 1942 | creation). | 
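
The three limits can be read like any other file under /proc/sys. The sketch
below prints whichever of them are present. The function name
show_mqueue_limits is hypothetical, and the value in the commented-out write
is an arbitrary example, not a recommendation.

```shell
# Hypothetical helper: print the mqueue limits found in the given
# directory (default: /proc/sys/fs/mqueue).
show_mqueue_limits() {
    dir=${1:-/proc/sys/fs/mqueue}
    for name in queues_max msg_max msgsize_max; do
        # Skip files that are missing, e.g. when mqueue support is absent.
        if [ -r "$dir/$name" ]; then
            echo "$name: $(cat "$dir/$name")"
        fi
    done
    return 0
}

show_mqueue_limits

# Changing a limit normally requires root; 512 is an arbitrary example value.
# echo 512 > /proc/sys/fs/mqueue/queues_max
```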
|  | 1943 |  | 
|  | 1944 |  | 
|  | 1945 | ------------------------------------------------------------------------------ | 
|  | 1946 | Summary | 
|  | 1947 | ------------------------------------------------------------------------------ | 
|  | 1948 | Certain aspects  of  kernel  behavior  can be modified at runtime, without the | 
|  | 1949 | need to  recompile  the kernel, or even to reboot the system. The files in the | 
|  | 1950 | /proc/sys tree  can  not only be read, but also modified. You can use the echo | 
|  | 1951 | command to write values into these files, thereby changing the default settings | 
|  | 1952 | of the kernel. | 
|  | 1953 | ------------------------------------------------------------------------------ |
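
The echo-based workflow summarized above can be wrapped in a small function
that writes a value and reads it back to confirm the change. This is a minimal
sketch: the function name set_param is hypothetical, and the path and value in
the commented-out example are illustrative only.

```shell
# Hypothetical helper: write a value to a file under /proc/sys and print
# what the file contains afterwards. Returns non-zero if the file is not
# writable (typically because the caller is not root).
set_param() {
    file=$1
    value=$2
    if [ ! -w "$file" ]; then
        echo "cannot write $file" >&2
        return 1
    fi
    echo "$value" > "$file"
    cat "$file"
}

# Example (as root); path and value are illustrative only:
# set_param /proc/sys/fs/mqueue/msg_max 20
```

Reading the value back matters because the kernel may reject or adjust a value
that is out of range for a given parameter.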