Documentation for /proc/sys/fs/*	kernel version 2.2.10
	(c) 1998, 1999,  Rik van Riel <riel@nl.linux.org>
	(c) 2009,        Shen Feng <shen@cn.fujitsu.com>

For general info and legal blurb, please look in README.

==============================================================

This file contains documentation for the sysctl files in
/proc/sys/fs/ and is valid for Linux kernel version 2.2.

The files in this directory can be used to tune and monitor
miscellaneous and general things in the operation of the Linux
kernel. Since some of the files _can_ be used to screw up your
system, it is advisable to read both documentation and source
before actually making adjustments.

1. /proc/sys/fs
----------------------------------------------------------

Currently, these files are in /proc/sys/fs:
- aio-max-nr
- aio-nr
- dentry-state
- dquot-max
- dquot-nr
- file-max
- file-nr
- inode-max
- inode-nr
- inode-state
- nr_open
- overflowuid
- overflowgid
- suid_dumpable
- super-max
- super-nr

==============================================================

aio-nr & aio-max-nr:

aio-nr is the running total of the number of events specified on the
io_setup system call for all currently active aio contexts.  If aio-nr
reaches aio-max-nr then io_setup will fail with EAGAIN.  Note that
raising aio-max-nr does not result in the pre-allocation or re-sizing
of any kernel data structures.
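
As a rough illustration of the relationship between the two files, the
following Python sketch computes how many more events could still be
requested through io_setup before it would fail with EAGAIN. The function
name aio_headroom is hypothetical, and the arguments are assumed to be the
raw text read from the two files.

```python
def aio_headroom(aio_nr_text, aio_max_nr_text):
    """Return how many more events can be reserved before aio-max-nr is hit.

    Both arguments are the text contents of the respective sysctl files
    (assumed to hold a single decimal integer each).
    """
    aio_nr = int(aio_nr_text.strip())
    aio_max_nr = int(aio_max_nr_text.strip())
    # Clamp at zero in case aio-max-nr was lowered below the current total.
    return max(aio_max_nr - aio_nr, 0)
```

On a live system the two strings would come from reading
/proc/sys/fs/aio-nr and /proc/sys/fs/aio-max-nr.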

==============================================================

dentry-state:

From linux/fs/dentry.c:
--------------------------------------------------------------
struct {
        int nr_dentry;
        int nr_unused;
        int age_limit;         /* age in seconds */
        int want_pages;        /* pages requested by system */
        int dummy[2];
} dentry_stat = {0, 0, 45, 0,};
--------------------------------------------------------------

Dentries are dynamically allocated and deallocated, and
nr_dentry seems to be 0 all the time. Hence it's safe to
assume that only nr_unused, age_limit and want_pages are
used. nr_unused seems to be exactly what its name says.
age_limit is the age in seconds after which dcache entries
can be reclaimed when memory is short, and want_pages is
nonzero when shrink_dcache_pages() has been called and the
dcache isn't pruned yet.
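
The struct above can be mapped onto the contents of dentry-state with a
short Python sketch. It assumes the file presents the six integers of the
struct as whitespace-separated decimal values in declaration order; the
names DentryState and parse_dentry_state are hypothetical.

```python
from collections import namedtuple

# Field names follow the struct quoted above; "dummy" holds the two
# padding integers as a tuple.
DentryState = namedtuple(
    "DentryState",
    ["nr_dentry", "nr_unused", "age_limit", "want_pages", "dummy"],
)

def parse_dentry_state(text):
    """Parse the text of dentry-state into a DentryState tuple."""
    fields = [int(x) for x in text.split()]
    if len(fields) != 6:
        raise ValueError("expected 6 integers, got %d" % len(fields))
    return DentryState(fields[0], fields[1], fields[2], fields[3],
                       tuple(fields[4:]))
```

The text would normally be read from /proc/sys/fs/dentry-state.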

==============================================================

dquot-max & dquot-nr:

The file dquot-max shows the maximum number of cached disk
quota entries.

The file dquot-nr shows the number of allocated disk quota
entries and the number of free disk quota entries.

If the number of free cached disk quotas is very low and
you have a very large number of simultaneous system users,
you might want to raise the limit.

==============================================================

file-max & file-nr:

The value in file-max denotes the maximum number of file
handles that the Linux kernel will allocate. When you get lots
of error messages about running out of file handles, you might
want to increase this limit.

Historically, the kernel was able to allocate file handles
dynamically, but not to free them again. The three values in
file-nr denote the number of allocated file handles, the number
of allocated but unused file handles, and the maximum number of
file handles. Linux 2.6 always reports 0 as the number of free
file handles -- this is not an error, it just means that the
number of allocated file handles exactly matches the number of
used file handles.

Attempts to allocate more file descriptors than file-max are
reported with printk; look for "VFS: file-max limit <number>
reached".
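
The three values in file-nr can be turned into a usage figure with a
sketch like the following. The function name parse_file_nr and the
dictionary keys are hypothetical; the field order is the one described
above.

```python
def parse_file_nr(text):
    """Parse the text of file-nr into named values.

    The three whitespace-separated integers are: allocated file handles,
    allocated but unused file handles, and the maximum (file-max).
    """
    allocated, unused, maximum = (int(x) for x in text.split())
    return {
        "allocated": allocated,
        "unused": unused,
        "max": maximum,
        # Handles actually in use; equals "allocated" when unused is 0.
        "in_use": allocated - unused,
    }
```

Comparing "in_use" against "max" gives an idea of how close the system is
to the file-max limit.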

==============================================================

nr_open:

This denotes the maximum number of file handles a process can
allocate. The default value is 1024*1024 (1048576), which should be
enough for most machines. The actual limit depends on the
RLIMIT_NOFILE resource limit.
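
A minimal sketch of how the two limits combine, under the assumption that
a process is bounded by the smaller of nr_open and its soft RLIMIT_NOFILE
value. The function name effective_fd_limit is hypothetical.

```python
import resource

def effective_fd_limit(nr_open, soft_limit):
    """Return the per-process bound on open file handles.

    Assumption: the process is limited by its soft RLIMIT_NOFILE value,
    which in turn cannot usefully exceed nr_open.
    """
    if soft_limit == resource.RLIM_INFINITY:
        # No rlimit in effect, so only nr_open applies.
        return nr_open
    return min(nr_open, soft_limit)
```

The soft limit of the current process is the first element of the pair
returned by resource.getrlimit(resource.RLIMIT_NOFILE).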

==============================================================

inode-max, inode-nr & inode-state:

As with file handles, the kernel allocates the inode structures
dynamically, but can't free them yet.

The value in inode-max denotes the maximum number of inode
handlers. This value should be 3-4 times larger than the value
in file-max, since stdin, stdout and network sockets also
need an inode struct to handle them. When you regularly run
out of inodes, you need to increase this value.

The file inode-nr contains the first two items from
inode-state, so we'll skip to that file...

inode-state contains three actual numbers and four dummies.
The actual numbers are, in order of appearance, nr_inodes,
nr_free_inodes and preshrink.

nr_inodes stands for the number of inodes the system has
allocated; this can be slightly more than inode-max because
Linux allocates them one pageful at a time.

nr_free_inodes represents the number of free inodes (?) and
preshrink is nonzero when nr_inodes > inode-max and the
system needs to prune the inode list instead of allocating
more.
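
The layout described above (three meaningful numbers followed by four
dummies) could be parsed as in the sketch below. It assumes the seven
values are whitespace-separated decimal integers; the function name
parse_inode_state is hypothetical.

```python
def parse_inode_state(text):
    """Parse the text of inode-state, ignoring the four dummy values."""
    fields = [int(x) for x in text.split()]
    if len(fields) != 7:
        raise ValueError("expected 7 integers, got %d" % len(fields))
    nr_inodes, nr_free_inodes, preshrink = fields[:3]
    return {
        "nr_inodes": nr_inodes,
        "nr_free_inodes": nr_free_inodes,
        # Nonzero when the inode list needs pruning.
        "preshrink": preshrink,
    }
```

Since inode-nr holds only the first two of these items, the same split
applied to that file yields nr_inodes and nr_free_inodes.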

==============================================================

overflowgid & overflowuid:

Some filesystems only support 16-bit UIDs and GIDs, although in Linux
UIDs and GIDs are 32 bits. When one of these filesystems is mounted
with writes enabled, any UID or GID that would exceed 65535 is translated
to a fixed value before being written to disk.

These sysctls allow you to change the value of the fixed UID and GID.
The default is 65534.
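
The translation rule can be sketched in a few lines of Python. This only
illustrates the rule as stated above and is not the kernel's code; the
names id_for_16bit_fs and DEFAULT_OVERFLOW_ID are hypothetical.

```python
DEFAULT_OVERFLOW_ID = 65534  # default value of overflowuid / overflowgid

def id_for_16bit_fs(uid_or_gid, overflow_id=DEFAULT_OVERFLOW_ID):
    """Return the ID that would be written to a 16-bit-only filesystem."""
    if uid_or_gid > 65535:
        # Does not fit in 16 bits, so the fixed overflow value is used.
        return overflow_id
    return uid_or_gid
```

For example, a UID of 100000 would be stored as 65534 with the default
setting, while a UID of 1000 is stored unchanged.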

==============================================================

suid_dumpable:

This value can be used to query and set the core dump mode for setuid
or otherwise protected/tainted binaries. The modes are:

0 - (default) - traditional behaviour. Any process which has changed
	privilege levels or is execute only will not be dumped.
1 - (debug) - all processes dump core when possible. The core dump is
	owned by the current user and no security is applied. This is
	intended for system debugging situations only. Ptrace is unchecked.
2 - (suidsafe) - any binary which normally would not be dumped is dumped
	readable by root only. This allows the end user to remove
	such a dump but not access it directly. For security reasons
	core dumps in this mode will not overwrite one another or
	other files. This mode is appropriate when administrators are
	attempting to debug problems in a normal environment.
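
For scripts that inspect this setting, the three modes listed above can
be mapped to their names as in the sketch below. The table and function
names are hypothetical; the mode names are the ones given in parentheses
above.

```python
SUID_DUMPABLE_MODES = {
    0: "default",   # privileged or execute-only processes are not dumped
    1: "debug",     # everything dumps core, no security applied
    2: "suidsafe",  # otherwise undumpable binaries dump readable by root only
}

def describe_suid_dumpable(text):
    """Translate the text of suid_dumpable into its mode name."""
    mode = int(text.strip())
    if mode not in SUID_DUMPABLE_MODES:
        raise ValueError("unknown suid_dumpable mode: %d" % mode)
    return SUID_DUMPABLE_MODES[mode]
```

The text would normally be read from /proc/sys/fs/suid_dumpable.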

==============================================================

super-max & super-nr:

These numbers control the maximum number of superblocks, and
thus the maximum number of mounted filesystems the kernel
can have. You only need to increase super-max if you need to
mount more filesystems than the current value in super-max
allows you to.

==============================================================

aio-nr & aio-max-nr:

aio-nr shows the current system-wide number of asynchronous I/O
requests.  aio-max-nr allows you to change the maximum value
aio-nr can grow to.

==============================================================


2. /proc/sys/fs/binfmt_misc
----------------------------------------------------------

Documentation for the files in /proc/sys/fs/binfmt_misc is
in Documentation/binfmt_misc.txt.


3. /proc/sys/fs/mqueue - POSIX message queues filesystem
----------------------------------------------------------

The "mqueue" filesystem provides the necessary kernel features to enable the
creation of a user space library that implements the POSIX message queues
API (as noted by the MSG tag in the POSIX 1003.1-2001 version of the System
Interfaces specification).

The "mqueue" filesystem contains values for determining/setting the amount of
resources used by the file system.

/proc/sys/fs/mqueue/queues_max is a read/write file for setting/getting the
maximum number of message queues allowed on the system.

/proc/sys/fs/mqueue/msg_max is a read/write file for setting/getting the
maximum number of messages in a queue. In fact it is the upper bound
for another (user) limit which is set in the mq_open invocation. This
attribute of a queue must be less than or equal to msg_max.

/proc/sys/fs/mqueue/msgsize_max is a read/write file for setting/getting the
maximum message size (an attribute of every message queue, set during
its creation).
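
The limits above can be checked before a queue is created, as in the
following sketch. It assumes that the per-queue attributes are the
mq_maxmsg and mq_msgsize fields of the POSIX mq_attr structure and that
both must stay within the corresponding system-wide value; the function
name check_mq_attr is hypothetical.

```python
def check_mq_attr(mq_maxmsg, mq_msgsize, msg_max, msgsize_max):
    """Return a list of problems with the requested queue attributes.

    msg_max and msgsize_max are the integer values of the two sysctl
    files; an empty list means the request is within both limits.
    """
    problems = []
    if mq_maxmsg > msg_max:
        problems.append("mq_maxmsg exceeds msg_max")
    if mq_msgsize > msgsize_max:
        problems.append("mq_msgsize exceeds msgsize_max")
    return problems
```

A request exactly at the limits passes, since the attribute only has to
be less than or equal to the system-wide value.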


4. /proc/sys/fs/epoll - Configuration options for the epoll interface
----------------------------------------------------------

This directory contains configuration options for the epoll(7) interface.

max_user_instances
------------------

This is the maximum number of epoll file descriptors that a single user can
have open at a given time. The default value is 128, and should be enough
for normal users.

max_user_watches
----------------

Every epoll file descriptor can store a number of files to be monitored
for event readiness. Each one of these monitored files constitutes a "watch".
This configuration option sets the maximum number of "watches" that are
allowed for each user.
Each "watch" costs roughly 90 bytes on a 32-bit kernel, and roughly 160 bytes
on a 64-bit one.
The current default value for max_user_watches is 1/32 of the available
low memory, divided by the "watch" cost in bytes.
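
The default described above can be worked through with a short Python
sketch. It uses the approximate per-watch costs quoted in this section
and integer division as a simplification; the function name
default_max_user_watches is hypothetical.

```python
def default_max_user_watches(low_memory_bytes, watch_cost_bytes):
    """Return 1/32 of low memory divided by the per-watch cost."""
    budget = low_memory_bytes // 32  # share of low memory set aside
    return budget // watch_cost_bytes

# Worked example for 1 GiB of low memory.
ONE_GIB = 2 ** 30
watches_64bit = default_max_user_watches(ONE_GIB, 160)  # 209715
watches_32bit = default_max_user_watches(ONE_GIB, 90)   # 372827
```

With 1 GiB of low memory this gives about 210000 watches per user on a
64-bit kernel and about 373000 on a 32-bit one.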