The Linux kernel supports the following overcommit handling modes:

0	-	Heuristic overcommit handling. Obvious overcommits of
		address space are refused. Used for a typical system. It
		ensures a seriously wild allocation fails while allowing
		overcommit to reduce swap usage. root is allowed to
		allocate slightly more memory in this mode. This is the
		default.

1	-	Always overcommit. Appropriate for some scientific
		applications. The classic example is code that uses sparse
		arrays and relies on the virtual memory consisting almost
		entirely of zero pages.

2	-	Don't overcommit. The total address space commit
		for the system is not permitted to exceed swap + a
		configurable percentage (default is 50) of physical RAM.
		Depending on the percentage you use, in most situations
		this means a process will not be killed while accessing
		pages but will receive errors on memory allocation as
		appropriate.

		Useful for applications that want to guarantee their
		memory allocations will be available in the future
		without having to initialize every page.

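The sparse-array pattern mentioned under mode 1 can be sketched as
follows. This is only an illustration: the slot layout, the helper
names (make_sparse, put, get) and the deliberately small 16 MiB size
are assumptions made for the example, not anything the kernel defines.
Real users of this pattern typically map far more address space than
they ever touch.

```python
import mmap
import struct

ITEM = struct.Struct("<q")      # one signed 64-bit slot per element
N_ITEMS = 2 * 1024 * 1024       # 2 Mi slots = 16 MiB of address space


def make_sparse(n_items=N_ITEMS):
    """Create an anonymous mapping of n_items slots without writing to it."""
    return mmap.mmap(-1, n_items * ITEM.size)


def put(arr, index, value):
    """Store value in slot index; this writes to a single page."""
    ITEM.pack_into(arr, index * ITEM.size, value)


def get(arr, index):
    """Read slot index; slots never written read back as zero."""
    return ITEM.unpack_from(arr, index * ITEM.size)[0]


arr = make_sparse()
put(arr, 5, 42)
put(arr, N_ITEMS - 1, -7)
```

Only two slots are written here, yet the whole range is addressable
and every other slot reads as zero.
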
The overcommit policy is set via the sysctl `vm.overcommit_memory'.

The overcommit percentage is set via `vm.overcommit_ratio'.

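As a worked example of the mode 2 rule (swap plus a percentage of
physical RAM), the limit can be computed as below. The function name
and the kB units are assumptions chosen for illustration, and the
sketch models only the rule as stated in this document; a running
kernel may apply further adjustments that are not modelled here.

```python
def commit_limit_kb(ram_kb, swap_kb, overcommit_ratio=50):
    """Mode 2 limit per the rule above: swap + ratio% of physical RAM."""
    if ram_kb < 0 or swap_kb < 0 or overcommit_ratio < 0:
        raise ValueError("sizes and ratio must be non-negative")
    # Integer arithmetic, rounding the RAM share down to a whole kB.
    return swap_kb + ram_kb * overcommit_ratio // 100


# 8 GiB of RAM and 2 GiB of swap, expressed in kB.
default_limit = commit_limit_kb(8388608, 2097152)       # ratio 50
full_limit = commit_limit_kb(8388608, 2097152, 100)     # ratio 100
```

With the default ratio of 50 this gives 6291456 kB (6 GiB); raising
the ratio to 100 gives 10485760 kB (10 GiB).
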
The current overcommit limit and amount committed are viewable in
/proc/meminfo as CommitLimit and Committed_AS respectively.

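A minimal sketch of reading those two fields follows. The helper
names (parse_meminfo, commit_headroom_kb) are hypothetical, and the
SAMPLE text contains made-up values in the usual "Name: value kB"
layout so that the example runs without access to /proc.

```python
def parse_meminfo(text):
    """Return {field name: integer value} for 'Name:  value [unit]' lines."""
    fields = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue                    # not a 'Name: value' line
        parts = rest.split()
        if parts and parts[0].isdigit():
            fields[name.strip()] = int(parts[0])
    return fields


def commit_headroom_kb(fields):
    """Address space still committable before CommitLimit is reached."""
    return fields["CommitLimit"] - fields["Committed_AS"]


# Illustrative values only, not taken from a real machine.
SAMPLE = """\
MemTotal:        8388608 kB
SwapTotal:       2097152 kB
CommitLimit:     6291456 kB
Committed_AS:    3145728 kB
"""

info = parse_meminfo(SAMPLE)
headroom = commit_headroom_kb(info)
```

On a live system the same functions can be fed the contents of
/proc/meminfo, for example parse_meminfo(open("/proc/meminfo").read()).
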
Gotchas
-------

The C language stack growth does an implicit mremap. If you want absolute
guarantees and run close to the edge you MUST mmap your stack for the
largest size you think you will need. For typical stack usage this does
not matter much, but it is a corner case if you really, really care.

In mode 2 the MAP_NORESERVE flag is ignored.


How It Works
------------

The overcommit is based on the following rules:

For a file backed map
	SHARED or READ-only	-	0 cost (the file is the map, not swap)
	PRIVATE WRITABLE	-	size of mapping per instance

For an anonymous or /dev/zero map
	SHARED			-	size of mapping
	PRIVATE READ-only	-	0 cost (but of little use)
	PRIVATE WRITABLE	-	size of mapping per instance

Additional accounting
	Pages made writable copies by mmap
	shmfs memory drawn from the same pool

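The per-mapping rules above can be written out as a small lookup.
This is a sketch of the table only: the function name, its parameters
and the 'file'/'anon' labels are assumptions for illustration, and
the "additional accounting" items are not modelled.

```python
def commit_cost(size, backing, shared, writable):
    """Commit charge for one mapping instance under the rules above.

    backing is 'file' for a file backed map, or 'anon' for an
    anonymous or /dev/zero map.
    """
    if backing not in ("file", "anon"):
        raise ValueError("backing must be 'file' or 'anon'")
    if backing == "file":
        # SHARED or READ-only costs nothing; PRIVATE WRITABLE is charged.
        return size if (not shared and writable) else 0
    if shared:
        # Anonymous SHARED is charged the size of the mapping.
        return size
    # Anonymous PRIVATE: charged only when writable.
    return size if writable else 0


MIB = 1024 * 1024
file_shared_rw = commit_cost(4 * MIB, "file", shared=True, writable=True)
file_private_rw = commit_cost(4 * MIB, "file", shared=False, writable=True)
anon_private_ro = commit_cost(4 * MIB, "anon", shared=False, writable=False)
```

Here a shared writable file mapping is charged nothing, a private
writable file mapping is charged its full 4 MiB, and a private
read-only anonymous mapping is charged nothing.
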
Status
------

o	We account mmap memory mappings
o	We account mprotect changes in commit
o	We account mremap changes in size
o	We account brk
o	We account munmap
o	We report the commit status in /proc
o	Account and check on fork
o	Review stack handling/building on exec
o	SHMfs accounting
o	Implement actual limit enforcement

To Do
-----
o	Account ptrace pages (this is hard)