)]}'
{
  "log": [
    {
      "commit": "5b8ba10198a109f8a02380648c5d29000caa9c55",
      "tree": "1e4328d86395baa3d429c0d9911b7d7e1272629d",
      "parents": [
        "4d258b25d947521c8b913154db61ec55198243f8"
      ],
      "author": {
        "name": "Hugh Dickins",
        "email": "hughd@google.com",
        "time": "Mon Jun 27 16:18:01 2011 -0700"
      },
      "committer": {
        "name": "Linus Torvalds",
        "email": "torvalds@linux-foundation.org",
        "time": "Mon Jun 27 18:00:12 2011 -0700"
      },
      "message": "mm: move vmtruncate_range to truncate.c\n\nYou would expect to find vmtruncate_range() next to vmtruncate() in\nmm/truncate.c: move it there.\n\nSigned-off-by: Hugh Dickins \u003chughd@google.com\u003e\nAcked-by: Christoph Hellwig \u003chch@infradead.org\u003e\nSigned-off-by: Andrew Morton \u003cakpm@linux-foundation.org\u003e\nSigned-off-by: Linus Torvalds \u003ctorvalds@linux-foundation.org\u003e\n"
    },
    {
      "commit": "5f1a19070b16c20cdc71ed0e981bfa19f8f6a4ee",
      "tree": "f3eaeb7a040e2484d71485118d58e34eb0760bf3",
      "parents": [
        "4bbd61fb9726808e72ab2aa440401f6e5e1aa8f7"
      ],
      "author": {
        "name": "Steven Rostedt",
        "email": "rostedt@goodmis.org",
        "time": "Wed Jun 15 15:08:23 2011 -0700"
      },
      "committer": {
        "name": "Linus Torvalds",
        "email": "torvalds@linux-foundation.org",
        "time": "Wed Jun 15 20:04:00 2011 -0700"
      },
      "message": "mm: fix wrong kunmap_atomic() pointer\n\nRunning a ktest.pl test, I hit the following bug on x86_32:\n\n  ------------[ cut here ]------------\n  WARNING: at arch/x86/mm/highmem_32.c:81 __kunmap_atomic+0x64/0xc1()\n   Hardware name:\n  Modules linked in:\n  Pid: 93, comm: sh Not tainted 2.6.39-test+ #1\n  Call Trace:\n   [\u003cc04450da\u003e] warn_slowpath_common+0x7c/0x91\n   [\u003cc042f5df\u003e] ? __kunmap_atomic+0x64/0xc1\n   [\u003cc042f5df\u003e] ? __kunmap_atomic+0x64/0xc1^M\n   [\u003cc0445111\u003e] warn_slowpath_null+0x22/0x24\n   [\u003cc042f5df\u003e] __kunmap_atomic+0x64/0xc1\n   [\u003cc04d4a22\u003e] unmap_vmas+0x43a/0x4e0\n   [\u003cc04d9065\u003e] exit_mmap+0x91/0xd2\n   [\u003cc0443057\u003e] mmput+0x43/0xad\n   [\u003cc0448358\u003e] exit_mm+0x111/0x119\n   [\u003cc044855f\u003e] do_exit+0x1ff/0x5fa\n   [\u003cc0454ea2\u003e] ? set_current_blocked+0x3c/0x40\n   [\u003cc0454f24\u003e] ? sigprocmask+0x7e/0x8e\n   [\u003cc0448b55\u003e] do_group_exit+0x65/0x88\n   [\u003cc0448b90\u003e] sys_exit_group+0x18/0x1c\n   [\u003cc0c3915f\u003e] sysenter_do_call+0x12/0x38\n  ---[ end trace 8055f74ea3c0eb62 ]---\n\nRunning a ktest.pl git bisect, found the culprit: commit e303297e6c3a\n(\"mm: extended batches for generic mmu_gather\")\n\nBut although this was the commit triggering the bug, it was not the one\noriginally responsible for the bug.  That was commit d16dfc550f53 (\"mm:\nmmu_gather rework\").\n\nThe code in zap_pte_range() has something that looks like the following:\n\n\tpte \u003d  pte_offset_map_lock(mm, pmd, addr, \u0026ptl);\n\tdo {\n\t\t[...]\n\t} while (pte++, addr +\u003d PAGE_SIZE, addr !\u003d end);\n\tpte_unmap_unlock(pte - 1, ptl);\n\nThe pte starts off pointing at the first element in the page table\ndirectory that was returned by the pte_offset_map_lock().  When it\u0027s done\nwith the page, pte will be pointing to anything between the next entry and\nthe first entry of the next page inclusive.  
By doing a pte - 1, this puts\nthe pte back onto the original page, which is all that pte_unmap_unlock()\nneeds.\n\nIn most archs (64 bit), this is not an issue as the pte is ignored in the\npte_unmap_unlock().  But on 32 bit archs, where things may be kmapped, it\nis essential that the pte passed to pte_unmap_unlock() resides on the same\npage that was given by pte_offest_map_lock().\n\nThe problem came in d16dfc55 (\"mm: mmu_gather rework\") where it introduced\na \"break;\" from the while loop.  This alone did not seem to easily trigger\nthe bug.  But the modifications made by e303297e6 caused that \"break;\" to\nbe hit on the first iteration, before the pte++.\n\nThe pte not being incremented will now cause pte_unmap_unlock(pte - 1) to\nbe pointing to the previous page.  This will cause the wrong page to be\nunmapped, and also trigger the warning above.\n\nThe simple solution is to just save the pointer given by\npte_offset_map_lock() and use it in the unlock.\n\nSigned-off-by: Steven Rostedt \u003crostedt@goodmis.org\u003e\nCc: Peter Zijlstra \u003ca.p.zijlstra@chello.nl\u003e\nCc: KAMEZAWA Hiroyuki \u003ckamezawa.hiroyu@jp.fujitsu.com\u003e\nAcked-by: Hugh Dickins \u003chughd@google.com\u003e\nCc: Mel Gorman \u003cmel@csn.ul.ie\u003e\nSigned-off-by: Andrew Morton \u003cakpm@linux-foundation.org\u003e\nSigned-off-by: Linus Torvalds \u003ctorvalds@linux-foundation.org\u003e\n"
    },
    {
      "commit": "0164f69d0cf1a6abbc936851f5b72ece92187cda",
      "tree": "000bb234b98d76ce0b5195a3ee53a505aa0d3d86",
      "parents": [
        "f300ea499721ca208fc4714b9105bfd7e9f75be0"
      ],
      "author": {
        "name": "Randy Dunlap",
        "email": "randy.dunlap@oracle.com",
        "time": "Wed Jun 15 15:08:09 2011 -0700"
      },
      "committer": {
        "name": "Linus Torvalds",
        "email": "torvalds@linux-foundation.org",
        "time": "Wed Jun 15 20:03:59 2011 -0700"
      },
      "message": "mm/memory.c: fix kernel-doc notation\n\nFix new kernel-doc warnings in mm/memory.c:\n\n  Warning(mm/memory.c:1327): No description found for parameter \u0027tlb\u0027\n  Warning(mm/memory.c:1327): Excess function parameter \u0027tlbp\u0027 description in \u0027unmap_vmas\u0027\n\nSigned-off-by: Randy Dunlap \u003crandy.dunlap@oracle.com\u003e\nSigned-off-by: Andrew Morton \u003cakpm@linux-foundation.org\u003e\nSigned-off-by: Linus Torvalds \u003ctorvalds@linux-foundation.org\u003e\n"
    },
    {
      "commit": "456f998ec817ebfa254464be4f089542fa390645",
      "tree": "5976aa500638f0bbade1a672233cad71765b89b8",
      "parents": [
        "406eb0c9ba765eb066406fd5ce9d5e2b169a4d5a"
      ],
      "author": {
        "name": "Ying Han",
        "email": "yinghan@google.com",
        "time": "Thu May 26 16:25:38 2011 -0700"
      },
      "committer": {
        "name": "Linus Torvalds",
        "email": "torvalds@linux-foundation.org",
        "time": "Thu May 26 17:12:36 2011 -0700"
      },
      "message": "memcg: add the pagefault count into memcg stats\n\nTwo new stats in per-memcg memory.stat which tracks the number of page\nfaults and number of major page faults.\n\n  \"pgfault\"\n  \"pgmajfault\"\n\nThey are different from \"pgpgin\"/\"pgpgout\" stat which count number of\npages charged/discharged to the cgroup and have no meaning of reading/\nwriting page to disk.\n\nIt is valuable to track the two stats for both measuring application\u0027s\nperformance as well as the efficiency of the kernel page reclaim path.\nCounting pagefaults per process is useful, but we also need the aggregated\nvalue since processes are monitored and controlled in cgroup basis in\nmemcg.\n\nFunctional test: check the total number of pgfault/pgmajfault of all\nmemcgs and compare with global vmstat value:\n\n  $ cat /proc/vmstat | grep fault\n  pgfault 1070751\n  pgmajfault 553\n\n  $ cat /dev/cgroup/memory.stat | grep fault\n  pgfault 1071138\n  pgmajfault 553\n  total_pgfault 1071142\n  total_pgmajfault 553\n\n  $ cat /dev/cgroup/A/memory.stat | grep fault\n  pgfault 199\n  pgmajfault 0\n  total_pgfault 199\n  total_pgmajfault 0\n\nPerformance test: run page fault test(pft) wit 16 thread on faulting in\n15G anon pages in 16G container.  
There is no regression noticed on the\n\"flt/cpu/s\"\n\nSample output from pft:\n\n  TAG pft:anon-sys-default:\n    Gb  Thr CLine   User     System     Wall    flt/cpu/s fault/wsec\n    15   16   1     0.67s   233.41s    14.76s   16798.546 266356.260\n\n  +-------------------------------------------------------------------------+\n      N           Min           Max        Median           Avg        Stddev\n  x  10     16682.962     17344.027     16913.524     16928.812      166.5362\n  +  10     16695.568     16923.896     16820.604     16824.652     84.816568\n  No difference proven at 95.0% confidence\n\n[akpm@linux-foundation.org: fix build]\n[hughd@google.com: shmem fix]\nSigned-off-by: Ying Han \u003cyinghan@google.com\u003e\nAcked-by: KAMEZAWA Hiroyuki \u003ckamezawa.hiroyu@jp.fujitsu.com\u003e\nCc: KOSAKI Motohiro \u003ckosaki.motohiro@jp.fujitsu.com\u003e\nReviewed-by: Minchan Kim \u003cminchan.kim@gmail.com\u003e\nCc: Daisuke Nishimura \u003cnishimura@mxp.nes.nec.co.jp\u003e\nAcked-by: Balbir Singh \u003cbalbir@linux.vnet.ibm.com\u003e\nSigned-off-by: Hugh Dickins \u003chughd@google.com\u003e\nSigned-off-by: Andrew Morton \u003cakpm@linux-foundation.org\u003e\nSigned-off-by: Linus Torvalds \u003ctorvalds@linux-foundation.org\u003e\n"
    },
    {
      "commit": "ca16d140af91febe25daeb9e032bf8bd46b8c31f",
      "tree": "a093c3f244a1bdfc2a50e271a7e6df3324df0f05",
      "parents": [
        "4db70f73e56961b9bcdfd0c36c62847a18b7dbb5"
      ],
      "author": {
        "name": "KOSAKI Motohiro",
        "email": "kosaki.motohiro@jp.fujitsu.com",
        "time": "Thu May 26 19:16:19 2011 +0900"
      },
      "committer": {
        "name": "Linus Torvalds",
        "email": "torvalds@linux-foundation.org",
        "time": "Thu May 26 09:20:31 2011 -0700"
      },
      "message": "mm: don\u0027t access vm_flags as \u0027int\u0027\n\nThe type of vma-\u003evm_flags is \u0027unsigned long\u0027. Neither \u0027int\u0027 nor\n\u0027unsigned int\u0027. This patch fixes such misuse.\n\nSigned-off-by: KOSAKI Motohiro \u003ckosaki.motohiro@jp.fujitsu.com\u003e\n[ Changed to use a typedef - we\u0027ll extend it to cover more cases\n  later, since there has been discussion about making it a 64-bit\n  type..                      - Linus ]\nSigned-off-by: Linus Torvalds \u003ctorvalds@linux-foundation.org\u003e\n"
    },
    {
      "commit": "9547d01bfb9c351dc19067f8a4cea9d3955f4125",
      "tree": "3c32521dbbf380471e1eef3e11ae656b24164255",
      "parents": [
        "88c22088bf235f50b09a10bd9f022b0472bcb6b5"
      ],
      "author": {
        "name": "Peter Zijlstra",
        "email": "a.p.zijlstra@chello.nl",
        "time": "Tue May 24 17:12:14 2011 -0700"
      },
      "committer": {
        "name": "Linus Torvalds",
        "email": "torvalds@linux-foundation.org",
        "time": "Wed May 25 08:39:20 2011 -0700"
      },
      "message": "mm: uninline large generic tlb.h functions\n\nSome of these functions have grown beyond inline sanity, move them\nout-of-line.\n\nSigned-off-by: Peter Zijlstra \u003ca.p.zijlstra@chello.nl\u003e\nRequested-by: Andrew Morton \u003cakpm@linux-foundation.org\u003e\nRequested-by: Hugh Dickins \u003chughd@google.com\u003e\nSigned-off-by: Andrew Morton \u003cakpm@linux-foundation.org\u003e\nSigned-off-by: Linus Torvalds \u003ctorvalds@linux-foundation.org\u003e\n"
    },
    {
      "commit": "3d48ae45e72390ddf8cc5256ac32ed6f7a19cbea",
      "tree": "1f46db3a8424090dd8e0b58991fa5acc1a73e680",
      "parents": [
        "97a894136f29802da19a15541de3c019e1ca147e"
      ],
      "author": {
        "name": "Peter Zijlstra",
        "email": "a.p.zijlstra@chello.nl",
        "time": "Tue May 24 17:12:06 2011 -0700"
      },
      "committer": {
        "name": "Linus Torvalds",
        "email": "torvalds@linux-foundation.org",
        "time": "Wed May 25 08:39:18 2011 -0700"
      },
      "message": "mm: Convert i_mmap_lock to a mutex\n\nStraightforward conversion of i_mmap_lock to a mutex.\n\nSigned-off-by: Peter Zijlstra \u003ca.p.zijlstra@chello.nl\u003e\nAcked-by: Hugh Dickins \u003chughd@google.com\u003e\nCc: Benjamin Herrenschmidt \u003cbenh@kernel.crashing.org\u003e\nCc: David Miller \u003cdavem@davemloft.net\u003e\nCc: Martin Schwidefsky \u003cschwidefsky@de.ibm.com\u003e\nCc: Russell King \u003crmk@arm.linux.org.uk\u003e\nCc: Paul Mundt \u003clethal@linux-sh.org\u003e\nCc: Jeff Dike \u003cjdike@addtoit.com\u003e\nCc: Richard Weinberger \u003crichard@nod.at\u003e\nCc: Tony Luck \u003ctony.luck@intel.com\u003e\nCc: KAMEZAWA Hiroyuki \u003ckamezawa.hiroyu@jp.fujitsu.com\u003e\nCc: Mel Gorman \u003cmel@csn.ul.ie\u003e\nCc: KOSAKI Motohiro \u003ckosaki.motohiro@jp.fujitsu.com\u003e\nCc: Nick Piggin \u003cnpiggin@kernel.dk\u003e\nCc: Namhyung Kim \u003cnamhyung@gmail.com\u003e\nSigned-off-by: Andrew Morton \u003cakpm@linux-foundation.org\u003e\nSigned-off-by: Linus Torvalds \u003ctorvalds@linux-foundation.org\u003e\n"
    },
    {
      "commit": "97a894136f29802da19a15541de3c019e1ca147e",
      "tree": "1fd3f92ba92a37d5d8527a1f41458091d0a944dc",
      "parents": [
        "e4c70a6629f9c74c4b0de258a3951890e9047c82"
      ],
      "author": {
        "name": "Peter Zijlstra",
        "email": "a.p.zijlstra@chello.nl",
        "time": "Tue May 24 17:12:04 2011 -0700"
      },
      "committer": {
        "name": "Linus Torvalds",
        "email": "torvalds@linux-foundation.org",
        "time": "Wed May 25 08:39:17 2011 -0700"
      },
      "message": "mm: Remove i_mmap_lock lockbreak\n\nHugh says:\n \"The only significant loser, I think, would be page reclaim (when\n  concurrent with truncation): could spin for a long time waiting for\n  the i_mmap_mutex it expects would soon be dropped? \"\n\nCounter points:\n - cpu contention makes the spin stop (need_resched())\n - zap pages should be freeing pages at a higher rate than reclaim\n   ever can\n\nI think the simplification of the truncate code is definitely worth it.\n\nEffectively reverts: 2aa15890f3c (\"mm: prevent concurrent\nunmap_mapping_range() on the same inode\") and takes out the code that\ncaused its problem.\n\nSigned-off-by: Peter Zijlstra \u003ca.p.zijlstra@chello.nl\u003e\nReviewed-by: KAMEZAWA Hiroyuki \u003ckamezawa.hiroyu@jp.fujitsu.com\u003e\nCc: Hugh Dickins \u003chughd@google.com\u003e\nCc: Benjamin Herrenschmidt \u003cbenh@kernel.crashing.org\u003e\nCc: David Miller \u003cdavem@davemloft.net\u003e\nCc: Martin Schwidefsky \u003cschwidefsky@de.ibm.com\u003e\nCc: Russell King \u003crmk@arm.linux.org.uk\u003e\nCc: Paul Mundt \u003clethal@linux-sh.org\u003e\nCc: Jeff Dike \u003cjdike@addtoit.com\u003e\nCc: Richard Weinberger \u003crichard@nod.at\u003e\nCc: Tony Luck \u003ctony.luck@intel.com\u003e\nCc: Mel Gorman \u003cmel@csn.ul.ie\u003e\nCc: KOSAKI Motohiro \u003ckosaki.motohiro@jp.fujitsu.com\u003e\nCc: Nick Piggin \u003cnpiggin@kernel.dk\u003e\nCc: Namhyung Kim \u003cnamhyung@gmail.com\u003e\nSigned-off-by: Andrew Morton \u003cakpm@linux-foundation.org\u003e\nSigned-off-by: Linus Torvalds \u003ctorvalds@linux-foundation.org\u003e\n"
    },
    {
      "commit": "e303297e6c3a7b847c4731eb14006ca6b435ecca",
      "tree": "c2bbec8fb0cad1405f4a3ff908cd1d22abcd3e77",
      "parents": [
        "267239116987d64850ad2037d8e0f3071dc3b5ce"
      ],
      "author": {
        "name": "Peter Zijlstra",
        "email": "a.p.zijlstra@chello.nl",
        "time": "Tue May 24 17:12:01 2011 -0700"
      },
      "committer": {
        "name": "Linus Torvalds",
        "email": "torvalds@linux-foundation.org",
        "time": "Wed May 25 08:39:16 2011 -0700"
      },
      "message": "mm: extended batches for generic mmu_gather\n\nInstead of using a single batch (the small on-stack, or an allocated\npage), try and extend the batch every time it runs out and only flush once\neither the extend fails or we\u0027re done.\n\nSigned-off-by: Peter Zijlstra \u003ca.p.zijlstra@chello.nl\u003e\nRequested-by: Nick Piggin \u003cnpiggin@kernel.dk\u003e\nReviewed-by: KAMEZAWA Hiroyuki \u003ckamezawa.hiroyu@jp.fujitsu.com\u003e\nAcked-by: Hugh Dickins \u003chughd@google.com\u003e\nCc: Benjamin Herrenschmidt \u003cbenh@kernel.crashing.org\u003e\nCc: David Miller \u003cdavem@davemloft.net\u003e\nCc: Martin Schwidefsky \u003cschwidefsky@de.ibm.com\u003e\nCc: Russell King \u003crmk@arm.linux.org.uk\u003e\nCc: Paul Mundt \u003clethal@linux-sh.org\u003e\nCc: Jeff Dike \u003cjdike@addtoit.com\u003e\nCc: Richard Weinberger \u003crichard@nod.at\u003e\nCc: Tony Luck \u003ctony.luck@intel.com\u003e\nCc: Mel Gorman \u003cmel@csn.ul.ie\u003e\nCc: KOSAKI Motohiro \u003ckosaki.motohiro@jp.fujitsu.com\u003e\nCc: Nick Piggin \u003cnpiggin@kernel.dk\u003e\nCc: Namhyung Kim \u003cnamhyung@gmail.com\u003e\nSigned-off-by: Andrew Morton \u003cakpm@linux-foundation.org\u003e\nSigned-off-by: Linus Torvalds \u003ctorvalds@linux-foundation.org\u003e\n"
    },
    {
      "commit": "267239116987d64850ad2037d8e0f3071dc3b5ce",
      "tree": "142595897f7fc7bb673b791891dcc2fab31f6e91",
      "parents": [
        "1c395176962176660bb108f90e97e1686cfe0d85"
      ],
      "author": {
        "name": "Peter Zijlstra",
        "email": "a.p.zijlstra@chello.nl",
        "time": "Tue May 24 17:12:00 2011 -0700"
      },
      "committer": {
        "name": "Linus Torvalds",
        "email": "torvalds@linux-foundation.org",
        "time": "Wed May 25 08:39:16 2011 -0700"
      },
      "message": "mm, powerpc: move the RCU page-table freeing into generic code\n\nIn case other architectures require RCU freed page-tables to implement\ngup_fast() and software filled hashes and similar things, provide the\nmeans to do so by moving the logic into generic code.\n\nSigned-off-by: Peter Zijlstra \u003ca.p.zijlstra@chello.nl\u003e\nRequested-by: David Miller \u003cdavem@davemloft.net\u003e\nCc: Benjamin Herrenschmidt \u003cbenh@kernel.crashing.org\u003e\nCc: Martin Schwidefsky \u003cschwidefsky@de.ibm.com\u003e\nCc: Russell King \u003crmk@arm.linux.org.uk\u003e\nCc: Paul Mundt \u003clethal@linux-sh.org\u003e\nCc: Jeff Dike \u003cjdike@addtoit.com\u003e\nCc: Richard Weinberger \u003crichard@nod.at\u003e\nCc: Tony Luck \u003ctony.luck@intel.com\u003e\nCc: KAMEZAWA Hiroyuki \u003ckamezawa.hiroyu@jp.fujitsu.com\u003e\nCc: Hugh Dickins \u003chughd@google.com\u003e\nCc: Mel Gorman \u003cmel@csn.ul.ie\u003e\nCc: KOSAKI Motohiro \u003ckosaki.motohiro@jp.fujitsu.com\u003e\nCc: Nick Piggin \u003cnpiggin@kernel.dk\u003e\nCc: Namhyung Kim \u003cnamhyung@gmail.com\u003e\nSigned-off-by: Andrew Morton \u003cakpm@linux-foundation.org\u003e\nSigned-off-by: Linus Torvalds \u003ctorvalds@linux-foundation.org\u003e\n"
    },
    {
      "commit": "d16dfc550f5326a4000f3322582a7c05dec91d7a",
      "tree": "8ee963542705cbf2187777f1d3f2b209cbda827a",
      "parents": [
        "d05f3169c0fbca16132ec7c2be71685c6de638b5"
      ],
      "author": {
        "name": "Peter Zijlstra",
        "email": "a.p.zijlstra@chello.nl",
        "time": "Tue May 24 17:11:45 2011 -0700"
      },
      "committer": {
        "name": "Linus Torvalds",
        "email": "torvalds@linux-foundation.org",
        "time": "Wed May 25 08:39:12 2011 -0700"
      },
      "message": "mm: mmu_gather rework\n\nRework the existing mmu_gather infrastructure.\n\nThe direct purpose of these patches was to allow preemptible mmu_gather,\nbut even without that I think these patches provide an improvement to the\nstatus quo.\n\nThe first 9 patches rework the mmu_gather infrastructure.  For review\npurpose I\u0027ve split them into generic and per-arch patches with the last of\nthose a generic cleanup.\n\nThe next patch provides generic RCU page-table freeing, and the followup\nis a patch converting s390 to use this.  I\u0027ve also got 4 patches from\nDaveM lined up (not included in this series) that uses this to implement\ngup_fast() for sparc64.\n\nThen there is one patch that extends the generic mmu_gather batching.\n\nAfter that follow the mm preemptibility patches, these make part of the mm\na lot more preemptible.  It converts i_mmap_lock and anon_vma-\u003elock to\nmutexes which together with the mmu_gather rework makes mmu_gather\npreemptible as well.\n\nMaking i_mmap_lock a mutex also enables a clean-up of the truncate code.\n\nThis also allows for preemptible mmu_notifiers, something that XPMEM I\nthink wants.\n\nFurthermore, it removes the new and universially detested unmap_mutex.\n\nThis patch:\n\nRemove the first obstacle towards a fully preemptible mmu_gather.\n\nThe current scheme assumes mmu_gather is always done with preemption\ndisabled and uses per-cpu storage for the page batches.  Change this to\ntry and allocate a page for batching and in case of failure, use a small\non-stack array to make some progress.\n\nPreemptible mmu_gather is desired in general and usable once i_mmap_lock\nbecomes a mutex.  
Doing it before the mutex conversion saves us from\nhaving to rework the code by moving the mmu_gather bits inside the\npte_lock.\n\nAlso avoid flushing the tlb batches from under the pte lock, this is\nuseful even without the i_mmap_lock conversion as it significantly reduces\npte lock hold times.\n\n[akpm@linux-foundation.org: fix comment tpyo]\nSigned-off-by: Peter Zijlstra \u003ca.p.zijlstra@chello.nl\u003e\nCc: Benjamin Herrenschmidt \u003cbenh@kernel.crashing.org\u003e\nCc: David Miller \u003cdavem@davemloft.net\u003e\nCc: Martin Schwidefsky \u003cschwidefsky@de.ibm.com\u003e\nCc: Russell King \u003crmk@arm.linux.org.uk\u003e\nCc: Paul Mundt \u003clethal@linux-sh.org\u003e\nCc: Jeff Dike \u003cjdike@addtoit.com\u003e\nCc: Richard Weinberger \u003crichard@nod.at\u003e\nCc: Tony Luck \u003ctony.luck@intel.com\u003e\nReviewed-by: KAMEZAWA Hiroyuki \u003ckamezawa.hiroyu@jp.fujitsu.com\u003e\nAcked-by: Hugh Dickins \u003chughd@google.com\u003e\nAcked-by: Mel Gorman \u003cmel@csn.ul.ie\u003e\nCc: KOSAKI Motohiro \u003ckosaki.motohiro@jp.fujitsu.com\u003e\nCc: Nick Piggin \u003cnpiggin@kernel.dk\u003e\nCc: Namhyung Kim \u003cnamhyung@gmail.com\u003e\nSigned-off-by: Andrew Morton \u003cakpm@linux-foundation.org\u003e\nSigned-off-by: Linus Torvalds \u003ctorvalds@linux-foundation.org\u003e\n"
    },
    {
      "commit": "d05f3169c0fbca16132ec7c2be71685c6de638b5",
      "tree": "37d82004869fa4e530617883f12cab7538dbd4a6",
      "parents": [
        "248ac0e1943ad1796393d281b096184719eb3f97"
      ],
      "author": {
        "name": "Michal Hocko",
        "email": "mhocko@suse.cz",
        "time": "Tue May 24 17:11:44 2011 -0700"
      },
      "committer": {
        "name": "Linus Torvalds",
        "email": "torvalds@linux-foundation.org",
        "time": "Wed May 25 08:39:12 2011 -0700"
      },
      "message": "mm: make expand_downwards() symmetrical with expand_upwards()\n\nCurrently we have expand_upwards exported while expand_downwards is\naccessible only via expand_stack or expand_stack_downwards.\n\ncheck_stack_guard_page is a nice example of the asymmetry.  It uses\nexpand_stack for VM_GROWSDOWN while expand_upwards is called for\nVM_GROWSUP case.\n\nLet\u0027s clean this up by exporting both functions and make those names\nconsistent.  Let\u0027s use expand_{upwards,downwards} because expanding\ndoesn\u0027t always involve stack manipulation (an example is\nia64_do_page_fault which uses expand_upwards for registers backing store\nexpansion).  expand_downwards has to be defined for both\nCONFIG_STACK_GROWS{UP,DOWN} because get_arg_page calls the downwards\nversion in the early process initialization phase for growsup\nconfiguration.\n\nSigned-off-by: Michal Hocko \u003cmhocko@suse.cz\u003e\nAcked-by: Hugh Dickins \u003chughd@google.com\u003e\nCc: James Bottomley \u003cJames.Bottomley@HansenPartnership.com\u003e\nCc: \"Luck, Tony\" \u003ctony.luck@intel.com\u003e\nSigned-off-by: Andrew Morton \u003cakpm@linux-foundation.org\u003e\nSigned-off-by: Linus Torvalds \u003ctorvalds@linux-foundation.org\u003e\n"
    },
    {
      "commit": "a09a79f66874c905af35d5bb5e5f2fdc7b6b894d",
      "tree": "9cb2ae1fef7083af91a49c19411e9871e0e59a37",
      "parents": [
        "26822eebb25500fb0776c7c256a6af041e9f538b"
      ],
      "author": {
        "name": "Mikulas Patocka",
        "email": "mikulas@artax.karlin.mff.cuni.cz",
        "time": "Mon May 09 13:01:09 2011 +0200"
      },
      "committer": {
        "name": "Linus Torvalds",
        "email": "torvalds@linux-foundation.org",
        "time": "Mon May 09 16:22:07 2011 -0700"
      },
      "message": "Don\u0027t lock guardpage if the stack is growing up\n\nLinux kernel excludes guard page when performing mlock on a VMA with\ndown-growing stack. However, some architectures have up-growing stack\nand locking the guard page should be excluded in this case too.\n\nThis patch fixes lvm2 on PA-RISC (and possibly other architectures with\nup-growing stack). lvm2 calculates number of used pages when locking and\nwhen unlocking and reports an internal error if the numbers mismatch.\n\n[ Patch changed fairly extensively to also fix /proc/\u003cpid\u003e/maps for the\n  grows-up case, and to move things around a bit to clean it all up and\n  share the infrstructure with the /proc bits.\n\n  Tested on ia64 that has both grow-up and grow-down segments  - Linus ]\n\nSigned-off-by: Mikulas Patocka \u003cmikulas@artax.karlin.mff.cuni.cz\u003e\nTested-by: Tony Luck \u003ctony.luck@gmail.com\u003e\nCc: stable@kernel.org\nSigned-off-by: Linus Torvalds \u003ctorvalds@linux-foundation.org\u003e\n"
    },
    {
      "commit": "a1fde08c74e90accd62d4cfdbf580d2ede938fe7",
      "tree": "bdf58078fd37484729e350acb066dc1b1fa890ee",
      "parents": [
        "5895198c56d131cc696556a45f7ff0ea99ac297b"
      ],
      "author": {
        "name": "Linus Torvalds",
        "email": "torvalds@linux-foundation.org",
        "time": "Wed May 04 21:30:28 2011 -0700"
      },
      "committer": {
        "name": "Linus Torvalds",
        "email": "torvalds@linux-foundation.org",
        "time": "Wed May 04 21:30:28 2011 -0700"
      },
      "message": "VM: skip the stack guard page lookup in get_user_pages only for mlock\n\nThe logic in __get_user_pages() used to skip the stack guard page lookup\nwhenever the caller wasn\u0027t interested in seeing what the actual page\nwas.  But Michel Lespinasse points out that there are cases where we\ndon\u0027t care about the physical page itself (so \u0027pages\u0027 may be NULL), but\ndo want to make sure a page is mapped into the virtual address space.\n\nSo using the existence of the \"pages\" array as an indication of whether\nto look up the guard page or not isn\u0027t actually so great, and we really\nshould just use the FOLL_MLOCK bit.  But because that bit was only set\nfor the VM_LOCKED case (and not all vma\u0027s necessarily have it, even for\nmlock()), we couldn\u0027t do that originally.\n\nFix that by moving the VM_LOCKED check deeper into the call-chain, which\nactually simplifies many things.  Now mlock() gets simpler, and we can\nalso check for FOLL_MLOCK in __get_user_pages() and the code ends up\nmuch more straightforward.\n\nReported-and-reviewed-by: Michel Lespinasse \u003cwalken@google.com\u003e\nCc: stable@kernel.org\nSigned-off-by: Linus Torvalds \u003ctorvalds@linux-foundation.org\u003e\n"
    },
    {
      "commit": "cc03638df20acbec5d0d0d9e07234aadde9e698d",
      "tree": "462baff7982f3b58a647fc895ec3a62402e3d0b3",
      "parents": [
        "1409f141ac719b994d2832911b1e9ec928943fc2"
      ],
      "author": {
        "name": "Mel Gorman",
        "email": "mgorman@suse.de",
        "time": "Wed Apr 27 15:26:56 2011 -0700"
      },
      "committer": {
        "name": "Linus Torvalds",
        "email": "torvalds@linux-foundation.org",
        "time": "Thu Apr 28 11:28:21 2011 -0700"
      },
      "message": "mm: check if PTE is already allocated during page fault\n\nWith transparent hugepage support, handle_mm_fault() has to be careful\nthat a normal PMD has been established before handling a PTE fault.  To\nachieve this, it used __pte_alloc() directly instead of pte_alloc_map as\npte_alloc_map is unsafe to run against a huge PMD.  pte_offset_map() is\ncalled once it is known the PMD is safe.\n\npte_alloc_map() is smart enough to check if a PTE is already present\nbefore calling __pte_alloc but this check was lost.  As a consequence,\nPTEs may be allocated unnecessarily and the page table lock taken.  Thi\nuseless PTE does get cleaned up but it\u0027s a performance hit which is\nvisible in page_test from aim9.\n\nThis patch simply re-adds the check normally done by pte_alloc_map to\ncheck if the PTE needs to be allocated before taking the page table lock.\nThe effect is noticable in page_test from aim9.\n\n  AIM9\n                  2.6.38-vanilla 2.6.38-checkptenone\n  creat-clo      446.10 ( 0.00%)   424.47 (-5.10%)\n  page_test       38.10 ( 0.00%)    42.04 ( 9.37%)\n  brk_test        52.45 ( 0.00%)    51.57 (-1.71%)\n  exec_test      382.00 ( 0.00%)   456.90 (16.39%)\n  fork_test       60.11 ( 0.00%)    67.79 (11.34%)\n  MMTests Statistics: duration\n  Total Elapsed Time (seconds)                611.90    612.22\n\n(While this affects 2.6.38, it is a performance rather than a functional\nbug and normally outside the rules -stable.  
While the big performance\ndifferences are to a microbench, the difference in fork and exec\nperformance may be significant enough that -stable wants to consider the\npatch)\n\nReported-by: Raz Ben Yehuda \u003craziebe@gmail.com\u003e\nSigned-off-by: Mel Gorman \u003cmgorman@suse.de\u003e\nReviewed-by: Rik van Riel \u003criel@redhat.com\u003e\nReviewed-by: Andrea Arcangeli \u003caarcange@redhat.com\u003e\nReviewed-by: Minchan Kim \u003cminchan.kim@gmail.com\u003e\nAcked-by: Johannes Weiner \u003channes@cmpxchg.org\u003e\nCc: \u003cstable@kernel.org\u003e\t\t[2.6.38.x]\nSigned-off-by: Andrew Morton \u003cakpm@linux-foundation.org\u003e\nSigned-off-by: Linus Torvalds \u003ctorvalds@linux-foundation.org\u003e\n"
    },
    {
      "commit": "fe936dfc23fed3475b11067e8d9b70553eafcd9e",
      "tree": "b45ad916853194b26bfe4504879e0bff64a43bf7",
      "parents": [
        "4471a675dfc7ca676c165079e91c712b09dc9ce4"
      ],
      "author": {
        "name": "Michael Ellerman",
        "email": "michael@ellerman.id.au",
        "time": "Thu Apr 14 15:22:10 2011 -0700"
      },
      "committer": {
        "name": "Linus Torvalds",
        "email": "torvalds@linux-foundation.org",
        "time": "Thu Apr 14 16:06:55 2011 -0700"
      },
      "message": "mm: check that we have the right vma in __access_remote_vm()\n\nIn __access_remote_vm() we need to check that we have found the right\nvma, not the following vma before we try to access it.  Otherwise we\nmight call the vma\u0027s access routine with an address which does not fall\ninside the vma.\n\nIt was discovered on a current kernel but with an unreleased driver,\nfrom memory it was strace leading to a kernel bad access, but it\nobviously depends on what the access implementation does.\n\nLooking at other access implementations I only see:\n\n  $ git grep -A 5 vm_operations|grep access\n  arch/powerpc/platforms/cell/spufs/file.c-\t.access \u003d spufs_mem_mmap_access,\n  arch/x86/pci/i386.c-\t.access \u003d generic_access_phys,\n  drivers/char/mem.c-\t.access \u003d generic_access_phys\n  fs/sysfs/bin.c-\t.access\t\t\u003d bin_access,\n\nThe spufs one looks like it might behave badly given the wrong vma, it\nassumes vma-\u003evm_file-\u003eprivate_data is a spu_context, and looks like it\nwould probably blow up pretty quickly if it wasn\u0027t.\n\ngeneric_access_phys() only uses the vma to check vm_flags and get the\nmm, and then walks page tables using the address.  So it should bail on\nthe vm_flags check, or at worst let you access some other VM_IO mapping.\n\nAnd bin_access() just proxies to another access implementation.\n\nSigned-off-by: Michael Ellerman \u003cmichael@ellerman.id.au\u003e\nReviewed-by: KOSAKI Motohiro \u003ckosaki.motohiro@jp.fujitsu.com\u003e\nCc: Hugh Dickins \u003chughd@google.com\u003e\nSigned-off-by: Andrew Morton \u003cakpm@linux-foundation.org\u003e\nSigned-off-by: Linus Torvalds \u003ctorvalds@linux-foundation.org\u003e\n"
    },
    {
      "commit": "95042f9eb78a8d9a17455e2ef263f2f310ecef15",
      "tree": "ac9fe0a5e17c4b94b18b84338ffbeca2cee140cb",
      "parents": [
        "be85bccaa5aa5a11dcaf85f9e945ffefd253f631"
      ],
      "author": {
        "name": "Linus Torvalds",
        "email": "torvalds@linux-foundation.org",
        "time": "Tue Apr 12 14:15:51 2011 -0700"
      },
      "committer": {
        "name": "Linus Torvalds",
        "email": "torvalds@linux-foundation.org",
        "time": "Tue Apr 12 14:15:51 2011 -0700"
      },
      "message": "vm: fix mlock() on stack guard page\n\nCommit 53a7706d5ed8 (\"mlock: do not hold mmap_sem for extended periods\nof time\") changed mlock() to care about the exact number of pages that\n__get_user_pages() had brought it.  Before, it would only care about\nerrors.\n\nAnd that doesn\u0027t work, because we also handled one page specially in\n__mlock_vma_pages_range(), namely the stack guard page.  So when that\ncase was handled, the number of pages that the function returned was off\nby one.  In particular, it could be zero, and then the caller would end\nup not making any progress at all.\n\nRather than try to fix up that off-by-one error for the mlock case\nspecially, this just moves the logic to handle the stack guard page\ninto__get_user_pages() itself, thus making all the counts come out\nright automatically.\n\nReported-by: Robert Święcki \u003crobert@swiecki.net\u003e\nCc: Hugh Dickins \u003chughd@google.com\u003e\nCc: Oleg Nesterov \u003coleg@redhat.com\u003e\nCc: stable@kernel.org\nSigned-off-by: Linus Torvalds \u003ctorvalds@linux-foundation.org\u003e\n"
    },
    {
      "commit": "ae91dbfc9949cf042c45798557b48d3b83bc3635",
      "tree": "6af0edfd904b957a2f6ca65ae4a5fdebb78ca5b8",
      "parents": [
        "d7c3f8cee81f4548de0513403b74131aee655576"
      ],
      "author": {
        "name": "Randy Dunlap",
        "email": "randy.dunlap@oracle.com",
        "time": "Sat Mar 26 13:27:01 2011 -0700"
      },
      "committer": {
        "name": "Linus Torvalds",
        "email": "torvalds@linux-foundation.org",
        "time": "Sun Mar 27 19:30:18 2011 -0700"
      },
      "message": "mm: fix memory.c incorrect kernel-doc\n\nFix mm/memory.c incorrect kernel-doc function notation:\n\n  Warning(mm/memory.c:3718): Cannot understand  * @access_remote_vm - access another process\u0027 address space\n   on line 3718 - I thought it was a doc line\n\nSigned-off-by: Randy Dunlap \u003crandy.dunlap@oracle.com\u003e\nSigned-off-by: Linus Torvalds \u003ctorvalds@linux-foundation.org\u003e\n"
    },
    {
      "commit": "b81a618dcd3ea99de292dbe624f41ca68f464376",
      "tree": "c5fbe44f944da9d7dc0c224116be77094d379c8a",
      "parents": [
        "2f284c846331fa44be1300a3c2c3e85800268a00",
        "a9712bc12c40c172e393f85a9b2ba8db4bf59509"
      ],
      "author": {
        "name": "Linus Torvalds",
        "email": "torvalds@linux-foundation.org",
        "time": "Wed Mar 23 20:51:42 2011 -0700"
      },
      "committer": {
        "name": "Linus Torvalds",
        "email": "torvalds@linux-foundation.org",
        "time": "Wed Mar 23 20:51:42 2011 -0700"
      },
      "message": "Merge branch \u0027for-linus\u0027 of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6\n\n* \u0027for-linus\u0027 of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6:\n  deal with races in /proc/*/{syscall,stack,personality}\n  proc: enable writing to /proc/pid/mem\n  proc: make check_mem_permission() return an mm_struct on success\n  proc: hold cred_guard_mutex in check_mem_permission()\n  proc: disable mem_write after exec\n  mm: implement access_remote_vm\n  mm: factor out main logic of access_process_vm\n  mm: use mm_struct to resolve gate vma\u0027s in __get_user_pages\n  mm: arch: rename in_gate_area_no_task to in_gate_area_no_mm\n  mm: arch: make in_gate_area take an mm_struct instead of a task_struct\n  mm: arch: make get_gate_vma take an mm_struct instead of a task_struct\n  x86: mark associated mm when running a task in 32 bit compatibility mode\n  x86: add context tag to mark mm when running a task in 32-bit compatibility mode\n  auxv: require the target to be tracable (or yourself)\n  close race in /proc/*/environ\n  report errors in /proc/*/*map* sanely\n  pagemap: close races with suid execve\n  make sessionid permissions in /proc/*/task/* match those in /proc/*\n  fix leaks in path_lookupat()\n\nFix up trivial conflicts in fs/proc/base.c\n"
    },
    {
      "commit": "56039efa18f2530fc23e8ef19e716b65ee2a1d1e",
      "tree": "a61cbd2f760e93363657622de2cd1591db028458",
      "parents": [
        "6c191cd01a935e5b53ef43c9403c771bb7a32b60"
      ],
      "author": {
        "name": "KAMEZAWA Hiroyuki",
        "email": "kamezawa.hiroyu@jp.fujitsu.com",
        "time": "Wed Mar 23 16:42:19 2011 -0700"
      },
      "committer": {
        "name": "Linus Torvalds",
        "email": "torvalds@linux-foundation.org",
        "time": "Wed Mar 23 19:46:22 2011 -0700"
      },
      "message": "memcg: fix ugly initialization of return value is in caller\n\nRemove initialization of vaiable in caller of memory cgroup function.\nActually, it\u0027s return value of memcg function but it\u0027s initialized in\ncaller.\n\nSome memory cgroup uses following style to bring the result of start\nfunction to the end function for avoiding races.\n\n   mem_cgroup_start_A(\u0026(*ptr))\n   /* Something very complicated can happen here. */\n   mem_cgroup_end_A(*ptr)\n\nIn some calls, *ptr should be initialized to NULL be caller.  But it\u0027s\nugly.  This patch fixes that *ptr is initialized by _start function.\n\nSigned-off-by: KAMEZAWA Hiroyuki \u003ckamezawa.hiroyu@jp.fujitsu.com\u003e\nAcked-by: Johannes Weiner \u003channes@cmpxchg.org\u003e\nAcked-by: Daisuke Nishimura \u003cnishimura@mxp.nes.nec.co.jp\u003e\nCc: Balbir Singh \u003cbalbir@linux.vnet.ibm.com\u003e\nSigned-off-by: Andrew Morton \u003cakpm@linux-foundation.org\u003e\nSigned-off-by: Linus Torvalds \u003ctorvalds@linux-foundation.org\u003e\n"
    },
    {
      "commit": "5ddd36b9c59887c6416e21daf984fbdd9b1818df",
      "tree": "1cc7ce9a671f4c49dc594e1f5d1fc8b596e77b5f",
      "parents": [
        "206cb636576b969e9b471cdedeaea7752e6acb33"
      ],
      "author": {
        "name": "Stephen Wilson",
        "email": "wilsons@start.ca",
        "time": "Sun Mar 13 15:49:20 2011 -0400"
      },
      "committer": {
        "name": "Al Viro",
        "email": "viro@zeniv.linux.org.uk",
        "time": "Wed Mar 23 16:36:57 2011 -0400"
      },
      "message": "mm: implement access_remote_vm\n\nProvide an alternative to access_process_vm that allows the caller to obtain a\nreference to the supplied mm_struct.\n\nSigned-off-by: Stephen Wilson \u003cwilsons@start.ca\u003e\nSigned-off-by: Al Viro \u003cviro@zeniv.linux.org.uk\u003e\n"
    },
    {
      "commit": "206cb636576b969e9b471cdedeaea7752e6acb33",
      "tree": "252a1b5e9ce41521fb93b519265d4a1dbd18cfe9",
      "parents": [
        "e7f22e207bacdba5b73f2893a3abe935a5373e2e"
      ],
      "author": {
        "name": "Stephen Wilson",
        "email": "wilsons@start.ca",
        "time": "Sun Mar 13 15:49:19 2011 -0400"
      },
      "committer": {
        "name": "Al Viro",
        "email": "viro@zeniv.linux.org.uk",
        "time": "Wed Mar 23 16:36:56 2011 -0400"
      },
      "message": "mm: factor out main logic of access_process_vm\n\nIntroduce an internal helper __access_remote_vm and base access_process_vm on\ntop of it.  This new method may be called with a NULL task_struct if page fault\naccounting is not desired.  This code will be shared with a new address space\naccessor that is independent of task_struct.\n\nSigned-off-by: Stephen Wilson \u003cwilsons@start.ca\u003e\nSigned-off-by: Al Viro \u003cviro@zeniv.linux.org.uk\u003e\n"
    },
    {
      "commit": "e7f22e207bacdba5b73f2893a3abe935a5373e2e",
      "tree": "02e9f01788742db409587475a0aa10f3a0347e38",
      "parents": [
        "cae5d39032acf26c265f6b1dc73d7ce6ff4bc387"
      ],
      "author": {
        "name": "Stephen Wilson",
        "email": "wilsons@start.ca",
        "time": "Sun Mar 13 15:49:18 2011 -0400"
      },
      "committer": {
        "name": "Al Viro",
        "email": "viro@zeniv.linux.org.uk",
        "time": "Wed Mar 23 16:36:56 2011 -0400"
      },
      "message": "mm: use mm_struct to resolve gate vma\u0027s in __get_user_pages\n\nWe now check if a requested user page overlaps a gate vma using the supplied mm\ninstead of the supplied task.  The given task is now used solely for accounting\npurposes and may be NULL.\n\nSigned-off-by: Stephen Wilson \u003cwilsons@start.ca\u003e\nSigned-off-by: Al Viro \u003cviro@zeniv.linux.org.uk\u003e\n"
    },
    {
      "commit": "cae5d39032acf26c265f6b1dc73d7ce6ff4bc387",
      "tree": "9c89bcab3f4c17fb34eb44342d1f67bb4230d632",
      "parents": [
        "83b964bbf82eb13a8f31bb49ca420787fe01f7a6"
      ],
      "author": {
        "name": "Stephen Wilson",
        "email": "wilsons@start.ca",
        "time": "Sun Mar 13 15:49:17 2011 -0400"
      },
      "committer": {
        "name": "Al Viro",
        "email": "viro@zeniv.linux.org.uk",
        "time": "Wed Mar 23 16:36:55 2011 -0400"
      },
      "message": "mm: arch: rename in_gate_area_no_task to in_gate_area_no_mm\n\nNow that gate vma\u0027s are referenced with respect to a particular mm and not a\nparticular task it only makes sense to propagate the change to this predicate as\nwell.\n\nSigned-off-by: Stephen Wilson \u003cwilsons@start.ca\u003e\nReviewed-by: Michel Lespinasse \u003cwalken@google.com\u003e\nCc: Thomas Gleixner \u003ctglx@linutronix.de\u003e\nCc: Ingo Molnar \u003cmingo@redhat.com\u003e\nCc: \"H. Peter Anvin\" \u003chpa@zytor.com\u003e\nSigned-off-by: Al Viro \u003cviro@zeniv.linux.org.uk\u003e\n"
    },
    {
      "commit": "83b964bbf82eb13a8f31bb49ca420787fe01f7a6",
      "tree": "c94dcf5f4116ca351570fb9d2b7e37834e93f430",
      "parents": [
        "31db58b3ab432f72ea76be58b12e6ffaf627d5db"
      ],
      "author": {
        "name": "Stephen Wilson",
        "email": "wilsons@start.ca",
        "time": "Sun Mar 13 15:49:16 2011 -0400"
      },
      "committer": {
        "name": "Al Viro",
        "email": "viro@zeniv.linux.org.uk",
        "time": "Wed Mar 23 16:36:54 2011 -0400"
      },
      "message": "mm: arch: make in_gate_area take an mm_struct instead of a task_struct\n\nMorally, the question of whether an address lies in a gate vma should be asked\nwith respect to an mm, not a particular task.  Moreover, dropping the dependency\non task_struct will help make existing and future operations on mm\u0027s more\nflexible and convenient.\n\nSigned-off-by: Stephen Wilson \u003cwilsons@start.ca\u003e\nReviewed-by: Michel Lespinasse \u003cwalken@google.com\u003e\nCc: Thomas Gleixner \u003ctglx@linutronix.de\u003e\nCc: Ingo Molnar \u003cmingo@redhat.com\u003e\nCc: \"H. Peter Anvin\" \u003chpa@zytor.com\u003e\nSigned-off-by: Al Viro \u003cviro@zeniv.linux.org.uk\u003e\n"
    },
    {
      "commit": "31db58b3ab432f72ea76be58b12e6ffaf627d5db",
      "tree": "c88b742e1f2c52045d5abc6d35d7492ebdf64541",
      "parents": [
        "375906f8765e131a4a159b1ffebf78c15db7b3bf"
      ],
      "author": {
        "name": "Stephen Wilson",
        "email": "wilsons@start.ca",
        "time": "Sun Mar 13 15:49:15 2011 -0400"
      },
      "committer": {
        "name": "Al Viro",
        "email": "viro@zeniv.linux.org.uk",
        "time": "Wed Mar 23 16:36:54 2011 -0400"
      },
      "message": "mm: arch: make get_gate_vma take an mm_struct instead of a task_struct\n\nMorally, the presence of a gate vma is more an attribute of a particular mm than\na particular task.  Moreover, dropping the dependency on task_struct will help\nmake both existing and future operations on mm\u0027s more flexible and convenient.\n\nSigned-off-by: Stephen Wilson \u003cwilsons@start.ca\u003e\nReviewed-by: Michel Lespinasse \u003cwalken@google.com\u003e\nCc: Thomas Gleixner \u003ctglx@linutronix.de\u003e\nCc: Ingo Molnar \u003cmingo@redhat.com\u003e\nCc: \"H. Peter Anvin\" \u003chpa@zytor.com\u003e\nSigned-off-by: Al Viro \u003cviro@zeniv.linux.org.uk\u003e\n"
    },
    {
      "commit": "318b275fbca1ab9ec0862de71420e0e92c3d1aa7",
      "tree": "aa4984469443ed53b4e7fa23d3f91966e536a803",
      "parents": [
        "5fda1bd5b8869574dad8e1f9f71e23bf0c186274"
      ],
      "author": {
        "name": "Gleb Natapov",
        "email": "gleb@redhat.com",
        "time": "Tue Mar 22 16:30:51 2011 -0700"
      },
      "committer": {
        "name": "Linus Torvalds",
        "email": "torvalds@linux-foundation.org",
        "time": "Tue Mar 22 17:44:02 2011 -0700"
      },
      "message": "mm: allow GUP to fail instead of waiting on a page\n\nGUP user may want to try to acquire a reference to a page if it is already\nin memory, but not if IO, to bring it in, is needed.  For example KVM may\ntell vcpu to schedule another guest process if current one is trying to\naccess swapped out page.  Meanwhile, the page will be swapped in and the\nguest process, that depends on it, will be able to run again.\n\nThis patch adds FAULT_FLAG_RETRY_NOWAIT (suggested by Linus) and\nFOLL_NOWAIT follow_page flags.  FAULT_FLAG_RETRY_NOWAIT, when used in\nconjunction with VM_FAULT_ALLOW_RETRY, indicates to handle_mm_fault that\nit shouldn\u0027t drop mmap_sem and wait on a page, but return VM_FAULT_RETRY\ninstead.\n\n[akpm@linux-foundation.org: improve FOLL_NOWAIT comment]\nSigned-off-by: Gleb Natapov \u003cgleb@redhat.com\u003e\nCc: Linus Torvalds \u003ctorvalds@linux-foundation.org\u003e\nCc: Hugh Dickins \u003chughd@google.com\u003e\nAcked-by: Rik van Riel \u003criel@redhat.com\u003e\nCc: Michel Lespinasse \u003cwalken@google.com\u003e\nCc: Avi Kivity \u003cavi@redhat.com\u003e\nCc: Marcelo Tosatti \u003cmtosatti@redhat.com\u003e\nSigned-off-by: Andrew Morton \u003cakpm@linux-foundation.org\u003e\nSigned-off-by: Linus Torvalds \u003ctorvalds@linux-foundation.org\u003e\n"
    },
    {
      "commit": "e16b396ce314b2bcdfe6c173fe075bf8e3432368",
      "tree": "640f0f56f2ea676647af4eb42d32fa56be2ee549",
      "parents": [
        "7fd23a24717a327a66f3c32d11a20a2f169c824f",
        "e6e8dd5055a974935af1398c8648d4a9359b0ecb"
      ],
      "author": {
        "name": "Linus Torvalds",
        "email": "torvalds@linux-foundation.org",
        "time": "Fri Mar 18 10:37:40 2011 -0700"
      },
      "committer": {
        "name": "Linus Torvalds",
        "email": "torvalds@linux-foundation.org",
        "time": "Fri Mar 18 10:37:40 2011 -0700"
      },
      "message": "Merge branch \u0027for-linus\u0027 of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial\n\n* \u0027for-linus\u0027 of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial: (47 commits)\n  doc: CONFIG_UNEVICTABLE_LRU doesn\u0027t exist anymore\n  Update cpuset info \u0026 webiste for cgroups\n  dcdbas: force SMI to happen when expected\n  arch/arm/Kconfig: remove one to many l\u0027s in the word.\n  asm-generic/user.h: Fix spelling in comment\n  drm: fix printk typo \u0027sracth\u0027\n  Remove one to many n\u0027s in a word\n  Documentation/filesystems/romfs.txt: fixing link to genromfs\n  drivers:scsi Change printk typo initate -\u003e initiate\n  serial, pch uart: Remove duplicate inclusion of linux/pci.h header\n  fs/eventpoll.c: fix spelling\n  mm: Fix out-of-date comments which refers non-existent functions\n  drm: Fix printk typo \u0027failled\u0027\n  coh901318.c: Change initate to initiate.\n  mbox-db5500.c Change initate to initiate.\n  edac: correct i82975x error-info reported\n  edac: correct i82975x mci initialisation\n  edac: correct commented info\n  fs: update comments to point correct document\n  target: remove duplicate include of target/target_core_device.h from drivers/target/target_core_hba.c\n  ...\n\nTrivial conflict in fs/eventpoll.c (spelling vs addition)\n"
    },
    {
      "commit": "69ebb83e13e514222b0ae4f8bd813a17679ed876",
      "tree": "62ccc7ee1e840d0a6cc01a9fc1c44a5f4e6f1edd",
      "parents": [
        "0014bd990e69063b0fb78940b35439d7980ce3ee"
      ],
      "author": {
        "name": "Huang Ying",
        "email": "ying.huang@intel.com",
        "time": "Sun Jan 30 11:15:48 2011 +0800"
      },
      "committer": {
        "name": "Marcelo Tosatti",
        "email": "mtosatti@redhat.com",
        "time": "Thu Mar 17 13:08:27 2011 -0300"
      },
      "message": "mm: make __get_user_pages return -EHWPOISON for HWPOISON page optionally\n\nMake __get_user_pages return -EHWPOISON for HWPOISON page only if\nFOLL_HWPOISON is specified.  With this patch, the interested callers\ncan distinguish HWPOISON pages from general FAULT pages, while other\ncallers will still get -EFAULT for all these pages, so the user space\ninterface need not to be changed.\n\nThis feature is needed by KVM, where UCR MCE should be relayed to\nguest for HWPOISON page, while instruction emulation and MMIO will be\ntried for general FAULT page.\n\nThe idea comes from Andrew Morton.\n\nSigned-off-by: Huang Ying \u003cying.huang@intel.com\u003e\nCc: Andrew Morton \u003cakpm@linux-foundation.org\u003e\nSigned-off-by: Marcelo Tosatti \u003cmtosatti@redhat.com\u003e\nSigned-off-by: Avi Kivity \u003cavi@redhat.com\u003e\n"
    },
    {
      "commit": "0014bd990e69063b0fb78940b35439d7980ce3ee",
      "tree": "56d4576cc07954eb304abaf602aba44a6aa2a4f1",
      "parents": [
        "91c9c3eda4f3066980d13a6907ef84f3a99364bd"
      ],
      "author": {
        "name": "Huang Ying",
        "email": "ying.huang@intel.com",
        "time": "Sun Jan 30 11:15:47 2011 +0800"
      },
      "committer": {
        "name": "Marcelo Tosatti",
        "email": "mtosatti@redhat.com",
        "time": "Thu Mar 17 13:08:27 2011 -0300"
      },
      "message": "mm: export __get_user_pages\n\nIn most cases, get_user_pages and get_user_pages_fast should be used\nto pin user pages in memory.  But sometimes, some special flags except\nFOLL_GET, FOLL_WRITE and FOLL_FORCE are needed, for example in\nfollowing patch, KVM needs FOLL_HWPOISON.  To support these users,\n__get_user_pages is exported directly.\n\nThere are some symbol name conflicts in infiniband driver, fixed them too.\n\nSigned-off-by: Huang Ying \u003cying.huang@intel.com\u003e\nCC: Andrew Morton \u003cakpm@linux-foundation.org\u003e\nCC: Michel Lespinasse \u003cwalken@google.com\u003e\nCC: Roland Dreier \u003croland@kernel.org\u003e\nCC: Ralph Campbell \u003cinfinipath@qlogic.com\u003e\nSigned-off-by: Marcelo Tosatti \u003cmtosatti@redhat.com\u003e\n"
    },
    {
      "commit": "2aa15890f3c191326678f1bd68af61ec6b8753ec",
      "tree": "347f5fdcd0678b12be92f266cd2a5e7a74749403",
      "parents": [
        "78794b2cdeac37ac1fd950fc9c4454b56d88ac03"
      ],
      "author": {
        "name": "Miklos Szeredi",
        "email": "mszeredi@suse.cz",
        "time": "Wed Feb 23 13:49:47 2011 +0100"
      },
      "committer": {
        "name": "Linus Torvalds",
        "email": "torvalds@linux-foundation.org",
        "time": "Wed Feb 23 19:52:52 2011 -0800"
      },
      "message": "mm: prevent concurrent unmap_mapping_range() on the same inode\n\nMichael Leun reported that running parallel opens on a fuse filesystem\ncan trigger a \"kernel BUG at mm/truncate.c:475\"\n\nGurudas Pai reported the same bug on NFS.\n\nThe reason is, unmap_mapping_range() is not prepared for more than\none concurrent invocation per inode.  For example:\n\n  thread1: going through a big range, stops in the middle of a vma and\n     stores the restart address in vm_truncate_count.\n\n  thread2: comes in with a small (e.g. single page) unmap request on\n     the same vma, somewhere before restart_address, finds that the\n     vma was already unmapped up to the restart address and happily\n     returns without doing anything.\n\nAnother scenario would be two big unmap requests, both having to\nrestart the unmapping and each one setting vm_truncate_count to its\nown value.  This could go on forever without any of them being able to\nfinish.\n\nTruncate and hole punching already serialize with i_mutex.  Other\ncallers of unmap_mapping_range() do not, and it\u0027s difficult to get\ni_mutex protection for all callers.  In particular -\u003ed_revalidate(),\nwhich calls invalidate_inode_pages2_range() in fuse, may be called\nwith or without i_mutex.\n\nThis patch adds a new mutex to \u0027struct address_space\u0027 to prevent\nrunning multiple concurrent unmap_mapping_range() on the same mapping.\n\n[ We\u0027ll hopefully get rid of all this with the upcoming mm\n  preemptibility series by Peter Zijlstra, the \"mm: Remove i_mmap_mutex\n  lockbreak\" patch in particular.  
But that is for 2.6.39 ]\n\nSigned-off-by: Miklos Szeredi \u003cmszeredi@suse.cz\u003e\nReported-by: Michael Leun \u003clkml20101129@newton.leun.net\u003e\nReported-by: Gurudas Pai \u003cgurudas.pai@oracle.com\u003e\nTested-by: Gurudas Pai \u003cgurudas.pai@oracle.com\u003e\nAcked-by: Hugh Dickins \u003chughd@google.com\u003e\nCc: stable@kernel.org\nSigned-off-by: Linus Torvalds \u003ctorvalds@linux-foundation.org\u003e\n"
    },
    {
      "commit": "a335b2e17301afae9e794f21071a2fcdd5879c1e",
      "tree": "d6e3e3a5fad04c241d3c18ade63b8d239b30b6f9",
      "parents": [
        "ec4f2ac471e25d3e0cea05abb8da34c05a0868f9"
      ],
      "author": {
        "name": "Ryota Ozaki",
        "email": "ozaki.ryota@gmail.com",
        "time": "Thu Feb 10 13:56:28 2011 +0900"
      },
      "committer": {
        "name": "Jiri Kosina",
        "email": "jkosina@suse.cz",
        "time": "Thu Feb 17 16:54:39 2011 +0100"
      },
      "message": "mm: Fix out-of-date comments which refers non-existent functions\n\ndo_file_page and do_no_page don\u0027t exist anymore, but some comments\nstill refers them. The patch fixes them by replacing them with\nexisting ones.\n\nSigned-off-by: Ryota Ozaki \u003cozaki.ryota@gmail.com\u003e\nAcked-by: KAMEZAWA Hiroyuki \u003ckamezawa.hiroyu@jp.fujitsu.com\u003e\nReviewed-by: Minchan Kim \u003cminchan.kim@gmail.com\u003e\nSigned-off-by: Jiri Kosina \u003cjkosina@suse.cz\u003e\n"
    },
    {
      "commit": "419d8c96dbfa558f00e623023917d0a5afc46129",
      "tree": "74882b1ed7340d3d0e448b343c52fd12969ea518",
      "parents": [
        "e15f8c01af924e611bc7be1e45449c4a74e5dfdd"
      ],
      "author": {
        "name": "Michel Lespinasse",
        "email": "walken@google.com",
        "time": "Thu Feb 10 15:01:33 2011 -0800"
      },
      "committer": {
        "name": "Linus Torvalds",
        "email": "torvalds@linux-foundation.org",
        "time": "Fri Feb 11 16:12:20 2011 -0800"
      },
      "message": "mlock: do not munlock pages in __do_fault()\n\nIf the page is going to be written to, __do_page needs to break COW.\n\nHowever, the old page (before breaking COW) was never mapped mapped into\nthe current pte (__do_fault is only called when the pte is not present),\nso vmscan can\u0027t have marked the old page as PageMlocked due to being\nmapped in __do_fault\u0027s VMA.  Therefore, __do_fault() does not need to\nworry about clearing PageMlocked() on the old page.\n\nSigned-off-by: Michel Lespinasse \u003cwalken@google.com\u003e\nReviewed-by: KAMEZAWA Hiroyuki \u003ckamezawa.hiroyu@jp.fujitsu.com\u003e\nAcked-by: Hugh Dickins \u003chughd@google.com\u003e\nCc: Rik van Riel \u003criel@redhat.com\u003e\nCc: Andrea Arcangeli \u003caarcange@redhat.com\u003e\nSigned-off-by: Andrew Morton \u003cakpm@linux-foundation.org\u003e\nSigned-off-by: Linus Torvalds \u003ctorvalds@linux-foundation.org\u003e\n"
    },
    {
      "commit": "e15f8c01af924e611bc7be1e45449c4a74e5dfdd",
      "tree": "7319b3d6834707996b16fd8d13ab745ad9b13a91",
      "parents": [
        "e6d2e2b2b1e1455df16d68a78f4a3874c7b3ad20"
      ],
      "author": {
        "name": "Michel Lespinasse",
        "email": "walken@google.com",
        "time": "Thu Feb 10 15:01:32 2011 -0800"
      },
      "committer": {
        "name": "Linus Torvalds",
        "email": "torvalds@linux-foundation.org",
        "time": "Fri Feb 11 16:12:20 2011 -0800"
      },
      "message": "mlock: fix race when munlocking pages in do_wp_page()\n\nvmscan can lazily find pages that are mapped within VM_LOCKED vmas, and\nset the PageMlocked bit on these pages, transfering them onto the\nunevictable list.  When do_wp_page() breaks COW within a VM_LOCKED vma,\nit may need to clear PageMlocked on the old page and set it on the new\npage instead.\n\nThis change fixes an issue where do_wp_page() was clearing PageMlocked\non the old page while the pte was still pointing to it (as well as\nrmap).  Therefore, we were not protected against vmscan immediately\ntransfering the old page back onto the unevictable list.  This could\ncause pages to get stranded there forever.\n\nI propose to move the corresponding code to the end of do_wp_page(),\nafter the pte (and rmap) have been pointed to the new page.\nAdditionally, we can use munlock_vma_page() instead of\nclear_page_mlock(), so that the old page stays mlocked if there are\nstill other VM_LOCKED vmas mapping it.\n\nSigned-off-by: Michel Lespinasse \u003cwalken@google.com\u003e\nReviewed-by: KAMEZAWA Hiroyuki \u003ckamezawa.hiroyu@jp.fujitsu.com\u003e\nAcked-by: Hugh Dickins \u003chughd@google.com\u003e\nCc: Rik van Riel \u003criel@redhat.com\u003e\nCc: Andrea Arcangeli \u003caarcange@redhat.com\u003e\nSigned-off-by: Andrew Morton \u003cakpm@linux-foundation.org\u003e\nSigned-off-by: Linus Torvalds \u003ctorvalds@linux-foundation.org\u003e\n"
    },
    {
      "commit": "14d1a55cd26f1860f837f37ae42520c7c13b1347",
      "tree": "b80634a6a2a5f306fd1c3fc408993dd9fc98202b",
      "parents": [
        "05b258e99725112c4febeab4fad23ea2c8908a3a"
      ],
      "author": {
        "name": "Andrea Arcangeli",
        "email": "aarcange@redhat.com",
        "time": "Thu Jan 13 15:47:15 2011 -0800"
      },
      "committer": {
        "name": "Linus Torvalds",
        "email": "torvalds@linux-foundation.org",
        "time": "Thu Jan 13 17:32:47 2011 -0800"
      },
      "message": "thp: add debug checks for mapcount related invariants\n\nAdd debug checks for invariants that if broken could lead to mapcount vs\npage_mapcount debug checks to trigger later in split_huge_page.\n\nSigned-off-by: Andrea Arcangeli \u003caarcange@redhat.com\u003e\nSigned-off-by: Andrew Morton \u003cakpm@linux-foundation.org\u003e\nSigned-off-by: Linus Torvalds \u003ctorvalds@linux-foundation.org\u003e\n"
    },
    {
      "commit": "500d65d471018d9a13b0d51b7e141ed2a3555c1d",
      "tree": "046dc2337f87a1a365fde126fab7f4ac9ae82793",
      "parents": [
        "0af4e98b6b095c74588af04872f83d333c958c32"
      ],
      "author": {
        "name": "Andrea Arcangeli",
        "email": "aarcange@redhat.com",
        "time": "Thu Jan 13 15:46:55 2011 -0800"
      },
      "committer": {
        "name": "Linus Torvalds",
        "email": "torvalds@linux-foundation.org",
        "time": "Thu Jan 13 17:32:42 2011 -0800"
      },
      "message": "thp: pmd_trans_huge migrate bugcheck\n\nNo pmd_trans_huge should ever materialize in migration ptes areas, because\nwe split the hugepage before migration ptes are instantiated.\n\nSigned-off-by: Andrea Arcangeli \u003caarcange@redhat.com\u003e\nAcked-by: Rik van Riel \u003criel@redhat.com\u003e\nSigned-off-by: Andrew Morton \u003cakpm@linux-foundation.org\u003e\nSigned-off-by: Linus Torvalds \u003ctorvalds@linux-foundation.org\u003e\n"
    },
    {
      "commit": "f66055ab6fb9731dbfce320c5202ef4441b5d77f",
      "tree": "de347e42d1e5cf481344a153d272e86a95b774f4",
      "parents": [
        "05759d380a9d7f131a475186c07fce58ceaa8902"
      ],
      "author": {
        "name": "Andrea Arcangeli",
        "email": "aarcange@redhat.com",
        "time": "Thu Jan 13 15:46:54 2011 -0800"
      },
      "committer": {
        "name": "Linus Torvalds",
        "email": "torvalds@linux-foundation.org",
        "time": "Thu Jan 13 17:32:42 2011 -0800"
      },
      "message": "thp: verify pmd_trans_huge isn\u0027t leaking\n\npte_trans_huge must not leak in certain vmas like the mmio special pfn or\nfilebacked mappings.\n\nSigned-off-by: Andrea Arcangeli \u003caarcange@redhat.com\u003e\nAcked-by: Rik van Riel \u003criel@redhat.com\u003e\nAcked-by: Mel Gorman \u003cmel@csn.ul.ie\u003e\nSigned-off-by: Andrew Morton \u003cakpm@linux-foundation.org\u003e\nSigned-off-by: Linus Torvalds \u003ctorvalds@linux-foundation.org\u003e\n"
    },
    {
      "commit": "8a07651ee8cdaa9e27cb4ae372aed347533770f5",
      "tree": "07a442e66c3f608e174edd3b8a2fd154f4219380",
      "parents": [
        "71e3aac0724ffe8918992d76acfe3aad7d8724a5"
      ],
      "author": {
        "name": "Hugh Dickins",
        "email": "hughd@google.com",
        "time": "Thu Jan 13 15:46:52 2011 -0800"
      },
      "committer": {
        "name": "Linus Torvalds",
        "email": "torvalds@linux-foundation.org",
        "time": "Thu Jan 13 17:32:42 2011 -0800"
      },
      "message": "thp: transparent hugepage core fixlet\n\nIf you configure THP in addition to HUGETLB_PAGE on x86_32 without PAE,\nthe p?d-folding works out that munlock_vma_pages_range() can crash to\nfollow_page()\u0027s pud_huge() BUG_ON(flags \u0026 FOLL_GET): it needs the same\nVM_HUGETLB check already there on the pmd_huge() line.  Conveniently,\nopenSUSE provides a \"blogd\" which tests this out at startup!\n\nSigned-off-by: Hugh Dickins \u003chughd@google.com\u003e\nCc: Rik van Riel \u003criel@redhat.com\u003e\nCc: Johannes Weiner \u003channes@cmpxchg.org\u003e\nCc: Andrea Arcangeli \u003caarcange@redhat.com\u003e\nSigned-off-by: Andrew Morton \u003cakpm@linux-foundation.org\u003e\nSigned-off-by: Linus Torvalds \u003ctorvalds@linux-foundation.org\u003e\n"
    },
    {
      "commit": "71e3aac0724ffe8918992d76acfe3aad7d8724a5",
      "tree": "4ff96e1fc3e53bc9d25b859bf7e5bdbab8f1b25a",
      "parents": [
        "5c3240d92e29ae7bfb9cb58a9b37e80ab40894ff"
      ],
      "author": {
        "name": "Andrea Arcangeli",
        "email": "aarcange@redhat.com",
        "time": "Thu Jan 13 15:46:52 2011 -0800"
      },
      "committer": {
        "name": "Linus Torvalds",
        "email": "torvalds@linux-foundation.org",
        "time": "Thu Jan 13 17:32:42 2011 -0800"
      },
      "message": "thp: transparent hugepage core\n\nLately I\u0027ve been working to make KVM use hugepages transparently without\nthe usual restrictions of hugetlbfs.  Some of the restrictions I\u0027d like to\nsee removed:\n\n1) hugepages have to be swappable or the guest physical memory remains\n   locked in RAM and can\u0027t be paged out to swap\n\n2) if a hugepage allocation fails, regular pages should be allocated\n   instead and mixed in the same vma without any failure and without\n   userland noticing\n\n3) if some task quits and more hugepages become available in the\n   buddy, guest physical memory backed by regular pages should be\n   relocated on hugepages automatically in regions under\n   madvise(MADV_HUGEPAGE) (ideally event driven by waking up the\n   kernel deamon if the order\u003dHPAGE_PMD_SHIFT-PAGE_SHIFT list becomes\n   not null)\n\n4) avoidance of reservation and maximization of use of hugepages whenever\n   possible. Reservation (needed to avoid runtime fatal faliures) may be ok for\n   1 machine with 1 database with 1 database cache with 1 database cache size\n   known at boot time. It\u0027s definitely not feasible with a virtualization\n   hypervisor usage like RHEV-H that runs an unknown number of virtual machines\n   with an unknown size of each virtual machine with an unknown amount of\n   pagecache that could be potentially useful in the host for guest not using\n   O_DIRECT (aka cache\u003doff).\n\nhugepages in the virtualization hypervisor (and also in the guest!) are\nmuch more important than in a regular host not using virtualization,\nbecasue with NPT/EPT they decrease the tlb-miss cacheline accesses from 24\nto 19 in case only the hypervisor uses transparent hugepages, and they\ndecrease the tlb-miss cacheline accesses from 19 to 15 in case both the\nlinux hypervisor and the linux guest both uses this patch (though the\nguest will limit the addition speedup to anonymous regions only for\nnow...).  
Even more important is that the tlb miss handler is much slower\non an NPT/EPT guest than for a regular shadow paging or no-virtualization\nscenario.  So maximizing the amount of virtual memory cached by the TLB\npays off significantly more with NPT/EPT than without (even if there would\nbe no significant speedup in the tlb-miss runtime).\n\nThe first (and more tedious) part of this work requires allowing the VM to\nhandle anonymous hugepages mixed with regular pages transparently on\nregular anonymous vmas.  This is what this patch tries to achieve in the\nleast intrusive possible way.  We want hugepages and hugetlb to be used in\na way so that all applications can benefit without changes (as usual we\nleverage the KVM virtualization design: by improving the Linux VM at\nlarge, KVM gets the performance boost too).\n\nThe most important design choice is: always fall back to 4k allocation if\nthe hugepage allocation fails!  This is the _very_ opposite of some large\npagecache patches that failed with -EIO back then if a 64k (or similar)\nallocation failed...\n\nSecond important decision (to reduce the impact of the feature on the\nexisting pagetable handling code) is that at any time we can split a\nhugepage into 512 regular pages and it has to be done with an operation\nthat can\u0027t fail.  This way the reliability of the swapping isn\u0027t decreased\n(no need to allocate memory when we are short on memory to swap) and it\u0027s\ntrivial to plug a split_huge_page* one-liner where needed without\npolluting the VM.  Over time we can teach mprotect, mremap and friends to\nhandle pmd_trans_huge natively without calling split_huge_page*.  
The fact\nit can\u0027t fail isn\u0027t just for swap: if split_huge_page would return -ENOMEM\n(instead of the current void) we\u0027d need to roll back the mprotect from the\nmiddle of it (ideally including undoing the split_vma) which would be a\nbig change and in the very wrong direction (it\u0027d likely be simpler not to\ncall split_huge_page at all and to teach mprotect and friends to handle\nhugepages instead of rolling them back from the middle).  In short the\nvery value of split_huge_page is that it can\u0027t fail.\n\nThe collapsing and madvise(MADV_HUGEPAGE) part will remain separated and\nincremental and it\u0027ll just be a \"harmless\" addition later if this initial\npart is agreed upon.  It also should be noted that locking-wise replacing\nregular pages with hugepages is going to be very easy if compared to what\nI\u0027m doing below in split_huge_page, as it will only happen when\npage_count(page) matches page_mapcount(page) if we can take the PG_lock\nand mmap_sem in write mode.  collapse_huge_page will be a \"best effort\"\nthat (unlike split_huge_page) can fail at the minimal sign of trouble and\nwe can try again later.  collapse_huge_page will be similar to how KSM\nworks and the madvise(MADV_HUGEPAGE) will work similar to\nmadvise(MADV_MERGEABLE).\n\nThe default I like is that transparent hugepages are used at page fault\ntime.  This can be changed with\n/sys/kernel/mm/transparent_hugepage/enabled.  The control knob can be set\nto three values \"always\", \"madvise\", \"never\" which mean respectively that\nhugepages are always used, or only inside madvise(MADV_HUGEPAGE) regions,\nor never used.  /sys/kernel/mm/transparent_hugepage/defrag instead\ncontrols if the hugepage allocation should defrag memory aggressively\n\"always\", only inside \"madvise\" regions, or \"never\".\n\nThe pmd_trans_splitting/pmd_trans_huge locking is very solid.  
The\nput_page (from get_user_page users that can\u0027t use mmu notifier like\nO_DIRECT) that runs against a __split_huge_page_refcount instead was a\npain to serialize in a way that would result always in a coherent page\ncount for both tail and head.  I think my locking solution with a\ncompound_lock taken only after the page_first is valid and is still a\nPageHead should be safe but it surely needs review from SMP race point of\nview.  In short there is no current existing way to serialize the O_DIRECT\nfinal put_page against split_huge_page_refcount so I had to invent a new\none (O_DIRECT loses knowledge on the mapping status by the time gup_fast\nreturns so...).  And I didn\u0027t want to impact all gup/gup_fast users for\nnow, maybe if we change the gup interface substantially we can avoid this\nlocking, I admit I didn\u0027t think too much about it because changing the gup\nunpinning interface would be invasive.\n\nIf we ignored O_DIRECT we could stick to the existing compound refcounting\ncode, by simply adding a get_user_pages_fast_flags(foll_flags) where KVM\n(and any other mmu notifier user) would call it without FOLL_GET (and if\nFOLL_GET isn\u0027t set we\u0027d just BUG_ON if nobody registered itself in the\ncurrent task mmu notifier list yet).  But O_DIRECT is fundamental for\ndecent performance of virtualized I/O on fast storage so we can\u0027t avoid it\nto solve the race of put_page against split_huge_page_refcount to achieve\na complete hugepage feature for KVM.\n\nSwap and oom work fine (well just like with regular pages ;).  
MMU\nnotifier is handled transparently too, with the exception of the young bit\non the pmd, that didn\u0027t have a range check but I think KVM will be fine\nbecause the whole point of hugepages is that EPT/NPT will also use a huge\npmd when they notice gup returns pages with PageCompound set, so they\nwon\u0027t care about a range and there\u0027s just the pmd young bit to check in that\ncase.\n\nNOTE: in some cases if the L2 cache is small, this may slow down and waste\nmemory during COWs because 4M of memory are accessed in a single fault\ninstead of 8k (the payoff is that after COW the program can run faster).\nSo we might want to switch the copy_huge_page (and clear_huge_page too) to\nnon-temporal stores.  I also extensively researched ways to avoid this\ncache thrashing with a full prefault logic that would cow in 8k/16k/32k/64k\nup to 1M (I can send those patches that fully implemented prefault) but I\nconcluded they\u0027re not worth it and they add a huge additional complexity\nand they remove all tlb benefits until the full hugepage has been faulted\nin, to save a little bit of memory and some cache during app startup, but\nthey still don\u0027t improve substantially the cache-thrashing during startup\nif the prefault happens in \u003e4k chunks.  One reason is that those 4k pte\nentries copied are still mapped on a perfectly cache-colored hugepage, so\nthe thrashing is the worst one can generate in those copies (cow of 4k page\ncopies aren\u0027t so well colored so they thrash less, but again this results\nin software running faster after the page fault).  Those prefault patches\nallowed things like a pte where post-cow pages were local 4k regular anon\npages and the not-yet-cowed pte entries were pointing in the middle of\nsome hugepage mapped read-only.  If it doesn\u0027t pay off substantially with\ntoday\u0027s hardware it will pay off even less in the future with larger l2\ncaches, and the prefault logic would bloat the VM a lot.  
If one is\nembedded, transparent_hugepage can be disabled during boot with sysfs or\nwith the boot commandline parameter transparent_hugepage\u003d0 (or\ntransparent_hugepage\u003d2 to restrict hugepages inside madvise regions) that\nwill ensure not a single hugepage is allocated at boot time.  It is simple\nenough to just disable transparent hugepage globally and let transparent\nhugepages be allocated selectively by applications in the MADV_HUGEPAGE\nregion (both at page fault time, and if enabled with the\ncollapse_huge_page too through the kernel daemon).\n\nThis patch supports only hugepages mapped in the pmd, archs that have\nsmaller hugepages will not fit in this patch alone.  Also some archs like\npower have certain tlb limits that prevent mixing different page sizes in\nthe same regions so they will not fit in this framework that requires\n\"graceful fallback\" to basic PAGE_SIZE in case of physical memory\nfragmentation.  hugetlbfs remains a perfect fit for those because its\nsoftware limits happen to match the hardware limits.  
hugetlbfs also\nremains a perfect fit for hugepage sizes like 1GByte that cannot be hoped\nto be found not fragmented after a certain system uptime and that would be\nvery expensive to defragment with relocation, so requiring reservation.\nhugetlbfs is the \"reservation way\", the point of transparent hugepages is\nnot to have any reservation at all and maximizing the use of cache and\nhugepages at all times automatically.\n\nSome performance result:\n\nvmx andrea # LD_PRELOAD\u003d/usr/lib64/libhugetlbfs.so HUGETLB_MORECORE\u003dyes HUGETLB_PATH\u003d/mnt/huge/ ./largep\nages3\nmemset page fault 1566023\nmemset tlb miss 453854\nmemset second tlb miss 453321\nrandom access tlb miss 41635\nrandom access second tlb miss 41658\nvmx andrea # LD_PRELOAD\u003d/usr/lib64/libhugetlbfs.so HUGETLB_MORECORE\u003dyes HUGETLB_PATH\u003d/mnt/huge/ ./largepages3\nmemset page fault 1566471\nmemset tlb miss 453375\nmemset second tlb miss 453320\nrandom access tlb miss 41636\nrandom access second tlb miss 41637\nvmx andrea # ./largepages3\nmemset page fault 1566642\nmemset tlb miss 453417\nmemset second tlb miss 453313\nrandom access tlb miss 41630\nrandom access second tlb miss 41647\nvmx andrea # ./largepages3\nmemset page fault 1566872\nmemset tlb miss 453418\nmemset second tlb miss 453315\nrandom access tlb miss 41618\nrandom access second tlb miss 41659\nvmx andrea # echo 0 \u003e /proc/sys/vm/transparent_hugepage\nvmx andrea # ./largepages3\nmemset page fault 2182476\nmemset tlb miss 460305\nmemset second tlb miss 460179\nrandom access tlb miss 44483\nrandom access second tlb miss 44186\nvmx andrea # ./largepages3\nmemset page fault 2182791\nmemset tlb miss 460742\nmemset second tlb miss 459962\nrandom access tlb miss 43981\nrandom access second tlb miss 43988\n\n\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\n#include \u003cstdio.h\u003e\n#include \u003cstdlib.h\u003e\n#include \u003cstring.h\u003e\n#include \u003csys/time.h\u003e\n\n#define SIZE 
(3UL*1024*1024*1024)\n\nint main()\n{\n\tchar *p \u003d malloc(SIZE), *p2;\n\tstruct timeval before, after;\n\n\tgettimeofday(\u0026before, NULL);\n\tmemset(p, 0, SIZE);\n\tgettimeofday(\u0026after, NULL);\n\tprintf(\"memset page fault %lu\\n\",\n\t       (after.tv_sec-before.tv_sec)*1000000UL +\n\t       after.tv_usec-before.tv_usec);\n\n\tgettimeofday(\u0026before, NULL);\n\tmemset(p, 0, SIZE);\n\tgettimeofday(\u0026after, NULL);\n\tprintf(\"memset tlb miss %lu\\n\",\n\t       (after.tv_sec-before.tv_sec)*1000000UL +\n\t       after.tv_usec-before.tv_usec);\n\n\tgettimeofday(\u0026before, NULL);\n\tmemset(p, 0, SIZE);\n\tgettimeofday(\u0026after, NULL);\n\tprintf(\"memset second tlb miss %lu\\n\",\n\t       (after.tv_sec-before.tv_sec)*1000000UL +\n\t       after.tv_usec-before.tv_usec);\n\n\tgettimeofday(\u0026before, NULL);\n\tfor (p2 \u003d p; p2 \u003c p+SIZE; p2 +\u003d 4096)\n\t\t*p2 \u003d 0;\n\tgettimeofday(\u0026after, NULL);\n\tprintf(\"random access tlb miss %lu\\n\",\n\t       (after.tv_sec-before.tv_sec)*1000000UL +\n\t       after.tv_usec-before.tv_usec);\n\n\tgettimeofday(\u0026before, NULL);\n\tfor (p2 \u003d p; p2 \u003c p+SIZE; p2 +\u003d 4096)\n\t\t*p2 \u003d 0;\n\tgettimeofday(\u0026after, NULL);\n\tprintf(\"random access second tlb miss %lu\\n\",\n\t       (after.tv_sec-before.tv_sec)*1000000UL +\n\t       after.tv_usec-before.tv_usec);\n\n\treturn 0;\n}\n\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\n\nSigned-off-by: Andrea Arcangeli \u003caarcange@redhat.com\u003e\nAcked-by: Rik van Riel \u003criel@redhat.com\u003e\nSigned-off-by: Johannes Weiner \u003channes@cmpxchg.org\u003e\nSigned-off-by: Andrew Morton \u003cakpm@linux-foundation.org\u003e\nSigned-off-by: Linus Torvalds \u003ctorvalds@linux-foundation.org\u003e\n"
    },
    {
      "commit": "47ad8475c000141eacb3ecda5e5ce4b43a9cd04d",
      "tree": "78c29aaf2ae9340e314a25ea08e9724471cf4414",
      "parents": [
        "3f04f62f90d46a82dd73027c5fd7a15daed5c33d"
      ],
      "author": {
        "name": "Andrea Arcangeli",
        "email": "aarcange@redhat.com",
        "time": "Thu Jan 13 15:46:47 2011 -0800"
      },
      "committer": {
        "name": "Linus Torvalds",
        "email": "torvalds@linux-foundation.org",
        "time": "Thu Jan 13 17:32:41 2011 -0800"
      },
      "message": "thp: clear_copy_huge_page\n\nMove the copy/clear_huge_page functions to common code to share between\nhugetlb.c and huge_memory.c.\n\nSigned-off-by: Andrea Arcangeli \u003caarcange@redhat.com\u003e\nAcked-by: Rik van Riel \u003criel@redhat.com\u003e\nAcked-by: Mel Gorman \u003cmel@csn.ul.ie\u003e\nSigned-off-by: Andrew Morton \u003cakpm@linux-foundation.org\u003e\nSigned-off-by: Linus Torvalds \u003ctorvalds@linux-foundation.org\u003e\n"
    },
    {
      "commit": "8ac1f8320a0073f28cf9e0491af4cd98f504f92a",
      "tree": "4dad891c302587fdc7b099b18e05d7dbc5526c64",
      "parents": [
        "64cc6ae001d70bc59e5f854e6b5678f59110df16"
      ],
      "author": {
        "name": "Andrea Arcangeli",
        "email": "aarcange@redhat.com",
        "time": "Thu Jan 13 15:46:43 2011 -0800"
      },
      "committer": {
        "name": "Linus Torvalds",
        "email": "torvalds@linux-foundation.org",
        "time": "Thu Jan 13 17:32:40 2011 -0800"
      },
      "message": "thp: pte alloc trans splitting\n\npte alloc routines must wait for split_huge_page if the pmd is not present\nand not null (i.e.  pmd_trans_splitting).  The additional branches are\noptimized away at compile time by pmd_trans_splitting if the config option\nis off.  However we must pass the vma down in order to know the anon_vma\nlock to wait for.\n\n[akpm@linux-foundation.org: coding-style fixes]\nSigned-off-by: Andrea Arcangeli \u003caarcange@redhat.com\u003e\nAcked-by: Rik van Riel \u003criel@redhat.com\u003e\nAcked-by: Mel Gorman \u003cmel@csn.ul.ie\u003e\nSigned-off-by: Andrew Morton \u003cakpm@linux-foundation.org\u003e\nSigned-off-by: Linus Torvalds \u003ctorvalds@linux-foundation.org\u003e\n"
    },
    {
      "commit": "14fd403f2146f740942d78af4e0ee59396ad8eab",
      "tree": "c87734f6c6639684208d36548aa3687c6f460e23",
      "parents": [
        "2609ae6d10af0531e826335bd1445d1ace17c847"
      ],
      "author": {
        "name": "Andrea Arcangeli",
        "email": "aarcange@redhat.com",
        "time": "Thu Jan 13 15:46:37 2011 -0800"
      },
      "committer": {
        "name": "Linus Torvalds",
        "email": "torvalds@linux-foundation.org",
        "time": "Thu Jan 13 17:32:39 2011 -0800"
      },
      "message": "thp: export maybe_mkwrite\n\nhuge_memory.c needs it too when it falls back to copying hugepages into\nregular fragmented pages if hugepage allocation fails during COW.\n\nSigned-off-by: Andrea Arcangeli \u003caarcange@redhat.com\u003e\nAcked-by: Rik van Riel \u003criel@redhat.com\u003e\nAcked-by: Mel Gorman \u003cmel@csn.ul.ie\u003e\nSigned-off-by: Andrew Morton \u003cakpm@linux-foundation.org\u003e\nSigned-off-by: Linus Torvalds \u003ctorvalds@linux-foundation.org\u003e\n"
    },
    {
      "commit": "53a7706d5ed8f1a53ba062b318773160cc476dde",
      "tree": "a1990d90d5af3686b7a83b2bbc2ae6463971efc5",
      "parents": [
        "5fdb2002131cd4e210b9638a4fc932ec7be491d1"
      ],
      "author": {
        "name": "Michel Lespinasse",
        "email": "walken@google.com",
        "time": "Thu Jan 13 15:46:14 2011 -0800"
      },
      "committer": {
        "name": "Linus Torvalds",
        "email": "torvalds@linux-foundation.org",
        "time": "Thu Jan 13 17:32:36 2011 -0800"
      },
      "message": "mlock: do not hold mmap_sem for extended periods of time\n\n__get_user_pages gets a new \u0027nonblocking\u0027 parameter to signal that the\ncaller is prepared to re-acquire mmap_sem and retry the operation if\nneeded.  This is used to split off long operations if they are going to\nblock on a disk transfer, or when we detect contention on the mmap_sem.\n\n[akpm@linux-foundation.org: remove ref to rwsem_is_contended()]\nSigned-off-by: Michel Lespinasse \u003cwalken@google.com\u003e\nCc: Hugh Dickins \u003chughd@google.com\u003e\nCc: Rik van Riel \u003criel@redhat.com\u003e\nCc: Peter Zijlstra \u003cpeterz@infradead.org\u003e\nCc: Nick Piggin \u003cnpiggin@kernel.dk\u003e\nCc: KOSAKI Motohiro \u003ckosaki.motohiro@jp.fujitsu.com\u003e\nCc: Ingo Molnar \u003cmingo@elte.hu\u003e\nCc: \"H. Peter Anvin\" \u003chpa@zytor.com\u003e\nCc: Thomas Gleixner \u003ctglx@linutronix.de\u003e\nCc: David Howells \u003cdhowells@redhat.com\u003e\nSigned-off-by: Andrew Morton \u003cakpm@linux-foundation.org\u003e\nSigned-off-by: Linus Torvalds \u003ctorvalds@linux-foundation.org\u003e\n"
    },
    {
      "commit": "110d74a921f4d272b47ef6104fcf937df808f4c8",
      "tree": "a2f1705e049f06e1cf8cbaf7d6b3261f0b46b6ab",
      "parents": [
        "fed067da46ad3b9acedaf794a5f05d0bc153280b"
      ],
      "author": {
        "name": "Michel Lespinasse",
        "email": "walken@google.com",
        "time": "Thu Jan 13 15:46:11 2011 -0800"
      },
      "committer": {
        "name": "Linus Torvalds",
        "email": "torvalds@linux-foundation.org",
        "time": "Thu Jan 13 17:32:36 2011 -0800"
      },
      "message": "mm: add FOLL_MLOCK follow_page flag.\n\nMove the code to mlock pages from __mlock_vma_pages_range() to\nfollow_page().\n\nThis allows __mlock_vma_pages_range() to not have to break down work into\n16-page batches.\n\nAn additional motivation for doing this within the present patch series is\nthat it\u0027ll make it easier for a later change to drop mmap_sem when\nblocking on disk (we\u0027d like to be able to resume at the page that was read\nfrom disk instead of at the start of a 16-page batch).\n\nSigned-off-by: Michel Lespinasse \u003cwalken@google.com\u003e\nCc: Hugh Dickins \u003chughd@google.com\u003e\nCc: Rik van Riel \u003criel@redhat.com\u003e\nCc: Peter Zijlstra \u003cpeterz@infradead.org\u003e\nCc: Nick Piggin \u003cnpiggin@kernel.dk\u003e\nCc: KOSAKI Motohiro \u003ckosaki.motohiro@jp.fujitsu.com\u003e\nCc: Ingo Molnar \u003cmingo@elte.hu\u003e\nCc: \"H. Peter Anvin\" \u003chpa@zytor.com\u003e\nCc: Thomas Gleixner \u003ctglx@linutronix.de\u003e\nCc: David Howells \u003cdhowells@redhat.com\u003e\nSigned-off-by: Andrew Morton \u003cakpm@linux-foundation.org\u003e\nSigned-off-by: Linus Torvalds \u003ctorvalds@linux-foundation.org\u003e\n"
    },
    {
      "commit": "5ecfda041e4b4bd858d25bbf5a16c2a6c06d7272",
      "tree": "e6c3e7dac64a5e45b48ab7836318752202579a17",
      "parents": [
        "72ddc8f72270758951ccefb7d190f364d20215ab"
      ],
      "author": {
        "name": "Michel Lespinasse",
        "email": "walken@google.com",
        "time": "Thu Jan 13 15:46:09 2011 -0800"
      },
      "committer": {
        "name": "Linus Torvalds",
        "email": "torvalds@linux-foundation.org",
        "time": "Thu Jan 13 17:32:35 2011 -0800"
      },
      "message": "mlock: avoid dirtying pages and triggering writeback\n\nWhen faulting in pages for mlock(), we want to break COW for anonymous or\nfile pages within VM_WRITABLE, non-VM_SHARED vmas.  However, there is no\nneed to write-fault into VM_SHARED vmas since shared file pages can be\nmlocked first and dirtied later, when/if they actually get written to.\nSkipping the write fault is desirable, as we don\u0027t want to unnecessarily\ncause these pages to be dirtied and queued for writeback.\n\nSigned-off-by: Michel Lespinasse \u003cwalken@google.com\u003e\nCc: Hugh Dickins \u003chughd@google.com\u003e\nCc: Rik van Riel \u003criel@redhat.com\u003e\nCc: Kosaki Motohiro \u003ckosaki.motohiro@jp.fujitsu.com\u003e\nCc: Peter Zijlstra \u003cpeterz@infradead.org\u003e\nCc: Nick Piggin \u003cnpiggin@kernel.dk\u003e\nCc: Theodore Tso \u003ctytso@google.com\u003e\nCc: Michael Rubin \u003cmrubin@google.com\u003e\nCc: Suleiman Souhlal \u003csuleiman@google.com\u003e\nCc: Dave Chinner \u003cdavid@fromorbit.com\u003e\nCc: Christoph Hellwig \u003chch@infradead.org\u003e\nSigned-off-by: Andrew Morton \u003cakpm@linux-foundation.org\u003e\nSigned-off-by: Linus Torvalds \u003ctorvalds@linux-foundation.org\u003e\n"
    },
    {
      "commit": "72ddc8f72270758951ccefb7d190f364d20215ab",
      "tree": "11772272825f72aa3f32c0f9be5cf35155cf1441",
      "parents": [
        "b009c024ff0059e293c1937516f2defe56263650"
      ],
      "author": {
        "name": "Michel Lespinasse",
        "email": "walken@google.com",
        "time": "Thu Jan 13 15:46:08 2011 -0800"
      },
      "committer": {
        "name": "Linus Torvalds",
        "email": "torvalds@linux-foundation.org",
        "time": "Thu Jan 13 17:32:35 2011 -0800"
      },
      "message": "do_wp_page: clarify dirty_page handling\n\nReorganize the code so that dirty pages are handled closer to the place\nthat makes them dirty (handling write fault into shared, writable VMAs).\nNo behavior changes.\n\nSigned-off-by: Michel Lespinasse \u003cwalken@google.com\u003e\nCc: Hugh Dickins \u003chughd@google.com\u003e\nCc: Rik van Riel \u003criel@redhat.com\u003e\nCc: Kosaki Motohiro \u003ckosaki.motohiro@jp.fujitsu.com\u003e\nCc: Peter Zijlstra \u003cpeterz@infradead.org\u003e\nCc: Nick Piggin \u003cnpiggin@kernel.dk\u003e\nCc: Theodore Tso \u003ctytso@google.com\u003e\nCc: Michael Rubin \u003cmrubin@google.com\u003e\nCc: Suleiman Souhlal \u003csuleiman@google.com\u003e\nCc: Dave Chinner \u003cdavid@fromorbit.com\u003e\nCc: Christoph Hellwig \u003chch@infradead.org\u003e\nSigned-off-by: Andrew Morton \u003cakpm@linux-foundation.org\u003e\nSigned-off-by: Linus Torvalds \u003ctorvalds@linux-foundation.org\u003e\n"
    },
    {
      "commit": "b009c024ff0059e293c1937516f2defe56263650",
      "tree": "35d71c837b954e884c429c9c36a85aaf7b033c49",
      "parents": [
        "212260aa07135b327752dc02625c68cf4ce04caf"
      ],
      "author": {
        "name": "Michel Lespinasse",
        "email": "walken@google.com",
        "time": "Thu Jan 13 15:46:07 2011 -0800"
      },
      "committer": {
        "name": "Linus Torvalds",
        "email": "torvalds@linux-foundation.org",
        "time": "Thu Jan 13 17:32:35 2011 -0800"
      },
      "message": "do_wp_page: remove the \u0027reuse\u0027 flag\n\nmlocking a shared, writable vma currently causes the corresponding pages\nto be marked as dirty and queued for writeback.  This seems rather\nunnecessary given that the pages are not being actually modified during\nmlock.  It is understood that for non-shared mappings (file or anon) we\nwant to use a write fault in order to break COW, but there is just no such\nneed for shared mappings.\n\nThe first two patches in this series do not introduce any behavior change.\n The intent there is to make it obvious that dirtying file pages is only\ndone in the (writable, shared) case.  I think this clarifies the code, but\nI wouldn\u0027t mind dropping these two patches if there is no consensus about\nthem.\n\nThe last patch is where we actually avoid dirtying shared mappings during\nmlock.  Note that as a side effect of this, we won\u0027t call page_mkwrite()\nfor the mappings that define it, and won\u0027t be pre-allocating data blocks\nat the FS level if the mapped file was sparsely allocated.  My\nunderstanding is that mlock does not need to provide such guarantee, as\nevidenced by the fact that it never did for the filesystems that don\u0027t\ndefine page_mkwrite() - including some common ones like ext3.  However, I\nwould like to gather feedback on this from filesystem people as a\nprecaution.  If this turns out to be a showstopper, maybe block\npreallocation can be added back on using a different interface.\n\nLarge shared mlocks are getting significantly (\u003e2x) faster in my tests, as\nthe disk can be fully used for reading the file instead of having to share\nbetween this and writeback.\n\nThis patch:\n\nReorganize the code to remove the \u0027reuse\u0027 flag.  
No behavior changes.\n\nSigned-off-by: Michel Lespinasse \u003cwalken@google.com\u003e\nCc: Hugh Dickins \u003chughd@google.com\u003e\nCc: Rik van Riel \u003criel@redhat.com\u003e\nCc: Kosaki Motohiro \u003ckosaki.motohiro@jp.fujitsu.com\u003e\nCc: Peter Zijlstra \u003cpeterz@infradead.org\u003e\nCc: Nick Piggin \u003cnpiggin@kernel.dk\u003e\nCc: Theodore Tso \u003ctytso@google.com\u003e\nCc: Michael Rubin \u003cmrubin@google.com\u003e\nCc: Suleiman Souhlal \u003csuleiman@google.com\u003e\nCc: Dave Chinner \u003cdavid@fromorbit.com\u003e\nCc: Christoph Hellwig \u003chch@infradead.org\u003e\nSigned-off-by: Andrew Morton \u003cakpm@linux-foundation.org\u003e\nSigned-off-by: Linus Torvalds \u003ctorvalds@linux-foundation.org\u003e\n"
    },
    {
      "commit": "3ecb01df3261d3b1f02ccfcf8384e2a255d2a1d0",
      "tree": "1fe91114d8829a511db48d757c787cfede3b929c",
      "parents": [
        "b6472776816af1ed52848c93d26e3edb3b17adab"
      ],
      "author": {
        "name": "Jan Beulich",
        "email": "JBeulich@novell.com",
        "time": "Tue Oct 26 14:22:27 2010 -0700"
      },
      "committer": {
        "name": "Linus Torvalds",
        "email": "torvalds@linux-foundation.org",
        "time": "Tue Oct 26 16:52:13 2010 -0700"
      },
      "message": "use clear_page()/copy_page() in favor of memset()/memcpy() on whole pages\n\nAfter all that\u0027s what they are intended for.\n\nSigned-off-by: Jan Beulich \u003cjbeulich@novell.com\u003e\nCc: Miklos Szeredi \u003cmiklos@szeredi.hu\u003e\nCc: \"Eric W. Biederman\" \u003cebiederm@xmission.com\u003e\nCc: \"Rafael J. Wysocki\" \u003crjw@sisk.pl\u003e\nSigned-off-by: Andrew Morton \u003cakpm@linux-foundation.org\u003e\nSigned-off-by: Linus Torvalds \u003ctorvalds@linux-foundation.org\u003e\n"
    },
    {
      "commit": "1b36ba815bd91f17e31277a44dd5c6b6a5a8d97e",
      "tree": "9d68d66e780c619b01c5d8ddc93e19547b448142",
      "parents": [
        "e6219ec8195efd5640765e657810f262ad9d1a92"
      ],
      "author": {
        "name": "Namhyung Kim",
        "email": "namhyung@gmail.com",
        "time": "Tue Oct 26 14:22:00 2010 -0700"
      },
      "committer": {
        "name": "Linus Torvalds",
        "email": "torvalds@linux-foundation.org",
        "time": "Tue Oct 26 16:52:09 2010 -0700"
      },
      "message": "mm: wrap follow_pte() using __cond_lock()\n\nfollow_pte() conditionally grabs *@ptlp when it returns 0.\nRenaming it and wrapping it with __cond_lock() removes the following warnings:\n\n mm/memory.c:2337:9: warning: context imbalance in \u0027do_wp_page\u0027 - unexpected unlock\n mm/memory.c:3142:19: warning: context imbalance in \u0027handle_mm_fault\u0027 - different lock contexts for basic block\n\nSigned-off-by: Namhyung Kim \u003cnamhyung@gmail.com\u003e\nSigned-off-by: Andrew Morton \u003cakpm@linux-foundation.org\u003e\nSigned-off-by: Linus Torvalds \u003ctorvalds@linux-foundation.org\u003e\n"
    },
    {
      "commit": "e6219ec8195efd5640765e657810f262ad9d1a92",
      "tree": "36c718adce5018fe87398fc7d8ebb7c1dfb14646",
      "parents": [
        "25ca1d6c02fe1c6d90d918867ef670d323725458"
      ],
      "author": {
        "name": "Namhyung Kim",
        "email": "namhyung@gmail.com",
        "time": "Tue Oct 26 14:22:00 2010 -0700"
      },
      "committer": {
        "name": "Linus Torvalds",
        "email": "torvalds@linux-foundation.org",
        "time": "Tue Oct 26 16:52:09 2010 -0700"
      },
      "message": "mm: add lock release annotation on do_wp_page()\n\nThe do_wp_page() releases @ptl but was missing proper annotation.  Add it.\n This removes following warnings from sparse:\n\n mm/memory.c:2337:9: warning: context imbalance in \u0027do_wp_page\u0027 - unexpected unlock\n mm/memory.c:3142:19: warning: context imbalance in \u0027handle_mm_fault\u0027 - different lock contexts for basic block\n\nSigned-off-by: Namhyung Kim \u003cnamhyung@gmail.com\u003e\nSigned-off-by: Andrew Morton \u003cakpm@linux-foundation.org\u003e\nSigned-off-by: Linus Torvalds \u003ctorvalds@linux-foundation.org\u003e\n"
    },
    {
      "commit": "25ca1d6c02fe1c6d90d918867ef670d323725458",
      "tree": "de1709dd1dc7e0b9e9bd91840beb02f12e56b7e0",
      "parents": [
        "e6223a3b19421e3a8df1352d21fd0d71093f44ae"
      ],
      "author": {
        "name": "Namhyung Kim",
        "email": "namhyung@gmail.com",
        "time": "Tue Oct 26 14:21:59 2010 -0700"
      },
      "committer": {
        "name": "Linus Torvalds",
        "email": "torvalds@linux-foundation.org",
        "time": "Tue Oct 26 16:52:09 2010 -0700"
      },
      "message": "mm: wrap get_locked_pte() using __cond_lock()\n\nThe get_locked_pte() conditionally grabs \u0027ptl\u0027 in case of returning\nnon-NULL.  This leads sparse to complain about context imbalance.  Rename\nand wrap it using __cond_lock() to make sparse happy.\n\nSigned-off-by: Namhyung Kim \u003cnamhyung@gmail.com\u003e\nSigned-off-by: Andrew Morton \u003cakpm@linux-foundation.org\u003e\nSigned-off-by: Linus Torvalds \u003ctorvalds@linux-foundation.org\u003e\n"
    },
    {
      "commit": "d065bd810b6deb67d4897a14bfe21f8eb526ba99",
      "tree": "f58c59075732ec4ccba336278c9bdc7ff61bef94",
      "parents": [
        "b522c94da5d9cbc73f708be5e530ebc3bbd4a031"
      ],
      "author": {
        "name": "Michel Lespinasse",
        "email": "walken@google.com",
        "time": "Tue Oct 26 14:21:57 2010 -0700"
      },
      "committer": {
        "name": "Linus Torvalds",
        "email": "torvalds@linux-foundation.org",
        "time": "Tue Oct 26 16:52:09 2010 -0700"
      },
      "message": "mm: retry page fault when blocking on disk transfer\n\nThis change reduces mmap_sem hold times that are caused by waiting for\ndisk transfers when accessing file mapped VMAs.\n\nIt introduces the VM_FAULT_ALLOW_RETRY flag, which indicates that the call\nsite wants mmap_sem to be released if blocking on a pending disk transfer.\nIn that case, filemap_fault() returns the VM_FAULT_RETRY status bit and\ndo_page_fault() will then re-acquire mmap_sem and retry the page fault.\n\nIt is expected that the retry will hit the same page which will now be\ncached, and thus it will complete with a low mmap_sem hold time.\n\nTests:\n\n- microbenchmark: thread A mmaps a large file and does random read accesses\n  to the mmaped area - achieves about 55 iterations/s. Thread B does\n  mmap/munmap in a loop at a separate location - achieves 55 iterations/s\n  before, 15000 iterations/s after.\n\n- We are seeing related effects in some applications in house, which show\n  significant performance regressions when running without this change.\n\n[akpm@linux-foundation.org: fix warning \u0026 crash]\nSigned-off-by: Michel Lespinasse \u003cwalken@google.com\u003e\nAcked-by: Rik van Riel \u003criel@redhat.com\u003e\nAcked-by: Linus Torvalds \u003ctorvalds@linux-foundation.org\u003e\nCc: Nick Piggin \u003cnickpiggin@yahoo.com.au\u003e\nReviewed-by: Wu Fengguang \u003cfengguang.wu@intel.com\u003e\nCc: Ying Han \u003cyinghan@google.com\u003e\nCc: Peter Zijlstra \u003ca.p.zijlstra@chello.nl\u003e\nCc: Ingo Molnar \u003cmingo@elte.hu\u003e\nCc: Thomas Gleixner \u003ctglx@linutronix.de\u003e\nAcked-by: \"H. Peter Anvin\" \u003chpa@zytor.com\u003e\nCc: \u003clinux-arch@vger.kernel.org\u003e\nSigned-off-by: Andrew Morton \u003cakpm@linux-foundation.org\u003e\nSigned-off-by: Linus Torvalds \u003ctorvalds@linux-foundation.org\u003e\n"
    },
    {
      "commit": "ece0e2b6406a995c371e0311190631ea34ad851a",
      "tree": "726a516a91f5f7efe9dbb247ba28d019981d456e",
      "parents": [
        "3e4d3af501cccdc8a8cca41bdbe57d54ad7e7e73"
      ],
      "author": {
        "name": "Peter Zijlstra",
        "email": "a.p.zijlstra@chello.nl",
        "time": "Tue Oct 26 14:21:52 2010 -0700"
      },
      "committer": {
        "name": "Linus Torvalds",
        "email": "torvalds@linux-foundation.org",
        "time": "Tue Oct 26 16:52:08 2010 -0700"
      },
      "message": "mm: remove pte_*map_nested()\n\nSince we no longer need to provide KM_type, the whole pte_*map_nested()\nAPI is now redundant, remove it.\n\nSigned-off-by: Peter Zijlstra \u003ca.p.zijlstra@chello.nl\u003e\nAcked-by: Chris Metcalf \u003ccmetcalf@tilera.com\u003e\nCc: David Howells \u003cdhowells@redhat.com\u003e\nCc: Hugh Dickins \u003chughd@google.com\u003e\nCc: Rik van Riel \u003criel@redhat.com\u003e\nCc: Ingo Molnar \u003cmingo@elte.hu\u003e\nCc: Thomas Gleixner \u003ctglx@linutronix.de\u003e\nCc: \"H. Peter Anvin\" \u003chpa@zytor.com\u003e\nCc: Steven Rostedt \u003crostedt@goodmis.org\u003e\nCc: Russell King \u003crmk@arm.linux.org.uk\u003e\nCc: Ralf Baechle \u003cralf@linux-mips.org\u003e\nCc: David Miller \u003cdavem@davemloft.net\u003e\nCc: Paul Mackerras \u003cpaulus@samba.org\u003e\nCc: Benjamin Herrenschmidt \u003cbenh@kernel.crashing.org\u003e\nSigned-off-by: Andrew Morton \u003cakpm@linux-foundation.org\u003e\nSigned-off-by: Linus Torvalds \u003ctorvalds@linux-foundation.org\u003e\n"
    },
    {
      "commit": "46e387bbd82d438b9131e237e6e2cb55a825da49",
      "tree": "414948afd6b4d63c6ea8cc79ce022128bc1bf2eb",
      "parents": [
        "e9d08567ef72a2d0fb9b14dded386352d3136442",
        "3ef8fd7f720fc4f462fcdcae2fcde6f1c0536bfe"
      ],
      "author": {
        "name": "Andi Kleen",
        "email": "ak@linux.intel.com",
        "time": "Fri Oct 22 17:40:48 2010 +0200"
      },
      "committer": {
        "name": "Andi Kleen",
        "email": "ak@linux.intel.com",
        "time": "Fri Oct 22 17:40:48 2010 +0200"
      },
      "message": "Merge branch \u0027hwpoison-hugepages\u0027 into hwpoison\n\nConflicts:\n\tmm/memory-failure.c\n"
    },
    {
      "commit": "c3b86a29429dac1033e3f602f51fa8d00006a8eb",
      "tree": "bcedd0a553ca2396eeb58318ef6ee6b426e83652",
      "parents": [
        "8d8d2e9ccd331a1345c88b292ebee9d256fd8749",
        "2aeb66d3036dbafc297ac553a257a40283dadb3e"
      ],
      "author": {
        "name": "Linus Torvalds",
        "email": "torvalds@linux-foundation.org",
        "time": "Thu Oct 21 13:47:29 2010 -0700"
      },
      "committer": {
        "name": "Linus Torvalds",
        "email": "torvalds@linux-foundation.org",
        "time": "Thu Oct 21 13:47:29 2010 -0700"
      },
      "message": "Merge branch \u0027x86-mm-for-linus\u0027 of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip\n\n* \u0027x86-mm-for-linus\u0027 of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:\n  x86-32, percpu: Correct the ordering of the percpu readmostly section\n  x86, mm: Enable ARCH_DMA_ADDR_T_64BIT with X86_64 || HIGHMEM64G\n  x86: Spread tlb flush vector between nodes\n  percpu: Introduce a read-mostly percpu API\n  x86, mm: Fix incorrect data type in vmalloc_sync_all()\n  x86, mm: Hold mm-\u003epage_table_lock while doing vmalloc_sync\n  x86, mm: Fix bogus whitespace in sync_global_pgds()\n  x86-32: Fix sparse warning for the __PHYSICAL_MASK calculation\n  x86, mm: Add RESERVE_BRK_ARRAY() helper\n  mm, x86: Saving vmcore with non-lazy freeing of vmas\n  x86, kdump: Change copy_oldmem_page() to use cached addressing\n  x86, mm: fix uninitialized addr in kernel_physical_mapping_init()\n  x86, kmemcheck: Remove double test\n  x86, mm: Make spurious_fault check explicitly check the PRESENT bit\n  x86-64, mem: Update all PGDs for direct mapping and vmemmap mapping changes\n  x86, mm: Separate x86_64 vmalloc_sync_all() into separate functions\n  x86, mm: Avoid unnecessary TLB flush\n"
    },
    {
      "commit": "aa50d3a7aa8147b9e14dc9d5972a5d2359db4ef8",
      "tree": "68fae5060333dcc24c17e9dd00a87bd760d883e9",
      "parents": [
        "6f39ce056ab2ab2d29b2fae4aed61ed0b485972f"
      ],
      "author": {
        "name": "Andi Kleen",
        "email": "ak@linux.intel.com",
        "time": "Wed Oct 06 21:45:00 2010 +0200"
      },
      "committer": {
        "name": "Andi Kleen",
        "email": "ak@linux.intel.com",
        "time": "Fri Oct 08 09:32:46 2010 +0200"
      },
      "message": "Encode huge page size for VM_FAULT_HWPOISON errors\n\nThis fixes a problem introduced with the hugetlb hwpoison handling\n\nThe user space SIGBUS signalling wants to know the size of the hugepage\nthat caused a HWPOISON fault.\n\nUnfortunately the architecture page fault handlers do not have easy\naccess to the struct page.\n\nPass the information out in the fault error code instead.\n\nI added a separate VM_FAULT_HWPOISON_LARGE bit for this case and encode\nthe hpage index in some free upper bits of the fault code. The small\npage hwpoison keeps stays with the VM_FAULT_HWPOISON name to minimize\nchanges.\n\nAlso add code to hugetlb.h to convert that index into a page shift.\n\nWill be used in a further patch.\n\nCc: Naoya Horiguchi \u003cn-horiguchi@ah.jp.nec.com\u003e\nCc: fengguang.wu@intel.com\nSigned-off-by: Andi Kleen \u003cak@linux.intel.com\u003e\n"
    },
    {
      "commit": "31c4a3d3a0f84a5847665f8aa0552d188389f791",
      "tree": "6dbc630213c899c82030e38c9fa1125c060ef2fe",
      "parents": [
        "2422084a94fcd5038406261b331672a13c92c050"
      ],
      "author": {
        "name": "Hugh Dickins",
        "email": "hughd@google.com",
        "time": "Sun Sep 19 19:40:22 2010 -0700"
      },
      "committer": {
        "name": "Linus Torvalds",
        "email": "torvalds@linux-foundation.org",
        "time": "Mon Sep 20 10:44:37 2010 -0700"
      },
      "message": "mm: further fix swapin race condition\n\nCommit 4969c1192d15 (\"mm: fix swapin race condition\") is now agreed to\nbe incomplete.  There\u0027s a race, not very much less likely than the\noriginal race envisaged, in which it is further necessary to check that\nthe swapcache page\u0027s swap has not changed.\n\nHere\u0027s the reasoning: cast in terms of reuse_swap_page(), but probably\ncould be reformulated to rely on try_to_free_swap() instead, or on\nswapoff+swapon.\n\nA, faults into do_swap_page(): does page1 \u003d lookup_swap_cache(swap1) and\ncomes through the lock_page(page1).\n\nB, a racing thread of the same process, faults on the same address: does\npage1 \u003d lookup_swap_cache(swap1) and now waits in lock_page(page1), but\nfor whatever reason is unlucky not to get the lock any time soon.\n\nA carries on through do_swap_page(), a write fault, but cannot reuse the\nswap page1 (another reference to swap1).  Unlocks the page1 (but B\ndoesn\u0027t get it yet), does COW in do_wp_page(), page2 now in that pte.\n\nC, perhaps the parent of A+B, comes in and write faults the same swap\npage1 into its mm, reuse_swap_page() succeeds this time, swap1 is freed.\n\nkswapd comes in after some time (B still unlucky) and swaps out some\npages from A+B and C: it allocates the original swap1 to page2 in A+B,\nand some other swap2 to the original page1 now in C.  But does not\nimmediately free page1 (actually it couldn\u0027t: B holds a reference),\nleaving it in swap cache for now.\n\nB at last gets the lock on page1, hooray! Is PageSwapCache(page1)? Yes.\nIs pte_same(*page_table, orig_pte)? Yes, because page2 has now been\ngiven the swap1 which page1 used to have.  
So B proceeds to insert page1\ninto A+B\u0027s page_table, though its content now belongs to C, quite\ndifferent from what A wrote there.\n\nB ought to have checked that page1\u0027s swap was still swap1.\n\nSigned-off-by: Hugh Dickins \u003chughd@google.com\u003e\nReviewed-by: Rik van Riel \u003criel@redhat.com\u003e\nCc: stable@kernel.org\nSigned-off-by: Linus Torvalds \u003ctorvalds@linux-foundation.org\u003e\n"
    },
    {
      "commit": "4969c1192d15afa3389e7ae3302096ff684ba655",
      "tree": "abe560c8f293191be65488c49f4db3f3a626e63c",
      "parents": [
        "7c5367f205f7d53659fb19b9fdf65b7bc1a592c6"
      ],
      "author": {
        "name": "Andrea Arcangeli",
        "email": "aarcange@redhat.com",
        "time": "Thu Sep 09 16:37:52 2010 -0700"
      },
      "committer": {
        "name": "Linus Torvalds",
        "email": "torvalds@linux-foundation.org",
        "time": "Thu Sep 09 18:57:24 2010 -0700"
      },
      "message": "mm: fix swapin race condition\n\nThe pte_same check is reliable only if the swap entry remains pinned (by\nthe page lock on swapcache).  We\u0027ve also to ensure the swapcache isn\u0027t\nremoved before we take the lock as try_to_free_swap won\u0027t care about the\npage pin.\n\nOne of the possible impacts of this patch is that a KSM-shared page can\npoint to the anon_vma of another process, which could exit before the page\nis freed.\n\nThis can leave a page with a pointer to a recycled anon_vma object, or\nworse, a pointer to something that is no longer an anon_vma.\n\n[riel@redhat.com: changelog help]\nSigned-off-by: Andrea Arcangeli \u003caarcange@redhat.com\u003e\nAcked-by: Hugh Dickins \u003chughd@google.com\u003e\nReviewed-by: Rik van Riel \u003criel@redhat.com\u003e\nCc: \u003cstable@kernel.org\u003e\nSigned-off-by: Andrew Morton \u003cakpm@linux-foundation.org\u003e\nSigned-off-by: Linus Torvalds \u003ctorvalds@linux-foundation.org\u003e\n"
    },
    {
      "commit": "8ca3eb08097f6839b2206e2242db4179aee3cfb3",
      "tree": "32b9f033230d615d248fa0bbfa1a0c644a422ed8",
      "parents": [
        "9559fcdbff4f93d29af04478bbc48294519424f5"
      ],
      "author": {
        "name": "Luck, Tony",
        "email": "tony.luck@intel.com",
        "time": "Tue Aug 24 11:44:18 2010 -0700"
      },
      "committer": {
        "name": "Linus Torvalds",
        "email": "torvalds@linux-foundation.org",
        "time": "Tue Aug 24 12:13:20 2010 -0700"
      },
      "message": "guard page for stacks that grow upwards\n\npa-risc and ia64 have stacks that grow upwards. Check that\nthey do not run into other mappings. By making VM_GROWSUP\n0x0 on architectures that do not ever use it, we can avoid\nsome unpleasant #ifdefs in check_stack_guard_page().\n\nSigned-off-by: Tony Luck \u003ctony.luck@intel.com\u003e\nSigned-off-by: Linus Torvalds \u003ctorvalds@linux-foundation.org\u003e\n"
    },
    {
      "commit": "61c77326d1df079f202fa79403c3ccd8c5966a81",
      "tree": "57780e6b94f24f402d1c9036d6e7cf37a359c22f",
      "parents": [
        "76be97c1fc945db08aae1f1b746012662d643e97"
      ],
      "author": {
        "name": "Shaohua Li",
        "email": "shaohua.li@intel.com",
        "time": "Mon Aug 16 09:16:55 2010 +0800"
      },
      "committer": {
        "name": "H. Peter Anvin",
        "email": "hpa@zytor.com",
        "time": "Mon Aug 23 10:04:57 2010 -0700"
      },
      "message": "x86, mm: Avoid unnecessary TLB flush\n\nIn x86, access and dirty bits are set automatically by CPU when CPU accesses\nmemory. When we go into the code path of below flush_tlb_fix_spurious_fault(),\nwe already set dirty bit for pte and don\u0027t need flush tlb. This might mean\ntlb entry in some CPUs hasn\u0027t dirty bit set, but this doesn\u0027t matter. When\nthe CPUs do page write, they will automatically check the bit and no software\ninvolved.\n\nOn the other hand, flush tlb in below position is harmful. Test creates CPU\nnumber of threads, each thread writes to a same but random address in same vma\nrange and we measure the total time. Under a 4 socket system, original time is\n1.96s, while with the patch, the time is 0.8s. Under a 2 socket system, there is\n20% time cut too. perf shows a lot of time are taking to send ipi/handle ipi for\ntlb flush.\n\nSigned-off-by: Shaohua Li \u003cshaohua.li@intel.com\u003e\nLKML-Reference: \u003c20100816011655.GA362@sli10-desk.sh.intel.com\u003e\nAcked-by: Suresh Siddha \u003csuresh.b.siddha@intel.com\u003e\nCc: Andrea Archangeli \u003caarcange@redhat.com\u003e\nSigned-off-by: H. Peter Anvin \u003chpa@zytor.com\u003e\n"
    },
    {
      "commit": "0e8e50e20c837eeec8323bba7dcd25fe5479194c",
      "tree": "12c7ec767a4a8508be33442c6fb55c28a26c94cd",
      "parents": [
        "7798330ac8114c731cfab83e634c6ecedaa233d7"
      ],
      "author": {
        "name": "Linus Torvalds",
        "email": "torvalds@linux-foundation.org",
        "time": "Fri Aug 20 16:49:40 2010 -0700"
      },
      "committer": {
        "name": "Linus Torvalds",
        "email": "torvalds@linux-foundation.org",
        "time": "Sat Aug 21 08:50:00 2010 -0700"
      },
      "message": "mm: make stack guard page logic use vm_prev pointer\n\nLike the mlock() change previously, this makes the stack guard check\ncode use vma-\u003evm_prev to see what the mapping below the current stack\nis, rather than have to look it up with find_vma().\n\nAlso, accept an abutting stack segment, since that happens naturally if\nyou split the stack with mlock or mprotect.\n\nTested-by: Ian Campbell \u003cijc@hellion.org.uk\u003e\nSigned-off-by: Linus Torvalds \u003ctorvalds@linux-foundation.org\u003e\n"
    },
    {
      "commit": "11ac552477e32835cb6970bf0a70c210807f5673",
      "tree": "959521ee3e217da81b08209df0f0db760e1efdb8",
      "parents": [
        "92fa5bd9a946b6e7aab6764e7312e4e3d9bed295"
      ],
      "author": {
        "name": "Linus Torvalds",
        "email": "torvalds@linux-foundation.org",
        "time": "Sat Aug 14 11:44:56 2010 -0700"
      },
      "committer": {
        "name": "Linus Torvalds",
        "email": "torvalds@linux-foundation.org",
        "time": "Sat Aug 14 11:44:56 2010 -0700"
      },
      "message": "mm: fix page table unmap for stack guard page properly\n\nWe do in fact need to unmap the page table _before_ doing the whole\nstack guard page logic, because if it is needed (mainly 32-bit x86 with\nPAE and CONFIG_HIGHPTE, but other architectures may use it too) then it\nwill do a kmap_atomic/kunmap_atomic.\n\nAnd those kmaps will create an atomic region that we cannot do\nallocations in.  However, the whole stack expand code will need to do\nanon_vma_prepare() and vma_lock_anon_vma() and they cannot do that in an\natomic region.\n\nNow, a better model might actually be to do the anon_vma_prepare() when\n_creating_ a VM_GROWSDOWN segment, and not have to worry about any of\nthis at page fault time.  But in the meantime, this is the\nstraightforward fix for the issue.\n\nSee https://bugzilla.kernel.org/show_bug.cgi?id\u003d16588 for details.\n\nReported-by: Wylda \u003cwylda@volny.cz\u003e\nReported-by: Sedat Dilek \u003csedat.dilek@gmail.com\u003e\nReported-by: Mike Pagano \u003cmpagano@gentoo.org\u003e\nReported-by: François Valenduc \u003cfrancois.valenduc@tvcablenet.be\u003e\nTested-by: Ed Tomlinson \u003cedt@aei.ca\u003e\nCc: Pekka Enberg \u003cpenberg@kernel.org\u003e\nCc: Greg KH \u003cgregkh@suse.de\u003e\nCc: stable@kernel.org\nSigned-off-by: Linus Torvalds \u003ctorvalds@linux-foundation.org\u003e\n"
    },
    {
      "commit": "5528f9132cf65d4d892bcbc5684c61e7822b21e9",
      "tree": "46ad9b7a106a42579b869b42bf237a663370a613",
      "parents": [
        "320b2b8de12698082609ebbc1a17165727f4c893"
      ],
      "author": {
        "name": "Linus Torvalds",
        "email": "torvalds@linux-foundation.org",
        "time": "Fri Aug 13 09:24:04 2010 -0700"
      },
      "committer": {
        "name": "Linus Torvalds",
        "email": "torvalds@linux-foundation.org",
        "time": "Fri Aug 13 09:24:04 2010 -0700"
      },
      "message": "mm: fix missing page table unmap for stack guard page failure case\n\n.. which didn\u0027t show up in my tests because it\u0027s a no-op on x86-64 and\nmost other architectures.  But we enter the function with the last-level\npage table mapped, and should unmap it at exit.\n\nSigned-off-by: Linus Torvalds \u003ctorvalds@linux-foundation.org\u003e\n"
    },
    {
      "commit": "320b2b8de12698082609ebbc1a17165727f4c893",
      "tree": "bb62fe1ba3bb8bf68ff1fd44e613ece9c9581c36",
      "parents": [
        "2069601b3f0ea38170d4b509b89f3ca0a373bdc1"
      ],
      "author": {
        "name": "Linus Torvalds",
        "email": "torvalds@linux-foundation.org",
        "time": "Thu Aug 12 17:54:33 2010 -0700"
      },
      "committer": {
        "name": "Linus Torvalds",
        "email": "torvalds@linux-foundation.org",
        "time": "Thu Aug 12 17:54:33 2010 -0700"
      },
      "message": "mm: keep a guard page below a grow-down stack segment\n\nThis is a rather minimally invasive patch to solve the problem of the\nuser stack growing into a memory mapped area below it.  Whenever we fill\nthe first page of the stack segment, expand the segment down by one\npage.\n\nNow, admittedly some odd application might _want_ the stack to grow down\ninto the preceding memory mapping, and so we may at some point need to\nmake this a process tunable (some people might also want to have more\nthan a single page of guarding), but let\u0027s try the minimal approach\nfirst.\n\nTested with trivial application that maps a single page just below the\nstack, and then starts recursing.  Without this, we will get a SIGSEGV\n_after_ the stack has smashed the mapping.  With this patch, we\u0027ll get a\nnice SIGBUS just as the stack touches the page just above the mapping.\n\nRequested-by: Keith Packard \u003ckeithp@keithp.com\u003e\nSigned-off-by: Linus Torvalds \u003ctorvalds@linux-foundation.org\u003e\n"
    },
    {
      "commit": "57250a5bf0f6ff68dc339572adbd881a11f366fa",
      "tree": "ef11c141a9f89403bcd4b1fc705d672c0ff41818",
      "parents": [
        "58c37f6e0dfaaab85a3c11fcbf24451dfe70c721"
      ],
      "author": {
        "name": "Jeremy Fitzhardinge",
        "email": "jeremy@goop.org",
        "time": "Mon Aug 09 17:19:52 2010 -0700"
      },
      "committer": {
        "name": "Linus Torvalds",
        "email": "torvalds@linux-foundation.org",
        "time": "Mon Aug 09 20:45:03 2010 -0700"
      },
      "message": "mmu-notifiers: remove mmu notifier calls in apply_to_page_range()\n\nIt is not appropriate for apply_to_page_range() to directly call any mmu\nnotifiers, because it is a general purpose function whose effect depends\non what context it is called in and what the callback function does.\n\nIn particular, if it is being used as part of an mmu notifier\nimplementation, the recursive calls can be particularly problematic.\n\nIt is up to apply_to_page_range\u0027s caller to do any notifier calls if\nnecessary.  It does not affect any in-tree users because they all operate\non init_mm, and mmu notifiers only pertain to usermode mappings.\n\n[stefano.stabellini@eu.citrix.com: remove unused local `start\u0027]\nSigned-off-by: Jeremy Fitzhardinge \u003cjeremy.fitzhardinge@citrix.com\u003e\nSigned-off-by: Stefano Stabellini \u003cstefano.stabellini@eu.citrix.com\u003e\nCc: Andrea Arcangeli \u003caarcange@redhat.com\u003e\nCc: Stefano Stabellini \u003cstefano.stabellini@eu.citrix.com\u003e\nCc: Avi Kivity \u003cavi@qumranet.com\u003e\nSigned-off-by: Andrew Morton \u003cakpm@linux-foundation.org\u003e\nSigned-off-by: Linus Torvalds \u003ctorvalds@linux-foundation.org\u003e\n"
    },
    {
      "commit": "9a5b489b870def9a93f5e89dac03ebe136f901db",
      "tree": "df7f0acfdb81ce0d77b78ff4d131c40472731994",
      "parents": [
        "ad8c2ee801ad7a52d919b478d9b2c7b39a72d295"
      ],
      "author": {
        "name": "Andrea Arcangeli",
        "email": "aarcange@redhat.com",
        "time": "Mon Aug 09 17:19:49 2010 -0700"
      },
      "committer": {
        "name": "Linus Torvalds",
        "email": "torvalds@linux-foundation.org",
        "time": "Mon Aug 09 20:45:02 2010 -0700"
      },
      "message": "mm: set VM_FAULT_WRITE in do_swap_page()\n\nSet the flag if do_swap_page is decowing the page the same way do_wp_page\nwould too.\n\nSigned-off-by: Andrea Arcangeli \u003caarcange@redhat.com\u003e\nCc: KOSAKI Motohiro \u003ckosaki.motohiro@jp.fujitsu.com\u003e\nCc: Rik van Riel \u003criel@redhat.com\u003e\nCc: Nick Piggin \u003cnickpiggin@yahoo.com.au\u003e\nCc: Hugh Dickins \u003chughd@google.com\u003e\nSigned-off-by: Andrew Morton \u003cakpm@linux-foundation.org\u003e\nSigned-off-by: Linus Torvalds \u003ctorvalds@linux-foundation.org\u003e\n"
    },
    {
      "commit": "ad8c2ee801ad7a52d919b478d9b2c7b39a72d295",
      "tree": "bc56cc023da3467447b0aecd30c0516881d53992",
      "parents": [
        "51b1bd2ace1595b72956224deda349efa880b693"
      ],
      "author": {
        "name": "Rik van Riel",
        "email": "riel@redhat.com",
        "time": "Mon Aug 09 17:19:48 2010 -0700"
      },
      "committer": {
        "name": "Linus Torvalds",
        "email": "torvalds@linux-foundation.org",
        "time": "Mon Aug 09 20:45:02 2010 -0700"
      },
      "message": "rmap: add exclusive page to private anon_vma on swapin\n\nOn swapin it is fairly common for a page to be owned exclusively by one\nprocess.  In that case we want to add the page to the anon_vma of that\nprocess\u0027s VMA, instead of to the root anon_vma.\n\nThis will reduce the amount of rmap searching that the swapout code needs\nto do.\n\nSigned-off-by: Rik van Riel \u003criel@redhat.com\u003e\nCc: Andrea Arcangeli \u003caarcange@redhat.com\u003e\nCc: KOSAKI Motohiro \u003ckosaki.motohiro@jp.fujitsu.com\u003e\nSigned-off-by: Andrew Morton \u003cakpm@linux-foundation.org\u003e\nSigned-off-by: Linus Torvalds \u003ctorvalds@linux-foundation.org\u003e\n"
    },
    {
      "commit": "4e60c86bd9e5a7110ed28874d0b6592186550ae8",
      "tree": "9fb60e9f49b44b293a0c0c7d9f40e1a354a22b5a",
      "parents": [
        "627295e492638936e76f3d8fcb1e0a3367b88341"
      ],
      "author": {
        "name": "Andi Kleen",
        "email": "andi@firstfloor.org",
        "time": "Mon Aug 09 17:19:03 2010 -0700"
      },
      "committer": {
        "name": "Linus Torvalds",
        "email": "torvalds@linux-foundation.org",
        "time": "Mon Aug 09 20:44:58 2010 -0700"
      },
      "message": "gcc-4.6: mm: fix unused but set warnings\n\nNo real bugs, just some dead code and some fixups.\n\nSigned-off-by: Andi Kleen \u003cak@linux.intel.com\u003e\nSigned-off-by: Andrew Morton \u003cakpm@linux-foundation.org\u003e\nSigned-off-by: Linus Torvalds \u003ctorvalds@linux-foundation.org\u003e\n"
    },
    {
      "commit": "de51257aa301652876ab6e8f13ea4eadbe4a3846",
      "tree": "388ee39bed1d7e362438d047b57399a28e2617f8",
      "parents": [
        "51c20fcced5badee0e2021c6c89f44aa3cbd72aa"
      ],
      "author": {
        "name": "Hugh Dickins",
        "email": "hughd@google.com",
        "time": "Fri Jul 30 10:58:26 2010 -0700"
      },
      "committer": {
        "name": "Linus Torvalds",
        "email": "torvalds@linux-foundation.org",
        "time": "Fri Jul 30 18:56:09 2010 -0700"
      },
      "message": "mm: fix ia64 crash when gcore reads gate area\n\nDebian\u0027s ia64 autobuilders have been seeing kernel freeze or reboot\nwhen running the gdb testsuite (Debian bug 588574): dannf bisected to\n2.6.32 62eede62dafb4a6633eae7ffbeb34c60dba5e7b1 \"mm: ZERO_PAGE without\nPTE_SPECIAL\"; and reproduced it with gdb\u0027s gcore on a simple target.\n\nI\u0027d missed updating the gate_vma handling in __get_user_pages(): that\nhappens to use vm_normal_page() (nowadays failing on the zero page),\nyet reported success even when it failed to get a page - boom when\naccess_process_vm() tried to copy that to its intermediate buffer.\n\nFix this, resisting cleanups: in particular, leave it for now reporting\nsuccess when not asked to get any pages - very probably safe to change,\nbut let\u0027s not risk it without testing exposure.\n\nWhy did ia64 crash with 16kB pages, but succeed with 64kB pages?\nBecause setup_gate() pads each 64kB of its gate area with zero pages.\n\nReported-by: Andreas Barth \u003caba@not.so.argh.org\u003e\nBisected-by: dann frazier \u003cdannf@debian.org\u003e\nSigned-off-by: Hugh Dickins \u003chughd@google.com\u003e\nTested-by: dann frazier \u003cdannf@dannf.org\u003e\nCc: stable@kernel.org\nSigned-off-by: Linus Torvalds \u003ctorvalds@linux-foundation.org\u003e\n"
    },
    {
      "commit": "142762bd8d8c46345e79f0f68d3374564306972f",
      "tree": "c33360b872883d24b068ba7b8f01466fccb9dfc9",
      "parents": [
        "58a9d3d8db06ca2ec31f64ec49ab0aeb89971b85"
      ],
      "author": {
        "name": "Johannes Weiner",
        "email": "hannes@cmpxchg.org",
        "time": "Mon May 24 14:32:39 2010 -0700"
      },
      "committer": {
        "name": "Linus Torvalds",
        "email": "torvalds@linux-foundation.org",
        "time": "Tue May 25 08:07:00 2010 -0700"
      },
      "message": "mm: document follow_page()\n\nSigned-off-by: Johannes Weiner \u003channes@cmpxchg.org\u003e\nCc: Dan Carpenter \u003cerror27@gmail.com\u003e\nCc: Rik van Riel \u003criel@redhat.com\u003e\nCc: Izik Eidus \u003cieidus@redhat.com\u003e\nCc: Andrea Arcangeli \u003caarcange@redhat.com\u003e\nSigned-off-by: Andrew Morton \u003cakpm@linux-foundation.org\u003e\nSigned-off-by: Linus Torvalds \u003ctorvalds@linux-foundation.org\u003e\n"
    },
    {
      "commit": "a3a2e76c77fa22b114e421ac11dec0c56c3503fb",
      "tree": "cc67bbd8d5d364e55ea7a00d0b5ad68d5eac08ac",
      "parents": [
        "b01d0942c2b7a3026d2b7d38b5773d3d00420e06"
      ],
      "author": {
        "name": "KAMEZAWA Hiroyuki",
        "email": "kamezawa.hiroyu@jp.fujitsu.com",
        "time": "Tue Apr 06 14:34:42 2010 -0700"
      },
      "committer": {
        "name": "Linus Torvalds",
        "email": "torvalds@linux-foundation.org",
        "time": "Wed Apr 07 08:38:02 2010 -0700"
      },
      "message": "mm: avoid null-pointer deref in sync_mm_rss()\n\n- We weren\u0027t zeroing p-\u003erss_stat[] at fork()\n\n- Consequently sync_mm_rss() was dereferencing tsk-\u003emm for kernel\n  threads and was oopsing.\n\n- Make __sync_task_rss_stat() static, too.\n\nAddresses https://bugzilla.kernel.org/show_bug.cgi?id\u003d15648\n\n[akpm@linux-foundation.org: remove the BUG_ON(!mm-\u003erss)]\nReported-by: Troels Liebe Bentsen \u003ctlb@rapanden.dk\u003e\nSigned-off-by: KAMEZAWA Hiroyuki \u003ckamezawa.hiroyu@jp.fujitsu.com\u003e\n\"Michael S. Tsirkin\" \u003cmst@redhat.com\u003e\nCc: Andrea Arcangeli \u003caarcange@redhat.com\u003e\nCc: Rik van Riel \u003criel@redhat.com\u003e\nCc: Minchan Kim \u003cminchan.kim@gmail.com\u003e\nSigned-off-by: Andrew Morton \u003cakpm@linux-foundation.org\u003e\nSigned-off-by: Linus Torvalds \u003ctorvalds@linux-foundation.org\u003e\n"
    },
    {
      "commit": "5a0e3ad6af8660be21ca98a971cd00f331318c05",
      "tree": "5bfb7be11a03176a87296a43ac6647975c00a1d1",
      "parents": [
        "ed391f4ebf8f701d3566423ce8f17e614cde9806"
      ],
      "author": {
        "name": "Tejun Heo",
        "email": "tj@kernel.org",
        "time": "Wed Mar 24 17:04:11 2010 +0900"
      },
      "committer": {
        "name": "Tejun Heo",
        "email": "tj@kernel.org",
        "time": "Tue Mar 30 22:02:32 2010 +0900"
      },
      "message": "include cleanup: Update gfp.h and slab.h includes to prepare for breaking implicit slab.h inclusion from percpu.h\n\npercpu.h is included by sched.h and module.h and thus ends up being\nincluded when building most .c files.  percpu.h includes slab.h which\nin turn includes gfp.h making everything defined by the two files\nuniversally available and complicating inclusion dependencies.\n\npercpu.h -\u003e slab.h dependency is about to be removed.  Prepare for\nthis change by updating users of gfp and slab facilities include those\nheaders directly instead of assuming availability.  As this conversion\nneeds to touch large number of source files, the following script is\nused as the basis of conversion.\n\n  http://userweb.kernel.org/~tj/misc/slabh-sweep.py\n\nThe script does the followings.\n\n* Scan files for gfp and slab usages and update includes such that\n  only the necessary includes are there.  ie. if only gfp is used,\n  gfp.h, if slab is used, slab.h.\n\n* When the script inserts a new include, it looks at the include\n  blocks and try to put the new include such that its order conforms\n  to its surrounding.  It\u0027s put in the include block which contains\n  core kernel includes, in the same order that the rest are ordered -\n  alphabetical, Christmas tree, rev-Xmas-tree or at the end if there\n  doesn\u0027t seem to be any matching order.\n\n* If the script can\u0027t find a place to put a new include (mostly\n  because the file doesn\u0027t have fitting include block), it prints out\n  an error message indicating which .h file needs to be added to the\n  file.\n\nThe conversion was done in the following steps.\n\n1. The initial automatic conversion of all .c files updated slightly\n   over 4000 files, deleting around 700 includes and adding ~480 gfp.h\n   and ~3000 slab.h inclusions.  The script emitted errors for ~400\n   files.\n\n2. Each error was manually checked.  
Some didn\u0027t need the inclusion,\n   some needed manual addition while adding it to implementation .h or\n   embedding .c file was more appropriate for others.  This step added\n   inclusions to around 150 files.\n\n3. The script was run again and the output was compared to the edits\n   from #2 to make sure no file was left behind.\n\n4. Several build tests were done and a couple of problems were fixed.\n   e.g. lib/decompress_*.c used malloc/free() wrappers around slab\n   APIs requiring slab.h to be added manually.\n\n5. The script was run on all .h files but without automatically\n   editing them as sprinkling gfp.h and slab.h inclusions around .h\n   files could easily lead to inclusion dependency hell.  Most gfp.h\n   inclusion directives were ignored as stuff from gfp.h was usually\n   wildly available and often used in preprocessor macros.  Each\n   slab.h inclusion directive was examined and added manually as\n   necessary.\n\n6. percpu.h was updated not to include slab.h.\n\n7. Build test were done on the following configurations and failures\n   were fixed.  CONFIG_GCOV_KERNEL was turned off for all tests (as my\n   distributed build env didn\u0027t work with gcov compiles) and a few\n   more options had to be turned off depending on archs to make things\n   build (like ipr on powerpc/64 which failed due to missing writeq).\n\n   * x86 and x86_64 UP and SMP allmodconfig and a custom test config.\n   * powerpc and powerpc64 SMP allmodconfig\n   * sparc and sparc64 SMP allmodconfig\n   * ia64 SMP allmodconfig\n   * s390 SMP allmodconfig\n   * alpha SMP allmodconfig\n   * um on x86_64 SMP allmodconfig\n\n8. 
percpu.h modifications were reverted so that it could be applied as\n   a separate patch and serve as bisection point.\n\nGiven the fact that I had only a couple of failures from tests on step\n6, I\u0027m fairly confident about the coverage of this conversion patch.\nIf there is a breakage, it\u0027s likely to be something in one of the arch\nheaders which should be easily discoverable easily on most builds of\nthe specific arch.\n\nSigned-off-by: Tejun Heo \u003ctj@kernel.org\u003e\nGuess-its-ok-by: Christoph Lameter \u003ccl@linux-foundation.org\u003e\nCc: Ingo Molnar \u003cmingo@redhat.com\u003e\nCc: Lee Schermerhorn \u003cLee.Schermerhorn@hp.com\u003e\n"
    },
    {
      "commit": "298359c5bf06c04258d7cf552426e198c47e83c1",
      "tree": "d8ba710675a2e4e9dabbc9ee06a4445fb5657ce5",
      "parents": [
        "53feb29767c29c877f9d47dcfe14211b5b0f7ebd"
      ],
      "author": {
        "name": "Michael S. Tsirkin",
        "email": "mst@redhat.com",
        "time": "Tue Mar 23 13:35:37 2010 -0700"
      },
      "committer": {
        "name": "Linus Torvalds",
        "email": "torvalds@linux-foundation.org",
        "time": "Wed Mar 24 16:31:21 2010 -0700"
      },
      "message": "exit: fix oops in sync_mm_rss\n\nIn 2.6.34-rc1, removing vhost_net module causes an oops in sync_mm_rss\n(called from do_exit) when workqueue is destroyed.  This does not happen\non net-next, or with vhost on top of to 2.6.33.\n\nThe issue seems to be introduced by\n34e55232e59f7b19050267a05ff1226e5cd122a5 (\"mm: avoid false sharing of\nmm_counter) which added sync_mm_rss() that is passed task-\u003emm, and\ndereferences it without checking.  If task is a kernel thread, mm might be\nNULL.  I think this might also happen e.g.  with aio.\n\nThis patch fixes the oops by calling sync_mm_rss when task-\u003emm is set to\nNULL.  I also added BUG_ON to detect any other cases where counters get\nincremented while mm is NULL.\n\nThe oops I observed looks like this:\n\nBUG: unable to handle kernel NULL pointer dereference at 00000000000002a8\nIP: [\u003cffffffff810b436d\u003e] sync_mm_rss+0x33/0x6f\nPGD 0\nOops: 0002 [#1] SMP\nlast sysfs file: /sys/devices/system/cpu/cpu7/cache/index2/shared_cpu_map\nCPU 2\nModules linked in: vhost_net(-) tun bridge stp sunrpc ipv6 cpufreq_ondemand acpi_cpufreq freq_table kvm_intel kvm i5000_edac edac_core rtc_cmos bnx2 button i2c_i801 i2c_core rtc_core e1000e sg joydev ide_cd_mod serio_raw pcspkr rtc_lib cdrom virtio_net virtio_blk virtio_pci virtio_ring virtio af_packet e1000 shpchp aacraid uhci_hcd ohci_hcd ehci_hcd [last unloaded: microcode]\n\nPid: 2046, comm: vhost Not tainted 2.6.34-rc1-vhost #25 System Planar/IBM System x3550 -[7978B3G]-\nRIP: 0010:[\u003cffffffff810b436d\u003e]  [\u003cffffffff810b436d\u003e] sync_mm_rss+0x33/0x6f\nRSP: 0018:ffff8802379b7e60  EFLAGS: 00010202\nRAX: 0000000000000008 RBX: ffff88023f2390c0 RCX: 0000000000000000\nRDX: ffff88023f2396b0 RSI: 0000000000000000 RDI: ffff88023f2390c0\nRBP: ffff8802379b7e60 R08: 0000000000000000 R09: 0000000000000000\nR10: ffff88023aecfbc0 R11: 0000000000013240 R12: 0000000000000000\nR13: ffffffff81051a6c R14: ffffe8ffffc0f540 R15: 0000000000000000\nFS:  
0000000000000000(0000) GS:ffff880001e80000(0000) knlGS:0000000000000000\nCS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b\nCR2: 00000000000002a8 CR3: 000000023af23000 CR4: 00000000000406e0\nDR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000\nDR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400\nProcess vhost (pid: 2046, threadinfo ffff8802379b6000, task ffff88023f2390c0)\nStack:\n ffff8802379b7ee0 ffffffff81040687 ffffe8ffffc0f558 ffffffffa00a3e2d\n\u003c0\u003e 0000000000000000 ffff88023f2390c0 ffffffff81055817 ffff8802379b7e98\n\u003c0\u003e ffff8802379b7e98 0000000100000286 ffff8802379b7ee0 ffff88023ad47d78\nCall Trace:\n [\u003cffffffff81040687\u003e] do_exit+0x147/0x6c4\n [\u003cffffffffa00a3e2d\u003e] ? handle_rx_net+0x0/0x17 [vhost_net]\n [\u003cffffffff81055817\u003e] ? autoremove_wake_function+0x0/0x39\n [\u003cffffffff81051a6c\u003e] ? worker_thread+0x0/0x229\n [\u003cffffffff810553c9\u003e] kthreadd+0x0/0xf2\n [\u003cffffffff810038d4\u003e] kernel_thread_helper+0x4/0x10\n [\u003cffffffff81055342\u003e] ? kthread+0x0/0x87\n [\u003cffffffff810038d0\u003e] ? kernel_thread_helper+0x0/0x10\nCode: 00 8b 87 6c 02 00 00 85 c0 74 14 48 98 f0 48 01 86 a0 02 00 00 c7 87 6c 02 00 00 00 00 00 00 8b 87 70 02 00 00 85 c0 74 14 48 98 \u003cf0\u003e 48 01 86 a8 02 00 00 c7 87 70 02 00 00 00 00 00 00 8b 87 74\nRIP  [\u003cffffffff810b436d\u003e] sync_mm_rss+0x33/0x6f\n RSP \u003cffff8802379b7e60\u003e\nCR2: 00000000000002a8\n---[ end trace 41603ba922beddd2 ]---\nFixing recursive fault but reboot is needed!\n\n(note: handle_rx_net is a work item using workqueue in question).\nsync_mm_rss+0x33/0x6f gave me a hint. I also tried reverting\n34e55232e59f7b19050267a05ff1226e5cd122a5 and the oops goes away.\n\nThe module in question calls use_mm and later unuse_mm from a kernel\nthread.  It is when this kernel thread is destroyed that the crash\nhappens.\n\nSigned-off-by: Michael S. 
Tsirkin \u003cmst@redhat.com\u003e\nAndrea Arcangeli \u003caarcange@redhat.com\u003e\nReviewed-by: Rik van Riel \u003criel@redhat.com\u003e\nReviewed-by: KAMEZAWA Hiroyuki \u003ckamezawa.hiroyu@jp.fujitsu.com\u003e\nReviewed-by: Minchan Kim \u003cminchan.kim@gmail.com\u003e\nSigned-off-by: Andrew Morton \u003cakpm@linux-foundation.org\u003e\nSigned-off-by: Linus Torvalds \u003ctorvalds@linux-foundation.org\u003e\n"
    },
    {
      "commit": "718a38211bf4375c0a1efad3afbc5dbaef5d33f9",
      "tree": "ade6815c619705f0342f98cc8bb39fa3309c81a6",
      "parents": [
        "9b3a6549b2602ca30f58715a0071e29f9898cae9"
      ],
      "author": {
        "name": "Wu Fengguang",
        "email": "fengguang.wu@intel.com",
        "time": "Wed Mar 10 15:20:43 2010 -0800"
      },
      "committer": {
        "name": "Linus Torvalds",
        "email": "torvalds@linux-foundation.org",
        "time": "Fri Mar 12 15:52:28 2010 -0800"
      },
      "message": "mm: introduce dump_page() and print symbolic flag names\n\n- introduce dump_page() to print the page info for debugging some error\n  condition.\n\n- convert three mm users: bad_page(), print_bad_pte() and memory offline\n  failure.\n\n- print an extra field: the symbolic names of page-\u003eflags\n\nExample dump_page() output:\n\n[  157.521694] page:ffffea0000a7cba8 count:2 mapcount:1 mapping:ffff88001c901791 index:0x147\n[  157.525570] page flags: 0x100000000100068(uptodate|lru|active|swapbacked)\n\nSigned-off-by: Wu Fengguang \u003cfengguang.wu@intel.com\u003e\nCc: Ingo Molnar \u003cmingo@elte.hu\u003e\nCc: Alex Chiang \u003cachiang@hp.com\u003e\nCc: Rik van Riel \u003criel@redhat.com\u003e\nCc: Andi Kleen \u003candi@firstfloor.org\u003e\nCc: Mel Gorman \u003cmel@linux.vnet.ibm.com\u003e\nCc: Christoph Lameter \u003ccl@linux-foundation.org\u003e\nCc: KOSAKI Motohiro \u003ckosaki.motohiro@jp.fujitsu.com\u003e\nSigned-off-by: Andrew Morton \u003cakpm@linux-foundation.org\u003e\nSigned-off-by: Linus Torvalds \u003ctorvalds@linux-foundation.org\u003e\n"
    },
    {
      "commit": "53bddb4e9f3f53df02a783751984ddeade71b085",
      "tree": "74b9dfa0b61d6455a006beb2de20310aee0bc28b",
      "parents": [
        "936ed49a540e2dce645da27e7e4032b24310a8e4"
      ],
      "author": {
        "name": "KAMEZAWA Hiroyuki",
        "email": "kamezawa.hiroyu@jp.fujitsu.com",
        "time": "Wed Mar 10 15:20:38 2010 -0800"
      },
      "committer": {
        "name": "Linus Torvalds",
        "email": "torvalds@linux-foundation.org",
        "time": "Fri Mar 12 15:52:28 2010 -0800"
      },
      "message": "nommu: fix build breakage\n\nCommit 34e55232e59f7b19050267a05ff1226e5cd122a5 (\"mm: avoid false sharing\nof mm_counter\") added sync_mm_rss() for syncing loosely accounted rss\ncounters.  It\u0027s for CONFIG_MMU but sync_mm_rss is called even in NOMMU\nenviroment (kerne/exit.c, fs/exec.c).  Above commit doesn\u0027t handle it\nwell.\n\nThis patch changes\n  SPLIT_RSS_COUNTING depends on SPLIT_PTLOCKS \u0026\u0026 CONFIG_MMU\n\nAnd for avoid unnecessary function calls, sync_mm_rss changed to be inlined\nnoop function in header file.\n\nReported-by: David Howells \u003cdhowells@redhat.com\u003e\nSigned-off-by: KAMEZAWA Hiroyuki \u003ckamezawa.hiroyu@jp.fujitsu.com\u003e\nSigned-off-by: Mike Frysinger \u003cvapier@gentoo.org\u003e\nSigned-off-by: Michal Simek \u003cmonstr@monstr.eu\u003e\nSigned-off-by: David Howells \u003cdhowells@redhat.com\u003e\nCc: Greg Ungerer \u003cgerg@snapgear.com\u003e\nCc: Geert Uytterhoeven \u003cgeert@linux-m68k.org\u003e\nSigned-off-by: Andrew Morton \u003cakpm@linux-foundation.org\u003e\nSigned-off-by: Linus Torvalds \u003ctorvalds@linux-foundation.org\u003e\n"
    },
    {
      "commit": "c44b674323f4a2480dbeb65d4b487fa5f06f49e0",
      "tree": "b753050e6752eb2fc961ad3ea5dfdf88ef88364d",
      "parents": [
        "033a64b56aed798991de18d226085dfb1ccd858d"
      ],
      "author": {
        "name": "Rik van Riel",
        "email": "riel@redhat.com",
        "time": "Fri Mar 05 13:42:09 2010 -0800"
      },
      "committer": {
        "name": "Linus Torvalds",
        "email": "torvalds@linux-foundation.org",
        "time": "Sat Mar 06 11:26:26 2010 -0800"
      },
      "message": "rmap: move exclusively owned pages to own anon_vma in do_wp_page()\n\nWhen the parent process breaks the COW on a page, both the original which\nis mapped at child and the new page which is mapped parent end up in that\nsame anon_vma.  Generally this won\u0027t be a problem, but for some workloads\nit could preserve the O(N) rmap scanning complexity.\n\nA simple fix is to ensure that, when a page which is mapped child gets\nreused in do_wp_page, because we already are the exclusive owner, the page\ngets moved to our own exclusive child\u0027s anon_vma.\n\nSigned-off-by: Rik van Riel \u003criel@redhat.com\u003e\nCc: KOSAKI Motohiro \u003ckosaki.motohiro@jp.fujitsu.com\u003e\nCc: Larry Woodman \u003clwoodman@redhat.com\u003e\nCc: Lee Schermerhorn \u003cLee.Schermerhorn@hp.com\u003e\nReviewed-by: Minchan Kim \u003cminchan.kim@gmail.com\u003e\nCc: Andrea Arcangeli \u003caarcange@redhat.com\u003e\nCc: Hugh Dickins \u003chugh.dickins@tiscali.co.uk\u003e\nSigned-off-by: Andrew Morton \u003cakpm@linux-foundation.org\u003e\nSigned-off-by: Linus Torvalds \u003ctorvalds@linux-foundation.org\u003e\n"
    },
    {
      "commit": "5beb49305251e5669852ed541e8e2f2f7696c53e",
      "tree": "46457450a22f23938b24904aeba5d4ada2f53b20",
      "parents": [
        "648bcc771145172a14bc35eeb849ed08f6aa4f1e"
      ],
      "author": {
        "name": "Rik van Riel",
        "email": "riel@redhat.com",
        "time": "Fri Mar 05 13:42:07 2010 -0800"
      },
      "committer": {
        "name": "Linus Torvalds",
        "email": "torvalds@linux-foundation.org",
        "time": "Sat Mar 06 11:26:26 2010 -0800"
      },
      "message": "mm: change anon_vma linking to fix multi-process server scalability issue\n\nThe old anon_vma code can lead to scalability issues with heavily forking\nworkloads.  Specifically, each anon_vma will be shared between the parent\nprocess and all its child processes.\n\nIn a workload with 1000 child processes and a VMA with 1000 anonymous\npages per process that get COWed, this leads to a system with a million\nanonymous pages in the same anon_vma, each of which is mapped in just one\nof the 1000 processes.  However, the current rmap code needs to walk them\nall, leading to O(N) scanning complexity for each page.\n\nThis can result in systems where one CPU is walking the page tables of\n1000 processes in page_referenced_one, while all other CPUs are stuck on\nthe anon_vma lock.  This leads to catastrophic failure for a benchmark\nlike AIM7, where the total number of processes can reach in the tens of\nthousands.  Real workloads are still a factor 10 less process intensive\nthan AIM7, but they are catching up.\n\nThis patch changes the way anon_vmas and VMAs are linked, which allows us\nto associate multiple anon_vmas with a VMA.  At fork time, each child\nprocess gets its own anon_vmas, in which its COWed pages will be\ninstantiated.  The parents\u0027 anon_vma is also linked to the VMA, because\nnon-COWed pages could be present in any of the children.\n\nThis reduces rmap scanning complexity to O(1) for the pages of the 1000\nchild processes, with O(N) complexity for at most 1/N pages in the system.\n This reduces the average scanning cost in heavily forking workloads from\nO(N) to 2.\n\nThe only real complexity in this patch stems from the fact that linking a\nVMA to anon_vmas now involves memory allocations.  This means vma_adjust\ncan fail, if it needs to attach a VMA to anon_vma structures.  
This in\nturn means error handling needs to be added to the calling functions.\n\nA second source of complexity is that, because there can be multiple\nanon_vmas, the anon_vma linking in vma_adjust can no longer be done under\n\"the\" anon_vma lock.  To prevent the rmap code from walking up an\nincomplete VMA, this patch introduces the VM_LOCK_RMAP VMA flag.  This bit\nflag uses the same slot as the NOMMU VM_MAPPED_COPY, with an ifdef in mm.h\nto make sure it is impossible to compile a kernel that needs both symbolic\nvalues for the same bitflag.\n\nSome test results:\n\nWithout the anon_vma changes, when AIM7 hits around 9.7k users (on a test\nbox with 16GB RAM and not quite enough IO), the system ends up running\n\u003e99% in system time, with every CPU on the same anon_vma lock in the\npageout code.\n\nWith these changes, AIM7 hits the cross-over point around 29.7k users.\nThis happens with ~99% IO wait time, there never seems to be any spike in\nsystem time.  The anon_vma lock contention appears to be resolved.\n\n[akpm@linux-foundation.org: cleanups]\nSigned-off-by: Rik van Riel \u003criel@redhat.com\u003e\nCc: KOSAKI Motohiro \u003ckosaki.motohiro@jp.fujitsu.com\u003e\nCc: Larry Woodman \u003clwoodman@redhat.com\u003e\nCc: Lee Schermerhorn \u003cLee.Schermerhorn@hp.com\u003e\nCc: Minchan Kim \u003cminchan.kim@gmail.com\u003e\nCc: Andrea Arcangeli \u003caarcange@redhat.com\u003e\nCc: Hugh Dickins \u003chugh.dickins@tiscali.co.uk\u003e\nSigned-off-by: Andrew Morton \u003cakpm@linux-foundation.org\u003e\nSigned-off-by: Linus Torvalds \u003ctorvalds@linux-foundation.org\u003e\n"
    },
    {
      "commit": "b084d4353ff99d824d3bc5a5c2c22c70b1fba722",
      "tree": "8178db2b337fc8a36e6ca2e1fc2e7d7473957e27",
      "parents": [
        "34e55232e59f7b19050267a05ff1226e5cd122a5"
      ],
      "author": {
        "name": "KAMEZAWA Hiroyuki",
        "email": "kamezawa.hiroyu@jp.fujitsu.com",
        "time": "Fri Mar 05 13:41:42 2010 -0800"
      },
      "committer": {
        "name": "Linus Torvalds",
        "email": "torvalds@linux-foundation.org",
        "time": "Sat Mar 06 11:26:24 2010 -0800"
      },
      "message": "mm: count swap usage\n\nA frequent questions from users about memory management is what numbers of\nswap ents are user for processes.  And this information will give some\nhints to oom-killer.\n\nBesides we can count the number of swapents per a process by scanning\n/proc/\u003cpid\u003e/smaps, this is very slow and not good for usual process\ninformation handler which works like \u0027ps\u0027 or \u0027top\u0027.  (ps or top is now\nenough slow..)\n\nThis patch adds a counter of swapents to mm_counter and update is at each\nswap events.  Information is exported via /proc/\u003cpid\u003e/status file as\n\n[kamezawa@bluextal memory]$ cat /proc/self/status\nName:   cat\nState:  R (running)\nTgid:   2910\nPid:    2910\nPPid:   2823\nTracerPid:      0\nUid:    500     500     500     500\nGid:    500     500     500     500\nFDSize: 256\nGroups: 500\nVmPeak:    82696 kB\nVmSize:    82696 kB\nVmLck:         0 kB\nVmHWM:       432 kB\nVmRSS:       432 kB\nVmData:      172 kB\nVmStk:        84 kB\nVmExe:        48 kB\nVmLib:      1568 kB\nVmPTE:        40 kB\nVmSwap:        0 kB \u003c\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d this.\n\n[akpm@linux-foundation.org: coding-style fixes]\nSigned-off-by: KAMEZAWA Hiroyuki \u003ckamezawa.hiroyu@jp.fujitsu.com\u003e\nReviewed-by: Minchan Kim \u003cminchan.kim@gmail.com\u003e\nReviewed-by: Christoph Lameter \u003ccl@linux-foundation.org\u003e\nCc: Lee Schermerhorn \u003clee.schermerhorn@hp.com\u003e\nCc: David Rientjes \u003crientjes@google.com\u003e\nSigned-off-by: Andrew Morton \u003cakpm@linux-foundation.org\u003e\nSigned-off-by: Linus Torvalds \u003ctorvalds@linux-foundation.org\u003e\n"
    },
    {
      "commit": "34e55232e59f7b19050267a05ff1226e5cd122a5",
      "tree": "6b94e776e87d2a2fe1ceca7c5606901575323900",
      "parents": [
        "d559db086ff5be9bcc259e5aa50bf3d881eaf1d1"
      ],
      "author": {
        "name": "KAMEZAWA Hiroyuki",
        "email": "kamezawa.hiroyu@jp.fujitsu.com",
        "time": "Fri Mar 05 13:41:40 2010 -0800"
      },
      "committer": {
        "name": "Linus Torvalds",
        "email": "torvalds@linux-foundation.org",
        "time": "Sat Mar 06 11:26:24 2010 -0800"
      },
      "message": "mm: avoid false sharing of mm_counter\n\nConsidering the nature of per mm stats, it\u0027s the shared object among\nthreads and can be a cache-miss point in the page fault path.\n\nThis patch adds per-thread cache for mm_counter.  RSS value will be\ncounted into a struct in task_struct and synchronized with mm\u0027s one at\nevents.\n\nNow, in this patch, the event is the number of calls to handle_mm_fault.\nPer-thread value is added to mm at each 64 calls.\n\n rough estimation with small benchmark on parallel thread (2threads) shows\n [before]\n     4.5 cache-miss/faults\n [after]\n     4.0 cache-miss/faults\n Anyway, the most contended object is mmap_sem if the number of threads grows.\n\n[akpm@linux-foundation.org: coding-style fixes]\nSigned-off-by: KAMEZAWA Hiroyuki \u003ckamezawa.hiroyu@jp.fujitsu.com\u003e\nCc: Minchan Kim \u003cminchan.kim@gmail.com\u003e\nCc: Christoph Lameter \u003ccl@linux-foundation.org\u003e\nCc: Lee Schermerhorn \u003clee.schermerhorn@hp.com\u003e\nCc: David Rientjes \u003crientjes@google.com\u003e\nSigned-off-by: Andrew Morton \u003cakpm@linux-foundation.org\u003e\nSigned-off-by: Linus Torvalds \u003ctorvalds@linux-foundation.org\u003e\n"
    },
    {
      "commit": "d559db086ff5be9bcc259e5aa50bf3d881eaf1d1",
      "tree": "aa968c8a4093234e4623a34c0415bf9d8683671c",
      "parents": [
        "19b629f581320999ddb9f6597051b79cdb53459c"
      ],
      "author": {
        "name": "KAMEZAWA Hiroyuki",
        "email": "kamezawa.hiroyu@jp.fujitsu.com",
        "time": "Fri Mar 05 13:41:39 2010 -0800"
      },
      "committer": {
        "name": "Linus Torvalds",
        "email": "torvalds@linux-foundation.org",
        "time": "Sat Mar 06 11:26:23 2010 -0800"
      },
      "message": "mm: clean up mm_counter\n\nPresently, per-mm statistics counter is defined by macro in sched.h\n\nThis patch modifies it to\n  - defined in mm.h as inlinf functions\n  - use array instead of macro\u0027s name creation.\n\nThis patch is for reducing patch size in future patch to modify\nimplementation of per-mm counter.\n\nSigned-off-by: KAMEZAWA Hiroyuki \u003ckamezawa.hiroyu@jp.fujitsu.com\u003e\nReviewed-by: Minchan Kim \u003cminchan.kim@gmail.com\u003e\nCc: Christoph Lameter \u003ccl@linux-foundation.org\u003e\nCc: Lee Schermerhorn \u003clee.schermerhorn@hp.com\u003e\nCc: David Rientjes \u003crientjes@google.com\u003e\nSigned-off-by: Andrew Morton \u003cakpm@linux-foundation.org\u003e\nSigned-off-by: Linus Torvalds \u003ctorvalds@linux-foundation.org\u003e\n"
    },
    {
      "commit": "4b3073e1c53a256275f1079c0fbfbe85883d9275",
      "tree": "a0fa98cb75edbbc58c43bbe38ac4c6da0913ae6d",
      "parents": [
        "ed42acaef1a9d51631a31b55e9ed52d400430492"
      ],
      "author": {
        "name": "Russell King",
        "email": "rmk+kernel@arm.linux.org.uk",
        "time": "Fri Dec 18 16:40:18 2009 +0000"
      },
      "committer": {
        "name": "Russell King",
        "email": "rmk+kernel@arm.linux.org.uk",
        "time": "Sat Feb 20 16:41:46 2010 +0000"
      },
      "message": "MM: Pass a PTE pointer to update_mmu_cache() rather than the PTE itself\n\nOn VIVT ARM, when we have multiple shared mappings of the same file\nin the same MM, we need to ensure that we have coherency across all\ncopies.  We do this via make_coherent() by making the pages\nuncacheable.\n\nThis used to work fine, until we allowed highmem with highpte - we\nnow have a page table which is mapped as required, and is not available\nfor modification via update_mmu_cache().\n\nRalf Beache suggested getting rid of the PTE value passed to\nupdate_mmu_cache():\n\n  On MIPS update_mmu_cache() calls __update_tlb() which walks pagetables\n  to construct a pointer to the pte again.  Passing a pte_t * is much\n  more elegant.  Maybe we might even replace the pte argument with the\n  pte_t?\n\nBen Herrenschmidt would also like the pte pointer for PowerPC:\n\n  Passing the ptep in there is exactly what I want.  I want that\n  -instead- of the PTE value, because I have issue on some ppc cases,\n  for I$/D$ coherency, where set_pte_at() may decide to mask out the\n  _PAGE_EXEC.\n\nSo, pass in the mapped page table pointer into update_mmu_cache(), and\nremove the PTE value, updating all implementations and call sites to\nsuit.\n\nIncludes a fix from Stephen Rothwell:\n\n  sparc: fix fallout from update_mmu_cache API change\n\n  Signed-off-by: Stephen Rothwell \u003csfr@canb.auug.org.au\u003e\n\nAcked-by: Benjamin Herrenschmidt \u003cbenh@kernel.crashing.org\u003e\nSigned-off-by: Russell King \u003crmk+kernel@arm.linux.org.uk\u003e\n"
    },
    {
      "commit": "d4220f987cf473c65a342ca69e3eb13dea919a49",
      "tree": "dbb004a9c805d6de3f6e3955398fee1084a29f16",
      "parents": [
        "61cf693159d6a968a7014e24905143f71ed8ddcf",
        "f2c03debdfb387fa2e35cac6382779072b8b9209"
      ],
      "author": {
        "name": "Linus Torvalds",
        "email": "torvalds@linux-foundation.org",
        "time": "Wed Dec 16 12:36:49 2009 -0800"
      },
      "committer": {
        "name": "Linus Torvalds",
        "email": "torvalds@linux-foundation.org",
        "time": "Wed Dec 16 12:36:49 2009 -0800"
      },
      "message": "Merge branch \u0027hwpoison\u0027 of git://git.kernel.org/pub/scm/linux/kernel/git/ak/linux-mce-2.6\n\n* \u0027hwpoison\u0027 of git://git.kernel.org/pub/scm/linux/kernel/git/ak/linux-mce-2.6: (34 commits)\n  HWPOISON: Remove stray phrase in a comment\n  HWPOISON: Try to allocate migration page on the same node\n  HWPOISON: Don\u0027t do early filtering if filter is disabled\n  HWPOISON: Add a madvise() injector for soft page offlining\n  HWPOISON: Add soft page offline support\n  HWPOISON: Undefine short-hand macros after use to avoid namespace conflict\n  HWPOISON: Use new shake_page in memory_failure\n  HWPOISON: Use correct name for MADV_HWPOISON in documentation\n  HWPOISON: mention HWPoison in Kconfig entry\n  HWPOISON: Use get_user_page_fast in hwpoison madvise\n  HWPOISON: add an interface to switch off/on all the page filters\n  HWPOISON: add memory cgroup filter\n  memcg: add accessor to mem_cgroup.css\n  memcg: rename and export try_get_mem_cgroup_from_page()\n  HWPOISON: add page flags filter\n  mm: export stable page flags\n  HWPOISON: limit hwpoison injector to known page types\n  HWPOISON: add fs/device filters\n  HWPOISON: return 0 to indicate success reliably\n  HWPOISON: make semantics of IGNORED/DELAYED clear\n  ...\n"
    },
    {
      "commit": "569b846df54ffb2827b83ce3244c5f032394cba4",
      "tree": "77c5d373a5edf97710fab8777912971b99e84828",
      "parents": [
        "cd9b45b78a61e8df250e69385c74e729e5b66abf"
      ],
      "author": {
        "name": "KAMEZAWA Hiroyuki",
        "email": "kamezawa.hiroyu@jp.fujitsu.com",
        "time": "Tue Dec 15 16:47:03 2009 -0800"
      },
      "committer": {
        "name": "Linus Torvalds",
        "email": "torvalds@linux-foundation.org",
        "time": "Wed Dec 16 07:20:07 2009 -0800"
      },
      "message": "memcg: coalesce uncharge during unmap/truncate\n\nIn massive parallel enviroment, res_counter can be a performance\nbottleneck.  One strong techinque to reduce lock contention is reducing\ncalls by coalescing some amount of calls into one.\n\nConsidering charge/uncharge chatacteristic,\n\t- charge is done one by one via demand-paging.\n\t- uncharge is done by\n\t\t- in chunk at munmap, truncate, exit, execve...\n\t\t- one by one via vmscan/paging.\n\nIt seems we have a chance to coalesce uncharges for improving scalability\nat unmap/truncation.\n\nThis patch is a for coalescing uncharge.  For avoiding scattering memcg\u0027s\nstructure to functions under /mm, this patch adds memcg batch uncharge\ninformation to the task.  A reason for per-task batching is for making use\nof caller\u0027s context information.  We do batched uncharge (deleyed\nuncharge) when truncation/unmap occurs but do direct uncharge when\nuncharge is called by memory reclaim (vmscan.c).\n\nThe degree of coalescing depends on callers\n  - at invalidate/trucate... pagevec size\n  - at unmap ....ZAP_BLOCK_SIZE\n(memory itself will be freed in this degree.)\nThen, we\u0027ll not coalescing too much.\n\nOn x86-64 8cpu server, I tested overheads of memcg at page fault by\nrunning a program which does map/fault/unmap in a loop. 
I ran\none task per cpu using taskset and summed the number of page faults\nover 60secs.\n\n[without memcg config]\n  40156968  page-faults              #      0.085 M/sec   ( +-   0.046% )\n  27.67 cache-miss/faults\n[root cgroup]\n  36659599  page-faults              #      0.077 M/sec   ( +-   0.247% )\n  31.58 miss/faults\n[in a child cgroup]\n  18444157  page-faults              #      0.039 M/sec   ( +-   0.133% )\n  69.96 miss/faults\n[child with this patch]\n  27133719  page-faults              #      0.057 M/sec   ( +-   0.155% )\n  47.16 miss/faults\n\nWe can see some improvement.\n(The root cgroup isn\u0027t affected by this patch.)\nAnother patch for \"charge\" will follow this one and will improve the above further.\n\nChangelog(since 2009/10/02):\n - renamed fields of memcg_batch (as pages to bytes, memsw to memsw_bytes)\n - some clean up and commentary/description updates.\n - added initialize code to copy_process(). (possible bug fix)\n\nChangelog(old):\n - fixed !CONFIG_MEM_CGROUP case.\n - rebased onto the latest mmotm + softlimit fix patches.\n - unified patch for callers\n - added comments.\n - make -\u003edo_batch as bool.\n - removed css_get() et al. We don\u0027t need it.\n\nSigned-off-by: KAMEZAWA Hiroyuki \u003ckamezawa.hiroyu@jp.fujitsu.com\u003e\nCc: Balbir Singh \u003cbalbir@in.ibm.com\u003e\nCc: Daisuke Nishimura \u003cnishimura@mxp.nes.nec.co.jp\u003e\nSigned-off-by: Andrew Morton \u003cakpm@linux-foundation.org\u003e\nSigned-off-by: Linus Torvalds \u003ctorvalds@linux-foundation.org\u003e\n"
    },
    {
      "commit": "71f72525dfaaec012e23089c73331654ea7b12d3",
      "tree": "70a8c7831ddab19be1ee8430611ad653f3e764f3",
      "parents": [
        "db0480b3a61bd6ad86ead3b8bbad094ab0996932"
      ],
      "author": {
        "name": "Wu Fengguang",
        "email": "fengguang.wu@intel.com",
        "time": "Wed Dec 16 12:19:58 2009 +0100"
      },
      "committer": {
        "name": "Andi Kleen",
        "email": "ak@linux.intel.com",
        "time": "Wed Dec 16 12:19:58 2009 +0100"
      },
      "message": "HWPOISON: comment dirty swapcache pages\n\nAK: Improve comment\n\nSigned-off-by: Wu Fengguang \u003cfengguang.wu@intel.com\u003e\nSigned-off-by: Andi Kleen \u003cak@linux.intel.com\u003e\n"
    },
    {
      "commit": "5ad6468801d28c4d4ac9f48ec19297817c915f6a",
      "tree": "edd8dc48693f43278d6fe1614aca2bf660d4dc10",
      "parents": [
        "73848b4684e84a84cfd1555af78d41158f31e16b"
      ],
      "author": {
        "name": "Hugh Dickins",
        "email": "hugh.dickins@tiscali.co.uk",
        "time": "Mon Dec 14 17:59:24 2009 -0800"
      },
      "committer": {
        "name": "Linus Torvalds",
        "email": "torvalds@linux-foundation.org",
        "time": "Tue Dec 15 08:53:19 2009 -0800"
      },
      "message": "ksm: let shared pages be swappable\n\nInitial implementation for swapping out KSM\u0027s shared pages: add\npage_referenced_ksm() and try_to_unmap_ksm(), which rmap.c calls when\nfaced with a PageKsm page.\n\nMost of what\u0027s needed can be got from the rmap_items listed from the\nstable_node of the ksm page, without discovering the actual vma: so in\nthis patch just fake up a struct vma for page_referenced_one() or\ntry_to_unmap_one(), then refine that in the next patch.\n\nAdd VM_NONLINEAR to ksm_madvise()\u0027s list of exclusions: it has always been\nimplicit there (being only set with VM_SHARED, already excluded), but\nlet\u0027s make it explicit, to help justify the lack of nonlinear unmap.\n\nRely on the page lock to protect against concurrent modifications to that\npage\u0027s node of the stable tree.\n\nThe awkward part is not swapout but swapin: do_swap_page() and\npage_add_anon_rmap() now have to allow for new possibilities - perhaps a\nksm page still in swapcache, perhaps a swapcache page associated with one\nlocation in one anon_vma now needed for another location or anon_vma.\n(And the vma might even be no longer VM_MERGEABLE when that happens.)\n\nksm_might_need_to_copy() checks for that case, and supplies a duplicate\npage when necessary, simply leaving it to a subsequent pass of ksmd to\nrediscover the identity and merge them back into one ksm page.\nDisappointingly primitive: but the alternative would have to accumulate\nunswappable info about the swapped out ksm pages, limiting swappability.\n\nRemove page_add_ksm_rmap(): page_add_anon_rmap() now has to allow for the\nparticular case it was handling, so just use it instead.\n\nSigned-off-by: Hugh Dickins \u003chugh.dickins@tiscali.co.uk\u003e\nCc: Izik Eidus \u003cieidus@redhat.com\u003e\nCc: Andrea Arcangeli \u003caarcange@redhat.com\u003e\nCc: Chris Wright \u003cchrisw@redhat.com\u003e\nSigned-off-by: Andrew Morton \u003cakpm@linux-foundation.org\u003e\nSigned-off-by: Linus 
Torvalds \u003ctorvalds@linux-foundation.org\u003e\n"
    },
    {
      "commit": "d99be1a8ecf377c2c9b3372d36411ad6547bbd4c",
      "tree": "844a156da951783a46f9cfd0e66858eae12e5f54",
      "parents": [
        "a70caa8ba48f21f46d3b4e71b6b8d14080bbd57a"
      ],
      "author": {
        "name": "Hugh Dickins",
        "email": "hugh.dickins@tiscali.co.uk",
        "time": "Mon Dec 14 17:59:04 2009 -0800"
      },
      "committer": {
        "name": "Linus Torvalds",
        "email": "torvalds@linux-foundation.org",
        "time": "Tue Dec 15 08:53:17 2009 -0800"
      },
      "message": "mm: sigbus instead of abusing oom\n\nWhen do_nonlinear_fault() realizes that the page table must have been\ncorrupted for it to have been called, it does print_bad_pte() and returns\n...  VM_FAULT_OOM, which is hard to understand.\n\nIt made some sense when I did it for 2.6.15, when do_page_fault() just\nkilled the current process; but nowadays it lets the OOM killer decide who\nto kill - so page table corruption in one process would be liable to kill\nanother.\n\nChange it to return VM_FAULT_SIGBUS instead: that doesn\u0027t guarantee that\nthe process will be killed, but is good enough for such a rare\nabnormality, accompanied as it is by the \"BUG: Bad page map\" message.\n\nAnd recent HWPOISON work has copied that code into do_swap_page(), when it\nfinds an impossible swap entry: fix that to VM_FAULT_SIGBUS too.\n\nSigned-off-by: Hugh Dickins \u003chugh.dickins@tiscali.co.uk\u003e\nCc: Izik Eidus \u003cieidus@redhat.com\u003e\nCc: Andrea Arcangeli \u003caarcange@redhat.com\u003e\nCc: Nick Piggin \u003cnpiggin@suse.de\u003e\nReviewed-by: KOSAKI Motohiro \u003ckosaki.motohiro@jp.fujitsu.com\u003e\nCc: Rik van Riel \u003criel@redhat.com\u003e\nCc: Lee Schermerhorn \u003cLee.Schermerhorn@hp.com\u003e\nCc: Andi Kleen \u003candi@firstfloor.org\u003e\nReviewed-by: KAMEZAWA Hiroyuki \u003ckamezawa.hiroyu@jp.fujitsu.com\u003e\nReviewed-by: Wu Fengguang \u003cfengguang.wu@intel.com\u003e\nReviewed-by: Minchan Kim \u003cminchan.kim@gmail.com\u003e\nSigned-off-by: Andrew Morton \u003cakpm@linux-foundation.org\u003e\nSigned-off-by: Linus Torvalds \u003ctorvalds@linux-foundation.org\u003e\n"
    },
    {
      "commit": "570a335b8e22579e2a51a68136d2b1f907a20eec",
      "tree": "c5312383e948d2e7ac60c2fa410fee98e8b38a70",
      "parents": [
        "8d69aaee80c123b460918816cbfa2e83224c3646"
      ],
      "author": {
        "name": "Hugh Dickins",
        "email": "hugh.dickins@tiscali.co.uk",
        "time": "Mon Dec 14 17:58:46 2009 -0800"
      },
      "committer": {
        "name": "Linus Torvalds",
        "email": "torvalds@linux-foundation.org",
        "time": "Tue Dec 15 08:53:15 2009 -0800"
      },
      "message": "swap_info: swap count continuations\n\nSwap is duplicated (reference count incremented by one) whenever the same\nswap page is inserted into another mm (when forking finds a swap entry in\nplace of a pte, or when reclaim unmaps a pte to insert the swap entry).\n\nswap_info_struct\u0027s vmalloc\u0027ed swap_map is the array of these reference\ncounts: but what happens when the unsigned short (or unsigned char since\nthe preceding patch) is full? (and its high bit is kept for a cache flag)\n\nWe then lose track of it, never freeing, leaving it in use until swapoff:\nat which point we _hope_ that a single pass will have found all instances,\nassume there are no more, and will lose user data if we\u0027re wrong.\n\nSwapping of KSM pages has not yet been enabled; but it is implemented,\nand makes it very easy for a user to overflow the maximum swap count:\npossible with ordinary process pages, but unlikely, even when pid_max\nhas been raised from PID_MAX_DEFAULT.\n\nThis patch implements swap count continuations: when the count overflows,\na continuation page is allocated and linked to the original vmalloc\u0027ed\nmap page, and this used to hold the continuation counts for that entry\nand its neighbours.  These continuation pages are seldom referenced:\nthe common paths all work on the original swap_map, only referring to\na continuation page when the low \"digit\" of a count is incremented or\ndecremented through SWAP_MAP_MAX.\n\nSigned-off-by: Hugh Dickins \u003chugh.dickins@tiscali.co.uk\u003e\nCc: KAMEZAWA Hiroyuki \u003ckamezawa.hiroyu@jp.fujitsu.com\u003e\nCc: Rik van Riel \u003criel@redhat.com\u003e\nSigned-off-by: Andrew Morton \u003cakpm@linux-foundation.org\u003e\nSigned-off-by: Linus Torvalds \u003ctorvalds@linux-foundation.org\u003e\n"
    },
    {
      "commit": "3242f9804ba992c867360e2b57efc268b8e4e175",
      "tree": "96fbdbc1344aa67588ce26765f308c674b91a75f",
      "parents": [
        "23756692147c5dfd3328afd42e16e9d943ff756c",
        "7456b0405d8fc063c49628f969cdb23be060fc80"
      ],
      "author": {
        "name": "Linus Torvalds",
        "email": "torvalds@linux-foundation.org",
        "time": "Thu Oct 29 08:20:00 2009 -0700"
      },
      "committer": {
        "name": "Linus Torvalds",
        "email": "torvalds@linux-foundation.org",
        "time": "Thu Oct 29 08:20:00 2009 -0700"
      },
      "message": "Merge branch \u0027hwpoison-2.6.32\u0027 of git://git.kernel.org/pub/scm/linux/kernel/git/ak/linux-mce-2.6\n\n* \u0027hwpoison-2.6.32\u0027 of git://git.kernel.org/pub/scm/linux/kernel/git/ak/linux-mce-2.6:\n  HWPOISON: fix invalid page count in printk output\n  HWPOISON: Allow schedule_on_each_cpu() from keventd\n  HWPOISON: fix/proc/meminfo alignment\n  HWPOISON: fix oops on ksm pages\n  HWPOISON: Fix page count leak in hwpoison late kill in do_swap_page\n  HWPOISON: return early on non-LRU pages\n  HWPOISON: Add brief hwpoison description to Documentation\n  HWPOISON: Clean up PR_MCE_KILL interface\n"
    },
    {
      "commit": "c36987e2ef32e1bb7850379515f21187cba44754",
      "tree": "0b0a6b6a54c2a80de86426a74367ec4b1f089b61",
      "parents": [
        "2545f038f4af0ff9945d47c10f988418dda50140"
      ],
      "author": {
        "name": "Daisuke Nishimura",
        "email": "nishimura@mxp.nes.nec.co.jp",
        "time": "Mon Oct 26 16:50:23 2009 -0700"
      },
      "committer": {
        "name": "Linus Torvalds",
        "email": "torvalds@linux-foundation.org",
        "time": "Thu Oct 29 07:39:32 2009 -0700"
      },
      "message": "mm: don\u0027t call pte_unmap() against an improper pte\n\nThere are some places where we do like:\n\n\tpte \u003d pte_map();\n\tdo {\n\t\t(do break in some conditions)\n\t} while (pte++, ...);\n\tpte_unmap(pte - 1);\n\nBut if the loop breaks at the first loop, pte_unmap() unmaps invalid pte.\n\nThis patch is a fix for this problem.\n\nSigned-off-by: Daisuke Nishimura \u003cnishimura@mxp.nes.nec.co.jp\u003e\nReviewd-by: KAMEZAWA Hiroyuki \u003ckamezawa.hiroyu@jp.fujitsu.com\u003e\nAcked-by: Hugh Dickins \u003chugh.dickins@tiscali.co.uk\u003e\nSigned-off-by: Andrew Morton \u003cakpm@linux-foundation.org\u003e\nSigned-off-by: Linus Torvalds \u003ctorvalds@linux-foundation.org\u003e\n"
    },
    {
      "commit": "4779cb31c0ee3b355116745edca3f3e5fe865553",
      "tree": "7409cb0762ca55afe11aa981db4624d3496ed6fb",
      "parents": [
        "e43c3afb367112a5b357f9adfac7817255129c88"
      ],
      "author": {
        "name": "Andi Kleen",
        "email": "ak@linux.intel.com",
        "time": "Wed Oct 14 01:51:41 2009 +0200"
      },
      "committer": {
        "name": "Andi Kleen",
        "email": "ak@linux.intel.com",
        "time": "Mon Oct 19 07:29:20 2009 +0200"
      },
      "message": "HWPOISON: Fix page count leak in hwpoison late kill in do_swap_page\n\nWhen returning due to a poisoned page drop the page count.\n\nIt wasn\u0027t a fatal problem because noone cares about the page count\non a poisoned page (except when it wraps), but it\u0027s cleaner to fix it.\n\nPointed out by Linus.\n\nSigned-off-by: Andi Kleen \u003cak@linux.intel.com\u003e\n"
    },
    {
      "commit": "6c5daf012c9155aafd2c7973e4278766c30dfad0",
      "tree": "33959d7b36d03e1610615641a2940cb2de5e8603",
      "parents": [
        "6d39b27f0ac7e805ae3bd9efa51d7da04bec0360",
        "c08d3b0e33edce28e9cfa7b64f7fe5bdeeb29248"
      ],
      "author": {
        "name": "Linus Torvalds",
        "email": "torvalds@linux-foundation.org",
        "time": "Thu Sep 24 08:32:11 2009 -0700"
      },
      "committer": {
        "name": "Linus Torvalds",
        "email": "torvalds@linux-foundation.org",
        "time": "Thu Sep 24 08:32:11 2009 -0700"
      },
      "message": "Merge branch \u0027for-linus\u0027 of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6\n\n* \u0027for-linus\u0027 of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6:\n  truncate: use new helpers\n  truncate: new helpers\n  fs: fix overflow in sys_mount() for in-kernel calls\n  fs: Make unload_nls() NULL pointer safe\n  freeze_bdev: grab active reference to frozen superblocks\n  freeze_bdev: kill bd_mount_sem\n  exofs: remove BKL from super operations\n  fs/romfs: correct error-handling code\n  vfs: seq_file: add helpers for data filling\n  vfs: remove redundant position check in do_sendfile\n  vfs: change sb-\u003es_maxbytes to a loff_t\n  vfs: explicitly cast s_maxbytes in fiemap_check_ranges\n  libfs: return error code on failed attr set\n  seq_file: return a negative error code when seq_path_root() fails.\n  vfs: optimize touch_time() too\n  vfs: optimization for touch_atime()\n  vfs: split generic_forget_inode() so that hugetlbfs does not have to copy it\n  fs/inode.c: add dev-id and inode number for debugging in init_special_inode()\n  libfs: make simple_read_from_buffer conventional\n"
    },
    {
      "commit": "db16826367fefcb0ddb93d76b66adc52eb4e6339",
      "tree": "626224c1eb1eb79c522714591f208b4fdbdcd9d4",
      "parents": [
        "cd6045138ed1bb5d8773e940d51c34318eef3ef2",
        "465fdd97cbe16ef8727221857e96ef62dd352017"
      ],
      "author": {
        "name": "Linus Torvalds",
        "email": "torvalds@linux-foundation.org",
        "time": "Thu Sep 24 07:53:22 2009 -0700"
      },
      "committer": {
        "name": "Linus Torvalds",
        "email": "torvalds@linux-foundation.org",
        "time": "Thu Sep 24 07:53:22 2009 -0700"
      },
      "message": "Merge branch \u0027hwpoison\u0027 of git://git.kernel.org/pub/scm/linux/kernel/git/ak/linux-mce-2.6\n\n* \u0027hwpoison\u0027 of git://git.kernel.org/pub/scm/linux/kernel/git/ak/linux-mce-2.6: (21 commits)\n  HWPOISON: Enable error_remove_page on btrfs\n  HWPOISON: Add simple debugfs interface to inject hwpoison on arbitary PFNs\n  HWPOISON: Add madvise() based injector for hardware poisoned pages v4\n  HWPOISON: Enable error_remove_page for NFS\n  HWPOISON: Enable .remove_error_page for migration aware file systems\n  HWPOISON: The high level memory error handler in the VM v7\n  HWPOISON: Add PR_MCE_KILL prctl to control early kill behaviour per process\n  HWPOISON: shmem: call set_page_dirty() with locked page\n  HWPOISON: Define a new error_remove_page address space op for async truncation\n  HWPOISON: Add invalidate_inode_page\n  HWPOISON: Refactor truncate to allow direct truncating of page v2\n  HWPOISON: check and isolate corrupted free pages v2\n  HWPOISON: Handle hardware poisoned pages in try_to_unmap\n  HWPOISON: Use bitmask/action code for try_to_unmap behaviour\n  HWPOISON: x86: Add VM_FAULT_HWPOISON handling to x86 page fault handler v2\n  HWPOISON: Add poison check to page fault handling\n  HWPOISON: Add basic support for poisoned pages in fault handler v3\n  HWPOISON: Add new SIGBUS error codes for hardware poison signals\n  HWPOISON: Add support for poison swap entries v2\n  HWPOISON: Export some rmap vma locking to outside world\n  ...\n"
    },
    {
      "commit": "25d9e2d15286281ec834b829a4aaf8969011f1cd",
      "tree": "e4329a481ca197afae30f04335e023c7d04f7d67",
      "parents": [
        "eca6f534e61919b28fb21aafbd1c2983deae75be"
      ],
      "author": {
        "name": "npiggin@suse.de",
        "email": "npiggin@suse.de",
        "time": "Fri Aug 21 02:35:05 2009 +1000"
      },
      "committer": {
        "name": "al",
        "email": "al@dizzy.pdmi.ras.ru",
        "time": "Thu Sep 24 08:41:47 2009 -0400"
      },
      "message": "truncate: new helpers\n\nIntroduce new truncate helpers truncate_pagecache and inode_newsize_ok.\nvmtruncate is also consolidated from mm/memory.c and mm/nommu.c and\ninto mm/truncate.c.\n\nReviewed-by: Christoph Hellwig \u003chch@lst.de\u003e\nSigned-off-by: Nick Piggin \u003cnpiggin@suse.de\u003e\nSigned-off-by: Al Viro \u003cviro@zeniv.linux.org.uk\u003e\n"
    },
    {
      "commit": "03f6462a3ae78f36eb1f0ee8b4d5ae2f7859c1d5",
      "tree": "bf19c5019705796e90ef592233aca5f09025a92f",
      "parents": [
        "62eede62dafb4a6633eae7ffbeb34c60dba5e7b1"
      ],
      "author": {
        "name": "Hugh Dickins",
        "email": "hugh.dickins@tiscali.co.uk",
        "time": "Mon Sep 21 17:03:35 2009 -0700"
      },
      "committer": {
        "name": "Linus Torvalds",
        "email": "torvalds@linux-foundation.org",
        "time": "Tue Sep 22 07:17:41 2009 -0700"
      },
      "message": "mm: move highest_memmap_pfn\n\nMove highest_memmap_pfn __read_mostly from page_alloc.c next to zero_pfn\n__read_mostly in memory.c: to help them share a cacheline, since they\u0027re\nvery often tested together in vm_normal_page().\n\nSigned-off-by: Hugh Dickins \u003chugh.dickins@tiscali.co.uk\u003e\nCc: Rik van Riel \u003criel@redhat.com\u003e\nCc: KAMEZAWA Hiroyuki \u003ckamezawa.hiroyu@jp.fujitsu.com\u003e\nCc: KOSAKI Motohiro \u003ckosaki.motohiro@jp.fujitsu.com\u003e\nCc: Nick Piggin \u003cnpiggin@suse.de\u003e\nCc: Mel Gorman \u003cmel@csn.ul.ie\u003e\nCc: Minchan Kim \u003cminchan.kim@gmail.com\u003e\nSigned-off-by: Andrew Morton \u003cakpm@linux-foundation.org\u003e\nSigned-off-by: Linus Torvalds \u003ctorvalds@linux-foundation.org\u003e\n"
    },
    {
      "commit": "62eede62dafb4a6633eae7ffbeb34c60dba5e7b1",
      "tree": "e55a0ca4ad0c55ad162443146268cfb4c473750f",
      "parents": [
        "3ae77f43b1118a76ea37952d444319c15e002c03"
      ],
      "author": {
        "name": "Hugh Dickins",
        "email": "hugh.dickins@tiscali.co.uk",
        "time": "Mon Sep 21 17:03:34 2009 -0700"
      },
      "committer": {
        "name": "Linus Torvalds",
        "email": "torvalds@linux-foundation.org",
        "time": "Tue Sep 22 07:17:41 2009 -0700"
      },
      "message": "mm: ZERO_PAGE without PTE_SPECIAL\n\nReinstate anonymous use of ZERO_PAGE to all architectures, not just to\nthose which __HAVE_ARCH_PTE_SPECIAL: as suggested by Nick Piggin.\n\nContrary to how I\u0027d imagined it, there\u0027s nothing ugly about this, just a\nzero_pfn test built into one or another block of vm_normal_page().\n\nBut the MIPS ZERO_PAGE-of-many-colours case demands is_zero_pfn() and\nmy_zero_pfn() inlines.  Reinstate its mremap move_pte() shuffling of\nZERO_PAGEs we did from 2.6.17 to 2.6.19?  Not unless someone shouts for\nthat: it would have to take vm_flags to weed out some cases.\n\nSigned-off-by: Hugh Dickins \u003chugh.dickins@tiscali.co.uk\u003e\nCc: Rik van Riel \u003criel@redhat.com\u003e\nReviewed-by: KAMEZAWA Hiroyuki \u003ckamezawa.hiroyu@jp.fujitsu.com\u003e\nCc: KOSAKI Motohiro \u003ckosaki.motohiro@jp.fujitsu.com\u003e\nCc: Nick Piggin \u003cnpiggin@suse.de\u003e\nCc: Mel Gorman \u003cmel@csn.ul.ie\u003e\nCc: Minchan Kim \u003cminchan.kim@gmail.com\u003e\nCc: Ralf Baechle \u003cralf@linux-mips.org\u003e\nSigned-off-by: Andrew Morton \u003cakpm@linux-foundation.org\u003e\nSigned-off-by: Linus Torvalds \u003ctorvalds@linux-foundation.org\u003e\n"
    },
    {
      "commit": "58fa879e1e640a1856f736b418984ebeccee1c95",
      "tree": "dc37bce8379e29c46e79f105cc71d137b14965cf",
      "parents": [
        "a13ea5b759645a0779edc6dbfec9abfd83220844"
      ],
      "author": {
        "name": "Hugh Dickins",
        "email": "hugh.dickins@tiscali.co.uk",
        "time": "Mon Sep 21 17:03:31 2009 -0700"
      },
      "committer": {
        "name": "Linus Torvalds",
        "email": "torvalds@linux-foundation.org",
        "time": "Tue Sep 22 07:17:40 2009 -0700"
      },
      "message": "mm: FOLL flags for GUP flags\n\n__get_user_pages() has been taking its own GUP flags, then processing\nthem into FOLL flags for follow_page().  Though oddly named, the FOLL\nflags are more widely used, so pass them to __get_user_pages() now.\nSorry, VM flags, VM_FAULT flags and FAULT_FLAGs are still distinct.\n\n(The patch to __get_user_pages() looks peculiar, with both gup_flags\nand foll_flags: the gup_flags remain constant; but as before there\u0027s\nan exceptional case, out of scope of the patch, in which foll_flags\nper page have FOLL_WRITE masked off.)\n\nSigned-off-by: Hugh Dickins \u003chugh.dickins@tiscali.co.uk\u003e\nCc: Rik van Riel \u003criel@redhat.com\u003e\nCc: KAMEZAWA Hiroyuki \u003ckamezawa.hiroyu@jp.fujitsu.com\u003e\nCc: KOSAKI Motohiro \u003ckosaki.motohiro@jp.fujitsu.com\u003e\nCc: Nick Piggin \u003cnpiggin@suse.de\u003e\nCc: Mel Gorman \u003cmel@csn.ul.ie\u003e\nCc: Minchan Kim \u003cminchan.kim@gmail.com\u003e\nSigned-off-by: Andrew Morton \u003cakpm@linux-foundation.org\u003e\nSigned-off-by: Linus Torvalds \u003ctorvalds@linux-foundation.org\u003e\n"
    },
    {
      "commit": "a13ea5b759645a0779edc6dbfec9abfd83220844",
      "tree": "864dd495718195bd065d9f26edac2504e6de5af0",
      "parents": [
        "1ac0cb5d0e22d5e483f56b2bc12172dec1cf7536"
      ],
      "author": {
        "name": "Hugh Dickins",
        "email": "hugh.dickins@tiscali.co.uk",
        "time": "Mon Sep 21 17:03:30 2009 -0700"
      },
      "committer": {
        "name": "Linus Torvalds",
        "email": "torvalds@linux-foundation.org",
        "time": "Tue Sep 22 07:17:40 2009 -0700"
      },
      "message": "mm: reinstate ZERO_PAGE\n\nKAMEZAWA Hiroyuki has observed customers of earlier kernels taking\nadvantage of the ZERO_PAGE: which we stopped do_anonymous_page() from\nusing in 2.6.24.  And there were a couple of regression reports on LKML.\n\nFollowing suggestions from Linus, reinstate do_anonymous_page() use of\nthe ZERO_PAGE; but this time avoid dirtying its struct page cacheline\nwith (map)count updates - let vm_normal_page() regard it as abnormal.\n\nUse it only on arches which __HAVE_ARCH_PTE_SPECIAL (x86, s390, sh32,\nmost powerpc): that\u0027s not essential, but minimizes additional branches\n(keeping them in the unlikely pte_special case); and incidentally\nexcludes mips (some models of which needed eight colours of ZERO_PAGE\nto avoid costly exceptions).\n\nDon\u0027t be fanatical about avoiding ZERO_PAGE updates: get_user_pages()\ncallers won\u0027t want to make exceptions for it, so increment its count\nthere.  Changes to mlock and migration? happily seems not needed.\n\nIn most places it\u0027s quicker to check pfn than struct page address:\nprepare a __read_mostly zero_pfn for that.  Does get_dump_page()\nstill need its ZERO_PAGE check? probably not, but keep it anyway.\n\nSigned-off-by: Hugh Dickins \u003chugh.dickins@tiscali.co.uk\u003e\nAcked-by: Rik van Riel \u003criel@redhat.com\u003e\nCc: KAMEZAWA Hiroyuki \u003ckamezawa.hiroyu@jp.fujitsu.com\u003e\nCc: KOSAKI Motohiro \u003ckosaki.motohiro@jp.fujitsu.com\u003e\nCc: Nick Piggin \u003cnpiggin@suse.de\u003e\nCc: Mel Gorman \u003cmel@csn.ul.ie\u003e\nCc: Minchan Kim \u003cminchan.kim@gmail.com\u003e\nSigned-off-by: Andrew Morton \u003cakpm@linux-foundation.org\u003e\nSigned-off-by: Linus Torvalds \u003ctorvalds@linux-foundation.org\u003e\n"
    },
    {
      "commit": "1ac0cb5d0e22d5e483f56b2bc12172dec1cf7536",
      "tree": "68114711dc747a557895896af991623438034c2d",
      "parents": [
        "2a15efc953b26ad57d7d38b9e6782d57e53b4ab2"
      ],
      "author": {
        "name": "Hugh Dickins",
        "email": "hugh.dickins@tiscali.co.uk",
        "time": "Mon Sep 21 17:03:29 2009 -0700"
      },
      "committer": {
        "name": "Linus Torvalds",
        "email": "torvalds@linux-foundation.org",
        "time": "Tue Sep 22 07:17:40 2009 -0700"
      },
      "message": "mm: fix anonymous dirtying\n\ndo_anonymous_page() has been wrong to dirty the pte regardless.\nIf it\u0027s not going to mark the pte writable, then it won\u0027t help\nto mark it dirty here, and clogs up memory with pages which will\nneed swap instead of being thrown away.  Especially wrong if no\novercommit is chosen, and this vma is not yet VM_ACCOUNTed -\nwe could exceed the limit and OOM despite no overcommit.\n\nSigned-off-by: Hugh Dickins \u003chugh.dickins@tiscali.co.uk\u003e\nCc: \u003cstable@kernel.org\u003e\nAcked-by: Rik van Riel \u003criel@redhat.com\u003e\nCc: KAMEZAWA Hiroyuki \u003ckamezawa.hiroyu@jp.fujitsu.com\u003e\nCc: KOSAKI Motohiro \u003ckosaki.motohiro@jp.fujitsu.com\u003e\nCc: Nick Piggin \u003cnpiggin@suse.de\u003e\nCc: Mel Gorman \u003cmel@csn.ul.ie\u003e\nCc: Minchan Kim \u003cminchan.kim@gmail.com\u003e\nSigned-off-by: Andrew Morton \u003cakpm@linux-foundation.org\u003e\nSigned-off-by: Linus Torvalds \u003ctorvalds@linux-foundation.org\u003e\n"
    },
    {
      "commit": "2a15efc953b26ad57d7d38b9e6782d57e53b4ab2",
      "tree": "f4d04903b3303e80460d2fa3f38da2b7eea82d22",
      "parents": [
        "8e4b9a60718970bbc02dfd3abd0b956ab65af231"
      ],
      "author": {
        "name": "Hugh Dickins",
        "email": "hugh.dickins@tiscali.co.uk",
        "time": "Mon Sep 21 17:03:27 2009 -0700"
      },
      "committer": {
        "name": "Linus Torvalds",
        "email": "torvalds@linux-foundation.org",
        "time": "Tue Sep 22 07:17:40 2009 -0700"
      },
      "message": "mm: follow_hugetlb_page flags\n\nfollow_hugetlb_page() shouldn\u0027t be guessing about the coredump case\neither: pass the foll_flags down to it, instead of just the write bit.\n\nRemove that obscure huge_zeropage_ok() test.  The decision is easy,\nthough unlike the non-huge case - here vm_ops-\u003efault is always set.\nBut we know that a fault would serve up zeroes, unless there\u0027s\nalready a hugetlbfs pagecache page to back the range.\n\n(Alternatively, since hugetlb pages aren\u0027t swapped out under pressure,\nyou could save more dump space by arguing that a page not yet faulted\ninto this process cannot be relevant to the dump; but that would be\nmore surprising.)\n\nSigned-off-by: Hugh Dickins \u003chugh.dickins@tiscali.co.uk\u003e\nAcked-by: Rik van Riel \u003criel@redhat.com\u003e\nCc: KAMEZAWA Hiroyuki \u003ckamezawa.hiroyu@jp.fujitsu.com\u003e\nCc: KOSAKI Motohiro \u003ckosaki.motohiro@jp.fujitsu.com\u003e\nCc: Nick Piggin \u003cnpiggin@suse.de\u003e\nCc: Mel Gorman \u003cmel@csn.ul.ie\u003e\nCc: Minchan Kim \u003cminchan.kim@gmail.com\u003e\nSigned-off-by: Andrew Morton \u003cakpm@linux-foundation.org\u003e\nSigned-off-by: Linus Torvalds \u003ctorvalds@linux-foundation.org\u003e\n"
    },
    {
      "commit": "8e4b9a60718970bbc02dfd3abd0b956ab65af231",
      "tree": "4c19152cea19882071a74f92c0cf6a16d5711f41",
      "parents": [
        "f3e8fccd06d27773186a0094371daf2d84c79469"
      ],
      "author": {
        "name": "Hugh Dickins",
        "email": "hugh.dickins@tiscali.co.uk",
        "time": "Mon Sep 21 17:03:26 2009 -0700"
      },
      "committer": {
        "name": "Linus Torvalds",
        "email": "torvalds@linux-foundation.org",
        "time": "Tue Sep 22 07:17:40 2009 -0700"
      },
      "message": "mm: FOLL_DUMP replace FOLL_ANON\n\nThe \"FOLL_ANON optimization\" and its use_zero_page() test have caused\nconfusion and bugs: why does it test VM_SHARED? for the very good but\nunsatisfying reason that VMware crashed without.  As we look to maybe\nreinstating anonymous use of the ZERO_PAGE, we need to sort this out.\n\nEasily done: it\u0027s silly for __get_user_pages() and follow_page() to\nbe guessing whether it\u0027s safe to assume that they\u0027re being used for\na coredump (which can take a shortcut snapshot where other uses must\nhandle a fault) - just tell them with GUP_FLAGS_DUMP and FOLL_DUMP.\n\nget_dump_page() doesn\u0027t even want a ZERO_PAGE: an error suits fine.\n\nSigned-off-by: Hugh Dickins \u003chugh.dickins@tiscali.co.uk\u003e\nAcked-by: Rik van Riel \u003criel@redhat.com\u003e\nAcked-by: Mel Gorman \u003cmel@csn.ul.ie\u003e\nReviewed-by: Minchan Kim \u003cminchan.kim@gmail.com\u003e\nCc: KAMEZAWA Hiroyuki \u003ckamezawa.hiroyu@jp.fujitsu.com\u003e\nCc: KOSAKI Motohiro \u003ckosaki.motohiro@jp.fujitsu.com\u003e\nCc: Nick Piggin \u003cnpiggin@suse.de\u003e\nSigned-off-by: Andrew Morton \u003cakpm@linux-foundation.org\u003e\nSigned-off-by: Linus Torvalds \u003ctorvalds@linux-foundation.org\u003e\n"
    }
  ],
  "next": "f3e8fccd06d27773186a0094371daf2d84c79469"
}
