)]}'
{
  "log": [
    {
      "commit": "4746efded84d7c5a9c8d64d4c6e814ff0cf9fb42",
      "tree": "174f400db27c1a1d9a66407931199aabfdce6bba",
      "parents": [
        "f7b88631a89757d70192044c9d9f2e8d2fc02f2c"
      ],
      "author": {
        "name": "Shaohua Li",
        "email": "shaohua.li@intel.com",
        "time": "Tue Jul 19 08:49:26 2011 -0700"
      },
      "committer": {
        "name": "Linus Torvalds",
        "email": "torvalds@linux-foundation.org",
        "time": "Tue Jul 19 22:09:31 2011 -0700"
      },
      "message": "vmscan: fix a livelock in kswapd\n\nI\u0027m running a workload which triggers a lot of swap in a machine with 4\nnodes.  After I kill the workload, I found a kswapd livelock.  Sometimes\nkswapd3 or kswapd2 are keeping running and I can\u0027t access filesystem,\nbut most memory is free.\n\nThis looks like a regression since commit 08951e545918c159 (\"mm: vmscan:\ncorrect check for kswapd sleeping in sleeping_prematurely\").\n\nNode 2 and 3 have only ZONE_NORMAL, but balance_pgdat() will return 0\nfor classzone_idx.  The reason is end_zone in balance_pgdat() is 0 by\ndefault, if all zones have watermark ok, end_zone will keep 0.\n\nLater sleeping_prematurely() always returns true.  Because this is an\norder 3 wakeup, and if classzone_idx is 0, both balanced_pages and\npresent_pages in pgdat_balanced() are 0.  We add a special case here.\nIf a zone has no page, we think it\u0027s balanced.  This fixes the livelock.\n\nSigned-off-by: Shaohua Li \u003cshaohua.li@intel.com\u003e\nAcked-by: Mel Gorman \u003cmgorman@suse.de\u003e\nCc: Minchan Kim \u003cminchan.kim@gmail.com\u003e\nCc: \u003cstable@kernel.org\u003e\nSigned-off-by: Andrew Morton \u003cakpm@linux-foundation.org\u003e\nSigned-off-by: Linus Torvalds \u003ctorvalds@linux-foundation.org\u003e\n"
    },
    {
      "commit": "215ddd6664ced067afca7eebd2d1eb83f064ff5a",
      "tree": "b0e01235355d9c77b3bf63e0a57a6721fc8e3793",
      "parents": [
        "da175d06b437093f93109ba9e5efbe44dfdf9409"
      ],
      "author": {
        "name": "Mel Gorman",
        "email": "mgorman@suse.de",
        "time": "Fri Jul 08 15:39:40 2011 -0700"
      },
      "committer": {
        "name": "Linus Torvalds",
        "email": "torvalds@linux-foundation.org",
        "time": "Fri Jul 08 21:14:43 2011 -0700"
      },
      "message": "mm: vmscan: only read new_classzone_idx from pgdat when reclaiming successfully\n\nDuring allocator-intensive workloads, kswapd will be woken frequently\ncausing free memory to oscillate between the high and min watermark.  This\nis expected behaviour.  Unfortunately, if the highest zone is small, a\nproblem occurs.\n\nWhen balance_pgdat() returns, it may be at a lower classzone_idx than it\nstarted because the highest zone was unreclaimable.  Before checking if it\nshould go to sleep though, it checks pgdat-\u003eclasszone_idx which when there\nis no other activity will be MAX_NR_ZONES-1.  It interprets this as it has\nbeen woken up while reclaiming, skips scheduling and reclaims again.  As\nthere is no useful reclaim work to do, it enters into a loop of shrinking\nslab consuming loads of CPU until the highest zone becomes reclaimable for\na long period of time.\n\nThere are two problems here.  1) If the returned classzone or order is\nlower, it\u0027ll continue reclaiming without scheduling.  2) if the highest\nzone was marked unreclaimable but balance_pgdat() returns immediately at\nDEF_PRIORITY, the new lower classzone is not communicated back to kswapd()\nfor sleeping.\n\nThis patch does two things that are related.  If the end_zone is\nunreclaimable, this information is communicated back.  Second, if the\nclasszone or order was reduced due to failing to reclaim, new information\nis not read from pgdat and instead an attempt is made to go to sleep.  Due\nto this, it is also necessary that pgdat-\u003eclasszone_idx be initialised\neach time to pgdat-\u003enr_zones - 1 to avoid re-reads being interpreted as\nwakeups.\n\nSigned-off-by: Mel Gorman \u003cmgorman@suse.de\u003e\nReported-by: Pádraig Brady \u003cP@draigBrady.com\u003e\nTested-by: Pádraig Brady \u003cP@draigBrady.com\u003e\nTested-by: Andrew Lutomirski \u003cluto@mit.edu\u003e\nAcked-by: Rik van Riel \u003criel@redhat.com\u003e\nCc: Minchan Kim \u003cminchan.kim@gmail.com\u003e\nCc: KOSAKI Motohiro \u003ckosaki.motohiro@jp.fujitsu.com\u003e\nCc: Johannes Weiner \u003channes@cmpxchg.org\u003e\nCc: \u003cstable@kernel.org\u003e\nSigned-off-by: Andrew Morton \u003cakpm@linux-foundation.org\u003e\nSigned-off-by: Linus Torvalds \u003ctorvalds@linux-foundation.org\u003e\n"
    },
    {
      "commit": "da175d06b437093f93109ba9e5efbe44dfdf9409",
      "tree": "04bbed7c41347e3614690947c66f87e8e38a6051",
      "parents": [
        "d7868dae893c83c50c7824bc2bc75f93d114669f"
      ],
      "author": {
        "name": "Mel Gorman",
        "email": "mgorman@suse.de",
        "time": "Fri Jul 08 15:39:39 2011 -0700"
      },
      "committer": {
        "name": "Linus Torvalds",
        "email": "torvalds@linux-foundation.org",
        "time": "Fri Jul 08 21:14:43 2011 -0700"
      },
      "message": "mm: vmscan: evaluate the watermarks against the correct classzone\n\nWhen deciding if kswapd is sleeping prematurely, the classzone is taken\ninto account but this is different to what balance_pgdat() and the\nallocator are doing.  Specifically, the DMA zone will be checked based on\nthe classzone used when waking kswapd which could be for a GFP_KERNEL or\nGFP_HIGHMEM request.  The lowmem reserve limit kicks in, the watermark is\nnot met and kswapd thinks it\u0027s sleeping prematurely keeping kswapd awake in\nerror.\n\nSigned-off-by: Mel Gorman \u003cmgorman@suse.de\u003e\nReported-by: Pádraig Brady \u003cP@draigBrady.com\u003e\nTested-by: Pádraig Brady \u003cP@draigBrady.com\u003e\nTested-by: Andrew Lutomirski \u003cluto@mit.edu\u003e\nAcked-by: Rik van Riel \u003criel@redhat.com\u003e\nReviewed-by: Minchan Kim \u003cminchan.kim@gmail.com\u003e\nCc: KOSAKI Motohiro \u003ckosaki.motohiro@jp.fujitsu.com\u003e\nCc: Johannes Weiner \u003channes@cmpxchg.org\u003e\nCc: \u003cstable@kernel.org\u003e\nSigned-off-by: Andrew Morton \u003cakpm@linux-foundation.org\u003e\nSigned-off-by: Linus Torvalds \u003ctorvalds@linux-foundation.org\u003e\n"
    },
    {
      "commit": "d7868dae893c83c50c7824bc2bc75f93d114669f",
      "tree": "7c9e56513ecbbf086c81ebff77310f80e0232ecc",
      "parents": [
        "08951e545918c1594434d000d88a7793e2452a9b"
      ],
      "author": {
        "name": "Mel Gorman",
        "email": "mgorman@suse.de",
        "time": "Fri Jul 08 15:39:38 2011 -0700"
      },
      "committer": {
        "name": "Linus Torvalds",
        "email": "torvalds@linux-foundation.org",
        "time": "Fri Jul 08 21:14:43 2011 -0700"
      },
      "message": "mm: vmscan: do not apply pressure to slab if we are not applying pressure to zone\n\nDuring allocator-intensive workloads, kswapd will be woken frequently\ncausing free memory to oscillate between the high and min watermark.  This\nis expected behaviour.\n\nWhen kswapd applies pressure to zones during node balancing, it checks if\nthe zone is above a high+balance_gap threshold.  If it is, it does not\napply pressure but it unconditionally shrinks slab on a global basis which\nis excessive.  In the event kswapd is being kept awake due to a high small\nunreclaimable zone, it skips zone shrinking but still calls shrink_slab().\n\nOnce pressure has been applied, the check for zone being unreclaimable is\nbeing made before the check is made if all_unreclaimable should be set.\nThis miss of unreclaimable can cause has_under_min_watermark_zone to be\nset due to an unreclaimable zone preventing kswapd backing off on\ncongestion_wait().\n\nSigned-off-by: Mel Gorman \u003cmgorman@suse.de\u003e\nReported-by: Pádraig Brady \u003cP@draigBrady.com\u003e\nTested-by: Pádraig Brady \u003cP@draigBrady.com\u003e\nTested-by: Andrew Lutomirski \u003cluto@mit.edu\u003e\nAcked-by: Rik van Riel \u003criel@redhat.com\u003e\nReviewed-by: Minchan Kim \u003cminchan.kim@gmail.com\u003e\nReviewed-by: KOSAKI Motohiro \u003ckosaki.motohiro@jp.fujitsu.com\u003e\nCc: Johannes Weiner \u003channes@cmpxchg.org\u003e\nCc: \u003cstable@kernel.org\u003e\nSigned-off-by: Andrew Morton \u003cakpm@linux-foundation.org\u003e\nSigned-off-by: Linus Torvalds \u003ctorvalds@linux-foundation.org\u003e\n"
    },
    {
      "commit": "08951e545918c1594434d000d88a7793e2452a9b",
      "tree": "56f91aba454751e2b6bcd67945ed2d4ebbeb2025",
      "parents": [
        "902daf6580cffe04721250fb71b5527a98718b11"
      ],
      "author": {
        "name": "Mel Gorman",
        "email": "mgorman@suse.de",
        "time": "Fri Jul 08 15:39:36 2011 -0700"
      },
      "committer": {
        "name": "Linus Torvalds",
        "email": "torvalds@linux-foundation.org",
        "time": "Fri Jul 08 21:14:42 2011 -0700"
      },
      "message": "mm: vmscan: correct check for kswapd sleeping in sleeping_prematurely\n\nDuring allocator-intensive workloads, kswapd will be woken frequently\ncausing free memory to oscillate between the high and min watermark.  This\nis expected behaviour.  Unfortunately, if the highest zone is small, a\nproblem occurs.\n\nThis seems to happen most with recent sandybridge laptops but it\u0027s\nprobably a co-incidence as some of these laptops just happen to have a\nsmall Normal zone.  The reproduction case is almost always during copying\nlarge files that kswapd pegs at 100% CPU until the file is deleted or\ncache is dropped.\n\nThe problem is mostly down to sleeping_prematurely() keeping kswapd awake\nwhen the highest zone is small and unreclaimable and compounded by the\nfact we shrink slabs even when not shrinking zones causing a lot of time\nto be spent in shrinkers and a lot of memory to be reclaimed.\n\nPatch 1 corrects sleeping_prematurely to check the zones matching\n\tthe classzone_idx instead of all zones.\n\nPatch 2 avoids shrinking slab when we are not shrinking a zone.\n\nPatch 3 notes that sleeping_prematurely is checking lower zones against\n\ta high classzone which is not what allocators or balance_pgdat()\n\tis doing leading to an artifical belief that kswapd should be\n\tstill awake.\n\nPatch 4 notes that when balance_pgdat() gives up on a high zone that the\n\tdecision is not communicated to sleeping_prematurely()\n\nThis problem affects 2.6.38.8 for certain and is expected to affect 2.6.39\nand 3.0-rc4 as well.  If accepted, they need to go to -stable to be picked\nup by distros and this series is against 3.0-rc4.  I\u0027ve cc\u0027d people that\nreported similar problems recently to see if they still suffer from the\nproblem and if this fixes it.\n\nThis patch: correct the check for kswapd sleeping in sleeping_prematurely()\n\nDuring allocator-intensive workloads, kswapd will be woken frequently\ncausing free memory to oscillate between the high and min watermark.  This\nis expected behaviour.\n\nA problem occurs if the highest zone is small.  balance_pgdat() only\nconsiders unreclaimable zones when priority is DEF_PRIORITY but\nsleeping_prematurely considers all zones.  It\u0027s possible for this sequence\nto occur\n\n  1. kswapd wakes up and enters balance_pgdat()\n  2. At DEF_PRIORITY, marks highest zone unreclaimable\n  3. At DEF_PRIORITY-1, ignores highest zone setting end_zone\n  4. At DEF_PRIORITY-1, calls shrink_slab freeing memory from\n        highest zone, clearing all_unreclaimable. Highest zone\n        is still unbalanced\n  5. kswapd returns and calls sleeping_prematurely\n  6. sleeping_prematurely looks at *all* zones, not just the ones\n     being considered by balance_pgdat. The highest small zone\n     has all_unreclaimable cleared but the zone is not\n     balanced. all_zones_ok is false so kswapd stays awake\n\nThis patch corrects the behaviour of sleeping_prematurely to check the\nzones balance_pgdat() checked.\n\nSigned-off-by: Mel Gorman \u003cmgorman@suse.de\u003e\nReported-by: Pádraig Brady \u003cP@draigBrady.com\u003e\nTested-by: Pádraig Brady \u003cP@draigBrady.com\u003e\nTested-by: Andrew Lutomirski \u003cluto@mit.edu\u003e\nAcked-by: Rik van Riel \u003criel@redhat.com\u003e\nReviewed-by: Minchan Kim \u003cminchan.kim@gmail.com\u003e\nReviewed-by: KOSAKI Motohiro \u003ckosaki.motohiro@jp.fujitsu.com\u003e\nCc: Johannes Weiner \u003channes@cmpxchg.org\u003e\nCc: \u003cstable@kernel.org\u003e\nSigned-off-by: Andrew Morton \u003cakpm@linux-foundation.org\u003e\nSigned-off-by: Linus Torvalds \u003ctorvalds@linux-foundation.org\u003e\n"
    },
    {
      "commit": "ac34a1a3c39da0a1b9188d12a9ce85506364ed2a",
      "tree": "f74f34047c6bc516e29196685cc8671aff4a02d2",
      "parents": [
        "26c4caea9d697043cc5a458b96411b86d7f6babd"
      ],
      "author": {
        "name": "KAMEZAWA Hiroyuki",
        "email": "kamezawa.hiroyu@jp.fujitsu.com",
        "time": "Mon Jun 27 16:18:12 2011 -0700"
      },
      "committer": {
        "name": "Linus Torvalds",
        "email": "torvalds@linux-foundation.org",
        "time": "Mon Jun 27 18:00:13 2011 -0700"
      },
      "message": "memcg: fix direct softlimit reclaim to be called in limit path\n\nCommit d149e3b25d7c (\"memcg: add the soft_limit reclaim in global direct\nreclaim\") adds a softlimit hook to shrink_zones().  By this, soft limit\nis called as\n\n   try_to_free_pages()\n       do_try_to_free_pages()\n           shrink_zones()\n               mem_cgroup_soft_limit_reclaim()\n\nThen, direct reclaim is memcg softlimit hint aware, now.\n\nBut, the memory cgroup\u0027s \"limit\" path can call softlimit shrinker.\n\n   try_to_free_mem_cgroup_pages()\n       do_try_to_free_pages()\n           shrink_zones()\n               mem_cgroup_soft_limit_reclaim()\n\nThis will cause a global reclaim when a memcg hits limit.\n\nThis is bug. soft_limit_reclaim() should be called when\nscanning_global_lru(sc) \u003d\u003d true.\n\nAnd the commit adds a variable \"total_scanned\" for counting softlimit\nscanned pages....it\u0027s not \"total\".  This patch removes the variable and\nupdate sc-\u003enr_scanned instead of it.  This will affect shrink_slab()\u0027s\nscan condition but, global LRU is scanned by softlimit and I think this\nchange makes sense.\n\nTODO: avoid too much scanning of a zone when softlimit did enough work.\n\nSigned-off-by: KAMEZAWA Hiroyuki \u003ckamezawa.hiroyu@jp.fujitsu.com\u003e\nCc: Daisuke Nishimura \u003cnishimura@mxp.nes.nec.co.jp\u003e\nCc: Ying Han \u003cyinghan@google.com\u003e\nCc: Michal Hocko \u003cmhocko@suse.cz\u003e\nSigned-off-by: Andrew Morton \u003cakpm@linux-foundation.org\u003e\nSigned-off-by: Linus Torvalds \u003ctorvalds@linux-foundation.org\u003e\n"
    },
    {
      "commit": "d179e84ba5da1d0024087d1759a2938817a00f3f",
      "tree": "169c9cc4030a793df1bc29613eff85ee3acef9a9",
      "parents": [
        "7454f4ba40b419eb999a3c61a99da662bf1a2bb8"
      ],
      "author": {
        "name": "Andrea Arcangeli",
        "email": "aarcange@redhat.com",
        "time": "Wed Jun 15 15:08:51 2011 -0700"
      },
      "committer": {
        "name": "Linus Torvalds",
        "email": "torvalds@linux-foundation.org",
        "time": "Wed Jun 15 20:04:02 2011 -0700"
      },
      "message": "mm: vmscan: do not use page_count without a page pin\n\nIt is unsafe to run page_count during the physical pfn scan because\ncompound_head could trip on a dangling pointer when reading\npage-\u003efirst_page if the compound page is being freed by another CPU.\n\n[mgorman@suse.de: split out patch]\nSigned-off-by: Andrea Arcangeli \u003caarcange@redhat.com\u003e\nSigned-off-by: Mel Gorman \u003cmgorman@suse.de\u003e\nReviewed-by: Michal Hocko \u003cmhocko@suse.cz\u003e\nReviewed-by: Minchan Kim \u003cminchan.kim@gmail.com\u003e\n\nSigned-off-by: Andrew Morton \u003cakpm@linux-foundation.org\u003e\nSigned-off-by: Linus Torvalds \u003ctorvalds@linux-foundation.org\u003e\n"
    },
    {
      "commit": "a433658c30974fc87ba3ff52d7e4e6299762aa3d",
      "tree": "8df65e22af520ca5c020281763e6874d0bb51bc5",
      "parents": [
        "e1bbd19bc4afef7adb80cca163800391c4f5773d"
      ],
      "author": {
        "name": "KOSAKI Motohiro",
        "email": "kosaki.motohiro@jp.fujitsu.com",
        "time": "Wed Jun 15 15:08:13 2011 -0700"
      },
      "committer": {
        "name": "Linus Torvalds",
        "email": "torvalds@linux-foundation.org",
        "time": "Wed Jun 15 20:03:59 2011 -0700"
      },
      "message": "vmscan,memcg: memcg aware swap token\n\nCurrently, memcg reclaim can disable swap token even if the swap token mm\ndoesn\u0027t belong in its memory cgroup.  It\u0027s slightly risky.  If an admin\ncreates very small mem-cgroup and silly guy runs contentious heavy memory\npressure workload, every tasks are going to lose swap token and then\nsystem may become unresponsive.  That\u0027s bad.\n\nThis patch adds \u0027memcg\u0027 parameter into disable_swap_token().  and if the\nparameter doesn\u0027t match swap token, VM doesn\u0027t disable it.\n\n[akpm@linux-foundation.org: coding-style fixes]\nSigned-off-by: KOSAKI Motohiro \u003ckosaki.motohiro@jp.fujitsu.com\u003e\nReviewed-by: KAMEZAWA Hiroyuki \u003ckamezawa.hiroyu@jp.fujitsu.com\u003e\nReviewed-by: Rik van Riel\u003criel@redhat.com\u003e\nSigned-off-by: Andrew Morton \u003cakpm@linux-foundation.org\u003e\nSigned-off-by: Linus Torvalds \u003ctorvalds@linux-foundation.org\u003e\n"
    },
    {
      "commit": "1bac180bd29e03989f50054af97b53b8d37a364a",
      "tree": "6797cb73a27c1e8b7d1ea79764356dc69486dad4",
      "parents": [
        "4fd14ebf6e3b66423dfac2bc9defda7b83ee07b3"
      ],
      "author": {
        "name": "Ying Han",
        "email": "yinghan@google.com",
        "time": "Thu May 26 16:25:36 2011 -0700"
      },
      "committer": {
        "name": "Linus Torvalds",
        "email": "torvalds@linux-foundation.org",
        "time": "Thu May 26 17:12:35 2011 -0700"
      },
      "message": "memcg: rename mem_cgroup_zone_nr_pages() to mem_cgroup_zone_nr_lru_pages()\n\nThe caller of the function has been renamed to zone_nr_lru_pages(), and\nthis is just fixing up in the memcg code.  The current name is easily to\nbe mis-read as zone\u0027s total number of pages.\n\nSigned-off-by: Ying Han \u003cyinghan@google.com\u003e\nAcked-by: Johannes Weiner \u003channes@cmpxchg.org\u003e\nAcked-by: KAMEZAWA Hiroyuki \u003ckamezawa.hiroyu@jp.fujitsu.com\u003e\nReviewed-by: Minchan Kim \u003cminchan.kim@gmail.com\u003e\nCc: Balbir Singh \u003cbalbir@in.ibm.com\u003e\nCc: Daisuke Nishimura \u003cnishimura@mxp.nes.nec.co.jp\u003e\nSigned-off-by: Andrew Morton \u003cakpm@linux-foundation.org\u003e\nSigned-off-by: Linus Torvalds \u003ctorvalds@linux-foundation.org\u003e\n"
    },
    {
      "commit": "246e87a9393448c20873bc5dee64be68ed559e24",
      "tree": "a17016142b267fcba2e3be9908f8138c8dcb3f3a",
      "parents": [
        "889976dbcb1218119fdd950fb7819084e37d7d37"
      ],
      "author": {
        "name": "KAMEZAWA Hiroyuki",
        "email": "kamezawa.hiroyu@jp.fujitsu.com",
        "time": "Thu May 26 16:25:34 2011 -0700"
      },
      "committer": {
        "name": "Linus Torvalds",
        "email": "torvalds@linux-foundation.org",
        "time": "Thu May 26 17:12:35 2011 -0700"
      },
      "message": "memcg: fix get_scan_count() for small targets\n\nDuring memory reclaim we determine the number of pages to be scanned per\nzone as\n\n\t(anon + file) \u003e\u003e priority.\nAssume\n\tscan \u003d (anon + file) \u003e\u003e priority.\n\nIf scan \u003c SWAP_CLUSTER_MAX, the scan will be skipped for this time and\npriority gets higher.  This has some problems.\n\n  1. This increases priority as 1 without any scan.\n     To do scan in this priority, amount of pages should be larger than 512M.\n     If pages\u003e\u003epriority \u003c SWAP_CLUSTER_MAX, it\u0027s recorded and scan will be\n     batched, later. (But we lose 1 priority.)\n     If memory size is below 16M, pages \u003e\u003e priority is 0 and no scan in\n     DEF_PRIORITY forever.\n\n  2. If zone-\u003eall_unreclaimabe\u003d\u003dtrue, it\u0027s scanned only when priority\u003d\u003d0.\n     So, x86\u0027s ZONE_DMA will never be recoverred until the user of pages\n     frees memory by itself.\n\n  3. With memcg, the limit of memory can be small. When using small memcg,\n     it gets priority \u003c DEF_PRIORITY-2 very easily and need to call\n     wait_iff_congested().\n     For doing scan before priorty\u003d9, 64MB of memory should be used.\n\nThen, this patch tries to scan SWAP_CLUSTER_MAX of pages in force...when\n\n  1. the target is enough small.\n  2. it\u0027s kswapd or memcg reclaim.\n\nThen we can avoid rapid priority drop and may be able to recover\nall_unreclaimable in a small zones.  And this patch removes nr_saved_scan.\n This will allow scanning in this priority even when pages \u003e\u003e priority is\nvery small.\n\nSigned-off-by: KAMEZAWA Hiroyuki \u003ckamezawa.hiroyu@jp.fujitsu.com\u003e\nAcked-by: Ying Han \u003cyinghan@google.com\u003e\nCc: Balbir Singh \u003cbalbir@in.ibm.com\u003e\nCc: KOSAKI Motohiro \u003ckosaki.motohiro@jp.fujitsu.com\u003e\nCc: Daisuke Nishimura \u003cnishimura@mxp.nes.nec.co.jp\u003e\nCc: Mel Gorman \u003cmel@csn.ul.ie\u003e\nSigned-off-by: Andrew Morton \u003cakpm@linux-foundation.org\u003e\nSigned-off-by: Linus Torvalds \u003ctorvalds@linux-foundation.org\u003e\n"
    },
    {
      "commit": "889976dbcb1218119fdd950fb7819084e37d7d37",
      "tree": "7508706ddb6bcbe0f673aca3744f30f281b17734",
      "parents": [
        "4e4c941c108eff10844d2b441d96dab44f32f424"
      ],
      "author": {
        "name": "Ying Han",
        "email": "yinghan@google.com",
        "time": "Thu May 26 16:25:33 2011 -0700"
      },
      "committer": {
        "name": "Linus Torvalds",
        "email": "torvalds@linux-foundation.org",
        "time": "Thu May 26 17:12:35 2011 -0700"
      },
      "message": "memcg: reclaim memory from nodes in round-robin order\n\nPresently, memory cgroup\u0027s direct reclaim frees memory from the current\nnode.  But this has some troubles.  Usually when a set of threads works in\na cooperative way, they tend to operate on the same node.  So if they hit\nlimits under memcg they will reclaim memory from themselves, damaging the\nactive working set.\n\nFor example, assume 2 node system which has Node 0 and Node 1 and a memcg\nwhich has 1G limit.  After some work, file cache remains and the usages\nare\n\n   Node 0:  1M\n   Node 1:  998M.\n\nand run an application on Node 0, it will eat its foot before freeing\nunnecessary file caches.\n\nThis patch adds round-robin for NUMA and adds equal pressure to each node.\nWhen using cpuset\u0027s spread memory feature, this will work very well.\n\nBut yes, a better algorithm is needed.\n\n[akpm@linux-foundation.org: comment editing]\n[kamezawa.hiroyu@jp.fujitsu.com: fix time comparisons]\nSigned-off-by: Ying Han \u003cyinghan@google.com\u003e\nSigned-off-by: KAMEZAWA Hiroyuki \u003ckamezawa.hiroyu@jp.fujitsu.com\u003e\nCc: Balbir Singh \u003cbalbir@in.ibm.com\u003e\nCc: KOSAKI Motohiro \u003ckosaki.motohiro@jp.fujitsu.com\u003e\nCc: Daisuke Nishimura \u003cnishimura@mxp.nes.nec.co.jp\u003e\nCc: Mel Gorman \u003cmel@csn.ul.ie\u003e\nSigned-off-by: Andrew Morton \u003cakpm@linux-foundation.org\u003e\nSigned-off-by: Linus Torvalds \u003ctorvalds@linux-foundation.org\u003e\n"
    },
    {
      "commit": "d149e3b25d7c5f33de9aa866303926fa53535aa7",
      "tree": "160c8c3136246921458c96ab8257381d702208aa",
      "parents": [
        "0ae5e89c60c9eb87da36a2614836bc434b0ec2ad"
      ],
      "author": {
        "name": "Ying Han",
        "email": "yinghan@google.com",
        "time": "Thu May 26 16:25:27 2011 -0700"
      },
      "committer": {
        "name": "Linus Torvalds",
        "email": "torvalds@linux-foundation.org",
        "time": "Thu May 26 17:12:35 2011 -0700"
      },
      "message": "memcg: add the soft_limit reclaim in global direct reclaim.\n\nWe recently added the change in global background reclaim which counts the\nreturn value of soft_limit reclaim.  Now this patch adds the similar logic\non global direct reclaim.\n\nWe should skip scanning global LRU on shrink_zone if soft_limit reclaim\ndoes enough work.  This is the first step where we start with counting the\nnr_scanned and nr_reclaimed from soft_limit reclaim into global\nscan_control.\n\nSigned-off-by: Ying Han \u003cyinghan@google.com\u003e\nCc: KOSAKI Motohiro \u003ckosaki.motohiro@jp.fujitsu.com\u003e\nCc: Minchan Kim \u003cminchan.kim@gmail.com\u003e\nCc: Daisuke Nishimura \u003cnishimura@mxp.nes.nec.co.jp\u003e\nCc: Balbir Singh \u003cbalbir@linux.vnet.ibm.com\u003e\nCc: KAMEZAWA Hiroyuki \u003ckamezawa.hiroyu@jp.fujitsu.com\u003e\nCc: Mel Gorman \u003cmel@csn.ul.ie\u003e\nCc: Johannes Weiner \u003channes@cmpxchg.org\u003e\nCc: Rik van Riel \u003criel@redhat.com\u003e\nCc: Hugh Dickins \u003chughd@google.com\u003e\nCc: Michal Hocko \u003cmhocko@suse.cz\u003e\nSigned-off-by: Andrew Morton \u003cakpm@linux-foundation.org\u003e\nSigned-off-by: Linus Torvalds \u003ctorvalds@linux-foundation.org\u003e\n"
    },
    {
      "commit": "0ae5e89c60c9eb87da36a2614836bc434b0ec2ad",
      "tree": "0d509fd83ac7e7d2f52dfcbba769c43aeeb68b5f",
      "parents": [
        "f042e707ee671e4beb5389abeb9a1819a2cf5532"
      ],
      "author": {
        "name": "Ying Han",
        "email": "yinghan@google.com",
        "time": "Thu May 26 16:25:25 2011 -0700"
      },
      "committer": {
        "name": "Linus Torvalds",
        "email": "torvalds@linux-foundation.org",
        "time": "Thu May 26 17:12:35 2011 -0700"
      },
      "message": "memcg: count the soft_limit reclaim in global background reclaim\n\nThe global kswapd scans per-zone LRU and reclaims pages regardless of the\ncgroup. It breaks memory isolation since one cgroup can end up reclaiming\npages from another cgroup. Instead we should rely on memcg-aware target\nreclaim including per-memcg kswapd and soft_limit hierarchical reclaim under\nmemory pressure.\n\nIn the global background reclaim, we do soft reclaim before scanning the\nper-zone LRU. However, the return value is ignored. This patch is the first\nstep to skip shrink_zone() if soft_limit reclaim does enough work.\n\nThis is part of the effort which tries to reduce reclaiming pages in global\nLRU in memcg. The per-memcg background reclaim patchset further enhances the\nper-cgroup targetting reclaim, which I should have V4 posted shortly.\n\nTry running multiple memory intensive workloads within seperate memcgs. Watch\nthe counters of soft_steal in memory.stat.\n\n  $ cat /dev/cgroup/A/memory.stat | grep \u0027soft\u0027\n  soft_steal 240000\n  soft_scan 240000\n  total_soft_steal 240000\n  total_soft_scan 240000\n\nThis patch:\n\nIn the global background reclaim, we do soft reclaim before scanning the\nper-zone LRU.  However, the return value is ignored.\n\nWe would like to skip shrink_zone() if soft_limit reclaim does enough\nwork.  Also, we need to make the memory pressure balanced across per-memcg\nzones, like the logic vm-core.  This patch is the first step where we\nstart with counting the nr_scanned and nr_reclaimed from soft_limit\nreclaim into the global scan_control.\n\nSigned-off-by: Ying Han \u003cyinghan@google.com\u003e\nCc: KOSAKI Motohiro \u003ckosaki.motohiro@jp.fujitsu.com\u003e\nCc: Minchan Kim \u003cminchan.kim@gmail.com\u003e\nCc: Rik van Riel \u003criel@redhat.com\u003e\nCc: Mel Gorman \u003cmel@csn.ul.ie\u003e\nCc: KAMEZAWA Hiroyuki \u003ckamezawa.hiroyu@jp.fujitsu.com\u003e\nCc: Balbir Singh \u003cbalbir@in.ibm.com\u003e\nAcked-by: Daisuke Nishimura \u003cnishimura@mxp.nes.nec.co.jp\u003e\nSigned-off-by: Andrew Morton \u003cakpm@linux-foundation.org\u003e\nSigned-off-by: Linus Torvalds \u003ctorvalds@linux-foundation.org\u003e\n"
    },
    {
      "commit": "1495f230fa7750479c79e3656286b9183d662077",
      "tree": "e5e233bb9fe1916ccc7281e7dcc71b1572fb22c5",
      "parents": [
        "a09ed5e00084448453c8bada4dcd31e5fbfc2f21"
      ],
      "author": {
        "name": "Ying Han",
        "email": "yinghan@google.com",
        "time": "Tue May 24 17:12:27 2011 -0700"
      },
      "committer": {
        "name": "Linus Torvalds",
        "email": "torvalds@linux-foundation.org",
        "time": "Wed May 25 08:39:26 2011 -0700"
      },
      "message": "vmscan: change shrinker API by passing shrink_control struct\n\nChange each shrinker\u0027s API by consolidating the existing parameters into\nshrink_control struct.  This will simplify any further features added w/o\ntouching each file of shrinker.\n\n[akpm@linux-foundation.org: fix build]\n[akpm@linux-foundation.org: fix warning]\n[kosaki.motohiro@jp.fujitsu.com: fix up new shrinker API]\n[akpm@linux-foundation.org: fix xfs warning]\n[akpm@linux-foundation.org: update gfs2]\nSigned-off-by: Ying Han \u003cyinghan@google.com\u003e\nCc: KOSAKI Motohiro \u003ckosaki.motohiro@jp.fujitsu.com\u003e\nCc: Minchan Kim \u003cminchan.kim@gmail.com\u003e\nAcked-by: Pavel Emelyanov \u003cxemul@openvz.org\u003e\nCc: KAMEZAWA Hiroyuki \u003ckamezawa.hiroyu@jp.fujitsu.com\u003e\nCc: Mel Gorman \u003cmel@csn.ul.ie\u003e\nAcked-by: Rik van Riel \u003criel@redhat.com\u003e\nCc: Johannes Weiner \u003channes@cmpxchg.org\u003e\nCc: Hugh Dickins \u003chughd@google.com\u003e\nCc: Dave Hansen \u003cdave@linux.vnet.ibm.com\u003e\nCc: Steven Whitehouse \u003cswhiteho@redhat.com\u003e\nSigned-off-by: Andrew Morton \u003cakpm@linux-foundation.org\u003e\nSigned-off-by: Linus Torvalds \u003ctorvalds@linux-foundation.org\u003e\n"
    },
    {
      "commit": "a09ed5e00084448453c8bada4dcd31e5fbfc2f21",
      "tree": "493f5f2a93efb080cdcc28e793cbcfc7999e66eb",
      "parents": [
        "7b1de5868b124d8f399d8791ed30a9b679d64d4d"
      ],
      "author": {
        "name": "Ying Han",
        "email": "yinghan@google.com",
        "time": "Tue May 24 17:12:26 2011 -0700"
      },
      "committer": {
        "name": "Linus Torvalds",
        "email": "torvalds@linux-foundation.org",
        "time": "Wed May 25 08:39:25 2011 -0700"
      },
      "message": "vmscan: change shrink_slab() interfaces by passing shrink_control\n\nConsolidate the existing parameters to shrink_slab() into a new\nshrink_control struct.  This is needed later to pass the same struct to\nshrinkers.\n\nSigned-off-by: Ying Han \u003cyinghan@google.com\u003e\nCc: KOSAKI Motohiro \u003ckosaki.motohiro@jp.fujitsu.com\u003e\nCc: Minchan Kim \u003cminchan.kim@gmail.com\u003e\nAcked-by: Pavel Emelyanov \u003cxemul@openvz.org\u003e\nCc: KAMEZAWA Hiroyuki \u003ckamezawa.hiroyu@jp.fujitsu.com\u003e\nCc: Mel Gorman \u003cmel@csn.ul.ie\u003e\nAcked-by: Rik van Riel \u003criel@redhat.com\u003e\nCc: Johannes Weiner \u003channes@cmpxchg.org\u003e\nCc: Hugh Dickins \u003chughd@google.com\u003e\nCc: Dave Hansen \u003cdave@linux.vnet.ibm.com\u003e\nSigned-off-by: Andrew Morton \u003cakpm@linux-foundation.org\u003e\nSigned-off-by: Linus Torvalds \u003ctorvalds@linux-foundation.org\u003e\n"
    },
    {
      "commit": "0c917313a8d84fcc0c376db3f7edb7c06f06f920",
      "tree": "de95ed4a300d1034b1abae2eb8c7e509c9dfb341",
      "parents": [
        "bd486285f24ac2fd1ff64688fb0729712c5712c4"
      ],
      "author": {
        "name": "Konstantin Khlebnikov",
        "email": "khlebnikov@openvz.org",
        "time": "Tue May 24 17:12:21 2011 -0700"
      },
      "committer": {
        "name": "Linus Torvalds",
        "email": "torvalds@linux-foundation.org",
        "time": "Wed May 25 08:39:23 2011 -0700"
      },
      "message": "mm: strictly require elevated page refcount in isolate_lru_page()\n\nisolate_lru_page() must be called only with stable reference to the page,\nthis is what is written in the comment above it, this is reasonable.\n\ncurrent isolate_lru_page() users and its page extra reference sources:\n\n mm/huge_memory.c:\n  __collapse_huge_page_isolate()\t- reference from pte\n\n mm/memcontrol.c:\n  mem_cgroup_move_parent()\t\t- get_page_unless_zero()\n  mem_cgroup_move_charge_pte_range()\t- reference from pte\n\n mm/memory-failure.c:\n  soft_offline_page()\t\t\t- fixed, reference from get_any_page()\n  delete_from_lru_cache() - reference from caller or get_page_unless_zero()\n\t[ seems like there bug, because __memory_failure() can call\n\t  page_action() for hpages tail, but it is ok for\n\t  isolate_lru_page(), tail getted and not in lru]\n\n mm/memory_hotplug.c:\n  do_migrate_range()\t\t\t- fixed, get_page_unless_zero()\n\n mm/mempolicy.c:\n  migrate_page_add()\t\t\t- reference from pte\n\n mm/migrate.c:\n  do_move_page_to_node_array()\t\t- reference from follow_page()\n\n mlock.c:\t\t\t\t- various external references\n\n mm/vmscan.c:\n  putback_lru_page()\t\t\t- reference from isolate_lru_page()\n\nIt seems that all isolate_lru_page() users are ready now for this\nrestriction.  So, let\u0027s replace redundant get_page_unless_zero() with\nget_page() and add page initial reference count check with VM_BUG_ON()\n\nSigned-off-by: Konstantin Khlebnikov \u003ckhlebnikov@openvz.org\u003e\nCc: Andi Kleen \u003candi@firstfloor.org\u003e\nCc: KAMEZAWA Hiroyuki \u003ckamezawa.hiroyu@jp.fujitsu.com\u003e\nCc: KOSAKI Motohiro \u003ckosaki.motohiro@jp.fujitsu.com\u003e\nCc: Mel Gorman \u003cmel@csn.ul.ie\u003e\nCc: Lee Schermerhorn \u003clee.schermerhorn@hp.com\u003e\nCc: Rik van Riel \u003criel@redhat.com\u003e\nSigned-off-by: Andrew Morton \u003cakpm@linux-foundation.org\u003e\nSigned-off-by: Linus Torvalds \u003ctorvalds@linux-foundation.org\u003e\n"
    },
    {
      "commit": "f06590bd718ed950c98828e30ef93204028f3210",
      "tree": "60d1c52a538618a16ebcd82a4d949446fd2036c7",
      "parents": [
        "afc7e326a3f5bafc41324d7926c324414e343ee5"
      ],
      "author": {
        "name": "Minchan Kim",
        "email": "minchan.kim@gmail.com",
        "time": "Tue May 24 17:11:11 2011 -0700"
      },
      "committer": {
        "name": "Linus Torvalds",
        "email": "torvalds@linux-foundation.org",
        "time": "Wed May 25 08:39:01 2011 -0700"
      },
      "message": "mm: vmscan: correctly check if reclaimer should schedule during shrink_slab\n\nIt has been reported on some laptops that kswapd is consuming large\namounts of CPU and not being scheduled when SLUB is enabled during large\namounts of file copying.  It is expected that this is due to kswapd\nmissing every cond_resched() point because:\n\nshrink_page_list() calls cond_resched() if inactive pages were isolated\n        which in turn may not happen if all_unreclaimable is set in\n        shrink_zones(). If, for whatever reason, all_unreclaimable is\n        set on all zones, we can miss calling cond_resched().\n\nbalance_pgdat() only calls cond_resched() if the zones are not\n        balanced. For a high-order allocation that is balanced, it\n        checks order-0 again. During that window, order-0 might have\n        become unbalanced so it loops again for order-0 and returns\n        that it was reclaiming for order-0 to kswapd(). It can then\n        find that a caller has rewoken kswapd for a high-order allocation\n        and re-enter balance_pgdat() without ever calling cond_resched().\n\nshrink_slab() only calls cond_resched() if we are reclaiming slab\n\tpages. If there are a large number of direct reclaimers, the\n\tshrinker_rwsem can be contended and prevent kswapd calling\n\tcond_resched().\n\nThis patch modifies the shrink_slab() case.  If the semaphore is\ncontended, the caller will still check cond_resched().  After each\nsuccessful call into a shrinker, the check for cond_resched() remains in\ncase one shrinker is particularly slow.\n\n[mgorman@suse.de: preserve call to cond_resched after each call into shrinker]\nSigned-off-by: Mel Gorman \u003cmgorman@suse.de\u003e\nSigned-off-by: Minchan Kim \u003cminchan.kim@gmail.com\u003e\nCc: Rik van Riel \u003criel@redhat.com\u003e\nCc: Johannes Weiner \u003channes@cmpxchg.org\u003e\nCc: Wu Fengguang \u003cfengguang.wu@intel.com\u003e\nCc: James Bottomley \u003cJames.Bottomley@HansenPartnership.com\u003e\nTested-by: Colin King \u003ccolin.king@canonical.com\u003e\nCc: Raghavendra D Prabhu \u003craghu.prabhu13@gmail.com\u003e\nCc: Jan Kara \u003cjack@suse.cz\u003e\nCc: Chris Mason \u003cchris.mason@oracle.com\u003e\nCc: Christoph Lameter \u003ccl@linux.com\u003e\nCc: Pekka Enberg \u003cpenberg@kernel.org\u003e\nCc: Rik van Riel \u003criel@redhat.com\u003e\nCc: \u003cstable@kernel.org\u003e\t\t[2.6.38+]\nSigned-off-by: Andrew Morton \u003cakpm@linux-foundation.org\u003e\nSigned-off-by: Linus Torvalds \u003ctorvalds@linux-foundation.org\u003e\n"
    },
    {
      "commit": "afc7e326a3f5bafc41324d7926c324414e343ee5",
      "tree": "c7b06a424b7a35840a41dca008291ca5dcad9b42",
      "parents": [
        "a71ae47a2cbfa542c69f695809124da4e4dd9e8f"
      ],
      "author": {
        "name": "Johannes Weiner",
        "email": "hannes@cmpxchg.org",
        "time": "Tue May 24 17:11:09 2011 -0700"
      },
      "committer": {
        "name": "Linus Torvalds",
        "email": "torvalds@linux-foundation.org",
        "time": "Wed May 25 08:39:01 2011 -0700"
      },
      "message": "mm: vmscan: correct use of pgdat_balanced in sleeping_prematurely\n\nThere are a few reports of people experiencing hangs when copying large\namounts of data with kswapd using a large amount of CPU which appear to be\ndue to recent reclaim changes.  SLUB using high orders is the trigger but\nnot the root cause as SLUB has been using high orders for a while.  The\nroot cause was bugs introduced into reclaim which are addressed by the\nfollowing two patches.\n\nPatch 1 corrects logic introduced by commit 1741c877 (\"mm: kswapd:\n        keep kswapd awake for high-order allocations until a percentage of\n        the node is balanced\") to allow kswapd to go to sleep when\n        balanced for high orders.\n\nPatch 2 notes that it is possible for kswapd to miss every\n        cond_resched() and updates shrink_slab() so it\u0027ll at least reach\n        that scheduling point.\n\nChris Wood reports that these two patches in isolation are sufficient to\nprevent the system hanging.  AFAIK, they should also resolve similar hangs\nexperienced by James Bottomley.\n\nThis patch:\n\nJohannes Weiner pointed out that the logic in commit 1741c877 (\"mm: kswapd:\nkeep kswapd awake for high-order allocations until a percentage of the\nnode is balanced\") is backwards.  Instead of allowing kswapd to go to\nsleep when balancing for high order allocations, it keeps kswapd\nrunning uselessly.\n\nSigned-off-by: Mel Gorman \u003cmgorman@suse.de\u003e\nReviewed-by: Rik van Riel \u003criel@redhat.com\u003e\nSigned-off-by: Johannes Weiner \u003channes@cmpxchg.org\u003e\nReviewed-by: Wu Fengguang \u003cfengguang.wu@intel.com\u003e\nCc: James Bottomley \u003cJames.Bottomley@HansenPartnership.com\u003e\nTested-by: Colin King \u003ccolin.king@canonical.com\u003e\nCc: Raghavendra D Prabhu \u003craghu.prabhu13@gmail.com\u003e\nCc: Jan Kara \u003cjack@suse.cz\u003e\nCc: Chris Mason \u003cchris.mason@oracle.com\u003e\nCc: Christoph Lameter \u003ccl@linux.com\u003e\nCc: Pekka Enberg \u003cpenberg@kernel.org\u003e\nCc: Rik van Riel \u003criel@redhat.com\u003e\nReviewed-by: Minchan Kim \u003cminchan.kim@gmail.com\u003e\nReviewed-by: Wu Fengguang \u003cfengguang.wu@intel.com\u003e\nCc: \u003cstable@kernel.org\u003e\t\t[2.6.38+]\nSigned-off-by: Andrew Morton \u003cakpm@linux-foundation.org\u003e\nSigned-off-by: Linus Torvalds \u003ctorvalds@linux-foundation.org\u003e\n"
    },
    {
      "commit": "268bb0ce3e87872cb9290c322b0d35bce230d88f",
      "tree": "c8331ade4a3e24fc589c4eb62731bc2312d35333",
      "parents": [
        "257313b2a87795e07a0bdf58d0fffbdba8b31051"
      ],
      "author": {
        "name": "Linus Torvalds",
        "email": "torvalds@linux-foundation.org",
        "time": "Fri May 20 12:50:29 2011 -0700"
      },
      "committer": {
        "name": "Linus Torvalds",
        "email": "torvalds@linux-foundation.org",
        "time": "Fri May 20 12:50:29 2011 -0700"
      },
      "message": "sanitize \u003clinux/prefetch.h\u003e usage\n\nCommit e66eed651fd1 (\"list: remove prefetching from regular list\niterators\") removed the include of prefetch.h from list.h, which\nuncovered several cases that had apparently relied on that rather\nobscure header file dependency.\n\nSo this fixes things up a bit, using\n\n   grep -L linux/prefetch.h $(git grep -l \u0027[^a-z_]prefetchw*(\u0027 -- \u0027*.[ch]\u0027)\n   grep -L \u0027prefetchw*(\u0027 $(git grep -l \u0027linux/prefetch.h\u0027 -- \u0027*.[ch]\u0027)\n\nto guide us in finding files that either need \u003clinux/prefetch.h\u003e\ninclusion, or have it despite not needing it.\n\nThere are more of them around (mostly network drivers), but this gets\nmany core ones.\n\nReported-by: Stephen Rothwell \u003csfr@canb.auug.org.au\u003e\nSigned-off-by: Linus Torvalds \u003ctorvalds@linux-foundation.org\u003e\n"
    },
    {
      "commit": "d6c438b6cd733834a3cec55af8577a8fc3548016",
      "tree": "95c7719415aecf9f2a09b1e885d23ba54eb9b5b8",
      "parents": [
        "d5f33d45e4c0e306e8d16b4573891a65d9ad544f"
      ],
      "author": {
        "name": "KAMEZAWA Hiroyuki",
        "email": "kamezawa.hiroyu@jp.fujitsu.com",
        "time": "Tue May 17 15:44:10 2011 -0700"
      },
      "committer": {
        "name": "Linus Torvalds",
        "email": "torvalds@linux-foundation.org",
        "time": "Wed May 18 02:55:23 2011 -0700"
      },
      "message": "memcg: fix zone congestion\n\nZONE_CONGESTED should be a state of global memory reclaim.  If not, a busy\nmemcg sets this and causes unnecessary throttling in wait_iff_congested()\nfor memory reclaim in other contexts.  This hurts system performance.\n\nI\u0027ll think about whether a \"memcg is congested!\" flag is required later.\nBut this fix is required first.\n\nSigned-off-by: KAMEZAWA Hiroyuki \u003ckamezawa.hiroyu@jp.fujitsu.com\u003e\nReviewed-by: Minchan Kim \u003cminchan.kim@gmail.com\u003e\nCc: Daisuke Nishimura \u003cnishimura@mxp.nes.nec.co.jp\u003e\nAcked-by: Ying Han \u003cyinghan@google.com\u003e\nCc: Balbir Singh \u003cbalbir@in.ibm.com\u003e\nCc: Johannes Weiner \u003cjweiner@redhat.com\u003e\nCc: Michal Hocko \u003cmhocko@suse.cz\u003e\nSigned-off-by: Andrew Morton \u003cakpm@linux-foundation.org\u003e\nSigned-off-by: Linus Torvalds \u003ctorvalds@linux-foundation.org\u003e\n"
    },
    {
      "commit": "929bea7c714220fc76ce3f75bef9056477c28e74",
      "tree": "d41b4592b658173e00c7b8bad2bce048f02e0ead",
      "parents": [
        "fe936dfc23fed3475b11067e8d9b70553eafcd9e"
      ],
      "author": {
        "name": "KOSAKI Motohiro",
        "email": "kosaki.motohiro@jp.fujitsu.com",
        "time": "Thu Apr 14 15:22:12 2011 -0700"
      },
      "committer": {
        "name": "Linus Torvalds",
        "email": "torvalds@linux-foundation.org",
        "time": "Thu Apr 14 16:06:56 2011 -0700"
      },
      "message": "vmscan: all_unreclaimable() use zone-\u003eall_unreclaimable as a name\n\nThe all_unreclaimable check in direct reclaim was introduced in 2.6.19\nby the following commit.\n\n\t2006 Sep 25; commit 408d8544; oom: use unreclaimable info\n\nIt then went through a strange history.  First, the following commit broke\nthe logic unintentionally.\n\n\t2008 Apr 29; commit a41f24ea; page allocator: smarter retry of\n\t\t\t\t      costly-order allocations\n\nTwo years later, I found an obviously meaningless code fragment and\nrestored the original intention with the following commit.\n\n\t2010 Jun 04; commit bb21c7ce; vmscan: fix do_try_to_free_pages()\n\t\t\t\t      return value when priority\u003d\u003d0\n\nBut the logic didn\u0027t work when a 32-bit highmem system went into hibernation,\nso Minchan slightly changed the algorithm and fixed it.\n\n\t2010 Sep 22; commit d1908362; vmscan: check all_unreclaimable\n\t\t\t\t      in direct reclaim path\n\nBut recently Andrey Vagin found a new corner case.  Look,\n\n\tstruct zone {\n\t  ..\n\t        int                     all_unreclaimable;\n\t  ..\n\t        unsigned long           pages_scanned;\n\t  ..\n\t}\n\nzone-\u003eall_unreclaimable and zone-\u003epages_scanned are neither atomic\nvariables nor protected by a lock.  Therefore a zone can reach a state of\nzone-\u003epages_scanned\u003d0 and zone-\u003eall_unreclaimable\u003d1.  In this case, the current\nall_unreclaimable() returns false even though zone-\u003eall_unreclaimable\u003d1.\n\nThis resulted in the kernel hanging up when executing a loop of the form\n\n1. fork\n2. mmap\n3. touch memory\n4. read memory\n5. munmap\n\nas described in\nhttp://www.gossamer-threads.com/lists/linux/kernel/1348725#1348725\n\nIs this an ignorable minor issue?  No.  Unfortunately, x86 has a very small DMA\nzone and it easily becomes zone-\u003eall_unreclaimable\u003d1.  And once it becomes\nall_unreclaimable\u003d1, it never returns to all_unreclaimable\u003d0.  Why?  If\nall_unreclaimable\u003d1, vmscan only tries DEF_PRIORITY reclaim and\na-few-lru-pages\u003e\u003eDEF_PRIORITY always evaluates to 0.  That means no pages are\nscanned at all!\n\nEventually, the oom-killer never works on such systems.  That said, we can\u0027t\nuse zone-\u003epages_scanned for this purpose.  This patch restores\nall_unreclaimable() to use zone-\u003eall_unreclaimable as before and, in addition,\nadds an oom_killer_disabled check to avoid reintroducing the issue of commit\nd1908362 (\"vmscan: check all_unreclaimable in direct reclaim path\").\n\nReported-by: Andrey Vagin \u003cavagin@openvz.org\u003e\nSigned-off-by: KOSAKI Motohiro \u003ckosaki.motohiro@jp.fujitsu.com\u003e\nCc: Nick Piggin \u003cnpiggin@kernel.dk\u003e\nReviewed-by: Minchan Kim \u003cminchan.kim@gmail.com\u003e\nReviewed-by: KAMEZAWA Hiroyuki \u003ckamezawa.hiroyu@jp.fujitsu.com\u003e\nAcked-by: David Rientjes \u003crientjes@google.com\u003e\nCc: \u003cstable@kernel.org\u003e\nSigned-off-by: Andrew Morton \u003cakpm@linux-foundation.org\u003e\nSigned-off-by: Linus Torvalds \u003ctorvalds@linux-foundation.org\u003e\n"
    },
    {
      "commit": "25985edcedea6396277003854657b5f3cb31a628",
      "tree": "f026e810210a2ee7290caeb737c23cb6472b7c38",
      "parents": [
        "6aba74f2791287ec407e0f92487a725a25908067"
      ],
      "author": {
        "name": "Lucas De Marchi",
        "email": "lucas.demarchi@profusion.mobi",
        "time": "Wed Mar 30 22:57:33 2011 -0300"
      },
      "committer": {
        "name": "Lucas De Marchi",
        "email": "lucas.demarchi@profusion.mobi",
        "time": "Thu Mar 31 11:26:23 2011 -0300"
      },
      "message": "Fix common misspellings\n\nFixes generated by \u0027codespell\u0027 and manually reviewed.\n\nSigned-off-by: Lucas De Marchi \u003clucas.demarchi@profusion.mobi\u003e\n"
    },
    {
      "commit": "6c5103890057b1bb781b26b7aae38d33e4c517d8",
      "tree": "e6e57961dcddcb5841acb34956e70b9dc696a880",
      "parents": [
        "3dab04e6978e358ad2307bca563fabd6c5d2c58b",
        "9d2e157d970a73b3f270b631828e03eb452d525e"
      ],
      "author": {
        "name": "Linus Torvalds",
        "email": "torvalds@linux-foundation.org",
        "time": "Thu Mar 24 10:16:26 2011 -0700"
      },
      "committer": {
        "name": "Linus Torvalds",
        "email": "torvalds@linux-foundation.org",
        "time": "Thu Mar 24 10:16:26 2011 -0700"
      },
      "message": "Merge branch \u0027for-2.6.39/core\u0027 of git://git.kernel.dk/linux-2.6-block\n\n* \u0027for-2.6.39/core\u0027 of git://git.kernel.dk/linux-2.6-block: (65 commits)\n  Documentation/iostats.txt: bit-size reference etc.\n  cfq-iosched: removing unnecessary think time checking\n  cfq-iosched: Don\u0027t clear queue stats when preempt.\n  blk-throttle: Reset group slice when limits are changed\n  blk-cgroup: Only give unaccounted_time under debug\n  cfq-iosched: Don\u0027t set active queue in preempt\n  block: fix non-atomic access to genhd inflight structures\n  block: attempt to merge with existing requests on plug flush\n  block: NULL dereference on error path in __blkdev_get()\n  cfq-iosched: Don\u0027t update group weights when on service tree\n  fs: assign sb-\u003es_bdi to default_backing_dev_info if the bdi is going away\n  block: Require subsystems to explicitly allocate bio_set integrity mempool\n  jbd2: finish conversion from WRITE_SYNC_PLUG to WRITE_SYNC and explicit plugging\n  jbd: finish conversion from WRITE_SYNC_PLUG to WRITE_SYNC and explicit plugging\n  fs: make fsync_buffers_list() plug\n  mm: make generic_writepages() use plugging\n  blk-cgroup: Add unaccounted time to timeslice_used.\n  block: fixup plugging stubs for !CONFIG_BLOCK\n  block: remove obsolete comments for blkdev_issue_zeroout.\n  blktrace: Use rq-\u003ecmd_flags directly in blk_add_trace_rq.\n  ...\n\nFix up conflicts in fs/{aio.c,super.c}\n"
    },
    {
      "commit": "8afdcece4911e51cfff2b50a269418914cab8a3f",
      "tree": "fcfb966822f0f6c128c754f3876a80106c9cc654",
      "parents": [
        "7571966189e54adf0a8bc1384d6f13f44052ba63"
      ],
      "author": {
        "name": "Mel Gorman",
        "email": "mel@csn.ul.ie",
        "time": "Tue Mar 22 16:33:04 2011 -0700"
      },
      "committer": {
        "name": "Linus Torvalds",
        "email": "torvalds@linux-foundation.org",
        "time": "Tue Mar 22 17:44:04 2011 -0700"
      },
      "message": "mm: vmscan: kswapd should not free an excessive number of pages when balancing small zones\n\nWhen reclaiming for order-0 pages, kswapd requires that all zones be\nbalanced.  Each cycle through balance_pgdat() does background ageing on\nall zones if necessary and applies equal pressure on the inactive zone\nunless a lot of pages are free already.\n\nA \"lot of free pages\" is defined as a \"balance gap\" above the high\nwatermark which is currently 7*high_watermark.  Historically this was\nreasonable as min_free_kbytes was small.  However, on systems using huge\npages, it is recommended that min_free_kbytes is higher and it is tuned\nwith hugeadm --set-recommended-min_free_kbytes.  With the introduction of\ntransparent huge page support, this recommended value is also applied.  On\nX86-64 with 4G of memory, min_free_kbytes becomes 67584 so one would\nexpect around 68M of memory to be free.  The Normal zone is approximately\n35000 pages so under even normal memory pressure such as copying a large\nfile, it gets exhausted quickly.  As it is getting exhausted, kswapd\napplies pressure equally to all zones, including the DMA32 zone.  DMA32 is\napproximately 700,000 pages with a high watermark of around 23,000 pages.\nIn this situation, kswapd will reclaim around (23000*8 where 8 is the high\nwatermark + balance gap of 7 * high watermark) pages or 718M of pages\nbefore the zone is ignored.  What the user sees is free memory far\nhigher than it should be.\n\nTo avoid an excessive number of pages being reclaimed from the larger\nzones, explicitly define the \"balance gap\" to be either 1% of the zone\nor the low watermark for the zone, whichever is smaller.  While kswapd\nwill check all zones to apply pressure, it\u0027ll ignore zones that meet the\n(high_wmark + balance_gap) watermark.\n\nTo test this, 80G were copied from a partition and the amount of memory\nbeing used was recorded.  A comparison of a patched and an unpatched kernel can\nbe seen at\nhttp://www.csn.ul.ie/~mel/postings/minfree-20110222/memory-usage-hydra.ps\nand shows that kswapd is not reclaiming as much memory with the patch\napplied.\n\nSigned-off-by: Andrea Arcangeli \u003caarcange@redhat.com\u003e\nSigned-off-by: Mel Gorman \u003cmel@csn.ul.ie\u003e\nAcked-by: Rik van Riel \u003criel@redhat.com\u003e\nCc: Shaohua Li \u003cshaohua.li@intel.com\u003e\nCc: \"Chen, Tim C\" \u003ctim.c.chen@intel.com\u003e\nSigned-off-by: Andrew Morton \u003cakpm@linux-foundation.org\u003e\nSigned-off-by: Linus Torvalds \u003ctorvalds@linux-foundation.org\u003e\n"
    },
    {
      "commit": "e64a782fec684c29a8204c51b3cb554dce588592",
      "tree": "5ff0beb21b973f1ad0edc1e31b6a1c2ee4406bdc",
      "parents": [
        "702cfbf93aaf3a091b0c64c8766c1ade0a820c38"
      ],
      "author": {
        "name": "Minchan Kim",
        "email": "minchan.kim@gmail.com",
        "time": "Tue Mar 22 16:32:44 2011 -0700"
      },
      "committer": {
        "name": "Linus Torvalds",
        "email": "torvalds@linux-foundation.org",
        "time": "Tue Mar 22 17:44:02 2011 -0700"
      },
      "message": "mm: change __remove_from_page_cache()\n\nNow that we have renamed remove_from_page_cache to delete_from_page_cache,\nrename the internal page cache handling function too, for consistency with\n__remove_from_swap_cache and remove_from_swap_cache.\n\nSigned-off-by: Minchan Kim \u003cminchan.kim@gmail.com\u003e\nCc: Christoph Hellwig \u003chch@infradead.org\u003e\nAcked-by: Hugh Dickins \u003chughd@google.com\u003e\nAcked-by: Mel Gorman \u003cmel@csn.ul.ie\u003e\nReviewed-by: KAMEZAWA Hiroyuki \u003ckamezawa.hiroyu@jp.fujitsu.com\u003e\nReviewed-by: Johannes Weiner \u003channes@cmpxchg.org\u003e\nReviewed-by: KOSAKI Motohiro \u003ckosaki.motohiro@jp.fujitsu.com\u003e\nSigned-off-by: Andrew Morton \u003cakpm@linux-foundation.org\u003e\nSigned-off-by: Linus Torvalds \u003ctorvalds@linux-foundation.org\u003e\n"
    },
    {
      "commit": "d527caf22e48480b102c7c6ee5b9ba12170148f7",
      "tree": "7d53a2c430f8c020b6fa8390396dd2d1ce480b9a",
      "parents": [
        "89699605fe7cfd8611900346f61cb6cbf179b10a"
      ],
      "author": {
        "name": "Andrea Arcangeli",
        "email": "aarcange@redhat.com",
        "time": "Tue Mar 22 16:30:38 2011 -0700"
      },
      "committer": {
        "name": "Linus Torvalds",
        "email": "torvalds@linux-foundation.org",
        "time": "Tue Mar 22 17:44:00 2011 -0700"
      },
      "message": "mm: compaction: prevent kswapd compacting memory to reduce CPU usage\n\nThis patch reverts 5a03b051 (\"thp: use compaction in kswapd for GFP_ATOMIC\norder \u003e 0\") due to reports stating that kswapd CPU usage was higher and\nIRQs were being disabled more frequently.  This was reported at\nhttp://www.spinics.net/linux/fedora/alsa-user/msg09885.html.\n\nWithout this patch applied, CPU usage by kswapd hovers around the 20% mark\naccording to the tester (Arthur Marsh:\nhttp://www.spinics.net/linux/fedora/alsa-user/msg09899.html).  With this\npatch applied, it\u0027s around 2%.\n\nThe problem is not related to THP which specifies __GFP_NO_KSWAPD but is\ntriggered by high-order allocations hitting the low watermark for their\norder and waking kswapd on kernels with CONFIG_COMPACTION set.  The most\ncommon trigger for this is network cards configured for jumbo frames but\nit\u0027s also possible it\u0027ll be triggered by fork-heavy workloads (order-1)\nand some wireless cards which depend on order-1 allocations.\n\nThe symptoms for the user will be high CPU usage by kswapd in low-memory\nsituations which could be confused with another writeback problem.  While\na patch like 5a03b051 may be reintroduced in the future, this patch plays\nit safe for now and reverts it.\n\n[mel@csn.ul.ie: Beefed up the changelog]\nSigned-off-by: Andrea Arcangeli \u003caarcange@redhat.com\u003e\nSigned-off-by: Mel Gorman \u003cmel@csn.ul.ie\u003e\nReported-by: Arthur Marsh \u003carthur.marsh@internode.on.net\u003e\nTested-by: Arthur Marsh \u003carthur.marsh@internode.on.net\u003e\nCc: \u003cstable@kernel.org\u003e\t\t[2.6.38.1]\nSigned-off-by: Andrew Morton \u003cakpm@linux-foundation.org\u003e\nSigned-off-by: Linus Torvalds \u003ctorvalds@linux-foundation.org\u003e\n"
    },
    {
      "commit": "4c63f5646e405b5010cc9499419060bf2e838f5b",
      "tree": "df91ba315032c8ec4aafeb3ab96fdfa7c6c656e1",
      "parents": [
        "cafb0bfca1a73efd6d8a4a6a6a716e6134b96c24",
        "69d60eb96ae8a73cf9b79cf28051caf973006011"
      ],
      "author": {
        "name": "Jens Axboe",
        "email": "jaxboe@fusionio.com",
        "time": "Thu Mar 10 08:58:35 2011 +0100"
      },
      "committer": {
        "name": "Jens Axboe",
        "email": "jaxboe@fusionio.com",
        "time": "Thu Mar 10 08:58:35 2011 +0100"
      },
      "message": "Merge branch \u0027for-2.6.39/stack-plug\u0027 into for-2.6.39/core\n\nConflicts:\n\tblock/blk-core.c\n\tblock/blk-flush.c\n\tdrivers/md/raid1.c\n\tdrivers/md/raid10.c\n\tdrivers/md/raid5.c\n\tfs/nilfs2/btnode.c\n\tfs/nilfs2/mdt.c\n\nSigned-off-by: Jens Axboe \u003cjaxboe@fusionio.com\u003e\n"
    },
    {
      "commit": "7eaceaccab5f40bbfda044629a6298616aeaed50",
      "tree": "33954d12f63e25a47eb6d86ef3d3d0a5e62bf752",
      "parents": [
        "73c101011926c5832e6e141682180c4debe2cf45"
      ],
      "author": {
        "name": "Jens Axboe",
        "email": "jaxboe@fusionio.com",
        "time": "Thu Mar 10 08:52:07 2011 +0100"
      },
      "committer": {
        "name": "Jens Axboe",
        "email": "jaxboe@fusionio.com",
        "time": "Thu Mar 10 08:52:07 2011 +0100"
      },
      "message": "block: remove per-queue plugging\n\nCode has been converted over to the new explicit on-stack plugging,\nand delay users have been converted to use the new API for that.\nSo let\u0027s kill off the old plugging along with aops-\u003esync_page().\n\nSigned-off-by: Jens Axboe \u003cjaxboe@fusionio.com\u003e\n"
    },
    {
      "commit": "2876592f231d436c295b67726313f6f3cfb6e243",
      "tree": "e53c6db2aed6e672481c31083287d79f32ad45f4",
      "parents": [
        "ac3c8304190ed0daaa2fb01ce2a069be5e2a52a7"
      ],
      "author": {
        "name": "Mel Gorman",
        "email": "mel@csn.ul.ie",
        "time": "Fri Feb 25 14:44:20 2011 -0800"
      },
      "committer": {
        "name": "Linus Torvalds",
        "email": "torvalds@linux-foundation.org",
        "time": "Fri Feb 25 15:07:36 2011 -0800"
      },
      "message": "mm: vmscan: stop reclaim/compaction earlier due to insufficient progress if !__GFP_REPEAT\n\nshould_continue_reclaim() for reclaim/compaction allows scanning to\ncontinue even if pages are not being reclaimed until the full list is\nscanned.  In terms of allocation success, this makes sense but potentially\nit introduces unwanted latency for high-order allocations such as\ntransparent hugepages and network jumbo frames that would prefer to fail\nthe allocation attempt and fallback to order-0 pages.  Worse, there is a\npotential that the full LRU scan will clear all the young bits, distort\npage aging information and potentially push pages into swap that would\nhave otherwise remained resident.\n\nThis patch will stop reclaim/compaction if no pages were reclaimed in the\nlast SWAP_CLUSTER_MAX pages that were considered.  For allocations such as\nhugetlbfs that use __GFP_REPEAT and have fewer fallback options, the full\nLRU list may still be scanned.\n\nOrder-0 allocation should not be affected because RECLAIM_MODE_COMPACTION\nis not set so the following avoids the gfp_mask being examined:\n\n        if (!(sc-\u003ereclaim_mode \u0026 RECLAIM_MODE_COMPACTION))\n                return false;\n\nA tool was developed based on ftrace that tracked the latency of\nhigh-order allocations while transparent hugepage support was enabled and\nthree benchmarks were run.  
The \"fix-infinite\" figures are 2.6.38-rc4 with\nJohannes\u0027s patch \"vmscan: fix zone shrinking exit when scan work is done\"\napplied.\n\n  STREAM Highorder Allocation Latency Statistics\n                 fix-infinite     break-early\n  1 :: Count            10298           10229\n  1 :: Min             0.4560          0.4640\n  1 :: Mean            1.0589          1.0183\n  1 :: Max            14.5990         11.7510\n  1 :: Stddev          0.5208          0.4719\n  2 :: Count                2               1\n  2 :: Min             1.8610          3.7240\n  2 :: Mean            3.4325          3.7240\n  2 :: Max             5.0040          3.7240\n  2 :: Stddev          1.5715          0.0000\n  9 :: Count           111696          111694\n  9 :: Min             0.5230          0.4110\n  9 :: Mean           10.5831         10.5718\n  9 :: Max            38.4480         43.2900\n  9 :: Stddev          1.1147          1.1325\n\nMean time for order-1 allocations is reduced.  order-2 looks increased but\nwith so few allocations, it\u0027s not particularly significant.  THP mean\nallocation latency is also reduced.  
That said, allocation time varies so\nsignificantly that the reductions are within noise.\n\nMax allocation time is reduced by a significant amount for low-order\nallocations but increased for THP allocations which presumably are now\nbreaking before reclaim has done enough work.\n\n  SysBench Highorder Allocation Latency Statistics\n                 fix-infinite     break-early\n  1 :: Count            15745           15677\n  1 :: Min             0.4250          0.4550\n  1 :: Mean            1.1023          1.0810\n  1 :: Max            14.4590         10.8220\n  1 :: Stddev          0.5117          0.5100\n  2 :: Count                1               1\n  2 :: Min             3.0040          2.1530\n  2 :: Mean            3.0040          2.1530\n  2 :: Max             3.0040          2.1530\n  2 :: Stddev          0.0000          0.0000\n  9 :: Count             2017            1931\n  9 :: Min             0.4980          0.7480\n  9 :: Mean           10.4717         10.3840\n  9 :: Max            24.9460         26.2500\n  9 :: Stddev          1.1726          1.1966\n\nAgain, mean time for order-1 allocations is reduced while order-2\nallocations are too few to draw conclusions from.  The mean time for THP\nallocations is also slightly reduced albeit the reductions are within\nvariance.\n\nOnce again, our maximum allocation time is significantly reduced for\nlow-order allocations and slightly increased for THP allocations.\n\n  Anon stream mmap reference Highorder Allocation Latency Statistics\n  1 :: Count             1376            1790\n  1 :: Min             0.4940          0.5010\n  1 :: Mean            1.0289          0.9732\n  1 :: Max             6.2670          4.2540\n  1 :: Stddev          0.4142          0.2785\n  2 :: Count                1               -\n  2 :: Min             1.9060               -\n  2 :: Mean            1.9060               -\n  2 :: Max             1.9060               -\n  2 :: Stddev          0.0000               -\n  9 :: Count            11266           11257\n  9 :: Min             0.4990          0.4940\n  9 :: Mean        27250.4669      24256.1919\n  9 :: Max      11439211.0000    6008885.0000\n  9 :: Stddev     226427.4624     186298.1430\n\nThis benchmark creates one thread per CPU which references an amount of\nanonymous memory 1.5 times the size of physical RAM.  This pounds swap\nquite heavily and is intended to exercise THP a bit.\n\nMean allocation time for order-1 is reduced as before.  It\u0027s also reduced\nfor THP allocations but the variations here are pretty massive due to\nswap.  As before, maximum allocation times are significantly reduced.\n\nOverall, the patch reduces the mean and maximum allocation latencies for\nthe smaller high-order allocations.  This was with Slab configured so it\nwould be expected to be more significant with Slub which uses these size\nallocations more aggressively.\n\nThe mean allocation times for THP allocations are also slightly reduced.\nThe maximum latency was slightly increased as predicted by the comments\ndue to reclaim/compaction breaking early.  
However, workloads care more\nabout the latency of lower-order allocations than THP so it\u0027s an\nacceptable trade-off.\n\nSigned-off-by: Mel Gorman \u003cmel@csn.ul.ie\u003e\nAcked-by: Andrea Arcangeli \u003caarcange@redhat.com\u003e\nAcked-by: Johannes Weiner \u003channes@cmpxchg.org\u003e\nReviewed-by: Minchan Kim \u003cminchan.kim@gmail.com\u003e\nAcked-by: Andrea Arcangeli \u003caarcange@redhat.com\u003e\nAcked-by: Rik van Riel \u003criel@redhat.com\u003e\nCc: Michal Hocko \u003cmhocko@suse.cz\u003e\nCc: Kent Overstreet \u003ckent.overstreet@gmail.com\u003e\nSigned-off-by: Andrew Morton \u003cakpm@linux-foundation.org\u003e\nSigned-off-by: Linus Torvalds \u003ctorvalds@linux-foundation.org\u003e\n"
    },
    {
      "commit": "f0fdc5e8e6f579310458aef43d1610a0bb5e81a4",
      "tree": "c3f9de434bf9065d2f1d7a5f4214f28efb4920c4",
      "parents": [
        "419d8c96dbfa558f00e623023917d0a5afc46129"
      ],
      "author": {
        "name": "Johannes Weiner",
        "email": "hannes@cmpxchg.org",
        "time": "Thu Feb 10 15:01:34 2011 -0800"
      },
      "committer": {
        "name": "Linus Torvalds",
        "email": "torvalds@linux-foundation.org",
        "time": "Fri Feb 11 16:12:20 2011 -0800"
      },
      "message": "vmscan: fix zone shrinking exit when scan work is done\n\nCommit 3e7d34497067 (\"mm: vmscan: reclaim order-0 and use compaction\ninstead of lumpy reclaim\") introduced an indefinite loop in\nshrink_zone().\n\nIt meant to break out of this loop when no pages had been reclaimed and\nnot a single page was even scanned.  The way it would detect the latter\nis by taking a snapshot of sc-\u003enr_scanned at the beginning of the\nfunction and comparing it against the new sc-\u003enr_scanned after the scan\nloop.  But it would re-iterate without updating that snapshot, looping\nforever if sc-\u003enr_scanned changed at least once since shrink_zone() was\ninvoked.\n\nThis is not the sole condition that would exit that loop, but it\nrequires other processes to change the zone state, as the reclaimer that\nis stuck obviously can not anymore.\n\nThis is only happening for higher-order allocations, where reclaim is\nrun back to back with compaction.\n\nSigned-off-by: Johannes Weiner \u003channes@cmpxchg.org\u003e\nReported-by: Michal Hocko \u003cmhocko@suse.cz\u003e\nTested-by: Kent Overstreet \u003ckent.overstreet@gmail.com\u003e\nReported-by: Kent Overstreet \u003ckent.overstreet@gmail.com\u003e\nAcked-by: Mel Gorman \u003cmel@csn.ul.ie\u003e\nCc: Andrea Arcangeli \u003caarcange@redhat.com\u003e\nCc: Rik van Riel \u003criel@redhat.com\u003e\nReviewed-by: Minchan Kim \u003cminchan.kim@gmail.com\u003e\nSigned-off-by: Andrew Morton \u003cakpm@linux-foundation.org\u003e\nSigned-off-by: Linus Torvalds \u003ctorvalds@linux-foundation.org\u003e\n"
    },
    {
      "commit": "f33261d75b88f55a08e6a9648cef73509979bfba",
      "tree": "f3d8b4f41c860e9f6d054173870319a75c14c155",
      "parents": [
        "4f542e3dd90a96ee0f8fcb8173cb4104f5f753e6"
      ],
      "author": {
        "name": "David Rientjes",
        "email": "rientjes@google.com",
        "time": "Tue Jan 25 15:07:20 2011 -0800"
      },
      "committer": {
        "name": "Linus Torvalds",
        "email": "torvalds@linux-foundation.org",
        "time": "Wed Jan 26 10:50:00 2011 +1000"
      },
      "message": "mm: fix deferred congestion timeout if preferred zone is not allowed\n\nBefore 0e093d99763e (\"writeback: do not sleep on the congestion queue if\nthere are no congested BDIs or if significant congestion is not being\nencountered in the current zone\"), preferred_zone was only used for NUMA\nstatistics, to determine the zoneidx from which to allocate from given\nthe type requested, and whether to utilize memory compaction.\n\nwait_iff_congested(), though, uses preferred_zone to determine if the\ncongestion wait should be deferred because its dirty pages are backed by\na congested bdi.  This incorrectly defers the timeout and busy loops in\nthe page allocator with various cond_resched() calls if preferred_zone\nis not allowed in the current context, usually consuming 100% of a cpu.\n\nThis patch ensures preferred_zone is an allowed zone in the fastpath\ndepending on whether current is constrained by its cpuset or nodes in\nits mempolicy (when the nodemask passed is non-NULL).  This is correct\nsince the fastpath allocation always passes ALLOC_CPUSET when trying to\nallocate memory.  In the slowpath, this patch resets preferred_zone to\nthe first zone of the allowed type when the allocation is not\nconstrained by current\u0027s cpuset, i.e.  
it does not pass ALLOC_CPUSET.\n\nThis patch also ensures preferred_zone is from the set of allowed nodes\nwhen called from within direct reclaim since allocations are always\nconstrained by cpusets in this context (it is blockable).\n\nBoth of these uses of cpuset_current_mems_allowed are protected by\nget_mems_allowed().\n\nSigned-off-by: David Rientjes \u003crientjes@google.com\u003e\nCc: Mel Gorman \u003cmel@csn.ul.ie\u003e\nCc: Johannes Weiner \u003channes@cmpxchg.org\u003e\nCc: Minchan Kim \u003cminchan.kim@gmail.com\u003e\nCc: Wu Fengguang \u003cfengguang.wu@intel.com\u003e\nCc: KAMEZAWA Hiroyuki \u003ckamezawa.hiroyu@jp.fujitsu.com\u003e\nCc: KOSAKI Motohiro \u003ckosaki.motohiro@jp.fujitsu.com\u003e\nAcked-by: Rik van Riel \u003criel@redhat.com\u003e\nSigned-off-by: Andrew Morton \u003cakpm@linux-foundation.org\u003e\nSigned-off-by: Linus Torvalds \u003ctorvalds@linux-foundation.org\u003e\n"
    },
    {
      "commit": "3305de51bf612603c9a4e4dc98ceb839ef933749",
      "tree": "83c1f03790d6a658d736f6a5e4067aefb604b293",
      "parents": [
        "abb65272a190660790096628859e752172d822fd"
      ],
      "author": {
        "name": "Jesper Juhl",
        "email": "jj@chaosbits.net",
        "time": "Thu Jan 20 14:44:20 2011 -0800"
      },
      "committer": {
        "name": "Linus Torvalds",
        "email": "torvalds@linux-foundation.org",
        "time": "Thu Jan 20 17:02:05 2011 -0800"
      },
      "message": "mm/vmscan.c: remove duplicate include of compaction.h\n\nSigned-off-by: Jesper Juhl \u003cjj@chaosbits.net\u003e\nSigned-off-by: Andrew Morton \u003cakpm@linux-foundation.org\u003e\nSigned-off-by: Linus Torvalds \u003ctorvalds@linux-foundation.org\u003e\n"
    },
    {
      "commit": "7a608572a282a74978e10fd6cd63090aebe29f5c",
      "tree": "03e52f73d7c35ffcea8f46e14ec569da818a7631",
      "parents": [
        "9e8a462a0141b12e22c4a2f0c12e0542770401f0"
      ],
      "author": {
        "name": "Linus Torvalds",
        "email": "torvalds@linux-foundation.org",
        "time": "Mon Jan 17 14:42:19 2011 -0800"
      },
      "committer": {
        "name": "Linus Torvalds",
        "email": "torvalds@linux-foundation.org",
        "time": "Mon Jan 17 14:42:19 2011 -0800"
      },
      "message": "Revert \"mm: batch activate_page() to reduce lock contention\"\n\nThis reverts commit 744ed1442757767ffede5008bb13e0805085902e.\n\nChris Mason ended up chasing down some page allocation errors and pages\nstuck waiting on the IO scheduler, and was able to narrow it down to two\ncommits: commit 744ed1442757 (\"mm: batch activate_page() to reduce lock\ncontention\") and d8505dee1a87 (\"mm: simplify code of swap.c\").\n\nThis reverts the first of them.\n\nReported-and-debugged-by: Chris Mason \u003cchris.mason@oracle.com\u003e\nCc: Mel Gorman \u003cmel@csn.ul.ie\u003e\nCc: Andrew Morton \u003cakpm@linux-foundation.org\u003e\nCc: Jens Axboe \u003cjaxboe@fusionio.com\u003e\nCc: linux-mm \u003clinux-mm@kvack.org\u003e\nCc: KAMEZAWA Hiroyuki \u003ckamezawa.hiroyu@jp.fujitsu.com\u003e\nCc: Andrea Arcangeli \u003caarcange@redhat.com\u003e\nCc: Shaohua Li \u003cshaohua.li@intel.com\u003e\nSigned-off-by: Linus Torvalds \u003ctorvalds@linux-foundation.org\u003e\n"
    },
    {
      "commit": "744ed1442757767ffede5008bb13e0805085902e",
      "tree": "75af93524570b40056f2367059dfa84ba7d90186",
      "parents": [
        "d8505dee1a87b8d41b9c4ee1325cd72258226fbc"
      ],
      "author": {
        "name": "Shaohua Li",
        "email": "shaohua.li@intel.com",
        "time": "Thu Jan 13 15:47:34 2011 -0800"
      },
      "committer": {
        "name": "Linus Torvalds",
        "email": "torvalds@linux-foundation.org",
        "time": "Thu Jan 13 17:32:50 2011 -0800"
      },
      "message": "mm: batch activate_page() to reduce lock contention\n\nThe zone-\u003elru_lock is heavily contended in workloads where activate_page()\nis frequently used.  We could do batch activate_page() to reduce the lock\ncontention.  The batched pages will be added into zone list when the pool\nis full or page reclaim is trying to drain them.\n\nFor example, in a 4 socket 64 CPU system, create a sparse file and 64\nprocesses, processes shared map to the file.  Each process reads the\nwhole file and then exits.  The process exit will do unmap_vmas() and cause\na lot of activate_page() calls.  In such a workload, we saw about 58% total\ntime reduction with below patch.  Other workloads with a lot of\nactivate_page also benefit a lot.\n\nI tested some microbenchmarks:\ncase-anon-cow-rand-mt\t\t0.58%\ncase-anon-cow-rand\t\t-3.30%\ncase-anon-cow-seq-mt\t\t-0.51%\ncase-anon-cow-seq\t\t-5.68%\ncase-anon-r-rand-mt\t\t0.23%\ncase-anon-r-rand\t\t0.81%\ncase-anon-r-seq-mt\t\t-0.71%\ncase-anon-r-seq\t\t\t-1.99%\ncase-anon-rx-rand-mt\t\t2.11%\ncase-anon-rx-seq-mt\t\t3.46%\ncase-anon-w-rand-mt\t\t-0.03%\ncase-anon-w-rand\t\t-0.50%\ncase-anon-w-seq-mt\t\t-1.08%\ncase-anon-w-seq\t\t\t-0.12%\ncase-anon-wx-rand-mt\t\t-5.02%\ncase-anon-wx-seq-mt\t\t-1.43%\ncase-fork\t\t\t1.65%\ncase-fork-sleep\t\t\t-0.07%\ncase-fork-withmem\t\t1.39%\ncase-hugetlb\t\t\t-0.59%\ncase-lru-file-mmap-read-mt\t-0.54%\ncase-lru-file-mmap-read\t\t0.61%\ncase-lru-file-mmap-read-rand\t-2.24%\ncase-lru-file-readonce\t\t-0.64%\ncase-lru-file-readtwice\t\t-11.69%\ncase-lru-memcg\t\t\t-1.35%\ncase-mmap-pread-rand-mt\t\t1.88%\ncase-mmap-pread-rand\t\t-15.26%\ncase-mmap-pread-seq-mt\t\t0.89%\ncase-mmap-pread-seq\t\t-69.72%\ncase-mmap-xread-rand-mt\t\t0.71%\ncase-mmap-xread-seq-mt\t\t0.38%\n\nThe most significant are:\ncase-lru-file-readtwice\t\t-11.69%\ncase-mmap-pread-rand\t\t-15.26%\ncase-mmap-pread-seq\t\t-69.72%\n\nwhich use activate_page a lot.  
Others are basically variations because\neach run differs slightly.\n\n[akpm@linux-foundation.org: coding-style fixes]\nSigned-off-by: Shaohua Li \u003cshaohua.li@intel.com\u003e\nCc: Andi Kleen \u003candi@firstfloor.org\u003e\nCc: Minchan Kim \u003cminchan.kim@gmail.com\u003e\nCc: KOSAKI Motohiro \u003ckosaki.motohiro@jp.fujitsu.com\u003e\nCc: Rik van Riel \u003criel@redhat.com\u003e\nSigned-off-by: Andrew Morton \u003cakpm@linux-foundation.org\u003e\nSigned-off-by: Linus Torvalds \u003ctorvalds@linux-foundation.org\u003e\n"
    },
    {
      "commit": "9992af102974f3f8a02a1f2729c3461881539e26",
      "tree": "40958e1a8bd7efc7c9a4d28e2b77d86bb8688734",
      "parents": [
        "2c888cfbc1b45508a44763d85ba2e8ac43faff5f"
      ],
      "author": {
        "name": "Rik van Riel",
        "email": "riel@redhat.com",
        "time": "Thu Jan 13 15:47:13 2011 -0800"
      },
      "committer": {
        "name": "Linus Torvalds",
        "email": "torvalds@linux-foundation.org",
        "time": "Thu Jan 13 17:32:46 2011 -0800"
      },
      "message": "thp: scale nr_rotated to balance memory pressure\n\nMake sure we scale up nr_rotated when we encounter a referenced\ntransparent huge page.  This ensures pageout scanning balance is not\ndistorted when there are huge pages on the LRU.\n\nSigned-off-by: Rik van Riel \u003criel@redhat.com\u003e\nSigned-off-by: Andrea Arcangeli \u003caarcange@redhat.com\u003e\nSigned-off-by: Andrew Morton \u003cakpm@linux-foundation.org\u003e\nSigned-off-by: Linus Torvalds \u003ctorvalds@linux-foundation.org\u003e\n"
    },
    {
      "commit": "2c888cfbc1b45508a44763d85ba2e8ac43faff5f",
      "tree": "9a7f2214e5d6a01d5724ae63d4d50cddeb2293ff",
      "parents": [
        "97562cd243298acf573620c764a1037bd545c9bc"
      ],
      "author": {
        "name": "Rik van Riel",
        "email": "riel@redhat.com",
        "time": "Thu Jan 13 15:47:13 2011 -0800"
      },
      "committer": {
        "name": "Linus Torvalds",
        "email": "torvalds@linux-foundation.org",
        "time": "Thu Jan 13 17:32:46 2011 -0800"
      },
      "message": "thp: fix anon memory statistics with transparent hugepages\n\nCount each transparent hugepage as HPAGE_PMD_NR pages in the LRU\nstatistics, so the Active(anon) and Inactive(anon) statistics in\n/proc/meminfo are correct.\n\nSigned-off-by: Rik van Riel \u003criel@redhat.com\u003e\nSigned-off-by: Andrea Arcangeli \u003caarcange@redhat.com\u003e\nSigned-off-by: Andrew Morton \u003cakpm@linux-foundation.org\u003e\nSigned-off-by: Linus Torvalds \u003ctorvalds@linux-foundation.org\u003e\n"
    },
    {
      "commit": "5a03b051ed87e72b959f32a86054e1142ac4cf55",
      "tree": "31f0e8efb86d48b0292f8a7ea4bd9cf7c930a0ab",
      "parents": [
        "878aee7d6b5504e01b9caffce080e792b6b8d090"
      ],
      "author": {
        "name": "Andrea Arcangeli",
        "email": "aarcange@redhat.com",
        "time": "Thu Jan 13 15:47:11 2011 -0800"
      },
      "committer": {
        "name": "Linus Torvalds",
        "email": "torvalds@linux-foundation.org",
        "time": "Thu Jan 13 17:32:46 2011 -0800"
      },
      "message": "thp: use compaction in kswapd for GFP_ATOMIC order \u003e 0\n\nThis takes advantage of memory compaction to properly generate pages of\norder \u003e 0 if regular page reclaim fails and priority level becomes more\nsevere and we don\u0027t reach the proper watermarks.\n\nSigned-off-by: Andrea Arcangeli \u003caarcange@redhat.com\u003e\nSigned-off-by: Andrew Morton \u003cakpm@linux-foundation.org\u003e\nSigned-off-by: Linus Torvalds \u003ctorvalds@linux-foundation.org\u003e\n"
    },
    {
      "commit": "dc83edd941f412e938841b4989be24aa288a1aa6",
      "tree": "07dbc04d544f3200b3b13be1af6c57f44ffa63c8",
      "parents": [
        "355b09c47a0cbb73b3e65a57c03f157f2e7ddb0b"
      ],
      "author": {
        "name": "Mel Gorman",
        "email": "mel@csn.ul.ie",
        "time": "Thu Jan 13 15:46:26 2011 -0800"
      },
      "committer": {
        "name": "Linus Torvalds",
        "email": "torvalds@linux-foundation.org",
        "time": "Thu Jan 13 17:32:37 2011 -0800"
      },
      "message": "mm: kswapd: use the classzone idx that kswapd was using for sleeping_prematurely()\n\nWhen kswapd is woken up for a high-order allocation, it takes account of\nthe highest usable zone by the caller (the classzone idx).  During\nallocation, this index is used to select the lowmem_reserve[] that should\nbe applied to the watermark calculation in zone_watermark_ok().\n\nWhen balancing a node, kswapd considers the highest unbalanced zone to be\nthe classzone index.  This will always be at least the caller\u0027s\nclasszone_idx and can be higher.  However, sleeping_prematurely() always\nconsiders the lowest zone (e.g.  ZONE_DMA) to be the classzone index.\nThis means that sleeping_prematurely() can consider a zone to be balanced\nthat is unusable by the allocation request that originally woke kswapd.\nThis patch changes sleeping_prematurely() to use a classzone_idx matching\nthe value it used in balance_pgdat().\n\nSigned-off-by: Mel Gorman \u003cmel@csn.ul.ie\u003e\nReviewed-by: Minchan Kim \u003cminchan.kim@gmail.com\u003e\nReviewed-by: Eric B Munson \u003cemunson@mgebm.net\u003e\nCc: KAMEZAWA Hiroyuki \u003ckamezawa.hiroyu@jp.fujitsu.com\u003e\nCc: Simon Kirby \u003csim@hostway.ca\u003e\nCc: KOSAKI Motohiro \u003ckosaki.motohiro@jp.fujitsu.com\u003e\nCc: Shaohua Li \u003cshaohua.li@intel.com\u003e\nCc: Dave Hansen \u003cdave@linux.vnet.ibm.com\u003e\nCc: Johannes Weiner \u003channes@cmpxchg.org\u003e\nCc: Rik van Riel \u003criel@redhat.com\u003e\nSigned-off-by: Andrew Morton \u003cakpm@linux-foundation.org\u003e\nSigned-off-by: Linus Torvalds \u003ctorvalds@linux-foundation.org\u003e\n"
    },
    {
      "commit": "355b09c47a0cbb73b3e65a57c03f157f2e7ddb0b",
      "tree": "26be6f89cac5b6f5b321cf74103444ae8775c3eb",
      "parents": [
        "4d40502ea580c35414a1466d86f96484910ebaec"
      ],
      "author": {
        "name": "Mel Gorman",
        "email": "mel@csn.ul.ie",
        "time": "Thu Jan 13 15:46:24 2011 -0800"
      },
      "committer": {
        "name": "Linus Torvalds",
        "email": "torvalds@linux-foundation.org",
        "time": "Thu Jan 13 17:32:37 2011 -0800"
      },
      "message": "mm: kswapd: treat zone-\u003eall_unreclaimable in sleeping_prematurely similar to balance_pgdat()\n\nAfter DEF_PRIORITY, balance_pgdat() considers all_unreclaimable zones to\nbe balanced but sleeping_prematurely does not.  This can force kswapd to\nstay awake longer than it should.  This patch fixes it.\n\nSigned-off-by: Mel Gorman \u003cmel@csn.ul.ie\u003e\nReviewed-by: Eric B Munson \u003cemunson@mgebm.net\u003e\nCc: Minchan Kim \u003cminchan.kim@gmail.com\u003e\nCc: KAMEZAWA Hiroyuki \u003ckamezawa.hiroyu@jp.fujitsu.com\u003e\nCc: Simon Kirby \u003csim@hostway.ca\u003e\nCc: KOSAKI Motohiro \u003ckosaki.motohiro@jp.fujitsu.com\u003e\nCc: Shaohua Li \u003cshaohua.li@intel.com\u003e\nCc: Dave Hansen \u003cdave@linux.vnet.ibm.com\u003e\nCc: Johannes Weiner \u003channes@cmpxchg.org\u003e\nCc: Rik van Riel \u003criel@redhat.com\u003e\nSigned-off-by: Andrew Morton \u003cakpm@linux-foundation.org\u003e\nSigned-off-by: Linus Torvalds \u003ctorvalds@linux-foundation.org\u003e\n"
    },
    {
      "commit": "4d40502ea580c35414a1466d86f96484910ebaec",
      "tree": "ed03d2b5a100be1c3371d304421af221fa893129",
      "parents": [
        "0abdee2bd4118366c62349a304f81537be69af33"
      ],
      "author": {
        "name": "Mel Gorman",
        "email": "mel@csn.ul.ie",
        "time": "Thu Jan 13 15:46:23 2011 -0800"
      },
      "committer": {
        "name": "Linus Torvalds",
        "email": "torvalds@linux-foundation.org",
        "time": "Thu Jan 13 17:32:37 2011 -0800"
      },
      "message": "mm: kswapd: reset kswapd_max_order and classzone_idx after reading\n\nWhen kswapd wakes up, it reads its order and classzone from pgdat and\ncalls balance_pgdat.  While it\u0027s awake, it potentially reclaims at a high\norder and a low classzone index.  This might have been a once-off that was\nnot required by subsequent callers.  However, because the pgdat values\nwere not reset, they remain artificially high while balance_pgdat() is\nrunning and potentially kswapd enters a second unnecessary reclaim cycle.\nReset the pgdat order and classzone index after reading.\n\nSigned-off-by: Mel Gorman \u003cmel@csn.ul.ie\u003e\nReviewed-by: Minchan Kim \u003cminchan.kim@gmail.com\u003e\nReviewed-by: KAMEZAWA Hiroyuki \u003ckamezawa.hiroyu@jp.fujitsu.com\u003e\nReviewed-by: Eric B Munson \u003cemunson@mgebm.net\u003e\nCc: Simon Kirby \u003csim@hostway.ca\u003e\nCc: KOSAKI Motohiro \u003ckosaki.motohiro@jp.fujitsu.com\u003e\nCc: Shaohua Li \u003cshaohua.li@intel.com\u003e\nCc: Dave Hansen \u003cdave@linux.vnet.ibm.com\u003e\nCc: Johannes Weiner \u003channes@cmpxchg.org\u003e\nCc: Rik van Riel \u003criel@redhat.com\u003e\nSigned-off-by: Andrew Morton \u003cakpm@linux-foundation.org\u003e\nSigned-off-by: Linus Torvalds \u003ctorvalds@linux-foundation.org\u003e\n"
    },
    {
      "commit": "0abdee2bd4118366c62349a304f81537be69af33",
      "tree": "c013abd2dd49b3837d033eb4d32dfb57984d273e",
      "parents": [
        "1741c87757448cedd03224f01586504f9256415d"
      ],
      "author": {
        "name": "Mel Gorman",
        "email": "mel@csn.ul.ie",
        "time": "Thu Jan 13 15:46:22 2011 -0800"
      },
      "committer": {
        "name": "Linus Torvalds",
        "email": "torvalds@linux-foundation.org",
        "time": "Thu Jan 13 17:32:37 2011 -0800"
      },
      "message": "mm: kswapd: use the order that kswapd was reclaiming at for sleeping_prematurely()\n\nBefore kswapd goes to sleep, it uses sleeping_prematurely() to check if\nthere was a race pushing a zone below its watermark.  If the race\nhappened, it stays awake.  However, balance_pgdat() can decide to reclaim\nat order-0 if it decides that high-order reclaim is not working as\nexpected.  This information is not passed back to sleeping_prematurely().\nThe impact is that kswapd remains awake reclaiming pages long after it\nshould have gone to sleep.  This patch passes the adjusted order to\nsleeping_prematurely and uses the same logic as balance_pgdat to decide if\nit\u0027s ok to go to sleep.\n\nSigned-off-by: Mel Gorman \u003cmel@csn.ul.ie\u003e\nReviewed-by: Minchan Kim \u003cminchan.kim@gmail.com\u003e\nReviewed-by: KAMEZAWA Hiroyuki \u003ckamezawa.hiroyu@jp.fujitsu.com\u003e\nReviewed-by: Eric B Munson \u003cemunson@mgebm.net\u003e\nCc: Simon Kirby \u003csim@hostway.ca\u003e\nCc: KOSAKI Motohiro \u003ckosaki.motohiro@jp.fujitsu.com\u003e\nCc: Shaohua Li \u003cshaohua.li@intel.com\u003e\nCc: Dave Hansen \u003cdave@linux.vnet.ibm.com\u003e\nCc: Johannes Weiner \u003channes@cmpxchg.org\u003e\nCc: Rik van Riel \u003criel@redhat.com\u003e\nSigned-off-by: Andrew Morton \u003cakpm@linux-foundation.org\u003e\nSigned-off-by: Linus Torvalds \u003ctorvalds@linux-foundation.org\u003e\n"
    },
    {
      "commit": "1741c87757448cedd03224f01586504f9256415d",
      "tree": "e8f3bace5f0cd1652a3a2a682189b19f7b3af875",
      "parents": [
        "9950474883e027e6e728cbcff25f7f2bf0c96530"
      ],
      "author": {
        "name": "Mel Gorman",
        "email": "mel@csn.ul.ie",
        "time": "Thu Jan 13 15:46:21 2011 -0800"
      },
      "committer": {
        "name": "Linus Torvalds",
        "email": "torvalds@linux-foundation.org",
        "time": "Thu Jan 13 17:32:37 2011 -0800"
      },
      "message": "mm: kswapd: keep kswapd awake for high-order allocations until a percentage of the node is balanced\n\nWhen reclaiming for high-orders, kswapd is responsible for balancing a\nnode but it should not reclaim excessively.  It avoids excessive reclaim\nby considering if any zone in a node is balanced then the node is\nbalanced.  In the cases where there are imbalanced zone sizes (e.g.\nZONE_DMA with both ZONE_DMA32 and ZONE_NORMAL), kswapd can go to sleep\nprematurely as just one small zone was balanced.\n\nThis alters the sleep logic of kswapd slightly.  It counts the number of\npages that make up the balanced zones.  If the total number of balanced\npages is more than a quarter of the node, kswapd will go back to sleep.\nThis should keep a node balanced without reclaiming an excessive number of\npages.\n\nSigned-off-by: Mel Gorman \u003cmel@csn.ul.ie\u003e\nReviewed-by: Minchan Kim \u003cminchan.kim@gmail.com\u003e\nReviewed-by: KAMEZAWA Hiroyuki \u003ckamezawa.hiroyu@jp.fujitsu.com\u003e\nReviewed-by: Eric B Munson \u003cemunson@mgebm.net\u003e\nCc: Simon Kirby \u003csim@hostway.ca\u003e\nCc: KOSAKI Motohiro \u003ckosaki.motohiro@jp.fujitsu.com\u003e\nCc: Shaohua Li \u003cshaohua.li@intel.com\u003e\nCc: Dave Hansen \u003cdave@linux.vnet.ibm.com\u003e\nCc: Johannes Weiner \u003channes@cmpxchg.org\u003e\nCc: Rik van Riel \u003criel@redhat.com\u003e\nSigned-off-by: Andrew Morton \u003cakpm@linux-foundation.org\u003e\nSigned-off-by: Linus Torvalds \u003ctorvalds@linux-foundation.org\u003e\n"
    },
    {
      "commit": "9950474883e027e6e728cbcff25f7f2bf0c96530",
      "tree": "ecfdd3e68a25f1ef7822428c44f8375efbe9bc0c",
      "parents": [
        "c585a2678d83ba8fb02fa6b197de0ac7d67377f1"
      ],
      "author": {
        "name": "Mel Gorman",
        "email": "mel@csn.ul.ie",
        "time": "Thu Jan 13 15:46:20 2011 -0800"
      },
      "committer": {
        "name": "Linus Torvalds",
        "email": "torvalds@linux-foundation.org",
        "time": "Thu Jan 13 17:32:37 2011 -0800"
      },
      "message": "mm: kswapd: stop high-order balancing when any suitable zone is balanced\n\nSimon Kirby reported the following problem\n\n   We\u0027re seeing cases on a number of servers where cache never fully\n   grows to use all available memory.  Sometimes we see servers with 4 GB\n   of memory that never seem to have less than 1.5 GB free, even with a\n   constantly-active VM.  In some cases, these servers also swap out while\n   this happens, even though they are constantly reading the working set\n   into memory.  We have been seeing this happening for a long time; I\n   don\u0027t think it\u0027s anything recent, and it still happens on 2.6.36.\n\nAfter some debugging work by Simon, Dave Hansen and others, the prevailing\ntheory became that kswapd is reclaiming order-3 pages requested by SLUB\ntoo aggressively.\n\nThere are two apparent problems here.  On the target machine, there is a\nsmall Normal zone in comparison to DMA32.  As kswapd tries to balance all\nzones, it would continually try reclaiming for Normal even though DMA32\nwas balanced enough for callers.  The second problem is that\nsleeping_prematurely() does not use the same logic as balance_pgdat() when\ndeciding whether to sleep or not.  This keeps kswapd artificially awake.\n\nA number of tests were run and the figures from previous postings will\nlook very different for a few reasons.  One, the old figures were forcing\nmy network card to use GFP_ATOMIC in an attempt to replicate Simon\u0027s problem.\n Second, I previously specified slub_min_order\u003d3 again in an attempt to\nreproduce Simon\u0027s problem.  In this posting, I\u0027m depending on Simon to say\nwhether his problem is fixed or not and these figures are to show the\nimpact to the ordinary cases.  Finally, the \"vmscan\" figures are taken\nfrom /proc/vmstat instead of the tracepoints.  
There is less information\nbut recording is less disruptive.\n\nThe first test of relevance was postmark with a process running in the\nbackground reading a large amount of anonymous memory in blocks.  The\nobjective was to vaguely simulate what was happening on Simon\u0027s machine\nand it\u0027s memory intensive enough to have kswapd awake.\n\nPOSTMARK\n                                            traceonly          kanyzone\nTransactions per second:              156.00 ( 0.00%)   153.00 (-1.96%)\nData megabytes read per second:        21.51 ( 0.00%)    21.52 ( 0.05%)\nData megabytes written per second:     29.28 ( 0.00%)    29.11 (-0.58%)\nFiles created alone per second:       250.00 ( 0.00%)   416.00 (39.90%)\nFiles create/transact per second:      79.00 ( 0.00%)    76.00 (-3.95%)\nFiles deleted alone per second:       520.00 ( 0.00%)   420.00 (-23.81%)\nFiles delete/transact per second:      79.00 ( 0.00%)    76.00 (-3.95%)\n\nMMTests Statistics: duration\nUser/Sys Time Running Test (seconds)         16.58      17.4\nTotal Elapsed Time (seconds)                218.48    222.47\n\nVMstat Reclaim Statistics: vmscan\nDirect reclaims                                  0          4\nDirect reclaim pages scanned                     0        203\nDirect reclaim pages reclaimed                   0        184\nKswapd pages scanned                        326631     322018\nKswapd pages reclaimed                      312632     309784\nKswapd low wmark quickly                         1          4\nKswapd high wmark quickly                      122        475\nKswapd skip congestion_wait                      1          0\nPages activated                             700040     705317\nPages deactivated                           212113     203922\nPages written                                 9875       6363\n\nTotal pages scanned                         326631    322221\nTotal pages reclaimed                       312632    309968\n%age total pages scanned/reclaimed          
95.71%    96.20%\n%age total pages scanned/written             3.02%     1.97%\n\nproc vmstat: Faults\nMajor Faults                                   300       254\nMinor Faults                                645183    660284\nPage ins                                    493588    486704\nPage outs                                  4960088   4986704\nSwap ins                                      1230       661\nSwap outs                                     9869      6355\n\nPerformance is mildly affected because kswapd is no longer doing as much\nwork and the background memory consumer process is getting in the way.\nNote that kswapd scanned and reclaimed fewer pages as it\u0027s less aggressive\nand overall fewer pages were scanned and reclaimed.  Swap in/out is\nparticularly reduced again reflecting kswapd throwing out fewer pages.\n\nThe slight performance impact is unfortunate here but it looks like a\ndirect result of kswapd being less aggressive.  As the bug report is about\ntoo many pages being freed by kswapd, it may have to be accepted for now.\n\nThe second test is a streaming IO benchmark that was previously used by\nJohannes to show regressions in page reclaim.\n\nMICRO\n\t\t\t\t\t traceonly  kanyzone\nUser/Sys Time Running Test (seconds)         29.29     28.87\nTotal Elapsed Time (seconds)                492.18    488.79\n\nVMstat Reclaim Statistics: vmscan\nDirect reclaims                               2128       1460\nDirect reclaim pages scanned               2284822    1496067\nDirect reclaim pages reclaimed              148919     110937\nKswapd pages scanned                      15450014   16202876\nKswapd pages reclaimed                     8503697    8537897\nKswapd low wmark quickly                      3100       3397\nKswapd high wmark quickly                     1860       7243\nKswapd skip congestion_wait                    708        801\nPages activated                               9635       9573\nPages deactivated                       
      1432       1271\nPages written                                  223       1130\n\nTotal pages scanned                       17734836  17698943\nTotal pages reclaimed                      8652616   8648834\n%age total pages scanned/reclaimed          48.79%    48.87%\n%age total pages scanned/written             0.00%     0.01%\n\nproc vmstat: Faults\nMajor Faults                                   165       221\nMinor Faults                               9655785   9656506\nPage ins                                      3880      7228\nPage outs                                 37692940  37480076\nSwap ins                                         0        69\nSwap outs                                       19        15\n\nAgain fewer pages are scanned and reclaimed as expected and this time the\ntest completed faster.  Note that kswapd is hitting its watermarks faster\n(low and high wmark quickly) which I expect is due to kswapd reclaiming\nfewer pages.\n\nI also ran fs-mark, iozone and sysbench but there is nothing interesting\nto report in the figures.  Performance is not significantly changed and\nthe reclaim statistics look reasonable.\n\nThis patch:\n\nWhen the allocator enters its slow path, kswapd is woken up to balance the\nnode.  It continues working until all zones within the node are balanced.\nFor order-0 allocations, this makes perfect sense but for higher orders it\ncan have unintended side-effects.  If the zone sizes are imbalanced,\nkswapd may reclaim heavily within a smaller zone discarding an excessive\nnumber of pages.  The user-visible behaviour is that kswapd is awake and\nreclaiming even though plenty of pages are free from a suitable zone.\n\nThis patch alters the \"balance\" logic for high-order reclaim allowing\nkswapd to stop if any suitable zone becomes balanced to reduce the number\nof pages it reclaims from other zones.  
kswapd still tries to ensure that\norder-0 watermarks for all zones are met before sleeping.\n\nSigned-off-by: Mel Gorman \u003cmel@csn.ul.ie\u003e\nReviewed-by: Minchan Kim \u003cminchan.kim@gmail.com\u003e\nReviewed-by: KAMEZAWA Hiroyuki \u003ckamezawa.hiroyu@jp.fujitsu.com\u003e\nReviewed-by: Eric B Munson \u003cemunson@mgebm.net\u003e\nCc: Simon Kirby \u003csim@hostway.ca\u003e\nCc: KOSAKI Motohiro \u003ckosaki.motohiro@jp.fujitsu.com\u003e\nCc: Shaohua Li \u003cshaohua.li@intel.com\u003e\nCc: Dave Hansen \u003cdave@linux.vnet.ibm.com\u003e\nCc: Johannes Weiner \u003channes@cmpxchg.org\u003e\nCc: Rik van Riel \u003criel@redhat.com\u003e\nSigned-off-by: Andrew Morton \u003cakpm@linux-foundation.org\u003e\nSigned-off-by: Linus Torvalds \u003ctorvalds@linux-foundation.org\u003e\n"
    },
    {
      "commit": "f3a310bc4e5ce7e55e1c8e25c31e63af017f3e50",
      "tree": "0c78777bd505f44edeb9bbcc50fb3154896574aa",
      "parents": [
        "9927af740b1b9b1e769310bd0b91425e8047b803"
      ],
      "author": {
        "name": "Mel Gorman",
        "email": "mel@csn.ul.ie",
        "time": "Thu Jan 13 15:46:00 2011 -0800"
      },
      "committer": {
        "name": "Linus Torvalds",
        "email": "torvalds@linux-foundation.org",
        "time": "Thu Jan 13 17:32:34 2011 -0800"
      },
      "message": "mm: vmscan: rename lumpy_mode to reclaim_mode\n\nWith compaction being used instead of lumpy reclaim, the name lumpy_mode\nand associated variables are a bit misleading.  Rename lumpy_mode to\nreclaim_mode which is a better fit.  There is no functional change.\n\nSigned-off-by: Mel Gorman \u003cmel@csn.ul.ie\u003e\nCc: Andrea Arcangeli \u003caarcange@redhat.com\u003e\nCc: KOSAKI Motohiro \u003ckosaki.motohiro@jp.fujitsu.com\u003e\nCc: Rik van Riel \u003criel@redhat.com\u003e\nAcked-by: Johannes Weiner \u003channes@cmpxchg.org\u003e\nCc: Andy Whitcroft \u003capw@shadowen.org\u003e\nSigned-off-by: Andrew Morton \u003cakpm@linux-foundation.org\u003e\nSigned-off-by: Linus Torvalds \u003ctorvalds@linux-foundation.org\u003e\n"
    },
    {
      "commit": "77f1fe6b08b13a87391549c8a820ddc817b6f50e",
      "tree": "720865bd0994da3787b6f37d33b2ee4c26a2de6c",
      "parents": [
        "3e7d344970673c5334cf7b5bb27c8c0942b06126"
      ],
      "author": {
        "name": "Mel Gorman",
        "email": "mel@csn.ul.ie",
        "time": "Thu Jan 13 15:45:57 2011 -0800"
      },
      "committer": {
        "name": "Linus Torvalds",
        "email": "torvalds@linux-foundation.org",
        "time": "Thu Jan 13 17:32:34 2011 -0800"
      },
      "message": "mm: migration: allow migration to operate asynchronously and avoid synchronous compaction in the faster path\n\nMigration synchronously waits for writeback if the initial passes fail.\nCallers of memory compaction do not necessarily want this behaviour if the\ncaller is latency sensitive or expects that synchronous migration is not\ngoing to have a significantly better success rate.\n\nThis patch adds a sync parameter to migrate_pages() allowing the caller to\nindicate if wait_on_page_writeback() is allowed within migration or not.\nFor reclaim/compaction, try_to_compact_pages() is first called\nasynchronously, direct reclaim runs and then try_to_compact_pages() is\ncalled synchronously as there is a greater expectation that it\u0027ll succeed.\n\n[akpm@linux-foundation.org: build/merge fix]\nSigned-off-by: Mel Gorman \u003cmel@csn.ul.ie\u003e\nCc: Andrea Arcangeli \u003caarcange@redhat.com\u003e\nCc: KOSAKI Motohiro \u003ckosaki.motohiro@jp.fujitsu.com\u003e\nCc: Rik van Riel \u003criel@redhat.com\u003e\nAcked-by: Johannes Weiner \u003channes@cmpxchg.org\u003e\nCc: Andy Whitcroft \u003capw@shadowen.org\u003e\nSigned-off-by: Andrew Morton \u003cakpm@linux-foundation.org\u003e\nSigned-off-by: Linus Torvalds \u003ctorvalds@linux-foundation.org\u003e\n"
    },
    {
      "commit": "3e7d344970673c5334cf7b5bb27c8c0942b06126",
      "tree": "832ecb4da5fd27efa5a503df5b96bfdee2a52ffd",
      "parents": [
        "ee64fc9354e515a79c7232cfde65c88ec627308b"
      ],
      "author": {
        "name": "Mel Gorman",
        "email": "mel@csn.ul.ie",
        "time": "Thu Jan 13 15:45:56 2011 -0800"
      },
      "committer": {
        "name": "Linus Torvalds",
        "email": "torvalds@linux-foundation.org",
        "time": "Thu Jan 13 17:32:33 2011 -0800"
      },
      "message": "mm: vmscan: reclaim order-0 and use compaction instead of lumpy reclaim\n\nLumpy reclaim is disruptive.  It reclaims a large number of pages and\nignores the age of the pages it reclaims.  This can incur significant\nstalls and potentially increase the number of major faults.\n\nCompaction has reached the point where it is considered reasonably stable\n(meaning it has passed a lot of testing) and is a potential candidate for\ndisplacing lumpy reclaim.  This patch introduces an alternative to lumpy\nreclaim whe compaction is available called reclaim/compaction.  The basic\noperation is very simple - instead of selecting a contiguous range of\npages to reclaim, a number of order-0 pages are reclaimed and then\ncompaction is later by either kswapd (compact_zone_order()) or direct\ncompaction (__alloc_pages_direct_compact()).\n\n[akpm@linux-foundation.org: fix build]\n[akpm@linux-foundation.org: use conventional task_struct naming]\nSigned-off-by: Mel Gorman \u003cmel@csn.ul.ie\u003e\nCc: Andrea Arcangeli \u003caarcange@redhat.com\u003e\nCc: KOSAKI Motohiro \u003ckosaki.motohiro@jp.fujitsu.com\u003e\nCc: Rik van Riel \u003criel@redhat.com\u003e\nAcked-by: Johannes Weiner \u003channes@cmpxchg.org\u003e\nCc: Andy Whitcroft \u003capw@shadowen.org\u003e\nSigned-off-by: Andrew Morton \u003cakpm@linux-foundation.org\u003e\nSigned-off-by: Linus Torvalds \u003ctorvalds@linux-foundation.org\u003e\n"
    },
    {
      "commit": "ee64fc9354e515a79c7232cfde65c88ec627308b",
      "tree": "fb5fb6c0045ff5467ed5870d5f64806784deba2d",
      "parents": [
        "b7aba6984dc048503b69c2a885098cdd430832bf"
      ],
      "author": {
        "name": "Mel Gorman",
        "email": "mel@csn.ul.ie",
        "time": "Thu Jan 13 15:45:55 2011 -0800"
      },
      "committer": {
        "name": "Linus Torvalds",
        "email": "torvalds@linux-foundation.org",
        "time": "Thu Jan 13 17:32:33 2011 -0800"
      },
      "message": "mm: vmscan: convert lumpy_mode into a bitmask\n\nCurrently lumpy_mode is an enum and determines if lumpy reclaim is off,\nsyncronous or asyncronous.  In preparation for using compaction instead of\nlumpy reclaim, this patch converts the flags into a bitmap.\n\nSigned-off-by: Mel Gorman \u003cmel@csn.ul.ie\u003e\nCc: Andrea Arcangeli \u003caarcange@redhat.com\u003e\nCc: KOSAKI Motohiro \u003ckosaki.motohiro@jp.fujitsu.com\u003e\nCc: Rik van Riel \u003criel@redhat.com\u003e\nCc: Johannes Weiner \u003channes@cmpxchg.org\u003e\nCc: Andy Whitcroft \u003capw@shadowen.org\u003e\nSigned-off-by: Andrew Morton \u003cakpm@linux-foundation.org\u003e\nSigned-off-by: Linus Torvalds \u003ctorvalds@linux-foundation.org\u003e\n"
    },
    {
      "commit": "f0bc0a60b13f209df16062f94e9fb4b90dc08708",
      "tree": "c876603d2cb17a3e2ed159ca5e08e4665cc09fe2",
      "parents": [
        "c3f0da631539b3b8e17f6dda567af9958d49d14f"
      ],
      "author": {
        "name": "KOSAKI Motohiro",
        "email": "kosaki.motohiro@jp.fujitsu.com",
        "time": "Thu Jan 13 15:45:50 2011 -0800"
      },
      "committer": {
        "name": "Linus Torvalds",
        "email": "torvalds@linux-foundation.org",
        "time": "Thu Jan 13 17:32:32 2011 -0800"
      },
      "message": "vmscan: factor out kswapd sleeping logic from kswapd()\n\nCurrently, kswapd() has deep nesting and is slightly hard to read.  Clean\nthis up.\n\nSigned-off-by: KOSAKI Motohiro \u003ckosaki.motohiro@jp.fujitsu.com\u003e\nCc: Mel Gorman \u003cmel@csn.ul.ie\u003e\nSigned-off-by: Andrew Morton \u003cakpm@linux-foundation.org\u003e\nSigned-off-by: Linus Torvalds \u003ctorvalds@linux-foundation.org\u003e\n"
    },
    {
      "commit": "b44129b30652c8771db2265939bb8b463724043d",
      "tree": "d5b669ff4faea020b03e894706f49d5d1ae56907",
      "parents": [
        "88f5acf88ae6a9778f6d25d0d5d7ec2d57764a97"
      ],
      "author": {
        "name": "Mel Gorman",
        "email": "mel@csn.ul.ie",
        "time": "Thu Jan 13 15:45:43 2011 -0800"
      },
      "committer": {
        "name": "Linus Torvalds",
        "email": "torvalds@linux-foundation.org",
        "time": "Thu Jan 13 17:32:31 2011 -0800"
      },
      "message": "mm: vmstat: use a single setter function and callback for adjusting percpu thresholds\n\nreduce_pgdat_percpu_threshold() and restore_pgdat_percpu_threshold() exist\nto adjust the per-cpu vmstat thresholds while kswapd is awake to avoid\nerrors due to counter drift.  The functions duplicate some code so this\npatch replaces them with a single set_pgdat_percpu_threshold() that takes\na callback function to calculate the desired threshold as a parameter.\n\n[akpm@linux-foundation.org: readability tweak]\n[kosaki.motohiro@jp.fujitsu.com: set_pgdat_percpu_threshold(): don\u0027t use for_each_online_cpu]\nSigned-off-by: Mel Gorman \u003cmel@csn.ul.ie\u003e\nReviewed-by: Christoph Lameter \u003ccl@linux.com\u003e\nReviewed-by: KAMEZAWA Hiroyuki \u003ckamezawa.hiroyu@jp.fujitsu.com\u003e\nSigned-off-by: KOSAKI Motohiro \u003ckosaki.motohiro@jp.fujitsu.com\u003e\nSigned-off-by: Andrew Morton \u003cakpm@linux-foundation.org\u003e\nSigned-off-by: Linus Torvalds \u003ctorvalds@linux-foundation.org\u003e\n"
    },
    {
      "commit": "88f5acf88ae6a9778f6d25d0d5d7ec2d57764a97",
      "tree": "6f39beef8cf918eb2ca9f64ae1bcd1ea79ca487a",
      "parents": [
        "43bb40c9e3aa51a3b038c9df2c9afb4d4685614d"
      ],
      "author": {
        "name": "Mel Gorman",
        "email": "mel@csn.ul.ie",
        "time": "Thu Jan 13 15:45:41 2011 -0800"
      },
      "committer": {
        "name": "Linus Torvalds",
        "email": "torvalds@linux-foundation.org",
        "time": "Thu Jan 13 17:32:31 2011 -0800"
      },
      "message": "mm: page allocator: adjust the per-cpu counter threshold when memory is low\n\nCommit aa45484 (\"calculate a better estimate of NR_FREE_PAGES when memory\nis low\") noted that watermarks were based on the vmstat NR_FREE_PAGES.  To\navoid synchronization overhead, these counters are maintained on a per-cpu\nbasis and drained both periodically and when a threshold is above a\nthreshold.  On large CPU systems, the difference between the estimate and\nreal value of NR_FREE_PAGES can be very high.  The system can get into a\ncase where pages are allocated far below the min watermark potentially\ncausing livelock issues.  The commit solved the problem by taking a better\nreading of NR_FREE_PAGES when memory was low.\n\nUnfortately, as reported by Shaohua Li this accurate reading can consume a\nlarge amount of CPU time on systems with many sockets due to cache line\nbouncing.  This patch takes a different approach.  For large machines\nwhere counter drift might be unsafe and while kswapd is awake, the per-cpu\nthresholds for the target pgdat are reduced to limit the level of drift to\nwhat should be a safe level.  This incurs a performance penalty in heavy\nmemory pressure by a factor that depends on the workload and the machine\nbut the machine should function correctly without accidentally exhausting\nall memory on a node.  There is an additional cost when kswapd wakes and\nsleeps but the event is not expected to be frequent - in Shaohua\u0027s test\ncase, there was one recorded sleep and wake event at least.\n\nTo ensure that kswapd wakes up, a safe version of zone_watermark_ok() is\nintroduced that takes a more accurate reading of NR_FREE_PAGES when called\nfrom wakeup_kswapd, when deciding whether it is really safe to go back to\nsleep in sleeping_prematurely() and when deciding if a zone is really\nbalanced or not in balance_pgdat().  
We are still using an expensive\nfunction but limiting how often it is called.\n\nWhen the test case is reproduced, the time spent in the watermark\nfunctions is reduced.  The following report is on the percentage of time\nspent cumulatively in the functions zone_nr_free_pages(),\nzone_watermark_ok(), __zone_watermark_ok(), zone_watermark_ok_safe(),\nzone_page_state_snapshot(), zone_page_state().\n\nvanilla                      11.6615%\ndisable-threshold            0.2584%\n\nDavid said:\n\n: We had to pull aa454840 \"mm: page allocator: calculate a better estimate\n: of NR_FREE_PAGES when memory is low and kswapd is awake\" from 2.6.36\n: internally because tests showed that it would cause the machine to stall\n: as the result of heavy kswapd activity.  I merged it back with this fix as\n: it is pending in the -mm tree and it solves the issue we were seeing, so I\n: definitely think this should be pushed to -stable (and I would seriously\n: consider it for 2.6.37 inclusion even at this late date).\n\nSigned-off-by: Mel Gorman \u003cmel@csn.ul.ie\u003e\nReported-by: Shaohua Li \u003cshaohua.li@intel.com\u003e\nReviewed-by: Christoph Lameter \u003ccl@linux.com\u003e\nTested-by: Nicolas Bareil \u003cnico@chdir.org\u003e\nCc: David Rientjes \u003crientjes@google.com\u003e\nCc: Kyle McMartin \u003ckyle@mcmartin.ca\u003e\nCc: \u003cstable@kernel.org\u003e\t\t[2.6.37.1, 2.6.36.x]\nSigned-off-by: Andrew Morton \u003cakpm@linux-foundation.org\u003e\nSigned-off-by: Linus Torvalds \u003ctorvalds@linux-foundation.org\u003e\n"
    },
    {
      "commit": "6072d13c429373c5d63b69dadbbef40a9b035552",
      "tree": "a2bf745efaa4092f2a8d7d9a9b160c2a7a3b303f",
      "parents": [
        "0aded708d125a3ff7e5abaea9c2d9c6d7ebbfdcd"
      ],
      "author": {
        "name": "Linus Torvalds",
        "email": "torvalds@linux-foundation.org",
        "time": "Wed Dec 01 13:35:19 2010 -0500"
      },
      "committer": {
        "name": "Trond Myklebust",
        "email": "Trond.Myklebust@netapp.com",
        "time": "Thu Dec 02 09:55:21 2010 -0500"
      },
      "message": "Call the filesystem back whenever a page is removed from the page cache\n\nNFS needs to be able to release objects that are stored in the page\ncache once the page itself is no longer visible from the page cache.\n\nThis patch adds a callback to the address space operations that allows\nfilesystems to perform page cleanups once the page has been removed\nfrom the page cache.\n\nOriginal patch by: Linus Torvalds \u003ctorvalds@linux-foundation.org\u003e\n[trondmy: cover the cases of invalidate_inode_pages2() and\n          truncate_inode_pages()]\nSigned-off-by: Trond Myklebust \u003cTrond.Myklebust@netapp.com\u003e\n"
    },
    {
      "commit": "1dce071e18b7264457d17c0dec4c7e430bfaee7d",
      "tree": "ced52f7f8e4177f9ea37f891f4d33d0a5109e651",
      "parents": [
        "38715258aa2e8cd94bd4aafadc544e5104efd551"
      ],
      "author": {
        "name": "Shaohua Li",
        "email": "shaohua.li@intel.com",
        "time": "Thu Nov 11 14:05:17 2010 -0800"
      },
      "committer": {
        "name": "Linus Torvalds",
        "email": "torvalds@linux-foundation.org",
        "time": "Fri Nov 12 07:55:31 2010 -0800"
      },
      "message": "vmscan: avoid setting zone congested if no page dirty\n\nnr_dirty and nr_congested are increased only when the page is dirty.  So\nif all pages are clean, both them will be zero.  In this case, we should\nnot mark the zone congested.\n\nSigned-off-by: Shaohua Li \u003cshaohua.li@intel.com\u003e\nReviewed-by: Johannes Weiner \u003channes@cmpxchg.org\u003e\nReviewed-by: Minchan Kim \u003cminchan.kim@gmail.com\u003e\nAcked-by: Mel Gorman \u003cmel@csn.ul.ie\u003e\nSigned-off-by: Andrew Morton \u003cakpm@linux-foundation.org\u003e\nSigned-off-by: Linus Torvalds \u003ctorvalds@linux-foundation.org\u003e\n"
    },
    {
      "commit": "2e30244a7cc1ff09013a1238d415b4076406388e",
      "tree": "f555a78df877bbbd300ba3ebce6e31b5609e965f",
      "parents": [
        "4cbec4c8b9fda9ec784086fe7f74cd32a8adda95"
      ],
      "author": {
        "name": "KOSAKI Motohiro",
        "email": "kosaki.motohiro@jp.fujitsu.com",
        "time": "Tue Oct 26 14:21:46 2010 -0700"
      },
      "committer": {
        "name": "Linus Torvalds",
        "email": "torvalds@linux-foundation.org",
        "time": "Tue Oct 26 16:52:08 2010 -0700"
      },
      "message": "vmscan,tmpfs: treat used once pages on tmpfs as used once\n\nWhen a page has PG_referenced, shrink_page_list() discards it only if it\nis not dirty.  This rule works fine if the backing filesystem is a regular\none.  PG_dirty is a good signal that the page was used recently because\nthe flusher threads clean pages periodically.  In addition, page writeback\nis costlier than simple page discard.\n\nHowever, when a page is on tmpfs this heuristic doesn\u0027t work because\nflusher threads don\u0027t write back tmpfs pages.  Consequently tmpfs pages\nalways rotate around the lru twice at least and adds unnecessary lru\nchurn.  Simple tmpfs streaming io shouldn\u0027t cause large anonymous page\nswap-out.\n\nRemove this unncessary reclaim bonus of tmpfs pages.\n\nSigned-off-by: KOSAKI Motohiro \u003ckosaki.motohiro@jp.fujitsu.com\u003e\nCc: Hugh Dickins \u003chughd@google.com\u003e\nReviewed-by: Johannes Weiner \u003channes@cmpxchg.org\u003e\nReviewed-by: Rik van Riel \u003criel@redhat.com\u003e\nCc: Minchan Kim \u003cminchan.kim@gmail.com\u003e\nSigned-off-by: Andrew Morton \u003cakpm@linux-foundation.org\u003e\nSigned-off-by: Linus Torvalds \u003ctorvalds@linux-foundation.org\u003e\n"
    },
    {
      "commit": "0e093d99763eb4cea09f8ca4f1d01f34e121d10b",
      "tree": "fad38f9c3651c81db298521141a79d9468f71986",
      "parents": [
        "08fc468f4eaf6683bae5bdb94743a09d8630cb80"
      ],
      "author": {
        "name": "Mel Gorman",
        "email": "mel@csn.ul.ie",
        "time": "Tue Oct 26 14:21:45 2010 -0700"
      },
      "committer": {
        "name": "Linus Torvalds",
        "email": "torvalds@linux-foundation.org",
        "time": "Tue Oct 26 16:52:07 2010 -0700"
      },
      "message": "writeback: do not sleep on the congestion queue if there are no congested BDIs or if significant congestion is not being encountered in the current zone\n\nIf congestion_wait() is called with no BDI congested, the caller will\nsleep for the full timeout and this may be an unnecessary sleep.  This\npatch adds a wait_iff_congested() that checks congestion and only sleeps\nif a BDI is congested else, it calls cond_resched() to ensure the caller\nis not hogging the CPU longer than its quota but otherwise will not sleep.\n\nThis is aimed at reducing some of the major desktop stalls reported during\nIO.  For example, while kswapd is operating, it calls congestion_wait()\nbut it could just have been reclaiming clean page cache pages with no\ncongestion.  Without this patch, it would sleep for a full timeout but\nafter this patch, it\u0027ll just call schedule() if it has been on the CPU too\nlong.  Similar logic applies to direct reclaimers that are not making\nenough progress.\n\nSigned-off-by: Mel Gorman \u003cmel@csn.ul.ie\u003e\nCc: Johannes Weiner \u003channes@cmpxchg.org\u003e\nCc: Minchan Kim \u003cminchan.kim@gmail.com\u003e\nCc: Wu Fengguang \u003cfengguang.wu@intel.com\u003e\nCc: KAMEZAWA Hiroyuki \u003ckamezawa.hiroyu@jp.fujitsu.com\u003e\nCc: KOSAKI Motohiro \u003ckosaki.motohiro@jp.fujitsu.com\u003e\nCc: Rik van Riel \u003criel@redhat.com\u003e\nCc: Jens Axboe \u003caxboe@kernel.dk\u003e\nSigned-off-by: Andrew Morton \u003cakpm@linux-foundation.org\u003e\nSigned-off-by: Linus Torvalds \u003ctorvalds@linux-foundation.org\u003e\n"
    },
    {
      "commit": "08fc468f4eaf6683bae5bdb94743a09d8630cb80",
      "tree": "a2225421eb8e01a8e9df588f5064be81059af91a",
      "parents": [
        "47185052165a4c5de0a461018238375dd982c2ec"
      ],
      "author": {
        "name": "KOSAKI Motohiro",
        "email": "kosaki.motohiro@jp.fujitsu.com",
        "time": "Tue Oct 26 14:21:44 2010 -0700"
      },
      "committer": {
        "name": "Linus Torvalds",
        "email": "torvalds@linux-foundation.org",
        "time": "Tue Oct 26 16:52:07 2010 -0700"
      },
      "message": "vmscan: isolate_lru_pages(): stop neighbour search if neighbour cannot be isolated\n\nisolate_lru_pages() does not just isolate LRU tail pages, but also\nisolates neighbour pages of the eviction page.  The neighbour search does\nnot stop even if neighbours cannot be isolated which is excessive as the\nlumpy reclaim will no longer result in a successful higher order\nallocation.  This patch stops the PFN neighbour pages if an isolation\nfails and moves on to the next block.\n\nSigned-off-by: KOSAKI Motohiro \u003ckosaki.motohiro@jp.fujitsu.com\u003e\nSigned-off-by: Mel Gorman \u003cmel@csn.ul.ie\u003e\nReviewed-by: Wu Fengguang \u003cfengguang.wu@intel.com\u003e\nReviewed-by: Minchan Kim \u003cminchan.kim@gmail.com\u003e\nReviewed-by: KAMEZAWA Hiroyuki \u003ckamezawa.hiroyu@jp.fujitsu.com\u003e\nCc: Johannes Weiner \u003channes@cmpxchg.org\u003e\nCc: Rik van Riel \u003criel@redhat.com\u003e\nSigned-off-by: Andrew Morton \u003cakpm@linux-foundation.org\u003e\nSigned-off-by: Linus Torvalds \u003ctorvalds@linux-foundation.org\u003e\n"
    },
    {
      "commit": "47185052165a4c5de0a461018238375dd982c2ec",
      "tree": "4425005eb24efdd66d0d9d4f95a11fdad04843b1",
      "parents": [
        "7d3579e8e61937cbba268ea9b218d006b6d64221"
      ],
      "author": {
        "name": "KOSAKI Motohiro",
        "email": "kosaki.motohiro@jp.fujitsu.com",
        "time": "Tue Oct 26 14:21:43 2010 -0700"
      },
      "committer": {
        "name": "Linus Torvalds",
        "email": "torvalds@linux-foundation.org",
        "time": "Tue Oct 26 16:52:07 2010 -0700"
      },
      "message": "vmscan: remove dead code in shrink_inactive_list()\n\nAfter synchrounous lumpy reclaim, the page_list is guaranteed to not have\nactive pages as page activation in shrink_page_list() disables lumpy\nreclaim.  Remove the dead code.\n\nSigned-off-by: KOSAKI Motohiro \u003ckosaki.motohiro@jp.fujitsu.com\u003e\nSigned-off-by: Mel Gorman \u003cmel@csn.ul.ie\u003e\nReviewed-by: Minchan Kim \u003cminchan.kim@gmail.com\u003e\nCc: Johannes Weiner \u003channes@cmpxchg.org\u003e\nCc: Wu Fengguang \u003cfengguang.wu@intel.com\u003e\nCc: Rik van Riel \u003criel@redhat.com\u003e\nCc: KAMEZAWA Hiroyuki \u003ckamezawa.hiroyu@jp.fujitsu.com\u003e\nSigned-off-by: Andrew Morton \u003cakpm@linux-foundation.org\u003e\nSigned-off-by: Linus Torvalds \u003ctorvalds@linux-foundation.org\u003e\n"
    },
    {
      "commit": "7d3579e8e61937cbba268ea9b218d006b6d64221",
      "tree": "4fa1863641343eee551681d60a823a84a2611289",
      "parents": [
        "bc57e00f5e0b2480ef222c775c49552d3a930db7"
      ],
      "author": {
        "name": "KOSAKI Motohiro",
        "email": "kosaki.motohiro@jp.fujitsu.com",
        "time": "Tue Oct 26 14:21:42 2010 -0700"
      },
      "committer": {
        "name": "Linus Torvalds",
        "email": "torvalds@linux-foundation.org",
        "time": "Tue Oct 26 16:52:07 2010 -0700"
      },
      "message": "vmscan: narrow the scenarios in whcih lumpy reclaim uses synchrounous reclaim\n\nshrink_page_list() can decide to give up reclaiming a page under a\nnumber of conditions such as\n\n  1. trylock_page() failure\n  2. page is unevictable\n  3. zone reclaim and page is mapped\n  4. PageWriteback() is true\n  5. page is swapbacked and swap is full\n  6. add_to_swap() failure\n  7. page is dirty and gfpmask don\u0027t have GFP_IO, GFP_FS\n  8. page is pinned\n  9. IO queue is congested\n 10. pageout() start IO, but not finished\n\nWith lumpy reclaim, failures result in entering synchronous lumpy reclaim\nbut this can be unnecessary.  In cases (2), (3), (5), (6), (7) and (8),\nthere is no point retrying.  This patch causes lumpy reclaim to abort when\nit is known it will fail.\n\nCase (9) is more interesting. current behavior is,\n  1. start shrink_page_list(async)\n  2. found queue_congested()\n  3. skip pageout write\n  4. still start shrink_page_list(sync)\n  5. wait on a lot of pages\n  6. again, found queue_congested()\n  7. give up pageout write again\n\nSo, it\u0027s useless time wasting.  
However, just skipping page reclaim is\nalso not good as x86 allocating a huge page needs 512 pages for example.\nIt can have more dirty pages than queue congestion threshold (~\u003d128).\n\nAfter this patch, pageout() behaves as follows;\n\n - If order \u003e PAGE_ALLOC_COSTLY_ORDER\n\tIgnore queue congestion always.\n - If order \u003c\u003d PAGE_ALLOC_COSTLY_ORDER\n\tskip write page and disable lumpy reclaim.\n\nSigned-off-by: KOSAKI Motohiro \u003ckosaki.motohiro@jp.fujitsu.com\u003e\nSigned-off-by: Mel Gorman \u003cmel@csn.ul.ie\u003e\nReviewed-by: KAMEZAWA Hiroyuki \u003ckamezawa.hiroyu@jp.fujitsu.com\u003e\nCc: Johannes Weiner \u003channes@cmpxchg.org\u003e\nCc: Minchan Kim \u003cminchan.kim@gmail.com\u003e\nCc: Wu Fengguang \u003cfengguang.wu@intel.com\u003e\nCc: Rik van Riel \u003criel@redhat.com\u003e\nSigned-off-by: Andrew Morton \u003cakpm@linux-foundation.org\u003e\nSigned-off-by: Linus Torvalds \u003ctorvalds@linux-foundation.org\u003e\n"
    },
    {
      "commit": "bc57e00f5e0b2480ef222c775c49552d3a930db7",
      "tree": "51aa33378602a41fb73b9b2fbee2ca04706aa9d6",
      "parents": [
        "52bb9198668968506f9d12bf35d7f5d3f094921e"
      ],
      "author": {
        "name": "KOSAKI Motohiro",
        "email": "kosaki.motohiro@jp.fujitsu.com",
        "time": "Tue Oct 26 14:21:41 2010 -0700"
      },
      "committer": {
        "name": "Linus Torvalds",
        "email": "torvalds@linux-foundation.org",
        "time": "Tue Oct 26 16:52:07 2010 -0700"
      },
      "message": "vmscan: synchronous lumpy reclaim should not call congestion_wait()\n\ncongestion_wait() means \"wait until queue congestion is cleared\".\nHowever, synchronous lumpy reclaim does not need this congestion_wait() as\nshrink_page_list(PAGEOUT_IO_SYNC) uses wait_on_page_writeback() and it\nprovides the necessary waiting.\n\nSigned-off-by: KOSAKI Motohiro \u003ckosaki.motohiro@jp.fujitsu.com\u003e\nSigned-off-by: Mel Gorman \u003cmel@csn.ul.ie\u003e\nReviewed-by: Minchan Kim \u003cminchan.kim@gmail.com\u003e\nReviewed-by: Johannes Weiner \u003channes@cmpxchg.org\u003e\nReviewed-by: Wu Fengguang \u003cfengguang.wu@intel.com\u003e\nReviewed-by: KAMEZAWA Hiroyuki \u003ckamezawa.hiroyu@jp.fujitsu.com\u003e\nCc: Rik van Riel \u003criel@redhat.com\u003e\nSigned-off-by: Andrew Morton \u003cakpm@linux-foundation.org\u003e\nSigned-off-by: Linus Torvalds \u003ctorvalds@linux-foundation.org\u003e\n"
    },
    {
      "commit": "e11da5b4fdf01d71d73c21cb92b00595b917d7fd",
      "tree": "30da286bac7533fba5c119396491ab05a92471fd",
      "parents": [
        "66d9a986cddbbc2ea5db013e7999c621a956cc47"
      ],
      "author": {
        "name": "Mel Gorman",
        "email": "mel@csn.ul.ie",
        "time": "Tue Oct 26 14:21:40 2010 -0700"
      },
      "committer": {
        "name": "Linus Torvalds",
        "email": "torvalds@linux-foundation.org",
        "time": "Tue Oct 26 16:52:07 2010 -0700"
      },
      "message": "tracing, vmscan: add trace events for LRU list shrinking\n\nThere have been numerous reports of stalls that pointed at the problem\nbeing somewhere in the VM.  There are multiple roots to the problems which\nmeans dealing with any of the root problems in isolation is tricky to\njustify on their own and they would still need integration testing.  This\npatch series puts together two different patch sets which in combination\nshould tackle some of the root causes of latency problems being reported.\n\nPatch 1 adds a tracepoint for shrink_inactive_list.  For this series, the\nmost important results is being able to calculate the scanning/reclaim\nratio as a measure of the amount of work being done by page reclaim.\n\nPatch 2 accounts for time spent in congestion_wait.\n\nPatches 3-6 were originally developed by Kosaki Motohiro but reworked for\nthis series.  It has been noted that lumpy reclaim is far too aggressive\nand trashes the system somewhat.  As SLUB uses high-order allocations, a\nlarge cost incurred by lumpy reclaim will be noticeable.  It was also\nreported during transparent hugepage support testing that lumpy reclaim\nwas trashing the system and these patches should mitigate that problem\nwithout disabling lumpy reclaim.\n\nPatch 7 adds wait_iff_congested() and replaces some callers of\ncongestion_wait().  wait_iff_congested() only sleeps if there is a BDI\nthat is currently congested.  Patch 8 notes that any BDI being congested\nis not necessarily a problem because there could be multiple BDIs of\nvarying speeds and numberous zones.  It attempts to track when a zone\nbeing reclaimed contains many pages backed by a congested BDI and if so,\nreclaimers wait on the congestion queue.\n\nI ran a number of tests with monitoring on X86, X86-64 and PPC64. Each\nmachine had 3G of RAM and the CPUs were\n\nX86:    Intel P4 2-core\nX86-64: AMD Phenom 4-core\nPPC64:  PPC970MP\n\nEach used a single disk and the onboard IO controller.  
Dirty ratio was\nleft at 20.  I\u0027m just going to report for X86-64 and PPC64 in a vague\nattempt to keep this report short.  Four kernels were tested each based on\nv2.6.36-rc4\n\ntraceonly-v2r2:     Patches 1 and 2 to instrument vmscan reclaims and congestion_wait\nlowlumpy-v2r3:      Patches 1-6 to test if lumpy reclaim is better\nwaitcongest-v2r3:   Patches 1-7 to only wait on congestion\nwaitwriteback-v2r4: Patches 1-8 to detect when a zone is congested\n\nnocongest-v1r5: Patches 1-3 for testing wait_iff_congestion\nnodirect-v1r5:  Patches 1-10 to disable filesystem writeback for better IO\n\nThe tests run were as follows\n\nkernbench\n\tcompile-based benchmark. Smoke test performance\n\nsysbench\n\tOLTP read-only benchmark. Will be re-run in the future as read-write\n\nmicro-mapped-file-stream\n\tThis is a micro-benchmark from Johannes Weiner that accesses a\n\tlarge sparse-file through mmap(). It was configured to run in only\n\tsingle-CPU mode but can be indicative of how well page reclaim\n\tidentifies suitable pages.\n\nstress-highalloc\n\tTries to allocate huge pages under heavy load.\n\nkernbench, iozone and sysbench did not report any performance regression\non any machine.  
sysbench did pressure the system lightly and there was\nreclaim activity but there were no difference of major interest between\nthe kernels.\n\nX86-64 micro-mapped-file-stream\n\n                                      traceonly-v2r2           lowlumpy-v2r3        waitcongest-v2r3     waitwriteback-v2r4\npgalloc_dma                       1639.00 (   0.00%)       667.00 (-145.73%)      1167.00 ( -40.45%)       578.00 (-183.56%)\npgalloc_dma32                  2842410.00 (   0.00%)   2842626.00 (   0.01%)   2843043.00 (   0.02%)   2843014.00 (   0.02%)\npgalloc_normal                       0.00 (   0.00%)         0.00 (   0.00%)         0.00 (   0.00%)         0.00 (   0.00%)\npgsteal_dma                        729.00 (   0.00%)        85.00 (-757.65%)       609.00 ( -19.70%)       125.00 (-483.20%)\npgsteal_dma32                  2338721.00 (   0.00%)   2447354.00 (   4.44%)   2429536.00 (   3.74%)   2436772.00 (   4.02%)\npgsteal_normal                       0.00 (   0.00%)         0.00 (   0.00%)         0.00 (   0.00%)         0.00 (   0.00%)\npgscan_kswapd_dma                 1469.00 (   0.00%)       532.00 (-176.13%)      1078.00 ( -36.27%)       220.00 (-567.73%)\npgscan_kswapd_dma32            4597713.00 (   0.00%)   4503597.00 (  -2.09%)   4295673.00 (  -7.03%)   3891686.00 ( -18.14%)\npgscan_kswapd_normal                 0.00 (   0.00%)         0.00 (   0.00%)         0.00 (   0.00%)         0.00 (   0.00%)\npgscan_direct_dma                   71.00 (   0.00%)       134.00 (  47.01%)       243.00 (  70.78%)       352.00 (  79.83%)\npgscan_direct_dma32             305820.00 (   0.00%)    280204.00 (  -9.14%)    600518.00 (  49.07%)    957485.00 (  68.06%)\npgscan_direct_normal                 0.00 (   0.00%)         0.00 (   0.00%)         0.00 (   0.00%)         0.00 (   0.00%)\npageoutrun                       16296.00 (   0.00%)     21254.00 (  23.33%)     18447.00 (  11.66%)     20067.00 (  18.79%)\nallocstall                         443.00 (   0.00%)     
  273.00 ( -62.27%)       513.00 (  13.65%)      1568.00 (  71.75%)\n\nThese are based on the raw figures taken from /proc/vmstat.  It\u0027s a rough\nmeasure of reclaim activity.  Note that allocstall counts are higher\nbecause we are entering direct reclaim more often as a result of not\nsleeping in congestion.  In itself, it\u0027s not necessarily a bad thing.\nIt\u0027s easier to get a view of what happened from the vmscan tracepoint\nreport.\n\nFTrace Reclaim Statistics: vmscan\n\n                                traceonly-v2r2   lowlumpy-v2r3 waitcongest-v2r3 waitwriteback-v2r4\nDirect reclaims                                443        273        513       1568\nDirect reclaim pages scanned                305968     280402     600825     957933\nDirect reclaim pages reclaimed               43503      19005      30327     117191\nDirect reclaim write file async I/O              0          0          0          0\nDirect reclaim write anon async I/O              0          3          4         12\nDirect reclaim write file sync I/O               0          0          0          0\nDirect reclaim write anon sync I/O               0          0          0          0\nWake kswapd requests                        187649     132338     191695     267701\nKswapd wakeups                                   3          1          4          1\nKswapd pages scanned                       4599269    4454162    4296815    3891906\nKswapd pages reclaimed                     2295947    2428434    2399818    2319706\nKswapd reclaim write file async I/O              1          0          1          1\nKswapd reclaim write anon async I/O             59        187         41        222\nKswapd reclaim write file sync I/O               0          0          0          0\nKswapd reclaim write anon sync I/O               0          0          0          0\nTime stalled direct reclaim (seconds)         4.34       2.52       6.63       2.96\nTime kswapd awake (seconds)                  
11.15      10.25      11.01      10.19\n\nTotal pages scanned                        4905237   4734564   4897640   4849839\nTotal pages reclaimed                      2339450   2447439   2430145   2436897\n%age total pages scanned/reclaimed          47.69%    51.69%    49.62%    50.25%\n%age total pages scanned/written             0.00%     0.00%     0.00%     0.00%\n%age  file pages scanned/written             0.00%     0.00%     0.00%     0.00%\nPercentage Time Spent Direct Reclaim        29.23%    19.02%    38.48%    20.25%\nPercentage Time kswapd Awake                78.58%    78.85%    76.83%    79.86%\n\nWhat is interesting here for nocongest in particular is that while direct\nreclaim scans more pages, the overall number of pages scanned remains the\nsame and the ratio of pages scanned to pages reclaimed is more or less the\nsame.  In other words, while we are sleeping less, reclaim is not doing\nmore work and as direct reclaim and kswapd is awake for less time, it\nwould appear to be doing less work.\n\nFTrace Reclaim Statistics: congestion_wait\nDirect number congest     waited                87        196         64          0\nDirect time   congest     waited            4604ms     4732ms     5420ms        0ms\nDirect full   congest     waited                72        145         53          0\nDirect number conditional waited                 0          0        324       1315\nDirect time   conditional waited               0ms        0ms        0ms        0ms\nDirect full   conditional waited                 0          0          0          0\nKSwapd number congest     waited                20         10         15          7\nKSwapd time   congest     waited            1264ms      536ms      884ms      284ms\nKSwapd full   congest     waited                10          4          6          2\nKSwapd number conditional waited                 0          0          0          0\nKSwapd time   conditional waited               0ms        0ms        0ms       
 0ms\nKSwapd full   conditional waited                 0          0          0          0\n\nThe vanilla kernel spent 8 seconds asleep in direct reclaim and no time at\nall asleep with the patches.\n\nMMTests Statistics: duration\nUser/Sys Time Running Test (seconds)         10.51     10.73      10.6     11.66\nTotal Elapsed Time (seconds)                 14.19     13.00     14.33     12.76\n\nOverall, the tests completed faster. It is interesting to note that backing off further\nwhen a zone is congested and not just a BDI was more efficient overall.\n\nPPC64 micro-mapped-file-stream\npgalloc_dma                    3024660.00 (   0.00%)   3027185.00 (   0.08%)   3025845.00 (   0.04%)   3026281.00 (   0.05%)\npgalloc_normal                       0.00 (   0.00%)         0.00 (   0.00%)         0.00 (   0.00%)         0.00 (   0.00%)\npgsteal_dma                    2508073.00 (   0.00%)   2565351.00 (   2.23%)   2463577.00 (  -1.81%)   2532263.00 (   0.96%)\npgsteal_normal                       0.00 (   0.00%)         0.00 (   0.00%)         0.00 (   0.00%)         0.00 (   0.00%)\npgscan_kswapd_dma              4601307.00 (   0.00%)   4128076.00 ( -11.46%)   3912317.00 ( -17.61%)   3377165.00 ( -36.25%)\npgscan_kswapd_normal                 0.00 (   0.00%)         0.00 (   0.00%)         0.00 (   0.00%)         0.00 (   0.00%)\npgscan_direct_dma               629825.00 (   0.00%)    971622.00 (  35.18%)   1063938.00 (  40.80%)   1711935.00 (  63.21%)\npgscan_direct_normal                 0.00 (   0.00%)         0.00 (   0.00%)         0.00 (   0.00%)         0.00 (   0.00%)\npageoutrun                       27776.00 (   0.00%)     20458.00 ( -35.77%)     18763.00 ( -48.04%)     18157.00 ( -52.98%)\nallocstall                         977.00 (   0.00%)      2751.00 (  64.49%)      2098.00 (  53.43%)      5136.00 (  80.98%)\n\nSimilar trends to x86-64. 
allocstalls are up but it\u0027s not necessarily bad.\n\nFTrace Reclaim Statistics: vmscan\nDirect reclaims                                977       2709       2098       5136\nDirect reclaim pages scanned                629825     963814    1063938    1711935\nDirect reclaim pages reclaimed               75550     242538     150904     387647\nDirect reclaim write file async I/O              0          0          0          2\nDirect reclaim write anon async I/O              0         10          0          4\nDirect reclaim write file sync I/O               0          0          0          0\nDirect reclaim write anon sync I/O               0          0          0          0\nWake kswapd requests                        392119    1201712     571935     571921\nKswapd wakeups                                   3          2          3          3\nKswapd pages scanned                       4601307    4128076    3912317    3377165\nKswapd pages reclaimed                     2432523    2318797    2312673    2144616\nKswapd reclaim write file async I/O             20          1          1          1\nKswapd reclaim write anon async I/O             57        132         11        121\nKswapd reclaim write file sync I/O               0          0          0          0\nKswapd reclaim write anon sync I/O               0          0          0          0\nTime stalled direct reclaim (seconds)         6.19       7.30      13.04      10.88\nTime kswapd awake (seconds)                  21.73      26.51      25.55      23.90\n\nTotal pages scanned                        5231132   5091890   4976255   5089100\nTotal pages reclaimed                      2508073   2561335   2463577   2532263\n%age total pages scanned/reclaimed          47.95%    50.30%    49.51%    49.76%\n%age total pages scanned/written             0.00%     0.00%     0.00%     0.00%\n%age  file pages scanned/written             0.00%     0.00%     0.00%     0.00%\nPercentage Time Spent Direct Reclaim        18.89% 
   20.65%    32.65%    27.65%\nPercentage Time kswapd Awake                72.39%    80.68%    78.21%    77.40%\n\nAgain, a similar trend that the congestion_wait changes mean that direct\nreclaim scans more pages but the overall number of pages scanned while\nslightly reduced, are very similar.  The ratio of scanning/reclaimed\nremains roughly similar.  The downside is that kswapd and direct reclaim\nwas awake longer and for a larger percentage of the overall workload.\nIt\u0027s possible there were big differences in the amount of time spent\nreclaiming slab pages between the different kernels which is plausible\nconsidering that the micro tests runs after fsmark and sysbench.\n\nTrace Reclaim Statistics: congestion_wait\nDirect number congest     waited               845       1312        104          0\nDirect time   congest     waited           19416ms    26560ms     7544ms        0ms\nDirect full   congest     waited               745       1105         72          0\nDirect number conditional waited                 0          0       1322       2935\nDirect time   conditional waited               0ms        0ms       12ms      312ms\nDirect full   conditional waited                 0          0          0          3\nKSwapd number congest     waited                39        102         75         63\nKSwapd time   congest     waited            2484ms     6760ms     5756ms     3716ms\nKSwapd full   congest     waited                20         48         46         25\nKSwapd number conditional waited                 0          0          0          0\nKSwapd time   conditional waited               0ms        0ms        0ms        0ms\nKSwapd full   conditional waited                 0          0          0          0\n\nThe vanilla kernel spent 20 seconds asleep in direct reclaim and only\n312ms asleep with the patches.  
The time kswapd spent congest waited was\nalso reduced by a large factor.\n\nMMTests Statistics: duration\nser/Sys Time Running Test (seconds)         26.58     28.05      26.9     28.47\nTotal Elapsed Time (seconds)                 30.02     32.86     32.67     30.88\n\nWith all patches applies, the completion times are very similar.\n\nX86-64 STRESS-HIGHALLOC\n                traceonly-v2r2     lowlumpy-v2r3  waitcongest-v2r3waitwriteback-v2r4\nPass 1          82.00 ( 0.00%)    84.00 ( 2.00%)    85.00 ( 3.00%)    85.00 ( 3.00%)\nPass 2          90.00 ( 0.00%)    87.00 (-3.00%)    88.00 (-2.00%)    89.00 (-1.00%)\nAt Rest         92.00 ( 0.00%)    90.00 (-2.00%)    90.00 (-2.00%)    91.00 (-1.00%)\n\nSuccess figures across the board are broadly similar.\n\n                traceonly-v2r2     lowlumpy-v2r3  waitcongest-v2r3waitwriteback-v2r4\nDirect reclaims                               1045        944        886        887\nDirect reclaim pages scanned                135091     119604     109382     101019\nDirect reclaim pages reclaimed               88599      47535      47863      46671\nDirect reclaim write file async I/O            494        283        465        280\nDirect reclaim write anon async I/O          29357      13710      16656      13462\nDirect reclaim write file sync I/O             154          2          2          3\nDirect reclaim write anon sync I/O           14594        571        509        561\nWake kswapd requests                          7491        933        872        892\nKswapd wakeups                                 814        778        731        780\nKswapd pages scanned                       7290822   15341158   11916436   13703442\nKswapd pages reclaimed                     3587336    3142496    3094392    3187151\nKswapd reclaim write file async I/O          91975      32317      28022      29628\nKswapd reclaim write anon async I/O        1992022     789307     829745     849769\nKswapd reclaim write file sync I/O       
        0          0          0          0\nKswapd reclaim write anon sync I/O               0          0          0          0\nTime stalled direct reclaim (seconds)      4588.93    2467.16    2495.41    2547.07\nTime kswapd awake (seconds)                2497.66    1020.16    1098.06    1176.82\n\nTotal pages scanned                        7425913  15460762  12025818  13804461\nTotal pages reclaimed                      3675935   3190031   3142255   3233822\n%age total pages scanned/reclaimed          49.50%    20.63%    26.13%    23.43%\n%age total pages scanned/written            28.66%     5.41%     7.28%     6.47%\n%age  file pages scanned/written             1.25%     0.21%     0.24%     0.22%\nPercentage Time Spent Direct Reclaim        57.33%    42.15%    42.41%    42.99%\nPercentage Time kswapd Awake                43.56%    27.87%    29.76%    31.25%\n\nScanned/reclaimed ratios again look good with big improvements in\nefficiency.  The Scanned/written ratios also look much improved.  
With a\nbetter scanned/written ration, there is an expectation that IO would be\nmore efficient and indeed, the time spent in direct reclaim is much\nreduced by the full series and kswapd spends a little less time awake.\n\nOverall, indications here are that allocations were happening much faster\nand this can be seen with a graph of the latency figures as the\nallocations were taking place\nhttp://www.csn.ul.ie/~mel/postings/vmscanreduce-20101509/highalloc-interlatency-hydra-mean.ps\n\nFTrace Reclaim Statistics: congestion_wait\nDirect number congest     waited              1333        204        169          4\nDirect time   congest     waited           78896ms     8288ms     7260ms      200ms\nDirect full   congest     waited               756         92         69          2\nDirect number conditional waited                 0          0         26        186\nDirect time   conditional waited               0ms        0ms        0ms     2504ms\nDirect full   conditional waited                 0          0          0         25\nKSwapd number congest     waited                 4        395        227        282\nKSwapd time   congest     waited             384ms    25136ms    10508ms    18380ms\nKSwapd full   congest     waited                 3        232         98        176\nKSwapd number conditional waited                 0          0          0          0\nKSwapd time   conditional waited               0ms        0ms        0ms        0ms\nKSwapd full   conditional waited                 0          0          0          0\nKSwapd full   conditional waited               318          0        312          9\n\nOverall, the time spent speeping is reduced.  kswapd is still hitting\ncongestion_wait() but that is because there are callers remaining where it\nwasn\u0027t clear in advance if they should be changed to wait_iff_congested()\nor not.  
Overall the sleep imes are reduced though - from 79ish seconds to\nabout 19.\n\nMMTests Statistics: duration\nUser/Sys Time Running Test (seconds)       3415.43   3386.65   3388.39    3377.5\nTotal Elapsed Time (seconds)               5733.48   3660.33   3689.41   3765.39\n\nWith the full series, the time to complete the tests are reduced by 30%\n\nPPC64 STRESS-HIGHALLOC\n                traceonly-v2r2     lowlumpy-v2r3  waitcongest-v2r3waitwriteback-v2r4\nPass 1          17.00 ( 0.00%)    34.00 (17.00%)    38.00 (21.00%)    43.00 (26.00%)\nPass 2          25.00 ( 0.00%)    37.00 (12.00%)    42.00 (17.00%)    46.00 (21.00%)\nAt Rest         49.00 ( 0.00%)    43.00 (-6.00%)    45.00 (-4.00%)    51.00 ( 2.00%)\n\nSuccess rates there are *way* up particularly considering that the 16MB\nhuge pages on PPC64 mean that it\u0027s always much harder to allocate them.\n\nFTrace Reclaim Statistics: vmscan\n              stress-highalloc  stress-highalloc  stress-highalloc  stress-highalloc\n                traceonly-v2r2     lowlumpy-v2r3  waitcongest-v2r3waitwriteback-v2r4\nDirect reclaims                                499        505        564        509\nDirect reclaim pages scanned                223478      41898      51818      45605\nDirect reclaim pages reclaimed              137730      21148      27161      23455\nDirect reclaim write file async I/O            399        136        162        136\nDirect reclaim write anon async I/O          46977       2865       4686       3998\nDirect reclaim write file sync I/O              29          0          1          3\nDirect reclaim write anon sync I/O           31023        159        237        239\nWake kswapd requests                           420        351        360        326\nKswapd wakeups                                 185        294        249        277\nKswapd pages scanned                      15703488   16392500   17821724   17598737\nKswapd pages reclaimed                     5808466    2908858    
3139386    3145435\nKswapd reclaim write file async I/O         159938      18400      18717      13473\nKswapd reclaim write anon async I/O        3467554     228957     322799     234278\nKswapd reclaim write file sync I/O               0          0          0          0\nKswapd reclaim write anon sync I/O               0          0          0          0\nTime stalled direct reclaim (seconds)      9665.35    1707.81    2374.32    1871.23\nTime kswapd awake (seconds)                9401.21    1367.86    1951.75    1328.88\n\nTotal pages scanned                       15926966  16434398  17873542  17644342\nTotal pages reclaimed                      5946196   2930006   3166547   3168890\n%age total pages scanned/reclaimed          37.33%    17.83%    17.72%    17.96%\n%age total pages scanned/written            23.27%     1.52%     1.94%     1.43%\n%age  file pages scanned/written             1.01%     0.11%     0.11%     0.08%\nPercentage Time Spent Direct Reclaim        44.55%    35.10%    41.42%    36.91%\nPercentage Time kswapd Awake                86.71%    43.58%    52.67%    41.14%\n\nWhile the scanning rates are slightly up, the scanned/reclaimed and\nscanned/written figures are much improved.  
The time spent in direct\nreclaim and with kswapd are massively reduced, mostly by the lowlumpy\npatches.\n\nFTrace Reclaim Statistics: congestion_wait\nDirect number congest     waited               725        303        126          3\nDirect time   congest     waited           45524ms     9180ms     5936ms      300ms\nDirect full   congest     waited               487        190         52          3\nDirect number conditional waited                 0          0        200        301\nDirect time   conditional waited               0ms        0ms        0ms     1904ms\nDirect full   conditional waited                 0          0          0         19\nKSwapd number congest     waited                 0          2         23          4\nKSwapd time   congest     waited               0ms      200ms      420ms      404ms\nKSwapd full   congest     waited                 0          2          2          4\nKSwapd number conditional waited                 0          0          0          0\nKSwapd time   conditional waited               0ms        0ms        0ms        0ms\nKSwapd full   conditional waited                 0          0          0          0\n\nNot as dramatic a story here but the time spent asleep is reduced and we\ncan still see what wait_iff_congested is going to sleep when necessary.\n\nMMTests Statistics: duration\nUser/Sys Time Running Test (seconds)      12028.09   3157.17   3357.79   3199.16\nTotal Elapsed Time (seconds)              10842.07   3138.72   3705.54   3229.85\n\nThe time to complete this test goes way down.  With the full series, we\nare allocating over twice the number of huge pages in 30% of the time and\nthere is a corresponding impact on the allocation latency graph available\nat.\n\nhttp://www.csn.ul.ie/~mel/postings/vmscanreduce-20101509/highalloc-interlatency-powyah-mean.ps\n\nThis patch:\n\nAdd a trace event for shrink_inactive_list() and updates the sample\npostprocessing script appropriately.  
It can be used to determine how many\npages were reclaimed and for non-lumpy reclaim where exactly the pages\nwere reclaimed from.\n\nSigned-off-by: Mel Gorman \u003cmel@csn.ul.ie\u003e\nCc: Johannes Weiner \u003channes@cmpxchg.org\u003e\nCc: Minchan Kim \u003cminchan.kim@gmail.com\u003e\nCc: Wu Fengguang \u003cfengguang.wu@intel.com\u003e\nCc: KAMEZAWA Hiroyuki \u003ckamezawa.hiroyu@jp.fujitsu.com\u003e\nCc: KOSAKI Motohiro \u003ckosaki.motohiro@jp.fujitsu.com\u003e\nCc: Rik van Riel \u003criel@redhat.com\u003e\nSigned-off-by: Andrew Morton \u003cakpm@linux-foundation.org\u003e\nSigned-off-by: Linus Torvalds \u003ctorvalds@linux-foundation.org\u003e\n"
    },
    {
      "commit": "66d9a986cddbbc2ea5db013e7999c621a956cc47",
      "tree": "bfe0d223e9b07c2300f445f9694525e956657d1e",
      "parents": [
        "bce54bbfde07e8b300f39dae14756c12a6ceca65"
      ],
      "author": {
        "name": "Shaohua Li",
        "email": "shaohua.li@intel.com",
        "time": "Tue Oct 26 14:21:37 2010 -0700"
      },
      "committer": {
        "name": "Linus Torvalds",
        "email": "torvalds@linux-foundation.org",
        "time": "Tue Oct 26 16:52:07 2010 -0700"
      },
      "message": "vmscan: delete dead code\n\n`priority\u0027 cannot be negative here.  And the comment is obsolete.\n\nSigned-off-by: Shaohua Li \u003cshaohua.li@intel.com\u003e\nReviewed-by: KOSAKI Motohiro \u003ckosaki.motohiro@jp.fujitsu.com\u003e\nSigned-off-by: Andrew Morton \u003cakpm@linux-foundation.org\u003e\nSigned-off-by: Linus Torvalds \u003ctorvalds@linux-foundation.org\u003e\n"
    },
    {
      "commit": "74e3f3c3391d81a959f58a1191a560703a4415b4",
      "tree": "b4688926ebe2c40b422bd6df0989ec09ea0f7046",
      "parents": [
        "49ac825587f33afec8841b7fab2eb4db775014e6"
      ],
      "author": {
        "name": "Minchan Kim",
        "email": "minchan.kim@gmail.com",
        "time": "Tue Oct 26 14:21:31 2010 -0700"
      },
      "committer": {
        "name": "Linus Torvalds",
        "email": "torvalds@linux-foundation.org",
        "time": "Tue Oct 26 16:52:06 2010 -0700"
      },
      "message": "vmscan: prevent background aging of anon page in no swap system\n\nYing Han reported that backing aging of anon pages in no swap system\ncauses unnecessary TLB flush.\n\nWhen I sent a patch(69c8548175), I wanted this patch but Rik pointed out\nand allowed aging of anon pages to give a chance to promote from inactive\nto active LRU.\n\nIt has a two problem.\n\n1) non-swap system\n\nNever make sense to age anon pages.\n\n2) swap configured but still doesn\u0027t swapon\n\nIt doesn\u0027t make sense to age anon pages until swap-on time.  But it\u0027s\narguable.  If we have aged anon pages by swapon, VM have moved anon pages\nfrom active to inactive.  And in the time swapon by admin, the VM can\u0027t\nreclaim hot pages so we can protect hot pages swapout.\n\nBut let\u0027s think about it.  When does swap-on happen?  It depends on admin.\n we can\u0027t expect it.  Nonetheless, we have done aging of anon pages to\nprotect hot pages swapout.  It means we lost run time overhead when below\nhigh watermark but gain hot page swap-[in/out] overhead when VM decide\nswapout.  Is it true?  Let\u0027s think more detail.  We don\u0027t promote anon\npages in case of non-swap system.  So even though VM does aging of anon\npages, the pages would be in inactive LRU for a long time.  It means many\nof pages in there would mark access bit again.  So access bit hot/code\nseparation would be pointless.\n\nThis patch prevents unnecessary anon pages demotion in not-yet-swapon and\nnon-configured swap system.  Even, in non-configuared swap system\ninactive_anon_is_low can be compiled out.\n\nIt could make side effect that hot anon pages could swap out when admin\ndoes swap on.  But I think sooner or later it would be steady state.  
So\nit\u0027s not a big problem.\n\nWe could lose someting but gain more thing(TLB flush and unnecessary\nfunction call to demote anon pages).\n\nSigned-off-by: Ying Han \u003cyinghan@google.com\u003e\nSigned-off-by: Minchan Kim \u003cminchan.kim@gmail.com\u003e\nReviewed-by: Rik van Riel \u003criel@redhat.com\u003e\nReviewed-by: KOSAKI Motohiro \u003ckosaki.motohiro@jp.fujitsu.com\u003e\nCc: Johannes Weiner \u003channes@cmpxchg.org\u003e\nSigned-off-by: Andrew Morton \u003cakpm@linux-foundation.org\u003e\nSigned-off-by: Linus Torvalds \u003ctorvalds@linux-foundation.org\u003e\n"
    },
    {
      "commit": "e4455abb50a19562dbfdc51a8424fda9b588bd6d",
      "tree": "add38aec00027e9a115778425a41d3d075a9ced6",
      "parents": [
        "f19e77a3dc884510dba740caa6dee126b7d40156"
      ],
      "author": {
        "name": "Thadeu Lima de Souza Cascardo",
        "email": "cascardo@holoscopio.com",
        "time": "Tue Oct 26 14:21:28 2010 -0700"
      },
      "committer": {
        "name": "Linus Torvalds",
        "email": "torvalds@linux-foundation.org",
        "time": "Tue Oct 26 16:52:05 2010 -0700"
      },
      "message": "mm: only build per-node scan_unevictable functions when NUMA is enabled\n\nNon-NUMA systems do never create these files anyway, since they are only\ncreated by driver subsystem when NUMA is configured.\n\n[akpm@linux-foundation.org: cleanup]\nSigned-off-by: Thadeu Lima de Souza Cascardo \u003ccascardo@holoscopio.com\u003e\nReviewed-by: KOSAKI Motohiro \u003ckosaki.motohiro@jp.fujitsu.com\u003e\nCc: Lee Schermerhorn \u003clee.schermerhorn@hp.com\u003e\nSigned-off-by: Andrew Morton \u003cakpm@linux-foundation.org\u003e\nSigned-off-by: Linus Torvalds \u003ctorvalds@linux-foundation.org\u003e\n"
    },
    {
      "commit": "1b430beee5e388605dfb092b214ef0320f752cf6",
      "tree": "c1b1ece282aab771fd1386a3fe0c6e82cb5c5bfe",
      "parents": [
        "d19d5476f4b9f91d2de92b91588bb118beba6c0d"
      ],
      "author": {
        "name": "Wu Fengguang",
        "email": "fengguang.wu@intel.com",
        "time": "Tue Oct 26 14:21:26 2010 -0700"
      },
      "committer": {
        "name": "Linus Torvalds",
        "email": "torvalds@linux-foundation.org",
        "time": "Tue Oct 26 16:52:05 2010 -0700"
      },
      "message": "writeback: remove nonblocking/encountered_congestion references\n\nThis removes more dead code that was somehow missed by commit 0d99519efef\n(writeback: remove unused nonblocking and congestion checks).  There are\nno behavior change except for the removal of two entries from one of the\next4 tracing interface.\n\nThe nonblocking checks in -\u003ewritepages are no longer used because the\nflusher now prefer to block on get_request_wait() than to skip inodes on\nIO congestion.  The latter will lead to more seeky IO.\n\nThe nonblocking checks in -\u003ewritepage are no longer used because it\u0027s\nredundant with the WB_SYNC_NONE check.\n\nWe no long set -\u003enonblocking in VM page out and page migration, because\na) it\u0027s effectively redundant with WB_SYNC_NONE in current code\nb) it\u0027s old semantic of \"Don\u0027t get stuck on request queues\" is mis-behavior:\n   that would skip some dirty inodes on congestion and page out others, which\n   is unfair in terms of LRU age.\n\nInspired by Christoph Hellwig. Thanks!\n\nSigned-off-by: Wu Fengguang \u003cfengguang.wu@intel.com\u003e\nCc: Theodore Ts\u0027o \u003ctytso@mit.edu\u003e\nCc: David Howells \u003cdhowells@redhat.com\u003e\nCc: Sage Weil \u003csage@newdream.net\u003e\nCc: Steve French \u003csfrench@samba.org\u003e\nCc: Chris Mason \u003cchris.mason@oracle.com\u003e\nCc: Jens Axboe \u003caxboe@kernel.dk\u003e\nCc: Christoph Hellwig \u003chch@infradead.org\u003e\nSigned-off-by: Andrew Morton \u003cakpm@linux-foundation.org\u003e\nSigned-off-by: Linus Torvalds \u003ctorvalds@linux-foundation.org\u003e\n"
    },
    {
      "commit": "229aebb873e29726b91e076161649cf45154b0bf",
      "tree": "acc02a3702215bce8d914f4c8cc3d7a1382b1c67",
      "parents": [
        "8de547e1824437f3c6af180d3ed2162fa4b3f389",
        "50a23e6eec6f20d55a3a920e47adb455bff6046e"
      ],
      "author": {
        "name": "Linus Torvalds",
        "email": "torvalds@linux-foundation.org",
        "time": "Sun Oct 24 13:41:39 2010 -0700"
      },
      "committer": {
        "name": "Linus Torvalds",
        "email": "torvalds@linux-foundation.org",
        "time": "Sun Oct 24 13:41:39 2010 -0700"
      },
      "message": "Merge branch \u0027for-next\u0027 of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial\n\n* \u0027for-next\u0027 of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial: (39 commits)\n  Update broken web addresses in arch directory.\n  Update broken web addresses in the kernel.\n  Revert \"drivers/usb: Remove unnecessary return\u0027s from void functions\" for musb gadget\n  Revert \"Fix typo: configuation \u003d\u003e configuration\" partially\n  ida: document IDA_BITMAP_LONGS calculation\n  ext2: fix a typo on comment in ext2/inode.c\n  drivers/scsi: Remove unnecessary casts of private_data\n  drivers/s390: Remove unnecessary casts of private_data\n  net/sunrpc/rpc_pipe.c: Remove unnecessary casts of private_data\n  drivers/infiniband: Remove unnecessary casts of private_data\n  drivers/gpu/drm: Remove unnecessary casts of private_data\n  kernel/pm_qos_params.c: Remove unnecessary casts of private_data\n  fs/ecryptfs: Remove unnecessary casts of private_data\n  fs/seq_file.c: Remove unnecessary casts of private_data\n  arm: uengine.c: remove C99 comments\n  arm: scoop.c: remove C99 comments\n  Fix typo configue \u003d\u003e configure in comments\n  Fix typo: configuation \u003d\u003e configuration\n  Fix typo interrest[ing|ed] \u003d\u003e interest[ing|ed]\n  Fix various typos of valid in comments\n  ...\n\nFix up trivial conflicts in:\n\tdrivers/char/ipmi/ipmi_si_intf.c\n\tdrivers/usb/gadget/rndis.c\n\tnet/irda/irnet/irnet_ppp.c\n"
    },
    {
      "commit": "d1908362ae0b97374eb8328fbb471576332f9fb1",
      "tree": "db8ba2a4de2e9ac61b8e94fc03a76ddbd24d321f",
      "parents": [
        "eba93fcc34d6c4387ce8fbb53bb7b685f91f3343"
      ],
      "author": {
        "name": "Minchan Kim",
        "email": "minchan.kim@gmail.com",
        "time": "Wed Sep 22 13:05:01 2010 -0700"
      },
      "committer": {
        "name": "Linus Torvalds",
        "email": "torvalds@linux-foundation.org",
        "time": "Wed Sep 22 17:22:39 2010 -0700"
      },
      "message": "vmscan: check all_unreclaimable in direct reclaim path\n\nM.  Vefa Bicakci reported 2.6.35 kernel hang up when hibernation on his\n32bit 3GB mem machine.\n(https://bugzilla.kernel.org/show_bug.cgi?id\u003d16771). Also he bisected\nthe regression to\n\n  commit bb21c7ce18eff8e6e7877ca1d06c6db719376e3c\n  Author: KOSAKI Motohiro \u003ckosaki.motohiro@jp.fujitsu.com\u003e\n  Date:   Fri Jun 4 14:15:05 2010 -0700\n\n     vmscan: fix do_try_to_free_pages() return value when priority\u003d\u003d0 reclaim failure\n\nAt first impression, this seemed very strange because the above commit\nonly chenged function return value and hibernate_preallocate_memory()\nignore return value of shrink_all_memory().  But it\u0027s related.\n\nNow, page allocation from hibernation code may enter infinite loop if the\nsystem has highmem.  The reasons are that vmscan don\u0027t care enough OOM\ncase when oom_killer_disabled.\n\nThe problem sequence is following as.\n\n1. hibernation\n2. oom_disable\n3. alloc_pages\n4. do_try_to_free_pages\n       if (scanning_global_lru(sc) \u0026\u0026 !all_unreclaimable)\n               return 1;\n\nIf kswapd is not freozen, it would set zone-\u003eall_unreclaimable to 1 and\nthen shrink_zones maybe return true(ie, all_unreclaimable is true).  So at\nlast, alloc_pages could go to _nopage_.  If it is, it should have no\nproblem.\n\nThis patch adds all_unreclaimable check to protect in direct reclaim path,\ntoo.  It can care of hibernation OOM case and help bailout\nall_unreclaimable case slightly.\n\nSigned-off-by: KOSAKI Motohiro \u003ckosaki.motohiro@jp.fujitsu.com\u003e\nSigned-off-by: Minchan Kim \u003cminchan.kim@gmail.com\u003e\nReported-by: M. Vefa Bicakci \u003cbicave@superonline.com\u003e\nReported-by: \u003ccaiqian@redhat.com\u003e\nReviewed-by: Johannes Weiner \u003channes@cmpxchg.org\u003e\nTested-by: \u003ccaiqian@redhat.com\u003e\nAcked-by: Rafael J. 
Wysocki \u003crjw@sisk.pl\u003e\nAcked-by: Rik van Riel \u003criel@redhat.com\u003e\nAcked-by: KAMEZAWA Hiroyuki \u003ckamezawa.hiroyu@jp.fujitsu.com\u003e\nCc: Balbir Singh \u003cbalbir@in.ibm.com\u003e\nCc: \u003cstable@kernel.org\u003e\nSigned-off-by: Andrew Morton \u003cakpm@linux-foundation.org\u003e\nSigned-off-by: Linus Torvalds \u003ctorvalds@linux-foundation.org\u003e\n"
    },
    {
      "commit": "415b54e37a5d0efa7ff5d4d12285b1e82d574c3e",
      "tree": "3a07cb7407061b273b605b6aa96f2cf8c4ae10b2",
      "parents": [
        "3d529946ce292336793b85198bd59afc75e16bd4"
      ],
      "author": {
        "name": "Nikanth Karthikesan",
        "email": "knikanth@suse.de",
        "time": "Tue Aug 17 15:39:09 2010 +0530"
      },
      "committer": {
        "name": "Jiri Kosina",
        "email": "jkosina@suse.cz",
        "time": "Wed Aug 18 10:22:24 2010 +0200"
      },
      "message": "Fix typo s/contenious/continuous in comment\n\nFix typo s/contenious/continuous in comment.\n\nSigned-off-by: Nikanth Karthikesan \u003cknikanth@suse.de\u003e\nSigned-off-by: Jiri Kosina \u003cjkosina@suse.cz\u003e\n"
    },
    {
      "commit": "00918b6ab89df8984ca06397cb77994dabd73f9b",
      "tree": "2ca2f0f0e7f3ca235c254f05759f96f160e3c0ab",
      "parents": [
        "14fec79680f7cc4617d6ba69324e63d4a732986c"
      ],
      "author": {
        "name": "KOSAKI Motohiro",
        "email": "kosaki.motohiro@jp.fujitsu.com",
        "time": "Tue Aug 10 18:03:05 2010 -0700"
      },
      "committer": {
        "name": "Linus Torvalds",
        "email": "torvalds@linux-foundation.org",
        "time": "Wed Aug 11 08:59:19 2010 -0700"
      },
      "message": "memcg: remove nid and zid argument from mem_cgroup_soft_limit_reclaim()\n\nmem_cgroup_soft_limit_reclaim() has zone, nid and zid argument.  but nid\nand zid can be calculated from zone.  So remove it.\n\nSigned-off-by: KOSAKI Motohiro \u003ckosaki.motohiro@jp.fujitsu.com\u003e\nAcked-by: KAMEZAWA Hiroyuki \u003ckamezawa.hiroyu@jp.fujitsu.com\u003e\nAcked-by: Mel Gorman \u003cmel@csn.ul.ie\u003e\nCc: Balbir Singh \u003cbalbir@in.ibm.com\u003e\nCc: Nishimura Daisuke \u003cd-nishimura@mtf.biglobe.ne.jp\u003e\nSigned-off-by: Andrew Morton \u003cakpm@linux-foundation.org\u003e\nSigned-off-by: Linus Torvalds \u003ctorvalds@linux-foundation.org\u003e\n"
    },
    {
      "commit": "14fec79680f7cc4617d6ba69324e63d4a732986c",
      "tree": "f8a9b627a03d04ec7c76fb67f8ea66c81c57a92f",
      "parents": [
        "da280d636b83f0f5d92921c99ef5c7d7c3e751cc"
      ],
      "author": {
        "name": "KOSAKI Motohiro",
        "email": "kosaki.motohiro@jp.fujitsu.com",
        "time": "Tue Aug 10 18:03:05 2010 -0700"
      },
      "committer": {
        "name": "Linus Torvalds",
        "email": "torvalds@linux-foundation.org",
        "time": "Wed Aug 11 08:59:19 2010 -0700"
      },
      "message": "memcg: mem_cgroup_shrink_node_zone() doesn\u0027t need sc.nodemask\n\nCurrently mem_cgroup_shrink_node_zone() call shrink_zone() directly.  thus\nit doesn\u0027t need to initialize sc.nodemask because shrink_zone() doesn\u0027t\nuse it at all.\n\nSigned-off-by: KOSAKI Motohiro \u003ckosaki.motohiro@jp.fujitsu.com\u003e\nAcked-by: KAMEZAWA Hiroyuki \u003ckamezawa.hiroyu@jp.fujitsu.com\u003e\nAcked-by: Mel Gorman \u003cmel@csn.ul.ie\u003e\nCc: Balbir Singh \u003cbalbir@in.ibm.com\u003e\nCc: Nishimura Daisuke \u003cd-nishimura@mtf.biglobe.ne.jp\u003e\nSigned-off-by: Andrew Morton \u003cakpm@linux-foundation.org\u003e\nSigned-off-by: Linus Torvalds \u003ctorvalds@linux-foundation.org\u003e\n"
    },
    {
      "commit": "da280d636b83f0f5d92921c99ef5c7d7c3e751cc",
      "tree": "8f64c4234031589b7818f3d90a1a9552016161ff",
      "parents": [
        "b8f5c5664d51776d74c84228c4b7165abfa92a18"
      ],
      "author": {
        "name": "KOSAKI Motohiro",
        "email": "kosaki.motohiro@jp.fujitsu.com",
        "time": "Tue Aug 10 18:03:04 2010 -0700"
      },
      "committer": {
        "name": "Linus Torvalds",
        "email": "torvalds@linux-foundation.org",
        "time": "Wed Aug 11 08:59:19 2010 -0700"
      },
      "message": "memcg: kill unnecessary initialization in mem_cgroup_shrink_node_zone()\n\nsc.nr_reclaimed and sc.nr_scanned have already been initialized few lines\nabove \"struct scan_control sc \u003d {}\" statement.\n\nSo, This patch remove this unnecessary code.\n\nSigned-off-by: KOSAKI Motohiro \u003ckosaki.motohiro@jp.fujitsu.com\u003e\nAcked-by: KAMEZAWA Hiroyuki \u003ckamezawa.hiroyu@jp.fujitsu.com\u003e\nCc: Balbir Singh \u003cbalbir@in.ibm.com\u003e\nCc: Mel Gorman \u003cmel@csn.ul.ie\u003e\nCc: Nishimura Daisuke \u003cd-nishimura@mtf.biglobe.ne.jp\u003e\nSigned-off-by: Andrew Morton \u003cakpm@linux-foundation.org\u003e\nSigned-off-by: Linus Torvalds \u003ctorvalds@linux-foundation.org\u003e\n"
    },
    {
      "commit": "b8f5c5664d51776d74c84228c4b7165abfa92a18",
      "tree": "951e09e1810a356d8ecff7a85637e9c2e2f4e49d",
      "parents": [
        "f75ca962037ffd639a44fd88933cd9b84c4c4411"
      ],
      "author": {
        "name": "KOSAKI Motohiro",
        "email": "kosaki.motohiro@jp.fujitsu.com",
        "time": "Tue Aug 10 18:03:02 2010 -0700"
      },
      "committer": {
        "name": "Linus Torvalds",
        "email": "torvalds@linux-foundation.org",
        "time": "Wed Aug 11 08:59:19 2010 -0700"
      },
      "message": "memcg: sc.nr_to_reclaim should be initialized\n\nCurrently, mem_cgroup_shrink_node_zone() initialize sc.nr_to_reclaim as 0.\n It mean shrink_zone() only scan 32 pages and immediately return even if\nit doesn\u0027t reclaim any pages.\n\nThis patch fixes it.\n\nSigned-off-by: KOSAKI Motohiro \u003ckosaki.motohiro@jp.fujitsu.com\u003e\nAcked-by: KAMEZAWA Hiroyuki \u003ckamezawa.hiroyu@jp.fujitsu.com\u003e\nAcked-by: Mel Gorman \u003cmel@csn.ul.ie\u003e\nCc: Balbir Singh \u003cbalbir@in.ibm.com\u003e\nCc: Nishimura Daisuke \u003cd-nishimura@mtf.biglobe.ne.jp\u003e\nSigned-off-by: Andrew Morton \u003cakpm@linux-foundation.org\u003e\nSigned-off-by: Linus Torvalds \u003ctorvalds@linux-foundation.org\u003e\n"
    },
    {
      "commit": "e31f3698cd3499e676f6b0ea12e3528f569c4fa3",
      "tree": "0133cc0e11384c7293bdf0812ee04996a02c8826",
      "parents": [
        "51980ac9e72fb5f22c81b7798d65b691125d70ee"
      ],
      "author": {
        "name": "Wu Fengguang",
        "email": "fengguang.wu@intel.com",
        "time": "Mon Aug 09 17:20:01 2010 -0700"
      },
      "committer": {
        "name": "Linus Torvalds",
        "email": "torvalds@linux-foundation.org",
        "time": "Mon Aug 09 20:45:03 2010 -0700"
      },
      "message": "vmscan: raise the bar to PAGEOUT_IO_SYNC stalls\n\nFix \"system goes unresponsive under memory pressure and lots of\ndirty/writeback pages\" bug.\n\n\thttp://lkml.org/lkml/2010/4/4/86\n\nIn the above thread, Andreas Mohr described that\n\n\tInvoking any command locked up for minutes (note that I\u0027m\n\ttalking about attempted additional I/O to the _other_,\n\t_unaffected_ main system HDD - such as loading some shell\n\tbinaries -, NOT the external SSD18M!!).\n\nThis happens when the two conditions are both meet:\n- under memory pressure\n- writing heavily to a slow device\n\nOOM also happens in Andreas\u0027 system.  The OOM trace shows that 3 processes\nare stuck in wait_on_page_writeback() in the direct reclaim path.  One in\ndo_fork() and the other two in unix_stream_sendmsg().  They are blocked on\nthis condition:\n\n\t(sc-\u003eorder \u0026\u0026 priority \u003c DEF_PRIORITY - 2)\n\nwhich was introduced in commit 78dc583d (vmscan: low order lumpy reclaim\nalso should use PAGEOUT_IO_SYNC) one year ago.  That condition may be too\npermissive.  In Andreas\u0027 case, 512MB/1024 \u003d 512KB.  If the direct reclaim\nfor the order-1 fork() allocation runs into a range of 512KB\nhard-to-reclaim LRU pages, it will be stalled.\n\nIt\u0027s a severe problem in three ways.\n\nFirstly, it can easily happen in daily desktop usage.  vmscan priority can\neasily go below (DEF_PRIORITY - 2) on _local_ memory pressure.  Even if\nthe system has 50% globally reclaimable pages, it still has good\nopportunity to have 0.1% sized hard-to-reclaim ranges.  For example, a\nsimple dd can easily create a big range (up to 20%) of dirty pages in the\nLRU lists.  And order-1 to order-3 allocations are more than common with\nSLUB.  Try \"grep -v \u00271 :\u0027 /proc/slabinfo\" to get the list of high order\nslab caches.  
For example, the order-1 radix_tree_node slab cache may\nstall applications at swap-in time; the order-3 inode cache on most\nfilesystems may stall applications when trying to read some file; the\norder-2 proc_inode_cache may stall applications when trying to open a\n/proc file.\n\nSecondly, once triggered, it will stall unrelated processes (not doing IO\nat all) in the system.  This \"one slow USB device stalls the whole system\"\navalanching effect is very bad.\n\nThirdly, once stalled, the stall time could be intolerably long for the\nusers.  When there are 20MB queued writeback pages and USB 1.1 is writing\nthem at 1MB/s, wait_on_page_writeback() will be stuck for up to 20 seconds.\nNot to mention it may be called multiple times.\n\nSo raise the bar to only enable PAGEOUT_IO_SYNC when priority goes below\nDEF_PRIORITY/3, or 6.25% LRU size.  As the default dirty throttle ratio is\n20%, it will hardly be triggered by pure dirty pages.  We\u0027d better treat\nPAGEOUT_IO_SYNC as some last resort workaround -- its stall time is so\nuncomfortably long (easily goes beyond 1s).\n\nThe bar is only raised for (order \u003c PAGE_ALLOC_COSTLY_ORDER) allocations,\nwhich are easy to satisfy in 1TB memory boxes.  So, although 6.25% of\nmemory could be an awful lot of pages to scan on a system with 1TB of\nmemory, it won\u0027t really have to busy scan that much.\n\nAndreas tested an older version of this patch and reported that it mostly\nfixed his problem.  
Mel Gorman helped improve it and KOSAKI Motohiro will\nfix it further in the next patch.\n\nReported-by: Andreas Mohr \u003candi@lisas.de\u003e\nReviewed-by: Minchan Kim \u003cminchan.kim@gmail.com\u003e\nReviewed-by: KOSAKI Motohiro \u003ckosaki.motohiro@jp.fujitsu.com\u003e\nSigned-off-by: Mel Gorman \u003cmel@csn.ul.ie\u003e\nSigned-off-by: Wu Fengguang \u003cfengguang.wu@intel.com\u003e\nCc: Rik van Riel \u003criel@redhat.com\u003e\nSigned-off-by: Andrew Morton \u003cakpm@linux-foundation.org\u003e\nSigned-off-by: Linus Torvalds \u003ctorvalds@linux-foundation.org\u003e\n"
    },
    {
      "commit": "bdce6d9ebf52c1f6c23163d1a33320ce7c007f73",
      "tree": "6070de94cdece474e277d8878721421671e212eb",
      "parents": [
        "cf4dcc3e9b374e1b61a7c22faf868707ce78d6a9"
      ],
      "author": {
        "name": "KOSAKI Motohiro",
        "email": "kosaki.motohiro@jp.fujitsu.com",
        "time": "Mon Aug 09 17:19:56 2010 -0700"
      },
      "committer": {
        "name": "Linus Torvalds",
        "email": "torvalds@linux-foundation.org",
        "time": "Mon Aug 09 20:45:03 2010 -0700"
      },
      "message": "memcg, vmscan: add memcg reclaim tracepoint\n\nMemcg also need to trace reclaim progress as direct reclaim.  This patch\nadd it.\n\nSigned-off-by: KOSAKI Motohiro \u003ckosaki.motohiro@jp.fujitsu.com\u003e\nReviewed-by: KAMEZAWA Hiroyuki \u003ckamezawa.hiroyu@jp.fujitsu.com\u003e\nAcked-by: Mel Gorman \u003cmel@csn.ul.ie\u003e\nAcked-by: Balbir Singh \u003cbalbir@linux.vnet.ibm.com\u003e\nSigned-off-by: Andrew Morton \u003cakpm@linux-foundation.org\u003e\nSigned-off-by: Linus Torvalds \u003ctorvalds@linux-foundation.org\u003e\n"
    },
    {
      "commit": "4dc4b3d971b23e12d483ba9f3b93b648c54b298a",
      "tree": "845cc8debe146f683510841982323c338eb10000",
      "parents": [
        "57250a5bf0f6ff68dc339572adbd881a11f366fa"
      ],
      "author": {
        "name": "KOSAKI Motohiro",
        "email": "kosaki.motohiro@jp.fujitsu.com",
        "time": "Mon Aug 09 17:19:54 2010 -0700"
      },
      "committer": {
        "name": "Linus Torvalds",
        "email": "torvalds@linux-foundation.org",
        "time": "Mon Aug 09 20:45:03 2010 -0700"
      },
      "message": "vmscan: shrink_slab() requires the number of lru_pages, not the page order\n\nPresently shrink_slab() has the following scanning equation.\n\n                            lru_scanned        max_pass\n  basic_scan_objects \u003d 4 x -------------  x -----------------------------\n                            lru_pages        shrinker-\u003eseeks (default:2)\n\n  scan_objects \u003d min(basic_scan_objects, max_pass * 2)\n\nIf we pass very small value as lru_pages instead real number of lru pages,\nshrink_slab() drop much objects rather than necessary.  And now,\n__zone_reclaim() pass \u0027order\u0027 as lru_pages by mistake.  That produces a\nbad result.\n\nFor example, if we receive very low memory pressure (scan \u003d 32, order \u003d\n0), shrink_slab() via zone_reclaim() always drop _all_ icache/dcache\nobjects.  (see above equation, very small lru_pages make very big\nscan_objects result).\n\nThis patch fixes it.\n\n[akpm@linux-foundation.org: fix layout, typos]\nSigned-off-by: KOSAKI Motohiro \u003ckosaki.motohiro@jp.fujitsu.com\u003e\nReviewed-by: Minchan Kim \u003cminchan.kim@gmail.com\u003e\nAcked-by: Christoph Lameter \u003ccl@linux-foundation.org\u003e\nAcked-by: Rik van Riel \u003criel@redhat.com\u003e\nCc: Johannes Weiner \u003channes@cmpxchg.org\u003e\nSigned-off-by: Andrew Morton \u003cakpm@linux-foundation.org\u003e\nSigned-off-by: Linus Torvalds \u003ctorvalds@linux-foundation.org\u003e\n"
    },
    {
      "commit": "58c37f6e0dfaaab85a3c11fcbf24451dfe70c721",
      "tree": "f1d6f6299059e5aa5fc3668ef9f561605491deb3",
      "parents": [
        "15748048991e801a2d18ce5da4e0d528852bc106"
      ],
      "author": {
        "name": "KOSAKI Motohiro",
        "email": "kosaki.motohiro@jp.fujitsu.com",
        "time": "Mon Aug 09 17:19:51 2010 -0700"
      },
      "committer": {
        "name": "Linus Torvalds",
        "email": "torvalds@linux-foundation.org",
        "time": "Mon Aug 09 20:45:02 2010 -0700"
      },
      "message": "vmscan: protect reading of reclaim_stat with lru_lock\n\nRik van Riel pointed out reading reclaim_stat should be protected\nlru_lock, otherwise vmscan might sweep 2x much pages.\n\nThis fault was introduced by\n\n  commit 4f98a2fee8acdb4ac84545df98cccecfd130f8db\n  Author: Rik van Riel \u003criel@redhat.com\u003e\n  Date:   Sat Oct 18 20:26:32 2008 -0700\n\n    vmscan: split LRU lists into anon \u0026 file sets\n\nSigned-off-by: KOSAKI Motohiro \u003ckosaki.motohiro@jp.fujitsu.com\u003e\nCc: Rik van Riel \u003criel@redhat.com\u003e\nReviewed-by: Minchan Kim \u003cminchan.kim@gmail.com\u003e\nSigned-off-by: Andrew Morton \u003cakpm@linux-foundation.org\u003e\nSigned-off-by: Linus Torvalds \u003ctorvalds@linux-foundation.org\u003e\n"
    },
    {
      "commit": "15748048991e801a2d18ce5da4e0d528852bc106",
      "tree": "e31dcdf36bbcfdd1c78546637d59faa963597bac",
      "parents": [
        "7ee92255470daa0edb93866aec6e27534cd9a177"
      ],
      "author": {
        "name": "KOSAKI Motohiro",
        "email": "kosaki.motohiro@jp.fujitsu.com",
        "time": "Mon Aug 09 17:19:50 2010 -0700"
      },
      "committer": {
        "name": "Linus Torvalds",
        "email": "torvalds@linux-foundation.org",
        "time": "Mon Aug 09 20:45:02 2010 -0700"
      },
      "message": "vmscan: avoid subtraction of unsigned types\n\n\u0027slab_reclaimable\u0027 and \u0027nr_pages\u0027 are unsigned.  Subtraction is unsafe\nbecause negative results would be misinterpreted.\n\nSigned-off-by: KOSAKI Motohiro \u003ckosaki.motohiro@jp.fujitsu.com\u003e\nReviewed-by: Minchan Kim \u003cminchan.kim@gmail.com\u003e\nCc: Mel Gorman \u003cmel@csn.ul.ie\u003e\nCc: Rik van Riel \u003criel@redhat.com\u003e\nCc: Johannes Weiner \u003channes@cmpxchg.org\u003e\nSigned-off-by: Andrew Morton \u003cakpm@linux-foundation.org\u003e\nSigned-off-by: Linus Torvalds \u003ctorvalds@linux-foundation.org\u003e\n"
    },
    {
      "commit": "1489fa14cb757b496c8fa2b63097dbcee6690695",
      "tree": "288905eab717db3cf1f6b7419e1989e89411ce04",
      "parents": [
        "abe4c3b50c3f25cb1baf56036024860f12f96015"
      ],
      "author": {
        "name": "Mel Gorman",
        "email": "mel@csn.ul.ie",
        "time": "Mon Aug 09 17:19:33 2010 -0700"
      },
      "committer": {
        "name": "Linus Torvalds",
        "email": "torvalds@linux-foundation.org",
        "time": "Mon Aug 09 20:45:00 2010 -0700"
      },
      "message": "vmscan: update isolated page counters outside of main path in shrink_inactive_list()\n\nWhen shrink_inactive_list() isolates pages, it updates a number of\ncounters using temporary variables to gather them.  These consume stack\nand it\u0027s in the main path that calls -\u003ewritepage().  This patch moves the\naccounting updates outside of the main path to reduce stack usage.\n\nSigned-off-by: Mel Gorman \u003cmel@csn.ul.ie\u003e\nReviewed-by: Johannes Weiner \u003channes@cmpxchg.org\u003e\nAcked-by: Rik van Riel \u003criel@redhat.com\u003e\nCc: Dave Chinner \u003cdavid@fromorbit.com\u003e\nCc: Chris Mason \u003cchris.mason@oracle.com\u003e\nCc: Nick Piggin \u003cnpiggin@suse.de\u003e\nCc: Rik van Riel \u003criel@redhat.com\u003e\nCc: Johannes Weiner \u003channes@cmpxchg.org\u003e\nCc: Christoph Hellwig \u003chch@infradead.org\u003e\nCc: KAMEZAWA Hiroyuki \u003ckamezawa.hiroyu@jp.fujitsu.com\u003e\nCc: KOSAKI Motohiro \u003ckosaki.motohiro@jp.fujitsu.com\u003e\nCc: Andrea Arcangeli \u003caarcange@redhat.com\u003e\nCc: Michael Rubin \u003cmrubin@google.com\u003e\nSigned-off-by: Andrew Morton \u003cakpm@linux-foundation.org\u003e\nSigned-off-by: Linus Torvalds \u003ctorvalds@linux-foundation.org\u003e\n"
    },
    {
      "commit": "abe4c3b50c3f25cb1baf56036024860f12f96015",
      "tree": "20ac47ac168b30f1dde6773d103ed13432802049",
      "parents": [
        "666356297ec4e9e6594c6008803f2b1403ff7950"
      ],
      "author": {
        "name": "Mel Gorman",
        "email": "mel@csn.ul.ie",
        "time": "Mon Aug 09 17:19:31 2010 -0700"
      },
      "committer": {
        "name": "Linus Torvalds",
        "email": "torvalds@linux-foundation.org",
        "time": "Mon Aug 09 20:45:00 2010 -0700"
      },
      "message": "vmscan: set up pagevec as late as possible in shrink_page_list()\n\nshrink_page_list() sets up a pagevec to release pages as according as they\nare free.  It uses significant amounts of stack on the pagevec.  This\npatch adds pages to be freed via pagevec to a linked list which is then\nfreed en-masse at the end.  This avoids using stack in the main path that\npotentially calls writepage().\n\nSigned-off-by: Mel Gorman \u003cmel@csn.ul.ie\u003e\nReviewed-by: Rik van Riel \u003criel@redhat.com\u003e\nCc: Dave Chinner \u003cdavid@fromorbit.com\u003e\nCc: Chris Mason \u003cchris.mason@oracle.com\u003e\nCc: Nick Piggin \u003cnpiggin@suse.de\u003e\nCc: Rik van Riel \u003criel@redhat.com\u003e\nCc: Johannes Weiner \u003channes@cmpxchg.org\u003e\nCc: Christoph Hellwig \u003chch@infradead.org\u003e\nCc: KAMEZAWA Hiroyuki \u003ckamezawa.hiroyu@jp.fujitsu.com\u003e\nCc: KOSAKI Motohiro \u003ckosaki.motohiro@jp.fujitsu.com\u003e\nCc: Andrea Arcangeli \u003caarcange@redhat.com\u003e\nCc: Michael Rubin \u003cmrubin@google.com\u003e\nSigned-off-by: Andrew Morton \u003cakpm@linux-foundation.org\u003e\nSigned-off-by: Linus Torvalds \u003ctorvalds@linux-foundation.org\u003e\n"
    },
    {
      "commit": "666356297ec4e9e6594c6008803f2b1403ff7950",
      "tree": "aaa1e60f81588d0d90c2279c9812a32e5a085a27",
      "parents": [
        "d4debc66d1fc1b98a68081c4c8156f171841dca8"
      ],
      "author": {
        "name": "Mel Gorman",
        "email": "mel@csn.ul.ie",
        "time": "Mon Aug 09 17:19:30 2010 -0700"
      },
      "committer": {
        "name": "Linus Torvalds",
        "email": "torvalds@linux-foundation.org",
        "time": "Mon Aug 09 20:45:00 2010 -0700"
      },
      "message": "vmscan: set up pagevec as late as possible in shrink_inactive_list()\n\nshrink_inactive_list() sets up a pagevec to release unfreeable pages.  It\nuses significant amounts of stack doing this.  This patch splits\nshrink_inactive_list() to take the stack usage out of the main path so\nthat callers to writepage() do not contain an unused pagevec on the stack.\n\nSigned-off-by: Mel Gorman \u003cmel@csn.ul.ie\u003e\nReviewed-by: Johannes Weiner \u003channes@cmpxchg.org\u003e\nAcked-by: Rik van Riel \u003criel@redhat.com\u003e\nCc: Dave Chinner \u003cdavid@fromorbit.com\u003e\nCc: Chris Mason \u003cchris.mason@oracle.com\u003e\nCc: Nick Piggin \u003cnpiggin@suse.de\u003e\nCc: Rik van Riel \u003criel@redhat.com\u003e\nCc: Johannes Weiner \u003channes@cmpxchg.org\u003e\nCc: Christoph Hellwig \u003chch@infradead.org\u003e\nCc: KAMEZAWA Hiroyuki \u003ckamezawa.hiroyu@jp.fujitsu.com\u003e\nCc: KOSAKI Motohiro \u003ckosaki.motohiro@jp.fujitsu.com\u003e\nCc: Andrea Arcangeli \u003caarcange@redhat.com\u003e\nCc: Michael Rubin \u003cmrubin@google.com\u003e\nSigned-off-by: Andrew Morton \u003cakpm@linux-foundation.org\u003e\nSigned-off-by: Linus Torvalds \u003ctorvalds@linux-foundation.org\u003e\n"
    },
    {
      "commit": "d4debc66d1fc1b98a68081c4c8156f171841dca8",
      "tree": "35bc1d2212e3c70ded11d31d28698fee0beeb3b2",
      "parents": [
        "e247dbce5cc747a714e8dcbd6b3f442cc2a284cf"
      ],
      "author": {
        "name": "Mel Gorman",
        "email": "mel@csn.ul.ie",
        "time": "Mon Aug 09 17:19:29 2010 -0700"
      },
      "committer": {
        "name": "Linus Torvalds",
        "email": "torvalds@linux-foundation.org",
        "time": "Mon Aug 09 20:45:00 2010 -0700"
      },
      "message": "vmscan: remove unnecessary temporary vars in do_try_to_free_pages\n\nRemove temporary variable that is only used once and does not help clarify\ncode.\n\n[akpm@linux-foundation.org: coding-style fixes]\nSigned-off-by: Mel Gorman \u003cmel@csn.ul.ie\u003e\nReviewed-by: Rik van Riel \u003criel@redhat.com\u003e\nCc: Dave Chinner \u003cdavid@fromorbit.com\u003e\nCc: Chris Mason \u003cchris.mason@oracle.com\u003e\nCc: Nick Piggin \u003cnpiggin@suse.de\u003e\nCc: Rik van Riel \u003criel@redhat.com\u003e\nCc: Johannes Weiner \u003channes@cmpxchg.org\u003e\nCc: Christoph Hellwig \u003chch@infradead.org\u003e\nCc: KAMEZAWA Hiroyuki \u003ckamezawa.hiroyu@jp.fujitsu.com\u003e\nCc: KOSAKI Motohiro \u003ckosaki.motohiro@jp.fujitsu.com\u003e\nCc: Andrea Arcangeli \u003caarcange@redhat.com\u003e\nCc: Michael Rubin \u003cmrubin@google.com\u003e\nSigned-off-by: Andrew Morton \u003cakpm@linux-foundation.org\u003e\nSigned-off-by: Linus Torvalds \u003ctorvalds@linux-foundation.org\u003e\n"
    },
    {
      "commit": "e247dbce5cc747a714e8dcbd6b3f442cc2a284cf",
      "tree": "9ade331a0be10aab1a67160b6feebc2ef06d5878",
      "parents": [
        "25edde0332916ae706ccf83de688be57bcc844b7"
      ],
      "author": {
        "name": "KOSAKI Motohiro",
        "email": "kosaki.motohiro@jp.fujitsu.com",
        "time": "Mon Aug 09 17:19:28 2010 -0700"
      },
      "committer": {
        "name": "Linus Torvalds",
        "email": "torvalds@linux-foundation.org",
        "time": "Mon Aug 09 20:45:00 2010 -0700"
      },
      "message": "vmscan: simplify shrink_inactive_list()\n\nNow, max_scan of shrink_inactive_list() is always passed less than\nSWAP_CLUSTER_MAX.  then, we can remove scanning pages loop in it.  This\npatch also help stack diet.\n\ndetail\n - remove \"while (nr_scanned \u003c max_scan)\" loop\n - remove nr_freed (now, we use nr_reclaimed directly)\n - remove nr_scan (now, we use nr_scanned directly)\n - rename max_scan to nr_to_scan\n - pass nr_to_scan into isolate_pages() directly instead\n   using SWAP_CLUSTER_MAX\n\n[akpm@linux-foundation.org: coding-style fixes]\nSigned-off-by: KOSAKI Motohiro \u003ckosaki.motohiro@jp.fujitsu.com\u003e\nSigned-off-by: Mel Gorman \u003cmel@csn.ul.ie\u003e\nReviewed-by: Johannes Weiner \u003channes@cmpxchg.org\u003e\nReviewed-by: Rik van Riel \u003criel@redhat.com\u003e\nCc: Dave Chinner \u003cdavid@fromorbit.com\u003e\nCc: Chris Mason \u003cchris.mason@oracle.com\u003e\nCc: Nick Piggin \u003cnpiggin@suse.de\u003e\nCc: Rik van Riel \u003criel@redhat.com\u003e\nCc: Johannes Weiner \u003channes@cmpxchg.org\u003e\nCc: Christoph Hellwig \u003chch@infradead.org\u003e\nCc: KAMEZAWA Hiroyuki \u003ckamezawa.hiroyu@jp.fujitsu.com\u003e\nCc: KOSAKI Motohiro \u003ckosaki.motohiro@jp.fujitsu.com\u003e\nCc: Andrea Arcangeli \u003caarcange@redhat.com\u003e\nCc: Michael Rubin \u003cmrubin@google.com\u003e\nSigned-off-by: Andrew Morton \u003cakpm@linux-foundation.org\u003e\nSigned-off-by: Linus Torvalds \u003ctorvalds@linux-foundation.org\u003e\n"
    },
    {
      "commit": "25edde0332916ae706ccf83de688be57bcc844b7",
      "tree": "35a5b0e651f9cdb48d9a55a748970339c4f681bc",
      "parents": [
        "b898cc70019ce1835bbf6c47bdf978adc36faa42"
      ],
      "author": {
        "name": "KOSAKI Motohiro",
        "email": "kosaki.motohiro@jp.fujitsu.com",
        "time": "Mon Aug 09 17:19:27 2010 -0700"
      },
      "committer": {
        "name": "Linus Torvalds",
        "email": "torvalds@linux-foundation.org",
        "time": "Mon Aug 09 20:45:00 2010 -0700"
      },
      "message": "vmscan: kill prev_priority completely\n\nSince 2.6.28 zone-\u003eprev_priority is unused. Then it can be removed\nsafely. It reduce stack usage slightly.\n\nNow I have to say that I\u0027m sorry. 2 years ago, I thought prev_priority\ncan be integrate again, it\u0027s useful. but four (or more) times trying\nhaven\u0027t got good performance number. Thus I give up such approach.\n\nThe rest of this changelog is notes on prev_priority and why it existed in\nthe first place and why it might be not necessary any more. This information\nis based heavily on discussions between Andrew Morton, Rik van Riel and\nKosaki Motohiro who is heavily quotes from.\n\nHistorically prev_priority was important because it determined when the VM\nwould start unmapping PTE pages. i.e. there are no balances of note within\nthe VM, Anon vs File and Mapped vs Unmapped. Without prev_priority, there\nis a potential risk of unnecessarily increasing minor faults as a large\namount of read activity of use-once pages could push mapped pages to the\nend of the LRU and get unmapped.\n\nThere is no proof this is still a problem but currently it is not considered\nto be. Active files are not deactivated if the active file list is smaller\nthan the inactive list reducing the liklihood that file-mapped pages are\nbeing pushed off the LRU and referenced executable pages are kept on the\nactive list to avoid them getting pushed out by read activity.\n\nEven if it is a problem, prev_priority prev_priority wouldn\u0027t works\nnowadays. First of all, current vmscan still a lot of UP centric code. it\nexpose some weakness on some dozens CPUs machine. I think we need more and\nmore improvement.\n\nThe problem is, current vmscan mix up per-system-pressure, per-zone-pressure\nand per-task-pressure a bit. example, prev_priority try to boost priority to\nother concurrent priority. 
but if the another task have mempolicy restriction,\nit is unnecessary, but also makes wrong big latency and exceeding reclaim.\nper-task based priority + prev_priority adjustment make the emulation of\nper-system pressure. but it have two issue 1) too rough and brutal emulation\n2) we need per-zone pressure, not per-system.\n\nAnother example, currently DEF_PRIORITY is 12. it mean the lru rotate about\n2 cycle (1/4096 + 1/2048 + 1/1024 + .. + 1) before invoking OOM-Killer.\nbut if 10,000 threads enter DEF_PRIORITY reclaim at the same time, the\nsystem have higher memory pressure than priority\u003d\u003d0 (1/4096*10,000 \u003e 2).\nprev_priority can\u0027t solve such multithreads workload issue. In other word,\nprev_priority concept assume the system don\u0027t have lots threads.\"\n\nSigned-off-by: KOSAKI Motohiro \u003ckosaki.motohiro@jp.fujitsu.com\u003e\nSigned-off-by: Mel Gorman \u003cmel@csn.ul.ie\u003e\nReviewed-by: Johannes Weiner \u003channes@cmpxchg.org\u003e\nReviewed-by: Rik van Riel \u003criel@redhat.com\u003e\nCc: Dave Chinner \u003cdavid@fromorbit.com\u003e\nCc: Chris Mason \u003cchris.mason@oracle.com\u003e\nCc: Nick Piggin \u003cnpiggin@suse.de\u003e\nCc: Rik van Riel \u003criel@redhat.com\u003e\nCc: Johannes Weiner \u003channes@cmpxchg.org\u003e\nCc: Christoph Hellwig \u003chch@infradead.org\u003e\nCc: KAMEZAWA Hiroyuki \u003ckamezawa.hiroyu@jp.fujitsu.com\u003e\nCc: KOSAKI Motohiro \u003ckosaki.motohiro@jp.fujitsu.com\u003e\nCc: Andrea Arcangeli \u003caarcange@redhat.com\u003e\nCc: Michael Rubin \u003cmrubin@google.com\u003e\nSigned-off-by: Andrew Morton \u003cakpm@linux-foundation.org\u003e\nSigned-off-by: Linus Torvalds \u003ctorvalds@linux-foundation.org\u003e\n"
    },
    {
      "commit": "755f0225e8347b23a33ee6e3fb14a35310f95766",
      "tree": "23489e9f52f1435c7b8bc4da8e98d607460e4e23",
      "parents": [
        "a8a94d151521b248727c1f88756174e15260815a"
      ],
      "author": {
        "name": "Mel Gorman",
        "email": "mel@csn.ul.ie",
        "time": "Mon Aug 09 17:19:18 2010 -0700"
      },
      "committer": {
        "name": "Linus Torvalds",
        "email": "torvalds@linux-foundation.org",
        "time": "Mon Aug 09 20:45:00 2010 -0700"
      },
      "message": "vmscan: tracing: add trace event when a page is written\n\nAdd a trace event for when page reclaim queues a page for IO and records\nwhether it is synchronous or asynchronous.  Excessive synchronous IO for a\nprocess can result in noticeable stalls during direct reclaim.  Excessive\nIO from page reclaim may indicate that the system is seriously under\nprovisioned for the amount of dirty pages that exist.\n\nSigned-off-by: Mel Gorman \u003cmel@csn.ul.ie\u003e\nAcked-by: Rik van Riel \u003criel@redhat.com\u003e\nAcked-by: Larry Woodman \u003clwoodman@redhat.com\u003e\nCc: Dave Chinner \u003cdavid@fromorbit.com\u003e\nCc: Chris Mason \u003cchris.mason@oracle.com\u003e\nCc: Nick Piggin \u003cnpiggin@suse.de\u003e\nCc: Rik van Riel \u003criel@redhat.com\u003e\nCc: Johannes Weiner \u003channes@cmpxchg.org\u003e\nCc: Christoph Hellwig \u003chch@infradead.org\u003e\nCc: KAMEZAWA Hiroyuki \u003ckamezawa.hiroyu@jp.fujitsu.com\u003e\nCc: KOSAKI Motohiro \u003ckosaki.motohiro@jp.fujitsu.com\u003e\nCc: Andrea Arcangeli \u003caarcange@redhat.com\u003e\nCc: Michael Rubin \u003cmrubin@google.com\u003e\nSigned-off-by: Andrew Morton \u003cakpm@linux-foundation.org\u003e\nSigned-off-by: Linus Torvalds \u003ctorvalds@linux-foundation.org\u003e\n"
    },
    {
      "commit": "a8a94d151521b248727c1f88756174e15260815a",
      "tree": "81ea18e9c52f6260f9b52b16e592cc89bfd9d260",
      "parents": [
        "33906bc5c87b50028364405ec425de9638afc719"
      ],
      "author": {
        "name": "Mel Gorman",
        "email": "mel@csn.ul.ie",
        "time": "Mon Aug 09 17:19:17 2010 -0700"
      },
      "committer": {
        "name": "Linus Torvalds",
        "email": "torvalds@linux-foundation.org",
        "time": "Mon Aug 09 20:44:59 2010 -0700"
      },
      "message": "vmscan: tracing: add trace events for LRU page isolation\n\nAdd an event for when pages are isolated en-masse from the LRU lists.\nThis event augments the information available on LRU traffic and can be\nused to evaluate lumpy reclaim.\n\n[akpm@linux-foundation.org: coding-style fixes]\nSigned-off-by: Mel Gorman \u003cmel@csn.ul.ie\u003e\nAcked-by: Rik van Riel \u003criel@redhat.com\u003e\nAcked-by: Larry Woodman \u003clwoodman@redhat.com\u003e\nCc: Dave Chinner \u003cdavid@fromorbit.com\u003e\nCc: Chris Mason \u003cchris.mason@oracle.com\u003e\nCc: Nick Piggin \u003cnpiggin@suse.de\u003e\nCc: Rik van Riel \u003criel@redhat.com\u003e\nCc: Johannes Weiner \u003channes@cmpxchg.org\u003e\nCc: Christoph Hellwig \u003chch@infradead.org\u003e\nCc: KAMEZAWA Hiroyuki \u003ckamezawa.hiroyu@jp.fujitsu.com\u003e\nCc: KOSAKI Motohiro \u003ckosaki.motohiro@jp.fujitsu.com\u003e\nCc: Andrea Arcangeli \u003caarcange@redhat.com\u003e\nCc: Michael Rubin \u003cmrubin@google.com\u003e\nSigned-off-by: Andrew Morton \u003cakpm@linux-foundation.org\u003e\nSigned-off-by: Linus Torvalds \u003ctorvalds@linux-foundation.org\u003e\n"
    },
    {
      "commit": "33906bc5c87b50028364405ec425de9638afc719",
      "tree": "d5d6f431bd517a4a914972f3ce968dc99de73694",
      "parents": [
        "c6a8a8c589b53f90854a07db3b5806ce111e826b"
      ],
      "author": {
        "name": "Mel Gorman",
        "email": "mel@csn.ul.ie",
        "time": "Mon Aug 09 17:19:16 2010 -0700"
      },
      "committer": {
        "name": "Linus Torvalds",
        "email": "torvalds@linux-foundation.org",
        "time": "Mon Aug 09 20:44:59 2010 -0700"
      },
      "message": "vmscan: tracing: add trace events for kswapd wakeup, sleeping and direct reclaim\n\nAdd two trace events for kswapd waking up and going asleep for the\npurposes of tracking kswapd activity and two trace events for direct\nreclaim beginning and ending.  The information can be used to work out how\nmuch time a process or the system is spending on the reclamation of pages\nand in the case of direct reclaim, how many pages were reclaimed for that\nprocess.  High frequency triggering of these events could point to memory\npressure problems.\n\nSigned-off-by: Mel Gorman \u003cmel@csn.ul.ie\u003e\nAcked-by: Rik van Riel \u003criel@redhat.com\u003e\nAcked-by: Larry Woodman \u003clwoodman@redhat.com\u003e\nCc: Dave Chinner \u003cdavid@fromorbit.com\u003e\nCc: Chris Mason \u003cchris.mason@oracle.com\u003e\nCc: Nick Piggin \u003cnpiggin@suse.de\u003e\nCc: Rik van Riel \u003criel@redhat.com\u003e\nCc: Johannes Weiner \u003channes@cmpxchg.org\u003e\nCc: Christoph Hellwig \u003chch@infradead.org\u003e\nCc: KAMEZAWA Hiroyuki \u003ckamezawa.hiroyu@jp.fujitsu.com\u003e\nCc: KOSAKI Motohiro \u003ckosaki.motohiro@jp.fujitsu.com\u003e\nCc: Andrea Arcangeli \u003caarcange@redhat.com\u003e\nCc: Michael Rubin \u003cmrubin@google.com\u003e\nSigned-off-by: Andrew Morton \u003cakpm@linux-foundation.org\u003e\nSigned-off-by: Linus Torvalds \u003ctorvalds@linux-foundation.org\u003e\n"
    },
    {
      "commit": "c6a8a8c589b53f90854a07db3b5806ce111e826b",
      "tree": "c806b2f0d8f6f5b315f94daf864999b273d9530f",
      "parents": [
        "b00d3ea7cfe44e177ad5cd8141209d46478a7a51"
      ],
      "author": {
        "name": "KOSAKI Motohiro",
        "email": "kosaki.motohiro@jp.fujitsu.com",
        "time": "Mon Aug 09 17:19:14 2010 -0700"
      },
      "committer": {
        "name": "Linus Torvalds",
        "email": "torvalds@linux-foundation.org",
        "time": "Mon Aug 09 20:44:59 2010 -0700"
      },
      "message": "vmscan: recalculate lru_pages on each priority\n\nshrink_zones() need relatively long time and lru_pages can change\ndramatically during shrink_zones().  So lru_pages should be recalculated\nfor each priority.\n\nSigned-off-by: KOSAKI Motohiro \u003ckosaki.motohiro@jp.fujitsu.com\u003e\nAcked-by: Rik van Riel \u003criel@redhat.com\u003e\nReviewed-by: Minchan Kim \u003cminchan.kim@gmail.com\u003e\nAcked-by: Johannes Weiner \u003channes@cmpxchg.org\u003e\nSigned-off-by: Andrew Morton \u003cakpm@linux-foundation.org\u003e\nSigned-off-by: Linus Torvalds \u003ctorvalds@linux-foundation.org\u003e\n"
    },
    {
      "commit": "b00d3ea7cfe44e177ad5cd8141209d46478a7a51",
      "tree": "3508bd3106e6a83decd6e0db7521180e11d4754e",
      "parents": [
        "f446daaea9d4a420d16c606f755f3689dcb2d0ce"
      ],
      "author": {
        "name": "KOSAKI Motohiro",
        "email": "kosaki.motohiro@jp.fujitsu.com",
        "time": "Mon Aug 09 17:19:13 2010 -0700"
      },
      "committer": {
        "name": "Linus Torvalds",
        "email": "torvalds@linux-foundation.org",
        "time": "Mon Aug 09 20:44:59 2010 -0700"
      },
      "message": "vmscan: zone_reclaim don\u0027t call disable_swap_token()\n\nThe swap token has never worked when zone reclaim is enabled, because\n__zone_reclaim() always calls disable_swap_token() unconditionally.\n\nThis kills the swap token feature completely.  As far as I know, nobody wants\nthat.  Remove it.\n\nSigned-off-by: KOSAKI Motohiro \u003ckosaki.motohiro@jp.fujitsu.com\u003e\nAcked-by: Rik van Riel \u003criel@redhat.com\u003e\nReviewed-by: Minchan Kim \u003cminchan.kim@gmail.com\u003e\nCc: Christoph Lameter \u003ccl@linux-foundation.org\u003e\nSigned-off-by: Andrew Morton \u003cakpm@linux-foundation.org\u003e\nSigned-off-by: Linus Torvalds \u003ctorvalds@linux-foundation.org\u003e\n"
    },
    {
      "commit": "a6aa62a0909b9ccb1f8b0d2653920ba071037972",
      "tree": "2df66e9a20cdb8fac10a1979c6b678e98ee67f0a",
      "parents": [
        "c61284e99191b2284fb74dae6961d4d09e4e59e8"
      ],
      "author": {
        "name": "Nick Piggin",
        "email": "npiggin@suse.de",
        "time": "Tue Jul 20 13:24:25 2010 -0700"
      },
      "committer": {
        "name": "Linus Torvalds",
        "email": "torvalds@linux-foundation.org",
        "time": "Tue Jul 20 16:25:40 2010 -0700"
      },
      "message": "mm/vmscan.c: fix mapping use after free\n\nWe need lock_page_nosync() here because we have no reference to the\nmapping when taking the page lock.\n\nSigned-off-by: Nick Piggin \u003cnpiggin@suse.de\u003e\nReviewed-by: Johannes Weiner \u003channes@cmpxchg.org\u003e\nCc: Mel Gorman \u003cmel@csn.ul.ie\u003e\nSigned-off-by: Andrew Morton \u003cakpm@linux-foundation.org\u003e\nSigned-off-by: Linus Torvalds \u003ctorvalds@linux-foundation.org\u003e\n"
    },
    {
      "commit": "7f8275d0d660c146de6ee3017e1e2e594c49e820",
      "tree": "884db927118b44102750b5168ee36ef4b8b5cb4e",
      "parents": [
        "d0c6f6258478e1dba532bf7c28e2cd6e1047d3a4"
      ],
      "author": {
        "name": "Dave Chinner",
        "email": "dchinner@redhat.com",
        "time": "Mon Jul 19 14:56:17 2010 +1000"
      },
      "committer": {
        "name": "Dave Chinner",
        "email": "david@fromorbit.com",
        "time": "Mon Jul 19 14:56:17 2010 +1000"
      },
      "message": "mm: add context argument to shrinker callback\n\nThe current shrinker implementation requires the registered callback\nto have global state to work from. This makes it difficult to shrink\ncaches that are not global (e.g. per-filesystem caches). Pass the shrinker\nstructure to the callback so that users can embed the shrinker structure\nin the context the shrinker needs to operate on and get back to it in the\ncallback via container_of().\n\nSigned-off-by: Dave Chinner \u003cdchinner@redhat.com\u003e\nReviewed-by: Christoph Hellwig \u003chch@lst.de\u003e\n"
    },
    {
      "commit": "bb21c7ce18eff8e6e7877ca1d06c6db719376e3c",
      "tree": "555edaded1e0a771df406ce2b6b63368df6de6cd",
      "parents": [
        "9e506f7adce8e6165a104d3d78fddd8ff0cdccf8"
      ],
      "author": {
        "name": "KOSAKI Motohiro",
        "email": "kosaki.motohiro@jp.fujitsu.com",
        "time": "Fri Jun 04 14:15:05 2010 -0700"
      },
      "committer": {
        "name": "Linus Torvalds",
        "email": "torvalds@linux-foundation.org",
        "time": "Fri Jun 04 15:21:45 2010 -0700"
      },
      "message": "vmscan: fix do_try_to_free_pages() return value when priority\u003d\u003d0 reclaim failure\n\nGreg Thelen reported that Johannes\u0027s recent stack diet patch makes the kernel hang.\nHis test is as follows.\n\n  mount -t cgroup none /cgroups -o memory\n  mkdir /cgroups/cg1\n  echo $$ \u003e /cgroups/cg1/tasks\n  dd bs\u003d1024 count\u003d1024 if\u003d/dev/null of\u003d/data/foo\n  echo $$ \u003e /cgroups/tasks\n  echo 1 \u003e /cgroups/cg1/memory.force_empty\n\nActually, this try-hard-before-OOM logic has been broken since the following\ntwo-year-old patch.\n\n\tcommit a41f24ea9fd6169b147c53c2392e2887cc1d9247\n\tAuthor: Nishanth Aravamudan \u003cnacc@us.ibm.com\u003e\n\tDate:   Tue Apr 29 00:58:25 2008 -0700\n\n\t    page allocator: smarter retry of costly-order allocations\n\nThe original intention was \"return success if the system has shrinkable zones\neven though priority\u003d\u003d0 reclaim failed\".  But the above patch changed it to\n\"return nr_reclaimed if .....\".  That overlooked that nr_reclaimed may be 0 if\npriority\u003d\u003d0 reclaim fails.\n\nAnd Johannes\u0027s patch 0aeb2339e54e (\"vmscan: remove all_unreclaimable scan\ncontrol\") broke it further.  Originally, a priority\u003d\u003d0 reclaim failure\non memcg returned 0, but that patch changed it to return 1.  This totally\nconfused memcg.\n\nThis patch fixes it completely.\n\nReported-by: Greg Thelen \u003cgthelen@google.com\u003e\nSigned-off-by: KOSAKI Motohiro \u003ckosaki.motohiro@jp.fujitsu.com\u003e\nAcked-by: Johannes Weiner \u003channes@cmpxchg.org\u003e\nAcked-by: KAMEZAWA Hiroyuki \u003ckamezawa.hiroyu@jp.fujitsu.com\u003e\nTested-by: Greg Thelen \u003cgthelen@google.com\u003e\nAcked-by: Balbir Singh \u003cbalbir@linux.vnet.ibm.com\u003e\nSigned-off-by: Andrew Morton \u003cakpm@linux-foundation.org\u003e\nSigned-off-by: Linus Torvalds \u003ctorvalds@linux-foundation.org\u003e\n"
    },
    {
      "commit": "8b25c6d2231b978ccce9c401e771932bde79aa9f",
      "tree": "13845799e14e49465de1529680df7def59dcfeb8",
      "parents": [
        "0aeb2339e54e40d0788a7017ecaeac7f5271e262"
      ],
      "author": {
        "name": "Johannes Weiner",
        "email": "hannes@cmpxchg.org",
        "time": "Mon May 24 14:32:40 2010 -0700"
      },
      "committer": {
        "name": "Linus Torvalds",
        "email": "torvalds@linux-foundation.org",
        "time": "Tue May 25 08:07:00 2010 -0700"
      },
      "message": "vmscan: remove isolate_pages callback scan control\n\nFor now, we have global isolation vs.  memory control group isolation, do\nnot allow the reclaim entry function to set an arbitrary page isolation\ncallback, we do not need that flexibility.\n\nAnd since we already pass around the group descriptor for the memory\ncontrol group isolation case, just use it to decide which one of the two\nisolator functions to use.\n\nThe decisions can be merged into nearby branches, so no extra cost there.\nIn fact, we save the indirect calls.\n\nSigned-off-by: Johannes Weiner \u003channes@cmpxchg.org\u003e\nCc: KOSAKI Motohiro \u003ckosaki.motohiro@jp.fujitsu.com\u003e\nCc: Mel Gorman \u003cmel@csn.ul.ie\u003e\nCc: Rik van Riel \u003criel@redhat.com\u003e\nSigned-off-by: Andrew Morton \u003cakpm@linux-foundation.org\u003e\nSigned-off-by: Linus Torvalds \u003ctorvalds@linux-foundation.org\u003e\n"
    },
    {
      "commit": "0aeb2339e54e40d0788a7017ecaeac7f5271e262",
      "tree": "66889ce248257e7e24c998a22994ccef222e4622",
      "parents": [
        "142762bd8d8c46345e79f0f68d3374564306972f"
      ],
      "author": {
        "name": "Johannes Weiner",
        "email": "hannes@cmpxchg.org",
        "time": "Mon May 24 14:32:40 2010 -0700"
      },
      "committer": {
        "name": "Linus Torvalds",
        "email": "torvalds@linux-foundation.org",
        "time": "Tue May 25 08:07:00 2010 -0700"
      },
      "message": "vmscan: remove all_unreclaimable scan control\n\nThis scan control is abused to communicate a return value from\nshrink_zones().  Write this idiomatically and remove the knob.\n\nSigned-off-by: Johannes Weiner \u003channes@cmpxchg.org\u003e\nReviewed-by: KOSAKI Motohiro \u003ckosaki.motohiro@jp.fujitsu.com\u003e\nCc: Mel Gorman \u003cmel@csn.ul.ie\u003e\nCc: Rik van Riel \u003criel@redhat.com\u003e\nSigned-off-by: Andrew Morton \u003cakpm@linux-foundation.org\u003e\nSigned-off-by: Linus Torvalds \u003ctorvalds@linux-foundation.org\u003e\n"
    },
    {
      "commit": "5f53e76299ceebd68bdf9495e8ff80db77711236",
      "tree": "2ecb8324a6593a49868161d85511cc14d474900a",
      "parents": [
        "bf8abe8b926f7546eb763fd2a088fe461dde6317"
      ],
      "author": {
        "name": "KOSAKI Motohiro",
        "email": "kosaki.motohiro@jp.fujitsu.com",
        "time": "Mon May 24 14:32:37 2010 -0700"
      },
      "committer": {
        "name": "Linus Torvalds",
        "email": "torvalds@linux-foundation.org",
        "time": "Tue May 25 08:07:00 2010 -0700"
      },
      "message": "vmscan: page_check_references(): check low order lumpy reclaim properly\n\nWhen vmscan is in lumpy reclaim mode, it has to ignore the referenced bit\nin order to produce contiguous free pages, but the current page_check_references()\ndoesn\u0027t.\n\nFix it.\n\nSigned-off-by: KOSAKI Motohiro \u003ckosaki.motohiro@jp.fujitsu.com\u003e\nReviewed-by: Minchan Kim \u003cminchan.kim@gmail.com\u003e\nCc: Rik van Riel \u003criel@redhat.com\u003e\nCc: Lee Schermerhorn \u003cLee.Schermerhorn@hp.com\u003e\nSigned-off-by: Andrew Morton \u003cakpm@linux-foundation.org\u003e\nSigned-off-by: Linus Torvalds \u003ctorvalds@linux-foundation.org\u003e\n"
    },
    {
      "commit": "76a33fc380c9a65e01eb15b3b87c05863a0d51db",
      "tree": "506db7a03eb41e245a859ac241ff0680097427e5",
      "parents": [
        "6ec3a12712ac67ffa4b80d16e0767ffd2431a68d"
      ],
      "author": {
        "name": "Shaohua Li",
        "email": "shaohua.li@intel.com",
        "time": "Mon May 24 14:32:36 2010 -0700"
      },
      "committer": {
        "name": "Linus Torvalds",
        "email": "torvalds@linux-foundation.org",
        "time": "Tue May 25 08:07:00 2010 -0700"
      },
      "message": "vmscan: prevent get_scan_ratio() rounding errors\n\nget_scan_ratio() calculates a percentage, and if the percentage is \u003c 1%, it\nrounds it down to 0%, so we completely skip scanning\nanon/file pages to reclaim memory even when the total number of anon/file pages is very\nlarge.\n\nTo avoid the underflow, we don\u0027t use a percentage; instead we directly calculate\nhow many pages should be scanned.  This way, we should still scan several\npages when the percentage is \u003c 1%.\n\nThis has some benefits:\n\n1. It increases our calculation precision.\n\n2. It makes our scanning smoother.  Without this, if percent[x]\n   underflows, shrink_zone() doesn\u0027t scan any pages and then suddenly scans\n   all pages when priority is zero.  With this, even when priority isn\u0027t zero,\n   shrink_zone() gets a chance to scan some pages.\n\nNote that this patch doesn\u0027t really change the logic; it just increases\nprecision.  For systems with a lot of memory, this might slightly change\nbehavior.  For example, in a sequential file read workload, without the\npatch, we don\u0027t swap any anon pages.  With it, if anon memory size is\nbigger than 16G, we will see one anon page swapped.  The 16G is calculated\nas PAGE_SIZE * priority(4096) * (fp/ap).  fp/ap is assumed to be 1024,\nwhich is common in this workload.  So the impact does not look like a big deal.\n\nSigned-off-by: Shaohua Li \u003cshaohua.li@intel.com\u003e\nCc: KOSAKI Motohiro \u003ckosaki.motohiro@jp.fujitsu.com\u003e\nAcked-by: Rik van Riel \u003criel@redhat.com\u003e\nCc: Wu Fengguang \u003cfengguang.wu@intel.com\u003e\nSigned-off-by: Andrew Morton \u003cakpm@linux-foundation.org\u003e\nSigned-off-by: Linus Torvalds \u003ctorvalds@linux-foundation.org\u003e\n"
    },
    {
      "commit": "c175a0ce7584e5b498fff8cbdb9aa7912aa9fbba",
      "tree": "dd924daef4a9e0ac9729c5b61c30b8e3cc96f971",
      "parents": [
        "f1a5ab1210579e2d3ac8c0c227645823af5aafb0"
      ],
      "author": {
        "name": "Mel Gorman",
        "email": "mel@csn.ul.ie",
        "time": "Mon May 24 14:32:26 2010 -0700"
      },
      "committer": {
        "name": "Linus Torvalds",
        "email": "torvalds@linux-foundation.org",
        "time": "Tue May 25 08:06:59 2010 -0700"
      },
      "message": "mm: move definition for LRU isolation modes to a header\n\nCurrently, vmscan.c defines the isolation modes for __isolate_lru_page().\nMemory compaction needs access to these modes for isolating pages for\nmigration.  This patch exports them.\n\nSigned-off-by: Mel Gorman \u003cmel@csn.ul.ie\u003e\nAcked-by: Christoph Lameter \u003ccl@linux-foundation.org\u003e\nCc: Rik van Riel \u003criel@redhat.com\u003e\nCc: Minchan Kim \u003cminchan.kim@gmail.com\u003e\nCc: KOSAKI Motohiro \u003ckosaki.motohiro@jp.fujitsu.com\u003e\nCc: KAMEZAWA Hiroyuki \u003ckamezawa.hiroyu@jp.fujitsu.com\u003e\nSigned-off-by: Andrew Morton \u003cakpm@linux-foundation.org\u003e\nSigned-off-by: Linus Torvalds \u003ctorvalds@linux-foundation.org\u003e\n"
    },
    {
      "commit": "c0ff7453bb5c7c98e0885fb94279f2571946f280",
      "tree": "8bb2b169a5145f0496575dbd2f48bb4b1c83f819",
      "parents": [
        "708c1bbc9d0c3e57f40501794d9b0eed29d10fce"
      ],
      "author": {
        "name": "Miao Xie",
        "email": "miaox@cn.fujitsu.com",
        "time": "Mon May 24 14:32:08 2010 -0700"
      },
      "committer": {
        "name": "Linus Torvalds",
        "email": "torvalds@linux-foundation.org",
        "time": "Tue May 25 08:06:57 2010 -0700"
      },
      "message": "cpuset,mm: fix no node to alloc memory when changing cpuset\u0027s mems\n\nBefore applying this patch, cpuset updates task-\u003emems_allowed and\nmempolicy by setting all new bits in the nodemask first, and clearing all\nold unallowed bits later.  But in the way, the allocator may find that\nthere is no node to alloc memory.\n\nThe reason is that cpuset rebinds the task\u0027s mempolicy, it cleans the\nnodes which the allocater can alloc pages on, for example:\n\n(mpol: mempolicy)\n\ttask1\t\t\ttask1\u0027s mpol\ttask2\n\talloc page\t\t1\n\t  alloc on node0? NO\t1\n\t\t\t\t1\t\tchange mems from 1 to 0\n\t\t\t\t1\t\trebind task1\u0027s mpol\n\t\t\t\t0-1\t\t  set new bits\n\t\t\t\t0\t  \t  clear disallowed bits\n\t  alloc on node1? NO\t0\n\t  ...\n\tcan\u0027t alloc page\n\t  goto oom\n\nThis patch fixes this problem by expanding the nodes range first(set newly\nallowed bits) and shrink it lazily(clear newly disallowed bits).  So we\nuse a variable to tell the write-side task that read-side task is reading\nnodemask, and the write-side task clears newly disallowed nodes after\nread-side task ends the current memory allocation.\n\n[akpm@linux-foundation.org: fix spello]\nSigned-off-by: Miao Xie \u003cmiaox@cn.fujitsu.com\u003e\nCc: David Rientjes \u003crientjes@google.com\u003e\nCc: Nick Piggin \u003cnpiggin@suse.de\u003e\nCc: Paul Menage \u003cmenage@google.com\u003e\nCc: Lee Schermerhorn \u003clee.schermerhorn@hp.com\u003e\nCc: Hugh Dickins \u003chugh.dickins@tiscali.co.uk\u003e\nCc: Ravikiran Thirumalai \u003ckiran@scalex86.org\u003e\nCc: KOSAKI Motohiro \u003ckosaki.motohiro@jp.fujitsu.com\u003e\nCc: Christoph Lameter \u003ccl@linux-foundation.org\u003e\nCc: Andi Kleen \u003candi@firstfloor.org\u003e\nSigned-off-by: Andrew Morton \u003cakpm@linux-foundation.org\u003e\nSigned-off-by: Linus Torvalds \u003ctorvalds@linux-foundation.org\u003e\n"
    },
    {
      "commit": "d6da1a5abc2bf3a06a5bda08e0f6833409234666",
      "tree": "dd6e7a306879b49a7947d3f7015bc84f5e27d869",
      "parents": [
        "6e191f7bb083544dc4fa3879ff81caf97c65d197"
      ],
      "author": {
        "name": "KOSAKI Motohiro",
        "email": "kosaki.motohiro@jp.fujitsu.com",
        "time": "Tue Apr 06 14:34:56 2010 -0700"
      },
      "committer": {
        "name": "Linus Torvalds",
        "email": "torvalds@linux-foundation.org",
        "time": "Wed Apr 07 08:38:03 2010 -0700"
      },
      "message": "mm: revert \"vmscan: get_scan_ratio() cleanup\"\n\nShaohua Li reported that his tmpfs streaming I/O test can trigger an OOM.\nThe test uses a 6G tmpfs in a system with 3G memory.  In the tmpfs, there\nare 6 copies of the kernel source and the test does a kbuild for each copy.  His\ninvestigation shows the test has a lot of rotated anon pages and quite few\nfile pages, so get_scan_ratio calculates percent[0] (i.e.  scanning\npercent for anon) to be zero.  Actually percent[0] should be a big\nvalue, but our calculation rounds it to zero.\n\nWe had the same problem before commit 84b18490 (\"vmscan: get_scan_ratio() cleanup\")\ntoo, but the old logic could rescue the percent[0]\u003d\u003d0\ncase, though only when priority\u003d\u003d0.  That had hidden the real issue.  I didn\u0027t think\nmerely streaming I/O could produce the percent[0]\u003d\u003d0 \u0026\u0026 priority\u003d\u003d0 situation, but\nI was wrong.\n\nSo we definitely have to fix this tmpfs streaming I/O issue, but first I\nrevert the commit that caused the regression.\n\nThis reverts commit 84b18490d1f1bc7ed5095c929f78bc002eb70f26.\n\nSigned-off-by: KOSAKI Motohiro \u003ckosaki.motohiro@jp.fujitsu.com\u003e\nReported-by: Shaohua Li \u003cshaohua.li@intel.com\u003e\nCc: Rik van Riel \u003criel@redhat.com\u003e\nCc: KAMEZAWA Hiroyuki \u003ckamezawa.hiroyu@jp.fujitsu.com\u003e\nCc: Minchan Kim \u003cminchan.kim@gmail.com\u003e\nSigned-off-by: Andrew Morton \u003cakpm@linux-foundation.org\u003e\nSigned-off-by: Linus Torvalds \u003ctorvalds@linux-foundation.org\u003e\n"
    },
    {
      "commit": "5a0e3ad6af8660be21ca98a971cd00f331318c05",
      "tree": "5bfb7be11a03176a87296a43ac6647975c00a1d1",
      "parents": [
        "ed391f4ebf8f701d3566423ce8f17e614cde9806"
      ],
      "author": {
        "name": "Tejun Heo",
        "email": "tj@kernel.org",
        "time": "Wed Mar 24 17:04:11 2010 +0900"
      },
      "committer": {
        "name": "Tejun Heo",
        "email": "tj@kernel.org",
        "time": "Tue Mar 30 22:02:32 2010 +0900"
      },
      "message": "include cleanup: Update gfp.h and slab.h includes to prepare for breaking implicit slab.h inclusion from percpu.h\n\npercpu.h is included by sched.h and module.h and thus ends up being\nincluded when building most .c files.  percpu.h includes slab.h which\nin turn includes gfp.h making everything defined by the two files\nuniversally available and complicating inclusion dependencies.\n\npercpu.h -\u003e slab.h dependency is about to be removed.  Prepare for\nthis change by updating users of gfp and slab facilities include those\nheaders directly instead of assuming availability.  As this conversion\nneeds to touch large number of source files, the following script is\nused as the basis of conversion.\n\n  http://userweb.kernel.org/~tj/misc/slabh-sweep.py\n\nThe script does the followings.\n\n* Scan files for gfp and slab usages and update includes such that\n  only the necessary includes are there.  ie. if only gfp is used,\n  gfp.h, if slab is used, slab.h.\n\n* When the script inserts a new include, it looks at the include\n  blocks and try to put the new include such that its order conforms\n  to its surrounding.  It\u0027s put in the include block which contains\n  core kernel includes, in the same order that the rest are ordered -\n  alphabetical, Christmas tree, rev-Xmas-tree or at the end if there\n  doesn\u0027t seem to be any matching order.\n\n* If the script can\u0027t find a place to put a new include (mostly\n  because the file doesn\u0027t have fitting include block), it prints out\n  an error message indicating which .h file needs to be added to the\n  file.\n\nThe conversion was done in the following steps.\n\n1. The initial automatic conversion of all .c files updated slightly\n   over 4000 files, deleting around 700 includes and adding ~480 gfp.h\n   and ~3000 slab.h inclusions.  The script emitted errors for ~400\n   files.\n\n2. Each error was manually checked.  
Some didn\u0027t need the inclusion,\n   some needed manual addition while adding it to implementation .h or\n   embedding .c file was more appropriate for others.  This step added\n   inclusions to around 150 files.\n\n3. The script was run again and the output was compared to the edits\n   from #2 to make sure no file was left behind.\n\n4. Several build tests were done and a couple of problems were fixed.\n   e.g. lib/decompress_*.c used malloc/free() wrappers around slab\n   APIs requiring slab.h to be added manually.\n\n5. The script was run on all .h files but without automatically\n   editing them as sprinkling gfp.h and slab.h inclusions around .h\n   files could easily lead to inclusion dependency hell.  Most gfp.h\n   inclusion directives were ignored as stuff from gfp.h was usually\n   widely available and often used in preprocessor macros.  Each\n   slab.h inclusion directive was examined and added manually as\n   necessary.\n\n6. percpu.h was updated not to include slab.h.\n\n7. Build tests were done on the following configurations and failures\n   were fixed.  CONFIG_GCOV_KERNEL was turned off for all tests (as my\n   distributed build env didn\u0027t work with gcov compiles) and a few\n   more options had to be turned off depending on archs to make things\n   build (like ipr on powerpc/64 which failed due to missing writeq).\n\n   * x86 and x86_64 UP and SMP allmodconfig and a custom test config.\n   * powerpc and powerpc64 SMP allmodconfig\n   * sparc and sparc64 SMP allmodconfig\n   * ia64 SMP allmodconfig\n   * s390 SMP allmodconfig\n   * alpha SMP allmodconfig\n   * um on x86_64 SMP allmodconfig\n\n8. 
percpu.h modifications were reverted so that it could be applied as\n   a separate patch and serve as bisection point.\n\nGiven the fact that I had only a couple of failures from tests on step\n6, I\u0027m fairly confident about the coverage of this conversion patch.\nIf there is a breakage, it\u0027s likely to be something in one of the arch\nheaders which should be easily discoverable on most builds of\nthe specific arch.\n\nSigned-off-by: Tejun Heo \u003ctj@kernel.org\u003e\nGuess-its-ok-by: Christoph Lameter \u003ccl@linux-foundation.org\u003e\nCc: Ingo Molnar \u003cmingo@redhat.com\u003e\nCc: Lee Schermerhorn \u003cLee.Schermerhorn@hp.com\u003e\n"
    },
    {
      "commit": "645747462435d84c6c6a64269ed49cc3015f753d",
      "tree": "4cbbddcddd429704dd4f205f6371bb329dcb0ff1",
      "parents": [
        "31c0569c3b0b6cc8a867ac6665ca081553f7984c"
      ],
      "author": {
        "name": "Johannes Weiner",
        "email": "hannes@cmpxchg.org",
        "time": "Fri Mar 05 13:42:22 2010 -0800"
      },
      "committer": {
        "name": "Linus Torvalds",
        "email": "torvalds@linux-foundation.org",
        "time": "Sat Mar 06 11:26:27 2010 -0800"
      },
      "message": "vmscan: detect mapped file pages used only once\n\nThe VM currently assumes that an inactive, mapped and referenced file page\nis in use and promotes it to the active list.\n\nHowever, every mapped file page starts out like this and thus a problem\narises when workloads create a stream of such pages that are used only for\na short time.  By flooding the active list with those pages, the VM\nquickly gets into trouble finding eligible reclaim candidates.  The result\nis long allocation latencies and eviction of the wrong pages.\n\nThis patch reuses the PG_referenced page flag (used for unmapped file\npages) to implement a usage detection that scales with the speed of LRU\nlist cycling (i.e.  memory pressure).\n\nIf the scanner encounters those pages, the flag is set and the page cycled\nagain on the inactive list.  It is activated only if it returns with another\npage table reference.  Otherwise it is reclaimed as \u0027not recently\nused cache\u0027.\n\nThis effectively changes the minimum lifetime of a used-once mapped file\npage from a full memory cycle to an inactive list cycle, which allows it\nto occur in linear streams without affecting the stable working set of the\nsystem.\n\nSigned-off-by: Johannes Weiner \u003channes@cmpxchg.org\u003e\nReviewed-by: Rik van Riel \u003criel@redhat.com\u003e\nCc: Minchan Kim \u003cminchan.kim@gmail.com\u003e\nCc: OSAKI Motohiro \u003ckosaki.motohiro@jp.fujitsu.com\u003e\nCc: Lee Schermerhorn \u003clee.schermerhorn@hp.com\u003e\nSigned-off-by: Andrew Morton \u003cakpm@linux-foundation.org\u003e\nSigned-off-by: Linus Torvalds \u003ctorvalds@linux-foundation.org\u003e\n"
    },
    {
      "commit": "31c0569c3b0b6cc8a867ac6665ca081553f7984c",
      "tree": "c3d3e02f941fed0f91981d55d93540d2acaaecbd",
      "parents": [
        "dfc8d636cdb95f7b792d5ba8c9f3b295809c125d"
      ],
      "author": {
        "name": "Johannes Weiner",
        "email": "hannes@cmpxchg.org",
        "time": "Fri Mar 05 13:42:21 2010 -0800"
      },
      "committer": {
        "name": "Linus Torvalds",
        "email": "torvalds@linux-foundation.org",
        "time": "Sat Mar 06 11:26:27 2010 -0800"
      },
      "message": "vmscan: drop page_mapping_inuse()\n\npage_mapping_inuse() is a historic predicate function for pages that are\nabout to be reclaimed or deactivated.\n\nAccording to it, a page is in use when it is mapped into page tables OR\npart of swap cache OR backing an mmapped file.\n\nThis function is used in combination with page_referenced(), which checks\nfor young bits in ptes and the page descriptor itself for the\nPG_referenced bit.  Thus, checking for unmapped swap cache pages is\nmeaningless as PG_referenced is not set for anonymous pages and unmapped\npages do not have young ptes.  The test makes no difference.\n\nProtecting file pages that are not by themselves mapped but are part of a\nmapped file is also a historic leftover for short-lived things like the\nexec() code in libc.  However, the VM now does reference accounting and\nactivation of pages at unmap time and thus the special treatment on\nreclaim is obsolete.\n\nThis patch drops page_mapping_inuse() and switches the two callsites to\nuse page_mapped() directly.\n\nSigned-off-by: Johannes Weiner \u003channes@cmpxchg.org\u003e\nReviewed-by: Rik van Riel \u003criel@redhat.com\u003e\nCc: Minchan Kim \u003cminchan.kim@gmail.com\u003e\nCc: OSAKI Motohiro \u003ckosaki.motohiro@jp.fujitsu.com\u003e\nCc: Lee Schermerhorn \u003clee.schermerhorn@hp.com\u003e\nSigned-off-by: Andrew Morton \u003cakpm@linux-foundation.org\u003e\nSigned-off-by: Linus Torvalds \u003ctorvalds@linux-foundation.org\u003e\n"
    },
    {
      "commit": "dfc8d636cdb95f7b792d5ba8c9f3b295809c125d",
      "tree": "90070c49adb5a8833d8fc034bc94cc696797e22e",
      "parents": [
        "e7c84ee22b8321fa0130a53d4c9806474d62eff0"
      ],
      "author": {
        "name": "Johannes Weiner",
        "email": "hannes@cmpxchg.org",
        "time": "Fri Mar 05 13:42:19 2010 -0800"
      },
      "committer": {
        "name": "Linus Torvalds",
        "email": "torvalds@linux-foundation.org",
        "time": "Sat Mar 06 11:26:27 2010 -0800"
      },
      "message": "vmscan: factor out page reference checks\n\nThe used-once mapped file page detection patchset.\n\nIt is meant to help workloads with large amounts of shortly used file\nmappings, like rtorrent hashing a file or git when dealing with loose\nobjects (git gc on a bigger site?).\n\nRight now, the VM activates referenced mapped file pages on first\nencounter on the inactive list and it takes a full memory cycle to\nreclaim them again.  When those pages dominate memory, the system\nno longer has a meaningful notion of \u0027working set\u0027 and is required\nto give up the active list to make reclaim progress.  Obviously,\nthis results in rather bad scanning latencies and the wrong pages\nbeing reclaimed.\n\nThis patch makes the VM be more careful about activating mapped file\npages in the first place.  The minimum granted lifetime without\nanother memory access becomes an inactive list cycle instead of the\nfull memory cycle, which is more natural given the mentioned loads.\n\nThis test resembles a hashing rtorrent process.  Sequentially, 32MB\nchunks of a file are mapped into memory, hashed (sha1) and unmapped\nagain.  
While this happens, every 5 seconds a process is launched and\nits execution time taken:\n\n\tpython2.4 -c \u0027import pydoc\u0027\n\told: max\u003d2.31s mean\u003d1.26s (0.34)\n\tnew: max\u003d1.25s mean\u003d0.32s (0.32)\n\n\tfind /etc -type f\n\told: max\u003d2.52s mean\u003d1.44s (0.43)\n\tnew: max\u003d1.92s mean\u003d0.12s (0.17)\n\n\tvim -c \u0027:quit\u0027\n\told: max\u003d6.14s mean\u003d4.03s (0.49)\n\tnew: max\u003d3.48s mean\u003d2.41s (0.25)\n\n\tmplayer --help\n\told: max\u003d8.08s mean\u003d5.74s (1.02)\n\tnew: max\u003d3.79s mean\u003d1.32s (0.81)\n\n\toverall hash time (stdev):\n\told: time\u003d1192.30 (12.85) thruput\u003d25.78mb/s (0.27)\n\tnew: time\u003d1060.27 (32.58) thruput\u003d29.02mb/s (0.88) (-11%)\n\nI also tested kernbench with regular IO streaming in the background to\nsee whether the delayed activation of frequently used mapped file\npages had a negative impact on performance in the presence of pressure\non the inactive list.  The patch made no significant difference in\ntiming, neither for kernbench nor for the streaming IO throughput.\n\nThe first patch submission raised concerns about the cost of the extra\nfaults for actually activated pages on machines that have no hardware\nsupport for young page table entries.\n\nI created an artificial worst case scenario on an ARM machine with\naround 300MHz and 64MB of memory to figure out the dimensions\ninvolved.  The test would mmap a file of 20MB, then\n\n  1. touch all its pages to fault them in\n  2. force one full scan cycle on the inactive file LRU\n  -- old: mapping pages activated\n  -- new: mapping pages inactive\n  3. touch the mapping pages again\n  -- old and new: fault exceptions to set the young bits\n  4. force another full scan cycle on the inactive file LRU\n  5. touch the mapping pages one last time\n  -- new: fault exceptions to set the young bits\n\nThe test showed an overall increase of 6% in time over 100 iterations\nof the above (old: ~212sec, new: ~225sec).  
13 secs total overhead /\n(100 * 5k pages), ignoring the execution time of the test itself,\nmakes for about 25us overhead for every page that actually gets\nactivated.  Note:\n\n  1. File mapping the size of one third of main memory, _completely_\n  in active use across memory pressure - i.e., most pages referenced\n  within one LRU cycle.  This should be rare to non-existent,\n  especially on such embedded setups.\n\n  2. Many huge activation batches.  Those batches only occur when the\n  working set fluctuates.  If it changes completely between every full\n  LRU cycle, you have problematic reclaim overhead anyway.\n\n  3. Access of activated pages at maximum speed: sequential loads from\n  every single page without doing anything in between.  In reality,\n  the extra faults will get distributed between actual operations on\n  the data.\n\nSo even if a workload manages to get the VM into the situation of\nactivating a third of memory in one go on such a setup, it will take\n2.2 seconds instead of 2.1 without the patch.\n\nComparing the numbers (and my user-experience over several months),\nI think this change is an overall improvement to the VM.\n\nPatch 1 is only refactoring to break up that ugly compound conditional\nin shrink_page_list() and make it easy to document and add new checks\nin a readable fashion.\n\nPatch 2 gets rid of the obsolete page_mapping_inuse().  
It\u0027s not\nstrictly related to #3, but it was in the original submission and is a\nnet simplification, so I kept it.\n\nPatch 3 implements used-once detection of mapped file pages.\n\nThis patch:\n\nMoving the big conditional into its own predicate function makes the code\na bit easier to read and allows for better commenting on the checks\none-by-one.\n\nThis is just cleaning up, no semantics should have been changed.\n\nSigned-off-by: Johannes Weiner \u003channes@cmpxchg.org\u003e\nReviewed-by: Rik van Riel \u003criel@redhat.com\u003e\nCc: Minchan Kim \u003cminchan.kim@gmail.com\u003e\nCc: OSAKI Motohiro \u003ckosaki.motohiro@jp.fujitsu.com\u003e\nCc: Lee Schermerhorn \u003clee.schermerhorn@hp.com\u003e\nSigned-off-by: Andrew Morton \u003cakpm@linux-foundation.org\u003e\nSigned-off-by: Linus Torvalds \u003ctorvalds@linux-foundation.org\u003e\n"
    }
  ],
  "next": "93e4a89a8c987189b168a530a331ef6d0fcf07a7"
}
