)]}'
{
  "log": [
    {
      "commit": "8ba8ed54de4dd79bb88ab6cd7dbf2e83d58d6d57",
      "tree": "d4d1c687f1b7f58783103b43a04618d8e4019bba",
      "parents": [
        "bbbc4791cd48ac12996e43c0033b504c79b53639",
        "468e6a20afaccb67e2a7d7f60d301f90e1c6f301"
      ],
      "author": {
        "name": "Linus Torvalds",
        "email": "torvalds@linux-foundation.org",
        "time": "Tue Nov 22 08:22:48 2011 -0800"
      },
      "committer": {
        "name": "Linus Torvalds",
        "email": "torvalds@linux-foundation.org",
        "time": "Tue Nov 22 08:22:48 2011 -0800"
      },
      "message": "Merge branch \u0027writeback-for-linus\u0027 of git://git.kernel.org/pub/scm/linux/kernel/git/wfg/linux\n\n* \u0027writeback-for-linus\u0027 of git://git.kernel.org/pub/scm/linux/kernel/git/wfg/linux:\n  writeback: remove vm_dirties and task-\u003edirties\n  writeback: hard throttle 1000+ dd on a slow USB stick\n  mm: Make task in balance_dirty_pages() killable\n"
    },
    {
      "commit": "b6844523839779030430ff28f036f83e2a3f43e6",
      "tree": "0af97f08911fab7e1351646172b1805c287ea300",
      "parents": [
        "15bd1cfb3055d866614cdaf38e43201936264e50",
        "99cb2ddcc617f43917e94a4147aa3ccdb2bcd77e"
      ],
      "author": {
        "name": "Linus Torvalds",
        "email": "torvalds@linux-foundation.org",
        "time": "Fri Nov 18 13:18:07 2011 -0200"
      },
      "committer": {
        "name": "Linus Torvalds",
        "email": "torvalds@linux-foundation.org",
        "time": "Fri Nov 18 13:18:07 2011 -0200"
      },
      "message": "Merge branch \u0027stable/for-linus-fixes-3.2\u0027 of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen\n\n* \u0027stable/for-linus-fixes-3.2\u0027 of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen:\n  xen-gntalloc: signedness bug in add_grefs()\n  xen-gntalloc: integer overflow in gntalloc_ioctl_alloc()\n  xen-gntdev: integer overflow in gntdev_alloc_map()\n  xen:pvhvm: enable PVHVM VCPU placement when using more than 32 CPUs.\n  xen/balloon: Avoid OOM when requesting highmem\n  xen: Remove hanging references to CONFIG_XEN_PLATFORM_PCI\n  xen: map foreign pages for shared rings by updating the PTEs directly\n"
    },
    {
      "commit": "15bd1cfb3055d866614cdaf38e43201936264e50",
      "tree": "020261b5a984684201a39e661934aa5dcdf82f83",
      "parents": [
        "9545eb61e5bb70055fd9358f25f95387f7398cba",
        "019ceb7d5d252ce71001a157cf29f4ac28501b72"
      ],
      "author": {
        "name": "Linus Torvalds",
        "email": "torvalds@linux-foundation.org",
        "time": "Fri Nov 18 09:34:35 2011 -0200"
      },
      "committer": {
        "name": "Linus Torvalds",
        "email": "torvalds@linux-foundation.org",
        "time": "Fri Nov 18 09:34:35 2011 -0200"
      },
      "message": "Merge branch \u0027for-linus\u0027 of git://git.kernel.dk/linux-block\n\n* \u0027for-linus\u0027 of git://git.kernel.dk/linux-block:\n  block: add missed trace_block_plug\n  paride: fix potential information leak in pg_read()\n  bio: change some signed vars to unsigned\n  block: avoid unnecessary plug list flush\n  cciss: auto engage SCSI mid layer at driver load time\n  loop: cleanup set_status interface\n  include/linux/bio.h: use a static inline function for bio_integrity_clone()\n  loop: prevent information leak after failed read\n  block: Always check length of all iov entries in blk_rq_map_user_iov()\n  The Windows driver .inf disables ASPM on all cciss devices. Do the same.\n  backing-dev: ensure wakeup_timer is deleted\n  block: Revert \"[SCSI] genhd: add a new attribute \"alias\" in gendisk\"\n"
    },
    {
      "commit": "468e6a20afaccb67e2a7d7f60d301f90e1c6f301",
      "tree": "5558e92e85decd0fa0bb95ed6e637e1f68ea2fe1",
      "parents": [
        "1df647197c5b8aacaeb58592cba9a1df322c9000"
      ],
      "author": {
        "name": "Wu Fengguang",
        "email": "fengguang.wu@intel.com",
        "time": "Wed Sep 07 10:41:32 2011 -0600"
      },
      "committer": {
        "name": "Wu Fengguang",
        "email": "fengguang.wu@intel.com",
        "time": "Thu Nov 17 20:49:06 2011 +0800"
      },
      "message": "writeback: remove vm_dirties and task-\u003edirties\n\nThey are not used any more.\n\nSigned-off-by: Wu Fengguang \u003cfengguang.wu@intel.com\u003e\n"
    },
    {
      "commit": "1df647197c5b8aacaeb58592cba9a1df322c9000",
      "tree": "d413b165aca10d3a6058e39430680e38c09c0037",
      "parents": [
        "499d05ecf990a7a7bbf9e0a273f9969f8ec69efc"
      ],
      "author": {
        "name": "Wu Fengguang",
        "email": "fengguang.wu@intel.com",
        "time": "Sun Nov 13 19:47:32 2011 -0600"
      },
      "committer": {
        "name": "Wu Fengguang",
        "email": "fengguang.wu@intel.com",
        "time": "Thu Nov 17 20:39:32 2011 +0800"
      },
      "message": "writeback: hard throttle 1000+ dd on a slow USB stick\n\nThe sleep based balance_dirty_pages() can pause at most MAX_PAUSE\u003d200ms\non every 1 4KB-page, which means it cannot throttle a task under\n4KB/200ms\u003d20KB/s. So when there are more than 512 dd writing to a\n10MB/s USB stick, its bdi dirty pages could grow out of control.\n\nEven if we can increase MAX_PAUSE, the minimal (task_ratelimit \u003d 1)\nmeans a limit of 4KB/s.\n                                                       \nThey can eventually be safeguarded by the global limit check \n(nr_dirty \u003c dirty_thresh). However if someone is also writing to an \nHDD at the same time, it\u0027ll get poor HDD write performance.\n                                                       \nWe at least want to maintain good write performance for other devices\nwhen one device is attacked by some \"massive parallel\" workload, or\nsuffers from slow write bandwidth, or somehow get stalled due to some \nerror condition (eg. NFS server not responding).\n\nFor a stalled device, we need to completely block its dirtiers, too,\nbefore its bdi dirty pages grow all the way up to the global limit and\nleave no space for the other functional devices.\n\nSo change the loop exit condition to\n\n\t/*\n\t * Always enforce global dirty limit; also enforce bdi dirty limit\n\t * if the normal max_pause sleeps cannot keep things under control.\n\t */\n\tif (nr_dirty \u003c dirty_thresh \u0026\u0026\n\t    (bdi_dirty \u003c bdi_thresh || bdi-\u003edirty_ratelimit \u003e 1))\n\t\tbreak;\n\nwhich can be further simplified to\n\n\tif (task_ratelimit)\n\t\tbreak;\n\nSigned-off-by: Wu Fengguang \u003cfengguang.wu@intel.com\u003e\n"
    },
    {
      "commit": "cd12909cb576d37311fe35868780e82d5007d0c8",
      "tree": "70ec60af4feb32087f542a838fe4dce8717f0cd6",
      "parents": [
        "1ea6b8f48918282bdca0b32a34095504ee65bab5"
      ],
      "author": {
        "name": "David Vrabel",
        "email": "david.vrabel@citrix.com",
        "time": "Thu Sep 29 16:53:32 2011 +0100"
      },
      "committer": {
        "name": "Konrad Rzeszutek Wilk",
        "email": "konrad.wilk@oracle.com",
        "time": "Wed Nov 16 12:13:08 2011 -0500"
      },
      "message": "xen: map foreign pages for shared rings by updating the PTEs directly\n\nWhen mapping a foreign page with xenbus_map_ring_valloc() with the\nGNTTABOP_map_grant_ref hypercall, set the GNTMAP_contains_pte flag and\npass a pointer to the PTE (in init_mm).\n\nAfter the page is mapped, the usual fault mechanism can be used to\nupdate additional MMs.  This allows the vmalloc_sync_all() to be\nremoved from alloc_vm_area().\n\nSigned-off-by: David Vrabel \u003cdavid.vrabel@citrix.com\u003e\nAcked-by: Andrew Morton \u003cakpm@linux-foundation.org\u003e\n[v1: Squashed fix by Michal for no-mmu case]\nSigned-off-by: Konrad Rzeszutek Wilk \u003ckonrad.wilk@oracle.com\u003e\nSigned-off-by: Michal Simek \u003cmonstr@monstr.eu\u003e\n"
    },
    {
      "commit": "499d05ecf990a7a7bbf9e0a273f9969f8ec69efc",
      "tree": "cbcdc35276936db1d63959261bfbc02dda2b48a3",
      "parents": [
        "6aaf05f472c97ebceff47d9eef464574f1a55727"
      ],
      "author": {
        "name": "Jan Kara",
        "email": "jack@suse.cz",
        "time": "Wed Nov 16 19:34:48 2011 +0800"
      },
      "committer": {
        "name": "Wu Fengguang",
        "email": "fengguang.wu@intel.com",
        "time": "Wed Nov 16 19:53:44 2011 +0800"
      },
      "message": "mm: Make task in balance_dirty_pages() killable\n\nThere is no reason why task in balance_dirty_pages() shouldn\u0027t be killable\nand it helps in recovering from some error conditions (like when filesystem\ngoes in error state and cannot accept writeback anymore but we still want to\nkill processes using it to be able to unmount it).\n\nThere will be follow up patches to further abort the generic_perform_write()\nand other filesystem write loops, to avoid large write + SIGKILL combination\nexceeding the dirty limit and possibly strange OOM.\n\nReported-by: Kazuya Mio \u003ck-mio@sx.jp.nec.com\u003e\nTested-by: Kazuya Mio \u003ck-mio@sx.jp.nec.com\u003e\nReviewed-by: Neil Brown \u003cneilb@suse.de\u003e\nReviewed-by: KOSAKI Motohiro \u003ckosaki.motohiro@jp.fujitsu.com\u003e\nSigned-off-by: Jan Kara \u003cjack@suse.cz\u003e\nSigned-off-by: Wu Fengguang \u003cfengguang.wu@intel.com\u003e\n"
    },
    {
      "commit": "ea4039a34c4c206d015d34a49d0b00868e37db1d",
      "tree": "1c7c38a83e0765f717f11079c573bb94f7445b77",
      "parents": [
        "66e13e66b6c4e5b2ecd6225e1f8437640cfb6498"
      ],
      "author": {
        "name": "Hillf Danton",
        "email": "dhillf@gmail.com",
        "time": "Tue Nov 15 14:36:12 2011 -0800"
      },
      "committer": {
        "name": "Linus Torvalds",
        "email": "torvalds@linux-foundation.org",
        "time": "Tue Nov 15 22:41:52 2011 -0200"
      },
      "message": "hugetlb: release pages in the error path of hugetlb_cow()\n\nIf we fail to prepare an anon_vma, the {new, old}_page should be released,\nor they will leak.\n\nSigned-off-by: Hillf Danton \u003cdhillf@gmail.com\u003e\nReviewed-by: Andrea Arcangeli \u003caarcange@redhat.com\u003e\nCc: Hugh Dickins \u003chughd@google.com\u003e\nCc: Johannes Weiner \u003cjweiner@redhat.com\u003e\nSigned-off-by: Andrew Morton \u003cakpm@linux-foundation.org\u003e\nSigned-off-by: Linus Torvalds \u003ctorvalds@linux-foundation.org\u003e\n"
    },
    {
      "commit": "5aecc85abdb9ac2b0e6548d13652a34142e7ae89",
      "tree": "8939325e1ab98bce8caf16d9bdd3e57cd73af846",
      "parents": [
        "001ef5e4554b851cf50fe03bc4c266c28ed8e62d"
      ],
      "author": {
        "name": "Michal Hocko",
        "email": "mhocko@suse.cz",
        "time": "Tue Nov 15 14:36:07 2011 -0800"
      },
      "committer": {
        "name": "Linus Torvalds",
        "email": "torvalds@linux-foundation.org",
        "time": "Tue Nov 15 22:41:51 2011 -0200"
      },
      "message": "oom: do not kill tasks with oom_score_adj OOM_SCORE_ADJ_MIN\n\nCommit c9f01245 (\"oom: remove oom_disable_count\") has removed the\noom_disable_count counter which has been used for early break out from\noom_badness so we could never select a task with oom_score_adj set to\nOOM_SCORE_ADJ_MIN (oom disabled).\n\nNow that the counter is gone we are always going through heuristics\ncalculation and we always return a non zero positive value.  This means\nthat we can end up killing a task with OOM disabled because it is\nindistinguishable from regular tasks with 1% resp.  CAP_SYS_ADMIN tasks\nwith 3% usage of memory or tasks with oom_score_adj set but OOM enabled.\n\nLet\u0027s break out early if the task should have OOM disabled.\n\nSigned-off-by: Michal Hocko \u003cmhocko@suse.cz\u003e\nAcked-by: David Rientjes \u003crientjes@google.com\u003e\nAcked-by: KOSAKI Motohiro \u003ckosaki.motohiro@jp.fujitsu.com\u003e\nCc: Oleg Nesterov \u003coleg@redhat.com\u003e\nCc: Ying Han \u003cyinghan@google.com\u003e\nSigned-off-by: Andrew Morton \u003cakpm@linux-foundation.org\u003e\nSigned-off-by: Linus Torvalds \u003ctorvalds@linux-foundation.org\u003e\n"
    },
    {
      "commit": "7a401a972df8e184b3d1a3fc958c0a4ddee8d312",
      "tree": "b42d7c60b9c9826f79c4f3c30a623ebe7b40ce1a",
      "parents": [
        "d0985394e7fee6b25a7cc8335d45bc1c1a8ab2d3"
      ],
      "author": {
        "name": "Rabin Vincent",
        "email": "rabin.vincent@stericsson.com",
        "time": "Fri Nov 11 13:29:04 2011 +0100"
      },
      "committer": {
        "name": "Jens Axboe",
        "email": "axboe@kernel.dk",
        "time": "Fri Nov 11 13:29:04 2011 +0100"
      },
      "message": "backing-dev: ensure wakeup_timer is deleted\n\nbdi_prune_sb() in bdi_unregister() attempts to removes the bdi links\nfrom all super_blocks and then del_timer_sync() the writeback timer.\n\nHowever, this can race with __mark_inode_dirty(), leading to\nbdi_wakeup_thread_delayed() rearming the writeback timer on the bdi\nwe\u0027re unregistering, after we\u0027ve called del_timer_sync().\n\nThis can end up with the bdi being freed with an active timer inside it,\nas in the case of the following dump after the removal of an SD card.\n\nFix this by redoing the del_timer_sync() in bdi_destory().\n\n ------------[ cut here ]------------\n WARNING: at /home/rabin/kernel/arm/lib/debugobjects.c:262 debug_print_object+0x9c/0xc8()\n ODEBUG: free active (active state 0) object type: timer_list hint: wakeup_timer_fn+0x0/0x180\n Modules linked in:\n Backtrace:\n [\u003cc00109dc\u003e] (dump_backtrace+0x0/0x110) from [\u003cc0236e4c\u003e] (dump_stack+0x18/0x1c)\n  r6:c02bc638 r5:00000106 r4:c79f5d18 r3:00000000\n [\u003cc0236e34\u003e] (dump_stack+0x0/0x1c) from [\u003cc0025e6c\u003e] (warn_slowpath_common+0x54/0x6c)\n [\u003cc0025e18\u003e] (warn_slowpath_common+0x0/0x6c) from [\u003cc0025f28\u003e] (warn_slowpath_fmt+0x38/0x40)\n  r8:20000013 r7:c780c6f0 r6:c031613c r5:c780c6f0 r4:c02b1b29\n r3:00000009\n [\u003cc0025ef0\u003e] (warn_slowpath_fmt+0x0/0x40) from [\u003cc015eb4c\u003e] (debug_print_object+0x9c/0xc8)\n  r3:c02b1b29 r2:c02bc662\n [\u003cc015eab0\u003e] (debug_print_object+0x0/0xc8) from [\u003cc015f574\u003e] (debug_check_no_obj_freed+0xac/0x1dc)\n  r6:c7964000 r5:00000001 r4:c7964000\n [\u003cc015f4c8\u003e] (debug_check_no_obj_freed+0x0/0x1dc) from [\u003cc00a9e38\u003e] (kmem_cache_free+0x88/0x1f8)\n [\u003cc00a9db0\u003e] (kmem_cache_free+0x0/0x1f8) from [\u003cc014286c\u003e] (blk_release_queue+0x70/0x78)\n [\u003cc01427fc\u003e] (blk_release_queue+0x0/0x78) from [\u003cc015290c\u003e] (kobject_release+0x70/0x84)\n  r5:c79641f0 r4:c796420c\n [\u003cc015289c\u003e] (kobject_release+0x0/0x84) from [\u003cc0153ce4\u003e] (kref_put+0x68/0x80)\n  r7:00000083 r6:c74083d0 r5:c015289c r4:c796420c\n [\u003cc0153c7c\u003e] (kref_put+0x0/0x80) from [\u003cc01527d0\u003e] (kobject_put+0x48/0x5c)\n  r5:c79643b4 r4:c79641f0\n [\u003cc0152788\u003e] (kobject_put+0x0/0x5c) from [\u003cc013ddd8\u003e] (blk_cleanup_queue+0x68/0x74)\n  r4:c7964000\n [\u003cc013dd70\u003e] (blk_cleanup_queue+0x0/0x74) from [\u003cc01a6370\u003e] (mmc_blk_put+0x78/0xe8)\n  r5:00000000 r4:c794c400\n [\u003cc01a62f8\u003e] (mmc_blk_put+0x0/0xe8) from [\u003cc01a64b4\u003e] (mmc_blk_release+0x24/0x38)\n  r5:c794c400 r4:c0322824\n [\u003cc01a6490\u003e] (mmc_blk_release+0x0/0x38) from [\u003cc00de11c\u003e] (__blkdev_put+0xe8/0x170)\n  r5:c78d5e00 r4:c74083c0\n [\u003cc00de034\u003e] (__blkdev_put+0x0/0x170) from [\u003cc00de2c0\u003e] (blkdev_put+0x11c/0x12c)\n  r8:c79f5f70 r7:00000001 r6:c74083d0 r5:00000083 r4:c74083c0\n r3:00000000\n [\u003cc00de1a4\u003e] (blkdev_put+0x0/0x12c) from [\u003cc00b0724\u003e] (kill_block_super+0x60/0x6c)\n  r7:c7942300 r6:c79f4000 r5:00000083 r4:c74083c0\n [\u003cc00b06c4\u003e] (kill_block_super+0x0/0x6c) from [\u003cc00b0a94\u003e] (deactivate_locked_super+0x44/0x70)\n  r6:c79f4000 r5:c031af64 r4:c794dc00 r3:c00b06c4\n [\u003cc00b0a50\u003e] (deactivate_locked_super+0x0/0x70) from [\u003cc00b1358\u003e] (deactivate_super+0x6c/0x70)\n  r5:c794dc00 r4:c794dc00\n [\u003cc00b12ec\u003e] (deactivate_super+0x0/0x70) from [\u003cc00c88b0\u003e] (mntput_no_expire+0x188/0x194)\n  r5:c794dc00 r4:c7942300\n [\u003cc00c8728\u003e] (mntput_no_expire+0x0/0x194) from [\u003cc00c95e0\u003e] (sys_umount+0x2e4/0x310)\n  r6:c7942300 r5:00000000 r4:00000000 r3:00000000\n [\u003cc00c92fc\u003e] (sys_umount+0x0/0x310) from [\u003cc000d940\u003e] (ret_fast_syscall+0x0/0x30)\n ---[ end trace e5c83c92ada51c76 ]---\n\nCc: stable@kernel.org\nSigned-off-by: Rabin Vincent \u003crabin.vincent@stericsson.com\u003e\nSigned-off-by: Linus Walleij \u003clinus.walleij@linaro.org\u003e\nSigned-off-by: Jens Axboe \u003caxboe@kernel.dk\u003e\n"
    },
    {
      "commit": "3a73dbbc9bb3fc8594cd67af4db6c563175dfddb",
      "tree": "e5120c19fd8e83a38d5c0852336a92c5b7862c6a",
      "parents": [
        "31555213f03bca37d2c02e10946296052f4ecfcd"
      ],
      "author": {
        "name": "Wu Fengguang",
        "email": "fengguang.wu@intel.com",
        "time": "Mon Nov 07 19:19:28 2011 +0800"
      },
      "committer": {
        "name": "Wu Fengguang",
        "email": "fengguang.wu@intel.com",
        "time": "Mon Nov 07 19:19:28 2011 +0800"
      },
      "message": "writeback: fix uninitialized task_ratelimit\n\nIn balance_dirty_pages() task_ratelimit may be not initialized\n(initialization skiped by goto pause), and then used when calling\ntracing hook.\n\nFix it by moving the task_ratelimit assignment before goto pause.\n\nReported-by: Witold Baryluk \u003cbaryluk@smp.if.uj.edu.pl\u003e\nSigned-off-by: Wu Fengguang \u003cfengguang.wu@intel.com\u003e\n"
    },
    {
      "commit": "32aaeffbd4a7457bf2f7448b33b5946ff2a960eb",
      "tree": "faf7ad871d87176423ff9ed1d1ba4d9c688fc23f",
      "parents": [
        "208bca0860406d16398145ddd950036a737c3c9d",
        "67b84999b1a8b1af5625b1eabe92146c5eb42932"
      ],
      "author": {
        "name": "Linus Torvalds",
        "email": "torvalds@linux-foundation.org",
        "time": "Sun Nov 06 19:44:47 2011 -0800"
      },
      "committer": {
        "name": "Linus Torvalds",
        "email": "torvalds@linux-foundation.org",
        "time": "Sun Nov 06 19:44:47 2011 -0800"
      },
      "message": "Merge branch \u0027modsplit-Oct31_2011\u0027 of git://git.kernel.org/pub/scm/linux/kernel/git/paulg/linux\n\n* \u0027modsplit-Oct31_2011\u0027 of git://git.kernel.org/pub/scm/linux/kernel/git/paulg/linux: (230 commits)\n  Revert \"tracing: Include module.h in define_trace.h\"\n  irq: don\u0027t put module.h into irq.h for tracking irqgen modules.\n  bluetooth: macroize two small inlines to avoid module.h\n  ip_vs.h: fix implicit use of module_get/module_put from module.h\n  nf_conntrack.h: fix up fallout from implicit moduleparam.h presence\n  include: replace linux/module.h with \"struct module\" wherever possible\n  include: convert various register fcns to macros to avoid include chaining\n  crypto.h: remove unused crypto_tfm_alg_modname() inline\n  uwb.h: fix implicit use of asm/page.h for PAGE_SIZE\n  pm_runtime.h: explicitly requires notifier.h\n  linux/dmaengine.h: fix implicit use of bitmap.h and asm/page.h\n  miscdevice.h: fix up implicit use of lists and types\n  stop_machine.h: fix implicit use of smp.h for smp_processor_id\n  of: fix implicit use of errno.h in include/linux/of.h\n  of_platform.h: delete needless include \u003clinux/module.h\u003e\n  acpi: remove module.h include from platform/aclinux.h\n  miscdevice.h: delete unnecessary inclusion of module.h\n  device_cgroup.h: delete needless include \u003clinux/module.h\u003e\n  net: sch_generic remove redundant use of \u003clinux/module.h\u003e\n  net: inet_timewait_sock doesnt need \u003clinux/module.h\u003e\n  ...\n\nFix up trivial conflicts (other header files, and  removal of the ab3550 mfd driver) in\n - drivers/media/dvb/frontends/dibx000_common.c\n - drivers/media/video/{mt9m111.c,ov6650.c}\n - drivers/mfd/ab3550-core.c\n - include/linux/dmaengine.h\n"
    },
    {
      "commit": "208bca0860406d16398145ddd950036a737c3c9d",
      "tree": "7797a16c17d8bd155120126fa7976727fc6de013",
      "parents": [
        "6aad3738f6a79fd0ca480eaceefe064cc471f6eb",
        "0e175a1835ffc979e55787774e58ec79e41957d7"
      ],
      "author": {
        "name": "Linus Torvalds",
        "email": "torvalds@linux-foundation.org",
        "time": "Sun Nov 06 19:02:23 2011 -0800"
      },
      "committer": {
        "name": "Linus Torvalds",
        "email": "torvalds@linux-foundation.org",
        "time": "Sun Nov 06 19:02:23 2011 -0800"
      },
      "message": "Merge branch \u0027writeback-for-linus\u0027 of git://git.kernel.org/pub/scm/linux/kernel/git/wfg/linux\n\n* \u0027writeback-for-linus\u0027 of git://git.kernel.org/pub/scm/linux/kernel/git/wfg/linux:\n  writeback: Add a \u0027reason\u0027 to wb_writeback_work\n  writeback: send work item to queue_io, move_expired_inodes\n  writeback: trace event balance_dirty_pages\n  writeback: trace event bdi_dirty_ratelimit\n  writeback: fix ppc compile warnings on do_div(long long, unsigned long)\n  writeback: per-bdi background threshold\n  writeback: dirty position control - bdi reserve area\n  writeback: control dirty pause time\n  writeback: limit max dirty pause time\n  writeback: IO-less balance_dirty_pages()\n  writeback: per task dirty rate limit\n  writeback: stabilize bdi-\u003edirty_ratelimit\n  writeback: dirty rate control\n  writeback: add bg_threshold parameter to __bdi_update_bandwidth()\n  writeback: dirty position control\n  writeback: account per-bdi accumulated dirtied pages\n"
    },
    {
      "commit": "b4fdcb02f1e39c27058a885905bd0277370ba441",
      "tree": "fd4cfd1994f21f44afe5e7904681fb5ac09f81b8",
      "parents": [
        "044595d4e448305fbaec472eb7d22636d24e7d8c",
        "6dd9ad7df2019b1e33a372a501907db293ebcd0d"
      ],
      "author": {
        "name": "Linus Torvalds",
        "email": "torvalds@linux-foundation.org",
        "time": "Fri Nov 04 17:06:58 2011 -0700"
      },
      "committer": {
        "name": "Linus Torvalds",
        "email": "torvalds@linux-foundation.org",
        "time": "Fri Nov 04 17:06:58 2011 -0700"
      },
      "message": "Merge branch \u0027for-3.2/core\u0027 of git://git.kernel.dk/linux-block\n\n* \u0027for-3.2/core\u0027 of git://git.kernel.dk/linux-block: (29 commits)\n  block: don\u0027t call blk_drain_queue() if elevator is not up\n  blk-throttle: use queue_is_locked() instead of lockdep_is_held()\n  blk-throttle: Take blkcg-\u003elock while traversing blkcg-\u003epolicy_list\n  blk-throttle: Free up policy node associated with deleted rule\n  block: warn if tag is greater than real_max_depth.\n  block: make gendisk hold a reference to its queue\n  blk-flush: move the queue kick into\n  blk-flush: fix invalid BUG_ON in blk_insert_flush\n  block: Remove the control of complete cpu from bio.\n  block: fix a typo in the blk-cgroup.h file\n  block: initialize the bounce pool if high memory may be added later\n  block: fix request_queue lifetime handling by making blk_queue_cleanup() properly shutdown\n  block: drop @tsk from attempt_plug_merge() and explain sync rules\n  block: make get_request[_wait]() fail if queue is dead\n  block: reorganize throtl_get_tg() and blk_throtl_bio()\n  block: reorganize queue draining\n  block: drop unnecessary blk_get/put_queue() in scsi_cmd_ioctl() and blk_get_tg()\n  block: pass around REQ_* flags instead of broken down booleans during request alloc/free\n  block: move blk_throtl prototypes to block/blk.h\n  block: fix genhd refcounting in blkio_policy_parse_and_set()\n  ...\n\nFix up trivial conflicts due to \"mddev_t\" -\u003e \"struct mddev\" conversion\nand making the request functions be of type \"void\" instead of \"int\" in\n - drivers/md/{faulty.c,linear.c,md.c,md.h,multipath.c,raid0.c,raid1.c,raid10.c,raid5.c}\n - drivers/staging/zram/zram_drv.c\n"
    },
    {
      "commit": "092f4c56c1927e4b61a41ee8055005f1cb437009",
      "tree": "616ceb54b7671ccc13922ae9e002b8b972f6e09e",
      "parents": [
        "80c2861672bbf000f6af838656959ee937e4ee4d",
        "c1e2ee2dc436574880758b3836fc96935b774c32"
      ],
      "author": {
        "name": "Linus Torvalds",
        "email": "torvalds@linux-foundation.org",
        "time": "Wed Nov 02 16:07:27 2011 -0700"
      },
      "committer": {
        "name": "Linus Torvalds",
        "email": "torvalds@linux-foundation.org",
        "time": "Wed Nov 02 16:07:27 2011 -0700"
      },
      "message": "Merge branch \u0027akpm\u0027 (Andrew\u0027s incoming - part two)\n\nSays Andrew:\n\n \"60 patches.  That\u0027s good enough for -rc1 I guess.  I have quite a lot\n  of detritus to be rechecked, work through maintainers, etc.\n\n - most of the remains of MM\n - rtc\n - various misc\n - cgroups\n - memcg\n - cpusets\n - procfs\n - ipc\n - rapidio\n - sysctl\n - pps\n - w1\n - drivers/misc\n - aio\"\n\n* akpm: (60 commits)\n  memcg: replace ss-\u003eid_lock with a rwlock\n  aio: allocate kiocbs in batches\n  drivers/misc/vmw_balloon.c: fix typo in code comment\n  drivers/misc/vmw_balloon.c: determine page allocation flag can_sleep outside loop\n  w1: disable irqs in critical section\n  drivers/w1/w1_int.c: multiple masters used same init_name\n  drivers/power/ds2780_battery.c: fix deadlock upon insertion and removal\n  drivers/power/ds2780_battery.c: add a nolock function to w1 interface\n  drivers/power/ds2780_battery.c: create central point for calling w1 interface\n  w1: ds2760 and ds2780, use ida for id and ida_simple_get() to get it\n  pps gpio client: add missing dependency\n  pps: new client driver using GPIO\n  pps: default echo function\n  include/linux/dma-mapping.h: add dma_zalloc_coherent()\n  sysctl: make CONFIG_SYSCTL_SYSCALL default to n\n  sysctl: add support for poll()\n  RapidIO: documentation update\n  drivers/net/rionet.c: fix ethernet address macros for LE platforms\n  RapidIO: fix potential null deref in rio_setup_device()\n  RapidIO: add mport driver for Tsi721 bridge\n  ...\n"
    },
    {
      "commit": "61600f578fbd2e8ad0c90bddb9c729e7628d3813",
      "tree": "25842f7e2ee743c66e30df1de3cb666d48bbd063",
      "parents": [
        "4799401fef9d5951b2da384c5eb08034c48e08a0"
      ],
      "author": {
        "name": "H Hartley Sweeten",
        "email": "hartleys@visionengravers.com",
        "time": "Wed Nov 02 13:38:36 2011 -0700"
      },
      "committer": {
        "name": "Linus Torvalds",
        "email": "torvalds@linux-foundation.org",
        "time": "Wed Nov 02 16:07:00 2011 -0700"
      },
      "message": "mm/page_cgroup.c: quiet sparse noise\n\nwarning: symbol \u0027swap_cgroup_ctrl\u0027 was not declared. Should it be static?\n\nSigned-off-by: H Hartley Sweeten \u003chsweeten@visionengravers.com\u003e\nCc: Paul Menage \u003cpaul@paulmenage.org\u003e\nCc: Li Zefan \u003clizf@cn.fujitsu.com\u003e\nAcked-by: Balbir Singh \u003cbsingharora@gmail.com\u003e\nCc: Daisuke Nishimura \u003cnishimura@mxp.nes.nec.co.jp\u003e\nAcked-by: KAMEZAWA Hiroyuki \u003ckamezawa.hiroyu@jp.fujitsu.com\u003e\nSigned-off-by: Andrew Morton \u003cakpm@linux-foundation.org\u003e\nSigned-off-by: Linus Torvalds \u003ctorvalds@linux-foundation.org\u003e\n"
    },
    {
      "commit": "4799401fef9d5951b2da384c5eb08034c48e08a0",
      "tree": "94f9113c6870f46811aaa0f08cdb3ca2beba1e9c",
      "parents": [
        "a61ed3cec51cfd4877855c24890ab8d3e2b143e3"
      ],
      "author": {
        "name": "Steven Rostedt",
        "email": "srostedt@redhat.com",
        "time": "Wed Nov 02 13:38:33 2011 -0700"
      },
      "committer": {
        "name": "Linus Torvalds",
        "email": "torvalds@linux-foundation.org",
        "time": "Wed Nov 02 16:07:00 2011 -0700"
      },
      "message": "memcg: Fix race condition in memcg_check_events() with this_cpu usage\n\nVarious code in memcontrol.c () calls this_cpu_read() on the calculations\nto be done from two different percpu variables, or does an open-coded\nread-modify-write on a single percpu variable.\n\nDisable preemption throughout these operations so that the writes go to\nthe correct palces.\n\n[hannes@cmpxchg.org: added this_cpu to __this_cpu conversion]\nSigned-off-by: Johannes Weiner \u003channes@cmpxchg.org\u003e\nSigned-off-by: Steven Rostedt \u003crostedt@goodmis.org\u003e\nCc: Greg Thelen \u003cgthelen@google.com\u003e\nCc: KAMEZAWA Hiroyuki \u003ckamezawa.hiroyu@jp.fujitsu.com\u003e\nCc: Balbir Singh \u003cbalbir@linux.vnet.ibm.com\u003e\nCc: Daisuke Nishimura \u003cnishimura@mxp.nes.nec.co.jp\u003e\nCc: Thomas Gleixner \u003ctglx@linutronix.de\u003e\nCc: Peter Zijlstra \u003cpeterz@infradead.org\u003e\nCc: Christoph Lameter \u003ccl@linux.com\u003e\nSigned-off-by: Andrew Morton \u003cakpm@linux-foundation.org\u003e\nSigned-off-by: Linus Torvalds \u003ctorvalds@linux-foundation.org\u003e\n"
    },
    {
      "commit": "a61ed3cec51cfd4877855c24890ab8d3e2b143e3",
      "tree": "3ffb850513b202c8eb0ac09ad6d307719bd39856",
      "parents": [
        "9b272977e3b99a8699361d214b51f98c8a9e0e7b"
      ],
      "author": {
        "name": "Johannes Weiner",
        "email": "jweiner@redhat.com",
        "time": "Wed Nov 02 13:38:29 2011 -0700"
      },
      "committer": {
        "name": "Linus Torvalds",
        "email": "torvalds@linux-foundation.org",
        "time": "Wed Nov 02 16:07:00 2011 -0700"
      },
      "message": "memcg: close race between charge and putback\n\nThere is a potential race between a thread charging a page and another\nthread putting it back to the LRU list:\n\n  charge:                         putback:\n  SetPageCgroupUsed               SetPageLRU\n  PageLRU \u0026\u0026 add to memcg LRU     PageCgroupUsed \u0026\u0026 add to memcg LRU\n\nThe order of setting one flag and checking the other is crucial, otherwise\nthe charge may observe !PageLRU while the putback observes !PageCgroupUsed\nand the page is not linked to the memcg LRU at all.\n\nGlobal memory pressure may fix this by trying to isolate and putback the\npage for reclaim, where that putback would link it to the memcg LRU again.\n Without that, the memory cgroup is undeletable due to a charge whose\nphysical page can not be found and moved out.\n\nSigned-off-by: Johannes Weiner \u003cjweiner@redhat.com\u003e\nCc: Ying Han \u003cyinghan@google.com\u003e\nAcked-by: KAMEZAWA Hiroyuki \u003ckamezawa.hiroyu@jp.fujitsu.com\u003e\nCc: Daisuke Nishimura \u003cnishimura@mxp.nes.nec.co.jp\u003e\nCc: Balbir Singh \u003cbsingharora@gmail.com\u003e\nCc: Michal Hocko \u003cmhocko@suse.cz\u003e\nSigned-off-by: Andrew Morton \u003cakpm@linux-foundation.org\u003e\nSigned-off-by: Linus Torvalds \u003ctorvalds@linux-foundation.org\u003e\n"
    },
    {
      "commit": "9b272977e3b99a8699361d214b51f98c8a9e0e7b",
      "tree": "2113cee95a42ea893aa6eddb01b14e563153fabb",
      "parents": [
        "0a619e58703b86d53d07e938eade9a91a4a863c6"
      ],
      "author": {
        "name": "Johannes Weiner",
        "email": "jweiner@redhat.com",
        "time": "Wed Nov 02 13:38:23 2011 -0700"
      },
      "committer": {
        "name": "Linus Torvalds",
        "email": "torvalds@linux-foundation.org",
        "time": "Wed Nov 02 16:07:00 2011 -0700"
      },
      "message": "memcg: skip scanning active lists based on individual size\n\nReclaim decides to skip scanning an active list when the corresponding\ninactive list is above a certain size in comparison to leave the assumed\nworking set alone while there are still enough reclaim candidates around.\n\nThe memcg implementation of comparing those lists instead reports whether\nthe whole memcg is low on the requested type of inactive pages,\nconsidering all nodes and zones.\n\nThis can lead to an oversized active list not being scanned because of the\nstate of the other lists in the memcg, as well as an active list being\nscanned while its corresponding inactive list has enough pages.\n\nNot only is this wrong, it\u0027s also a scalability hazard, because the global\nmemory state over all nodes and zones has to be gathered for each memcg\nand zone scanned.\n\nMake these calculations purely based on the size of the two LRU lists\nthat are actually affected by the outcome of the decision.\n\nSigned-off-by: Johannes Weiner \u003cjweiner@redhat.com\u003e\nReviewed-by: Rik van Riel \u003criel@redhat.com\u003e\nCc: KOSAKI Motohiro \u003ckosaki.motohiro@jp.fujitsu.com\u003e\nAcked-by: KAMEZAWA Hiroyuki \u003ckamezawa.hiroyu@jp.fujitsu.com\u003e\nCc: Daisuke Nishimura \u003cnishimura@mxp.nes.nec.co.jp\u003e\nCc: Balbir Singh \u003cbsingharora@gmail.com\u003e\nReviewed-by: Minchan Kim \u003cminchan.kim@gmail.com\u003e\nReviewed-by: Ying Han \u003cyinghan@google.com\u003e\nSigned-off-by: Andrew Morton \u003cakpm@linux-foundation.org\u003e\nSigned-off-by: Linus Torvalds \u003ctorvalds@linux-foundation.org\u003e\n"
    },
    {
      "commit": "0a619e58703b86d53d07e938eade9a91a4a863c6",
      "tree": "0579cebdbdbb90507db04b320acb4191f8a86f2e",
      "parents": [
        "715a5ee82ab3c07430f748630044354132add5ad"
      ],
      "author": {
        "name": "Igor Mammedov",
        "email": "imammedo@redhat.com",
        "time": "Wed Nov 02 13:38:21 2011 -0700"
      },
      "committer": {
        "name": "Linus Torvalds",
        "email": "torvalds@linux-foundation.org",
        "time": "Wed Nov 02 16:07:00 2011 -0700"
      },
      "message": "memcg: do not expose uninitialized mem_cgroup_per_node to world\n\nIf somebody is touching data too early, it might be easier to diagnose a\nproblem when dereferencing NULL at mem-\u003einfo.nodeinfo[node] than trying to\nunderstand why mem_cgroup_per_zone is [un|partly]initialized.\n\nSigned-off-by: Igor Mammedov \u003cimammedo@redhat.com\u003e\nAcked-by: Michal Hocko \u003cmhocko@suse.cz\u003e\nCc: KAMEZAWA Hiroyuki \u003ckamezawa.hiroyu@jp.fujitsu.com\u003e\nSigned-off-by: Andrew Morton \u003cakpm@linux-foundation.org\u003e\nSigned-off-by: Linus Torvalds \u003ctorvalds@linux-foundation.org\u003e\n"
    },
    {
      "commit": "715a5ee82ab3c07430f748630044354132add5ad",
      "tree": "f77a20fbcd0e19dcb3b65f511194e01e8095bf6a",
      "parents": [
        "c0ff4b8540a5c158b8e5bafb7d767298b67b0b92"
      ],
      "author": {
        "name": "KAMEZAWA Hiroyuki",
        "email": "kamezawa.hiroyu@jp.fujitsu.com",
        "time": "Wed Nov 02 13:38:18 2011 -0700"
      },
      "committer": {
        "name": "Linus Torvalds",
        "email": "torvalds@linux-foundation.org",
        "time": "Wed Nov 02 16:06:59 2011 -0700"
      },
      "message": "memcg: fix oom schedule_timeout()\n\nBefore calling schedule_timeout(), task state should be changed.\n\nSigned-off-by: KAMEZAWA Hiroyuki \u003ckamezawa.hiroyu@jp.fujitsu.com\u003e\nAcked-by: Michal Hocko \u003cmhocko@suse.cz\u003e\nSigned-off-by: Andrew Morton \u003cakpm@linux-foundation.org\u003e\nSigned-off-by: Linus Torvalds \u003ctorvalds@linux-foundation.org\u003e\n"
    },
    {
      "commit": "c0ff4b8540a5c158b8e5bafb7d767298b67b0b92",
      "tree": "a47a2bcd0b7b80056cde7ba6b1263aae78f77212",
      "parents": [
        "ff7ee93f47151e23601856e7eb5510babf956571"
      ],
      "author": {
        "name": "Raghavendra K T",
        "email": "raghavendra.kt@linux.vnet.ibm.com",
        "time": "Wed Nov 02 13:38:15 2011 -0700"
      },
      "committer": {
        "name": "Linus Torvalds",
        "email": "torvalds@linux-foundation.org",
        "time": "Wed Nov 02 16:06:59 2011 -0700"
      },
      "message": "memcg: rename mem variable to memcg\n\nThe memcg code sometimes uses \"struct mem_cgroup *mem\" and sometimes uses\n\"struct mem_cgroup *memcg\".  Rename all mem variables to memcg in source\nfile.\n\nSigned-off-by: Raghavendra K T \u003craghavendra.kt@linux.vnet.ibm.com\u003e\nAcked-by: KAMEZAWA Hiroyuki \u003ckamezawa.hiroyu@jp.fujitsu.com\u003e\nAcked-by: Michal Hocko \u003cmhocko@suse.cz\u003e\nSigned-off-by: Andrew Morton \u003cakpm@linux-foundation.org\u003e\nSigned-off-by: Linus Torvalds \u003ctorvalds@linux-foundation.org\u003e\n"
    },
    {
      "commit": "ff7ee93f47151e23601856e7eb5510babf956571",
      "tree": "2a62777ebdec1383d3dd6098cfe8a325c99f2dde",
      "parents": [
        "77ceab8ea590d7dc6c8f055ce43dfebd74428107"
      ],
      "author": {
        "name": "Steven Rostedt",
        "email": "rostedt@goodmis.org",
        "time": "Wed Nov 02 13:38:11 2011 -0700"
      },
      "committer": {
        "name": "Linus Torvalds",
        "email": "torvalds@linux-foundation.org",
        "time": "Wed Nov 02 16:06:59 2011 -0700"
      },
      "message": "cgroup/kmemleak: Annotate alloc_page() for cgroup allocations\n\nWhen the cgroup base was allocated with kmalloc, it was necessary to\nannotate the variable with kmemleak_not_leak().  But because it has\nrecently been changed to be allocated with alloc_page() (which skips\nkmemleak checks) causes a warning on boot up.\n\nI was triggering this output:\n\n allocated 8388608 bytes of page_cgroup\n please try \u0027cgroup_disable\u003dmemory\u0027 option if you don\u0027t want memory cgroups\n kmemleak: Trying to color unknown object at 0xf5840000 as Grey\n Pid: 0, comm: swapper Not tainted 3.0.0-test #12\n Call Trace:\n  [\u003cc17e34e6\u003e] ? printk+0x1d/0x1f^M\n  [\u003cc10e2941\u003e] paint_ptr+0x4f/0x78\n  [\u003cc178ab57\u003e] kmemleak_not_leak+0x58/0x7d\n  [\u003cc108ae9f\u003e] ? __rcu_read_unlock+0x9/0x7d\n  [\u003cc1cdb462\u003e] kmemleak_init+0x19d/0x1e9\n  [\u003cc1cbf771\u003e] start_kernel+0x346/0x3ec\n  [\u003cc1cbf1b4\u003e] ? loglevel+0x18/0x18\n  [\u003cc1cbf0aa\u003e] i386_start_kernel+0xaa/0xb0\n\nAfter a bit of debugging I tracked the object 0xf840000 (and others) down\nto the cgroup code.  The change from allocating base with kmalloc to\nalloc_page() has the base not calling kmemleak_alloc() which adds the\npointer to the object_tree_root, but kmemleak_not_leak() adds it to the\ncrt_early_log[] table.  On kmemleak_init(), the entry is found in the\nearly_log[] but not the object_tree_root, and this error message is\ndisplayed.\n\nIf alloc_page() fails then it defaults back to vmalloc() which still uses\nthe kmemleak_alloc() which makes us still need the kmemleak_not_leak()\ncall.  
The solution is to call the kmemleak_alloc() directly if the\nalloc_page() succeeds.\n\nReviewed-by: Michal Hocko \u003cmhocko@suse.cz\u003e\nSigned-off-by: Steven Rostedt \u003crostedt@goodmis.org\u003e\nAcked-by: Catalin Marinas \u003ccatalin.marinas@arm.com\u003e\nSigned-off-by: Jonathan Nieder \u003cjrnieder@gmail.com\u003e\nSigned-off-by: Andrew Morton \u003cakpm@linux-foundation.org\u003e\nSigned-off-by: Linus Torvalds \u003ctorvalds@linux-foundation.org\u003e\n"
    },
    {
      "commit": "70b50f94f1644e2aa7cb374819cfd93f3c28d725",
      "tree": "79198cd9a92600140827a670d1ed5eefdcd23d79",
      "parents": [
        "994c0e992522c123298b4a91b72f5e67ba2d1123"
      ],
      "author": {
        "name": "Andrea Arcangeli",
        "email": "aarcange@redhat.com",
        "time": "Wed Nov 02 13:36:59 2011 -0700"
      },
      "committer": {
        "name": "Linus Torvalds",
        "email": "torvalds@linux-foundation.org",
        "time": "Wed Nov 02 16:06:57 2011 -0700"
      },
      "message": "mm: thp: tail page refcounting fix\n\nMichel while working on the working set estimation code, noticed that\ncalling get_page_unless_zero() on a random pfn_to_page(random_pfn)\nwasn\u0027t safe, if the pfn ended up being a tail page of a transparent\nhugepage under splitting by __split_huge_page_refcount().\n\nHe then found the problem could also theoretically materialize with\npage_cache_get_speculative() during the speculative radix tree lookups\nthat uses get_page_unless_zero() in SMP if the radix tree page is freed\nand reallocated and get_user_pages is called on it before\npage_cache_get_speculative has a chance to call get_page_unless_zero().\n\nSo the best way to fix the problem is to keep page_tail-\u003e_count zero at\nall times.  This will guarantee that get_page_unless_zero() can never\nsucceed on any tail page.  page_tail-\u003e_mapcount is guaranteed zero and\nis unused for all tail pages of a compound page, so we can simply\naccount the tail page references there and transfer them to\ntail_page-\u003e_count in __split_huge_page_refcount() (in addition to the\nhead_page-\u003e_mapcount).\n\nWhile debugging this s/_count/_mapcount/ change I also noticed get_page is\ncalled by direct-io.c on pages returned by get_user_pages.  That wasn\u0027t\nentirely safe because the two atomic_inc in get_page weren\u0027t atomic.  As\nopposed to other get_user_page users like secondary-MMU page fault to\nestablish the shadow pagetables would never call any superflous get_page\nafter get_user_page returns.  It\u0027s safer to make get_page universally safe\nfor tail pages and to use get_page_foll() within follow_page (inside\nget_user_pages()).  
get_page_foll() is safe to do the refcounting for tail\npages without taking any locks because it is run within PT lock protected\ncritical sections (PT lock for pte and page_table_lock for\npmd_trans_huge).\n\nThe standard get_page() as invoked by direct-io instead will now take\nthe compound_lock but still only for tail pages.  The direct-io paths\nare usually I/O bound and the compound_lock is per THP so very\nfinegrined, so there\u0027s no risk of scalability issues with it.  A simple\ndirect-io benchmarks with all lockdep prove locking and spinlock\ndebugging infrastructure enabled shows identical performance and no\noverhead.  So it\u0027s worth it.  Ideally direct-io should stop calling\nget_page() on pages returned by get_user_pages().  The spinlock in\nget_page() is already optimized away for no-THP builds but doing\nget_page() on tail pages returned by GUP is generally a rare operation\nand usually only run in I/O paths.\n\nThis new refcounting on page_tail-\u003e_mapcount in addition to avoiding new\nRCU critical sections will also allow the working set estimation code to\nwork without any further complexity associated to the tail page\nrefcounting with THP.\n\nSigned-off-by: Andrea Arcangeli \u003caarcange@redhat.com\u003e\nReported-by: Michel Lespinasse \u003cwalken@google.com\u003e\nReviewed-by: Michel Lespinasse \u003cwalken@google.com\u003e\nReviewed-by: Minchan Kim \u003cminchan.kim@gmail.com\u003e\nCc: Peter Zijlstra \u003ca.p.zijlstra@chello.nl\u003e\nCc: Hugh Dickins \u003chughd@google.com\u003e\nCc: Johannes Weiner \u003cjweiner@redhat.com\u003e\nCc: Rik van Riel \u003criel@redhat.com\u003e\nCc: Mel Gorman \u003cmgorman@suse.de\u003e\nCc: KOSAKI Motohiro \u003ckosaki.motohiro@jp.fujitsu.com\u003e\nCc: Benjamin Herrenschmidt \u003cbenh@kernel.crashing.org\u003e\nCc: David Gibson \u003cdavid@gibson.dropbear.id.au\u003e\nCc: \u003cstable@kernel.org\u003e\nCc: \u003cstable@vger.kernel.org\u003e\nSigned-off-by: Andrew Morton 
\u003cakpm@linux-foundation.org\u003e\nSigned-off-by: Linus Torvalds \u003ctorvalds@linux-foundation.org\u003e\n"
    },
    {
      "commit": "6d6b77f163c7eabedbba00ed2abb7d4a570bff76",
      "tree": "6ce074a7dd5a25fae28ef9de6f59ddee08ea4e61",
      "parents": [
        "dd2a981f46a0903a8770a784f213d4d40bbb6f19"
      ],
      "author": {
        "name": "Miklos Szeredi",
        "email": "mszeredi@suse.cz",
        "time": "Fri Oct 28 14:13:28 2011 +0200"
      },
      "committer": {
        "name": "Christoph Hellwig",
        "email": "hch@serles.lst.de",
        "time": "Wed Nov 02 12:53:43 2011 +0100"
      },
      "message": "filesystems: add missing nlink wrappers\n\nReplace direct i_nlink updates with the respective updater function\n(inc_nlink, drop_nlink, clear_nlink, inode_dec_link_count).\n\nSigned-off-by: Miklos Szeredi \u003cmszeredi@suse.cz\u003e\n"
    },
    {
      "commit": "a1cb2c60ddc98ff4e5246f410558805401ceee67",
      "tree": "49e3e620ff2974dc78fad8df0b343b07b75be407",
      "parents": [
        "3d470fc385defa60d9af610f05db8e7f8b4f2f5e"
      ],
      "author": {
        "name": "Dimitri Sivanich",
        "email": "sivanich@sgi.com",
        "time": "Mon Oct 31 17:09:46 2011 -0700"
      },
      "committer": {
        "name": "Linus Torvalds",
        "email": "torvalds@linux-foundation.org",
        "time": "Mon Oct 31 17:30:51 2011 -0700"
      },
      "message": "mm/vmstat.c: cache align vm_stat\n\nAvoid false sharing of the vm_stat array.\n\nThis was found to adversely affect tmpfs I/O performance.\n\nTests run on a 640 cpu UV system.\n\nWith 120 threads doing parallel writes, each to different tmpfs mounts:\nNo patch:\t\t~300 MB/sec\nWith vm_stat alignment:\t~430 MB/sec\n\nSigned-off-by: Dimitri Sivanich \u003csivanich@sgi.com\u003e\nAcked-by: Christoph Lameter \u003ccl@gentwo.org\u003e\nAcked-by: Mel Gorman \u003cmel@csn.ul.ie\u003e\nSigned-off-by: Andrew Morton \u003cakpm@linux-foundation.org\u003e\nSigned-off-by: Linus Torvalds \u003ctorvalds@linux-foundation.org\u003e\n"
    },
    {
      "commit": "3d470fc385defa60d9af610f05db8e7f8b4f2f5e",
      "tree": "8db16148d94a2ae2723e209e0f2d7fe026361972",
      "parents": [
        "35d8c7ad7208dad5d352c483408e555022750978"
      ],
      "author": {
        "name": "Hugh Dickins",
        "email": "hughd@google.com",
        "time": "Mon Oct 31 17:09:43 2011 -0700"
      },
      "committer": {
        "name": "Linus Torvalds",
        "email": "torvalds@linux-foundation.org",
        "time": "Mon Oct 31 17:30:51 2011 -0700"
      },
      "message": "mm: munlock use mapcount to avoid terrible overhead\n\nA process spent 30 minutes exiting, just munlocking the pages of a large\nanonymous area that had been alternately mprotected into page-sized vmas:\nfor every single page there\u0027s an anon_vma walk through all the other\nlittle vmas to find the right one.\n\nA general fix to that would be a lot more complicated (use prio_tree on\nanon_vma?), but there\u0027s one very simple thing we can do to speed up the\ncommon case: if a page to be munlocked is mapped only once, then it is our\nvma that it is mapped into, and there\u0027s no need whatever to walk through\nall the others.\n\nOkay, there is a very remote race in munlock_vma_pages_range(), if between\nits follow_page() and lock_page(), another process were to munlock the\nsame page, then page reclaim remove it from our vma, then another process\nmlock it again.  We would find it with page_mapcount 1, yet it\u0027s still\nmlocked in another process.  But never mind, that\u0027s much less likely than\nthe down_read_trylock() failure which munlocking already tolerates (in\ntry_to_unmap_one()): in due course page reclaim will discover and move the\npage to unevictable instead.\n\n[akpm@linux-foundation.org: add comment]\nSigned-off-by: Hugh Dickins \u003chughd@google.com\u003e\nCc: Michel Lespinasse \u003cwalken@google.com\u003e\nSigned-off-by: Andrew Morton \u003cakpm@linux-foundation.org\u003e\nSigned-off-by: Linus Torvalds \u003ctorvalds@linux-foundation.org\u003e\n"
    },
    {
      "commit": "35d8c7ad7208dad5d352c483408e555022750978",
      "tree": "19de03a943fb00f7ed239147256aa558ddfd99b3",
      "parents": [
        "0089e4853ae1ac161fae5137170971ccb6f4f152"
      ],
      "author": {
        "name": "Hillf Danton",
        "email": "dhillf@gmail.com",
        "time": "Mon Oct 31 17:09:40 2011 -0700"
      },
      "committer": {
        "name": "Linus Torvalds",
        "email": "torvalds@linux-foundation.org",
        "time": "Mon Oct 31 17:30:51 2011 -0700"
      },
      "message": "mm/huge_memory: fix typo when updating mmu cache\n\nThere are three cases of update_mmu_cache() in the file, and the case in\nfunction collapse_huge_page() has a typo, namely the last parameter used,\nwhich is corrected based on the other two cases.\n\nDue to the define of update_mmu_cache by X86, the only arch that\nimplements THP currently, the change here has no really crystal point, but\none or two minutes of efforts could be saved for those archs that are\nlikely to support THP in future.\n\nSigned-off-by: Hillf Danton \u003cdhillf@gmail.com\u003e\nCc: Johannes Weiner \u003cjweiner@redhat.com\u003e\nReviewed-by: Andrea Arcangeli \u003caarcange@redhat.com\u003e\nSigned-off-by: Andrew Morton \u003cakpm@linux-foundation.org\u003e\nSigned-off-by: Linus Torvalds \u003ctorvalds@linux-foundation.org\u003e\n"
    },
    {
      "commit": "0089e4853ae1ac161fae5137170971ccb6f4f152",
      "tree": "6e23993f320cd6b0d45a5fb4832188cd2092efe2",
      "parents": [
        "df9d6985be2a7e7683c46e4c6ea608fc69f02b45"
      ],
      "author": {
        "name": "Hillf Danton",
        "email": "dhillf@gmail.com",
        "time": "Mon Oct 31 17:09:38 2011 -0700"
      },
      "committer": {
        "name": "Linus Torvalds",
        "email": "torvalds@linux-foundation.org",
        "time": "Mon Oct 31 17:30:50 2011 -0700"
      },
      "message": "mm/huge_memory: fix copying user highpage\n\nThe THP copy-on-write handler falls back to regular-sized pages for a huge\npage replacement upon allocation failure or if THP has been individually\ndisabled in the target VMA.  The loop responsible for copying page-sized\nchunks accidentally uses multiples of PAGE_SHIFT instead of PAGE_SIZE as\nthe virtual address arg for copy_user_highpage().\n\nSigned-off-by: Hillf Danton \u003cdhillf@gmail.com\u003e\nAcked-by: Johannes Weiner \u003cjweiner@redhat.com\u003e\nReviewed-by: Andrea Arcangeli \u003caarcange@redhat.com\u003e\nSigned-off-by: Andrew Morton \u003cakpm@linux-foundation.org\u003e\nSigned-off-by: Linus Torvalds \u003ctorvalds@linux-foundation.org\u003e\n"
    },
    {
      "commit": "df9d6985be2a7e7683c46e4c6ea608fc69f02b45",
      "tree": "81bcf7cfbc842a7eee7aa18104a1fbecd2d316e8",
      "parents": [
        "e0c23279c9f800c403f37511484d9014ac83adec"
      ],
      "author": {
        "name": "Christoph Lameter",
        "email": "cl@gentwo.org",
        "time": "Mon Oct 31 17:09:35 2011 -0700"
      },
      "committer": {
        "name": "Linus Torvalds",
        "email": "torvalds@linux-foundation.org",
        "time": "Mon Oct 31 17:30:50 2011 -0700"
      },
      "message": "mm: do not drain pagevecs for mlockall(MCL_FUTURE)\n\nMCL_FUTURE does not move pages between lru list and draining the LRU per\ncpu pagevecs is a nasty activity.  Avoid doing it unnecessarily.\n\nSigned-off-by: Christoph Lameter \u003ccl@gentwo.org\u003e\nCc: David Rientjes \u003crientjes@google.com\u003e\nReviewed-by: Minchan Kim \u003cminchan.kim@gmail.com\u003e\nAcked-by: KOSAKI Motohiro \u003ckosaki.motohiro@jp.fujitsu.com\u003e\nCc: Mel Gorman \u003cmel@csn.ul.ie\u003e\nAcked-by: Johannes Weiner \u003cjweiner@redhat.com\u003e\nSigned-off-by: Andrew Morton \u003cakpm@linux-foundation.org\u003e\nSigned-off-by: Linus Torvalds \u003ctorvalds@linux-foundation.org\u003e\n"
    },
    {
      "commit": "e0c23279c9f800c403f37511484d9014ac83adec",
      "tree": "9dcf058d3d1c691328ea5839dfe9c340e47ee3fa",
      "parents": [
        "e0887c19b2daa140f20ca8104bdc5740f39dbb86"
      ],
      "author": {
        "name": "Mel Gorman",
        "email": "mgorman@suse.de",
        "time": "Mon Oct 31 17:09:33 2011 -0700"
      },
      "committer": {
        "name": "Linus Torvalds",
        "email": "torvalds@linux-foundation.org",
        "time": "Mon Oct 31 17:30:50 2011 -0700"
      },
      "message": "vmscan: abort reclaim/compaction if compaction can proceed\n\nIf compaction can proceed, shrink_zones() stops doing any work but its\ncallers still call shrink_slab() which raises the priority and potentially\nsleeps.  This is unnecessary and wasteful so this patch aborts direct\nreclaim/compaction entirely if compaction can proceed.\n\nSigned-off-by: Mel Gorman \u003cmgorman@suse.de\u003e\nAcked-by: Rik van Riel \u003criel@redhat.com\u003e\nReviewed-by: Minchan Kim \u003cminchan.kim@gmail.com\u003e\nAcked-by: Johannes Weiner \u003cjweiner@redhat.com\u003e\nCc: Josh Boyer \u003cjwboyer@redhat.com\u003e\nCc: Andrea Arcangeli \u003caarcange@redhat.com\u003e\nSigned-off-by: Andrew Morton \u003cakpm@linux-foundation.org\u003e\nSigned-off-by: Linus Torvalds \u003ctorvalds@linux-foundation.org\u003e\n"
    },
    {
      "commit": "e0887c19b2daa140f20ca8104bdc5740f39dbb86",
      "tree": "86330414eb04b5989e68661c205aa52d46ca7ebf",
      "parents": [
        "21ee9f398be209ccbb62929d35961ca1ed48eec3"
      ],
      "author": {
        "name": "Rik van Riel",
        "email": "riel@redhat.com",
        "time": "Mon Oct 31 17:09:31 2011 -0700"
      },
      "committer": {
        "name": "Linus Torvalds",
        "email": "torvalds@linux-foundation.org",
        "time": "Mon Oct 31 17:30:50 2011 -0700"
      },
      "message": "vmscan: limit direct reclaim for higher order allocations\n\nWhen suffering from memory fragmentation due to unfreeable pages, THP page\nfaults will repeatedly try to compact memory.  Due to the unfreeable\npages, compaction fails.\n\nNeedless to say, at that point page reclaim also fails to create free\ncontiguous 2MB areas.  However, that doesn\u0027t stop the current code from\ntrying, over and over again, and freeing a minimum of 4MB (2UL \u003c\u003c\nsc-\u003eorder pages) at every single invocation.\n\nThis resulted in my 12GB system having 2-3GB free memory, a corresponding\namount of used swap and very sluggish response times.\n\nThis can be avoided by having the direct reclaim code not reclaim from\nzones that already have plenty of free memory available for compaction.\n\nIf compaction still fails due to unmovable memory, doing additional\nreclaim will only hurt the system, not help.\n\n[jweiner@redhat.com: change comment to explain the order check]\nSigned-off-by: Rik van Riel \u003criel@redhat.com\u003e\nAcked-by: Johannes Weiner \u003cjweiner@redhat.com\u003e\nAcked-by: Mel Gorman \u003cmgorman@suse.de\u003e\nCc: Andrea Arcangeli \u003caarcange@redhat.com\u003e\nReviewed-by: Minchan Kim \u003cminchan.kim@gmail.com\u003e\nSigned-off-by: Johannes Weiner \u003cjweiner@redhat.com\u003e\nSigned-off-by: Andrew Morton \u003cakpm@linux-foundation.org\u003e\nSigned-off-by: Linus Torvalds \u003ctorvalds@linux-foundation.org\u003e\n"
    },
    {
      "commit": "21ee9f398be209ccbb62929d35961ca1ed48eec3",
      "tree": "4127e14f4a07a2cb7bc6eb902e9c7b0baab8e84f",
      "parents": [
        "2f1da6421570d064a94e17190a4955c2df99794d"
      ],
      "author": {
        "name": "Minchan Kim",
        "email": "minchan.kim@gmail.com",
        "time": "Mon Oct 31 17:09:28 2011 -0700"
      },
      "committer": {
        "name": "Linus Torvalds",
        "email": "torvalds@linux-foundation.org",
        "time": "Mon Oct 31 17:30:50 2011 -0700"
      },
      "message": "vmscan: add barrier to prevent evictable page in unevictable list\n\nWhen a race between putback_lru_page() and shmem_lock with lock\u003d0 happens,\nprogram execution order is as follows, but clear_bit in processor #1 could\nbe reordered right before spin_unlock of processor #1.  Then, the page\nwould be stranded on the unevictable list.\n\nspin_lock\nSetPageLRU\nspin_unlock\n                                clear_bit(AS_UNEVICTABLE)\n                                spin_lock\n                                if PageLRU()\n                                        if !test_bit(AS_UNEVICTABLE)\n                                        \tmove evictable list\nsmp_mb\nif !test_bit(AS_UNEVICTABLE)\n        move evictable list\n                                spin_unlock\n\nBut, pagevec_lookup() in scan_mapping_unevictable_pages() has\nrcu_read_[un]lock() so it could protect reordering before reaching\ntest_bit(AS_UNEVICTABLE) on processor #1 so this problem never happens.\nBut it\u0027s an unexpected side effect and we should solve this problem\nproperly.\n\nThis patch adds a barrier after mapping_clear_unevictable.\n\nI didn\u0027t meet this problem but just found during review.\n\nSigned-off-by: Minchan Kim \u003cminchan.kim@gmail.com\u003e\nAcked-by: KOSAKI Motohiro \u003ckosaki.motohiro@jp.fujitsu.com\u003e\nCc: Mel Gorman \u003cmel@csn.ul.ie\u003e\nCc: Rik van Riel \u003criel@redhat.com\u003e\nCc: Lee Schermerhorn \u003clee.schermerhorn@hp.com\u003e\nAcked-by: Johannes Weiner \u003cjweiner@redhat.com\u003e\nSigned-off-by: Andrew Morton \u003cakpm@linux-foundation.org\u003e\nSigned-off-by: Linus Torvalds \u003ctorvalds@linux-foundation.org\u003e\n"
    },
    {
      "commit": "2f1da6421570d064a94e17190a4955c2df99794d",
      "tree": "8d382dfd9a264aa698131d4ae2a345ba3fb25794",
      "parents": [
        "e754d79d35f0b8612445a9bd7491c48d7317e3ad"
      ],
      "author": {
        "name": "H Hartley Sweeten",
        "email": "hartleys@visionengravers.com",
        "time": "Mon Oct 31 17:09:25 2011 -0700"
      },
      "committer": {
        "name": "Linus Torvalds",
        "email": "torvalds@linux-foundation.org",
        "time": "Mon Oct 31 17:30:50 2011 -0700"
      },
      "message": "mm/huge_memory.c: quiet sparse noise\n\nQuiet the sparse noise:\n\nwarning: symbol \u0027khugepaged_scan\u0027 was not declared. Should it be static?\nwarning: context imbalance in \u0027khugepaged_scan_mm_slot\u0027 - unexpected unlock\n\nSigned-off-by: H Hartley Sweeten \u003chsweeten@visionengravers.com\u003e\nCc: Andrea Arcangeli \u003caarcange@redhat.com\u003e\nCc: Rik van Riel \u003criel@redhat.com\u003e\nCc: Johannes Weiner \u003cjweiner@redhat.com\u003e\nCc: KAMEZAWA Hiroyuki \u003ckamezawa.hiroyu@jp.fujitsu.com\u003e\nSigned-off-by: Andrew Morton \u003cakpm@linux-foundation.org\u003e\nSigned-off-by: Linus Torvalds \u003ctorvalds@linux-foundation.org\u003e\n"
    },
    {
      "commit": "e754d79d35f0b8612445a9bd7491c48d7317e3ad",
      "tree": "6ba53676ba3b425144b7723793e9cda6581a6790",
      "parents": [
        "22d5368a0838c00ed0e3ec20e7ff8c6e46ba99ef"
      ],
      "author": {
        "name": "H Hartley Sweeten",
        "email": "hartleys@visionengravers.com",
        "time": "Mon Oct 31 17:09:23 2011 -0700"
      },
      "committer": {
        "name": "Linus Torvalds",
        "email": "torvalds@linux-foundation.org",
        "time": "Mon Oct 31 17:30:50 2011 -0700"
      },
      "message": "mm/mempolicy.c: quiet sparse noise\n\nQuiet the sparse noise:\n\nwarning: symbol \u0027default_policy\u0027 was not declared. Should it be static?\n\nSigned-off-by: H Hartley Sweeten \u003chsweeten@visionengravers.com\u003e\nCc: KOSAKI Motohiro \u003ckosaki.motohiro@jp.fujitsu.com\u003e\nCc: Stephen Wilson \u003cwilsons@start.ca\u003e\nCc: Andrea Arcangeli \u003caarcange@redhat.com\u003e\nCc: Mel Gorman \u003cmel@csn.ul.ie\u003e\nSigned-off-by: Andrew Morton \u003cakpm@linux-foundation.org\u003e\nSigned-off-by: Linus Torvalds \u003ctorvalds@linux-foundation.org\u003e\n"
    },
    {
      "commit": "22d5368a0838c00ed0e3ec20e7ff8c6e46ba99ef",
      "tree": "9692d1a53240795454d46e8cd7abbd0c7002dfa0",
      "parents": [
        "2d7d3eb2bad116e0d1b3b3930a923c55f6d0f70e"
      ],
      "author": {
        "name": "H Hartley Sweeten",
        "email": "hartleys@visionengravers.com",
        "time": "Mon Oct 31 17:09:19 2011 -0700"
      },
      "committer": {
        "name": "Linus Torvalds",
        "email": "torvalds@linux-foundation.org",
        "time": "Mon Oct 31 17:30:50 2011 -0700"
      },
      "message": "mm/thrash.c: quiet sparse noise\n\nQuiet the following sparse noise:\n\nwarning: symbol \u0027swap_token_memcg\u0027 was not declared. Should it be static?\n\nSigned-off-by: H Hartley Sweeten \u003chsweeten@visionengravers.com\u003e\nCc: Rik van Riel \u003criel@redhat.com\u003e\nCc: KOSAKI Motohiro \u003ckosaki.motohiro@jp.fujitsu.com\u003e\nCc: KAMEZAWA Hiroyuki \u003ckamezawa.hiroyu@jp.fujitsu.com\u003e\nSigned-off-by: Andrew Morton \u003cakpm@linux-foundation.org\u003e\nSigned-off-by: Linus Torvalds \u003ctorvalds@linux-foundation.org\u003e\n"
    },
    {
      "commit": "2d7d3eb2bad116e0d1b3b3930a923c55f6d0f70e",
      "tree": "ace773565549cc0b97657654983701b6ea5f9f55",
      "parents": [
        "264e56d8247ef6e31ed4386926cae86c61ddcb18"
      ],
      "author": {
        "name": "H Hartley Sweeten",
        "email": "hartleys@visionengravers.com",
        "time": "Mon Oct 31 17:09:15 2011 -0700"
      },
      "committer": {
        "name": "Linus Torvalds",
        "email": "torvalds@linux-foundation.org",
        "time": "Mon Oct 31 17:30:50 2011 -0700"
      },
      "message": "mm/memblock.c: quiet sparse noise\n\nQuiet the following sparse noise in this file:\n\nwarning: symbol \u0027memblock_overlaps_region\u0027 was not declared. Should it be static?\n\nSigned-off-by: H Hartley Sweeten \u003chsweeten@visionengravers,com\u003e\nCc: Yinghai Lu \u003cyinghai@kernel.org\u003e\nCc: \"H. Peter Anvin\" \u003chpa@linux.intel.com\u003e\nCc: Benjamin Herrenschmidt \u003cbenh@kernel.crashing.org\u003e\nCc: Tomi Valkeinen \u003ctomi.valkeinen@nokia.com\u003e\nSigned-off-by: Andrew Morton \u003cakpm@linux-foundation.org\u003e\nSigned-off-by: Linus Torvalds \u003ctorvalds@linux-foundation.org\u003e\n"
    },
    {
      "commit": "264e56d8247ef6e31ed4386926cae86c61ddcb18",
      "tree": "87e85ee670fb7ae4c0cd7bdeae700faff021bf48",
      "parents": [
        "3f380998aeb51b99d5d22cadb41162e1e9db70d2"
      ],
      "author": {
        "name": "Johannes Weiner",
        "email": "jweiner@redhat.com",
        "time": "Mon Oct 31 17:09:13 2011 -0700"
      },
      "committer": {
        "name": "Linus Torvalds",
        "email": "torvalds@linux-foundation.org",
        "time": "Mon Oct 31 17:30:49 2011 -0700"
      },
      "message": "mm: disable user interface to manually rescue unevictable pages\n\nAt one point, anonymous pages were supposed to go on the unevictable list\nwhen no swap space was configured, and the idea was to manually rescue\nthose pages after adding swap and making them evictable again.  But\nnowadays, swap-backed pages on the anon LRU list are not scanned without\navailable swap space anyway, so there is no point in moving them to a\nseparate list anymore.\n\nThe manual rescue could also be used in case pages were stranded on the\nunevictable list due to race conditions.  But the code has been around for\na while now and newly discovered bugs should be properly reported and\ndealt with instead of relying on such a manual fixup.\n\nIn addition to the lack of a usecase, the sysfs interface to rescue pages\nfrom a specific NUMA node has been broken since its introduction, so it\u0027s\nunlikely that anybody ever relied on that.\n\nThis patch removes the functionality behind the sysctl and the\nnode-interface and emits a one-time warning when somebody tries to access\neither of them.\n\nSigned-off-by: Johannes Weiner \u003cjweiner@redhat.com\u003e\nReported-by: Kautuk Consul \u003cconsul.kautuk@gmail.com\u003e\nReviewed-by: Minchan Kim \u003cminchan.kim@gmail.com\u003e\nAcked-by: KAMEZAWA Hiroyuki \u003ckamezawa.hiroyu@jp.fujitsu.com\u003e\nSigned-off-by: Andrew Morton \u003cakpm@linux-foundation.org\u003e\nSigned-off-by: Linus Torvalds \u003ctorvalds@linux-foundation.org\u003e\n"
    },
    {
      "commit": "3f380998aeb51b99d5d22cadb41162e1e9db70d2",
      "tree": "1d5bb20368b06c7b86e7257771209795bd764f00",
      "parents": [
        "4e9dc5df46001510ebd3b3e54faa650f474e51a3"
      ],
      "author": {
        "name": "Kautuk Consul",
        "email": "consul.kautuk@gmail.com",
        "time": "Mon Oct 31 17:09:11 2011 -0700"
      },
      "committer": {
        "name": "Linus Torvalds",
        "email": "torvalds@linux-foundation.org",
        "time": "Mon Oct 31 17:30:49 2011 -0700"
      },
      "message": "vmscan.c: fix invalid strict_strtoul() check in write_scan_unevictable_node()\n\nwrite_scan_unevictable_node() checks the value req returned by\nstrict_strtoul() and returns 1 if req is 0.\n\nHowever, when strict_strtoul() returns 0, it means successful conversion\nof buf to unsigned long.\n\nDue to this, the function was not proceeding to scan the zones for\nunevictable pages even though we write a valid value to the\nscan_unevictable_pages sys file.\n\nChange this check slightly to check for invalid value in buf as well as 0\nvalue stored in res after successful conversion via strict_strtoul.  In\nboth cases, we do not perform the scanning of this node\u0027s zones.\n\nSigned-off-by: Kautuk Consul \u003cconsul.kautuk@gmail.com\u003e\nReviewed-by: KAMEZAWA Hiroyuki \u003ckamezawa.hiroyu@jp.fujitsu.com\u003e\nCc: Johannes Weiner \u003cjweiner@redhat.com\u003e\nCc: Lee Schermerhorn \u003clee.schermerhorn@hp.com\u003e\nSigned-off-by: Andrew Morton \u003cakpm@linux-foundation.org\u003e\nSigned-off-by: Linus Torvalds \u003ctorvalds@linux-foundation.org\u003e\n"
    },
    {
      "commit": "4e9dc5df46001510ebd3b3e54faa650f474e51a3",
      "tree": "a774cae0bbf399401c6d3068859e4a8fe36ee121",
      "parents": [
        "d43a87e68e9e71d2987a29cc239acec4e8f410c9"
      ],
      "author": {
        "name": "Li Haifeng",
        "email": "omycle@gmail.com",
        "time": "Mon Oct 31 17:09:09 2011 -0700"
      },
      "committer": {
        "name": "Linus Torvalds",
        "email": "torvalds@linux-foundation.org",
        "time": "Mon Oct 31 17:30:49 2011 -0700"
      },
      "message": "mm: fix kunmap_high() comment\n\nSigned-off-by: Li Haifeng \u003comycle@gmail.com\u003e\nSigned-off-by: Andrew Morton \u003cakpm@linux-foundation.org\u003e\nSigned-off-by: Linus Torvalds \u003ctorvalds@linux-foundation.org\u003e\n"
    },
    {
      "commit": "d43a87e68e9e71d2987a29cc239acec4e8f410c9",
      "tree": "75b2950b286dcad4c4e7785f991ab0a35146b8d1",
      "parents": [
        "dd73e85f6d8f721d66bcbd2734a5f4bc3d3cd768"
      ],
      "author": {
        "name": "Kyungmin Park",
        "email": "kyungmin.park@samsung.com",
        "time": "Mon Oct 31 17:09:08 2011 -0700"
      },
      "committer": {
        "name": "Linus Torvalds",
        "email": "torvalds@linux-foundation.org",
        "time": "Mon Oct 31 17:30:49 2011 -0700"
      },
      "message": "mm: compaction: make compact_zone_order() static\n\nThere\u0027s no compact_zone_order() user outside file scope, so make it static.\n\nSigned-off-by: Kyungmin Park \u003ckyungmin.park@samsung.com\u003e\nAcked-by: David Rientjes \u003crientjes@google.com\u003e\nReviewed-by: Minchan Kim \u003cminchan.kim@gmail.com\u003e\nSigned-off-by: Andrew Morton \u003cakpm@linux-foundation.org\u003e\nSigned-off-by: Linus Torvalds \u003ctorvalds@linux-foundation.org\u003e\n"
    },
    {
      "commit": "dd73e85f6d8f721d66bcbd2734a5f4bc3d3cd768",
      "tree": "a39776b9025de8baf6a346a43db1aee9bbafad08",
      "parents": [
        "72a2ebd8bc62e6658513d3b2a1119e91c3ea6810"
      ],
      "author": {
        "name": "Dean Nelson",
        "email": "dnelson@redhat.com",
        "time": "Mon Oct 31 17:09:04 2011 -0700"
      },
      "committer": {
        "name": "Linus Torvalds",
        "email": "torvalds@linux-foundation.org",
        "time": "Mon Oct 31 17:30:49 2011 -0700"
      },
      "message": "HWPOISON: convert pr_debug()s to pr_info()s\n\nCommit fb46e73520940b (\"HWPOISON: Convert pr_debugs to pr_info) authored\nby Andi Kleen converted a number of pr_debug()s to pr_info()s.\n\nAbout the same time additional code with pr_debug()s was added by two\nother commits 8c6c2ecb4466 (\"HWPOSION, hugetlb: recover from free hugepage\nerror when !MF_COUNT_INCREASED\") and d950b95882f3d (\"HWPOISON, hugetlb:\nsoft offlining for hugepage\").  And these pr_debug()s failed to get\nconverted to pr_info()s.\n\nThis patch converts them as well.  And does some minor related whitespace\ncleanup.\n\nSigned-off-by: Dean Nelson \u003cdnelson@redhat.com\u003e\nReviewed-by: Andi Kleen \u003cak@linux.intel.com\u003e\nSigned-off-by: Andrew Morton \u003cakpm@linux-foundation.org\u003e\nSigned-off-by: Linus Torvalds \u003ctorvalds@linux-foundation.org\u003e\n"
    },
    {
      "commit": "584cff54e1ff8f59d5109dc8093efedff8bcc375",
      "tree": "464c9acb8bf22bf3a052fef23d3fd9f4f892da4a",
      "parents": [
        "09f363c7363eb10cfb4b82094bd7064e5608258b"
      ],
      "author": {
        "name": "Kautuk Consul",
        "email": "consul.kautuk@gmail.com",
        "time": "Mon Oct 31 17:08:59 2011 -0700"
      },
      "committer": {
        "name": "Linus Torvalds",
        "email": "torvalds@linux-foundation.org",
        "time": "Mon Oct 31 17:30:49 2011 -0700"
      },
      "message": "mm/mmap.c: eliminate the ret variable from mm_take_all_locks()\n\nThe ret variable is really not needed in mm_take_all_locks().\n\nSigned-off-by: Kautuk Consul \u003cconsul.kautuk@gmail.com\u003e\nReviewed-by: Michal Hocko \u003cmhocko@suse.cz\u003e\nSigned-off-by: Andrew Morton \u003cakpm@linux-foundation.org\u003e\nSigned-off-by: Linus Torvalds \u003ctorvalds@linux-foundation.org\u003e\n"
    },
    {
      "commit": "20c8c62891a346e09c8d26de41ce78bd7a76c5b0",
      "tree": "6e0a81457ee2c142f96e6e3db66c479eac21fc4a",
      "parents": [
        "99ef0315f1b320f392acc4364598340e78758fd2"
      ],
      "author": {
        "name": "Andrew Morton",
        "email": "akpm@linux-foundation.org",
        "time": "Mon Oct 31 17:08:54 2011 -0700"
      },
      "committer": {
        "name": "Linus Torvalds",
        "email": "torvalds@linux-foundation.org",
        "time": "Mon Oct 31 17:30:49 2011 -0700"
      },
      "message": "mm-add-comment-explaining-task-state-setting-in-bdi_forker_thread-fix\n\nfiddle wording\n\nCc: Jan Kara \u003cjack@suse.cz\u003e\nCc: Wu Fengguang \u003cfengguang.wu@intel.com\u003e\nSigned-off-by: Andrew Morton \u003cakpm@linux-foundation.org\u003e\nSigned-off-by: Linus Torvalds \u003ctorvalds@linux-foundation.org\u003e\n"
    },
    {
      "commit": "99ef0315f1b320f392acc4364598340e78758fd2",
      "tree": "e8aadca4fd1debdf9ea17c238d3bf2fc73a96137",
      "parents": [
        "de7d2b567d040e3b67fe7121945982f14343213d"
      ],
      "author": {
        "name": "Wanlong Gao",
        "email": "gaowanlong@cn.fujitsu.com",
        "time": "Mon Oct 31 17:08:51 2011 -0700"
      },
      "committer": {
        "name": "Linus Torvalds",
        "email": "torvalds@linux-foundation.org",
        "time": "Mon Oct 31 17:30:49 2011 -0700"
      },
      "message": "ksm: fix the comment of try_to_unmap_one()\n\ntry_to_unmap_one() is called by try_to_unmap_ksm(), too.\n\nSigned-off-by: Wanlong Gao \u003cgaowanlong@cn.fujitsu.com\u003e\nCc: Hugh Dickins \u003chughd@google.com\u003e\nSigned-off-by: Andrew Morton \u003cakpm@linux-foundation.org\u003e\nSigned-off-by: Linus Torvalds \u003ctorvalds@linux-foundation.org\u003e\n"
    },
    {
      "commit": "de7d2b567d040e3b67fe7121945982f14343213d",
      "tree": "c834efc1b4117049b5da786417d9cc14c2b31076",
      "parents": [
        "f0dfcde099453aa4c0dc42473828d15a6d492936"
      ],
      "author": {
        "name": "Joe Perches",
        "email": "joe@perches.com",
        "time": "Mon Oct 31 17:08:48 2011 -0700"
      },
      "committer": {
        "name": "Linus Torvalds",
        "email": "torvalds@linux-foundation.org",
        "time": "Mon Oct 31 17:30:48 2011 -0700"
      },
      "message": "mm/vmalloc.c: report more vmalloc failures\n\nSome vmalloc failure paths do not report OOM conditions.\n\nAdd warn_alloc_failed, which also does a dump_stack, to those failure\npaths.\n\nThis allows more site specific vmalloc failure logging message printks to\nbe removed.\n\nSigned-off-by: Joe Perches \u003cjoe@perches.com\u003e\nSigned-off-by: Andrew Morton \u003cakpm@linux-foundation.org\u003e\nSigned-off-by: Linus Torvalds \u003ctorvalds@linux-foundation.org\u003e\n"
    },
    {
      "commit": "f0dfcde099453aa4c0dc42473828d15a6d492936",
      "tree": "50346f4e8d9a773f266194661edea903a855a110",
      "parents": [
        "d1f0ece6cdca973c01a46dff0eb062baafe78a85"
      ],
      "author": {
        "name": "Alex,Shi",
        "email": "alex.shi@intel.com",
        "time": "Mon Oct 31 17:08:45 2011 -0700"
      },
      "committer": {
        "name": "Linus Torvalds",
        "email": "torvalds@linux-foundation.org",
        "time": "Mon Oct 31 17:30:48 2011 -0700"
      },
      "message": "kswapd: assign new_order and new_classzone_idx after wakeup in sleeping\n\nThere 2 places to read pgdat in kswapd.  One is return from a successful\nbalance, another is waked up from kswapd sleeping.  The new_order and\nnew_classzone_idx represent the balance input order and classzone_idx.\n\nBut current new_order and new_classzone_idx are not assigned after\nkswapd_try_to_sleep(), that will cause a bug in the following scenario.\n\n1: after a successful balance, kswapd goes to sleep, and new_order \u003d 0;\n   new_classzone_idx \u003d __MAX_NR_ZONES - 1;\n\n2: kswapd waked up with order \u003d 3 and classzone_idx \u003d ZONE_NORMAL\n\n3: in the balance_pgdat() running, a new balance wakeup happened with\n   order \u003d 5, and classzone_idx \u003d ZONE_NORMAL\n\n4: the first wakeup(order \u003d 3) finished successufly, return order \u003d 3\n   but, the new_order is still 0, so, this balancing will be treated as a\n   failed balance.  And then the second tighter balancing will be missed.\n\nSo, to avoid the above problem, the new_order and new_classzone_idx need\nto be assigned for later successful comparison.\n\nSigned-off-by: Alex Shi \u003calex.shi@intel.com\u003e\nAcked-by: Mel Gorman \u003cmgorman@suse.de\u003e\nReviewed-by: Minchan Kim \u003cminchan.kim@gmail.com\u003e\nTested-by: Pádraig Brady \u003cP@draigBrady.com\u003e\nSigned-off-by: Andrew Morton \u003cakpm@linux-foundation.org\u003e\nSigned-off-by: Linus Torvalds \u003ctorvalds@linux-foundation.org\u003e\n"
    },
    {
      "commit": "d1f0ece6cdca973c01a46dff0eb062baafe78a85",
      "tree": "570aabe98bb19454723c122cee74fa42cbfa49c9",
      "parents": [
        "d2ebd0f6b89567eb93ead4e2ca0cbe03021f344b"
      ],
      "author": {
        "name": "Jonghwan Choi",
        "email": "jhbird.choi@samsung.com",
        "time": "Mon Oct 31 17:08:42 2011 -0700"
      },
      "committer": {
        "name": "Linus Torvalds",
        "email": "torvalds@linux-foundation.org",
        "time": "Mon Oct 31 17:30:48 2011 -0700"
      },
      "message": "mm/memblock.c: small function definition fixes\n\nwarning: function \u0027memblock_memory_can_coalesce\u0027\nwith external linkage has definition.\n\nSigned-off-by: Jonghwan Choi \u003cjhbird.choi@samsung.com\u003e\nSigned-off-by: Andrew Morton \u003cakpm@linux-foundation.org\u003e\nSigned-off-by: Linus Torvalds \u003ctorvalds@linux-foundation.org\u003e\n"
    },
    {
      "commit": "d2ebd0f6b89567eb93ead4e2ca0cbe03021f344b",
      "tree": "1f8fc3f7702a8b4d362f5537e135e96f86043f8d",
      "parents": [
        "64212ec569bfdd094f7a23d9b09862209a983559"
      ],
      "author": {
        "name": "Alex,Shi",
        "email": "alex.shi@intel.com",
        "time": "Mon Oct 31 17:08:39 2011 -0700"
      },
      "committer": {
        "name": "Linus Torvalds",
        "email": "torvalds@linux-foundation.org",
        "time": "Mon Oct 31 17:30:48 2011 -0700"
      },
      "message": "kswapd: avoid unnecessary rebalance after an unsuccessful balancing\n\nIn commit 215ddd66 (\"mm: vmscan: only read new_classzone_idx from pgdat\nwhen reclaiming successfully\") , Mel Gorman said kswapd is better to sleep\nafter a unsuccessful balancing if there is tighter reclaim request pending\nin the balancing.  But in the following scenario, kswapd do something that\nis not matched our expectation.  The patch fixes this issue.\n\n1, Read pgdat request A (classzone_idx, order \u003d 3)\n2, balance_pgdat()\n3, During pgdat, a new pgdat request B (classzone_idx, order \u003d 5) is placed\n4, balance_pgdat() returns but failed since returned order \u003d 0\n5, pgdat of request A assigned to balance_pgdat(), and do balancing again.\n   While the expectation behavior of kswapd should try to sleep.\n\nSigned-off-by: Alex Shi \u003calex.shi@intel.com\u003e\nReviewed-by: Tim Chen \u003ctim.c.chen@linux.intel.com\u003e\nAcked-by: Mel Gorman \u003cmgorman@suse.de\u003e\nTested-by: Pádraig Brady \u003cP@draigBrady.com\u003e\nCc: Rik van Riel \u003criel@redhat.com\u003e\nCc: KOSAKI Motohiro \u003ckosaki.motohiro@jp.fujitsu.com\u003e\nSigned-off-by: Andrew Morton \u003cakpm@linux-foundation.org\u003e\nSigned-off-by: Linus Torvalds \u003ctorvalds@linux-foundation.org\u003e\n"
    },
    {
      "commit": "64212ec569bfdd094f7a23d9b09862209a983559",
      "tree": "af9a80c08795e602c5a6401fa3599562c2b6be48",
      "parents": [
        "3ee9a4f086716d792219c021e8509f91165a4128"
      ],
      "author": {
        "name": "Akinobu Mita",
        "email": "akinobu.mita@gmail.com",
        "time": "Mon Oct 31 17:08:38 2011 -0700"
      },
      "committer": {
        "name": "Linus Torvalds",
        "email": "torvalds@linux-foundation.org",
        "time": "Mon Oct 31 17:30:48 2011 -0700"
      },
      "message": "debug-pagealloc: add support for highmem pages\n\nThis adds support for highmem pages poisoning and verification to the\ndebug-pagealloc feature for no-architecture support.\n\n[akpm@linux-foundation.org: remove unneeded preempt_disable/enable]\nSigned-off-by: Akinobu Mita \u003cakinobu.mita@gmail.com\u003e\nSigned-off-by: Andrew Morton \u003cakpm@linux-foundation.org\u003e\nSigned-off-by: Linus Torvalds \u003ctorvalds@linux-foundation.org\u003e\n"
    },
    {
      "commit": "3ee9a4f086716d792219c021e8509f91165a4128",
      "tree": "f85162b8e024624f07909eaba4e85b89df924ebb",
      "parents": [
        "06d5e032adcbc7d50c606a1396f00e2474e4213e"
      ],
      "author": {
        "name": "Joe Perches",
        "email": "joe@perches.com",
        "time": "Mon Oct 31 17:08:35 2011 -0700"
      },
      "committer": {
        "name": "Linus Torvalds",
        "email": "torvalds@linux-foundation.org",
        "time": "Mon Oct 31 17:30:48 2011 -0700"
      },
      "message": "mm: neaten warn_alloc_failed\n\nAdd __attribute__((format (printf...) to the function to validate format\nand arguments.  Use vsprintf extension %pV to avoid any possible message\ninterleaving.  Coalesce format string.  Convert printks/pr_warning to\npr_warn.\n\n[akpm@linux-foundation.org: use the __printf() macro]\nSigned-off-by: Joe Perches \u003cjoe@perches.com\u003e\nSigned-off-by: Andrew Morton \u003cakpm@linux-foundation.org\u003e\nSigned-off-by: Linus Torvalds \u003ctorvalds@linux-foundation.org\u003e\n"
    },
    {
      "commit": "37a1c49a91ad55f917a399ef2174b5ebda4283f9",
      "tree": "d272ab0f51016181493c6792f0cf229a87da9ae3",
      "parents": [
        "7b6efc2bc4f19952b25ebf9b236e5ac43cd386c2"
      ],
      "author": {
        "name": "Andrea Arcangeli",
        "email": "aarcange@redhat.com",
        "time": "Mon Oct 31 17:08:30 2011 -0700"
      },
      "committer": {
        "name": "Linus Torvalds",
        "email": "torvalds@linux-foundation.org",
        "time": "Mon Oct 31 17:30:48 2011 -0700"
      },
      "message": "thp: mremap support and TLB optimization\n\nThis adds THP support to mremap (decreases the number of split_huge_page()\ncalls).\n\nHere are also some benchmarks with a proggy like this:\n\n\u003d\u003d\u003d\n#define _GNU_SOURCE\n#include \u003csys/mman.h\u003e\n#include \u003cstdlib.h\u003e\n#include \u003cstdio.h\u003e\n#include \u003cstring.h\u003e\n#include \u003csys/time.h\u003e\n\n#define SIZE (5UL*1024*1024*1024)\n\nint main()\n{\n        static struct timeval oldstamp, newstamp;\n\tlong diffsec;\n\tchar *p, *p2, *p3, *p4;\n\tif (posix_memalign((void **)\u0026p, 2*1024*1024, SIZE))\n\t\tperror(\"memalign\"), exit(1);\n\tif (posix_memalign((void **)\u0026p2, 2*1024*1024, SIZE))\n\t\tperror(\"memalign\"), exit(1);\n\tif (posix_memalign((void **)\u0026p3, 2*1024*1024, 4096))\n\t\tperror(\"memalign\"), exit(1);\n\n\tmemset(p, 0xff, SIZE);\n\tmemset(p2, 0xff, SIZE);\n\tmemset(p3, 0x77, 4096);\n\tgettimeofday(\u0026oldstamp, NULL);\n\tp4 \u003d mremap(p, SIZE, SIZE, MREMAP_FIXED|MREMAP_MAYMOVE, p3);\n\tgettimeofday(\u0026newstamp, NULL);\n\tdiffsec \u003d newstamp.tv_sec - oldstamp.tv_sec;\n\tdiffsec \u003d newstamp.tv_usec - oldstamp.tv_usec + 1000000 * diffsec;\n\tprintf(\"usec %ld\\n\", diffsec);\n\tif (p \u003d\u003d MAP_FAILED || p4 !\u003d p3)\n\t//if (p \u003d\u003d MAP_FAILED)\n\t\tperror(\"mremap\"), exit(1);\n\tif (memcmp(p4, p2, SIZE))\n\t\tprintf(\"mremap bug\\n\"), exit(1);\n\tprintf(\"ok\\n\");\n\n\treturn 0;\n}\n\u003d\u003d\u003d\n\nTHP on\n\n Performance counter stats for \u0027./largepage13\u0027 (3 runs):\n\n          69195836 dTLB-loads                 ( +-   3.546% )  (scaled from 50.30%)\n             60708 dTLB-load-misses           ( +-  11.776% )  (scaled from 52.62%)\n         676266476 dTLB-stores                ( +-   5.654% )  (scaled from 69.54%)\n             29856 dTLB-store-misses          ( +-   4.081% )  (scaled from 89.22%)\n        1055848782 iTLB-loads                 ( +-   4.526% )  (scaled from 80.18%)\n              8689 iTLB-load-misses           ( +-   2.987% )  (scaled from 58.20%)\n\n        7.314454164  seconds time elapsed   ( +-   0.023% )\n\nTHP off\n\n Performance counter stats for \u0027./largepage13\u0027 (3 runs):\n\n        1967379311 dTLB-loads                 ( +-   0.506% )  (scaled from 60.59%)\n           9238687 dTLB-load-misses           ( +-  22.547% )  (scaled from 61.87%)\n        2014239444 dTLB-stores                ( +-   0.692% )  (scaled from 60.40%)\n           3312335 dTLB-store-misses          ( +-   7.304% )  (scaled from 67.60%)\n        6764372065 iTLB-loads                 ( +-   0.925% )  (scaled from 79.00%)\n              8202 iTLB-load-misses           ( +-   0.475% )  (scaled from 70.55%)\n\n        9.693655243  seconds time elapsed   ( +-   0.069% )\n\ngrep thp /proc/vmstat\nthp_fault_alloc 35849\nthp_fault_fallback 0\nthp_collapse_alloc 3\nthp_collapse_alloc_failed 0\nthp_split 0\n\nthp_split 0 confirms no thp split despite plenty of hugepages allocated.\n\nThe measurement of only the mremap time (so excluding the 3 long\nmemset and final long 10GB memory accessing memcmp):\n\nTHP on\n\nusec 14824\nusec 14862\nusec 14859\n\nTHP off\n\nusec 256416\nusec 255981\nusec 255847\n\nWith an older kernel without the mremap optimizations (the below patch\noptimizes the non THP version too).\n\nTHP on\n\nusec 392107\nusec 390237\nusec 404124\n\nTHP off\n\nusec 444294\nusec 445237\nusec 445820\n\nI guess with a threaded program that sends more IPI on large SMP it\u0027d\ncreate an even larger difference.\n\nAll debug options are off except DEBUG_VM to avoid skewing the\nresults.\n\nThe only problem for native 2M mremap like it happens above both the\nsource and destination address must be 2M aligned or the hugepmd can\u0027t be\nmoved without a split but that is an hardware limitation.\n\n[akpm@linux-foundation.org: coding-style nitpicking]\nSigned-off-by: Andrea Arcangeli \u003caarcange@redhat.com\u003e\nAcked-by: Johannes Weiner \u003cjweiner@redhat.com\u003e\nAcked-by: Mel Gorman \u003cmgorman@suse.de\u003e\nAcked-by: Rik van Riel \u003criel@redhat.com\u003e\nCc: Hugh Dickins \u003chughd@google.com\u003e\nSigned-off-by: Andrew Morton \u003cakpm@linux-foundation.org\u003e\nSigned-off-by: Linus Torvalds \u003ctorvalds@linux-foundation.org\u003e\n"
    },
    {
      "commit": "7b6efc2bc4f19952b25ebf9b236e5ac43cd386c2",
      "tree": "bae674bd95329a498a5f2cc5d9c23bf5a4a54305",
      "parents": [
        "ebed48460be5abd86d9a24fa7c66378e58109f30"
      ],
      "author": {
        "name": "Andrea Arcangeli",
        "email": "aarcange@redhat.com",
        "time": "Mon Oct 31 17:08:26 2011 -0700"
      },
      "committer": {
        "name": "Linus Torvalds",
        "email": "torvalds@linux-foundation.org",
        "time": "Mon Oct 31 17:30:48 2011 -0700"
      },
      "message": "mremap: avoid sending one IPI per page\n\nThis replaces ptep_clear_flush() with ptep_get_and_clear() and a single\nflush_tlb_range() at the end of the loop, to avoid sending one IPI for\neach page.\n\nThe mmu_notifier_invalidate_range_start/end section is enlarged\naccordingly but this is not going to fundamentally change things.  It was\nmore by accident that the region under mremap was for the most part still\navailable for secondary MMUs: the primary MMU was never allowed to\nreliably access that region for the duration of the mremap (modulo\ntrapping SIGSEGV on the old address range which sounds unpractical and\nflakey).  If users wants secondary MMUs not to lose access to a large\nregion under mremap they should reduce the mremap size accordingly in\nuserland and run multiple calls.  Overall this will run faster so it\u0027s\nactually going to reduce the time the region is under mremap for the\nprimary MMU which should provide a net benefit to apps.\n\nFor KVM this is a noop because the guest physical memory is never\nmremapped, there\u0027s just no point it ever moving it while guest runs.  One\ntarget of this optimization is JVM GC (so unrelated to the mmu notifier\nlogic).\n\nSigned-off-by: Andrea Arcangeli \u003caarcange@redhat.com\u003e\nAcked-by: Johannes Weiner \u003cjweiner@redhat.com\u003e\nAcked-by: Mel Gorman \u003cmgorman@suse.de\u003e\nAcked-by: Rik van Riel \u003criel@redhat.com\u003e\nCc: Hugh Dickins \u003chughd@google.com\u003e\nSigned-off-by: Andrew Morton \u003cakpm@linux-foundation.org\u003e\nSigned-off-by: Linus Torvalds \u003ctorvalds@linux-foundation.org\u003e\n"
    },
    {
      "commit": "ebed48460be5abd86d9a24fa7c66378e58109f30",
      "tree": "721d935953c58ac014e3d00ea1092ab480ea3356",
      "parents": [
        "6661672053aee709d93f5dbd7887c789364c11d4"
      ],
      "author": {
        "name": "Andrea Arcangeli",
        "email": "aarcange@redhat.com",
        "time": "Mon Oct 31 17:08:22 2011 -0700"
      },
      "committer": {
        "name": "Linus Torvalds",
        "email": "torvalds@linux-foundation.org",
        "time": "Mon Oct 31 17:30:47 2011 -0700"
      },
      "message": "mremap: check for overflow using deltas\n\nUsing \"- 1\" relies on the old_end to be page aligned and PAGE_SIZE \u003e 1,\nthose are reasonable requirements but the check remains obscure and it\nlooks more like an off by one error than an overflow check.  This I feel\nwill improve readability.\n\nSigned-off-by: Andrea Arcangeli \u003caarcange@redhat.com\u003e\nAcked-by: Johannes Weiner \u003cjweiner@redhat.com\u003e\nAcked-by: Mel Gorman \u003cmgorman@suse.de\u003e\nAcked-by: Rik van Riel \u003criel@redhat.com\u003e\nCc: Hugh Dickins \u003chughd@google.com\u003e\nSigned-off-by: Andrew Morton \u003cakpm@linux-foundation.org\u003e\nSigned-off-by: Linus Torvalds \u003ctorvalds@linux-foundation.org\u003e\n"
    },
    {
      "commit": "6661672053aee709d93f5dbd7887c789364c11d4",
      "tree": "9d11bdb001d4e08cb03f8fffb1e499a9204408d1",
      "parents": [
        "0a93ebef698b08ed04af0d7d913bab8aedfdc253"
      ],
      "author": {
        "name": "Sam Ravnborg",
        "email": "sam@ravnborg.org",
        "time": "Mon Oct 31 17:08:20 2011 -0700"
      },
      "committer": {
        "name": "Linus Torvalds",
        "email": "torvalds@linux-foundation.org",
        "time": "Mon Oct 31 17:30:47 2011 -0700"
      },
      "message": "memblock: add NO_BOOTMEM config symbol\n\nWith the NO_BOOTMEM symbol added architectures may now use the following\nsyntax to tell that they do not need bootmem:\n\n\tselect NO_BOOTMEM\n\nThis is much more convinient than adding a new kconfig symbol which was\notherwise required.\n\nAdding this symbol does not conflict with the architctures that already\ndefine their own symbol.\n\nSigned-off-by: Sam Ravnborg \u003csam@ravnborg.org\u003e\nCc: Yinghai Lu \u003cyinghai@kernel.org\u003e\nAcked-by: Tejun Heo \u003ctj@kernel.org\u003e\nCc: \"H. Peter Anvin\" \u003chpa@zytor.com\u003e\nSigned-off-by: Andrew Morton \u003cakpm@linux-foundation.org\u003e\nSigned-off-by: Linus Torvalds \u003ctorvalds@linux-foundation.org\u003e\n"
    },
    {
      "commit": "0a93ebef698b08ed04af0d7d913bab8aedfdc253",
      "tree": "dcdb4bba9355647dd060d9150422537ad126442a",
      "parents": [
        "f5252e009d5b87071a919221e4f6624184005368"
      ],
      "author": {
        "name": "Sam Ravnborg",
        "email": "sam@ravnborg.org",
        "time": "Mon Oct 31 17:08:16 2011 -0700"
      },
      "committer": {
        "name": "Linus Torvalds",
        "email": "torvalds@linux-foundation.org",
        "time": "Mon Oct 31 17:30:47 2011 -0700"
      },
      "message": "memblock: add memblock_start_of_DRAM()\n\nSPARC32 require access to the start address.  Add a new helper\nmemblock_start_of_DRAM() to give access to the address of the first\nmemblock - which contains the lowest address.\n\nThe awkward name was chosen to match the already present\nmemblock_end_of_DRAM().\n\nSigned-off-by: Sam Ravnborg \u003csam@ravnborg.org\u003e\nCc: \"David S. Miller\" \u003cdavem@davemloft.net\u003e\nCc: Yinghai Lu \u003cyinghai@kernel.org\u003e\nAcked-by: Tejun Heo \u003ctj@kernel.org\u003e\nCc: \"H. Peter Anvin\" \u003chpa@zytor.com\u003e\nSigned-off-by: Andrew Morton \u003cakpm@linux-foundation.org\u003e\nSigned-off-by: Linus Torvalds \u003ctorvalds@linux-foundation.org\u003e\n"
    },
    {
      "commit": "f5252e009d5b87071a919221e4f6624184005368",
      "tree": "4be380e99c468dcb10597c445eb6b801897eafea",
      "parents": [
        "8c5fb8eadde41f67c61a7ac2d3246dab87bf7020"
      ],
      "author": {
        "name": "Mitsuo Hayasaka",
        "email": "mitsuo.hayasaka.hu@hitachi.com",
        "time": "Mon Oct 31 17:08:13 2011 -0700"
      },
      "committer": {
        "name": "Linus Torvalds",
        "email": "torvalds@linux-foundation.org",
        "time": "Mon Oct 31 17:30:47 2011 -0700"
      },
      "message": "mm: avoid null pointer access in vm_struct via /proc/vmallocinfo\n\nThe /proc/vmallocinfo shows information about vmalloc allocations in\nvmlist that is a linklist of vm_struct.  It, however, may access pages\nfield of vm_struct where a page was not allocated.  This results in a null\npointer access and leads to a kernel panic.\n\nWhy this happens: In __vmalloc_node_range() called from vmalloc(), newly\nallocated vm_struct is added to vmlist at __get_vm_area_node() and then,\nsome fields of vm_struct such as nr_pages and pages are set at\n__vmalloc_area_node().  In other words, it is added to vmlist before it is\nfully initialized.  At the same time, when the /proc/vmallocinfo is read,\nit accesses the pages field of vm_struct according to the nr_pages field\nat show_numa_info().  Thus, a null pointer access happens.\n\nThe patch adds the newly allocated vm_struct to the vmlist *after* it is\nfully initialized.  So, it can avoid accessing the pages field with\nunallocated page when show_numa_info() is called.\n\nSigned-off-by: Mitsuo Hayasaka \u003cmitsuo.hayasaka.hu@hitachi.com\u003e\nCc: Andrew Morton \u003cakpm@linux-foundation.org\u003e\nCc: David Rientjes \u003crientjes@google.com\u003e\nCc: Namhyung Kim \u003cnamhyung@gmail.com\u003e\nCc: \"Paul E. McKenney\" \u003cpaulmck@linux.vnet.ibm.com\u003e\nCc: Jeremy Fitzhardinge \u003cjeremy.fitzhardinge@citrix.com\u003e\nCc: \u003cstable@kernel.org\u003e\nCc: \u003cstable@vger.kernel.org\u003e\nSigned-off-by: Andrew Morton \u003cakpm@linux-foundation.org\u003e\nSigned-off-by: Linus Torvalds \u003ctorvalds@linux-foundation.org\u003e\n"
    },
    {
      "commit": "8c5fb8eadde41f67c61a7ac2d3246dab87bf7020",
      "tree": "77d333e6e5f939c43ee57f16f499764e3ddb815a",
      "parents": [
        "798248206b59acc6e1238c778281419c041891a7"
      ],
      "author": {
        "name": "Akinobu Mita",
        "email": "akinobu.mita@gmail.com",
        "time": "Mon Oct 31 17:08:10 2011 -0700"
      },
      "committer": {
        "name": "Linus Torvalds",
        "email": "torvalds@linux-foundation.org",
        "time": "Mon Oct 31 17:30:47 2011 -0700"
      },
      "message": "mm/debug-pagealloc.c: use memchr_inv\n\nUse newly introduced memchr_inv() for page verification.\n\nSigned-off-by: Akinobu Mita \u003cakinobu.mita@gmail.com\u003e\nSigned-off-by: Andrew Morton \u003cakpm@linux-foundation.org\u003e\nSigned-off-by: Linus Torvalds \u003ctorvalds@linux-foundation.org\u003e\n"
    },
    {
      "commit": "798248206b59acc6e1238c778281419c041891a7",
      "tree": "ff8564431367b442b18bca4a0a9732e5799e2391",
      "parents": [
        "77311139f364d7f71fc9ba88f59fd90e60205007"
      ],
      "author": {
        "name": "Akinobu Mita",
        "email": "akinobu.mita@gmail.com",
        "time": "Mon Oct 31 17:08:07 2011 -0700"
      },
      "committer": {
        "name": "Linus Torvalds",
        "email": "torvalds@linux-foundation.org",
        "time": "Mon Oct 31 17:30:47 2011 -0700"
      },
      "message": "lib/string.c: introduce memchr_inv()\n\nmemchr_inv() is mainly used to check whether the whole buffer is filled\nwith just a specified byte.\n\nThe function name and prototype are stolen from logfs and the\nimplementation is from SLUB.\n\nSigned-off-by: Akinobu Mita \u003cakinobu.mita@gmail.com\u003e\nAcked-by: Christoph Lameter \u003ccl@linux-foundation.org\u003e\nAcked-by: Pekka Enberg \u003cpenberg@kernel.org\u003e\nCc: Matt Mackall \u003cmpm@selenic.com\u003e\nAcked-by: Joern Engel \u003cjoern@logfs.org\u003e\nCc: Marcin Slusarz \u003cmarcin.slusarz@gmail.com\u003e\nCc: Eric Dumazet \u003ceric.dumazet@gmail.com\u003e\nSigned-off-by: Andrew Morton \u003cakpm@linux-foundation.org\u003e\nSigned-off-by: Linus Torvalds \u003ctorvalds@linux-foundation.org\u003e\n"
    },
    {
      "commit": "77311139f364d7f71fc9ba88f59fd90e60205007",
      "tree": "5cde11d67e431781893bb0375dcae665750ed6dc",
      "parents": [
        "16fb951237c2b0b28037b992ee44e7ee401c30d1"
      ],
      "author": {
        "name": "Akinobu Mita",
        "email": "akinobu.mita@gmail.com",
        "time": "Mon Oct 31 17:08:05 2011 -0700"
      },
      "committer": {
        "name": "Linus Torvalds",
        "email": "torvalds@linux-foundation.org",
        "time": "Mon Oct 31 17:30:47 2011 -0700"
      },
      "message": "mm/debug-pagealloc.c: use plain __ratelimit() instead of printk_ratelimit()\n\nprintk_ratelimit() should not be used, because it shares ratelimiting\nstate with all other unrelated printk_ratelimit() callsites.\n\nSigned-off-by: Akinobu Mita \u003cakinobu.mita@gmail.com\u003e\nSigned-off-by: Andrew Morton \u003cakpm@linux-foundation.org\u003e\nSigned-off-by: Linus Torvalds \u003ctorvalds@linux-foundation.org\u003e\n"
    },
    {
      "commit": "16fb951237c2b0b28037b992ee44e7ee401c30d1",
      "tree": "756ba52239d304d8f45deb102f17960f0a8517ec",
      "parents": [
        "49ea7eb65e7c5060807fb9312b1ad4c3eab82e2c"
      ],
      "author": {
        "name": "Shaohua Li",
        "email": "shaohua.li@intel.com",
        "time": "Mon Oct 31 17:08:02 2011 -0700"
      },
      "committer": {
        "name": "Linus Torvalds",
        "email": "torvalds@linux-foundation.org",
        "time": "Mon Oct 31 17:30:47 2011 -0700"
      },
      "message": "vmscan: count pages into balanced for zone with good watermark\n\nIt\u0027s possible a zone watermark is ok when entering the balance_pgdat()\nloop, while the zone is within the requested classzone_idx.  Count pages\nfrom this zone into `balanced\u0027.  In this way, we can skip shrinking zones\ntoo much for high order allocation.\n\nSigned-off-by: Shaohua Li \u003cshaohua.li@intel.com\u003e\nAcked-by: Mel Gorman \u003cmgorman@suse.de\u003e\nReviewed-by: Minchan Kim \u003cminchan.kim@gmail.com\u003e\nSigned-off-by: Andrew Morton \u003cakpm@linux-foundation.org\u003e\nSigned-off-by: Linus Torvalds \u003ctorvalds@linux-foundation.org\u003e\n"
    },
    {
      "commit": "49ea7eb65e7c5060807fb9312b1ad4c3eab82e2c",
      "tree": "88eaa206cdcac1190817820a0eb56bca2585f9ea",
      "parents": [
        "92df3a723f84cdf8133560bbff950a7a99e92bc9"
      ],
      "author": {
        "name": "Mel Gorman",
        "email": "mgorman@suse.de",
        "time": "Mon Oct 31 17:07:59 2011 -0700"
      },
      "committer": {
        "name": "Linus Torvalds",
        "email": "torvalds@linux-foundation.org",
        "time": "Mon Oct 31 17:30:47 2011 -0700"
      },
      "message": "mm: vmscan: immediately reclaim end-of-LRU dirty pages when writeback completes\n\nWhen direct reclaim encounters a dirty page, it gets recycled around the\nLRU for another cycle.  This patch marks the page PageReclaim similar to\ndeactivate_page() so that the page gets reclaimed almost immediately after\nthe page gets cleaned.  This is to avoid reclaiming clean pages that are\nyounger than a dirty page encountered at the end of the LRU that might\nhave been something like a use-once page.\n\nSigned-off-by: Mel Gorman \u003cmgorman@suse.de\u003e\nAcked-by: Johannes Weiner \u003cjweiner@redhat.com\u003e\nCc: Dave Chinner \u003cdavid@fromorbit.com\u003e\nCc: Christoph Hellwig \u003chch@infradead.org\u003e\nCc: Wu Fengguang \u003cfengguang.wu@intel.com\u003e\nCc: Jan Kara \u003cjack@suse.cz\u003e\nCc: Minchan Kim \u003cminchan.kim@gmail.com\u003e\nCc: Rik van Riel \u003criel@redhat.com\u003e\nCc: Mel Gorman \u003cmgorman@suse.de\u003e\nCc: Alex Elder \u003caelder@sgi.com\u003e\nCc: Theodore Ts\u0027o \u003ctytso@mit.edu\u003e\nCc: Chris Mason \u003cchris.mason@oracle.com\u003e\nCc: Dave Hansen \u003cdave@linux.vnet.ibm.com\u003e\nSigned-off-by: Andrew Morton \u003cakpm@linux-foundation.org\u003e\nSigned-off-by: Linus Torvalds \u003ctorvalds@linux-foundation.org\u003e\n"
    },
    {
      "commit": "92df3a723f84cdf8133560bbff950a7a99e92bc9",
      "tree": "503efc278236d877508da66ea7ec7cbb81203d64",
      "parents": [
        "f84f6e2b0868f198f97a32ba503d6f9f319a249a"
      ],
      "author": {
        "name": "Mel Gorman",
        "email": "mgorman@suse.de",
        "time": "Mon Oct 31 17:07:56 2011 -0700"
      },
      "committer": {
        "name": "Linus Torvalds",
        "email": "torvalds@linux-foundation.org",
        "time": "Mon Oct 31 17:30:46 2011 -0700"
      },
      "message": "mm: vmscan: throttle reclaim if encountering too many dirty pages under writeback\n\nWorkloads that are allocating frequently and writing files place a large\nnumber of dirty pages on the LRU.  With use-once logic, it is possible for\nthem to reach the end of the LRU quickly requiring the reclaimer to scan\nmore to find clean pages.  Ordinarily, processes that are dirtying memory\nwill get throttled by dirty balancing but this is a global heuristic and\ndoes not take into account that LRUs are maintained on a per-zone basis.\nThis can lead to a situation whereby reclaim is scanning heavily, skipping\nover a large number of pages under writeback and recycling them around the\nLRU consuming CPU.\n\nThis patch checks how many of the number of pages isolated from the LRU\nwere dirty and under writeback.  If a percentage of them under writeback,\nthe process will be throttled if a backing device or the zone is\ncongested.  Note that this applies whether it is anonymous or file-backed\npages that are under writeback meaning that swapping is potentially\nthrottled.  This is intentional due to the fact if the swap device is\ncongested, scanning more pages and dispatching more IO is not going to\nhelp matters.\n\nThe percentage that must be in writeback depends on the priority.  At\ndefault priority, all of them must be dirty.  At DEF_PRIORITY-1, 50% of\nthem must be, DEF_PRIORITY-2, 25% etc.  i.e.  
as pressure increases the\ngreater the likelihood the process will get throttled to allow the flusher\nthreads to make some progress.\n\nSigned-off-by: Mel Gorman \u003cmgorman@suse.de\u003e\nReviewed-by: Minchan Kim \u003cminchan.kim@gmail.com\u003e\nAcked-by: Johannes Weiner \u003cjweiner@redhat.com\u003e\nCc: Dave Chinner \u003cdavid@fromorbit.com\u003e\nCc: Christoph Hellwig \u003chch@infradead.org\u003e\nCc: Wu Fengguang \u003cfengguang.wu@intel.com\u003e\nCc: Jan Kara \u003cjack@suse.cz\u003e\nCc: Rik van Riel \u003criel@redhat.com\u003e\nCc: Mel Gorman \u003cmgorman@suse.de\u003e\nCc: Alex Elder \u003caelder@sgi.com\u003e\nCc: Theodore Ts\u0027o \u003ctytso@mit.edu\u003e\nCc: Chris Mason \u003cchris.mason@oracle.com\u003e\nCc: Dave Hansen \u003cdave@linux.vnet.ibm.com\u003e\nSigned-off-by: Andrew Morton \u003cakpm@linux-foundation.org\u003e\nSigned-off-by: Linus Torvalds \u003ctorvalds@linux-foundation.org\u003e\n"
    },
    {
      "commit": "f84f6e2b0868f198f97a32ba503d6f9f319a249a",
      "tree": "2103afe0304afd0045dca4b92dfd35922cfc289b",
      "parents": [
        "966dbde2c208e07bab7a45a7855e1e693eabe661"
      ],
      "author": {
        "name": "Mel Gorman",
        "email": "mgorman@suse.de",
        "time": "Mon Oct 31 17:07:51 2011 -0700"
      },
      "committer": {
        "name": "Linus Torvalds",
        "email": "torvalds@linux-foundation.org",
        "time": "Mon Oct 31 17:30:46 2011 -0700"
      },
      "message": "mm: vmscan: do not writeback filesystem pages in kswapd except in high priority\n\nIt is preferable that no dirty pages are dispatched for cleaning from the\npage reclaim path.  At normal priorities, this patch prevents kswapd\nwriting pages.\n\nHowever, page reclaim does have a requirement that pages be freed in a\nparticular zone.  If it is failing to make sufficient progress (reclaiming\n\u003c SWAP_CLUSTER_MAX at any priority priority), the priority is raised to\nscan more pages.  A priority of DEF_PRIORITY - 3 is considered to be the\npoint where kswapd is getting into trouble reclaiming pages.  If this\npriority is reached, kswapd will dispatch pages for writing.\n\nSigned-off-by: Mel Gorman \u003cmgorman@suse.de\u003e\nReviewed-by: Minchan Kim \u003cminchan.kim@gmail.com\u003e\nCc: Dave Chinner \u003cdavid@fromorbit.com\u003e\nCc: Christoph Hellwig \u003chch@infradead.org\u003e\nCc: Johannes Weiner \u003cjweiner@redhat.com\u003e\nCc: Wu Fengguang \u003cfengguang.wu@intel.com\u003e\nCc: Jan Kara \u003cjack@suse.cz\u003e\nCc: Rik van Riel \u003criel@redhat.com\u003e\nCc: Mel Gorman \u003cmgorman@suse.de\u003e\nCc: Alex Elder \u003caelder@sgi.com\u003e\nCc: Theodore Ts\u0027o \u003ctytso@mit.edu\u003e\nCc: Chris Mason \u003cchris.mason@oracle.com\u003e\nCc: Dave Hansen \u003cdave@linux.vnet.ibm.com\u003e\nSigned-off-by: Andrew Morton \u003cakpm@linux-foundation.org\u003e\nSigned-off-by: Linus Torvalds \u003ctorvalds@linux-foundation.org\u003e\n"
    },
    {
      "commit": "a18bba061c789f5815c3efc3c80e6ac269911964",
      "tree": "bec0234fb338f8e06b6e39df2cfa09acf2a968a3",
      "parents": [
        "ee72886d8ed5d9de3fa0ed3b99a7ca7702576a96"
      ],
      "author": {
        "name": "Mel Gorman",
        "email": "mgorman@suse.de",
        "time": "Mon Oct 31 17:07:42 2011 -0700"
      },
      "committer": {
        "name": "Linus Torvalds",
        "email": "torvalds@linux-foundation.org",
        "time": "Mon Oct 31 17:30:46 2011 -0700"
      },
      "message": "mm: vmscan: remove dead code related to lumpy reclaim waiting on pages under writeback\n\nLumpy reclaim worked with two passes - the first which queued pages for IO\nand the second which waited on writeback.  As direct reclaim can no longer\nwrite pages there is some dead code.  This patch removes it but direct\nreclaim will continue to wait on pages under writeback while in\nsynchronous reclaim mode.\n\nSigned-off-by: Mel Gorman \u003cmgorman@suse.de\u003e\nCc: Dave Chinner \u003cdavid@fromorbit.com\u003e\nCc: Christoph Hellwig \u003chch@infradead.org\u003e\nCc: Johannes Weiner \u003cjweiner@redhat.com\u003e\nCc: Wu Fengguang \u003cfengguang.wu@intel.com\u003e\nCc: Jan Kara \u003cjack@suse.cz\u003e\nCc: Minchan Kim \u003cminchan.kim@gmail.com\u003e\nCc: Rik van Riel \u003criel@redhat.com\u003e\nCc: Mel Gorman \u003cmgorman@suse.de\u003e\nCc: Alex Elder \u003caelder@sgi.com\u003e\nCc: Theodore Ts\u0027o \u003ctytso@mit.edu\u003e\nCc: Chris Mason \u003cchris.mason@oracle.com\u003e\nCc: Dave Hansen \u003cdave@linux.vnet.ibm.com\u003e\nSigned-off-by: Andrew Morton \u003cakpm@linux-foundation.org\u003e\nSigned-off-by: Linus Torvalds \u003ctorvalds@linux-foundation.org\u003e\n"
    },
    {
      "commit": "ee72886d8ed5d9de3fa0ed3b99a7ca7702576a96",
      "tree": "d9596005d3ea38541c5dfe1c2a0b7d5a4d73488f",
      "parents": [
        "e10d59f2c3decaf22cc5d3de7040eba202bc2df3"
      ],
      "author": {
        "name": "Mel Gorman",
        "email": "mel@csn.ul.ie",
        "time": "Mon Oct 31 17:07:38 2011 -0700"
      },
      "committer": {
        "name": "Linus Torvalds",
        "email": "torvalds@linux-foundation.org",
        "time": "Mon Oct 31 17:30:46 2011 -0700"
      },
      "message": "mm: vmscan: do not writeback filesystem pages in direct reclaim\n\nTesting from the XFS folk revealed that there is still too much I/O from\nthe end of the LRU in kswapd.  Previously it was considered acceptable by\nVM people for a small number of pages to be written back from reclaim with\ntesting generally showing about 0.3% of pages reclaimed were written back\n(higher if memory was low).  That writing back a small number of pages is\nok has been heavily disputed for quite some time and Dave Chinner\nexplained it well;\n\n\tIt doesn\u0027t have to be a very high number to be a problem. IO\n\tis orders of magnitude slower than the CPU time it takes to\n\tflush a page, so the cost of making a bad flush decision is\n\tvery high. And single page writeback from the LRU is almost\n\talways a bad flush decision.\n\nTo complicate matters, filesystems respond very differently to requests\nfrom reclaim according to Christoph Hellwig;\n\n\txfs tries to write it back if the requester is kswapd\n\text4 ignores the request if it\u0027s a delayed allocation\n\tbtrfs ignores the request\n\nAs a result, each filesystem has different performance characteristics\nwhen under memory pressure and there are many pages being dirtied.  
In\nsome cases, the request is ignored entirely so the VM cannot depend on the\nIO being dispatched.\n\nThe objective of this series is to reduce writing of filesystem-backed\npages from reclaim, play nicely with writeback that is already in progress\nand throttle reclaim appropriately when writeback pages are encountered.\nThe assumption is that the flushers will always write pages faster than if\nreclaim issues the IO.\n\nA secondary goal is to avoid the problem whereby direct reclaim splices\ntwo potentially deep call stacks together.\n\nThere is a potential new problem as reclaim has less control over how long\nbefore a page in a particularly zone or container is cleaned and direct\nreclaimers depend on kswapd or flusher threads to do the necessary work.\nHowever, as filesystems sometimes ignore direct reclaim requests already,\nit is not expected to be a serious issue.\n\nPatch 1 disables writeback of filesystem pages from direct reclaim\n\tentirely. Anonymous pages are still written.\n\nPatch 2 removes dead code in lumpy reclaim as it is no longer able\n\tto synchronously write pages. This hurts lumpy reclaim but\n\tthere is an expectation that compaction is used for hugepage\n\tallocations these days and lumpy reclaim\u0027s days are numbered.\n\nPatches 3-4 add warnings to XFS and ext4 if called from\n\tdirect reclaim. 
With patch 1, this \"never happens\" and is\n\tintended to catch regressions in this logic in the future.\n\nPatch 5 disables writeback of filesystem pages from kswapd unless\n\tthe priority is raised to the point where kswapd is considered\n\tto be in trouble.\n\nPatch 6 throttles reclaimers if too many dirty pages are being\n\tencountered and the zones or backing devices are congested.\n\nPatch 7 invalidates dirty pages found at the end of the LRU so they\n\tare reclaimed quickly after being written back rather than\n\twaiting for a reclaimer to find them\n\nI consider this series to be orthogonal to the writeback work but it is\nworth noting that the writeback work affects the viability of patch 8 in\nparticular.\n\nI tested this on ext4 and xfs using fs_mark, a simple writeback test based\non dd and a micro benchmark that does a streaming write to a large mapping\n(exercises use-once LRU logic) followed by streaming writes to a mix of\nanonymous and file-backed mappings.  The command line for fs_mark when\nbooted with 512M looked something like\n\n./fs_mark -d  /tmp/fsmark-2676  -D  100  -N  150  -n  150  -L  25  -t  1  -S0  -s  10485760\n\nThe number of files was adjusted depending on the amount of available\nmemory so that the files created was about 3xRAM.  For multiple threads,\nthe -d switch is specified multiple times.\n\nThe test machine is x86-64 with an older generation of AMD processor with\n4 cores.  The underlying storage was 4 disks configured as RAID-0 as this\nwas the best configuration of storage I had available.  Swap is on a\nseparate disk.  
Dirty ratio was tuned to 40% instead of the default of\n20%.\n\nTesting was run with and without monitors to both verify that the patches\nwere operating as expected and that any performance gain was real and not\ndue to interference from monitors.\n\nHere is a summary of results based on testing XFS.\n\n512M1P-xfs           Files/s  mean                 32.69 ( 0.00%)     34.44 ( 5.08%)\n512M1P-xfs           Elapsed Time fsmark                    51.41     48.29\n512M1P-xfs           Elapsed Time simple-wb                114.09    108.61\n512M1P-xfs           Elapsed Time mmap-strm                113.46    109.34\n512M1P-xfs           Kswapd efficiency fsmark                 62%       63%\n512M1P-xfs           Kswapd efficiency simple-wb              56%       61%\n512M1P-xfs           Kswapd efficiency mmap-strm              44%       42%\n512M-xfs             Files/s  mean                 30.78 ( 0.00%)     35.94 (14.36%)\n512M-xfs             Elapsed Time fsmark                    56.08     48.90\n512M-xfs             Elapsed Time simple-wb                112.22     98.13\n512M-xfs             Elapsed Time mmap-strm                219.15    196.67\n512M-xfs             Kswapd efficiency fsmark                 54%       56%\n512M-xfs             Kswapd efficiency simple-wb              54%       55%\n512M-xfs             Kswapd efficiency mmap-strm              45%       44%\n512M-4X-xfs          Files/s  mean                 30.31 ( 0.00%)     33.33 ( 9.06%)\n512M-4X-xfs          Elapsed Time fsmark                    63.26     55.88\n512M-4X-xfs          Elapsed Time simple-wb                100.90     90.25\n512M-4X-xfs          Elapsed Time mmap-strm                261.73    255.38\n512M-4X-xfs          Kswapd efficiency fsmark                 49%       50%\n512M-4X-xfs          Kswapd efficiency simple-wb              54%       56%\n512M-4X-xfs          Kswapd efficiency mmap-strm              37%       36%\n512M-16X-xfs         Files/s  mean                
 60.89 ( 0.00%)     65.22 ( 6.64%)\n512M-16X-xfs         Elapsed Time fsmark                    67.47     58.25\n512M-16X-xfs         Elapsed Time simple-wb                103.22     90.89\n512M-16X-xfs         Elapsed Time mmap-strm                237.09    198.82\n512M-16X-xfs         Kswapd efficiency fsmark                 45%       46%\n512M-16X-xfs         Kswapd efficiency simple-wb              53%       55%\n512M-16X-xfs         Kswapd efficiency mmap-strm              33%       33%\n\nUp until 512-4X, the FSmark improvements were statistically significant.\nFor the 4X and 16X tests the results were within standard deviations but\njust barely.  The time to completion for all tests is improved which is an\nimportant result.  In general, kswapd efficiency is not affected by\nskipping dirty pages.\n\n1024M1P-xfs          Files/s  mean                 39.09 ( 0.00%)     41.15 ( 5.01%)\n1024M1P-xfs          Elapsed Time fsmark                    84.14     80.41\n1024M1P-xfs          Elapsed Time simple-wb                210.77    184.78\n1024M1P-xfs          Elapsed Time mmap-strm                162.00    160.34\n1024M1P-xfs          Kswapd efficiency fsmark                 69%       75%\n1024M1P-xfs          Kswapd efficiency simple-wb              71%       77%\n1024M1P-xfs          Kswapd efficiency mmap-strm              43%       44%\n1024M-xfs            Files/s  mean                 35.45 ( 0.00%)     37.00 ( 4.19%)\n1024M-xfs            Elapsed Time fsmark                    94.59     91.00\n1024M-xfs            Elapsed Time simple-wb                229.84    195.08\n1024M-xfs            Elapsed Time mmap-strm                405.38    440.29\n1024M-xfs            Kswapd efficiency fsmark                 79%       71%\n1024M-xfs            Kswapd efficiency simple-wb              74%       74%\n1024M-xfs            Kswapd efficiency mmap-strm              39%       42%\n1024M-4X-xfs         Files/s  mean                 32.63 ( 0.00%)     35.05 ( 
6.90%)\n1024M-4X-xfs         Elapsed Time fsmark                   103.33     97.74\n1024M-4X-xfs         Elapsed Time simple-wb                204.48    178.57\n1024M-4X-xfs         Elapsed Time mmap-strm                528.38    511.88\n1024M-4X-xfs         Kswapd efficiency fsmark                 81%       70%\n1024M-4X-xfs         Kswapd efficiency simple-wb              73%       72%\n1024M-4X-xfs         Kswapd efficiency mmap-strm              39%       38%\n1024M-16X-xfs        Files/s  mean                 42.65 ( 0.00%)     42.97 ( 0.74%)\n1024M-16X-xfs        Elapsed Time fsmark                   103.11     99.11\n1024M-16X-xfs        Elapsed Time simple-wb                200.83    178.24\n1024M-16X-xfs        Elapsed Time mmap-strm                397.35    459.82\n1024M-16X-xfs        Kswapd efficiency fsmark                 84%       69%\n1024M-16X-xfs        Kswapd efficiency simple-wb              74%       73%\n1024M-16X-xfs        Kswapd efficiency mmap-strm              39%       40%\n\nAll FSMark tests up to 16X had statistically significant improvements.\nFor the most part, tests are completing faster with the exception of the\nstreaming writes to a mixture of anonymous and file-backed mappings which\nwere slower in two cases\n\nIn the cases where the mmap-strm tests were slower, there was more\nswapping due to dirty pages being skipped.  The number of additional pages\nswapped is almost identical to the fewer number of pages written from\nreclaim.  In other words, roughly the same number of pages were reclaimed\nbut swapping was slower.  
As the test is a bit unrealistic and stresses\nmemory heavily, the small shift is acceptable.\n\n4608M1P-xfs          Files/s  mean                 29.75 ( 0.00%)     30.96 ( 3.91%)\n4608M1P-xfs          Elapsed Time fsmark                   512.01    492.15\n4608M1P-xfs          Elapsed Time simple-wb                618.18    566.24\n4608M1P-xfs          Elapsed Time mmap-strm                488.05    465.07\n4608M1P-xfs          Kswapd efficiency fsmark                 93%       86%\n4608M1P-xfs          Kswapd efficiency simple-wb              88%       84%\n4608M1P-xfs          Kswapd efficiency mmap-strm              46%       45%\n4608M-xfs            Files/s  mean                 27.60 ( 0.00%)     28.85 ( 4.33%)\n4608M-xfs            Elapsed Time fsmark                   555.96    532.34\n4608M-xfs            Elapsed Time simple-wb                659.72    571.85\n4608M-xfs            Elapsed Time mmap-strm               1082.57   1146.38\n4608M-xfs            Kswapd efficiency fsmark                 89%       91%\n4608M-xfs            Kswapd efficiency simple-wb              88%       82%\n4608M-xfs            Kswapd efficiency mmap-strm              48%       46%\n4608M-4X-xfs         Files/s  mean                 26.00 ( 0.00%)     27.47 ( 5.35%)\n4608M-4X-xfs         Elapsed Time fsmark                   592.91    564.00\n4608M-4X-xfs         Elapsed Time simple-wb                616.65    575.07\n4608M-4X-xfs         Elapsed Time mmap-strm               1773.02   1631.53\n4608M-4X-xfs         Kswapd efficiency fsmark                 90%       94%\n4608M-4X-xfs         Kswapd efficiency simple-wb              87%       82%\n4608M-4X-xfs         Kswapd efficiency mmap-strm              43%       43%\n4608M-16X-xfs        Files/s  mean                 26.07 ( 0.00%)     26.42 ( 1.32%)\n4608M-16X-xfs        Elapsed Time fsmark                   602.69    585.78\n4608M-16X-xfs        Elapsed Time simple-wb                606.60    573.81\n4608M-16X-xfs      
  Elapsed Time mmap-strm               1549.75   1441.86\n4608M-16X-xfs        Kswapd efficiency fsmark                 98%       98%\n4608M-16X-xfs        Kswapd efficiency simple-wb              88%       82%\n4608M-16X-xfs        Kswapd efficiency mmap-strm              44%       42%\n\nUnlike the other tests, the fsmark results are not statistically\nsignificant but the min and max times are both improved and for the most\npart, tests completed faster.\n\nThere are other indications that this is an improvement as well.  For\nexample, in the vast majority of cases, there were fewer pages scanned by\ndirect reclaim implying in many cases that stalls due to direct reclaim\nare reduced.  KSwapd is scanning more due to skipping dirty pages which is\nunfortunate but the CPU usage is still acceptable\n\nIn an earlier set of tests, I used blktrace and in almost all cases\nthroughput throughout the entire test was higher.  However, I ended up\ndiscarding those results as recording blktrace data was too heavy for my\nliking.\n\nOn a laptop, I plugged in a USB stick and ran a similar tests of tests\nusing it as backing storage.  
A desktop environment was running and for\nthe entire duration of the tests, firefox and gnome terminal were\nlaunching and exiting to vaguely simulate a user.\n\n1024M-xfs            Files/s  mean               0.41 ( 0.00%)        0.44 ( 6.82%)\n1024M-xfs            Elapsed Time fsmark               2053.52   1641.03\n1024M-xfs            Elapsed Time simple-wb            1229.53    768.05\n1024M-xfs            Elapsed Time mmap-strm            4126.44   4597.03\n1024M-xfs            Kswapd efficiency fsmark              84%       85%\n1024M-xfs            Kswapd efficiency simple-wb           92%       81%\n1024M-xfs            Kswapd efficiency mmap-strm           60%       51%\n1024M-xfs            Avg wait ms fsmark                5404.53     4473.87\n1024M-xfs            Avg wait ms simple-wb             2541.35     1453.54\n1024M-xfs            Avg wait ms mmap-strm             3400.25     3852.53\n\nThe mmap-strm results were hurt because firefox launching had a tendency\nto push the test out of memory.  On the positive side, firefox launched\nmarginally faster with the patches applied.  Time to completion for many\ntests was faster but more importantly - the \"Avg wait\" time as measured by\niostat was far lower implying the system would be more responsive.  It was\nalso the case that \"Avg wait ms\" on the root filesystem was lower.  I\ntested it manually and while the system felt slightly more responsive\nwhile copying data to a USB stick, it was marginal enough that it could be\nmy imagination.\n\nThis patch: do not writeback filesystem pages in direct reclaim.\n\nWhen kswapd is failing to keep zones above the min watermark, a process\nwill enter direct reclaim in the same manner kswapd does.  If a dirty page\nis encountered during the scan, this page is written to backing storage\nusing mapping-\u003ewritepage.\n\nThis causes two problems.  First, it can result in very deep call stacks,\nparticularly if the target storage or filesystem are complex.  
Some\nfilesystems ignore write requests from direct reclaim as a result.  The\nsecond is that a single-page flush is inefficient in terms of IO.  While\nthere is an expectation that the elevator will merge requests, this does\nnot always happen.  Quoting Christoph Hellwig;\n\n\tThe elevator has a relatively small window it can operate on,\n\tand can never fix up a bad large scale writeback pattern.\n\nThis patch prevents direct reclaim writing back filesystem pages by\nchecking if current is kswapd.  Anonymous pages are still written to swap\nas there is not the equivalent of a flusher thread for anonymous pages.\nIf the dirty pages cannot be written back, they are placed back on the LRU\nlists.  There is now a direct dependency on dirty page balancing to\nprevent too many pages in the system being dirtied which would prevent\nreclaim making forward progress.\n\nSigned-off-by: Mel Gorman \u003cmgorman@suse.de\u003e\nReviewed-by: Minchan Kim \u003cminchan.kim@gmail.com\u003e\nCc: Dave Chinner \u003cdavid@fromorbit.com\u003e\nCc: Christoph Hellwig \u003chch@infradead.org\u003e\nCc: Johannes Weiner \u003cjweiner@redhat.com\u003e\nCc: Wu Fengguang \u003cfengguang.wu@intel.com\u003e\nCc: Jan Kara \u003cjack@suse.cz\u003e\nCc: Rik van Riel \u003criel@redhat.com\u003e\nCc: Mel Gorman \u003cmgorman@suse.de\u003e\nCc: Alex Elder \u003caelder@sgi.com\u003e\nCc: Theodore Ts\u0027o \u003ctytso@mit.edu\u003e\nCc: Chris Mason \u003cchris.mason@oracle.com\u003e\nCc: Dave Hansen \u003cdave@linux.vnet.ibm.com\u003e\nSigned-off-by: Andrew Morton \u003cakpm@linux-foundation.org\u003e\nSigned-off-by: Linus Torvalds \u003ctorvalds@linux-foundation.org\u003e\n"
    },
    {
      "commit": "f11c0ca501af89fc07b0d9f17531ba3b68a4ef39",
      "tree": "c66a24b4ca2778b940c01a2af78eca6abc0b3421",
      "parents": [
        "4f31888c104687078f8d88c2f11eca1080c88464"
      ],
      "author": {
        "name": "Johannes Weiner",
        "email": "jweiner@redhat.com",
        "time": "Mon Oct 31 17:07:27 2011 -0700"
      },
      "committer": {
        "name": "Linus Torvalds",
        "email": "torvalds@linux-foundation.org",
        "time": "Mon Oct 31 17:30:46 2011 -0700"
      },
      "message": "mm: vmscan: drop nr_force_scan[] from get_scan_count\n\nThe nr_force_scan[] tuple holds the effective scan numbers for anon and\nfile pages in case the situation called for a forced scan and the\nregularly calculated scan numbers turned out zero.\n\nHowever, the effective scan number can always be assumed to be\nSWAP_CLUSTER_MAX right before the division into anon and file.  The\nnumerators and denominator are properly set up for all cases, be it force\nscan for just file, just anon, or both, to do the right thing.\n\nSigned-off-by: Johannes Weiner \u003cjweiner@redhat.com\u003e\nReviewed-by: Minchan Kim \u003cminchan.kim@gmail.com\u003e\nAcked-by: KAMEZAWA Hiroyuki \u003ckamezawa.hiroyu@jp.fujitsu.com\u003e\nReviewed-by: Michal Hocko \u003cmhocko@suse.cz\u003e\nCc: Ying Han \u003cyinghan@google.com\u003e\nCc: Balbir Singh \u003cbsingharora@gmail.com\u003e\nCc: KOSAKI Motohiro \u003ckosaki.motohiro@jp.fujitsu.com\u003e\nCc: Daisuke Nishimura \u003cnishimura@mxp.nes.nec.co.jp\u003e\nAcked-by: Mel Gorman \u003cmel@csn.ul.ie\u003e\nSigned-off-by: Andrew Morton \u003cakpm@linux-foundation.org\u003e\nSigned-off-by: Linus Torvalds \u003ctorvalds@linux-foundation.org\u003e\n"
    },
    {
      "commit": "4f31888c104687078f8d88c2f11eca1080c88464",
      "tree": "453bcfffe1955f6087156916eb102af1575206ee",
      "parents": [
        "f5fc870da2f8798edb5481cd2137a3b2d5bd1b19"
      ],
      "author": {
        "name": "Dave Jones",
        "email": "davej@redhat.com",
        "time": "Mon Oct 31 17:07:24 2011 -0700"
      },
      "committer": {
        "name": "Linus Torvalds",
        "email": "torvalds@linux-foundation.org",
        "time": "Mon Oct 31 17:30:45 2011 -0700"
      },
      "message": "mm: output a list of loaded modules when we hit bad_page()\n\nWhen we get a bad_page bug report, it\u0027s useful to see what modules the\nuser had loaded.\n\nSigned-off-by: Dave Jones \u003cdavej@redhat.com\u003e\nSigned-off-by: Andrew Morton \u003cakpm@linux-foundation.org\u003e\nSigned-off-by: Linus Torvalds \u003ctorvalds@linux-foundation.org\u003e\n"
    },
    {
      "commit": "43362a4977e37db46f86f7e6ab935f0006956632",
      "tree": "5ab7070237ebd3f40d7fcfc0066586422da8310a",
      "parents": [
        "c9f01245b6a7d77d17deaa71af10f6aca14fa24e"
      ],
      "author": {
        "name": "David Rientjes",
        "email": "rientjes@google.com",
        "time": "Mon Oct 31 17:07:18 2011 -0700"
      },
      "committer": {
        "name": "Linus Torvalds",
        "email": "torvalds@linux-foundation.org",
        "time": "Mon Oct 31 17:30:45 2011 -0700"
      },
      "message": "oom: fix race while temporarily setting current\u0027s oom_score_adj\n\ntest_set_oom_score_adj() was introduced in 72788c385604 (\"oom: replace\nPF_OOM_ORIGIN with toggling oom_score_adj\") to temporarily elevate\ncurrent\u0027s oom_score_adj for ksm and swapoff without requiring an\nadditional per-process flag.\n\nUsing that function to both set oom_score_adj to OOM_SCORE_ADJ_MAX and\nthen reinstate the previous value is racy since it\u0027s possible that\nuserspace can set the value to something else itself before the old value\nis reinstated.  That results in userspace setting current\u0027s oom_score_adj\nto a different value and then the kernel immediately setting it back to\nits previous value without notification.\n\nTo fix this, a new compare_swap_oom_score_adj() function is introduced\nwith the same semantics as the compare and swap CAS instruction, or\nCMPXCHG on x86.  It is used to reinstate the previous value of\noom_score_adj if and only if the present value is the same as the old\nvalue.\n\nSigned-off-by: David Rientjes \u003crientjes@google.com\u003e\nCc: Oleg Nesterov \u003coleg@redhat.com\u003e\nCc: Ying Han \u003cyinghan@google.com\u003e\nCc: KOSAKI Motohiro \u003ckosaki.motohiro@jp.fujitsu.com\u003e\nSigned-off-by: Andrew Morton \u003cakpm@linux-foundation.org\u003e\nSigned-off-by: Linus Torvalds \u003ctorvalds@linux-foundation.org\u003e\n"
    },
    {
      "commit": "c9f01245b6a7d77d17deaa71af10f6aca14fa24e",
      "tree": "13ffde591a5bcefba39cb6393f09b27f1ebc1a30",
      "parents": [
        "7b0d44fa49b1dcfdcf4897f12ddd12ddeab1a9d7"
      ],
      "author": {
        "name": "David Rientjes",
        "email": "rientjes@google.com",
        "time": "Mon Oct 31 17:07:15 2011 -0700"
      },
      "committer": {
        "name": "Linus Torvalds",
        "email": "torvalds@linux-foundation.org",
        "time": "Mon Oct 31 17:30:45 2011 -0700"
      },
      "message": "oom: remove oom_disable_count\n\nThis removes mm-\u003eoom_disable_count entirely since it\u0027s unnecessary and\ncurrently buggy.  The counter was intended to be per-process but it\u0027s\ncurrently decremented in the exit path for each thread that exits, causing\nit to underflow.\n\nThe count was originally intended to prevent oom killing threads that\nshare memory with threads that cannot be killed since it doesn\u0027t lead to\nfuture memory freeing.  The counter could be fixed to represent all\nthreads sharing the same mm, but it\u0027s better to remove the count since:\n\n - it is possible that the OOM_DISABLE thread sharing memory with the\n   victim is waiting on that thread to exit and will actually cause\n   future memory freeing, and\n\n - there is no guarantee that a thread is disabled from oom killing just\n   because another thread sharing its mm is oom disabled.\n\nSigned-off-by: David Rientjes \u003crientjes@google.com\u003e\nReported-by: Oleg Nesterov \u003coleg@redhat.com\u003e\nReviewed-by: Oleg Nesterov \u003coleg@redhat.com\u003e\nCc: Ying Han \u003cyinghan@google.com\u003e\nCc: KOSAKI Motohiro \u003ckosaki.motohiro@jp.fujitsu.com\u003e\nSigned-off-by: Andrew Morton \u003cakpm@linux-foundation.org\u003e\nSigned-off-by: Linus Torvalds \u003ctorvalds@linux-foundation.org\u003e\n"
    },
    {
      "commit": "7b0d44fa49b1dcfdcf4897f12ddd12ddeab1a9d7",
      "tree": "c840608f5266e4ba783f4c4405efe89b69ae5754",
      "parents": [
        "f660daac474c6f7c2d710100e29b3276a6f4db0a"
      ],
      "author": {
        "name": "David Rientjes",
        "email": "rientjes@google.com",
        "time": "Mon Oct 31 17:07:11 2011 -0700"
      },
      "committer": {
        "name": "Linus Torvalds",
        "email": "torvalds@linux-foundation.org",
        "time": "Mon Oct 31 17:30:45 2011 -0700"
      },
      "message": "oom: avoid killing kthreads if they assume the oom killed thread\u0027s mm\n\nAfter selecting a task to kill, the oom killer iterates all processes and\nkills all other threads that share the same mm_struct in different thread\ngroups.  It would not otherwise be helpful to kill a thread if its memory\nwould not be subsequently freed.\n\nA kernel thread, however, may assume a user thread\u0027s mm by using\nuse_mm().  This is only temporary and should not result in sending a\nSIGKILL to that kthread.\n\nThis patch ensures that only user threads and not kthreads are sent a\nSIGKILL if they share the same mm_struct as the oom killed task.\n\nSigned-off-by: David Rientjes \u003crientjes@google.com\u003e\nReviewed-by: Michal Hocko \u003cmhocko@suse.cz\u003e\nReviewed-by: KOSAKI Motohiro \u003ckosaki.motohiro@jp.fujitsu.com\u003e\nSigned-off-by: Andrew Morton \u003cakpm@linux-foundation.org\u003e\nSigned-off-by: Linus Torvalds \u003ctorvalds@linux-foundation.org\u003e\n"
    },
    {
      "commit": "f660daac474c6f7c2d710100e29b3276a6f4db0a",
      "tree": "ad142e254a7b804cb158f2c64e9e6a77e8e4388c",
      "parents": [
        "d08c429b06d21bd2add88aea2cd1996f1b9b3bda"
      ],
      "author": {
        "name": "David Rientjes",
        "email": "rientjes@google.com",
        "time": "Mon Oct 31 17:07:07 2011 -0700"
      },
      "committer": {
        "name": "Linus Torvalds",
        "email": "torvalds@linux-foundation.org",
        "time": "Mon Oct 31 17:30:45 2011 -0700"
      },
      "message": "oom: thaw threads if oom killed thread is frozen before deferring\n\nIf a thread has been oom killed and is frozen, thaw it before returning to\nthe page allocator.  Otherwise, it can stay frozen indefinitely and no\nmemory will be freed.\n\nSigned-off-by: David Rientjes \u003crientjes@google.com\u003e\nReported-by: Konstantin Khlebnikov \u003ckhlebnikov@openvz.org\u003e\nCc: Oleg Nesterov \u003coleg@redhat.com\u003e\nCc: KOSAKI Motohiro \u003ckosaki.motohiro@jp.fujitsu.com\u003e\nCc: KAMEZAWA Hiroyuki \u003ckamezawa.hiroyu@jp.fujitsu.com\u003e\nCc: \"Rafael J. Wysocki\" \u003crjw@sisk.pl\u003e\nAcked-by: Michal Hocko \u003cmhocko@suse.cz\u003e\nSigned-off-by: Andrew Morton \u003cakpm@linux-foundation.org\u003e\nSigned-off-by: Linus Torvalds \u003ctorvalds@linux-foundation.org\u003e\n"
    },
    {
      "commit": "d08c429b06d21bd2add88aea2cd1996f1b9b3bda",
      "tree": "7a7f0002e4747ebc70978dcda565a09a943dc992",
      "parents": [
        "3da367c3e5fca71d4e778fa565d9b098d5518f4a"
      ],
      "author": {
        "name": "Johannes Weiner",
        "email": "jweiner@redhat.com",
        "time": "Mon Oct 31 17:07:05 2011 -0700"
      },
      "committer": {
        "name": "Linus Torvalds",
        "email": "torvalds@linux-foundation.org",
        "time": "Mon Oct 31 17:30:45 2011 -0700"
      },
      "message": "mm/page-writeback.c: document bdi_min_ratio\n\nLooks like someone got distracted after adding the comment characters.\n\nSigned-off-by: Johannes Weiner \u003cjweiner@redhat.com\u003e\nAcked-by: Peter Zijlstra \u003cpeterz@infradead.org\u003e\nCc: Wu Fengguang \u003cfengguang.wu@intel.com\u003e\nSigned-off-by: Andrew Morton \u003cakpm@linux-foundation.org\u003e\nSigned-off-by: Linus Torvalds \u003ctorvalds@linux-foundation.org\u003e\n"
    },
    {
      "commit": "3da367c3e5fca71d4e778fa565d9b098d5518f4a",
      "tree": "915dff1989bdffaed157b56f724631b5d8f2d328",
      "parents": [
        "3fa36acbced23c563345de3179dfe1775f15be5e"
      ],
      "author": {
        "name": "Shaohua Li",
        "email": "shaohua.li@intel.com",
        "time": "Mon Oct 31 17:07:03 2011 -0700"
      },
      "committer": {
        "name": "Linus Torvalds",
        "email": "torvalds@linux-foundation.org",
        "time": "Mon Oct 31 17:30:45 2011 -0700"
      },
      "message": "vmscan: add block plug for page reclaim\n\nper-task block plug can reduce block queue lock contention and increase\nrequest merge.  Currently page reclaim doesn\u0027t support it.  I originally\nthought page reclaim doesn\u0027t need it, because kswapd thread count is\nlimited and file cache write is done at flusher mostly.\n\nWhen I test a workload with heavy swap in a 4-node machine, each CPU is\ndoing direct page reclaim and swap.  This causes block queue lock\ncontention.  In my test, without below patch, the CPU utilization is about\n2% ~ 7%.  With the patch, the CPU utilization is about 1% ~ 3%.  Disk\nthroughput isn\u0027t changed.  This should improve normal kswapd write and\nfile cache write too (increase request merge for example), but might not\nbe so obvious as I explain above.\n\nSigned-off-by: Shaohua Li \u003cshaohua.li@intel.com\u003e\nCc: Jens Axboe \u003caxboe@kernel.dk\u003e\nCc: Minchan Kim \u003cminchan.kim@gmail.com\u003e\nSigned-off-by: Andrew Morton \u003cakpm@linux-foundation.org\u003e\nSigned-off-by: Linus Torvalds \u003ctorvalds@linux-foundation.org\u003e\n"
    },
    {
      "commit": "0dabec93de633a87adfbbe1d800a4c56cd19d73b",
      "tree": "51850bc562f8f95d284dbd7baeecfaefd573fccf",
      "parents": [
        "f80c0673610e36ae29d63e3297175e22f70dde5f"
      ],
      "author": {
        "name": "Minchan Kim",
        "email": "minchan.kim@gmail.com",
        "time": "Mon Oct 31 17:06:57 2011 -0700"
      },
      "committer": {
        "name": "Linus Torvalds",
        "email": "torvalds@linux-foundation.org",
        "time": "Mon Oct 31 17:30:45 2011 -0700"
      },
      "message": "mm: migration: clean up unmap_and_move()\n\nunmap_and_move() is one a big messy function.  Clean it up.\n\nSigned-off-by: Minchan Kim \u003cminchan.kim@gmail.com\u003e\nReviewed-by: KOSAKI Motohiro \u003ckosaki.motohiro@jp.fujitsu.com\u003e\nCc: Johannes Weiner \u003channes@cmpxchg.org\u003e\nCc: KAMEZAWA Hiroyuki \u003ckamezawa.hiroyu@jp.fujitsu.com\u003e\nCc: Mel Gorman \u003cmgorman@suse.de\u003e\nCc: Rik van Riel \u003criel@redhat.com\u003e\nCc: Michal Hocko \u003cmhocko@suse.cz\u003e\nCc: Andrea Arcangeli \u003caarcange@redhat.com\u003e\nSigned-off-by: Andrew Morton \u003cakpm@linux-foundation.org\u003e\nSigned-off-by: Linus Torvalds \u003ctorvalds@linux-foundation.org\u003e\n"
    },
    {
      "commit": "f80c0673610e36ae29d63e3297175e22f70dde5f",
      "tree": "0a6aab3b637fa75961224e9261eb544156672c34",
      "parents": [
        "39deaf8585152f1a35c1676d3d7dc6ae0fb65967"
      ],
      "author": {
        "name": "Minchan Kim",
        "email": "minchan.kim@gmail.com",
        "time": "Mon Oct 31 17:06:55 2011 -0700"
      },
      "committer": {
        "name": "Linus Torvalds",
        "email": "torvalds@linux-foundation.org",
        "time": "Mon Oct 31 17:30:44 2011 -0700"
      },
      "message": "mm: zone_reclaim: make isolate_lru_page() filter-aware\n\nIn __zone_reclaim case, we don\u0027t want to shrink mapped page.  Nonetheless,\nwe have isolated mapped page and re-add it into LRU\u0027s head.  It\u0027s\nunnecessary CPU overhead and makes LRU churning.\n\nOf course, when we isolate the page, the page might be mapped but when we\ntry to migrate the page, the page would be not mapped.  So it could be\nmigrated.  But race is rare and although it happens, it\u0027s no big deal.\n\nSigned-off-by: Minchan Kim \u003cminchan.kim@gmail.com\u003e\nAcked-by: Johannes Weiner \u003channes@cmpxchg.org\u003e\nReviewed-by: KAMEZAWA Hiroyuki \u003ckamezawa.hiroyu@jp.fujitsu.com\u003e\nReviewed-by: KOSAKI Motohiro \u003ckosaki.motohiro@jp.fujitsu.com\u003e\nReviewed-by: Michal Hocko \u003cmhocko@suse.cz\u003e\nCc: Mel Gorman \u003cmgorman@suse.de\u003e\nCc: Rik van Riel \u003criel@redhat.com\u003e\nCc: Andrea Arcangeli \u003caarcange@redhat.com\u003e\nSigned-off-by: Andrew Morton \u003cakpm@linux-foundation.org\u003e\nSigned-off-by: Linus Torvalds \u003ctorvalds@linux-foundation.org\u003e\n"
    },
    {
      "commit": "39deaf8585152f1a35c1676d3d7dc6ae0fb65967",
      "tree": "a7509ea61c2f1028ed7ef961aa1abd16d50905f9",
      "parents": [
        "4356f21d09283dc6d39a6f7287a65ddab61e2808"
      ],
      "author": {
        "name": "Minchan Kim",
        "email": "minchan.kim@gmail.com",
        "time": "Mon Oct 31 17:06:51 2011 -0700"
      },
      "committer": {
        "name": "Linus Torvalds",
        "email": "torvalds@linux-foundation.org",
        "time": "Mon Oct 31 17:30:44 2011 -0700"
      },
      "message": "mm: compaction: make isolate_lru_page() filter-aware\n\nIn async mode, compaction doesn\u0027t migrate dirty or writeback pages.  So,\nit\u0027s meaningless to pick the page and re-add it to lru list.\n\nOf course, when we isolate the page in compaction, the page might be dirty\nor writeback but when we try to migrate the page, the page would be not\ndirty, writeback.  So it could be migrated.  But it\u0027s very unlikely as\nisolate and migration cycle is much faster than writeout.\n\nSo, this patch helps cpu overhead and prevent unnecessary LRU churning.\n\nSigned-off-by: Minchan Kim \u003cminchan.kim@gmail.com\u003e\nAcked-by: Johannes Weiner \u003channes@cmpxchg.org\u003e\nReviewed-by: KAMEZAWA Hiroyuki \u003ckamezawa.hiroyu@jp.fujitsu.com\u003e\nReviewed-by: KOSAKI Motohiro \u003ckosaki.motohiro@jp.fujitsu.com\u003e\nAcked-by: Mel Gorman \u003cmgorman@suse.de\u003e\nAcked-by: Rik van Riel \u003criel@redhat.com\u003e\nReviewed-by: Michal Hocko \u003cmhocko@suse.cz\u003e\nCc: Andrea Arcangeli \u003caarcange@redhat.com\u003e\nSigned-off-by: Andrew Morton \u003cakpm@linux-foundation.org\u003e\nSigned-off-by: Linus Torvalds \u003ctorvalds@linux-foundation.org\u003e\n"
    },
    {
      "commit": "4356f21d09283dc6d39a6f7287a65ddab61e2808",
      "tree": "34822a1662ea83291455834556a4fb5bf98ecd72",
      "parents": [
        "b9e84ac1536d35aee03b2601f19694949f0bd506"
      ],
      "author": {
        "name": "Minchan Kim",
        "email": "minchan.kim@gmail.com",
        "time": "Mon Oct 31 17:06:47 2011 -0700"
      },
      "committer": {
        "name": "Linus Torvalds",
        "email": "torvalds@linux-foundation.org",
        "time": "Mon Oct 31 17:30:44 2011 -0700"
      },
      "message": "mm: change isolate mode from #define to bitwise type\n\nChange ISOLATE_XXX macro with bitwise isolate_mode_t type.  Normally,\nmacro isn\u0027t recommended as it\u0027s type-unsafe and making debugging harder as\nsymbol cannot be passed throught to the debugger.\n\nQuote from Johannes\n\" Hmm, it would probably be cleaner to fully convert the isolation mode\ninto independent flags.  INACTIVE, ACTIVE, BOTH is currently a\ntri-state among flags, which is a bit ugly.\"\n\nThis patch moves isolate mode from swap.h to mmzone.h by memcontrol.h\n\nSigned-off-by: Minchan Kim \u003cminchan.kim@gmail.com\u003e\nCc: Johannes Weiner \u003channes@cmpxchg.org\u003e\nCc: KAMEZAWA Hiroyuki \u003ckamezawa.hiroyu@jp.fujitsu.com\u003e\nCc: KOSAKI Motohiro \u003ckosaki.motohiro@jp.fujitsu.com\u003e\nCc: Mel Gorman \u003cmgorman@suse.de\u003e\nCc: Rik van Riel \u003criel@redhat.com\u003e\nCc: Michal Hocko \u003cmhocko@suse.cz\u003e\nCc: Andrea Arcangeli \u003caarcange@redhat.com\u003e\nSigned-off-by: Andrew Morton \u003cakpm@linux-foundation.org\u003e\nSigned-off-by: Linus Torvalds \u003ctorvalds@linux-foundation.org\u003e\n"
    },
    {
      "commit": "b9e84ac1536d35aee03b2601f19694949f0bd506",
      "tree": "822eb1802818954248efc5cf67dc9a8a0ace5908",
      "parents": [
        "fcf634098c00dd9cd247447368495f0b79be12d1"
      ],
      "author": {
        "name": "Minchan Kim",
        "email": "minchan.kim@gmail.com",
        "time": "Mon Oct 31 17:06:44 2011 -0700"
      },
      "committer": {
        "name": "Linus Torvalds",
        "email": "torvalds@linux-foundation.org",
        "time": "Mon Oct 31 17:30:44 2011 -0700"
      },
      "message": "mm: compaction: trivial clean up in acct_isolated()\n\nacct_isolated of compaction uses page_lru_base_type which returns only\nbase type of LRU list so it never returns LRU_ACTIVE_ANON or\nLRU_ACTIVE_FILE.  In addtion, cc-\u003enr_[anon|file] is used in only\nacct_isolated so it doesn\u0027t have fields in conpact_control.\n\nThis patch removes fields from compact_control and makes clear function of\nacct_issolated which counts the number of anon|file pages isolated.\n\nSigned-off-by: Minchan Kim \u003cminchan.kim@gmail.com\u003e\nAcked-by: Johannes Weiner \u003channes@cmpxchg.org\u003e\nReviewed-by: KAMEZAWA Hiroyuki \u003ckamezawa.hiroyu@jp.fujitsu.com\u003e\nReviewed-by: KOSAKI Motohiro \u003ckosaki.motohiro@jp.fujitsu.com\u003e\nAcked-by: Mel Gorman \u003cmgorman@suse.de\u003e\nAcked-by: Rik van Riel \u003criel@redhat.com\u003e\nReviewed-by: Michal Hocko \u003cmhocko@suse.cz\u003e\nCc: Andrea Arcangeli \u003caarcange@redhat.com\u003e\nSigned-off-by: Andrew Morton \u003cakpm@linux-foundation.org\u003e\nSigned-off-by: Linus Torvalds \u003ctorvalds@linux-foundation.org\u003e\n"
    },
    {
      "commit": "fcf634098c00dd9cd247447368495f0b79be12d1",
      "tree": "77fc98cd461bd52ba3b14e833d54a115ffbbd7bc",
      "parents": [
        "32ea845d5bafc37b7406bea1aee3005407cb0900"
      ],
      "author": {
        "name": "Christopher Yeoh",
        "email": "cyeoh@au1.ibm.com",
        "time": "Mon Oct 31 17:06:39 2011 -0700"
      },
      "committer": {
        "name": "Linus Torvalds",
        "email": "torvalds@linux-foundation.org",
        "time": "Mon Oct 31 17:30:44 2011 -0700"
      },
      "message": "Cross Memory Attach\n\nThe basic idea behind cross memory attach is to allow MPI programs doing\nintra-node communication to do a single copy of the message rather than a\ndouble copy of the message via shared memory.\n\nThe following patch attempts to achieve this by allowing a destination\nprocess, given an address and size from a source process, to copy memory\ndirectly from the source process into its own address space via a system\ncall.  There is also a symmetrical ability to copy from the current\nprocess\u0027s address space into a destination process\u0027s address space.\n\n- Use of /proc/pid/mem has been considered, but there are issues with\n  using it:\n  - Does not allow for specifying iovecs for both src and dest, assuming\n    preadv or pwritev was implemented either the area read from or\n  written to would need to be contiguous.\n  - Currently mem_read allows only processes who are currently\n  ptrace\u0027ing the target and are still able to ptrace the target to read\n  from the target. This check could possibly be moved to the open call,\n  but its not clear exactly what race this restriction is stopping\n  (reason  appears to have been lost)\n  - Having to send the fd of /proc/self/mem via SCM_RIGHTS on unix\n  domain socket is a bit ugly from a userspace point of view,\n  especially when you may have hundreds if not (eventually) thousands\n  of processes  that all need to do this with each other\n  - Doesn\u0027t allow for some future use of the interface we would like to\n  consider adding in the future (see below)\n  - Interestingly reading from /proc/pid/mem currently actually\n  involves two copies! (But this could be fixed pretty easily)\n\nAs mentioned previously use of vmsplice instead was considered, but has\nproblems.  Since you need the reader and writer working co-operatively if\nthe pipe is not drained then you block.  Which requires some wrapping to\ndo non blocking on the send side or polling on the receive.  
In all to all\ncommunication it requires ordering otherwise you can deadlock.  And in the\nexample of many MPI tasks writing to one MPI task vmsplice serialises the\ncopying.\n\nThere are some cases of MPI collectives where even a single copy interface\ndoes not get us the performance gain we could.  For example in an\nMPI_Reduce rather than copy the data from the source we would like to\ninstead use it directly in a mathops (say the reduce is doing a sum) as\nthis would save us doing a copy.  We don\u0027t need to keep a copy of the data\nfrom the source.  I haven\u0027t implemented this, but I think this interface\ncould in the future do all this through the use of the flags - eg could\nspecify the math operation and type and the kernel rather than just\ncopying the data would apply the specified operation between the source\nand destination and store it in the destination.\n\nAlthough we don\u0027t have a \"second user\" of the interface (though I\u0027ve had\nsome nibbles from people who may be interested in using it for intra\nprocess messaging which is not MPI).  This interface is something which\nhardware vendors are already doing for their custom drivers to implement\nfast local communication.  And so in addition to this being useful for\nOpenMPI it would mean the driver maintainers don\u0027t have to fix things up\nwhen the mm changes.\n\nThere was some discussion about how much faster a true zero copy would\ngo. Here\u0027s a link back to the email with some testing I did on that:\n\nhttp://marc.info/?l\u003dlinux-mm\u0026m\u003d130105930902915\u0026w\u003d2\n\nThere is a basic man page for the proposed interface here:\n\nhttp://ozlabs.org/~cyeoh/cma/process_vm_readv.txt\n\nThis has been implemented for x86 and powerpc, other architecture should\nmainly (I think) just need to add syscall numbers for the process_vm_readv\nand process_vm_writev. 
There are 32 bit compatibility versions for\n64-bit kernels.\n\nFor arch maintainers there are some simple tests to be able to quickly\nverify that the syscalls are working correctly here:\n\nhttp://ozlabs.org/~cyeoh/cma/cma-test-20110718.tgz\n\nSigned-off-by: Chris Yeoh \u003cyeohc@au1.ibm.com\u003e\nCc: Ingo Molnar \u003cmingo@elte.hu\u003e\nCc: \"H. Peter Anvin\" \u003chpa@zytor.com\u003e\nCc: Thomas Gleixner \u003ctglx@linutronix.de\u003e\nCc: Arnd Bergmann \u003carnd@arndb.de\u003e\nCc: Paul Mackerras \u003cpaulus@samba.org\u003e\nCc: Benjamin Herrenschmidt \u003cbenh@kernel.crashing.org\u003e\nCc: David Howells \u003cdhowells@redhat.com\u003e\nCc: James Morris \u003cjmorris@namei.org\u003e\nCc: \u003clinux-man@vger.kernel.org\u003e\nCc: \u003clinux-arch@vger.kernel.org\u003e\nSigned-off-by: Andrew Morton \u003cakpm@linux-foundation.org\u003e\nSigned-off-by: Linus Torvalds \u003ctorvalds@linux-foundation.org\u003e\n"
    },
    {
      "commit": "7c77509c542927ee2a3c8812fad84957e51bf67d",
      "tree": "df2d80be6ddf42b529ea4814b20010cbc036ea55",
      "parents": [
        "b95f1b31b75588306e32b2afd32166cad48f670b"
      ],
      "author": {
        "name": "Paul Gortmaker",
        "email": "paul.gortmaker@windriver.com",
        "time": "Sun Oct 16 02:03:46 2011 -0400"
      },
      "committer": {
        "name": "Paul Gortmaker",
        "email": "paul.gortmaker@windriver.com",
        "time": "Mon Oct 31 09:20:12 2011 -0400"
      },
      "message": "mm: fix implicit stat.h usage in dmapool.c\n\nThe removal of the implicitly everywhere module.h and its child includes\nwill reveal this implicit stat.h usage:\n\nmm/dmapool.c:108: error: ‘S_IRUGO’ undeclared here (not in a function)\n\nSigned-off-by: Paul Gortmaker \u003cpaul.gortmaker@windriver.com\u003e\n"
    },
    {
      "commit": "b95f1b31b75588306e32b2afd32166cad48f670b",
      "tree": "b5496144e41b117cfe5ae70b145b5351709ec4d0",
      "parents": [
        "b9e15bafdf1aa20791cdefdcbf1ccf7d7aa03aaa"
      ],
      "author": {
        "name": "Paul Gortmaker",
        "email": "paul.gortmaker@windriver.com",
        "time": "Sun Oct 16 02:01:52 2011 -0400"
      },
      "committer": {
        "name": "Paul Gortmaker",
        "email": "paul.gortmaker@windriver.com",
        "time": "Mon Oct 31 09:20:12 2011 -0400"
      },
      "message": "mm: Map most files to use export.h instead of module.h\n\nThe files changed within are only using the EXPORT_SYMBOL\nmacro variants.  They are not using core modular infrastructure\nand hence don\u0027t need module.h but only the export.h header.\n\nSigned-off-by: Paul Gortmaker \u003cpaul.gortmaker@windriver.com\u003e\n"
    },
    {
      "commit": "b9e15bafdf1aa20791cdefdcbf1ccf7d7aa03aaa",
      "tree": "459af0df6234a50b43b87078fd252941e231cbc3",
      "parents": [
        "e25934a51772f47edd94d7b7d08b0e167769639c"
      ],
      "author": {
        "name": "Paul Gortmaker",
        "email": "paul.gortmaker@windriver.com",
        "time": "Thu May 26 16:00:52 2011 -0400"
      },
      "committer": {
        "name": "Paul Gortmaker",
        "email": "paul.gortmaker@windriver.com",
        "time": "Mon Oct 31 09:20:12 2011 -0400"
      },
      "message": "mm: Add export.h for EXPORT_SYMBOL to active symbol exporters\n\nThese files were getting \u003clinux/module.h\u003e via an implicit include\npath, but we want to crush those out of existence since they cost\ntime during compiles of processing thousands of lines of headers\nfor no reason.  Give them the lightweight header that just contains\nthe EXPORT_SYMBOL infrastructure.\n\nSigned-off-by: Paul Gortmaker \u003cpaul.gortmaker@windriver.com\u003e\n"
    },
    {
      "commit": "e25934a51772f47edd94d7b7d08b0e167769639c",
      "tree": "c72dd84e95178d1e5e5223bf5a736e75430f1305",
      "parents": [
        "9a418455134f5dc23f124d2818b2e8e1cea997a1"
      ],
      "author": {
        "name": "Paul Gortmaker",
        "email": "paul.gortmaker@windriver.com",
        "time": "Thu May 26 15:58:15 2011 -0400"
      },
      "committer": {
        "name": "Paul Gortmaker",
        "email": "paul.gortmaker@windriver.com",
        "time": "Mon Oct 31 09:20:11 2011 -0400"
      },
      "message": "mm: delete various needless include \u003clinux/module.h\u003e\n\nThere is nothing modular in these files, and no reason to drag\nin all the 357 headers that module.h brings with it, since\nit just slows down compiles.\n\nSigned-off-by: Paul Gortmaker \u003cpaul.gortmaker@windriver.com\u003e\n"
    },
    {
      "commit": "0e175a1835ffc979e55787774e58ec79e41957d7",
      "tree": "6ec4b65a8de4e9d1c12d26a1079079ed81d79450",
      "parents": [
        "ad4e38dd6a33bb3a4882c487d7abe621e583b982"
      ],
      "author": {
        "name": "Curt Wohlgemuth",
        "email": "curtw@google.com",
        "time": "Fri Oct 07 21:54:10 2011 -0600"
      },
      "committer": {
        "name": "Wu Fengguang",
        "email": "fengguang.wu@intel.com",
        "time": "Mon Oct 31 00:33:36 2011 +0800"
      },
      "message": "writeback: Add a \u0027reason\u0027 to wb_writeback_work\n\nThis creates a new \u0027reason\u0027 field in a wb_writeback_work\nstructure, which unambiguously identifies who initiates\nwriteback activity.  A \u0027wb_reason\u0027 enumeration has been\nadded to writeback.h, to enumerate the possible reasons.\n\nThe \u0027writeback_work_class\u0027 and tracepoint event class and\n\u0027writeback_queue_io\u0027 tracepoints are updated to include the\nsymbolic \u0027reason\u0027 in all trace events.\n\nAnd the \u0027writeback_inodes_sbXXX\u0027 family of routines has had\na wb_stats parameter added to them, so callers can specify\nwhy writeback is being started.\n\nAcked-by: Jan Kara \u003cjack@suse.cz\u003e\nSigned-off-by: Curt Wohlgemuth \u003ccurtw@google.com\u003e\nSigned-off-by: Wu Fengguang \u003cfengguang.wu@intel.com\u003e\n"
    },
    {
      "commit": "ece13ac31bbe492d940ba0bc4ade2ae1521f46a5",
      "tree": "2bfddab0f62999bf595a72913b79cabafbad0e40",
      "parents": [
        "b48c104d2211b0ac881a71f5f76a3816225f8111"
      ],
      "author": {
        "name": "Wu Fengguang",
        "email": "fengguang.wu@intel.com",
        "time": "Sun Aug 29 23:33:20 2010 -0600"
      },
      "committer": {
        "name": "Wu Fengguang",
        "email": "fengguang.wu@intel.com",
        "time": "Mon Oct 31 00:29:38 2011 +0800"
      },
      "message": "writeback: trace event balance_dirty_pages\n\nUseful for analyzing the dynamics of the throttling algorithms and\ndebugging user reported problems.\n\nSigned-off-by: Wu Fengguang \u003cfengguang.wu@intel.com\u003e\n"
    },
    {
      "commit": "b48c104d2211b0ac881a71f5f76a3816225f8111",
      "tree": "b947f3fd4c8b49ee12d516f3eb520209c577387b",
      "parents": [
        "50657fc4dfa7e345a1008f7c1de0bf930bbecca9"
      ],
      "author": {
        "name": "Wu Fengguang",
        "email": "fengguang.wu@intel.com",
        "time": "Wed Mar 02 17:22:49 2011 -0600"
      },
      "committer": {
        "name": "Wu Fengguang",
        "email": "fengguang.wu@intel.com",
        "time": "Mon Oct 31 00:29:21 2011 +0800"
      },
      "message": "writeback: trace event bdi_dirty_ratelimit\n\nIt helps understand how various throttle bandwidths are updated.\n\nSigned-off-by: Wu Fengguang \u003cfengguang.wu@intel.com\u003e\n"
    },
    {
      "commit": "f362f98e7c445643d27c610bb7a86b79727b592e",
      "tree": "399d9ebccdfbdfe9690ab1403a001d6f08e54b41",
      "parents": [
        "f793f2961170c0b49c1650e69e7825484159ce62",
        "f3c7691e8d30d88899b514675c7c86d19057b5fd"
      ],
      "author": {
        "name": "Linus Torvalds",
        "email": "torvalds@linux-foundation.org",
        "time": "Fri Oct 28 10:49:34 2011 -0700"
      },
      "committer": {
        "name": "Linus Torvalds",
        "email": "torvalds@linux-foundation.org",
        "time": "Fri Oct 28 10:49:34 2011 -0700"
      },
      "message": "Merge branch \u0027for-next\u0027 of git://git.kernel.org/pub/scm/linux/kernel/git/hch/vfs-queue\n\n* \u0027for-next\u0027 of git://git.kernel.org/pub/scm/linux/kernel/git/hch/vfs-queue: (21 commits)\n  leases: fix write-open/read-lease race\n  nfs: drop unnecessary locking in llseek\n  ext4: replace cut\u0027n\u0027pasted llseek code with generic_file_llseek_size\n  vfs: add generic_file_llseek_size\n  vfs: do (nearly) lockless generic_file_llseek\n  direct-io: merge direct_io_walker into __blockdev_direct_IO\n  direct-io: inline the complete submission path\n  direct-io: separate map_bh from dio\n  direct-io: use a slab cache for struct dio\n  direct-io: rearrange fields in dio/dio_submit to avoid holes\n  direct-io: fix a wrong comment\n  direct-io: separate fields only used in the submission path from struct dio\n  vfs: fix spinning prevention in prune_icache_sb\n  vfs: add a comment to inode_permission()\n  vfs: pass all mask flags check_acl and posix_acl_permission\n  vfs: add hex format for MAY_* flag values\n  vfs: indicate that the permission functions take all the MAY_* flags\n  compat: sync compat_stats with statfs.\n  vfs: add \"device\" tag to /proc/self/mountstats\n  cleanup: vfs: small comment fix for block_invalidatepage\n  ...\n\nFix up trivial conflict in fs/gfs2/file.c (llseek changes)\n"
    },
    {
      "commit": "39be79c16f2b8eb07dd0d4e965cddfe39cc0534a",
      "tree": "821611221295d47c671ec72e1fb558efcedff03b",
      "parents": [
        "c3b92c8787367a8bb53d57d9789b558f1295cc96"
      ],
      "author": {
        "name": "Jeff Layton",
        "email": "jlayton@redhat.com",
        "time": "Thu Oct 27 23:53:08 2011 +0200"
      },
      "committer": {
        "name": "Christoph Hellwig",
        "email": "hch@serles.lst.de",
        "time": "Fri Oct 28 13:55:08 2011 +0200"
      },
      "message": "vfs: iov_iter: have iov_iter_advance decrement nr_segs appropriately\n\nCurrently, when you call iov_iter_advance, then the pointer to the iovec\narray can be incremented, but it does not decrement the nr_segs value in\nthe iov_iter struct. The result is a iov_iter struct with a nr_segs\nvalue that goes beyond the end of the array.\n\nWhile I\u0027m not aware of anything that\u0027s specifically broken by this, it\nseems odd and a bit dangerous not to decrement that value. If someone\nwere to trust the nr_segs value to be correct, then they could end up\nwalking off the end of the array.\n\nChanging this might also provide some micro-optimization when dealing\nwith the last iovec in an array. Many of the other routines that deal\nwith iov_iter have optimized codepaths when nr_segs \u003d\u003d 1.\n\nCc: Nick Piggin \u003cnpiggin@suse.de\u003e\nSigned-off-by: Jeff Layton \u003cjlayton@redhat.com\u003e\nSigned-off-by: Christoph Hellwig \u003chch@lst.de\u003e\n"
    },
    {
      "commit": "e182a345d40deba7c3165a2857812bf403818319",
      "tree": "01cace799491cbb6bea19c10de971fd3a84d9868",
      "parents": [
        "3cfef9524677a4ecb392d6fbffe6ebce6302f1d4",
        "fe353178653b15add8626f5474842601be160281",
        "dcc3be6a548a1e51adaab3be6d9dfbb68bc0e3a0"
      ],
      "author": {
        "name": "Pekka Enberg",
        "email": "penberg@kernel.org",
        "time": "Wed Oct 26 18:09:12 2011 +0300"
      },
      "committer": {
        "name": "Pekka Enberg",
        "email": "penberg@kernel.org",
        "time": "Wed Oct 26 18:09:12 2011 +0300"
      },
      "message": "Merge branches \u0027slab/next\u0027 and \u0027slub/partial\u0027 into slab/for-linus\n"
    },
    {
      "commit": "59e52534172d845ebffb0d7e85fc56fb7b857051",
      "tree": "49552e03f1bdb413cd8b5f7542e91770688d7047",
      "parents": [
        "73692d9bb58ecc2fa73f4b2bfcf6eadaa6d49a26",
        "0d89e54c8249645404283436d952afc261a04e1e"
      ],
      "author": {
        "name": "Linus Torvalds",
        "email": "torvalds@linux-foundation.org",
        "time": "Tue Oct 25 12:11:02 2011 +0200"
      },
      "committer": {
        "name": "Linus Torvalds",
        "email": "torvalds@linux-foundation.org",
        "time": "Tue Oct 25 12:11:02 2011 +0200"
      },
      "message": "Merge branch \u0027for-linus\u0027 of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial\n\n* \u0027for-linus\u0027 of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial: (59 commits)\n  MAINTAINERS: linux-m32r is moderated for non-subscribers\n  linux@lists.openrisc.net is moderated for non-subscribers\n  Drop default from \"DM365 codec select\" choice\n  parisc: Kconfig: cleanup Kernel page size default\n  Kconfig: remove redundant CONFIG_ prefix on two symbols\n  cris: remove arch/cris/arch-v32/lib/nand_init.S\n  microblaze: add missing CONFIG_ prefixes\n  h8300: drop puzzling Kconfig dependencies\n  MAINTAINERS: microblaze-uclinux@itee.uq.edu.au is moderated for non-subscribers\n  tty: drop superfluous dependency in Kconfig\n  ARM: mxc: fix Kconfig typo \u0027i.MX51\u0027\n  Fix file references in Kconfig files\n  aic7xxx: fix Kconfig references to READMEs\n  Fix file references in drivers/ide/\n  thinkpad_acpi: Fix printk typo \u0027bluestooth\u0027\n  bcmring: drop commented out line in Kconfig\n  btmrvl_sdio: fix typo \u0027btmrvl_sdio_sd6888\u0027\n  doc: raw1394: Trivial typo fix\n  CIFS: Don\u0027t free volume_info-\u003eUNC until we are entirely done with it.\n  treewide: Correct spelling of successfully in comments\n  ...\n"
    },
    {
      "commit": "36b8d186e6cc8e32cb5227f5645a58e1bc0af190",
      "tree": "1000ad26e189e6ff2c53fb7eeff605f59c7ad94e",
      "parents": [
        "cd85b557414fe4cd44ea6608825e96612a5fe2b2",
        "c45ed235abf1b0b6666417e3c394f18717976acd"
      ],
      "author": {
        "name": "Linus Torvalds",
        "email": "torvalds@linux-foundation.org",
        "time": "Tue Oct 25 09:45:31 2011 +0200"
      },
      "committer": {
        "name": "Linus Torvalds",
        "email": "torvalds@linux-foundation.org",
        "time": "Tue Oct 25 09:45:31 2011 +0200"
      },
      "message": "Merge branch \u0027next\u0027 of git://selinuxproject.org/~jmorris/linux-security\n\n* \u0027next\u0027 of git://selinuxproject.org/~jmorris/linux-security: (95 commits)\n  TOMOYO: Fix incomplete read after seek.\n  Smack: allow to access /smack/access as normal user\n  TOMOYO: Fix unused kernel config option.\n  Smack: fix: invalid length set for the result of /smack/access\n  Smack: compilation fix\n  Smack: fix for /smack/access output, use string instead of byte\n  Smack: domain transition protections (v3)\n  Smack: Provide information for UDS getsockopt(SO_PEERCRED)\n  Smack: Clean up comments\n  Smack: Repair processing of fcntl\n  Smack: Rule list lookup performance\n  Smack: check permissions from user space (v2)\n  TOMOYO: Fix quota and garbage collector.\n  TOMOYO: Remove redundant tasklist_lock.\n  TOMOYO: Fix domain transition failure warning.\n  TOMOYO: Remove tomoyo_policy_memory_lock spinlock.\n  TOMOYO: Simplify garbage collector.\n  TOMOYO: Fix make namespacecheck warnings.\n  target: check hex2bin result\n  encrypted-keys: check hex2bin result\n  ...\n"
    },
    {
      "commit": "3bcfeaf93f44112053e1c36aa681d9efc1185ddc",
      "tree": "15206964bf4eb4892de4c8850c799def913971db",
      "parents": [
        "c9a929dde3913780b5c416f4bb9d9ed804f509ce"
      ],
      "author": {
        "name": "David Vrabel",
        "email": "david.vrabel@citrix.com",
        "time": "Thu Oct 20 21:24:30 2011 +0200"
      },
      "committer": {
        "name": "Jens Axboe",
        "email": "axboe@kernel.dk",
        "time": "Thu Oct 20 21:24:30 2011 +0200"
      },
      "message": "block: initialize the bounce pool if high memory may be added later\n\ninit_emergency_pool() does not create the page pool for bouncing block\nrequests if the current count of high pages is zero.  If high memory\nmay be added later (either via memory hotplug or a balloon driver in a\nvirtualized system) then a oops occurs if a request with a high page\nneed bouncing because the pool does not exist.\n\nSo, always create the pool if memory hotplug is enabled and change the\ntest so it\u0027s valid even if all high pages are currently in the balloon\n(the balloon drivers adjust totalhigh_pages but not max_pfn).\n\nSigned-off-by: David Vrabel \u003cdavid.vrabel@citrix.com\u003e\nSigned-off-by: Jens Axboe \u003caxboe@kernel.dk\u003e\n"
    },
    {
      "commit": "486cf46f3f9be5f2a966016c1a8fe01e32cde09e",
      "tree": "98a6e2376507dee6ea89a9b0073511c703d940dc",
      "parents": [
        "e4fcd69c9e4e273352e0f87cabd9648606da0c3e"
      ],
      "author": {
        "name": "Hugh Dickins",
        "email": "hughd@google.com",
        "time": "Wed Oct 19 12:50:35 2011 -0700"
      },
      "committer": {
        "name": "Linus Torvalds",
        "email": "torvalds@linux-foundation.org",
        "time": "Wed Oct 19 23:42:58 2011 -0700"
      },
      "message": "mm: fix race between mremap and removing migration entry\n\nI don\u0027t usually pay much attention to the stale \"? \" addresses in\nstack backtraces, but this lucky report from Pawel Sikora hints that\nmremap\u0027s move_ptes() has inadequate locking against page migration.\n\n 3.0 BUG_ON(!PageLocked(p)) in migration_entry_to_page():\n kernel BUG at include/linux/swapops.h:105!\n RIP: 0010:[\u003cffffffff81127b76\u003e]  [\u003cffffffff81127b76\u003e]\n                       migration_entry_wait+0x156/0x160\n  [\u003cffffffff811016a1\u003e] handle_pte_fault+0xae1/0xaf0\n  [\u003cffffffff810feee2\u003e] ? __pte_alloc+0x42/0x120\n  [\u003cffffffff8112c26b\u003e] ? do_huge_pmd_anonymous_page+0xab/0x310\n  [\u003cffffffff81102a31\u003e] handle_mm_fault+0x181/0x310\n  [\u003cffffffff81106097\u003e] ? vma_adjust+0x537/0x570\n  [\u003cffffffff81424bed\u003e] do_page_fault+0x11d/0x4e0\n  [\u003cffffffff81109a05\u003e] ? do_mremap+0x2d5/0x570\n  [\u003cffffffff81421d5f\u003e] page_fault+0x1f/0x30\n\nmremap\u0027s down_write of mmap_sem, together with i_mmap_mutex or lock,\nand pagetable locks, were good enough before page migration (with its\nrequirement that every migration entry be found) came in, and enough\nwhile migration always held mmap_sem; but not enough nowadays, when\nthere\u0027s memory hotremove and compaction.\n\nThe danger is that move_ptes() lets a migration entry dodge around\nbehind remove_migration_pte()\u0027s back, so it\u0027s in the old location when\nlooking at the new, then in the new location when looking at the old.\n\nEither mremap\u0027s move_ptes() must additionally take anon_vma lock(), or\nmigration\u0027s remove_migration_pte() must stop peeking for is_swap_entry()\nbefore it takes pagetable lock.\n\nConsensus chooses the latter: we prefer to add overhead to migration\nthan to mremapping, which gets used by JVMs and by exec stack setup.\n\nReported-and-tested-by: Paweł Sikora \u003cpluto@agmk.net\u003e\nSigned-off-by: 
Hugh Dickins \u003chughd@google.com\u003e\nAcked-by: Andrea Arcangeli \u003caarcange@redhat.com\u003e\nAcked-by: Mel Gorman \u003cmgorman@suse.de\u003e\nCc: stable@vger.kernel.org\nSigned-off-by: Linus Torvalds \u003ctorvalds@linux-foundation.org\u003e\n"
    },
    {
      "commit": "50657fc4dfa7e345a1008f7c1de0bf930bbecca9",
      "tree": "b1b1da53bc881b021635d9a43bad0047390485d2",
      "parents": [
        "b00949aa2df9970a912bf060bc95e99da356881c"
      ],
      "author": {
        "name": "Wu Fengguang",
        "email": "fengguang.wu@intel.com",
        "time": "Tue Oct 11 17:06:33 2011 -0600"
      },
      "committer": {
        "name": "Wu Fengguang",
        "email": "fengguang.wu@intel.com",
        "time": "Tue Oct 11 17:45:24 2011 +0800"
      },
      "message": "writeback: fix ppc compile warnings on do_div(long long, unsigned long)\n\nFix powerpc compile warnings\n\nmm/page-writeback.c: In function \u0027bdi_position_ratio\u0027:\nmm/page-writeback.c:622:3: warning: comparison of distinct pointer types lacks a cast [enabled by default]\npage-writeback.c:635:4: warning: comparison of distinct pointer types lacks a cast [enabled by default]\n\nAlso fix gcc \"uninitialized var\" warnings.\n\nReported-by: Stephen Rothwell \u003csfr@canb.auug.org.au\u003e\nSigned-off-by: Wu Fengguang \u003cfengguang.wu@intel.com\u003e\n"
    },
    {
      "commit": "8927f66c4ede9a18b4b58f7e6f9debca67065f6b",
      "tree": "f7c8490ab23a20cb86874ca8112f3dd1fc6002ae",
      "parents": [
        "57fc978cfb61ed40a7bbfe5a569359159ba31abd"
      ],
      "author": {
        "name": "Wu Fengguang",
        "email": "fengguang.wu@intel.com",
        "time": "Thu Aug 04 22:16:46 2011 -0600"
      },
      "committer": {
        "name": "Wu Fengguang",
        "email": "fengguang.wu@intel.com",
        "time": "Mon Oct 03 21:08:58 2011 +0800"
      },
      "message": "writeback: dirty position control - bdi reserve area\n\nKeep a minimal pool of dirty pages for each bdi, so that the disk IO\nqueues won\u0027t underrun. Also gently increase a small bdi_thresh to avoid\nit stuck in 0 for some light dirtied bdi.\n\nIt\u0027s particularly useful for JBOD and small memory system.\n\nIt may result in (pos_ratio \u003e 1) at the setpoint and push the dirty\npages high. This is more or less intended because the bdi is in the\ndanger of IO queue underflow.\n\nSigned-off-by: Wu Fengguang \u003cfengguang.wu@intel.com\u003e\n"
    },
    {
      "commit": "57fc978cfb61ed40a7bbfe5a569359159ba31abd",
      "tree": "870ffd08e0c1bb0dde55e4f1ed4dfa2bda8e3a80",
      "parents": [
        "c8462cc9de9e92264ec647903772f6036a99b286"
      ],
      "author": {
        "name": "Wu Fengguang",
        "email": "fengguang.wu@intel.com",
        "time": "Sat Jun 11 19:32:32 2011 -0600"
      },
      "committer": {
        "name": "Wu Fengguang",
        "email": "fengguang.wu@intel.com",
        "time": "Mon Oct 03 21:08:58 2011 +0800"
      },
      "message": "writeback: control dirty pause time\n\nThe dirty pause time shall ultimately be controlled by adjusting\nnr_dirtied_pause, since there is relationship\n\n\tpause \u003d pages_dirtied / task_ratelimit\n\nAssuming\n\n\tpages_dirtied ~\u003d nr_dirtied_pause\n\ttask_ratelimit ~\u003d dirty_ratelimit\n\nWe get\n\n\tnr_dirtied_pause ~\u003d dirty_ratelimit * desired_pause\n\nHere dirty_ratelimit is preferred over task_ratelimit because it\u0027s\nmore stable.\n\nIt\u0027s also important to limit possible large transitional errors:\n\n- bw is changing quickly\n- pages_dirtied \u003c\u003c nr_dirtied_pause on entering dirty exceeded area\n- pages_dirtied \u003e\u003e nr_dirtied_pause on btrfs (to be improved by a\n  separate fix, but still expect non-trivial errors)\n\nSo we end up using the above formula inside clamp_val().\n\nThe best test case for this code is to run 100 \"dd bs\u003d4M\" tasks on\nbtrfs and check its pause time distribution.\n\nSigned-off-by: Wu Fengguang \u003cfengguang.wu@intel.com\u003e\n"
    },
    {
      "commit": "c8462cc9de9e92264ec647903772f6036a99b286",
      "tree": "f442132f53651a04e67f3a119ead9f54be51a6cb",
      "parents": [
        "143dfe8611a63030ce0c79419dc362f7838be557"
      ],
      "author": {
        "name": "Wu Fengguang",
        "email": "fengguang.wu@intel.com",
        "time": "Sat Jun 11 19:21:43 2011 -0600"
      },
      "committer": {
        "name": "Wu Fengguang",
        "email": "fengguang.wu@intel.com",
        "time": "Mon Oct 03 21:08:57 2011 +0800"
      },
      "message": "writeback: limit max dirty pause time\n\nApply two policies to scale down the max pause time for\n\n1) small number of concurrent dirtiers\n2) small memory system (comparing to storage bandwidth)\n\nMAX_PAUSE\u003d200ms may only be suitable for high end servers with lots of\nconcurrent dirtiers, where the large pause time can reduce much overheads.\n\nOtherwise, smaller pause time is desirable whenever possible, so as to\nget good responsiveness and smooth user experiences. It\u0027s actually\nrequired for good disk utilization in the case when all the dirty pages\ncan be synced to disk within MAX_PAUSE\u003d200ms.\n\nSigned-off-by: Wu Fengguang \u003cfengguang.wu@intel.com\u003e\n"
    },
    {
      "commit": "143dfe8611a63030ce0c79419dc362f7838be557",
      "tree": "626b823d86fbb947296fc6c7fe2be324a85f3b5c",
      "parents": [
        "9d823e8f6b1b7b39f952d7d1795f29162143a433"
      ],
      "author": {
        "name": "Wu Fengguang",
        "email": "fengguang.wu@intel.com",
        "time": "Fri Aug 27 18:45:12 2010 -0600"
      },
      "committer": {
        "name": "Wu Fengguang",
        "email": "fengguang.wu@intel.com",
        "time": "Mon Oct 03 21:08:57 2011 +0800"
      },
      "message": "writeback: IO-less balance_dirty_pages()\n\nAs proposed by Chris, Dave and Jan, don\u0027t start foreground writeback IO\ninside balance_dirty_pages(). Instead, simply let it idle sleep for some\ntime to throttle the dirtying task. In the mean while, kick off the\nper-bdi flusher thread to do background writeback IO.\n\nRATIONALS\n\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\n\n- disk seeks on concurrent writeback of multiple inodes (Dave Chinner)\n\n  If every thread doing writes and being throttled start foreground\n  writeback, it leads to N IO submitters from at least N different\n  inodes at the same time, end up with N different sets of IO being\n  issued with potentially zero locality to each other, resulting in\n  much lower elevator sort/merge efficiency and hence we seek the disk\n  all over the place to service the different sets of IO.\n  OTOH, if there is only one submission thread, it doesn\u0027t jump between\n  inodes in the same way when congestion clears - it keeps writing to\n  the same inode, resulting in large related chunks of sequential IOs\n  being issued to the disk. 
This is more efficient than the above\n  foreground writeback because the elevator works better and the disk\n  seeks less.\n\n- lock contention and cache bouncing on concurrent IO submitters (Dave Chinner)\n\n  With this patchset, the fs_mark benchmark on a 12-drive software RAID0 goes\n  from CPU bound to IO bound, freeing \"3-4 CPUs worth of spinlock contention\".\n\n  * \"CPU usage has dropped by ~55%\", \"it certainly appears that most of\n    the CPU time saving comes from the removal of contention on the\n    inode_wb_list_lock\" (IMHO at least 10% comes from the reduction of\n    cacheline bouncing, because the new code is able to call much less\n    frequently into balance_dirty_pages() and hence access the global\n    page states)\n\n  * the user space \"App overhead\" is reduced by 20%, by avoiding the\n    cacheline pollution by the complex writeback code path\n\n  * \"for a ~5% throughput reduction\", \"the number of write IOs have\n    dropped by ~25%\", and the elapsed time reduced from 41:42.17 to\n    40:53.23.\n\n  * On a simple test of 100 dd, it reduces the CPU %system time from 30% to 3%,\n    and improves IO throughput from 38MB/s to 42MB/s.\n\n- IO size too small for fast arrays and too large for slow USB sticks\n\n  The write_chunk used by current balance_dirty_pages() cannot be\n  directly set to some large value (eg. 128MB) for better IO efficiency.\n  Because it could lead to more than 1 second user perceivable stalls.\n  Even the current 4MB write size may be too large for slow USB sticks.\n  The fact that balance_dirty_pages() starts IO on itself couples the\n  IO size to wait time, which makes it hard to do suitable IO size while\n  keeping the wait time under control.\n\n  Now it\u0027s possible to increase writeback chunk size proportional to the\n  disk bandwidth. 
In a simple test of 50 dd\u0027s on XFS, 1-HDD, 3GB ram,\n  the larger writeback size dramatically reduces the seek count to 1/10\n  (far beyond my expectation) and improves the write throughput by 24%.\n\n- long block time in balance_dirty_pages() hurts desktop responsiveness\n\n  Many of us may have the experience: it often takes a couple of seconds\n  or even long time to stop a heavy writing dd/cp/tar command with\n  Ctrl-C or \"kill -9\".\n\n- IO pipeline broken by bumpy write() progress\n\n  There are a broad class of \"loop {read(buf); write(buf);}\" applications\n  whose read() pipeline will be under-utilized or even come to a stop if\n  the write()s have long latencies _or_ don\u0027t progress in a constant rate.\n  The current threshold based throttling inherently transfers the large\n  low level IO completion fluctuations to bumpy application write()s,\n  and further deteriorates with increasing number of dirtiers and/or bdi\u0027s.\n\n  For example, when doing 50 dd\u0027s + 1 remote rsync to an XFS partition,\n  the rsync progresses very bumpy in legacy kernel, and throughput is\n  improved by 67% by this patchset. (plus the larger write chunk size,\n  it will be 93% speedup).\n\n  The new rate based throttling can support 1000+ dd\u0027s with excellent\n  smoothness, low latency and low overheads.\n\nFor the above reasons, it\u0027s much better to do IO-less and low latency\npauses in balance_dirty_pages().\n\nJan Kara, Dave Chinner and me explored the scheme to let\nbalance_dirty_pages() wait for enough writeback IO completions to\nsafeguard the dirty limit. However it\u0027s found to have two problems:\n\n- in large NUMA systems, the per-cpu counters may have big accounting\n  errors, leading to big throttle wait time and jitters.\n\n- NFS may kill large amount of unstable pages with one single COMMIT.\n  Because NFS server serves COMMIT with expensive fsync() IOs, it is\n  desirable to delay and reduce the number of COMMITs. 
So it\u0027s not\n  likely to optimize away such kind of bursty IO completions, and the\n  resulted large (and tiny) stall times in IO completion based throttling.\n\nSo here is a pause time oriented approach, which tries to control the\npause time in each balance_dirty_pages() invocations, by controlling\nthe number of pages dirtied before calling balance_dirty_pages(), for\nsmooth and efficient dirty throttling:\n\n- avoid useless (eg. zero pause time) balance_dirty_pages() calls\n- avoid too small pause time (less than   4ms, which burns CPU power)\n- avoid too large pause time (more than 200ms, which hurts responsiveness)\n- avoid big fluctuations of pause times\n\nIt can control pause times at will. The default policy (in a followup\npatch) will be to do ~10ms pauses in 1-dd case, and increase to ~100ms\nin 1000-dd case.\n\nBEHAVIOR CHANGE\n\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\n\n(1) dirty threshold\n\nUsers will notice that the applications will get throttled once crossing\nthe global (background + dirty)/2\u003d15% threshold, and then balanced around\n17.5%. Before patch, the behavior is to just throttle it at 20% dirtyable\nmemory in 1-dd case.\n\nSince the task will be soft throttled earlier than before, it may be\nperceived by end users as performance \"slow down\" if his application\nhappens to dirty more than 15% dirtyable memory.\n\n(2) smoothness/responsiveness\n\nUsers will notice a more responsive system during heavy writeback.\n\"killall dd\" will take effect instantly.\n\nSigned-off-by: Wu Fengguang \u003cfengguang.wu@intel.com\u003e\n"
    },
    {
      "commit": "9d823e8f6b1b7b39f952d7d1795f29162143a433",
      "tree": "2ef4c0d29353452dd2f894e7dbd240a31bdd0a02",
      "parents": [
        "7381131cbcf7e15d201a0ffd782a4698efe4e740"
      ],
      "author": {
        "name": "Wu Fengguang",
        "email": "fengguang.wu@intel.com",
        "time": "Sat Jun 11 18:10:12 2011 -0600"
      },
      "committer": {
        "name": "Wu Fengguang",
        "email": "fengguang.wu@intel.com",
        "time": "Mon Oct 03 21:08:57 2011 +0800"
      },
      "message": "writeback: per task dirty rate limit\n\nAdd two fields to task_struct.\n\n1) account dirtied pages in the individual tasks, for accuracy\n2) per-task balance_dirty_pages() call intervals, for flexibility\n\nThe balance_dirty_pages() call interval (ie. nr_dirtied_pause) will\nscale near-sqrt to the safety gap between dirty pages and threshold.\n\nThe main problem of per-task nr_dirtied is, if 1k+ tasks start dirtying\npages at exactly the same time, each task will be assigned a large\ninitial nr_dirtied_pause, so that the dirty threshold will be exceeded\nlong before each task reached its nr_dirtied_pause and hence call\nbalance_dirty_pages().\n\nThe solution is to watch for the number of pages dirtied on each CPU in\nbetween the calls into balance_dirty_pages(). If it exceeds ratelimit_pages\n(3% dirty threshold), force call balance_dirty_pages() for a chance to\nset bdi-\u003edirty_exceeded. In normal situations, this safeguarding\ncondition is not expected to trigger at all.\n\nOn the sqrt in dirty_poll_interval():\n\nIt will serve as an initial guess when dirty pages are still in the\nfreerun area.\n\nWhen dirty pages are floating inside the dirty control scope [freerun,\nlimit], a followup patch will use some refined dirty poll interval to\nget the desired pause time.\n\n   thresh-dirty (MB)    sqrt\n\t\t   1      16\n\t\t   2      22\n\t\t   4      32\n\t\t   8      45\n\t\t  16      64\n\t\t  32      90\n\t\t  64     128\n\t\t 128     181\n\t\t 256     256\n\t\t 512     362\n\t\t1024     512\n\nThe above table means, given 1MB (or 1GB) gap and the dd tasks polling\nbalance_dirty_pages() on every 16 (or 512) pages, the dirty limit won\u0027t\nbe exceeded as long as there are less than 16 (or 512) concurrent dd\u0027s.\n\nSo sqrt naturally leads to less overheads and more safe concurrent tasks\nfor large memory servers, which have large (thresh-freerun) gaps.\n\npeter: keep the per-CPU ratelimit for safeguarding the 1k+ tasks case\n\nCC: Peter 
Zijlstra \u003ca.p.zijlstra@chello.nl\u003e\nReviewed-by: Andrea Righi \u003candrea@betterlinux.com\u003e\nSigned-off-by: Wu Fengguang \u003cfengguang.wu@intel.com\u003e\n"
    }
  ],
  "next": "7381131cbcf7e15d201a0ffd782a4698efe4e740"
}