)]}'
{
  "log": [
    {
      "commit": "68809c7108b9a75baf2a888b1c19ce1a4680f600",
      "tree": "72dac532abb4f42d197023a0495952c387c835ce",
      "parents": [
        "4cd9069a0a0e5fb8b007425c937642682ac96c76"
      ],
      "author": {
        "name": "Fengguang Wu",
        "email": "fengguang.wu@intel.com",
        "time": "Sun May 06 13:21:42 2012 +0800"
      },
      "committer": {
        "name": "Fengguang Wu",
        "email": "fengguang.wu@intel.com",
        "time": "Sun May 06 13:41:58 2012 +0800"
      },
      "message": "writeback: initialize global_dirty_limit\n\nThis prevents global_dirty_limit from remaining 0 (the initial value)\nfor long time, since it\u0027s only updated in update_dirty_limit() when\nabove the dirty freerun area.\n\nIt will avoid unexpected consequences when some random code use it as a\nconvenient approximation of the global dirty threshold.\n\nSigned-off-by: Fengguang Wu \u003cfengguang.wu@intel.com\u003e\n"
    },
    {
      "commit": "18cf8cf8bab1296f477ee4dd8f78b5b23c5a192e",
      "tree": "78ce4000085e180d89e85e9363379b55e455b205",
      "parents": [
        "668ce0ac707719d866af7e432e518af7b4c575ad"
      ],
      "author": {
        "name": "H Hartley Sweeten",
        "email": "hartleys@visionengravers.com",
        "time": "Thu Apr 12 13:44:20 2012 -0700"
      },
      "committer": {
        "name": "Fengguang Wu",
        "email": "fengguang.wu@intel.com",
        "time": "Sat Apr 14 17:37:27 2012 +0800"
      },
      "message": "mm: page-writeback.c: local functions should not be exposed globally\n\nThe function global_dirtyable_memory is only referenced in this file and\nshould be marked static to prevent it from being exposed globally.\n\nThis quiets the sparse warning:\n\nwarning: symbol \u0027global_dirtyable_memory\u0027 was not declared. Should it be static?\n\nSigned-off-by: H Hartley Sweeten \u003chsweeten@visionengravers.com\u003e\nSigned-off-by: Fengguang Wu \u003cfengguang.wu@intel.com\u003e\n"
    },
    {
      "commit": "69e1aaddd63104f37021d0b0f6abfd9623c9134c",
      "tree": "14ad49741b428d270b681694bb2df349465455b9",
      "parents": [
        "56b59b429b4c26e5e730bc8c3d837de9f7d0a966",
        "9d547c35799a4ddd235f1565cec2fff6c9263504"
      ],
      "author": {
        "name": "Linus Torvalds",
        "email": "torvalds@linux-foundation.org",
        "time": "Wed Mar 28 10:02:55 2012 -0700"
      },
      "committer": {
        "name": "Linus Torvalds",
        "email": "torvalds@linux-foundation.org",
        "time": "Wed Mar 28 10:02:55 2012 -0700"
      },
      "message": "Merge tag \u0027ext4_for_linus\u0027 of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4\n\nPull ext4 updates for 3.4 from Ted Ts\u0027o:\n \"Ext4 commits for 3.3 merge window; mostly cleanups and bug fixes\n\n  The changes to export dirty_writeback_interval are from Artem\u0027s s_dirt\n  cleanup patch series.  The same is true of the change to remove the\n  s_dirt helper functions which never got used by anyone in-tree.  I\u0027ve\n  run these changes by Al Viro, and am carrying them so that Artem can\n  more easily fix up the rest of the file systems during the next merge\n  window.  (Originally we had hopped to remove the use of s_dirt from\n  ext4 during this merge window, but his patches had some bugs, so I\n  ultimately ended dropping them from the ext4 tree.)\"\n\n* tag \u0027ext4_for_linus\u0027 of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4: (66 commits)\n  vfs: remove unused superblock helpers\n  mm: export dirty_writeback_interval\n  ext4: remove useless s_dirt assignment\n  ext4: write superblock only once on unmount\n  ext4: do not mark superblock as dirty unnecessarily\n  ext4: correct ext4_punch_hole return codes\n  ext4: remove restrictive checks for EOFBLOCKS_FL\n  ext4: always set then trimmed blocks count into len\n  ext4: fix trimmed block count accunting\n  ext4: fix start and len arguments handling in ext4_trim_fs()\n  ext4: update s_free_{inodes,blocks}_count during online resize\n  ext4: change some printk() calls to use ext4_msg() instead\n  ext4: avoid output message interleaving in ext4_error_\u003cfoo\u003e()\n  ext4: remove trailing newlines from ext4_msg() and ext4_error() messages\n  ext4: add no_printk argument validation, fix fallout\n  ext4: remove redundant \"EXT4-fs: \" from uses of ext4_msg\n  ext4: give more helpful error message in ext4_ext_rm_leaf()\n  ext4: remove unused code from ext4_ext_map_blocks()\n  ext4: rewrite punch hole to use ext4_ext_remove_space()\n  jbd2: cleanup journal tail after transaction commit\n  ...\n"
    },
    {
      "commit": "91913a2942d2b582c40673956dec1a9c71d32fe4",
      "tree": "b055ae821b46fa7e11e1f7035a3da7b1e711f5e4",
      "parents": [
        "182f514f883abb5f942c94e61c371c4b406352d4"
      ],
      "author": {
        "name": "Artem Bityutskiy",
        "email": "artem.bityutskiy@linux.intel.com",
        "time": "Wed Mar 21 22:33:00 2012 -0400"
      },
      "committer": {
        "name": "Theodore Ts\u0027o",
        "email": "tytso@mit.edu",
        "time": "Wed Mar 21 22:33:00 2012 -0400"
      },
      "message": "mm: export dirty_writeback_interval\n\nExport \u0027dirty_writeback_interval\u0027 to make it visible to\nfile-systems. We are going to push superblock management down to\nfile-systems and get rid of the \u0027sync_supers\u0027 kernel thread completly.\n\nSigned-off-by: Artem Bityutskiy \u003cartem.bityutskiy@linux.intel.com\u003e\nCc: Al Viro \u003cviro@ZenIV.linux.org.uk\u003e\nSigned-off-by: \"Theodore Ts\u0027o\" \u003ctytso@mit.edu\u003e\n"
    },
    {
      "commit": "47a133339c332f9f8e155c70f5da401aded69948",
      "tree": "94f8fb512ebe8e29d0a3f11a04b04004afd4ab59",
      "parents": [
        "1010bb1b80edb0713415dfe1f97114d320f58c4f"
      ],
      "author": {
        "name": "Fengguang Wu",
        "email": "fengguang.wu@intel.com",
        "time": "Wed Mar 21 16:34:09 2012 -0700"
      },
      "committer": {
        "name": "Linus Torvalds",
        "email": "torvalds@linux-foundation.org",
        "time": "Wed Mar 21 17:54:58 2012 -0700"
      },
      "message": "mm: use global_dirty_limit in throttle_vm_writeout()\n\nWhen starting a memory hog task, a desktop box w/o swap is found to go\nunresponsive for a long time.  It\u0027s solely caused by lots of congestion\nwaits in throttle_vm_writeout():\n\n gnome-system-mo-4201 553.073384: congestion_wait: throttle_vm_writeout+0x70/0x7f shrink_mem_cgroup_zone+0x48f/0x4a1\n gnome-system-mo-4201 553.073386: writeback_congestion_wait: usec_timeout\u003d100000 usec_delayed\u003d100000\n           gtali-4237 553.080377: congestion_wait: throttle_vm_writeout+0x70/0x7f shrink_mem_cgroup_zone+0x48f/0x4a1\n           gtali-4237 553.080378: writeback_congestion_wait: usec_timeout\u003d100000 usec_delayed\u003d100000\n            Xorg-3483 553.103375: congestion_wait: throttle_vm_writeout+0x70/0x7f shrink_mem_cgroup_zone+0x48f/0x4a1\n            Xorg-3483 553.103377: writeback_congestion_wait: usec_timeout\u003d100000 usec_delayed\u003d100000\n\nThe root cause is, the dirty threshold is knocked down a lot by the memory\nhog task.  Fixed by using global_dirty_limit which decreases gradually on\nsuch events and can guarantee we stay above (the also decreasing) nr_dirty\nin the progress of following down to the new dirty threshold.\n\nSigned-off-by: Fengguang Wu \u003cfengguang.wu@intel.com\u003e\nCc: Johannes Weiner \u003channes@cmpxchg.org\u003e\nCc: Jan Kara \u003cjack@suse.cz\u003e\nCc: Greg Thelen \u003cgthelen@google.com\u003e\nCc: Ying Han \u003cyinghan@google.com\u003e\nCc: KAMEZAWA Hiroyuki \u003ckamezawa.hiroyu@jp.fujitsu.com\u003e\nReviewed-by: Rik van Riel \u003criel@redhat.com\u003e\nCc: Mel Gorman \u003cmgorman@suse.de\u003e\nCc: Minchan Kim \u003cminchan.kim@gmail.com\u003e\nSigned-off-by: Andrew Morton \u003cakpm@linux-foundation.org\u003e\nSigned-off-by: Linus Torvalds \u003ctorvalds@linux-foundation.org\u003e\n"
    },
    {
      "commit": "001a541ea9163ace5e8243ee0e907ad80a4c0ec2",
      "tree": "a76225046369c440de93739add9823f5ea060245",
      "parents": [
        "40ba587923ae67090d9f141c1d3c951be5c1420e",
        "bc31b86a5923fad5f3fbb6192f767f410241ba27"
      ],
      "author": {
        "name": "Linus Torvalds",
        "email": "torvalds@linux-foundation.org",
        "time": "Tue Jan 10 16:59:59 2012 -0800"
      },
      "committer": {
        "name": "Linus Torvalds",
        "email": "torvalds@linux-foundation.org",
        "time": "Tue Jan 10 16:59:59 2012 -0800"
      },
      "message": "Merge branch \u0027writeback-for-linus\u0027 of git://git.kernel.org/pub/scm/linux/kernel/git/wfg/linux\n\n* \u0027writeback-for-linus\u0027 of git://git.kernel.org/pub/scm/linux/kernel/git/wfg/linux:\n  writeback: move MIN_WRITEBACK_PAGES to fs-writeback.c\n  writeback: balanced_rate cannot exceed write bandwidth\n  writeback: do strict bdi dirty_exceeded\n  writeback: avoid tiny dirty poll intervals\n  writeback: max, min and target dirty pause time\n  writeback: dirty ratelimit - think time compensation\n  btrfs: fix dirtied pages accounting on sub-page writes\n  writeback: fix dirtied pages accounting on redirty\n  writeback: fix dirtied pages accounting on sub-page writes\n  writeback: charge leaked page dirties to active tasks\n  writeback: Include all dirty inodes in background writeback\n"
    },
    {
      "commit": "a756cf5908530e8b40bdf569eb48b40139e8d7fd",
      "tree": "ba9df151d5468098c7eae563ce09faea6a539fc0",
      "parents": [
        "ccafa2879fb8d13b8031337a8743eac4189e5d6e"
      ],
      "author": {
        "name": "Johannes Weiner",
        "email": "jweiner@redhat.com",
        "time": "Tue Jan 10 15:07:49 2012 -0800"
      },
      "committer": {
        "name": "Linus Torvalds",
        "email": "torvalds@linux-foundation.org",
        "time": "Tue Jan 10 16:30:43 2012 -0800"
      },
      "message": "mm: try to distribute dirty pages fairly across zones\n\nThe maximum number of dirty pages that exist in the system at any time is\ndetermined by a number of pages considered dirtyable and a user-configured\npercentage of those, or an absolute number in bytes.\n\nThis number of dirtyable pages is the sum of memory provided by all the\nzones in the system minus their lowmem reserves and high watermarks, so\nthat the system can retain a healthy number of free pages without having\nto reclaim dirty pages.\n\nBut there is a flaw in that we have a zoned page allocator which does not\ncare about the global state but rather the state of individual memory\nzones.  And right now there is nothing that prevents one zone from filling\nup with dirty pages while other zones are spared, which frequently leads\nto situations where kswapd, in order to restore the watermark of free\npages, does indeed have to write pages from that zone\u0027s LRU list.  This\ncan interfere so badly with IO from the flusher threads that major\nfilesystems (btrfs, xfs, ext4) mostly ignore write requests from reclaim\nalready, taking away the VM\u0027s only possibility to keep such a zone\nbalanced, aside from hoping the flushers will soon clean pages from that\nzone.\n\nEnter per-zone dirty limits.  They are to a zone\u0027s dirtyable memory what\nthe global limit is to the global amount of dirtyable memory, and try to\nmake sure that no single zone receives more than its fair share of the\nglobally allowed dirty pages in the first place.  
As the number of pages\nconsidered dirtyable excludes the zones\u0027 lowmem reserves and high\nwatermarks, the maximum number of dirty pages in a zone is such that the\nzone can always be balanced without requiring page cleaning.\n\nAs this is a placement decision in the page allocator and pages are\ndirtied only after the allocation, this patch allows allocators to pass\n__GFP_WRITE when they know in advance that the page will be written to and\nbecome dirty soon.  The page allocator will then attempt to allocate from\nthe first zone of the zonelist - which on NUMA is determined by the task\u0027s\nNUMA memory policy - that has not exceeded its dirty limit.\n\nAt first glance, it would appear that the diversion to lower zones can\nincrease pressure on them, but this is not the case.  With a full high\nzone, allocations will be diverted to lower zones eventually, so it is\nmore of a shift in timing of the lower zone allocations.  Workloads that\npreviously could fit their dirty pages completely in the higher zone may\nbe forced to allocate from lower zones, but the amount of pages that\n\"spill over\" are limited themselves by the lower zones\u0027 dirty constraints,\nand thus unlikely to become a problem.\n\nFor now, the problem of unfair dirty page distribution remains for NUMA\nconfigurations where the zones allowed for allocation are in sum not big\nenough to trigger the global dirty limits, wake up the flusher threads and\nremedy the situation.  
Because of this, an allocation that could not\nsucceed on any of the considered zones is allowed to ignore the dirty\nlimits before going into direct reclaim or even failing the allocation,\nuntil a future patch changes the global dirty throttling and flusher\nthread activation so that they take individual zone states into account.\n\n\t\t\tTest results\n\n15M DMA + 3246M DMA32 + 504 Normal \u003d 3765M memory\n40% dirty ratio\n16G USB thumb drive\n10 runs of dd if\u003d/dev/zero of\u003ddisk/zeroes bs\u003d32k count\u003d$((10 \u003c\u003c 15))\n\n\t\tseconds\t\t\tnr_vmscan_write\n\t\t        (stddev)\t       min|     median|        max\nxfs\nvanilla:\t 549.747( 3.492)\t     0.000|      0.000|      0.000\npatched:\t 550.996( 3.802)\t     0.000|      0.000|      0.000\n\nfuse-ntfs\nvanilla:\t1183.094(53.178)\t 54349.000|  59341.000|  65163.000\npatched:\t 558.049(17.914)\t     0.000|      0.000|     43.000\n\nbtrfs\nvanilla:\t 573.679(14.015)\t156657.000| 460178.000| 606926.000\npatched:\t 563.365(11.368)\t     0.000|      0.000|   1362.000\n\next4\nvanilla:\t 561.197(15.782)\t     0.000|2725438.000|4143837.000\npatched:\t 568.806(17.496)\t     0.000|      0.000|      0.000\n\nSigned-off-by: Johannes Weiner \u003cjweiner@redhat.com\u003e\nReviewed-by: Minchan Kim \u003cminchan.kim@gmail.com\u003e\nAcked-by: Mel Gorman \u003cmgorman@suse.de\u003e\nReviewed-by: Michal Hocko \u003cmhocko@suse.cz\u003e\nTested-by: Wu Fengguang \u003cfengguang.wu@intel.com\u003e\nCc: KAMEZAWA Hiroyuki \u003ckamezawa.hiroyu@jp.fujitsu.com\u003e\nCc: Christoph Hellwig \u003chch@infradead.org\u003e\nCc: Dave Chinner \u003cdavid@fromorbit.com\u003e\nCc: Jan Kara \u003cjack@suse.cz\u003e\nCc: Shaohua Li \u003cshaohua.li@intel.com\u003e\nCc: Rik van Riel \u003criel@redhat.com\u003e\nCc: Chris Mason \u003cchris.mason@oracle.com\u003e\nSigned-off-by: Andrew Morton \u003cakpm@linux-foundation.org\u003e\nSigned-off-by: Linus Torvalds \u003ctorvalds@linux-foundation.org\u003e\n"
    },
    {
      "commit": "ccafa2879fb8d13b8031337a8743eac4189e5d6e",
      "tree": "0202fd26218faba5751de0906c430f422b0ccbac",
      "parents": [
        "ab8fabd46f811d5153d8a0cd2fac9a0d41fb593d"
      ],
      "author": {
        "name": "Johannes Weiner",
        "email": "jweiner@redhat.com",
        "time": "Tue Jan 10 15:07:44 2012 -0800"
      },
      "committer": {
        "name": "Linus Torvalds",
        "email": "torvalds@linux-foundation.org",
        "time": "Tue Jan 10 16:30:43 2012 -0800"
      },
      "message": "mm: writeback: cleanups in preparation for per-zone dirty limits\n\nThe next patch will introduce per-zone dirty limiting functions in\naddition to the traditional global dirty limiting.\n\nRename determine_dirtyable_memory() to global_dirtyable_memory() before\nadding the zone-specific version, and fix up its documentation.\n\nAlso, move the functions to determine the dirtyable memory and the\nfunction to calculate the dirty limit based on that together so that their\nrelationship is more apparent and that they can be commented on as a\ngroup.\n\nSigned-off-by: Johannes Weiner \u003cjweiner@redhat.com\u003e\nReviewed-by: Minchan Kim \u003cminchan.kim@gmail.com\u003e\nAcked-by: Mel Gorman \u003cmel@suse.de\u003e\nReviewed-by: Michal Hocko \u003cmhocko@suse.cz\u003e\nCc: KAMEZAWA Hiroyuki \u003ckamezawa.hiroyu@jp.fujitsu.com\u003e\nCc: Christoph Hellwig \u003chch@infradead.org\u003e\nCc: Wu Fengguang \u003cfengguang.wu@intel.com\u003e\nCc: Dave Chinner \u003cdavid@fromorbit.com\u003e\nCc: Jan Kara \u003cjack@suse.cz\u003e\nCc: Shaohua Li \u003cshaohua.li@intel.com\u003e\nCc: Rik van Riel \u003criel@redhat.com\u003e\nCc: Chris Mason \u003cchris.mason@oracle.com\u003e\nSigned-off-by: Andrew Morton \u003cakpm@linux-foundation.org\u003e\nSigned-off-by: Linus Torvalds \u003ctorvalds@linux-foundation.org\u003e\n"
    },
    {
      "commit": "ab8fabd46f811d5153d8a0cd2fac9a0d41fb593d",
      "tree": "0a6f7dcca59d22abe07973e3fafc41719ff3ad9d",
      "parents": [
        "25bd91bd27820d5971258cecd1c0e64b0e485144"
      ],
      "author": {
        "name": "Johannes Weiner",
        "email": "jweiner@redhat.com",
        "time": "Tue Jan 10 15:07:42 2012 -0800"
      },
      "committer": {
        "name": "Linus Torvalds",
        "email": "torvalds@linux-foundation.org",
        "time": "Tue Jan 10 16:30:43 2012 -0800"
      },
      "message": "mm: exclude reserved pages from dirtyable memory\n\nPer-zone dirty limits try to distribute page cache pages allocated for\nwriting across zones in proportion to the individual zone sizes, to reduce\nthe likelihood of reclaim having to write back individual pages from the\nLRU lists in order to make progress.\n\nThis patch:\n\nThe amount of dirtyable pages should not include the full number of free\npages: there is a number of reserved pages that the page allocator and\nkswapd always try to keep free.\n\nThe closer (reclaimable pages - dirty pages) is to the number of reserved\npages, the more likely it becomes for reclaim to run into dirty pages:\n\n       +----------+ ---\n       |   anon   |  |\n       +----------+  |\n       |          |  |\n       |          |  -- dirty limit new    -- flusher new\n       |   file   |  |                     |\n       |          |  |                     |\n       |          |  -- dirty limit old    -- flusher old\n       |          |                        |\n       +----------+                       --- reclaim\n       | reserved |\n       +----------+\n       |  kernel  |\n       +----------+\n\nThis patch introduces a per-zone dirty reserve that takes both the lowmem\nreserve as well as the high watermark of the zone into account, and a\nglobal sum of those per-zone values that is subtracted from the global\namount of dirtyable pages.  The lowmem reserve is unavailable to page\ncache allocations and kswapd tries to keep the high watermark free.  We\ndon\u0027t want to end up in a situation where reclaim has to clean pages in\norder to balance zones.\n\nNot treating reserved pages as dirtyable on a global level is only a\nconceptual fix.  In reality, dirty pages are not distributed equally\nacross zones and reclaim runs into dirty pages on a regular basis.\n\nBut it is important to get this right before tackling the problem on a\nper-zone level, where the distance between reclaim and the dirty pages is\nmostly much smaller in absolute numbers.\n\n[akpm@linux-foundation.org: fix highmem build]\nSigned-off-by: Johannes Weiner \u003cjweiner@redhat.com\u003e\nReviewed-by: Rik van Riel \u003criel@redhat.com\u003e\nReviewed-by: Michal Hocko \u003cmhocko@suse.cz\u003e\nReviewed-by: Minchan Kim \u003cminchan.kim@gmail.com\u003e\nAcked-by: Mel Gorman \u003cmgorman@suse.de\u003e\nCc: KAMEZAWA Hiroyuki \u003ckamezawa.hiroyu@jp.fujitsu.com\u003e\nCc: Christoph Hellwig \u003chch@infradead.org\u003e\nCc: Wu Fengguang \u003cfengguang.wu@intel.com\u003e\nCc: Dave Chinner \u003cdavid@fromorbit.com\u003e\nCc: Jan Kara \u003cjack@suse.cz\u003e\nCc: Shaohua Li \u003cshaohua.li@intel.com\u003e\nCc: Chris Mason \u003cchris.mason@oracle.com\u003e\nSigned-off-by: Andrew Morton \u003cakpm@linux-foundation.org\u003e\nSigned-off-by: Linus Torvalds \u003ctorvalds@linux-foundation.org\u003e\n"
    },
    {
      "commit": "1edf223485c42c99655dcd001db1e46ad5e5d2d7",
      "tree": "33b93dc8f2a249806150b5792ac1787688bf6b74",
      "parents": [
        "e4e11180dfa545233e5145919b75b7fac88638df"
      ],
      "author": {
        "name": "Johannes Weiner",
        "email": "hannes@cmpxchg.org",
        "time": "Tue Jan 10 15:06:57 2012 -0800"
      },
      "committer": {
        "name": "Linus Torvalds",
        "email": "torvalds@linux-foundation.org",
        "time": "Tue Jan 10 16:30:41 2012 -0800"
      },
      "message": "mm/page-writeback.c: make determine_dirtyable_memory static again\n\nThe tracing ring-buffer used this function briefly, but not anymore.\nMake it local to the writeback code again.\n\nAlso, move the function so that no forward declaration needs to be\nreintroduced.\n\nSigned-off-by: Johannes Weiner \u003channes@cmpxchg.org\u003e\nAcked-by: Mel Gorman \u003cmgorman@suse.de\u003e\nReviewed-by: Michal Hocko \u003cmhocko@suse.cz\u003e\nCc: Wu Fengguang \u003cfengguang.wu@intel.com\u003e\nSigned-off-by: Andrew Morton \u003cakpm@linux-foundation.org\u003e\nSigned-off-by: Linus Torvalds \u003ctorvalds@linux-foundation.org\u003e\n"
    },
    {
      "commit": "ff01bb4832651c6d25ac509a06a10fcbd75c461c",
      "tree": "bbfdebd317db97d346df78293566f36e883b1be9",
      "parents": [
        "94ea4158f1733e3b10cef067d535f504866e0c41"
      ],
      "author": {
        "name": "Al Viro",
        "email": "viro@zeniv.linux.org.uk",
        "time": "Fri Sep 16 02:31:11 2011 -0400"
      },
      "committer": {
        "name": "Al Viro",
        "email": "viro@zeniv.linux.org.uk",
        "time": "Tue Jan 03 22:54:07 2012 -0500"
      },
      "message": "fs: move code out of buffer.c\n\nMove invalidate_bdev, block_sync_page into fs/block_dev.c.  Export\nkill_bdev as well, so brd doesn\u0027t have to open code it.  Reduce\nbuffer_head.h requirement accordingly.\n\nRemoved a rather large comment from invalidate_bdev, as it looked a bit\nobsolete to bother moving.  The small comment replacing it says enough.\n\nSigned-off-by: Nick Piggin \u003cnpiggin@suse.de\u003e\nCc: Al Viro \u003cviro@ZenIV.linux.org.uk\u003e\nCc: Christoph Hellwig \u003chch@lst.de\u003e\nSigned-off-by: Andrew Morton \u003cakpm@linux-foundation.org\u003e\nSigned-off-by: Al Viro \u003cviro@zeniv.linux.org.uk\u003e\n"
    },
    {
      "commit": "bdaac4902a8225bf247ecaeac46c4b2980cc70e5",
      "tree": "cf6c065d9e46c53b45248f634491a6d864563382",
      "parents": [
        "82791940545be38810dfd5e03ee701e749f04aab"
      ],
      "author": {
        "name": "Wu Fengguang",
        "email": "fengguang.wu@intel.com",
        "time": "Wed Aug 03 14:30:36 2011 -0600"
      },
      "committer": {
        "name": "Wu Fengguang",
        "email": "fengguang.wu@intel.com",
        "time": "Sun Dec 18 14:20:33 2011 +0800"
      },
      "message": "writeback: balanced_rate cannot exceed write bandwidth\n\nAdd an upper limit to balanced_rate according to the below inequality.\nThis filters out some rare but huge singular points, which at least\nenables more readable gnuplot figures.\n\nWhen there are N dd dirtiers,\n\n\tbalanced_dirty_ratelimit \u003d write_bw / N\n\nSo it holds that\n\n\tbalanced_dirty_ratelimit \u003c\u003d write_bw\n\nThe singular points originate from dirty_rate in the below formular:\n\n        balanced_dirty_ratelimit \u003d task_ratelimit * write_bw / dirty_rate\nwhere\n\tdirty_rate \u003d (number of page dirties in the past 200ms) / 200ms\n\nIn the extreme case, if all dd tasks suddenly get blocked on something\nelse and hence no pages are dirtied at all, dirty_rate will be 0 and\nbalanced_dirty_ratelimit will be inf. This could happen in reality.\n\nNote that these huge singular points are not a real threat, since they\nare _guaranteed_ to be filtered out by the\n\tmin(balanced_dirty_ratelimit, task_ratelimit)\nline in bdi_update_dirty_ratelimit(). task_ratelimit is based on the\nnumber of dirty pages, which will never _suddenly_ fly away like\nbalanced_dirty_ratelimit. So any weirdly large balanced_dirty_ratelimit\nwill be cut down to the level of task_ratelimit.\n\nThere won\u0027t be tiny singular points though, as long as the dirty pages\nlie inside the dirty throttling region (above the freerun region).\nBecause there the dd tasks will be throttled by balanced_dirty_pages()\nand won\u0027t be able to suddenly dirty much more pages than average.\n\nAcked-by: Jan Kara \u003cjack@suse.cz\u003e\nAcked-by: Peter Zijlstra \u003ca.p.zijlstra@chello.nl\u003e\nSigned-off-by: Wu Fengguang \u003cfengguang.wu@intel.com\u003e\n"
    },
    {
      "commit": "82791940545be38810dfd5e03ee701e749f04aab",
      "tree": "427e4b1f535dfa483de6b71d1f59c13fd07a0ff9",
      "parents": [
        "5b9b357435a51ff14835c06d8b00765a4c68f313"
      ],
      "author": {
        "name": "Wu Fengguang",
        "email": "fengguang.wu@intel.com",
        "time": "Sat Dec 03 21:26:01 2011 -0600"
      },
      "committer": {
        "name": "Wu Fengguang",
        "email": "fengguang.wu@intel.com",
        "time": "Sun Dec 18 14:20:31 2011 +0800"
      },
      "message": "writeback: do strict bdi dirty_exceeded\n\nThis helps to reduce dirty throttling polls and hence CPU overheads.\n\nbdi-\u003edirty_exceeded typically only helps when suddenly starting 100+\ndd\u0027s on a disk, in which case the dd\u0027s may need to poll\nbalance_dirty_pages() earlier than tsk-\u003enr_dirtied_pause.\n\nCC: Jan Kara \u003cjack@suse.cz\u003e\nCC: Peter Zijlstra \u003ca.p.zijlstra@chello.nl\u003e\nSigned-off-by: Wu Fengguang \u003cfengguang.wu@intel.com\u003e\n"
    },
    {
      "commit": "5b9b357435a51ff14835c06d8b00765a4c68f313",
      "tree": "858bdc6ce0984aa0a9abc88d4c53931e6b299312",
      "parents": [
        "7ccb9ad5364d6ac0c803096c67e76a7545cf7a77"
      ],
      "author": {
        "name": "Wu Fengguang",
        "email": "fengguang.wu@intel.com",
        "time": "Tue Dec 06 13:17:17 2011 -0600"
      },
      "committer": {
        "name": "Wu Fengguang",
        "email": "fengguang.wu@intel.com",
        "time": "Sun Dec 18 14:20:30 2011 +0800"
      },
      "message": "writeback: avoid tiny dirty poll intervals\n\nThe LKP tests see big 56% regression for the case fio_mmap_randwrite_64k.\nShaohua manages to root cause it to be the much smaller dirty pause times\nand hence much more frequent invocations to the IO-less balance_dirty_pages().\nSince fio_mmap_randwrite_64k effectively contains both reads and writes,\nthe more frequent pauses triggered more idling in the cfq IO scheduler.\n\nThe solution is to increase pause time all the way up to the max 200ms\nin this case, which is found to restore most performance. This will help\nreduce CPU overheads in other cases, too.\n\nNote that I don\u0027t expect many performance critical workloads to run this\naccess pattern: the mmap read-on-write is rather inefficient and could\nbe avoided by doing normal writes syscalls.\n\nCC: Jan Kara \u003cjack@suse.cz\u003e\nCC: Peter Zijlstra \u003ca.p.zijlstra@chello.nl\u003e\nReported-by: Li Shaohua \u003cshaohua.li@intel.com\u003e\nTested-by: Li Shaohua \u003cshaohua.li@intel.com\u003e\nSigned-off-by: Wu Fengguang \u003cfengguang.wu@intel.com\u003e\n"
    },
    {
      "commit": "7ccb9ad5364d6ac0c803096c67e76a7545cf7a77",
      "tree": "53894333454bca278f20f9c5841dd1b45c384721",
      "parents": [
        "83712358ba0a1497ce59a4f84ce4dd0f803fe6fc"
      ],
      "author": {
        "name": "Wu Fengguang",
        "email": "fengguang.wu@intel.com",
        "time": "Wed Nov 30 11:08:55 2011 -0600"
      },
      "committer": {
        "name": "Wu Fengguang",
        "email": "fengguang.wu@intel.com",
        "time": "Sun Dec 18 14:20:28 2011 +0800"
      },
      "message": "writeback: max, min and target dirty pause time\n\nControl the pause time and the call intervals to balance_dirty_pages()\nwith three parameters:\n\n1) max_pause, limited by bdi_dirty and MAX_PAUSE\n\n2) the target pause time, grows with the number of dd tasks\n   and is normally limited by max_pause/2\n\n3) the minimal pause, set to half the target pause\n   and is used to skip short sleeps and accumulate them into bigger ones\n\nThe typical behaviors after patch:\n\n- if ever task_ratelimit is far below dirty_ratelimit, the pause time\n  will remain constant at max_pause and nr_dirtied_pause will be\n  fluctuating with task_ratelimit\n\n- in the normal cases, nr_dirtied_pause will remain stable (keep in the\n  same pace with dirty_ratelimit) and the pause time will be fluctuating\n  with task_ratelimit\n\nIn summary, someone has to fluctuate with task_ratelimit, because\n\n\ttask_ratelimit \u003d nr_dirtied_pause / pause\n\nWe normally prefer a stable nr_dirtied_pause, until reaching max_pause.\n\nThe notable behavior changes are:\n\n- in stable workloads, there will no longer be sudden big trajectory\n  switching of nr_dirtied_pause as concerned by Peter. It will be as\n  smooth as dirty_ratelimit and changing proportionally with it (as\n  always, assuming bdi bandwidth does not fluctuate across 2^N lines,\n  otherwise nr_dirtied_pause will show up in 2+ parallel trajectories)\n\n- in the rare cases when something keeps task_ratelimit far below\n  dirty_ratelimit, the smoothness can no longer be retained and\n  nr_dirtied_pause will be \"dancing\" with task_ratelimit. This fixes a\n  (not that destructive but still not good) bug that\n\t  dirty_ratelimit gets brought down undesirably\n\t  \u003c\u003d balanced_dirty_ratelimit is under estimated\n\t  \u003c\u003d weakly executed task_ratelimit\n\t  \u003c\u003d pause goes too large and gets trimmed down to max_pause\n\t  \u003c\u003d nr_dirtied_pause (based on dirty_ratelimit) is set too large\n\t  \u003c\u003d dirty_ratelimit being much larger than task_ratelimit\n\n- introduce min_pause to avoid small pause sleeps\n\n- when pause is trimmed down to max_pause, try to compensate it at the\n  next pause time\n\nThe \"refactor\" type of changes are:\n\nThe max_pause equation is slightly transformed to make it slightly more\nefficient.\n\nWe now scale target_pause by (N * 10ms) on 2^N concurrent tasks, which\nis effectively equal to the original scaling max_pause by (N * 20ms)\nbecause the original code does implicit target_pause ~\u003d max_pause / 2.\nBased on the same implicit ratio, target_pause starts with 10ms on 1 dd.\n\nCC: Jan Kara \u003cjack@suse.cz\u003e\nCC: Peter Zijlstra \u003ca.p.zijlstra@chello.nl\u003e\nSigned-off-by: Wu Fengguang \u003cfengguang.wu@intel.com\u003e\n"
    },
    {
      "commit": "83712358ba0a1497ce59a4f84ce4dd0f803fe6fc",
      "tree": "d17ab27a7bff50616e3b63ad137c004d9ccfbcb0",
      "parents": [
        "32c7f202a4801252a0f3578807b75a961f792870"
      ],
      "author": {
        "name": "Wu Fengguang",
        "email": "fengguang.wu@intel.com",
        "time": "Sat Jun 11 19:25:42 2011 -0600"
      },
      "committer": {
        "name": "Wu Fengguang",
        "email": "fengguang.wu@intel.com",
        "time": "Sun Dec 18 14:20:27 2011 +0800"
      },
      "message": "writeback: dirty ratelimit - think time compensation\n\nCompensate the task\u0027s think time when computing the final pause time,\nso that -\u003edirty_ratelimit can be executed accurately.\n\n        think time :\u003d time spend outside of balance_dirty_pages()\n\nIn the rare case that the task slept longer than the 200ms period time\n(result in negative pause time), the sleep time will be compensated in\nthe following periods, too, if it\u0027s less than 1 second.\n\nAccumulated errors are carefully avoided as long as the max pause area\nis not hitted.\n\nPseudo code:\n\n        period \u003d pages_dirtied / task_ratelimit;\n        think \u003d jiffies - dirty_paused_when;\n        pause \u003d period - think;\n\n1) normal case: period \u003e think\n\n        pause \u003d period - think\n        dirty_paused_when \u003d jiffies + pause\n        nr_dirtied \u003d 0\n\n                             period time\n              |\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003e|\n                  think time      pause time\n              |\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003e|\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003e|\n        ------|----------------|---------------|------------------------\n        dirty_paused_when   jiffies\n\n2) no pause case: period \u003c\u003d think\n\n        don\u0027t pause; reduce future pause time by:\n        dirty_paused_when +\u003d period\n        nr_dirtied \u003d 0\n\n                           period time\n              |\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003e|\n                                  think time\n         
     |\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003e|\n        ------|--------------------------------+-------------------|----\n        dirty_paused_when                                       jiffies\n\nAcked-by: Jan Kara \u003cjack@suse.cz\u003e\nAcked-by: Peter Zijlstra \u003ca.p.zijlstra@chello.nl\u003e\nSigned-off-by: Wu Fengguang \u003cfengguang.wu@intel.com\u003e\n"
    },
    {
      "commit": "2f800fbd777b792de54187088df19a7df0251254",
      "tree": "2dc813b41da42d647468dafcf49a0b9e673df8b5",
      "parents": [
        "d3bc1fef9389e409a772ea174a5e41a6f93d9b7b"
      ],
      "author": {
        "name": "Wu Fengguang",
        "email": "fengguang.wu@intel.com",
        "time": "Mon Aug 08 15:22:00 2011 -0600"
      },
      "committer": {
        "name": "Wu Fengguang",
        "email": "fengguang.wu@intel.com",
        "time": "Sun Dec 18 14:20:23 2011 +0800"
      },
      "message": "writeback: fix dirtied pages accounting on redirty\n\nDe-account the accumulative dirty counters on page redirty.\n\nPage redirties (very common in ext4) will introduce mismatch between\ncounters (a) and (b)\n\na) NR_DIRTIED, BDI_DIRTIED, tsk-\u003enr_dirtied\nb) NR_WRITTEN, BDI_WRITTEN\n\nThis will introduce systematic errors in balanced_rate and result in\ndirty page position errors (ie. the dirty pages are no longer balanced\naround the global/bdi setpoints).\n\nAcked-by: Jan Kara \u003cjack@suse.cz\u003e\nAcked-by: Peter Zijlstra \u003ca.p.zijlstra@chello.nl\u003e\nSigned-off-by: Wu Fengguang \u003cfengguang.wu@intel.com\u003e\n"
    },
    {
      "commit": "d3bc1fef9389e409a772ea174a5e41a6f93d9b7b",
      "tree": "d1e47354263b7c930a7ec4428909693d10a10c50",
      "parents": [
        "54848d73f9f254631303d6eab9b976855988b266"
      ],
      "author": {
        "name": "Wu Fengguang",
        "email": "fengguang.wu@intel.com",
        "time": "Thu Apr 14 07:52:37 2011 -0600"
      },
      "committer": {
        "name": "Wu Fengguang",
        "email": "fengguang.wu@intel.com",
        "time": "Sun Dec 18 14:20:22 2011 +0800"
      },
      "message": "writeback: fix dirtied pages accounting on sub-page writes\n\nWhen dd in 512bytes, generic_perform_write() calls\nbalance_dirty_pages_ratelimited() 8 times for the same page, but\nobviously the page is only dirtied once.\n\nFix it by accounting tsk-\u003enr_dirtied and bdp_ratelimits at page dirty time.\n\nAcked-by: Jan Kara \u003cjack@suse.cz\u003e\nAcked-by: Peter Zijlstra \u003ca.p.zijlstra@chello.nl\u003e\nSigned-off-by: Wu Fengguang \u003cfengguang.wu@intel.com\u003e\n"
    },
    {
      "commit": "54848d73f9f254631303d6eab9b976855988b266",
      "tree": "9fb4b7e564f2c0df88d0bde2f482b9b7efc847fa",
      "parents": [
        "1bc36b6426ae49139e9f56491db76b95921454d7"
      ],
      "author": {
        "name": "Wu Fengguang",
        "email": "fengguang.wu@intel.com",
        "time": "Tue Apr 05 13:21:19 2011 -0600"
      },
      "committer": {
        "name": "Wu Fengguang",
        "email": "fengguang.wu@intel.com",
        "time": "Sun Dec 18 14:20:20 2011 +0800"
      },
      "message": "writeback: charge leaked page dirties to active tasks\n\nIt\u0027s a years long problem that a large number of short-lived dirtiers\n(eg. gcc instances in a fast kernel build) may starve long-run dirtiers\n(eg. dd) as well as pushing the dirty pages to the global hard limit.\n\nThe solution is to charge the pages dirtied by the exited gcc to the\nother random dirtying tasks. It sounds not perfect, however should\nbehave good enough in practice, seeing as that throttled tasks aren\u0027t\nactually running so those that are running are more likely to pick it up\nand get throttled, therefore promoting an equal spread.\n\nRandy: fix compile error: \u0027dirty_throttle_leaks\u0027 undeclared in exit.c\n\nAcked-by: Jan Kara \u003cjack@suse.cz\u003e\nAcked-by: Peter Zijlstra \u003ca.p.zijlstra@chello.nl\u003e\nSigned-off-by: Randy Dunlap \u003crdunlap@xenotime.net\u003e\nSigned-off-by: Wu Fengguang \u003cfengguang.wu@intel.com\u003e\n"
    },
    {
      "commit": "82e230a07de3812a5e87a27979f033dad59172e3",
      "tree": "672ecaa3a1cf3585aa941491b2cf77ae38f1d8ff",
      "parents": [
        "c5c6343c4d75f9d3226e05a72e7861e967fc8099"
      ],
      "author": {
        "name": "Wu Fengguang",
        "email": "fengguang.wu@intel.com",
        "time": "Fri Dec 02 18:21:51 2011 -0600"
      },
      "committer": {
        "name": "Wu Fengguang",
        "email": "fengguang.wu@intel.com",
        "time": "Thu Dec 08 10:49:29 2011 +0800"
      },
      "message": "writeback: set max_pause to lowest value on zero bdi_dirty\n\nSome trace shows lots of bdi_dirty\u003d0 lines where it\u0027s actually some\nsmall value if w/o the accounting errors in the per-cpu bdi stats.\n\nIn this case the max pause time should really be set to the smallest\n(non-zero) value to avoid IO queue underrun and improve throughput.\n\nSigned-off-by: Wu Fengguang \u003cfengguang.wu@intel.com\u003e\n"
    },
    {
      "commit": "c5c6343c4d75f9d3226e05a72e7861e967fc8099",
      "tree": "e31d6d748b347314d7ef64fac8f14acc36dfb701",
      "parents": [
        "aed21ad28b1323b2807faea019e5ac388a7bc837"
      ],
      "author": {
        "name": "Wu Fengguang",
        "email": "fengguang.wu@intel.com",
        "time": "Fri Dec 02 10:21:33 2011 -0600"
      },
      "committer": {
        "name": "Wu Fengguang",
        "email": "fengguang.wu@intel.com",
        "time": "Thu Dec 08 10:49:27 2011 +0800"
      },
      "message": "writeback: permit through good bdi even when global dirty exceeded\n\nOn a system with 1 local mount and 1 NFS mount, if the NFS server\nbecomes not responding when dd to the NFS mount, the NFS dirty pages may\nexceed the global dirty limit and _every_ task involving writing will be\nblocked. The whole system appears unresponsive.\n\nThe workaround is to permit through the bdi\u0027s that only has a small\nnumber of dirty pages. The number chosen (bdi_stat_error pages) is not\nenough to enable the local disk to run in optimal throughput, however is\nenough to make the system responsive on a broken NFS mount. The user can\nthen kill the dirtiers on the NFS mount and increase the global dirty\nlimit to bring up the local disk\u0027s throughput.\n\nIt risks allowing dirty pages to grow much larger than the global dirty\nlimit when there are 1000+ mounts, however that\u0027s very unlikely to happen,\nespecially in low memory profiles.\n\nSigned-off-by: Wu Fengguang \u003cfengguang.wu@intel.com\u003e\n"
    },
    {
      "commit": "aed21ad28b1323b2807faea019e5ac388a7bc837",
      "tree": "64d6bf0e86b7d256621420d2266d5c7c29bb5d50",
      "parents": [
        "a50527b19c62c808a7fca022816fff88a50b948d"
      ],
      "author": {
        "name": "Wu Fengguang",
        "email": "fengguang.wu@intel.com",
        "time": "Wed Nov 23 11:44:41 2011 -0600"
      },
      "committer": {
        "name": "Wu Fengguang",
        "email": "fengguang.wu@intel.com",
        "time": "Thu Dec 08 10:49:20 2011 +0800"
      },
      "message": "writeback: comment on the bdi dirty threshold\n\nWe do \"floating proportions\" to let active devices to grow its target\nshare of dirty pages and stalled/inactive devices to decrease its target\nshare over time.\n\nIt works well except in the case of \"an inactive disk suddenly goes\nbusy\", where the initial target share may be too small. To mitigate\nthis, bdi_position_ratio() has the below line to raise a small\nbdi_thresh when it\u0027s safe to do so, so that the disk be feed with enough\ndirty pages for efficient IO and in turn fast rampup of bdi_thresh:\n\n        bdi_thresh \u003d max(bdi_thresh, (limit - dirty) / 8);\n\nbalance_dirty_pages() normally does negative feedback control which\nadjusts ratelimit to balance the bdi dirty pages around the target.\nIn some extreme cases when that is not enough, it will have to block\nthe tasks completely until the bdi dirty pages drop below bdi_thresh.\n\nAcked-by: Jan Kara \u003cjack@suse.cz\u003e\nAcked-by: Peter Zijlstra \u003ca.p.zijlstra@chello.nl\u003e\nSigned-off-by: Wu Fengguang \u003cfengguang.wu@intel.com\u003e\n"
    },
    {
      "commit": "468e6a20afaccb67e2a7d7f60d301f90e1c6f301",
      "tree": "5558e92e85decd0fa0bb95ed6e637e1f68ea2fe1",
      "parents": [
        "1df647197c5b8aacaeb58592cba9a1df322c9000"
      ],
      "author": {
        "name": "Wu Fengguang",
        "email": "fengguang.wu@intel.com",
        "time": "Wed Sep 07 10:41:32 2011 -0600"
      },
      "committer": {
        "name": "Wu Fengguang",
        "email": "fengguang.wu@intel.com",
        "time": "Thu Nov 17 20:49:06 2011 +0800"
      },
      "message": "writeback: remove vm_dirties and task-\u003edirties\n\nThey are not used any more.\n\nSigned-off-by: Wu Fengguang \u003cfengguang.wu@intel.com\u003e\n"
    },
    {
      "commit": "1df647197c5b8aacaeb58592cba9a1df322c9000",
      "tree": "d413b165aca10d3a6058e39430680e38c09c0037",
      "parents": [
        "499d05ecf990a7a7bbf9e0a273f9969f8ec69efc"
      ],
      "author": {
        "name": "Wu Fengguang",
        "email": "fengguang.wu@intel.com",
        "time": "Sun Nov 13 19:47:32 2011 -0600"
      },
      "committer": {
        "name": "Wu Fengguang",
        "email": "fengguang.wu@intel.com",
        "time": "Thu Nov 17 20:39:32 2011 +0800"
      },
      "message": "writeback: hard throttle 1000+ dd on a slow USB stick\n\nThe sleep based balance_dirty_pages() can pause at most MAX_PAUSE\u003d200ms\non every 1 4KB-page, which means it cannot throttle a task under\n4KB/200ms\u003d20KB/s. So when there are more than 512 dd writing to a\n10MB/s USB stick, its bdi dirty pages could grow out of control.\n\nEven if we can increase MAX_PAUSE, the minimal (task_ratelimit \u003d 1)\nmeans a limit of 4KB/s.\n                                                       \nThey can eventually be safeguarded by the global limit check \n(nr_dirty \u003c dirty_thresh). However if someone is also writing to an \nHDD at the same time, it\u0027ll get poor HDD write performance.\n                                                       \nWe at least want to maintain good write performance for other devices\nwhen one device is attacked by some \"massive parallel\" workload, or\nsuffers from slow write bandwidth, or somehow get stalled due to some \nerror condition (eg. NFS server not responding).\n\nFor a stalled device, we need to completely block its dirtiers, too,\nbefore its bdi dirty pages grow all the way up to the global limit and\nleave no space for the other functional devices.\n\nSo change the loop exit condition to\n\n\t/*\n\t * Always enforce global dirty limit; also enforce bdi dirty limit\n\t * if the normal max_pause sleeps cannot keep things under control.\n\t */\n\tif (nr_dirty \u003c dirty_thresh \u0026\u0026\n\t    (bdi_dirty \u003c bdi_thresh || bdi-\u003edirty_ratelimit \u003e 1))\n\t\tbreak;\n\nwhich can be further simplified to\n\n\tif (task_ratelimit)\n\t\tbreak;\n\nSigned-off-by: Wu Fengguang \u003cfengguang.wu@intel.com\u003e\n"
    },
    {
      "commit": "499d05ecf990a7a7bbf9e0a273f9969f8ec69efc",
      "tree": "cbcdc35276936db1d63959261bfbc02dda2b48a3",
      "parents": [
        "6aaf05f472c97ebceff47d9eef464574f1a55727"
      ],
      "author": {
        "name": "Jan Kara",
        "email": "jack@suse.cz",
        "time": "Wed Nov 16 19:34:48 2011 +0800"
      },
      "committer": {
        "name": "Wu Fengguang",
        "email": "fengguang.wu@intel.com",
        "time": "Wed Nov 16 19:53:44 2011 +0800"
      },
      "message": "mm: Make task in balance_dirty_pages() killable\n\nThere is no reason why task in balance_dirty_pages() shouldn\u0027t be killable\nand it helps in recovering from some error conditions (like when filesystem\ngoes in error state and cannot accept writeback anymore but we still want to\nkill processes using it to be able to unmount it).\n\nThere will be follow up patches to further abort the generic_perform_write()\nand other filesystem write loops, to avoid large write + SIGKILL combination\nexceeding the dirty limit and possibly strange OOM.\n\nReported-by: Kazuya Mio \u003ck-mio@sx.jp.nec.com\u003e\nTested-by: Kazuya Mio \u003ck-mio@sx.jp.nec.com\u003e\nReviewed-by: Neil Brown \u003cneilb@suse.de\u003e\nReviewed-by: KOSAKI Motohiro \u003ckosaki.motohiro@jp.fujitsu.com\u003e\nSigned-off-by: Jan Kara \u003cjack@suse.cz\u003e\nSigned-off-by: Wu Fengguang \u003cfengguang.wu@intel.com\u003e\n"
    },
    {
      "commit": "3a73dbbc9bb3fc8594cd67af4db6c563175dfddb",
      "tree": "e5120c19fd8e83a38d5c0852336a92c5b7862c6a",
      "parents": [
        "31555213f03bca37d2c02e10946296052f4ecfcd"
      ],
      "author": {
        "name": "Wu Fengguang",
        "email": "fengguang.wu@intel.com",
        "time": "Mon Nov 07 19:19:28 2011 +0800"
      },
      "committer": {
        "name": "Wu Fengguang",
        "email": "fengguang.wu@intel.com",
        "time": "Mon Nov 07 19:19:28 2011 +0800"
      },
      "message": "writeback: fix uninitialized task_ratelimit\n\nIn balance_dirty_pages() task_ratelimit may be not initialized\n(initialization skiped by goto pause), and then used when calling\ntracing hook.\n\nFix it by moving the task_ratelimit assignment before goto pause.\n\nReported-by: Witold Baryluk \u003cbaryluk@smp.if.uj.edu.pl\u003e\nSigned-off-by: Wu Fengguang \u003cfengguang.wu@intel.com\u003e\n"
    },
    {
      "commit": "32aaeffbd4a7457bf2f7448b33b5946ff2a960eb",
      "tree": "faf7ad871d87176423ff9ed1d1ba4d9c688fc23f",
      "parents": [
        "208bca0860406d16398145ddd950036a737c3c9d",
        "67b84999b1a8b1af5625b1eabe92146c5eb42932"
      ],
      "author": {
        "name": "Linus Torvalds",
        "email": "torvalds@linux-foundation.org",
        "time": "Sun Nov 06 19:44:47 2011 -0800"
      },
      "committer": {
        "name": "Linus Torvalds",
        "email": "torvalds@linux-foundation.org",
        "time": "Sun Nov 06 19:44:47 2011 -0800"
      },
      "message": "Merge branch \u0027modsplit-Oct31_2011\u0027 of git://git.kernel.org/pub/scm/linux/kernel/git/paulg/linux\n\n* \u0027modsplit-Oct31_2011\u0027 of git://git.kernel.org/pub/scm/linux/kernel/git/paulg/linux: (230 commits)\n  Revert \"tracing: Include module.h in define_trace.h\"\n  irq: don\u0027t put module.h into irq.h for tracking irqgen modules.\n  bluetooth: macroize two small inlines to avoid module.h\n  ip_vs.h: fix implicit use of module_get/module_put from module.h\n  nf_conntrack.h: fix up fallout from implicit moduleparam.h presence\n  include: replace linux/module.h with \"struct module\" wherever possible\n  include: convert various register fcns to macros to avoid include chaining\n  crypto.h: remove unused crypto_tfm_alg_modname() inline\n  uwb.h: fix implicit use of asm/page.h for PAGE_SIZE\n  pm_runtime.h: explicitly requires notifier.h\n  linux/dmaengine.h: fix implicit use of bitmap.h and asm/page.h\n  miscdevice.h: fix up implicit use of lists and types\n  stop_machine.h: fix implicit use of smp.h for smp_processor_id\n  of: fix implicit use of errno.h in include/linux/of.h\n  of_platform.h: delete needless include \u003clinux/module.h\u003e\n  acpi: remove module.h include from platform/aclinux.h\n  miscdevice.h: delete unnecessary inclusion of module.h\n  device_cgroup.h: delete needless include \u003clinux/module.h\u003e\n  net: sch_generic remove redundant use of \u003clinux/module.h\u003e\n  net: inet_timewait_sock doesnt need \u003clinux/module.h\u003e\n  ...\n\nFix up trivial conflicts (other header files, and  removal of the ab3550 mfd driver) in\n - drivers/media/dvb/frontends/dibx000_common.c\n - drivers/media/video/{mt9m111.c,ov6650.c}\n - drivers/mfd/ab3550-core.c\n - include/linux/dmaengine.h\n"
    },
    {
      "commit": "208bca0860406d16398145ddd950036a737c3c9d",
      "tree": "7797a16c17d8bd155120126fa7976727fc6de013",
      "parents": [
        "6aad3738f6a79fd0ca480eaceefe064cc471f6eb",
        "0e175a1835ffc979e55787774e58ec79e41957d7"
      ],
      "author": {
        "name": "Linus Torvalds",
        "email": "torvalds@linux-foundation.org",
        "time": "Sun Nov 06 19:02:23 2011 -0800"
      },
      "committer": {
        "name": "Linus Torvalds",
        "email": "torvalds@linux-foundation.org",
        "time": "Sun Nov 06 19:02:23 2011 -0800"
      },
      "message": "Merge branch \u0027writeback-for-linus\u0027 of git://git.kernel.org/pub/scm/linux/kernel/git/wfg/linux\n\n* \u0027writeback-for-linus\u0027 of git://git.kernel.org/pub/scm/linux/kernel/git/wfg/linux:\n  writeback: Add a \u0027reason\u0027 to wb_writeback_work\n  writeback: send work item to queue_io, move_expired_inodes\n  writeback: trace event balance_dirty_pages\n  writeback: trace event bdi_dirty_ratelimit\n  writeback: fix ppc compile warnings on do_div(long long, unsigned long)\n  writeback: per-bdi background threshold\n  writeback: dirty position control - bdi reserve area\n  writeback: control dirty pause time\n  writeback: limit max dirty pause time\n  writeback: IO-less balance_dirty_pages()\n  writeback: per task dirty rate limit\n  writeback: stabilize bdi-\u003edirty_ratelimit\n  writeback: dirty rate control\n  writeback: add bg_threshold parameter to __bdi_update_bandwidth()\n  writeback: dirty position control\n  writeback: account per-bdi accumulated dirtied pages\n"
    },
    {
      "commit": "d08c429b06d21bd2add88aea2cd1996f1b9b3bda",
      "tree": "7a7f0002e4747ebc70978dcda565a09a943dc992",
      "parents": [
        "3da367c3e5fca71d4e778fa565d9b098d5518f4a"
      ],
      "author": {
        "name": "Johannes Weiner",
        "email": "jweiner@redhat.com",
        "time": "Mon Oct 31 17:07:05 2011 -0700"
      },
      "committer": {
        "name": "Linus Torvalds",
        "email": "torvalds@linux-foundation.org",
        "time": "Mon Oct 31 17:30:45 2011 -0700"
      },
      "message": "mm/page-writeback.c: document bdi_min_ratio\n\nLooks like someone got distracted after adding the comment characters.\n\nSigned-off-by: Johannes Weiner \u003cjweiner@redhat.com\u003e\nAcked-by: Peter Zijlstra \u003cpeterz@infradead.org\u003e\nCc: Wu Fengguang \u003cfengguang.wu@intel.com\u003e\nSigned-off-by: Andrew Morton \u003cakpm@linux-foundation.org\u003e\nSigned-off-by: Linus Torvalds \u003ctorvalds@linux-foundation.org\u003e\n"
    },
    {
      "commit": "b95f1b31b75588306e32b2afd32166cad48f670b",
      "tree": "b5496144e41b117cfe5ae70b145b5351709ec4d0",
      "parents": [
        "b9e15bafdf1aa20791cdefdcbf1ccf7d7aa03aaa"
      ],
      "author": {
        "name": "Paul Gortmaker",
        "email": "paul.gortmaker@windriver.com",
        "time": "Sun Oct 16 02:01:52 2011 -0400"
      },
      "committer": {
        "name": "Paul Gortmaker",
        "email": "paul.gortmaker@windriver.com",
        "time": "Mon Oct 31 09:20:12 2011 -0400"
      },
      "message": "mm: Map most files to use export.h instead of module.h\n\nThe files changed within are only using the EXPORT_SYMBOL\nmacro variants.  They are not using core modular infrastructure\nand hence don\u0027t need module.h but only the export.h header.\n\nSigned-off-by: Paul Gortmaker \u003cpaul.gortmaker@windriver.com\u003e\n"
    },
    {
      "commit": "0e175a1835ffc979e55787774e58ec79e41957d7",
      "tree": "6ec4b65a8de4e9d1c12d26a1079079ed81d79450",
      "parents": [
        "ad4e38dd6a33bb3a4882c487d7abe621e583b982"
      ],
      "author": {
        "name": "Curt Wohlgemuth",
        "email": "curtw@google.com",
        "time": "Fri Oct 07 21:54:10 2011 -0600"
      },
      "committer": {
        "name": "Wu Fengguang",
        "email": "fengguang.wu@intel.com",
        "time": "Mon Oct 31 00:33:36 2011 +0800"
      },
      "message": "writeback: Add a \u0027reason\u0027 to wb_writeback_work\n\nThis creates a new \u0027reason\u0027 field in a wb_writeback_work\nstructure, which unambiguously identifies who initiates\nwriteback activity.  A \u0027wb_reason\u0027 enumeration has been\nadded to writeback.h, to enumerate the possible reasons.\n\nThe \u0027writeback_work_class\u0027 and tracepoint event class and\n\u0027writeback_queue_io\u0027 tracepoints are updated to include the\nsymbolic \u0027reason\u0027 in all trace events.\n\nAnd the \u0027writeback_inodes_sbXXX\u0027 family of routines has had\na wb_stats parameter added to them, so callers can specify\nwhy writeback is being started.\n\nAcked-by: Jan Kara \u003cjack@suse.cz\u003e\nSigned-off-by: Curt Wohlgemuth \u003ccurtw@google.com\u003e\nSigned-off-by: Wu Fengguang \u003cfengguang.wu@intel.com\u003e\n"
    },
    {
      "commit": "ece13ac31bbe492d940ba0bc4ade2ae1521f46a5",
      "tree": "2bfddab0f62999bf595a72913b79cabafbad0e40",
      "parents": [
        "b48c104d2211b0ac881a71f5f76a3816225f8111"
      ],
      "author": {
        "name": "Wu Fengguang",
        "email": "fengguang.wu@intel.com",
        "time": "Sun Aug 29 23:33:20 2010 -0600"
      },
      "committer": {
        "name": "Wu Fengguang",
        "email": "fengguang.wu@intel.com",
        "time": "Mon Oct 31 00:29:38 2011 +0800"
      },
      "message": "writeback: trace event balance_dirty_pages\n\nUseful for analyzing the dynamics of the throttling algorithms and\ndebugging user reported problems.\n\nSigned-off-by: Wu Fengguang \u003cfengguang.wu@intel.com\u003e\n"
    },
    {
      "commit": "b48c104d2211b0ac881a71f5f76a3816225f8111",
      "tree": "b947f3fd4c8b49ee12d516f3eb520209c577387b",
      "parents": [
        "50657fc4dfa7e345a1008f7c1de0bf930bbecca9"
      ],
      "author": {
        "name": "Wu Fengguang",
        "email": "fengguang.wu@intel.com",
        "time": "Wed Mar 02 17:22:49 2011 -0600"
      },
      "committer": {
        "name": "Wu Fengguang",
        "email": "fengguang.wu@intel.com",
        "time": "Mon Oct 31 00:29:21 2011 +0800"
      },
      "message": "writeback: trace event bdi_dirty_ratelimit\n\nIt helps understand how various throttle bandwidths are updated.\n\nSigned-off-by: Wu Fengguang \u003cfengguang.wu@intel.com\u003e\n"
    },
    {
      "commit": "50657fc4dfa7e345a1008f7c1de0bf930bbecca9",
      "tree": "b1b1da53bc881b021635d9a43bad0047390485d2",
      "parents": [
        "b00949aa2df9970a912bf060bc95e99da356881c"
      ],
      "author": {
        "name": "Wu Fengguang",
        "email": "fengguang.wu@intel.com",
        "time": "Tue Oct 11 17:06:33 2011 -0600"
      },
      "committer": {
        "name": "Wu Fengguang",
        "email": "fengguang.wu@intel.com",
        "time": "Tue Oct 11 17:45:24 2011 +0800"
      },
      "message": "writeback: fix ppc compile warnings on do_div(long long, unsigned long)\n\nFix powerpc compile warnings\n\nmm/page-writeback.c: In function \u0027bdi_position_ratio\u0027:\nmm/page-writeback.c:622:3: warning: comparison of distinct pointer types lacks a cast [enabled by default]\npage-writeback.c:635:4: warning: comparison of distinct pointer types lacks a cast [enabled by default]\n\nAlso fix gcc \"uninitialized var\" warnings.\n\nReported-by: Stephen Rothwell \u003csfr@canb.auug.org.au\u003e\nSigned-off-by: Wu Fengguang \u003cfengguang.wu@intel.com\u003e\n"
    },
    {
      "commit": "8927f66c4ede9a18b4b58f7e6f9debca67065f6b",
      "tree": "f7c8490ab23a20cb86874ca8112f3dd1fc6002ae",
      "parents": [
        "57fc978cfb61ed40a7bbfe5a569359159ba31abd"
      ],
      "author": {
        "name": "Wu Fengguang",
        "email": "fengguang.wu@intel.com",
        "time": "Thu Aug 04 22:16:46 2011 -0600"
      },
      "committer": {
        "name": "Wu Fengguang",
        "email": "fengguang.wu@intel.com",
        "time": "Mon Oct 03 21:08:58 2011 +0800"
      },
      "message": "writeback: dirty position control - bdi reserve area\n\nKeep a minimal pool of dirty pages for each bdi, so that the disk IO\nqueues won\u0027t underrun. Also gently increase a small bdi_thresh to avoid\nit stuck in 0 for some light dirtied bdi.\n\nIt\u0027s particularly useful for JBOD and small memory system.\n\nIt may result in (pos_ratio \u003e 1) at the setpoint and push the dirty\npages high. This is more or less intended because the bdi is in the\ndanger of IO queue underflow.\n\nSigned-off-by: Wu Fengguang \u003cfengguang.wu@intel.com\u003e\n"
    },
    {
      "commit": "57fc978cfb61ed40a7bbfe5a569359159ba31abd",
      "tree": "870ffd08e0c1bb0dde55e4f1ed4dfa2bda8e3a80",
      "parents": [
        "c8462cc9de9e92264ec647903772f6036a99b286"
      ],
      "author": {
        "name": "Wu Fengguang",
        "email": "fengguang.wu@intel.com",
        "time": "Sat Jun 11 19:32:32 2011 -0600"
      },
      "committer": {
        "name": "Wu Fengguang",
        "email": "fengguang.wu@intel.com",
        "time": "Mon Oct 03 21:08:58 2011 +0800"
      },
      "message": "writeback: control dirty pause time\n\nThe dirty pause time shall ultimately be controlled by adjusting\nnr_dirtied_pause, since there is relationship\n\n\tpause \u003d pages_dirtied / task_ratelimit\n\nAssuming\n\n\tpages_dirtied ~\u003d nr_dirtied_pause\n\ttask_ratelimit ~\u003d dirty_ratelimit\n\nWe get\n\n\tnr_dirtied_pause ~\u003d dirty_ratelimit * desired_pause\n\nHere dirty_ratelimit is preferred over task_ratelimit because it\u0027s\nmore stable.\n\nIt\u0027s also important to limit possible large transitional errors:\n\n- bw is changing quickly\n- pages_dirtied \u003c\u003c nr_dirtied_pause on entering dirty exceeded area\n- pages_dirtied \u003e\u003e nr_dirtied_pause on btrfs (to be improved by a\n  separate fix, but still expect non-trivial errors)\n\nSo we end up using the above formula inside clamp_val().\n\nThe best test case for this code is to run 100 \"dd bs\u003d4M\" tasks on\nbtrfs and check its pause time distribution.\n\nSigned-off-by: Wu Fengguang \u003cfengguang.wu@intel.com\u003e\n"
    },
    {
      "commit": "c8462cc9de9e92264ec647903772f6036a99b286",
      "tree": "f442132f53651a04e67f3a119ead9f54be51a6cb",
      "parents": [
        "143dfe8611a63030ce0c79419dc362f7838be557"
      ],
      "author": {
        "name": "Wu Fengguang",
        "email": "fengguang.wu@intel.com",
        "time": "Sat Jun 11 19:21:43 2011 -0600"
      },
      "committer": {
        "name": "Wu Fengguang",
        "email": "fengguang.wu@intel.com",
        "time": "Mon Oct 03 21:08:57 2011 +0800"
      },
      "message": "writeback: limit max dirty pause time\n\nApply two policies to scale down the max pause time for\n\n1) small number of concurrent dirtiers\n2) small memory system (comparing to storage bandwidth)\n\nMAX_PAUSE\u003d200ms may only be suitable for high end servers with lots of\nconcurrent dirtiers, where the large pause time can reduce much overheads.\n\nOtherwise, smaller pause time is desirable whenever possible, so as to\nget good responsiveness and smooth user experiences. It\u0027s actually\nrequired for good disk utilization in the case when all the dirty pages\ncan be synced to disk within MAX_PAUSE\u003d200ms.\n\nSigned-off-by: Wu Fengguang \u003cfengguang.wu@intel.com\u003e\n"
    },
    {
      "commit": "143dfe8611a63030ce0c79419dc362f7838be557",
      "tree": "626b823d86fbb947296fc6c7fe2be324a85f3b5c",
      "parents": [
        "9d823e8f6b1b7b39f952d7d1795f29162143a433"
      ],
      "author": {
        "name": "Wu Fengguang",
        "email": "fengguang.wu@intel.com",
        "time": "Fri Aug 27 18:45:12 2010 -0600"
      },
      "committer": {
        "name": "Wu Fengguang",
        "email": "fengguang.wu@intel.com",
        "time": "Mon Oct 03 21:08:57 2011 +0800"
      },
      "message": "writeback: IO-less balance_dirty_pages()\n\nAs proposed by Chris, Dave and Jan, don\u0027t start foreground writeback IO\ninside balance_dirty_pages(). Instead, simply let it idle sleep for some\ntime to throttle the dirtying task. In the mean while, kick off the\nper-bdi flusher thread to do background writeback IO.\n\nRATIONALS\n\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\n\n- disk seeks on concurrent writeback of multiple inodes (Dave Chinner)\n\n  If every thread doing writes and being throttled start foreground\n  writeback, it leads to N IO submitters from at least N different\n  inodes at the same time, end up with N different sets of IO being\n  issued with potentially zero locality to each other, resulting in\n  much lower elevator sort/merge efficiency and hence we seek the disk\n  all over the place to service the different sets of IO.\n  OTOH, if there is only one submission thread, it doesn\u0027t jump between\n  inodes in the same way when congestion clears - it keeps writing to\n  the same inode, resulting in large related chunks of sequential IOs\n  being issued to the disk. 
This is more efficient than the above\n  foreground writeback because the elevator works better and the disk\n  seeks less.\n\n- lock contention and cache bouncing on concurrent IO submitters (Dave Chinner)\n\n  With this patchset, the fs_mark benchmark on a 12-drive software RAID0 goes\n  from CPU bound to IO bound, freeing \"3-4 CPUs worth of spinlock contention\".\n\n  * \"CPU usage has dropped by ~55%\", \"it certainly appears that most of\n    the CPU time saving comes from the removal of contention on the\n    inode_wb_list_lock\" (IMHO at least 10% comes from the reduction of\n    cacheline bouncing, because the new code is able to call much less\n    frequently into balance_dirty_pages() and hence access the global\n    page states)\n\n  * the user space \"App overhead\" is reduced by 20%, by avoiding the\n    cacheline pollution by the complex writeback code path\n\n  * \"for a ~5% throughput reduction\", \"the number of write IOs have\n    dropped by ~25%\", and the elapsed time reduced from 41:42.17 to\n    40:53.23.\n\n  * On a simple test of 100 dd, it reduces the CPU %system time from 30% to 3%,\n    and improves IO throughput from 38MB/s to 42MB/s.\n\n- IO size too small for fast arrays and too large for slow USB sticks\n\n  The write_chunk used by current balance_dirty_pages() cannot be\n  directly set to some large value (eg. 128MB) for better IO efficiency.\n  Because it could lead to more than 1 second user perceivable stalls.\n  Even the current 4MB write size may be too large for slow USB sticks.\n  The fact that balance_dirty_pages() starts IO on itself couples the\n  IO size to wait time, which makes it hard to do suitable IO size while\n  keeping the wait time under control.\n\n  Now it\u0027s possible to increase writeback chunk size proportional to the\n  disk bandwidth. 
In a simple test of 50 dd\u0027s on XFS, 1-HDD, 3GB ram,\n  the larger writeback size dramatically reduces the seek count to 1/10\n  (far beyond my expectation) and improves the write throughput by 24%.\n\n- long block time in balance_dirty_pages() hurts desktop responsiveness\n\n  Many of us may have the experience: it often takes a couple of seconds\n  or even long time to stop a heavy writing dd/cp/tar command with\n  Ctrl-C or \"kill -9\".\n\n- IO pipeline broken by bumpy write() progress\n\n  There are a broad class of \"loop {read(buf); write(buf);}\" applications\n  whose read() pipeline will be under-utilized or even come to a stop if\n  the write()s have long latencies _or_ don\u0027t progress in a constant rate.\n  The current threshold based throttling inherently transfers the large\n  low level IO completion fluctuations to bumpy application write()s,\n  and further deteriorates with increasing number of dirtiers and/or bdi\u0027s.\n\n  For example, when doing 50 dd\u0027s + 1 remote rsync to an XFS partition,\n  the rsync progresses very bumpy in legacy kernel, and throughput is\n  improved by 67% by this patchset. (plus the larger write chunk size,\n  it will be 93% speedup).\n\n  The new rate based throttling can support 1000+ dd\u0027s with excellent\n  smoothness, low latency and low overheads.\n\nFor the above reasons, it\u0027s much better to do IO-less and low latency\npauses in balance_dirty_pages().\n\nJan Kara, Dave Chinner and me explored the scheme to let\nbalance_dirty_pages() wait for enough writeback IO completions to\nsafeguard the dirty limit. However it\u0027s found to have two problems:\n\n- in large NUMA systems, the per-cpu counters may have big accounting\n  errors, leading to big throttle wait time and jitters.\n\n- NFS may kill large amount of unstable pages with one single COMMIT.\n  Because NFS server serves COMMIT with expensive fsync() IOs, it is\n  desirable to delay and reduce the number of COMMITs. 
So it\u0027s not\n  likely that such bursty IO completions can be optimized away, nor the\n  resulting large (and tiny) stall times in IO completion based throttling.\n\nSo here is a pause time oriented approach, which tries to control the\npause time in each balance_dirty_pages() invocation, by controlling\nthe number of pages dirtied before calling balance_dirty_pages(), for\nsmooth and efficient dirty throttling:\n\n- avoid useless (eg. zero pause time) balance_dirty_pages() calls\n- avoid too small pause time (less than   4ms, which burns CPU power)\n- avoid too large pause time (more than 200ms, which hurts responsiveness)\n- avoid big fluctuations of pause times\n\nIt can control pause times at will. The default policy (in a followup\npatch) will be to do ~10ms pauses in the 1-dd case, and increase to ~100ms\nin the 1000-dd case.\n\nBEHAVIOR CHANGE\n\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\n\n(1) dirty threshold\n\nUsers will notice that the applications will get throttled once crossing\nthe global (background + dirty)/2\u003d15% threshold, and then balanced around\n17.5%. Before this patch, the behavior was to just throttle it at 20% dirtyable\nmemory in the 1-dd case.\n\nSince the task will be soft throttled earlier than before, it may be\nperceived by end users as a performance \"slowdown\" if their application\nhappens to dirty more than 15% dirtyable memory.\n\n(2) smoothness/responsiveness\n\nUsers will notice a more responsive system during heavy writeback.\n\"killall dd\" will take effect instantly.\n\nSigned-off-by: Wu Fengguang \u003cfengguang.wu@intel.com\u003e\n"
    },
    {
      "commit": "9d823e8f6b1b7b39f952d7d1795f29162143a433",
      "tree": "2ef4c0d29353452dd2f894e7dbd240a31bdd0a02",
      "parents": [
        "7381131cbcf7e15d201a0ffd782a4698efe4e740"
      ],
      "author": {
        "name": "Wu Fengguang",
        "email": "fengguang.wu@intel.com",
        "time": "Sat Jun 11 18:10:12 2011 -0600"
      },
      "committer": {
        "name": "Wu Fengguang",
        "email": "fengguang.wu@intel.com",
        "time": "Mon Oct 03 21:08:57 2011 +0800"
      },
      "message": "writeback: per task dirty rate limit\n\nAdd two fields to task_struct.\n\n1) account dirtied pages in the individual tasks, for accuracy\n2) per-task balance_dirty_pages() call intervals, for flexibility\n\nThe balance_dirty_pages() call interval (ie. nr_dirtied_pause) will\nscale near-sqrt to the safety gap between dirty pages and threshold.\n\nThe main problem of per-task nr_dirtied is, if 1k+ tasks start dirtying\npages at exactly the same time, each task will be assigned a large\ninitial nr_dirtied_pause, so that the dirty threshold will be exceeded\nlong before each task reached its nr_dirtied_pause and hence call\nbalance_dirty_pages().\n\nThe solution is to watch for the number of pages dirtied on each CPU in\nbetween the calls into balance_dirty_pages(). If it exceeds ratelimit_pages\n(3% dirty threshold), force call balance_dirty_pages() for a chance to\nset bdi-\u003edirty_exceeded. In normal situations, this safeguarding\ncondition is not expected to trigger at all.\n\nOn the sqrt in dirty_poll_interval():\n\nIt will serve as an initial guess when dirty pages are still in the\nfreerun area.\n\nWhen dirty pages are floating inside the dirty control scope [freerun,\nlimit], a followup patch will use some refined dirty poll interval to\nget the desired pause time.\n\n   thresh-dirty (MB)    sqrt\n\t\t   1      16\n\t\t   2      22\n\t\t   4      32\n\t\t   8      45\n\t\t  16      64\n\t\t  32      90\n\t\t  64     128\n\t\t 128     181\n\t\t 256     256\n\t\t 512     362\n\t\t1024     512\n\nThe above table means, given 1MB (or 1GB) gap and the dd tasks polling\nbalance_dirty_pages() on every 16 (or 512) pages, the dirty limit won\u0027t\nbe exceeded as long as there are less than 16 (or 512) concurrent dd\u0027s.\n\nSo sqrt naturally leads to less overheads and more safe concurrent tasks\nfor large memory servers, which have large (thresh-freerun) gaps.\n\npeter: keep the per-CPU ratelimit for safeguarding the 1k+ tasks case\n\nCC: Peter 
Zijlstra \u003ca.p.zijlstra@chello.nl\u003e\nReviewed-by: Andrea Righi \u003candrea@betterlinux.com\u003e\nSigned-off-by: Wu Fengguang \u003cfengguang.wu@intel.com\u003e\n"
    },
    {
      "commit": "7381131cbcf7e15d201a0ffd782a4698efe4e740",
      "tree": "83f00c40d0a3fcd41ff2e6681a5da70dd155628a",
      "parents": [
        "be3ffa276446e1b691a2bf84e7621e5a6fb49db9"
      ],
      "author": {
        "name": "Wu Fengguang",
        "email": "fengguang.wu@intel.com",
        "time": "Fri Aug 26 15:53:24 2011 -0600"
      },
      "committer": {
        "name": "Wu Fengguang",
        "email": "fengguang.wu@intel.com",
        "time": "Mon Oct 03 21:08:57 2011 +0800"
      },
      "message": "writeback: stabilize bdi-\u003edirty_ratelimit\n\nThere are some imperfections in balanced_dirty_ratelimit.\n\n1) large fluctuations\n\nThe dirty_rate used for computing balanced_dirty_ratelimit is merely\naveraged in the past 200ms (very small comparing to the 3s estimation\nperiod for write_bw), which makes rather dispersed distribution of\nbalanced_dirty_ratelimit.\n\nIt\u0027s pretty hard to average out the singular points by increasing the\nestimation period. Considering that the averaging technique will\nintroduce very undesirable time lags, I give it up totally. (btw, the 3s\nwrite_bw averaging time lag is much more acceptable because its impact\nis one-way and therefore won\u0027t lead to oscillations.)\n\nThe more practical way is filtering -- most singular\nbalanced_dirty_ratelimit points can be filtered out by remembering some\nprev_balanced_rate and prev_prev_balanced_rate. However the more\nreliable way is to guard balanced_dirty_ratelimit with task_ratelimit.\n\n2) due to truncates and fs redirties, the (write_bw \u003c\u003d\u003e dirty_rate)\nmatch could become unbalanced, which may lead to large systematical\nerrors in balanced_dirty_ratelimit. The truncates, due to its possibly\nbumpy nature, can hardly be compensated smoothly. So let\u0027s face it. When\nsome over-estimated balanced_dirty_ratelimit brings dirty_ratelimit\nhigh, dirty pages will go higher than the setpoint. task_ratelimit will\nin turn become lower than dirty_ratelimit.  
So if we consider both\nbalanced_dirty_ratelimit and task_ratelimit and update dirty_ratelimit\nonly when they are on the same side of dirty_ratelimit, the systematic\nerrors in balanced_dirty_ratelimit won\u0027t be able to bring\ndirty_ratelimit far away.\n\nThe balanced_dirty_ratelimit estimation may also be inaccurate near\n@limit or @freerun, but that is less of an issue.\n\n3) since we ultimately want to\n\n- keep the fluctuations of task ratelimit as small as possible\n- keep the dirty pages around the setpoint for as long as possible\n\nthe update policy used for (2) also serves the above goals nicely:\nif for some reason the dirty pages are high (task_ratelimit \u003c dirty_ratelimit),\nand dirty_ratelimit is low (dirty_ratelimit \u003c balanced_dirty_ratelimit),\nthere is no point in bringing up dirty_ratelimit in a hurry only to hurt\nboth of the above goals.\n\nSo, we make use of task_ratelimit to limit the update of dirty_ratelimit\nin two ways:\n\n1) avoid changing dirty rate when it\u0027s against the position control target\n   (the adjusted rate will slow down the progress of dirty pages going\n   back to setpoint).\n\n2) limit the step size. task_ratelimit is changing values step by step,\n   leaving a consistent trace compared to the randomly jumping\n   balanced_dirty_ratelimit. task_ratelimit also nicely has smaller\n   errors in stable state and typically larger errors when there are big\n   errors in rate.  So it\u0027s a pretty good limiting factor for the step\n   size of dirty_ratelimit.\n\nNote that bdi-\u003edirty_ratelimit is always tracking balanced_dirty_ratelimit.\ntask_ratelimit is merely used as a limiting factor.\n\nSigned-off-by: Wu Fengguang \u003cfengguang.wu@intel.com\u003e\n"
    },
    {
      "commit": "be3ffa276446e1b691a2bf84e7621e5a6fb49db9",
      "tree": "ca1b112195a9a8b63265f3204748cb23cff5b653",
      "parents": [
        "af6a311384bce6c88e15c80ab22ab051a918b4eb"
      ],
      "author": {
        "name": "Wu Fengguang",
        "email": "fengguang.wu@intel.com",
        "time": "Sun Jun 12 10:51:31 2011 -0600"
      },
      "committer": {
        "name": "Wu Fengguang",
        "email": "fengguang.wu@intel.com",
        "time": "Mon Oct 03 21:08:56 2011 +0800"
      },
      "message": "writeback: dirty rate control\n\nIt\u0027s all about bdi-\u003edirty_ratelimit, which aims to be (write_bw / N)\nwhen there are N dd tasks.\n\nOn write() syscall, use bdi-\u003edirty_ratelimit\n\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\n\n    balance_dirty_pages(pages_dirtied)\n    {\n        task_ratelimit \u003d bdi-\u003edirty_ratelimit * bdi_position_ratio();\n        pause \u003d pages_dirtied / task_ratelimit;\n        sleep(pause);\n    }\n\nOn every 200ms, update bdi-\u003edirty_ratelimit\n\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\n\n    bdi_update_dirty_ratelimit()\n    {\n        task_ratelimit \u003d bdi-\u003edirty_ratelimit * bdi_position_ratio();\n        balanced_dirty_ratelimit \u003d task_ratelimit * write_bw / dirty_rate;\n        bdi-\u003edirty_ratelimit \u003d balanced_dirty_ratelimit\n    }\n\nEstimation of balanced bdi-\u003edirty_ratelimit\n\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\n\nbalanced task_ratelimit\n-----------------------\n\nbalance_dirty_pages() needs to throttle tasks dirtying pages such that\nthe total amount of dirty pages stays below the specified dirty limit in\norder to avoid memory deadlocks. 
Furthermore we desire fairness in that\ntasks get throttled proportionally to the amount of pages they dirty.\n\nIOW we want to throttle tasks such that we match the dirty rate to the\nwriteout bandwidth, this yields a stable amount of dirty pages:\n\n        dirty_rate \u003d\u003d write_bw                                          (1)\n\nThe fairness requirement gives us:\n\n        task_ratelimit \u003d balanced_dirty_ratelimit\n                       \u003d\u003d write_bw / N                                  (2)\n\nwhere N is the number of dd tasks.  We don\u0027t know N beforehand, but\nstill can estimate balanced_dirty_ratelimit within 200ms.\n\nStart by throttling each dd task at rate\n\n        task_ratelimit \u003d task_ratelimit_0                               (3)\n                         (any non-zero initial value is OK)\n\nAfter 200ms, we measured\n\n        dirty_rate \u003d # of pages dirtied by all dd\u0027s / 200ms\n        write_bw   \u003d # of pages written to the disk / 200ms\n\nFor the aggressive dd dirtiers, the equality holds\n\n        dirty_rate \u003d\u003d N * task_rate\n                   \u003d\u003d N * task_ratelimit_0                              (4)\nOr\n        task_ratelimit_0 \u003d\u003d dirty_rate / N                              (5)\n\nNow we conclude that the balanced task ratelimit can be estimated by\n\n                                                      write_bw\n        balanced_dirty_ratelimit \u003d task_ratelimit_0 * ----------        (6)\n                                                      dirty_rate\n\nBecause with (4) and (5) we can get the desired equality (1):\n\n                                                       write_bw\n        balanced_dirty_ratelimit \u003d\u003d (dirty_rate / N) * ----------\n                                                       dirty_rate\n                                 \u003d\u003d write_bw / N\n\nThen using the balanced task ratelimit we can compute task pause times like:\n\n  
      task_pause \u003d task-\u003enr_dirtied / task_ratelimit\n\ntask_ratelimit with position control\n------------------------------------\n\nHowever, while the above gives us means of matching the dirty rate to\nthe writeout bandwidth, it at best provides us with a stable dirty page\ncount (assuming a static system). In order to control the dirty page\ncount such that it is high enough to provide performance, but does not\nexceed the specified limit we need another control.\n\nThe dirty position control works by extending (2) to\n\n        task_ratelimit \u003d balanced_dirty_ratelimit * pos_ratio           (7)\n\nwhere pos_ratio is a negative feedback function that is subject to\n\n1) f(setpoint) \u003d 1.0\n2) df/dx \u003c 0\n\nThat is, if the dirty pages are ABOVE the setpoint, we throttle each\ntask a bit more HEAVILY than balanced_dirty_ratelimit, so that the dirty\npages are created less fast than they are cleaned, thus DROP to the\nsetpoint (and the reverse).\n\nBased on (7) and the assumption that both dirty_ratelimit and pos_ratio\nremain CONSTANT for the past 200ms, we get\n\n        task_ratelimit_0 \u003d balanced_dirty_ratelimit * pos_ratio         (8)\n\nPutting (8) into (6), we get the formula used in\nbdi_update_dirty_ratelimit():\n\n                                                write_bw\n        balanced_dirty_ratelimit *\u003d pos_ratio * ----------              (9)\n                                                dirty_rate\n\nSigned-off-by: Wu Fengguang \u003cfengguang.wu@intel.com\u003e\n"
    },
    {
      "commit": "af6a311384bce6c88e15c80ab22ab051a918b4eb",
      "tree": "55ebac9ff575b3b6b4cfe46a38282c007c62d188",
      "parents": [
        "6c14ae1e92c77eabd3e7527cf2e7836cde8b8487"
      ],
      "author": {
        "name": "Wu Fengguang",
        "email": "fengguang.wu@intel.com",
        "time": "Mon Oct 03 20:46:17 2011 -0600"
      },
      "committer": {
        "name": "Wu Fengguang",
        "email": "fengguang.wu@intel.com",
        "time": "Mon Oct 03 21:08:56 2011 +0800"
      },
      "message": "writeback: add bg_threshold parameter to __bdi_update_bandwidth()\n\nNo behavior change.\n\nSigned-off-by: Wu Fengguang \u003cfengguang.wu@intel.com\u003e\n"
    },
    {
      "commit": "6c14ae1e92c77eabd3e7527cf2e7836cde8b8487",
      "tree": "dd5fdede1873b246bbeb3518a9a741bb057db0c5",
      "parents": [
        "c8e28ce049faa53a470c132893abbc9f2bde9420"
      ],
      "author": {
        "name": "Wu Fengguang",
        "email": "fengguang.wu@intel.com",
        "time": "Wed Mar 02 16:04:18 2011 -0600"
      },
      "committer": {
        "name": "Wu Fengguang",
        "email": "fengguang.wu@intel.com",
        "time": "Mon Oct 03 21:08:56 2011 +0800"
      },
      "message": "writeback: dirty position control\n\nbdi_position_ratio() provides a scale factor to bdi-\u003edirty_ratelimit, so\nthat the resulted task rate limit can drive the dirty pages back to the\nglobal/bdi setpoints.\n\nOld scheme is,\n                                          |\n                           free run area  |  throttle area\n  ----------------------------------------+----------------------------\u003e\n                                    thresh^                  dirty pages\n\nNew scheme is,\n\n  ^ task rate limit\n  |\n  |            *\n  |             *\n  |              *\n  |[free run]      *      [smooth throttled]\n  |                  *\n  |                     *\n  |                         *\n  ..bdi-\u003edirty_ratelimit..........*\n  |                               .     *\n  |                               .          *\n  |                               .              *\n  |                               .                 *\n  |                               .                    
*\n  +-------------------------------.-----------------------*------------\u003e\n                          setpoint^                  limit^  dirty pages\n\nThe slope of the bdi control line should be\n\n1) large enough to pull the dirty pages to setpoint reasonably fast\n\n2) small enough to avoid big fluctuations in the resulted pos_ratio and\n   hence task ratelimit\n\nSince the fluctuation range of the bdi dirty pages is typically observed\nto be within 1-second worth of data, the bdi control line\u0027s slope is\nselected to be a linear function of bdi write bandwidth, so that it can\nadapt to slow/fast storage devices well.\n\nAssume the bdi control line\n\n\tpos_ratio \u003d 1.0 + k * (dirty - bdi_setpoint)\n\nwhere k is the negative slope.\n\nIf targeting for 12.5% fluctuation range in pos_ratio when dirty pages\nare fluctuating in range\n\n\t[bdi_setpoint - write_bw/2, bdi_setpoint + write_bw/2],\n\nwe get slope\n\n\tk \u003d - 1 / (8 * write_bw)\n\nLet pos_ratio(x_intercept) \u003d 0, we get the parameter used in code:\n\n\tx_intercept \u003d bdi_setpoint + 8 * write_bw\n\nThe global/bdi slopes are nicely complementing each other when the\nsystem has only one major bdi (indicated by bdi_thresh ~\u003d thresh):\n\n1) slope of global control line    \u003d\u003e scaling to the control scope size\n2) slope of main bdi control line  \u003d\u003e scaling to the writeout bandwidth\n\nso that\n\n- in memory tight systems, (1) becomes strong enough to squeeze dirty\n  pages inside the control scope\n\n- in large memory systems where the \"gravity\" of (1) for pulling the\n  dirty pages to setpoint is too weak, (2) can back (1) up and drive\n  dirty pages to bdi_setpoint ~\u003d setpoint reasonably fast.\n\nUnfortunately in JBOD setups, the fluctuation range of bdi threshold\nis related to memory size due to the interferences between disks.  
In\nthis case, the bdi slope will be a weighted sum of write_bw and bdi_thresh.\n\nGiven equations\n\n        span \u003d x_intercept - bdi_setpoint\n        k \u003d df/dx \u003d - 1 / span\n\nand the extremum values\n\n        span \u003d bdi_thresh\n        dx \u003d bdi_thresh\n\nwe get\n\n        df \u003d - dx / span \u003d - 1.0\n\nThat means, when bdi_dirty deviates upward by bdi_thresh, pos_ratio and hence\ntask ratelimit will fluctuate by -100%.\n\npeter: use 3rd order polynomial for the global control line\n\nCC: Peter Zijlstra \u003ca.p.zijlstra@chello.nl\u003e\nAcked-by: Jan Kara \u003cjack@suse.cz\u003e\nSigned-off-by: Wu Fengguang \u003cfengguang.wu@intel.com\u003e\n"
    },
    {
      "commit": "c8e28ce049faa53a470c132893abbc9f2bde9420",
      "tree": "40553e72f50a613c93f4175d1d28d06abc90db65",
      "parents": [
        "9b13776977d45505469edc6decc93e9e3799afe2"
      ],
      "author": {
        "name": "Wu Fengguang",
        "email": "fengguang.wu@intel.com",
        "time": "Sun Jan 23 10:07:47 2011 -0600"
      },
      "committer": {
        "name": "Wu Fengguang",
        "email": "fengguang.wu@intel.com",
        "time": "Mon Oct 03 21:08:56 2011 +0800"
      },
      "message": "writeback: account per-bdi accumulated dirtied pages\n\nIntroduce the BDI_DIRTIED counter. It will be used for estimating the\nbdi\u0027s dirty bandwidth.\n\nCC: Jan Kara \u003cjack@suse.cz\u003e\nCC: Michael Rubin \u003cmrubin@google.com\u003e\nCC: Peter Zijlstra \u003ca.p.zijlstra@chello.nl\u003e\nSigned-off-by: Wu Fengguang \u003cfengguang.wu@intel.com\u003e\n"
    },
    {
      "commit": "bb0822954aab7d23a3f902c2a103ee0242f6046e",
      "tree": "3049962f0ecc05eea4b2b4ef5480b6708bc74ce7",
      "parents": [
        "93ee7a9340d64f20295aacc3fb6a22b759323280"
      ],
      "author": {
        "name": "Wu Fengguang",
        "email": "fengguang.wu@intel.com",
        "time": "Tue Aug 16 13:37:14 2011 -0600"
      },
      "committer": {
        "name": "Wu Fengguang",
        "email": "fengguang.wu@intel.com",
        "time": "Fri Aug 19 22:42:07 2011 +0800"
      },
      "message": "squeeze max-pause area and drop pass-good area\n\nRevert the pass-good area introduced in ffd1f609ab10 (\"writeback:\nintroduce max-pause and pass-good dirty limits\") and make the max-pause\narea smaller and safe.\n\nThis fixes ~30% performance regression in the ext3 data\u003dwriteback\nfio_mmap_randwrite_64k/fio_mmap_randrw_64k test cases, where there are\n12 JBOD disks, on each disk runs 8 concurrent tasks doing reads+writes.\n\nUsing deadline scheduler also has a regression, but not that big as CFQ,\nso this suggests we have some write starvation.\n\nThe test logs show that\n\n- the disks are sometimes under utilized\n\n- global dirty pages sometimes rush high to the pass-good area for\n  several hundred seconds, while in the mean time some bdi dirty pages\n  drop to very low value (bdi_dirty \u003c\u003c bdi_thresh).  Then suddenly the\n  global dirty pages dropped under global dirty threshold and bdi_dirty\n  rush very high (for example, 2 times higher than bdi_thresh). 
During\n  which time balance_dirty_pages() is not called at all.\n\nSo the problems are\n\n1) The random writes progress so slowly that they break the assumption of\n   the max-pause logic that \"8 pages per 200ms is typically more than\n   enough to curb heavy dirtiers\".\n\n2) The max-pause logic ignores task_bdi_thresh and thus opens the possibility\n   for some bdi\u0027s to over-dirty pages, leading to (bdi_dirty \u003e\u003e bdi_thresh)\n   and then (bdi_thresh \u003e\u003e bdi_dirty) for others.\n\n3) The higher max-pause/pass-good thresholds somehow lead to a bad\n   swing of dirty pages.\n\nThe fix is to allow the task to slightly dirty over task_bdi_thresh, but\nnever to exceed bdi_dirty and/or global dirty_thresh.\n\nTests show that it fixed the JBOD regression completely (both behavior\nand performance), while still being able to cut down large pause times\nin balance_dirty_pages() for single-disk cases.\n\nReported-by: Li Shaohua \u003cshaohua.li@intel.com\u003e\nTested-by: Li Shaohua \u003cshaohua.li@intel.com\u003e\nAcked-by: Jan Kara \u003cjack@suse.cz\u003e\nSigned-off-by: Wu Fengguang \u003cfengguang.wu@intel.com\u003e\n"
    },
    {
      "commit": "f01ef569cddb1a8627b1c6b3a134998ad1cf4b22",
      "tree": "29ea1a0942c8549c24411e976cd6891c7e995e89",
      "parents": [
        "a93a1329271038f0e8337061d3b41b3b212a851e",
        "bcff25fc8aa47a13faff8b4b992589813f7b450a"
      ],
      "author": {
        "name": "Linus Torvalds",
        "email": "torvalds@linux-foundation.org",
        "time": "Tue Jul 26 10:39:54 2011 -0700"
      },
      "committer": {
        "name": "Linus Torvalds",
        "email": "torvalds@linux-foundation.org",
        "time": "Tue Jul 26 10:39:54 2011 -0700"
      },
      "message": "Merge branch \u0027for-linus\u0027 of git://git.kernel.org/pub/scm/linux/kernel/git/wfg/writeback\n\n* \u0027for-linus\u0027 of git://git.kernel.org/pub/scm/linux/kernel/git/wfg/writeback: (27 commits)\n  mm: properly reflect task dirty limits in dirty_exceeded logic\n  writeback: don\u0027t busy retry writeback on new/freeing inodes\n  writeback: scale IO chunk size up to half device bandwidth\n  writeback: trace global_dirty_state\n  writeback: introduce max-pause and pass-good dirty limits\n  writeback: introduce smoothed global dirty limit\n  writeback: consolidate variable names in balance_dirty_pages()\n  writeback: show bdi write bandwidth in debugfs\n  writeback: bdi write bandwidth estimation\n  writeback: account per-bdi accumulated written pages\n  writeback: make writeback_control.nr_to_write straight\n  writeback: skip tmpfs early in balance_dirty_pages_ratelimited_nr()\n  writeback: trace event writeback_queue_io\n  writeback: trace event writeback_single_inode\n  writeback: remove .nonblocking and .encountered_congestion\n  writeback: remove writeback_control.more_io\n  writeback: skip balance_dirty_pages() for in-memory fs\n  writeback: add bdi_dirty_limit() kernel-doc\n  writeback: avoid extra sync work at enqueue time\n  writeback: elevate queue_io() into wb_writeback()\n  ...\n\nFix up trivial conflicts in fs/fs-writeback.c and mm/filemap.c\n"
    },
    {
      "commit": "99b12e3d882bc7ebdfe0de381dff3b16d21c38f7",
      "tree": "2aa21f5b5a6e03e49cd7af4dbd1b38ec451b09d7",
      "parents": [
        "48f170fb7d7db8789ccc23e051af61f62af5f685"
      ],
      "author": {
        "name": "Wu Fengguang",
        "email": "fengguang.wu@intel.com",
        "time": "Mon Jul 25 17:12:37 2011 -0700"
      },
      "committer": {
        "name": "Linus Torvalds",
        "email": "torvalds@linux-foundation.org",
        "time": "Mon Jul 25 20:57:11 2011 -0700"
      },
      "message": "writeback: account NR_WRITTEN at IO completion time\n\nNR_WRITTEN is currently accounted at block IO enqueue time, which does not\nmatch the common understanding of \"written\".  This moves NR_WRITTEN accounting to\nthe IO completion time and makes it more consistent with BDI_WRITTEN,\nwhich is used for bandwidth estimation.\n\nSigned-off-by: Wu Fengguang \u003cfengguang.wu@intel.com\u003e\nCc: Michael Rubin \u003cmrubin@google.com\u003e\nSigned-off-by: Andrew Morton \u003cakpm@linux-foundation.org\u003e\nSigned-off-by: Linus Torvalds \u003ctorvalds@linux-foundation.org\u003e\n"
    },
    {
      "commit": "72c4783210f77fd743f0a316858d33f27db51e7c",
      "tree": "4efc95eb0aaade090bac42e72c5973ada6d2cdb1",
      "parents": [
        "76d3fbf8fbf6cc78ceb63549e0e0c5bc8a88f838"
      ],
      "author": {
        "name": "Konstantin Khlebnikov",
        "email": "khlebnikov@openvz.org",
        "time": "Mon Jul 25 17:12:31 2011 -0700"
      },
      "committer": {
        "name": "Linus Torvalds",
        "email": "torvalds@linux-foundation.org",
        "time": "Mon Jul 25 20:57:11 2011 -0700"
      },
      "message": "mm: remove useless rcu lock-unlock from mapping_tagged()\n\nradix_tree_tagged() is lockless - it reads from a member of the radix-tree\nroot node.  It does not require any protection.\n\nSigned-off-by: Konstantin Khlebnikov \u003ckhlebnikov@openvz.org\u003e\nCc: Hugh Dickins \u003chughd@google.com\u003e\nSigned-off-by: Andrew Morton \u003cakpm@linux-foundation.org\u003e\nSigned-off-by: Linus Torvalds \u003ctorvalds@linux-foundation.org\u003e\n"
    },
    {
      "commit": "bcff25fc8aa47a13faff8b4b992589813f7b450a",
      "tree": "ae93e2b8ba1417bf6327f79154c69b9afc8328bb",
      "parents": [
        "fcc5c22218a18509a7412bf074fc9a7a5d874a8a"
      ],
      "author": {
        "name": "Jan Kara",
        "email": "jack@suse.cz",
        "time": "Fri Jul 01 13:31:25 2011 -0600"
      },
      "committer": {
        "name": "Wu Fengguang",
        "email": "fengguang.wu@intel.com",
        "time": "Sun Jul 24 10:51:52 2011 +0800"
      },
      "message": "mm: properly reflect task dirty limits in dirty_exceeded logic\n\nWe set bdi-\u003edirty_exceeded (and thus ratelimiting code starts to\ncall balance_dirty_pages() every 8 pages) when a per-bdi limit is\nexceeded or global limit is exceeded. But per-bdi limit also depends\non the task. Thus different tasks reach the limit on that bdi at\ndifferent levels of dirty pages. The result is that with current code\nbdi-\u003edirty_exceeded ping-ponged between 1 and 0 depending on which task\njust got into balance_dirty_pages().\n\nWe fix the issue by clearing bdi-\u003edirty_exceeded only when per-bdi amount\nof dirty pages drops below the threshold (7/8 * bdi_dirty_limit) where task\nlimits already do not have any influence.\n\nImpact:  The end result is, the dirty pages are kept more tightly under\ncontrol, with the average number slightly lowered than before.  This\nreduces the risk to throttle light dirtiers and hence more responsive.\nHowever it may add overheads by enforcing balance_dirty_pages() calls\non every 8 pages when there are 2+ heavy dirtiers.\n\nCC: Andrew Morton \u003cakpm@linux-foundation.org\u003e\nCC: Christoph Hellwig \u003chch@infradead.org\u003e\nCC: Dave Chinner \u003cdavid@fromorbit.com\u003e\nCC: Peter Zijlstra \u003ca.p.zijlstra@chello.nl\u003e\nSigned-off-by: Jan Kara \u003cjack@suse.cz\u003e\nSigned-off-by: Wu Fengguang \u003cfengguang.wu@intel.com\u003e\n"
    },
    {
      "commit": "e1cbe236013c82bcf9a156e98d7b47efb89d2674",
      "tree": "c2f1764a3d07fd01fdbe6fd7d6ecd647557808d5",
      "parents": [
        "ffd1f609ab10532e8137b4b981fdf903ef4d0b32"
      ],
      "author": {
        "name": "Wu Fengguang",
        "email": "fengguang.wu@intel.com",
        "time": "Mon Dec 06 22:34:29 2010 -0600"
      },
      "committer": {
        "name": "Wu Fengguang",
        "email": "fengguang.wu@intel.com",
        "time": "Sat Jul 09 22:09:03 2011 -0700"
      },
      "message": "writeback: trace global_dirty_state\n\nAdd trace event global_dirty_state for showing the global dirty page\ncounts and thresholds at each global_dirty_limits() invocation.  This\nwill cover the callers throttle_vm_writeout(), over_bground_thresh()\nand each balance_dirty_pages() loop.\n\nSigned-off-by: Wu Fengguang \u003cfengguang.wu@intel.com\u003e\n"
    },
    {
      "commit": "ffd1f609ab10532e8137b4b981fdf903ef4d0b32",
      "tree": "b691e22952a8d12782cc73ec21e7aa450859cf4f",
      "parents": [
        "c42843f2f0bbc9d716a32caf667d18fc2bf3bc4c"
      ],
      "author": {
        "name": "Wu Fengguang",
        "email": "fengguang.wu@intel.com",
        "time": "Sun Jun 19 22:18:42 2011 -0600"
      },
      "committer": {
        "name": "Wu Fengguang",
        "email": "fengguang.wu@intel.com",
        "time": "Sat Jul 09 22:09:02 2011 -0700"
      },
      "message": "writeback: introduce max-pause and pass-good dirty limits\n\nThe max-pause limit helps to keep the sleep time inside\nbalance_dirty_pages() within MAX_PAUSE\u003d200ms. The 200ms max sleep means\nper task rate limit of 8pages/200ms\u003d160KB/s when dirty exceeded, which\nnormally is enough to stop dirtiers from continue pushing the dirty\npages high, unless there are a sufficient large number of slow dirtiers\n(eg. 500 tasks doing 160KB/s will still sum up to 80MB/s, exceeding the\nwrite bandwidth of a slow disk and hence accumulating more and more dirty\npages).\n\nThe pass-good limit helps to let go of the good bdi\u0027s in the presence of\na blocked bdi (ie. NFS server not responding) or slow USB disk which for\nsome reason build up a large number of initial dirty pages that refuse\nto go away anytime soon.\n\nFor example, given two bdi\u0027s A and B and the initial state\n\n\tbdi_thresh_A \u003d dirty_thresh / 2\n\tbdi_thresh_B \u003d dirty_thresh / 2\n\tbdi_dirty_A  \u003d dirty_thresh / 2\n\tbdi_dirty_B  \u003d dirty_thresh / 2\n\nThen A get blocked, after a dozen seconds\n\n\tbdi_thresh_A \u003d 0\n\tbdi_thresh_B \u003d dirty_thresh\n\tbdi_dirty_A  \u003d dirty_thresh / 2\n\tbdi_dirty_B  \u003d dirty_thresh / 2\n\nThe (bdi_dirty_B \u003c bdi_thresh_B) test is now useless and the dirty pages\nwill be effectively throttled by condition (nr_dirty \u003c dirty_thresh).\nThis has two problems:\n(1) we lose the protections for light dirtiers\n(2) balance_dirty_pages() effectively becomes IO-less because the\n    (bdi_nr_reclaimable \u003e bdi_thresh) test won\u0027t be true. This is good\n    for IO, but balance_dirty_pages() loses an important way to break\n    out of the loop which leads to more spread out throttle delays.\n\nDIRTY_PASSGOOD_AREA can eliminate the above issues. 
The only problem is,\nDIRTY_PASSGOOD_AREA needs to be defined as 2 to fully cover the above\nexample while this patch uses the more conservative value 8 so as not to\nsurprise people with more dirty pages than expected.\n\nThe max-pause limit won\u0027t noticeably impact the speed at which dirty pages are\nknocked down when there is a sudden drop of global/bdi dirty thresholds,\nbecause the heavy dirtiers will be throttled below 160KB/s, which is slow\nenough. It does help to avoid long dirty throttle delays and especially\nwill make light dirtiers more responsive.\n\nSigned-off-by: Wu Fengguang \u003cfengguang.wu@intel.com\u003e\n"
    },
    {
      "commit": "c42843f2f0bbc9d716a32caf667d18fc2bf3bc4c",
      "tree": "835b801d215dd70cbb5a282232ce23fa3167a880",
      "parents": [
        "7762741e3af69720186802e945229b6a5afd5c49"
      ],
      "author": {
        "name": "Wu Fengguang",
        "email": "fengguang.wu@intel.com",
        "time": "Wed Mar 02 15:54:09 2011 -0600"
      },
      "committer": {
        "name": "Wu Fengguang",
        "email": "fengguang.wu@intel.com",
        "time": "Sat Jul 09 22:09:02 2011 -0700"
      },
      "message": "writeback: introduce smoothed global dirty limit\n\nThe start of a heavy weight application (e.g. KVM) may instantly knock\ndown determine_dirtyable_memory() if swap is not enabled or is full.\nglobal_dirty_limits() and bdi_dirty_limit() will in turn get global/bdi\ndirty thresholds that are _much_ lower than the global/bdi dirty pages.\n\nbalance_dirty_pages() will then heavily throttle all dirtiers including\nthe light ones, until the dirty pages drop below the new dirty thresholds.\nDuring this _deep_ dirty-exceeded state, the system may appear rather\nunresponsive to the users.\n\nAbout \"deep\" dirty-exceeded: task_dirty_limit() assigns 1/8 lower dirty\nthreshold to heavy dirtiers than light ones, and the dirty pages will\nbe throttled around the heavy dirtiers\u0027 dirty threshold and reasonably\nbelow the light dirtiers\u0027 dirty threshold. In this state, only the heavy\ndirtiers will be throttled and the dirty pages are carefully controlled\nto not exceed the light dirtiers\u0027 dirty threshold. However if the\nthreshold itself suddenly drops below the number of dirty pages, the\nlight dirtiers will get heavily throttled.\n\nSo introduce global_dirty_limit for tracking the global dirty threshold\nwith policies\n\n- follow downwards slowly\n- follow up in one shot\n\nglobal_dirty_limit can effectively mask out the impact of a sudden drop of\ndirtyable memory. It will be used in the next patch for two new types of\ndirty limits. Note that the new dirty limits are not going to avoid\nthrottling the light dirtiers, but could limit their sleep time to 200ms.\n\nSigned-off-by: Wu Fengguang \u003cfengguang.wu@intel.com\u003e\n"
    },
    {
      "commit": "7762741e3af69720186802e945229b6a5afd5c49",
      "tree": "e5ca904b7b31154b1a412bcd3a2160f31581bdb7",
      "parents": [
        "00821b002df7da867bb2c15b4f83f3706371383f"
      ],
      "author": {
        "name": "Wu Fengguang",
        "email": "fengguang.wu@intel.com",
        "time": "Sun Sep 12 13:34:05 2010 -0600"
      },
      "committer": {
        "name": "Wu Fengguang",
        "email": "fengguang.wu@intel.com",
        "time": "Sat Jul 09 22:09:02 2011 -0700"
      },
      "message": "writeback: consolidate variable names in balance_dirty_pages()\n\nIntroduce\n\n\tnr_dirty \u003d NR_FILE_DIRTY + NR_WRITEBACK + NR_UNSTABLE_NFS\n\nin order to simplify many tests in the following patches.\n\nbalance_dirty_pages() will eventually care only about the dirty sums\nbesides nr_writeback.\n\nAcked-by: Jan Kara \u003cjack@suse.cz\u003e\nSigned-off-by: Wu Fengguang \u003cfengguang.wu@intel.com\u003e\n"
    },
    {
      "commit": "e98be2d599207c6b31e9bb340d52a231b2f3662d",
      "tree": "3ae28e7d621a6e2ddf8e7462f8d282901c113d5c",
      "parents": [
        "f7d2b1ecd0c714adefc7d3a942ef87beb828a763"
      ],
      "author": {
        "name": "Wu Fengguang",
        "email": "fengguang.wu@intel.com",
        "time": "Sun Aug 29 11:22:30 2010 -0600"
      },
      "committer": {
        "name": "Wu Fengguang",
        "email": "fengguang.wu@intel.com",
        "time": "Sat Jul 09 22:09:01 2011 -0700"
      },
      "message": "writeback: bdi write bandwidth estimation\n\nThe estimation value will start from 100MB/s and adapt to the real\nbandwidth in seconds.\n\nIt tries to update the bandwidth only when disk is fully utilized.\nAny inactive period of more than one second will be skipped.\n\nThe estimated bandwidth will be reflecting how fast the device can\nwriteout when _fully utilized_, and won\u0027t drop to 0 when it goes idle.\nThe value will remain constant at disk idle time. At busy write time, if\nnot considering fluctuations, it will also remain high unless it is knocked\ndown by possible concurrent reads that compete for the disk time and\nbandwidth with async writes.\n\nThe estimation is not done purely in the flusher because there is no\nguarantee for write_cache_pages() to return timely to update bandwidth.\n\nThe bdi-\u003eavg_write_bandwidth smoothing is very effective for filtering\nout sudden spikes, but it may be a little biased in the long term.\n\nThe overheads are low because the bdi bandwidth update only occurs at\n200ms intervals.\n\nThe 200ms update interval is suitable, because it\u0027s not possible to get\nthe real bandwidth for the instance at all, due to large fluctuations.\n\nThe NFS commits can be as large as seconds worth of data. One XFS\ncompletion may be as large as half second worth of data if we are going\nto increase the write chunk to half second worth of data. In ext4,\nfluctuations with a period of around 5 seconds are observed. And there\nis another pattern of irregular periods of up to 20 seconds on SSD tests.\n\nThat\u0027s why we are not only doing the estimation at 200ms intervals, but\nalso averaging them over a period of 3 seconds and then going further to do\nanother level of smoothing in avg_write_bandwidth.\n\nCC: Li Shaohua \u003cshaohua.li@intel.com\u003e\nCC: Peter Zijlstra \u003ca.p.zijlstra@chello.nl\u003e\nSigned-off-by: Wu Fengguang \u003cfengguang.wu@intel.com\u003e\n"
    },
    {
      "commit": "f7d2b1ecd0c714adefc7d3a942ef87beb828a763",
      "tree": "7c5adb0abd73d3ad449b94698dadbaceb573a6f4",
      "parents": [
        "d46db3d58233be4be980eb1e42eebe7808bcabab"
      ],
      "author": {
        "name": "Jan Kara",
        "email": "jack@suse.cz",
        "time": "Wed Dec 08 22:44:24 2010 -0600"
      },
      "committer": {
        "name": "Wu Fengguang",
        "email": "fengguang.wu@intel.com",
        "time": "Sat Jul 09 22:09:01 2011 -0700"
      },
      "message": "writeback: account per-bdi accumulated written pages\n\nIntroduce the BDI_WRITTEN counter. It will be used for estimating the\nbdi\u0027s write bandwidth.\n\nPeter Zijlstra \u003ca.p.zijlstra@chello.nl\u003e:\nMove BDI_WRITTEN accounting into __bdi_writeout_inc().\nThis will cover and fix fuse, which only calls bdi_writeout_inc().\n\nCC: Michael Rubin \u003cmrubin@google.com\u003e\nReviewed-by: KOSAKI Motohiro \u003ckosaki.motohiro@jp.fujitsu.com\u003e\nSigned-off-by: Jan Kara \u003cjack@suse.cz\u003e\nSigned-off-by: Wu Fengguang \u003cfengguang.wu@intel.com\u003e\n"
    },
    {
      "commit": "d46db3d58233be4be980eb1e42eebe7808bcabab",
      "tree": "6d813b33938d915f0c0633e8615d1ffdcc554c96",
      "parents": [
        "36715cef0770b7e2547892b7c3197fc024274630"
      ],
      "author": {
        "name": "Wu Fengguang",
        "email": "fengguang.wu@intel.com",
        "time": "Wed May 04 19:54:37 2011 -0600"
      },
      "committer": {
        "name": "Wu Fengguang",
        "email": "fengguang.wu@intel.com",
        "time": "Sat Jul 09 22:09:01 2011 -0700"
      },
      "message": "writeback: make writeback_control.nr_to_write straight\n\nPass struct wb_writeback_work all the way down to writeback_sb_inodes(),\nand initialize the struct writeback_control there.\n\nstruct writeback_control is basically designed to control writeback of a\nsingle file, but we keep abusing it for writing multiple files in\nwriteback_sb_inodes() and its callers.\n\nIt immediately cleans things up, e.g. suddenly wbc.nr_to_write vs\nwork-\u003enr_pages starts to make sense, and instead of saving and restoring\npages_skipped in writeback_sb_inodes it can always start with a clean\nzero value.\n\nIt also makes a neat IO pattern change: large dirty files are now\nwritten in the full 4MB writeback chunk size, rather than whatever\nquota remained in wbc-\u003enr_to_write.\n\nAcked-by: Jan Kara \u003cjack@suse.cz\u003e\nProposed-by: Christoph Hellwig \u003chch@infradead.org\u003e\nSigned-off-by: Wu Fengguang \u003cfengguang.wu@intel.com\u003e\n"
    },
    {
      "commit": "36715cef0770b7e2547892b7c3197fc024274630",
      "tree": "34b690df719e6e46a37e0cef40b8c21f34bc36f8",
      "parents": [
        "e84d0a4f8e39a73003a6ec9a11b07702745f4c1f"
      ],
      "author": {
        "name": "Wu Fengguang",
        "email": "fengguang.wu@intel.com",
        "time": "Sat Jun 11 17:53:57 2011 -0600"
      },
      "committer": {
        "name": "Wu Fengguang",
        "email": "fengguang.wu@intel.com",
        "time": "Mon Jun 20 00:25:46 2011 +0800"
      },
      "message": "writeback: skip tmpfs early in balance_dirty_pages_ratelimited_nr()\n\nThis helps prevent tmpfs dirtiers from skewing the per-cpu bdp_ratelimits.\n\nAcked-by: Jan Kara \u003cjack@suse.cz\u003e\nSigned-off-by: Wu Fengguang \u003cfengguang.wu@intel.com\u003e\n"
    },
    {
      "commit": "3efaf0faba6793cd91298c76315e15de59c13ae0",
      "tree": "76712999b686348798cd21964dfba18f66854379",
      "parents": [
        "6f7186562771ec9b629914df328048449ccddf4a"
      ],
      "author": {
        "name": "Wu Fengguang",
        "email": "fengguang.wu@intel.com",
        "time": "Thu Dec 16 22:22:00 2010 -0600"
      },
      "committer": {
        "name": "Wu Fengguang",
        "email": "fengguang.wu@intel.com",
        "time": "Wed Jun 08 08:25:22 2011 +0800"
      },
      "message": "writeback: skip balance_dirty_pages() for in-memory fs\n\nThis avoids unnecessary checks and dirty throttling on tmpfs/ramfs.\n\nNotes about the tmpfs/ramfs behavior changes:\n\nAs for 2.6.36 and older kernels, the tmpfs writes will sleep inside\nbalance_dirty_pages() as long as we are over the (dirty+background)/2\nglobal throttle threshold.  This is because both the dirty pages and\nthreshold will be 0 for tmpfs/ramfs. Hence this test will always\nevaluate to TRUE:\n\n                dirty_exceeded \u003d\n                        (bdi_nr_reclaimable + bdi_nr_writeback \u003e\u003d bdi_thresh)\n                        || (nr_reclaimable + nr_writeback \u003e\u003d dirty_thresh);\n\nFor 2.6.37, someone complained that the current logic does not allow the\nusers to set vm.dirty_ratio\u003d0.  So commit 4cbec4c8b9 changed the test to\n\n                dirty_exceeded \u003d\n                        (bdi_nr_reclaimable + bdi_nr_writeback \u003e bdi_thresh)\n                        || (nr_reclaimable + nr_writeback \u003e dirty_thresh);\n\nSo 2.6.37 will behave differently for tmpfs/ramfs: it will never get\nthrottled unless the global dirty threshold is exceeded (which is very\nunlikely to happen; once it happens, it will block many tasks).\n\nI\u0027d say that the 2.6.36 behavior is very bad for tmpfs/ramfs. It means\nfor a busy writing server, tmpfs write()s may get livelocked! The\n\"inadvertent\" throttling can hardly bring help to any workload because\nof its \"either no throttling, or get throttled to death\" property.\n\nSo based on 2.6.37, this patch won\u0027t bring more noticeable changes.\n\nCC: Hugh Dickins \u003chughd@google.com\u003e\nAcked-by: Rik van Riel \u003criel@redhat.com\u003e\nAcked-by: Peter Zijlstra \u003ca.p.zijlstra@chello.nl\u003e\nReviewed-by: Minchan Kim \u003cminchan.kim@gmail.com\u003e\nSigned-off-by: Wu Fengguang \u003cfengguang.wu@intel.com\u003e\n"
    },
    {
      "commit": "6f7186562771ec9b629914df328048449ccddf4a",
      "tree": "573a5550c80843373fac6f95f7ef767d92cb83f6",
      "parents": [
        "e185dda89d69cde142b48059413a03561f41f78a"
      ],
      "author": {
        "name": "Wu Fengguang",
        "email": "fengguang.wu@intel.com",
        "time": "Wed Mar 02 17:14:34 2011 -0600"
      },
      "committer": {
        "name": "Wu Fengguang",
        "email": "fengguang.wu@intel.com",
        "time": "Wed Jun 08 08:25:22 2011 +0800"
      },
      "message": "writeback: add bdi_dirty_limit() kernel-doc\n\nClarify the bdi_dirty_limit() comment.\n\nAcked-by: Peter Zijlstra \u003ca.p.zijlstra@chello.nl\u003e\nAcked-by: Jan Kara \u003cjack@suse.cz\u003e\nSigned-off-by: Wu Fengguang \u003cfengguang.wu@intel.com\u003e\n"
    },
    {
      "commit": "6e6938b6d3130305a5960c86b1a9b21e58cf6144",
      "tree": "de5546e8390ce31cd31412d2ef78ce732a42191c",
      "parents": [
        "59c5f46fbe01a00eedf54a23789634438bb80603"
      ],
      "author": {
        "name": "Wu Fengguang",
        "email": "fengguang.wu@intel.com",
        "time": "Sun Jun 06 10:38:15 2010 -0600"
      },
      "committer": {
        "name": "Wu Fengguang",
        "email": "fengguang.wu@intel.com",
        "time": "Wed Jun 08 08:25:20 2011 +0800"
      },
      "message": "writeback: introduce .tagged_writepages for the WB_SYNC_NONE sync stage\n\nsync(2) is performed in two stages: the WB_SYNC_NONE sync and the\nWB_SYNC_ALL sync. Identify the first stage with .tagged_writepages and\ndo livelock prevention for it, too.\n\nJan\u0027s commit f446daaea9 (\"mm: implement writeback livelock avoidance\nusing page tagging\") is a partial fix in that it only fixed the\nWB_SYNC_ALL phase livelock.\n\nAlthough ext4 is tested to no longer livelock with commit f446daaea9,\nit may be due to some \"redirty_tail() after pages_skipped\" effect which\nis by no means a guarantee for _all_ the file systems.\n\nNote that writeback_inodes_sb() is called not only by sync(); the other\ncallers are treated the same because they also need livelock prevention.\n\nImpact:  It changes the order in which pages/inodes are synced to disk.\nNow in the WB_SYNC_NONE stage, it won\u0027t proceed to write the next inode\nuntil finished with the current inode.\n\nAcked-by: Jan Kara \u003cjack@suse.cz\u003e\nCC: Dave Chinner \u003cdavid@fromorbit.com\u003e\nSigned-off-by: Wu Fengguang \u003cfengguang.wu@intel.com\u003e\n"
    },
    {
      "commit": "6c5103890057b1bb781b26b7aae38d33e4c517d8",
      "tree": "e6e57961dcddcb5841acb34956e70b9dc696a880",
      "parents": [
        "3dab04e6978e358ad2307bca563fabd6c5d2c58b",
        "9d2e157d970a73b3f270b631828e03eb452d525e"
      ],
      "author": {
        "name": "Linus Torvalds",
        "email": "torvalds@linux-foundation.org",
        "time": "Thu Mar 24 10:16:26 2011 -0700"
      },
      "committer": {
        "name": "Linus Torvalds",
        "email": "torvalds@linux-foundation.org",
        "time": "Thu Mar 24 10:16:26 2011 -0700"
      },
      "message": "Merge branch \u0027for-2.6.39/core\u0027 of git://git.kernel.dk/linux-2.6-block\n\n* \u0027for-2.6.39/core\u0027 of git://git.kernel.dk/linux-2.6-block: (65 commits)\n  Documentation/iostats.txt: bit-size reference etc.\n  cfq-iosched: removing unnecessary think time checking\n  cfq-iosched: Don\u0027t clear queue stats when preempt.\n  blk-throttle: Reset group slice when limits are changed\n  blk-cgroup: Only give unaccounted_time under debug\n  cfq-iosched: Don\u0027t set active queue in preempt\n  block: fix non-atomic access to genhd inflight structures\n  block: attempt to merge with existing requests on plug flush\n  block: NULL dereference on error path in __blkdev_get()\n  cfq-iosched: Don\u0027t update group weights when on service tree\n  fs: assign sb-\u003es_bdi to default_backing_dev_info if the bdi is going away\n  block: Require subsystems to explicitly allocate bio_set integrity mempool\n  jbd2: finish conversion from WRITE_SYNC_PLUG to WRITE_SYNC and explicit plugging\n  jbd: finish conversion from WRITE_SYNC_PLUG to WRITE_SYNC and explicit plugging\n  fs: make fsync_buffers_list() plug\n  mm: make generic_writepages() use plugging\n  blk-cgroup: Add unaccounted time to timeslice_used.\n  block: fixup plugging stubs for !CONFIG_BLOCK\n  block: remove obsolete comments for blkdev_issue_zeroout.\n  blktrace: Use rq-\u003ecmd_flags directly in blk_add_trace_rq.\n  ...\n\nFix up conflicts in fs/{aio.c,super.c}\n"
    },
    {
      "commit": "cf15b07cf448e19dcb31a19f0cbaf898b08ce975",
      "tree": "78c377875ae4ee60181a205b6f01c4b52c49e03d",
      "parents": [
        "24b8ff7c27d9e975540656e377de44a2a181a01f"
      ],
      "author": {
        "name": "Jun\u0027ichi Nomura",
        "email": "j-nomura@ce.jp.nec.com",
        "time": "Tue Mar 22 16:33:40 2011 -0700"
      },
      "committer": {
        "name": "Linus Torvalds",
        "email": "torvalds@linux-foundation.org",
        "time": "Tue Mar 22 17:44:09 2011 -0700"
      },
      "message": "writeback: make mapping-\u003ewriteback_index to point to the last written page\n\nFor range-cyclic writeback (e.g.  kupdate), the writeback code sets a\ncontinuation point of the next writeback to mapping-\u003ewriteback_index which\nis set the page after the last written page.  This happens so that we\nevenly write the whole file even if pages in it get continuously\nredirtied.\n\nHowever, in some cases, sequential writer is writing in the middle of the\npage and it just redirties the last written page by continuing from that.\nFor example with an application which uses a file as a big ring buffer we\nsee:\n\n[1st writeback session]\n       ...\n       flush-8:0-2743  4571: block_bio_queue: 8,0 W 94898514 + 8\n       flush-8:0-2743  4571: block_bio_queue: 8,0 W 94898522 + 8\n       flush-8:0-2743  4571: block_bio_queue: 8,0 W 94898530 + 8\n       flush-8:0-2743  4571: block_bio_queue: 8,0 W 94898538 + 8\n       flush-8:0-2743  4571: block_bio_queue: 8,0 W 94898546 + 8\n     kworker/0:1-11    4571: block_rq_issue: 8,0 W 0 () 94898514 + 40\n\u003e\u003e     flush-8:0-2743  4571: block_bio_queue: 8,0 W 94898554 + 8\n\u003e\u003e     flush-8:0-2743  4571: block_rq_issue: 8,0 W 0 () 94898554 + 8\n\n[2nd writeback session after 35sec]\n       flush-8:0-2743  4606: block_bio_queue: 8,0 W 94898562 + 8\n       flush-8:0-2743  4606: block_bio_queue: 8,0 W 94898570 + 8\n       flush-8:0-2743  4606: block_bio_queue: 8,0 W 94898578 + 8\n       ...\n     kworker/0:1-11    4606: block_rq_issue: 8,0 W 0 () 94898562 + 640\n     kworker/0:1-11    4606: block_rq_issue: 8,0 W 0 () 94899202 + 72\n       ...\n       flush-8:0-2743  4606: block_bio_queue: 8,0 W 94899962 + 8\n       flush-8:0-2743  4606: block_bio_queue: 8,0 W 94899970 + 8\n       flush-8:0-2743  4606: block_bio_queue: 8,0 W 94899978 + 8\n       flush-8:0-2743  4606: block_bio_queue: 8,0 W 94899986 + 8\n       flush-8:0-2743  4606: block_bio_queue: 8,0 W 94899994 + 8\n     kworker/0:1-11    4606: 
block_rq_issue: 8,0 W 0 () 94899962 + 40\n\u003e\u003e     flush-8:0-2743  4606: block_bio_queue: 8,0 W 94898554 + 8\n\u003e\u003e     flush-8:0-2743  4606: block_rq_issue: 8,0 W 0 () 94898554 + 8\n\nSo we seeked back to 94898554 after we wrote all the pages at the end of\nthe file.\n\nThis extra seek seems unnecessary.  If we continue writeback from the last\nwritten page, we can avoid it and do not cause harm to other cases.  The\noriginal intent of even writeout over the whole file is preserved and if\nthe page does not get redirtied pagevec_lookup_tag() just skips it.\n\nAs an exceptional case, when I/O error happens, set done_index to the next\npage as the comment in the code suggests.\n\nTested-by: Wu Fengguang \u003cfengguang.wu@intel.com\u003e\nSigned-off-by: Jun\u0027ichi Nomura \u003cj-nomura@ce.jp.nec.com\u003e\nSigned-off-by: Jan Kara \u003cjack@suse.cz\u003e\nSigned-off-by: Andrew Morton \u003cakpm@linux-foundation.org\u003e\nSigned-off-by: Linus Torvalds \u003ctorvalds@linux-foundation.org\u003e\n"
    },
    {
      "commit": "278df9f451dc71dcd002246be48358a473504ad0",
      "tree": "3b79e956f2f0b9381f62518ff2fcf94df4ff9c3f",
      "parents": [
        "3f58a82943337fb6e79acfa5346719a97d3c0b98"
      ],
      "author": {
        "name": "Minchan Kim",
        "email": "minchan.kim@gmail.com",
        "time": "Tue Mar 22 16:32:54 2011 -0700"
      },
      "committer": {
        "name": "Linus Torvalds",
        "email": "torvalds@linux-foundation.org",
        "time": "Tue Mar 22 17:44:04 2011 -0700"
      },
      "message": "mm: reclaim invalidated page ASAP\n\ninvalidate_mapping_pages is a very big hint to the reclaimer.  It means the user\ndoesn\u0027t want to use the page any more.  So in order to prevent working set\npage eviction, this patch moves the page to the tail of the inactive list by\nPG_reclaim.\n\nPlease remember that pages in the inactive list are working set as well as\nthe active list.  If we don\u0027t move pages to the inactive list\u0027s tail, pages near\nthe tail of the inactive list can be evicted although we have a big clue about\nuseless pages.  It\u0027s totally bad.\n\nNow PG_readahead/PG_reclaim is shared.  fe3cba17 added ClearPageReclaim\ninto clear_page_dirty_for_io to prevent fast reclaiming of the readahead\nmarker page.\n\nIn this series, PG_reclaim is used by invalidated pages, too.  If the VM finds\nthe page is invalidated and it\u0027s dirty, it sets PG_reclaim to reclaim it\nasap.  Then, when the dirty page is written back,\nclear_page_dirty_for_io will clear PG_reclaim unconditionally.  It\ndefeats this series\u0027 goal.\n\nI think it\u0027s okay to clear PG_readahead when the page is dirtied, not at\nwriteback time.  So this patch moves ClearPageReadahead.  In v4,\nClearPageReadahead in set_page_dirty had a problem which was reported by\nSteven Barrett.  It\u0027s due to compound pages.  Some drivers (e.g. audio) call\nset_page_dirty with a compound page which isn\u0027t on the LRU, but my patch did\nClearPageReclaim on the compound page.  
In non-CONFIG_PAGEFLAGS_EXTENDED, it\nbreaks the PageTail flag.\n\nI think it doesn\u0027t affect THP and it passes my test with THP enabled, but I Cced\nAndrea for a double check.\n\nSigned-off-by: Minchan Kim \u003cminchan.kim@gmail.com\u003e\nReported-by: Steven Barrett \u003cdamentz@liquorix.net\u003e\nReviewed-by: Johannes Weiner \u003channes@cmpxchg.org\u003e\nAcked-by: Rik van Riel \u003criel@redhat.com\u003e\nAcked-by: Mel Gorman \u003cmel@csn.ul.ie\u003e\nCc: Wu Fengguang \u003cfengguang.wu@intel.com\u003e\nCc: KOSAKI Motohiro \u003ckosaki.motohiro@jp.fujitsu.com\u003e\nCc: Nick Piggin \u003cnpiggin@kernel.dk\u003e\nCc: Andrea Arcangeli \u003caarcange@redhat.com\u003e\nSigned-off-by: Andrew Morton \u003cakpm@linux-foundation.org\u003e\nSigned-off-by: Linus Torvalds \u003ctorvalds@linux-foundation.org\u003e\n"
    },
    {
      "commit": "9b6096a65f99a89dfd8328c4e469e7b53b3ae04a",
      "tree": "8853a3c389d40bda6fd5e109179be29695d536af",
      "parents": [
        "167400d34070ebbc408dc0f447c4ddb4bf837360"
      ],
      "author": {
        "name": "Shaohua Li",
        "email": "shaohua.li@intel.com",
        "time": "Thu Mar 17 10:47:06 2011 +0100"
      },
      "committer": {
        "name": "Jens Axboe",
        "email": "jaxboe@fusionio.com",
        "time": "Thu Mar 17 10:47:06 2011 +0100"
      },
      "message": "mm: make generic_writepages() use plugging\n\nThis recovers a performance regression caused by the removal\nof the per-device plugging.\n\nSigned-off-by: Jens Axboe \u003cjaxboe@fusionio.com\u003e\n"
    },
    {
      "commit": "7eaceaccab5f40bbfda044629a6298616aeaed50",
      "tree": "33954d12f63e25a47eb6d86ef3d3d0a5e62bf752",
      "parents": [
        "73c101011926c5832e6e141682180c4debe2cf45"
      ],
      "author": {
        "name": "Jens Axboe",
        "email": "jaxboe@fusionio.com",
        "time": "Thu Mar 10 08:52:07 2011 +0100"
      },
      "committer": {
        "name": "Jens Axboe",
        "email": "jaxboe@fusionio.com",
        "time": "Thu Mar 10 08:52:07 2011 +0100"
      },
      "message": "block: remove per-queue plugging\n\nCode has been converted over to the new explicit on-stack plugging,\nand delay users have been converted to use the new API for that.\nSo lets kill off the old plugging along with aops-\u003esync_page().\n\nSigned-off-by: Jens Axboe \u003cjaxboe@fusionio.com\u003e\n"
    },
    {
      "commit": "240c879f20a605346705be24253bc9fc6fa8a106",
      "tree": "85c38509483aa5f69d3dea5daa21412dd2e6aced",
      "parents": [
        "ecb256f815232b35ae8382cff36ca8ce0bbd077e"
      ],
      "author": {
        "name": "Minchan Kim",
        "email": "minchan.kim@gmail.com",
        "time": "Thu Jan 13 15:46:27 2011 -0800"
      },
      "committer": {
        "name": "Linus Torvalds",
        "email": "torvalds@linux-foundation.org",
        "time": "Thu Jan 13 17:32:38 2011 -0800"
      },
      "message": "writeback: avoid unnecessary determine_dirtyable_memory call\n\nI think determine_dirtyable_memory() is a rather costly function since it\nneeds many atomic reads for gathering zone/global page state.  But when we\nuse vm_dirty_bytes \u0026\u0026 dirty_background_bytes, we don\u0027t need that costly\ncalculation.\n\nThis patch eliminates such unnecessary overhead.\n\nNOTE : the newly added if condition might add overhead in the normal path.\n       But it should be _really_ small because we need to access\n       both vm_dirty_bytes and dirty_background_bytes anyway, so it is\n       likely to hit the cache.\n\n[akpm@linux-foundation.org: fix used-uninitialised warning]\nSigned-off-by: Minchan Kim \u003cminchan.kim@gmail.com\u003e\nCc: Wu Fengguang \u003cfengguang.wu@intel.com\u003e\nCc: Peter Zijlstra \u003ca.p.zijlstra@chello.nl\u003e\nSigned-off-by: Andrew Morton \u003cakpm@linux-foundation.org\u003e\nSigned-off-by: Linus Torvalds \u003ctorvalds@linux-foundation.org\u003e\n"
    },
    {
      "commit": "c3f0da631539b3b8e17f6dda567af9958d49d14f",
      "tree": "b9bc0060ac0fa99d0dc9e6caadb0abf0172249ef",
      "parents": [
        "c691b9d983d7015d54057034f4cd9b6d8affd976"
      ],
      "author": {
        "name": "Bob Liu",
        "email": "lliubbo@gmail.com",
        "time": "Thu Jan 13 15:45:49 2011 -0800"
      },
      "committer": {
        "name": "Linus Torvalds",
        "email": "torvalds@linux-foundation.org",
        "time": "Thu Jan 13 17:32:32 2011 -0800"
      },
      "message": "mm/page-writeback.c: fix __set_page_dirty_no_writeback() return value\n\n__set_page_dirty_no_writeback() should return true if it actually\ntransitioned the page from a clean to dirty state although it seems nobody\nuses its return value at present.\n\nSigned-off-by: Bob Liu \u003clliubbo@gmail.com\u003e\nAcked-by: Wu Fengguang \u003cfengguang.wu@intel.com\u003e\nSigned-off-by: Andrew Morton \u003cakpm@linux-foundation.org\u003e\nSigned-off-by: Linus Torvalds \u003ctorvalds@linux-foundation.org\u003e\n"
    },
    {
      "commit": "008d23e4852d78bb2618f2035f8b2110b6a6b968",
      "tree": "81c88f744f6f3fc84132527c1ddc0b4da410c5e2",
      "parents": [
        "8f685fbda43deccd130d192c9fcef1444649eaca",
        "bfc672dcf323877228682aff79dff8ecd9f30ff8"
      ],
      "author": {
        "name": "Linus Torvalds",
        "email": "torvalds@linux-foundation.org",
        "time": "Thu Jan 13 10:05:56 2011 -0800"
      },
      "committer": {
        "name": "Linus Torvalds",
        "email": "torvalds@linux-foundation.org",
        "time": "Thu Jan 13 10:05:56 2011 -0800"
      },
      "message": "Merge branch \u0027for-next\u0027 of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial\n\n* \u0027for-next\u0027 of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial: (43 commits)\n  Documentation/trace/events.txt: Remove obsolete sched_signal_send.\n  writeback: fix global_dirty_limits comment runtime -\u003e real-time\n  ppc: fix comment typo singal -\u003e signal\n  drivers: fix comment typo diable -\u003e disable.\n  m68k: fix comment typo diable -\u003e disable.\n  wireless: comment typo fix diable -\u003e disable.\n  media: comment typo fix diable -\u003e disable.\n  remove doc for obsolete dynamic-printk kernel-parameter\n  remove extraneous \u0027is\u0027 from Documentation/iostats.txt\n  Fix spelling milisec -\u003e ms in snd_ps3 module parameter description\n  Fix spelling mistakes in comments\n  Revert conflicting V4L changes\n  i7core_edac: fix typos in comments\n  mm/rmap.c: fix comment\n  sound, ca0106: Fix assignment to \u0027channel\u0027.\n  hrtimer: fix a typo in comment\n  init/Kconfig: fix typo\n  anon_inodes: fix wrong function name in comment\n  fix comment typos concerning \"consistent\"\n  poll: fix a typo in comment\n  ...\n\nFix up trivial conflicts in:\n - drivers/net/wireless/iwlwifi/iwl-core.c (moved to iwl-legacy.c)\n - fs/ext4/ext4.h\n\nAlso fix missed \u0027diabled\u0027 typo in drivers/net/bnx2x/bnx2x.h while at it.\n"
    },
    {
      "commit": "ebd1373d40be1f295e48877c7582fe9028164e6e",
      "tree": "3a42dac97f902cd90f0949e7f8e7d86d01684e76",
      "parents": [
        "8dd11f80ab73fa6d47f4a9aabb5cee7bc69e7f7a"
      ],
      "author": {
        "name": "Minchan Kim",
        "email": "minchan.kim@gmail.com",
        "time": "Tue Jan 04 01:36:48 2011 +0900"
      },
      "committer": {
        "name": "Jiri Kosina",
        "email": "jkosina@suse.cz",
        "time": "Tue Jan 04 11:09:29 2011 +0100"
      },
      "message": "writeback: fix global_dirty_limits comment runtime -\u003e real-time\n\nReplace runtime with real-time\n\nCc: Wu Fengguang \u003cfengguang.wu@intel.com\u003e\nSigned-off-by: Minchan Kim \u003cminchan.kim@gmail.com\u003e\nSigned-off-by: Jiri Kosina \u003cjkosina@suse.cz\u003e\n"
    },
    {
      "commit": "d153ba64450b9371158c6516d6cac120faace44c",
      "tree": "78bb90ade76b84312e1e332a02021eb8eb1cda42",
      "parents": [
        "f06328d7721ad3852c45eb2a10a0c8f9439b5f33"
      ],
      "author": {
        "name": "Wu Fengguang",
        "email": "fengguang.wu@intel.com",
        "time": "Tue Dec 21 17:24:21 2010 -0800"
      },
      "committer": {
        "name": "Linus Torvalds",
        "email": "torvalds@linux-foundation.org",
        "time": "Wed Dec 22 19:43:33 2010 -0800"
      },
      "message": "writeback: do uninterruptible sleep in balance_dirty_pages()\n\nUsing TASK_INTERRUPTIBLE in balance_dirty_pages() seems wrong.  If it\u0027s\ngoing to do that then it must break out if signal_pending(), otherwise\nit\u0027s pretty much guaranteed to degenerate into a busywait loop.  Plus we\n*do* want these processes to appear in D state and to contribute to load\naverage.\n\nSo it should be TASK_UNINTERRUPTIBLE.                 -- Andrew Morton\n\nSigned-off-by: Wu Fengguang \u003cfengguang.wu@intel.com\u003e\nSigned-off-by: Andrew Morton \u003cakpm@linux-foundation.org\u003e\nSigned-off-by: Linus Torvalds \u003ctorvalds@linux-foundation.org\u003e\n"
    },
    {
      "commit": "4cbec4c8b9fda9ec784086fe7f74cd32a8adda95",
      "tree": "669c3df27982345b52d0bfca8026e3f275e64a03",
      "parents": [
        "0e093d99763eb4cea09f8ca4f1d01f34e121d10b"
      ],
      "author": {
        "name": "Wu Fengguang",
        "email": "fengguang.wu@intel.com",
        "time": "Tue Oct 26 14:21:45 2010 -0700"
      },
      "committer": {
        "name": "Linus Torvalds",
        "email": "torvalds@linux-foundation.org",
        "time": "Tue Oct 26 16:52:08 2010 -0700"
      },
      "message": "writeback: remove the internal 5% low bound on dirty_ratio\n\nThe dirty_ratio was silently limited in global_dirty_limits() to \u003e\u003d 5%.\nThis is not a user expected behavior.  And it\u0027s inconsistent with\ncalc_period_shift(), which uses the plain vm_dirty_ratio value.\n\nLet\u0027s remove the internal bound.\n\nAt the same time, fix balance_dirty_pages() to work with the\ndirty_thresh\u003d0 case.  This allows applications to proceed when\ndirty+writeback pages are all cleaned.\n\nAnd \"\u003e\" fits with the name \"exceeded\" better than \"\u003e\u003d\" does.  Neil thinks\nit is an aesthetic improvement as well as a functional one :)\n\nSigned-off-by: Wu Fengguang \u003cfengguang.wu@intel.com\u003e\nCc: Jan Kara \u003cjack@suse.cz\u003e\nProposed-by: Con Kolivas \u003ckernel@kolivas.org\u003e\nAcked-by: Peter Zijlstra \u003ca.p.zijlstra@chello.nl\u003e\nReviewed-by: Rik van Riel \u003criel@redhat.com\u003e\nReviewed-by: Neil Brown \u003cneilb@suse.de\u003e\nReviewed-by: KOSAKI Motohiro \u003ckosaki.motohiro@jp.fujitsu.com\u003e\nCc: Michael Rubin \u003cmrubin@google.com\u003e\nSigned-off-by: Andrew Morton \u003cakpm@linux-foundation.org\u003e\nSigned-off-by: Linus Torvalds \u003ctorvalds@linux-foundation.org\u003e\n"
    },
    {
      "commit": "ea941f0e2a8c02ae876cd73deb4e1557248f258c",
      "tree": "d2006c10cce4f134dc83f7f5aaa1d0096902cc1a",
      "parents": [
        "f629d1c9bd0dbc44a6c4f9a4a67d1646c42bfc6f"
      ],
      "author": {
        "name": "Michael Rubin",
        "email": "mrubin@google.com",
        "time": "Tue Oct 26 14:21:35 2010 -0700"
      },
      "committer": {
        "name": "Linus Torvalds",
        "email": "torvalds@linux-foundation.org",
        "time": "Tue Oct 26 16:52:06 2010 -0700"
      },
      "message": "writeback: add nr_dirtied and nr_written to /proc/vmstat\n\nTo help developers and applications gain visibility into writeback\nbehaviour adding two entries to vm_stat_items and /proc/vmstat.  This will\nallow us to track the \"written\" and \"dirtied\" counts.\n\n   # grep nr_dirtied /proc/vmstat\n   nr_dirtied 3747\n   # grep nr_written /proc/vmstat\n   nr_written 3618\n\nSigned-off-by: Michael Rubin \u003cmrubin@google.com\u003e\nReviewed-by: Wu Fengguang \u003cfengguang.wu@intel.com\u003e\nCc: Dave Chinner \u003cdavid@fromorbit.com\u003e\nCc: Jens Axboe \u003caxboe@kernel.dk\u003e\nCc: KOSAKI Motohiro \u003ckosaki.motohiro@jp.fujitsu.com\u003e\nCc: Nick Piggin \u003cnickpiggin@yahoo.com.au\u003e\nSigned-off-by: Andrew Morton \u003cakpm@linux-foundation.org\u003e\nSigned-off-by: Linus Torvalds \u003ctorvalds@linux-foundation.org\u003e\n"
    },
    {
      "commit": "f629d1c9bd0dbc44a6c4f9a4a67d1646c42bfc6f",
      "tree": "22ac36b494b40e17bfa68e85a094b9cc4b2f6093",
      "parents": [
        "0def08e3acc2c9c934e4671487029aed52202d42"
      ],
      "author": {
        "name": "Michael Rubin",
        "email": "mrubin@google.com",
        "time": "Tue Oct 26 14:21:33 2010 -0700"
      },
      "committer": {
        "name": "Linus Torvalds",
        "email": "torvalds@linux-foundation.org",
        "time": "Tue Oct 26 16:52:06 2010 -0700"
      },
      "message": "mm: add account_page_writeback()\n\nTo help developers and applications gain visibility into writeback\nbehaviour this patch adds two counters to /proc/vmstat.\n\n  # grep nr_dirtied /proc/vmstat\n  nr_dirtied 3747\n  # grep nr_written /proc/vmstat\n  nr_written 3618\n\nThese entries allow user apps to understand writeback behaviour over time\nand learn how it is impacting their performance.  Currently there is no\nway to inspect dirty and writeback speed over time.  It\u0027s not possible for\nnr_dirty/nr_writeback.\n\nThese entries are necessary to give visibility into writeback behaviour.\nWe have /proc/diskstats which lets us understand the io in the block\nlayer.  We have blktrace for more in depth understanding.  We have\ne2fsprogs and debugsfs to give insight into the file systems behaviour,\nbut we don\u0027t offer our users the ability understand what writeback is\ndoing.  There is no way to know how active it is over the whole system, if\nit\u0027s falling behind or to quantify it\u0027s efforts.  With these values\nexported users can easily see how much data applications are sending\nthrough writeback and also at what rates writeback is processing this\ndata.  Comparing the rates of change between the two allow developers to\nsee when writeback is not able to keep up with incoming traffic and the\nrate of dirty memory being sent to the IO back end.  This allows folks to\nunderstand their io workloads and track kernel issues.  Non kernel\nengineers at Google often use these counters to solve puzzling performance\nproblems.\n\nPatch #4 adds a pernode vmstat file with nr_dirtied and nr_written\n\nPatch #5 add writeback thresholds to /proc/vmstat\n\nCurrently these values are in debugfs. 
But they should be promoted to\n/proc since they are useful for developers who are writing databases\nand file servers and are not debugging the kernel.\n\nThe output is as below:\n\n # grep threshold /proc/vmstat\n nr_pages_dirty_threshold 409111\n nr_pages_dirty_background_threshold 818223\n\nThis patch:\n\nThis allows code outside of the mm core to safely manipulate page\nwriteback state and not worry about the other accounting.  Not using these\nroutines means that some code will lose track of the accounting and we get\nbugs.\n\nModify nilfs2 to use interface.\n\nSigned-off-by: Michael Rubin \u003cmrubin@google.com\u003e\nReviewed-by: KOSAKI Motohiro \u003ckosaki.motohiro@jp.fujitsu.com\u003e\nReviewed-by: Wu Fengguang \u003cfengguang.wu@intel.com\u003e\nCc: KONISHI Ryusuke \u003ckonishi.ryusuke@lab.ntt.co.jp\u003e\nCc: Jiro SEKIBA \u003cjir@unicus.jp\u003e\nCc: Dave Chinner \u003cdavid@fromorbit.com\u003e\nCc: Jens Axboe \u003caxboe@kernel.dk\u003e\nCc: KOSAKI Motohiro \u003ckosaki.motohiro@jp.fujitsu.com\u003e\nCc: Nick Piggin \u003cnickpiggin@yahoo.com.au\u003e\nSigned-off-by: Andrew Morton \u003cakpm@linux-foundation.org\u003e\nSigned-off-by: Linus Torvalds \u003ctorvalds@linux-foundation.org\u003e\n"
    },
    {
      "commit": "997396a73a94de7d92d82e30d7bb1d931e38cb16",
      "tree": "2190a66e085f16a1985e008be167d6fc4ea6734d",
      "parents": [
        "6f4dbeca1a5bac4552d49d9e7b774da9f6625e74",
        "b545787dbb00a041c541a4759d938ddb0108295a"
      ],
      "author": {
        "name": "Linus Torvalds",
        "email": "torvalds@linux-foundation.org",
        "time": "Sat Aug 28 14:07:20 2010 -0700"
      },
      "committer": {
        "name": "Linus Torvalds",
        "email": "torvalds@linux-foundation.org",
        "time": "Sat Aug 28 14:07:20 2010 -0700"
      },
      "message": "Merge branch \u0027for-linus\u0027 of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client\n\n* \u0027for-linus\u0027 of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client:\n  ceph: fix get_ticket_handler() error handling\n  ceph: don\u0027t BUG on ENOMEM during mds reconnect\n  ceph: ceph_mdsc_build_path() returns an ERR_PTR\n  ceph: Fix warnings\n  ceph: ceph_get_inode() returns an ERR_PTR\n  ceph: initialize fields on new dentry_infos\n  ceph: maintain i_head_snapc when any caps are dirty, not just for data\n  ceph: fix osd request lru adjustment when sending request\n  ceph: don\u0027t improperly set dir complete when holding EXCL cap\n  mm: exporting account_page_dirty\n  ceph: direct requests in snapped namespace based on nonsnap parent\n  ceph: queue cap snap writeback for realm children on snap update\n  ceph: include dirty xattrs state in snapped caps\n  ceph: fix xattr cap writeback\n  ceph: fix multiple mds session shutdown\n"
    },
    {
      "commit": "546a1924224078c6f582e68f890b05b387b42653",
      "tree": "f863df4fd74f85c8177d9eb1467a351cd6d0acfc",
      "parents": [
        "4536f2ad8b330453d7ebec0746c4374eadd649b1"
      ],
      "author": {
        "name": "Dave Chinner",
        "email": "dchinner@redhat.com",
        "time": "Tue Aug 24 11:44:34 2010 +1000"
      },
      "committer": {
        "name": "Dave Chinner",
        "email": "david@fromorbit.com",
        "time": "Tue Aug 24 11:44:34 2010 +1000"
      },
      "message": "writeback: write_cache_pages doesn\u0027t terminate at nr_to_write \u003c\u003d 0\n\nI noticed XFS writeback in 2.6.36-rc1 was much slower than it should have\nbeen. Enabling writeback tracing showed:\n\n    flush-253:16-8516  [007] 1342952.351608: wbc_writepage: bdi 253:16: towrt\u003d1024 skip\u003d0 mode\u003d0 kupd\u003d0 bgrd\u003d1 reclm\u003d0 cyclic\u003d1 more\u003d0 older\u003d0x0 start\u003d0x0 end\u003d0x0\n    flush-253:16-8516  [007] 1342952.351654: wbc_writepage: bdi 253:16: towrt\u003d1023 skip\u003d0 mode\u003d0 kupd\u003d0 bgrd\u003d1 reclm\u003d0 cyclic\u003d1 more\u003d0 older\u003d0x0 start\u003d0x0 end\u003d0x0\n    flush-253:16-8516  [000] 1342952.369520: wbc_writepage: bdi 253:16: towrt\u003d0 skip\u003d0 mode\u003d0 kupd\u003d0 bgrd\u003d1 reclm\u003d0 cyclic\u003d1 more\u003d0 older\u003d0x0 start\u003d0x0 end\u003d0x0\n    flush-253:16-8516  [000] 1342952.369542: wbc_writepage: bdi 253:16: towrt\u003d-1 skip\u003d0 mode\u003d0 kupd\u003d0 bgrd\u003d1 reclm\u003d0 cyclic\u003d1 more\u003d0 older\u003d0x0 start\u003d0x0 end\u003d0x0\n    flush-253:16-8516  [000] 1342952.369549: wbc_writepage: bdi 253:16: towrt\u003d-2 skip\u003d0 mode\u003d0 kupd\u003d0 bgrd\u003d1 reclm\u003d0 cyclic\u003d1 more\u003d0 older\u003d0x0 start\u003d0x0 end\u003d0x0\n\nWriteback is not terminating in background writeback if -\u003ewritepage is\nreturning with wbc-\u003enr_to_write \u003d\u003d 0, resulting in sub-optimal single page\nwriteback on XFS.\n\nFix the write_cache_pages loop to terminate correctly when this situation\noccurs and so prevent this sub-optimal background writeback pattern. 
This\nimproves sustained sequential buffered write performance from around\n250MB/s to 750MB/s for a 100GB file on an XFS filesystem on my 8p test VM.\n\nCc:\u003cstable@kernel.org\u003e\nSigned-off-by: Dave Chinner \u003cdchinner@redhat.com\u003e\nReviewed-by: Wu Fengguang \u003cfengguang.wu@intel.com\u003e\nReviewed-by: Christoph Hellwig \u003chch@lst.de\u003e\n"
    },
    {
      "commit": "679ceace848e9fd570678396ffe1ef034e00e82d",
      "tree": "670768527852f134cfe3e20d425534a16edd968a",
      "parents": [
        "eb6bb1c5bdc6e455a9d16cb845cc65afc9b0a617"
      ],
      "author": {
        "name": "Michael Rubin",
        "email": "mrubin@google.com",
        "time": "Fri Aug 20 02:31:26 2010 -0700"
      },
      "committer": {
        "name": "Sage Weil",
        "email": "sage@newdream.net",
        "time": "Sun Aug 22 15:16:51 2010 -0700"
      },
      "message": "mm: exporting account_page_dirty\n\nThis allows code outside of the mm core to safely manipulate page state\nand not worry about the other accounting. Not using these routines means\nthat some code will lose track of the accounting and we get bugs. This\nhas happened once already.\n\nSigned-off-by: Michael Rubin \u003cmrubin@google.com\u003e\nSigned-off-by: Sage Weil \u003csage@newdream.net\u003e\n"
    },
    {
      "commit": "d5ed3a4af77b851b6271ad3d9abc4c57fa3ce0f5",
      "tree": "f06894404e4af25051e8918bfd3fdac95974fc97",
      "parents": [
        "f2e41e910320197d55b52e28d99a07130f2ae738"
      ],
      "author": {
        "name": "Jan Kara",
        "email": "jack@suse.cz",
        "time": "Thu Aug 19 14:13:33 2010 -0700"
      },
      "committer": {
        "name": "Linus Torvalds",
        "email": "torvalds@linux-foundation.org",
        "time": "Fri Aug 20 09:34:55 2010 -0700"
      },
      "message": "lib/radix-tree.c: fix overflow in radix_tree_range_tag_if_tagged()\n\nWhen radix_tree_maxindex() is ~0UL, it can happen that scanning overflows\nindex and tree traversal code goes astray reading memory until it hits\nunreadable memory.  Check for overflow and exit in that case.\n\nSigned-off-by: Jan Kara \u003cjack@suse.cz\u003e\nCc: Christoph Hellwig \u003chch@lst.de\u003e\nCc: Nick Piggin \u003cnickpiggin@yahoo.com.au\u003e\nSigned-off-by: Andrew Morton \u003cakpm@linux-foundation.org\u003e\nSigned-off-by: Linus Torvalds \u003ctorvalds@linux-foundation.org\u003e\n"
    },
    {
      "commit": "03ab450f030b08d786c7a262b67816396f09c7ab",
      "tree": "1c6e245f823bd3c2c3f5b584f6f25cf83d1c2447",
      "parents": [
        "163475fb111cb2f85aef2428a6c1f9eefba8be23"
      ],
      "author": {
        "name": "Randy Dunlap",
        "email": "randy.dunlap@oracle.com",
        "time": "Sat Aug 14 13:05:17 2010 -0700"
      },
      "committer": {
        "name": "Linus Torvalds",
        "email": "torvalds@linux-foundation.org",
        "time": "Sat Aug 14 16:20:59 2010 -0700"
      },
      "message": "mm/page-writeback: fix non-kernel-doc function comments\n\nRemove leading /** from non-kernel-doc function comments to prevent\nkernel-doc warnings.\n\nSigned-off-by: Randy Dunlap \u003crandy.dunlap@oracle.com\u003e\nSigned-off-by: Linus Torvalds \u003ctorvalds@linux-foundation.org\u003e\n"
    },
    {
      "commit": "1babe18385d3976043c04237ce837f3736197eb4",
      "tree": "c766bb0022ec5188cd7e991fc1f9ad51687e8aca",
      "parents": [
        "16c4042f08919f447d6b2a55679546c9b97c7264"
      ],
      "author": {
        "name": "Wu Fengguang",
        "email": "fengguang.wu@intel.com",
        "time": "Wed Aug 11 14:17:40 2010 -0700"
      },
      "committer": {
        "name": "Linus Torvalds",
        "email": "torvalds@linux-foundation.org",
        "time": "Thu Aug 12 08:43:30 2010 -0700"
      },
      "message": "writeback: add comment to the dirty limit functions\n\nDocument global_dirty_limits() and bdi_dirty_limit().\n\nSigned-off-by: Wu Fengguang \u003cfengguang.wu@intel.com\u003e\nCc: Christoph Hellwig \u003chch@infradead.org\u003e\nCc: Dave Chinner \u003cdavid@fromorbit.com\u003e\nCc: Jens Axboe \u003caxboe@kernel.dk\u003e\nCc: Peter Zijlstra \u003ca.p.zijlstra@chello.nl\u003e\nSigned-off-by: Andrew Morton \u003cakpm@linux-foundation.org\u003e\nSigned-off-by: Linus Torvalds \u003ctorvalds@linux-foundation.org\u003e\n"
    },
    {
      "commit": "16c4042f08919f447d6b2a55679546c9b97c7264",
      "tree": "0248b64d46237854ebe67efe8c742cb5878d8611",
      "parents": [
        "e50e37201ae2e7d6a52e87815759e6481f0bcfb9"
      ],
      "author": {
        "name": "Wu Fengguang",
        "email": "fengguang.wu@intel.com",
        "time": "Wed Aug 11 14:17:39 2010 -0700"
      },
      "committer": {
        "name": "Linus Torvalds",
        "email": "torvalds@linux-foundation.org",
        "time": "Thu Aug 12 08:43:29 2010 -0700"
      },
      "message": "writeback: avoid unnecessary calculation of bdi dirty thresholds\n\nSplit get_dirty_limits() into global_dirty_limits()+bdi_dirty_limit(), so\nthat the latter can be avoided when under global dirty background\nthreshold (which is the normal state for most systems).\n\nSigned-off-by: Wu Fengguang \u003cfengguang.wu@intel.com\u003e\nCc: Peter Zijlstra \u003ca.p.zijlstra@chello.nl\u003e\nCc: Christoph Hellwig \u003chch@infradead.org\u003e\nCc: Dave Chinner \u003cdavid@fromorbit.com\u003e\nCc: Jens Axboe \u003caxboe@kernel.dk\u003e\nSigned-off-by: Andrew Morton \u003cakpm@linux-foundation.org\u003e\nSigned-off-by: Linus Torvalds \u003ctorvalds@linux-foundation.org\u003e\n"
    },
    {
      "commit": "e50e37201ae2e7d6a52e87815759e6481f0bcfb9",
      "tree": "efb500382d5e9628351cb16286f579ad9bd455db",
      "parents": [
        "a292dfa01794477126d3f022559eb235edde00b0"
      ],
      "author": {
        "name": "Wu Fengguang",
        "email": "fengguang.wu@intel.com",
        "time": "Wed Aug 11 14:17:37 2010 -0700"
      },
      "committer": {
        "name": "Linus Torvalds",
        "email": "torvalds@linux-foundation.org",
        "time": "Thu Aug 12 08:43:29 2010 -0700"
      },
      "message": "writeback: balance_dirty_pages(): reduce calls to global_page_state\n\nReducing the number of times balance_dirty_pages calls global_page_state\nreduces the cache references and so improves write performance on a\nvariety of workloads.\n\n\u0027perf stats\u0027 of simple fio write tests shows the reduction in cache\naccess.  Where the test is fio \u0027write,mmap,600Mb,pre_read\u0027 on AMD AthlonX2\nwith 3Gb memory (dirty_threshold approx 600 Mb) running each test 10\ntimes, dropping the fasted \u0026 slowest values then taking the average \u0026\nstandard deviation\n\n\t\taverage (s.d.) in millions (10^6)\n2.6.31-rc8\t648.6 (14.6)\n+patch\t\t620.1 (16.5)\n\nAchieving this reduction is by dropping clip_bdi_dirty_limit as it rereads\nthe counters to apply the dirty_threshold and moving this check up into\nbalance_dirty_pages where it has already read the counters.\n\nAlso by rearrange the for loop to only contain one copy of the limit tests\nallows the pdflush test after the loop to use the local copies of the\ncounters rather than rereading them.\n\nIn the common case with no throttling it now calls global_page_state 5\nfewer times and bdi_stat 2 fewer.\n\nFengguang:\n\nThis patch slightly changes behavior by replacing clip_bdi_dirty_limit()\nwith the explicit check (nr_reclaimable + nr_writeback \u003e\u003d dirty_thresh) to\navoid exceeding the dirty limit.  Since the bdi dirty limit is mostly\naccurate we don\u0027t need to do routinely clip.  A simple dirty limit check\nwould be enough.\n\nThe check is necessary because, in principle we should throttle everything\ncalling balance_dirty_pages() when we\u0027re over the total limit, as said by\nPeter.\n\nWe now set and clear dirty_exceeded not only based on bdi dirty limits,\nbut also on the global dirty limit.  The global limit check is added in\nplace of clip_bdi_dirty_limit() for safety and not intended as a behavior\nchange.  
The bdi limits should be tight enough to keep all dirty pages\nunder the global limit at most time; occasional small exceeding should be\nOK though.  The change makes the logic more obvious: the global limit is\nthe ultimate goal and shall be always imposed.\n\nWe may now start background writeback work based on outdated conditions.\nThat\u0027s safe because the bdi flush thread will (and have to) double check\nthe states.  It reduces overall overheads because the test based on old\nstates still have good chance to be right.\n\n[akpm@linux-foundation.org] fix uninitialized dirty_exceeded\nSigned-off-by: Richard Kennedy \u003crichard@rsk.demon.co.uk\u003e\nSigned-off-by: Wu Fengguang \u003cfengguang.wu@intel.com\u003e\nCc: Jan Kara \u003cjack@suse.cz\u003e\nAcked-by: Peter Zijlstra \u003ca.p.zijlstra@chello.nl\u003e\nCc: Christoph Hellwig \u003chch@infradead.org\u003e\nCc: Dave Chinner \u003cdavid@fromorbit.com\u003e\nCc: Jens Axboe \u003caxboe@kernel.dk\u003e\nSigned-off-by: Andrew Morton \u003cakpm@linux-foundation.org\u003e\nSigned-off-by: Linus Torvalds \u003ctorvalds@linux-foundation.org\u003e\n"
    },
    {
      "commit": "3c111a071da260aa1e9cae3e882e2109c4e9bdfc",
      "tree": "27a830f9981dc1a9734a190d9890d4eddf0e6357",
      "parents": [
        "0a7992c90828a65281c3c9cf180be3b432d277b2"
      ],
      "author": {
        "name": "Randy Dunlap",
        "email": "randy.dunlap@oracle.com",
        "time": "Wed Aug 11 14:17:30 2010 -0700"
      },
      "committer": {
        "name": "Linus Torvalds",
        "email": "torvalds@linux-foundation.org",
        "time": "Thu Aug 12 08:43:29 2010 -0700"
      },
      "message": "mm: fix fatal kernel-doc error\n\nFix a fatal kernel-doc error due to a #define coming between a function\u0027s\nkernel-doc notation and the function signature.  (kernel-doc cannot handle\nthis)\n\nSigned-off-by: Randy Dunlap \u003crandy.dunlap@oracle.com\u003e\nSigned-off-by: Andrew Morton \u003cakpm@linux-foundation.org\u003e\nSigned-off-by: Linus Torvalds \u003ctorvalds@linux-foundation.org\u003e\n"
    },
    {
      "commit": "2f9e825d3e0e2b407ae8f082de5c00afcf7378fb",
      "tree": "f8b3ee40674ce4acd5508a0a0bf52a30904caf6c",
      "parents": [
        "7ae0dea900b027cd90e8a3e14deca9a19e17638b",
        "de75d60d5ea235e6e09f4962ab22541ce0fe176a"
      ],
      "author": {
        "name": "Linus Torvalds",
        "email": "torvalds@linux-foundation.org",
        "time": "Tue Aug 10 15:22:42 2010 -0700"
      },
      "committer": {
        "name": "Linus Torvalds",
        "email": "torvalds@linux-foundation.org",
        "time": "Tue Aug 10 15:22:42 2010 -0700"
      },
      "message": "Merge branch \u0027for-2.6.36\u0027 of git://git.kernel.dk/linux-2.6-block\n\n* \u0027for-2.6.36\u0027 of git://git.kernel.dk/linux-2.6-block: (149 commits)\n  block: make sure that REQ_* types are seen even with CONFIG_BLOCK\u003dn\n  xen-blkfront: fix missing out label\n  blkdev: fix blkdev_issue_zeroout return value\n  block: update request stacking methods to support discards\n  block: fix missing export of blk_types.h\n  writeback: fix bad _bh spinlock nesting\n  drbd: revert \"delay probes\", feature is being re-implemented differently\n  drbd: Initialize all members of sync_conf to their defaults [Bugz 315]\n  drbd: Disable delay probes for the upcomming release\n  writeback: cleanup bdi_register\n  writeback: add new tracepoints\n  writeback: remove unnecessary init_timer call\n  writeback: optimize periodic bdi thread wakeups\n  writeback: prevent unnecessary bdi threads wakeups\n  writeback: move bdi threads exiting logic to the forker thread\n  writeback: restructure bdi forker loop a little\n  writeback: move last_active to bdi\n  writeback: do not remove bdi from bdi_list\n  writeback: simplify bdi code a little\n  writeback: do not lose wake-ups in bdi threads\n  ...\n\nFixed up pretty trivial conflicts in drivers/block/virtio_blk.c and\ndrivers/scsi/scsi_error.c as per Jens.\n"
    },
    {
      "commit": "f446daaea9d4a420d16c606f755f3689dcb2d0ce",
      "tree": "be2afc18f79aa4ff9be245b0a036aa06185b5dc4",
      "parents": [
        "ebf8aa44beed48cd17893a83d92a4403e5f9d9e2"
      ],
      "author": {
        "name": "Jan Kara",
        "email": "jack@suse.cz",
        "time": "Mon Aug 09 17:19:12 2010 -0700"
      },
      "committer": {
        "name": "Linus Torvalds",
        "email": "torvalds@linux-foundation.org",
        "time": "Mon Aug 09 20:44:59 2010 -0700"
      },
      "message": "mm: implement writeback livelock avoidance using page tagging\n\nWe try to avoid livelocks of writeback when some steadily creates dirty\npages in a mapping we are writing out.  For memory-cleaning writeback,\nusing nr_to_write works reasonably well but we cannot really use it for\ndata integrity writeback.  This patch tries to solve the problem.\n\nThe idea is simple: Tag all pages that should be written back with a\nspecial tag (TOWRITE) in the radix tree.  This can be done rather quickly\nand thus livelocks should not happen in practice.  Then we start doing the\nhard work of locking pages and sending them to disk only for those pages\nthat have TOWRITE tag set.\n\nNote: Adding new radix tree tag grows radix tree node from 288 to 296\nbytes for 32-bit archs and from 552 to 560 bytes for 64-bit archs.\nHowever, the number of slab/slub items per page remains the same (13 and 7\nrespectively).\n\nSigned-off-by: Jan Kara \u003cjack@suse.cz\u003e\nCc: Dave Chinner \u003cdavid@fromorbit.com\u003e\nCc: Nick Piggin \u003cnickpiggin@yahoo.com.au\u003e\nCc: Chris Mason \u003cchris.mason@oracle.com\u003e\nCc: Theodore Ts\u0027o \u003ctytso@mit.edu\u003e\nCc: Jens Axboe \u003caxboe@kernel.dk\u003e\nSigned-off-by: Andrew Morton \u003cakpm@linux-foundation.org\u003e\nSigned-off-by: Linus Torvalds \u003ctorvalds@linux-foundation.org\u003e\n"
    },
    {
      "commit": "9e094383b60066996fbc3b53891324e5d2ec858d",
      "tree": "a3f4af7e60781f62b52dcfddd2032c3bbf933644",
      "parents": [
        "028c2dd184c097809986684f2f0627eea5529fea"
      ],
      "author": {
        "name": "Dave Chinner",
        "email": "dchinner@redhat.com",
        "time": "Wed Jul 07 13:24:08 2010 +1000"
      },
      "committer": {
        "name": "Jens Axboe",
        "email": "jaxboe@fusionio.com",
        "time": "Sat Aug 07 18:24:26 2010 +0200"
      },
      "message": "writeback: Add tracing to write_cache_pages\n\nAdd a trace event to the -\u003ewritepage loop in write_cache_pages to give\nvisibility into how the -\u003ewritepage call is changing variables within the\nwriteback control structure. Of most interest is how wbc-\u003enr_to_write changes\nfrom call to call, especially with filesystems that write multiple pages\nin -\u003ewritepage.\n\nSigned-off-by: Dave Chinner \u003cdchinner@redhat.com\u003e\nReviewed-by: Christoph Hellwig \u003chch@lst.de\u003e\nSigned-off-by: Jens Axboe \u003cjaxboe@fusionio.com\u003e\n"
    },
    {
      "commit": "028c2dd184c097809986684f2f0627eea5529fea",
      "tree": "f6eb9e30a24d73597e5ce2a65b4638e9d1947504",
      "parents": [
        "455b2864686d3591b3b2f39eb46290c95f76471f"
      ],
      "author": {
        "name": "Dave Chinner",
        "email": "dchinner@redhat.com",
        "time": "Wed Jul 07 13:24:07 2010 +1000"
      },
      "committer": {
        "name": "Jens Axboe",
        "email": "jaxboe@fusionio.com",
        "time": "Sat Aug 07 18:24:25 2010 +0200"
      },
      "message": "writeback: Add tracing to balance_dirty_pages\n\nTracing high level background writeback events is good, but it doesn\u0027t\ngive the entire picture. Add visibility into write throttling to catch IO\ndispatched by foreground throttling of processing dirtying lots of pages.\n\nSigned-off-by: Dave Chinner \u003cdchinner@redhat.com\u003e\nSigned-off-by: Jens Axboe \u003cjaxboe@fusionio.com\u003e\n"
    },
    {
      "commit": "9c3a8ee8a1d72c5c0d7fbdf426d80e270ddfa54c",
      "tree": "fa131760a61f66afeede852622ede0d716965489",
      "parents": [
        "06d738fa9155ff16dba3d7e501ba4581d01a98cb"
      ],
      "author": {
        "name": "Christoph Hellwig",
        "email": "hch@lst.de",
        "time": "Thu Jun 10 12:07:27 2010 +0200"
      },
      "committer": {
        "name": "Jens Axboe",
        "email": "jaxboe@fusionio.com",
        "time": "Tue Jul 06 08:54:03 2010 +0200"
      },
      "message": "writeback: remove writeback_inodes_wbc\n\nThis was just an odd wrapper around writeback_inodes_wb.  Removing this\nalso allows to get rid of the bdi member of struct writeback_control\nwhich was rather out of place there.\n\nSigned-off-by: Christoph Hellwig \u003chch@lst.de\u003e\nSigned-off-by: Jens Axboe \u003cjaxboe@fusionio.com\u003e\n"
    },
    {
      "commit": "c5444198ca210498e8ac0ba121b4cd3537aa12f7",
      "tree": "c423d38fe1ac7f51a48e455a19ecbe2354811fca",
      "parents": [
        "b8c2f3474f1077599ec6e90c2f263f17055cc3d8"
      ],
      "author": {
        "name": "Christoph Hellwig",
        "email": "hch@lst.de",
        "time": "Tue Jun 08 18:15:15 2010 +0200"
      },
      "committer": {
        "name": "Jens Axboe",
        "email": "jaxboe@fusionio.com",
        "time": "Fri Jun 11 12:58:08 2010 +0200"
      },
      "message": "writeback: simplify and split bdi_start_writeback\n\nbdi_start_writeback now never gets a superblock passed, so we can just remove\nthat case.  And to further untangle the code and flatten the call stack\nsplit it into two trivial helpers for it\u0027s two callers.\n\nSigned-off-by: Christoph Hellwig \u003chch@lst.de\u003e\nSigned-off-by: Jens Axboe \u003cjaxboe@fusionio.com\u003e\n"
    },
    {
      "commit": "d87815cb2090e07b0b0b2d73dc9740706e92c80c",
      "tree": "0e23b40fce5b09c94dab2bf773601b310d8d9b09",
      "parents": [
        "254c8c2dbf0e06a560a5814eb90cb628adb2de66"
      ],
      "author": {
        "name": "Dave Chinner",
        "email": "dchinner@redhat.com",
        "time": "Wed Jun 09 10:37:20 2010 +1000"
      },
      "committer": {
        "name": "Linus Torvalds",
        "email": "torvalds@linux-foundation.org",
        "time": "Tue Jun 08 18:12:44 2010 -0700"
      },
      "message": "writeback: limit write_cache_pages integrity scanning to current EOF\n\nsync can currently take a really long time if a concurrent writer is\nextending a file. The problem is that the dirty pages on the address\nspace grow in the same direction as write_cache_pages scans, so if\nthe writer keeps ahead of writeback, the writeback will not\nterminate until the writer stops adding dirty pages.\n\nFor a data integrity sync, we only need to write the pages dirty at\nthe time we start the writeback, so we can stop scanning once we get\nto the page that was at the end of the file at the time the scan\nstarted.\n\nThis will prevent operations like copying a large file preventing\nsync from completing as it will not write back pages that were\ndirtied after the sync was started. This does not impact the\nexisting integrity guarantees, as any dirty page (old or new)\nwithin the EOF range at the start of the scan will still be\ncaptured.\n\nThis patch will not prevent sync from blocking on large writes into\nholes. That requires more complex intervention while this patch only\naddresses the common append-case of this sync holdoff.\n\nSigned-off-by: Dave Chinner \u003cdchinner@redhat.com\u003e\nReviewed-by: Christoph Hellwig \u003chch@lst.de\u003e\nSigned-off-by: Linus Torvalds \u003ctorvalds@linux-foundation.org\u003e\n"
    },
    {
      "commit": "0b5649278e39a068aaf91399941bab1b4a4a3cc2",
      "tree": "3fd2c782385137f5b135c07149de772e207fdaf8",
      "parents": [
        "8d7458daea2a6809d32418bf489b949d23de99ea"
      ],
      "author": {
        "name": "Dave Chinner",
        "email": "dchinner@redhat.com",
        "time": "Wed Jun 09 10:37:18 2010 +1000"
      },
      "committer": {
        "name": "Linus Torvalds",
        "email": "torvalds@linux-foundation.org",
        "time": "Tue Jun 08 18:12:44 2010 -0700"
      },
      "message": "writeback: pay attention to wbc-\u003enr_to_write in write_cache_pages\n\nIf a filesystem writes more than one page in -\u003ewritepage, write_cache_pages\nfails to notice this and continues to attempt writeback when wbc-\u003enr_to_write\nhas gone negative - this trace was captured from XFS:\n\n    wbc_writeback_start: towrt\u003d1024\n    wbc_writepage: towrt\u003d1024\n    wbc_writepage: towrt\u003d0\n    wbc_writepage: towrt\u003d-1\n    wbc_writepage: towrt\u003d-5\n    wbc_writepage: towrt\u003d-21\n    wbc_writepage: towrt\u003d-85\n\nThis has adverse effects on filesystem writeback behaviour. write_cache_pages()\nneeds to terminate after a certain number of pages are written, not after a\ncertain number of calls to -\u003ewritepage are made.  This is a regression\nintroduced by 17bc6c30cf6bfffd816bdc53682dd46fc34a2cf4 (\"vfs: Add\nno_nrwrite_index_update writeback control flag\"), but cannot be reverted\ndirectly due to subsequent bug fixes that have gone in on top of it.\n\nSigned-off-by: Dave Chinner \u003cdchinner@redhat.com\u003e\nReviewed-by: Christoph Hellwig \u003chch@lst.de\u003e\nSigned-off-by: Linus Torvalds \u003ctorvalds@linux-foundation.org\u003e\n"
    },
    {
      "commit": "0e3c9a2284f5417f196e327c254d0b84c9ee8929",
      "tree": "e3fb40ebe7d042b4b3c1042bc7f2edaf7fb6eee0",
      "parents": [
        "f17625b318d9b151e7bd41e31223e9d89b2aaa77"
      ],
      "author": {
        "name": "Jens Axboe",
        "email": "jaxboe@fusionio.com",
        "time": "Tue Jun 01 11:08:43 2010 +0200"
      },
      "committer": {
        "name": "Jens Axboe",
        "email": "jaxboe@fusionio.com",
        "time": "Tue Jun 01 11:08:43 2010 +0200"
      },
      "message": "Revert \"writeback: fix WB_SYNC_NONE writeback from umount\"\n\nThis reverts commit e913fc825dc685a444cb4c1d0f9d32f372f59861.\n\nWe are investigating a hang associated with the WB_SYNC_NONE changes,\nso revert them for now.\n\nConflicts:\n\n\tfs/fs-writeback.c\n\tmm/page-writeback.c\n\nSigned-off-by: Jens Axboe \u003cjaxboe@fusionio.com\u003e\n"
    },
    {
      "commit": "df96e96f76571c30d903829a7b2ab2b421028790",
      "tree": "d3cc536a9aea6f99228789fe92ba81b195e8049c",
      "parents": [
        "c2c4986eddaa7dc3d036cb2bfa5c8c5f1f2492a0"
      ],
      "author": {
        "name": "Jens Axboe",
        "email": "jens.axboe@oracle.com",
        "time": "Fri May 21 20:01:54 2010 +0200"
      },
      "committer": {
        "name": "Jens Axboe",
        "email": "jens.axboe@oracle.com",
        "time": "Fri May 21 20:01:54 2010 +0200"
      },
      "message": "writeback: fix mixed up arguments to bdi_start_writeback()\n\nThe laptop mode timer had the nr_pages and sb_locked arguments\nmixed up.\n\nSigned-off-by: Jens Axboe \u003cjens.axboe@oracle.com\u003e\n"
    },
    {
      "commit": "c2c4986eddaa7dc3d036cb2bfa5c8c5f1f2492a0",
      "tree": "4787499c06028f73c770daadae772c9af7c3499c",
      "parents": [
        "b403a98e260f3a8c7c33f58a07c7ae549852170f"
      ],
      "author": {
        "name": "Jens Axboe",
        "email": "jens.axboe@oracle.com",
        "time": "Thu May 20 09:18:47 2010 +0200"
      },
      "committer": {
        "name": "Jens Axboe",
        "email": "jens.axboe@oracle.com",
        "time": "Fri May 21 20:01:03 2010 +0200"
      },
      "message": "writeback: fix problem with !CONFIG_BLOCK compilation\n\nWhen CONFIG_BLOCK isn\u0027t enabled:\n\nmm/page-writeback.c: In function \u0027laptop_mode_timer_fn\u0027:\nmm/page-writeback.c:708: error: dereferencing pointer to incomplete type\nmm/page-writeback.c:709: error: dereferencing pointer to incomplete type\n\nFix this by essentially eliminating the laptop sync handlers when\nCONFIG_BLOCK isn\u0027t set, as most are only used from the block layer code.\nThe exception is laptop_sync_completion() which is used from sys_sync(),\nmake that an empty declaration in that case.\n\nReported-by: Randy Dunlap \u003crandy.dunlap@oracle.com\u003e\nSigned-off-by: Jens Axboe \u003cjens.axboe@oracle.com\u003e\n"
    },
    {
      "commit": "6423104b6a1e6f0c18be60e8c33f02d263331d5e",
      "tree": "e22957400e9679bf82b62e03d6bd831181053945",
      "parents": [
        "f9eadbbd424c083b8005c7b738f644611b9ef489"
      ],
      "author": {
        "name": "Jens Axboe",
        "email": "jens.axboe@oracle.com",
        "time": "Fri May 21 20:00:35 2010 +0200"
      },
      "committer": {
        "name": "Jens Axboe",
        "email": "jens.axboe@oracle.com",
        "time": "Fri May 21 20:00:35 2010 +0200"
      },
      "message": "writeback: fixups for !dirty_writeback_centisecs\n\nCommit 69b62d01 fixed up most of the places where we would enter\nbusy schedule() spins when disabling the periodic background\nwriteback. This fixes up the sb timer so that it doesn\u0027t get\nhammered on with the delay disabled, and ensures that it gets\nrearmed if needed when /proc/sys/vm/dirty_writeback_centisecs\ngets modified.\n\nbdi_forker_task() also needs to check for !dirty_writeback_centisecs\nand use schedule() appropriately, fix that up too.\n\nSigned-off-by: Jens Axboe \u003cjens.axboe@oracle.com\u003e\n"
    },
    {
      "commit": "e913fc825dc685a444cb4c1d0f9d32f372f59861",
      "tree": "e470697e43ffe4028ac81c17d3ef90ee9f30bcfb",
      "parents": [
        "69b62d01ec44fe0d505d89917392347732135a4d"
      ],
      "author": {
        "name": "Jens Axboe",
        "email": "jens.axboe@oracle.com",
        "time": "Mon May 17 12:55:07 2010 +0200"
      },
      "committer": {
        "name": "Jens Axboe",
        "email": "jens.axboe@oracle.com",
        "time": "Mon May 17 12:55:07 2010 +0200"
      },
      "message": "writeback: fix WB_SYNC_NONE writeback from umount\n\nWhen umount calls sync_filesystem(), we first do a WB_SYNC_NONE\nwriteback to kick off writeback of pending dirty inodes, then follow\nthat up with a WB_SYNC_ALL to wait for it. Since umount already holds\nthe sb s_umount mutex, WB_SYNC_NONE ends up doing nothing and all\nwriteback happens as WB_SYNC_ALL. This can greatly slow down umount,\nsince WB_SYNC_ALL writeback is a data integrity operation and thus\na bigger hammer than simple WB_SYNC_NONE. For barrier aware file systems\nit\u0027s a lot slower.\n\nSigned-off-by: Jens Axboe \u003cjens.axboe@oracle.com\u003e\n"
    },
    {
      "commit": "31373d09da5b7fe21fe6f781e92bd534a3495f00",
      "tree": "38cd9896cfc6ce106a03431658a9b98a09129034",
      "parents": [
        "9195291e5f05e01d67f9a09c756b8aca8f009089"
      ],
      "author": {
        "name": "Matthew Garrett",
        "email": "mjg@redhat.com",
        "time": "Tue Apr 06 14:25:14 2010 +0200"
      },
      "committer": {
        "name": "Jens Axboe",
        "email": "jens.axboe@oracle.com",
        "time": "Tue Apr 06 14:25:14 2010 +0200"
      },
      "message": "laptop-mode: Make flushes per-device\n\nOne of the features of laptop-mode is that it forces a writeout of dirty\npages if something else triggers a physical read or write from a device.\nThe current implementation flushes pages on all devices, rather than only\nthe one that triggered the flush. This patch alters the behaviour so that\nonly the recently accessed block device is flushed, preventing other\ndisks being spun up for no terribly good reason.\n\nSigned-off-by: Matthew Garrett \u003cmjg@redhat.com\u003e\nSigned-off-by: Jens Axboe \u003cjens.axboe@oracle.com\u003e\n"
    },
    {
      "commit": "0d99519efef15fd0cf84a849492c7b1deee1e4b7",
      "tree": "d0f9d922ef73f6b9c4529826878f3cc5567848fd",
      "parents": [
        "b17621fed6aa039387e35f9b4d34d98f213e5673"
      ],
      "author": {
        "name": "Wu Fengguang",
        "email": "fengguang.wu@gmail.com",
        "time": "Thu Dec 03 13:54:25 2009 +0100"
      },
      "committer": {
        "name": "Jens Axboe",
        "email": "jens.axboe@oracle.com",
        "time": "Thu Dec 03 13:54:25 2009 +0100"
      },
      "message": "writeback: remove unused nonblocking and congestion checks\n\n- no one is calling wb_writeback and write_cache_pages with\n  wbc.nonblocking\u003d1 any more\n- lumpy pageout will want to do nonblocking writeback without the\n  congestion wait\n\nSo remove the congestion checks as suggested by Chris.\n\nSigned-off-by: Wu Fengguang \u003cfengguang.wu@intel.com\u003e\nCc: Chris Mason \u003cchris.mason@oracle.com\u003e\nCc: Jens Axboe \u003cjens.axboe@oracle.com\u003e\nCc: Trond Myklebust \u003cTrond.Myklebust@netapp.com\u003e\nCc: Christoph Hellwig \u003chch@infradead.org\u003e\nCc: Dave Chinner \u003cdavid@fromorbit.com\u003e\nCc: Evgeniy Polyakov \u003czbr@ioremap.net\u003e\nCc: Alex Elder \u003caelder@sgi.com\u003e\nSigned-off-by: Jens Axboe \u003cjens.axboe@oracle.com\u003e\n"
    },
    {
      "commit": "d25105e8911bff1dbd68e387f12901c5b1a15fe8",
      "tree": "bcb94e898b9f3b0322db74473e4dd319a16308e2",
      "parents": [
        "8c279598585e4992a41016bb973993ed15888cb3"
      ],
      "author": {
        "name": "Wu Fengguang",
        "email": "fengguang.wu@intel.com",
        "time": "Fri Oct 09 12:40:42 2009 +0200"
      },
      "committer": {
        "name": "Jens Axboe",
        "email": "jens.axboe@oracle.com",
        "time": "Fri Oct 09 12:40:42 2009 +0200"
      },
      "message": "writeback: account IO throttling wait as iowait\n\nIt makes sense to do IOWAIT when someone is blocked\ndue to IO throttle, as suggested by Kame and Peter.\n\nThere is an old comment for not doing IOWAIT on throttle,\nhowever it has been mismatching the code for a long time.\n\nIf we stop accounting IOWAIT for 2.6.32, it could be an\nundesirable behavior change. So restore the io_schedule.\n\nCC: KAMEZAWA Hiroyuki \u003ckamezawa.hiroyu@jp.fujitsu.com\u003e\nCC: Peter Zijlstra \u003ca.p.zijlstra@chello.nl\u003e\nSigned-off-by: Wu Fengguang \u003cfengguang.wu@intel.com\u003e\nSigned-off-by: Jens Axboe \u003cjens.axboe@oracle.com\u003e\n"
    },
    {
      "commit": "a72bfd4dea053bb8e2233902c3f1893ef5485802",
      "tree": "1246fc000adfee6d2874b9324eaf7383ad4413bb",
      "parents": [
        "6d7f18f6ea3a13af95bdf507fc54d42b165e1712"
      ],
      "author": {
        "name": "Jens Axboe",
        "email": "jens.axboe@oracle.com",
        "time": "Sat Sep 26 00:07:46 2009 +0200"
      },
      "committer": {
        "name": "Jens Axboe",
        "email": "jens.axboe@oracle.com",
        "time": "Sat Sep 26 00:10:40 2009 +0200"
      },
      "message": "writeback: pass in super_block to bdi_start_writeback()\n\nSometimes we only want to write pages from a specific super_block,\nso allow that to be passed in.\n\nThis fixes a problem with commit 56a131dcf7ed36c3c6e36bea448b674ea85ed5bb\ncausing writeback on all super_blocks on a bdi, where we only really\nwant to sync a specific sb from writeback_inodes_sb().\n\nSigned-off-by: Jens Axboe \u003cjens.axboe@oracle.com\u003e\n"
    },
    {
      "commit": "6d7f18f6ea3a13af95bdf507fc54d42b165e1712",
      "tree": "8f6f3a6d46835aa767823fa7049609408a87afc2",
      "parents": [
        "53cddfcc0e760d2b364878b6dadbd0c6d087cfae",
        "56a131dcf7ed36c3c6e36bea448b674ea85ed5bb"
      ],
      "author": {
        "name": "Linus Torvalds",
        "email": "torvalds@linux-foundation.org",
        "time": "Fri Sep 25 09:27:30 2009 -0700"
      },
      "committer": {
        "name": "Linus Torvalds",
        "email": "torvalds@linux-foundation.org",
        "time": "Fri Sep 25 09:27:30 2009 -0700"
      },
      "message": "Merge branch \u0027writeback\u0027 of git://git.kernel.dk/linux-2.6-block\n\n* \u0027writeback\u0027 of git://git.kernel.dk/linux-2.6-block:\n  writeback: writeback_inodes_sb() should use bdi_start_writeback()\n  writeback: don\u0027t delay inodes redirtied by a fast dirtier\n  writeback: make the super_block pinning more efficient\n  writeback: don\u0027t resort for a single super_block in move_expired_inodes()\n  writeback: move inodes from one super_block together\n  writeback: get rid to incorrect references to pdflush in comments\n  writeback: improve readability of the wb_writeback() continue/break logic\n  writeback: cleanup writeback_single_inode()\n  writeback: kupdate writeback shall not stop when more io is possible\n  writeback: stop background writeback when below background threshold\n  writeback: balance_dirty_pages() shall write more than dirtied pages\n  fs: Fix busyloop in wb_writeback()\n"
    }
  ],
  "next": "5b0830cb9085f4b69f9d57d7f3aaff322ffbec26"
}
