)]}'
{
  "log": [
    {
      "commit": "6adef3ebe570bcde67fd6c16101451ddde5712b5",
      "tree": "0f60e2a4d01850ae33aee6cefc7a59845ede89a0",
      "parents": [
        "2c488db27b614816024e7994117f599337de0f34"
      ],
      "author": {
        "name": "Jack Steiner",
        "email": "steiner@sgi.com",
        "time": "Wed May 26 14:42:49 2010 -0700"
      },
      "committer": {
        "name": "Linus Torvalds",
        "email": "torvalds@linux-foundation.org",
        "time": "Thu May 27 09:12:44 2010 -0700"
      },
      "message": "cpusets: new round-robin rotor for SLAB allocations\n\nWe have observed several workloads running on multi-node systems where\nmemory is assigned unevenly across the nodes in the system.  There are\nnumerous reasons for this but one is the round-robin rotor in\ncpuset_mem_spread_node().\n\nFor example, a simple test that writes a multi-page file will allocate\npages on nodes 0 2 4 6 ...  Odd nodes are skipped.  (Sometimes it\nallocates on odd nodes \u0026 skips even nodes).\n\nAn example is shown below.  The program \"lfile\" writes a file consisting\nof 10 pages.  The program then mmaps the file \u0026 uses get_mempolicy(...,\nMPOL_F_NODE) to determine the nodes where the file pages were allocated.\nThe output is shown below:\n\n\t# ./lfile\n\t allocated on nodes: 2 4 6 0 1 2 6 0 2\n\nThere is a single rotor that is used for allocating both file pages \u0026 slab\npages.  Writing the file allocates both a data page \u0026 a slab page\n(buffer_head).  This advances the RR rotor 2 nodes for each page\nallocated.\n\nA quick confirmation seems to confirm this is the cause of the uneven\nallocation:\n\n\t# echo 0 \u003e/dev/cpuset/memory_spread_slab\n\t# ./lfile\n\t allocated on nodes: 6 7 8 9 0 1 2 3 4 5\n\nThis patch introduces a second rotor that is used for slab allocations.\n\nSigned-off-by: Jack Steiner \u003csteiner@sgi.com\u003e\nAcked-by: Christoph Lameter \u003ccl@linux-foundation.org\u003e\nCc: Pekka Enberg \u003cpenberg@cs.helsinki.fi\u003e\nCc: Paul Menage \u003cmenage@google.com\u003e\nCc: Jack Steiner \u003csteiner@sgi.com\u003e\nCc: Robin Holt \u003cholt@sgi.com\u003e\nSigned-off-by: Andrew Morton \u003cakpm@linux-foundation.org\u003e\nSigned-off-by: Linus Torvalds \u003ctorvalds@linux-foundation.org\u003e\n"
    },
    {
      "commit": "c0ff7453bb5c7c98e0885fb94279f2571946f280",
      "tree": "8bb2b169a5145f0496575dbd2f48bb4b1c83f819",
      "parents": [
        "708c1bbc9d0c3e57f40501794d9b0eed29d10fce"
      ],
      "author": {
        "name": "Miao Xie",
        "email": "miaox@cn.fujitsu.com",
        "time": "Mon May 24 14:32:08 2010 -0700"
      },
      "committer": {
        "name": "Linus Torvalds",
        "email": "torvalds@linux-foundation.org",
        "time": "Tue May 25 08:06:57 2010 -0700"
      },
      "message": "cpuset,mm: fix no node to alloc memory when changing cpuset\u0027s mems\n\nBefore applying this patch, cpuset updates task-\u003emems_allowed and\nmempolicy by setting all new bits in the nodemask first, and clearing all\nold unallowed bits later.  But in the way, the allocator may find that\nthere is no node to alloc memory.\n\nThe reason is that cpuset rebinds the task\u0027s mempolicy, it cleans the\nnodes which the allocater can alloc pages on, for example:\n\n(mpol: mempolicy)\n\ttask1\t\t\ttask1\u0027s mpol\ttask2\n\talloc page\t\t1\n\t  alloc on node0? NO\t1\n\t\t\t\t1\t\tchange mems from 1 to 0\n\t\t\t\t1\t\trebind task1\u0027s mpol\n\t\t\t\t0-1\t\t  set new bits\n\t\t\t\t0\t  \t  clear disallowed bits\n\t  alloc on node1? NO\t0\n\t  ...\n\tcan\u0027t alloc page\n\t  goto oom\n\nThis patch fixes this problem by expanding the nodes range first(set newly\nallowed bits) and shrink it lazily(clear newly disallowed bits).  So we\nuse a variable to tell the write-side task that read-side task is reading\nnodemask, and the write-side task clears newly disallowed nodes after\nread-side task ends the current memory allocation.\n\n[akpm@linux-foundation.org: fix spello]\nSigned-off-by: Miao Xie \u003cmiaox@cn.fujitsu.com\u003e\nCc: David Rientjes \u003crientjes@google.com\u003e\nCc: Nick Piggin \u003cnpiggin@suse.de\u003e\nCc: Paul Menage \u003cmenage@google.com\u003e\nCc: Lee Schermerhorn \u003clee.schermerhorn@hp.com\u003e\nCc: Hugh Dickins \u003chugh.dickins@tiscali.co.uk\u003e\nCc: Ravikiran Thirumalai \u003ckiran@scalex86.org\u003e\nCc: KOSAKI Motohiro \u003ckosaki.motohiro@jp.fujitsu.com\u003e\nCc: Christoph Lameter \u003ccl@linux-foundation.org\u003e\nCc: Andi Kleen \u003candi@firstfloor.org\u003e\nSigned-off-by: Andrew Morton \u003cakpm@linux-foundation.org\u003e\nSigned-off-by: Linus Torvalds \u003ctorvalds@linux-foundation.org\u003e\n"
    },
    {
      "commit": "9084bb8246ea935b98320554229e2f371f7f52fa",
      "tree": "8478d18125e3b4a7e0a31d702647dee1830d23ef",
      "parents": [
        "6a1bdc1b577ebcb65f6603c57f8347309bc4ab13"
      ],
      "author": {
        "name": "Oleg Nesterov",
        "email": "oleg@redhat.com",
        "time": "Mon Mar 15 10:10:27 2010 +0100"
      },
      "committer": {
        "name": "Ingo Molnar",
        "email": "mingo@elte.hu",
        "time": "Fri Apr 02 20:12:03 2010 +0200"
      },
      "message": "sched: Make select_fallback_rq() cpuset friendly\n\nIntroduce cpuset_cpus_allowed_fallback() helper to fix the cpuset problems\nwith select_fallback_rq(). It can be called from any context and can\u0027t use\nany cpuset locks including task_lock(). It is called when the task doesn\u0027t\nhave online cpus in -\u003ecpus_allowed but ttwu/etc must be able to find a\nsuitable cpu.\n\nI am not proud of this patch. Everything which needs such a fat comment\ncan\u0027t be good even if correct. But I\u0027d prefer to not change the locking\nrules in the code I hardly understand, and in any case I believe this\nsimple change make the code much more correct compared to deadlocks we\ncurrently have.\n\nSigned-off-by: Oleg Nesterov \u003coleg@redhat.com\u003e\nSigned-off-by: Peter Zijlstra \u003ca.p.zijlstra@chello.nl\u003e\nLKML-Reference: \u003c20100315091027.GA9155@redhat.com\u003e\nSigned-off-by: Ingo Molnar \u003cmingo@elte.hu\u003e\n"
    },
    {
      "commit": "897f0b3c3ff40b443c84e271bef19bd6ae885195",
      "tree": "6b969149bb59591a1c9485de405639db6c4208d6",
      "parents": [
        "25c2d55c00c6097e6792ebf21e31342f23b9b768"
      ],
      "author": {
        "name": "Oleg Nesterov",
        "email": "oleg@redhat.com",
        "time": "Mon Mar 15 10:10:03 2010 +0100"
      },
      "committer": {
        "name": "Ingo Molnar",
        "email": "mingo@elte.hu",
        "time": "Fri Apr 02 20:12:01 2010 +0200"
      },
      "message": "sched: Kill the broken and deadlockable cpuset_lock/cpuset_cpus_allowed_locked code\n\nThis patch just states the fact the cpusets/cpuhotplug interaction is\nbroken and removes the deadlockable code which only pretends to work.\n\n- cpuset_lock() doesn\u0027t really work. It is needed for\n  cpuset_cpus_allowed_locked() but we can\u0027t take this lock in\n  try_to_wake_up()-\u003eselect_fallback_rq() path.\n\n- cpuset_lock() is deadlockable. Suppose that a task T bound to CPU takes\n  callback_mutex. If cpu_down(CPU) happens before T drops callback_mutex\n  stop_machine() preempts T, then migration_call(CPU_DEAD) tries to take\n  cpuset_lock() and hangs forever because CPU is already dead and thus\n  T can\u0027t be scheduled.\n\n- cpuset_cpus_allowed_locked() is deadlockable too. It takes task_lock()\n  which is not irq-safe, but try_to_wake_up() can be called from irq.\n\nKill them, and change select_fallback_rq() to use cpu_possible_mask, like\nwe currently do without CONFIG_CPUSETS.\n\nAlso, with or without this patch, with or without CONFIG_CPUSETS, the\ncallers of select_fallback_rq() can race with each other or with\nset_cpus_allowed() pathes.\n\nThe subsequent patches try to to fix these problems.\n\nSigned-off-by: Oleg Nesterov \u003coleg@redhat.com\u003e\nSigned-off-by: Peter Zijlstra \u003ca.p.zijlstra@chello.nl\u003e\nLKML-Reference: \u003c20100315091003.GA9123@redhat.com\u003e\nSigned-off-by: Ingo Molnar \u003cmingo@elte.hu\u003e\n"
    },
    {
      "commit": "58568d2a8215cb6f55caf2332017d7bdff954e1c",
      "tree": "ffcdee457494ac78d6550b0aeac86536ca152e7b",
      "parents": [
        "950592f7b991f267d707d372b90f508bbe72acbc"
      ],
      "author": {
        "name": "Miao Xie",
        "email": "miaox@cn.fujitsu.com",
        "time": "Tue Jun 16 15:31:49 2009 -0700"
      },
      "committer": {
        "name": "Linus Torvalds",
        "email": "torvalds@linux-foundation.org",
        "time": "Tue Jun 16 19:47:31 2009 -0700"
      },
      "message": "cpuset,mm: update tasks\u0027 mems_allowed in time\n\nFix allocating page cache/slab object on the unallowed node when memory\nspread is set by updating tasks\u0027 mems_allowed after its cpuset\u0027s mems is\nchanged.\n\nIn order to update tasks\u0027 mems_allowed in time, we must modify the code of\nmemory policy.  Because the memory policy is applied in the process\u0027s\ncontext originally.  After applying this patch, one task directly\nmanipulates anothers mems_allowed, and we use alloc_lock in the\ntask_struct to protect mems_allowed and memory policy of the task.\n\nBut in the fast path, we didn\u0027t use lock to protect them, because adding a\nlock may lead to performance regression.  But if we don\u0027t add a lock,the\ntask might see no nodes when changing cpuset\u0027s mems_allowed to some\nnon-overlapping set.  In order to avoid it, we set all new allowed nodes,\nthen clear newly disallowed ones.\n\n[lee.schermerhorn@hp.com:\n  The rework of mpol_new() to extract the adjusting of the node mask to\n  apply cpuset and mpol flags \"context\" breaks set_mempolicy() and mbind()\n  with MPOL_PREFERRED and a NULL nodemask--i.e., explicit local\n  allocation.  Fix this by adding the check for MPOL_PREFERRED and empty\n  node mask to mpol_new_mpolicy().\n\n  Remove the now unneeded \u0027nodes \u003d NULL\u0027 from mpol_new().\n\n  Note that mpol_new_mempolicy() is always called with a non-NULL\n  \u0027nodes\u0027 parameter now that it has been removed from mpol_new().\n  Therefore, we don\u0027t need to test nodes for NULL before testing it for\n  \u0027empty\u0027.  However, just to be extra paranoid, add a VM_BUG_ON() to\n  verify this assumption.]\n[lee.schermerhorn@hp.com:\n\n  I don\u0027t think the function name \u0027mpol_new_mempolicy\u0027 is descriptive\n  enough to differentiate it from mpol_new().\n\n  This function applies cpuset set context, usually constraining nodes\n  to those allowed by the cpuset.  However, when the \u0027RELATIVE_NODES flag\n  is set, it also translates the nodes.  So I settled on\n  \u0027mpol_set_nodemask()\u0027, because the comment block for mpol_new() mentions\n  that we need to call this function to \"set nodes\".\n\n  Some additional minor line length, whitespace and typo cleanup.]\nSigned-off-by: Miao Xie \u003cmiaox@cn.fujitsu.com\u003e\nCc: Ingo Molnar \u003cmingo@elte.hu\u003e\nCc: Peter Zijlstra \u003ca.p.zijlstra@chello.nl\u003e\nCc: Christoph Lameter \u003ccl@linux-foundation.org\u003e\nCc: Paul Menage \u003cmenage@google.com\u003e\nCc: Nick Piggin \u003cnickpiggin@yahoo.com.au\u003e\nCc: Yasunori Goto \u003cy-goto@jp.fujitsu.com\u003e\nCc: Pekka Enberg \u003cpenberg@cs.helsinki.fi\u003e\nCc: David Rientjes \u003crientjes@google.com\u003e\nSigned-off-by: Lee Schermerhorn \u003clee.schermerhorn@hp.com\u003e\nSigned-off-by: Andrew Morton \u003cakpm@linux-foundation.org\u003e\nSigned-off-by: Linus Torvalds \u003ctorvalds@linux-foundation.org\u003e\n"
    },
    {
      "commit": "a1bc5a4eee990a1f290735c8694d0aebdad095fa",
      "tree": "f3e5849823444136df9c7f91f7217e1894235682",
      "parents": [
        "7f81b1ae18416b457e4d5ff23f0bd598e8a42224"
      ],
      "author": {
        "name": "David Rientjes",
        "email": "rientjes@google.com",
        "time": "Thu Apr 02 16:57:54 2009 -0700"
      },
      "committer": {
        "name": "Linus Torvalds",
        "email": "torvalds@linux-foundation.org",
        "time": "Thu Apr 02 19:04:57 2009 -0700"
      },
      "message": "cpusets: replace zone allowed functions with node allowed\n\nThe cpuset_zone_allowed() variants are actually only a function of the\nzone\u0027s node.\n\nCc: Paul Menage \u003cmenage@google.com\u003e\nAcked-by: Christoph Lameter \u003ccl@linux-foundation.org\u003e\nCc: Randy Dunlap \u003crandy.dunlap@oracle.com\u003e\nSigned-off-by: David Rientjes \u003crientjes@google.com\u003e\nCc: Ingo Molnar \u003cmingo@elte.hu\u003e\nCc: Peter Zijlstra \u003ca.p.zijlstra@chello.nl\u003e\nSigned-off-by: Andrew Morton \u003cakpm@linux-foundation.org\u003e\nSigned-off-by: Linus Torvalds \u003ctorvalds@linux-foundation.org\u003e\n"
    },
    {
      "commit": "aa85ea5b89c36c51200d795dd788139bd9b8cf50",
      "tree": "0b68a35b691417d927127376beb0541d96c9cc64",
      "parents": [
        "1a8a51004a18b627ea81444201f7867875212f46"
      ],
      "author": {
        "name": "Rusty Russell",
        "email": "rusty@rustcorp.com.au",
        "time": "Mon Mar 30 22:05:15 2009 -0600"
      },
      "committer": {
        "name": "Rusty Russell",
        "email": "rusty@rustcorp.com.au",
        "time": "Mon Mar 30 22:05:16 2009 +1030"
      },
      "message": "cpumask: use new cpumask_ functions in core code.\n\nImpact: cleanup\n\nTime to clean up remaining laggards using the old cpu_ functions.\n\nSigned-off-by: Rusty Russell \u003crusty@rustcorp.com.au\u003e\nCc: Greg Kroah-Hartman \u003cgregkh@suse.de\u003e\nCc: Ingo Molnar \u003cmingo@elte.hu\u003e\nCc: Trond.Myklebust@netapp.com\n"
    },
    {
      "commit": "6af866af34a96fed24a55979a78b6f73bd4e8e87",
      "tree": "e0c4b27ce3b684ebb2f6fa3685051e01a86d7354",
      "parents": [
        "300ed6cbb70718872cb4936d1d22ef295f9ba44d"
      ],
      "author": {
        "name": "Li Zefan",
        "email": "lizf@cn.fujitsu.com",
        "time": "Wed Jan 07 18:08:45 2009 -0800"
      },
      "committer": {
        "name": "Linus Torvalds",
        "email": "torvalds@linux-foundation.org",
        "time": "Thu Jan 08 08:31:11 2009 -0800"
      },
      "message": "cpuset: remove remaining pointers to cpumask_t\n\nImpact: cleanups, use new cpumask API\n\nFinal trivial cleanups: mainly s/cpumask_t/struct cpumask\n\nNote there is a FIXME in generate_sched_domains(). A future patch will\nchange struct cpumask *doms to struct cpumask *doms[].\n(I suppose Rusty will do this.)\n\nSigned-off-by: Li Zefan \u003clizf@cn.fujitsu.com\u003e\nCc: Ingo Molnar \u003cmingo@elte.hu\u003e\nCc: Rusty Russell \u003crusty@rustcorp.com.au\u003e\nAcked-by: Mike Travis \u003ctravis@sgi.com\u003e\nCc: Paul Menage \u003cmenage@google.com\u003e\nSigned-off-by: Andrew Morton \u003cakpm@linux-foundation.org\u003e\nSigned-off-by: Linus Torvalds \u003ctorvalds@linux-foundation.org\u003e\n"
    },
    {
      "commit": "75aa199410359dc5fbcf9025ff7af98a9d20f0d5",
      "tree": "569bffa181ccba56d884ec7e826ae61384297f56",
      "parents": [
        "c7d4caeb1d68d07f77cc09fc20b7759d6d7aa3b1"
      ],
      "author": {
        "name": "David Rientjes",
        "email": "rientjes@google.com",
        "time": "Tue Jan 06 14:39:01 2009 -0800"
      },
      "committer": {
        "name": "Linus Torvalds",
        "email": "torvalds@linux-foundation.org",
        "time": "Tue Jan 06 15:58:59 2009 -0800"
      },
      "message": "oom: print triggering task\u0027s cpuset and mems allowed\n\nWhen cpusets are enabled, it\u0027s necessary to print the triggering task\u0027s\nset of allowable nodes so the subsequently printed meminfo can be\ninterpreted correctly.\n\nWe also print the task\u0027s cpuset name for informational purposes.\n\n[rientjes@google.com: task lock current before dereferencing cpuset]\nCc: Paul Menage \u003cmenage@google.com\u003e\nCc: Li Zefan \u003clizf@cn.fujitsu.com\u003e\nSigned-off-by: Andrew Morton \u003cakpm@linux-foundation.org\u003e\nSigned-off-by: Linus Torvalds \u003ctorvalds@linux-foundation.org\u003e\n"
    },
    {
      "commit": "f481891fdc49d3d1b8a9674a1825d183069a805f",
      "tree": "4f027a1321dcd06165394d0a23e49df51c8befc1",
      "parents": [
        "ac97b9f9a2d0b83488e0bbcb8517b229d5c9b142"
      ],
      "author": {
        "name": "Miao Xie",
        "email": "miaox@cn.fujitsu.com",
        "time": "Wed Nov 19 15:36:30 2008 -0800"
      },
      "committer": {
        "name": "Linus Torvalds",
        "email": "torvalds@linux-foundation.org",
        "time": "Wed Nov 19 18:49:58 2008 -0800"
      },
      "message": "cpuset: update top cpuset\u0027s mems after adding a node\n\nAfter adding a node into the machine, top cpuset\u0027s mems isn\u0027t updated.\n\nBy reviewing the code, we found that the update function\n\n  cpuset_track_online_nodes()\n\nwas invoked after node_states[N_ONLINE] changes.  It is wrong because\nN_ONLINE just means node has pgdat, and if node has/added memory, we use\nN_HIGH_MEMORY.  So, We should invoke the update function after\nnode_states[N_HIGH_MEMORY] changes, just like its commit says.\n\nThis patch fixes it.  And we use notifier of memory hotplug instead of\ndirect calling of cpuset_track_online_nodes().\n\nSigned-off-by: Miao Xie \u003cmiaox@cn.fujitsu.com\u003e\nAcked-by: Yasunori Goto \u003cy-goto@jp.fujitsu.com\u003e\nCc: David Rientjes \u003crientjes@google.com\u003e\nCc: Paul Menage \u003cmenage@google.com\nSigned-off-by: Andrew Morton \u003cakpm@linux-foundation.org\u003e\nSigned-off-by: Linus Torvalds \u003ctorvalds@linux-foundation.org\u003e\n"
    },
    {
      "commit": "dfb512ec4834116124da61d6c1ee10fd0aa32bd6",
      "tree": "ea4f847f2a29face1b5774c6d44ec41bf92e302b",
      "parents": [
        "cf417141cbb3a4ceb5cca15b2c1f099bd0a6603c"
      ],
      "author": {
        "name": "Max Krasnyansky",
        "email": "maxk@qualcomm.com",
        "time": "Fri Aug 29 13:11:41 2008 -0700"
      },
      "committer": {
        "name": "Ingo Molnar",
        "email": "mingo@elte.hu",
        "time": "Sat Sep 06 19:22:15 2008 +0200"
      },
      "message": "sched: arch_reinit_sched_domains() must destroy domains to force rebuild\n\nWhat I realized recently is that calling rebuild_sched_domains() in\narch_reinit_sched_domains() by itself is not enough when cpusets are enabled.\npartition_sched_domains() code is trying to avoid unnecessary domain rebuilds\nand will not actually rebuild anything if new domain masks match the old ones.\n\nWhat this means is that doing\n     echo 1 \u003e /sys/devices/system/cpu/sched_mc_power_savings\non a system with cpusets enabled will not take affect untill something changes\nin the cpuset setup (ie new sets created or deleted).\n\nThis patch fixes restore correct behaviour where domains must be rebuilt in\norder to enable MC powersaving flags.\n\nTest on quad-core Core2 box with both CONFIG_CPUSETS and !CONFIG_CPUSETS.\nAlso tested on dual-core Core2 laptop. Lockdep is happy and things are working\nas expected.\n\nSigned-off-by: Max Krasnyansky \u003cmaxk@qualcomm.com\u003e\nTested-by: Vaidyanathan Srinivasan \u003csvaidy@linux.vnet.ibm.com\u003e\nSigned-off-by: Ingo Molnar \u003cmingo@elte.hu\u003e\n"
    },
    {
      "commit": "e761b7725234276a802322549cee5255305a0930",
      "tree": "27b351a7d5fc9a93590e0effce1c5adb1bfcebc0",
      "parents": [
        "7ebefa8ceefed44cc321be70afc54a585a68ac0b"
      ],
      "author": {
        "name": "Max Krasnyansky",
        "email": "maxk@qualcomm.com",
        "time": "Tue Jul 15 04:43:49 2008 -0700"
      },
      "committer": {
        "name": "Ingo Molnar",
        "email": "mingo@elte.hu",
        "time": "Fri Jul 18 13:22:25 2008 +0200"
      },
      "message": "cpu hotplug, sched: Introduce cpu_active_map and redo sched domain managment (take 2)\n\nThis is based on Linus\u0027 idea of creating cpu_active_map that prevents\nscheduler load balancer from migrating tasks to the cpu that is going\ndown.\n\nIt allows us to simplify domain management code and avoid unecessary\ndomain rebuilds during cpu hotplug event handling.\n\nPlease ignore the cpusets part for now. It needs some more work in order\nto avoid crazy lock nesting. Although I did simplfy and unify domain\nreinitialization logic. We now simply call partition_sched_domains() in\nall the cases. This means that we\u0027re using exact same code paths as in\ncpusets case and hence the test below cover cpusets too.\nCpuset changes to make rebuild_sched_domains() callable from various\ncontexts are in the separate patch (right next after this one).\n\nThis not only boots but also easily handles\n\twhile true; do make clean; make -j 8; done\nand\n\twhile true; do on-off-cpu 1; done\nat the same time.\n(on-off-cpu 1 simple does echo 0/1 \u003e /sys/.../cpu1/online thing).\n\nSuprisingly the box (dual-core Core2) is quite usable. In fact I\u0027m typing\nthis on right now in gnome-terminal and things are moving just fine.\n\nAlso this is running with most of the debug features enabled (lockdep,\nmutex, etc) no BUG_ONs or lockdep complaints so far.\n\nI believe I addressed all of the Dmitry\u0027s comments for original Linus\u0027\nversion. I changed both fair and rt balancer to mask out non-active cpus.\nAnd replaced cpu_is_offline() with !cpu_active() in the main scheduler\ncode where it made sense (to me).\n\nSigned-off-by: Max Krasnyanskiy \u003cmaxk@qualcomm.com\u003e\nAcked-by: Linus Torvalds \u003ctorvalds@linux-foundation.org\u003e\nAcked-by: Peter Zijlstra \u003ca.p.zijlstra@chello.nl\u003e\nAcked-by: Gregory Haskins \u003cghaskins@novell.com\u003e\nCc: dmitry.adamushko@gmail.com\nCc: pj@sgi.com\nSigned-off-by: Ingo Molnar \u003cmingo@elte.hu\u003e\n"
    },
    {
      "commit": "19770b32609b6bf97a3dece2529089494cbfc549",
      "tree": "3b5922d1b20aabdf929bde9309f323841717747a",
      "parents": [
        "dd1a239f6f2d4d3eedd318583ec319aa145b324c"
      ],
      "author": {
        "name": "Mel Gorman",
        "email": "mel@csn.ul.ie",
        "time": "Mon Apr 28 02:12:18 2008 -0700"
      },
      "committer": {
        "name": "Linus Torvalds",
        "email": "torvalds@linux-foundation.org",
        "time": "Mon Apr 28 08:58:19 2008 -0700"
      },
      "message": "mm: filter based on a nodemask as well as a gfp_mask\n\nThe MPOL_BIND policy creates a zonelist that is used for allocations\ncontrolled by that mempolicy.  As the per-node zonelist is already being\nfiltered based on a zone id, this patch adds a version of __alloc_pages() that\ntakes a nodemask for further filtering.  This eliminates the need for\nMPOL_BIND to create a custom zonelist.\n\nA positive benefit of this is that allocations using MPOL_BIND now use the\nlocal node\u0027s distance-ordered zonelist instead of a custom node-id-ordered\nzonelist.  I.e., pages will be allocated from the closest allowed node with\navailable memory.\n\n[Lee.Schermerhorn@hp.com: Mempolicy: update stale documentation and comments]\n[Lee.Schermerhorn@hp.com: Mempolicy: make dequeue_huge_page_vma() obey MPOL_BIND nodemask]\n[Lee.Schermerhorn@hp.com: Mempolicy: make dequeue_huge_page_vma() obey MPOL_BIND nodemask rework]\nSigned-off-by: Mel Gorman \u003cmel@csn.ul.ie\u003e\nAcked-by: Christoph Lameter \u003cclameter@sgi.com\u003e\nSigned-off-by: Lee Schermerhorn \u003clee.schermerhorn@hp.com\u003e\nCc: KAMEZAWA Hiroyuki \u003ckamezawa.hiroyu@jp.fujitsu.com\u003e\nCc: Mel Gorman \u003cmel@csn.ul.ie\u003e\nCc: Hugh Dickins \u003chugh@veritas.com\u003e\nCc: Nick Piggin \u003cnickpiggin@yahoo.com.au\u003e\nSigned-off-by: Andrew Morton \u003cakpm@linux-foundation.org\u003e\nSigned-off-by: Linus Torvalds \u003ctorvalds@linux-foundation.org\u003e\n"
    },
    {
      "commit": "f9a86fcbbb1e5542eabf45c9144ac4b6330861a4",
      "tree": "0a3f8d57969b2dc8d2663e05d6ee36f9b50ba26a",
      "parents": [
        "f70316dace2bb99730800d47044acb818c6735f6"
      ],
      "author": {
        "name": "Mike Travis",
        "email": "travis@sgi.com",
        "time": "Fri Apr 04 18:11:07 2008 -0700"
      },
      "committer": {
        "name": "Ingo Molnar",
        "email": "mingo@elte.hu",
        "time": "Sat Apr 19 19:44:58 2008 +0200"
      },
      "message": "cpuset: modify cpuset_set_cpus_allowed to use cpumask pointer\n\n  * Modify cpuset_cpus_allowed to return the currently allowed cpuset\n    via a pointer argument instead of as the function return value.\n\n  * Use new set_cpus_allowed_ptr function.\n\n  * Cleanup CPU_MASK_ALL and NODE_MASK_ALL uses.\n\nDepends on:\n\t[sched-devel]: sched: add new set_cpus_allowed_ptr function\n\nSigned-off-by: Mike Travis \u003ctravis@sgi.com\u003e\nSigned-off-by: Ingo Molnar \u003cmingo@elte.hu\u003e\n"
    },
    {
      "commit": "31f1de46b90ad360a16e7af3e277d104961df923",
      "tree": "a54e8698d4e4d088c4008e0ae91b579b13d2c208",
      "parents": [
        "1a510089849ff9f70b654659bf976a6baf3a4833"
      ],
      "author": {
        "name": "KOSAKI Motohiro",
        "email": "kosaki.motohiro@jp.fujitsu.com",
        "time": "Tue Feb 12 13:30:22 2008 +0900"
      },
      "committer": {
        "name": "Linus Torvalds",
        "email": "torvalds@woody.linux-foundation.org",
        "time": "Mon Feb 11 20:48:29 2008 -0800"
      },
      "message": "mempolicy: silently restrict nodemask to allowed nodes\n\nKosaki Motohito noted that \"numactl --interleave\u003dall ...\" failed in the\npresence of memoryless nodes.  This patch attempts to fix that problem.\n\nSome background:\n\nnumactl --interleave\u003dall calls set_mempolicy(2) with a fully populated\n[out to MAXNUMNODES] nodemask.  set_mempolicy() [in do_set_mempolicy()]\ncalls contextualize_policy() which requires that the nodemask be a\nsubset of the current task\u0027s mems_allowed; else EINVAL will be returned.\n\nA task\u0027s mems_allowed will always be a subset of node_states[N_HIGH_MEMORY]\ni.e., nodes with memory.  So, a fully populated nodemask will be\ndeclared invalid if it includes memoryless nodes.\n\n  NOTE:  the same thing will occur when running in a cpuset\n         with restricted mem_allowed--for the same reason:\n         node mask contains dis-allowed nodes.\n\nmbind(2), on the other hand, just masks off any nodes in the nodemask\nthat are not included in the caller\u0027s mems_allowed.\n\nIn each case [mbind() and set_mempolicy()], mpol_check_policy() will\ncomplain [again, resulting in EINVAL] if the nodemask contains any\nmemoryless nodes.  This is somewhat redundant as mpol_new() will remove\nmemoryless nodes for interleave policy, as will bind_zonelist()--called\nby mpol_new() for BIND policy.\n\nProposed fix:\n\n1) modify contextualize_policy logic to:\n   a) remember whether the incoming node mask is empty.\n   b) if not, restrict the nodemask to allowed nodes, as is\n      currently done in-line for mbind().  This guarantees\n      that the resulting mask includes only nodes with memory.\n\n      NOTE:  this is a [benign, IMO] change in behavior for\n             set_mempolicy().  Dis-allowed nodes will be\n             silently ignored, rather than returning an error.\n\n   c) fold this code into mpol_check_policy(), replace 2 calls to\n      contextualize_policy() to call mpol_check_policy() directly\n      and remove contextualize_policy().\n\n2) In existing mpol_check_policy() logic, after \"contextualization\":\n   a) MPOL_DEFAULT:  require that in coming mask \"was_empty\"\n   b) MPOL_{BIND|INTERLEAVE}:  require that contextualized nodemask\n      contains at least one node.\n   c) add a case for MPOL_PREFERRED:  if in coming was not empty\n      and resulting mask IS empty, user specified invalid nodes.\n      Return EINVAL.\n   c) remove the now redundant check for memoryless nodes\n\n3) remove the now redundant masking of policy nodes for interleave\n   policy from mpol_new().\n\n4) Now that mpol_check_policy() contextualizes the nodemask, remove\n   the in-line nodes_and() from sys_mbind().  I believe that this\n   restores mbind() to the behavior before the memoryless-nodes\n   patch series.  E.g., we\u0027ll no longer treat an invalid nodemask\n   with MPOL_PREFERRED as local allocation.\n\n[ Patch history:\n\n  v1 -\u003e v2:\n   - Communicate whether or not incoming node mask was empty to\n     mpol_check_policy() for better error checking.\n   - As suggested by David Rientjes, remove the now unused\n     cpuset_nodes_subset_current_mems_allowed() from cpuset.h\n\n  v2 -\u003e v3:\n   - As suggested by Kosaki Motohito, fold the \"contextualization\"\n     of policy nodemask into mpol_check_policy().  Looks a little\n     cleaner. ]\n\nSigned-off-by:  Lee Schermerhorn \u003clee.schermerhorn@hp.com\u003e\nSigned-off-by:  KOSAKI Motohiro \u003ckosaki.motohiro@jp.fujitsu.com\u003e\nTested-by:      KOSAKI Motohiro \u003ckosaki.motohiro@jp.fujitsu.com\u003e\nAcked-by:       David Rientjes \u003crientjes@google.com\u003e\nSigned-off-by: Linus Torvalds \u003ctorvalds@linux-foundation.org\u003e\n"
    },
    {
      "commit": "df5f8314ca30d6a76735748e5ba4ca9809c0f434",
      "tree": "e0a6157b1666a320e69586a81c77a3fe83b36a2a",
      "parents": [
        "a56d3fc74c0178c5f41c48315604d62cff4e746d"
      ],
      "author": {
        "name": "Eric W. Biederman",
        "email": "ebiederm@xmission.com",
        "time": "Fri Feb 08 04:18:33 2008 -0800"
      },
      "committer": {
        "name": "Linus Torvalds",
        "email": "torvalds@woody.linux-foundation.org",
        "time": "Fri Feb 08 09:22:24 2008 -0800"
      },
      "message": "proc: seqfile convert proc_pid_status to properly handle pid namespaces\n\nCurrently we possibly lookup the pid in the wrong pid namespace.  So\nseq_file convert proc_pid_status which ensures the proper pid namespaces is\npassed in.\n\n[akpm@linux-foundation.org: coding-style fixes]\n[akpm@linux-foundation.org: build fix]\n[akpm@linux-foundation.org: another build fix]\n[akpm@linux-foundation.org: s390 build fix]\n[akpm@linux-foundation.org: fix task_name() output]\n[akpm@linux-foundation.org: fix nommu build]\nSigned-off-by: Eric W. Biederman \u003cebiederm@xmission.com\u003e\nCc: Andrew Morgan \u003cmorgan@kernel.org\u003e\nCc: Serge Hallyn \u003cserue@us.ibm.com\u003e\nCc: Cedric Le Goater \u003cclg@fr.ibm.com\u003e\nCc: Pavel Emelyanov \u003cxemul@openvz.org\u003e\nCc: Martin Schwidefsky \u003cschwidefsky@de.ibm.com\u003e\nCc: Heiko Carstens \u003cheiko.carstens@de.ibm.com\u003e\nCc: Paul Menage \u003cmenage@google.com\u003e\nCc: Paul Jackson \u003cpj@sgi.com\u003e\nCc: David Rientjes \u003crientjes@google.com\u003e\nSigned-off-by: Andrew Morton \u003cakpm@linux-foundation.org\u003e\nSigned-off-by: Linus Torvalds \u003ctorvalds@linux-foundation.org\u003e\n"
    },
    {
      "commit": "470fd646444c65a5d062a371f5ec8dcedee61239",
      "tree": "59b923486d4a95efa07c4b2ad7cb0b1fcc3f3c88",
      "parents": [
        "bd89aabc6761de1c35b154fe6f914a445d301510"
      ],
      "author": {
        "name": "Cliff Wickman",
        "email": "cpw@sgi.com",
        "time": "Thu Oct 18 23:40:46 2007 -0700"
      },
      "committer": {
        "name": "Linus Torvalds",
        "email": "torvalds@woody.linux-foundation.org",
        "time": "Fri Oct 19 11:53:44 2007 -0700"
      },
      "message": "hotplug cpu: migrate a task within its cpuset\n\nWhen a cpu is disabled, move_task_off_dead_cpu() is called for tasks that have\nbeen running on that cpu.\n\nCurrently, such a task is migrated:\n 1) to any cpu on the same node as the disabled cpu, which is both online\n    and among that task\u0027s cpus_allowed\n 2) to any cpu which is both online and among that task\u0027s cpus_allowed\n\nIt is typical of a multithreaded application running on a large NUMA system to\nhave its tasks confined to a cpuset so as to cluster them near the memory that\nthey share.  Furthermore, it is typical to explicitly place such a task on a\nspecific cpu in that cpuset.  And in that case the task\u0027s cpus_allowed\nincludes only a single cpu.\n\nThis patch would insert a preference to migrate such a task to some cpu within\nits cpuset (and set its cpus_allowed to its entire cpuset).\n\nWith this patch, migrate the task to:\n 1) to any cpu on the same node as the disabled cpu, which is both online\n    and among that task\u0027s cpus_allowed\n 2) to any online cpu within the task\u0027s cpuset\n 3) to any cpu which is both online and among that task\u0027s cpus_allowed\n\nIn order to do this, move_task_off_dead_cpu() must make a call to\ncpuset_cpus_allowed_locked(), a new subset of cpuset_cpus_allowed(), that will\nnot block.  (name change - per Oleg\u0027s suggestion)\n\nCalls are made to cpuset_lock() and cpuset_unlock() in migration_call() to set\nthe cpuset mutex during the whole migrate_live_tasks() and\nmigrate_dead_tasks() procedure.\n\n[akpm@linux-foundation.org: build fix]\n[pj@sgi.com: Fix indentation and spacing]\nSigned-off-by: Cliff Wickman \u003ccpw@sgi.com\u003e\nCc: Oleg Nesterov \u003coleg@tv-sign.ru\u003e\nCc: Christoph Lameter \u003cclameter@sgi.com\u003e\nCc: Paul Jackson \u003cpj@sgi.com\u003e\nCc: Ingo Molnar \u003cmingo@elte.hu\u003e\nSigned-off-by: Paul Jackson \u003cpj@sgi.com\u003e\nSigned-off-by: Andrew Morton \u003cakpm@linux-foundation.org\u003e\nSigned-off-by: Linus Torvalds \u003ctorvalds@linux-foundation.org\u003e\n"
    },
    {
      "commit": "8793d854edbc2774943a4b0de3304dc73991159a",
      "tree": "380b3403a0fedfcce61d9af5af1ffbcc71017abf",
      "parents": [
        "81a6a5cdd2c5cd70874b88afe524ab09e9e869af"
      ],
      "author": {
        "name": "Paul Menage",
        "email": "menage@google.com",
        "time": "Thu Oct 18 23:39:39 2007 -0700"
      },
      "committer": {
        "name": "Linus Torvalds",
        "email": "torvalds@woody.linux-foundation.org",
        "time": "Fri Oct 19 11:53:36 2007 -0700"
      },
      "message": "Task Control Groups: make cpusets a client of cgroups\n\nRemove the filesystem support logic from the cpusets system and makes cpusets\na cgroup subsystem\n\nThe \"cpuset\" filesystem becomes a dummy filesystem; attempts to mount it get\npassed through to the cgroup filesystem with the appropriate options to\nemulate the old cpuset filesystem behaviour.\n\nSigned-off-by: Paul Menage \u003cmenage@google.com\u003e\nCc: Serge E. Hallyn \u003cserue@us.ibm.com\u003e\nCc: \"Eric W. Biederman\" \u003cebiederm@xmission.com\u003e\nCc: Dave Hansen \u003chaveblue@us.ibm.com\u003e\nCc: Balbir Singh \u003cbalbir@in.ibm.com\u003e\nCc: Paul Jackson \u003cpj@sgi.com\u003e\nCc: Kirill Korotaev \u003cdev@openvz.org\u003e\nCc: Herbert Poetzl \u003cherbert@13thfloor.at\u003e\nCc: Srivatsa Vaddagiri \u003cvatsa@in.ibm.com\u003e\nCc: Cedric Le Goater \u003cclg@fr.ibm.com\u003e\nSigned-off-by: Andrew Morton \u003cakpm@linux-foundation.org\u003e\nSigned-off-by: Linus Torvalds \u003ctorvalds@linux-foundation.org\u003e\n"
    },
    {
      "commit": "bbe373f2c60b2aa36c3231734a5afc5271a06718",
      "tree": "00146d69594672ca41e35be8ff9b349e8751ab5c",
      "parents": [
        "7213f5066fc8a17c78389fe245de522b5cf0648a"
      ],
      "author": {
        "name": "David Rientjes",
        "email": "rientjes@google.com",
        "time": "Tue Oct 16 23:25:58 2007 -0700"
      },
      "committer": {
        "name": "Linus Torvalds",
        "email": "torvalds@woody.linux-foundation.org",
        "time": "Wed Oct 17 08:42:46 2007 -0700"
      },
      "message": "oom: compare cpuset mems_allowed instead of exclusive ancestors\n\nInstead of testing for overlap in the memory nodes of the the nearest\nexclusive ancestor of both current and the candidate task, it is better to\nsimply test for intersection between the task\u0027s mems_allowed in their task\ndescriptors.  This does not require taking callback_mutex since it is only\nused as a hint in the badness scoring.\n\nTasks that do not have an intersection in their mems_allowed with the current\ntask are not explicitly restricted from being OOM killed because it is quite\npossible that the candidate task has allocated memory there before and has\nsince changed its mems_allowed.\n\nCc: Andrea Arcangeli \u003candrea@suse.de\u003e\nAcked-by: Christoph Lameter \u003cclameter@sgi.com\u003e\nSigned-off-by: David Rientjes \u003crientjes@google.com\u003e\nSigned-off-by: Andrew Morton \u003cakpm@linux-foundation.org\u003e\nSigned-off-by: Linus Torvalds \u003ctorvalds@linux-foundation.org\u003e\n"
    },
    {
      "commit": "0e1e7c7a739562a321fda07c7cd2a97a7114f8f8",
      "tree": "f2148e5b667152681625c19cf8b2a556500994ea",
      "parents": [
        "523b945855a1427000ffc707c610abe5947ae607"
      ],
      "author": {
        "name": "Christoph Lameter",
        "email": "clameter@sgi.com",
        "time": "Tue Oct 16 01:25:38 2007 -0700"
      },
      "committer": {
        "name": "Linus Torvalds",
        "email": "torvalds@woody.linux-foundation.org",
        "time": "Tue Oct 16 09:42:59 2007 -0700"
      },
      "message": "Memoryless nodes: Use N_HIGH_MEMORY for cpusets\n\ncpusets try to ensure that any node added to a cpuset\u0027s mems_allowed is\non-line and contains memory.  The assumption was that online nodes contained\nmemory.  Thus, it is possible to add memoryless nodes to a cpuset and then add\ntasks to this cpuset.  This results in continuous series of oom-kill and\napparent system hang.\n\nChange cpusets to use node_states[N_HIGH_MEMORY] [a.k.a.  node_memory_map] in\nplace of node_online_map when vetting memories.  Return error if admin\nattempts to write a non-empty mems_allowed node mask containing only\nmemoryless-nodes.\n\nSigned-off-by: Lee Schermerhorn \u003clee.schermerhorn@hp.com\u003e\nSigned-off-by: Bob Picco \u003cbob.picco@hp.com\u003e\nSigned-off-by: Nishanth Aravamudan \u003cnacc@us.ibm.com\u003e\nCc: KAMEZAWA Hiroyuki \u003ckamezawa.hiroyu@jp.fujitsu.com\u003e\nCc: Mel Gorman \u003cmel@skynet.ie\u003e\nSigned-off-by: Christoph Lameter \u003cclameter@sgi.com\u003e\nSigned-off-by: Andrew Morton \u003cakpm@linux-foundation.org\u003e\nSigned-off-by: Linus Torvalds \u003ctorvalds@linux-foundation.org\u003e\n"
    },
    {
      "commit": "540473208f8ac71c25a87e1a2670c3c18dd4d6db",
      "tree": "716c6b412ebf3e232bd85da785315f888283d991",
      "parents": [
        "f59e5e82096f81a2cb7d7833001956d81e9fa6fb"
      ],
      "author": {
        "name": "Arjan van de Ven",
        "email": "arjan@linux.intel.com",
        "time": "Mon Feb 12 00:55:28 2007 -0800"
      },
      "committer": {
        "name": "Linus Torvalds",
        "email": "torvalds@woody.linux-foundation.org",
        "time": "Mon Feb 12 09:48:44 2007 -0800"
      },
      "message": "[PATCH] mark struct file_operations const 1\n\nMany struct file_operations in the kernel can be \"const\".  Marking them const\nmoves these to the .rodata section, which avoids false sharing with potential\ndirty data.  In addition it\u0027ll catch accidental writes at compile time to\nthese shared resources.\n\nSigned-off-by: Arjan van de Ven \u003carjan@linux.intel.com\u003e\nSigned-off-by: Andrew Morton \u003cakpm@linux-foundation.org\u003e\nSigned-off-by: Linus Torvalds \u003ctorvalds@linux-foundation.org\u003e\n"
    },
    {
      "commit": "089e34b60033863549fbe561d31ac8c778a20e7f",
      "tree": "aea34fde97b91231626f54440390456e1bd3e0aa",
      "parents": [
        "918d3f90e8d5657491024f64427e9a5ea632d284"
      ],
      "author": {
        "name": "Andrew Morton",
        "email": "akpm@osdl.org",
        "time": "Fri Dec 29 16:49:04 2006 -0800"
      },
      "committer": {
        "name": "Linus Torvalds",
        "email": "torvalds@woody.osdl.org",
        "time": "Sat Dec 30 10:56:43 2006 -0800"
      },
      "message": "[PATCH] cpuset procfs warning fix\n\nfs/proc/base.c:1869: warning: initialization discards qualifiers from pointer target type\nfs/proc/base.c:2150: warning: initialization discards qualifiers from pointer target type\n\nCc: Paul Jackson \u003cpj@sgi.com\u003e\nSigned-off-by: Andrew Morton \u003cakpm@osdl.org\u003e\nSigned-off-by: Linus Torvalds \u003ctorvalds@osdl.org\u003e\n"
    },
    {
      "commit": "02a0e53d8227aff5e62e0433f82c12c1c2805fd6",
      "tree": "fe32435308e5f1afe8bd12357bd8c5ff3b4133c7",
      "parents": [
        "55935a34a428a1497e3b37982e2782c09c6f914d"
      ],
      "author": {
        "name": "Paul Jackson",
        "email": "pj@sgi.com",
        "time": "Wed Dec 13 00:34:25 2006 -0800"
      },
      "committer": {
        "name": "Linus Torvalds",
        "email": "torvalds@woody.osdl.org",
        "time": "Wed Dec 13 09:05:49 2006 -0800"
      },
      "message": "[PATCH] cpuset: rework cpuset_zone_allowed api\n\nElaborate the API for calling cpuset_zone_allowed(), so that users have to\nexplicitly choose between the two variants:\n\n  cpuset_zone_allowed_hardwall()\n  cpuset_zone_allowed_softwall()\n\nUntil now, whether or not you got the hardwall flavor depended solely on\nwhether or not you or\u0027d in the __GFP_HARDWALL gfp flag to the gfp_mask\nargument.\n\nIf you didn\u0027t specify __GFP_HARDWALL, you implicitly got the softwall\nversion.\n\nUnfortunately, this meant that users would end up with the softwall version\nwithout thinking about it.  Since only the softwall version might sleep,\nthis led to bugs with possible sleeping in interrupt context on more than\none occassion.\n\nThe hardwall version requires that the current tasks mems_allowed allows\nthe node of the specified zone (or that you\u0027re in interrupt or that\n__GFP_THISNODE is set or that you\u0027re on a one cpuset system.)\n\nThe softwall version, depending on the gfp_mask, might allow a node if it\nwas allowed in the nearest enclusing cpuset marked mem_exclusive (which\nrequires taking the cpuset lock \u0027callback_mutex\u0027 to evaluate.)\n\nThis patch removes the cpuset_zone_allowed() call, and forces the caller to\nexplicitly choose between the hardwall and the softwall case.\n\nIf the caller wants the gfp_mask to determine this choice, they should (1)\nbe sure they can sleep or that __GFP_HARDWALL is set, and (2) invoke the\ncpuset_zone_allowed_softwall() routine.\n\nThis adds another 100 or 200 bytes to the kernel text space, due to the few\nlines of nearly duplicate code at the top of both cpuset_zone_allowed_*\nroutines.  It should save a few instructions executed for the calls that\nturned into calls of cpuset_zone_allowed_hardwall, thanks to not having to\nset (before the call) then check (within the call) the __GFP_HARDWALL flag.\n\nFor the most critical call, from get_page_from_freelist(), the same\ninstructions are executed as before -- the old cpuset_zone_allowed()\nroutine it used to call is the same code as the\ncpuset_zone_allowed_softwall() routine that it calls now.\n\nNot a perfect win, but seems worth it, to reduce this chance of hitting a\nsleeping with irq off complaint again.\n\nSigned-off-by: Paul Jackson \u003cpj@sgi.com\u003e\nSigned-off-by: Andrew Morton \u003cakpm@osdl.org\u003e\nSigned-off-by: Linus Torvalds \u003ctorvalds@osdl.org\u003e\n"
    },
    {
      "commit": "15ad7cdcfd76450d4beebc789ec646664238184d",
      "tree": "279d05a76ae0906c23ee2de8c5684d95d9886ad3",
      "parents": [
        "4a08a9f68168e547c2baf100020e9b96cae5fbd1"
      ],
      "author": {
        "name": "Helge Deller",
        "email": "deller@gmx.de",
        "time": "Wed Dec 06 20:40:36 2006 -0800"
      },
      "committer": {
        "name": "Linus Torvalds",
        "email": "torvalds@woody.osdl.org",
        "time": "Thu Dec 07 08:39:46 2006 -0800"
      },
      "message": "[PATCH] struct seq_operations and struct file_operations constification\n\n - move some file_operations structs into the .rodata section\n\n - move static strings from policy_types[] array into the .rodata section\n\n - fix generic seq_operations usages, so that those structs may be defined\n   as \"const\" as well\n\n[akpm@osdl.org: couple of fixes]\nSigned-off-by: Helge Deller \u003cdeller@gmx.de\u003e\nSigned-off-by: Andrew Morton \u003cakpm@osdl.org\u003e\nSigned-off-by: Linus Torvalds \u003ctorvalds@osdl.org\u003e\n"
    },
    {
      "commit": "9276b1bc96a132f4068fdee00983c532f43d3a26",
      "tree": "04d64444cf6558632cfc7514b5437578b5e616af",
      "parents": [
        "89689ae7f95995723fbcd5c116c47933a3bb8b13"
      ],
      "author": {
        "name": "Paul Jackson",
        "email": "pj@sgi.com",
        "time": "Wed Dec 06 20:31:48 2006 -0800"
      },
      "committer": {
        "name": "Linus Torvalds",
        "email": "torvalds@woody.osdl.org",
        "time": "Thu Dec 07 08:39:20 2006 -0800"
      },
      "message": "[PATCH] memory page_alloc zonelist caching speedup\n\nOptimize the critical zonelist scanning for free pages in the kernel memory\nallocator by caching the zones that were found to be full recently, and\nskipping them.\n\nRemembers the zones in a zonelist that were short of free memory in the\nlast second.  And it stashes a zone-to-node table in the zonelist struct,\nto optimize that conversion (minimize its cache footprint.)\n\nRecent changes:\n\n    This differs in a significant way from a similar patch that I\n    posted a week ago.  Now, instead of having a nodemask_t of\n    recently full nodes, I have a bitmask of recently full zones.\n    This solves a problem that last weeks patch had, which on\n    systems with multiple zones per node (such as DMA zone) would\n    take seeing any of these zones full as meaning that all zones\n    on that node were full.\n\n    Also I changed names - from \"zonelist faster\" to \"zonelist cache\",\n    as that seemed to better convey what we\u0027re doing here - caching\n    some of the key zonelist state (for faster access.)\n\n    See below for some performance benchmark results.  After all that\n    discussion with David on why I didn\u0027t need them, I went and got\n    some ;).  I wanted to verify that I had not hurt the normal case\n    of memory allocation noticeably.  At least for my one little\n    microbenchmark, I found (1) the normal case wasn\u0027t affected, and\n    (2) workloads that forced scanning across multiple nodes for\n    memory improved up to 10% fewer System CPU cycles and lower\n    elapsed clock time (\u0027sys\u0027 and \u0027real\u0027).  Good.  See details, below.\n\n    I didn\u0027t have the logic in get_page_from_freelist() for various\n    full nodes and zone reclaim failures correct.  That should be\n    fixed up now - notice the new goto labels zonelist_scan,\n    this_zone_full, and try_next_zone, in get_page_from_freelist().\n\nThere are two reasons I persued this alternative, over some earlier\nproposals that would have focused on optimizing the fake numa\nemulation case by caching the last useful zone:\n\n 1) Contrary to what I said before, we (SGI, on large ia64 sn2 systems)\n    have seen real customer loads where the cost to scan the zonelist\n    was a problem, due to many nodes being full of memory before\n    we got to a node we could use.  Or at least, I think we have.\n    This was related to me by another engineer, based on experiences\n    from some time past.  So this is not guaranteed.  Most likely, though.\n\n    The following approach should help such real numa systems just as\n    much as it helps fake numa systems, or any combination thereof.\n\n 2) The effort to distinguish fake from real numa, using node_distance,\n    so that we could cache a fake numa node and optimize choosing\n    it over equivalent distance fake nodes, while continuing to\n    properly scan all real nodes in distance order, was going to\n    require a nasty blob of zonelist and node distance munging.\n\n    The following approach has no new dependency on node distances or\n    zone sorting.\n\nSee comment in the patch below for a description of what it actually does.\n\nTechnical details of note (or controversy):\n\n - See the use of \"zlc_active\" and \"did_zlc_setup\" below, to delay\n   adding any work for this new mechanism until we\u0027ve looked at the\n   first zone in zonelist.  I figured the odds of the first zone\n   having the memory we needed were high enough that we should just\n   look there, first, then get fancy only if we need to keep looking.\n\n - Some odd hackery was needed to add items to struct zonelist, while\n   not tripping up the custom zonelists built by the mm/mempolicy.c\n   code for MPOL_BIND.  My usual wordy comments below explain this.\n   Search for \"MPOL_BIND\".\n\n - Some per-node data in the struct zonelist is now modified frequently,\n   with no locking.  Multiple CPU cores on a node could hit and mangle\n   this data.  The theory is that this is just performance hint data,\n   and the memory allocator will work just fine despite any such mangling.\n   The fields at risk are the struct \u0027zonelist_cache\u0027 fields \u0027fullzones\u0027\n   (a bitmask) and \u0027last_full_zap\u0027 (unsigned long jiffies).  It should\n   all be self correcting after at most a one second delay.\n\n - This still does a linear scan of the same lengths as before.  All\n   I\u0027ve optimized is making the scan faster, not algorithmically\n   shorter.  It is now able to scan a compact array of \u0027unsigned\n   short\u0027 in the case of many full nodes, so one cache line should\n   cover quite a few nodes, rather than each node hitting another\n   one or two new and distinct cache lines.\n\n - If both Andi and Nick don\u0027t find this too complicated, I will be\n   (pleasantly) flabbergasted.\n\n - I removed the comment claiming we only use one cachline\u0027s worth of\n   zonelist.  We seem, at least in the fake numa case, to have put the\n   lie to that claim.\n\n - I pay no attention to the various watermarks and such in this performance\n   hint.  A node could be marked full for one watermark, and then skipped\n   over when searching for a page using a different watermark.  I think\n   that\u0027s actually quite ok, as it will tend to slightly increase the\n   spreading of memory over other nodes, away from a memory stressed node.\n\n\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\n\nPerformance - some benchmark results and analysis:\n\nThis benchmark runs a memory hog program that uses multiple\nthreads to touch alot of memory as quickly as it can.\n\nMultiple runs were made, touching 12, 38, 64 or 90 GBytes out of\nthe total 96 GBytes on the system, and using 1, 19, 37, or 55\nthreads (on a 56 CPU system.)  System, user and real (elapsed)\ntimings were recorded for each run, shown in units of seconds,\nin the table below.\n\nTwo kernels were tested - 2.6.18-mm3 and the same kernel with\nthis zonelist caching patch added.  The table also shows the\npercentage improvement the zonelist caching sys time is over\n(lower than) the stock *-mm kernel.\n\n      number     2.6.18-mm3\t   zonelist-cache    delta (\u003c 0 good)\tpercent\n GBs    N  \t------------\t   --------------    ----------------\tsystime\n mem threads   sys user  real\t  sys  user  real     sys  user  real\t better\n  12\t 1     153   24   177\t  151\t 24   176      -2     0    -1\t   1%\n  12\t19\t99   22     8\t   99\t 22\t8\t0     0     0\t   0%\n  12\t37     111   25     6\t  112\t 25\t6\t1     0     0\t  -0%\n  12\t55     115   25     5\t  110\t 23\t5      -5    -2     0\t   4%\n  38\t 1     502   74   576\t  497\t 73   570      -5    -1    -6\t   0%\n  38\t19     426   78    48\t  373\t 76    39     -53    -2    -9\t  12%\n  38\t37     544   83    36\t  547\t 82    36\t3    -1     0\t  -0%\n  38\t55     501   77    23\t  511\t 80    24      10     3     1\t  -1%\n  64\t 1     917  125  1042\t  890\t124  1014     -27    -1   -28\t   2%\n  64\t19    1118  138   119\t  965\t141   103    -153     3   -16\t  13%\n  64\t37    1202  151    94\t 1136\t150    81     -66    -1   -13\t   5%\n  64\t55    1118  141    61\t 1072\t140    58     -46    -1    -3\t   4%\n  90\t 1    1342  177  1519\t 1275\t174  1450     -67    -3   -69\t   4%\n  90\t19    2392  199   192\t 2116\t189   176    -276   -10   -16\t  11%\n  90\t37    3313  238   175\t 2972\t225   145    -341   -13   -30\t  10%\n  90\t55    1948  210   104\t 1843\t213   100    -105     3    -4\t   5%\n\nNotes:\n 1) This test ran a memory hog program that started a specified number N of\n    threads, and had each thread allocate and touch 1/N\u0027th of\n    the total memory to be used in the test run in a single loop,\n    writing a constant word to memory, one store every 4096 bytes.\n    Watching this test during some earlier trial runs, I would see\n    each of these threads sit down on one CPU and stay there, for\n    the remainder of the pass, a different CPU for each thread.\n\n 2) The \u0027real\u0027 column is not comparable to the \u0027sys\u0027 or \u0027user\u0027 columns.\n    The \u0027real\u0027 column is seconds wall clock time elapsed, from beginning\n    to end of that test pass.  The \u0027sys\u0027 and \u0027user\u0027 columns are total\n    CPU seconds spent on that test pass.  For a 19 thread test run,\n    for example, the sum of \u0027sys\u0027 and \u0027user\u0027 could be up to 19 times the\n    number of \u0027real\u0027 elapsed wall clock seconds.\n\n 3) Tests were run on a fresh, single-user boot, to minimize the amount\n    of memory already in use at the start of the test, and to minimize\n    the amount of background activity that might interfere.\n\n 4) Tests were done on a 56 CPU, 28 Node system with 96 GBytes of RAM.\n\n 5) Notice that the \u0027real\u0027 time gets large for the single thread runs, even\n    though the measured \u0027sys\u0027 and \u0027user\u0027 times are modest.  I\u0027m not sure what\n    that means - probably something to do with it being slow for one thread to\n    be accessing memory along ways away.  Perhaps the fake numa system, running\n    ostensibly the same workload, would not show this substantial degradation\n    of \u0027real\u0027 time for one thread on many nodes -- lets hope not.\n\n 6) The high thread count passes (one thread per CPU - on 55 of 56 CPUs)\n    ran quite efficiently, as one might expect.  Each pair of threads needed\n    to allocate and touch the memory on the node the two threads shared, a\n    pleasantly parallizable workload.\n\n 7) The intermediate thread count passes, when asking for alot of memory forcing\n    them to go to a few neighboring nodes, improved the most with this zonelist\n    caching patch.\n\nConclusions:\n * This zonelist cache patch probably makes little difference one way or the\n   other for most workloads on real numa hardware, if those workloads avoid\n   heavy off node allocations.\n * For memory intensive workloads requiring substantial off-node allocations\n   on real numa hardware, this patch improves both kernel and elapsed timings\n   up to ten per-cent.\n * For fake numa systems, I\u0027m optimistic, but will have to leave that up to\n   Rohit Seth to actually test (once I get him a 2.6.18 backport.)\n\nSigned-off-by: Paul Jackson \u003cpj@sgi.com\u003e\nCc: Rohit Seth \u003crohitseth@google.com\u003e\nCc: Christoph Lameter \u003cclameter@engr.sgi.com\u003e\nCc: David Rientjes \u003crientjes@cs.washington.edu\u003e\nCc: Paul Menage \u003cmenage@google.com\u003e\nSigned-off-by: Andrew Morton \u003cakpm@osdl.org\u003e\nSigned-off-by: Linus Torvalds \u003ctorvalds@osdl.org\u003e\n"
    },
    {
      "commit": "38837fc75acb7fa9b0e111b0241fe4fe76c5d4b3",
      "tree": "51508cbc49527e35921efb4ba31bca7da9795ad2",
      "parents": [
        "af3ffa6758dbd2ab7ebe62dddf66b3aa94d64eeb"
      ],
      "author": {
        "name": "Paul Jackson",
        "email": "pj@sgi.com",
        "time": "Fri Sep 29 02:01:16 2006 -0700"
      },
      "committer": {
        "name": "Linus Torvalds",
        "email": "torvalds@g5.osdl.org",
        "time": "Fri Sep 29 09:18:21 2006 -0700"
      },
      "message": "[PATCH] cpuset: top_cpuset tracks hotplug changes to node_online_map\n\nChange the list of memory nodes allowed to tasks in the top (root) nodeset\nto dynamically track what cpus are online, using a call to a cpuset hook\nfrom the memory hotplug code.  Make this top cpus file read-only.\n\nOn systems that have cpusets configured in their kernel, but that aren\u0027t\nactively using cpusets (for some distros, this covers the majority of\nsystems) all tasks end up in the top cpuset.\n\nIf that system does support memory hotplug, then these tasks cannot make\nuse of memory nodes that are added after system boot, because the memory\nnodes are not allowed in the top cpuset.  This is a surprising regression\nover earlier kernels that didn\u0027t have cpusets enabled.\n\nOne key motivation for this change is to remain consistent with the\nbehaviour for the top_cpuset\u0027s \u0027cpus\u0027, which is also read-only, and which\nautomatically tracks the cpu_online_map.\n\nThis change also has the minor benefit that it fixes a long standing,\nlittle noticed, minor bug in cpusets.  The cpuset performance tweak to\nshort circuit the cpuset_zone_allowed() check on systems with just a single\ncpuset (see \u0027number_of_cpusets\u0027, in linux/cpuset.h) meant that simply\nchanging the \u0027mems\u0027 of the top_cpuset had no affect, even though the change\n(the write system call) appeared to succeed.  With the following change,\nthat write to the \u0027mems\u0027 file fails -EACCES, and the \u0027mems\u0027 file stubbornly\nrefuses to be changed via user space writes.  Thus no one should be mislead\ninto thinking they\u0027ve changed the top_cpusets\u0027s \u0027mems\u0027 when in affect they\nhaven\u0027t.\n\nIn order to keep the behaviour of cpusets consistent between systems\nactively making use of them and systems not using them, this patch changes\nthe behaviour of the \u0027mems\u0027 file in the top (root) cpuset, making it read\nonly, and making it automatically track the value of node_online_map.  Thus\ntasks in the top cpuset will have automatic use of hot plugged memory nodes\nallowed by their cpuset.\n\n[akpm@osdl.org: build fix]\n[bunk@stusta.de: build fix]\nSigned-off-by: Paul Jackson \u003cpj@sgi.com\u003e\nSigned-off-by: Adrian Bunk \u003cbunk@stusta.de\u003e\nSigned-off-by: Andrew Morton \u003cakpm@osdl.org\u003e\nSigned-off-by: Linus Torvalds \u003ctorvalds@osdl.org\u003e\n"
    },
    {
      "commit": "825a46af5ac171f9f41f794a0a00165588ba1589",
      "tree": "b690fe9d809d7b047f0393097fc79892e1217d98",
      "parents": [
        "8a39cc60bfa5a72f32d975729a354daca124f6de"
      ],
      "author": {
        "name": "Paul Jackson",
        "email": "pj@sgi.com",
        "time": "Fri Mar 24 03:16:03 2006 -0800"
      },
      "committer": {
        "name": "Linus Torvalds",
        "email": "torvalds@g5.osdl.org",
        "time": "Fri Mar 24 07:33:22 2006 -0800"
      },
      "message": "[PATCH] cpuset memory spread basic implementation\n\nThis patch provides the implementation and cpuset interface for an alternative\nmemory allocation policy that can be applied to certain kinds of memory\nallocations, such as the page cache (file system buffers) and some slab caches\n(such as inode caches).\n\nThe policy is called \"memory spreading.\" If enabled, it spreads out these\nkinds of memory allocations over all the nodes allowed to a task, instead of\npreferring to place them on the node where the task is executing.\n\nAll other kinds of allocations, including anonymous pages for a tasks stack\nand data regions, are not affected by this policy choice, and continue to be\nallocated preferring the node local to execution, as modified by the NUMA\nmempolicy.\n\nThere are two boolean flag files per cpuset that control where the kernel\nallocates pages for the file system buffers and related in kernel data\nstructures.  They are called \u0027memory_spread_page\u0027 and \u0027memory_spread_slab\u0027.\n\nIf the per-cpuset boolean flag file \u0027memory_spread_page\u0027 is set, then the\nkernel will spread the file system buffers (page cache) evenly over all the\nnodes that the faulting task is allowed to use, instead of preferring to put\nthose pages on the node where the task is running.\n\nIf the per-cpuset boolean flag file \u0027memory_spread_slab\u0027 is set, then the\nkernel will spread some file system related slab caches, such as for inodes\nand dentries evenly over all the nodes that the faulting task is allowed to\nuse, instead of preferring to put those pages on the node where the task is\nrunning.\n\nThe implementation is simple.  Setting the cpuset flags \u0027memory_spread_page\u0027\nor \u0027memory_spread_cache\u0027 turns on the per-process flags PF_SPREAD_PAGE or\nPF_SPREAD_SLAB, respectively, for each task that is in the cpuset or\nsubsequently joins that cpuset.  In subsequent patches, the page allocation\ncalls for the affected page cache and slab caches are modified to perform an\ninline check for these flags, and if set, a call to a new routine\ncpuset_mem_spread_node() returns the node to prefer for the allocation.\n\nThe cpuset_mem_spread_node() routine is also simple.  It uses the value of a\nper-task rotor cpuset_mem_spread_rotor to select the next node in the current\ntasks mems_allowed to prefer for the allocation.\n\nThis policy can provide substantial improvements for jobs that need to place\nthread local data on the corresponding node, but that need to access large\nfile system data sets that need to be spread across the several nodes in the\njobs cpuset in order to fit.  Without this patch, especially for jobs that\nmight have one thread reading in the data set, the memory allocation across\nthe nodes in the jobs cpuset can become very uneven.\n\nA couple of Copyright year ranges are updated as well.  And a couple of email\naddresses that can be found in the MAINTAINERS file are removed.\n\nSigned-off-by: Paul Jackson \u003cpj@sgi.com\u003e\nSigned-off-by: Andrew Morton \u003cakpm@osdl.org\u003e\nSigned-off-by: Linus Torvalds \u003ctorvalds@osdl.org\u003e\n"
    },
    {
      "commit": "505970b96e3b7d22177c38e03435a68376628e7a",
      "tree": "5508317e391961355bf3d946a6aac05bb21569eb",
      "parents": [
        "ed68cb3676bb179768529aeb808403d57295af56"
      ],
      "author": {
        "name": "Paul Jackson",
        "email": "pj@sgi.com",
        "time": "Sat Jan 14 13:21:06 2006 -0800"
      },
      "committer": {
        "name": "Linus Torvalds",
        "email": "torvalds@g5.osdl.org",
        "time": "Sat Jan 14 18:27:10 2006 -0800"
      },
      "message": "[PATCH] cpuset oom lock fix\n\nThe problem, reported in:\n\n  http://bugzilla.kernel.org/show_bug.cgi?id\u003d5859\n\nand by various other email messages and lkml posts is that the cpuset hook\nin the oom (out of memory) code can try to take a cpuset semaphore while\nholding the tasklist_lock (a spinlock).\n\nOne must not sleep while holding a spinlock.\n\nThe fix seems easy enough - move the cpuset semaphore region outside the\ntasklist_lock region.\n\nThis required a few lines of mechanism to implement.  The oom code where\nthe locking needs to be changed does not have access to the cpuset locks,\nwhich are internal to kernel/cpuset.c only.  So I provided a couple more\ncpuset interface routines, available to the rest of the kernel, which\nsimple take and drop the lock needed here (cpusets callback_sem).\n\nSigned-off-by: Paul Jackson \u003cpj@sgi.com\u003e\nSigned-off-by: Andrew Morton \u003cakpm@osdl.org\u003e\nSigned-off-by: Linus Torvalds \u003ctorvalds@osdl.org\u003e\n"
    },
    {
      "commit": "c417f0242ebe578924a30d4e53d35b5059fed4e7",
      "tree": "3058c7c79aedb11e7013f5faca34eb07e9a761bd",
      "parents": [
        "04c19fa6f16047abff2288ddbc1f0798ede5a849"
      ],
      "author": {
        "name": "Paul Jackson",
        "email": "pj@sgi.com",
        "time": "Sun Jan 08 01:02:01 2006 -0800"
      },
      "committer": {
        "name": "Linus Torvalds",
        "email": "torvalds@g5.osdl.org",
        "time": "Sun Jan 08 20:13:44 2006 -0800"
      },
      "message": "[PATCH] cpuset: remove test for null cpuset from alloc code path\n\nRemove a couple of more lines of code from the cpuset hooks in the page\nallocation code path.\n\nThere was a check for a NULL cpuset pointer in the routine\ncpuset_update_task_memory_state() that was only needed during system boot,\nafter the memory subsystem was initialized, before the cpuset subsystem was\ninitialized, to catch a NULL task-\u003ecpuset pointer.\n\nAdd a cpuset_init_early() routine, just before the mem_init() call in\ninit/main.c, that sets up just enough of the init tasks cpuset structure to\nrender cpuset_update_task_memory_state() calls harmless.\n\nSigned-off-by: Paul Jackson \u003cpj@sgi.com\u003e\nSigned-off-by: Andrew Morton \u003cakpm@osdl.org\u003e\nSigned-off-by: Linus Torvalds \u003ctorvalds@osdl.org\u003e\n"
    },
    {
      "commit": "202f72d5d1b5c2c084f63ef996c736d208b447b5",
      "tree": "f50551f9588f9090fee17130614e17a2dd64c656",
      "parents": [
        "74cb21553f4bf244185b9bec4c26e4e3169ad55e"
      ],
      "author": {
        "name": "Paul Jackson",
        "email": "pj@sgi.com",
        "time": "Sun Jan 08 01:01:57 2006 -0800"
      },
      "committer": {
        "name": "Linus Torvalds",
        "email": "torvalds@g5.osdl.org",
        "time": "Sun Jan 08 20:13:44 2006 -0800"
      },
      "message": "[PATCH] cpuset: number_of_cpusets optimization\n\nEasy little optimization hack to avoid actually having to call\ncpuset_zone_allowed() and check mems_allowed, in the main page allocation\nroutine, __alloc_pages().  This saves several CPU cycles per page allocation\non systems not using cpusets.\n\nA counter is updated each time a cpuset is created or removed, and whenever\nthere is only one cpuset in the system, it must be the root cpuset, which\ncontains all CPUs and all Memory Nodes.  In that case, when the counter is\none, all allocations are allowed.\n\nSigned-off-by: Paul Jackson \u003cpj@sgi.com\u003e\nSigned-off-by: Andrew Morton \u003cakpm@osdl.org\u003e\nSigned-off-by: Linus Torvalds \u003ctorvalds@osdl.org\u003e\n"
    },
    {
      "commit": "909d75a3b77bdd8baa9429bad3b69a654d2954ce",
      "tree": "f9955ff697b7569fc75e5b8683d886315f34ac49",
      "parents": [
        "cf2a473c4089aa41c26f653200673f5a4cc25047"
      ],
      "author": {
        "name": "Paul Jackson",
        "email": "pj@sgi.com",
        "time": "Sun Jan 08 01:01:55 2006 -0800"
      },
      "committer": {
        "name": "Linus Torvalds",
        "email": "torvalds@g5.osdl.org",
        "time": "Sun Jan 08 20:13:44 2006 -0800"
      },
      "message": "[PATCH] cpuset: implement cpuset_mems_allowed\n\nProvide a cpuset_mems_allowed() method, which the sys_migrate_pages() code\nneeded, to obtain the mems_allowed vector of a cpuset, and replaced the\nworkaround in sys_migrate_pages() to call this new method.\n\nSigned-off-by: Paul Jackson \u003cpj@sgi.com\u003e\nSigned-off-by: Andrew Morton \u003cakpm@osdl.org\u003e\nSigned-off-by: Linus Torvalds \u003ctorvalds@osdl.org\u003e\n"
    },
    {
      "commit": "cf2a473c4089aa41c26f653200673f5a4cc25047",
      "tree": "0bce21f4684a382b13e93ba5b85409cf5eab1c2c",
      "parents": [
        "b4b2641843db124637fa3d2cb2101982035dcc82"
      ],
      "author": {
        "name": "Paul Jackson",
        "email": "pj@sgi.com",
        "time": "Sun Jan 08 01:01:54 2006 -0800"
      },
      "committer": {
        "name": "Linus Torvalds",
        "email": "torvalds@g5.osdl.org",
        "time": "Sun Jan 08 20:13:43 2006 -0800"
      },
      "message": "[PATCH] cpuset: combine refresh_mems and update_mems\n\nThe important code paths through alloc_pages_current() and alloc_page_vma(),\nby which most kernel page allocations go, both called\ncpuset_update_current_mems_allowed(), which in turn called refresh_mems().\n-Both- of these latter two routines did a tasklock, got the tasks cpuset\npointer, and checked for out of date cpuset-\u003emems_generation.\n\nThat was a silly duplication of code and waste of CPU cycles on an important\ncode path.\n\nConsolidated those two routines into a single routine, called\ncpuset_update_task_memory_state(), since it updates more than just\nmems_allowed.\n\nChanged all callers of either routine to call the new consolidated routine.\n\nSigned-off-by: Paul Jackson \u003cpj@sgi.com\u003e\nSigned-off-by: Andrew Morton \u003cakpm@osdl.org\u003e\nSigned-off-by: Linus Torvalds \u003ctorvalds@osdl.org\u003e\n"
    },
    {
      "commit": "3e0d98b9f1eb757fc98efc84e74e54a08308aa73",
      "tree": "7cf1c75994f734ede7ec89373de640c4a58b237a",
      "parents": [
        "5966514db662fb24c9bb43226a80106bcffd51f8"
      ],
      "author": {
        "name": "Paul Jackson",
        "email": "pj@sgi.com",
        "time": "Sun Jan 08 01:01:49 2006 -0800"
      },
      "committer": {
        "name": "Linus Torvalds",
        "email": "torvalds@g5.osdl.org",
        "time": "Sun Jan 08 20:13:42 2006 -0800"
      },
      "message": "[PATCH] cpuset: memory pressure meter\n\nProvide a simple per-cpuset metric of memory pressure, tracking the -rate-\nthat the tasks in a cpuset call try_to_free_pages(), the synchronous\n(direct) memory reclaim code.\n\nThis enables batch managers monitoring jobs running in dedicated cpusets to\nefficiently detect what level of memory pressure that job is causing.\n\nThis is useful both on tightly managed systems running a wide mix of\nsubmitted jobs, which may choose to terminate or reprioritize jobs that are\ntrying to use more memory than allowed on the nodes assigned them, and with\ntightly coupled, long running, massively parallel scientific computing jobs\nthat will dramatically fail to meet required performance goals if they\nstart to use more memory than allowed to them.\n\nThis patch just provides a very economical way for the batch manager to\nmonitor a cpuset for signs of memory pressure.  It\u0027s up to the batch\nmanager or other user code to decide what to do about it and take action.\n\n\u003d\u003d\u003e Unless this feature is enabled by writing \"1\" to the special file\n    /dev/cpuset/memory_pressure_enabled, the hook in the rebalance\n    code of __alloc_pages() for this metric reduces to simply noticing\n    that the cpuset_memory_pressure_enabled flag is zero.  So only\n    systems that enable this feature will compute the metric.\n\nWhy a per-cpuset, running average:\n\n    Because this meter is per-cpuset, rather than per-task or mm, the\n    system load imposed by a batch scheduler monitoring this metric is\n    sharply reduced on large systems, because a scan of the tasklist can be\n    avoided on each set of queries.\n\n    Because this meter is a running average, instead of an accumulating\n    counter, a batch scheduler can detect memory pressure with a single\n    read, instead of having to read and accumulate results for a period of\n    time.\n\n    Because this meter is per-cpuset rather than per-task or mm, the\n    batch scheduler can obtain the key information, memory pressure in a\n    cpuset, with a single read, rather than having to query and accumulate\n    results over all the (dynamically changing) set of tasks in the cpuset.\n\nA per-cpuset simple digital filter (requires a spinlock and 3 words of data\nper-cpuset) is kept, and updated by any task attached to that cpuset, if it\nenters the synchronous (direct) page reclaim code.\n\nA per-cpuset file provides an integer number representing the recent\n(half-life of 10 seconds) rate of direct page reclaims caused by the tasks\nin the cpuset, in units of reclaims attempted per second, times 1000.\n\nSigned-off-by: Paul Jackson \u003cpj@sgi.com\u003e\nSigned-off-by: Andrew Morton \u003cakpm@osdl.org\u003e\nSigned-off-by: Linus Torvalds \u003ctorvalds@osdl.org\u003e\n"
    },
    {
      "commit": "5966514db662fb24c9bb43226a80106bcffd51f8",
      "tree": "9c6d8f4f6fee0d6574de7e225141d37b28811dc3",
      "parents": [
        "96b7f34143c2c823a6a750fcb758fc66c44945d2"
      ],
      "author": {
        "name": "Paul Jackson",
        "email": "pj@sgi.com",
        "time": "Sun Jan 08 01:01:47 2006 -0800"
      },
      "committer": {
        "name": "Linus Torvalds",
        "email": "torvalds@g5.osdl.org",
        "time": "Sun Jan 08 20:13:42 2006 -0800"
      },
      "message": "[PATCH] cpuset: mempolicy one more nodemask conversion\n\nFinish converting mm/mempolicy.c from bitmaps to nodemasks.  The previous\nconversion had left one routine using bitmaps, since it involved a\ncorresponding change to kernel/cpuset.c\n\nFix that interface by replacing with a simple macro that calls nodes_subset(),\nor if !CONFIG_CPUSET, returns (1).\n\nSigned-off-by: Paul Jackson \u003cpj@sgi.com\u003e\nCc: Christoph Lameter \u003cchristoph@lameter.com\u003e\nCc: Andi Kleen \u003cak@muc.de\u003e\nSigned-off-by: Andrew Morton \u003cakpm@osdl.org\u003e\nSigned-off-by: Linus Torvalds \u003ctorvalds@osdl.org\u003e\n"
    },
    {
      "commit": "dd0fc66fb33cd610bc1a5db8a5e232d34879b4d7",
      "tree": "51f96a9db96293b352e358f66032e1f4ff79fafb",
      "parents": [
        "3b0e77bd144203a507eb191f7117d2c5004ea1de"
      ],
      "author": {
        "name": "Al Viro",
        "email": "viro@ftp.linux.org.uk",
        "time": "Fri Oct 07 07:46:04 2005 +0100"
      },
      "committer": {
        "name": "Linus Torvalds",
        "email": "torvalds@g5.osdl.org",
        "time": "Sat Oct 08 15:00:57 2005 -0700"
      },
      "message": "[PATCH] gfp flags annotations - part 1\n\n - added typedef unsigned int __nocast gfp_t;\n\n - replaced __nocast uses for gfp flags with gfp_t - it gives exactly\n   the same warnings as far as sparse is concerned, doesn\u0027t change\n   generated code (from gcc point of view we replaced unsigned int with\n   typedef) and documents what\u0027s going on far better.\n\nSigned-off-by: Al Viro \u003cviro@zeniv.linux.org.uk\u003e\nSigned-off-by: Linus Torvalds \u003ctorvalds@osdl.org\u003e\n"
    },
    {
      "commit": "ef08e3b4981aebf2ba9bd7025ef7210e8eec07ce",
      "tree": "3b5386e011c87dde384115c8eb0d6961c2536025",
      "parents": [
        "9bf2229f8817677127a60c177aefce1badd22d7b"
      ],
      "author": {
        "name": "Paul Jackson",
        "email": "pj@sgi.com",
        "time": "Tue Sep 06 15:18:13 2005 -0700"
      },
      "committer": {
        "name": "Linus Torvalds",
        "email": "torvalds@g5.osdl.org",
        "time": "Wed Sep 07 16:57:40 2005 -0700"
      },
      "message": "[PATCH] cpusets: confine oom_killer to mem_exclusive cpuset\n\nNow the real motivation for this cpuset mem_exclusive patch series seems\ntrivial.\n\nThis patch keeps a task in or under one mem_exclusive cpuset from provoking an\noom kill of a task under a non-overlapping mem_exclusive cpuset.  Since only\ninterrupt and GFP_ATOMIC allocations are allowed to escape mem_exclusive\ncontainment, there is little to gain from oom killing a task under a\nnon-overlapping mem_exclusive cpuset, as almost all kernel and user memory\nallocation must come from disjoint memory nodes.\n\nThis patch enables configuring a system so that a runaway job under one\nmem_exclusive cpuset cannot cause the killing of a job in another such cpuset\nthat might be using very high compute and memory resources for a prolonged\ntime.\n\nSigned-off-by: Paul Jackson \u003cpj@sgi.com\u003e\nSigned-off-by: Andrew Morton \u003cakpm@osdl.org\u003e\nSigned-off-by: Linus Torvalds \u003ctorvalds@osdl.org\u003e\n"
    },
    {
      "commit": "9bf2229f8817677127a60c177aefce1badd22d7b",
      "tree": "06e95863a26b197233081db1dafd869dfd231950",
      "parents": [
        "f90b1d2f1aaaa40c6519a32e69615edc25bb97d5"
      ],
      "author": {
        "name": "Paul Jackson",
        "email": "pj@sgi.com",
        "time": "Tue Sep 06 15:18:12 2005 -0700"
      },
      "committer": {
        "name": "Linus Torvalds",
        "email": "torvalds@g5.osdl.org",
        "time": "Wed Sep 07 16:57:40 2005 -0700"
      },
      "message": "[PATCH] cpusets: formalize intermediate GFP_KERNEL containment\n\nThis patch makes use of the previously underutilized cpuset flag\n\u0027mem_exclusive\u0027 to provide what amounts to another layer of memory placement\nresolution.  With this patch, there are now the following four layers of\nmemory placement available:\n\n 1) The whole system (interrupt and GFP_ATOMIC allocations can use this),\n 2) The nearest enclosing mem_exclusive cpuset (GFP_KERNEL allocations can use),\n 3) The current tasks cpuset (GFP_USER allocations constrained to here), and\n 4) Specific node placement, using mbind and set_mempolicy.\n\nThese nest - each layer is a subset (same or within) of the previous.\n\nLayer (2) above is new, with this patch.  The call used to check whether a\nzone (its node, actually) is in a cpuset (in its mems_allowed, actually) is\nextended to take a gfp_mask argument, and its logic is extended, in the case\nthat __GFP_HARDWALL is not set in the flag bits, to look up the cpuset\nhierarchy for the nearest enclosing mem_exclusive cpuset, to determine if\nplacement is allowed.  The definition of GFP_USER, which used to be identical\nto GFP_KERNEL, is changed to also set the __GFP_HARDWALL bit, in the previous\ncpuset_gfp_hardwall_flag patch.\n\nGFP_ATOMIC and GFP_KERNEL allocations will stay within the current tasks\ncpuset, so long as any node therein is not too tight on memory, but will\nescape to the larger layer, if need be.\n\nThe intended use is to allow something like a batch manager to handle several\njobs, each job in its own cpuset, but using common kernel memory for caches\nand such.  Swapper and oom_kill activity is also constrained to Layer (2).  A\ntask in or below one mem_exclusive cpuset should not cause swapping on nodes\nin another non-overlapping mem_exclusive cpuset, nor provoke oom_killing of a\ntask in another such cpuset.  Heavy use of kernel memory for i/o caching and\nsuch by one job should not impact the memory available to jobs in other\nnon-overlapping mem_exclusive cpusets.\n\nThis patch enables providing hardwall, inescapable cpusets for memory\nallocations of each job, while sharing kernel memory allocations between\nseveral jobs, in an enclosing mem_exclusive cpuset.\n\nLike Dinakar\u0027s patch earlier to enable administering sched domains using the\ncpu_exclusive flag, this patch also provides a useful meaning to a cpuset flag\nthat had previously done nothing much useful other than restrict what cpuset\nconfigurations were allowed.\n\nSigned-off-by: Paul Jackson \u003cpj@sgi.com\u003e\nSigned-off-by: Andrew Morton \u003cakpm@osdl.org\u003e\nSigned-off-by: Linus Torvalds \u003ctorvalds@osdl.org\u003e\n"
    },
    {
      "commit": "9a8488965dc4c42a4a1f84cab907c7d6c5cf1563",
      "tree": "58581a02cc06bb1a2991209c9e4d559353cbec6f",
      "parents": [
        "b52402c783d8c16b11f146a244bb21086a94bf84"
      ],
      "author": {
        "name": "Benoit Boissinot",
        "email": "benoit.boissinot@ens-lyon.org",
        "time": "Sat Apr 16 15:25:59 2005 -0700"
      },
      "committer": {
        "name": "Linus Torvalds",
        "email": "torvalds@ppc970.osdl.org",
        "time": "Sat Apr 16 15:25:59 2005 -0700"
      },
      "message": "[PATCH] cpuset: remove function attribute const\n\ngcc-4 warns with\ninclude/linux/cpuset.h:21: warning: type qualifiers ignored on function\nreturn type\n\ncpuset_cpus_allowed is declared with const\nextern const cpumask_t cpuset_cpus_allowed(const struct task_struct *p);\n\nFirst const should be __attribute__((const)), but the gcc manual\nexplains that:\n\n\"Note that a function that has pointer arguments and examines the data\npointed to must not be declared const. Likewise, a function that calls a\nnon-const function usually must not be const. It does not make sense for\na const function to return void.\"\n\nThe following patch remove const from the function declaration.\n\nSigned-off-by: Benoit Boissinot \u003cbenoit.boissinot@ens-lyon.org\u003e\nAcked-by: Paul Jackson \u003cpj@sgi.com\u003e\nSigned-off-by: Andrew Morton \u003cakpm@osdl.org\u003e\nSigned-off-by: Linus Torvalds \u003ctorvalds@osdl.org\u003e\n"
    },
    {
      "commit": "1da177e4c3f41524e886b7f1b8a0c1fc7321cac2",
      "tree": "0bba044c4ce775e45a88a51686b5d9f90697ea9d",
      "parents": [],
      "author": {
        "name": "Linus Torvalds",
        "email": "torvalds@ppc970.osdl.org",
        "time": "Sat Apr 16 15:20:36 2005 -0700"
      },
      "committer": {
        "name": "Linus Torvalds",
        "email": "torvalds@ppc970.osdl.org",
        "time": "Sat Apr 16 15:20:36 2005 -0700"
      },
      "message": "Linux-2.6.12-rc2\n\nInitial git repository build. I\u0027m not bothering with the full history,\neven though we have it. We can create a separate \"historical\" git\narchive of that later if we want to, and in the meantime it\u0027s about\n3.2GB when imported into git - space that would just make the early\ngit days unnecessarily complicated, when we don\u0027t have a lot of good\ninfrastructure for it.\n\nLet it rip!\n"
    }
  ]
}
