{
  "log": [
    {
      "commit": "64db4cfff99c04cd5f550357edcc8780f96b54a2",
      "tree": "4856e788d21f0e31ed78a22b70b4521f7237705e",
      "parents": [
        "d110ec3a1e1f522e2e9dfceb9c36d6590c26d2d4"
      ],
      "author": {
        "name": "Paul E. McKenney",
        "email": "paulmck@linux.vnet.ibm.com",
        "time": "Thu Dec 18 21:55:32 2008 +0100"
      },
      "committer": {
        "name": "Ingo Molnar",
        "email": "mingo@elte.hu",
        "time": "Thu Dec 18 21:56:04 2008 +0100"
      },
      "message": "\"Tree RCU\": scalable classic RCU implementation\n\nThis patch fixes a long-standing performance bug in classic RCU that\nresults in massive internal-to-RCU lock contention on systems with\nmore than a few hundred CPUs.  Although this patch creates a separate\nflavor of RCU for ease of review and patch maintenance, it is intended\nto replace classic RCU.\n\nThis patch still handles stress better than does mainline, so I am still\ncalling it ready for inclusion.  This patch is against the -tip tree.\nNevertheless, experience on an actual 1000+ CPU machine would still be\nmost welcome.\n\nMost of the changes noted below were found while creating an rcutiny\n(which should permit ejecting the current rcuclassic) and while doing\ndetailed line-by-line documentation.\n\nUpdates from v9 (http://lkml.org/lkml/2008/12/2/334):\n\no\tFixes from remainder of line-by-line code walkthrough,\n\tincluding comment spelling, initialization, undesirable\n\tnarrowing due to type conversion, removing redundant memory\n\tbarriers, removing redundant local-variable initialization,\n\tand removing redundant local variables.\n\n\tI do not believe that any of these fixes address the CPU-hotplug\n\tissues that Andi Kleen was seeing, but please do give it a whirl\n\tin case the machine is smarter than I am.\n\n\tA writeup from the walkthrough may be found at the following\n\tURL, in case you are suffering from terminal insomnia or\n\tmasochism:\n\n\thttp://www.kernel.org/pub/linux/kernel/people/paulmck/tmp/rcutree-walkthrough.2008.12.16a.pdf\n\no\tMade rcutree tracing use seq_file, as suggested some time\n\tago by Lai Jiangshan.\n\no\tAdded a .csv variant of the rcudata debugfs trace file, to allow\n\tpeople having thousands of CPUs to drop the data into\n\ta spreadsheet.\tTested with oocalc and gnumeric.  
Updated\n\tdocumentation to suit.\n\nUpdates from v8 (http://lkml.org/lkml/2008/11/15/139):\n\no\tFix a theoretical race between grace-period initialization and\n\tforce_quiescent_state() that could occur if more than three\n\tjiffies were required to carry out the grace-period\n\tinitialization.  Which it might, if you had enough CPUs.\n\no\tApply Ingo\u0027s printk-standardization patch.\n\no\tSubstitute local variables for repeated accesses to global\n\tvariables.\n\no\tFix comment misspellings and redundant (but harmless) increments\n\tof -\u003en_rcu_pending (this latter after having explicitly added it).\n\no\tApply checkpatch fixes.\n\nUpdates from v7 (http://lkml.org/lkml/2008/10/10/291):\n\no\tFixed a number of problems noted by Gautham Shenoy, including\n\tthe cpu-stall-detection bug that he was having difficulty\n\tconvincing me was real.  ;-)\n\no\tChanged cpu-stall detection to wait for ten seconds rather than\n\tthree in order to reduce false positives, as suggested by Ingo\n\tMolnar.\n\no\tProduced a design document (http://lwn.net/Articles/305782/).\n\tThe act of writing this document uncovered a number of both\n\ttheoretical and \"here and now\" bugs as noted below.\n\no\tFix dynticks_nesting accounting confusion, simplify WARN_ON()\n\tcondition, fix kerneldoc comments, and add memory barriers\n\tin dynticks interface functions.\n\no\tAdd more data to tracing.\n\no\tRemove unused \"rcu_barrier\" field from rcu_data structure.\n\no\tCount calls to rcu_pending() from scheduling-clock interrupt\n\tto use as a surrogate timebase should jiffies stop counting.\n\no\tFix a theoretical race between force_quiescent_state() and\n\tgrace-period initialization.  
Yes, initialization does have to\n\tgo on for some jiffies for this race to occur, but given enough\n\tCPUs...\n\nUpdates from v6 (http://lkml.org/lkml/2008/9/23/448):\n\no\tFix a number of checkpatch.pl complaints.\n\no\tApply review comments from Ingo Molnar and Lai Jiangshan\n\ton the stall-detection code.\n\no\tFix several bugs in !CONFIG_SMP builds.\n\no\tFix a misspelled config-parameter name so that RCU now announces\n\tat boot time if stall detection is configured.\n\no\tRun tests on numerous combinations of configuration parameters,\n\twhich, after the fixes above, now build and run correctly.\n\nUpdates from v5 (http://lkml.org/lkml/2008/9/15/92, bad subject line):\n\no\tFix a compiler error in the !CONFIG_FANOUT_EXACT case (blew a\n\tchangeset some time ago, and finally got around to retesting\n\tthis option).\n\no\tFix some tracing bugs in rcupreempt that caused incorrect\n\ttotals to be printed.\n\no\tI now test with a more brutal random-selection online/offline\n\tscript (attached).  Probably more brutal than it needs to be\n\ton the people reading it as well, but so it goes.\n\no\tA number of optimizations and usability improvements:\n\n\to\tMake rcu_pending() ignore the grace-period timeout when\n\t\tthere is no grace period in progress.\n\n\to\tMake force_quiescent_state() avoid going for a global\n\t\tlock in the case where there is no grace period in\n\t\tprogress.\n\n\to\tRearrange struct fields to improve struct layout.\n\n\to\tMake call_rcu() initiate a grace period if RCU was\n\t\tidle, rather than waiting for the next scheduling\n\t\tclock interrupt.\n\n\to\tInvoke rcu_irq_enter() and rcu_irq_exit() only when\n\t\tidle, as suggested by Andi Kleen.  
I still don\u0027t\n\t\tcompletely trust this change, and might back it out.\n\n\to\tMake CONFIG_RCU_TRACE be the single config variable\n\t\tmanipulated for all forms of RCU, instead of the prior\n\t\tconfusion.\n\n\to\tDocument tracing files and formats for both rcupreempt\n\t\tand rcutree.\n\nUpdates from v4 for those missing v5 given its bad subject line:\n\no\tSeparated dynticks interface so that NMIs and irqs call separate\n\tfunctions, greatly simplifying it.  In particular, this code\n\tno longer requires a proof of correctness.  ;-)\n\no\tSeparated dynticks state out into its own per-CPU structure,\n\tavoiding the duplicated accounting.\n\no\tThe case where a dynticks-idle CPU runs an irq handler that\n\tinvokes call_rcu() is now correctly handled, forcing that CPU\n\tout of dynticks-idle mode.\n\no\tReview comments have been applied (thank you all!!!).\n\tFor but one example, fixed the dynticks-ordering issue that\n\tManfred pointed out, saving me much debugging.  ;-)\n\no\tAdjusted rcuclassic and rcupreempt to handle dynticks changes.\n\nAttached is an updated patch to Classic RCU that applies a hierarchy,\ngreatly reducing the contention on the top-level lock for large machines.\nThis passes 10-hour concurrent rcutorture and online-offline testing on\n128-CPU ppc64 without dynticks enabled, and exposes some timekeeping\nbugs in presence of dynticks (exciting working on a system where\n\"sleep 1\" hangs until interrupted...), which were fixed in the\n2.6.27 kernel.  It is getting more reliable than mainline by some\nmeasures, so the next version will be against -tip for inclusion.\nSee also Manfred Spraul\u0027s recent patches (or his earlier work from\n2004 at http://marc.info/?l\u003dlinux-kernel\u0026m\u003d108546384711797\u0026w\u003d2).\nWe will converge onto a common patch in the fullness of time, but are\ncurrently exploring different regions of the design space.  
That said,\nI have already gratefully stolen quite a few of Manfred\u0027s ideas.\n\nThis patch provides CONFIG_RCU_FANOUT, which controls the bushiness\nof the RCU hierarchy.  Defaults to 32 on 32-bit machines and 64 on\n64-bit machines.  If CONFIG_NR_CPUS is less than CONFIG_RCU_FANOUT,\nthere is no hierarchy.  By default, the RCU initialization code will\nadjust CONFIG_RCU_FANOUT to balance the hierarchy, so strongly NUMA\narchitectures may choose to set CONFIG_RCU_FANOUT_EXACT to disable\nthis balancing, allowing the hierarchy to be exactly aligned to the\nunderlying hardware.  Up to two levels of hierarchy are permitted\n(in addition to the root node), allowing up to 16,384 CPUs on 32-bit\nsystems and up to 262,144 CPUs on 64-bit systems.  I just know that I\nam going to regret saying this, but this seems more than sufficient\nfor the foreseeable future.  (Some architectures might wish to set\nCONFIG_RCU_FANOUT\u003d4, which would limit such architectures to 64 CPUs.\nIf this becomes a real problem, additional levels can be added, but I\ndoubt that it will make a significant difference on real hardware.)\n\nIn the common case, a given CPU will manipulate its private rcu_data\nstructure and the rcu_node structure that it shares with its immediate\nneighbors.  This can reduce both lock and memory contention by multiple\norders of magnitude, which should eliminate the need for the strange\nmanipulations that are reported to be required when running Linux on\nvery large systems.\n\nSome shortcomings:\n\no\tMore bugs will probably surface as a result of an ongoing\n\tline-by-line code inspection.\n\n\tPatches will be provided as required.\n\no\tThere are probably hangs, rcutorture failures, \u0026c.  Seems\n\tquite stable on a 128-CPU machine, but that is kind of small\n\tcompared to 4096 CPUs.  
However, seems to do better than\n\tmainline.\n\n\tPatches will be provided as required.\n\no\tThe memory footprint of this version is several KB larger\n\tthan rcuclassic.\n\n\tA separate UP-only rcutiny patch will be provided, which will\n\treduce the memory footprint significantly, even compared\n\tto the old rcuclassic.  One such patch passes light testing,\n\tand has a memory footprint smaller even than rcuclassic.\n\tInitial reaction from various embedded guys was \"it is not\n\tworth it\", so am putting it aside.\n\nCredits:\n\no\tManfred Spraul for ideas, review comments, and bugs spotted,\n\tas well as some good friendly competition.  ;-)\n\no\tJosh Triplett, Ingo Molnar, Peter Zijlstra, Mathieu Desnoyers,\n\tLai Jiangshan, Andi Kleen, Andy Whitcroft, and Andrew Morton\n\tfor reviews and comments.\n\no\tThomas Gleixner for much-needed help with some timer issues\n\t(see patches below).\n\no\tJon M. Tollefson, Tim Pepper, Andrew Theurer, Jose R. Santos,\n\tAndy Whitcroft, Darrick Wong, Nishanth Aravamudan, Anton\n\tBlanchard, Dave Kleikamp, and Nathan Lynch for keeping machines\n\talive despite my heavy abuse^Wtesting.\n\nSigned-off-by: Paul E. McKenney \u003cpaulmck@linux.vnet.ibm.com\u003e\nSigned-off-by: Ingo Molnar \u003cmingo@elte.hu\u003e\n"
    }
  ]
}
