{
  "log": [
    {
      "commit": "5ead97c84fa7d63a6a7a2f4e9f18f452bd109045",
      "tree": "26f6bc55dce0f119f7d3c8d6b40d2f287601db36",
      "parents": [
        "a42089dd358a7673a0a23126589a9029e57c2049"
      ],
      "author": {
        "name": "Jeremy Fitzhardinge",
        "email": "jeremy@xensource.com",
        "time": "Tue Jul 17 18:37:04 2007 -0700"
      },
      "committer": {
        "name": "Jeremy Fitzhardinge",
        "email": "jeremy@goop.org",
        "time": "Wed Jul 18 08:47:42 2007 -0700"
      },
      "message": "xen: Core Xen implementation\n\nThis patch is a rollup of all the core pieces of the Xen\nimplementation, including:\n - booting and setup\n - pagetable setup\n - privileged instructions\n - segmentation\n - interrupt flags\n - upcalls\n - multicall batching\n\nBOOTING AND SETUP\n\nThe vmlinux image is decorated with ELF notes which tell the Xen\ndomain builder what the kernel\u0027s requirements are; the domain builder\nthen constructs the address space accordingly and starts the kernel.\n\nXen has its own entrypoint for the kernel (contained in an ELF note).\nThe ELF notes are set up by xen-head.S, which is included into head.S.\nIn principle it could be linked separately, but it seems to provoke\nlots of binutils bugs.\n\nBecause the domain builder starts the kernel in a fairly sane state\n(32-bit protected mode, paging enabled, flat segments set up), there\u0027s\nnot a lot of setup needed before starting the kernel proper.  The main\nsteps are:\n  1. Install the Xen paravirt_ops, which is simply a matter of a\n     structure assignment.\n  2. Set init_mm to use the Xen-supplied pagetables (analogous to the\n     head.S generated pagetables in a native boot).\n  3. Reserve address space for Xen, since it takes a chunk at the top\n     of the address space for its own use.\n  4. Call start_kernel()\n\nPAGETABLE SETUP\n\nOnce we hit the main kernel boot sequence, it will end up calling back\nvia paravirt_ops to set up various pieces of Xen specific state.  One\nof the critical things which requires a bit of extra care is the\nconstruction of the initial init_mm pagetable.  Because Xen places\ntight constraints on pagetables (an active pagetable must always be\nvalid, and must always be mapped read-only to the guest domain), we\nneed to be careful when constructing the new pagetable to keep these\nconstraints in mind.  
It turns out that the easiest way to do this is to\nuse the initial Xen-provided pagetable as a template, and then just\ninsert new mappings for memory where a mapping doesn\u0027t already exist.\n\nThis means that during pagetable setup, it uses a special version of\nxen_set_pte which ignores any attempt to remap a read-only page as\nread-write (since Xen will map its own initial pagetable as RO), but\nlets other changes to the ptes happen, so that things like NX are set\nproperly.\n\nPRIVILEGED INSTRUCTIONS AND SEGMENTATION\n\nWhen the kernel runs under Xen, it runs in ring 1 rather than ring 0.\nThis means that it is more privileged than user-mode in ring 3, but it\nstill can\u0027t run privileged instructions directly.  Non-performance\ncritical instructions are dealt with by taking a privilege exception\nand trapping into the hypervisor and emulating the instruction, but\nmore performance-critical instructions have their own specific\nparavirt_ops.  In many cases we can avoid having to do any hypercalls\nfor these instructions, or the Xen implementation is quite different\nfrom the normal native version.\n\nThe privileged instructions fall into the broad classes of:\n  Segmentation: setting up the GDT and the GDT entries, LDT,\n     TLS and so on.  Xen doesn\u0027t allow the GDT to be directly\n     modified; all GDT updates are done via hypercalls where the new\n     entries can be validated.  This is important because Xen uses\n     segment limits to prevent the guest kernel from damaging the\n     hypervisor itself.\n  Traps and exceptions: Xen uses a special format for trap entrypoints,\n     so when the kernel wants to set an IDT entry, it needs to be\n     converted to the form Xen expects.  Xen sets int 0x80 up specially\n     so that the trap goes straight from userspace into the guest kernel\n     without going via the hypervisor.  
sysenter isn\u0027t supported.\n  Kernel stack: The esp0 entry is extracted from the tss and provided to\n     Xen.\n  TLB operations: the various TLB calls are mapped into corresponding\n     Xen hypercalls.\n  Control registers: all the control registers are privileged.  The most\n     important is cr3, which points to the base of the current pagetable,\n     and we handle it specially.\n\nAnother instruction we treat specially is CPUID, even though it\u0027s not\nprivileged.  We want to control what CPU features are visible to the\nrest of the kernel, and so CPUID ends up going into a paravirt_op.\nXen implements this mainly to disable the ACPI and APIC subsystems.\n\nINTERRUPT FLAGS\n\nXen maintains its own separate flag for masking events, which is\ncontained within the per-cpu vcpu_info structure.  Because the guest\nkernel runs in ring 1 and not 0, the IF flag in EFLAGS is completely\nignored (and must be, because even if a guest domain disables\ninterrupts for itself, it can\u0027t disable them overall).\n\n(A note on terminology: \"events\" and interrupts are effectively\nsynonymous.  However, rather than using an \"enable flag\", Xen uses a\n\"mask flag\", which blocks event delivery when it is non-zero.)\n\nThere are paravirt_ops for each of cli/sti/save_fl/restore_fl, which\nare implemented to manage the Xen event mask state.  The only thing\nworth noting is that when events are unmasked, we need to explicitly\nsee if there\u0027s a pending event and call into the hypervisor to make\nsure it gets delivered.\n\nUPCALLS\n\nXen needs a couple of upcall (or callback) functions to be implemented\nby each guest.  One is the event upcall, which is how events\n(interrupts, effectively) are delivered to the guests.  The other is\nthe failsafe callback, which is used to report errors in either\nreloading a segment register, or caused by iret.  
These are\nimplemented in i386/kernel/entry.S so they can jump into the normal\niret_exc path when necessary.\n\nMULTICALL BATCHING\n\nXen provides a multicall mechanism, which allows multiple hypercalls\nto be issued at once in order to mitigate the cost of trapping into\nthe hypervisor.  This is particularly useful for context switches,\nsince the 4-5 hypercalls they would normally need (reload cr3, update\nTLS, maybe update LDT) can be reduced to one.  This patch implements a\ngeneric batching mechanism for hypercalls, which gets used in many\nplaces in the Xen code.\n\nSigned-off-by: Jeremy Fitzhardinge \u003cjeremy@xensource.com\u003e\nSigned-off-by: Chris Wright \u003cchrisw@sous-sol.org\u003e\nCc: Ian Pratt \u003cian.pratt@xensource.com\u003e\nCc: Christian Limpach \u003cChristian.Limpach@cl.cam.ac.uk\u003e\nCc: Adrian Bunk \u003cbunk@stusta.de\u003e\n"
    }
  ]
}
