In Linux 2.5 kernels (and later), USB device drivers have additional control
over how DMA may be used to perform I/O operations.  The APIs are detailed
in the kernel USB programming guide (kerneldoc, from the source code).


API OVERVIEW

The big picture is that USB drivers can continue to ignore most DMA issues,
though they still must provide DMA-ready buffers (see DMA-mapping.txt).
That's how they've worked through the 2.4 (and earlier) kernels.

OR:  they can now be DMA-aware.

- New calls enable DMA-aware drivers, letting them allocate DMA buffers and
  manage DMA mappings for existing DMA-ready buffers (see below).

- URBs have an additional "transfer_dma" field, as well as a transfer_flags
  bit indicating whether it is valid.  (Control requests also have
  "setup_dma" and a corresponding transfer_flags bit.)

- "usbcore" will map those DMA addresses itself, unless a DMA-aware driver
  has already done so and set URB_NO_TRANSFER_DMA_MAP or
  URB_NO_SETUP_DMA_MAP.  HCDs don't manage DMA mappings for URBs.

- There's a new "generic DMA API", parts of which are usable by USB device
  drivers.  Never use dma_set_mask() on any USB interface or device; that
  would potentially break all devices sharing that bus.

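The flag handshake described in the bullets above can be modeled in a few
lines of userspace C.  This is only a sketch, not usbcore code: struct
model_urb, model_map() and model_usbcore_prepare() are hypothetical names
invented for illustration, the flag values are placeholders for the
definitions in the kernel headers, and the "mapping" is faked arithmetic.

```c
#include <stddef.h>
#include <stdint.h>

/* Placeholder flag bits; the kernel headers hold the real definitions. */
#define URB_NO_TRANSFER_DMA_MAP 0x0004u
#define URB_NO_SETUP_DMA_MAP    0x0008u

typedef uint64_t dma_addr_t;    /* stand-in for the kernel type */

/* Hypothetical, cut-down view of the URB fields discussed above. */
struct model_urb {
    unsigned int transfer_flags;
    void *transfer_buffer;
    dma_addr_t transfer_dma;
    void *setup_packet;         /* non-NULL only for control requests */
    dma_addr_t setup_dma;
};

/* Hypothetical mapping helper: fakes a bus address from a CPU address. */
static dma_addr_t model_map(const void *cpu_addr)
{
    return (dma_addr_t)(uintptr_t)cpu_addr + 0x1000;
}

/*
 * Model of the rule "usbcore maps unless the driver already did":
 * each buffer is mapped here only when its NO_..._DMA_MAP bit is clear.
 * Returns the number of mappings this function had to create.
 */
int model_usbcore_prepare(struct model_urb *urb)
{
    int mapped = 0;

    if (urb->setup_packet &&
        !(urb->transfer_flags & URB_NO_SETUP_DMA_MAP)) {
        urb->setup_dma = model_map(urb->setup_packet);
        mapped++;
    }
    if (urb->transfer_buffer &&
        !(urb->transfer_flags & URB_NO_TRANSFER_DMA_MAP)) {
        urb->transfer_dma = model_map(urb->transfer_buffer);
        mapped++;
    }
    return mapped;
}
```

A driver that leaves both bits clear gets both addresses filled in for it;
a driver that sets a bit is promising that the matching DMA address field
is already valid, so the model leaves that field alone.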

ELIMINATING COPIES

It's good to avoid making CPUs copy data needlessly.  The costs can add up,
and effects like cache thrashing can impose subtle penalties.

- If you're doing lots of small data transfers from the same buffer all
  the time, that can really burn up resources on systems which use an
  IOMMU to manage the DMA mappings.  It can cost MUCH more to set up and
  tear down the IOMMU mappings with each request than to perform the I/O!

  For those specific cases, USB has primitives to allocate less expensive
  memory.  They work like kmalloc and kfree versions that give you the right
  kind of addresses to store in urb->transfer_buffer and urb->transfer_dma.
  You'd also set URB_NO_TRANSFER_DMA_MAP in urb->transfer_flags:

	void *usb_buffer_alloc (struct usb_device *dev, size_t size,
		int mem_flags, dma_addr_t *dma);

	void usb_buffer_free (struct usb_device *dev, size_t size,
		void *addr, dma_addr_t dma);

  Most drivers should *NOT* be using these primitives; they don't need
  to use this type of memory ("dma-coherent"), and memory returned from
  kmalloc() will work just fine.

  For control transfers you can choose independently, for the transfer
  buffer and for the setup buffer, whether to use the buffer primitives.
  Set the flag bits URB_NO_TRANSFER_DMA_MAP and URB_NO_SETUP_DMA_MAP to
  indicate which buffers you have prepared.  For non-control transfers
  URB_NO_SETUP_DMA_MAP is ignored.

  The memory buffer returned is "dma-coherent"; sometimes you might need to
  force a consistent memory access ordering by using memory barriers.  It's
  not using a streaming DMA mapping, so it's good for small transfers on
  systems where the I/O would otherwise thrash an IOMMU mapping.  (See
  Documentation/DMA-mapping.txt for definitions of "coherent" and "streaming"
  DMA mappings.)

  Asking for 1/Nth of a page (as well as asking for N pages) is reasonably
  space-efficient.

  On most systems the memory returned will be uncached, because the
  semantics of dma-coherent memory require either bypassing CPU caches
  or using cache hardware with bus-snooping support.  While x86 hardware
  has such bus-snooping, many other systems use software to flush cache
  lines to prevent DMA conflicts.

- Devices on some EHCI controllers could handle DMA to/from high memory.

  Unfortunately, the current Linux DMA infrastructure doesn't have a sane
  way to expose these capabilities ... and in any case, HIGHMEM is mostly a
  design wart specific to x86_32.  So your best bet is to ensure you never
  pass a highmem buffer into a USB driver.  That's easy; it's the default
  behavior.  Just don't override it, e.g. with NETIF_F_HIGHDMA.

  This may force your callers to do some bounce buffering, copying from
  high memory to "normal" DMA memory.  If you can come up with a good way
  to fix this issue (for x86_32 machines with over 1 GByte of memory),
  feel free to submit patches.

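The allocate-once, reuse-often pattern from the first bullet above might
look roughly like the following.  This is a userspace model under stated
assumptions, not driver code: usb_buffer_alloc() and usb_buffer_free() are
stand-ins that copy the prototypes quoted above but use malloc() and a faked
DMA address, struct urb is cut down to the fields mentioned in the text, the
flag value is a placeholder, and example_setup_urb() and
example_teardown_urb() are hypothetical helper names.

```c
#include <stdint.h>
#include <stdlib.h>

#define URB_NO_TRANSFER_DMA_MAP 0x0004u   /* placeholder value */

typedef uint64_t dma_addr_t;              /* stand-in for the kernel type */
struct usb_device { int unused; };        /* opaque in this model */

/* Cut-down URB holding only the fields the text mentions. */
struct urb {
    struct usb_device *dev;
    unsigned int transfer_flags;
    void *transfer_buffer;
    dma_addr_t transfer_dma;
    size_t transfer_buffer_length;
};

/* Stand-in: malloc() replaces the coherent allocator, address is faked. */
void *usb_buffer_alloc(struct usb_device *dev, size_t size,
                       int mem_flags, dma_addr_t *dma)
{
    void *addr = malloc(size);

    (void)dev;
    (void)mem_flags;
    if (addr)
        *dma = (dma_addr_t)(uintptr_t)addr;
    return addr;
}

/* Stand-in: releases what the model allocator handed out. */
void usb_buffer_free(struct usb_device *dev, size_t size,
                     void *addr, dma_addr_t dma)
{
    (void)dev;
    (void)size;
    (void)dma;
    free(addr);
}

/* Driver side: allocate once, then reuse the urb for many small transfers. */
int example_setup_urb(struct urb *urb, struct usb_device *dev, size_t len)
{
    /* 0 is a dummy here; a real driver would pass an allocation flag. */
    urb->transfer_buffer = usb_buffer_alloc(dev, len, 0, &urb->transfer_dma);
    if (!urb->transfer_buffer)
        return -1;
    urb->dev = dev;
    urb->transfer_buffer_length = len;
    /* transfer_dma is already valid, so usbcore must not map the buffer. */
    urb->transfer_flags |= URB_NO_TRANSFER_DMA_MAP;
    return 0;
}

/* Driver side: free with the same size and both addresses. */
void example_teardown_urb(struct urb *urb)
{
    usb_buffer_free(urb->dev, urb->transfer_buffer_length,
                    urb->transfer_buffer, urb->transfer_dma);
    urb->transfer_buffer = NULL;
    urb->transfer_flags &= ~URB_NO_TRANSFER_DMA_MAP;
}
```

The point of the pattern is that the per-request map and unmap work
disappears: the buffer and its DMA address are set up once and stay valid
until the teardown call.
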

WORKING WITH EXISTING BUFFERS

Existing buffers aren't usable for DMA without first being mapped into the
DMA address space of the device.  However, most buffers passed to your
driver can safely be used with such DMA mapping.  (See the first section
of DMA-mapping.txt, titled "What memory is DMA-able?")

- When you're using scatterlists, you can map everything at once.  On some
  systems, this kicks in an IOMMU and turns the scatterlists into single
  DMA transactions:

	int usb_buffer_map_sg (struct usb_device *dev, unsigned pipe,
		struct scatterlist *sg, int nents);

	void usb_buffer_dmasync_sg (struct usb_device *dev, unsigned pipe,
		struct scatterlist *sg, int n_hw_ents);

	void usb_buffer_unmap_sg (struct usb_device *dev, unsigned pipe,
		struct scatterlist *sg, int n_hw_ents);

  It's probably easier to use the new usb_sg_*() calls, which do the DMA
  mapping and apply other tweaks to make scatterlist I/O fast.

- Some drivers may prefer a model in which they map large buffers once and
  then synchronize them for safe re-use.  (If there's no re-use, then let
  usbcore do the map/unmap.)  Large periodic transfers make good examples
  here, since it's cheaper to just synchronize the buffer than to unmap it
  each time an urb completes and then re-map it during resubmission.

  These calls all work with initialized urbs:  urb->dev, urb->pipe,
  urb->transfer_buffer, and urb->transfer_buffer_length must all be
  valid when these calls are used (urb->setup_packet must be valid too
  if urb is a control request):

	struct urb *usb_buffer_map (struct urb *urb);

	void usb_buffer_dmasync (struct urb *urb);

	void usb_buffer_unmap (struct urb *urb);

  The calls manage urb->transfer_dma for you, and set URB_NO_TRANSFER_DMA_MAP
  so that usbcore won't map or unmap the buffer.  The same goes for
  urb->setup_dma and URB_NO_SETUP_DMA_MAP for control requests.

Note that several of those interfaces are currently commented out, since
they don't have current users.  See the source code.  Other than the dmasync
calls (where the underlying DMA primitives have changed), most of them can
easily be commented back in if you want to use them.

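The map-once, synchronize-per-completion model for large periodic transfers
can be illustrated with a small userspace sketch.  Everything here is an
assumption made for illustration: the three usb_buffer_*() functions are
stand-ins that borrow the names and prototypes listed above but only flip a
flag and bump counters, struct urb is cut down to a few fields, the flag
value is a placeholder, and struct dma_stats and example_periodic_reuse()
are hypothetical names.

```c
#include <stddef.h>

#define URB_NO_TRANSFER_DMA_MAP 0x0004u   /* placeholder value */

/* Cut-down URB holding only what this model needs. */
struct urb {
    unsigned int transfer_flags;
    void *transfer_buffer;
    size_t transfer_buffer_length;
};

/* Counters so the model can show how often each operation runs. */
struct dma_stats { int maps, syncs, unmaps; };
static struct dma_stats stats;

/* Stand-in: "maps" the buffer and marks the urb as driver-mapped. */
struct urb *usb_buffer_map(struct urb *urb)
{
    if (!urb || !urb->transfer_buffer)
        return NULL;
    urb->transfer_flags |= URB_NO_TRANSFER_DMA_MAP;
    stats.maps++;
    return urb;
}

/* Stand-in: synchronizes a buffer that is currently mapped. */
void usb_buffer_dmasync(struct urb *urb)
{
    if (urb->transfer_flags & URB_NO_TRANSFER_DMA_MAP)
        stats.syncs++;
}

/* Stand-in: drops the mapping and clears the flag again. */
void usb_buffer_unmap(struct urb *urb)
{
    if (urb->transfer_flags & URB_NO_TRANSFER_DMA_MAP) {
        urb->transfer_flags &= ~URB_NO_TRANSFER_DMA_MAP;
        stats.unmaps++;
    }
}

/* Map once, synchronize after each of `rounds` completions, unmap once. */
int example_periodic_reuse(struct urb *urb, int rounds)
{
    int i;

    if (!usb_buffer_map(urb))
        return -1;
    for (i = 0; i < rounds; i++) {
        /* Submission and completion are omitted in this model. */
        usb_buffer_dmasync(urb);    /* make the buffer safe to re-use */
    }
    usb_buffer_unmap(urb);
    return 0;
}
```

For N resubmissions the model performs one map, N synchronizations and one
unmap, instead of the N map/unmap pairs that result when usbcore handles
the mapping on every submission.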