| Linus Torvalds | 1da177e | 2005-04-16 15:20:36 -0700 | [diff] [blame] | 1 | How To Write Linux PCI Drivers | 
|  | 2 |  | 
|  | 3 | by Martin Mares <mj@ucw.cz> on 07-Feb-2000 | 
|  | 4 |  | 
|  | 5 | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | 
|  | 6 | The world of PCI is vast and it's full of (mostly unpleasant) surprises. | 
|  | 7 | Different PCI devices have different requirements and different bugs -- | 
|  | 8 | because of this, the PCI support layer in the Linux kernel is not as trivial | 
|  | 9 | as one would wish. This short pamphlet tries to help all potential driver | 
|  | 10 | authors find their way through the deep forests of PCI handling. | 
|  | 11 |  | 
|  | 12 |  | 
|  | 13 | 0. Structure of PCI drivers | 
|  | 14 | ~~~~~~~~~~~~~~~~~~~~~~~~~~~ | 
|  | 15 | There exist two kinds of PCI drivers: new-style ones (which leave most of the | 
|  | 16 | probing for devices to the PCI layer and support online insertion and removal | 
|  | 17 | of devices [thus supporting PCI, hot-pluggable PCI and CardBus in a single | 
|  | 18 | driver]) and old-style ones which just do all the probing themselves. Unless | 
|  | 19 | you have a very good reason to do so, please don't use the old way of probing | 
|  | 20 | in any new code. After the driver finds the devices it wishes to operate | 
|  | 21 | on (either the old or the new way), it needs to perform the following steps: | 
|  | 22 |  | 
|  | 23 | Enable the device | 
|  | 24 | Access device configuration space | 
|  | 25 | Discover resources (addresses and IRQ numbers) provided by the device | 
|  | 26 | Allocate these resources | 
|  | 27 | Communicate with the device | 
|  | 28 | Disable the device | 
|  | 29 |  | 
|  | 30 | Most of these topics are covered by the following sections; for the rest, | 
|  | 31 | look at <linux/pci.h>, which is hopefully well commented. | 
|  | 32 |  | 
|  | 33 | If the PCI subsystem is not configured (CONFIG_PCI is not set), most of | 
|  | 34 | the functions described below are defined as inline functions either completely | 
|  | 35 | empty or just returning appropriate error codes, to avoid lots of ifdefs | 
|  | 36 | in the drivers. | 
|  | 37 |  | 
|  | 38 |  | 
|  | 39 | 1. New-style drivers | 
|  | 40 | ~~~~~~~~~~~~~~~~~~~~ | 
|  | 41 | The new-style drivers just call pci_register_driver during their initialization | 
|  | 42 | with a pointer to a structure describing the driver (struct pci_driver) which | 
|  | 43 | contains: | 
|  | 44 |  | 
|  | 45 | name		Name of the driver | 
|  | 46 | id_table	Pointer to table of device IDs the driver is | 
|  | 47 | interested in.  Most drivers should export this | 
|  | 48 | table using MODULE_DEVICE_TABLE(pci,...). | 
|  | 49 | probe		Pointer to a probing function which gets called (during | 
|  | 50 | execution of pci_register_driver for already existing | 
|  | 51 | devices or later if a new device gets inserted) for all | 
|  | 52 | PCI devices which match the ID table and are not handled | 
|  | 53 | by the other drivers yet. This function gets passed a | 
|  | 54 | pointer to the pci_dev structure representing the device | 
|  | 55 | and also which entry in the ID table did the device | 
|  | 56 | match. It returns zero when the driver has accepted the | 
|  | 57 | device or an error code (negative number) otherwise. | 
|  | 58 | This function always gets called from process context, | 
|  | 59 | so it can sleep. | 
|  | 60 | remove		Pointer to a function which gets called whenever a | 
|  | 61 | device being handled by this driver is removed (either | 
|  | 62 | during deregistration of the driver or when it's | 
|  | 63 | manually pulled out of a hot-pluggable slot). This | 
|  | 64 | function always gets called from process context, so it | 
|  | 65 | can sleep. | 
|  | 66 | save_state	Save a device's state before it is suspended. | 
|  | 67 | suspend		Put device into low power state. | 
|  | 68 | resume		Wake device from low power state. | 
|  | 69 | enable_wake	Enable device to generate wake events from a low power | 
|  | 70 | state. | 
|  | 71 |  | 
|  | 72 | (Please see Documentation/power/pci.txt for descriptions | 
|  | 73 | of PCI Power Management and the related functions) | 
|  | 74 |  | 
|  | 75 | The ID table is an array of struct pci_device_id ending with an all-zero entry. | 
|  | 76 | Each entry consists of: | 
|  | 77 |  | 
|  | 78 | vendor, device	Vendor and device ID to match (or PCI_ANY_ID) | 
|  | 79 | subvendor,	Subsystem vendor and device ID to match (or PCI_ANY_ID) | 
|  | 80 | subdevice | 
|  | 81 | class,		Device class to match. The class_mask tells which bits | 
|  | 82 | class_mask	of the class are honored during the comparison. | 
|  | 83 | driver_data	Data private to the driver. | 
|  | 84 |  | 
|  | 85 | Most drivers don't need to use the driver_data field.  Best practice | 
|  | 86 | for use of driver_data is to use it as an index into a static list of | 
|  | 87 | equivalent device types, not to use it as a pointer. | 
|  | 88 |  | 
|  | 89 | Have a table entry {PCI_ANY_ID, PCI_ANY_ID, PCI_ANY_ID, PCI_ANY_ID} | 
|  | 90 | to have probe() called for every PCI device known to the system. | 
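The matching rule described above can be sketched in plain userspace C.
This is only an illustration: struct id_entry and struct dev_ids are
simplified stand-ins for struct pci_device_id and the identifying fields of
struct pci_dev, the class field is spelled class_code here, and match_one()
and match_table() are hypothetical helper names, not kernel functions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PCI_ANY_ID (~0U)	/* wildcard value for the four ID fields */

/* Simplified stand-in for struct pci_device_id (assumption, not kernel code). */
struct id_entry {
	uint32_t vendor, device;
	uint32_t subvendor, subdevice;
	uint32_t class_code, class_mask;
	unsigned long driver_data;
};

/* Simplified stand-in for the identifying fields of struct pci_dev. */
struct dev_ids {
	uint32_t vendor, device;
	uint32_t subvendor, subdevice;
	uint32_t class_code;
};

/* Returns 1 if a single table entry matches the device, 0 otherwise. */
static int match_one(const struct id_entry *id, const struct dev_ids *dev)
{
	return (id->vendor == PCI_ANY_ID || id->vendor == dev->vendor) &&
	       (id->device == PCI_ANY_ID || id->device == dev->device) &&
	       (id->subvendor == PCI_ANY_ID || id->subvendor == dev->subvendor) &&
	       (id->subdevice == PCI_ANY_ID || id->subdevice == dev->subdevice) &&
	       /* only the bits selected by class_mask take part in the comparison */
	       !((id->class_code ^ dev->class_code) & id->class_mask);
}

/* Walks a table ending with an all-zero entry; returns the first match or NULL. */
static const struct id_entry *match_table(const struct id_entry *table,
					  const struct dev_ids *dev)
{
	assert(table != NULL && dev != NULL);
	for (; table->vendor || table->subvendor || table->class_mask; table++)
		if (match_one(table, dev))
			return table;
	return NULL;
}
```

With this rule, an entry whose four ID fields are all PCI_ANY_ID and whose
class_mask is zero matches every device, which is why such an entry causes
probe() to be offered every PCI device in the system.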
|  | 91 |  | 
|  | 92 | New PCI IDs may be added to a device driver at runtime by writing | 
|  | 93 | to the file /sys/bus/pci/drivers/{driver}/new_id.  When added, the | 
|  | 94 | driver will probe for all devices it can support. | 
|  | 95 |  | 
|  | 96 | echo "vendor device subvendor subdevice class class_mask driver_data" > \ | 
|  | 97 | /sys/bus/pci/drivers/{driver}/new_id | 
|  | 98 | where all fields are passed in as hexadecimal values (no leading 0x). | 
|  | 99 | Users need to pass only as many fields as necessary; vendor, device, | 
|  | 100 | subvendor, and subdevice fields default to PCI_ANY_ID (FFFFFFFF), | 
|  | 101 | class and class_mask fields default to 0, and driver_data defaults to | 
|  | 102 | 0UL.  Device drivers must initialize use_driver_data in the dynids struct | 
|  | 103 | in their pci_driver struct prior to calling pci_register_driver in order | 
|  | 104 | for the driver_data field to get passed to the driver. Otherwise, only a | 
|  | 105 | 0 is passed in that field. | 
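The defaulting behaviour of the new_id fields can be illustrated with a
small userspace sketch. It is not the kernel's sysfs handler: struct new_id
and parse_new_id() are hypothetical names, and the sketch only assumes the
seven-field hexadecimal format and the defaults listed above.

```c
#include <assert.h>
#include <stdio.h>

#define PCI_ANY_ID (~0U)

/* Hypothetical container for one parsed new_id line. */
struct new_id {
	unsigned int vendor, device, subvendor, subdevice;
	unsigned int class_code, class_mask;
	unsigned long driver_data;
};

/*
 * Parse up to seven hexadecimal fields. Fields that are not present keep
 * the defaults set below. Returns the number of fields read, or -1 if
 * nothing could be parsed.
 */
static int parse_new_id(const char *buf, struct new_id *id)
{
	int fields;

	assert(buf != NULL && id != NULL);

	id->vendor = id->device = PCI_ANY_ID;
	id->subvendor = id->subdevice = PCI_ANY_ID;
	id->class_code = id->class_mask = 0;
	id->driver_data = 0UL;

	fields = sscanf(buf, "%x %x %x %x %x %x %lx",
			&id->vendor, &id->device,
			&id->subvendor, &id->subdevice,
			&id->class_code, &id->class_mask,
			&id->driver_data);
	return fields > 0 ? fields : -1;
}
```

For example, parsing "8086 1229" fills in only vendor and device and leaves
the subsystem IDs as wildcards, so the resulting entry matches that device
regardless of its subsystem IDs and class.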
|  | 106 |  | 
|  | 107 | When the driver exits, it just calls pci_unregister_driver() and the PCI layer | 
|  | 108 | automatically calls the remove hook for all devices handled by the driver. | 
|  | 109 |  | 
|  | 110 | Please mark the initialization and cleanup functions where appropriate | 
|  | 111 | (the corresponding macros are defined in <linux/init.h>): | 
|  | 112 |  | 
|  | 113 | __init		Initialization code. Thrown away after the driver | 
|  | 114 | initializes. | 
|  | 115 | __exit		Exit code. Ignored for non-modular drivers. | 
|  | 116 | __devinit	Device initialization code. Identical to __init if | 
|  | 117 | the kernel is not compiled with CONFIG_HOTPLUG, normal | 
|  | 118 | function otherwise. | 
|  | 119 | __devexit	The same for __exit. | 
|  | 120 |  | 
|  | 121 | Tips: | 
|  | 122 | The module_init()/module_exit() functions (and all initialization | 
|  | 123 | functions called only from these) should be marked __init/__exit. | 
|  | 124 | The struct pci_driver shouldn't be marked with any of these tags. | 
|  | 125 | The ID table array should be marked __devinitdata. | 
|  | 126 | The probe() and remove() functions (and all initialization | 
|  | 127 | functions called only from these) should be marked __devinit/__devexit. | 
|  | 128 | If you are sure the driver is not a hotplug driver then use only | 
|  | 129 | __init/__exit and __initdata/__exitdata. | 
|  | 130 |  | 
|  | 131 | Pointers to functions marked as __devexit must be created using | 
|  | 132 | __devexit_p(function_name).  That will generate the function | 
|  | 133 | name or NULL if the __devexit function will be discarded. | 
|  | 134 |  | 
|  | 135 |  | 
|  | 136 | 2. How to find PCI devices manually (the old style) | 
|  | 137 | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | 
|  | 138 | PCI drivers not using the pci_register_driver() interface search | 
|  | 139 | for PCI devices manually using the following constructs: | 
|  | 140 |  | 
|  | 141 | Searching by vendor and device ID: | 
|  | 142 |  | 
|  | 143 | struct pci_dev *dev = NULL; | 
|  | 144 | while ((dev = pci_get_device(VENDOR_ID, DEVICE_ID, dev)) != NULL) | 
|  | 145 | 	configure_device(dev); | 
|  | 146 |  | 
|  | 147 | Searching by class ID (iterate in a similar way): | 
|  | 148 |  | 
|  | 149 | pci_get_class(CLASS_ID, dev) | 
|  | 150 |  | 
|  | 151 | Searching by both vendor/device and subsystem vendor/device ID: | 
|  | 152 |  | 
|  | 153 | pci_get_subsys(VENDOR_ID, DEVICE_ID, SUBSYS_VENDOR_ID, SUBSYS_DEVICE_ID, dev) | 
|  | 154 |  | 
|  | 155 | You can use the constant PCI_ANY_ID as a wildcard replacement for | 
|  | 156 | VENDOR_ID or DEVICE_ID.  This allows searching for any device from a | 
|  | 157 | specific vendor, for example. | 
|  | 158 |  | 
|  | 159 | These functions are hotplug-safe. They increment the reference count on | 
|  | 160 | the pci_dev that they return. You must eventually (possibly at module unload) | 
|  | 161 | decrement the reference count on these devices by calling pci_dev_put(). | 
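The reference counting rule is easy to get wrong, so here is a userspace
model of it. Everything in the block is mocked: struct mock_dev, mock_bus,
mock_get_device() and mock_dev_put() are hypothetical stand-ins, written on
the assumption that pci_get_device() drops the reference on the device
passed in as the starting point before returning the next match.

```c
#include <assert.h>
#include <stddef.h>

/* Mock device with only the fields this sketch needs (not struct pci_dev). */
struct mock_dev {
	unsigned int vendor, device;
	int refcount;
};

static struct mock_dev mock_bus[] = {
	{ 0x8086, 0x1229, 0 },
	{ 0x10ec, 0x8139, 0 },
	{ 0x8086, 0x1229, 0 },
};
#define MOCK_BUS_LEN (sizeof(mock_bus) / sizeof(mock_bus[0]))

/* Drop one reference; a NULL pointer is ignored. */
static void mock_dev_put(struct mock_dev *dev)
{
	if (dev) {
		assert(dev->refcount > 0);
		dev->refcount--;
	}
}

/*
 * Models the assumed pci_get_device() contract: drop the reference held on
 * 'from', then return the next match after it with its count raised.
 */
static struct mock_dev *mock_get_device(unsigned int vendor,
					unsigned int device,
					struct mock_dev *from)
{
	size_t i = from ? (size_t)(from - mock_bus) + 1 : 0;

	mock_dev_put(from);
	for (; i < MOCK_BUS_LEN; i++) {
		if (mock_bus[i].vendor == vendor &&
		    mock_bus[i].device == device) {
			mock_bus[i].refcount++;
			return &mock_bus[i];
		}
	}
	return NULL;
}

/* Full iteration: each reference is dropped by the next call, nothing leaks. */
static int count_devices(unsigned int vendor, unsigned int device)
{
	struct mock_dev *dev = NULL;
	int n = 0;

	while ((dev = mock_get_device(vendor, device, dev)) != NULL)
		n++;
	return n;
}

/* Sum of all outstanding references on the mock bus. */
static int total_refs(void)
{
	size_t i;
	int sum = 0;

	for (i = 0; i < MOCK_BUS_LEN; i++)
		sum += mock_bus[i].refcount;
	return sum;
}
```

In this model a loop that runs to completion leaves no references behind,
while a caller that stops early (or keeps the returned pointer) still holds
one reference and has to release it with the put function.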
|  | 162 |  | 
|  | 163 |  | 
|  | 164 | 3. Enabling and disabling devices | 
|  | 165 | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | 
|  | 166 | Before you do anything with the device you've found, you need to enable | 
|  | 167 | it by calling pci_enable_device() which enables I/O and memory regions of | 
|  | 168 | the device, allocates an IRQ if necessary, assigns missing resources if | 
|  | 169 | needed and wakes up the device if it was in a suspended state. Please note | 
|  | 170 | that this function can fail. | 
|  | 171 |  | 
|  | 172 | If you want to use the device in bus mastering mode, call pci_set_master() | 
|  | 173 | which enables the bus master bit in PCI_COMMAND register and also fixes | 
|  | 174 | the latency timer value if it's set to something bogus by the BIOS. | 
|  | 175 |  | 
|  | 176 | If you want to use the PCI Memory-Write-Invalidate transaction, | 
|  | 177 | call pci_set_mwi().  This enables the PCI_COMMAND bit for Mem-Wr-Inval | 
|  | 178 | and also ensures that the cache line size register is set correctly. | 
|  | 179 | Make sure to check the return value of pci_set_mwi(), because not all | 
|  | 180 | architectures may support Memory-Write-Invalidate. | 
|  | 181 |  | 
|  | 182 | If your driver decides to stop using the device (e.g., there was an | 
|  | 183 | error while setting it up or the driver module is being unloaded), it | 
|  | 184 | should call pci_disable_device() to deallocate any IRQ resources, disable | 
|  | 185 | PCI bus-mastering, etc.  You should not do anything with the device after | 
|  | 186 | calling pci_disable_device(). | 
|  | 187 |  | 
|  | 188 | 4. How to access PCI config space | 
|  | 189 | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | 
|  | 190 | You can use pci_(read|write)_config_(byte|word|dword) to access the config | 
|  | 191 | space of a device represented by struct pci_dev *. All these functions return 0 | 
|  | 192 | when successful or an error code (PCIBIOS_...) which can be translated to a text | 
|  | 193 | string by pcibios_strerror. Most drivers expect that accesses to valid PCI | 
|  | 194 | devices don't fail. | 
|  | 195 |  | 
|  | 196 | If you don't have a struct pci_dev available, you can call | 
|  | 197 | pci_bus_(read|write)_config_(byte|word|dword) with a struct pci_bus * and a | 
|  | 198 | devfn to access a given device and function on that bus. | 
|  | 199 |  | 
|  | 200 | If you access fields in the standard portion of the config header, please | 
|  | 201 | use symbolic names of locations and bits declared in <linux/pci.h>. | 
|  | 202 |  | 
|  | 203 | If you need to access Extended PCI Capability registers, just call | 
|  | 204 | pci_find_capability() for the particular capability and it will find the | 
|  | 205 | corresponding register block for you. | 
|  | 206 |  | 
|  | 207 |  | 
|  | 208 | 5. Addresses and interrupts | 
|  | 209 | ~~~~~~~~~~~~~~~~~~~~~~~~~~~ | 
|  | 210 | Memory and port addresses and interrupt numbers should NOT be read from the | 
|  | 211 | config space. You should use the values in the pci_dev structure as they might | 
|  | 212 | have been remapped by the kernel. | 
|  | 213 |  | 
|  | 214 | See Documentation/IO-mapping.txt for how to access device memory. | 
|  | 215 |  | 
|  | 216 | You still need to call request_region() for I/O regions and | 
|  | 217 | request_mem_region() for memory regions to make sure nobody else is using the | 
|  | 218 | same device. | 
|  | 219 |  | 
|  | 220 | All interrupt handlers should be registered with SA_SHIRQ and use the dev_id | 
|  | 221 | argument to map IRQs to devices (remember that all PCI interrupts are shared). | 
|  | 222 |  | 
|  | 223 |  | 
|  | 224 | 6. Other interesting functions | 
|  | 225 | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | 
|  | 226 | pci_get_slot()			Find pci_dev corresponding to given bus and | 
|  | 227 | devfn (slot and function); takes a reference. | 
|  | 228 | pci_set_power_state()		Set PCI Power Management state (0=D0 ... 3=D3) | 
|  | 229 | pci_find_capability()		Find specified capability in device's capability | 
|  | 230 | list. | 
|  | 231 | pci_module_init()		Inline helper function for ensuring correct | 
|  | 232 | pci_driver initialization and error handling. | 
|  | 233 | pci_resource_start()		Returns bus start address for a given PCI region | 
|  | 234 | pci_resource_end()		Returns bus end address for a given PCI region | 
|  | 235 | pci_resource_len()		Returns the byte length of a PCI region | 
|  | 236 | pci_set_drvdata()		Set private driver data pointer for a pci_dev | 
|  | 237 | pci_get_drvdata()		Return private driver data pointer for a pci_dev | 
|  | 238 | pci_set_mwi()			Enable Memory-Write-Invalidate transactions. | 
|  | 239 | pci_clear_mwi()			Disable Memory-Write-Invalidate transactions. | 
|  | 240 |  | 
|  | 241 |  | 
|  | 242 | 7. Miscellaneous hints | 
|  | 243 | ~~~~~~~~~~~~~~~~~~~~~~ | 
|  | 244 | When displaying PCI slot names to the user (for example when a driver wants | 
|  | 245 | to tell the user which card it has found), please use pci_name(pci_dev) | 
|  | 246 | for this purpose. | 
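As a rough illustration of what such a name looks like, the sketch below
builds a string in the domain:bus:slot.function layout. It assumes that
layout and the usual devfn encoding (slot in the upper five bits, function
in the lower three); format_pci_name() is a hypothetical userspace helper,
and drivers should keep calling pci_name() rather than formatting the
string themselves.

```c
#include <assert.h>
#include <stdio.h>

/* devfn packs a 5-bit slot number and a 3-bit function number. */
#define PCI_DEVFN(slot, func)	((((slot) & 0x1f) << 3) | ((func) & 0x07))
#define PCI_SLOT(devfn)		(((devfn) >> 3) & 0x1f)
#define PCI_FUNC(devfn)		((devfn) & 0x07)

/* Writes e.g. "0000:00:1f.2" into buf; returns what snprintf() returns. */
static int format_pci_name(char *buf, size_t len, unsigned int domain,
			   unsigned int bus, unsigned int devfn)
{
	assert(buf != NULL && len > 0);
	return snprintf(buf, len, "%04x:%02x:%02x.%u", domain, bus,
			PCI_SLOT(devfn), PCI_FUNC(devfn));
}
```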
|  | 247 |  | 
|  | 248 | Always refer to the PCI devices by a pointer to the pci_dev structure. | 
|  | 249 | All PCI layer functions use this identification and it's the only | 
|  | 250 | reasonable one. Don't use bus/slot/function numbers except for very | 
|  | 251 | special purposes -- on systems with multiple primary buses their semantics | 
|  | 252 | can be pretty complex. | 
|  | 253 |  | 
|  | 254 | If you're going to use PCI bus mastering DMA, take a look at | 
|  | 255 | Documentation/DMA-mapping.txt. | 
|  | 256 |  | 
|  | 257 | Don't try to turn on Fast Back to Back writes in your driver.  All devices | 
|  | 258 | on the bus need to be capable of doing it, so this is something which needs | 
|  | 259 | to be handled by platform and generic code, not individual drivers. | 
|  | 260 |  | 
|  | 261 |  | 
|  | 262 | 8. Obsolete functions | 
|  | 263 | ~~~~~~~~~~~~~~~~~~~~~ | 
|  | 264 | There are several functions which you might come across when trying to | 
|  | 265 | port an old driver to the new PCI interface.  They are no longer present | 
|  | 266 | in the kernel because they are not compatible with hotplug, PCI domains, | 
|  | 267 | or sane locking. | 
|  | 268 |  | 
|  | 269 | pcibios_present() and		For a long time now, there has been no need to test | 
|  | 270 | pci_present()			for the presence of the PCI subsystem before talking to it. | 
|  | 271 | If it's not there, the list of PCI devices | 
|  | 272 | is empty and all functions for searching for | 
|  | 273 | devices just return NULL. | 
|  | 274 | pcibios_(read|write)_*		Superseded by their pci_(read|write)_* | 
|  | 275 | counterparts. | 
|  | 276 | pcibios_find_*			Superseded by their pci_get_* counterparts. | 
|  | 277 | pci_for_each_dev()		Superseded by pci_get_device() | 
|  | 278 | pci_for_each_dev_reverse()	Superseded by pci_find_device_reverse() | 
|  | 279 | pci_for_each_bus()		Superseded by pci_find_next_bus() | 
|  | 280 | pci_find_device()		Superseded by pci_get_device() | 
|  | 281 | pci_find_subsys()		Superseded by pci_get_subsys() | 
|  | 282 | pci_find_slot()			Superseded by pci_get_slot() | 
|  | 283 | pcibios_find_class()		Superseded by pci_get_class() | 
|  | 284 | pci_find_class()		Superseded by pci_get_class() | 
|  | 285 | pci_(read|write)_*_nodev()	Superseded by pci_bus_(read|write)_*() |