<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE book PUBLIC "-//OASIS//DTD DocBook XML V4.1.2//EN"
	"http://www.oasis-open.org/docbook/xml/4.1.2/docbookx.dtd" []>

<book id="DoingIO">
 <bookinfo>
  <title>Bus-Independent Device Accesses</title>

  <authorgroup>
   <author>
    <firstname>Matthew</firstname>
    <surname>Wilcox</surname>
    <affiliation>
     <address>
      <email>matthew@wil.cx</email>
     </address>
    </affiliation>
   </author>
  </authorgroup>

  <authorgroup>
   <author>
    <firstname>Alan</firstname>
    <surname>Cox</surname>
    <affiliation>
     <address>
      <email>alan@redhat.com</email>
     </address>
    </affiliation>
   </author>
  </authorgroup>

  <copyright>
   <year>2001</year>
   <holder>Matthew Wilcox</holder>
  </copyright>

  <legalnotice>
   <para>
     This documentation is free software; you can redistribute
     it and/or modify it under the terms of the GNU General Public
     License as published by the Free Software Foundation; either
     version 2 of the License, or (at your option) any later
     version.
   </para>

   <para>
     This program is distributed in the hope that it will be
     useful, but WITHOUT ANY WARRANTY; without even the implied
     warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
     See the GNU General Public License for more details.
   </para>

   <para>
     You should have received a copy of the GNU General Public
     License along with this program; if not, write to the Free
     Software Foundation, Inc., 59 Temple Place, Suite 330, Boston,
     MA 02111-1307 USA
   </para>

   <para>
     For more details see the file COPYING in the source
     distribution of Linux.
   </para>
  </legalnotice>
 </bookinfo>

<toc></toc>

  <chapter id="intro">
      <title>Introduction</title>
  <para>
	Linux provides an API which abstracts performing IO across all busses
	and devices, allowing device drivers to be written independently of
	bus type.
  </para>
  </chapter>

  <chapter id="bugs">
     <title>Known Bugs And Assumptions</title>
  <para>
	None.
  </para>
  </chapter>

  <chapter id="mmio">
    <title>Memory Mapped IO</title>
    <sect1 id="getting_access_to_the_device">
      <title>Getting Access to the Device</title>
      <para>
	The most widely supported form of IO is memory mapped IO.
	That is, a part of the CPU's address space is interpreted
	not as accesses to memory, but as accesses to a device.  Some
	architectures define devices to be at a fixed address, but most
	have some method of discovering devices.  The PCI bus walk is a
	good example of such a scheme.  This document does not cover how
	to obtain such an address, but assumes you are starting with one.
	Physical addresses are of type unsigned long.
      </para>

      <para>
	This address should not be used directly.  Instead, to get an
	address suitable for passing to the accessor functions described
	below, you should call <function>ioremap</function>.
	An address suitable for accessing the device will be returned to you.
      </para>

      <para>
	After you've finished using the device (say, in your module's
	exit routine), call <function>iounmap</function> in order to return
	the address space to the kernel.  Most architectures allocate new
	address space each time you call <function>ioremap</function>, and
	they can run out unless you call <function>iounmap</function>.
      </para>
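
      <para>
	The map, use, unmap lifecycle described above can be sketched as
	follows.  This is a minimal illustration under stated assumptions,
	not code from a real driver: <literal>struct demo_dev</literal>, the
	<function>demo_*</function> helpers, and the register window size
	and offset are all hypothetical.  The <literal>#else</literal>
	branch supplies userspace stand-ins, so the sketch also builds
	outside a kernel tree; only the <literal>__KERNEL__</literal>
	branch reflects the real API.
      </para>

<programlisting><![CDATA[
```c
#ifdef __KERNEL__
#include <linux/errno.h>
#include <linux/io.h>
#include <linux/types.h>
#else
/* Userspace stand-ins (assumptions, not kernel code) so the sketch builds anywhere. */
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>
#define __iomem
typedef uint8_t u8;
typedef uint32_t u32;
static void *ioremap(unsigned long phys, unsigned long size)
{
	(void)phys;
	return calloc(1, size);	/* pretend register window, reads back as zero */
}
static void iounmap(volatile void *addr)
{
	free((void *)addr);
}
static u32 readl(const volatile void *addr)
{
	return *(const volatile u32 *)addr;
}
#endif

#define DEMO_REGS_SIZE	0x100	/* hypothetical size of the register window */
#define DEMO_REG_ID	0x00	/* hypothetical offset of an ID register */

struct demo_dev {
	u8 __iomem *regs;	/* cookie from ioremap(); only pass it to accessors */
};

/* Map the device registers once, for example at probe or module init time. */
static int demo_map(struct demo_dev *dev, unsigned long phys)
{
	dev->regs = ioremap(phys, DEMO_REGS_SIZE);
	if (!dev->regs)
		return -ENOMEM;
	return 0;
}

/* Never dereference dev->regs directly; go through readl() and friends. */
static u32 demo_read_id(struct demo_dev *dev)
{
	return readl(dev->regs + DEMO_REG_ID);
}

/* Hand the address space back, for example in the module's exit routine. */
static void demo_unmap(struct demo_dev *dev)
{
	iounmap(dev->regs);
	dev->regs = NULL;
}
```
]]></programlisting>

      <para>
	Pairing every <function>ioremap</function> with exactly one
	<function>iounmap</function>, as the two hypothetical helpers do,
	is what keeps the mapping space from running out.
      </para>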
    </sect1>

    <sect1 id="accessing_the_device">
      <title>Accessing the Device</title>
      <para>
	The part of the interface most used by drivers is reading and
	writing memory-mapped registers on the device.  Linux provides
	interfaces to read and write 8-bit, 16-bit, 32-bit and 64-bit
	quantities.  Due to a historical accident, these are named byte,
	word, long and quad accesses.  Both read and write accesses are
	supported; there is no prefetch support at this time.
      </para>

      <para>
	The functions are named <function>readb</function>,
	<function>readw</function>, <function>readl</function>,
	<function>readq</function>, <function>readb_relaxed</function>,
	<function>readw_relaxed</function>, <function>readl_relaxed</function>,
	<function>readq_relaxed</function>, <function>writeb</function>,
	<function>writew</function>, <function>writel</function> and
	<function>writeq</function>.
      </para>
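
      <para>
	As an illustration of how the width suffix selects the access
	size, the sketch below performs a 32-bit read-modify-write and an
	8-bit read.  The register offsets, the enable bit and the
	<function>demo_*</function> names are hypothetical, and the
	<literal>#else</literal> branch only provides userspace stand-ins
	for <function>readb</function>, <function>readl</function> and
	<function>writel</function> so that the example can be built
	outside the kernel.
      </para>

<programlisting><![CDATA[
```c
#ifdef __KERNEL__
#include <linux/io.h>
#include <linux/types.h>
#else
/* Userspace stand-ins (assumptions, not kernel code) so the sketch builds anywhere. */
#include <stdint.h>
#define __iomem
typedef uint8_t u8;
typedef uint32_t u32;
static u8 readb(const volatile void *addr)
{
	return *(const volatile u8 *)addr;
}
static u32 readl(const volatile void *addr)
{
	return *(const volatile u32 *)addr;
}
static void writel(u32 value, volatile void *addr)
{
	*(volatile u32 *)addr = value;
}
#endif

#define DEMO_REG_CTRL		0x08		/* hypothetical 32-bit control register */
#define DEMO_REG_STATUS		0x0c		/* hypothetical 8-bit status register */
#define DEMO_CTRL_ENABLE	(1u << 0)	/* hypothetical enable bit */

/* "l" accessors move 32 bits: read the register, set one bit, write it back. */
static void demo_enable(u8 __iomem *regs)
{
	u32 ctrl = readl(regs + DEMO_REG_CTRL);

	ctrl |= DEMO_CTRL_ENABLE;
	writel(ctrl, regs + DEMO_REG_CTRL);
}

/* "b" accessors move 8 bits. */
static u8 demo_read_status(u8 __iomem *regs)
{
	return readb(regs + DEMO_REG_STATUS);
}
```
]]></programlisting>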

      <para>
	Some devices (such as framebuffers) would like to use larger
	transfers than 8 bytes at a time.  For these devices, the
	<function>memcpy_toio</function>, <function>memcpy_fromio</function>
	and <function>memset_io</function> functions are provided.
	Do not use memset or memcpy on IO addresses; they
	are not guaranteed to copy data in order.
      </para>
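
      <para>
	A bulk transfer to such a device might look like the sketch below,
	which clears a mapped window and then copies an image into it.
	The <function>demo_fb_upload</function> helper and its parameters
	are hypothetical; the <literal>#else</literal> branch holds simple
	byte-wise userspace stand-ins for <function>memset_io</function>
	and <function>memcpy_toio</function>, which are an assumption made
	for illustration and say nothing about how an architecture
	implements them.
      </para>

<programlisting><![CDATA[
```c
#ifdef __KERNEL__
#include <linux/io.h>
#include <linux/types.h>
#else
/* Userspace stand-ins (assumptions, not kernel code) so the sketch builds anywhere. */
#include <stddef.h>
#include <stdint.h>
#define __iomem
typedef uint8_t u8;
static void memset_io(volatile void *dst, int value, size_t count)
{
	volatile u8 *d = dst;

	while (count--)
		*d++ = (u8)value;
}
static void memcpy_toio(volatile void *dst, const void *src, size_t count)
{
	volatile u8 *d = dst;
	const u8 *s = src;

	while (count--)
		*d++ = *s++;
}
#endif

/*
 * Clear a mapped window of fb_len bytes, then copy at most fb_len bytes
 * of the image into it.  fb must come from ioremap(), so plain memset()
 * and memcpy() are deliberately not used here.
 */
static void demo_fb_upload(u8 __iomem *fb, size_t fb_len,
			   const void *image, size_t image_len)
{
	if (image_len > fb_len)
		image_len = fb_len;	/* never write past the mapped window */

	memset_io(fb, 0, fb_len);
	memcpy_toio(fb, image, image_len);
}
```
]]></programlisting>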

      <para>
	The read and write functions are defined to be ordered.  That is, the
	compiler is not permitted to reorder the I/O sequence.  When the
	compiler may safely optimise the ordering, you can use
	<function>__readb</function> and friends to indicate the relaxed
	ordering.  Use these with care.
      </para>

      <para>
	While the basic functions are defined to be synchronous and ordered
	with respect to each other, the busses the devices sit on may
	themselves be asynchronous.  In particular, many authors are burned
	by the fact that PCI bus writes are posted asynchronously.  A driver
	author must issue a read from the same device to ensure that writes
	have occurred in the specific cases the author cares about.  This
	kind of property cannot be hidden from driver writers in the API.
	In some cases, the read used to flush the device may be expected to
	fail (if the card is resetting, for example).  In that case, the
	read should be done from config space, which is guaranteed to
	soft-fail if the card doesn't respond.
      </para>

      <para>
	The following is an example of flushing a write to a device when
	the driver would like to ensure the write's effects are visible prior
	to continuing execution.
      </para>

<programlisting>
static inline void
qla1280_disable_intrs(struct scsi_qla_host *ha)
{
	struct device_reg *reg;

	reg = ha->iobase;
	/* disable risc and host interrupts */
	WRT_REG_WORD(&amp;reg->ictrl, 0);
	/*
	 * The following read will ensure that the above write
	 * has been received by the device before we return from this
	 * function.
	 */
	RD_REG_WORD(&amp;reg->ictrl);
	ha->flags.ints_enabled = 0;
}
</programlisting>

      <para>
	In addition to write posting, on some large multiprocessing systems
	(e.g. SGI Challenge, Origin and Altix machines) posted writes won't
	be strongly ordered coming from different CPUs.  Thus it's important
	to properly protect parts of your driver that do memory-mapped writes
	with locks and use <function>mmiowb</function> to make sure they
	arrive in the order intended.  Issuing a regular
	<function>readX</function> will also ensure write ordering, but
	should only be used when the driver has to be sure that the write
	has actually arrived at the device (not that it's simply ordered with
	respect to other writes), since a full <function>readX</function> is
	a relatively expensive operation.
      </para>

      <para>
	Generally, one should use <function>mmiowb</function> prior to
	releasing a spinlock that protects regions using
	<function>writeb</function> or similar functions that aren't
	surrounded by <function>readb</function> calls, which will ensure
	ordering and flushing.  The following pseudocode illustrates what
	might occur if write ordering isn't guaranteed via
	<function>mmiowb</function> or one of the
	<function>readX</function> functions.
      </para>

<programlisting>
CPU A:  spin_lock_irqsave(&amp;dev_lock, flags);
CPU A:  ...
CPU A:  writel(newval, ring_ptr);
CPU A:  spin_unlock_irqrestore(&amp;dev_lock, flags);
        ...
CPU B:  spin_lock_irqsave(&amp;dev_lock, flags);
CPU B:  writel(newval2, ring_ptr);
CPU B:  ...
CPU B:  spin_unlock_irqrestore(&amp;dev_lock, flags);
</programlisting>

      <para>
	In the case above, newval2 could be written to ring_ptr before
	newval.  Fixing it is easy though:
      </para>

<programlisting>
CPU A:  spin_lock_irqsave(&amp;dev_lock, flags);
CPU A:  ...
CPU A:  writel(newval, ring_ptr);
CPU A:  mmiowb(); /* ensure no other writes beat us to the device */
CPU A:  spin_unlock_irqrestore(&amp;dev_lock, flags);
        ...
CPU B:  spin_lock_irqsave(&amp;dev_lock, flags);
CPU B:  writel(newval2, ring_ptr);
CPU B:  ...
CPU B:  mmiowb();
CPU B:  spin_unlock_irqrestore(&amp;dev_lock, flags);
</programlisting>

      <para>
	See tg3.c for a real-world example of how to use
	<function>mmiowb</function>.
      </para>

      <para>
	PCI ordering rules also guarantee that PIO read responses arrive
	after any outstanding DMA writes from that bus, since for some devices
	the result of a <function>readb</function> call may signal to the
	driver that a DMA transaction is complete.  In many cases, however,
	the driver may want to indicate that the next
	<function>readb</function> call has no relation to any previous DMA
	writes performed by the device.  The driver can use
	<function>readb_relaxed</function> for these cases, although only
	some platforms will honor the relaxed semantics.  Using the relaxed
	read functions will provide significant performance benefits on
	platforms that support it.  The qla2xxx driver provides examples
	of how to use <function>readX_relaxed</function>.  In many cases,
	a majority of the driver's <function>readX</function> calls can
	safely be converted to <function>readX_relaxed</function> calls, since
	only a few will indicate or depend on DMA completion.
      </para>
    </sect1>

  </chapter>

  <chapter id="port_space_accesses">
    <title>Port Space Accesses</title>
    <sect1 id="port_space_explained">
      <title>Port Space Explained</title>

      <para>
	Another form of IO commonly supported is Port Space.  This is a
	range of addresses separate from the normal memory address space.
	Access to these addresses is generally not as fast as access
	to memory mapped addresses, and port space is potentially a
	smaller address space.
      </para>

      <para>
	Unlike memory mapped IO, no preparation is required
	to access port space.
      </para>

    </sect1>
    <sect1 id="accessing_port_space">
      <title>Accessing Port Space</title>
      <para>
	Accesses to this space are provided through a set of functions
	which allow 8-bit, 16-bit and 32-bit accesses; also
	known as byte, word and long.  These functions are
	<function>inb</function>, <function>inw</function>,
	<function>inl</function>, <function>outb</function>,
	<function>outw</function> and <function>outl</function>.
      </para>

      <para>
	Some variants are provided for these functions.  Some devices
	require that accesses to their ports are slowed down.  This
	functionality is provided by appending a <function>_p</function>
	to the end of the function name.  There are also equivalents to
	memcpy.  The <function>ins</function> functions
	(<function>insb</function>, <function>insw</function> and
	<function>insl</function>) read bytes, words or longs from the
	given port into a buffer, and the <function>outs</function>
	functions (<function>outsb</function>, <function>outsw</function>
	and <function>outsl</function>) write them from a buffer to the
	given port.
      </para>
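
      <para>
	The sketch below shows a single-byte port access and a string
	read, using an index/data register pair of the kind found on many
	legacy devices.  The port numbers and the
	<function>demo_*</function> helpers are hypothetical.  The
	<literal>#else</literal> branch emulates port space with a plain
	array so that the example builds in userspace; that emulation is
	an assumption for illustration and does not model real device
	behaviour.
      </para>

<programlisting><![CDATA[
```c
#ifdef __KERNEL__
#include <linux/io.h>
#include <linux/types.h>
#else
/* Userspace stand-ins (assumptions, not kernel code) so the sketch builds anywhere. */
#include <stdint.h>
typedef uint8_t u8;
static u8 demo_ports[0x10000];	/* fake 64 KiB port space */
static void outb(u8 value, unsigned long port)
{
	demo_ports[port & 0xffff] = value;
}
static u8 inb(unsigned long port)
{
	return demo_ports[port & 0xffff];
}
static void insb(unsigned long port, void *buffer, unsigned int count)
{
	u8 *dst = buffer;

	while (count--)
		*dst++ = inb(port);	/* repeated reads of the same port */
}
#endif

#define DEMO_INDEX_PORT	0x300	/* hypothetical index register port */
#define DEMO_DATA_PORT	0x301	/* hypothetical data register port */

/* Select an internal register through the index port, then read its value. */
static u8 demo_read_indexed(u8 index)
{
	outb(index, DEMO_INDEX_PORT);	/* note the order: value first, port second */
	return inb(DEMO_DATA_PORT);
}

/* Drain count bytes from the data port into buf with one string read. */
static void demo_read_fifo(void *buf, unsigned int count)
{
	insb(DEMO_DATA_PORT, buf, count);
}
```
]]></programlisting>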
    </sect1>

  </chapter>

  <chapter id="pubfunctions">
     <title>Public Functions Provided</title>
!Iinclude/asm-x86/io_32.h
!Elib/iomap.c
  </chapter>

</book>