Lesson 1: Spin locks

The most basic primitive for locking is the spinlock.

static DEFINE_SPINLOCK(xxx_lock);

	unsigned long flags;

	spin_lock_irqsave(&xxx_lock, flags);
	... critical section here ...
	spin_unlock_irqrestore(&xxx_lock, flags);

The above is always safe. It will disable interrupts _locally_, but the
spinlock itself will guarantee the global lock, so it will guarantee that
there is only one thread-of-control within the region(s) protected by that
lock. This works well even under UP. The above sequence under UP
is essentially the same as doing

	unsigned long flags;

	save_flags(flags); cli();
	... critical section ...
	restore_flags(flags);

so the code does _not_ need to worry about UP vs SMP issues: the spinlocks
work correctly under both (and spinlocks are actually more efficient on
architectures that allow doing the "save_flags + cli" in one operation).

   NOTE! Implications of spin_locks for memory are further described in:

     Documentation/memory-barriers.txt
       (5) LOCK operations.
       (6) UNLOCK operations.

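As a rough illustration of what "only one thread-of-control within the
region" means, here is a userspace sketch in C11. It is an analogy, not
the kernel implementation: the names toy_spinlock, toy_spin_lock(),
toy_spin_unlock() and run_two_workers() are made up for this example, it
assumes a POSIX threads environment, and it does nothing about interrupts.
The lock is taken with acquire ordering and dropped with release ordering,
which is the userspace counterpart of the LOCK and UNLOCK semantics that
memory-barriers.txt describes.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical toy lock: one flag, set while somebody holds the lock. */
struct toy_spinlock {
	atomic_flag locked;
};

static void toy_spin_lock(struct toy_spinlock *l)
{
	/* Spin until we are the one who changes the flag from clear to set. */
	while (atomic_flag_test_and_set_explicit(&l->locked,
						 memory_order_acquire))
		;
}

static void toy_spin_unlock(struct toy_spinlock *l)
{
	atomic_flag_clear_explicit(&l->locked, memory_order_release);
}

#define ITERATIONS 100000

static struct toy_spinlock counter_lock = { ATOMIC_FLAG_INIT };
static long counter;	/* shared data, only touched under counter_lock */

static void *worker(void *unused)
{
	(void)unused;
	for (int i = 0; i < ITERATIONS; i++) {
		toy_spin_lock(&counter_lock);
		counter++;	/* critical section */
		toy_spin_unlock(&counter_lock);
	}
	return NULL;
}

/* Runs two workers against the same lock and returns the final count. */
static long run_two_workers(void)
{
	pthread_t a, b;

	counter = 0;
	pthread_create(&a, NULL, worker, NULL);
	pthread_create(&b, NULL, worker, NULL);
	pthread_join(a, NULL);
	pthread_join(b, NULL);
	return counter;
}
```

Because every increment happens under the same lock, run_two_workers()
should return exactly 2 * ITERATIONS; without the lock/unlock pair, some
updates would typically be lost.
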
The above is usually pretty simple (you usually need and want only one
spinlock for most things - using more than one spinlock can make things a
lot more complex and even slower and is usually worth it only for
sequences that you _know_ need to be split up: avoid it at all cost if you
aren't sure). HOWEVER, it _does_ mean that if you have some code that does

	cli();
	.. critical section ..
	sti();

and another sequence that does

	spin_lock_irqsave(&xxx_lock, flags);
	.. critical section ..
	spin_unlock_irqrestore(&xxx_lock, flags);

then they are NOT mutually exclusive, and the critical regions can happen
at the same time on two different CPUs. That's fine per se, but the
critical regions had better be critical for different things (ie they
can't stomp on each other).

The above is a problem mainly if you end up mixing code - for example the
routines in ll_rw_block() tend to use cli/sti to protect the atomicity of
their actions, and if a driver uses spinlocks instead then you should
think about issues like the above.

This is really the only hard part about spinlocks: once you start
using spinlocks they tend to expand to areas you might not have noticed
before, because you have to make sure the spinlocks correctly protect the
shared data structures _everywhere_ they are used. The spinlocks are most
easily added to places that are completely independent of other code (for
example, internal driver data structures that nobody else ever touches).

   NOTE! The spin-lock is safe only when you _also_ use the lock itself
   to do locking across CPUs, which implies that EVERYTHING that
   touches a shared variable has to agree about the spinlock they want
   to use.

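The point that everybody has to agree on the lock can be shown with a
small userspace sketch. It uses POSIX pthread_spinlock_t as a stand-in for
the kernel spinlock, which is an assumption made purely for illustration;
sections_can_overlap() and the two demo functions are hypothetical names.
One "critical section" takes the first lock, and a second one then tries
to enter using the second lock.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <pthread.h>

/*
 * Returns 1 if a section guarded by 'second' can be entered while a
 * section guarded by 'first' is still running, 0 if it is kept out.
 */
static int sections_can_overlap(pthread_spinlock_t *first,
				pthread_spinlock_t *second)
{
	int overlap;

	pthread_spin_lock(first);		/* first section enters */
	if (pthread_spin_trylock(second) == 0) {
		overlap = 1;			/* second section got in too */
		pthread_spin_unlock(second);
	} else {
		overlap = 0;			/* EBUSY: kept out */
	}
	pthread_spin_unlock(first);
	return overlap;
}

/* Both sections agree on one lock: they exclude each other. */
static int demo_same_lock(void)
{
	pthread_spinlock_t lock;
	int overlap;

	pthread_spin_init(&lock, PTHREAD_PROCESS_PRIVATE);
	overlap = sections_can_overlap(&lock, &lock);
	pthread_spin_destroy(&lock);
	return overlap;
}

/* Each section uses its own lock: nothing stops them from overlapping. */
static int demo_different_locks(void)
{
	pthread_spinlock_t a, b;
	int overlap;

	pthread_spin_init(&a, PTHREAD_PROCESS_PRIVATE);
	pthread_spin_init(&b, PTHREAD_PROCESS_PRIVATE);
	overlap = sections_can_overlap(&a, &b);
	pthread_spin_destroy(&a);
	pthread_spin_destroy(&b);
	return overlap;
}
```

With one shared lock the second section is kept out; with two locks both
sections run at once, which is only acceptable if they touch different
data.
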
----

Lesson 2: reader-writer spinlocks.

If your data accesses have a very natural pattern where you usually tend
to mostly read from the shared variables, the reader-writer locks
(rw_lock) versions of the spinlocks are sometimes useful. They allow multiple
readers to be in the same critical region at once, but if somebody wants
to change the variables it has to get an exclusive write lock.

   NOTE! reader-writer locks require more atomic memory operations than
   simple spinlocks.  Unless the reader critical section is long, you
   are better off just using spinlocks.

The routines look the same as above:

   static DEFINE_RWLOCK(xxx_lock);

	unsigned long flags;

	read_lock_irqsave(&xxx_lock, flags);
	.. critical section that only reads the info ...
	read_unlock_irqrestore(&xxx_lock, flags);

	write_lock_irqsave(&xxx_lock, flags);
	.. read and write exclusive access to the info ...
	write_unlock_irqrestore(&xxx_lock, flags);

The above kind of lock may be useful for complex data structures like
linked lists, especially searching for entries without changing the list
itself.  The read lock allows many concurrent readers.  Anything that
_changes_ the list will have to get the write lock.

   NOTE! RCU is better for list traversal, but requires careful
   attention to design detail (see Documentation/RCU/listRCU.txt).

Also, you cannot "upgrade" a read-lock to a write-lock, so if you at _any_
time need to do any changes (even if you don't do it every time), you have
to get the write-lock at the very beginning.

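The list pattern and the "no upgrade" rule can be sketched in userspace
with POSIX pthread_rwlock_t. This is an analogy chosen for illustration,
not the kernel rwlock_t API, and struct node, list_contains(), list_push()
and try_upgrade() are hypothetical names. Searching takes only the read
lock, inserting takes the write lock from the start, and try_upgrade()
shows that asking for the write lock while still holding a read lock does
not succeed.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stddef.h>

struct node {
	int key;
	struct node *next;
};

static pthread_rwlock_t list_lock = PTHREAD_RWLOCK_INITIALIZER;
static struct node *list_head;

/* Reader: only looks at the list, so many callers may run at once. */
static int list_contains(int key)
{
	int found = 0;

	pthread_rwlock_rdlock(&list_lock);
	for (struct node *n = list_head; n; n = n->next) {
		if (n->key == key) {
			found = 1;
			break;
		}
	}
	pthread_rwlock_unlock(&list_lock);
	return found;
}

/* Writer: changes the list, so it takes the write lock up front. */
static void list_push(struct node *n)
{
	pthread_rwlock_wrlock(&list_lock);
	n->next = list_head;
	list_head = n;
	pthread_rwlock_unlock(&list_lock);
}

/*
 * Tries to get the write lock while already holding a read lock and
 * returns the result of that attempt (0 would mean it was granted).
 */
static int try_upgrade(void)
{
	int ret;

	pthread_rwlock_rdlock(&list_lock);
	ret = pthread_rwlock_trywrlock(&list_lock);
	if (ret == 0)
		pthread_rwlock_unlock(&list_lock);	/* drop write lock */
	pthread_rwlock_unlock(&list_lock);		/* drop read lock */
	return ret;
}
```

A blocking write-lock call in place of the try variant would wait on the
caller's own read lock, which is why code that might modify the list has
to take the write lock before it starts looking.
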
   NOTE! We are working hard to remove reader-writer spinlocks in most
   cases, so please don't add a new one without consensus.  (Instead, see
   Documentation/RCU/rcu.txt for complete information.)

----

Lesson 3: spinlocks revisited.

The single spin-lock primitives above are by no means the only ones. They
are the safest ones, and the ones that work under all circumstances,
but partly _because_ they are safe they are also fairly slow. They are
much faster than a generic global cli/sti pair, but slower than they'd
need to be, because they do have to disable interrupts (which is just a
single instruction on an x86, but it's an expensive one - and on other
architectures it can be worse).

If you have a case where you have to protect a data structure across
several CPUs and you want to use spinlocks you can potentially use
cheaper versions of the spinlocks. IFF you know that the spinlocks are
never used in interrupt handlers, you can use the non-irq versions:

	spin_lock(&lock);
	...
	spin_unlock(&lock);

(and the equivalent read-write versions too, of course). The spinlock will
guarantee the same kind of exclusive access, and it will be much faster.
This is useful if you know that the data in question is only ever
manipulated from a "process context", ie no interrupts involved.

The reason you mustn't use these versions if you have interrupts that
play with the spinlock is that you can get deadlocks:

	spin_lock(&lock);
	...
		<- interrupt comes in:
			spin_lock(&lock);

where an interrupt tries to lock an already locked variable. This is ok if
the other interrupt happens on another CPU, but it is _not_ ok if the
interrupt happens on the same CPU that already holds the lock, because the
lock will obviously never be released (because the interrupt is waiting
for the lock, and the lock-holder is interrupted by the interrupt and will
not continue until the interrupt has been processed).

(This is also the reason why the irq-versions of the spinlocks only need
to disable the _local_ interrupts - it's ok to use spinlocks in interrupts
on other CPUs, because an interrupt on another CPU doesn't interrupt the
CPU that holds the lock, so the lock-holder can continue and eventually
release the lock).

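The same-CPU deadlock can be mimicked in userspace, with a POSIX signal
handler playing the part of the interrupt handler and signal blocking
playing the part of disabling local interrupts. This is only an analogy
under those assumptions, and toy_lock, toy_handler(), plain_lock_demo()
and masked_lock_demo() are hypothetical names. Where a real spin_lock()
in the handler would spin forever, the handler here merely records that it
found the lock held.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <stdatomic.h>
#include <stddef.h>

static atomic_flag toy_lock = ATOMIC_FLAG_INIT;
static volatile sig_atomic_t handler_saw_lock_held;

/* Stands in for an interrupt handler that wants the same lock. */
static void toy_handler(int sig)
{
	(void)sig;
	if (atomic_flag_test_and_set_explicit(&toy_lock,
					      memory_order_acquire)) {
		/* Held by the code we interrupted: this is the deadlock. */
		handler_saw_lock_held = 1;
		return;
	}
	atomic_flag_clear_explicit(&toy_lock, memory_order_release);
}

static void install_toy_handler(void)
{
	struct sigaction sa = { 0 };

	sa.sa_handler = toy_handler;
	sigemptyset(&sa.sa_mask);
	sigaction(SIGUSR1, &sa, NULL);
}

/* Like spin_lock(): the "interrupt" can arrive while the lock is held. */
static int plain_lock_demo(void)
{
	install_toy_handler();
	handler_saw_lock_held = 0;
	while (atomic_flag_test_and_set_explicit(&toy_lock,
						 memory_order_acquire))
		;
	raise(SIGUSR1);		/* handler runs inside the critical section */
	atomic_flag_clear_explicit(&toy_lock, memory_order_release);
	return handler_saw_lock_held;
}

/* Like spin_lock_irqsave(): the "interrupt" is held off until unlock. */
static int masked_lock_demo(void)
{
	sigset_t block, saved;

	install_toy_handler();
	handler_saw_lock_held = 0;
	sigemptyset(&block);
	sigaddset(&block, SIGUSR1);
	sigprocmask(SIG_BLOCK, &block, &saved);	/* "save flags + cli" */
	while (atomic_flag_test_and_set_explicit(&toy_lock,
						 memory_order_acquire))
		;
	raise(SIGUSR1);		/* stays pending while blocked */
	atomic_flag_clear_explicit(&toy_lock, memory_order_release);
	sigprocmask(SIG_SETMASK, &saved, NULL);	/* handler runs here */
	return handler_saw_lock_held;
}
```

In the plain version the handler runs in the middle of the critical
section and finds the lock taken; in the masked version it only runs after
the lock has been dropped, so it never has to wait for the code it
interrupted.
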
Note that you can be clever with read-write locks and interrupts. For
example, if you know that the interrupt only ever gets a read-lock, then
you can use a non-irq version of read locks everywhere - because they
don't block on each other (and thus there is no dead-lock wrt interrupts).
But when you do the write-lock, you have to use the irq-safe version.

For an example of being clever with rw-locks, see the "waitqueue_lock"
handling in kernel/sched.c - nothing ever _changes_ a wait-queue from
within an interrupt, they only read the queue in order to know whom to
wake up. So read-locks are safe (which is good: they are very common
indeed), while write-locks need to protect themselves against interrupts.

		Linus

----

Reference information:

For dynamic initialization, use spin_lock_init() or rwlock_init() as
appropriate:

   spinlock_t xxx_lock;
   rwlock_t xxx_rw_lock;

   static int __init xxx_init(void)
   {
	spin_lock_init(&xxx_lock);
	rwlock_init(&xxx_rw_lock);
	...
   }

   module_init(xxx_init);

For static initialization, use DEFINE_SPINLOCK() / DEFINE_RWLOCK() or
__SPIN_LOCK_UNLOCKED() / __RW_LOCK_UNLOCKED() as appropriate.

SPIN_LOCK_UNLOCKED and RW_LOCK_UNLOCKED are deprecated.  These interfere
with lockdep state tracking.

Most of the time, you can simply turn:
	static spinlock_t xxx_lock = SPIN_LOCK_UNLOCKED;
into:
	static DEFINE_SPINLOCK(xxx_lock);

Static structure member variables go from:

	struct foo bar = {
		.lock	=	SPIN_LOCK_UNLOCKED,
	};

to:

	struct foo bar = {
		.lock	=	__SPIN_LOCK_UNLOCKED(bar.lock),
	};

Declarations of static rw_locks undergo a similar transformation.