delays - Information on the various kernel delay / sleep mechanisms
-------------------------------------------------------------------

Patrick Pannuto, commit 0fcb808, 2010-08-02 15:01:05 -0700

This document seeks to answer the common question: "What is the
RightWay (TM) to insert a delay?"

This question is most often faced by driver writers who have to
deal with hardware delays and who may not be intimately familiar
with the inner workings of the Linux Kernel.


Inserting Delays
----------------

The first, and most important, question you need to ask is "Is my
code in an atomic context?"  This should be followed closely by "Does
it really need to delay in atomic context?" If so...

ATOMIC CONTEXT:
	You must use the *delay family of functions. These
	functions use the jiffy-based estimate of clock speed
	and will busy-wait for enough loop cycles to achieve
	the desired delay:

	ndelay(unsigned long nsecs)
	udelay(unsigned long usecs)
	mdelay(unsigned long msecs)

	udelay is the generally preferred API; ndelay-level
	precision may not actually exist on many non-PC devices.

	mdelay is a macro wrapper around udelay, to account for
	possible overflow when passing large arguments to udelay.
	In general, use of mdelay is discouraged and code should
	be refactored to allow for the use of msleep.
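
	The wrapper idea can be sketched in user space as follows.
	This is only an illustration, not the kernel's macro:
	udelay_stub and mdelay_sketch are hypothetical names, and
	the stub records the requested delay instead of
	busy-waiting so that the sketch can run outside the kernel.

```c
/*
 * Hypothetical stand-in for the kernel's udelay(): it records the
 * requested delay instead of busy-waiting.
 */
static unsigned long total_usecs;	/* sum of all requested microseconds */
static unsigned long udelay_calls;	/* number of udelay_stub() calls */

static void udelay_stub(unsigned long usecs)
{
	total_usecs += usecs;
	udelay_calls++;
}

/*
 * Sketch of the mdelay idea: split a long delay into 1000 us chunks
 * so that no single call hands udelay an argument large enough to
 * overflow its internal arithmetic.
 */
static void mdelay_sketch(unsigned long msecs)
{
	while (msecs--)
		udelay_stub(1000);
}
```

	Every chunk is still a busy-wait, so a 50 ms mdelay keeps
	the CPU spinning for the full 50 ms.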

NON-ATOMIC CONTEXT:
	You should use the *sleep[_range] family of functions.
	There are a few more options here. While any of them may
	work correctly, using the "right" sleep function will
	help the scheduler and power management, and will just
	make your driver better :)

	-- Backed by busy-wait loop:
		udelay(unsigned long usecs)
	-- Backed by hrtimers:
		usleep_range(unsigned long min, unsigned long max)
	-- Backed by jiffies / legacy_timers:
		msleep(unsigned int msecs)
		msleep_interruptible(unsigned int msecs)

	Unlike the *delay family, the underlying mechanism
	driving each of these calls varies, thus there are
	quirks you should be aware of.


	SLEEPING FOR "A FEW" USECS ( < ~10us? ):
		* Use udelay

		- Why not usleep?
			On slower systems (embedded, or perhaps a
			speed-stepped PC!), the overhead of setting up the
			hrtimers for usleep *may* not be worth it. Such an
			evaluation will obviously depend on your specific
			situation, but it is something to be aware of.

	SLEEPING FOR ~USECS OR SMALL MSECS ( 10us - 20ms ):
		* Use usleep_range

		- Why not msleep for (1ms - 20ms)?
			Explained originally here:
				http://lkml.org/lkml/2007/8/3/250
			msleep(1~20) may not do what the caller intends, and
			will often sleep longer (~20 ms actual sleep for any
			value given in the 1~20ms range). In many cases this
			is not the desired behavior.
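
			The arithmetic behind this can be sketched as
			below. This is a simplified user-space model, not
			kernel code: it assumes msleep rounds the request
			up to whole jiffies and then adds one more jiffy,
			as kernels contemporary with this document did,
			and it takes HZ as a parameter. Both function
			names are hypothetical.

```c
/*
 * Timeout, in jiffies, requested for a sleep of msecs milliseconds:
 * the request rounded up to whole jiffies, plus one extra jiffy
 * (assumed behaviour, see the text above).
 */
static unsigned long msleep_timeout_jiffies(unsigned long msecs,
					    unsigned long hz)
{
	unsigned long jiffies = (msecs * hz + 999) / 1000;	/* round up */

	return jiffies + 1;
}

/* Upper bound on the resulting sleep, converted back to milliseconds. */
static unsigned long msleep_max_ms(unsigned long msecs, unsigned long hz)
{
	return msleep_timeout_jiffies(msecs, hz) * 1000 / hz;
}
```

			With HZ=100 (10 ms per jiffy), a request of 1 ms
			and a request of 10 ms both give a two-jiffy
			timeout, i.e. a sleep of up to 20 ms. With HZ=1000
			the same 1 ms request is bounded by 2 ms.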

		- Why is there no "usleep" / What is a good range?
			Since usleep_range is built on top of hrtimers, the
			wakeup will be very precise (ish), thus a simple
			usleep function would likely introduce a large number
			of undesired interrupts.

			With the introduction of a range, the scheduler is
			free to coalesce your wakeup with any other wakeup
			that may have happened for other reasons, or at the
			worst case, fire an interrupt for your upper bound.

			The larger a range you supply, the greater a chance
			that you will not trigger an interrupt; this should
			be balanced with what is an acceptable upper bound on
			delay / performance for your specific code path. Exact
			tolerances here are very situation specific, thus it
			is left to the caller to determine a reasonable range.
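
			The effect of the range can be illustrated with a
			toy model. This is not how hrtimers are
			implemented; it only assumes that the sleeper can
			piggyback on the earliest interrupt that is
			already going to happen inside its window, and
			otherwise needs its own interrupt at the upper
			bound. The struct and function names are
			hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>

struct wakeup {
	unsigned long when_us;	/* time at which the sleeper is woken */
	bool own_interrupt;	/* true if a dedicated interrupt was needed */
};

/*
 * other_us[] holds the times (in us from now) at which the CPU will
 * be handling an interrupt anyway. Pick the earliest of them inside
 * [min_us, max_us]; if there is none, fall back to max_us and pay
 * for a dedicated interrupt.
 */
static struct wakeup pick_wakeup(unsigned long min_us, unsigned long max_us,
				 const unsigned long *other_us, size_t n)
{
	struct wakeup w = { .when_us = max_us, .own_interrupt = true };
	size_t i;

	for (i = 0; i < n; i++) {
		if (other_us[i] >= min_us && other_us[i] <= max_us &&
		    (w.own_interrupt || other_us[i] < w.when_us)) {
			w.when_us = other_us[i];
			w.own_interrupt = false;
		}
	}
	return w;
}
```

			In this model, with other interrupts due at 300,
			1200 and 1800 us, a window of 1000 - 2000 us is
			served by the interrupt at 1200 us, whereas a
			window of 1000 - 1100 us forces a dedicated
			interrupt at 1100 us.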

	SLEEPING FOR LARGER MSECS ( 10ms+ ):
		* Use msleep or possibly msleep_interruptible

		- What's the difference?
			msleep sets the current task to TASK_UNINTERRUPTIBLE
			whereas msleep_interruptible sets the current task to
			TASK_INTERRUPTIBLE before scheduling the sleep. In
			short, the difference is whether the sleep can be ended
			early by a signal. In general, just use msleep unless
			you know you have a need for the interruptible variant.
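
The guidelines in this document can be condensed into a small
decision helper. This is a user-space sketch with hypothetical names,
not a kernel API. The 10 us and 20 ms thresholds are the rough figures
given above; since the ranges above overlap between 10 ms and 20 ms,
the sketch assumes usleep_range is preferred there.

```c
#include <stdbool.h>

enum delay_choice {
	CHOOSE_UDELAY,		/* busy-wait: the *delay family */
	CHOOSE_USLEEP_RANGE,	/* hrtimer-backed sleep */
	CHOOSE_MSLEEP,		/* jiffies-backed sleep */
};

/*
 * Pick a delay mechanism from the calling context and the desired
 * delay in microseconds.
 */
static enum delay_choice choose_delay(bool atomic, unsigned long usecs)
{
	if (atomic)
		return CHOOSE_UDELAY;		/* sleeping is not allowed */
	if (usecs < 10)
		return CHOOSE_UDELAY;		/* hrtimer setup may cost more */
	if (usecs <= 20000)
		return CHOOSE_USLEEP_RANGE;	/* msleep would overshoot */
	return CHOOSE_MSLEEP;
}
```

For example, choose_delay(false, 500) selects usleep_range, while
choose_delay(true, 500) selects the busy-wait family because the
caller cannot sleep.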